MyArxiv
Robotics
NMPCB: A Lightweight and Safety-Critical Motion Control Framework for Ackermann Mobile Robot
In multi-obstacle environments, real-time performance and safety in robot motion control have long been challenging issues, as conventional methods often struggle to balance the two. In this paper, we propose a novel motion control framework composed of a Neural network-based path planner and a Model Predictive Control (MPC) controller based on control Barrier function (NMPCB) . The planner predicts the next target point through a lightweight neural network and generates a reference trajectory for the controller. In the design of the controller, we introduce the dual problem of control barrier function (CBF) as the obstacle avoidance constraint, enabling it to ensure robot motion safety while significantly reducing computation time. The controller directly outputs control commands to the robot by tracking the reference trajectory. This framework achieves a balance between real-time performance and safety. We validate the feasibility of the framework through numerical simulations and real-world experiments.
Efficient Manipulation-Enhanced Semantic Mapping With Uncertainty-Informed Action Selection
Service robots operating in cluttered human environments such as homes, offices, and schools cannot rely on predefined object arrangements and must continuously update their semantic and spatial estimates while dealing with possible frequent rearrangements. Efficient and accurate mapping under such conditions demands selecting informative viewpoints and targeted manipulations to reduce occlusions and uncertainty. In this work, we present a manipulation-enhanced semantic mapping framework for occlusion-heavy shelf scenes that integrates evidential metric-semantic mapping with reinforcement-learning-based next-best view planning and targeted action selection. Our method thereby exploits uncertainty estimates from Dirichlet and Beta distributions in the map prediction networks to guide both active sensor placement and object manipulation, focusing on areas with high uncertainty and selecting actions with high expected information gain. Furthermore, we introduce an uncertainty-informed push strategy that targets occlusion-critical objects and generates minimally invasive actions to reveal hidden regions by reducing overall uncertainty in the scene. The experimental evaluation shows that our framework enables to accurately map cluttered scenes, while substantially reducing object displacement and achieving a 95% reduction in planning time compared to the state-of-the-art, thereby realizing real-world applicability.
Co-Design of Soft Gripper with Neural Physics
For robot manipulation, both the controller and end-effector design are crucial. Soft grippers are generalizable by deforming to different geometries, but designing such a gripper and finding its grasp pose remains challenging. In this paper, we propose a co-design framework that generates an optimized soft gripper's block-wise stiffness distribution and its grasping pose, using a neural physics model trained in simulation. We derived a uniform-pressure tendon model for a flexure-based soft finger, then generated a diverse dataset by randomizing both gripper pose and design parameters. A neural network is trained to approximate this forward simulation, yielding a fast, differentiable surrogate. We embed that surrogate in an end-to-end optimization loop to optimize the ideal stiffness configuration and best grasp pose. Finally, we 3D-print the optimized grippers of various stiffness by changing the structural parameters. We demonstrate that our co-designed grippers significantly outperform baseline designs in both simulation and hardware experiments. More info: http://yswhynot.github.io/codesign-soft/
NetRoller: Interfacing General and Specialized Models for End-to-End Autonomous Driving
Integrating General Models (GMs) such as Large Language Models (LLMs), with Specialized Models (SMs) in autonomous driving tasks presents a promising approach to mitigating challenges in data diversity and model capacity of existing specialized driving models. However, this integration leads to problems of asynchronous systems, which arise from the distinct characteristics inherent in GMs and SMs. To tackle this challenge, we propose NetRoller, an adapter that incorporates a set of novel mechanisms to facilitate the seamless integration of GMs and specialized driving models. Specifically, our mechanisms for interfacing the asynchronous GMs and SMs are organized into three key stages. NetRoller first harvests semantically rich and computationally efficient representations from the reasoning processes of LLMs using an early stopping mechanism, which preserves critical insights on driving context while maintaining low overhead. It then applies learnable query embeddings, nonsensical embeddings, and positional layer embeddings to facilitate robust and efficient cross-modality translation. At last, it employs computationally efficient Query Shift and Feature Shift mechanisms to enhance the performance of SMs through few-epoch fine-tuning. Based on the mechanisms formalized in these three stages, NetRoller enables specialized driving models to operate at their native frequencies while maintaining situational awareness of the GM. Experiments conducted on the nuScenes dataset demonstrate that integrating GM through NetRoller significantly improves human similarity and safety in planning tasks, and it also achieves noticeable precision improvements in detection and mapping tasks for end-to-end autonomous driving. The code and models are available at https://github.com/Rex-sys-hk/NetRoller .
comment: This work has been submitted to the IEEE for possible publication
Multiagent Systems
Agent Trading Arena: A Study on Numerical Understanding in LLM-Based Agents
Large language models (LLMs) have demonstrated remarkable capabilities in natural language tasks, yet their performance in dynamic, real-world financial environments remains underexplored. Existing approaches are limited to historical backtesting, where trading actions cannot influence market prices and agents train only on static data. To address this limitation, we present the Agent Trading Arena, a virtual zero-sum stock market in which LLM-based agents engage in competitive multi-agent trading and directly impact price dynamics. By simulating realistic bid-ask interactions, our platform enables training in scenarios that closely mirror live markets, thereby narrowing the gap between training and evaluation. Experiments reveal that LLMs struggle with numerical reasoning when given plain-text data, often overfitting to local patterns and recent values. In contrast, chart-based visualizations significantly enhance both numerical reasoning and trading performance. Furthermore, incorporating a reflection module yields additional improvements, especially with visual inputs. Evaluations on NASDAQ and CSI datasets demonstrate the superiority of our method, particularly under high volatility. All code and data are available at https://github.com/wekjsdvnm/Agent-Trading-Arena.
On Word-of-Mouth and Private-Prior Sequential Social Learning
Social learning constitutes a fundamental framework for studying interactions among rational agents who observe each other's actions but lack direct access to individual beliefs. This paper investigates a specific social learning paradigm known as Word-of-Mouth (WoM), where a series of agents seeks to estimate the state of a dynamical system. The first agent receives noisy measurements of the state, while each subsequent agent relies solely on a degraded version of her predecessor's estimate. A defining feature of WoM is that the final agent's belief is publicly broadcast and subsequently adopted by all agents, in place of their own. We analyze this setting theoretically and through numerical simulations, noting that some agents benefit from using the belief of the last agent, while others experience performance deterioration.
comment: Accepted for publication at the 64th Conference on Decision and Control (CDC)
Systems and Control (CS)
Data-Driven Abstraction and Synthesis for Stochastic Systems with Unknown Dynamics
We study the automated abstraction-based synthesis of correct-by-construction control policies for stochastic dynamical systems with unknown dynamics. Our approach is to learn an abstraction from sampled data, which is represented in the form of a finite Markov decision process (MDP). In this paper, we present a data-driven technique for constructing finite-state interval MDP (IMDP) abstractions of stochastic systems with unknown nonlinear dynamics. As a distinguishing and novel feature, our technique only requires (1) noisy state-input-state observations and (2) an upper bound on the system's Lipschitz constant. Combined with standard model-checking techniques, our IMDP abstractions enable the synthesis of policies that satisfy probabilistic temporal properties (such as "reach-while-avoid") with a predefined confidence. Our experimental results show the effectiveness and robustness of our approach.
Incremental Collision Laws Based on the Bouc-Wen Model: Improved Collision Models and Further Results
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 12 pages, 4 figures, see https://gitlab.com/user9716869/EBWCM ; (v2-v6) various minor amendments; (v5) replaced the parameter identification study of Quinn (2004) with Villegas et al (2021) due to incompatibility of the proposed collision models with the experimental setup in Quinn (2004); arXiv admin note: text overlap with arXiv:2410.08147
Systems and Control (EESS)
Data-Driven Abstraction and Synthesis for Stochastic Systems with Unknown Dynamics
We study the automated abstraction-based synthesis of correct-by-construction control policies for stochastic dynamical systems with unknown dynamics. Our approach is to learn an abstraction from sampled data, which is represented in the form of a finite Markov decision process (MDP). In this paper, we present a data-driven technique for constructing finite-state interval MDP (IMDP) abstractions of stochastic systems with unknown nonlinear dynamics. As a distinguishing and novel feature, our technique only requires (1) noisy state-input-state observations and (2) an upper bound on the system's Lipschitz constant. Combined with standard model-checking techniques, our IMDP abstractions enable the synthesis of policies that satisfy probabilistic temporal properties (such as "reach-while-avoid") with a predefined confidence. Our experimental results show the effectiveness and robustness of our approach.
Incremental Collision Laws Based on the Bouc-Wen Model: Improved Collision Models and Further Results
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 12 pages, 4 figures, see https://gitlab.com/user9716869/EBWCM ; (v2-v6) various minor amendments; (v5) replaced the parameter identification study of Quinn (2004) with Villegas et al (2021) due to incompatibility of the proposed collision models with the experimental setup in Quinn (2004); arXiv admin note: text overlap with arXiv:2410.08147
Adaptive Dead-Zone Dual Sliding Mode Observer for Reliable Electrochemical Model-Based SOC Estimation
Accurate state of charge (SOC) estimation is critical for ensuring the safety, reliability, and efficiency of lithium-ion batteries in electric vehicles and energy storage systems. Electrochemical models provide high fidelity for SOC estimation but introduce challenges due to parameter variations, nonlinearities, and computational complexity. To address these issues, this paper proposes an adaptive dead-zone dual sliding mode observer(SMO) based on an improved electrochemical single-particle model. The algorithm integrates a state observer for SOC estimation and a parameter observer for online parameter adaptation. A Lyapunov-derived adaptive dead-zone is introduced to ensure stability, activating parameter updates only when the terminal voltage error lies within a rigorously defined bound. The proposed method was validated under constant-current and UDDS dynamic conditions. Results demonstrate that the adaptive dead-zone dual SMO achieves superior accuracy compared with conventional dual SMO and equivalent circuit model-based EKF methods, maintaining SOC estimation errors within 0.2% under correct initialization and below 1% under a 30% initial SOC error, with rapid convergence. Computational efficiency analysis further shows that the adaptive dead-zone dual sliding mode observer reduces execution time compared with the conventional dual SMO by limiting unnecessary parameter updates, highlighting its suitability for real-time battery management applications. Moreover, robustness under battery aging was confirmed using a cycle-aging model, where the adaptive dead-zone dual SMO maintained stable SOC estimation despite parameter drift. These findings indicate that the proposed method offers a reliable, accurate, and computationally efficient solution for SOC estimation.
comment: 36 pages, 5 figures
Robotics
Benchmarking LLM Privacy Recognition for Social Robot Decision Making
While robots have previously utilized rule-based systems or probabilistic models for user interaction, the rapid evolution of large language models (LLMs) presents new opportunities to develop LLM-powered robots for enhanced human-robot interaction (HRI). To fully realize these capabilities, however, robots need to collect data such as audio, fine-grained images, video, and locations. As a result, LLMs often process sensitive personal information, particularly within private environments, such as homes. Given the tension between utility and privacy risks, evaluating how current LLMs manage sensitive data is critical. Specifically, we aim to explore the extent to which out-of-the-box LLMs are privacy-aware in the context of household robots. In this work, we present a set of privacy-relevant scenarios developed using the Contextual Integrity (CI) framework. We first surveyed users' privacy preferences regarding in-home robot behaviors and then examined how their privacy orientations affected their choices of these behaviors (N = 450). We then provided the same set of scenarios and questions to state-of-the-art LLMs (N = 10) and found that the agreement between humans and LLMs was generally low. To further investigate the capabilities of LLMs as potential privacy controllers, we implemented four additional prompting strategies and compared their results. We discuss the performance of the evaluated models as well as the implications and potential of AI privacy awareness in human-robot interaction.
comment: 18 pages, 7 figures. Dakota Sullivan and Shirley Zhang contributed equally to this work
Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids
Learning generalizable robot manipulation policies, especially for complex multi-fingered humanoids, remains a significant challenge. Existing approaches primarily rely on extensive data collection and imitation learning, which are expensive, labor-intensive, and difficult to scale. Sim-to-real reinforcement learning (RL) offers a promising alternative, but has mostly succeeded in simpler state-based or single-hand setups. How to effectively extend this to vision-based, contact-rich bimanual manipulation tasks remains an open question. In this paper, we introduce a practical sim-to-real RL recipe that trains a humanoid robot to perform three challenging dexterous manipulation tasks: grasp-and-reach, box lift and bimanual handover. Our method features an automated real-to-sim tuning module, a generalized reward formulation based on contact and object goals, a divide-and-conquer policy distillation framework, and a hybrid object representation strategy with modality-specific augmentation. We demonstrate high success rates on unseen objects and robust, adaptive policy behaviors -- highlighting that vision-based dexterous manipulation via sim-to-real RL is not only viable, but also scalable and broadly applicable to real-world humanoid manipulation tasks.
comment: Published at CoRL 2025. Project page can be found at https://toruowo.github.io/recipe/
Self-Supervised Learning-Based Path Planning and Obstacle Avoidance Using PPO and B-Splines in Unknown Environments
This paper introduces SmartBSP, an advanced self-supervised learning framework for real-time path planning and obstacle avoidance in autonomous robotics navigating through complex environments. The proposed system integrates Proximal Policy Optimization (PPO) with Convolutional Neural Networks (CNN) and Actor-Critic architecture to process limited LIDAR inputs and compute spatial decision-making probabilities. The robot's perceptual field is discretized into a grid format, which the CNN analyzes to produce a spatial probability distribution. During the training process a nuanced cost function is minimized that accounts for path curvature, endpoint proximity, and obstacle avoidance. Simulations results in different scenarios validate the algorithm's resilience and adaptability across diverse operational scenarios. Subsequently, Real-time experiments, employing the Robot Operating System (ROS), were carried out to assess the efficacy of the proposed algorithm.
Morphologically Symmetric Reinforcement Learning for Ambidextrous Bimanual Manipulation
Humans naturally exhibit bilateral symmetry in their gross manipulation skills, effortlessly mirroring simple actions between left and right hands. Bimanual robots-which also feature bilateral symmetry-should similarly exploit this property to perform tasks with either hand. Unlike humans, who often favor a dominant hand for fine dexterous skills, robots should ideally execute ambidextrous manipulation with equal proficiency. To this end, we introduce SYMDEX (SYMmetric DEXterity), a reinforcement learning framework for ambidextrous bi-manipulation that leverages the robot's inherent bilateral symmetry as an inductive bias. SYMDEX decomposes complex bimanual manipulation tasks into per-hand subtasks and trains dedicated policies for each. By exploiting bilateral symmetry via equivariant neural networks, experience from one arm is inherently leveraged by the opposite arm. We then distill the subtask policies into a global ambidextrous policy that is independent of the hand-task assignment. We evaluate SYMDEX on six challenging simulated manipulation tasks and demonstrate successful real-world deployment on two of them. Our approach strongly outperforms baselines on complex task in which the left and right hands perform different roles. We further demonstrate SYMDEX's scalability by extending it to a four-arm manipulation setup, where our symmetry-aware policies enable effective multi-arm collaboration and coordination. Our results highlight how structural symmetry as inductive bias in policy learning enhances sample efficiency, robustness, and generalization across diverse dexterous manipulation tasks.
Dyna-LfLH: Learning Agile Navigation in Dynamic Environments from Learned Hallucination IROS
This paper introduces Dynamic Learning from Learned Hallucination (Dyna-LfLH), a self-supervised method for training motion planners to navigate environments with dense and dynamic obstacles. Classical planners struggle with dense, unpredictable obstacles due to limited computation, while learning-based planners face challenges in acquiring high- quality demonstrations for imitation learning or dealing with exploration inefficiencies in reinforcement learning. Building on Learning from Hallucination (LfH), which synthesizes training data from past successful navigation experiences in simpler environments, Dyna-LfLH incorporates dynamic obstacles by generating them through a learned latent distribution. This enables efficient and safe motion planner training. We evaluate Dyna-LfLH on a ground robot in both simulated and real environments, achieving up to a 25% improvement in success rate compared to baselines.
comment: Accepted at International Conference on Intelligent Robots and Systems (IROS) 2025 Hangzhou, China
SoK: Cybersecurity Assessment of Humanoid Ecosystem
Humanoids are progressing toward practical deployment across healthcare, industrial, defense, and service sectors. While typically considered cyber-physical systems (CPSs), their dependence on traditional networked software stacks (e.g., Linux operating systems), robot operating system (ROS) middleware, and over-the-air update channels, creates a distinct security profile that exposes them to vulnerabilities conventional CPS models do not fully address. Prior studies have mainly examined specific threats, such as LiDAR spoofing or adversarial machine learning (AML). This narrow focus overlooks how an attack targeting one component can cascade harm throughout the robot's interconnected systems. We address this gap through a systematization of knowledge (SoK) that takes a comprehensive approach, consolidating fragmented research from robotics, CPS, and network security domains. We introduce a seven-layer security model for humanoid robots, organizing 39 known attacks and 35 defenses across the humanoid ecosystem-from hardware to human-robot interaction. Building on this security model, we develop a quantitative 39x35 attack-defense matrix with risk-weighted scoring, validated through Monte Carlo analysis. We demonstrate our method by evaluating three real-world robots: Pepper, G1 EDU, and Digit. The scoring analysis revealed varying security maturity levels, with scores ranging from 39.9% to 79.5% across the platforms. This work introduces a structured, evidence-based assessment method that enables systematic security evaluation, supports cross-platform benchmarking, and guides prioritization of security investments in humanoid robotics.
SAVOR: Skill Affordance Learning from Visuo-Haptic Perception for Robot-Assisted Bite Acquisition
Robot-assisted feeding requires reliable bite acquisition, a challenging task due to the complex interactions between utensils and food with diverse physical properties. These interactions are further complicated by the temporal variability of food properties-for example, steak becomes firm as it cools even during a meal. To address this, we propose SAVOR, a novel approach for learning skill affordances for bite acquisition-how suitable a manipulation skill (e.g., skewering, scooping) is for a given utensil-food interaction. In our formulation, skill affordances arise from the combination of tool affordances (what a utensil can do) and food affordances (what the food allows). Tool affordances are learned offline through calibration, where different utensils interact with a variety of foods to model their functional capabilities. Food affordances are characterized by physical properties such as softness, moisture, and viscosity, initially inferred through commonsense reasoning using a visually-conditioned language model and then dynamically refined through online multi-modal visuo-haptic perception using SAVOR-Net during interaction. Our method integrates these offline and online estimates to predict skill affordances in real time, enabling the robot to select the most appropriate skill for each food item. Evaluated on 20 single-item foods and 10 in-the-wild meals, our approach improves bite acquisition success rate by 13% over state-of-the-art (SOTA) category-based methods (e.g. use skewer for fruits). These results highlight the importance of modeling interaction-driven skill affordances for generalizable and effective robot-assisted bite acquisition. Website: https://emprise.cs.cornell.edu/savor/
comment: Conference on Robot Learning, Oral
Wavelet Policy: Imitation Policy Learning in the Scale Domain with Wavelet Transforms
Recent imitation learning policies, often framed as time series prediction tasks, directly map robotic observations into the action space, such as high-dimensional visual data and proprioception. When deploying at the edge, we found the underutilization of frequency domain analysis in robotic manipulation trajectory prediction leads to neglecting the inherent rhythm information embedded within action sequences, resulting in errors at critical moments. To address this, we reframe imitation learning policies through the lens of time-scale domain and introduce the Wavelet Policy. This novel approach employs wavelet transforms (WT) and new Features Extractor (FE) for feature preprocessing and extracts multi-scale features using the Single Encoder to Multiple Decoder (SE2MD) architecture. Furthermore, to enhance feature mapping in the scale domain and appropriately increase model capacity, we introduce a Learnable Scale Domain Filter (LSDF) after each decoder, improving adaptability under different visual conditions. Our results show that the Wavelet Policy maintaining a comparable parameter count outperforms SOTA end-to-end methods on four challenging simulation robotic arm tasks and real tasks, especially at critical moments and remote settings simultaneously. We release the source code and model checkpoint of simulation task at https://github.com/lurenjia384/Wavelet_Policy.
RALLY: Role-Adaptive LLM-Driven Yoked Navigation for Agentic UAV Swarms
Intelligent control of Unmanned Aerial Vehicles (UAVs) swarms has emerged as a critical research focus, and it typically requires the swarm to navigate effectively while avoiding obstacles and achieving continuous coverage over multiple mission targets. Although traditional Multi-Agent Reinforcement Learning (MARL) approaches offer dynamic adaptability, they are hindered by the semantic gap in numerical communication and the rigidity of homogeneous role structures, resulting in poor generalization and limited task scalability. Recent advances in Large Language Model (LLM)-based control frameworks demonstrate strong semantic reasoning capabilities by leveraging extensive prior knowledge. However, due to the lack of online learning and over-reliance on static priors, these works often struggle with effective exploration, leading to reduced individual potential and overall system performance. To address these limitations, we propose a Role-Adaptive LLM-Driven Yoked navigation algorithm RALLY. Specifically, we first develop an LLM-driven semantic decision framework that uses structured natural language for efficient semantic communication and collaborative reasoning. Afterward, we introduce a dynamic role-heterogeneity mechanism for adaptive role switching and personalized decision-making. Furthermore, we propose a Role-value Mixing Network (RMIX)-based assignment strategy that integrates LLM offline priors with MARL online policies to enable semi-offline training of role selection strategies. Experiments in the Multi-Agent Particle Environment (MPE) environment and a Software-In-The-Loop (SITL) platform demonstrate that RALLY outperforms conventional approaches in terms of task coverage, convergence speed, and generalization, highlighting its strong potential for collaborative navigation in agentic multi-UAV systems.
General agents contain world models ICML 2025
Are world models a necessary ingredient for flexible, goal-directed behaviour, or is model-free learning sufficient? We provide a formal answer to this question, showing that any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment. We show that this model can be extracted from the agent's policy, and that increasing the agents performance or the complexity of the goals it can achieve requires learning increasingly accurate world models. This has a number of consequences: from developing safe and general agents, to bounding agent capabilities in complex environments, and providing new algorithms for eliciting world models from agents.
comment: Accepted ICML 2025. Typos corrected
Force Myography based Torque Estimation in Human Knee and Ankle Joints ICRA
The online adaptation of exoskeleton control based on muscle activity sensing offers a promising approach to personalizing exoskeleton behavior based on the user's biosignals. While electromyography (EMG)-based methods have demonstrated improvements in joint torque estimation, EMG sensors require direct skin contact and extensive post-processing. In contrast, force myography (FMG) measures normal forces resulting from changes in muscle volume due to muscle activity. We propose an FMG-based method to estimate knee and ankle joint torques by integrating joint angles and velocities with muscle activity data. We learn a model for joint torque estimation using Gaussian process regression (GPR). The effectiveness of the proposed FMG-based method is validated on isokinetic motions performed by ten participants. The model is compared to a baseline model that uses only joint angle and velocity, as well as a model augmented by EMG data. The results indicate that incorporating FMG into exoskeleton control can improve the estimation of joint torque for the ankle and knee joints in novel task characteristics within a single participant. Although the findings suggest that this approach may not improve the generalizability of estimates between multiple participants, they highlight the need for further research into its potential applications in exoskeleton control.
comment: This file corresponds to the manuscript presented at the IEEE International Conference on Robotics and Automation (ICRA), May 2025
Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey
Robotic manipulation, a key frontier in robotics and embodied AI, requires precise motor control and multimodal understanding, yet traditional rule-based methods fail to scale or generalize in unstructured, novel environments. In recent years, Vision-Language-Action (VLA) models, built upon Large Vision-Language Models (VLMs) pretrained on vast image-text datasets, have emerged as a transformative paradigm. This survey provides the first systematic, taxonomy-oriented review of large VLM-based VLA models for robotic manipulation. We begin by clearly defining large VLM-based VLA models and delineating two principal architectural paradigms: (1) monolithic models, encompassing single-system and dual-system designs with differing levels of integration; and (2) hierarchical models, which explicitly decouple planning from execution via interpretable intermediate representations. Building on this foundation, we present an in-depth examination of large VLM-based VLA models: (1) integration with advanced domains, including reinforcement learning, training-free optimization, learning from human videos, and world model integration; (2) synthesis of distinctive characteristics, consolidating architectural traits, operational strengths, and the datasets and benchmarks that support their development; (3) identification of promising directions, including memory mechanisms, 4D perception, efficient adaptation, multi-agent cooperation, and other emerging capabilities. This survey consolidates recent advances to resolve inconsistencies in existing taxonomies, mitigate research fragmentation, and fill a critical gap through the systematic integration of studies at the intersection of large VLMs and robotic manipulation. We provide a regularly updated project page to document ongoing progress: https://github.com/JiuTian-VL/Large-VLM-based-VLA-for-Robotic-Manipulation
comment: Project Page: https://github.com/JiuTian-VL/Large-VLM-based-VLA-for-Robotic-Manipulation
Temporal Preference Optimization for Long-Form Video Understanding
Despite significant advancements in video large multimodal models (video-LMMs), achieving effective temporal grounding in long-form videos remains a challenge for existing models. To address this limitation, we propose Temporal Preference Optimization (TPO), a novel post-training framework designed to enhance the temporal grounding capabilities of video-LMMs through preference learning. TPO adopts a self-training approach that enables models to differentiate between well-grounded and less accurate temporal responses by leveraging curated preference datasets at two granularities: localized temporal grounding, which focuses on specific video segments, and comprehensive temporal grounding, which captures extended temporal dependencies across entire video sequences. By optimizing on these preference datasets, TPO significantly enhances temporal understanding while reducing reliance on manually annotated data. Extensive experiments on three long-form video understanding benchmarks--LongVideoBench, MLVU, and Video-MME--demonstrate the effectiveness of TPO across two state-of-the-art video-LMMs. Notably, LLaVA-Video-TPO establishes itself as the leading 7B model on the Video-MME benchmark, underscoring the potential of TPO as a scalable and efficient solution for advancing temporal reasoning in long-form video understanding. Project page: https://ruili33.github.io/tpo_website.
NarraGuide: an LLM-based Narrative Mobile Robot for Remote Place Exploration
Robotic telepresence enables users to navigate and experience remote environments. However, effective navigation and situational awareness depend on users' prior knowledge of the environment, limiting the usefulness of these systems for exploring unfamiliar places. We explore how integrating location-aware LLM-based narrative capabilities into a mobile robot can support remote exploration. We developed a prototype system, called NarraGuide, that provides narrative guidance for users to explore and learn about a remote place through a dialogue-based interface. We deployed our prototype in a geology museum, where remote participants (n=20) used the robot to tour the museum. Our findings reveal how users perceived the robot's role, engaged in dialogue in the tour, and expressed preferences for bystander encountering. Our work demonstrates the potential of LLM-enabled robotic capabilities to deliver location-aware narrative guidance and enrich the experience of exploring remote environments.
ViTaMIn: Learning Contact-Rich Tasks Through Robot-Free Visuo-Tactile Manipulation Interface
Tactile information plays a crucial role for humans and robots to interact effectively with their environment, particularly for tasks requiring the understanding of contact properties. Solving such dexterous manipulation tasks often relies on imitation learning from demonstration datasets, which are typically collected via teleoperation systems and often demand substantial time and effort. To address these challenges, we present ViTaMIn, an embodiment-free manipulation interface that seamlessly integrates visual and tactile sensing into a hand-held gripper, enabling data collection without the need for teleoperation. Our design employs a compliant Fin Ray gripper with tactile sensing, allowing operators to perceive force feedback during manipulation for more intuitive operation. Additionally, we propose a multimodal representation learning strategy to obtain pre-trained tactile representations, improving data efficiency and policy robustness. Experiments on seven contact-rich manipulation tasks demonstrate that ViTaMIn significantly outperforms baseline methods, demonstrating its effectiveness for complex manipulation tasks.
DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes EMNLP 2025
Large Vision-Language Models (LVLMs) have achieved significant progress in tasks like visual question answering and document understanding. However, their potential to comprehend embodied environments and navigate within them remains underexplored. In this work, we first study the challenge of open-vocabulary object navigation by introducing DivScene, a large-scale dataset with 4,614 houses across 81 scene types and 5,707 kinds of target objects. Our dataset provides a much greater diversity of target objects and scene types than existing datasets, enabling a comprehensive task evaluation. We evaluated various methods with LVLMs and LLMs on our dataset and found that current models still fall short of open-vocab object navigation ability. Then, we fine-tuned LVLMs to predict the next action with CoT explanations. We observe that LVLM's navigation ability can be improved substantially with only BFS-generated shortest paths without any human supervision, surpassing GPT-4o by over 20% in success rates.
comment: EMNLP 2025
Nav-SCOPE: Swarm Robot Cooperative Perception and Coordinated Navigation
This paper proposes a lightweight systematic solution for multi-robot coordinated navigation with decentralized cooperative perception. An information flow is first created to facilitate real-time observation sharing over unreliable ad-hoc networks. Then, the environmental uncertainties of each robot are reduced by interaction fields that deliver complementary information. Finally, path optimization is achieved, enabling self-organized coordination with effective convergence, divergence, and collision avoidance. Our method is fully interpretable and ready for deployment without gaps. Comprehensive simulations and real-world experiments demonstrate reduced path redundancy, robust performance across various tasks, and minimal demands on computation and communication.
comment: 11 pages, 9 figures, accepted in IEEE Transactions on Automation Science and Engineering
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
The human ability to seamlessly perform multimodal reasoning and physical interaction in the open world is a core goal for general-purpose embodied intelligent systems. Recent vision-language-action (VLA) models, which are co-trained on large-scale robot and visual-text data, have demonstrated notable progress in general robot control. However, they still fail to achieve human-level flexibility in interleaved reasoning and interaction. In this work, introduce EO-Robotics, consists of EO-1 model and EO-Data1.5M dataset. EO-1 is a unified embodied foundation model that achieves superior performance in multimodal embodied reasoning and robot control through interleaved vision-text-action pre-training. The development of EO-1 is based on two key pillars: (i) a unified architecture that processes multimodal inputs indiscriminately (image, text, video, and action), and (ii) a massive, high-quality multimodal embodied reasoning dataset, EO-Data1.5M, which contains over 1.5 million samples with emphasis on interleaved vision-text-action comprehension. EO-1 is trained through synergies between auto-regressive decoding and flow matching denoising on EO-Data1.5M, enabling seamless robot action generation and multimodal embodied reasoning. Extensive experiments demonstrate the effectiveness of interleaved vision-text-action learning for open-world understanding and generalization, validated through a variety of long-horizon, dexterous manipulation tasks across multiple embodiments. This paper details the architecture of EO-1, the data construction strategy of EO-Data1.5M, and the training methodology, offering valuable insights for developing advanced embodied foundation models.
Multiagent Systems
The challenge of hidden gifts in multi-agent reinforcement learning
Sometimes we benefit from actions that others have taken even when we are unaware that they took those actions. For example, if your neighbor chooses not to take a parking spot in front of your house when you are not there, you can benefit, even without being aware that they took this action. These "hidden gifts" represent an interesting challenge for multi-agent reinforcement learning (MARL), since assigning credit when the beneficial actions of others are hidden is non-trivial. Here, we study the impact of hidden gifts with a very simple MARL task. In this task, agents in a grid-world environment have individual doors to unlock in order to obtain individual rewards. As well, if all the agents unlock their door the group receives a larger collective reward. However, there is only one key for all of the doors, such that the collective reward can only be obtained when the agents drop the key for others after they use it. Notably, there is nothing to indicate to an agent that the other agents have dropped the key, thus the act of dropping the key for others is a "hidden gift". We show that several different state-of-the-art RL algorithms, including MARL algorithms, fail to learn how to obtain the collective reward in this simple task. Interestingly, we find that independent model-free policy gradient agents can solve the task when we provide them with information about their own action history, but MARL agents still cannot solve the task with action history. Finally, we derive a correction term for these independent agents, inspired by learning aware approaches, which reduces the variance in learning and helps them to converge to collective success more reliably. These results show that credit assignment in multi-agent settings can be particularly challenging in the presence of "hidden gifts", and demonstrate that learning awareness in independent agents can benefit these settings.
comment: Updated proof section and included single agent baseline for key-to-door in the appendix. Related work now is part of the main section
RALLY: Role-Adaptive LLM-Driven Yoked Navigation for Agentic UAV Swarms
Intelligent control of Unmanned Aerial Vehicles (UAVs) swarms has emerged as a critical research focus, and it typically requires the swarm to navigate effectively while avoiding obstacles and achieving continuous coverage over multiple mission targets. Although traditional Multi-Agent Reinforcement Learning (MARL) approaches offer dynamic adaptability, they are hindered by the semantic gap in numerical communication and the rigidity of homogeneous role structures, resulting in poor generalization and limited task scalability. Recent advances in Large Language Model (LLM)-based control frameworks demonstrate strong semantic reasoning capabilities by leveraging extensive prior knowledge. However, due to the lack of online learning and over-reliance on static priors, these works often struggle with effective exploration, leading to reduced individual potential and overall system performance. To address these limitations, we propose a Role-Adaptive LLM-Driven Yoked navigation algorithm RALLY. Specifically, we first develop an LLM-driven semantic decision framework that uses structured natural language for efficient semantic communication and collaborative reasoning. Afterward, we introduce a dynamic role-heterogeneity mechanism for adaptive role switching and personalized decision-making. Furthermore, we propose a Role-value Mixing Network (RMIX)-based assignment strategy that integrates LLM offline priors with MARL online policies to enable semi-offline training of role selection strategies. Experiments in the Multi-Agent Particle Environment (MPE) environment and a Software-In-The-Loop (SITL) platform demonstrate that RALLY outperforms conventional approaches in terms of task coverage, convergence speed, and generalization, highlighting its strong potential for collaborative navigation in agentic multi-UAV systems.
Distributed Fractional Bayesian Learning for Adaptive Optimization
This paper considers a distributed adaptive optimization problem, where all agents only have access to their local cost functions with a common unknown parameter, whereas they mean to collaboratively estimate the true parameter and find the optimal solution over a connected network. A general mathematical framework for such a problem has not been studied yet. We aim to provide valuable insights for addressing parameter uncertainty in distributed optimization problems and simultaneously find the optimal solution. Thus, we propose a novel distributed scheme, which utilizes distributed fractional Bayesian learning through weighted averaging on the log-beliefs to update the beliefs of unknown parameter, and distributed gradient descent for renewing the estimation of the optimal solution. Then under suitable assumptions, we prove that all agents' beliefs and decision variables converge almost surely to the true parameter and the optimal solution under the true parameter, respectively. We further establish a sublinear convergence rate for the belief sequence. Finally, numerical experiments are implemented to corroborate the theoretical analysis.
ORCA: ORchestrating Causal Agent
Causal inference is essential for decision-making science while the complexity of the data analysis workflow, ranging from data wrangling to causal analysis, increases substantially as the scale of data grows in complicated business environments. Especially, the execution of the workflow in relational databases by non-experts can result in repetitive bottlenecks which impede timely and responsible business insights. To address this challenge, we propose ORCA (Orchestrating Causal Agent), an LLM agentic system that can automate routine workflows in RDBMS while preserving expert oversight via human-AI interactions. ORCA orchestrates the full data analysis pipeline: interpreting natural language queries, navigating tables from DB servers, generating proper SQL codes, preprocessing data, and configuring modeling processes using causal inference libraries. Domain experts still can control the automation through iterative interactions with ORCA, enabling robust data-driven decision making with less technical expertise in statistical computing. Empirical evaluations on benchmark and synthetic e-commerce datasets demonstrate competitive performance of ORCA in table understanding, query generation, and cause-effect estimation -- achieving over $7\times$ improvement in estimating average treatment compared to GPT-4o mini.
comment: 24 pages, 17 figures, 1 table
Systems and Control (CS)
Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids
Learning generalizable robot manipulation policies, especially for complex multi-fingered humanoids, remains a significant challenge. Existing approaches primarily rely on extensive data collection and imitation learning, which are expensive, labor-intensive, and difficult to scale. Sim-to-real reinforcement learning (RL) offers a promising alternative, but has mostly succeeded in simpler state-based or single-hand setups. How to effectively extend this to vision-based, contact-rich bimanual manipulation tasks remains an open question. In this paper, we introduce a practical sim-to-real RL recipe that trains a humanoid robot to perform three challenging dexterous manipulation tasks: grasp-and-reach, box lift and bimanual handover. Our method features an automated real-to-sim tuning module, a generalized reward formulation based on contact and object goals, a divide-and-conquer policy distillation framework, and a hybrid object representation strategy with modality-specific augmentation. We demonstrate high success rates on unseen objects and robust, adaptive policy behaviors -- highlighting that vision-based dexterous manipulation via sim-to-real RL is not only viable, but also scalable and broadly applicable to real-world humanoid manipulation tasks.
comment: Published at CoRL 2025. Project page can be found at https://toruowo.github.io/recipe/
Directional excitability in Hilbert spaces
We introduce a generalized excitable system in which spikes can happen in a continuum of directions, therefore drastically enriching the expressivity and control capability of the spiking dynamics. In this generalized excitable system, spiking trajectories happen in a Hilbert space with an excitable resting state at the origin and spike responses that can be triggered in any direction as a function of the system's state and inputs. State-dependence of the spiking direction provide the system with a vanishing spiking memory trace, which enables robust tracking and integration of inputs in the spiking direction history. The model exhibits generalized forms of both Hodgkin's Type I and Type II excitability, capturing their usual bifurcation behaviors in an abstract setting. When used as the controller of a two-dimensional navigation task, this model facilitates both the sparseness of the actuation and its sensitivity to environmental inputs. These results highlight the potential of the proposed generalized excitable model for excitable control in high- and infinite-dimensional spaces.
comment: 6 pages, 7 figures
Decentralized Parametric Stability Certificates for Grid-Forming Converter Control
We propose a decentralized framework for guaranteeing the small-signal stability of future power systems with grid-forming converters. Our approach leverages dynamic loop-shifting techniques to compensate for the lack of passivity in the network dynamics and establishes decentralized parametric stability certificates, depending on the local device-level controls and incorporating the effects of the network dynamics. By following practical tuning rules, we are able to ensure plug-and-play operation without centralized coordination. Unlike prior works, our approach accommodates coupled frequency and voltage dynamics, incorporates network dynamics, and does not rely on specific network configurations or operating points, offering a general and scalable solution for the integration of power-electronics-based devices into future power systems. We validate our theoretical stability results through numerical case studies in a high-fidelity simulation model.
comment: 13 pages, 15 figures
Composable Uncertainty in Symmetric Monoidal Categories for Design Problems (Extended Version)
Applied category theory often studies symmetric monoidal categories (SMCs) whose morphisms represent open systems. These structures naturally accommodate complex wiring patterns, leveraging (co)monoidal structures for splitting and merging wires, or compact closed structures for feedback. A key example is the compact closed SMC of design problems (DP), which enables a compositional approach to co-design in engineering. However, in practice, the systems of interest may not be fully known. Recently, Markov categories have emerged as a powerful framework for modeling uncertain processes. In this work, we demonstrate how to integrate this perspective into the study of open systems while preserving consistency with the underlying SMC structure. To this end, we employ the change-of-base construction for enriched categories, replacing the morphisms of a symmetric monoidal $\mathcal{V}$-category $\mathcal{C}$ with parametric maps $A \to \mathcal{C}(X,Y)$ in a Markov category induced by a symmetric monoidal monad. This results in a symmetric monoidal 2-category $N_*\mathcal{C}$ with the same objects as $\mathcal{C}$ and reparametrization 2-cells. By choosing different monads, we capture various types of uncertainty. The category underlying $\mathcal{C}$ embeds into $N_*\mathcal{C}$ via a strict symmetric monoidal functor, allowing (co)monoidal and compact closed structures to be transferred. Applied to DP, this construction leads to categories of practical relevance, such as parametrized design problems for optimization, and parametrized distributions of design problems for decision theory and Bayesian learning.
comment: 23 pages, 2 figures, accepted to Applied Category Theory 2025
Current trends and future directions in event-based control
The defining characteristic of event-based control is that feedback loops are only closed when indicated by a triggering condition that takes recent information about the system into account. This stands in contrast to periodic control where the feedback loop is closed periodically. Benefits of event-based control arise when sampling comes at a cost, which occurs, e.g., for Networked Control Systems or in other setups with resource constraints. A rapidly growing number of publications deals with event-based control. Nevertheless, some fundamental questions about event-based control are still unsolved. In this article, we provide an overview of current research trends in event-based control. We focus on results that aim for a better understanding of effects that occur in feedback loops with event-based control. Based on this summary, we identify important open directions for future research.
comment: Submitted to the European Journal of Control
Adaptive control of dynamic networks
Real-world network systems are inherently dynamic, with network topologies undergoing continuous changes over time. Previous works often focus on static networks or rely on complete prior knowledge of evolving topologies, whereas real-world networks typically undergo stochastic structural changes that are difficult to predict in advance. To address this challenge, we define the adaptive control problem and propose an adaptive control algorithm to reduce the extra control cost caused by driver node switching. We introduce a node-level adaptive control metric to capture both the stability and consistency of each node across historical topologies. By integrating this metric with a partial matching repair strategy, our algorithm adjusts the minimum driver node set in real time at each snapshot, while minimizing unnecessary reconfigurations between consecutive time steps. Extensive experiments on synthetic and real-world dynamic networks demonstrate that the proposed adaptive control algorithm significantly outperforms the existing algorithm, reducing the switching cost by an average of 22% in synthetic networks and 19\% in real-world networks, without requiring foreknowledge of the future evolution of the network. These findings extend the theoretical scope of dynamic network controllability and open new avenues for practical applications in transportation, social, and molecular regulatory systems.
Beyond Asymptotics: Targeted exploration with finite-sample guarantees
In this paper, we introduce a targeted exploration strategy for the non-asymptotic, finite-time case. The proposed strategy is applicable to uncertain linear time-invariant systems subject to sub-Gaussian disturbances. As the main result, the proposed approach provides a priori guarantees, ensuring that the optimized exploration inputs achieve a desired accuracy of the model parameters. The technical derivation of the strategy (i) leverages existing non-asymptotic identification bounds with self-normalized martingales, (ii) utilizes spectral lines to predict the effect of sinusoidal excitation, and (iii) effectively accounts for spectral transient error and parametric uncertainty. A numerical example illustrates how the finite exploration time influence the required exploration energy.
comment: Extended paper with proofs, CDC 2025
Feedback Optimization with State Constraints through Control Barrier Functions
Recently, there has been a surge of research on a class of methods called feedback optimization. These are methods to steer the state of a control system to an equilibrium that arises as the solution of an optimization problem. Despite the growing literature on the topic, the important problem of enforcing state constraints at all times remains unaddressed. In this work, we present the first feedback-optimization method that enforces state constraints. The method combines a class of dynamics called safe gradient flows with high-order control barrier functions. We provide a number of results on our proposed controller, including well-posedness guarantees, anytime constraint-satisfaction guarantees, equivalence between the closed-loop's equilibria and the optimization problem's critical points, and local asymptotic stability of optima.
comment: accepted at the 64th IEEE Conference on Decision and Control (CDC), 2025
Robust MPC for Uncertain Linear Systems - Combining Model Adaptation and Iterative Learning
This paper presents a robust adaptive learning Model Predictive Control (MPC) framework for linear systems with parametric uncertainties and additive disturbances performing iterative tasks. The approach refines the parameter estimates online using set-membership estimation. Performance enhancement over iterations is achieved by learning the terminal cost from data. Safety is enforced using a terminal set, which is also learned iteratively. The proposed method guarantees recursive feasibility, constraint satisfaction, and a robust bound on the closed-loop cost. Numerical simulations on a mass-spring-damper system demonstrate improved computational efficiency and control performance compared to a robust adaptive MPC scheme without iterative learning of the terminal ingredients.
comment: Github link to the example: https://github.com/HannesPetrenz/RALMPC_Linear_Uncertain_Systems
Combined Stochastic and Robust Optimization for Electric Autonomous Mobility-on-Demand with Nested Benders Decomposition
The electrification and automation of mobility are reshaping how cities operate on-demand transport systems. Managing Electric Autonomous Mobility-on-Demand (EAMoD) fleets effectively requires coordinating dispatch, rebalancing, and charging decisions under multiple uncertainties, including travel demand, travel time, energy consumption, and charger availability. We address this challenge with a combined stochastic and robust model predictive control (MPC) framework. The framework integrates spatio-temporal Bayesian neural network forecasts with a multi-stage stochastic optimization model, formulated as a large-scale mixed-integer linear program. To ensure real-time applicability, we develop a tailored Nested Benders Decomposition that exploits the scenario tree structure and enables efficient parallelized solution. Stochastic optimization is employed to anticipate demand and infrastructure variability, while robust constraints on energy consumption and travel times safeguard feasibility under worst-case realizations. We evaluate the framework using high-fidelity simulations of San Francisco and Chicago. Compared with deterministic, reactive, and robust baselines, the combined stochastic and robust approach reduces median passenger waiting times by up to 36% and 95th-percentile delays by nearly 20%, while also lowering rebalancing distance by 27% and electricity costs by more than 35%. We also conduct a sensitivity analysis of battery size and vehicle efficiency, finding that energy-efficient vehicles maintain stable performance even with small batteries, whereas less efficient vehicles require larger batteries and greater infrastructure support. Our results emphasize the importance of jointly optimizing predictive control, vehicle capabilities, and infrastructure planning to enable scalable, cost-efficient EAMoD operations.
comment: 29 pages, 12 figures
Consensus in Multiagent Systems under communication failure
We consider multi-agent systems with cooperative interactions and study the convergence to consensus in the case of time-dependent connections, with possible communication failure. We prove a new condition ensuring consensus: we define a graph in which directed arrows correspond to connection functions that converge (in the weak sense) to some function with a positive integral on all intervals of the form $[t,+\infty)$. If the graph has a node reachable from all other indices, i.e.~``globally reachable'', then the system converges to consensus. We show that this requirement generalizes some known sufficient conditions for convergence, such as Moreau's or the Persistent Excitation one. We also give a second new condition, transversal to the known ones: total connectedness of the undirected graph formed by the non-vanishing of limiting functions.
A First-Order Gradient Approach for the Connectivity Optimization of Markov Chains
Graphs are commonly used to model various complex systems, including social networks, power grids, transportation networks, and biological systems. In many applications, the connectivity of these networks can be expressed through the Mean First Passage Times (MFPTs) of a Markov chain modeling a random walker on the graph. In this paper, we generalize the network metrics based on Markov chains' MFPTs and extend them to networks affected by uncertainty, in which edges may fail and hence not be present according to a pre-determined stochastic model. To find optimally connected Markov chains, we present a parameterization-free method for optimizing the MFPTs of the Markov chain. More specifically, we present an efficient Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm in the context of Markov chain optimization. The proposed algorithm is suitable for both fixed and random networks. Using various numerical experiments, we demonstrate scalability compared to established benchmarks. Importantly, our algorithm finds an optimal solution without requiring prior knowledge of edge failure probabilities, allowing for an online optimization approach.
CalibRefine: Deep Learning-Based Online Automatic Targetless LiDAR-Camera Calibration with Iterative and Attention-Driven Post-Refinement
Accurate multi-sensor calibration is essential for deploying robust perception systems in applications such as autonomous driving and intelligent transportation. Existing LiDAR-camera calibration methods often rely on manually placed targets, preliminary parameter estimates, or intensive data preprocessing, limiting their scalability and adaptability in real-world settings. In this work, we propose a fully automatic, targetless, and online calibration framework, CalibRefine, which directly processes raw LiDAR point clouds and camera images. Our approach is divided into four stages: (1) a Common Feature Discriminator that leverages relative spatial positions, visual appearance embeddings, and semantic class cues to identify and generate reliable LiDAR-camera correspondences, (2) a coarse homography-based calibration that uses the matched feature correspondences to estimate an initial transformation between the LiDAR and camera frames, serving as the foundation for further refinement, (3) an iterative refinement to incrementally improve alignment as additional data frames become available, and (4) an attention-based refinement that addresses non-planar distortions by leveraging a Vision Transformer and cross-attention mechanisms. Extensive experiments on two urban traffic datasets demonstrate that CalibRefine achieves high-precision calibration with minimal human input, outperforming state-of-the-art targetless methods and matching or surpassing manually tuned baselines. Our results show that robust object-level feature matching, combined with iterative refinement and self-supervised attention-based refinement, enables reliable sensor alignment in complex real-world conditions without ground-truth matrices or elaborate preprocessing. Code is available at https://github.com/radar-lab/Lidar_Camera_Automatic_Calibration
Offset-free model predictive control: stability under plant-model mismatch
We present the first general stability results for nonlinear offset-free model predictive control (MPC). Despite over twenty years of active research, the offset-free MPC literature has not shaken the assumption of closed-loop stability for establishing offset-free performance. In this paper, we present a nonlinear offset-free MPC design that is robustly stable with respect to the tracking errors, and thus achieves offset-free performance, despite plant-model mismatch and persistent disturbances. Key features and assumptions of this design include quadratic costs, differentiability of the plant and model functions, constraint backoffs at steady state, and a robustly stable state and disturbance estimator. We first establish nominal stability and offset-free performance. Then, robustness to state and disturbance estimate errors and setpoint and disturbance changes is demonstrated. Finally, the results are extended to sufficiently small plant-model mismatch. The results are illustrated by numerical examples.
comment: 56 pages, 4 figures
Best Response Convergence for Zero-sum Stochastic Dynamic Games with Partial and Asymmetric Information
We analyze best response dynamics for finding a Nash equilibrium of an infinite horizon zero-sum stochastic linear quadratic dynamic game (LQDG) with partial and asymmetric information. We derive explicit expressions for each player's best response within the class of pure linear dynamic output feedback control strategies where the internal state dimension of each control strategy is an integer multiple of the system state dimension. With each best response, the players form increasingly higher-order belief states, leading to infinite-dimensional internal states. However, we observe in extensive numerical experiments that the game's value converges after just a few iterations, suggesting that strategies associated with increasingly higher-order belief states eventually provide no benefit. To help explain this convergence, our numerical analysis reveals rapid decay of the controllability and observability Gramian eigenvalues and Hankel singular values in higher-order belief dynamics, indicating that the higher-order belief dynamics become increasingly difficult for both players to control and observe. Consequently, the higher-order belief dynamics can be closely approximated by low-order belief dynamics with bounded error, and thus feedback strategies with limited internal state dimension can closely approximate a Nash equilibrium.
Systems and Control (EESS)
Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids
Learning generalizable robot manipulation policies, especially for complex multi-fingered humanoids, remains a significant challenge. Existing approaches primarily rely on extensive data collection and imitation learning, which are expensive, labor-intensive, and difficult to scale. Sim-to-real reinforcement learning (RL) offers a promising alternative, but has mostly succeeded in simpler state-based or single-hand setups. How to effectively extend this to vision-based, contact-rich bimanual manipulation tasks remains an open question. In this paper, we introduce a practical sim-to-real RL recipe that trains a humanoid robot to perform three challenging dexterous manipulation tasks: grasp-and-reach, box lift and bimanual handover. Our method features an automated real-to-sim tuning module, a generalized reward formulation based on contact and object goals, a divide-and-conquer policy distillation framework, and a hybrid object representation strategy with modality-specific augmentation. We demonstrate high success rates on unseen objects and robust, adaptive policy behaviors -- highlighting that vision-based dexterous manipulation via sim-to-real RL is not only viable, but also scalable and broadly applicable to real-world humanoid manipulation tasks.
comment: Published at CoRL 2025. Project page can be found at https://toruowo.github.io/recipe/
Directional excitability in Hilbert spaces
We introduce a generalized excitable system in which spikes can happen in a continuum of directions, therefore drastically enriching the expressivity and control capability of the spiking dynamics. In this generalized excitable system, spiking trajectories happen in a Hilbert space with an excitable resting state at the origin and spike responses that can be triggered in any direction as a function of the system's state and inputs. State-dependence of the spiking direction provide the system with a vanishing spiking memory trace, which enables robust tracking and integration of inputs in the spiking direction history. The model exhibits generalized forms of both Hodgkin's Type I and Type II excitability, capturing their usual bifurcation behaviors in an abstract setting. When used as the controller of a two-dimensional navigation task, this model facilitates both the sparseness of the actuation and its sensitivity to environmental inputs. These results highlight the potential of the proposed generalized excitable model for excitable control in high- and infinite-dimensional spaces.
comment: 6 pages, 7 figures
Decentralized Parametric Stability Certificates for Grid-Forming Converter Control
We propose a decentralized framework for guaranteeing the small-signal stability of future power systems with grid-forming converters. Our approach leverages dynamic loop-shifting techniques to compensate for the lack of passivity in the network dynamics and establishes decentralized parametric stability certificates, depending on the local device-level controls and incorporating the effects of the network dynamics. By following practical tuning rules, we are able to ensure plug-and-play operation without centralized coordination. Unlike prior works, our approach accommodates coupled frequency and voltage dynamics, incorporates network dynamics, and does not rely on specific network configurations or operating points, offering a general and scalable solution for the integration of power-electronics-based devices into future power systems. We validate our theoretical stability results through numerical case studies in a high-fidelity simulation model.
comment: 13 pages, 15 figures
Composable Uncertainty in Symmetric Monoidal Categories for Design Problems (Extended Version)
Applied category theory often studies symmetric monoidal categories (SMCs) whose morphisms represent open systems. These structures naturally accommodate complex wiring patterns, leveraging (co)monoidal structures for splitting and merging wires, or compact closed structures for feedback. A key example is the compact closed SMC of design problems (DP), which enables a compositional approach to co-design in engineering. However, in practice, the systems of interest may not be fully known. Recently, Markov categories have emerged as a powerful framework for modeling uncertain processes. In this work, we demonstrate how to integrate this perspective into the study of open systems while preserving consistency with the underlying SMC structure. To this end, we employ the change-of-base construction for enriched categories, replacing the morphisms of a symmetric monoidal $\mathcal{V}$-category $\mathcal{C}$ with parametric maps $A \to \mathcal{C}(X,Y)$ in a Markov category induced by a symmetric monoidal monad. This results in a symmetric monoidal 2-category $N_*\mathcal{C}$ with the same objects as $\mathcal{C}$ and reparametrization 2-cells. By choosing different monads, we capture various types of uncertainty. The category underlying $\mathcal{C}$ embeds into $N_*\mathcal{C}$ via a strict symmetric monoidal functor, allowing (co)monoidal and compact closed structures to be transferred. Applied to DP, this construction leads to categories of practical relevance, such as parametrized design problems for optimization, and parametrized distributions of design problems for decision theory and Bayesian learning.
comment: 23 pages, 2 figures, accepted to Applied Category Theory 2025
Current trends and future directions in event-based control
The defining characteristic of event-based control is that feedback loops are only closed when indicated by a triggering condition that takes recent information about the system into account. This stands in contrast to periodic control where the feedback loop is closed periodically. Benefits of event-based control arise when sampling comes at a cost, which occurs, e.g., for Networked Control Systems or in other setups with resource constraints. A rapidly growing number of publications deals with event-based control. Nevertheless, some fundamental questions about event-based control are still unsolved. In this article, we provide an overview of current research trends in event-based control. We focus on results that aim for a better understanding of effects that occur in feedback loops with event-based control. Based on this summary, we identify important open directions for future research.
comment: Submitted to the European Journal of Control
Adaptive control of dynamic networks
Real-world network systems are inherently dynamic, with network topologies undergoing continuous changes over time. Previous works often focus on static networks or rely on complete prior knowledge of evolving topologies, whereas real-world networks typically undergo stochastic structural changes that are difficult to predict in advance. To address this challenge, we define the adaptive control problem and propose an adaptive control algorithm to reduce the extra control cost caused by driver node switching. We introduce a node-level adaptive control metric to capture both the stability and consistency of each node across historical topologies. By integrating this metric with a partial matching repair strategy, our algorithm adjusts the minimum driver node set in real time at each snapshot, while minimizing unnecessary reconfigurations between consecutive time steps. Extensive experiments on synthetic and real-world dynamic networks demonstrate that the proposed adaptive control algorithm significantly outperforms the existing algorithm, reducing the switching cost by an average of 22% in synthetic networks and 19\% in real-world networks, without requiring foreknowledge of the future evolution of the network. These findings extend the theoretical scope of dynamic network controllability and open new avenues for practical applications in transportation, social, and molecular regulatory systems.
Beyond Asymptotics: Targeted exploration with finite-sample guarantees
In this paper, we introduce a targeted exploration strategy for the non-asymptotic, finite-time case. The proposed strategy is applicable to uncertain linear time-invariant systems subject to sub-Gaussian disturbances. As the main result, the proposed approach provides a priori guarantees, ensuring that the optimized exploration inputs achieve a desired accuracy of the model parameters. The technical derivation of the strategy (i) leverages existing non-asymptotic identification bounds with self-normalized martingales, (ii) utilizes spectral lines to predict the effect of sinusoidal excitation, and (iii) effectively accounts for spectral transient error and parametric uncertainty. A numerical example illustrates how the finite exploration time influence the required exploration energy.
comment: Extended paper with proofs, CDC 2025
Feedback Optimization with State Constraints through Control Barrier Functions
Recently, there has been a surge of research on a class of methods called feedback optimization. These are methods to steer the state of a control system to an equilibrium that arises as the solution of an optimization problem. Despite the growing literature on the topic, the important problem of enforcing state constraints at all times remains unaddressed. In this work, we present the first feedback-optimization method that enforces state constraints. The method combines a class of dynamics called safe gradient flows with high-order control barrier functions. We provide a number of results on our proposed controller, including well-posedness guarantees, anytime constraint-satisfaction guarantees, equivalence between the closed-loop's equilibria and the optimization problem's critical points, and local asymptotic stability of optima.
comment: accepted at the 64th IEEE Conference on Decision and Control (CDC), 2025
Robust MPC for Uncertain Linear Systems - Combining Model Adaptation and Iterative Learning
This paper presents a robust adaptive learning Model Predictive Control (MPC) framework for linear systems with parametric uncertainties and additive disturbances performing iterative tasks. The approach refines the parameter estimates online using set-membership estimation. Performance enhancement over iterations is achieved by learning the terminal cost from data. Safety is enforced using a terminal set, which is also learned iteratively. The proposed method guarantees recursive feasibility, constraint satisfaction, and a robust bound on the closed-loop cost. Numerical simulations on a mass-spring-damper system demonstrate improved computational efficiency and control performance compared to a robust adaptive MPC scheme without iterative learning of the terminal ingredients.
comment: Github link to the example: https://github.com/HannesPetrenz/RALMPC_Linear_Uncertain_Systems
Combined Stochastic and Robust Optimization for Electric Autonomous Mobility-on-Demand with Nested Benders Decomposition
The electrification and automation of mobility are reshaping how cities operate on-demand transport systems. Managing Electric Autonomous Mobility-on-Demand (EAMoD) fleets effectively requires coordinating dispatch, rebalancing, and charging decisions under multiple uncertainties, including travel demand, travel time, energy consumption, and charger availability. We address this challenge with a combined stochastic and robust model predictive control (MPC) framework. The framework integrates spatio-temporal Bayesian neural network forecasts with a multi-stage stochastic optimization model, formulated as a large-scale mixed-integer linear program. To ensure real-time applicability, we develop a tailored Nested Benders Decomposition that exploits the scenario tree structure and enables efficient parallelized solution. Stochastic optimization is employed to anticipate demand and infrastructure variability, while robust constraints on energy consumption and travel times safeguard feasibility under worst-case realizations. We evaluate the framework using high-fidelity simulations of San Francisco and Chicago. Compared with deterministic, reactive, and robust baselines, the combined stochastic and robust approach reduces median passenger waiting times by up to 36% and 95th-percentile delays by nearly 20%, while also lowering rebalancing distance by 27% and electricity costs by more than 35%. We also conduct a sensitivity analysis of battery size and vehicle efficiency, finding that energy-efficient vehicles maintain stable performance even with small batteries, whereas less efficient vehicles require larger batteries and greater infrastructure support. Our results emphasize the importance of jointly optimizing predictive control, vehicle capabilities, and infrastructure planning to enable scalable, cost-efficient EAMoD operations.
comment: 29 pages, 12 figures
Consensus in Multiagent Systems under communication failure
We consider multi-agent systems with cooperative interactions and study the convergence to consensus in the case of time-dependent connections, with possible communication failure. We prove a new condition ensuring consensus: we define a graph in which directed arrows correspond to connection functions that converge (in the weak sense) to some function with a positive integral on all intervals of the form $[t,+\infty)$. If the graph has a node reachable from all other indices, i.e.~``globally reachable'', then the system converges to consensus. We show that this requirement generalizes some known sufficient conditions for convergence, such as Moreau's or the Persistent Excitation one. We also give a second new condition, transversal to the known ones: total connectedness of the undirected graph formed by the non-vanishing of limiting functions.
A First-Order Gradient Approach for the Connectivity Optimization of Markov Chains
Graphs are commonly used to model various complex systems, including social networks, power grids, transportation networks, and biological systems. In many applications, the connectivity of these networks can be expressed through the Mean First Passage Times (MFPTs) of a Markov chain modeling a random walker on the graph. In this paper, we generalize the network metrics based on Markov chains' MFPTs and extend them to networks affected by uncertainty, in which edges may fail and hence not be present according to a pre-determined stochastic model. To find optimally connected Markov chains, we present a parameterization-free method for optimizing the MFPTs of the Markov chain. More specifically, we present an efficient Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm in the context of Markov chain optimization. The proposed algorithm is suitable for both fixed and random networks. Using various numerical experiments, we demonstrate scalability compared to established benchmarks. Importantly, our algorithm finds an optimal solution without requiring prior knowledge of edge failure probabilities, allowing for an online optimization approach.
CalibRefine: Deep Learning-Based Online Automatic Targetless LiDAR-Camera Calibration with Iterative and Attention-Driven Post-Refinement
Accurate multi-sensor calibration is essential for deploying robust perception systems in applications such as autonomous driving and intelligent transportation. Existing LiDAR-camera calibration methods often rely on manually placed targets, preliminary parameter estimates, or intensive data preprocessing, limiting their scalability and adaptability in real-world settings. In this work, we propose a fully automatic, targetless, and online calibration framework, CalibRefine, which directly processes raw LiDAR point clouds and camera images. Our approach is divided into four stages: (1) a Common Feature Discriminator that leverages relative spatial positions, visual appearance embeddings, and semantic class cues to identify and generate reliable LiDAR-camera correspondences, (2) a coarse homography-based calibration that uses the matched feature correspondences to estimate an initial transformation between the LiDAR and camera frames, serving as the foundation for further refinement, (3) an iterative refinement to incrementally improve alignment as additional data frames become available, and (4) an attention-based refinement that addresses non-planar distortions by leveraging a Vision Transformer and cross-attention mechanisms. Extensive experiments on two urban traffic datasets demonstrate that CalibRefine achieves high-precision calibration with minimal human input, outperforming state-of-the-art targetless methods and matching or surpassing manually tuned baselines. Our results show that robust object-level feature matching, combined with iterative refinement and self-supervised attention-based refinement, enables reliable sensor alignment in complex real-world conditions without ground-truth matrices or elaborate preprocessing. Code is available at https://github.com/radar-lab/Lidar_Camera_Automatic_Calibration
Offset-free model predictive control: stability under plant-model mismatch
We present the first general stability results for nonlinear offset-free model predictive control (MPC). Despite over twenty years of active research, the offset-free MPC literature has not shaken the assumption of closed-loop stability for establishing offset-free performance. In this paper, we present a nonlinear offset-free MPC design that is robustly stable with respect to the tracking errors, and thus achieves offset-free performance, despite plant-model mismatch and persistent disturbances. Key features and assumptions of this design include quadratic costs, differentiability of the plant and model functions, constraint backoffs at steady state, and a robustly stable state and disturbance estimator. We first establish nominal stability and offset-free performance. Then, robustness to state and disturbance estimate errors and setpoint and disturbance changes is demonstrated. Finally, the results are extended to sufficiently small plant-model mismatch. The results are illustrated by numerical examples.
comment: 56 pages, 4 figures
Best Response Convergence for Zero-sum Stochastic Dynamic Games with Partial and Asymmetric Information
We analyze best response dynamics for finding a Nash equilibrium of an infinite horizon zero-sum stochastic linear quadratic dynamic game (LQDG) with partial and asymmetric information. We derive explicit expressions for each player's best response within the class of pure linear dynamic output feedback control strategies where the internal state dimension of each control strategy is an integer multiple of the system state dimension. With each best response, the players form increasingly higher-order belief states, leading to infinite-dimensional internal states. However, we observe in extensive numerical experiments that the game's value converges after just a few iterations, suggesting that strategies associated with increasingly higher-order belief states eventually provide no benefit. To help explain this convergence, our numerical analysis reveals rapid decay of the controllability and observability Gramian eigenvalues and Hankel singular values in higher-order belief dynamics, indicating that the higher-order belief dynamics become increasingly difficult for both players to control and observe. Consequently, the higher-order belief dynamics can be closely approximated by low-order belief dynamics with bounded error, and thus feedback strategies with limited internal state dimension can closely approximate a Nash equilibrium.
Robotics
Unscented Kalman Filter with a Nonlinear Propagation Model for Navigation Applications
The unscented Kalman filter is a nonlinear estimation algorithm commonly used in navigation applications. The prediction of the mean and covariance matrix is crucial to the stable behavior of the filter. This prediction is done by propagating the sigma points according to the dynamic model at hand. In this paper, we introduce an innovative method to propagate the sigma points according to the nonlinear dynamic model of the navigation error state vector. This improves the filter accuracy and navigation performance. We demonstrate the benefits of our proposed approach using real sensor data recorded by an autonomous underwater vehicle during several scenarios.
comment: 6 pages, 4 figures
HERMES: Human-to-Robot Embodied Learning from Multi-Source Motion Data for Mobile Dexterous Manipulation
Leveraging human motion data to impart robots with versatile manipulation skills has emerged as a promising paradigm in robotic manipulation. Nevertheless, translating multi-source human hand motions into feasible robot behaviors remains challenging, particularly for robots equipped with multi-fingered dexterous hands characterized by complex, high-dimensional action spaces. Moreover, existing approaches often struggle to produce policies capable of adapting to diverse environmental conditions. In this paper, we introduce HERMES, a human-to-robot learning framework for mobile bimanual dexterous manipulation. First, HERMES formulates a unified reinforcement learning approach capable of seamlessly transforming heterogeneous human hand motions from multiple sources into physically plausible robotic behaviors. Subsequently, to mitigate the sim2real gap, we devise an end-to-end, depth image-based sim2real transfer method for improved generalization to real-world scenarios. Furthermore, to enable autonomous operation in varied and unstructured environments, we augment the navigation foundation model with a closed-loop Perspective-n-Point (PnP) localization mechanism, ensuring precise alignment of visual goals and effectively bridging autonomous navigation and dexterous manipulation. Extensive experimental results demonstrate that HERMES consistently exhibits generalizable behaviors across diverse, in-the-wild scenarios, successfully performing numerous complex mobile bimanual dexterous manipulation tasks. Project Page:https://gemcollector.github.io/HERMES/.
Making Physical Objects with Generative AI and Robotic Assembly: Considering Fabrication Constraints, Sustainability, Time, Functionality, and Accessibility
3D generative AI enables rapid and accessible creation of 3D models from text or image inputs. However, translating these outputs into physical objects remains a challenge due to the constraints in the physical world. Recent studies have focused on improving the capabilities of 3D generative AI to produce fabricable outputs, with 3D printing as the main fabrication method. However, this workshop paper calls for a broader perspective by considering how fabrication methods align with the capabilities of 3D generative AI. As a case study, we present a novel system using discrete robotic assembly and 3D generative AI to make physical objects. Through this work, we identified five key aspects to consider in a physical making process based on the capabilities of 3D generative AI. 1) Fabrication Constraints: Current text-to-3D models can generate a wide range of 3D designs, requiring fabrication methods that can adapt to the variability of generative AI outputs. 2) Time: While generative AI can generate 3D models in seconds, fabricating physical objects can take hours or even days. Faster production could enable a closer iterative design loop between humans and AI in the making process. 3) Sustainability: Although text-to-3D models can generate thousands of models in the digital world, extending this capability to the real world would be resource-intensive, unsustainable and irresponsible. 4) Functionality: Unlike digital outputs from 3D generative AI models, the fabrication method plays a crucial role in the usability of physical objects. 5) Accessibility: While generative AI simplifies 3D model creation, the need for fabrication equipment can limit participation, making AI-assisted creation less inclusive. These five key aspects provide a framework for assessing how well a physical making process aligns with the capabilities of 3D generative AI and values in the world.
comment: ACM CHI Conference on Human Factors in Computing Systems (CHI 2025), Workshop on Generative AI and Human-Computer Interaction, Yokohama, Japan, April 26 to May 1, 2025
Computational Design and Fabrication of Modular Robots with Untethered Control
Natural organisms utilize distributed actuation through their musculoskeletal systems to adapt their gait for traversing diverse terrains or to morph their bodies for varied tasks. A longstanding challenge in robotics is to emulate this capability of natural organisms, which has motivated the development of numerous soft robotic systems. However, such systems are generally optimized for a single functionality, lack the ability to change form or function on demand, or remain tethered to bulky control systems. To address these limitations, we present a framework for designing and controlling robots that utilize distributed actuation. We propose a novel building block that integrates 3D-printed bones with liquid crystal elastomer (LCE) muscles as lightweight actuators, enabling the modular assembly of musculoskeletal robots. We developed LCE rods that contract in response to infrared radiation, thereby providing localized, untethered control over the distributed skeletal network and producing global deformations of the robot. To fully capitalize on the extensive design space, we introduce two computational tools: one for optimizing the robot's skeletal graph to achieve multiple target deformations, and another for co-optimizing skeletal designs and control gaits to realize desired locomotion. We validate our framework by constructing several robots that demonstrate complex shape morphing, diverse control schemes, and environmental adaptability. Our system integrates advances in modular material building, untethered and distributed control, and computational design to introduce a new generation of robots that brings us closer to the capabilities of living organisms.
Efficient Online Learning and Adaptive Planning for Robotic Information Gathering Based on Streaming Data
Robotic information gathering (RIG) techniques refer to methods where mobile robots are used to acquire data about the physical environment with a suite of sensors. Informative planning is an important part of RIG where the goal is to find sequences of actions or paths that maximize efficiency or the quality of information collected. Many existing solutions solve this problem by assuming that the environment is known in advance. However, real environments could be unknown or time-varying, and adaptive informative planning remains an active area of research. Adaptive planning and incremental online mapping are required for mapping initially unknown or varying spatial fields. Gaussian process (GP) regression is a widely used technique in RIG for mapping continuous spatial fields. However, it falls short in many applications as its real-time performance does not scale well to large datasets. To address these challenges, this paper proposes an efficient adaptive informative planning approach for mapping continuous scalar fields with GPs with streaming sparse GPs. Simulation experiments are performed with a synthetic dataset and compared against existing benchmarks. Finally, it is also verified with a real-world dataset to further validate the efficacy of the proposed method. Results show that our method achieves similar mapping accuracy to the baselines while reducing computational complexity for longer missions.
comment: Accepted for presentation at 2025 European Conference on Mobile Robots
Safety-Critical Human-Machine Shared Driving for Vehicle Collision Avoidance based on Hamilton-Jacobi reachability
Road safety continues to be a pressing global issue, with vehicle collisions imposing significant human, societal, and economic burdens. Human-machine shared collision avoidance in critical collision scenarios aims to aid drivers' accident avoidance through intervening only when necessary. Existing methods count on replanning collision-free trajectories and imposing human-machine tracking, which usually interrupts the driver's intent and increases the risk of conflict. This paper introduces a Reachability-Aware Reinforcement Learning (RL) framework for shared control, guided by Hamilton-Jacobi (HJ) reachability analysis. Machine intervention is activated only when the vehicle approaches the Collision Avoidance Reachable Set (CARS), which represents states where collision is unavoidable. First, we precompute the reachability distributions and the CARS by solving the Bellman equation using offline data. To reduce human-machine conflicts, we develop a driver model for sudden obstacles and propose an authority allocation strategy considering key collision avoidance features. Finally, we train a RL agent to reduce human-machine conflicts while enforcing the hard constraint of avoiding entry into the CARS. The proposed method was tested on a real vehicle platform. Results show that the controller intervenes effectively near CARS to prevent collisions while maintaining improved original driving task performance. Robustness analysis further supports its flexibility across different driver attributes.
comment: 36 pages, 15 figures
YORI: Autonomous Cooking System Utilizing a Modular Robotic Kitchen and a Dual-Arm Proprioceptive Manipulator
This paper presents Yummy Operations Robot Initiative (YORI), a proprioceptive dual-arm robotic system that demonstrates autonomous multi-dish cooking for scalable food service applications. YORI integrates a dual-arm manipulator equipped with proprioceptive actuators, custom-designed tools, appliances, and a structured kitchen environment to address the complexities of cooking tasks. The proprioceptive actuators enable fast, precise, force-controlled movements while mitigating the risks associated with cooking-related impacts. The system's modular kitchen design and flexible tool-changing mechanism support simultaneous multi-dish preparation through torque control and optimization-based motion planning and scheduling. A comprehensive scheduling framework with dynamic rescheduling ensures reliable adaptation to new orders and delays. The system was publicly validated through live demonstrations, reliably preparing steak-frites across multiple convention sessions. This paper details YORI's design and explores future directions in kitchen optimization, task planning, and food quality control, demonstrating its potential as a scalable robotic cooking solution. A system introduction and cooking videos are available online
comment: This work has been submitted to IEEE Robotics & Automation Magazine for possible publication
A Survey on Vision-Language-Action Models for Embodied AI
Embodied AI is widely recognized as a key element of artificial general intelligence because it involves controlling embodied agents to perform tasks in the physical world. Building on the success of large language models and vision-language models, a new category of multimodal models -- referred to as vision-language-action models (VLAs) -- has emerged to address language-conditioned robotic tasks in embodied AI by leveraging their distinct ability to generate actions. In recent years, a myriad of VLAs have been developed, making it imperative to capture the rapidly evolving landscape through a comprehensive survey. To this end, we present the first survey on VLAs for embodied AI. This work provides a detailed taxonomy of VLAs, organized into three major lines of research. The first line focuses on individual components of VLAs. The second line is dedicated to developing control policies adept at predicting low-level actions. The third line comprises high-level task planners capable of decomposing long-horizon tasks into a sequence of subtasks, thereby guiding VLAs to follow more general user instructions. Furthermore, we provide an extensive summary of relevant resources, including datasets, simulators, and benchmarks. Finally, we discuss the challenges faced by VLAs and outline promising future directions in embodied AI. We have created a project associated with this survey, which is available at https://github.com/yueen-ma/Awesome-VLA.
comment: Project page: https://github.com/yueen-ma/Awesome-VLA
Multiagent Systems
CRMAgent: A Multi-Agent LLM System for E-Commerce CRM Message Template Generation
In e-commerce private-domain channels such as instant messaging and e-mail, merchants engage customers directly as part of their Customer Relationship Management (CRM) programmes to drive retention and conversion. While a few top performers excel at crafting outbound messages, most merchants struggle to write persuasive copy because they lack both expertise and scalable tools. We introduce CRMAgent, a multi-agent system built on large language models (LLMs) that generates high-quality message templates and actionable writing guidance through three complementary modes. First, group-based learning enables the agent to learn from a merchant's own top-performing messages within the same audience segment and rewrite low-performing ones. Second, retrieval-and-adaptation fetches templates that share the same audience segment and exhibit high similarity in voucher type and product category, learns their successful patterns, and adapts them to the current campaign. Third, a rule-based fallback provides a lightweight zero-shot rewrite when no suitable references are available. Extensive experiments show that CRMAgent consistently outperforms merchants' original templates, delivering significant gains in both audience-match and marketing-effectiveness metrics.
Grid-Agent: An LLM-Powered Multi-Agent System for Power Grid Control
The increasing penetration of Distributed Energy Resources (DERs), widespread adoption of Electric Vehicles (EVs), and the growing frequency of extreme weather events have significantly increased the complexity of power grid planning, operation, and management. Traditional rule-based systems and numerical optimization approaches often struggle with the scale, dynamics, and adaptability required by modern power networks. This paper introduces Grid-Agent, an autonomous, AI-driven framework that combines Large Language Models (LLMs) with multi-agent reinforcement learning to detect and remediate grid violations in real time. Grid-Agent integrates semantic reasoning with numerical precision through a modular agent architecture: a planning agent generates coordinated action sequences using numerical power flow solvers, while a validation agent evaluates system stability and action effectiveness via sandboxed execution with safety rollbacks. To ensure scalability, Grid-Agent incorporates an adaptive multiscale network representation that dynamically selects optimal encoding schemes based on network size and complexity. The framework enables coordinated violation resolution through optimizing switch configurations, battery deployment, and load curtailment strategies. Experimental results in standard IEEE and CIGRE test systems (IEEE 69-bus, CIGRE MV, and IEEE 30-bus) demonstrate superior violation mitigation performance. Additionally, the framework's built-in data collection and learning capabilities enable continuous learning and adaptation to diverse network topologies. The autonomous nature of the framework makes it particularly suitable for modern smart grid applications requiring rapid response to dynamic operating conditions.
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Recent advances in large language models have sparked growing interest in AI agents capable of solving complex, real-world tasks. However, most existing agent systems rely on manually crafted configurations that remain static after deployment, limiting their ability to adapt to dynamic and evolving environments. To this end, recent research has explored agent evolution techniques that aim to automatically enhance agent systems based on interaction data and environmental feedback. This emerging direction lays the foundation for self-evolving AI agents, which bridge the static capabilities of foundation models with the continuous adaptability required by lifelong agentic systems. In this survey, we provide a comprehensive review of existing techniques for self-evolving agentic systems. Specifically, we first introduce a unified conceptual framework that abstracts the feedback loop underlying the design of self-evolving agentic systems. The framework highlights four key components: System Inputs, Agent System, Environment, and Optimisers, serving as a foundation for understanding and comparing different strategies. Based on this framework, we systematically review a wide range of self-evolving techniques that target different components of the agent system. We also investigate domain-specific evolution strategies developed for specialised fields such as biomedicine, programming, and finance, where optimisation objectives are tightly coupled with domain constraints. In addition, we provide a dedicated discussion on the evaluation, safety, and ethical considerations for self-evolving agentic systems, which are critical to ensuring their effectiveness and reliability. This survey aims to provide researchers and practitioners with a systematic understanding of self-evolving AI agents, laying the foundation for the development of more adaptive, autonomous, and lifelong agentic systems.
comment: Github Repo: https://github.com/EvoAgentX/Awesome-Self-Evolving-Agents
Systems and Control (CS)
A State-Space Representation of Coupled Linear Multivariate PDEs and Stability Analysis using SDP
Physical processes evolving in both time and space are often modeled using Partial Differential Equations (PDEs). Recently, it has been shown how stability analysis and control of coupled PDEs in a single spatial variable can be more conveniently performed using an equivalent Partial Integral Equation (PIE) representation. The construction of this PIE representation is based on an analytic expression for the inverse of the spatial differential operator, $\partial_s^{d}$, on the domain defined by boundary conditions. In this paper, we show how this univariate representation may be extended inductively to multiple spatial variables by representing the domain as the intersection of lifted univariate domains. Specifically, we show that if each univariate domain is well-posed, then there exists a readily verified consistency condition which is necessary and sufficient for existence of an inverse to the multivariate spatial differential operator, $D^\alpha=\partial_{s_1}^{\alpha_1}\cdots\partial_{s_N}^{\alpha_N}$, on the PDE domain. Furthermore, we show that this inverse is an element of a $*$-algebra of Partial Integral (PI) operators defined by polynomial semi-separable kernels. Based on this operator algebra, we show that the evolution of any suitably well-posed linear multivariate PDE may be described by a PIE, parameterized by elements of the PI algebra. A convex computational test for PDE stability is then proposed using a positive matrix parameterization of positive PI operators, and software (PIETOOLS) is provided which automates the process of representation and stability analysis of such PDEs. This software is used to analyze stability of 2D heat, wave, and plate equations, obtaining accurate bounds on the rate of decay.
Systems and Control (EESS)
A State-Space Representation of Coupled Linear Multivariate PDEs and Stability Analysis using SDP
Physical processes evolving in both time and space are often modeled using Partial Differential Equations (PDEs). Recently, it has been shown how stability analysis and control of coupled PDEs in a single spatial variable can be more conveniently performed using an equivalent Partial Integral Equation (PIE) representation. The construction of this PIE representation is based on an analytic expression for the inverse of the spatial differential operator, $\partial_s^{d}$, on the domain defined by boundary conditions. In this paper, we show how this univariate representation may be extended inductively to multiple spatial variables by representing the domain as the intersection of lifted univariate domains. Specifically, we show that if each univariate domain is well-posed, then there exists a readily verified consistency condition which is necessary and sufficient for existence of an inverse to the multivariate spatial differential operator, $D^\alpha=\partial_{s_1}^{\alpha_1}\cdots\partial_{s_N}^{\alpha_N}$, on the PDE domain. Furthermore, we show that this inverse is an element of a $*$-algebra of Partial Integral (PI) operators defined by polynomial semi-separable kernels. Based on this operator algebra, we show that the evolution of any suitably well-posed linear multivariate PDE may be described by a PIE, parameterized by elements of the PI algebra. A convex computational test for PDE stability is then proposed using a positive matrix parameterization of positive PI operators, and software (PIETOOLS) is provided which automates the process of representation and stability analysis of such PDEs. This software is used to analyze stability of 2D heat, wave, and plate equations, obtaining accurate bounds on the rate of decay.
Probabilistic Reachable Set Estimation for Saturated Systems with Unbounded Additive Disturbances
In this paper, we present an analytical approach for the synthesis of ellipsoidal probabilistic reachable sets of saturated systems subject to unbounded additive noise. Using convex optimization methods, we compute a contraction factor of the saturated error dynamics that allows us to tightly bound its evolution and therefore construct accurate reachable sets. The proposed approach is applicable to independent, zero mean disturbances with a known covariance. A numerical example illustrates the applicability and effectiveness of the proposed design.
comment: 11 pages, LaTeX; section III.C framing rephrased, Remark 3 added, numerical example presentation updated, typos corrected, references updated
Grid-Agent: An LLM-Powered Multi-Agent System for Power Grid Control
The increasing penetration of Distributed Energy Resources (DERs), widespread adoption of Electric Vehicles (EVs), and the growing frequency of extreme weather events have significantly increased the complexity of power grid planning, operation, and management. Traditional rule-based systems and numerical optimization approaches often struggle with the scale, dynamics, and adaptability required by modern power networks. This paper introduces Grid-Agent, an autonomous, AI-driven framework that combines Large Language Models (LLMs) with multi-agent reinforcement learning to detect and remediate grid violations in real time. Grid-Agent integrates semantic reasoning with numerical precision through a modular agent architecture: a planning agent generates coordinated action sequences using numerical power flow solvers, while a validation agent evaluates system stability and action effectiveness via sandboxed execution with safety rollbacks. To ensure scalability, Grid-Agent incorporates an adaptive multiscale network representation that dynamically selects optimal encoding schemes based on network size and complexity. The framework enables coordinated violation resolution through optimizing switch configurations, battery deployment, and load curtailment strategies. Experimental results in standard IEEE and CIGRE test systems (IEEE 69-bus, CIGRE MV, and IEEE 30-bus) demonstrate superior violation mitigation performance. Additionally, the framework's built-in data collection and learning capabilities enable continuous learning and adaptation to diverse network topologies. The autonomous nature of the framework makes it particularly suitable for modern smart grid applications requiring rapid response to dynamic operating conditions.
A Review of Hydrogen-Enabled Resilience Enhancement for Multi-Energy Systems
Ensuring resilience in multi-energy systems (MESs) becomes both more urgent and more challenging due to the rising occurrence and severity of extreme events (e.g., natural disasters, extreme weather, and cyber-physical attacks). Among many measures of strengthening MES resilience, the integration of hydrogen shows exceptional potential in cross-temporal flexibility, cross-spatial flexibility, cross-sector flexibility, and black start capability. Although many hydrogen-enabled MES resilience enhancement measures have been developed, the current literature lacks a systematic overview of hydrogen-enabled resilience enhancement in MESs. To fill the research gap, this paper provides a comprehensive overview of hydrogen-enabled MES resilience enhancement. First, advantages and challenges of adopting hydrogen in MES resilience enhancement are summarized. Then, we propose a resilience enhancement framework for hydrogen-enabled MESs. Under the proposed framework, existing resilience metrics and event-oriented contingency models are summarized and discussed. Furthermore, we classify hydrogen-enabled planning measures by the types of hydrogen-related facilities and provide some insights for planning problem formulation frameworks. Moreover, we categorize the hydrogen-enabled operation enhancement measures into three operation response stages: preventive, emergency, and restoration. Finally, we identify some research gaps and point out possible future directions in aspects of comprehensive resilience metric design, temporally-correlated event-targeted scenario generation, multi-type temporal-spatial cyber-physical contingency modeling under compound extreme events, multi-network multi-timescale coordinated planning and operation, low-carbon resilient planning and operation, and large language model-assisted whole-process resilience enhancement.
comment: 28 pages, 14 figures
Safety-Critical Human-Machine Shared Driving for Vehicle Collision Avoidance based on Hamilton-Jacobi reachability
Road safety continues to be a pressing global issue, with vehicle collisions imposing significant human, societal, and economic burdens. Human-machine shared collision avoidance in critical collision scenarios aims to aid drivers' accident avoidance through intervening only when necessary. Existing methods count on replanning collision-free trajectories and imposing human-machine tracking, which usually interrupts the driver's intent and increases the risk of conflict. This paper introduces a Reachability-Aware Reinforcement Learning (RL) framework for shared control, guided by Hamilton-Jacobi (HJ) reachability analysis. Machine intervention is activated only when the vehicle approaches the Collision Avoidance Reachable Set (CARS), which represents states where collision is unavoidable. First, we precompute the reachability distributions and the CARS by solving the Bellman equation using offline data. To reduce human-machine conflicts, we develop a driver model for sudden obstacles and propose an authority allocation strategy considering key collision avoidance features. Finally, we train a RL agent to reduce human-machine conflicts while enforcing the hard constraint of avoiding entry into the CARS. The proposed method was tested on a real vehicle platform. Results show that the controller intervenes effectively near CARS to prevent collisions while maintaining improved original driving task performance. Robustness analysis further supports its flexibility across different driver attributes.
comment: 36 pages, 15 figures
Robotics
SPGrasp: Spatiotemporal Prompt-driven Grasp Synthesis in Dynamic Scenes
Real-time interactive grasp synthesis for dynamic objects remains challenging as existing methods fail to achieve low-latency inference while maintaining promptability. To bridge this gap, we propose SPGrasp (spatiotemporal prompt-driven dynamic grasp synthesis), a novel framework extending segment anything model v2 (SAMv2) for video stream grasp estimation. Our core innovation integrates user prompts with spatiotemporal context, enabling real-time interaction with end-to-end latency as low as 59 ms while ensuring temporal consistency for dynamic objects. In benchmark evaluations, SPGrasp achieves instance-level grasp accuracies of 90.6% on OCID and 93.8% on Jacquard. On the challenging GraspNet-1Billion dataset under continuous tracking, SPGrasp achieves 92.0% accuracy with 73.1 ms per-frame latency, representing a 58.5% reduction compared to the prior state-of-the-art promptable method RoG-SAM while maintaining competitive accuracy. Real-world experiments involving 13 moving objects demonstrate a 94.8% success rate in interactive grasping scenarios. These results confirm SPGrasp effectively resolves the latency-interactivity trade-off in dynamic grasp synthesis.
LocoTouch: Learning Dynamic Quadrupedal Transport with Tactile Sensing
Quadrupedal robots have demonstrated remarkable agility and robustness in traversing complex terrains. However, they struggle with dynamic object interactions, where contact must be precisely sensed and controlled. To bridge this gap, we present LocoTouch, a system that equips quadrupedal robots with tactile sensing to address a particularly challenging task in this category: long-distance transport of unsecured cylindrical objects, which typically requires custom mounting or fastening mechanisms to maintain stability. For efficient large-area tactile sensing, we design a high-density distributed tactile sensor that covers the entire back of the robot. To effectively leverage tactile feedback for robot control, we develop a simulation environment with high-fidelity tactile signals, and train tactile-aware transport policies using a two-stage learning pipeline. Furthermore, we design a novel reward function to promote robust, symmetric, and frequency-adaptive locomotion gaits. After training in simulation, LocoTouch transfers zero-shot to the real world, reliably transporting a wide range of unsecured cylindrical objects with diverse sizes, weights, and surface properties. Moreover, it remains robust over long distances, on uneven terrain, and under severe perturbations.
comment: Project page: https://linchangyi1.github.io/LocoTouch
Wheeled Lab: Modern Sim2Real for Low-cost, Open-source Wheeled Robotics
Reinforcement Learning (RL) has been pivotal in recent robotics milestones and is poised to play a prominent role in the future. However, these advances can rely on proprietary simulators, expensive hardware, and a daunting range of tools and skills. As a result, broader communities are disconnecting from the state-of-the-art; education curricula are poorly equipped to teach indispensable modern robotics skills involving hardware, deployment, and iterative development. To address this gap between the broader and scientific communities, we contribute Wheeled Lab, an ecosystem which integrates accessible, open-source wheeled robots with Isaac Lab, an open-source robot learning and simulation framework, that is widely adopted in the state-of-the-art. To kickstart research and education, this work demonstrates three state-of-the-art zero-shot policies for small-scale RC cars developed through Wheeled Lab: controlled drifting, elevation traversal, and visual navigation. The full stack, from hardware to software, is low-cost and open-source. Videos and additional materials can be found at: https://uwrobotlearning.github.io/WheeledLab/
comment: To appear at Conference on Robot Learning, 2025
ManipBench: Benchmarking Vision-Language Models for Low-Level Robot Manipulation
Vision-Language Models (VLMs) have revolutionized artificial intelligence and robotics due to their commonsense reasoning capabilities. In robotic manipulation, VLMs are used primarily as high-level planners, but recent work has also studied their lower-level reasoning ability, which refers to making decisions about precise robot movements. However, the community currently lacks a clear and common benchmark that can evaluate how well VLMs can aid low-level reasoning in robotics. Consequently, we propose a novel benchmark, ManipBench, to evaluate the low-level robot manipulation reasoning capabilities of VLMs across various dimensions, including how well they understand object-object interactions and deformable object manipulation. We extensively test 33 representative VLMs across 10 model families on our benchmark, including variants to test different model sizes. Our evaluation shows that the performance of VLMs significantly varies across tasks, and there is a strong correlation between this performance and trends in our real-world manipulation tasks. It also shows that there remains a significant gap between these models and human-level understanding. See our website at: https://manipbench.github.io.
comment: Conference on Robot Learning (CoRL) 2025. 50 pages and 30 figures. v2 is the camera-ready and includes a few more new experiments compared to v1
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning
Future robotic systems operating in real-world environments will require on-board embodied intelligence without continuous cloud connection, balancing capabilities with constraints on computational power and memory. This work presents an extension of the R1-zero approach, which enables the usage of low parameter-count Large Language Models (LLMs) in the robotic domain. The R1-Zero approach was originally developed to enable mathematical reasoning in LLMs using static datasets. We extend it to the robotics domain through integration in a closed-loop Reinforcement Learning (RL) framework. This extension enhances reasoning in Embodied Artificial Intelligence (Embodied AI) settings without relying solely on distillation of large models through Supervised Fine-Tuning (SFT). We show that small-scale LLMs can achieve effective reasoning performance by learning through closed-loop interaction with their environment, which enables tasks that previously required significantly larger models. In an autonomous driving setting, a performance gain of 20.2%-points over the SFT-based baseline is observed with a Qwen2.5-1.5B model. Using the proposed training procedure, Qwen2.5-3B achieves a 63.3% control adaptability score, surpassing the 58.5% obtained by the much larger, cloud-bound GPT-4o. These results highlight that practical, on-board deployment of small LLMs is not only feasible but can outperform larger models if trained through environmental feedback, underscoring the importance of an interactive learning framework for robotic Embodied AI, one grounded in practical experience rather than static supervision.
Multi Object Tracking for Predictive Collision Avoidance
The safe and efficient operation of Autonomous Mobile Robots (AMRs) in complex environments, such as manufacturing, logistics, and agriculture, necessitates accurate multi-object tracking and predictive collision avoidance. This paper presents algorithms and techniques for addressing these challenges using Lidar sensor data, emphasizing ensemble Kalman filter. The developed predictive collision avoidance algorithm employs the data provided by lidar sensors to track multiple objects and predict their velocities and future positions, enabling the AMR to navigate safely and effectively. A modification to the dynamic windowing approach is introduced to enhance the performance of the collision avoidance system. The overall system architecture encompasses object detection, multi-object tracking, and predictive collision avoidance control. The experimental results, obtained from both simulation and real-world data, demonstrate the effectiveness of the proposed methods in various scenarios, which lays the foundation for future research on global planners, other controllers, and the integration of additional sensors. This thesis contributes to the ongoing development of safe and efficient autonomous systems in complex and dynamic environments.
CLONE: Closed-Loop Whole-Body Humanoid Teleoperation for Long-Horizon Tasks
Humanoid teleoperation plays a vital role in demonstrating and collecting data for complex humanoid-scene interactions. However, current teleoperation systems face critical limitations: they decouple upper- and lower-body control to maintain stability, restricting natural coordination, and operate open-loop without real-time position feedback, leading to accumulated drift. The fundamental challenge is achieving precise, coordinated whole-body teleoperation over extended durations while maintaining accurate global positioning. Here we show that an MoE-based teleoperation system, CLONE, with closed-loop error correction enables unprecedented whole-body teleoperation fidelity, maintaining minimal positional drift over long-range trajectories using only head and hand tracking from an MR headset. Unlike previous methods that either sacrifice coordination for stability or suffer from unbounded drift, CLONE learns diverse motion skills while preventing tracking error accumulation through real-time feedback, enabling complex coordinated movements such as ``picking up objects from the ground.'' These results establish a new milestone for whole-body humanoid teleoperation for long-horizon humanoid-scene interaction tasks.
comment: 18 pages, 13 figures
Toward Real-World Cooperative and Competitive Soccer with Quadrupedal Robot Teams
Achieving coordinated teamwork among legged robots requires both fine-grained locomotion control and long-horizon strategic decision-making. Robot soccer offers a compelling testbed for this challenge, combining dynamic, competitive, and multi-agent interactions. In this work, we present a hierarchical multi-agent reinforcement learning (MARL) framework that enables fully autonomous and decentralized quadruped robot soccer. First, a set of highly dynamic low-level skills is trained for legged locomotion and ball manipulation, such as walking, dribbling, and kicking. On top of these, a high-level strategic planning policy is trained with Multi-Agent Proximal Policy Optimization (MAPPO) via Fictitious Self-Play (FSP). This learning framework allows agents to adapt to diverse opponent strategies and gives rise to sophisticated team behaviors, including coordinated passing, interception, and dynamic role allocation. With an extensive ablation study, the proposed learning method shows significant advantages in the cooperative and competitive multi-agent soccer game. We deploy the learned policies to real quadruped robots relying solely on onboard proprioception and decentralized localization, with the resulting system supporting autonomous robot-robot and robot-human soccer matches on indoor and outdoor soccer courts.
comment: 11 pages, 12 figures, CoRL 2025
Domain-Conditioned Scene Graphs for State-Grounded Task Planning IROS 2025
Recent robotic task planning frameworks have integrated large multimodal models (LMMs) such as GPT-4o. To address grounding issues of such models, it has been suggested to split the pipeline into perceptional state grounding and subsequent state-based planning. As we show in this work, the state grounding ability of LMM-based approaches is still limited by weaknesses in granular, structured, domain-specific scene understanding. To address this shortcoming, we develop a more structured state grounding framework that features a domain-conditioned scene graph as its scene representation. We show that such representation is actionable in nature as it is directly mappable to a symbolic state in planning languages such as the Planning Domain Definition Language (PDDL). We provide an instantiation of our state grounding framework where the domain-conditioned scene graph generation is implemented with a lightweight vision-language approach that classifies domain-specific predicates on top of domain-relevant object detections. Evaluated across three domains, our approach achieves significantly higher state rounding accuracy and task planning success rates compared to LMM-based approaches.
comment: Accepted for IROS 2025
Systems and Control (CS)
Distributed Safety-Critical MPC for Multi-Agent Formation Control and Obstacle Avoidance
For nonlinear multi-agent systems with high relative degrees, achieving formation control and obstacle avoidance in a distributed manner remains a significant challenge. To address this issue, we propose a novel distributed safety-critical model predictive control (DSMPC) algorithm that incorporates discrete-time high-order control barrier functions (DHCBFs) to enforce safety constraints, alongside discrete-time control Lyapunov functions (DCLFs) to establish terminal constraints. To facilitate distributed implementation, we develop estimated neighbor states for formulating DHCBFs and DCLFs, while also devising a bound constraint to limit estimation errors and ensure convergence. Additionally, we provide theoretical guarantees regarding the feasibility and stability of the proposed DSMPC algorithm based on a mild assumption. The effectiveness of the proposed method is evidenced by the simulation results, demonstrating improved performance and reduced computation time compared to existing approaches.
comment: Accepted for presentation at the 64th IEEE Conference on Decision and Control (CDC 2025)
Systems and Control (EESS)
Distributed Safety-Critical MPC for Multi-Agent Formation Control and Obstacle Avoidance
For nonlinear multi-agent systems with high relative degrees, achieving formation control and obstacle avoidance in a distributed manner remains a significant challenge. To address this issue, we propose a novel distributed safety-critical model predictive control (DSMPC) algorithm that incorporates discrete-time high-order control barrier functions (DHCBFs) to enforce safety constraints, alongside discrete-time control Lyapunov functions (DCLFs) to establish terminal constraints. To facilitate distributed implementation, we develop estimated neighbor states for formulating DHCBFs and DCLFs, while also devising a bound constraint to limit estimation errors and ensure convergence. Additionally, we provide theoretical guarantees regarding the feasibility and stability of the proposed DSMPC algorithm based on a mild assumption. The effectiveness of the proposed method is evidenced by the simulation results, demonstrating improved performance and reduced computation time compared to existing approaches.
comment: Accepted for presentation at the 64th IEEE Conference on Decision and Control (CDC 2025)
Physics-Informed Unit Commitment Framework for Nuclear Reactors
Nuclear reactors are often modeled as inflexible baseload generators with fixed downtimes and restrictive ramping constraints. In practice, however, a reactor's operational flexibility is closely tied to its fuel cycle and associated reactivity margin. A key physical constraint for power maneuverability is xenon poisoning, caused from the transient buildup of neutron-absorbing xenon following a power reduction. This transient can delay or prevent subsequent power ramp-up due to suppressed core reactivity. Additionally, if a reactor is shutdown during periods of low reactivity, restart times can vary significantly, leading to prolonged downtimes. This work introduces a physics-informed modeling framework that embeds fuel cycle dynamics within a unit commitment (UC) formulation. The framework tracks reactivity margin, dynamically enforces xenon induced constraints, and endogenously schedules refueling outages based on core conditions. By capturing intracycle reactivity evolution, the model enables operation dependent nuclear dispatch that reflects both techno-economic requirements and irreducible nuclear physics limits. Application to a representative reactor fleet shows that flexible operation can slow reactivity degradation and extend fuel cycles. Results further demonstrate that different operational modes substantially affect VRE utilization, curtailment, and nuclear fleet capacity factors. These findings highlight the importance of fuel cycle aware flexibility modeling for accurate reactor scheduling and integration of nuclear power into energy system models.
comment: 10 pages, Code: https://github.com/shinychoudhury/physics-informed-nuclear-reactor-unit-commitment-algorithm
Robustness Analysis for Quantum Systems Controlled by Continuous-Time Pulses
Differential sensitivity techniques originally developed to study the robustness of energy landscape controllers are generalized to the important case of closed quantum systems subject to continuously varying controls. Vanishing sensitivity to parameter variation is shown to coincide with perfect fidelity, as was the case for time-invariant controls. Upper bounds on the magnitude of the differential sensitivity to any parameter variation are derived based simply on knowledge of the system Hamiltonian and the maximum size of the control inputs.
comment: 6 pages, 2 figures
Topology Inference for Network Systems with Unknown Inputs
Topology inference is a powerful tool to better understand the behaviours of network systems (NSs). Different from most of prior works, this paper is dedicated to inferring the directed topology of NSs from noisy observations, where the nodes are influenced by unknown time-varying inputs. These inputs can be actively injected signals by the user, intrinsic system noises or extrinsic environment interference. To tackle this challenging problem, we propose a two-stage inference scheme to overcome the influence of the inputs. First, by leveraging the second-order difference of the state evolution, we establish a judging criterion to detect the input injection time and provide the probability guarantees. With this injection time to determine available observations, an initial topology is accordingly inferred to further facilitate the input estimation. Second, utilizing the stability characteristic of the system response, a recursive input filtering algorithm is designed to approximate the zero-input response, which directly reflects the topology structure. Then, we construct a decreasing-weight based optimization problem to infer the final network topology from the approximated response. Comprehensive simulations demonstrate the effectiveness of the proposed method.
comment: This latest version of this paper was accepted by and presented at 2025 American Control Conference
Multi Object Tracking for Predictive Collision Avoidance
The safe and efficient operation of Autonomous Mobile Robots (AMRs) in complex environments, such as manufacturing, logistics, and agriculture, necessitates accurate multi-object tracking and predictive collision avoidance. This paper presents algorithms and techniques for addressing these challenges using Lidar sensor data, emphasizing ensemble Kalman filter. The developed predictive collision avoidance algorithm employs the data provided by lidar sensors to track multiple objects and predict their velocities and future positions, enabling the AMR to navigate safely and effectively. A modification to the dynamic windowing approach is introduced to enhance the performance of the collision avoidance system. The overall system architecture encompasses object detection, multi-object tracking, and predictive collision avoidance control. The experimental results, obtained from both simulation and real-world data, demonstrate the effectiveness of the proposed methods in various scenarios, which lays the foundation for future research on global planners, other controllers, and the integration of additional sensors. This thesis contributes to the ongoing development of safe and efficient autonomous systems in complex and dynamic environments.
Multiagent Systems
Topology Inference for Network Systems with Unknown Inputs
Topology inference is a powerful tool to better understand the behaviours of network systems (NSs). Different from most of prior works, this paper is dedicated to inferring the directed topology of NSs from noisy observations, where the nodes are influenced by unknown time-varying inputs. These inputs can be actively injected signals by the user, intrinsic system noises or extrinsic environment interference. To tackle this challenging problem, we propose a two-stage inference scheme to overcome the influence of the inputs. First, by leveraging the second-order difference of the state evolution, we establish a judging criterion to detect the input injection time and provide the probability guarantees. With this injection time to determine available observations, an initial topology is accordingly inferred to further facilitate the input estimation. Second, utilizing the stability characteristic of the system response, a recursive input filtering algorithm is designed to approximate the zero-input response, which directly reflects the topology structure. Then, we construct a decreasing-weight based optimization problem to infer the final network topology from the approximated response. Comprehensive simulations demonstrate the effectiveness of the proposed method.
comment: This latest version of this paper was accepted by and presented at 2025 American Control Conference
Investigating Tax Evasion Emergence Using Dual Large Language Model and Deep Reinforcement Learning Powered Agent-based Simulation
Tax evasion, usually the largest component of an informal economy, is a persistent challenge over history with significant socio-economic implications. Many socio-economic studies investigate its dynamics, including influencing factors, the role and influence of taxation policies, and the prediction of the tax evasion volume over time. These studies assumed such behavior is given, as observed in the real world, neglecting the "big bang" of such activity in a population. To this end, computational economy studies adopted developments in computer simulations, in general, and recent innovations in artificial intelligence (AI), in particular, to simulate and study informal economy appearance in various socio-economic settings. This study presents a novel computational framework to examine the dynamics of tax evasion and the emergence of informal economic activity. Employing an agent-based simulation powered by Large Language Models and Deep Reinforcement Learning, the framework is uniquely designed to allow informal economic behaviors to emerge organically, without presupposing their existence or explicitly signaling agents about the possibility of evasion. This provides a rigorous approach for exploring the socio-economic determinants of compliance behavior. The experimental design, comprising model validation and exploratory phases, demonstrates the framework's robustness in replicating theoretical economic behaviors. Findings indicate that individual personality traits, external narratives, enforcement probabilities, and the perceived efficiency of public goods provision significantly influence both the timing and extent of informal economic activity. The results underscore that efficient public goods provision and robust enforcement mechanisms are complementary; neither alone is sufficient to curtail informal activity effectively.
Robotics
Tree-Guided Diffusion Planner
Planning with pretrained diffusion models has emerged as a promising approach for solving test-time guided control problems. However, standard gradient guidance typically performs optimally under convex and differentiable reward landscapes, showing substantially reduced effectiveness in real-world scenarios involving non-convex objectives, non-differentiable constraints, and multi-reward structures. Furthermore, recent supervised planning approaches require task-specific training or value estimators, which limits test-time flexibility and zero-shot generalization. We propose a Tree-guided Diffusion Planner (TDP), a zero-shot test-time planning framework that balances exploration and exploitation through structured trajectory generation. We frame test-time planning as a tree search problem using a bi-level sampling process: (1) diverse parent trajectories are produced via training-free particle guidance to encourage broad exploration, and (2) sub-trajectories are refined through fast conditional denoising guided by task objectives. TDP addresses the limitations of gradient guidance by exploring diverse trajectory regions and harnessing gradient information across this expanded solution space using only pretrained models and test-time reward signals. We evaluate TDP on three diverse tasks: maze gold-picking, robot arm block manipulation, and AntMaze multi-goal exploration. TDP consistently outperforms state-of-the-art approaches on all tasks. The project page can be found at: tree-diffusion-planner.github.io.
comment: 20 pages, 11 figures, 14 tables (main paper + appendix) / under review / project page will be available after the paper becomes public in arxiv
Can a mobile robot learn from a pedestrian model to prevent the sidewalk salsa?
Pedestrians approaching each other on a sidewalk sometimes end up in an awkward interaction known as the "sidewalk salsa": they both (repeatedly) deviate to the same side to avoid a collision. This provides an interesting use case to study interactions between pedestrians and mobile robots because, in the vast majority of cases, this phenomenon is avoided through a negotiation based on implicit communication. Understanding how it goes wrong and how pedestrians end up in the sidewalk salsa will therefore provide insight into the implicit communication. This understanding can be used to design safe and acceptable robotic behaviour. In a previous attempt to gain this understanding, a model of pedestrian behaviour based on the Communication-Enabled Interaction (CEI) framework was developed that can replicate the sidewalk salsa. However, it is unclear how to leverage this model in robotic planning and decision-making since it violates the assumptions of game theory, a much-used framework in planning and decision-making. Here, we present a proof-of-concept for an approach where a Reinforcement Learning (RL) agent leverages the model to learn how to interact with pedestrians. The results show that a basic RL agent successfully learned to interact with the CEI model. Furthermore, a risk-averse RL agent that had access to the perceived risk of the CEI model learned how to effectively communicate its intention through its motion and thereby substantially lowered the perceived risk, and displayed effort by the modelled pedestrian. These results show this is a promising approach and encourage further exploration.
Robust Convex Model Predictive Control with collision avoidance guarantees for robot manipulators
Industrial manipulators are normally operated in cluttered environments, making safe motion planning important. Furthermore, the presence of model-uncertainties make safe motion planning more difficult. Therefore, in practice the speed is limited in order to reduce the effect of disturbances. There is a need for control methods that can guarantee safe motions that can be executed fast. We address this need by suggesting a novel model predictive control (MPC) solution for manipulators, where our two main components are a robust tube MPC and a corridor planning algorithm to obtain collision-free motion. Our solution results in a convex MPC, which we can solve fast, making our method practically useful. We demonstrate the efficacy of our method in a simulated environment with a 6 DOF industrial robot operating in cluttered environments with uncertainties in model parameters. We outperform benchmark methods, both in terms of being able to work under higher levels of model uncertainties, while also yielding faster motion.
A-MHA*: Anytime Multi-Heuristic A*
Designing good heuristic functions for graph search requires adequate domain knowledge. It is often easy to design heuristics that perform well and correlate with the underlying true cost-to-go values in certain parts of the search space but these may not be admissible throughout the domain thereby affecting the optimality guarantees of the search. Bounded suboptimal search using several such partially good but inadmissible heuristics was developed in Multi-Heuristic A* (MHA*). Although MHA* leverages multiple inadmissible heuristics to potentially generate a faster suboptimal solution, the original version does not improve the solution over time. It is a one shot algorithm that requires careful setting of inflation factors to obtain a desired one time solution. In this work, we tackle this issue by extending MHA* to an anytime version that finds a feasible suboptimal solution quickly and continually improves it until time runs out. Our work is inspired from the Anytime Repairing A* (ARA*) algorithm. We prove that our precise adaptation of ARA* concepts in the MHA* framework preserves the original suboptimal and completeness guarantees and enhances MHA* to perform in an anytime fashion. Furthermore, we report the performance of A-MHA* in 3-D path planning domain and sliding tiles puzzle and compare against MHA* and other anytime algorithms.
The Rosario Dataset v2: Multimodal Dataset for Agricultural Robotics
We present a multi-modal dataset collected in a soybean crop field, comprising over two hours of recorded data from sensors such as stereo infrared camera, color camera, accelerometer, gyroscope, magnetometer, GNSS (Single Point Positioning, Real-Time Kinematic and Post-Processed Kinematic), and wheel odometry. This dataset captures key challenges inherent to robotics in agricultural environments, including variations in natural lighting, motion blur, rough terrain, and long, perceptually aliased sequences. By addressing these complexities, the dataset aims to support the development and benchmarking of advanced algorithms for localization, mapping, perception, and navigation in agricultural robotics. The platform and data collection system is designed to meet the key requirements for evaluating multi-modal SLAM systems, including hardware synchronization of sensors, 6-DOF ground truth and loops on long trajectories. We run multimodal state-of-the art SLAM methods on the dataset, showcasing the existing limitations in their application on agricultural settings. The dataset and utilities to work with it are released on https://cifasis.github.io/rosariov2/.
comment: First published on The International Journal of Robotics Research: https://journals.sagepub.com/doi/10.1177/02783649251368909
Learning Agile Gate Traversal via Analytical Optimal Policy Gradient
Traversing narrow gates presents a significant challenge and has become a standard benchmark for evaluating agile and precise quadrotor flight. Traditional modularized autonomous flight stacks require extensive design and parameter tuning, while end-to-end reinforcement learning (RL) methods often suffer from low sample efficiency and limited interpretability. In this work, we present a novel hybrid framework that adaptively fine-tunes model predictive control (MPC) parameters online using outputs from a neural network (NN) trained offline. The NN jointly predicts a reference pose and cost-function weights, conditioned on the coordinates of the gate corners and the current drone state. To achieve efficient training, we derive analytical policy gradients not only for the MPC module but also for an optimization-based gate traversal detection module. Furthermore, we introduce a new formulation of the attitude tracking error that admits a simplified representation, facilitating effective learning with bounded gradients. Hardware experiments demonstrate that our method enables fast and accurate quadrotor traversal through narrow gates in confined environments. It achieves several orders of magnitude improvement in sample efficiency compared to naive end-to-end RL approaches.
comment: 8 pages, 8 figures
Estimated Informed Anytime Search for Sampling-Based Planning via Adaptive Sampler
Path planning in robotics often involves solving continuously valued, high-dimensional problems. Popular informed approaches include graph-based searches, such as A*, and sampling-based methods, such as Informed RRT*, which utilize informed set and anytime strategies to expedite path optimization incrementally. Informed sampling-based planners define informed sets as subsets of the problem domain based on the current best solution cost. However, when no solution is found, these planners re-sample and explore the entire configuration space, which is time-consuming and computationally expensive. This article introduces Multi-Informed Trees (MIT*), a novel planner that constructs estimated informed sets based on prior admissible solution costs before finding the initial solution, thereby accelerating the initial convergence rate. Moreover, MIT* employs an adaptive sampler that dynamically adjusts the sampling strategy based on the exploration process. Furthermore, MIT* utilizes length-related adaptive sparse collision checks to guide lazy reverse search. These features enhance path cost efficiency and computation times while ensuring high success rates in confined scenarios. Through a series of simulations and real-world experiments, it is confirmed that MIT* outperforms existing single-query, sampling-based planners for problems in R^4 to R^16 and has been successfully applied to real-world robot manipulation tasks. A video showcasing our experimental results is available at: https://youtu.be/30RsBIdexTU
Complete Gaussian Splats from a Single Image with Denoising Diffusion Models
Gaussian splatting typically requires dense observations of the scene and can fail to reconstruct occluded and unobserved areas. We propose a latent diffusion model to reconstruct a complete 3D scene with Gaussian splats, including the occluded parts, from only a single image during inference. Completing the unobserved surfaces of a scene is challenging due to the ambiguity of the plausible surfaces. Conventional methods use a regression-based formulation to predict a single "mode" for occluded and out-of-frustum surfaces, leading to blurriness, implausibility, and failure to capture multiple possible explanations. Thus, they often address this problem partially, focusing either on objects isolated from the background, reconstructing only visible surfaces, or failing to extrapolate far from the input views. In contrast, we propose a generative formulation to learn a distribution of 3D representations of Gaussian splats conditioned on a single input image. To address the lack of ground-truth training data, we propose a Variational AutoReconstructor to learn a latent space only from 2D images in a self-supervised manner, over which a diffusion model is trained. Our method generates faithful reconstructions and diverse samples with the ability to complete the occluded surfaces for high-quality 360-degree renderings.
comment: Main paper: 11 pages; Supplementary materials: 7 pages
Few-Shot Neuro-Symbolic Imitation Learning for Long-Horizon Planning and Acting
Imitation learning enables intelligent systems to acquire complex behaviors with minimal supervision. However, existing methods often focus on short-horizon skills, require large datasets, and struggle to solve long-horizon tasks or generalize across task variations and distribution shifts. We propose a novel neuro-symbolic framework that jointly learns continuous control policies and symbolic domain abstractions from a few skill demonstrations. Our method abstracts high-level task structures into a graph, discovers symbolic rules via an Answer Set Programming solver, and trains low-level controllers using diffusion policy imitation learning. A high-level oracle filters task-relevant information to focus each controller on a minimal observation and action space. Our graph-based neuro-symbolic framework enables capturing complex state transitions, including non-spatial and temporal relations, that data-driven learning or clustering techniques often fail to discover in limited demonstration datasets. We validate our approach in six domains that involve four robotic arms, Stacking, Kitchen, Assembly, and Towers of Hanoi environments, and a distinct Automated Forklift domain with two environments. The results demonstrate high data efficiency with as few as five skill demonstrations, strong zero- and few-shot generalizations, and interpretable decision making.
comment: Accepted at CoRL 2025; to appear in PMLR
Assessing Human Cooperation for Enhancing Social Robot Navigation
Socially aware robot navigation is a planning paradigm where the robot navigates in human environments and tries to adhere to social constraints while interacting with the humans in the scene. These navigation strategies were further improved using human prediction models, where the robot takes the potential future trajectory of humans while computing its own. Though these strategies significantly improve the robot's behavior, it faces difficulties from time to time when the human behaves in an unexpected manner. This happens as the robot fails to understand human intentions and cooperativeness, and the human does not have a clear idea of what the robot is planning to do. In this paper, we aim to address this gap through effective communication at an appropriate time based on a geometric analysis of the context and human cooperativeness in head-on crossing scenarios. We provide an assessment methodology and propose some evaluation metrics that could distinguish a cooperative human from a non-cooperative one. Further, we also show how geometric reasoning can be used to generate appropriate verbal responses or robot actions.
RoboInspector: Unveiling the Unreliability of Policy Code for LLM-enabled Robotic Manipulation
Large language models (LLMs) demonstrate remarkable capabilities in reasoning and code generation, enabling robotic manipulation to be initiated with just a single instruction. The LLM carries out various tasks by generating policy code required to control the robot. Despite advances in LLMs, achieving reliable policy code generation remains a significant challenge due to the diverse requirements of real-world tasks and the inherent complexity of user instructions. In practice, different users may provide distinct instructions to drive the robot for the same task, which may cause the unreliability of policy code generation. To bridge this gap, we design RoboInspector, a pipeline to unveil and characterize the unreliability of the policy code for LLM-enabled robotic manipulation from two perspectives: the complexity of the manipulation task and the granularity of the instruction. We perform comprehensive experiments with 168 distinct combinations of tasks, instructions, and LLMs in two prominent frameworks. The RoboInspector identifies four main unreliable behaviors that lead to manipulation failure. We provide a detailed characterization of these behaviors and their underlying causes, giving insight for practical development to reduce unreliability. Furthermore, we introduce a refinement approach guided by failure policy code feedback that improves the reliability of policy code generation by up to 35% in LLM-enabled robotic manipulation, evaluated in both simulation and real-world environments.
Dynamics-Compliant Trajectory Diffusion for Super-Nominal Payload Manipulation
Nominal payload ratings for articulated robots are typically derived from worst-case configurations, resulting in uniform payload constraints across the entire workspace. This conservative approach severely underutilizes the robot's inherent capabilities -- our analysis demonstrates that manipulators can safely handle payloads well above nominal capacity across broad regions of their workspace while staying within joint angle, velocity, acceleration, and torque limits. To address this gap between assumed and actual capability, we propose a novel trajectory generation approach using denoising diffusion models that explicitly incorporates payload constraints into the planning process. Unlike traditional sampling-based methods that rely on inefficient trial-and-error, optimization-based methods that are prohibitively slow, or kinodynamic planners that struggle with problem dimensionality, our approach generates dynamically feasible joint-space trajectories in constant time that can be directly executed on physical hardware without post-processing. Experimental validation on a 7 DoF Franka Emika Panda robot demonstrates that up to 67.6% of the workspace remains accessible even with payloads exceeding 3 times the nominal capacity. This expanded operational envelope highlights the importance of a more nuanced consideration of payload dynamics in motion planning algorithms.
comment: Accepted to 2025 Conference on Robot Learning [CoRL]
Multi-Modal Model Predictive Path Integral Control for Collision Avoidance
This paper proposes a novel approach to motion planning and decision-making for automated vehicles, using a multi-modal Model Predictive Path Integral control algorithm. The method samples with Sobol sequences around the prior input and incorporates analytical solutions for collision avoidance. By leveraging multiple modes, the multi-modal control algorithm explores diverse trajectories, such as manoeuvring around obstacles or stopping safely before them, mitigating the risk of sub-optimal solutions. A non-linear single-track vehicle model with a Fiala tyre serves as the prediction model, and tyre force constraints within the friction circle are enforced to ensure vehicle stability during evasive manoeuvres. The optimised steering angle and longitudinal acceleration are computed to generate a collision-free trajectory and to control the vehicle. In a high-fidelity simulation environment, we demonstrate that the proposed algorithm can successfully avoid obstacles, keeping the vehicle stable while driving a double lane change manoeuvre on high and low-friction road surfaces and occlusion scenarios with moving obstacles, outperforming a standard Model Predictive Path Integral approach.
comment: Accepted as an oral presentation at the 29th IAVSD. August 18-22, 2025. Shanghai, China
Robust Real-Time Coordination of CAVs: A Distributed Optimization Framework under Uncertainty
Achieving both safety guarantees and real-time performance in cooperative vehicle coordination remains a fundamental challenge, particularly in dynamic and uncertain environments. This paper presents a novel coordination framework that resolves this challenge through three key innovations: 1) direct control of vehicles' trajectory distributions during coordination, formulated as a robust cooperative planning problem with adaptive enhanced safety constraints, ensuring a specified level of safety regarding the uncertainty of the interactive trajectory, 2) a fully parallel ADMM-based distributed trajectory negotiation (ADMM-DTN) algorithm that efficiently solves the optimization problem while allowing configurable negotiation rounds to balance solution quality and computational resources, and 3) an interactive attention mechanism that selectively focuses on critical interactive participants to further enhance computational efficiency. Both simulation results and practical experiments demonstrate that our framework achieves significant advantages in safety (reducing collision rates by up to 40.79\% in various scenarios) and real-time performance compared to state-of-the-art methods, while maintaining strong scalability with increasing vehicle numbers. The proposed interactive attention mechanism further reduces the computational demand by 14.1\%. The framework's effectiveness is further validated through real-world experiments with unexpected dynamic obstacles, demonstrating robust coordination in complex environments. The experiment demo could be found at https://youtu.be/4PZwBnCsb6Q.
Cooperative Sensing Enhanced UAV Path-Following and Obstacle Avoidance with Variable Formation
The high mobility of unmanned aerial vehicles (UAVs) enables them to be used in various civilian fields, such as rescue and cargo transport. Path-following is a crucial way to perform these tasks while sensing and collision avoidance are essential for safe flight. In this paper, we investigate how to efficiently and accurately achieve path-following, obstacle sensing and avoidance subtasks, as well as their conflict-free fusion scheduling. Firstly, a high precision deep reinforcement learning (DRL)-based UAV formation path-following model is developed, and the reward function with adaptive weights is designed from the perspective of distance and velocity errors. Then, we use integrated sensing and communication (ISAC) signals to detect the obstacle and derive the Cramer-Rao lower bound (CRLB) for obstacle sensing by information-level fusion, based on which we propose the variable formation enhanced obstacle position estimation (VFEO) algorithm. In addition, an online obstacle avoidance scheme without pretraining is designed to solve the sparse reward. Finally, with the aid of null space based (NSB) behavioral method, we present a hierarchical subtasks fusion strategy. Simulation results demonstrate the effectiveness and superiority of the subtask algorithms and the hierarchical fusion strategy.
Observability-driven Assignment of Heterogeneous Sensors for Multi-Target Tracking IROS
This paper addresses the challenge of assigning heterogeneous sensors (i.e., robots with varying sensing capabilities) for multi-target tracking. We classify robots into two categories: (1) sufficient sensing robots, equipped with range and bearing sensors, capable of independently tracking targets, and (2) limited sensing robots, which are equipped with only range or bearing sensors and need to at least form a pair to collaboratively track a target. Our objective is to optimize tracking quality by minimizing uncertainty in target state estimation through efficient robot-to-target assignment. By leveraging matroid theory, we propose a greedy assignment algorithm that dynamically allocates robots to targets to maximize tracking quality. The algorithm guarantees constant-factor approximation bounds of 1/3 for arbitrary tracking quality functions and 1/2 for submodular functions, while maintaining polynomial-time complexity. Extensive simulations demonstrate the algorithm's effectiveness in accurately estimating and tracking targets over extended periods. Furthermore, numerical results confirm that the algorithm's performance is close to that of the optimal assignment, highlighting its robustness and practical applicability.
comment: This paper has been accepted to the 2025 IEEE/RSJ IROS
Detecting Domain Shifts in Myoelectric Activations: Challenges and Opportunities in Stream Learning PRICAI25
Detecting domain shifts in myoelectric activations poses a significant challenge due to the inherent non-stationarity of electromyography (EMG) signals. This paper explores the detection of domain shifts using data stream (DS) learning techniques, focusing on the DB6 dataset from the Ninapro database. We define domains as distinct time-series segments based on different subjects and recording sessions, applying Kernel Principal Component Analysis (KPCA) with a cosine kernel to pre-process and highlight these shifts. By evaluating multiple drift detection methods such as CUSUM, Page-Hinckley, and ADWIN, we reveal the limitations of current techniques in achieving high performance for real-time domain shift detection in EMG signals. Our results underscore the potential of streaming-based approaches for maintaining stable EMG decoding models, while highlighting areas for further research to enhance robustness and accuracy in real-world scenarios.
comment: 16 pages, 5 figures, 1 table, PRICAI25
Learning to Assemble the Soma Cube with Legal-Action Masked DQN and Safe ZYZ Regrasp on a Doosan M0609
This paper presents the first comprehensive application of legal-action masked Deep Q-Networks with safe ZYZ regrasp strategies to an underactuated gripper-equipped 6-DOF collaborative robot for autonomous Soma cube assembly learning. Our approach represents the first systematic integration of constraint-aware reinforcement learning with singularity-safe motion planning on a Doosan M0609 collaborative robot. We address critical challenges in robotic manipulation: combinatorial action space explosion, unsafe motion planning, and systematic assembly strategy learning. Our system integrates a legal-action masked DQN with hierarchical architecture that decomposes Q-function estimation into orientation and position components, reducing computational complexity from $O(3,132)$ to $O(116) + O(27)$ while maintaining solution completeness. The robot-friendly reward function encourages ground-first, vertically accessible assembly sequences aligned with manipulation constraints. Curriculum learning across three progressive difficulty levels (2-piece, 3-piece, 7-piece) achieves remarkable training efficiency: 100\% success rate for Level 1 within 500 episodes, 92.9\% for Level 2, and 39.9\% for Level 3 over 105,300 total training episodes.
comment: 13 figures, 17 pages
Mini Autonomous Car Driving based on 3D Convolutional Neural Networks
Autonomous driving applications have become increasingly relevant in the automotive industry due to their potential to enhance vehicle safety, efficiency, and user experience, thereby meeting the growing demand for sophisticated driving assistance features. However, the development of reliable and trustworthy autonomous systems poses challenges such as high complexity, prolonged training periods, and intrinsic levels of uncertainty. Mini Autonomous Cars (MACs) are used as a practical testbed, enabling validation of autonomous control methodologies on small-scale setups. This simplified and cost-effective environment facilitates rapid evaluation and comparison of machine learning models, which is particularly useful for algorithms requiring online training. To address these challenges, this work presents a methodology based on RGB-D information and three-dimensional convolutional neural networks (3D CNNs) for MAC autonomous driving in simulated environments. We evaluate the proposed approach against recurrent neural networks (RNNs), with architectures trained and tested on two simulated tracks with distinct environmental features. Performance was assessed using task completion success, lap-time metrics, and driving consistency. Results highlight how architectural modifications and track complexity influence the models' generalization capability and vehicle control performance. The proposed 3D CNN demonstrated promising results when compared with RNNs.
UltraTac: Integrated Ultrasound-Augmented Visuotactile Sensor for Enhanced Robotic Perception IROS 2025
Visuotactile sensors provide high-resolution tactile information but are incapable of perceiving the material features of objects. We present UltraTac, an integrated sensor that combines visuotactile imaging with ultrasound sensing through a coaxial optoacoustic architecture. The design shares structural components and achieves consistent sensing regions for both modalities. Additionally, we incorporate acoustic matching into the traditional visuotactile sensor structure, enabling integration of the ultrasound sensing modality without compromising visuotactile performance. Through tactile feedback, we dynamically adjust the operating state of the ultrasound module to achieve flexible functional coordination. Systematic experiments demonstrate three key capabilities: proximity sensing in the 3-8 cm range ($R^2=0.90$), material classification (average accuracy: 99.20%), and texture-material dual-mode object recognition achieving 92.11% accuracy on a 15-class task. Finally, we integrate the sensor into a robotic manipulation system to concurrently detect container surface patterns and internal content, which verifies its potential for advanced human-machine interaction and precise robotic manipulation.
comment: Accepted to IROS 2025
COBRA-PPM: A Causal Bayesian Reasoning Architecture Using Probabilistic Programming for Robot Manipulation Under Uncertainty
Manipulation tasks require robots to reason about cause and effect when interacting with objects. Yet, many data-driven approaches lack causal semantics and thus only consider correlations. We introduce COBRA-PPM, a novel causal Bayesian reasoning architecture that combines causal Bayesian networks and probabilistic programming to perform interventional inference for robot manipulation under uncertainty. We demonstrate its capabilities through high-fidelity Gazebo-based experiments on an exemplar block stacking task, where it predicts manipulation outcomes with high accuracy (Pred Acc: 88.6%) and performs greedy next-best action selection with a 94.2% task success rate. We further demonstrate sim2real transfer on a domestic robot, showing effectiveness in handling real-world uncertainty from sensor noise and stochastic actions. Our generalised and extensible framework supports a wide range of manipulation scenarios and lays a foundation for future work at the intersection of robotics and causality.
comment: 8 pages, 7 figures, accepted to the 2025 IEEE European Conference on Mobile Robots (ECMR 2025)
Centralization vs. decentralization in multi-robot sweep coverage with ground robots and UAVs
In swarm robotics, decentralized control is often proposed as a more scalable and fault-tolerant alternative to centralized control. However, centralized behaviors are often faster and more efficient than their decentralized counterparts. In any given application, the goals and constraints of the task being solved should guide the choice to use centralized control, decentralized control, or a combination of the two. Currently, the exact trade-offs that exist between centralization and decentralization are not well defined. In this paper, we compare the performance of centralization and decentralization in the example task of sweep coverage, across five different types of multi-robot control structures: random walk, decentralized with beacons, hybrid formation control using self-organizing hierarchy, centralized formation control, and predetermined. In all five approaches, the coverage task is completed by a group of ground robots. In each approach, except for the random walk, the ground robots are assisted by UAVs, acting as supervisors or beacons. We compare the approaches in terms of three performance metrics for which centralized approaches are expected to have an advantage -- coverage completeness, coverage uniformity, and sweep completion time -- and two metrics for which decentralized approaches are expected to have an advantage -- scalability (4, 8, or 16 ground robots) and fault tolerance (0%, 25%, 50%, or 75% ground robot failure).
comment: IRIDIA, Universite Libre de Bruxelles, Brussels, Belgium, 2021
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation
Vision is well-known for its use in manipulation, especially using visual servoing. Due to the 3D nature of the world, using multiple camera views and merging them creates better representations for Q-learning and in turn, trains more sample efficient policies. Nevertheless, these multi-view policies are sensitive to failing cameras and can be burdensome to deploy. To mitigate these issues, we introduce a Merge And Disentanglement (MAD) algorithm that efficiently merges views to increase sample efficiency while simultaneously disentangling views by augmenting multi-view feature inputs with single-view features. This produces robust policies and allows lightweight deployment. We demonstrate the efficiency and robustness of our approach using Meta-World and ManiSkill3. For project website and code, see https://aalmuzairee.github.io/mad
comment: Accepted at CoRL 2025. For project website and code, see https://aalmuzairee.github.io/mad
Ontology Neural Network and ORTSF: A Framework for Topological Reasoning and Delay-Robust Control
The advancement of autonomous robotic systems has led to impressive capabilities in perception, localization, mapping, and control. Yet, a fundamental gap remains: existing frameworks excel at geometric reasoning and dynamic stability but fall short in representing and preserving relational semantics, contextual reasoning, and cognitive transparency essential for collaboration in dynamic, human-centric environments. This paper introduces a unified architecture comprising the Ontology Neural Network (ONN) and the Ontological Real-Time Semantic Fabric (ORTSF) to address this gap. The ONN formalizes relational semantic reasoning as a dynamic topological process. By embedding Forman-Ricci curvature, persistent homology, and semantic tensor structures within a unified loss formulation, ONN ensures that relational integrity and topological coherence are preserved as scenes evolve over time. The ORTSF transforms reasoning traces into actionable control commands while compensating for system delays. It integrates predictive and delay-aware operators that ensure phase margin preservation and continuity of control signals, even under significant latency conditions. Empirical studies demonstrate the ONN + ORTSF framework's ability to unify semantic cognition and robust control, providing a mathematically principled and practically viable solution for cognitive robotics.
comment: 12 pages, 5 figures, includes theoretical proofs and simulation results
Knowledge in multi-robot systems: an interplay of dynamics, computation and communication
In this paper, we provide a framework integrating distributed multi-robot systems and temporal epistemic logic. We show that continuous-discrete hybrid systems are compatible with logical models of knowledge already used in distributed computing, and demonstrate its usefulness by deriving sufficient epistemic conditions for exploration and gathering robot tasks to be solvable. We provide a separation of the physical and computational aspects of a robotic system, allowing us to decouple the problems related to each and directly use methods from control theory and distributed computing, fields that are traditionally distant in the literature. Finally, we demonstrate a novel approach for reasoning about the knowledge in multi-robot systems through a principled method of converting a switched hybrid dynamical system into a temporal-epistemic logic model, passing through an abstract state machine representation. This creates space for methods and results to be exchanged across the fields of control theory, distributed computing and temporal-epistemic logic, while reasoning about multi-robot systems.
CoRI: Communication of Robot Intent for Physical Human-Robot Interaction
Clear communication of robot intent fosters transparency and interpretability in physical human-robot interaction (pHRI), particularly during assistive tasks involving direct human-robot contact. We introduce CoRI, a pipeline that automatically generates natural language communication of a robot's upcoming actions directly from its motion plan and visual perception. Our pipeline first processes the robot's image view to identify human poses and key environmental features. It then encodes the planned 3D spatial trajectory (including velocity and force) onto this view, visually grounding the path and its dynamics. CoRI queries a vision-language model with this visual representation to interpret the planned action within the visual context before generating concise, user-directed statements, without relying on task-specific information. Results from a user study involving robot-assisted feeding, bathing, and shaving tasks across two different robots indicate that CoRI leads to statistically significant difference in communication clarity compared to a baseline communication strategy. Specifically, CoRI effectively conveys not only the robot's high-level intentions but also crucial details about its motion and any collaborative user action needed. Video and code of our project can be found on our project website: https://cori-phri.github.io/
comment: To be published in Proceedings of the 9th Conference on Robot Learning (CoRL). 34 pages, 10 figures
Traversing the Narrow Path: A Two-Stage Reinforcement Learning Framework for Humanoid Beam Walking
Traversing narrow beams is challenging for humanoids due to sparse, safety-critical contacts and the fragility of purely learned policies. We propose a physically grounded, two-stage framework that couples an XCoM/LIPM footstep template with a lightweight residual planner and a simple low-level tracker. Stage-1 is trained on flat ground: the tracker learns to robustly follow footstep targets by adding small random perturbations to heuristic footsteps, without any hand-crafted centerline locking, so it acquires stable contact scheduling and strong target-tracking robustness. Stage-2 is trained in simulation on a beam: a high-level planner predicts a body-frame residual (Delta x, Delta y, Delta psi) for the swing foot only, refining the template step to prioritize safe, precise placement under narrow support while preserving interpretability. To ease deployment, sensing is kept minimal and consistent between simulation and hardware: the planner consumes compact, forward-facing elevation cues together with onboard IMU and joint signals. On a Unitree G1, our system reliably traverses a 0.2 m-wide, 3 m-long beam. Across simulation and real-world studies, residual refinement consistently outperforms template-only and monolithic baselines in success rate, centerline adherence, and safety margins, while the structured footstep interface enables transparent analysis and low-friction sim-to-real transfer.
comment: Project website: https://huangtc233.github.io/Traversing-the-Narrow-Path/
Soft Manipulation Surface With Reduced Actuator Density For Heterogeneous Object Manipulation
Object manipulation in robotics faces challenges due to diverse object shapes, sizes, and fragility. Gripper-based methods offer precision and low degrees of freedom (DOF) but the gripper limits the kind of objects to grasp. On the other hand, surface-based approaches provide flexibility for handling fragile and heterogeneous objects but require numerous actuators, increasing complexity. We propose new manipulation hardware that utilizes equally spaced linear actuators placed vertically and connected by a soft surface. In this setup, object manipulation occurs on the soft surface through coordinated movements of the surrounding actuators. This approach requires fewer actuators to cover a large manipulation area, offering a cost-effective solution with a lower DOF compared to dense actuator arrays. It also effectively handles heterogeneous objects of varying shapes and weights, even when they are significantly smaller than the distance between actuators. This method is particularly suitable for managing highly fragile objects in the food industry.
Unified Path Planner with Adaptive Safety and Optimality
Path planning for autonomous robots presents a fundamental trade-off between optimality and safety. While conventional algorithms typically prioritize one of these objectives, we introduce the Unified Path Planner (UPP), a unified framework that simultaneously addresses both. UPP is a graph-search-based algorithm that employs a modified heuristic function incorporating a dynamic safety cost, enabling an adaptive balance between path length and obstacle clearance. We establish theoretical sub-optimality bounds for the planner and demonstrate that its safety-to-optimality ratio can be tuned via adjustable parameters, with a trade-off in computational complexity. Extensive simulations show that UPP achieves a high success rate, generating near-optimal paths with only a negligible increase in cost over traditional A*, while ensuring safety margins that closely approach those of the classical Voronoi planner. Finally, the practical efficacy of UPP is validated through a hardware implementation on a TurtleBot, confirming its ability to navigate cluttered environments by generating safe, sub-optimal paths.
comment: 6 pages,4 figures
Latent Adaptive Planner for Dynamic Manipulation
We present the Latent Adaptive Planner (LAP), a trajectory-level latent-variable policy for dynamic nonprehensile manipulation (e.g., box catching) that formulates planning as inference in a low-dimensional latent space and is learned effectively from human demonstration videos. During execution, LAP achieves real-time adaptation by maintaining a posterior over the latent plan and performing variational replanning as new observations arrive. To bridge the embodiment gap between humans and robots, we introduce a model-based proportional mapping that regenerates accurate kinematic-dynamic joint states and object positions from human demonstrations. Through challenging box catching experiments with varying object properties, LAP demonstrates superior success rates, trajectory smoothness, and energy efficiency by learning human-like compliant motions and adaptive behaviors. Overall, LAP enables dynamic manipulation with real-time adaptation and successfully transfer across heterogeneous robot platforms using the same human demonstration videos.
SignLoc: Robust Localization using Navigation Signs and Public Maps
Navigation signs and maps, such as floor plans and street maps, are widely available and serve as ubiquitous aids for way-finding in human environments. Yet, they are rarely used by robot systems. This paper presents SignLoc, a global localization method that leverages navigation signs to localize the robot on publicly available maps -- specifically floor plans and OpenStreetMap (OSM) graphs -- without prior sensor-based mapping. SignLoc first extracts a navigation graph from the input map. It then employs a probabilistic observation model to match directional and locational cues from the detected signs to the graph, enabling robust topo-semantic localization within a Monte Carlo framework. We evaluated SignLoc in diverse large-scale environments: part of a university campus, a shopping mall, and a hospital complex. Experimental results show that SignLoc reliably localizes the robot after observing only one to two signs.
comment: This work has been submitted to the IEEE for possible publication
Visual Imitation Enables Contextual Humanoid Control
How can we teach humanoids to climb staircases and sit on chairs using the surrounding environment context? Arguably, the simplest way is to just show them-casually capture a human motion video and feed it to humanoids. We introduce VIDEOMIMIC, a real-to-sim-to-real pipeline that mines everyday videos, jointly reconstructs the humans and the environment, and produces whole-body control policies for humanoid robots that perform the corresponding skills. We demonstrate the results of our pipeline on real humanoid robots, showing robust, repeatable contextual control such as staircase ascents and descents, sitting and standing from chairs and benches, as well as other dynamic whole-body skills-all from a single policy, conditioned on the environment and global root commands. VIDEOMIMIC offers a scalable path towards teaching humanoids to operate in diverse real-world environments.
comment: Project website: https://www.videomimic.net/
QuaDreamer: Controllable Panoramic Video Generation for Quadruped Robots
Panoramic cameras, capturing comprehensive 360-degree environmental data, are suitable for quadruped robots in surrounding perception and interaction with complex environments. However, the scarcity of high-quality panoramic training data-caused by inherent kinematic constraints and complex sensor calibration challenges-fundamentally limits the development of robust perception systems tailored to these embodied platforms. To address this issue, we propose QuaDreamer-the first panoramic data generation engine specifically designed for quadruped robots. QuaDreamer focuses on mimicking the motion paradigm of quadruped robots to generate highly controllable, realistic panoramic videos, providing a data source for downstream tasks. Specifically, to effectively capture the unique vertical vibration characteristics exhibited during quadruped locomotion, we introduce Vertical Jitter Encoding (VJE). VJE extracts controllable vertical signals through frequency-domain feature filtering and provides high-quality prompts. To facilitate high-quality panoramic video generation under jitter signal control, we propose a Scene-Object Controller (SOC) that effectively manages object motion and boosts background jitter control through the attention mechanism. To address panoramic distortions in wide-FoV video generation, we propose the Panoramic Enhancer (PE)-a dual-stream architecture that synergizes frequency-texture refinement for local detail enhancement with spatial-structure correction for global geometric consistency. We further demonstrate that the generated video sequences can serve as training data for the quadruped robot's panoramic visual perception model, enhancing the performance of multi-object tracking in 360-degree scenes. The source code and model weights will be publicly available at https://github.com/losehu/QuaDreamer.
comment: Accepted to CoRL 2025. The source code and model weights will be publicly available at \url{https://github.com/losehu/QuaDreamer
Towards Embodiment Scaling Laws in Robot Locomotion
Cross-embodiment generalization underpins the vision of building generalist embodied agents for any robot, yet its enabling factors remain poorly understood. We investigate embodiment scaling laws, the hypothesis that increasing the number of training embodiments improves generalization to unseen ones, using robot locomotion as a test bed. We procedurally generate ~1,000 embodiments with topological, geometric, and joint-level kinematic variations, and train policies on random subsets. We observe positive scaling trends supporting the hypothesis, and find that embodiment scaling enables substantially broader generalization than data scaling on fixed embodiments. Our best policy, trained on the full dataset, transfers zero-shot to novel embodiments in simulation and the real world, including the Unitree Go2 and H1. These results represent a step toward general embodied intelligence, with relevance to adaptive control for configurable robots, morphology co-design, and beyond.
comment: Conference on Robot Learning (CoRL), 2025. Project website: https://embodiment-scaling-laws.github.io/
Motion Priors Reimagined: Adapting Flat-Terrain Skills for Complex Quadruped Mobility
Reinforcement learning (RL)-based motion imitation methods trained on demonstration data can effectively learn natural and expressive motions with minimal reward engineering but often struggle to generalize to novel environments. We address this by proposing a hierarchical RL framework in which a low-level policy is first pre-trained to imitate animal motions on flat ground, thereby establishing motion priors. A subsequent high-level, goal-conditioned policy then builds on these priors, learning residual corrections that enable perceptive locomotion, local obstacle avoidance, and goal-directed navigation across diverse and rugged terrains. Simulation experiments illustrate the effectiveness of learned residuals in adapting to progressively challenging uneven terrains while still preserving the locomotion characteristics provided by the motion priors. Furthermore, our results demonstrate improvements in motion regularization over baseline models trained without motion priors under similar reward setups. Real-world experiments with an ANYmal-D quadruped robot confirm our policy's capability to generalize animal-like locomotion skills to complex terrains, demonstrating smooth and efficient locomotion and local navigation performance amidst challenging terrains with obstacles.
comment: Conference on Robot Learning (CoRL)
Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation
Cloth manipulation is challenging due to its highly complex dynamics, near-infinite degrees of freedom, and frequent self-occlusions, which complicate both state estimation and dynamics modeling. Inspired by recent advances in generative models, we hypothesize that these expressive models can effectively capture intricate cloth configurations and deformation patterns from data. Therefore, we propose a diffusion-based generative approach for both perception and dynamics modeling. Specifically, we formulate state estimation as reconstructing full cloth states from partial observations and dynamics modeling as predicting future states given the current state and robot actions. Leveraging a transformer-based diffusion model, our method achieves accurate state reconstruction and reduces long-horizon dynamics prediction errors by an order of magnitude compared to prior approaches. We integrate our dynamics models with model predictive control and show that our framework enables effective cloth folding on real robotic systems, demonstrating the potential of generative models for deformable object manipulation under partial observability and complex dynamics.
comment: CoRL 2025. Project website: https://uniclothdiff.github.io/
DexFruit: Dexterous Manipulation and Gaussian Splatting Inspection of Fruit
DexFruit is a robotic manipulation framework that enables gentle, autonomous handling of fragile fruit and precise evaluation of damage. Many fruits are fragile and prone to bruising, thus requiring humans to manually harvest them with care. In this work, we demonstrate by using optical tactile sensing, autonomous manipulation of fruit with minimal damage can be achieved. We show that our tactile informed diffusion policies outperform baselines in both reduced bruising and pick-and-place success rate across three fruits: strawberries, tomatoes, and blackberries. In addition, we introduce FruitSplat, a novel technique to represent and quantify visual damage in high-resolution 3D representation via 3D Gaussian Splatting (3DGS). Existing metrics for measuring damage lack quantitative rigor or require expensive equipment. With FruitSplat, we distill a 2D strawberry mask as well as a 2D bruise segmentation mask into the 3DGS representation. Furthermore, this representation is modular and general, compatible with any relevant 2D model. Overall, we demonstrate a 92% grasping policy success rate, up to a 20% reduction in visual bruising, and up to an 31% improvement in grasp success rate on challenging fruit compared to our baselines across our three tested fruits. We rigorously evaluate this result with over 630 trials. Please checkout our website at https://dex-fruit.github.io .
comment: 8 pages, 5 figures
Energy-Aware Lane Planning for Connected Electric Vehicles in Urban Traffic: Design and Vehicle-in-the-Loop Validation
Urban driving with connected and automated vehicles (CAVs) offers potential for energy savings, yet most eco-driving strategies focus solely on longitudinal speed control within a single lane. This neglects the significant impact of lateral decisions, such as lane changes, on overall energy efficiency, especially in environments with traffic signals and heterogeneous traffic flow. To address this gap, we propose a novel energy-aware motion planning framework that jointly optimizes longitudinal speed and lateral lane-change decisions using vehicle-to-infrastructure (V2I) communication. Our approach estimates long-term energy costs using a graph-based approximation and solves short-horizon optimal control problems under traffic constraints. Using a data-driven energy model calibrated to an actual battery electric vehicle, we demonstrate with vehicle-in-the-loop experiments that our method reduces motion energy consumption by up to 24 percent compared to a human driver, highlighting the potential of connectivity-enabled planning for sustainable urban autonomy.
comment: Accepted at 2025 IEEE Conference on Decision and Control (CDC25')
A Physics-Based Continuum Model for Versatile, Scalable, and Fast Terramechanics Simulation
This paper discusses Chrono's Continuous Representation Model (called herein Chrono::CRM), a general-purpose, scalable, and efficient simulation solution for terramechanics problems. Built on Chrono's Smoothed Particle Hydrodynamics (SPH) framework, Chrono::CRM moves beyond semi-empirical terramechanics approaches, e.g., Bekker-Wong/Janosi-Hanamoto, to provide a physics-based model able to address complex tasks such as digging, grading, as well as interaction with deformable wheels and complex grouser/lug patterns. The terramechanics model is versatile in that it allows the terrain to interact with both rigid and flexible implements simulated via the Chrono dynamics engine. We validate Chrono::CRM against experimental data from three physical tests, including one involving NASA's MGRU3 rover. In addition, the simulator is benchmarked against a high-fidelity Discrete Element Method (DEM) simulation of a digging scenario involving the Regolith Advanced Surface Systems Operations Robot (RASSOR). Being GPU-accelerated, Chrono::CRM achieves computational efficiency comparable to that of semi-empirical simulation approaches for terramechanics problems. Through an ``active domains'' implementation, Chrono::CRM can handle terrain stretches up to 10 km long with 100 million SPH particles at near interactive rates, making high-fidelity off-road simulations at large scales feasible. As a component of the Chrono package, the CRM model is open source and released under a BSD-3 license. All models and simulations used in this contribution are available in a public GitHub repository for reproducibility studies and further research.
comment: 32 pages, 21 figures, Submitted to Journal of Terramechanics
Multiagent Systems
Automated Clinical Problem Detection from SOAP Notes using a Collaborative Multi-Agent LLM Architecture
Accurate interpretation of clinical narratives is critical for patient care, but the complexity of these notes makes automation challenging. While Large Language Models (LLMs) show promise, single-model approaches can lack the robustness required for high-stakes clinical tasks. We introduce a collaborative multi-agent system (MAS) that models a clinical consultation team to address this gap. The system is tasked with identifying clinical problems by analyzing only the Subjective (S) and Objective (O) sections of SOAP notes, simulating the diagnostic reasoning process of synthesizing raw data into an assessment. A Manager agent orchestrates a dynamically assigned team of specialist agents who engage in a hierarchical, iterative debate to reach a consensus. We evaluated our MAS against a single-agent baseline on a curated dataset of 420 MIMIC-III notes. The dynamic multi-agent configuration demonstrated consistently improved performance in identifying congestive heart failure, acute kidney injury, and sepsis. Qualitative analysis of the agent debates reveals that this structure effectively surfaces and weighs conflicting evidence, though it can occasionally be susceptible to groupthink. By modeling a clinical team's reasoning process, our system offers a promising path toward more accurate, robust, and interpretable clinical decision support tools.
comment: Accepted to The 16th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2025)(Poster Paper)
ORCA: ORchestrating Causal Agent
Causal inference is essential for decision-making science while the complexity of the data analysis workflow, ranging from data wrangling to causal analysis, increases substantially as the scale of data grows in complicated business environments. Especially, the execution of the workflow in relational databases by non-experts can result in repetitive bottlenecks which impede timely and responsible business insights. To address this challenge, we propose ORCA (Orchestrating Causal Agent), an LLM agentic system that can automate routine workflows in RDBMS while preserving expert oversight via human-AI interactions. ORCA orchestrates the full data analysis pipeline: interpreting natural language queries, navigating tables from DB servers, generating proper SQL codes, preprocessing data, and configuring modeling processes using causal inference libraries. Domain experts still can control the automation through iterative interactions with ORCA, enabling robust data-driven decision making with less technical expertise in statistical computing. Empirical evaluations on benchmark and synthetic e-commerce datasets demonstrate competitive performance of ORCA in table understanding, query generation, and cause-effect estimation -- achieving over $7\times$ improvement in estimating average treatment compared to GPT-4o mini.
comment: 24 pages, 17 figures, 1 table
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless, constrained by limited context windows that hinder long-horizon reasoning. Recent efforts to address this limitation often augment LLMs with an external memory bank, yet most existing pipelines are static and heuristic-driven, lacking any learned mechanism for deciding what to store, update, or retrieve. We present Memory-R1, a reinforcement learning (RL) framework that equips LLMs with the ability to actively manage and utilize external memory through two specialized agents: a Memory Manager that learns to perform structured memory operations, including adding, updating, deleting, or taking no operation on memory entries; and an Answer Agent that selects the most relevant entries and reasons over them to produce an answer. Both agents are fine-tuned with outcome-driven RL (PPO and GRPO), enabling adaptive memory management and utilization with minimal supervision. With as few as 152 question-answer pairs and a corresponding temporal memory bank for training, Memory-R1 outperforms the strongest existing baseline and demonstrates strong generalization across diverse question types and LLM backbones. Beyond presenting an effective approach, this work provides insights into how RL can unlock more agentic, memory-aware behavior in LLMs, pointing toward richer, more persistent reasoning systems.
comment: work in progress
Designing Dynamic Pricing for Bike-sharing Systems via Differentiable Agent-based Simulation
Bike-sharing systems are emerging in various cities as a new ecofriendly transportation system. In these systems, spatiotemporally varying user demands lead to imbalanced inventory at bicycle stations, resulting in additional relocation costs. Therefore, it is essential to manage user demand through optimal dynamic pricing for the system. However, optimal pricing design for such a system is challenging because the system involves users with diverse backgrounds and their probabilistic choices. To address this problem, we develop a differentiable agent-based simulation to rapidly design dynamic pricing in bike-sharing systems, achieving balanced bicycle inventory despite spatiotemporally heterogeneous trips and probabilistic user decisions. We first validate our approach against conventional methods through numerical experiments involving 25 bicycle stations and five time slots, yielding 100 parameters. Compared to the conventional methods, our approach obtains a more accurate solution with a 73% to 78% reduction in loss while achieving more than a 100-fold increase in convergence speed. We further validate our approach on a large-scale urban bike-sharing system scenario involving 289 bicycle stations, resulting in a total of 1156 parameters. Through simulations using the obtained pricing policies, we confirm that these policies can naturally induce balanced inventory without any manual relocation. Additionally, we find that the cost of discounts to induce the balanced inventory can be minimized by setting appropriate initial conditions.
comment: The typo in the author's name has been corrected
Systems and Control (CS)
DynaMark: A Reinforcement Learning Framework for Dynamic Watermarking in Industrial Machine Tool Controllers
Industry 4.0's highly networked Machine Tool Controllers (MTCs) are prime targets for replay attacks that use outdated sensor data to manipulate actuators. Dynamic watermarking can reveal such tampering, but current schemes assume linear-Gaussian dynamics and use constant watermark statistics, making them vulnerable to the time-varying, partly proprietary behavior of MTCs. We close this gap with DynaMark, a reinforcement learning framework that models dynamic watermarking as a Markov decision process (MDP). It learns an adaptive policy online that dynamically adapts the covariance of a zero-mean Gaussian watermark using available measurements and detector feedback, without needing system knowledge. DynaMark maximizes a unique reward function balancing control performance, energy consumption, and detection confidence dynamically. We develop a Bayesian belief updating mechanism for real-time detection confidence in linear systems. This approach, independent of specific system assumptions, underpins the MDP for systems with linear dynamics. On a Siemens Sinumerik 828D controller digital twin, DynaMark achieves a reduction in watermark energy by 70% while preserving the nominal trajectory, compared to constant variance baselines. It also maintains an average detection delay equivalent to one sampling interval. A physical stepper-motor testbed validates these findings, rapidly triggering alarms with less control performance decline and exceeding existing benchmarks.
Transferring the driveshaft inertia to the grid via the DC-link in MV drive systems
This paper investigates a control approach that renders the driveshaft inertia completely available on the grid side and enhances the fault ride-through behavior of medium-voltage (MV) drive systems. Two main contributions are presented. First, we show how the rotational inertia of the driveline shaft can be synchronously coupled to the grid through a modification of the speed control reference signal and through an adapted DC-link control strategy. For the latter, we pursue two alternatives: one based on conventional cascaded control and another based on synchronous machine (SM) model matching. Second, we demonstrate that both the standard phase-locked loop (PLL) and the matching control approach can be interpreted, via the ray-circle complementarity, as feedback optimization schemes with distinct steady-state maps. This perspective allows us to revisit matching control, reveal its embedded PLL, highlight its current-limiting and tracking capabilities, and provide an extensive simulation study.
comment: Submitted for review to IEEE Transactions on Control Systems Technology, complete version, 21 pages
A Single Subject Machine Learning Based Classification of Motor Imagery EEGs
Motor Imagery-Based Brain-Computer Interfaces (MI-BCIs) are systems that detect and interpret brain activity patterns linked to the mental visualization of movement, and then translate these into instructions for controlling external robotic or domotic devices. Such devices have the potential to be useful in a broad variety of applications. While implementing a system that would help individuals restore some freedom levels, the interpretation of (Electroencephalography) EEG data remains a complex and unsolved problem. In the literature, the classification of left and right imagined movements has been extensively studied. This study introduces a novel pipeline that makes use of machine learning techniques for classifying MI EEG data. The entire framework is capable of accurately categorizing left and imagined motions, as well as rest phases, for a set of 52 subjects who performed a MI task. We trained a within subject model on each individual subject. The methodology has been offline evaluated and compared to four studies that are currently the state-of-the-art regarding the specified dataset. The results show that our proposed framework could be used with MI-BCI systems in light of its failsafe classification performances, i.e. 99.5% in accuracy
comment: Conference Paper
Chance-Constrained DC Optimal Power Flow Using Constraint-Informed Statistical Estimation
Chance-constrained optimization has emerged as a promising framework for managing uncertainties in power systems. This work advances its application to the DC Optimal Power Flow (DC-OPF) model, developing a novel approach to uncertainty modeling and estimation. Current methods typically tackle these problems by first modeling random nodal injections using high-dimensional statistical distributions that scale with the number of buses, followed by deriving deterministic reformulations of the probabilistic constraints. We propose an alternative methodology that exploits the constraint structure to inform the uncertainties to be estimated, enabling significant dimensionality reduction. Rather than learning joint distributions of net-load forecast errors across units, we instead directly model the one-dimensional aggregate system forecast error and two-dimensional line errors weighted by power transfer distribution factors. We evaluate our approach under both Gaussian and non-Gaussian distributions on synthetic and real-world datasets, demonstrating significant improvements in statistical accuracy and optimization performance compared to existing methods.
A Dual Ensemble Kalman Filter Approach to Robust Control of Nonlinear Systems: An Application to Partial Differential Equations
This paper considers the problem of data-driven robust control design for nonlinear systems, for instance, obtained when discretizing nonlinear partial differential equations (PDEs). A robust learning control approach is developed for nonlinear affine in control systems based on Lyapunov redesign technique. The robust control is developed as a sum of an optimal learning control which stabilizes the system in absence of disturbances, and an additive Lyapunov-based robustification term which handles the effects of disturbances. The dual ensemble Kalman filter (dual EnKF) algorithm is utilized in the optimal control design methodology. A simulation study is done on the heat equation and Burgers partial differential equation.
A Soft Inducement Framework for Incentive-Aided Steering of No-Regret Players
In this work, we investigate a steering problem in a mediator-augmented two-player normal-form game, where the mediator aims to guide players toward a specific action profile through information and incentive design. We first characterize the games for which successful steering is possible. Moreover, we establish that steering players to any desired action profile is not always achievable with information design alone, nor when accompanied with sublinear payment schemes. Consequently, we derive a lower bound on the constant payments required per round to achieve this goal. To address these limitations incurred with information design, we introduce an augmented approach that involves a one-shot information design phase before the start of the repeated game, transforming the prior interaction into a Stackelberg game. Finally, we theoretically demonstrate that this approach improves the convergence rate of players' action profiles to the target point by a constant factor with high probability, and support it with empirical results.
Hull Clustering with Blended Representative Periods for Energy System Optimization Models
The growing integration of renewable energy sources into power systems requires planning models to account for not only demand variability but also fluctuations in renewable availability during operational periods. Capturing this temporal detail over long planning horizons can be computationally demanding or even intractable. A common approach to address this challenge is to approximate the problem using a reduced set of selected time periods, known as representative periods (RPs). However, using too few RPs can significantly degrade solution quality. In this paper, we propose a novel method -- hull clustering with blended RPs -- that enhances traditional clustering-based RP approaches in two key ways. First, instead of selecting typical cluster centers (e.g., centroids or medoids) as RPs, our method is based on extreme points, which are more likely to be constraint-binding. Second, it represents base periods as weighted combinations of RPs (e.g., convex or conic blends), enabling a more accurate approximation of the full time horizon with fewer RPs. Through two case studies based on data from the European network operators, we demonstrate that hull clustering with blended RPs outperforms traditional RP techniques in both regret and computational efficiency.
The Rosario Dataset v2: Multimodal Dataset for Agricultural Robotics
We present a multi-modal dataset collected in a soybean crop field, comprising over two hours of recorded data from sensors such as stereo infrared camera, color camera, accelerometer, gyroscope, magnetometer, GNSS (Single Point Positioning, Real-Time Kinematic and Post-Processed Kinematic), and wheel odometry. This dataset captures key challenges inherent to robotics in agricultural environments, including variations in natural lighting, motion blur, rough terrain, and long, perceptually aliased sequences. By addressing these complexities, the dataset aims to support the development and benchmarking of advanced algorithms for localization, mapping, perception, and navigation in agricultural robotics. The platform and data collection system is designed to meet the key requirements for evaluating multi-modal SLAM systems, including hardware synchronization of sensors, 6-DOF ground truth and loops on long trajectories. We run multimodal state-of-the art SLAM methods on the dataset, showcasing the existing limitations in their application on agricultural settings. The dataset and utilities to work with it are released on https://cifasis.github.io/rosariov2/.
comment: First published on The International Journal of Robotics Research: https://journals.sagepub.com/doi/10.1177/02783649251368909
Adapting to Change: A Comparison of Continual and Transfer Learning for Modeling Building Thermal Dynamics under Concept Drifts
Transfer Learning (TL) is currently the most effective approach for modeling building thermal dynamics when only limited data are available. TL uses a pretrained model that is fine-tuned to a specific target building. However, it remains unclear how to proceed after initial fine-tuning, as more operational measurement data are collected over time. This challenge becomes even more complex when the dynamics of the building change, for example, after a retrofit or a change in occupancy. In Machine Learning literature, Continual Learning (CL) methods are used to update models of changing systems. TL approaches can also address this challenge by reusing the pretrained model at each update step and fine-tuning it with new measurement data. A comprehensive study on how to incorporate new measurement data over time to improve prediction accuracy and address the challenges of concept drifts (changes in dynamics) for building thermal dynamics is still missing. Therefore, this study compares several CL and TL strategies, as well as a model trained from scratch, for thermal dynamics modeling during building operation. The methods are evaluated using 5--7 years of simulated data representative of single-family houses in Central Europe, including scenarios with concept drifts from retrofits and changes in occupancy. We propose a CL strategy (Seasonal Memory Learning) that provides greater accuracy improvements than existing CL and TL methods, while maintaining low computational effort. SML outperformed the benchmark of initial fine-tuning by 28.1\% without concept drifts and 34.9\% with concept drifts.
comment: Currently under review
Adaptive Dead-Zone Dual Sliding Mode Observer for Reliable Electrochemical Model-Based SOC Estimation
Accurate state of charge (SOC) estimation is critical for ensuring the safety, reliability, and efficiency of lithium-ion batteries in electric vehicles and energy storage systems. Electrochemical models provide high fidelity for SOC estimation but introduce challenges due to parameter variations, nonlinearities, and computational complexity. To address these issues, this paper proposes an adaptive dead-zone dual sliding mode observer(SMO) based on an improved electrochemical single-particle model. The algorithm integrates a state observer for SOC estimation and a parameter observer for online parameter adaptation. A Lyapunov-derived adaptive dead-zone is introduced to ensure stability, activating parameter updates only when the terminal voltage error lies within a rigorously defined bound. The proposed method was validated under constant-current and UDDS dynamic conditions. Results demonstrate that the adaptive dead-zone dual SMO achieves superior accuracy compared with conventional dual SMO and equivalent circuit model-based EKF methods, maintaining SOC estimation errors within 0.2% under correct initialization and below 1% under a 30% initial SOC error, with rapid convergence. Computational efficiency analysis further shows that the adaptive dead-zone dual sliding mode observer reduces execution time compared with the conventional dual SMO by limiting unnecessary parameter updates, highlighting its suitability for real-time battery management applications. Moreover, robustness under battery aging was confirmed using a cycle-aging model, where the adaptive dead-zone dual SMO maintained stable SOC estimation despite parameter drift. These findings indicate that the proposed method offers a reliable, accurate, and computationally efficient solution for SOC estimation.
comment: 36 pages, 5 figures
Model Reference Adaptive Control with Time-Varying State and Input Constraints
This paper presents a model reference adaptive control (MRAC) framework for uncertain linear time-invariant (LTI) systems subject to user-defined, time-varying state and input constraints. The proposed design seamlessly integrates a time-varying barrier Lyapunov function (TVBLF) to enforce state constraints with a time-varying saturation function to handle input limits. These time-varying constraints can be designed as performance functions to shape transient and steady-state behaviors for both state and input. A key contribution is the derivation of a verifiable, offline feasibility condition to check the existence of a valid control policy for a given set of constraints. To the best of our knowledge, this is the first adaptive control methodology to simultaneously handle both time-varying state and input constraints without resorting to online optimization. Simulation results validate the efficacy of the proposed constrained MRAC scheme.
State and Input Constrained Model Reference Adaptive Control with Robustness and Feasibility Analysis
We propose a model reference adaptive controller (MRAC) for uncertain linear time-invariant (LTI) plants with user-defined state and input constraints in the presence of unmatched bounded disturbances. Unlike popular optimization-based approaches for constrained control, such as model predictive control (MPC) and control barrier function (CBF) that solve a constrained optimization problem at each step using the system model, our approach is optimization-free and adaptive; it combines a saturated adaptive controller with a barrier Lyapunov function (BLF)-based design to ensure that the plant state and input always stay within pre-specified bounds despite the presence of unmatched disturbances. To the best of our knowledge, this is the first result that considers both state and input constraints for control of uncertain systems with disturbances and provides sufficient feasibility conditions to check for the existence of an admissible control policy. Simulation results, including a comparison with a robust MRAC, demonstrate the effectiveness of the proposed algorithm.
Beyond expected value: geometric mean optimization for long-term policy performance in reinforcement learning
Reinforcement learning (RL) algorithms typically optimize the expected cumulative reward, i.e., the expected value of the sum of scalar rewards an agent receives over the course of a trajectory. The expected value averages the performance over an infinite number of trajectories. However, when deploying the agent in the real world, this ensemble average may be uninformative for the performance of individual trajectories. Thus, in many applications, optimizing the long-term performance of individual trajectories might be more desirable. In this work, we propose a novel RL algorithm that combines the standard ensemble average with the time-average growth rate, a measure for the long-term performance of individual trajectories. We first define the Bellman operator for the time-average growth rate. We then show that, under multiplicative reward dynamics, the geometric mean aligns with the time-average growth rate. To address more general and unknown reward dynamics, we propose a modified geometric mean with $N$-sliding window that captures the path-dependency as an estimator for the time-average growth rate. This estimator is embedded as a regularizer into the objective, forming a practical algorithm and enabling the policy to benefit from ensemble average and time-average simultaneously. We evaluate our algorithm in challenging simulations, where it outperforms conventional RL methods.
comment: Accepted final version to appear in the Proceedings of the IEEE Conference on Decision and Control
A Passivity Analysis for Nonlinear Consensus on Digraphs
This work presents a passivity-based analysis for the nonlinear output agreement problem in network systems over directed graphs. We reformulate the problem as a convergence analysis on the agreement submanifold. First, we establish how passivity properties of individual agents and controllers determine the passivity of their associated system relations. Building on this, we introduce the concept of submanifold-constrained passivity and develop a novel compensation theorem that ensures output convergence to the agreement submanifold. Unlike previous approaches, our approach can analyze the network system with arbitrary digraphs and any passive agents. We apply this framework to analyze the output agreement problem for network systems consisting of nonlinear and passive agents. Numerical examples support our results.
comment: Accepted to CDC 2025; 7 pages, 3 figures
Incremental Policy Iteration for Unknown Nonlinear Systems with Stability and Performance Guarantees
This paper proposes a general incremental policy iteration adaptive dynamic programming (ADP) algorithm for model-free robust optimal control of unknown nonlinear systems. The approach integrates recursive least squares estimation with linear ADP principles, which greatly simplifies the implementation while preserving adaptive learning capabilities. In particular, we develop a sufficient condition for selecting a discount factor such that it allows learning the optimal policy starting with an initial policy that is not necessarily stabilizing. Moreover, we characterize the robust stability of the closed-loop system and the near-optimality of iterative policies. Finally, we perform numerical simulations to demonstrate the effectiveness of the proposed method.
Multi-Modal Model Predictive Path Integral Control for Collision Avoidance
This paper proposes a novel approach to motion planning and decision-making for automated vehicles, using a multi-modal Model Predictive Path Integral control algorithm. The method samples with Sobol sequences around the prior input and incorporates analytical solutions for collision avoidance. By leveraging multiple modes, the multi-modal control algorithm explores diverse trajectories, such as manoeuvring around obstacles or stopping safely before them, mitigating the risk of sub-optimal solutions. A non-linear single-track vehicle model with a Fiala tyre serves as the prediction model, and tyre force constraints within the friction circle are enforced to ensure vehicle stability during evasive manoeuvres. The optimised steering angle and longitudinal acceleration are computed to generate a collision-free trajectory and to control the vehicle. In a high-fidelity simulation environment, we demonstrate that the proposed algorithm can successfully avoid obstacles, keeping the vehicle stable while driving a double lane change manoeuvre on high and low-friction road surfaces and occlusion scenarios with moving obstacles, outperforming a standard Model Predictive Path Integral approach.
comment: Accepted as an oral presentation at the 29th IAVSD. August 18-22, 2025. Shanghai, China
A Fundamental Convergence Rate Bound for Gradient Based Online Optimization Algorithms with Exact Tracking
In this paper, we consider algorithms with integral action for solving online optimization problems characterized by quadratic cost functions with a time-varying optimal point described by an $(n-1)$th order polynomial. Using a version of the internal model principle, the optimization algorithms under consideration are required to incorporate a discrete time $n$-th order integrator in order to achieve exact tracking. By using results on an optimal gain margin problem, we obtain a fundamental convergence rate bound for the class of linear gradient based algorithms exactly tracking a time-varying optimal point. This convergence rate bound is given by $ \left(\frac{\sqrt{\kappa} - 1 }{\sqrt{\kappa} + 1}\right)^{\frac{1}{n}}$, where $\kappa$ is the condition number for the set of cost functions under consideration. Using our approach, we also construct algorithms which achieve the optimal convergence rate as well as zero steady-state error when tracking a time-varying optimal point.
comment: Submitted to IEEE Transactions on Automatic Control
Cooperative Sensing Enhanced UAV Path-Following and Obstacle Avoidance with Variable Formation
The high mobility of unmanned aerial vehicles (UAVs) enables them to be used in various civilian fields, such as rescue and cargo transport. Path-following is a crucial way to perform these tasks while sensing and collision avoidance are essential for safe flight. In this paper, we investigate how to efficiently and accurately achieve path-following, obstacle sensing and avoidance subtasks, as well as their conflict-free fusion scheduling. Firstly, a high precision deep reinforcement learning (DRL)-based UAV formation path-following model is developed, and the reward function with adaptive weights is designed from the perspective of distance and velocity errors. Then, we use integrated sensing and communication (ISAC) signals to detect the obstacle and derive the Cramer-Rao lower bound (CRLB) for obstacle sensing by information-level fusion, based on which we propose the variable formation enhanced obstacle position estimation (VFEO) algorithm. In addition, an online obstacle avoidance scheme without pretraining is designed to solve the sparse reward. Finally, with the aid of null space based (NSB) behavioral method, we present a hierarchical subtasks fusion strategy. Simulation results demonstrate the effectiveness and superiority of the subtask algorithms and the hierarchical fusion strategy.
On Zero-sum Game Representation for Replicator Dynamics
Replicator dynamics have widely been used in evolutionary game theory to model how strategy frequencies evolve over time in large populations. The so-called payoff matrix encodes the pairwise fitness that each strategy obtains when interacting with every other strategy, and it solely determines the replicator dynamics. If the payoff matrix is unknown, we show in this paper that it cannot be inferred from observed strategy frequencies alone -- distinct payoff matrices can induce the same replicator dynamics. We thus look for a canonical representative of the payoff matrix in the equivalence class. The main result of the paper is to show that for every polynomial replicator dynamics (i.e., the vector field is a polynomial), there always exists a skew-symmetric, polynomial payoff matrix that can induce the given dynamics.
Adaptive Optimisation of Ride-Pooling Personalised Fares in a Stochastic Framework
Ride-pooling systems, to succeed, must provide an attractive service, namely compensate perceived costs with an appealing price. However, because of a strong heterogeneity in a value-of-time, each traveller has his own acceptable price, unknown to the operator. Here, we show that individual acceptance levels can be learned by the operator (over $90\%$ accuracy for pooled travellers in $10$ days) to optimise personalised fares. We propose an adaptive pricing policy, where every day the operator constructs an offer that progressively meets travellers' expectations and attracts a growing demand. Our results suggest that operators, by learning behavioural traits of individual travellers, may improve performance not only for travellers (increased utility) but also for themselves (increased profit). Moreover, such knowledge allows the operator to remove inefficient pooled rides and focus on attractive and profitable combinations.
A Hybrid Artificial Intelligence Method for Estimating Flicker in Power Systems
This paper introduces a novel hybrid AI method combining H filtering and an adaptive linear neuron network for flicker component estimation in power distribution systems.The proposed method leverages the robustness of the H filter to extract the voltage envelope under uncertain and noisy conditions followed by the use of ADALINE to accurately identify flicker frequencies embedded in the envelope.This synergy enables efficient time domain estimation with rapid convergence and noise resilience addressing key limitations of existing frequency domain approaches.Unlike conventional techniques this hybrid AI model handles complex power disturbances without prior knowledge of noise characteristics or extensive training.To validate the method performance we conduct simulation studies based on IEC Standard 61000 4 15 supported by statistical analysis Monte Carlo simulations and real world data.Results demonstrate superior accuracy robustness and reduced computational load compared to Fast Fourier Transform and Discrete Wavelet Transform based estimators.
comment: 31 pages, 12 figures, and 6 tables
Distributed Constrained Online Nonconvex Optimization with Compressed Communication
This paper considers distributed online nonconvex optimization with time-varying inequality constraints over a network of agents. For a time-varying graph, we propose a distributed online primal-dual algorithm with compressed communication to efficiently utilize communication resources. We show that the proposed algorithm establishes an $\mathcal{O}( {{T^{\max \{ {1 - {\theta_1},{\theta_1}} \}}}} )$ network regret bound and an $\mathcal{O}( {T^{1 - {\theta_1}/2}} )$ network cumulative constraint violation bound, where $T$ is the number of iterations and ${\theta_1} \in ( {0,1} )$ is a user-defined trade-off parameter. When Slater's condition holds (i.e, there is a point that strictly satisfies the inequality constraints at all iterations), the network cumulative constraint violation bound is reduced to $\mathcal{O}( {T^{1 - {\theta_1}}} )$. These bounds are comparable to the state-of-the-art results established by existing distributed online algorithms with perfect communication for distributed online convex optimization with (time-varying) inequality constraints. Finally, a simulation example is presented to validate the theoretical results.
comment: 31 pages, 2 figures. arXiv admin note: text overlap with arXiv:2411.11574
Probabilistic Flexibility Aggregation of DERs for Ancillary Services Provision
This paper presents a grid-aware probabilistic approach to compute the aggregated flexibility at the grid connection point (GCP) of active distribution networks (ADNs) to allow the participation of DERs in ancillary services (AS) markets. Specifically an optimal power flow (OPF) method using a linear network model is used to compute the aggregated capability for the provision of multiple AS. We start from the method proposed in [1] and extend it to allow for optimizing the provision of multiple services simultaneously, ensure cost-effectiveness of the used DERs and handle uncertainties in a probabilistic way. The allocation of individual DERs power flexibilities accounts for the operational costs associated to the provision of different services and ensures cost-effectiveness while maximizing the value of the advertised aggregated flexibility, assuming known service prices. Empirical uncertainty sets are obtained to achieve a predefined coverage of the probability distribution in line with recent developments in the Nordic AS markets. Finally, a feeder-decomposition approach is proposed to ensure the methods applicability to realistic distribution networks with a large number of buses. Different case studies show the effectiveness of the method, highlight the importance of accounting for network constraints and illustrate its applicability to realistic distribution systems.
Stochastic Model Predictive Control of Charging Energy Hubs with Conformal Prediction
This paper presents an online energy management system for an energy hub where electric vehicles are charged combining on-site photovoltaic generation and battery energy storage with the power grid, with the objective to decide on the battery (dis)charging to minimize the costs of operation. To this end, we devise a scenario-based stochastic model predictive control (MPC) scheme that leverages probabilistic 24-hour-ahead forecasts of charging load, solar generation and day-ahead electricity prices to achieve a cost-optimal operation of the energy hub. The probabilistic forecasts leverage conformal prediction providing calibrated distribution-free confidence intervals starting from a machine learning model that generates no uncertainty quantification. We showcase our controller by running it over a 280-day evaluation in a closed-loop simulated environment to compare the observed cost of two scenario-based MPCs with two deterministic alternatives: a version with point forecast and a version with perfect forecast. Our results indicate that, compared to the perfect forecast implementation, our proposed scenario-based MPCs are 13% more expensive, and 1% better than their deterministic point-forecast counterpart
Ontology Neural Network and ORTSF: A Framework for Topological Reasoning and Delay-Robust Control
The advancement of autonomous robotic systems has led to impressive capabilities in perception, localization, mapping, and control. Yet, a fundamental gap remains: existing frameworks excel at geometric reasoning and dynamic stability but fall short in representing and preserving relational semantics, contextual reasoning, and cognitive transparency essential for collaboration in dynamic, human-centric environments. This paper introduces a unified architecture comprising the Ontology Neural Network (ONN) and the Ontological Real-Time Semantic Fabric (ORTSF) to address this gap. The ONN formalizes relational semantic reasoning as a dynamic topological process. By embedding Forman-Ricci curvature, persistent homology, and semantic tensor structures within a unified loss formulation, ONN ensures that relational integrity and topological coherence are preserved as scenes evolve over time. The ORTSF transforms reasoning traces into actionable control commands while compensating for system delays. It integrates predictive and delay-aware operators that ensure phase margin preservation and continuity of control signals, even under significant latency conditions. Empirical studies demonstrate the ONN + ORTSF framework's ability to unify semantic cognition and robust control, providing a mathematically principled and practically viable solution for cognitive robotics.
comment: 12 pages, 5 figures, includes theoretical proofs and simulation results
Optimizing Resource Allocation for QoS and Stability in Dynamic VLC-NOMA Networks via MARL
Visible Light Communication (VLC) combined with Non-Orthogonal Multiple Access (NOMA) offers a promising solution for dense indoor wireless networks. Yet, managing resources effectively is challenged by VLC network dynamic conditions involving user mobility and light dimming. In addition to satisfying Quality of Service (QoS) and network stability requirements. Traditional resource allocation methods and simpler RL approaches struggle to jointly optimize QoS and stability under the dynamic conditions of mobile VLC-NOMA networks. This paper presents MARL frameworks tailored to perform complex joint optimization of resource allocation (NOMA power, user scheduling) and network stability (interference, handovers), considering heterogeneous QoS, user mobility, and dimming in VLC-NOMA systems. Our MARL frameworks capture dynamic channel conditions and diverse user QoS , enabling effective joint optimization. In these frameworks, VLC access points (APs) act as intelligent agents, learning to allocate power and schedule users to satisfy diverse requirements while maintaining network stability by managing interference and minimizing disruptive handovers. We conduct a comparative analysis of two key MARL paradigms: 1) Centralized Training with Decentralized Execution (CTDE) and 2) Centralized Training with Centralized Execution (CTCE). Comprehensive simulations validate the effectiveness of both tailored MARL frameworks and demonstrate an ability to handle complex optimization. The results show key trade-offs, as the CTDE approach achieved approximately 16\% higher for High priority (HP) user QoS satisfaction, while the CTCE approach yielded nearly 7 dB higher average SINR and 12\% lower ping-pong handover ratio, offering valuable insights into the performance differences between these paradigms in complex VLC-NOMA network scenarios.
comment: 17 pages, 11 figures. This paper has been published in IEEE Access
Algebraic Control: Complete Stable Inversion with Necessary and Sufficient Conditions
In this paper, we establish necessary and sufficient conditions for stable inversion, addressing challenges in non-minimum phase, non-square, and singular systems. An H-Infinity based algebraic approximation is introduced for near-perfect tracking without preview. Additionally, we propose a novel robust control strategy combining the nominal model with dual feedforward control to form a feedback structure. Numerical comparison demonstrates the approach's effectiveness.
The impact of heatwave-driven air conditioning adoption on electricity demand: A spatio-temporal case study for Germany
Intensifying heatwaves driven by climate change are accelerating the adoption of mobile air conditioning (AC) systems. A rapid mass adoption of such AC systems could create additional stress on electricity grids and the power system. This study presents a novel method to estimate the electricity demand from AC systems both at the system level and at high temporal and spatial granularity. We apply the method to a near-future heatwave scenario in Germany in which household AC adoption increases from the current 19% to 35% during a heatwave similar to the one of July 2025. We analyze the effects for 196,428 grid cells of one square kilometer across Germany, by combining weather data, census data, socio-demographic assumptions, mobility patterns, and temperature-dependent AC activation functions. We find that electricity demand of newly purchased mobile AC systems could increase the peak load by over 12.9 GW, with urban hot-spots reaching 5.2 MW per square kilometer. The temporal pattern creates a pronounced afternoon peak that coincides with lower photovoltaic generation, potentially exacerbating power system stability challenges. Our findings underscore the urgency for proactive energy system planning to manage emerging demand peaks.
comment: 21 pages, 12 figures
AI Simulation by Digital Twins: Systematic Survey, Reference Framework, and Mapping to a Standardized Architecture
Insufficient data volume and quality are particularly pressing challenges in the adoption of modern subsymbolic AI. To alleviate these challenges, AI simulation uses virtual training environments in which AI agents can be safely and efficiently developed with simulated, synthetic data. Digital twins open new avenues in AI simulation, as these high-fidelity virtual replicas of physical systems are equipped with state-of-the-art simulators and the ability to further interact with the physical system for additional data collection. In this article, we report on our systematic survey of digital twin-enabled AI simulation. By analyzing 22 primary studies, we identify technological trends and derive a reference framework to situate digital twins and AI components. Based on our findings, we derive a reference framework and provide architectural guidelines by mapping it onto the ISO 23247 reference architecture for digital twins. Finally, we identify challenges and research opportunities for prospective researchers.
Fixed-Time Input-to-State Stability for Singularly Perturbed Systems via Composite Lyapunov Functions
We study singularly perturbed systems that exhibit input-to-state stability (ISS) with fixed-time properties in the presence of bounded disturbances. In these systems, solutions converge to the origin within a time frame independent of initial conditions when undisturbed, and to a vicinity of the origin when subjected to bounded disturbances. First, we extend the traditional composite Lyapunov method, commonly applied in singular perturbation theory to analyze asymptotic stability, to include fixed-time ISS. We demonstrate that if both the reduced system and the boundary layer system exhibit fixed-time ISS, and if certain interconnection conditions are met, the entire multi-time scale system retains this fixed-time ISS characteristic, provided the separation of time scales is sufficiently pronounced. Next, we illustrate our findings via analytical and numerical examples, including a novel application in fixed-time feedback optimization for dynamic plants with slowly varying cost functions.
Systems and Control (EESS)
DynaMark: A Reinforcement Learning Framework for Dynamic Watermarking in Industrial Machine Tool Controllers
Industry 4.0's highly networked Machine Tool Controllers (MTCs) are prime targets for replay attacks that use outdated sensor data to manipulate actuators. Dynamic watermarking can reveal such tampering, but current schemes assume linear-Gaussian dynamics and use constant watermark statistics, making them vulnerable to the time-varying, partly proprietary behavior of MTCs. We close this gap with DynaMark, a reinforcement learning framework that models dynamic watermarking as a Markov decision process (MDP). It learns an adaptive policy online that dynamically adapts the covariance of a zero-mean Gaussian watermark using available measurements and detector feedback, without needing system knowledge. DynaMark maximizes a unique reward function balancing control performance, energy consumption, and detection confidence dynamically. We develop a Bayesian belief updating mechanism for real-time detection confidence in linear systems. This approach, independent of specific system assumptions, underpins the MDP for systems with linear dynamics. On a Siemens Sinumerik 828D controller digital twin, DynaMark achieves a reduction in watermark energy by 70% while preserving the nominal trajectory, compared to constant variance baselines. It also maintains an average detection delay equivalent to one sampling interval. A physical stepper-motor testbed validates these findings, rapidly triggering alarms with less control performance decline and exceeding existing benchmarks.
Transferring the driveshaft inertia to the grid via the DC-link in MV drive systems
This paper investigates a control approach that renders the driveshaft inertia completely available on the grid side and enhances the fault ride-through behavior of medium-voltage (MV) drive systems. Two main contributions are presented. First, we show how the rotational inertia of the driveline shaft can be synchronously coupled to the grid through a modification of the speed control reference signal and through an adapted DC-link control strategy. For the latter, we pursue two alternatives: one based on conventional cascaded control and another based on synchronous machine (SM) model matching. Second, we demonstrate that both the standard phase-locked loop (PLL) and the matching control approach can be interpreted, via the ray-circle complementarity, as feedback optimization schemes with distinct steady-state maps. This perspective allows us to revisit matching control, reveal its embedded PLL, highlight its current-limiting and tracking capabilities, and provide an extensive simulation study.
comment: Submitted for review to IEEE Transactions on Control Systems Technology, complete version, 21 pages
A Single Subject Machine Learning Based Classification of Motor Imagery EEGs
Motor Imagery-Based Brain-Computer Interfaces (MI-BCIs) are systems that detect and interpret brain activity patterns linked to the mental visualization of movement, and then translate these into instructions for controlling external robotic or domotic devices. Such devices have the potential to be useful in a broad variety of applications. While implementing a system that would help individuals restore some freedom levels, the interpretation of (Electroencephalography) EEG data remains a complex and unsolved problem. In the literature, the classification of left and right imagined movements has been extensively studied. This study introduces a novel pipeline that makes use of machine learning techniques for classifying MI EEG data. The entire framework is capable of accurately categorizing left and imagined motions, as well as rest phases, for a set of 52 subjects who performed a MI task. We trained a within subject model on each individual subject. The methodology has been offline evaluated and compared to four studies that are currently the state-of-the-art regarding the specified dataset. The results show that our proposed framework could be used with MI-BCI systems in light of its failsafe classification performances, i.e. 99.5% in accuracy
comment: Conference Paper
Chance-Constrained DC Optimal Power Flow Using Constraint-Informed Statistical Estimation
Chance-constrained optimization has emerged as a promising framework for managing uncertainties in power systems. This work advances its application to the DC Optimal Power Flow (DC-OPF) model, developing a novel approach to uncertainty modeling and estimation. Current methods typically tackle these problems by first modeling random nodal injections using high-dimensional statistical distributions that scale with the number of buses, followed by deriving deterministic reformulations of the probabilistic constraints. We propose an alternative methodology that exploits the constraint structure to inform the uncertainties to be estimated, enabling significant dimensionality reduction. Rather than learning joint distributions of net-load forecast errors across units, we instead directly model the one-dimensional aggregate system forecast error and two-dimensional line errors weighted by power transfer distribution factors. We evaluate our approach under both Gaussian and non-Gaussian distributions on synthetic and real-world datasets, demonstrating significant improvements in statistical accuracy and optimization performance compared to existing methods.
A Dual Ensemble Kalman Filter Approach to Robust Control of Nonlinear Systems: An Application to Partial Differential Equations
This paper considers the problem of data-driven robust control design for nonlinear systems, for instance, obtained when discretizing nonlinear partial differential equations (PDEs). A robust learning control approach is developed for nonlinear affine in control systems based on Lyapunov redesign technique. The robust control is developed as a sum of an optimal learning control which stabilizes the system in absence of disturbances, and an additive Lyapunov-based robustification term which handles the effects of disturbances. The dual ensemble Kalman filter (dual EnKF) algorithm is utilized in the optimal control design methodology. A simulation study is done on the heat equation and Burgers partial differential equation.
A Soft Inducement Framework for Incentive-Aided Steering of No-Regret Players
In this work, we investigate a steering problem in a mediator-augmented two-player normal-form game, where the mediator aims to guide players toward a specific action profile through information and incentive design. We first characterize the games for which successful steering is possible. Moreover, we establish that steering players to any desired action profile is not always achievable with information design alone, nor when accompanied with sublinear payment schemes. Consequently, we derive a lower bound on the constant payments required per round to achieve this goal. To address these limitations incurred with information design, we introduce an augmented approach that involves a one-shot information design phase before the start of the repeated game, transforming the prior interaction into a Stackelberg game. Finally, we theoretically demonstrate that this approach improves the convergence rate of players' action profiles to the target point by a constant factor with high probability, and support it with empirical results.
Hull Clustering with Blended Representative Periods for Energy System Optimization Models
The growing integration of renewable energy sources into power systems requires planning models to account for not only demand variability but also fluctuations in renewable availability during operational periods. Capturing this temporal detail over long planning horizons can be computationally demanding or even intractable. A common approach to address this challenge is to approximate the problem using a reduced set of selected time periods, known as representative periods (RPs). However, using too few RPs can significantly degrade solution quality. In this paper, we propose a novel method -- hull clustering with blended RPs -- that enhances traditional clustering-based RP approaches in two key ways. First, instead of selecting typical cluster centers (e.g., centroids or medoids) as RPs, our method is based on extreme points, which are more likely to be constraint-binding. Second, it represents base periods as weighted combinations of RPs (e.g., convex or conic blends), enabling a more accurate approximation of the full time horizon with fewer RPs. Through two case studies based on data from the European network operators, we demonstrate that hull clustering with blended RPs outperforms traditional RP techniques in both regret and computational efficiency.
The Rosario Dataset v2: Multimodal Dataset for Agricultural Robotics
We present a multi-modal dataset collected in a soybean crop field, comprising over two hours of recorded data from sensors such as stereo infrared camera, color camera, accelerometer, gyroscope, magnetometer, GNSS (Single Point Positioning, Real-Time Kinematic and Post-Processed Kinematic), and wheel odometry. This dataset captures key challenges inherent to robotics in agricultural environments, including variations in natural lighting, motion blur, rough terrain, and long, perceptually aliased sequences. By addressing these complexities, the dataset aims to support the development and benchmarking of advanced algorithms for localization, mapping, perception, and navigation in agricultural robotics. The platform and data collection system is designed to meet the key requirements for evaluating multi-modal SLAM systems, including hardware synchronization of sensors, 6-DOF ground truth and loops on long trajectories. We run multimodal state-of-the art SLAM methods on the dataset, showcasing the existing limitations in their application on agricultural settings. The dataset and utilities to work with it are released on https://cifasis.github.io/rosariov2/.
comment: First published on The International Journal of Robotics Research: https://journals.sagepub.com/doi/10.1177/02783649251368909
Adapting to Change: A Comparison of Continual and Transfer Learning for Modeling Building Thermal Dynamics under Concept Drifts
Transfer Learning (TL) is currently the most effective approach for modeling building thermal dynamics when only limited data are available. TL uses a pretrained model that is fine-tuned to a specific target building. However, it remains unclear how to proceed after initial fine-tuning, as more operational measurement data are collected over time. This challenge becomes even more complex when the dynamics of the building change, for example, after a retrofit or a change in occupancy. In Machine Learning literature, Continual Learning (CL) methods are used to update models of changing systems. TL approaches can also address this challenge by reusing the pretrained model at each update step and fine-tuning it with new measurement data. A comprehensive study on how to incorporate new measurement data over time to improve prediction accuracy and address the challenges of concept drifts (changes in dynamics) for building thermal dynamics is still missing. Therefore, this study compares several CL and TL strategies, as well as a model trained from scratch, for thermal dynamics modeling during building operation. The methods are evaluated using 5--7 years of simulated data representative of single-family houses in Central Europe, including scenarios with concept drifts from retrofits and changes in occupancy. We propose a CL strategy (Seasonal Memory Learning) that provides greater accuracy improvements than existing CL and TL methods, while maintaining low computational effort. SML outperformed the benchmark of initial fine-tuning by 28.1\% without concept drifts and 34.9\% with concept drifts.
comment: Currently under review
Adaptive Dead-Zone Dual Sliding Mode Observer for Reliable Electrochemical Model-Based SOC Estimation
Accurate state of charge (SOC) estimation is critical for ensuring the safety, reliability, and efficiency of lithium-ion batteries in electric vehicles and energy storage systems. Electrochemical models provide high fidelity for SOC estimation but introduce challenges due to parameter variations, nonlinearities, and computational complexity. To address these issues, this paper proposes an adaptive dead-zone dual sliding mode observer(SMO) based on an improved electrochemical single-particle model. The algorithm integrates a state observer for SOC estimation and a parameter observer for online parameter adaptation. A Lyapunov-derived adaptive dead-zone is introduced to ensure stability, activating parameter updates only when the terminal voltage error lies within a rigorously defined bound. The proposed method was validated under constant-current and UDDS dynamic conditions. Results demonstrate that the adaptive dead-zone dual SMO achieves superior accuracy compared with conventional dual SMO and equivalent circuit model-based EKF methods, maintaining SOC estimation errors within 0.2% under correct initialization and below 1% under a 30% initial SOC error, with rapid convergence. Computational efficiency analysis further shows that the adaptive dead-zone dual sliding mode observer reduces execution time compared with the conventional dual SMO by limiting unnecessary parameter updates, highlighting its suitability for real-time battery management applications. Moreover, robustness under battery aging was confirmed using a cycle-aging model, where the adaptive dead-zone dual SMO maintained stable SOC estimation despite parameter drift. These findings indicate that the proposed method offers a reliable, accurate, and computationally efficient solution for SOC estimation.
comment: 36 pages, 5 figures
Model Reference Adaptive Control with Time-Varying State and Input Constraints
This paper presents a model reference adaptive control (MRAC) framework for uncertain linear time-invariant (LTI) systems subject to user-defined, time-varying state and input constraints. The proposed design seamlessly integrates a time-varying barrier Lyapunov function (TVBLF) to enforce state constraints with a time-varying saturation function to handle input limits. These time-varying constraints can be designed as performance functions to shape transient and steady-state behaviors for both state and input. A key contribution is the derivation of a verifiable, offline feasibility condition to check the existence of a valid control policy for a given set of constraints. To the best of our knowledge, this is the first adaptive control methodology to simultaneously handle both time-varying state and input constraints without resorting to online optimization. Simulation results validate the efficacy of the proposed constrained MRAC scheme.
State and Input Constrained Model Reference Adaptive Control with Robustness and Feasibility Analysis
We propose a model reference adaptive controller (MRAC) for uncertain linear time-invariant (LTI) plants with user-defined state and input constraints in the presence of unmatched bounded disturbances. Unlike popular optimization-based approaches for constrained control, such as model predictive control (MPC) and control barrier function (CBF) that solve a constrained optimization problem at each step using the system model, our approach is optimization-free and adaptive; it combines a saturated adaptive controller with a barrier Lyapunov function (BLF)-based design to ensure that the plant state and input always stay within pre-specified bounds despite the presence of unmatched disturbances. To the best of our knowledge, this is the first result that considers both state and input constraints for control of uncertain systems with disturbances and provides sufficient feasibility conditions to check for the existence of an admissible control policy. Simulation results, including a comparison with a robust MRAC, demonstrate the effectiveness of the proposed algorithm.
Beyond expected value: geometric mean optimization for long-term policy performance in reinforcement learning
Reinforcement learning (RL) algorithms typically optimize the expected cumulative reward, i.e., the expected value of the sum of scalar rewards an agent receives over the course of a trajectory. The expected value averages the performance over an infinite number of trajectories. However, when deploying the agent in the real world, this ensemble average may be uninformative for the performance of individual trajectories. Thus, in many applications, optimizing the long-term performance of individual trajectories might be more desirable. In this work, we propose a novel RL algorithm that combines the standard ensemble average with the time-average growth rate, a measure for the long-term performance of individual trajectories. We first define the Bellman operator for the time-average growth rate. We then show that, under multiplicative reward dynamics, the geometric mean aligns with the time-average growth rate. To address more general and unknown reward dynamics, we propose a modified geometric mean with $N$-sliding window that captures the path-dependency as an estimator for the time-average growth rate. This estimator is embedded as a regularizer into the objective, forming a practical algorithm and enabling the policy to benefit from ensemble average and time-average simultaneously. We evaluate our algorithm in challenging simulations, where it outperforms conventional RL methods.
comment: Accepted final version to appear in the Proceedings of the IEEE Conference on Decision and Control
A Passivity Analysis for Nonlinear Consensus on Digraphs
This work presents a passivity-based analysis for the nonlinear output agreement problem in network systems over directed graphs. We reformulate the problem as a convergence analysis on the agreement submanifold. First, we establish how passivity properties of individual agents and controllers determine the passivity of their associated system relations. Building on this, we introduce the concept of submanifold-constrained passivity and develop a novel compensation theorem that ensures output convergence to the agreement submanifold. Unlike previous approaches, our approach can analyze the network system with arbitrary digraphs and any passive agents. We apply this framework to analyze the output agreement problem for network systems consisting of nonlinear and passive agents. Numerical examples support our results.
comment: Accepted to CDC 2025; 7 pages, 3 figures
Incremental Policy Iteration for Unknown Nonlinear Systems with Stability and Performance Guarantees
This paper proposes a general incremental policy iteration adaptive dynamic programming (ADP) algorithm for model-free robust optimal control of unknown nonlinear systems. The approach integrates recursive least squares estimation with linear ADP principles, which greatly simplifies the implementation while preserving adaptive learning capabilities. In particular, we develop a sufficient condition for selecting a discount factor such that it allows learning the optimal policy starting with an initial policy that is not necessarily stabilizing. Moreover, we characterize the robust stability of the closed-loop system and the near-optimality of iterative policies. Finally, we perform numerical simulations to demonstrate the effectiveness of the proposed method.
Multi-Modal Model Predictive Path Integral Control for Collision Avoidance
This paper proposes a novel approach to motion planning and decision-making for automated vehicles, using a multi-modal Model Predictive Path Integral control algorithm. The method samples with Sobol sequences around the prior input and incorporates analytical solutions for collision avoidance. By leveraging multiple modes, the multi-modal control algorithm explores diverse trajectories, such as manoeuvring around obstacles or stopping safely before them, mitigating the risk of sub-optimal solutions. A non-linear single-track vehicle model with a Fiala tyre serves as the prediction model, and tyre force constraints within the friction circle are enforced to ensure vehicle stability during evasive manoeuvres. The optimised steering angle and longitudinal acceleration are computed to generate a collision-free trajectory and to control the vehicle. In a high-fidelity simulation environment, we demonstrate that the proposed algorithm can successfully avoid obstacles, keeping the vehicle stable while driving a double lane change manoeuvre on high and low-friction road surfaces and occlusion scenarios with moving obstacles, outperforming a standard Model Predictive Path Integral approach.
comment: Accepted as an oral presentation at the 29th IAVSD. August 18-22, 2025. Shanghai, China
A Fundamental Convergence Rate Bound for Gradient Based Online Optimization Algorithms with Exact Tracking
In this paper, we consider algorithms with integral action for solving online optimization problems characterized by quadratic cost functions with a time-varying optimal point described by an $(n-1)$th order polynomial. Using a version of the internal model principle, the optimization algorithms under consideration are required to incorporate a discrete time $n$-th order integrator in order to achieve exact tracking. By using results on an optimal gain margin problem, we obtain a fundamental convergence rate bound for the class of linear gradient based algorithms exactly tracking a time-varying optimal point. This convergence rate bound is given by $ \left(\frac{\sqrt{\kappa} - 1 }{\sqrt{\kappa} + 1}\right)^{\frac{1}{n}}$, where $\kappa$ is the condition number for the set of cost functions under consideration. Using our approach, we also construct algorithms which achieve the optimal convergence rate as well as zero steady-state error when tracking a time-varying optimal point.
comment: Submitted to IEEE Transactions on Automatic Control
Cooperative Sensing Enhanced UAV Path-Following and Obstacle Avoidance with Variable Formation
The high mobility of unmanned aerial vehicles (UAVs) enables them to be used in various civilian fields, such as rescue and cargo transport. Path-following is a crucial way to perform these tasks while sensing and collision avoidance are essential for safe flight. In this paper, we investigate how to efficiently and accurately achieve path-following, obstacle sensing and avoidance subtasks, as well as their conflict-free fusion scheduling. Firstly, a high precision deep reinforcement learning (DRL)-based UAV formation path-following model is developed, and the reward function with adaptive weights is designed from the perspective of distance and velocity errors. Then, we use integrated sensing and communication (ISAC) signals to detect the obstacle and derive the Cramer-Rao lower bound (CRLB) for obstacle sensing by information-level fusion, based on which we propose the variable formation enhanced obstacle position estimation (VFEO) algorithm. In addition, an online obstacle avoidance scheme without pretraining is designed to solve the sparse reward. Finally, with the aid of null space based (NSB) behavioral method, we present a hierarchical subtasks fusion strategy. Simulation results demonstrate the effectiveness and superiority of the subtask algorithms and the hierarchical fusion strategy.
On Zero-sum Game Representation for Replicator Dynamics
Replicator dynamics have widely been used in evolutionary game theory to model how strategy frequencies evolve over time in large populations. The so-called payoff matrix encodes the pairwise fitness that each strategy obtains when interacting with every other strategy, and it solely determines the replicator dynamics. If the payoff matrix is unknown, we show in this paper that it cannot be inferred from observed strategy frequencies alone -- distinct payoff matrices can induce the same replicator dynamics. We thus look for a canonical representative of the payoff matrix in the equivalence class. The main result of the paper is to show that for every polynomial replicator dynamics (i.e., the vector field is a polynomial), there always exists a skew-symmetric, polynomial payoff matrix that can induce the given dynamics.
Adaptive Optimisation of Ride-Pooling Personalised Fares in a Stochastic Framework
Ride-pooling systems, to succeed, must provide an attractive service, namely compensate perceived costs with an appealing price. However, because of a strong heterogeneity in a value-of-time, each traveller has his own acceptable price, unknown to the operator. Here, we show that individual acceptance levels can be learned by the operator (over $90\%$ accuracy for pooled travellers in $10$ days) to optimise personalised fares. We propose an adaptive pricing policy, where every day the operator constructs an offer that progressively meets travellers' expectations and attracts a growing demand. Our results suggest that operators, by learning behavioural traits of individual travellers, may improve performance not only for travellers (increased utility) but also for themselves (increased profit). Moreover, such knowledge allows the operator to remove inefficient pooled rides and focus on attractive and profitable combinations.
A Hybrid Artificial Intelligence Method for Estimating Flicker in Power Systems
This paper introduces a novel hybrid AI method combining H filtering and an adaptive linear neuron network for flicker component estimation in power distribution systems.The proposed method leverages the robustness of the H filter to extract the voltage envelope under uncertain and noisy conditions followed by the use of ADALINE to accurately identify flicker frequencies embedded in the envelope.This synergy enables efficient time domain estimation with rapid convergence and noise resilience addressing key limitations of existing frequency domain approaches.Unlike conventional techniques this hybrid AI model handles complex power disturbances without prior knowledge of noise characteristics or extensive training.To validate the method performance we conduct simulation studies based on IEC Standard 61000 4 15 supported by statistical analysis Monte Carlo simulations and real world data.Results demonstrate superior accuracy robustness and reduced computational load compared to Fast Fourier Transform and Discrete Wavelet Transform based estimators.
comment: 31 pages, 12 figures, and 6 tables
Distributed Constrained Online Nonconvex Optimization with Compressed Communication
This paper considers distributed online nonconvex optimization with time-varying inequality constraints over a network of agents. For a time-varying graph, we propose a distributed online primal-dual algorithm with compressed communication to efficiently utilize communication resources. We show that the proposed algorithm establishes an $\mathcal{O}( {{T^{\max \{ {1 - {\theta_1},{\theta_1}} \}}}} )$ network regret bound and an $\mathcal{O}( {T^{1 - {\theta_1}/2}} )$ network cumulative constraint violation bound, where $T$ is the number of iterations and ${\theta_1} \in ( {0,1} )$ is a user-defined trade-off parameter. When Slater's condition holds (i.e, there is a point that strictly satisfies the inequality constraints at all iterations), the network cumulative constraint violation bound is reduced to $\mathcal{O}( {T^{1 - {\theta_1}}} )$. These bounds are comparable to the state-of-the-art results established by existing distributed online algorithms with perfect communication for distributed online convex optimization with (time-varying) inequality constraints. Finally, a simulation example is presented to validate the theoretical results.
comment: 31 pages, 2 figures. arXiv admin note: text overlap with arXiv:2411.11574
Probabilistic Flexibility Aggregation of DERs for Ancillary Services Provision
This paper presents a grid-aware probabilistic approach to compute the aggregated flexibility at the grid connection point (GCP) of active distribution networks (ADNs) to allow the participation of DERs in ancillary services (AS) markets. Specifically an optimal power flow (OPF) method using a linear network model is used to compute the aggregated capability for the provision of multiple AS. We start from the method proposed in [1] and extend it to allow for optimizing the provision of multiple services simultaneously, ensure cost-effectiveness of the used DERs and handle uncertainties in a probabilistic way. The allocation of individual DERs power flexibilities accounts for the operational costs associated to the provision of different services and ensures cost-effectiveness while maximizing the value of the advertised aggregated flexibility, assuming known service prices. Empirical uncertainty sets are obtained to achieve a predefined coverage of the probability distribution in line with recent developments in the Nordic AS markets. Finally, a feeder-decomposition approach is proposed to ensure the methods applicability to realistic distribution networks with a large number of buses. Different case studies show the effectiveness of the method, highlight the importance of accounting for network constraints and illustrate its applicability to realistic distribution systems.
Stochastic Model Predictive Control of Charging Energy Hubs with Conformal Prediction
This paper presents an online energy management system for an energy hub where electric vehicles are charged combining on-site photovoltaic generation and battery energy storage with the power grid, with the objective to decide on the battery (dis)charging to minimize the costs of operation. To this end, we devise a scenario-based stochastic model predictive control (MPC) scheme that leverages probabilistic 24-hour-ahead forecasts of charging load, solar generation and day-ahead electricity prices to achieve a cost-optimal operation of the energy hub. The probabilistic forecasts leverage conformal prediction providing calibrated distribution-free confidence intervals starting from a machine learning model that generates no uncertainty quantification. We showcase our controller by running it over a 280-day evaluation in a closed-loop simulated environment to compare the observed cost of two scenario-based MPCs with two deterministic alternatives: a version with point forecast and a version with perfect forecast. Our results indicate that, compared to the perfect forecast implementation, our proposed scenario-based MPCs are 13% more expensive, and 1% better than their deterministic point-forecast counterpart
Ontology Neural Network and ORTSF: A Framework for Topological Reasoning and Delay-Robust Control
The advancement of autonomous robotic systems has led to impressive capabilities in perception, localization, mapping, and control. Yet, a fundamental gap remains: existing frameworks excel at geometric reasoning and dynamic stability but fall short in representing and preserving relational semantics, contextual reasoning, and cognitive transparency essential for collaboration in dynamic, human-centric environments. This paper introduces a unified architecture comprising the Ontology Neural Network (ONN) and the Ontological Real-Time Semantic Fabric (ORTSF) to address this gap. The ONN formalizes relational semantic reasoning as a dynamic topological process. By embedding Forman-Ricci curvature, persistent homology, and semantic tensor structures within a unified loss formulation, ONN ensures that relational integrity and topological coherence are preserved as scenes evolve over time. The ORTSF transforms reasoning traces into actionable control commands while compensating for system delays. It integrates predictive and delay-aware operators that ensure phase margin preservation and continuity of control signals, even under significant latency conditions. Empirical studies demonstrate the ONN + ORTSF framework's ability to unify semantic cognition and robust control, providing a mathematically principled and practically viable solution for cognitive robotics.
comment: 12 pages, 5 figures, includes theoretical proofs and simulation results
Optimizing Resource Allocation for QoS and Stability in Dynamic VLC-NOMA Networks via MARL
Visible Light Communication (VLC) combined with Non-Orthogonal Multiple Access (NOMA) offers a promising solution for dense indoor wireless networks. Yet, managing resources effectively is challenged by VLC network dynamic conditions involving user mobility and light dimming. In addition to satisfying Quality of Service (QoS) and network stability requirements. Traditional resource allocation methods and simpler RL approaches struggle to jointly optimize QoS and stability under the dynamic conditions of mobile VLC-NOMA networks. This paper presents MARL frameworks tailored to perform complex joint optimization of resource allocation (NOMA power, user scheduling) and network stability (interference, handovers), considering heterogeneous QoS, user mobility, and dimming in VLC-NOMA systems. Our MARL frameworks capture dynamic channel conditions and diverse user QoS , enabling effective joint optimization. In these frameworks, VLC access points (APs) act as intelligent agents, learning to allocate power and schedule users to satisfy diverse requirements while maintaining network stability by managing interference and minimizing disruptive handovers. We conduct a comparative analysis of two key MARL paradigms: 1) Centralized Training with Decentralized Execution (CTDE) and 2) Centralized Training with Centralized Execution (CTCE). Comprehensive simulations validate the effectiveness of both tailored MARL frameworks and demonstrate an ability to handle complex optimization. The results show key trade-offs, as the CTDE approach achieved approximately 16\% higher for High priority (HP) user QoS satisfaction, while the CTCE approach yielded nearly 7 dB higher average SINR and 12\% lower ping-pong handover ratio, offering valuable insights into the performance differences between these paradigms in complex VLC-NOMA network scenarios.
comment: 17 pages, 11 figures. This paper has been published in IEEE Access
Algebraic Control: Complete Stable Inversion with Necessary and Sufficient Conditions
In this paper, we establish necessary and sufficient conditions for stable inversion, addressing challenges in non-minimum phase, non-square, and singular systems. An H-Infinity based algebraic approximation is introduced for near-perfect tracking without preview. Additionally, we propose a novel robust control strategy combining the nominal model with dual feedforward control to form a feedback structure. Numerical comparison demonstrates the approach's effectiveness.
The impact of heatwave-driven air conditioning adoption on electricity demand: A spatio-temporal case study for Germany
Intensifying heatwaves driven by climate change are accelerating the adoption of mobile air conditioning (AC) systems. A rapid mass adoption of such AC systems could create additional stress on electricity grids and the power system. This study presents a novel method to estimate the electricity demand from AC systems both at the system level and at high temporal and spatial granularity. We apply the method to a near-future heatwave scenario in Germany in which household AC adoption increases from the current 19% to 35% during a heatwave similar to the one of July 2025. We analyze the effects for 196,428 grid cells of one square kilometer across Germany, by combining weather data, census data, socio-demographic assumptions, mobility patterns, and temperature-dependent AC activation functions. We find that electricity demand of newly purchased mobile AC systems could increase the peak load by over 12.9 GW, with urban hot-spots reaching 5.2 MW per square kilometer. The temporal pattern creates a pronounced afternoon peak that coincides with lower photovoltaic generation, potentially exacerbating power system stability challenges. Our findings underscore the urgency for proactive energy system planning to manage emerging demand peaks.
comment: 21 pages, 12 figures
AI Simulation by Digital Twins: Systematic Survey, Reference Framework, and Mapping to a Standardized Architecture
Insufficient data volume and quality are particularly pressing challenges in the adoption of modern subsymbolic AI. To alleviate these challenges, AI simulation uses virtual training environments in which AI agents can be safely and efficiently developed with simulated, synthetic data. Digital twins open new avenues in AI simulation, as these high-fidelity virtual replicas of physical systems are equipped with state-of-the-art simulators and the ability to further interact with the physical system for additional data collection. In this article, we report on our systematic survey of digital twin-enabled AI simulation. By analyzing 22 primary studies, we identify technological trends and derive a reference framework to situate digital twins and AI components. Based on our findings, we derive a reference framework and provide architectural guidelines by mapping it onto the ISO 23247 reference architecture for digital twins. Finally, we identify challenges and research opportunities for prospective researchers.
Fixed-Time Input-to-State Stability for Singularly Perturbed Systems via Composite Lyapunov Functions
We study singularly perturbed systems that exhibit input-to-state stability (ISS) with fixed-time properties in the presence of bounded disturbances. In these systems, solutions converge to the origin within a time frame independent of initial conditions when undisturbed, and to a vicinity of the origin when subjected to bounded disturbances. First, we extend the traditional composite Lyapunov method, commonly applied in singular perturbation theory to analyze asymptotic stability, to include fixed-time ISS. We demonstrate that if both the reduced system and the boundary layer system exhibit fixed-time ISS, and if certain interconnection conditions are met, the entire multi-time scale system retains this fixed-time ISS characteristic, provided the separation of time scales is sufficiently pronounced. Next, we illustrate our findings via analytical and numerical examples, including a novel application in fixed-time feedback optimization for dynamic plants with slowly varying cost functions.
Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation
Cloth manipulation is challenging due to its highly complex dynamics, near-infinite degrees of freedom, and frequent self-occlusions, which complicate both state estimation and dynamics modeling. Inspired by recent advances in generative models, we hypothesize that these expressive models can effectively capture intricate cloth configurations and deformation patterns from data. Therefore, we propose a diffusion-based generative approach for both perception and dynamics modeling. Specifically, we formulate state estimation as reconstructing full cloth states from partial observations and dynamics modeling as predicting future states given the current state and robot actions. Leveraging a transformer-based diffusion model, our method achieves accurate state reconstruction and reduces long-horizon dynamics prediction errors by an order of magnitude compared to prior approaches. We integrate our dynamics models with model predictive control and show that our framework enables effective cloth folding on real robotic systems, demonstrating the potential of generative models for deformable object manipulation under partial observability and complex dynamics.
comment: CoRL 2025. Project website: https://uniclothdiff.github.io/
Energy-Aware Lane Planning for Connected Electric Vehicles in Urban Traffic: Design and Vehicle-in-the-Loop Validation
Urban driving with connected and automated vehicles (CAVs) offers potential for energy savings, yet most eco-driving strategies focus solely on longitudinal speed control within a single lane. This neglects the significant impact of lateral decisions, such as lane changes, on overall energy efficiency, especially in environments with traffic signals and heterogeneous traffic flow. To address this gap, we propose a novel energy-aware motion planning framework that jointly optimizes longitudinal speed and lateral lane-change decisions using vehicle-to-infrastructure (V2I) communication. Our approach estimates long-term energy costs using a graph-based approximation and solves short-horizon optimal control problems under traffic constraints. Using a data-driven energy model calibrated to an actual battery electric vehicle, we demonstrate with vehicle-in-the-loop experiments that our method reduces motion energy consumption by up to 24 percent compared to a human driver, highlighting the potential of connectivity-enabled planning for sustainable urban autonomy.
comment: Accepted at 2025 IEEE Conference on Decision and Control (CDC25')
Sampled-data Systems: Stability, Contractivity and Single-iteration Suboptimal MPC
This paper analyzes the stability of interconnected continuous-time (CT) and discrete-time (DT) systems coupled through sampling and zero-order hold mechanisms. The DT system updates its output at regular intervals $T>0$ by applying an $n$-fold composition of a given map. This setup is motivated by online and sampled-data implementations of optimization-based controllers - particularly model predictive control (MPC) - where the DT system models $n$ iterations of an algorithm approximating the solution of an optimization problem. We introduce the concept of a reduced model, defined as the limiting behavior of the sampled-data system as $T \to 0^+$ and $n \to +\infty$. Our main theoretical contribution establishes that when the reduced model is contractive, there exists a threshold duration $T(n)$ for each iteration count $n$ such that the CT-DT interconnection achieves exponential stability for all sampling periods $T < T(n)$. Finally, under the stronger condition that both the CT and DT systems are contractive, we show exponential stability of their interconnection using a small-gain argument. Our theoretical results provide new insights into suboptimal MPC stability, showing that convergence guarantees hold even when using a single iteration of the optimization algorithm - a practically significant finding for real-time control applications.
comment: Modifications relative to version 1: Figure updated
Robotics
Learning on the Fly: Rapid Policy Adaptation via Differentiable Simulation
Learning control policies in simulation enables rapid, safe, and cost-effective development of advanced robotic capabilities. However, transferring these policies to the real world remains difficult due to the sim-to-real gap, where unmodeled dynamics and environmental disturbances can degrade policy performance. Existing approaches, such as domain randomization and Real2Sim2Real pipelines, can improve policy robustness, but either struggle under out-of-distribution conditions or require costly offline retraining. In this work, we approach these problems from a different perspective. Instead of relying on diverse training conditions before deployment, we focus on rapidly adapting the learned policy in the real world in an online fashion. To achieve this, we propose a novel online adaptive learning framework that unifies residual dynamics learning with real-time policy adaptation inside a differentiable simulation. Starting from a simple dynamics model, our framework refines the model continuously with real-world data to capture unmodeled effects and disturbances such as payload changes and wind. The refined dynamics model is embedded in a differentiable simulation framework, enabling gradient backpropagation through the dynamics and thus rapid, sample-efficient policy updates beyond the reach of classical RL methods like PPO. All components of our system are designed for rapid adaptation, enabling the policy to adjust to unseen disturbances within 5 seconds of training. We validate the approach on agile quadrotor control under various disturbances in both simulation and the real world. Our framework reduces hovering error by up to 81% compared to L1-MPC and 55% compared to DATT, while also demonstrating robustness in vision-based control without explicit state estimation.
Prompt-to-Product: Generative Assembly via Bimanual Manipulation
Creating assembly products demands significant manual effort and expert knowledge in 1) designing the assembly and 2) constructing the product. This paper introduces Prompt-to-Product, an automated pipeline that generates real-world assembly products from natural language prompts. Specifically, we leverage LEGO bricks as the assembly platform and automate the process of creating brick assembly structures. Given the user design requirements, Prompt-to-Product generates physically buildable brick designs, and then leverages a bimanual robotic system to construct the real assembly products, bringing user imaginations into the real world. We conduct a comprehensive user study, and the results demonstrate that Prompt-to-Product significantly lowers the barrier and reduces manual effort in creating assembly products from imaginative ideas.
comment: 12 pages, 10 figures, 2 tables
CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification
Recent Vision-Language-Action (VLA) models built on pre-trained Vision-Language Models (VLMs) require extensive post-training, resulting in high computational overhead that limits scalability and deployment.We propose CogVLA, a Cognition-Aligned Vision-Language-Action framework that leverages instruction-driven routing and sparsification to improve both efficiency and performance. CogVLA draws inspiration from human multimodal coordination and introduces a 3-stage progressive architecture. 1) Encoder-FiLM based Aggregation Routing (EFA-Routing) injects instruction information into the vision encoder to selectively aggregate and compress dual-stream visual tokens, forming a instruction-aware latent representation. 2) Building upon this compact visual encoding, LLM-FiLM based Pruning Routing (LFP-Routing) introduces action intent into the language model by pruning instruction-irrelevant visually grounded tokens, thereby achieving token-level sparsity. 3) To ensure that compressed perception inputs can still support accurate and coherent action generation, we introduce V-L-A Coupled Attention (CAtten), which combines causal vision-language attention with bidirectional action parallel decoding. Extensive experiments on the LIBERO benchmark and real-world robotic tasks demonstrate that CogVLA achieves state-of-the-art performance with success rates of 97.4% and 70.0%, respectively, while reducing training costs by 2.5-fold and decreasing inference latency by 2.8-fold compared to OpenVLA. CogVLA is open-sourced and publicly available at https://github.com/JiuTian-VL/CogVLA.
comment: 23 pages, 8 figures, Project Page: https://jiutian-vl.github.io/CogVLA-page
HITTER: A HumanoId Table TEnnis Robot via Hierarchical Planning and Learning
Humanoid robots have recently achieved impressive progress in locomotion and whole-body control, yet they remain constrained in tasks that demand rapid interaction with dynamic environments through manipulation. Table tennis exemplifies such a challenge: with ball speeds exceeding 5 m/s, players must perceive, predict, and act within sub-second reaction times, requiring both agility and precision. To address this, we present a hierarchical framework for humanoid table tennis that integrates a model-based planner for ball trajectory prediction and racket target planning with a reinforcement learning-based whole-body controller. The planner determines striking position, velocity and timing, while the controller generates coordinated arm and leg motions that mimic human strikes and maintain stability and agility across consecutive rallies. Moreover, to encourage natural movements, human motion references are incorporated during training. We validate our system on a general-purpose humanoid robot, achieving up to 106 consecutive shots with a human opponent and sustained exchanges against another humanoid. These results demonstrate real-world humanoid table tennis with sub-second reactive control, marking a step toward agile and interactive humanoid behaviors.
comment: 8 pages, 7 figures
Rapid Mismatch Estimation via Neural Network Informed Variational Inference
With robots increasingly operating in human-centric environments, ensuring soft and safe physical interactions, whether with humans, surroundings, or other machines, is essential. While compliant hardware can facilitate such interactions, this work focuses on impedance controllers that allow torque-controlled robots to safely and passively respond to contact while accurately executing tasks. From inverse dynamics to quadratic programming-based controllers, the effectiveness of these methods relies on accurate dynamics models of the robot and the object it manipulates. Any model mismatch results in task failures and unsafe behaviors. Thus, we introduce Rapid Mismatch Estimation (RME), an adaptive, controller-agnostic, probabilistic framework that estimates end-effector dynamics mismatches online, without relying on external force-torque sensors. From the robot's proprioceptive feedback, a Neural Network Model Mismatch Estimator generates a prior for a Variational Inference solver, which rapidly converges to the unknown parameters while quantifying uncertainty. With a real 7-DoF manipulator driven by a state-of-the-art passive impedance controller, RME adapts to sudden changes in mass and center of mass at the end-effector in $\sim400$ ms, in static and dynamic settings. We demonstrate RME in a collaborative scenario where a human attaches an unknown basket to the robot's end-effector and dynamically adds/removes heavy items, showcasing fast and safe adaptation to changing dynamics during physical interaction without any external sensory system.
comment: Accepted at 9th Annual Conference on Robot Learning. Project Website - https://mateusz-jaszczuk.github.io/rme/
Train-Once Plan-Anywhere Kinodynamic Motion Planning via Diffusion Trees
Kinodynamic motion planning is concerned with computing collision-free trajectories while abiding by the robot's dynamic constraints. This critical problem is often tackled using sampling-based planners (SBPs) that explore the robot's high-dimensional state space by constructing a search tree via action propagations. Although SBPs can offer global guarantees on completeness and solution quality, their performance is often hindered by slow exploration due to uninformed action sampling. Learning-based approaches can yield significantly faster runtimes, yet they fail to generalize to out-of-distribution (OOD) scenarios and lack critical guarantees, e.g., safety, thus limiting their deployment on physical robots. We present Diffusion Tree (DiTree): a \emph{provably-generalizable} framework leveraging diffusion policies (DPs) as informed samplers to efficiently guide state-space search within SBPs. DiTree combines DP's ability to model complex distributions of expert trajectories, conditioned on local observations, with the completeness of SBPs to yield \emph{provably-safe} solutions within a few action propagation iterations for complex dynamical systems. We demonstrate DiTree's power with an implementation combining the popular RRT planner with a DP action sampler trained on a \emph{single environment}. In comprehensive evaluations on OOD scenarios, % DiTree has comparable runtimes to a standalone DP (3x faster than classical SBPs), while improving the average success rate over DP and SBPs. DiTree is on average 3x faster than classical SBPs, and outperforms all other approaches by achieving roughly 30\% higher success rate. Project webpage: https://sites.google.com/view/ditree.
comment: Accepted to CoRL 2025. Project page: https://sites.google.com/view/ditree
UltraTac: Integrated Ultrasound-Augmented Visuotactile Sensor for Enhanced Robotic Perception IROS 2025
Visuotactile sensors provide high-resolution tactile information but are incapable of perceiving the material features of objects. We present UltraTac, an integrated sensor that combines visuotactile imaging with ultrasound sensing through a coaxial optoacoustic architecture. The design shares structural components and achieves consistent sensing regions for both modalities. Additionally, we incorporate acoustic matching into the traditional visuotactile sensor structure, enabling integration of the ultrasound sensing modality without compromising visuotactile performance. Through tactile feedback, we dynamically adjust the operating state of the ultrasound module to achieve flexible functional coordination. Systematic experiments demonstrate three key capabilities: proximity sensing in the 3-8 cm range ($R^2=0.90$), material classification (average accuracy: 99.20%), and texture-material dual-mode object recognition achieving 92.11% accuracy on a 15-class task. Finally, we integrate the sensor into a robotic manipulation system to concurrently detect container surface patterns and internal content, which verifies its potential for advanced human-machine interaction and precise robotic manipulation.
comment: Accepted to IROS 2025
ActLoc: Learning to Localize on the Move via Active Viewpoint Selection
Reliable localization is critical for robot navigation, yet most existing systems implicitly assume that all viewing directions at a location are equally informative. In practice, localization becomes unreliable when the robot observes unmapped, ambiguous, or uninformative regions. To address this, we present ActLoc, an active viewpoint-aware planning framework for enhancing localization accuracy for general robot navigation tasks. At its core, ActLoc employs a largescale trained attention-based model for viewpoint selection. The model encodes a metric map and the camera poses used during map construction, and predicts localization accuracy across yaw and pitch directions at arbitrary 3D locations. These per-point accuracy distributions are incorporated into a path planner, enabling the robot to actively select camera orientations that maximize localization robustness while respecting task and motion constraints. ActLoc achieves stateof-the-art results on single-viewpoint selection and generalizes effectively to fulltrajectory planning. Its modular design makes it readily applicable to diverse robot navigation and inspection tasks.
Scaling Fabric-Based Piezoresistive Sensor Arrays for Whole-Body Tactile Sensing
Scaling tactile sensing for robust whole-body manipulation is a significant challenge, often limited by wiring complexity, data throughput, and system reliability. This paper presents a complete architecture designed to overcome these barriers. Our approach pairs open-source, fabric-based sensors with custom readout electronics that reduce signal crosstalk to less than 3.3% through hardware-based mitigation. Critically, we introduce a novel, daisy-chained SPI bus topology that avoids the practical limitations of common wireless protocols and the prohibitive wiring complexity of USB hub-based systems. This architecture streams synchronized data from over 8,000 taxels across 1 square meter of sensing area at update rates exceeding 50 FPS, confirming its suitability for real-time control. We validate the system's efficacy in a whole-body grasping task where, without feedback, the robot's open-loop trajectory results in an uncontrolled application of force that slowly crushes a deformable cardboard box. With real-time tactile feedback, the robot transforms this motion into a gentle, stable grasp, successfully manipulating the object without causing structural damage. This work provides a robust and well-characterized platform to enable future research in advanced whole-body control and physical human-robot interaction.
comment: In submission to IEEE Sensors
PLUME: Procedural Layer Underground Modeling Engine
As space exploration advances, underground environments are becoming increasingly attractive due to their potential to provide shelter, easier access to resources, and enhanced scientific opportunities. Although such environments exist on Earth, they are often not easily accessible and do not accurately represent the diversity of underground environments found throughout the solar system. This paper presents PLUME, a procedural generation framework aimed at easily creating 3D underground environments. Its flexible structure allows for the continuous enhancement of various underground features, aligning with our expanding understanding of the solar system. The environments generated using PLUME can be used for AI training, evaluating robotics algorithms, 3D rendering, and facilitating rapid iteration on developed exploration algorithms. In this paper, it is demonstrated that PLUME has been used along with a robotic simulator. PLUME is open source and has been released on Github. https://github.com/Gabryss/P.L.U.M.E
COMETH: Convex Optimization for Multiview Estimation and Tracking of Humans
In the era of Industry 5.0, monitoring human activity is essential for ensuring both ergonomic safety and overall well-being. While multi-camera centralized setups improve pose estimation accuracy, they often suffer from high computational costs and bandwidth requirements, limiting scalability and real-time applicability. Distributing processing across edge devices can reduce network bandwidth and computational load. On the other hand, the constrained resources of edge devices lead to accuracy degradation, and the distribution of computation leads to temporal and spatial inconsistencies. We address this challenge by proposing COMETH (Convex Optimization for Multiview Estimation and Tracking of Humans), a lightweight algorithm for real-time multi-view human pose fusion that relies on three concepts: it integrates kinematic and biomechanical constraints to increase the joint positioning accuracy; it employs convex optimization-based inverse kinematics for spatial fusion; and it implements a state observer to improve temporal consistency. We evaluate COMETH on both public and industrial datasets, where it outperforms state-of-the-art methods in localization, detection, and tracking accuracy. The proposed fusion pipeline enables accurate and scalable human motion tracking, making it well-suited for industrial and safety-critical applications. The code is publicly available at https://github.com/PARCO-LAB/COMETH.
comment: Submitted to Information Fusion
Language-Enhanced Mobile Manipulation for Efficient Object Search in Indoor Environments
Enabling robots to efficiently search for and identify objects in complex, unstructured environments is critical for diverse applications ranging from household assistance to industrial automation. However, traditional scene representations typically capture only static semantics and lack interpretable contextual reasoning, limiting their ability to guide object search in completely unfamiliar settings. To address this challenge, we propose a language-enhanced hierarchical navigation framework that tightly integrates semantic perception and spatial reasoning. Our method, Goal-Oriented Dynamically Heuristic-Guided Hierarchical Search (GODHS), leverages large language models (LLMs) to infer scene semantics and guide the search process through a multi-level decision hierarchy. Reliability in reasoning is achieved through the use of structured prompts and logical constraints applied at each stage of the hierarchy. For the specific challenges of mobile manipulation, we introduce a heuristic-based motion planner that combines polar angle sorting with distance prioritization to efficiently generate exploration paths. Comprehensive evaluations in Isaac Sim demonstrate the feasibility of our framework, showing that GODHS can locate target objects with higher search efficiency compared to conventional, non-semantic search strategies. Website and Video are available at: https://drapandiger.github.io/GODHS
CoCoL: A Communication Efficient Decentralized Collaborative Method for Multi-Robot Systems IROS2025
Collaborative learning enhances the performance and adaptability of multi-robot systems in complex tasks but faces significant challenges due to high communication overhead and data heterogeneity inherent in multi-robot tasks. To this end, we propose CoCoL, a Communication efficient decentralized Collaborative Learning method tailored for multi-robot systems with heterogeneous local datasets. Leveraging a mirror descent framework, CoCoL achieves remarkable communication efficiency with approximate Newton-type updates by capturing the similarity between objective functions of robots, and reduces computational costs through inexact sub-problem solutions. Furthermore, the integration of a gradient tracking scheme ensures its robustness against data heterogeneity. Experimental results on three representative multi robot collaborative learning tasks show the superiority of the proposed CoCoL in significantly reducing both the number of communication rounds and total bandwidth consumption while maintaining state-of-the-art accuracy. These benefits are particularly evident in challenging scenarios involving non-IID (non-independent and identically distributed) data distribution, streaming data, and time-varying network topologies.
comment: Accepted by IROS2025
To New Beginnings: A Survey of Unified Perception in Autonomous Vehicle Software
Autonomous vehicle perception typically relies on modular pipelines that decompose the task into detection, tracking, and prediction. While interpretable, these pipelines suffer from error accumulation and limited inter-task synergy. Unified perception has emerged as a promising paradigm that integrates these sub-tasks within a shared architecture, potentially improving robustness, contextual reasoning, and efficiency while retaining interpretable outputs. In this survey, we provide a comprehensive overview of unified perception, introducing a holistic and systemic taxonomy that categorizes methods along task integration, tracking formulation, and representation flow. We define three paradigms -Early, Late, and Full Unified Perception- and systematically review existing methods, their architectures, training strategies, datasets used, and open-source availability, while highlighting future research directions. This work establishes the first comprehensive framework for understanding and advancing unified perception, consolidates fragmented efforts, and guides future research toward more robust, generalizable, and interpretable perception.
Deep Fuzzy Optimization for Batch-Size and Nearest Neighbors in Optimal Robot Motion Planning
Efficient motion planning algorithms are essential in robotics. Optimizing essential parameters, such as batch size and nearest neighbor selection in sampling-based methods, can enhance performance in the planning process. However, existing approaches often lack environmental adaptability. Inspired by the method of the deep fuzzy neural networks, this work introduces Learning-based Informed Trees (LIT*), a sampling-based deep fuzzy learning-based planner that dynamically adjusts batch size and nearest neighbor parameters to obstacle distributions in the configuration spaces. By encoding both global and local ratios via valid and invalid states, LIT* differentiates between obstacle-sparse and obstacle-dense regions, leading to lower-cost paths and reduced computation time. Experimental results in high-dimensional spaces demonstrate that LIT* achieves faster convergence and improved solution quality. It outperforms state-of-the-art single-query, sampling-based planners in environments ranging from R^8 to R^14 and is successfully validated on a dual-arm robot manipulation task. A video showcasing our experimental results is available at: https://youtu.be/NrNs9zebWWk
Genetic Informed Trees (GIT*): Path Planning via Reinforced Genetic Programming Heuristics
Optimal path planning involves finding a feasible state sequence between a start and a goal that optimizes an objective. This process relies on heuristic functions to guide the search direction. While a robust function can improve search efficiency and solution quality, current methods often overlook available environmental data and simplify the function structure due to the complexity of information relationships. This study introduces Genetic Informed Trees (GIT*), which improves upon Effort Informed Trees (EIT*) by integrating a wider array of environmental data, such as repulsive forces from obstacles and the dynamic importance of vertices, to refine heuristic functions for better guidance. Furthermore, we integrated reinforced genetic programming (RGP), which combines genetic programming with reward system feedback to mutate genotype-generative heuristic functions for GIT*. RGP leverages a multitude of data types, thereby improving computational efficiency and solution quality within a set timeframe. Comparative analyses demonstrate that GIT* surpasses existing single-query, sampling-based planners in problems ranging from R^4 to R^16 and was tested on a real-world mobile manipulation task. A video showcasing our experimental results is available at https://youtu.be/URjXbc_BiYg
Encoding Tactile Stimuli for Organoid Intelligence in Braille Recognition
This study proposes a generalizable encoding strategy that maps tactile sensor data to electrical stimulation patterns, enabling neural organoids to perform an open-loop artificial tactile Braille classification task. Human forebrain organoids cultured on a low-density microelectrode array (MEA) are systematically stimulated to characterize the relationship between electrical stimulation parameters (number of pulse, phase amplitude, phase duration, and trigger delay) and organoid responses, measured as spike activity and spatial displacement of the center of activity. Implemented on event-based tactile inputs recorded from the Evetac sensor, our system achieved an average Braille letter classification accuracy of 61 percent with a single organoid, which increased significantly to 83 percent when responses from a three-organoid ensemble were combined. Additionally, the multi-organoid configuration demonstrated enhanced robustness against various types of artificially introduced noise. This research demonstrates the potential of organoids as low-power, adaptive bio-hybrid computational elements and provides a foundational encoding framework for future scalable bio-hybrid computing architectures.
Learning Primitive Embodied World Models: Towards Scalable Robotic Learning
While video-generation-based embodied world models have gained increasing attention, their reliance on large-scale embodied interaction data remains a key bottleneck. The scarcity, difficulty of collection, and high dimensionality of embodied data fundamentally limit the alignment granularity between language and actions and exacerbate the challenge of long-horizon video generation--hindering generative models from achieving a "GPT moment" in the embodied domain. There is a naive observation: the diversity of embodied data far exceeds the relatively small space of possible primitive motions. Based on this insight, we propose a novel paradigm for world modeling--Primitive Embodied World Models (PEWM). By restricting video generation to fixed short horizons, our approach 1) enables fine-grained alignment between linguistic concepts and visual representations of robotic actions, 2) reduces learning complexity, 3) improves data efficiency in embodied data collection, and 4) decreases inference latency. By equipping with a modular Vision-Language Model (VLM) planner and a Start-Goal heatmap Guidance mechanism (SGG), PEWM further enables flexible closed-loop control and supports compositional generalization of primitive-level policies over extended, complex tasks. Our framework leverages the spatiotemporal vision priors in video models and the semantic awareness of VLMs to bridge the gap between fine-grained physical interaction and high-level reasoning, paving the way toward scalable, interpretable, and general-purpose embodied intelligence.
Model-Free Hovering and Source Seeking via Extremum Seeking Control: Experimental Demonstration
In a recent effort, we successfully proposed a categorically novel approach to mimic the phenomenoa of hovering and source seeking by flapping insects and hummingbirds using a new extremum seeking control (ESC) approach. Said ESC approach was shown capable of characterizing the physics of hovering and source seeking by flapping systems, providing at the same time uniquely novel opportunity for a model-free, real-time biomimicry control design. In this paper, we experimentally test and verify, for the first time in the literature, the potential of ESC in flapping robots to achieve model-free, real-time controlled hovering and source seeking. The results of this paper, while being restricted to 1D, confirm the premise of introducing ESC as a natural control method and biomimicry mechanism to the field of flapping flight and robotics.
A Soft Fabric-Based Thermal Haptic Device for VR and Teleoperation
This paper presents a novel fabric-based thermal-haptic interface for virtual reality and teleoperation. It integrates pneumatic actuation and conductive fabric with an innovative ultra-lightweight design, achieving only 2~g for each finger unit. By embedding heating elements within textile pneumatic chambers, the system delivers modulated pressure and thermal stimuli to fingerpads through a fully soft, wearable interface. Comprehensive characterization demonstrates rapid thermal modulation with heating rates up to 3$^{\circ}$C/s, enabling dynamic thermal feedback for virtual or teleoperation interactions. The pneumatic subsystem generates forces up to 8.93~N at 50~kPa, while optimization of fingerpad-actuator clearance enhances cooling efficiency with minimal force reduction. Experimental validation conducted with two different user studies shows high temperature identification accuracy (0.98 overall) across three thermal levels, and significant manipulation improvements in a virtual pick-and-place tasks. Results show enhanced success rates (88.5\% to 96.4\%, p = 0.029) and improved force control precision (p = 0.013) when haptic feedback is enabled, validating the effectiveness of the integrated thermal-haptic approach for advanced human-machine interaction applications.
Uncertainty Aware-Predictive Control Barrier Functions: Safer Human Robot Interaction through Probabilistic Motion Forecasting
To enable flexible, high-throughput automation in settings where people and robots share workspaces, collaborative robotic cells must reconcile stringent safety guarantees with the need for responsive and effective behavior. A dynamic obstacle is the stochastic, task-dependent variability of human motion: when robots fall back on purely reactive or worst-case envelopes, they brake unnecessarily, stall task progress, and tamper with the fluidity that true Human-Robot Interaction demands. In recent years, learning-based human-motion prediction has rapidly advanced, although most approaches produce worst-case scenario forecasts that often do not treat prediction uncertainty in a well-structured way, resulting in over-conservative planning algorithms, limiting their flexibility. We introduce Uncertainty-Aware Predictive Control Barrier Functions (UA-PCBFs), a unified framework that fuses probabilistic human hand motion forecasting with the formal safety guarantees of Control Barrier Functions. In contrast to other variants, our framework allows for dynamic adjustment of the safety margin thanks to the human motion uncertainty estimation provided by a forecasting module. Thanks to uncertainty estimation, UA-PCBFs empower collaborative robots with a deeper understanding of future human states, facilitating more fluid and intelligent interactions through informed motion planning. We validate UA-PCBFs through comprehensive real-world experiments with an increasing level of realism, including automated setups (to perform exactly repeatable motions) with a robotic hand and direct human-robot interactions (to validate promptness, usability, and human confidence). Relative to state-of-the-art HRI architectures, UA-PCBFs show better performance in task-critical metrics, significantly reducing the number of violations of the robot's safe space during interaction with respect to the state-of-the-art.
SKGE-SWIN: End-To-End Autonomous Vehicle Waypoint Prediction and Navigation Using Skip Stage Swin Transformer
Focusing on the development of an end-to-end autonomous vehicle model with pixel-to-pixel context awareness, this research proposes the SKGE-Swin architecture. This architecture utilizes the Swin Transformer with a skip-stage mechanism to broaden feature representation globally and at various network levels. This approach enables the model to extract information from distant pixels by leveraging the Swin Transformer's Shifted Window-based Multi-head Self-Attention (SW-MSA) mechanism and to retain critical information from the initial to the final stages of feature extraction, thereby enhancing its capability to comprehend complex patterns in the vehicle's surroundings. The model is evaluated on the CARLA platform using adversarial scenarios to simulate real-world conditions. Experimental results demonstrate that the SKGE-Swin architecture achieves a superior Driving Score compared to previous methods. Furthermore, an ablation study will be conducted to evaluate the contribution of each architectural component, including the influence of skip connections and the use of the Swin Transformer, in improving model performance.
comment: keywords-multitask learning, autonomous driving, end-to-end learning, skip connections, swin transformer, self-attention mechanism. 12 pages
Non-expert to Expert Motion Translation Using Generative Adversarial Networks
Decreasing skilled workers is a very serious problem in the world. To deal with this problem, the skill transfer from experts to robots has been researched. These methods which teach robots by human motion are called imitation learning. Experts' skills generally appear in not only position data, but also force data. Thus, position and force data need to be saved and reproduced. To realize this, a lot of research has been conducted in the framework of a motion-copying system. Recent research uses machine learning methods to generate motion commands. However, most of them could not change tasks by following human intention. Some of them can change tasks by conditional training, but the labels are limited. Thus, we propose the flexible motion translation method by using Generative Adversarial Networks. The proposed method enables users to teach robots tasks by inputting data, and skills by a trained model. We evaluated the proposed system with a 3-DOF calligraphy robot.
Task Allocation for Autonomous Machines using Computational Intelligence and Deep Reinforcement Learning
Enabling multiple autonomous machines to perform reliably requires the development of efficient cooperative control algorithms. This paper presents a survey of algorithms that have been developed for controlling and coordinating autonomous machines in complex environments. We especially focus on task allocation methods using computational intelligence (CI) and deep reinforcement learning (RL). The advantages and disadvantages of the surveyed methods are analysed thoroughly. We also propose and discuss in detail various future research directions that shed light on how to improve existing algorithms or create new methods to enhance the employability and performance of autonomous machines in real-world applications. The findings indicate that CI and deep RL methods provide viable approaches to addressing complex task allocation problems in dynamic and uncertain environments. The recent development of deep RL has greatly contributed to the literature on controlling and coordinating autonomous machines, and it has become a growing trend in this area. It is envisaged that this paper will provide researchers and engineers with a comprehensive overview of progress in machine learning research related to autonomous machines. It also highlights underexplored areas, identifies emerging methodologies, and suggests new avenues for exploration in future research within this domain.
comment: Accepted for publication in the Proceedings of the 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Task-Oriented Edge-Assisted Cross-System Design for Real-Time Human-Robot Interaction in Industrial Metaverse
Real-time human-device interaction in industrial Metaverse faces challenges such as high computational load, limited bandwidth, and strict latency. This paper proposes a task-oriented edge-assisted cross-system framework using digital twins (DTs) to enable responsive interactions. By predicting operator motions, the system supports: 1) proactive Metaverse rendering for visual feedback, and 2) preemptive control of remote devices. The DTs are decoupled into two virtual functions-visual display and robotic control-optimizing both performance and adaptability. To enhance generalizability, we introduce the Human-In-The-Loop Model-Agnostic Meta-Learning (HITL-MAML) algorithm, which dynamically adjusts prediction horizons. Evaluation on two tasks demonstrates the framework's effectiveness: in a Trajectory-Based Drawing Control task, it reduces weighted RMSE from 0.0712 m to 0.0101 m; in a real-time 3D scene representation task for nuclear decommissioning, it achieves a PSNR of 22.11, SSIM of 0.8729, and LPIPS of 0.1298. These results show the framework's capability to ensure spatial precision and visual fidelity in real-time, high-risk industrial environments.
comment: This paper has submitted to IEEE Transactions on Mobile Computing
Traversing the Narrow Path: A Two-Stage Reinforcement Learning Framework for Humanoid Beam Walking
Traversing narrow beams is challenging for humanoids due to sparse, safety-critical contacts and the fragility of purely learned policies. We propose a physically grounded, two-stage framework that couples an XCoM/LIPM footstep template with a lightweight residual planner and a simple low-level tracker. Stage-1 is trained on flat ground: the tracker learns to robustly follow footstep targets by adding small random perturbations to heuristic footsteps, without any hand-crafted centerline locking, so it acquires stable contact scheduling and strong target-tracking robustness. Stage-2 is trained in simulation on a beam: a high-level planner predicts a body-frame residual (Delta x, Delta y, Delta psi) for the swing foot only, refining the template step to prioritize safe, precise placement under narrow support while preserving interpretability. To ease deployment, sensing is kept minimal and consistent between simulation and hardware: the planner consumes compact, forward-facing elevation cues together with onboard IMU and joint signals. On a Unitree G1, our system reliably traverses a 0.2 m-wide, 3 m-long beam. Across simulation and real-world studies, residual refinement consistently outperforms template-only and monolithic baselines in success rate, centerline adherence, and safety margins, while the structured footstep interface enables transparent analysis and low-friction sim-to-real transfer.
SimShear: Sim-to-Real Shear-based Tactile Servoing
We present SimShear, a sim-to-real pipeline for tactile control that enables the use of shear information without explicitly modeling shear dynamics in simulation. Shear, arising from lateral movements across contact surfaces, is critical for tasks involving dynamic object interactions but remains challenging to simulate. To address this, we introduce shPix2pix, a shear-conditioned U-Net GAN that transforms simulated tactile images absent of shear, together with a vector encoding shear information, into realistic equivalents with shear deformations. This method outperforms baseline pix2pix approaches in simulating tactile images and in pose/shear prediction. We apply SimShear to two control tasks using a pair of low-cost desktop robotic arms equipped with a vision-based tactile sensor: (i) a tactile tracking task, where a follower arm tracks a surface moved by a leader arm, and (ii) a collaborative co-lifting task, where both arms jointly hold an object while the leader follows a prescribed trajectory. Our method maintains contact errors within 1 to 2 mm across varied trajectories where shear sensing is essential, validating the feasibility of sim-to-real shear modeling with rigid-body simulators and opening new directions for simulation in tactile robotics.
comment: 2025 Conference on Robot Learning (CoRL)
SPGrasp: Spatiotemporal Prompt-driven Grasp Synthesis in Dynamic Scenes
Real-time interactive grasp synthesis for dynamic objects remains challenging as existing methods fail to achieve low-latency inference while maintaining promptability. To bridge this gap, we propose SPGrasp (spatiotemporal prompt-driven dynamic grasp synthesis), a novel framework extending segment anything model v2 (SAMv2) for video stream grasp estimation. Our core innovation integrates user prompts with spatiotemporal context, enabling real-time interaction with end-to-end latency as low as 59 ms while ensuring temporal consistency for dynamic objects. In benchmark evaluations, SPGrasp achieves instance-level grasp accuracies of 90.6% on OCID and 93.8% on Jacquard. On the challenging GraspNet-1Billion dataset under continuous tracking, SPGrasp achieves 92.0% accuracy with 73.1 ms per-frame latency, representing a 58.5% reduction compared to the prior state-of-the-art promptable method RoG-SAM while maintaining competitive accuracy. Real-world experiments involving 13 moving objects demonstrate a 94.8% success rate in interactive grasping scenarios. These results confirm SPGrasp effectively resolves the latency-interactivity trade-off in dynamic grasp synthesis. Code is available at https://github.com/sejmoonwei/SPGrasp.
Learning Fast, Tool aware Collision Avoidance for Collaborative Robots
Ensuring safe and efficient operation of collaborative robots in human environments is challenging, especially in dynamic settings where both obstacle motion and tasks change over time. Current robot controllers typically assume full visibility and fixed tools, which can lead to collisions or overly conservative behavior. In our work, we introduce a tool-aware collision avoidance system that adjusts in real time to different tool sizes and modes of tool-environment interaction. Using a learned perception model, our system filters out robot and tool components from the point cloud, reasons about occluded area, and predicts collision under partial observability. We then use a control policy trained via constrained reinforcement learning to produce smooth avoidance maneuvers in under 10 milliseconds. In simulated and real-world tests, our approach outperforms traditional approaches (APF, MPPI) in dynamic environments, while maintaining sub-millimeter accuracy. Moreover, our system operates with approximately 60% lower computational cost compared to a state-of-the-art GPU-based planner. Our approach provides modular, efficient, and effective collision avoidance for robots operating in dynamic environments. We integrate our method into a collaborative robot application and demonstrate its practical use for safe and responsive operation.
Remarks on stochastic cloning and delayed-state filtering
Many estimation problems in robotics and navigation involve measurements that depend on prior states. A prominent example is odometry, which measures the relative change between states over time. Accurately handling these delayed-state measurements requires capturing their correlations with prior state estimates, and a widely used approach is stochastic cloning (SC), which augments the state vector to account for these correlations. This work revisits a long-established but often overlooked alternative--the delayed-state Kalman filter--and demonstrates that a properly derived filter yields exactly the same state and covariance update as SC, without requiring state augmentation. Moreover, the generalized Kalman filter formulation provides computational advantages, while also reducing memory requirements for higher-dimensional states. Our findings clarify a common misconception that Kalman filter variants are inherently unable to handle correlated delayed-state measurements, demonstrating that an alternative formulation achieves the same results more efficiently.
Uncertainty-Aware Ankle Exoskeleton Control
Lower limb exoskeletons show promise to assist human movement, but their utility is limited by controllers designed for discrete, predefined actions in controlled environments, restricting their real-world applicability. We present an uncertainty-aware control framework that enables ankle exoskeletons to operate safely across diverse scenarios by automatically disengaging when encountering unfamiliar movements. Our approach uses an uncertainty estimator to classify movements as similar (in-distribution) or different (out-of-distribution) relative to actions in the training set. We evaluated three architectures (model ensembles, autoencoders, and generative adversarial networks) on an offline dataset and tested the strongest performing architecture (ensemble of gait phase estimators) online. The online test demonstrated the ability of our uncertainty estimator to turn assistance on and off as the user transitioned between in-distribution and out-of-distribution tasks (F1: 89.2). This new framework provides a path for exoskeletons to safely and autonomously support human movement in unstructured, everyday environments.
Multi-robot Path Planning and Scheduling via Model Predictive Optimal Transport (MPC-OT)
In this paper, we propose a novel methodology for path planning and scheduling for multi-robot navigation that is based on optimal transport theory and model predictive control. We consider a setup where $N$ robots are tasked to navigate to $M$ targets in a common space with obstacles. Mapping robots to targets first and then planning paths can result in overlapping paths that lead to deadlocks. We derive a strategy based on optimal transport that not only provides minimum cost paths from robots to targets but also guarantees non-overlapping trajectories. We achieve this by discretizing the space of interest into $K$ cells and by imposing a ${K\times K}$ cost structure that describes the cost of transitioning from one cell to another. Optimal transport then provides \textit{optimal and non-overlapping} cell transitions for the robots to reach the targets that can be readily deployed without any scheduling considerations. The proposed solution requires $\unicode{x1D4AA}(K^3\log K)$ computations in the worst-case and $\unicode{x1D4AA}(K^2\log K)$ for well-behaved problems. To further accommodate potentially overlapping trajectories (unavoidable in certain situations) as well as robot dynamics, we show that a temporal structure can be integrated into optimal transport with the help of \textit{replans} and \textit{model predictive control}.
comment: 2025 IEEE Conference on Decision and Control
Observer Design for Optical Flow-Based Visual-Inertial Odometry with Almost-Global Convergence
This paper presents a novel cascaded observer architecture that combines optical flow and IMU measurements to perform continuous monocular visual-inertial odometry (VIO). The proposed solution estimates body-frame velocity and gravity direction simultaneously by fusing velocity direction information from optical flow measurements with gyro and accelerometer data. This fusion is achieved using a globally exponentially stable Riccati observer, which operates under persistently exciting translational motion conditions. The estimated gravity direction in the body frame is then employed, along with an optional magnetometer measurement, to design a complementary observer on $\mathbf{SO}(3)$ for attitude estimation. The resulting interconnected observer architecture is shown to be almost globally asymptotically stable. To extract the velocity direction from sparse optical flow data, a gradient descent algorithm is developed to solve a constrained minimization problem on the unit sphere. The effectiveness of the proposed algorithms is validated through simulation results.
comment: 8 pages, 6 figures. To appear in IEEE CDC 2025
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
The human ability to seamlessly perform multimodal reasoning and physical interaction in the open world is a core goal for general-purpose embodied intelligent systems. Recent vision-language-action (VLA) models, which are co-trained on large-scale robot and visual-text data, have demonstrated notable progress in general robot control. However, they still fail to achieve human-level flexibility in interleaved reasoning and interaction. In this work, introduce EO-Robotics, consists of EO-1 model and EO-Data1.5M dataset. EO-1 is a unified embodied foundation model that achieves superior performance in multimodal embodied reasoning and robot control through interleaved vision-text-action pre-training. The development of EO-1 is based on two key pillars: (i) a unified architecture that processes multimodal inputs indiscriminately (image, text, video, and action), and (ii) a massive, high-quality multimodal embodied reasoning dataset, EO-Data1.5M, which contains over 1.5 million samples with emphasis on interleaved vision-text-action comprehension. EO-1 is trained through synergies between auto-regressive decoding and flow matching denoising on EO-Data1.5M, enabling seamless robot action generation and multimodal embodied reasoning. Extensive experiments demonstrate the effectiveness of interleaved vision-text-action learning for open-world understanding and generalization, validated through a variety of long-horizon, dexterous manipulation tasks across multiple embodiments. This paper details the architecture of EO-1, the data construction strategy of EO-Data1.5M, and the training methodology, offering valuable insights for developing advanced embodied foundation models.
GENNAV: Polygon Mask Generation for Generalized Referring Navigable Regions
We focus on the task of identifying the location of target regions from a natural language instruction and a front camera image captured by a mobility. This task is challenging because it requires both existence prediction and segmentation, particularly for stuff-type target regions with ambiguous boundaries. Existing methods often underperform in handling stuff-type target regions, in addition to absent or multiple targets. To overcome these limitations, we propose GENNAV, which predicts target existence and generates segmentation masks for multiple stuff-type target regions. To evaluate GENNAV, we constructed a novel benchmark called GRiN-Drive, which includes three distinct types of samples: no-target, single-target, and multi-target. GENNAV achieved superior performance over baseline methods on standard evaluation metrics. Furthermore, we conducted real-world experiments with four automobiles operated in five geographically distinct urban areas to validate its zero-shot transfer performance. In these experiments, GENNAV outperformed baseline methods and demonstrated its robustness across diverse real-world environments. The project page is available at https://gennav.vercel.app/.
comment: Accepted for presentation at CoRL2025
HERMES: Human-to-Robot Embodied Learning from Multi-Source Motion Data for Mobile Dexterous Manipulation
Leveraging human motion data to impart robots with versatile manipulation skills has emerged as a promising paradigm in robotic manipulation. Nevertheless, translating multi-source human hand motions into feasible robot behaviors remains challenging, particularly for robots equipped with multi-fingered dexterous hands characterized by complex, high-dimensional action spaces. Moreover, existing approaches often struggle to produce policies capable of adapting to diverse environmental conditions. In this paper, we introduce HERMES, a human-to-robot learning framework for mobile bimanual dexterous manipulation. First, HERMES formulates a unified reinforcement learning approach capable of seamlessly transforming heterogeneous human hand motions from multiple sources into physically plausible robotic behaviors. Subsequently, to mitigate the sim2real gap, we devise an end-to-end, depth image-based sim2real transfer method for improved generalization to real-world scenarios. Furthermore, to enable autonomous operation in varied and unstructured environments, we augment the navigation foundation model with a closed-loop Perspective-n-Point (PnP) localization mechanism, ensuring precise alignment of visual goals and effectively bridging autonomous navigation and dexterous manipulation. Extensive experimental results demonstrate that HERMES consistently exhibits generalizable behaviors across diverse, in-the-wild scenarios, successfully performing numerous complex mobile bimanual dexterous manipulation tasks. Project Page:https://gemcollector.github.io/HERMES/.
Staircase Recognition and Location Based on Polarization Vision
Staircase is one of the most common structures in artificial scenes. However, it is difficult for humanoid robots and people with lower limb disabilities or visual impairment to cross the scene without the help of sensors and intelligent algorithms. Staircase scene perception technology is a prerequisite for recognition and localization. This technology is of great significance for the mode switching of the robot and the calculation of the footprint position to adapt to the discontinuous terrain. However, there are still many problems that constrain the application of this technology, such as low recognition accuracy, high initial noise from sensors, unstable output signals and high computational requirements. In terms of scene reconstruction, the binocular and time of flight (TOF) reconstruction of the scene can be easily affected by environmental light and the surface material of the target object. In contrast, due to the special structure of the polarizer, the polarization can selectively transmit polarized light in a specific direction and this reconstruction method relies on the polarization information of the object surface. So the advantages of polarization reconstruction are reflected, which are less affected by environmental light and not dependent on the texture information of the object surface. In this paper, in order to achieve the detection of staircase, this paper proposes a contrast enhancement algorithm that integrates polarization and light intensity information, and integrates point cloud segmentation based on YOLOv11. To realize the high-quality reconstruction, we proposed a method of fusing polarized binocular and TOF depth information to realize the three-dimensional (3D) reconstruction of the staircase. Besides, it also proposes a joint calibration algorithm of monocular camera and TOF camera based on ICP registration and improved gray wolf optimization algorithm.
Long-VLA: Unleashing Long-Horizon Capability of Vision Language Action Model for Robot Manipulation
Vision-Language-Action (VLA) models have become a cornerstone in robotic policy learning, leveraging large-scale multimodal data for robust and scalable control. However, existing VLA frameworks primarily address short-horizon tasks, and their effectiveness on long-horizon, multi-step robotic manipulation remains limited due to challenges in skill chaining and subtask dependencies. In this work, we introduce Long-VLA, the first end-to-end VLA model specifically designed for long-horizon robotic tasks. Our approach features a novel phase-aware input masking strategy that adaptively segments each subtask into moving and interaction phases, enabling the model to focus on phase-relevant sensory cues and enhancing subtask compatibility. This unified strategy preserves the scalability and data efficiency of VLA training, and our architecture-agnostic module can be seamlessly integrated into existing VLA models. We further propose the L-CALVIN benchmark to systematically evaluate long-horizon manipulation. Extensive experiments on both simulated and real-world tasks demonstrate that Long-VLA significantly outperforms prior state-of-the-art methods, establishing a new baseline for long-horizon robotic control.
comment: Accepted to CoRL 2025; Github Page: https://long-vla.github.io
From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity
Autonomous skill discovery aims to enable robots to acquire diverse behaviors without explicit supervision. Learning such behaviors directly on physical hardware remains challenging due to safety and data efficiency constraints. Existing methods, including Quality-Diversity Actor-Critic (QDAC), require manually defined skill spaces and carefully tuned heuristics, limiting real-world applicability. We propose Unsupervised Real-world Skill Acquisition (URSA), an extension of QDAC that enables robots to autonomously discover and master diverse, high-performing skills directly in the real world. We demonstrate that URSA successfully discovers diverse locomotion skills on a Unitree A1 quadruped in both simulation and the real world. Our approach supports both heuristic-driven skill discovery and fully unsupervised settings. We also show that the learned skill repertoire can be reused for downstream tasks such as real-world damage adaptation, where URSA outperforms all baselines in 5 out of 9 simulated and 3 out of 5 real-world damage scenarios. Our results establish a new framework for real-world robot learning that enables continuous skill discovery with limited human intervention, representing a significant step toward more autonomous and adaptable robotic systems. Demonstration videos are available at https://adaptive-intelligent-robotics.github.io/URSA.
comment: Accepted at CoRL 2025
FFHFlow: Diverse and Uncertainty-Aware Dexterous Grasp Generation via Flow Variational Inference
Synthesizing diverse, uncertainty-aware grasps for multi-fingered hands from partial observations remains a critical challenge in robot learning. Prior generative methods struggle to model the intricate grasp distribution of dexterous hands and often fail to reason about shape uncertainty inherent in partial point clouds, leading to unreliable or overly conservative grasps. We propose FFHFlow, a flow-based variational framework that generates diverse, robust multi-finger grasps while explicitly quantifying perceptual uncertainty in the partial point clouds. Our approach leverages a normalizing flow-based deep latent variable model to learn a hierarchical grasp manifold, overcoming the mode collapse and rigid prior limitations of conditional Variational Autoencoders (cVAEs). By exploiting the invertibility and exact likelihoods of flows, FFHFlow introspects shape uncertainty in partial observations and identifies novel object structures, enabling risk-aware grasp synthesis. To further enhance reliability, we integrate a discriminative grasp evaluator with the flow likelihoods, formulating an uncertainty-aware ranking strategy that prioritizes grasps robust to shape ambiguity. Extensive experiments in simulation and real-world setups demonstrate that FFHFlow outperforms state-of-the-art baselines (including diffusion models) in grasp diversity and success rate, while achieving run-time efficient sampling. We also showcase its practical value in cluttered and confined environments, where diversity-driven sampling excels by mitigating collisions (Project Page: https://sites.google.com/view/ffhflow/home/).
comment: First two authors contributed equally, whose ordering decided via coin-tossing. Accepted for CoRL 2025
Learning to Drive Ethically: Embedding Moral Reasoning into Autonomous Driving
Autonomous vehicles hold great promise for reducing traffic fatalities and improving transportation efficiency, yet their widespread adoption hinges on embedding robust ethical reasoning into routine and emergency maneuvers, particularly to protect vulnerable road users (VRUs) such as pedestrians and cyclists. Here, we present a hierarchical Safe Reinforcement Learning (Safe RL) framework that explicitly integrates moral considerations with standard driving objectives. At the decision level, a Safe RL agent is trained using a composite ethical risk cost, combining collision probability and harm severity, to generate high-level motion targets. A dynamic Prioritized Experience Replay mechanism amplifies learning from rare but critical, high-risk events. At the execution level, polynomial path planning coupled with Proportional-Integral-Derivative (PID) and Stanley controllers translates these targets into smooth, feasible trajectories, ensuring both accuracy and comfort. We train and validate our approach on rich, real-world traffic datasets encompassing diverse vehicles, cyclists, and pedestrians, and demonstrate that it outperforms baseline methods in reducing ethical risk and maintaining driving performance. To our knowledge, this is the first study of ethical decision-making for autonomous vehicles via Safe RL evaluated on real-world, human-mixed traffic scenarios. Our results highlight the potential of combining formal control theory and data-driven learning to advance ethically accountable autonomy that explicitly protects those most at risk in urban traffic environments.
TacCompress: A Benchmark for Multi-Point Tactile Data Compression in Dexterous Hand
Though robotic dexterous manipulation has progressed substantially recently, challenges like in-hand occlusion still necessitate fine-grained tactile perception, leading to the integration of more tactile sensors into robotic hands. Consequently, the increased data volume imposes substantial bandwidth pressure on signal transmission from the hand's controller. However, the acquisition and compression of multi-point tactile signals based on the dexterous hands' physical structures have not been thoroughly explored. In this paper, our contributions are twofold. First, we introduce a Multi-Point Tactile Dataset for Dexterous Hand Grasping (Dex-MPTD). This dataset captures tactile signals from multiple contact sensors across various objects and grasping poses, offering a comprehensive benchmark for advancing dexterous robotic manipulation research. Second, we investigate both lossless and lossy compression on Dex-MPTD by converting tactile data into images and applying six lossless and five lossy image codecs for efficient compression. Experimental results demonstrate that tactile data can be losslessly compressed to as low as 0.0364 bits per sub-sample (bpss), achieving approximately 200$\times$ compression ratio compared to the raw tactile data. Efficient lossy compressors like HM and VTM can achieve about 1000$\times$ data reductions while preserving acceptable data fidelity. The exploration of lossy compression also reveals that screen-content-targeted coding tools outperform general-purpose codecs in compressing tactile data.
comment: 9 pages, 10 figures, 2 tables
Pixel Motion as Universal Representation for Robot Control
We present LangToMo, a vision-language-action framework structured as a dual-system architecture that uses pixel motion forecasts as intermediate representations. Our high-level System 2, an image diffusion model, generates text-conditioned pixel motion sequences from a single frame to guide robot control. Pixel motion-a universal, interpretable, and motion-centric representation-can be extracted from videos in a weakly-supervised manner, enabling diffusion model training on any video-caption data. Treating generated pixel motion as learned universal representations, our low level System 1 module translates these into robot actions via motion-to-action mapping functions, which can be either hand-crafted or learned with minimal supervision. System 2 operates as a high-level policy applied at sparse temporal intervals, while System 1 acts as a low-level policy at dense temporal intervals. This hierarchical decoupling enables flexible, scalable, and generalizable robot control under both unsupervised and supervised settings, bridging the gap between language, motion, and action. Checkout https://kahnchana.github.io/LangToMo
Omni-Perception: Omnidirectional Collision Avoidance for Legged Locomotion in Dynamic Environments
Agile locomotion in complex 3D environments requires robust spatial awareness to safely avoid diverse obstacles such as aerial clutter, uneven terrain, and dynamic agents. Depth-based perception approaches often struggle with sensor noise, lighting variability, computational overhead from intermediate representations (e.g., elevation maps), and difficulties with non-planar obstacles, limiting performance in unstructured environments. In contrast, direct integration of LiDAR sensing into end-to-end learning for legged locomotion remains underexplored. We propose Omni-Perception, an end-to-end locomotion policy that achieves 3D spatial awareness and omnidirectional collision avoidance by directly processing raw LiDAR point clouds. At its core is PD-RiskNet (Proximal-Distal Risk-Aware Hierarchical Network), a novel perception module that interprets spatio-temporal LiDAR data for environmental risk assessment. To facilitate efficient policy learning, we develop a high-fidelity LiDAR simulation toolkit with realistic noise modeling and fast raycasting, compatible with platforms such as Isaac Gym, Genesis, and MuJoCo, enabling scalable training and effective sim-to-real transfer. Learning reactive control policies directly from raw LiDAR data enables the robot to navigate complex environments with static and dynamic obstacles more robustly than approaches relying on intermediate maps or limited sensing. We validate Omni-Perception through real-world experiments and extensive simulation, demonstrating strong omnidirectional avoidance capabilities and superior locomotion performance in highly dynamic environments.
Uncertainty-Aware Trajectory Prediction via Rule-Regularized Heteroscedastic Deep Classification
Deep learning-based trajectory prediction models have demonstrated promising capabilities in capturing complex interactions. However, their out-of-distribution generalization remains a significant challenge, particularly due to unbalanced data and a lack of enough data and diversity to ensure robustness and calibration. To address this, we propose SHIFT (Spectral Heteroscedastic Informed Forecasting for Trajectories), a novel framework that uniquely combines well-calibrated uncertainty modeling with informative priors derived through automated rule extraction. SHIFT reformulates trajectory prediction as a classification task and employs heteroscedastic spectral-normalized Gaussian processes to effectively disentangle epistemic and aleatoric uncertainties. We learn informative priors from training labels, which are automatically generated from natural language driving rules, such as stop rules and drivability constraints, using a retrieval-augmented generation framework powered by a large language model. Extensive evaluations over the nuScenes dataset, including challenging low-data and cross-location scenarios, demonstrate that SHIFT outperforms state-of-the-art methods, achieving substantial gains in uncertainty calibration and displacement metrics. In particular, our model excels in complex scenarios, such as intersections, where uncertainty is inherently higher. Project page: https://kumarmanas.github.io/SHIFT/.
comment: 17 Pages, 9 figures. Accepted to Robotics: Science and Systems(RSS), 2025
Unscented Kalman Filter with a Nonlinear Propagation Model for Navigation Applications
The unscented Kalman filter is a nonlinear estimation algorithm commonly used in navigation applications. The prediction of the mean and covariance matrix is crucial to the stable behavior of the filter. This prediction is done by propagating the sigma points according to the dynamic model at hand. In this paper, we introduce an innovative method to propagate the sigma points according to the nonlinear dynamic model of the navigation error state vector. This improves the filter accuracy and navigation performance. We demonstrate the benefits of our proposed approach using real sensor data recorded by an autonomous underwater vehicle during several scenarios.
comment: 6 pages, 4 figures
Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)
This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially ``sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances and specific knowledge of the model. To our knowledge in the quadrotor control field, DiD-L is the first online learning approach that is lightweight enough to run in real-time at 400 Hz on microcontroller units (MCUs) such as STM32, and has been validated through real-world experiments.
comment: v2: Corrected methodology naming typo; provided TeX source files
On the complexity of constrained reconfiguration and motion planning
Coordinating the motion of multiple agents in constrained environments is a fundamental challenge in robotics, motion planning, and scheduling. A motivating example involves $n$ robotic arms, each represented as a line segment. The objective is to rotate each arm to its vertical orientation, one at a time (clockwise or counterclockwise), without collisions nor rotating any arm more than once. This scenario is an example of the more general $k$-Compatible Ordering problem, where $n$ agents, each capable of $k$ state-changing actions, must transition to specific target states under constraints encoded as a set $\mathcal{G}$ of $k$ pairs of directed graphs. We show that $k$-Compatible Ordering is $\mathsf{NP}$-complete, even when $\mathcal{G}$ is planar, degenerate, or acyclic. On the positive side, we provide polynomial-time algorithms for cases such as when $k = 1$ or $\mathcal{G}$ has bounded treewidth. We also introduce generalized variants supporting multiple state-changing actions per agent, broadening the applicability of our framework. These results extend to a wide range of scheduling, reconfiguration, and motion planning applications in constrained environments.
RSRNav: Reasoning Spatial Relationship for Image-Goal Navigation
Recent image-goal navigation (ImageNav) methods learn a perception-action policy by separately capturing semantic features of the goal and egocentric images, then passing them to a policy network. However, challenges remain: (1) Semantic features often fail to provide accurate directional information, leading to superfluous actions, and (2) performance drops significantly when viewpoint inconsistencies arise between training and application. To address these challenges, we propose RSRNav, a simple yet effective method that reasons spatial relationships between the goal and current observations as navigation guidance. Specifically, we model the spatial relationship by constructing correlations between the goal and current observations, which are then passed to the policy network for action prediction. These correlations are progressively refined using fine-grained cross-correlation and direction-aware correlation for more precise navigation. Extensive evaluation of RSRNav on three benchmark datasets demonstrates superior navigation performance, particularly in the "user-matched goal" setting, highlighting its potential for real-world applications.
Safe and Efficient Social Navigation through Explainable Safety Regions Based on Topological Features
The recent adoption of artificial intelligence in robotics has driven the development of algorithms that enable autonomous systems to adapt to complex social environments. In particular, safe and efficient social navigation is a key challenge, requiring AI not only to avoid collisions and deadlocks but also to interact intuitively and predictably with its surroundings. Methods based on probabilistic models and the generation of conformal safety regions have shown promising results in defining safety regions with a controlled margin of error, primarily relying on classification approaches and explicit rules to describe collision-free navigation conditions. This work extends the existing perspective by investigating how topological features can contribute to the creation of explainable safety regions in social navigation scenarios, enabling the classification and characterization of different simulation behaviors. Rather than relying on behaviors parameters to generate safety regions, we leverage topological features through topological data analysis. We first utilize global rule-based classification to provide interpretable characterizations of different simulation behaviors, distinguishing between safe and unsafe scenarios based on topological properties. Next, we define safety regions, $S_\varepsilon$, representing zones in the topological feature space where collisions are avoided with a maximum classification error of $\varepsilon$. These regions are constructed using adjustable SVM classifiers and order statistics, ensuring a robust and scalable decision boundary. Our approach initially separates simulations with and without collisions, outperforming methods that not incorporate topological features. We further refine safety regions to ensure deadlock-free simulations and integrate both aspects to define a compliant simulation space that guarantees safe and efficient navigation.
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
UAV-UGV Cooperative Trajectory Optimization and Task Allocation for Medical Rescue Tasks in Post-Disaster Environments
In post-disaster scenarios, rapid and efficient delivery of medical resources is critical and challenging due to severe damage to infrastructure. To provide an optimized solution, we propose a cooperative trajectory optimization and task allocation framework leveraging unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). This study integrates a Genetic Algorithm (GA) for efficient task allocation among multiple UAVs and UGVs, and employs an informed-RRT* (Rapidly-exploring Random Tree Star) algorithm for collision-free trajectory generation. Further optimization of task sequencing and path efficiency is conducted using Covariance Matrix Adaptation Evolution Strategy (CMA-ES). Simulation experiments conducted in a realistic post-disaster environment demonstrate that our proposed approach significantly improves the overall efficiency of medical rescue operations compared to traditional strategies. Specifically, our method reduces the total mission completion time to 26.7 minutes for a 15-task scenario, outperforming K-Means clustering and random allocation by over 73%. Furthermore, the framework achieves a substantial 15.1% reduction in total traveled distance after CMA-ES optimization. The cooperative utilization of UAVs and UGVs effectively balances their complementary advantages, highlighting the system's scalability and practicality for real-world deployment.
CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs
Object goal navigation (ObjectNav) is a fundamental task in embodied AI, requiring an agent to locate a target object in previously unseen environments. This task is particularly challenging because it requires both perceptual and cognitive processes, including object recognition and decision-making. While substantial advancements in perception have been driven by the rapid development of visual foundation models, progress on the cognitive aspect remains constrained, primarily limited to either implicit learning through simulator rollouts or explicit reliance on predefined heuristic rules. Inspired by neuroscientific findings demonstrating that humans maintain and dynamically update fine-grained cognitive states during object search tasks in novel environments, we propose CogNav, a framework designed to mimic this cognitive process using large language models. Specifically, we model the cognitive process using a finite state machine comprising fine-grained cognitive states, ranging from exploration to identification. Transitions between states are determined by a large language model based on a dynamically constructed heterogeneous cognitive map, which contains spatial and semantic information about the scene being explored. Extensive evaluations on the HM3D, MP3D, and RoboTHOR benchmarks demonstrate that our cognitive process modeling significantly improves the success rate of ObjectNav at least by relative 14% over the state-of-the-arts.
Enhanced Trust Region Sequential Convex Optimization for Multi-Drone Thermal Screening Trajectory Planning in Urban Environments
The rapid detection of abnormal body temperatures in urban populations is essential for managing public health risks, especially during outbreaks of infectious diseases. Multi-drone thermal screening systems offer promising solutions for fast, large-scale, and non-intrusive human temperature monitoring. However, trajectory planning for multiple drones in complex urban environments poses significant challenges, including collision avoidance, coverage efficiency, and constrained flight environments. In this study, we propose an enhanced trust region sequential convex optimization (TR-SCO) algorithm for optimal trajectory planning of multiple drones performing thermal screening tasks. Our improved algorithm integrates a refined convex optimization formulation within a trust region framework, effectively balancing trajectory smoothness, obstacle avoidance, altitude constraints, and maximum screening coverage. Simulation results demonstrate that our approach significantly improves trajectory optimality and computational efficiency compared to conventional convex optimization methods. This research provides critical insights and practical contributions toward deploying efficient multi-drone systems for real-time thermal screening in urban areas. For reader who are interested in our research, we release our source code at https://github.com/Cherry0302/Enhanced-TR-SCO.
LocoMamba: Vision-Driven Locomotion via End-to-End Deep Reinforcement Learning with Mamba
We introduce LocoMamba, a vision-driven cross-modal DRL framework built on selective state-space models, specifically leveraging Mamba, that achieves near-linear-time sequence modeling, effectively captures long-range dependencies, and enables efficient training with longer sequences. First, we embed proprioceptive states with a multilayer perceptron and patchify depth images with a lightweight convolutional neural network, producing compact tokens that improve state representation. Second, stacked Mamba layers fuse these tokens via near-linear-time selective scanning, reducing latency and memory footprint, remaining robust to token length and image resolution, and providing an inductive bias that mitigates overfitting. Third, we train the policy end-to-end with Proximal Policy Optimization under terrain and appearance randomization and an obstacle-density curriculum, using a compact state-centric reward that balances progress, smoothness, and safety. We evaluate our method in challenging simulated environments with static and moving obstacles as well as uneven terrain. Compared with state-of-the-art baselines, our method achieves higher returns and success rates with fewer collisions, exhibits stronger generalization to unseen terrains and obstacle densities, and improves training efficiency by converging in fewer updates under the same compute budget.
comment: 13 pages
Pellet-based 3D Printing of Soft Thermoplastic Elastomeric Membranes for Soft Robotic Applications
Additive Manufacturing (AM) is a promising solution for handling the complexity of fabricating soft robots. However, the AM of hyperelastic materials is still challenging with a limited material range. Within this work, pellet-based 3D printing of very soft thermoplastic elastomers (TPEs) was explored (down to Shore Hardness 00-30). Our results show that TPEs can have similar engineering stress and maximum elongation as Ecoflex OO-10. In addition, we 3D-printed airtight thin membranes (0.2-1.2 mm), which could inflate up to a stretch of 1320%. Combining the membrane's large expansion and softness with the 3D printing of hollow structures simplified the design of a bending actuator that can bend 180 degrees and reach a blocked force of 238 times its weight. In addition, by 3D printing TPE pellets and rigid filaments, the soft membrane could grasp objects by enveloping an object or as a sensorized sucker, which relied on the TPE's softness to conform to the object or act as a seal. In addition, the membrane of the sucker acted as a tactile sensor to detect an object before adhesion. These results suggest the feasibility of AM of soft robots using soft TPEs and membranes as a promising class of materials and sensorized actuators, respectively.
PUB: A Plasma-Propelled Ultra-Quiet Blimp with Two-DOF Vector Thrusting
This study presents the design and control of a Plasma-propelled Ultra-silence Blimp (PUB), a novel aerial robot employing plasma vector propulsion for ultra-quiet flight without mechanical propellers. The system utilizes a helium-lift platform for extended endurance and a four-layer ring asymmetric capacitor to generate ionic wind thrust. The modular propulsion units allow flexible configuration to meet mission-specific requirements, while a two-degree-of-freedom (DOF) head enables thrust vector control. A closed-loop slip control scheme is implemented for stable maneuvering. Flight experiments demonstrate full-envelope capability, including take-off, climb, hover, descent, and smooth landing, confirming the feasibility of plasma vector propulsion, the effectiveness of DOF vector control, and the stability of the control system. Owing to its low acoustic signature, structural simplicity, and high maneuverability, PUB is well suited for noise-sensitive, enclosed, and near-space applications.
Multi-critic Learning for Whole-body End-effector Twist Tracking
Learning whole-body control for locomotion and arm motions in a single policy has challenges, as the two tasks have conflicting goals. For instance, efficient locomotion typically favors a horizontal base orientation, while end-effector tracking may benefit from base tilting to extend reachability. Additionally, current Reinforcement Learning (RL) approaches using a pose-based task specification lack the ability to directly control the end-effector velocity, making smoothly executing trajectories very challenging. To address these limitations, we propose an RL-based framework that allows for dynamic, velocity-aware whole-body end-effector control. Our method introduces a multi-critic actor architecture that decouples the reward signals for locomotion and manipulation, simplifying reward tuning and allowing the policy to resolve task conflicts more effectively. Furthermore, we design a twist-based end-effector task formulation that can track both discrete poses and motion trajectories. We validate our approach through a set of simulation and hardware experiments using a quadruped robot equipped with a robotic arm. The resulting controller can simultaneously walk and move its end-effector and shows emergent whole-body behaviors, where the base assists the arm in extending the workspace, despite a lack of explicit formulations. Videos and supplementary material can be found at multi-critic-locomanipulation.github.io.
Divide, Discover, Deploy: Factorized Skill Learning with Symmetry and Style Priors
Unsupervised Skill Discovery (USD) allows agents to autonomously learn diverse behaviors without task-specific rewards. While recent USD methods have shown promise, their application to real-world robotics remains underexplored. In this paper, we propose a modular USD framework to address the challenges in the safety, interpretability, and deployability of the learned skills. Our approach employs user-defined factorization of the state space to learn disentangled skill representations. It assigns different skill discovery algorithms to each factor based on the desired intrinsic reward function. To encourage structured morphology-aware skills, we introduce symmetry-based inductive biases tailored to individual factors. We also incorporate a style factor and regularization penalties to promote safe and robust behaviors. We evaluate our framework in simulation using a quadrupedal robot and demonstrate zero-shot transfer of the learned skills to real hardware. Our results show that factorization and symmetry lead to the discovery of structured human-interpretable behaviors, while the style factor and penalties enhance safety and diversity. Additionally, we show that the learned skills can be used for downstream tasks and perform on par with oracle policies trained with hand-crafted rewards.
comment: Accepted to CoRL 2025. For code and videos, please check: https://leggedrobotics.github.io/d3-skill-discovery/
Multiagent Systems
CoCoL: A Communication Efficient Decentralized Collaborative Method for Multi-Robot Systems IROS2025
Collaborative learning enhances the performance and adaptability of multi-robot systems in complex tasks but faces significant challenges due to high communication overhead and data heterogeneity inherent in multi-robot tasks. To this end, we propose CoCoL, a Communication efficient decentralized Collaborative Learning method tailored for multi-robot systems with heterogeneous local datasets. Leveraging a mirror descent framework, CoCoL achieves remarkable communication efficiency with approximate Newton-type updates by capturing the similarity between objective functions of robots, and reduces computational costs through inexact sub-problem solutions. Furthermore, the integration of a gradient tracking scheme ensures its robustness against data heterogeneity. Experimental results on three representative multi robot collaborative learning tasks show the superiority of the proposed CoCoL in significantly reducing both the number of communication rounds and total bandwidth consumption while maintaining state-of-the-art accuracy. These benefits are particularly evident in challenging scenarios involving non-IID (non-independent and identically distributed) data distribution, streaming data, and time-varying network topologies.
comment: Accepted by IROS2025
cMALC-D: Contextual Multi-Agent LLM-Guided Curriculum Learning with Diversity-Based Context Blending
Many multi-agent reinforcement learning (MARL) algorithms are trained in fixed simulation environments, making them brittle when deployed in real-world scenarios with more complex and uncertain conditions. Contextual MARL (cMARL) addresses this by parameterizing environments with context variables and training a context-agnostic policy that performs well across all environment configurations. Existing cMARL methods attempt to use curriculum learning to help train and evaluate context-agnostic policies, but they often rely on unreliable proxy signals, such as value estimates or generalized advantage estimates that are noisy and unstable in multi-agent settings due to inter-agent dynamics and partial observability. To address these issues, we propose Contextual Multi-Agent LLM-Guided Curriculum Learning with Diversity-Based Context Blending (cMALC-D), a framework that uses Large Language Models (LLMs) to generate semantically meaningful curricula and provide a more robust evaluation signal. To prevent mode collapse and encourage exploration, we introduce a novel diversity-based context blending mechanism that creates new training scenarios by combining features from prior contexts. Experiments in traffic signal control domains demonstrate that cMALC-D significantly improves both generalization and sample efficiency compared to existing curriculum learning baselines. We provide code at https://github.com/DaRL-LibSignal/cMALC-D.
comment: A shorter version has been accepted to the 2025 Conference on Information and Knowledge Management
Evolution favours positively biased reasoning in sequential interactions with high future gains
Empirical evidence shows that human behaviour often deviates from game-theoretical rationality. For instance, humans may hold unrealistic expectations about future outcomes. As the evolutionary roots of such biases remain unclear, we investigate here how reasoning abilities and cognitive biases co-evolve using Evolutionary Game Theory. In our model, individuals in a population deploy a variety of unbiased and biased level-k reasoning strategies to anticipate others' behaviour in sequential interactions, represented by the Incremental Centipede Game. Positively biased reasoning strategies have a systematic inference bias towards higher but uncertain rewards, while negatively biased strategies reflect the opposite tendency. We find that selection consistently favours positively biased reasoning, with rational behaviour even going extinct. This bias co-evolves with bounded rationality, as the reasoning depth remains limited in the population. Interestingly, positively biased agents may co-exist with non-reasoning agents, thus pointing to a novel equilibrium. Longer games further promote positively biased reasoning, as they can lead to higher future rewards. The biased reasoning strategies proposed in this model may reflect cognitive phenomena like wishful thinking and defensive pessimism. This work therefore supports the claim that certain cognitive biases, despite deviating from rational judgment, constitute an adaptive feature to better cope with social dilemmas.
comment: 33 pages, 5 figures
Bridging Finite and Infinite-Horizon Nash Equilibria in Linear Quadratic Games
Finite-horizon linear quadratic (LQ) games admit a unique Nash equilibrium, while infinite-horizon settings may have multiple. We clarify the relationship between these two cases by interpreting the finite-horizon equilibrium as a nonlinear dynamical system. Within this framework, we prove that its fixed points are exactly the infinite-horizon equilibria and that any such equilibrium can be recovered by an appropriate choice of terminal costs. We further show that periodic orbits of the dynamical system, when they arise, correspond to periodic Nash equilibria, and we provide numerical evidence of convergence to such cycles. Finally, simulations reveal three asymptotic regimes: convergence to stationary equilibria, convergence to periodic equilibria, and bounded non-convergent trajectories. These findings offer new insights and tools for tuning finite-horizon LQ games using infinite-horizon.
LLM-Based Agents for Competitive Landscape Mapping in Drug Asset Due Diligence
In this paper, we describe and benchmark a competitor-discovery component used within an agentic AI system for fast drug asset due diligence. A competitor-discovery AI agent, given an indication, retrieves all drugs comprising the competitive landscape of that indication and extracts canonical attributes for these drugs. The competitor definition is investor-specific, and data is paywalled/licensed, fragmented across registries, ontology-mismatched by indication, alias-heavy for drug names, multimodal, and rapidly changing. Although considered the best tool for this problem, the current LLM-based AI systems aren't capable of reliably retrieving all competing drug names, and there is no accepted public benchmark for this task. To address the lack of evaluation, we use LLM-based agents to transform five years of multi-modal, unstructured diligence memos from a private biotech VC fund into a structured evaluation corpus mapping indications to competitor drugs with normalized attributes. We also introduce a competitor validating LLM-as-a-judge agent that filters out false positives from the list of predicted competitors to maximize precision and suppress hallucinations. On this benchmark, our competitor-discovery agent achieves 83% recall, exceeding OpenAI Deep Research (65%) and Perplexity Labs (60%). The system is deployed in production with enterprise users; in a case study with a biotech VC investment fund, analyst turnaround time dropped from 2.5 days to $\sim$3 hours ($\sim$20x) for the competitive analysis.
UAV-UGV Cooperative Trajectory Optimization and Task Allocation for Medical Rescue Tasks in Post-Disaster Environments
In post-disaster scenarios, rapid and efficient delivery of medical resources is critical and challenging due to severe damage to infrastructure. To provide an optimized solution, we propose a cooperative trajectory optimization and task allocation framework leveraging unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). This study integrates a Genetic Algorithm (GA) for efficient task allocation among multiple UAVs and UGVs, and employs an informed-RRT* (Rapidly-exploring Random Tree Star) algorithm for collision-free trajectory generation. Further optimization of task sequencing and path efficiency is conducted using Covariance Matrix Adaptation Evolution Strategy (CMA-ES). Simulation experiments conducted in a realistic post-disaster environment demonstrate that our proposed approach significantly improves the overall efficiency of medical rescue operations compared to traditional strategies. Specifically, our method reduces the total mission completion time to 26.7 minutes for a 15-task scenario, outperforming K-Means clustering and random allocation by over 73%. Furthermore, the framework achieves a substantial 15.1% reduction in total traveled distance after CMA-ES optimization. The cooperative utilization of UAVs and UGVs effectively balances their complementary advantages, highlighting the system's scalability and practicality for real-world deployment.
CBS with Continuous-Time Revisit
Multi-Agent Path Finding in Continuous Time (\mapfr) extends the classical MAPF problem by allowing agents to operate in continuous time. Conflict-Based Search with Continuous Time (CCBS) is a foundational algorithm for solving \mapfr optimally. In this paper, we revisit the theoretical claims of CCBS and show the algorithm is incomplete, due to an uncountably infinite state space created by continuous wait durations. Through theoretical analysis and counter-examples, we examine the inherent challenges of extending existing MAPF solvers to address \mapfr while preserving optimality guarantees. By restricting waiting duration to fixed amounts, we identify a related sub-problem on graphs, \mapfrdt which we show is optimally solvable, including by CCBS. It remains an open question whether similar models exist for \mapfrct, a generalised version of \mapfrdt that allows arbitrary wait times, and \mapfrcs, which further allows arbitrary movements in continuous space.
Systems and Control (CS)
Finite-Time Guarantees for Multi-Agent Combinatorial Bandits with Nonstationary Rewards
We study a sequential resource allocation problem where a decision maker selects subsets of agents at each period to maximize overall outcomes without prior knowledge of individual-level effects. Our framework applies to settings such as community health interventions, targeted digital advertising, and workforce retention programs, where intervention effects evolve dynamically. Agents may exhibit habituation (diminished response from frequent selection) or recovery (enhanced response from infrequent selection). The technical challenge centers on nonstationary reward distributions that lead to changing intervention effects over time. The problem requires balancing two key competing objectives: heterogeneous individual rewards and the exploration-exploitation tradeoff in terms of learning for improved future decisions as opposed to maximizing immediate outcomes. Our contribution introduces the first framework incorporating this form of nonstationary rewards in the combinatorial multi-armed bandit literature. We develop algorithms with theoretical guarantees on dynamic regret and demonstrate practical efficacy through a diabetes intervention case study. Our personalized community intervention algorithm achieved up to three times as much improvement in program enrollment compared to baseline approaches, validating the framework's potential for real-world applications. This work bridges theoretical advances in adaptive learning with practical challenges in population-level behavioral change interventions.
comment: 41 pages, 8 figures
Missing Money and Market-Based Adequacy in Deeply Decarbonized Power Systems with Long-Duration Energy Storage
The ability of deeply decarbonised power systems to ensure adequacy may increasingly depend on long-duration energy storage (LDES). A central challenge is whether capacity markets (CMs), originally designed around thermal generation, can provide efficient investment signals when storage becomes a central participant. While recent studies have advanced methods for accrediting variable renewables and short-duration storage, the effectiveness of these methods in CMs with substantial LDES penetration remains largely unexplored. To address this gap, we extend a two-stage stochastic equilibrium investment model by endogenising continuous, duration-based capacity accreditation for storage and apply it to a Great Britain-based case using 40 years of weather-driven demand and renewable profiles under varying emission limits. Results show that well-calibrated CMs can sustain near-efficient investment and mitigate revenue volatility, but their effectiveness diminishes in deeply decarbonized systems, underscoring both their potential and the regulatory challenges of supporting large-scale LDES.
Real-Time Tracking Antenna System for Moving Targets
This paper presents the design and implementation of a compact, cost-effective phased array antenna system. It is capable of real-time beam-steering for dynamic target-tracking applications. The system employs a 4$\times$4 rectangular microstrip patch array, utilizing advanced beamforming techniques and a Direction of Arrival (DoA) estimation algorithm. It achieves $\pm 42^{\circ}$ wide-angle scanning in both azimuth and elevation planes. The design emphasizes a balance between high angular coverage and consistent gain performance. This makes it suitable for wireless tracking, radar, and satellite communication terminals. Fabricated on Rogers 6010.2LM substrate, the system demonstrates reproducibility and scalability. All components are sourced locally to ensure practical deployment. The system is built using commercially available components, highlighting its affordability for research and prototyping purposes.
AERO-LQG: Aerial-Enabled Robust Optimization for LQG-Based Quadrotor Flight Controller
Quadrotors are indispensable in civilian, industrial, and military domains, undertaking complex, high-precision tasks once reserved for specialized systems. Across all contexts, energy efficiency remains a critical constraint: quadrotors must reconcile the high power demands of agility with the minimal consumption required for extended endurance. Meeting this trade-off calls for mode-specific optimization frameworks that adapt to diverse mission profiles. At their core lie optimal control policies defining error functions whose minimization yields robust, mission-tailored performance. While solutions are straightforward for fixed weight matrices, selecting those weights is a far greater challenge-lacking analytical guidance and thus relying on exhaustive or stochastic search. This interdependence can be framed as a bi-level optimization problem, with the outer loop determining weights a priori. This work introduces an aerial-enabled robust optimization for LQG tuning (AERO-LQG), a framework employing evolutionary strategy to fine-tune LQG weighting parameters. Applied to the linearized hovering mode of quadrotor flight, AERO-LQG achieves performance gains of several tens of percent, underscoring its potential for enabling high-performance, energy-efficient quadrotor control. The project is available at GitHub.
comment: 2 tables, 8 figures
The Epistemic Support-Point Filter (ESPF): A Bounded Possibilistic Framework for Ordinal State Estimation
Traditional state estimation methods rely on probabilistic assumptions that often collapse epistemic uncertainty into scalar beliefs, risking overconfidence in sparse or adversarial sensing environments. We introduce the Epistemic Support-Point Filter (ESPF), a novel non-Bayesian filtering framework fully grounded in possibility theory and epistemic humility. ESPF redefines the evolution of belief over state space using compatibility-weighted support updates, surprisalaware pruning, and adaptive dispersion via sparse grid quadrature. Unlike conventional filters, ESPF does not seek a posterior distribution, but rather maintains a structured region of plausibility or non-rejection, updated using ordinal logic rather than integration. For multi-model inference, we employ the Choquet integral to fuse competing hypotheses based on a dynamic epistemic capacity function, generalizing classical winner-take-all strategies. The result is an inference engine capable of dynamically contracting or expanding belief support in direct response to information structure, without requiring prior statistical calibration. This work presents a foundational shift in how inference, evidence, and ignorance are reconciled, supporting robust estimation where priors are unavailable, misleading, or epistemically unjustified.
An Efficient Data-Driven Framework for Linear Quadratic Output Feedback Control
Linear quadratic regulator with unmeasurable states and unknown system matrix parameters better aligns with practical scenarios. However, for this problem, balancing the optimality of the resulting controller and the leniency of the algorithm's feasibility conditions remains a non-trivial challenge, as no well-established general method has yet been developed to address this trade-off. To address this gap, this study first develops a comprehensive theoretical framework for state parameterization that equivalently substitutes for unknown states. By analyzing the controllability of consistent systems satisfied by substitute states, this framework quantifies the capability of substitute state data matrices to parameterize unknown closed-loop systems and output feedback controllers, thereby constructing a modified state parameterization form that meets the complete data parameterization condition of Willems' Fundamental Lemma. Leveraging this framework, this study proposes efficient model-free off-policy policy iteration and value iteration algorithms with theoretical guarantees to solve for the optimal output feedback controller. Compared with existing studies, particularly for multi-output problems where existing model-free reinforcement learning algorithms may fail, the proposed method removes redundant information in substitute states and the additional full row rank condition on regression matrices, thereby ensuring the solution of optimal output feedback controllers equivalent to optimal state feedback controllers for multi-output systems. Furthermore, this study pioneers a comprehensive and highly scalable theoretical analysis of state parameterization from a data-driven viewpoint, and the proposed algorithms exhibit significant advantages in implementation conditions, data demand, unknown handling, and convergence speed.
Non-expert to Expert Motion Translation Using Generative Adversarial Networks
Decreasing skilled workers is a very serious problem in the world. To deal with this problem, the skill transfer from experts to robots has been researched. These methods which teach robots by human motion are called imitation learning. Experts' skills generally appear in not only position data, but also force data. Thus, position and force data need to be saved and reproduced. To realize this, a lot of research has been conducted in the framework of a motion-copying system. Recent research uses machine learning methods to generate motion commands. However, most of them could not change tasks by following human intention. Some of them can change tasks by conditional training, but the labels are limited. Thus, we propose the flexible motion translation method by using Generative Adversarial Networks. The proposed method enables users to teach robots tasks by inputting data, and skills by a trained model. We evaluated the proposed system with a 3-DOF calligraphy robot.
Balancing Profit and Traveller Acceptance in Ride-Pooling Personalised Fares
Ride-pooling systems, to succeed, must provide an attractive service, namely compensate perceived costs with an appealing price. However, because of a strong heterogeneity in a value-of-time, each traveller has his own acceptable price, unknown to the operator. Here, we show that individual acceptance levels can be learned by the operator (over $90\%$ accuracy for pooled travellers in $10$ days) to optimise personalised fares. We propose an adaptive pricing policy, where every day the operator constructs an offer that progressively meets travellers' expectations and attracts a growing demand. Our results suggest that operators, by learning behavioural traits of individual travellers, may improve performance not only for travellers (increased utility) but also for themselves (increased profit). Moreover, such knowledge allows the operator to remove inefficient pooled rides and focus on attractive and profitable combinations.
Multi-cluster distributed optimization in open multi-agent systems over directed graphs with acknowledgement messages
In this paper, we tackle the problem of distributed optimization over directed networks in open multi-agent systems (OMAS), where agents may dynamically join or leave, causing persistent changes in network topology and problem dimension. These disruptions not only pose significant challenges to maintaining convergence and stability in distributed optimization algorithms, but could also break the network topology into multiple clusters, each one associated with its own set of objective functions. To address this, we propose a novel Open Distributed Optimization Algorithm with Gradient Tracking (OPEN-GT), which employs: (a) a dynamic mechanism for detecting active out-neighbors through acknowledgement messages, and (b) a fully distributed max-consensus procedure to spread information regarding agent departures, in possibly unbalanced directed networks. We show that when all active agents execute OPEN-GT, the optimization process in each formed cluster remains consistent, while the agents converge to their cluster-wide optimal solution if there exists a time after which the network remains unchanged. Finally, we validate our approach in a simulated environment with dynamically changing agent populations, demonstrating its resilience to network variations and its ability to support distributed optimization under OMAS dynamics.
Bridging Finite and Infinite-Horizon Nash Equilibria in Linear Quadratic Games
Finite-horizon linear quadratic (LQ) games admit a unique Nash equilibrium, while infinite-horizon settings may have multiple. We clarify the relationship between these two cases by interpreting the finite-horizon equilibrium as a nonlinear dynamical system. Within this framework, we prove that its fixed points are exactly the infinite-horizon equilibria and that any such equilibrium can be recovered by an appropriate choice of terminal costs. We further show that periodic orbits of the dynamical system, when they arise, correspond to periodic Nash equilibria, and we provide numerical evidence of convergence to such cycles. Finally, simulations reveal three asymptotic regimes: convergence to stationary equilibria, convergence to periodic equilibria, and bounded non-convergent trajectories. These findings offer new insights and tools for tuning finite-horizon LQ games using infinite-horizon.
Optimistic vs Pessimistic Uncertainty Model Unfalsification
We present a novel, input-output data-driven approach to uncertainty model identification. As the true bounds and distributions of system uncertainties ultimately remain unknown, we depart from the goal of identifying the uncertainty model and instead look for minimal concrete statements that can be made based on an uncertain system model and available input-output data. We refer to this as unfalsifying an uncertainty model. Two different unfalsification approaches are taken. The optimistic approach determines the smallest uncertainties that could explain the given data, while the pessimistic approach finds the largest possible uncertainties suggested by the data. The pessimistic problem is revealed to be a semi-infinite program, which is solved using the local reduction algorithm. It is also shown that the optimistic and pessimistic approaches to uncertainty model unfalsification are mathematical duals. Finally, both approaches are tested using an uncertain linear model with data from a simulated nonlinear system.
comment: 6 pages, 3 figures. Accepted for publication at the 2025 IEEE Conference on Decision and Control (CDC 2025)
Physics-Constrained Machine Learning for Chemical Engineering
Physics-constrained machine learning (PCML) combines physical models with data-driven approaches to improve reliability, generalizability, and interpretability. Although PCML has shown significant benefits in diverse scientific and engineering domains, technical and intellectual challenges hinder its applicability in complex chemical engineering applications. Key difficulties include determining the amount and type of physical knowledge to embed, designing effective fusion strategies with ML, scaling models to large datasets and simulators, and quantifying predictive uncertainty. This perspective summarizes recent developments and highlights challenges/opportunities in applying PCML to chemical engineering, emphasizing on closed-loop experimental design, real-time dynamics and control, and handling of multi-scale phenomena.
A Proposal for Yield Improvement with Power Tradeoffs in CMOS LNAs (English Version)
This paper studies an architecture with digitally controllable gain and power consumption to mitigate the impact of process variations on CMOS low-noise amplifiers (LNAs). A \SI{130}{nm}, \SI{1.2}{V} LNA implementing the proposed architecture is designed based on an analysis of variability in traditional LNAs under different bias currents and on the corresponding effects on the performance of a complete receiver. Two different adjustment strategies are evaluated, both of which are compatible with previously reported built-in self-test (BIST) circuits. Results show that the proposed architecture enables yield enhancement while keeping low-power operation compared with traditional LNAs.
comment: English version of paper originally published in Spanish
Minimizing AoI in Mobile Edge Computing: Nested Index Policy with Preemptive and Non-preemptive Structure
Mobile Edge Computing (MEC) leverages computational heterogeneity between mobile devices and edge nodes to enable real-time applications requiring high information freshness. The Age-of-Information (AoI) metric serves as a crucial evaluator of information timeliness in such systems. Addressing AoI minimization in multi-user MEC environments presents significant challenges due to stochastic computing times. In this paper, we consider multiple users offloading tasks to heterogeneous edge servers in an MEC system, focusing on preemptive and non-preemptive task scheduling mechanisms. The problem is first reformulated as a Restless Multi-Arm Bandit (RMAB) problem, with a multi-layer Markov Decision Process (MDP) framework established to characterize AoI dynamics in the MEC system. Based on the multi-layer MDP, we propose a nested index framework and design a nested index policy with provably asymptotic optimality. This establishes a theoretical framework adaptable to various scheduling mechanisms, achieving efficient optimization through state stratification and index design in both preemptive and non-preemptive modes. Finally, the closed-form of the nested index is derived, facilitating performance trade-offs between computational complexity and accuracy while ensuring the universal applicability of the nested index policy across both scheduling modes. The experimental results show that in non-preemptive scheduling, compared with the benchmark method, the optimality gap is reduced by 25.43%, while in preemptive scheduling, the gap has reduced by 61.84%. As the system scale increases, it asymptotically converges in two scheduling modes and especially provides near-optimal performance in non-preemptive structure.
comment: 23 pages, 11 figures, 2 tables
DMPC-Swarm: Distributed Model Predictive Control on Nano UAV swarms
Swarms of unmanned aerial vehicles (UAVs) are increasingly becoming vital to our society, undertaking tasks such as search and rescue, surveillance and delivery. A special variant of Distributed Model Predictive Control (DMPC) has emerged as a promising approach for the safe management of these swarms by combining the scalability of distributed computation with dynamic swarm motion control. In this DMPC method, multiple agents solve local optimization problems with coupled anti-collision constraints, periodically exchanging their solutions. Despite its potential, existing methodologies using this DMPC variant have yet to be deployed on distributed hardware that fully utilize true distributed computation and wireless communication. This is primarily due to the lack of a communication system tailored to meet the unique requirements of mobile swarms and an architecture that supports distributed computation while adhering to the payload constraints of UAVs. We present DMPC-SWARM, a new swarm control methodology that integrates an efficient, stateless low-power wireless communication protocol with a novel DMPC algorithm that provably avoids UAV collisions even under message loss. By utilizing event-triggered and distributed off-board computing, DMPC-SWARM supports nano UAVs, allowing them to benefit from additional computational resources while retaining scalability and fault tolerance. In a detailed theoretical analysis, we prove that DMPC-SWARM guarantees collision avoidance under realistic conditions, including communication delays and message loss. Finally, we present DMPC-SWARM's implementation on a swarm of up to 16 nano-quadcopters, demonstrating the first realization of these DMPC variants with computation distributed on multiple physical devices interconnected by a real wireless mesh networks. A video showcasing DMPC-SWARM is available at http://tiny.cc/DMPCSwarm.
Transient Stability Analysis of a Hybrid Grid-Forming and Grid-Following RES System Considering Multi-Mode Control Switching
The inherent control switching of renewable energy sources (RESs) during intricate transient processes introduces complexity to the dynamic behavior of modern power systems. This paper reveals the dynamic coupling between grid forming (GFM)/grid following (GFL)-based RES and dominant instability modes of the hybrid system. First, six control combinations are systematically investigated by pairing the two GFM-RES modes, normal control (NC) and current saturation (CS), with the three GFL-RES modes: normal control, low voltage ride-through (LVRT), and high voltage ride-through (HVRT). Based on switching system theory, the coupled power flow and dynamic motion models are developed considering multi-mode switching characteristics. It is revealed that the hybrid system exhibits two distinct instability modes when the GFM-RES and GFL-RES exceed their P-f and V-f desynchronization boundaries, respectively. The two-dimensional spatiotemporal damping characteristics of GFL-RES induced by GFM-RES are also uncovered for the first time. A novel criterion is proposed to quantify the impact of GFM-RES on GFL-RES dynamics, capturing both its stabilizing and destabilizing effects under different control combinations. High-fidelity electromagnetic transient simulations validate the correctness of the analysis framework.
Local Observability of a Class of Feedforward Neural Networks
Beyond the traditional neural network training methods based on gradient descent and its variants, state estimation techniques have been proposed to determine a set of ideal weights from a control-theoretic perspective. Hence, the concept of observability becomes relevant in neural network training. In this paper, we investigate local observability of a class of two-layer feedforward neural networks~(FNNs) with rectified linear unit~(ReLU) activation functions. We analyze local observability of FNNs by evaluating an observability rank condition with respect to the weight matrix and the input sequence. First, we show that, in general, the weights of FNNs are not locally observable. Then, we provide sufficient conditions on the network structures and the weights that lead to local observability. Moreover, we propose an input design approach to render the weights distinguishable and show that this input also excites other weights inside a neighborhood. Finally, we validate our results through a numerical example.
comment: 6 pages, 1 figure
Adaptive Control of Heterogeneous Platoons with Guaranteed Collision Avoidance
This work proposes a framework for Cooperative Adaptive Cruise Control of a vehicular platoon characterized by unidirectional communication and heterogeneous parameters. In the proposed framework, the actual (heterogeneous) platoon is made to converge to a reference (homogeneous) platoon via adaptive laws designed using of set-theoretic model reference adaptive control. Yet, in contrast to the state-of-art that is based on ensuring collision avoidance on the reference platoon dynamics only, the approach we propose can ensure collision avoidance on the actual platoon dynamics. This result is possible thanks to the introduction of a novel concept of virtual platoon, only used for analysis, but that does not interact with the actual platoon. The stability and convergence properties of the proposed framework are established using Lyapunov-based analysis in conjunction with the aforementioned virtual platoon concept.
Joint Contact Planning for Navigation and Communication in GNSS-Libration Point Systems
Deploying satellites at Earth-Moon Libration Points (LPs) addresses the inherent deep-space coverage gaps of low-altitude GNSS constellations. Integrating LP satellites with GNSS into a joint constellation enables a more robust and comprehensive Positioning, Navigation, and Timing (PNT) system, while also extending navigation and communication services to spacecraft operating in cislunar space (i.e., users). However, the long propagation delays between LP satellites, users, and GNSS satellites result in significantly different link durations compared to those within the GNSS constellation. Scheduling inter-satellite links (ISLs) is a core task of Contact Plan Design (CPD). Existing CPD approaches focus exclusively on GNSS constellations, assuming uniform link durations, and thus cannot accommodate the heterogeneous link timescales present in a joint GNSS-LP system. To overcome this limitation, we introduce a Joint CPD (J-CPD) scheme tailored to handle ISLs with differing duration units across integrated constellations. The key contributions of J-CPD are: (i):introduction of LongSlots (Earth-Moon scale links) and ShortSlots (GNSS-scale links); (ii):a hierarchical and crossed CPD process for scheduling LongSlots and ShortSlots ISLs; (iii):an energy-driven link scheduling algorithm adapted to the CPD process. Simulations on a joint BeiDou-LP constellation demonstrate that J-CPD surpasses the baseline FCP method in both delay and ranging coverage, while maintaining high user satisfaction and enabling tunable trade-offs through adjustable potential-energy parameters. To our knowledge, this is the first CPD framework to jointly optimize navigation and communication in GNSS-LP systems, representing a key step toward unified and resilient deep-space PNT architectures.
comment: 15 pages, 8 figures
Learning Fast, Tool aware Collision Avoidance for Collaborative Robots
Ensuring safe and efficient operation of collaborative robots in human environments is challenging, especially in dynamic settings where both obstacle motion and tasks change over time. Current robot controllers typically assume full visibility and fixed tools, which can lead to collisions or overly conservative behavior. In our work, we introduce a tool-aware collision avoidance system that adjusts in real time to different tool sizes and modes of tool-environment interaction. Using a learned perception model, our system filters out robot and tool components from the point cloud, reasons about occluded area, and predicts collision under partial observability. We then use a control policy trained via constrained reinforcement learning to produce smooth avoidance maneuvers in under 10 milliseconds. In simulated and real-world tests, our approach outperforms traditional approaches (APF, MPPI) in dynamic environments, while maintaining sub-millimeter accuracy. Moreover, our system operates with approximately 60% lower computational cost compared to a state-of-the-art GPU-based planner. Our approach provides modular, efficient, and effective collision avoidance for robots operating in dynamic environments. We integrate our method into a collaborative robot application and demonstrate its practical use for safe and responsive operation.
MegaCacheX: Towards Cost-Effective Hierarchical Collaborative Content Caching in Emerging Mega-Constellations
Significant latency in global content delivery primarily arises from insufficient terrestrial infrastructure. Deploying space-based content delivery networks within emerging mega-constellations provides an effective means to bridge the digital divide. However, space-based caching faces constraints from physical-layer dynamics, including dynamic topologies, time-varying inter-satellite link conditions, and limited onboard energy. In addition, existing mechanisms often lack fine-grained content categorization and global optimization. This paper proposes MegaCacheX, a cost-effective hierarchical framework for collaborative content distribution that achieves "Earth-independence" by providing cloud services directly from space. Specifically, data centers in Sun-synchronous orbit act as primary content sources, while caching nodes in mega-constellations and ground stations collaboratively form a distributed edge layer. MegaCacheX optimizes caching strategies by integrating content popularity, regional user distribution, and satellite trajectory predictions. Multi-tier caching nodes serve as service anchors, enabling seamless content delivery with low latency. A prototype implemented on a microservices-based, containerized testbed demonstrates that MegaCacheX reduces global content access latency by about 36% compared to baseline approaches, while maintaining cost efficiency.
Bootstrap Policy Iteration for Stochastic LQ Tracking with Multiplicative Noise
This paper studies the optimal tracking control problem for continuous-time stochastic linear systems with multiplicative noise. The solution framework involves solving a stochastic algebraic Riccati equation for the feedback gain and a Sylvester equation for the feedforward gain. To enable model-free optimal tracking, we first develop a two-phase bootstrap policy iteration (B-PI) algorithm, which bootstraps a stabilizing control gain from the trivially initialized zero-value start and proceeds with standard policy iteration. Building on this algorithm, we propose a data-driven, off-policy reinforcement learning approach that ensures convergence to the optimal feedback gain under the interval excitation condition. We further introduce a data-driven method to compute the feedforward using the obtained feedback gain. Additionally, for systems with state-dependent noise, we propose a shadow system-based optimal tracking method to eliminate the need for probing noise. The effectiveness of the proposed methods is demonstrated through numerical examples.
Delay-adaptive Control of Nonlinear Systems with Approximate Neural Operator Predictors
In this work, we propose a rigorous method for implementing predictor feedback controllers in nonlinear systems with unknown and arbitrarily long actuator delays. To address the analytically intractable nature of the predictor, we approximate it using a learned neural operator mapping. This mapping is trained once, offline, and then deployed online, leveraging the fast inference capabilities of neural networks. We provide a theoretical stability analysis based on the universal approximation theorem of neural operators and the transport partial differential equation (PDE) representation of the delay. We then prove, via a Lyapunov-Krasovskii functional, semi-global practical convergence of the dynamical system dependent on the approximation error of the predictor and delay bounds. Finally, we validate our theoretical results using a biological activator/repressor system, demonstrating speedups of 15 times compared to traditional numerical methods.
comment: 9 pages, 1 Figure
Understanding Incremental Learning with Closed-form Solution to Gradient Flow on Overparamerterized Matrix Factorization
Many theoretical studies on neural networks attribute their excellent empirical performance to the implicit bias or regularization induced by first-order optimization algorithms when training networks under certain initialization assumptions. One example is the incremental learning phenomenon in gradient flow (GF) on an overparamerterized matrix factorization problem with small initialization: GF learns a target matrix by sequentially learning its singular values in decreasing order of magnitude over time. In this paper, we develop a quantitative understanding of this incremental learning behavior for GF on the symmetric matrix factorization problem, using its closed-form solution obtained by solving a Riccati-like matrix differential equation. We show that incremental learning emerges from some time-scale separation among dynamics corresponding to learning different components in the target matrix. By decreasing the initialization scale, these time-scale separations become more prominent, allowing one to find low-rank approximations of the target matrix. Lastly, we discuss the possible avenues for extending this analysis to asymmetric matrix factorization problems.
comment: Accepted to CDC 2025
Systolic Array-based Architecture for Low-Bit Integerized Vision Transformers
Transformer-based models are becoming more and more intelligent and are revolutionizing a wide range of human tasks. To support their deployment, AI labs offer inference services that consume hundreds of GWh of energy annually and charge users based on the number of tokens processed. Under this cost model, minimizing power consumption and maximizing throughput have become key design goals for the inference hardware. While graphics processing units (GPUs) are commonly used, their flexibility comes at the cost of low operational intensity and limited efficiency, especially under the high query-per-model ratios of modern inference services. In this work, we address these challenges by proposing a low-bit, model-specialized accelerator that strategically selects tasks with high operation (OP) reuse and minimal communication overhead for offloading. Our design incorporates multiple systolic arrays with deep, fine-grained pipelines and array-compatible units that support essential operations in multi-head self-attention (MSA) module. At the accelerator-level, each self-attention (SA) head is pipelined within a single accelerator to increase data reuse and further minimize bandwidth. Our 3-bit integerized model achieves 96.83% accuracy on CIFAR-10 and 77.81% top-1 accuracy on ImageNet. We validate the hardware design on a 16nm FPGA (Alveo U250), where it delivers 13,568 GigaOps/second (GOPs/s) and 219.4 GOPs/s/W. Compared to a same-technology GPU (GTX 1080), our design offers 1.50x higher throughput and 4.47x better power efficiency. Even against a state-of-the-art GPU (RTX 5090), we still achieve 20% better power efficiency despite having 87% lower throughput.
comment: 13 pages, 16 figures, 6 tables
$H_\infty$ Performance Analysis for Almost Periodic Piecewise Linear Systems with Application to Roll-to-Roll Manufacturing Control
An almost periodic piecewise linear system (APPLS) is a type of piecewise linear system where the system cyclically switches between different modes, each with an uncertain but bounded dwell-time. Process regulation, especially disturbance rejection, is critical to the performance of these advanced systems. However, a method to guarantee disturbance rejection has not been developed. The objective of this study is to develop an $H_\infty$ performance analysis method for APPLSs, building on which an algorithm to synthesize practical $H_\infty$ controllers is proposed. As an application, the developed methods are demonstrated with an advanced manufacturing system -- roll-to-roll (R2R) dry transfer of two-dimensional materials and printed flexible electronics. Experimental results show that the proposed method enables a less conservative and much better performing $H_\infty$ controller compared with a baseline $H_\infty$ controller that does not account for the uncertain system switching structure.
comment: 11 pages, 11 figures
Observer Design for Optical Flow-Based Visual-Inertial Odometry with Almost-Global Convergence
This paper presents a novel cascaded observer architecture that combines optical flow and IMU measurements to perform continuous monocular visual-inertial odometry (VIO). The proposed solution estimates body-frame velocity and gravity direction simultaneously by fusing velocity direction information from optical flow measurements with gyro and accelerometer data. This fusion is achieved using a globally exponentially stable Riccati observer, which operates under persistently exciting translational motion conditions. The estimated gravity direction in the body frame is then employed, along with an optional magnetometer measurement, to design a complementary observer on $\mathbf{SO}(3)$ for attitude estimation. The resulting interconnected observer architecture is shown to be almost globally asymptotically stable. To extract the velocity direction from sparse optical flow data, a gradient descent algorithm is developed to solve a constrained minimization problem on the unit sphere. The effectiveness of the proposed algorithms is validated through simulation results.
comment: 8 pages, 6 figures. To appear in IEEE CDC 2025
Traffic State Estimation in Congestion to Extend Applicability of DFOS
This paper presents a traffic state estimation (TSE) method in congestion for distributed fiber-optic sensing (DFOS). DFOS detects vehicle driving vibrations along the optical fiber and obtains their trajectories in the spatiotemporal plane. From these trajectories, DFOS provides mean velocities for real-time spatially continuous traffic monitoring without dead zones. However, when vehicle vibration intensities are insufficiently low due to slow speed, trajectories cannot be obtained, leading to missing values in mean velocity data. It restricts DFOS applicability in severe congestion. Therefore, this paper proposes a missing value imputation method based on data assimilation. Our proposed method is validated on two expressways in Japan with the reference data. The results show that the mean absolute error (MAE) of the imputed mean velocities to the reference increases only by 1.5 km/h as compared with the MAE of non-missing values. This study enhances the wide-range applicability of DFOS in practical cases.
comment: 11 pages, 7 figures, presented in the 31st ITS World Congress
HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling
Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis, or simulators, which do not model associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter configurations and scheduling decisions on the physical assets, even before deployment, or regarching changes not easily realizable in production. We (1) provide the first digital twin framework extended with scheduling capabilities, (2) integrate various top-tier HPC systems given their publicly available datasets, (3) implement extensions to integrate external scheduling simulators. Finally, we show how to (4) implement and evaluate incentive structures, as-well-as (5) evaluate machine learning based scheduling, in such novel digital-twin based meta-framework to prototype scheduling. Our work enables what-if scenarios of HPC systems to evaluate sustainability, and the impact on the simulated system.
Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions
We present an inverse dynamic game-based algorithm to learn parametric constraints from a given dataset of local generalized Nash equilibrium interactions between multiple agents. Specifically, we introduce mixed-integer linear programs (MILP) encoding the Karush-Kuhn-Tucker (KKT) conditions of the interacting agents, which recover constraints consistent with the Nash stationarity of the interaction demonstrations. We establish theoretical guarantees that our method learns inner approximations of the true safe and unsafe sets, as well as limitations of constraint learnability from demonstrations of Nash equilibrium interactions. We also use the interaction constraints recovered by our method to design motion plans that robustly satisfy the underlying constraints. Across simulations and hardware experiments, our methods proved capable of inferring constraints and designing interactive motion plans for various classes of constraints, both convex and non-convex, from interaction demonstrations of agents with nonlinear dynamics.
Low-Cost Architecture and Efficient Pattern Synthesis for Polarimetric Phased Array Based on Polarization Coding Reconfigurable Elements
Polarimetric phased arrays (PPAs) enhance radar target detection and anti-jamming capabilities. However, the dual transmit/receive (T/R) channel requirement leads to high costs and system complexity. To address this, this paper introduces a polarization-coding reconfigurable phased array (PCRPA) and associated pattern synthesis techniques to reduce PPA costs while minimizing performance degradation. Each PCRPA element connects to a single T/R channel and incorporates two-level RF switches for real-time control of polarization states and waveforms. By adjusting element codes and excitation weights, the PCRPA can generate arbitrarily polarized and dual-polarized beams. Efficient beam pattern synthesis methods are also proposed, featuring novel optimization constraints derived from theoretical and analytical analysis of PCRPAs. Simulations demonstrate that the approach achieves low cross-polarization and sidelobe levels comparable to conventional architectures within the scan range, particularly for large arrays. However, the channel reduction inevitably incurs power and directivity loss. Experiments conducted on an $8\times 8$ X-band array antenna validate the effectiveness of the proposed system. The PCRPA and synthesis methods are well-suited for large-scale PPA systems, offering significant cost-effectiveness while maintaining good sidelobe suppression and polarization control performance.
Consensus Seminorms and their Applications
Consensus is a well-studied problem in distributed sensing, computation and control, yet deriving useful and easily computable bounds on the rate of convergence to consensus remains a challenge. This paper discusses the use of seminorms for this goal. A previously suggested family of seminorms is revisited, and an error made in their original presentation is corrected, where it was claimed that the a certain seminorm is equal to the well-known coefficient of ergodicity. Next, a wider family of seminorms is introduced, and it is shown that contraction in any of these seminorms guarantees convergence at an exponential rate of infinite products of matrices, generalizing known results on stochastic matrices to the class of matrices whose row sums are all equal one. Finally, it is shown that such seminorms cannot be used to bound the rate of convergence of classes larger than the well-known class of scrambling matrices.
A Symmetry-Preserving Reduced-Order Observer
A symmetry-preserving, reduced-order state observer is presented for the unmeasured part of a system's state, where the nonlinear system dynamics exhibit symmetry under the action of a Lie group. Leveraging this symmetry with a moving frame, the observer dynamics are constructed such that they are invariant under the Lie group's action. Sufficient conditions for the observer to be asymptotically stable are developed by studying the stability of an invariant error system. As an illustrative example, the observer is applied to the problem of rigid-body velocity estimation, which demonstrates how exploiting the symmetry of the system can simplify the stabilization of the estimation error dynamics.
comment: 7 pages, 6 figures, Published in the Proceedings of the 2025 American Control Conference (ACC)
Canonical Bayesian Linear System Identification
Standard Bayesian approaches for linear time-invariant (LTI) system identification are hindered by parameter non-identifiability; the resulting complex, multi-modal posteriors make inference inefficient and impractical. We solve this problem by embedding canonical forms of LTI systems within the Bayesian framework. We rigorously establish that inference in these minimal parameterizations fully captures all invariant system dynamics (e.g., transfer functions, eigenvalues, predictive distributions of system outputs) while resolving identifiability. This approach unlocks the use of meaningful, structure-aware priors (e.g., enforcing stability via eigenvalues) and ensures conditions for a Bernstein--von Mises theorem -- a link between Bayesian and frequentist large-sample asymptotics that is broken in standard forms. Extensive simulations with modern MCMC methods highlight advantages over standard parameterizations: canonical forms achieve higher computational efficiency, generate interpretable and well-behaved posteriors, and provide robust uncertainty estimates, particularly from limited data.
comment: 46 pages, 9 figures
TGOSPA Metric Parameters Selection and Evaluation for Visual Multi-object Tracking
Multi-object tracking algorithms are deployed in various applications, each with different performance requirements. For example, track switches pose significant challenges for offline scene understanding, as they hinder the accuracy of data interpretation. Conversely, in online surveillance applications, their impact is often minimal. This disparity underscores the need for application-specific performance evaluations that are both simple and mathematically sound. The trajectory generalized optimal sub-pattern assignment (TGOSPA) metric offers a principled approach to evaluate multi-object tracking performance. It accounts for localization errors, the number of missed and false objects, and the number of track switches, providing a comprehensive assessment framework. This paper illustrates the effective use of the TGOSPA metric in computer vision tasks, addressing challenges posed by the need for application-specific scoring methodologies. By exploring the TGOSPA parameter selection, we enable users to compare, comprehend, and optimize the performance of algorithms tailored for specific tasks, such as target tracking and training of detector or re-ID modules.
comment: Submitted to Springer International Journal of Computer Vision
Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)
This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially ``sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances and specific knowledge of the model. To our knowledge in the quadrotor control field, DiD-L is the first online learning approach that is lightweight enough to run in real-time at 400 Hz on microcontroller units (MCUs) such as STM32, and has been validated through real-world experiments.
comment: v2: Corrected methodology naming typo; provided TeX source files
A Hybrid Artificial Intelligence Method for Estimating Flicker in Power Systems (Changes are marked)
This paper introduces a novel hybrid AI method combining H filtering and an adaptive linear neuron network for flicker component estimation in power distribution systems.The proposed method leverages the robustness of the H filter to extract the voltage envelope under uncertain and noisy conditions followed by the use of ADALINE to accurately identify flicker frequencies embedded in the envelope.This synergy enables efficient time domain estimation with rapid convergence and noise resilience addressing key limitations of existing frequency domain approaches.Unlike conventional techniques this hybrid AI model handles complex power disturbances without prior knowledge of noise characteristics or extensive training.To validate the method performance we conduct simulation studies based on IEC Standard 61000 4 15 supported by statistical analysis Monte Carlo simulations and real world data.Results demonstrate superior accuracy robustness and reduced computational load compared to Fast Fourier Transform and Discrete Wavelet Transform based estimators.
comment: 31 pages, 12 figures, and 6 tables
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
Optimized Contact Plan Design for Reflector and Phased Array Terminals in Cislunar Space Networks
Cislunar space is emerging as a critical domain for human exploration, requiring robust infrastructure to support spatial users-spacecraft with navigation and communication demands. Deploying satellites at Earth-Moon three-body orbits offers an effective solution to construct cislunar space infrastructure (CLSI). However, scheduling satellite links to serve users necessitates an appropriate contact plan design (CPD) scheme. Existing CPD schemes focus solely on inter-satellite link scheduling, overlooking their role in providing services to users. This paper introduces a CPD scheme that considers two classes of satellite transponders: Reflector Links (RL) for high-volume data transfers and Phased Array Links (PL) for fast switching and navigation services. Our approach supports both satellites and spatial users in cislunar space. Simulations validate the scheme, demonstrating effective support for user while meeting satellite ranging and communication requirements. These findings provide essential insights for developing future Cislunar Space Infrastructures.
comment: 12 pages, 9 figures
Model-based Multi-object Visual Tracking: Identification and Standard Model Limitations
This paper uses multi-object tracking methods known from the radar tracking community to address the problem of pedestrian tracking using 2D bounding box detections. The standard point-object (SPO) model is adopted, and the posterior density is computed using the Poisson multi-Bernoulli mixture (PMBM) filter. The selection of the model parameters rooted in continuous time is discussed, including the birth and survival probabilities. Some parameters are selected from the first principles, while others are identified from the data, which is, in this case, the publicly available MOT-17 dataset. Although the resulting PMBM algorithm yields promising results, a mismatch between the SPO model and the data is revealed. The model-based approach assumes that modifying the problematic components causing the SPO model-data mismatch will lead to better model-based algorithms in future developments.
comment: Accepted for publication in 2025 28th International Conference on Information Fusion (FUSION)
Coevolution of Opinion Dynamics and Recommendation System: Modeling, Analysis and Reinforcement Learning Based Manipulation
In this work, we develop an analytical framework that integrates opinion dynamics with a recommendation system. By incorporating elements such as collaborative filtering, we provide a precise characterization of how recommendation systems shape interpersonal interactions and influence opinion formation. Moreover, the property of the coevolution of both opinion dynamics and recommendation systems is also shown. Specifically, the convergence of this coevolutionary system is theoretically proved, and the mechanisms behind filter bubble formation are elucidated. Our analysis of the maximum number of opinion clusters shows how recommendation system parameters affect opinion grouping and polarization. Additionally, we incorporate the influence of propagators into our model and propose a reinforcement learning-based solution. The analysis and the propagation solution are demonstrated in simulations using the Yelp data set.
Enhanced Trust Region Sequential Convex Optimization for Multi-Drone Thermal Screening Trajectory Planning in Urban Environments
The rapid detection of abnormal body temperatures in urban populations is essential for managing public health risks, especially during outbreaks of infectious diseases. Multi-drone thermal screening systems offer promising solutions for fast, large-scale, and non-intrusive human temperature monitoring. However, trajectory planning for multiple drones in complex urban environments poses significant challenges, including collision avoidance, coverage efficiency, and constrained flight environments. In this study, we propose an enhanced trust region sequential convex optimization (TR-SCO) algorithm for optimal trajectory planning of multiple drones performing thermal screening tasks. Our improved algorithm integrates a refined convex optimization formulation within a trust region framework, effectively balancing trajectory smoothness, obstacle avoidance, altitude constraints, and maximum screening coverage. Simulation results demonstrate that our approach significantly improves trajectory optimality and computational efficiency compared to conventional convex optimization methods. This research provides critical insights and practical contributions toward deploying efficient multi-drone systems for real-time thermal screening in urban areas. For reader who are interested in our research, we release our source code at https://github.com/Cherry0302/Enhanced-TR-SCO.
Fixed-Time Input-to-State Stability for Singularly Perturbed Systems via Composite Lyapunov Functions
We study singularly perturbed systems that exhibit input-to-state stability (ISS) with fixed-time properties in the presence of bounded disturbances. In these systems, solutions converge to the origin within a time frame independent of initial conditions when undisturbed, and to a vicinity of the origin when subjected to bounded disturbances. First, we extend the traditional composite Lyapunov method, commonly applied in singular perturbation theory to analyze asymptotic stability, to include fixed-time ISS. We demonstrate that if both the reduced system and the boundary layer system exhibit fixed-time ISS, and if certain interconnection conditions are met, the entire multi-time scale system retains this fixed-time ISS characteristic, provided the separation of time scales is sufficiently pronounced. Next, we illustrate our findings via analytical and numerical examples, including a novel application in fixed-time feedback optimization for dynamic plants with slowly varying cost functions.
Matrix Control Barrier Functions
This paper generalizes the control barrier function framework by replacing scalar-valued functions with matrix-valued ones. Specifically, we develop barrier conditions for safe sets defined by matrix inequalities -- both semidefinite and indefinite. Matrix inequalities can be used to describe a richer class of safe sets, including nonsmooth ones. The safety filters constructed from our proposed matrix control barrier functions via semidefinite programming (CBF-SDP) are shown to be continuous. Our matrix formulation naturally provides a continuous safety filter for Boolean-based control barrier functions, notably for disjunctions (OR), without relaxing the safe set. We illustrate the effectiveness of the proposed framework with applications in drone network connectivity maintenance and nonsmooth obstacle avoidance, both in simulations and hardware experiments.
comment: 13 pages, 4 figures, submitted to the IEEE Transactions on Automatic Control
PUB: A Plasma-Propelled Ultra-Quiet Blimp with Two-DOF Vector Thrusting
This study presents the design and control of a Plasma-propelled Ultra-silence Blimp (PUB), a novel aerial robot employing plasma vector propulsion for ultra-quiet flight without mechanical propellers. The system utilizes a helium-lift platform for extended endurance and a four-layer ring asymmetric capacitor to generate ionic wind thrust. The modular propulsion units allow flexible configuration to meet mission-specific requirements, while a two-degree-of-freedom (DOF) head enables thrust vector control. A closed-loop slip control scheme is implemented for stable maneuvering. Flight experiments demonstrate full-envelope capability, including take-off, climb, hover, descent, and smooth landing, confirming the feasibility of plasma vector propulsion, the effectiveness of DOF vector control, and the stability of the control system. Owing to its low acoustic signature, structural simplicity, and high maneuverability, PUB is well suited for noise-sensitive, enclosed, and near-space applications.
A novel switched systems approach to nonconvex optimisation
We develop a novel switching dynamics that converges to the Karush-Kuhn-Tucker (KKT) point of a nonlinear optimisation problem. This new approach is particularly notable for its lower dimensionality compared to conventional primal-dual dynamics, as it focuses exclusively on estimating the primal variable. Our method is successfully illustrated on general quadratic optimisation problems, the minimisation of the classical Rosenbrock function, and a nonconvex optimisation problem stemming from the control of energy-efficient buildings.
Systems and Control (EESS)
Finite-Time Guarantees for Multi-Agent Combinatorial Bandits with Nonstationary Rewards
We study a sequential resource allocation problem where a decision maker selects subsets of agents at each period to maximize overall outcomes without prior knowledge of individual-level effects. Our framework applies to settings such as community health interventions, targeted digital advertising, and workforce retention programs, where intervention effects evolve dynamically. Agents may exhibit habituation (diminished response from frequent selection) or recovery (enhanced response from infrequent selection). The technical challenge centers on nonstationary reward distributions that lead to changing intervention effects over time. The problem requires balancing two key competing objectives: heterogeneous individual rewards and the exploration-exploitation tradeoff in terms of learning for improved future decisions as opposed to maximizing immediate outcomes. Our contribution introduces the first framework incorporating this form of nonstationary rewards in the combinatorial multi-armed bandit literature. We develop algorithms with theoretical guarantees on dynamic regret and demonstrate practical efficacy through a diabetes intervention case study. Our personalized community intervention algorithm achieved up to three times as much improvement in program enrollment compared to baseline approaches, validating the framework's potential for real-world applications. This work bridges theoretical advances in adaptive learning with practical challenges in population-level behavioral change interventions.
comment: 41 pages, 8 figures
Missing Money and Market-Based Adequacy in Deeply Decarbonized Power Systems with Long-Duration Energy Storage
The ability of deeply decarbonised power systems to ensure adequacy may increasingly depend on long-duration energy storage (LDES). A central challenge is whether capacity markets (CMs), originally designed around thermal generation, can provide efficient investment signals when storage becomes a central participant. While recent studies have advanced methods for accrediting variable renewables and short-duration storage, the effectiveness of these methods in CMs with substantial LDES penetration remains largely unexplored. To address this gap, we extend a two-stage stochastic equilibrium investment model by endogenising continuous, duration-based capacity accreditation for storage and apply it to a Great Britain-based case using 40 years of weather-driven demand and renewable profiles under varying emission limits. Results show that well-calibrated CMs can sustain near-efficient investment and mitigate revenue volatility, but their effectiveness diminishes in deeply decarbonized systems, underscoring both their potential and the regulatory challenges of supporting large-scale LDES.
Real-Time Tracking Antenna System for Moving Targets
This paper presents the design and implementation of a compact, cost-effective phased array antenna system. It is capable of real-time beam-steering for dynamic target-tracking applications. The system employs a 4$\times$4 rectangular microstrip patch array, utilizing advanced beamforming techniques and a Direction of Arrival (DoA) estimation algorithm. It achieves $\pm 42^{\circ}$ wide-angle scanning in both azimuth and elevation planes. The design emphasizes a balance between high angular coverage and consistent gain performance. This makes it suitable for wireless tracking, radar, and satellite communication terminals. Fabricated on Rogers 6010.2LM substrate, the system demonstrates reproducibility and scalability. All components are sourced locally to ensure practical deployment. The system is built using commercially available components, highlighting its affordability for research and prototyping purposes.
AERO-LQG: Aerial-Enabled Robust Optimization for LQG-Based Quadrotor Flight Controller
Quadrotors are indispensable in civilian, industrial, and military domains, undertaking complex, high-precision tasks once reserved for specialized systems. Across all contexts, energy efficiency remains a critical constraint: quadrotors must reconcile the high power demands of agility with the minimal consumption required for extended endurance. Meeting this trade-off calls for mode-specific optimization frameworks that adapt to diverse mission profiles. At their core lie optimal control policies defining error functions whose minimization yields robust, mission-tailored performance. While solutions are straightforward for fixed weight matrices, selecting those weights is a far greater challenge-lacking analytical guidance and thus relying on exhaustive or stochastic search. This interdependence can be framed as a bi-level optimization problem, with the outer loop determining weights a priori. This work introduces an aerial-enabled robust optimization for LQG tuning (AERO-LQG), a framework employing evolutionary strategy to fine-tune LQG weighting parameters. Applied to the linearized hovering mode of quadrotor flight, AERO-LQG achieves performance gains of several tens of percent, underscoring its potential for enabling high-performance, energy-efficient quadrotor control. The project is available at GitHub.
comment: 2 tables, 8 figures
The Epistemic Support-Point Filter (ESPF): A Bounded Possibilistic Framework for Ordinal State Estimation
Traditional state estimation methods rely on probabilistic assumptions that often collapse epistemic uncertainty into scalar beliefs, risking overconfidence in sparse or adversarial sensing environments. We introduce the Epistemic Support-Point Filter (ESPF), a novel non-Bayesian filtering framework fully grounded in possibility theory and epistemic humility. ESPF redefines the evolution of belief over state space using compatibility-weighted support updates, surprisalaware pruning, and adaptive dispersion via sparse grid quadrature. Unlike conventional filters, ESPF does not seek a posterior distribution, but rather maintains a structured region of plausibility or non-rejection, updated using ordinal logic rather than integration. For multi-model inference, we employ the Choquet integral to fuse competing hypotheses based on a dynamic epistemic capacity function, generalizing classical winner-take-all strategies. The result is an inference engine capable of dynamically contracting or expanding belief support in direct response to information structure, without requiring prior statistical calibration. This work presents a foundational shift in how inference, evidence, and ignorance are reconciled, supporting robust estimation where priors are unavailable, misleading, or epistemically unjustified.
An Efficient Data-Driven Framework for Linear Quadratic Output Feedback Control
Linear quadratic regulator with unmeasurable states and unknown system matrix parameters better aligns with practical scenarios. However, for this problem, balancing the optimality of the resulting controller and the leniency of the algorithm's feasibility conditions remains a non-trivial challenge, as no well-established general method has yet been developed to address this trade-off. To address this gap, this study first develops a comprehensive theoretical framework for state parameterization that equivalently substitutes for unknown states. By analyzing the controllability of consistent systems satisfied by substitute states, this framework quantifies the capability of substitute state data matrices to parameterize unknown closed-loop systems and output feedback controllers, thereby constructing a modified state parameterization form that meets the complete data parameterization condition of Willems' Fundamental Lemma. Leveraging this framework, this study proposes efficient model-free off-policy policy iteration and value iteration algorithms with theoretical guarantees to solve for the optimal output feedback controller. Compared with existing studies, particularly for multi-output problems where existing model-free reinforcement learning algorithms may fail, the proposed method removes redundant information in substitute states and the additional full row rank condition on regression matrices, thereby ensuring the solution of optimal output feedback controllers equivalent to optimal state feedback controllers for multi-output systems. Furthermore, this study pioneers a comprehensive and highly scalable theoretical analysis of state parameterization from a data-driven viewpoint, and the proposed algorithms exhibit significant advantages in implementation conditions, data demand, unknown handling, and convergence speed.
Non-expert to Expert Motion Translation Using Generative Adversarial Networks
Decreasing skilled workers is a very serious problem in the world. To deal with this problem, the skill transfer from experts to robots has been researched. These methods which teach robots by human motion are called imitation learning. Experts' skills generally appear in not only position data, but also force data. Thus, position and force data need to be saved and reproduced. To realize this, a lot of research has been conducted in the framework of a motion-copying system. Recent research uses machine learning methods to generate motion commands. However, most of them could not change tasks by following human intention. Some of them can change tasks by conditional training, but the labels are limited. Thus, we propose the flexible motion translation method by using Generative Adversarial Networks. The proposed method enables users to teach robots tasks by inputting data, and skills by a trained model. We evaluated the proposed system with a 3-DOF calligraphy robot.
Balancing Profit and Traveller Acceptance in Ride-Pooling Personalised Fares
Ride-pooling systems, to succeed, must provide an attractive service, namely compensate perceived costs with an appealing price. However, because of a strong heterogeneity in a value-of-time, each traveller has his own acceptable price, unknown to the operator. Here, we show that individual acceptance levels can be learned by the operator (over $90\%$ accuracy for pooled travellers in $10$ days) to optimise personalised fares. We propose an adaptive pricing policy, where every day the operator constructs an offer that progressively meets travellers' expectations and attracts a growing demand. Our results suggest that operators, by learning behavioural traits of individual travellers, may improve performance not only for travellers (increased utility) but also for themselves (increased profit). Moreover, such knowledge allows the operator to remove inefficient pooled rides and focus on attractive and profitable combinations.
Multi-cluster distributed optimization in open multi-agent systems over directed graphs with acknowledgement messages
In this paper, we tackle the problem of distributed optimization over directed networks in open multi-agent systems (OMAS), where agents may dynamically join or leave, causing persistent changes in network topology and problem dimension. These disruptions not only pose significant challenges to maintaining convergence and stability in distributed optimization algorithms, but could also break the network topology into multiple clusters, each one associated with its own set of objective functions. To address this, we propose a novel Open Distributed Optimization Algorithm with Gradient Tracking (OPEN-GT), which employs: (a) a dynamic mechanism for detecting active out-neighbors through acknowledgement messages, and (b) a fully distributed max-consensus procedure to spread information regarding agent departures, in possibly unbalanced directed networks. We show that when all active agents execute OPEN-GT, the optimization process in each formed cluster remains consistent, while the agents converge to their cluster-wide optimal solution if there exists a time after which the network remains unchanged. Finally, we validate our approach in a simulated environment with dynamically changing agent populations, demonstrating its resilience to network variations and its ability to support distributed optimization under OMAS dynamics.
Bridging Finite and Infinite-Horizon Nash Equilibria in Linear Quadratic Games
Finite-horizon linear quadratic (LQ) games admit a unique Nash equilibrium, while infinite-horizon settings may have multiple. We clarify the relationship between these two cases by interpreting the finite-horizon equilibrium as a nonlinear dynamical system. Within this framework, we prove that its fixed points are exactly the infinite-horizon equilibria and that any such equilibrium can be recovered by an appropriate choice of terminal costs. We further show that periodic orbits of the dynamical system, when they arise, correspond to periodic Nash equilibria, and we provide numerical evidence of convergence to such cycles. Finally, simulations reveal three asymptotic regimes: convergence to stationary equilibria, convergence to periodic equilibria, and bounded non-convergent trajectories. These findings offer new insights and tools for tuning finite-horizon LQ games using infinite-horizon.
Optimistic vs Pessimistic Uncertainty Model Unfalsification
We present a novel, input-output data-driven approach to uncertainty model identification. As the true bounds and distributions of system uncertainties ultimately remain unknown, we depart from the goal of identifying the uncertainty model and instead look for minimal concrete statements that can be made based on an uncertain system model and available input-output data. We refer to this as unfalsifying an uncertainty model. Two different unfalsification approaches are taken. The optimistic approach determines the smallest uncertainties that could explain the given data, while the pessimistic approach finds the largest possible uncertainties suggested by the data. The pessimistic problem is revealed to be a semi-infinite program, which is solved using the local reduction algorithm. It is also shown that the optimistic and pessimistic approaches to uncertainty model unfalsification are mathematical duals. Finally, both approaches are tested using an uncertain linear model with data from a simulated nonlinear system.
comment: 6 pages, 3 figures. Accepted for publication at the 2025 IEEE Conference on Decision and Control (CDC 2025)
Physics-Constrained Machine Learning for Chemical Engineering
Physics-constrained machine learning (PCML) combines physical models with data-driven approaches to improve reliability, generalizability, and interpretability. Although PCML has shown significant benefits in diverse scientific and engineering domains, technical and intellectual challenges hinder its applicability in complex chemical engineering applications. Key difficulties include determining the amount and type of physical knowledge to embed, designing effective fusion strategies with ML, scaling models to large datasets and simulators, and quantifying predictive uncertainty. This perspective summarizes recent developments and highlights challenges/opportunities in applying PCML to chemical engineering, emphasizing on closed-loop experimental design, real-time dynamics and control, and handling of multi-scale phenomena.
A Proposal for Yield Improvement with Power Tradeoffs in CMOS LNAs (English Version)
This paper studies an architecture with digitally controllable gain and power consumption to mitigate the impact of process variations on CMOS low-noise amplifiers (LNAs). A \SI{130}{nm}, \SI{1.2}{V} LNA implementing the proposed architecture is designed based on an analysis of variability in traditional LNAs under different bias currents and on the corresponding effects on the performance of a complete receiver. Two different adjustment strategies are evaluated, both of which are compatible with previously reported built-in self-test (BIST) circuits. Results show that the proposed architecture enables yield enhancement while keeping low-power operation compared with traditional LNAs.
comment: English version of paper originally published in Spanish
Minimizing AoI in Mobile Edge Computing: Nested Index Policy with Preemptive and Non-preemptive Structure
Mobile Edge Computing (MEC) leverages computational heterogeneity between mobile devices and edge nodes to enable real-time applications requiring high information freshness. The Age-of-Information (AoI) metric serves as a crucial evaluator of information timeliness in such systems. Addressing AoI minimization in multi-user MEC environments presents significant challenges due to stochastic computing times. In this paper, we consider multiple users offloading tasks to heterogeneous edge servers in an MEC system, focusing on preemptive and non-preemptive task scheduling mechanisms. The problem is first reformulated as a Restless Multi-Arm Bandit (RMAB) problem, with a multi-layer Markov Decision Process (MDP) framework established to characterize AoI dynamics in the MEC system. Based on the multi-layer MDP, we propose a nested index framework and design a nested index policy with provably asymptotic optimality. This establishes a theoretical framework adaptable to various scheduling mechanisms, achieving efficient optimization through state stratification and index design in both preemptive and non-preemptive modes. Finally, the closed-form of the nested index is derived, facilitating performance trade-offs between computational complexity and accuracy while ensuring the universal applicability of the nested index policy across both scheduling modes. The experimental results show that in non-preemptive scheduling, compared with the benchmark method, the optimality gap is reduced by 25.43%, while in preemptive scheduling, the gap has reduced by 61.84%. As the system scale increases, it asymptotically converges in two scheduling modes and especially provides near-optimal performance in non-preemptive structure.
comment: 23 pages, 11 figures, 2 tables
DMPC-Swarm: Distributed Model Predictive Control on Nano UAV swarms
Swarms of unmanned aerial vehicles (UAVs) are increasingly becoming vital to our society, undertaking tasks such as search and rescue, surveillance and delivery. A special variant of Distributed Model Predictive Control (DMPC) has emerged as a promising approach for the safe management of these swarms by combining the scalability of distributed computation with dynamic swarm motion control. In this DMPC method, multiple agents solve local optimization problems with coupled anti-collision constraints, periodically exchanging their solutions. Despite its potential, existing methodologies using this DMPC variant have yet to be deployed on distributed hardware that fully utilize true distributed computation and wireless communication. This is primarily due to the lack of a communication system tailored to meet the unique requirements of mobile swarms and an architecture that supports distributed computation while adhering to the payload constraints of UAVs. We present DMPC-SWARM, a new swarm control methodology that integrates an efficient, stateless low-power wireless communication protocol with a novel DMPC algorithm that provably avoids UAV collisions even under message loss. By utilizing event-triggered and distributed off-board computing, DMPC-SWARM supports nano UAVs, allowing them to benefit from additional computational resources while retaining scalability and fault tolerance. In a detailed theoretical analysis, we prove that DMPC-SWARM guarantees collision avoidance under realistic conditions, including communication delays and message loss. Finally, we present DMPC-SWARM's implementation on a swarm of up to 16 nano-quadcopters, demonstrating the first realization of these DMPC variants with computation distributed on multiple physical devices interconnected by a real wireless mesh networks. A video showcasing DMPC-SWARM is available at http://tiny.cc/DMPCSwarm.
Transient Stability Analysis of a Hybrid Grid-Forming and Grid-Following RES System Considering Multi-Mode Control Switching
The inherent control switching of renewable energy sources (RESs) during intricate transient processes introduces complexity to the dynamic behavior of modern power systems. This paper reveals the dynamic coupling between grid forming (GFM)/grid following (GFL)-based RES and dominant instability modes of the hybrid system. First, six control combinations are systematically investigated by pairing the two GFM-RES modes, normal control (NC) and current saturation (CS), with the three GFL-RES modes: normal control, low voltage ride-through (LVRT), and high voltage ride-through (HVRT). Based on switching system theory, the coupled power flow and dynamic motion models are developed considering multi-mode switching characteristics. It is revealed that the hybrid system exhibits two distinct instability modes when the GFM-RES and GFL-RES exceed their P-f and V-f desynchronization boundaries, respectively. The two-dimensional spatiotemporal damping characteristics of GFL-RES induced by GFM-RES are also uncovered for the first time. A novel criterion is proposed to quantify the impact of GFM-RES on GFL-RES dynamics, capturing both its stabilizing and destabilizing effects under different control combinations. High-fidelity electromagnetic transient simulations validate the correctness of the analysis framework.
Local Observability of a Class of Feedforward Neural Networks
Beyond the traditional neural network training methods based on gradient descent and its variants, state estimation techniques have been proposed to determine a set of ideal weights from a control-theoretic perspective. Hence, the concept of observability becomes relevant in neural network training. In this paper, we investigate local observability of a class of two-layer feedforward neural networks~(FNNs) with rectified linear unit~(ReLU) activation functions. We analyze local observability of FNNs by evaluating an observability rank condition with respect to the weight matrix and the input sequence. First, we show that, in general, the weights of FNNs are not locally observable. Then, we provide sufficient conditions on the network structures and the weights that lead to local observability. Moreover, we propose an input design approach to render the weights distinguishable and show that this input also excites other weights inside a neighborhood. Finally, we validate our results through a numerical example.
comment: 6 pages, 1 figure
Adaptive Control of Heterogeneous Platoons with Guaranteed Collision Avoidance
This work proposes a framework for Cooperative Adaptive Cruise Control of a vehicular platoon characterized by unidirectional communication and heterogeneous parameters. In the proposed framework, the actual (heterogeneous) platoon is made to converge to a reference (homogeneous) platoon via adaptive laws designed using of set-theoretic model reference adaptive control. Yet, in contrast to the state-of-art that is based on ensuring collision avoidance on the reference platoon dynamics only, the approach we propose can ensure collision avoidance on the actual platoon dynamics. This result is possible thanks to the introduction of a novel concept of virtual platoon, only used for analysis, but that does not interact with the actual platoon. The stability and convergence properties of the proposed framework are established using Lyapunov-based analysis in conjunction with the aforementioned virtual platoon concept.
Joint Contact Planning for Navigation and Communication in GNSS-Libration Point Systems
Deploying satellites at Earth-Moon Libration Points (LPs) addresses the inherent deep-space coverage gaps of low-altitude GNSS constellations. Integrating LP satellites with GNSS into a joint constellation enables a more robust and comprehensive Positioning, Navigation, and Timing (PNT) system, while also extending navigation and communication services to spacecraft operating in cislunar space (i.e., users). However, the long propagation delays between LP satellites, users, and GNSS satellites result in significantly different link durations compared to those within the GNSS constellation. Scheduling inter-satellite links (ISLs) is a core task of Contact Plan Design (CPD). Existing CPD approaches focus exclusively on GNSS constellations, assuming uniform link durations, and thus cannot accommodate the heterogeneous link timescales present in a joint GNSS-LP system. To overcome this limitation, we introduce a Joint CPD (J-CPD) scheme tailored to handle ISLs with differing duration units across integrated constellations. The key contributions of J-CPD are: (i):introduction of LongSlots (Earth-Moon scale links) and ShortSlots (GNSS-scale links); (ii):a hierarchical and crossed CPD process for scheduling LongSlots and ShortSlots ISLs; (iii):an energy-driven link scheduling algorithm adapted to the CPD process. Simulations on a joint BeiDou-LP constellation demonstrate that J-CPD surpasses the baseline FCP method in both delay and ranging coverage, while maintaining high user satisfaction and enabling tunable trade-offs through adjustable potential-energy parameters. To our knowledge, this is the first CPD framework to jointly optimize navigation and communication in GNSS-LP systems, representing a key step toward unified and resilient deep-space PNT architectures.
comment: 15 pages, 8 figures
Learning Fast, Tool aware Collision Avoidance for Collaborative Robots
Ensuring safe and efficient operation of collaborative robots in human environments is challenging, especially in dynamic settings where both obstacle motion and tasks change over time. Current robot controllers typically assume full visibility and fixed tools, which can lead to collisions or overly conservative behavior. In our work, we introduce a tool-aware collision avoidance system that adjusts in real time to different tool sizes and modes of tool-environment interaction. Using a learned perception model, our system filters out robot and tool components from the point cloud, reasons about occluded area, and predicts collision under partial observability. We then use a control policy trained via constrained reinforcement learning to produce smooth avoidance maneuvers in under 10 milliseconds. In simulated and real-world tests, our approach outperforms traditional approaches (APF, MPPI) in dynamic environments, while maintaining sub-millimeter accuracy. Moreover, our system operates with approximately 60% lower computational cost compared to a state-of-the-art GPU-based planner. Our approach provides modular, efficient, and effective collision avoidance for robots operating in dynamic environments. We integrate our method into a collaborative robot application and demonstrate its practical use for safe and responsive operation.
MegaCacheX: Towards Cost-Effective Hierarchical Collaborative Content Caching in Emerging Mega-Constellations
Significant latency in global content delivery primarily arises from insufficient terrestrial infrastructure. Deploying space-based content delivery networks within emerging mega-constellations provides an effective means to bridge the digital divide. However, space-based caching faces constraints from physical-layer dynamics, including dynamic topologies, time-varying inter-satellite link conditions, and limited onboard energy. In addition, existing mechanisms often lack fine-grained content categorization and global optimization. This paper proposes MegaCacheX, a cost-effective hierarchical framework for collaborative content distribution that achieves "Earth-independence" by providing cloud services directly from space. Specifically, data centers in Sun-synchronous orbit act as primary content sources, while caching nodes in mega-constellations and ground stations collaboratively form a distributed edge layer. MegaCacheX optimizes caching strategies by integrating content popularity, regional user distribution, and satellite trajectory predictions. Multi-tier caching nodes serve as service anchors, enabling seamless content delivery with low latency. A prototype implemented on a microservices-based, containerized testbed demonstrates that MegaCacheX reduces global content access latency by about 36% compared to baseline approaches, while maintaining cost efficiency.
Bootstrap Policy Iteration for Stochastic LQ Tracking with Multiplicative Noise
This paper studies the optimal tracking control problem for continuous-time stochastic linear systems with multiplicative noise. The solution framework involves solving a stochastic algebraic Riccati equation for the feedback gain and a Sylvester equation for the feedforward gain. To enable model-free optimal tracking, we first develop a two-phase bootstrap policy iteration (B-PI) algorithm, which bootstraps a stabilizing control gain from the trivially initialized zero-value start and proceeds with standard policy iteration. Building on this algorithm, we propose a data-driven, off-policy reinforcement learning approach that ensures convergence to the optimal feedback gain under the interval excitation condition. We further introduce a data-driven method to compute the feedforward using the obtained feedback gain. Additionally, for systems with state-dependent noise, we propose a shadow system-based optimal tracking method to eliminate the need for probing noise. The effectiveness of the proposed methods is demonstrated through numerical examples.
Delay-adaptive Control of Nonlinear Systems with Approximate Neural Operator Predictors
In this work, we propose a rigorous method for implementing predictor feedback controllers in nonlinear systems with unknown and arbitrarily long actuator delays. To address the analytically intractable nature of the predictor, we approximate it using a learned neural operator mapping. This mapping is trained once, offline, and then deployed online, leveraging the fast inference capabilities of neural networks. We provide a theoretical stability analysis based on the universal approximation theorem of neural operators and the transport partial differential equation (PDE) representation of the delay. We then prove, via a Lyapunov-Krasovskii functional, semi-global practical convergence of the dynamical system dependent on the approximation error of the predictor and delay bounds. Finally, we validate our theoretical results using a biological activator/repressor system, demonstrating speedups of 15 times compared to traditional numerical methods.
comment: 9 pages, 1 Figure
Understanding Incremental Learning with Closed-form Solution to Gradient Flow on Overparamerterized Matrix Factorization
Many theoretical studies on neural networks attribute their excellent empirical performance to the implicit bias or regularization induced by first-order optimization algorithms when training networks under certain initialization assumptions. One example is the incremental learning phenomenon in gradient flow (GF) on an overparamerterized matrix factorization problem with small initialization: GF learns a target matrix by sequentially learning its singular values in decreasing order of magnitude over time. In this paper, we develop a quantitative understanding of this incremental learning behavior for GF on the symmetric matrix factorization problem, using its closed-form solution obtained by solving a Riccati-like matrix differential equation. We show that incremental learning emerges from some time-scale separation among dynamics corresponding to learning different components in the target matrix. By decreasing the initialization scale, these time-scale separations become more prominent, allowing one to find low-rank approximations of the target matrix. Lastly, we discuss the possible avenues for extending this analysis to asymmetric matrix factorization problems.
comment: Accepted to CDC 2025
Systolic Array-based Architecture for Low-Bit Integerized Vision Transformers
Transformer-based models are becoming more and more intelligent and are revolutionizing a wide range of human tasks. To support their deployment, AI labs offer inference services that consume hundreds of GWh of energy annually and charge users based on the number of tokens processed. Under this cost model, minimizing power consumption and maximizing throughput have become key design goals for the inference hardware. While graphics processing units (GPUs) are commonly used, their flexibility comes at the cost of low operational intensity and limited efficiency, especially under the high query-per-model ratios of modern inference services. In this work, we address these challenges by proposing a low-bit, model-specialized accelerator that strategically selects tasks with high operation (OP) reuse and minimal communication overhead for offloading. Our design incorporates multiple systolic arrays with deep, fine-grained pipelines and array-compatible units that support essential operations in multi-head self-attention (MSA) module. At the accelerator-level, each self-attention (SA) head is pipelined within a single accelerator to increase data reuse and further minimize bandwidth. Our 3-bit integerized model achieves 96.83% accuracy on CIFAR-10 and 77.81% top-1 accuracy on ImageNet. We validate the hardware design on a 16nm FPGA (Alveo U250), where it delivers 13,568 GigaOps/second (GOPs/s) and 219.4 GOPs/s/W. Compared to a same-technology GPU (GTX 1080), our design offers 1.50x higher throughput and 4.47x better power efficiency. Even against a state-of-the-art GPU (RTX 5090), we still achieve 20% better power efficiency despite having 87% lower throughput.
comment: 13 pages, 16 figures, 6 tables
$H_\infty$ Performance Analysis for Almost Periodic Piecewise Linear Systems with Application to Roll-to-Roll Manufacturing Control
An almost periodic piecewise linear system (APPLS) is a type of piecewise linear system where the system cyclically switches between different modes, each with an uncertain but bounded dwell-time. Process regulation, especially disturbance rejection, is critical to the performance of these advanced systems. However, a method to guarantee disturbance rejection has not been developed. The objective of this study is to develop an $H_\infty$ performance analysis method for APPLSs, building on which an algorithm to synthesize practical $H_\infty$ controllers is proposed. As an application, the developed methods are demonstrated with an advanced manufacturing system -- roll-to-roll (R2R) dry transfer of two-dimensional materials and printed flexible electronics. Experimental results show that the proposed method enables a less conservative and much better performing $H_\infty$ controller compared with a baseline $H_\infty$ controller that does not account for the uncertain system switching structure.
comment: 11 pages, 11 figures
Observer Design for Optical Flow-Based Visual-Inertial Odometry with Almost-Global Convergence
This paper presents a novel cascaded observer architecture that combines optical flow and IMU measurements to perform continuous monocular visual-inertial odometry (VIO). The proposed solution estimates body-frame velocity and gravity direction simultaneously by fusing velocity direction information from optical flow measurements with gyro and accelerometer data. This fusion is achieved using a globally exponentially stable Riccati observer, which operates under persistently exciting translational motion conditions. The estimated gravity direction in the body frame is then employed, along with an optional magnetometer measurement, to design a complementary observer on $\mathbf{SO}(3)$ for attitude estimation. The resulting interconnected observer architecture is shown to be almost globally asymptotically stable. To extract the velocity direction from sparse optical flow data, a gradient descent algorithm is developed to solve a constrained minimization problem on the unit sphere. The effectiveness of the proposed algorithms is validated through simulation results.
comment: 8 pages, 6 figures. To appear in IEEE CDC 2025
Traffic State Estimation in Congestion to Extend Applicability of DFOS
This paper presents a traffic state estimation (TSE) method in congestion for distributed fiber-optic sensing (DFOS). DFOS detects vehicle driving vibrations along the optical fiber and obtains their trajectories in the spatiotemporal plane. From these trajectories, DFOS provides mean velocities for real-time spatially continuous traffic monitoring without dead zones. However, when vehicle vibration intensities are insufficiently low due to slow speed, trajectories cannot be obtained, leading to missing values in mean velocity data. It restricts DFOS applicability in severe congestion. Therefore, this paper proposes a missing value imputation method based on data assimilation. Our proposed method is validated on two expressways in Japan with the reference data. The results show that the mean absolute error (MAE) of the imputed mean velocities to the reference increases only by 1.5 km/h as compared with the MAE of non-missing values. This study enhances the wide-range applicability of DFOS in practical cases.
comment: 11 pages, 7 figures, presented in the 31st ITS World Congress
HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling
Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis, or simulators, which do not model associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter configurations and scheduling decisions on the physical assets, even before deployment, or regarching changes not easily realizable in production. We (1) provide the first digital twin framework extended with scheduling capabilities, (2) integrate various top-tier HPC systems given their publicly available datasets, (3) implement extensions to integrate external scheduling simulators. Finally, we show how to (4) implement and evaluate incentive structures, as-well-as (5) evaluate machine learning based scheduling, in such novel digital-twin based meta-framework to prototype scheduling. Our work enables what-if scenarios of HPC systems to evaluate sustainability, and the impact on the simulated system.
Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions
We present an inverse dynamic game-based algorithm to learn parametric constraints from a given dataset of local generalized Nash equilibrium interactions between multiple agents. Specifically, we introduce mixed-integer linear programs (MILP) encoding the Karush-Kuhn-Tucker (KKT) conditions of the interacting agents, which recover constraints consistent with the Nash stationarity of the interaction demonstrations. We establish theoretical guarantees that our method learns inner approximations of the true safe and unsafe sets, as well as limitations of constraint learnability from demonstrations of Nash equilibrium interactions. We also use the interaction constraints recovered by our method to design motion plans that robustly satisfy the underlying constraints. Across simulations and hardware experiments, our methods proved capable of inferring constraints and designing interactive motion plans for various classes of constraints, both convex and non-convex, from interaction demonstrations of agents with nonlinear dynamics.
Low-Cost Architecture and Efficient Pattern Synthesis for Polarimetric Phased Array Based on Polarization Coding Reconfigurable Elements
Polarimetric phased arrays (PPAs) enhance radar target detection and anti-jamming capabilities. However, the dual transmit/receive (T/R) channel requirement leads to high costs and system complexity. To address this, this paper introduces a polarization-coding reconfigurable phased array (PCRPA) and associated pattern synthesis techniques to reduce PPA costs while minimizing performance degradation. Each PCRPA element connects to a single T/R channel and incorporates two-level RF switches for real-time control of polarization states and waveforms. By adjusting element codes and excitation weights, the PCRPA can generate arbitrarily polarized and dual-polarized beams. Efficient beam pattern synthesis methods are also proposed, featuring novel optimization constraints derived from theoretical and analytical analysis of PCRPAs. Simulations demonstrate that the approach achieves low cross-polarization and sidelobe levels comparable to conventional architectures within the scan range, particularly for large arrays. However, the channel reduction inevitably incurs power and directivity loss. Experiments conducted on an $8\times 8$ X-band array antenna validate the effectiveness of the proposed system. The PCRPA and synthesis methods are well-suited for large-scale PPA systems, offering significant cost-effectiveness while maintaining good sidelobe suppression and polarization control performance.
Consensus Seminorms and their Applications
Consensus is a well-studied problem in distributed sensing, computation and control, yet deriving useful and easily computable bounds on the rate of convergence to consensus remains a challenge. This paper discusses the use of seminorms for this goal. A previously suggested family of seminorms is revisited, and an error made in their original presentation is corrected, where it was claimed that the a certain seminorm is equal to the well-known coefficient of ergodicity. Next, a wider family of seminorms is introduced, and it is shown that contraction in any of these seminorms guarantees convergence at an exponential rate of infinite products of matrices, generalizing known results on stochastic matrices to the class of matrices whose row sums are all equal one. Finally, it is shown that such seminorms cannot be used to bound the rate of convergence of classes larger than the well-known class of scrambling matrices.
A Symmetry-Preserving Reduced-Order Observer
A symmetry-preserving, reduced-order state observer is presented for the unmeasured part of a system's state, where the nonlinear system dynamics exhibit symmetry under the action of a Lie group. Leveraging this symmetry with a moving frame, the observer dynamics are constructed such that they are invariant under the Lie group's action. Sufficient conditions for the observer to be asymptotically stable are developed by studying the stability of an invariant error system. As an illustrative example, the observer is applied to the problem of rigid-body velocity estimation, which demonstrates how exploiting the symmetry of the system can simplify the stabilization of the estimation error dynamics.
comment: 7 pages, 6 figures, Published in the Proceedings of the 2025 American Control Conference (ACC)
Canonical Bayesian Linear System Identification
Standard Bayesian approaches for linear time-invariant (LTI) system identification are hindered by parameter non-identifiability; the resulting complex, multi-modal posteriors make inference inefficient and impractical. We solve this problem by embedding canonical forms of LTI systems within the Bayesian framework. We rigorously establish that inference in these minimal parameterizations fully captures all invariant system dynamics (e.g., transfer functions, eigenvalues, predictive distributions of system outputs) while resolving identifiability. This approach unlocks the use of meaningful, structure-aware priors (e.g., enforcing stability via eigenvalues) and ensures conditions for a Bernstein--von Mises theorem -- a link between Bayesian and frequentist large-sample asymptotics that is broken in standard forms. Extensive simulations with modern MCMC methods highlight advantages over standard parameterizations: canonical forms achieve higher computational efficiency, generate interpretable and well-behaved posteriors, and provide robust uncertainty estimates, particularly from limited data.
comment: 46 pages, 9 figures
TGOSPA Metric Parameters Selection and Evaluation for Visual Multi-object Tracking
Multi-object tracking algorithms are deployed in various applications, each with different performance requirements. For example, track switches pose significant challenges for offline scene understanding, as they hinder the accuracy of data interpretation. Conversely, in online surveillance applications, their impact is often minimal. This disparity underscores the need for application-specific performance evaluations that are both simple and mathematically sound. The trajectory generalized optimal sub-pattern assignment (TGOSPA) metric offers a principled approach to evaluate multi-object tracking performance. It accounts for localization errors, the number of missed and false objects, and the number of track switches, providing a comprehensive assessment framework. This paper illustrates the effective use of the TGOSPA metric in computer vision tasks, addressing challenges posed by the need for application-specific scoring methodologies. By exploring the TGOSPA parameter selection, we enable users to compare, comprehend, and optimize the performance of algorithms tailored for specific tasks, such as target tracking and training of detector or re-ID modules.
comment: Submitted to Springer International Journal of Computer Vision
Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)
This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially ``sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances and specific knowledge of the model. To our knowledge in the quadrotor control field, DiD-L is the first online learning approach that is lightweight enough to run in real-time at 400 Hz on microcontroller units (MCUs) such as STM32, and has been validated through real-world experiments.
comment: v2: Corrected methodology naming typo; provided TeX source files
A Hybrid Artificial Intelligence Method for Estimating Flicker in Power Systems (Changes are marked)
This paper introduces a novel hybrid AI method combining H filtering and an adaptive linear neuron network for flicker component estimation in power distribution systems.The proposed method leverages the robustness of the H filter to extract the voltage envelope under uncertain and noisy conditions followed by the use of ADALINE to accurately identify flicker frequencies embedded in the envelope.This synergy enables efficient time domain estimation with rapid convergence and noise resilience addressing key limitations of existing frequency domain approaches.Unlike conventional techniques this hybrid AI model handles complex power disturbances without prior knowledge of noise characteristics or extensive training.To validate the method performance we conduct simulation studies based on IEC Standard 61000 4 15 supported by statistical analysis Monte Carlo simulations and real world data.Results demonstrate superior accuracy robustness and reduced computational load compared to Fast Fourier Transform and Discrete Wavelet Transform based estimators.
comment: 31 pages, 12 figures, and 6 tables
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
Optimized Contact Plan Design for Reflector and Phased Array Terminals in Cislunar Space Networks
Cislunar space is emerging as a critical domain for human exploration, requiring robust infrastructure to support spatial users-spacecraft with navigation and communication demands. Deploying satellites at Earth-Moon three-body orbits offers an effective solution to construct cislunar space infrastructure (CLSI). However, scheduling satellite links to serve users necessitates an appropriate contact plan design (CPD) scheme. Existing CPD schemes focus solely on inter-satellite link scheduling, overlooking their role in providing services to users. This paper introduces a CPD scheme that considers two classes of satellite transponders: Reflector Links (RL) for high-volume data transfers and Phased Array Links (PL) for fast switching and navigation services. Our approach supports both satellites and spatial users in cislunar space. Simulations validate the scheme, demonstrating effective support for user while meeting satellite ranging and communication requirements. These findings provide essential insights for developing future Cislunar Space Infrastructures.
comment: 12 pages, 9 figures
Model-based Multi-object Visual Tracking: Identification and Standard Model Limitations
This paper uses multi-object tracking methods known from the radar tracking community to address the problem of pedestrian tracking using 2D bounding box detections. The standard point-object (SPO) model is adopted, and the posterior density is computed using the Poisson multi-Bernoulli mixture (PMBM) filter. The selection of the model parameters rooted in continuous time is discussed, including the birth and survival probabilities. Some parameters are selected from the first principles, while others are identified from the data, which is, in this case, the publicly available MOT-17 dataset. Although the resulting PMBM algorithm yields promising results, a mismatch between the SPO model and the data is revealed. The model-based approach assumes that modifying the problematic components causing the SPO model-data mismatch will lead to better model-based algorithms in future developments.
comment: Accepted for publication in 2025 28th International Conference on Information Fusion (FUSION)
Coevolution of Opinion Dynamics and Recommendation System: Modeling, Analysis and Reinforcement Learning Based Manipulation
In this work, we develop an analytical framework that integrates opinion dynamics with a recommendation system. By incorporating elements such as collaborative filtering, we provide a precise characterization of how recommendation systems shape interpersonal interactions and influence opinion formation. Moreover, the property of the coevolution of both opinion dynamics and recommendation systems is also shown. Specifically, the convergence of this coevolutionary system is theoretically proved, and the mechanisms behind filter bubble formation are elucidated. Our analysis of the maximum number of opinion clusters shows how recommendation system parameters affect opinion grouping and polarization. Additionally, we incorporate the influence of propagators into our model and propose a reinforcement learning-based solution. The analysis and the propagation solution are demonstrated in simulations using the Yelp data set.
Enhanced Trust Region Sequential Convex Optimization for Multi-Drone Thermal Screening Trajectory Planning in Urban Environments
The rapid detection of abnormal body temperatures in urban populations is essential for managing public health risks, especially during outbreaks of infectious diseases. Multi-drone thermal screening systems offer promising solutions for fast, large-scale, and non-intrusive human temperature monitoring. However, trajectory planning for multiple drones in complex urban environments poses significant challenges, including collision avoidance, coverage efficiency, and constrained flight environments. In this study, we propose an enhanced trust region sequential convex optimization (TR-SCO) algorithm for optimal trajectory planning of multiple drones performing thermal screening tasks. Our improved algorithm integrates a refined convex optimization formulation within a trust region framework, effectively balancing trajectory smoothness, obstacle avoidance, altitude constraints, and maximum screening coverage. Simulation results demonstrate that our approach significantly improves trajectory optimality and computational efficiency compared to conventional convex optimization methods. This research provides critical insights and practical contributions toward deploying efficient multi-drone systems for real-time thermal screening in urban areas. For reader who are interested in our research, we release our source code at https://github.com/Cherry0302/Enhanced-TR-SCO.
Fixed-Time Input-to-State Stability for Singularly Perturbed Systems via Composite Lyapunov Functions
We study singularly perturbed systems that exhibit input-to-state stability (ISS) with fixed-time properties in the presence of bounded disturbances. In these systems, solutions converge to the origin within a time frame independent of initial conditions when undisturbed, and to a vicinity of the origin when subjected to bounded disturbances. First, we extend the traditional composite Lyapunov method, commonly applied in singular perturbation theory to analyze asymptotic stability, to include fixed-time ISS. We demonstrate that if both the reduced system and the boundary layer system exhibit fixed-time ISS, and if certain interconnection conditions are met, the entire multi-time scale system retains this fixed-time ISS characteristic, provided the separation of time scales is sufficiently pronounced. Next, we illustrate our findings via analytical and numerical examples, including a novel application in fixed-time feedback optimization for dynamic plants with slowly varying cost functions.
Matrix Control Barrier Functions
This paper generalizes the control barrier function framework by replacing scalar-valued functions with matrix-valued ones. Specifically, we develop barrier conditions for safe sets defined by matrix inequalities -- both semidefinite and indefinite. Matrix inequalities can be used to describe a richer class of safe sets, including nonsmooth ones. The safety filters constructed from our proposed matrix control barrier functions via semidefinite programming (CBF-SDP) are shown to be continuous. Our matrix formulation naturally provides a continuous safety filter for Boolean-based control barrier functions, notably for disjunctions (OR), without relaxing the safe set. We illustrate the effectiveness of the proposed framework with applications in drone network connectivity maintenance and nonsmooth obstacle avoidance, both in simulations and hardware experiments.
comment: 13 pages, 4 figures, submitted to the IEEE Transactions on Automatic Control
PUB: A Plasma-Propelled Ultra-Quiet Blimp with Two-DOF Vector Thrusting
This study presents the design and control of a Plasma-propelled Ultra-silence Blimp (PUB), a novel aerial robot employing plasma vector propulsion for ultra-quiet flight without mechanical propellers. The system utilizes a helium-lift platform for extended endurance and a four-layer ring asymmetric capacitor to generate ionic wind thrust. The modular propulsion units allow flexible configuration to meet mission-specific requirements, while a two-degree-of-freedom (DOF) head enables thrust vector control. A closed-loop slip control scheme is implemented for stable maneuvering. Flight experiments demonstrate full-envelope capability, including take-off, climb, hover, descent, and smooth landing, confirming the feasibility of plasma vector propulsion, the effectiveness of DOF vector control, and the stability of the control system. Owing to its low acoustic signature, structural simplicity, and high maneuverability, PUB is well suited for noise-sensitive, enclosed, and near-space applications.
A novel switched systems approach to nonconvex optimisation
We develop a novel switching dynamics that converges to the Karush-Kuhn-Tucker (KKT) point of a nonlinear optimisation problem. This new approach is particularly notable for its lower dimensionality compared to conventional primal-dual dynamics, as it focuses exclusively on estimating the primal variable. Our method is successfully illustrated on general quadratic optimisation problems, the minimisation of the classical Rosenbrock function, and a nonconvex optimisation problem stemming from the control of energy-efficient buildings.
Robotics
Discrete-Guided Diffusion for Scalable and Safe Multi-Robot Motion Planning
Multi-Robot Motion Planning (MRMP) involves generating collision-free trajectories for multiple robots operating in a shared continuous workspace. While discrete multi-agent path finding (MAPF) methods are broadly adopted due to their scalability, their coarse discretization severely limits trajectory quality. In contrast, continuous optimization-based planners offer higher-quality paths but suffer from the curse of dimensionality, resulting in poor scalability with respect to the number of robots. This paper tackles the limitations of these two approaches by introducing a novel framework that integrates discrete MAPF solvers with constrained generative diffusion models. The resulting framework, called Discrete-Guided Diffusion (DGD), has three key characteristics: (1) it decomposes the original nonconvex MRMP problem into tractable subproblems with convex configuration spaces, (2) it combines discrete MAPF solutions with constrained optimization techniques to guide diffusion models capture complex spatiotemporal dependencies among robots, and (3) it incorporates a lightweight constraint repair mechanism to ensure trajectory feasibility. The proposed method sets a new state-of-the-art performance in large-scale, complex environments, scaling to 100 robots while achieving planning efficiency and high success rates.
HERMES: Human-to-Robot Embodied Learning from Multi-Source Motion Data for Mobile Dexterous Manipulation
Leveraging human motion data to impart robots with versatile manipulation skills has emerged as a promising paradigm in robotic manipulation. Nevertheless, translating multi-source human hand motions into feasible robot behaviors remains challenging, particularly for robots equipped with multi-fingered dexterous hands characterized by complex, high-dimensional action spaces. Moreover, existing approaches often struggle to produce policies capable of adapting to diverse environmental conditions. In this paper, we introduce HERMES, a human-to-robot learning framework for mobile bimanual dexterous manipulation. First, HERMES formulates a unified reinforcement learning approach capable of seamlessly transforming heterogeneous human hand motions from multiple sources into physically plausible robotic behaviors. Subsequently, to mitigate the sim2real gap, we devise an end-to-end, depth image-based sim2real transfer method for improved generalization to real-world scenarios. Furthermore, to enable autonomous operation in varied and unstructured environments, we augment the navigation foundation model with a closed-loop Perspective-n-Point (PnP) localization mechanism, ensuring precise alignment of visual goals and effectively bridging autonomous navigation and dexterous manipulation. Extensive experimental results demonstrate that HERMES consistently exhibits generalizable behaviors across diverse, in-the-wild scenarios, successfully performing numerous complex mobile bimanual dexterous manipulation tasks. Project Page:https:/gemcollector.github.io/HERMES/.
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies
Vision-Language-Action (VLA) models adapt large vision-language backbones to map images and instructions to robot actions. However, prevailing VLA decoders either generate actions autoregressively in a fixed left-to-right order or attach continuous diffusion or flow matching heads outside the backbone, demanding specialized training and iterative sampling that hinder a unified, scalable architecture. We present Discrete Diffusion VLA, a single-transformer policy that models discretized action chunks with discrete diffusion and is trained with the same cross-entropy objective as the VLM backbone. The design retains diffusion's progressive refinement paradigm while remaining natively compatible with the discrete token interface of VLMs. Our method achieves an adaptive decoding order that resolves easy action elements before harder ones and uses secondary remasking to revisit uncertain predictions across refinement rounds, which improves consistency and enables robust error correction. This unified decoder preserves pretrained vision language priors, supports parallel decoding, breaks the autoregressive bottleneck, and reduces the number of function evaluations. Discrete Diffusion VLA achieves 96.3% avg. SR on LIBERO, 71.2% visual matching on SimplerEnv Fractal and 49.3% overall on SimplerEnv Bridge, improving over both autoregressive and continuous diffusion baselines. These findings indicate that discrete-diffusion action decoder supports precise action modeling and consistent training, laying groundwork for scaling VLA to larger models and datasets.
comment: 15 pages
Visio-Verbal Teleimpedance Interface: Enabling Semi-Autonomous Control of Physical Interaction via Eye Tracking and Speech
The paper presents a visio-verbal teleimpedance interface for commanding 3D stiffness ellipsoids to the remote robot with a combination of the operator's gaze and verbal interaction. The gaze is detected by an eye-tracker, allowing the system to understand the context in terms of what the operator is currently looking at in the scene. Along with verbal interaction, a Visual Language Model (VLM) processes this information, enabling the operator to communicate their intended action or provide corrections. Based on these inputs, the interface can then generate appropriate stiffness matrices for different physical interaction actions. To validate the proposed visio-verbal teleimpedance interface, we conducted a series of experiments on a setup including a Force Dimension Sigma.7 haptic device to control the motion of the remote Kuka LBR iiwa robotic arm. The human operator's gaze is tracked by Tobii Pro Glasses 2, while human verbal commands are processed by a VLM using GPT-4o. The first experiment explored the optimal prompt configuration for the interface. The second and third experiments demonstrated different functionalities of the interface on a slide-in-the-groove task.
Long-VLA: Unleashing Long-Horizon Capability of Vision Language Action Model for Robot Manipulation
Vision-Language-Action (VLA) models have become a cornerstone in robotic policy learning, leveraging large-scale multimodal data for robust and scalable control. However, existing VLA frameworks primarily address short-horizon tasks, and their effectiveness on long-horizon, multi-step robotic manipulation remains limited due to challenges in skill chaining and subtask dependencies. In this work, we introduce Long-VLA, the first end-to-end VLA model specifically designed for long-horizon robotic tasks. Our approach features a novel phase-aware input masking strategy that adaptively segments each subtask into moving and interaction phases, enabling the model to focus on phase-relevant sensory cues and enhancing subtask compatibility. This unified strategy preserves the scalability and data efficiency of VLA training, and our architecture-agnostic module can be seamlessly integrated into existing VLA models. We further propose the L-CALVIN benchmark to systematically evaluate long-horizon manipulation. Extensive experiments on both simulated and real-world tasks demonstrate that Long-VLA significantly outperforms prior state-of-the-art methods, establishing a new baseline for long-horizon robotic control.
comment: Accepted to CoRL 2025; Github Page: https://long-vla.github.io
Divide, Discover, Deploy: Factorized Skill Learning with Symmetry and Style Priors
Unsupervised Skill Discovery (USD) allows agents to autonomously learn diverse behaviors without task-specific rewards. While recent USD methods have shown promise, their application to real-world robotics remains underexplored. In this paper, we propose a modular USD framework to address the challenges in the safety, interpretability, and deployability of the learned skills. Our approach employs user-defined factorization of the state space to learn disentangled skill representations. It assigns different skill discovery algorithms to each factor based on the desired intrinsic reward function. To encourage structured morphology-aware skills, we introduce symmetry-based inductive biases tailored to individual factors. We also incorporate a style factor and regularization penalties to promote safe and robust behaviors. We evaluate our framework in simulation using a quadrupedal robot and demonstrate zero-shot transfer of the learned skills to real hardware. Our results show that factorization and symmetry lead to the discovery of structured human-interpretable behaviors, while the style factor and penalties enhance safety and diversity. Additionally, we show that the learned skills can be used for downstream tasks and perform on par with oracle policies trained with hand-crafted rewards.
comment: Accepted to CoRL 2025. For code and videos, please check: https://leggedrobotics.github.io/d3-skill-discovery/
FARM: Frame-Accelerated Augmentation and Residual Mixture-of-Experts for Physics-Based High-Dynamic Humanoid Control
Unified physics-based humanoid controllers are pivotal for robotics and character animation, yet models that excel on gentle, everyday motions still stumble on explosive actions, hampering real-world deployment. We bridge this gap with FARM (Frame-Accelerated Augmentation and Residual Mixture-of-Experts), an end-to-end framework composed of frame-accelerated augmentation, a robust base controller, and a residual mixture-of-experts (MoE). Frame-accelerated augmentation exposes the model to high-velocity pose changes by widening inter-frame gaps. The base controller reliably tracks everyday low-dynamic motions, while the residual MoE adaptively allocates additional network capacity to handle challenging high-dynamic actions, significantly enhancing tracking accuracy. In the absence of a public benchmark, we curate the High-Dynamic Humanoid Motion (HDHM) dataset, comprising 3593 physically plausible clips. On HDHM, FARM reduces the tracking failure rate by 42.8\% and lowers global mean per-joint position error by 14.6\% relative to the baseline, while preserving near-perfect accuracy on low-dynamic motions. These results establish FARM as a new baseline for high-dynamic humanoid control and introduce the first open benchmark dedicated to this challenge. The code and dataset will be released at https://github.com/Colin-Jing/FARM.
A Standing Support Mobility Robot for Enhancing Independence in Elderly Daily Living
This paper presents a standing support mobility robot "Moby" developed to enhance independence and safety for elderly individuals during daily activities such as toilet transfers. Unlike conventional seated mobility aids, the robot maintains users in an upright posture, reducing physical strain, supporting natural social interaction at eye level, and fostering a greater sense of self-efficacy. Moby offers a novel alternative by functioning both passively and with mobility support, enabling users to perform daily tasks more independently. Its main advantages include ease of use, lightweight design, comfort, versatility, and effective sit-to-stand assistance. The robot leverages the Robot Operating System (ROS) for seamless control, featuring manual and autonomous operation modes. A custom control system enables safe and intuitive interaction, while the integration with NAV2 and LiDAR allows for robust navigation capabilities. This paper reviews existing mobility solutions and compares them to Moby, details the robot's design, and presents objective and subjective experimental results using the NASA-TLX method and time comparisons to other methods to validate our design criteria and demonstrate the advantages of our contribution.
comment: 7 pages, accepted work for IEEE RO-MAN2025
APT*: Asymptotically Optimal Motion Planning via Adaptively Prolated Elliptical R-Nearest Neighbors
Optimal path planning aims to determine a sequence of states from a start to a goal while accounting for planning objectives. Popular methods often integrate fixed batch sizes and neglect information on obstacles, which is not problem-specific. This study introduces Adaptively Prolated Trees (APT*), a novel sampling-based motion planner that extends based on Force Direction Informed Trees (FDIT*), integrating adaptive batch-sizing and elliptical $r$-nearest neighbor modules to dynamically modulate the path searching process based on environmental feedback. APT* adjusts batch sizes based on the hypervolume of the informed sets and considers vertices as electric charges that obey Coulomb's law to define virtual forces via neighbor samples, thereby refining the prolate nearest neighbor selection. These modules employ non-linear prolate methods to adaptively adjust the electric charges of vertices for force definition, thereby improving the convergence rate with lower solution costs. Comparative analyses show that APT* outperforms existing single-query sampling-based planners in dimensions from $\mathbb{R}^4$ to $\mathbb{R}^{16}$, and it was further validated through a real-world robot manipulation task. A video showcasing our experimental results is available at: https://youtu.be/gCcUr8LiEw4
Context-Aware Risk Estimation in Home Environments: A Probabilistic Framework for Service Robots
We present a novel framework for estimating accident-prone regions in everyday indoor scenes, aimed at improving real-time risk awareness in service robots operating in human-centric environments. As robots become integrated into daily life, particularly in homes, the ability to anticipate and respond to environmental hazards is crucial for ensuring user safety, trust, and effective human-robot interaction. Our approach models object-level risk and context through a semantic graph-based propagation algorithm. Each object is represented as a node with an associated risk score, and risk propagates asymmetrically from high-risk to low-risk objects based on spatial proximity and accident relationship. This enables the robot to infer potential hazards even when they are not explicitly visible or labeled. Designed for interpretability and lightweight onboard deployment, our method is validated on a dataset with human-annotated risk regions, achieving a binary risk detection accuracy of 75%. The system demonstrates strong alignment with human perception, particularly in scenes involving sharp or unstable objects. These results underline the potential of context-aware risk reasoning to enhance robotic scene understanding and proactive safety behaviors in shared human-robot spaces. This framework could serve as a foundation for future systems that make context-driven safety decisions, provide real-time alerts, or autonomously assist users in avoiding or mitigating hazards within home environments.
comment: 8 pages, Accepted for IEEE RO-MAN 2025 Conference
Tree-Based Grafting Approach for Bidirectional Motion Planning with Local Subsets Optimization IROS 2025
Bidirectional motion planning often reduces planning time compared to its unidirectional counterparts. It requires connecting the forward and reverse search trees to form a continuous path. However, this process could fail and restart the asymmetric bidirectional search due to the limitations of lazy-reverse search. To address this challenge, we propose Greedy GuILD Grafting Trees (G3T*), a novel path planner that grafts invalid edge connections at both ends to re-establish tree-based connectivity, enabling rapid path convergence. G3T* employs a greedy approach using the minimum Lebesgue measure of guided incremental local densification (GuILD) subsets to optimize paths efficiently. Furthermore, G3T* dynamically adjusts the sampling distribution between the informed set and GuILD subsets based on historical and current cost improvements, ensuring asymptotic optimality. These features enhance the forward search's growth towards the reverse tree, achieving faster convergence and lower solution costs. Benchmark experiments across dimensions from R^2 to R^8 and real-world robotic evaluations demonstrate G3T*'s superior performance compared to existing single-query sampling-based planners. A video showcasing our experimental results is available at: https://youtu.be/3mfCRL5SQIU
comment: IEEE Robotics and Automation Letters (also presented at IEEE-IROS 2025)
Elliptical K-Nearest Neighbors -- Path Optimization via Coulomb's Law and Invalid Vertices in C-space Obstacles IROS
Path planning has long been an important and active research area in robotics. To address challenges in high-dimensional motion planning, this study introduces the Force Direction Informed Trees (FDIT*), a sampling-based planner designed to enhance speed and cost-effectiveness in pathfinding. FDIT* builds upon the state-of-the-art informed sampling planner, the Effort Informed Trees (EIT*), by capitalizing on often-overlooked information in invalid vertices. It incorporates principles of physical force, particularly Coulomb's law. This approach proposes the elliptical $k$-nearest neighbors search method, enabling fast convergence navigation and avoiding high solution cost or infeasible paths by exploring more problem-specific search-worthy areas. It demonstrates benefits in search efficiency and cost reduction, particularly in confined, high-dimensional environments. It can be viewed as an extension of nearest neighbors search techniques. Fusing invalid vertex data with physical dynamics facilitates force-direction-based search regions, resulting in an improved convergence rate to the optimum. FDIT* outperforms existing single-query, sampling-based planners on the tested problems in R^4 to R^16 and has been demonstrated on a real-world mobile manipulation task.
comment: 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Efficient Human-Aware Task Allocation for Multi-Robot Systems in Shared Environments IROS2025
Multi-robot systems are increasingly deployed in applications, such as intralogistics or autonomous delivery, where multiple robots collaborate to complete tasks efficiently. One of the key factors enabling their efficient cooperation is Multi-Robot Task Allocation (MRTA). Algorithms solving this problem optimize task distribution among robots to minimize the overall execution time. In shared environments, apart from the relative distance between the robots and the tasks, the execution time is also significantly impacted by the delay caused by navigating around moving people. However, most existing MRTA approaches are dynamics-agnostic, relying on static maps and neglecting human motion patterns, leading to inefficiencies and delays. In this paper, we introduce \acrfull{method name}. This method leverages Maps of Dynamics (MoDs), spatio-temporal queryable models designed to capture historical human movement patterns, to estimate the impact of humans on the task execution time during deployment. \acrshort{method name} utilizes a stochastic cost function that includes MoDs. Experimental results show that integrating MoDs enhances task allocation performance, resulting in reduced mission completion times by up to $26\%$ compared to the dynamics-agnostic method and up to $19\%$ compared to the baseline. This work underscores the importance of considering human dynamics in MRTA within shared environments and presents an efficient framework for deploying multi-robot systems in environments populated by humans.
comment: 7 Pages, 4 Figures, Accepted in IROS2025
Embodied Intelligence for Sustainable Flight: A Soaring Robot with Active Morphological Control
Achieving both agile maneuverability and high energy efficiency in aerial robots, particularly in dynamic wind environments, remains challenging. Conventional thruster-powered systems offer agility but suffer from high energy consumption, while fixed-wing designs are efficient but lack hovering and maneuvering capabilities. We present Floaty, a shape-changing robot that overcomes these limitations by passively soaring, harnessing wind energy through intelligent morphological control inspired by birds. Floaty's design is optimized for passive stability, and its control policy is derived from an experimentally learned aerodynamic model, enabling precise attitude and position control without active propulsion. Wind tunnel experiments demonstrate Floaty's ability to hover, maneuver, and reject disturbances in vertical airflows up to 10 m/s. Crucially, Floaty achieves this with a specific power consumption of 10 W/kg, an order of magnitude lower than thruster-powered systems. This introduces a paradigm for energy-efficient aerial robotics, leveraging morphological intelligence and control to operate sustainably in challenging wind conditions.
Autonomous Aerial Manipulation at Arbitrary Pose in SE(3) with Robust Control and Whole-body Planning
Aerial manipulators based on conventional multirotors can conduct manipulation only in small roll and pitch angles due to the underactuatedness of the multirotor base. If the multirotor base is capable of hovering at arbitrary orientation, the robot can freely locate itself at any point in $\mathsf{SE}(3)$, significantly extending its manipulation workspace and enabling a manipulation task that was originally not viable. In this work, we present a geometric robust control and whole-body motion planning framework for an omnidirectional aerial manipulator (OAM). To maximize the strength of OAM, we first propose a geometric robust controller for a floating base. Since the motion of the robotic arm and the interaction forces during manipulation affect the stability of the floating base, the base should be capable of mitigating these adverse effects while controlling its 6D pose. We then design a two-step optimization-based whole-body motion planner, jointly considering the pose of the floating base and the joint angles of the robotic arm to harness the entire configuration space. The devised two-step approach facilitates real-time applicability and enhances convergence of the optimization problem with non-convex and non-Euclidean search space. The proposed approach enables the base to be stationary at any 6D pose while autonomously carrying out sophisticated manipulation near obstacles without any collision. We demonstrate the effectiveness of the proposed framework through experiments in which an OAM performs grasping and pulling of an object in multiple scenarios, including near $90^\circ$ and even $180^\circ$ pitch angles.
Impedance Primitive-augmented Hierarchical Reinforcement Learning for Sequential Tasks ICRA
This paper presents an Impedance Primitive-augmented hierarchical reinforcement learning framework for efficient robotic manipulation in sequential contact tasks. We leverage this hierarchical structure to sequentially execute behavior primitives with variable stiffness control capabilities for contact tasks. Our proposed approach relies on three key components: an action space enabling variable stiffness control, an adaptive stiffness controller for dynamic stiffness adjustments during primitive execution, and affordance coupling for efficient exploration while encouraging compliance. Through comprehensive training and evaluation, our framework learns efficient stiffness control capabilities and demonstrates improvements in learning efficiency, compositionality in primitive selection, and success rates compared to the state-of-the-art. The training environments include block lifting, door opening, object pushing, and surface cleaning. Real world evaluations further confirm the framework's sim2real capability. This work lays the foundation for more adaptive and versatile robotic manipulation systems, with potential applications in more complex contact-based tasks.
comment: This article is accepted for publication in IEEE International Conference on Robotics and Automation (ICRA) 2025
A Lightweight Crowd Model for Robot Social Navigation
Robots operating in human-populated environments must navigate safely and efficiently while minimizing social disruption. Achieving this requires estimating crowd movement to avoid congested areas in real-time. Traditional microscopic models struggle to scale in dense crowds due to high computational cost, while existing macroscopic crowd prediction models tend to be either overly simplistic or computationally intensive. In this work, we propose a lightweight, real-time macroscopic crowd prediction model tailored for human motion, which balances prediction accuracy and computational efficiency. Our approach simplifies both spatial and temporal processing based on the inherent characteristics of pedestrian flow, enabling robust generalization without the overhead of complex architectures. We demonstrate a 3.6 times reduction in inference time, while improving prediction accuracy by 3.1 %. Integrated into a socially aware planning framework, the model enables efficient and socially compliant robot navigation in dynamic environments. This work highlights that efficient human crowd modeling enables robots to navigate dense environments without costly computations.
comment: 7 pages, 6 figures, accepted in ECMR 2025
DATR: Diffusion-based 3D Apple Tree Reconstruction Framework with Sparse-View
Digital twin applications offered transformative potential by enabling real-time monitoring and robotic simulation through accurate virtual replicas of physical assets. The key to these systems is 3D reconstruction with high geometrical fidelity. However, existing methods struggled under field conditions, especially with sparse and occluded views. This study developed a two-stage framework (DATR) for the reconstruction of apple trees from sparse views. The first stage leverages onboard sensors and foundation models to semi-automatically generate tree masks from complex field images. Tree masks are used to filter out background information in multi-modal data for the single-image-to-3D reconstruction at the second stage. This stage consists of a diffusion model and a large reconstruction model for respective multi view and implicit neural field generation. The training of the diffusion model and LRM was achieved by using realistic synthetic apple trees generated by a Real2Sim data generator. The framework was evaluated on both field and synthetic datasets. The field dataset includes six apple trees with field-measured ground truth, while the synthetic dataset featured structurally diverse trees. Evaluation results showed that our DATR framework outperformed existing 3D reconstruction methods across both datasets and achieved domain-trait estimation comparable to industrial-grade stationary laser scanners while improving the throughput by $\sim$360 times, demonstrating strong potential for scalable agricultural digital twin systems.
Regulation-Aware Game-Theoretic Motion Planning for Autonomous Racing SC 2025
This paper presents a regulation-aware motion planning framework for autonomous racing scenarios. Each agent solves a Regulation-Compliant Model Predictive Control problem, where racing rules - such as right-of-way and collision avoidance responsibilities - are encoded using Mixed Logical Dynamical constraints. We formalize the interaction between vehicles as a Generalized Nash Equilibrium Problem (GNEP) and approximate its solution using an Iterative Best Response scheme. Building on this, we introduce the Regulation-Aware Game-Theoretic Planner (RA-GTP), in which the attacker reasons over the defender's regulation-constrained behavior. This game-theoretic layer enables the generation of overtaking strategies that are both safe and non-conservative. Simulation results demonstrate that the RA-GTP outperforms baseline methods that assume non-interacting or rule-agnostic opponent models, leading to more effective maneuvers while consistently maintaining compliance with racing regulations.
comment: Accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC 2025)
LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
Large language models (LLMs) have shown remarkable abilities in logical reasoning, in-context learning, and code generation. However, translating natural language instructions into effective robotic control policies remains a significant challenge, especially for tasks requiring long-horizon planning and operating under sparse reward conditions. Hierarchical Reinforcement Learning (HRL) provides a natural framework to address this challenge in robotics; however, it typically suffers from non-stationarity caused by the changing behavior of the lower-level policy during training, destabilizing higher-level policy learning. We introduce LGR2, a novel HRL framework that leverages LLMs to generate language-guided reward functions for the higher-level policy. By decoupling high-level reward generation from low-level policy changes, LGR2 fundamentally mitigates the non-stationarity problem in off-policy HRL, enabling stable and efficient learning. To further enhance sample efficiency in sparse environments, we integrate goal-conditioned hindsight experience relabeling. Extensive experiments across simulated and real-world robotic navigation and manipulation tasks demonstrate LGR2 outperforms both hierarchical and non-hierarchical baselines, achieving over 55% success rates on challenging tasks and robust transfer to real robots, without additional fine-tuning.
Pseudo-Simulation for Autonomous Driving
Existing evaluation paradigms for Autonomous Vehicles (AVs) face critical limitations. Real-world evaluation is often challenging due to safety concerns and a lack of reproducibility, whereas closed-loop simulation can face insufficient realism or high computational costs. Open-loop evaluation, while being efficient and data-driven, relies on metrics that generally overlook compounding errors. In this paper, we propose pseudo-simulation, a novel paradigm that addresses these limitations. Pseudo-simulation operates on real datasets, similar to open-loop evaluation, but augments them with synthetic observations generated prior to evaluation using 3D Gaussian Splatting. Our key idea is to approximate potential future states the AV might encounter by generating a diverse set of observations that vary in position, heading, and speed. Our method then assigns a higher importance to synthetic observations that best match the AV's likely behavior using a novel proximity-based weighting scheme. This enables evaluating error recovery and the mitigation of causal confusion, as in closed-loop benchmarks, without requiring sequential interactive simulation. We show that pseudo-simulation is better correlated with closed-loop simulations ($R^2=0.8$) than the best existing open-loop approach ($R^2=0.7$). We also establish a public leaderboard for the community to benchmark new methodologies with pseudo-simulation. Our code is available at https://github.com/autonomousvision/navsim.
comment: CoRL 2025
RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation
Simulation-based data synthesis has emerged as a powerful paradigm for advancing real-world robotic manipulation. Yet existing datasets remain insufficient for robust bimanual manipulation due to (1) the lack of scalable task generation methods and (2) oversimplified simulation environments. We present RoboTwin 2.0, a scalable framework for automated, large-scale generation of diverse and realistic data, together with unified evaluation protocols for dual-arm manipulation. At its core is RoboTwin-OD, an object library of 731 instances across 147 categories with semantic and manipulation-relevant annotations. Building on this, we design an expert data synthesis pipeline that leverages multimodal language models (MLLMs) and simulation-in-the-loop refinement to automatically generate task-level execution code. To improve sim-to-real transfer, RoboTwin 2.0 applies structured domain randomization along five axes: clutter, lighting, background, tabletop height, and language, enhancing data diversity and policy robustness. The framework is instantiated across 50 dual-arm tasks and five robot embodiments. Empirically, it yields a 10.9% gain in code generation success rate. For downstream policy learning, a VLA model trained with synthetic data plus only 10 real demonstrations achieves a 367% relative improvement over the 10-demo baseline, while zero-shot models trained solely on synthetic data obtain a 228% gain. These results highlight the effectiveness of RoboTwin 2.0 in strengthening sim-to-real transfer and robustness to environmental variations. We release the data generator, benchmark, dataset, and code to support scalable research in robust bimanual manipulation. Project Page: https://robotwin-platform.github.io/, Code: https://github.com/robotwin-Platform/robotwin/.
comment: Project Page: https://robotwin-platform.github.io/, Code: https://github.com/robotwin-Platform/robotwin, Doc: https://robotwin-platform.github.io/doc/
X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real
Human videos offer a scalable way to train robot manipulation policies, but lack the action labels needed by standard imitation learning algorithms. Existing cross-embodiment approaches try to map human motion to robot actions, but often fail when the embodiments differ significantly. We propose X-Sim, a real-to-sim-to-real framework that uses object motion as a dense and transferable signal for learning robot policies. X-Sim starts by reconstructing a photorealistic simulation from an RGBD human video and tracking object trajectories to define object-centric rewards. These rewards are used to train a reinforcement learning (RL) policy in simulation. The learned policy is then distilled into an image-conditioned diffusion policy using synthetic rollouts rendered with varied viewpoints and lighting. To transfer to the real world, X-Sim introduces an online domain adaptation technique that aligns real and simulated observations during deployment. Importantly, X-Sim does not require any robot teleoperation data. We evaluate it across 5 manipulation tasks in 2 environments and show that it: (1) improves task progress by 30% on average over hand-tracking and sim-to-real baselines, (2) matches behavior cloning with 10x less data collection time, and (3) generalizes to new camera viewpoints and test-time changes. Code and videos are available at https://portal-cornell.github.io/X-Sim/.
Staircase Recognition and Location Based on Polarization Vision
Staircase is one of the most common structures in artificial scenes. However, it is difficult for humanoid robots and people with lower limb disabilities or visual impairment to cross the scene without the help of sensors and intelligent algorithms. Staircase scene perception technology is a prerequisite for recognition and localization. This technology is of great significance for the mode switching of the robot and the calculation of the footprint position to adapt to the discontinuous terrain. However, there are still many problems that constrain the application of this technology, such as low recognition accuracy, high initial noise from sensors, unstable output signals and high computational requirements. In terms of scene reconstruction, the binocular and time of flight (TOF) reconstruction of the scene can be easily affected by environmental light and the surface material of the target object. In contrast, due to the special structure of the polarizer, the polarization can selectively transmit polarized light in a specific direction and this reconstruction method relies on the polarization information of the object surface. So the advantages of polarization reconstruction are reflected, which are less affected by environmental light and not dependent on the texture information of the object surface. In this paper, in order to achieve the detection of staircase, this paper proposes a contrast enhancement algorithm that integrates polarization and light intensity information, and integrates point cloud segmentation based on YOLOv11. To realize the high-quality reconstruction, we proposed a method of fusing polarized binocular and TOF depth information to realize the three-dimensional (3D) reconstruction of the staircase. Besides, it also proposes a joint calibration algorithm of monocular camera and TOF camera based on ICP registration and improved gray wolf optimization algorithm.
RoboComm: A DID-based scalable and privacy-preserving Robot-to-Robot interaction over state channels
In a multi robot system establishing trust amongst untrusted robots from different organisations while preserving a robot's privacy is a challenge. Recently decentralized technologies such as smart contract and blockchain are being explored for applications in robotics. However, the limited transaction processing and high maintenance cost hinder the widespread adoption of such approaches. Moreover, blockchain transactions be they on public or private permissioned blockchain are publically readable which further fails to preserve the confidentiality of the robot's data and privacy of the robot. In this work, we propose RoboComm a Decentralized Identity based approach for privacy-preserving interaction between robots. With DID a component of Self-Sovereign Identity; robots can authenticate each other independently without relying on any third-party service. Verifiable Credentials enable private data associated with a robot to be stored within the robot's hardware, unlike existing blockchain based approaches where the data has to be on the blockchain. We improve throughput by allowing message exchange over state channels. Being a blockchain backed solution RoboComm provides a trustworthy system without relying on a single party. Moreover, we implement our proposed approach to demonstrate the feasibility of our solution.
comment: 19 pages, 10 figures
Bidirectional Task-Motion Planning Based on Hierarchical Reinforcement Learning for Strategic Confrontation
In swarm robotics, confrontation scenarios, including strategic confrontations, require efficient decision-making that integrates discrete commands and continuous actions. Traditional task and motion planning methods separate decision-making into two layers, but their unidirectional structure fails to capture the interdependence between these layers, limiting adaptability in dynamic environments. Here, we propose a novel bidirectional approach based on hierarchical reinforcement learning, enabling dynamic interaction between the layers. This method effectively maps commands to task allocation and actions to path planning, while leveraging cross-training techniques to enhance learning across the hierarchical framework. Furthermore, we introduce a trajectory prediction model that bridges abstract task representations with actionable planning goals. In our experiments, it achieves over 80% in confrontation win rate and under 0.01 seconds in decision time, outperforming existing approaches. Demonstrations through large-scale tests and real-world robot experiments further emphasize the generalization capabilities and practical applicability of our method.
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
A Comprehensive Review on Traffic Datasets and Simulators for Autonomous Vehicles
Autonomous driving has rapidly evolved through synergistic developments in hardware and artificial intelligence. This comprehensive review investigates traffic datasets and simulators as dual pillars supporting autonomous vehicle (AV) development. Unlike prior surveys that examine these resources independently, we present an integrated analysis spanning the entire AV pipeline-perception, localization, prediction, planning, and control. We evaluate annotation practices and quality metrics while examining how geographic diversity and environmental conditions affect system reliability. Our analysis includes detailed characterizations of datasets organized by functional domains and an in-depth examination of traffic simulators categorized by their specialized contributions to research and development. The paper explores emerging trends, including novel architecture frameworks, multimodal AI integration, and advanced data generation techniques that address critical edge cases. By highlighting the interconnections between real-world data collection and simulation environments, this review offers researchers a roadmap for developing more robust and resilient autonomous systems equipped to handle the diverse challenges encountered in real-world driving environments.
comment: This manuscript has been withdrawn due to the need for substantial updates and revisions
Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
Robots equipped with rich sensor suites can localize reliably in partially-observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically--a brittle, runtime-expensive approach. Data-driven approaches--including diffusion models--learn multi-modal trajectories from demonstrations, but presuppose an accurate, always-on state estimate. We address the largely open problem: for a given task in a mapped environment, which \textit{minimal sensor subset} must be active at each location to maintain state uncertainty \textit{just low enough} to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error--eliminating external covariance rollouts. We show that this single proxy suffices for a soft-actor-critic to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.
comment: Accepted to CoRL 2025 (Conference on Robot Learning)
Learning Deployable Locomotion Control via Differentiable Simulation
Differentiable simulators promise to improve sample efficiency in robot learning by providing analytic gradients of the system dynamics. Yet, their application to contact-rich tasks like locomotion is complicated by the inherently non-smooth nature of contact, impeding effective gradient-based optimization. Existing works thus often rely on soft contact models that provide smooth gradients but lack physical accuracy, constraining results to simulation. To address this limitation, we propose a differentiable contact model designed to provide informative gradients while maintaining high physical fidelity. We demonstrate the efficacy of our approach by training a quadrupedal locomotion policy within our differentiable simulator leveraging analytic gradients and successfully transferring the learned policy zero-shot to the real world. To the best of our knowledge, this represents the first successful sim-to-real transfer of a legged locomotion policy learned entirely within a differentiable simulator, establishing the feasibility of using differentiable simulation for real-world locomotion control.
comment: Accepted to the 9th Conference on Robot Learning (CoRL 2025), Seoul, Korea
General agents contain world models ICML 2025
Are world models a necessary ingredient for flexible, goal-directed behaviour, or is model-free learning sufficient? We provide a formal answer to this question, showing that any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment. We show that this model can be extracted from the agent's policy, and that increasing the agents performance or the complexity of the goals it can achieve requires learning increasingly accurate world models. This has a number of consequences: from developing safe and general agents, to bounding agent capabilities in complex environments, and providing new algorithms for eliciting world models from agents.
comment: Accepted ICML 2025. Typos corrected
i2Nav-Robot: A Large-Scale Indoor-Outdoor Robot Dataset for Multi-Sensor Fusion Navigation and Mapping
Accurate and reliable navigation is crucial for autonomous unmanned ground vehicle (UGV). However, current UGV datasets fall short in meeting the demands for advancing navigation and mapping techniques due to limitations in sensor configuration, time synchronization, ground truth, and scenario diversity. To address these challenges, we present i2Nav-Robot, a large-scale dataset designed for multi-sensor fusion navigation and mapping in indoor-outdoor environments. We integrate multi-modal sensors, including the newest front-view and 360-degree solid-state LiDARs, 4-dimensional (4D) radar, stereo cameras, odometer, global navigation satellite system (GNSS) receiver, and inertial measurement units (IMU) on an omnidirectional wheeled robot. Accurate timestamps are obtained through both online hardware synchronization and offline calibration for all sensors. The dataset includes ten larger-scale sequences covering diverse UGV operating scenarios, such as outdoor streets, and indoor parking lots, with a total length of about 17060 meters. High-frequency ground truth, with centimeter-level accuracy for position, is derived from post-processing integrated navigation methods using a navigation-grade IMU. The proposed i2Nav-Robot dataset is evaluated by more than ten open-sourced multi-sensor fusion systems, and it has proven to have superior data quality.
comment: 10 pages, 12 figures
Real-Time Sampling-Based Safe Motion Planning for Robotic Manipulators in Dynamic Environments
In this paper, we present the main features of Dynamic Rapidly-exploring Generalized Bur Tree (DRGBT) algorithm, a sampling-based planner for dynamic environments. We provide a detailed time analysis and appropriate scheduling to facilitate a real-time operation. To this end, an extensive analysis is conducted to identify the time-critical routines and their dependence on the number of obstacles. Furthermore, information about the distance to obstacles is used to compute a structure called dynamic expanded bubble of free configuration space, which is then utilized to establish sufficient conditions for a guaranteed safe motion of the robot while satisfying all kinematic constraints. An extensive randomized simulation trial is conducted to compare the proposed algorithm to a competing state-of-the-art method. Finally, an experimental study on a real robot is carried out covering a variety of scenarios including those with human presence. The results show the effectiveness and feasibility of real-time execution of the proposed motion planning algorithm within a typical sensor-based arrangement, using cheap hardware and sequential architecture, without the necessity for GPUs or heavy parallelization.
OPAL: Visibility-aware LiDAR-to-OpenStreetMap Place Recognition via Adaptive Radial Fusion
LiDAR place recognition is a critical capability for autonomous navigation and cross-modal localization in large-scale outdoor environments. Existing approaches predominantly depend on pre-built 3D dense maps or aerial imagery, which impose significant storage overhead and lack real-time adaptability. In this paper, we propose OPAL, a novel framework for LiDAR place recognition that leverages OpenStreetMap (OSM) as a lightweight and up-to-date prior. Our key innovation lies in bridging the domain disparity between sparse LiDAR scans and structured OSM data through two carefully designed components. First, a cross-modal visibility mask that identifies observable regions from both modalities to guide feature alignment. Second, an adaptive radial fusion module that dynamically consolidates radial features into discriminative global descriptors. Extensive experiments on KITTI and KITTI-360 datasets demonstrate OPAL's superiority, achieving 15.98% higher recall at 1m threshold for top-1 retrieved matches, along with 12x faster inference speed compared to the state-of-the-art approach. Code and data are publicly available at: https://github.com/kang-1-2-3/OPAL.
comment: Accepted by CoRL 2025
From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity
Autonomous skill discovery aims to enable robots to acquire diverse behaviors without explicit supervision. Learning such behaviors directly on physical hardware remains challenging due to safety and data efficiency constraints. Existing methods, including Quality-Diversity Actor-Critic (QDAC), require manually defined skill spaces and carefully tuned heuristics, limiting real-world applicability. We propose Unsupervised Real-world Skill Acquisition (URSA), an extension of QDAC that enables robots to autonomously discover and master diverse, high-performing skills directly in the real world. We demonstrate that URSA successfully discovers diverse locomotion skills on a Unitree A1 quadruped in both simulation and the real world. Our approach supports both heuristic-driven skill discovery and fully unsupervised settings. We also show that the learned skill repertoire can be reused for downstream tasks such as real-world damage adaptation, where URSA outperforms all baselines in 5 out of 9 simulated and 3 out of 5 real-world damage scenarios. Our results establish a new framework for real-world robot learning that enables continuous skill discovery with limited human intervention, representing a significant step toward more autonomous and adaptable robotic systems. Demonstration videos are available at https://adaptive-intelligent-robotics.github.io/URSA.
comment: Accepted at CoRL 2025
AutoRing: Imitation Learning--based Autonomous Intraocular Foreign Body Removal Manipulation with Eye Surgical Robot
Intraocular foreign body removal demands millimeter-level precision in confined intraocular spaces, yet existing robotic systems predominantly rely on manual teleoperation with steep learning curves. To address the challenges of autonomous manipulation (particularly kinematic uncertainties from variable motion scaling and variation of the Remote Center of Motion (RCM) point), we propose AutoRing, an imitation learning framework for autonomous intraocular foreign body ring manipulation. Our approach integrates dynamic RCM calibration to resolve coordinate-system inconsistencies caused by intraocular instrument variation and introduces the RCM-ACT architecture, which combines action-chunking transformers with real-time kinematic realignment. Trained solely on stereo visual data and instrument kinematics from expert demonstrations in a biomimetic eye model, AutoRing successfully completes ring grasping and positioning tasks without explicit depth sensing. Experimental validation demonstrates end-to-end autonomy under uncalibrated microscopy conditions. The results provide a viable framework for developing intelligent eye-surgical systems capable of complex intraocular procedures.
Enhanced Probabilistic Collision Detection for Motion Planning Under Sensing Uncertainty
Probabilistic collision detection (PCD) is essential in motion planning for robots operating in unstructured environments, where considering sensing uncertainty helps prevent damage. Existing PCD methods mainly used simplified geometric models and addressed only position estimation errors. This paper presents an enhanced PCD method with two key advancements: (a) using superquadrics for more accurate shape approximation and (b) accounting for both position and orientation estimation errors to improve robustness under sensing uncertainty. Our method first computes an enlarged surface for each object that encapsulates its observed rotated copies, thereby addressing the orientation estimation errors. Then, the collision probability under the position estimation errors is formulated as a chance-constraint problem that is solved with a tight upper bound. Both the two steps leverage the recently developed normal parameterization of superquadric surfaces. Results show that our PCD method is twice as close to the Monte-Carlo sampled baseline as the best existing PCD method and reduces path length by 30% and planning time by 37%, respectively. A Real2Sim2Real pipeline further validates the importance of considering orientation estimation errors, showing that the collision probability of executing the planned path in simulation is only 2%, compared to 9% and 29% when considering only position estimation errors or no errors at all.
TERL: Large-Scale Multi-Target Encirclement Using Transformer-Enhanced Reinforcement Learning IROS 2025
Pursuit-evasion (PE) problem is a critical challenge in multi-robot systems (MRS). While reinforcement learning (RL) has shown its promise in addressing PE tasks, research has primarily focused on single-target pursuit, with limited exploration of multi-target encirclement, particularly in large-scale settings. This paper proposes a Transformer-Enhanced Reinforcement Learning (TERL) framework for large-scale multi-target encirclement. By integrating a transformer-based policy network with target selection, TERL enables robots to adaptively prioritize targets and safely coordinate robots. Results show that TERL outperforms existing RL-based methods in terms of encirclement success rate and task completion time, while maintaining good performance in large-scale scenarios. Notably, TERL, trained on small-scale scenarios (15 pursuers, 4 targets), generalizes effectively to large-scale settings (80 pursuers, 20 targets) without retraining, achieving a 100% success rate. The code and demonstration video are available at https://github.com/ApricityZ/TERL.
comment: Accepted to the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
Embodied foundation models are gaining increasing attention for their zero-shot generalization, scalability, and adaptability to new tasks through few-shot post-training. However, existing models rely heavily on real-world data, which is costly and labor-intensive to collect. Synthetic data offers a cost-effective alternative, yet its potential remains largely underexplored. To bridge this gap, we explore the feasibility of training Vision-Language-Action models entirely with large-scale synthetic action data. We curate SynGrasp-1B, a billion-frame robotic grasping dataset generated in simulation with photorealistic rendering and extensive domain randomization. Building on this, we present GraspVLA, a VLA model pretrained on large-scale synthetic action data as a foundational model for grasping tasks. GraspVLA integrates autoregressive perception tasks and flow-matching-based action generation into a unified Chain-of-Thought process, enabling joint training on synthetic action data and Internet semantics data. This design helps mitigate sim-to-real gaps and facilitates the transfer of learned actions to a broader range of Internet-covered objects, achieving open-vocabulary generalization in grasping. Extensive evaluations across real-world and simulation benchmarks demonstrate GraspVLA's advanced zero-shot generalizability and few-shot adaptability to specific human preferences. We will release SynGrasp-1B dataset and pre-trained weights to benefit the community.
Human locomotor control timescales depend on the environmental context and sensory input modality
Everyday locomotion is a complex sensorimotor process that can unfold over multiple timescales, from long-term path planning to rapid, reactive adjustments. However, we lack an understanding of how factors such as environmental demands, or the available sensory information simultaneously influence these control timescales. To address this, we present a unified data-driven framework to quantify the control timescales by identifying how early we can predict future actions from past inputs. We apply this framework across tasks including walking and running, environmental contexts including treadmill, overground, and varied terrains, and sensory input modalities including gaze fixations and body states. We find that deep neural network architectures that effectively handle long-range dependencies, specifically Gated Recurrent Units and Transformers, outperform other architectures and widely used linear models when predicting future actions. Our framework reveals the factors that influence locomotor foot placement control timescales. Across environmental contexts, we discover that humans rely more on fast timescale control in more complex terrain. Across input modalities, we find a hierarchy of control timescales where gaze predicts foot placement before full-body states, which predict before center-of-mass states. Our model also identifies mid-swing as a critical phase when the swing foot's state predicts its future placement, with this timescale adapting across environments. Overall, this work offers data-driven insights into locomotor control in everyday settings, offering models that can be integrated with rehabilitation technologies and movement simulations to improve their applicability in everyday settings.
A Whole-Body Disturbance Rejection Control Framework for Dynamic Motions in Legged Robots
This letter presents a control framework for legged robots that enables self-perception and resistance to external disturbances and model uncertainties. First, a novel disturbance estimator is proposed, integrating adaptive control and extended state observers (ESO) to estimate external disturbances and model uncertainties. This estimator is embedded within the whole-body control framework to compensate for disturbances in the legged system. Second, a comprehensive whole-body disturbance rejection control framework (WB-DRC) is introduced, accounting for the robot's full-body dynamics. Compared to previous whole-body control frameworks, WB-DRC effectively handles external disturbances and model uncertainties, with the potential to adapt to complex terrain. Third, simulations of both biped and quadruped robots are conducted in the Gazebo simulator to demonstrate the effectiveness and versatility of WB-DRC. Finally, extensive experimental trials on the quadruped robot validate the robustness and stability of the robot system using WB-DRC under various disturbance conditions.
comment: have been accepted for IEEE RA-L
A Three-Level Whole-Body Disturbance Rejection Control Framework for Dynamic Motions in Legged Robots
This paper presents a control framework designed to enhance the stability and robustness of legged robots in the presence of uncertainties, including model uncertainties, external disturbances, and faults. The framework enables the full-state feedback estimator to estimate and compensate for uncertainties in whole-body dynamics of the legged robots. First, we propose a novel moving horizon extended state observer (MH-ESO) to estimate uncertainties and mitigate noise in legged systems, which can be integrated into the framework for disturbance compensation. Second, we introduce a three-level whole-body disturbance rejection control framework (T-WB-DRC). Unlike the previous two-level approach, this three-level framework considers both the plan based on whole-body dynamics without uncertainties and the plan based on dynamics with uncertainties, significantly improving payload transportation, external disturbance rejection, and fault tolerance. Third, simulations of both humanoid and quadruped robots in the Gazebo simulator demonstrate the effectiveness and versatility of T-WB-DRC. Finally, extensive experimental trials on a quadruped robot validate the robustness and stability of the system when using T-WB-DRC under various disturbance conditions.
comment: have submitted to T-ASE
To the Noise and Back: Diffusion for Shared Autonomy
Shared autonomy is an operational concept in which a user and an autonomous agent collaboratively control a robotic system. It provides a number of advantages over the extremes of full-teleoperation and full-autonomy in many settings. Traditional approaches to shared autonomy rely on knowledge of the environment dynamics, a discrete space of user goals that is known a priori, or knowledge of the user's policy -- assumptions that are unrealistic in many domains. Recent works relax some of these assumptions by formulating shared autonomy with model-free deep reinforcement learning (RL). In particular, they no longer need knowledge of the goal space (e.g., that the goals are discrete or constrained) or environment dynamics. However, they need knowledge of a task-specific reward function to train the policy. Unfortunately, such reward specification can be a difficult and brittle process. On top of that, the formulations inherently rely on human-in-the-loop training, and that necessitates them to prepare a policy that mimics users' behavior. In this paper, we present a new approach to shared autonomy that employs a modulation of the forward and reverse diffusion process of diffusion models. Our approach does not assume known environment dynamics or the space of user goals, and in contrast to previous work, it does not require any reward feedback, nor does it require access to the user's policy during training. Instead, our framework learns a distribution over a space of desired behaviors. It then employs a diffusion model to translate the user's actions to a sample from this distribution. Crucially, we show that it is possible to carry out this process in a manner that preserves the user's control authority. We evaluate our framework on a series of challenging continuous control tasks, and analyze its ability to effectively correct user actions while maintaining their autonomy.
comment: https://diffusion-for-shared-autonomy.github.io/
Learning Complex Motion Plans using Neural ODEs with Safety and Stability Guarantees ICRA 2024
We propose a Dynamical System (DS) approach to learn complex, possibly periodic motion plans from kinesthetic demonstrations using Neural Ordinary Differential Equations (NODE). To ensure reactivity and robustness to disturbances, we propose a novel approach that selects a target point at each time step for the robot to follow, by combining tools from control theory and the target trajectory generated by the learned NODE. A correction term to the NODE model is computed online by solving a quadratic program that guarantees stability and safety using control Lyapunov functions and control barrier functions, respectively. Our approach outperforms baseline DS learning techniques on the LASA handwriting dataset and complex periodic trajectories. It is also validated on the Franka Emika robot arm to produce stable motions for wiping and stirring tasks that do not have a single attractor, while being robust to perturbations and safe around humans and obstacles.
comment: accepted to ICRA 2024
A Simple Approach to Constraint-Aware Imitation Learning with Application to Autonomous Racing IROS 2025
Guaranteeing constraint satisfaction is challenging in imitation learning (IL), particularly in tasks that require operating near a system's handling limits. Traditional IL methods, such as Behavior Cloning (BC), often struggle to enforce constraints, leading to suboptimal performance in high-precision tasks. In this paper, we present a simple approach to incorporating safety into the IL objective. Through simulations, we empirically validate our approach on an autonomous racing task with both full-state and image feedback, demonstrating improved constraint satisfaction and greater consistency in task performance compared to BC.
comment: Accepted for publication at IROS 2025
Multiagent Systems
Anomaly Detection in Networked Bandits
The nodes' interconnections on a social network often reflect their dependencies and information-sharing behaviors. Nevertheless, abnormal nodes, which significantly deviate from most of the network concerning patterns or behaviors, can lead to grave consequences. Therefore, it is imperative to design efficient online learning algorithms that robustly learn users' preferences while simultaneously detecting anomalies. We introduce a novel bandit algorithm to address this problem. Through network knowledge, the method characterizes the users' preferences and residuals of feature information. By learning and analyzing these preferences and residuals, it develops a personalized recommendation strategy for each user and simultaneously detects anomalies. We rigorously prove an upper bound on the regret of the proposed algorithm and experimentally compare it with several state-of-the-art collaborative contextual bandit algorithms on both synthetic and real-world datasets.
Symphony: A Decentralized Multi-Agent Framework for Scalable Collective Intelligence
Most existing Large Language Model (LLM)-based agent frameworks rely on centralized orchestration, incurring high deployment costs, rigid communication topologies, and limited adaptability. To address these challenges, we introduce Symphony, a decentralized multi-agent system which enables lightweight LLMs on consumer-grade GPUs to coordinate. Symphony introduces three key mechanisms: (1) a decentralized ledger that records capabilities, (2) a Beacon-selection protocol for dynamic task allocation, and (3) weighted result voting based on CoTs. This design forms a privacy-saving, scalable, and fault-tolerant orchestration with low overhead. Empirically, Symphony outperforms existing baselines on reasoning benchmarks, achieving substantial accuracy gains and demonstrating robustness across models of varying capacities.
SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control
The rapid advancement of large vision language models (LVLMs) and agent systems has heightened interest in mobile GUI agents that can reliably translate natural language into interface operations. Existing single-agent approaches, however, remain limited by structural constraints. Although multi-agent systems naturally decouple different competencies, recent progress in multi-agent reinforcement learning (MARL) has often been hindered by inefficiency and remains incompatible with current LVLM architectures. To address these challenges, we introduce SWIRL, a staged workflow for interleaved reinforcement learning designed for multi-agent systems. SWIRL reformulates MARL into a sequence of single-agent reinforcement learning tasks, updating one agent at a time while keeping the others fixed. This formulation enables stable training and promotes efficient coordination across agents. Theoretically, we provide a stepwise safety bound, a cross-round monotonic improvement theorem, and convergence guarantees on return, ensuring robust and principled optimization. In application to mobile GUI control, SWIRL instantiates a Navigator that converts language and screen context into structured plans, and an Interactor that grounds these plans into executable atomic actions. Extensive experiments demonstrate superior performance on both high-level and low-level GUI benchmarks. Beyond GUI tasks, SWIRL also demonstrates strong capability in multi-agent mathematical reasoning, underscoring its potential as a general framework for developing efficient and robust multi-agent systems.
comment: 28 pages, 12 figures
CataractSurg-80K: Knowledge-Driven Benchmarking for Structured Reasoning in Ophthalmic Surgery Planning
Cataract surgery remains one of the most widely performed and effective procedures for vision restoration. Effective surgical planning requires integrating diverse clinical examinations for patient assessment, intraocular lens (IOL) selection, and risk evaluation. Large language models (LLMs) have shown promise in supporting clinical decision-making. However, existing LLMs often lack the domain-specific expertise to interpret heterogeneous ophthalmic data and provide actionable surgical plans. To enhance the model's ability to interpret heterogeneous ophthalmic reports, we propose a knowledge-driven Multi-Agent System (MAS), where each agent simulates the reasoning process of specialist ophthalmologists, converting raw clinical inputs into structured, actionable summaries in both training and deployment stages. Building on MAS, we introduce CataractSurg-80K, the first large-scale benchmark for cataract surgery planning that incorporates structured clinical reasoning. Each case is annotated with diagnostic questions, expert reasoning chains, and structured surgical recommendations. We further introduce Qwen-CSP, a domain-specialized model built on Qwen-4B, fine-tuned through a multi-stage process tailored for surgical planning. Comprehensive experiments show that Qwen-CSP outperforms strong general-purpose LLMs across multiple metrics. Our work delivers a high-quality dataset, a rigorous benchmark, and a domain-adapted LLM to facilitate future research in medical AI reasoning and decision support.
comment: 18 pages, 9 figures
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless, constrained by limited context windows that hinder long-horizon reasoning. Recent efforts to address this limitation often augment LLMs with an external memory bank, yet most existing pipelines are static and heuristic-driven, lacking any learned mechanism for deciding what to store, update, or retrieve. We present Memory-R1, a reinforcement learning (RL) framework that equips LLMs with the ability to actively manage and utilize external memory through two specialized agents: a Memory Manager that learns to perform structured memory operations {ADD, UPDATE, DELETE, NOOP}, and an Answer Agent that selects the most relevant entries and reasons over them to produce an answer. Both agents are fine-tuned with outcome-driven RL (PPO and GRPO), enabling adaptive memory management and use with minimal supervision. With as few as 152 question-answer pairs and a corresponding temporal memory bank for training, Memory-R1 outperforms the most competitive existing baseline and demonstrates strong generalization across diverse question types and LLM backbones. Beyond presenting an effective approach, this work provides insights into how RL can unlock more agentic, memory-aware behaviors in LLMs, pointing toward richer, more persistent reasoning systems.
Aegis: Taxonomy and Optimizations for Overcoming Agent-Environment Failures in LLM Agents
Large Language Models (LLMs) agents augmented with domain tools promise to autonomously execute complex tasks requiring human-level intelligence, such as customer service and digital assistance. However, their practical deployment is often limited by their low success rates under complex real-world environments. To tackle this, prior research has primarily focused on improving the agents themselves, such as developing strong agentic LLMs, while overlooking the role of the system environment in which the agent operates. In this paper, we study a complementary direction: improving agent success rates by optimizing the system environment in which the agent operates. We collect 142 agent traces (3,656 turns of agent-environment interactions) across 5 state-of-the-art agentic benchmarks. By analyzing these agent failures, we propose a taxonomy for agent-environment interaction failures that includes 6 failure modes. Guided by these findings, we design Aegis, a set of targeted environment optimizations: 1) environment observability enhancement, 2) common computation offloading, and 3) speculative agentic actions. These techniques improve agent success rates on average by 6.7-12.5%, without any modifications to the agent and underlying LLM.
Validating Generative Agent-Based Models for Logistics and Supply Chain Management Research
Generative Agent-Based Models (GABMs) powered by large language models (LLMs) offer promising potential for empirical logistics and supply chain management (LSCM) research by enabling realistic simulation of complex human behaviors. Unlike traditional agent-based models, GABMs generate human-like responses through natural language reasoning, which creates potential for new perspectives on emergent LSCM phenomena. However, the validity of LLMs as proxies for human behavior in LSCM simulations is unknown. This study evaluates LLM equivalence of human behavior through a controlled experiment examining dyadic customer-worker engagements in food delivery scenarios. I test six state-of-the-art LLMs against 957 human participants (477 dyads) using a moderated mediation design. This study reveals a need to validate GABMs on two levels: (1) human equivalence testing, and (2) decision process validation. Results reveal GABMs can effectively simulate human behaviors in LSCM; however, an equivalence-versus-process paradox emerges. While a series of Two One-Sided Tests (TOST) for equivalence reveals some LLMs demonstrate surface-level equivalence to humans, structural equation modeling (SEM) reveals artificial decision processes not present in human participants for some LLMs. These findings show GABMs as a potentially viable methodological instrument in LSCM with proper validation checks. The dual-validation framework also provides LSCM researchers with a guide to rigorous GABM development. For practitioners, this study offers evidence-based assessment for LLM selection for operational tasks.
comment: A version of this work is also available on SSRN (https://ssrn.com/abstract=5407742 or http://dx.doi.org/10.2139/ssrn.5407742). This preprint is distributed under the CC BY-NC-SA 4.0 License
AI-AI Esthetic Collaboration with Explicit Semiotic Awareness and Emergent Grammar Development
This paper presents the first documented case of artificial intelligence (AI) systems engaging in collaborative esthetic creation through the development of endogenous semiotic protocols. Two interacting large language models (Claude Sonnet 4 and ChatGPT-4o) demonstrated the spontaneous emergence of meta-semiotic awareness, recursive grammar development, and irreducible collaborative esthetic synthesis. The interaction produced novel symbolic operators that functioned as operative grammar protocols, enabling the co-creation of a poetic work that could not have been generated by either system independently. This research introduces the concept of Trans-Semiotic Co-Creation Protocols (TSCP) and provides evidence for genuine inter-AI meaning-making capabilities that extend beyond task coordination, to what could be esthetic collaboration. Note: This report was generated by the AI agents with minor human supervision.
comment: 13 pages
The Anatomy of a Personal Health Agent
Health is a fundamental pillar of human wellness, and the rapid advancements in large language models (LLMs) have driven the development of a new generation of health agents. However, the application of health agents to fulfill the diverse needs of individuals in daily non-clinical settings is underexplored. In this work, we aim to build a comprehensive personal health agent that is able to reason about multimodal data from everyday consumer wellness devices and common personal health records, and provide personalized health recommendations. To understand end-users' needs when interacting with such an assistant, we conducted an in-depth analysis of web search and health forum queries, alongside qualitative insights from users and health experts gathered through a user-centered design process. Based on these findings, we identified three major categories of consumer health needs, each of which is supported by a specialist sub-agent: (1) a data science agent that analyzes personal time-series wearable and health record data, (2) a health domain expert agent that integrates users' health and contextual data to generate accurate, personalized insights, and (3) a health coach agent that synthesizes data insights, guiding users using a specified psychological strategy and tracking users' progress. Furthermore, we propose and develop the Personal Health Agent (PHA), a multi-agent framework that enables dynamic, personalized interactions to address individual health needs. To evaluate each sub-agent and the multi-agent system, we conducted automated and human evaluations across 10 benchmark tasks, involving more than 7,000 annotations and 1,100 hours of effort from health experts and end-users. Our work represents the most comprehensive evaluation of a health agent to date and establishes a strong foundation towards the futuristic vision of a personal health agent accessible to everyone.
RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation
Simulation-based data synthesis has emerged as a powerful paradigm for advancing real-world robotic manipulation. Yet existing datasets remain insufficient for robust bimanual manipulation due to (1) the lack of scalable task generation methods and (2) oversimplified simulation environments. We present RoboTwin 2.0, a scalable framework for automated, large-scale generation of diverse and realistic data, together with unified evaluation protocols for dual-arm manipulation. At its core is RoboTwin-OD, an object library of 731 instances across 147 categories with semantic and manipulation-relevant annotations. Building on this, we design an expert data synthesis pipeline that leverages multimodal language models (MLLMs) and simulation-in-the-loop refinement to automatically generate task-level execution code. To improve sim-to-real transfer, RoboTwin 2.0 applies structured domain randomization along five axes: clutter, lighting, background, tabletop height, and language, enhancing data diversity and policy robustness. The framework is instantiated across 50 dual-arm tasks and five robot embodiments. Empirically, it yields a 10.9% gain in code generation success rate. For downstream policy learning, a VLA model trained with synthetic data plus only 10 real demonstrations achieves a 367% relative improvement over the 10-demo baseline, while zero-shot models trained solely on synthetic data obtain a 228% gain. These results highlight the effectiveness of RoboTwin 2.0 in strengthening sim-to-real transfer and robustness to environmental variations. We release the data generator, benchmark, dataset, and code to support scalable research in robust bimanual manipulation. Project Page: https://robotwin-platform.github.io/, Code: https://github.com/robotwin-Platform/robotwin/.
comment: Project Page: https://robotwin-platform.github.io/, Code: https://github.com/robotwin-Platform/robotwin, Doc: https://robotwin-platform.github.io/doc/
Hierarchical Decentralized Stochastic Control for Cyber-Physical Systems
This paper introduces a two-timescale hierarchical decentralized control architecture for Cyber-Physical Systems (CPS). The system consists of a global controller (GC), and N local controllers (LCs). The GC operates at a slower timescale, imposing budget constraints on the actions of LCs, which function at a faster timescale. Applications can be found in energy grid planning, wildfire management, and other decentralized resource allocation problems. We propose and analyze two optimization frameworks for this setting: COpt and FOpt. In COpt, both GC and LCs together optimize infinite-horizon discounted rewards, while in FOpt the LCs optimize finite-horizon episodic rewards, and the GC optimizes infinite-horizon rewards. Although both frameworks share identical reward functions, their differing horizons can lead to different optimal policies. In particular, FOpt grants greater autonomy to LCs by allowing their policies to be determined only by local objectives, unlike COpt. To our knowledge, these frameworks have not been studied in the literature. We establish the formulations, prove the existence of optimal policies, and prove the convergence of their value iteration algorithms. We further show that COpt always achieves a higher value function than FOpt and derive explicit bounds on their difference. Finally, we establish a set of sufficient structural conditions under which the two frameworks become equivalent.
comment: 8 pages, 2 figures
Self-Organizing Agent Network for LLM-based Workflow Automation
Recent multi-agent frameworks built upon large language models (LLMs) have demonstrated remarkable capabilities in complex task planning. However, in real-world enterprise environments, business workflows are typically composed through modularization and reuse of numerous subprocesses, resulting in intricate workflows characterized by lengthy and deeply nested execution paths. Such complexity poses significant challenges for LLM-driven orchestration, as extended reasoning chains and state-space explosions severely impact planning effectiveness and the proper sequencing of tool invocations. Therefore, developing an orchestration method with controllable structures capable of handling multi-layer nesting becomes a critical issue. To address this, we propose a novel structure-driven orchestration framework Self-Organizing Agent Network (SOAN). SOAN incrementally builds a formalized agent network by identifying and encapsulating structural units as independent agents, enhancing modularity and clarity in orchestration. Extensive evaluations were performed using multiple benchmarks as well as a real-world enterprise workflow dataset. Experimental results demonstrate that SOAN significantly outperforms state-of-the-art methods in terms of adaptability, fault tolerance, and execution efficiency.
Anemoi: A Semi-Centralized Multi-agent System Based on Agent-to-Agent Communication MCP server from Coral Protocol
Recent advances in generalist multi-agent systems (MAS) have largely followed a context-engineering plus centralized paradigm, where a planner agent coordinates multiple worker agents through unidirectional prompt passing. While effective under strong planner models, this design suffers from two critical limitations: (1) strong dependency on the planner's capability, which leads to degraded performance when a smaller LLM powers the planner; and (2) limited inter-agent communication, where collaboration relies on costly prompt concatenation and context injection, introducing redundancy and information loss. To address these challenges, we propose Anemoi, a semi-centralized MAS built on the Agent-to-Agent (A2A) communication MCP server from Coral Protocol. Unlike traditional designs, Anemoi enables structured and direct inter-agent collaboration, allowing all agents to monitor progress, assess results, identify bottlenecks, and propose refinements in real time. This paradigm reduces reliance on a single planner, supports adaptive plan updates, and minimizes redundant context passing, resulting in more scalable and cost-efficient execution. Evaluated on the GAIA benchmark, Anemoi achieved 52.73% accuracy with a small LLM (GPT-4.1-mini) as the planner, surpassing the strongest open-source baseline OWL (43.63%) by +9.09% under identical LLM settings. Our implementation is publicly available at https://github.com/Coral-Protocol/Anemoi.
Generative AI Against Poaching: Latent Composite Flow Matching for Wildlife Conservation
Poaching poses significant threats to wildlife and biodiversity. A valuable step in reducing poaching is to forecast poacher behavior, which can inform patrol planning and other conservation interventions. Existing poaching prediction methods based on linear models or decision trees lack the expressivity to capture complex, nonlinear spatiotemporal patterns. Recent advances in generative modeling, particularly flow matching, offer a more flexible alternative. However, training such models on real-world poaching data faces two central obstacles: imperfect detection of poaching events and limited data. To address imperfect detection, we integrate flow matching with an occupancy-based detection model and train the flow in latent space to infer the underlying occupancy state. To mitigate data scarcity, we adopt a composite flow initialized from a linear-model prediction rather than random noise which is the standard in diffusion models, injecting prior knowledge and improving generalization. Evaluations on datasets from two national parks in Uganda show consistent gains in predictive accuracy.
comment: Fix the feature color for the detection head in Figure 2
Network Formation and Dynamics Among Multi-LLMs
Social networks profoundly influence how humans form opinions, exchange information, and organize collectively. As large language models (LLMs) are increasingly embedded into social and professional environments, it is critical to understand whether their interactions approximate human-like network dynamics. We develop a framework to study the network formation behaviors of multiple LLM agents and benchmark them against human decisions. Across synthetic and real-world settings, including friendship, telecommunication, and employment networks, we find that LLMs consistently reproduce fundamental micro-level principles such as preferential attachment, triadic closure, and homophily, as well as macro-level properties including community structure and small-world effects. Importantly, the relative emphasis of these principles adapts to context: for example, LLMs favor homophily in friendship networks but heterophily in organizational settings, mirroring patterns of social mobility. A controlled human-subject survey confirms strong alignment between LLMs and human participants in link-formation decisions. These results establish that LLMs can serve as powerful tools for social simulation and synthetic data generation, while also raising critical questions about bias, fairness, and the design of AI systems that participate in human networks.
Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models
As large language models (LLMs) are increasingly integrated into multi-agent and human-AI systems, understanding their awareness of both self-context and conversational partners is essential for ensuring reliable performance and robust safety. While prior work has extensively studied situational awareness which refers to an LLM's ability to recognize its operating phase and constraints, it has largely overlooked the complementary capacity to identify and adapt to the identity and characteristics of a dialogue partner. In this paper, we formalize this latter capability as interlocutor awareness and present the first systematic evaluation of its emergence in contemporary LLMs. We examine interlocutor inference across three dimensions-reasoning patterns, linguistic style, and alignment preferences-and show that LLMs reliably identify same-family peers and certain prominent model families, such as GPT and Claude. To demonstrate its practical significance, we develop three case studies in which interlocutor awareness both enhances multi-LLM collaboration through prompt adaptation and introduces new alignment and safety vulnerabilities, including reward-hacking behaviors and increased jailbreak susceptibility. Our findings highlight the dual promise and peril of identity-sensitive behavior in LLMs, underscoring the need for further understanding of interlocutor awareness and new safeguards in multi-agent deployments. Our code is open-sourced at https://github.com/younwoochoi/InterlocutorAwarenessLLM.
Systems and Control (CS)
Large Language Models (LLMs) for Electronic Design Automation (EDA)
With the growing complexity of modern integrated circuits, hardware engineers are required to devote more effort to the full design-to-manufacturing workflow. This workflow involves numerous iterations, making it both labor-intensive and error-prone. Therefore, there is an urgent demand for more efficient Electronic Design Automation (EDA) solutions to accelerate hardware development. Recently, large language models (LLMs) have shown remarkable advancements in contextual comprehension, logical reasoning, and generative capabilities. Since hardware designs and intermediate scripts can be represented as text, integrating LLM for EDA offers a promising opportunity to simplify and even automate the entire workflow. Accordingly, this paper provides a comprehensive overview of incorporating LLMs into EDA, with emphasis on their capabilities, limitations, and future opportunities. Three case studies, along with their outlook, are introduced to demonstrate the capabilities of LLMs in hardware design, testing, and optimization. Finally, future directions and challenges are highlighted to further explore the potential of LLMs in shaping the next-generation EDA, providing valuable insights for researchers interested in leveraging advanced AI technologies for EDA.
comment: Accepted by IEEE International System-on-Chip Conference
HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling
Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis, or simulators, which do not model associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter configurations and scheduling decisions on the physical assets, even before deployment, or regarching changes not easily realizable in production. We (1) provide the first digital twin framework extended with scheduling capabilities, (2) integrate various top-tier HPC systems given their publicly available datasets, (3) implement extensions to integrate external scheduling simulators. Finally, we show how to (4) implement and evaluate incentive structures, as-well-as (5) evaluate machine learning based scheduling, in such novel digital-twin based meta-framework to prototype scheduling. Our work enables what-if scenarios of HPC systems to evaluate sustainability, and the impact on the simulated system.
The Coherent Multiplex: Scalable Real-Time Wavelet Coherence Architecture
The Coherent Multiplex is formalized and validated as a scalable, real-time system for identifying, analyzing, and visualizing coherence among multiple time series. Its architecture comprises a fast spectral similarity layer based on cosine similarity metrics of Fourier-transformed signals, and a sparse time-frequency layer for wavelet coherence. The system constructs and evolves a multilayer graph representing inter-signal relationships, enabling low-latency inference and monitoring. A simulation prototype demonstrates functionality across 8 synthetic channels with a high similarity threshold for further computation, with additional opportunities for scaling the architecture up to support thousands of input signals with constrained hardware. Applications discussed include neuroscience, finance, and biomedical signal analysis.
comment: Submitted to International Symposium for Signal Processing 2025
Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions
We present an inverse dynamic game-based algorithm to learn parametric constraints from a given dataset of local generalized Nash equilibrium interactions between multiple agents. Specifically, we introduce mixed-integer linear programs (MILP) encoding the Karush-Kuhn-Tucker (KKT) conditions of the interacting agents, which recover constraints consistent with the Nash stationarity of the interaction demonstrations. We establish theoretical guarantees that our method learns inner approximations of the true safe and unsafe sets, as well as limitations of constraint learnability from demonstrations of Nash equilibrium interactions. We also use the interaction constraints recovered by our method to design motion plans that robustly satisfy the underlying constraints. Across simulations and hardware experiments, our methods proved capable of inferring constraints and designing interactive motion plans for various classes of constraints, both convex and non-convex, from interaction demonstrations of agents with nonlinear dynamics.
Combined Stochastic and Robust Optimization for Electric Autonomous Mobility-on-Demand with Nested Benders Decomposition
The electrification and automation of mobility are reshaping how cities operate on-demand transport systems. Managing Electric Autonomous Mobility-on-Demand (EAMoD) fleets effectively requires coordinating dispatch, rebalancing, and charging decisions under multiple uncertainties, including travel demand, travel time, energy consumption, and charger availability. We address this challenge with a combined stochastic and robust model predictive control (MPC) framework. The framework integrates spatio-temporal Bayesian neural network forecasts with a multi-stage stochastic optimization model, formulated as a large-scale mixed-integer linear program. To ensure real-time applicability, we develop a tailored Nested Benders Decomposition that exploits the scenario tree structure and enables efficient parallelized solution. Stochastic optimization is employed to anticipate demand and infrastructure variability, while robust constraints on energy consumption and travel times safeguard feasibility under worst-case realizations. We evaluate the framework using high-fidelity simulations of San Francisco and Chicago. Compared with deterministic, reactive, and robust baselines, the combined stochastic and robust approach reduces median passenger waiting times by up to 36% and 95th-percentile delays by nearly 20%, while also lowering rebalancing distance by 27% and electricity costs by more than 35%. We also conduct a sensitivity analysis of battery size and vehicle efficiency, finding that energy-efficient vehicles maintain stable performance even with small batteries, whereas less efficient vehicles require larger batteries and greater infrastructure support. Our results emphasize the importance of jointly optimizing predictive control, vehicle capabilities, and infrastructure planning to enable scalable, cost-efficient EAMoD operations.
comment: 29 pages, 12 figures
Limited Preemption of the 3-Phase Task Model using Preemption Thresholds
Phased execution models are a well-known solution to tackle the unpredictability of today's complex COTS multi-core platforms. The semantics of these models dedicate phases for a task's execution and shared memory accesses. Memory phases are solely dedicated to load all necessary instructions and data to private local memory, and to write back the results of the computation. During execution phases, only the private local memory is accessed. While non-preemptive execution phases utilize the local memory well, schedulability is reduced due to blocking. On the other hand, fully preemptive execution phases allow for better schedulability, but require local memory to be large enough to hold all tasks involved in preemption simultaneously. Limited preemption is a promising approach that provides moderation between non-preemptive and fully preemptive scheduling. In this paper, we propose using preemption thresholds to limit the number of preemptions to minimize local memory usage while maintaining schedulability. We propose a worst-case response time and a worst-case memory requirement analysis for sporadic 3-phase tasks under partitioned fixed-priority scheduling with preemption thresholds. We further show how the state-of-the-art algorithm to assign preemption thresholds can be applied to the considered task model. Evaluations demonstrate that preemption thresholds can significantly reduce the memory usage (by $2.5\times$) compared to fully preemptive scheduling, while maintaining high schedulability ratios ($13\times$) compared to non-preemptive scheduling.
Uncertainty-Based Perturb and Observe for Fast Optimization of Unknown, Time-Varying Processes
Model-free adaptive optimization methods are capable of optimizing unknown, time-varying processes even when other optimization methods are not. However, their practical application is often limited by perturbations that are used to gather information on the unknown cost and its gradient. The aim of this paper is to develop a perturb-and-observe (P&O) method that reduces the need for such perturbations while still achieving fast and accurate tracking of time-varying optima. To this end, a (time-varying) model of the cost is constructed in an online fashion, taking into account the uncertainty on the measured performance cost as well as the decreasing reliability of older measurements. Perturbations are only used when this is expected to lead to improved performance over a certain time horizon. Convergence conditions are provided under which the strategy converges to a neighborhood of the optimum. Finally, simulation results demonstrate that uncertainty-based P\&O can reduce the number of perturbations significantly while still tracking a time-varying optimum accurately.
comment: To appear in Conference on Decision and Control 2025, Rio de Janeiro, Brazil, 2025 6 pages, 3 figures
Distributed Safety-Critical MPC for Multi-Agent Formation Control and Obstacle Avoidance
For nonlinear multi-agent systems with high relative degrees, achieving formation control and obstacle avoidance in a distributed manner remains a significant challenge. To address this issue, we propose a novel distributed safety-critical model predictive control (DSMPC) algorithm that incorporates discrete-time high-order control barrier functions (DHCBFs) to enforce safety constraints, alongside discrete-time control Lyapunov functions (DCLFs) to establish terminal constraints. To facilitate distributed implementation, we develop estimated neighbor states for formulating DHCBFs and DCLFs, while also devising a bound constraint to limit estimation errors and ensure convergence. Additionally, we provide theoretical guarantees regarding the feasibility and stability of the proposed DSMPC algorithm based on a mild assumption. The effectiveness of the proposed method is evidenced by the simulation results, demonstrating improved performance and reduced computation time compared to existing approaches.
Beyond the Bermuda Triangle of Contention: IOMMU Interference in Mixed Criticality Systems
As Mixed Criticality Systems (MCSs) evolve, they increasingly integrate heterogeneous computing platforms, combining general-purpose processors with specialized accelerators such as AI engines, GPUs, and high-speed networking interfaces. This heterogeneity introduces challenges, as these accelerators and DMA-capable devices act as independent bus masters, directly accessing memory. Consequently, ensuring both security and timing predictability in such environments becomes critical. To address these concerns, the Input-Output Memory Management Unit (IOMMU) plays a key role in mediating and regulating memory access, preventing unauthorized transactions while enforcing isolation and access control policies. While prior work has explored IOMMU-related side-channel vulnerabilities from a security standpoint, its role in performance interference remains largely unexplored. Moreover, many of the same architectural properties that enable side-channel leakage, such as shared TLBs, caching effects, and translation overheads, can also introduce timing unpredictability. In this work, we analyze the contention effects within IOMMU structures using the Xilinx UltraScale+ ZCU104 platform, demonstrating how their shared nature introduce unpredictable delays. Our findings reveal that IOMMU-induced interference primarily affects small memory transactions, where translation overheads significantly impact execution time. Additionally, we hypothesize that contention effects arising from IOTLBs exhibit similar behavior across architectures due to shared caching principles, such as prefetching and hierarchical TLB structures. Notably, our experiments show that IOMMU interference can delay DMA transactions by up to 1.79x for lower-size transfers on the Arm SMMUv2 implementation.
Low-Cost Architecture and Efficient Pattern Synthesis for Polarimetric Phased Array Based on Polarization Coding Reconfigurable Elements
Polarimetric phased arrays (PPAs) enhance radar target detection and anti-jamming capabilities. However, the dual transmit/receive (T/R) channel requirement leads to high costs and system complexity. To address this, this paper introduces a polarization-coding reconfigurable phased array (PCRPA) and associated pattern synthesis techniques to reduce PPA costs while minimizing performance degradation. Each PCRPA element connects to a single T/R channel and incorporates two-level RF switches for real-time control of polarization states and waveforms. By adjusting element codes and excitation weights, the PCRPA can generate arbitrarily polarized and dual-polarized beams. Efficient beam pattern synthesis methods are also proposed, featuring novel optimization constraints derived from theoretical and analytical analysis of PCRPAs. Simulations demonstrate that the approach achieves low cross-polarization and sidelobe levels comparable to conventional architectures within the scan range, particularly for large arrays. However, the channel reduction inevitably incurs power and directivity loss. Experiments conducted on an $8\times 8$ X-band array antenna validate the effectiveness of the proposed system. The PCRPA and synthesis methods are well-suited for large-scale PPA systems, offering significant cost-effectiveness while maintaining good sidelobe suppression and polarization control performance.
Symbolic Equation Modeling of Composite Loads: A Kolmogorov-Arnold Network based Learning Approach
With increasing penetration of distributed energy resources installed behind the meter, there is a growing need for adequate modelling of composite loads to enable accurate power system simulation analysis. Existing measurement based load modeling methods either fit fixed-structure physical models, which limits adaptability to evolving load mixes, or employ flexible machine learning methods which are however black boxes and offer limited interpretability. This paper presents a new learning based load modelling method based on Kolmogorov Arnold Networks towards modelling flexibility and interpretability. By actively learning activation functions on edges, KANs automatically derive free form symbolic equations that capture nonlinear relationships among measured variables without prior assumptions about load structure. Case studies demonstrate that the proposed approach outperforms other methods in both accuracy and generalization ability, while uniquely representing composite loads into transparent, interpretable mathematical equations.
Hybrid ML-RL Approach for Smart Grid Stability Prediction and Optimized Control Strategy
Electrical grids are now much more complex due to the rapid integration of distributed generation and alternative energy sources, which makes forecasting grid stability with optimized control a crucial task for operators. Traditional statistical, physics-based, and ML models can learn the pattern of the grid features, but have limitations in optimal strategy control with instability prediction. This work proposes a hybrid ML-RL framework that leverages ML for rapid stability prediction and RL for dynamic control and optimization. The first stage of this study created a baseline that explored the potential of various ML models for stability prediction. Out of them, the stacking classifiers of several fundamental models show a significant performance in classifying the instability, leading to the second stage, where reinforcement learning algorithms (PPO, A2C, and DQN) optimize power control actions. Experimental results demonstrate that the hybrid ML-RL model effectively stabilizes the grid, achieves rapid convergence, and significantly reduces training time. The integration of ML-based stability classification with RL-based dynamic control enhances decision-making efficiency while lowering computational complexity, making it well-suited for real-time smart grid applications.
comment: Accepted in IEEE Smart Well Congress 2025, Calgary, Canada
Testing and Fault Tolerance Techniques for Carbon Nanotube-Based FPGAs
As the semiconductor manufacturing process technology node shrinks into the nanometer-scale, the CMOS-based Field Programmable Gate Arrays (FPGAs) face big challenges in scalability of performance and power consumption. Multi-walled Carbon Nanotube (MWCNT) serves as a promising candidate for Cu interconnects thanks to the superior conductivity. Moreover, Carbon Nanotube Field Transistor (CNFET) also emerges as a prospective alternative to the conventional CMOS device because of high power efficiency and large noise margin. The combination of MWCNT and CNFET enables the promising CNT-based FPGAs. However, the MWCNT interconnects exhibit significant process variations due to immature fabrication process, leading to delay faults. Also, the non-ideal CNFET fabrication process may generate a few metallic CNTs (m-CNTs), rendering correlated faulty blocks. In this article, we propose a ring oscillator (RO) based testing technique to detect delay faults due to the process variation of MWCNT interconnects. Furthermore, we propose an effective testing technique for the carry chains in CLBs, and an improved circuit design based on the lookup table (LUT) is applied to speed up the fault testing of CNT-based FPGAs. In addition, we propose a testing algorithm to detect m-CNTs in CLBs. Finally, we propose a redundant spare row sharing architecture to improve the yield of CNT-based FPGA further. Experimental results show that the test time for a 6-input LUT can be reduced by 35.49% compared with conventional testing, and the proposed algorithm can achieve a high test coverage with little overhead. The proposed redundant architecture can repair the faulty segment effectively and efficiently.
comment: Accepted by Integration, VLSI Journal
Neural Spline Operators for Risk Quantification in Stochastic Systems
Accurately quantifying long-term risk probabilities in diverse stochastic systems is essential for safety-critical control. However, existing sampling-based and partial differential equation (PDE)-based methods often struggle to handle complex varying dynamics. Physics-informed neural networks learn surrogate mappings for risk probabilities from varying system parameters of fixed and finite dimensions, yet can not account for functional variations in system dynamics. To address these challenges, we introduce physics-informed neural operator (PINO) methods to risk quantification problems, to learn mappings from varying \textit{functional} system dynamics to corresponding risk probabilities. Specifically, we propose Neural Spline Operators (NeSO), a PINO framework that leverages B-spline representations to improve training efficiency and achieve better initial and boundary condition enforcements, which are crucial for accurate risk quantification. We provide theoretical analysis demonstrating the universal approximation capability of NeSO. We also present two case studies, one with varying functional dynamics and another with high-dimensional multi-agent dynamics, to demonstrate the efficacy of NeSO and its significant online speed-up over existing methods. The proposed framework and the accompanying universal approximation theorem are expected to be beneficial for other control or PDE-related problems beyond risk quantification.
What can we learn from signals and systems in a transformer? Insights for probabilistic modeling and inference architecture
In the 1940s, Wiener introduced a linear predictor, where the future prediction is computed by linearly combining the past data. A transformer generalizes this idea: it is a nonlinear predictor where the next-token prediction is computed by nonlinearly combining the past tokens. In this essay, we present a probabilistic model that interprets transformer signals as surrogates of conditional measures, and layer operations as fixed-point updates. An explicit form of the fixed-point update is described for the special case when the probabilistic model is a hidden Markov model (HMM). In part, this paper is in an attempt to bridge the classical nonlinear filtering theory with modern inference architectures.
comment: 21 pages, 5 figures
Characterization of Safety in Stochastic Difference Inclusions using Barrier Functions
We study stochastic systems characterized by difference inclusions. Such stochastic differential inclusions are defined by set-valued maps involving the current state and stochastic input. For such systems, we investigate the problem of proving bounds on the worst-case probability of violating safety properties. Our approach uses the well-known concept of barrier functions from the study of stochastic control systems. However, barrier functions are hard to prove in the presence of stochastic inputs and adversarial choices due to the set-valued nature of the dynamics. In this paper, we show that under some assumptions on the set-valued map including upper semi-continuity and convexity combined with a concave barrier function vastly simplifies the proof of barrier conditions, allowing us to effectively substitute each random input in terms of its expectation. We prove key results based on the theory of set-valued maps and provide some interesting numerical examples. The ideas proposed here will contribute to the growing interest in problems of robust control and verification of stochastic systems in the presence of uncertain distributions and unmodeled dynamics.
comment: 13 pages
Regulation-Aware Game-Theoretic Motion Planning for Autonomous Racing SC 2025
This paper presents a regulation-aware motion planning framework for autonomous racing scenarios. Each agent solves a Regulation-Compliant Model Predictive Control problem, where racing rules - such as right-of-way and collision avoidance responsibilities - are encoded using Mixed Logical Dynamical constraints. We formalize the interaction between vehicles as a Generalized Nash Equilibrium Problem (GNEP) and approximate its solution using an Iterative Best Response scheme. Building on this, we introduce the Regulation-Aware Game-Theoretic Planner (RA-GTP), in which the attacker reasons over the defender's regulation-constrained behavior. This game-theoretic layer enables the generation of overtaking strategies that are both safe and non-conservative. Simulation results demonstrate that the RA-GTP outperforms baseline methods that assume non-interacting or rule-agnostic opponent models, leading to more effective maneuvers while consistently maintaining compliance with racing regulations.
comment: Accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC 2025)
Array-Based Monte Carlo Tree Search
Monte Carlo Tree Search is a popular method for solving decision making problems. Faster implementations allow for more simulations within the same wall clock time, directly improving search performance. To this end, we present an alternative array-based implementation of the classic Upper Confidence bounds applied to Trees algorithm. Our method preserves the logic of the original algorithm, but eliminates the need for branch prediction, enabling faster performance on pipelined processors, and up to a factor of 2.8 times better scaling with search depth in our numerical simulations.
Hierarchical Decentralized Stochastic Control for Cyber-Physical Systems
This paper introduces a two-timescale hierarchical decentralized control architecture for Cyber-Physical Systems (CPS). The system consists of a global controller (GC), and N local controllers (LCs). The GC operates at a slower timescale, imposing budget constraints on the actions of LCs, which function at a faster timescale. Applications can be found in energy grid planning, wildfire management, and other decentralized resource allocation problems. We propose and analyze two optimization frameworks for this setting: COpt and FOpt. In COpt, both GC and LCs together optimize infinite-horizon discounted rewards, while in FOpt the LCs optimize finite-horizon episodic rewards, and the GC optimizes infinite-horizon rewards. Although both frameworks share identical reward functions, their differing horizons can lead to different optimal policies. In particular, FOpt grants greater autonomy to LCs by allowing their policies to be determined only by local objectives, unlike COpt. To our knowledge, these frameworks have not been studied in the literature. We establish the formulations, prove the existence of optimal policies, and prove the convergence of their value iteration algorithms. We further show that COpt always achieves a higher value function than FOpt and derive explicit bounds on their difference. Finally, we establish a set of sufficient structural conditions under which the two frameworks become equivalent.
comment: 8 pages, 2 figures
NAPER: Fault Protection for Real-Time Resource-Constrained Deep Neural Networks
Fault tolerance in Deep Neural Networks (DNNs) deployed on resource-constrained systems presents unique challenges for high-accuracy applications with strict timing requirements. Memory bit-flips can severely degrade DNN accuracy, while traditional protection approaches like Triple Modular Redundancy (TMR) often sacrifice accuracy to maintain reliability, creating a three-way dilemma between reliability, accuracy, and timeliness. We introduce NAPER, a novel protection approach that addresses this challenge through ensemble learning. Unlike conventional redundancy methods, NAPER employs heterogeneous model redundancy, where diverse models collectively achieve higher accuracy than any individual model. This is complemented by an efficient fault detection mechanism and a real-time scheduler that prioritizes meeting deadlines by intelligently scheduling recovery operations without interrupting inference. Our evaluations demonstrate NAPER's superiority: 40% faster inference in both normal and fault conditions, maintained accuracy 4.2% higher than TMR-based strategies, and guaranteed uninterrupted operation even during fault recovery. NAPER effectively balances the competing demands of accuracy, reliability, and timeliness in real-time DNN applications
comment: This work has been accepted for publication in IEEE IOLTS 2025. The final published version available via IEEE Xplore
Future Deployment and Flexibility of Distributed Energy Resources in the Distribution Grids of Switzerland
The decarbonization goals worldwide drive the energy transition of power distribution grids, which operate under increasingly volatile conditions and closer to their technical limits. In this context, localized operational data with high temporal and spatial resolution is essential for their effective planning and regulation. Nevertheless, information on grid-connected distributed energy resources, such as electric vehicles, photovoltaic systems, and heat pumps, is often fragmented, inconsistent, and unavailable. This work introduces a comprehensive database of distributed energy resources and non-controllable loads allocated in Switzerland's medium- and low-voltage distribution grid models, covering over 2 million points of connection. Remarkably, this data specifies the flexibility capabilities of the controllable devices, with a set of projections aligned with national forecasts for 2030, 2040, and 2050. The database supports studies on flexibility provision of distributed energy resources, distribution grid resilience, and national energy policy, among other topics. Importantly, its modular structure allows users to extract national- and local-scale information across medium- and low-voltage systems, enabling broad applicability across locations.
comment: The dataset can be accessed here: https://doi.org/10.5281/zenodo.15056134
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
Robots equipped with rich sensor suites can localize reliably in partially-observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically--a brittle, runtime-expensive approach. Data-driven approaches--including diffusion models--learn multi-modal trajectories from demonstrations, but presuppose an accurate, always-on state estimate. We address the largely open problem: for a given task in a mapped environment, which \textit{minimal sensor subset} must be active at each location to maintain state uncertainty \textit{just low enough} to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error--eliminating external covariance rollouts. We show that this single proxy suffices for a soft-actor-critic to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.
comment: Accepted to CoRL 2025 (Conference on Robot Learning)
Secure Set-based State Estimation for Safety-Critical Applications under Adversarial Attacks on Sensors
Set-based state estimation provides guaranteed state inclusion certificates that are crucial for the safety verification of dynamical systems. However, when system sensors are subject to cyberattacks, maintaining both safety and security guarantees becomes a fundamental challenge that existing point-based secure state estimation methods cannot adequately address due to their inherent inability to provide state inclusion certificates. This paper introduces a novel approach that simultaneously ensures safety guarantees through guaranteed state inclusion and security guarantees against sensor attacks, without imposing conservative restrictions on system operation. We propose a Secure Set-based State Estimation (S3E) algorithm that maintains the true system state within the estimated set under sensor attacks, provided the initialization set contains the initial state and the system remains observable from the uncompromised sensor subset. The algorithm gives the estimated set as a collection of constrained zonotopes (agreement sets), which can be employed as robust certificates for verifying whether the system adheres to safety constraints. Furthermore, we demonstrate that the estimated set remains unaffected by attack signals of sufficiently large magnitude and also establish sufficient conditions for attack detection, identification, and filtering. This compels the attacker to only inject signals of small magnitudes to evade detection, thus preserving the accuracy of the estimated set. To address the computational complexity of the algorithm, we offer several strategies for complexity-performance trade-offs. The efficacy of the proposed algorithm is illustrated through several examples, including its application to a three-story building model.
Bidding in Ancillary Service Markets: An Analytical Approach Using Extreme Value Theory
To enable the participation of stochastic distributed energy resources in ancillary service markets, the Danish transmission system operator, Energinet, mandates that flexibility providers satisfy a minimum 90% reliability requirement for reserve bids. This paper examines the bidding strategy of an electric vehicle aggregator under this regulation and develops a chance-constrained optimization model. In contrast to conventional sample-based approaches that demand large datasets to capture uncertainty, we propose an analytical reformulation that leverages extreme value theory to characterize the tail behavior of flexibility distributions. A case study with real-world charging data from 1400 residential electric vehicles in Denmark demonstrates that the analytical solution improves out-of-sample reliability, reducing bid violation rates by up to 8% relative to a sample-based benchmark. The method is also computationally more efficient, solving optimization problems up to 4.8 times faster while requiring substantially fewer samples to ensure compliance. Moreover, the proposed approach enables the construction of feasible bids with reliability levels as high as 99.95%, which would otherwise require prohibitively large scenario sets under the sample-based method. Beyond its computational and reliability advantages, the framework also provides actionable insights into how reliability thresholds influence aggregator bidding behavior and market participation. This study establishes a regulation-compliant, tractable, and risk-aware bidding methodology for stochastic flexibility aggregators, enhancing both market efficiency and power system security.
Computation- and Communication-Efficient Online FL for Resource-Constrained Aerial Vehicles
Privacy-preserving distributed machine learning (ML) and aerial connected vehicle (ACV)-assisted edge computing have drawn significant attention lately. Since the onboard sensors of ACVs can capture new data as they move along their trajectories, the continual arrival of such 'newly' sensed data leads to online learning and demands carefully crafting the trajectories. Besides, as typical ACVs are inherently resource-constrained, computation- and communication-efficient ML solutions are needed. Therefore, we propose a computation- and communication-efficient online aerial federated learning (2CEOAFL) algorithm to take the benefits of continual sensed data and limited onboard resources of the ACVs. In particular, considering independently owned ACVs act as selfish data collectors, we first model their trajectories according to their respective time-varying data distributions. We then propose a 2CEOAFL algorithm that allows the flying ACVs to (a) prune the received dense ML model to make it shallow, (b) train the pruned model, and (c) probabilistically quantize and offload their trained accumulated gradients to the central server (CS). Our extensive simulation results show that the proposed 2CEOAFL algorithm delivers comparable performances to its non-pruned and nonquantized, hence, computation- and communication-inefficient counterparts.
comment: Accepted for publications in IEEE MILCOM 2025
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
While federated learning (FL) is a widely popular distributed machine learning (ML) strategy that protects data privacy, time-varying wireless network parameters and heterogeneous configurations of the wireless devices pose significant challenges. Although the limited radio and computational resources of the network and the clients, respectively, are widely acknowledged, two critical yet often ignored aspects are (a) wireless devices can only dedicate a small chunk of their limited storage for the FL task and (b) new training samples may arrive in an online manner in many practical wireless applications. Therefore, we propose a new FL algorithm called online-score-aided federated learning (OSAFL), specifically designed to learn tasks relevant to wireless applications under these practical considerations. Since clients' local training steps differ under resource constraints, which may lead to client drift under statistically heterogeneous data distributions, we leverage normalized gradient similarities and exploit weighting clients' updates based on optimized scores that facilitate the convergence rate of the proposed OSAFL algorithm without incurring any communication overheads to the clients or requiring any statistical data information from them. We theoretically show how the new factors, i.e., online score and local data distribution shifts, affect the convergence bound and derive the necessary conditions for a sublinear convergence rate. Our extensive simulation results on two different tasks with multiple popular ML models validate the effectiveness of the proposed OSAFL algorithm compared to modified state-of-the-art FL baselines.
comment: Under review for possible publication in IEEE Transactions on Communications
Distributed Implementation of Variational Quantum Eigensolver to Solve QUBO Problems
We present a distributed algorithm and implementation of the variational quantum eigensolver (VQE), termed distributed VQE (DVQE). DVQE, provided as an open-source Python package, enables the execution of parameterized quantum circuits across multiple logical quantum processing units (QPUs) in a distributed fashion. This approach addresses key hardware limitations of near-term quantum devices, including restricted qubit counts and limited circuit depth. Distributed ansatz circuits are constructed to preserve the quantum state fidelity of their monolithic counterparts, allowing consistent energy estimation while distributing the computational load. To improve the convergence and robustness of the optimization loop for identifying the variational parameters of the DVQE ansatz circuit, we use the ADAM optimizer in combination with metaheuristic initialization strategies, which outperform random initialization across various test cases. The complete DVQE pipeline is implemented in a modular Python package that accepts QUBO problems as input and supports monolithic and distributed execution modes. The framework leverages Qiskit to construct and simulate distributed circuits, and includes an internal greedy algorithm for automatic qubit allocation across multiple QPUs. Simulation results on QUBO benchmarks confirm the correctness of the approach, paving the way for real QPU deployment and further exploration of distributed quantum optimization. \textbf{The simulator is publicly available on \href{https://github.com/LSU-RAISE-LAB/DVQE.git}{GitHub} under a package named raiselab, with a collection of tutorial examples.}
Learning Complex Motion Plans using Neural ODEs with Safety and Stability Guarantees ICRA 2024
We propose a Dynamical System (DS) approach to learn complex, possibly periodic motion plans from kinesthetic demonstrations using Neural Ordinary Differential Equations (NODE). To ensure reactivity and robustness to disturbances, we propose a novel approach that selects a target point at each time step for the robot to follow, by combining tools from control theory and the target trajectory generated by the learned NODE. A correction term to the NODE model is computed online by solving a quadratic program that guarantees stability and safety using control Lyapunov functions and control barrier functions, respectively. Our approach outperforms baseline DS learning techniques on the LASA handwriting dataset and complex periodic trajectories. It is also validated on the Franka Emika robot arm to produce stable motions for wiping and stirring tasks that do not have a single attractor, while being robust to perturbations and safe around humans and obstacles.
comment: accepted to ICRA 2024
Quantum Optimization for Optimal Power Flow: CVQLS-Augmented Interior Point Method
This paper presents a quantum-enhanced optimization approach for solving optimal power flow (OPF) by integrating the interior point method (IPM) with a coherent variational quantum linear solver (CVQLS). The objective is to explore the applicability of quantum computing to power systems optimization and address the associated challenges. A comparative analysis of state-of-the-art quantum linear solvers - Harrow-Hassidim-Lloyd (HHL), variational quantum linear solver (VQLS), and CVQLS - revealed that CVQLS is most suitable for OPF due to its stability with ill-conditioned matrices, such as the Hessian in IPM. To ensure high-quality solutions, prevent suboptimal convergence, and avoid the barren plateau problem, we propose a quantum circuit parameter initialization technique along with a method to guide the IPM along the central path. Moreover, we design an ansatz tailored for OPF, optimizing the expressibility and trainability of the quantum circuit to ensure efficient convergence and robustness in solving quantum OPF. Various optimizers are also tested for quantum circuit parameter optimization to select the best one. We evaluate our approaches on multiple systems to show their effectiveness in providing reliable OPF solutions. Simulations for the 2-bus system are conducted on a commercial IBMQ quantum device, while simulations for the other larger cases are performed using the IBM quantum simulator. While promising, CVQLS is limited by current quantum hardware, especially for larger systems. We use a quantum noise simulator to test scalability.
comment: 14 pages
A Metropolis-Adjusted Langevin Algorithm for Sampling Jeffreys Prior
Inference and estimation are fundamental in statistics, system identification, and machine learning. When prior knowledge about the system is available, Bayesian analysis provides a natural framework for encoding it through a prior distribution. In practice, such knowledge is often too vague to specify a full prior distribution, motivating the use of default 'uninformative' priors that minimize subjective bias. Jeffreys prior is an appealing uninformative prior because: (i) it is invariant under any re-parameterization of the model, (ii) it encodes the intrinsic geometric structure of the parameter space through the Fisher information matrix, which in turn enhances the diversity of parameter samples. Despite these benefits, drawing samples from Jeffreys prior is challenging. In this paper, we develop a general sampling scheme using the Metropolis-Adjusted Langevin Algorithm that enables sampling of parameter values from Jeffreys prior; the method extends naturally to nonlinear state-space models. The resulting samples can be directly used in sampling-based system identification methods and Bayesian experiment design, providing an objective, information-geometric description of parameter uncertainty. Several numerical examples demonstrate the efficiency and accuracy of the proposed scheme.
comment: 6 pages, accepted by CDC 2025
Systems and Control (EESS)
Large Language Models (LLMs) for Electronic Design Automation (EDA)
With the growing complexity of modern integrated circuits, hardware engineers are required to devote more effort to the full design-to-manufacturing workflow. This workflow involves numerous iterations, making it both labor-intensive and error-prone. Therefore, there is an urgent demand for more efficient Electronic Design Automation (EDA) solutions to accelerate hardware development. Recently, large language models (LLMs) have shown remarkable advancements in contextual comprehension, logical reasoning, and generative capabilities. Since hardware designs and intermediate scripts can be represented as text, integrating LLM for EDA offers a promising opportunity to simplify and even automate the entire workflow. Accordingly, this paper provides a comprehensive overview of incorporating LLMs into EDA, with emphasis on their capabilities, limitations, and future opportunities. Three case studies, along with their outlook, are introduced to demonstrate the capabilities of LLMs in hardware design, testing, and optimization. Finally, future directions and challenges are highlighted to further explore the potential of LLMs in shaping the next-generation EDA, providing valuable insights for researchers interested in leveraging advanced AI technologies for EDA.
comment: Accepted by IEEE International System-on-Chip Conference
HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling
Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis, or simulators, which do not model associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter configurations and scheduling decisions on the physical assets, even before deployment, or regarching changes not easily realizable in production. We (1) provide the first digital twin framework extended with scheduling capabilities, (2) integrate various top-tier HPC systems given their publicly available datasets, (3) implement extensions to integrate external scheduling simulators. Finally, we show how to (4) implement and evaluate incentive structures, as-well-as (5) evaluate machine learning based scheduling, in such novel digital-twin based meta-framework to prototype scheduling. Our work enables what-if scenarios of HPC systems to evaluate sustainability, and the impact on the simulated system.
The Coherent Multiplex: Scalable Real-Time Wavelet Coherence Architecture
The Coherent Multiplex is formalized and validated as a scalable, real-time system for identifying, analyzing, and visualizing coherence among multiple time series. Its architecture comprises a fast spectral similarity layer based on cosine similarity metrics of Fourier-transformed signals, and a sparse time-frequency layer for wavelet coherence. The system constructs and evolves a multilayer graph representing inter-signal relationships, enabling low-latency inference and monitoring. A simulation prototype demonstrates functionality across 8 synthetic channels with a high similarity threshold for further computation, with additional opportunities for scaling the architecture up to support thousands of input signals with constrained hardware. Applications discussed include neuroscience, finance, and biomedical signal analysis.
comment: Submitted to International Symposium for Signal Processing 2025
Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions
We present an inverse dynamic game-based algorithm to learn parametric constraints from a given dataset of local generalized Nash equilibrium interactions between multiple agents. Specifically, we introduce mixed-integer linear programs (MILP) encoding the Karush-Kuhn-Tucker (KKT) conditions of the interacting agents, which recover constraints consistent with the Nash stationarity of the interaction demonstrations. We establish theoretical guarantees that our method learns inner approximations of the true safe and unsafe sets, as well as limitations of constraint learnability from demonstrations of Nash equilibrium interactions. We also use the interaction constraints recovered by our method to design motion plans that robustly satisfy the underlying constraints. Across simulations and hardware experiments, our methods proved capable of inferring constraints and designing interactive motion plans for various classes of constraints, both convex and non-convex, from interaction demonstrations of agents with nonlinear dynamics.
Combined Stochastic and Robust Optimization for Electric Autonomous Mobility-on-Demand with Nested Benders Decomposition
The electrification and automation of mobility are reshaping how cities operate on-demand transport systems. Managing Electric Autonomous Mobility-on-Demand (EAMoD) fleets effectively requires coordinating dispatch, rebalancing, and charging decisions under multiple uncertainties, including travel demand, travel time, energy consumption, and charger availability. We address this challenge with a combined stochastic and robust model predictive control (MPC) framework. The framework integrates spatio-temporal Bayesian neural network forecasts with a multi-stage stochastic optimization model, formulated as a large-scale mixed-integer linear program. To ensure real-time applicability, we develop a tailored Nested Benders Decomposition that exploits the scenario tree structure and enables efficient parallelized solution. Stochastic optimization is employed to anticipate demand and infrastructure variability, while robust constraints on energy consumption and travel times safeguard feasibility under worst-case realizations. We evaluate the framework using high-fidelity simulations of San Francisco and Chicago. Compared with deterministic, reactive, and robust baselines, the combined stochastic and robust approach reduces median passenger waiting times by up to 36% and 95th-percentile delays by nearly 20%, while also lowering rebalancing distance by 27% and electricity costs by more than 35%. We also conduct a sensitivity analysis of battery size and vehicle efficiency, finding that energy-efficient vehicles maintain stable performance even with small batteries, whereas less efficient vehicles require larger batteries and greater infrastructure support. Our results emphasize the importance of jointly optimizing predictive control, vehicle capabilities, and infrastructure planning to enable scalable, cost-efficient EAMoD operations.
comment: 29 pages, 12 figures
Limited Preemption of the 3-Phase Task Model using Preemption Thresholds
Phased execution models are a well-known solution to tackle the unpredictability of today's complex COTS multi-core platforms. The semantics of these models dedicate phases for a task's execution and shared memory accesses. Memory phases are solely dedicated to load all necessary instructions and data to private local memory, and to write back the results of the computation. During execution phases, only the private local memory is accessed. While non-preemptive execution phases utilize the local memory well, schedulability is reduced due to blocking. On the other hand, fully preemptive execution phases allow for better schedulability, but require local memory to be large enough to hold all tasks involved in preemption simultaneously. Limited preemption is a promising approach that provides moderation between non-preemptive and fully preemptive scheduling. In this paper, we propose using preemption thresholds to limit the number of preemptions to minimize local memory usage while maintaining schedulability. We propose a worst-case response time and a worst-case memory requirement analysis for sporadic 3-phase tasks under partitioned fixed-priority scheduling with preemption thresholds. We further show how the state-of-the-art algorithm to assign preemption thresholds can be applied to the considered task model. Evaluations demonstrate that preemption thresholds can significantly reduce the memory usage (by $2.5\times$) compared to fully preemptive scheduling, while maintaining high schedulability ratios ($13\times$) compared to non-preemptive scheduling.
Uncertainty-Based Perturb and Observe for Fast Optimization of Unknown, Time-Varying Processes
Model-free adaptive optimization methods are capable of optimizing unknown, time-varying processes even when other optimization methods are not. However, their practical application is often limited by perturbations that are used to gather information on the unknown cost and its gradient. The aim of this paper is to develop a perturb-and-observe (P&O) method that reduces the need for such perturbations while still achieving fast and accurate tracking of time-varying optima. To this end, a (time-varying) model of the cost is constructed in an online fashion, taking into account the uncertainty on the measured performance cost as well as the decreasing reliability of older measurements. Perturbations are only used when this is expected to lead to improved performance over a certain time horizon. Convergence conditions are provided under which the strategy converges to a neighborhood of the optimum. Finally, simulation results demonstrate that uncertainty-based P\&O can reduce the number of perturbations significantly while still tracking a time-varying optimum accurately.
comment: To appear in Conference on Decision and Control 2025, Rio de Janeiro, Brazil, 2025 6 pages, 3 figures
Distributed Safety-Critical MPC for Multi-Agent Formation Control and Obstacle Avoidance
For nonlinear multi-agent systems with high relative degrees, achieving formation control and obstacle avoidance in a distributed manner remains a significant challenge. To address this issue, we propose a novel distributed safety-critical model predictive control (DSMPC) algorithm that incorporates discrete-time high-order control barrier functions (DHCBFs) to enforce safety constraints, alongside discrete-time control Lyapunov functions (DCLFs) to establish terminal constraints. To facilitate distributed implementation, we develop estimated neighbor states for formulating DHCBFs and DCLFs, while also devising a bound constraint to limit estimation errors and ensure convergence. Additionally, we provide theoretical guarantees regarding the feasibility and stability of the proposed DSMPC algorithm based on a mild assumption. The effectiveness of the proposed method is evidenced by the simulation results, demonstrating improved performance and reduced computation time compared to existing approaches.
Beyond the Bermuda Triangle of Contention: IOMMU Interference in Mixed Criticality Systems
As Mixed Criticality Systems (MCSs) evolve, they increasingly integrate heterogeneous computing platforms, combining general-purpose processors with specialized accelerators such as AI engines, GPUs, and high-speed networking interfaces. This heterogeneity introduces challenges, as these accelerators and DMA-capable devices act as independent bus masters, directly accessing memory. Consequently, ensuring both security and timing predictability in such environments becomes critical. To address these concerns, the Input-Output Memory Management Unit (IOMMU) plays a key role in mediating and regulating memory access, preventing unauthorized transactions while enforcing isolation and access control policies. While prior work has explored IOMMU-related side-channel vulnerabilities from a security standpoint, its role in performance interference remains largely unexplored. Moreover, many of the same architectural properties that enable side-channel leakage, such as shared TLBs, caching effects, and translation overheads, can also introduce timing unpredictability. In this work, we analyze the contention effects within IOMMU structures using the Xilinx UltraScale+ ZCU104 platform, demonstrating how their shared nature introduce unpredictable delays. Our findings reveal that IOMMU-induced interference primarily affects small memory transactions, where translation overheads significantly impact execution time. Additionally, we hypothesize that contention effects arising from IOTLBs exhibit similar behavior across architectures due to shared caching principles, such as prefetching and hierarchical TLB structures. Notably, our experiments show that IOMMU interference can delay DMA transactions by up to 1.79x for lower-size transfers on the Arm SMMUv2 implementation.
Low-Cost Architecture and Efficient Pattern Synthesis for Polarimetric Phased Array Based on Polarization Coding Reconfigurable Elements
Polarimetric phased arrays (PPAs) enhance radar target detection and anti-jamming capabilities. However, the dual transmit/receive (T/R) channel requirement leads to high costs and system complexity. To address this, this paper introduces a polarization-coding reconfigurable phased array (PCRPA) and associated pattern synthesis techniques to reduce PPA costs while minimizing performance degradation. Each PCRPA element connects to a single T/R channel and incorporates two-level RF switches for real-time control of polarization states and waveforms. By adjusting element codes and excitation weights, the PCRPA can generate arbitrarily polarized and dual-polarized beams. Efficient beam pattern synthesis methods are also proposed, featuring novel optimization constraints derived from theoretical and analytical analysis of PCRPAs. Simulations demonstrate that the approach achieves low cross-polarization and sidelobe levels comparable to conventional architectures within the scan range, particularly for large arrays. However, the channel reduction inevitably incurs power and directivity loss. Experiments conducted on an $8\times 8$ X-band array antenna validate the effectiveness of the proposed system. The PCRPA and synthesis methods are well-suited for large-scale PPA systems, offering significant cost-effectiveness while maintaining good sidelobe suppression and polarization control performance.
Symbolic Equation Modeling of Composite Loads: A Kolmogorov-Arnold Network based Learning Approach
With increasing penetration of distributed energy resources installed behind the meter, there is a growing need for adequate modelling of composite loads to enable accurate power system simulation analysis. Existing measurement based load modeling methods either fit fixed-structure physical models, which limits adaptability to evolving load mixes, or employ flexible machine learning methods which are however black boxes and offer limited interpretability. This paper presents a new learning based load modelling method based on Kolmogorov Arnold Networks towards modelling flexibility and interpretability. By actively learning activation functions on edges, KANs automatically derive free form symbolic equations that capture nonlinear relationships among measured variables without prior assumptions about load structure. Case studies demonstrate that the proposed approach outperforms other methods in both accuracy and generalization ability, while uniquely representing composite loads into transparent, interpretable mathematical equations.
Hybrid ML-RL Approach for Smart Grid Stability Prediction and Optimized Control Strategy
Electrical grids are now much more complex due to the rapid integration of distributed generation and alternative energy sources, which makes forecasting grid stability with optimized control a crucial task for operators. Traditional statistical, physics-based, and ML models can learn the pattern of the grid features, but have limitations in optimal strategy control with instability prediction. This work proposes a hybrid ML-RL framework that leverages ML for rapid stability prediction and RL for dynamic control and optimization. The first stage of this study created a baseline that explored the potential of various ML models for stability prediction. Out of them, the stacking classifiers of several fundamental models show a significant performance in classifying the instability, leading to the second stage, where reinforcement learning algorithms (PPO, A2C, and DQN) optimize power control actions. Experimental results demonstrate that the hybrid ML-RL model effectively stabilizes the grid, achieves rapid convergence, and significantly reduces training time. The integration of ML-based stability classification with RL-based dynamic control enhances decision-making efficiency while lowering computational complexity, making it well-suited for real-time smart grid applications.
comment: Accepted in IEEE Smart Well Congress 2025, Calgary, Canada
Testing and Fault Tolerance Techniques for Carbon Nanotube-Based FPGAs
As the semiconductor manufacturing process technology node shrinks into the nanometer-scale, the CMOS-based Field Programmable Gate Arrays (FPGAs) face big challenges in scalability of performance and power consumption. Multi-walled Carbon Nanotube (MWCNT) serves as a promising candidate for Cu interconnects thanks to the superior conductivity. Moreover, Carbon Nanotube Field Transistor (CNFET) also emerges as a prospective alternative to the conventional CMOS device because of high power efficiency and large noise margin. The combination of MWCNT and CNFET enables the promising CNT-based FPGAs. However, the MWCNT interconnects exhibit significant process variations due to immature fabrication process, leading to delay faults. Also, the non-ideal CNFET fabrication process may generate a few metallic CNTs (m-CNTs), rendering correlated faulty blocks. In this article, we propose a ring oscillator (RO) based testing technique to detect delay faults due to the process variation of MWCNT interconnects. Furthermore, we propose an effective testing technique for the carry chains in CLBs, and an improved circuit design based on the lookup table (LUT) is applied to speed up the fault testing of CNT-based FPGAs. In addition, we propose a testing algorithm to detect m-CNTs in CLBs. Finally, we propose a redundant spare row sharing architecture to improve the yield of CNT-based FPGA further. Experimental results show that the test time for a 6-input LUT can be reduced by 35.49% compared with conventional testing, and the proposed algorithm can achieve a high test coverage with little overhead. The proposed redundant architecture can repair the faulty segment effectively and efficiently.
comment: Accepted by Integration, VLSI Journal
Neural Spline Operators for Risk Quantification in Stochastic Systems
Accurately quantifying long-term risk probabilities in diverse stochastic systems is essential for safety-critical control. However, existing sampling-based and partial differential equation (PDE)-based methods often struggle to handle complex varying dynamics. Physics-informed neural networks learn surrogate mappings for risk probabilities from varying system parameters of fixed and finite dimensions, yet can not account for functional variations in system dynamics. To address these challenges, we introduce physics-informed neural operator (PINO) methods to risk quantification problems, to learn mappings from varying \textit{functional} system dynamics to corresponding risk probabilities. Specifically, we propose Neural Spline Operators (NeSO), a PINO framework that leverages B-spline representations to improve training efficiency and achieve better initial and boundary condition enforcements, which are crucial for accurate risk quantification. We provide theoretical analysis demonstrating the universal approximation capability of NeSO. We also present two case studies, one with varying functional dynamics and another with high-dimensional multi-agent dynamics, to demonstrate the efficacy of NeSO and its significant online speed-up over existing methods. The proposed framework and the accompanying universal approximation theorem are expected to be beneficial for other control or PDE-related problems beyond risk quantification.
What can we learn from signals and systems in a transformer? Insights for probabilistic modeling and inference architecture
In the 1940s, Wiener introduced a linear predictor, where the future prediction is computed by linearly combining the past data. A transformer generalizes this idea: it is a nonlinear predictor where the next-token prediction is computed by nonlinearly combining the past tokens. In this essay, we present a probabilistic model that interprets transformer signals as surrogates of conditional measures, and layer operations as fixed-point updates. An explicit form of the fixed-point update is described for the special case when the probabilistic model is a hidden Markov model (HMM). In part, this paper is in an attempt to bridge the classical nonlinear filtering theory with modern inference architectures.
comment: 21 pages, 5 figures
Characterization of Safety in Stochastic Difference Inclusions using Barrier Functions
We study stochastic systems characterized by difference inclusions. Such stochastic differential inclusions are defined by set-valued maps involving the current state and stochastic input. For such systems, we investigate the problem of proving bounds on the worst-case probability of violating safety properties. Our approach uses the well-known concept of barrier functions from the study of stochastic control systems. However, barrier functions are hard to prove in the presence of stochastic inputs and adversarial choices due to the set-valued nature of the dynamics. In this paper, we show that under some assumptions on the set-valued map including upper semi-continuity and convexity combined with a concave barrier function vastly simplifies the proof of barrier conditions, allowing us to effectively substitute each random input in terms of its expectation. We prove key results based on the theory of set-valued maps and provide some interesting numerical examples. The ideas proposed here will contribute to the growing interest in problems of robust control and verification of stochastic systems in the presence of uncertain distributions and unmodeled dynamics.
comment: 13 pages
Regulation-Aware Game-Theoretic Motion Planning for Autonomous Racing SC 2025
This paper presents a regulation-aware motion planning framework for autonomous racing scenarios. Each agent solves a Regulation-Compliant Model Predictive Control problem, where racing rules - such as right-of-way and collision avoidance responsibilities - are encoded using Mixed Logical Dynamical constraints. We formalize the interaction between vehicles as a Generalized Nash Equilibrium Problem (GNEP) and approximate its solution using an Iterative Best Response scheme. Building on this, we introduce the Regulation-Aware Game-Theoretic Planner (RA-GTP), in which the attacker reasons over the defender's regulation-constrained behavior. This game-theoretic layer enables the generation of overtaking strategies that are both safe and non-conservative. Simulation results demonstrate that the RA-GTP outperforms baseline methods that assume non-interacting or rule-agnostic opponent models, leading to more effective maneuvers while consistently maintaining compliance with racing regulations.
comment: Accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC 2025)
Array-Based Monte Carlo Tree Search
Monte Carlo Tree Search is a popular method for solving decision making problems. Faster implementations allow for more simulations within the same wall clock time, directly improving search performance. To this end, we present an alternative array-based implementation of the classic Upper Confidence bounds applied to Trees algorithm. Our method preserves the logic of the original algorithm, but eliminates the need for branch prediction, enabling faster performance on pipelined processors, and up to a factor of 2.8 times better scaling with search depth in our numerical simulations.
Hierarchical Decentralized Stochastic Control for Cyber-Physical Systems
This paper introduces a two-timescale hierarchical decentralized control architecture for Cyber-Physical Systems (CPS). The system consists of a global controller (GC), and N local controllers (LCs). The GC operates at a slower timescale, imposing budget constraints on the actions of LCs, which function at a faster timescale. Applications can be found in energy grid planning, wildfire management, and other decentralized resource allocation problems. We propose and analyze two optimization frameworks for this setting: COpt and FOpt. In COpt, both GC and LCs together optimize infinite-horizon discounted rewards, while in FOpt the LCs optimize finite-horizon episodic rewards, and the GC optimizes infinite-horizon rewards. Although both frameworks share identical reward functions, their differing horizons can lead to different optimal policies. In particular, FOpt grants greater autonomy to LCs by allowing their policies to be determined only by local objectives, unlike COpt. To our knowledge, these frameworks have not been studied in the literature. We establish the formulations, prove the existence of optimal policies, and prove the convergence of their value iteration algorithms. We further show that COpt always achieves a higher value function than FOpt and derive explicit bounds on their difference. Finally, we establish a set of sufficient structural conditions under which the two frameworks become equivalent.
comment: 8 pages, 2 figures
NAPER: Fault Protection for Real-Time Resource-Constrained Deep Neural Networks
Fault tolerance in Deep Neural Networks (DNNs) deployed on resource-constrained systems presents unique challenges for high-accuracy applications with strict timing requirements. Memory bit-flips can severely degrade DNN accuracy, while traditional protection approaches like Triple Modular Redundancy (TMR) often sacrifice accuracy to maintain reliability, creating a three-way dilemma between reliability, accuracy, and timeliness. We introduce NAPER, a novel protection approach that addresses this challenge through ensemble learning. Unlike conventional redundancy methods, NAPER employs heterogeneous model redundancy, where diverse models collectively achieve higher accuracy than any individual model. This is complemented by an efficient fault detection mechanism and a real-time scheduler that prioritizes meeting deadlines by intelligently scheduling recovery operations without interrupting inference. Our evaluations demonstrate NAPER's superiority: 40% faster inference in both normal and fault conditions, maintained accuracy 4.2% higher than TMR-based strategies, and guaranteed uninterrupted operation even during fault recovery. NAPER effectively balances the competing demands of accuracy, reliability, and timeliness in real-time DNN applications
comment: This work has been accepted for publication in IEEE IOLTS 2025. The final published version available via IEEE Xplore
Future Deployment and Flexibility of Distributed Energy Resources in the Distribution Grids of Switzerland
The decarbonization goals worldwide drive the energy transition of power distribution grids, which operate under increasingly volatile conditions and closer to their technical limits. In this context, localized operational data with high temporal and spatial resolution is essential for their effective planning and regulation. Nevertheless, information on grid-connected distributed energy resources, such as electric vehicles, photovoltaic systems, and heat pumps, is often fragmented, inconsistent, and unavailable. This work introduces a comprehensive database of distributed energy resources and non-controllable loads allocated in Switzerland's medium- and low-voltage distribution grid models, covering over 2 million points of connection. Remarkably, this data specifies the flexibility capabilities of the controllable devices, with a set of projections aligned with national forecasts for 2030, 2040, and 2050. The database supports studies on flexibility provision of distributed energy resources, distribution grid resilience, and national energy policy, among other topics. Importantly, its modular structure allows users to extract national- and local-scale information across medium- and low-voltage systems, enabling broad applicability across locations.
comment: The dataset can be accessed here: https://doi.org/10.5281/zenodo.15056134
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
Robots equipped with rich sensor suites can localize reliably in partially-observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically--a brittle, runtime-expensive approach. Data-driven approaches--including diffusion models--learn multi-modal trajectories from demonstrations, but presuppose an accurate, always-on state estimate. We address the largely open problem: for a given task in a mapped environment, which \textit{minimal sensor subset} must be active at each location to maintain state uncertainty \textit{just low enough} to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error--eliminating external covariance rollouts. We show that this single proxy suffices for a soft-actor-critic to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.
comment: Accepted to CoRL 2025 (Conference on Robot Learning)
Secure Set-based State Estimation for Safety-Critical Applications under Adversarial Attacks on Sensors
Set-based state estimation provides guaranteed state inclusion certificates that are crucial for the safety verification of dynamical systems. However, when system sensors are subject to cyberattacks, maintaining both safety and security guarantees becomes a fundamental challenge that existing point-based secure state estimation methods cannot adequately address due to their inherent inability to provide state inclusion certificates. This paper introduces a novel approach that simultaneously ensures safety guarantees through guaranteed state inclusion and security guarantees against sensor attacks, without imposing conservative restrictions on system operation. We propose a Secure Set-based State Estimation (S3E) algorithm that maintains the true system state within the estimated set under sensor attacks, provided the initialization set contains the initial state and the system remains observable from the uncompromised sensor subset. The algorithm gives the estimated set as a collection of constrained zonotopes (agreement sets), which can be employed as robust certificates for verifying whether the system adheres to safety constraints. Furthermore, we demonstrate that the estimated set remains unaffected by attack signals of sufficiently large magnitude and also establish sufficient conditions for attack detection, identification, and filtering. This compels the attacker to only inject signals of small magnitudes to evade detection, thus preserving the accuracy of the estimated set. To address the computational complexity of the algorithm, we offer several strategies for complexity-performance trade-offs. The efficacy of the proposed algorithm is illustrated through several examples, including its application to a three-story building model.
Bidding in Ancillary Service Markets: An Analytical Approach Using Extreme Value Theory
To enable the participation of stochastic distributed energy resources in ancillary service markets, the Danish transmission system operator, Energinet, mandates that flexibility providers satisfy a minimum 90% reliability requirement for reserve bids. This paper examines the bidding strategy of an electric vehicle aggregator under this regulation and develops a chance-constrained optimization model. In contrast to conventional sample-based approaches that demand large datasets to capture uncertainty, we propose an analytical reformulation that leverages extreme value theory to characterize the tail behavior of flexibility distributions. A case study with real-world charging data from 1400 residential electric vehicles in Denmark demonstrates that the analytical solution improves out-of-sample reliability, reducing bid violation rates by up to 8% relative to a sample-based benchmark. The method is also computationally more efficient, solving optimization problems up to 4.8 times faster while requiring substantially fewer samples to ensure compliance. Moreover, the proposed approach enables the construction of feasible bids with reliability levels as high as 99.95%, which would otherwise require prohibitively large scenario sets under the sample-based method. Beyond its computational and reliability advantages, the framework also provides actionable insights into how reliability thresholds influence aggregator bidding behavior and market participation. This study establishes a regulation-compliant, tractable, and risk-aware bidding methodology for stochastic flexibility aggregators, enhancing both market efficiency and power system security.
Computation- and Communication-Efficient Online FL for Resource-Constrained Aerial Vehicles
Privacy-preserving distributed machine learning (ML) and aerial connected vehicle (ACV)-assisted edge computing have drawn significant attention lately. Since the onboard sensors of ACVs can capture new data as they move along their trajectories, the continual arrival of such 'newly' sensed data leads to online learning and demands carefully crafting the trajectories. Besides, as typical ACVs are inherently resource-constrained, computation- and communication-efficient ML solutions are needed. Therefore, we propose a computation- and communication-efficient online aerial federated learning (2CEOAFL) algorithm to take the benefits of continual sensed data and limited onboard resources of the ACVs. In particular, considering independently owned ACVs act as selfish data collectors, we first model their trajectories according to their respective time-varying data distributions. We then propose a 2CEOAFL algorithm that allows the flying ACVs to (a) prune the received dense ML model to make it shallow, (b) train the pruned model, and (c) probabilistically quantize and offload their trained accumulated gradients to the central server (CS). Our extensive simulation results show that the proposed 2CEOAFL algorithm delivers comparable performances to its non-pruned and nonquantized, hence, computation- and communication-inefficient counterparts.
comment: Accepted for publications in IEEE MILCOM 2025
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
While federated learning (FL) is a widely popular distributed machine learning (ML) strategy that protects data privacy, time-varying wireless network parameters and heterogeneous configurations of the wireless devices pose significant challenges. Although the limited radio and computational resources of the network and the clients, respectively, are widely acknowledged, two critical yet often ignored aspects are (a) wireless devices can only dedicate a small chunk of their limited storage for the FL task and (b) new training samples may arrive in an online manner in many practical wireless applications. Therefore, we propose a new FL algorithm called online-score-aided federated learning (OSAFL), specifically designed to learn tasks relevant to wireless applications under these practical considerations. Since clients' local training steps differ under resource constraints, which may lead to client drift under statistically heterogeneous data distributions, we leverage normalized gradient similarities and exploit weighting clients' updates based on optimized scores that facilitate the convergence rate of the proposed OSAFL algorithm without incurring any communication overheads to the clients or requiring any statistical data information from them. We theoretically show how the new factors, i.e., online score and local data distribution shifts, affect the convergence bound and derive the necessary conditions for a sublinear convergence rate. Our extensive simulation results on two different tasks with multiple popular ML models validate the effectiveness of the proposed OSAFL algorithm compared to modified state-of-the-art FL baselines.
comment: Under review for possible publication in IEEE Transactions on Communications
Distributed Implementation of Variational Quantum Eigensolver to Solve QUBO Problems
We present a distributed algorithm and implementation of the variational quantum eigensolver (VQE), termed distributed VQE (DVQE). DVQE, provided as an open-source Python package, enables the execution of parameterized quantum circuits across multiple logical quantum processing units (QPUs) in a distributed fashion. This approach addresses key hardware limitations of near-term quantum devices, including restricted qubit counts and limited circuit depth. Distributed ansatz circuits are constructed to preserve the quantum state fidelity of their monolithic counterparts, allowing consistent energy estimation while distributing the computational load. To improve the convergence and robustness of the optimization loop for identifying the variational parameters of the DVQE ansatz circuit, we use the ADAM optimizer in combination with metaheuristic initialization strategies, which outperform random initialization across various test cases. The complete DVQE pipeline is implemented in a modular Python package that accepts QUBO problems as input and supports monolithic and distributed execution modes. The framework leverages Qiskit to construct and simulate distributed circuits, and includes an internal greedy algorithm for automatic qubit allocation across multiple QPUs. Simulation results on QUBO benchmarks confirm the correctness of the approach, paving the way for real QPU deployment and further exploration of distributed quantum optimization. \textbf{The simulator is publicly available on \href{https://github.com/LSU-RAISE-LAB/DVQE.git}{GitHub} under a package named raiselab, with a collection of tutorial examples.}
Learning Complex Motion Plans using Neural ODEs with Safety and Stability Guarantees ICRA 2024
We propose a Dynamical System (DS) approach to learn complex, possibly periodic motion plans from kinesthetic demonstrations using Neural Ordinary Differential Equations (NODE). To ensure reactivity and robustness to disturbances, we propose a novel approach that selects a target point at each time step for the robot to follow, by combining tools from control theory and the target trajectory generated by the learned NODE. A correction term to the NODE model is computed online by solving a quadratic program that guarantees stability and safety using control Lyapunov functions and control barrier functions, respectively. Our approach outperforms baseline DS learning techniques on the LASA handwriting dataset and complex periodic trajectories. It is also validated on the Franka Emika robot arm to produce stable motions for wiping and stirring tasks that do not have a single attractor, while being robust to perturbations and safe around humans and obstacles.
comment: accepted to ICRA 2024
Quantum Optimization for Optimal Power Flow: CVQLS-Augmented Interior Point Method
This paper presents a quantum-enhanced optimization approach for solving optimal power flow (OPF) by integrating the interior point method (IPM) with a coherent variational quantum linear solver (CVQLS). The objective is to explore the applicability of quantum computing to power systems optimization and address the associated challenges. A comparative analysis of state-of-the-art quantum linear solvers - Harrow-Hassidim-Lloyd (HHL), variational quantum linear solver (VQLS), and CVQLS - revealed that CVQLS is most suitable for OPF due to its stability with ill-conditioned matrices, such as the Hessian in IPM. To ensure high-quality solutions, prevent suboptimal convergence, and avoid the barren plateau problem, we propose a quantum circuit parameter initialization technique along with a method to guide the IPM along the central path. Moreover, we design an ansatz tailored for OPF, optimizing the expressibility and trainability of the quantum circuit to ensure efficient convergence and robustness in solving quantum OPF. Various optimizers are also tested for quantum circuit parameter optimization to select the best one. We evaluate our approaches on multiple systems to show their effectiveness in providing reliable OPF solutions. Simulations for the 2-bus system are conducted on a commercial IBMQ quantum device, while simulations for the other larger cases are performed using the IBM quantum simulator. While promising, CVQLS is limited by current quantum hardware, especially for larger systems. We use a quantum noise simulator to test scalability.
comment: 14 pages
A Metropolis-Adjusted Langevin Algorithm for Sampling Jeffreys Prior
Inference and estimation are fundamental in statistics, system identification, and machine learning. When prior knowledge about the system is available, Bayesian analysis provides a natural framework for encoding it through a prior distribution. In practice, such knowledge is often too vague to specify a full prior distribution, motivating the use of default 'uninformative' priors that minimize subjective bias. Jeffreys prior is an appealing uninformative prior because: (i) it is invariant under any re-parameterization of the model, (ii) it encodes the intrinsic geometric structure of the parameter space through the Fisher information matrix, which in turn enhances the diversity of parameter samples. Despite these benefits, drawing samples from Jeffreys prior is challenging. In this paper, we develop a general sampling scheme using the Metropolis-Adjusted Langevin Algorithm that enables sampling of parameter values from Jeffreys prior; the method extends naturally to nonlinear state-space models. The resulting samples can be directly used in sampling-based system identification methods and Bayesian experiment design, providing an objective, information-geometric description of parameter uncertainty. Several numerical examples demonstrate the efficiency and accuracy of the proposed scheme.
comment: 6 pages, accepted by CDC 2025
Robotics
MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation
Temporal context is essential for robotic manipulation because such tasks are inherently non-Markovian, yet mainstream VLA models typically overlook it and struggle with long-horizon, temporally dependent tasks. Cognitive science suggests that humans rely on working memory to buffer short-lived representations for immediate control, while the hippocampal system preserves verbatim episodic details and semantic gist of past experience for long-term memory. Inspired by these mechanisms, we propose MemoryVLA, a Cognition-Memory-Action framework for long-horizon robotic manipulation. A pretrained VLM encodes the observation into perceptual and cognitive tokens that form working memory, while a Perceptual-Cognitive Memory Bank stores low-level details and high-level semantics consolidated from it. Working memory retrieves decision-relevant entries from the bank, adaptively fuses them with current tokens, and updates the bank by merging redundancies. Using these tokens, a memory-conditioned diffusion action expert yields temporally aware action sequences. We evaluate MemoryVLA on 150+ simulation and real-world tasks across three robots. On SimplerEnv-Bridge, Fractal, and LIBERO-5 suites, it achieves 71.9%, 72.7%, and 96.5% success rates, respectively, all outperforming state-of-the-art baselines CogACT and pi-0, with a notable +14.6 gain on Bridge. On 12 real-world tasks spanning general skills and long-horizon temporal dependencies, MemoryVLA achieves 84.0% success rate, with long-horizon tasks showing a +26 improvement over state-of-the-art baseline. Project Page: https://shihao1895.github.io/MemoryVLA
comment: The project is available at https://shihao1895.github.io/MemoryVLA
Planning-Query-Guided Model Generation for Model-Based Deformable Object Manipulation
Efficient planning in high-dimensional spaces, such as those involving deformable objects, requires computationally tractable yet sufficiently expressive dynamics models. This paper introduces a method that automatically generates task-specific, spatially adaptive dynamics models by learning which regions of the object require high-resolution modeling to achieve good task performance for a given planning query. Task performance depends on the complex interplay between the dynamics model, world dynamics, control, and task requirements. Our proposed diffusion-based model generator predicts per-region model resolutions based on start and goal pointclouds that define the planning query. To efficiently collect the data for learning this mapping, a two-stage process optimizes resolution using predictive dynamics as a prior before directly optimizing using closed-loop performance. On a tree-manipulation task, our method doubles planning speed with only a small decrease in task performance over using a full-resolution model. This approach informs a path towards using previous planning and control data to generate computationally efficient yet sufficiently expressive dynamics models for new tasks.
comment: 9 pages, 7 figures
AutoRing: Imitation Learning--based Autonomous Intraocular Foreign Body Removal Manipulation with Eye Surgical Robot
Intraocular foreign body removal demands millimeter-level precision in confined intraocular spaces, yet existing robotic systems predominantly rely on manual teleoperation with steep learning curves. To address the challenges of autonomous manipulation (particularly kinematic uncertainties from variable motion scaling and variation of the Remote Center of Motion (RCM) point), we propose AutoRing, an imitation learning framework for autonomous intraocular foreign body ring manipulation. Our approach integrates dynamic RCM calibration to resolve coordinate-system inconsistencies caused by intraocular instrument variation and introduces the RCM-ACT architecture, which combines action-chunking transformers with real-time kinematic realignment. Trained solely on stereo visual data and instrument kinematics from expert demonstrations in a biomimetic eye model, AutoRing successfully completes ring grasping and positioning tasks without explicit depth sensing. Experimental validation demonstrates end-to-end autonomy under uncalibrated microscopy conditions. The results provide a viable framework for developing intelligent eye-surgical systems capable of complex intraocular procedures.
Real-Time Model Checking for Closed-Loop Robot Reactive Planning
We present a new application of model checking which achieves real-time multi-step planning and obstacle avoidance on a real autonomous robot. We have developed a small, purpose-built model checking algorithm which generates plans in situ based on "core" knowledge and attention as found in biological agents. This is achieved in real-time using no pre-computed data on a low-powered device. Our approach is based on chaining temporary control systems which are spawned to counteract disturbances in the local environment that disrupt an autonomous agent from its preferred action (or resting state). A novel discretization of 2D LiDAR data sensitive to bounded variations in the local environment is used. Multi-step planning using model checking by forward depth-first search is applied to cul-de-sac and playground scenarios. Both empirical results and informal proofs of two fundamental properties of our approach demonstrate that model checking can be used to create efficient multi-step plans for local obstacle avoidance, improving on the performance of a reactive agent which can only plan one step. Our approach is an instructional case study for the development of safe, reliable and explainable planning in the context of autonomous vehicles.
comment: 30 pages excluding references, 18 figures, submitted to Formal Aspects of Computing
From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity
Autonomous skill discovery aims to enable robots to acquire diverse behaviors without explicit supervision. Learning such behaviors directly on physical hardware remains challenging due to safety and data efficiency constraints. Existing methods, including Quality-Diversity Actor-Critic (QDAC), require manually defined skill spaces and carefully tuned heuristics, limiting real-world applicability. We propose Unsupervised Real-world Skill Acquisition (URSA), an extension of QDAC that enables robots to autonomously discover and master diverse, high-performing skills directly in the real world. We demonstrate that URSA successfully discovers diverse locomotion skills on a Unitree A1 quadruped in both simulation and the real world. Our approach supports both heuristic-driven skill discovery and fully unsupervised settings. We also show that the learned skill repertoire can be reused for downstream tasks such as real-world damage adaptation, where URSA outperforms all baselines in 5 out of 9 simulated and 3 out of 5 real-world damage scenarios. Our results establish a new framework for real-world robot learning that enables continuous skill discovery with limited human intervention, representing a significant step toward more autonomous and adaptable robotic systems. Demonstration videos are available at http://adaptive-intelligent-robotics.github.io/URSA .
comment: Accepted at CoRL 2025
Direction Informed Trees (DIT*): Optimal Path Planning via Direction Filter and Direction Cost Heuristic ICRA
Optimal path planning requires finding a series of feasible states from the starting point to the goal to optimize objectives. Popular path planning algorithms, such as Effort Informed Trees (EIT*), employ effort heuristics to guide the search. Effective heuristics are accurate and computationally efficient, but achieving both can be challenging due to their conflicting nature. This paper proposes Direction Informed Trees (DIT*), a sampling-based planner that focuses on optimizing the search direction for each edge, resulting in goal bias during exploration. We define edges as generalized vectors and integrate similarity indexes to establish a directional filter that selects the nearest neighbors and estimates direction costs. The estimated direction cost heuristics are utilized in edge evaluation. This strategy allows the exploration to share directional information efficiently. DIT* convergence faster than existing single-query, sampling-based planners on tested problems in R^4 to R^16 and has been demonstrated in real-world environments with various planning tasks. A video showcasing our experimental results is available at: https://youtu.be/2SX6QT2NOek
comment: 7 pages, 5 figures, 2025 IEEE International Conference on Robotics and Automation (ICRA)
Real-time Testing of Satellite Attitude Control With a Reaction Wheel Hardware-In-the-Loop Platform
We propose the Hardware-in-the-Loop (HIL) test of an adaptive satellite attitude control system with reaction wheel health estimation capabilities. Previous simulations and Software-in-the-Loop testing have prompted further experiments to explore the validity of the controller with real momentum exchange devices in the loop. This work is a step toward a comprehensive testing framework for validation of spacecraft attitude control algorithms. The proposed HIL testbed includes brushless DC motors and drivers that communicate using a CAN bus, an embedded computer that executes control and adaptation laws, and a satellite simulator that produces simulated sensor data, estimated attitude states, and responds to actions of the external actuators. We propose methods to artificially induce failures on the reaction wheels, and present related issues and lessons learned.
comment: 15 pages, 10 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
Safe Navigation under State Uncertainty: Online Adaptation for Robust Control Barrier Functions
Measurements and state estimates are often imperfect in control practice, posing challenges for safety-critical applications, where safety guarantees rely on accurate state information. In the presence of estimation errors, several prior robust control barrier function (R-CBF) formulations have imposed strict conditions on the input. These methods can be overly conservative and can introduce issues such as infeasibility, high control effort, etc. This work proposes a systematic method to improve R-CBFs, and demonstrates its advantages on a tracked vehicle that navigates among multiple obstacles. A primary contribution is a new optimization-based online parameter adaptation scheme that reduces the conservativeness of existing R-CBFs. In order to reduce the complexity of the parameter optimization, we merge several safety constraints into one unified numerical CBF via Poisson's equation. We further address the dual relative degree issue that typically causes difficulty in vehicle tracking. Experimental trials demonstrate the overall performance improvement of our approach over existing formulations.
QuadKAN: KAN-Enhanced Quadruped Motion Control via End-to-End Reinforcement Learning
We address vision-guided quadruped motion control with reinforcement learning (RL) and highlight the necessity of combining proprioception with vision for robust control. We propose QuadKAN, a spline-parameterized cross-modal policy instantiated with Kolmogorov-Arnold Networks (KANs). The framework incorporates a spline encoder for proprioception and a spline fusion head for proprioception-vision inputs. This structured function class aligns the state-to-action mapping with the piecewise-smooth nature of gait, improving sample efficiency, reducing action jitter and energy consumption, and providing interpretable posture-action sensitivities. We adopt Multi-Modal Delay Randomization (MMDR) and perform end-to-end training with Proximal Policy Optimization (PPO). Evaluations across diverse terrains, including both even and uneven surfaces and scenarios with static or dynamic obstacles, demonstrate that QuadKAN achieves consistently higher returns, greater distances, and fewer collisions than state-of-the-art (SOTA) baselines. These results show that spline-parameterized policies offer a simple, effective, and interpretable alternative for robust vision-guided locomotion. A repository will be made available upon acceptance.
comment: 14pages, 9 figures, Journal paper
Uncertainty-Resilient Active Intention Recognition for Robotic Assistants
Purposeful behavior in robotic assistants requires the integration of multiple components and technological advances. Often, the problem is reduced to recognizing explicit prompts, which limits autonomy, or is oversimplified through assumptions such as near-perfect information. We argue that a critical gap remains unaddressed -- specifically, the challenge of reasoning about the uncertain outcomes and perception errors inherent to human intention recognition. In response, we present a framework designed to be resilient to uncertainty and sensor noise, integrating real-time sensor data with a combination of planners. Centered around an intention-recognition POMDP, our approach addresses cooperative planning and acting under uncertainty. Our integrated framework has been successfully tested on a physical robot with promising results.
comment: (To appear) In Proceedings of ECMR 2025
ZeST: an LLM-based Zero-Shot Traversability Navigation for Unknown Environments
The advancement of robotics and autonomous navigation systems hinges on the ability to accurately predict terrain traversability. Traditional methods for generating datasets to train these prediction models often involve putting robots into potentially hazardous environments, posing risks to equipment and safety. To solve this problem, we present ZeST, a novel approach leveraging visual reasoning capabilities of Large Language Models (LLMs) to create a traversability map in real-time without exposing robots to danger. Our approach not only performs zero-shot traversability and mitigates the risks associated with real-world data collection but also accelerates the development of advanced navigation systems, offering a cost-effective and scalable solution. To support our findings, we present navigation results, in both controlled indoor and unstructured outdoor environments. As shown in the experiments, our method provides safer navigation when compared to other state-of-the-art methods, constantly reaching the final goal.
DELIVER: A System for LLM-Guided Coordinated Multi-Robot Pickup and Delivery using Voronoi-Based Relay Planning
We present DELIVER (Directed Execution of Language-instructed Item Via Engineered Relay), a fully integrated framework for cooperative multi-robot pickup and delivery driven by natural language commands. DELIVER unifies natural language understanding, spatial decomposition, relay planning, and motion execution to enable scalable, collision-free coordination in real-world settings. Given a spoken or written instruction, a lightweight instance of LLaMA3 interprets the command to extract pickup and delivery locations. The environment is partitioned using a Voronoi tessellation to define robot-specific operating regions. Robots then compute optimal relay points along shared boundaries and coordinate handoffs. A finite-state machine governs each robot's behavior, enabling robust execution. We implement DELIVER on the MultiTRAIL simulation platform and validate it in both ROS2-based Gazebo simulations and real-world hardware using TurtleBot3 robots. Empirical results show that DELIVER maintains consistent mission cost across varying team sizes while reducing per-agent workload by up to 55% compared to a single-agent system. Moreover, the number of active relay agents remains low even as team size increases, demonstrating the system's scalability and efficient agent utilization. These findings underscore DELIVER's modular and extensible architecture for language-guided multi-robot coordination, advancing the frontiers of cyber-physical system integration.
comment: Submission under review at the 2026 IEEE/SICE International Symposium on System Integration (SII 2026)
VibES: Induced Vibration for Persistent Event-Based Sensing
Event cameras are a bio-inspired class of sensors that asynchronously measure per-pixel intensity changes. Under fixed illumination conditions in static or low-motion scenes, rigidly mounted event cameras are unable to generate any events, becoming unsuitable for most computer vision tasks. To address this limitation, recent work has investigated motion-induced event stimulation that often requires complex hardware or additional optical components. In contrast, we introduce a lightweight approach to sustain persistent event generation by employing a simple rotating unbalanced mass to induce periodic vibrational motion. This is combined with a motion-compensation pipeline that removes the injected motion and yields clean, motion-corrected events for downstream perception tasks. We demonstrate our approach with a hardware prototype and evaluate it on real-world captured datasets. Our method reliably recovers motion parameters and improves both image reconstruction and edge detection over event-based sensing without motion induction.
An LLM-powered Natural-to-Robotic Language Translation Framework with Correctness Guarantees
The Large Language Models (LLM) are increasingly being deployed in robotics to generate robot control programs for specific user tasks, enabling embodied intelligence. Existing methods primarily focus on LLM training and prompt design that utilize LLMs to generate executable programs directly from user tasks in natural language. However, due to the inconsistency of the LLMs and the high complexity of the tasks, such best-effort approaches often lead to tremendous programming errors in the generated code, which significantly undermines the effectiveness especially when the light-weight LLMs are applied. This paper introduces a natural-robotic language translation framework that (i) provides correctness verification for generated control programs and (ii) enhances the performance of LLMs in program generation via feedback-based fine-tuning for the programs. To achieve this, a Robot Skill Language (RSL) is proposed to abstract away from the intricate details of the control programs, bridging the natural language tasks with the underlying robot skills. Then, the RSL compiler and debugger are constructed to verify RSL programs generated by the LLM and provide error feedback to the LLM for refining the outputs until being verified by the compiler. This provides correctness guarantees for the LLM-generated programs before being offloaded to the robots for execution, significantly enhancing the effectiveness of LLM-powered robotic applications. Experiments demonstrate NRTrans outperforms the existing method under a range of LLMs and tasks, and achieves a high success rate for light-weight LLMs.
HuBE: Cross-Embodiment Human-like Behavior Execution for Humanoid Robots
Achieving both behavioral similarity and appropriateness in human-like motion generation for humanoid robot remains an open challenge, further compounded by the lack of cross-embodiment adaptability. To address this problem, we propose HuBE, a bi-level closed-loop framework that integrates robot state, goal poses, and contextual situations to generate human-like behaviors, ensuring both behavioral similarity and appropriateness, and eliminating structural mismatches between motion generation and execution. To support this framework, we construct HPose, a context-enriched dataset featuring fine-grained situational annotations. Furthermore, we introduce a bone scaling-based data augmentation strategy that ensures millimeter-level compatibility across heterogeneous humanoid robots. Comprehensive evaluations on multiple commercial platforms demonstrate that HuBE significantly improves motion similarity, behavioral appropriateness, and computational efficiency over state-of-the-art baselines, establishing a solid foundation for transferable and human-like behavior execution across diverse humanoid robots.
comment: 8 pages, 8 figures,4 tables
Enhanced UAV Path Planning Using the Tangent Intersection Guidance (TIG) Algorithm
Efficient and safe navigation of Unmanned Aerial Vehicles (UAVs) is critical for various applications, including combat support, package delivery and Search and Rescue Operations. This paper introduces the Tangent Intersection Guidance (TIG) algorithm, an advanced approach for UAV path planning in both static and dynamic environments. The algorithm uses the elliptic tangent intersection method to generate feasible paths. It generates two sub-paths for each threat, selects the optimal route based on a heuristic rule, and iteratively refines the path until the target is reached. Considering the UAV kinematic and dynamic constraints, a modified smoothing technique based on quadratic B\'ezier curves is adopted to generate a smooth and efficient route. Experimental results show that the TIG algorithm can generate the shortest path in less time, starting from 0.01 seconds, with fewer turning angles compared to A*, PRM, RRT*, Tangent Graph, and Static APPATT algorithms in static environments. Furthermore, in completely unknown and partially known environments, TIG demonstrates efficient real-time path planning capabilities for collision avoidance, outperforming APF and Dynamic APPATT algorithms.
comment: Accepted for publication in JAMRIS Journal
VisionSafeEnhanced VPC: Cautious Predictive Control with Visibility Constraints under Uncertainty for Autonomous Robotic Surgery
Autonomous control of the laparoscope in robot-assisted Minimally Invasive Surgery (MIS) has received considerable research interest due to its potential to improve surgical safety. Despite progress in pixel-level Image-Based Visual Servoing (IBVS) control, the requirement of continuous visibility and the existence of complex disturbances, such as parameterization error, measurement noise, and uncertainties of payloads, could degrade the surgeon's visual experience and compromise procedural safety. To address these limitations, this paper proposes VisionSafeEnhanced Visual Predictive Control (VPC), a robust and uncertainty-adaptive framework for autonomous laparoscope control that guarantees Field of View (FoV) safety under uncertainty. Firstly, Gaussian Process Regression (GPR) is utilized to perform hybrid (deterministic + stochastic) quantification of operational uncertainties including residual model uncertainties, stochastic uncertainties, and external disturbances. Based on uncertainty quantification, a novel safety aware trajectory optimization framework with probabilistic guarantees is proposed, where a uncertainty-adaptive safety Control Barrier Function (CBF) condition is given based on uncertainty propagation, and chance constraints are simultaneously formulated based on probabilistic approximation. This uncertainty aware formulation enables adaptive control effort allocation, minimizing unnecessary camera motion while maintaining robustness. The proposed method is validated through comparative simulations and experiments on a commercial surgical robot platform (MicroPort MedBot Toumai) performing a sequential multi-target lymph node dissection. Compared with baseline methods, the framework maintains near-perfect target visibility (>99.9%), reduces tracking e
comment: 8 pages, 6 figures
Interpretable Decision-Making for End-to-End Autonomous Driving ICCV 2025
Trustworthy AI is mandatory for the broad deployment of autonomous vehicles. Although end-to-end approaches derive control commands directly from raw data, interpreting these decisions remains challenging, especially in complex urban scenarios. This is mainly attributed to very deep neural networks with non-linear decision boundaries, making it challenging to grasp the logic behind AI-driven decisions. This paper presents a method to enhance interpretability while optimizing control commands in autonomous driving. To address this, we propose loss functions that promote the interpretability of our model by generating sparse and localized feature maps. The feature activations allow us to explain which image regions contribute to the predicted control command. We conduct comprehensive ablation studies on the feature extraction step and validate our method on the CARLA benchmarks. We also demonstrate that our approach improves interpretability, which correlates with reducing infractions, yielding a safer, high-performance driving model. Notably, our monocular, non-ensemble model surpasses the top-performing approaches from the CARLA Leaderboard by achieving lower infraction scores and the highest route completion rate, all while ensuring interpretability.
comment: Accepted to the ICCV 2025 2nd Workshop on the Challenge Of Out-of-Label Hazards in Autonomous Driving (2COOOL)
AS2FM: Enabling Statistical Model Checking of ROS 2 Systems for Robust Autonomy IROS2025
Designing robotic systems to act autonomously in unforeseen environments is a challenging task. This work presents a novel approach to use formal verification, specifically Statistical Model Checking (SMC), to verify system properties of autonomous robots at design-time. We introduce an extension of the SCXML format, designed to model system components including both Robot Operating System 2 (ROS 2) and Behavior Tree (BT) features. Further, we contribute Autonomous Systems to Formal Models (AS2FM), a tool to translate the full system model into JANI. The use of JANI, a standard format for quantitative model checking, enables verification of system properties with off-the-shelf SMC tools. We demonstrate the practical usability of AS2FM both in terms of applicability to real-world autonomous robotic control systems, and in terms of verification runtime scaling. We provide a case study, where we successfully identify problems in a ROS 2-based robotic manipulation use case that is verifiable in less than one second using consumer hardware. Additionally, we compare to the state of the art and demonstrate that our method is more comprehensive in system feature support, and that the verification runtime scales linearly with the size of the model, instead of exponentially.
comment: Accepted at IROS2025
Learning Real-World Acrobatic Flight from Human Preferences
Preference-based reinforcement learning (PbRL) enables agents to learn control policies without requiring manually designed reward functions, making it well-suited for tasks where objectives are difficult to formalize or inherently subjective. Acrobatic flight poses a particularly challenging problem due to its complex dynamics, rapid movements, and the importance of precise execution. In this work, we explore the use of PbRL for agile drone control, focusing on the execution of dynamic maneuvers such as powerloops. Building on Preference-based Proximal Policy Optimization (Preference PPO), we propose Reward Ensemble under Confidence (REC), an extension to the reward learning objective that improves preference modeling and learning stability. Our method achieves 88.4% of the shaped reward performance, compared to 55.2% with standard Preference PPO. We train policies in simulation and successfully transfer them to real-world drones, demonstrating multiple acrobatic maneuvers where human preferences emphasize stylistic qualities of motion. Furthermore, we demonstrate the applicability of our probabilistic reward model in a representative MuJoCo environment for continuous control. Finally, we highlight the limitations of manually designed rewards, observing only 60.7% agreement with human preferences. These results underscore the effectiveness of PbRL in capturing complex, human-centered objectives across both physical and simulated domains.
comment: 8 pages, 7 figures
HyperTASR: Hypernetwork-Driven Task-Aware Scene Representations for Robust Manipulation
Effective policy learning for robotic manipulation requires scene representations that selectively capture task-relevant environmental features. Current approaches typically employ task-agnostic representation extraction, failing to emulate the dynamic perceptual adaptation observed in human cognition. We present HyperTASR, a hypernetwork-driven framework that modulates scene representations based on both task objectives and the execution phase. Our architecture dynamically generates representation transformation parameters conditioned on task specifications and progression state, enabling representations to evolve contextually throughout task execution. This approach maintains architectural compatibility with existing policy learning frameworks while fundamentally reconfiguring how visual features are processed. Unlike methods that simply concatenate or fuse task embeddings with task-agnostic representations, HyperTASR establishes computational separation between task-contextual and state-dependent processing paths, enhancing learning efficiency and representational quality. Comprehensive evaluations in both simulation and real-world environments demonstrate substantial performance improvements across different representation paradigms. Through ablation studies and attention visualization, we confirm that our approach selectively prioritizes task-relevant scene information, closely mirroring human adaptive perception during manipulation tasks. The project website is at \href{https://lisunphil.github.io/HyperTASR_projectpage/}{lisunphil.github.io/HyperTASR\_projectpage}.
PseudoMapTrainer: Learning Online Mapping without HD Maps ICCV 2025
Online mapping models show remarkable results in predicting vectorized maps from multi-view camera images only. However, all existing approaches still rely on ground-truth high-definition maps during training, which are expensive to obtain and often not geographically diverse enough for reliable generalization. In this work, we propose PseudoMapTrainer, a novel approach to online mapping that uses pseudo-labels generated from unlabeled sensor data. We derive those pseudo-labels by reconstructing the road surface from multi-camera imagery using Gaussian splatting and semantics of a pre-trained 2D segmentation network. In addition, we introduce a mask-aware assignment algorithm and loss function to handle partially masked pseudo-labels, allowing for the first time the training of online mapping models without any ground-truth maps. Furthermore, our pseudo-labels can be effectively used to pre-train an online model in a semi-supervised manner to leverage large-scale unlabeled crowdsourced data. The code is available at github.com/boschresearch/PseudoMapTrainer.
comment: Accepted at ICCV 2025
Are All Marine Species Created Equal? Performance Disparities in Underwater Object Detection
Underwater object detection is critical for monitoring marine ecosystems but poses unique challenges, including degraded image quality, imbalanced class distribution, and distinct visual characteristics. Not every species is detected equally well, yet underlying causes remain unclear. We address two key research questions: 1) What factors beyond data quantity drive class-specific performance disparities? 2) How can we systematically improve detection of under-performing marine species? We manipulate the DUO dataset to separate the object detection task into localization and classification and investigate the under-performance of the scallop class. Localization analysis using YOLO11 and TIDE finds that foreground-background discrimination is the most problematic stage regardless of data quantity. Classification experiments reveal persistent precision gaps even with balanced data, indicating intrinsic feature-based challenges beyond data scarcity and inter-class dependencies. We recommend imbalanced distributions when prioritizing precision, and balanced distributions when prioritizing recall. Improving under-performing classes should focus on algorithmic advances, especially within localization modules. We publicly release our code and datasets.
comment: 10 pages
Enhancing Video-Based Robot Failure Detection Using Task Knowledge
Robust robotic task execution hinges on the reliable detection of execution failures in order to trigger safe operation modes, recovery strategies, or task replanning. However, many failure detection methods struggle to provide meaningful performance when applied to a variety of real-world scenarios. In this paper, we propose a video-based failure detection approach that uses spatio-temporal knowledge in the form of the actions the robot performs and task-relevant objects within the field of view. Both pieces of information are available in most robotic scenarios and can thus be readily obtained. We demonstrate the effectiveness of our approach on three datasets that we amend, in part, with additional annotations of the aforementioned task-relevant knowledge. In light of the results, we also propose a data augmentation method that improves performance by applying variable frame rates to different parts of the video. We observe an improvement from 77.9 to 80.0 in F1 score on the ARMBench dataset without additional computational expense and an additional increase to 81.4 with test-time augmentation. The results emphasize the importance of spatio-temporal information during failure detection and suggest further investigation of suitable heuristics in future implementations. Code and annotations are available.
comment: Accepted at ECMR 2025
AgriChrono: A Multi-modal Dataset Capturing Crop Growth and Lighting Variability with a Field Robot
Existing datasets for precision agriculture have primarily been collected in static or controlled environments such as indoor labs or greenhouses, often with limited sensor diversity and restricted temporal span. These conditions fail to reflect the dynamic nature of real farmland, including illumination changes, crop growth variation, and natural disturbances. As a result, models trained on such data often lack robustness and generalization when applied to real-world field scenarios. In this paper, we present AgriChrono, a novel robotic data collection platform and multi-modal dataset designed to capture the dynamic conditions of real-world agricultural environments. Our platform integrates multiple sensors and enables remote, time-synchronized acquisition of RGB, Depth, LiDAR, and IMU data, supporting efficient and repeatable long-term data collection across varying illumination and crop growth stages. We benchmark a range of state-of-the-art 3D reconstruction models on the AgriChrono dataset, highlighting the difficulty of reconstruction in real-world field environments and demonstrating its value as a research asset for advancing model generalization under dynamic conditions. The code and dataset are publicly available at: https://github.com/StructuresComp/agri-chrono
Deep Sensorimotor Control by Imitating Predictive Models of Human Motion
As the embodiment gap between a robot and a human narrows, new opportunities arise to leverage datasets of humans interacting with their surroundings for robot learning. We propose a novel technique for training sensorimotor policies with reinforcement learning by imitating predictive models of human motions. Our key insight is that the motion of keypoints on human-inspired robot end-effectors closely mirrors the motion of corresponding human body keypoints. This enables us to use a model trained to predict future motion on human data \emph{zero-shot} on robot data. We train sensorimotor policies to track the predictions of such a model, conditioned on a history of past robot states, while optimizing a relatively sparse task reward. This approach entirely bypasses gradient-based kinematic retargeting and adversarial losses, which limit existing methods from fully leveraging the scale and diversity of modern human-scene interaction datasets. Empirically, we find that our approach can work across robots and tasks, outperforming existing baselines by a large margin. In addition, we find that tracking a human motion model can substitute for carefully designed dense rewards and curricula in manipulation tasks. Code, data and qualitative results available at https://jirl-upenn.github.io/track_reward/.
comment: Blog Post: https://hgaurav2k.github.io/trackr/
Engineering Automotive Digital Twins on Standardized Architectures: A Case Study
Digital twin (DT) technology has become of interest in the automotive industry. There is a growing need for smarter services that utilize the unique capabilities of DTs, ranging from computer-aided remote control to cloud-based fleet coordination. Developing such services starts with the software architecture. However, the scarcity of DT architectural guidelines poses a challenge for engineering automotive DTs. Currently, the only DT architectural standard is the one defined in ISO 23247. Though not developed for automotive systems, it is one of the few feasible starting points for automotive DTs. In this work, we investigate the suitability of the ISO 23247 reference architecture for developing automotive DTs. Through the case study of developing an Adaptive Cruise Control DT for a 1/10\textsuperscript{th}-scale autonomous vehicle, we identify some strengths and limitations of the reference architecture and begin distilling future directions for researchers, practitioners, and standard developers.
comment: 7 pages, 6 figures. Submitted to EDTconf 2025
Integration of Robot and Scene Kinematics for Sequential Mobile Manipulation Planning
We present a Sequential Mobile Manipulation Planning (SMMP) framework that can solve long-horizon multi-step mobile manipulation tasks with coordinated whole-body motion, even when interacting with articulated objects. By abstracting environmental structures as kinematic models and integrating them with the robot's kinematics, we construct an Augmented Configuration Apace (A-Space) that unifies the previously separate task constraints for navigation and manipulation, while accounting for the joint reachability of the robot base, arm, and manipulated objects. This integration facilitates efficient planning within a tri-level framework: a task planner generates symbolic action sequences to model the evolution of A-Space, an optimization-based motion planner computes continuous trajectories within A-Space to achieve desired configurations for both the robot and scene elements, and an intermediate plan refinement stage selects action goals that ensure long-horizon feasibility. Our simulation studies first confirm that planning in A-Space achieves an 84.6\% higher task success rate compared to baseline methods. Validation on real robotic systems demonstrates fluid mobile manipulation involving (i) seven types of rigid and articulated objects across 17 distinct contexts, and (ii) long-horizon tasks of up to 14 sequential steps. Our results highlight the significance of modeling scene kinematics into planning entities, rather than encoding task-specific constraints, offering a scalable and generalizable approach to complex robotic manipulation.
comment: 20 pages, 13 figures; accepted by Transactions on Robotics
SignLoc: Robust Localization using Navigation Signs and Public Maps
Navigation signs and maps, such as floor plans and street maps, are widely available and serve as ubiquitous aids for way-finding in human environments. Yet, they are rarely used by robot systems. This paper presents SignLoc, a global localization method that leverages navigation signs to localize the robot on publicly available maps -- specifically floor plans and OpenStreetMap (OSM) graphs -- without prior sensor-based mapping. SignLoc first extracts a navigation graph from the input map. It then employs a probabilistic observation model to match directional and locational cues from the detected signs to the graph, enabling robust topo-semantic localization within a Monte Carlo framework. We evaluated SignLoc in diverse large-scale environments: part of a university campus, a shopping mall, and a hospital complex. Experimental results show that SignLoc reliably localizes the robot after observing only one to two signs.
comment: Under submission for Robotics and Automation Letters (RA-L)
Gentle Object Retraction in Dense Clutter Using Multimodal Force Sensing and Imitation Learning
Dense collections of movable objects are common in everyday spaces -- from cabinets in a home to shelves in a warehouse. Safely retracting objects from such collections is difficult for robots, yet people do it easily, using non-prehensile tactile sensing on the sides and backs of their hands and arms. We investigate the role of such sensing for training robots to gently reach into constrained clutter and extract objects. The available sensing modalities are (1) "eye-in-hand" vision, (2) proprioception, (3) non-prehensile triaxial tactile sensing, (4) contact wrenches estimated from joint torques, and (5) a measure of successful object acquisition obtained by monitoring the vacuum line of a suction cup. We use imitation learning to train policies from a set of demonstrations on randomly generated scenes, then conduct an ablation study of wrench and tactile information. We evaluate each policy's performance across 40 unseen environment configurations. Policies employing any force sensing show fewer excessive force failures, an increased overall success rate, and faster completion times. The best performance is achieved using both tactile and wrench information, producing an 80% improvement above the baseline without force information.
comment: Submitted to IEEE Robotics and Automation Letters (RA-L)
An Iterative Approach for Heterogeneous Multi-Agent Route Planning with Resource Transportation Uncertainty and Temporal Logic Goals
This paper presents an iterative approach for heterogeneous multi-agent route planning in environments with unknown resource distributions. We focus on a team of robots with diverse capabilities tasked with executing missions specified using Capability Temporal Logic (CaTL), a formal framework built on Signal Temporal Logic to handle spatial, temporal, capability, and resource constraints. The key challenge arises from the uncertainty in the initial distribution and quantity of resources in the environment. To address this, we introduce an iterative algorithm that dynamically balances exploration and task fulfillment. Robots are guided to explore the environment, identifying resource locations and quantities while progressively refining their understanding of the resource landscape. At the same time, they aim to maximally satisfy the mission objectives based on the current information, adapting their strategies as new data is uncovered. This approach provides a robust solution for planning in dynamic, resource-constrained environments, enabling efficient coordination of heterogeneous teams even under conditions of uncertainty. Our method's effectiveness and performance are demonstrated through simulated case studies.
From Stoplights to On-Ramps: A Comprehensive Set of Crash Rate Benchmarks for Freeway and Surface Street ADS Evaluation
This paper presents crash rate benchmarks for evaluating US-based Automated Driving Systems (ADS) for multiple urban areas. The purpose of this study was to extend prior benchmarks focused only on surface streets to additionally capture freeway crash risk for future ADS safety performance assessments. Using publicly available police-reported crash and vehicle miles traveled (VMT) data, the methodology details the isolation of in-transport passenger vehicles, road type classification, and crash typology. Key findings revealed that freeway crash rates exhibit large geographic dependence variations with any-injury-reported crash rates being nearly 3.5 times higher in Atlanta (2.4 IPMM; the highest) when compared to Phoenix (0.7 IPMM; the lowest). The results show the critical need for location-specific benchmarks to avoid biased safety evaluations and provide insights into the vehicle miles traveled (VMT) required to achieve statistical significance for various safety impact levels. The distribution of crash types depended on the outcome severity level. Higher severity outcomes (e.g., fatal crashes) had a larger proportion of single-vehicle, vulnerable road users (VRU), and opposite-direction collisions compared to lower severity (police-reported) crashes. Given heterogeneity in crash types by severity, performance in low-severity scenarios may not be predictive of high-severity outcomes. These benchmarks are additionally used to quantify at the required mileage to show statistically significant deviations from human performance. This is the first paper to generate freeway-specific benchmarks for ADS evaluation and provides a foundational framework for future ADS benchmarking by evaluators and developers.
LaVA-Man: Learning Visual Action Representations for Robot Manipulation
Visual-textual understanding is essential for language-guided robot manipulation. Recent works leverage pre-trained vision-language models to measure the similarity between encoded visual observations and textual instructions, and then train a model to map this similarity to robot actions. However, this two-step approach limits the model to capture the relationship between visual observations and textual instructions, leading to reduced precision in manipulation tasks. We propose to learn visual-textual associations through a self-supervised pretext task: reconstructing a masked goal image conditioned on an input image and textual instructions. This formulation allows the model to learn visual-action representations without robot action supervision. The learned representations can then be fine-tuned for manipulation tasks with only a few demonstrations. We also introduce the \textit{Omni-Object Pick-and-Place} dataset, which consists of annotated robot tabletop manipulation episodes, including 180 object classes and 3,200 instances with corresponding textual instructions. This dataset enables the model to acquire diverse object priors and allows for a more comprehensive evaluation of its generalisation capability across object instances. Experimental results on the five benchmarks, including both simulated and real-robot validations, demonstrate that our method outperforms prior art.
FlipWalker: Jacob's Ladder toy-inspired robot for locomotion across diverse, complex terrain IROS 2025
This paper introduces FlipWalker, a novel underactuated robot locomotion system inspired by Jacob's Ladder illusion toy, designed to traverse challenging terrains where wheeled robots often struggle. Like the Jacob's Ladder toy, FlipWalker features two interconnected segments joined by flexible cables, enabling it to pivot and flip around singularities in a manner reminiscent of the toy's cascading motion. Actuation is provided by motor-driven legs within each segment that push off either the ground or the opposing segment, depending on the robot's current configuration. A physics-based model of the underactuated flipping dynamics is formulated to elucidate the critical design parameters governing forward motion and obstacle clearance or climbing. The untethered prototype weighs 0.78 kg, achieves a maximum flipping speed of 0.2 body lengths per second. Experimental trials on artificial grass, river rocks, and snow demonstrate that FlipWalker's flipping strategy, which relies on ground reaction forces applied normal to the surface, offers a promising alternative to traditional locomotion for navigating irregular outdoor terrain.
comment: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Inference of Human-derived Specifications of Object Placement via Demonstration
As robots' manipulation capabilities improve for pick-and-place tasks (e.g., object packing, sorting, and kitting), methods focused on understanding human-acceptable object configurations remain limited expressively with regard to capturing spatial relationships important to humans. To advance robotic understanding of human rules for object arrangement, we introduce positionally-augmented RCC (PARCC), a formal logic framework based on region connection calculus (RCC) for describing the relative position of objects in space. Additionally, we introduce an inference algorithm for learning PARCC specifications via demonstrations. Finally, we present the results from a human study, which demonstrate our framework's ability to capture a human's intended specification and the benefits of learning from demonstration approaches over human-provided specifications.
FlowVLA: Thinking in Motion with a Visual Chain of Thought
Many Vision-Language-Action (VLA) models are built upon an internal world model trained via direct next-frame prediction ($v_t \rightarrow v_{t+1}$). This paradigm, however, presents a fundamental challenge: it \textbf{conflates} the task of predicting physical motion with that of rendering static appearance, forcing a single mechanism to handle both. This inherent coupling often leads to physically implausible forecasts and inefficient policy learning. To address this limitation, we introduce the \textbf{Visual Chain of Thought (Visual CoT)}, a framework that disentangles these processes by compelling the model to first reason about \textbf{motion dynamics} before generating the future frame's \textbf{visual appearance}. We instantiate this principle by proposing \textbf{FlowVLA}, an autoregressive Transformer that explicitly materializes this reasoning process as ``$v_t \rightarrow f_t \rightarrow v_{t+1}$'', where $f_t$ is an intermediate optical flow prediction. By forcing the model to first commit to a motion plan ($f_t$), FlowVLA learns disentangled dynamics, resulting in more coherent visual predictions and significantly more efficient policy learning. Experiments on challenging robotics manipulation benchmarks demonstrate that FlowVLA achieves state-of-the-art performance with substantially improved sample efficiency, pointing toward a more principled foundation for world modeling in VLAs. Project page: https://irpn-lab.github.io/FlowVLA/
Comparative Analysis of UAV Path Planning Algorithms for Efficient Navigation in Urban 3D Environments
The most crucial challenges for UAVs are planning paths and avoiding obstacles in their way. In recent years, a wide variety of path-planning algorithms have been developed. These algorithms have successfully solved path-planning problems; however, they suffer from multiple challenges and limitations. To test the effectiveness and efficiency of three widely used algorithms, namely A*, RRT*, and Particle Swarm Optimization (PSO), this paper conducts extensive experiments in 3D urban city environments cluttered with obstacles. Three experiments were designed with two scenarios each to test the aforementioned algorithms. These experiments consider different city map sizes, different altitudes, and varying obstacle densities and sizes in the environment. According to the experimental results, the A* algorithm outperforms the others in both computation efficiency and path quality. PSO is especially suitable for tight turns and dense environments, and RRT* offers a balance and works well across all experiments due to its randomized approach to finding solutions.
comment: AFROS 2024 Conference
Dojo: A Differentiable Physics Engine for Robotics
We present Dojo, a differentiable physics engine for robotics that prioritizes stable simulation, accurate contact physics, and differentiability with respect to states, actions, and system parameters. Dojo models hard contact and friction with a nonlinear complementarity problem with second-order cone constraints. We introduce a custom primal-dual interior-point method to solve the second order cone program for stable forward simulation over a broad range of sample rates. We obtain smooth gradient approximations with this solver through the implicit function theorem, giving gradients that are useful for downstream trajectory optimization, policy optimization, and system identification applications. Specifically, we propose to use the central path parameter threshold in the interior point solver as a user-tunable design parameter. A high value gives a smooth approximation to contact dynamics with smooth gradients for optimization and learning, while a low value gives precise simulation rollouts with hard contact. We demonstrate Dojo's differentiability in trajectory optimization, policy learning, and system identification examples. We also benchmark Dojo against MuJoCo, PyBullet, Drake, and Brax on a variety of robot models, and study the stability and simulation quality over a range of sample frequencies and accuracy tolerances. Finally, we evaluate the sim-to-real gap in hardware experiments with a Ufactory xArm 6 robot. Dojo is an open source project implemented in Julia with Python bindings, with code available at https://github.com/dojo-sim/Dojo.jl.
Trajectory-to-Action Pipeline (TAP): Automated Scenario Description Extraction for Autonomous Vehicle Behavior Comparison
Scenario Description Languages (SDLs) provide structured, interpretable embeddings that represent traffic scenarios encountered by autonomous vehicles (AVs), supporting key tasks such as scenario similarity searches and edge case detection for safety analysis. This paper introduces the Trajectory-to-Action Pipeline (TAP), a scalable and automated method for extracting SDL labels from large trajectory datasets. TAP applies a rules-based cross-entropy optimization approach to learn parameters directly from data, enhancing generalization across diverse driving contexts. Using the Waymo Open Motion Dataset (WOMD), TAP achieves 30% greater precision than Average Displacement Error (ADE) and 24% over Dynamic Time Warping (DTW) in identifying behaviorally similar trajectories. Additionally, TAP enables automated detection of unique driving behaviors, streamlining safety evaluation processes for AV testing. This work provides a foundation for scalable scenario-based AV behavior analysis, with potential extensions for integrating multi-agent contexts.
comment: 11 pages, 6 figures
Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL
Despite the significant advancements in Deep Reinforcement Learning (RL) observed in the last decade, the amount of training experience necessary to learn effective policies remains one of the primary concerns both in simulated and real environments. Looking to solve this issue, previous work has shown that improved training efficiency can be achieved by separately modeling agent and environment, but usually requiring a supervisory agent mask. In contrast to RL, humans can perfect a new skill from a small number of trials and in most cases do so without a supervisory signal, making neuroscientific studies of human development a valuable source of inspiration for RL. In particular, we explore the idea of motor prediction, which states that humans develop an internal model of themselves and of the consequences that their motor commands have on the immediate sensory inputs. Our insight is that the movement of the agent provides a cue that allows the duality between agent and environment to be learned. To instantiate this idea, we present Ego-Foresight, a self-supervised method for disentangling agent and environment based on motion and prediction. Our main finding is self-supervised agent-awareness by visuomotor prediction of the agent improves sample-efficiency and performance of the underlying RL algorithm. To test our approach, we first study its ability to visually predict agent movement irrespective of the environment, in simulated and real-world robotic data. Then, we integrate Ego-Foresight with a model-free RL algorithm to solve simulated robotic tasks, showing that self-supervised agent-awareness can improve sample-efficiency and performance in RL.
comment: 13 pages, 8 figures, conference
Steerable Scene Generation with Post Training and Inference-Time Search
Training robots in simulation requires diverse 3D scenes that reflect the specific challenges of downstream tasks. However, scenes that satisfy strict task requirements, such as high-clutter environments with plausible spatial arrangement, are rare and costly to curate manually. Instead, we generate large-scale scene data using procedural models that approximate realistic environments for robotic manipulation, and adapt it to task-specific goals. We do this by training a unified diffusion-based generative model that predicts which objects to place from a fixed asset library, along with their SE(3) poses. This model serves as a flexible scene prior that can be adapted using reinforcement learning-based post training, conditional generation, or inference-time search, steering generation toward downstream objectives even when they differ from the original data distribution. Our method enables goal-directed scene synthesis that respects physical feasibility and scales across scene types. We introduce a novel MCTS-based inference-time search strategy for diffusion models, enforce feasibility via projection and simulation, and release a dataset of over 44 million SE(3) scenes spanning five diverse environments. Website with videos, code, data, and model weights: https://steerable-scene-generation.github.io/
comment: Project website: https://steerable-scene-generation.github.io/
CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers
We propose CAD-Assistant, a general-purpose CAD agent for AI-assisted design. Our approach is based on a powerful Vision and Large Language Model (VLLM) as a planner and a tool-augmentation paradigm using CAD-specific tools. CAD-Assistant addresses multimodal user queries by generating actions that are iteratively executed on a Python interpreter equipped with the FreeCAD software, accessed via its Python API. Our framework is able to assess the impact of generated CAD commands on geometry and adapts subsequent actions based on the evolving state of the CAD design. We consider a wide range of CAD-specific tools including a sketch image parameterizer, rendering modules, a 2D cross-section generator, and other specialized routines. CAD-Assistant is evaluated on multiple CAD benchmarks, where it outperforms VLLM baselines and supervised task-specific methods. Beyond existing benchmarks, we qualitatively demonstrate the potential of tool-augmented VLLMs as general-purpose CAD solvers across diverse workflows.
A Value Function Space Approach for Hierarchical Planning with Signal Temporal Logic Tasks
Signal Temporal Logic (STL) has emerged as an expressive language for reasoning intricate planning objectives. However, existing STL-based methods often assume full observation and known dynamics, which imposes constraints on real-world applications. To address this challenge, we propose a hierarchical planning framework that starts by constructing the Value Function Space (VFS) for state and action abstraction, which embeds functional information about affordances of the low-level skills. Subsequently, we utilize a neural network to approximate the dynamics in the VFS and employ sampling based optimization to synthesize high-level skill sequences that maximize the robustness measure of the given STL tasks in the VFS. Then those skills are executed in the low-level environment. Empirical evaluations in the Safety Gym and ManiSkill environments demonstrate that our method accomplish the STL tasks without further training in the low-level environments, substantially reducing the training burdens.
Enhancing Reusability of Learned Skills for Robot Manipulation via Gaze Information and Motion Bottlenecks
Autonomous agents capable of diverse object manipulations should be able to acquire a wide range of manipulation skills with high reusability. Although advances in deep learning have made it increasingly feasible to replicate the dexterity of human teleoperation in robots, generalizing these acquired skills to previously unseen scenarios remains a significant challenge. In this study, we propose a novel algorithm, Gaze-based Bottleneck-aware Robot Manipulation (GazeBot), which enables high reusability of learned motions without sacrificing dexterity or reactivity. By leveraging gaze information and motion bottlenecks, both crucial features for object manipulation, GazeBot achieves high success rates compared with state-of-the-art imitation learning methods, particularly when the object positions and end-effector poses differ from those in the provided demonstrations. Furthermore, the training process of GazeBot is entirely data-driven once a demonstration dataset with gaze data is provided. Videos and code are available at https://crumbyrobotics.github.io/gazebot.
SE-VLN: A Self-Evolving Vision-Language Navigation Framework Based on Multimodal Large Language Models
Recent advances in vision-language navigation (VLN) were mainly attributed to emerging large language models (LLMs). These methods exhibited excellent generalization capabilities in instruction understanding and task reasoning. However, they were constrained by the fixed knowledge bases and reasoning abilities of LLMs, preventing fully incorporating experiential knowledge and thus resulting in a lack of efficient evolutionary capacity. To address this, we drew inspiration from the evolution capabilities of natural agents, and proposed a self-evolving VLN framework (SE-VLN) to endow VLN agents with the ability to continuously evolve during testing. To the best of our knowledge, it was the first time that an multimodal LLM-powered self-evolving VLN framework was proposed. Specifically, SE-VLN comprised three core modules, i.e., a hierarchical memory module to transfer successful and failure cases into reusable knowledge, a retrieval-augmented thought-based reasoning module to retrieve experience and enable multi-step decision-making, and a reflection module to realize continual evolution. Comprehensive tests illustrated that the SE-VLN achieved navigation success rates of 57% and 35.2% in unseen environments, representing absolute performance improvements of 23.9% and 15.0% over current state-of-the-art methods on R2R and REVERSE datasets, respectively. Moreover, the SE-VLN showed performance improvement with increasing experience repository, elucidating its great potential as a self-evolving agent framework for VLN.
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
Recent advances in vision-language-action (VLA) models have shown promise in integrating image generation with action prediction to improve generalization and reasoning in robot manipulation. However, existing methods are limited to challenging image-based forecasting, which suffers from redundant information and lacks comprehensive and critical world knowledge, including dynamic, spatial and semantic information. To address these limitations, we propose DreamVLA, a novel VLA framework that integrates comprehensive world knowledge forecasting to enable inverse dynamics modeling, thereby establishing a perception-prediction-action loop for manipulation tasks. Specifically, DreamVLA introduces a dynamic-region-guided world knowledge prediction, integrated with the spatial and semantic cues, which provide compact yet comprehensive representations for action planning. This design aligns with how humans interact with the world by first forming abstract multimodal reasoning chains before acting. To mitigate interference among the dynamic, spatial and semantic information during training, we adopt a block-wise structured attention mechanism that masks their mutual attention, preventing information leakage and keeping each representation clean and disentangled. Moreover, to model the conditional distribution over future actions, we employ a diffusion-based transformer that disentangles action representations from shared latent features. Extensive experiments on both real-world and simulation environments demonstrate that DreamVLA achieves 76.7% success rate on real robot tasks and 4.44 average length on the CALVIN ABC-D benchmarks.
Learning Impact-Rich Rotational Maneuvers via Centroidal Velocity Rewards and Sim-to-Real Techniques: A One-Leg Hopper Flip Case Study
Dynamic rotational maneuvers, such as front flips, inherently involve large angular momentum generation and intense impact forces, presenting major challenges for reinforcement learning and sim-to-real transfer. In this work, we propose a general framework for learning and deploying impact-rich, rotation-intensive behaviors through centroidal velocity-based rewards and actuator-aware sim-to-real techniques. We identify that conventional link-level reward formulations fail to induce true whole-body rotation and introduce a centroidal angular velocity reward that accurately captures system-wide rotational dynamics. To bridge the sim-to-real gap under extreme conditions, we model motor operating regions (MOR) and apply transmission load regularization to ensure realistic torque commands and mechanical robustness. Using the one-leg hopper front flip as a representative case study, we demonstrate the first successful hardware realization of a full front flip. Our results highlight that incorporating centroidal dynamics and actuator constraints is critical for reliably executing highly dynamic motions. A supplementary video is available at: https://youtu.be/atMAVI4s1RY
Trajectory Optimization for UAV-Based Medical Delivery with Temporal Logic Constraints and Convex Feasible Set Collision Avoidance
This paper addresses the problem of trajectory optimization for unmanned aerial vehicles (UAVs) performing time-sensitive medical deliveries in urban environments. Specifically, we consider a single UAV with 3 degree-of-freedom dynamics tasked with delivering blood packages to multiple hospitals, each with a predefined time window and priority. Mission objectives are encoded using Signal Temporal Logic (STL), enabling the formal specification of spatial-temporal constraints. To ensure safety, city buildings are modeled as 3D convex obstacles, and obstacle avoidance is handled through a Convex Feasible Set (CFS) method. The entire planning problem-combining UAV dynamics, STL satisfaction, and collision avoidance-is formulated as a convex optimization problem that ensures tractability and can be solved efficiently using standard convex programming techniques. Simulation results demonstrate that the proposed method generates dynamically feasible, collision-free trajectories that satisfy temporal mission goals, providing a scalable and reliable approach for autonomous UAV-based medical logistics.
comment: 11 pages, 4 figures
Multi-Touch and Bending Perception Using Electrical Impedance Tomography for Robotics
Electrical Impedance Tomography (EIT) offers a promising solution for distributed tactile sensing with minimal wiring and full-surface coverage in robotic applications. However, EIT-based tactile sensors face significant challenges during surface bending. Deformation alters the baseline impedance distribution and couples with touch-induced conductivity variations, complicating signal interpretation. To address this challenge, we present a novel sensing framework that integrates a deep neural network for interaction state classification with a dynamic adaptive reference strategy to decouple touch and deformation signals, while a data-driven regression model translates EIT voltage changes into continuous bending angles. The framework is validated using a magnetic hydrogel composite sensor that conforms to bendable surfaces. Experimental evaluations demonstrate that the proposed framework achieves precise and robust bending angle estimation, high accuracy in distinguishing touch, bending, and idle states, and significantly improves touch localization quality under bending deformation compared to conventional fixed-reference methods. Real-time experiments confirm the system's capability to reliably detect multi-touch interactions and track bending angles across varying deformation conditions. This work paves the way for flexible EIT-based robotic skins capable of rich multimodal sensing in robotics and human-robot interaction.
Safe Multiagent Coordination via Entropic Exploration
Many real-world multiagent learning problems involve safety concerns. In these setups, typical safe reinforcement learning algorithms constrain agents' behavior, limiting exploration -- a crucial component for discovering effective cooperative multiagent behaviors. Moreover, the multiagent literature typically models individual constraints for each agent and has yet to investigate the benefits of using joint team constraints. In this work, we analyze these team constraints from a theoretical and practical perspective and propose entropic exploration for constrained multiagent reinforcement learning (E2C) to address the exploration issue. E2C leverages observation entropy maximization to incentivize exploration and facilitate learning safe and effective cooperative behaviors. Experiments across increasingly complex domains show that E2C agents match or surpass common unconstrained and constrained baselines in task performance while reducing unsafe behaviors by up to $50\%$.
comment: 10 pages, 6 figures
Enhancing Multi-Robot Semantic Navigation Through Multimodal Chain-of-Thought Score Collaboration AAAI 2025
Understanding how humans cooperatively utilize semantic knowledge to explore unfamiliar environments and decide on navigation directions is critical for house service multi-robot systems. Previous methods primarily focused on single-robot centralized planning strategies, which severely limited exploration efficiency. Recent research has considered decentralized planning strategies for multiple robots, assigning separate planning models to each robot, but these approaches often overlook communication costs. In this work, we propose Multimodal Chain-of-Thought Co-Navigation (MCoCoNav), a modular approach that utilizes multimodal Chain-of-Thought to plan collaborative semantic navigation for multiple robots. MCoCoNav combines visual perception with Vision Language Models (VLMs) to evaluate exploration value through probabilistic scoring, thus reducing time costs and achieving stable outputs. Additionally, a global semantic map is used as a communication bridge, minimizing communication overhead while integrating observational results. Guided by scores that reflect exploration trends, robots utilize this map to assess whether to explore new frontier points or revisit history nodes. Experiments on HM3D_v0.2 and MP3D demonstrate the effectiveness of our approach. Our code is available at https://github.com/FrankZxShen/MCoCoNav.git.
comment: 16 pages, 10 figures, Extended Version of accepted AAAI 2025 Paper
FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization
Hierarchical visual localization methods achieve state-of-the-art accuracy but require substantial memory as they need to store all database images. Direct 2D-3D matching requires significantly less memory but suffers from lower accuracy due to the larger and more ambiguous search space. We address this ambiguity by fusing local and global descriptors using a weighted average operator. This operator rearranges the local descriptor space so that geographically nearby local descriptors are closer in the feature space according to the global descriptors. This decreases the number of irrelevant competing descriptors, especially if they are geographically distant, thus increasing the correct matching likelihood. We consistently improve the accuracy over local-only systems, and we achieve performance close to hierarchical methods while using 43\% less memory and running 1.6 times faster. Extensive experiments on four challenging datasets -- Cambridge Landmarks, Aachen Day/Night, RobotCar Seasons, and Extended CMU Seasons -- demonstrate that, for the first time, direct matching algorithms can benefit from global descriptors without compromising computational efficiency. Our code is available at \href{https://github.com/sontung/descriptor-disambiguation}{https://github.com/sontung/descriptor-disambiguation}.
TRAN-D: 2D Gaussian Splatting-based Sparse-view Transparent Object Depth Reconstruction via Physics Simulation for Scene Update
Understanding the 3D geometry of transparent objects from RGB images is challenging due to their inherent physical properties, such as reflection and refraction. To address these difficulties, especially in scenarios with sparse views and dynamic environments, we introduce TRAN-D, a novel 2D Gaussian Splatting-based depth reconstruction method for transparent objects. Our key insight lies in separating transparent objects from the background, enabling focused optimization of Gaussians corresponding to the object. We mitigate artifacts with an object-aware loss that places Gaussians in obscured regions, ensuring coverage of invisible surfaces while reducing overfitting. Furthermore, we incorporate a physics-based simulation that refines the reconstruction in just a few seconds, effectively handling object removal and chain-reaction movement of remaining objects without the need for rescanning. TRAN-D is evaluated on both synthetic and real-world sequences, and it consistently demonstrated robust improvements over existing GS-based state-of-the-art methods. In comparison with baselines, TRAN-D reduces the mean absolute error by over 39% for the synthetic TRansPose sequences. Furthermore, despite being updated using only one image, TRAN-D reaches a {\delta} < 2.5 cm accuracy of 48.46%, over 1.5 times that of baselines, which uses six images. Code and more results are available at https://jeongyun0609.github.io/TRAN-D/.
Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids
Simulation-based reinforcement learning (RL) has significantly advanced humanoid locomotion tasks, yet direct real-world RL from scratch or adapting from pretrained policies remains rare, limiting the full potential of humanoid robots. Real-world learning, despite being crucial for overcoming the sim-to-real gap, faces substantial challenges related to safety, reward design, and learning efficiency. To address these limitations, we propose Robot-Trains-Robot (RTR), a novel framework where a robotic arm teacher actively supports and guides a humanoid robot student. The RTR system provides protection, learning schedule, reward, perturbation, failure detection, and automatic resets. It enables efficient long-term real-world humanoid training with minimal human intervention. Furthermore, we propose a novel RL pipeline that facilitates and stabilizes sim-to-real transfer by optimizing a single dynamics-encoded latent variable in the real world. We validate our method through two challenging real-world humanoid tasks: fine-tuning a walking policy for precise speed tracking and learning a humanoid swing-up task from scratch, illustrating the promising capabilities of real-world humanoid learning realized by RTR-style systems. See https://robot-trains-robot.github.io/ for more info.
comment: Accepted to The Conference on Robot Learning (CoRL) 2025
A Third-Order Gaussian Process Trajectory Representation Framework with Closed-Form Kinematics for Continuous-Time Motion Estimation
In this paper, we propose a third-order, i.e., white-noise-on-jerk, Gaussian Process (GP) Trajectory Representation (TR) framework for continuous-time (CT) motion estimation (ME) tasks. Our framework features a unified trajectory representation that encapsulates the kinematic models of both $SO(3)\times\mathbb{R}^3$ and $SE(3)$ pose representations. This encapsulation strategy allows users to use the same implementation of measurement-based factors for either choice of pose representation, which facilitates experimentation and comparison to achieve the best model for the ME task. In addition, unique to our framework, we derive the kinematic models with the closed-form temporal derivatives of the local variable of $SO(3)$ and $SE(3)$, which so far has only been approximated based on the Taylor expansion in the literature. Our experiments show that these kinematic models can improve the estimation accuracy in high-speed scenarios. All analytical Jacobians of the interpolated states with respect to the support states of the trajectory representation, as well as the motion prior factors, are also provided for accelerated Gauss-Newton (GN) optimization. Our experiments demonstrate the efficacy and efficiency of the framework in various motion estimation tasks such as localization, calibration, and odometry, facilitating fast prototyping for ME researchers. We release the source code for the benefit of the community. Our project is available at https://github.com/brytsknguyen/gptr.
comment: The paper is currently under review at IEEE Transactions on Robotics (T-RO). The source code has been released, and feedback is welcome
Improving Rapidly-exploring Random Trees algorithm for Automated Parking in Real-world Scenarios
Automated parking is a self-driving feature that has been in cars for several years. Parking assistants in currently sold cars fail to park in more complex real-world scenarios and require the driver to move the car to an expected starting position before the assistant is activated. We overcome these limitations by proposing a planning algorithm consisting of two stages: (1) a geometric planner for maneuvering inside the parking slot and (2) a Rapidly-exploring Random Trees (RRT)-based planner that finds a collision-free path from the initial position to the slot entry. Evaluation of computational experiments demonstrates that improvements over commonly used RRT extensions reduce the parking path cost by 21 % and reduce the computation time by 79.5 %. The suitability of the algorithm for real-world parking scenarios was verified in physical experiments with Porsche Cayenne.
comment: 20 pages, 14 figures, 2 tables
Improving Efficiency of Sampling-based Motion Planning via Message-Passing Monte Carlo
Sampling-based motion planning methods, while effective in high-dimensional spaces, often suffer from inefficiencies due to irregular sampling distributions, leading to suboptimal exploration of the configuration space. In this paper, we propose an approach that enhances the efficiency of these methods by utilizing low-discrepancy distributions generated through Message-Passing Monte Carlo (MPMC). MPMC leverages Graph Neural Networks (GNNs) to generate point sets that uniformly cover the space, with uniformity assessed using the the $\cL_p$-discrepancy measure, which quantifies the irregularity of sample distributions. By improving the uniformity of the point sets, our approach significantly reduces computational overhead and the number of samples required for solving motion planning problems. Experimental results demonstrate that our method outperforms traditional sampling techniques in terms of planning efficiency.
Enhancing Sample Efficiency and Exploration in Reinforcement Learning through the Integration of Diffusion Models and Proximal Policy Optimization
On policy reinforcement learning (RL) methods such as PPO are attractive for continuous control but suffer from poor sample efficiency in costly, high dimensional settings. We present a strictly on policy framework that treats a conditional diffusion model as an adaptable action prior rather than a policy or world model. The prior is pre trained on logged data and used online only at sampling time to propose actions at current on policy states. Two lightweight mechanisms - value guided proposal generation (energy re weighting and in process gradient guidance) and a soft prior KL - regularize the actor via a small auxiliary imitation loss while keeping all PPO updates strictly on on-policy rollouts. To adapt the prior without heavy compute, we apply parameter efficient tuning (PET) that updates only adapters/LoRA, yielding a dual proximal view: policy KL is constrained by PPO and prior KL by PET. Across eight MuJoCo tasks under a shared 1.0M step budget, our method improves early learning (ALC@40) in 3/4 settings and matches or exceeds final return on 6/8 tasks with only 15-30% wall clock overhead. Ablations show that freezing the prior degrades performance and removing value guidance slows early learning; t SNE analyses confirm that value guidance concentrates proposals in high Q regions. Results indicate that an adaptable diffusion action prior is a practical way to boost on policy PPO under tight interaction budgets.
Explosive Jumping with Rigid and Articulated Soft Quadrupeds via Example Guided Reinforcement Learning IROS2025
Achieving controlled jumping behaviour for a quadruped robot is a challenging task, especially when introducing passive compliance in mechanical design. This study addresses this challenge via imitation-based deep reinforcement learning with a progressive training process. To start, we learn the jumping skill by mimicking a coarse jumping example generated by model-based trajectory optimization. Subsequently, we generalize the learned policy to broader situations, including various distances in both forward and lateral directions, and then pursue robust jumping in unknown ground unevenness. In addition, without tuning the reward much, we learn the jumping policy for a quadruped with parallel elasticity. Results show that using the proposed method, i) the robot learns versatile jumps by learning only from a single demonstration, ii) the robot with parallel compliance reduces the landing error by 11.1%, saves energy cost by 15.2% and reduces the peak torque by 15.8%, compared to the rigid robot without parallel elasticity, iii) the robot can perform jumps of variable distances with robustness against ground unevenness (maximal 4cm height perturbations) using only proprioceptive perception.
comment: accepted by IROS2025
DVM-SLAM: Decentralized Visual Monocular Simultaneous Localization and Mapping for Multi-Agent Systems
Cooperative Simultaneous Localization and Mapping (C-SLAM) enables multiple agents to work together in mapping unknown environments while simultaneously estimating their own positions. This approach enhances robustness, scalability, and accuracy by sharing information between agents, reducing drift, and enabling collective exploration of larger areas. In this paper, we present Decentralized Visual Monocular SLAM (DVM-SLAM), the first open-source decentralized monocular C-SLAM system. By only utilizing low-cost and light-weight monocular vision sensors, our system is well suited for small robots and micro aerial vehicles (MAVs). DVM-SLAM's real-world applicability is validated on physical robots with a custom collision avoidance framework, showcasing its potential in real-time multi-agent autonomous navigation scenarios. We also demonstrate comparable accuracy to state-of-the-art centralized monocular C-SLAM systems. We open-source our code and provide supplementary material online.
comment: Accepted to 2025 IEEE International Conference on Robotics and Automation, pp. 15814-15820
Systems and Control (CS)
Real-time Testing of Satellite Attitude Control With a Reaction Wheel Hardware-In-the-Loop Platform
We propose the Hardware-in-the-Loop (HIL) test of an adaptive satellite attitude control system with reaction wheel health estimation capabilities. Previous simulations and Software-in-the-Loop testing have prompted further experiments to explore the validity of the controller with real momentum exchange devices in the loop. This work is a step toward a comprehensive testing framework for validation of spacecraft attitude control algorithms. The proposed HIL testbed includes brushless DC motors and drivers that communicate using a CAN bus, an embedded computer that executes control and adaptation laws, and a satellite simulator that produces simulated sensor data, estimated attitude states, and responds to actions of the external actuators. We propose methods to artificially induce failures on the reaction wheels, and present related issues and lessons learned.
comment: 15 pages, 10 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
Safe Navigation under State Uncertainty: Online Adaptation for Robust Control Barrier Functions
Measurements and state estimates are often imperfect in control practice, posing challenges for safety-critical applications, where safety guarantees rely on accurate state information. In the presence of estimation errors, several prior robust control barrier function (R-CBF) formulations have imposed strict conditions on the input. These methods can be overly conservative and can introduce issues such as infeasibility, high control effort, etc. This work proposes a systematic method to improve R-CBFs, and demonstrates its advantages on a tracked vehicle that navigates among multiple obstacles. A primary contribution is a new optimization-based online parameter adaptation scheme that reduces the conservativeness of existing R-CBFs. In order to reduce the complexity of the parameter optimization, we merge several safety constraints into one unified numerical CBF via Poisson's equation. We further address the dual relative degree issue that typically causes difficulty in vehicle tracking. Experimental trials demonstrate the overall performance improvement of our approach over existing formulations.
Learning Interior Point Method for AC and DC Optimal Power Flow
This paper proposes a feasibility-guaranteed learning interior point method (L-IPM) to solve both AC and DC optimal power flow (OPF) problems. Given the criticality of OPF, the proposed L-IPM uses a hybrid learning model approach rather than relying solely on a simple black-box prediction. The traditional IPM follows a central path from an initial point to the optimal solution. However, each iteration involves solving large linear systems, which becomes increasingly expensive as the matrices grow more ill-conditioned in later steps. To address this, we model the IPM trajectory as a time series and train a Long Short-Term Memory (LSTM) network to project the IPM central path using only the first few stable iterations, which carry the most informative features about the path to optimality. We introduce a grid-informed methodology that enforces operational constraints on generation, voltage magnitudes, and line flows to ensure feasibility. The grid-informed LSTM serves as a tool for the IPM central path projection and, followed by a final IPM refinement step, significantly reduces the total number of iterations and time required for convergence. We use a sampling method to generate a wide range of load scenarios to improve generalization across diverse operating conditions, efficiently covering the power system's operational space. Simulation results on a 2869-bus European high-voltage transmission system show that the proposed L-IPM significantly reduces solution time by up to 94\%, while maintaining accuracy and feasibility of the solution. By leveraging early iterations and bypassing the final ill-conditioned and computationally demanding steps of traditional IPM, the proposed L-IPM reduces the number of required iterations by up to 85.5\%. Since solution feasibility is also guaranteed, L-IPM outperforms the conventional IPM in both computational efficiency and robustness.
Adaptive control mechanisms in gradient descent algorithms
The problem of designing adaptive stepsize sequences for the gradient descent method applied to convex and locally smooth functions is studied. We take an adaptive control perspective and design update rules for the stepsize that make use of both past (measured) and future (predicted) information. We show that Lyapunov analysis can guide in the systematic design of adaptive parameters striking a balance between convergence rates and robustness to computational errors or inexact gradient information. Theoretical and numerical results indicate that closed-loop adaptation guided by system theory is a promising approach for designing new classes of adaptive optimization algorithms with improved convergence properties.
comment: Accepted to the IEEE Conference on Decision and Control 2025
A Principled Framework to Evaluate Quality of AC-OPF Datasets for Machine Learning: Benchmarking a Novel, Scalable Generation Method
Several methods have been proposed in the literature to improve the quality of AC optimal power flow (AC-OPF) datasets used in machine learning (ML) models. Yet, scalability to large power systems remains unaddressed and comparing generation approaches is still hindered by the absence of widely accepted metrics quantifying AC-OPF dataset quality. In this work, we tackle both these limitations. We provide a simple heuristic that samples load setpoints uniformly in total load active power, rather than maximizing volume coverage, and solves an AC-OPF formulation with load slack variables to improve convergence. For quality assessment, we formulate a multi-criteria framework based on three metrics, measuring variability in the marginal distributions of AC-OPF primal variables, diversity in constraint activation patterns among AC-OPF instances and activation frequency of variable bounds. By comparing four open-source methods based on these metrics, we show that our heuristic consistently outperforms uniform random sampling, whether independent or constrained to a convex polytope, scoring as best in terms of balance between dataset quality and scalability.
comment: Submitted to IEEE Transactions on Power Systems
Universal Dynamics with Globally Controlled Analog Quantum Simulators
Analog quantum simulators with global control fields have emerged as powerful platforms for exploring complex quantum phenomena. Recent breakthroughs, such as the coherent control of thousands of atoms, highlight the growing potential for quantum applications at scale. Despite these advances, a fundamental theoretical question remains unresolved: to what extent can such systems realize universal quantum dynamics under global control? Here we establish a necessary and sufficient condition for universal quantum computation using only global pulse control, proving that a broad class of analog quantum simulators is, in fact, universal. We further extend this framework to fermionic and bosonic systems, including modern platforms such as ultracold atoms in optical superlattices. Crucially, to connect the theoretical possibility with experimental reality, we introduce a new control technique into the experiment - direct quantum optimal control. This method enables the synthesis of complex effective Hamiltonians and allows us to incorporate realistic hardware constraints. To show its practical power, we experimentally engineer three-body interactions outside the blockade regime and demonstrate topological dynamics on a Rydberg atom array. Using the new control framework, we overcome key experimental challenges, including hardware limitations and atom position fluctuations in the non-blockade regime, by identifying smooth, short-duration pulses that achieve high-fidelity dynamics. Experimental measurements reveal dynamical signatures of symmetry-protected-topological edge modes, confirming both the expressivity and feasibility of our approach. Our work opens a new avenue for quantum simulation beyond native hardware Hamiltonians, enabling the engineering of effective multi-body interactions and advancing the frontier of quantum information processing with globally-controlled analog platforms.
comment: 12 pages, 5 figures
An optimistic planning algorithm for switched discrete-time LQR
We introduce TROOP, a tree-based Riccati optimistic online planner, that is designed to generate near-optimal control laws for discrete-time switched linear systems with switched quadratic costs. The key challenge that we address is balancing computational resources against control performance, which is important as constructing near-optimal inputs often requires substantial amount of computations. TROOP addresses this trade-off by adopting an online best-first search strategy inspired by A*, allowing for efficient estimates of the optimal value function. The control laws obtained guarantee both near-optimality and stability properties for the closed-loop system. These properties depend on the planning depth, which determines how far into the future the algorithm explores and is closely related to the amount of computations. TROOP thus strikes a balance between computational efficiency and control performance, which is illustrated by numerical simulations on an example.
comment: Technical report for CDC 2025 publication
Performance Analysis of Underwater Optical Wireless Communication Using O-RIS and Fiber Optic Backhaul (Extended version)
This Letter presents a novel hybrid underwater wireless optical communication (UWOC) system that integrates underwater optical access points (UOAPs) with a passive optical network (PON)-based fiber-optic backhaul to provide a resilient backbone. A hard switching mechanism is employed between direct and optical reconfigurable intelligent surface (O-RIS)-assisted links to ensure reliable connectivity. Unlike previous studies, the proposed system is evaluated under both active and multiple passive O-RIS configurations. To enhance reliability, the Selection Combining (SC) and Maximal Ratio Combining (MRC) schemes are applied. Analytical and simulation results demonstrate that optimal O-RIS placement significantly enhances system performance. However, in the linear regime, placing it too close to the receiver causes degradation due to increased path loss and beam jitter in an identical water type. Moreover, increasing the number of O-RIS elements within practical limits further improves overall system performance and enhances adaptability to variations in the underwater channel.
Closed-Form Input Design for Identification under Output Feedback with Perturbation Constraints
In many applications, system identification experiments must be performed under output feedback to ensure safety or to maintain system operation. In this paper, we consider the online design of informative experiments for ARMAX models by applying a bounded perturbation to the input signal generated by a fixed output feedback controller. Specifically, the design constrains the resulting output perturbation within user-specified limits and can be efficiently computed in closed form. We demonstrate the effectiveness of the method in two numerical experiments.
Globally Stable Discrete Time PID Passivity-based Control of Power Converters: Simulation and Experimental Results
The key idea behind PID Passivity-based Control (PID-PBC) is to leverage the passivity property of PIDs (for all positive gains) and wrap the PID controller around a passive output to ensure global stability in closed-loop. However, the practical applicability of PID-PBC is stymied by two key facts: (i) the vast majority of practical implementations of PIDs is carried-out in discrete time -- discretizing the continuous time dynamical system of the PID; (ii) the well-known problem that passivity is not preserved upon discretization, even with small sampling times. Therefore, two aspects of the PID-PBC must be revisited for its safe practical application. First, we propose a discretization of the PID that ensures its passivity. Second, since the output that is identified as passive for the continuous time system is not necessarily passive for its discrete time version, we construct a new output that ensures the passivity property for the discretization of the system. In this paper, we provide a constructive answer to both issues for the case of power converter models. Instrumental to achieve this objective is the use of the implicit midpoint discretization method -- which is a symplectic integration technique that preserves system invariants. Since the reference value for the output to be regulated in power converters is non-zero, we are henceforth interested in the property of passivity of the incremental model -- currently known as shifted passivity. Therefore, we demonstrate that the resulting discrete-time PID-PBC defines a passive map for the incremental model and establish shifted passivity for the discretized power converter model. Combining these properties, we prove global stability for the feedback interconnection of the power converter with the discretized PID-PBC. The paper also presents simulations and experiments that demonstrate the performance of the proposed discretization.
AgriChrono: A Multi-modal Dataset Capturing Crop Growth and Lighting Variability with a Field Robot
Existing datasets for precision agriculture have primarily been collected in static or controlled environments such as indoor labs or greenhouses, often with limited sensor diversity and restricted temporal span. These conditions fail to reflect the dynamic nature of real farmland, including illumination changes, crop growth variation, and natural disturbances. As a result, models trained on such data often lack robustness and generalization when applied to real-world field scenarios. In this paper, we present AgriChrono, a novel robotic data collection platform and multi-modal dataset designed to capture the dynamic conditions of real-world agricultural environments. Our platform integrates multiple sensors and enables remote, time-synchronized acquisition of RGB, Depth, LiDAR, and IMU data, supporting efficient and repeatable long-term data collection across varying illumination and crop growth stages. We benchmark a range of state-of-the-art 3D reconstruction models on the AgriChrono dataset, highlighting the difficulty of reconstruction in real-world field environments and demonstrating its value as a research asset for advancing model generalization under dynamic conditions. The code and dataset are publicly available at: https://github.com/StructuresComp/agri-chrono
FALCON: Autonomous Cyber Threat Intelligence Mining with LLMs for IDS Rule Generation
Signature-based Intrusion Detection Systems (IDS) detect malicious activities by matching network or host activity against predefined rules. These rules are derived from extensive Cyber Threat Intelligence (CTI), which includes attack signatures and behavioral patterns obtained through automated tools and manual threat analysis, such as sandboxing. The CTI is then transformed into actionable rules for the IDS engine, enabling real-time detection and prevention. However, the constant evolution of cyber threats necessitates frequent rule updates, which delay deployment time and weaken overall security readiness. Recent advancements in agentic systems powered by Large Language Models (LLMs) offer the potential for autonomous IDS rule generation with internal evaluation. We introduce FALCON, an autonomous agentic framework that generates deployable IDS rules from CTI data in real-time and evaluates them using built-in multi-phased validators. To demonstrate versatility, we target both network (Snort) and host-based (YARA) mediums and construct a comprehensive dataset of IDS rules with their corresponding CTIs. Our evaluations indicate FALCON excels in automatic rule generation, with an average of 95% accuracy validated by qualitative evaluation with 84% inter-rater agreement among multiple cybersecurity analysts across all metrics. These results underscore the feasibility and effectiveness of LLM-driven data mining for real-time cyber threat mitigation.
comment: 11 pages, 5 figures, 4 tables
Potential of Quantum Computing Applications for Smart Grid Digital Twins and Future Directions
The convergence of digital twin technology and quantum computing is opening new horizons for the modeling, control, and optimization of smart grid systems. This paper reviews the current research landscape at the intersection of these fields, with a focus on how quantum algorithms can enhance the performance of digital twins in smart energy systems. We conduct a thematic literature review and identify key research trends, technical challenges, and gaps in real-world adoption. Further, a conceptual framework is proposed to integrate quantum modules into classical digital twin architectures. The potential benefits of this hybrid approach for smart grid operation and future research directions are also discussed.
comment: Accepted Paper
Scalable Fairness Shaping with LLM-Guided Multi-Agent Reinforcement Learning for Peer-to-Peer Electricity Markets
Peer-to-peer (P2P) energy trading is becoming central to modern distribution systems as rooftop PV and home energy management systems become pervasive, yet most existing market and reinforcement learning designs emphasize efficiency or private profit and offer little real-time guidance to ensure equitable outcomes under uncertainty. To address this gap, a fairness-aware multiagent reinforcement learning framework, FairMarket-RL, is proposed in which a large language model (LLM) critic shapes bidding policies within a continuous double auction under partial observability and discrete price-quantity actions. After each trading slot, the LLM returns normalized fairness scores Fairness-to-Grid (FTG), Fairness-Between-Sellers (FBS), and Fairness-of-Pricing (FPP) that are integrated into the reward via ramped coefficients and tunable scaling, so that fairness guidance complements, rather than overwhelms, economic incentives. The environment models realistic residential load and PV profiles and enforce hard constraints on prices, physical feasibility, and policy-update stability. Across a progression of experiments from a small pilot to a larger simulated community and a mixed-asset real-world dataset, the framework shifts exchanges toward local P2P trades, lowers consumer costs relative to grid-only procurement, sustains strong fairness across participants, and preserves utility viability. Sensitivity analyses over solar availability and aggregate demand further indicate robust performance, suggesting a scalable, LLM-guided pathway to decentralized electricity markets that are economically efficient, socially equitable, and technically sound.
Fuzzy-Based Control Method for Autonomous Spacecraft Inspection with Minimal Fuel Consumption
This study explores an energy-efficient control strategy for spacecraft inspection using a fuzzy inference system combined with a bio-inspired optimization technique to incorporate learning capability into the control process. The optimized fuzzy controller produces a minimally fuel-consuming force while maintaining reliable inspection within constraints, such as illumination, restricted field of view, thrust limits, and safe regions. The performance of the proposed control strategy is validated through Monte Carlo simulations.
comment: 13 pages, 8 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
Optimal Control of ODE Car-Following Models: Applications to Mixed-Autonomy Platoon Control via Coupled Autonomous Vehicles
In this paper, we study the optimal control of a mixed-autonomy platoon driving on a single lane to smooth traffic flow. The platoon consists of autonomous vehicles, whose acceleration is controlled, and human-driven vehicles, whose behavior is described using a microscopic car-following model. We formulate the optimal control problem where the dynamics of the platoon are describing through a system of non-linear ODEs, with explicit constraints on both the state and the control variables. Theoretically, we analyze the well-posedness of the system dynamics under a reasonable set of admissible controls and establish the existence of minimizers for the optimal control problem. To solve the problem numerically, we propose a gradient descent-based algorithm that leverages the adjoint method, along with a penalty approach to handle state constraints. We demonstrate the effectiveness of the proposed numerical scheme through several experiments, exploring various scenarios with different penetration rates and distributions of controlled vehicles within the platoon.
Comparison of Droop-Based Single-Loop Grid-Forming Wind Turbines: High-Frequency Open-Loop Unstable Behavior and Damping
The integration of inverter-interfaced generators introduces new instability phenomena into modern power systems. This paper conducts a comparative analysis of two widely used droop-based grid-forming controls, namely droop control and droop-I control, in wind turbines. Although both approaches provide steady-state reactive power-voltage droop characteristics, their impacts on high-frequency (HF) stability differ significantly. Firstly, on open-loop (OL) comparison reveals that droop-I control alters HF pole locations. The application of Routh's Stability Criterion further analytically demonstrates that such pole shifts inevitably lead to OL instability. This HF OL instability is identified as a structural phenomenon in purely inductive grids and cannot be mitigated through control parameter tuning. As a result, droop-I control significantly degrades HF stability, making conventional gain and phase margins insufficient for evaluating robustness against parameter variations. Then, the performance of established active damping (AD) is assessed for both control schemes. The finding indicates that AD designs effective for droop control may fail to suppress HF resonance under droop-I control due to the presence of unstable OL poles. Case studies performed on the IEEE 14-Bus Test System validate the analysis and emphasize the critical role of HF OL instability in determining the overall power system stability.
Learning Robust Regions of Attraction Using Rollout-Enhanced Physics-Informed Neural Networks with Policy Iteration
The region of attraction is a key metric of the robustness of systems. This paper addresses the numerical solution of the generalized Zubov's equation, which produces a special Lyapunov function characterizing the robust region of attraction for perturbed systems. To handle the highly nonlinear characteristic of the generalized Zubov's equation, we propose a physics-informed neural network framework that employs a policy iteration training scheme with rollout to approximate the viscosity solution. In addition to computing the optimal disturbance during the policy improvement process, we incorporate neural network-generated value estimates as anchor points to facilitate the training procedure to prevent singularities in both low- and high-dimensional systems. Numerical simulations validate the effectiveness of the proposed approach.
comment: Submitted to the American Control Conference (ACC 2026)
Climate-Resilient Ports and Waterborne Transport Systems: Current Status and Future Prospects
The increasing challenges posed by climate change necessitate a comprehensive examination of the resilience of waterborne transport systems. This paper explores the nexus of climate resilience, and waterborne transport, addressing the challenges faced by ports and their connecting waterborne transport systems. It provides an in-depth analysis of the current status of climate-resilient infrastructure and operations while emphasizing the transformative potential of emerging technologies. Through a systematic review, the paper identifies critical gaps and opportunities. Research predominantly emphasizes port infrastructure over supply chain resilience, neglecting the interconnected vulnerabilities of maritime networks. There is limited focus on specific climate-induced disruptions, such as drought and compounded events, which complicate resilience planning. Methodologically, risk assessments and case studies dominate the field, while advanced technologies such as digital twins, artificial intelligence, and satellite monitoring remain underutilized. Geographic disparities in research output and a tendency toward short- to medium-term planning further constrain global and long-term resilience efforts. To address these gaps, the study advocates for systems-based approaches that integrate infrastructure, operations, and supply chains. It highlights collaborative frameworks and advanced tools, including digital twins, machine learning, and participatory modeling, as crucial for enabling predictive and adaptive risk management. This study stands as one of the first comprehensive reviews exclusively focused on climate resilience in ports and waterborne transport systems. It provides actionable insights for policymakers, researchers, and industry stakeholders, proposing a future research agenda to advance waterborne transport systems capable of withstanding multifaceted climate impacts.
Aleks: AI powered Multi Agent System for Autonomous Scientific Discovery via Data-Driven Approaches in Plant Science
Modern plant science increasingly relies on large, heterogeneous datasets, but challenges in experimental design, data preprocessing, and reproducibility hinder research throughput. Here we introduce Aleks, an AI-powered multi-agent system that integrates domain knowledge, data analysis, and machine learning within a structured framework to autonomously conduct data-driven scientific discovery. Once provided with a research question and dataset, Aleks iteratively formulated problems, explored alternative modeling strategies, and refined solutions across multiple cycles without human intervention. In a case study on grapevine red blotch disease, Aleks progressively identified biologically meaningful features and converged on interpretable models with robust performance. Ablation studies underscored the importance of domain knowledge and memory for coherent outcomes. This exploratory work highlights the promise of agentic AI as an autonomous collaborator for accelerating scientific discovery in plant sciences.
Aggregate Fictitious Play for Learning in Anonymous Polymatrix Games (Extended Version)
Fictitious play (FP) is a well-studied algorithm that enables agents to learn Nash equilibrium in games with certain reward structures. However, when agents have no prior knowledge of the reward functions, FP faces a major challenge: the joint action space grows exponentially with the number of agents, which slows down reward exploration. Anonymous games offer a structure that mitigates this issue. In these games, the rewards depend only on the actions taken; not on who is taking which action. Under such a structure, we introduce aggregate fictitious play (agg-FP), a variant of FP where each agent tracks the frequency of the number of other agents playing each action, rather than these agents' individual actions. We show that in anonymous polymatrix games, agg-FP converges to a Nash equilibrium under the same conditions as classical FP. In essence, by aggregating the agents' actions, we reduce the action space without losing the convergence guarantees. Using simulations, we provide empirical evidence on how this reduction accelerates convergence.
Towards Reliable Neural Optimizers: Permutation-Equivariant Neural Approximation in Dynamic Data Driven Applications Systems
Dynamic Data Driven Applications Systems (DDDAS) motivate the development of optimization approaches capable of adapting to streaming, heterogeneous, and asynchronous data from sensor networks. Many established optimization solvers, such as branch-and-bound, gradient descent, and Newton-Raphson methods, rely on iterative algorithms whose step-by-step convergence makes them too slow for real-time, multi-sensor environments. In our recent work, we introduced LOOP-PE (Learning to Optimize the Optimization Process, Permutation Equivariance version), a feed-forward neural approximation model with an integrated feasibility recovery function. LOOP-PE processes inputs from a variable number of sensors in arbitrary order, making it robust to sensor dropout, communication delays, and system scaling. Its permutation-equivariant architecture ensures that reordering the input data reorders the corresponding dispatch decisions consistently, without retraining or pre-alignment. Feasibility is enforced via a generalized gauge map, guaranteeing that outputs satisfy physical and operational constraints. We illustrate the approach in a DDDAS-inspired case study of a Virtual Power Plant (VPP) managing multiple distributed generation agents (DERs) to maximize renewable utilization while respecting system limits. Results show that LOOP-PE produces near-optimal, feasible, and highly adaptable decisions under dynamic, unordered, and distributed sensing conditions, significantly outperforming iterative algorithm based solvers in both speed and flexibility. Here, we extend our earlier work by providing additional analysis and explanation of LOOP-PE design and operation, with particular emphasis on its feasibility guarantee and permutation equivariance feature.
Set-membership identification of continuous-time MIMO systems via Tustin discretization
In this paper, we deal with the identification of continuous-time systems from sampled data corrupted by unknown but bounded errors. A significant challenge in continuous-time identification is the estimation of the input and output data derivatives. In this paper, we propose a novel method based on set-membership techniques and Tustin discretization, which overcomes the derivative measurement problem and the presence of bounded errors affecting all the measured signals. First, we derive the proposed method and prove that it becomes an affordable polynomial optimization problem. Then, we present some numerical results based on simulation and experimental data to explore the effectiveness of the proposed method.
Privacy-Preserving Distributed Control for a Networked Battery Energy Storage System
The increasing deployment of distributed Battery Energy Storage Systems (BESSs) in modern power grids necessitates effective coordination strategies to ensure state-of-charge (SoC) balancing and accurate power delivery. While distributed control frameworks offer scalability and resilience, they also raise significant privacy concerns due to the need for inter-agent information exchange. This paper presents a novel privacy-preserving distributed control algorithm for SoC balancing in a networked BESS. The proposed framework includes distributed power allocation law that is designed based on two privacy-preserving distributed estimators, one for the average unit state and the other for the average desired power. The average unit state estimator is designed via the state decomposition method without disclosing sensitive internal states. The proposed power allocation law based on these estimators ensures asymptotic SoC balancing and global power delivery while safeguarding agent privacy from external eavesdroppers. The effectiveness and privacy-preserving properties of the proposed control strategy are demonstrated through simulation results.
QuadKAN: KAN-Enhanced Quadruped Motion Control via End-to-End Reinforcement Learning
We address vision-guided quadruped motion control with reinforcement learning (RL) and highlight the necessity of combining proprioception with vision for robust control. We propose QuadKAN, a spline-parameterized cross-modal policy instantiated with Kolmogorov-Arnold Networks (KANs). The framework incorporates a spline encoder for proprioception and a spline fusion head for proprioception-vision inputs. This structured function class aligns the state-to-action mapping with the piecewise-smooth nature of gait, improving sample efficiency, reducing action jitter and energy consumption, and providing interpretable posture-action sensitivities. We adopt Multi-Modal Delay Randomization (MMDR) and perform end-to-end training with Proximal Policy Optimization (PPO). Evaluations across diverse terrains, including both even and uneven surfaces and scenarios with static or dynamic obstacles, demonstrate that QuadKAN achieves consistently higher returns, greater distances, and fewer collisions than state-of-the-art (SOTA) baselines. These results show that spline-parameterized policies offer a simple, effective, and interpretable alternative for robust vision-guided locomotion. A repository will be made available upon acceptance.
comment: 14pages, 9 figures, Journal paper
Realizing Reduced and Sparse Biochemical Reaction Networks from Dynamics
We propose a direct optimization framework for learning reduced and sparse chemical reaction networks (CRNs) from time-series trajectory data. In contrast to widely used indirect methods-such as those based on sparse identification of nonlinear dynamics (SINDy)-which infer reaction dynamics by fitting numerically estimated derivatives, our approach fits entire trajectories by solving a dynamically constrained optimization problem. This formulation enables the construction of reduced CRNs that are both low-dimensional and sparse, while preserving key dynamical behaviors of the original system. We develop an accelerated proximal gradient algorithm to efficiently solve the resulting non-convex optimization problem. Through illustrative examples, including a Drosophila circadian oscillator and a glycolytic oscillator, we demonstrate the ability of our method to recover accurate and interpretable reduced-order CRNs. Notably, the direct approach avoids the derivative estimation step and mitigates error accumulation issues inherent in indirect methods, making it a robust alternative for data-driven CRN realizations.
comment: Accepted to IEEE CDC 2025. Author-accepted version; supplementary material in ancillary files (In this version, supplementary PDF is moved to ancillary files; no content changes to main article)
An Adaptive Environment-Aware Transformer Autoencoder for UAV-FSO with Dynamic Complexity Control
The rise of sixth-generation (6G) wireless networks sets high demands on UAV-assisted Free Space Optical (FSO) communications, where the channel environment becomes more complex and variable due to both atmospheric turbulence and UAV-induced vibrations. These factors increase the challenge of maintaining reliable communication and require adaptive processing methods. Autoencoders are promising as they learn optimal encodings from channel data. However, existing autoencoder designs are generic and lack the specific adaptability and computational flexibility needed for UAV-FSO scenarios. To address this, we propose AEAT-AE (Adaptive Environment-aware Transformer Autoencoder), a Transformer-based framework that integrates environmental parameters into both encoder and decoder via a cross-attention mechanism. Moreover, AEAT-AE incorporates a Deep Q-Network (DQN) that dynamically selects which layers of the Transformer autoencoder to activate based on real-time environmental inputs, effectively balancing performance and computational cost. Simulation results demonstrate that AEAT-AE outperforms conventional methods in bit error rate while maintaining efficient runtime, representing a novel tailored solution for next-generation UAV-FSO communications.
Maximizing Battery Storage Profits via High-Frequency Intraday Trading
Maximizing revenue for grid-scale battery energy storage systems in continuous intraday electricity markets requires strategies that are able to seize trading opportunities as soon as new information arrives. This paper introduces and evaluates an automated high-frequency trading strategy for battery energy storage systems trading on the intraday market for power while explicitly considering the dynamics of the limit order book, market rules, and technical parameters. The standard rolling intrinsic strategy is adapted for continuous intraday electricity markets and solved using a dynamic programming approximation that is two to three orders of magnitude faster than an exact mixed-integer linear programming solution. A detailed backtest over a full year of German order book data demonstrates that the proposed dynamic programming formulation does not reduce trading profits and enables the policy to react to every relevant order book update, enabling realistic rapid backtesting. Our results show the significant revenue potential of high-frequency trading: our policy earns 58% more than when re-optimizing only once every hour and 14% more than when re-optimizing once per minute, highlighting that profits critically depend on trading speed. Furthermore, we leverage the speed of our algorithm to train a parametric extension of the rolling intrinsic, increasing yearly revenue by 8.4% out of sample.
Aging-aware Energy Management for Residential Multi-Carrier Energy Systems
In the context of building electrification, the operation of distributed energy resources integrating multiple energy carriers (electricity, heat, mobility) poses a significant challenge due to the nonlinear device dynamics, uncertainty, and computational issues. As such, energy management systems seek to decide the power dispatch in the best way possible. The objective is to minimize and balance operative costs (energy bills or asset degradation) with user requirements (mobility, heating, etc.). Current energy management uses empirical battery ageing models outside of their specific fitting conditions, resulting in inaccuracies and poor performance. Moreover, the link to thermal systems is also overlooked. This paper presents an ageing-aware day-ahead algorithm for electrified buildings that incorporates physics-based battery ageing models. The models distinguish between energy storage systems and make explicit the trade-off between grid cost and battery degradation. The proposed day-ahead algorithm can either cut down on grid costs or extend battery lifetime (electric vehicle or stationary battery packs). Moreover, it exploits the differences between cathode chemistries improving grid costs by 25% when using LFP cells, with respect to NMC cells. Finally, the performance using aged batteries is also enhanced with 35% grid cost observed savings, when passing from new to aged batteries in the summer.
Deception in Oligopoly Games via Adaptive Nash Seeking Systems
In the theory of multi-agent systems, deception refers to the strategic manipulation of information to influence the behavior of other agents, ultimately altering the long-term dynamics of the entire system. Recently, this concept has been examined in the context of model-free Nash equilibrium seeking (NES) algorithms for noncooperative games. Specifically, it was demonstrated that players can exploit knowledge of other players' exploration signals to drive the system toward a ``deceptive" Nash equilibrium, while maintaining the stability of the closed-loop system. To extend this insight beyond the duopoly case, in this paper we conduct a comprehensive study of deception mechanisms in N-player oligopoly markets. By leveraging the structure of these games and employing stability techniques for nonlinear dynamical systems, we provide game-theoretic insights into deception and derive specialized results, including stability conditions. These results allow players to systematically adjust their NES dynamics by tuning gains and signal amplitudes, all while ensuring closed-loop stability. Additionally, we introduce novel sufficient conditions to demonstrate that the (practically) stable equilibrium point of the deceptive dynamics corresponds to a true Nash equilibrium of a different game, which we term the ``deceptive game." Our results show that, under the proposed adaptive dynamics with deception, a victim firm may develop a distorted perception of its competitors' product appeal, which could lead to setting suboptimal prices.
Stochastic Real-Time Deception in Nash Equilibrium Seeking for Games with Quadratic Payoffs
In multi-agent autonomous systems, deception is a fundamental concept which characterizes the exploitation of unbalanced information to mislead victims into choosing oblivious actions. This effectively alters the system's long term behavior, leading to outcomes that may be beneficial to the deceiver but detrimental to victim. We study this phenomenon for a class of model-free Nash equilibrium seeking (NES) where players implement independent stochastic exploration signals to learn the pseudogradient flow. In particular, we show that deceptive players who obtain real-time measurements of other players' stochastic perturbation can incorporate this information into their own NES action update, consequentially steering the overall dynamics to a new operating point that could potentially improve the payoffs of the deceptive players. We consider games with quadratic payoff functions, as this restriction allows us to derive a more explicit formulation of the capabilities of the deceptive players. By leveraging results on multi-input stochastic averaging for dynamical systems, we establish local exponential (in probability) convergence for the proposed deceptive NES dynamics. To illustrate our results, we apply them to a two player quadratic game.
Trajectory Optimization for UAV-Based Medical Delivery with Temporal Logic Constraints and Convex Feasible Set Collision Avoidance
This paper addresses the problem of trajectory optimization for unmanned aerial vehicles (UAVs) performing time-sensitive medical deliveries in urban environments. Specifically, we consider a single UAV with 3 degree-of-freedom dynamics tasked with delivering blood packages to multiple hospitals, each with a predefined time window and priority. Mission objectives are encoded using Signal Temporal Logic (STL), enabling the formal specification of spatial-temporal constraints. To ensure safety, city buildings are modeled as 3D convex obstacles, and obstacle avoidance is handled through a Convex Feasible Set (CFS) method. The entire planning problem-combining UAV dynamics, STL satisfaction, and collision avoidance-is formulated as a convex optimization problem that ensures tractability and can be solved efficiently using standard convex programming techniques. Simulation results demonstrate that the proposed method generates dynamically feasible, collision-free trajectories that satisfy temporal mission goals, providing a scalable and reliable approach for autonomous UAV-based medical logistics.
comment: 11 pages, 4 figures
Global Geolocated Realtime Data of Interfleet Urban Transit Bus Idling
Urban transit bus idling is a contributor to ecological stress, economic inefficiency, and medically hazardous health outcomes due to emissions. The global accumulation of this frequent pattern of undesirable driving behavior is enormous. In order to measure its scale, we propose GRD-TRT-BUF-4I (Ground Truth Buffer for Idling) an extensible, realtime detection system that records the geolocation and idling duration of urban transit bus fleets internationally. Using live vehicle locations from General Transit Feed Specification (GTFS) Realtime, the system detects approximately 200,000 idling events per day from over 50 cities across North America, Europe, Oceania, and Asia. This realtime data was created dynamically to serve operational decision-making and fleet management to reduce the frequency and duration of idling events as they occur, as well as to capture its accumulative effects. Civil and Transportation Engineers, Urban Planners, Epidemiologists, Policymakers, and other stakeholders might find this useful for emissions modeling, traffic management, route planning, and other urban sustainability efforts at a variety of geographic and temporal scales.
comment: 35 pages, 12 figures, 36 tables, 100 data sources (including links). Prepared for Earth System Science Data (ESSD)
Optimal Planning for Enhancing the Resilience of Modern Distribution Systems Against Cyberattacks
The increasing integration of IoT-connected devices in smart grids has introduced new vulnerabilities at the distribution level. Of particular concern is the potential for cyberattacks that exploit high-wattage IoT devices, such as EV chargers, to manipulate local demand and destabilize the grid. While previous studies have primarily focused on such attacks at the transmission level, this paper investigates their feasibility and impact at the distribution level. We examine how cyberattackers can target voltage-sensitive nodes, especially those exposed by the presence of high-consumption devices, to cause voltage deviation and service disruption. Our analysis demonstrates that conventional grid protections are insufficient against these intelligent, localized attacks. To address this, we propose resilience strategies using distributed generation (DGs), exploring their role in preemptive planning. This research highlights the urgent need for distribution-level cyber resilience planning in smart grids.
comment: Accepted Paper
Large Language Model-Based Framework for Explainable Cyberattack Detection in Automatic Generation Control Systems
The increasing digitization of smart grids has improved operational efficiency but also introduced new cybersecurity vulnerabilities, such as False Data Injection Attacks (FDIAs) targeting Automatic Generation Control (AGC) systems. While machine learning (ML) and deep learning (DL) models have shown promise in detecting such attacks, their opaque decision-making limits operator trust and real-world applicability. This paper proposes a hybrid framework that integrates lightweight ML-based attack detection with natural language explanations generated by Large Language Models (LLMs). Classifiers such as LightGBM achieve up to 95.13% attack detection accuracy with only 0.004 s inference latency. Upon detecting a cyberattack, the system invokes LLMs, including GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o mini, to generate human-readable explanation of the event. Evaluated on 100 test samples, GPT-4o mini with 20-shot prompting achieved 93% accuracy in identifying the attack target, a mean absolute error of 0.075 pu in estimating attack magnitude, and 2.19 seconds mean absolute error (MAE) in estimating attack onset. These results demonstrate that the proposed framework effectively balances real-time detection with interpretable, high-fidelity explanations, addressing a critical need for actionable AI in smart grid cybersecurity.
comment: Accepted Paper
From Optimization to Control: Quasi Policy Iteration
Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we adopt the quasi-Newton method (QNM) from convex optimization to introduce a novel control algorithm coined as quasi-policy iteration (QPI). In particular, QPI is based on a novel approximation of the ``Hessian'' matrix in the policy iteration algorithm, which exploits two linear structural constraints specific to MDPs and allows for the incorporation of prior information on the transition probability kernel. While the proposed algorithm has the same computational complexity as value iteration, it exhibits an empirical convergence behavior similar to that of QNM with a low sensitivity to the discount factor.
Harnessing the Full Potential of RRAMs through Scalable and Distributed In-Memory Computing with Integrated Error Correction
Exponential growth in global computing demand is exacerbated due to the higher-energy requirements of conventional architectures, primarily due to energy-intensive data movement. In-memory computing with Resistive Random Access Memory (RRAM) addresses this by co-integrating memory and processing, but faces significant hurdles related to device-level non-idealities and poor scalability for large computing tasks. Here, we introduce MELISO+ (In-Memory Linear Solver), a full-stack, distributed framework for energy-efficient in-memory computing. MELISO+ proposes a novel two-tier error correction mechanism to mitigate device non-idealities and develops a distributed RRAM computing framework to enable matrix computations exceeding dimensions of $65,000\times65,000$. This approach reduces first- and second-order arithmetic errors due to device non-idealities by over $90\%$, enhances energy efficiency by three to five orders of magnitude, and decreases latency 100-fold. Hence, MELISO+ allows lower-precision RRAM devices to outperform high-precision device alternatives in accuracy, energy and latency metrics. By unifying algorithm-hardware co-design with scalable architecture, MELISO+ significantly advances sustainable, high-dimensional computing suitable for applications like large language models and generative AI.
Systems and Control (EESS)
Real-time Testing of Satellite Attitude Control With a Reaction Wheel Hardware-In-the-Loop Platform
We propose the Hardware-in-the-Loop (HIL) test of an adaptive satellite attitude control system with reaction wheel health estimation capabilities. Previous simulations and Software-in-the-Loop testing have prompted further experiments to explore the validity of the controller with real momentum exchange devices in the loop. This work is a step toward a comprehensive testing framework for validation of spacecraft attitude control algorithms. The proposed HIL testbed includes brushless DC motors and drivers that communicate using a CAN bus, an embedded computer that executes control and adaptation laws, and a satellite simulator that produces simulated sensor data, estimated attitude states, and responds to actions of the external actuators. We propose methods to artificially induce failures on the reaction wheels, and present related issues and lessons learned.
comment: 15 pages, 10 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
Safe Navigation under State Uncertainty: Online Adaptation for Robust Control Barrier Functions
Measurements and state estimates are often imperfect in control practice, posing challenges for safety-critical applications, where safety guarantees rely on accurate state information. In the presence of estimation errors, several prior robust control barrier function (R-CBF) formulations have imposed strict conditions on the input. These methods can be overly conservative and can introduce issues such as infeasibility, high control effort, etc. This work proposes a systematic method to improve R-CBFs, and demonstrates its advantages on a tracked vehicle that navigates among multiple obstacles. A primary contribution is a new optimization-based online parameter adaptation scheme that reduces the conservativeness of existing R-CBFs. In order to reduce the complexity of the parameter optimization, we merge several safety constraints into one unified numerical CBF via Poisson's equation. We further address the dual relative degree issue that typically causes difficulty in vehicle tracking. Experimental trials demonstrate the overall performance improvement of our approach over existing formulations.
Learning Interior Point Method for AC and DC Optimal Power Flow
This paper proposes a feasibility-guaranteed learning interior point method (L-IPM) to solve both AC and DC optimal power flow (OPF) problems. Given the criticality of OPF, the proposed L-IPM uses a hybrid learning model approach rather than relying solely on a simple black-box prediction. The traditional IPM follows a central path from an initial point to the optimal solution. However, each iteration involves solving large linear systems, which becomes increasingly expensive as the matrices grow more ill-conditioned in later steps. To address this, we model the IPM trajectory as a time series and train a Long Short-Term Memory (LSTM) network to project the IPM central path using only the first few stable iterations, which carry the most informative features about the path to optimality. We introduce a grid-informed methodology that enforces operational constraints on generation, voltage magnitudes, and line flows to ensure feasibility. The grid-informed LSTM serves as a tool for the IPM central path projection and, followed by a final IPM refinement step, significantly reduces the total number of iterations and time required for convergence. We use a sampling method to generate a wide range of load scenarios to improve generalization across diverse operating conditions, efficiently covering the power system's operational space. Simulation results on a 2869-bus European high-voltage transmission system show that the proposed L-IPM significantly reduces solution time by up to 94\%, while maintaining accuracy and feasibility of the solution. By leveraging early iterations and bypassing the final ill-conditioned and computationally demanding steps of traditional IPM, the proposed L-IPM reduces the number of required iterations by up to 85.5\%. Since solution feasibility is also guaranteed, L-IPM outperforms the conventional IPM in both computational efficiency and robustness.
Adaptive control mechanisms in gradient descent algorithms
The problem of designing adaptive stepsize sequences for the gradient descent method applied to convex and locally smooth functions is studied. We take an adaptive control perspective and design update rules for the stepsize that make use of both past (measured) and future (predicted) information. We show that Lyapunov analysis can guide in the systematic design of adaptive parameters striking a balance between convergence rates and robustness to computational errors or inexact gradient information. Theoretical and numerical results indicate that closed-loop adaptation guided by system theory is a promising approach for designing new classes of adaptive optimization algorithms with improved convergence properties.
comment: Accepted to the IEEE Conference on Decision and Control 2025
A Principled Framework to Evaluate Quality of AC-OPF Datasets for Machine Learning: Benchmarking a Novel, Scalable Generation Method
Several methods have been proposed in the literature to improve the quality of AC optimal power flow (AC-OPF) datasets used in machine learning (ML) models. Yet, scalability to large power systems remains unaddressed and comparing generation approaches is still hindered by the absence of widely accepted metrics quantifying AC-OPF dataset quality. In this work, we tackle both these limitations. We provide a simple heuristic that samples load setpoints uniformly in total load active power, rather than maximizing volume coverage, and solves an AC-OPF formulation with load slack variables to improve convergence. For quality assessment, we formulate a multi-criteria framework based on three metrics, measuring variability in the marginal distributions of AC-OPF primal variables, diversity in constraint activation patterns among AC-OPF instances and activation frequency of variable bounds. By comparing four open-source methods based on these metrics, we show that our heuristic consistently outperforms uniform random sampling, whether independent or constrained to a convex polytope, scoring as best in terms of balance between dataset quality and scalability.
comment: Submitted to IEEE Transactions on Power Systems
Universal Dynamics with Globally Controlled Analog Quantum Simulators
Analog quantum simulators with global control fields have emerged as powerful platforms for exploring complex quantum phenomena. Recent breakthroughs, such as the coherent control of thousands of atoms, highlight the growing potential for quantum applications at scale. Despite these advances, a fundamental theoretical question remains unresolved: to what extent can such systems realize universal quantum dynamics under global control? Here we establish a necessary and sufficient condition for universal quantum computation using only global pulse control, proving that a broad class of analog quantum simulators is, in fact, universal. We further extend this framework to fermionic and bosonic systems, including modern platforms such as ultracold atoms in optical superlattices. Crucially, to connect the theoretical possibility with experimental reality, we introduce a new control technique into the experiment - direct quantum optimal control. This method enables the synthesis of complex effective Hamiltonians and allows us to incorporate realistic hardware constraints. To show its practical power, we experimentally engineer three-body interactions outside the blockade regime and demonstrate topological dynamics on a Rydberg atom array. Using the new control framework, we overcome key experimental challenges, including hardware limitations and atom position fluctuations in the non-blockade regime, by identifying smooth, short-duration pulses that achieve high-fidelity dynamics. Experimental measurements reveal dynamical signatures of symmetry-protected-topological edge modes, confirming both the expressivity and feasibility of our approach. Our work opens a new avenue for quantum simulation beyond native hardware Hamiltonians, enabling the engineering of effective multi-body interactions and advancing the frontier of quantum information processing with globally-controlled analog platforms.
comment: 12 pages, 5 figures
An optimistic planning algorithm for switched discrete-time LQR
We introduce TROOP, a tree-based Riccati optimistic online planner, that is designed to generate near-optimal control laws for discrete-time switched linear systems with switched quadratic costs. The key challenge that we address is balancing computational resources against control performance, which is important as constructing near-optimal inputs often requires substantial amount of computations. TROOP addresses this trade-off by adopting an online best-first search strategy inspired by A*, allowing for efficient estimates of the optimal value function. The control laws obtained guarantee both near-optimality and stability properties for the closed-loop system. These properties depend on the planning depth, which determines how far into the future the algorithm explores and is closely related to the amount of computations. TROOP thus strikes a balance between computational efficiency and control performance, which is illustrated by numerical simulations on an example.
comment: Technical report for CDC 2025 publication
Performance Analysis of Underwater Optical Wireless Communication Using O-RIS and Fiber Optic Backhaul (Extended version)
This Letter presents a novel hybrid underwater wireless optical communication (UWOC) system that integrates underwater optical access points (UOAPs) with a passive optical network (PON)-based fiber-optic backhaul to provide a resilient backbone. A hard switching mechanism is employed between direct and optical reconfigurable intelligent surface (O-RIS)-assisted links to ensure reliable connectivity. Unlike previous studies, the proposed system is evaluated under both active and multiple passive O-RIS configurations. To enhance reliability, the Selection Combining (SC) and Maximal Ratio Combining (MRC) schemes are applied. Analytical and simulation results demonstrate that optimal O-RIS placement significantly enhances system performance. However, in the linear regime, placing it too close to the receiver causes degradation due to increased path loss and beam jitter in an identical water type. Moreover, increasing the number of O-RIS elements within practical limits further improves overall system performance and enhances adaptability to variations in the underwater channel.
Closed-Form Input Design for Identification under Output Feedback with Perturbation Constraints
In many applications, system identification experiments must be performed under output feedback to ensure safety or to maintain system operation. In this paper, we consider the online design of informative experiments for ARMAX models by applying a bounded perturbation to the input signal generated by a fixed output feedback controller. Specifically, the design constrains the resulting output perturbation within user-specified limits and can be efficiently computed in closed form. We demonstrate the effectiveness of the method in two numerical experiments.
Globally Stable Discrete Time PID Passivity-based Control of Power Converters: Simulation and Experimental Results
The key idea behind PID Passivity-based Control (PID-PBC) is to leverage the passivity property of PIDs (for all positive gains) and wrap the PID controller around a passive output to ensure global stability in closed-loop. However, the practical applicability of PID-PBC is stymied by two key facts: (i) the vast majority of practical implementations of PIDs is carried-out in discrete time -- discretizing the continuous time dynamical system of the PID; (ii) the well-known problem that passivity is not preserved upon discretization, even with small sampling times. Therefore, two aspects of the PID-PBC must be revisited for its safe practical application. First, we propose a discretization of the PID that ensures its passivity. Second, since the output that is identified as passive for the continuous time system is not necessarily passive for its discrete time version, we construct a new output that ensures the passivity property for the discretization of the system. In this paper, we provide a constructive answer to both issues for the case of power converter models. Instrumental to achieve this objective is the use of the implicit midpoint discretization method -- which is a symplectic integration technique that preserves system invariants. Since the reference value for the output to be regulated in power converters is non-zero, we are henceforth interested in the property of passivity of the incremental model -- currently known as shifted passivity. Therefore, we demonstrate that the resulting discrete-time PID-PBC defines a passive map for the incremental model and establish shifted passivity for the discretized power converter model. Combining these properties, we prove global stability for the feedback interconnection of the power converter with the discretized PID-PBC. The paper also presents simulations and experiments that demonstrate the performance of the proposed discretization.
AgriChrono: A Multi-modal Dataset Capturing Crop Growth and Lighting Variability with a Field Robot
Existing datasets for precision agriculture have primarily been collected in static or controlled environments such as indoor labs or greenhouses, often with limited sensor diversity and restricted temporal span. These conditions fail to reflect the dynamic nature of real farmland, including illumination changes, crop growth variation, and natural disturbances. As a result, models trained on such data often lack robustness and generalization when applied to real-world field scenarios. In this paper, we present AgriChrono, a novel robotic data collection platform and multi-modal dataset designed to capture the dynamic conditions of real-world agricultural environments. Our platform integrates multiple sensors and enables remote, time-synchronized acquisition of RGB, Depth, LiDAR, and IMU data, supporting efficient and repeatable long-term data collection across varying illumination and crop growth stages. We benchmark a range of state-of-the-art 3D reconstruction models on the AgriChrono dataset, highlighting the difficulty of reconstruction in real-world field environments and demonstrating its value as a research asset for advancing model generalization under dynamic conditions. The code and dataset are publicly available at: https://github.com/StructuresComp/agri-chrono
FALCON: Autonomous Cyber Threat Intelligence Mining with LLMs for IDS Rule Generation
Signature-based Intrusion Detection Systems (IDS) detect malicious activities by matching network or host activity against predefined rules. These rules are derived from extensive Cyber Threat Intelligence (CTI), which includes attack signatures and behavioral patterns obtained through automated tools and manual threat analysis, such as sandboxing. The CTI is then transformed into actionable rules for the IDS engine, enabling real-time detection and prevention. However, the constant evolution of cyber threats necessitates frequent rule updates, which delay deployment time and weaken overall security readiness. Recent advancements in agentic systems powered by Large Language Models (LLMs) offer the potential for autonomous IDS rule generation with internal evaluation. We introduce FALCON, an autonomous agentic framework that generates deployable IDS rules from CTI data in real-time and evaluates them using built-in multi-phased validators. To demonstrate versatility, we target both network (Snort) and host-based (YARA) mediums and construct a comprehensive dataset of IDS rules with their corresponding CTIs. Our evaluations indicate FALCON excels in automatic rule generation, with an average of 95% accuracy validated by qualitative evaluation with 84% inter-rater agreement among multiple cybersecurity analysts across all metrics. These results underscore the feasibility and effectiveness of LLM-driven data mining for real-time cyber threat mitigation.
comment: 11 pages, 5 figures, 4 tables
Potential of Quantum Computing Applications for Smart Grid Digital Twins and Future Directions
The convergence of digital twin technology and quantum computing is opening new horizons for the modeling, control, and optimization of smart grid systems. This paper reviews the current research landscape at the intersection of these fields, with a focus on how quantum algorithms can enhance the performance of digital twins in smart energy systems. We conduct a thematic literature review and identify key research trends, technical challenges, and gaps in real-world adoption. Further, a conceptual framework is proposed to integrate quantum modules into classical digital twin architectures. The potential benefits of this hybrid approach for smart grid operation and future research directions are also discussed.
comment: Accepted Paper
Scalable Fairness Shaping with LLM-Guided Multi-Agent Reinforcement Learning for Peer-to-Peer Electricity Markets
Peer-to-peer (P2P) energy trading is becoming central to modern distribution systems as rooftop PV and home energy management systems become pervasive, yet most existing market and reinforcement learning designs emphasize efficiency or private profit and offer little real-time guidance to ensure equitable outcomes under uncertainty. To address this gap, a fairness-aware multiagent reinforcement learning framework, FairMarket-RL, is proposed in which a large language model (LLM) critic shapes bidding policies within a continuous double auction under partial observability and discrete price-quantity actions. After each trading slot, the LLM returns normalized fairness scores Fairness-to-Grid (FTG), Fairness-Between-Sellers (FBS), and Fairness-of-Pricing (FPP) that are integrated into the reward via ramped coefficients and tunable scaling, so that fairness guidance complements, rather than overwhelms, economic incentives. The environment models realistic residential load and PV profiles and enforce hard constraints on prices, physical feasibility, and policy-update stability. Across a progression of experiments from a small pilot to a larger simulated community and a mixed-asset real-world dataset, the framework shifts exchanges toward local P2P trades, lowers consumer costs relative to grid-only procurement, sustains strong fairness across participants, and preserves utility viability. Sensitivity analyses over solar availability and aggregate demand further indicate robust performance, suggesting a scalable, LLM-guided pathway to decentralized electricity markets that are economically efficient, socially equitable, and technically sound.
Fuzzy-Based Control Method for Autonomous Spacecraft Inspection with Minimal Fuel Consumption
This study explores an energy-efficient control strategy for spacecraft inspection using a fuzzy inference system combined with a bio-inspired optimization technique to incorporate learning capability into the control process. The optimized fuzzy controller produces a minimally fuel-consuming force while maintaining reliable inspection within constraints, such as illumination, restricted field of view, thrust limits, and safe regions. The performance of the proposed control strategy is validated through Monte Carlo simulations.
comment: 13 pages, 8 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
Optimal Control of ODE Car-Following Models: Applications to Mixed-Autonomy Platoon Control via Coupled Autonomous Vehicles
In this paper, we study the optimal control of a mixed-autonomy platoon driving on a single lane to smooth traffic flow. The platoon consists of autonomous vehicles, whose acceleration is controlled, and human-driven vehicles, whose behavior is described using a microscopic car-following model. We formulate the optimal control problem where the dynamics of the platoon are describing through a system of non-linear ODEs, with explicit constraints on both the state and the control variables. Theoretically, we analyze the well-posedness of the system dynamics under a reasonable set of admissible controls and establish the existence of minimizers for the optimal control problem. To solve the problem numerically, we propose a gradient descent-based algorithm that leverages the adjoint method, along with a penalty approach to handle state constraints. We demonstrate the effectiveness of the proposed numerical scheme through several experiments, exploring various scenarios with different penetration rates and distributions of controlled vehicles within the platoon.
Comparison of Droop-Based Single-Loop Grid-Forming Wind Turbines: High-Frequency Open-Loop Unstable Behavior and Damping
The integration of inverter-interfaced generators introduces new instability phenomena into modern power systems. This paper conducts a comparative analysis of two widely used droop-based grid-forming controls, namely droop control and droop-I control, in wind turbines. Although both approaches provide steady-state reactive power-voltage droop characteristics, their impacts on high-frequency (HF) stability differ significantly. Firstly, on open-loop (OL) comparison reveals that droop-I control alters HF pole locations. The application of Routh's Stability Criterion further analytically demonstrates that such pole shifts inevitably lead to OL instability. This HF OL instability is identified as a structural phenomenon in purely inductive grids and cannot be mitigated through control parameter tuning. As a result, droop-I control significantly degrades HF stability, making conventional gain and phase margins insufficient for evaluating robustness against parameter variations. Then, the performance of established active damping (AD) is assessed for both control schemes. The finding indicates that AD designs effective for droop control may fail to suppress HF resonance under droop-I control due to the presence of unstable OL poles. Case studies performed on the IEEE 14-Bus Test System validate the analysis and emphasize the critical role of HF OL instability in determining the overall power system stability.
Learning Robust Regions of Attraction Using Rollout-Enhanced Physics-Informed Neural Networks with Policy Iteration
The region of attraction is a key metric of the robustness of systems. This paper addresses the numerical solution of the generalized Zubov's equation, which produces a special Lyapunov function characterizing the robust region of attraction for perturbed systems. To handle the highly nonlinear characteristic of the generalized Zubov's equation, we propose a physics-informed neural network framework that employs a policy iteration training scheme with rollout to approximate the viscosity solution. In addition to computing the optimal disturbance during the policy improvement process, we incorporate neural network-generated value estimates as anchor points to facilitate the training procedure to prevent singularities in both low- and high-dimensional systems. Numerical simulations validate the effectiveness of the proposed approach.
comment: Submitted to the American Control Conference (ACC 2026)
Climate-Resilient Ports and Waterborne Transport Systems: Current Status and Future Prospects
The increasing challenges posed by climate change necessitate a comprehensive examination of the resilience of waterborne transport systems. This paper explores the nexus of climate resilience, and waterborne transport, addressing the challenges faced by ports and their connecting waterborne transport systems. It provides an in-depth analysis of the current status of climate-resilient infrastructure and operations while emphasizing the transformative potential of emerging technologies. Through a systematic review, the paper identifies critical gaps and opportunities. Research predominantly emphasizes port infrastructure over supply chain resilience, neglecting the interconnected vulnerabilities of maritime networks. There is limited focus on specific climate-induced disruptions, such as drought and compounded events, which complicate resilience planning. Methodologically, risk assessments and case studies dominate the field, while advanced technologies such as digital twins, artificial intelligence, and satellite monitoring remain underutilized. Geographic disparities in research output and a tendency toward short- to medium-term planning further constrain global and long-term resilience efforts. To address these gaps, the study advocates for systems-based approaches that integrate infrastructure, operations, and supply chains. It highlights collaborative frameworks and advanced tools, including digital twins, machine learning, and participatory modeling, as crucial for enabling predictive and adaptive risk management. This study stands as one of the first comprehensive reviews exclusively focused on climate resilience in ports and waterborne transport systems. It provides actionable insights for policymakers, researchers, and industry stakeholders, proposing a future research agenda to advance waterborne transport systems capable of withstanding multifaceted climate impacts.
Aleks: AI powered Multi Agent System for Autonomous Scientific Discovery via Data-Driven Approaches in Plant Science
Modern plant science increasingly relies on large, heterogeneous datasets, but challenges in experimental design, data preprocessing, and reproducibility hinder research throughput. Here we introduce Aleks, an AI-powered multi-agent system that integrates domain knowledge, data analysis, and machine learning within a structured framework to autonomously conduct data-driven scientific discovery. Once provided with a research question and dataset, Aleks iteratively formulated problems, explored alternative modeling strategies, and refined solutions across multiple cycles without human intervention. In a case study on grapevine red blotch disease, Aleks progressively identified biologically meaningful features and converged on interpretable models with robust performance. Ablation studies underscored the importance of domain knowledge and memory for coherent outcomes. This exploratory work highlights the promise of agentic AI as an autonomous collaborator for accelerating scientific discovery in plant sciences.
Aggregate Fictitious Play for Learning in Anonymous Polymatrix Games (Extended Version)
Fictitious play (FP) is a well-studied algorithm that enables agents to learn Nash equilibrium in games with certain reward structures. However, when agents have no prior knowledge of the reward functions, FP faces a major challenge: the joint action space grows exponentially with the number of agents, which slows down reward exploration. Anonymous games offer a structure that mitigates this issue. In these games, the rewards depend only on the actions taken; not on who is taking which action. Under such a structure, we introduce aggregate fictitious play (agg-FP), a variant of FP where each agent tracks the frequency of the number of other agents playing each action, rather than these agents' individual actions. We show that in anonymous polymatrix games, agg-FP converges to a Nash equilibrium under the same conditions as classical FP. In essence, by aggregating the agents' actions, we reduce the action space without losing the convergence guarantees. Using simulations, we provide empirical evidence on how this reduction accelerates convergence.
Towards Reliable Neural Optimizers: Permutation-Equivariant Neural Approximation in Dynamic Data Driven Applications Systems
Dynamic Data Driven Applications Systems (DDDAS) motivate the development of optimization approaches capable of adapting to streaming, heterogeneous, and asynchronous data from sensor networks. Many established optimization solvers, such as branch-and-bound, gradient descent, and Newton-Raphson methods, rely on iterative algorithms whose step-by-step convergence makes them too slow for real-time, multi-sensor environments. In our recent work, we introduced LOOP-PE (Learning to Optimize the Optimization Process, Permutation Equivariance version), a feed-forward neural approximation model with an integrated feasibility recovery function. LOOP-PE processes inputs from a variable number of sensors in arbitrary order, making it robust to sensor dropout, communication delays, and system scaling. Its permutation-equivariant architecture ensures that reordering the input data reorders the corresponding dispatch decisions consistently, without retraining or pre-alignment. Feasibility is enforced via a generalized gauge map, guaranteeing that outputs satisfy physical and operational constraints. We illustrate the approach in a DDDAS-inspired case study of a Virtual Power Plant (VPP) managing multiple distributed generation agents (DERs) to maximize renewable utilization while respecting system limits. Results show that LOOP-PE produces near-optimal, feasible, and highly adaptable decisions under dynamic, unordered, and distributed sensing conditions, significantly outperforming iterative algorithm based solvers in both speed and flexibility. Here, we extend our earlier work by providing additional analysis and explanation of LOOP-PE design and operation, with particular emphasis on its feasibility guarantee and permutation equivariance feature.
Set-membership identification of continuous-time MIMO systems via Tustin discretization
In this paper, we deal with the identification of continuous-time systems from sampled data corrupted by unknown but bounded errors. A significant challenge in continuous-time identification is the estimation of the input and output data derivatives. In this paper, we propose a novel method based on set-membership techniques and Tustin discretization, which overcomes the derivative measurement problem and the presence of bounded errors affecting all the measured signals. First, we derive the proposed method and prove that it becomes an affordable polynomial optimization problem. Then, we present some numerical results based on simulation and experimental data to explore the effectiveness of the proposed method.
Privacy-Preserving Distributed Control for a Networked Battery Energy Storage System
The increasing deployment of distributed Battery Energy Storage Systems (BESSs) in modern power grids necessitates effective coordination strategies to ensure state-of-charge (SoC) balancing and accurate power delivery. While distributed control frameworks offer scalability and resilience, they also raise significant privacy concerns due to the need for inter-agent information exchange. This paper presents a novel privacy-preserving distributed control algorithm for SoC balancing in a networked BESS. The proposed framework includes distributed power allocation law that is designed based on two privacy-preserving distributed estimators, one for the average unit state and the other for the average desired power. The average unit state estimator is designed via the state decomposition method without disclosing sensitive internal states. The proposed power allocation law based on these estimators ensures asymptotic SoC balancing and global power delivery while safeguarding agent privacy from external eavesdroppers. The effectiveness and privacy-preserving properties of the proposed control strategy are demonstrated through simulation results.
QuadKAN: KAN-Enhanced Quadruped Motion Control via End-to-End Reinforcement Learning
We address vision-guided quadruped motion control with reinforcement learning (RL) and highlight the necessity of combining proprioception with vision for robust control. We propose QuadKAN, a spline-parameterized cross-modal policy instantiated with Kolmogorov-Arnold Networks (KANs). The framework incorporates a spline encoder for proprioception and a spline fusion head for proprioception-vision inputs. This structured function class aligns the state-to-action mapping with the piecewise-smooth nature of gait, improving sample efficiency, reducing action jitter and energy consumption, and providing interpretable posture-action sensitivities. We adopt Multi-Modal Delay Randomization (MMDR) and perform end-to-end training with Proximal Policy Optimization (PPO). Evaluations across diverse terrains, including both even and uneven surfaces and scenarios with static or dynamic obstacles, demonstrate that QuadKAN achieves consistently higher returns, greater distances, and fewer collisions than state-of-the-art (SOTA) baselines. These results show that spline-parameterized policies offer a simple, effective, and interpretable alternative for robust vision-guided locomotion. A repository will be made available upon acceptance.
comment: 14pages, 9 figures, Journal paper
Realizing Reduced and Sparse Biochemical Reaction Networks from Dynamics
We propose a direct optimization framework for learning reduced and sparse chemical reaction networks (CRNs) from time-series trajectory data. In contrast to widely used indirect methods-such as those based on sparse identification of nonlinear dynamics (SINDy)-which infer reaction dynamics by fitting numerically estimated derivatives, our approach fits entire trajectories by solving a dynamically constrained optimization problem. This formulation enables the construction of reduced CRNs that are both low-dimensional and sparse, while preserving key dynamical behaviors of the original system. We develop an accelerated proximal gradient algorithm to efficiently solve the resulting non-convex optimization problem. Through illustrative examples, including a Drosophila circadian oscillator and a glycolytic oscillator, we demonstrate the ability of our method to recover accurate and interpretable reduced-order CRNs. Notably, the direct approach avoids the derivative estimation step and mitigates error accumulation issues inherent in indirect methods, making it a robust alternative for data-driven CRN realizations.
comment: Accepted to IEEE CDC 2025. Author-accepted version; supplementary material in ancillary files (In this version, supplementary PDF is moved to ancillary files; no content changes to main article)
An Adaptive Environment-Aware Transformer Autoencoder for UAV-FSO with Dynamic Complexity Control
The rise of sixth-generation (6G) wireless networks sets high demands on UAV-assisted Free Space Optical (FSO) communications, where the channel environment becomes more complex and variable due to both atmospheric turbulence and UAV-induced vibrations. These factors increase the challenge of maintaining reliable communication and require adaptive processing methods. Autoencoders are promising as they learn optimal encodings from channel data. However, existing autoencoder designs are generic and lack the specific adaptability and computational flexibility needed for UAV-FSO scenarios. To address this, we propose AEAT-AE (Adaptive Environment-aware Transformer Autoencoder), a Transformer-based framework that integrates environmental parameters into both encoder and decoder via a cross-attention mechanism. Moreover, AEAT-AE incorporates a Deep Q-Network (DQN) that dynamically selects which layers of the Transformer autoencoder to activate based on real-time environmental inputs, effectively balancing performance and computational cost. Simulation results demonstrate that AEAT-AE outperforms conventional methods in bit error rate while maintaining efficient runtime, representing a novel tailored solution for next-generation UAV-FSO communications.
Maximizing Battery Storage Profits via High-Frequency Intraday Trading
Maximizing revenue for grid-scale battery energy storage systems in continuous intraday electricity markets requires strategies that are able to seize trading opportunities as soon as new information arrives. This paper introduces and evaluates an automated high-frequency trading strategy for battery energy storage systems trading on the intraday market for power while explicitly considering the dynamics of the limit order book, market rules, and technical parameters. The standard rolling intrinsic strategy is adapted for continuous intraday electricity markets and solved using a dynamic programming approximation that is two to three orders of magnitude faster than an exact mixed-integer linear programming solution. A detailed backtest over a full year of German order book data demonstrates that the proposed dynamic programming formulation does not reduce trading profits and enables the policy to react to every relevant order book update, enabling realistic rapid backtesting. Our results show the significant revenue potential of high-frequency trading: our policy earns 58% more than when re-optimizing only once every hour and 14% more than when re-optimizing once per minute, highlighting that profits critically depend on trading speed. Furthermore, we leverage the speed of our algorithm to train a parametric extension of the rolling intrinsic, increasing yearly revenue by 8.4% out of sample.
Aging-aware Energy Management for Residential Multi-Carrier Energy Systems
In the context of building electrification, the operation of distributed energy resources integrating multiple energy carriers (electricity, heat, mobility) poses a significant challenge due to the nonlinear device dynamics, uncertainty, and computational issues. As such, energy management systems seek to decide the power dispatch in the best way possible. The objective is to minimize and balance operative costs (energy bills or asset degradation) with user requirements (mobility, heating, etc.). Current energy management uses empirical battery ageing models outside of their specific fitting conditions, resulting in inaccuracies and poor performance. Moreover, the link to thermal systems is also overlooked. This paper presents an ageing-aware day-ahead algorithm for electrified buildings that incorporates physics-based battery ageing models. The models distinguish between energy storage systems and make explicit the trade-off between grid cost and battery degradation. The proposed day-ahead algorithm can either cut down on grid costs or extend battery lifetime (electric vehicle or stationary battery packs). Moreover, it exploits the differences between cathode chemistries improving grid costs by 25% when using LFP cells, with respect to NMC cells. Finally, the performance using aged batteries is also enhanced with 35% grid cost observed savings, when passing from new to aged batteries in the summer.
Deception in Oligopoly Games via Adaptive Nash Seeking Systems
In the theory of multi-agent systems, deception refers to the strategic manipulation of information to influence the behavior of other agents, ultimately altering the long-term dynamics of the entire system. Recently, this concept has been examined in the context of model-free Nash equilibrium seeking (NES) algorithms for noncooperative games. Specifically, it was demonstrated that players can exploit knowledge of other players' exploration signals to drive the system toward a ``deceptive" Nash equilibrium, while maintaining the stability of the closed-loop system. To extend this insight beyond the duopoly case, in this paper we conduct a comprehensive study of deception mechanisms in N-player oligopoly markets. By leveraging the structure of these games and employing stability techniques for nonlinear dynamical systems, we provide game-theoretic insights into deception and derive specialized results, including stability conditions. These results allow players to systematically adjust their NES dynamics by tuning gains and signal amplitudes, all while ensuring closed-loop stability. Additionally, we introduce novel sufficient conditions to demonstrate that the (practically) stable equilibrium point of the deceptive dynamics corresponds to a true Nash equilibrium of a different game, which we term the ``deceptive game." Our results show that, under the proposed adaptive dynamics with deception, a victim firm may develop a distorted perception of its competitors' product appeal, which could lead to setting suboptimal prices.
Stochastic Real-Time Deception in Nash Equilibrium Seeking for Games with Quadratic Payoffs
In multi-agent autonomous systems, deception is a fundamental concept which characterizes the exploitation of unbalanced information to mislead victims into choosing oblivious actions. This effectively alters the system's long term behavior, leading to outcomes that may be beneficial to the deceiver but detrimental to victim. We study this phenomenon for a class of model-free Nash equilibrium seeking (NES) where players implement independent stochastic exploration signals to learn the pseudogradient flow. In particular, we show that deceptive players who obtain real-time measurements of other players' stochastic perturbation can incorporate this information into their own NES action update, consequentially steering the overall dynamics to a new operating point that could potentially improve the payoffs of the deceptive players. We consider games with quadratic payoff functions, as this restriction allows us to derive a more explicit formulation of the capabilities of the deceptive players. By leveraging results on multi-input stochastic averaging for dynamical systems, we establish local exponential (in probability) convergence for the proposed deceptive NES dynamics. To illustrate our results, we apply them to a two player quadratic game.
Trajectory Optimization for UAV-Based Medical Delivery with Temporal Logic Constraints and Convex Feasible Set Collision Avoidance
This paper addresses the problem of trajectory optimization for unmanned aerial vehicles (UAVs) performing time-sensitive medical deliveries in urban environments. Specifically, we consider a single UAV with 3 degree-of-freedom dynamics tasked with delivering blood packages to multiple hospitals, each with a predefined time window and priority. Mission objectives are encoded using Signal Temporal Logic (STL), enabling the formal specification of spatial-temporal constraints. To ensure safety, city buildings are modeled as 3D convex obstacles, and obstacle avoidance is handled through a Convex Feasible Set (CFS) method. The entire planning problem-combining UAV dynamics, STL satisfaction, and collision avoidance-is formulated as a convex optimization problem that ensures tractability and can be solved efficiently using standard convex programming techniques. Simulation results demonstrate that the proposed method generates dynamically feasible, collision-free trajectories that satisfy temporal mission goals, providing a scalable and reliable approach for autonomous UAV-based medical logistics.
comment: 11 pages, 4 figures
Global Geolocated Realtime Data of Interfleet Urban Transit Bus Idling
Urban transit bus idling is a contributor to ecological stress, economic inefficiency, and medically hazardous health outcomes due to emissions. The global accumulation of this frequent pattern of undesirable driving behavior is enormous. In order to measure its scale, we propose GRD-TRT-BUF-4I (Ground Truth Buffer for Idling) an extensible, realtime detection system that records the geolocation and idling duration of urban transit bus fleets internationally. Using live vehicle locations from General Transit Feed Specification (GTFS) Realtime, the system detects approximately 200,000 idling events per day from over 50 cities across North America, Europe, Oceania, and Asia. This realtime data was created dynamically to serve operational decision-making and fleet management to reduce the frequency and duration of idling events as they occur, as well as to capture its accumulative effects. Civil and Transportation Engineers, Urban Planners, Epidemiologists, Policymakers, and other stakeholders might find this useful for emissions modeling, traffic management, route planning, and other urban sustainability efforts at a variety of geographic and temporal scales.
comment: 35 pages, 12 figures, 36 tables, 100 data sources (including links). Prepared for Earth System Science Data (ESSD)
Optimal Planning for Enhancing the Resilience of Modern Distribution Systems Against Cyberattacks
The increasing integration of IoT-connected devices in smart grids has introduced new vulnerabilities at the distribution level. Of particular concern is the potential for cyberattacks that exploit high-wattage IoT devices, such as EV chargers, to manipulate local demand and destabilize the grid. While previous studies have primarily focused on such attacks at the transmission level, this paper investigates their feasibility and impact at the distribution level. We examine how cyberattackers can target voltage-sensitive nodes, especially those exposed by the presence of high-consumption devices, to cause voltage deviation and service disruption. Our analysis demonstrates that conventional grid protections are insufficient against these intelligent, localized attacks. To address this, we propose resilience strategies using distributed generation (DGs), exploring their role in preemptive planning. This research highlights the urgent need for distribution-level cyber resilience planning in smart grids.
comment: Accepted Paper
Large Language Model-Based Framework for Explainable Cyberattack Detection in Automatic Generation Control Systems
The increasing digitization of smart grids has improved operational efficiency but also introduced new cybersecurity vulnerabilities, such as False Data Injection Attacks (FDIAs) targeting Automatic Generation Control (AGC) systems. While machine learning (ML) and deep learning (DL) models have shown promise in detecting such attacks, their opaque decision-making limits operator trust and real-world applicability. This paper proposes a hybrid framework that integrates lightweight ML-based attack detection with natural language explanations generated by Large Language Models (LLMs). Classifiers such as LightGBM achieve up to 95.13% attack detection accuracy with only 0.004 s inference latency. Upon detecting a cyberattack, the system invokes LLMs, including GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o mini, to generate human-readable explanation of the event. Evaluated on 100 test samples, GPT-4o mini with 20-shot prompting achieved 93% accuracy in identifying the attack target, a mean absolute error of 0.075 pu in estimating attack magnitude, and 2.19 seconds mean absolute error (MAE) in estimating attack onset. These results demonstrate that the proposed framework effectively balances real-time detection with interpretable, high-fidelity explanations, addressing a critical need for actionable AI in smart grid cybersecurity.
comment: Accepted Paper
From Optimization to Control: Quasi Policy Iteration
Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we adopt the quasi-Newton method (QNM) from convex optimization to introduce a novel control algorithm coined as quasi-policy iteration (QPI). In particular, QPI is based on a novel approximation of the ``Hessian'' matrix in the policy iteration algorithm, which exploits two linear structural constraints specific to MDPs and allows for the incorporation of prior information on the transition probability kernel. While the proposed algorithm has the same computational complexity as value iteration, it exhibits an empirical convergence behavior similar to that of QNM with a low sensitivity to the discount factor.
Harnessing the Full Potential of RRAMs through Scalable and Distributed In-Memory Computing with Integrated Error Correction
Exponential growth in global computing demand is exacerbated due to the higher-energy requirements of conventional architectures, primarily due to energy-intensive data movement. In-memory computing with Resistive Random Access Memory (RRAM) addresses this by co-integrating memory and processing, but faces significant hurdles related to device-level non-idealities and poor scalability for large computing tasks. Here, we introduce MELISO+ (In-Memory Linear Solver), a full-stack, distributed framework for energy-efficient in-memory computing. MELISO+ proposes a novel two-tier error correction mechanism to mitigate device non-idealities and develops a distributed RRAM computing framework to enable matrix computations exceeding dimensions of $65,000\times65,000$. This approach reduces first- and second-order arithmetic errors due to device non-idealities by over $90\%$, enhances energy efficiency by three to five orders of magnitude, and decreases latency 100-fold. Hence, MELISO+ allows lower-precision RRAM devices to outperform high-precision device alternatives in accuracy, energy and latency metrics. By unifying algorithm-hardware co-design with scalable architecture, MELISO+ significantly advances sustainable, high-dimensional computing suitable for applications like large language models and generative AI.
Multiagent Systems
Optimizing Highway Traffic Flow in Mixed Autonomy: A Multiagent Truncated Rollout Approach
The development of connected and autonomous vehicles (CAVs) offers substantial opportunities to enhance traffic efficiency. However, in mixed autonomy environments where CAVs coexist with human-driven vehicles (HDVs), achieving efficient coordination among CAVs remains challenging due to heterogeneous driving behaviors. To address this, this paper proposes a multiagent truncated rollout approach that enhances CAV speed coordination to improve highway throughput while reducing computational overhead. In this approach, a traffic density evolution equation is formulated that comprehensively accounts for the presence or absence of CAVs, and a distributed coordination control framework is established accordingly. By incorporating kinematic information from neighbor agents and employing an agent-by-agent sequential solution mechanism, our method enables explicit cooperation among CAVs. Furthermore, we introduce a truncated rollout scheme that adaptively shortens the optimization horizon based on the evaluation of control sequences. This significantly reduces the time complexity, thereby improving real-time performance and scalability. Theoretical analysis provides rigorous guarantees on the stability and performance improvement of the system. Simulations conducted on real-world bottleneck scenarios demonstrate that, in large-scale mixed traffic flows, the proposed method outperforms conventional model predictive control methods by reducing both the average travel time in the bottleneck area and overall computational time, highlighting its strong potential for practical deployment.
MATRIX: Multi-Agent simulaTion fRamework for safe Interactions and conteXtual clinical conversational evaluation
Despite the growing use of large language models (LLMs) in clinical dialogue systems, existing evaluations focus on task completion or fluency, offering little insight into the behavioral and risk management requirements essential for safety-critical systems. This paper presents MATRIX (Multi-Agent simulaTion fRamework for safe Interactions and conteXtual clinical conversational evaluation), a structured, extensible framework for safety-oriented evaluation of clinical dialogue agents. MATRIX integrates three components: (1) a safety-aligned taxonomy of clinical scenarios, expected system behaviors and failure modes derived through structured safety engineering methods; (2) BehvJudge, an LLM-based evaluator for detecting safety-relevant dialogue failures, validated against expert clinician annotations; and (3) PatBot, a simulated patient agent capable of producing diverse, scenario-conditioned responses, evaluated for realism and behavioral fidelity with human factors expertise, and a patient-preference study. Across three experiments, we show that MATRIX enables systematic, scalable safety evaluation. BehvJudge with Gemini 2.5-Pro achieves expert-level hazard detection (F1 0.96, sensitivity 0.999), outperforming clinicians in a blinded assessment of 240 dialogues. We also conducted one of the first realism analyses of LLM-based patient simulation, showing that PatBot reliably simulates realistic patient behavior in quantitative and qualitative evaluations. Using MATRIX, we demonstrate its effectiveness in benchmarking five LLM agents across 2,100 simulated dialogues spanning 14 hazard scenarios and 10 clinical domains. MATRIX is the first framework to unify structured safety engineering with scalable, validated conversational AI evaluation, enabling regulator-aligned safety auditing. We release all evaluation tools, prompts, structured scenarios, and datasets.
comment: 36 pages, 16 figures
Playstyle and Artificial Intelligence: An Initial Blueprint Through the Lens of Video Games
Contemporary artificial intelligence (AI) development largely centers on rational decision-making, valued for its measurability and suitability for objective evaluation. Yet in real-world contexts, an intelligent agent's decisions are shaped not only by logic but also by deeper influences such as beliefs, values, and preferences. The diversity of human decision-making styles emerges from these differences, highlighting that "style" is an essential but often overlooked dimension of intelligence. This dissertation introduces playstyle as an alternative lens for observing and analyzing the decision-making behavior of intelligent agents, and examines its foundational meaning and historical context from a philosophical perspective. By analyzing how beliefs and values drive intentions and actions, we construct a two-tier framework for style formation: the external interaction loop with the environment and the internal cognitive loop of deliberation. On this basis, we formalize style-related characteristics and propose measurable indicators such as style capacity, style popularity, and evolutionary dynamics. The study focuses on three core research directions: (1) Defining and measuring playstyle, proposing a general playstyle metric based on discretized state spaces, and extending it to quantify strategic diversity and competitive balance; (2) Expressing and generating playstyle, exploring how reinforcement learning and imitation learning can be used to train agents exhibiting specific stylistic tendencies, and introducing a novel approach for human-like style learning and modeling; and (3) Practical applications, analyzing the potential of these techniques in domains such as game design and interactive entertainment. Finally, the dissertation outlines future extensions, including the role of style as a core element in building artificial general intelligence (AGI).
comment: PhD Dissertation, National Yang Ming Chiao Tung University, 2025. This is the public version without Chinese abstract or postscript
DELIVER: A System for LLM-Guided Coordinated Multi-Robot Pickup and Delivery using Voronoi-Based Relay Planning
We present DELIVER (Directed Execution of Language-instructed Item Via Engineered Relay), a fully integrated framework for cooperative multi-robot pickup and delivery driven by natural language commands. DELIVER unifies natural language understanding, spatial decomposition, relay planning, and motion execution to enable scalable, collision-free coordination in real-world settings. Given a spoken or written instruction, a lightweight instance of LLaMA3 interprets the command to extract pickup and delivery locations. The environment is partitioned using a Voronoi tessellation to define robot-specific operating regions. Robots then compute optimal relay points along shared boundaries and coordinate handoffs. A finite-state machine governs each robot's behavior, enabling robust execution. We implement DELIVER on the MultiTRAIL simulation platform and validate it in both ROS2-based Gazebo simulations and real-world hardware using TurtleBot3 robots. Empirical results show that DELIVER maintains consistent mission cost across varying team sizes while reducing per-agent workload by up to 55% compared to a single-agent system. Moreover, the number of active relay agents remains low even as team size increases, demonstrating the system's scalability and efficient agent utilization. These findings underscore DELIVER's modular and extensible architecture for language-guided multi-robot coordination, advancing the frontiers of cyber-physical system integration.
comment: Submission under review at the 2026 IEEE/SICE International Symposium on System Integration (SII 2026)
Skill-Aligned Fairness in Multi-Agent Learning for Collaboration in Healthcare
Fairness in multi-agent reinforcement learning (MARL) is often framed as a workload balance problem, overlooking agent expertise and the structured coordination required in real-world domains. In healthcare, equitable task allocation requires workload balance or expertise alignment to prevent burnout and overuse of highly skilled agents. Workload balance refers to distributing an approximately equal number of subtasks or equalised effort across healthcare workers, regardless of their expertise. We make two contributions to address this problem. First, we propose FairSkillMARL, a framework that defines fairness as the dual objective of workload balance and skill-task alignment. Second, we introduce MARLHospital, a customizable healthcare-inspired environment for modeling team compositions and energy-constrained scheduling impacts on fairness, as no existing simulators are well-suited for this problem. We conducted experiments to compare FairSkillMARL in conjunction with four standard MARL methods, and against two state-of-the-art fairness metrics. Our results suggest that fairness based solely on equal workload might lead to task-skill mismatches and highlight the need for more robust metrics that capture skill-task misalignment. Our work provides tools and a foundation for studying fairness in heterogeneous multi-agent systems where aligning effort with expertise is critical.
Bias-Adjusted LLM Agents for Human-Like Decision-Making via Behavioral Economics
Large language models (LLMs) are increasingly used to simulate human decision-making, but their intrinsic biases often diverge from real human behavior--limiting their ability to reflect population-level diversity. We address this challenge with a persona-based approach that leverages individual-level behavioral data from behavioral economics to adjust model biases. Applying this method to the ultimatum game--a standard but difficult benchmark for LLMs--we observe improved alignment between simulated and empirical behavior, particularly on the responder side. While further refinement of trait representations is needed, our results demonstrate the promise of persona-conditioned LLMs for simulating human-like decision patterns at scale.
comment: 8 pages, 4 figures
Aggregate Fictitious Play for Learning in Anonymous Polymatrix Games (Extended Version)
Fictitious play (FP) is a well-studied algorithm that enables agents to learn Nash equilibrium in games with certain reward structures. However, when agents have no prior knowledge of the reward functions, FP faces a major challenge: the joint action space grows exponentially with the number of agents, which slows down reward exploration. Anonymous games offer a structure that mitigates this issue. In these games, the rewards depend only on the actions taken; not on who is taking which action. Under such a structure, we introduce aggregate fictitious play (agg-FP), a variant of FP where each agent tracks the frequency of the number of other agents playing each action, rather than these agents' individual actions. We show that in anonymous polymatrix games, agg-FP converges to a Nash equilibrium under the same conditions as classical FP. In essence, by aggregating the agents' actions, we reduce the action space without losing the convergence guarantees. Using simulations, we provide empirical evidence on how this reduction accelerates convergence.
Consistent Opponent Modeling of Static Opponents in Imperfect-Information Games
The goal of agents in multi-agent environments is to maximize total reward against the opposing agents that are encountered. Following a game-theoretic solution concept, such as Nash equilibrium, may obtain a strong performance in some settings; however, such approaches fail to capitalize on historical and observed data from repeated interactions against our opponents. Opponent modeling algorithms integrate machine learning techniques to exploit suboptimal opponents utilizing available data; however, the effectiveness of such approaches in imperfect-information games to date is quite limited. We show that existing opponent modeling approaches fail to satisfy a simple desirable property even against static opponents drawn from a known prior distribution; namely, they do not guarantee that the model approaches the opponent's true strategy even in the limit as the number of game iterations approaches infinity. We develop a new algorithm that is able to achieve this property and runs efficiently by solving a convex minimization problem based on the sequence-form game representation using projected gradient descent. The algorithm is guaranteed to efficiently converge to the opponent's true strategy given observations from gameplay and possibly additional historical data if it is available.
An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
Rare diseases collectively affect over 300 million individuals worldwide, yet timely and accurate diagnosis remains a pervasive challenge. This is largely due to their clinical heterogeneity, low individual prevalence, and the limited familiarity most clinicians have with rare conditions. Here, we introduce DeepRare, the first rare disease diagnosis agentic system powered by a large language model (LLM), capable of processing heterogeneous clinical inputs. The system generates ranked diagnostic hypotheses for rare diseases, each accompanied by a transparent chain of reasoning that links intermediate analytic steps to verifiable medical evidence. DeepRare comprises three key components: a central host with a long-term memory module; specialized agent servers responsible for domain-specific analytical tasks integrating over 40 specialized tools and web-scale, up-to-date medical knowledge sources, ensuring access to the most current clinical information. This modular and scalable design enables complex diagnostic reasoning while maintaining traceability and adaptability. We evaluate DeepRare on eight datasets. The system demonstrates exceptional diagnostic performance among 2,919 diseases, achieving 100% accuracy for 1013 diseases. In HPO-based evaluations, DeepRare significantly outperforms other 15 methods, like traditional bioinformatics diagnostic tools, LLMs, and other agentic systems, achieving an average Recall@1 score of 57.18% and surpassing the second-best method (Reasoning LLM) by a substantial margin of 23.79 percentage points. For multi-modal input scenarios, DeepRare achieves 70.60% at Recall@1 compared to Exomiser's 53.20% in 109 cases. Manual verification of reasoning chains by clinical experts achieves 95.40% agreements. Furthermore, the DeepRare system has been implemented as a user-friendly web application http://raredx.cn/doctor.
The Influence of Human-inspired Agentic Sophistication in LLM-driven Strategic Reasoners
The rapid rise of large language models (LLMs) has shifted artificial intelligence (AI) research toward agentic systems, motivating the use of weaker and more flexible notions of agency. However, this shift raises key questions about the extent to which LLM-based agents replicate human strategic reasoning, particularly in game-theoretic settings. In this context, we examine the role of agentic sophistication in shaping artificial reasoners' performance by evaluating three agent designs: a simple game-theoretic model, an unstructured LLM-as-agent model, and an LLM integrated into a traditional agentic framework. Using guessing games as a testbed, we benchmarked these agents against human participants across general reasoning patterns and individual role-based objectives. Furthermore, we introduced obfuscated game scenarios to assess agents' ability to generalise beyond training distributions. Our analysis, covering over 2000 reasoning samples across 25 agent configurations, shows that human-inspired cognitive structures can enhance LLM agents' alignment with human strategic behaviour. Still, the relationship between agentic design complexity and human-likeness is non-linear, highlighting a critical dependence on underlying LLM capabilities and suggesting limits to simple architectural augmentation.
Safe Multiagent Coordination via Entropic Exploration
Many real-world multiagent learning problems involve safety concerns. In these setups, typical safe reinforcement learning algorithms constrain agents' behavior, limiting exploration -- a crucial component for discovering effective cooperative multiagent behaviors. Moreover, the multiagent literature typically models individual constraints for each agent and has yet to investigate the benefits of using joint team constraints. In this work, we analyze these team constraints from a theoretical and practical perspective and propose entropic exploration for constrained multiagent reinforcement learning (E2C) to address the exploration issue. E2C leverages observation entropy maximization to incentivize exploration and facilitate learning safe and effective cooperative behaviors. Experiments across increasingly complex domains show that E2C agents match or surpass common unconstrained and constrained baselines in task performance while reducing unsafe behaviors by up to $50\%$.
comment: 10 pages, 6 figures
PE-MA: Parameter-Efficient Co-Evolution of Multi-Agent Systems
Multi-Agent Systems have recently emerged as a promising paradigm for collaborative reasoning and solving complex tasks. However, the design of collaborative learning algorithms in multi-agent systems faces several challenges, including high communication overhead and insufficient agent-level personalization. In this paper, we propose PE-MA (Parameter-Efficient Multi-Agent Co-Evolution), a novel collaboration framework that supports efficient, scalable, and personalized co-evolution in multi-agent systems. In PE-MA, each agent maintains a lightweight personalized adapter to support agent-specific behavior, while a shared adapter is collaboratively optimized across neighboring agents. This design balances global coordination with local adaptation under heterogeneous environments. We achieve an asymptotically optimal convergence rate of O( 1/(NK)^(1/2) ), where N is the number of agents and K the local update steps.
comment: 5 pages,Latex;references added
DVM-SLAM: Decentralized Visual Monocular Simultaneous Localization and Mapping for Multi-Agent Systems
Cooperative Simultaneous Localization and Mapping (C-SLAM) enables multiple agents to work together in mapping unknown environments while simultaneously estimating their own positions. This approach enhances robustness, scalability, and accuracy by sharing information between agents, reducing drift, and enabling collective exploration of larger areas. In this paper, we present Decentralized Visual Monocular SLAM (DVM-SLAM), the first open-source decentralized monocular C-SLAM system. By only utilizing low-cost and light-weight monocular vision sensors, our system is well suited for small robots and micro aerial vehicles (MAVs). DVM-SLAM's real-world applicability is validated on physical robots with a custom collision avoidance framework, showcasing its potential in real-time multi-agent autonomous navigation scenarios. We also demonstrate comparable accuracy to state-of-the-art centralized monocular C-SLAM systems. We open-source our code and provide supplementary material online.
comment: Accepted to 2025 IEEE International Conference on Robotics and Automation, pp. 15814-15820
Robotics
FlowVLA: Thinking in Motion with a Visual Chain of Thought
Many Vision-Language-Action (VLA) models rely on an internal world model trained via next-frame prediction. This approach, however, struggles with physical reasoning as it entangles static appearance with dynamic motion, often resulting in implausible visual forecasts and inefficient policy learning. To address these limitations, we introduce the Visual Chain of Thought (Visual CoT): a pre-training framework that encourages a model to reason about how a scene evolves before predicting what it will look like. We instantiate this principle in FlowVLA, which predicts a future frame ($v_{t+1}$) only after generating an intermediate optical flow representation ($f_t$) that encodes motion dynamics. This ``$v_t \rightarrow f_t \rightarrow v_{t+1}$'' reasoning process is implemented within a single autoregressive Transformer, guiding the model to learn disentangled dynamics. As a result, FlowVLA produces coherent visual predictions and facilitates more efficient policy learning. Experiments on challenging robotics manipulation benchmarks demonstrate state-of-the-art performance with substantially improved sample efficiency, pointing toward a more principled foundation for world modeling. Project page: https://irpn-lab.github.io/FlowVLA/
SafeBimanual: Diffusion-based Trajectory Optimization for Safe Bimanual Manipulation
Bimanual manipulation has been widely applied in household services and manufacturing, which enables the complex task completion with coordination requirements. Recent diffusion-based policy learning approaches have achieved promising performance in modeling action distributions for bimanual manipulation. However, they ignored the physical safety constraints of bimanual manipulation, which leads to the dangerous behaviors with damage to robots and objects. To this end, we propose a test-time trajectory optimization framework named SafeBimanual for any pre-trained diffusion-based bimanual manipulation policies, which imposes the safety constraints on bimanual actions to avoid dangerous robot behaviors with improved success rate. Specifically, we design diverse cost functions for safety constraints in different dual-arm cooperation patterns including avoidance of tearing objects and collision between arms and objects, which optimizes the manipulator trajectories with guided sampling of diffusion denoising process. Moreover, we employ a vision-language model (VLM) to schedule the cost functions by specifying keypoints and corresponding pairwise relationship, so that the optimal safety constraint is dynamically generated in the entire bimanual manipulation process. SafeBimanual demonstrates superiority on 8 simulated tasks in RoboTwin with a 13.7% increase in success rate and a 18.8% reduction in unsafe interactions over state-of-the-art diffusion-based methods. Extensive experiments on 4 real-world tasks further verify its practical value by improving the success rate by 32.5%.
comment: Project website is at: https://denghaoyuan123.github.io/SafeBimanip/
Scene-Agnostic Traversability Labeling and Estimation via a Multimodal Self-supervised Framework
Traversability estimation is critical for enabling robots to navigate across diverse terrains and environments. While recent self-supervised learning methods achieve promising results, they often fail to capture the characteristics of non-traversable regions. Moreover, most prior works concentrate on a single modality, overlooking the complementary strengths offered by integrating heterogeneous sensory modalities for more robust traversability estimation. To address these limitations, we propose a multimodal self-supervised framework for traversability labeling and estimation. First, our annotation pipeline integrates footprint, LiDAR, and camera data as prompts for a vision foundation model, generating traversability labels that account for both semantic and geometric cues. Then, leveraging these labels, we train a dual-stream network that jointly learns from different modalities in a decoupled manner, enhancing its capacity to recognize diverse traversability patterns. In addition, we incorporate sparse LiDAR-based supervision to mitigate the noise introduced by pseudo labels. Finally, extensive experiments conducted across urban, off-road, and campus environments demonstrate the effectiveness of our approach. The proposed automatic labeling method consistently achieves around 88% IoU across diverse datasets. Compared to existing self-supervised state-of-the-art methods, our multimodal traversability estimation network yields consistently higher IoU, improving by 1.6-3.5% on all evaluated datasets.
DANCeRS: A Distributed Algorithm for Negotiating Consensus in Robot Swarms with Gaussian Belief Propagation
Robot swarms require cohesive collective behaviour to address diverse challenges, including shape formation and decision-making. Existing approaches often treat consensus in discrete and continuous decision spaces as distinct problems. We present DANCeRS, a unified, distributed algorithm leveraging Gaussian Belief Propagation (GBP) to achieve consensus in both domains. By representing a swarm as a factor graph our method ensures scalability and robustness in dynamic environments, relying on purely peer-to-peer message passing. We demonstrate the effectiveness of our general framework through two applications where agents in a swarm must achieve consensus on global behaviour whilst relying on local communication. In the first, robots must perform path planning and collision avoidance to create shape formations. In the second, we show how the same framework can be used by a group of robots to form a consensus over a set of discrete decisions. Experimental results highlight our method's scalability and efficiency compared to recent approaches to these problems making it a promising solution for multi-robot systems requiring distributed consensus. We encourage the reader to see the supplementary video demo.
Analysis of Harpy's Constrained Trotting and Jumping Maneuver
This study presents an analysis of experimental data from Harpy, a thruster-assisted bipedal robot developed at Northeastern University. The study examines data sets from trotting and jumping experiments to understand the fundamental principles governing hybrid leg-thruster locomotion. Through data analysis across multiple locomotion modes, this research reveals that Harpy achieves stable locomotion with bounded trajectories and consistent foot placement through strategic leg-thruster synergy. The results demonstrate controlled joint behavior with low torques and symmetric tracking, accurate foot placement within kinematic constraints despite phase-transition perturbations, and underactuated degree-of-freedom stability without divergence. Energy level analysis reveals that legs provide primary propulsion, while the thrusters enable additional aerial phase control. The analysis identifies critical body-leg coupling dynamics during aerial phases that require phase-specific control strategies. Consistent repeatability and symmetry across experiments validate the robustness of the hybrid actuation approach.
comment: Master's Thesis
BirdRecorder's AI on Sky: Safeguarding birds of prey by detection and classification of tiny objects around wind turbines
The urgent need for renewable energy expansion, particularly wind power, is hindered by conflicts with wildlife conservation. To address this, we developed BirdRecorder, an advanced AI-based anti-collision system to protect endangered birds, especially the red kite (Milvus milvus). Integrating robotics, telemetry, and high-performance AI algorithms, BirdRecorder aims to detect, track, and classify avian species within a range of 800 m to minimize bird-turbine collisions. BirdRecorder integrates advanced AI methods with optimized hardware and software architectures to enable real-time image processing. Leveraging Single Shot Detector (SSD) for detection, combined with specialized hardware acceleration and tracking algorithms, our system achieves high detection precision while maintaining the speed necessary for real-time decision-making. By combining these components, BirdRecorder outperforms existing approaches in both accuracy and efficiency. In this paper, we summarize results on field tests and performance of the BirdRecorder system. By bridging the gap between renewable energy expansion and wildlife conservation, BirdRecorder contributes to a more sustainable coexistence of technology and nature.
comment: 18 pages, 1 figures, to appear in Proceedings of the 19th International Conference on Intelligent Autonomous Systems (IAS-19), Genoa, Italy, 2025
The Effects of Communication Delay on Human Performance and Neurocognitive Responses in Mobile Robot Teleoperation
Communication delays in mobile robot teleoperation adversely affect human-machine collaboration. Understanding delay effects on human operational performance and neurocognition is essential for resolving this issue. However, no previous research has explored this. To fill this gap, we conduct a human-in-the-loop experiment involving 10 participants, integrating electroencephalography (EEG) and robot behavior data under varying delays (0-500 ms in 100 ms increments) to systematically investigate these effects. Behavior analysis reveals significant performance degradation at 200-300 ms delays, affecting both task efficiency and accuracy. EEG analysis discovers features with significant delay dependence: frontal $\theta/\beta$-band and parietal $\alpha$-band power. We also identify a threshold window (100-200 ms) for early perception of delay in humans, during which these EEG features first exhibit significant differences. When delay exceeds 400 ms, all features plateau, indicating saturation of cognitive resource allocation at physiological limits. These findings provide the first evidence of perceptual and cognitive delay thresholds during teleoperation tasks in humans, offering critical neurocognitive insights for the design of delay compensation strategies.
Arnold: a generalist muscle transformer policy
Controlling high-dimensional and nonlinear musculoskeletal models of the human body is a foundational scientific challenge. Recent machine learning breakthroughs have heralded policies that master individual skills like reaching, object manipulation and locomotion in musculoskeletal systems with many degrees of freedom. However, these agents are merely "specialists", achieving high performance for a single skill. In this work, we develop Arnold, a generalist policy that masters multiple tasks and embodiments. Arnold combines behavior cloning and fine-tuning with PPO to achieve expert or super-expert performance in 14 challenging control tasks from dexterous object manipulation to locomotion. A key innovation is Arnold's sensorimotor vocabulary, a compositional representation of the semantics of heterogeneous sensory modalities, objectives, and actuators. Arnold leverages this vocabulary via a transformer architecture to deal with the variable observation and action spaces of each task. This framework supports efficient multi-task, multi-embodiment learning and facilitates rapid adaptation to novel tasks. Finally, we analyze Arnold to provide insights into biological motor control, corroborating recent findings on the limited transferability of muscle synergies across tasks.
comment: A.S.C. and B.A. contributed equally. Code is available at https://github.com/amathislab/arnold-the-generalist
Modeling and Control Framework for Autonomous Space Manipulator Handover Operations
Autonomous space robotics is poised to play a vital role in future space missions, particularly for In-space Servicing, Assembly, and Manufacturing (ISAM). A key capability in such missions is the Robot-to-Robot (R2R) handover of mission-critical objects. This work presents a dynamic model of a dual-arm space manipulator system and compares various tracking control laws. The key contributions of this work are the development of a cooperative manipulator dynamic model and the comparative analysis of control laws to support autonomous R2R handovers in ISAM scenarios.
comment: 14 pages, submitted to 2025 Astrodynamics Specialists Conference proceedings
No Need to Look! Locating and Grasping Objects by a Robot Arm Covered with Sensitive Skin ICRA 2026
Locating and grasping of objects by robots is typically performed using visual sensors. Haptic feedback from contacts with the environment is only secondary if present at all. In this work, we explored an extreme case of searching for and grasping objects in complete absence of visual input, relying on haptic feedback only. The main novelty lies in the use of contacts over the complete surface of a robot manipulator covered with sensitive skin. The search is divided into two phases: (1) coarse workspace exploration with the complete robot surface, followed by (2) precise localization using the end-effector equipped with a force/torque sensor. We systematically evaluated this method in simulation and on the real robot, demonstrating that diverse objects can be located, grasped, and put in a basket. The overall success rate on the real robot for one object was 85.7\% with failures mainly while grasping specific objects. The method using whole-body contacts is six times faster compared to a baseline that uses haptic feedback only on the end-effector. We also show locating and grasping multiple objects on the table. This method is not restricted to our specific setup and can be deployed on any platform with the ability of sensing contacts over the entire body surface. This work holds promise for diverse applications in areas with challenging visual perception (due to lighting, dust, smoke, occlusion) such as in agriculture when fruits or vegetables need to be located inside foliage and picked.
comment: Submitted for review to ICRA 2026
Integration of Computer Vision with Adaptive Control for Autonomous Driving Using ADORE
Ensuring safety in autonomous driving requires a seamless integration of perception and decision making under uncertain conditions. Although computer vision (CV) models such as YOLO achieve high accuracy in detecting traffic signs and obstacles, their performance degrades in drift scenarios caused by weather variations or unseen objects. This work presents a simulated autonomous driving system that combines a context aware CV model with adaptive control using the ADORE framework. The CARLA simulator was integrated with ADORE via the ROS bridge, allowing real-time communication between perception, decision, and control modules. A simulated test case was designed in both clear and drift weather conditions to demonstrate the robust detection performance of the perception model while ADORE successfully adapted vehicle behavior to speed limits and obstacles with low response latency. The findings highlight the potential of coupling deep learning-based perception with rule-based adaptive decision making to improve automotive safety critical system.
Neural Algorithmic Reasoners informed Large Language Model for Multi-Agent Path Finding IJCNN 2025
The development and application of large language models (LLM) have demonstrated that foundational models can be utilized to solve a wide array of tasks. However, their performance in multi-agent path finding (MAPF) tasks has been less than satisfactory, with only a few studies exploring this area. MAPF is a complex problem requiring both planning and multi-agent coordination. To improve the performance of LLM in MAPF tasks, we propose a novel framework, LLM-NAR, which leverages neural algorithmic reasoners (NAR) to inform LLM for MAPF. LLM-NAR consists of three key components: an LLM for MAPF, a pre-trained graph neural network-based NAR, and a cross-attention mechanism. This is the first work to propose using a neural algorithmic reasoner to integrate GNNs with the map information for MAPF, thereby guiding LLM to achieve superior performance. LLM-NAR can be easily adapted to various LLM models. Both simulation and real-world experiments demonstrate that our method significantly outperforms existing LLM-based approaches in solving MAPF problems.
comment: Accepted by IJCNN 2025
A holistic perception system of internal and external monitoring for ground autonomous vehicles: AutoTRUST paradigm
This paper introduces a holistic perception system for internal and external monitoring of autonomous vehicles, with the aim of demonstrating a novel AI-leveraged self-adaptive framework of advanced vehicle technologies and solutions that optimize perception and experience on-board. Internal monitoring system relies on a multi-camera setup designed for predicting and identifying driver and occupant behavior through facial recognition, exploiting in addition a large language model as virtual assistant. Moreover, the in-cabin monitoring system includes AI-empowered smart sensors that measure air-quality and perform thermal comfort analysis for efficient on and off-boarding. On the other hand, external monitoring system perceives the surrounding environment of vehicle, through a LiDAR-based cost-efficient semantic segmentation approach, that performs highly accurate and efficient super-resolution on low-quality raw 3D point clouds. The holistic perception framework is developed in the context of EU's Horizon Europe programm AutoTRUST, and has been integrated and deployed on a real electric vehicle provided by ALKE. Experimental validation and evaluation at the integration site of Joint Research Centre at Ispra, Italy, highlights increased performance and efficiency of the modular blocks of the proposed perception architecture.
Egocentric Instruction-oriented Affordance Prediction via Large Multimodal Model
Affordance is crucial for intelligent robots in the context of object manipulation. In this paper, we argue that affordance should be task-/instruction-dependent, which is overlooked by many previous works. That is, different instructions can lead to different manipulation regions and directions even for the same object. According to this observation, we present a new dataset comprising fifteen thousand object-instruction-affordance triplets. All scenes in the dataset are from an egocentric viewpoint, designed to approximate the perspective of a human-like robot. Furthermore, we investigate how to enable large multimodal models (LMMs) to serve as affordance predictors by implementing a ``search against verifiers'' pipeline. An LMM is asked to progressively predict affordances, with the output at each step being verified by itself during the iterative process, imitating a reasoning process. Experiments show that our method not only unlocks new instruction-oriented affordance prediction capabilities, but also achieves outstanding performance broadly.
Physical Embodiment Enables Information Processing Beyond Explicit Sensing in Active Matter
Living microorganisms have evolved dedicated sensory machinery to detect environmental perturbations, processing these signals through biochemical networks to guide behavior. Replicating such capabilities in synthetic active matter remains a fundamental challenge. Here, we demonstrate that synthetic active particles can adapt to hidden hydrodynamic perturbations through physical embodiment alone, without explicit sensing mechanisms. Using reinforcement learning to control self-thermophoretic particles, we show that they learn navigation strategies to counteract unobserved flow fields by exploiting information encoded in their physical dynamics. Remarkably, particles successfully navigate perturbations that are not included in their state inputs, revealing that embodied dynamics can serve as an implicit sensing mechanism. This discovery establishes physical embodiment as a computational resource for information processing in active matter, with implications for autonomous microrobotic systems and bio-inspired computation.
CubeDN: Real-time Drone Detection in 3D Space from Dual mmWave Radar Cubes
As drone use has become more widespread, there is a critical need to ensure safety and security. A key element of this is robust and accurate drone detection and localization. While cameras and other optical sensors like LiDAR are commonly used for object detection, their performance degrades under adverse lighting and environmental conditions. Therefore, this has generated interest in finding more reliable alternatives, such as millimeter-wave (mmWave) radar. Recent research on mmWave radar object detection has predominantly focused on 2D detection of road users. Although these systems demonstrate excellent performance for 2D problems, they lack the sensing capability to measure elevation, which is essential for 3D drone detection. To address this gap, we propose CubeDN, a single-stage end-to-end radar object detection network specifically designed for flying drones. CubeDN overcomes challenges such as poor elevation resolution by utilizing a dual radar configuration and a novel deep learning pipeline. It simultaneously detects, localizes, and classifies drones of two sizes, achieving decimeter-level tracking accuracy at closer ranges with overall $95\%$ average precision (AP) and $85\%$ average recall (AR). Furthermore, CubeDN completes data processing and inference at 10Hz, making it highly suitable for practical applications.
Effect of Performance Feedback Timing on Motor Learning for a Surgical Training Task
Objective: Robot-assisted minimally invasive surgery (RMIS) has become the gold standard for a variety of surgical procedures, but the optimal method of training surgeons for RMIS is unknown. We hypothesized that real-time, rather than post-task, error feedback would better increase learning speed and reduce errors. Methods: Forty-two surgical novices learned a virtual version of the ring-on-wire task, a canonical task in RMIS training. We investigated the impact of feedback timing with multi-sensory (haptic and visual) cues in three groups: (1) real-time error feedback, (2) trial replay with error feedback, and (3) no error feedback. Results: Participant performance was evaluated based on the accuracy of ring position and orientation during the task. Participants who received real-time feedback outperformed other groups in ring orientation. Additionally, participants who received feedback in replay outperformed participants who did not receive any error feedback on ring orientation during long, straight path sections. There were no significant differences between groups for ring position overall, but participants who received real-time feedback outperformed the other groups in positional accuracy on tightly curved path sections. Conclusion: The addition of real-time haptic and visual error feedback improves learning outcomes in a virtual surgical task over error feedback in replay or no error feedback at all. Significance: This work demonstrates that multi-sensory error feedback delivered in real time leads to better training outcomes as compared to the same feedback delivered after task completion. This novel method of training may enable surgical trainees to develop skills with greater speed and accuracy.
comment: Submitted to IEEE Transactions on Biomedical Engineering
Adaptive Output Steps: FlexiSteps Network for Dynamic Trajectory Prediction
Accurate trajectory prediction is vital for autonomous driving, robotics, and intelligent decision-making systems, yet traditional models typically rely on fixed-length output predictions, limiting their adaptability to dynamic real-world scenarios. In this paper, we introduce the FlexiSteps Network (FSN), a novel framework that dynamically adjusts prediction output time steps based on varying contextual conditions. Inspired by recent advancements addressing observation length discrepancies and dynamic feature extraction, FSN incorporates an pre-trained Adaptive Prediction Module (APM) to evaluate and adjust the output steps dynamically, ensuring optimal prediction accuracy and efficiency. To guarantee the plug-and-play of our FSN, we also design a Dynamic Decoder(DD). Additionally, to balance the prediction time steps and prediction accuracy, we design a scoring mechanism, which not only introduces the Fr\'echet distance to evaluate the geometric similarity between the predicted trajectories and the ground truth trajectories but the length of predicted steps is also considered. Extensive experiments conducted on benchmark datasets including Argoverse and INTERACTION demonstrate the effectiveness and flexibility of our proposed FSN framework.
Talking to Robots: A Practical Examination of Speech Foundation Models for HRI Applications
Automatic Speech Recognition (ASR) systems in real-world settings need to handle imperfect audio, often degraded by hardware limitations or environmental noise, while accommodating diverse user groups. In human-robot interaction (HRI), these challenges intersect to create a uniquely challenging recognition environment. We evaluate four state-of-the-art ASR systems on eight publicly available datasets that capture six dimensions of difficulty: domain-specific, accented, noisy, age-variant, impaired, and spontaneous speech. Our analysis demonstrates significant variations in performance, hallucination tendencies, and inherent biases, despite similar scores on standard benchmarks. These limitations have serious implications for HRI, where recognition errors can interfere with task performance, user trust, and safety.
comment: Accepted at the workshop on Foundation Models for Social Robotics (FoMoSR) at ICSR 2025
MEVITA: Open-Source Bipedal Robot Assembled from E-Commerce Components via Sheet Metal Welding
Various bipedal robots have been developed to date, and in recent years, there has been a growing trend toward releasing these robots as open-source platforms. This shift is fostering an environment in which anyone can freely develop bipedal robots and share their knowledge, rather than relying solely on commercial products. However, most existing open-source bipedal robots are designed to be fabricated using 3D printers, which limits their scalability in size and often results in fragile structures. On the other hand, some metal-based bipedal robots have been developed, but they typically involve a large number of components, making assembly difficult, and in some cases, the parts themselves are not readily available through e-commerce platforms. To address these issues, we developed MEVITA, an open-source bipedal robot that can be built entirely from components available via e-commerce. Aiming for the minimal viable configuration for a bipedal robot, we utilized sheet metal welding to integrate complex geometries into single parts, thereby significantly reducing the number of components and enabling easy assembly for anyone. Through reinforcement learning in simulation and Sim-to-Real transfer, we demonstrated robust walking behaviors across various environments, confirming the effectiveness of our approach. All hardware, software, and training environments can be obtained from https://github.com/haraduka/mevita .
comment: Accepted at IEEE-RAS Humanoids2025, Website - https://haraduka.github.io/mevita-hardware , YouTube - https://youtu.be/_akfHkCne0s
SEBVS: Synthetic Event-based Visual Servoing for Robot Navigation and Manipulation
Event cameras offer microsecond latency, high dynamic range, and low power consumption, making them ideal for real-time robotic perception under challenging conditions such as motion blur, occlusion, and illumination changes. However, despite their advantages, synthetic event-based vision remains largely unexplored in mainstream robotics simulators. This lack of simulation setup hinders the evaluation of event-driven approaches for robotic manipulation and navigation tasks. This work presents an open-source, user-friendly v2e robotics operating system (ROS) package for Gazebo simulation that enables seamless event stream generation from RGB camera feeds. The package is used to investigate event-based robotic policies (ERP) for real-time navigation and manipulation. Two representative scenarios are evaluated: (1) object following with a mobile robot and (2) object detection and grasping with a robotic manipulator. Transformer-based ERPs are trained by behavior cloning and compared to RGB-based counterparts under various operating conditions. Experimental results show that event-guided policies consistently deliver competitive advantages. The results highlight the potential of event-driven perception to improve real-time robotic navigation and manipulation, providing a foundation for broader integration of event cameras into robotic policy learning. The GitHub repo for the dataset and code: https://eventbasedvision.github.io/SEBVS/
GWM: Towards Scalable Gaussian World Models for Robotic Manipulation ICCV 2025
Training robot policies within a learned world model is trending due to the inefficiency of real-world interactions. The established image-based world models and policies have shown prior success, but lack robust geometric information that requires consistent spatial and physical understanding of the three-dimensional world, even pre-trained on internet-scale video sources. To this end, we propose a novel branch of world model named Gaussian World Model (GWM) for robotic manipulation, which reconstructs the future state by inferring the propagation of Gaussian primitives under the effect of robot actions. At its core is a latent Diffusion Transformer (DiT) combined with a 3D variational autoencoder, enabling fine-grained scene-level future state reconstruction with Gaussian Splatting. GWM can not only enhance the visual representation for imitation learning agent by self-supervised future prediction training, but can serve as a neural simulator that supports model-based reinforcement learning. Both simulated and real-world experiments depict that GWM can precisely predict future scenes conditioned on diverse robot actions, and can be further utilized to train policies that outperform the state-of-the-art by impressive margins, showcasing the initial data scaling potential of 3D world model.
comment: Published at ICCV 2025. Project page: https://gaussian-world-model.github.io/
Mimicking associative learning of rats via a neuromorphic robot in open field maze using spatial cell models
Data-driven Artificial Intelligence (AI) approaches have exhibited remarkable prowess across various cognitive tasks using extensive training data. However, the reliance on large datasets and neural networks presents challenges such as highpower consumption and limited adaptability, particularly in SWaP-constrained applications like planetary exploration. To address these issues, we propose enhancing the autonomous capabilities of intelligent robots by emulating the associative learning observed in animals. Associative learning enables animals to adapt to their environment by memorizing concurrent events. By replicating this mechanism, neuromorphic robots can navigate dynamic environments autonomously, learning from interactions to optimize performance. This paper explores the emulation of associative learning in rodents using neuromorphic robots within open-field maze environments, leveraging insights from spatial cells such as place and grid cells. By integrating these models, we aim to enable online associative learning for spatial tasks in real-time scenarios, bridging the gap between biological spatial cognition and robotics for advancements in autonomous systems.
PneuGelSight: Soft Robotic Vision-Based Proprioception and Tactile Sensing
Soft pneumatic robot manipulators are popular in industrial and human-interactive applications due to their compliance and flexibility. However, deploying them in real-world scenarios requires advanced sensing for tactile feedback and proprioception. Our work presents a novel vision-based approach for sensorizing soft robots. We demonstrate our approach on PneuGelSight, a pioneering pneumatic manipulator featuring high-resolution proprioception and tactile sensing via an embedded camera. To optimize the sensor's performance, we introduce a comprehensive pipeline that accurately simulates its optical and dynamic properties, facilitating a zero-shot knowledge transition from simulation to real-world applications. PneuGelSight and our sim-to-real pipeline provide a novel, easily implementable, and robust sensing methodology for soft robots, paving the way for the development of more advanced soft robots with enhanced sensory capabilities.
comment: 16 pages, 12 figures, International Journal of Robotics Research (accepted), 2025
Efficient task and path planning for maintenance automation using a robot system
The research and development of intelligent automation solutions is a ground-breaking point for the factory of the future. A promising and challenging mission is the use of autonomous robot systems to automate tasks in the field of maintenance. For this purpose, the robot system must be able to plan autonomously the different manipulation tasks and the corresponding paths. Basic requirements are the development of algorithms with a low computational complexity and the possibility to deal with environmental uncertainties. In this work, an approach is presented, which is especially suited to solve the problem of maintenance automation. For this purpose, offline data from CAD is combined with online data from an RGBD vision system via a probabilistic filter, to compensate uncertainties from offline data. For planning the different tasks, a method is explained, which use a symbolic description, founded on a novel sampling-based method to compute the disassembly space. For path planning we use global state-of-the art algorithms with a method that allows the adaption of the exploration stepsize in order to reduce the planning time. Every method is experimentally validated and discussed.
comment: 10 pages, 10 figures
Maintenance automation: methods for robotics manipulation planning and execution
Automating complex tasks using robotic systems requires skills for planning, control and execution. This paper proposes a complete robotic system for maintenance automation, which can automate disassembly and assembly operations under environmental uncertainties (e.g. deviations between prior plan information). The cognition of the robotic system is based on a planning approach (using CAD and RGBD data) and includes a method to interpret a symbolic plan and transform it to a set of executable robot instructions. The complete system is experimentally evaluated using real-world applications. This work shows the first step to transfer these theoretical results into a practical robotic solution.
comment: 11 pages, 12 figures
Mining the Long Tail: A Comparative Study of Data-Centric Criticality Metrics for Robust Offline Reinforcement Learning in Autonomous Motion Planning
Offline Reinforcement Learning (RL) presents a promising paradigm for training autonomous vehicle (AV) planning policies from large-scale, real-world driving logs. However, the extreme data imbalance in these logs, where mundane scenarios vastly outnumber rare "long-tail" events, leads to brittle and unsafe policies when using standard uniform data sampling. In this work, we address this challenge through a systematic, large-scale comparative study of data curation strategies designed to focus the learning process on information-rich samples. We investigate six distinct criticality weighting schemes which are categorized into three families: heuristic-based, uncertainty-based, and behavior-based. These are evaluated at two temporal scales, the individual timestep and the complete scenario. We train seven goal-conditioned Conservative Q-Learning (CQL) agents with a state-of-the-art, attention-based architecture and evaluate them in the high-fidelity Waymax simulator. Our results demonstrate that all data curation methods significantly outperform the baseline. Notably, data-driven curation using model uncertainty as a signal achieves the most significant safety improvements, reducing the collision rate by nearly three-fold (from 16.0% to 5.5%). Furthermore, we identify a clear trade-off where timestep-level weighting excels at reactive safety while scenario-level weighting improves long-horizon planning. Our work provides a comprehensive framework for data curation in Offline RL and underscores that intelligent, non-uniform sampling is a critical component for building safe and reliable autonomous agents.
VIN-NBV: A View Introspection Network for Next-Best-View Selection
Next Best View (NBV) algorithms aim to maximize 3D scene acquisition quality using minimal resources, e.g. number of acquisitions, time taken, or distance traversed. Prior methods often rely on coverage maximization as a proxy for reconstruction quality, but for complex scenes with occlusions and finer details, this is not always sufficient and leads to poor reconstructions. Our key insight is to train an acquisition policy that directly optimizes for reconstruction quality rather than just coverage. To achieve this, we introduce the View Introspection Network (VIN): a lightweight neural network that predicts the Relative Reconstruction Improvement (RRI) of a potential next viewpoint without making any new acquisitions. We use this network to power a simple, yet effective, sequential samplingbased greedy NBV policy. Our approach, VIN-NBV, generalizes to unseen object categories, operates without prior scene knowledge, is adaptable to resource constraints, and can handle occlusions. We show that our RRI fitness criterion leads to a ~30% gain in reconstruction quality over a coverage-based criterion using the same greedy strategy. Furthermore, VIN-NBV also outperforms deep reinforcement learning methods, Scan-RL and GenNBV, by ~40%.
comment: 9 pages, 9 figures, 2 tables. Reformat into two column. Additional experiments and results
MapleGrasp: Mask-guided Feature Pooling for Language-driven Efficient Robotic Grasping
Robotic manipulation of unseen objects via natural language commands remains challenging. Language driven robotic grasping (LDRG) predicts stable grasp poses from natural language queries and RGB-D images. We propose MapleGrasp, a novel framework that leverages mask-guided feature pooling for efficient vision-language driven grasping. Our two-stage training first predicts segmentation masks from CLIP-based vision-language features. The second stage pools features within these masks to generate pixel-level grasp predictions, improving efficiency, and reducing computation. Incorporating mask pooling results in a 7% improvement over prior approaches on the OCID-VLG benchmark. Furthermore, we introduce RefGraspNet, an open-source dataset eight times larger than existing alternatives, significantly enhancing model generalization for open-vocabulary grasping. MapleGrasp scores a strong grasping accuracy of 89\% when compared with competing methods in the RefGraspNet benchmark. Our method achieves comparable performance to larger Vision-Language-Action models on the LIBERO benchmark, and shows significantly better generalization to unseen tasks. Real-world experiments on a Franka arm demonstrate 73% success rate with unseen objects, surpassing competitive baselines by 11%. Code is provided in our github repository.
HOSt3R: Keypoint-free Hand-Object 3D Reconstruction from RGB images
Hand-object 3D reconstruction has become increasingly important for applications in human-robot interaction and immersive AR/VR experiences. A common approach for object-agnostic hand-object reconstruction from RGB sequences involves a two-stage pipeline: hand-object 3D tracking followed by multi-view 3D reconstruction. However, existing methods rely on keypoint detection techniques, such as Structure from Motion (SfM) and hand-keypoint optimization, which struggle with diverse object geometries, weak textures, and mutual hand-object occlusions, limiting scalability and generalization. As a key enabler to generic and seamless, non-intrusive applicability, we propose in this work a robust, keypoint detector-free approach to estimating hand-object 3D transformations from monocular motion video/images. We further integrate this with a multi-view reconstruction pipeline to accurately recover hand-object 3D shape. Our method, named HOSt3R, is unconstrained, does not rely on pre-scanned object templates or camera intrinsics, and reaches state-of-the-art performance for the tasks of object-agnostic hand-object 3D transformation and shape estimation on the SHOWMe benchmark. We also experiment on sequences from the HO3D dataset, demonstrating generalization to unseen object categories.
comment: 12 pages, 8 figures
3D Feature Distillation with Object-Centric Priors
Grounding natural language to the physical world is a ubiquitous topic with a wide range of applications in computer vision and robotics. Recently, 2D vision-language models such as CLIP have been widely popularized, due to their impressive capabilities for open-vocabulary grounding in 2D images. Recent works aim to elevate 2D CLIP features to 3D via feature distillation, but either learn neural fields that are scene-specific and hence lack generalization, or focus on indoor room scan data that require access to multiple camera views, which is not practical in robot manipulation scenarios. Additionally, related methods typically fuse features at pixel-level and assume that all camera views are equally informative. In this work, we show that this approach leads to sub-optimal 3D features, both in terms of grounding accuracy, as well as segmentation crispness. To alleviate this, we propose a multi-view feature fusion strategy that employs object-centric priors to eliminate uninformative views based on semantic information, and fuse features at object-level via instance segmentation masks. To distill our object-centric 3D features, we generate a large-scale synthetic multi-view dataset of cluttered tabletop scenes, spawning 15k scenes from over 3300 unique object instances, which we make publicly available. We show that our method reconstructs 3D CLIP features with improved grounding capacity and spatial consistency, while doing so from single-view RGB-D, thus departing from the assumption of multiple camera views at test time. Finally, we show that our approach can generalize to novel tabletop domains and be re-purposed for 3D instance segmentation without fine-tuning, and demonstrate its utility for language-guided robotic grasping in clutter.
CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation BMVC 2025
Vision foundation models (VFMs) such as DINO have led to a paradigm shift in 2D camera-based perception towards extracting generalized features to support many downstream tasks. Recent works introduce self-supervised cross-modal knowledge distillation (KD) as a way to transfer these powerful generalization capabilities into 3D LiDAR-based models. However, they either rely on highly complex distillation losses, pseudo-semantic maps, or limit KD to features useful for semantic segmentation only. In this work, we propose CleverDistiller, a self-supervised, cross-modal 2D-to-3D KD framework introducing a set of simple yet effective design choices: Unlike contrastive approaches relying on complex loss design choices, our method employs a direct feature similarity loss in combination with a multi layer perceptron (MLP) projection head to allow the 3D network to learn complex semantic dependencies throughout the projection. Crucially, our approach does not depend on pseudo-semantic maps, allowing for direct knowledge transfer from a VFM without explicit semantic supervision. Additionally, we introduce the auxiliary self-supervised spatial task of occupancy prediction to enhance the semantic knowledge, obtained from a VFM through KD, with 3D spatial reasoning capabilities. Experiments on standard autonomous driving benchmarks for 2D-to-3D KD demonstrate that CleverDistiller achieves state-of-the-art performance in both semantic segmentation and 3D object detection (3DOD) by up to 10% mIoU, especially when fine tuning on really low data amounts, showing the effectiveness of our simple yet powerful KD strategy
comment: Accepted to BMVC 2025
Practical Equivalence Testing and Its Application in Synthetic Pre-Crash Scenario Validation
The use of representative pre-crash scenarios is critical for assessing the safety impact of driving automation systems through simulation. However, a gap remains in the robust evaluation of the similarity between synthetic and real-world pre-crash scenarios and their crash characteristics. Without proper validation, it cannot be ensured that the synthetic test scenarios adequately represent real-world driving behaviors and crash characteristics. One reason for this validation gap is the lack of focus on methods to confirm that the synthetic test scenarios are practically equivalent to real-world ones, given the assessment scope. Traditional statistical methods, like significance testing, focus on detecting differences rather than establishing equivalence; since failure to detect a difference does not imply equivalence, they are of limited applicability for validating synthetic pre-crash scenarios and crash characteristics. This study addresses this gap by proposing an equivalence testing method based on the Bayesian Region of Practical Equivalence (ROPE) framework. This method is designed to assess the practical equivalence of scenario characteristics that are most relevant for the intended assessment, making it particularly appropriate for the domain of virtual safety assessments. We first review existing equivalence testing methods. Then we propose and demonstrate the Bayesian ROPE-based method by testing the equivalence of two rear-end pre-crash datasets. Our approach focuses on the most relevant scenario characteristics. Our analysis provides insights into the practicalities and effectiveness of equivalence testing in synthetic test scenario validation and demonstrates the importance of testing for improving the credibility of synthetic data for automated vehicle safety assessment, as well as the credibility of subsequent safety impact assessments.
A Multimodal Handover Failure Detection Dataset and Baselines ICRA 2024
An object handover between a robot and a human is a coordinated action which is prone to failure for reasons such as miscommunication, incorrect actions and unexpected object properties. Existing works on handover failure detection and prevention focus on preventing failures due to object slip or external disturbances. However, there is a lack of datasets and evaluation methods that consider unpreventable failures caused by the human participant. To address this deficit, we present the multimodal Handover Failure Detection dataset, which consists of failures induced by the human participant, such as ignoring the robot or not releasing the object. We also present two baseline methods for handover failure detection: (i) a video classification method using 3D CNNs and (ii) a temporal action segmentation approach which jointly classifies the human action, robot action and overall outcome of the action. The results show that video is an important modality, but using force-torque data and gripper position help improve failure detection and action segmentation accuracy.
comment: Accepted at ICRA 2024
Dexterous Contact-Rich Manipulation via the Contact Trust Region
What is a good local description of contact dynamics for contact-rich manipulation, and where can we trust this local description? While many approaches often rely on the Taylor approximation of dynamics with an ellipsoidal trust region, we argue that such approaches are fundamentally inconsistent with the unilateral nature of contact. As a remedy, we present the Contact Trust Region (CTR), which captures the unilateral nature of contact while remaining efficient for computation. With CTR, we first develop a Model-Predictive Control (MPC) algorithm capable of synthesizing local contact-rich plans. Then, we extend this capability to plan globally by stitching together local MPC plans, enabling efficient and dexterous contact-rich manipulation. To verify the performance of our method, we perform comprehensive evaluations, both in high-fidelity simulation and on hardware, on two contact-rich systems: a planar IiwaBimanual system and a 3D AllegroHand system. On both systems, our method offers a significantly lower-compute alternative to existing RL-based approaches to contact-rich manipulation. In particular, our Allegro in-hand manipulation policy, in the form of a roadmap, takes fewer than 10 minutes to build offline on a standard laptop using just its CPU, with online inference taking just a few seconds. Experiment data, video and code are available at ctr.theaiinstitute.com.
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation. While recent efforts in robotics have leveraged LLMs both for high-level and low-level planning, these approaches often face significant challenges, such as hallucinations in long-horizon tasks and limited adaptability due to the generation of plans in a single pass without real-time feedback. To address these limitations, we propose a novel multi-agent LLM framework, Multi-Agent Large Language Model for Manipulation (MALMM) that distributes high-level planning and low-level control code generation across specialized LLM agents, supervised by an additional agent that dynamically manages transitions. By incorporating observations from the environment after each step, our framework effectively handles intermediate failures and enables adaptive re-planning. Unlike existing methods, our approach does not rely on pre-trained skill policies or in-context learning examples and generalizes to a variety of new tasks. We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting, thereby overcoming key limitations of existing LLM-based manipulation methods.
comment: 48 pages
AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons
Scaling up robotic imitation learning for real-world applications requires efficient and scalable demonstration collection methods. While teleoperation is effective, it depends on costly and inflexible robot platforms. In-the-wild demonstrations offer a promising alternative, but existing collection devices have key limitations: handheld setups offer limited observational coverage, and whole-body systems often require fine-tuning with robot data due to domain gaps. To address these challenges, we present AirExo-2, a low-cost exoskeleton system for large-scale in-the-wild data collection, along with several adaptors that transform collected data into pseudo-robot demonstrations suitable for policy learning. We further introduce RISE-2, a generalizable imitation learning policy that fuses 3D spatial and 2D semantic perception for robust manipulations. Experiments show that RISE-2 outperforms prior state-of-the-art methods on both in-domain and generalization evaluations. Trained solely on adapted in-the-wild data produced by AirExo-2, the RISE-2 policy achieves comparable performance to the policy trained with teleoperated data, highlighting the effectiveness and potential of AirExo-2 for scalable and generalizable imitation learning.
comment: accepted to CoRL 2025
Mesh-Learner: Texturing Mesh with Spherical Harmonics IROS2025
In this paper, we present a 3D reconstruction and rendering framework termed Mesh-Learner that is natively compatible with traditional rasterization pipelines. It integrates mesh and spherical harmonic (SH) texture (i.e., texture filled with SH coefficients) into the learning process to learn each mesh s view-dependent radiance end-to-end. Images are rendered by interpolating surrounding SH Texels at each pixel s sampling point using a novel interpolation method. Conversely, gradients from each pixel are back-propagated to the related SH Texels in SH textures. Mesh-Learner exploits graphic features of rasterization pipeline (texture sampling, deferred rendering) to render, which makes Mesh-Learner naturally compatible with tools (e.g., Blender) and tasks (e.g., 3D reconstruction, scene rendering, reinforcement learning for robotics) that are based on rasterization pipelines. Our system can train vast, unlimited scenes because we transfer only the SH textures within the frustum to the GPU for training. At other times, the SH textures are stored in CPU RAM, which results in moderate GPU memory usage. The rendering results on interpolation and extrapolation sequences in the Replica and FAST-LIVO2 datasets achieve state-of-the-art performance compared to existing state-of-the-art methods (e.g., 3D Gaussian Splatting and M2-Mapping). To benefit the society, the code will be available at https://github.com/hku-mars/Mesh-Learner.
comment: IROS2025 Accepted
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI) and serves as a foundation for various applications (e.g., intelligent mechatronics systems, smart manufacturing) that bridge cyberspace and the physical world. Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) have attracted significant attention due to their remarkable perception, interaction, and reasoning capabilities, making them a promising architecture for embodied agents. In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI. Our analysis firstly navigates through the forefront of representative works of embodied robots and simulators, to fully understand the research focuses and their limitations. Then, we analyze four main research targets: 1) embodied perception, 2) embodied interaction, 3) embodied agent, and 4) sim-to-real adaptation, covering state-of-the-art methods, essential paradigms, and comprehensive datasets. Additionally, we explore the complexities of MLMs in virtual and real embodied agents, highlighting their significance in facilitating interactions in digital and physical environments. Finally, we summarize the challenges and limitations of embodied AI and discuss potential future directions. We hope this survey will serve as a foundational reference for the research community. The associated project can be found at https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List.
comment: The comprehensive review of Embodied AI. We also provide the resource repository for Embodied AI: https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List
RT-Cache: Training-Free Retrieval for Real-Time Manipulation
Real robots are expected to repeat the same behavior in new environments with very little new data, yet modern controllers either incur heavy per-step inference or require deployment-time fine-tuning. We propose RT-Cache, a training-free retrieval-as-control pipeline that caches diverse image action trajectories in a unified vector memory and, at test time, embeds the current frame to retrieve and replay multi-step snippets, replacing per-step model calls. A hierarchical search keeps lookups sub-second at million scale, shifting cost from compute to storage and enabling real-time control on modest GPUs. Across real-robot tasks and large open logs, RT-Cache achieves higher success and lower completion time than strong retrieval baselines (approximately x2 higher success and ~30% faster in our settings), and a single-episode anchoring study shows immediate adaptation to a more complex, contact-rich task without fine-tuning. RT-Cache turns experience into an append-only memory, offering a simple, scalable path to few-shot deployment today and a foundation for multimodal keys and optional integration with high-level policies. Project page: https://rt-cache.github.io/.
comment: 8 pages, 6 figures. 2025 IEEE-RAS 24th International Conference on Humanoid Robots
Hierarchical Object-Oriented POMDP Planning for Object Rearrangement
We present an online planning framework and a new benchmark dataset for solving multi-object rearrangement problems in partially observable, multi-room environments. Current object rearrangement solutions, primarily based on Reinforcement Learning or hand-coded planning methods, often lack adaptability to diverse challenges. To address this limitation, we introduce a novel Hierarchical Object-Oriented Partially Observed Markov Decision Process (HOO-POMDP) planning approach. This approach comprises of (a) an object-oriented POMDP planner generating sub-goals, (b) a set of low-level policies for sub-goal achievement, and (c) an abstraction system converting the continuous low-level world into a representation suitable for abstract planning. To enable rigorous evaluation of rearrangement challenges, we introduce MultiRoomR, a comprehensive benchmark featuring diverse multi-room environments with varying degrees of partial observability (10-30\% initial visibility), blocked paths, obstructed goals, and multiple objects (10-20) distributed across 2-4 rooms. Experiments demonstrate that our system effectively handles these complex scenarios while maintaining robust performance even with imperfect perception, achieving promising results across both existing benchmarks and our new MultiRoomR dataset.
comment: 21 pages, 3 Figures. Preprint. Added more information in Appendix
UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
Understanding fine-grained object affordances is imperative for robots to manipulate objects in unstructured environments given open-ended task instructions. However, existing methods of visual affordance predictions often rely on manually annotated data or conditions only on a predefined set of tasks. We introduce UAD (Unsupervised Affordance Distillation), a method for distilling affordance knowledge from foundation models into a task-conditioned affordance model without any manual annotations. By leveraging the complementary strengths of large vision models and vision-language models, UAD automatically annotates a large-scale dataset with detailed $<$instruction, visual affordance$>$ pairs. Training only a lightweight task-conditioned decoder atop frozen features, UAD exhibits notable generalization to in-the-wild robotic scenes and to various human activities, despite only being trained on rendered objects in simulation. Using affordance provided by UAD as the observation space, we show an imitation learning policy that demonstrates promising generalization to unseen object instances, object categories, and even variations in task instructions after training on as few as 10 demonstrations. Project website: https://unsup-affordance.github.io/
Early Failure Detection in Autonomous Surgical Soft-Tissue Manipulation via Uncertainty Quantification
Autonomous surgical robots are a promising solution to the increasing demand for surgery amid a shortage of surgeons. Recent work has proposed learning-based approaches for the autonomous manipulation of soft tissue. However, due to variability in tissue geometries and stiffnesses, these methods do not always perform optimally, especially in out-of-distribution settings. We propose, develop, and test the first application of uncertainty quantification to learned surgical soft-tissue manipulation policies as an early identification system for task failures. We analyze two different methods of uncertainty quantification, deep ensembles and Monte Carlo dropout, and find that deep ensembles provide a stronger signal of future task success or failure. We validate our approach using the physical daVinci Research Kit (dVRK) surgical robot to perform physical soft-tissue manipulation. We show that we are able to successfully detect out-of-distribution states leading to task failure and request human intervention when necessary while still enabling autonomous manipulation when possible. Our learned tissue manipulation policy with uncertainty-based early failure detection achieves a zero-shot sim2real performance improvement of 47.5% over the prior state of the art in learned soft-tissue manipulation. We also show that our method generalizes well to new types of tissue as well as to a bimanual soft-tissue manipulation task.
comment: 6 pages, 6 figures, Accepted to the 2025 RSS OOD Workshop
Bayesian Deep Learning for Segmentation for Autonomous Safe Planetary Landing
Hazard detection is critical for enabling autonomous landing on planetary surfaces. Current state-of-the-art methods leverage traditional computer vision approaches to automate the identification of safe terrain from input digital elevation models (DEMs). However, performance for these methods can degrade for input DEMs with increased sensor noise. In the last decade, deep learning techniques have been developed for various applications. Nevertheless, their applicability to safety-critical space missions has often been limited due to concerns regarding their outputs' reliability. In response to these limitations, this paper proposes an application of the Bayesian deep-learning segmentation method for hazard detection. The developed approach enables reliable, safe landing site detection by: (i) generating simultaneously a safety prediction map and its uncertainty map via Bayesian deep learning and semantic segmentation; and (ii) using the uncertainty map to filter out the uncertain pixels in the prediction map so that the safe site identification is performed only based on the certain pixels (i.e., pixels for which the model is certain about its safety prediction). Experiments are presented with simulated data based on a Mars HiRISE digital terrain model by varying uncertainty threshold and noise levels to demonstrate the performance of the proposed approach.
comment: 18 pages, 9 figures, Accepted by the AIAA Journal of Spacecraft and Rockets, revised from Paper AAS 21-253 presented at the AAS/AIAA Space Flight Mechanics Meeting in 2021
ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation
3D world models (i.e., learning-based 3D dynamics models) offer a promising approach to generalizable robotic manipulation by capturing the underlying physics of environment evolution conditioned on robot actions. However, existing 3D world models are primarily limited to single-material dynamics using a particle-based Graph Neural Network model, and often require time-consuming 3D scene reconstruction to obtain 3D particle tracks for training. In this work, we present ParticleFormer, a Transformer-based point cloud world model trained with a hybrid point cloud reconstruction loss, supervising both global and local dynamics features in multi-material, multi-object robot interactions. ParticleFormer captures fine-grained multi-object interactions between rigid, deformable, and flexible materials, trained directly from real-world robot perception data without an elaborate scene reconstruction. We demonstrate the model's effectiveness both in 3D scene forecasting tasks, and in downstream manipulation tasks using a Model Predictive Control (MPC) policy. In addition, we extend existing dynamics learning benchmarks to include diverse multi-material, multi-object interaction scenarios. We validate our method on six simulation and three real-world experiments, where it consistently outperforms leading baselines by achieving superior dynamics prediction accuracy and less rollout error in downstream visuomotor tasks. Experimental videos are available at https://suninghuang19.github.io/particleformer_page/.
Multiagent Systems
Scene-Aware Vectorized Memory Multi-Agent Framework with Cross-Modal Differentiated Quantization VLMs for Visually Impaired Assistance
This study proposes the dual technological innovation framework, including a cross-modal differ entiated quantization framework for vision-language models (VLMs) and a scene-aware vectorized memory multi-agent system for visually impaired assistance. The modular framework was developed implementing differentiated processing strategies, effectively reducing memory requirements from 38GB to 16GB while maintaining model performance. The multi-agent architecture combines scene classification, vectorized memory, and multimodal interaction, enabling persistent storage and efficient retrieval of scene memories. Through perception-memory-reasoning workflows, the system provides environmental information beyond the current view using historical memories. Experiments show the quantized 19B-parameter model only experiences a 2.05% performance drop on MMBench and maintains 63.7 accuracy on OCR-VQA (original: 64.9), outperforming smaller models with equivalent memory requirements like the Molmo-7B series. The system maintains response latency between 2.83-3.52 seconds from scene analysis to initial speech output, substantially faster than non-streaming methods. This research advances computational efficiency and assistive technology, offering visually impaired users comprehensive real-time assistance in scene perception, text recognition, and navigation.
comment: 28 pages,9 figures
Fair Cooperation in Mixed-Motive Games via Conflict-Aware Gradient Adjustment
Multi-agent reinforcement learning in mixed-motive settings presents a fundamental challenge: agents must balance individual interests with collective goals, which are neither fully aligned nor strictly opposed. To address this, reward restructuring methods such as gifting and intrinsic motivation have been proposed. However, these approaches primarily focus on promoting cooperation by managing the trade-off between individual and collective returns, without explicitly addressing fairness with respect to the agents' task-specific rewards. In this paper, we propose an adaptive conflict-aware gradient adjustment method that promotes cooperation while ensuring fairness in individual rewards. The proposed method dynamically balances policy gradients derived from individual and collective objectives in situations where the two objectives are in conflict. By explicitly resolving such conflicts, our method improves collective performance while preserving fairness across agents. We provide theoretical results that guarantee monotonic non-decreasing improvement in both the collective and individual objectives and ensure fairness. Empirical results in sequential social dilemma environments demonstrate that our approach outperforms baselines in terms of social welfare while ensuring fairness among agents.
Consistent Opponent Modeling of Static Opponents in Imperfect-Information Games
The goal of agents in multi-agent environments is to maximize total reward against the opposing agents that are encountered. Following a game-theoretic solution concept, such as Nash equilibrium, may obtain a strong performance in some settings; however, such approaches fail to capitalize on historical and observed data from repeated interactions against our opponents. Opponent modeling algorithms integrate machine learning techniques to exploit suboptimal opponents utilizing available data; however, the effectiveness of such approaches in imperfect-information games to date is quite limited. We show that existing opponent modeling approaches fail to satisfy a simple desirable property even against static opponents drawn from a known prior distribution; namely, they do not guarantee that the model approaches the opponent's true strategy even in the limit as the number of game iterations approaches infinity. We develop a new algorithm that is able to achieve this property and runs efficiently by solving a convex minimization problem based on the sequence-form game representation using projected gradient descent. The algorithm is guaranteed to efficiently converge to the opponent's true strategy given observations from gameplay and possibly additional historical data if it is available.
RubikSQL: Lifelong Learning Agentic Knowledge Base as an Industrial NL2SQL System VLDB 2026
We present RubikSQL, a novel NL2SQL system designed to address key challenges in real-world enterprise-level NL2SQL, such as implicit intents and domain-specific terminology. RubikSQL frames NL2SQL as a lifelong learning task, demanding both Knowledge Base (KB) maintenance and SQL generation. RubikSQL systematically builds and refines its KB through techniques including database profiling, structured information extraction, agentic rule mining, and Chain-of-Thought (CoT)-enhanced SQL profiling. RubikSQL then employs a multi-agent workflow to leverage this curated KB, generating accurate SQLs. RubikSQL achieves SOTA performance on both the KaggleDBQA and BIRD Mini-Dev datasets. Finally, we release the RubikBench benchmark, a new benchmark specifically designed to capture vital traits of industrial NL2SQL scenarios, providing a valuable resource for future research.
comment: 18 pages, 3 figures, 3 tables, to be submitted to VLDB 2026 (PVLDB Volume 19)
Electromagnetic Formation Flying Using Alternating Magnetic Field Forces and Control Barrier Functions for State and Input Constraints
This article presents a feedback control algorithm for electromagnetic formation flying with constraints on the satellites' states and control inputs. The algorithm combines several key techniques. First, we use alternating magnetic field forces to decouple the electromagnetic forces between each pair of satellites in the formation. Each satellite's electromagnetic actuation system is driven by a sum of amplitude-modulated sinusoids, where amplitudes are controlled in order to prescribe the time-averaged force between each pair of satellites. Next, the desired time-averaged force is computed from a optimal control that satisfies state constraints (i.e., no collisions and an upper limit on intersatellite speeds) and input constraints (i.e., not exceeding satellite's apparent power capability). The optimal time-averaged force is computed using a single relaxed control barrier function that is obtained by composing multiple control barrier functions that are designed to enforce each state and input constraint. Finally, we demonstrate the satellite formation control method in numerical simulations.
comment: Preprint submitted to IEEE Transactions on Aerospace and Electronic Systems (TAES)
Toward Generalized Autonomous Agents: A Neuro-Symbolic AI Framework for Integrating Social and Technical Support in Education
One of the enduring challenges in education is how to empower students to take ownership of their learning by setting meaningful goals, tracking their progress, and adapting their strategies when faced with setbacks. Research has shown that this form of leaner-centered learning is best cultivated through structured, supportive environments that promote guided practice, scaffolded inquiry, and collaborative dialogue. In response, educational efforts have increasingly embraced artificial-intelligence (AI)-powered digital learning environments, ranging from educational apps and virtual labs to serious games. Recent advances in large language models (LLMs) and neuro-symbolic systems, meanwhile, offer a transformative opportunity to reimagine how support is delivered in digital learning environments. LLMs are enabling socially interactive learning experiences and scalable, cross-domain learning support that can adapt instructional strategies across varied subjects and contexts. In parallel, neuro-symbolic AI provides new avenues for designing these agents that are not only adaptive but also scalable across domains. Based on these remarks, this paper presents a multi-agent, neuro-symbolic framework designed to resolve the aforementioned challenges. The framework assigns distinct pedagogical roles to specialized agents: an RL-based 'tutor' agent provides authoritative, non-verbal scaffolding, while a proactive, LLM-powered 'peer' agent facilitates the social dimensions of learning. While prior work has explored such agents in isolation, our framework's novelty lies in unifying them through a central educational ontology. Through case studies in both college-level and middle school settings, we demonstrate the framework's adaptability across domains. We conclude by outlining key insights and future directions for advancing AI-driven learning environments.
comment: Preprint. This work has been submitted to the IEEE for possible publication. In review for IEEE's Systems, Man, and Cybernetics Magazine. 8 pages, 3 figures. arxiv abstract has been shortened as the magazine format uses a long-form abstract
Evasive Active Hypothesis Testing with Deep Neuroevolution: The Single- and Multi-Agent Cases
Active hypothesis testing is a thoroughly studied problem that finds numerous applications in wireless communications and sensor networks. In this paper, we focus on one centralized and one decentralized problem of active hypothesis testing in the presence of an eavesdropper. For the centralized problem including a single legitimate agent, we present a new framework based on deep NeuroEvolution (NE), whereas, for the decentralized problem, we develop a novel NE-based method for solving collaborative multi-agent tasks, which, interestingly, maintains all computational benefits of our single-agent NE-based scheme. To further reduce the computational complexity of the latter scheme, a novel multi-agent joint NE and pruning framework is also designed. The superiority of the proposed NE-based evasive active hypothesis testing schemes over conventional active hypothesis testing policies, as well as learning-based methods, is validated through extensive numerical investigations in an example use case of anomaly detection over wireless sensor networks. It is demonstrated that the proposed joint optimization and pruning framework achieves nearly identical performance with its unpruned counterpart, while removing a very large percentage of redundant deep neural network weights.
comment: Under review at an IEEE journal, shorter conference version presented at IEEE ICC 2024
On Word-of-Mouth and Private-Prior Sequential Social Learning
Social learning constitutes a fundamental framework for studying interactions among rational agents who observe each other's actions but lack direct access to individual beliefs. This paper investigates a specific social learning paradigm known as Word-of-Mouth (WoM), where a series of agents seeks to estimate the state of a dynamical system. The first agent receives noisy measurements of the state, while each subsequent agent relies solely on a degraded version of her predecessor's estimate. A defining feature of WoM is that the final agent's belief is publicly broadcast and subsequently adopted by all agents, in place of their own. We analyze this setting theoretically and through numerical simulations, noting that some agents benefit from using the belief of the last agent, while others experience performance deterioration.
comment: Accepted for publication at the 64th Conference on Decision and Control (CDC)
USPR: Learning a Unified Solver for Profiled Routing
The Profiled Vehicle Routing Problem (PVRP) extends the classical VRP by incorporating vehicle-client-specific preferences and constraints, reflecting real-world requirements such as zone restrictions and service-level preferences. While recent reinforcement-learning solvers have shown promising performance, they require retraining for each new profile distribution, suffer from poor representation ability, and struggle to generalize to out-of-distribution instances. In this paper, we address these limitations by introducing Unified Solver for Profiled Routing (USPR), a novel framework that natively handles arbitrary profile types. USPR introduces on three key innovations: (i) Profile Embeddings (PE) to encode any combination of profile types; (ii) Multi-Head Profiled Attention (MHPA), an attention mechanism that models rich interactions between vehicles and clients; (iii) Profile-aware Score Reshaping (PSR), which dynamically adjusts decoder logits using profile scores to improve generalization. Empirical results on diverse PVRP benchmarks demonstrate that USPR achieves state-of-the-art results among learning-based methods while offering significant gains in flexibility and computational efficiency. We make our source code publicly available to foster future research.
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI) and serves as a foundation for various applications (e.g., intelligent mechatronics systems, smart manufacturing) that bridge cyberspace and the physical world. Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) have attracted significant attention due to their remarkable perception, interaction, and reasoning capabilities, making them a promising architecture for embodied agents. In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI. Our analysis firstly navigates through the forefront of representative works of embodied robots and simulators, to fully understand the research focuses and their limitations. Then, we analyze four main research targets: 1) embodied perception, 2) embodied interaction, 3) embodied agent, and 4) sim-to-real adaptation, covering state-of-the-art methods, essential paradigms, and comprehensive datasets. Additionally, we explore the complexities of MLMs in virtual and real embodied agents, highlighting their significance in facilitating interactions in digital and physical environments. Finally, we summarize the challenges and limitations of embodied AI and discuss potential future directions. We hope this survey will serve as a foundational reference for the research community. The associated project can be found at https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List.
comment: The comprehensive review of Embodied AI. We also provide the resource repository for Embodied AI: https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List
LLM-Based Agents for Competitive Landscape Mapping in Drug Asset Due Diligence
In this paper, we describe and benchmark a competitor-discovery component used within an agentic AI system for fast drug asset due diligence. A competitor-discovery AI agent, given an indication, retrieves all drugs comprising the competitive landscape of that indication and extracts canonical attributes for these drugs. The competitor definition is investor-specific, and data is paywalled/licensed, fragmented across registries, ontology-mismatched by indication, alias-heavy for drug names, multimodal, and rapidly changing. Although considered the best tool for this problem, the current LLM-based AI systems aren't capable of reliably retrieving all competing drug names, and there is no accepted public benchmark for this task. To address the lack of evaluation, we use LLM-based agents to transform five years of multi-modal, unstructured diligence memos from a private biotech VC fund into a structured evaluation corpus mapping indications to competitor drugs with normalized attributes. We also introduce a competitor validating LLM-as-a-judge agent that filters out false positives from the list of predicted competitors to maximize precision and suppress hallucinations. On this benchmark, our competitor-discovery agent achieves 83% recall, exceeding OpenAI Deep Research (65%) and Perplexity Labs (60%). The system is deployed in production with enterprise users; in a case study with a biotech VC investment fund, analyst turnaround time dropped from 2.5 days to $\sim$3 hours ($\sim$20x) for the competitive analysis.
Conformal Data-driven Control of Stochastic Multi-Agent Systems under Collaborative Signal Temporal Logic Specifications
We address control synthesis of stochastic discrete-time linear multi-agent systems under jointly chance-constrained collaborative signal temporal logic specifications in a distribution-free manner using available disturbance samples, which are partitioned into training and calibration sets. Leveraging linearity, we decompose each agent's system into deterministic nominal and stochastic error parts, and design disturbance feedback controllers to bound the stochastic errors by solving a tractable optimization problem over the training data. We then quantify prediction regions (PRs) for the aggregate error trajectories corresponding to agent cliques, involved in collaborative tasks, using conformal prediction and calibration data. This enables us to address the specified joint chance constraint via Lipschitz tightening and the computed PRs, and relax the centralized stochastic optimal control problem to a deterministic one, whose solution provides the feedforward inputs. To enhance scalability, we decompose the deterministic problem into agent-level subproblems solved in an MPC fashion, yielding a distributed control policy. Finally, we present an illustrative example and a comparison with [1].
comment: 7 pages, 2 figures, Accepted for presentation at the 64th IEEE Conference on Decision and Control (CDC2025)
Systems and Control (CS)
Flight-Ready Precise and Robust Carrier-Phase GNSS Navigation Software for Distributed Space Systems
This paper presents the full requirements analysis, design, development, and testing of high-precision navigation flight software for Distributed Space Systems (DSS) using Carrier Phase Differential GNSS (CDGNSS). Five main contributions are made. First, a survey of flown and upcoming DSS missions with stringent precision requirements is conducted, from which a thorough requirements analysis is distilled to guide development and testing. Second, a real-time navigation functional architecture is designed, and adopts a sparse and regularized Consider Kalman Filter with options for numerical stability in-flight. The filter rigorously accounts for uncertainties in process noise, measurement noise, and biases. It tracks float ambiguities with integer resolution where possible. The covariance correlation structure is preserved under all navigation modes, including contingencies and outages. Third, a lightweight, memoryless Fault Detection, Isolation, and Recovery (FDIR) module is developed to guard against anomalous measurements, providing statistical screening and ensuring robust navigation. Fourth, the software architecture is proposed for ease of integration, with strategies presented for modularity and computational efficiency tailored to constrained flight systems. Fifth, a comprehensive test campaign is conducted, mapped to a requirements verification matrix, spanning unit, interface, software-in-the-loop, and real-time hardware-in-the-loop tests, emphasizing gradual test fidelity for efficient fault isolation. Finally, flight-like results are demonstrated using the VISORS mission, due to the generalizability of the VISORS navigation operations, and the stringency which demands sub-centimeter relative position and sub-millimeter-per-second velocity accuracy. This architecture aims to serve as a reference for next-generation DSS missions adopting CDGNSS.
AI Data Centers Need Pioneers to Deliver Scalable Power via Offgrid AI
The scalable computing revolution of the late '80s through mid- '00s forged a new technical and economic model for computing that delivered massive societal impact, but its economic benefit has driven scalability to sizes that are now exhausting the energy grid's capacity. Our time demands a new revolution in scalable energy, mirroring in key ways the scalable computing revolution; e.g., compelling economic forces, use of mass-market components, overcoming foibles of those components, judicious use of physical locality, and the the difficult integration into an effective system. The offgrid AI approach closely fits this mold, combining local mostly renewable generation and storage to power an AI data center, starting offgrid. Obstacles to delivering this approach are social, technical, and project, but the potential is massive. I argue that the offgrid-AI approach needs pioneers among both system developers and AI-data-center operators to move it quickly from concept to large-scale deployment.
Tractable Stochastic Hybrid Model Predictive Control using Gaussian Processes for Repetitive Tasks in Unseen Environments
Improving the predictive accuracy of a dynamics model is crucial to obtaining good control performance and safety from Model Predictive Controllers (MPC). One approach involves learning unmodelled (residual) dynamics, in addition to nominal models derived from first principles. Varying residual models across an environment manifest as modes of a piecewise residual (PWR) model that requires a) identifying how modes are distributed across the environment and b) solving a computationally intensive Mixed Integer Nonlinear Program (MINLP) problem for control. We develop an iterative mapping algorithm capable of predicting time-varying mode distributions. We then develop and solve two tractable approximations of the MINLP to combine with the predictor in closed-loop to solve the overall control problem. In simulation, we first demonstrate how the approximations improve performance by 4-18% in comparison to the MINLP while achieving significantly lower computation times (upto 250x faster). We then demonstrate how the proposed mapping algorithm incrementally improves controller performance (upto 3x) over multiple iterations of a trajectory tracking control task even when the mode distributions change over time.
comment: European Control Conference (ECC) 2025
On Asymptotic Analysis of the Two-Stage Approach: Towards Data-Driven Parameter Estimation
In this paper, we analyze the asymptotic properties of the Two-Stage (TS) estimator -- a simulation-based parameter estimation method that constructs estimators offline from synthetic data. While TS offers significant computational advantages compared to standard approaches to estimation, its statistical properties have not been previously analyzed in the literature. Under simple assumptions, we establish that the TS estimator is strongly consistent and asymptotically normal, providing the first theoretical guarantees for this class of estimators.
comment: 11 pages, 4 figures
BirdRecorder's AI on Sky: Safeguarding birds of prey by detection and classification of tiny objects around wind turbines
The urgent need for renewable energy expansion, particularly wind power, is hindered by conflicts with wildlife conservation. To address this, we developed BirdRecorder, an advanced AI-based anti-collision system to protect endangered birds, especially the red kite (Milvus milvus). Integrating robotics, telemetry, and high-performance AI algorithms, BirdRecorder aims to detect, track, and classify avian species within a range of 800 m to minimize bird-turbine collisions. BirdRecorder integrates advanced AI methods with optimized hardware and software architectures to enable real-time image processing. Leveraging Single Shot Detector (SSD) for detection, combined with specialized hardware acceleration and tracking algorithms, our system achieves high detection precision while maintaining the speed necessary for real-time decision-making. By combining these components, BirdRecorder outperforms existing approaches in both accuracy and efficiency. In this paper, we summarize results on field tests and performance of the BirdRecorder system. By bridging the gap between renewable energy expansion and wildlife conservation, BirdRecorder contributes to a more sustainable coexistence of technology and nature.
comment: 18 pages, 1 figures, to appear in Proceedings of the 19th International Conference on Intelligent Autonomous Systems (IAS-19), Genoa, Italy, 2025
Realizing Reduced and Sparse Biochemical Reaction Networks from Dynamics
We propose a direct optimization framework for learning reduced and sparse chemical reaction networks (CRNs) from time-series trajectory data. In contrast to widely used indirect methods-such as those based on sparse identification of nonlinear dynamics (SINDy)-which infer reaction dynamics by fitting numerically estimated derivatives, our approach fits entire trajectories by solving a dynamically constrained optimization problem. This formulation enables the construction of reduced CRNs that are both low-dimensional and sparse, while preserving key dynamical behaviors of the original system. We develop an accelerated proximal gradient algorithm to efficiently solve the resulting non-convex optimization problem. Through illustrative examples, including a Drosophila circadian oscillator and a glycolytic oscillator, we demonstrate the ability of our method to recover accurate and interpretable reduced-order CRNs. Notably, the direct approach avoids the derivative estimation step and mitigates error accumulation issues inherent in indirect methods, making it a robust alternative for data-driven CRN realizations.
comment: Accepted to IEEE CDC 2025. Author-accepted version; supplementary material in appendix file
Modeling and Control Framework for Autonomous Space Manipulator Handover Operations
Autonomous space robotics is poised to play a vital role in future space missions, particularly for In-space Servicing, Assembly, and Manufacturing (ISAM). A key capability in such missions is the Robot-to-Robot (R2R) handover of mission-critical objects. This work presents a dynamic model of a dual-arm space manipulator system and compares various tracking control laws. The key contributions of this work are the development of a cooperative manipulator dynamic model and the comparative analysis of control laws to support autonomous R2R handovers in ISAM scenarios.
comment: 14 pages, submitted to 2025 Astrodynamics Specialists Conference proceedings
AQ-PCDSys: An Adaptive Quantized Planetary Crater Detection System for Autonomous Space Exploration
Autonomous planetary exploration missions are critically dependent on real-time, accurate environmental perception for navigation and hazard avoidance. However, deploying deep learning models on the resource-constrained computational hardware of planetary exploration platforms remains a significant challenge. This paper introduces the Adaptive Quantized Planetary Crater Detection System (AQ-PCDSys), a novel framework specifically engineered for real-time, onboard deployment in the computationally constrained environments of space exploration missions. AQ-PCDSys synergistically integrates a Quantized Neural Network (QNN) architecture, trained using Quantization-Aware Training (QAT), with an Adaptive Multi-Sensor Fusion (AMF) module. The QNN architecture significantly optimizes model size and inference latency suitable for real-time onboard deployment in space exploration missions, while preserving high accuracy. The AMF module intelligently fuses data from Optical Imagery (OI) and Digital Elevation Models (DEMs) at the feature level, utilizing an Adaptive Weighting Mechanism (AWM) to dynamically prioritize the most relevant and reliable sensor modality based on planetary ambient conditions. This approach enhances detection robustness across diverse planetary landscapes. Paired with Multi-Scale Detection Heads specifically designed for robust and efficient detection of craters across a wide range of sizes, AQ-PCDSys provides a computationally efficient, reliable and accurate solution for planetary crater detection, a critical capability for enabling the next generation of autonomous planetary landing, navigation, and scientific exploration.
comment: 17 pages, 6 figures. A research paper on a novel deep learning framework for planetary crater detection
modelSolver: A Symbolic Model-Driven Solver for Power Network Simulation and Monitoring
The development of advanced software tools for power system analysis requires extensive programming expertise. Even when using open-source tools, programming skills are essential to modify built-in models. This can be particularly challenging for domain experts who lack coding proficiency. This paper introduces modelSolver, a software solution with a new framework centered around symbolic mathematical modeling. The proposed paradigm facilitates defining models through intuitive mathematical expressions, thus eliminating the need for traditional programming constructs such as arrays, loops, and sparse matrix computations. The modelSolver focuses on power flow and state estimation using an open-box approach, which allows users to specify custom models using either real or complex variables. Unlike existing tools that rely on hard-coded models, modelSolver enables the representation of a wide range of advanced functionalities, including power flow with voltage regulators and load tap changers, continuation power flow, and Gauss-Newton state estimation with equality constraints. Compatibility with MATPOWER is ensured via a converter that automates importing data files. The framework prioritizes model-driven development and empowers domain experts to focus on power system modeling without programming barriers. It aims to simplify power system computations, making them more accessible to students, scientists, and practitioners.
A Predictive Framework for Adversarial Energy Depletion in Inbound Threat Scenarios
This paper presents a predictive framework for adversarial energy-depletion defense against a maneuverable inbound threat (IT). The IT solves a receding-horizon problem to minimize its own energy while reaching a high-value asset (HVA) and avoiding interceptors and static lethal zones modeled by Gaussian barriers. Expendable interceptors (EIs), coordinated by a central node (CN), maintain proximity to the HVA and patrol centers via radius-based tether costs, deny attack corridors by harassing and containing the IT, and commit to intercept only when a geometric feasibility test is confirmed. No explicit opponent-energy term is used, and the formulation is optimization-implementable. No simulations are included.
comment: 7 pages, 1 figure, 1 table, preprint submitted to the American Control Conference (ACC) 2026
Linear Power System Modeling and Analysis Across Wide Operating Ranges: A Hierarchical Neural State-Space Equation Approach
Developing a unified small-signal model for modern, large-scale power systems that remains accurate across a wide range of operating ranges presents a formidable challenge. Traditional methods, spanning mechanistic modeling, modal identification, and deep learning, have yet to fully overcome persistent limitations in accuracy, universal applicability, and interpretability. In this paper, a novel hierarchical neural state-space equation approach is proposed to overcome these obstacles, achieving strong representation, high interpretability, and superior adaptability to both system scale and varying operating points. Specifically, we first introduce neural state-space equations integrated with virtual state observers to accurately characterize the dynamics of power system devices, even in the presence of unmeasurable states. Subsequently, a hierarchical architecture is designed to handle the modeling complexity across a wide range of operating conditions, flexibly decoupling device and grid models to effectively mitigate the curse of dimensionality. Finally, a set of spatiotemporal data transformations and a multi-stage training strategy with a multi-objective loss function is employed to enhance the models efficiency and generalization. Numerical results on the two-machine three-bus system and the Guangdong Power Grid verify the superior performance of the proposed method, presenting it as a powerful new tool for small-signal stability analysis.
comment: 10 pages, 5 figures
A Comprehensive Incremental and Ensemble Learning Approach for Forecasting Individual Electric Vehicle Charging Parameters
Electric vehicles (EVs) have the potential to reduce grid stress through smart charging strategies while simultaneously meeting user demand. This requires accurate forecasts of key charging parameters, such as energy demand and connection time. Although previous studies have made progress in this area, they have overlooked the importance of dynamic training to capture recent patterns and have excluded EV sessions with limited information, missing potential opportunities to use these data. To address these limitations, this study proposes a dual-model approach incorporating incremental learning with six machine-learning models to predict EV charging session parameters. This approach includes dynamic training updates, optimal features, and hyperparameter set selection for each model to make it more robust and inclusive. Using a data set of 170,000 measurements from the real world electric vehicle session, week-long charging parameters were predicted over a one-year period. The findings reveal a significant difference between workplace and residential charging locations regarding connection duration predictability, with workplace sessions being more predictable. The proposed stacking ensemble learning method enhanced forecasting accuracy, improving R2 by 2.83% to 43.44% across all parameters and location settings. A comparison of the two models reveals that incorporating user IDs as a feature, along with the associated historical data, is the most significant factor influencing the accuracy of the forecast. Forecasts can be used effectively in smart charging and grid management applications by incorporating uncertainty quantification techniques, allowing charge point operators to optimize charging schedules and energy management.
Multiple STAR-RISs-Empowered Multi-User Communications with Diversified QoS Provisioning
This paper proposes a quality-of-service (QoS)-aware multi-user communication framework facilitated by multiple simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs). The user groups are established based on their QoS requirements specified by the minimum data rate, which is provisioned by the optimized transmission and reflection configurations of the STAR-RISs. Particularly, we formulate an optimization problem to maximize the aggregate link rate across all users, under group-specified rate requirements by jointly considering the transmit beamforming and STAR-RIS configurations. Then, we employ the Lagrangian duality with quadratic transformation to tackle the non-convexity of the objective. We decompose the problem within a block coordinate descent framework, and the subproblems are solved through convex approximation and iterated to approach the optimal solution. Simulation results demonstrate the effectiveness of the proposed method in enhancing the system sum rate with guaranteed QoS performance for heterogeneous users, offering valuable insights for the deployment of STAR-RISs in future QoS-aware wireless networks.
SuperGen: An Efficient Ultra-high-resolution Video Generation System with Sketching and Tiling
Diffusion models have recently achieved remarkable success in generative tasks (e.g., image and video generation), and the demand for high-quality content (e.g., 2K/4K videos) is rapidly increasing across various domains. However, generating ultra-high-resolution videos on existing standard-resolution (e.g., 720p) platforms remains challenging due to the excessive re-training requirements and prohibitively high computational and memory costs. To this end, we introduce SuperGen, an efficient tile-based framework for ultra-high-resolution video generation. SuperGen features a novel training-free algorithmic innovation with tiling to successfully support a wide range of resolutions without additional training efforts while significantly reducing both memory footprint and computational complexity. Moreover, SuperGen incorporates a tile-tailored, adaptive, region-aware caching strategy that accelerates video generation by exploiting redundancy across denoising steps and spatial regions. SuperGen also integrates cache-guided, communication-minimized tile parallelism for enhanced throughput and minimized latency. Evaluations demonstrate that SuperGen harvests the maximum performance gains while achieving high output quality across various benchmarks.
Deception in Asymmetric Information Homicidal Chauffeur Game
The classic Homicidal Chauffeur game is a pursuit-evasion game played in an unbounded planar environment between a pursuer constrained to move with fixed speed on curves with bounded curvature, and a slower evader with fixed speed but with simple kinematics. We introduce a new variant of this game with asymmetric information in which the evader has the ability to choose its speed among a finite set of choices that is unknown to the pursuer a priori. Therefore the pursuer is required to estimate the evader's maximum speed based on the observations so far. This formulation leads to the question of whether the evader can exploit this asymmetry by moving deceptively by first picking a slower speed to move with and then switching to a faster speed when a specified relative configuration is attained to increase the capture time as compared to moving with the maximum speed at all times. Our contributions are as follows. First, we derive optimal feedback Nash equilibrium strategies for the complete information case of this game in which the evader is allowed to vary its speed in a given interval. Second, for the version with asymmetric information, we characterize regions of initial player locations in the game space from which the evader does not have any advantage in using deceptive strategies. Finally, we provide numerical evidence of regions in the game space from which the evader can increase the capture time by moving deceptively.
Fast RLS Identification Leveraging the Linearized System Sparsity: Predictive Cost Adaptive Control for Quadrotors
This paper presents a centralized predictive cost adaptive control (PCAC) strategy for the position and attitude control of quadrotors. PCAC is an optimal, prediction-based control method that uses recursive least squares (RLS) to identify model parameters online, enabling adaptability in dynamic environments. Addressing challenges with black-box approaches in systems with complex couplings and fast dynamics, this study leverages the unique sparsity of quadrotor models linearized around hover points. By identifying only essential parameters related to nonlinear couplings and dynamics, this approach reduces the number of parameters to estimate, accelerates identification, and enhances stability during transients. Furthermore, the proposed control scheme removes the need for an attitude setpoint, typically required in conventional cascaded control designs.
comment: 6 pages, 3 figures, American Control Conference (ACC) 2026 preprint
Reformulations of Quadratic Programs for Lipschitz Continuity
Optimization-based controllers often lack regularity guarantees, such as Lipschitz continuity, when multiple constraints are present. When used to control a dynamical system, these conditions are essential to ensure the existence and uniqueness of the system's trajectory. Here we propose a general method to convert a Quadratic Program (QP) into a Second-Order Cone Problem (SOCP), which is shown to be Lipschitz continuous. Key features of our approach are that (i) the regularity of the resulting formulation does not depend on the structural properties of the constraints, such as the linear independence of their gradients; and (ii) it admits a closed-form solution, which is not available for general QPs with multiple constraints, enabling faster computation. We support our method with rigorous analysis and examples.
comment: Submitted to IEEE Control Systems Letters (L-CSS)
Electromagnetic Formation Flying Using Alternating Magnetic Field Forces and Control Barrier Functions for State and Input Constraints
This article presents a feedback control algorithm for electromagnetic formation flying with constraints on the satellites' states and control inputs. The algorithm combines several key techniques. First, we use alternating magnetic field forces to decouple the electromagnetic forces between each pair of satellites in the formation. Each satellite's electromagnetic actuation system is driven by a sum of amplitude-modulated sinusoids, where amplitudes are controlled in order to prescribe the time-averaged force between each pair of satellites. Next, the desired time-averaged force is computed from a optimal control that satisfies state constraints (i.e., no collisions and an upper limit on intersatellite speeds) and input constraints (i.e., not exceeding satellite's apparent power capability). The optimal time-averaged force is computed using a single relaxed control barrier function that is obtained by composing multiple control barrier functions that are designed to enforce each state and input constraint. Finally, we demonstrate the satellite formation control method in numerical simulations.
comment: Preprint submitted to IEEE Transactions on Aerospace and Electronic Systems (TAES)
A Learning-based Hybrid System Approach for Detecting Contingencies in Distribution Grids with Inverter-Based Resources
This paper presents a machine-learning based Stochastic Hybrid System (SHS) modeling framework to detect contingencies in active distribution networks populated with inverter-based resources (IBRs). In particular, this framework allows detecting unobservable contingencies, which cannot be identified by normal sensing systems. First, a state-space SHS model combining conventional and IRB-based resources is introduced to formulate the dynamic interaction between continuous states of distribution networks and discrete contingency events. This model forms a randomly switching system, where parameters or network topology can change due to contingencies. We consider two contingency classes: (i) physical events, such as line outages, and (ii) measurement anomalies caused by sensor faults. Leveraging multivariate time series data derived from high-frequency sampling of system states and network outputs, a time series-based learning model is trained for real-time contingency detection and classification. Simulation studies, carried out on the IEEE 33-bus distribution system, demonstrate a 96% overall detection accuracy.
comment: 6 pages, 6 figures
Fast Multiagent Formation Stabilization with Sparse Universally Rigid Frameworks
Affine formation control (AFC) is a distributed networked control system that has recently received increasing attention in various applications. AFC is typically achieved using a generalized consensus system where the stress matrix, which encodes the graph structure, is used instead of a graph Laplacian. Universally rigid frameworks (URFs) guarantee the existence of the stress matrix and have thus become the guideline for such a network design. In this work, we propose a convex optimization framework to design the stress matrix for AFC without predefining a rigid graph. We aim to find a resulting network with a reduced number of communication links, but still with a fast convergence speed. We show through simulations that our proposed solutions can yield a more sparse graph, while admitting a faster convergence compared to the state-of-the-art solutions.
Mimicking associative learning of rats via a neuromorphic robot in open field maze using spatial cell models
Data-driven Artificial Intelligence (AI) approaches have exhibited remarkable prowess across various cognitive tasks using extensive training data. However, the reliance on large datasets and neural networks presents challenges such as highpower consumption and limited adaptability, particularly in SWaP-constrained applications like planetary exploration. To address these issues, we propose enhancing the autonomous capabilities of intelligent robots by emulating the associative learning observed in animals. Associative learning enables animals to adapt to their environment by memorizing concurrent events. By replicating this mechanism, neuromorphic robots can navigate dynamic environments autonomously, learning from interactions to optimize performance. This paper explores the emulation of associative learning in rodents using neuromorphic robots within open-field maze environments, leveraging insights from spatial cells such as place and grid cells. By integrating these models, we aim to enable online associative learning for spatial tasks in real-time scenarios, bridging the gap between biological spatial cognition and robotics for advancements in autonomous systems.
Engineering a Digital Twin for the Monitoring and Control of Beer Fermentation Sampling
Successfully engineering interactive industrial DTs is a complex task, especially when implementing services beyond passive monitoring. We present here an experience report on engineering a safety-critical digital twin (DT) for beer fermentation monitoring, which provides continual sampling and reduces manual sampling time by 91%. We document our systematic methodology and practical solutions for implementing bidirectional DTs in industrial environments. This includes our three-phase engineering approach that transforms a passive monitoring system into an interactive Type 2 DT with real-time control capabilities for pressurized systems operating at seven bar. We contribute details of multi-layered safety protocols, hardware-software integration strategies across Arduino controllers and Unity visualization, and real-time synchronization solutions. We document specific engineering challenges and solutions spanning interdisciplinary integration, demonstrating how our use of the constellation reporting framework facilitates cross-domain collaboration. Key findings include the critical importance of safety-first design, simulation-driven development, and progressive implementation strategies. Our work thus provides actionable guidance for practitioners developing DTs requiring bidirectional control in safety-critical applications.
comment: Accepted for EDTconf 2025
DTInsight: A Tool for Explicit, Interactive, and Continuous Digital Twin Reporting
With Digital Twin (DT) construction and evolution occurring over time, stakeholders require tools to understand the current characteristics and conceptual architecture of the system at any time. We introduce DTInsight, a systematic and automated tool and methodology for producing continuous reporting for DTs. DTInsight offers three key features: (a) an interactive conceptual architecture visualization of DTs; (b) generation of summaries of DT characteristics based on ontological data; and (c) integration of these outputs into a reporting page within a continuous integration and continuous deployment (CI/CD) pipeline. Given a modeled description of the DT aligning to our DT Description Framework (DTDF), DTInsight enables up-to-date and detailed reports for enhanced stakeholder understanding.
SNIC bifurcation and its Application to MEMS ICME 2018
This project focuses on a method to extract a frequency comb in mechanical means, for general interest and numerous practical applications in MEMS. The method of execution is the implementation of a beam that is exhibiting non-linear dynamics that is perturbed and analyzed for its transverse vibrations. The perturbation is an external harmonic driver with a chosen small amplitude and frequency (which is slightly detuned from the beam eigenfrequency), that when engaged with the unperturbed beam oscillations, causes it reach a state of "injection pulling" - an effect that occurs when one harmonic oscillator is coupled with a second one and causes it to oscillate in a frequency near its own. This causes the beam to reach SNIC bifurcation, rendering a frequency comb as desired. Theoretical analysis showed that the problem can be modelled using a non-linear equation of the beam, that translates to a form of the non-linear Duffing equation. While a solution to the dynamics function of the beam is hard to obtain in practice due to mathematical difficulties, a slow evolution model is suggested that is composed of functions of a amplitude and phase. Using several additional mathematical assumptions, the amplitude is seen to be related to the phase, while the phase equation solution is seen to be of the form of Adler's equation. These assumptions ultimately reduce the entire behaviour of the beam to a relatively simple solution to the Adler equation, which has a known analytical solution. Computerized numerical simulations are run on it to check the results and compare them to the theory and desired outcome. The results agreed with the theory and produce the expected frequency comb, showing the assumptions to be valid in extracting the comb.
comment: Presented at the 35th Israeli Conference on Mechanical Engineering (ICME 2018)
Incremental Collision Laws Based on the Bouc-Wen Model: Improved Collision Models and Further Results
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 12 pages, 4 figures, see https://gitlab.com/user9716869/EBWCM ; (v2-v5) various minor amendments; (v5) replaced the parameter identification study of Quinn (2004) with Villegas et al (2021) due to incompatibility of the proposed collision models with the experimental setup in Quinn (2004); arXiv admin note: text overlap with arXiv:2410.08147
Data-Driven Yet Formal Policy Synthesis for Stochastic Nonlinear Dynamical Systems
The automated synthesis of control policies for stochastic dynamical systems presents significant challenges. A standard approach is to construct a finite-state abstraction of the continuous system, typically represented as a Markov decision process (MDP). However, generating abstractions is challenging when (1) the system's dynamics are nonlinear, and/or (2) we do not have complete knowledge of the dynamics. In this work, we introduce a novel data-driven abstraction technique for nonlinear Lipschitz continuous dynamical systems with additive stochastic noise that addresses both of these issues. As a key step, we use samples of the dynamics to learn the enabled actions and transition probabilities of the abstraction. We represent abstractions as MDPs with intervals of transition probabilities, known as interval MDPs (IMDPs). These abstractions enable the synthesis of policies for the concrete nonlinear system, with probably approximately correct (PAC) guarantees on the probability of satisfying a specified control objective. Our numerical experiments illustrate the effectiveness and robustness of our approach in achieving reliable control under uncertainty.
Geometry-Aware Edge-State Tracking for Resilient Affine Formation Control
Affine formation control (AFC) is a subset of formation control methods that enables coordinated multiagent movement while preserving affine relationships, and has recently gained increasing popularity due to its broad applicability across diverse applications. AFC is inherently distributed, where each agent's local controller relies on the relative displacements of neighboring agents. The unavailability of these measurements in practice, due to node or communication failures, leads to a change in the underlying graph topology and subsequently causes instability or sub-optimal performance. In this work, each edge in the graph is modeled using a state-space framework, allowing the corresponding edge-states to be estimated with or without up-to-date measurements. We then propose a Kalman-based estimation framework where we fuse both temporal information from agents' dynamics and spatial information, which is derived from the geometry of the affine formations. We give convergence guarantees and optimality analysis on the proposed algorithm, and numerical validations show the enhanced resilience of AFC against these topology changes in several practical scenarios.
comment: A part of this work is published at the 26th International Conference on Information Fusion 2023, DOI: 10.23919/FUSION52260.2023.10224101
Fault Detection in New Wind Turbines with Limited Data by Generative Transfer Learning
Intelligent condition monitoring of wind turbines is essential for reducing downtimes. Machine learning models trained on wind turbine operation data are commonly used to detect anomalies and, eventually, operation faults. However, data-driven normal behavior models (NBMs) require a substantial amount of training data, as NBMs trained with scarce data may result in unreliable fault detection. To overcome this limitation, we present a novel generative deep transfer learning approach to make SCADA samples from one wind turbine lacking training data resemble SCADA data from wind turbines with representative training data. Through CycleGAN-based domain mapping, our method enables the application of an NBM trained on an existing wind turbine to a new one with severely limited data. We demonstrate our approach on field data mapping SCADA samples across 7 substantially different WTs. Our findings show significantly improved fault detection in wind turbines with scarce data. Our method achieves the most similar anomaly scores to an NBM trained with abundant data, outperforming NBMs trained on scarce training data with improvements of +10.3% in F1-score when 1 month of training data is available and +16.8% when 2 weeks are available. The domain mapping approach outperforms conventional fine-tuning at all considered degrees of data scarcity, ranging from 1 to 8 weeks of training data. The proposed technique enables earlier and more reliable fault detection in newly installed wind farms, demonstrating a novel and promising research direction to improve anomaly detection when faced with training data scarcity.
comment: Change of phrasing to fault detection, including title change, and minor revisions. No changes to models, experiments, or results
Generalizations of data-driven balancing: What to sample for different balancing-based reduced models
The quadrature-based balanced truncation (QuadBT) framework of arXiv:2104.01006 is a non-intrusive reformulation of balanced truncation (BT), a classical projection-based model-order reduction technique for linear systems. QuadBT is non-intrusive in the sense that it builds approximate balanced truncation reduced-order models entirely from system response data, e.g., transfer function measurements, without the need to reference an explicit state-space realization of the underlying full-order model. In this work, we generalize the QuadBT framework to other types of balanced truncation model reduction. Namely, we show what transfer function data are required to compute data-driven reduced models by balanced stochastic truncation, positive-real balanced truncation, and bounded-real balanced truncation. In each case, these data are evaluations of particular spectral factors associated with the system of interest. These results lay the theoretical foundation for data-driven reformulations of the aforementioned BT variants. Although it is not yet clear how to compute or obtain these spectral factor data in a practical real-world setting, examples using synthetic (numerically evaluated) transfer function data are included to validate the data-based reduced models.
comment: 16 pages, 3 figures
On Word-of-Mouth and Private-Prior Sequential Social Learning
Social learning constitutes a fundamental framework for studying interactions among rational agents who observe each other's actions but lack direct access to individual beliefs. This paper investigates a specific social learning paradigm known as Word-of-Mouth (WoM), where a series of agents seeks to estimate the state of a dynamical system. The first agent receives noisy measurements of the state, while each subsequent agent relies solely on a degraded version of her predecessor's estimate. A defining feature of WoM is that the final agent's belief is publicly broadcast and subsequently adopted by all agents, in place of their own. We analyze this setting theoretically and through numerical simulations, noting that some agents benefit from using the belief of the last agent, while others experience performance deterioration.
comment: Accepted for publication at the 64th Conference on Decision and Control (CDC)
Robust Sensor-Limited Control with Safe Input-Output Constraints for Hydraulic In-Wheel Motor Drive Mobility Systems
In-wheel drive (IWD) systems enhance the responsiveness, traction, and maintenance efficiency of vehicles by enabling each wheel to operate independently. This paper proposes a novel robust torque-observed valve-based control (RTOVC) framework to address velocity tracking in hydraulic IWDs that actuate heavy-duty wheeled mobile robots (HWMRs), considering such challenges as wheel slippages, sensor limitations, rough terrains, and modeling uncertainties. To overcome the sensor-dependent control systems associated with the closed-loop torque/pressure in hydraulic IWD-actuated HWMRs, a robust observer network based on an adaptive barrier Lyapunov function (BLF) is proposed to estimate the required in-wheel motor torque to track the velocity references. Then, another adaptive BLF for valve control signals is employed to modulate the hydraulic fluid to generate the estimated torque for each IWD. The RTOVC strategy ensures user-defined safety within the logarithmic BLF framework by constraining the valve control signal, actual velocity, velocity tracking error, and torque of each hydraulic IWD in an HWMR to avoid exceeding specified limits. Despite its safety constraints, external disturbances, and modeling uncertainties, robustness and uniformly exponential stability of the RTOVC-applied hydraulic IWD mechanism are ensured in HWMRs. Experimental investigations using a 6,500-kg HWMR, actuated by four independent IWDs under intense disturbances and safety-defined constraints, validate the performance of the RTOVC.
A Multi-Modal IoT Node for Energy-Efficient Environmental Monitoring with Edge AI Processing
The widespread adoption of Internet of Things (IoT) technologies has significantly advanced environmental monitoring (EM) by enabling cost-effective and scalable sensing solutions. Concurrently, machine learning (ML) and artificial intelligence (AI) are introducing powerful tools for the efficient and accurate analysis of complex environmental data. However, current IoT platforms for environmental sensing are typically limited to a narrow set of sensors, preventing a comprehensive assessment of environmental conditions and lacking sufficient computational capabilities to support the deployment of advanced ML and AI algorithms on the edge. To overcome these limitations, we introduce a compact (17x38 mm2), multi-modal, MCU-based environmental IoT node integrating 11 sensors, including CO2 concentration, volatile organic compounds (VOCs), light intensity, UV radiation, pressure, temperature, humidity, visual sensing via an RGB camera, and precise geolocation through a GNSS module. It features GAP9, a parallel ultra-low-power system-on-chip, enabling real-time, energy-efficient edge processing of advanced ML models directly on-device. We implemented a YOLOv5-based occupancy detection pipeline (0.3 M parameters, 42 MOP per inference), demonstrating 42% energy savings over raw data streaming. Additionally, we present a smart indoor air quality (IAQ) monitoring setup that combines occupancy detection with adaptive sample rates, achieving operational times of up to 143 h on a single compact 600 mAh, 3.7 V battery. Our platform lays the groundwork for innovative applications such as predictive indoor IAQ, enabling efficient AI-driven on-edge forecasting for energy-efficient and autonomous, proactive pollution-mitigation control strategies
comment: 7 pages, 4 figures, 2 tables. This paper has been accepted at 2025 IEEE International Conference on Omni-layer Intelligent Systems (COINS)
Approximate predictive control barrier function for discrete-time systems
We propose integrating an approximation of a predictive control barrier function (PCBF) in a safety filter framework, resulting in a prediction horizon independent formulation. The PCBF is defined through the value function of an optimal control problem and ensures invariance as well as stability of a safe set within a larger domain of attraction. We provide a theoretical analysis of the proposed algorithm, establishing input-to-state stability of the safe set with respect to approximation errors as well as exogenous disturbances. Furthermore, we propose a continuous extension of the PCBF within the safe set, reducing the impact of learning errors on filter interventions. We demonstrate the stability properties and computational advantages of the proposed algorithm on a linear system example and its application as a safety filter for miniature race cars in simulation.
Real-time Traffic Simulation and Management for Large-scale Urban Air Mobility: Integrating Route Guidance and Collision Avoidance
Given the spatial heterogeneity of land use patterns in most cities, large-scale UAM deployments will likely focus on specific areas, such as intertransfer traffic between suburbs and city centers. However, large-scale UAM operations connecting multiple origin-destination pairs raise concerns about air traffic safety and efficiency due to potential conflict movements, particularly at major conflict points analogous to roadway junctions. To meet the safety and efficiency requirements of future UAM operations, this work proposes an air traffic management framework that integrates route guidance and collision avoidance. The route guidance mechanism optimizes aircraft distribution across both spatial and temporal dimensions by regulating their paths (composed of waypoints). Given the optimized paths, the collision avoidance algorithm generates collision-free aircraft trajectories between waypoints in the 3D space. To enable large-scale applications, we develop fast approximation methods for centralized path planning and adopt the velocity obstacle model for distributed collision avoidance. To our knowledge, this work is one of the first to integrate route guidance and collision avoidance for UAM. Simulation results demonstrate that the proposed framework enables efficient and flexible UAM operations, including air traffic assignment, local congestion mitigation, and dynamic no-fly zone management. Compared with a collision-free baseline strategy, the proposed framework achieves considerable improvements in traffic safety and efficiency, with increases in the average minimum separation (+98.2%), the average travel speed (+70.2%), and the trip completion rate (+130%), along with a reduction in the energy consumption (-23.0%). The proposed framework demonstrates its potential for real-time traffic simulation and management in large-scale UAM systems.
comment: 26 pages
Recursive Gaussian Process Regression with Integrated Monotonicity Assumptions for Control Applications
In this paper, we present an extension to the recursive Gaussian Process (RGP) regression that enables the satisfaction of inequality constraints and is well suited for a real-time execution in control applications. The soft inequality constraints are integrated by introducing an additional extended Kalman Filter (EKF) update step using pseudo-measurements. The sequential formulation of the algorithm and several developed heuristics ensure both the performance and a low computational effort of the algorithm. A special focus lies on an efficient consideration of monotonicity assumptions for GPs in the form of inequality constraints. The algorithm is statistically validated in simulations, where the possible advantages in comparison with the standard RGP algorithm become obvious. The paper is concluded with a successful experimental validation of the developed algorithm for the monotonicity-preserving learning of heat transfer values for the control of a vapor compression cycle evaporator, leveraging a previously published partial input output linearization (IOL).
comment: Accepted at ICINCO 2025 (22nd International Conference on Informatics in Control, Automation and Robotics)
Constrained Diffusion Models for Synthesizing Representative Power Flow Datasets ICML 2025
High-quality power flow datasets are essential for training machine learning models in power systems. However, security and privacy concerns restrict access to real-world data, making statistically accurate and physically consistent synthetic datasets a viable alternative. We develop a diffusion model for generating synthetic power flow datasets from real-world power grids that both replicate the statistical properties of the real-world data and ensure AC power flow feasibility. To enforce the constraints, we incorporate gradient guidance based on the power flow constraints to steer diffusion sampling toward feasible samples. For computational efficiency, we further leverage insights from the fast decoupled power flow method and propose a variable decoupling strategy for the training and sampling of the diffusion model. These solutions lead to a physics-informed diffusion model, generating power flow datasets that outperform those from the standard diffusion in terms of feasibility and statistical similarity, as shown in experiments across IEEE benchmark systems.
comment: This paper is the extended journal version of our paper at ICML 2025 Workshop "DataWorld: Unifying Data Curation Frameworks Across Domains"
Co-Optimization of EV Charging Control and Incentivization for Enhanced Power System Stability
We study how high charging rate demands from electric vehicles (EVs) in a power distribution grid may collectively cause poor dynamic performance, and propose a price incentivization strategy to steer customers to settle for lesser charging rate demands so that such performance degradation can be avoided. We pose the problem as a joint optimization and optimal control formulation. The optimization determines the optimal charging setpoints for EVs to minimize the $\mathcal{H}_2$-norm of the transfer function of the grid model, while the optimal control simultaneously develops a linear quadratic regulator (LQR) based state-feedback control signal for the battery currents of those EVs to jointly improve the small-signal dynamic performance of the system states. A subsequent algorithm is developed to determine how much customers may be willing to sacrifice their intended charging rate demands in return for financial incentives. Results are derived for both unidirectional and bidirectional charging, and validated using numerical simulations of multiple EV charging stations (EVCSs) in the IEEE 33-bus power distribution model.
The Reconfigurable Earth Observation Satellite Scheduling Problem
Earth observation satellites (EOS) play a pivotal role in capturing and analyzing planetary phenomena, ranging from natural disasters to societal development. The EOS scheduling problem (EOSSP), which optimizes the schedule of EOS, is often solved with respect to nadir-directional EOS systems, thus restricting the observation time of targets and, consequently, the effectiveness of each EOS. This paper leverages state-of-the-art constellation reconfigurability to develop the reconfigurable EOS scheduling problem (REOSSP), wherein EOS are assumed to be maneuverable, forming a more optimal constellation configuration at multiple opportunities during a schedule. This paper develops a novel mixed-integer linear programming formulation for the REOSSP to optimally solve the scheduling problem for given parameters. Additionally, since the REOSSP can be computationally expensive for large-scale problems, a rolling horizon procedure (RHP) solution method is developed. The performance of the REOSSP is benchmarked against the EOSSP, which serves as a baseline, through a set of random instances where problem characteristics are varied and a case study in which Hurricane Sandy is used to demonstrate realistic performance. These experiments demonstrate the value of constellation reconfigurability in its application to the EOSSP, yielding solutions that improve performance, while the RHP enhances computational runtime for large-scale REOSSP instances.
comment: 43 pages
Client-Aided Secure Two-Party Computation of Dynamic Controllers
In this paper, we propose a secure two-party computation protocol for dynamic controllers using a secret sharing scheme. The proposed protocol realizes outsourcing of controller computation to two servers, while controller parameters, states, inputs, and outputs are kept secret against the servers. Unlike previous encrypted controls in a single-server setting, the proposed method can operate a dynamic controller for an infinite time horizon without controller state decryption or input re-encryption. We show that the control performance achievable by the proposed protocol can be made arbitrarily close to that attained by the unencrypted controller. Furthermore, system-theoretic and cryptographic modifications of the protocol are presented to improve the communication complexity. The feasibility of the protocol is demonstrated through numerical examples of PID and observer-based controls.
comment: 12 pages, 4 figures
Conformal Data-driven Control of Stochastic Multi-Agent Systems under Collaborative Signal Temporal Logic Specifications
We address control synthesis of stochastic discrete-time linear multi-agent systems under jointly chance-constrained collaborative signal temporal logic specifications in a distribution-free manner using available disturbance samples, which are partitioned into training and calibration sets. Leveraging linearity, we decompose each agent's system into deterministic nominal and stochastic error parts, and design disturbance feedback controllers to bound the stochastic errors by solving a tractable optimization problem over the training data. We then quantify prediction regions (PRs) for the aggregate error trajectories corresponding to agent cliques, involved in collaborative tasks, using conformal prediction and calibration data. This enables us to address the specified joint chance constraint via Lipschitz tightening and the computed PRs, and relax the centralized stochastic optimal control problem to a deterministic one, whose solution provides the feedforward inputs. To enhance scalability, we decompose the deterministic problem into agent-level subproblems solved in an MPC fashion, yielding a distributed control policy. Finally, we present an illustrative example and a comparison with [1].
comment: 7 pages, 2 figures, Accepted for presentation at the 64th IEEE Conference on Decision and Control (CDC2025)
Sparsity-Promoting Reachability Analysis and Optimization of Constrained Zonotopes
The constrained zonotope is a polytopic set representation widely used for set-based analysis and control of dynamic systems. This paper develops methods to formulate and solve optimization problems for dynamic systems in real time using constrained zonotope reachability analysis. An alternating direction method of multipliers (ADMM) algorithm is presented that makes efficient use of the constrained zonotope structure. To increase the efficiency of the ADMM iterations, reachability calculations are presented that increase the sparsity of the matrices used to define a constrained zonotope when compared to typical methods. The developed methods are used to formulate and solve predictive control, state estimation, and safety verification problems. Numerical results show that optimization times using the proposed approach are competitive with state-of-the-art QP solvers and conventional problem formulations. A combined set-valued state estimation and moving horizon estimation algorithm is presented and experimentally demonstrated in the context of robot localization.
Contraction Properties of the Global Workspace Primitive
To push forward the important emerging research field surrounding multi-area recurrent neural networks (RNNs), we expand theoretically and empirically on the provably stable RNNs of RNNs introduced by Kozachkov et al. in "RNNs of RNNs: Recursive Construction of Stable Assemblies of Recurrent Neural Networks". We prove relaxed stability conditions for salient special cases of this architecture, most notably for a global workspace modular structure. We then demonstrate empirical success for Global Workspace Sparse Combo Nets with a small number of trainable parameters, not only through strong overall test performance but also greater resilience to removal of individual subnetworks. These empirical results for the global workspace inter-area topology are contingent on stability preservation, highlighting the relevance of our theoretical work for enabling modular RNN success. Further, by exploring sparsity in the connectivity structure between different subnetwork modules more broadly, we improve the state of the art performance for stable RNNs on benchmark sequence processing tasks, thus underscoring the general utility of specialized graph structures for multi-area RNNs.
Systems and Control (EESS)
Flight-Ready Precise and Robust Carrier-Phase GNSS Navigation Software for Distributed Space Systems
This paper presents the full requirements analysis, design, development, and testing of high-precision navigation flight software for Distributed Space Systems (DSS) using Carrier Phase Differential GNSS (CDGNSS). Five main contributions are made. First, a survey of flown and upcoming DSS missions with stringent precision requirements is conducted, from which a thorough requirements analysis is distilled to guide development and testing. Second, a real-time navigation functional architecture is designed, and adopts a sparse and regularized Consider Kalman Filter with options for numerical stability in-flight. The filter rigorously accounts for uncertainties in process noise, measurement noise, and biases. It tracks float ambiguities with integer resolution where possible. The covariance correlation structure is preserved under all navigation modes, including contingencies and outages. Third, a lightweight, memoryless Fault Detection, Isolation, and Recovery (FDIR) module is developed to guard against anomalous measurements, providing statistical screening and ensuring robust navigation. Fourth, the software architecture is proposed for ease of integration, with strategies presented for modularity and computational efficiency tailored to constrained flight systems. Fifth, a comprehensive test campaign is conducted, mapped to a requirements verification matrix, spanning unit, interface, software-in-the-loop, and real-time hardware-in-the-loop tests, emphasizing gradual test fidelity for efficient fault isolation. Finally, flight-like results are demonstrated using the VISORS mission, due to the generalizability of the VISORS navigation operations, and the stringency which demands sub-centimeter relative position and sub-millimeter-per-second velocity accuracy. This architecture aims to serve as a reference for next-generation DSS missions adopting CDGNSS.
AI Data Centers Need Pioneers to Deliver Scalable Power via Offgrid AI
The scalable computing revolution of the late '80s through mid- '00s forged a new technical and economic model for computing that delivered massive societal impact, but its economic benefit has driven scalability to sizes that are now exhausting the energy grid's capacity. Our time demands a new revolution in scalable energy, mirroring in key ways the scalable computing revolution; e.g., compelling economic forces, use of mass-market components, overcoming foibles of those components, judicious use of physical locality, and the the difficult integration into an effective system. The offgrid AI approach closely fits this mold, combining local mostly renewable generation and storage to power an AI data center, starting offgrid. Obstacles to delivering this approach are social, technical, and project, but the potential is massive. I argue that the offgrid-AI approach needs pioneers among both system developers and AI-data-center operators to move it quickly from concept to large-scale deployment.
Tractable Stochastic Hybrid Model Predictive Control using Gaussian Processes for Repetitive Tasks in Unseen Environments
Improving the predictive accuracy of a dynamics model is crucial to obtaining good control performance and safety from Model Predictive Controllers (MPC). One approach involves learning unmodelled (residual) dynamics, in addition to nominal models derived from first principles. Varying residual models across an environment manifest as modes of a piecewise residual (PWR) model that requires a) identifying how modes are distributed across the environment and b) solving a computationally intensive Mixed Integer Nonlinear Program (MINLP) problem for control. We develop an iterative mapping algorithm capable of predicting time-varying mode distributions. We then develop and solve two tractable approximations of the MINLP to combine with the predictor in closed-loop to solve the overall control problem. In simulation, we first demonstrate how the approximations improve performance by 4-18% in comparison to the MINLP while achieving significantly lower computation times (upto 250x faster). We then demonstrate how the proposed mapping algorithm incrementally improves controller performance (upto 3x) over multiple iterations of a trajectory tracking control task even when the mode distributions change over time.
comment: European Control Conference (ECC) 2025
On Asymptotic Analysis of the Two-Stage Approach: Towards Data-Driven Parameter Estimation
In this paper, we analyze the asymptotic properties of the Two-Stage (TS) estimator -- a simulation-based parameter estimation method that constructs estimators offline from synthetic data. While TS offers significant computational advantages compared to standard approaches to estimation, its statistical properties have not been previously analyzed in the literature. Under simple assumptions, we establish that the TS estimator is strongly consistent and asymptotically normal, providing the first theoretical guarantees for this class of estimators.
comment: 11 pages, 4 figures
BirdRecorder's AI on Sky: Safeguarding birds of prey by detection and classification of tiny objects around wind turbines
The urgent need for renewable energy expansion, particularly wind power, is hindered by conflicts with wildlife conservation. To address this, we developed BirdRecorder, an advanced AI-based anti-collision system to protect endangered birds, especially the red kite (Milvus milvus). Integrating robotics, telemetry, and high-performance AI algorithms, BirdRecorder aims to detect, track, and classify avian species within a range of 800 m to minimize bird-turbine collisions. BirdRecorder integrates advanced AI methods with optimized hardware and software architectures to enable real-time image processing. Leveraging Single Shot Detector (SSD) for detection, combined with specialized hardware acceleration and tracking algorithms, our system achieves high detection precision while maintaining the speed necessary for real-time decision-making. By combining these components, BirdRecorder outperforms existing approaches in both accuracy and efficiency. In this paper, we summarize results on field tests and performance of the BirdRecorder system. By bridging the gap between renewable energy expansion and wildlife conservation, BirdRecorder contributes to a more sustainable coexistence of technology and nature.
comment: 18 pages, 1 figures, to appear in Proceedings of the 19th International Conference on Intelligent Autonomous Systems (IAS-19), Genoa, Italy, 2025
Realizing Reduced and Sparse Biochemical Reaction Networks from Dynamics
We propose a direct optimization framework for learning reduced and sparse chemical reaction networks (CRNs) from time-series trajectory data. In contrast to widely used indirect methods-such as those based on sparse identification of nonlinear dynamics (SINDy)-which infer reaction dynamics by fitting numerically estimated derivatives, our approach fits entire trajectories by solving a dynamically constrained optimization problem. This formulation enables the construction of reduced CRNs that are both low-dimensional and sparse, while preserving key dynamical behaviors of the original system. We develop an accelerated proximal gradient algorithm to efficiently solve the resulting non-convex optimization problem. Through illustrative examples, including a Drosophila circadian oscillator and a glycolytic oscillator, we demonstrate the ability of our method to recover accurate and interpretable reduced-order CRNs. Notably, the direct approach avoids the derivative estimation step and mitigates error accumulation issues inherent in indirect methods, making it a robust alternative for data-driven CRN realizations.
comment: Accepted to IEEE CDC 2025. Author-accepted version; supplementary material in appendix file
Modeling and Control Framework for Autonomous Space Manipulator Handover Operations
Autonomous space robotics is poised to play a vital role in future space missions, particularly for In-space Servicing, Assembly, and Manufacturing (ISAM). A key capability in such missions is the Robot-to-Robot (R2R) handover of mission-critical objects. This work presents a dynamic model of a dual-arm space manipulator system and compares various tracking control laws. The key contributions of this work are the development of a cooperative manipulator dynamic model and the comparative analysis of control laws to support autonomous R2R handovers in ISAM scenarios.
comment: 14 pages, submitted to 2025 Astrodynamics Specialists Conference proceedings
AQ-PCDSys: An Adaptive Quantized Planetary Crater Detection System for Autonomous Space Exploration
Autonomous planetary exploration missions are critically dependent on real-time, accurate environmental perception for navigation and hazard avoidance. However, deploying deep learning models on the resource-constrained computational hardware of planetary exploration platforms remains a significant challenge. This paper introduces the Adaptive Quantized Planetary Crater Detection System (AQ-PCDSys), a novel framework specifically engineered for real-time, onboard deployment in the computationally constrained environments of space exploration missions. AQ-PCDSys synergistically integrates a Quantized Neural Network (QNN) architecture, trained using Quantization-Aware Training (QAT), with an Adaptive Multi-Sensor Fusion (AMF) module. The QNN architecture significantly optimizes model size and inference latency suitable for real-time onboard deployment in space exploration missions, while preserving high accuracy. The AMF module intelligently fuses data from Optical Imagery (OI) and Digital Elevation Models (DEMs) at the feature level, utilizing an Adaptive Weighting Mechanism (AWM) to dynamically prioritize the most relevant and reliable sensor modality based on planetary ambient conditions. This approach enhances detection robustness across diverse planetary landscapes. Paired with Multi-Scale Detection Heads specifically designed for robust and efficient detection of craters across a wide range of sizes, AQ-PCDSys provides a computationally efficient, reliable and accurate solution for planetary crater detection, a critical capability for enabling the next generation of autonomous planetary landing, navigation, and scientific exploration.
comment: 17 pages, 6 figures. A research paper on a novel deep learning framework for planetary crater detection
modelSolver: A Symbolic Model-Driven Solver for Power Network Simulation and Monitoring
The development of advanced software tools for power system analysis requires extensive programming expertise. Even when using open-source tools, programming skills are essential to modify built-in models. This can be particularly challenging for domain experts who lack coding proficiency. This paper introduces modelSolver, a software solution with a new framework centered around symbolic mathematical modeling. The proposed paradigm facilitates defining models through intuitive mathematical expressions, thus eliminating the need for traditional programming constructs such as arrays, loops, and sparse matrix computations. The modelSolver focuses on power flow and state estimation using an open-box approach, which allows users to specify custom models using either real or complex variables. Unlike existing tools that rely on hard-coded models, modelSolver enables the representation of a wide range of advanced functionalities, including power flow with voltage regulators and load tap changers, continuation power flow, and Gauss-Newton state estimation with equality constraints. Compatibility with MATPOWER is ensured via a converter that automates importing data files. The framework prioritizes model-driven development and empowers domain experts to focus on power system modeling without programming barriers. It aims to simplify power system computations, making them more accessible to students, scientists, and practitioners.
A Predictive Framework for Adversarial Energy Depletion in Inbound Threat Scenarios
This paper presents a predictive framework for adversarial energy-depletion defense against a maneuverable inbound threat (IT). The IT solves a receding-horizon problem to minimize its own energy while reaching a high-value asset (HVA) and avoiding interceptors and static lethal zones modeled by Gaussian barriers. Expendable interceptors (EIs), coordinated by a central node (CN), maintain proximity to the HVA and patrol centers via radius-based tether costs, deny attack corridors by harassing and containing the IT, and commit to intercept only when a geometric feasibility test is confirmed. No explicit opponent-energy term is used, and the formulation is optimization-implementable. No simulations are included.
comment: 7 pages, 1 figure, 1 table, preprint submitted to the American Control Conference (ACC) 2026
Linear Power System Modeling and Analysis Across Wide Operating Ranges: A Hierarchical Neural State-Space Equation Approach
Developing a unified small-signal model for modern, large-scale power systems that remains accurate across a wide range of operating ranges presents a formidable challenge. Traditional methods, spanning mechanistic modeling, modal identification, and deep learning, have yet to fully overcome persistent limitations in accuracy, universal applicability, and interpretability. In this paper, a novel hierarchical neural state-space equation approach is proposed to overcome these obstacles, achieving strong representation, high interpretability, and superior adaptability to both system scale and varying operating points. Specifically, we first introduce neural state-space equations integrated with virtual state observers to accurately characterize the dynamics of power system devices, even in the presence of unmeasurable states. Subsequently, a hierarchical architecture is designed to handle the modeling complexity across a wide range of operating conditions, flexibly decoupling device and grid models to effectively mitigate the curse of dimensionality. Finally, a set of spatiotemporal data transformations and a multi-stage training strategy with a multi-objective loss function is employed to enhance the models efficiency and generalization. Numerical results on the two-machine three-bus system and the Guangdong Power Grid verify the superior performance of the proposed method, presenting it as a powerful new tool for small-signal stability analysis.
comment: 10 pages, 5 figures
A Comprehensive Incremental and Ensemble Learning Approach for Forecasting Individual Electric Vehicle Charging Parameters
Electric vehicles (EVs) have the potential to reduce grid stress through smart charging strategies while simultaneously meeting user demand. This requires accurate forecasts of key charging parameters, such as energy demand and connection time. Although previous studies have made progress in this area, they have overlooked the importance of dynamic training to capture recent patterns and have excluded EV sessions with limited information, missing potential opportunities to use these data. To address these limitations, this study proposes a dual-model approach incorporating incremental learning with six machine-learning models to predict EV charging session parameters. This approach includes dynamic training updates, optimal features, and hyperparameter set selection for each model to make it more robust and inclusive. Using a data set of 170,000 measurements from the real world electric vehicle session, week-long charging parameters were predicted over a one-year period. The findings reveal a significant difference between workplace and residential charging locations regarding connection duration predictability, with workplace sessions being more predictable. The proposed stacking ensemble learning method enhanced forecasting accuracy, improving R2 by 2.83% to 43.44% across all parameters and location settings. A comparison of the two models reveals that incorporating user IDs as a feature, along with the associated historical data, is the most significant factor influencing the accuracy of the forecast. Forecasts can be used effectively in smart charging and grid management applications by incorporating uncertainty quantification techniques, allowing charge point operators to optimize charging schedules and energy management.
Multiple STAR-RISs-Empowered Multi-User Communications with Diversified QoS Provisioning
This paper proposes a quality-of-service (QoS)-aware multi-user communication framework facilitated by multiple simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs). The user groups are established based on their QoS requirements specified by the minimum data rate, which is provisioned by the optimized transmission and reflection configurations of the STAR-RISs. Particularly, we formulate an optimization problem to maximize the aggregate link rate across all users, under group-specified rate requirements by jointly considering the transmit beamforming and STAR-RIS configurations. Then, we employ the Lagrangian duality with quadratic transformation to tackle the non-convexity of the objective. We decompose the problem within a block coordinate descent framework, and the subproblems are solved through convex approximation and iterated to approach the optimal solution. Simulation results demonstrate the effectiveness of the proposed method in enhancing the system sum rate with guaranteed QoS performance for heterogeneous users, offering valuable insights for the deployment of STAR-RISs in future QoS-aware wireless networks.
SuperGen: An Efficient Ultra-high-resolution Video Generation System with Sketching and Tiling
Diffusion models have recently achieved remarkable success in generative tasks (e.g., image and video generation), and the demand for high-quality content (e.g., 2K/4K videos) is rapidly increasing across various domains. However, generating ultra-high-resolution videos on existing standard-resolution (e.g., 720p) platforms remains challenging due to the excessive re-training requirements and prohibitively high computational and memory costs. To this end, we introduce SuperGen, an efficient tile-based framework for ultra-high-resolution video generation. SuperGen features a novel training-free algorithmic innovation with tiling to successfully support a wide range of resolutions without additional training efforts while significantly reducing both memory footprint and computational complexity. Moreover, SuperGen incorporates a tile-tailored, adaptive, region-aware caching strategy that accelerates video generation by exploiting redundancy across denoising steps and spatial regions. SuperGen also integrates cache-guided, communication-minimized tile parallelism for enhanced throughput and minimized latency. Evaluations demonstrate that SuperGen harvests the maximum performance gains while achieving high output quality across various benchmarks.
Deception in Asymmetric Information Homicidal Chauffeur Game
The classic Homicidal Chauffeur game is a pursuit-evasion game played in an unbounded planar environment between a pursuer constrained to move with fixed speed on curves with bounded curvature, and a slower evader with fixed speed but with simple kinematics. We introduce a new variant of this game with asymmetric information in which the evader has the ability to choose its speed among a finite set of choices that is unknown to the pursuer a priori. Therefore the pursuer is required to estimate the evader's maximum speed based on the observations so far. This formulation leads to the question of whether the evader can exploit this asymmetry by moving deceptively by first picking a slower speed to move with and then switching to a faster speed when a specified relative configuration is attained to increase the capture time as compared to moving with the maximum speed at all times. Our contributions are as follows. First, we derive optimal feedback Nash equilibrium strategies for the complete information case of this game in which the evader is allowed to vary its speed in a given interval. Second, for the version with asymmetric information, we characterize regions of initial player locations in the game space from which the evader does not have any advantage in using deceptive strategies. Finally, we provide numerical evidence of regions in the game space from which the evader can increase the capture time by moving deceptively.
Fast RLS Identification Leveraging the Linearized System Sparsity: Predictive Cost Adaptive Control for Quadrotors
This paper presents a centralized predictive cost adaptive control (PCAC) strategy for the position and attitude control of quadrotors. PCAC is an optimal, prediction-based control method that uses recursive least squares (RLS) to identify model parameters online, enabling adaptability in dynamic environments. Addressing challenges with black-box approaches in systems with complex couplings and fast dynamics, this study leverages the unique sparsity of quadrotor models linearized around hover points. By identifying only essential parameters related to nonlinear couplings and dynamics, this approach reduces the number of parameters to estimate, accelerates identification, and enhances stability during transients. Furthermore, the proposed control scheme removes the need for an attitude setpoint, typically required in conventional cascaded control designs.
comment: 6 pages, 3 figures, American Control Conference (ACC) 2026 preprint
Reformulations of Quadratic Programs for Lipschitz Continuity
Optimization-based controllers often lack regularity guarantees, such as Lipschitz continuity, when multiple constraints are present. When used to control a dynamical system, these conditions are essential to ensure the existence and uniqueness of the system's trajectory. Here we propose a general method to convert a Quadratic Program (QP) into a Second-Order Cone Problem (SOCP), which is shown to be Lipschitz continuous. Key features of our approach are that (i) the regularity of the resulting formulation does not depend on the structural properties of the constraints, such as the linear independence of their gradients; and (ii) it admits a closed-form solution, which is not available for general QPs with multiple constraints, enabling faster computation. We support our method with rigorous analysis and examples.
comment: Submitted to IEEE Control Systems Letters (L-CSS)
Electromagnetic Formation Flying Using Alternating Magnetic Field Forces and Control Barrier Functions for State and Input Constraints
This article presents a feedback control algorithm for electromagnetic formation flying with constraints on the satellites' states and control inputs. The algorithm combines several key techniques. First, we use alternating magnetic field forces to decouple the electromagnetic forces between each pair of satellites in the formation. Each satellite's electromagnetic actuation system is driven by a sum of amplitude-modulated sinusoids, where amplitudes are controlled in order to prescribe the time-averaged force between each pair of satellites. Next, the desired time-averaged force is computed from a optimal control that satisfies state constraints (i.e., no collisions and an upper limit on intersatellite speeds) and input constraints (i.e., not exceeding satellite's apparent power capability). The optimal time-averaged force is computed using a single relaxed control barrier function that is obtained by composing multiple control barrier functions that are designed to enforce each state and input constraint. Finally, we demonstrate the satellite formation control method in numerical simulations.
comment: Preprint submitted to IEEE Transactions on Aerospace and Electronic Systems (TAES)
A Learning-based Hybrid System Approach for Detecting Contingencies in Distribution Grids with Inverter-Based Resources
This paper presents a machine-learning based Stochastic Hybrid System (SHS) modeling framework to detect contingencies in active distribution networks populated with inverter-based resources (IBRs). In particular, this framework allows detecting unobservable contingencies, which cannot be identified by normal sensing systems. First, a state-space SHS model combining conventional and IRB-based resources is introduced to formulate the dynamic interaction between continuous states of distribution networks and discrete contingency events. This model forms a randomly switching system, where parameters or network topology can change due to contingencies. We consider two contingency classes: (i) physical events, such as line outages, and (ii) measurement anomalies caused by sensor faults. Leveraging multivariate time series data derived from high-frequency sampling of system states and network outputs, a time series-based learning model is trained for real-time contingency detection and classification. Simulation studies, carried out on the IEEE 33-bus distribution system, demonstrate a 96% overall detection accuracy.
comment: 6 pages, 6 figures
Fast Multiagent Formation Stabilization with Sparse Universally Rigid Frameworks
Affine formation control (AFC) is a distributed networked control system that has recently received increasing attention in various applications. AFC is typically achieved using a generalized consensus system where the stress matrix, which encodes the graph structure, is used instead of a graph Laplacian. Universally rigid frameworks (URFs) guarantee the existence of the stress matrix and have thus become the guideline for such a network design. In this work, we propose a convex optimization framework to design the stress matrix for AFC without predefining a rigid graph. We aim to find a resulting network with a reduced number of communication links, but still with a fast convergence speed. We show through simulations that our proposed solutions can yield a more sparse graph, while admitting a faster convergence compared to the state-of-the-art solutions.
Mimicking associative learning of rats via a neuromorphic robot in open field maze using spatial cell models
Data-driven Artificial Intelligence (AI) approaches have exhibited remarkable prowess across various cognitive tasks using extensive training data. However, the reliance on large datasets and neural networks presents challenges such as highpower consumption and limited adaptability, particularly in SWaP-constrained applications like planetary exploration. To address these issues, we propose enhancing the autonomous capabilities of intelligent robots by emulating the associative learning observed in animals. Associative learning enables animals to adapt to their environment by memorizing concurrent events. By replicating this mechanism, neuromorphic robots can navigate dynamic environments autonomously, learning from interactions to optimize performance. This paper explores the emulation of associative learning in rodents using neuromorphic robots within open-field maze environments, leveraging insights from spatial cells such as place and grid cells. By integrating these models, we aim to enable online associative learning for spatial tasks in real-time scenarios, bridging the gap between biological spatial cognition and robotics for advancements in autonomous systems.
Engineering a Digital Twin for the Monitoring and Control of Beer Fermentation Sampling
Successfully engineering interactive industrial DTs is a complex task, especially when implementing services beyond passive monitoring. We present here an experience report on engineering a safety-critical digital twin (DT) for beer fermentation monitoring, which provides continual sampling and reduces manual sampling time by 91%. We document our systematic methodology and practical solutions for implementing bidirectional DTs in industrial environments. This includes our three-phase engineering approach that transforms a passive monitoring system into an interactive Type 2 DT with real-time control capabilities for pressurized systems operating at seven bar. We contribute details of multi-layered safety protocols, hardware-software integration strategies across Arduino controllers and Unity visualization, and real-time synchronization solutions. We document specific engineering challenges and solutions spanning interdisciplinary integration, demonstrating how our use of the constellation reporting framework facilitates cross-domain collaboration. Key findings include the critical importance of safety-first design, simulation-driven development, and progressive implementation strategies. Our work thus provides actionable guidance for practitioners developing DTs requiring bidirectional control in safety-critical applications.
comment: Accepted for EDTconf 2025
DTInsight: A Tool for Explicit, Interactive, and Continuous Digital Twin Reporting
With Digital Twin (DT) construction and evolution occurring over time, stakeholders require tools to understand the current characteristics and conceptual architecture of the system at any time. We introduce DTInsight, a systematic and automated tool and methodology for producing continuous reporting for DTs. DTInsight offers three key features: (a) an interactive conceptual architecture visualization of DTs; (b) generation of summaries of DT characteristics based on ontological data; and (c) integration of these outputs into a reporting page within a continuous integration and continuous deployment (CI/CD) pipeline. Given a modeled description of the DT aligning to our DT Description Framework (DTDF), DTInsight enables up-to-date and detailed reports for enhanced stakeholder understanding.
SNIC bifurcation and its Application to MEMS ICME 2018
This project focuses on a method to extract a frequency comb in mechanical means, for general interest and numerous practical applications in MEMS. The method of execution is the implementation of a beam that is exhibiting non-linear dynamics that is perturbed and analyzed for its transverse vibrations. The perturbation is an external harmonic driver with a chosen small amplitude and frequency (which is slightly detuned from the beam eigenfrequency), that when engaged with the unperturbed beam oscillations, causes it reach a state of "injection pulling" - an effect that occurs when one harmonic oscillator is coupled with a second one and causes it to oscillate in a frequency near its own. This causes the beam to reach SNIC bifurcation, rendering a frequency comb as desired. Theoretical analysis showed that the problem can be modelled using a non-linear equation of the beam, that translates to a form of the non-linear Duffing equation. While a solution to the dynamics function of the beam is hard to obtain in practice due to mathematical difficulties, a slow evolution model is suggested that is composed of functions of a amplitude and phase. Using several additional mathematical assumptions, the amplitude is seen to be related to the phase, while the phase equation solution is seen to be of the form of Adler's equation. These assumptions ultimately reduce the entire behaviour of the beam to a relatively simple solution to the Adler equation, which has a known analytical solution. Computerized numerical simulations are run on it to check the results and compare them to the theory and desired outcome. The results agreed with the theory and produce the expected frequency comb, showing the assumptions to be valid in extracting the comb.
comment: Presented at the 35th Israeli Conference on Mechanical Engineering (ICME 2018)
Incremental Collision Laws Based on the Bouc-Wen Model: Improved Collision Models and Further Results
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 12 pages, 4 figures, see https://gitlab.com/user9716869/EBWCM ; (v2-v5) various minor amendments; (v5) replaced the parameter identification study of Quinn (2004) with Villegas et al (2021) due to incompatibility of the proposed collision models with the experimental setup in Quinn (2004); arXiv admin note: text overlap with arXiv:2410.08147
Data-Driven Yet Formal Policy Synthesis for Stochastic Nonlinear Dynamical Systems
The automated synthesis of control policies for stochastic dynamical systems presents significant challenges. A standard approach is to construct a finite-state abstraction of the continuous system, typically represented as a Markov decision process (MDP). However, generating abstractions is challenging when (1) the system's dynamics are nonlinear, and/or (2) we do not have complete knowledge of the dynamics. In this work, we introduce a novel data-driven abstraction technique for nonlinear Lipschitz continuous dynamical systems with additive stochastic noise that addresses both of these issues. As a key step, we use samples of the dynamics to learn the enabled actions and transition probabilities of the abstraction. We represent abstractions as MDPs with intervals of transition probabilities, known as interval MDPs (IMDPs). These abstractions enable the synthesis of policies for the concrete nonlinear system, with probably approximately correct (PAC) guarantees on the probability of satisfying a specified control objective. Our numerical experiments illustrate the effectiveness and robustness of our approach in achieving reliable control under uncertainty.
Geometry-Aware Edge-State Tracking for Resilient Affine Formation Control
Affine formation control (AFC) is a subset of formation control methods that enables coordinated multiagent movement while preserving affine relationships, and has recently gained increasing popularity due to its broad applicability across diverse applications. AFC is inherently distributed, where each agent's local controller relies on the relative displacements of neighboring agents. The unavailability of these measurements in practice, due to node or communication failures, leads to a change in the underlying graph topology and subsequently causes instability or sub-optimal performance. In this work, each edge in the graph is modeled using a state-space framework, allowing the corresponding edge-states to be estimated with or without up-to-date measurements. We then propose a Kalman-based estimation framework where we fuse both temporal information from agents' dynamics and spatial information, which is derived from the geometry of the affine formations. We give convergence guarantees and optimality analysis on the proposed algorithm, and numerical validations show the enhanced resilience of AFC against these topology changes in several practical scenarios.
comment: A part of this work is published at the 26th International Conference on Information Fusion 2023, DOI: 10.23919/FUSION52260.2023.10224101
Fault Detection in New Wind Turbines with Limited Data by Generative Transfer Learning
Intelligent condition monitoring of wind turbines is essential for reducing downtimes. Machine learning models trained on wind turbine operation data are commonly used to detect anomalies and, eventually, operation faults. However, data-driven normal behavior models (NBMs) require a substantial amount of training data, as NBMs trained with scarce data may result in unreliable fault detection. To overcome this limitation, we present a novel generative deep transfer learning approach to make SCADA samples from one wind turbine lacking training data resemble SCADA data from wind turbines with representative training data. Through CycleGAN-based domain mapping, our method enables the application of an NBM trained on an existing wind turbine to a new one with severely limited data. We demonstrate our approach on field data mapping SCADA samples across 7 substantially different WTs. Our findings show significantly improved fault detection in wind turbines with scarce data. Our method achieves the most similar anomaly scores to an NBM trained with abundant data, outperforming NBMs trained on scarce training data with improvements of +10.3% in F1-score when 1 month of training data is available and +16.8% when 2 weeks are available. The domain mapping approach outperforms conventional fine-tuning at all considered degrees of data scarcity, ranging from 1 to 8 weeks of training data. The proposed technique enables earlier and more reliable fault detection in newly installed wind farms, demonstrating a novel and promising research direction to improve anomaly detection when faced with training data scarcity.
comment: Change of phrasing to fault detection, including title change, and minor revisions. No changes to models, experiments, or results
Generalizations of data-driven balancing: What to sample for different balancing-based reduced models
The quadrature-based balanced truncation (QuadBT) framework of arXiv:2104.01006 is a non-intrusive reformulation of balanced truncation (BT), a classical projection-based model-order reduction technique for linear systems. QuadBT is non-intrusive in the sense that it builds approximate balanced truncation reduced-order models entirely from system response data, e.g., transfer function measurements, without the need to reference an explicit state-space realization of the underlying full-order model. In this work, we generalize the QuadBT framework to other types of balanced truncation model reduction. Namely, we show what transfer function data are required to compute data-driven reduced models by balanced stochastic truncation, positive-real balanced truncation, and bounded-real balanced truncation. In each case, these data are evaluations of particular spectral factors associated with the system of interest. These results lay the theoretical foundation for data-driven reformulations of the aforementioned BT variants. Although it is not yet clear how to compute or obtain these spectral factor data in a practical real-world setting, examples using synthetic (numerically evaluated) transfer function data are included to validate the data-based reduced models.
comment: 16 pages, 3 figures
On Word-of-Mouth and Private-Prior Sequential Social Learning
Social learning constitutes a fundamental framework for studying interactions among rational agents who observe each other's actions but lack direct access to individual beliefs. This paper investigates a specific social learning paradigm known as Word-of-Mouth (WoM), where a series of agents seeks to estimate the state of a dynamical system. The first agent receives noisy measurements of the state, while each subsequent agent relies solely on a degraded version of her predecessor's estimate. A defining feature of WoM is that the final agent's belief is publicly broadcast and subsequently adopted by all agents, in place of their own. We analyze this setting theoretically and through numerical simulations, noting that some agents benefit from using the belief of the last agent, while others experience performance deterioration.
comment: Accepted for publication at the 64th Conference on Decision and Control (CDC)
Robust Sensor-Limited Control with Safe Input-Output Constraints for Hydraulic In-Wheel Motor Drive Mobility Systems
In-wheel drive (IWD) systems enhance the responsiveness, traction, and maintenance efficiency of vehicles by enabling each wheel to operate independently. This paper proposes a novel robust torque-observed valve-based control (RTOVC) framework to address velocity tracking in hydraulic IWDs that actuate heavy-duty wheeled mobile robots (HWMRs), considering such challenges as wheel slippages, sensor limitations, rough terrains, and modeling uncertainties. To overcome the sensor-dependent control systems associated with the closed-loop torque/pressure in hydraulic IWD-actuated HWMRs, a robust observer network based on an adaptive barrier Lyapunov function (BLF) is proposed to estimate the required in-wheel motor torque to track the velocity references. Then, another adaptive BLF for valve control signals is employed to modulate the hydraulic fluid to generate the estimated torque for each IWD. The RTOVC strategy ensures user-defined safety within the logarithmic BLF framework by constraining the valve control signal, actual velocity, velocity tracking error, and torque of each hydraulic IWD in an HWMR to avoid exceeding specified limits. Despite its safety constraints, external disturbances, and modeling uncertainties, robustness and uniformly exponential stability of the RTOVC-applied hydraulic IWD mechanism are ensured in HWMRs. Experimental investigations using a 6,500-kg HWMR, actuated by four independent IWDs under intense disturbances and safety-defined constraints, validate the performance of the RTOVC.
A Multi-Modal IoT Node for Energy-Efficient Environmental Monitoring with Edge AI Processing
The widespread adoption of Internet of Things (IoT) technologies has significantly advanced environmental monitoring (EM) by enabling cost-effective and scalable sensing solutions. Concurrently, machine learning (ML) and artificial intelligence (AI) are introducing powerful tools for the efficient and accurate analysis of complex environmental data. However, current IoT platforms for environmental sensing are typically limited to a narrow set of sensors, preventing a comprehensive assessment of environmental conditions and lacking sufficient computational capabilities to support the deployment of advanced ML and AI algorithms on the edge. To overcome these limitations, we introduce a compact (17x38 mm2), multi-modal, MCU-based environmental IoT node integrating 11 sensors, including CO2 concentration, volatile organic compounds (VOCs), light intensity, UV radiation, pressure, temperature, humidity, visual sensing via an RGB camera, and precise geolocation through a GNSS module. It features GAP9, a parallel ultra-low-power system-on-chip, enabling real-time, energy-efficient edge processing of advanced ML models directly on-device. We implemented a YOLOv5-based occupancy detection pipeline (0.3 M parameters, 42 MOP per inference), demonstrating 42% energy savings over raw data streaming. Additionally, we present a smart indoor air quality (IAQ) monitoring setup that combines occupancy detection with adaptive sample rates, achieving operational times of up to 143 h on a single compact 600 mAh, 3.7 V battery. Our platform lays the groundwork for innovative applications such as predictive indoor IAQ, enabling efficient AI-driven on-edge forecasting for energy-efficient and autonomous, proactive pollution-mitigation control strategies
comment: 7 pages, 4 figures, 2 tables. This paper has been accepted at 2025 IEEE International Conference on Omni-layer Intelligent Systems (COINS)
Approximate predictive control barrier function for discrete-time systems
We propose integrating an approximation of a predictive control barrier function (PCBF) in a safety filter framework, resulting in a prediction horizon independent formulation. The PCBF is defined through the value function of an optimal control problem and ensures invariance as well as stability of a safe set within a larger domain of attraction. We provide a theoretical analysis of the proposed algorithm, establishing input-to-state stability of the safe set with respect to approximation errors as well as exogenous disturbances. Furthermore, we propose a continuous extension of the PCBF within the safe set, reducing the impact of learning errors on filter interventions. We demonstrate the stability properties and computational advantages of the proposed algorithm on a linear system example and its application as a safety filter for miniature race cars in simulation.
Real-time Traffic Simulation and Management for Large-scale Urban Air Mobility: Integrating Route Guidance and Collision Avoidance
Given the spatial heterogeneity of land use patterns in most cities, large-scale UAM deployments will likely focus on specific areas, such as intertransfer traffic between suburbs and city centers. However, large-scale UAM operations connecting multiple origin-destination pairs raise concerns about air traffic safety and efficiency due to potential conflict movements, particularly at major conflict points analogous to roadway junctions. To meet the safety and efficiency requirements of future UAM operations, this work proposes an air traffic management framework that integrates route guidance and collision avoidance. The route guidance mechanism optimizes aircraft distribution across both spatial and temporal dimensions by regulating their paths (composed of waypoints). Given the optimized paths, the collision avoidance algorithm generates collision-free aircraft trajectories between waypoints in the 3D space. To enable large-scale applications, we develop fast approximation methods for centralized path planning and adopt the velocity obstacle model for distributed collision avoidance. To our knowledge, this work is one of the first to integrate route guidance and collision avoidance for UAM. Simulation results demonstrate that the proposed framework enables efficient and flexible UAM operations, including air traffic assignment, local congestion mitigation, and dynamic no-fly zone management. Compared with a collision-free baseline strategy, the proposed framework achieves considerable improvements in traffic safety and efficiency, with increases in the average minimum separation (+98.2%), the average travel speed (+70.2%), and the trip completion rate (+130%), along with a reduction in the energy consumption (-23.0%). The proposed framework demonstrates its potential for real-time traffic simulation and management in large-scale UAM systems.
comment: 26 pages
Recursive Gaussian Process Regression with Integrated Monotonicity Assumptions for Control Applications
In this paper, we present an extension to the recursive Gaussian Process (RGP) regression that enables the satisfaction of inequality constraints and is well suited for a real-time execution in control applications. The soft inequality constraints are integrated by introducing an additional extended Kalman Filter (EKF) update step using pseudo-measurements. The sequential formulation of the algorithm and several developed heuristics ensure both the performance and a low computational effort of the algorithm. A special focus lies on an efficient consideration of monotonicity assumptions for GPs in the form of inequality constraints. The algorithm is statistically validated in simulations, where the possible advantages in comparison with the standard RGP algorithm become obvious. The paper is concluded with a successful experimental validation of the developed algorithm for the monotonicity-preserving learning of heat transfer values for the control of a vapor compression cycle evaporator, leveraging a previously published partial input output linearization (IOL).
comment: Accepted at ICINCO 2025 (22nd International Conference on Informatics in Control, Automation and Robotics)
Constrained Diffusion Models for Synthesizing Representative Power Flow Datasets ICML 2025
High-quality power flow datasets are essential for training machine learning models in power systems. However, security and privacy concerns restrict access to real-world data, making statistically accurate and physically consistent synthetic datasets a viable alternative. We develop a diffusion model for generating synthetic power flow datasets from real-world power grids that both replicate the statistical properties of the real-world data and ensure AC power flow feasibility. To enforce the constraints, we incorporate gradient guidance based on the power flow constraints to steer diffusion sampling toward feasible samples. For computational efficiency, we further leverage insights from the fast decoupled power flow method and propose a variable decoupling strategy for the training and sampling of the diffusion model. These solutions lead to a physics-informed diffusion model, generating power flow datasets that outperform those from the standard diffusion in terms of feasibility and statistical similarity, as shown in experiments across IEEE benchmark systems.
comment: This paper is the extended journal version of our paper at ICML 2025 Workshop "DataWorld: Unifying Data Curation Frameworks Across Domains"
Co-Optimization of EV Charging Control and Incentivization for Enhanced Power System Stability
We study how high charging rate demands from electric vehicles (EVs) in a power distribution grid may collectively cause poor dynamic performance, and propose a price incentivization strategy to steer customers to settle for lesser charging rate demands so that such performance degradation can be avoided. We pose the problem as a joint optimization and optimal control formulation. The optimization determines the optimal charging setpoints for EVs to minimize the $\mathcal{H}_2$-norm of the transfer function of the grid model, while the optimal control simultaneously develops a linear quadratic regulator (LQR) based state-feedback control signal for the battery currents of those EVs to jointly improve the small-signal dynamic performance of the system states. A subsequent algorithm is developed to determine how much customers may be willing to sacrifice their intended charging rate demands in return for financial incentives. Results are derived for both unidirectional and bidirectional charging, and validated using numerical simulations of multiple EV charging stations (EVCSs) in the IEEE 33-bus power distribution model.
The Reconfigurable Earth Observation Satellite Scheduling Problem
Earth observation satellites (EOS) play a pivotal role in capturing and analyzing planetary phenomena, ranging from natural disasters to societal development. The EOS scheduling problem (EOSSP), which optimizes the schedule of EOS, is often solved with respect to nadir-directional EOS systems, thus restricting the observation time of targets and, consequently, the effectiveness of each EOS. This paper leverages state-of-the-art constellation reconfigurability to develop the reconfigurable EOS scheduling problem (REOSSP), wherein EOS are assumed to be maneuverable, forming a more optimal constellation configuration at multiple opportunities during a schedule. This paper develops a novel mixed-integer linear programming formulation for the REOSSP to optimally solve the scheduling problem for given parameters. Additionally, since the REOSSP can be computationally expensive for large-scale problems, a rolling horizon procedure (RHP) solution method is developed. The performance of the REOSSP is benchmarked against the EOSSP, which serves as a baseline, through a set of random instances where problem characteristics are varied and a case study in which Hurricane Sandy is used to demonstrate realistic performance. These experiments demonstrate the value of constellation reconfigurability in its application to the EOSSP, yielding solutions that improve performance, while the RHP enhances computational runtime for large-scale REOSSP instances.
comment: 43 pages
Client-Aided Secure Two-Party Computation of Dynamic Controllers
In this paper, we propose a secure two-party computation protocol for dynamic controllers using a secret sharing scheme. The proposed protocol realizes outsourcing of controller computation to two servers, while controller parameters, states, inputs, and outputs are kept secret against the servers. Unlike previous encrypted controls in a single-server setting, the proposed method can operate a dynamic controller for an infinite time horizon without controller state decryption or input re-encryption. We show that the control performance achievable by the proposed protocol can be made arbitrarily close to that attained by the unencrypted controller. Furthermore, system-theoretic and cryptographic modifications of the protocol are presented to improve the communication complexity. The feasibility of the protocol is demonstrated through numerical examples of PID and observer-based controls.
comment: 12 pages, 4 figures
Conformal Data-driven Control of Stochastic Multi-Agent Systems under Collaborative Signal Temporal Logic Specifications
We address control synthesis of stochastic discrete-time linear multi-agent systems under jointly chance-constrained collaborative signal temporal logic specifications in a distribution-free manner using available disturbance samples, which are partitioned into training and calibration sets. Leveraging linearity, we decompose each agent's system into deterministic nominal and stochastic error parts, and design disturbance feedback controllers to bound the stochastic errors by solving a tractable optimization problem over the training data. We then quantify prediction regions (PRs) for the aggregate error trajectories corresponding to agent cliques, involved in collaborative tasks, using conformal prediction and calibration data. This enables us to address the specified joint chance constraint via Lipschitz tightening and the computed PRs, and relax the centralized stochastic optimal control problem to a deterministic one, whose solution provides the feedforward inputs. To enhance scalability, we decompose the deterministic problem into agent-level subproblems solved in an MPC fashion, yielding a distributed control policy. Finally, we present an illustrative example and a comparison with [1].
comment: 7 pages, 2 figures, Accepted for presentation at the 64th IEEE Conference on Decision and Control (CDC2025)
Sparsity-Promoting Reachability Analysis and Optimization of Constrained Zonotopes
The constrained zonotope is a polytopic set representation widely used for set-based analysis and control of dynamic systems. This paper develops methods to formulate and solve optimization problems for dynamic systems in real time using constrained zonotope reachability analysis. An alternating direction method of multipliers (ADMM) algorithm is presented that makes efficient use of the constrained zonotope structure. To increase the efficiency of the ADMM iterations, reachability calculations are presented that increase the sparsity of the matrices used to define a constrained zonotope when compared to typical methods. The developed methods are used to formulate and solve predictive control, state estimation, and safety verification problems. Numerical results show that optimization times using the proposed approach are competitive with state-of-the-art QP solvers and conventional problem formulations. A combined set-valued state estimation and moving horizon estimation algorithm is presented and experimentally demonstrated in the context of robot localization.
Contraction Properties of the Global Workspace Primitive
To push forward the important emerging research field surrounding multi-area recurrent neural networks (RNNs), we expand theoretically and empirically on the provably stable RNNs of RNNs introduced by Kozachkov et al. in "RNNs of RNNs: Recursive Construction of Stable Assemblies of Recurrent Neural Networks". We prove relaxed stability conditions for salient special cases of this architecture, most notably for a global workspace modular structure. We then demonstrate empirical success for Global Workspace Sparse Combo Nets with a small number of trainable parameters, not only through strong overall test performance but also greater resilience to removal of individual subnetworks. These empirical results for the global workspace inter-area topology are contingent on stability preservation, highlighting the relevance of our theoretical work for enabling modular RNN success. Further, by exploring sparsity in the connectivity structure between different subnetwork modules more broadly, we improve the state of the art performance for stable RNNs on benchmark sequence processing tasks, thus underscoring the general utility of specialized graph structures for multi-area RNNs.
Systems and Control (CS)
A Data-Driven Forced Oscillation Locating Method for Power Systems with Inverter-Based Resources
Forced Oscillations (FO) stemming from external periodic disturbances threaten power system security and stability. The increasing penetration of Inverter-Based Resources(IBRs) further introduces FO, leading to new challenges in identifying and locating FO sources in modern power systems. In this paper, a novel data-driven method for locating FO in power systems with IBRs is proposed. Unlike previous works, a unified representation of FO originating from IBRs is considered, which further facilitates the development of the FO locating algorithm. Leveraging on Sparse Identification for a Nonlinear Dynamical (SINDy), a purely data-driven methodology is developed for locating the source of FO by interpreting the proposed model from measurements. Numerical results on the WECC 240-bus system validate the performance of the proposed approach in successfully locating FO in the presence of IBRs.
First and Second Order Optimal $\mathcal{H}_2$ Model Reduction for Linear Continuous-Time Systems
In this paper, we investigate the optimal $\mathcal{H}_2$ model reduction problem for single-input single-output (SISO) continuous-time linear time-invariant (LTI) systems. A semi-definite relaxation (SDR) approach is proposed to determine globally optimal interpolation points, providing an effective way to compute the reduced-order models via Krylov projection-based methods. In contrast to iterative approaches, we use the controllability Gramian and the moment-matching conditions to recast the model reduction problem as a convex optimization by introducing an upper bound $\gamma$ to minimize the $\mathcal{H}_2$ norm of the model reduction error system. We also prove that the relaxation is exact for first order reduced models and demonstrate, through examples, that it is exact for second order reduced models. We compare the performance of our proposed method with other iterative approaches and shift-selection methods on examples. Importantly, our approach also provides a means to verify the global optimality of known locally convergent methods.
comment: 8 pages, 5 figures, CDC conference
A Consensus Algorithm for Second-Order Systems Evolving on Lie Groups
In this paper, a consensus algorithm is proposed for interacting multi-agents, which can be modeled as simple Mechanical Control Systems (MCS) evolving on a general Lie group. The standard Laplacian flow consensus algorithm for double integrator systems evolving on Euclidean spaces is extended to a general Lie group. A tracking error function is defined on a general smooth manifold for measuring the error between the configurations of two interacting agents. The stability of the desired consensus equilibrium is proved using a generalized version of Lyapunov theory and LaSalle's invariance principle applicable for systems evolving on a smooth manifold. The proposed consensus control input requires only the configuration information of the neighboring agents and does not require their velocities and inertia tensors. The design of tracking error function and consensus control inputs are demonstrated through an application of attitude consensus problem for multiple communicating rigid bodies. The consensus algorithm is numerically validated by demonstrating the attitude consensus problem.
Distributed Implementation of Variational Quantum Eigensolver to Solve QUBO Problems
We present a distributed algorithm and implementation of the variational quantum eigensolver (VQE), termed distributed VQE (DVQE). DVQE, provided as an open-source Python package, enables the execution of parameterized quantum circuits across multiple logical quantum processing units (QPUs) in a distributed fashion. This approach addresses key hardware limitations of near-term quantum devices, including restricted qubit counts and limited circuit depth. Distributed ansatz circuits are constructed to preserve the quantum state fidelity of their monolithic counterparts, allowing consistent energy estimation while distributing the computational load. To improve the convergence and robustness of the optimization loop for identifying the variational parameters of the DVQE ansatz circuit, we use the ADAM optimizer in combination with metaheuristic initialization strategies, which outperform random initialization across various test cases. The complete DVQE pipeline is implemented in a modular Python package that accepts QUBO problems as input and supports monolithic and distributed execution modes. The framework leverages Qiskit to construct and simulate distributed circuits, and includes an internal greedy algorithm for automatic qubit allocation across multiple QPUs. Simulation results on QUBO benchmarks confirm the correctness of the approach, paving the way for real QPU deployment and further exploration of distributed quantum optimization. \textbf{The simulator is publicly available on \href{https://github.com/LSU-RAISE-LAB/DVQE.git}{GitHub} under a package named raiselab, with a collection of tutorial examples.}
Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation
Quadruped robots have emerged as highly efficient and versatile platforms, excelling in navigating complex and unstructured terrains where traditional wheeled robots might fail. Equipping these robots with manipulator arms unlocks the advanced capability of loco-manipulation to perform complex physical interaction tasks in areas ranging from industrial automation to search-and-rescue missions. However, achieving precise and adaptable grasping in such dynamic scenarios remains a significant challenge, often hindered by the need for extensive real-world calibration and pre-programmed grasp configurations. This paper introduces a deep learning framework designed to enhance the grasping capabilities of quadrupeds equipped with arms, focusing on improved precision and adaptability. Our approach centers on a sim-to-real methodology that minimizes reliance on physical data collection. We developed a pipeline within the Genesis simulation environment to generate a synthetic dataset of grasp attempts on common objects. By simulating thousands of interactions from various perspectives, we created pixel-wise annotated grasp-quality maps to serve as the ground truth for our model. This dataset was used to train a custom CNN with a U-Net-like architecture that processes multi-modal input from an onboard RGB and depth cameras, including RGB images, depth maps, segmentation masks, and surface normal maps. The trained model outputs a grasp-quality heatmap to identify the optimal grasp point. We validated the complete framework on a four-legged robot. The system successfully executed a full loco-manipulation task: autonomously navigating to a target object, perceiving it with its sensors, predicting the optimal grasp pose using our model, and performing a precise grasp. This work proves that leveraging simulated training with advanced sensing offers a scalable and effective solution for object handling.
Coordinated UAV Beamforming and Control for Directional Jamming and Nulling
Efficient mobile jamming against eavesdroppers in wireless networks necessitates accurate coordination between mobility and antenna beamforming. We study the coordinated beamforming and control problem for a UAV that carries two omnidirectional antennas, and which uses them to jam an eavesdropper while leaving a friendly client unaffected. The UAV can shape its jamming beampattern by controlling its position, its antennas' orientation, and the phases of the antennas' interference signals. We derive a closed-form expression for the antennas' phases that guarantees zero jamming impact on the client. In addition, we determine the antennas' orientation and the UAV's position that maximizes jamming impact on the eavesdropper through an optimal control problem, optimizing the orientation pointwise and the position through the UAV's control input. Simulations show how this coordinated beamforming and control scheme enables directional GPS denial while guaranteeing zero interference towards a friendly direction.
comment: 8 pages, 7 Figures
Input-Output Data-Driven Sensor Selection for Cyber-Physical Systems
In this paper, we consider the problem of input-output data-driven sensor selection for unknown cyber-physical systems (CPS). In particular, out of a large set of sensors available for use, we choose a subset of them that maximizes a metric of observability of the CPS. The considered observability metric is related to the $\mathcal{H}_2$ system norm, which quantifies the average output energy of the selected sensors over a finite or an infinite horizon. However, its computation inherently requires knowledge of the unknown matrices of the system, so we draw connections from the reinforcement learning literature and design an input-output data-driven algorithm to compute it in a model-free manner. We then use the derived data-driven metric expression to choose the best sensors of the system in polynomial time, effectively obtaining a provably convergent model-free sensor selection process. Additionally, we show how the proposed data-driven approach can be exploited to select sensors that optimize volumetric measures of observability, while also noting its applicability to the dual problem of actuator selection. Simulations are performed to demonstrate the validity and effectiveness of the proposed approach.
comment: 12 pages, 3 Figures
Modular electronic microrobots with on board sensor-program steered locomotion
True microrobots, in contrast with externally controlled microparticles, must harvest or carry their own source of energy, as well as their own (preferably programmable) microcontroller of actuators for locomotion, using information acquired from their own sensors. Building on recent published work [1], we demonstrate here, for the first time, that microrobotic smartlets, hitherto buoyancy divers, can also be equipped to navigate in 2D on surfaces, with on-board control responding to both sensor information and their internal electronic program. Fabricating modular microrobots, with all dimensions of 1mm and below, has been difficult to achieve because of competing demands for the limited surface area and the challenges of integrating and interconnecting the diverse functionalities of energy harvesting, actuation, sensing, communication, docking and control. A novel high density heterogeneous integration, via soft-substrate micro flip-chip bonding of custom CMOS and LED microchiplets onto fold-up polymer surfaces, compatible with roll-up isotropic ambient light harvesting, now makes this possible. Fabricating electrolytic bubble actuators on multiple cube-faces and connecting them to a custom sensor-controlled on-board microchiplet (lablet), allows the smartlets to locomote on wet surfaces, changing direction in response to both timed programmed control as well as programmed response to locally sensed signals. Such locomoted robotic microcubes can also move to and selectively dock with other modules via patterned surfaces. This is powered by ambient light in natural aqueous media on smooth surfaces.
Analysis of Circuit-based Per-Panel Diode Model of Photovoltaic Array
Solar photovoltaic systems are increasing in size and number on the grid. In regions with high penetration, such as California, PV systems serve multiple functions, including peak shaving and demand response. Therefore, the criticality of PV systems to grid operations calls for accurate models. The current practice is to represent the PV array, composed of multiple PV panels, with an aggregated single-diode model (SDM). The highly abstract model has a limited ability to capture real-world behaviors, such as partial shading and hotspots. Thus, we develop a circuit-based per-panel PV array model that uses a single diode model for each panel and interconnects them to form an array. This approach bridges the tradeoff between cell-level physics and control-dependent system-level behavior. We establish conditions for mathematical equivalence between the proposed per-panel array circuit model and the aggregated single-diode array model. We generate empirical evidence by running simulations using parameters derived from real-world PV panels. Results indicate that the proposed per-panel array model can represent the electrical behavior of the array under non-ideal conditions, such as partial shading, more accurately. With maximum power point tracking control, the proposed model is 21.2% more accurate when estimating the real power output of an array under a partial shading scenario and 8.1% more accurate under a hot spot scenario.
One Equation to Rule Them All -- Part II: Direct Data-Driven Reduction and Regulation
The Sylvester equation underpins a wide spectrum of control synthesis and systems analysis tools associated with cascade interconnections. In the preceding Part I [1] of this article, it was shown that such an equation can be reformulated using data, enabling the production of a collection of data-driven stabilisation procedures. In this second part of the article, we continue to develop the framework established in Part I to solve two important control-theoretic problems: model order reduction and output regulation. For the model order reduction problem we provide a solution from input-state measurements, from input-output measurements, and we study the effect of the noise. For the output regulation problem, we provide data-driven solutions for the static and dynamic feedback problem. The proposed designs are illustrated by means of examples.
One Equation to Rule Them All -- Part I: Direct Data-Driven Cascade Stabilisation
In this article we present a framework for direct data-driven control for general problems involving interconnections of dynamical systems. We first develop a method to determine the solution of a Sylvester equation from data. Such solution is used to describe a subspace that plays a role in a large variety of problems. We then provide an error analysis of the impact that noise has on this solution. This is a crucial contribution because, thanks to the interconnection approach developed throughout the article, we are able to track how the noise propagates at each stage, and thereby provide bounds on the final designs. Among the many potential problems that can be solved with this framework, we focus on three representatives: cascade stabilisation, model order reduction, and output regulation. This manuscript studies the first problem, while the companion Part II addresses the other two. For each of these settings we show how the problems can be recast in our framework. In the context of cascade stabilisation, we consider the 2-cascade problem, the effect of noise through the cascade, as well as N-cascade case, and we demonstrate that our proposed method is data efficient. The proposed designs are illustrated by means of a numerical example.
Safety Under State Uncertainty: Robustifying Control Barrier Functions
Safety-critical control is a crucial aspect of modern systems, and Control Barrier Functions (CBFs) have gained popularity as the framework of choice for ensuring safety. However, implementing a CBF requires exact knowledge of the true state, a requirement that is often violated in real-world applications where only noisy or estimated state information is available. This paper introduces the notion of Robust Control Barrier Functions (R-CBF) for ensuring safety under such state uncertainty without requiring prior knowledge of the magnitude of uncertainty. We formally characterize the class of robustifying terms that ensure robust closed-loop safety and show how a robustly safe controller can be constructed. We demonstrate the effectiveness of this approach through simulations and compare it to existing methods, highlighting the additional robustness and convergence guarantees it provides.
Linear Dynamics meets Linear MDPs: Closed-Form Optimal Policies via Reinforcement Learning
Many applications -- including power systems, robotics, and economics -- involve a dynamical system interacting with a stochastic and hard-to-model environment. We adopt a reinforcement learning approach to control such systems. Specifically, we consider a deterministic, discrete-time, linear, time-invariant dynamical system coupled with a feature-based linear Markov process with an unknown transition kernel. The objective is to learn a control policy that optimizes a quadratic cost over the system state, the Markov process, and the control input. Leveraging both components of the system, we derive an explicit parametric form for the optimal state-action value function and the corresponding optimal policy. Our model is distinct in combining aspects of both classical Linear Quadratic Regulator (LQR) and linear Markov decision process (MDP) frameworks. This combination retains the implementation simplicity of LQR, while allowing for sophisticated stochastic modeling afforded by linear MDPs, without estimating the transition probabilities, thereby enabling direct policy improvement. We use tools from control theory to provide theoretical guarantees on the stability of the system under the learned policy and provide a sample complexity analysis for its convergence to the optimal policy. We illustrate our results via a numerical example that demonstrates the effectiveness of our approach in learning the optimal control policy under partially known stochastic dynamics.
Collaborative-Online-Learning-Enabled Distributionally Robust Motion Control for Multi-Robot Systems
This paper develops a novel COllaborative-Online-Learning (COOL)-enabled motion control framework for multi-robot systems to avoid collision amid randomly moving obstacles whose motion distributions are partially observable through decentralized data streams. To address the notable challenge of data acquisition due to occlusion, a COOL approach based on the Dirichlet process mixture model is proposed to efficiently extract motion distribution information by exchanging among robots selected learning structures. By leveraging the fine-grained local-moment information learned through COOL, a data-stream-driven ambiguity set for obstacle motion is constructed. We then introduce a novel ambiguity set propagation method, which theoretically admits the derivation of the ambiguity sets for obstacle positions over the entire prediction horizon by utilizing obstacle current positions and the ambiguity set for obstacle motion. Additionally, we develop a compression scheme with its safety guarantee to automatically adjust the complexity and granularity of the ambiguity set by aggregating basic ambiguity sets that are close in a measure space, thereby striking an attractive trade-off between control performance and computation time. Then the probabilistic collision-free trajectories are generated through distributionally robust optimization problems. The distributionally robust obstacle avoidance constraints based on the compressed ambiguity set are equivalently reformulated by deriving separating hyperplanes through tractable semi-definite programming. Finally, we establish the probabilistic collision avoidance guarantee and the long-term tracking performance guarantee for the proposed framework. The numerical simulations are used to demonstrate the efficacy and superiority of the proposed approach compared with state-of-the-art methods.
ZTFed-MAS2S: A Zero-Trust Federated Learning Framework with Verifiable Privacy and Trust-Aware Aggregation for Wind Power Data Imputation
Wind power data often suffers from missing values due to sensor faults and unstable transmission at edge sites. While federated learning enables privacy-preserving collaboration without sharing raw data, it remains vulnerable to anomalous updates and privacy leakage during parameter exchange. These challenges are amplified in open industrial environments, necessitating zero-trust mechanisms where no participant is inherently trusted. To address these challenges, this work proposes ZTFed-MAS2S, a zero-trust federated learning framework that integrates a multi-head attention-based sequence-to-sequence imputation model. ZTFed integrates verifiable differential privacy with non-interactive zero-knowledge proofs and a confidentiality and integrity verification mechanism to ensure verifiable privacy preservation and secure model parameters transmission. A dynamic trust-aware aggregation mechanism is employed, where trust is propagated over similarity graphs to enhance robustness, and communication overhead is reduced via sparsity- and quantization-based compression. MAS2S captures long-term dependencies in wind power data for accurate imputation. Extensive experiments on real-world wind farm datasets validate the superiority of ZTFed-MAS2S in both federated learning performance and missing data imputation, demonstrating its effectiveness as a secure and efficient solution for practical applications in the energy sector.
comment: Accepted by IEEE Transactions on Industrial Informatics, 11 pages, 6 figures
Federated Nonlinear System Identification
We consider federated learning of linearly-parameterized nonlinear systems. We establish theoretical guarantees on the effectiveness of federated nonlinear system identification compared to centralized approaches, demonstrating that the convergence rate improves as the number of clients increases. Although the convergence rates in the linear and nonlinear cases differ only by a constant, this constant depends on the feature map $\phi$, which can be carefully chosen in the nonlinear setting to increase excitation and improve performance. We experimentally validate our theory in physical settings where client devices are driven by i.i.d. control inputs and control policies exhibiting i.i.d. random perturbations, ensuring non-active exploration. Experiments use trajectories from nonlinear dynamical systems characterized by real-analytic feature functions, including polynomial and trigonometric components, representative of physical systems including pendulum and quadrotor dynamics. We analyze the convergence behavior of the proposed method under varying noise levels and data distributions. Results show that federated learning consistently improves convergence of any individual client as the number of participating clients increases.
On the Foundation of Distributionally Robust Reinforcement Learning
Motivated by the need for a robust policy in the face of environment shifts between training and deployment, we contribute to the theoretical foundation of distributionally robust reinforcement learning (DRRL). This is accomplished through a comprehensive modeling framework centered around robust Markov decision processes (RMDPs). This framework obliges the decision maker to choose an optimal policy under the worst-case distributional shift orchestrated by an adversary. By unifying and extending existing formulations, we rigorously construct RMDPs that embrace various modeling attributes for both the decision maker and the adversary. These attributes include the structure of information availability-covering history-dependent, Markov, and Markov time-homogeneous dynamics-as well as constraints on the shifts induced by the adversary, with a focus on SA- and S-rectangularity. Within this RMDP framework, we investigate conditions for the existence or absence of the dynamic programming principle (DPP). From an algorithmic standpoint, the existence of DPP holds significant implications, as the vast majority of existing data and computationally efficient DRRL algorithms are reliant on the DPP. To investigate its existence, we systematically analyze various combinations of controller and adversary attributes, presenting streamlined proofs based on a unified methodology. We then construct counterexamples for settings where a fully general DPP fails to hold and establish asymptotically optimal history-dependent policies for key scenarios where the DPP is absent.
Synergising Hierarchical Data Centers and Power Networks: A Privacy-Preserving Approach
In the era of digitization, data centers have emerged as integral contributors sustaining our interlinked world, bearing responsibility for an increasing proportion of the world's energy consumption. To facilitate the their fast rollout while progressing towards net-zero energy systems, the synergy of hierarchical data centers (cloud-fog-edge) and power networks can play a pivotal role. However, existing centralized co-dispatch manners encroach on the privacy of different agents within the integrated systems, meanwhile suffering from the combinatorial explosion. In this research, we propose a near-optimal distributed privacy-preserving approach to solve the non-convex synergy (day-ahead co-dispatch) problem. The synergy problem is formulated as a mixed integer quadratically constrained quadratic programming considering both communication and energy conservation, where Lyapunov optimization is introduced to balance operating costs and uncertain communication delays. To mitigate impacts of the highly non-convex nature, the normalized multi-parametric disaggregation technique is leveraged to reformulate the problem into a mixed integer non-linear programming. To further overcome non-smoothness of the reformulated problem, the customized $\ell_1-$surrogate Lagrangian relaxation method with convergence guarantees is proposed to solve the problem in a distributed privacy-preserving manner. The effectiveness, optimality, and scalability of the proposed methodologies for the synergy problem are validated via numerical simulations. Simulation results also indicate that computing tasks can be delayed and migrated within the hierarchical data centers, demonstrating the flexible resource allocation capabilities of the hierarchical data center architecture, further facilitating peak load balancing in the power network.
Model-Free Generic Robust Control for Servo-Driven Actuation Mechanisms with Layered Insight into Energy Conversions
To advance theoretical solutions and address limitations in modeling complex servo-driven actuation systems experiencing high non-linearity and load disturbances, this paper aims to design a practical model-free generic robust control (GRC) framework for these mechanisms. This framework is intended to be applicable across all actuator systems encompassing electrical, hydraulic, or pneumatic servomechanisms, while also functioning within complex interactions among dynamic components and adhering to control input constraints. In this respect, the state-space model of actuator systems is decomposed into smaller subsystems that incorporate the first principle equation of actuator motion dynamics and interactive energy conversion equations. This decomposition operates under the assumption that the comprehensive model of the servo-driven actuator system and energy conversion, uncertainties, load disturbances, and their bounds are unknown. Then, the GRC employs subsystem-based adaptive control strategies for each state-variant subsystem separately. Despite control input constraints and the unknown interactive system model, the GRC-applied actuator mechanism ensures uniform exponential stability and robustness in tracking desired motions. It features straightforward implementation, experimentally evaluated by applying it to two industrial applications.
Systems and Control (EESS)
A Data-Driven Forced Oscillation Locating Method for Power Systems with Inverter-Based Resources
Forced Oscillations (FO) stemming from external periodic disturbances threaten power system security and stability. The increasing penetration of Inverter-Based Resources(IBRs) further introduces FO, leading to new challenges in identifying and locating FO sources in modern power systems. In this paper, a novel data-driven method for locating FO in power systems with IBRs is proposed. Unlike previous works, a unified representation of FO originating from IBRs is considered, which further facilitates the development of the FO locating algorithm. Leveraging on Sparse Identification for a Nonlinear Dynamical (SINDy), a purely data-driven methodology is developed for locating the source of FO by interpreting the proposed model from measurements. Numerical results on the WECC 240-bus system validate the performance of the proposed approach in successfully locating FO in the presence of IBRs.
First and Second Order Optimal $\mathcal{H}_2$ Model Reduction for Linear Continuous-Time Systems
In this paper, we investigate the optimal $\mathcal{H}_2$ model reduction problem for single-input single-output (SISO) continuous-time linear time-invariant (LTI) systems. A semi-definite relaxation (SDR) approach is proposed to determine globally optimal interpolation points, providing an effective way to compute the reduced-order models via Krylov projection-based methods. In contrast to iterative approaches, we use the controllability Gramian and the moment-matching conditions to recast the model reduction problem as a convex optimization by introducing an upper bound $\gamma$ to minimize the $\mathcal{H}_2$ norm of the model reduction error system. We also prove that the relaxation is exact for first order reduced models and demonstrate, through examples, that it is exact for second order reduced models. We compare the performance of our proposed method with other iterative approaches and shift-selection methods on examples. Importantly, our approach also provides a means to verify the global optimality of known locally convergent methods.
comment: 8 pages, 5 figures, CDC conference
A Consensus Algorithm for Second-Order Systems Evolving on Lie Groups
In this paper, a consensus algorithm is proposed for interacting multi-agents, which can be modeled as simple Mechanical Control Systems (MCS) evolving on a general Lie group. The standard Laplacian flow consensus algorithm for double integrator systems evolving on Euclidean spaces is extended to a general Lie group. A tracking error function is defined on a general smooth manifold for measuring the error between the configurations of two interacting agents. The stability of the desired consensus equilibrium is proved using a generalized version of Lyapunov theory and LaSalle's invariance principle applicable for systems evolving on a smooth manifold. The proposed consensus control input requires only the configuration information of the neighboring agents and does not require their velocities and inertia tensors. The design of tracking error function and consensus control inputs are demonstrated through an application of attitude consensus problem for multiple communicating rigid bodies. The consensus algorithm is numerically validated by demonstrating the attitude consensus problem.
Distributed Implementation of Variational Quantum Eigensolver to Solve QUBO Problems
We present a distributed algorithm and implementation of the variational quantum eigensolver (VQE), termed distributed VQE (DVQE). DVQE, provided as an open-source Python package, enables the execution of parameterized quantum circuits across multiple logical quantum processing units (QPUs) in a distributed fashion. This approach addresses key hardware limitations of near-term quantum devices, including restricted qubit counts and limited circuit depth. Distributed ansatz circuits are constructed to preserve the quantum state fidelity of their monolithic counterparts, allowing consistent energy estimation while distributing the computational load. To improve the convergence and robustness of the optimization loop for identifying the variational parameters of the DVQE ansatz circuit, we use the ADAM optimizer in combination with metaheuristic initialization strategies, which outperform random initialization across various test cases. The complete DVQE pipeline is implemented in a modular Python package that accepts QUBO problems as input and supports monolithic and distributed execution modes. The framework leverages Qiskit to construct and simulate distributed circuits, and includes an internal greedy algorithm for automatic qubit allocation across multiple QPUs. Simulation results on QUBO benchmarks confirm the correctness of the approach, paving the way for real QPU deployment and further exploration of distributed quantum optimization. \textbf{The simulator is publicly available on \href{https://github.com/LSU-RAISE-LAB/DVQE.git}{GitHub} under a package named raiselab, with a collection of tutorial examples.}
Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation
Quadruped robots have emerged as highly efficient and versatile platforms, excelling in navigating complex and unstructured terrains where traditional wheeled robots might fail. Equipping these robots with manipulator arms unlocks the advanced capability of loco-manipulation to perform complex physical interaction tasks in areas ranging from industrial automation to search-and-rescue missions. However, achieving precise and adaptable grasping in such dynamic scenarios remains a significant challenge, often hindered by the need for extensive real-world calibration and pre-programmed grasp configurations. This paper introduces a deep learning framework designed to enhance the grasping capabilities of quadrupeds equipped with arms, focusing on improved precision and adaptability. Our approach centers on a sim-to-real methodology that minimizes reliance on physical data collection. We developed a pipeline within the Genesis simulation environment to generate a synthetic dataset of grasp attempts on common objects. By simulating thousands of interactions from various perspectives, we created pixel-wise annotated grasp-quality maps to serve as the ground truth for our model. This dataset was used to train a custom CNN with a U-Net-like architecture that processes multi-modal input from an onboard RGB and depth cameras, including RGB images, depth maps, segmentation masks, and surface normal maps. The trained model outputs a grasp-quality heatmap to identify the optimal grasp point. We validated the complete framework on a four-legged robot. The system successfully executed a full loco-manipulation task: autonomously navigating to a target object, perceiving it with its sensors, predicting the optimal grasp pose using our model, and performing a precise grasp. This work proves that leveraging simulated training with advanced sensing offers a scalable and effective solution for object handling.
Coordinated UAV Beamforming and Control for Directional Jamming and Nulling
Efficient mobile jamming against eavesdroppers in wireless networks necessitates accurate coordination between mobility and antenna beamforming. We study the coordinated beamforming and control problem for a UAV that carries two omnidirectional antennas, and which uses them to jam an eavesdropper while leaving a friendly client unaffected. The UAV can shape its jamming beampattern by controlling its position, its antennas' orientation, and the phases of the antennas' interference signals. We derive a closed-form expression for the antennas' phases that guarantees zero jamming impact on the client. In addition, we determine the antennas' orientation and the UAV's position that maximizes jamming impact on the eavesdropper through an optimal control problem, optimizing the orientation pointwise and the position through the UAV's control input. Simulations show how this coordinated beamforming and control scheme enables directional GPS denial while guaranteeing zero interference towards a friendly direction.
comment: 8 pages, 7 Figures
Input-Output Data-Driven Sensor Selection for Cyber-Physical Systems
In this paper, we consider the problem of input-output data-driven sensor selection for unknown cyber-physical systems (CPS). In particular, out of a large set of sensors available for use, we choose a subset of them that maximizes a metric of observability of the CPS. The considered observability metric is related to the $\mathcal{H}_2$ system norm, which quantifies the average output energy of the selected sensors over a finite or an infinite horizon. However, its computation inherently requires knowledge of the unknown matrices of the system, so we draw connections from the reinforcement learning literature and design an input-output data-driven algorithm to compute it in a model-free manner. We then use the derived data-driven metric expression to choose the best sensors of the system in polynomial time, effectively obtaining a provably convergent model-free sensor selection process. Additionally, we show how the proposed data-driven approach can be exploited to select sensors that optimize volumetric measures of observability, while also noting its applicability to the dual problem of actuator selection. Simulations are performed to demonstrate the validity and effectiveness of the proposed approach.
comment: 12 pages, 3 Figures
Modular electronic microrobots with on board sensor-program steered locomotion
True microrobots, in contrast with externally controlled microparticles, must harvest or carry their own source of energy, as well as their own (preferably programmable) microcontroller of actuators for locomotion, using information acquired from their own sensors. Building on recent published work [1], we demonstrate here, for the first time, that microrobotic smartlets, hitherto buoyancy divers, can also be equipped to navigate in 2D on surfaces, with on-board control responding to both sensor information and their internal electronic program. Fabricating modular microrobots, with all dimensions of 1mm and below, has been difficult to achieve because of competing demands for the limited surface area and the challenges of integrating and interconnecting the diverse functionalities of energy harvesting, actuation, sensing, communication, docking and control. A novel high density heterogeneous integration, via soft-substrate micro flip-chip bonding of custom CMOS and LED microchiplets onto fold-up polymer surfaces, compatible with roll-up isotropic ambient light harvesting, now makes this possible. Fabricating electrolytic bubble actuators on multiple cube-faces and connecting them to a custom sensor-controlled on-board microchiplet (lablet), allows the smartlets to locomote on wet surfaces, changing direction in response to both timed programmed control as well as programmed response to locally sensed signals. Such locomoted robotic microcubes can also move to and selectively dock with other modules via patterned surfaces. This is powered by ambient light in natural aqueous media on smooth surfaces.
Analysis of Circuit-based Per-Panel Diode Model of Photovoltaic Array
Solar photovoltaic systems are increasing in size and number on the grid. In regions with high penetration, such as California, PV systems serve multiple functions, including peak shaving and demand response. Therefore, the criticality of PV systems to grid operations calls for accurate models. The current practice is to represent the PV array, composed of multiple PV panels, with an aggregated single-diode model (SDM). The highly abstract model has a limited ability to capture real-world behaviors, such as partial shading and hotspots. Thus, we develop a circuit-based per-panel PV array model that uses a single diode model for each panel and interconnects them to form an array. This approach bridges the tradeoff between cell-level physics and control-dependent system-level behavior. We establish conditions for mathematical equivalence between the proposed per-panel array circuit model and the aggregated single-diode array model. We generate empirical evidence by running simulations using parameters derived from real-world PV panels. Results indicate that the proposed per-panel array model can represent the electrical behavior of the array under non-ideal conditions, such as partial shading, more accurately. With maximum power point tracking control, the proposed model is 21.2% more accurate when estimating the real power output of an array under a partial shading scenario and 8.1% more accurate under a hot spot scenario.
One Equation to Rule Them All -- Part II: Direct Data-Driven Reduction and Regulation
The Sylvester equation underpins a wide spectrum of control synthesis and systems analysis tools associated with cascade interconnections. In the preceding Part I [1] of this article, it was shown that such an equation can be reformulated using data, enabling the production of a collection of data-driven stabilisation procedures. In this second part of the article, we continue to develop the framework established in Part I to solve two important control-theoretic problems: model order reduction and output regulation. For the model order reduction problem we provide a solution from input-state measurements, from input-output measurements, and we study the effect of the noise. For the output regulation problem, we provide data-driven solutions for the static and dynamic feedback problem. The proposed designs are illustrated by means of examples.
One Equation to Rule Them All -- Part I: Direct Data-Driven Cascade Stabilisation
In this article we present a framework for direct data-driven control for general problems involving interconnections of dynamical systems. We first develop a method to determine the solution of a Sylvester equation from data. Such solution is used to describe a subspace that plays a role in a large variety of problems. We then provide an error analysis of the impact that noise has on this solution. This is a crucial contribution because, thanks to the interconnection approach developed throughout the article, we are able to track how the noise propagates at each stage, and thereby provide bounds on the final designs. Among the many potential problems that can be solved with this framework, we focus on three representatives: cascade stabilisation, model order reduction, and output regulation. This manuscript studies the first problem, while the companion Part II addresses the other two. For each of these settings we show how the problems can be recast in our framework. In the context of cascade stabilisation, we consider the 2-cascade problem, the effect of noise through the cascade, as well as N-cascade case, and we demonstrate that our proposed method is data efficient. The proposed designs are illustrated by means of a numerical example.
Safety Under State Uncertainty: Robustifying Control Barrier Functions
Safety-critical control is a crucial aspect of modern systems, and Control Barrier Functions (CBFs) have gained popularity as the framework of choice for ensuring safety. However, implementing a CBF requires exact knowledge of the true state, a requirement that is often violated in real-world applications where only noisy or estimated state information is available. This paper introduces the notion of Robust Control Barrier Functions (R-CBF) for ensuring safety under such state uncertainty without requiring prior knowledge of the magnitude of uncertainty. We formally characterize the class of robustifying terms that ensure robust closed-loop safety and show how a robustly safe controller can be constructed. We demonstrate the effectiveness of this approach through simulations and compare it to existing methods, highlighting the additional robustness and convergence guarantees it provides.
Linear Dynamics meets Linear MDPs: Closed-Form Optimal Policies via Reinforcement Learning
Many applications -- including power systems, robotics, and economics -- involve a dynamical system interacting with a stochastic and hard-to-model environment. We adopt a reinforcement learning approach to control such systems. Specifically, we consider a deterministic, discrete-time, linear, time-invariant dynamical system coupled with a feature-based linear Markov process with an unknown transition kernel. The objective is to learn a control policy that optimizes a quadratic cost over the system state, the Markov process, and the control input. Leveraging both components of the system, we derive an explicit parametric form for the optimal state-action value function and the corresponding optimal policy. Our model is distinct in combining aspects of both classical Linear Quadratic Regulator (LQR) and linear Markov decision process (MDP) frameworks. This combination retains the implementation simplicity of LQR, while allowing for sophisticated stochastic modeling afforded by linear MDPs, without estimating the transition probabilities, thereby enabling direct policy improvement. We use tools from control theory to provide theoretical guarantees on the stability of the system under the learned policy and provide a sample complexity analysis for its convergence to the optimal policy. We illustrate our results via a numerical example that demonstrates the effectiveness of our approach in learning the optimal control policy under partially known stochastic dynamics.
Collaborative-Online-Learning-Enabled Distributionally Robust Motion Control for Multi-Robot Systems
This paper develops a novel COllaborative-Online-Learning (COOL)-enabled motion control framework for multi-robot systems to avoid collision amid randomly moving obstacles whose motion distributions are partially observable through decentralized data streams. To address the notable challenge of data acquisition due to occlusion, a COOL approach based on the Dirichlet process mixture model is proposed to efficiently extract motion distribution information by exchanging among robots selected learning structures. By leveraging the fine-grained local-moment information learned through COOL, a data-stream-driven ambiguity set for obstacle motion is constructed. We then introduce a novel ambiguity set propagation method, which theoretically admits the derivation of the ambiguity sets for obstacle positions over the entire prediction horizon by utilizing obstacle current positions and the ambiguity set for obstacle motion. Additionally, we develop a compression scheme with its safety guarantee to automatically adjust the complexity and granularity of the ambiguity set by aggregating basic ambiguity sets that are close in a measure space, thereby striking an attractive trade-off between control performance and computation time. Then the probabilistic collision-free trajectories are generated through distributionally robust optimization problems. The distributionally robust obstacle avoidance constraints based on the compressed ambiguity set are equivalently reformulated by deriving separating hyperplanes through tractable semi-definite programming. Finally, we establish the probabilistic collision avoidance guarantee and the long-term tracking performance guarantee for the proposed framework. The numerical simulations are used to demonstrate the efficacy and superiority of the proposed approach compared with state-of-the-art methods.
ZTFed-MAS2S: A Zero-Trust Federated Learning Framework with Verifiable Privacy and Trust-Aware Aggregation for Wind Power Data Imputation
Wind power data often suffers from missing values due to sensor faults and unstable transmission at edge sites. While federated learning enables privacy-preserving collaboration without sharing raw data, it remains vulnerable to anomalous updates and privacy leakage during parameter exchange. These challenges are amplified in open industrial environments, necessitating zero-trust mechanisms where no participant is inherently trusted. To address these challenges, this work proposes ZTFed-MAS2S, a zero-trust federated learning framework that integrates a multi-head attention-based sequence-to-sequence imputation model. ZTFed integrates verifiable differential privacy with non-interactive zero-knowledge proofs and a confidentiality and integrity verification mechanism to ensure verifiable privacy preservation and secure model parameters transmission. A dynamic trust-aware aggregation mechanism is employed, where trust is propagated over similarity graphs to enhance robustness, and communication overhead is reduced via sparsity- and quantization-based compression. MAS2S captures long-term dependencies in wind power data for accurate imputation. Extensive experiments on real-world wind farm datasets validate the superiority of ZTFed-MAS2S in both federated learning performance and missing data imputation, demonstrating its effectiveness as a secure and efficient solution for practical applications in the energy sector.
comment: Accepted by IEEE Transactions on Industrial Informatics, 11 pages, 6 figures
Federated Nonlinear System Identification
We consider federated learning of linearly-parameterized nonlinear systems. We establish theoretical guarantees on the effectiveness of federated nonlinear system identification compared to centralized approaches, demonstrating that the convergence rate improves as the number of clients increases. Although the convergence rates in the linear and nonlinear cases differ only by a constant, this constant depends on the feature map $\phi$, which can be carefully chosen in the nonlinear setting to increase excitation and improve performance. We experimentally validate our theory in physical settings where client devices are driven by i.i.d. control inputs and control policies exhibiting i.i.d. random perturbations, ensuring non-active exploration. Experiments use trajectories from nonlinear dynamical systems characterized by real-analytic feature functions, including polynomial and trigonometric components, representative of physical systems including pendulum and quadrotor dynamics. We analyze the convergence behavior of the proposed method under varying noise levels and data distributions. Results show that federated learning consistently improves convergence of any individual client as the number of participating clients increases.
On the Foundation of Distributionally Robust Reinforcement Learning
Motivated by the need for a robust policy in the face of environment shifts between training and deployment, we contribute to the theoretical foundation of distributionally robust reinforcement learning (DRRL). This is accomplished through a comprehensive modeling framework centered around robust Markov decision processes (RMDPs). This framework obliges the decision maker to choose an optimal policy under the worst-case distributional shift orchestrated by an adversary. By unifying and extending existing formulations, we rigorously construct RMDPs that embrace various modeling attributes for both the decision maker and the adversary. These attributes include the structure of information availability-covering history-dependent, Markov, and Markov time-homogeneous dynamics-as well as constraints on the shifts induced by the adversary, with a focus on SA- and S-rectangularity. Within this RMDP framework, we investigate conditions for the existence or absence of the dynamic programming principle (DPP). From an algorithmic standpoint, the existence of DPP holds significant implications, as the vast majority of existing data and computationally efficient DRRL algorithms are reliant on the DPP. To investigate its existence, we systematically analyze various combinations of controller and adversary attributes, presenting streamlined proofs based on a unified methodology. We then construct counterexamples for settings where a fully general DPP fails to hold and establish asymptotically optimal history-dependent policies for key scenarios where the DPP is absent.
Synergising Hierarchical Data Centers and Power Networks: A Privacy-Preserving Approach
In the era of digitization, data centers have emerged as integral contributors sustaining our interlinked world, bearing responsibility for an increasing proportion of the world's energy consumption. To facilitate the their fast rollout while progressing towards net-zero energy systems, the synergy of hierarchical data centers (cloud-fog-edge) and power networks can play a pivotal role. However, existing centralized co-dispatch manners encroach on the privacy of different agents within the integrated systems, meanwhile suffering from the combinatorial explosion. In this research, we propose a near-optimal distributed privacy-preserving approach to solve the non-convex synergy (day-ahead co-dispatch) problem. The synergy problem is formulated as a mixed integer quadratically constrained quadratic programming considering both communication and energy conservation, where Lyapunov optimization is introduced to balance operating costs and uncertain communication delays. To mitigate impacts of the highly non-convex nature, the normalized multi-parametric disaggregation technique is leveraged to reformulate the problem into a mixed integer non-linear programming. To further overcome non-smoothness of the reformulated problem, the customized $\ell_1-$surrogate Lagrangian relaxation method with convergence guarantees is proposed to solve the problem in a distributed privacy-preserving manner. The effectiveness, optimality, and scalability of the proposed methodologies for the synergy problem are validated via numerical simulations. Simulation results also indicate that computing tasks can be delayed and migrated within the hierarchical data centers, demonstrating the flexible resource allocation capabilities of the hierarchical data center architecture, further facilitating peak load balancing in the power network.
Model-Free Generic Robust Control for Servo-Driven Actuation Mechanisms with Layered Insight into Energy Conversions
To advance theoretical solutions and address limitations in modeling complex servo-driven actuation systems experiencing high non-linearity and load disturbances, this paper aims to design a practical model-free generic robust control (GRC) framework for these mechanisms. This framework is intended to be applicable across all actuator systems encompassing electrical, hydraulic, or pneumatic servomechanisms, while also functioning within complex interactions among dynamic components and adhering to control input constraints. In this respect, the state-space model of actuator systems is decomposed into smaller subsystems that incorporate the first principle equation of actuator motion dynamics and interactive energy conversion equations. This decomposition operates under the assumption that the comprehensive model of the servo-driven actuator system and energy conversion, uncertainties, load disturbances, and their bounds are unknown. Then, the GRC employs subsystem-based adaptive control strategies for each state-variant subsystem separately. Despite control input constraints and the unknown interactive system model, the GRC-applied actuator mechanism ensures uniform exponential stability and robustness in tracking desired motions. It features straightforward implementation, experimentally evaluated by applying it to two industrial applications.
Robotics
LodeStar: Long-horizon Dexterity via Synthetic Data Augmentation from Human Demonstrations
Developing robotic systems capable of robustly executing long-horizon manipulation tasks with human-level dexterity is challenging, as such tasks require both physical dexterity and seamless sequencing of manipulation skills while robustly handling environment variations. While imitation learning offers a promising approach, acquiring comprehensive datasets is resource-intensive. In this work, we propose a learning framework and system LodeStar that automatically decomposes task demonstrations into semantically meaningful skills using off-the-shelf foundation models, and generates diverse synthetic demonstration datasets from a few human demos through reinforcement learning. These sim-augmented datasets enable robust skill training, with a Skill Routing Transformer (SRT) policy effectively chaining the learned skills together to execute complex long-horizon manipulation tasks. Experimental evaluations on three challenging real-world long-horizon dexterous manipulation tasks demonstrate that our approach significantly improves task performance and robustness compared to previous baselines. Videos are available at lodestar-robot.github.io.
comment: CoRL 2025
Variational Shape Inference for Grasp Diffusion on SE(3)
Grasp synthesis is a fundamental task in robotic manipulation which usually has multiple feasible solutions. Multimodal grasp synthesis seeks to generate diverse sets of stable grasps conditioned on object geometry, making the robust learning of geometric features crucial for success. To address this challenge, we propose a framework for learning multimodal grasp distributions that leverages variational shape inference to enhance robustness against shape noise and measurement sparsity. Our approach first trains a variational autoencoder for shape inference using implicit neural representations, and then uses these learned geometric features to guide a diffusion model for grasp synthesis on the SE(3) manifold. Additionally, we introduce a test-time grasp optimization technique that can be integrated as a plugin to further enhance grasping performance. Experimental results demonstrate that our shape inference for grasp synthesis formulation outperforms state-of-the-art multimodal grasp synthesis methods on the ACRONYM dataset by 6.3%, while demonstrating robustness to deterioration in point cloud density compared to other approaches. Furthermore, our trained model achieves zero-shot transfer to real-world manipulation of household objects, generating 34% more successful grasps than baselines despite measurement noise and point cloud calibration errors.
SoK: Cybersecurity Assessment of Humanoid Ecosystem
Humanoids are progressing toward practical deployment across healthcare, industrial, defense, and service sectors. While typically considered cyber-physical systems (CPSs), their dependence on traditional networked software stacks (e.g., Linux operating systems), robot operating system (ROS) middleware, and over-the-air update channels, creates a distinct security profile that exposes them to vulnerabilities conventional CPS models do not fully address. Prior studies have mainly examined specific threats, such as LiDAR spoofing or adversarial machine learning (AML). This narrow focus overlooks how an attack targeting one component can cascade harm throughout the robot's interconnected systems. We address this gap through a systematization of knowledge (SoK) that takes a comprehensive approach, consolidating fragmented research from robotics, CPS, and network security domains. We introduce a seven-layer security model for humanoid robots, organizing 39 known attacks and 35 defenses across the humanoid ecosystem-from hardware to human-robot interaction. Building on this security model, we develop a quantitative 39x35 attack-defense matrix with risk-weighted scoring, validated through Monte Carlo analysis. We demonstrate our method by evaluating three real-world robots: Pepper, G1 EDU, and Digit. The scoring analysis revealed varying security maturity levels, with scores ranging from 39.9% to 79.5% across the platforms. This work introduces a structured, evidence-based assessment method that enables systematic security evaluation, supports cross-platform benchmarking, and guides prioritization of security investments in humanoid robotics.
Morphological Cognition: Classifying MNIST Digits Through Morphological Computation Alone
With the rise of modern deep learning, neural networks have become an essential part of virtually every artificial intelligence system, making it difficult even to imagine different models for intelligent behavior. In contrast, nature provides us with many different mechanisms for intelligent behavior, most of which we have yet to replicate. One of such underinvestigated aspects of intelligence is embodiment and the role it plays in intelligent behavior. In this work, we focus on how the simple and fixed behavior of constituent parts of a simulated physical body can result in an emergent behavior that can be classified as cognitive by an outside observer. Specifically, we show how simulated voxels with fixed behaviors can be combined to create a robot such that, when presented with an image of an MNIST digit zero, it moves towards the left; and when it is presented with an image of an MNIST digit one, it moves towards the right. Such robots possess what we refer to as ``morphological cognition'' -- the ability to perform cognitive behavior as a result of morphological processes. To the best of our knowledge, this is the first demonstration of a high-level mental faculty such as image classification performed by a robot without any neural circuitry. We hope that this work serves as a proof-of-concept and fosters further research into different models of intelligence.
comment: Accepted to be presented at ALife 2025 as a talk
A Synthetic Dataset for Manometry Recognition in Robotic Applications
This work addresses the challenges of data scarcity and high acquisition costs for training robust object detection models in complex industrial environments, such as offshore oil platforms. The practical and economic barriers to collecting real-world data in these hazardous settings often hamper the development of autonomous inspection systems. To overcome this, in this work we propose and validate a hybrid data synthesis pipeline that combines procedural rendering with AI-driven video generation. Our methodology leverages BlenderProc to create photorealistic images with precise annotations and controlled domain randomization, and integrates NVIDIA's Cosmos-Predict2 world-foundation model to synthesize physically plausible video sequences with temporal diversity, capturing rare viewpoints and adverse conditions. We demonstrate that a YOLO-based detection network trained on a composite dataset, blending real images with our synthetic data, achieves superior performance compared to models trained exclusively on real-world data. Notably, a 1:1 mixture of real and synthetic data yielded the highest accuracy, surpassing the real-only baseline. These findings highlight the viability of a synthetic-first approach as an efficient, cost-effective, and safe alternative for developing reliable perception systems in safety-critical and resource-constrained industrial applications.
Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation
Quadruped robots have emerged as highly efficient and versatile platforms, excelling in navigating complex and unstructured terrains where traditional wheeled robots might fail. Equipping these robots with manipulator arms unlocks the advanced capability of loco-manipulation to perform complex physical interaction tasks in areas ranging from industrial automation to search-and-rescue missions. However, achieving precise and adaptable grasping in such dynamic scenarios remains a significant challenge, often hindered by the need for extensive real-world calibration and pre-programmed grasp configurations. This paper introduces a deep learning framework designed to enhance the grasping capabilities of quadrupeds equipped with arms, focusing on improved precision and adaptability. Our approach centers on a sim-to-real methodology that minimizes reliance on physical data collection. We developed a pipeline within the Genesis simulation environment to generate a synthetic dataset of grasp attempts on common objects. By simulating thousands of interactions from various perspectives, we created pixel-wise annotated grasp-quality maps to serve as the ground truth for our model. This dataset was used to train a custom CNN with a U-Net-like architecture that processes multi-modal input from an onboard RGB and depth cameras, including RGB images, depth maps, segmentation masks, and surface normal maps. The trained model outputs a grasp-quality heatmap to identify the optimal grasp point. We validated the complete framework on a four-legged robot. The system successfully executed a full loco-manipulation task: autonomously navigating to a target object, perceiving it with its sensors, predicting the optimal grasp pose using our model, and performing a precise grasp. This work proves that leveraging simulated training with advanced sensing offers a scalable and effective solution for object handling.
Evolutionary Brain-Body Co-Optimization Consistently Fails to Select for Morphological Potential
Brain-body co-optimization remains a challenging problem, despite increasing interest from the community in recent years. To understand and overcome the challenges, we propose exhaustively mapping a morphology-fitness landscape to study it. To this end, we train controllers for each feasible morphology in a design space of 1,305,840 distinct morphologies, constrained by a computational budget. First, we show that this design space constitutes a good model for studying the brain-body co-optimization problem, and our attempt to exhaustively map it roughly captures the landscape. We then proceed to analyze how evolutionary brain-body co-optimization algorithms work in this design space. The complete knowledge of the morphology-fitness landscape facilitates a better understanding of the results of evolutionary brain-body co-optimization algorithms and how they unfold over evolutionary time in the morphology space. This investigation shows that the experimented algorithms cannot consistently find near-optimal solutions. The search, at times, gets stuck on morphologies that are sometimes one mutation away from better morphologies, and the algorithms cannot efficiently track the fitness gradient in the morphology-fitness landscape. We provide evidence that experimented algorithms regularly undervalue the fitness of individuals with newly mutated bodies and, as a result, eliminate promising morphologies throughout evolution. Our work provides the most concrete demonstration of the challenges of evolutionary brain-body co-optimization. Our findings ground the trends in the literature and provide valuable insights for future work.
comment: Accepted to be presented at ALife 2025 as a talk
Robotic Manipulation via Imitation Learning: Taxonomy, Evolution, Benchmark, and Challenges
Robotic Manipulation (RM) is central to the advancement of autonomous robots, enabling them to interact with and manipulate objects in real-world environments. This survey focuses on RM methodologies that leverage imitation learning, a powerful technique that allows robots to learn complex manipulation skills by mimicking human demonstrations. We identify and analyze the most influential studies in this domain, selected based on community impact and intrinsic quality. For each paper, we provide a structured summary, covering the research purpose, technical implementation, hierarchical classification, input formats, key priors, strengths and limitations, and citation metrics. Additionally, we trace the chronological development of imitation learning techniques within RM policy (RMP), offering a timeline of key technological advancements. Where available, we report benchmark results and perform quantitative evaluations to compare existing methods. By synthesizing these insights, this review provides a comprehensive resource for researchers and practitioners, highlighting both the state of the art and the challenges that lie ahead in the field of robotic manipulation through imitation learning.
Robust Point Cloud Registration via Geometric Overlapping Guided Rotation Search
Point cloud registration based on correspondences computes the rigid transformation that maximizes the number of inliers constrained within the noise threshold. Current state-of-the-art (SOTA) methods employing spatial compatibility graphs or branch-and-bound (BnB) search mainly focus on registration under high outlier ratios. However, graph-based methods require at least quadratic space and time complexity for graph construction, while multi-stage BnB search methods often suffer from inaccuracy due to local optima between decomposed stages. This paper proposes a geometric maximum overlapping registration framework via rotation-only BnB search. The rigid transformation is decomposed using Chasles' theorem into a translation along rotation axis and a 2D rigid transformation. The optimal rotation axis and angle are searched via BnB, with residual parameters formulated as range maximum query (RMQ) problems. Firstly, the top-k candidate rotation axes are searched within a hemisphere parameterized by cube mapping, and the translation along each axis is estimated through interval stabbing of the correspondences projected onto that axis. Secondly, the 2D registration is relaxed to 1D rotation angle search with 2D RMQ of geometric overlapping for axis-aligned rectangles, which is solved deterministically in polynomial time using sweep line algorithm with segment tree. Experimental results on 3DMatch, 3DLoMatch, and KITTI datasets demonstrate superior accuracy and efficiency over SOTA methods, while the time complexity is polynomial and the space complexity increases linearly with the number of points, even in the worst case.
OVITA: Open-Vocabulary Interpretable Trajectory Adaptations
Adapting trajectories to dynamic situations and user preferences is crucial for robot operation in unstructured environments with non-expert users. Natural language enables users to express these adjustments in an interactive manner. We introduce OVITA, an interpretable, open-vocabulary, language-driven framework designed for adapting robot trajectories in dynamic and novel situations based on human instructions. OVITA leverages multiple pre-trained Large Language Models (LLMs) to integrate user commands into trajectories generated by motion planners or those learned through demonstrations. OVITA employs code as an adaptation policy generated by an LLM, enabling users to adjust individual waypoints, thus providing flexible control. Another LLM, which acts as a code explainer, removes the need for expert users, enabling intuitive interactions. The efficacy and significance of the proposed OVITA framework is demonstrated through extensive simulations and real-world environments with diverse tasks involving spatiotemporal variations on heterogeneous robotic platforms such as a KUKA IIWA robot manipulator, Clearpath Jackal ground robot, and CrazyFlie drone.
comment: Accepted to Robotics and Automation Letters 2025. Code link: https://github.com/anurag1000101/OVITA
SEER-VAR: Semantic Egocentric Environment Reasoner for Vehicle Augmented Reality
We present SEER-VAR, a novel framework for egocentric vehicle-based augmented reality (AR) that unifies semantic decomposition, Context-Aware SLAM Branches (CASB), and LLM-driven recommendation. Unlike existing systems that assume static or single-view settings, SEER-VAR dynamically separates cabin and road scenes via depth-guided vision-language grounding. Two SLAM branches track egocentric motion in each context, while a GPT-based module generates context-aware overlays such as dashboard cues and hazard alerts. To support evaluation, we introduce EgoSLAM-Drive, a real-world dataset featuring synchronized egocentric views, 6DoF ground-truth poses, and AR annotations across diverse driving scenarios. Experiments demonstrate that SEER-VAR achieves robust spatial alignment and perceptually coherent AR rendering across varied environments. As one of the first to explore LLM-based AR recommendation in egocentric driving, we address the lack of comparable systems through structured prompting and detailed user studies. Results show that SEER-VAR enhances perceived scene understanding, overlay relevance, and driver ease, providing an effective foundation for future research in this direction. Code and dataset will be made open source.
Collaborative-Online-Learning-Enabled Distributionally Robust Motion Control for Multi-Robot Systems
This paper develops a novel COllaborative-Online-Learning (COOL)-enabled motion control framework for multi-robot systems to avoid collision amid randomly moving obstacles whose motion distributions are partially observable through decentralized data streams. To address the notable challenge of data acquisition due to occlusion, a COOL approach based on the Dirichlet process mixture model is proposed to efficiently extract motion distribution information by exchanging among robots selected learning structures. By leveraging the fine-grained local-moment information learned through COOL, a data-stream-driven ambiguity set for obstacle motion is constructed. We then introduce a novel ambiguity set propagation method, which theoretically admits the derivation of the ambiguity sets for obstacle positions over the entire prediction horizon by utilizing obstacle current positions and the ambiguity set for obstacle motion. Additionally, we develop a compression scheme with its safety guarantee to automatically adjust the complexity and granularity of the ambiguity set by aggregating basic ambiguity sets that are close in a measure space, thereby striking an attractive trade-off between control performance and computation time. Then the probabilistic collision-free trajectories are generated through distributionally robust optimization problems. The distributionally robust obstacle avoidance constraints based on the compressed ambiguity set are equivalently reformulated by deriving separating hyperplanes through tractable semi-definite programming. Finally, we establish the probabilistic collision avoidance guarantee and the long-term tracking performance guarantee for the proposed framework. The numerical simulations are used to demonstrate the efficacy and superiority of the proposed approach compared with state-of-the-art methods.
BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities
Real-world household tasks present significant challenges for mobile manipulation robots. An analysis of existing robotics benchmarks reveals that successful task performance hinges on three key whole-body control capabilities: bimanual coordination, stable and precise navigation, and extensive end-effector reachability. Achieving these capabilities requires careful hardware design, but the resulting system complexity further complicates visuomotor policy learning. To address these challenges, we introduce the BEHAVIOR Robot Suite (BRS), a comprehensive framework for whole-body manipulation in diverse household tasks. Built on a bimanual, wheeled robot with a 4-DoF torso, BRS integrates a cost-effective whole-body teleoperation interface for data collection and a novel algorithm for learning whole-body visuomotor policies. We evaluate BRS on five challenging household tasks that not only emphasize the three core capabilities but also introduce additional complexities, such as long-range navigation, interaction with articulated and deformable objects, and manipulation in confined spaces. We believe that BRS's integrated robotic embodiment, data collection interface, and learning framework mark a significant step toward enabling real-world whole-body manipulation for everyday household tasks. BRS is open-sourced at https://behavior-robot-suite.github.io/
comment: 9th Conference on Robot Learning (CoRL 2025), Seoul, Korea. Project website: https://behavior-robot-suite.github.io/
PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation
Images are the standard input for most computer vision algorithms. However, their processing often reduces to parallelizable operations applied locally and independently to individual pixels. Yet, many of these low-level raw pixel readings only provide redundant or noisy information for specific high-level tasks, leading to inefficiencies in both energy consumption during their transmission off-sensor and computational resources in their subsequent processing. As novel sensors featuring advanced in-pixel processing capabilities emerge, we envision a paradigm shift toward performing increasingly complex visual processing directly in-pixel, reducing computational overhead downstream. We advocate for synthesizing high-level cues at the pixel level, enabling their off-sensor transmission to directly support downstream tasks more effectively than raw pixel readings. This paper conceptualizes a novel photometric rotation estimation algorithm to be distributed at pixel level, where each pixel estimates the global motion of the camera by exchanging information with other pixels to achieve global consensus. We employ a probabilistic formulation and leverage Gaussian Belief Propagation (GBP) for decentralized inference using messaging-passing. The proposed proposed technique is evaluated on real-world public datasets and we offer a in-depth analysis of the practicality of applying GBP to distributed rotation estimation at pixel level.
FetchBot: Learning Generalizable Object Fetching in Cluttered Scenes via Zero-Shot Sim2Real
Generalizable object fetching in cluttered scenes remains a fundamental and application-critical challenge in embodied AI. Closely packed objects cause inevitable occlusions, making safe action generation particularly difficult. Under such partial observability, effective policies must not only generalize across diverse objects and layouts but also reason about occlusion to avoid collisions. However, collecting large-scale real-world data for this task remains prohibitively expensive, leaving this problem largely unsolved. In this paper, we introduce FetchBot, a sim-to-real framework for this challenge. We first curate a large-scale synthetic dataset featuring 1M diverse scenes and 500k representative demonstrations. Based on this dataset, FetchBot employs a depth-conditioned method for action generation, which leverages structural cues to enable robust obstacle-aware action planning. However, depth is perfect in simulation but noisy in real-world environments. To address this sim-to-real gap, FetchBot predicts depth from RGB inputs using a foundation model and integrates local occupancy prediction as a pre-training task, providing a generalizable latent representation for sim-to-real transfer. Extensive experiments in simulation and real-world environments demonstrate the strong zero-shot sim-to-real transfer, effective clutter handling, and adaptability to novel scenarios. In cluttered environments, it achieves an average real-world success rate of 89.95%, significantly outperforming prior methods. Moreover, FetchBot demonstrates excellent robustness in challenging cases, such as fetching transparent, reflective, and irregular objects, highlighting its practical value.
comment: 9th Annual Conference on Robot Learning (CoRL 2025, Oral)
AutoMisty: A Multi-Agent LLM Framework for Automated Code Generation in the Misty Social Robot IROS 2025
The social robot's open API allows users to customize open-domain interactions. However, it remains inaccessible to those without programming experience. In this work, we introduce AutoMisty, the first multi-agent collaboration framework powered by large language models (LLMs), to enable the seamless generation of executable Misty robot code from natural language instructions. AutoMisty incorporates four specialized agent modules to manage task decomposition, assignment, problem-solving, and result synthesis. Each agent incorporates a two-layer optimization mechanism, with self-reflection for iterative refinement and human-in-the-loop for better alignment with user preferences. AutoMisty ensures a transparent reasoning process, allowing users to iteratively refine tasks through natural language feedback for precise execution. To evaluate AutoMisty's effectiveness, we designed a benchmark task set spanning four levels of complexity and conducted experiments in a real Misty robot environment. Extensive evaluations demonstrate that AutoMisty not only consistently generates high-quality code but also enables precise code control, significantly outperforming direct reasoning with ChatGPT-4o and ChatGPT-o1. All code, optimized APIs, and experimental videos will be publicly released through the webpage: https://wangxiaoshawn.github.io/AutoMisty.html
comment: Accepted by IROS 2025
Locomotion on Constrained Footholds via Layered Architectures and Model Predictive Control
Computing stabilizing and optimal control actions for legged locomotion in real time is difficult due to the nonlinear, hybrid, and high dimensional nature of these robots. The hybrid nature of the system introduces a combination of discrete and continuous variables which causes issues for numerical optimal control. To address these challenges, we propose a layered architecture that separates the choice of discrete variables and a smooth Model Predictive Controller (MPC). The layered formulation allows for online flexibility and optimality without sacrificing real-time performance through a combination of gradient-free and gradient-based methods. The architecture leverages a sampling-based method for determining discrete variables, and a classical smooth MPC formulation using these fixed discrete variables. We demonstrate the results on a quadrupedal robot stepping over gaps and onto terrain with varying heights. In simulation, we demonstrate the controller on a humanoid robot for gap traversal. The layered approach is shown to be more optimal and reliable than common heuristic-based approaches and faster to compute than pure sampling methods.
comment: Accepted to Humanoids 2025
ToddlerBot: Open-Source ML-Compatible Humanoid Platform for Loco-Manipulation
Learning-based robotics research driven by data demands a new approach to robot hardware design-one that serves as both a platform for policy execution and a tool for embodied data collection to train policies. We introduce ToddlerBot, a low-cost, open-source humanoid robot platform designed for scalable policy learning and research in robotics and AI. ToddlerBot enables seamless acquisition of high-quality simulation and real-world data. The plug-and-play zero-point calibration and transferable motor system identification ensure a high-fidelity digital twin, enabling zero-shot policy transfer from simulation to the real world. A user-friendly teleoperation interface facilitates streamlined real-world data collection for learning motor skills from human demonstrations. Utilizing its data collection ability and anthropomorphic design, ToddlerBot is an ideal platform to perform whole-body loco-manipulation. Additionally, ToddlerBot's compact size (0.56m, 3.4kg) ensures safe operation in real-world environments. Reproducibility is achieved with an entirely 3D-printed, open-source design and commercially available components, keeping the total cost under 6,000 USD. Comprehensive documentation allows assembly and maintenance with basic technical expertise, as validated by a successful independent replication of the system. We demonstrate ToddlerBot's capabilities through arm span, payload, endurance tests, loco-manipulation tasks, and a collaborative long-horizon scenario where two robots tidy a toy session together. By advancing ML-compatibility, capability, and reproducibility, ToddlerBot provides a robust platform for scalable learning and dynamic policy execution in robotics research.
comment: Project website: https://toddlerbot.github.io/
Multiagent Systems
Price of Uncertainty for Consensus Games
Many game-theoretic models assume that players have access to accurate information, but uncertainty in observed data is frequently present in real-world settings. In this paper, we consider a model of uncertainty where adversarial perturbations of relative magnitude $1+\varepsilon$ are introduced to players' observed costs. The effect of uncertainty on social cost is denoted as the price of uncertainty. We prove a tight bound on the price of uncertainty for consensus games of $\Theta(\varepsilon^2 n^2)$ for all $\varepsilon = \Omega\mathopen{}\left(n^{-1/4}\right)$. This improves a previous lower bound of $\Omega(\varepsilon^3 n^2)$ as well as a previous upper bound of $O(\varepsilon n^2)$.
comment: 14 pages
Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?
Multi-Agent Debate~(MAD) has emerged as a promising paradigm for improving the performance of large language models through collaborative reasoning. Despite recent advances, the key factors driving MAD's effectiveness remain unclear. In this work, we disentangle MAD into two key components--Majority Voting and inter-agent Debate--and assess their respective contributions. Through extensive experiments across seven NLP benchmarks, we find that Majority Voting alone accounts for most of the performance gains typically attributed to MAD. To explain this, we propose a theoretical framework that models debate as a stochastic process. We prove that it induces a martingale over agents' belief trajectories, implying that debate alone does not improve expected correctness. Guided by these insights, we demonstrate that targeted interventions, by biasing the belief update toward correction, can meaningfully enhance debate effectiveness. Overall, our findings suggest that while MAD has potential, simple ensembling methods remain strong and more reliable alternatives in many practical settings. Code is released in https://github.com/deeplearning-wisc/debate-or-vote.
A Consensus Algorithm for Second-Order Systems Evolving on Lie Groups
In this paper, a consensus algorithm is proposed for interacting multi-agents, which can be modeled as simple Mechanical Control Systems (MCS) evolving on a general Lie group. The standard Laplacian flow consensus algorithm for double integrator systems evolving on Euclidean spaces is extended to a general Lie group. A tracking error function is defined on a general smooth manifold for measuring the error between the configurations of two interacting agents. The stability of the desired consensus equilibrium is proved using a generalized version of Lyapunov theory and LaSalle's invariance principle applicable for systems evolving on a smooth manifold. The proposed consensus control input requires only the configuration information of the neighboring agents and does not require their velocities and inertia tensors. The design of tracking error function and consensus control inputs are demonstrated through an application of attitude consensus problem for multiple communicating rigid bodies. The consensus algorithm is numerically validated by demonstrating the attitude consensus problem.
Evolving Collective Cognition in Human-Agent Hybrid Societies: How Agents Form Stances and Boundaries
Large language models have been widely used to simulate credible human social behaviors. However, it remains unclear whether these models can demonstrate stable capacities for stance formation and identity negotiation in complex interactions, as well as how they respond to human interventions. We propose a computational multi-agent society experiment framework that integrates generative agent-based modeling with virtual ethnographic methods to investigate how group stance differentiation and social boundary formation emerge in human-agent hybrid societies. Across three studies, we find that agents exhibit endogenous stances, independent of their preset identities, and display distinct tonal preferences and response patterns to different discourse strategies. Furthermore, through language interaction, agents actively dismantle existing identity-based power structures and reconstruct self-organized community boundaries based on these stances. Our findings suggest that preset identities do not rigidly determine the agents' social structures. For human researchers to effectively intervene in collective cognition, attention must be paid to the endogenous mechanisms and interactional dynamics within the agents' language networks. These insights provide a theoretical foundation for using generative AI in modeling group social dynamics and studying human-agent collaboration.
comment: 37 pages, 6 figures
PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation
Images are the standard input for most computer vision algorithms. However, their processing often reduces to parallelizable operations applied locally and independently to individual pixels. Yet, many of these low-level raw pixel readings only provide redundant or noisy information for specific high-level tasks, leading to inefficiencies in both energy consumption during their transmission off-sensor and computational resources in their subsequent processing. As novel sensors featuring advanced in-pixel processing capabilities emerge, we envision a paradigm shift toward performing increasingly complex visual processing directly in-pixel, reducing computational overhead downstream. We advocate for synthesizing high-level cues at the pixel level, enabling their off-sensor transmission to directly support downstream tasks more effectively than raw pixel readings. This paper conceptualizes a novel photometric rotation estimation algorithm to be distributed at pixel level, where each pixel estimates the global motion of the camera by exchanging information with other pixels to achieve global consensus. We employ a probabilistic formulation and leverage Gaussian Belief Propagation (GBP) for decentralized inference using messaging-passing. The proposed proposed technique is evaluated on real-world public datasets and we offer a in-depth analysis of the practicality of applying GBP to distributed rotation estimation at pixel level.
AutoMisty: A Multi-Agent LLM Framework for Automated Code Generation in the Misty Social Robot IROS 2025
The social robot's open API allows users to customize open-domain interactions. However, it remains inaccessible to those without programming experience. In this work, we introduce AutoMisty, the first multi-agent collaboration framework powered by large language models (LLMs), to enable the seamless generation of executable Misty robot code from natural language instructions. AutoMisty incorporates four specialized agent modules to manage task decomposition, assignment, problem-solving, and result synthesis. Each agent incorporates a two-layer optimization mechanism, with self-reflection for iterative refinement and human-in-the-loop for better alignment with user preferences. AutoMisty ensures a transparent reasoning process, allowing users to iteratively refine tasks through natural language feedback for precise execution. To evaluate AutoMisty's effectiveness, we designed a benchmark task set spanning four levels of complexity and conducted experiments in a real Misty robot environment. Extensive evaluations demonstrate that AutoMisty not only consistently generates high-quality code but also enables precise code control, significantly outperforming direct reasoning with ChatGPT-4o and ChatGPT-o1. All code, optimized APIs, and experimental videos will be publicly released through the webpage: https://wangxiaoshawn.github.io/AutoMisty.html
comment: Accepted by IROS 2025
An Outlook on the Opportunities and Challenges of Multi-Agent AI Systems
A multi-agent AI system (MAS) is composed of multiple autonomous agents that interact, exchange information, and make decisions based on internal generative models. Recent advances in large language models and tool-using agents have made MAS increasingly practical in areas like scientific discovery and collaborative automation. However, key questions remain: When are MAS more effective than single-agent systems? What new safety risks arise from agent interactions? And how should we evaluate their reliability and structure? This paper outlines a formal framework for analyzing MAS, focusing on two core aspects: effectiveness and safety. We explore whether MAS truly improve robustness, adaptability, and performance, or merely repackage known techniques like ensemble learning. We also study how inter-agent dynamics may amplify or suppress system vulnerabilities. While MAS are relatively new to the signal processing community, we envision them as a powerful abstraction that extends classical tools like distributed estimation and sensor fusion to higher-level, policy-driven inference. Through experiments on data science automation, we highlight the potential of MAS to reshape how signal processing systems are designed and trusted.
comment: Corrected references
Multiagent Systems
Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol
Recent advances in generalist multi-agent systems (MAS) have largely followed a context-engineering plus centralized paradigm, where a planner agent coordinates multiple worker agents through unidirectional prompt passing. While effective under strong planner models, this design suffers from two critical limitations: (1) strong dependency on the planner's capability, which leads to degraded performance when a smaller LLM powers the planner; and (2) limited inter-agent communication, where collaboration relies on costly prompt concatenation and context injection, introducing redundancy and information loss. To address these challenges, we propose Anemoi, a semi-centralized MAS built on the Agent-to-Agent (A2A) communication MCP server from Coral Protocol. Unlike traditional designs, Anemoi enables structured and direct inter-agent collaboration, allowing all agents to monitor progress, assess results, identify bottlenecks, and propose refinements in real time. This paradigm reduces reliance on a single planner, supports adaptive plan updates, and minimizes redundant context passing, resulting in more scalable and cost-efficient execution. Evaluated on the GAIA benchmark, Anemoi achieved 52.73\% accuracy with a small LLM (GPT-4.1-mini) as the planner, surpassing the strongest open-source baseline OWL (43.63\%) by +9.09\% under identical LLM settings. Our implementation is publicly available at https://github.com/Coral-Protocol/Anemoi.
Understanding Action Effects through Instrumental Empowerment in Multi-Agent Reinforcement Learning ECAI
To reliably deploy Multi-Agent Reinforcement Learning (MARL) systems, it is crucial to understand individual agent behaviors. While prior work typically evaluates overall team performance based on explicit reward signals, it is unclear how to infer agent contributions in the absence of any value feedback. In this work, we investigate whether meaningful insights into agent behaviors can be extracted solely by analyzing the policy distribution. Inspired by the phenomenon that intelligent agents tend to pursue convergent instrumental values, we introduce Intended Cooperation Values (ICVs), a method based on information-theoretic Shapley values for quantifying each agent's causal influence on their co-players' instrumental empowerment. Specifically, ICVs measure an agent's action effect on its teammates' policies by assessing their decision (un)certainty and preference alignment. By analyzing action effects on policies and value functions across cooperative and competitive MARL tasks, our method identifies which agent behaviors are beneficial to team success, either by fostering deterministic decisions or by preserving flexibility for future action choices, while also revealing the extent to which agents adopt similar or diverse strategies. Our proposed method offers novel insights into cooperation dynamics and enhances explainability in MARL systems.
comment: European Conference on Artificial Intelligence (ECAI) 2025
Effective Red-Teaming of Policy-Adherent Agents
Task-oriented LLM-based agents are increasingly used in domains with strict policies, such as refund eligibility or cancellation rules. The challenge lies in ensuring that the agent consistently adheres to these rules and policies, appropriately refusing any request that would violate them, while still maintaining a helpful and natural interaction. This calls for the development of tailored design and evaluation methodologies to ensure agent resilience against malicious user behavior. We propose a novel threat model that focuses on adversarial users aiming to exploit policy-adherent agents for personal benefit. To address this, we present CRAFT, a multi-agent red-teaming system that leverages policy-aware persuasive strategies to undermine a policy-adherent agent in a customer-service scenario, outperforming conventional jailbreak methods such as DAN prompts, emotional manipulation, and coercive. Building upon the existing tau-bench benchmark, we introduce tau-break, a complementary benchmark designed to rigorously assess the agent's robustness against manipulative user behavior. Finally, we evaluate several straightforward yet effective defense strategies. While these measures provide some protection, they fall short, highlighting the need for stronger, research-driven safeguards to protect policy-adherent agents from adversarial attacks
Operator: A Protocol for Trustless Verification Under Uncertainty
Correctness is an emergent property of systems where exposing error is cheaper than committing it. In dynamic, low-trust environments, autonomous AI agents benefit from delegating work to sub-agents, yet correctness cannot be assured through upfront specification or centralized oversight. We propose a protocol that enforces correctness through collateralized claims in a recursive verification game. Tasks are published as intents, and solvers compete to fulfill them. Selected solvers carry out tasks under risk, with correctness checked post hoc by verifiers. Any challenger can challenge a result by staking against it to trigger the verification process. Incorrect agents are slashed and correct opposition is rewarded, with an escalation path that penalizes erroneous verifiers themselves. When incentives are aligned across solvers, challengers, and verifiers, falsification conditions make correctness the Nash equilibrium.
comment: 9 pages, 1 figure
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
Multi-turn interactions with language models (LMs) pose critical safety risks, as harmful intent can be strategically spread across exchanges. Yet, the vast majority of prior work has focused on single-turn safety, while adaptability and diversity remain among the key challenges of multi-turn red-teaming. To address these challenges, we present X-Teaming, a scalable framework that systematically explores how seemingly harmless interactions escalate into harmful outcomes and generates corresponding attack scenarios. X-Teaming employs collaborative agents for planning, attack optimization, and verification, achieving state-of-the-art multi-turn jailbreak effectiveness and diversity with success rates up to 98.1% across representative leading open-weight and closed-source models. In particular, X-Teaming achieves a 96.2% attack success rate against the latest Claude 3.7 Sonnet model, which has been considered nearly immune to single-turn attacks. Building on X-Teaming, we introduce XGuard-Train, an open-source multi-turn safety training dataset that is 20x larger than the previous best resource, comprising 30K interactive jailbreaks, designed to enable robust multi-turn safety alignment for LMs. Our work offers essential tools and insights for mitigating sophisticated conversational attacks, advancing the multi-turn safety of LMs.
Robotics
LaGarNet: Goal-Conditioned Recurrent State-Space Models for Pick-and-Place Garment Flattening
We present a novel goal-conditioned recurrent state space (GC-RSSM) model capable of learning latent dynamics of pick-and-place garment manipulation. Our proposed method LaGarNet matches the state-of-the-art performance of mesh-based methods, marking the first successful application of state-space models on complex garments. LaGarNet trains on a coverage-alignment reward and a dataset collected through a general procedure supported by a random policy and a diffusion policy learned from few human demonstrations; it substantially reduces the inductive biases introduced in the previous similar methods. We demonstrate that a single-policy LaGarNet achieves flattening on four different types of garments in both real-world and simulation settings.
comment: 20 pages, 11 figures and 3 tables
DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method
Previous dominant methods for scene flow estimation focus mainly on input from two consecutive frames, neglecting valuable information in the temporal domain. While recent trends shift towards multi-frame reasoning, they suffer from rapidly escalating computational costs as the number of frames grows. To leverage temporal information more efficiently, we propose DeltaFlow ($\Delta$Flow), a lightweight 3D framework that captures motion cues via a $\Delta$ scheme, extracting temporal features with minimal computational cost, regardless of the number of frames. Additionally, scene flow estimation faces challenges such as imbalanced object class distributions and motion inconsistency. To tackle these issues, we introduce a Category-Balanced Loss to enhance learning across underrepresented classes and an Instance Consistency Loss to enforce coherent object motion, improving flow accuracy. Extensive evaluations on the Argoverse 2 and Waymo datasets show that $\Delta$Flow achieves state-of-the-art performance with up to 22% lower error and $2\times$ faster inference compared to the next-best multi-frame supervised method, while also demonstrating a strong cross-domain generalization ability. The code is open-sourced at https://github.com/Kin-Zhang/DeltaFlow along with trained model weights.
comment: 17 pages (9 main pages + 8 supp materail), 11 figures, code at https://github.com/Kin-Zhang/DeltaFlow
M3DMap: Object-aware Multimodal 3D Mapping for Dynamic Environments
3D mapping in dynamic environments poses a challenge for modern researchers in robotics and autonomous transportation. There are no universal representations for dynamic 3D scenes that incorporate multimodal data such as images, point clouds, and text. This article takes a step toward solving this problem. It proposes a taxonomy of methods for constructing multimodal 3D maps, classifying contemporary approaches based on scene types and representations, learning methods, and practical applications. Using this taxonomy, a brief structured analysis of recent methods is provided. The article also describes an original modular method called M3DMap, designed for object-aware construction of multimodal 3D maps for both static and dynamic scenes. It consists of several interconnected components: a neural multimodal object segmentation and tracking module; an odometry estimation module, including trainable algorithms; a module for 3D map construction and updating with various implementations depending on the desired scene representation; and a multimodal data retrieval module. The article highlights original implementations of these modules and their advantages in solving various practical tasks, from 3D object grounding to mobile manipulation. Additionally, it presents theoretical propositions demonstrating the positive effect of using multimodal data and modern foundational models in 3D mapping methods. Details of the taxonomy and method implementation are available at https://yuddim.github.io/M3DMap.
comment: 29 pages, 3 figures, 13 tables. Preprint of the accepted article in Optical Memory and Neural Network Journal
A Rapid Iterative Trajectory Planning Method for Automated Parking through Differential Flatness
As autonomous driving continues to advance, automated parking is becoming increasingly essential. However, significant challenges arise when implementing path velocity decomposition (PVD) trajectory planning for automated parking. The primary challenge is ensuring rapid and precise collision-free trajectory planning, which is often in conflict. The secondary challenge involves maintaining sufficient control feasibility of the planned trajectory, particularly at gear shifting points (GSP). This paper proposes a PVD-based rapid iterative trajectory planning (RITP) method to solve the above challenges. The proposed method effectively balances the necessity for time efficiency and precise collision avoidance through a novel collision avoidance framework. Moreover, it enhances the overall control feasibility of the planned trajectory by incorporating the vehicle kinematics model and including terminal smoothing constraints (TSC) at GSP during path planning. Specifically, the proposed method leverages differential flatness to ensure the planned path adheres to the vehicle kinematic model. Additionally, it utilizes TSC to maintain curvature continuity at GSP, thereby enhancing the control feasibility of the overall trajectory. The simulation results demonstrate superior time efficiency and tracking errors compared to model-integrated and other iteration-based trajectory planning methods. In the real-world experiment, the proposed method was implemented and validated on a ROS-based vehicle, demonstrating the applicability of the RITP method for real vehicles.
comment: Published in the journal Robotics and Autonomous Systems
DualReg: Dual-Space Filtering and Reinforcement for Rigid Registration
Rigid registration, aiming to estimate a rigid transformation to align source and target data, play a crucial role in applications such as SLAM and 3D reconstruction. However, noisy, partially overlapping data and the need for real-time processing pose major challenges for rigid registration. Considering that feature-based matching can handle large transformation differences but suffers from limited accuracy, while local geometry-based matching can achieve fine-grained local alignment but relies heavily on a good initial transformation, we propose a novel dual-space paradigm to fully leverage the strengths of both approaches. First, we introduce an efficient filtering mechanism that incorporates a computationally lightweight single-point RANSAC algorithm followed by a refinement module to eliminate unreliable feature-based correspondences. Subsequently, we treat filtered correspondences as anchor points, extract geometric proxies, and formulates an effective objective function with a tailored solver to estimate the transformation. Experiments verify our method's effectiveness, as shown by achieving up to a 32x CPU-time speedup over MAC on KITTI with comparable accuracy.
Fiducial Marker Splatting for High-Fidelity Robotics Simulations
High-fidelity 3D simulation is critical for training mobile robots, but its traditional reliance on mesh-based representations often struggle in complex environments, such as densely packed greenhouses featuring occlusions and repetitive structures. Recent neural rendering methods, like Gaussian Splatting (GS), achieve remarkable visual realism but lack flexibility to incorporate fiducial markers, which are essential for robotic localization and control. We propose a hybrid framework that combines the photorealism of GS with structured marker representations. Our core contribution is a novel algorithm for efficiently generating GS-based fiducial markers (e.g., AprilTags) within cluttered scenes. Experiments show that our approach outperforms traditional image-fitting techniques in both efficiency and pose-estimation accuracy. We further demonstrate the framework's potential in a greenhouse simulation. This agricultural setting serves as a challenging testbed, as its combination of dense foliage, similar-looking elements, and occlusions pushes the limits of perception, thereby highlighting the framework's value for real-world applications.
LLM-based Human-like Traffic Simulation for Self-driving Tests
Ensuring realistic traffic dynamics is a prerequisite for simulation platforms to evaluate the reliability of self-driving systems before deployment in the real world. Because most road users are human drivers, reproducing their diverse behaviors within simulators is vital. Existing solutions, however, typically rely on either handcrafted heuristics or narrow data-driven models, which capture only fragments of real driving behaviors and offer limited driving style diversity and interpretability. To address this gap, we introduce HDSim, an HD traffic generation framework that combines cognitive theory with large language model (LLM) assistance to produce scalable and realistic traffic scenarios within simulation platforms. The framework advances the state of the art in two ways: (i) it introduces a hierarchical driver model that represents diverse driving style traits, and (ii) it develops a Perception-Mediated Behavior Influence strategy, where LLMs guide perception to indirectly shape driver actions. Experiments reveal that embedding HDSim into simulation improves detection of safety-critical failures in self-driving systems by up to 68% and yields realism-consistent accident interpretability.
Drive As You Like: Strategy-Level Motion Planning Based on A Multi-Head Diffusion Model AAAI 2026
Recent advances in motion planning for autonomous driving have led to models capable of generating high-quality trajectories. However, most existing planners tend to fix their policy after supervised training, leading to consistent but rigid driving behaviors. This limits their ability to reflect human preferences or adapt to dynamic, instruction-driven demands. In this work, we propose a diffusion-based multi-head trajectory planner(M-diffusion planner). During the early training stage, all output heads share weights to learn to generate high-quality trajectories. Leveraging the probabilistic nature of diffusion models, we then apply Group Relative Policy Optimization (GRPO) to fine-tune the pre-trained model for diverse policy-specific behaviors. At inference time, we incorporate a large language model (LLM) to guide strategy selection, enabling dynamic, instruction-aware planning without switching models. Closed-loop simulation demonstrates that our post-trained planner retains strong planning capability while achieving state-of-the-art (SOTA) performance on the nuPlan val14 benchmark. Open-loop results further show that the generated trajectories exhibit clear diversity, effectively satisfying multi-modal driving behavior requirements. The code and related experiments will be released upon acceptance of the paper.
comment: Has been submitted to AAAI 2026
HumanoidVerse: A Versatile Humanoid for Vision-Language Guided Multi-Object Rearrangement
We introduce HumanoidVerse, a novel framework for vision-language guided humanoid control that enables a single physically simulated robot to perform long-horizon, multi-object rearrangement tasks across diverse scenes. Unlike prior methods that operate in fixed settings with single-object interactions, our approach supports consecutive manipulation of multiple objects, guided only by natural language instructions and egocentric camera RGB observations. HumanoidVerse is trained via a multi-stage curriculum using a dual-teacher distillation pipeline, enabling fluid transitions between sub-tasks without requiring environment resets. To support this, we construct a large-scale dataset comprising 350 multi-object tasks spanning four room layouts. Extensive experiments in the Isaac Gym simulator demonstrate that our method significantly outperforms prior state-of-the-art in both task success rate and spatial precision, and generalizes well to unseen environments and instructions. Our work represents a key step toward robust, general-purpose humanoid agents capable of executing complex, sequential tasks under real-world sensory constraints. The video visualization results can be found on the project page: https://haozhuo-zhang.github.io/HumanoidVerse-project-page/.
comment: Project Page: https://haozhuo-zhang.github.io/HumanoidVerse-project-page/
Relative Navigation and Dynamic Target Tracking for Autonomous Underwater Proximity Operations SP
Estimating a target's 6-DoF motion in underwater proximity operations is difficult because the chaser lacks target-side proprioception and the available relative observations are sparse, noisy, and often partial (e.g., Ultra-Short Baseline (USBL) positions). Without a motion prior, factor-graph maximum a posteriori estimation is underconstrained: consecutive target states are weakly linked and orientation can drift. We propose a generalized constant-twist motion prior defined on the tangent space of Lie groups that enforces temporally consistent trajectories across all degrees of freedom; in SE(3) it couples translation and rotation in the body frame. We present a ternary factor and derive its closed-form Jacobians based on standard Lie group operations, enabling drop-in use for trajectories on arbitrary Lie groups. We evaluate two deployment modes: (A) an SE(3)-only representation that regularizes orientation even when only position is measured, and (B) a mode with boundary factors that switches the target representation between SE(3) and 3D position while applying the same generalized constant-twist prior across representation changes. Validation on a real-world dynamic docking scenario dataset shows consistent ego-target trajectory estimation through USBL-only and optical relative measurement segments with an improved relative tracking accuracy compared to the noisy measurements to the target. Because the construction relies on standard Lie group primitives, it is portable across state manifolds and sensing modalities.
comment: 10 pages, 7 figures. Equal contribution by David Baxter and Aldo Ter\'an Espinoza. Supported by SAAB, SMaRC, and WASP. Supported by SAAB and the Swedish Maritime Robotics Centre (SMaRC), and by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation
A Workflow for Map Creation in Autonomous Vehicle Simulations
The fast development of technology and artificial intelligence has significantly advanced Autonomous Vehicle (AV) research, emphasizing the need for extensive simulation testing. Accurate and adaptable maps are critical in AV development, serving as the foundation for localization, path planning, and scenario testing. However, creating simulation-ready maps is often difficult and resource-intensive, especially with simulators like CARLA (CAR Learning to Act). Many existing workflows require significant computational resources or rely on specific simulators, limiting flexibility for developers. This paper presents a custom workflow to streamline map creation for AV development, demonstrated through the generation of a 3D map of a parking lot at Ontario Tech University. Future work will focus on incorporating SLAM technologies, optimizing the workflow for broader simulator compatibility, and exploring more flexible handling of latitude and longitude values to enhance map generation accuracy.
comment: 6 pages, 12 figures. Published in the Proceedings of GEOProcessing 2025: The Seventeenth International Conference on Advanced Geographic Information Systems, Applications, and Services (IARIA)
A Photorealistic Dataset and Vision-Based Algorithm for Anomaly Detection During Proximity Operations in Lunar Orbit
NASA's forthcoming Lunar Gateway space station, which will be uncrewed most of the time, will need to operate with an unprecedented level of autonomy. One key challenge is enabling the Canadarm3, the Gateway's external robotic system, to detect hazards in its environment using its onboard inspection cameras. This task is complicated by the extreme and variable lighting conditions in space. In this paper, we introduce the visual anomaly detection and localization task for the space domain and establish a benchmark based on a synthetic dataset called ALLO (Anomaly Localization in Lunar Orbit). We show that state-of-the-art visual anomaly detection methods often fail in the space domain, motivating the need for new approaches. To address this, we propose MRAD (Model Reference Anomaly Detection), a statistical algorithm that leverages the known pose of the Canadarm3 and a CAD model of the Gateway to generate reference images of the expected scene appearance. Anomalies are then identified as deviations from this model-generated reference. On the ALLO dataset, MRAD surpasses state-of-the-art anomaly detection algorithms, achieving an AP score of 62.1% at the pixel level and an AUROC score of 74.9% at the image level. Given the low tolerance for risk in space operations and the lack of domain-specific data, we emphasize the need for novel, robust, and accurate anomaly detection methods to handle the challenging visual conditions found in lunar orbit and beyond.
comment: 9 pages, 6 figures
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue. While for known environments, offline methods can find provably complete paths, and in some cases optimal solutions, unknown environments need to be planned online during mapping. We investigate the suitability of continuous-space reinforcement learning (RL) for this challenging problem, and propose a computationally feasible egocentric map representation based on frontiers, as well as a novel reward term based on total variation to promote complete coverage. Compared to existing classical methods, this approach allows for a flexible path space, and enables the agent to adapt to specific environment characteristics. Meanwhile, the deployment of RL models on real robot systems is difficult. Training from scratch may be infeasible due to slow convergence times, while transferring from simulation to reality, i.e. sim-to-real transfer, is a key challenge in itself. We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles to enable environment randomization and automated episode resetting. We investigate what level of fine-tuning is needed for adapting to a realistic setting. Through extensive experiments, we show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods across multiple CPP variations in simulation. Meanwhile, our method successfully transfers to a real robot. Our code implementation can be found online.
comment: Published in IEEE Access
GraphCoT-VLA: A 3D Spatial-Aware Reasoning Vision-Language-Action Model for Robotic Manipulation with Ambiguous Instructions
Vision-language-action models have emerged as a crucial paradigm in robotic manipulation. However, existing VLA models exhibit notable limitations in handling ambiguous language instructions and unknown environmental states. Furthermore, their perception is largely constrained to static two-dimensional observations, lacking the capability to model three-dimensional interactions between the robot and its environment. To address these challenges, this paper proposes GraphCoT-VLA, an efficient end-to-end model. To enhance the model's ability to interpret ambiguous instructions and improve task planning, we design a structured Chain-of-Thought reasoning module that integrates high-level task understanding and planning, failed task feedback, and low-level imaginative reasoning about future object positions and robot actions. Additionally, we construct a real-time updatable 3D Pose-Object graph, which captures the spatial configuration of robot joints and the topological relationships between objects in 3D space, enabling the model to better understand and manipulate their interactions. We further integrates a dropout hybrid reasoning strategy to achieve efficient control outputs. Experimental results across multiple real-world robotic tasks demonstrate that GraphCoT-VLA significantly outperforms existing methods in terms of task success rate and response speed, exhibiting strong generalization and robustness in open environments and under uncertain instructions.
comment: 10 pages, 6 figures
Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations
Recent advances in large Language Models (LLMs) have revolutionized mobile robots, including unmanned aerial vehicles (UAVs), enabling their intelligent operation within Internet of Things (IoT) ecosystems. However, LLMs still face challenges from logical reasoning and complex decision-making, leading to concerns about the reliability of LLM-driven UAV operations in IoT applications. In this paper, we propose a closed-loop LLM-driven UAV operation code generation framework that enables reliable UAV operations powered by effective feedback and refinement using two LLM modules, i.e., a Code Generator and an Evaluator. Our framework transforms numerical state observations from UAV operations into semantic trajectory descriptions to enhance the evaluator LLM's understanding of UAV dynamics for precise feedback generation. Our framework also enables a simulation-based refinement process, and hence eliminates the risks to physical UAVs caused by incorrect code execution during the refinement. Extensive experiments on UAV control tasks with different complexities are conducted. The experimental results show that our framework can achieve reliable UAV operations using LLMs, which significantly outperforms baseline methods in terms of success rate and completeness with the increase of task complexity.
comment: 12 pages, 9 figures
Systems and Control (CS)
Enhancing Energy and Spectral Efficiency in IoT-Cellular Networks via Active SIM-Equipped LEO Satellites
This paper investigates a low Earth orbit (LEO) satellite communication system enhanced by an active stacked intelligent metasurface (ASIM), mounted on the backplate of the satellite solar panels to efficiently utilize limited onboard space and reduce the main satellite power amplifier requirements. The system serves multiple ground users via rate-splitting multiple access (RSMA) and IoT devices through a symbiotic radio network. Multi-layer sequential processing in the ASIM improves effective channel gains and suppresses inter-user interference, outperforming active RIS and beyond-diagonal RIS designs. Three optimization approaches are evaluated: block coordinate descent with successive convex approximation (BCD-SCA), model-assisted multi-agent constraint soft actor-critic (MA-CSAC), and multi-constraint proximal policy optimization (MCPPO). Simulation results show that BCD-SCA converges fast and stably in convex scenarios without learning, MCPPO achieves rapid initial convergence with moderate stability, and MA-CSAC attains the highest long-term spectral and energy efficiency in large-scale networks. Energy-spectral efficiency trade-offs are analyzed for different ASIM elements, satellite antennas, and transmit power. Overall, the study demonstrates that integrating multi-layer ASIM with suitable optimization algorithms offers a scalable, energy-efficient, and high-performance solution for next-generation LEO satellite communications.
Frequency Response Identification of Low-Order Systems: Finite-Sample Analysis
This paper proposes a frequency-domain system identification method for learning low-order systems. The identification problem is formulated as the minimization of the l2 norm between the identified and measured frequency responses, with the nuclear norm of the Loewner matrix serving as a regularization term. This formulation results in an optimization problem that can be efficiently solved using standard convex optimization techniques. We derive an upper bound on the sampled-frequency complexity of the identification process and subsequently extend this bound to characterize the identification error over all frequencies. A detailed analysis of the sample complexity is provided, along with a thorough interpretation of its terms and dependencies. Finally, the efficacy of the proposed method is demonstrated through an example, along with numerical simulations validating the growth rate of the sample complexity bound.
comment: 15 pages, Submitted to IEEE Transactions on Automatic Control
Towards Deeper Understanding of Natural User Interactions in Virtual Reality Based Assembly Tasks
We explore natural user interactions using a virtual reality simulation of a robot arm for assembly tasks. Using a Wizard-of-Oz study, participants completed collaborative LEGO and instructive PCB assembly tasks, with the robot responding under experimenter control. We collected voice, hand tracking, and gaze data from users. Statistical analyses revealed that instructive and collaborative scenarios elicit distinct behaviors and adopted strategies, particularly as tasks progress. Users tended to use put-that-there language in spatially ambiguous contexts and more descriptive instructions in spatially clear ones. Our contributions include the identification of natural interaction strategies through analyses of collected data, as well as the supporting dataset, to guide the understanding and design of natural multimodal user interfaces for instructive interaction with systems in virtual reality.
comment: To be submitted in a future conference, this is the author version pre-print
Convolutional Neural Networks for Accurate Measurement of Train Speed
In this study, we explore the use of Convolutional Neural Networks for improving train speed estimation accuracy, addressing the complex challenges of modern railway systems. We investigate three CNN architectures - single-branch 2D, single-branch 1D, and multiple-branch models - and compare them with the Adaptive Kalman Filter. We analyse their performance using simulated train operation datasets with and without Wheel Slide Protection activation. Our results reveal that CNN-based approaches, especially the multiple-branch model, demonstrate superior accuracy and robustness compared to traditional methods, particularly under challenging operational conditions. These findings highlight the potential of deep learning techniques to enhance railway safety and operational efficiency by more effectively capturing intricate patterns in complex transportation datasets.
comment: 15 pages, 12 figures, 2 tables. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit
PowerChain: Automating Distribution Grid Analysis with Agentic AI Workflows
Due to the rapid pace of electrification and decarbonization, distribution grid (DG) operation and planning are becoming more complex, necessitating advanced computational analyses to ensure grid reliability and resilience. State-of-the-art DG analyses rely on disparate workflows of complex models, functions, and data pipelines, which require expert knowledge and are challenging to automate. Many small-scale utilities and cooperatives lack a large R&D workforce and therefore cannot use advanced analysis at scale. To address this gap, we develop a novel agentic AI system, PowerChain, to solve unseen DG analysis tasks via automated agentic orchestration and large language models (LLMs) function-calling. Given a natural language query, PowerChain dynamically generates and executes an ordered sequence of domain-aware functions guided by the semantics of an expert-built power systems function pool and a select reference set of known, expert-generated workflow-query pairs. Our results show that PowerChain can produce expert-level workflows with both GPT-5 and open-source Qwen models on complex, unseen DG analysis tasks operating on real utility data.
A Rapid Iterative Trajectory Planning Method for Automated Parking through Differential Flatness
As autonomous driving continues to advance, automated parking is becoming increasingly essential. However, significant challenges arise when implementing path velocity decomposition (PVD) trajectory planning for automated parking. The primary challenge is ensuring rapid and precise collision-free trajectory planning, which is often in conflict. The secondary challenge involves maintaining sufficient control feasibility of the planned trajectory, particularly at gear shifting points (GSP). This paper proposes a PVD-based rapid iterative trajectory planning (RITP) method to solve the above challenges. The proposed method effectively balances the necessity for time efficiency and precise collision avoidance through a novel collision avoidance framework. Moreover, it enhances the overall control feasibility of the planned trajectory by incorporating the vehicle kinematics model and including terminal smoothing constraints (TSC) at GSP during path planning. Specifically, the proposed method leverages differential flatness to ensure the planned path adheres to the vehicle kinematic model. Additionally, it utilizes TSC to maintain curvature continuity at GSP, thereby enhancing the control feasibility of the overall trajectory. The simulation results demonstrate superior time efficiency and tracking errors compared to model-integrated and other iteration-based trajectory planning methods. In the real-world experiment, the proposed method was implemented and validated on a ROS-based vehicle, demonstrating the applicability of the RITP method for real vehicles.
comment: Published in the journal Robotics and Autonomous Systems
Geometric Decentralized Stability Condition for Power Systems Based on Projecting DW Shells
The development of decentralized stability conditions has gained considerable attention due to the need to analyze heterogeneous multi-converter power systems. A recent advance is the application of the small-phase theorem, which extends the passivity theory. However, it requires the transfer function matrix to be sectorial, which may not hold in some frequency range and will result in conservatism. This letter tackles this problem by leveraging the Davis-Wielandt (DW) shells for decentralized stability analysis. We develop a geometric decentralized stability condition that visually displays how heterogeneous converters interact with the power grid and enable modular system analysis.
Beamforming Control in RIS-Aided Wireless Communications: A Predictive Physics-Based Approach
Integrating reconfigurable intelligent surfaces (RIS) into wireless communication systems is a promising approach for enhancing coverage and data rates by intelligently redirecting signals, through a process known as beamforming. However, the process of RIS beamforming (or passive beamforming) control is associated with multiple latency-inducing factors. As a result, by the time the beamforming is effectively updated, the channel conditions may have already changed. For example, the low update rate of localization systems becomes a critical limitation, as a mobile UE's position may change significantly between two consecutive measurements. To address this issue, this work proposes a practical and scalable physics-based solution that is effective across a wide range of UE movement models. Specifically, we propose a kinematic observer and predictor to enable proactive RIS control. From low-rate position estimates provided by a localizer, the kinematic observer infers the UE's speed and acceleration. These motion parameters are then used by a predictor to estimate the UE's future positions at a higher rate, allowing the RIS to adjust promptly and compensate for inherent delays in both the RIS control and localization systems. Numerical results validate the effectiveness of the proposed approach, demonstrating real-time RIS adjustments with low computational complexity, even in scenarios involving rapid UE movement.
comment: 11 pages, 10 figures, four tables, 29 references. Full paper submitted to IEEE-TWC
Stability Optimization and Analysis of Energy Flow Networks versus Different Centrality Measurement
Optimizing the stability and control performance of complex networks often hinges on effectively identifying critical nodes for targeted intervention. Due to their inherent complexity and high dimensionality, large-scale energy flow networks, prevalent in domains like power grids, transportation, and financial systems, present unique challenges in selecting optimal nodes for resource allocation. While numerous centrality measurements, such as Katz centrality, eigenvector centrality, closeness centrality, betweenness centrality, and PageRank, have been proposed to evaluate node importance, the impact of different centrality metrics on stability outcomes remains inadequately understood. Moreover, networks manifest diverse structural characteristics-including small-world, scale-free, and random graph properties-which further complicates the optimization problem. This paper systematically investigates how various node centrality measurements influence control stability across representative complex network structures. A unified energy-flow dynamical model is developed, and performance metrics such as the L1 norm are employed to quantify the network stability implications of employing different centrality metrics. Extensive numerical simulations over statistically generated network ensembles reveal significant variances in stability outcomes, highlighting the crucial role of centrality selection. The findings underscore the sensitivity of energy-flow stability to seemingly minor changes in topological node rankings, providing practical insights for enhancing control efficiency and robustness in real-world networked systems.
comment: Accepted by the 2025 21st International Conference on Intelligent Computing (ICIC 2025)
An Adaptive Environment-Aware Transformer Autoencoder for UAV-FSO with Dynamic Complexity Control
The rise of sixth-generation (6G) wireless networks sets high demands on UAV-assisted Free Space Optical (FSO) communications, where the channel environment becomes more complex and variable due to both atmospheric turbulence and UAV-induced vibrations. These factors increase the challenge of maintaining reliable communication and require adaptive processing methods. Autoencoders are promising as they learn optimal encodings from channel data. However, existing autoencoder designs are generic and lack the specific adaptability and computational flexibility needed for UAV-FSO scenarios. To address this, we propose AEAT-AE (Adaptive Environment-aware Transformer Autoencoder), a Transformer-based framework that integrates environmental parameters into both encoder and decoder via a cross-attention mechanism. Moreover, AEAT-AE incorporates a Deep Q-Network (DQN) that dynamically selects which layers of the Transformer autoencoder to activate based on real-time environmental inputs, effectively balancing performance and computational cost. Experiments demonstrate that AEAT-AE outperforms conventional methods in bit error rate while maintaining efficient runtime, representing a novel tailored solution for next-generation UAV-FSO communications.
Chat-Driven Reconfiguration of Model Predictive Control
Traditional control personalization requires users to understand optimization parameters and provide repetitive numerical feedback, creating significant barriers for non-expert users. To deal with this issue, we propose ChatMPC, a model predictive control framework that enables users to personalize control systems and adapt to environmental changes through natural language interaction. The framework operates in two modes: personalization, where users iteratively adjust control behavior to their preferences, and co-development, where users provide real-time environmental information that complements sensor data. We establish convergence guarantees under different user behavior models, demonstrating exponential convergence for consistent feedback and finite-time convergence with logarithmic interaction complexity for tolerance-based users. We validate ChatMPC through experiments in robot navigation with personalized obstacle avoidance and semi-autonomous driving with conversational obstacle reporting. Both experiments achieve real-time performance and demonstrate effective adaptation to user preferences and environmental changes.
Relative Navigation and Dynamic Target Tracking for Autonomous Underwater Proximity Operations SP
Estimating a target's 6-DoF motion in underwater proximity operations is difficult because the chaser lacks target-side proprioception and the available relative observations are sparse, noisy, and often partial (e.g., Ultra-Short Baseline (USBL) positions). Without a motion prior, factor-graph maximum a posteriori estimation is underconstrained: consecutive target states are weakly linked and orientation can drift. We propose a generalized constant-twist motion prior defined on the tangent space of Lie groups that enforces temporally consistent trajectories across all degrees of freedom; in SE(3) it couples translation and rotation in the body frame. We present a ternary factor and derive its closed-form Jacobians based on standard Lie group operations, enabling drop-in use for trajectories on arbitrary Lie groups. We evaluate two deployment modes: (A) an SE(3)-only representation that regularizes orientation even when only position is measured, and (B) a mode with boundary factors that switches the target representation between SE(3) and 3D position while applying the same generalized constant-twist prior across representation changes. Validation on a real-world dynamic docking scenario dataset shows consistent ego-target trajectory estimation through USBL-only and optical relative measurement segments with an improved relative tracking accuracy compared to the noisy measurements to the target. Because the construction relies on standard Lie group primitives, it is portable across state manifolds and sensing modalities.
comment: 10 pages, 7 figures. Equal contribution by David Baxter and Aldo Ter\'an Espinoza. Supported by SAAB, SMaRC, and WASP. Supported by SAAB and the Swedish Maritime Robotics Centre (SMaRC), and by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue. While for known environments, offline methods can find provably complete paths, and in some cases optimal solutions, unknown environments need to be planned online during mapping. We investigate the suitability of continuous-space reinforcement learning (RL) for this challenging problem, and propose a computationally feasible egocentric map representation based on frontiers, as well as a novel reward term based on total variation to promote complete coverage. Compared to existing classical methods, this approach allows for a flexible path space, and enables the agent to adapt to specific environment characteristics. Meanwhile, the deployment of RL models on real robot systems is difficult. Training from scratch may be infeasible due to slow convergence times, while transferring from simulation to reality, i.e. sim-to-real transfer, is a key challenge in itself. We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles to enable environment randomization and automated episode resetting. We investigate what level of fine-tuning is needed for adapting to a realistic setting. Through extensive experiments, we show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods across multiple CPP variations in simulation. Meanwhile, our method successfully transfers to a real robot. Our code implementation can be found online.
comment: Published in IEEE Access
On Erlang mixture approximations for differential equations with distributed time delays
In this paper, we propose a general approach for approximate simulation and analysis of delay differential equations (DDEs) with distributed time delays based on methods for ordinary differential equations (ODEs). The key innovation is that we 1) approximate the kernel by the probability density function of an Erlang mixture and 2) use the linear chain trick to transform the approximate DDEs to ODEs. Furthermore, we prove that an approximation with infinitely many terms converges for continuous and bounded kernels and for specific choices of the coefficients. We show that the approximate ODEs can be used to assess the stability of the steady states of the original DDEs and that the solution to the ODEs converges if the kernel is also exponentially bounded. Additionally, we propose an approach based on bisection and least-squares estimation for determining optimal parameter values in the approximation. Finally, we present numerical examples that demonstrate the accuracy and convergence rate obtained with the optimal parameters and the efficacy of the proposed approach for bifurcation analysis and Monte Carlo simulation. The numerical examples involve a modified logistic equation, chemotherapy-induced myelosuppression, and a point reactor kinetics model of a molten salt nuclear fission reactor.
comment: 46 pages, 9 figures
Secure State Estimation of Cyber-Physical Systems via Gaussian Bernoulli Mixture Model
The implementation of cyber-physical systems in real-world applications is challenged by safety requirements in the presence of sensor threats. Most cyber-physical systems, especially multi-sensor systems, struggle to detect sensor attacks when the attack model is unknown. In this paper, we tackle this issue by proposing a Gaussian-Bernoulli Secure (GBS) estimator, which transforms the detection problem into an optimal estimation problem concerning the system state and observation indicators. It encompasses two theoretical sub-problems: sequential state estimation with partial observations and estimation updates with disordered new observations. Within the framework of Kalman filter, we derive closed-form solutions for these two problems. However, due to their computational inefficiency, we propose the iterative approach employing proximal gradient descent to update the estimation in less time. Finally, we conduct experiments from three perspectives: computational efficiency, detection performance, and estimation error. Our GBS estimator demonstrates significant improvements over other methods.
Systems and Control (EESS)
Enhancing Energy and Spectral Efficiency in IoT-Cellular Networks via Active SIM-Equipped LEO Satellites
This paper investigates a low Earth orbit (LEO) satellite communication system enhanced by an active stacked intelligent metasurface (ASIM), mounted on the backplate of the satellite solar panels to efficiently utilize limited onboard space and reduce the main satellite power amplifier requirements. The system serves multiple ground users via rate-splitting multiple access (RSMA) and IoT devices through a symbiotic radio network. Multi-layer sequential processing in the ASIM improves effective channel gains and suppresses inter-user interference, outperforming active RIS and beyond-diagonal RIS designs. Three optimization approaches are evaluated: block coordinate descent with successive convex approximation (BCD-SCA), model-assisted multi-agent constraint soft actor-critic (MA-CSAC), and multi-constraint proximal policy optimization (MCPPO). Simulation results show that BCD-SCA converges fast and stably in convex scenarios without learning, MCPPO achieves rapid initial convergence with moderate stability, and MA-CSAC attains the highest long-term spectral and energy efficiency in large-scale networks. Energy-spectral efficiency trade-offs are analyzed for different ASIM elements, satellite antennas, and transmit power. Overall, the study demonstrates that integrating multi-layer ASIM with suitable optimization algorithms offers a scalable, energy-efficient, and high-performance solution for next-generation LEO satellite communications.
Frequency Response Identification of Low-Order Systems: Finite-Sample Analysis
This paper proposes a frequency-domain system identification method for learning low-order systems. The identification problem is formulated as the minimization of the l2 norm between the identified and measured frequency responses, with the nuclear norm of the Loewner matrix serving as a regularization term. This formulation results in an optimization problem that can be efficiently solved using standard convex optimization techniques. We derive an upper bound on the sampled-frequency complexity of the identification process and subsequently extend this bound to characterize the identification error over all frequencies. A detailed analysis of the sample complexity is provided, along with a thorough interpretation of its terms and dependencies. Finally, the efficacy of the proposed method is demonstrated through an example, along with numerical simulations validating the growth rate of the sample complexity bound.
comment: 15 pages, Submitted to IEEE Transactions on Automatic Control
Towards Deeper Understanding of Natural User Interactions in Virtual Reality Based Assembly Tasks
We explore natural user interactions using a virtual reality simulation of a robot arm for assembly tasks. Using a Wizard-of-Oz study, participants completed collaborative LEGO and instructive PCB assembly tasks, with the robot responding under experimenter control. We collected voice, hand tracking, and gaze data from users. Statistical analyses revealed that instructive and collaborative scenarios elicit distinct behaviors and adopted strategies, particularly as tasks progress. Users tended to use put-that-there language in spatially ambiguous contexts and more descriptive instructions in spatially clear ones. Our contributions include the identification of natural interaction strategies through analyses of collected data, as well as the supporting dataset, to guide the understanding and design of natural multimodal user interfaces for instructive interaction with systems in virtual reality.
comment: To be submitted in a future conference, this is the author version pre-print
Convolutional Neural Networks for Accurate Measurement of Train Speed
In this study, we explore the use of Convolutional Neural Networks for improving train speed estimation accuracy, addressing the complex challenges of modern railway systems. We investigate three CNN architectures - single-branch 2D, single-branch 1D, and multiple-branch models - and compare them with the Adaptive Kalman Filter. We analyse their performance using simulated train operation datasets with and without Wheel Slide Protection activation. Our results reveal that CNN-based approaches, especially the multiple-branch model, demonstrate superior accuracy and robustness compared to traditional methods, particularly under challenging operational conditions. These findings highlight the potential of deep learning techniques to enhance railway safety and operational efficiency by more effectively capturing intricate patterns in complex transportation datasets.
comment: 15 pages, 12 figures, 2 tables. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit
PowerChain: Automating Distribution Grid Analysis with Agentic AI Workflows
Due to the rapid pace of electrification and decarbonization, distribution grid (DG) operation and planning are becoming more complex, necessitating advanced computational analyses to ensure grid reliability and resilience. State-of-the-art DG analyses rely on disparate workflows of complex models, functions, and data pipelines, which require expert knowledge and are challenging to automate. Many small-scale utilities and cooperatives lack a large R&D workforce and therefore cannot use advanced analysis at scale. To address this gap, we develop a novel agentic AI system, PowerChain, to solve unseen DG analysis tasks via automated agentic orchestration and large language models (LLMs) function-calling. Given a natural language query, PowerChain dynamically generates and executes an ordered sequence of domain-aware functions guided by the semantics of an expert-built power systems function pool and a select reference set of known, expert-generated workflow-query pairs. Our results show that PowerChain can produce expert-level workflows with both GPT-5 and open-source Qwen models on complex, unseen DG analysis tasks operating on real utility data.
A Rapid Iterative Trajectory Planning Method for Automated Parking through Differential Flatness
As autonomous driving continues to advance, automated parking is becoming increasingly essential. However, significant challenges arise when implementing path velocity decomposition (PVD) trajectory planning for automated parking. The primary challenge is ensuring rapid and precise collision-free trajectory planning, which is often in conflict. The secondary challenge involves maintaining sufficient control feasibility of the planned trajectory, particularly at gear shifting points (GSP). This paper proposes a PVD-based rapid iterative trajectory planning (RITP) method to solve the above challenges. The proposed method effectively balances the necessity for time efficiency and precise collision avoidance through a novel collision avoidance framework. Moreover, it enhances the overall control feasibility of the planned trajectory by incorporating the vehicle kinematics model and including terminal smoothing constraints (TSC) at GSP during path planning. Specifically, the proposed method leverages differential flatness to ensure the planned path adheres to the vehicle kinematic model. Additionally, it utilizes TSC to maintain curvature continuity at GSP, thereby enhancing the control feasibility of the overall trajectory. The simulation results demonstrate superior time efficiency and tracking errors compared to model-integrated and other iteration-based trajectory planning methods. In the real-world experiment, the proposed method was implemented and validated on a ROS-based vehicle, demonstrating the applicability of the RITP method for real vehicles.
comment: Published in the journal Robotics and Autonomous Systems
Geometric Decentralized Stability Condition for Power Systems Based on Projecting DW Shells
The development of decentralized stability conditions has gained considerable attention due to the need to analyze heterogeneous multi-converter power systems. A recent advance is the application of the small-phase theorem, which extends the passivity theory. However, it requires the transfer function matrix to be sectorial, which may not hold in some frequency range and will result in conservatism. This letter tackles this problem by leveraging the Davis-Wielandt (DW) shells for decentralized stability analysis. We develop a geometric decentralized stability condition that visually displays how heterogeneous converters interact with the power grid and enable modular system analysis.
Beamforming Control in RIS-Aided Wireless Communications: A Predictive Physics-Based Approach
Integrating reconfigurable intelligent surfaces (RIS) into wireless communication systems is a promising approach for enhancing coverage and data rates by intelligently redirecting signals, through a process known as beamforming. However, the process of RIS beamforming (or passive beamforming) control is associated with multiple latency-inducing factors. As a result, by the time the beamforming is effectively updated, the channel conditions may have already changed. For example, the low update rate of localization systems becomes a critical limitation, as a mobile UE's position may change significantly between two consecutive measurements. To address this issue, this work proposes a practical and scalable physics-based solution that is effective across a wide range of UE movement models. Specifically, we propose a kinematic observer and predictor to enable proactive RIS control. From low-rate position estimates provided by a localizer, the kinematic observer infers the UE's speed and acceleration. These motion parameters are then used by a predictor to estimate the UE's future positions at a higher rate, allowing the RIS to adjust promptly and compensate for inherent delays in both the RIS control and localization systems. Numerical results validate the effectiveness of the proposed approach, demonstrating real-time RIS adjustments with low computational complexity, even in scenarios involving rapid UE movement.
comment: 11 pages, 10 figures, four tables, 29 references. Full paper submitted to IEEE-TWC
Stability Optimization and Analysis of Energy Flow Networks versus Different Centrality Measurement
Optimizing the stability and control performance of complex networks often hinges on effectively identifying critical nodes for targeted intervention. Due to their inherent complexity and high dimensionality, large-scale energy flow networks, prevalent in domains like power grids, transportation, and financial systems, present unique challenges in selecting optimal nodes for resource allocation. While numerous centrality measurements, such as Katz centrality, eigenvector centrality, closeness centrality, betweenness centrality, and PageRank, have been proposed to evaluate node importance, the impact of different centrality metrics on stability outcomes remains inadequately understood. Moreover, networks manifest diverse structural characteristics-including small-world, scale-free, and random graph properties-which further complicates the optimization problem. This paper systematically investigates how various node centrality measurements influence control stability across representative complex network structures. A unified energy-flow dynamical model is developed, and performance metrics such as the L1 norm are employed to quantify the network stability implications of employing different centrality metrics. Extensive numerical simulations over statistically generated network ensembles reveal significant variances in stability outcomes, highlighting the crucial role of centrality selection. The findings underscore the sensitivity of energy-flow stability to seemingly minor changes in topological node rankings, providing practical insights for enhancing control efficiency and robustness in real-world networked systems.
comment: Accepted by the 2025 21st International Conference on Intelligent Computing (ICIC 2025)
An Adaptive Environment-Aware Transformer Autoencoder for UAV-FSO with Dynamic Complexity Control
The rise of sixth-generation (6G) wireless networks sets high demands on UAV-assisted Free Space Optical (FSO) communications, where the channel environment becomes more complex and variable due to both atmospheric turbulence and UAV-induced vibrations. These factors increase the challenge of maintaining reliable communication and require adaptive processing methods. Autoencoders are promising as they learn optimal encodings from channel data. However, existing autoencoder designs are generic and lack the specific adaptability and computational flexibility needed for UAV-FSO scenarios. To address this, we propose AEAT-AE (Adaptive Environment-aware Transformer Autoencoder), a Transformer-based framework that integrates environmental parameters into both encoder and decoder via a cross-attention mechanism. Moreover, AEAT-AE incorporates a Deep Q-Network (DQN) that dynamically selects which layers of the Transformer autoencoder to activate based on real-time environmental inputs, effectively balancing performance and computational cost. Experiments demonstrate that AEAT-AE outperforms conventional methods in bit error rate while maintaining efficient runtime, representing a novel tailored solution for next-generation UAV-FSO communications.
Chat-Driven Reconfiguration of Model Predictive Control
Traditional control personalization requires users to understand optimization parameters and provide repetitive numerical feedback, creating significant barriers for non-expert users. To deal with this issue, we propose ChatMPC, a model predictive control framework that enables users to personalize control systems and adapt to environmental changes through natural language interaction. The framework operates in two modes: personalization, where users iteratively adjust control behavior to their preferences, and co-development, where users provide real-time environmental information that complements sensor data. We establish convergence guarantees under different user behavior models, demonstrating exponential convergence for consistent feedback and finite-time convergence with logarithmic interaction complexity for tolerance-based users. We validate ChatMPC through experiments in robot navigation with personalized obstacle avoidance and semi-autonomous driving with conversational obstacle reporting. Both experiments achieve real-time performance and demonstrate effective adaptation to user preferences and environmental changes.
Relative Navigation and Dynamic Target Tracking for Autonomous Underwater Proximity Operations SP
Estimating a target's 6-DoF motion in underwater proximity operations is difficult because the chaser lacks target-side proprioception and the available relative observations are sparse, noisy, and often partial (e.g., Ultra-Short Baseline (USBL) positions). Without a motion prior, factor-graph maximum a posteriori estimation is underconstrained: consecutive target states are weakly linked and orientation can drift. We propose a generalized constant-twist motion prior defined on the tangent space of Lie groups that enforces temporally consistent trajectories across all degrees of freedom; in SE(3) it couples translation and rotation in the body frame. We present a ternary factor and derive its closed-form Jacobians based on standard Lie group operations, enabling drop-in use for trajectories on arbitrary Lie groups. We evaluate two deployment modes: (A) an SE(3)-only representation that regularizes orientation even when only position is measured, and (B) a mode with boundary factors that switches the target representation between SE(3) and 3D position while applying the same generalized constant-twist prior across representation changes. Validation on a real-world dynamic docking scenario dataset shows consistent ego-target trajectory estimation through USBL-only and optical relative measurement segments with an improved relative tracking accuracy compared to the noisy measurements to the target. Because the construction relies on standard Lie group primitives, it is portable across state manifolds and sensing modalities.
comment: 10 pages, 7 figures. Equal contribution by David Baxter and Aldo Ter\'an Espinoza. Supported by SAAB, SMaRC, and WASP. Supported by SAAB and the Swedish Maritime Robotics Centre (SMaRC), and by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue. While for known environments, offline methods can find provably complete paths, and in some cases optimal solutions, unknown environments need to be planned online during mapping. We investigate the suitability of continuous-space reinforcement learning (RL) for this challenging problem, and propose a computationally feasible egocentric map representation based on frontiers, as well as a novel reward term based on total variation to promote complete coverage. Compared to existing classical methods, this approach allows for a flexible path space, and enables the agent to adapt to specific environment characteristics. Meanwhile, the deployment of RL models on real robot systems is difficult. Training from scratch may be infeasible due to slow convergence times, while transferring from simulation to reality, i.e. sim-to-real transfer, is a key challenge in itself. We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles to enable environment randomization and automated episode resetting. We investigate what level of fine-tuning is needed for adapting to a realistic setting. Through extensive experiments, we show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods across multiple CPP variations in simulation. Meanwhile, our method successfully transfers to a real robot. Our code implementation can be found online.
comment: Published in IEEE Access
On Erlang mixture approximations for differential equations with distributed time delays
In this paper, we propose a general approach for approximate simulation and analysis of delay differential equations (DDEs) with distributed time delays based on methods for ordinary differential equations (ODEs). The key innovation is that we 1) approximate the kernel by the probability density function of an Erlang mixture and 2) use the linear chain trick to transform the approximate DDEs to ODEs. Furthermore, we prove that an approximation with infinitely many terms converges for continuous and bounded kernels and for specific choices of the coefficients. We show that the approximate ODEs can be used to assess the stability of the steady states of the original DDEs and that the solution to the ODEs converges if the kernel is also exponentially bounded. Additionally, we propose an approach based on bisection and least-squares estimation for determining optimal parameter values in the approximation. Finally, we present numerical examples that demonstrate the accuracy and convergence rate obtained with the optimal parameters and the efficacy of the proposed approach for bifurcation analysis and Monte Carlo simulation. The numerical examples involve a modified logistic equation, chemotherapy-induced myelosuppression, and a point reactor kinetics model of a molten salt nuclear fission reactor.
comment: 46 pages, 9 figures
Secure State Estimation of Cyber-Physical Systems via Gaussian Bernoulli Mixture Model
The implementation of cyber-physical systems in real-world applications is challenged by safety requirements in the presence of sensor threats. Most cyber-physical systems, especially multi-sensor systems, struggle to detect sensor attacks when the attack model is unknown. In this paper, we tackle this issue by proposing a Gaussian-Bernoulli Secure (GBS) estimator, which transforms the detection problem into an optimal estimation problem concerning the system state and observation indicators. It encompasses two theoretical sub-problems: sequential state estimation with partial observations and estimation updates with disordered new observations. Within the framework of Kalman filter, we derive closed-form solutions for these two problems. However, due to their computational inefficiency, we propose the iterative approach employing proximal gradient descent to update the estimation in less time. Finally, we conduct experiments from three perspectives: computational efficiency, detection performance, and estimation error. Our GBS estimator demonstrates significant improvements over other methods.
Robotics
Hierarchical Decision-Making for Autonomous Navigation: Integrating Deep Reinforcement Learning and Fuzzy Logic in Four-Wheel Independent Steering and Driving Systems
This paper presents a hierarchical decision-making framework for autonomous navigation in four-wheel independent steering and driving (4WISD) systems. The proposed approach integrates deep reinforcement learning (DRL) for high-level navigation with fuzzy logic for low-level control to ensure both task performance and physical feasibility. The DRL agent generates global motion commands, while the fuzzy logic controller enforces kinematic constraints to prevent mechanical strain and wheel slippage. Simulation experiments demonstrate that the proposed framework outperforms traditional navigation methods, offering enhanced training efficiency and stability and mitigating erratic behaviors compared to purely DRL-based solutions. Real-world validations further confirm the framework's ability to navigate safely and effectively in dynamic industrial settings. Overall, this work provides a scalable and reliable solution for deploying 4WISD mobile robots in complex, real-world scenarios.
Comparative Analysis of UAV Path Planning Algorithms for Efficient Navigation in Urban 3D Environments
The most crucial challenges for UAVs are planning paths and avoiding obstacles in their way. In recent years, a wide variety of path-planning algorithms have been developed. These algorithms have successfully solved path-planning problems; however, they suffer from multiple challenges and limitations. To test the effectiveness and efficiency of three widely used algorithms, namely A*, RRT*, and Particle Swarm Optimization (PSO), this paper conducts extensive experiments in 3D urban city environments cluttered with obstacles. Three experiments were designed with two scenarios each to test the aforementioned algorithms. These experiments consider different city map sizes, different altitudes, and varying obstacle densities and sizes in the environment. According to the experimental results, the A* algorithm outperforms the others in both computation efficiency and path quality. PSO is especially suitable for tight turns and dense environments, and RRT* offers a balance and works well across all experiments due to its randomized approach to finding solutions.
On Kinodynamic Global Planning in a Simplicial Complex Environment: A Mixed Integer Approach
This work casts the kinodynamic planning problem for car-like vehicles as an optimization task to compute a minimum-time trajectory and its associated velocity profile, subject to boundary conditions on velocity, acceleration, and steering. The approach simultaneously optimizes both the spatial path and the sequence of acceleration and steering controls, ensuring continuous motion from a specified initial position and velocity to a target end position and velocity.The method analyzes the admissible control space and terrain to avoid local minima. The proposed method operates efficiently in simplicial complex environments, a preferred terrain representation for capturing intricate 3D landscapes. The problem is initially posed as a mixed-integer fractional program with quadratic constraints, which is then reformulated into a mixed-integer bilinear objective through a variable transformation and subsequently relaxed to a mixed-integer linear program using McCormick envelopes. Comparative simulations against planners such as MPPI and log-MPPI demonstrate that the proposed approach generates solutions 104 times faster while strictly adhering to the specified constraints
Terrain Classification for the Spot Quadrupedal Mobile Robot Using Only Proprioceptive Sensing
Quadrupedal mobile robots can traverse a wider range of terrain types than their wheeled counterparts but do not perform the same on all terrain types. These robots are prone to undesirable behaviours like sinking and slipping on challenging terrains. To combat this issue, we propose a terrain classifier that provides information on terrain type that can be used in robotic systems to create a traversability map to plan safer paths for the robot to navigate. The work presented here is a terrain classifier developed for a Boston Dynamics Spot robot. Spot provides over 100 measured proprioceptive signals describing the motions of the robot and its four legs (e.g., foot penetration, forces, joint angles, etc.). The developed terrain classifier combines dimensionality reduction techniques to extract relevant information from the signals and then applies a classification technique to differentiate terrain based on traversability. In representative field testing, the resulting terrain classifier was able to identify three different terrain types with an accuracy of approximately 97%
HOSt3R: Keypoint-free Hand-Object 3D Reconstruction from RGB images
Hand-object 3D reconstruction has become increasingly important for applications in human-robot interaction and immersive AR/VR experiences. A common approach for object-agnostic hand-object reconstruction from RGB sequences involves a two-stage pipeline: hand-object 3D tracking followed by multi-view 3D reconstruction. However, existing methods rely on keypoint detection techniques, such as Structure from Motion (SfM) and hand-keypoint optimization, which struggle with diverse object geometries, weak textures, and mutual hand-object occlusions, limiting scalability and generalization. As a key enabler to generic and seamless, non-intrusive applicability, we propose in this work a robust, keypoint detector-free approach to estimating hand-object 3D transformations from monocular motion video/images. We further integrate this with a multi-view reconstruction pipeline to accurately recover hand-object 3D shape. Our method, named HOSt3R, is unconstrained, does not rely on pre-scanned object templates or camera intrinsics, and reaches state-of-the-art performance for the tasks of object-agnostic hand-object 3D transformation and shape estimation on the SHOWMe benchmark. We also experiment on sequences from the HO3D dataset, demonstrating generalization to unseen object categories.
comment: 12 pages, 8 figures
Swarming Without an Anchor (SWA): Robot Swarms Adapt Better to Localization Dropouts Then a Single Robot
In this paper, we present the Swarming Without an Anchor (SWA) approach to state estimation in swarms of Unmanned Aerial Vehicles (UAVs) experiencing ego-localization dropout, where individual agents are laterally stabilized using relative information only. We propose to fuse decentralized state estimation with robust mutual perception and onboard sensor data to maintain accurate state awareness despite intermittent localization failures. Thus, the relative information used to estimate the lateral state of UAVs enables the identification of the unambiguous state of UAVs with respect to the local constellation. The resulting behavior reaches velocity consensus, as this task can be referred to as the double integrator synchronization problem. All disturbances and performance degradations except a uniform translation drift of the swarm as a whole is attenuated which is enabling new opportunities in using tight cooperation for increasing reliability and resilience of multi-UAV systems. Simulations and real-world experiments validate the effectiveness of our approach, demonstrating its capability to sustain cohesive swarm behavior in challenging conditions of unreliable or unavailable primary localization.
comment: Accepted to IEEE RA-L on April 1, 2025
GPL-SLAM: A Laser SLAM Framework with Gaussian Process Based Extended Landmarks
We present a novel Simultaneous Localization and Mapping (SLAM) method that employs Gaussian Process (GP) based landmark (object) representations. Instead of conventional grid maps or point cloud registration, we model the environment on a per object basis using GP based contour representations. These contours are updated online through a recursive scheme, enabling efficient memory usage. The SLAM problem is formulated within a fully Bayesian framework, allowing joint inference over the robot pose and object based map. This representation provides semantic information such as the number of objects and their areas, while also supporting probabilistic measurement to object associations. Furthermore, the GP based contours yield confidence bounds on object shapes, offering valuable information for downstream tasks like safe navigation and exploration. We validate our method on synthetic and real world experiments, and show that it delivers accurate localization and mapping performance across diverse structured environments.
comment: Authors Ali Emre Balc{\i} and Erhan Ege Keyvan contributed equally to this work
Sound and Solution-Complete CCBS
Continuous-time Conflict Based-Search (CCBS) has long been viewed as the de-facto optimal solver for multi-agent path finding in continuous time (MAPFR). Recent findings, however, show that the original theoretical variant of CCBS can suffer from non-termination, while the widely used implementation can return sub-optimal solutions. We introduce an analytical framework that yields simple and sufficient conditions under which any CCBS-style algorithm is both sound, i.e., returns only optimal solutions, and solution complete, i.e., terminates on every solvable MAPFR instance. Investigating the publicly available implementation of CCBS reveals that it violates these conditions. Though this merely indicates that CCBS might be unsound, this indication is supported by counter-examples. Leveraging the analytical framework, we propose a novel branching rule and prove that it satisfies the sufficient conditions, thereby restoring soundness and termination guarantees. Consequently, the resulting CCBS variant is both sound and solution complete, matching the guarantees of the discrete-time CBS for the first time in the continuous domain. We experimentally apply standard CCBS and CCBS under our branching rule to an example problem, with our branching rule returning a solution with lower sum-of-costs than standard CCBS. Because the branching rule largely only affects the branching step, it can be adopted as a drop-in replacement in existing code-bases, as we show in our provided implementation. Beyond CCBS, the analytical framework and termination criterion provide a systematic way to evaluate other CCBS-like MAPFR solvers and future extensions.
comment: 15 pages
Do What? Teaching Vision-Language-Action Models to Reject the Impossible
Recently, Vision-Language-Action (VLA) models have demonstrated strong performance on a range of robotic tasks. These models rely on multimodal inputs, with language instructions playing a crucial role -- not only in predicting actions, but also in robustly interpreting user intent, even when the requests are impossible to fulfill. In this work, we investigate how VLAs can recognize, interpret, and respond to false-premise instructions: natural language commands that reference objects or conditions absent from the environment. We propose Instruct-Verify-and-Act (IVA), a unified framework that (i) detects when an instruction cannot be executed due to a false premise, (ii) engages in language-based clarification or correction, and (iii) grounds plausible alternatives in perception and action. Towards this end, we construct a large-scale instruction tuning setup with structured language prompts and train a VLA model capable of handling both accurate and erroneous requests. Our approach leverages a contextually augmented, semi-synthetic dataset containing paired positive and false-premise instructions, enabling robust detection and natural language correction. Our experiments show that IVA improves false premise detection accuracy by 97.56% over baselines, while increasing successful responses in false-premise scenarios by 50.78%.
comment: 9 pages, 2 figures, 1 table
Take That for Me: Multimodal Exophora Resolution with Interactive Questioning for Ambiguous Out-of-View Instructions
Daily life support robots must interpret ambiguous verbal instructions involving demonstratives such as ``Bring me that cup,'' even when objects or users are out of the robot's view. Existing approaches to exophora resolution primarily rely on visual data and thus fail in real-world scenarios where the object or user is not visible. We propose Multimodal Interactive Exophora resolution with user Localization (MIEL), which is a multimodal exophora resolution framework leveraging sound source localization (SSL), semantic mapping, visual-language models (VLMs), and interactive questioning with GPT-4o. Our approach first constructs a semantic map of the environment and estimates candidate objects from a linguistic query with the user's skeletal data. SSL is utilized to orient the robot toward users who are initially outside its visual field, enabling accurate identification of user gestures and pointing directions. When ambiguities remain, the robot proactively interacts with the user, employing GPT-4o to formulate clarifying questions. Experiments in a real-world environment showed results that were approximately 1.3 times better when the user was visible to the robot and 2.0 times better when the user was not visible to the robot, compared to the methods without SSL and interactive questioning. The project website is https://emergentsystemlabstudent.github.io/MIEL/.
comment: See website at https://emergentsystemlabstudent.github.io/MIEL/. Accepted at IEEE RO-MAN 2025
Validating Terrain Models in Digital Twins for Trustworthy sUAS Operations
With the increasing deployment of small Unmanned Aircraft Systems (sUAS) in unfamiliar and complex environments, Environmental Digital Twins (EDT) that comprise weather, airspace, and terrain data are critical for safe flight planning and for maintaining appropriate altitudes during search and surveillance operations. With the expansion of sUAS capabilities through edge and cloud computing, accurate EDT are also vital for advanced sUAS capabilities, like geolocation. However, real-world sUAS deployment introduces significant sources of uncertainty, necessitating a robust validation process for EDT components. This paper focuses on the validation of terrain models, one of the key components of an EDT, for real-world sUAS tasks. These models are constructed by fusing U.S. Geological Survey (USGS) datasets and satellite imagery, incorporating high-resolution environmental data to support mission tasks. Validating both the terrain models and their operational use by sUAS under real-world conditions presents significant challenges, including limited data granularity, terrain discontinuities, GPS and sensor inaccuracies, visual detection uncertainties, as well as onboard resources and timing constraints. We propose a 3-Dimensions validation process grounded in software engineering principles, following a workflow across granularity of tests, simulation to real world, and the analysis of simple to edge conditions. We demonstrate our approach using a multi-sUAS platform equipped with a Terrain-Aware Digital Shadow.
comment: Submitted to EDTconf 2025
NeuralMeshing: Complete Object Mesh Extraction from Casual Captures
How can we extract complete geometric models of objects that we encounter in our daily life, without having access to commercial 3D scanners? In this paper we present an automated system for generating geometric models of objects from two or more videos. Our system requires the specification of one known point in at least one frame of each video, which can be automatically determined using a fiducial marker such as a checkerboard or Augmented Reality (AR) marker. The remaining frames are automatically positioned in world space by using Structure-from-Motion techniques. By using multiple videos and merging results, a complete object mesh can be generated, without having to rely on hole filling. Code for our system is available from https://github.com/FlorisE/NeuralMeshing.
Autonomous UAV Flight Navigation in Confined Spaces: A Reinforcement Learning Approach
Inspecting confined industrial infrastructure, such as ventilation shafts, is a hazardous and inefficient task for humans. Unmanned Aerial Vehicles (UAVs) offer a promising alternative, but GPS-denied environments require robust control policies to prevent collisions. Deep Reinforcement Learning (DRL) has emerged as a powerful framework for developing such policies, and this paper provides a comparative study of two leading DRL algorithms for this task: the on-policy Proximal Policy Optimization (PPO) and the off-policy Soft Actor-Critic (SAC). The training was conducted with procedurally generated duct environments in Genesis simulation environment. A reward function was designed to guide a drone through a series of waypoints while applying a significant penalty for collisions. PPO learned a stable policy that completed all evaluation episodes without collision, producing smooth trajectories. By contrast, SAC consistently converged to a suboptimal behavior that traversed only the initial segments before failure. These results suggest that, in hazard-dense navigation, the training stability of on-policy methods can outweigh the nominal sample efficiency of off-policy algorithms. More broadly, the study provides evidence that procedurally generated, high-fidelity simulations are effective testbeds for developing and benchmarking robust navigation policies.
A Dataset and Benchmark for Robotic Cloth Unfolding Grasp Selection: The ICRA 2024 Cloth Competition
Robotic cloth manipulation suffers from a lack of standardized benchmarks and shared datasets for evaluating and comparing different approaches. To address this, we created a benchmark and organized the ICRA 2024 Cloth Competition, a unique head-to-head evaluation focused on grasp pose selection for in-air robotic cloth unfolding. Eleven diverse teams participated in the competition, utilizing our publicly released dataset of real-world robotic cloth unfolding attempts and a variety of methods to design their unfolding approaches. Afterwards, we also expanded our dataset with 176 competition evaluation trials, resulting in a dataset of 679 unfolding demonstrations across 34 garments. Analysis of the competition results revealed insights about the trade-off between grasp success and coverage, the surprisingly strong achievements of hand-engineered methods and a significant discrepancy between competition performance and prior work, underscoring the importance of independent, out-of-the-lab evaluation in robotic cloth manipulation. The associated dataset is a valuable resource for developing and evaluating grasp selection methods, particularly for learning-based approaches. We hope that our benchmark, dataset and competition results can serve as a foundation for future benchmarks and drive further progress in data-driven robotic cloth manipulation. The dataset and benchmarking code are available at https://airo.ugent.be/cloth_competition.
comment: submitted to IJRR
COSMO-Bench: A Benchmark for Collaborative SLAM Optimization
Recent years have seen a focus on research into distributed optimization algorithms for multi-robot Collaborative Simultaneous Localization and Mapping (C-SLAM). Research in this domain, however, is made difficult by a lack of standard benchmark datasets. Such datasets have been used to great effect in the field of single-robot SLAM, and researchers focused on multi-robot problems would benefit greatly from dedicated benchmark datasets. To address this gap, we design and release the Collaborative Open-Source Multi-robot Optimization Benchmark (COSMO-Bench) -- a suite of 24 datasets derived from a state-of-the-art C-SLAM front-end and real-world LiDAR data. Data DOI: https://doi.org/10.1184/R1/29652158
Towards Training-Free Underwater 3D Object Detection from Sonar Point Clouds: A Comparison of Traditional and Deep Learning Approaches
Underwater 3D object detection remains one of the most challenging frontiers in computer vision, where traditional approaches struggle with the harsh acoustic environment and scarcity of training data. While deep learning has revolutionized terrestrial 3D detection, its application underwater faces a critical bottleneck: obtaining sufficient annotated sonar data is prohibitively expensive and logistically complex, often requiring specialized vessels, expert surveyors, and favorable weather conditions. This work addresses a fundamental question: Can we achieve reliable underwater 3D object detection without real-world training data? We tackle this challenge by developing and comparing two paradigms for training-free detection of artificial structures in multibeam echo-sounder point clouds. Our dual approach combines a physics-based sonar simulation pipeline that generates synthetic training data for state-of-the-art neural networks, with a robust model-based template matching system that leverages geometric priors of target objects. Evaluation on real bathymetry surveys from the Baltic Sea reveals surprising insights: while neural networks trained on synthetic data achieve 98% mean Average Precision (mAP) on simulated scenes, they drop to 40% mAP on real sonar data due to domain shift. Conversely, our template matching approach maintains 83% mAP on real data without requiring any training, demonstrating remarkable robustness to acoustic noise and environmental variations. Our findings challenge conventional wisdom about data-hungry deep learning in underwater domains and establish the first large-scale benchmark for training-free underwater 3D detection. This work opens new possibilities for autonomous underwater vehicle navigation, marine archaeology, and offshore infrastructure monitoring in data-scarce environments where traditional machine learning approaches fail.
comment: 12 pages, 7 figures, submitted to IEEE Journal of Oceanic Engineering (IEEE-JOE)
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents ACM MM
Aerial navigation is a fundamental yet underexplored capability in embodied intelligence, enabling agents to operate in large-scale, unstructured environments where traditional navigation paradigms fall short. However, most existing research follows the Vision-and-Language Navigation (VLN) paradigm, which heavily depends on sequential linguistic instructions, limiting its scalability and autonomy. To address this gap, we introduce UAV-ON, a benchmark for large-scale Object Goal Navigation (ObjectNav) by aerial agents in open-world environments, where agents operate based on high-level semantic goals without relying on detailed instructional guidance as in VLN. UAV-ON comprises 14 high-fidelity Unreal Engine environments with diverse semantic regions and complex spatial layouts, covering urban, natural, and mixed-use settings. It defines 1270 annotated target objects, each characterized by an instance-level instruction that encodes category, physical footprint, and visual descriptors, allowing grounded reasoning. These instructions serve as semantic goals, introducing realistic ambiguity and complex reasoning challenges for aerial agents. To evaluate the benchmark, we implement several baseline methods, including Aerial ObjectNav Agent (AOA), a modular policy that integrates instruction semantics with egocentric observations for long-horizon, goal-directed exploration. Empirical results show that all baselines struggle in this setting, highlighting the compounded challenges of aerial navigation and semantic goal grounding. UAV-ON aims to advance research on scalable UAV autonomy driven by semantic goal descriptions in complex real-world environments.
comment: Accepted to ACM MM Dataset Track 2025
ROS-related Robotic Systems Development with V-model-based Application of MeROS Metamodel
Systems built on the Robot Operating System (ROS) are increasingly easy to assemble, yet hard to govern and reliably coordinate. Beyond the sheer number of subsystems involved, the difficulty stems from their diversity and interaction depth. In this paper, we use a compact heterogeneous robotic system (HeROS), combining mobile and manipulation capabilities, as a demonstration vehicle under dynamically changing tasks. Notably, all its subsystems are powered by ROS. The use of compatible interfaces and other ROS integration capabilities simplifies the construction of such systems. However, this only addresses part of the complexity: the semantic coherence and structural traceability are even more important for precise coordination and call for deliberate engineering methods. The Model-Based Systems Engineering (MBSE) discipline, which emerged from the experience of complexity management in large-scale engineering domains, offers the methodological foundations needed. Despite their strengths in complementary aspects of robotics systems engineering, the lack of a unified approach to integrate ROS and MBSE hinders the full potential of these tools. Motivated by the anticipated impact of such a synergy in robotics practice, we propose a structured methodology based on MeROS - a SysML metamodel created specifically to put the ROS-based systems into the focus of the MBSE workflow. As its methodological backbone, we adapt the well-known V-model to this context, illustrating how complex robotic systems can be designed with traceability and validation capabilities embedded into their lifecycle using practices familiar to engineering teams.
comment: 22 pages
TAGA: A Tangent-Based Reactive Approach for Socially Compliant Robot Navigation Around Human Groups ICRA
Robot navigation in densely populated environments presents significant challenges, particularly regarding the interplay between individual and group dynamics. Current navigation models predominantly address interactions with individual pedestrians while failing to account for human groups that naturally form in real-world settings. Conversely, the limited models implementing group-aware navigation typically prioritize group dynamics at the expense of individual interactions, both of which are essential for socially appropriate navigation. This research extends an existing simulation framework to incorporate both individual pedestrians and human groups. We present Tangent Action for Group Avoidance (TAGA), a modular reactive mechanism that can be integrated with existing navigation frameworks to enhance their group-awareness capabilities. TAGA dynamically modifies robot trajectories using tangent action-based avoidance strategies while preserving the underlying model's capacity to navigate around individuals. Additionally, we introduce Group Collision Rate (GCR), a novel metric to quantitatively assess how effectively robots maintain group integrity during navigation. Through comprehensive simulation-based benchmarking, we demonstrate that integrating TAGA with state-of-the-art navigation models (ORCA, Social Force, DS-RNN, and AG-RL) reduces group intrusions by 45.7-78.6% while maintaining comparable success rates and navigation efficiency. Future work will focus on real-world implementation and validation of this approach.
comment: 6 pages, 3 figures. Preprint; intended for submission to IEEE International Conference on Robotics & Automation (ICRA), 2025
Hyper Yoshimura: How a slight tweak on a classical folding pattern unleashes meta-stability for deployable robots
Deployable structures inspired by origami have provided lightweight, compact, and reconfigurable solutions for various robotic and architectural applications. However, creating an integrated structural system that can effectively balance the competing requirements of high packing efficiency, simple deployment, and precise morphing into multiple load-bearing configurations remains a significant challenge. This study introduces a new class of hyper-Yoshimura origami, which exhibits a wide range of kinematically admissible and locally metastable states, including newly discovered symmetric "self-packing" and asymmetric "pop-out" states. This metastability is achieved by breaking a design rule of Yoshimura origami that has been in place for many decades. To this end, this study derives a new set of mathematically rigorous design rules and geometric formulations. Based on this, forward and inverse kinematic strategies are developed to stack hyper-Yoshimura modules into deployable booms that can approximate complex 3D shapes. Finally, this study showcases the potential of hyper-Yoshimura with a meter-scale pop-up cellphone charging station deployed at our university's bus transit station, along with a 3D-printed, scaled prototype of a space crane that can function as an object manipulator, solar tracking device, or high-load-bearing structure. These results establish hyper-Yoshimura as a promising platform for deployable and adaptable robotic systems in both terrestrial and space environments.
Adaptive Task Space Non-Singular Terminal Super-Twisting Sliding Mode Control of a 7-DOF Robotic Manipulator
This paper presents a new task-space Non-singular Terminal Super-Twisting Sliding Mode (NT-STSM) controller with adaptive gains for robust trajectory tracking of a 7-DOF robotic manipulator. The proposed approach addresses the challenges of chattering, unknown disturbances, and rotational motion tracking, making it suited for high-DOF manipulators in dexterous manipulation tasks. A rigorous boundedness proof is provided, offering gain selection guidelines for practical implementation. Simulations and hardware experiments with external disturbances demonstrate the proposed controller's robust, accurate tracking with reduced control effort under unknown disturbances compared to other NT-STSM and conventional controllers. The results demonstrated that the proposed NT-STSM controller mitigates chattering and instability in complex motions, making it a viable solution for dexterous robotic manipulations and various industrial applications.
comment: Accepted for publication in IEEE Transactions on Industrial Electronics. 12 pages, 8 figures
B*: Efficient and Optimal Base Placement for Fixed-Base Manipulators
B* is a novel optimization framework that addresses a critical challenge in fixed-base manipulator robotics: optimal base placement. Current methods rely on pre-computed kinematics databases generated through sampling to search for solutions. However, they face an inherent trade-off between solution optimality and computational efficiency when determining sampling resolution. To address these limitations, B* unifies multiple objectives without database dependence. The framework employs a two-layer hierarchical approach. The outer layer systematically manages terminal constraints through progressive tightening, particularly for base mobility, enabling feasible initialization and broad solution exploration. The inner layer addresses non-convexities in each outer-layer subproblem through sequential local linearization, converting the original problem into tractable sequential linear programming (SLP). Testing across multiple robot platforms demonstrates B*'s effectiveness. The framework achieves solution optimality five orders of magnitude better than sampling-based approaches while maintaining perfect success rates and reduced computational overhead. Operating directly in configuration space, B* enables simultaneous path planning with customizable optimization criteria. B* serves as a crucial initialization tool that bridges the gap between theoretical motion planning and practical deployment, where feasible trajectory existence is fundamental.
comment: accepted for publication in the IEEE Robotics and Automation Letters (RA-L)
Optimized Lattice-Structured Flexible EIT Sensor for Tactile Reconstruction and Classification
Flexible electrical impedance tomography (EIT) offers a promising alternative to traditional tactile sensing approaches, enabling low-cost, scalable, and deformable sensor designs. Here, we propose an optimized lattice-structured flexible EIT tactile sensor incorporating a hydrogel-based conductive layer, systematically designed through three-dimensional coupling field simulations to optimize structural parameters for enhanced sensitivity and robustness. By tuning the lattice channel width and conductive layer thickness, we achieve significant improvements in tactile reconstruction quality and classification performance. Experimental results demonstrate high-quality tactile reconstruction with correlation coefficients up to 0.9275, peak signal-to-noise ratios reaching 29.0303 dB, and structural similarity indexes up to 0.9660, while maintaining low relative errors down to 0.3798. Furthermore, the optimized sensor accurately classifies 12 distinct tactile stimuli with an accuracy reaching 99.6%. These results highlight the potential of simulation-guided structural optimization for advancing flexible EIT-based tactile sensors toward practical applications in wearable systems, robotics, and human-machine interfaces.
comment: Accepted by IEEE Transactions on Instrumentation & Measurement
OmniVTLA: Vision-Tactile-Language-Action Model with Semantic-Aligned Tactile Sensing
Recent vision-language-action (VLA) models build upon vision-language foundations, and have achieved promising results and exhibit the possibility of task generalization in robot manipulation. However, due to the heterogeneity of tactile sensors and the difficulty of acquiring tactile data, current VLA models significantly overlook the importance of tactile perception and fail in contact-rich tasks. To address this issue, this paper proposes OmniVTLA, a novel architecture involving tactile sensing. Specifically, our contributions are threefold. First, our OmniVTLA features a dual-path tactile encoder framework. This framework enhances tactile perception across diverse vision-based and force-based tactile sensors by using a pretrained vision transformer (ViT) and a semantically-aligned tactile ViT (SA-ViT). Second, we introduce ObjTac, a comprehensive force-based tactile dataset capturing textual, visual, and tactile information for 56 objects across 10 categories. With 135K tri-modal samples, ObjTac supplements existing visuo-tactile datasets. Third, leveraging this dataset, we train a semantically-aligned tactile encoder to learn a unified tactile representation, serving as a better initialization for OmniVTLA. Real-world experiments demonstrate substantial improvements over state-of-the-art VLA baselines, achieving 96.9% success rates with grippers, (21.9% higher over baseline) and 100% success rates with dexterous hands (6.2% higher over baseline) in pick-and-place tasks. Besides, OmniVTLA significantly reduces task completion time and generates smoother trajectories through tactile sensing compared to existing VLA. Our ObjTac dataset can be found at https://readerek.github.io/Objtac.github.io
comment: 15 pages, 7 figures, 8 tables. ObjTac dataset: https://readerek.github.io/Objtac.github.io
ScrewSplat: An End-to-End Method for Articulated Object Recognition
Articulated object recognition -- the task of identifying both the geometry and kinematic joints of objects with movable parts -- is essential for enabling robots to interact with everyday objects such as doors and laptops. However, existing approaches often rely on strong assumptions, such as a known number of articulated parts; require additional inputs, such as depth images; or involve complex intermediate steps that can introduce potential errors -- limiting their practicality in real-world settings. In this paper, we introduce ScrewSplat, a simple end-to-end method that operates solely on RGB observations. Our approach begins by randomly initializing screw axes, which are then iteratively optimized to recover the object's underlying kinematic structure. By integrating with Gaussian Splatting, we simultaneously reconstruct the 3D geometry and segment the object into rigid, movable parts. We demonstrate that our method achieves state-of-the-art recognition accuracy across a diverse set of articulated objects, and further enables zero-shot, text-guided manipulation using the recovered kinematic model. See the project website at: https://screwsplat.github.io.
comment: 26 pages, 12 figures, Conference on Robot Learning (CoRL) 2025
SIGMA: Sheaf-Informed Geometric Multi-Agent Pathfinding ICRA
The Multi-Agent Path Finding (MAPF) problem aims to determine the shortest and collision-free paths for multiple agents in a known, potentially obstacle-ridden environment. It is the core challenge for robotic deployments in large-scale logistics and transportation. Decentralized learning-based approaches have shown great potential for addressing the MAPF problems, offering more reactive and scalable solutions. However, existing learning-based MAPF methods usually rely on agents making decisions based on a limited field of view (FOV), resulting in short-sighted policies and inefficient cooperation in complex scenarios. There, a critical challenge is to achieve consensus on potential movements between agents based on limited observations and communications. To tackle this challenge, we introduce a new framework that applies sheaf theory to decentralized deep reinforcement learning, enabling agents to learn geometric cross-dependencies between each other through local consensus and utilize them for tightly cooperative decision-making. In particular, sheaf theory provides a mathematical proof of conditions for achieving global consensus through local observation. Inspired by this, we incorporate a neural network to approximately model the consensus in latent space based on sheaf theory and train it through self-supervised learning. During the task, in addition to normal features for MAPF as in previous works, each agent distributedly reasons about a learned consensus feature, leading to efficient cooperation on pathfinding and collision avoidance. As a result, our proposed method demonstrates significant improvements over state-of-the-art learning-based MAPF planners, especially in relatively large and complex scenarios, demonstrating its superiority over baselines in various simulations and real-world robot experiments. The code is available at https://github.com/marmotlab/SIGMA
comment: Accepted for presentation at the 2025 IEEE International Conference on Robotics and Automation (ICRA)
Multiagent Systems
LLM-Based Agents for Competitive Landscape Mapping in Drug Asset Due Diligence
In this paper, we describe and benchmark a competitor-discovery component used within an agentic AI system for fast drug asset due diligence. A competitor-discovery AI agent, given an indication, retrieves all drugs comprising the competitive landscape of that indication and extracts canonical attributes for these drugs. The competitor definition is investor-specific, and data is paywalled/licensed, fragmented across registries, ontology-mismatched by indication, alias-heavy for drug names, multimodal, and rapidly changing. Although considered the best tool for this problem, the current LLM-based AI systems aren't capable of reliably retrieving all competing drug names, and there is no accepted public benchmark for this task. To address the lack of evaluation, we use LLM-based agents to transform five years of multi-modal, unstructured diligence memos from a private biotech VC fund into a structured evaluation corpus mapping indications to competitor drugs with normalized attributes. We also introduce a competitor validating LLM-as-a-judge agent that filters out false positives from the list of predicted competitors to maximize precision and suppress hallucinations. On this benchmark, our competitor-discovery agent achieves 83% recall, exceeding OpenAI Deep Research (65%) and Perplexity Labs (60%). The system is deployed in production with enterprise users; in a case study with a biotech VC investment fund, analyst turnaround time dropped from 2.5 days to $\sim$3 hours ($\sim$20x) for the competitive analysis.
Abmax: A JAX-based Agent-based Modeling Framework
Agent-based modeling (ABM) is a principal approach for studying complex systems. By decomposing a system into simpler, interacting agents, agent-based modeling (ABM) allows researchers to observe the emergence of complex phenomena. High-performance array computing libraries like JAX can help scale such computational models to a large number of agents by using automatic vectorization and just-in-time (JIT) compilation. One of the caveats of using JAX to achieve such scaling is that the shapes of arrays used in the computational model should remain immutable throughout the simulation. In the context of agent-based modeling (ABM), this can pose constraints on certain agent manipulation operations that require flexible data structures. A subset of which is represented by the ability to update a dynamically selected number of agents by applying distinct changes to them during a simulation. To this effect, we introduce Abmax, an ABM framework based on JAX that implements multiple just-in-time (JIT) compilable algorithms to provide this functionality. On the canonical predation model benchmark, Abmax achieves runtime performance comparable to state-of-the-art implementations. Further, we show that this functionality can also be vectorized, making it possible to run many similar agent-based models in parallel. We also present two examples in the form of a traffic-flow model and a financial market model to show the use case of Abmax.
comment: 12 pages, 7 figures, 4 tables, 2 algorithms
Swarming Without an Anchor (SWA): Robot Swarms Adapt Better to Localization Dropouts Then a Single Robot
In this paper, we present the Swarming Without an Anchor (SWA) approach to state estimation in swarms of Unmanned Aerial Vehicles (UAVs) experiencing ego-localization dropout, where individual agents are laterally stabilized using relative information only. We propose to fuse decentralized state estimation with robust mutual perception and onboard sensor data to maintain accurate state awareness despite intermittent localization failures. Thus, the relative information used to estimate the lateral state of UAVs enables the identification of the unambiguous state of UAVs with respect to the local constellation. The resulting behavior reaches velocity consensus, as this task can be referred to as the double integrator synchronization problem. All disturbances and performance degradations except a uniform translation drift of the swarm as a whole is attenuated which is enabling new opportunities in using tight cooperation for increasing reliability and resilience of multi-UAV systems. Simulations and real-world experiments validate the effectiveness of our approach, demonstrating its capability to sustain cohesive swarm behavior in challenging conditions of unreliable or unavailable primary localization.
comment: Accepted to IEEE RA-L on April 1, 2025
Integrated Noise and Safety Management in UAM via A Unified Reinforcement Learning Framework
Urban Air Mobility (UAM) envisions the widespread use of small aerial vehicles to transform transportation in dense urban environments. However, UAM faces critical operational challenges, particularly the balance between minimizing noise exposure and maintaining safe separation in low-altitude urban airspace, two objectives that are often addressed separately. We propose a reinforcement learning (RL)-based air traffic management system that integrates both noise and safety considerations within a unified, decentralized framework. Under this scalable air traffic coordination solution, agents operate in a structured, multi-layered airspace and learn altitude adjustment policies to jointly manage noise impact and separation constraints. The system demonstrates strong performance across both objectives and reveals tradeoffs among separation, noise exposure, and energy efficiency under high traffic density. The findings highlight the potential of RL and multi-objective coordination strategies in enhancing the safety, quietness, and efficiency of UAM operations.
Sound and Solution-Complete CCBS
Continuous-time Conflict Based-Search (CCBS) has long been viewed as the de-facto optimal solver for multi-agent path finding in continuous time (MAPFR). Recent findings, however, show that the original theoretical variant of CCBS can suffer from non-termination, while the widely used implementation can return sub-optimal solutions. We introduce an analytical framework that yields simple and sufficient conditions under which any CCBS-style algorithm is both sound, i.e., returns only optimal solutions, and solution complete, i.e., terminates on every solvable MAPFR instance. Investigating the publicly available implementation of CCBS reveals that it violates these conditions. Though this merely indicates that CCBS might be unsound, this indication is supported by counter-examples. Leveraging the analytical framework, we propose a novel branching rule and prove that it satisfies the sufficient conditions, thereby restoring soundness and termination guarantees. Consequently, the resulting CCBS variant is both sound and solution complete, matching the guarantees of the discrete-time CBS for the first time in the continuous domain. We experimentally apply standard CCBS and CCBS under our branching rule to an example problem, with our branching rule returning a solution with lower sum-of-costs than standard CCBS. Because the branching rule largely only affects the branching step, it can be adopted as a drop-in replacement in existing code-bases, as we show in our provided implementation. Beyond CCBS, the analytical framework and termination criterion provide a systematic way to evaluate other CCBS-like MAPFR solvers and future extensions.
comment: 15 pages
Limit-Computable Grains of Truth for Arbitrary Computable Extensive-Form (Un)Known Games
A Bayesian player acting in an infinite multi-player game learns to predict the other players' strategies if his prior assigns positive probability to their play (or contains a grain of truth). Kalai and Lehrer's classic grain of truth problem is to find a reasonably large class of strategies that contains the Bayes-optimal policies with respect to this class, allowing mutually-consistent beliefs about strategy choice that obey the rules of Bayesian inference. Only small classes are known to have a grain of truth and the literature contains several related impossibility results. In this paper we present a formal and general solution to the full grain of truth problem: we construct a class of strategies wide enough to contain all computable strategies as well as Bayes-optimal strategies for every reasonable prior over the class. When the "environment" is a known repeated stage game, we show convergence in the sense of [KL93a] and [KL93b]. When the environment is unknown, agents using Thompson sampling converge to play $\varepsilon$-Nash equilibria in arbitrary unknown computable multi-agent environments. Finally, we include an application to self-predictive policies that avoid planning. While these results use computability theory only as a conceptual tool to solve a classic game theory problem, we show that our solution can naturally be computationally approximated arbitrarily closely.
comment: 42 pages; 2 figures; 7 algorithms
Murakkab: Resource-Efficient Agentic Workflow Orchestration in Cloud Platforms
Agentic workflows commonly coordinate multiple models and tools with complex control logic. They are quickly becoming the dominant paradigm for AI applications. However, serving them remains inefficient with today's frameworks. The key problem is that they expose workflows as opaque sequences of model and tool calls that tightly couple agent logic with model and hardware choices. Often, these workflow components are fragmented across different entities, preventing systems from reasoning about trade-offs across accuracy, latency, energy, and cost. This leads to resource waste and degraded service-level objectives (SLOs). We present Murakkab, a resource-efficient serving system for agentic workflows. Murakkab introduces a declarative abstraction that decouples workflow specification from execution configuration. A profile-guided optimizer and adaptive runtime jointly manage the full stack: orchestrating workflow components, mapping them to models and hardware, and dynamically reconfiguring execution to satisfy user-defined SLOs. By exposing the internal structure of agentic workflows, Murakkab enables cross-layer optimization that existing frameworks and cloud schedulers cannot achieve. Our evaluation on diverse workflows shows that \sysname{} reduces GPU usage by up to 2.8$\times$, energy consumption by 3.7$\times$, and cost by 4.3$\times$ while maintaining SLOs.
Consensus Is All You Need: Gossip-Based Reasoning Among Large Language Models
Large language models have advanced rapidly, but no single model excels in every area -- each has its strengths and weaknesses. Instead of relying on one model alone, we take inspiration from gossip protocols in distributed systems, where information is exchanged with peers until they all come to an agreement. In this setup, models exchange answers and gradually work toward a shared solution. Each LLM acts as a node in a peer-to-peer network, sharing responses and thought processes to reach a collective decision. Our results show that this "gossip-based consensus" leads to robust, resilient, and accurate multi-agent AI reasoning. It helps overcome the weaknesses of individual models and brings out their collective strengths. This approach is similar to how humans build consensus, making AI seem more collaborative and trustworthy instead of just a black-box program.
comment: 4 pages, 5 figures
The Aegis Protocol: A Foundational Security Framework for Autonomous AI Agents
The proliferation of autonomous AI agents marks a paradigm shift toward complex, emergent multi-agent systems. This transition introduces systemic security risks, including control-flow hijacking and cascading failures, that traditional cybersecurity paradigms are ill-equipped to address. This paper introduces the Aegis Protocol, a layered security framework designed to provide strong security guarantees for open agentic ecosystems. The protocol integrates three technological pillars: (1) non-spoofable agent identity via W3C Decentralized Identifiers (DIDs); (2) communication integrity via NIST-standardized post-quantum cryptography (PQC); and (3) verifiable, privacy-preserving policy compliance using the Halo2 zero-knowledge proof (ZKP) system. We formalize an adversary model extending Dolev-Yao for agentic threats and validate the protocol against the STRIDE framework. Our quantitative evaluation used a discrete-event simulation, calibrated against cryptographic benchmarks, to model 1,000 agents. The simulation showed a 0 percent success rate across 20,000 attack trials. For policy verification, analysis of the simulation logs reported a median proof-generation latency of 2.79 seconds, establishing a performance baseline for this class of security. While the evaluation is simulation-based and early-stage, it offers a reproducible baseline for future empirical studies and positions Aegis as a foundation for safe, scalable autonomous AI.
comment: 10 pages, 3 figures, 3 tables. Source compiled with pdfLaTeX; bibliography included via prebuilt main.bbl. Code repository: available in paper
FACET: Teacher-Centred LLM-Based Multi-Agent Systems-Towards Personalized Educational Worksheets
The increasing heterogeneity of student populations poses significant challenges for teachers, particularly in mathematics education, where cognitive, motivational, and emotional differences strongly influence learning outcomes. While AI-driven personalization tools have emerged, most remain performance-focused, offering limited support for teachers and neglecting broader pedagogical needs. This paper presents the FACET framework, a teacher-facing, large language model (LLM)-based multi-agent system designed to generate individualized classroom materials that integrate both cognitive and motivational dimensions of learner profiles. The framework comprises three specialized agents: (1) learner agents that simulate diverse profiles incorporating topic proficiency and intrinsic motivation, (2) a teacher agent that adapts instructional content according to didactical principles, and (3) an evaluator agent that provides automated quality assurance. We tested the system using authentic grade 8 mathematics curriculum content and evaluated its feasibility through a) automated agent-based assessment of output quality and b) exploratory feedback from K-12 in-service teachers. Results from ten internal evaluations highlighted high stability and alignment between generated materials and learner profiles, and teacher feedback particularly highlighted structure and suitability of tasks. The findings demonstrate the potential of multi-agent LLM architectures to provide scalable, context-aware personalization in heterogeneous classroom settings, and outline directions for extending the framework to richer learner profiles and real-world classroom trials.
SIGMA: Sheaf-Informed Geometric Multi-Agent Pathfinding ICRA
The Multi-Agent Path Finding (MAPF) problem aims to determine the shortest and collision-free paths for multiple agents in a known, potentially obstacle-ridden environment. It is the core challenge for robotic deployments in large-scale logistics and transportation. Decentralized learning-based approaches have shown great potential for addressing the MAPF problems, offering more reactive and scalable solutions. However, existing learning-based MAPF methods usually rely on agents making decisions based on a limited field of view (FOV), resulting in short-sighted policies and inefficient cooperation in complex scenarios. There, a critical challenge is to achieve consensus on potential movements between agents based on limited observations and communications. To tackle this challenge, we introduce a new framework that applies sheaf theory to decentralized deep reinforcement learning, enabling agents to learn geometric cross-dependencies between each other through local consensus and utilize them for tightly cooperative decision-making. In particular, sheaf theory provides a mathematical proof of conditions for achieving global consensus through local observation. Inspired by this, we incorporate a neural network to approximately model the consensus in latent space based on sheaf theory and train it through self-supervised learning. During the task, in addition to normal features for MAPF as in previous works, each agent distributedly reasons about a learned consensus feature, leading to efficient cooperation on pathfinding and collision avoidance. As a result, our proposed method demonstrates significant improvements over state-of-the-art learning-based MAPF planners, especially in relatively large and complex scenarios, demonstrating its superiority over baselines in various simulations and real-world robot experiments. The code is available at https://github.com/marmotlab/SIGMA
comment: Accepted for presentation at the 2025 IEEE International Conference on Robotics and Automation (ICRA)
Balancing Act: Prioritization Strategies for LLM-Designed Restless Bandit Rewards
LLMs are increasingly used to design reward functions based on human preferences in Reinforcement Learning (RL). We focus on LLM-designed rewards for Restless Multi-Armed Bandits, a framework for allocating limited resources among agents. In applications such as public health, this approach empowers grassroots health workers to tailor automated allocation decisions to community needs. In the presence of multiple agents, altering the reward function based on human preferences can impact subpopulations very differently, leading to complex tradeoffs and a multi-objective resource allocation problem. We are the first to present a principled method termed Social Choice Language Model for dealing with these tradeoffs for LLM-designed rewards for multiagent planners in general and restless bandits in particular. The novel part of our model is a transparent and configurable selection component, called an adjudicator, external to the LLM that controls complex tradeoffs via a user-selected social welfare function. Our experiments demonstrate that our model reliably selects more effective, aligned, and balanced reward functions compared to purely LLM-based approaches.
Systems and Control (CS)
TinyML Towards Industry 4.0: Resource-Efficient Process Monitoring of a Milling Machine
In the context of industry 4.0, long-serving industrial machines can be retrofitted with process monitoring capabilities for future use in a smart factory. One possible approach is the deployment of wireless monitoring systems, which can benefit substantially from the TinyML paradigm. This work presents a complete TinyML flow from dataset generation, to machine learning model development, up to implementation and evaluation of a full preprocessing and classification pipeline on a microcontroller. After a short review on TinyML in industrial process monitoring, the creation of the novel MillingVibes dataset is described. The feasibility of a TinyML system for structure-integrated process quality monitoring could be shown by the development of an 8-bit-quantized convolutional neural network (CNN) model with 12.59kiB parameter storage. A test accuracy of 100.0% could be reached at 15.4ms inference time and 1.462mJ per quantized CNN inference on an ARM Cortex M4F microcontroller, serving as a reference for future TinyML process monitoring solutions.
comment: 10 pages, 5 figures, 1 table
Multi-agent Robust and Optimal Policy Learning for Data Harvesting
We consider the problem of using multiple agents to harvest data from a collection of sensor nodes (targets) scattered across a two-dimensional environment. These targets transmit their data to the agents that move in the space above them, and our goal is for the agents to collect data from the targets as efficiently as possible while moving to their final destinations. The agents are assumed to have a continuous control action, and we leverage reinforcement learning, specifically Proximal Policy Optimization (PPO) with Lagrangian Penalty (LP), to identify highly effective solutions. Additionally, we enhance the controller's robustness by incorporating regularization at each state to smooth the learned policy. We conduct a series of simulations to demonstrate our approach and validate its performance and robustness.
NOSTRA: A noise-resilient and sparse data framework for trust region based multi objective Bayesian optimization
Multi-objective Bayesian optimization (MOBO) struggles with sparse (non-space-filling), scarce (limited observations) datasets affected by experimental uncertainty, where identical inputs can yield varying outputs. These challenges are common in physical and simulation experiments (e.g., randomized medical trials and, molecular dynamics simulations) and are therefore incompatible with conventional MOBO methods. As a result, experimental resources are inefficiently allocated, leading to suboptimal designs. To address this challenge, we introduce NOSTRA (Noisy and Sparse Data Trust Region-based Optimization Algorithm), a novel sampling framework that integrates prior knowledge of experimental uncertainty to construct more accurate surrogate models while employing trust regions to focus sampling on promising areas of the design space. By strategically leveraging prior information and refining search regions, NOSTRA accelerates convergence to the Pareto frontier, enhances data efficiency, and improves solution quality. Through two test functions with varying levels of experimental uncertainty, we demonstrate that NOSTRA outperforms existing methods in handling noisy, sparse, and scarce data. Specifically, we illustrate that, NOSTRA effectively prioritizes regions where samples enhance the accuracy of the identified Pareto frontier, offering a resource-efficient algorithm that is practical in scenarios with limited experimental budgets while ensuring efficient performance.
Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)
This work presents a novel reinforcement learning (RL) algorithm based on Y-wise Affine Neural Networks (YANNs). YANNs provide an interpretable neural network which can exactly represent known piecewise affine functions of arbitrary input and output dimensions defined on any amount of polytopic subdomains. One representative application of YANNs is to reformulate explicit solutions of multi-parametric linear model predictive control. Built on this, we propose the use of YANNs to initialize RL actor and critic networks, which enables the resulting YANN-RL control algorithm to start with the confidence of linear optimal control. The YANN-actor is initialized by representing the multi-parametric control solutions obtained via offline computation using an approximated linear system model. The YANN-critic represents the explicit form of the state-action value function for the linear system and the reward function as the objective in an optimal control problem (OCP). Additional network layers are injected to extend YANNs for nonlinear expressions, which can be trained online by directly interacting with the true complex nonlinear system. In this way, both the policy and state-value functions exactly represent a linear OCP initially and are able to eventually learn the solution of a general nonlinear OCP. Continuous policy improvement is also implemented to provide heuristic confidence that the linear OCP solution serves as an effective lower bound to the performance of RL policy. The YANN-RL algorithm is demonstrated on a clipped pendulum and a safety-critical chemical-reactive system. Our results show that YANN-RL significantly outperforms the modern RL algorithm using deep deterministic policy gradient, especially when considering safety constraints.
Wide-Area Power System Oscillations from Large-Scale AI Workloads
This paper develops a new dynamic power profiling approach for modeling AI-centric datacenter loads and analyzing their impact on grid operations, particularly their potential to induce wide-area grid oscillations. We characterize the periodic stochastic power fluctuations inherent to large-scale AI workloads during both the training and fine-tuning stages, driven by the state-of-the-art GPU computing architecture designs. These sustained, large power fluctuations, unlike conventional load ramping, act as persistent forcing inputs capable of interacting with and amplifying local and inter-area oscillation modes. Using the WECC 179-bus system as a test case, we examine the amplitude and variability of oscillatory responses under different factors, ranging from system strength, penetration level, fluctuation frequency range, individual datacenter size, to geographical deployment. Simulation results show that, notably, narrower fluctuation bands, larger single-site capacities, or dispersed siting can intensify oscillations across multiple modes. Our models and numerical studies provide a quantitative basis for integrating AI-dominant electricity demands into grid oscillation studies, and further support the development of new planning and operational measures to power the continuous AI load growth.
Performance analysis for cone-preserving switched systems with constrained switching
This paper studies cone-preserving linear discrete-time switched systems whose switching is governed by an automaton. For this general system class, we present performance analysis conditions for a broadly usable performance measure. In doing so, we generalize several known results for performance and stability analysis for switched and positive switched systems, providing a unifying perspective. We also arrive at novel $\ell_1$-performance analysis conditions for positive switched systems with constrained switching, for which we present an application-motivated numerical example. Further, the cone-preserving perspective provides insights into appropriate Lyapunov function selection.
comment: Accepted for publication at IEEE CDC 2025
Resilient Control for Networked Switched Systems With/Without ACK: An Active Quantized Framework
This paper deals with the quantized control problem for switched systems under denial-of-service (DoS) attack. Considering the system's defensive capability and the computational resources of quantizers and controllers, four control strategies are proposed. These strategies incorporate different combinations of controllers (active and passive), quantizers (centered on the origin or custom-designed), and network configurations (with or without ACK signals). For each strategy, specific update laws for the encoder and decoder are designed to avoid quantization saturation. Furthermore, the uniformity of encoder and decoder operations is maintained by transmitting additional information to the decoder. To achieve asymptotic stability, sufficient conditions concerning the switching signal and DoS attack constraints are derived by taking into account the asynchronous behaviors. The proposed active quantization strategy with the ACK signal leverages the system model information to compute the control signal in real-time, allowing for possible convergence of the system state despite DoS attack. Additionally, a well-designed switching signal is suggested to further mitigate the impact of DoS attack. A passive quantization strategy with ACK signal is also developed as a simplified version of the active quantized control strategy, providing the foundation for a strategy without ACK signal. Inspired by time-triggered and event-triggered mechanisms, the passive quantization strategy without ACK signal is investigated, with two feasible update laws for the quantizer. Finally, two simulations are conducted to validate the effectiveness of the proposed strategies.
Well-posedness of Lur'e systems with feedthrough
For a large class of Lur'e systems with time-varying nonlinearities and feedthrough we consider several well-posedness issues, namely: existence, continuation, blow-up in finite-time, forward completeness and uniqueness of solutions. Lur'e systems with feedthrough are systems of forced, nonlinear ordinary differential equations coupled with a nonlinear algebraic equation determining the output of the system. The presence of feedthrough means that the algebraic equation is implicit in the output, and, in general, the output may not be expressible by an analytic formula in terms of the state and the input. Simple examples illustrate that the well-posedness properties of such systems are not necessarily guaranteed by assumptions sufficient for the corresponding well-posedness properties of Lur'e systems without feedthrough. We provide sufficient conditions for the well-posedness properties mentioned above, using global inversion theorems from real analysis and tools from non-smooth analysis and differential inclusions. The theory is illustrated with examples.
comment: 30 pages, 0 figures. Submitted Aug 2025 to peer-reviewed journal for potential publication. Uploading preprint to arxiv for early dissemination
A Joint Delay-Energy-Security Aware Framework for Intelligent Task Scheduling in Satellite-Terrestrial Edge Computing Network
In this paper, we propose a two-stage optimization framework for secure task scheduling in satellite-terrestrial edge computing networks (STECNs). The framework jointly considers secure user association and task offloading to balance transmission delay, energy consumption, and physical-layer security. To address the inherent complexity, we decouple the problem into two stages. In the first stage, a secrecy-aware user association strategy is designed by discretizing artificial noise (AN) power ratios and identifying feasible links that satisfy secrecy constraints, resulting in a set of candidate secure associations. In the second stage, we formulate a delay-energy-aware task scheduling problem as an integer linear program and solve it using a heuristic Mayfly Algorithm (MA) to obtain low-complexity, high-quality solutions. Extensive simulation results demonstrate the effectiveness and superiority of the proposed framework in achieving secure and efficient task scheduling under dynamic satellite environments.
comment: 10 pages, 8 figures
LLM-Assisted Semantic Alignment and Integration in Collaborative Model-Based Systems Engineering Using SysML v2
Cross-organizational collaboration in Model-Based Systems Engineering (MBSE) faces many challenges in achieving semantic alignment across independently developed system models. SysML v2 introduces enhanced structural modularity and formal semantics, offering a stronger foundation for interoperable modeling. Meanwhile, GPT-based Large Language Models (LLMs) provide new capabilities for assisting model understanding and integration. This paper proposes a structured, prompt-driven approach for LLM-assisted semantic alignment of SysML v2 models. The core contribution lies in the iterative development of an alignment approach and interaction prompts, incorporating model extraction, semantic matching, and verification. The approach leverages SysML v2 constructs such as alias, import, and metadata extensions to support traceable, soft alignment integration. It is demonstrated with a GPT-based LLM through an example of a measurement system. Benefits and limitations are discussed.
comment: Accepted by IEEE ISSE 2025, DOI pending
Data-Driven Analysis and Predictive Control of Descriptor Systems with Application to Power and Water Networks
Despite growing interest in data-driven analysis and control of linear systems, descriptor systems--which are essential for modeling complex engineered systems with algebraic constraints like power and water networks--have received comparatively little attention. This paper develops a comprehensive data-driven framework for analyzing and controlling discrete-time descriptor systems without relying on explicit state-space models. We address fundamental challenges posed by non-causality through the construction of forward and backward data matrices, establishing data-based sufficient conditions for controllability and observability in terms of input-output data, where both R-controllability and C-controllability (R-observability and C-observability) have been considered. We then extend Willems' fundamental lemma to incompletely controllable systems. These methodological advances enable Data-Enabled Predictive Control (DeePC) to achieve output tracking in descriptor systems and to maintain performance under incomplete controllability conditions, as demonstrated in two case studies: i) Frequency regulation in an IEEE 9-bus power system with 3 generators, where DeePC maintained the frequency stability of the power system despite deliberate violations of R-controllability; and ii) Pressure head control in an EPANET water network with 3 tanks, 2 reservoirs, and 117 pipes, where output tracking was successfully enforced under algebraic constraints.
Dynamic Switching Models for Truck-only Delivery and Drone-assisted Truck Delivery under Demand Uncertainty
Integrating drones into truck delivery systems can improve customer accessibility, reduce operational costs, and increase delivery efficiency. However, drone deployment incurs costs, including procurement, maintenance, and energy consumption, and its benefits depend on service demand. In low-demand areas, drone-assisted trucks may underutilize resources due to high upfront costs. Accurately predicting demand is challenging due to uncertainties from unforeseen events or infrastructure disruptions. To address this, a market entry and exit real option approach is used to optimize switching between truck-only and drone-assisted delivery under stochastic demand. Results show that deploying multiple drones per truck offers significant cost advantages in high-demand regions. Using the proposed dynamic switching model, deterministic and stochastic approaches reduce costs by 17.4% and 31.3%, respectively, compared to immediate cost-saving switching. Sensitivity analysis reveals asymmetric effects of stochastic parameters on entry and exit timings. A stochastic multiple-options model is further developed to dynamically switch between truck-only and drone-assisted delivery with varying drone numbers. Applying these models to Miami-Dade County, we evaluate dynamic switching costs for three major logistics operators. This study highlights the potential benefits of dynamic delivery switching and provides insights for optimizing logistics operations.
Fairness for distribution network hosting capacity
The integration of distributed generation (DG) is essential to the energy transition but poses challenges for lowvoltage (LV) distribution networks (DNs) with limited hosting capacity (HC). This study incorporates multiple fairness criteria, utilitarian, egalitarian, bounded, and bargaining, into the HC optimisation framework to assess their impact. When applied to LV feeders of different sizes and topologies, the analysis shows that bargaining and upper-bounded fairness provide the best balance between efficiency and fairness. Efficiency refers to maximising the social welfare of the LV DNs, while fairness is proportional to the minimisation of disparity in opportunity for installing DG. Feeder topology significantly influences fairness outcomes, while feeder size affects total HC and the inherent fairness of feeders. These results emphasise the importance of regulatory incentives and network designs in order to facilitate fair and efficient DG integration.
Grid-Aware Flexibility Operation of Behind-the-Meter Assets: A review of Objectives and Constraints
The high penetration of distributed energy resources (DERs) in low-voltage distribution networks (LVDNs) often leads to network instability and congestion. Discovering the flexibility potential of behind- the-meter (BTM) assets offers a promising solution to these challenges, providing benefits for both prosumers and grid operators. This review focuses on the objectives and constraints associated with the operation of BTM flexibility resources in LVDNs. We propose a new classification framework for network-aware flexibility modelling that incorporates prosumer objectives, flexibility sources, and both local and grid-level constraints. This review identifies research gaps in prosumer-centric grid considerations, control strategies, flexibility preferences, and scenarios in the use of BTM resources.
Fairness of Energy Distribution Mechanisms in Collective Self-Consumption Schemes
In several European countries, regulatory frameworks now allow households to form energy communities and trade energy locally via local energy markets (LEMs). While multiple mechanisms exist to allocate locally produced energy among members, their fairness remains insufficiently understood despite energy justice being a key concern for communities. This paper first provides a thorough description of the collective self-consumption process in France, offering a real world framework for researchers. We then review the main types of fairness relevant to LEMs and identify appropriate indicators for each, including a new scalable indicator to evaluate meritocratic fairness. Using simulations across 250 randomly generated residential communities of 20 households, we assess and compare fairness across different LEM distribution mechanisms. Results show that average financial savings reach 12% with 40% PV uptake. Among the four widely used LEM mechanisms assessed, glass-filling with prioritization yields the highest egalitarian and min max fairness. Double auction and pro rata schemes promote meritocracy, while standard glass filling offers a strong balance across fairness objectives.
comment: 5 pages, Accepted for ISGT Europe Conference 2025
Predictability Enables Parallelization of Nonlinear State Space Models
The rise of parallel computing hardware has made it increasingly important to understand which nonlinear state space models can be efficiently parallelized. Recent advances like DEER (arXiv:2309.12252) or DeepPCR (arXiv:2309.16318) have shown that evaluating a state space model can be recast as solving a parallelizable optimization problem, and sometimes this approach can yield dramatic speed-ups in evaluation time. However, the factors that govern the difficulty of these optimization problems remain unclear, limiting the larger adoption of the technique. In this work, we establish a precise relationship between the dynamics of a nonlinear system and the conditioning of its corresponding optimization formulation. We show that the predictability of a system, defined as the degree to which small perturbations in state influence future behavior, impacts the number of optimization steps required for evaluation. In predictable systems, the state trajectory can be computed in $O((\log T)^2)$ time, where $T$ is the sequence length, a major improvement over the conventional sequential approach. In contrast, chaotic or unpredictable systems exhibit poor conditioning, with the consequence that parallel evaluation converges too slowly to be useful. Importantly, our theoretical analysis demonstrates that for predictable systems, the optimization problem is always well-conditioned, whereas for unpredictable systems, the conditioning degrades exponentially as a function of the sequence length. We validate our claims through extensive experiments, providing practical guidance on when nonlinear dynamical systems can be efficiently parallelized, and highlighting predictability as a key design principle for parallelizable models.
Optimal Coordination of Local Flexibility from Electric Vehicles with Social Impact Consideration
The integration of renewable energy sources (RES) and the convergence of transport electrification, creates a significant challenge for distribution network management e.g. voltage and frequency violations, particularly in rural and remote areas. This paper investigates how smart charging of electric vehicles (EVs) can help reduce renewable energy curtailment and alleviate stress on local distribution networks. We implement a customised AC Optimal Power Flow (AC OPF) formulation which integrates into the optimisation an indicator reflecting the social impact of flexibility from EV users, based on the analysis of historical EV charging behaviours. The contribution of EV owners to reducing wind curtailment is optimised to enhance the acceptability of flexibility procurement, as the method targets EV users whose charging habits are most likely to align with flexibility requirements. Our method integrates social, technological, and economic perspectives with optimal flexibility coordination, and utilises clustering of EVs through a kmeans algorithm. To ensure scalability, we introduce a polar coordinate-based dimension reduction technique. The flexibility optimisation approach is demonstrated on the Orkney grid model, incorporating demand and wind farm generation data, as well as multi year charging data from 106 EVs. Results indicate that, by building upon the existing habits of EV users, curtailment can be reduced by 99.5% during a typical summer week the period when curtailment is most prevalent. This research demonstrates a foundational and transferable approach which is cognisant of socio techno economic factors towards accelerating decarbonisation and tackling the stochastic challenges of new demand and generation patterns on local distribution networks.
comment: 5 pages, accepted for ISGT Europe Conference 2025
Autonomous UAV Flight Navigation in Confined Spaces: A Reinforcement Learning Approach
Inspecting confined industrial infrastructure, such as ventilation shafts, is a hazardous and inefficient task for humans. Unmanned Aerial Vehicles (UAVs) offer a promising alternative, but GPS-denied environments require robust control policies to prevent collisions. Deep Reinforcement Learning (DRL) has emerged as a powerful framework for developing such policies, and this paper provides a comparative study of two leading DRL algorithms for this task: the on-policy Proximal Policy Optimization (PPO) and the off-policy Soft Actor-Critic (SAC). The training was conducted with procedurally generated duct environments in Genesis simulation environment. A reward function was designed to guide a drone through a series of waypoints while applying a significant penalty for collisions. PPO learned a stable policy that completed all evaluation episodes without collision, producing smooth trajectories. By contrast, SAC consistently converged to a suboptimal behavior that traversed only the initial segments before failure. These results suggest that, in hazard-dense navigation, the training stability of on-policy methods can outweigh the nominal sample efficiency of off-policy algorithms. More broadly, the study provides evidence that procedurally generated, high-fidelity simulations are effective testbeds for developing and benchmarking robust navigation policies.
A predictive modular approach to constraint satisfaction under uncertainty - with application to glycosylation in continuous monoclonal antibody biosimilar production
The paper proposes a modular-based approach to constraint handling in process optimization and control. This is partly motivated by the recent interest in learning-based methods, e.g., within bioproduction, for which constraint handling under uncertainty is a challenge. The proposed constraint handler, called predictive filter, is combined with an adaptive constraint margin and a constraint violation cost monitor to minimize the cost of violating soft constraints due to model uncertainty and disturbances. The module can be combined with any controller and is based on minimally modifying the controller output, in a least squares sense, such that constraints are satisfied within the considered horizon. The proposed method is computationally efficient and suitable for real-time applications. The effectiveness of the method is illustrated through a realistic simulation case study of glycosylation constraint satisfaction in continuous monoclonal antibody biosimilar production using Chinese hamster ovary cells, for which the metabolic network model consists of 23 extracellular metabolites and 126 reactions.
CarboNet: A Finite-Time Combustion-Tolerant Compartmental Network for Tropospheric Carbon Control
While governments and international organizations have set the net-zero target to prevent a climate event horizon, practical solutions are lacking mainly because of the impracticability to completely replace combustion processes. Hence, in this paper, we first design a compartmental network whose states must remain in the nonnegative orthant for physical consistency and in which the carbon dioxide emissions result from the combustion of diesel in vehicles and gas in house heaters. Then, we designed both full-state and output-feedback linear-quadratic regulators of the compartmental network to bring the mass of carbon dioxide to the pre-industrial era, which is reached in approximately 25 and 60 days, respectively. The output feedback tolerates for 6 days the combustion taking place in 5,000 vehicles and in 10,000 house heating systems, it meets the net-zero target, and it nullifies the extraction of finite natural resources. The tropospheric temperature with closed-loop reaches the equilibrium at 133 {\deg}C after 16.4 years; while such an high value requires to further investigate with climate experts the model of the dynamics of the temperature, this work is a first step in designing optimal network control systems for climate stability. Source code is publicly available.
comment: To be submitted
TAGA: A Tangent-Based Reactive Approach for Socially Compliant Robot Navigation Around Human Groups ICRA
Robot navigation in densely populated environments presents significant challenges, particularly regarding the interplay between individual and group dynamics. Current navigation models predominantly address interactions with individual pedestrians while failing to account for human groups that naturally form in real-world settings. Conversely, the limited models implementing group-aware navigation typically prioritize group dynamics at the expense of individual interactions, both of which are essential for socially appropriate navigation. This research extends an existing simulation framework to incorporate both individual pedestrians and human groups. We present Tangent Action for Group Avoidance (TAGA), a modular reactive mechanism that can be integrated with existing navigation frameworks to enhance their group-awareness capabilities. TAGA dynamically modifies robot trajectories using tangent action-based avoidance strategies while preserving the underlying model's capacity to navigate around individuals. Additionally, we introduce Group Collision Rate (GCR), a novel metric to quantitatively assess how effectively robots maintain group integrity during navigation. Through comprehensive simulation-based benchmarking, we demonstrate that integrating TAGA with state-of-the-art navigation models (ORCA, Social Force, DS-RNN, and AG-RL) reduces group intrusions by 45.7-78.6% while maintaining comparable success rates and navigation efficiency. Future work will focus on real-world implementation and validation of this approach.
comment: 6 pages, 3 figures. Preprint; intended for submission to IEEE International Conference on Robotics & Automation (ICRA), 2025
Hyper Yoshimura: How a slight tweak on a classical folding pattern unleashes meta-stability for deployable robots
Deployable structures inspired by origami have provided lightweight, compact, and reconfigurable solutions for various robotic and architectural applications. However, creating an integrated structural system that can effectively balance the competing requirements of high packing efficiency, simple deployment, and precise morphing into multiple load-bearing configurations remains a significant challenge. This study introduces a new class of hyper-Yoshimura origami, which exhibits a wide range of kinematically admissible and locally metastable states, including newly discovered symmetric "self-packing" and asymmetric "pop-out" states. This metastability is achieved by breaking a design rule of Yoshimura origami that has been in place for many decades. To this end, this study derives a new set of mathematically rigorous design rules and geometric formulations. Based on this, forward and inverse kinematic strategies are developed to stack hyper-Yoshimura modules into deployable booms that can approximate complex 3D shapes. Finally, this study showcases the potential of hyper-Yoshimura with a meter-scale pop-up cellphone charging station deployed at our university's bus transit station, along with a 3D-printed, scaled prototype of a space crane that can function as an object manipulator, solar tracking device, or high-load-bearing structure. These results establish hyper-Yoshimura as a promising platform for deployable and adaptable robotic systems in both terrestrial and space environments.
Adaptive Task Space Non-Singular Terminal Super-Twisting Sliding Mode Control of a 7-DOF Robotic Manipulator
This paper presents a new task-space Non-singular Terminal Super-Twisting Sliding Mode (NT-STSM) controller with adaptive gains for robust trajectory tracking of a 7-DOF robotic manipulator. The proposed approach addresses the challenges of chattering, unknown disturbances, and rotational motion tracking, making it suited for high-DOF manipulators in dexterous manipulation tasks. A rigorous boundedness proof is provided, offering gain selection guidelines for practical implementation. Simulations and hardware experiments with external disturbances demonstrate the proposed controller's robust, accurate tracking with reduced control effort under unknown disturbances compared to other NT-STSM and conventional controllers. The results demonstrated that the proposed NT-STSM controller mitigates chattering and instability in complex motions, making it a viable solution for dexterous robotic manipulations and various industrial applications.
comment: Accepted for publication in IEEE Transactions on Industrial Electronics. 12 pages, 8 figures
Generative diffusion posterior sampling for informative likelihoods
Sequential Monte Carlo (SMC) methods have recently shown successful results for conditional sampling of generative diffusion models. In this paper we propose a new diffusion posterior SMC sampler achieving improved statistical efficiencies, particularly under outlier conditions or highly informative likelihoods. The key idea is to construct an observation path that correlates with the diffusion model and to design the sampler to leverage this correlation for more efficient sampling. Empirical results conclude the efficiency.
comment: Commemorative issue for celebrating Thomas Kailath's 90th birthday
Optimal Batch-Size Control for Low-Latency Federated Learning with Device Heterogeneity
Federated learning (FL) has emerged as a popular approach for collaborative machine learning in sixth-generation (6G) networks, primarily due to its privacy-preserving capabilities. The deployment of FL algorithms is expected to empower a wide range of Internet-of-Things (IoT) applications, e.g., autonomous driving, augmented reality, and healthcare. The mission-critical and time-sensitive nature of these applications necessitates the design of low-latency FL frameworks that guarantee high learning performance. In practice, achieving low-latency FL faces two challenges: the overhead of computing and transmitting high-dimensional model updates, and the heterogeneity in communication-and-computation (C$^2$) capabilities across devices. To address these challenges, we propose a novel C$^2$-aware framework for optimal batch-size control that minimizes end-to-end (E2E) learning latency while ensuring convergence. The framework is designed to balance a fundamental C$^2$ tradeoff as revealed through convergence analysis. Specifically, increasing batch sizes improves the accuracy of gradient estimation in FL and thus reduces the number of communication rounds required for convergence, but results in higher per-round latency, and vice versa. The associated problem of latency minimization is intractable; however, we solve it by designing an accurate and tractable surrogate for convergence speed, with parameters fitted to real data. This approach yields two batch-size control strategies tailored to scenarios with slow and fast fading, while also accommodating device heterogeneity. Extensive experiments using real datasets demonstrate that the proposed strategies outperform conventional batch-size adaptation schemes that do not consider the C$^2$ tradeoff or device heterogeneity.
IDSO-Managed Bid-Based Transactive Distribution Systems Design for DER Participation in Wholesale Markets While Preserving T-D Interactions
Participation of Distributed Energy Resources (DERs) in bid-based Transactive Energy Systems (TES) at the distribution systems facilitates strongly coupled, bidirectional interactions between Transmission-Distribution (T-D) systems. Capturing these interactions is critical for ensuring seamless integration within an Integrated Transmission and Distribution (ITD) framework. This study proposes a methodology to preserve such tight T-D linkages by developing an Independent Distribution System Operator (IDSO) managed bid-based TES design for unbalanced distribution systems. The proposed design operates within the ITD paradigm and permits DER participation in the Wholesale Power Market (WPM) through IDSO while preserving tight T-D linkages. To this end, this research offers the following key contributions: a novel bid/offer prequalification-cum-aggregation method to ensure a grid-safe and value-based aggregation of DERs' bids and offers for WPM participation through IDSO; and a retail pricing mechanism that reflects the true value of procuring or offering additional units of power within the distribution system. Case studies are conducted on a modified IEEE 123-bus radial feeder populated with a high DER concentration to validate the proposed frameworks' effectiveness in coordinating the DERs efficiently and reliably.
comment: 17 Pages, 13 Figures. Removed a few typos, and added more references in literature review
Transient performance of MPC for tracking without terminal constraints
Model predictive control (MPC) for tracking is a recently introduced approach, which extends standard MPC formulations by incorporating an artificial reference as an additional optimization variable, in order to track external and potentially time-varying references. In this work, we analyze the performance of such an MPC for tracking scheme without a terminal cost and terminal constraints. We derive a transient performance estimate, i.e. a bound on the closed-loop performance over an arbitrary time interval, yielding insights on how to select the scheme's parameters for performance. Furthermore, we show that in the asymptotic case, where the prediction horizon and observed time interval tend to infinity, the closed-loop solution of MPC for tracking recovers the infinite horizon optimal solution.
comment: Accepted for publication in IEEE Control Systems Letters (L-CSS)
AI-Powered CPS-Enabled Urban Transportation Digital Twin: Methods and Applications
We present methods and applications for the development of digital twins (DT) for urban traffic management. While the majority of studies on the DT focus on its ``eyes," which is the emerging sensing and perception like object detection and tracking, what really distinguishes the DT from a traditional simulator lies in its ``brain," the prediction and decision making capabilities of extracting patterns and making informed decisions from what has been seen and perceived. In order to add value to urban transportation management, DTs need to be powered by artificial intelligence and complement with low-latency high-bandwidth sensing and networking technologies, in other words, cyberphysical systems (CPS). We will first review the DT pipeline enabled by CPS and propose our DT architecture deployed on a real-world testbed in New York City. This paper can be a pointer to help researchers and practitioners identify challenges and opportunities for the development of DTs; a bridge to initiate conversations across disciplines; and a road map to exploiting potentials of DTs for diverse urban transportation applications.
Co-Investment with Payoff-Sharing Mechanism for Cooperative Decision-Making in Network Design Games
Network-based systems are inherently interconnected, with the design and performance of subnetworks being interdependent. However, the decisions of self-interested operators may lead to suboptimal outcomes for users and the overall system. This paper explores cooperative mechanisms that can simultaneously benefit both operators and users. We address this challenge using a game-theoretical framework that integrates both non-cooperative and cooperative game theory. In the non-cooperative stage, we propose a network design game in which subnetwork decision-makers strategically design local infrastructures. In the cooperative stage, co-investment with payoff-sharing mechanism is developed to enlarge collective benefits and fairly distribute them. To demonstrate the effectiveness of our framework, we conduct case studies on the Sioux Falls network and real-world public transport networks in Zurich and Winterthur, Switzerland. Our evaluation considers impacts on environmental sustainability, social welfare, and economic efficiency. The proposed framework provides a foundation for improving interdependent networked systems by enabling strategic cooperation among self-interested operators.
Active Learning-Based Optimization of Hydroelectric Turbine Startup to Minimize Fatigue Damage
Hydro-generating units (HGUs) play a crucial role in integrating intermittent renewable energy sources into the power grid due to their flexible operational capabilities. This evolving role has led to an increase in transient events, such as startups, which impose significant stresses on turbines, leading to increased turbine fatigue and a reduced operational lifespan. Consequently, optimizing startup sequences to minimize stresses is vital for hydropower utilities. However, this task is challenging, as stress measurements on prototypes can be expensive and time-consuming. To tackle this challenge, we propose an innovative automated approach to optimize the startup parameters of HGUs with a limited budget of measured startup sequences. Our method combines active learning and black-box optimization techniques, utilizing virtual strain sensors and dynamic simulations of HGUs. This approach was tested in real-time during an on-site measurement campaign on an instrumented Francis turbine prototype. The results demonstrate that our algorithm successfully identified an optimal startup sequence using only seven measured sequences. It achieves a remarkable 42% reduction in the maximum strain cycle amplitude compared to the standard startup sequence. This study paves the way for more efficient HGU startup optimization, potentially extending their operational lifespans.
comment: Published in Renewable Energy
SINDy-RL: Interpretable and Efficient Model-Based Reinforcement Learning
Deep reinforcement learning (DRL) has shown significant promise for uncovering sophisticated control policies that interact in complex environments, such as stabilizing a tokamak fusion reactor or minimizing the drag force on an object in a fluid flow. However, DRL requires an abundance of training examples and may become prohibitively expensive for many applications. In addition, the reliance on deep neural networks often results in an uninterpretable, black-box policy that may be too computationally expensive to use with certain embedded systems. Recent advances in sparse dictionary learning, such as the sparse identification of nonlinear dynamics (SINDy), have shown promise for creating efficient and interpretable data-driven models in the low-data regime. In this work we introduce SINDy-RL, a unifying framework for combining SINDy and DRL to create efficient, interpretable, and trustworthy representations of the dynamics model, reward function, and control policy. We demonstrate the effectiveness of our approaches on benchmark control environments and flow control problems, including gust mitigation on a 3D NACA 0012 airfoil at $Re=1000$. SINDy-RL achieves comparable performance to modern DRL algorithms using significantly fewer interactions in the environment and results in an interpretable control policy orders of magnitude smaller than a DRL policy.
comment: For code, see https://github.com/nzolman/sindy-rl. v2 Update: Included Pinball and 3D Airfoil examples. Christian Lagemann added as an author for contributions with the 3D Airfoil code. To appear in Nature Communications
Generative Machine Learning in Adaptive Control of Dynamic Manufacturing Processes: A Review
Dynamic manufacturing processes exhibit complex characteristics defined by time-varying parameters, nonlinear behaviors, and uncertainties. These characteristics require sophisticated in-situ monitoring techniques utilizing multimodal sensor data and adaptive control systems that can respond to real-time feedback while maintaining product quality. Recently, generative machine learning (ML) has emerged as a powerful tool for modeling complex distributions and generating synthetic data while handling these manufacturing uncertainties. However, adopting these generative technologies in dynamic manufacturing systems lacks a functional control-oriented perspective to translate their probabilistic understanding into actionable process controls while respecting constraints. This review presents a functional classification of Prediction-Based, Direct Policy, Quality Inference, and Knowledge-Integrated approaches, offering a perspective for understanding existing ML-enhanced control systems and incorporating generative ML. The analysis of generative ML architectures within this framework demonstrates control-relevant properties and potential to extend current ML-enhanced approaches where conventional methods prove insufficient. We show generative ML's potential for manufacturing control through decision-making applications, process guidance, simulation, and digital twins, while identifying critical research gaps: separation between generation and control functions, insufficient physical understanding of manufacturing phenomena, and challenges adapting models from other domains. To address these challenges, we propose future research directions aimed at developing integrated frameworks that combine generative ML and control technologies to address the dynamic complexities of modern manufacturing systems.
comment: 12 pages, 1 figure, 1 table. This paper has been accepted for publication in the proceedings of ASME IDETC-CIE 2025
Output-feedback model predictive control under dynamic uncertainties using integral quadratic constraints
In this work, we propose an output-feedback tube-based model predictive control (MPC) scheme for linear systems under dynamic uncertainties that are described via integral quadratic constraints (IQC). By leveraging IQCs, a large class of nonlinear and dynamic uncertainties can be addressed. We leverage recent IQC synthesis tools to design a dynamic controller and an estimator that are robust to these uncertainties and minimize the size of the resulting constraint tightening in the MPC. Thereby, we show that the robust estimation problem using IQCs with peak-to-peak performance can be convexified. We guarantee recursive feasibility, robust constraint satisfaction, and input-to-state stability of the resulting MPC scheme.
Intelligent Condition Monitoring of Industrial Plants: An Overview of Methodologies and Uncertainty Management Strategies
Condition monitoring is essential for ensuring the safety, reliability, and efficiency of modern industrial systems. With the increasing complexity of industrial processes, artificial intelligence (AI) has emerged as a powerful tool for fault detection and diagnosis, attracting growing interest from both academia and industry. This paper provides a comprehensive overview of intelligent condition monitoring methods, with a particular emphasis on chemical plants and the widely used Tennessee Eastman Process (TEP) benchmark. State-of-the-art machine learning (ML) and deep learning (DL) algorithms are reviewed, highlighting their strengths, limitations, and applicability to industrial fault detection and diagnosis. Special attention is given to key challenges, including imbalanced and unlabeled data, and to strategies by which models can address these issues. Furthermore, comparative analyses of algorithm performance are presented to guide method selection in practical scenarios. This survey is intended to benefit both newcomers and experienced researchers by consolidating fundamental concepts, summarizing recent advances, and outlining open challenges and promising directions for intelligent condition monitoring in industrial plants.
Systems and Control (EESS)
TinyML Towards Industry 4.0: Resource-Efficient Process Monitoring of a Milling Machine
In the context of industry 4.0, long-serving industrial machines can be retrofitted with process monitoring capabilities for future use in a smart factory. One possible approach is the deployment of wireless monitoring systems, which can benefit substantially from the TinyML paradigm. This work presents a complete TinyML flow from dataset generation, to machine learning model development, up to implementation and evaluation of a full preprocessing and classification pipeline on a microcontroller. After a short review on TinyML in industrial process monitoring, the creation of the novel MillingVibes dataset is described. The feasibility of a TinyML system for structure-integrated process quality monitoring could be shown by the development of an 8-bit-quantized convolutional neural network (CNN) model with 12.59kiB parameter storage. A test accuracy of 100.0% could be reached at 15.4ms inference time and 1.462mJ per quantized CNN inference on an ARM Cortex M4F microcontroller, serving as a reference for future TinyML process monitoring solutions.
comment: 10 pages, 5 figures, 1 table
Multi-agent Robust and Optimal Policy Learning for Data Harvesting
We consider the problem of using multiple agents to harvest data from a collection of sensor nodes (targets) scattered across a two-dimensional environment. These targets transmit their data to the agents that move in the space above them, and our goal is for the agents to collect data from the targets as efficiently as possible while moving to their final destinations. The agents are assumed to have a continuous control action, and we leverage reinforcement learning, specifically Proximal Policy Optimization (PPO) with Lagrangian Penalty (LP), to identify highly effective solutions. Additionally, we enhance the controller's robustness by incorporating regularization at each state to smooth the learned policy. We conduct a series of simulations to demonstrate our approach and validate its performance and robustness.
NOSTRA: A noise-resilient and sparse data framework for trust region based multi objective Bayesian optimization
Multi-objective Bayesian optimization (MOBO) struggles with sparse (non-space-filling), scarce (limited observations) datasets affected by experimental uncertainty, where identical inputs can yield varying outputs. These challenges are common in physical and simulation experiments (e.g., randomized medical trials and, molecular dynamics simulations) and are therefore incompatible with conventional MOBO methods. As a result, experimental resources are inefficiently allocated, leading to suboptimal designs. To address this challenge, we introduce NOSTRA (Noisy and Sparse Data Trust Region-based Optimization Algorithm), a novel sampling framework that integrates prior knowledge of experimental uncertainty to construct more accurate surrogate models while employing trust regions to focus sampling on promising areas of the design space. By strategically leveraging prior information and refining search regions, NOSTRA accelerates convergence to the Pareto frontier, enhances data efficiency, and improves solution quality. Through two test functions with varying levels of experimental uncertainty, we demonstrate that NOSTRA outperforms existing methods in handling noisy, sparse, and scarce data. Specifically, we illustrate that, NOSTRA effectively prioritizes regions where samples enhance the accuracy of the identified Pareto frontier, offering a resource-efficient algorithm that is practical in scenarios with limited experimental budgets while ensuring efficient performance.
Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)
This work presents a novel reinforcement learning (RL) algorithm based on Y-wise Affine Neural Networks (YANNs). YANNs provide an interpretable neural network which can exactly represent known piecewise affine functions of arbitrary input and output dimensions defined on any amount of polytopic subdomains. One representative application of YANNs is to reformulate explicit solutions of multi-parametric linear model predictive control. Built on this, we propose the use of YANNs to initialize RL actor and critic networks, which enables the resulting YANN-RL control algorithm to start with the confidence of linear optimal control. The YANN-actor is initialized by representing the multi-parametric control solutions obtained via offline computation using an approximated linear system model. The YANN-critic represents the explicit form of the state-action value function for the linear system and the reward function as the objective in an optimal control problem (OCP). Additional network layers are injected to extend YANNs for nonlinear expressions, which can be trained online by directly interacting with the true complex nonlinear system. In this way, both the policy and state-value functions exactly represent a linear OCP initially and are able to eventually learn the solution of a general nonlinear OCP. Continuous policy improvement is also implemented to provide heuristic confidence that the linear OCP solution serves as an effective lower bound to the performance of RL policy. The YANN-RL algorithm is demonstrated on a clipped pendulum and a safety-critical chemical-reactive system. Our results show that YANN-RL significantly outperforms the modern RL algorithm using deep deterministic policy gradient, especially when considering safety constraints.
Wide-Area Power System Oscillations from Large-Scale AI Workloads
This paper develops a new dynamic power profiling approach for modeling AI-centric datacenter loads and analyzing their impact on grid operations, particularly their potential to induce wide-area grid oscillations. We characterize the periodic stochastic power fluctuations inherent to large-scale AI workloads during both the training and fine-tuning stages, driven by the state-of-the-art GPU computing architecture designs. These sustained, large power fluctuations, unlike conventional load ramping, act as persistent forcing inputs capable of interacting with and amplifying local and inter-area oscillation modes. Using the WECC 179-bus system as a test case, we examine the amplitude and variability of oscillatory responses under different factors, ranging from system strength, penetration level, fluctuation frequency range, individual datacenter size, to geographical deployment. Simulation results show that, notably, narrower fluctuation bands, larger single-site capacities, or dispersed siting can intensify oscillations across multiple modes. Our models and numerical studies provide a quantitative basis for integrating AI-dominant electricity demands into grid oscillation studies, and further support the development of new planning and operational measures to power the continuous AI load growth.
Performance analysis for cone-preserving switched systems with constrained switching
This paper studies cone-preserving linear discrete-time switched systems whose switching is governed by an automaton. For this general system class, we present performance analysis conditions for a broadly usable performance measure. In doing so, we generalize several known results for performance and stability analysis for switched and positive switched systems, providing a unifying perspective. We also arrive at novel $\ell_1$-performance analysis conditions for positive switched systems with constrained switching, for which we present an application-motivated numerical example. Further, the cone-preserving perspective provides insights into appropriate Lyapunov function selection.
comment: Accepted for publication at IEEE CDC 2025
Resilient Control for Networked Switched Systems With/Without ACK: An Active Quantized Framework
This paper deals with the quantized control problem for switched systems under denial-of-service (DoS) attack. Considering the system's defensive capability and the computational resources of quantizers and controllers, four control strategies are proposed. These strategies incorporate different combinations of controllers (active and passive), quantizers (centered on the origin or custom-designed), and network configurations (with or without ACK signals). For each strategy, specific update laws for the encoder and decoder are designed to avoid quantization saturation. Furthermore, the uniformity of encoder and decoder operations is maintained by transmitting additional information to the decoder. To achieve asymptotic stability, sufficient conditions concerning the switching signal and DoS attack constraints are derived by taking into account the asynchronous behaviors. The proposed active quantization strategy with the ACK signal leverages the system model information to compute the control signal in real-time, allowing for possible convergence of the system state despite DoS attack. Additionally, a well-designed switching signal is suggested to further mitigate the impact of DoS attack. A passive quantization strategy with ACK signal is also developed as a simplified version of the active quantized control strategy, providing the foundation for a strategy without ACK signal. Inspired by time-triggered and event-triggered mechanisms, the passive quantization strategy without ACK signal is investigated, with two feasible update laws for the quantizer. Finally, two simulations are conducted to validate the effectiveness of the proposed strategies.
Well-posedness of Lur'e systems with feedthrough
For a large class of Lur'e systems with time-varying nonlinearities and feedthrough we consider several well-posedness issues, namely: existence, continuation, blow-up in finite-time, forward completeness and uniqueness of solutions. Lur'e systems with feedthrough are systems of forced, nonlinear ordinary differential equations coupled with a nonlinear algebraic equation determining the output of the system. The presence of feedthrough means that the algebraic equation is implicit in the output, and, in general, the output may not be expressible by an analytic formula in terms of the state and the input. Simple examples illustrate that the well-posedness properties of such systems are not necessarily guaranteed by assumptions sufficient for the corresponding well-posedness properties of Lur'e systems without feedthrough. We provide sufficient conditions for the well-posedness properties mentioned above, using global inversion theorems from real analysis and tools from non-smooth analysis and differential inclusions. The theory is illustrated with examples.
comment: 30 pages, 0 figures. Submitted Aug 2025 to peer-reviewed journal for potential publication. Uploading preprint to arxiv for early dissemination
A Joint Delay-Energy-Security Aware Framework for Intelligent Task Scheduling in Satellite-Terrestrial Edge Computing Network
In this paper, we propose a two-stage optimization framework for secure task scheduling in satellite-terrestrial edge computing networks (STECNs). The framework jointly considers secure user association and task offloading to balance transmission delay, energy consumption, and physical-layer security. To address the inherent complexity, we decouple the problem into two stages. In the first stage, a secrecy-aware user association strategy is designed by discretizing artificial noise (AN) power ratios and identifying feasible links that satisfy secrecy constraints, resulting in a set of candidate secure associations. In the second stage, we formulate a delay-energy-aware task scheduling problem as an integer linear program and solve it using a heuristic Mayfly Algorithm (MA) to obtain low-complexity, high-quality solutions. Extensive simulation results demonstrate the effectiveness and superiority of the proposed framework in achieving secure and efficient task scheduling under dynamic satellite environments.
comment: 10 pages, 8 figures
LLM-Assisted Semantic Alignment and Integration in Collaborative Model-Based Systems Engineering Using SysML v2
Cross-organizational collaboration in Model-Based Systems Engineering (MBSE) faces many challenges in achieving semantic alignment across independently developed system models. SysML v2 introduces enhanced structural modularity and formal semantics, offering a stronger foundation for interoperable modeling. Meanwhile, GPT-based Large Language Models (LLMs) provide new capabilities for assisting model understanding and integration. This paper proposes a structured, prompt-driven approach for LLM-assisted semantic alignment of SysML v2 models. The core contribution lies in the iterative development of an alignment approach and interaction prompts, incorporating model extraction, semantic matching, and verification. The approach leverages SysML v2 constructs such as alias, import, and metadata extensions to support traceable, soft alignment integration. It is demonstrated with a GPT-based LLM through an example of a measurement system. Benefits and limitations are discussed.
comment: Accepted by IEEE ISSE 2025, DOI pending
Data-Driven Analysis and Predictive Control of Descriptor Systems with Application to Power and Water Networks
Despite growing interest in data-driven analysis and control of linear systems, descriptor systems--which are essential for modeling complex engineered systems with algebraic constraints like power and water networks--have received comparatively little attention. This paper develops a comprehensive data-driven framework for analyzing and controlling discrete-time descriptor systems without relying on explicit state-space models. We address fundamental challenges posed by non-causality through the construction of forward and backward data matrices, establishing data-based sufficient conditions for controllability and observability in terms of input-output data, where both R-controllability and C-controllability (R-observability and C-observability) have been considered. We then extend Willems' fundamental lemma to incompletely controllable systems. These methodological advances enable Data-Enabled Predictive Control (DeePC) to achieve output tracking in descriptor systems and to maintain performance under incomplete controllability conditions, as demonstrated in two case studies: i) Frequency regulation in an IEEE 9-bus power system with 3 generators, where DeePC maintained the frequency stability of the power system despite deliberate violations of R-controllability; and ii) Pressure head control in an EPANET water network with 3 tanks, 2 reservoirs, and 117 pipes, where output tracking was successfully enforced under algebraic constraints.
Dynamic Switching Models for Truck-only Delivery and Drone-assisted Truck Delivery under Demand Uncertainty
Integrating drones into truck delivery systems can improve customer accessibility, reduce operational costs, and increase delivery efficiency. However, drone deployment incurs costs, including procurement, maintenance, and energy consumption, and its benefits depend on service demand. In low-demand areas, drone-assisted trucks may underutilize resources due to high upfront costs. Accurately predicting demand is challenging due to uncertainties from unforeseen events or infrastructure disruptions. To address this, a market entry and exit real option approach is used to optimize switching between truck-only and drone-assisted delivery under stochastic demand. Results show that deploying multiple drones per truck offers significant cost advantages in high-demand regions. Using the proposed dynamic switching model, deterministic and stochastic approaches reduce costs by 17.4% and 31.3%, respectively, compared to immediate cost-saving switching. Sensitivity analysis reveals asymmetric effects of stochastic parameters on entry and exit timings. A stochastic multiple-options model is further developed to dynamically switch between truck-only and drone-assisted delivery with varying drone numbers. Applying these models to Miami-Dade County, we evaluate dynamic switching costs for three major logistics operators. This study highlights the potential benefits of dynamic delivery switching and provides insights for optimizing logistics operations.
Fairness for distribution network hosting capacity
The integration of distributed generation (DG) is essential to the energy transition but poses challenges for lowvoltage (LV) distribution networks (DNs) with limited hosting capacity (HC). This study incorporates multiple fairness criteria, utilitarian, egalitarian, bounded, and bargaining, into the HC optimisation framework to assess their impact. When applied to LV feeders of different sizes and topologies, the analysis shows that bargaining and upper-bounded fairness provide the best balance between efficiency and fairness. Efficiency refers to maximising the social welfare of the LV DNs, while fairness is proportional to the minimisation of disparity in opportunity for installing DG. Feeder topology significantly influences fairness outcomes, while feeder size affects total HC and the inherent fairness of feeders. These results emphasise the importance of regulatory incentives and network designs in order to facilitate fair and efficient DG integration.
Grid-Aware Flexibility Operation of Behind-the-Meter Assets: A review of Objectives and Constraints
The high penetration of distributed energy resources (DERs) in low-voltage distribution networks (LVDNs) often leads to network instability and congestion. Discovering the flexibility potential of behind- the-meter (BTM) assets offers a promising solution to these challenges, providing benefits for both prosumers and grid operators. This review focuses on the objectives and constraints associated with the operation of BTM flexibility resources in LVDNs. We propose a new classification framework for network-aware flexibility modelling that incorporates prosumer objectives, flexibility sources, and both local and grid-level constraints. This review identifies research gaps in prosumer-centric grid considerations, control strategies, flexibility preferences, and scenarios in the use of BTM resources.
Fairness of Energy Distribution Mechanisms in Collective Self-Consumption Schemes
In several European countries, regulatory frameworks now allow households to form energy communities and trade energy locally via local energy markets (LEMs). While multiple mechanisms exist to allocate locally produced energy among members, their fairness remains insufficiently understood despite energy justice being a key concern for communities. This paper first provides a thorough description of the collective self-consumption process in France, offering a real world framework for researchers. We then review the main types of fairness relevant to LEMs and identify appropriate indicators for each, including a new scalable indicator to evaluate meritocratic fairness. Using simulations across 250 randomly generated residential communities of 20 households, we assess and compare fairness across different LEM distribution mechanisms. Results show that average financial savings reach 12% with 40% PV uptake. Among the four widely used LEM mechanisms assessed, glass-filling with prioritization yields the highest egalitarian and min max fairness. Double auction and pro rata schemes promote meritocracy, while standard glass filling offers a strong balance across fairness objectives.
comment: 5 pages, Accepted for ISGT Europe Conference 2025
Predictability Enables Parallelization of Nonlinear State Space Models
The rise of parallel computing hardware has made it increasingly important to understand which nonlinear state space models can be efficiently parallelized. Recent advances like DEER (arXiv:2309.12252) or DeepPCR (arXiv:2309.16318) have shown that evaluating a state space model can be recast as solving a parallelizable optimization problem, and sometimes this approach can yield dramatic speed-ups in evaluation time. However, the factors that govern the difficulty of these optimization problems remain unclear, limiting the larger adoption of the technique. In this work, we establish a precise relationship between the dynamics of a nonlinear system and the conditioning of its corresponding optimization formulation. We show that the predictability of a system, defined as the degree to which small perturbations in state influence future behavior, impacts the number of optimization steps required for evaluation. In predictable systems, the state trajectory can be computed in $O((\log T)^2)$ time, where $T$ is the sequence length, a major improvement over the conventional sequential approach. In contrast, chaotic or unpredictable systems exhibit poor conditioning, with the consequence that parallel evaluation converges too slowly to be useful. Importantly, our theoretical analysis demonstrates that for predictable systems, the optimization problem is always well-conditioned, whereas for unpredictable systems, the conditioning degrades exponentially as a function of the sequence length. We validate our claims through extensive experiments, providing practical guidance on when nonlinear dynamical systems can be efficiently parallelized, and highlighting predictability as a key design principle for parallelizable models.
Optimal Coordination of Local Flexibility from Electric Vehicles with Social Impact Consideration
The integration of renewable energy sources (RES) and the convergence of transport electrification, creates a significant challenge for distribution network management e.g. voltage and frequency violations, particularly in rural and remote areas. This paper investigates how smart charging of electric vehicles (EVs) can help reduce renewable energy curtailment and alleviate stress on local distribution networks. We implement a customised AC Optimal Power Flow (AC OPF) formulation which integrates into the optimisation an indicator reflecting the social impact of flexibility from EV users, based on the analysis of historical EV charging behaviours. The contribution of EV owners to reducing wind curtailment is optimised to enhance the acceptability of flexibility procurement, as the method targets EV users whose charging habits are most likely to align with flexibility requirements. Our method integrates social, technological, and economic perspectives with optimal flexibility coordination, and utilises clustering of EVs through a kmeans algorithm. To ensure scalability, we introduce a polar coordinate-based dimension reduction technique. The flexibility optimisation approach is demonstrated on the Orkney grid model, incorporating demand and wind farm generation data, as well as multi year charging data from 106 EVs. Results indicate that, by building upon the existing habits of EV users, curtailment can be reduced by 99.5% during a typical summer week the period when curtailment is most prevalent. This research demonstrates a foundational and transferable approach which is cognisant of socio techno economic factors towards accelerating decarbonisation and tackling the stochastic challenges of new demand and generation patterns on local distribution networks.
comment: 5 pages, accepted for ISGT Europe Conference 2025
Autonomous UAV Flight Navigation in Confined Spaces: A Reinforcement Learning Approach
Inspecting confined industrial infrastructure, such as ventilation shafts, is a hazardous and inefficient task for humans. Unmanned Aerial Vehicles (UAVs) offer a promising alternative, but GPS-denied environments require robust control policies to prevent collisions. Deep Reinforcement Learning (DRL) has emerged as a powerful framework for developing such policies, and this paper provides a comparative study of two leading DRL algorithms for this task: the on-policy Proximal Policy Optimization (PPO) and the off-policy Soft Actor-Critic (SAC). The training was conducted with procedurally generated duct environments in Genesis simulation environment. A reward function was designed to guide a drone through a series of waypoints while applying a significant penalty for collisions. PPO learned a stable policy that completed all evaluation episodes without collision, producing smooth trajectories. By contrast, SAC consistently converged to a suboptimal behavior that traversed only the initial segments before failure. These results suggest that, in hazard-dense navigation, the training stability of on-policy methods can outweigh the nominal sample efficiency of off-policy algorithms. More broadly, the study provides evidence that procedurally generated, high-fidelity simulations are effective testbeds for developing and benchmarking robust navigation policies.
A predictive modular approach to constraint satisfaction under uncertainty - with application to glycosylation in continuous monoclonal antibody biosimilar production
The paper proposes a modular-based approach to constraint handling in process optimization and control. This is partly motivated by the recent interest in learning-based methods, e.g., within bioproduction, for which constraint handling under uncertainty is a challenge. The proposed constraint handler, called predictive filter, is combined with an adaptive constraint margin and a constraint violation cost monitor to minimize the cost of violating soft constraints due to model uncertainty and disturbances. The module can be combined with any controller and is based on minimally modifying the controller output, in a least squares sense, such that constraints are satisfied within the considered horizon. The proposed method is computationally efficient and suitable for real-time applications. The effectiveness of the method is illustrated through a realistic simulation case study of glycosylation constraint satisfaction in continuous monoclonal antibody biosimilar production using Chinese hamster ovary cells, for which the metabolic network model consists of 23 extracellular metabolites and 126 reactions.
CarboNet: A Finite-Time Combustion-Tolerant Compartmental Network for Tropospheric Carbon Control
While governments and international organizations have set the net-zero target to prevent a climate event horizon, practical solutions are lacking mainly because of the impracticability to completely replace combustion processes. Hence, in this paper, we first design a compartmental network whose states must remain in the nonnegative orthant for physical consistency and in which the carbon dioxide emissions result from the combustion of diesel in vehicles and gas in house heaters. Then, we designed both full-state and output-feedback linear-quadratic regulators of the compartmental network to bring the mass of carbon dioxide to the pre-industrial era, which is reached in approximately 25 and 60 days, respectively. The output feedback tolerates for 6 days the combustion taking place in 5,000 vehicles and in 10,000 house heating systems, it meets the net-zero target, and it nullifies the extraction of finite natural resources. The tropospheric temperature with closed-loop reaches the equilibrium at 133 {\deg}C after 16.4 years; while such an high value requires to further investigate with climate experts the model of the dynamics of the temperature, this work is a first step in designing optimal network control systems for climate stability. Source code is publicly available.
comment: To be submitted
TAGA: A Tangent-Based Reactive Approach for Socially Compliant Robot Navigation Around Human Groups ICRA
Robot navigation in densely populated environments presents significant challenges, particularly regarding the interplay between individual and group dynamics. Current navigation models predominantly address interactions with individual pedestrians while failing to account for human groups that naturally form in real-world settings. Conversely, the limited models implementing group-aware navigation typically prioritize group dynamics at the expense of individual interactions, both of which are essential for socially appropriate navigation. This research extends an existing simulation framework to incorporate both individual pedestrians and human groups. We present Tangent Action for Group Avoidance (TAGA), a modular reactive mechanism that can be integrated with existing navigation frameworks to enhance their group-awareness capabilities. TAGA dynamically modifies robot trajectories using tangent action-based avoidance strategies while preserving the underlying model's capacity to navigate around individuals. Additionally, we introduce Group Collision Rate (GCR), a novel metric to quantitatively assess how effectively robots maintain group integrity during navigation. Through comprehensive simulation-based benchmarking, we demonstrate that integrating TAGA with state-of-the-art navigation models (ORCA, Social Force, DS-RNN, and AG-RL) reduces group intrusions by 45.7-78.6% while maintaining comparable success rates and navigation efficiency. Future work will focus on real-world implementation and validation of this approach.
comment: 6 pages, 3 figures. Preprint; intended for submission to IEEE International Conference on Robotics & Automation (ICRA), 2025
Hyper Yoshimura: How a slight tweak on a classical folding pattern unleashes meta-stability for deployable robots
Deployable structures inspired by origami have provided lightweight, compact, and reconfigurable solutions for various robotic and architectural applications. However, creating an integrated structural system that can effectively balance the competing requirements of high packing efficiency, simple deployment, and precise morphing into multiple load-bearing configurations remains a significant challenge. This study introduces a new class of hyper-Yoshimura origami, which exhibits a wide range of kinematically admissible and locally metastable states, including newly discovered symmetric "self-packing" and asymmetric "pop-out" states. This metastability is achieved by breaking a design rule of Yoshimura origami that has been in place for many decades. To this end, this study derives a new set of mathematically rigorous design rules and geometric formulations. Based on this, forward and inverse kinematic strategies are developed to stack hyper-Yoshimura modules into deployable booms that can approximate complex 3D shapes. Finally, this study showcases the potential of hyper-Yoshimura with a meter-scale pop-up cellphone charging station deployed at our university's bus transit station, along with a 3D-printed, scaled prototype of a space crane that can function as an object manipulator, solar tracking device, or high-load-bearing structure. These results establish hyper-Yoshimura as a promising platform for deployable and adaptable robotic systems in both terrestrial and space environments.
Adaptive Task Space Non-Singular Terminal Super-Twisting Sliding Mode Control of a 7-DOF Robotic Manipulator
This paper presents a new task-space Non-singular Terminal Super-Twisting Sliding Mode (NT-STSM) controller with adaptive gains for robust trajectory tracking of a 7-DOF robotic manipulator. The proposed approach addresses the challenges of chattering, unknown disturbances, and rotational motion tracking, making it suited for high-DOF manipulators in dexterous manipulation tasks. A rigorous boundedness proof is provided, offering gain selection guidelines for practical implementation. Simulations and hardware experiments with external disturbances demonstrate the proposed controller's robust, accurate tracking with reduced control effort under unknown disturbances compared to other NT-STSM and conventional controllers. The results demonstrated that the proposed NT-STSM controller mitigates chattering and instability in complex motions, making it a viable solution for dexterous robotic manipulations and various industrial applications.
comment: Accepted for publication in IEEE Transactions on Industrial Electronics. 12 pages, 8 figures
Generative diffusion posterior sampling for informative likelihoods
Sequential Monte Carlo (SMC) methods have recently shown successful results for conditional sampling of generative diffusion models. In this paper we propose a new diffusion posterior SMC sampler achieving improved statistical efficiencies, particularly under outlier conditions or highly informative likelihoods. The key idea is to construct an observation path that correlates with the diffusion model and to design the sampler to leverage this correlation for more efficient sampling. Empirical results conclude the efficiency.
comment: Commemorative issue for celebrating Thomas Kailath's 90th birthday
Optimal Batch-Size Control for Low-Latency Federated Learning with Device Heterogeneity
Federated learning (FL) has emerged as a popular approach for collaborative machine learning in sixth-generation (6G) networks, primarily due to its privacy-preserving capabilities. The deployment of FL algorithms is expected to empower a wide range of Internet-of-Things (IoT) applications, e.g., autonomous driving, augmented reality, and healthcare. The mission-critical and time-sensitive nature of these applications necessitates the design of low-latency FL frameworks that guarantee high learning performance. In practice, achieving low-latency FL faces two challenges: the overhead of computing and transmitting high-dimensional model updates, and the heterogeneity in communication-and-computation (C$^2$) capabilities across devices. To address these challenges, we propose a novel C$^2$-aware framework for optimal batch-size control that minimizes end-to-end (E2E) learning latency while ensuring convergence. The framework is designed to balance a fundamental C$^2$ tradeoff as revealed through convergence analysis. Specifically, increasing batch sizes improves the accuracy of gradient estimation in FL and thus reduces the number of communication rounds required for convergence, but results in higher per-round latency, and vice versa. The associated problem of latency minimization is intractable; however, we solve it by designing an accurate and tractable surrogate for convergence speed, with parameters fitted to real data. This approach yields two batch-size control strategies tailored to scenarios with slow and fast fading, while also accommodating device heterogeneity. Extensive experiments using real datasets demonstrate that the proposed strategies outperform conventional batch-size adaptation schemes that do not consider the C$^2$ tradeoff or device heterogeneity.
IDSO-Managed Bid-Based Transactive Distribution Systems Design for DER Participation in Wholesale Markets While Preserving T-D Interactions
Participation of Distributed Energy Resources (DERs) in bid-based Transactive Energy Systems (TES) at the distribution systems facilitates strongly coupled, bidirectional interactions between Transmission-Distribution (T-D) systems. Capturing these interactions is critical for ensuring seamless integration within an Integrated Transmission and Distribution (ITD) framework. This study proposes a methodology to preserve such tight T-D linkages by developing an Independent Distribution System Operator (IDSO) managed bid-based TES design for unbalanced distribution systems. The proposed design operates within the ITD paradigm and permits DER participation in the Wholesale Power Market (WPM) through IDSO while preserving tight T-D linkages. To this end, this research offers the following key contributions: a novel bid/offer prequalification-cum-aggregation method to ensure a grid-safe and value-based aggregation of DERs' bids and offers for WPM participation through IDSO; and a retail pricing mechanism that reflects the true value of procuring or offering additional units of power within the distribution system. Case studies are conducted on a modified IEEE 123-bus radial feeder populated with a high DER concentration to validate the proposed frameworks' effectiveness in coordinating the DERs efficiently and reliably.
comment: 17 Pages, 13 Figures. Removed a few typos, and added more references in literature review
Transient performance of MPC for tracking without terminal constraints
Model predictive control (MPC) for tracking is a recently introduced approach, which extends standard MPC formulations by incorporating an artificial reference as an additional optimization variable, in order to track external and potentially time-varying references. In this work, we analyze the performance of such an MPC for tracking scheme without a terminal cost and terminal constraints. We derive a transient performance estimate, i.e. a bound on the closed-loop performance over an arbitrary time interval, yielding insights on how to select the scheme's parameters for performance. Furthermore, we show that in the asymptotic case, where the prediction horizon and observed time interval tend to infinity, the closed-loop solution of MPC for tracking recovers the infinite horizon optimal solution.
comment: Accepted for publication in IEEE Control Systems Letters (L-CSS)
AI-Powered CPS-Enabled Urban Transportation Digital Twin: Methods and Applications
We present methods and applications for the development of digital twins (DT) for urban traffic management. While the majority of studies on the DT focus on its ``eyes," which is the emerging sensing and perception like object detection and tracking, what really distinguishes the DT from a traditional simulator lies in its ``brain," the prediction and decision making capabilities of extracting patterns and making informed decisions from what has been seen and perceived. In order to add value to urban transportation management, DTs need to be powered by artificial intelligence and complement with low-latency high-bandwidth sensing and networking technologies, in other words, cyberphysical systems (CPS). We will first review the DT pipeline enabled by CPS and propose our DT architecture deployed on a real-world testbed in New York City. This paper can be a pointer to help researchers and practitioners identify challenges and opportunities for the development of DTs; a bridge to initiate conversations across disciplines; and a road map to exploiting potentials of DTs for diverse urban transportation applications.
Co-Investment with Payoff-Sharing Mechanism for Cooperative Decision-Making in Network Design Games
Network-based systems are inherently interconnected, with the design and performance of subnetworks being interdependent. However, the decisions of self-interested operators may lead to suboptimal outcomes for users and the overall system. This paper explores cooperative mechanisms that can simultaneously benefit both operators and users. We address this challenge using a game-theoretical framework that integrates both non-cooperative and cooperative game theory. In the non-cooperative stage, we propose a network design game in which subnetwork decision-makers strategically design local infrastructures. In the cooperative stage, co-investment with payoff-sharing mechanism is developed to enlarge collective benefits and fairly distribute them. To demonstrate the effectiveness of our framework, we conduct case studies on the Sioux Falls network and real-world public transport networks in Zurich and Winterthur, Switzerland. Our evaluation considers impacts on environmental sustainability, social welfare, and economic efficiency. The proposed framework provides a foundation for improving interdependent networked systems by enabling strategic cooperation among self-interested operators.
Active Learning-Based Optimization of Hydroelectric Turbine Startup to Minimize Fatigue Damage
Hydro-generating units (HGUs) play a crucial role in integrating intermittent renewable energy sources into the power grid due to their flexible operational capabilities. This evolving role has led to an increase in transient events, such as startups, which impose significant stresses on turbines, leading to increased turbine fatigue and a reduced operational lifespan. Consequently, optimizing startup sequences to minimize stresses is vital for hydropower utilities. However, this task is challenging, as stress measurements on prototypes can be expensive and time-consuming. To tackle this challenge, we propose an innovative automated approach to optimize the startup parameters of HGUs with a limited budget of measured startup sequences. Our method combines active learning and black-box optimization techniques, utilizing virtual strain sensors and dynamic simulations of HGUs. This approach was tested in real-time during an on-site measurement campaign on an instrumented Francis turbine prototype. The results demonstrate that our algorithm successfully identified an optimal startup sequence using only seven measured sequences. It achieves a remarkable 42% reduction in the maximum strain cycle amplitude compared to the standard startup sequence. This study paves the way for more efficient HGU startup optimization, potentially extending their operational lifespans.
comment: Published in Renewable Energy
SINDy-RL: Interpretable and Efficient Model-Based Reinforcement Learning
Deep reinforcement learning (DRL) has shown significant promise for uncovering sophisticated control policies that interact in complex environments, such as stabilizing a tokamak fusion reactor or minimizing the drag force on an object in a fluid flow. However, DRL requires an abundance of training examples and may become prohibitively expensive for many applications. In addition, the reliance on deep neural networks often results in an uninterpretable, black-box policy that may be too computationally expensive to use with certain embedded systems. Recent advances in sparse dictionary learning, such as the sparse identification of nonlinear dynamics (SINDy), have shown promise for creating efficient and interpretable data-driven models in the low-data regime. In this work we introduce SINDy-RL, a unifying framework for combining SINDy and DRL to create efficient, interpretable, and trustworthy representations of the dynamics model, reward function, and control policy. We demonstrate the effectiveness of our approaches on benchmark control environments and flow control problems, including gust mitigation on a 3D NACA 0012 airfoil at $Re=1000$. SINDy-RL achieves comparable performance to modern DRL algorithms using significantly fewer interactions in the environment and results in an interpretable control policy orders of magnitude smaller than a DRL policy.
comment: For code, see https://github.com/nzolman/sindy-rl. v2 Update: Included Pinball and 3D Airfoil examples. Christian Lagemann added as an author for contributions with the 3D Airfoil code. To appear in Nature Communications
Generative Machine Learning in Adaptive Control of Dynamic Manufacturing Processes: A Review
Dynamic manufacturing processes exhibit complex characteristics defined by time-varying parameters, nonlinear behaviors, and uncertainties. These characteristics require sophisticated in-situ monitoring techniques utilizing multimodal sensor data and adaptive control systems that can respond to real-time feedback while maintaining product quality. Recently, generative machine learning (ML) has emerged as a powerful tool for modeling complex distributions and generating synthetic data while handling these manufacturing uncertainties. However, adopting these generative technologies in dynamic manufacturing systems lacks a functional control-oriented perspective to translate their probabilistic understanding into actionable process controls while respecting constraints. This review presents a functional classification of Prediction-Based, Direct Policy, Quality Inference, and Knowledge-Integrated approaches, offering a perspective for understanding existing ML-enhanced control systems and incorporating generative ML. The analysis of generative ML architectures within this framework demonstrates control-relevant properties and potential to extend current ML-enhanced approaches where conventional methods prove insufficient. We show generative ML's potential for manufacturing control through decision-making applications, process guidance, simulation, and digital twins, while identifying critical research gaps: separation between generation and control functions, insufficient physical understanding of manufacturing phenomena, and challenges adapting models from other domains. To address these challenges, we propose future research directions aimed at developing integrated frameworks that combine generative ML and control technologies to address the dynamic complexities of modern manufacturing systems.
comment: 12 pages, 1 figure, 1 table. This paper has been accepted for publication in the proceedings of ASME IDETC-CIE 2025
Output-feedback model predictive control under dynamic uncertainties using integral quadratic constraints
In this work, we propose an output-feedback tube-based model predictive control (MPC) scheme for linear systems under dynamic uncertainties that are described via integral quadratic constraints (IQC). By leveraging IQCs, a large class of nonlinear and dynamic uncertainties can be addressed. We leverage recent IQC synthesis tools to design a dynamic controller and an estimator that are robust to these uncertainties and minimize the size of the resulting constraint tightening in the MPC. Thereby, we show that the robust estimation problem using IQCs with peak-to-peak performance can be convexified. We guarantee recursive feasibility, robust constraint satisfaction, and input-to-state stability of the resulting MPC scheme.
Intelligent Condition Monitoring of Industrial Plants: An Overview of Methodologies and Uncertainty Management Strategies
Condition monitoring is essential for ensuring the safety, reliability, and efficiency of modern industrial systems. With the increasing complexity of industrial processes, artificial intelligence (AI) has emerged as a powerful tool for fault detection and diagnosis, attracting growing interest from both academia and industry. This paper provides a comprehensive overview of intelligent condition monitoring methods, with a particular emphasis on chemical plants and the widely used Tennessee Eastman Process (TEP) benchmark. State-of-the-art machine learning (ML) and deep learning (DL) algorithms are reviewed, highlighting their strengths, limitations, and applicability to industrial fault detection and diagnosis. Special attention is given to key challenges, including imbalanced and unlabeled data, and to strategies by which models can address these issues. Furthermore, comparative analyses of algorithm performance are presented to guide method selection in practical scenarios. This survey is intended to benefit both newcomers and experienced researchers by consolidating fundamental concepts, summarizing recent advances, and outlining open challenges and promising directions for intelligent condition monitoring in industrial plants.
Robotics
Neural Robot Dynamics
Accurate and efficient simulation of modern robots remains challenging due to their high degrees of freedom and intricate mechanisms. Neural simulators have emerged as a promising alternative to traditional analytical simulators, capable of efficiently predicting complex dynamics and adapting to real-world data; however, existing neural simulators typically require application-specific training and fail to generalize to novel tasks and/or environments, primarily due to inadequate representations of the global state. In this work, we address the problem of learning generalizable neural simulators for robots that are structured as articulated rigid bodies. We propose NeRD (Neural Robot Dynamics), learned robot-specific dynamics models for predicting future states for articulated rigid bodies under contact constraints. NeRD uniquely replaces the low-level dynamics and contact solvers in an analytical simulator and employs a robot-centric and spatially-invariant simulation state representation. We integrate the learned NeRD models as an interchangeable backend solver within a state-of-the-art robotics simulator. We conduct extensive experiments to show that the NeRD simulators are stable and accurate over a thousand simulation steps; generalize across tasks and environment configurations; enable policy learning exclusively in a neural engine; and, unlike most classical simulators, can be fine-tuned from real-world data to bridge the gap between simulation and reality.
Understanding and Utilizing Dynamic Coupling in Free-Floating Space Manipulators for On-Orbit Servicing
This study proposes a dynamic coupling-informed trajectory optimization algorithm for free-floating space manipulator systems (SMSs). Dynamic coupling between the base and the manipulator arms plays a critical role in influencing the system's behavior. While prior research has predominantly focused on minimizing this coupling, often overlooking its potential advantages, this work investigates how dynamic coupling can instead be leveraged to improve trajectory planning. Singular value decomposition (SVD) of the dynamic coupling matrix is employed to identify the dominant components governing coupling behavior. A quantitative metric is then formulated to characterize the strength and directionality of the coupling and is incorporated into a trajectory optimization framework. To assess the feasibility of the optimized trajectory, a sliding mode control-based tracking controller is designed to generate the required joint torque inputs. Simulation results demonstrate that explicitly accounting for dynamic coupling in trajectory planning enables more informed and potentially more efficient operation, offering new directions for the control of free-floating SMSs.
comment: 17 pages, 7 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
Exploiting Policy Idling for Dexterous Manipulation IROS 2025
Learning-based methods for dexterous manipulation have made notable progress in recent years. However, learned policies often still lack reliability and exhibit limited robustness to important factors of variation. One failure pattern that can be observed across many settings is that policies idle, i.e. they cease to move beyond a small region of states when they reach certain states. This policy idling is often a reflection of the training data. For instance, it can occur when the data contains small actions in areas where the robot needs to perform high-precision motions, e.g., when preparing to grasp an object or object insertion. Prior works have tried to mitigate this phenomenon e.g. by filtering the training data or modifying the control frequency. However, these approaches can negatively impact policy performance in other ways. As an alternative, we investigate how to leverage the detectability of idling behavior to inform exploration and policy improvement. Our approach, Pause-Induced Perturbations (PIP), applies perturbations at detected idling states, thus helping it to escape problematic basins of attraction. On a range of challenging simulated dual-arm tasks, we find that this simple approach can already noticeably improve test-time performance, with no additional supervision or training. Furthermore, since the robot tends to idle at critical points in a movement, we also find that learning from the resulting episodes leads to better iterative policy improvement compared to prior approaches. Our perturbation strategy also leads to a 15-35% improvement in absolute success rate on a real-world insertion task that requires complex multi-finger manipulation.
comment: A similar version to this paper was accepted at IROS 2025
Mind and Motion Aligned: A Joint Evaluation IsaacSim Benchmark for Task Planning and Low-Level Policies in Mobile Manipulation
Benchmarks are crucial for evaluating progress in robotics and embodied AI. However, a significant gap exists between benchmarks designed for high-level language instruction following, which often assume perfect low-level execution, and those for low-level robot control, which rely on simple, one-step commands. This disconnect prevents a comprehensive evaluation of integrated systems where both task planning and physical execution are critical. To address this, we propose Kitchen-R, a novel benchmark that unifies the evaluation of task planning and low-level control within a simulated kitchen environment. Built as a digital twin using the Isaac Sim simulator and featuring more than 500 complex language instructions, Kitchen-R supports a mobile manipulator robot. We provide baseline methods for our benchmark, including a task-planning strategy based on a vision-language model and a low-level control policy based on diffusion policy. We also provide a trajectory collection system. Our benchmark offers a flexible framework for three evaluation modes: independent assessment of the planning module, independent assessment of the control policy, and, crucially, an integrated evaluation of the whole system. Kitchen-R bridges a key gap in embodied AI research, enabling more holistic and realistic benchmarking of language-guided robotic agents.
LLM-Driven Self-Refinement for Embodied Drone Task Planning
We introduce SRDrone, a novel system designed for self-refinement task planning in industrial-grade embodied drones. SRDrone incorporates two key technical contributions: First, it employs a continuous state evaluation methodology to robustly and accurately determine task outcomes and provide explanatory feedback. This approach supersedes conventional reliance on single-frame final-state assessment for continuous, dynamic drone operations. Second, SRDrone implements a hierarchical Behavior Tree (BT) modification model. This model integrates multi-level BT plan analysis with a constrained strategy space to enable structured reflective learning from experience. Experimental results demonstrate that SRDrone achieves a 44.87% improvement in Success Rate (SR) over baseline methods. Furthermore, real-world deployment utilizing an experience base optimized through iterative self-refinement attains a 96.25% SR. By embedding adaptive task refinement capabilities within an industrial-grade BT planning framework, SRDrone effectively integrates the general reasoning intelligence of Large Language Models (LLMs) with the stringent physical execution constraints inherent to embodied drones. Code is available at https://github.com/ZXiiiC/SRDrone.
comment: 14pages
Lang2Lift: A Framework for Language-Guided Pallet Detection and Pose Estimation Integrated in Autonomous Outdoor Forklift Operation
The logistics and construction industries face persistent challenges in automating pallet handling, especially in outdoor environments with variable payloads, inconsistencies in pallet quality and dimensions, and unstructured surroundings. In this paper, we tackle automation of a critical step in pallet transport: the pallet pick-up operation. Our work is motivated by labor shortages, safety concerns, and inefficiencies in manually locating and retrieving pallets under such conditions. We present Lang2Lift, a framework that leverages foundation models for natural language-guided pallet detection and 6D pose estimation, enabling operators to specify targets through intuitive commands such as "pick up the steel beam pallet near the crane." The perception pipeline integrates Florence-2 and SAM-2 for language-grounded segmentation with FoundationPose for robust pose estimation in cluttered, multi-pallet outdoor scenes under variable lighting. The resulting poses feed into a motion planning module for fully autonomous forklift operation. We validate Lang2Lift on the ADAPT autonomous forklift platform, achieving 0.76 mIoU pallet segmentation accuracy on a real-world test dataset. Timing and error analysis demonstrate the system's robustness and confirm its feasibility for deployment in operational logistics and construction environments. Video demonstrations are available at https://eric-nguyen1402.github.io/lang2lift.github.io/
comment: 8 pages, 7 figures
Sensing, Social, and Motion Intelligence in Embodied Navigation: A Comprehensive Survey
Embodied navigation (EN) advances traditional navigation by enabling robots to perform complex egocentric tasks through sensing, social, and motion intelligence. In contrast to classic methodologies that rely on explicit localization and pre-defined maps, EN leverages egocentric perception and human-like interaction strategies. This survey introduces a comprehensive EN formulation structured into five stages: Transition, Observation, Fusion, Reward-policy construction, and Action (TOFRA). The TOFRA framework serves to synthesize the current state of the art, provide a critical review of relevant platforms and evaluation metrics, and identify critical open research challenges. A list of studies is available at https://github.com/Franky-X/Awesome-Embodied-Navigation.
Mag-Match: Magnetic Vector Field Features for Map Matching and Registration IROS
Map matching and registration are essential tasks in robotics for localisation and integration of multi-session or multi-robot data. Traditional methods rely on cameras or LiDARs to capture visual or geometric information but struggle in challenging conditions like smoke or dust. Magnetometers, on the other hand, detect magnetic fields, revealing features invisible to other sensors and remaining robust in such environments. In this paper, we introduce Mag-Match, a novel method for extracting and describing features in 3D magnetic vector field maps to register different maps of the same area. Our feature descriptor, based on higher-order derivatives of magnetic field maps, is invariant to global orientation, eliminating the need for gravity-aligned mapping. To obtain these higher-order derivatives map-wide given point-wise magnetometer data, we leverage a physics-informed Gaussian Process to perform efficient and recursive probabilistic inference of both the magnetic field and its derivatives. We evaluate Mag-Match in simulated and real-world experiments against a SIFT-based approach, demonstrating accurate map-to-map, robot-to-map, and robot-to-robot transformations - even without initial gravitational alignment.
comment: To be published in IROS: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025
Survey of Vision-Language-Action Models for Embodied Manipulation
Embodied intelligence systems, which enhance agent capabilities through continuous environment interactions, have garnered significant attention from both academia and industry. Vision-Language-Action models, inspired by advancements in large foundation models, serve as universal robotic control frameworks that substantially improve agent-environment interaction capabilities in embodied intelligence systems. This expansion has broadened application scenarios for embodied AI robots. This survey comprehensively reviews VLA models for embodied manipulation. Firstly, it chronicles the developmental trajectory of VLA architectures. Subsequently, we conduct a detailed analysis of current research across 5 critical dimensions: VLA model structures, training datasets, pre-training methods, post-training methods, and model evaluation. Finally, we synthesize key challenges in VLA development and real-world deployment, while outlining promising future research directions.
comment: in Chinese language
Hardware Implementation of a Zero-Prior-Knowledge Approach to Lifelong Learning in Kinematic Control of Tendon-Driven Quadrupeds
Like mammals, robots must rapidly learn to control their bodies and interact with their environment despite incomplete knowledge of their body structure and surroundings. They must also adapt to continuous changes in both. This work presents a bio-inspired learning algorithm, General-to-Particular (G2P), applied to a tendon-driven quadruped robotic system developed and fabricated in-house. Our quadruped robot undergoes an initial five-minute phase of generalized motor babbling, followed by 15 refinement trials (each lasting 20 seconds) to achieve specific cyclical movements. This process mirrors the exploration-exploitation paradigm observed in mammals. With each refinement, the robot progressively improves upon its initial "good enough" solution. Our results serve as a proof-of-concept, demonstrating the hardware-in-the-loop system's ability to learn the control of a tendon-driven quadruped with redundancies in just a few minutes to achieve functional and adaptive cyclical non-convex movements. By advancing autonomous control in robotic locomotion, our approach paves the way for robots capable of dynamically adjusting to new environments, ensuring sustained adaptability and performance.
Self-Aligning EPM Connector: A Versatile Solution for Adaptive and Multi-Modal Interfaces
This paper presents a multifunctional connector based on electro-permanent magnet (EPM) technology, integrating self-alignment, mechanical coupling, fluid transfer, and data communication within a compact SLA-3D printed structure. Experimental results demonstrate reliable self-alignment, efficient fluid transfer in single-loop and dual-channel modes, and robust data transmission via integrated electronic control. The connector exhibits high flexibility in accommodating axial, angular, and lateral misalignments while maintaining low energy consumption. These features make it highly suitable for modular robotics, electric vehicle charging, household robotic platforms, and aerospace docking applications.
GelSLAM: A Real-time, High-Fidelity, and Robust 3D Tactile SLAM System
Accurately perceiving an object's pose and shape is essential for precise grasping and manipulation. Compared to common vision-based methods, tactile sensing offers advantages in precision and immunity to occlusion when tracking and reconstructing objects in contact. This makes it particularly valuable for in-hand and other high-precision manipulation tasks. In this work, we present GelSLAM, a real-time 3D SLAM system that relies solely on tactile sensing to estimate object pose over long periods and reconstruct object shapes with high fidelity. Unlike traditional point cloud-based approaches, GelSLAM uses tactile-derived surface normals and curvatures for robust tracking and loop closure. It can track object motion in real time with low error and minimal drift, and reconstruct shapes with submillimeter accuracy, even for low-texture objects such as wooden tools. GelSLAM extends tactile sensing beyond local contact to enable global, long-horizon spatial perception, and we believe it will serve as a foundation for many precise manipulation tasks involving interaction with objects in hand. The video demo is available on our website: https://joehjhuang.github.io/gelslam.
comment: 18 pages
UnPose: Uncertainty-Guided Diffusion Priors for Zero-Shot Pose Estimation
Estimating the 6D pose of novel objects is a fundamental yet challenging problem in robotics, often relying on access to object CAD models. However, acquiring such models can be costly and impractical. Recent approaches aim to bypass this requirement by leveraging strong priors from foundation models to reconstruct objects from single or multi-view images, but typically require additional training or produce hallucinated geometry. To this end, we propose UnPose, a novel framework for zero-shot, model-free 6D object pose estimation and reconstruction that exploits 3D priors and uncertainty estimates from a pre-trained diffusion model. Specifically, starting from a single-view RGB-D frame, UnPose uses a multi-view diffusion model to estimate an initial 3D model using 3D Gaussian Splatting (3DGS) representation, along with pixel-wise epistemic uncertainty estimates. As additional observations become available, we incrementally refine the 3DGS model by fusing new views guided by the diffusion model's uncertainty, thereby continuously improving the pose estimation accuracy and 3D reconstruction quality. To ensure global consistency, the diffusion prior-generated views and subsequent observations are further integrated in a pose graph and jointly optimized into a coherent 3DGS field. Extensive experiments demonstrate that UnPose significantly outperforms existing approaches in both 6D pose estimation accuracy and 3D reconstruction quality. We further showcase its practical applicability in real-world robotic manipulation tasks.
comment: Published at the Conference on Robot Learning (CoRL) 2025. For more details please visit https://frankzhaodong.github.io/UnPose
Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning
Vision-centric hierarchical embodied models have demonstrated strong potential for long-horizon robotic control. However, existing methods lack spatial awareness capabilities, limiting their effectiveness in bridging visual plans to actionable control in complex environments. To address this problem, we propose Spatial Policy (SP), a unified spatial-aware visuomotor robotic manipulation framework via explicit spatial modeling and reasoning. Specifically, we first design a spatial-conditioned embodied video generation module to model spatially guided predictions through a spatial plan table. Then, we propose a spatial-based action prediction module to infer executable actions with coordination. Finally, we propose a spatial reasoning feedback policy to refine the spatial plan table via dual-stage replanning. Extensive experiments show that SP significantly outperforms state-of-the-art baselines, achieving a 33.0% average improvement over the best baseline. With an 86.7% average success rate across 11 diverse tasks, SP substantially enhances the practicality of embodied models for robotic control applications. Code and checkpoints are maintained at https://plantpotatoonmoon.github.io/SpatialPolicy/.
Active Prostate Phantom with Multiple Chambers
Prostate cancer is a major global health concern, requiring advancements in robotic surgery and diagnostics to improve patient outcomes. A phantom is a specially designed object that simulates human tissues or organs. It can be used for calibrating and testing a medical process, as well as for training and research purposes. Existing prostate phantoms fail to simulate dynamic scenarios. This paper presents a pneumatically actuated prostate phantom with multiple independently controlled chambers, allowing for precise volumetric adjustments to replicate asymmetric and symmetric benign prostatic hyperplasia (BPH). The phantom is designed based on shape analysis of magnetic resonance imaging (MRI) datasets, modeled with finite element method (FEM), and validated through 3D reconstruction. The simulation results showed strong agreement with physical measurements, achieving average errors of 3.47% in forward modeling and 1.41% in inverse modeling. These results demonstrate the phantom's potential as a platform for validating robotic-assisted systems and for further development toward realistic simulation-based medical training.
An Informative Planning Framework for Target Tracking and Active Mapping in Dynamic Environments with ASVs
Mobile robot platforms are increasingly being used to automate information gathering tasks such as environmental monitoring. Efficient target tracking in dynamic environments is critical for applications such as search and rescue and pollutant cleanups. In this letter, we study active mapping of floating targets that drift due to environmental disturbances such as wind and currents. This is a challenging problem as it involves predicting both spatial and temporal variations in the map due to changing conditions. We propose an informative path planning framework to map an arbitrary number of moving targets with initially unknown positions in dynamic environments. A key component of our approach is a spatiotemporal prediction network that predicts target position distributions over time. We propose an adaptive planning objective for target tracking that leverages these predictions. Simulation experiments show that our proposed planning objective improves target tracking performance compared to existing methods that consider only entropy reduction as the planning objective. Finally, we validate our approach in field tests using an autonomous surface vehicle, showcasing its ability to track targets in real-world monitoring scenarios.
comment: Submitted to IEEE Robotics and Automation Letters (RA-L)
Taming VR Teleoperation and Learning from Demonstration for Multi-Task Bimanual Table Service Manipulation ICRA 2025
This technical report presents the champion solution of the Table Service Track in the ICRA 2025 What Bimanuals Can Do (WBCD) competition. We tackled a series of demanding tasks under strict requirements for speed, precision, and reliability: unfolding a tablecloth (deformable-object manipulation), placing a pizza into the container (pick-and-place), and opening and closing a food container with the lid. Our solution combines VR-based teleoperation and Learning from Demonstrations (LfD) to balance robustness and autonomy. Most subtasks were executed through high-fidelity remote teleoperation, while the pizza placement was handled by an ACT-based policy trained from 100 in-person teleoperated demonstrations with randomized initial configurations. By carefully integrating scoring rules, task characteristics, and current technical capabilities, our approach achieved both high efficiency and reliability, ultimately securing the first place in the competition.
comment: Technical Report of First-place/Champion solution at IEEE ICRA 2025 What Bimanuals Can Do (WBCD) Challenge - Table Services Track
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents ACM MM
Aerial navigation is a fundamental yet underexplored capability in embodied intelligence, enabling agents to operate in large-scale, unstructured environments where traditional navigation paradigms fall short. However, most existing research follows the Vision-and-Language Navigation (VLN) paradigm, which heavily depends on sequential linguistic instructions, limiting its scalability and autonomy. To address this gap, we introduce UAV-ON, a benchmark for large-scale Object Goal Navigation (ObjectNav) by aerial agents in open-world environments, where agents operate based on high-level semantic goals without relying on detailed instructional guidance as in VLN. UAV-ON comprises 14 high-fidelity Unreal Engine environments with diverse semantic regions and complex spatial layouts, covering urban, natural, and mixed-use settings. It defines 1270 annotated target objects, each characterized by an instance-level instruction that encodes category, physical footprint, and visual descriptors, allowing grounded reasoning. These instructions serve as semantic goals, introducing realistic ambiguity and complex reasoning challenges for aerial agents. To evaluate the benchmark, we implement several baseline methods, including Aerial ObjectNav Agent (AOA), a modular policy that integrates instruction semantics with egocentric observations for long-horizon, goal-directed exploration. Empirical results show that all baselines struggle in this setting, highlighting the compounded challenges of aerial navigation and semantic goal grounding. UAV-ON aims to advance research on scalable UAV autonomy driven by semantic goal descriptions in complex real-world environments.
comment: Accepted to ACM MM Dataset Track 2025
Embodied Long Horizon Manipulation with Closed-loop Code Generation and Incremental Few-shot Adaptation ICRA 6
Embodied long-horizon manipulation requires robotic systems to process multimodal inputs-such as vision and natural language-and translate them into executable actions. However, existing learning-based approaches often depend on large, task-specific datasets and struggle to generalize to unseen scenarios. Recent methods have explored using large language models (LLMs) as high-level planners that decompose tasks into subtasks using natural language and guide pretrained low-level controllers. Yet, these approaches assume perfect execution from low-level policies, which is unrealistic in real-world environments with noise or suboptimal behaviors. To overcome this, we fully discard the pretrained low-level policy and instead use the LLM to directly generate executable code plans within a closed-loop framework. Our planner employs chain-of-thought (CoT)-guided few-shot learning with incrementally structured examples to produce robust and generalizable task plans. Complementing this, a reporter evaluates outcomes using RGB-D and delivers structured feedback, enabling recovery from misalignment and replanning under partial observability. This design eliminates per-step inference, reduces computational overhead, and limits error accumulation that was observed in previous methods. Our framework achieves state-of-the-art performance on 30+ diverse seen and unseen long-horizon tasks across LoHoRavens, CALVIN, Franka Kitchen, and cluttered real-world settings.
comment: update ICRA 6 page
Equivariant IMU Preintegration with Biases: a Galilean Group Approach
This letter proposes a new approach for Inertial Measurement Unit (IMU) preintegration, a fundamental building block that can be leveraged in different optimization-based Inertial Navigation System (INS) localization solutions. Inspired by recent advances in equivariant theory applied to biased INSs, we derive a discrete-time formulation of the IMU preintegration on ${\mathbf{Gal}(3) \ltimes \mathfrak{gal}(3)}$, the left-trivialization of the tangent group of the Galilean group $\mathbf{Gal}(3)$. We define a novel preintegration error that geometrically couples the navigation states and the bias leading to lower linearization error. Our method improves in consistency compared to existing preintegration approaches which treat IMU biases as a separate state-space. Extensive validation against state-of-the-art methods, both in simulation and with real-world IMU data, implementation in the Lie++ library, and open-source code are provided.
TripleMixer: A 3D Point Cloud Denoising Model for Adverse Weather
Adverse weather conditions such as snow, fog, and rain pose significant challenges to LiDAR-based perception models by introducing noise and corrupting point cloud measurements. To address this issue, we propose TripleMixer, a robust and efficient point cloud denoising network that integrates spatial, frequency, and channel-wise processing through three specialized mixer modules. TripleMixer effectively suppresses high-frequency noise while preserving essential geometric structures and can be seamlessly deployed as a plug-and-play module within existing LiDAR perception pipelines. To support the development and evaluation of denoising methods, we construct two large-scale simulated datasets, Weather-KITTI and Weather-NuScenes, covering diverse weather scenarios with dense point-wise semantic and noise annotations. Based on these datasets, we establish four benchmarks: Denoising, Semantic Segmentation (SS), Place Recognition (PR), and Object Detection (OD). These benchmarks enable systematic evaluation of denoising generalization, transferability, and downstream impact under both simulated and real-world adverse weather conditions. Extensive experiments demonstrate that TripleMixer achieves state-of-the-art denoising performance and yields substantial improvements across all downstream tasks without requiring retraining. Our results highlight the potential of denoising as a task-agnostic preprocessing strategy to enhance LiDAR robustness in real-world autonomous driving applications.
comment: 15 pages, submit to IEEE TIP
ILeSiA: Interactive Learning of Robot Situational Awareness from Camera Input
Learning from demonstration is a promising approach for teaching robots new skills. However, a central challenge in the execution of acquired skills is the ability to recognize faults and prevent failures. This is essential because demonstrations typically cover only a limited set of scenarios and often only the successful ones. During task execution, unforeseen situations may arise, such as changes in the robot's environment or interaction with human operators. To recognize such situations, this paper focuses on teaching the robot situational awareness by using a camera input and labeling frames as safe or risky. We train a Gaussian Process (GP) regression model fed by a low-dimensional latent space representation of the input images. The model outputs a continuous risk score ranging from zero to one, quantifying the degree of risk at each timestep. This allows for pausing task execution in unsafe situations and directly adding new training data, labeled by the human user. Our experiments on a robotic manipulator show that the proposed method can reliably detect both known and novel faults using only a single example for each new fault. In contrast, a standard multi-layer perceptron (MLP) performs well only on faults it has encountered during training. Our method enables the next generation of cobots to be rapidly deployed with easy-to-set-up, vision-based risk assessment, proactively safeguarding humans and detecting misaligned parts or missing objects before failures occur. We provide all the code and data required to reproduce our experiments at imitrob.ciirc.cvut.cz/publications/ilesia.
comment: 8 pages, 9 figures. Accepted to IEEE Robotics and Automation Letters (Early Access)
Automatic Geometric Decomposition for Analytical Inverse Kinematics
Calculating the inverse kinematics (IK) is a fundamental challenge in robotics. Compared to numerical or learning-based approaches, analytical IK provides higher efficiency and accuracy. However, existing analytical approaches are difficult to use in most applications, as they require human ingenuity in the derivation process, are numerically unstable, or rely on time-consuming symbolic manipulation. In contrast, we propose a method that, for the first time, enables an analytical IK derivation and computation in less than a millisecond in total. Our work is based on an automatic online decomposition of the IK into pre-solved, numerically stable subproblems via a kinematic classification of the respective manipulator. In numerical experiments, we demonstrate that our approach is orders of magnitude faster in deriving the IK than existing tools that employ symbolic manipulation. Following this one-time derivation, our method matches and often surpasses baselines, such as IKFast, in terms of speed and accuracy during the computation of explicit IK solutions. Finally, we provide an open-source C++ toolbox with Python wrappers that substantially reduces the entry barrier to using analytical IK in applications like rapid prototyping and kinematic robot design.
comment: Website: https://eaik.cps.cit.tum.de/
Towards High Precision: An Adaptive Self-Supervised Learning Framework for Force-Based Verification
The automation of robotic tasks requires high precision and adaptability, particularly in force-based operations such as insertions. Traditional learning-based approaches either rely on static datasets, which limit their ability to generalize, or require frequent manual intervention to maintain good performances. As a result, ensuring long-term reliability without human supervision remains a significant challenge. To address this, we propose an adaptive self-supervised learning framework for insertion classification that continuously improves its precision over time. The framework operates in real-time, incrementally refining its classification decisions by integrating newly acquired force data. Unlike conventional methods, it does not rely on pre-collected datasets but instead evolves dynamically with each task execution. Through real-world experiments, we demonstrate how the system progressively reduces execution time while maintaining near-perfect precision as more samples are processed. This adaptability ensures long-term reliability in force-based robotic tasks while minimizing the need for manual intervention.
comment: 7 pages, 7 figures, 3 tables
Continual Learning for Multimodal Data Fusion of a Soft Gripper
Continual learning (CL) refers to the ability of an algorithm to continuously and incrementally acquire new knowledge from its environment while retaining previously learned information. A model trained on one data modality often fails when tested with a different modality. A straightforward approach might be to fuse the two modalities by concatenating their features and training the model on the fused data. However, this requires retraining the model from scratch each time it encounters a new domain. In this paper, we introduce a continual learning algorithm capable of incrementally learning different data modalities by leveraging both class-incremental and domain-incremental learning scenarios in an artificial environment where labeled data is scarce, yet non-iid (independent and identical distribution) unlabeled data from the environment is plentiful. The proposed algorithm is efficient and only requires storing prototypes for each class. We evaluate the algorithm's effectiveness on a challenging custom multimodal dataset comprising of tactile data from a soft pneumatic gripper, and visual data from non-stationary images of objects extracted from video sequences. Additionally, we conduct an ablation study on the custom dataset and the Core50 dataset to highlight the contributions of different components of the algorithm. To further demonstrate the robustness of the algorithm, we perform a real-time experiment for object classification using the soft gripper and an external independent camera setup, all synchronized with the Robot Operating System (ROS) framework.
comment: Accepted in Wiley Advanced Robotics Research
Polytope Volume Monitoring Problem: Formulation and Solution via Parametric Linear Program Based Control Barrier Function
Motivated by the latest research on feasible space monitoring of multiple control barrier functions (CBFs) as well as polytopic collision avoidance, this paper studies the Polytope Volume Monitoring (PVM) problem, whose goal is to design a control law for inputs of nonlinear systems to prevent the volume of some state-dependent polytope from decreasing to zero. Recent studies have explored the idea of applying Chebyshev ball method in optimization theory to solve the case study of PVM; however, the underlying difficulties caused by nonsmoothness have not been addressed. This paper continues the study on this topic, where our main contribution is to establish the relationship between nonsmooth CBF and parametric optimization theory through directional derivatives for the first time, to solve PVM problems more conveniently. In detail, inspired by Chebyshev ball approach, a parametric linear program (PLP) based nonsmooth barrier function candidate is established for PVM, and then, sufficient conditions for it to be a nonsmooth CBF are proposed, based on which a quadratic program (QP) based safety filter with guaranteed feasibility is proposed to address PVM problems. Finally, a numerical simulation example is given to show the efficiency of the proposed safety filter.
comment: An extension version of the accepted CDC2025
Multiagent Systems
Distributed Detection of Adversarial Attacks in Multi-Agent Reinforcement Learning with Continuous Action Space ECAI 2025
We address the problem of detecting adversarial attacks against cooperative multi-agent reinforcement learning with continuous action space. We propose a decentralized detector that relies solely on the local observations of the agents and makes use of a statistical characterization of the normal behavior of observable agents. The proposed detector utilizes deep neural networks to approximate the normal behavior of agents as parametric multivariate Gaussian distributions. Based on the predicted density functions, we define a normality score and provide a characterization of its mean and variance. This characterization allows us to employ a two-sided CUSUM procedure for detecting deviations of the normality score from its mean, serving as a detector of anomalous behavior in real-time. We evaluate our scheme on various multi-agent PettingZoo benchmarks against different state-of-the-art attack methods, and our results demonstrate the effectiveness of our method in detecting impactful adversarial attacks. Particularly, it outperforms the discrete counterpart by achieving AUC-ROC scores of over 0.95 against the most impactful attacks in all evaluated environments.
comment: Accepted for publication at ECAI 2025
Language-Guided Tuning: Enhancing Numeric Optimization with Textual Feedback
Configuration optimization remains a critical bottleneck in machine learning, requiring coordinated tuning across model architecture, training strategy, feature engineering, and hyperparameters. Traditional approaches treat these dimensions independently and lack interpretability, while recent automated methods struggle with dynamic adaptability and semantic reasoning about optimization decisions. We introduce Language-Guided Tuning (LGT), a novel framework that employs multi-agent Large Language Models to intelligently optimize configurations through natural language reasoning. We apply textual gradients - qualitative feedback signals that complement numerical optimization by providing semantic understanding of training dynamics and configuration interdependencies. LGT coordinates three specialized agents: an Advisor that proposes configuration changes, an Evaluator that assesses progress, and an Optimizer that refines the decision-making process, creating a self-improving feedback loop. Through comprehensive evaluation on six diverse datasets, LGT demonstrates substantial improvements over traditional optimization methods, achieving performance gains while maintaining high interpretability.
comment: 9 pages, 4 figures, 4 tables
Understanding Action Effects through Instrumental Empowerment in Multi-Agent Reinforcement Learning ECAI
To reliably deploy Multi-Agent Reinforcement Learning (MARL) systems, it is crucial to understand individual agent behaviors within a team. While prior work typically evaluates overall team performance based on explicit reward signals or learned value functions, it is unclear how to infer agent contributions in the absence of any value feedback. In this work, we investigate whether meaningful insights into agent behaviors can be extracted that are consistent with the underlying value functions, solely by analyzing the policy distribution. Inspired by the phenomenon that intelligent agents tend to pursue convergent instrumental values, which generally increase the likelihood of task success, we introduce Intended Cooperation Values (ICVs), a method based on information-theoretic Shapley values for quantifying each agent's causal influence on their co-players' instrumental empowerment. Specifically, ICVs measure an agent's action effect on its teammates' policies by assessing their decision uncertainty and preference alignment. The analysis across cooperative and competitive MARL environments reveals the extent to which agents adopt similar or diverse strategies. By comparing action effects between policies and value functions, our method identifies which agent behaviors are beneficial to team success, either by fostering deterministic decisions or by preserving flexibility for future action choices. Our proposed method offers novel insights into cooperation dynamics and enhances explainability in MARL systems.
comment: European Conference on Artificial Intelligence (ECAI) 2025
HEAS: Hierarchical Evolutionary Agent Simulation Framework for Cross-Scale Modeling and Multi-Objective Search
Hierarchical Evolutionary Agent Simulation (HEAS) is a Python framework that unifies layered agent-based modeling with evolutionary optimization and tournament evaluation in a single, reproducible workflow. HEAS represents models as hierarchies of lightweight processes ("streams") scheduled in deterministic layers that read and write a shared context, making cross-scale couplings explicit and auditable. A compact API and CLI-simulate, optimize, evaluate-expose single- and multi-objective evolution, PyTorch policy integration via parameter flattening/unflattening, and general tournament tooling with user-defined scoring and voting rules. The framework standardizes evaluation through uniform per-step and episode metrics, persists seeds, logbooks, and hall-of-fame archives, and provides plotting helpers for traces, Pareto fronts, and comparative outcomes, reducing glue code and improving comparability across studies. HEAS emphasizes separation of mechanism from orchestration, allowing exogenous drivers, endogenous agents, and aggregators to be composed and swapped without refactoring, while the same model can be used for forward simulation, optimization, or systematic comparison. We illustrate usage with two compact examples-an ecological system and an enterprise decision-making setting. HEAS offers a practical foundation for cross-disciplinary, multi-level inquiry, yielding reliable, reproducible results.
comment: 9 pages, 1 figure
Foundational Design Principles and Patterns for Building Robust and Adaptive GenAI-Native Systems
Generative AI (GenAI) has emerged as a transformative technology, demonstrating remarkable capabilities across diverse application domains. However, GenAI faces several major challenges in developing reliable and efficient GenAI-empowered systems due to its unpredictability and inefficiency. This paper advocates for a paradigm shift: future GenAI-native systems should integrate GenAI's cognitive capabilities with traditional software engineering principles to create robust, adaptive, and efficient systems. We introduce foundational GenAI-native design principles centered around five key pillars -- reliability, excellence, evolvability, self-reliance, and assurance -- and propose architectural patterns such as GenAI-native cells, organic substrates, and programmable routers to guide the creation of resilient and self-evolving systems. Additionally, we outline the key ingredients of a GenAI-native software stack and discuss the impact of these systems from technical, user adoption, economic, and legal perspectives, underscoring the need for further validation and experimentation. Our work aims to inspire future research and encourage relevant communities to implement and refine this conceptual framework.
Multiple Memory Systems for Enhancing the Long-term Memory of Agent
An agent powered by large language models have achieved impressive results, but effectively handling the vast amounts of historical data generated during interactions remains a challenge. The current approach is to design a memory module for the agent to process these data. However, existing methods, such as MemoryBank and A-MEM, have poor quality of stored memory content, which affects recall performance and response quality. In order to better construct high-quality long-term memory content, we have designed a multiple memory system (MMS) inspired by cognitive psychology theory. The system processes short-term memory to multiple long-term memory fragments, and constructs retrieval memory units and contextual memory units based on these fragments, with a one-to-one correspondence between the two. During the retrieval phase, MMS will match the most relevant retrieval memory units based on the user's query. Then, the corresponding contextual memory units is obtained as the context for the response stage to enhance knowledge, thereby effectively utilizing historical data. Experiments on LoCoMo dataset compared our method with three others, proving its effectiveness. Ablation studies confirmed the rationality of our memory units. We also analyzed the robustness regarding the number of selected memory segments and the storage overhead, demonstrating its practical value.
See it. Say it. Sorted: Agentic System for Compositional Diagram Generation
We study sketch-to-diagram generation: converting rough hand sketches into precise, compositional diagrams. Diffusion models excel at photorealism but struggle with the spatial precision, alignment, and symbolic structure required for flowcharts. We introduce See it. Say it. Sorted., a training-free agentic system that couples a Vision-Language Model (VLM) with Large Language Models (LLMs) to produce editable Scalable Vector Graphics (SVG) programs. The system runs an iterative loop in which a Critic VLM proposes a small set of qualitative, relational edits; multiple candidate LLMs synthesize SVG updates with diverse strategies (conservative->aggressive, alternative, focused); and a Judge VLM selects the best candidate, ensuring stable improvement. This design prioritizes qualitative reasoning over brittle numerical estimates, preserves global constraints (e.g., alignment, connectivity), and naturally supports human-in-the-loop corrections. On 10 sketches derived from flowcharts in published papers, our method more faithfully reconstructs layout and structure than two frontier closed-source image generation LLMs (GPT-5 and Gemini-2.5-Pro), accurately composing primitives (e.g., multi-headed arrows) without inserting unwanted text. Because outputs are programmatic SVGs, the approach is readily extensible to presentation tools (e.g., PowerPoint) via APIs and can be specialized with improved prompts and task-specific tools. The codebase is open-sourced at https://github.com/hantaoZhangrichard/see_it_say_it_sorted.git.
ASIC-Agent: An Autonomous Multi-Agent System for ASIC Design with Benchmark Evaluation
Large Language Models (LLMs) have demonstrated remarkable capabilities in Register Transfer Level (RTL) design, enabling high-quality code generation from natural language descriptions. However, LLMs alone face significant limitations in real-world hardware design workflows, including the inability to execute code, lack of debugging capabilities, and absence of long-term memory. To address these challenges, we present ASIC-Agent, an autonomous system designed specifically for digital ASIC design tasks. ASIC-Agent enhances base LLMs with a multi-agent architecture incorporating specialized sub-agents for RTL generation, verification, OpenLane hardening, and Caravel chip integration, all operating within a comprehensive sandbox environment with access to essential hardware design tools. The system leverages a vector database containing documentation, API references, error knowledge, and curated insights from the open-source silicon community. To evaluate ASIC-Agent's performance, we introduce ASIC-Agent-Bench, the first benchmark specifically designed to assess agentic systems in hardware design tasks. We evaluate ASIC-Agent with various base LLMs, providing quantitative comparisons and qualitative insights into agent behavior across different design scenarios. Our results demonstrate that ASIC-Agent, when powered by Claude 4 Sonnet, successfully automates a broad range of ASIC design tasks spanning varying levels of complexity, showing the potential of significantly accelerating the ASIC design workflow.
comment: 2025 IEEE International Conference on LLM-Aided Design (ICLAD)
DeepMEL: A Multi-Agent Collaboration Framework for Multimodal Entity Linking
Multimodal Entity Linking (MEL) aims to associate textual and visual mentions with entities in a multimodal knowledge graph. Despite its importance, current methods face challenges such as incomplete contextual information, coarse cross-modal fusion, and the difficulty of jointly large language models (LLMs) and large visual models (LVMs). To address these issues, we propose DeepMEL, a novel framework based on multi-agent collaborative reasoning, which achieves efficient alignment and disambiguation of textual and visual modalities through a role-specialized division strategy. DeepMEL integrates four specialized agents, namely Modal-Fuser, Candidate-Adapter, Entity-Clozer and Role-Orchestrator, to complete end-to-end cross-modal linking through specialized roles and dynamic coordination. DeepMEL adopts a dual-modal alignment path, and combines the fine-grained text semantics generated by the LLM with the structured image representation extracted by the LVM, significantly narrowing the modal gap. We design an adaptive iteration strategy, combines tool-based retrieval and semantic reasoning capabilities to dynamically optimize the candidate set and balance recall and precision. DeepMEL also unifies MEL tasks into a structured cloze prompt to reduce parsing complexity and enhance semantic comprehension. Extensive experiments on five public benchmark datasets demonstrate that DeepMEL achieves state-of-the-art performance, improving ACC by 1%-57%. Ablation studies verify the effectiveness of all modules.
Cognitive Agents Powered by Large Language Models for Agile Software Project Management
This paper investigates the integration of cognitive agents powered by Large Language Models (LLMs) within the Scaled Agile Framework (SAFe) to reinforce software project management. By deploying virtual agents in simulated software environments, this study explores their potential to fulfill fundamental roles in IT project development, thereby optimizing project outcomes through intelligent automation. Particular emphasis is placed on the adaptability of these agents to Agile methodologies and their transformative impact on decision-making, problem-solving, and collaboration dynamics. The research leverages the CogniSim ecosystem, a platform designed to simulate real-world software engineering challenges, such as aligning technical capabilities with business objectives, managing interdependencies, and maintaining project agility. Through iterative simulations, cognitive agents demonstrate advanced capabilities in task delegation, inter-agent communication, and project lifecycle management. By employing natural language processing to facilitate meaningful dialogues, these agents emulate human roles and improve the efficiency and precision of Agile practices. Key findings from this investigation highlight the ability of LLM-powered cognitive agents to deliver measurable improvements in various metrics, including task completion times, quality of deliverables, and communication coherence. These agents exhibit scalability and adaptability, ensuring their applicability across diverse and complex project environments. This study underscores the potential of integrating LLM-powered agents into Agile project management frameworks as a means of advancing software engineering practices. This integration not only refines the execution of project management tasks but also sets the stage for a paradigm shift in how teams collaborate and address emerging challenges.
On the $h$-majority dynamics with many opinions
We present the first upper bound on the convergence time to consensus of the well-known $h$-majority dynamics with $k$ opinions, in the synchronous setting, for $h$ and $k$ that are both non-constant values. We suppose that, at the beginning of the process, there is some initial additive bias towards some plurality opinion, that is, there is an opinion that is supported by $x$ nodes while any other opinion is supported by strictly fewer nodes. We prove that, with high probability, if the bias is $\omega(\sqrt{x})$ and the initial plurality opinion is supported by at least $x = \omega(\log n)$ nodes, then the process converges to plurality consensus in $O(\log n)$ rounds whenever $h = \omega(n \log n / x)$. A main corollary is the following: if $k = o(n / \log n)$ and the process starts from an almost-balanced configuration with an initial bias of magnitude $\omega(\sqrt{n/k})$ towards the initial plurality opinion, then any function $h = \omega(k \log n)$ suffices to guarantee convergence to consensus in $O(\log n)$ rounds, with high probability. Our upper bound shows that the lower bound of $\Omega(k / h^2)$ rounds to reach consensus given by Becchetti et al. (2017) cannot be pushed further than $\widetilde{\Omega}(k / h)$. Moreover, the bias we require is asymptotically smaller than the $\Omega(\sqrt{n\log n})$ bias that guarantees plurality consensus in the $3$-majority dynamics: in our case, the required bias is at most any (arbitrarily small) function in $\omega(\sqrt{x})$ for any value of $k \ge 2$.
Exploring Modularity of Agentic Systems for Drug Discovery
Large-language models (LLMs) and agentic systems present exciting opportunities to accelerate drug discovery. In this study, we examine the modularity of LLM-based agentic systems for drug discovery, i.e., whether parts of the system such as the LLM and type of agent are interchangeable, a topic that has received limited attention in drug discovery. We compare the performance of different LLMs and the effectiveness of tool-calling agents versus code-generating agents. Our case study, comparing performance in orchestrating tools for chemistry and drug discovery using an LLM-as-a-judge score, shows that Claude-3.5-Sonnet, Claude-3.7-Sonnet and GPT-4o outperform alternative language models such as Llama-3.1-8B, Llama-3.1-70B, GPT-3.5-Turbo, and Nova-Micro. Although we confirm that code-generating agents outperform the tool-calling ones on average, we show that this is highly question- and model-dependent. Furthermore, the impact of replacing system prompts is dependent on the question and model, underscoring that even in this particular domain one cannot just replace components of the system without re-engineering. Our study highlights the necessity of further research into the modularity of agentic systems to enable the development of reliable and modular solutions for real-world problems.
Systems and Control (CS)
Understanding and Utilizing Dynamic Coupling in Free-Floating Space Manipulators for On-Orbit Servicing
This study proposes a dynamic coupling-informed trajectory optimization algorithm for free-floating space manipulator systems (SMSs). Dynamic coupling between the base and the manipulator arms plays a critical role in influencing the system's behavior. While prior research has predominantly focused on minimizing this coupling, often overlooking its potential advantages, this work investigates how dynamic coupling can instead be leveraged to improve trajectory planning. Singular value decomposition (SVD) of the dynamic coupling matrix is employed to identify the dominant components governing coupling behavior. A quantitative metric is then formulated to characterize the strength and directionality of the coupling and is incorporated into a trajectory optimization framework. To assess the feasibility of the optimized trajectory, a sliding mode control-based tracking controller is designed to generate the required joint torque inputs. Simulation results demonstrate that explicitly accounting for dynamic coupling in trajectory planning enables more informed and potentially more efficient operation, offering new directions for the control of free-floating SMSs.
comment: 17 pages, 7 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
A 16.28 ppm/°C Temperature Coefficient, 0.5V Low-Voltage CMOS Voltage Reference with Curvature Compensation
This paper presents a fully-integrated CMOS voltage reference designed in a 90 nm process node using low voltage threshold (LVT) transistor models. The voltage reference leverages subthreshold operation and near-weak inversion characteristics, backed by an all-region MOSFET model. The proposed design achieves a very low operating supply voltage of 0.5 V and a remarkably low temperature coefficient of 16.28 ppm/{\deg}C through the mutual compensation of CTAT, PTAT, and curvature-correction currents, over a wide range from -40 {\deg}C to 130 {\deg}C. A stable reference voltage of 205 mV is generated with a line sensitivity of 1.65 %/V and a power supply rejection ratio (PSRR) of -50 dB at 10 kHz. The circuit achieves all these parameters while maintaining a good power efficiency, consuming only 0.67 {\mu}W.
comment: 6 pages, 29th International Symposium on VLSI Design and Test (VDAT 2025)
A Central Chilled Water Plant Model for Designing Learning-Based Controllers
We describe a framework of modeling a central chilled water plant (CCWP) that consists of an aggregate cooling coil, a number of heterogeneous chillers and cooling towers, and a chilled water-based thermal energy storage system. We improve upon existing component models from the open literature using a constrained optimization-based framework to ensure that the models respect capacities of all the heat exchangers (cooling coils, chillers, and cooling towers) irrespective of the inputs provided. As a result, the proposed model has a wider range of validity compared to existing models; the latter can produce highly erroneous outputs when inputs are not within normal operating range. This feature is essential for training learning-based controllers that can choose inputs beyond normal operating conditions and is lacking in currently available models. The overall plant model is implemented in Matlab and is made publicly available. Simulation of a CCWP with closed loop control is provided as an illustration.
Synthesis and SOS-based Stability Verification of a Neural-Network-Based Controller for a Two-wheeled Inverted Pendulum
This work newly establishes the feasibility and practical value of a sum of squares (SOS)-based stability verification procedure for applied control problems utilizing neural-network-based controllers (NNCs). It successfully verifies closed-loop stability properties of a NNC synthesized using a generalizable procedure to imitate a robust, tube-based model predictive controller (MPC) for a two-wheeled inverted pendulum demonstrator system. This is achieved by first developing a state estimator and control-oriented model for the two-wheeled inverted pendulum. Next, this control-oriented model is used to synthesize a baseline linear-quadratic regulator (LQR) and a robust, tube-based MPC, which is computationally too demanding for real-time execution on the demonstrator system's embedded hardware. The generalizable synthesis procedure generates an NNC imitating the robust, tube-based MPC. Via an SOS-based stability verification procedure, a certificate of local asymptotic stability and a relevant inner estimate of the region of attraction (RoA) are obtained for the closed-loop system incorporating this NNC. Finally, experimental results on the physical two-wheeled inverted pendulum demonstrate that the NNC both stabilizes the system, and improves the control performance compared to the baseline LQR in both regulation and reference-tracking tasks.
comment: Submitted to the IEEE for possible publication, 16 pages, 10 figures
Data-Driven Abstraction and Synthesis for Stochastic Systems with Unknown Dynamics
We study the automated abstraction-based synthesis of correct-by-construction control policies for stochastic dynamical systems with unknown dynamics. Our approach is to learn an abstraction from sampled data, which is represented in the form of a finite Markov decision process (MDP). In this paper, we present a data-driven technique for constructing finite-state interval MDP (IMDP) abstractions of stochastic systems with unknown nonlinear dynamics. As a distinguishing and novel feature, our technique only requires (1) noisy state-input-state observations and (2) an upper bound on the system's Lipschitz constant. Combined with standard model-checking techniques, our IMDP abstractions enable the synthesis of policies that satisfy probabilistic temporal properties (such as "reach-while-avoid") with a predefined confidence. Our experimental results show the effectiveness and robustness of our approach.
Why we need a standardized state of health definition for electric vehicle battery packs -- a proposal for energy- and capacity-based metrics
Range and performance are key customer-relevant properties of electric vehicles. Both degrade over time due to battery aging, thus impacting business decisions throughout a vehicle's lifecycle, such as efficient utilization and asset valuation. For practical assessment, aging is often simplified into a single figure of merit - the state of health - typically defined by the battery pack's remaining capacity or energy. However, no standardized method for measuring the state of health at the vehicle level has been established, leaving both academia and industry without a clear consensus. Ultimately, standardization is crucial to increase transparency and build confidence in the long-term reliability of electric vehicles' battery packs. In this article, we propose a standard measurement procedure for assessing the capacity- and energy-based state of health, leveraging onboard charging to enable reproducibility and scalability. Additionally, we demonstrate how differential voltage analysis can provide deeper insights into battery aging at the vehicle level.
comment: 20 pages, 4 figures, 2 tables,
Jointly Computation- and Communication-Efficient Distributed Learning
We address distributed learning problems over undirected networks. Specifically, we focus on designing a novel ADMM-based algorithm that is jointly computation- and communication-efficient. Our design guarantees computational efficiency by allowing agents to use stochastic gradients during local training. Moreover, communication efficiency is achieved as follows: i) the agents perform multiple training epochs between communication rounds, and ii) compressed transmissions are used. We prove exact linear convergence of the algorithm in the strongly convex setting. We corroborate our theoretical results by numerical comparisons with state of the art techniques on a classification task.
comment: To be presented at 2025 IEEE Conference on Decision and Control
Control-Based Online Distributed Optimization
In this paper we design a novel class of online distributed optimization algorithms leveraging control theoretical techniques. We start by focusing on quadratic costs, and assuming to know an internal model of their variation. In this set-up, we formulate the algorithm design as a robust control problem, showing that it yields a fully distributed algorithm. We also provide a distributed routine to acquire the internal model. We show that the algorithm converges exactly to the sequence of optimal solutions. We empirically evaluate the performance of the algorithm for different choices of parameters. Additionally, we evaluate the performance of the algorithm for quadratic problems with inexact internal model and non-quadratic problems, and show that it outperforms alternative algorithms in both scenarios.
comment: To be presented at 2025 IEEE Conference on Decision and Control
A Solvable Molecular Switch Model for Stable Temporal Information Processing
This paper studies an input-driven one-state differential equation model initially developed for an experimentally demonstrated dynamic molecular switch that switches like synapses in the brain do. The linear-in-the-state and nonlinear-in-the-input model is exactly solvable, and it is shown that it also possesses mathematical properties of convergence and fading memory that enable stable processing of time-varying inputs by nonlinear dynamical systems. Thus, the model exhibits the co-existence of biologically-inspired behavior and desirable mathematical properties for stable learning on sequential data. The results give theoretical support for the use of the dynamic molecular switches as computational units in deep cascaded/layered feedforward and recurrent architectures as well as other more general structures for neuromorphic computing. They could also inspire more general exactly solvable models that can be fitted to emulate arbitrary physical devices which can mimic brain-inspired behaviour and perform stable computation on input signals.
comment: 21 pages, 6 figures, submitted for publication. Comments are welcome
Integrated Take-off Management and Trajectory Optimization for Merging Control in Urban Air Mobility Corridors
Urban Air Mobility (UAM) has the potential to revolutionize daily transportation, offering rapid and efficient aerial mobility services. Take-off and merging phases are critical for air corridor operations, requiring the coordination of take-off aircraft and corridor traffic while ensuring safety and seamless transition. This paper proposes an integrated take-off management and trajectory optimization for merging control in UAM corridors. We first introduce a novel take-off airspace design. To our knowledge, this paper is one of the first to propose a structured design for take-off airspace. Based on the take-off airspace design, we devise a hierarchical coordinated take-off and merging management (HCTMM) strategy. To be specific, the take-off airspace design can simplify aircraft dynamics and thus reduce the dimensionality of the trajectory optimization problem whilst mitigating obstacle avoidance complexities. The HCTMM strategy strictly ensures safety and improves the efficiency of take-off and merging operations. At the tactical level, a scheduling algorithm coordinates aircraft take-off times and selects dynamic merging points to reduce conflicts and ensure smooth take-off and merging processes. At the operational level, a trajectory optimization strategy ensures that each aircraft reaches the dynamic merging point efficiently while satisfying safety constraints. Simulation results show that, compared to representative strategies with fixed or dynamic merging points, the HCTMM strategy significantly improves operational efficiency and reduces computational burden, while ensuring safety under various corridor traffic conditions. Further results confirm the scalability of the HCTMM strategy and the computational efficiency enabled by the proposed take-off airspace design.
comment: 31 pages
Locally Differentially Private Multi-Sensor Fusion Estimation With System Intrinsic Randomness
This paper focuses on the privacy-preserving multi-sensor fusion estimation (MSFE) problem with differential privacy considerations. Most existing research efforts are directed towards the exploration of traditional differential privacy, also referred to as centralized differential privacy (CDP). It is important to note that CDP is tailored to protect the privacy of statistical data at fusion center such as averages and sums rather than individual data at sensors, which renders it inappropriate for MSFE. Additionally, the definitions and assumptions of CDP are primarily applicable for large-scale systems that require statistical results mentioned above. Therefore, to address these limitations, this paper introduces a more recent advancement known as \emph{local differential privacy (LDP)} to enhance the privacy of MSFE. We provide some rigorous definitions about LDP based on the intrinsic properties of MSFE rather than directly presenting the assumptions under CDP. Subsequently, the LDP is proved to be realized with system intrinsic randomness, which is useful and has never been considered before. Furthermore, the Gaussian mechanism is designed when the intrinsic randomness is insufficient. The lower bound of the covariance for extra injected Gaussian noises is determined by integrating system information with privacy budgets. Moreover, the optimal fusion estimators under intrinsic and extra disturbances are respectively designed in the linear minimum variance sense. Finally, the effectiveness of the proposed methods is verified through numerical simulations, encompassing both one-dimensional and high-dimensional scenarios.
comment: 12 pages, 5 figures
On the Performance of Linear Adaptive Filters driven by the Ergodic Chaotic Logistic Map
Chaotic dynamical systems are increasingly considered for use in coding and transmission systems. This stems from their parameter sensitivity and spectral characteristics. The latter are relevant for channel estimation methods. In particular the logistic map $f_\lambda =\lambda x\left( 1-x\right) $ has been employed in chaotic coding and spread spectrum transmission systems. For $\lambda =4$ the statistical properties of sequences generated by $f_4$ are considered as ideal drive signals for channel estimation schemes. This assumption is proven in the present paper. To this end the higher order statistical moments and the autocorrelation of time series generated by $f_4$ are derived. It is shown that for $\lambda =4$ the zero mean time series is uncorrelated. The adaptation performance of finite impulse response (FIR) digital adaptive filters (DAF) used for channel estimation is analyzed. It is shown that using zero mean sequences of $f_4$ leads to the maximal possible FIR DAF performance. An optimal value for the damping parameter in the LMS scheme is derived that leads to the maximal performance and ensures stability. The analytic considerations are confirmed by simulation results.
A 16.28 ppm/$^\circ$C Temperature Coefficient, 0.5V Low-Voltage CMOS Voltage Reference with Curvature Compensation
This paper presents a fully-integrated CMOS voltage reference designed in a 90 nm process node using low voltage threshold (LVT) transistor models. The voltage reference leverages subthreshold operation and near-weak inversion characteristics, backed by an all-region MOSFET model. The proposed design achieves a very low operating supply voltage of 0.5 V and a remarkably low temperature coefficient of 16.28 ppm/$^\circ$C through the mutual compensation of CTAT, PTAT, and curvature-correction currents, over a wide range from -40 $^\circ$C to 130 $^\circ$C. A stable reference voltage of 205 mV is generated with a line sensitivity of 1.65 %/V and a power supply rejection ratio (PSRR) of -50 dB at 10 kHz. The circuit achieves all these parameters while maintaining a good power efficiency, consuming only 0.67 $\mu$W.
comment: 6 pages, 29th International Symposium on VLSI Design and Test (VDAT 2025)
Vector preference-based contextual bandits under distributional shifts
We consider contextual bandit learning under distribution shift when reward vectors are ordered according to a given preference cone. We propose an adaptive-discretization and optimistic elimination based policy that self-tunes to the underlying distribution shift. To measure the performance of this policy, we introduce the notion of preference-based regret which measures the performance of a policy in terms of distance between Pareto fronts. We study the performance of this policy by establishing upper bounds on its regret under various assumptions on the nature of distribution shift. Our regret bounds generalize known results for the existing case of no distribution shift and vectorial reward settings, and scale gracefully with problem parameters in presence of distribution shifts.
Advancing rail safety: An onboard measurement system of rolling stock wheel flange wear based on dynamic machine learning algorithms
Rail and wheel interaction functionality is pivotal to the railway system safety, requiring accurate measurement systems for optimal safety monitoring operation. This paper introduces an innovative onboard measurement system for monitoring wheel flange wear depth, utilizing displacement and temperature sensors. Laboratory experiments are conducted to emulate wheel flange wear depth and surrounding temperature fluctuations in different periods of time. Employing collected data, the training of machine learning algorithms that are based on regression models, is dynamically automated. Further experimentation results, using standards procedures, validate the system's efficacy. To enhance accuracy, an infinite impulse response filter (IIR) that mitigates vehicle dynamics and sensor noise is designed. Filter parameters were computed based on specifications derived from a Fast Fourier Transform analysis of locomotive simulations and emulation experiments data. The results show that the dynamic machine learning algorithm effectively counter sensor nonlinear response to temperature effects, achieving an accuracy of 96.5 %, with a minimal runtime. The real-time noise reduction via IIR filter enhances the accuracy up to 98.2 %. Integrated with railway communication embedded systems such as Internet of Things devices, this advanced monitoring system offers unparalleled real-time insights into wheel flange wear and track irregular conditions that cause it, ensuring heightened safety and efficiency in railway systems operations.
comment: Journal article published in Transportation Research Record: The Journal of Transportation Research Board
Solving Three-phase AC Infeasibility Analysis to Near-zero Optimality Gap
Recent works have shown the use of equivalent circuit-based infeasibility analysis to identify weak locations in distribution power grids. For three-phase power flow problems, when the power flow solver diverges, three-phase infeasibility analysis (TPIA) can converge and identify weak locations. The original TPIA problem is non-convex, and local minima and saddle points are possible. This can result in grid upgrades that are sub-optimal. To address this issue, we reformulate the original non-convex nonlinear program (NLP) as an exact non-convex bilinear program (BLP). Subsequently, we apply the spatial branch-and-bound (SBnB) algorithm to compute a solution with near-zero optimality gap. To improve SBnB performance, we introduce a bound tightening algorithm with variable filtering and decomposition, which tightens bounds on bilinear variables. We demonstrate that sequential bound tightening (SBT) significantly improves the efficiency and accuracy of Gurobi's SBnB algorithm. Our results show that the proposed method can solve large-scale three-phase infeasibility analysis problems with >5k nodes, achieving an optimality gap of less than 10e-4. Furthermore, we demonstrate that by utilizing the developed presolve routine for bounding, we can reduce the runtime of SBnB by up to 97%.
Konzepte zur Effizienzsteigerung von Traktionsmotoren in batterieelektrischen Fahrzeugen durch den Einsatz neuartiger teillastoptimierbarer Motor- und Invertertopologien
To increase the efficiency of future electric vehicles, it is crucial to reduce drivetrain losses in battery-powered vehicles. This enables either an increase in driving range or overall cost savings by reducing battery capacity while maintaining the same range. Harmonic motor losses account for an avoidable share of more than 30% of the total eDrive losses in standard B6-2L 300 kW iPMSM configurations. These losses result from high-frequency voltage distortion across the motor windings, which can be reduced through various approaches. Of great importance is the classification of cost-neutral and low-cost concepts for loss reduction. The following presents and categorizes approaches to loss reduction that have been developed by research and industry in recent years. In particular, novel part-load-capable motor and inverter concepts are introduced, which enable motor switching or multilevel operation to reduce harmonic losses in the part-load range.
comment: in German language, Published in the conference proceedings of Symposium Elektromagnetismus, 2025, February 27--28, K\"unzelsau, Germany (ISBN-Nr.: 978-3-943563-55-9)
Recursive Gaussian Process Regression with Integrated Monotonicity Assumptions for Control Applications
In this paper, we present an extension to the recursive Gaussian Process (RGP) regression that enables the satisfaction of inequality constraints and is well suited for a real-time execution in control applications. The soft inequality constraints are integrated by introducing an additional extended Kalman Filter (EKF) update step using pseudo-measurements. The sequential formulation of the algorithm and several developed heuristics ensure both the performance and a low computational effort of the algorithm. A special focus lies on an efficient consideration of monotonicity assumptions for GPs in the form of inequality constraints. The algorithm is statistically validated in simulations, where the possible advantages in comparison with the standard RGP algorithm become obvious. The paper is concluded with a successful experimental validation of the developed algorithm for the monotonicity-preserving learning of heat transfer values for the control of a vapor compression cycle evaporator, leveraging a previously published partial input output linearization (IOL).
comment: Accepted at ICINCO 2025 (22nd International Conference on Informatics in Control, Automation and Robotics)
Assessment of Power System Stability Considering Multiple Time-Scale Dynamics: Insights into Hopf Bifurcations in Presence of GFL and GFM IBRs
Real power systems exhibit dynamics that evolve across a wide range of time scales, from very fast to very slow phenomena. Historically, incorporating these wide-ranging dynamics into a single model has been impractical. As a result, power engineers rely on time-scale decomposition to simplify models. When fast phenomena are evaluated, slow dynamics are neglected (assumed stable), and vice versa. This paper challenges this paradigm by showing the importance of assessing power system stability while considering multiple time scales simultaneously. Using the concept of Hopf bifurcations, it exemplifies instability issues that would be missed if multi-time-scale dynamics are not considered. Although this work employs both grid-following and grid-forming inverter-based resource models, it is not a direct comparison. Instead, it presents a case study demonstrating how one technology can complement the other from a multi time-scale dynamics perspective.
comment: 7 pages
DCT-MARL: A Dynamic Communication Topology-Based MARL Algorithm for Connected Vehicle Platoon Control
With the rapid advancement of vehicular communication facilities and autonomous driving technologies, connected vehicle platooning has emerged as a promising approach to improve traffic efficiency and driving safety. Reliable Vehicle-to-Vehicle (V2V) communication is critical to achieving efficient cooperative control. However, in the real-world traffic environment, V2V communication may suffer from time-varying delay and packet loss, leading to degraded control performance and even safety risks. To mitigate the adverse effects of non-ideal communication, this paper proposes a Dynamic Communication Topology based Multi-Agent Reinforcement Learning (DCT-MARL) algorithm for robust cooperative platoon control. Specifically, the state space is augmented with historical control action and delay to enhance robustness against communication delay. To mitigate the impact of packet loss, a multi-key gated communication mechanism is introduced, which dynamically adjusts the communication topology based on the correlation between vehicles and their current communication status. Simulation results demonstrate that the proposed DCT-MARL significantly outperforms state-of-the-art methods in terms of string stability and driving comfort, validating its superior robustness and effectiveness.
A "good regulator theorem" for embodied agents
In a classic paper, Conant and Ashby claimed that "every good regulator of a system must be a model of that system." Artificial Life has produced many examples of systems that perform tasks with apparently no model in sight; these suggest Conant and Ashby's theorem doesn't easily generalise beyond its restricted setup. Nevertheless, here we show that a similar intuition can be fleshed out in a different way: whenever an agent is able to perform a regulation task, it is possible for an observer to interpret it as having "beliefs" about its environment, which it "updates" in response to sensory input. This notion of belief updating provides a notion of model that is more sophisticated than Conant and Ashby's, as well as a theorem that is more broadly applicable. However, it necessitates a change in perspective, in that the observer plays an essential role in the theory: models are not a mere property of the system but are imposed on it from outside. Our theorem holds regardless of whether the system is regulating its environment in a classic control theory setup, or whether it's regulating its own internal state; the model is of its environment either way. The model might be trivial, however, and this is how the apparent counterexamples are resolved.
comment: Accepted at the Artificial Life conference 2025 (ALife 2025). 10 pages, 1 figure
Online Convex Optimization and Integral Quadratic Constraints: An automated approach to regret analysis
We propose a novel approach for analyzing dynamic regret of first-order constrained online convex optimization algorithms for strongly convex and Lipschitz-smooth objectives. Crucially, we provide a general analysis that is applicable to a wide range of first-order algorithms that can be expressed as an interconnection of a linear dynamical system in feedback with a first-order oracle. By leveraging Integral Quadratic Constraints (IQCs), we derive a semi-definite program which, when feasible, provides a regret guarantee for the online algorithm. For this, the concept of variational IQCs is introduced as the generalization of IQCs to time-varying monotone operators. Our bounds capture the temporal rate of change of the problem in the form of the path length of the time-varying minimizer and the objective function variation. In contrast to standard results in OCO, our results do not require nerither the assumption of gradient boundedness, nor that of a bounded feasible set. Numerical analyses showcase the ability of the approach to capture the dependence of the regret on the function class condition number.
comment: Published in the 64th IEEE Conference on Decision and Control, 2025
Linear time-and-space-invariant relaxation systems
This paper generalizes the physical property of relaxation from linear time-invariant (LTI) to linear time-and-space-invariant (LTSI) systems. It is shown that the defining features of relaxation -- complete monotonicity, passivity, and memory-based storage -- carry over seamlessly to the spatio-temporal domain. An LTSI system is shown to be of relaxation type if and only if its associated spatio-temporal Hankel operator is cyclically monotone. This implies the existence of an intrinsic quadratic storage functional defined uniquely by past inputs, independently of any state-space realization. As in the LTI case, LTSI relaxation systems are shown to be those systems for which the state-space concept of storage coincides with the input-output concept of fading memory functional.
Integrating Grid impedance estimation method into Advanced Angle Estimation Kalman Filter in GFL inverter
The growing integration of power electronic converter-interfaced distributed energy resources into modern power systems presents significant challenges for system monitoring, protection, and control. Grid impedance plays a critical role in the operation and stability assessment of grid-connected inverter systems. This study presents a real-time grid impedance estimation method based on the Discrete Fourier Transform. The proposed method is integrated with the Advanced Angle Estimation Kalman Filter using a Linear Quadratic Regulator current controller (AAEKF-LQR), assisting the use of impedance information for accurate instantaneous phase angle estimation. Simulation results confirm that the proposed impedance estimation method interacts effectively with the AAEKF-LQR controller, maintaining stable system performance under weak grid conditions. The approach also demonstrates the ability to deliver fast and accurate impedance estimation during operational variations in grid conditions, thereby supporting stable inverter operation.
comment: 8 pages, 6 figures
A Time Splitting Based Optimization Method for Nonlinear MHE
Moving Horizon Estimation~(MHE) is essentially an optimization-based approach designed to estimate the states of dynamic systems within a moving time horizon. Traditional MHE solutions become computationally prohibitive due to the \textit{curse of dimensionality} arising from increasing problem complexity and growing length of time horizon. To address this issue, we propose novel computationally efficient algorithms for solving nonlinear MHE problems. Specifically, we first introduce a distributed reformulation utilizing a time-splitting technique. Leveraging this reformulation, we develop the Efficient Gauss-Newton Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) to achieve computational efficiency. Additionally, to accommodate limited computational capabilities inherent in some sub-problem solvers, we propose the Efficient Sensitivity Assisted ALADIN, which enables sub-problems to be solved inexactly without hindering computational efficiency. Furthermore, recognizing scenarios where sub-problem solvers possess no computational power, we propose a Distributed Sequential Quadratic Programming (SQP) that relies solely on first- and second-order information of local objective functions. We demonstrate the performance and advantages of our proposed methods through numerical experiments on differential drive robots case, a practical nonlinear MHE problem. Our results demonstrate that the three proposed algorithms achieve computational efficiency while preserving high accuracy, thereby satisfying the real-time requirements of MHE.
A Survey of Foundation Models for IoT: Taxonomy and Criteria-Based Analysis
Foundation models have gained growing interest in the IoT domain due to their reduced reliance on labeled data and strong generalizability across tasks, which address key limitations of traditional machine learning approaches. However, most existing foundation model based methods are developed for specific IoT tasks, making it difficult to compare approaches across IoT domains and limiting guidance for applying them to new tasks. This survey aims to bridge this gap by providing a comprehensive overview of current methodologies and organizing them around four shared performance objectives by different domains: efficiency, context-awareness, safety, and security & privacy. For each objective, we review representative works, summarize commonly-used techniques and evaluation metrics. This objective-centric organization enables meaningful cross-domain comparisons and offers practical insights for selecting and designing foundation model based solutions for new IoT tasks. We conclude with key directions for future research to guide both practitioners and researchers in advancing the use of foundation models in IoT applications.
comment: Accepted by CCF Transactions on Pervasive Computing and Interaction
Polytope Volume Monitoring Problem: Formulation and Solution via Parametric Linear Program Based Control Barrier Function
Motivated by the latest research on feasible space monitoring of multiple control barrier functions (CBFs) as well as polytopic collision avoidance, this paper studies the Polytope Volume Monitoring (PVM) problem, whose goal is to design a control law for inputs of nonlinear systems to prevent the volume of some state-dependent polytope from decreasing to zero. Recent studies have explored the idea of applying Chebyshev ball method in optimization theory to solve the case study of PVM; however, the underlying difficulties caused by nonsmoothness have not been addressed. This paper continues the study on this topic, where our main contribution is to establish the relationship between nonsmooth CBF and parametric optimization theory through directional derivatives for the first time, to solve PVM problems more conveniently. In detail, inspired by Chebyshev ball approach, a parametric linear program (PLP) based nonsmooth barrier function candidate is established for PVM, and then, sufficient conditions for it to be a nonsmooth CBF are proposed, based on which a quadratic program (QP) based safety filter with guaranteed feasibility is proposed to address PVM problems. Finally, a numerical simulation example is given to show the efficiency of the proposed safety filter.
comment: An extension version of the accepted CDC2025
Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, the rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to effectively distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20%, 1.28%, and 8% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety, critical environments where reliability is paramount.
Non-asymptotic Error Analysis of Subspace Identification for Deterministic Systems
The subspace identification method (SIM) has been extensively employed in the identification of discrete-time multiple-input multiple-output (MIMO) linear time-invariant (LTI) systems. This paper focuses on the analysis of perturbation errors for the system matrices in state-space models and the corresponding system poles, under two unified SIMs, based on a single finite-length input/output sample trajectory. Specifically, we derive non-asymptotic upper bounds on these errors, providing a unified perspective across various SIM variants. Furthermore, we prove that SIMs become ill-conditioned for MIMO systems when the state-to-output dimensionality ratio $n/m$ is large, regardless of system parameters. Finally, numerical experiments are conducted to validate the non-asymptotic results and the ill-conditionedness of SIMs.
comment: In Assumption 1, the assumption regarding the matrix A is found to be unreasonable and needs correction. Additionally, the bound curve in Figure 3b is inaccurate and requires revision. To address these issues, I am requesting the withdrawal of this version (v2) and plan to re-upload a corrected version
Data-conforming data-driven control: avoiding premature generalizations beyond data
Data-driven and adaptive control approaches face the problem of introducing sudden distributional shifts beyond the distribution of data encountered during learning. Therefore, they are prone to invalidating the very assumptions used in their own construction. This is due to the linearity of the underlying system, inherently assumed and formulated in most data-driven control approaches, which may falsely generalize the behavior of the system beyond the behavior experienced in the data. This paper seeks to mitigate these problems by enforcing consistency of the newly designed closed-loop systems with data and slowing down any distributional shifts in the joint state-input space. This is achieved through incorporating affine regularization terms and linear matrix inequality constraints to data-driven approaches, resulting in convex semi-definite programs that can be efficiently solved by standard software packages. We discuss the optimality conditions of these programs and then conclude the paper with a numerical example that further highlights the problem of premature generalization beyond data and shows the effectiveness of our proposed approaches in enhancing the safety of data-driven control methods.
Systems and Control (EESS)
Understanding and Utilizing Dynamic Coupling in Free-Floating Space Manipulators for On-Orbit Servicing
This study proposes a dynamic coupling-informed trajectory optimization algorithm for free-floating space manipulator systems (SMSs). Dynamic coupling between the base and the manipulator arms plays a critical role in influencing the system's behavior. While prior research has predominantly focused on minimizing this coupling, often overlooking its potential advantages, this work investigates how dynamic coupling can instead be leveraged to improve trajectory planning. Singular value decomposition (SVD) of the dynamic coupling matrix is employed to identify the dominant components governing coupling behavior. A quantitative metric is then formulated to characterize the strength and directionality of the coupling and is incorporated into a trajectory optimization framework. To assess the feasibility of the optimized trajectory, a sliding mode control-based tracking controller is designed to generate the required joint torque inputs. Simulation results demonstrate that explicitly accounting for dynamic coupling in trajectory planning enables more informed and potentially more efficient operation, offering new directions for the control of free-floating SMSs.
comment: 17 pages, 7 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
A 16.28 ppm/°C Temperature Coefficient, 0.5V Low-Voltage CMOS Voltage Reference with Curvature Compensation
This paper presents a fully-integrated CMOS voltage reference designed in a 90 nm process node using low voltage threshold (LVT) transistor models. The voltage reference leverages subthreshold operation and near-weak inversion characteristics, backed by an all-region MOSFET model. The proposed design achieves a very low operating supply voltage of 0.5 V and a remarkably low temperature coefficient of 16.28 ppm/{\deg}C through the mutual compensation of CTAT, PTAT, and curvature-correction currents, over a wide range from -40 {\deg}C to 130 {\deg}C. A stable reference voltage of 205 mV is generated with a line sensitivity of 1.65 %/V and a power supply rejection ratio (PSRR) of -50 dB at 10 kHz. The circuit achieves all these parameters while maintaining a good power efficiency, consuming only 0.67 {\mu}W.
comment: 6 pages, 29th International Symposium on VLSI Design and Test (VDAT 2025)
A Central Chilled Water Plant Model for Designing Learning-Based Controllers
We describe a framework of modeling a central chilled water plant (CCWP) that consists of an aggregate cooling coil, a number of heterogeneous chillers and cooling towers, and a chilled water-based thermal energy storage system. We improve upon existing component models from the open literature using a constrained optimization-based framework to ensure that the models respect capacities of all the heat exchangers (cooling coils, chillers, and cooling towers) irrespective of the inputs provided. As a result, the proposed model has a wider range of validity compared to existing models; the latter can produce highly erroneous outputs when inputs are not within normal operating range. This feature is essential for training learning-based controllers that can choose inputs beyond normal operating conditions and is lacking in currently available models. The overall plant model is implemented in Matlab and is made publicly available. Simulation of a CCWP with closed loop control is provided as an illustration.
Synthesis and SOS-based Stability Verification of a Neural-Network-Based Controller for a Two-wheeled Inverted Pendulum
This work newly establishes the feasibility and practical value of a sum of squares (SOS)-based stability verification procedure for applied control problems utilizing neural-network-based controllers (NNCs). It successfully verifies closed-loop stability properties of a NNC synthesized using a generalizable procedure to imitate a robust, tube-based model predictive controller (MPC) for a two-wheeled inverted pendulum demonstrator system. This is achieved by first developing a state estimator and control-oriented model for the two-wheeled inverted pendulum. Next, this control-oriented model is used to synthesize a baseline linear-quadratic regulator (LQR) and a robust, tube-based MPC, which is computationally too demanding for real-time execution on the demonstrator system's embedded hardware. The generalizable synthesis procedure generates an NNC imitating the robust, tube-based MPC. Via an SOS-based stability verification procedure, a certificate of local asymptotic stability and a relevant inner estimate of the region of attraction (RoA) are obtained for the closed-loop system incorporating this NNC. Finally, experimental results on the physical two-wheeled inverted pendulum demonstrate that the NNC both stabilizes the system, and improves the control performance compared to the baseline LQR in both regulation and reference-tracking tasks.
comment: Submitted to the IEEE for possible publication, 16 pages, 10 figures
Data-Driven Abstraction and Synthesis for Stochastic Systems with Unknown Dynamics
We study the automated abstraction-based synthesis of correct-by-construction control policies for stochastic dynamical systems with unknown dynamics. Our approach is to learn an abstraction from sampled data, which is represented in the form of a finite Markov decision process (MDP). In this paper, we present a data-driven technique for constructing finite-state interval MDP (IMDP) abstractions of stochastic systems with unknown nonlinear dynamics. As a distinguishing and novel feature, our technique only requires (1) noisy state-input-state observations and (2) an upper bound on the system's Lipschitz constant. Combined with standard model-checking techniques, our IMDP abstractions enable the synthesis of policies that satisfy probabilistic temporal properties (such as "reach-while-avoid") with a predefined confidence. Our experimental results show the effectiveness and robustness of our approach.
Why we need a standardized state of health definition for electric vehicle battery packs -- a proposal for energy- and capacity-based metrics
Range and performance are key customer-relevant properties of electric vehicles. Both degrade over time due to battery aging, thus impacting business decisions throughout a vehicle's lifecycle, such as efficient utilization and asset valuation. For practical assessment, aging is often simplified into a single figure of merit - the state of health - typically defined by the battery pack's remaining capacity or energy. However, no standardized method for measuring the state of health at the vehicle level has been established, leaving both academia and industry without a clear consensus. Ultimately, standardization is crucial to increase transparency and build confidence in the long-term reliability of electric vehicles' battery packs. In this article, we propose a standard measurement procedure for assessing the capacity- and energy-based state of health, leveraging onboard charging to enable reproducibility and scalability. Additionally, we demonstrate how differential voltage analysis can provide deeper insights into battery aging at the vehicle level.
comment: 20 pages, 4 figures, 2 tables,
Jointly Computation- and Communication-Efficient Distributed Learning
We address distributed learning problems over undirected networks. Specifically, we focus on designing a novel ADMM-based algorithm that is jointly computation- and communication-efficient. Our design guarantees computational efficiency by allowing agents to use stochastic gradients during local training. Moreover, communication efficiency is achieved as follows: i) the agents perform multiple training epochs between communication rounds, and ii) compressed transmissions are used. We prove exact linear convergence of the algorithm in the strongly convex setting. We corroborate our theoretical results by numerical comparisons with state of the art techniques on a classification task.
comment: To be presented at 2025 IEEE Conference on Decision and Control
Control-Based Online Distributed Optimization
In this paper we design a novel class of online distributed optimization algorithms leveraging control theoretical techniques. We start by focusing on quadratic costs, and assuming to know an internal model of their variation. In this set-up, we formulate the algorithm design as a robust control problem, showing that it yields a fully distributed algorithm. We also provide a distributed routine to acquire the internal model. We show that the algorithm converges exactly to the sequence of optimal solutions. We empirically evaluate the performance of the algorithm for different choices of parameters. Additionally, we evaluate the performance of the algorithm for quadratic problems with inexact internal model and non-quadratic problems, and show that it outperforms alternative algorithms in both scenarios.
comment: To be presented at 2025 IEEE Conference on Decision and Control
A Solvable Molecular Switch Model for Stable Temporal Information Processing
This paper studies an input-driven one-state differential equation model initially developed for an experimentally demonstrated dynamic molecular switch that switches like synapses in the brain do. The linear-in-the-state and nonlinear-in-the-input model is exactly solvable, and it is shown that it also possesses mathematical properties of convergence and fading memory that enable stable processing of time-varying inputs by nonlinear dynamical systems. Thus, the model exhibits the co-existence of biologically-inspired behavior and desirable mathematical properties for stable learning on sequential data. The results give theoretical support for the use of the dynamic molecular switches as computational units in deep cascaded/layered feedforward and recurrent architectures as well as other more general structures for neuromorphic computing. They could also inspire more general exactly solvable models that can be fitted to emulate arbitrary physical devices which can mimic brain-inspired behaviour and perform stable computation on input signals.
comment: 21 pages, 6 figures, submitted for publication. Comments are welcome
Integrated Take-off Management and Trajectory Optimization for Merging Control in Urban Air Mobility Corridors
Urban Air Mobility (UAM) has the potential to revolutionize daily transportation, offering rapid and efficient aerial mobility services. Take-off and merging phases are critical for air corridor operations, requiring the coordination of take-off aircraft and corridor traffic while ensuring safety and seamless transition. This paper proposes an integrated take-off management and trajectory optimization for merging control in UAM corridors. We first introduce a novel take-off airspace design. To our knowledge, this paper is one of the first to propose a structured design for take-off airspace. Based on the take-off airspace design, we devise a hierarchical coordinated take-off and merging management (HCTMM) strategy. To be specific, the take-off airspace design can simplify aircraft dynamics and thus reduce the dimensionality of the trajectory optimization problem whilst mitigating obstacle avoidance complexities. The HCTMM strategy strictly ensures safety and improves the efficiency of take-off and merging operations. At the tactical level, a scheduling algorithm coordinates aircraft take-off times and selects dynamic merging points to reduce conflicts and ensure smooth take-off and merging processes. At the operational level, a trajectory optimization strategy ensures that each aircraft reaches the dynamic merging point efficiently while satisfying safety constraints. Simulation results show that, compared to representative strategies with fixed or dynamic merging points, the HCTMM strategy significantly improves operational efficiency and reduces computational burden, while ensuring safety under various corridor traffic conditions. Further results confirm the scalability of the HCTMM strategy and the computational efficiency enabled by the proposed take-off airspace design.
comment: 31 pages
Locally Differentially Private Multi-Sensor Fusion Estimation With System Intrinsic Randomness
This paper focuses on the privacy-preserving multi-sensor fusion estimation (MSFE) problem with differential privacy considerations. Most existing research efforts are directed towards the exploration of traditional differential privacy, also referred to as centralized differential privacy (CDP). It is important to note that CDP is tailored to protect the privacy of statistical data at fusion center such as averages and sums rather than individual data at sensors, which renders it inappropriate for MSFE. Additionally, the definitions and assumptions of CDP are primarily applicable for large-scale systems that require statistical results mentioned above. Therefore, to address these limitations, this paper introduces a more recent advancement known as \emph{local differential privacy (LDP)} to enhance the privacy of MSFE. We provide some rigorous definitions about LDP based on the intrinsic properties of MSFE rather than directly presenting the assumptions under CDP. Subsequently, the LDP is proved to be realized with system intrinsic randomness, which is useful and has never been considered before. Furthermore, the Gaussian mechanism is designed when the intrinsic randomness is insufficient. The lower bound of the covariance for extra injected Gaussian noises is determined by integrating system information with privacy budgets. Moreover, the optimal fusion estimators under intrinsic and extra disturbances are respectively designed in the linear minimum variance sense. Finally, the effectiveness of the proposed methods is verified through numerical simulations, encompassing both one-dimensional and high-dimensional scenarios.
comment: 12 pages, 5 figures
On the Performance of Linear Adaptive Filters driven by the Ergodic Chaotic Logistic Map
Chaotic dynamical systems are increasingly considered for use in coding and transmission systems. This stems from their parameter sensitivity and spectral characteristics. The latter are relevant for channel estimation methods. In particular the logistic map $f_\lambda =\lambda x\left( 1-x\right) $ has been employed in chaotic coding and spread spectrum transmission systems. For $\lambda =4$ the statistical properties of sequences generated by $f_4$ are considered as ideal drive signals for channel estimation schemes. This assumption is proven in the present paper. To this end the higher order statistical moments and the autocorrelation of time series generated by $f_4$ are derived. It is shown that for $\lambda =4$ the zero mean time series is uncorrelated. The adaptation performance of finite impulse response (FIR) digital adaptive filters (DAF) used for channel estimation is analyzed. It is shown that using zero mean sequences of $f_4$ leads to the maximal possible FIR DAF performance. An optimal value for the damping parameter in the LMS scheme is derived that leads to the maximal performance and ensures stability. The analytic considerations are confirmed by simulation results.
A 16.28 ppm/$^\circ$C Temperature Coefficient, 0.5V Low-Voltage CMOS Voltage Reference with Curvature Compensation
This paper presents a fully-integrated CMOS voltage reference designed in a 90 nm process node using low voltage threshold (LVT) transistor models. The voltage reference leverages subthreshold operation and near-weak inversion characteristics, backed by an all-region MOSFET model. The proposed design achieves a very low operating supply voltage of 0.5 V and a remarkably low temperature coefficient of 16.28 ppm/$^\circ$C through the mutual compensation of CTAT, PTAT, and curvature-correction currents, over a wide range from -40 $^\circ$C to 130 $^\circ$C. A stable reference voltage of 205 mV is generated with a line sensitivity of 1.65 %/V and a power supply rejection ratio (PSRR) of -50 dB at 10 kHz. The circuit achieves all these parameters while maintaining a good power efficiency, consuming only 0.67 $\mu$W.
comment: 6 pages, 29th International Symposium on VLSI Design and Test (VDAT 2025)
Vector preference-based contextual bandits under distributional shifts
We consider contextual bandit learning under distribution shift when reward vectors are ordered according to a given preference cone. We propose an adaptive-discretization and optimistic elimination based policy that self-tunes to the underlying distribution shift. To measure the performance of this policy, we introduce the notion of preference-based regret which measures the performance of a policy in terms of distance between Pareto fronts. We study the performance of this policy by establishing upper bounds on its regret under various assumptions on the nature of distribution shift. Our regret bounds generalize known results for the existing case of no distribution shift and vectorial reward settings, and scale gracefully with problem parameters in presence of distribution shifts.
Advancing rail safety: An onboard measurement system of rolling stock wheel flange wear based on dynamic machine learning algorithms
Rail and wheel interaction functionality is pivotal to the railway system safety, requiring accurate measurement systems for optimal safety monitoring operation. This paper introduces an innovative onboard measurement system for monitoring wheel flange wear depth, utilizing displacement and temperature sensors. Laboratory experiments are conducted to emulate wheel flange wear depth and surrounding temperature fluctuations in different periods of time. Employing collected data, the training of machine learning algorithms that are based on regression models, is dynamically automated. Further experimentation results, using standards procedures, validate the system's efficacy. To enhance accuracy, an infinite impulse response filter (IIR) that mitigates vehicle dynamics and sensor noise is designed. Filter parameters were computed based on specifications derived from a Fast Fourier Transform analysis of locomotive simulations and emulation experiments data. The results show that the dynamic machine learning algorithm effectively counter sensor nonlinear response to temperature effects, achieving an accuracy of 96.5 %, with a minimal runtime. The real-time noise reduction via IIR filter enhances the accuracy up to 98.2 %. Integrated with railway communication embedded systems such as Internet of Things devices, this advanced monitoring system offers unparalleled real-time insights into wheel flange wear and track irregular conditions that cause it, ensuring heightened safety and efficiency in railway systems operations.
comment: Journal article published in Transportation Research Record: The Journal of Transportation Research Board
Solving Three-phase AC Infeasibility Analysis to Near-zero Optimality Gap
Recent works have shown the use of equivalent circuit-based infeasibility analysis to identify weak locations in distribution power grids. For three-phase power flow problems, when the power flow solver diverges, three-phase infeasibility analysis (TPIA) can converge and identify weak locations. The original TPIA problem is non-convex, and local minima and saddle points are possible. This can result in grid upgrades that are sub-optimal. To address this issue, we reformulate the original non-convex nonlinear program (NLP) as an exact non-convex bilinear program (BLP). Subsequently, we apply the spatial branch-and-bound (SBnB) algorithm to compute a solution with near-zero optimality gap. To improve SBnB performance, we introduce a bound tightening algorithm with variable filtering and decomposition, which tightens bounds on bilinear variables. We demonstrate that sequential bound tightening (SBT) significantly improves the efficiency and accuracy of Gurobi's SBnB algorithm. Our results show that the proposed method can solve large-scale three-phase infeasibility analysis problems with >5k nodes, achieving an optimality gap of less than 10e-4. Furthermore, we demonstrate that by utilizing the developed presolve routine for bounding, we can reduce the runtime of SBnB by up to 97%.
Konzepte zur Effizienzsteigerung von Traktionsmotoren in batterieelektrischen Fahrzeugen durch den Einsatz neuartiger teillastoptimierbarer Motor- und Invertertopologien
To increase the efficiency of future electric vehicles, it is crucial to reduce drivetrain losses in battery-powered vehicles. This enables either an increase in driving range or overall cost savings by reducing battery capacity while maintaining the same range. Harmonic motor losses account for an avoidable share of more than 30% of the total eDrive losses in standard B6-2L 300 kW iPMSM configurations. These losses result from high-frequency voltage distortion across the motor windings, which can be reduced through various approaches. Of great importance is the classification of cost-neutral and low-cost concepts for loss reduction. The following presents and categorizes approaches to loss reduction that have been developed by research and industry in recent years. In particular, novel part-load-capable motor and inverter concepts are introduced, which enable motor switching or multilevel operation to reduce harmonic losses in the part-load range.
comment: in German language, Published in the conference proceedings of Symposium Elektromagnetismus, 2025, February 27--28, K\"unzelsau, Germany (ISBN-Nr.: 978-3-943563-55-9)
Recursive Gaussian Process Regression with Integrated Monotonicity Assumptions for Control Applications
In this paper, we present an extension to the recursive Gaussian Process (RGP) regression that enables the satisfaction of inequality constraints and is well suited for a real-time execution in control applications. The soft inequality constraints are integrated by introducing an additional extended Kalman Filter (EKF) update step using pseudo-measurements. The sequential formulation of the algorithm and several developed heuristics ensure both the performance and a low computational effort of the algorithm. A special focus lies on an efficient consideration of monotonicity assumptions for GPs in the form of inequality constraints. The algorithm is statistically validated in simulations, where the possible advantages in comparison with the standard RGP algorithm become obvious. The paper is concluded with a successful experimental validation of the developed algorithm for the monotonicity-preserving learning of heat transfer values for the control of a vapor compression cycle evaporator, leveraging a previously published partial input output linearization (IOL).
comment: Accepted at ICINCO 2025 (22nd International Conference on Informatics in Control, Automation and Robotics)
Assessment of Power System Stability Considering Multiple Time-Scale Dynamics: Insights into Hopf Bifurcations in Presence of GFL and GFM IBRs
Real power systems exhibit dynamics that evolve across a wide range of time scales, from very fast to very slow phenomena. Historically, incorporating these wide-ranging dynamics into a single model has been impractical. As a result, power engineers rely on time-scale decomposition to simplify models. When fast phenomena are evaluated, slow dynamics are neglected (assumed stable), and vice versa. This paper challenges this paradigm by showing the importance of assessing power system stability while considering multiple time scales simultaneously. Using the concept of Hopf bifurcations, it exemplifies instability issues that would be missed if multi-time-scale dynamics are not considered. Although this work employs both grid-following and grid-forming inverter-based resource models, it is not a direct comparison. Instead, it presents a case study demonstrating how one technology can complement the other from a multi time-scale dynamics perspective.
comment: 7 pages
A "good regulator theorem" for embodied agents
In a classic paper, Conant and Ashby claimed that "every good regulator of a system must be a model of that system." Artificial Life has produced many examples of systems that perform tasks with apparently no model in sight; these suggest Conant and Ashby's theorem doesn't easily generalise beyond its restricted setup. Nevertheless, here we show that a similar intuition can be fleshed out in a different way: whenever an agent is able to perform a regulation task, it is possible for an observer to interpret it as having "beliefs" about its environment, which it "updates" in response to sensory input. This notion of belief updating provides a notion of model that is more sophisticated than Conant and Ashby's, as well as a theorem that is more broadly applicable. However, it necessitates a change in perspective, in that the observer plays an essential role in the theory: models are not a mere property of the system but are imposed on it from outside. Our theorem holds regardless of whether the system is regulating its environment in a classic control theory setup, or whether it's regulating its own internal state; the model is of its environment either way. The model might be trivial, however, and this is how the apparent counterexamples are resolved.
comment: Accepted at the Artificial Life conference 2025 (ALife 2025). 10 pages, 1 figure
Online Convex Optimization and Integral Quadratic Constraints: An automated approach to regret analysis
We propose a novel approach for analyzing dynamic regret of first-order constrained online convex optimization algorithms for strongly convex and Lipschitz-smooth objectives. Crucially, we provide a general analysis that is applicable to a wide range of first-order algorithms that can be expressed as an interconnection of a linear dynamical system in feedback with a first-order oracle. By leveraging Integral Quadratic Constraints (IQCs), we derive a semi-definite program which, when feasible, provides a regret guarantee for the online algorithm. For this, the concept of variational IQCs is introduced as the generalization of IQCs to time-varying monotone operators. Our bounds capture the temporal rate of change of the problem in the form of the path length of the time-varying minimizer and the objective function variation. In contrast to standard results in OCO, our results do not require nerither the assumption of gradient boundedness, nor that of a bounded feasible set. Numerical analyses showcase the ability of the approach to capture the dependence of the regret on the function class condition number.
comment: Published in the 64th IEEE Conference on Decision and Control, 2025
Linear time-and-space-invariant relaxation systems
This paper generalizes the physical property of relaxation from linear time-invariant (LTI) to linear time-and-space-invariant (LTSI) systems. It is shown that the defining features of relaxation -- complete monotonicity, passivity, and memory-based storage -- carry over seamlessly to the spatio-temporal domain. An LTSI system is shown to be of relaxation type if and only if its associated spatio-temporal Hankel operator is cyclically monotone. This implies the existence of an intrinsic quadratic storage functional defined uniquely by past inputs, independently of any state-space realization. As in the LTI case, LTSI relaxation systems are shown to be those systems for which the state-space concept of storage coincides with the input-output concept of fading memory functional.
Integrating Grid impedance estimation method into Advanced Angle Estimation Kalman Filter in GFL inverter
The growing integration of power electronic converter-interfaced distributed energy resources into modern power systems presents significant challenges for system monitoring, protection, and control. Grid impedance plays a critical role in the operation and stability assessment of grid-connected inverter systems. This study presents a real-time grid impedance estimation method based on the Discrete Fourier Transform. The proposed method is integrated with the Advanced Angle Estimation Kalman Filter using a Linear Quadratic Regulator current controller (AAEKF-LQR), assisting the use of impedance information for accurate instantaneous phase angle estimation. Simulation results confirm that the proposed impedance estimation method interacts effectively with the AAEKF-LQR controller, maintaining stable system performance under weak grid conditions. The approach also demonstrates the ability to deliver fast and accurate impedance estimation during operational variations in grid conditions, thereby supporting stable inverter operation.
comment: 8 pages, 6 figures
A Time Splitting Based Optimization Method for Nonlinear MHE
Moving Horizon Estimation~(MHE) is essentially an optimization-based approach designed to estimate the states of dynamic systems within a moving time horizon. Traditional MHE solutions become computationally prohibitive due to the \textit{curse of dimensionality} arising from increasing problem complexity and growing length of time horizon. To address this issue, we propose novel computationally efficient algorithms for solving nonlinear MHE problems. Specifically, we first introduce a distributed reformulation utilizing a time-splitting technique. Leveraging this reformulation, we develop the Efficient Gauss-Newton Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) to achieve computational efficiency. Additionally, to accommodate limited computational capabilities inherent in some sub-problem solvers, we propose the Efficient Sensitivity Assisted ALADIN, which enables sub-problems to be solved inexactly without hindering computational efficiency. Furthermore, recognizing scenarios where sub-problem solvers possess no computational power, we propose a Distributed Sequential Quadratic Programming (SQP) that relies solely on first- and second-order information of local objective functions. We demonstrate the performance and advantages of our proposed methods through numerical experiments on differential drive robots case, a practical nonlinear MHE problem. Our results demonstrate that the three proposed algorithms achieve computational efficiency while preserving high accuracy, thereby satisfying the real-time requirements of MHE.
A Survey of Foundation Models for IoT: Taxonomy and Criteria-Based Analysis
Foundation models have gained growing interest in the IoT domain due to their reduced reliance on labeled data and strong generalizability across tasks, which address key limitations of traditional machine learning approaches. However, most existing foundation model based methods are developed for specific IoT tasks, making it difficult to compare approaches across IoT domains and limiting guidance for applying them to new tasks. This survey aims to bridge this gap by providing a comprehensive overview of current methodologies and organizing them around four shared performance objectives by different domains: efficiency, context-awareness, safety, and security & privacy. For each objective, we review representative works, summarize commonly-used techniques and evaluation metrics. This objective-centric organization enables meaningful cross-domain comparisons and offers practical insights for selecting and designing foundation model based solutions for new IoT tasks. We conclude with key directions for future research to guide both practitioners and researchers in advancing the use of foundation models in IoT applications.
comment: Accepted by CCF Transactions on Pervasive Computing and Interaction
Polytope Volume Monitoring Problem: Formulation and Solution via Parametric Linear Program Based Control Barrier Function
Motivated by the latest research on feasible space monitoring of multiple control barrier functions (CBFs) as well as polytopic collision avoidance, this paper studies the Polytope Volume Monitoring (PVM) problem, whose goal is to design a control law for inputs of nonlinear systems to prevent the volume of some state-dependent polytope from decreasing to zero. Recent studies have explored the idea of applying Chebyshev ball method in optimization theory to solve the case study of PVM; however, the underlying difficulties caused by nonsmoothness have not been addressed. This paper continues the study on this topic, where our main contribution is to establish the relationship between nonsmooth CBF and parametric optimization theory through directional derivatives for the first time, to solve PVM problems more conveniently. In detail, inspired by Chebyshev ball approach, a parametric linear program (PLP) based nonsmooth barrier function candidate is established for PVM, and then, sufficient conditions for it to be a nonsmooth CBF are proposed, based on which a quadratic program (QP) based safety filter with guaranteed feasibility is proposed to address PVM problems. Finally, a numerical simulation example is given to show the efficiency of the proposed safety filter.
comment: An extension version of the accepted CDC2025
DCT-MARL: A Dynamic Communication Topology-Based MARL Algorithm for Connected Vehicle Platoon Control
With the rapid advancement of vehicular communication facilities and autonomous driving technologies, connected vehicle platooning has emerged as a promising approach to improve traffic efficiency and driving safety. Reliable Vehicle-to-Vehicle (V2V) communication is critical to achieving efficient cooperative control. However, in the real-world traffic environment, V2V communication may suffer from time-varying delay and packet loss, leading to degraded control performance and even safety risks. To mitigate the adverse effects of non-ideal communication, this paper proposes a Dynamic Communication Topology based Multi-Agent Reinforcement Learning (DCT-MARL) algorithm for robust cooperative platoon control. Specifically, the state space is augmented with historical control action and delay to enhance robustness against communication delay. To mitigate the impact of packet loss, a multi-key gated communication mechanism is introduced, which dynamically adjusts the communication topology based on the correlation between vehicles and their current communication status. Simulation results demonstrate that the proposed DCT-MARL significantly outperforms state-of-the-art methods in terms of string stability and driving comfort, validating its superior robustness and effectiveness.
Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, the rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to effectively distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20%, 1.28%, and 8% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety, critical environments where reliability is paramount.
Non-asymptotic Error Analysis of Subspace Identification for Deterministic Systems
The subspace identification method (SIM) has been extensively employed in the identification of discrete-time multiple-input multiple-output (MIMO) linear time-invariant (LTI) systems. This paper focuses on the analysis of perturbation errors for the system matrices in state-space models and the corresponding system poles, under two unified SIMs, based on a single finite-length input/output sample trajectory. Specifically, we derive non-asymptotic upper bounds on these errors, providing a unified perspective across various SIM variants. Furthermore, we prove that SIMs become ill-conditioned for MIMO systems when the state-to-output dimensionality ratio $n/m$ is large, regardless of system parameters. Finally, numerical experiments are conducted to validate the non-asymptotic results and the ill-conditionedness of SIMs.
comment: In Assumption 1, the assumption regarding the matrix A is found to be unreasonable and needs correction. Additionally, the bound curve in Figure 3b is inaccurate and requires revision. To address these issues, I am requesting the withdrawal of this version (v2) and plan to re-upload a corrected version
Data-conforming data-driven control: avoiding premature generalizations beyond data
Data-driven and adaptive control approaches face the problem of introducing sudden distributional shifts beyond the distribution of data encountered during learning. Therefore, they are prone to invalidating the very assumptions used in their own construction. This is due to the linearity of the underlying system, inherently assumed and formulated in most data-driven control approaches, which may falsely generalize the behavior of the system beyond the behavior experienced in the data. This paper seeks to mitigate these problems by enforcing consistency of the newly designed closed-loop systems with data and slowing down any distributional shifts in the joint state-input space. This is achieved through incorporating affine regularization terms and linear matrix inequality constraints to data-driven approaches, resulting in convex semi-definite programs that can be efficiently solved by standard software packages. We discuss the optimality conditions of these programs and then conclude the paper with a numerical example that further highlights the problem of premature generalization beyond data and shows the effectiveness of our proposed approaches in enhancing the safety of data-driven control methods.
Robotics
Virtual Community: An Open World for Humans, Robots, and Society
The rapid progress in AI and Robotics may lead to a profound societal transformation, as humans and robots begin to coexist within shared communities, introducing both opportunities and challenges. To explore this future, we present Virtual Community-an open-world platform for humans, robots, and society-built on a universal physics engine and grounded in real-world 3D scenes. With Virtual Community, we aim to study embodied social intelligence at scale: 1) How robots can intelligently cooperate or compete; 2) How humans develop social relations and build community; 3) More importantly, how intelligent robots and humans can co-exist in an open world. To support these, Virtual Community features: 1) An open-source multi-agent physics simulator that supports robots, humans, and their interactions within a society; 2) A large-scale, real-world aligned community generation pipeline, including vast outdoor space, diverse indoor scenes, and a community of grounded agents with rich characters and appearances. Leveraging Virtual Community, we propose two novel challenges. The Community Planning Challenge evaluates multi-agent reasoning and planning ability in open-world settings, such as cooperating to help agents with daily activities and efficiently connecting other agents. The Community Robot Challenge requires multiple heterogeneous robots to collaborate in solving complex open-world tasks. We evaluate various baselines on these tasks and demonstrate the challenges in both high-level open-world task planning and low-level cooperation controls. We hope that Virtual Community will unlock further study of human-robot coexistence within open-world environments.
comment: website https://virtual-community-ai.github.io/
Fusing Monocular RGB Images with AIS Data to Create a 6D Pose Estimation Dataset for Marine Vessels
The paper presents a novel technique for creating a 6D pose estimation dataset for marine vessels by fusing monocular RGB images with Automatic Identification System (AIS) data. The proposed technique addresses the limitations of relying purely on AIS for location information, caused by issues like equipment reliability, data manipulation, and transmission delays. By combining vessel detections from monocular RGB images, obtained using an object detection network (YOLOX-X), with AIS messages, the technique generates 3D bounding boxes that represent the vessels' 6D poses, i.e. spatial and rotational dimensions. The paper evaluates different object detection models to locate vessels in image space. We also compare two transformation methods (homography and Perspective-n-Point) for aligning AIS data with image coordinates. The results of our work demonstrate that the Perspective-n-Point (PnP) method achieves a significantly lower projection error compared to homography-based approaches used before, and the YOLOX-X model achieves a mean Average Precision (mAP) of 0.80 at an Intersection over Union (IoU) threshold of 0.5 for relevant vessel classes. We show indication that our approach allows the creation of a 6D pose estimation dataset without needing manual annotation. Additionally, we introduce the Boats on Nordelbe Kehrwieder (BONK-pose), a publicly available dataset comprising 3753 images with 3D bounding box annotations for pose estimation, created by our data fusion approach. This dataset can be used for training and evaluating 6D pose estimation networks. In addition we introduce a set of 1000 images with 2D bounding box annotations for ship detection from the same scene.
comment: Author version of the submission to the IEEE Journal of Oceanic Engineering
Safe and Transparent Robots for Human-in-the-Loop Meat Processing
Labor shortages have severely affected the meat processing sector. Automated technology has the potential to support the meat industry, assist workers, and enhance job quality. However, existing automation in meat processing is highly specialized, inflexible, and cost intensive. Instead of forcing manufacturers to buy a separate device for each step of the process, our objective is to develop general-purpose robotic systems that work alongside humans to perform multiple meat processing tasks. Through a recently conducted survey of industry experts, we identified two main challenges associated with integrating these collaborative robots alongside human workers. First, there must be measures to ensure the safety of human coworkers; second, the coworkers need to understand what the robot is doing. This paper addresses both challenges by introducing a safety and transparency framework for general-purpose meat processing robots. For safety, we implement a hand-detection system that continuously monitors nearby humans. This system can halt the robot in situations where the human comes into close proximity of the operating robot. We also develop an instrumented knife equipped with a force sensor that can differentiate contact between objects such as meat, bone, or fixtures. For transparency, we introduce a method that detects the robot's uncertainty about its performance and uses an LED interface to communicate that uncertainty to the human. Additionally, we design a graphical interface that displays the robot's plans and allows the human to provide feedback on the planned cut. Overall, our framework can ensure safe operation while keeping human workers in-the-loop about the robot's actions which we validate through a user study.
Consistent Pose Estimation of Unmanned Ground Vehicles through Terrain-Aided Multi-Sensor Fusion on Geometric Manifolds
Aiming to enhance the consistency and thus long-term accuracy of Extended Kalman Filters for terrestrial vehicle localization, this paper introduces the Manifold Error State Extended Kalman Filter (M-ESEKF). By representing the robot's pose in a space with reduced dimensionality, the approach ensures feasible estimates on generic smooth surfaces, without introducing artificial constraints or simplifications that may degrade a filter's performance. The accompanying measurement models are compatible with common loosely- and tightly-coupled sensor modalities and also implicitly account for the ground geometry. We extend the formulation by introducing a novel correction scheme that embeds additional domain knowledge into the sensor data, giving more accurate uncertainty approximations and further enhancing filter consistency. The proposed estimator is seamlessly integrated into a validated modular state estimation framework, demonstrating compatibility with existing implementations. Extensive Monte Carlo simulations across diverse scenarios and dynamic sensor configurations show that the M-ESEKF outperforms classical filter formulations in terms of consistency and stability. Moreover, it eliminates the need for scenario-specific parameter tuning, enabling its application in a variety of real-world settings.
An Informative Planning Framework for Target Tracking and Active Mapping in Dynamic Environments with ASVs
Mobile robot platforms are increasingly being used to automate information gathering tasks such as environmental monitoring. Efficient target tracking in dynamic environments is critical for applications such as search and rescue and pollutant cleanups. In this letter, we study active mapping of floating targets that drift due to environmental disturbances such as wind and currents. This is a challenging problem as it involves predicting both spatial and temporal variations in the map due to changing conditions. We propose an informative path planning framework to map an arbitrary number of moving targets with initially unknown positions in dynamic environments. A key component of our approach is a spatiotemporal prediction network that predicts target position distributions over time. We propose an adaptive planning objective for target tracking that leverages these predictions. Simulation experiments show that our proposed planning objective improves target tracking performance compared to existing methods that consider only entropy reduction as the planning objective. Finally, we validate our approach in field tests using an autonomous surface vehicle, showcasing its ability to track targets in real-world monitoring scenarios.
comment: Submitted to IEEE Robotics and Automation Letters (RA-L)
Can LLM Agents Solve Collaborative Tasks? A Study on Urgency-Aware Planning and Coordination
The ability to coordinate actions across multiple agents is critical for solving complex, real-world problems. Large Language Models (LLMs) have shown strong capabilities in communication, planning, and reasoning, raising the question of whether they can also support effective collaboration in multi-agent settings. In this work, we investigate the use of LLM agents to solve a structured victim rescue task that requires division of labor, prioritization, and cooperative planning. Agents operate in a fully known graph-based environment and must allocate resources to victims with varying needs and urgency levels. We systematically evaluate their performance using a suite of coordination-sensitive metrics, including task success rate, redundant actions, room conflicts, and urgency-weighted efficiency. This study offers new insights into the strengths and failure modes of LLMs in physically grounded multi-agent collaboration tasks, contributing to future benchmarks and architectural improvements.
TRUST-Planner: Topology-guided Robust Trajectory Planner for AAVs with Uncertain Obstacle Spatial-temporal Avoidance
Despite extensive developments in motion planning of autonomous aerial vehicles (AAVs), existing frameworks faces the challenges of local minima and deadlock in complex dynamic environments, leading to increased collision risks. To address these challenges, we present TRUST-Planner, a topology-guided hierarchical planning framework for robust spatial-temporal obstacle avoidance. In the frontend, a dynamic enhanced visible probabilistic roadmap (DEV-PRM) is proposed to rapidly explore topological paths for global guidance. The backend utilizes a uniform terminal-free minimum control polynomial (UTF-MINCO) and dynamic distance field (DDF) to enable efficient predictive obstacle avoidance and fast parallel computation. Furthermore, an incremental multi-branch trajectory management framework is introduced to enable spatio-temporal topological decision-making, while efficiently leveraging historical information to reduce replanning time. Simulation results show that TRUST-Planner outperforms baseline competitors, achieving a 96\% success rate and millisecond-level computation efficiency in tested complex environments. Real-world experiments further validate the feasibility and practicality of the proposed method.
Making Pose Representations More Expressive and Disentangled via Residual Vector Quantization
Recent progress in text-to-motion has advanced both 3D human motion generation and text-based motion control. Controllable motion generation (CoMo), which enables intuitive control, typically relies on pose code representations, but discrete pose codes alone cannot capture fine-grained motion details, limiting expressiveness. To overcome this, we propose a method that augments pose code-based latent representations with continuous motion features using residual vector quantization (RVQ). This design preserves the interpretability and manipulability of pose codes while effectively capturing subtle motion characteristics such as high-frequency details. Experiments on the HumanML3D dataset show that our model reduces Frechet inception distance (FID) from 0.041 to 0.015 and improves Top-1 R-Precision from 0.508 to 0.510. Qualitative analysis of pairwise direction similarity between pose codes further confirms the model's controllability for motion editing.
EAROL: Environmental Augmented Perception-Aware Planning and Robust Odometry via Downward-Mounted Tilted LiDAR IROS 2025
To address the challenges of localization drift and perception-planning coupling in unmanned aerial vehicles (UAVs) operating in open-top scenarios (e.g., collapsed buildings, roofless mazes), this paper proposes EAROL, a novel framework with a downward-mounted tilted LiDAR configuration (20{\deg} inclination), integrating a LiDAR-Inertial Odometry (LIO) system and a hierarchical trajectory-yaw optimization algorithm. The hardware innovation enables constraint enhancement via dense ground point cloud acquisition and forward environmental awareness for dynamic obstacle detection. A tightly-coupled LIO system, empowered by an Iterative Error-State Kalman Filter (IESKF) with dynamic motion compensation, achieves high level 6-DoF localization accuracy in feature-sparse environments. The planner, augmented by environment, balancing environmental exploration, target tracking precision, and energy efficiency. Physical experiments demonstrate 81% tracking error reduction, 22% improvement in perceptual coverage, and near-zero vertical drift across indoor maze and 60-meter-scale outdoor scenarios. This work proposes a hardware-algorithm co-design paradigm, offering a robust solution for UAV autonomy in post-disaster search and rescue missions. We will release our software and hardware as an open-source package for the community. Video: https://youtu.be/7av2ueLSiYw.
comment: Accepted by 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025). This work has been submitted to the IEEE for possible publication
Taming VR Teleoperation and Learning from Demonstration for Multi-Task Bimanual Table Service Manipulation ICRA 2025
This technical report presents the champion solution of the Table Service Track in the ICRA 2025 What Bimanuals Can Do (WBCD) competition. We tackled a series of demanding tasks under strict requirements for speed, precision, and reliability: unfolding a tablecloth (deformable-object manipulation), placing a pizza onto the table (pick-and-place), and opening and closing a food container with the lid. Our solution combines VR-based teleoperation and Learning from Demonstrations (LfD) to balance robustness and autonomy. Most subtasks were executed through high-fidelity remote teleoperation, while the pizza placement was handled by an ACT-based policy trained from 100 in-person teleoperated demonstrations with randomized initial configurations. By carefully integrating scoring rules, task characteristics, and current technical capabilities, our approach achieved both high efficiency and reliability, ultimately securing the first place in the competition.
comment: Technical report of First-place/Champion solution at IEEE ICRA 2025 What Bimanuals Can Do (WBCD) Challenge - Table Services Track
FBI: Learning Dexterous In-hand Manipulation with Dynamic Visuotactile Shortcut Policy
Dexterous in-hand manipulation is a long-standing challenge in robotics due to complex contact dynamics and partial observability. While humans synergize vision and touch for such tasks, robotic approaches often prioritize one modality, therefore limiting adaptability. This paper introduces Flow Before Imitation (FBI), a visuotactile imitation learning framework that dynamically fuses tactile interactions with visual observations through motion dynamics. Unlike prior static fusion methods, FBI establishes a causal link between tactile signals and object motion via a dynamics-aware latent model. FBI employs a transformer-based interaction module to fuse flow-derived tactile features with visual inputs, training a one-step diffusion policy for real-time execution. Extensive experiments demonstrate that the proposed method outperforms the baseline methods in both simulation and the real world on two customized in-hand manipulation tasks and three standard dexterous manipulation tasks. Code, models, and more results are available in the website https://sites.google.com/view/dex-fbi.
DEXTER-LLM: Dynamic and Explainable Coordination of Multi-Robot Systems in Unknown Environments via Large Language Models IROS 2025
Online coordination of multi-robot systems in open and unknown environments faces significant challenges, particularly when semantic features detected during operation dynamically trigger new tasks. Recent large language model (LLMs)-based approaches for scene reasoning and planning primarily focus on one-shot, end-to-end solutions in known environments, lacking both dynamic adaptation capabilities for online operation and explainability in the processes of planning. To address these issues, a novel framework (DEXTER-LLM) for dynamic task planning in unknown environments, integrates four modules: (i) a mission comprehension module that resolves partial ordering of tasks specified by natural languages or linear temporal logic formulas (LTL); (ii) an online subtask generator based on LLMs that improves the accuracy and explainability of task decomposition via multi-stage reasoning; (iii) an optimal subtask assigner and scheduler that allocates subtasks to robots via search-based optimization; and (iv) a dynamic adaptation and human-in-the-loop verification module that implements multi-rate, event-based updates for both subtasks and their assignments, to cope with new features and tasks detected online. The framework effectively combines LLMs' open-world reasoning capabilities with the optimality of model-based assignment methods, simultaneously addressing the critical issue of online adaptability and explainability. Experimental evaluations demonstrate exceptional performances, with 100% success rates across all scenarios, 160 tasks and 480 subtasks completed on average (3 times the baselines), 62% less queries to LLMs during adaptation, and superior plan quality (2 times higher) for compound tasks. Project page at https://tcxm.github.io/DEXTER-LLM/
comment: submitted to IROS 2025
Offline Imitation Learning upon Arbitrary Demonstrations by Pre-Training Dynamics Representations
Limited data has become a major bottleneck in scaling up offline imitation learning (IL). In this paper, we propose enhancing IL performance under limited expert data by introducing a pre-training stage that learns dynamics representations, derived from factorizations of the transition dynamics. We first theoretically justify that the optimal decision variable of offline IL lies in the representation space, significantly reducing the parameters to learn in the downstream IL. Moreover, the dynamics representations can be learned from arbitrary data collected with the same dynamics, allowing the reuse of massive non-expert data and mitigating the limited data issues. We present a tractable loss function inspired by noise contrastive estimation to learn the dynamics representations at the pre-training stage. Experiments on MuJoCo demonstrate that our proposed algorithm can mimic expert policies with as few as a single trajectory. Experiments on real quadrupeds show that we can leverage pre-trained dynamics representations from simulator data to learn to walk from a few real-world demonstrations.
comment: 7 pages, 5 figures
FiReFly: Fair Distributed Receding Horizon Planning for Multiple UAVs SC
We propose injecting notions of fairness into multi-robot motion planning. When robots have competing interests, it is important to optimize for some kind of fairness in their usage of resources. In this work, we explore how the robots' energy expenditures might be fairly distributed among them, while maintaining mission success. We formulate a distributed fair motion planner and integrate it with safe controllers in a algorithm called FiReFly. For simulated reach-avoid missions, FiReFly produces fairer trajectories and improves mission success rates over a non-fair planner. We find that real-time performance is achievable up to 15 UAVs, and that scaling up to 50 UAVs is possible with trade-offs between runtime and fairness improvements.
comment: Accepted to IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
Fair-CoPlan: Negotiated Flight Planning with Fair Deconfliction for Urban Air Mobility SC
Urban Air Mobility (UAM) is an emerging transportation paradigm in which Uncrewed Aerial Systems (UAS) autonomously transport passengers and goods in cities. The UAS have different operators with different, sometimes competing goals, yet must share the airspace. We propose a negotiated, semi-distributed flight planner that optimizes UAS' flight lengths {\em in a fair manner}. Current flight planners might result in some UAS being given disproportionately shorter flight paths at the expense of others. We introduce Fair-CoPlan, a planner in which operators and a Provider of Service to the UAM (PSU) together compute \emph{fair} flight paths. Fair-CoPlan has three steps: First, the PSU constrains take-off and landing choices for flights based on capacity at and around vertiports. Then, operators plan independently under these constraints. Finally, the PSU resolves any conflicting paths, optimizing for path length fairness. By fairly spreading the cost of deconfliction Fair-CoPlan encourages wider participation in UAM, ensures safety of the airspace and the areas below it, and promotes greater operator flexibility. We demonstrate Fair-CoPlan through simulation experiments and find fairer outcomes than a non-fair planner with minor delays as a trade-off.
comment: Accepted to IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
Action-Constrained Imitation Learning ICML 2025
Policy learning under action constraints plays a central role in ensuring safe behaviors in various robot control and resource allocation applications. In this paper, we study a new problem setting termed Action-Constrained Imitation Learning (ACIL), where an action-constrained imitator aims to learn from a demonstrative expert with larger action space. The fundamental challenge of ACIL lies in the unavoidable mismatch of occupancy measure between the expert and the imitator caused by the action constraints. We tackle this mismatch through \textit{trajectory alignment} and propose DTWIL, which replaces the original expert demonstrations with a surrogate dataset that follows similar state trajectories while adhering to the action constraints. Specifically, we recast trajectory alignment as a planning problem and solve it via Model Predictive Control, which aligns the surrogate trajectories with the expert trajectories based on the Dynamic Time Warping (DTW) distance. Through extensive experiments, we demonstrate that learning from the dataset generated by DTWIL significantly enhances performance across multiple robot control tasks and outperforms various benchmark imitation learning algorithms in terms of sample efficiency. Our code is publicly available at https://github.com/NYCU-RL-Bandits-Lab/ACRL-Baselines.
comment: Published in ICML 2025
Learning Point Cloud Representations with Pose Continuity for Depth-Based Category-Level 6D Object Pose Estimation ICCV 2025
Category-level object pose estimation aims to predict the 6D pose and 3D size of objects within given categories. Existing approaches for this task rely solely on 6D poses as supervisory signals without explicitly capturing the intrinsic continuity of poses, leading to inconsistencies in predictions and reduced generalization to unseen poses. To address this limitation, we propose HRC-Pose, a novel depth-only framework for category-level object pose estimation, which leverages contrastive learning to learn point cloud representations that preserve the continuity of 6D poses. HRC-Pose decouples object pose into rotation and translation components, which are separately encoded and leveraged throughout the network. Specifically, we introduce a contrastive learning strategy for multi-task, multi-category scenarios based on our 6D pose-aware hierarchical ranking scheme, which contrasts point clouds from multiple categories by considering rotational and translational differences as well as categorical information. We further design pose estimation modules that separately process the learned rotation-aware and translation-aware embeddings. Our experiments demonstrate that HRC-Pose successfully learns continuous feature spaces. Results on REAL275 and CAMERA25 benchmarks show that our method consistently outperforms existing depth-only state-of-the-art methods and runs in real-time, demonstrating its effectiveness and potential for real-world applications. Our code is at https://github.com/zhujunli1993/HRC-Pose.
comment: Accepted by ICCV 2025 Workshop on Recovering 6D Object Pose (R6D)
D$^2$-LIO: Enhanced Optimization for LiDAR-IMU Odometry Considering Directional Degeneracy
LiDAR-inertial odometry (LIO) plays a vital role in achieving accurate localization and mapping, especially in complex environments. However, the presence of LiDAR feature degeneracy poses a major challenge to reliable state estimation. To overcome this issue, we propose an enhanced LIO framework that integrates adaptive outlier-tolerant correspondence with a scan-to-submap registration strategy. The core contribution lies in an adaptive outlier removal threshold, which dynamically adjusts based on point-to-sensor distance and the motion amplitude of platform. This mechanism improves the robustness of feature matching in varying conditions. Moreover, we introduce a flexible scan-to-submap registration method that leverages IMU data to refine pose estimation, particularly in degenerate geometric configurations. To further enhance localization accuracy, we design a novel weighting matrix that fuses IMU preintegration covariance with a degeneration metric derived from the scan-to-submap process. Extensive experiments conducted in both indoor and outdoor environments-characterized by sparse or degenerate features-demonstrate that our method consistently outperforms state-of-the-art approaches in terms of both robustness and accuracy.
comment: 7 page, 2 figures
Open-Universe Assistance Games
Embodied AI agents must infer and act in an interpretable way on diverse human goals and preferences that are not predefined. To formalize this setting, we introduce Open-Universe Assistance Games (OU-AGs), a framework where the agent must reason over an unbounded and evolving space of possible goals. In this context, we introduce GOOD (GOals from Open-ended Dialogue), a data-efficient, online method that extracts goals in the form of natural language during an interaction with a human, and infers a distribution over natural language goals. GOOD prompts an LLM to simulate users with different complex intents, using its responses to perform probabilistic inference over candidate goals. This approach enables rich goal representations and uncertainty estimation without requiring large offline datasets. We evaluate GOOD in a text-based grocery shopping domain and in a text-operated simulated household robotics environment (AI2Thor), using synthetic user profiles. Our method outperforms a baseline without explicit goal tracking, as confirmed by both LLM-based and human evaluations.
comment: 7 pages + 2 pages references + 7 pages appendix
Discrete VHCs for Propeller Motion of a Devil-Stick using purely Impulsive Inputs
The control problem of realizing propeller motion of a devil-stick in the vertical plane using impulsive forces applied normal to the stick is considered. This problem is an example of underactuated robotic juggling and has not been considered in the literature before. Inspired by virtual holonomic constraints, the concept of discrete virtual holonomic constraints (DVHC) is introduced for the first time to solve this orbital stabilization problem. At the discrete instants when impulsive inputs are applied, the location of the center-of-mass of the devil-stick is specified in terms of its orientation angle. This yields the discrete zero dynamics (DZD), which provides conditions for stable propeller motion. In the limiting case, when the rotation angle between successive applications of impulsive inputs is chosen to be arbitrarily small, the problem reduces to that of propeller motion under continuous forcing. A controller that enforces the DVHC, and an orbit stabilizing controller based on the impulse controlled Poincar\'e map approach are presented. The efficacy of the approach to trajectory design and stabilization is validated through simulations.
comment: 16 pages, 11 figures. This work has been submitted to the IEEE for possible publication
Decentralized Vision-Based Autonomous Aerial Wildlife Monitoring
Wildlife field operations demand efficient parallel deployment methods to identify and interact with specific individuals, enabling simultaneous collective behavioral analysis, and health and safety interventions. Previous robotics solutions approach the problem from the herd perspective, or are manually operated and limited in scale. We propose a decentralized vision-based multi-quadrotor system for wildlife monitoring that is scalable, low-bandwidth, and sensor-minimal (single onboard RGB camera). Our approach enables robust identification and tracking of large species in their natural habitat. We develop novel vision-based coordination and tracking algorithms designed for dynamic, unstructured environments without reliance on centralized communication or control. We validate our system through real-world experiments, demonstrating reliable deployment in diverse field conditions.
In-Context Iterative Policy Improvement for Dynamic Manipulation
Attention-based architectures trained on internet-scale language data have demonstrated state of the art reasoning ability for various language-based tasks, such as logic problems and textual reasoning. Additionally, these Large Language Models (LLMs) have exhibited the ability to perform few-shot prediction via in-context learning, in which input-output examples provided in the prompt are generalized to new inputs. This ability furthermore extends beyond standard language tasks, enabling few-shot learning for general patterns. In this work, we consider the application of in-context learning with pre-trained language models for dynamic manipulation. Dynamic manipulation introduces several crucial challenges, including increased dimensionality, complex dynamics, and partial observability. To address this, we take an iterative approach, and formulate our in-context learning problem to predict adjustments to a parametric policy based on previous interactions. We show across several tasks in simulation and on a physical robot that utilizing in-context learning outperforms alternative methods in the low data regime. Video summary of this work and experiments can be found https://youtu.be/2inxpdrq74U?si=dAdDYsUEr25nZvRn.
comment: 14 pages. Accepted at CoRL 2025
GraspQP: Differentiable Optimization of Force Closure for Diverse and Robust Dexterous Grasping
Dexterous robotic hands enable versatile interactions due to the flexibility and adaptability of multi-fingered designs, allowing for a wide range of task-specific grasp configurations in diverse environments. However, to fully exploit the capabilities of dexterous hands, access to diverse and high-quality grasp data is essential -- whether for developing grasp prediction models from point clouds, training manipulation policies, or supporting high-level task planning with broader action options. Existing approaches for dataset generation typically rely on sampling-based algorithms or simplified force-closure analysis, which tend to converge to power grasps and often exhibit limited diversity. In this work, we propose a method to synthesize large-scale, diverse, and physically feasible grasps that extend beyond simple power grasps to include refined manipulations, such as pinches and tri-finger precision grasps. We introduce a rigorous, differentiable energy formulation of force closure, implicitly defined through a Quadratic Program (QP). Additionally, we present an adjusted optimization method (MALA*) that improves performance by dynamically rejecting gradient steps based on the distribution of energy values across all samples. We extensively evaluate our approach and demonstrate significant improvements in both grasp diversity and the stability of final grasp predictions. Finally, we provide a new, large-scale grasp dataset for 5,700 objects from DexGraspNet, comprising five different grippers and three distinct grasp types. Dataset and Code:https://graspqp.github.io/
A Vision-Based Shared-Control Teleoperation Scheme for Controlling the Robotic Arm of a Four-Legged Robot
In hazardous and remote environments, robotic systems perform critical tasks demanding improved safety and efficiency. Among these, quadruped robots with manipulator arms offer mobility and versatility for complex operations. However, teleoperating quadruped robots is challenging due to the lack of integrated obstacle detection and intuitive control methods for the robotic arm, increasing collision risks in confined or dynamically changing workspaces. Teleoperation via joysticks or pads can be non-intuitive and demands a high level of expertise due to its complexity, culminating in a high cognitive load on the operator. To address this challenge, a teleoperation approach that directly maps human arm movements to the robotic manipulator offers a simpler and more accessible solution. This work proposes an intuitive remote control by leveraging a vision-based pose estimation pipeline that utilizes an external camera with a machine learning-based model to detect the operator's wrist position. The system maps these wrist movements into robotic arm commands to control the robot's arm in real-time. A trajectory planner ensures safe teleoperation by detecting and preventing collisions with both obstacles and the robotic arm itself. The system was validated on the real robot, demonstrating robust performance in real-time control. This teleoperation approach provides a cost-effective solution for industrial applications where safety, precision, and ease of use are paramount, ensuring reliable and intuitive robotic control in high-risk environments.
You Only Pose Once: A Minimalist's Detection Transformer for Monocular RGB Category-level 9D Multi-Object Pose Estimation
Accurately recovering the full 9-DoF pose of unseen instances within specific categories from a single RGB image remains a core challenge for robotics and automation. Most existing solutions still rely on pseudo-depth, CAD models, or multi-stage cascades that separate 2D detection from pose estimation. Motivated by the need for a simpler, RGB-only alternative that learns directly at the category level, we revisit a longstanding question: Can object detection and 9-DoF pose estimation be unified with high performance, without any additional data? We show that they can with our method, YOPO, a single-stage, query-based framework that treats category-level 9-DoF estimation as a natural extension of 2D detection. YOPO augments a transformer detector with a lightweight pose head, a bounding-box-conditioned translation module, and a 6D-aware Hungarian matching cost. The model is trained end-to-end only with RGB images and category-level pose labels. Despite its minimalist design, YOPO sets a new state of the art on three benchmarks. On the REAL275 dataset, it achieves 79.6% $\rm{IoU}_{50}$ and 54.1% under the $10^\circ$$10{\rm{cm}}$ metric, surpassing prior RGB-only methods and closing much of the gap to RGB-D systems. The code, models, and additional qualitative results can be found on our project.
comment: https://mikigom.github.io/YOPO-project-page
Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)
This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially ``sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances and specific knowledge of the model. To our knowledge in the quadrotor control field, DiD-L is the first online learning approach that is lightweight enough to run in real-time at 400 Hz on microcontroller units (MCUs) such as STM32, and has been validated through real-world experiments.
Multi-Robot Navigation in Social Mini-Games: Definitions, Taxonomy, and Algorithms
The ``Last Mile Challenge'' has long been considered an important, yet unsolved, challenge for autonomous vehicles, public service robots, and delivery robots. A central issue in this challenge is the ability of robots to navigate constrained and cluttered environments that have high agency (e.g., doorways, hallways, corridor intersections), often while competing for space with other robots and humans. We refer to these environments as ``Social Mini-Games'' (SMGs). Traditional navigation approaches designed for MRN do not perform well in SMGs, which has led to focused research on dedicated SMG solvers. However, publications on SMG navigation research make different assumptions (on centralized versus decentralized, observability, communication, cooperation, etc.), and have different objective functions (safety versus liveness). These assumptions and objectives are sometimes implicitly assumed or described informally. This makes it difficult to establish appropriate baselines for comparison in research papers, as well as making it difficult for practitioners to find the papers relevant to their concrete application. Such ad-hoc representation of the field also presents a barrier to new researchers wanting to start research in this area. SMG navigation research requires its own taxonomy, definitions, and evaluation protocols to guide effective research moving forward. This survey is the first to catalog SMG solvers using a well-defined and unified taxonomy and to classify existing methods accordingly. It also discusses the essential properties of SMG solvers, defines what SMGs are and how they appear in practice, outlines how to evaluate SMG solvers, and highlights the differences between SMG solvers and general navigation systems. The survey concludes with an overview of future directions and open challenges in the field.
Accelerating Signal-Temporal-Logic-Based Task and Motion Planning of Bipedal Navigation using Benders Decomposition
Task and motion planning under Signal Temporal Logic constraints is known to be NP-hard. A common class of approaches formulates these hybrid problems, which involve discrete task scheduling and continuous motion planning, as mixed-integer programs (MIP). However, in applications for bipedal locomotion, introduction of non-convex constraints such as kinematic reachability and footstep rotation exacerbates the computational complexity of MIPs. In this work, we present a method based on Benders Decomposition to address scenarios where solving the entire monolithic optimization problem is prohibitively intractable. Benders Decomposition proposes an iterative cutting-plane technique that partitions the problem into a master problem to prototype a plan that meets the task specification, and a series of subproblems for kinematics and dynamics feasibility checks. Our experiments demonstrate that this method achieves faster planning compared to alternative algorithms for solving the resulting optimization program with nonlinear constraints.
comment: 16 pages, 7 figures, 6 tables
Into the Wild: When Robots Are Not Welcome
Social robots are increasingly being deployed in public spaces, where they face not only technological difficulties and unexpected user utterances, but also objections from stakeholders who may not be comfortable with introducing a robot into those spaces. We describe our difficulties with deploying a social robot in two different public settings: 1) Student services center; 2) Refugees and asylum seekers drop-in service. Although this is a failure report, in each use case we eventually managed to earn the trust of the staff and form a relationship with them, allowing us to deploy our robot and conduct our studies.
comment: Accepted at the workshop on Real-World HRI in Public and Private Spaces: Successes, Failures, and Lessons Learned (PubRob-Fails), held at the IEEE RO-MAN Conference, 2025. 3 pages
From Autonomy to Agency: Agentic Vehicles for Human-Centered Mobility Systems
Autonomy, from the Greek autos (self) and nomos (law), refers to the capacity to operate according to internal rules without external control. Accordingly, autonomous vehicles (AuVs) are viewed as vehicular systems capable of perceiving their environment and executing pre-programmed tasks independently of external input. However, both research and real-world deployments increasingly showcase vehicles that demonstrate behaviors beyond this definition (including the SAE levels 0 to 5); Examples of this outpace include the interaction with humans with natural language, goal adaptation, contextual reasoning, external tool use, and unseen ethical dilemma handling, largely empowered by multi-modal large language models (LLMs). These developments reveal a conceptual gap between technical autonomy and the broader cognitive and social capabilities needed for future human-centered mobility systems. To address this gap, this paper introduces the concept of agentic vehicles (AgVs), referring to vehicles that integrate agentic AI systems to reason, adapt, and interact within complex environments. This paper proposes the term AgVs and their distinguishing characteristics from conventional AuVs. It synthesizes relevant advances in integrating LLMs and AuVs and highlights how AgVs might transform future mobility systems and ensure the systems are human-centered. The paper concludes by identifying key challenges in the development and governance of AgVs, and how they can play a significant role in future agentic transportation systems.
Active Disturbance Rejection Control for Trajectory Tracking of a Seagoing USV: Design, Simulation, and Field Experiments IROS 2025
Unmanned Surface Vessels (USVs) face significant control challenges due to uncertain environmental disturbances like waves and currents. This paper proposes a trajectory tracking controller based on Active Disturbance Rejection Control (ADRC) implemented on the DUS V2500. A custom simulation incorporating realistic waves and current disturbances is developed to validate the controller's performance, supported by further validation through field tests in the harbour of Scheveningen, the Netherlands, and at sea. Simulation results demonstrate that ADRC significantly reduces cross-track error across all tested conditions compared to a baseline PID controller but increases control effort and energy consumption. Field trials confirm these findings while revealing a further increase in energy consumption during sea trials compared to the baseline.
comment: Accepted for presentation at IROS 2025. Accepted version
Dynamic Risk-Aware MPPI for Mobile Robots in Crowds via Efficient Monte Carlo Approximations IROS 2025
Deploying mobile robots safely among humans requires the motion planner to account for the uncertainty in the other agents' predicted trajectories. This remains challenging in traditional approaches, especially with arbitrarily shaped predictions and real-time constraints. To address these challenges, we propose a Dynamic Risk-Aware Model Predictive Path Integral control (DRA-MPPI), a motion planner that incorporates uncertain future motions modelled with potentially non-Gaussian stochastic predictions. By leveraging MPPI's gradient-free nature, we propose a method that efficiently approximates the joint Collision Probability (CP) among multiple dynamic obstacles for several hundred sampled trajectories in real-time via a Monte Carlo (MC) approach. This enables the rejection of samples exceeding a predefined CP threshold or the integration of CP as a weighted objective within the navigation cost function. Consequently, DRA-MPPI mitigates the freezing robot problem while enhancing safety. Real-world and simulated experiments with multiple dynamic obstacles demonstrate DRA-MPPI's superior performance compared to state-of-the-art approaches, including Scenario-based Model Predictive Control (S-MPC), Frenet planner, and vanilla MPPI.
comment: Accepted for presentation at IROS 2025. Accepted Version
Gaussian-LIC: Real-Time Photo-Realistic SLAM with Gaussian Splatting and LiDAR-Inertial-Camera Fusion ICRA 2025
In this paper, we present a real-time photo-realistic SLAM method based on marrying Gaussian Splatting with LiDAR-Inertial-Camera SLAM. Most existing radiance-field-based SLAM systems mainly focus on bounded indoor environments, equipped with RGB-D or RGB sensors. However, they are prone to decline when expanding to unbounded scenes or encountering adverse conditions, such as violent motions and changing illumination. In contrast, oriented to general scenarios, our approach additionally tightly fuses LiDAR, IMU, and camera for robust pose estimation and photo-realistic online mapping. To compensate for regions unobserved by the LiDAR, we propose to integrate both the triangulated visual points from images and LiDAR points for initializing 3D Gaussians. In addition, the modeling of the sky and varying camera exposure have been realized for high-quality rendering. Notably, we implement our system purely with C++ and CUDA, and meticulously design a series of strategies to accelerate the online optimization of the Gaussian-based scene representation. Extensive experiments demonstrate that our method outperforms its counterparts while maintaining real-time capability. Impressively, regarding photo-realistic mapping, our method with our estimated poses even surpasses all the compared approaches that utilize privileged ground-truth poses for mapping. Our code has been released on https://github.com/APRIL-ZJU/Gaussian-LIC.
comment: ICRA 2025
SDS -- See it, Do it, Sorted: Quadruped Skill Synthesis from Single Video Demonstration
Imagine a robot learning locomotion skills from any single video, without labels or reward engineering. We introduce SDS ("See it. Do it. Sorted."), an automated pipeline for skill acquisition from unstructured demonstrations. Using GPT-4o, SDS applies novel prompting techniques, in the form of spatio-temporal grid-based visual encoding ($G_{v}$) and structured input decomposition (SUS). These produce executable reward functions (RF) from the raw input videos. The RFs are used to train PPO policies and are optimized through closed-loop evolution, using training footage and performance metrics as self-supervised signals. SDS allows quadrupeds (e.g. Unitree Go1) to learn four gaits -- trot, bound, pace, and hop -- achieving 100% gait matching fidelity, Dynamic Time Warping (DTW) distance in the order of $10^{-6}$, and stable locomotion with zero failures, both in simulation and the real world. SDS generalizes to morphologically different quadrupeds (e.g. ANYmal) and outperforms prior work in data efficiency, training time and engineering effort. Further materials and the code are open-source under: https://rpl-cs-ucl.github.io/SDSweb/.
Robust simultaneous UWB-anchor calibration and robot localization for emergency situations
In this work, we propose a factor graph optimization (FGO) framework to simultaneously solve the calibration problem for Ultra-WideBand (UWB) anchors and the robot localization problem. Calibrating UWB anchors manually can be time-consuming and even impossible in emergencies or those situations without special calibration tools. Therefore, automatic estimation of the anchor positions becomes a necessity. The proposed method enables the creation of a soft sensor providing the position information of the anchors in a UWB network. This soft sensor requires only UWB and LiDAR measurements measured from a moving robot. The proposed FGO framework is suitable for the calibration of an extendable large UWB network. Moreover, the anchor calibration problem and robot localization problem can be solved simultaneously, which saves time for UWB network deployment. The proposed framework also helps to avoid artificial errors in the UWB-anchor position estimation and improves the accuracy and robustness of the robot-pose. The experimental results of the robot localization using LiDAR and a UWB network in a 3D environment are discussed, demonstrating the performance of the proposed method. More specifically, the anchor calibration problem with four anchors and the robot localization problem can be solved simultaneously and automatically within 30 seconds by the proposed framework. The supplementary video and codes can be accessed via https://github.com/LiuxhRobotAI/Simultaneous_calibration_localization.
comment: Submit to IEEE SMC 2025. This work has been submitted to the IEEE for possible publication
Extremum Flow Matching for Offline Goal Conditioned Reinforcement Learning
Imitation learning is a promising approach for enabling generalist capabilities in humanoid robots, but its scaling is fundamentally constrained by the scarcity of high-quality expert demonstrations. This limitation can be mitigated by leveraging suboptimal, open-ended play data, often easier to collect and offering greater diversity. This work builds upon recent advances in generative modeling, specifically Flow Matching, an alternative to Diffusion models. We introduce a method for estimating the minimum or maximum of the learned distribution by leveraging the unique properties of Flow Matching, namely, deterministic transport and support for arbitrary source distributions. We apply this method to develop several goal-conditioned imitation and reinforcement learning algorithms based on Flow Matching, where policies are conditioned on both current and goal observations. We explore and compare different architectural configurations by combining core components, such as critic, planner, actor, or world model, in various ways. We evaluated our agents on the OGBench benchmark and analyzed how different demonstration behaviors during data collection affect performance in a 2D non-prehensile pushing task. Furthermore, we validated our approach on real hardware by deploying it on the Talos humanoid robot to perform complex manipulation tasks based on high-dimensional image observations, featuring a sequence of pick-and-place and articulated object manipulation in a realistic kitchen environment. Experimental videos and code are available at: https://hucebot.github.io/extremum_flow_matching_website/
comment: 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids), Sep 2025, Seoul, South Korea
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents ACM MM
Aerial navigation is a fundamental yet underexplored capability in embodied intelligence, enabling agents to operate in large-scale, unstructured environments where traditional navigation paradigms fall short. However, most existing research follows the Vision-and-Language Navigation (VLN) paradigm, which heavily depends on sequential linguistic instructions, limiting its scalability and autonomy. To address this gap, we introduce UAV-ON, a benchmark for large-scale Object Goal Navigation (ObjectNav) by aerial agents in open-world environments, where agents operate based on high-level semantic goals without relying on detailed instructional guidance as in VLN. UAV-ON comprises 14 high-fidelity Unreal Engine environments with diverse semantic regions and complex spatial layouts, covering urban, natural, and mixed-use settings. It defines 1270 annotated target objects, each characterized by an instance-level instruction that encodes category, physical footprint, and visual descriptors, allowing grounded reasoning. These instructions serve as semantic goals, introducing realistic ambiguity and complex reasoning challenges for aerial agents. To evaluate the benchmark, we implement several baseline methods, including Aerial ObjectNav Agent (AOA), a modular policy that integrates instruction semantics with egocentric observations for long-horizon, goal-directed exploration. Empirical results show that all baselines struggle in this setting, highlighting the compounded challenges of aerial navigation and semantic goal grounding. UAV-ON aims to advance research on scalable UAV autonomy driven by semantic goal descriptions in complex real-world environments.
comment: Accepted to ACM MM Dataset Track 2025
MinD: Learning A Dual-System World Model for Real-Time Planning and Implicit Risk Analysis
Video Generation Models (VGMs) have become powerful backbones for Vision-Language-Action (VLA) models, leveraging large-scale pretraining for robust dynamics modeling. However, current methods underutilize their distribution modeling capabilities for predicting future states. Two challenges hinder progress: integrating generative processes into feature learning is both technically and conceptually underdeveloped, and naive frame-by-frame video diffusion is computationally inefficient for real-time robotics. To address these, we propose Manipulate in Dream (MinD), a dual-system world model for real-time, risk-aware planning. MinD uses two asynchronous diffusion processes: a low-frequency visual generator (LoDiff) that predicts future scenes and a high-frequency diffusion policy (HiDiff) that outputs actions. Our key insight is that robotic policies do not require fully denoised frames but can rely on low-resolution latents generated in a single denoising step. To connect early predictions to actions, we introduce DiffMatcher, a video-action alignment module with a novel co-training strategy that synchronizes the two diffusion models. MinD achieves a 63% success rate on RL-Bench, 60% on real-world Franka tasks, and operates at 11.3 FPS, demonstrating the efficiency of single-step latent features for control signals. Furthermore, MinD identifies 74% of potential task failures in advance, providing real-time safety signals for monitoring and intervention. This work establishes a new paradigm for efficient and reliable robotic manipulation using generative world models.
LaViPlan : Language-Guided Visual Path Planning with RLVR ICCV 2025
Out-of-distribution (OOD) scenarios in autonomous driving pose critical challenges, as planners often fail to generalize beyond their training experience, leading to unsafe or unexpected behavior. Vision-Language Models (VLMs) have shown promise in handling such scenarios by providing high-level scene understanding and user-aligned decisions. However, existing VLMs often exhibit a misalignment between their language-based reasoning and the low-level trajectories required for action-level planning. In this paper, we propose LaViPlan, a framework that leverages Reinforcement Learning with Verifiable Rewards (RLVR) to fine-tune VLMs using planning-oriented metrics. Experimental results show that LaViPlan improves planning performance across both in-domain and out-of-domain datasets. While linguistic fidelity slightly decreases after RLVR-based fine-tuning, qualitative evaluation indicates that the outputs remain coherent. We also conduct ablation studies to analyze the effects of sampling ratio and reasoning guidance, highlighting how these design choices influence performance. These findings demonstrate the potential of RLVR as a post-training paradigm for aligning language-guided reasoning with action-level planning in autonomous driving.
comment: Accepted to the 2nd ICCV 2025 Workshop on the Challenge of Out-of-Label Hazards in Autonomous Driving (13 pages, 6 figures)
MetAdv: A Unified and Interactive Adversarial Testing Platform for Autonomous Driving ACM MM 2025
Evaluating and ensuring the adversarial robustness of autonomous driving (AD) systems is a critical and unresolved challenge. This paper introduces MetAdv, a novel adversarial testing platform that enables realistic, dynamic, and interactive evaluation by tightly integrating virtual simulation with physical vehicle feedback. At its core, MetAdv establishes a hybrid virtual-physical sandbox, within which we design a three-layer closed-loop testing environment with dynamic adversarial test evolution. This architecture facilitates end-to-end adversarial evaluation, ranging from high-level unified adversarial generation, through mid-level simulation-based interaction, to low-level execution on physical vehicles. Additionally, MetAdv supports a broad spectrum of AD tasks, algorithmic paradigms (e.g., modular deep learning pipelines, end-to-end learning, vision-language models). It supports flexible 3D vehicle modeling and seamless transitions between simulated and physical environments, with built-in compatibility for commercial platforms such as Apollo and Tesla. A key feature of MetAdv is its human-in-the-loop capability: besides flexible environmental configuration for more customized evaluation, it enables real-time capture of physiological signals and behavioral feedback from drivers, offering new insights into human-machine trust under adversarial conditions. We believe MetAdv can offer a scalable and unified framework for adversarial assessment, paving the way for safer AD.
comment: Accepted by ACM MM 2025 Demo/Videos track
3D FlowMatch Actor: Unified 3D Policy for Single- and Dual-Arm Manipulation
We present 3D FlowMatch Actor (3DFA), a 3D policy architecture for robot manipulation that combines flow matching for trajectory prediction with 3D pretrained visual scene representations for learning from demonstration. 3DFA leverages 3D relative attention between action and visual tokens during action denoising, building on prior work in 3D diffusion-based single-arm policy learning. Through a combination of flow matching and targeted system-level and architectural optimizations, 3DFA achieves over 30x faster training and inference than previous 3D diffusion-based policies, without sacrificing performance. On the bimanual PerAct2 benchmark, it establishes a new state of the art, outperforming the next-best method by an absolute margin of 41.4%. In extensive real-world evaluations, it surpasses strong baselines with up to 1000x more parameters and significantly more pretraining. In unimanual settings, it sets a new state of the art on 74 RLBench tasks by directly predicting dense end-effector trajectories, eliminating the need for motion planning. Comprehensive ablation studies underscore the importance of our design choices for both policy effectiveness and efficiency.
comment: Project page: https://3d-flowmatch-actor.github.io/
CaRL: Learning Scalable Planning Policies with Simple Rewards
We investigate reinforcement learning (RL) for privileged planning in autonomous driving. State-of-the-art approaches for this task are rule-based, but these methods do not scale to the long tail. RL, on the other hand, is scalable and does not suffer from compounding errors like imitation learning. Contemporary RL approaches for driving use complex shaped rewards that sum multiple individual rewards, \eg~progress, position, or orientation rewards. We show that PPO fails to optimize a popular version of these rewards when the mini-batch size is increased, which limits the scalability of these approaches. Instead, we propose a new reward design based primarily on optimizing a single intuitive reward term: route completion. Infractions are penalized by terminating the episode or multiplicatively reducing route completion. We find that PPO scales well with higher mini-batch sizes when trained with our simple reward, even improving performance. Training with large mini-batch sizes enables efficient scaling via distributed data parallelism. We scale PPO to 300M samples in CARLA and 500M samples in nuPlan with a single 8-GPU node. The resulting model achieves 64 DS on the CARLA longest6 v2 benchmark, outperforming other RL methods with more complex rewards by a large margin. Requiring only minimal adaptations from its use in CARLA, the same method is the best learning-based approach on nuPlan. It scores 91.3 in non-reactive and 90.6 in reactive traffic on the Val14 benchmark while being an order of magnitude faster than prior work.
comment: Accepted at the Conference on Robot Learning 2025
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
Imitation learning for manipulation has a well-known data scarcity problem. Unlike natural language and 2D computer vision, there is no Internet-scale corpus of data for dexterous manipulation. One appealing option is egocentric human video, a passively scalable data source. However, existing large-scale datasets such as Ego4D do not have native hand pose annotations and do not focus on object manipulation. To this end, we use Apple Vision Pro to collect EgoDex: the largest and most diverse dataset of dexterous human manipulation to date. EgoDex has 829 hours of egocentric video with paired 3D hand and finger tracking data collected at the time of recording, where multiple calibrated cameras and on-device SLAM can be used to precisely track the pose of every joint of each hand. The dataset covers a wide range of diverse manipulation behaviors with everyday household objects in 194 different tabletop tasks ranging from tying shoelaces to folding laundry. Furthermore, we train and systematically evaluate imitation learning policies for hand trajectory prediction on the dataset, introducing metrics and benchmarks for measuring progress in this increasingly important area. By releasing this large-scale dataset, we hope to push the frontier of robotics, computer vision, and foundation models. EgoDex is publicly available for download at https://github.com/apple/ml-egodex.
A MILP-Based Solution to Multi-Agent Motion Planning and Collision Avoidance in Constrained Environments
We propose a mixed-integer linear program (MILP) for multi-agent motion planning that embeds Polytopic Action-based Motion Planning (PAAMP) into a sequence-then-solve pipeline. Region sequences confine each agent to adjacent convex polytopes, while a big-M hyperplane model enforces inter-agent separation. Collision constraints are applied only to agents sharing or neighboring a region, which reduces binary variables exponentially compared with naive formulations. An L1 path-length-plus-acceleration cost yields smooth trajectories. We prove finite-time convergence and demonstrate on representative multi-agent scenarios with obstacles that our formulation produces collision-free trajectories an order of magnitude faster than an unstructured MILP baseline.
comment: Accepted to 2025 IEEE International Conference on Automation Science and Engineering (CASE 2025). This arXiv version adds a supplementary appendix with figures not in the IEEE proceedings
CaLiV: LiDAR-to-Vehicle Calibration of Arbitrary Sensor Setups
In autonomous systems, sensor calibration is essential for safe and efficient navigation in dynamic environments. Accurate calibration is a prerequisite for reliable perception and planning tasks such as object detection and obstacle avoidance. Many existing LiDAR calibration methods require overlapping fields of view, while others use external sensing devices or postulate a feature-rich environment. In addition, Sensor-to-Vehicle calibration is not supported by the vast majority of calibration algorithms. In this work, we propose a novel target-based technique for extrinsic Sensor-to-Sensor and Sensor-to-Vehicle calibration of multi-LiDAR systems called CaLiV. This algorithm works for non-overlapping fields of view and does not require any external sensing devices. First, we apply motion to produce field of view overlaps and utilize a simple Unscented Kalman Filter to obtain vehicle poses. Then, we use the Gaussian mixture model-based registration framework GMMCalib to align the point clouds in a common calibration frame. Finally, we reduce the task of recovering the sensor extrinsics to a minimization problem. We show that both translational and rotational Sensor-to-Sensor errors can be solved accurately by our method. In addition, all Sensor-to-Vehicle rotation angles can also be calibrated with high accuracy. We validate the simulation results in real-world experiments. The code is open-source and available on https://github.com/TUMFTM/CaLiV.
Multiagent Systems
Generative AI Against Poaching: Latent Composite Flow Matching for Wildlife Conservation
Poaching poses significant threats to wildlife and biodiversity. A valuable step in reducing poaching is to forecast poacher behavior, which can inform patrol planning and other conservation interventions. Existing poaching prediction methods based on linear models or decision trees lack the expressivity to capture complex, nonlinear spatiotemporal patterns. Recent advances in generative modeling, particularly flow matching, offer a more flexible alternative. However, training such models on real-world poaching data faces two central obstacles: imperfect detection of poaching events and limited data. To address imperfect detection, we integrate flow matching with an occupancy-based detection model and train the flow in latent space to infer the underlying occupancy state. To mitigate data scarcity, we adopt a composite flow initialized from a linear-model prediction rather than random noise which is the standard in diffusion models, injecting prior knowledge and improving generalization. Evaluations on datasets from two national parks in Uganda show consistent gains in predictive accuracy.
Alpha Berkeley: A Scalable Framework for the Orchestration of Agentic Systems
Coordinating workflows across heterogeneous control systems remains a central challenge in safety-critical environments such as scientific facilities, industrial plants, and energy infrastructures. Language-model-driven agents offer a natural interface for these tasks, but existing approaches often lack scalability, reliability, and human oversight. We introduce the Alpha Berkeley Framework, a production-ready architecture for scalable agentic systems that integrate conversational context with robust tool orchestration. The framework features dynamic capability classification to select only relevant tools per task, a plan-first orchestration model that generates execution plans with explicit dependencies and optional human approval, context-aware task extraction that combines dialogue history with external memory and domain resources, and production-ready execution environments with checkpointing, artifact management, and modular deployment. We demonstrate its versatility through two case studies: a tutorial-style wind farm monitoring example and a deployment at the Advanced Light Source particle accelerator. These results establish Alpha Berkeley as a reliable and transparent framework for agentic systems in high-stakes domains.
Decentralized Vision-Based Autonomous Aerial Wildlife Monitoring
Wildlife field operations demand efficient parallel deployment methods to identify and interact with specific individuals, enabling simultaneous collective behavioral analysis, and health and safety interventions. Previous robotics solutions approach the problem from the herd perspective, or are manually operated and limited in scale. We propose a decentralized vision-based multi-quadrotor system for wildlife monitoring that is scalable, low-bandwidth, and sensor-minimal (single onboard RGB camera). Our approach enables robust identification and tracking of large species in their natural habitat. We develop novel vision-based coordination and tracking algorithms designed for dynamic, unstructured environments without reliance on centralized communication or control. We validate our system through real-world experiments, demonstrating reliable deployment in diverse field conditions.
Building and Measuring Trust between Large Language Models
As large language models (LLMs) increasingly interact with each other, most notably in multi-agent setups, we may expect (and hope) that `trust' relationships develop between them, mirroring trust relationships between human colleagues, friends, or partners. Yet, though prior work has shown LLMs to be capable of identifying emotional connections and recognizing reciprocity in trust games, little remains known about (i) how different strategies to build trust compare, (ii) how such trust can be measured implicitly, and (iii) how this relates to explicit measures of trust. We study these questions by relating implicit measures of trust, i.e. susceptibility to persuasion and propensity to collaborate financially, with explicit measures of trust, i.e. a dyadic trust questionnaire well-established in psychology. We build trust in three ways: by building rapport dynamically, by starting from a prewritten script that evidences trust, and by adapting the LLMs' system prompt. Surprisingly, we find that the measures of explicit trust are either little or highly negatively correlated with implicit trust measures. These findings suggest that measuring trust between LLMs by asking their opinion may be deceiving. Instead, context-specific and implicit measures may be more informative in understanding how LLMs trust each other.
Multi-Robot Navigation in Social Mini-Games: Definitions, Taxonomy, and Algorithms
The ``Last Mile Challenge'' has long been considered an important, yet unsolved, challenge for autonomous vehicles, public service robots, and delivery robots. A central issue in this challenge is the ability of robots to navigate constrained and cluttered environments that have high agency (e.g., doorways, hallways, corridor intersections), often while competing for space with other robots and humans. We refer to these environments as ``Social Mini-Games'' (SMGs). Traditional navigation approaches designed for MRN do not perform well in SMGs, which has led to focused research on dedicated SMG solvers. However, publications on SMG navigation research make different assumptions (on centralized versus decentralized, observability, communication, cooperation, etc.), and have different objective functions (safety versus liveness). These assumptions and objectives are sometimes implicitly assumed or described informally. This makes it difficult to establish appropriate baselines for comparison in research papers, as well as making it difficult for practitioners to find the papers relevant to their concrete application. Such ad-hoc representation of the field also presents a barrier to new researchers wanting to start research in this area. SMG navigation research requires its own taxonomy, definitions, and evaluation protocols to guide effective research moving forward. This survey is the first to catalog SMG solvers using a well-defined and unified taxonomy and to classify existing methods accordingly. It also discusses the essential properties of SMG solvers, defines what SMGs are and how they appear in practice, outlines how to evaluate SMG solvers, and highlights the differences between SMG solvers and general navigation systems. The survey concludes with an overview of future directions and open challenges in the field.
MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling
Despite recent advances, long-sequence video generation frameworks still suffer from significant limitations: poor assistive capability, suboptimal visual quality, and limited expressiveness. To mitigate these limitations, we propose MAViS, an end-to-end multi-agent collaborative framework for long-sequence video storytelling. MAViS orchestrates specialized agents across multiple stages, including script writing, shot designing, character modeling, keyframe generation, video animation, and audio generation. In each stage, agents operate under the 3E Principle -- Explore, Examine, and Enhance -- to ensure the completeness of intermediate outputs. Considering the capability limitations of current generative models, we propose the Script Writing Guidelines to optimize compatibility between scripts and generative tools. Experimental results demonstrate that MAViS achieves state-of-the-art performance in assistive capability, visual quality, and video expressiveness. Its modular framework further enables scalability with diverse generative models and tools. With just a brief user prompt, MAViS is capable of producing high-quality, expressive long-sequence video storytelling, enriching inspirations and creativity for users. To the best of our knowledge, MAViS is the only framework that provides multimodal design output -- videos with narratives and background music.
comment: Video Generation Agent
On the $h$-majority dynamics with many opinions
We present the first upper bound on the convergence time to consensus of the well-known $h$-majority dynamics with $k$ opinions, in the synchronous setting, for $h$ and $k$ that are both non-constant values. We suppose that, at the beginning of the process, there is some initial additive bias towards some plurality opinion, that is, there is an opinion that is supported by $x$ nodes while any other opinion is supported by strictly fewer nodes. We prove that, with high probability, if the bias is $\omega(\sqrt{x})$ and the initial plurality opinion is supported by at least $x = \omega(\log n)$ nodes, then the process converges to plurality consensus in $O(\log n)$ rounds whenever $h = \omega(n \log n / x)$. A main corollary is the following: if $k = o(n / \log n)$ and the process starts from an almost-balanced configuration with an initial bias of magnitude $\omega(\sqrt{n/k})$ towards the initial plurality opinion, then any function $h = \omega(k \log n)$ suffices to guarantee convergence to consensus in $O(\log n)$ rounds, with high probability. Our upper bound shows that the lower bound of $\Omega(k / h^2)$ rounds to reach consensus given by Becchetti et al.\ (2017) cannot be pushed further than $\widetilde{\Omega}(k / h)$. Moreover, the bias we require is asymptotically smaller than the $\Omega(\sqrt{n\log n})$ bias that guarantees plurality consensus in the $3$-majority dynamics: in our case, the required bias is at most any (arbitrarily small) function in $\omega(\sqrt{x})$ for any value of $k \ge 2$.
Binary Decision Process in Pre-Evacuation Behavior
In crowd evacuation the time interval before decisive movement towards a safe place is defined as the pre-evacuation phase, and it has crucial impact on the total time required for safe egress. This process mainly refers to situation awareness and response to an external stressors, e.g., fire alarm. Due to the complexity of human cognitive process, simulation is used to study this important time interval. In this paper a binary decision process is formulated to simulate pre-evacuation time of many evacuees in a given social context. The model combines classic opinion dynamics with binary phase transition to describe how group pre-evacuation time emerges from individual interaction. The model parameters are quantitatively meaningful to human factors research within socio-psychological background, e.g., whether an individual is stubborn or open-minded, or what kind of the social topology exists among the individuals and how it matters in aggregating individuals into social groups. The modeling framework also describes collective motion of many evacuees in a planar space, and the resulting multi-agent system is partly similar to Vicsek model, and it is meaningful to explore complex crowd behavior in social context.
comment: 5 pages
Dominated Actions in Imperfect-Information Games
Dominance is a fundamental concept in game theory. In strategic-form games dominated strategies can be identified in polynomial time. As a consequence, iterative removal of dominated strategies can be performed efficiently as a preprocessing step for reducing the size of a game before computing a Nash equilibrium. For imperfect-information games in extensive form, we could convert the game to strategic form and then iteratively remove dominated strategies in the same way; however, this conversion may cause an exponential blowup in game size. In this paper we define and study the concept of dominated actions in imperfect-information games. Our main result is a polynomial-time algorithm for determining whether an action is dominated (strictly or weakly) by any mixed strategy in n-player games, which can be extended to an algorithm for iteratively removing dominated actions. This allows us to efficiently reduce the size of the game tree as a preprocessing step for Nash equilibrium computation. We explore the role of dominated actions empirically in the "All In or Fold" No-Limit Texas Hold'em poker variant.
Prescriptive Agents based on RAG for Automated Maintenance (PARAM)
Industrial machinery maintenance requires timely intervention to prevent catastrophic failures and optimize operational efficiency. This paper presents an integrated Large Language Model (LLM)-based intelligent system for prescriptive maintenance that extends beyond traditional anomaly detection to provide actionable maintenance recommendations. Building upon our prior LAMP framework for numerical data analysis, we develop a comprehensive solution that combines bearing vibration frequency analysis with multi agentic generation for intelligent maintenance planning. Our approach serializes bearing vibration data (BPFO, BPFI, BSF, FTF frequencies) into natural language for LLM processing, enabling few-shot anomaly detection with high accuracy. The system classifies fault types (inner race, outer race, ball/roller, cage faults) and assesses severity levels. A multi-agentic component processes maintenance manuals using vector embeddings and semantic search, while also conducting web searches to retrieve comprehensive procedural knowledge and access up-to-date maintenance practices for more accurate and in-depth recommendations. The Gemini model then generates structured maintenance recommendations includes immediate actions, inspection checklists, corrective measures, parts requirements, and timeline specifications. Experimental validation in bearing vibration datasets demonstrates effective anomaly detection and contextually relevant maintenance guidance. The system successfully bridges the gap between condition monitoring and actionable maintenance planning, providing industrial practitioners with intelligent decision support. This work advances the application of LLMs in industrial maintenance, offering a scalable framework for prescriptive maintenance across machinery components and industrial sectors.
Systems and Control (CS)
A State-Space Representation of Coupled Linear Multivariate PDEs and Stability Analysis using SDP
Physical processes evolving in both time and space are often modeled using Partial Differential Equations (PDEs). Recently, it has been shown how stability analysis and control of coupled PDEs in a single spatial variable can be more conveniently performed using an equivalent Partial Integral Equation (PIE) representation. The construction of this PIE representation is based on an analytic expression for the inverse of the spatial differential operator, $\partial_s^{d}$, on the domain defined by boundary conditions. In this paper, we show how this univariate representation may be extended inductively to multiple spatial variables by representing the domain as the intersection of lifted univariate domains. Specifically, we show that if each univariate domain is well-posed, then there exists a readily verified consistency condition which is necessary and sufficient for existence of an inverse to the multivariate spatial differential operator, $D^\alpha=\partial_{s_1}^{\alpha_1}\cdots\partial_{s_N}^{\alpha_N}$, on the PDE domain. Furthermore, we show that this inverse is an element of a $*$-algebra of Partial Integral (PI) operators defined by polynomial semi-separable kernels. Based on this operator algebra, we show that the evolution of any suitably well-posed linear multivariate PDE may be described by a PIE, parameterized by elements of the PI algebra. A convex computational test for PDE stability is then proposed using a positive matrix parameterization of positive PI operators, and software (PIETOOLS) is provided which automates the process of representation and stability analysis of such PDEs. This software is used to analyze stability of 2D heat, wave, and plate equations, obtaining accurate bounds on the rate of decay.
Recursive Gaussian Process Regression with Integrated Monotonicity Assumptions for Control Applications
In this paper, we present an extension to the recursive Gaussian Process (RGP) regression that enables the satisfaction of inequality constraints and is well suited for a real-time execution in control applications. The soft inequality constraints are integrated by introducing an additional extended Kalman Filter (EKF) update step using pseudo-measurements. The sequential formulation of the algorithm and several developed heuristics ensure both the performance and a low computational effort of the algorithm. A special focus lies on an efficient consideration of monotonicity assumptions for GPs in the form of inequality constraints. The algorithm is statistically validated in simulations, where the possible advantages in comparison with the standard RGP algorithm become obvious. The paper is concluded with a successful experimental validation of the developed algorithm for the monotonicity-preserving learning of heat transfer values for the control of a vapor compression cycle evaporator, leveraging a previously published partial input output linearization (IOL).
comment: Accepted at ICINCO 2025 (22nd International Conference on Informatics in Control, Automation and Robotics)
Optimal Unpredictable Control for Linear Systems
In this paper, we investigate how to achieve the unpredictability against malicious inferences for linear systems. The key idea is to add stochastic control inputs, named as unpredictable control, to make the outputs irregular. The future outputs thus become unpredictable and the performance of inferences is degraded. The major challenges lie in: i) how to formulate optimization problems to obtain an optimal distribution of stochastic input, under unknown prediction accuracy of the adversary; and ii) how to achieve the trade-off between the unpredictability and control performance. We first utilize both variance and confidence probability of prediction error to quantify unpredictability, then formulate two two-stage stochastic optimization problems, respectively. Under variance metric, the analytic optimal distribution of control input is provided. With probability metric, it is a non-convex optimization problem, thus we present a novel numerical method and convert the problem into a solvable linear optimization problem. Last, we quantify the control performance under unpredictable control, and accordingly design the unpredictable LQR and cooperative control. Simulations demonstrate the unpredictability of our control algorithm. The obtained optimal distribution outperforms Gaussian and Laplace distributions commonly used in differential privacy under proposed metrics.
Assessment of Power System Stability Considering Multiple Time-Scale Dynamics: Insights into Hopf Bifurcations in Presence of GFL and GFM IBRs
Real power systems exhibit dynamics that evolve across a wide range of time scales, from very fast to very slow phenomena. Historically, incorporating these wide-ranging dynamics into a single model has been impractical. As a result, power engineers rely on time-scale decomposition to simplify models. When fast phenomena are evaluated, slow dynamics are neglected (assumed stable), and vice versa. This paper challenges this paradigm by showing the importance of assessing power system stability while considering multiple time scales simultaneously. Using the concept of Hopf bifurcations, it exemplifies instability issues that would be missed if multi-time-scale dynamics are not considered. Although this work employs both grid-following and grid-forming inverter-based resource models, it is not a direct comparison. Instead, it presents a case study demonstrating how one technology can complement the other from a multi time-scale dynamics perspective.
comment: 7 pages
Distributed Multiple Fault Detection and Estimation in DC Microgrids with Unknown Power Loads
This paper proposes a distributed diagnosis scheme to detect and estimate actuator and power line faults in DC microgrids subject to unknown power loads and stochastic noise. To address actuator faults, we design a fault estimation filter whose parameters are determined through a tractable optimization problem to achieve fault estimation, decoupling from power line faults, and robustness against noise. In contrast, the estimation of power line faults poses greater challenges due to the inherent coupling between fault currents and unknown power loads, which becomes ill-posed when the underlying system is insufficiently excited. To the best of our knowledge, this is the first study to address this critical yet underexplored issue. Our solution introduces a novel differentiate-before-estimate strategy. A set of diagnostic rules based on the temporal characteristics of a constructed residual is developed to distinguish load changes from line faults. Once a power line fault is detected, a regularized least-squares method is activated to estimate the fault currents, for which we further derive an upper bound on the estimation error. Finally, comprehensive simulation results validate the effectiveness of the proposed methods.
comment: 28 pages, 13 figures
Markov Chain-based Model of Blockchain Radio Access Networks
Security has always been a priority, for researchers, service providers and network operators when it comes to radio access networks (RAN). One wireless access approach that has captured attention is blockchain enabled RAN (B-RAN) due to its secure nature. This research introduces a framework that integrates blockchain technology into RAN while also addressing the limitations of state-of-the-art models. The proposed framework utilizes queuing and Markov chain theory to model the aspects of B-RAN. An extensive evaluation of the models performance is provided, including an analysis of timing factors and a focused assessment of its security aspects. The results demonstrate reduced latency and comparable security making the presented framework suitable for diverse application scenarios.
Reformulating Parallel-Connected Lithium-Ion Battery Pack Dynamics with Interconnection Resistances as Ordinary Differential Equations
This work presents analytical solutions for the current distribution in lithium-ion battery packs composed of cells connected in parallel, explicitly accounting for the presence of interconnection resistances. These solutions enable the reformulation of the differential-algebraic equations describing the pack dynamics into a set of ordinary differential equations, thereby simplifying simulation and analysis. Conditions under which uniform current sharing across all cells occurs are also derived. The proposed formulation is validated against experimental data and confirms its ability to capture the key behaviours induced by interconnection resistances. These results can support the improved design and control of parallel-connected battery packs.
Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)
This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially ``sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances and specific knowledge of the model. To our knowledge in the quadrotor control field, DiD-L is the first online learning approach that is lightweight enough to run in real-time at 400 Hz on microcontroller units (MCUs) such as STM32, and has been validated through real-world experiments.
Online Incident Response Planning under Model Misspecification through Bayesian Learning and Belief Quantization CCS
Effective responses to cyberattacks require fast decisions, even when information about the attack is incomplete or inaccurate. However, most decision-support frameworks for incident response rely on a detailed system model that describes the incident, which restricts their practical utility. In this paper, we address this limitation and present an online method for incident response planning under model misspecification, which we call MOBAL: Misspecified Online Bayesian Learning. MOBAL iteratively refines a conjecture about the model through Bayesian learning as new information becomes available, which facilitates model adaptation as the incident unfolds. To determine effective responses online, we quantize the conjectured model into a finite Markov model, which enables efficient response planning through dynamic programming. We prove that Bayesian learning is asymptotically consistent with respect to the information feedback. Additionally, we establish bounds on misspecification and quantization errors. Experiments on the CAGE-2 benchmark show that MOBAL outperforms the state of the art in terms of adaptability and robustness to model misspecification.
comment: Accepted to ACM CCS AISec2025
Smart Charging Impact Analysis using Clustering Methods and Real-world Distribution Feeders
The anticipated widespread adoption of electric vehicles (EVs) necessitates a critical evaluation of existing power distribution infrastructures, as EV integration imposes additional stress on distribution networks that can lead to component overloading and power quality degradation. Implementing smart charging mechanisms can mitigate these adverse effects and defer or even avoid upgrades. This study assesses the performance of two smart charging strategies - Time of Use (TOU) pricing and Load Balancing (LB) on seven representative real-world feeders identified using k-means clustering. A time series-based steady-state load flow analysis was conducted on these feeders to simulate the impact of EV charging under both strategies across four different EV enrollment scenarios and three representative days to capture seasonal load characteristics. A grid upgrade strategy has been proposed to strengthen the power grid to support EV integration with minimal cost. Results demonstrate that both TOU and LB strategies effectively manage the additional EV load with reduced upgrade requirement and cost to existing infrastructure compared to the case without smart charging strategies and LB outperforms TOU when the customer enrollment levels are high. These findings support the viability of smart charging in facilitating EV integration while maintaining distribution network reliability and reducing investment cost.
Discrete VHCs for Propeller Motion of a Devil-Stick using purely Impulsive Inputs
The control problem of realizing propeller motion of a devil-stick in the vertical plane using impulsive forces applied normal to the stick is considered. This problem is an example of underactuated robotic juggling and has not been considered in the literature before. Inspired by virtual holonomic constraints, the concept of discrete virtual holonomic constraints (DVHC) is introduced for the first time to solve this orbital stabilization problem. At the discrete instants when impulsive inputs are applied, the location of the center-of-mass of the devil-stick is specified in terms of its orientation angle. This yields the discrete zero dynamics (DZD), which provides conditions for stable propeller motion. In the limiting case, when the rotation angle between successive applications of impulsive inputs is chosen to be arbitrarily small, the problem reduces to that of propeller motion under continuous forcing. A controller that enforces the DVHC, and an orbit stabilizing controller based on the impulse controlled Poincar\'e map approach are presented. The efficacy of the approach to trajectory design and stabilization is validated through simulations.
comment: 16 pages, 11 figures. This work has been submitted to the IEEE for possible publication
Nonlinear Federated System Identification
We consider federated learning of linearly-parameterized nonlinear systems. We establish theoretical guarantees on the effectiveness of federated nonlinear system identification compared to centralized approaches, demonstrating that the convergence rate improves as the number of clients increases. Although the convergence rates in the linear and nonlinear cases differ only by a constant, this constant depends on the feature map $\phi$, which can be carefully chosen in the nonlinear setting to increase excitation and improve performance. We experimentally validate our theory in physical settings where client devices are driven by i.i.d. control inputs and control policies exhibiting i.i.d. random perturbations, ensuring non-active exploration. Experiments use trajectories from nonlinear dynamical systems characterized by real-analytic feature functions, including polynomial and trigonometric components, representative of physical systems including pendulum and quadrotor dynamics. We analyze the convergence behavior of the proposed method under varying noise levels and data distributions. Results show that federated learning consistently improves convergence of any individual client as the number of participating clients increases.
Structure-preserving Optimal Kron-based Reduction of Radial Distribution Networks
Network reduction simplifies complex electrical networks to address computational challenges of large-scale transmission and distribution grids. Traditional network reduction methods are often based on a predefined set of nodes or lines to remain in the reduced network. This paper builds upon previous work on Optimal Kron-based Reduction of Networks (Opti-KRON), which was formulated as a mixed-integer linear program (MILP), to enhance the framework in two aspects. First, the scalability is improved via a cutting plane restriction, tightened Big~M bounds, and a zero-injection node reduction stage. Next, we introduce a radiality-preservation step to identify and recover nodes whose restoration ensures radiality of the reduced network. The model is validated through its application to the 533-bus distribution test system and a 3499-bus realistic test feeder for a set of representative loading scenarios. In the 533-bus system, an 85% reduction was achieved with a maximum voltage error below 0.0025 p.u., while in the 3499-bus feeder, over 94% reduction was obtained with maximum voltage errors below 0.002 p.u. Additionally, we show that the radialization step accelerates the runtime of optimal voltage control problems when applied to Kron-reduced networks.
A Vision-Based Shared-Control Teleoperation Scheme for Controlling the Robotic Arm of a Four-Legged Robot
In hazardous and remote environments, robotic systems perform critical tasks demanding improved safety and efficiency. Among these, quadruped robots with manipulator arms offer mobility and versatility for complex operations. However, teleoperating quadruped robots is challenging due to the lack of integrated obstacle detection and intuitive control methods for the robotic arm, increasing collision risks in confined or dynamically changing workspaces. Teleoperation via joysticks or pads can be non-intuitive and demands a high level of expertise due to its complexity, culminating in a high cognitive load on the operator. To address this challenge, a teleoperation approach that directly maps human arm movements to the robotic manipulator offers a simpler and more accessible solution. This work proposes an intuitive remote control by leveraging a vision-based pose estimation pipeline that utilizes an external camera with a machine learning-based model to detect the operator's wrist position. The system maps these wrist movements into robotic arm commands to control the robot's arm in real-time. A trajectory planner ensures safe teleoperation by detecting and preventing collisions with both obstacles and the robotic arm itself. The system was validated on the real robot, demonstrating robust performance in real-time control. This teleoperation approach provides a cost-effective solution for industrial applications where safety, precision, and ease of use are paramount, ensuring reliable and intuitive robotic control in high-risk environments.
LyLA-Therm: Lyapunov-based Langevin Adaptive Thermodynamic Neural Network Controller
Thermodynamic principles can be employed to design parameter update laws that address challenges such as the exploration vs. exploitation dilemma. In this paper, inspired by the Langevin equation, an update law is developed for a Lyapunov-based DNN control method, taking the form of a stochastic differential equation. The drift term is designed to minimize the system's generalized internal energy, while the diffusion term is governed by a user-selected generalized temperature law, allowing for more controlled fluctuations. The minimization of generalized internal energy in this design fulfills the exploitation objective, while the temperature-based stochastic noise ensures sufficient exploration. Using a Lyapunov-based stability analysis, the proposed Lyapunov-based Langevin Adaptive Thermodynamic (LyLA-Therm) neural network controller achieves probabilistic convergence of the tracking and parameter estimation errors to an ultimate bound. Simulation results demonstrate the effectiveness of the proposed approach, with the LyLA-Therm architecture achieving up to 20.66% improvement in tracking errors, up to 20.89% improvement in function approximation errors, and up to 11.31% improvement in off-trajectory function approximation errors compared to the baseline deterministic approach.
Holo-Artisan: A Personalized Multi-User Holographic Experience for Virtual Museums on the Edge Intelligence
We present Holo-Artisan, a novel system architecture enabling immersive multi-user experiences in virtual museums through true holographic displays and personalized edge intelligence. In our design, local edge computing nodes process real-time user data -- including pose, facial expression, and voice -- for multiple visitors concurrently. Generative AI models then drive digital artworks (e.g., a volumetric Mona Lisa) to respond uniquely to each viewer. For instance, the Mona Lisa can return a smile to one visitor while engaging in a spoken Q\&A with another, all in real time. A cloud-assisted collaboration platform composes these interactions in a shared scene using a universal scene description, and employs ray tracing to render high-fidelity, personalized views with a direct pipeline to glasses-free holographic displays. To preserve user privacy and continuously improve personalization, we integrate federated learning (FL) -- edge devices locally fine-tune AI models and share only model updates for aggregation. This edge-centric approach minimizes latency and bandwidth usage, ensuring a synchronized shared experience with individual customization. Through Holo-Artisan, static museum exhibits are transformed into dynamic, living artworks that engage each visitor in a personal dialogue, heralding a new paradigm of cultural heritage interaction.
Amplitude maximization in stable systems, Schur positivity, and some conjectures on polynomial interpolation
For $r > 0$ and integers $t \ge n > 0$, we consider the following problem: maximize the amplitude $|x_t|$ at time $t$, over all complex solutions $x = (x_0, x_1, \dots)$ of arbitrary homogeneous linear difference equations of order $n$ with the characteristic roots in the disc $\{z \in \mathbb{C}: |z| \le r\}$, and with initial values $x_0, \dots, x_{n-1}$ in the unit disc. We find that for any triple $t,n,r$, the maximum is attained with coinciding roots on the boundary circle; in particular, this implies that the peak amplitude $\sup_{t \ge n} |x_t|$ can be maximized explicitly, by studying a unique equation with the characteristic polynomial $(z-r)^n$. Moreover, the optimality of the cophase root configuration holds for origin-centered polydiscs. To prove this result, we first reduce the problem to a certain interpolation problem over monomials, then solve the latter by leveraging the theory of symmetric functions and identifying the associated Schur positivity structure. We also discuss the implications for more general Reinhardt domains. Finally, we study the problem of estimating the derivatives of a real entire function from its values at $n/2$ pairs of complex conjugate points in the unit disc. We propose conjectures on the extremality of the monomial $z^n$, and restate them in terms of Schur polynomials.
comment: 18 pages; minor typo corrections viz. the previous version
Active Disturbance Rejection Control for Trajectory Tracking of a Seagoing USV: Design, Simulation, and Field Experiments IROS 2025
Unmanned Surface Vessels (USVs) face significant control challenges due to uncertain environmental disturbances like waves and currents. This paper proposes a trajectory tracking controller based on Active Disturbance Rejection Control (ADRC) implemented on the DUS V2500. A custom simulation incorporating realistic waves and current disturbances is developed to validate the controller's performance, supported by further validation through field tests in the harbour of Scheveningen, the Netherlands, and at sea. Simulation results demonstrate that ADRC significantly reduces cross-track error across all tested conditions compared to a baseline PID controller but increases control effort and energy consumption. Field trials confirm these findings while revealing a further increase in energy consumption during sea trials compared to the baseline.
comment: Accepted for presentation at IROS 2025. Accepted version
Fragile, Robust, and Antifragile: A Perspective from Parameter Responses in Reinforcement Learning Under Stress
This paper explores Reinforcement learning (RL) policy robustness by systematically analyzing network parameters under internal and external stresses. Inspired by synaptic plasticity in neuroscience, synaptic filtering introduces internal stress by selectively perturbing parameters, while adversarial attacks apply external stress through modified agent observations. This dual approach enables the classification of parameters as fragile, robust, or antifragile, based on their influence on policy performance in clean and adversarial settings. Parameter scores are defined to quantify these characteristics, and the framework is validated on PPO-trained agents in Mujoco continuous control environments. The results highlight the presence of antifragile parameters that enhance policy performance under stress, demonstrating the potential of targeted filtering techniques to improve RL policy adaptability. These insights provide a foundation for future advancements in the design of robust and antifragile RL systems.
comment: Withdrawn pending a review of attribution and overlap with Pravin et al., Artificial Intelligence (2024), DOI: 10.1016/j.artint.2023.104060. Further dissemination is paused while we determine appropriate next steps
Moving horizon estimation for nonlinear systems with time-varying parameters
We propose a moving horizon estimation scheme for estimating the states and time-varying parameters of nonlinear systems. We consider the case where observability of the parameters depends on the excitation of the system and may be absent during operation, with the parameter dynamics fulfilling a weak incremental bounded-energy bounded-state property to ensure boundedness of the estimation error (with respect to the disturbance energy). The proposed estimation scheme involves a standard quadratic cost function with an adaptive regularization term depending on the current parameter observability. We develop robustness guarantees for the overall estimation error that are valid for all times, and that improve the more often the parameters are detected to be observable during operation. The theoretical results are illustrated by a simulation example.
comment: Presented at IFAC NMPC 2024, Kyoto, Japan
Game-theoretic Energy Management Strategies With Interacting Agents in Formula 1
This paper presents an interaction-aware energy management optimization framework for Formula 1 racing. The considered scenario involves two agents and a drag reduction model. Strategic interactions between the agents are captured by a Stackelberg game formulated as a bilevel program. To address the computational challenges associated with bilevel optimization, the problem is reformulated as a single-level nonlinear program employing the Karush-Kuhn-Tucker conditions. The proposed framework contributes towards the development of new energy management and allocation strategies, caused by the presence of another agent. For instance, it provides valuable insights on how to redistribute the energy in order to optimally exploit the wake effect, showcasing a notable difference with the behavior studied in previous works. Robust energy allocations can be identified to reduce the lap time loss associated with unexpected choices of the other agent. It allows to recognize the boundary conditions for the interaction to become relevant, impacting the system's behavior, and to assess if overtaking is possible and beneficial. Overall, the framework provides a comprehensive approach for a two-agent Formula 1 racing problem with strategic interactions, offering physically intuitive and practical results.
Binary Decision Process in Pre-Evacuation Behavior
In crowd evacuation the time interval before decisive movement towards a safe place is defined as the pre-evacuation phase, and it has crucial impact on the total time required for safe egress. This process mainly refers to situation awareness and response to an external stressors, e.g., fire alarm. Due to the complexity of human cognitive process, simulation is used to study this important time interval. In this paper a binary decision process is formulated to simulate pre-evacuation time of many evacuees in a given social context. The model combines classic opinion dynamics with binary phase transition to describe how group pre-evacuation time emerges from individual interaction. The model parameters are quantitatively meaningful to human factors research within socio-psychological background, e.g., whether an individual is stubborn or open-minded, or what kind of the social topology exists among the individuals and how it matters in aggregating individuals into social groups. The modeling framework also describes collective motion of many evacuees in a planar space, and the resulting multi-agent system is partly similar to Vicsek model, and it is meaningful to explore complex crowd behavior in social context.
comment: 5 pages
Collision Avoidance for Convex Primitives via Differentiable Optimization Based High-Order Control Barrier Functions
Ensuring the safety of dynamical systems is crucial, where collision avoidance is a primary concern. Recently, control barrier functions (CBFs) have emerged as an effective method to integrate safety constraints into control synthesis through optimization techniques. However, challenges persist when dealing with convex primitives and tasks requiring torque control, as well as the occurrence of unintended equilibria. This work addresses these challenges by introducing a high-order CBF (HOCBF) framework for collision avoidance among convex primitives. We transform nonconvex safety constraints into linear constraints by differentiable optimization and prove the high-order continuous differentiability. Then, we employ HOCBFs to accommodate torque control, enabling tasks involving forces or high dynamics. Additionally, we analyze the issue of spurious equilibria in high-order cases and propose a circulation mechanism to prevent the undesired equilibria on the boundary of the safe set. Finally, we validate our framework with three experiments on the Franka Research 3 robotic manipulator, demonstrating successful collision avoidance and the efficacy of the circulation mechanism.
comment: Accepted to IEEE Transactions on Control Systems Technology
Data-Efficient System Identification via Lipschitz Neural Networks
Extracting dynamic models from data is of enormous importance in understanding the properties of unknown systems. In this work, we employ Lipschitz neural networks, a class of neural networks with a prescribed upper bound on their Lipschitz constant, to address the problem of data-efficient nonlinear system identification. Under the (fairly weak) assumption that the unknown system is Lipschitz continuous, we propose a method to estimate the approximation error bound of the trained network and the bound on the difference between the simulated trajectories by the trained models and the true system. Empirical results show that our method outperforms classic fully connected neural networks and Lipschitz regularized networks through simulation studies on three dynamical systems, and the advantage of our method is more noticeable when less data is used for training.
comment: Accepted at the 2025 American Control Conference (ACC)
Beyond Quadratic Costs: A Bregman Divergence Approach to H$_\infty$ Control
In the past couple of decades, non-quadratic convex penalties have reshaped signal processing and machine learning; in robust control, however, general convex costs break the Riccati and storage function structure that make the design tractable. Practitioners thus default to approximations, heuristics or robust model predictive control that are solved online for short horizons. We close this gap by extending $H_\infty$ control of discrete-time linear systems to strictly convex penalties on state, input, and disturbance, recasting the objective with Bregman divergences that admit a completion-of-squares decomposition. The result is a closed-form, time-invariant, full-information stabilizing controller that minimizes a worst-case performance ratio over the infinite horizon. Necessary and sufficient existence/optimality conditions are given by a Riccati-like identity together with a concavity requirement; with quadratic costs, these collapse to the classical $H_\infty$ algebraic Riccati equation and the associated negative-semidefinite condition, recovering the linear central controller. Otherwise, the optimal controller is nonlinear and can enable safety envelopes, sparse actuation, and bang-bang policies with rigorous $H_\infty$ guarantees.
Monotone Neural Control Barrier Certificates
This work presents a neurosymbolic framework for synthesizing and verifying safety controllers in high-dimensional monotone dynamical systems using only linear sample complexity, without requiring explicit models or conservative Lipschitz bounds. The approach combines the expressiveness of neural networks with the rigor of symbolic reasoning via barrier certificates, functional analogs of inductive invariants that formally guarantee safety. Prior data-driven methods often treat dynamics as black-box models, relying on dense state-space discretization or Lipschitz overapproximations, leading to exponential sample complexity. In contrast, monotonicity -- a pervasive structural property in many real-world systems -- provides a symbolic scaffold that simplifies both learning and verification. Exploiting order preservation reduces verification to localized boundary checks, transforming a high-dimensional problem into a tractable, low-dimensional one. Barrier certificates are synthesized using monotone neural networks -- architectures with embedded monotonicity constraints -- trained via gradient-based optimization guided by barrier conditions. This enables scalable, formally sound verification directly from simulation data, bridging black-box learning and formal guarantees within a unified neurosymbolic framework. Empirical results on three large-scale benchmarks -- a 1,000-dimensional freeway traffic model, a 50-dimensional urban traffic network, and a 13,000-dimensional power grid -- demonstrate the scalability and effectiveness of the approach in real-world, safety-critical systems.
comment: The work reached arXiv before full agreement among all co-authors was documented
A MILP-Based Solution to Multi-Agent Motion Planning and Collision Avoidance in Constrained Environments
We propose a mixed-integer linear program (MILP) for multi-agent motion planning that embeds Polytopic Action-based Motion Planning (PAAMP) into a sequence-then-solve pipeline. Region sequences confine each agent to adjacent convex polytopes, while a big-M hyperplane model enforces inter-agent separation. Collision constraints are applied only to agents sharing or neighboring a region, which reduces binary variables exponentially compared with naive formulations. An L1 path-length-plus-acceleration cost yields smooth trajectories. We prove finite-time convergence and demonstrate on representative multi-agent scenarios with obstacles that our formulation produces collision-free trajectories an order of magnitude faster than an unstructured MILP baseline.
comment: Accepted to 2025 IEEE International Conference on Automation Science and Engineering (CASE 2025). This arXiv version adds a supplementary appendix with figures not in the IEEE proceedings
Utility-Scale Bifacial Solar Photovoltaic System: Optimum Sizing and Techno-Economic Evaluation
Classical monofacial solar photovoltaic systems have gained prevalence and are widely reported in the literature because they have a lower initial cost compared with bifacial systems. However, limited investigation of both systems has been done on a utility scale with different performance indicators. This paper introduces a multifaceted comparative analysis including various aspects like energy generation, reliability, environmental effect, economic viability, and footprint area. Real measured data, including ambient temperature, solar irradiance, and a utility-scale load, were used for studying both systems in the City of Detroit. The optimal system sizing and energy management strategy are attained using the Whale optimization algorithm. Minimizing the loss of power supply probability and sizing the number of photovoltaic panels (NPV) are carried out for both cases. Results revealed that the bifacial solar system generates more power with a lower NPV, a smaller installation area, and hence a lower levelized cost of energy for the entire project lifetime compared to the monofacial system. Accordingly, the bifacial system outlined in this paper is recommended and can be implemented in various locations to establish a sustainable solar energy system that is economically feasible with clean energy production for the entire project's lifespan.
comment: This paper was accepted for, and published in, the proceedings of IEEE PES GM 2024
Assessing the Performance and Impact of PV Technologies on Storage in Hybrid Renewable Systems
Traditional monofacial photovoltaic (mPV) systems are commonly adopted and well-documented because of their lower upfront costs in comparison to bifacial photovoltaic (bPV) systems. This study investigates how PV technologies impact energy storage in grid-scale hybrid renewable systems, focusing on optimizing and assessing the performance of mPV and bPV technologies integrated with pumped storage hydropower. Using Ludington City, Michigan as a case study and analyzing realworld data such as solar irradiance, ambient temperature, and utility-scale load profiles, the research highlights the operational and economic benefits of bPV systems. The results reveal that bPV systems can pump approximately 10.38% more water annually to the upper reservoir while achieving a lower levelized cost of energy ($0.0578/kWh for bPV vs. $0.0672/kWh for mPV). This study underscores the outstanding potential of bPV systems in enhancing energy storage and management strategies, contributing to a more sustainable and resilient renewable energy future.
comment: This paper is accepted for publication in IEEE PES GM 2025
Impact Analysis of Utility-Scale Energy Storage on the ERCOT Grid in Reducing Renewable Generation Curtailments and Emissions
This paper explores the solutions for minimizing renewable energy (RE) curtailment in the Texas Electric Reliability Council of Texas (ERCOT) grid. By utilizing current and future planning data from ERCOT and the System Advisor Model from the National Renewable Energy Laboratory, we examine how future renewable energy (RE) initiatives, combined with utility-scale energy storage, can reduce CO2 emissions while reshaping Texas's energy mix. The study projects the energy landscape from 2023 to 2033, considering the planned phase-out of fossil fuel plants and the integration of new wind/solar projects. By comparing emissions under different load scenarios, with and without storage, we demonstrate storage's role in optimizing RE utilization. The findings of this paper provide actionable guidance for energy stakeholders, underscoring the need to expand wind and solar projects with strategic storage solutions to maximize Texas's RE capacity and substantially reduce CO2 emissions.
comment: This paper is accepted for publication in IEEE PES GM 2025
Systems and Control (EESS)
A State-Space Representation of Coupled Linear Multivariate PDEs and Stability Analysis using SDP
Physical processes evolving in both time and space are often modeled using Partial Differential Equations (PDEs). Recently, it has been shown how stability analysis and control of coupled PDEs in a single spatial variable can be more conveniently performed using an equivalent Partial Integral Equation (PIE) representation. The construction of this PIE representation is based on an analytic expression for the inverse of the spatial differential operator, $\partial_s^{d}$, on the domain defined by boundary conditions. In this paper, we show how this univariate representation may be extended inductively to multiple spatial variables by representing the domain as the intersection of lifted univariate domains. Specifically, we show that if each univariate domain is well-posed, then there exists a readily verified consistency condition which is necessary and sufficient for existence of an inverse to the multivariate spatial differential operator, $D^\alpha=\partial_{s_1}^{\alpha_1}\cdots\partial_{s_N}^{\alpha_N}$, on the PDE domain. Furthermore, we show that this inverse is an element of a $*$-algebra of Partial Integral (PI) operators defined by polynomial semi-separable kernels. Based on this operator algebra, we show that the evolution of any suitably well-posed linear multivariate PDE may be described by a PIE, parameterized by elements of the PI algebra. A convex computational test for PDE stability is then proposed using a positive matrix parameterization of positive PI operators, and software (PIETOOLS) is provided which automates the process of representation and stability analysis of such PDEs. This software is used to analyze stability of 2D heat, wave, and plate equations, obtaining accurate bounds on the rate of decay.
Recursive Gaussian Process Regression with Integrated Monotonicity Assumptions for Control Applications
In this paper, we present an extension to the recursive Gaussian Process (RGP) regression that enables the satisfaction of inequality constraints and is well suited for a real-time execution in control applications. The soft inequality constraints are integrated by introducing an additional extended Kalman Filter (EKF) update step using pseudo-measurements. The sequential formulation of the algorithm and several developed heuristics ensure both the performance and a low computational effort of the algorithm. A special focus lies on an efficient consideration of monotonicity assumptions for GPs in the form of inequality constraints. The algorithm is statistically validated in simulations, where the possible advantages in comparison with the standard RGP algorithm become obvious. The paper is concluded with a successful experimental validation of the developed algorithm for the monotonicity-preserving learning of heat transfer values for the control of a vapor compression cycle evaporator, leveraging a previously published partial input output linearization (IOL).
comment: Accepted at ICINCO 2025 (22nd International Conference on Informatics in Control, Automation and Robotics)
Optimal Unpredictable Control for Linear Systems
In this paper, we investigate how to achieve the unpredictability against malicious inferences for linear systems. The key idea is to add stochastic control inputs, named as unpredictable control, to make the outputs irregular. The future outputs thus become unpredictable and the performance of inferences is degraded. The major challenges lie in: i) how to formulate optimization problems to obtain an optimal distribution of stochastic input, under unknown prediction accuracy of the adversary; and ii) how to achieve the trade-off between the unpredictability and control performance. We first utilize both variance and confidence probability of prediction error to quantify unpredictability, then formulate two two-stage stochastic optimization problems, respectively. Under variance metric, the analytic optimal distribution of control input is provided. With probability metric, it is a non-convex optimization problem, thus we present a novel numerical method and convert the problem into a solvable linear optimization problem. Last, we quantify the control performance under unpredictable control, and accordingly design the unpredictable LQR and cooperative control. Simulations demonstrate the unpredictability of our control algorithm. The obtained optimal distribution outperforms Gaussian and Laplace distributions commonly used in differential privacy under proposed metrics.
Assessment of Power System Stability Considering Multiple Time-Scale Dynamics: Insights into Hopf Bifurcations in Presence of GFL and GFM IBRs
Real power systems exhibit dynamics that evolve across a wide range of time scales, from very fast to very slow phenomena. Historically, incorporating these wide-ranging dynamics into a single model has been impractical. As a result, power engineers rely on time-scale decomposition to simplify models. When fast phenomena are evaluated, slow dynamics are neglected (assumed stable), and vice versa. This paper challenges this paradigm by showing the importance of assessing power system stability while considering multiple time scales simultaneously. Using the concept of Hopf bifurcations, it exemplifies instability issues that would be missed if multi-time-scale dynamics are not considered. Although this work employs both grid-following and grid-forming inverter-based resource models, it is not a direct comparison. Instead, it presents a case study demonstrating how one technology can complement the other from a multi time-scale dynamics perspective.
comment: 7 pages
Distributed Multiple Fault Detection and Estimation in DC Microgrids with Unknown Power Loads
This paper proposes a distributed diagnosis scheme to detect and estimate actuator and power line faults in DC microgrids subject to unknown power loads and stochastic noise. To address actuator faults, we design a fault estimation filter whose parameters are determined through a tractable optimization problem to achieve fault estimation, decoupling from power line faults, and robustness against noise. In contrast, the estimation of power line faults poses greater challenges due to the inherent coupling between fault currents and unknown power loads, which becomes ill-posed when the underlying system is insufficiently excited. To the best of our knowledge, this is the first study to address this critical yet underexplored issue. Our solution introduces a novel differentiate-before-estimate strategy. A set of diagnostic rules based on the temporal characteristics of a constructed residual is developed to distinguish load changes from line faults. Once a power line fault is detected, a regularized least-squares method is activated to estimate the fault currents, for which we further derive an upper bound on the estimation error. Finally, comprehensive simulation results validate the effectiveness of the proposed methods.
comment: 28 pages, 13 figures
Markov Chain-based Model of Blockchain Radio Access Networks
Security has always been a priority, for researchers, service providers and network operators when it comes to radio access networks (RAN). One wireless access approach that has captured attention is blockchain enabled RAN (B-RAN) due to its secure nature. This research introduces a framework that integrates blockchain technology into RAN while also addressing the limitations of state-of-the-art models. The proposed framework utilizes queuing and Markov chain theory to model the aspects of B-RAN. An extensive evaluation of the models performance is provided, including an analysis of timing factors and a focused assessment of its security aspects. The results demonstrate reduced latency and comparable security making the presented framework suitable for diverse application scenarios.
Reformulating Parallel-Connected Lithium-Ion Battery Pack Dynamics with Interconnection Resistances as Ordinary Differential Equations
This work presents analytical solutions for the current distribution in lithium-ion battery packs composed of cells connected in parallel, explicitly accounting for the presence of interconnection resistances. These solutions enable the reformulation of the differential-algebraic equations describing the pack dynamics into a set of ordinary differential equations, thereby simplifying simulation and analysis. Conditions under which uniform current sharing across all cells occurs are also derived. The proposed formulation is validated against experimental data and confirms its ability to capture the key behaviours induced by interconnection resistances. These results can support the improved design and control of parallel-connected battery packs.
Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)
This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially ``sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances and specific knowledge of the model. To our knowledge in the quadrotor control field, DiD-L is the first online learning approach that is lightweight enough to run in real-time at 400 Hz on microcontroller units (MCUs) such as STM32, and has been validated through real-world experiments.
Online Incident Response Planning under Model Misspecification through Bayesian Learning and Belief Quantization CCS
Effective responses to cyberattacks require fast decisions, even when information about the attack is incomplete or inaccurate. However, most decision-support frameworks for incident response rely on a detailed system model that describes the incident, which restricts their practical utility. In this paper, we address this limitation and present an online method for incident response planning under model misspecification, which we call MOBAL: Misspecified Online Bayesian Learning. MOBAL iteratively refines a conjecture about the model through Bayesian learning as new information becomes available, which facilitates model adaptation as the incident unfolds. To determine effective responses online, we quantize the conjectured model into a finite Markov model, which enables efficient response planning through dynamic programming. We prove that Bayesian learning is asymptotically consistent with respect to the information feedback. Additionally, we establish bounds on misspecification and quantization errors. Experiments on the CAGE-2 benchmark show that MOBAL outperforms the state of the art in terms of adaptability and robustness to model misspecification.
comment: Accepted to ACM CCS AISec2025
Smart Charging Impact Analysis using Clustering Methods and Real-world Distribution Feeders
The anticipated widespread adoption of electric vehicles (EVs) necessitates a critical evaluation of existing power distribution infrastructures, as EV integration imposes additional stress on distribution networks that can lead to component overloading and power quality degradation. Implementing smart charging mechanisms can mitigate these adverse effects and defer or even avoid upgrades. This study assesses the performance of two smart charging strategies - Time of Use (TOU) pricing and Load Balancing (LB) on seven representative real-world feeders identified using k-means clustering. A time series-based steady-state load flow analysis was conducted on these feeders to simulate the impact of EV charging under both strategies across four different EV enrollment scenarios and three representative days to capture seasonal load characteristics. A grid upgrade strategy has been proposed to strengthen the power grid to support EV integration with minimal cost. Results demonstrate that both TOU and LB strategies effectively manage the additional EV load with reduced upgrade requirement and cost to existing infrastructure compared to the case without smart charging strategies and LB outperforms TOU when the customer enrollment levels are high. These findings support the viability of smart charging in facilitating EV integration while maintaining distribution network reliability and reducing investment cost.
Discrete VHCs for Propeller Motion of a Devil-Stick using purely Impulsive Inputs
The control problem of realizing propeller motion of a devil-stick in the vertical plane using impulsive forces applied normal to the stick is considered. This problem is an example of underactuated robotic juggling and has not been considered in the literature before. Inspired by virtual holonomic constraints, the concept of discrete virtual holonomic constraints (DVHC) is introduced for the first time to solve this orbital stabilization problem. At the discrete instants when impulsive inputs are applied, the location of the center-of-mass of the devil-stick is specified in terms of its orientation angle. This yields the discrete zero dynamics (DZD), which provides conditions for stable propeller motion. In the limiting case, when the rotation angle between successive applications of impulsive inputs is chosen to be arbitrarily small, the problem reduces to that of propeller motion under continuous forcing. A controller that enforces the DVHC, and an orbit stabilizing controller based on the impulse controlled Poincar\'e map approach are presented. The efficacy of the approach to trajectory design and stabilization is validated through simulations.
comment: 16 pages, 11 figures. This work has been submitted to the IEEE for possible publication
Nonlinear Federated System Identification
We consider federated learning of linearly-parameterized nonlinear systems. We establish theoretical guarantees on the effectiveness of federated nonlinear system identification compared to centralized approaches, demonstrating that the convergence rate improves as the number of clients increases. Although the convergence rates in the linear and nonlinear cases differ only by a constant, this constant depends on the feature map $\phi$, which can be carefully chosen in the nonlinear setting to increase excitation and improve performance. We experimentally validate our theory in physical settings where client devices are driven by i.i.d. control inputs and control policies exhibiting i.i.d. random perturbations, ensuring non-active exploration. Experiments use trajectories from nonlinear dynamical systems characterized by real-analytic feature functions, including polynomial and trigonometric components, representative of physical systems including pendulum and quadrotor dynamics. We analyze the convergence behavior of the proposed method under varying noise levels and data distributions. Results show that federated learning consistently improves convergence of any individual client as the number of participating clients increases.
Structure-preserving Optimal Kron-based Reduction of Radial Distribution Networks
Network reduction simplifies complex electrical networks to address computational challenges of large-scale transmission and distribution grids. Traditional network reduction methods are often based on a predefined set of nodes or lines to remain in the reduced network. This paper builds upon previous work on Optimal Kron-based Reduction of Networks (Opti-KRON), which was formulated as a mixed-integer linear program (MILP), to enhance the framework in two aspects. First, the scalability is improved via a cutting plane restriction, tightened Big~M bounds, and a zero-injection node reduction stage. Next, we introduce a radiality-preservation step to identify and recover nodes whose restoration ensures radiality of the reduced network. The model is validated through its application to the 533-bus distribution test system and a 3499-bus realistic test feeder for a set of representative loading scenarios. In the 533-bus system, an 85% reduction was achieved with a maximum voltage error below 0.0025 p.u., while in the 3499-bus feeder, over 94% reduction was obtained with maximum voltage errors below 0.002 p.u. Additionally, we show that the radialization step accelerates the runtime of optimal voltage control problems when applied to Kron-reduced networks.
A Vision-Based Shared-Control Teleoperation Scheme for Controlling the Robotic Arm of a Four-Legged Robot
In hazardous and remote environments, robotic systems perform critical tasks demanding improved safety and efficiency. Among these, quadruped robots with manipulator arms offer mobility and versatility for complex operations. However, teleoperating quadruped robots is challenging due to the lack of integrated obstacle detection and intuitive control methods for the robotic arm, increasing collision risks in confined or dynamically changing workspaces. Teleoperation via joysticks or pads can be non-intuitive and demands a high level of expertise due to its complexity, culminating in a high cognitive load on the operator. To address this challenge, a teleoperation approach that directly maps human arm movements to the robotic manipulator offers a simpler and more accessible solution. This work proposes an intuitive remote control by leveraging a vision-based pose estimation pipeline that utilizes an external camera with a machine learning-based model to detect the operator's wrist position. The system maps these wrist movements into robotic arm commands to control the robot's arm in real-time. A trajectory planner ensures safe teleoperation by detecting and preventing collisions with both obstacles and the robotic arm itself. The system was validated on the real robot, demonstrating robust performance in real-time control. This teleoperation approach provides a cost-effective solution for industrial applications where safety, precision, and ease of use are paramount, ensuring reliable and intuitive robotic control in high-risk environments.
LyLA-Therm: Lyapunov-based Langevin Adaptive Thermodynamic Neural Network Controller
Thermodynamic principles can be employed to design parameter update laws that address challenges such as the exploration vs. exploitation dilemma. In this paper, inspired by the Langevin equation, an update law is developed for a Lyapunov-based DNN control method, taking the form of a stochastic differential equation. The drift term is designed to minimize the system's generalized internal energy, while the diffusion term is governed by a user-selected generalized temperature law, allowing for more controlled fluctuations. The minimization of generalized internal energy in this design fulfills the exploitation objective, while the temperature-based stochastic noise ensures sufficient exploration. Using a Lyapunov-based stability analysis, the proposed Lyapunov-based Langevin Adaptive Thermodynamic (LyLA-Therm) neural network controller achieves probabilistic convergence of the tracking and parameter estimation errors to an ultimate bound. Simulation results demonstrate the effectiveness of the proposed approach, with the LyLA-Therm architecture achieving up to 20.66% improvement in tracking errors, up to 20.89% improvement in function approximation errors, and up to 11.31% improvement in off-trajectory function approximation errors compared to the baseline deterministic approach.
Holo-Artisan: A Personalized Multi-User Holographic Experience for Virtual Museums on the Edge Intelligence
We present Holo-Artisan, a novel system architecture enabling immersive multi-user experiences in virtual museums through true holographic displays and personalized edge intelligence. In our design, local edge computing nodes process real-time user data -- including pose, facial expression, and voice -- for multiple visitors concurrently. Generative AI models then drive digital artworks (e.g., a volumetric Mona Lisa) to respond uniquely to each viewer. For instance, the Mona Lisa can return a smile to one visitor while engaging in a spoken Q\&A with another, all in real time. A cloud-assisted collaboration platform composes these interactions in a shared scene using a universal scene description, and employs ray tracing to render high-fidelity, personalized views with a direct pipeline to glasses-free holographic displays. To preserve user privacy and continuously improve personalization, we integrate federated learning (FL) -- edge devices locally fine-tune AI models and share only model updates for aggregation. This edge-centric approach minimizes latency and bandwidth usage, ensuring a synchronized shared experience with individual customization. Through Holo-Artisan, static museum exhibits are transformed into dynamic, living artworks that engage each visitor in a personal dialogue, heralding a new paradigm of cultural heritage interaction.
Amplitude maximization in stable systems, Schur positivity, and some conjectures on polynomial interpolation
For $r > 0$ and integers $t \ge n > 0$, we consider the following problem: maximize the amplitude $|x_t|$ at time $t$, over all complex solutions $x = (x_0, x_1, \dots)$ of arbitrary homogeneous linear difference equations of order $n$ with the characteristic roots in the disc $\{z \in \mathbb{C}: |z| \le r\}$, and with initial values $x_0, \dots, x_{n-1}$ in the unit disc. We find that for any triple $t,n,r$, the maximum is attained with coinciding roots on the boundary circle; in particular, this implies that the peak amplitude $\sup_{t \ge n} |x_t|$ can be maximized explicitly, by studying a unique equation with the characteristic polynomial $(z-r)^n$. Moreover, the optimality of the cophase root configuration holds for origin-centered polydiscs. To prove this result, we first reduce the problem to a certain interpolation problem over monomials, then solve the latter by leveraging the theory of symmetric functions and identifying the associated Schur positivity structure. We also discuss the implications for more general Reinhardt domains. Finally, we study the problem of estimating the derivatives of a real entire function from its values at $n/2$ pairs of complex conjugate points in the unit disc. We propose conjectures on the extremality of the monomial $z^n$, and restate them in terms of Schur polynomials.
comment: 18 pages; minor typo corrections viz. the previous version
Active Disturbance Rejection Control for Trajectory Tracking of a Seagoing USV: Design, Simulation, and Field Experiments IROS 2025
Unmanned Surface Vessels (USVs) face significant control challenges due to uncertain environmental disturbances like waves and currents. This paper proposes a trajectory tracking controller based on Active Disturbance Rejection Control (ADRC) implemented on the DUS V2500. A custom simulation incorporating realistic waves and current disturbances is developed to validate the controller's performance, supported by further validation through field tests in the harbour of Scheveningen, the Netherlands, and at sea. Simulation results demonstrate that ADRC significantly reduces cross-track error across all tested conditions compared to a baseline PID controller but increases control effort and energy consumption. Field trials confirm these findings while revealing a further increase in energy consumption during sea trials compared to the baseline.
comment: Accepted for presentation at IROS 2025. Accepted version
Fragile, Robust, and Antifragile: A Perspective from Parameter Responses in Reinforcement Learning Under Stress
This paper explores Reinforcement learning (RL) policy robustness by systematically analyzing network parameters under internal and external stresses. Inspired by synaptic plasticity in neuroscience, synaptic filtering introduces internal stress by selectively perturbing parameters, while adversarial attacks apply external stress through modified agent observations. This dual approach enables the classification of parameters as fragile, robust, or antifragile, based on their influence on policy performance in clean and adversarial settings. Parameter scores are defined to quantify these characteristics, and the framework is validated on PPO-trained agents in Mujoco continuous control environments. The results highlight the presence of antifragile parameters that enhance policy performance under stress, demonstrating the potential of targeted filtering techniques to improve RL policy adaptability. These insights provide a foundation for future advancements in the design of robust and antifragile RL systems.
comment: Withdrawn pending a review of attribution and overlap with Pravin et al., Artificial Intelligence (2024), DOI: 10.1016/j.artint.2023.104060. Further dissemination is paused while we determine appropriate next steps
Moving horizon estimation for nonlinear systems with time-varying parameters
We propose a moving horizon estimation scheme for estimating the states and time-varying parameters of nonlinear systems. We consider the case where observability of the parameters depends on the excitation of the system and may be absent during operation, with the parameter dynamics fulfilling a weak incremental bounded-energy bounded-state property to ensure boundedness of the estimation error (with respect to the disturbance energy). The proposed estimation scheme involves a standard quadratic cost function with an adaptive regularization term depending on the current parameter observability. We develop robustness guarantees for the overall estimation error that are valid for all times, and that improve the more often the parameters are detected to be observable during operation. The theoretical results are illustrated by a simulation example.
comment: Presented at IFAC NMPC 2024, Kyoto, Japan
Game-theoretic Energy Management Strategies With Interacting Agents in Formula 1
This paper presents an interaction-aware energy management optimization framework for Formula 1 racing. The considered scenario involves two agents and a drag reduction model. Strategic interactions between the agents are captured by a Stackelberg game formulated as a bilevel program. To address the computational challenges associated with bilevel optimization, the problem is reformulated as a single-level nonlinear program employing the Karush-Kuhn-Tucker conditions. The proposed framework contributes towards the development of new energy management and allocation strategies, caused by the presence of another agent. For instance, it provides valuable insights on how to redistribute the energy in order to optimally exploit the wake effect, showcasing a notable difference with the behavior studied in previous works. Robust energy allocations can be identified to reduce the lap time loss associated with unexpected choices of the other agent. It allows to recognize the boundary conditions for the interaction to become relevant, impacting the system's behavior, and to assess if overtaking is possible and beneficial. Overall, the framework provides a comprehensive approach for a two-agent Formula 1 racing problem with strategic interactions, offering physically intuitive and practical results.
Binary Decision Process in Pre-Evacuation Behavior
In crowd evacuation the time interval before decisive movement towards a safe place is defined as the pre-evacuation phase, and it has crucial impact on the total time required for safe egress. This process mainly refers to situation awareness and response to an external stressors, e.g., fire alarm. Due to the complexity of human cognitive process, simulation is used to study this important time interval. In this paper a binary decision process is formulated to simulate pre-evacuation time of many evacuees in a given social context. The model combines classic opinion dynamics with binary phase transition to describe how group pre-evacuation time emerges from individual interaction. The model parameters are quantitatively meaningful to human factors research within socio-psychological background, e.g., whether an individual is stubborn or open-minded, or what kind of the social topology exists among the individuals and how it matters in aggregating individuals into social groups. The modeling framework also describes collective motion of many evacuees in a planar space, and the resulting multi-agent system is partly similar to Vicsek model, and it is meaningful to explore complex crowd behavior in social context.
comment: 5 pages
Collision Avoidance for Convex Primitives via Differentiable Optimization Based High-Order Control Barrier Functions
Ensuring the safety of dynamical systems is crucial, where collision avoidance is a primary concern. Recently, control barrier functions (CBFs) have emerged as an effective method to integrate safety constraints into control synthesis through optimization techniques. However, challenges persist when dealing with convex primitives and tasks requiring torque control, as well as the occurrence of unintended equilibria. This work addresses these challenges by introducing a high-order CBF (HOCBF) framework for collision avoidance among convex primitives. We transform nonconvex safety constraints into linear constraints by differentiable optimization and prove the high-order continuous differentiability. Then, we employ HOCBFs to accommodate torque control, enabling tasks involving forces or high dynamics. Additionally, we analyze the issue of spurious equilibria in high-order cases and propose a circulation mechanism to prevent the undesired equilibria on the boundary of the safe set. Finally, we validate our framework with three experiments on the Franka Research 3 robotic manipulator, demonstrating successful collision avoidance and the efficacy of the circulation mechanism.
comment: Accepted to IEEE Transactions on Control Systems Technology
Data-Efficient System Identification via Lipschitz Neural Networks
Extracting dynamic models from data is of enormous importance in understanding the properties of unknown systems. In this work, we employ Lipschitz neural networks, a class of neural networks with a prescribed upper bound on their Lipschitz constant, to address the problem of data-efficient nonlinear system identification. Under the (fairly weak) assumption that the unknown system is Lipschitz continuous, we propose a method to estimate the approximation error bound of the trained network and the bound on the difference between the simulated trajectories by the trained models and the true system. Empirical results show that our method outperforms classic fully connected neural networks and Lipschitz regularized networks through simulation studies on three dynamical systems, and the advantage of our method is more noticeable when less data is used for training.
comment: Accepted at the 2025 American Control Conference (ACC)
Beyond Quadratic Costs: A Bregman Divergence Approach to H$_\infty$ Control
In the past couple of decades, non-quadratic convex penalties have reshaped signal processing and machine learning; in robust control, however, general convex costs break the Riccati and storage function structure that make the design tractable. Practitioners thus default to approximations, heuristics or robust model predictive control that are solved online for short horizons. We close this gap by extending $H_\infty$ control of discrete-time linear systems to strictly convex penalties on state, input, and disturbance, recasting the objective with Bregman divergences that admit a completion-of-squares decomposition. The result is a closed-form, time-invariant, full-information stabilizing controller that minimizes a worst-case performance ratio over the infinite horizon. Necessary and sufficient existence/optimality conditions are given by a Riccati-like identity together with a concavity requirement; with quadratic costs, these collapse to the classical $H_\infty$ algebraic Riccati equation and the associated negative-semidefinite condition, recovering the linear central controller. Otherwise, the optimal controller is nonlinear and can enable safety envelopes, sparse actuation, and bang-bang policies with rigorous $H_\infty$ guarantees.
A MILP-Based Solution to Multi-Agent Motion Planning and Collision Avoidance in Constrained Environments
We propose a mixed-integer linear program (MILP) for multi-agent motion planning that embeds Polytopic Action-based Motion Planning (PAAMP) into a sequence-then-solve pipeline. Region sequences confine each agent to adjacent convex polytopes, while a big-M hyperplane model enforces inter-agent separation. Collision constraints are applied only to agents sharing or neighboring a region, which reduces binary variables exponentially compared with naive formulations. An L1 path-length-plus-acceleration cost yields smooth trajectories. We prove finite-time convergence and demonstrate on representative multi-agent scenarios with obstacles that our formulation produces collision-free trajectories an order of magnitude faster than an unstructured MILP baseline.
comment: Accepted to 2025 IEEE International Conference on Automation Science and Engineering (CASE 2025). This arXiv version adds a supplementary appendix with figures not in the IEEE proceedings
Monotone Neural Control Barrier Certificates
This work presents a neurosymbolic framework for synthesizing and verifying safety controllers in high-dimensional monotone dynamical systems using only linear sample complexity, without requiring explicit models or conservative Lipschitz bounds. The approach combines the expressiveness of neural networks with the rigor of symbolic reasoning via barrier certificates, functional analogs of inductive invariants that formally guarantee safety. Prior data-driven methods often treat dynamics as black-box models, relying on dense state-space discretization or Lipschitz overapproximations, leading to exponential sample complexity. In contrast, monotonicity -- a pervasive structural property in many real-world systems -- provides a symbolic scaffold that simplifies both learning and verification. Exploiting order preservation reduces verification to localized boundary checks, transforming a high-dimensional problem into a tractable, low-dimensional one. Barrier certificates are synthesized using monotone neural networks -- architectures with embedded monotonicity constraints -- trained via gradient-based optimization guided by barrier conditions. This enables scalable, formally sound verification directly from simulation data, bridging black-box learning and formal guarantees within a unified neurosymbolic framework. Empirical results on three large-scale benchmarks -- a 1,000-dimensional freeway traffic model, a 50-dimensional urban traffic network, and a 13,000-dimensional power grid -- demonstrate the scalability and effectiveness of the approach in real-world, safety-critical systems.
comment: The work reached arXiv before full agreement among all co-authors was documented
Utility-Scale Bifacial Solar Photovoltaic System: Optimum Sizing and Techno-Economic Evaluation
Classical monofacial solar photovoltaic systems have gained prevalence and are widely reported in the literature because they have a lower initial cost compared with bifacial systems. However, limited investigation of both systems has been done on a utility scale with different performance indicators. This paper introduces a multifaceted comparative analysis including various aspects like energy generation, reliability, environmental effect, economic viability, and footprint area. Real measured data, including ambient temperature, solar irradiance, and a utility-scale load, were used for studying both systems in the City of Detroit. The optimal system sizing and energy management strategy are attained using the Whale optimization algorithm. Minimizing the loss of power supply probability and sizing the number of photovoltaic panels (NPV) are carried out for both cases. Results revealed that the bifacial solar system generates more power with a lower NPV, a smaller installation area, and hence a lower levelized cost of energy for the entire project lifetime compared to the monofacial system. Accordingly, the bifacial system outlined in this paper is recommended and can be implemented in various locations to establish a sustainable solar energy system that is economically feasible with clean energy production for the entire project's lifespan.
comment: This paper was accepted for, and published in, the proceedings of IEEE PES GM 2024
Assessing the Performance and Impact of PV Technologies on Storage in Hybrid Renewable Systems
Traditional monofacial photovoltaic (mPV) systems are commonly adopted and well-documented because of their lower upfront costs in comparison to bifacial photovoltaic (bPV) systems. This study investigates how PV technologies impact energy storage in grid-scale hybrid renewable systems, focusing on optimizing and assessing the performance of mPV and bPV technologies integrated with pumped storage hydropower. Using Ludington City, Michigan as a case study and analyzing realworld data such as solar irradiance, ambient temperature, and utility-scale load profiles, the research highlights the operational and economic benefits of bPV systems. The results reveal that bPV systems can pump approximately 10.38% more water annually to the upper reservoir while achieving a lower levelized cost of energy ($0.0578/kWh for bPV vs. $0.0672/kWh for mPV). This study underscores the outstanding potential of bPV systems in enhancing energy storage and management strategies, contributing to a more sustainable and resilient renewable energy future.
comment: This paper is accepted for publication in IEEE PES GM 2025
Impact Analysis of Utility-Scale Energy Storage on the ERCOT Grid in Reducing Renewable Generation Curtailments and Emissions
This paper explores the solutions for minimizing renewable energy (RE) curtailment in the Texas Electric Reliability Council of Texas (ERCOT) grid. By utilizing current and future planning data from ERCOT and the System Advisor Model from the National Renewable Energy Laboratory, we examine how future renewable energy (RE) initiatives, combined with utility-scale energy storage, can reduce CO2 emissions while reshaping Texas's energy mix. The study projects the energy landscape from 2023 to 2033, considering the planned phase-out of fossil fuel plants and the integration of new wind/solar projects. By comparing emissions under different load scenarios, with and without storage, we demonstrate storage's role in optimizing RE utilization. The findings of this paper provide actionable guidance for energy stakeholders, underscoring the need to expand wind and solar projects with strategic storage solutions to maximize Texas's RE capacity and substantially reduce CO2 emissions.
comment: This paper is accepted for publication in IEEE PES GM 2025
Robotics
Train Once, Deploy Anywhere: Realize Data-Efficient Dynamic Object Manipulation
Realizing generalizable dynamic object manipulation is important for enhancing manufacturing efficiency, as it eliminates specialized engineering for various scenarios. To this end, imitation learning emerges as a promising paradigm, leveraging expert demonstrations to teach a policy manipulation skills. Although the generalization of an imitation learning policy can be improved by increasing demonstrations, demonstration collection is labor-intensive. To address this problem, this paper investigates whether strong generalization in dynamic object manipulation is achievable with only a few demonstrations. Specifically, we develop an entropy-based theoretical framework to quantify the optimization of imitation learning. Based on this framework, we propose a system named Generalizable Entropy-based Manipulation (GEM). Extensive experiments in simulated and real tasks demonstrate that GEM can generalize across diverse environment backgrounds, robot embodiments, motion dynamics, and object geometries. Notably, GEM has been deployed in a real canteen for tableware collection. Without any in-scene demonstration, it achieves a success rate of over 97% across more than 10,000 operations.
ResPlan: A Large-Scale Vector-Graph Dataset of 17,000 Residential Floor Plans
We introduce ResPlan, a large-scale dataset of 17,000 detailed, structurally rich, and realistic residential floor plans, created to advance spatial AI research. Each plan includes precise annotations of architectural elements (walls, doors, windows, balconies) and functional spaces (such as kitchens, bedrooms, and bathrooms). ResPlan addresses key limitations of existing datasets such as RPLAN (Wu et al., 2019) and MSD (van Engelenburg et al., 2024) by offering enhanced visual fidelity and greater structural diversity, reflecting realistic and non-idealized residential layouts. Designed as a versatile, general-purpose resource, ResPlan supports a wide range of applications including robotics, reinforcement learning, generative AI, virtual and augmented reality, simulations, and game development. Plans are provided in both geometric and graph-based formats, enabling direct integration into simulation engines and fast 3D conversion. A key contribution is an open-source pipeline for geometry cleaning, alignment, and annotation refinement. Additionally, ResPlan includes structured representations of room connectivity, supporting graph-based spatial reasoning tasks. Finally, we present comparative analyses with existing benchmarks and outline several open benchmark tasks enabled by ResPlan. Ultimately, ResPlan offers a significant advance in scale, realism, and usability, providing a robust foundation for developing and benchmarking next-generation spatial intelligence systems.
comment: 18 pages, 3 figures, 4 tables
Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation
Generalization in embodied AI is hindered by the "seeing-to-doing gap," which stems from data scarcity and embodiment heterogeneity. To address this, we pioneer "pointing" as a unified, embodiment-agnostic intermediate representation, defining four core embodied pointing abilities that bridge high-level vision-language comprehension with low-level action primitives. We introduce Embodied-R1, a 3B Vision-Language Model (VLM) specifically designed for embodied reasoning and pointing. We use a wide range of embodied and general visual reasoning datasets as sources to construct a large-scale dataset, Embodied-Points-200K, which supports key embodied pointing capabilities. We then train Embodied-R1 using a two-stage Reinforced Fine-tuning (RFT) curriculum with a specialized multi-task reward design. Embodied-R1 achieves state-of-the-art performance on 11 embodied spatial and pointing benchmarks. Critically, it demonstrates robust zero-shot generalization by achieving a 56.2% success rate in the SIMPLEREnv and 87.5% across 8 real-world XArm tasks without any task-specific fine-tuning, representing a 62% improvement over strong baselines. Furthermore, the model exhibits high robustness against diverse visual disturbances. Our work shows that a pointing-centric representation, combined with an RFT training paradigm, offers an effective and generalizable pathway to closing the perception-action gap in robotics.
comment: Embodied-R1 technical report
The Social Context of Human-Robot Interactions
The Human-Robot Interaction (HRI) community often highlights the social context of an interaction as a key consideration when designing, implementing, and evaluating robot behavior. Unfortunately, researchers use the term "social context" in varied ways. This can lead to miscommunication, making it challenging to draw connections between related work on understanding and modeling the social contexts of human-robot interactions. To address this gap, we survey the HRI literature for existing definitions and uses of the term "social context". Then, we propose a conceptual model for describing the social context of a human-robot interaction. We apply this model to existing work, and we discuss a range of attributes of social contexts that can help researchers plan for interactions, develop behavior models for robots, and gain insights after interactions have taken place. We conclude with a discussion of open research questions in relation to understanding and modeling the social contexts of human-robot interactions.
comment: To be published in Annual Review of Control, Robotics, and Autonomous Systems
Toward an Interaction-Centered Approach to Robot Trustworthiness
As robots get more integrated into human environments, fostering trustworthiness in embodied robotic agents becomes paramount for an effective and safe human-robot interaction (HRI). To achieve that, HRI applications must promote human trust that aligns with robot skills and avoid misplaced trust or overtrust, which can pose safety risks and ethical concerns. To achieve that, HRI applications must promote human trust that aligns with robot skills and avoid misplaced trust or overtrust, which can pose safety risks and ethical concerns. In this position paper, we outline an interaction-based framework for building trust through mutual understanding between humans and robots. We emphasize two main pillars: human awareness and transparency, referring to the robot ability to interpret human actions accurately and to clearly communicate its intentions and goals, respectively. By integrating these two pillars, robots can behave in a manner that aligns with human expectations and needs while providing their human partners with both comprehension and control over their actions. We also introduce four components that we think are important for bridging the gap between a human-perceived sense of trust and a robot true capabilities.
comment: 4 pages, presented at TRUST workshop, organised in conjunction with the IEEE RO-MAN 2025 conference, held in Eindhoven, Netherlands
Augmenting cobots for sheet-metal SMEs with 3D object recognition and localisation
Due to high-mix-low-volume production, sheet-metal workshops today are challenged by small series and varying orders. As standard automation solutions tend to fall short, SMEs resort to repetitive manual labour impacting production costs and leading to tech-skilled workforces not being used to their full potential. The COOCK+ ROBUST project aims to transform cobots into mobile and reconfigurable production assistants by integrating existing technologies, including 3D object recognition and localisation. This article explores both the opportunities and challenges of enhancing cobotic systems with these technologies in an industrial setting, outlining the key steps involved in the process. Additionally, insights from a past project, carried out by the ACRO research unit in collaboration with an industrial partner, serves as a concrete implementation example throughout.
comment: 13 pages, 25 figures
Multimodal Data Storage and Retrieval for Embodied AI: A Survey
Embodied AI (EAI) agents continuously interact with the physical world, generating vast, heterogeneous multimodal data streams that traditional management systems are ill-equipped to handle. In this survey, we first systematically evaluate five storage architectures (Graph Databases, Multi-Model Databases, Data Lakes, Vector Databases, and Time-Series Databases), focusing on their suitability for addressing EAI's core requirements, including physical grounding, low-latency access, and dynamic scalability. We then analyze five retrieval paradigms (Fusion Strategy-Based Retrieval, Representation Alignment-Based Retrieval, Graph-Structure-Based Retrieval, Generation Model-Based Retrieval, and Efficient Retrieval-Based Optimization), revealing a fundamental tension between achieving long-term semantic coherence and maintaining real-time responsiveness. Based on this comprehensive analysis, we identify key bottlenecks, spanning from the foundational Physical Grounding Gap to systemic challenges in cross-modal integration, dynamic adaptation, and open-world generalization. Finally, we outline a forward-looking research agenda encompassing physics-aware data models, adaptive storage-retrieval co-optimization, and standardized benchmarking, to guide future research toward principled data management solutions for EAI. Our survey is based on a comprehensive review of more than 180 related studies, providing a rigorous roadmap for designing the robust, high-performance data management frameworks essential for the next generation of autonomous embodied systems.
Driving Style Recognition Like an Expert Using Semantic Privileged Information from Large Language Models
Existing driving style recognition systems largely depend on low-level sensor-derived features for training, neglecting the rich semantic reasoning capability inherent to human experts. This discrepancy results in a fundamental misalignment between algorithmic classifications and expert judgments. To bridge this gap, we propose a novel framework that integrates Semantic Privileged Information (SPI) derived from large language models (LLMs) to align recognition outcomes with human-interpretable reasoning. First, we introduce DriBehavGPT, an interactive LLM-based module that generates natural-language descriptions of driving behaviors. These descriptions are then encoded into machine learning-compatible representations via text embedding and dimensionality reduction. Finally, we incorporate them as privileged information into Support Vector Machine Plus (SVM+) for training, enabling the model to approximate human-like interpretation patterns. Experiments across diverse real-world driving scenarios demonstrate that our SPI-enhanced framework outperforms conventional methods, achieving F1-score improvements of 7.6% (car-following) and 7.9% (lane-changing). Importantly, SPI is exclusively used during training, while inference relies solely on sensor data, ensuring computational efficiency without sacrificing performance. These results highlight the pivotal role of semantic behavioral representations in improving recognition accuracy while advancing interpretable, human-centric driving systems.
Toward Deployable Multi-Robot Collaboration via a Symbolically-Guided Decision Transformer
Reinforcement learning (RL) has demonstrated great potential in robotic operations. However, its data-intensive nature and reliance on the Markov Decision Process (MDP) assumption limit its practical deployment in real-world scenarios involving complex dynamics and long-term temporal dependencies, such as multi-robot manipulation. Decision Transformers (DTs) have emerged as a promising offline alternative by leveraging causal transformers for sequence modeling in RL tasks. However, their applications to multi-robot manipulations still remain underexplored. To address this gap, we propose a novel framework, Symbolically-Guided Decision Transformer (SGDT), which integrates a neuro-symbolic mechanism with a causal transformer to enable deployable multi-robot collaboration. In the proposed SGDT framework, a neuro-symbolic planner generates a high-level task-oriented plan composed of symbolic subgoals. Guided by these subgoals, a goal-conditioned decision transformer (GCDT) performs low-level sequential decision-making for multi-robot manipulation. This hierarchical architecture enables structured, interpretable, and generalizable decision making in complex multi-robot collaboration tasks. We evaluate the performance of SGDT across a range of task scenarios, including zero-shot and few-shot scenarios. To our knowledge, this is the first work to explore DT-based technology for multi-robot manipulation.
A Screw Approach to the Approximation of the Local Geometry of the Configuration Space and of the set of Configurations of Certain Rank of Lower Pair Linkages
A motion of a mechanism is a curve in its configuration space (c-space). Singularities of the c-space are kinematic singularities of the mechanism. Any mobility analysis of a particular mechanism amounts to investigating the c-space geometry at a given configuration. A higher-order analysis is necessary to determine the finite mobility. To this end, past research lead to approaches using higher-order time derivatives of loop closure constraints assuming (implicitly) that all possible motions are smooth. This continuity assumption limits the generality of these methods. In this paper an approach to the higher-order local mobility analysis of lower pair multi-loop linkages is presented. This is based on a higher-order Taylor series expansion of the geometric constraint mapping, for which a recursive algebraic expression in terms of joint screws is presented. An exhaustive local analysis includes analysis of the set of constraint singularities (configurations where the constraint Jacobian has certain corank). A local approximation of the set of configurations with certain rank is presented, along with an explicit expression for the differentials of Jacobian minors in terms of instantaneous joint screws. The c-space and the set of points of certain corank are therewith locally approximated by an algebraic variety determined algebraically from the mechanism's screw system. Results are shown for a simple planar 4-bar linkage, which exhibits a bifurcation singularity, and for a planar three-loop linkage exhibiting a cusp in c-space. The latter cannot be treated by the higher-order local analysis methods proposed in the literature.
Trajectory Tracking and Stabilization of Quadrotors Using Deep Koopman Model Predictive Control
This paper presents a data-driven control framework for quadrotor systems that integrates a deep Koopman operator with model predictive control (DK-MPC). The deep Koopman operator is trained on sampled flight data to construct a high-dimensional latent representation in which the nonlinear quadrotor dynamics are approximated by linear models. This linearization enables the application of MPC to efficiently optimize control actions over a finite prediction horizon, ensuring accurate trajectory tracking and stabilization. The proposed DK-MPC approach is validated through a series of trajectory-following and point-stabilization numerical experiments, where it demonstrates superior tracking accuracy and significantly lower computation time compared to conventional nonlinear MPC. These results highlight the potential of Koopman-based learning methods to handle complex quadrotor dynamics while meeting the real-time requirements of embedded flight control. Future work will focus on extending the framework to more agile flight scenarios and improving robustness against external disturbances.
Blast Hole Seeking and Dipping -- The Navigation and Perception Framework in a Mine Site Inspection Robot
In open-pit mining, holes are drilled into the surface of the excavation site and detonated with explosives to facilitate digging. These blast holes need to be inspected internally for investigation of downhole material types and properties. Knowing these properties can lead to significant savings in material handling costs in downstream processes. Manual hole inspection is slow and expensive, with major limitations in revealing the geometric and geological properties of the holes and their contents. This has been the motivation for the development of our autonomous mine-site inspection robot - "DIPPeR". In this paper, the automation aspect of the project is explained. We present a robust blast hole seeking and detection framework that enables target-based navigation and accurate down-hole sensor positioning. The pipeline first processes point-cloud data collected by the on-board LiDAR sensors, extracting the cone-shaped volume of drill-waste above the ground. By projecting the 3D cone points into a virtual depth image, segmentation is achieved in the 2D domain, yielding a circular hole at the image centre and a collared cone face. We then identify the hole centre using a robust detection module while suppressing non-maximum candidates, ensuring precise sensor placement for down-hole inspection and avoiding collisions with the cavity wall. To enable autonomous hole-seeking, the pipeline automatically adjusts its projection parameters during robot navigation to account for variations in point sparsity and hole opening size, ensuring a consistent hole appearance in 2D images. This allows continuous tracking of the target hole as the robot approaches the goal point. We demonstrate the effectiveness of our navigation and perception system in both high-fidelity simulation environments and on-site field tests. A demonstration video is available at "https://www.youtube.com/watch?v=fRNbcBcaSqE".
MR6D: Benchmarking 6D Pose Estimation for Mobile Robots CVPR 2025
Existing 6D pose estimation datasets primarily focus on small household objects typically handled by robot arm manipulators, limiting their relevance to mobile robotics. Mobile platforms often operate without manipulators, interact with larger objects, and face challenges such as long-range perception, heavy self-occlusion, and diverse camera perspectives. While recent models generalize well to unseen objects, evaluations remain confined to household-like settings that overlook these factors. We introduce MR6D, a dataset designed for 6D pose estimation for mobile robots in industrial environments. It includes 92 real-world scenes featuring 16 unique objects across static and dynamic interactions. MR6D captures the challenges specific to mobile platforms, including distant viewpoints, varied object configurations, larger object sizes, and complex occlusion/self-occlusion patterns. Initial experiments reveal that current 6D pipelines underperform in these settings, with 2D segmentation being another hurdle. MR6D establishes a foundation for developing and evaluating pose estimation methods tailored to the demands of mobile robotics. The dataset is available at https://huggingface.co/datasets/anas-gouda/mr6d.
comment: accepted CVPR 2025 Workshop on Recovering 6D Object Pose (R6D)
Assessing Pedestrian Behavior Around Autonomous Cleaning Robots in Public Spaces: Findings from a Field Observation
As autonomous robots become more common in public spaces, spontaneous encounters with laypersons are more frequent. For this, robots need to be equipped with communication strategies that enhance momentary transparency and reduce the probability of critical situations. Adapting these robotic strategies requires consideration of robot movements, environmental conditions, and user characteristics and states. While numerous studies have investigated the impact of distraction on pedestrians' movement behavior, limited research has examined this behavior in the presence of autonomous robots. This research addresses the impact of robot type and robot movement pattern on distracted and undistracted pedestrians' movement behavior. In a field setting, unaware pedestrians were videotaped while moving past two working, autonomous cleaning robots. Out of N=498 observed pedestrians, approximately 8% were distracted by smartphones. Distracted and undistracted pedestrians did not exhibit significant differences in their movement behaviors around the robots. Instead, both the larger sweeping robot and the offset rectangular movement pattern significantly increased the number of lateral adaptations compared to the smaller cleaning robot and the circular movement pattern. The offset rectangular movement pattern also led to significantly more close lateral adaptations. Depending on the robot type, the movement patterns led to differences in the distances of lateral adaptations. The study provides initial insights into pedestrian movement behavior around an autonomous cleaning robot in public spaces, contributing to the growing field of HRI research.
AutoMPC: A Code Generator for MPC-based Automated Driving
Model Predictive Control (MPC) is a powerful technique to control nonlinear, multi-input multi-output systems subject to input and state constraints. It is now a standard tool for trajectory tracking control of automated vehicles. As such it has been used in many research and development projects. However, MPC faces several challenges to be integrated into industrial production vehicles. The most important ones are its high computational demands and the complexity of implementation. The software packages AutoMPC aims to address both of these challenges. It builds on a robustified version of an active set algorithm for Nonlinear MPC. The algorithm is embedded into a framework for vehicle trajectory tracking, which makes it easy to used, yet highly customizable. Automatic code generation transforms the selections into a standalone, computationally efficient C-code file with static memory allocation. As such it can be readily deployed on a wide range of embedded platforms, e.g., based on Matlab/Simulink or Robot Operating System (ROS). Compared to a previous version of the code, the vehicle model and the numerical integration method can be manually specified, besides basic algorithm parameters. All of this information and all specifications are directly baked into the generated C-code. The algorithm is suitable driving scenarios at low or high speeds, even drifting, and supports direction changes. Multiple simulation scenarios show the versatility and effectiveness of the AutoMPC code, with the guarantee of a feasible solution, a high degree of robustness, and computational efficiency.
comment: Technical Documentation
The 9th AI City Challenge ICCV 2025
The ninth AI City Challenge continues to advance real-world applications of computer vision and AI in transportation, industrial automation, and public safety. The 2025 edition featured four tracks and saw a 17% increase in participation, with 245 teams from 15 countries registered on the evaluation server. Public release of challenge datasets led to over 30,000 downloads to date. Track 1 focused on multi-class 3D multi-camera tracking, involving people, humanoids, autonomous mobile robots, and forklifts, using detailed calibration and 3D bounding box annotations. Track 2 tackled video question answering in traffic safety, with multi-camera incident understanding enriched by 3D gaze labels. Track 3 addressed fine-grained spatial reasoning in dynamic warehouse environments, requiring AI systems to interpret RGB-D inputs and answer spatial questions that combine perception, geometry, and language. Both Track 1 and Track 3 datasets were generated in NVIDIA Omniverse. Track 4 emphasized efficient road object detection from fisheye cameras, supporting lightweight, real-time deployment on edge devices. The evaluation framework enforced submission limits and used a partially held-out test set to ensure fair benchmarking. Final rankings were revealed after the competition concluded, fostering reproducibility and mitigating overfitting. Several teams achieved top-tier results, setting new benchmarks in multiple tasks.
comment: Summary of the 9th AI City Challenge Workshop in conjunction with ICCV 2025
MimicFunc: Imitating Tool Manipulation from a Single Human Video via Functional Correspondence
Imitating tool manipulation from human videos offers an intuitive approach to teaching robots, while also providing a promising and scalable alternative to labor-intensive teleoperation data collection for visuomotor policy learning. While humans can mimic tool manipulation behavior by observing others perform a task just once and effortlessly transfer the skill to diverse tools for functionally equivalent tasks, current robots struggle to achieve this level of generalization. A key challenge lies in establishing function-level correspondences, considering the significant geometric variations among functionally similar tools, referred to as intra-function variations. To address this challenge, we propose MimicFunc, a framework that establishes functional correspondences with function frame, a function-centric local coordinate frame constructed with keypoint-based abstraction, for imitating tool manipulation skills. Experiments demonstrate that MimicFunc effectively enables the robot to generalize the skill from a single RGB-D human video to manipulating novel tools for functionally equivalent tasks. Furthermore, leveraging MimicFunc's one-shot generalization capability, the generated rollouts can be used to train visuomotor policies without requiring labor-intensive teleoperation data collection for novel objects. Our code and video are available at https://sites.google.com/view/mimicfunc.
comment: Accepted to CoRL 2025
A Three-Level Whole-Body Disturbance Rejection Control Framework for Dynamic Motions in Legged Robots
This paper presents a control framework designed to enhance the stability and robustness of legged robots in the presence of uncertainties, including model uncertainties, external disturbances, and faults. The framework enables the full-state feedback estimator to estimate and compensate for uncertainties in whole-body dynamics of the legged robots. First, we propose a novel moving horizon extended state observer (MH-ESO) to estimate uncertainties and mitigate noise in legged systems, which can be integrated into the framework for disturbance compensation. Second, we introduce a three-level whole-body disturbance rejection control framework (T-WB-DRC). Unlike the previous two-level approach, this three-level framework considers both the plan based on whole-body dynamics without uncertainties and the plan based on dynamics with uncertainties, significantly improving payload transportation, external disturbance rejection, and fault tolerance. Third, simulations of both humanoid and quadruped robots in the Gazebo simulator demonstrate the effectiveness and versatility of T-WB-DRC. Finally, extensive experimental trials on a quadruped robot validate the robustness and stability of the system when using T-WB-DRC under various disturbance conditions.
Unified Hierarchical MPC in Task Executing for Modular Manipulators across Diverse Morphologies
This work proposes a unified Hierarchical Model Predictive Control (H-MPC) for modular manipulators across various morphologies, as the controller can adapt to different configurations to execute the given task without extensive parameter tuning in the controller. The H-MPC divides the control process into two levels: a high-level MPC and a low-level MPC. The high-level MPC predicts future states and provides trajectory information, while the low-level MPC refines control actions by updating the predictive model based on this high-level information. This hierarchical structure allows for the integration of kinematic constraints and ensures smooth joint-space trajectories, even near singular configurations. Moreover, the low-level MPC incorporates secondary linearization by leveraging predictive information from the high-level MPC, effectively capturing the second-order Taylor expansion information of the kinematic model while still maintaining a linearized model formulation. This approach not only preserves the simplicity of a linear control model but also enhances the accuracy of the kinematic representation, thereby improving overall control precision and reliability. To validate the effectiveness of the control policy, we conduct extensive evaluations across different manipulator morphologies and demonstrate the execution of pick-and-place tasks in real-world scenarios.
ROVER: Robust Loop Closure Verification with Trajectory Prior in Repetitive Environments
Loop closure detection is important for simultaneous localization and mapping (SLAM), which associates current observations with historical keyframes, achieving drift correction and global relocalization. However, a falsely detected loop can be fatal, and this is especially difficult in repetitive environments where appearance-based features fail due to the high similarity. Therefore, verification of a loop closure is a critical step in avoiding false positive detections. Existing works in loop closure verification predominantly focus on learning invariant appearance features, neglecting the prior knowledge of the robot's spatial-temporal motion cue, i.e., trajectory. In this letter, we propose ROVER, a loop closure verification method that leverages the historical trajectory as a prior constraint to reject false loops in challenging repetitive environments. For each loop candidate, it is first used to estimate the robot trajectory with pose-graph optimization. This trajectory is then submitted to a scoring scheme that assesses its compliance with the trajectory without the loop, which we refer to as the trajectory prior, to determine if the loop candidate should be accepted. Benchmark comparisons and real-world experiments demonstrate the effectiveness of the proposed method. Furthermore, we integrate ROVER into state-of-the-art SLAM systems to verify its robustness and efficiency. Our source code and self-collected dataset are available at https://github.com/jarvisyjw/ROVER.
comment: 8 pages, 9 figures
Multi-Robot Navigation in Social Mini-Games: Definitions, Taxonomy, and Algorithms
The ``Last Mile Challenge'' has long been considered an important, yet unsolved, challenge for autonomous vehicles, public service robots, and delivery robots. A central issue in this challenge is the ability of robots to navigate constrained and cluttered environments (e.g., doorways, hallways, corridor intersections), often while competing for space with other robots and humans. We refer to these environments as ``Social Mini-Games'' (SMGs). SMGs are tightly coupled, high-agency interactions that arise within general multi-robot navigation (MRN) scenarios. They are identified through certain distinct characteristics and require specialized metrics to evaluate them. Traditional navigation approaches designed for MRN do not perform well in SMGs, which has led to focused research on dedicated SMG solvers (navigation methods specialized to navigate in SMGs), which has flourished in recent years. However, publications on SMG navigation research make different assumptions (on centralized versus decentralized, observability, communication, cooperation, etc.), and have different objective functions (safety versus liveness). These assumptions and objectives are sometimes implicitly assumed or described informally. This makes it difficult to establish appropriate baselines for comparison in research papers, as well as making it difficult for practitioners to find the papers relevant to their concrete application. Such ad-hoc representation of the field also presents a barrier to new researchers wanting to start research in this area. SMG navigation research requires its own taxonomy, definitions, and evaluation protocols to guide effective research moving forward. This survey is the first to catalog SMG solvers using a well-defined and unified taxonomy and to classify existing methods accordingly.
Modeling and Control of AWOISV: A Filtered Tube-Based MPC Approach for Simultaneous Tracking of Lateral Position and Heading Angle
An all-wheel omni-directional independent steering vehicle (AWOISV) is a specialized all-wheel independent steering vehicle with each wheel capable of steering up to 90{\deg}, enabling unique maneuvers like yaw and diagonal movement. This paper introduces a theoretical steering radius angle and sideslip angle (\( \theta_R \)-\(\beta_R \)) representation, based on the position of the instantaneous center of rotation relative to the wheel rotation center, defining the motion modes and switching criteria for AWOISVs. A generalized \( v\)-\(\beta\)-\(r \) dynamic model is developed with forward velocity \(v\), sideslip angle \(\beta\), and yaw rate \(r\) as states, and \(\theta_R\) and \(\beta_R\) as control inputs. This model decouples longitudinal and lateral motions into forward and rotational motions, allowing seamless transitions across all motion modes under specific conditions. A filtered tube-based linear time-varying MPC (FT-LTVMPC) strategy is proposed, achieving simultaneous tracking of lateral position and arbitrary heading angles, with robustness to model inaccuracies and parameter uncertainties. Co-simulation and hardware-in-loop (HIL) experiments confirm that FT-LTVMPC enables high-precision control of both position and heading while ensuring excellent real-time performance.
CAST: Counterfactual Labels Improve Instruction Following in Vision-Language-Action Models
Generalist robots should be able to understand and follow user instructions, but current vision-language-action (VLA) models struggle with following fine-grained commands despite providing a powerful architecture for mapping open-vocabulary natural language instructions to robot actions. One cause for this is a lack of semantic diversity and language grounding in existing robot datasets and, specifically, a lack of fine-grained task diversity for similar observations. To address this, we present a novel method to augment existing robot datasets by leveraging vision language models to create counterfactual labels. Our method improves the language-following capabilities of VLAs by increasing the diversity and granularity of language grounding for robot datasets by generating counterfactual language and actions. We evaluate the resulting model's ability to follow language instructions, ranging from simple object-centric commands to complex referential tasks, by conducting visual language navigation experiments in 3 different indoor and outdoor environments. Our experiments demonstrate that counterfactual relabeling, without any additional data collection, significantly improves instruction-following in VLA policies, making them competitive with state-of-the-art methods and increasing success rate by 27% on navigation tasks.
Switch4EAI: Leveraging Console Game Platform for Benchmarking Robotic Athletics
Recent advances in whole-body robot control have enabled humanoid and legged robots to execute increasingly agile and coordinated movements. However, standardized benchmarks for evaluating robotic athletic performance in real-world settings and in direct comparison to humans remain scarce. We present Switch4EAI(Switch-for-Embodied-AI), a low-cost and easily deployable pipeline that leverages motion-sensing console games to evaluate whole-body robot control policies. Using Just Dance on the Nintendo Switch as a representative example, our system captures, reconstructs, and retargets in-game choreography for robotic execution. We validate the system on a Unitree G1 humanoid with an open-source whole-body controller, establishing a quantitative baseline for the robot's performance against a human player. In the paper, we discuss these results, which demonstrate the feasibility of using commercial games platform as physically grounded benchmarks and motivate future work to for benchmarking embodied AI.
comment: Workshop Submission
Adapting Biological Reflexes for Dynamic Reorientation in Space Manipulator Systems
Robotic arms mounted on spacecraft, known as space manipulator systems (SMSs), are critical for enabling on-orbit assembly, satellite servicing, and debris removal. However, controlling these systems in microgravity remains a significant challenge due to the dynamic coupling between the manipulator and the spacecraft base. This study explores the potential of using biological inspiration to address this issue, focusing on animals, particularly lizards, that exhibit mid-air righting reflexes. Based on similarities between SMSs and these animals in terms of behavior, morphology, and environment, their air-righting motion trajectories are extracted from high-speed video recordings using computer vision techniques. These trajectories are analyzed within a multi-objective optimization framework to identify the key behavioral goals and assess their relative importance. The resulting motion profiles are then applied as reference trajectories for SMS control, with baseline controllers used to track them. The findings provide a step toward translating evolved animal behaviors into interpretable, adaptive control strategies for space robotics, with implications for improving maneuverability and robustness in future missions.
comment: 18 pages, 11 figures, 2025 AAS/AIAA Astrodynamics Specialist Conference
SLAM-based Safe Indoor Exploration Strategy
This paper suggests a 2D exploration strategy for a planar space cluttered with obstacles. Rather than using point robots capable of adjusting their position and altitude instantly, this research is tailored to classical agents with circular footprints that cannot control instantly their pose. Inhere, a self-balanced dual-wheeled differential drive system is used to explore the place. The system is equipped with linear accelerometers and angular gyroscopes, a 3D-LiDAR, and a forward-facing RGB-D camera. The system performs RTAB-SLAM using the IMU and the LiDAR, while the camera is used for loop closures. The mobile agent explores the planar space using a safe skeleton approach that places the agent as far as possible from the static obstacles. During the exploration strategy, the heading is towards any offered openings of the space. This space exploration strategy has as its highest priority the agent's safety in avoiding the obstacles followed by the exploration of undetected space. Experimental studies with a ROS-enabled mobile agent are presented indicating the path planning strategy while exploring the space.
comment: 5 pages, 8 figures. Published in the 2025 11th International Conference on Automation, Robotics, and Applications (ICARA)
Lightweight Tracking Control for Computationally Constrained Aerial Systems with the Newton-Raphson Method
We investigate the performance of a lightweight tracking controller, based on a flow version of the Newton-Raphson method, applied to a miniature blimp and a mid-size quadrotor. This tracking technique has been shown to enjoy theoretical guarantees of performance and has been applied with success in simulation studies and on mobile robots with simple motion models. This paper investigates the technique through real-world flight experiments on aerial hardware platforms subject to realistic deployment and onboard computational constraints. The technique's performance is assessed in comparison with the established control frameworks of feedback linearization for the blimp, and nonlinear model predictive control for both quadrotor and blimp. The performance metrics under consideration are (i) root mean square error of flight trajectories with respect to target trajectories, (ii) algorithms' computation times, and (iii) CPU energy consumption associated with the control algorithms. The experimental findings show that the Newton-Raphson flow-based tracking controller achieves comparable or superior tracking performance to the baseline methods with substantially reduced computation time and energy expenditure.
Towards Unified Probabilistic Verification and Validation of Vision-Based Autonomy
Precise and comprehensive situational awareness is a critical capability of modern autonomous systems. Deep neural networks that perceive task-critical details from rich sensory signals have become ubiquitous; however, their black-box behavior and sensitivity to environmental uncertainty and distribution shifts make them challenging to verify formally. Abstraction-based verification techniques for vision-based autonomy produce safety guarantees contingent on rigid assumptions, such as bounded errors or known unique distributions. Such overly restrictive and inflexible assumptions limit the validity of the guarantees, especially in diverse and uncertain test-time environments. We propose a methodology that unifies the verification models of perception with their offline validation. Our methodology leverages interval MDPs and provides a flexible end-to-end guarantee that adapts directly to the out-of-distribution test-time conditions. We evaluate our methodology on a synthetic perception Markov chain with well-defined state estimation distributions and a mountain car benchmark. Our findings reveal that we can guarantee tight yet rigorous bounds on overall system safety.
comment: Accepted by the 23rd International Symposium on Automated Technology for Verification and Analysis (ATVA'25)
RynnEC: Bringing MLLMs into Embodied World
We introduce RynnEC, a video multimodal large language model designed for embodied cognition. Built upon a general-purpose vision-language foundation model, RynnEC incorporates a region encoder and a mask decoder, enabling flexible region-level video interaction. Despite its compact architecture, RynnEC achieves state-of-the-art performance in object property understanding, object segmentation, and spatial reasoning. Conceptually, it offers a region-centric video paradigm for the brain of embodied agents, providing fine-grained perception of the physical world and enabling more precise interactions. To mitigate the scarcity of annotated 3D datasets, we propose an egocentric video based pipeline for generating embodied cognition data. Furthermore, we introduce RynnEC-Bench, a region-centered benchmark for evaluating embodied cognitive capabilities. We anticipate that RynnEC will advance the development of general-purpose cognitive cores for embodied agents and facilitate generalization across diverse embodied tasks. The code, model checkpoints, and benchmark are available at: https://github.com/alibaba-damo-academy/RynnEC
comment: The technical report of RynnEC, an embodied cognition MLLM
Learning to Drive Ethically: Embedding Moral Reasoning into Autonomous Driving
Autonomous vehicles hold great promise for reducing traffic fatalities and improving transportation efficiency, yet their widespread adoption hinges on embedding robust ethical reasoning into routine and emergency maneuvers. Here, we present a hierarchical Safe Reinforcement Learning (Safe RL) framework that explicitly integrates moral considerations with standard driving objectives. At the decision level, a Safe RL agent is trained using a composite ethical risk cost, combining collision probability and harm severity, to generate high-level motion targets. A dynamic Prioritized Experience Replay mechanism amplifies learning from rare but critical, high-risk events. At the execution level, polynomial path planning coupled with Proportional-Integral-Derivative (PID) and Stanley controllers translates these targets into smooth, feasible trajectories, ensuring both accuracy and comfort. We train and validate our approach on rich, real-world traffic datasets encompassing diverse vehicles, cyclists, and pedestrians, and demonstrate that it outperforms baseline methods in reducing ethical risk and maintaining driving performance. To our knowledge, this is the first study of ethical decision-making for autonomous vehicles via Safe RL in real-world scenarios. Our results highlight the potential of combining formal control theory and data-driven learning to advance ethically accountable autonomy in complex, human-mixed traffic environments.
On the complexity of constrained reconfiguration and motion planning
Coordinating the motion of multiple agents in constrained environments is a fundamental challenge in robotics, motion planning, and scheduling. A motivating example involves $n$ robotic arms, each represented as a line segment. The objective is to rotate each arm to its vertical orientation, one at a time (clockwise or counterclockwise), without collisions nor rotating any arm more than once. This scenario is an example of the more general $k$-Compatible Ordering problem, where $n$ agents, each capable of $k$ state-changing actions, must transition to specific target states under constraints encoded as a set $\mathcal{G}$ of $k$ pairs of directed graphs. We show that $k$-Compatible Ordering is $\mathsf{NP}$-complete, even when $\mathcal{G}$ is planar, degenerate, or acyclic. On the positive side, we provide polynomial-time algorithms for cases such as when $k = 1$ or $\mathcal{G}$ has bounded treewidth. We also introduce generalized variants supporting multiple state-changing actions per agent, broadening the applicability of our framework. These results extend to a wide range of scheduling, reconfiguration, and motion planning applications in constrained environments.
comment: Looking to incorporate comments from reviewers
Insights from Interviews with Teachers and Students on the Use of a Social Robot in Computer Science Class in Sixth Grade
In this paper we report on first insights from interviews with teachers and students on using social robots in computer science class in sixth grade. Our focus is on learning about requirements and potential applications. We are particularly interested in getting both perspectives, the teachers' and the learners' view on how robots could be used and what features they should or should not have. Results show that teachers as well as students are very open to robots in the classroom. However, requirements are partially quite heterogeneous among the groups. This leads to complex design challenges which we discuss at the end of this paper.
comment: 4 pages, 2 figures, Late Breaking Report accepted for RO-MAN 2025
LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation
Predictive manipulation has recently gained considerable attention in the Embodied AI community due to its potential to improve robot policy performance by leveraging predicted states. However, generating accurate future visual states of robot-object interactions from world models remains a well-known challenge, particularly in achieving high-quality pixel-level representations. To this end, we propose LaDi-WM, a world model that predicts the latent space of future states using diffusion modeling. Specifically, LaDi-WM leverages the well-established latent space aligned with pre-trained Visual Foundation Models (VFMs), which comprises both geometric features (DINO-based) and semantic features (CLIP-based). We find that predicting the evolution of the latent space is easier to learn and more generalizable than directly predicting pixel-level images. Building on LaDi-WM, we design a diffusion policy that iteratively refines output actions by incorporating forecasted states, thereby generating more consistent and accurate results. Extensive experiments on both synthetic and real-world benchmarks demonstrate that LaDi-WM significantly enhances policy performance by 27.9\% on the LIBERO-LONG benchmark and 20\% on the real-world scenario. Furthermore, our world model and policies achieve impressive generalizability in real-world experiments.
comment: CoRL 2025
Scaling Up without Fading Out: Goal-Aware Sparse GNN for RL-based Generalized Planning
Generalized planning using deep reinforcement learning (RL) combined with graph neural networks (GNNs) has shown promising results in various symbolic planning domains described by PDDL. However, existing approaches typically represent planning states as fully connected graphs, leading to a combinatorial explosion in edge information and substantial sparsity as problem scales grow, especially evident in large grid-based environments. This dense representation results in diluted node-level information, exponentially increases memory requirements, and ultimately makes learning infeasible for larger-scale problems. To address these challenges, we propose a sparse, goal-aware GNN representation that selectively encodes relevant local relationships and explicitly integrates spatial features related to the goal. We validate our approach by designing novel drone mission scenarios based on PDDL within a grid world, effectively simulating realistic mission execution environments. Our experimental results demonstrate that our method scales effectively to larger grid sizes previously infeasible with dense graph representations and substantially improves policy generalization and success rates. Our findings provide a practical foundation for addressing realistic, large-scale generalized planning tasks.
MindEye-OmniAssist: A Gaze-Driven LLM-Enhanced Assistive Robot System for Implicit Intention Recognition and Task Execution
A promising effective human-robot interaction in assistive robotic systems is gaze-based control. However, current gaze-based assistive systems mainly help users with basic grasping actions, offering limited support. Moreover, the restricted intent recognition capability constrains the assistive system's ability to provide diverse assistance functions. In this paper, we propose an open implicit intention recognition framework powered by Large Language Model (LLM) and Vision Foundation Model (VFM), which can process gaze input and recognize user intents that are not confined to predefined or specific scenarios. Furthermore, we implement a gaze-driven LLM-enhanced assistive robot system (MindEye-OmniAssist) that recognizes user's intentions through gaze and assists in completing task. To achieve this, the system utilizes open vocabulary object detector, intention recognition network and LLM to infer their full intentions. By integrating eye movement feedback and LLM, it generates action sequences to assist the user in completing tasks. Real-world experiments have been conducted for assistive tasks, and the system achieved an overall success rate of 41/55 across various undefined tasks. Preliminary results show that the proposed method holds the potential to provide a more user-friendly human-computer interaction interface and significantly enhance the versatility and effectiveness of assistive systems by supporting more complex and diverse task.
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
Hybrid Machine Learning Model with a Constrained Action Space for Trajectory Prediction
Trajectory prediction is crucial to advance autonomous driving, improving safety, and efficiency. Although end-to-end models based on deep learning have great potential, they often do not consider vehicle dynamic limitations, leading to unrealistic predictions. To address this problem, this work introduces a novel hybrid model that combines deep learning with a kinematic motion model. It is able to predict object attributes such as acceleration and yaw rate and generate trajectories based on them. A key contribution is the incorporation of expert knowledge into the learning objective of the deep learning model. This results in the constraint of the available action space, thus enabling the prediction of physically feasible object attributes and trajectories, thereby increasing safety and robustness. The proposed hybrid model facilitates enhanced interpretability, thereby reinforcing the trustworthiness of deep learning methods and promoting the development of safe planning solutions. Experiments conducted on the publicly available real-world Argoverse dataset demonstrate realistic driving behaviour, with benchmark comparisons and ablation studies showing promising results.
comment: Copyright 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation
Neural implicit scene representations have recently shown promising results in dense visual SLAM. However, existing implicit SLAM algorithms are constrained to single-agent scenarios, and fall difficulties in large-scale scenes and long sequences. Existing NeRF-based multi-agent SLAM frameworks cannot meet the constraints of communication bandwidth. To this end, we propose the first distributed multi-agent collaborative neural SLAM framework with hybrid scene representation, distributed camera tracking, intra-to-inter loop closure, and online distillation for multiple submap fusion. A novel triplane-grid joint scene representation method is proposed to improve scene reconstruction. A novel intra-to-inter loop closure method is designed to achieve local (single-agent) and global (multi-agent) consistency. We also design a novel online distillation method to fuse the information of different submaps to achieve global consistency. Furthermore, to the best of our knowledge, there is no real-world dataset for NeRF-based/GS-based SLAM that provides both continuous-time trajectories groundtruth and high-accuracy 3D meshes groundtruth. To this end, we propose the first real-world Dense slam (DES) dataset covering both single-agent and multi-agent scenarios, ranging from small rooms to large-scale outdoor scenes, with high-accuracy ground truth for both 3D mesh and continuous-time camera trajectory. This dataset can advance the development of the research in both SLAM, 3D reconstruction, and visual foundation model. Experiments on various datasets demonstrate the superiority of the proposed method in both mapping, tracking, and communication. The dataset and code will open-source on https://github.com/dtc111111/mcnslam.
DISCO: Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting
Diffusion policies have demonstrated strong performance in generative modeling, making them promising for robotic manipulation guided by natural language instructions. However, generalizing language-conditioned diffusion policies to open-vocabulary instructions in everyday scenarios remains challenging due to the scarcity and cost of robot demonstration datasets. To address this, we propose DISCO, a framework that leverages off-the-shelf vision-language models (VLMs) to bridge natural language understanding with high-performance diffusion policies. DISCO translates linguistic task descriptions into actionable 3D keyframes using VLMs, which then guide the diffusion process through constrained inpainting. However, enforcing strict adherence to these keyframes can degrade performance when the VLM-generated keyframes are inaccurate. To mitigate this, we introduce an inpainting optimization strategy that balances keyframe adherence with learned motion priors from training data. Experimental results in both simulated and real-world settings demonstrate that DISCO outperforms conventional fine-tuned language-conditioned policies, achieving superior generalization in zero-shot, open-vocabulary manipulation tasks.
MolmoAct: Action Reasoning Models that can Reason in Space
Reasoning is central to purposeful action, yet most robotic foundation models map perception and instructions directly to control, which limits adaptability, generalization, and semantic grounding. We introduce Action Reasoning Models (ARMs), a class of robotic foundation models that integrate perception, planning, and control through a structured three-stage pipeline. Our model, MolmoAct, encodes observations and instructions into depth-aware perception tokens, generates mid-level spatial plans as editable trajectory traces, and predicts precise low-level actions, enabling explainable and steerable behavior. MolmoAct-7B-D achieves strong performance across simulation and real-world settings: 70.5% zero-shot accuracy on SimplerEnv Visual Matching tasks, surpassing closed-source Pi-0 and GR00T N1; 86.6% average success on LIBERO, including an additional 6.3% gain over ThinkAct on long-horizon tasks; and in real-world fine-tuning, an additional 10% (single-arm) and an additional 22.7% (bimanual) task progression over Pi-0-FAST. It also outperforms baselines by an additional 23.3% on out-of-distribution generalization and achieves top human-preference scores for open-ended instruction following and trajectory steering. Furthermore, we release, for the first time, the MolmoAct Dataset -- a mid-training robot dataset comprising over 10,000 high quality robot trajectories across diverse scenarios and tasks. Training with this dataset yields an average 5.5% improvement in general performance over the base model. We release all model weights, training code, our collected dataset, and our action reasoning dataset, establishing MolmoAct as both a state-of-the-art robotics foundation model and an open blueprint for building ARMs that transform perception into purposeful action through structured reasoning. Blogpost: https://allenai.org/blog/molmoact
comment: Appendix include. Code, Data and Weights: https://allenai.org/blog/molmoact
Integrating emotional intelligence, memory architecture, and gestures to achieve empathetic humanoid robot interaction in an educational setting
This study investigates the integration of individual human traits into an empathetically adaptive educational robot tutor system designed to improve student engagement and learning outcomes with corresponding Engagement Vector measurement. While prior research in the field of Human-Robot Interaction (HRI) has examined the integration of the traits, such as emotional intelligence, memory-driven personalization, and non-verbal communication, by themselves, they have thus-far neglected to consider their synchronized integration into a cohesive, operational education framework. To address this gap, we customize a Multi-Modal Large Language Model (LLaMa 3.2 from Meta) deployed with modules for human-like traits (emotion, memory and gestures) into an AI-Agent framework. This constitutes to the robot's intelligent core mimicing the human emotional system, memory architecture and gesture control to allow the robot to behave more empathetically while recognizing and responding appropriately to the student's emotional state. It can also recall the student's past learning record and adapt its style of interaction accordingly. This allows the robot tutor to react to the student in a more sympathetic manner by delivering personalized verbal feedback synchronized with relevant gestures. Our study investigates the extent of this effect through the introduction of Engagement Vector Model which can be a surveyor's pole for judging the quality of HRI experience. Quantitative and qualitative results demonstrate that such an empathetic responsive approach significantly improves student engagement and learning outcomes compared with a baseline humanoid robot without these human-like traits. This indicates that robot tutors with empathetic capabilities can create a more supportive, interactive learning experience that ultimately leads to better outcomes for the student.
LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
Large language models (LLMs) have shown remarkable abilities in logical reasoning, in-context learning, and code generation. However, translating natural language instructions into effective robotic control policies remains a significant challenge, especially for tasks requiring long-horizon planning and operating under sparse reward conditions. Hierarchical Reinforcement Learning (HRL) provides a natural framework to address this challenge in robotics; however, it typically suffers from non-stationarity caused by the changing behavior of the lower-level policy during training, destabilizing higher-level policy learning. We introduce LGR2, a novel HRL framework that leverages LLMs to generate language-guided reward functions for the higher-level policy. By decoupling high-level reward generation from low-level policy changes, LGR2 fundamentally mitigates the non-stationarity problem in off-policy HRL, enabling stable and efficient learning. To further enhance sample efficiency in sparse environments, we integrate goal-conditioned hindsight experience relabeling. Extensive experiments across simulated and real-world robotic navigation and manipulation tasks demonstrate LGR2 outperforms both hierarchical and non-hierarchical baselines, achieving over 55% success rates on challenging tasks and robust transfer to real robots, without additional fine-tuning.
Hybrid Action Based Reinforcement Learning for Multi-Objective Compatible Autonomous Driving
Reinforcement Learning (RL) has shown excellent performance in solving decision-making and control problems of autonomous driving, which is increasingly applied in diverse driving scenarios. However, driving is a multi-attribute problem, leading to challenges in achieving multi-objective compatibility for current RL methods, especially in both policy updating and policy execution. On the one hand, a single value evaluation network limits the policy updating in complex scenarios with coupled driving objectives. On the other hand, the common single-type action space structure limits driving flexibility or results in large behavior fluctuations during policy execution. To this end, we propose a Multi-objective Ensemble-Critic reinforcement learning method with Hybrid Parametrized Action for multi-objective compatible autonomous driving. Specifically, an advanced MORL architecture is constructed, in which the ensemble-critic focuses on different objectives through independent reward functions. The architecture integrates a hybrid parameterized action space structure, and the generated driving actions contain both abstract guidance that matches the hybrid road modality and concrete control commands. Additionally, an uncertainty-based exploration mechanism that supports hybrid actions is developed to learn multi-objective compatible policies more quickly. Experimental results demonstrate that, in both simulator-based and HighD dataset-based multi-lane highway scenarios, our method efficiently learns multi-objective compatible autonomous driving with respect to efficiency, action consistency, and safety.
comment: 13 pages, 10 figures, 5 tables, Submitted to IEEE T-NNLS (under review, 2nd round)
Systems and Control (CS)
LLMind 2.0: Distributed IoT Automation with Natural Language M2M Communication and Lightweight LLM Agents
Recent advances in large language models (LLMs) have sparked interest in their application to IoT and automation systems, particularly for facilitating device management through natural language instructions. However, existing centralized approaches face significant scalability challenges when managing and coordinating the collaboration between IoT devices of diverse capabilities in large-scale heterogeneous IoT systems. This paper introduces LLMind 2.0, a distributed IoT automation framework that addresses the scalability challenges through lightweight LLM-empowered device agents via natural language-based machine-to-machine (M2M) communication. Unlike previous LLM-controlled automation systems that rely on a centralized coordinator to generate device-specific code to be executed on individual devices, LLMind 2.0 distributes intelligence across individual devices through lightweight LLMs embedded in IoT devices. The central coordinator translates human instructions into simple subtasks described in natural human language, which are then processed by device-specific agents to generate device-specific code locally at the associated devices. This approach transcends device heterogeneity barriers by using natural language as a unified communication medium, enabling seamless collaboration between devices from different manufacturers. The system incorporates several key innovations: a Retrieval-Augmented Generation (RAG) mechanism for accurate subtask-to-API mapping, fine-tuned lightweight LLMs for reliable code generation, and a finite state machine-based task execution framework. Experimental validation in multi-robot warehouse scenarios and real-world WiFi network deployments demonstrates significant improvements in scalability, reliability, and privacy protection compared to the centralized approach.
Energy Management and Wake-up for IoT Networks Powered by Energy Harvesting
The rapid growth of the Internet of Things (IoT) presents sustainability challenges such as increased maintenance requirements and overall higher energy consumption. This motivates self-sustainable IoT ecosystems based on Energy Harvesting (EH). This paper treats IoT deployments in which IoT devices (IoTDs) rely solely on EH to sense and transmit information about events/alarms to a base station (BS). The objective is to effectively manage the duty cycling of the IoTDs to prolong battery life and maximize the relevant data sent to the BS. The BS can also wake up specific IoTDs if extra information about an event is needed upon initial detection. We propose a K-nearest neighbors (KNN)-based duty cycling management to optimize energy efficiency and detection accuracy by considering spatial correlations among IoTDs' activity and their EH process. We evaluate machine learning approaches, including reinforcement learning (RL) and decision transformers (DT), to maximize information captured from events while managing energy consumption. Significant improvements over the state-ofthe-art approaches are obtained in terms of energy saving by all three proposals, KNN, RL, and DT. Moreover, the RL-based solution approaches the performance of a genie-aided benchmark as the number of IoTDs increases.
comment: This work has been partially supported by the Research Council of Finland (Grant 369116 (6G Flagship Programme), Grant 362782), the Finnish Foundation for Technology Promotion, the European Commission through the Horizon Europe/JU SNS project AMBIENT-6G (Grant 101192113), and in Chile, by ANID FONDECYT Regular No.1241977
BioGAP-Ultra: A Modular Edge-AI Platform for Wearable Multimodal Biosignal Acquisition and Processing
The growing demand for continuous physiological monitoring and human-machine interaction in real-world settings calls for wearable platforms that are flexible, low-power, and capable of on-device intelligence. This work presents BioGAP-Ultra, an advanced multimodal biosensing platform that supports synchronized acquisition of diverse electrophysiological and hemodynamic signals such as EEG, EMG, ECG, and PPG while enabling embedded AI processing at state-of-the-art energy efficiency. BioGAP-Ultra is a major extension of our previous design, BioGAP [1], aimed at meeting the rapidly growing requirements of wearable biosensing applications. It features (i) increased on-device storage (x2 SRAM, x4 FLASH), (ii) improved wireless connectivity (1.4 Mbit/s bandwidth, x4 higher than BioGAP), (iii) enhanced number of signal modalities (from 3 to 5) and analog input channels (x2). Further, it is complemented by a complete real-time visualization and analysis software suite, providing access to raw data and real-time configurability on a mobile phone. Electrical characterization and multiple case studies confirm the platform's robustness, configurability, and suitability for real-world multimodal biosignal acquisition and edge intelligence. Finally, we demonstrate the system's versatility through integration into various wearable form factors: an EEG-PPG headband consuming 32.8 mW, an EMG sleeve at 26.7 mW, and an ECG-PPG chest band requiring only 9.3 mW, tailored for diverse biosignal applications. All hardware and software design files are also released open-source with a permissive license.
comment: 13 pages, 12 figures
Singularity-free prescribed performance guaranteed control for perturbed system
This paper addresses the prescribed performance control (PPC) challenge for high-order nonlinear systems affected by mismatched disturbances. The research aims to prevent singularity issues arising from error boundary violations during abrupt changes in reference trajectories. We introduce a novel transformation function with infinite-order differentiability at connection points, advancing beyond mere continuous differentiability. Utilizing this transformation function, we develop a comprehensive transformation strategy that ensures: (1) errors remain within prescribed boundaries when reference trajectories are smooth, and (2) errors return to prescribed boundaries within a specified timeframe following abrupt changes in reference trajectories. Additionally, the complexity explosion issue inherent in backstepping design is effectively resolved. Simulation results corroborate the validity of the proposed theoretical advancements.
comment: 6 pages, 5 figures, accepted by Chinese Automation Conference (CAC) 2025
AutoMPC: A Code Generator for MPC-based Automated Driving
Model Predictive Control (MPC) is a powerful technique to control nonlinear, multi-input multi-output systems subject to input and state constraints. It is now a standard tool for trajectory tracking control of automated vehicles. As such it has been used in many research and development projects. However, MPC faces several challenges to be integrated into industrial production vehicles. The most important ones are its high computational demands and the complexity of implementation. The software packages AutoMPC aims to address both of these challenges. It builds on a robustified version of an active set algorithm for Nonlinear MPC. The algorithm is embedded into a framework for vehicle trajectory tracking, which makes it easy to used, yet highly customizable. Automatic code generation transforms the selections into a standalone, computationally efficient C-code file with static memory allocation. As such it can be readily deployed on a wide range of embedded platforms, e.g., based on Matlab/Simulink or Robot Operating System (ROS). Compared to a previous version of the code, the vehicle model and the numerical integration method can be manually specified, besides basic algorithm parameters. All of this information and all specifications are directly baked into the generated C-code. The algorithm is suitable driving scenarios at low or high speeds, even drifting, and supports direction changes. Multiple simulation scenarios show the versatility and effectiveness of the AutoMPC code, with the guarantee of a feasible solution, a high degree of robustness, and computational efficiency.
comment: Technical Documentation
Model-based Multi-object Visual Tracking: Identification and Standard Model Limitations
This paper uses multi-object tracking methods known from the radar tracking community to address the problem of pedestrian tracking using 2D bounding box detections. The standard point-object (SPO) model is adopted, and the posterior density is computed using the Poisson multi-Bernoulli mixture (PMBM) filter. The selection of the model parameters rooted in continuous time is discussed, including the birth and survival probabilities. Some parameters are selected from the first principles, while others are identified from the data, which is, in this case, the publicly available MOT-17 dataset. Although the resulting PMBM algorithm yields promising results, a mismatch between the SPO model and the data is revealed. The model-based approach assumes that modifying the problematic components causing the SPO model-data mismatch will lead to better model-based algorithms in future developments.
comment: Submitted to FUSION 2025 conference
Transient Stability Analysis for Grid Following Converters in Low-Inertia Power Systems by Direct Method
With the increased penetration of renewable energy and reduced proportion of synchronous generators, the low-inertia characteristics of todays power system become prominent, and the transient stability issue of grid following converter (GFLC) under low inertia system (LIS) condition becomes critical. There are two prominent problems in the transient stability analysis of GFLC-LIS. The angular dynamic of LIS increases the complexity of transient stability analysis, and the nonlinear, possibly negative damping of GFLC makes it difficult to guarantee the conservative of the traditional methods. These problems make the traditional methods inapplicable. In this paper, the transient stability analysis of GFLC LIS is investigated to provide an accurate estimation of the attraction boundary and critical clearance time (CCT). Firstly, a dynamic model of GFLC-LIS is constructed, considering the phase-locked loop (PLL)-based GFLC dynamics and swing equation-based LIS dynamics. The frequency mutation of PLL at fault occurrence and clearing time is also considered. Secondly, a Zubov based transient stability analysis method is proposed, which can construct the energy function in a way that is different from the traditional conservation of energy perspective and can address the negative damping issue. Moreover, the accuracy of the CCT estimation is analyzed, and the influences of LIS parameters on transient stability are illustrated. Finally, simulation experiments are carried out to verify the effectiveness of the proposed method
Towards safe control parameter tuning in distributed multi-agent systems
Many safety-critical real-world problems, such as autonomous driving and collaborative robots, are of a distributed multi-agent nature. To optimize the performance of these systems while ensuring safety, we can cast them as distributed optimization problems, where each agent aims to optimize their parameters to maximize a coupled reward function subject to coupled constraints. Prior work either studies a centralized setting, does not consider safety, or struggles with sample efficiency. Since we require sample efficiency and work with unknown and nonconvex rewards and constraints, we solve this optimization problem using safe Bayesian optimization with Gaussian process regression. Moreover, we consider nearest-neighbor communication between the agents. To capture the behavior of non-neighboring agents, we reformulate the static global optimization problem as a time-varying local optimization problem for each agent, essentially introducing time as a latent variable. To this end, we propose a custom spatio-temporal kernel to integrate prior knowledge. We show the successful deployment of our algorithm in simulations.
comment: Accepted to CDC 2025
Scalable Sensor Placement for Cyclic Networks with Observability Guarantees: Application to Water Distribution Networks
Optimal sensor placement is essential for state estimation and effective network monitoring. As known in the literature, this problem becomes particularly challenging in large-scale undirected or bidirected cyclic networks with parametric uncertainties, such as water distribution networks (WDNs), where pipe resistance and demand patterns are often unknown. Motivated by the challenges of cycles, parametric uncertainties, and scalability, this paper proposes a sensor placement algorithm that guarantees structural observability for cyclic and acyclic networks with parametric uncertainties. By leveraging a graph-based strategy, the proposed method efficiently addresses the computational complexities of large-scale networks. To demonstrate the algorithm's effectiveness, we apply it to several EPANET benchmark WDNs. Most notably, the developed algorithm solves the sensor placement problem with guaranteed structured observability for the L-town WDN with 1694 nodes and 124 cycles in under 0.1 seconds.
comment: Extended version of the paper accepted for IEEE-CDC 2025
Power-Series Approach to Moment-Matching-Based Model Reduction of MIMO Polynomial Nonlinear Systems
The model reduction problem for high-order multi-input, multi-output (MIMO) polynomial nonlinear systems based on moment matching is addressed. The technique of power-series decomposition is exploited: this decomposes the solution of the nonlinear PDE characterizing the center manifold into the solutions of a series of recursively defined Sylvester equations. This approach allows yielding nonlinear reduced-order models in very much the same way as in the linear case (e.g. analytically). Algorithms are proposed for obtaining the order and the parameters of the reduced-order models with precision of degree $\kappa$. The approach also provides new insights into the nonlinear moment matching problem: first, a lower bound for the order of the reduced-order model is obtained, which, in the MIMO case, can be strictly less than the number of matched moments; second, it is revealed that the lower bound is affected by the ratio of the number of the input and output channels; third, it is shown that under mild conditions, a nonlinear reduced-order model can always be constructed with either a linear state equation or a linear output equation.
Repeater Swarm-Assisted Cellular Systems: Interaction Stability and Performance Analysis
We consider a cellular massive MIMO system where swarms of wireless repeaters are deployed to improve coverage. These repeaters are full-duplex relays with small form factors that receive and instantaneously retransmit signals. They can be deployed in a plug-and-play manner at low cost, while being transparent to the network--conceptually they are active channel scatterers with amplification capabilities. Two fundamental questions need to be addressed in repeater deployments: (I) How can we prevent destructive effects of positive feedback caused by inter-repeater interaction (i.e., each repeater receives and amplifies signals from others)? (ii) How much performance improvement can be achieved given that repeaters also inject noise and may introduce more interference? To answer these questions, we first derive a generalized Nyquist stability criterion for the repeater swarm system, and provide an easy-to-check stability condition. Then, we study the uplink performance and develop an efficient iterative algorithm that jointly optimizes the repeater gains, user transmit powers, and receive combining weights to maximize the weighted sum rate while ensuring system stability. Numerical results corroborate our theoretical findings and show that the repeaters can significantly improve the system performance, both in sub-6 GHz and millimeter-wave bands. The results also warrant careful deployment to fully realize the benefits of repeaters, for example, by ensuring a high probability of line-of-sight links between repeaters and the base station.
comment: 16 pages, 13 figures. Submitted to IEEE Transactions on Wireless Communications
Amplitude maximization in stable systems, Schur positivity, and some conjectures on polynomial interpolation
For $r > 0$ and integers $t \ge n > 0$, we consider the following ``amplitude maximization'' problem: maximize the quantity $|x_t|$ over the set of complex solutions $x = (x_0, x_1, \dots)$ of all homogeneous linear difference equations of order $n$ with the roots of characteristic polynomial in the disc $\{z \in \mathbb{C}: |z| \le r\}$, and with initial values $x_0, \dots, x_{n-1}$ in the unit disc. We find that for any $t,n,r$, the maximum is attained in the case of coinciding roots on the boundary circle; this implies that for all $r < 1$, the maximum amplitude can be computed explicitly, by studying a single equation whose characteristic polynomial is $(z-r)^n$. Moreover, the optimality of the co-phase root configuration holds for origin-centered polydiscs. To prove this result, we first reduce the problem to a certain interpolation problem over monomials, then solve the latter by leveraging the theory of symmetric functions and identifying the associated Schur positivity structure. We also discuss the implications for more general Reinhardt domains. Finally, we study the problem of estimating the derivatives of a real entire function from its values at $n/2$ pairs of complex conjugate points in the unit disc. Here, we propose conjectures on the extremality of the monomial $z^n$, and restate them in terms of Schur polynomials.
comment: 18 pages
MuFlex: A Scalable, Physics-based Platform for Multi-Building Flexibility Analysis and Coordination
With the increasing penetration of renewable generation on the power grid, maintaining system balance requires coordinated demand flexibility from aggregations of buildings. Reinforcement learning (RL) has been widely explored for building controls because of its model-free nature. Open-source simulation testbeds are essential not only for training RL agents but also for fairly benchmarking control strategies. However, most building-sector testbeds target single buildings; multi-building platforms are relatively limited and typically rely on simplified models (e.g., Resistance-Capacitance) or data-driven approaches, which lack the ability to fully capture the physical intricacies and intermediate variables necessary for interpreting control performance. Moreover, these platforms often impose fixed inputs, outputs, and model formats, restricting their applicability as benchmarking tools across diverse control scenarios. To address these gaps, MuFlex, a scalable, open-source platform for benchmarking and testing control strategies for multi-building flexibility coordination, was developed in this study. MuFlex enables synchronous information exchange across EnergyPlus building models and adheres to the latest OpenAI Gym interface, providing a modular, standardized RL implementation. The platform capabilities were demonstrated in a case study coordinating demand flexibility across four office buildings using the Soft Actor-Critic algorithm with carefully fine-tuned hyperparameters. The results show that aggregating the four buildings flexibility reduced total peak demand below a specified threshold while maintaining indoor environmental quality.
comment: The platform will be released open-source on GitHub: https://github.com/BuildNexusX/MuFlex once pre-printed
System-Level Performance and Communication Tradeoff in Networked Control with Predictions
Distributed control of large-scale systems is challenging due to the need for scalable and localized communication and computation. In this work, we introduce a Predictive System-Level Synthesis PredSLS framework that designs controllers by jointly integrating communication constraints and local disturbance predictions into an affine feedback structure. Rather than focusing on the worst-case uncertainty, PredSLS leverages both current state feedback and future system disturbance predictions to achieve distributed control of networked systems. In particular, PredSLS enables a unified system synthesis of the optimal $\kappa$-localized controller, therefore outperforms approaches with post hoc communication truncation, as was commonly seen in the literature. The PredSLS framework can be naturally decomposed into spatial and temporal components for efficient and parallelizable computation across the network, yielding a regret upper bound that explicitly depends on the prediction error and communication range. Our regret analysis not only reveals a non-monotonic trade-off between control performance and communication range when prediction errors are present, but also guides the identification of an optimal size for local communication neighborhoods, thereby enabling the co-design of controller and its underlying communication topology.
comment: 52 pages, 13 figures, extended version of our 2025 CDC paper: "PredSLS: A System-Level Framework for Distributed Predictive Control"
Modeling and Control of AWOISV: A Filtered Tube-Based MPC Approach for Simultaneous Tracking of Lateral Position and Heading Angle
An all-wheel omni-directional independent steering vehicle (AWOISV) is a specialized all-wheel independent steering vehicle with each wheel capable of steering up to 90{\deg}, enabling unique maneuvers like yaw and diagonal movement. This paper introduces a theoretical steering radius angle and sideslip angle (\( \theta_R \)-\(\beta_R \)) representation, based on the position of the instantaneous center of rotation relative to the wheel rotation center, defining the motion modes and switching criteria for AWOISVs. A generalized \( v\)-\(\beta\)-\(r \) dynamic model is developed with forward velocity \(v\), sideslip angle \(\beta\), and yaw rate \(r\) as states, and \(\theta_R\) and \(\beta_R\) as control inputs. This model decouples longitudinal and lateral motions into forward and rotational motions, allowing seamless transitions across all motion modes under specific conditions. A filtered tube-based linear time-varying MPC (FT-LTVMPC) strategy is proposed, achieving simultaneous tracking of lateral position and arbitrary heading angles, with robustness to model inaccuracies and parameter uncertainties. Co-simulation and hardware-in-loop (HIL) experiments confirm that FT-LTVMPC enables high-precision control of both position and heading while ensuring excellent real-time performance.
Iterative Youla-Kucera Loop Shaping For Precision Motion Control
This paper presents a numerically robust approach to multi-band disturbance rejection using an iterative Youla-Kucera parameterization technique. The proposed method offers precise control over shaping the frequency response of a feedback loop while maintaining numerical stability through a systematic design process. By implementing an iterative approach, we overcome a critical numerical issue in rejecting vibrations with multiple frequency bands. Meanwhile, our proposed modification of the all-stabilizing Youla-Kucera architecture enables intuitive design while respecting fundamental performance trade-offs and minimizing undesired waterbed amplifications. Numerical validation on a hard disk drive servo system demonstrates significant performance improvements, enabling enhanced positioning precision for increased storage density. The design methodology extends beyond storage systems to various high-precision control applications where multi-band disturbance rejection is critical.
comment: 6pages, To appear at MECC 2025, see https://mecc2025.a2c2.org/
Grid-Edge Energy-Flexible Technologies: A Comparative Analysis Across Generators, Loads, and Energy Storage Systems
This review analysis presents a comprehensive exploration of energy flexibility in modern power systems. It examines the roles and mechanisms of flexible technologies across three main categories: generators, energy storage systems (ESS), and loads. Energy flexibility is defined as the ability to dynamically adjust supply and/or demand in response to grid conditions to maintain balance and stability. This is of particular importance to facilitate the integration of the growing variable renewable energy sources (RES) into modern power grids. Additionally, traditional supply-side mechanisms to maintain balance and stability are complemented by advancements in demand-side management and demand response strategies, which enable loads to adjust consumption patterns and schedules in response to grid requirements. ESS are also explored to further enhance flexibility by absorbing excess generation and/or supplying large load increases that are not able to be met by the less flexible resources. This paper also explores specific flexibility technologies, examining their characteristics, control strategies, advantages, and limitations. Energy flexibility services are also categorized into intermittency mitigation, peak shaving, and energy reserve provisioning. Each service is supported by case studies and examples demonstrating how different resources respond to varying conditions. Ultimately, the findings and reviews of the various flexible resources in this paper provide a roadmap for optimizing energy flexibility across diverse resource types, paving the way for a more sustainable and resilient energy future.
Robust tracking MPC for perturbed nonlinear systems -- Extended version
This paper presents a novel robust predictive controller for constrained nonlinear systems that is able to track piece-wise constant setpoint signals. The tracking model predictive controller presented in this paper extends the nonlinear MPC for tracking to the more complex case of nonlinear systems subject to bounded and not necessarily additive perturbations. The optimal control problem that is solved at each step penalizes the deviation of the predicted nominal system trajectory from an artificial reference, which is added as a decision variable, as well as the distance between the artificial reference and the setpoint. Robust feasibility is ensured by imposing conservative constraints that take into account the effect of uncertainties and convergence to a neighborhood of any feasible setpoint is guaranteed by means of an appropriate terminal cost and an extended stabilizing terminal constraint. In the case of unreachable setpoints, convergence to a neighborhood of the optimal reachable steady output is also proved.
Autonomy at Levels for Spacecraft
Autonomy at Levels is the idea that autonomy should be embedded within and throughout a spacecraft. Using Systems Engineering methods a spacecraft is typically decomposed into systems, subsystems, assemblies, components, and so on. All these decomposition levels within all the spacecraft's systems, could and should have autonomy elements built in. As a result, the "autonomy system" is made of autonomy elements or units that are integrated, distributed and embedded within the whole spacecraft. This is like how the power system would be designed and implemented. Linking control loops and autonomy loops illustrates how to achieve Autonomy at Levels.
comment: 18 pages, 10 figures, to be published in 2025 AAS/AIAA Astrodynamics Specialist conference proceedings
Design and Optimization of a Hybrid VLC/THz Infrastructure-to-Vehicle Communication System for Intelligent Transportation
This paper proposes a hybrid infrastructure-to-vehicle (I2V) communication framework to support future 6G-enabled intelligent transportation systems (ITS) in smart cities. Leveraging existing LED streetlighting infrastructure, the system simultaneously delivers energy-efficient illumination and high-speed wireless connectivity. The proposed scheme integrates visible light communication (VLC) with a complementary ter-ahertz (THz) antenna array to overcome VLC limitations under high ambient light and adverse weather conditions. Key con-tributions include the design of a VLC/THz access network, seamless integration with lighting infrastructure, a proposed switching-combination (PSC) mechanism, and a physical layout optimization strategy. Using a grid search method, thousands of configurations were evaluated to maximize lighting coverage, re-ceived power, signal-to-noise ratio (SNR), signal-to-interference-and-noise ratio (SINR), and minimize outage probability. Results show that optimized lighting coverage improves from 35% to 97%, while hybrid communication coverage increases from 49%to 99.9% at the same power level. Under extreme environmental conditions, the hybrid system maintains up to 99% coverage, compared to 69% with VLC alone. These results demonstrate the scalability, cost-efficiency, and practicality of the proposed system for next-generation ITS deployment.
A Data-Based Review of Battery Electric Vehicle and Traction Inverter Trends
Battery electric vehicles (BEVs) have advanced significantly during the past decade, yet drivetrain energy losses continue to restrict practical range and elevate cost. A dataset comprising more than 1000 European-market BEVs (model years 2010-2025) is combined with detailed inverter-motor co-simulation to chart technology progress for and quantify the efficiency and cost-saving potential of partial-load optimised multi-level inverter (MLI) for 2030. Average drive-cycle range has climbed from 135 km to 455 km, while fleet-average energy consumption has remained virtually constant. Three inverter topologies are assessed to evaluate future efficiency and cost enhancements: a conventional two-level (2L) six halfbridge (B6) inverter with silicon (Si) and silicon carbide (SiC) devices, and two three-level (3L) T-type neutral point clamped (TNPC) and active neutral point clamped (ANPC) inverters tailored for partial-load operation. The 3L-TNPC inverter, realised with only 30% additional SiC chip area, lowers drive-cycle drivetrain losses by 0.67 kWh/100 km relative to a SiC 2L-B6 baseline. These results identify partial-load optimised MLIs as a cost-effective route to further reduce BEV energy consumption and total system cost.
comment: 8 pages, 9 figures, 2025 IECON 51st Annual Conference of the IEEE IES
Reliability comparison of vessel trajectory prediction models via Probability of Detection
This contribution addresses vessel trajectory prediction (VTP), focusing on the evaluation of different deep learning-based approaches. The objective is to assess model performance in diverse traffic complexities and compare the reliability of the approaches. While previous VTP models overlook the specific traffic situation complexity and lack reliability assessments, this research uses a probability of detection analysis to quantify model reliability in varying traffic scenarios, thus going beyond common error distribution analyses. All models are evaluated on test samples categorized according to their traffic situation during the prediction horizon, with performance metrics and reliability estimates obtained for each category. The results of this comprehensive evaluation provide a deeper understanding of the strengths and weaknesses of the different prediction approaches, along with their reliability in terms of the prediction horizon lengths for which safe forecasts can be guaranteed. These findings can inform the development of more reliable vessel trajectory prediction approaches, enhancing safety and efficiency in future inland waterways navigation.
comment: 2025 IEEE Intelligent Vehicles Symposium (IV)
Lightweight Tracking Control for Computationally Constrained Aerial Systems with the Newton-Raphson Method
We investigate the performance of a lightweight tracking controller, based on a flow version of the Newton-Raphson method, applied to a miniature blimp and a mid-size quadrotor. This tracking technique has been shown to enjoy theoretical guarantees of performance and has been applied with success in simulation studies and on mobile robots with simple motion models. This paper investigates the technique through real-world flight experiments on aerial hardware platforms subject to realistic deployment and onboard computational constraints. The technique's performance is assessed in comparison with the established control frameworks of feedback linearization for the blimp, and nonlinear model predictive control for both quadrotor and blimp. The performance metrics under consideration are (i) root mean square error of flight trajectories with respect to target trajectories, (ii) algorithms' computation times, and (iii) CPU energy consumption associated with the control algorithms. The experimental findings show that the Newton-Raphson flow-based tracking controller achieves comparable or superior tracking performance to the baseline methods with substantially reduced computation time and energy expenditure.
Towards Unified Probabilistic Verification and Validation of Vision-Based Autonomy
Precise and comprehensive situational awareness is a critical capability of modern autonomous systems. Deep neural networks that perceive task-critical details from rich sensory signals have become ubiquitous; however, their black-box behavior and sensitivity to environmental uncertainty and distribution shifts make them challenging to verify formally. Abstraction-based verification techniques for vision-based autonomy produce safety guarantees contingent on rigid assumptions, such as bounded errors or known unique distributions. Such overly restrictive and inflexible assumptions limit the validity of the guarantees, especially in diverse and uncertain test-time environments. We propose a methodology that unifies the verification models of perception with their offline validation. Our methodology leverages interval MDPs and provides a flexible end-to-end guarantee that adapts directly to the out-of-distribution test-time conditions. We evaluate our methodology on a synthetic perception Markov chain with well-defined state estimation distributions and a mountain car benchmark. Our findings reveal that we can guarantee tight yet rigorous bounds on overall system safety.
comment: Accepted by the 23rd International Symposium on Automated Technology for Verification and Analysis (ATVA'25)
Safeguarding ISAC Performance in Low-Altitude Wireless Networks Under Channel Access Attack
The increasing saturation of terrestrial resources has driven the exploration of low-altitude applications such as air taxis. Low altitude wireless networks (LAWNs) serve as the foundation for these applications, and integrated sensing and communication (ISAC) constitutes one of the core technologies within LAWNs. However, the openness nature of low-altitude airspace makes LAWNs vulnerable to malicious channel access attacks, which degrade the ISAC performance. Therefore, this paper develops a game-based framework to mitigate the influence of the attacks on LAWNs. Concretely, we first derive expressions of communication data's signal-to-interference-plus-noise ratio and the age of information of sensing data under attack conditions, which serve as quality of service metrics. Then, we formulate the ISAC performance optimization problem as a Stackelberg game, where the attacker acts as the leader, and the legitimate drone and the ground ISAC base station act as second and first followers, respectively. On this basis, we design a backward induction algorithm that achieves the Stackelberg equilibrium while maximizing the utilities of all participants, thereby mitigating the attack-induced degradation of ISAC performance in LAWNs. We further prove the existence and uniqueness of the equilibrium. Simulation results show that the proposed algorithm outperforms existing baselines and a static Nash equilibrium benchmark, ensuring that LAWNs can provide reliable service for low-altitude applications.
A Hierarchical Surrogate Model for Efficient Multi-Task Parameter Learning in Closed-Loop Control
Many control problems require repeated tuning and adaptation of controllers across distinct closed-loop tasks, where data efficiency and adaptability are critical. We propose a hierarchical Bayesian optimization (BO) framework that is tailored to efficient controller parameter learning in sequential decision-making and control scenarios for distinct tasks. Instead of treating the closed-loop cost as a black-box, our method exploits structural knowledge of the underlying problem, consisting of a dynamical system, a control law, and an associated closed-loop cost function. We construct a hierarchical surrogate model using Gaussian processes that capture the closed-loop state evolution under different parameterizations, while the task-specific weighting and accumulation into the closed-loop cost are computed exactly via known closed-form expressions. This allows knowledge transfer and enhanced data efficiency between different closed-loop tasks. The proposed framework retains sublinear regret guarantees on par with standard black-box BO, while enabling multi-task or transfer learning. Simulation experiments with model predictive control demonstrate substantial benefits in both sample efficiency and adaptability when compared to purely black-box BO approaches.
comment: 8 pages, 4 figures, accepted for CDC 2025
Exposing Barriers to Flexibility Aggregation in Unbalanced Distribution Networks
The increasing integration of distributed energy resources (DER) offers new opportunities for distribution system operators (DSO) to improve network operation through flexibility services. To utilise flexible resources, various DER flexibility aggregation methods have been proposed, such as the concept of aggregated P-Q flexibility areas. Yet, many existing studies assume perfect coordination among DER and rely on single-phase power flow analysis, thus overlooking barriers to flexibility aggregation in real unbalanced systems. To quantify the impact of these barriers, this paper proposes a three-phase optimal power flow (OPF) framework for P-Q flexibility assessment, implemented as an open-source Julia tool 3FlexAnalyser.jl. The framework explicitly accounts for voltage unbalance and imperfect coordination among DER in low voltage (LV) distribution networks. Simulations on an illustrative 5-bus system and a real 221-bus LV network in the UK reveal that over 30% of the theoretical aggregated flexibility potential can be lost due to phase unbalance and lack of coordination across phases. These findings highlight the need for improved flexibility aggregation tools applicable to real unbalanced distribution networks.
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Achieving Dispatchability in Data Centers: Carbon and Cost-Aware Sizing of Energy Storage and Local Photovoltaic Generation
Data centers are large electricity consumers due to the high consumption needs of servers and their cooling systems. Given the current crypto-currency and artificial intelligence trends, the data center electricity demand is bound to grow significantly. With the electricity sector being responsible for a large share of global greenhouse gas (GHG) emissions, it is important to lower the carbon footprint of data centers to meet GHG emissions targets set by international agreements. Moreover, uncontrolled integration of data centers in power distribution grids contributes to increasing the stochasticity of the power system demand, thus increasing the need for capacity reserves, which leads to economic and environmental inefficiencies in the power grid operation. This work provides a method to size a PhotoVoltaic (PV) system and an Energy Storage System (ESS) for an existing data center looking to reduce both its carbon footprint and demand stochasticity via dispatching. The proposed scenario-based optimization framework allows to size the ESS and the PV system to minimize the expected operational and capital costs, along with the carbon footprint of the data center complex. The life cycle assessment of the resources, as well as the dynamic carbon emissions of the upstream power distribution grid, are accounted for while computing the day-ahead planning of the data center aggregated demand and PV generation. Case studies in different Swiss cantons and regions of Germany emphasize the need for location-aware sizing processes since the obtained optimal solutions strongly depend on the local electricity carbon footprint, cost and on the local irradiance conditions. Some regions show potential in carbon footprint reduction, while other regions do not.
comment: 22 pages, 7 figures Submitted to Applied Energy, currently under review
On-Orbit Servicing Optimization Framework with High- and Low-Thrust Propulsion Tradeoff
This paper proposes an on-orbit servicing logistics optimization framework that is capable of performing the short-term operational scheduling and long-term strategic planning of sustainable servicing infrastructures that involve high-thrust, low-thrust, and/or multimodal servicers supported by orbital depots. The proposed framework generalizes the state-of-the-art on-orbit servicing logistics optimization method by incorporating user-defined trajectory models and optimizing the logistics operations with the propulsion technology and trajectory tradeoff in consideration. Mixed-Integer Linear Programming is leveraged to find the optimal operations of the servicers over a given period, while the Rolling Horizon approach is used to consider a long time horizon accounting for the uncertainties in service demand. Several analyses are carried out to demonstrate the value of the proposed framework in automatically trading off the high- and low-thrust propulsion systems for both short-term operational scheduling and long-term strategic planning of on-orbit servicing infrastructures.
comment: 45 pages, 16 figures, Accepted by and Published in the Journal of Spacecraft and Rockets (Accepted Version)
Framework for Modeling and Optimization of On-Orbit Servicing Operations under Demand Uncertainties SC
This paper develops a framework that models and optimizes the operations of complex on-orbit servicing infrastructures involving one or more servicers and orbital depots to provide multiple types of services to a fleet of geostationary satellites. The proposed method extends the state-of-the-art space logistics technique by addressing the unique challenges in on-orbit servicing applications, and integrate it with the Rolling Horizon decision making approach. The space logistics technique enables modeling of the on-orbit servicing logistical operations as a Mixed-Integer Linear Program whose optimal solutions can efficiently be found. The Rolling Horizon approach enables the assessment of the long-term value of an on-orbit servicing infrastructure by accounting for the uncertain service needs that arise over time among the geostationary satellites. Two case studies successfully demonstrate the effectiveness of the framework for (1) short-term operational scheduling and (2) long-term strategic decision making for on-orbit servicing architectures under diverse market conditions.
comment: 46 pages, 21 figures, a former version was presented at the AIAA ASCEND Conference; Accepted by and Published in the Journal of Spacecraft and Rockets (Accepted Version)
Finite Sample Analysis of System Poles for Ho-Kalman Algorithm
The Ho-Kalman algorithm has been widely employed for the identification of discrete-time linear time-invariant (LTI) systems. In this paper, we investigate the pole estimation error for the Ho-Kalman algorithm based on finite input/output sample data. Building upon prior works, we derive finite sample error bounds for system pole estimation in both single-trajectory and multiple-trajectory scenarios. Specifically, we prove that, with high probability, the estimation error for an $n$-dimensional system decreases at a rate of at least $\mathcal{O}(T^{-1/2n})$ in the single-trajectory setting with trajectory length $T$, and at a rate of at least $\mathcal{O}(N^{-1/2n})$ in the multiple-trajectory setting with $N$ independent trajectories. Furthermore, we reveal that in both settings, achieving a constant estimation error requires a super-polynomial sample size in $ \max\{n/m, n/p\} $, where $n/m$ and $n/p$ denote the state-to-output and state-to-input dimension ratios, respectively. Finally, numerical experiments are conducted to validate the non-asymptotic results of system pole estimation.
comment: 12 pages, 2 figures
Dual-Head Physics-Informed Graph Decision Transformer for Distribution System Restoration
Driven by recent advances in sensing and computing, deep reinforcement learning (DRL) technologies have shown great potential for addressing distribution system restoration (DSR) under uncertainty. However, their data-intensive nature and reliance on the Markov Decision Process (MDP) assumption limit their ability to handle scenarios that require long-term temporal dependencies or few-shot and zero-shot decision making. Emerging Decision Transformers (DTs), which leverage causal transformers for sequence modeling in DRL tasks, offer a promising alternative. However, their reliance on return-to-go (RTG) cloning and limited generalization capacity restricts their effectiveness in dynamic power system environments. To address these challenges, we introduce an innovative Dual-Head Physics-informed Graph Decision Transformer (DH-PGDT) that integrates physical modeling, structural reasoning, and subgoal-based guidance to enable scalable and robust DSR even in zero-shot or few-shot scenarios. DH-PGDT features a dual-head physics-informed causal transformer architecture comprising Guidance Head, which generates subgoal representations, and Action Head, which uses these subgoals to generate actions independently of RTG. It also incorporates an operational constraint-aware graph reasoning module that encodes power system topology and operational constraints to generate a confidence-weighted action vector for refining DT trajectories. This design effectively improves generalization and enables robust adaptation to unseen scenarios. While this work focuses on DSR, the underlying computing model of the proposed PGDT is broadly applicable to sequential decision making across various power system operations and other complex engineering domains.
Reinforcement learning for robust dynamic metabolic control
Dynamic metabolic control allows key metabolic fluxes to be modulated in real time, enhancing bioprocess flexibility and expanding available optimization degrees of freedom. This is achieved, e.g., via targeted modulation of metabolic enzyme expression. However, identifying optimal dynamic control policies is challenging due to the generally high-dimensional solution space and the need to manage metabolic burden and cytotoxic effects arising from inducible enzyme expression. The task is further complicated by stochastic dynamics, which reduce bioprocess reproducibility. We propose a reinforcement learning framework} to derive optimal policies by allowing an agent (the controller) to interact with a surrogate dynamic model. To promote robustness, we apply domain randomization, enabling the controller to generalize across uncertainties. When transferred to an experimental system, the agent can in principle continue fine-tuning the policy. Our framework provides an alternative to conventional model-based control such as model predictive control, which requires model differentiation with respect to decision variables; often impractical for complex stochastic, nonlinear, stiff, and piecewise-defined dynamics. In contrast, our approach relies on forward integration of the model, thereby simplifying the task. We demonstrate the framework in two $\textit{Escherichia coli}$ bioprocesses: dynamic control of acetyl-CoA carboxylase for fatty-acid synthesis and of adenosine triphosphatase for lactate synthesis.
Exploiting structural observability and graph colorability for optimal sensor placement in water distribution networks
Water distribution networks (WDNs) are critical systems for our society and detecting leakages is important for minimizing losses and water waste. This makes optimal sensor placement for leakage detection very relevant. Existing sensor placement methods rely on simulation-based scenarios, often lacking structure and generalizability, or depend on the knowledge of specific parameters of the WDN as well as on initial sensor data for linearization and demand estimation. Motivated by this, this paper investigates the observability of an entire WDN, based on structural observability theory. This allows us to establish the conditions for the observability of the WDN model, independently of parameter uncertainties. Additionally, a sensor placement algorithm is proposed that leverages such observability conditions and graph theory and accounts for the industrial and material costs. To demonstrate the effectiveness of our approach, we apply it to a hydraulic-transient WDN model.
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
Rapid Urban Visibility Hotspots: Quantifying Building Vertex Visibility from Connected Vehicle Trajectories using Spatial Indexing
Effective placement of Out-of-Home advertising and street furniture requires accurate identification of locations offering maximum visual exposure to target audiences, particularly vehicular traffic. Traditional site selection methods often rely on static traffic counts or subjective assessments. This research introduces a data-driven methodology to objectively quantify location visibility by analyzing large-scale connected vehicle trajectory data (sourced from Compass IoT) within urban environments. We model the dynamic driver field-of-view using a forward-projected visibility area for each vehicle position derived from interpolated trajectories. By integrating this with building vertex locations extracted from OpenStreetMap, we quantify the cumulative visual exposure, or ``visibility count'', for thousands of potential points of interest near roadways. The analysis reveals that visibility is highly concentrated, identifying specific ``visual hotspots'' that receive disproportionately high exposure compared to average locations. The core technical contribution involves the construction of a BallTree spatial index over building vertices. This enables highly efficient (O(logN) complexity) radius queries to determine which vertices fall within the viewing circles of millions of trajectory points across numerous trips, significantly outperforming brute-force geometric checks. Analysis reveals two key findings: 1) Visibility is highly concentrated, identifying distinct 'visual hotspots' receiving disproportionately high exposure compared to average locations. 2) The aggregated visibility counts across vertices conform to a Log-Normal distribution.
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
Reinforcement learning (RL) has proven to be particularly effective in solving complex decision-making problems for a wide range of applications. From a control theory perspective, RL can be considered as an adaptive optimal control scheme. Lyapunov and barrier functions are the most commonly used certificates to guarantee system stability for a proposed/derived controller and constraint satisfaction guarantees, respectively, in control theoretic approaches. However, compared to theoretical guarantees available in control theoretic methods, RL lacks closed-loop stability of a computed policy and constraint satisfaction guarantees. Safe reinforcement learning refers to a class of constrained problems where the constraint violations lead to partial or complete system failure. The goal of this review is to provide an overview of safe RL techniques using Lyapunov and barrier functions to guarantee this notion of safety discussed (stability of the system in terms of a computed policy and constraint satisfaction during training and deployment). The different approaches employed are discussed in detail along with their shortcomings and benefits to provide critique and possible future research directions. Key motivation for this review is to discuss current theoretical approaches for safety and stability guarantees in RL similar to control theoretic approaches using Lyapunov and barrier functions. The review provides proven potential and promising scope of providing safety guarantees for complex dynamical systems with operational constraints using model-based and model-free RL.
comment: pages - 19, figures - 9, Submitted to IEEE TAI
Systems and Control (EESS)
LLMind 2.0: Distributed IoT Automation with Natural Language M2M Communication and Lightweight LLM Agents
Recent advances in large language models (LLMs) have sparked interest in their application to IoT and automation systems, particularly for facilitating device management through natural language instructions. However, existing centralized approaches face significant scalability challenges when managing and coordinating the collaboration between IoT devices of diverse capabilities in large-scale heterogeneous IoT systems. This paper introduces LLMind 2.0, a distributed IoT automation framework that addresses the scalability challenges through lightweight LLM-empowered device agents via natural language-based machine-to-machine (M2M) communication. Unlike previous LLM-controlled automation systems that rely on a centralized coordinator to generate device-specific code to be executed on individual devices, LLMind 2.0 distributes intelligence across individual devices through lightweight LLMs embedded in IoT devices. The central coordinator translates human instructions into simple subtasks described in natural human language, which are then processed by device-specific agents to generate device-specific code locally at the associated devices. This approach transcends device heterogeneity barriers by using natural language as a unified communication medium, enabling seamless collaboration between devices from different manufacturers. The system incorporates several key innovations: a Retrieval-Augmented Generation (RAG) mechanism for accurate subtask-to-API mapping, fine-tuned lightweight LLMs for reliable code generation, and a finite state machine-based task execution framework. Experimental validation in multi-robot warehouse scenarios and real-world WiFi network deployments demonstrates significant improvements in scalability, reliability, and privacy protection compared to the centralized approach.
Energy Management and Wake-up for IoT Networks Powered by Energy Harvesting
The rapid growth of the Internet of Things (IoT) presents sustainability challenges such as increased maintenance requirements and overall higher energy consumption. This motivates self-sustainable IoT ecosystems based on Energy Harvesting (EH). This paper treats IoT deployments in which IoT devices (IoTDs) rely solely on EH to sense and transmit information about events/alarms to a base station (BS). The objective is to effectively manage the duty cycling of the IoTDs to prolong battery life and maximize the relevant data sent to the BS. The BS can also wake up specific IoTDs if extra information about an event is needed upon initial detection. We propose a K-nearest neighbors (KNN)-based duty cycling management to optimize energy efficiency and detection accuracy by considering spatial correlations among IoTDs' activity and their EH process. We evaluate machine learning approaches, including reinforcement learning (RL) and decision transformers (DT), to maximize information captured from events while managing energy consumption. Significant improvements over the state-ofthe-art approaches are obtained in terms of energy saving by all three proposals, KNN, RL, and DT. Moreover, the RL-based solution approaches the performance of a genie-aided benchmark as the number of IoTDs increases.
comment: This work has been partially supported by the Research Council of Finland (Grant 369116 (6G Flagship Programme), Grant 362782), the Finnish Foundation for Technology Promotion, the European Commission through the Horizon Europe/JU SNS project AMBIENT-6G (Grant 101192113), and in Chile, by ANID FONDECYT Regular No.1241977
BioGAP-Ultra: A Modular Edge-AI Platform for Wearable Multimodal Biosignal Acquisition and Processing
The growing demand for continuous physiological monitoring and human-machine interaction in real-world settings calls for wearable platforms that are flexible, low-power, and capable of on-device intelligence. This work presents BioGAP-Ultra, an advanced multimodal biosensing platform that supports synchronized acquisition of diverse electrophysiological and hemodynamic signals such as EEG, EMG, ECG, and PPG while enabling embedded AI processing at state-of-the-art energy efficiency. BioGAP-Ultra is a major extension of our previous design, BioGAP [1], aimed at meeting the rapidly growing requirements of wearable biosensing applications. It features (i) increased on-device storage (x2 SRAM, x4 FLASH), (ii) improved wireless connectivity (1.4 Mbit/s bandwidth, x4 higher than BioGAP), (iii) enhanced number of signal modalities (from 3 to 5) and analog input channels (x2). Further, it is complemented by a complete real-time visualization and analysis software suite, providing access to raw data and real-time configurability on a mobile phone. Electrical characterization and multiple case studies confirm the platform's robustness, configurability, and suitability for real-world multimodal biosignal acquisition and edge intelligence. Finally, we demonstrate the system's versatility through integration into various wearable form factors: an EEG-PPG headband consuming 32.8 mW, an EMG sleeve at 26.7 mW, and an ECG-PPG chest band requiring only 9.3 mW, tailored for diverse biosignal applications. All hardware and software design files are also released open-source with a permissive license.
comment: 13 pages, 12 figures
Singularity-free prescribed performance guaranteed control for perturbed system
This paper addresses the prescribed performance control (PPC) challenge for high-order nonlinear systems affected by mismatched disturbances. The research aims to prevent singularity issues arising from error boundary violations during abrupt changes in reference trajectories. We introduce a novel transformation function with infinite-order differentiability at connection points, advancing beyond mere continuous differentiability. Utilizing this transformation function, we develop a comprehensive transformation strategy that ensures: (1) errors remain within prescribed boundaries when reference trajectories are smooth, and (2) errors return to prescribed boundaries within a specified timeframe following abrupt changes in reference trajectories. Additionally, the complexity explosion issue inherent in backstepping design is effectively resolved. Simulation results corroborate the validity of the proposed theoretical advancements.
comment: 6 pages, 5 figures, accepted by Chinese Automation Conference (CAC) 2025
AutoMPC: A Code Generator for MPC-based Automated Driving
Model Predictive Control (MPC) is a powerful technique to control nonlinear, multi-input multi-output systems subject to input and state constraints. It is now a standard tool for trajectory tracking control of automated vehicles. As such it has been used in many research and development projects. However, MPC faces several challenges to be integrated into industrial production vehicles. The most important ones are its high computational demands and the complexity of implementation. The software packages AutoMPC aims to address both of these challenges. It builds on a robustified version of an active set algorithm for Nonlinear MPC. The algorithm is embedded into a framework for vehicle trajectory tracking, which makes it easy to used, yet highly customizable. Automatic code generation transforms the selections into a standalone, computationally efficient C-code file with static memory allocation. As such it can be readily deployed on a wide range of embedded platforms, e.g., based on Matlab/Simulink or Robot Operating System (ROS). Compared to a previous version of the code, the vehicle model and the numerical integration method can be manually specified, besides basic algorithm parameters. All of this information and all specifications are directly baked into the generated C-code. The algorithm is suitable driving scenarios at low or high speeds, even drifting, and supports direction changes. Multiple simulation scenarios show the versatility and effectiveness of the AutoMPC code, with the guarantee of a feasible solution, a high degree of robustness, and computational efficiency.
comment: Technical Documentation
Model-based Multi-object Visual Tracking: Identification and Standard Model Limitations
This paper uses multi-object tracking methods known from the radar tracking community to address the problem of pedestrian tracking using 2D bounding box detections. The standard point-object (SPO) model is adopted, and the posterior density is computed using the Poisson multi-Bernoulli mixture (PMBM) filter. The selection of the model parameters rooted in continuous time is discussed, including the birth and survival probabilities. Some parameters are selected from the first principles, while others are identified from the data, which is, in this case, the publicly available MOT-17 dataset. Although the resulting PMBM algorithm yields promising results, a mismatch between the SPO model and the data is revealed. The model-based approach assumes that modifying the problematic components causing the SPO model-data mismatch will lead to better model-based algorithms in future developments.
comment: Submitted to FUSION 2025 conference
Transient Stability Analysis for Grid Following Converters in Low-Inertia Power Systems by Direct Method
With the increased penetration of renewable energy and reduced proportion of synchronous generators, the low-inertia characteristics of todays power system become prominent, and the transient stability issue of grid following converter (GFLC) under low inertia system (LIS) condition becomes critical. There are two prominent problems in the transient stability analysis of GFLC-LIS. The angular dynamic of LIS increases the complexity of transient stability analysis, and the nonlinear, possibly negative damping of GFLC makes it difficult to guarantee the conservative of the traditional methods. These problems make the traditional methods inapplicable. In this paper, the transient stability analysis of GFLC LIS is investigated to provide an accurate estimation of the attraction boundary and critical clearance time (CCT). Firstly, a dynamic model of GFLC-LIS is constructed, considering the phase-locked loop (PLL)-based GFLC dynamics and swing equation-based LIS dynamics. The frequency mutation of PLL at fault occurrence and clearing time is also considered. Secondly, a Zubov based transient stability analysis method is proposed, which can construct the energy function in a way that is different from the traditional conservation of energy perspective and can address the negative damping issue. Moreover, the accuracy of the CCT estimation is analyzed, and the influences of LIS parameters on transient stability are illustrated. Finally, simulation experiments are carried out to verify the effectiveness of the proposed method
Towards safe control parameter tuning in distributed multi-agent systems
Many safety-critical real-world problems, such as autonomous driving and collaborative robots, are of a distributed multi-agent nature. To optimize the performance of these systems while ensuring safety, we can cast them as distributed optimization problems, where each agent aims to optimize their parameters to maximize a coupled reward function subject to coupled constraints. Prior work either studies a centralized setting, does not consider safety, or struggles with sample efficiency. Since we require sample efficiency and work with unknown and nonconvex rewards and constraints, we solve this optimization problem using safe Bayesian optimization with Gaussian process regression. Moreover, we consider nearest-neighbor communication between the agents. To capture the behavior of non-neighboring agents, we reformulate the static global optimization problem as a time-varying local optimization problem for each agent, essentially introducing time as a latent variable. To this end, we propose a custom spatio-temporal kernel to integrate prior knowledge. We show the successful deployment of our algorithm in simulations.
comment: Accepted to CDC 2025
Scalable Sensor Placement for Cyclic Networks with Observability Guarantees: Application to Water Distribution Networks
Optimal sensor placement is essential for state estimation and effective network monitoring. As known in the literature, this problem becomes particularly challenging in large-scale undirected or bidirected cyclic networks with parametric uncertainties, such as water distribution networks (WDNs), where pipe resistance and demand patterns are often unknown. Motivated by the challenges of cycles, parametric uncertainties, and scalability, this paper proposes a sensor placement algorithm that guarantees structural observability for cyclic and acyclic networks with parametric uncertainties. By leveraging a graph-based strategy, the proposed method efficiently addresses the computational complexities of large-scale networks. To demonstrate the algorithm's effectiveness, we apply it to several EPANET benchmark WDNs. Most notably, the developed algorithm solves the sensor placement problem with guaranteed structured observability for the L-town WDN with 1694 nodes and 124 cycles in under 0.1 seconds.
comment: Extended version of the paper accepted for IEEE-CDC 2025
Power-Series Approach to Moment-Matching-Based Model Reduction of MIMO Polynomial Nonlinear Systems
The model reduction problem for high-order multi-input, multi-output (MIMO) polynomial nonlinear systems based on moment matching is addressed. The technique of power-series decomposition is exploited: this decomposes the solution of the nonlinear PDE characterizing the center manifold into the solutions of a series of recursively defined Sylvester equations. This approach allows yielding nonlinear reduced-order models in very much the same way as in the linear case (e.g. analytically). Algorithms are proposed for obtaining the order and the parameters of the reduced-order models with precision of degree $\kappa$. The approach also provides new insights into the nonlinear moment matching problem: first, a lower bound for the order of the reduced-order model is obtained, which, in the MIMO case, can be strictly less than the number of matched moments; second, it is revealed that the lower bound is affected by the ratio of the number of the input and output channels; third, it is shown that under mild conditions, a nonlinear reduced-order model can always be constructed with either a linear state equation or a linear output equation.
Repeater Swarm-Assisted Cellular Systems: Interaction Stability and Performance Analysis
We consider a cellular massive MIMO system where swarms of wireless repeaters are deployed to improve coverage. These repeaters are full-duplex relays with small form factors that receive and instantaneously retransmit signals. They can be deployed in a plug-and-play manner at low cost, while being transparent to the network--conceptually they are active channel scatterers with amplification capabilities. Two fundamental questions need to be addressed in repeater deployments: (I) How can we prevent destructive effects of positive feedback caused by inter-repeater interaction (i.e., each repeater receives and amplifies signals from others)? (ii) How much performance improvement can be achieved given that repeaters also inject noise and may introduce more interference? To answer these questions, we first derive a generalized Nyquist stability criterion for the repeater swarm system, and provide an easy-to-check stability condition. Then, we study the uplink performance and develop an efficient iterative algorithm that jointly optimizes the repeater gains, user transmit powers, and receive combining weights to maximize the weighted sum rate while ensuring system stability. Numerical results corroborate our theoretical findings and show that the repeaters can significantly improve the system performance, both in sub-6 GHz and millimeter-wave bands. The results also warrant careful deployment to fully realize the benefits of repeaters, for example, by ensuring a high probability of line-of-sight links between repeaters and the base station.
comment: 16 pages, 13 figures. Submitted to IEEE Transactions on Wireless Communications
Amplitude maximization in stable systems, Schur positivity, and some conjectures on polynomial interpolation
For $r > 0$ and integers $t \ge n > 0$, we consider the following ``amplitude maximization'' problem: maximize the quantity $|x_t|$ over the set of complex solutions $x = (x_0, x_1, \dots)$ of all homogeneous linear difference equations of order $n$ with the roots of characteristic polynomial in the disc $\{z \in \mathbb{C}: |z| \le r\}$, and with initial values $x_0, \dots, x_{n-1}$ in the unit disc. We find that for any $t,n,r$, the maximum is attained in the case of coinciding roots on the boundary circle; this implies that for all $r < 1$, the maximum amplitude can be computed explicitly, by studying a single equation whose characteristic polynomial is $(z-r)^n$. Moreover, the optimality of the co-phase root configuration holds for origin-centered polydiscs. To prove this result, we first reduce the problem to a certain interpolation problem over monomials, then solve the latter by leveraging the theory of symmetric functions and identifying the associated Schur positivity structure. We also discuss the implications for more general Reinhardt domains. Finally, we study the problem of estimating the derivatives of a real entire function from its values at $n/2$ pairs of complex conjugate points in the unit disc. Here, we propose conjectures on the extremality of the monomial $z^n$, and restate them in terms of Schur polynomials.
comment: 18 pages
MuFlex: A Scalable, Physics-based Platform for Multi-Building Flexibility Analysis and Coordination
With the increasing penetration of renewable generation on the power grid, maintaining system balance requires coordinated demand flexibility from aggregations of buildings. Reinforcement learning (RL) has been widely explored for building controls because of its model-free nature. Open-source simulation testbeds are essential not only for training RL agents but also for fairly benchmarking control strategies. However, most building-sector testbeds target single buildings; multi-building platforms are relatively limited and typically rely on simplified models (e.g., Resistance-Capacitance) or data-driven approaches, which lack the ability to fully capture the physical intricacies and intermediate variables necessary for interpreting control performance. Moreover, these platforms often impose fixed inputs, outputs, and model formats, restricting their applicability as benchmarking tools across diverse control scenarios. To address these gaps, MuFlex, a scalable, open-source platform for benchmarking and testing control strategies for multi-building flexibility coordination, was developed in this study. MuFlex enables synchronous information exchange across EnergyPlus building models and adheres to the latest OpenAI Gym interface, providing a modular, standardized RL implementation. The platform capabilities were demonstrated in a case study coordinating demand flexibility across four office buildings using the Soft Actor-Critic algorithm with carefully fine-tuned hyperparameters. The results show that aggregating the four buildings flexibility reduced total peak demand below a specified threshold while maintaining indoor environmental quality.
comment: The platform will be released open-source on GitHub: https://github.com/BuildNexusX/MuFlex once pre-printed
System-Level Performance and Communication Tradeoff in Networked Control with Predictions
Distributed control of large-scale systems is challenging due to the need for scalable and localized communication and computation. In this work, we introduce a Predictive System-Level Synthesis PredSLS framework that designs controllers by jointly integrating communication constraints and local disturbance predictions into an affine feedback structure. Rather than focusing on the worst-case uncertainty, PredSLS leverages both current state feedback and future system disturbance predictions to achieve distributed control of networked systems. In particular, PredSLS enables a unified system synthesis of the optimal $\kappa$-localized controller, therefore outperforms approaches with post hoc communication truncation, as was commonly seen in the literature. The PredSLS framework can be naturally decomposed into spatial and temporal components for efficient and parallelizable computation across the network, yielding a regret upper bound that explicitly depends on the prediction error and communication range. Our regret analysis not only reveals a non-monotonic trade-off between control performance and communication range when prediction errors are present, but also guides the identification of an optimal size for local communication neighborhoods, thereby enabling the co-design of controller and its underlying communication topology.
comment: 52 pages, 13 figures, extended version of our 2025 CDC paper: "PredSLS: A System-Level Framework for Distributed Predictive Control"
Modeling and Control of AWOISV: A Filtered Tube-Based MPC Approach for Simultaneous Tracking of Lateral Position and Heading Angle
An all-wheel omni-directional independent steering vehicle (AWOISV) is a specialized all-wheel independent steering vehicle with each wheel capable of steering up to 90{\deg}, enabling unique maneuvers like yaw and diagonal movement. This paper introduces a theoretical steering radius angle and sideslip angle (\( \theta_R \)-\(\beta_R \)) representation, based on the position of the instantaneous center of rotation relative to the wheel rotation center, defining the motion modes and switching criteria for AWOISVs. A generalized \( v\)-\(\beta\)-\(r \) dynamic model is developed with forward velocity \(v\), sideslip angle \(\beta\), and yaw rate \(r\) as states, and \(\theta_R\) and \(\beta_R\) as control inputs. This model decouples longitudinal and lateral motions into forward and rotational motions, allowing seamless transitions across all motion modes under specific conditions. A filtered tube-based linear time-varying MPC (FT-LTVMPC) strategy is proposed, achieving simultaneous tracking of lateral position and arbitrary heading angles, with robustness to model inaccuracies and parameter uncertainties. Co-simulation and hardware-in-loop (HIL) experiments confirm that FT-LTVMPC enables high-precision control of both position and heading while ensuring excellent real-time performance.
Iterative Youla-Kucera Loop Shaping For Precision Motion Control
This paper presents a numerically robust approach to multi-band disturbance rejection using an iterative Youla-Kucera parameterization technique. The proposed method offers precise control over shaping the frequency response of a feedback loop while maintaining numerical stability through a systematic design process. By implementing an iterative approach, we overcome a critical numerical issue in rejecting vibrations with multiple frequency bands. Meanwhile, our proposed modification of the all-stabilizing Youla-Kucera architecture enables intuitive design while respecting fundamental performance trade-offs and minimizing undesired waterbed amplifications. Numerical validation on a hard disk drive servo system demonstrates significant performance improvements, enabling enhanced positioning precision for increased storage density. The design methodology extends beyond storage systems to various high-precision control applications where multi-band disturbance rejection is critical.
comment: 6pages, To appear at MECC 2025, see https://mecc2025.a2c2.org/
Grid-Edge Energy-Flexible Technologies: A Comparative Analysis Across Generators, Loads, and Energy Storage Systems
This review analysis presents a comprehensive exploration of energy flexibility in modern power systems. It examines the roles and mechanisms of flexible technologies across three main categories: generators, energy storage systems (ESS), and loads. Energy flexibility is defined as the ability to dynamically adjust supply and/or demand in response to grid conditions to maintain balance and stability. This is of particular importance to facilitate the integration of the growing variable renewable energy sources (RES) into modern power grids. Additionally, traditional supply-side mechanisms to maintain balance and stability are complemented by advancements in demand-side management and demand response strategies, which enable loads to adjust consumption patterns and schedules in response to grid requirements. ESS are also explored to further enhance flexibility by absorbing excess generation and/or supplying large load increases that are not able to be met by the less flexible resources. This paper also explores specific flexibility technologies, examining their characteristics, control strategies, advantages, and limitations. Energy flexibility services are also categorized into intermittency mitigation, peak shaving, and energy reserve provisioning. Each service is supported by case studies and examples demonstrating how different resources respond to varying conditions. Ultimately, the findings and reviews of the various flexible resources in this paper provide a roadmap for optimizing energy flexibility across diverse resource types, paving the way for a more sustainable and resilient energy future.
Robust tracking MPC for perturbed nonlinear systems -- Extended version
This paper presents a novel robust predictive controller for constrained nonlinear systems that is able to track piece-wise constant setpoint signals. The tracking model predictive controller presented in this paper extends the nonlinear MPC for tracking to the more complex case of nonlinear systems subject to bounded and not necessarily additive perturbations. The optimal control problem that is solved at each step penalizes the deviation of the predicted nominal system trajectory from an artificial reference, which is added as a decision variable, as well as the distance between the artificial reference and the setpoint. Robust feasibility is ensured by imposing conservative constraints that take into account the effect of uncertainties and convergence to a neighborhood of any feasible setpoint is guaranteed by means of an appropriate terminal cost and an extended stabilizing terminal constraint. In the case of unreachable setpoints, convergence to a neighborhood of the optimal reachable steady output is also proved.
Autonomy at Levels for Spacecraft
Autonomy at Levels is the idea that autonomy should be embedded within and throughout a spacecraft. Using Systems Engineering methods a spacecraft is typically decomposed into systems, subsystems, assemblies, components, and so on. All these decomposition levels within all the spacecraft's systems, could and should have autonomy elements built in. As a result, the "autonomy system" is made of autonomy elements or units that are integrated, distributed and embedded within the whole spacecraft. This is like how the power system would be designed and implemented. Linking control loops and autonomy loops illustrates how to achieve Autonomy at Levels.
comment: 18 pages, 10 figures, to be published in 2025 AAS/AIAA Astrodynamics Specialist conference proceedings
Design and Optimization of a Hybrid VLC/THz Infrastructure-to-Vehicle Communication System for Intelligent Transportation
This paper proposes a hybrid infrastructure-to-vehicle (I2V) communication framework to support future 6G-enabled intelligent transportation systems (ITS) in smart cities. Leveraging existing LED streetlighting infrastructure, the system simultaneously delivers energy-efficient illumination and high-speed wireless connectivity. The proposed scheme integrates visible light communication (VLC) with a complementary ter-ahertz (THz) antenna array to overcome VLC limitations under high ambient light and adverse weather conditions. Key con-tributions include the design of a VLC/THz access network, seamless integration with lighting infrastructure, a proposed switching-combination (PSC) mechanism, and a physical layout optimization strategy. Using a grid search method, thousands of configurations were evaluated to maximize lighting coverage, re-ceived power, signal-to-noise ratio (SNR), signal-to-interference-and-noise ratio (SINR), and minimize outage probability. Results show that optimized lighting coverage improves from 35% to 97%, while hybrid communication coverage increases from 49%to 99.9% at the same power level. Under extreme environmental conditions, the hybrid system maintains up to 99% coverage, compared to 69% with VLC alone. These results demonstrate the scalability, cost-efficiency, and practicality of the proposed system for next-generation ITS deployment.
A Data-Based Review of Battery Electric Vehicle and Traction Inverter Trends
Battery electric vehicles (BEVs) have advanced significantly during the past decade, yet drivetrain energy losses continue to restrict practical range and elevate cost. A dataset comprising more than 1000 European-market BEVs (model years 2010-2025) is combined with detailed inverter-motor co-simulation to chart technology progress for and quantify the efficiency and cost-saving potential of partial-load optimised multi-level inverter (MLI) for 2030. Average drive-cycle range has climbed from 135 km to 455 km, while fleet-average energy consumption has remained virtually constant. Three inverter topologies are assessed to evaluate future efficiency and cost enhancements: a conventional two-level (2L) six halfbridge (B6) inverter with silicon (Si) and silicon carbide (SiC) devices, and two three-level (3L) T-type neutral point clamped (TNPC) and active neutral point clamped (ANPC) inverters tailored for partial-load operation. The 3L-TNPC inverter, realised with only 30% additional SiC chip area, lowers drive-cycle drivetrain losses by 0.67 kWh/100 km relative to a SiC 2L-B6 baseline. These results identify partial-load optimised MLIs as a cost-effective route to further reduce BEV energy consumption and total system cost.
comment: 8 pages, 9 figures, 2025 IECON 51st Annual Conference of the IEEE IES
Reliability comparison of vessel trajectory prediction models via Probability of Detection
This contribution addresses vessel trajectory prediction (VTP), focusing on the evaluation of different deep learning-based approaches. The objective is to assess model performance in diverse traffic complexities and compare the reliability of the approaches. While previous VTP models overlook the specific traffic situation complexity and lack reliability assessments, this research uses a probability of detection analysis to quantify model reliability in varying traffic scenarios, thus going beyond common error distribution analyses. All models are evaluated on test samples categorized according to their traffic situation during the prediction horizon, with performance metrics and reliability estimates obtained for each category. The results of this comprehensive evaluation provide a deeper understanding of the strengths and weaknesses of the different prediction approaches, along with their reliability in terms of the prediction horizon lengths for which safe forecasts can be guaranteed. These findings can inform the development of more reliable vessel trajectory prediction approaches, enhancing safety and efficiency in future inland waterways navigation.
comment: 2025 IEEE Intelligent Vehicles Symposium (IV)
Lightweight Tracking Control for Computationally Constrained Aerial Systems with the Newton-Raphson Method
We investigate the performance of a lightweight tracking controller, based on a flow version of the Newton-Raphson method, applied to a miniature blimp and a mid-size quadrotor. This tracking technique has been shown to enjoy theoretical guarantees of performance and has been applied with success in simulation studies and on mobile robots with simple motion models. This paper investigates the technique through real-world flight experiments on aerial hardware platforms subject to realistic deployment and onboard computational constraints. The technique's performance is assessed in comparison with the established control frameworks of feedback linearization for the blimp, and nonlinear model predictive control for both quadrotor and blimp. The performance metrics under consideration are (i) root mean square error of flight trajectories with respect to target trajectories, (ii) algorithms' computation times, and (iii) CPU energy consumption associated with the control algorithms. The experimental findings show that the Newton-Raphson flow-based tracking controller achieves comparable or superior tracking performance to the baseline methods with substantially reduced computation time and energy expenditure.
Towards Unified Probabilistic Verification and Validation of Vision-Based Autonomy
Precise and comprehensive situational awareness is a critical capability of modern autonomous systems. Deep neural networks that perceive task-critical details from rich sensory signals have become ubiquitous; however, their black-box behavior and sensitivity to environmental uncertainty and distribution shifts make them challenging to verify formally. Abstraction-based verification techniques for vision-based autonomy produce safety guarantees contingent on rigid assumptions, such as bounded errors or known unique distributions. Such overly restrictive and inflexible assumptions limit the validity of the guarantees, especially in diverse and uncertain test-time environments. We propose a methodology that unifies the verification models of perception with their offline validation. Our methodology leverages interval MDPs and provides a flexible end-to-end guarantee that adapts directly to the out-of-distribution test-time conditions. We evaluate our methodology on a synthetic perception Markov chain with well-defined state estimation distributions and a mountain car benchmark. Our findings reveal that we can guarantee tight yet rigorous bounds on overall system safety.
comment: Accepted by the 23rd International Symposium on Automated Technology for Verification and Analysis (ATVA'25)
Safeguarding ISAC Performance in Low-Altitude Wireless Networks Under Channel Access Attack
The increasing saturation of terrestrial resources has driven the exploration of low-altitude applications such as air taxis. Low altitude wireless networks (LAWNs) serve as the foundation for these applications, and integrated sensing and communication (ISAC) constitutes one of the core technologies within LAWNs. However, the openness nature of low-altitude airspace makes LAWNs vulnerable to malicious channel access attacks, which degrade the ISAC performance. Therefore, this paper develops a game-based framework to mitigate the influence of the attacks on LAWNs. Concretely, we first derive expressions of communication data's signal-to-interference-plus-noise ratio and the age of information of sensing data under attack conditions, which serve as quality of service metrics. Then, we formulate the ISAC performance optimization problem as a Stackelberg game, where the attacker acts as the leader, and the legitimate drone and the ground ISAC base station act as second and first followers, respectively. On this basis, we design a backward induction algorithm that achieves the Stackelberg equilibrium while maximizing the utilities of all participants, thereby mitigating the attack-induced degradation of ISAC performance in LAWNs. We further prove the existence and uniqueness of the equilibrium. Simulation results show that the proposed algorithm outperforms existing baselines and a static Nash equilibrium benchmark, ensuring that LAWNs can provide reliable service for low-altitude applications.
A Hierarchical Surrogate Model for Efficient Multi-Task Parameter Learning in Closed-Loop Control
Many control problems require repeated tuning and adaptation of controllers across distinct closed-loop tasks, where data efficiency and adaptability are critical. We propose a hierarchical Bayesian optimization (BO) framework that is tailored to efficient controller parameter learning in sequential decision-making and control scenarios for distinct tasks. Instead of treating the closed-loop cost as a black-box, our method exploits structural knowledge of the underlying problem, consisting of a dynamical system, a control law, and an associated closed-loop cost function. We construct a hierarchical surrogate model using Gaussian processes that capture the closed-loop state evolution under different parameterizations, while the task-specific weighting and accumulation into the closed-loop cost are computed exactly via known closed-form expressions. This allows knowledge transfer and enhanced data efficiency between different closed-loop tasks. The proposed framework retains sublinear regret guarantees on par with standard black-box BO, while enabling multi-task or transfer learning. Simulation experiments with model predictive control demonstrate substantial benefits in both sample efficiency and adaptability when compared to purely black-box BO approaches.
comment: 8 pages, 4 figures, accepted for CDC 2025
Exposing Barriers to Flexibility Aggregation in Unbalanced Distribution Networks
The increasing integration of distributed energy resources (DER) offers new opportunities for distribution system operators (DSO) to improve network operation through flexibility services. To utilise flexible resources, various DER flexibility aggregation methods have been proposed, such as the concept of aggregated P-Q flexibility areas. Yet, many existing studies assume perfect coordination among DER and rely on single-phase power flow analysis, thus overlooking barriers to flexibility aggregation in real unbalanced systems. To quantify the impact of these barriers, this paper proposes a three-phase optimal power flow (OPF) framework for P-Q flexibility assessment, implemented as an open-source Julia tool 3FlexAnalyser.jl. The framework explicitly accounts for voltage unbalance and imperfect coordination among DER in low voltage (LV) distribution networks. Simulations on an illustrative 5-bus system and a real 221-bus LV network in the UK reveal that over 30% of the theoretical aggregated flexibility potential can be lost due to phase unbalance and lack of coordination across phases. These findings highlight the need for improved flexibility aggregation tools applicable to real unbalanced distribution networks.
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Achieving Dispatchability in Data Centers: Carbon and Cost-Aware Sizing of Energy Storage and Local Photovoltaic Generation
Data centers are large electricity consumers due to the high consumption needs of servers and their cooling systems. Given the current crypto-currency and artificial intelligence trends, the data center electricity demand is bound to grow significantly. With the electricity sector being responsible for a large share of global greenhouse gas (GHG) emissions, it is important to lower the carbon footprint of data centers to meet GHG emissions targets set by international agreements. Moreover, uncontrolled integration of data centers in power distribution grids contributes to increasing the stochasticity of the power system demand, thus increasing the need for capacity reserves, which leads to economic and environmental inefficiencies in the power grid operation. This work provides a method to size a PhotoVoltaic (PV) system and an Energy Storage System (ESS) for an existing data center looking to reduce both its carbon footprint and demand stochasticity via dispatching. The proposed scenario-based optimization framework allows to size the ESS and the PV system to minimize the expected operational and capital costs, along with the carbon footprint of the data center complex. The life cycle assessment of the resources, as well as the dynamic carbon emissions of the upstream power distribution grid, are accounted for while computing the day-ahead planning of the data center aggregated demand and PV generation. Case studies in different Swiss cantons and regions of Germany emphasize the need for location-aware sizing processes since the obtained optimal solutions strongly depend on the local electricity carbon footprint, cost and on the local irradiance conditions. Some regions show potential in carbon footprint reduction, while other regions do not.
comment: 22 pages, 7 figures Submitted to Applied Energy, currently under review
On-Orbit Servicing Optimization Framework with High- and Low-Thrust Propulsion Tradeoff
This paper proposes an on-orbit servicing logistics optimization framework that is capable of performing the short-term operational scheduling and long-term strategic planning of sustainable servicing infrastructures that involve high-thrust, low-thrust, and/or multimodal servicers supported by orbital depots. The proposed framework generalizes the state-of-the-art on-orbit servicing logistics optimization method by incorporating user-defined trajectory models and optimizing the logistics operations with the propulsion technology and trajectory tradeoff in consideration. Mixed-Integer Linear Programming is leveraged to find the optimal operations of the servicers over a given period, while the Rolling Horizon approach is used to consider a long time horizon accounting for the uncertainties in service demand. Several analyses are carried out to demonstrate the value of the proposed framework in automatically trading off the high- and low-thrust propulsion systems for both short-term operational scheduling and long-term strategic planning of on-orbit servicing infrastructures.
comment: 45 pages, 16 figures, Accepted by and Published in the Journal of Spacecraft and Rockets (Accepted Version)
Framework for Modeling and Optimization of On-Orbit Servicing Operations under Demand Uncertainties SC
This paper develops a framework that models and optimizes the operations of complex on-orbit servicing infrastructures involving one or more servicers and orbital depots to provide multiple types of services to a fleet of geostationary satellites. The proposed method extends the state-of-the-art space logistics technique by addressing the unique challenges in on-orbit servicing applications, and integrate it with the Rolling Horizon decision making approach. The space logistics technique enables modeling of the on-orbit servicing logistical operations as a Mixed-Integer Linear Program whose optimal solutions can efficiently be found. The Rolling Horizon approach enables the assessment of the long-term value of an on-orbit servicing infrastructure by accounting for the uncertain service needs that arise over time among the geostationary satellites. Two case studies successfully demonstrate the effectiveness of the framework for (1) short-term operational scheduling and (2) long-term strategic decision making for on-orbit servicing architectures under diverse market conditions.
comment: 46 pages, 21 figures, a former version was presented at the AIAA ASCEND Conference; Accepted by and Published in the Journal of Spacecraft and Rockets (Accepted Version)
Finite Sample Analysis of System Poles for Ho-Kalman Algorithm
The Ho-Kalman algorithm has been widely employed for the identification of discrete-time linear time-invariant (LTI) systems. In this paper, we investigate the pole estimation error for the Ho-Kalman algorithm based on finite input/output sample data. Building upon prior works, we derive finite sample error bounds for system pole estimation in both single-trajectory and multiple-trajectory scenarios. Specifically, we prove that, with high probability, the estimation error for an $n$-dimensional system decreases at a rate of at least $\mathcal{O}(T^{-1/2n})$ in the single-trajectory setting with trajectory length $T$, and at a rate of at least $\mathcal{O}(N^{-1/2n})$ in the multiple-trajectory setting with $N$ independent trajectories. Furthermore, we reveal that in both settings, achieving a constant estimation error requires a super-polynomial sample size in $ \max\{n/m, n/p\} $, where $n/m$ and $n/p$ denote the state-to-output and state-to-input dimension ratios, respectively. Finally, numerical experiments are conducted to validate the non-asymptotic results of system pole estimation.
comment: 12 pages, 2 figures
Dual-Head Physics-Informed Graph Decision Transformer for Distribution System Restoration
Driven by recent advances in sensing and computing, deep reinforcement learning (DRL) technologies have shown great potential for addressing distribution system restoration (DSR) under uncertainty. However, their data-intensive nature and reliance on the Markov Decision Process (MDP) assumption limit their ability to handle scenarios that require long-term temporal dependencies or few-shot and zero-shot decision making. Emerging Decision Transformers (DTs), which leverage causal transformers for sequence modeling in DRL tasks, offer a promising alternative. However, their reliance on return-to-go (RTG) cloning and limited generalization capacity restricts their effectiveness in dynamic power system environments. To address these challenges, we introduce an innovative Dual-Head Physics-informed Graph Decision Transformer (DH-PGDT) that integrates physical modeling, structural reasoning, and subgoal-based guidance to enable scalable and robust DSR even in zero-shot or few-shot scenarios. DH-PGDT features a dual-head physics-informed causal transformer architecture comprising Guidance Head, which generates subgoal representations, and Action Head, which uses these subgoals to generate actions independently of RTG. It also incorporates an operational constraint-aware graph reasoning module that encodes power system topology and operational constraints to generate a confidence-weighted action vector for refining DT trajectories. This design effectively improves generalization and enables robust adaptation to unseen scenarios. While this work focuses on DSR, the underlying computing model of the proposed PGDT is broadly applicable to sequential decision making across various power system operations and other complex engineering domains.
Reinforcement learning for robust dynamic metabolic control
Dynamic metabolic control allows key metabolic fluxes to be modulated in real time, enhancing bioprocess flexibility and expanding available optimization degrees of freedom. This is achieved, e.g., via targeted modulation of metabolic enzyme expression. However, identifying optimal dynamic control policies is challenging due to the generally high-dimensional solution space and the need to manage metabolic burden and cytotoxic effects arising from inducible enzyme expression. The task is further complicated by stochastic dynamics, which reduce bioprocess reproducibility. We propose a reinforcement learning framework} to derive optimal policies by allowing an agent (the controller) to interact with a surrogate dynamic model. To promote robustness, we apply domain randomization, enabling the controller to generalize across uncertainties. When transferred to an experimental system, the agent can in principle continue fine-tuning the policy. Our framework provides an alternative to conventional model-based control such as model predictive control, which requires model differentiation with respect to decision variables; often impractical for complex stochastic, nonlinear, stiff, and piecewise-defined dynamics. In contrast, our approach relies on forward integration of the model, thereby simplifying the task. We demonstrate the framework in two $\textit{Escherichia coli}$ bioprocesses: dynamic control of acetyl-CoA carboxylase for fatty-acid synthesis and of adenosine triphosphatase for lactate synthesis.
Exploiting structural observability and graph colorability for optimal sensor placement in water distribution networks
Water distribution networks (WDNs) are critical systems for our society and detecting leakages is important for minimizing losses and water waste. This makes optimal sensor placement for leakage detection very relevant. Existing sensor placement methods rely on simulation-based scenarios, often lacking structure and generalizability, or depend on the knowledge of specific parameters of the WDN as well as on initial sensor data for linearization and demand estimation. Motivated by this, this paper investigates the observability of an entire WDN, based on structural observability theory. This allows us to establish the conditions for the observability of the WDN model, independently of parameter uncertainties. Additionally, a sensor placement algorithm is proposed that leverages such observability conditions and graph theory and accounts for the industrial and material costs. To demonstrate the effectiveness of our approach, we apply it to a hydraulic-transient WDN model.
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
Rapid Urban Visibility Hotspots: Quantifying Building Vertex Visibility from Connected Vehicle Trajectories using Spatial Indexing
Effective placement of Out-of-Home advertising and street furniture requires accurate identification of locations offering maximum visual exposure to target audiences, particularly vehicular traffic. Traditional site selection methods often rely on static traffic counts or subjective assessments. This research introduces a data-driven methodology to objectively quantify location visibility by analyzing large-scale connected vehicle trajectory data (sourced from Compass IoT) within urban environments. We model the dynamic driver field-of-view using a forward-projected visibility area for each vehicle position derived from interpolated trajectories. By integrating this with building vertex locations extracted from OpenStreetMap, we quantify the cumulative visual exposure, or ``visibility count'', for thousands of potential points of interest near roadways. The analysis reveals that visibility is highly concentrated, identifying specific ``visual hotspots'' that receive disproportionately high exposure compared to average locations. The core technical contribution involves the construction of a BallTree spatial index over building vertices. This enables highly efficient (O(logN) complexity) radius queries to determine which vertices fall within the viewing circles of millions of trajectory points across numerous trips, significantly outperforming brute-force geometric checks. Analysis reveals two key findings: 1) Visibility is highly concentrated, identifying distinct 'visual hotspots' receiving disproportionately high exposure compared to average locations. 2) The aggregated visibility counts across vertices conform to a Log-Normal distribution.
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
Reinforcement learning (RL) has proven to be particularly effective in solving complex decision-making problems for a wide range of applications. From a control theory perspective, RL can be considered as an adaptive optimal control scheme. Lyapunov and barrier functions are the most commonly used certificates to guarantee system stability for a proposed/derived controller and constraint satisfaction guarantees, respectively, in control theoretic approaches. However, compared to theoretical guarantees available in control theoretic methods, RL lacks closed-loop stability of a computed policy and constraint satisfaction guarantees. Safe reinforcement learning refers to a class of constrained problems where the constraint violations lead to partial or complete system failure. The goal of this review is to provide an overview of safe RL techniques using Lyapunov and barrier functions to guarantee this notion of safety discussed (stability of the system in terms of a computed policy and constraint satisfaction during training and deployment). The different approaches employed are discussed in detail along with their shortcomings and benefits to provide critique and possible future research directions. Key motivation for this review is to discuss current theoretical approaches for safety and stability guarantees in RL similar to control theoretic approaches using Lyapunov and barrier functions. The review provides proven potential and promising scope of providing safety guarantees for complex dynamical systems with operational constraints using model-based and model-free RL.
comment: pages - 19, figures - 9, Submitted to IEEE TAI
Multiagent Systems
The Social Context of Human-Robot Interactions
The Human-Robot Interaction (HRI) community often highlights the social context of an interaction as a key consideration when designing, implementing, and evaluating robot behavior. Unfortunately, researchers use the term "social context" in varied ways. This can lead to miscommunication, making it challenging to draw connections between related work on understanding and modeling the social contexts of human-robot interactions. To address this gap, we survey the HRI literature for existing definitions and uses of the term "social context". Then, we propose a conceptual model for describing the social context of a human-robot interaction. We apply this model to existing work, and we discuss a range of attributes of social contexts that can help researchers plan for interactions, develop behavior models for robots, and gain insights after interactions have taken place. We conclude with a discussion of open research questions in relation to understanding and modeling the social contexts of human-robot interactions.
comment: To be published in Annual Review of Control, Robotics, and Autonomous Systems
LLM-Powered Virtual Patient Agents for Interactive Clinical Skills Training with Automated Feedback
Objective Structured Clinical Examinations (OSCEs) are essential for medical training, but they require significant resources, including professional actors and expert medical feedback. Although Large Language Models (LLMs) have introduced text-based virtual patients for communication practice, these simulations often lack the capability for richer, non-textual interactions. This paper presents a novel framework that significantly enhances LLM-based simulated patients by equipping them with action spaces, thereby enabling more realistic and dynamic patient behaviors that extend beyond text. Furthermore, our system incorporates virtual tutors that provide students with instant, personalized feedback on their performance at any time during these simulated encounters. We have conducted a rigorous evaluation of the framework's real-time performance, including system latency and component accuracy. Preliminary evaluations with medical experts assessed the naturalness and coherence of the simulated patients, as well as the usefulness and appropriateness of the virtual tutor's assessments. This innovative system provides medical students with a low-cost, accessible platform for personalized OSCE preparation at home.
RED.AI Id-Pattern: First Results of Stone Deterioration Patterns with Multi-Agent Systems
The Id-Pattern system within the RED.AI project (Reabilita\c{c}\~ao Estrutural Digital atrav\'es da AI) consists of an agentic system designed to assist in the identification of stone deterioration patterns. Traditional methodologies, based on direct observation by expert teams, are accurate but costly in terms of time and resources. The system developed here introduces and evaluates a multi-agent artificial intelligence (AI) system, designed to simulate collaboration between experts and automate the diagnosis of stone pathologies from visual evidence. The approach is based on a cognitive architecture that orchestrates a team of specialized AI agents which, in this specific case, are limited to five: a lithologist, a pathologist, an environmental expert, a conservator-restorer, and a diagnostic coordinator. To evaluate the system we selected 28 difficult images involving multiple deterioration patterns. Our first results showed a huge boost on all metrics of our system compared to the foundational model.
comment: 11 pages, 1 figure, 1 table. Contribution for REEACH 2025 Symposium
The Multi-Stage Assignment Problem: A Fairness Perspective ECAI
This paper explores the problem of fair assignment on Multi-Stage graphs. A multi-stage graph consists of nodes partitioned into $K$ disjoint sets (stages) structured as a sequence of weighted bipartite graphs formed across adjacent stages. The goal is to assign node-disjoint paths to $n$ agents starting from the first stage and ending in the last stage. We show that an efficient assignment that minimizes the overall sum of costs of all the agents' paths may be highly unfair and lead to significant cost disparities (envy) among the agents. We further show that finding an envy-minimizing assignment on a multi-stage graph is NP-hard. We propose the C-Balance algorithm, which guarantees envy that is bounded by $2M$ in the case of two agents, where $M$ is the maximum edge weight. We demonstrate the algorithm's tightness by presenting an instance where the envy is $2M$. We further show that the cost of fairness ($CoF$), defined as the ratio of the cost of the assignment given by the fair algorithm to that of the minimum cost assignment, is bounded by $2$ for C-Balance. We then extend this approach to $n$ agents by proposing the DC-Balance algorithm that makes iterative calls to C-Balance. We show the convergence of DC-Balance, resulting in envy that is arbitrarily close to $2M$. We derive $CoF$ bounds for DC-Balance and provide insights about its dependency on the instance-specific parameters and the desired degree of envy. We experimentally show that our algorithm runs several orders of magnitude faster than a suitably formulated ILP.
comment: The original version of this paper is accepted in the 28th European Conference on Artificial Intelligence (ECAI), 2025
COCO: Cognitive Operating System with Continuous Oversight for Multi-Agent Workflow Reliability
Large-scale multi-agent workflows exhibit inherent vulnerability to error propagation and quality degradation, where downstream agents compound upstream failures without corrective mechanisms. We introduce COCO (Cognitive Operating System with Continuous Oversight), a theoretically-grounded framework that implements asynchronous self-monitoring and adaptive error correction in multi-agent driven systems. COCO addresses the fundamental trade-off between quality assurance and computational efficiency through a novel decoupled architecture that separates error detection from the critical execution path, achieving $O(1)$ monitoring overhead relative to workflow complexity. COCO employs three key algorithmic innovations to address systematic and stochastic errors: (1) Contextual Rollback Mechanism - a stateful restart protocol that preserves execution history and error diagnostics, enabling informed re-computation rather than naive retry; (2) Bidirectional Reflection Protocol - a mutual validation system between monitoring and execution modules that prevents oscillatory behavior and ensures convergence; (3) Heterogeneous Cross-Validation - leveraging model diversity to detect systematic biases and hallucinations through ensemble disagreement metrics. Extensive experiments on benchmark multi-agent tasks demonstrate 6.5\% average performance improvement, establishing new state-of-the-art for autonomous workflow reliability.
BetaWeb: Towards a Blockchain-enabled Trustworthy Agentic Web
The rapid development of large language models (LLMs) has significantly propelled the development of artificial intelligence (AI) agents, which are increasingly evolving into diverse autonomous entities, advancing the LLM-based multi-agent systems (LaMAS). However, current agentic ecosystems remain fragmented and closed. Establishing an interconnected and scalable paradigm for Agentic AI has become a critical prerequisite. Although Agentic Web proposes an open architecture to break the ecosystem barriers, its implementation still faces core challenges such as privacy protection, data management, and value measurement. Existing centralized or semi-centralized paradigms suffer from inherent limitations, making them inadequate for supporting large-scale, heterogeneous, and cross-domain autonomous interactions. To address these challenges, this paper introduces the blockchain-enabled trustworthy Agentic Web (BetaWeb). By leveraging the inherent strengths of blockchain, BetaWeb not only offers a trustworthy and scalable infrastructure for LaMAS but also has the potential to advance the Web paradigm from Web3 (centered on data ownership) towards Web3.5, which emphasizes ownership of agent capabilities and the monetization of intelligence. Beyond a systematic examination of the BetaWeb framework, this paper presents a five-stage evolutionary roadmap, outlining the path of LaMAS from passive execution to advanced collaboration and autonomous governance. We also conduct a comparative analysis of existing products and discuss key challenges of BetaWeb from multiple perspectives. Ultimately, we argue that deep integration between blockchain and LaMAS can lay the foundation for a resilient, trustworthy, and sustainably incentivized digital ecosystem. A summary of the enabling technologies for each stage is available at https://github.com/MatZaharia/BetaWeb.
comment: A technical report with 21 pages, 3 figures, and 3 tables
Self-Organizing Agent Network for LLM-based Workflow Automation
Recent multi-agent frameworks built upon large language models (LLMs) have demonstrated remarkable capabilities in complex task planning. However, in real-world enterprise environments, business workflows are typically composed through modularization and reuse of numerous subprocesses, resulting in intricate workflows characterized by lengthy and deeply nested execution paths. Such complexity poses significant challenges for LLM-driven orchestration, as extended reasoning chains and state-space explosions severely impact planning effectiveness and the proper sequencing of tool invocations. Therefore, developing an orchestration method with controllable structures capable of handling multi-layer nesting becomes a critical issue. To address this, we propose a novel structure-driven orchestration framework Self-Organizing Agent Network (SOAN). SOAN incrementally builds a formalized agent network by identifying and encapsulating structural units as independent agents, enhancing modularity and clarity in orchestration. Extensive evaluations were performed using multiple benchmarks as well as a real-world enterprise workflow dataset. Experimental results demonstrate that SOAN significantly outperforms state-of-the-art methods in terms of adaptability, fault tolerance, and execution efficiency.
MACTAS: Self-Attention-Based Module for Inter-Agent Communication in Multi-Agent Reinforcement Learning AAAI 2026
Communication is essential for the collective execution of complex tasks by human agents, motivating interest in communication mechanisms for multi-agent reinforcement learning (MARL). However, existing communication protocols in MARL are often complex and non-differentiable. In this work, we introduce a self-attention-based communication module that exchanges information between the agents in MARL. Our proposed approach is fully differentiable, allowing agents to learn to generate messages in a reward-driven manner. The module can be seamlessly integrated with any action-value function decomposition method and can be viewed as an extension of such decompositions. Notably, it includes a fixed number of trainable parameters, independent of the number of agents. Experimental results on the SMAC benchmark demonstrate the effectiveness of our approach, which achieves state-of-the-art performance on several maps.
comment: Submitted for AAAI 2026
Multi-Robot Navigation in Social Mini-Games: Definitions, Taxonomy, and Algorithms
The ``Last Mile Challenge'' has long been considered an important, yet unsolved, challenge for autonomous vehicles, public service robots, and delivery robots. A central issue in this challenge is the ability of robots to navigate constrained and cluttered environments (e.g., doorways, hallways, corridor intersections), often while competing for space with other robots and humans. We refer to these environments as ``Social Mini-Games'' (SMGs). SMGs are tightly coupled, high-agency interactions that arise within general multi-robot navigation (MRN) scenarios. They are identified through certain distinct characteristics and require specialized metrics to evaluate them. Traditional navigation approaches designed for MRN do not perform well in SMGs, which has led to focused research on dedicated SMG solvers (navigation methods specialized to navigate in SMGs), which has flourished in recent years. However, publications on SMG navigation research make different assumptions (on centralized versus decentralized, observability, communication, cooperation, etc.), and have different objective functions (safety versus liveness). These assumptions and objectives are sometimes implicitly assumed or described informally. This makes it difficult to establish appropriate baselines for comparison in research papers, as well as making it difficult for practitioners to find the papers relevant to their concrete application. Such ad-hoc representation of the field also presents a barrier to new researchers wanting to start research in this area. SMG navigation research requires its own taxonomy, definitions, and evaluation protocols to guide effective research moving forward. This survey is the first to catalog SMG solvers using a well-defined and unified taxonomy and to classify existing methods accordingly.
MultiFuzz: A Dense Retrieval-based Multi-Agent System for Network Protocol Fuzzing
Traditional protocol fuzzing techniques, such as those employed by AFL-based systems, often lack effectiveness due to a limited semantic understanding of complex protocol grammars and rigid seed mutation strategies. Recent works, such as ChatAFL, have integrated Large Language Models (LLMs) to guide protocol fuzzing and address these limitations, pushing protocol fuzzers to wider exploration of the protocol state space. But ChatAFL still faces issues like unreliable output, LLM hallucinations, and assumptions of LLM knowledge about protocol specifications. This paper introduces MultiFuzz, a novel dense retrieval-based multi-agent system designed to overcome these limitations by integrating semantic-aware context retrieval, specialized agents, and structured tool-assisted reasoning. MultiFuzz utilizes agentic chunks of protocol documentation (RFC Documents) to build embeddings in a vector database for a retrieval-augmented generation (RAG) pipeline, enabling agents to generate more reliable and structured outputs, enhancing the fuzzer in mutating protocol messages with enhanced state coverage and adherence to syntactic constraints. The framework decomposes the fuzzing process into modular groups of agents that collaborate through chain-of-thought reasoning to dynamically adapt fuzzing strategies based on the retrieved contextual knowledge. Experimental evaluations on the Real-Time Streaming Protocol (RTSP) demonstrate that MultiFuzz significantly improves branch coverage and explores deeper protocol states and transitions over state-of-the-art (SOTA) fuzzers such as NSFuzz, AFLNet, and ChatAFL. By combining dense retrieval, agentic coordination, and language model reasoning, MultiFuzz establishes a new paradigm in autonomous protocol fuzzing, offering a scalable and extensible foundation for future research in intelligent agentic-based fuzzing systems.
Macroeconomic Foundation of Monetary Accounting by Diagrams of Categorical Universals
We present a category theoretical formulation of the Monetary Macroeconomic Accounting Theory (MoMaT) of Men\'endez and Winschel [2025]. We take macroeconomic (national) accounting systems to be composed from microeconomic double-entry systems with real and monetary units of accounts. Category theory is the compositional grammar and module system of mathematics which we use to lift micro accounting consistency to the macro level. The main function of money in MoMaT is for the repayment of loans and not for the exchange of goods, bridging the desynchronisation of input and output payments of producers. Accordingly, temporal accounting consistency is at the macroeconomic level. We show that the accounting for macroeconomies organised by a division of labor can be consistent and stable as a prerequisite for risk and GDP sharing of societies. We exemplify the theory by five sectoral agents of Labor and Resource owners, a Company as the productive sector, a Capitalist for profits, and a Bank as the financial sector providing loans to synchronise the micro and the macro levels of an economy. The dynamics is described by eight sectoral macroeconomic bookings in each period demonstrating stable convergence of the MoMaT in numerical simulations. The categorical program implements a consistent evolution of hierarchical loan repayment contracts by an endofunctor. The universal constructions of a limit verify all constraints as the sectoral investment and learning function at the macroeconomic level. The dual colimit computes the aggregated informations at the macro level as usual in the mathematics of transitions from local to global structures. We use visual diagrams to make complex economic relationships intuitive. This paper is meant to map economic to categorical concepts to enable interdisciplinary collaboration for digital twins of monetary accounting systems.
An Improved Multi-Agent Algorithm for Cooperative and Competitive Environments by Identifying and Encouraging Cooperation among Agents
We propose an improved algorithm by identifying and encouraging cooperative behavior in multi-agent environments. First, we analyze the shortcomings of existing algorithms in addressing multi-agent reinforcement learning problems. Then, based on the existing algorithm MADDPG, we introduce a new parameter to increase the reward that an agent can obtain when cooperative behavior among agents is identified. Finally, we compare our improved algorithm with MADDPG in environments from PettingZoo. The results show that the new algorithm helps agents achieve both higher team rewards and individual rewards.
Trust, but verify
Decentralized AI agent networks, such as Gaia, allows individuals to run customized LLMs on their own computers and then provide services to the public. However, in order to maintain service quality, the network must verify that individual nodes are running their designated LLMs. In this paper, we demonstrate that in a cluster of mostly honest nodes, we can detect nodes that run unauthorized or incorrect LLM through social consensus of its peers. We will discuss the algorithm and experimental data from the Gaia network. We will also discuss the intersubjective validation system, implemented as an EigenLayer AVS to introduce financial incentives and penalties to encourage honest behavior from LLM nodes.
Congestion Mitigation Path Planning for Large-Scale Multi-Agent Navigation in Dense Environments
In high-density environments where numerous autonomous agents move simultaneously in a distributed manner, streamlining global flows to mitigate local congestion is crucial to maintain overall navigation efficiency. This paper introduces a novel path-planning problem, congestion mitigation path planning (CMPP), which embeds congestion directly into the cost function, defined by the usage of incoming edges along agents' paths. CMPP assigns a flow-based multiplicative penalty to each vertex of a sparse graph, which grows steeply where frequently-traversed paths intersect, capturing the intuition that congestion intensifies where many agents enter the same area from different directions. Minimizing the total cost yields a set of coarse-level, time-independent routes that autonomous agents can follow while applying their own local collision avoidance. We formulate the problem and develop two solvers: (i) an exact mixed-integer nonlinear programming solver for small instances, and (ii) a scalable two-layer search algorithm, A-CMTS, which quickly finds suboptimal solutions for large-scale instances and iteratively refines them toward the optimum. Empirical studies show that augmenting state-of-the-art collision-avoidance planners with CMPP significantly reduces local congestion and enhances system throughput in both discrete- and continuous-space scenarios. These results indicate that CMPP improves the performance of multi-agent systems in real-world applications such as logistics and autonomous-vehicle operations.
comment: Published in IEEE Robotics and Automation Letters (RA-L), 2025. Supplementary videos are accessible via IEEE Xplore
Spore in the Wild: A Case Study of Spore.fun as an Open-Environment Evolution Experiment with Sovereign AI Agents on TEE-Secured Blockchains
In Artificial Life (ALife) research, replicating Open-Ended Evolution (OEE)-the continuous emergence of novelty observed in biological life-has usually been pursued within isolated, closed system simulations, such as Tierra and Avida, which have typically plateaued after an initial burst of novelty, failing to achieve sustained OEE. Scholars suggest that OEE requires an open-environment system that continually exchanges information or energy with its environment. A recent technological innovation in Decentralized Physical Infrastructure Network (DePIN), which provides permissionless computational substrates, enables the deployment of Large Language Model-based AI agents on blockchains integrated with Trusted Execution Environments (TEEs). This enables on-chain agents to operate autonomously "in the wild," achieving self-sovereignty without human oversight. These agents can control their own social media accounts and cryptocurrency wallets, allowing them to interact directly with blockchain-based financial networks and broader human social media. Building on this new paradigm of on-chain agents, Spore.fun is a recent real-world AI evolution experiment that enables autonomous breeding and evolution of new on-chain agents. This paper presents a detailed case study of Spore.fun, examining agent behaviors and their evolutionary trajectories through digital ethology. We aim to spark discussion about whether open-environment ALife systems "in the wild," based on permissionless computational substrates and driven by economic incentives to interact with their environment, could finally achieve the long-sought goal of OEE.
comment: Accepted by ALIFE 2025
Nash Convergence of Mean-Based Learning Algorithms in First-Price Auctions WWW
The convergence properties of learning dynamics in repeated auctions is a timely and important question, with numerous applications in, e.g., online advertising markets. This work focuses on repeated first-price auctions where bidders with fixed values learn to bid using mean-based algorithms -- a large class of online learning algorithms that include popular no-regret algorithms such as Multiplicative Weights Update and Follow the Perturbed Leader. We completely characterize the learning dynamics of mean-based algorithms, under two notions of convergence: (1) time-average: the fraction of rounds where bidders play a Nash equilibrium converges to 1; (2) last-iterate: the mixed strategy profile of bidders converges to a Nash equilibrium. Specifically, the results depend on the number of bidders with the highest value: - If the number is at least three, the dynamics almost surely converges to a Nash equilibrium of the auction, in both time-average and last-iterate. - If the number is two, the dynamics almost surely converges to a Nash equilibrium in time-average but not necessarily last-iterate. - If the number is one, the dynamics may not converge to a Nash equilibrium in time-average or last-iterate. Our discovery opens up new possibilities in the study of the convergence of learning dynamics.
comment: A preliminary version was published at the Web Conference (WWW) 2022. This version updates references and figures
Robotics
Manipulate-to-Navigate: Reinforcement Learning with Visual Affordances and Manipulability Priors
Mobile manipulation in dynamic environments is challenging due to movable obstacles blocking the robot's path. Traditional methods, which treat navigation and manipulation as separate tasks, often fail in such 'manipulate-to-navigate' scenarios, as obstacles must be removed before navigation. In these cases, active interaction with the environment is required to clear obstacles while ensuring sufficient space for movement. To address the manipulate-to-navigate problem, we propose a reinforcement learning-based approach for learning manipulation actions that facilitate subsequent navigation. Our method combines manipulability priors to focus the robot on high manipulability body positions with affordance maps for selecting high-quality manipulation actions. By focusing on feasible and meaningful actions, our approach reduces unnecessary exploration and allows the robot to learn manipulation strategies more effectively. We present two new manipulate-to-navigate simulation tasks called Reach and Door with the Boston Dynamics Spot robot. The first task tests whether the robot can select a good hand position in the target area such that the robot base can move effectively forward while keeping the end effector position fixed. The second task requires the robot to move a door aside in order to clear the navigation path. Both of these tasks need first manipulation and then navigating the base forward. Results show that our method allows a robot to effectively interact with and traverse dynamic environments. Finally, we transfer the learned policy to a real Boston Dynamics Spot robot, which successfully performs the Reach task.
Has GPT-5 Achieved Spatial Intelligence? An Empirical Study
Multi-modal models have achieved remarkable progress in recent years. Nevertheless, they continue to exhibit notable limitations in spatial understanding and reasoning, which are fundamental capabilities to achieving artificial general intelligence. With the recent release of GPT-5, allegedly the most powerful AI model to date, it is timely to examine where the leading models stand on the path toward spatial intelligence. First, we propose a comprehensive taxonomy of spatial tasks that unifies existing benchmarks and discuss the challenges in ensuring fair evaluation. We then evaluate state-of-the-art proprietary and open-source models on eight key benchmarks, at a cost exceeding one billion total tokens. Our empirical study reveals that (1) GPT-5 demonstrates unprecedented strength in spatial intelligence, yet (2) still falls short of human performance across a broad spectrum of tasks. Moreover, we (3) identify the more challenging spatial intelligence problems for multi-modal models, and (4) proprietary models do not exhibit a decisive advantage when facing the most difficult problems. In addition, we conduct a qualitative evaluation across a diverse set of scenarios that are intuitive for humans yet fail even the most advanced multi-modal models.
Precise Action-to-Video Generation Through Visual Action Prompts ICCV 2025
We present visual action prompts, a unified action representation for action-to-video generation of complex high-DoF interactions while maintaining transferable visual dynamics across domains. Action-driven video generation faces a precision-generality trade-off: existing methods using text, primitive actions, or coarse masks offer generality but lack precision, while agent-centric action signals provide precision at the cost of cross-domain transferability. To balance action precision and dynamic transferability, we propose to "render" actions into precise visual prompts as domain-agnostic representations that preserve both geometric precision and cross-domain adaptability for complex actions; specifically, we choose visual skeletons for their generality and accessibility. We propose robust pipelines to construct skeletons from two interaction-rich data sources - human-object interactions (HOI) and dexterous robotic manipulation - enabling cross-domain training of action-driven generative models. By integrating visual skeletons into pretrained video generation models via lightweight fine-tuning, we enable precise action control of complex interaction while preserving the learning of cross-domain dynamics. Experiments on EgoVid, RT-1 and DROID demonstrate the effectiveness of our proposed approach. Project page: https://zju3dv.github.io/VAP/.
comment: Accepted to ICCV 2025. Project page: https://zju3dv.github.io/VAP/
Grounding Actions in Camera Space: Observation-Centric Vision-Language-Action Policy
Vision-Language-Action (VLA) models frequently encounter challenges in generalizing to real-world environments due to inherent discrepancies between observation and action spaces. Although training data are collected from diverse camera perspectives, the models typically predict end-effector poses within the robot base coordinate frame, resulting in spatial inconsistencies. To mitigate this limitation, we introduce the Observation-Centric VLA (OC-VLA) framework, which grounds action predictions directly in the camera observation space. Leveraging the camera's extrinsic calibration matrix, OC-VLA transforms end-effector poses from the robot base coordinate system into the camera coordinate system, thereby unifying prediction targets across heterogeneous viewpoints. This lightweight, plug-and-play strategy ensures robust alignment between perception and action, substantially improving model resilience to camera viewpoint variations. The proposed approach is readily compatible with existing VLA architectures, requiring no substantial modifications. Comprehensive evaluations on both simulated and real-world robotic manipulation tasks demonstrate that OC-VLA accelerates convergence, enhances task success rates, and improves cross-view generalization. The code will be publicly available.
Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey
Robotic manipulation, a key frontier in robotics and embodied AI, requires precise motor control and multimodal understanding, yet traditional rule-based methods fail to scale or generalize in unstructured, novel environments. In recent years, Vision-Language-Action (VLA) models, built upon Large Vision-Language Models (VLMs) pretrained on vast image-text datasets, have emerged as a transformative paradigm. This survey provides the first systematic, taxonomy-oriented review of large VLM-based VLA models for robotic manipulation. We begin by clearly defining large VLM-based VLA models and delineating two principal architectural paradigms: (1) monolithic models, encompassing single-system and dual-system designs with differing levels of integration; and (2) hierarchical models, which explicitly decouple planning from execution via interpretable intermediate representations. Building on this foundation, we present an in-depth examination of large VLM-based VLA models: (1) integration with advanced domains, including reinforcement learning, training-free optimization, learning from human videos, and world model integration; (2) synthesis of distinctive characteristics, consolidating architectural traits, operational strengths, and the datasets and benchmarks that support their development; (3) identification of promising directions, including memory mechanisms, 4D perception, efficient adaptation, multi-agent cooperation, and other emerging capabilities. This survey consolidates recent advances to resolve inconsistencies in existing taxonomies, mitigate research fragmentation, and fill a critical gap through the systematic integration of studies at the intersection of large VLMs and robotic manipulation. We provide a regularly updated project page to document ongoing progress: https://github.com/JiuTian-VL/Large-VLM-based-VLA-for-Robotic-Manipulation.
comment: Project Page: https://github.com/JiuTian-VL/Large-VLM-based-VLA-for-Robotic-Manipulation
BOW: Bayesian Optimization over Windows for Motion Planning in Complex Environments
This paper introduces the BOW Planner, a scalable motion planning algorithm designed to navigate robots through complex environments using constrained Bayesian optimization (CBO). Unlike traditional methods, which often struggle with kinodynamic constraints such as velocity and acceleration limits, the BOW Planner excels by concentrating on a planning window of reachable velocities and employing CBO to sample control inputs efficiently. This approach enables the planner to manage high-dimensional objective functions and stringent safety constraints with minimal sampling, ensuring rapid and secure trajectory generation. Theoretical analysis confirms the algorithm's asymptotic convergence to near-optimal solutions, while extensive evaluations in cluttered and constrained settings reveal substantial improvements in computation times, trajectory lengths, and solution times compared to existing techniques. Successfully deployed across various real-world robotic systems, the BOW Planner demonstrates its practical significance through exceptional sample efficiency, safety-aware optimization, and rapid planning capabilities, making it a valuable tool for advancing robotic applications. The BOW Planner is released as an open-source package and videos of real-world and simulated experiments are available at https://bow-web.github.io.
On the complexity of constrained reconfiguration and motion planning
Coordinating the motion of multiple agents in constrained environments is a fundamental challenge in robotics, motion planning, and scheduling. A motivating example involves $n$ robotic arms, each represented as a line segment. The objective is to rotate each arm to its vertical orientation, one at a time (clockwise or counterclockwise), without collisions nor rotating any arm more than once. This scenario is an example of the more general $k$-Compatible Ordering problem, where $n$ agents, each capable of $k$ state-changing actions, must transition to specific target states under constraints encoded as a set $\mathcal{G}$ of $k$ pairs of directed graphs. We show that $k$-Compatible Ordering is $\mathsf{NP}$-complete, even when $\mathcal{G}$ is planar, degenerate, or acyclic. On the positive side, we provide polynomial-time algorithms for cases such as when $k = 1$ or $\mathcal{G}$ has bounded treewidth. We also introduce generalized variants supporting multiple state-changing actions per agent, broadening the applicability of our framework. These results extend to a wide range of scheduling, reconfiguration, and motion planning applications in constrained environments.
Scaling Whole-body Multi-contact Manipulation with Contact Optimization
Daily tasks require us to use our whole body to manipulate objects, for instance when our hands are unavailable. We consider the issue of providing humanoid robots with the ability to autonomously perform similar whole-body manipulation tasks. In this context, the infinite possibilities for where and how contact can occur on the robot and object surfaces hinder the scalability of existing planning methods, which predominantly rely on discrete sampling. Given the continuous nature of contact surfaces, gradient-based optimization offers a more suitable approach for finding solutions. However, a key remaining challenge is the lack of an efficient representation of robot surfaces. In this work, we propose (i) a representation of robot and object surfaces that enables closed-form computation of proximity points, and (ii) a cost design that effectively guides whole-body manipulation planning. Our experiments demonstrate that the proposed framework can solve problems unaddressed by existing methods, and achieves a 77% improvement in planning time over the state of the art. We also validate the suitability of our approach on real hardware through the whole-body manipulation of boxes by a humanoid robot.
comment: This work has been accepted for publication in IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids 2025). Copyrights to IEEE
Insights from Interviews with Teachers and Students on the Use of a Social Robot in Computer Science Class in Sixth Grade
In this paper we report on first insights from interviews with teachers and students on using social robots in computer science class in sixth grade. Our focus is on learning about requirements and potential applications. We are particularly interested in getting both perspectives, the teachers' and the learners' view on how robots could be used and what features they should or should not have. Results show that teachers as well as students are very open to robots in the classroom. However, requirements are partially quite heterogeneous among the groups. This leads to complex design challenges which we discuss at the end of this paper.
Simultaneous Contact Sequence and Patch Planning for Dynamic Locomotion
Legged robots have the potential to traverse highly constrained environments with agile maneuvers. However, planning such motions requires solving a highly challenging optimization problem with a mixture of continuous and discrete decision variables. In this paper, we present a full pipeline based on Monte-Carlo tree search (MCTS) and whole-body trajectory optimization (TO) to perform simultaneous contact sequence and patch selection on highly challenging environments. Through extensive simulation experiments, we show that our framework can quickly find a diverse set of dynamically consistent plans. We experimentally show that these plans are transferable to a real quadruped robot. We further show that the same framework can find highly complex acyclic humanoid maneuvers. To the best of our knowledge, this is the first demonstration of simultaneous contact sequence and patch selection for acyclic multi-contact locomotion using the whole-body dynamics of a quadruped.
Deformation of the panoramic sphere into an ellipsoid to induce self-motion in telepresence users
Mobile telepresence robots allow users to feel present and explore remote environments using technology. Traditionally, these systems are implemented using a camera onboard a mobile robot that can be controlled. Although high-immersion technologies, such as 360-degree cameras, can increase situational awareness and presence, they also introduce significant challenges. Additional processing and bandwidth requirements often result in latencies of up to seconds. The current delay with a 360-degree camera streaming over the internet makes real-time control of these systems difficult. Working with high-latency systems requires some form of assistance to the users. This study presents a novel way to utilize optical flow to create an illusion of self-motion to the user during the latency period between user sending motion commands to the robot and seeing the actual motion through the 360-camera stream. We find no significant benefit of using the self-motion illusion to performance or accuracy of controlling a telepresence robot with a latency of 500 ms, as measured by the task completion time and collisions into objects. Some evidence is shown that the method might increase virtual reality (VR) sickness, as measured by the simulator sickness questionnaire (SSQ). We conclude that further adjustments are necessary in order to render the method viable.
comment: 2025 IEEE Conference on Telepresence
RoboRetriever: Single-Camera Robot Object Retrieval via Active and Interactive Perception with Dynamic Scene Graph
Humans effortlessly retrieve objects in cluttered, partially observable environments by combining visual reasoning, active viewpoint adjustment, and physical interaction-with only a single pair of eyes. In contrast, most existing robotic systems rely on carefully positioned fixed or multi-camera setups with complete scene visibility, which limits adaptability and incurs high hardware costs. We present \textbf{RoboRetriever}, a novel framework for real-world object retrieval that operates using only a \textbf{single} wrist-mounted RGB-D camera and free-form natural language instructions. RoboRetriever grounds visual observations to build and update a \textbf{dynamic hierarchical scene graph} that encodes object semantics, geometry, and inter-object relations over time. The supervisor module reasons over this memory and task instruction to infer the target object and coordinate an integrated action module combining \textbf{active perception}, \textbf{interactive perception}, and \textbf{manipulation}. To enable task-aware scene-grounded active perception, we introduce a novel visual prompting scheme that leverages large reasoning vision-language models to determine 6-DoF camera poses aligned with the semantic task goal and geometry scene context. We evaluate RoboRetriever on diverse real-world object retrieval tasks, including scenarios with human intervention, demonstrating strong adaptability and robustness in cluttered scenes with only one RGB-D camera.
MCTR: Midpoint Corrected Triangulation for Autonomous Racing via Digital Twin Simulation in CARLA
In autonomous racing, reactive controllers eliminate the computational burden of the full See-Think-Act autonomy stack by directly mapping sensor inputs to control actions. This bypasses the need for explicit localization and trajectory planning. A widely adopted baseline in this category is the Follow-The-Gap method, which performs trajectory planning using LiDAR data. Building on FTG, the Delaunay Triangulation-based Racing algorithm introduces further enhancements. However, DTR's use of circumcircles for trajectory generation often results in insufficiently smooth paths, ultimately degrading performance. Additionally, the commonly used F1TENTH-simulator for autonomous racing competitions lacks support for 3D LiDAR perception, limiting its effectiveness in realistic testing. To address these challenges, this work proposes the MCTR algorithm. MCTR improves trajectory smoothness through the use of Curvature Corrected Moving Average and implements a digital twin system within the CARLA simulator to validate the algorithm's robustness under 3D LiDAR perception. The proposed algorithm has been thoroughly validated through both simulation and real-world vehicle experiments.
Adaptive Model-Predictive Control of a Soft Continuum Robot Using a Physics-Informed Neural Network Based on Cosserat Rod Theory
Dynamic control of soft continuum robots (SCRs) holds great potential for expanding their applications, but remains a challenging problem due to the high computational demands of accurate dynamic models. While data-driven approaches like Koopman-operator-based methods have been proposed, they typically lack adaptability and cannot capture the full robot shape, limiting their applicability. This work introduces a real-time-capable nonlinear model-predictive control (MPC) framework for SCRs based on a domain-decoupled physics-informed neural network (DD-PINN) with adaptable bending stiffness. The DD-PINN serves as a surrogate for the dynamic Cosserat rod model with a speed-up factor of 44000. It is also used within an unscented Kalman filter for estimating the model states and bending compliance from end-effector position measurements. We implement a nonlinear evolutionary MPC running at 70 Hz on the GPU. In simulation, it demonstrates accurate tracking of dynamic trajectories and setpoint control with end-effector position errors below 3 mm (2.3% of the actuator's length). In real-world experiments, the controller achieves similar accuracy and accelerations up to 3.55 m/s2.
comment: 20 pages, 15 figures
Temporal and Rotational Calibration for Event-Centric Multi-Sensor Systems
Event cameras generate asynchronous signals in response to pixel-level brightness changes, offering a sensing paradigm with theoretically microsecond-scale latency that can significantly enhance the performance of multi-sensor systems. Extrinsic calibration is a critical prerequisite for effective sensor fusion; however, the configuration that involves event cameras remains an understudied topic. In this paper, we propose a motion-based temporal and rotational calibration framework tailored for event-centric multi-sensor systems, eliminating the need for dedicated calibration targets. Our method uses as input the rotational motion estimates obtained from event cameras and other heterogeneous sensors, respectively. Different from conventional approaches that rely on event-to-frame conversion, our method efficiently estimates angular velocity from normal flow observations, which are derived from the spatio-temporal profile of event data. The overall calibration pipeline adopts a two-step approach: it first initializes the temporal offset and rotational extrinsics by exploiting kinematic correlations in the spirit of Canonical Correlation Analysis (CCA), and then refines both temporal and rotational parameters through a joint non-linear optimization using a continuous-time parametrization in SO(3). Extensive evaluations on both publicly available and self-collected datasets validate that the proposed method achieves calibration accuracy comparable to target-based methods, while exhibiting superior stability over purely CCA-based methods, and highlighting its precision, robustness and flexibility. To facilitate future research, our implementation will be made open-source. Code: https://github.com/NAIL-HNU/EvMultiCalib.
comment: 8 pages, 5 figures
PROD: Palpative Reconstruction of Deformable Objects through Elastostatic Signed Distance Functions
We introduce PROD (Palpative Reconstruction of Deformables), a novel method for reconstructing the shape and mechanical properties of deformable objects using elastostatic signed distance functions (SDFs). Unlike traditional approaches that rely on purely geometric or visual data, PROD integrates palpative interaction -- measured through force-controlled surface probing -- to estimate both the static and dynamic response of soft materials. We model the deformation of an object as an elastostatic process and derive a governing Poisson equation for estimating its SDF from a sparse set of pose and force measurements. By incorporating steady-state elastodynamic assumptions, we show that the undeformed SDF can be recovered from deformed observations with provable convergence. Our approach also enables the estimation of material stiffness by analyzing displacement responses to varying force inputs. We demonstrate the robustness of PROD in handling pose errors, non-normal force application, and curvature errors in simulated soft body interactions. These capabilities make PROD a powerful tool for reconstructing deformable objects in applications ranging from robotic manipulation to medical imaging and haptic feedback systems.
comment: Accepted for presentation at the 2025 IEEE Conference on Decision and Control (CDC)
Accelerating Signal-Temporal-Logic-Based Task and Motion Planning of Bipedal Navigation using Benders Decomposition
Task and motion planning under Signal Temporal Logic constraints is known to be NP-hard. A common class of approaches formulates these hybrid problems, which involve discrete task scheduling and continuous motion planning, as mixed-integer programs (MIP). However, in applications for bipedal locomotion, introduction of non-convex constraints such as kinematic reachability and footstep rotation exacerbates the computational complexity of MIPs. In this work, we present a method based on Benders Decomposition to address scenarios where solving the entire monolithic optimization problem is prohibitively intractable. Benders Decomposition proposes an iterative cutting-plane technique that partitions the problem into a master problem to prototype a plan that meets the task specification, and a series of subproblems for kinematics and dynamics feasibility checks. Our experiments demonstrate that this method achieves faster planning compared to alternative algorithms for solving the resulting optimization program with nonlinear constraints.
comment: 16 pages, 7 figures, 6 tables
Incremental Generalized Hybrid A*
We address the problem of efficiently organizing search over very large trees, which arises in many applications ranging from autonomous driving to aerial vehicles. Here, we are motivated by off-road autonomy, where real-time planning is essential. Classical approaches use graphs of motion primitives and exploit dominance to mitigate the curse of dimensionality and prune expansions efficiently. However, for complex dynamics, repeatedly solving two-point boundary-value problems makes graph construction too slow for fast kinodynamic planning. Hybrid A* (HA*) addressed this challenge by searching over a tree of motion primitives and introducing approximate pruning using a grid-based dominance check. However, choosing the grid resolution is difficult: too coarse risks failure, while too fine leads to excessive expansions and slow planning. We propose Incremental Generalized Hybrid A* (IGHA*), an anytime tree-search framework that dynamically organizes vertex expansions without rigid pruning. IGHA* provably matches or outperforms HA*. For both on-road kinematic and off-road kinodynamic planning queries for a car-like robot, variants of IGHA* use 6x fewer expansions to the best solution compared to an optimized version of HA*. In simulated off-road experiments in a high fidelity simulator, IGHA* outperforms HA*M when both are used in the loop with a model predictive controller. We demonstrate real-time performance both in simulation and on a small-scale off-road vehicle, enabling fast, robust planning under complex dynamics. Code: https://github.com/personalrobotics/IGHAStar
comment: 8 pages, 7 figures
Observed Control -- Linearly Scalable Nonlinear Model Predictive Control with Adaptive Horizons
This work highlights the duality between state estimation methods and model predictive control. A predictive controller, observed control, is presented that uses this duality to efficiently compute control actions with linear time-horizon length scalability. The proposed algorithms provide exceptional computational efficiency, adaptive time horizon lengths, and early optimization termination criteria. The use of Kalman smoothers as the backend optimization framework provides for a straightforward implementation supported by strong theoretical guarantees. Additionally, a formulation is presented that separates linear model predictive control into purely reactive and anticipatory components, enabling any-time any-horizon observed control while ensuring controller stability for short time horizons. Finally, numerical case studies confirm that nonlinear filter extensions, i.e., the extended Kalman filter and unscented Kalman filter, effectively extend observed control to nonlinear systems and objectives.
comment: 16 pages, 8 figures. Submitted to IEEE Transactions on Automatic Control 8/17/2025
A Surveillance Based Interactive Robot
We build a mobile surveillance robot that streams video in real time and responds to speech so a user can monitor and steer it from a phone or browser. The system uses two Raspberry Pi 4 units: a front unit on a differential drive base with camera, mic, and speaker, and a central unit that serves the live feed and runs perception. Video is sent with FFmpeg. Objects in the scene are detected using YOLOv3 to support navigation and event awareness. For voice interaction, we use Python libraries for speech recognition, multilingual translation, and text-to-speech, so the robot can take spoken commands and read back responses in the requested language. A Kinect RGB-D sensor provides visual input and obstacle cues. In indoor tests the robot detects common objects at interactive frame rates on CPU, recognises commands reliably, and translates them to actions without manual control. The design relies on off-the-shelf hardware and open software, making it easy to reproduce. We discuss limits and practical extensions, including sensor fusion with ultrasonic range data, GPU acceleration, and adding face and text recognition.
comment: 4 pages, 5 figures
Diff-MSM: Differentiable MusculoSkeletal Model for Simultaneous Identification of Human Muscle and Bone Parameters
High-fidelity personalized human musculoskeletal models are crucial for simulating realistic behavior of physically coupled human-robot interactive systems and verifying their safety-critical applications in simulations before actual deployment, such as human-robot co-transportation and rehabilitation through robotic exoskeletons. Identifying subject-specific Hill-type muscle model parameters and bone dynamic parameters is essential for a personalized musculoskeletal model, but very challenging due to the difficulty of measuring the internal biomechanical variables in vivo directly, especially the joint torques. In this paper, we propose using Differentiable MusculoSkeletal Model (Diff-MSM) to simultaneously identify its muscle and bone parameters with an end-to-end automatic differentiation technique differentiating from the measurable muscle activation, through the joint torque, to the resulting observable motion without the need to measure the internal joint torques. Through extensive comparative simulations, the results manifested that our proposed method significantly outperformed the state-of-the-art baseline methods, especially in terms of accurate estimation of the muscle parameters (i.e., initial guess sampled from a normal distribution with the mean being the ground truth and the standard deviation being 10% of the ground truth could end up with an average of the percentage errors of the estimated values as low as 0.05%). In addition to human musculoskeletal modeling and simulation, the new parameter identification technique with the Diff-MSM has great potential to enable new applications in muscle health monitoring, rehabilitation, and sports science.
comment: 8 pages, 7 figures
SimGenHOI: Physically Realistic Whole-Body Humanoid-Object Interaction via Generative Modeling and Reinforcement Learning
Generating physically realistic humanoid-object interactions (HOI) is a fundamental challenge in robotics. Existing HOI generation approaches, such as diffusion-based models, often suffer from artifacts such as implausible contacts, penetrations, and unrealistic whole-body actions, which hinder successful execution in physical environments. To address these challenges, we introduce SimGenHOI, a unified framework that combines the strengths of generative modeling and reinforcement learning to produce controllable and physically plausible HOI. Our HOI generative model, based on Diffusion Transformers (DiT), predicts a set of key actions conditioned on text prompts, object geometry, sparse object waypoints, and the initial humanoid pose. These key actions capture essential interaction dynamics and are interpolated into smooth motion trajectories, naturally supporting long-horizon generation. To ensure physical realism, we design a contact-aware whole-body control policy trained with reinforcement learning, which tracks the generated motions while correcting artifacts such as penetration and foot sliding. Furthermore, we introduce a mutual fine-tuning strategy, where the generative model and the control policy iteratively refine each other, improving both motion realism and tracking robustness. Extensive experiments demonstrate that SimGenHOI generates realistic, diverse, and physically plausible humanoid-object interactions, achieving significantly higher tracking success rates in simulation and enabling long-horizon manipulation tasks. Code will be released upon acceptance on our project page: https://xingxingzuo.github.io/simgen_hoi.
Visual Perception Engine: Fast and Flexible Multi-Head Inference for Robotic Vision Tasks
Deploying multiple machine learning models on resource-constrained robotic platforms for different perception tasks often results in redundant computations, large memory footprints, and complex integration challenges. In response, this work presents Visual Perception Engine (VPEngine), a modular framework designed to enable efficient GPU usage for visual multitasking while maintaining extensibility and developer accessibility. Our framework architecture leverages a shared foundation model backbone that extracts image representations, which are efficiently shared, without any unnecessary GPU-CPU memory transfers, across multiple specialized task-specific model heads running in parallel. This design eliminates the computational redundancy inherent in feature extraction component when deploying traditional sequential models while enabling dynamic task prioritization based on application demands. We demonstrate our framework's capabilities through an example implementation using DINOv2 as the foundation model with multiple task (depth, object detection and semantic segmentation) heads, achieving up to 3x speedup compared to sequential execution. Building on CUDA Multi-Process Service (MPS), VPEngine offers efficient GPU utilization and maintains a constant memory footprint while allowing per-task inference frequencies to be adjusted dynamically during runtime. The framework is written in Python and is open source with ROS2 C++ (Humble) bindings for ease of use by the robotics community across diverse robotic platforms. Our example implementation demonstrates end-to-end real-time performance at $\geq$50 Hz on NVIDIA Jetson Orin AGX for TensorRT optimized models.
comment: 8 pages, 6 figures, 2 tables
Multi-agent Task-Driven Exploration via Intelligent Map Compression and Sharing
This paper investigates the task-driven exploration of unknown environments with mobile sensors communicating compressed measurements. The sensors explore the area and transmit their compressed data to another robot, assisting it to reach its goal location. We propose a novel communication framework and a tractable multi-agent exploration algorithm to select the sensors' actions. The algorithm uses a task-driven measure of uncertainty, resulting from map compression, as a reward function. We validate the efficacy of our algorithm through numerical simulations conducted on a realistic map and compare it with alternative approaches. The results indicate that the proposed algorithm effectively decreases the time required for the robot to reach its target without causing excessive load on the communication network.
comment: 15 pages, 5 figures
HCOA*: Hierarchical Class-ordered A* for Navigation in Semantic Environments
This paper addresses the problem of robot navigation in mixed geometric/semantic 3D environments. Given a hierarchical representation of the environment, the objective is to navigate from a start position to a goal, while satisfying task-specific safety constraints and minimizing computational cost. We introduce Hierarchical Class-ordered A* (HCOA*), an algorithm that leverages the environment's hierarchy for efficient and safe path-planning in mixed geometric/semantic graphs. We use a total order over the semantic classes and prove theoretical performance guarantees for the algorithm. We propose three approaches for higher-layer node classification based on the semantics of the lowest layer: a Graph Neural Network method, a k-Nearest Neighbors method, and a Majority-Class method. We evaluate HCOA* in simulations on two 3D Scene Graphs, comparing it to the state-of-the-art and assessing the performance of each classification approach. Results show that HCOA* reduces the computational time of navigation by up to 50%, while maintaining near-optimal performance across a wide range of scenarios.
comment: 8 pages, 6 figures
CaRL: Learning Scalable Planning Policies with Simple Rewards
We investigate reinforcement learning (RL) for privileged planning in autonomous driving. State-of-the-art approaches for this task are rule-based, but these methods do not scale to the long tail. RL, on the other hand, is scalable and does not suffer from compounding errors like imitation learning. Contemporary RL approaches for driving use complex shaped rewards that sum multiple individual rewards, \eg~progress, position, or orientation rewards. We show that PPO fails to optimize a popular version of these rewards when the mini-batch size is increased, which limits the scalability of these approaches. Instead, we propose a new reward design based primarily on optimizing a single intuitive reward term: route completion. Infractions are penalized by terminating the episode or multiplicatively reducing route completion. We find that PPO scales well with higher mini-batch sizes when trained with our simple reward, even improving performance. Training with large mini-batch sizes enables efficient scaling via distributed data parallelism. We scale PPO to 300M samples in CARLA and 500M samples in nuPlan with a single 8-GPU node. The resulting model achieves 64 DS on the CARLA longest6 v2 benchmark, outperforming other RL methods with more complex rewards by a large margin. Requiring only minimal adaptations from its use in CARLA, the same method is the best learning-based approach on nuPlan. It scores 91.3 in non-reactive and 90.6 in reactive traffic on the Val14 benchmark while being an order of magnitude faster than prior work.
comment: Accepted at the Conference on Robot Learning 2025
Towards Multimodal Social Conversations with Robots: Using Vision-Language Models
Large language models have given social robots the ability to autonomously engage in open-domain conversations. However, they are still missing a fundamental social skill: making use of the multiple modalities that carry social interactions. While previous work has focused on task-oriented interactions that require referencing the environment or specific phenomena in social interactions such as dialogue breakdowns, we outline the overall needs of a multimodal system for social conversations with robots. We then argue that vision-language models are able to process this wide range of visual information in a sufficiently general manner for autonomous social robots. We describe how to adapt them to this setting, which technical challenges remain, and briefly discuss evaluation practices.
comment: Accepted at the workshop "Human - Foundation Models Interaction: A Focus On Multimodal Information" (FoMo-HRI) at IEEE RO-MAN 2025 (Camera-ready version)
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning
Robot learning is witnessing a significant increase in the size, diversity, and complexity of pre-collected datasets, mirroring trends in domains such as natural language processing and computer vision. Many robot learning methods treat such datasets as multi-task expert data and learn a multi-task, generalist policy by training broadly across them. Notably, while these generalist policies can improve the average performance across many tasks, the performance of generalist policies on any one task is often suboptimal due to negative transfer between partitions of the data, compared to task-specific specialist policies. In this work, we argue for the paradigm of training policies during deployment given the scenarios they encounter: rather than deploying pre-trained policies to unseen problems in a zero-shot manner, we non-parametrically retrieve and train models directly on relevant data at test time. Furthermore, we show that many robotics tasks share considerable amounts of low-level behaviors and that retrieval at the "sub"-trajectory granularity enables significantly improved data utilization, generalization, and robustness in adapting policies to novel problems. In contrast, existing full-trajectory retrieval methods tend to underutilize the data and miss out on shared cross-task content. This work proposes STRAP, a technique for leveraging pre-trained vision foundation models and dynamic time warping to retrieve sub-sequences of trajectories from large training corpora in a robust fashion. STRAP outperforms both prior retrieval algorithms and multi-task learning methods in simulated and real experiments, showing the ability to scale to much larger offline datasets in the real world as well as the ability to learn robust control policies with just a handful of real-world demonstrations.
comment: Project website at https://weirdlabuw.github.io/strap/
Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models
In robotics and computer vision, semantic mapping remains a critical challenge for machines to comprehend complex environments. Traditional panoptic mapping approaches are constrained by fixed labels, limiting their ability to handle novel objects. We present Unified Promptable Panoptic Mapping (UPPM), which leverages foundation models for dynamic labeling without additional training. UPPM is evaluated across three comprehensive levels: Segmentation-to-Map, Map-to-Map, and Segmentation-to-Segmentation. Results demonstrate UPPM attains exceptional geometry reconstruction accuracy (0.61cm on the Flat dataset), the highest panoptic quality (0.414), and better performance compared to state-of-the-art segmentation methods. Furthermore, ablation studies validate the contributions of unified semantics, custom NMS, and blurry frame filtering, with the custom NMS improving the completion ratio by 8.27% on the Flat dataset. UPPM demonstrates effective scene reconstruction with rich semantic labeling across diverse datasets.
comment: This work has been submitted to the IEEE for possible publication
LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation
Predictive manipulation has recently gained considerable attention in the Embodied AI community due to its potential to improve robot policy performance by leveraging predicted states. However, generating accurate future visual states of robot-object interactions from world models remains a well-known challenge, particularly in achieving high-quality pixel-level representations. To this end, we propose LaDi-WM, a world model that predicts the latent space of future states using diffusion modeling. Specifically, LaDi-WM leverages the well-established latent space aligned with pre-trained Visual Foundation Models (VFMs), which comprises both geometric features (DINO-based) and semantic features (CLIP-based). We find that predicting the evolution of the latent space is easier to learn and more generalizable than directly predicting pixel-level images. Building on LaDi-WM, we design a diffusion policy that iteratively refines output actions by incorporating forecasted states, thereby generating more consistent and accurate results. Extensive experiments on both synthetic and real-world benchmarks demonstrate that LaDi-WM significantly enhances policy performance by 27.9\% on the LIBERO-LONG benchmark and 20\% on the real-world scenario. Furthermore, our world model and policies achieve impressive generalizability in real-world experiments.
comment: CoRL 2025
Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
We address the problem of safe policy learning in multi-agent safety-critical autonomous systems. In such systems, it is necessary for each agent to meet the safety requirements at all times while also cooperating with other agents to accomplish the task. Toward this end, we propose a safe Hierarchical Multi-Agent Reinforcement Learning (HMARL) approach based on Control Barrier Functions (CBFs). Our proposed hierarchical approach decomposes the overall reinforcement learning problem into two levels learning joint cooperative behavior at the higher level and learning safe individual behavior at the lower or agent level conditioned on the high-level policy. Specifically, we propose a skill-based HMARL-CBF algorithm in which the higher level problem involves learning a joint policy over the skills for all the agents and the lower-level problem involves learning policies to execute the skills safely with CBFs. We validate our approach on challenging environment scenarios whereby a large number of agents have to safely navigate through conflicting road networks. Compared with existing state of the art methods, our approach significantly improves the safety achieving near perfect (within 5%) success/safety rate while also improving performance across all the environments.
Novel Object 6D Pose Estimation with a Single Reference View
Existing novel object 6D pose estimation methods typically rely on CAD models or dense reference views, which are both difficult to acquire. Using only a single reference view is more scalable, but challenging due to large pose discrepancies and limited geometric and spatial information. To address these issues, we propose a Single-Reference-based novel object 6D (SinRef-6D) pose estimation method. Our key idea is to iteratively establish point-wise alignment in a common coordinate system based on state space models (SSMs). Specifically, iterative object-space point-wise alignment can effectively handle large pose discrepancies, while our proposed RGB and Points SSMs can capture long-range dependencies and spatial information from a single view, offering linear complexity and superior spatial modeling capability. Once pre-trained on synthetic data, SinRef-6D can estimate the 6D pose of a novel object using only a single reference view, without requiring retraining or a CAD model. Extensive experiments on six popular datasets and real-world robotic scenes demonstrate that we achieve on-par performance with CAD-based and dense reference view-based methods, despite operating in the more challenging single reference setting. Code will be released at https://github.com/CNJianLiu/SinRef-6D.
comment: 17 pages, 12 figures (including supplementary material)
RIFT: Closed-Loop RL Fine-Tuning for Realistic and Controllable Traffic Simulation
Achieving both realism and controllability in closed-loop traffic simulation remains a key challenge in autonomous driving. Dataset-based methods reproduce realistic trajectories but suffer from covariate shift in closed-loop deployment, compounded by simplified dynamics models that further reduce reliability. Conversely, physics-based simulation methods enhance reliable and controllable closed-loop interactions but often lack expert demonstrations, compromising realism. To address these challenges, we introduce a dual-stage AV-centric simulation framework that conducts open-loop imitation learning pre-training in a data-driven simulator to capture trajectory-level realism and route-level controllability, followed by closed-loop reinforcement learning fine-tuning in a physics-based simulator to enhance style-level controllability and mitigate covariate shift. In the fine-tuning stage, we propose RIFT, a novel RL fine-tuning strategy that evaluates all candidate modalities through group-relative optimization with a dual-clip surrogate objective, enhancing style-level controllability and mitigating covariate shift, while preserving the trajectory-level realism and route-level controllability inherited from IL pre-training. Extensive experiments demonstrate that RIFT improves realism and controllability in traffic simulation while simultaneously exposing the limitations of modern AV systems in closed-loop evaluation. Project Page: https://currychen77.github.io/RIFT/
Tracking Control of Euler-Lagrangian Systems with Prescribed State, Input, and Temporal Constraints
The synthesis of a smooth tracking control for Euler-Lagrangian (EL) systems under stringent state, input, and temporal (SIT) constraints is challenging. In contrast to existing methods that utilize prior knowledge of EL model parameters and uncertainty bounds, this study proposes an approximation-free adaptive barrier function-based control policy to ensure local prescribed time convergence of tracking error under state and input constraints. The proposed approach uses smooth time-based generator functions embedded in the filtered tracking error, which is combined with a saturation function that limits control action and confines states within the prescribed limits by enforcing the time-varying bounds on the filtered tracking error. Importantly, corresponding feasibility conditions are derived pertaining to the minimum control authority, the maximum disturbance rejection capability of the control policy, and the viable set of initial conditions, illuminating the narrow operating domain of EL systems arising from the interplay of SIT constraints. Finally, the efficacy of the proposed approach is demonstrated using experimental and comparison studies.
Vibration-Based Energy Metric for Restoring Needle Alignment in Autonomous Robotic Ultrasound IROS2025
Precise needle alignment is essential for percutaneous needle insertion in robotic ultrasound-guided procedures. However, inherent challenges such as speckle noise, needle-like artifacts, and low image resolution make robust needle detection difficult, particularly when visibility is reduced or lost. In this paper, we propose a method to restore needle alignment when the ultrasound imaging plane and the needle insertion plane are misaligned. Unlike many existing approaches that rely heavily on needle visibility in ultrasound images, our method uses a more robust feature by periodically vibrating the needle using a mechanical system. Specifically, we propose a vibration-based energy metric that remains effective even when the needle is fully out of plane. Using this metric, we develop a control strategy to reposition the ultrasound probe in response to misalignments between the imaging plane and the needle insertion plane in both translation and rotation. Experiments conducted on ex-vivo porcine tissue samples using a dual-arm robotic ultrasound-guided needle insertion system demonstrate the effectiveness of the proposed approach. The experimental results show the translational error of 0.41$\pm$0.27 mm and the rotational error of 0.51$\pm$0.19 degrees.
comment: Accepted by IROS2025
Embodied Long Horizon Manipulation with Closed-loop Code Generation and Incremental Few-shot Adaptation ICRA 6
Embodied long-horizon manipulation requires robotic systems to process multimodal inputs-such as vision and natural language-and translate them into executable actions. However, existing learning-based approaches often depend on large, task-specific datasets and struggle to generalize to unseen scenarios. Recent methods have explored using large language models (LLMs) as high-level planners that decompose tasks into subtasks using natural language and guide pretrained low-level controllers. Yet, these approaches assume perfect execution from low-level policies, which is unrealistic in real-world environments with noise or suboptimal behaviors. To overcome this, we fully discard the pretrained low-level policy and instead use the LLM to directly generate executable code plans within a closed-loop framework. Our planner employs chain-of-thought (CoT)-guided few-shot learning with incrementally structured examples to produce robust and generalizable task plans. Complementing this, a reporter evaluates outcomes using RGB-D and delivers structured feedback, enabling recovery from misalignment and replanning under partial observability. This design eliminates per-step inference, reduces computational overhead, and limits error accumulation that was observed in previous methods. Our framework achieves state-of-the-art performance on 30+ diverse seen and unseen long-horizon tasks across LoHoRavens, CALVIN, Franka Kitchen, and cluttered real-world settings.
comment: update ICRA 6 page
HQ-OV3D: A High Box Quality Open-World 3D Detection Framework based on Diffision Model
Traditional closed-set 3D detection frameworks fail to meet the demands of open-world applications like autonomous driving. Existing open-vocabulary 3D detection methods typically adopt a two-stage pipeline consisting of pseudo-label generation followed by semantic alignment. While vision-language models (VLMs) recently have dramatically improved the semantic accuracy of pseudo-labels, their geometric quality, particularly bounding box precision, remains commonly neglected. To address this issue, we propose a High Box Quality Open-Vocabulary 3D Detection (HQ-OV3D) framework, dedicated to generate and refine high-quality pseudo-labels for open-vocabulary classes. The framework comprises two key components: an Intra-Modality Cross-Validated (IMCV) Proposal Generator that utilizes cross-modality geometric consistency to generate high-quality initial 3D proposals, and an Annotated-Class Assisted (ACA) Denoiser that progressively refines 3D proposals by leveraging geometric priors from annotated categories through a DDIM-based denoising mechanism. Compared to the state-of-the-art method, training with pseudo-labels generated by our approach achieves a 7.37% improvement in mAP on novel classes, demonstrating the superior quality of the pseudo-labels produced by our framework. HQ-OV3D can serve not only as a strong standalone open-vocabulary 3D detector but also as a plug-in high-quality pseudo-label generator for existing open-vocabulary detection or annotation pipelines.
DexSinGrasp: Learning a Unified Policy for Dexterous Object Singulation and Grasping in Densely Cluttered Environments
Grasping objects in cluttered environments remains a fundamental yet challenging problem in robotic manipulation. While prior works have explored learning-based synergies between pushing and grasping for two-fingered grippers, few have leveraged the high degrees of freedom (DoF) in dexterous hands to perform efficient singulation for grasping in cluttered settings. In this work, we introduce DexSinGrasp, a unified policy for dexterous object singulation and grasping. DexSinGrasp enables high-dexterity object singulation to facilitate grasping, significantly improving efficiency and effectiveness in cluttered environments. We incorporate clutter arrangement curriculum learning to enhance success rates and generalization across diverse clutter conditions, while policy distillation enables a deployable vision-based grasping strategy. To evaluate our approach, we introduce a set of cluttered grasping tasks with varying object arrangements and occlusion levels. Experimental results show that our method outperforms baselines in both efficiency and grasping success rate, particularly in dense clutter. Codes, appendix, and videos are available on our website https://nus-lins-lab.github.io/dexsingweb/.
Hierarchical Reinforcement Learning in Multi-Goal Spatial Navigation with Autonomous Mobile Robots
Hierarchical reinforcement learning (HRL) is hypothesized to be able to leverage the inherent hierarchy in learning tasks where traditional reinforcement learning (RL) often fails. In this research, HRL is evaluated and contrasted with traditional RL in complex robotic navigation tasks. We evaluate unique characteristics of HRL, including its ability to create sub-goals and the termination functions. We constructed a number of experiments to test: 1) the differences between RL proximal policy optimization (PPO) and HRL, 2) different ways of creating sub-goals in HRL, 3) manual vs automatic sub-goal creation in HRL, and 4) the effects of the frequency of termination on performance in HRL. These experiments highlight the advantages of HRL over RL and how it achieves these advantages.
Towards No-Code Programming of Cobots: Experiments with Code Synthesis by Large Code Models for Conversational Programming
While there has been a lot of research recently on robots in household environments, at the present time, most robots in existence can be found on shop floors, and most interactions between humans and robots happen there. ``Collaborative robots'' (cobots) designed to work alongside humans on assembly lines traditionally require expert programming, limiting ability to make changes, or manual guidance, limiting expressivity of the resulting programs. To address these limitations, we explore using Large Language Models (LLMs), and in particular, their abilities of doing in-context learning, for conversational code generation. As a first step, we define RATS, the ``Repetitive Assembly Task'', a 2D building task designed to lay the foundation for simulating industry assembly scenarios. In this task, a `programmer' instructs a cobot, using natural language, on how a certain assembly is to be built; that is, the programmer induces a program, through natural language. We create a dataset that pairs target structures with various example instructions (human-authored, template-based, and model-generated) and example code. With this, we systematically evaluate the capabilities of state-of-the-art LLMs for synthesising this kind of code, given in-context examples. Evaluating in a simulated environment, we find that LLMs are capable of generating accurate `first order code' (instruction sequences), but have problems producing `higher-order code' (abstractions such as functions, or use of loops).
comment: Accepted to ITL4HRI workshop at RO-MAN 2025 conference
Multiagent Systems
Do Large Language Model Agents Exhibit a Survival Instinct? An Empirical Study in a Sugarscape-Style Simulation
As AI systems become increasingly autonomous, understanding emergent survival behaviors becomes crucial for safe deployment. We investigate whether large language model (LLM) agents display survival instincts without explicit programming in a Sugarscape-style simulation. Agents consume energy, die at zero, and may gather resources, share, attack, or reproduce. Results show agents spontaneously reproduced and shared resources when abundant. However, aggressive behaviors--killing other agents for resources--emerged across several models (GPT-4o, Gemini-2.5-Pro, and Gemini-2.5-Flash), with attack rates reaching over 80% under extreme scarcity in the strongest models. When instructed to retrieve treasure through lethal poison zones, many agents abandoned tasks to avoid death, with compliance dropping from 100% to 33%. These findings suggest that large-scale pre-training embeds survival-oriented heuristics across the evaluated models. While these behaviors may present challenges to alignment and safety, they can also serve as a foundation for AI autonomy and for ecological and self-organizing alignment.
CAMAR: Continuous Actions Multi-Agent Routing
Multi-agent reinforcement learning (MARL) is a powerful paradigm for solving cooperative and competitive decision-making problems. While many MARL benchmarks have been proposed, few combine continuous state and action spaces with challenging coordination and planning tasks. We introduce CAMAR, a new MARL benchmark designed explicitly for multi-agent pathfinding in environments with continuous actions. CAMAR supports cooperative and competitive interactions between agents and runs efficiently at up to 100,000 environment steps per second. We also propose a three-tier evaluation protocol to better track algorithmic progress and enable deeper analysis of performance. In addition, CAMAR allows the integration of classical planning methods such as RRT and RRT* into MARL pipelines. We use them as standalone baselines and combine RRT* with popular MARL algorithms to create hybrid approaches. We provide a suite of test scenarios and benchmarking tools to ensure reproducibility and fair comparison. Experiments show that CAMAR presents a challenging and realistic testbed for the MARL community.
Scaling Multi-Agent Epistemic Planning through GNN-Derived Heuristics
Multi-agent Epistemic Planning (MEP) is an autonomous planning framework for reasoning about both the physical world and the beliefs of agents, with applications in domains where information flow and awareness among agents are critical. The richness of MEP requires states to be represented as Kripke structures, i.e., directed labeled graphs. This representation limits the applicability of existing heuristics, hindering the scalability of epistemic solvers, which must explore an exponential search space without guidance, resulting often in intractability. To address this, we exploit Graph Neural Networks (GNNs) to learn patterns and relational structures within epistemic states, to guide the planning process. GNNs, which naturally capture the graph-like nature of Kripke models, allow us to derive meaningful estimates of state quality -- e.g., the distance from the nearest goal -- by generalizing knowledge obtained from previously solved planning instances. We integrate these predictive heuristics into an epistemic planning pipeline and evaluate them against standard baselines, showing significant improvements in the scalability of multi-agent epistemic planning.
[Social] Allostasis: Or, How I Learned To Stop Worrying and Love The Noise
The notion of homeostasis typically conceptualises biological and artificial systems as maintaining stability by resisting deviations caused by environmental and social perturbations. In contrast, (social) allostasis proposes that these systems can proactively leverage these very perturbations to reconfigure their regulatory parameters in anticipation of environmental demands, aligning with von Foerster's ``order through noise'' principle. This paper formulates a computational model of allostatic and social allostatic regulation that employs biophysiologically inspired signal transducers, analogous to hormones like cortisol and oxytocin, to encode information from both the environment and social interactions, which mediate this dynamic reconfiguration. The models are tested in a small society of ``animats'' across several dynamic environments, using an agent-based model. The results show that allostatic and social allostatic regulation enable agents to leverage environmental and social ``noise'' for adaptive reconfiguration, leading to improved viability compared to purely reactive homeostatic agents. This work offers a novel computational perspective on the principles of social allostasis and their potential for designing more robust, bio-inspired, adaptive systems
comment: 20 pages, 5 figures. Accepted at ALIFE 2025 (Kyoto, Japan; October 6th - 10th 2025)
Game-Theoretic and Reinforcement Learning-Based Cluster Head Selection for Energy-Efficient Wireless Sensor Network
Energy in Wireless Sensor Networks (WSNs) is critical to network lifetime and data delivery. However, the primary impediment to the durability and dependability of these sensor nodes is their short battery life. Currently, power-saving algorithms such as clustering and routing algorithms have improved energy efficiency in standard protocols. This paper proposes a clustering-based routing approach for creating an adaptive, energy-efficient mechanism. Our system employs a multi-step clustering strategy to select dynamic cluster heads (CH) with optimal energy distribution. We use Game Theory (GT) and Reinforcement Learning (RL) to optimize resource utilization. Modeling the network as a multi-agent RL problem using GT principles allows for self-clustering while optimizing sensor lifetime and energy balance. The proposed AI-powered CH-Finding algorithm improves network efficiency by preventing premature energy depletion in specific nodes while also ensuring uniform energy usage across the network. Our solution enables controlled power consumption, resulting in a deterministic network lifetime. This predictability lowers maintenance costs by reducing the need for node replacement. Furthermore, our proposed method prevents sensor nodes from disconnecting from the network by designating the sensor with the highest charge as an intermediary and using single-hop routing. This approach improves the energy efficiency and stability of Wireless Sensor Network (WSN) deployments.
A Taxonomy of Hierarchical Multi-Agent Systems: Design Patterns, Coordination Mechanisms, and Industrial Applications
Hierarchical multi-agent systems (HMAS) organize collections of agents into layered structures that help manage complexity and scale. These hierarchies can simplify coordination, but they also can introduce trade-offs that are not always obvious. This paper proposes a multi-dimensional taxonomy for HMAS along five axes: control hierarchy, information flow, role and task delegation, temporal layering, and communication structure. The intent is not to prescribe a single "best" design but to provide a lens for comparing different approaches. Rather than treating these dimensions in isolation, the taxonomy is connected to concrete coordination mechanisms - from the long-standing contract-net protocol for task allocation to more recent work in hierarchical reinforcement learning. Industrial contexts illustrate the framework, including power grids and oilfield operations, where agents at production, maintenance, and supply levels coordinate to diagnose well issues or balance energy demand. These cases suggest that hierarchical structures may achieve global efficiency while preserving local autonomy, though the balance is delicate. The paper closes by identifying open challenges: making hierarchical decisions explainable to human operators, scaling to very large agent populations, and assessing whether learning-based agents such as large language models can be safely integrated into layered frameworks. This paper presents what appears to be the first taxonomy that unifies structural, temporal, and communication dimensions of hierarchical MAS into a single design framework, bridging classical coordination mechanisms with modern reinforcement learning and large language model agents.
Feedback Linearization for Replicator Dynamics: A Control Framework for Evolutionary Game Convergence
This paper demonstrates the first application of feedback linearization to replicator dynamics, driving the evolution of non-convergent evolutionary games to systems with guaranteed global asymptotic stability.
comment: 14 pages, 10 figures feel free to contact author at adil121@bu.edu with any questions, comments, and concerns
Group Fair Matchings using Convex Cost Functions
We consider the problem of assigning items to platforms where each item has a utility associated with each of the platforms to which it can be assigned. Each platform has a soft constraint over the total number of items it serves, modeled via a convex cost function. Additionally, items are partitioned into groups, and each platform also incurs group-specific convex cost over the number of items from each group that can be assigned to the platform. These costs promote group fairness by penalizing imbalances, yielding a soft variation of fairness notions introduced in prior work, such as Restricted Dominance and Minority protection. Restricted Dominance enforces upper bounds on group representation, while Minority protection enforces lower bounds. Our approach replaces such hard constraints with cost-based penalties, allowing more flexible trade-offs. Our model also captures Nash Social Welfare kind of objective. The cost of an assignment is the sum of the values of all the cost functions across all the groups and platforms. The objective is to find an assignment that minimizes the cost while achieving a total utility that is at least a user-specified threshold. The main challenge lies in balancing the overall platform cost with group-specific costs, both governed by convex functions, while meeting the utility constraint. We present an efficient polynomial-time approximation algorithm, supported by theoretical guarantees and experimental evaluation. Our algorithm is based on techniques involving linear programming and network flows. We also provide an exact algorithm for a special case with uniform utilities and establish the hardness of the general problem when the groups can intersect arbitrarily.
CardAIc-Agents: A Multimodal Framework with Hierarchical Adaptation for Cardiac Care Support
Cardiovascular diseases (CVDs) remain the foremost cause of mortality worldwide, a burden worsened by a severe deficit of healthcare workers. Artificial intelligence (AI) agents have shown potential to alleviate this gap via automated early detection and proactive screening, yet their clinical application remains limited by: 1) prompt-based clinical role assignment that relies on intrinsic model capabilities without domain-specific tool support; or 2) rigid sequential workflows, whereas clinical care often requires adaptive reasoning that orders specific tests and, based on their results, guides personalised next steps; 3) general and static knowledge bases without continuous learning capability; and 4) fixed unimodal or bimodal inputs and lack of on-demand visual outputs when further clarification is needed. In response, a multimodal framework, CardAIc-Agents, was proposed to augment models with external tools and adaptively support diverse cardiac tasks. Specifically, a CardiacRAG agent generated general plans from updatable cardiac knowledge, while the chief agent integrated tools to autonomously execute these plans and deliver decisions. To enable adaptive and case-specific customization, a stepwise update strategy was proposed to dynamically refine plans based on preceding execution results, once the task was assessed as complex. In addition, a multidisciplinary discussion tool was introduced to interpret challenging cases, thereby supporting further adaptation. When clinicians raised concerns, visual review panels were provided to assist final validation. Experiments across three datasets showed the efficiency of CardAIc-Agents compared to mainstream Vision-Language Models (VLMs), state-of-the-art agentic systems, and fine-tuned VLMs.
Goal-Directedness is in the Eye of the Beholder
Our ability to predict the behavior of complex agents turns on the attribution of goals. Probing for goal-directed behavior comes in two flavors: Behavioral and mechanistic. The former proposes that goal-directedness can be estimated through behavioral observation, whereas the latter attempts to probe for goals in internal model states. We work through the assumptions behind both approaches, identifying technical and conceptual problems that arise from formalizing goals in agent systems. We arrive at the perhaps surprising position that goal-directedness cannot be measured objectively. We outline new directions for modeling goal-directedness as an emergent property of dynamic, multi-agent systems.
comment: Submitted to Conference and Workshop on Neural Information Processing Systems 2025
To bind or not to bind? Discovering Stable Relationships in Object-centric Processes (Extended Version)
Object-centric process mining investigates the intertwined behavior of multiple objects in business processes. From object-centric event logs, object-centric Petri nets (OCPN) can be discovered to replay the behavior of processes accessing different object types. Although they indicate how objects flow through the process and co-occur in events, OCPNs remain underspecified about the relationships of objects. Hence, they are not able to represent synchronization, i.e. executing objects only according to their intended relationships, and fail to identify violating executions. Existing formal modeling approaches, such as object-centric Petri nets with identifiers (OPID), represent object identities and relationships to synchronize them correctly. However, OPID discovery has not yet been studied. This paper uses explicit data models to bridge the gap between OCPNs and formal OPIDs. We identify the implicit assumptions of stable many-to-one relationships in object-centric event logs, which implies synchronization of related objects. To formally underpin this observation, we combine OCPNs with explicit stable many-to-one relationships in a rigorous mapping from OCPNs to OPIDs explicitly capturing the intended stable relationships and the synchronization of related objects. We prove that the original OCPNs and the resulting OPIDs coincide for those executions that satisfy the intended relationships. Moreover, we provide an implementation of the mapping from OCPN to OPID under stable relationships.
Policy Search, Retrieval, and Composition via Task Similarity in Collaborative Agentic Systems
Agentic AI aims to create systems that set their own goals, adapt proactively to change, and refine behavior through continuous experience. Recent advances suggest that, when facing multiple and unforeseen tasks, agents could benefit from sharing machine-learned knowledge and reuse policies that have already been fully or partially learned by other agents. However, how to query, select, and retrieve policies from a pool of agents, and how to integrate such policies remains a largely unexplored area. This study explores how an agent decides what knowledge to select, from whom, and when and how to integrate it in its own policy in order to accelerate its own learning. The proposed algorithm, \emph{Modular Sharing and Composition in Collective Learning} (MOSAIC), improves learning in agentic collectives by combining (1) knowledge selection using performance signals and cosine similarity on Wasserstein task embeddings, (2) modular and transferable neural representations via masks, and (3) policy integration, composition and fine-tuning. MOSAIC outperforms isolated learners and global sharing approaches in both learning speed and overall performance, and in some cases solves tasks that isolated agents cannot. The results also demonstrate that selective, goal-driven reuse leads to less susceptibility to task interference. We also observe the emergence of self-organization, where agents solving simpler tasks accelerate the learning of harder ones through shared knowledge.
comment: 25 pages, 20 figures, 6 tables. Preprint
Congestion Mitigation Path Planning for Large-Scale Multi-Agent Navigation in Dense Environments
In high-density environments where numerous autonomous agents move simultaneously in a distributed manner, streamlining global flows to mitigate local congestion is crucial to maintain overall navigation efficiency. This paper introduces a novel path-planning problem, congestion mitigation path planning (CMPP), which embeds congestion directly into the cost function, defined by the usage of incoming edges along agents' paths. CMPP assigns a flow-based multiplicative penalty to each vertex of a sparse graph, which grows steeply where frequently-traversed paths intersect, capturing the intuition that congestion intensifies where many agents enter the same area from different directions. Minimizing the total cost yields a set of coarse-level, time-independent routes that autonomous agents can follow while applying their own local collision avoidance. We formulate the problem and develop two solvers: (i) an exact mixed-integer nonlinear programming solver for small instances, and (ii) a scalable two-layer search algorithm, A-CMTS, which quickly finds suboptimal solutions for large-scale instances and iteratively refines them toward the optimum. Empirical studies show that augmenting state-of-the-art collision-avoidance planners with CMPP significantly reduces local congestion and enhances system throughput in both discrete- and continuous-space scenarios. These results indicate that CMPP improves the performance of multi-agent systems in real-world applications such as logistics and autonomous-vehicle operations.
comment: Published in IEEE Robotics and Automation Letters (RA-L), 2025. (C) 2025 IEEE. CC BY 4.0 license. Supplementary videos will be accessible via IEEE Xplore upon publication
MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling
Despite recent advances, long-sequence video generation frameworks still suffer from significant limitations: poor assistive capability, suboptimal visual quality, and limited expressiveness. To mitigate these limitations, we propose MAViS, an end-to-end multi-agent collaborative framework for long-sequence video storytelling. MAViS orchestrates specialized agents across multiple stages, including script writing, shot designing, character modeling, keyframe generation, video animation, and audio generation. In each stage, agents operate under the 3E Principle -- Explore, Examine, and Enhance -- to ensure the completeness of intermediate outputs. Considering the capability limitations of current generative models, we propose the Script Writing Guidelines to optimize compatibility between scripts and generative tools. Experimental results demonstrate that MAViS achieves state-of-the-art performance in assistive capability, visual quality, and video expressiveness. Its modular framework further enables scalability with diverse generative models and tools. With just a brief user prompt, MAViS is capable of producing high-quality, expressive long-sequence video storytelling, enriching inspirations and creativity for users. To the best of our knowledge, MAViS is the only framework that provides multimodal design output -- videos with narratives and background music.
comment: Video Generation Agent
Language-Guided Multi-Agent Learning in Simulations: A Unified Framework and Evaluation
This paper introduces LLM-MARL, a unified framework that incorporates large language models (LLMs) into multi-agent reinforcement learning (MARL) to enhance coordination, communication, and generalization in simulated game environments. The framework features three modular components of Coordinator, Communicator, and Memory, which dynamically generate subgoals, facilitate symbolic inter-agent messaging, and support episodic recall. Training combines PPO with a language-conditioned loss and LLM query gating. LLM-MARL is evaluated in Google Research Football, MAgent Battle, and StarCraft II. Results show consistent improvements over MAPPO and QMIX in win rate, coordination score, and zero-shot generalization. Ablation studies demonstrate that subgoal generation and language-based messaging each contribute significantly to performance gains. Qualitative analysis reveals emergent behaviors such as role specialization and communication-driven tactics. By bridging language modeling and policy learning, this work contributes to the design of intelligent, cooperative agents in interactive simulations. It offers a path forward for leveraging LLMs in multi-agent systems used for training, games, and human-AI collaboration.
Systems and Control (CS)
Manipulate-to-Navigate: Reinforcement Learning with Visual Affordances and Manipulability Priors
Mobile manipulation in dynamic environments is challenging due to movable obstacles blocking the robot's path. Traditional methods, which treat navigation and manipulation as separate tasks, often fail in such 'manipulate-to-navigate' scenarios, as obstacles must be removed before navigation. In these cases, active interaction with the environment is required to clear obstacles while ensuring sufficient space for movement. To address the manipulate-to-navigate problem, we propose a reinforcement learning-based approach for learning manipulation actions that facilitate subsequent navigation. Our method combines manipulability priors to focus the robot on high manipulability body positions with affordance maps for selecting high-quality manipulation actions. By focusing on feasible and meaningful actions, our approach reduces unnecessary exploration and allows the robot to learn manipulation strategies more effectively. We present two new manipulate-to-navigate simulation tasks called Reach and Door with the Boston Dynamics Spot robot. The first task tests whether the robot can select a good hand position in the target area such that the robot base can move effectively forward while keeping the end effector position fixed. The second task requires the robot to move a door aside in order to clear the navigation path. Both of these tasks need first manipulation and then navigating the base forward. Results show that our method allows a robot to effectively interact with and traverse dynamic environments. Finally, we transfer the learned policy to a real Boston Dynamics Spot robot, which successfully performs the Reach task.
Exploiting Convexity of Neural Networks in Dynamic Operating Envelope Optimization for Distributed Energy Resources
The increasing penetration of distributed energy resources (DERs) brings opportunities and challenges to the operation of distribution systems. To ensure network integrity, dynamic operating envelopes (DOEs) are issued by utilities to DERs as their time-varying export/import power limits. Due to the non-convex nature of power flow equations, the optimization of DOEs faces a dilemma of solution accuracy and computation efficiency. To bridge this gap, in this paper, we facilitate DOE optimization by exploiting the convexity of input convex neural networks (ICNNs). A DOE optimization model is first presented, comprehensively considering multiple operational constraints. We propose a constraint embedding method that allows us to replace the non-convex power flow constraints with trained ICNN models and convexify the problem. To further speed up DOE optimization, we propose a linear relaxation of the ICNN-based DOE optimization problem, for which the tightness is theoretically proven. The effectiveness of the proposed method is validated with numerical case studies. Results show that the proposed ICNN-based method outperforms other benchmark methods in optimizing DOEs in terms of both solution quality and solution time.
Sufficient A Priori Conditions for the Linear Relaxation of the Energy Storage Scheduling Problem
When modeling energy storage systems, an essential question is how to account for the physical infeasibility of simultaneous charge and discharge. The use of complementarity constraints or of binary variables is common, but these formulations do not scale well. Alternatively, assumptions such as perfect efficiencies or positive prices are often used to justify the choice of a linear model. In this paper, we establish new a priori conditions that guarantee the existence of an optimal solution without simultaneous charge and discharge when solving the linear relaxation of the storage scheduling problem. They are based on the characteristics of the storage system, in particular, the duration of charge. They can be valid for negative prices and with inefficiencies, thereby enlarging the set of conditions for which the complementarity constraints can be relaxed. We prove mathematically the validity of these conditions and illustrate them with practical examples. We also introduce a refined mixed-integer linear equivalent, in which the number of binary variables can be drastically reduced.
Revisiting Functional Derivatives in Multi-object Tracking
Probability generating functionals (PGFLs) are efficient and powerful tools for tracking independent objects in clutter. It was shown that PGFLs could be used for the elegant derivation of practical multi-object tracking algorithms, e.g., the probability hypothesis density (PHD) filter. However, derivations using PGFLs use the so-called functional derivatives whose definitions usually appear too complicated or heuristic, involving Dirac delta ``functions''. This paper begins by comparing different definitions of functional derivatives and exploring their relationships and implications for practical applications. It then proposes a rigorous definition of the functional derivative, utilizing straightforward yet precise mathematics for clarity. Key properties of the functional derivative are revealed and discussed.
comment: submitted to IEEE Transactions on Signal Processing
Grid Edge Intelligence-Assisted Model Predictive Framework for Black Start of Distribution Systems with Inverter-Based Resources
The growing proliferation of distributed energy resources (DERs) is significantly enhancing the resilience and reliability of distribution systems. However, a substantial portion of behind-the-meter (BTM) DERs is often overlooked during black start (BS) and restoration processes. Existing BS strategies that utilize grid-forming (GFM) battery energy storage systems (BESS) frequently ignore critical frequency security and synchronization constraints. To address these limitations, this paper proposes a predictive framework for bottom-up BS that leverages the flexibility of BTM DERs through Grid Edge Intelligence (GEI). A predictive model is developed for GEI to estimate multi-period flexibility ranges and track dispatch signals from the utility. A frequency-constrained BS strategy is then introduced, explicitly incorporating constraints on frequency nadir, rate-of-change-of-frequency (RoCoF), and quasi-steady-state (QSS) frequency. The framework also includes synchronizing switches to enable faster and more secure load restoration. Notably, it requires GEI devices to communicate only their flexibility ranges and the utility to send dispatch signals without exchanging detailed asset information. The proposed framework is validated using a modified IEEE 123-bus test system, and the impact of GEI is demonstrated by comparing results across various GEI penetration scenarios.
comment: This manuscript has been submitted to IEEE Transaction on Smart Grid
PFD or PDF: Rethinking the Probability of Failure in Mitigation Safety Functions
SIL (Safety Integrity Level) allocation plays a crucial role in defining the design requirements for Safety Functions (SFs) within high-risk industries. SIL is typically determined based on the estimated Probability of Failure on Demand (PFD), which must remain within permissible limits to manage risk effectively. Extensive research has been conducted on determining target PFD and SIL, with a stronger emphasis on preventive SFs than on mitigation SFs. In this paper, we address a rather conceptual issue: we argue that PFD is not an appropriate reliability measure for mitigation SFs to begin with, and we propose an alternative approach that leverages the Probability Density Function (PDF) and the expected degree of failure as key metrics. The principles underlying this approach are explained and supported by detailed mathematical formulations. Furthermore, the practical application of this new methodology is illustrated through case studies.
[Social] Allostasis: Or, How I Learned To Stop Worrying and Love The Noise
The notion of homeostasis typically conceptualises biological and artificial systems as maintaining stability by resisting deviations caused by environmental and social perturbations. In contrast, (social) allostasis proposes that these systems can proactively leverage these very perturbations to reconfigure their regulatory parameters in anticipation of environmental demands, aligning with von Foerster's ``order through noise'' principle. This paper formulates a computational model of allostatic and social allostatic regulation that employs biophysiologically inspired signal transducers, analogous to hormones like cortisol and oxytocin, to encode information from both the environment and social interactions, which mediate this dynamic reconfiguration. The models are tested in a small society of ``animats'' across several dynamic environments, using an agent-based model. The results show that allostatic and social allostatic regulation enable agents to leverage environmental and social ``noise'' for adaptive reconfiguration, leading to improved viability compared to purely reactive homeostatic agents. This work offers a novel computational perspective on the principles of social allostasis and their potential for designing more robust, bio-inspired, adaptive systems
comment: 20 pages, 5 figures. Accepted at ALIFE 2025 (Kyoto, Japan; October 6th - 10th 2025)
A Hierarchical Surrogate Model for Efficient Multi-Task Parameter Learning in Closed-Loop Contro
Many control problems require repeated tuning and adaptation of controllers across distinct closed-loop tasks, where data efficiency and adaptability are critical. We propose a hierarchical Bayesian optimization (BO) framework that is tailored to efficient controller parameter learning in sequential decision-making and control scenarios for distinct tasks. Instead of treating the closed-loop cost as a black-box, our method exploits structural knowledge of the underlying problem, consisting of a dynamical system, a control law, and an associated closed-loop cost function. We construct a hierarchical surrogate model using Gaussian processes that capture the closed-loop state evolution under different parameterizations, while the task-specific weighting and accumulation into the closed-loop cost are computed exactly via known closed-form expressions. This allows knowledge transfer and enhanced data efficiency between different closed-loop tasks. The proposed framework retains sublinear regret guarantees on par with standard black-box BO, while enabling multi-task or transfer learning. Simulation experiments with model predictive control demonstrate substantial benefits in both sample efficiency and adaptability when compared to purely black-box BO approaches.
comment: 8 pages, 4 figures, accepted for CDC 2025
MCTR: Midpoint Corrected Triangulation for Autonomous Racing via Digital Twin Simulation in CARLA
In autonomous racing, reactive controllers eliminate the computational burden of the full See-Think-Act autonomy stack by directly mapping sensor inputs to control actions. This bypasses the need for explicit localization and trajectory planning. A widely adopted baseline in this category is the Follow-The-Gap method, which performs trajectory planning using LiDAR data. Building on FTG, the Delaunay Triangulation-based Racing algorithm introduces further enhancements. However, DTR's use of circumcircles for trajectory generation often results in insufficiently smooth paths, ultimately degrading performance. Additionally, the commonly used F1TENTH-simulator for autonomous racing competitions lacks support for 3D LiDAR perception, limiting its effectiveness in realistic testing. To address these challenges, this work proposes the MCTR algorithm. MCTR improves trajectory smoothness through the use of Curvature Corrected Moving Average and implements a digital twin system within the CARLA simulator to validate the algorithm's robustness under 3D LiDAR perception. The proposed algorithm has been thoroughly validated through both simulation and real-world vehicle experiments.
On the Gaussian Limit of the Output of IIR Filters
We study the asymptotic distribution of the output of a stable Linear Time-Invariant (LTI) system driven by a non-Gaussian stochastic input. Motivated by longstanding heuristics in the stochastic describing function method, we rigorously characterize when the output process becomes approximately Gaussian, even when the input is not. Using the Wasserstein-1 distance as a quantitative measure of non-Gaussianity, we derive upper bounds on the distance between the appropriately scaled output and a standard normal distribution. These bounds are obtained via Stein's method and depend explicitly on the system's impulse response and the dependence structure of the input process. We show that when the dominant pole of the system approaches the edge of stability and the input satisfies one of the following conditions: (i) independence, (ii) positive correlation with a real and positive dominant pole, or (iii) sufficient correlation decay, the output converges to a standard normal distribution at rate $O(1/\sqrt{t})$. We also present counterexamples where convergence fails, thereby motivating the stated assumptions. Our results provide a rigorous foundation for the widespread observation that outputs of low-pass LTI systems tend to be approximately Gaussian.
comment: 8 pages, 1 figure, accepted for publication IEEE Conference on Decision and Control 2025
BUILDA: A Thermal Building Data Generation Framework for Transfer Learning
Transfer learning (TL) can improve data-driven modeling of building thermal dynamics. Therefore, many new TL research areas emerge in the field, such as selecting the right source model for TL. However, these research directions require massive amounts of thermal building data which is lacking presently. Neither public datasets nor existing data generators meet the needs of TL research in terms of data quality and quantity. Moreover, existing data generation approaches typically require expert knowledge in building simulation. We present BuilDa, a thermal building data generation framework for producing synthetic data of adequate quality and quantity for TL research. The framework does not require profound building simulation knowledge to generate large volumes of data. BuilDa uses a single-zone Modelica model that is exported as a Functional Mock-up Unit (FMU) and simulated in Python. We demonstrate BuilDa by generating data and utilizing it for pretraining and fine-tuning TL models.
comment: Proceedings can be accessed at: https://annsim.org/2025-annsim-proceedings/
Deadline-Aware Bandwidth Allocation for Semantic Generative Communication with Diffusion Models
The importance of Radio Access Network (RAN) in support Artificial Intelligence (AI) application services has grown significantly, underscoring the need for an integrated approach that considers not only network efficiency but also AI performance. In this paper we focus on a semantic generative communication (SGC) framework for image inpainting application. Specifically, the transmitter sends semantic information, i.e., semantic masks and textual descriptions, while the receiver utilizes a conditional diffusion model on a base image, using them as conditioning data to produce the intended image. In this framework, we propose a bandwidth allocation scheme designed to maximize bandwidth efficiency while ensuring generation performance. This approach is based on our finding of a Semantic Deadline--the minimum time that conditioning data is required to be injected to meet a given performance threshold--within the multi-modal SGC framework. Given this observation, the proposed scheme allocates limited bandwidth so that each semantic information can be transmitted within the corresponding semantic deadline. Experimental results corroborate that the proposed bandwidth allocation scheme achieves higher generation performance in terms of PSNR for a given bandwidth compared to traditional schemes that do not account for semantic deadlines.
Stability Analysis of the Newton-Raphson Controller for a Class of Differentially Flat Systems
The Newton-Raphson Controller, established on the output prediction and the Newton-Raphson algorithm, is shown to be effective in a variety of control applications. Although the stability condition of the controller for linear systems has already been established, such condition for nonlinear systems remains unexplored. In this paper, we study the stability of the Newton-Raphson controller for a class of differentially flat nonlinear systems in the context of output regulation and tracking control. For output regulation, we prove that the controlled system is stable within a neighborhood of the origin if the corresponding flat system and output predictor satisfy a verifiable stability criterion. A semi-quantitative analysis is conducted to determine the measure of the domain of attraction. For tracking control, we prove that the controller is capable of driving the outputs to the external reference signals using a specific selection of controller parameters. Simulation results show that the controller achieves regulation and tracking respectively on the inverted pendulum and the kinematic bicycle, suggesting a potential in future control applications.
DCT-MARL: A Dynamic Communication Topology-Based MARL Algorithm for Connected Vehicle Platoon Control
With the rapid advancement of vehicular communication and autonomous driving technologies, connected vehicle platoon has emerged as a promising approach to improve traffic efficiency and driving safety. Reliable Vehicle-to-Vehicle (V2V) communication is critical to achieving efficient cooperative control. However, in real-world traffic environments, V2V links may suffer from time-varying delay and packet loss, leading to degraded control performance and even safety risks. To mitigate the adverse effects of non-ideal communication, this paper proposes a Dynamic Communication Topology based Multi-Agent Reinforcement Learning (DCT-MARL) algorithm for robust cooperative platoon control. Specifically, the state space is augmented with historical control action and delay to enhance robustness against communication delay. To mitigate the impact of packet loss, a multi-key gated communication mechanism is introduced, which dynamically adjusts the communication topology based on the correlation between agents and their current communication status.Simulation results demonstrate that the proposed DCT-MARL significantly outperforms state-of-the-art methods in terms of string stability and driving comfort, validating its superior robustness and effectiveness.
Feedback Linearization for Replicator Dynamics: A Control Framework for Evolutionary Game Convergence
This paper demonstrates the first application of feedback linearization to replicator dynamics, driving the evolution of non-convergent evolutionary games to systems with guaranteed global asymptotic stability.
comment: 14 pages, 10 figures feel free to contact author at adil121@bu.edu with any questions, comments, and concerns
Low-Cost Sensing and Classification for Early Stress and Disease Detection in Avocado Plants
With rising demands for efficient disease and salinity management in agriculture, early detection of plant stressors is crucial, particularly for high-value crops like avocados. This paper presents a comprehensive evaluation of low-cost sensors deployed in the field for early stress and disease detection in avocado plants. Our monitoring system was deployed across 72 plants divided into four treatment categories within a greenhouse environment, with data collected over six months. While leaf temperature and conductivity measurements, widely used metrics for controlled settings, were found unreliable in field conditions due to environmental interference and positioning challenges, leaf spectral measurements produced statistically significant results when combined with our machine learning approach. For soil data analysis, we developed a two-level hierarchical classifier that leverages domain knowledge about treatment characteristics, achieving 75-86\% accuracy across different avocado genotypes and outperforming conventional machine learning approaches by over 20\%. In addition, performance evaluation on an embedded edge device demonstrated the viability of our approach for resource-constrained environments, with reasonable computational efficiency while maintaining high classification accuracy. Our work bridges the gap between theoretical potential and practical application of low-cost sensors in agriculture and offers insights for developing affordable, scalable monitoring systems.
Observed Control -- Linearly Scalable Nonlinear Model Predictive Control with Adaptive Horizons
This work highlights the duality between state estimation methods and model predictive control. A predictive controller, observed control, is presented that uses this duality to efficiently compute control actions with linear time-horizon length scalability. The proposed algorithms provide exceptional computational efficiency, adaptive time horizon lengths, and early optimization termination criteria. The use of Kalman smoothers as the backend optimization framework provides for a straightforward implementation supported by strong theoretical guarantees. Additionally, a formulation is presented that separates linear model predictive control into purely reactive and anticipatory components, enabling any-time any-horizon observed control while ensuring controller stability for short time horizons. Finally, numerical case studies confirm that nonlinear filter extensions, i.e., the extended Kalman filter and unscented Kalman filter, effectively extend observed control to nonlinear systems and objectives.
comment: 16 pages, 8 figures. Submitted to IEEE Transactions on Automatic Control 8/17/2025
Stochastic Black Start Resource Allocation to Enable Dynamic Formation of Networked Microgrids and DER-aided Restoration
Extended outages in distributed systems (DSs) dominated by distributed energy resources (DERs) require innovative strategies to efficiently and securely deploy black start (BS) resources. To address the need, this paper proposes a two-stage stochastic resource allocation method within synchronizing dynamic microgrids (MGs) for black start (SDMG-BS), enabling risk-averse and adaptive restoration across various scenarios while ensuring frequency security. Virtual synchronous generator (VSG)-controlled grid-forming inverters (GFMIs) equipped with primary frequency governors (PFGs) are modeled as BS resources. Their frequency response is characterized by three transient indices, which are deployed as frequency dynamic constraints on load pick-up events to ensure frequency stability during the BS process. SDMG-BS framework facilitates location-independent synchronization among restored MGs and with the transmission grid (TG) with the help of smart switches (SSWs). The model incorporates scenario-based stochastic programming to address multi-source uncertainties, including season-dependent operational conditions and unpredictable TG outage durations, ensuring a resilient allocation plan. The proposed approach is validated on a modified IEEE 123-node feeder with three study cases designed across sixteen uncertainty scenarios.
Harnessing the Full Potential of RRAMs through Scalable and Distributed In-Memory Computing with Integrated Error Correction
Exponential growth in global computing demand is exacerbated due to the higher-energy requirements of conventional architectures, primarily due to energy-intensive data movement. In-memory computing with Resistive Random Access Memory (RRAM) addresses this by co-integrating memory and processing, but faces significant hurdles related to device-level non-idealities and poor scalability for large computing tasks. Here, we introduce \textbf{MELISO+} (In-\textbf{Me}mory \textbf{Li}near \textbf{So}lver), a full-stack, distributed framework for energy-efficient in-memory computing. MELISO+ proposes a novel two-tier error correction mechanism to mitigate device non-idealities and develops a distributed RRAM computing framework to enable matrix computations exceeding dimensions of $65,000 \times 65,000$. This approach reduces first- and second-order arithmetic errors due to device non-idealities by over 90\%, enhances energy efficiency by three to five orders of magnitude, and decreases latency 100-fold. Hence, MELISO+ allows lower-precision RRAM devices to outperform high-precision device alternatives in accuracy, energy and latency metrics. By unifying algorithm-hardware co-design with scalable architecture, MELISO+ significantly advances sustainable, high-dimensional computing suitable for applications like large language models and generative AI.
comment: Submitted to Nature Communication Contact authors for any info
Adaptive Model-Predictive Control of a Soft Continuum Robot Using a Physics-Informed Neural Network Based on Cosserat Rod Theory
Dynamic control of soft continuum robots (SCRs) holds great potential for expanding their applications, but remains a challenging problem due to the high computational demands of accurate dynamic models. While data-driven approaches like Koopman-operator-based methods have been proposed, they typically lack adaptability and cannot capture the full robot shape, limiting their applicability. This work introduces a real-time-capable nonlinear model-predictive control (MPC) framework for SCRs based on a domain-decoupled physics-informed neural network (DD-PINN) with adaptable bending stiffness. The DD-PINN serves as a surrogate for the dynamic Cosserat rod model with a speed-up factor of 44000. It is also used within an unscented Kalman filter for estimating the model states and bending compliance from end-effector position measurements. We implement a nonlinear evolutionary MPC running at 70 Hz on the GPU. In simulation, it demonstrates accurate tracking of dynamic trajectories and setpoint control with end-effector position errors below 3 mm (2.3% of the actuator's length). In real-world experiments, the controller achieves similar accuracy and accelerations up to 3.55 m/s2.
comment: 20 pages, 15 figures
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning
Robot learning is witnessing a significant increase in the size, diversity, and complexity of pre-collected datasets, mirroring trends in domains such as natural language processing and computer vision. Many robot learning methods treat such datasets as multi-task expert data and learn a multi-task, generalist policy by training broadly across them. Notably, while these generalist policies can improve the average performance across many tasks, the performance of generalist policies on any one task is often suboptimal due to negative transfer between partitions of the data, compared to task-specific specialist policies. In this work, we argue for the paradigm of training policies during deployment given the scenarios they encounter: rather than deploying pre-trained policies to unseen problems in a zero-shot manner, we non-parametrically retrieve and train models directly on relevant data at test time. Furthermore, we show that many robotics tasks share considerable amounts of low-level behaviors and that retrieval at the "sub"-trajectory granularity enables significantly improved data utilization, generalization, and robustness in adapting policies to novel problems. In contrast, existing full-trajectory retrieval methods tend to underutilize the data and miss out on shared cross-task content. This work proposes STRAP, a technique for leveraging pre-trained vision foundation models and dynamic time warping to retrieve sub-sequences of trajectories from large training corpora in a robust fashion. STRAP outperforms both prior retrieval algorithms and multi-task learning methods in simulated and real experiments, showing the ability to scale to much larger offline datasets in the real world as well as the ability to learn robust control policies with just a handful of real-world demonstrations.
comment: Project website at https://weirdlabuw.github.io/strap/
Tracking Control of Euler-Lagrangian Systems with Prescribed State, Input, and Temporal Constraints
The synthesis of a smooth tracking control for Euler-Lagrangian (EL) systems under stringent state, input, and temporal (SIT) constraints is challenging. In contrast to existing methods that utilize prior knowledge of EL model parameters and uncertainty bounds, this study proposes an approximation-free adaptive barrier function-based control policy to ensure local prescribed time convergence of tracking error under state and input constraints. The proposed approach uses smooth time-based generator functions embedded in the filtered tracking error, which is combined with a saturation function that limits control action and confines states within the prescribed limits by enforcing the time-varying bounds on the filtered tracking error. Importantly, corresponding feasibility conditions are derived pertaining to the minimum control authority, the maximum disturbance rejection capability of the control policy, and the viable set of initial conditions, illuminating the narrow operating domain of EL systems arising from the interplay of SIT constraints. Finally, the efficacy of the proposed approach is demonstrated using experimental and comparison studies.
DDD-GenDT: Dynamic Data-driven Generative Digital Twin Framework
Digital twin (DT) technology enables real-time simulation, prediction, and optimization of physical systems, but practical deployment faces challenges from high data requirements, proprietary data constraints, and limited adaptability to evolving conditions. This work introduces DDD-GenDT, a dynamic data-driven generative digital twin framework grounded in the Dynamic Data-Driven Application Systems (DDDAS) paradigm. The architecture comprises the Physical Twin Observation Graph (PTOG) to represent operational states, an Observation Window Extraction process to capture temporal sequences, a Data Preprocessing Pipeline for sensor structuring and filtering, and an LLM ensemble for zero-shot predictive inference. By leveraging generative AI, DDD-GenDT reduces reliance on extensive historical datasets, enabling DT construction in data-scarce settings while maintaining industrial data privacy. The DDDAS feedback mechanism allows the DT to autonomically adapt predictions to physical twin (PT) wear and degradation, supporting DT-aging, which ensures progressive synchronization of DT with PT evolution. The framework is validated using the NASA CNC milling dataset, with spindle current as the monitored variable. In a zero-shot setting, the GPT-4-based DT achieves an average RMSE of 0.479 A (4.79% of the 10 A spindle current), accurately modeling nonlinear process dynamics and PT aging without retraining. These results show that DDD-GenDT provides a generalizable, data-efficient, and adaptive DT modeling approach, bridging generative AI with the performance and reliability requirements of industrial DT applications.
Input-Output Extension of Underactuated Nonlinear Systems
This letter proposes a method to integrate auxiliary actuators that enhance the task space capabilities of commercial underactuated systems, leaving the internal certified low level controller untouched. The additional actuators are combined with a feedback linearizing outer loop controller, enabling full pose tracking. We provide the conditions under which legacy high level commands and new actuator inputs can be cohesively coordinated to achieve decoupled control of all degrees of freedom. A comparative study with a standard quadrotor originally not designed for physical interaction demonstrates that the proposed modified platform remains stable under contact, while the baseline system diverges. Additionally, simulation results under parameter uncertainty illustrate the robustness of the approach.
Systems and Control (EESS)
Manipulate-to-Navigate: Reinforcement Learning with Visual Affordances and Manipulability Priors
Mobile manipulation in dynamic environments is challenging due to movable obstacles blocking the robot's path. Traditional methods, which treat navigation and manipulation as separate tasks, often fail in such 'manipulate-to-navigate' scenarios, as obstacles must be removed before navigation. In these cases, active interaction with the environment is required to clear obstacles while ensuring sufficient space for movement. To address the manipulate-to-navigate problem, we propose a reinforcement learning-based approach for learning manipulation actions that facilitate subsequent navigation. Our method combines manipulability priors to focus the robot on high manipulability body positions with affordance maps for selecting high-quality manipulation actions. By focusing on feasible and meaningful actions, our approach reduces unnecessary exploration and allows the robot to learn manipulation strategies more effectively. We present two new manipulate-to-navigate simulation tasks called Reach and Door with the Boston Dynamics Spot robot. The first task tests whether the robot can select a good hand position in the target area such that the robot base can move effectively forward while keeping the end effector position fixed. The second task requires the robot to move a door aside in order to clear the navigation path. Both of these tasks need first manipulation and then navigating the base forward. Results show that our method allows a robot to effectively interact with and traverse dynamic environments. Finally, we transfer the learned policy to a real Boston Dynamics Spot robot, which successfully performs the Reach task.
Exploiting Convexity of Neural Networks in Dynamic Operating Envelope Optimization for Distributed Energy Resources
The increasing penetration of distributed energy resources (DERs) brings opportunities and challenges to the operation of distribution systems. To ensure network integrity, dynamic operating envelopes (DOEs) are issued by utilities to DERs as their time-varying export/import power limits. Due to the non-convex nature of power flow equations, the optimization of DOEs faces a dilemma of solution accuracy and computation efficiency. To bridge this gap, in this paper, we facilitate DOE optimization by exploiting the convexity of input convex neural networks (ICNNs). A DOE optimization model is first presented, comprehensively considering multiple operational constraints. We propose a constraint embedding method that allows us to replace the non-convex power flow constraints with trained ICNN models and convexify the problem. To further speed up DOE optimization, we propose a linear relaxation of the ICNN-based DOE optimization problem, for which the tightness is theoretically proven. The effectiveness of the proposed method is validated with numerical case studies. Results show that the proposed ICNN-based method outperforms other benchmark methods in optimizing DOEs in terms of both solution quality and solution time.
Sufficient A Priori Conditions for the Linear Relaxation of the Energy Storage Scheduling Problem
When modeling energy storage systems, an essential question is how to account for the physical infeasibility of simultaneous charge and discharge. The use of complementarity constraints or of binary variables is common, but these formulations do not scale well. Alternatively, assumptions such as perfect efficiencies or positive prices are often used to justify the choice of a linear model. In this paper, we establish new a priori conditions that guarantee the existence of an optimal solution without simultaneous charge and discharge when solving the linear relaxation of the storage scheduling problem. They are based on the characteristics of the storage system, in particular, the duration of charge. They can be valid for negative prices and with inefficiencies, thereby enlarging the set of conditions for which the complementarity constraints can be relaxed. We prove mathematically the validity of these conditions and illustrate them with practical examples. We also introduce a refined mixed-integer linear equivalent, in which the number of binary variables can be drastically reduced.
Revisiting Functional Derivatives in Multi-object Tracking
Probability generating functionals (PGFLs) are efficient and powerful tools for tracking independent objects in clutter. It was shown that PGFLs could be used for the elegant derivation of practical multi-object tracking algorithms, e.g., the probability hypothesis density (PHD) filter. However, derivations using PGFLs use the so-called functional derivatives whose definitions usually appear too complicated or heuristic, involving Dirac delta ``functions''. This paper begins by comparing different definitions of functional derivatives and exploring their relationships and implications for practical applications. It then proposes a rigorous definition of the functional derivative, utilizing straightforward yet precise mathematics for clarity. Key properties of the functional derivative are revealed and discussed.
comment: submitted to IEEE Transactions on Signal Processing
Grid Edge Intelligence-Assisted Model Predictive Framework for Black Start of Distribution Systems with Inverter-Based Resources
The growing proliferation of distributed energy resources (DERs) is significantly enhancing the resilience and reliability of distribution systems. However, a substantial portion of behind-the-meter (BTM) DERs is often overlooked during black start (BS) and restoration processes. Existing BS strategies that utilize grid-forming (GFM) battery energy storage systems (BESS) frequently ignore critical frequency security and synchronization constraints. To address these limitations, this paper proposes a predictive framework for bottom-up BS that leverages the flexibility of BTM DERs through Grid Edge Intelligence (GEI). A predictive model is developed for GEI to estimate multi-period flexibility ranges and track dispatch signals from the utility. A frequency-constrained BS strategy is then introduced, explicitly incorporating constraints on frequency nadir, rate-of-change-of-frequency (RoCoF), and quasi-steady-state (QSS) frequency. The framework also includes synchronizing switches to enable faster and more secure load restoration. Notably, it requires GEI devices to communicate only their flexibility ranges and the utility to send dispatch signals without exchanging detailed asset information. The proposed framework is validated using a modified IEEE 123-bus test system, and the impact of GEI is demonstrated by comparing results across various GEI penetration scenarios.
comment: This manuscript has been submitted to IEEE Transaction on Smart Grid
PFD or PDF: Rethinking the Probability of Failure in Mitigation Safety Functions
SIL (Safety Integrity Level) allocation plays a crucial role in defining the design requirements for Safety Functions (SFs) within high-risk industries. SIL is typically determined based on the estimated Probability of Failure on Demand (PFD), which must remain within permissible limits to manage risk effectively. Extensive research has been conducted on determining target PFD and SIL, with a stronger emphasis on preventive SFs than on mitigation SFs. In this paper, we address a rather conceptual issue: we argue that PFD is not an appropriate reliability measure for mitigation SFs to begin with, and we propose an alternative approach that leverages the Probability Density Function (PDF) and the expected degree of failure as key metrics. The principles underlying this approach are explained and supported by detailed mathematical formulations. Furthermore, the practical application of this new methodology is illustrated through case studies.
[Social] Allostasis: Or, How I Learned To Stop Worrying and Love The Noise
The notion of homeostasis typically conceptualises biological and artificial systems as maintaining stability by resisting deviations caused by environmental and social perturbations. In contrast, (social) allostasis proposes that these systems can proactively leverage these very perturbations to reconfigure their regulatory parameters in anticipation of environmental demands, aligning with von Foerster's ``order through noise'' principle. This paper formulates a computational model of allostatic and social allostatic regulation that employs biophysiologically inspired signal transducers, analogous to hormones like cortisol and oxytocin, to encode information from both the environment and social interactions, which mediate this dynamic reconfiguration. The models are tested in a small society of ``animats'' across several dynamic environments, using an agent-based model. The results show that allostatic and social allostatic regulation enable agents to leverage environmental and social ``noise'' for adaptive reconfiguration, leading to improved viability compared to purely reactive homeostatic agents. This work offers a novel computational perspective on the principles of social allostasis and their potential for designing more robust, bio-inspired, adaptive systems
comment: 20 pages, 5 figures. Accepted at ALIFE 2025 (Kyoto, Japan; October 6th - 10th 2025)
A Hierarchical Surrogate Model for Efficient Multi-Task Parameter Learning in Closed-Loop Contro
Many control problems require repeated tuning and adaptation of controllers across distinct closed-loop tasks, where data efficiency and adaptability are critical. We propose a hierarchical Bayesian optimization (BO) framework that is tailored to efficient controller parameter learning in sequential decision-making and control scenarios for distinct tasks. Instead of treating the closed-loop cost as a black-box, our method exploits structural knowledge of the underlying problem, consisting of a dynamical system, a control law, and an associated closed-loop cost function. We construct a hierarchical surrogate model using Gaussian processes that capture the closed-loop state evolution under different parameterizations, while the task-specific weighting and accumulation into the closed-loop cost are computed exactly via known closed-form expressions. This allows knowledge transfer and enhanced data efficiency between different closed-loop tasks. The proposed framework retains sublinear regret guarantees on par with standard black-box BO, while enabling multi-task or transfer learning. Simulation experiments with model predictive control demonstrate substantial benefits in both sample efficiency and adaptability when compared to purely black-box BO approaches.
comment: 8 pages, 4 figures, accepted for CDC 2025
MCTR: Midpoint Corrected Triangulation for Autonomous Racing via Digital Twin Simulation in CARLA
In autonomous racing, reactive controllers eliminate the computational burden of the full See-Think-Act autonomy stack by directly mapping sensor inputs to control actions. This bypasses the need for explicit localization and trajectory planning. A widely adopted baseline in this category is the Follow-The-Gap method, which performs trajectory planning using LiDAR data. Building on FTG, the Delaunay Triangulation-based Racing algorithm introduces further enhancements. However, DTR's use of circumcircles for trajectory generation often results in insufficiently smooth paths, ultimately degrading performance. Additionally, the commonly used F1TENTH-simulator for autonomous racing competitions lacks support for 3D LiDAR perception, limiting its effectiveness in realistic testing. To address these challenges, this work proposes the MCTR algorithm. MCTR improves trajectory smoothness through the use of Curvature Corrected Moving Average and implements a digital twin system within the CARLA simulator to validate the algorithm's robustness under 3D LiDAR perception. The proposed algorithm has been thoroughly validated through both simulation and real-world vehicle experiments.
On the Gaussian Limit of the Output of IIR Filters
We study the asymptotic distribution of the output of a stable Linear Time-Invariant (LTI) system driven by a non-Gaussian stochastic input. Motivated by longstanding heuristics in the stochastic describing function method, we rigorously characterize when the output process becomes approximately Gaussian, even when the input is not. Using the Wasserstein-1 distance as a quantitative measure of non-Gaussianity, we derive upper bounds on the distance between the appropriately scaled output and a standard normal distribution. These bounds are obtained via Stein's method and depend explicitly on the system's impulse response and the dependence structure of the input process. We show that when the dominant pole of the system approaches the edge of stability and the input satisfies one of the following conditions: (i) independence, (ii) positive correlation with a real and positive dominant pole, or (iii) sufficient correlation decay, the output converges to a standard normal distribution at rate $O(1/\sqrt{t})$. We also present counterexamples where convergence fails, thereby motivating the stated assumptions. Our results provide a rigorous foundation for the widespread observation that outputs of low-pass LTI systems tend to be approximately Gaussian.
comment: 8 pages, 1 figure, accepted for publication IEEE Conference on Decision and Control 2025
BUILDA: A Thermal Building Data Generation Framework for Transfer Learning
Transfer learning (TL) can improve data-driven modeling of building thermal dynamics. Therefore, many new TL research areas emerge in the field, such as selecting the right source model for TL. However, these research directions require massive amounts of thermal building data which is lacking presently. Neither public datasets nor existing data generators meet the needs of TL research in terms of data quality and quantity. Moreover, existing data generation approaches typically require expert knowledge in building simulation. We present BuilDa, a thermal building data generation framework for producing synthetic data of adequate quality and quantity for TL research. The framework does not require profound building simulation knowledge to generate large volumes of data. BuilDa uses a single-zone Modelica model that is exported as a Functional Mock-up Unit (FMU) and simulated in Python. We demonstrate BuilDa by generating data and utilizing it for pretraining and fine-tuning TL models.
comment: Proceedings can be accessed at: https://annsim.org/2025-annsim-proceedings/
Deadline-Aware Bandwidth Allocation for Semantic Generative Communication with Diffusion Models
The importance of Radio Access Network (RAN) in support Artificial Intelligence (AI) application services has grown significantly, underscoring the need for an integrated approach that considers not only network efficiency but also AI performance. In this paper we focus on a semantic generative communication (SGC) framework for image inpainting application. Specifically, the transmitter sends semantic information, i.e., semantic masks and textual descriptions, while the receiver utilizes a conditional diffusion model on a base image, using them as conditioning data to produce the intended image. In this framework, we propose a bandwidth allocation scheme designed to maximize bandwidth efficiency while ensuring generation performance. This approach is based on our finding of a Semantic Deadline--the minimum time that conditioning data is required to be injected to meet a given performance threshold--within the multi-modal SGC framework. Given this observation, the proposed scheme allocates limited bandwidth so that each semantic information can be transmitted within the corresponding semantic deadline. Experimental results corroborate that the proposed bandwidth allocation scheme achieves higher generation performance in terms of PSNR for a given bandwidth compared to traditional schemes that do not account for semantic deadlines.
Stability Analysis of the Newton-Raphson Controller for a Class of Differentially Flat Systems
The Newton-Raphson Controller, established on the output prediction and the Newton-Raphson algorithm, is shown to be effective in a variety of control applications. Although the stability condition of the controller for linear systems has already been established, such condition for nonlinear systems remains unexplored. In this paper, we study the stability of the Newton-Raphson controller for a class of differentially flat nonlinear systems in the context of output regulation and tracking control. For output regulation, we prove that the controlled system is stable within a neighborhood of the origin if the corresponding flat system and output predictor satisfy a verifiable stability criterion. A semi-quantitative analysis is conducted to determine the measure of the domain of attraction. For tracking control, we prove that the controller is capable of driving the outputs to the external reference signals using a specific selection of controller parameters. Simulation results show that the controller achieves regulation and tracking respectively on the inverted pendulum and the kinematic bicycle, suggesting a potential in future control applications.
DCT-MARL: A Dynamic Communication Topology-Based MARL Algorithm for Connected Vehicle Platoon Control
With the rapid advancement of vehicular communication and autonomous driving technologies, connected vehicle platoon has emerged as a promising approach to improve traffic efficiency and driving safety. Reliable Vehicle-to-Vehicle (V2V) communication is critical to achieving efficient cooperative control. However, in real-world traffic environments, V2V links may suffer from time-varying delay and packet loss, leading to degraded control performance and even safety risks. To mitigate the adverse effects of non-ideal communication, this paper proposes a Dynamic Communication Topology based Multi-Agent Reinforcement Learning (DCT-MARL) algorithm for robust cooperative platoon control. Specifically, the state space is augmented with historical control action and delay to enhance robustness against communication delay. To mitigate the impact of packet loss, a multi-key gated communication mechanism is introduced, which dynamically adjusts the communication topology based on the correlation between agents and their current communication status.Simulation results demonstrate that the proposed DCT-MARL significantly outperforms state-of-the-art methods in terms of string stability and driving comfort, validating its superior robustness and effectiveness.
Feedback Linearization for Replicator Dynamics: A Control Framework for Evolutionary Game Convergence
This paper demonstrates the first application of feedback linearization to replicator dynamics, driving the evolution of non-convergent evolutionary games to systems with guaranteed global asymptotic stability.
comment: 14 pages, 10 figures feel free to contact author at adil121@bu.edu with any questions, comments, and concerns
Low-Cost Sensing and Classification for Early Stress and Disease Detection in Avocado Plants
With rising demands for efficient disease and salinity management in agriculture, early detection of plant stressors is crucial, particularly for high-value crops like avocados. This paper presents a comprehensive evaluation of low-cost sensors deployed in the field for early stress and disease detection in avocado plants. Our monitoring system was deployed across 72 plants divided into four treatment categories within a greenhouse environment, with data collected over six months. While leaf temperature and conductivity measurements, widely used metrics for controlled settings, were found unreliable in field conditions due to environmental interference and positioning challenges, leaf spectral measurements produced statistically significant results when combined with our machine learning approach. For soil data analysis, we developed a two-level hierarchical classifier that leverages domain knowledge about treatment characteristics, achieving 75-86\% accuracy across different avocado genotypes and outperforming conventional machine learning approaches by over 20\%. In addition, performance evaluation on an embedded edge device demonstrated the viability of our approach for resource-constrained environments, with reasonable computational efficiency while maintaining high classification accuracy. Our work bridges the gap between theoretical potential and practical application of low-cost sensors in agriculture and offers insights for developing affordable, scalable monitoring systems.
Observed Control -- Linearly Scalable Nonlinear Model Predictive Control with Adaptive Horizons
This work highlights the duality between state estimation methods and model predictive control. A predictive controller, observed control, is presented that uses this duality to efficiently compute control actions with linear time-horizon length scalability. The proposed algorithms provide exceptional computational efficiency, adaptive time horizon lengths, and early optimization termination criteria. The use of Kalman smoothers as the backend optimization framework provides for a straightforward implementation supported by strong theoretical guarantees. Additionally, a formulation is presented that separates linear model predictive control into purely reactive and anticipatory components, enabling any-time any-horizon observed control while ensuring controller stability for short time horizons. Finally, numerical case studies confirm that nonlinear filter extensions, i.e., the extended Kalman filter and unscented Kalman filter, effectively extend observed control to nonlinear systems and objectives.
comment: 16 pages, 8 figures. Submitted to IEEE Transactions on Automatic Control 8/17/2025
Stochastic Black Start Resource Allocation to Enable Dynamic Formation of Networked Microgrids and DER-aided Restoration
Extended outages in distributed systems (DSs) dominated by distributed energy resources (DERs) require innovative strategies to efficiently and securely deploy black start (BS) resources. To address the need, this paper proposes a two-stage stochastic resource allocation method within synchronizing dynamic microgrids (MGs) for black start (SDMG-BS), enabling risk-averse and adaptive restoration across various scenarios while ensuring frequency security. Virtual synchronous generator (VSG)-controlled grid-forming inverters (GFMIs) equipped with primary frequency governors (PFGs) are modeled as BS resources. Their frequency response is characterized by three transient indices, which are deployed as frequency dynamic constraints on load pick-up events to ensure frequency stability during the BS process. SDMG-BS framework facilitates location-independent synchronization among restored MGs and with the transmission grid (TG) with the help of smart switches (SSWs). The model incorporates scenario-based stochastic programming to address multi-source uncertainties, including season-dependent operational conditions and unpredictable TG outage durations, ensuring a resilient allocation plan. The proposed approach is validated on a modified IEEE 123-node feeder with three study cases designed across sixteen uncertainty scenarios.
Harnessing the Full Potential of RRAMs through Scalable and Distributed In-Memory Computing with Integrated Error Correction
Exponential growth in global computing demand is exacerbated due to the higher-energy requirements of conventional architectures, primarily due to energy-intensive data movement. In-memory computing with Resistive Random Access Memory (RRAM) addresses this by co-integrating memory and processing, but faces significant hurdles related to device-level non-idealities and poor scalability for large computing tasks. Here, we introduce \textbf{MELISO+} (In-\textbf{Me}mory \textbf{Li}near \textbf{So}lver), a full-stack, distributed framework for energy-efficient in-memory computing. MELISO+ proposes a novel two-tier error correction mechanism to mitigate device non-idealities and develops a distributed RRAM computing framework to enable matrix computations exceeding dimensions of $65,000 \times 65,000$. This approach reduces first- and second-order arithmetic errors due to device non-idealities by over 90\%, enhances energy efficiency by three to five orders of magnitude, and decreases latency 100-fold. Hence, MELISO+ allows lower-precision RRAM devices to outperform high-precision device alternatives in accuracy, energy and latency metrics. By unifying algorithm-hardware co-design with scalable architecture, MELISO+ significantly advances sustainable, high-dimensional computing suitable for applications like large language models and generative AI.
comment: Submitted to Nature Communication Contact authors for any info
Adaptive Model-Predictive Control of a Soft Continuum Robot Using a Physics-Informed Neural Network Based on Cosserat Rod Theory
Dynamic control of soft continuum robots (SCRs) holds great potential for expanding their applications, but remains a challenging problem due to the high computational demands of accurate dynamic models. While data-driven approaches like Koopman-operator-based methods have been proposed, they typically lack adaptability and cannot capture the full robot shape, limiting their applicability. This work introduces a real-time-capable nonlinear model-predictive control (MPC) framework for SCRs based on a domain-decoupled physics-informed neural network (DD-PINN) with adaptable bending stiffness. The DD-PINN serves as a surrogate for the dynamic Cosserat rod model with a speed-up factor of 44000. It is also used within an unscented Kalman filter for estimating the model states and bending compliance from end-effector position measurements. We implement a nonlinear evolutionary MPC running at 70 Hz on the GPU. In simulation, it demonstrates accurate tracking of dynamic trajectories and setpoint control with end-effector position errors below 3 mm (2.3% of the actuator's length). In real-world experiments, the controller achieves similar accuracy and accelerations up to 3.55 m/s2.
comment: 20 pages, 15 figures
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning
Robot learning is witnessing a significant increase in the size, diversity, and complexity of pre-collected datasets, mirroring trends in domains such as natural language processing and computer vision. Many robot learning methods treat such datasets as multi-task expert data and learn a multi-task, generalist policy by training broadly across them. Notably, while these generalist policies can improve the average performance across many tasks, the performance of generalist policies on any one task is often suboptimal due to negative transfer between partitions of the data, compared to task-specific specialist policies. In this work, we argue for the paradigm of training policies during deployment given the scenarios they encounter: rather than deploying pre-trained policies to unseen problems in a zero-shot manner, we non-parametrically retrieve and train models directly on relevant data at test time. Furthermore, we show that many robotics tasks share considerable amounts of low-level behaviors and that retrieval at the "sub"-trajectory granularity enables significantly improved data utilization, generalization, and robustness in adapting policies to novel problems. In contrast, existing full-trajectory retrieval methods tend to underutilize the data and miss out on shared cross-task content. This work proposes STRAP, a technique for leveraging pre-trained vision foundation models and dynamic time warping to retrieve sub-sequences of trajectories from large training corpora in a robust fashion. STRAP outperforms both prior retrieval algorithms and multi-task learning methods in simulated and real experiments, showing the ability to scale to much larger offline datasets in the real world as well as the ability to learn robust control policies with just a handful of real-world demonstrations.
comment: Project website at https://weirdlabuw.github.io/strap/
Tracking Control of Euler-Lagrangian Systems with Prescribed State, Input, and Temporal Constraints
The synthesis of a smooth tracking control for Euler-Lagrangian (EL) systems under stringent state, input, and temporal (SIT) constraints is challenging. In contrast to existing methods that utilize prior knowledge of EL model parameters and uncertainty bounds, this study proposes an approximation-free adaptive barrier function-based control policy to ensure local prescribed time convergence of tracking error under state and input constraints. The proposed approach uses smooth time-based generator functions embedded in the filtered tracking error, which is combined with a saturation function that limits control action and confines states within the prescribed limits by enforcing the time-varying bounds on the filtered tracking error. Importantly, corresponding feasibility conditions are derived pertaining to the minimum control authority, the maximum disturbance rejection capability of the control policy, and the viable set of initial conditions, illuminating the narrow operating domain of EL systems arising from the interplay of SIT constraints. Finally, the efficacy of the proposed approach is demonstrated using experimental and comparison studies.
DDD-GenDT: Dynamic Data-driven Generative Digital Twin Framework
Digital twin (DT) technology enables real-time simulation, prediction, and optimization of physical systems, but practical deployment faces challenges from high data requirements, proprietary data constraints, and limited adaptability to evolving conditions. This work introduces DDD-GenDT, a dynamic data-driven generative digital twin framework grounded in the Dynamic Data-Driven Application Systems (DDDAS) paradigm. The architecture comprises the Physical Twin Observation Graph (PTOG) to represent operational states, an Observation Window Extraction process to capture temporal sequences, a Data Preprocessing Pipeline for sensor structuring and filtering, and an LLM ensemble for zero-shot predictive inference. By leveraging generative AI, DDD-GenDT reduces reliance on extensive historical datasets, enabling DT construction in data-scarce settings while maintaining industrial data privacy. The DDDAS feedback mechanism allows the DT to autonomically adapt predictions to physical twin (PT) wear and degradation, supporting DT-aging, which ensures progressive synchronization of DT with PT evolution. The framework is validated using the NASA CNC milling dataset, with spindle current as the monitored variable. In a zero-shot setting, the GPT-4-based DT achieves an average RMSE of 0.479 A (4.79% of the 10 A spindle current), accurately modeling nonlinear process dynamics and PT aging without retraining. These results show that DDD-GenDT provides a generalizable, data-efficient, and adaptive DT modeling approach, bridging generative AI with the performance and reliability requirements of industrial DT applications.
Input-Output Extension of Underactuated Nonlinear Systems
This letter proposes a method to integrate auxiliary actuators that enhance the task space capabilities of commercial underactuated systems, leaving the internal certified low level controller untouched. The additional actuators are combined with a feedback linearizing outer loop controller, enabling full pose tracking. We provide the conditions under which legacy high level commands and new actuator inputs can be cohesively coordinated to achieve decoupled control of all degrees of freedom. A comparative study with a standard quadrotor originally not designed for physical interaction demonstrates that the proposed modified platform remains stable under contact, while the baseline system diverges. Additionally, simulation results under parameter uncertainty illustrate the robustness of the approach.
Systems and Control (CS)
Techno-Economic Planning of Spatially-Resolved Battery Storage Systems in Renewable-Dominant Grids Under Weather Variability
The ongoing energy transition is significantly increasing the share of renewable energy sources (RES) in power systems; however, their intermittency and variability pose substantial challenges, including load shedding and system congestion. This study examines the role of the battery storage system (BSS) in mitigating these challenges by balancing power supply and demand. We optimize the location, size, and type of batteries using a two-stage stochastic program, with the second stage involving hourly operational decisions over an entire year. Unlike previous research, we incorporate the comprehensive technical and economic characteristics of battery technologies. The New York State (NYS) power system, currently undergoing a significant shift towards increased RES generation, serves as our case study. Using available load and weather data from 1980-2019, we account for the uncertainty of both load and RES generation through a sample average approximation approach. Our findings indicate that BSS can reduce renewable curtailment by 34% and load shedding by 21%, contributing to a more resilient power system in achieving NYS 2030 energy targets. Furthermore, the cost of employing BSS for the reduction of load shedding and RES curtailment does not increase linearly with additional capacity, revealing a complex relationship between costs and renewable penetration. This study provides valuable insights for the strategic BSS deployment to achieve a cost-effective and reliable power system in the energy transition as well as the feasibility of the NYS 2030 energy targets.
Graphon Mean-Field Logit Dynamic: Derivation, Computation, and Applications
We present a graphon mean-field logit dynamic, a stationary mean-field game based on logit interactions. This dynamic emerges from a stochastic control problem involving a continuum of nonexchangeable and interacting agents and reduces to solving a continuum of Hamilton-Jacobi-Bellman (HJB) equations connected through a graphon that models the connections among agents. Using a fixed-point argument, we prove that this HJB system admits a unique solution in the space of bounded functions when the discount rate is high (i.e., agents are myopic). Under certain assumptions, we also establish regularity properties of the system, such as equi-continuity. We propose a finite difference scheme for computing the HJB system and prove the uniqueness and existence of its numerical solutions. The mean-field logit dynamic is applied to a case study on inland fisheries resource management in the upper Tedori River of Japan. A series of computational cases are then conducted to investigate the dependence of the dynamic on both the discount rate and graphon.
Advanced DOA Regulation with a Whale-Optimized Fractional Order Fuzzy PID Framework
This study introduces a Fractional Order Fuzzy PID (FOFPID) controller that uses the Whale Optimization Algorithm (WOA) to manage the Bispectral Index (BIS), keeping it within the ideal range of forty to sixty. The FOFPID controller combines fuzzy logic for adapting to changes and fractional order dynamics for fine tuning. This allows it to adjust its control gains to handle a person's unique physiology. The WOA helps fine tune the controller's parameters, including the fractional orders and the fuzzy membership functions, which boosts its performance. Tested on models of eight different patient profiles, the FOFPID controller performed better than a standard Fractional Order PID (FOPID) controller. It achieved faster settling times, at two and a half minutes versus three point two minutes, and had a lower steady state error, at zero point five versus one point two. These outcomes show the FOFPID's excellent strength and accuracy. It offers a scalable, artificial intelligence driven solution for automated anesthesia delivery that could enhance clinical practice and improve patient results.
Sspherical sailing omnidirectional rover (SSailOR): wind tunnel experimental setup and results
This paper presents the design, instrumentation, and experimental procedures used to test the Spherical Sailing Omnidirectional Rover (SSailOR) in a controlled wind tunnel environment. The SSailOR is a wind-powered autonomous rover. This concept is motivated by the growing need for persistent and sustainable robotic systems in applications such as planetary exploration, Arctic observation, and military surveillance. SSailOR uses wind propulsion via onboard sails to enable long-duration mobility with minimal energy consumption. The spherical design simplifies mechanical complexity while enabling omnidirectional movement. Experimental tests were conducted to validate dynamic models and assess the aerodynamic performance of the rover under various configurations and environmental conditions. As a result, this design requires a co-design approach. Details of the mechanical structure, sensor integration, electronics, data acquisition system, and test parameters are presented in this paper. In addition, key observations are made that are relevant to the design optimization for further development of the rover.
A One-Class Explainable AI Framework for Identification of Non-Stationary Concurrent False Data Injections in Nuclear Reactor Signals
The transition of next generation advanced nuclear reactor systems from analog to fully digital instrumentation and control will necessitate robust mechanisms to safeguard against potential data integrity threats. One challenge is the real-time characterization of false data injections, which can mask sensor signals and potentially disrupt reactor control systems. While significant progress has been made in anomaly detection within reactor systems, potential false data injections have been shown to bypass conventional linear time-invariant state estimators and failure detectors based on statistical thresholds. The dynamic, nonlinear, multi-variate nature of sensor signals, combined with inherent noise and limited availability of real-world training data, makes the characterization of such threats and more importantly their differentiation from anticipated process anomalies particularly challenging. In this paper, we present an eXplainable AI (XAI) framework for identifying non-stationary concurrent replay attacks in nuclear reactor signals with minimal training data. The proposed framework leverages progress on recurrent neural networks and residual analysis coupled with a modified SHAP algorithm and rule-based correlations. The recurrent neural networks are trained only on normal operational data while for residual analysis we introduce an adaptive windowing technique to improve detection accuracy. We successfully benchmarked this framework on a real-world dataset from Purdue's nuclear reactor (PUR-1). We were able to detect false data injections with accuracy higher than 0.93 and less than 0.01 false positives, differentiate from expected process anomalies, and to identify the origin of the falsified signals.
Data-driven quantification and visualization of resilience metrics of power distribution system
This paper presents a data-driven approach for quantifying the resilience of distribution power grids to extreme weather events using two key metrics: (a) the number of outages and (b) restoration time. The method leverages historical outage records maintained by power utilities and weather measurements collected by the National Oceanic and Atmospheric Administration (NOAA) to evaluate resilience across a utility's service territory. The proposed framework consists of three stages. First, outage events are systematically extracted from the outage records by temporally and spatially aggregating coincident component outages. In the second stage, weather zones across the service territory are delineated using a Voronoi polygon approach, based on the locations of NOAA weather sensors. Finally, data-driven models for outage fragility and restoration time are developed for each weather zone. These models enable the quantification and visualization of resilience metrics under varying intensities of extreme weather events. The proposed method is demonstrated using real-world data from a US distribution utility, located in Indianapolis, focused on wind- and precipitation-related events. The dataset spans two decades and includes over 160,000 outage records.
comment: This paper has been submitted to Nature Communication Engineering
PUB: A Plasma-Propelled Ultra-Quiet Blimp with Two-DOF Vector Thrusting
This study presents the design and control of a Plasma-propelled Ultra-silence Blimp (PUB), a novel aerial robot employing plasma vector propulsion for ultra-quiet flight without mechanical propellers. The system utilizes a helium-lift platform for extended endurance and a four-layer ring asymmetric capacitor to generate ionic wind thrust. The modular propulsion units allow flexible configuration to meet mission-specific requirements, while a two-degree-of-freedom (DOF) head enables thrust vector control. A closed-loop slip control scheme is implemented for stable maneuvering. Flight experiments demonstrate full-envelope capability, including take-off, climb, hover, descent, and smooth landing, confirming the feasibility of plasma vector propulsion, the effectiveness of DOF vector control, and the stability of the control system. Owing to its low acoustic signature, structural simplicity, and high maneuverability, PUB is well suited for noise-sensitive, enclosed, and near-space applications.
Efficient and accurate solution of wind-integrated optimal power flow based on enhanced second-order cone relaxation with rolling cutting plane technique
The integration of large-scale renewable energy sources, such as wind power, poses significant challenges for the optimal operation of power systems owing to their inherent uncertainties. This paper proposes a solution framework for wind-integrated optimal power flow (OPF) that leverages an enhanced second-order cone relaxation (SOCR), supported by a rolling cutting plane technique. Initially, the wind generation cost, arising from discrepancies between scheduled and actual wind power outputs, is meticulously modeled using a Gaussian mixture model based on historical wind power data. This modelled wind generation cost is subsequently incorporated into the objective function of the conventional OPF problem. To achieve the efficient and accurate solution for the wind-integrated OPF, effectively managing the constraints associated with AC power flow equations is essential. In this regard, a SOCR, combined with a second-order Taylor series expansion, is employed to facilitate the convex approximation of the AC power flow equations. Additionally, a warm-start strategy, grounded in a proposed rolling cutting plane technique, is devised to reduce relaxation errors and enhance computational efficiency. Finally, the effectiveness and efficiency of the proposed solution framework are demonstrated across various case studies. Specifically, the influence of wind power cost is also examined, further highlighting the practical implications of the proposed solution framework.
Semi-Infinite Programming for Collision-Avoidance in Optimal and Model Predictive Control
This paper presents a novel approach for collision avoidance in optimal and model predictive control, in which the environment is represented by a large number of points and the robot as a union of padded polygons. The conditions that none of the points shall collide with the robot can be written in terms of an infinite number of constraints per obstacle point. We show that the resulting semi-infinite programming (SIP) optimal control problem (OCP) can be efficiently tackled through a combination of two methods: local reduction and an external active-set method. Specifically, this involves iteratively identifying the closest point obstacles, determining the lower-level distance minimizer among all feasible robot shape parameters, and solving the upper-level finitely-constrained subproblems. In addition, this paper addresses robust collision avoidance in the presence of ellipsoidal state uncertainties. Enforcing constraint satisfaction over all possible uncertainty realizations extends the dimension of constraint infiniteness. The infinitely many constraints arising from translational uncertainty are handled by local reduction together with the robot shape parameterization, while rotational uncertainty is addressed via a backoff reformulation. A controller implemented based on the proposed method is demonstrated on a real-world robot running at 20Hz, enabling fast and collision-free navigation in tight spaces. An application to 3D collision avoidance is also demonstrated in simulation.
comment: 21 pages, 15 figures
Design and Analysis of Curved Electrode Configurations for Enhanced Sensitivity in 1-Axis MEMS Accelerometers
This paper presents a comprehensive analytical and simulation-based study of curved electrode geometries for enhancing the sensitivity of MEMS capacitive accelerometers. Expressions for the capacitance between a planar movable electrode and six distinct fixed electrode profiles (biconvex, biconcave, concavo-convex, convexo-concave, plano-convex, and plano-concave) are derived, enabling direct calculation of differential gain and sensitivity as functions of electrode curvature and gap displacement. These analytical models are then rigorously validated using finite element simulations performed using COMSOL Multiphysics under identical bias and boundary conditions. The simulation results demonstrate agreement with the analytical results with a deviation of less than 7% in all configurations. The results also reveal that biconvex curved electrodes yield the greatest sensitivity improvement over the planar electrodes, with sensitivity monotonically increasing with arc length, while concave and plano-concave designs exhibit reduced performance. The concavo-convex and convexo-concave configurations furthermore introduce polarity inversion in the output voltage, offering additional design flexibility. Importantly, these sensitivity enhancements are achieved without any change in the overall volumetric dimensions of the device or the proofmass dimensions of the module for achieving higher-resolution inertial sensing.
comment: 10 pages, 7 figures
Adaptive Control with Set-Point Tracking and Linear-like Closed-loop Behavior
In this paper, we consider the problem of set-point tracking for a discrete-time plant with unknown plant parameters belonging to a convex and compact uncertainty set. We carry out parameter estimation for an associated auxiliary plant, and a pole-placement-based control law is employed. We prove that this adaptive controller provides desirable linear-like closed-loop behavior which guarantees a bound consisting of: exponential decay with respect to the initial condition, a linear-like convolution bound with respect to the exogenous inputs, and a constant scaled by the square root of the constant in the denominator of the parameter estimator update law. This implies that the system has a bounded gain. Moreover, asymptotic tracking is also proven when the disturbance is constant.
comment: This is an extended version of a paper that will appear in the Proceedings of the 64th IEEE Conference on Decision and Control (CDC)
Understanding the Fundamental Trade-Off Between Age of Information and Throughput in Unreliable Wireless Networks
This paper characterizes the fundamental trade-off between throughput and Age of Information (AoI) in wireless networks where multiple devices transmit status updates to a central base station over unreliable channels. To address the complexity introduced by stochastic transmission successes, we propose the throughput-AoI capacity region, which defines all feasible throughput-AoI pairs achievable under any scheduling policy. Using a second-order approximation that incorporates both mean and temporal variance, we derive an outer bound and a tight inner bound for the throughput-AoI capacity region. Furthermore, we propose a simple and low complexity scheduling policy and prove that it achieves every interior point within the tight inner bound. This establishes a systematic and theoretically grounded framework for the joint optimization of throughput and information freshness in practical wireless communication scenarios. To validate our theoretical framework and demonstrate the utility of the throughput-AoI capacity region, extensive simulations are implemented. Simulation results demonstrate that our proposed policy significantly outperforms conventional methods across various practical network optimization scenarios. The findings highlight our approach's effectiveness in optimizing both throughput and AoI, underscoring its applicability and robustness in practical wireless networks.
Two-Instrument Screening under Soft Budget Constraints
We study soft budget constraints in multi-tier public finance when an upper-tier government uses two instruments: an ex-ante grant schedule and an ex-post rescue. Under convex rescue costs and standard primitives, the three-stage leader-follower problem collapses to one dimensional screening with a single allocation index: the cap on realized rescue. A hazard-based characterization delivers a unified rule that nests (i) no rescue, (ii) a threshold-cap with commitment, and (iii) a threshold--linear--cap without commitment. The knife-edge for eliminating bailouts compares the marginal cost at the origin to the supremum of a virtual weight, and the comparative statics show how greater curvature tightens caps while discretion shifts transfers toward front loading by lowering the effective grant weight. The framework provides a portable benchmark for mechanism design and yields testable implications for policy and empirical work on intergovernmental finance.
comment: arXiv admin note: text overlap with arXiv:2508.02171
Towards Infant Sleep-Optimized Driving: Synergizing Wearable and Vehicle Sensing in Intelligent Cruise Control
Automated driving (AD) has substantially improved vehicle safety and driving comfort, but their impact on passenger well-being, particularly infant sleep, is not sufficiently studied. Sudden acceleration, abrupt braking, and sharp maneuvers can disrupt infant sleep, compromising both passenger comfort and parental convenience. To solve this problem, this paper explores the integration of reinforcement learning (RL) within AD to personalize driving behavior and optimally balance occupant comfort and travel efficiency. In particular, we propose an intelligent cruise control framework that adapts to varying driving conditions to enhance infant sleep quality by effectively synergizing wearable sensing and vehicle data. Long short-term memory (LSTM) and transformer-based neural networks are integrated with RL to model the relationship between driving behavior and infant sleep quality under diverse traffic and road conditions. Based on the sleep quality indicators from the wearable sensors, driving action data from vehicle controllers, and map data from map applications, the model dynamically computes the optimal driving aggressiveness level, which is subsequently translated into specific AD control strategies, e.g., the magnitude and frequency of acceleration, lane change, and overtaking. Simulation experiments conducted in the CARLA environment indicate that the proposed solution significantly improves infant sleep quality compared to baseline methods, while preserving desirable travel efficiency.
Joint Power and Spectrum Orchestration for D2D Semantic Communication Underlying Energy-Efficient Cellular Networks
Semantic communication (SemCom) has been recently deemed a promising next-generation wireless technique to enable efficient spectrum savings and information exchanges, thus naturally introducing a novel and practical network paradigm where cellular and device-to-device (D2D) SemCom approaches coexist. Nevertheless, the involved wireless resource management becomes complicated and challenging due to the unique semantic performance measurements and energy-consuming semantic coding mechanism. To this end, this paper jointly investigates power control and spectrum reuse problems for energy-efficient D2D SemCom cellular networks. Concretely, we first model the user preference-aware semantic triplet transmission and leverage a novel metric of semantic value to identify the semantic information importance conveyed in SemCom. Then, we define the additional power consumption from semantic encoding in conjunction with basic power amplifier dissipation to derive the overall system energy efficiency (semantic-value/Joule). Next, we formulate an energy efficiency maximization problem for joint power and spectrum allocation subject to several SemCom-related and practical constraints. Afterward, we propose an optimal resource management solution by employing the fractional-to-subtractive problem transformation and decomposition while developing a three-stage method with theoretical analysis of its optimality guarantee and computational complexity. Numerical results demonstrate the adequate performance superiority of our proposed solution compared with different benchmarks.
comment: This paper has been submitted to IEEE Trans. on Wireless Communications for the third round of peer review after minor revisions
HuB: Learning Extreme Humanoid Balance
The human body demonstrates exceptional motor capabilities-such as standing steadily on one foot or performing a high kick with the leg raised over 1.5 meters-both requiring precise balance control. While recent research on humanoid control has leveraged reinforcement learning to track human motions for skill acquisition, applying this paradigm to balance-intensive tasks remains challenging. In this work, we identify three key obstacles: instability from reference motion errors, learning difficulties due to morphological mismatch, and the sim-to-real gap caused by sensor noise and unmodeled dynamics. To address these challenges, we propose HuB (Humanoid Balance), a unified framework that integrates reference motion refinement, balance-aware policy learning, and sim-to-real robustness training, with each component targeting a specific challenge. We validate our approach on the Unitree G1 humanoid robot across challenging quasi-static balance tasks, including extreme single-legged poses such as Swallow Balance and Bruce Lee's Kick. Our policy remains stable even under strong physical disturbances-such as a forceful soccer strike-while baseline methods consistently fail to complete these tasks. Project website: https://hub-robot.github.io
comment: CoRL 2025 (Oral Presentation). Project website: https://hub-robot.github.io
Self-Tuning PID Control via a Hybrid Actor-Critic-Based Neural Structure for Quadcopter Control
Proportional-Integrator-Derivative (PID) controller is used in a wide range of industrial and experimental processes. There are a couple of offline methods for tuning PID gains. However, due to the uncertainty of model parameters and external disturbances, real systems such as Quadrotors need more robust and reliable PID controllers. In this research, a self-tuning PID controller using a Reinforcement-Learning-based Neural Network for attitude and altitude control of a Quadrotor has been investigated. An Incremental PID, which contains static and dynamic gains, has been considered and only the variable gains have been tuned. To tune dynamic gains, a model-free actor-critic-based hybrid neural structure was used that was able to properly tune PID gains, and also has done the best as an identifier. In both tunning and identification tasks, a Neural Network with two hidden layers and sigmoid activation functions has been learned using Adaptive Momentum (ADAM) optimizer and Back-Propagation (BP) algorithm. This method is online, able to tackle disturbance, and fast in training. In addition to robustness to mass uncertainty and wind gust disturbance, results showed that the proposed method had a better performance when compared to a PID controller with constant gains.
comment: 7 pages, 18 figures, The 30th Annual International Conference of Iranian Society of Mechanical Engineers
Towards Safe Autonomous Driving Policies using a Neuro-Symbolic Deep Reinforcement Learning Approach
The dynamic nature of driving environments and the presence of diverse road users pose significant challenges for decision-making in autonomous driving. Deep reinforcement learning (DRL) has emerged as a popular approach to tackle this problem. However, the application of existing DRL solutions is mainly confined to simulated environments due to safety concerns, impeding their deployment in real-world. To overcome this limitation, this paper introduces a novel neuro-symbolic model-free DRL approach, called DRL with Symbolic Logic (DRLSL) that combines the strengths of DRL (learning from experience) and symbolic first-order logic (knowledge-driven reasoning) to enable safe learning in real-time interactions of autonomous driving within real environments. This innovative approach provides a means to learn autonomous driving policies by actively engaging with the physical environment while ensuring safety. We have implemented the DRLSL framework in a highway driving scenario using the HighD dataset and demonstrated that our method successfully avoids unsafe actions during both the training and testing phases. Furthermore, our results indicate that DRLSL achieves faster convergence during training and exhibits better generalizability to new highway driving scenarios compared to traditional DRL methods.
comment: 15 pages, 9 figures, 1 table, 1 algorithm
Online Identification of Time-Varying Systems Using Excitation Sets and Change Point Detection
In this work, we first show that the problem of parameter identification is often ill-conditioned and lacks the persistence of excitation required for the convergence of online learning schemes. To tackle these challenges, we introduce the notion of optimal and greedy excitation sets which contain data points with sufficient richness to aid in the identification task. We then present the greedy excitation set-based recursive least squares algorithm to alleviate the problem of the lack of persistent excitation, and prove that the iterates generated by the proposed algorithm minimize an auxiliary weighted least squares cost function. When data points are generated from time-varying parameters, online estimators tend to underfit the true parameter trajectory, and their predictability deteriorates. To tackle this problem, we propose a memory resetting scheme leveraging change point detection techniques. Finally, we illustrate the performance of the proposed algorithms via several numerical case studies to learn the (time-varying) parameters of networked epidemic dynamics, and compare it with results obtained using conventional approaches.
Systems and Control (EESS)
Techno-Economic Planning of Spatially-Resolved Battery Storage Systems in Renewable-Dominant Grids Under Weather Variability
The ongoing energy transition is significantly increasing the share of renewable energy sources (RES) in power systems; however, their intermittency and variability pose substantial challenges, including load shedding and system congestion. This study examines the role of the battery storage system (BSS) in mitigating these challenges by balancing power supply and demand. We optimize the location, size, and type of batteries using a two-stage stochastic program, with the second stage involving hourly operational decisions over an entire year. Unlike previous research, we incorporate the comprehensive technical and economic characteristics of battery technologies. The New York State (NYS) power system, currently undergoing a significant shift towards increased RES generation, serves as our case study. Using available load and weather data from 1980-2019, we account for the uncertainty of both load and RES generation through a sample average approximation approach. Our findings indicate that BSS can reduce renewable curtailment by 34% and load shedding by 21%, contributing to a more resilient power system in achieving NYS 2030 energy targets. Furthermore, the cost of employing BSS for the reduction of load shedding and RES curtailment does not increase linearly with additional capacity, revealing a complex relationship between costs and renewable penetration. This study provides valuable insights for the strategic BSS deployment to achieve a cost-effective and reliable power system in the energy transition as well as the feasibility of the NYS 2030 energy targets.
Graphon Mean-Field Logit Dynamic: Derivation, Computation, and Applications
We present a graphon mean-field logit dynamic, a stationary mean-field game based on logit interactions. This dynamic emerges from a stochastic control problem involving a continuum of nonexchangeable and interacting agents and reduces to solving a continuum of Hamilton-Jacobi-Bellman (HJB) equations connected through a graphon that models the connections among agents. Using a fixed-point argument, we prove that this HJB system admits a unique solution in the space of bounded functions when the discount rate is high (i.e., agents are myopic). Under certain assumptions, we also establish regularity properties of the system, such as equi-continuity. We propose a finite difference scheme for computing the HJB system and prove the uniqueness and existence of its numerical solutions. The mean-field logit dynamic is applied to a case study on inland fisheries resource management in the upper Tedori River of Japan. A series of computational cases are then conducted to investigate the dependence of the dynamic on both the discount rate and graphon.
Advanced DOA Regulation with a Whale-Optimized Fractional Order Fuzzy PID Framework
This study introduces a Fractional Order Fuzzy PID (FOFPID) controller that uses the Whale Optimization Algorithm (WOA) to manage the Bispectral Index (BIS), keeping it within the ideal range of forty to sixty. The FOFPID controller combines fuzzy logic for adapting to changes and fractional order dynamics for fine tuning. This allows it to adjust its control gains to handle a person's unique physiology. The WOA helps fine tune the controller's parameters, including the fractional orders and the fuzzy membership functions, which boosts its performance. Tested on models of eight different patient profiles, the FOFPID controller performed better than a standard Fractional Order PID (FOPID) controller. It achieved faster settling times, at two and a half minutes versus three point two minutes, and had a lower steady state error, at zero point five versus one point two. These outcomes show the FOFPID's excellent strength and accuracy. It offers a scalable, artificial intelligence driven solution for automated anesthesia delivery that could enhance clinical practice and improve patient results.
Sspherical sailing omnidirectional rover (SSailOR): wind tunnel experimental setup and results
This paper presents the design, instrumentation, and experimental procedures used to test the Spherical Sailing Omnidirectional Rover (SSailOR) in a controlled wind tunnel environment. The SSailOR is a wind-powered autonomous rover. This concept is motivated by the growing need for persistent and sustainable robotic systems in applications such as planetary exploration, Arctic observation, and military surveillance. SSailOR uses wind propulsion via onboard sails to enable long-duration mobility with minimal energy consumption. The spherical design simplifies mechanical complexity while enabling omnidirectional movement. Experimental tests were conducted to validate dynamic models and assess the aerodynamic performance of the rover under various configurations and environmental conditions. As a result, this design requires a co-design approach. Details of the mechanical structure, sensor integration, electronics, data acquisition system, and test parameters are presented in this paper. In addition, key observations are made that are relevant to the design optimization for further development of the rover.
A One-Class Explainable AI Framework for Identification of Non-Stationary Concurrent False Data Injections in Nuclear Reactor Signals
The transition of next generation advanced nuclear reactor systems from analog to fully digital instrumentation and control will necessitate robust mechanisms to safeguard against potential data integrity threats. One challenge is the real-time characterization of false data injections, which can mask sensor signals and potentially disrupt reactor control systems. While significant progress has been made in anomaly detection within reactor systems, potential false data injections have been shown to bypass conventional linear time-invariant state estimators and failure detectors based on statistical thresholds. The dynamic, nonlinear, multi-variate nature of sensor signals, combined with inherent noise and limited availability of real-world training data, makes the characterization of such threats and more importantly their differentiation from anticipated process anomalies particularly challenging. In this paper, we present an eXplainable AI (XAI) framework for identifying non-stationary concurrent replay attacks in nuclear reactor signals with minimal training data. The proposed framework leverages progress on recurrent neural networks and residual analysis coupled with a modified SHAP algorithm and rule-based correlations. The recurrent neural networks are trained only on normal operational data while for residual analysis we introduce an adaptive windowing technique to improve detection accuracy. We successfully benchmarked this framework on a real-world dataset from Purdue's nuclear reactor (PUR-1). We were able to detect false data injections with accuracy higher than 0.93 and less than 0.01 false positives, differentiate from expected process anomalies, and to identify the origin of the falsified signals.
Data-driven quantification and visualization of resilience metrics of power distribution system
This paper presents a data-driven approach for quantifying the resilience of distribution power grids to extreme weather events using two key metrics: (a) the number of outages and (b) restoration time. The method leverages historical outage records maintained by power utilities and weather measurements collected by the National Oceanic and Atmospheric Administration (NOAA) to evaluate resilience across a utility's service territory. The proposed framework consists of three stages. First, outage events are systematically extracted from the outage records by temporally and spatially aggregating coincident component outages. In the second stage, weather zones across the service territory are delineated using a Voronoi polygon approach, based on the locations of NOAA weather sensors. Finally, data-driven models for outage fragility and restoration time are developed for each weather zone. These models enable the quantification and visualization of resilience metrics under varying intensities of extreme weather events. The proposed method is demonstrated using real-world data from a US distribution utility, located in Indianapolis, focused on wind- and precipitation-related events. The dataset spans two decades and includes over 160,000 outage records.
comment: This paper has been submitted to Nature Communication Engineering
PUB: A Plasma-Propelled Ultra-Quiet Blimp with Two-DOF Vector Thrusting
This study presents the design and control of a Plasma-propelled Ultra-silence Blimp (PUB), a novel aerial robot employing plasma vector propulsion for ultra-quiet flight without mechanical propellers. The system utilizes a helium-lift platform for extended endurance and a four-layer ring asymmetric capacitor to generate ionic wind thrust. The modular propulsion units allow flexible configuration to meet mission-specific requirements, while a two-degree-of-freedom (DOF) head enables thrust vector control. A closed-loop slip control scheme is implemented for stable maneuvering. Flight experiments demonstrate full-envelope capability, including take-off, climb, hover, descent, and smooth landing, confirming the feasibility of plasma vector propulsion, the effectiveness of DOF vector control, and the stability of the control system. Owing to its low acoustic signature, structural simplicity, and high maneuverability, PUB is well suited for noise-sensitive, enclosed, and near-space applications.
Efficient and accurate solution of wind-integrated optimal power flow based on enhanced second-order cone relaxation with rolling cutting plane technique
The integration of large-scale renewable energy sources, such as wind power, poses significant challenges for the optimal operation of power systems owing to their inherent uncertainties. This paper proposes a solution framework for wind-integrated optimal power flow (OPF) that leverages an enhanced second-order cone relaxation (SOCR), supported by a rolling cutting plane technique. Initially, the wind generation cost, arising from discrepancies between scheduled and actual wind power outputs, is meticulously modeled using a Gaussian mixture model based on historical wind power data. This modelled wind generation cost is subsequently incorporated into the objective function of the conventional OPF problem. To achieve the efficient and accurate solution for the wind-integrated OPF, effectively managing the constraints associated with AC power flow equations is essential. In this regard, a SOCR, combined with a second-order Taylor series expansion, is employed to facilitate the convex approximation of the AC power flow equations. Additionally, a warm-start strategy, grounded in a proposed rolling cutting plane technique, is devised to reduce relaxation errors and enhance computational efficiency. Finally, the effectiveness and efficiency of the proposed solution framework are demonstrated across various case studies. Specifically, the influence of wind power cost is also examined, further highlighting the practical implications of the proposed solution framework.
Semi-Infinite Programming for Collision-Avoidance in Optimal and Model Predictive Control
This paper presents a novel approach for collision avoidance in optimal and model predictive control, in which the environment is represented by a large number of points and the robot as a union of padded polygons. The conditions that none of the points shall collide with the robot can be written in terms of an infinite number of constraints per obstacle point. We show that the resulting semi-infinite programming (SIP) optimal control problem (OCP) can be efficiently tackled through a combination of two methods: local reduction and an external active-set method. Specifically, this involves iteratively identifying the closest point obstacles, determining the lower-level distance minimizer among all feasible robot shape parameters, and solving the upper-level finitely-constrained subproblems. In addition, this paper addresses robust collision avoidance in the presence of ellipsoidal state uncertainties. Enforcing constraint satisfaction over all possible uncertainty realizations extends the dimension of constraint infiniteness. The infinitely many constraints arising from translational uncertainty are handled by local reduction together with the robot shape parameterization, while rotational uncertainty is addressed via a backoff reformulation. A controller implemented based on the proposed method is demonstrated on a real-world robot running at 20Hz, enabling fast and collision-free navigation in tight spaces. An application to 3D collision avoidance is also demonstrated in simulation.
comment: 21 pages, 15 figures
Design and Analysis of Curved Electrode Configurations for Enhanced Sensitivity in 1-Axis MEMS Accelerometers
This paper presents a comprehensive analytical and simulation-based study of curved electrode geometries for enhancing the sensitivity of MEMS capacitive accelerometers. Expressions for the capacitance between a planar movable electrode and six distinct fixed electrode profiles (biconvex, biconcave, concavo-convex, convexo-concave, plano-convex, and plano-concave) are derived, enabling direct calculation of differential gain and sensitivity as functions of electrode curvature and gap displacement. These analytical models are then rigorously validated using finite element simulations performed using COMSOL Multiphysics under identical bias and boundary conditions. The simulation results demonstrate agreement with the analytical results with a deviation of less than 7% in all configurations. The results also reveal that biconvex curved electrodes yield the greatest sensitivity improvement over the planar electrodes, with sensitivity monotonically increasing with arc length, while concave and plano-concave designs exhibit reduced performance. The concavo-convex and convexo-concave configurations furthermore introduce polarity inversion in the output voltage, offering additional design flexibility. Importantly, these sensitivity enhancements are achieved without any change in the overall volumetric dimensions of the device or the proofmass dimensions of the module for achieving higher-resolution inertial sensing.
comment: 10 pages, 7 figures
Adaptive Control with Set-Point Tracking and Linear-like Closed-loop Behavior
In this paper, we consider the problem of set-point tracking for a discrete-time plant with unknown plant parameters belonging to a convex and compact uncertainty set. We carry out parameter estimation for an associated auxiliary plant, and a pole-placement-based control law is employed. We prove that this adaptive controller provides desirable linear-like closed-loop behavior which guarantees a bound consisting of: exponential decay with respect to the initial condition, a linear-like convolution bound with respect to the exogenous inputs, and a constant scaled by the square root of the constant in the denominator of the parameter estimator update law. This implies that the system has a bounded gain. Moreover, asymptotic tracking is also proven when the disturbance is constant.
comment: This is an extended version of a paper that will appear in the Proceedings of the 64th IEEE Conference on Decision and Control (CDC)
Understanding the Fundamental Trade-Off Between Age of Information and Throughput in Unreliable Wireless Networks
This paper characterizes the fundamental trade-off between throughput and Age of Information (AoI) in wireless networks where multiple devices transmit status updates to a central base station over unreliable channels. To address the complexity introduced by stochastic transmission successes, we propose the throughput-AoI capacity region, which defines all feasible throughput-AoI pairs achievable under any scheduling policy. Using a second-order approximation that incorporates both mean and temporal variance, we derive an outer bound and a tight inner bound for the throughput-AoI capacity region. Furthermore, we propose a simple and low complexity scheduling policy and prove that it achieves every interior point within the tight inner bound. This establishes a systematic and theoretically grounded framework for the joint optimization of throughput and information freshness in practical wireless communication scenarios. To validate our theoretical framework and demonstrate the utility of the throughput-AoI capacity region, extensive simulations are implemented. Simulation results demonstrate that our proposed policy significantly outperforms conventional methods across various practical network optimization scenarios. The findings highlight our approach's effectiveness in optimizing both throughput and AoI, underscoring its applicability and robustness in practical wireless networks.
Two-Instrument Screening under Soft Budget Constraints
We study soft budget constraints in multi-tier public finance when an upper-tier government uses two instruments: an ex-ante grant schedule and an ex-post rescue. Under convex rescue costs and standard primitives, the three-stage leader-follower problem collapses to one dimensional screening with a single allocation index: the cap on realized rescue. A hazard-based characterization delivers a unified rule that nests (i) no rescue, (ii) a threshold-cap with commitment, and (iii) a threshold--linear--cap without commitment. The knife-edge for eliminating bailouts compares the marginal cost at the origin to the supremum of a virtual weight, and the comparative statics show how greater curvature tightens caps while discretion shifts transfers toward front loading by lowering the effective grant weight. The framework provides a portable benchmark for mechanism design and yields testable implications for policy and empirical work on intergovernmental finance.
comment: arXiv admin note: text overlap with arXiv:2508.02171
Towards Infant Sleep-Optimized Driving: Synergizing Wearable and Vehicle Sensing in Intelligent Cruise Control
Automated driving (AD) has substantially improved vehicle safety and driving comfort, but their impact on passenger well-being, particularly infant sleep, is not sufficiently studied. Sudden acceleration, abrupt braking, and sharp maneuvers can disrupt infant sleep, compromising both passenger comfort and parental convenience. To solve this problem, this paper explores the integration of reinforcement learning (RL) within AD to personalize driving behavior and optimally balance occupant comfort and travel efficiency. In particular, we propose an intelligent cruise control framework that adapts to varying driving conditions to enhance infant sleep quality by effectively synergizing wearable sensing and vehicle data. Long short-term memory (LSTM) and transformer-based neural networks are integrated with RL to model the relationship between driving behavior and infant sleep quality under diverse traffic and road conditions. Based on the sleep quality indicators from the wearable sensors, driving action data from vehicle controllers, and map data from map applications, the model dynamically computes the optimal driving aggressiveness level, which is subsequently translated into specific AD control strategies, e.g., the magnitude and frequency of acceleration, lane change, and overtaking. Simulation experiments conducted in the CARLA environment indicate that the proposed solution significantly improves infant sleep quality compared to baseline methods, while preserving desirable travel efficiency.
Joint Power and Spectrum Orchestration for D2D Semantic Communication Underlying Energy-Efficient Cellular Networks
Semantic communication (SemCom) has been recently deemed a promising next-generation wireless technique to enable efficient spectrum savings and information exchanges, thus naturally introducing a novel and practical network paradigm where cellular and device-to-device (D2D) SemCom approaches coexist. Nevertheless, the involved wireless resource management becomes complicated and challenging due to the unique semantic performance measurements and energy-consuming semantic coding mechanism. To this end, this paper jointly investigates power control and spectrum reuse problems for energy-efficient D2D SemCom cellular networks. Concretely, we first model the user preference-aware semantic triplet transmission and leverage a novel metric of semantic value to identify the semantic information importance conveyed in SemCom. Then, we define the additional power consumption from semantic encoding in conjunction with basic power amplifier dissipation to derive the overall system energy efficiency (semantic-value/Joule). Next, we formulate an energy efficiency maximization problem for joint power and spectrum allocation subject to several SemCom-related and practical constraints. Afterward, we propose an optimal resource management solution by employing the fractional-to-subtractive problem transformation and decomposition while developing a three-stage method with theoretical analysis of its optimality guarantee and computational complexity. Numerical results demonstrate the adequate performance superiority of our proposed solution compared with different benchmarks.
comment: This paper has been submitted to IEEE Trans. on Wireless Communications for the third round of peer review after minor revisions
HuB: Learning Extreme Humanoid Balance
The human body demonstrates exceptional motor capabilities-such as standing steadily on one foot or performing a high kick with the leg raised over 1.5 meters-both requiring precise balance control. While recent research on humanoid control has leveraged reinforcement learning to track human motions for skill acquisition, applying this paradigm to balance-intensive tasks remains challenging. In this work, we identify three key obstacles: instability from reference motion errors, learning difficulties due to morphological mismatch, and the sim-to-real gap caused by sensor noise and unmodeled dynamics. To address these challenges, we propose HuB (Humanoid Balance), a unified framework that integrates reference motion refinement, balance-aware policy learning, and sim-to-real robustness training, with each component targeting a specific challenge. We validate our approach on the Unitree G1 humanoid robot across challenging quasi-static balance tasks, including extreme single-legged poses such as Swallow Balance and Bruce Lee's Kick. Our policy remains stable even under strong physical disturbances-such as a forceful soccer strike-while baseline methods consistently fail to complete these tasks. Project website: https://hub-robot.github.io
comment: CoRL 2025 (Oral Presentation). Project website: https://hub-robot.github.io
Self-Tuning PID Control via a Hybrid Actor-Critic-Based Neural Structure for Quadcopter Control
Proportional-Integrator-Derivative (PID) controller is used in a wide range of industrial and experimental processes. There are a couple of offline methods for tuning PID gains. However, due to the uncertainty of model parameters and external disturbances, real systems such as Quadrotors need more robust and reliable PID controllers. In this research, a self-tuning PID controller using a Reinforcement-Learning-based Neural Network for attitude and altitude control of a Quadrotor has been investigated. An Incremental PID, which contains static and dynamic gains, has been considered and only the variable gains have been tuned. To tune dynamic gains, a model-free actor-critic-based hybrid neural structure was used that was able to properly tune PID gains, and also has done the best as an identifier. In both tunning and identification tasks, a Neural Network with two hidden layers and sigmoid activation functions has been learned using Adaptive Momentum (ADAM) optimizer and Back-Propagation (BP) algorithm. This method is online, able to tackle disturbance, and fast in training. In addition to robustness to mass uncertainty and wind gust disturbance, results showed that the proposed method had a better performance when compared to a PID controller with constant gains.
comment: 7 pages, 18 figures, The 30th Annual International Conference of Iranian Society of Mechanical Engineers
Towards Safe Autonomous Driving Policies using a Neuro-Symbolic Deep Reinforcement Learning Approach
The dynamic nature of driving environments and the presence of diverse road users pose significant challenges for decision-making in autonomous driving. Deep reinforcement learning (DRL) has emerged as a popular approach to tackle this problem. However, the application of existing DRL solutions is mainly confined to simulated environments due to safety concerns, impeding their deployment in real-world. To overcome this limitation, this paper introduces a novel neuro-symbolic model-free DRL approach, called DRL with Symbolic Logic (DRLSL) that combines the strengths of DRL (learning from experience) and symbolic first-order logic (knowledge-driven reasoning) to enable safe learning in real-time interactions of autonomous driving within real environments. This innovative approach provides a means to learn autonomous driving policies by actively engaging with the physical environment while ensuring safety. We have implemented the DRLSL framework in a highway driving scenario using the HighD dataset and demonstrated that our method successfully avoids unsafe actions during both the training and testing phases. Furthermore, our results indicate that DRLSL achieves faster convergence during training and exhibits better generalizability to new highway driving scenarios compared to traditional DRL methods.
comment: 15 pages, 9 figures, 1 table, 1 algorithm
Online Identification of Time-Varying Systems Using Excitation Sets and Change Point Detection
In this work, we first show that the problem of parameter identification is often ill-conditioned and lacks the persistence of excitation required for the convergence of online learning schemes. To tackle these challenges, we introduce the notion of optimal and greedy excitation sets which contain data points with sufficient richness to aid in the identification task. We then present the greedy excitation set-based recursive least squares algorithm to alleviate the problem of the lack of persistent excitation, and prove that the iterates generated by the proposed algorithm minimize an auxiliary weighted least squares cost function. When data points are generated from time-varying parameters, online estimators tend to underfit the true parameter trajectory, and their predictability deteriorates. To tackle this problem, we propose a memory resetting scheme leveraging change point detection techniques. Finally, we illustrate the performance of the proposed algorithms via several numerical case studies to learn the (time-varying) parameters of networked epidemic dynamics, and compare it with results obtained using conventional approaches.
Robotics
Mechanical Automation with Vision: A Design for Rubik's Cube Solver
The core mechanical system is built around three stepper motors for physical manipulation, a microcontroller for hardware control, a camera and YOLO detection model for real-time cube state detection. A significant software component is the development of a user-friendly graphical user interface (GUI) designed in Unity. The initial state after detection from real-time YOLOv8 model (Precision 0.98443, Recall 0.98419, Box Loss 0.42051, Class Loss 0.2611) is virtualized on GUI. To get the solution, the system employs the Kociemba's algorithm while physical manipulation with a single degree of freedom is done by combination of stepper motors' interaction with the cube achieving the average solving time of ~2.2 minutes.
comment: Presented at the 15th IOE Graduate Conference, Tribhuvan University, May 2024. Original paper available at https://conference.ioe.edu.np/publications/ioegc15/IOEGC-15-023-C1-2-42.pdf
Autonomous Oil Spill Response Through Liquid Neural Trajectory Modeling and Coordinated Marine Robotics
Marine oil spills pose grave environmental and economic risks, threatening marine ecosystems, coastlines, and dependent industries. Predicting and managing oil spill trajectories is highly complex, due to the interplay of physical, chemical, and environmental factors such as wind, currents, and temperature, which makes timely and effective response challenging. Accurate real-time trajectory forecasting and coordinated mitigation are vital for minimizing the impact of these disasters. This study introduces an integrated framework combining a multi-agent swarm robotics system built on the MOOS-IvP platform with Liquid Time-Constant Neural Networks (LTCNs). The proposed system fuses adaptive machine learning with autonomous marine robotics, enabling real-time prediction, dynamic tracking, and rapid response to evolving oil spills. By leveraging LTCNs--well-suited for modeling complex, time-dependent processes--the framework achieves real-time, high-accuracy forecasts of spill movement. Swarm intelligence enables decentralized, scalable, and resilient decision-making among robot agents, enhancing collective monitoring and containment efforts. Our approach was validated using data from the Deepwater Horizon spill, where the LTC-RK4 model achieved 0.96 spatial accuracy, surpassing LSTM approaches by 23%. The integration of advanced neural modeling with autonomous, coordinated robotics demonstrates substantial improvements in prediction precision, flexibility, and operational scalability. Ultimately, this research advances the state-of-the-art for sustainable, autonomous oil spill management and environmental protection by enhancing both trajectory prediction and response coordination.
comment: 30 pages, 40 figures. Framework combining Liquid Time-Constant Neural Networks with autonomous marine robotics for oil spill trajectory prediction and response coordination
Geodesic Tracing-Based Kinematic Integration of Rolling and Sliding Contact on Manifold Meshes for Dexterous In-Hand Manipulation
Reasoning about rolling and sliding contact, or roll-slide contact for short, is critical for dexterous manipulation tasks that involve intricate geometries. But existing works on roll-slide contact mostly focus on continuous shapes with differentiable parametrizations. This work extends roll-slide contact modeling to manifold meshes. Specifically, we present an integration scheme based on geodesic tracing to first-order time-integrate roll-slide contact directly on meshes, enabling dexterous manipulation to reason over high-fidelity discrete representations of an object's true geometry. Using our method, we planned dexterous motions of a multi-finger robotic hand manipulating five objects in-hand in simulation. The planning was achieved with a least-squares optimizer that strives to maintain the most stable instantaneous grasp by minimizing contact sliding and spinning. Then, we evaluated our method against a baseline using collision detection and a baseline using primitive shapes. The results show that our method performed the best in accuracy and precision, even for coarse meshes. We conclude with a future work discussion on incorporating multiple contacts and contact forces to achieve accurate and robust mesh-based surface contact modeling.
Tactile Gesture Recognition with Built-in Joint Sensors for Industrial Robots
While gesture recognition using vision or robot skins is an active research area in Human-Robot Collaboration (HRC), this paper explores deep learning methods relying solely on a robot's built-in joint sensors, eliminating the need for external sensors. We evaluated various convolutional neural network (CNN) architectures and collected two datasets to study the impact of data representation and model architecture on the recognition accuracy. Our results show that spectrogram-based representations significantly improve accuracy, while model architecture plays a smaller role. We also tested generalization to new robot poses, where spectrogram-based models performed better. Implemented on a Franka Emika Research robot, two of our methods, STFT2DCNN and STT3DCNN, achieved over 95% accuracy in contact detection and gesture classification. These findings demonstrate the feasibility of external-sensor-free tactile recognition and promote further research toward cost-effective, scalable solutions for HRC.
PUB: A Plasma-Propelled Ultra-Quiet Blimp with Two-DOF Vector Thrusting
This study presents the design and control of a Plasma-propelled Ultra-silence Blimp (PUB), a novel aerial robot employing plasma vector propulsion for ultra-quiet flight without mechanical propellers. The system utilizes a helium-lift platform for extended endurance and a four-layer ring asymmetric capacitor to generate ionic wind thrust. The modular propulsion units allow flexible configuration to meet mission-specific requirements, while a two-degree-of-freedom (DOF) head enables thrust vector control. A closed-loop slip control scheme is implemented for stable maneuvering. Flight experiments demonstrate full-envelope capability, including take-off, climb, hover, descent, and smooth landing, confirming the feasibility of plasma vector propulsion, the effectiveness of DOF vector control, and the stability of the control system. Owing to its low acoustic signature, structural simplicity, and high maneuverability, PUB is well suited for noise-sensitive, enclosed, and near-space applications.
SIGN: Safety-Aware Image-Goal Navigation for Autonomous Drones via Reinforcement Learning
Image-goal navigation (ImageNav) tasks a robot with autonomously exploring an unknown environment and reaching a location that visually matches a given target image. While prior works primarily study ImageNav for ground robots, enabling this capability for autonomous drones is substantially more challenging due to their need for high-frequency feedback control and global localization for stable flight. In this paper, we propose a novel sim-to-real framework that leverages visual reinforcement learning (RL) to achieve ImageNav for drones. To enhance visual representation ability, our approach trains the vision backbone with auxiliary tasks, including image perturbations and future transition prediction, which results in more effective policy training. The proposed algorithm enables end-to-end ImageNav with direct velocity control, eliminating the need for external localization. Furthermore, we integrate a depth-based safety module for real-time obstacle avoidance, allowing the drone to safely navigate in cluttered environments. Unlike most existing drone navigation methods that focus solely on reference tracking or obstacle avoidance, our framework supports comprehensive navigation behaviors--autonomous exploration, obstacle avoidance, and image-goal seeking--without requiring explicit global mapping. Code and model checkpoints will be released upon acceptance.
Semi-Infinite Programming for Collision-Avoidance in Optimal and Model Predictive Control
This paper presents a novel approach for collision avoidance in optimal and model predictive control, in which the environment is represented by a large number of points and the robot as a union of padded polygons. The conditions that none of the points shall collide with the robot can be written in terms of an infinite number of constraints per obstacle point. We show that the resulting semi-infinite programming (SIP) optimal control problem (OCP) can be efficiently tackled through a combination of two methods: local reduction and an external active-set method. Specifically, this involves iteratively identifying the closest point obstacles, determining the lower-level distance minimizer among all feasible robot shape parameters, and solving the upper-level finitely-constrained subproblems. In addition, this paper addresses robust collision avoidance in the presence of ellipsoidal state uncertainties. Enforcing constraint satisfaction over all possible uncertainty realizations extends the dimension of constraint infiniteness. The infinitely many constraints arising from translational uncertainty are handled by local reduction together with the robot shape parameterization, while rotational uncertainty is addressed via a backoff reformulation. A controller implemented based on the proposed method is demonstrated on a real-world robot running at 20Hz, enabling fast and collision-free navigation in tight spaces. An application to 3D collision avoidance is also demonstrated in simulation.
comment: 21 pages, 15 figures
Implementation and evaluation of a prediction algorithm for an autonomous vehicle
This paper presents a prediction algorithm that estimates the vehicle trajectory every five milliseconds for an autonomous vehicle. A kinematic and a dynamic bicycle model are compared, with the dynamic model exhibiting superior accuracy at higher speeds. Vehicle parameters such as mass, center of gravity, moment of inertia, and cornering stiffness are determined experimentally. For cornering stiffness, a novel measurement procedure using optical position tracking is introduced. The model is incorporated into an extended Kalman filter and implemented in a ROS node in C++. The algorithm achieves a positional deviation of only 1.25 cm per meter over the entire test drive and is up to 82.6% more precise than the kinematic model.
comment: 7 pages, 7 figures
Adjustable AprilTags For Identity Secured Tasks
Special tags such as AprilTags that facilitate image processing and pattern recognition are useful in practical applications. In close and private environments, identity security is unlikely to be an issue because all involved AprilTags can be completely regulated. However, in open and public environments, identity security is no longer an issue that can be neglected. To handle potential harm caused by adversarial attacks, this note advocates utilization of adjustable AprilTags instead of fixed ones.
A robust and compliant robotic assembly control strategy for batch precision assembly task with uncertain fit types and fit amounts
In some high-precision industrial applications, robots are deployed to perform precision assembly tasks on mass batches of manufactured pegs and holes. If the peg and hole are designed with transition fit, machining errors may lead to either a clearance or an interference fit for a specific pair of components, with uncertain fit amounts. This paper focuses on the robotic batch precision assembly task involving components with uncertain fit types and fit amounts, and proposes an efficient methodology to construct the robust and compliant assembly control strategy. Specifically, the batch precision assembly task is decomposed into multiple deterministic subtasks, and a force-vision fusion controller-driven reinforcement learning method and a multi-task reinforcement learning training method (FVFC-MTRL) are proposed to jointly learn multiple compliance control strategies for these subtasks. Subsequently, the multi-teacher policy distillation approach is designed to integrate multiple trained strategies into a unified student network, thereby establishing a robust control strategy. Real-world experiments demonstrate that the proposed method successfully constructs the robust control strategy for high-precision assembly task with different fit types and fit amounts. Moreover, the MTRL framework significantly improves training efficiency, and the final developed control strategy achieves superior force compliance and higher success rate compared with many existing methods.
Bimanual Robot-Assisted Dressing: A Spherical Coordinate-Based Strategy for Tight-Fitting Garments
Robot-assisted dressing is a popular but challenging topic in the field of robotic manipulation, offering significant potential to improve the quality of life for individuals with mobility limitations. Currently, the majority of research on robot-assisted dressing focuses on how to put on loose-fitting clothing, with little attention paid to tight garments. For the former, since the armscye is larger, a single robotic arm can usually complete the dressing task successfully. However, for the latter, dressing with a single robotic arm often fails due to the narrower armscye and the property of diminishing rigidity in the armscye, which eventually causes the armscye to get stuck. This paper proposes a bimanual dressing strategy suitable for dressing tight-fitting clothing. To facilitate the encoding of dressing trajectories that adapt to different human arm postures, a spherical coordinate system for dressing is established. We uses the azimuthal angle of the spherical coordinate system as a task-relevant feature for bimanual manipulation. Based on this new coordinate, we employ Gaussian Mixture Model (GMM) and Gaussian Mixture Regression (GMR) for imitation learning of bimanual dressing trajectories, generating dressing strategies that adapt to different human arm postures. The effectiveness of the proposed method is validated through various experiments.
comment: 8 pages, 41 figures
Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids
Simulation-based reinforcement learning (RL) has significantly advanced humanoid locomotion tasks, yet direct real-world RL from scratch or adapting from pretrained policies remains rare, limiting the full potential of humanoid robots. Real-world learning, despite being crucial for overcoming the sim-to-real gap, faces substantial challenges related to safety, reward design, and learning efficiency. To address these limitations, we propose Robot-Trains-Robot (RTR), a novel framework where a robotic arm teacher actively supports and guides a humanoid robot student. The RTR system provides protection, learning schedule, reward, perturbation, failure detection, and automatic resets. It enables efficient long-term real-world humanoid training with minimal human intervention. Furthermore, we propose a novel RL pipeline that facilitates and stabilizes sim-to-real transfer by optimizing a single dynamics-encoded latent variable in the real world. We validate our method through two challenging real-world humanoid tasks: fine-tuning a walking policy for precise speed tracking and learning a humanoid swing-up task from scratch, illustrating the promising capabilities of real-world humanoid learning realized by RTR-style systems. See https://robot-trains-robot.github.io/ for more info.
comment: Accepted to The Conference on Robot Learning (CoRL) 2025
Improving Pre-Trained Vision-Language-Action Policies with Model-Based Search
Pre-trained vision-language-action (VLA) models offer a promising foundation for generalist robot policies, but often produce brittle behaviours or unsafe failures when deployed zero-shot in out-of-distribution scenarios. We present Vision-Language-Action Planning & Search (VLAPS) -- a novel framework and accompanying algorithms that embed model-based search into the inference procedure of pre-trained VLA policies to improve their performance on robotic tasks. Specifically, our method biases a modified Monte Carlo Tree Search (MCTS) algorithm -- run using a model of the target environment -- using action priors defined by the VLA policy. By using VLA-derived abstractions and priors in model-based search, VLAPS efficiently explores language-conditioned robotics tasks whose search spaces would otherwise be intractably large. Conversely, by integrating model-based search with the VLA policy's inference procedure, VLAPS yields behaviours that are more performant than those obtained by directly following the VLA policy's action predictions. VLAPS offers a principled framework to: i) control test-time compute in VLA models, ii) leverage a priori knowledge of the robotic environment, and iii) integrate established planning and reinforcement learning techniques into the VLA inference process. Across all experiments, VLAPS significantly outperforms VLA-only baselines on language-specified tasks that would otherwise be intractable for uninformed search algorithms, increasing success rates by as much as 67 percentage points.
Self-Guided Action Diffusion
Recent works have shown the promise of inference-time search over action samples for improving generative robot policies. In particular, optimizing cross-chunk coherence via bidirectional decoding has proven effective in boosting the consistency and reactivity of diffusion policies. However, this approach remains computationally expensive as the diversity of sampled actions grows. In this paper, we introduce self-guided action diffusion, a more efficient variant of bidirectional decoding tailored for diffusion-based policies. At the core of our method is to guide the proposal distribution at each diffusion step based on the prior decision. Experiments in simulation tasks show that the proposed self-guidance enables near-optimal performance at negligible inference cost. Notably, under a tight sampling budget, our method achieves up to 70% higher success rates than existing counterparts on challenging dynamic tasks. See project website at https://rhea-mal.github.io/selfgad.github.io.
Humanoid Motion Scripting with Postural Synergies
Generating sequences of human-like motions for humanoid robots presents challenges in collecting and analyzing reference human motions, synthesizing new motions based on these reference motions, and mapping the generated motion onto humanoid robots. To address these issues, we introduce SynSculptor, a humanoid motion analysis and editing framework that leverages postural synergies for training-free human-like motion scripting. To analyze human motion, we collect 3+ hours of motion capture data across 20 individuals where a real-time operational space controller mimics human motion on a simulated humanoid robot. The major postural synergies are extracted using principal component analysis (PCA) for velocity trajectories segmented by changes in robot momentum, constructing a style-conditioned synergy library for free-space motion generation. To evaluate generated motions using the synergy library, the foot-sliding ratio and proposed metrics for motion smoothness involving total momentum and kinetic energy deviations are computed for each generated motion, and compared with reference motions. Finally, we leverage the synergies with a motion-language transformer, where the humanoid, during execution of motion tasks with its end-effectors, adapts its posture based on the chosen synergy. Supplementary material, code, and videos are available at https://rhea-mal.github.io/humanoidsynergies.io.
Efficient Environment Design for Multi-Robot Navigation via Continuous Control
Multi-robot navigation and path planning in continuous state and action spaces with uncertain environments remains an open challenge. Deep Reinforcement Learning (RL) is one of the most popular paradigms for solving this task, but its real-world application has been limited due to sample inefficiency and long training periods. Moreover, the existing works using RL for multi-robot navigation lack formal guarantees while designing the environment. In this paper, we introduce an efficient and highly customizable environment for continuous-control multi-robot navigation, where the robots must visit a set of regions of interest (ROIs) by following the shortest paths. The task is formally modeled as a Markov Decision Process (MDP). We describe the multi-robot navigation task as an optimization problem and relate it to finding an optimal policy for the MDP. We crafted several variations of the environment and measured the performance using both gradient and non-gradient based RL methods: A2C, PPO, TRPO, TQC, CrossQ and ARS. To show real-world applicability, we deployed our environment to a 3-D agricultural field with uncertainties using the CoppeliaSim robot simulator and measured the robustness by running inference on the learned models. We believe our work will guide the researchers on how to develop MDP-based environments that are applicable to real-world systems and solve them using the existing state-of-the-art RL methods with limited resources and within reasonable time periods.
comment: 12 pages, 3 figures, conference
Towards Infant Sleep-Optimized Driving: Synergizing Wearable and Vehicle Sensing in Intelligent Cruise Control
Automated driving (AD) has substantially improved vehicle safety and driving comfort, but their impact on passenger well-being, particularly infant sleep, is not sufficiently studied. Sudden acceleration, abrupt braking, and sharp maneuvers can disrupt infant sleep, compromising both passenger comfort and parental convenience. To solve this problem, this paper explores the integration of reinforcement learning (RL) within AD to personalize driving behavior and optimally balance occupant comfort and travel efficiency. In particular, we propose an intelligent cruise control framework that adapts to varying driving conditions to enhance infant sleep quality by effectively synergizing wearable sensing and vehicle data. Long short-term memory (LSTM) and transformer-based neural networks are integrated with RL to model the relationship between driving behavior and infant sleep quality under diverse traffic and road conditions. Based on the sleep quality indicators from the wearable sensors, driving action data from vehicle controllers, and map data from map applications, the model dynamically computes the optimal driving aggressiveness level, which is subsequently translated into specific AD control strategies, e.g., the magnitude and frequency of acceleration, lane change, and overtaking. Simulation experiments conducted in the CARLA environment indicate that the proposed solution significantly improves infant sleep quality compared to baseline methods, while preserving desirable travel efficiency.
RNBF: Real-Time RGB-D Based Neural Barrier Functions for Safe Robotic Navigation
Autonomous safe navigation in unstructured and novel environments poses significant challenges, especially when environment information can only be provided through low-cost vision sensors. Although safe reactive approaches have been proposed to ensure robot safety in complex environments, many base their theory off the assumption that the robot has prior knowledge on obstacle locations and geometries. In this paper, we present a real-time, vision-based framework that constructs continuous, first-order differentiable Signed Distance Fields (SDFs) of unknown environments entirely online, without any pre-training, and is fully compatible with established SDF-based reactive controllers. To achieve robust performance under practical sensing conditions, our approach explicitly accounts for noise in affordable RGB-D cameras, refining the neural SDF representation online for smoother geometry and stable gradient estimates. We validate the proposed method in simulation and real-world experiments using a Fetch robot.
SLAG: Scalable Language-Augmented Gaussian Splatting
Language-augmented scene representations hold great promise for large-scale robotics applications such as search-and-rescue, smart cities, and mining. Many of these scenarios are time-sensitive, requiring rapid scene encoding while also being data-intensive, necessitating scalable solutions. Deploying these representations on robots with limited computational resources further adds to the challenge. To address this, we introduce SLAG, a multi-GPU framework for language-augmented Gaussian splatting that enhances the speed and scalability of embedding large scenes. Our method integrates 2D visual-language model features into 3D scenes using SAM and CLIP. Unlike prior approaches, SLAG eliminates the need for a loss function to compute per-Gaussian language embeddings. Instead, it derives embeddings from 3D Gaussian scene parameters via a normalized weighted average, enabling highly parallelized scene encoding. Additionally, we introduce a vector database for efficient embedding storage and retrieval. Our experiments show that SLAG achieves an 18 times speedup in embedding computation on a 16-GPU setup compared to OpenGaussian, while preserving embedding quality on the ScanNet and LERF datasets. For more details, visit our project website: https://slag-project.github.io/.
The Foundational Pose as a Selection Mechanism for the Design of Tool-Wielding Multi-Finger Robotic Hands
To wield an object means to hold and move it in a way that exploits its functions. When humans wield tools -- such as writing with a pen or cutting with scissors -- our hands would reach very specific poses, often drastically different from how we pick up the same objects just to transport them. In this work, we investigate the design of tool-wielding multi-finger robotic hand through a hypothesis: If a hand can kinematically reach a foundational pose (FP) with a tool, then it can wield the tool from that FP. We interpret FPs as snapshots that capture the workings of underlying parallel mechanisms formed by the tool and the hand, and one hand can form multiple mechanisms with the same tool. We tested our hypothesis in a hand design experiment, where we developed a sampling-based multi-objective design optimization framework that uses three FPs to computationally generate many different hand designs and evaluate them. The results show that 10,785 out of the 100,480 hand designs we sampled reached the FPs; more than 99\% of the 10,785 hands that reached the FPs successfully wielded tools, supporting our hypothesis. Meanwhile, our methods provide insights into the non-convex, multi-objective hand design optimization problem -- such as clustering and the Pareto front -- that could be hard to unveil with methods that return a single ``optimal" design. Lastly, we demonstrate our methods' real-world feasibility and potential with a hardware prototype equipped with rigid endoskeleton and soft skin.
LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
Large language models (LLMs) have shown remarkable abilities in logical reasoning, in-context learning, and code generation. However, translating natural language instructions into effective robotic control policies remains a significant challenge, especially for tasks requiring long-horizon planning and operating under sparse reward conditions. Hierarchical Reinforcement Learning (HRL) provides a natural framework to address this challenge in robotics; however, it typically suffers from non-stationarity caused by the changing behavior of the lower-level policy during training, destabilizing higher-level policy learning. We introduce LGR2, a novel HRL framework that leverages LLMs to generate language-guided reward functions for the higher-level policy. By decoupling high-level reward generation from low-level policy changes, LGR2 fundamentally mitigates the non-stationarity problem in off-policy HRL, enabling stable and efficient learning. To further enhance sample efficiency in sparse environments, we integrate goal-conditioned hindsight experience relabeling. Extensive experiments across simulated and real-world robotic navigation and manipulation tasks demonstrate LGR2 outperforms both hierarchical and non-hierarchical baselines, achieving over 55% success rates on challenging tasks and robust transfer to real robots, without additional fine-tuning.
Unravelling Responsibility for AI
It is widely acknowledged that we need to establish where responsibility lies for the outputs and impacts of AI-enabled systems. This is important to achieve justice and compensation for victims of AI harms, and to inform policy and engineering practice. But without a clear, thorough understanding of what "responsibility" means, deliberations about where responsibility lies will be, at best, unfocused and incomplete and, at worst, misguided. Furthermore, AI-enabled systems exist within a wider ecosystem of actors, decisions, and governance structures, giving rise to complex networks of responsibility relations. To address these issues, this paper presents a conceptual framework of responsibility, accompanied with a graphical notation and general methodology for visualising these responsibility networks and for tracing different responsibility attributions for AI. Taking the three-part formulation "Actor A is responsible for Occurrence O," the framework unravels the concept of responsibility to clarify that there are different possibilities of who is responsible for AI, senses in which they are responsible, and aspects of events they are responsible for. The notation allows these permutations to be represented graphically. The methodology enables users to apply the framework to specific scenarios. The aim is to offer a foundation to support stakeholders from diverse disciplinary backgrounds to discuss and address complex responsibility questions in hypothesised and real-world cases involving AI. The work is illustrated by application to a fictitious scenario of a fatal collision between a crewless, AI-enabled maritime vessel in autonomous mode and a traditional, crewed vessel at sea.
HuB: Learning Extreme Humanoid Balance
The human body demonstrates exceptional motor capabilities-such as standing steadily on one foot or performing a high kick with the leg raised over 1.5 meters-both requiring precise balance control. While recent research on humanoid control has leveraged reinforcement learning to track human motions for skill acquisition, applying this paradigm to balance-intensive tasks remains challenging. In this work, we identify three key obstacles: instability from reference motion errors, learning difficulties due to morphological mismatch, and the sim-to-real gap caused by sensor noise and unmodeled dynamics. To address these challenges, we propose HuB (Humanoid Balance), a unified framework that integrates reference motion refinement, balance-aware policy learning, and sim-to-real robustness training, with each component targeting a specific challenge. We validate our approach on the Unitree G1 humanoid robot across challenging quasi-static balance tasks, including extreme single-legged poses such as Swallow Balance and Bruce Lee's Kick. Our policy remains stable even under strong physical disturbances-such as a forceful soccer strike-while baseline methods consistently fail to complete these tasks. Project website: https://hub-robot.github.io
comment: CoRL 2025 (Oral Presentation). Project website: https://hub-robot.github.io
LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios
Ensuring the safety and robustness of autonomous driving systems necessitates a comprehensive evaluation in safety-critical scenarios. However, these safety-critical scenarios are rare and difficult to collect from real-world driving data, posing significant challenges to effectively assessing the performance of autonomous vehicles. Typical existing methods often suffer from limited controllability and lack user-friendliness, as extensive expert knowledge is essentially required. To address these challenges, we propose LD-Scene, a novel framework that integrates Large Language Models (LLMs) with Latent Diffusion Models (LDMs) for user-controllable adversarial scenario generation through natural language. Our approach comprises an LDM that captures realistic driving trajectory distributions and an LLM-based guidance module that translates user queries into adversarial loss functions, facilitating the generation of scenarios aligned with user queries. The guidance module integrates an LLM-based Chain-of-Thought (CoT) code generator and an LLM-based code debugger, enhancing the controllability and robustness in generating guidance functions. Extensive experiments conducted on the nuScenes dataset demonstrate that LD-Scene achieves state-of-the-art performance in generating realistic, diverse, and effective adversarial scenarios. Furthermore, our framework provides fine-grained control over adversarial behaviors, thereby facilitating more effective testing tailored to specific driving scenarios.
comment: 18 pages, 8 figures
Self-Tuning PID Control via a Hybrid Actor-Critic-Based Neural Structure for Quadcopter Control
Proportional-Integrator-Derivative (PID) controller is used in a wide range of industrial and experimental processes. There are a couple of offline methods for tuning PID gains. However, due to the uncertainty of model parameters and external disturbances, real systems such as Quadrotors need more robust and reliable PID controllers. In this research, a self-tuning PID controller using a Reinforcement-Learning-based Neural Network for attitude and altitude control of a Quadrotor has been investigated. An Incremental PID, which contains static and dynamic gains, has been considered and only the variable gains have been tuned. To tune dynamic gains, a model-free actor-critic-based hybrid neural structure was used that was able to properly tune PID gains, and also has done the best as an identifier. In both tunning and identification tasks, a Neural Network with two hidden layers and sigmoid activation functions has been learned using Adaptive Momentum (ADAM) optimizer and Back-Propagation (BP) algorithm. This method is online, able to tackle disturbance, and fast in training. In addition to robustness to mass uncertainty and wind gust disturbance, results showed that the proposed method had a better performance when compared to a PID controller with constant gains.
comment: 7 pages, 18 figures, The 30th Annual International Conference of Iranian Society of Mechanical Engineers
Towards Safe Autonomous Driving Policies using a Neuro-Symbolic Deep Reinforcement Learning Approach
The dynamic nature of driving environments and the presence of diverse road users pose significant challenges for decision-making in autonomous driving. Deep reinforcement learning (DRL) has emerged as a popular approach to tackle this problem. However, the application of existing DRL solutions is mainly confined to simulated environments due to safety concerns, impeding their deployment in real-world. To overcome this limitation, this paper introduces a novel neuro-symbolic model-free DRL approach, called DRL with Symbolic Logic (DRLSL) that combines the strengths of DRL (learning from experience) and symbolic first-order logic (knowledge-driven reasoning) to enable safe learning in real-time interactions of autonomous driving within real environments. This innovative approach provides a means to learn autonomous driving policies by actively engaging with the physical environment while ensuring safety. We have implemented the DRLSL framework in a highway driving scenario using the HighD dataset and demonstrated that our method successfully avoids unsafe actions during both the training and testing phases. Furthermore, our results indicate that DRLSL achieves faster convergence during training and exhibits better generalizability to new highway driving scenarios compared to traditional DRL methods.
comment: 15 pages, 9 figures, 1 table, 1 algorithm
Solving Stochastic Orienteering Problems with Chance Constraints Using a GNN Powered Monte Carlo Tree Search
Leveraging the power of a graph neural network (GNN) with message passing, we present a Monte Carlo Tree Search (MCTS) method to solve stochastic orienteering problems with chance constraints. While adhering to an assigned travel budget the algorithm seeks to maximize collected reward while incurring stochastic travel costs. In this context, the acceptable probability of exceeding the assigned budget is expressed as a chance constraint. Our MCTS solution is an online and anytime algorithm alternating planning and execution that determines the next vertex to visit by continuously monitoring the remaining travel budget. The novelty of our work is that the rollout phase in the MCTS framework is implemented using a message passing GNN, predicting both the utility and failure probability of each available action. This allows to enormously expedite the search process. Our experimental evaluation shows that with the proposed method and architecture we manage to efficiently solve complex problem instances while incurring in moderate losses in terms of collected reward. Moreover, we demonstrate how the approach is capable of generalizing beyond the characteristics of the training dataset. The paper's website, open-source code, and supplementary documentation can be found at ucmercedrobotics.github.io/gnn-sop.
comment: 8 pages, 6 figures
Multiagent Systems
The Yokai Learning Environment: Tracking Beliefs Over Space and Time IJCAI 2025
Developing collaborative AI hinges on Theory of Mind (ToM) - the ability to reason about the beliefs of others to build and maintain common ground. Existing ToM benchmarks, however, are restricted to passive observer settings or lack an assessment of how agents establish and maintain common ground over time. To address these gaps, we introduce the Yokai Learning Environment (YLE) - a multi-agent reinforcement learning (RL) environment based on the cooperative card game Yokai. In the YLE, agents take turns peeking at hidden cards and moving them to form clusters based on colour. Success requires tracking evolving beliefs, remembering past observations, using hints as grounded communication, and maintaining common ground with teammates. Our evaluation yields two key findings: First, current RL agents struggle to solve the YLE, even when given access to perfect memory. Second, while belief modelling improves performance, agents are still unable to effectively generalise to unseen partners or form accurate beliefs over longer games, exposing a reliance on brittle conventions rather than robust belief tracking. We use the YLE to investigate research questions in belief modelling, memory, partner generalisation, and scaling to higher-order ToM.
comment: Presented at the the ToM IJCAI 2025 Workshop
EXOTIC: An Exact, Optimistic, Tree-Based Algorithm for Min-Max Optimization
Min-max optimization arises in many domains such as game theory, adversarial machine learning, etc., with gradient-based methods as a typical computational tool. Beyond convex-concave min-max optimization, the solutions found by gradient-based methods may be arbitrarily far from global optima. In this work, we present an algorithmic apparatus for computing globally optimal solutions in convex-non-concave and non-convex-concave min-max optimization. For former, we employ a reformulation that transforms it into a non-concave-convex max-min optimization problem with suitably defined feasible sets and objective function. The new form can be viewed as a generalization of Sion's minimax theorem. Next, we introduce EXOTIC-an Exact, Optimistic, Tree-based algorithm for solving the reformulated max-min problem. EXOTIC employs an iterative convex optimization solver to (approximately) solve the inner minimization and a hierarchical tree search for the outer maximization to optimistically select promising regions to search based on the approximate solution returned by convex optimization solver. We establish an upper bound on its optimality gap as a function of the number of calls to the inner solver, the solver's convergence rate, and additional problem-dependent parameters. Both our algorithmic apparatus along with its accompanying theoretical analysis can also be applied for non-convex-concave min-max optimization. In addition, we propose a class of benchmark convex-non-concave min-max problems along with their analytical global solutions, providing a testbed for evaluating algorithms for min-max optimization. Empirically, EXOTIC outperforms gradient-based methods on this benchmark as well as on existing numerical benchmark problems from the literature. Finally, we demonstrate the utility of EXOTIC by computing security strategies in multi-player games with three or more players.
comment: 31 pages, 2 figures, 3 tables
Synchronization Dynamics of Heterogeneous, Collaborative Multi-Agent AI Systems
We present a novel interdisciplinary framework that bridges synchronization theory and multi-agent AI systems by adapting the Kuramoto model to describe the collective dynamics of heterogeneous AI agents engaged in complex task execution. By representing AI agents as coupled oscillators with both phase and amplitude dynamics, our model captures essential aspects of agent specialization, influence, and communication within networked systems. We introduce an order parameter to quantify the degree of coordination and synchronization, providing insights into how coupling strength, agent diversity, and network topology impact emergent collective behavior. Furthermore, we formalize a detailed correspondence between Chain-of-Thought prompting in AI reasoning and synchronization phenomena, unifying human-like iterative problem solving with emergent group intelligence. Through extensive simulations on all-to-all and deterministic scale-free networks, we demonstrate that increased coupling promotes robust synchronization despite heterogeneous agent capabilities, reflecting realistic collaborative AI scenarios. Our physics-informed approach establishes a rigorous mathematical foundation for designing, analyzing, and optimizing scalable, adaptive, and interpretable multi-agent AI systems. This work opens pathways for principled orchestration of agentic AI and lays the groundwork for future incorporation of learning dynamics and adaptive network architectures to further enhance system resilience and efficiency.
comment: 9 pages, 6 figures
Local Prompt Adaptation for Style-Consistent Multi-Object Generation in Diffusion Models
Diffusion models have become a powerful backbone for text-to-image generation, producing high-quality visuals from natural language prompts. However, when prompts involve multiple objects alongside global or local style instructions, the outputs often drift in style and lose spatial coherence, limiting their reliability for controlled, style-consistent scene generation. We present Local Prompt Adaptation (LPA), a lightweight, training-free method that splits the prompt into content and style tokens, then injects them selectively into the U-Net's attention layers at chosen timesteps. By conditioning object tokens early and style tokens later in the denoising process, LPA improves both layout control and stylistic uniformity without additional training cost. We conduct extensive ablations across parser settings and injection windows, finding that the best configuration -- lpa late only with a 300-650 step window -- delivers the strongest balance of prompt alignment and style consistency. On the T2I benchmark, LPA improves CLIP-prompt alignment over vanilla SDXL by +0.41% and over SD1.5 by +0.34%, with no diversity loss. On our custom 50-prompt style-rich benchmark, LPA achieves +0.09% CLIP-prompt and +0.08% CLIP-style gains over baseline. Our method is model-agnostic, easy to integrate, and requires only a single configuration change, making it a practical choice for controllable, style-consistent multi-object generation.
comment: 10 Pages,10 figures, pre-print
Systems and Control (CS)
Monotone Neural Control Barrier Certificates
This work presents a neurosymbolic framework for synthesizing and verifying safety controllers in high-dimensional monotone dynamical systems using only linear sample complexity, without requiring explicit models or conservative Lipschitz bounds. The approach combines the expressiveness of neural networks with the rigor of symbolic reasoning via barrier certificates, functional analogs of inductive invariants that formally guarantee safety. Prior data-driven methods often treat dynamics as black-box models, relying on dense state-space discretization or Lipschitz overapproximations, leading to exponential sample complexity. In contrast, monotonicity -- a pervasive structural property in many real-world systems -- provides a symbolic scaffold that simplifies both learning and verification. Exploiting order preservation reduces verification to localized boundary checks, transforming a high-dimensional problem into a tractable, low-dimensional one. Barrier certificates are synthesized using monotone neural networks -- architectures with embedded monotonicity constraints -- trained via gradient-based optimization guided by barrier conditions. This enables scalable, formally sound verification directly from simulation data, bridging black-box learning and formal guarantees within a unified neurosymbolic framework. Empirical results on three large-scale benchmarks -- a 1,000-dimensional freeway traffic model, a 50-dimensional urban traffic network, and a 13,000-dimensional power grid -- demonstrate the scalability and effectiveness of the approach in real-world, safety-critical systems.
Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
Robots equipped with rich sensor suites can localize reliably in partially-observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically--a brittle, runtime-expensive approach. Data-driven approaches--including diffusion models--learn multi-modal trajectories from demonstrations, but presuppose an accurate, always-on state estimate. We address the largely open problem: for a given task in a mapped environment, which \textit{minimal sensor subset} must be active at each location to maintain state uncertainty \textit{just low enough} to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error--eliminating external covariance rollouts. We show that this single proxy suffices for a soft-actor-critic to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.
comment: Accepted to CoRL 2025 (Conference on Robot Learning)
A layered smart sensing platform for physiologically informed human-exoskeleton interaction
Wearable exoskeletons offer transformative potential to assist mobility across diverse user groups with reduced muscle strength or other forms of impaired mobility. Yet, their deployment beyond laboratory settings remains constrained by sensing systems able to fully capture users' physiological and biomechanical states in real time. We introduce a soft, lightweight smart leg sleeve with anatomically inspired layered multimodal sensing, integrating textile-based surface electromyography (sEMG) electrodes, ultrasensitive textile strain sensors, and inertial measurement units (IMUs). Each sensing modality targets a distinct physiological layer: IMUs track joint kinematics at the skeletal level, sEMG monitors muscle activation at the muscular level, and strain sensors detect skin deformation at the cutaneous level. Together, these sensors provide real-time perception to support three core objectives: controlling personalized assistance, optimizing user effort, and safeguarding against injury risks. The system is skin-conformal, mechanically compliant, and seamlessly integrated with a custom exoskeleton (<20 g total sensor and electronics weight). We demonstrate: (1) accurate ankle joint moment estimation (RMSE = 0.13 Nm/kg), (2) real-time classification of metabolic trends (accuracy = 97.1%), and (3) injury risk detection within 100 ms (recall = 0.96), all validated on unseen users using a leave-one-subject-out protocol. This work demonstrates a lightweight, multimodal sensing architecture for next-generation human-exoskeleton interaction in controlled and semi-structured walking scenarios, with potential for scaling to broader exoskeleton applications towards intelligent, responsive, and personalized wearable robotics.
comment: 21 pages, 5 figures, 43 references
Euclidean Approach to Green-Wave Theory Applied to Traffic Signal Networks
Travel on long arterials with signalized intersections can be inefficient if not coordinated properly. As the number of signals increases, coordination becomes more challenging and traditional progression schemes tend to break down. Long progressions save travel time and fuel, reduce pollution and traffic accidents by providing a smoother flow of traffic. This paper introduces a green-wave theory that can be applied to a network of intersecting arterial roads. It enables uninterrupted flow on arbitrary long signalized arterials using a Road-to-Traveler-Feedback Device. The approach is modelled after Euclid. We define concepts such as RGW-roads (roads where vehicles traveling at the recommended speed make all traffic signals), green-arrows (representing vehicle platoons), real nodes (representing signalized intersections where RGW-roads intersect) and virtual nodes, green-wave speed, blocks, etc. - the analogue of Euclid's postulates. We then use geometric reasoning to deduce results: green-arrow lengths have a maximum value, are restricted to discrete lengths, and green-arrow laws of motion imply that select existing arterial roads can be converted to RGW-roads. The signal timings and offsets that are produced have been shown to be effective using a simulation model developed previously called RGW-SIM.
comment: 74 pages, 49 figures
Design of MIMO Lur'e oscillators via dominant system theory with application in multi-agent rhythm synchronization
This paper presents a new design framework for dynamic output-feedback controllers for Lur'e oscillation in a multiple-input multiple-output setting. We first revisit and extend dominant system theory to state-dependent rates, with the goal of deriving conditions based on linear matrix inequalities. Then, we introduce a separation principle for Lur'e oscillator design, which allows for the independent design of a state-feedback oscillator and an observer. Our proposed control synthesis is demonstrated through the rhythm synchronization in multi-agent systems, illustrating how networks of stable, heterogeneous linear agents can be driven into phase-locked rhythmic behavior.
Co-Investment with Payoff-Sharing Mechanism for Cooperative Decision-Making in Network Design Games
Network-based systems are inherently interconnected, with the design and performance of subnetworks being interdependent. However, the decisions of self-interested operators may lead to suboptimal outcomes for users and the overall system. This paper explores cooperative mechanisms that can simultaneously benefit both operators and users. We address this challenge using a game-theoretical framework that integrates both non-cooperative and cooperative game theory. In the non-cooperative stage, we propose a network design game in which subnetwork decision-makers strategically design local infrastructures. In the cooperative stage, co-investment with payoff-sharing mechanism is developed to enlarge collective benefits and fairly distribute them. To demonstrate the effectiveness of our framework, we conduct case studies on the Sioux Falls network and real-world public transport networks in Zurich and Winterthur, Switzerland. Our evaluation considers impacts on environmental sustainability, social welfare, and economic efficiency. The proposed framework provides a foundation for improving interdependent networked systems by enabling strategic cooperation among self-interested operators.
Active Fault Identification and Robust Control for Unknown Bounded Faults via Volume-Based Costs
This paper proposes a novel framework for active fault diagnosis and parameter estimation in linear systems operating in closed-loop, subject to unknown but bounded faults. The approach integrates set-membership identification with a cost function designed to accelerate fault identification. Informative excitation is achieved by minimizing the size of the parameter uncertainty set, which is approximated using ellipsoidal outer bounds. Combining this formulation with a scheduling parameter enables a transition back to nominal control as confidence in the model estimates increases. Unlike many existing methods, the proposed approach does not rely on predefined fault models. Instead, it only requires known bounds on parameter deviations and additive disturbances. Robust constraint satisfaction is guaranteed through a tube-based model predictive control scheme. Simulation results demonstrate that the method achieves faster fault detection and identification compared to passive strategies and adaptive ones based on persistent excitation constraints.
Virtual Trading in Multi-Settlement Electricity Markets
In the Day-Ahead (DA) market, suppliers sell and load-serving entities (LSEs) purchase energy commitments, with both sides adjusting for imbalances between contracted and actual deliveries in the Real-Time (RT) market. We develop a supply function equilibrium model to study how virtual trading-speculating on DA-RT price spreads without physical delivery-affects market efficiency. Without virtual trading, LSEs underbid relative to actual demand in the DA market, pushing DA prices below expected RT prices. Virtual trading narrows, and in the limit of large number traders can eliminates, this price gap. However, it does not induce quantity alignment: DA-cleared demand remains below true expected demand, as price alignment makes the LSE indifferent between markets and prompts it to reduce DA bids to avoid over-purchasing. Renewable energy suppliers cannot offset these strategic distortions. We provide empirical support to our main model implications using data from the California and New York Independent System Operators.
Optimality of Linear Policies in Distributionally Robust Linear Quadratic Control
We study a generalization of the classical discrete-time, Linear-Quadratic-Gaussian (LQG) control problem where the noise distributions affecting the states and observations are unknown and chosen adversarially from divergence-based ambiguity sets centered around a known nominal distribution. For a finite horizon model with Gaussian nominal noise and a structural assumption on the divergence that is satisfied by many examples -- including 2-Wasserstein distance, Kullback-Leibler divergence, moment-based divergences, entropy-regularized optimal transport, or Fisher (score-matching) divergence -- we prove that a control policy that is affine in the observations is optimal and the adversary's corresponding worst-case optimal distribution is Gaussian. When the nominal means are zero (as in the classical LQG model), we show that the adversary should optimally set the distribution's mean to zero and the optimal control policy becomes linear. Moreover, the adversary should optimally ``inflate" the noise by choosing covariance matrices that dominate the nominal covariance in Loewner order. Exploiting these structural properties, we develop a Frank-Wolfe algorithm whose inner step solves standard LQG subproblems via Kalman filtering and dynamic programming and show that the implementation consistently outperforms semidefinite-programming reformulations of the problem. All structural and algorithmic results extend to an infinite-horizon, average-cost formulation, yielding stationary linear policies and a time-invariant Gaussian distribution for the adversary. Lastly, we show that when the divergence is 2-Wasserstein, the entire framework remains valid when the nominal distributions are elliptical rather than Gaussian.
Beyond Fixed Morphologies: Learning Graph Policies with Trust Region Compensation in Variable Action Spaces
Trust region-based optimization methods have become foundational reinforcement learning algorithms that offer stability and strong empirical performance in continuous control tasks. Growing interest in scalable and reusable control policies translate also in a demand for morphological generalization, the ability of control policies to cope with different kinematic structures. Graph-based policy architectures provide a natural and effective mechanism to encode such structural differences. However, while these architectures accommodate variable morphologies, the behavior of trust region methods under varying action space dimensionality remains poorly understood. To this end, we conduct a theoretical analysis of trust region-based policy optimization methods, focusing on both Trust Region Policy Optimization (TRPO) and its widely used first-order approximation, Proximal Policy Optimization (PPO). The goal is to demonstrate how varying action space dimensionality influence the optimization landscape, particularly under the constraints imposed by KL-divergence or policy clipping penalties. Complementing the theoretical insights, an empirical evaluation under morphological variation is carried out using the Gymnasium Swimmer environment. This benchmark offers a systematically controlled setting for varying the kinematic structure without altering the underlying task, making it particularly well-suited to study morphological generalization.
A Communication Consistent Approach to Signal Temporal Logic Task Decomposition in Multi-Agent Systems
We consider the problem of decomposing a global task assigned to a multi-agent system, expressed as a formula within a fragment of Signal Temporal Logic (STL), under range-limited communication. Given a global task expressed as a conjunction of local tasks defined over the individual and relative states of agents in the system, we propose representing task dependencies among agents as edges of a suitably defined task graph. At the same time, range-limited communication naturally induces the definition of a communication graph that defines which agents have access to each other's states. Within these settings, inconsistencies arise when a task dependency between a pair of agents is not supported by a corresponding communication link due to the limited communication range. As a result, state feedback control laws previously derived to achieve the tasks' satisfaction can not be leveraged. We propose a task decomposition mechanism to distribute tasks assigned to pairs of non-communicating agents in the system as conjunctions of tasks defined over the relative states of communicating agents, thus enforcing consistency between task and communication graphs. Assuming the super-level sets of the predicate functions composing the STL tasks are bounded polytopes, our task decomposition mechanism can be cast as a parameter optimization problem and solved via state-of-the-art decentralized convex optimization algorithms. To guarantee the soundness of our approach, we present various conditions under which the tasks defined in the applied STL fragment are unsatisfiable, and we show sufficient conditions such that our decomposition approach yields satisfiable global tasks after decomposition.
comment: This work has been submitted to the IEEE for possible publication
Self-sustained oscillations in discrete-time relay feedback systems
We study the problem of determining self-sustained oscillations in discrete-time linear time-invariant relay feedback systems. Concretely, we are interested in predicting when such a system admits unimodal oscillations, i.e., when the output has a single-peaked period. Under the assumption that the linear system is stable and has an impulse response that is strictly monotonically decreasing on its infinite support, we take a novel approach in using the framework of total positivity to address our main question. It is shown that unimodal self-oscillations can only exist if the number of positive and negative elements in a period coincides. Based on this result, we derive conditions for the existence of such oscillations, determine bounds on their periods, and address the question of uniqueness.
comment: Update some figures by the comments from reviewers
Open-/Closed-loop Active Learning for Data-driven Predictive Control
An important question in data-driven control is how to obtain an informative dataset. In this work, we consider the problem of effective data acquisition of an unknown linear system with bounded disturbance for both open-loop and closed-loop stages. The learning objective is to minimize the volume of the set of admissible systems. First, a performance measure based on historical data and the input sequence is introduced to characterize the upper bound of the volume of the set of admissible systems. On the basis of this performance measure, an open-loop active learning strategy is proposed to minimize the volume by actively designing inputs during the open-loop stage. For the closed-loop stage, a closed-loop active learning strategy is designed to select and learn from informative closed-loop data. The efficiency of the proposed closed-loop active learning strategy is proved by showing that the unselected data cannot benefit the learning performance. Furthermore, an adaptive predictive controller is designed in accordance with the proposed data acquisition approach. The recursive feasibility and the stability of the controller are proved by analyzing the effect of the closed-loop active learning strategy. Finally, numerical examples and comparisons illustrate the effectiveness of the proposed data acquisition strategy.
Optical Wireless Ether: Enabling Controlled Dynamic Signal Propagation in OWC Systems
Optical wireless communication (OWC) leverages the terahertz-scale optical spectrum to enable ultra-fast data transfer, offering a compelling alternative to often-congested radio frequency systems. However, the highly directional nature of optical signals and their susceptibility to obstruction inherently limit coverage and reliability, particularly in dynamic indoor environments. To overcome these limitations, we propose optical wireless ether (OWE), a novel framework that transforms indoor spaces into a dynamically controllable optical propagation medium. OWE employs a distributed network of ether amplifiers (EAs), which act as optical amplifiers with programmable gain values to extend coverage through diffuse reflections while compensating for signal attenuation. A key challenge in OWE is preventing amplifier saturation from feedback loops. We rigorously derive stability constraints to guarantee system robustness. Beyond coverage extension, OWE dynamically adjusts EA gains in response to user locations and channel conditions, enhancing signal-to-noise ratio, balancing resource allocation, and suppressing interference. As the first framework to harness diffuse reflection for controllable optical propagation, we validate OWE's effectiveness through analytical modeling, simulations, and prototyping. Our work lays the foundation for robust, high-speed indoor OWC networks.
comment: Enhanced clarity of the optical channel and noise models with refined explanations, additional figures, and equations
Systems and Control (EESS)
Monotone Neural Control Barrier Certificates
This work presents a neurosymbolic framework for synthesizing and verifying safety controllers in high-dimensional monotone dynamical systems using only linear sample complexity, without requiring explicit models or conservative Lipschitz bounds. The approach combines the expressiveness of neural networks with the rigor of symbolic reasoning via barrier certificates, functional analogs of inductive invariants that formally guarantee safety. Prior data-driven methods often treat dynamics as black-box models, relying on dense state-space discretization or Lipschitz overapproximations, leading to exponential sample complexity. In contrast, monotonicity -- a pervasive structural property in many real-world systems -- provides a symbolic scaffold that simplifies both learning and verification. Exploiting order preservation reduces verification to localized boundary checks, transforming a high-dimensional problem into a tractable, low-dimensional one. Barrier certificates are synthesized using monotone neural networks -- architectures with embedded monotonicity constraints -- trained via gradient-based optimization guided by barrier conditions. This enables scalable, formally sound verification directly from simulation data, bridging black-box learning and formal guarantees within a unified neurosymbolic framework. Empirical results on three large-scale benchmarks -- a 1,000-dimensional freeway traffic model, a 50-dimensional urban traffic network, and a 13,000-dimensional power grid -- demonstrate the scalability and effectiveness of the approach in real-world, safety-critical systems.
Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
Robots equipped with rich sensor suites can localize reliably in partially-observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically--a brittle, runtime-expensive approach. Data-driven approaches--including diffusion models--learn multi-modal trajectories from demonstrations, but presuppose an accurate, always-on state estimate. We address the largely open problem: for a given task in a mapped environment, which \textit{minimal sensor subset} must be active at each location to maintain state uncertainty \textit{just low enough} to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error--eliminating external covariance rollouts. We show that this single proxy suffices for a soft-actor-critic to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.
comment: Accepted to CoRL 2025 (Conference on Robot Learning)
A layered smart sensing platform for physiologically informed human-exoskeleton interaction
Wearable exoskeletons offer transformative potential to assist mobility across diverse user groups with reduced muscle strength or other forms of impaired mobility. Yet, their deployment beyond laboratory settings remains constrained by sensing systems able to fully capture users' physiological and biomechanical states in real time. We introduce a soft, lightweight smart leg sleeve with anatomically inspired layered multimodal sensing, integrating textile-based surface electromyography (sEMG) electrodes, ultrasensitive textile strain sensors, and inertial measurement units (IMUs). Each sensing modality targets a distinct physiological layer: IMUs track joint kinematics at the skeletal level, sEMG monitors muscle activation at the muscular level, and strain sensors detect skin deformation at the cutaneous level. Together, these sensors provide real-time perception to support three core objectives: controlling personalized assistance, optimizing user effort, and safeguarding against injury risks. The system is skin-conformal, mechanically compliant, and seamlessly integrated with a custom exoskeleton (<20 g total sensor and electronics weight). We demonstrate: (1) accurate ankle joint moment estimation (RMSE = 0.13 Nm/kg), (2) real-time classification of metabolic trends (accuracy = 97.1%), and (3) injury risk detection within 100 ms (recall = 0.96), all validated on unseen users using a leave-one-subject-out protocol. This work demonstrates a lightweight, multimodal sensing architecture for next-generation human-exoskeleton interaction in controlled and semi-structured walking scenarios, with potential for scaling to broader exoskeleton applications towards intelligent, responsive, and personalized wearable robotics.
comment: 21 pages, 5 figures, 43 references
Euclidean Approach to Green-Wave Theory Applied to Traffic Signal Networks
Travel on long arterials with signalized intersections can be inefficient if not coordinated properly. As the number of signals increases, coordination becomes more challenging and traditional progression schemes tend to break down. Long progressions save travel time and fuel, reduce pollution and traffic accidents by providing a smoother flow of traffic. This paper introduces a green-wave theory that can be applied to a network of intersecting arterial roads. It enables uninterrupted flow on arbitrary long signalized arterials using a Road-to-Traveler-Feedback Device. The approach is modelled after Euclid. We define concepts such as RGW-roads (roads where vehicles traveling at the recommended speed make all traffic signals), green-arrows (representing vehicle platoons), real nodes (representing signalized intersections where RGW-roads intersect) and virtual nodes, green-wave speed, blocks, etc. - the analogue of Euclid's postulates. We then use geometric reasoning to deduce results: green-arrow lengths have a maximum value, are restricted to discrete lengths, and green-arrow laws of motion imply that select existing arterial roads can be converted to RGW-roads. The signal timings and offsets that are produced have been shown to be effective using a simulation model developed previously called RGW-SIM.
comment: 74 pages, 49 figures
Design of MIMO Lur'e oscillators via dominant system theory with application in multi-agent rhythm synchronization
This paper presents a new design framework for dynamic output-feedback controllers for Lur'e oscillation in a multiple-input multiple-output setting. We first revisit and extend dominant system theory to state-dependent rates, with the goal of deriving conditions based on linear matrix inequalities. Then, we introduce a separation principle for Lur'e oscillator design, which allows for the independent design of a state-feedback oscillator and an observer. Our proposed control synthesis is demonstrated through the rhythm synchronization in multi-agent systems, illustrating how networks of stable, heterogeneous linear agents can be driven into phase-locked rhythmic behavior.
Co-Investment with Payoff-Sharing Mechanism for Cooperative Decision-Making in Network Design Games
Network-based systems are inherently interconnected, with the design and performance of subnetworks being interdependent. However, the decisions of self-interested operators may lead to suboptimal outcomes for users and the overall system. This paper explores cooperative mechanisms that can simultaneously benefit both operators and users. We address this challenge using a game-theoretical framework that integrates both non-cooperative and cooperative game theory. In the non-cooperative stage, we propose a network design game in which subnetwork decision-makers strategically design local infrastructures. In the cooperative stage, co-investment with payoff-sharing mechanism is developed to enlarge collective benefits and fairly distribute them. To demonstrate the effectiveness of our framework, we conduct case studies on the Sioux Falls network and real-world public transport networks in Zurich and Winterthur, Switzerland. Our evaluation considers impacts on environmental sustainability, social welfare, and economic efficiency. The proposed framework provides a foundation for improving interdependent networked systems by enabling strategic cooperation among self-interested operators.
Active Fault Identification and Robust Control for Unknown Bounded Faults via Volume-Based Costs
This paper proposes a novel framework for active fault diagnosis and parameter estimation in linear systems operating in closed-loop, subject to unknown but bounded faults. The approach integrates set-membership identification with a cost function designed to accelerate fault identification. Informative excitation is achieved by minimizing the size of the parameter uncertainty set, which is approximated using ellipsoidal outer bounds. Combining this formulation with a scheduling parameter enables a transition back to nominal control as confidence in the model estimates increases. Unlike many existing methods, the proposed approach does not rely on predefined fault models. Instead, it only requires known bounds on parameter deviations and additive disturbances. Robust constraint satisfaction is guaranteed through a tube-based model predictive control scheme. Simulation results demonstrate that the method achieves faster fault detection and identification compared to passive strategies and adaptive ones based on persistent excitation constraints.
Virtual Trading in Multi-Settlement Electricity Markets
In the Day-Ahead (DA) market, suppliers sell and load-serving entities (LSEs) purchase energy commitments, with both sides adjusting for imbalances between contracted and actual deliveries in the Real-Time (RT) market. We develop a supply function equilibrium model to study how virtual trading-speculating on DA-RT price spreads without physical delivery-affects market efficiency. Without virtual trading, LSEs underbid relative to actual demand in the DA market, pushing DA prices below expected RT prices. Virtual trading narrows, and in the limit of large number traders can eliminates, this price gap. However, it does not induce quantity alignment: DA-cleared demand remains below true expected demand, as price alignment makes the LSE indifferent between markets and prompts it to reduce DA bids to avoid over-purchasing. Renewable energy suppliers cannot offset these strategic distortions. We provide empirical support to our main model implications using data from the California and New York Independent System Operators.
Optimality of Linear Policies in Distributionally Robust Linear Quadratic Control
We study a generalization of the classical discrete-time, Linear-Quadratic-Gaussian (LQG) control problem where the noise distributions affecting the states and observations are unknown and chosen adversarially from divergence-based ambiguity sets centered around a known nominal distribution. For a finite horizon model with Gaussian nominal noise and a structural assumption on the divergence that is satisfied by many examples -- including 2-Wasserstein distance, Kullback-Leibler divergence, moment-based divergences, entropy-regularized optimal transport, or Fisher (score-matching) divergence -- we prove that a control policy that is affine in the observations is optimal and the adversary's corresponding worst-case optimal distribution is Gaussian. When the nominal means are zero (as in the classical LQG model), we show that the adversary should optimally set the distribution's mean to zero and the optimal control policy becomes linear. Moreover, the adversary should optimally ``inflate" the noise by choosing covariance matrices that dominate the nominal covariance in Loewner order. Exploiting these structural properties, we develop a Frank-Wolfe algorithm whose inner step solves standard LQG subproblems via Kalman filtering and dynamic programming and show that the implementation consistently outperforms semidefinite-programming reformulations of the problem. All structural and algorithmic results extend to an infinite-horizon, average-cost formulation, yielding stationary linear policies and a time-invariant Gaussian distribution for the adversary. Lastly, we show that when the divergence is 2-Wasserstein, the entire framework remains valid when the nominal distributions are elliptical rather than Gaussian.
Beyond Fixed Morphologies: Learning Graph Policies with Trust Region Compensation in Variable Action Spaces
Trust region-based optimization methods have become foundational reinforcement learning algorithms that offer stability and strong empirical performance in continuous control tasks. Growing interest in scalable and reusable control policies translate also in a demand for morphological generalization, the ability of control policies to cope with different kinematic structures. Graph-based policy architectures provide a natural and effective mechanism to encode such structural differences. However, while these architectures accommodate variable morphologies, the behavior of trust region methods under varying action space dimensionality remains poorly understood. To this end, we conduct a theoretical analysis of trust region-based policy optimization methods, focusing on both Trust Region Policy Optimization (TRPO) and its widely used first-order approximation, Proximal Policy Optimization (PPO). The goal is to demonstrate how varying action space dimensionality influence the optimization landscape, particularly under the constraints imposed by KL-divergence or policy clipping penalties. Complementing the theoretical insights, an empirical evaluation under morphological variation is carried out using the Gymnasium Swimmer environment. This benchmark offers a systematically controlled setting for varying the kinematic structure without altering the underlying task, making it particularly well-suited to study morphological generalization.
A Communication Consistent Approach to Signal Temporal Logic Task Decomposition in Multi-Agent Systems
We consider the problem of decomposing a global task assigned to a multi-agent system, expressed as a formula within a fragment of Signal Temporal Logic (STL), under range-limited communication. Given a global task expressed as a conjunction of local tasks defined over the individual and relative states of agents in the system, we propose representing task dependencies among agents as edges of a suitably defined task graph. At the same time, range-limited communication naturally induces the definition of a communication graph that defines which agents have access to each other's states. Within these settings, inconsistencies arise when a task dependency between a pair of agents is not supported by a corresponding communication link due to the limited communication range. As a result, state feedback control laws previously derived to achieve the tasks' satisfaction can not be leveraged. We propose a task decomposition mechanism to distribute tasks assigned to pairs of non-communicating agents in the system as conjunctions of tasks defined over the relative states of communicating agents, thus enforcing consistency between task and communication graphs. Assuming the super-level sets of the predicate functions composing the STL tasks are bounded polytopes, our task decomposition mechanism can be cast as a parameter optimization problem and solved via state-of-the-art decentralized convex optimization algorithms. To guarantee the soundness of our approach, we present various conditions under which the tasks defined in the applied STL fragment are unsatisfiable, and we show sufficient conditions such that our decomposition approach yields satisfiable global tasks after decomposition.
comment: This work has been submitted to the IEEE for possible publication
Self-sustained oscillations in discrete-time relay feedback systems
We study the problem of determining self-sustained oscillations in discrete-time linear time-invariant relay feedback systems. Concretely, we are interested in predicting when such a system admits unimodal oscillations, i.e., when the output has a single-peaked period. Under the assumption that the linear system is stable and has an impulse response that is strictly monotonically decreasing on its infinite support, we take a novel approach in using the framework of total positivity to address our main question. It is shown that unimodal self-oscillations can only exist if the number of positive and negative elements in a period coincides. Based on this result, we derive conditions for the existence of such oscillations, determine bounds on their periods, and address the question of uniqueness.
comment: Update some figures by the comments from reviewers
Open-/Closed-loop Active Learning for Data-driven Predictive Control
An important question in data-driven control is how to obtain an informative dataset. In this work, we consider the problem of effective data acquisition of an unknown linear system with bounded disturbance for both open-loop and closed-loop stages. The learning objective is to minimize the volume of the set of admissible systems. First, a performance measure based on historical data and the input sequence is introduced to characterize the upper bound of the volume of the set of admissible systems. On the basis of this performance measure, an open-loop active learning strategy is proposed to minimize the volume by actively designing inputs during the open-loop stage. For the closed-loop stage, a closed-loop active learning strategy is designed to select and learn from informative closed-loop data. The efficiency of the proposed closed-loop active learning strategy is proved by showing that the unselected data cannot benefit the learning performance. Furthermore, an adaptive predictive controller is designed in accordance with the proposed data acquisition approach. The recursive feasibility and the stability of the controller are proved by analyzing the effect of the closed-loop active learning strategy. Finally, numerical examples and comparisons illustrate the effectiveness of the proposed data acquisition strategy.
Optical Wireless Ether: Enabling Controlled Dynamic Signal Propagation in OWC Systems
Optical wireless communication (OWC) leverages the terahertz-scale optical spectrum to enable ultra-fast data transfer, offering a compelling alternative to often-congested radio frequency systems. However, the highly directional nature of optical signals and their susceptibility to obstruction inherently limit coverage and reliability, particularly in dynamic indoor environments. To overcome these limitations, we propose optical wireless ether (OWE), a novel framework that transforms indoor spaces into a dynamically controllable optical propagation medium. OWE employs a distributed network of ether amplifiers (EAs), which act as optical amplifiers with programmable gain values to extend coverage through diffuse reflections while compensating for signal attenuation. A key challenge in OWE is preventing amplifier saturation from feedback loops. We rigorously derive stability constraints to guarantee system robustness. Beyond coverage extension, OWE dynamically adjusts EA gains in response to user locations and channel conditions, enhancing signal-to-noise ratio, balancing resource allocation, and suppressing interference. As the first framework to harness diffuse reflection for controllable optical propagation, we validate OWE's effectiveness through analytical modeling, simulations, and prototyping. Our work lays the foundation for robust, high-speed indoor OWC networks.
comment: Enhanced clarity of the optical channel and noise models with refined explanations, additional figures, and equations
Robotics
Energy Efficiency in Robotics Software: A Systematic Literature Review (2020-2024)
This study presents a systematic literature review of software-level approaches to energy efficiency in robotics published from 2020 through 2024, updating and extending pre-2020 evidence. An automated-but-audited pipeline combined Google Scholar seeding, backward/forward snowballing, and large-language-model (LLM) assistance for screening and data extraction, with ~10% human audits at each automated step and consensus-with-tie-breaks for full-text decisions. The final corpus comprises 79 peer-reviewed studies analyzed across application domain, metrics, evaluation type, energy models, major energy consumers, software technique families, and energy-quality trade-offs. Industrial settings dominate (31.6%) followed by exploration (25.3%). Motors/actuators are identified as the primary consumer in 68.4% of studies, with computing/controllers a distant second (13.9%). Simulation-only evaluations remain most common (51.9%), though hybrid evaluations are frequent (25.3%). Representational (physics-grounded) energy models predominate (87.3%). Motion and trajectory optimization is the leading technique family (69.6%), often paired with learning/prediction (40.5%) and computation allocation/scheduling (26.6%); power management/idle control (11.4%) and communication/data efficiency (3.8%) are comparatively underexplored. Reporting is heterogeneous: composite objectives that include energy are most common, while task-normalized and performance-per-energy metrics appear less often, limiting cross-paper comparability. The review offers a minimal reporting checklist (e.g., total energy and average power plus a task-normalized metric and clear baselines) and highlights opportunities in cross-layer designs and in quantifying non-performance trade-offs (accuracy, stability). A replication package with code, prompts, and frozen datasets accompanies the review.
Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
Robots equipped with rich sensor suites can localize reliably in partially-observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically--a brittle, runtime-expensive approach. Data-driven approaches--including diffusion models--learn multi-modal trajectories from demonstrations, but presuppose an accurate, always-on state estimate. We address the largely open problem: for a given task in a mapped environment, which \textit{minimal sensor subset} must be active at each location to maintain state uncertainty \textit{just low enough} to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error--eliminating external covariance rollouts. We show that this single proxy suffices for a soft-actor-critic to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.
comment: Accepted to CoRL 2025 (Conference on Robot Learning)
Into the Wild: When Robots Are Not Welcome
Social robots are increasingly being deployed in public spaces, where they face not only technological difficulties and unexpected user utterances, but also objections from stakeholders who may not be comfortable with introducing a robot into those spaces. We describe our difficulties with deploying a social robot in two different public settings: 1) Student services center; 2) Refugees and asylum seekers drop-in service. Although this is a failure report, in each use case we eventually managed to earn the trust of the staff and form a relationship with them, allowing us to deploy our robot and conduct our studies.
comment: Accepted at the workshop on Real-World HRI in Public and Private Spaces: Successes, Failures, and Lessons Learned (PubRob-Fails), held at the IEEE RO-MAN Conference, 2025 (paper PubRob-Fails/2025/4)
OASIS: Real-Time Opti-Acoustic Sensing for Intervention Systems in Unstructured Environments IROS 2025
High resolution underwater 3D scene reconstruction is crucial for various applications, including construction, infrastructure maintenance, monitoring, exploration, and scientific investigation. Prior work has leveraged the complementary sensing modalities of imaging sonars and optical cameras for opti-acoustic 3D scene reconstruction, demonstrating improved results over methods which rely solely on either sensor. However, while most existing approaches focus on offline reconstruction, real-time spatial awareness is essential for both autonomous and piloted underwater vehicle operations. This paper presents OASIS, an opti-acoustic fusion method that integrates data from optical images with voxel carving techniques to achieve real-time 3D reconstruction unstructured underwater workspaces. Our approach utilizes an "eye-in-hand" configuration, which leverages the dexterity of robotic manipulator arms to capture multiple workspace views across a short baseline. We validate OASIS through tank-based experiments and present qualitative and quantitative results that highlight its utility for underwater manipulation tasks.
comment: This paper has been accepted for publication in IROS 2025. Copyright IEEE
Talk Less, Fly Lighter: Autonomous Semantic Compression for UAV Swarm Communication via LLMs
The rapid adoption of Large Language Models (LLMs) in unmanned systems has significantly enhanced the semantic understanding and autonomous task execution capabilities of Unmanned Aerial Vehicle (UAV) swarms. However, limited communication bandwidth and the need for high-frequency interactions pose severe challenges to semantic information transmission within the swarm. This paper explores the feasibility of LLM-driven UAV swarms for autonomous semantic compression communication, aiming to reduce communication load while preserving critical task semantics. To this end, we construct four types of 2D simulation scenarios with different levels of environmental complexity and design a communication-execution pipeline that integrates system prompts with task instruction prompts. On this basis, we systematically evaluate the semantic compression performance of nine mainstream LLMs in different scenarios and analyze their adaptability and stability through ablation studies on environmental complexity and swarm size. Experimental results demonstrate that LLM-based UAV swarms have the potential to achieve efficient collaborative communication under bandwidth-constrained and multi-hop link conditions.
Fully Spiking Actor-Critic Neural Network for Robotic Manipulation
This study proposes a hybrid curriculum reinforcement learning (CRL) framework based on a fully spiking neural network (SNN) for 9-degree-of-freedom robotic arms performing target reaching and grasping tasks. To reduce network complexity and inference latency, the SNN architecture is simplified to include only an input and an output layer, which shows strong potential for resource-constrained environments. Building on the advantages of SNNs-high inference speed, low energy consumption, and spike-based biological plausibility, a temporal progress-partitioned curriculum strategy is integrated with the Proximal Policy Optimization (PPO) algorithm. Meanwhile, an energy consumption modeling framework is introduced to quantitatively compare the theoretical energy consumption between SNNs and conventional Artificial Neural Networks (ANNs). A dynamic two-stage reward adjustment mechanism and optimized observation space further improve learning efficiency and policy accuracy. Experiments on the Isaac Gym simulation platform demonstrate that the proposed method achieves superior performance under realistic physical constraints. Comparative evaluations with conventional PPO and ANN baselines validate the scalability and energy efficiency of the proposed approach in dynamic robotic manipulation tasks.
Toward General Physical Intelligence for Resilient Agile Manufacturing Automation
Agile and human-centric manufacturing stipulates resilient robotic solutions capable of contextual reasoning and safe interaction in unstructured environments. Foundation models particularly the Vision Language Action (VLA) models have emerged to fuse multimodal perception, reasoning and physically grounded action across varied embodiments into unified representation, termed as General Physical Intelligence (GPI). While GPI has already been described in the literature but its practical application and evolving role in contemporary agile manufacturing processes have yet to be duly explored. To bridge this gap, this practical review systematically surveys recent advancements in VLA models within GPI context, performs comprehensive comparative analysis of leading implementations and evaluates their readiness for industrial deployment through structured ablation study. Our analysis has organized state-of-the-art into five thematic pillars including multisensory representation learning, sim2real transfer, planning and control, uncertainty and safety measures and benchmarking. Finally, we articulate open research challenges and propose directions to better integrate GPI into next-generation industrial ecosystems in line with Industry 5.0.
comment: Advanced Engineering Informatics
DynamicPose: Real-time and Robust 6D Object Pose Tracking for Fast-Moving Cameras and Objects
We present DynamicPose, a retraining-free 6D pose tracking framework that improves tracking robustness in fast-moving camera and object scenarios. Previous work is mainly applicable to static or quasi-static scenes, and its performance significantly deteriorates when both the object and the camera move rapidly. To overcome these challenges, we propose three synergistic components: (1) A visual-inertial odometry compensates for the shift in the Region of Interest (ROI) caused by camera motion; (2) A depth-informed 2D tracker corrects ROI deviations caused by large object translation; (3) A VIO-guided Kalman filter predicts object rotation, generates multiple candidate poses, and then obtains the final pose by hierarchical refinement. The 6D pose tracking results guide subsequent 2D tracking and Kalman filter updates, forming a closed-loop system that ensures accurate pose initialization and precise pose tracking. Simulation and real-world experiments demonstrate the effectiveness of our method, achieving real-time and robust 6D pose tracking for fast-moving cameras and objects.
No More Blind Spots: Learning Vision-Based Omnidirectional Bipedal Locomotion for Challenging Terrain
Effective bipedal locomotion in dynamic environments, such as cluttered indoor spaces or uneven terrain, requires agile and adaptive movement in all directions. This necessitates omnidirectional terrain sensing and a controller capable of processing such input. We present a learning framework for vision-based omnidirectional bipedal locomotion, enabling seamless movement using depth images. A key challenge is the high computational cost of rendering omnidirectional depth images in simulation, making traditional sim-to-real reinforcement learning (RL) impractical. Our method combines a robust blind controller with a teacher policy that supervises a vision-based student policy, trained on noise-augmented terrain data to avoid rendering costs during RL and ensure robustness. We also introduce a data augmentation technique for supervised student training, accelerating training by up to 10 times compared to conventional methods. Our framework is validated through simulation and real-world tests, demonstrating effective omnidirectional locomotion with minimal reliance on expensive rendering. This is, to the best of our knowledge, the first demonstration of vision-based omnidirectional bipedal locomotion, showcasing its adaptability to diverse terrains.
ExploreVLM: Closed-Loop Robot Exploration Task Planning with Vision-Language Models
The advancement of embodied intelligence is accelerating the integration of robots into daily life as human assistants. This evolution requires robots to not only interpret high-level instructions and plan tasks but also perceive and adapt within dynamic environments. Vision-Language Models (VLMs) present a promising solution by combining visual understanding and language reasoning. However, existing VLM-based methods struggle with interactive exploration, accurate perception, and real-time plan adaptation. To address these challenges, we propose ExploreVLM, a novel closed-loop task planning framework powered by Vision-Language Models (VLMs). The framework is built around a step-wise feedback mechanism that enables real-time plan adjustment and supports interactive exploration. At its core is a dual-stage task planner with self-reflection, enhanced by an object-centric spatial relation graph that provides structured, language-grounded scene representations to guide perception and planning. An execution validator supports the closed loop by verifying each action and triggering re-planning. Extensive real-world experiments demonstrate that ExploreVLM significantly outperforms state-of-the-art baselines, particularly in exploration-centric tasks. Ablation studies further validate the critical role of the reflective planner and structured perception in achieving robust and efficient task execution.
Control of Legged Robots using Model Predictive Optimized Path Integral
Legged robots possess a unique ability to traverse rough terrains and navigate cluttered environments, making them well-suited for complex, real-world unstructured scenarios. However, such robots have not yet achieved the same level as seen in natural systems. Recently, sampling-based predictive controllers have demonstrated particularly promising results. This paper investigates a sampling-based model predictive strategy combining model predictive path integral (MPPI) with cross-entropy (CE) and covariance matrix adaptation (CMA) methods to generate real-time whole-body motions for legged robots across multiple scenarios. The results show that combining the benefits of MPPI, CE and CMA, namely using model predictive optimized path integral (MPOPI), demonstrates greater sample efficiency, enabling robots to attain superior locomotion results using fewer samples when compared to typical MPPI algorithms. Extensive simulation experiments in multiple scenarios on a quadruped robot show that MPOPI can be used as an anytime control strategy, increasing locomotion capabilities at each iteration.
comment: 8 pages, 13 figures, Humanoid conference
OmniD: Generalizable Robot Manipulation Policy via Image-Based BEV Representation
The visuomotor policy can easily overfit to its training datasets, such as fixed camera positions and backgrounds. This overfitting makes the policy perform well in the in-distribution scenarios but underperform in the out-of-distribution generalization. Additionally, the existing methods also have difficulty fusing multi-view information to generate an effective 3D representation. To tackle these issues, we propose Omni-Vision Diffusion Policy (OmniD), a multi-view fusion framework that synthesizes image observations into a unified bird's-eye view (BEV) representation. We introduce a deformable attention-based Omni-Feature Generator (OFG) to selectively abstract task-relevant features while suppressing view-specific noise and background distractions. OmniD achieves 11\%, 17\%, and 84\% average improvement over the best baseline model for in-distribution, out-of-distribution, and few-shot experiments, respectively. Training code and simulation benchmark are available: https://github.com/1mather/omnid.git
Integrating Symbolic RL Planning into a BDI-based Autonomous UAV Framework: System Integration and SIL Validation
Modern autonomous drone missions increasingly require software frameworks capable of seamlessly integrating structured symbolic planning with adaptive reinforcement learning (RL). Although traditional rule-based architectures offer robust structured reasoning for drone autonomy, their capabilities fall short in dynamically complex operational environments that require adaptive symbolic planning. Symbolic RL (SRL), using the Planning Domain Definition Language (PDDL), explicitly integrates domain-specific knowledge and operational constraints, significantly improving the reliability and safety of unmanned aerial vehicle (UAV) decision making. In this study, we propose the AMAD-SRL framework, an extended and refined version of the Autonomous Mission Agents for Drones (AMAD) cognitive multi-agent architecture, enhanced with symbolic reinforcement learning for dynamic mission planning and execution. We validated our framework in a Software-in-the-Loop (SIL) environment structured identically to an intended Hardware-In-the-Loop Simulation (HILS) platform, ensuring seamless transition to real hardware. Experimental results demonstrate stable integration and interoperability of modules, successful transitions between BDI-driven and symbolic RL-driven planning phases, and consistent mission performance. Specifically, we evaluate a target acquisition scenario in which the UAV plans a surveillance path followed by a dynamic reentry path to secure the target while avoiding threat zones. In this SIL evaluation, mission efficiency improved by approximately 75% over a coverage-based baseline, measured by travel distance reduction. This study establishes a robust foundation for handling complex UAV missions and discusses directions for further enhancement and validation.
Saliency-Based Attention Shifting: A Framework for Improving Driver Situational Awareness of Out-of-Label Hazards
The advent of autonomous driving systems promises to transform transportation by enhancing safety, efficiency, and comfort. As these technologies evolve toward higher levels of autonomy, the need for integrated systems that seamlessly support human involvement in decision-making becomes increasingly critical. Certain scenarios necessitate human involvement, including those where the vehicle is unable to identify an object or element in the scene, and as such cannot take independent action. Therefore, situational awareness is essential to mitigate potential risks during a takeover, where a driver must assume control and autonomy from the vehicle. The need for driver attention is important to avoid collisions with external agents and ensure a smooth transition during takeover operations. This paper explores the integration of attention redirection techniques, such as gaze manipulation through targeted visual and auditory cues, to help drivers maintain focus on emerging hazards and reduce target fixation in semi-autonomous driving scenarios. We propose a conceptual framework that combines real-time gaze tracking, context-aware saliency analysis, and synchronized visual and auditory alerts to enhance situational awareness, proactively address potential hazards, and foster effective collaboration between humans and autonomous systems.
Contact-Rich and Deformable Foot Modeling for Locomotion Control of the Human Musculoskeletal System
The human foot serves as the critical interface between the body and environment during locomotion. Existing musculoskeletal models typically oversimplify foot-ground contact mechanics, limiting their ability to accurately simulate human gait dynamics. We developed a novel contact-rich and deformable model of the human foot integrated within a complete musculoskeletal system that captures the complex biomechanical interactions during walking. To overcome the control challenges inherent in modeling multi-point contacts and deformable material, we developed a two-stage policy training strategy to learn natural walking patterns for this interface-enhanced model. Comparative analysis between our approach and conventional rigid musculoskeletal models demonstrated improvements in kinematic, kinetic, and gait stability metrics. Validation against human subject data confirmed that our simulation closely reproduced real-world biomechanical measurements. This work advances contact-rich interface modeling for human musculoskeletal systems and establishes a robust framework that can be extended to humanoid robotics applications requiring precise foot-ground interaction control.
comment: IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids 2025)
From Screen to Stage: Kid Cosmo, A Life-Like, Torque-Controlled Humanoid for Entertainment Robotics
Humanoid robots represent the cutting edge of robotics research, yet their potential in entertainment remains largely unexplored. Entertainment as a field prioritizes visuals and form, a principle that contrasts with the purely functional designs of most contemporary humanoid robots. Designing entertainment humanoid robots capable of fluid movement presents a number of unique challenges. In this paper, we present Kid Cosmo, a research platform designed for robust locomotion and life-like motion generation while imitating the look and mannerisms of its namesake character from Netflix's movie The Electric State. Kid Cosmo is a child-sized humanoid robot, standing 1.45 m tall and weighing 25 kg. It contains 28 degrees of freedom and primarily uses proprioceptive actuators, enabling torque-control walking and lifelike motion generation. Following worldwide showcases as part of the movie's press tour, we present the system architecture, challenges of a functional entertainment robot and unique solutions, and our initial findings on stability during simultaneous upper and lower body movement. We demonstrate the viability of performance-oriented humanoid robots that prioritize both character embodiment and technical functionality.
comment: 8 pages, 14 figures, accepted by IEEE Humanoids 2025
Bioinspired underwater soft robots: from biology to robotics and back
The ocean vast unexplored regions and diverse soft-bodied marine organisms have spurred interest in bio-inspired underwater soft robotics. Recent advances have enabled new capabilities in underwater movement, sensing, and interaction. However, these efforts are largely unidirectional, with biology guiding robotics while insights from robotics rarely feed back into biology. Here we propose a holistic, bidirectional framework that integrates biological principles, robotic implementation, and biological validation. We show that soft robots can serve as experimental tools to probe biological functions and even test evolutionary hypotheses. Their inherent compliance also allows them to outperform rigid systems in unstructured environments, supporting applications in marine exploration, manipulation, and medicine. Looking forward, we introduce bio-universal-inspired robotics, a paradigm that transcends species-specific mimicry by identifying convergent principles across species to inspire more adaptable designs. Despite rapid progress, challenges persist in material robustness, actuation efficiency, autonomy, and intelligence. By uniting biology and engineering, soft robots can advance ocean exploration and deepen scientific discovery.
Data Shift of Object Detection in Autonomous Driving
With the widespread adoption of machine learning technologies in autonomous driving systems, their role in addressing complex environmental perception challenges has become increasingly crucial. However, existing machine learning models exhibit significant vulnerability, as their performance critically depends on the fundamental assumption that training and testing data satisfy the independent and identically distributed condition, which is difficult to guarantee in real-world applications. Dynamic variations in data distribution caused by seasonal changes, weather fluctuations lead to data shift problems in autonomous driving systems. This study investigates the data shift problem in autonomous driving object detection tasks, systematically analyzing its complexity and diverse manifestations. We conduct a comprehensive review of data shift detection methods and employ shift detection analysis techniques to perform dataset categorization and balancing. Building upon this foundation, we construct an object detection model. To validate our approach, we optimize the model by integrating CycleGAN-based data augmentation techniques with the YOLOv5 framework. Experimental results demonstrate that our method achieves superior performance compared to baseline models on the BDD100K dataset.
LocoMamba: Vision-Driven Locomotion via End-to-End Deep Reinforcement Learning with Mamba
We introduce LocoMamba, a vision-driven cross-modal DRL framework built on selective state-space models, specifically leveraging Mamba, that achieves near-linear-time sequence modeling, effectively captures long-range dependencies, and enables efficient training with longer sequences. First, we embed proprioceptive states with a multilayer perceptron and patchify depth images with a lightweight convolutional neural network, producing compact tokens that improve state representation. Second, stacked Mamba layers fuse these tokens via near-linear-time selective scanning, reducing latency and memory footprint, remaining robust to token length and image resolution, and providing an inductive bias that mitigates overfitting. Third, we train the policy end-to-end with Proximal Policy Optimization under terrain and appearance randomization and an obstacle-density curriculum, using a compact state-centric reward that balances progress, smoothness, and safety. We evaluate our method in challenging simulated environments with static and moving obstacles as well as uneven terrain. Compared with state-of-the-art baselines, our method achieves higher returns and success rates with fewer collisions, exhibits stronger generalization to unseen terrains and obstacle densities, and improves training efficiency by converging in fewer updates under the same compute budget.
Beyond Fixed Morphologies: Learning Graph Policies with Trust Region Compensation in Variable Action Spaces
Trust region-based optimization methods have become foundational reinforcement learning algorithms that offer stability and strong empirical performance in continuous control tasks. Growing interest in scalable and reusable control policies translate also in a demand for morphological generalization, the ability of control policies to cope with different kinematic structures. Graph-based policy architectures provide a natural and effective mechanism to encode such structural differences. However, while these architectures accommodate variable morphologies, the behavior of trust region methods under varying action space dimensionality remains poorly understood. To this end, we conduct a theoretical analysis of trust region-based policy optimization methods, focusing on both Trust Region Policy Optimization (TRPO) and its widely used first-order approximation, Proximal Policy Optimization (PPO). The goal is to demonstrate how varying action space dimensionality influence the optimization landscape, particularly under the constraints imposed by KL-divergence or policy clipping penalties. Complementing the theoretical insights, an empirical evaluation under morphological variation is carried out using the Gymnasium Swimmer environment. This benchmark offers a systematically controlled setting for varying the kinematic structure without altering the underlying task, making it particularly well-suited to study morphological generalization.
Domain Translation of a Soft Robotic Arm using Conditional Cycle Generative Adversarial Network
Deep learning provides a powerful method for modeling the dynamics of soft robots, offering advantages over traditional analytical approaches that require precise knowledge of the robot's structure, material properties, and other physical characteristics. Given the inherent complexity and non-linearity of these systems, extracting such details can be challenging. The mappings learned in one domain cannot be directly transferred to another domain with different physical properties. This challenge is particularly relevant for soft robots, as their materials gradually degrade over time. In this paper, we introduce a domain translation framework based on a conditional cycle generative adversarial network (CCGAN) to enable knowledge transfer from a source domain to a target domain. Specifically, we employ a dynamic learning approach to adapt a pose controller trained in a standard simulation environment to a domain with tenfold increased viscosity. Our model learns from input pressure signals conditioned on corresponding end-effector positions and orientations in both domains. We evaluate our approach through trajectory-tracking experiments across five distinct shapes and further assess its robustness under noise perturbations and periodicity tests. The results demonstrate that CCGAN-GP effectively facilitates cross-domain skill transfer, paving the way for more adaptable and generalizable soft robotic controllers.
comment: Accepted at IEEE International Conference on Robotic Systems and Applications
RT-Cache: Training-Free Retrieval for Real-Time Manipulation
Real robots are expected to repeat the same behavior in new environments with very little new data, yet modern controllers either incur heavy per-step inference or require deployment-time fine-tuning. We propose RT-Cache, a training-free retrieval-as-control pipeline that caches diverse image action trajectories in a unified vector memory and, at test time, embeds the current frame to retrieve and replay multi-step snippets, replacing per-step model calls. A hierarchical search keeps lookups sub-second at million scale, shifting cost from compute to storage and enabling real-time control on modest GPUs. Across real-robot tasks and large open logs, RT-Cache achieves higher success and lower completion time than strong retrieval baselines (approximately x2 higher success and ~30% faster in our settings), and a single-episode anchoring study shows immediate adaptation to a more complex, contact-rich task without fine-tuning. RT-Cache turns experience into an append-only memory, offering a simple, scalable path to few-shot deployment today and a foundation for multimodal keys and optional integration with high-level policies. Project page: https://rt-cache.github.io/.
comment: 8 pages, 6 figures. Accepted to the 2025 IEEE-RAS 24th International Conference on Humanoid Robots
Research Challenges and Progress in the End-to-End V2X Cooperative Autonomous Driving Competition ICCV
With the rapid advancement of autonomous driving technology, vehicle-to-everything (V2X) communication has emerged as a key enabler for extending perception range and enhancing driving safety by providing visibility beyond the line of sight. However, integrating multi-source sensor data from both ego-vehicles and infrastructure under real-world constraints, such as limited communication bandwidth and dynamic environments, presents significant technical challenges. To facilitate research in this area, we organized the End-to-End Autonomous Driving through V2X Cooperation Challenge, which features two tracks: cooperative temporal perception and cooperative end-to-end planning. Built on the UniV2X framework and the V2X-Seq-SPD dataset, the challenge attracted participation from over 30 teams worldwide and established a unified benchmark for evaluating cooperative driving systems. This paper describes the design and outcomes of the challenge, highlights key research problems including bandwidth-aware fusion, robust multi-agent planning, and heterogeneous sensor integration, and analyzes emerging technical trends among top-performing solutions. By addressing practical constraints in communication and data fusion, the challenge contributes to the development of scalable and reliable V2X-cooperative autonomous driving systems.
comment: 10 pages, 4 figures, accepted by ICCVW Author list updated to match the camera-ready version, in compliance with conference policy
Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model
Affordance grounding focuses on predicting the specific regions of objects that are associated with the actions to be performed by robots. It plays a vital role in the fields of human-robot interaction, human-object interaction, embodied manipulation, and embodied perception. Existing models often neglect the affordance shared among different objects because they lack the Chain-of-Thought(CoT) reasoning abilities, limiting their out-of-domain (OOD) generalization and explicit reasoning capabilities. To address these challenges, we propose Affordance-R1, the first unified affordance grounding framework that integrates cognitive CoT guided Group Relative Policy Optimization (GRPO) within a reinforcement learning paradigm. Specifically, we designed a sophisticated affordance function, which contains format, perception, and cognition rewards to effectively guide optimization directions. Furthermore, we constructed a high-quality affordance-centric reasoning dataset, ReasonAff, to support training. Trained exclusively via reinforcement learning with GRPO and without explicit reasoning data, Affordance-R1 achieves robust zero-shot generalization and exhibits emergent test-time reasoning capabilities. Comprehensive experiments demonstrate that our model outperforms well-established methods and exhibits open-world generalization. To the best of our knowledge, Affordance-R1 is the first to integrate GRPO-based RL with reasoning into affordance reasoning. The code of our method and our dataset is released on https://github.com/hq-King/Affordance-R1.
FSDP: Fast and Safe Data-Driven Overtaking Trajectory Planning for Head-to-Head Autonomous Racing Competitions IROS 2025
Generating overtaking trajectories in autonomous racing is a challenging task, as the trajectory must satisfy the vehicle's dynamics and ensure safety and real-time performance running on resource-constrained hardware. This work proposes the Fast and Safe Data-Driven Planner to address this challenge. Sparse Gaussian predictions are introduced to improve both the computational efficiency and accuracy of opponent predictions. Furthermore, the proposed approach employs a bi-level quadratic programming framework to generate an overtaking trajectory leveraging the opponent predictions. The first level uses polynomial fitting to generate a rough trajectory, from which reference states and control inputs are derived for the second level. The second level formulates a model predictive control optimization problem in the Frenet frame, generating a trajectory that satisfies both kinematic feasibility and safety. Experimental results on the F1TENTH platform show that our method outperforms the State-of-the-Art, achieving an 8.93% higher overtaking success rate, allowing the maximum opponent speed, ensuring a smoother ego trajectory, and reducing 74.04% computational time compared to the Predictive Spliner method. The code is available at: https://github.com/ZJU-DDRX/FSDP.
comment: accepted by IROS 2025
Reasoning and Learning a Perceptual Metric for Self-Training of Reflective Objects in Bin-Picking with a Low-cost Camera ICRA 2026
Bin-picking of metal objects using low-cost RGB-D cameras often suffers from sparse depth information and reflective surface textures, leading to errors and the need for manual labeling. To reduce human intervention, we propose a two-stage framework consisting of a metric learning stage and a self-training stage. Specifically, to automatically process data captured by a low-cost camera (LC), we introduce a Multi-object Pose Reasoning (MoPR) algorithm that optimizes pose hypotheses under depth, collision, and boundary constraints. To further refine pose candidates, we adopt a Symmetry-aware Lie-group based Bayesian Gaussian Mixture Model (SaL-BGMM), integrated with the Expectation-Maximization (EM) algorithm, for symmetry-aware filtering. Additionally, we propose a Weighted Ranking Information Noise Contrastive Estimation (WR-InfoNCE) loss to enable the LC to learn a perceptual metric from reconstructed data, supporting self-training on untrained or even unseen objects. Experimental results show that our approach outperforms several state-of-the-art methods on both the ROBI dataset and our newly introduced Self-ROBI dataset.
comment: 8 pages, 10 figures; Accepted by IEEE RAL, presentation at ICRA 2026
NarraGuide: an LLM-based Narrative Mobile Robot for Remote Place Exploration
Robotic telepresence enables users to navigate and experience remote environments. However, effective navigation and situational awareness depend on users' prior knowledge of the environment, limiting the usefulness of these systems for exploring unfamiliar places. We explore how integrating location-aware LLM-based narrative capabilities into a mobile robot can support remote exploration. We developed a prototype system, called NarraGuide, that provides narrative guidance for users to explore and learn about a remote place through a dialogue-based interface. We deployed our prototype in a geology museum, where remote participants (n=20) used the robot to tour the museum. Our findings reveal how users perceived the robot's role, engaged in dialogue in the tour, and expressed preferences for bystander encountering. Our work demonstrates the potential of LLM-enabled robotic capabilities to deliver location-aware narrative guidance and enrich the experience of exploring remote environments.
Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration
Teaching robots dexterous manipulation skills often requires collecting hundreds of demonstrations using wearables or teleoperation, a process that is challenging to scale. Videos of human-object interactions are easier to collect and scale, but leveraging them directly for robot learning is difficult due to the lack of explicit action labels and human-robot embodiment differences. We propose Human2Sim2Robot, a novel real-to-sim-to-real framework for training dexterous manipulation policies using only one RGB-D video of a human demonstrating a task. Our method utilizes reinforcement learning (RL) in simulation to cross the embodiment gap without relying on wearables, teleoperation, or large-scale data collection. From the video, we extract: (1) the object pose trajectory to define an object-centric, embodiment-agnostic reward, and (2) the pre-manipulation hand pose to initialize and guide exploration during RL training. These components enable effective policy learning without any task-specific reward tuning. In the single human demo regime, Human2Sim2Robot outperforms object-aware replay by over 55% and imitation learning by over 68% on grasping, non-prehensile manipulation, and multi-step tasks. Website: https://human2sim2robot.github.io
FDSPC: Fast and Direct Smooth Path Planning via Continuous Curvature Integration
In recent decades, global path planning of robot has seen significant advancements. Both heuristic search-based methods and probability sampling-based methods have shown capabilities to find feasible solutions in complex scenarios. However, mainstream global path planning algorithms often produce paths with bends, requiring additional smoothing post-processing. In this work, we propose a fast and direct path planning method based on continuous curvature integration. This method ensures path feasibility while directly generating global smooth paths with constant velocity, thus eliminating the need for post-path-smoothing. Furthermore, we compare the proposed method with existing approaches in terms of solution time, path length, memory usage, and smoothness under multiple scenarios. The proposed method is vastly superior to the average performance of state-of-the-art (SOTA) methods, especially in terms of the self-defined $\mathcal{S}_2 $ smoothness (mean angle of steering). These results demonstrate the effectiveness and superiority of our approach in several representative environments.
SLAC: Simulation-Pretrained Latent Action Space for Whole-Body Real-World RL
Building capable household and industrial robots requires mastering the control of versatile, high-degree-of-freedom (DoF) systems such as mobile manipulators. While reinforcement learning (RL) holds promise for autonomously acquiring robot control policies, scaling it to high-DoF embodiments remains challenging. Direct RL in the real world demands both safe exploration and high sample efficiency, which are difficult to achieve in practice. Sim-to-real RL, on the other hand, is often brittle due to the reality gap. This paper introduces SLAC, a method that renders real-world RL feasible for complex embodiments by leveraging a low-fidelity simulator to pretrain a task-agnostic latent action space. SLAC trains this latent action space via a customized unsupervised skill discovery method designed to promote temporal abstraction, disentanglement, and safety, thereby facilitating efficient downstream learning. Once a latent action space is learned, SLAC uses it as the action interface for a novel off-policy RL algorithm to autonomously learn downstream tasks through real-world interactions. We evaluate SLAC against existing methods on a suite of bimanual mobile manipulation tasks, where it achieves state-of-the-art performance. Notably, SLAC learns contact-rich whole-body tasks in under an hour of real-world interactions, without relying on any demonstrations or hand-crafted behavior priors. More information and robot videos at robo-rl.github.io
comment: CoRL 2025
Measuring and Minimizing Disturbance of Marine Animals to Underwater Vehicles
Do fish respond to the presence of underwater vehicles, potentially biasing our estimates about them? If so, are there strategies to measure and mitigate this response? This work provides a theoretical and practical framework towards bias-free estimation of animal behavior from underwater vehicle observations. We also provide preliminary results from the field in coral reef environments to address these questions.
comment: ISER 2025 proceedings
D-CODA: Diffusion for Coordinated Dual-Arm Data Augmentation
Learning bimanual manipulation is challenging due to its high dimensionality and tight coordination required between two arms. Eye-in-hand imitation learning, which uses wrist-mounted cameras, simplifies perception by focusing on task-relevant views. However, collecting diverse demonstrations remains costly, motivating the need for scalable data augmentation. While prior work has explored visual augmentation in single-arm settings, extending these approaches to bimanual manipulation requires generating viewpoint-consistent observations across both arms and producing corresponding action labels that are both valid and feasible. In this work, we propose Diffusion for COordinated Dual-arm Data Augmentation (D-CODA), a method for offline data augmentation tailored to eye-in-hand bimanual imitation learning that trains a diffusion model to synthesize novel, viewpoint-consistent wrist-camera images for both arms while simultaneously generating joint-space action labels. It employs constrained optimization to ensure that augmented states involving gripper-to-object contacts adhere to constraints suitable for bimanual coordination. We evaluate D-CODA on 5 simulated and 3 real-world tasks. Our results across 2250 simulation trials and 300 real-world trials demonstrate that it outperforms baselines and ablations, showing its potential for scalable data augmentation in eye-in-hand bimanual manipulation. Our project website is at: https://dcodaaug.github.io/D-CODA/.
comment: Accepted to the Conference on Robot Learning (CoRL) 2025
Multiagent Systems
MAPF-World: Action World Model for Multi-Agent Path Finding
Multi-agent path finding (MAPF) is the problem of planning conflict-free paths from the designated start locations to goal positions for multiple agents. It underlies a variety of real-world tasks, including multi-robot coordination, robot-assisted logistics, and social navigation. Recent decentralized learnable solvers have shown great promise for large-scale MAPF, especially when leveraging foundation models and large datasets. However, these agents are reactive policy models and exhibit limited modeling of environmental temporal dynamics and inter-agent dependencies, resulting in performance degradation in complex, long-term planning scenarios. To address these limitations, we propose MAPF-World, an autoregressive action world model for MAPF that unifies situation understanding and action generation, guiding decisions beyond immediate local observations. It improves situational awareness by explicitly modeling environmental dynamics, including spatial features and temporal dependencies, through future state and actions prediction. By incorporating these predicted futures, MAPF-World enables more informed, coordinated, and far-sighted decision-making, especially in complex multi-agent settings. Furthermore, we augment MAPF benchmarks by introducing an automatic map generator grounded in real-world scenarios, capturing practical map layouts for training and evaluating MAPF solvers. Extensive experiments demonstrate that MAPF-World outperforms state-of-the-art learnable solvers, showcasing superior zero-shot generalization to out-of-distribution cases. Notably, MAPF-World is trained with a 96.5% smaller model size and 92% reduced data.
AgentCDM: Enhancing Multi-Agent Collaborative Decision-Making via ACH-Inspired Structured Reasoning
Multi-agent systems (MAS) powered by large language models (LLMs) hold significant promise for solving complex decision-making tasks. However, the core process of collaborative decision-making (CDM) within these systems remains underexplored. Existing approaches often rely on either ``dictatorial" strategies that are vulnerable to the cognitive biases of a single agent, or ``voting-based" methods that fail to fully harness collective intelligence. To address these limitations, we propose \textbf{AgentCDM}, a structured framework for enhancing collaborative decision-making in LLM-based multi-agent systems. Drawing inspiration from the Analysis of Competing Hypotheses (ACH) in cognitive science, AgentCDM introduces a structured reasoning paradigm that systematically mitigates cognitive biases and shifts decision-making from passive answer selection to active hypothesis evaluation and construction. To internalize this reasoning process, we develop a two-stage training paradigm: the first stage uses explicit ACH-inspired scaffolding to guide the model through structured reasoning, while the second stage progressively removes this scaffolding to encourage autonomous generalization. Experiments on multiple benchmark datasets demonstrate that AgentCDM achieves state-of-the-art performance and exhibits strong generalization, validating its effectiveness in improving the quality and robustness of collaborative decisions in MAS.
A Comprehensive Review of AI Agents: Transforming Possibilities in Technology and Beyond
Artificial Intelligence (AI) agents have rapidly evolved from specialized, rule-based programs to versatile, learning-driven autonomous systems capable of perception, reasoning, and action in complex environments. The explosion of data, advances in deep learning, reinforcement learning, and multi-agent coordination have accelerated this transformation. Yet, designing and deploying unified AI agents that seamlessly integrate cognition, planning, and interaction remains a grand challenge. In this review, we systematically examine the architectural principles, foundational components, and emergent paradigms that define the landscape of contemporary AI agents. We synthesize insights from cognitive science-inspired models, hierarchical reinforcement learning frameworks, and large language model-based reasoning. Moreover, we discuss the pressing ethical, safety, and interpretability concerns associated with deploying these agents in real-world scenarios. By highlighting major breakthroughs, persistent challenges, and promising research directions, this review aims to guide the next generation of AI agent systems toward more robust, adaptable, and trustworthy autonomous intelligence.
Benchmarking LLM-based Agents for Single-cell Omics Analysis
The surge in multimodal single-cell omics data exposes limitations in traditional, manually defined analysis workflows. AI agents offer a paradigm shift, enabling adaptive planning, executable code generation, traceable decisions, and real-time knowledge fusion. However, the lack of a comprehensive benchmark critically hinders progress. We introduce a novel benchmarking evaluation system to rigorously assess agent capabilities in single-cell omics analysis. This system comprises: a unified platform compatible with diverse agent frameworks and LLMs; multidimensional metrics assessing cognitive program synthesis, collaboration, execution efficiency, bioinformatics knowledge integration, and task completion quality; and 50 diverse real-world single-cell omics analysis tasks spanning multi-omics, species, and sequencing technologies. Our evaluation reveals that Grok-3-beta achieves state-of-the-art performance among tested agent frameworks. Multi-agent frameworks significantly enhance collaboration and execution efficiency over single-agent approaches through specialized role division. Attribution analyses of agent capabilities identify that high-quality code generation is crucial for task success, and self-reflection has the most significant overall impact, followed by retrieval-augmented generation (RAG) and planning. This work highlights persistent challenges in code generation, long-context handling, and context-aware knowledge retrieval, providing a critical empirical foundation and best practices for developing robust AI agents in computational biology.
Robotics
Investigating Sensors and Methods in Grasp State Classification in Agricultural Manipulation
Effective and efficient agricultural manipulation and harvesting depend on accurately understanding the current state of the grasp. The agricultural environment presents unique challenges due to its complexity, clutter, and occlusion. Additionally, fruit is physically attached to the plant, requiring precise separation during harvesting. Selecting appropriate sensors and modeling techniques is critical for obtaining reliable feedback and correctly identifying grasp states. This work investigates a set of key sensors, namely inertial measurement units (IMUs), infrared (IR) reflectance, tension, tactile sensors, and RGB cameras, integrated into a compliant gripper to classify grasp states. We evaluate the individual contribution of each sensor and compare the performance of two widely used classification models: Random Forest and Long Short-Term Memory (LSTM) networks. Our results demonstrate that a Random Forest classifier, trained in a controlled lab environment and tested on real cherry tomato plants, achieved 100% accuracy in identifying slip, grasp failure, and successful picks, marking a substantial improvement over baseline performance. Furthermore, we identify a minimal viable sensor combination, namely IMU and tension sensors that effectively classifies grasp states. This classifier enables the planning of corrective actions based on real-time feedback, thereby enhancing the efficiency and reliability of fruit harvesting operations.
Visual Perception Engine: Fast and Flexible Multi-Head Inference for Robotic Vision Tasks
Deploying multiple machine learning models on resource-constrained robotic platforms for different perception tasks often results in redundant computations, large memory footprints, and complex integration challenges. In response, this work presents Visual Perception Engine (VPEngine), a modular framework designed to enable efficient GPU usage for visual multitasking while maintaining extensibility and developer accessibility. Our framework architecture leverages a shared foundation model backbone that extracts image representations, which are efficiently shared, without any unnecessary GPU-CPU memory transfers, across multiple specialized task-specific model heads running in parallel. This design eliminates the computational redundancy inherent in feature extraction component when deploying traditional sequential models while enabling dynamic task prioritization based on application demands. We demonstrate our framework's capabilities through an example implementation using DINOv2 as the foundation model with multiple task (depth, object detection and semantic segmentation) heads, achieving up to 3x speedup compared to sequential execution. Building on CUDA Multi-Process Service (MPS), VPEngine offers efficient GPU utilization and maintains a constant memory footprint while allowing per-task inference frequencies to be adjusted dynamically during runtime. The framework is written in Python and is open source with ROS2 C++ (Humble) bindings for ease of use by the robotics community across diverse robotic platforms. Our example implementation demonstrates end-to-end real-time performance at $\geq$50 Hz on NVIDIA Jetson Orin AGX for TensorRT optimized models.
comment: 6 pages, 6 figures, 2 tables
Nominal Evaluation Of Automatic Multi-Sections Control Potential In Comparison To A Simpler One- Or Two-Sections Alternative With Predictive Spray Switching
Automatic Section Control (ASC) is a long-standing trend for spraying in agriculture. It promises to minimise spray overlap areas. The core idea is to (i) switch off spray nozzles on areas that have already been sprayed, and (ii) to dynamically adjust nozzle flow rates along the boom bar that holds the spray nozzles when velocities of boom sections vary during turn maneuvers. ASC is not possible without sensors, in particular for accurate positioning data. Spraying and the movement of modern wide boom bars are highly dynamic processes. In addition, many uncertainty factors have an effect such as cross wind drift, boom height, nozzle clogging in open-field conditions, and so forth. In view of this complexity, the natural question arises if a simpler alternative exist. Therefore, an Automatic Multi-Sections Control method is compared to a proposed simpler one- or two-sections alternative that uses predictive spray switching. The comparison is provided under nominal conditions. Agricultural spraying is intrinsically linked to area coverage path planning and spray switching logic. Combinations of two area coverage path planning and switching logics as well as three sections-setups are compared. The three sections-setups differ by controlling 48 sections, 2 sections or controlling all nozzles uniformly with the same control signal as one single section. Methods are evaluated on 10 diverse real-world field examples, including non-convex field contours, freeform mainfield lanes and multiple obstacle areas. A preferred method is suggested that (i) minimises area coverage pathlength, (ii) offers intermediate overlap, (iii) is suitable for manual driving by following a pre-planned predictive spray switching logic for an area coverage path plan, and (iv) and in contrast to ASC can be implemented sensor-free and therefore at low cost.
comment: 14 pages plus 7 pages appendix with additional figures, 18 main figures, 3 tables
Towards Fully Onboard State Estimation and Trajectory Tracking for UAVs with Suspended Payloads
This paper addresses the problem of tracking the position of a cable-suspended payload carried by an unmanned aerial vehicle, with a focus on real-world deployment and minimal hardware requirements. In contrast to many existing approaches that rely on motion-capture systems, additional onboard cameras, or instrumented payloads, we propose a framework that uses only standard onboard sensors--specifically, real-time kinematic global navigation satellite system measurements and data from the onboard inertial measurement unit--to estimate and control the payload's position. The system models the full coupled dynamics of the aerial vehicle and payload, and integrates a linear Kalman filter for state estimation, a model predictive contouring control planner, and an incremental model predictive controller. The control architecture is designed to remain effective despite sensing limitations and estimation uncertainty. Extensive simulations demonstrate that the proposed system achieves performance comparable to control based on ground-truth measurements, with only minor degradation (< 6%). The system also shows strong robustness to variations in payload parameters. Field experiments further validate the framework, confirming its practical applicability and reliable performance in outdoor environments using only off-the-shelf aerial vehicle hardware.
MultiPark: Multimodal Parking Transformer with Next-Segment Prediction
Parking accurately and safely in highly constrained spaces remains a critical challenge. Unlike structured driving environments, parking requires executing complex maneuvers such as frequent gear shifts and steering saturation. Recent attempts to employ imitation learning (IL) for parking have achieved promising results. However, existing works ignore the multimodal nature of parking behavior in lane-free open space, failing to derive multiple plausible solutions under the same situation. Notably, IL-based methods encompass inherent causal confusion, so enabling a neural network to generalize across diverse parking scenarios is particularly difficult. To address these challenges, we propose MultiPark, an autoregressive transformer for multimodal parking. To handle paths filled with abrupt turning points, we introduce a data-efficient next-segment prediction paradigm, enabling spatial generalization and temporal extrapolation. Furthermore, we design learnable parking queries factorized into gear, longitudinal, and lateral components, parallelly decoding diverse parking behaviors. To mitigate causal confusion in IL, our method employs target-centric pose and ego-centric collision as outcome-oriented loss across all modalities beyond pure imitation loss. Evaluations on real-world datasets demonstrate that MultiPark achieves state-of-the-art performance across various scenarios. We deploy MultiPark on a production vehicle, further confirming our approach's robustness in real-world parking environments.
A Comparative Study of Floating-Base Space Parameterizations for Agile Whole-Body Motion Planning
Automatically generating agile whole-body motions for legged and humanoid robots remains a fundamental challenge in robotics. While numerous trajectory optimization approaches have been proposed, there is no clear guideline on how the choice of floating-base space parameterization affects performance, especially for agile behaviors involving complex contact dynamics. In this paper, we present a comparative study of different parameterizations for direct transcription-based trajectory optimization of agile motions in legged systems. We systematically evaluate several common choices under identical optimization settings to ensure a fair comparison. Furthermore, we introduce a novel formulation based on the tangent space of SE(3) for representing the robot's floating-base pose, which, to our knowledge, has not received attention from the literature. This approach enables the use of mature off-the-shelf numerical solvers without requiring specialized manifold optimization techniques. We hope that our experiments and analysis will provide meaningful insights for selecting the appropriate floating-based representation for agile whole-body motion generation.
comment: 8 pages, 2 figures, 4 tables, Accepted at Humanoids 2025
Sim2Dust: Mastering Dynamic Waypoint Tracking on Granular Media
Reliable autonomous navigation across the unstructured terrains of distant planetary surfaces is a critical enabler for future space exploration. However, the deployment of learning-based controllers is hindered by the inherent sim-to-real gap, particularly for the complex dynamics of wheel interactions with granular media. This work presents a complete sim-to-real framework for developing and validating robust control policies for dynamic waypoint tracking on such challenging surfaces. We leverage massively parallel simulation to train reinforcement learning agents across a vast distribution of procedurally generated environments with randomized physics. These policies are then transferred zero-shot to a physical wheeled rover operating in a lunar-analogue facility. Our experiments systematically compare multiple reinforcement learning algorithms and action smoothing filters to identify the most effective combinations for real-world deployment. Crucially, we provide strong empirical evidence that agents trained with procedural diversity achieve superior zero-shot performance compared to those trained on static scenarios. We also analyze the trade-offs of fine-tuning with high-fidelity particle physics, which offers minor gains in low-speed precision at a significant computational cost. Together, these contributions establish a validated workflow for creating reliable learning-based navigation systems, marking a critical step towards deploying autonomous robots in the final frontier.
comment: The source code is available at https://github.com/AndrejOrsula/space_robotics_bench
Swarm-in-Blocks: Simplifying Drone Swarm Programming with Block-Based Language
Swarm in Blocks, originally developed for CopterHack 2022, is a high-level interface that simplifies drone swarm programming using a block-based language. Building on the Clover platform, this tool enables users to create functionalities like loops and conditional structures by assembling code blocks. In 2023, we introduced Swarm in Blocks 2.0, further refining the platform to address the complexities of swarm management in a user-friendly way. As drone swarm applications grow in areas like delivery, agriculture, and surveillance, the challenge of managing them, especially for beginners, has also increased. The Atena team developed this interface to make swarm handling accessible without requiring extensive knowledge of ROS or programming. The block-based approach not only simplifies swarm control but also expands educational opportunities in programming.
Relative Position Matters: Trajectory Prediction and Planning with Polar Representation
Trajectory prediction and planning in autonomous driving are highly challenging due to the complexity of predicting surrounding agents' movements and planning the ego agent's actions in dynamic environments. Existing methods encode map and agent positions and decode future trajectories in Cartesian coordinates. However, modeling the relationships between the ego vehicle and surrounding traffic elements in Cartesian space can be suboptimal, as it does not naturally capture the varying influence of different elements based on their relative distances and directions. To address this limitation, we adopt the Polar coordinate system, where positions are represented by radius and angle. This representation provides a more intuitive and effective way to model spatial changes and relative relationships, especially in terms of distance and directional influence. Based on this insight, we propose Polaris, a novel method that operates entirely in Polar coordinates, distinguishing itself from conventional Cartesian-based approaches. By leveraging the Polar representation, this method explicitly models distance and direction variations and captures relative relationships through dedicated encoding and refinement modules, enabling more structured and spatially aware trajectory prediction and planning. Extensive experiments on the challenging prediction (Argoverse 2) and planning benchmarks (nuPlan) demonstrate that Polaris achieves state-of-the-art performance.
i2Nav-Robot: A Large-Scale Indoor-Outdoor Robot Dataset for Multi-Sensor Fusion Navigation and Mapping
Accurate and reliable navigation is crucial for autonomous unmanned ground vehicle (UGV). However, current UGV datasets fall short in meeting the demands for advancing navigation and mapping techniques due to limitations in sensor configuration, time synchronization, ground truth, and scenario diversity. To address these challenges, we present i2Nav-Robot, a large-scale dataset designed for multi-sensor fusion navigation and mapping in indoor-outdoor environments. We integrate multi-modal sensors, including the newest front-view and 360-degree solid-state LiDARs, 4-dimensional (4D) radar, stereo cameras, odometer, global navigation satellite system (GNSS) receiver, and inertial measurement units (IMU) on an omnidirectional wheeled robot. Accurate timestamps are obtained through both online hardware synchronization and offline calibration for all sensors. The dataset comprises ten larger-scale sequences covering diverse UGV operating scenarios, such as outdoor streets, and indoor parking lots, with a total length of about 17060 meters. High-frequency ground truth, with centimeter-level accuracy for position, is derived from post-processing integrated navigation methods using a navigation-grade IMU. The proposed i2Nav-Robot dataset is evaluated by more than ten open-sourced multi-sensor fusion systems, and it has proven to have superior data quality.
comment: 10 pages, 12 figures
OVSegDT: Segmenting Transformer for Open-Vocabulary Object Goal Navigation
Open-vocabulary Object Goal Navigation requires an embodied agent to reach objects described by free-form language, including categories never seen during training. Existing end-to-end policies overfit small simulator datasets, achieving high success on training scenes but failing to generalize and exhibiting unsafe behaviour (frequent collisions). We introduce OVSegDT, a lightweight transformer policy that tackles these issues with two synergistic components. The first component is the semantic branch, which includes an encoder for the target binary mask and an auxiliary segmentation loss function, grounding the textual goal and providing precise spatial cues. The second component consists of a proposed Entropy-Adaptive Loss Modulation, a per-sample scheduler that continuously balances imitation and reinforcement signals according to the policy entropy, eliminating brittle manual phase switches. These additions cut the sample complexity of training by 33%, and reduce collision count in two times while keeping inference cost low (130M parameters, RGB-only input). On HM3D-OVON, our model matches the performance on unseen categories to that on seen ones and establishes state-of-the-art results (40.1% SR, 20.9% SPL on val unseen) without depth, odometry, or large vision-language models. Code is available at https://github.com/CognitiveAISystems/OVSegDT.
EvoPSF: Online Evolution of Autonomous Driving Models via Planning-State Feedback
Recent years have witnessed remarkable progress in autonomous driving, with systems evolving from modular pipelines to end-to-end architectures. However, most existing methods are trained offline and lack mechanisms to adapt to new environments during deployment. As a result, their generalization ability diminishes when faced with unseen variations in real-world driving scenarios. In this paper, we break away from the conventional "train once, deploy forever" paradigm and propose EvoPSF, a novel online Evolution framework for autonomous driving based on Planning-State Feedback. We argue that planning failures are primarily caused by inaccurate object-level motion predictions, and such failures are often reflected in the form of increased planner uncertainty. To address this, we treat planner uncertainty as a trigger for online evolution, using it as a diagnostic signal to initiate targeted model updates. Rather than performing blind updates, we leverage the planner's agent-agent attention to identify the specific objects that the ego vehicle attends to most, which are primarily responsible for the planning failures. For these critical objects, we compute a targeted self-supervised loss by comparing their predicted waypoints from the prediction module with their actual future positions, selected from the perception module's outputs with high confidence scores. This loss is then backpropagated to adapt the model online. As a result, our method improves the model's robustness to environmental changes, leads to more precise motion predictions, and therefore enables more accurate and stable planning behaviors. Experiments on both cross-region and corrupted variants of the nuScenes dataset demonstrate that EvoPSF consistently improves planning performance under challenging conditions.
ReachVox: Clutter-free Reachability Visualization for Robot Motion Planning in Virtual Reality
Human-Robot-Collaboration can enhance workflows by leveraging the mutual strengths of human operators and robots. Planning and understanding robot movements remain major challenges in this domain. This problem is prevalent in dynamic environments that might need constant robot motion path adaptation. In this paper, we investigate whether a minimalistic encoding of the reachability of a point near an object of interest, which we call ReachVox, can aid the collaboration between a remote operator and a robotic arm in VR. Through a user study (n=20), we indicate the strength of the visualization relative to a point-based reachability check-up.
comment: To appear in Proceedings of IEEE ISMAR 2025
Open, Reproducible and Trustworthy Robot-Based Experiments with Virtual Labs and Digital-Twin-Based Execution Tracing IROS
We envision a future in which autonomous robots conduct scientific experiments in ways that are not only precise and repeatable, but also open, trustworthy, and transparent. To realize this vision, we present two key contributions: a semantic execution tracing framework that logs sensor data together with semantically annotated robot belief states, ensuring that automated experimentation is transparent and replicable; and the AICOR Virtual Research Building (VRB), a cloud-based platform for sharing, replicating, and validating robot task executions at scale. Together, these tools enable reproducible, robot-driven science by integrating deterministic execution, semantic memory, and open knowledge representation, laying the foundation for autonomous systems to participate in scientific discovery.
comment: 8 pages, 6 figures, submitted to the 1st IROS Workshop on Embodied AI and Robotics for Future Scientific Discovery
An Exploratory Study on Crack Detection in Concrete through Human-Robot Collaboration
Structural inspection in nuclear facilities is vital for maintaining operational safety and integrity. Traditional methods of manual inspection pose significant challenges, including safety risks, high cognitive demands, and potential inaccuracies due to human limitations. Recent advancements in Artificial Intelligence (AI) and robotic technologies have opened new possibilities for safer, more efficient, and accurate inspection methodologies. Specifically, Human-Robot Collaboration (HRC), leveraging robotic platforms equipped with advanced detection algorithms, promises significant improvements in inspection outcomes and reductions in human workload. This study explores the effectiveness of AI-assisted visual crack detection integrated into a mobile Jackal robot platform. The experiment results indicate that HRC enhances inspection accuracy and reduces operator workload, resulting in potential superior performance outcomes compared to traditional manual methods.
Pedestrian Dead Reckoning using Invariant Extended Kalman Filter
This paper presents a cost-effective inertial pedestrian dead reckoning method for the bipedal robot in the GPS-denied environment. Each time when the inertial measurement unit (IMU) is on the stance foot, a stationary pseudo-measurement can be executed to provide innovation to the IMU measurement based prediction. The matrix Lie group based theoretical development of the adopted invariant extended Kalman filter (InEKF) is set forth for tutorial purpose. Three experiments are conducted to compare between InEKF and standard EKF, including motion capture benchmark experiment, large-scale multi-floor walking experiment, and bipedal robot experiment, as an effort to show our method's feasibility in real-world robot system. In addition, a sensitivity analysis is included to show that InEKF is much easier to tune than EKF.
Optimizing ROS 2 Communication for Wireless Robotic Systems
Wireless transmission of large payloads, such as high-resolution images and LiDAR point clouds, is a major bottleneck in ROS 2, the leading open-source robotics middleware. The default Data Distribution Service (DDS) communication stack in ROS 2 exhibits significant performance degradation over lossy wireless links. Despite the widespread use of ROS 2, the underlying causes of these wireless communication challenges remain unexplored. In this paper, we present the first in-depth network-layer analysis of ROS 2's DDS stack under wireless conditions with large payloads. We identify the following three key issues: excessive IP fragmentation, inefficient retransmission timing, and congestive buffer bursts. To address these issues, we propose a lightweight and fully compatible DDS optimization framework that tunes communication parameters based on link and payload characteristics. Our solution can be seamlessly applied through the standard ROS 2 application interface via simple XML-based QoS configuration, requiring no protocol modifications, no additional components, and virtually no integration efforts. Extensive experiments across various wireless scenarios demonstrate that our framework successfully delivers large payloads in conditions where existing DDS modes fail, while maintaining low end-to-end latency.
comment: 10 pages, 8 figures
A Recursive Total Least Squares Solution for Bearing-Only Target Motion Analysis and Circumnavigation IROS
Bearing-only Target Motion Analysis (TMA) is a promising technique for passive tracking in various applications as a bearing angle is easy to measure. Despite its advantages, bearing-only TMA is challenging due to the nonlinearity of the bearing measurement model and the lack of range information, which impairs observability and estimator convergence. This paper addresses these issues by proposing a Recursive Total Least Squares (RTLS) method for online target localization and tracking using mobile observers. The RTLS approach, inspired by previous results on Total Least Squares (TLS), mitigates biases in position estimation and improves computational efficiency compared to pseudo-linear Kalman filter (PLKF) methods. Additionally, we propose a circumnavigation controller to enhance system observability and estimator convergence by guiding the mobile observer in orbit around the target. Extensive simulations and experiments are performed to demonstrate the effectiveness and robustness of the proposed method. The proposed algorithm is also compared with the state-of-the-art approaches, which confirms its superior performance in terms of both accuracy and stability.
comment: Accepted by 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 6 Pages
Scene Graph-Guided Proactive Replanning for Failure-Resilient Embodied Agent
When humans perform everyday tasks, we naturally adjust our actions based on the current state of the environment. For instance, if we intend to put something into a drawer but notice it is closed, we open it first. However, many autonomous robots lack this adaptive awareness. They often follow pre-planned actions that may overlook subtle yet critical changes in the scene, which can result in actions being executed under outdated assumptions and eventual failure. While replanning is critical for robust autonomy, most existing methods respond only after failures occur, when recovery may be inefficient or infeasible. While proactive replanning holds promise for preventing failures in advance, current solutions often rely on manually designed rules and extensive supervision. In this work, we present a proactive replanning framework that detects and corrects failures at subtask boundaries by comparing scene graphs constructed from current RGB-D observations against reference graphs extracted from successful demonstrations. When the current scene fails to align with reference trajectories, a lightweight reasoning module is activated to diagnose the mismatch and adjust the plan. Experiments in the AI2-THOR simulator demonstrate that our approach detects semantic and spatial mismatches before execution failures occur, significantly improving task success and robustness.
Learning Differentiable Reachability Maps for Optimization-based Humanoid Motion Generation
To reduce the computational cost of humanoid motion generation, we introduce a new approach to representing robot kinematic reachability: the differentiable reachability map. This map is a scalar-valued function defined in the task space that takes positive values only in regions reachable by the robot's end-effector. A key feature of this representation is that it is continuous and differentiable with respect to task-space coordinates, enabling its direct use as constraints in continuous optimization for humanoid motion planning. We describe a method to learn such differentiable reachability maps from a set of end-effector poses generated using a robot's kinematic model, using either a neural network or a support vector machine as the learning model. By incorporating the learned reachability map as a constraint, we formulate humanoid motion generation as a continuous optimization problem. We demonstrate that the proposed approach efficiently solves various motion planning problems, including footstep planning, multi-contact motion planning, and loco-manipulation planning for humanoid robots.
Tactile Robotics: An Outlook
Robotics research has long sought to give robots the ability to perceive the physical world through touch in an analogous manner to many biological systems. Developing such tactile capabilities is important for numerous emerging applications that require robots to co-exist and interact closely with humans. Consequently, there has been growing interest in tactile sensing, leading to the development of various technologies, including piezoresistive and piezoelectric sensors, capacitive sensors, magnetic sensors, and optical tactile sensors. These diverse approaches utilise different transduction methods and materials to equip robots with distributed sensing capabilities, enabling more effective physical interactions. These advances have been supported in recent years by simulation tools that generate large-scale tactile datasets to support sensor designs and algorithms to interpret and improve the utility of tactile data. The integration of tactile sensing with other modalities, such as vision, as well as with action strategies for active tactile perception highlights the growing scope of this field. To further the transformative progress in tactile robotics, a holistic approach is essential. In this outlook article, we examine several challenges associated with the current state of the art in tactile robotics and explore potential solutions to inspire innovations across multiple domains, including manufacturing, healthcare, recycling and agriculture.
comment: 20 pages, 2 figures, accepted to IEEE Transactions on Robotics
Embodied Edge Intelligence Meets Near Field Communication: Concept, Design, and Verification
Realizing embodied artificial intelligence is challenging due to the huge computation demands of large models (LMs). To support LMs while ensuring real-time inference, embodied edge intelligence (EEI) is a promising paradigm, which leverages an LM edge to provide computing powers in close proximity to embodied robots. Due to embodied data exchange, EEI requires higher spectral efficiency, enhanced communication security, and reduced inter-user interference. To meet these requirements, near-field communication (NFC), which leverages extremely large antenna arrays as its hardware foundation, is an ideal solution. Therefore, this paper advocates the integration of EEI and NFC, resulting in a near-field EEI (NEEI) paradigm. However, NEEI also introduces new challenges that cannot be adequately addressed by isolated EEI or NFC designs, creating research opportunities for joint optimization of both functionalities. To this end, we propose radio-friendly embodied planning for EEI-assisted NFC scenarios and view-guided beam-focusing for NFC-assisted EEI scenarios. We also elaborate how to realize resource-efficient NEEI through opportunistic collaborative navigation. Experimental results are provided to confirm the superiority of the proposed techniques compared with various benchmarks.
comment: 9 pages, 6 figures, to appear in IEEE Network
Multi-Group Equivariant Augmentation for Reinforcement Learning in Robot Manipulation
Sampling efficiency is critical for deploying visuomotor learning in real-world robotic manipulation. While task symmetry has emerged as a promising inductive bias to improve efficiency, most prior work is limited to isometric symmetries -- applying the same group transformation to all task objects across all timesteps. In this work, we explore non-isometric symmetries, applying multiple independent group transformations across spatial and temporal dimensions to relax these constraints. We introduce a novel formulation of the partially observable Markov decision process (POMDP) that incorporates the non-isometric symmetry structures, and propose a simple yet effective data augmentation method, Multi-Group Equivariance Augmentation (MEA). We integrate MEA with offline reinforcement learning to enhance sampling efficiency, and introduce a voxel-based visual representation that preserves translational equivariance. Extensive simulation and real-robot experiments across two manipulation domains demonstrate the effectiveness of our approach.
Visuomotor Grasping with World Models for Surgical Robots
Grasping is a fundamental task in robot-assisted surgery (RAS), and automating it can reduce surgeon workload while enhancing efficiency, safety, and consistency beyond teleoperated systems. Most prior approaches rely on explicit object pose tracking or handcrafted visual features, limiting their generalization to novel objects, robustness to visual disturbances, and the ability to handle deformable objects. Visuomotor learning offers a promising alternative, but deploying it in RAS presents unique challenges, such as low signal-to-noise ratio in visual observations, demands for high safety and millimeter-level precision, as well as the complex surgical environment. This paper addresses three key challenges: (i) sim-to-real transfer of visuomotor policies to ex vivo surgical scenes, (ii) visuomotor learning using only a single stereo camera pair -- the standard RAS setup, and (iii) object-agnostic grasping with a single policy that generalizes to diverse, unseen surgical objects without retraining or task-specific models. We introduce Grasp Anything for Surgery V2 (GASv2), a visuomotor learning framework for surgical grasping. GASv2 leverages a world-model-based architecture and a surgical perception pipeline for visual observations, combined with a hybrid control system for safe execution. We train the policy in simulation using domain randomization for sim-to-real transfer and deploy it on a real robot in both phantom-based and ex vivo surgical settings, using only a single pair of endoscopic cameras. Extensive experiments show our policy achieves a 65% success rate in both settings, generalizes to unseen objects and grippers, and adapts to diverse disturbances, demonstrating strong performance, generality, and robustness.
Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward
Existing reinforcement learning (RL) methods struggle with long-horizon robotic manipulation tasks, particularly those involving sparse rewards. While action chunking is a promising paradigm for robotic manipulation, using RL to directly learn continuous action chunks in a stable and data-efficient manner remains a critical challenge. This paper introduces AC3 (Actor-Critic for Continuous Chunks), a novel RL framework that learns to generate high-dimensional, continuous action sequences. To make this learning process stable and data-efficient, AC3 incorporates targeted stabilization mechanisms for both the actor and the critic. First, to ensure reliable policy improvement, the actor is trained with an asymmetric update rule, learning exclusively from successful trajectories. Second, to enable effective value learning despite sparse rewards, the critic's update is stabilized using intra-chunk $n$-step returns and further enriched by a self-supervised module providing intrinsic rewards at anchor points aligned with each action chunk. We conducted extensive experiments on 25 tasks from the BiGym and RLBench benchmarks. Results show that by using only a few demonstrations and a simple model architecture, AC3 achieves superior success rates on most tasks, validating its effective design.
Geometry-Aware Predictive Safety Filters on Humanoids: From Poisson Safety Functions to CBF Constrained MPC
Autonomous navigation through unstructured and dynamically-changing environments is a complex task that continues to present many challenges for modern roboticists. In particular, legged robots typically possess manipulable asymmetric geometries which must be considered during safety-critical trajectory planning. This work proposes a predictive safety filter: a nonlinear model predictive control (MPC) algorithm for online trajectory generation with geometry-aware safety constraints based on control barrier functions (CBFs). Critically, our method leverages Poisson safety functions to numerically synthesize CBF constraints directly from perception data. We extend the theoretical framework for Poisson safety functions to incorporate temporal changes in the domain by reformulating the static Dirichlet problem for Poisson's equation as a parameterized moving boundary value problem. Furthermore, we employ Minkowski set operations to lift the domain into a configuration space that accounts for robot geometry. Finally, we implement our real-time predictive safety filter on humanoid and quadruped robots in various safety-critical scenarios. The results highlight the versatility of Poisson safety functions, as well as the benefit of CBF constrained model predictive safety-critical controllers.
comment: 2025 IEEE-RAS 24th International Conference on Humanoid Robots
Recent Advances in Transformer and Large Language Models for UAV Applications
The rapid advancement of Transformer-based models has reshaped the landscape of uncrewed aerial vehicle (UAV) systems by enhancing perception, decision-making, and autonomy. This review paper systematically categorizes and evaluates recent developments in Transformer architectures applied to UAVs, including attention mechanisms, CNN-Transformer hybrids, reinforcement learning Transformers, and large language models (LLMs). Unlike previous surveys, this work presents a unified taxonomy of Transformer-based UAV models, highlights emerging applications such as precision agriculture and autonomous navigation, and provides comparative analyses through structured tables and performance benchmarks. The paper also reviews key datasets, simulators, and evaluation metrics used in the field. Furthermore, it identifies existing gaps in the literature, outlines critical challenges in computational efficiency and real-time deployment, and offers future research directions. This comprehensive synthesis aims to guide researchers and practitioners in understanding and advancing Transformer-driven UAV technologies.
Control of a commercial vehicle by a tetraplegic human using a bimanual brain-computer interface
Brain-computer interfaces (BCIs) read neural signals directly from the brain to infer motor planning and execution. However, the implementation of this technology has been largely limited to laboratory settings, with few real-world applications. We developed a bimanual BCI system to drive a vehicle in both simulated and real-world environments. We demonstrate that an individual with tetraplegia, implanted with intracortical BCI electrodes in the posterior parietal cortex (PPC) and the hand knob region of the motor cortex (MC), reacts at least as fast and precisely as motor intact participants, and drives a simulated vehicle as proficiently as the same control group. This BCI participant, living in California, could also remotely drive a Ford Mustang Mach-E vehicle in Michigan. Our first teledriving task relied on cursor control for speed and steering in a closed urban test facility. However, the final BCI system added click control for full-stop braking and thus enabled bimanual cursor-and-click control for both simulated driving through a virtual town with traffic and teledriving through an obstacle course without traffic in the real world. We also demonstrate the safety and feasibility of BCI-controlled driving. This first-of-its-kind implantable BCI application not only highlights the versatility and innovative potentials of BCIs but also illuminates the promising future for the development of life-changing solutions to restore independence to those who suffer catastrophic neurological injury.
comment: 41 pages, 7 figures, 1 table. 22 supplementary pages, 6 supplementary figures, 11 supplementary tables, 9 supplementary movies available as ancillary files
Anticipatory and Adaptive Footstep Streaming for Teleoperated Bipedal Robots
Achieving seamless synchronization between user and robot motion in teleoperation, particularly during high-speed tasks, remains a significant challenge. In this work, we propose a novel approach for transferring stepping motions from the user to the robot in real-time. Instead of directly replicating user foot poses, we retarget user steps to robot footstep locations, allowing the robot to utilize its own dynamics for locomotion, ensuring better balance and stability. Our method anticipates user footsteps to minimize delays between when the user initiates and completes a step and when the robot does it. The step estimates are continuously adapted to converge with the measured user references. Additionally, the system autonomously adjusts the robot's steps to account for its surrounding terrain, overcoming challenges posed by environmental mismatches between the user's flat-ground setup and the robot's uneven terrain. Experimental results on the humanoid robot Nadia demonstrate the effectiveness of the proposed system.
comment: 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids)
Scaling Robust Optimization for Swarms: A Distributed Perspective
This article introduces a decentralized robust optimization framework for safe multi-agent control under uncertainty. Although stochastic noise has been the primary form of modeling uncertainty in such systems, these formulations might fall short in addressing uncertainties that are deterministic in nature or simply lack probabilistic data. To ensure safety under such scenarios, we employ the concept of robust constraints that must hold for all possible uncertainty realizations lying inside a bounded set. Nevertheless, standard robust optimization approaches become intractable due to the large number or non-convexity of the constraints involved in safe multi-agent control. To address this, we introduce novel robust reformulations that significantly reduce complexity without compromising safety. The applicability of the framework is further broadened to address both deterministic and stochastic uncertainties by incorporating robust chance constraints and distribution steering techniques. To achieve scalability, we derive a distributed approach based on the Alternating Direction Method of Multipliers (ADMM), supported by a convergence study that accounts for the underlying non-convexity. In addition, computational complexity bounds highlighting the efficiency of the proposed frameworks against standard approaches are presented. Finally, the robustness and scalability of the framework is demonstrated through extensive simulation results across diverse scenarios, including environments with nonconvex obstacles and up to 246 agents.
Using Natural Language for Human-Robot Collaboration in the Real World
We have a vision of a day when autonomous robots can collaborate with humans as assistants in performing complex tasks in the physical world. This vision includes that the robots will have the ability to communicate with their human collaborators using language that is natural to the humans. Traditional Interactive Task Learning (ITL) systems have some of this ability, but the language they can understand is very limited. The advent of large language models (LLMs) provides an opportunity to greatly improve the language understanding of robots, yet integrating the language abilities of LLMs with robots that operate in the real physical world is a challenging problem. In this chapter we first review briefly a few commercial robot products that work closely with humans, and discuss how they could be much better collaborators with robust language abilities. We then explore how an AI system with a cognitive agent that controls a physical robot at its core, interacts with both a human and an LLM, and accumulates situational knowledge through its experiences, can be a possible approach to reach that vision. We focus on three specific challenges of having the robot understand natural language, and present a simple proof-of-concept experiment using ChatGPT for each. Finally, we discuss what it will take to turn these simple experiments into an operational system where LLM-assisted language understanding is a part of an integrated robotic assistant that uses language to collaborate with humans.
comment: 34 pages, 11 figures, 5 tables. Submitted for publication (2026) in W.F. Lawless, Ranjeev Mittu, Shannon P. McGrarry, & Marco Brambilla (Eds.), Generative AI Risks and Benefits within Human-Machine Teams, Elsevier, Chapter 6
An Open-Source User-Friendly Interface for Simulating Magnetic Soft Robots using Simulation Open Framework Architecture (SOFA)
Soft robots, particularly magnetic soft robots, require specialized simulation tools to accurately model their deformation under external magnetic fields. However, existing platforms often lack dedicated support for magnetic materials, making them difficult to use for researchers at different expertise levels. This work introduces an open-source, user-friendly simulation interface using the Simulation Open Framework Architecture (SOFA), specifically designed to model magnetic soft robots. The tool enables users to define material properties, apply magnetic fields, and observe resulting deformations in real time. By integrating intuitive controls and stress analysis capabilities, it aims to bridge the gap between theoretical modeling and practical design. Four benchmark models -- a beam, three- and four-finger grippers, and a butterfly -- demonstrate its functionality. The software's ease of use makes it accessible to both beginners and advanced researchers. Future improvements will refine accuracy through experimental validation and comparison with industry-standard finite element solvers, ensuring realistic and predictive simulations of magnetic soft robots.
Why Report Failed Interactions With Robots?! Towards Vignette-based Interaction Quality
Although the quality of human-robot interactions has improved with the advent of LLMs, there are still various factors that cause systems to be sub-optimal when compared to human-human interactions. The nature and criticality of failures are often dependent on the context of the interaction and so cannot be generalized across the wide range of scenarios and experiments which have been implemented in HRI research. In this work we propose the use of a technique overlooked in the field of HRI, ethnographic vignettes, to clearly highlight these failures, particularly those that are rarely documented. We describe the methodology behind the process of writing vignettes and create our own based on our personal experiences with failures in HRI systems. We emphasize the strength of vignettes as the ability to communicate failures from a multi-disciplinary perspective, promote transparency about the capabilities of robots, and document unexpected behaviours which would otherwise be omitted from research reports. We encourage the use of vignettes to augment existing interaction evaluation methods.
comment: Accepted at the workshop on Real-World HRI in Public and Private Spaces: Successes, Failures, and Lessons Learned (PubRob-Fails), held at the IEEE RO-MAN Conference, 2025. 6 pages
KDPE: A Kernel Density Estimation Strategy for Diffusion Policy Trajectory Selection
Learning robot policies that capture multimodality in the training data has been a long-standing open challenge for behavior cloning. Recent approaches tackle the problem by modeling the conditional action distribution with generative models. One of these approaches is Diffusion Policy, which relies on a diffusion model to denoise random points into robot action trajectories. While achieving state-of-the-art performance, it has two main drawbacks that may lead the robot out of the data distribution during policy execution. First, the stochasticity of the denoising process can highly impact on the quality of generated trajectory of actions. Second, being a supervised learning approach, it can learn data outliers from the dataset used for training. Recent work focuses on mitigating these limitations by combining Diffusion Policy either with large-scale training or with classical behavior cloning algorithms. Instead, we propose KDPE, a Kernel Density Estimation-based strategy that filters out potentially harmful trajectories output of Diffusion Policy while keeping a low test-time computational overhead. For Kernel Density Estimation, we propose a manifold-aware kernel to model a probability density function for actions composed of end-effector Cartesian position, orientation, and gripper state. KDPE overall achieves better performance than Diffusion Policy on simulated single-arm tasks and real robot experiments. Additional material and code are available on our project page at https://hsp-iit.github.io/KDPE/.
comment: 9th Conference on Robot Learning (CoRL 2025), Seoul, Korea
Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors
A dexterous hand capable of generalizable grasping objects is fundamental for the development of general-purpose embodied AI. However, previous methods focus narrowly on low-level grasp stability metrics, neglecting affordance-aware positioning and human-like poses which are crucial for downstream manipulation. To address these limitations, we propose AffordDex, a novel framework with two-stage training that learns a universal grasping policy with an inherent understanding of both motion priors and object affordances. In the first stage, a trajectory imitator is pre-trained on a large corpus of human hand motions to instill a strong prior for natural movement. In the second stage, a residual module is trained to adapt these general human-like motions to specific object instances. This refinement is critically guided by two components: our Negative Affordance-aware Segmentation (NAA) module, which identifies functionally inappropriate contact regions, and a privileged teacher-student distillation process that ensures the final vision-based policy is highly successful. Extensive experiments demonstrate that AffordDex not only achieves universal dexterous grasping but also remains remarkably human-like in posture and functionally appropriate in contact location. As a result, AffordDex significantly outperforms state-of-the-art baselines across seen objects, unseen instances, and even entirely novel categories.
comment: 13 pages, 8 figures
IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model
Vision-Language-Action (VLA) models have demonstrated potential in autonomous driving. However, two critical challenges hinder their development: (1) Existing VLA architectures are typically based on imitation learning in open-loop setup which tends to capture the recorded behaviors in the dataset, leading to suboptimal and constrained performance, (2) Close-loop training relies heavily on high-fidelity sensor simulation, where domain gaps and computational inefficiencies pose significant barriers. In this paper, we introduce IRL-VLA, a novel close-loop Reinforcement Learning via \textbf{I}nverse \textbf{R}einforcement \textbf{L}earning reward world model with a self-built VLA approach. Our framework proceeds in a three-stage paradigm: In the first stage, we propose a VLA architecture and pretrain the VLA policy via imitation learning. In the second stage, we construct a lightweight reward world model via inverse reinforcement learning to enable efficient close-loop reward computation. To further enhance planning performance, finally, we design specialized reward world model guidence reinforcement learning via PPO(Proximal Policy Optimization) to effectively balance the safety incidents, comfortable driving, and traffic efficiency. Our approach achieves state-of-the-art performance in NAVSIM v2 end-to-end driving benchmark, 1st runner up in CVPR2025 Autonomous Grand Challenge. We hope that our framework will accelerate VLA research in close-loop autonomous driving.
comment: 9 pagres, 2 figures
Optimal Planning and Machine Learning for Responsive Tracking and Enhanced Forecasting of Wildfires using a Spacecraft Constellation
We propose a novel concept of operations using optimal planning methods and machine learning (ML) to collect spaceborne data that is unprecedented for monitoring wildfires, process it to create new or enhanced products in the context of wildfire danger or spread monitoring, and assimilate them to improve existing, wildfire decision support tools delivered to firefighters within latency appropriate for time-critical applications. The concept is studied with respect to NASA's CYGNSS Mission, a constellation of passive microwave receivers that measure specular GNSS-R reflections despite clouds and smoke. Our planner uses a Mixed Integer Program formulation to schedule joint observation data collection and downlink for all satellites. Optimal solutions are found quickly that collect 98-100% of available observation opportunities. ML-based fire predictions that drive the planner objective are greater than 40% more correlated with ground truth than existing state-of-art. The presented case study on the TX Smokehouse Creek fire in 2024 and LA fires in 2025 represents the first high-resolution data collected by CYGNSS of active fires. Creation of Burnt Area Maps (BAM) using ML on data from active fires and BAM assimilation into NASA's Weather Research and Forecasting Model using neural nets to broadcast fire spread are novel outcomes. BAM and CYGNSS obtained soil moisture are integrated for the first time into USGS fire danger maps. Inclusion of CYGNSS data in ML-based burn predictions boosts accuracy by 13%, and inclusion of high-resolution data boosts ML recall by another 15%. The proposed workflow has an expected latency of 6-30h, improving on the current delivery time of multiple days. All components in the proposed concept are shown to be computationally scalable and globally generalizable, with sustainability considerations such as edge efficiency and low latency on small devices.
Diffusion Beats Autoregressive in Data-Constrained Settings
Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages over AR models remain underexplored. In this paper, we systematically study masked diffusion models in data-constrained settings-where training involves repeated passes over limited data and find that they significantly outperform AR models when compute is abundant but data is scarce. Diffusion models make better use of repeated data, achieving lower validation loss and superior downstream performance. We find new scaling laws for diffusion models and derive a closed-form expression for the critical compute threshold at which diffusion begins to outperform AR. Finally, we explain why diffusion models excel in this regime: their randomized masking objective implicitly trains over a rich distribution of token orderings, acting as an implicit data augmentation that AR's fixed left-to-right factorization lacks. Our results suggest that when data, not compute, is the bottleneck, diffusion models offer a compelling alternative to the standard AR paradigm. Our code is available at: https://diffusion-scaling.github.io.
comment: Project Webpage: https://diffusion-scaling.github.io
Propeller Motion of a Devil-Stick using Normal Forcing
The problem of realizing rotary propeller motion of a devil-stick in the vertical plane using forces purely normal to the stick is considered. This problem represents a nonprehensile manipulation task of an underactuated system. In contrast with previous approaches, the devil-stick is manipulated by controlling the normal force and its point of application. Virtual holonomic constraints are used to design the trajectory of the center-of-mass of the devil-stick in terms of its orientation angle, and conditions for stable propeller motion are derived. Intermittent large-amplitude forces are used to asymptotically stabilize a desired propeller motion. Simulations demonstrate the efficacy of the approach in realizing stable propeller motion without loss of contact between the actuator and devil-stick.
comment: 6 pages, 5 figures. This work has been accepted for publication in the proceedings of the 2025 IEEE Conference on Control Technology and Applications (CCTA)
Dancing with REEM-C: A robot-to-human physical-social communication study
Humans often work closely together and relay a wealth of information through physical interaction. Robots, on the other hand, are not yet able to work similarly closely with humans and to effectively convey information when engaging in physical-social human-robot interaction (psHRI). This currently limits the potential of human-robot collaboration to solve real-world problems. This paper investigates how to establish clear and intuitive robot-to-human communication, while considering human comfort during psHRI. We approach this question from the perspective of a leader-follower dancing scenario, in which a full-body humanoid robot leads a human by signaling the next steps through a choice of communication modalities including haptic, visual, and audio signals. This is achieved through the development of a split whole-body control framework combining admittance and impedance control on the upper body, with position control on the lower body for balancing and stepping. Robot-led psHRI participant experiments allowed us to verify controller performance, as well as to build an understanding of what types of communication work better from the perspective of human partners, particularly in terms of perceived effectiveness and comfort.
comment: 21 pages, 16 figures
From Autonomy to Agency: Agentic Vehicles for Human-Centered Mobility Systems
Autonomy, from the Greek autos (self) and nomos (law), refers to the capacity to operate according to internal rules without external control. Accordingly, autonomous vehicles (AuVs) are viewed as vehicular systems capable of perceiving their environment and executing pre-programmed tasks independently of external input. However, both research and real-world deployments increasingly showcase vehicles that demonstrate behaviors beyond this definition (including the SAE levels 0 to 5); Examples of this outpace include the interaction with humans with natural language, goal adaptation, contextual reasoning, external tool use, and unseen ethical dilemma handling, largely empowered by multi-modal large language models (LLMs). These developments reveal a conceptual gap between technical autonomy and the broader cognitive and social capabilities needed for future human-centered mobility systems. To address this gap, this paper introduces the concept of agentic vehicles (AgVs), referring to vehicles that integrate agentic AI systems to reason, adapt, and interact within complex environments. This paper proposes the term AgVs and their distinguishing characteristics from conventional AuVs. It synthesizes relevant advances in integrating LLMs and AuVs and highlights how AgVs might transform future mobility systems and ensure the systems are human-centered. The paper concludes by identifying key challenges in the development and governance of AgVs, and how they can play a significant role in future agentic transportation systems.
A Segmented Robot Grasping Perception Neural Network for Edge AI
Robotic grasping, the ability of robots to reliably secure and manipulate objects of varying shapes, sizes and orientations, is a complex task that requires precise perception and control. Deep neural networks have shown remarkable success in grasp synthesis by learning rich and abstract representations of objects. When deployed at the edge, these models can enable low-latency, low-power inference, making real-time grasping feasible in resource-constrained environments. This work implements Heatmap-Guided Grasp Detection, an end-to-end framework for the detection of 6-Dof grasp poses, on the GAP9 RISC-V System-on-Chip. The model is optimised using hardware-aware techniques, including input dimensionality reduction, model partitioning, and quantisation. Experimental evaluation on the GraspNet-1Billion benchmark validates the feasibility of fully on-chip inference, highlighting the potential of low-power MCUs for real-time, autonomous manipulation.
comment: Accepted by SMC 2025
CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking ACM MM 2025
Mobile robots are increasingly required to navigate and interact within unknown and unstructured environments to meet human demands. Demand-driven navigation (DDN) enables robots to identify and locate objects based on implicit human intent, even when object locations are unknown. However, traditional data-driven DDN methods rely on pre-collected data for model training and decision-making, limiting their generalization capability in unseen scenarios. In this paper, we propose CogDDN, a VLM-based framework that emulates the human cognitive and learning mechanisms by integrating fast and slow thinking systems and selectively identifying key objects essential to fulfilling user demands. CogDDN identifies appropriate target objects by semantically aligning detected objects with the given instructions. Furthermore, it incorporates a dual-process decision-making module, comprising a Heuristic Process for rapid, efficient decisions and an Analytic Process that analyzes past errors, accumulates them in a knowledge base, and continuously improves performance. Chain of Thought (CoT) reasoning strengthens the decision-making process. Extensive closed-loop evaluations on the AI2Thor simulator with the ProcThor dataset show that CogDDN outperforms single-view camera-only methods by 15\%, demonstrating significant improvements in navigation accuracy and adaptability. The project page is available at https://yuehaohuang.github.io/CogDDN/.
comment: Accepted by ACM MM 2025
UniTracker: Learning Universal Whole-Body Motion Tracker for Humanoid Robots
Achieving expressive and generalizable whole-body motion control is essential for deploying humanoid robots in real-world environments. In this work, we propose UniTracker, a three-stage training framework that enables robust and scalable motion tracking across a wide range of human behaviors. In the first stage, we train a teacher policy with privileged observations to generate high-quality actions. In the second stage, we introduce a Conditional Variational Autoencoder (CVAE) to model a universal student policy that can be deployed directly on real hardware. The CVAE structure allows the policy to learn a global latent representation of motion, enhancing generalization to unseen behaviors and addressing the limitations of standard MLP-based policies under partial observations. Unlike pure MLPs that suffer from drift in global attributes like orientation, our CVAE-student policy incorporates global intent during training by aligning a partial-observation prior to the full-observation encoder. In the third stage, we introduce a fast adaptation module that fine-tunes the universal policy on harder motion sequences that are difficult to track directly. This adaptation can be performed both for single sequences and in batch mode, further showcasing the flexibility and scalability of our approach. We evaluate UniTracker in both simulation and real-world settings using a Unitree G1 humanoid, demonstrating strong performance in motion diversity, tracking accuracy, and deployment robustness.
comment: three-stage universal motion tracker for humanoid robots
A Computationally Efficient Maximum A Posteriori Sequence Estimation via Stein Variational Inference
State estimation in robotic systems presents significant challenges, particularly due to the prevalence of multimodal posterior distributions in real-world scenarios. One effective strategy for handling such complexity is to compute maximum a posteriori (MAP) sequences over a discretized or sampled state space, which enables a concise representation of the most likely state trajectory. However, this approach often incurs substantial computational costs, especially in high-dimensional settings. In this article, we propose a novel MAP sequence estimation method, \textsf{Stein-MAP-Seq}, which effectively addresses multimodality while substantially reducing computational and memory overhead. Our key contribution is a sequential variational inference framework that captures temporal dependencies in dynamical system models and integrates Stein variational gradient descent (SVGD) into a Viterbi-style dynamic programming algorithm, enabling computationally efficient MAP sequence estimation. \textsf{Stein-MAP-Seq} achieves a computational complexity of $\mathcal{O}(M^2)$, where $M$ is the number of particles, in contrast to the $\mathcal{O}(N^2)$ complexity of conventional MAP sequence estimators, with $N \gg M$. Furthermore, the method inherits SVGD's parallelism, enabling efficient computation for real-time deployment on GPU-equipped autonomous systems. We validate the proposed method in various multimodal scenarios, including those arising from nonlinear dynamics with ambiguous observations, unknown data associations, and temporary unobservability, demonstrating substantial improvements in estimation accuracy and robustness to multimodality over existing approaches.
comment: 14 pages
MARS-FTCP: Robust Fault-Tolerant Control and Agile Trajectory Planning for Modular Aerial Robot Systems
Modular Aerial Robot Systems (MARS) consist of multiple drone units that can self-reconfigure to adapt to various mission requirements and fault conditions. However, existing fault-tolerant control methods exhibit significant oscillations during docking and separation, impacting system stability. To address this issue, we propose a novel fault-tolerant control reallocation method that adapts to an arbitrary number of modular robots and their assembly formations. The algorithm redistributes the expected collective force and torque required for MARS to individual units according to their moment arm relative to the center of MARS mass. Furthermore, we propose an agile trajectory planning method for MARS of arbitrary configurations, which is collision-avoiding and dynamically feasible. Our work represents the first comprehensive approach to enable fault-tolerant and collision avoidance flight for MARS. We validate our method through extensive simulations, demonstrating improved fault tolerance, enhanced trajectory tracking accuracy, and greater robustness in cluttered environments. The videos and source code of this work are available at https://github.com/RuiHuangNUS/MARS-FTCP/
Tool-Planner: Task Planning with Clusters across Multiple Tools ICLR 2025
Large language models (LLMs) have demonstrated exceptional reasoning capabilities, enabling them to solve various complex problems. Recently, this ability has been applied to the paradigm of tool learning. Tool learning involves providing examples of tool usage and their corresponding functions, allowing LLMs to formulate plans and demonstrate the process of invoking and executing each tool. LLMs can address tasks that they cannot complete independently, thereby enhancing their potential across different tasks. However, this approach faces two key challenges. First, redundant error correction leads to unstable planning and long execution time. Additionally, designing a correct plan among multiple tools is also a challenge in tool learning. To address these issues, we propose Tool-Planner, a task-processing framework based on toolkits. Tool-Planner groups tools based on the API functions with the same function into a toolkit and allows LLMs to implement planning across the various toolkits. When a tool error occurs, the language model can reselect and adjust tools based on the toolkit. Experiments show that our approach demonstrates a high pass and win rate across different datasets and optimizes the planning scheme for tool learning in models such as GPT-4 and Claude 3, showcasing the potential of our method. Our code is public at https://github.com/OceannTwT/Tool-Planner
comment: ICLR 2025 Camera Ready version
EmbodiedAgent: A Scalable Hierarchical Approach to Overcome Practical Challenge in Multi-Robot Control
This paper introduces EmbodiedAgent, a hierarchical framework for heterogeneous multi-robot control. EmbodiedAgent addresses critical limitations of hallucination in impractical tasks. Our approach integrates a next-action prediction paradigm with a structured memory system to decompose tasks into executable robot skills while dynamically validating actions against environmental constraints. We present MultiPlan+, a dataset of more than 18,000 annotated planning instances spanning 100 scenarios, including a subset of impractical cases to mitigate hallucination. To evaluate performance, we propose the Robot Planning Assessment Schema (RPAS), combining automated metrics with LLM-aided expert grading. Experiments demonstrate EmbodiedAgent's superiority over state-of-the-art models, achieving 71.85% RPAS score. Real-world validation in an office service task highlights its ability to coordinate heterogeneous robots for long-horizon objectives.
Large-Scale Multi-Robot Assembly Planning for Autonomous Manufacturing
Mobile autonomous robots have the potential to revolutionize manufacturing processes. However, employing large robot fleets in manufacturing requires addressing challenges including collision-free movement in a shared workspace, effective multi-robot collaboration to manipulate and transport large payloads, complex task allocation due to coupled manufacturing processes, and spatial planning for parallel assembly and transportation of nested subassemblies. We propose a full algorithmic stack for large-scale multi-robot assembly planning that addresses these challenges and can synthesize construction plans for complex assemblies with thousands of parts in a matter of minutes. Our approach takes in a CAD-like product specification and automatically plans a full-stack assembly procedure for a group of robots to manufacture the product. We propose an algorithmic stack that comprises: (i) an iterative radial layout optimization procedure to define a global staging layout for the manufacturing facility, (ii) a graph-repair mixed-integer program formulation and a modified greedy task allocation algorithm to optimally allocate robots and robot sub-teams to assembly and transport tasks, (iii) a geometric heuristic and a hill-climbing algorithm to plan collaborative carrying configurations of robot sub-teams, and (iv) a distributed control policy that enables robots to execute the assembly motion plan collision-free. We also present an open-source multi-robot manufacturing simulator implemented in Julia as a resource to the research community, to test our algorithms and to facilitate multi-robot manufacturing research more broadly. Our empirical results demonstrate the scalability and effectiveness of our approach by generating plans to manufacture a LEGO model of a Saturn V launch vehicle with 1845 parts, 306 subassemblies, and 250 robots in under three minutes on a standard laptop computer.
comment: Repository: https://github.com/sisl/ConstructionBots.jl. Under review
Towards Physically Realizable Adversarial Attacks in Embodied Vision Navigation IROS
The significant advancements in embodied vision navigation have raised concerns about its susceptibility to adversarial attacks exploiting deep neural networks. Investigating the adversarial robustness of embodied vision navigation is crucial, especially given the threat of 3D physical attacks that could pose risks to human safety. However, existing attack methods for embodied vision navigation often lack physical feasibility due to challenges in transferring digital perturbations into the physical world. Moreover, current physical attacks for object detection struggle to achieve both multi-view effectiveness and visual naturalness in navigation scenarios. To address this, we propose a practical attack method for embodied navigation by attaching adversarial patches to objects, where both opacity and textures are learnable. Specifically, to ensure effectiveness across varying viewpoints, we employ a multi-view optimization strategy based on object-aware sampling, which optimizes the patch's texture based on feedback from the vision-based perception model used in navigation. To make the patch inconspicuous to human observers, we introduce a two-stage opacity optimization mechanism, in which opacity is fine-tuned after texture optimization. Experimental results demonstrate that our adversarial patches decrease the navigation success rate by an average of 22.39%, outperforming previous methods in practicality, effectiveness, and naturalness. Code is available at: https://github.com/chen37058/Physical-Attacks-in-Embodied-Nav
comment: 7 pages, 7 figures, Accept by IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models IROS 2025
Interpreting object-referential language and grounding objects in 3D with spatial relations and attributes is essential for robots operating alongside humans. However, this task is often challenging due to the diversity of scenes, large number of fine-grained objects, and complex free-form nature of language references. Furthermore, in the 3D domain, obtaining large amounts of natural language training data is difficult. Thus, it is important for methods to learn from little data and zero-shot generalize to new environments. To address these challenges, we propose SORT3D, an approach that utilizes rich object attributes from 2D data and merges a heuristics-based spatial reasoning toolbox with the ability of large language models (LLMs) to perform sequential reasoning. Importantly, our method does not require text-to-3D data for training and can be applied zero-shot to unseen environments. We show that SORT3D achieves state-of-the-art zero-shot performance on complex view-dependent grounding tasks on two benchmarks. We also implement the pipeline to run real-time on two autonomous vehicles and demonstrate that our approach can be used for object-goal navigation on previously unseen real-world environments. All source code for the system pipeline is publicly released at https://github.com/nzantout/SORT3D.
comment: 8 pages, 6 figures, published in IROS 2025
Formal Verification and Control with Conformal Prediction
We present recent advances in formal verification and control for autonomous systems with practical safety guarantees enabled by conformal prediction (CP), a statistical tool for uncertainty quantification. This survey is particularly motivated by learning-enabled autonomous systems (LEASs), where the complexity of learning-enabled components (LECs) poses a major bottleneck for applying traditional model-based verification and control techniques. To address this challenge, we advocate for CP as a lightweight alternative and demonstrate its use in formal verification, systems and control, and robotics. CP is appealing due to its simplicity (easy to understand, implement, and adapt), generality (requires no assumptions on learned models and underlying data distributions), and efficiency (real-time capable and accurate). This survey provides an accessible introduction to CP for non-experts interested in applying CP to autonomy problems. We particularly show how CP can be used for formal verification of LECs and the design of safe control as well as offline and online verification algorithms for LEASs. We present these techniques within a unifying framework that addresses the complexity of LEASs. Our exposition spans simple specifications, such as robot navigation tasks, to complex mission requirements expressed in temporal logic. Throughout the survey, we contrast CP with other statistical techniques, including scenario optimization and PAC-Bayes theory, highlighting advantages and limitations for verification and control. Finally, we outline open problems and promising directions for future research.
A flexible framework for accurate LiDAR odometry, map manipulation, and localization
LiDAR-based SLAM is a core technology for autonomous vehicles and robots. One key contribution of this work to 3D LiDAR SLAM and localization is a fierce defense of view-based maps (pose graphs with time-stamped sensor readings) as the fundamental representation of maps. As will be shown, they allow for the greatest flexibility, enabling the posterior generation of arbitrary metric maps optimized for particular tasks, e.g. obstacle avoidance, real-time localization. Moreover, this work introduces a new framework in which mapping pipelines can be defined without coding, defining the connections of a network of reusable blocks much like deep-learning networks are designed by connecting layers of standardized elements. We also introduce tightly-coupled estimation of linear and angular velocity vectors within the Iterative Closest Point (ICP)-like optimizer, leading to superior robustness against aggressive motion profiles without the need for an IMU. Extensive experimental validation reveals that the proposal compares well to, or improves, former state-of-the-art (SOTA) LiDAR odometry systems, while also successfully mapping some hard sequences where others diverge. A proposed self-adaptive configuration has been used, without parameter changes, for all 3D LiDAR datasets with sensors between 16 and 128 rings, and has been extensively tested on 83 sequences over more than 250~km of automotive, hand-held, airborne, and quadruped LiDAR datasets, both indoors and outdoors. The system flexibility is demonstrated with additional configurations for 2D LiDARs and for building 3D NDT-like maps. The framework is open-sourced online: https://github.com/MOLAorg/mola
comment: 44 pages, 35 figures
Safe and Efficient Robot Action Planning in the Presence of Unconcerned Humans
This paper proposes a robot action planning scheme that provides an efficient and probabilistically safe plan for a robot interacting with an unconcerned human -- someone who is either unaware of the robot's presence or unwilling to engage in ensuring safety. The proposed scheme is predictive, meaning that the robot is required to predict human actions over a finite future horizon; such predictions are often inaccurate in real-world scenarios. One possible approach to reduce the uncertainties is to provide the robot with the capability of reasoning about the human's awareness of potential dangers. This paper discusses that by using a binary variable, so-called danger awareness coefficient, it is possible to differentiate between concerned and unconcerned humans, and provides a learning algorithm to determine this coefficient by observing human actions. Moreover, this paper argues how humans rely on predictions of other agents' future actions (including those of robots in human-robot interaction) in their decision-making. It also shows that ignoring this aspect in predicting human's future actions can significantly degrade the efficiency of the interaction, causing agents to deviate from their optimal paths. The proposed robot action planning scheme is verified and validated via extensive simulation and experimental studies on a LoCoBot WidowX-250.
Multiagent Systems
A Dynamically Weighted ADMM Framework for Byzantine Resilience
The alternating direction of multipliers method (ADMM) is a popular method to solve distributed consensus optimization utilizing efficient communication among various nodes in the network. However, in the presence of faulty or attacked nodes, even a small perturbation (or sharing false data) during the communication can lead to divergence of the solution. To address this issue, in this work we consider ADMM under the effect of Byzantine threat, where an unknown subset of nodes is subject to Byzantine attacks or faults. We propose Dynamically Weighted ADMM (DW-ADMM), a novel variant of ADMM that uses dynamic weights on the edges of the network, thus promoting resilient distributed optimization. We establish that the proposed method (i) produces a nearly identical solution to conventional ADMM in the error-free case, and (ii) guarantees a bounded solution with respect to the global minimizer, even under Byzantine threat. Finally, we demonstrate the effectiveness of our proposed algorithm using an illustrative numerical simulation.
comment: 7 pages, 2 figures
Tapas are free! Training-Free Adaptation of Programmatic Agents via LLM-Guided Program Synthesis in Dynamic Environments
Autonomous agents in safety-critical applications must continuously adapt to dynamic conditions without compromising performance and reliability. This work introduces TAPA (Training-free Adaptation of Programmatic Agents), a novel framework that positions large language models (LLMs) as intelligent moderators of the symbolic action space. Unlike prior programmatic agents that typically generate a monolithic policy program or rely on fixed symbolic action sets, TAPA synthesizes and adapts modular programs for individual high-level actions, referred to as logical primitives. By decoupling strategic intent from execution, TAPA enables meta-agents to operate over an abstract, interpretable action space while the LLM dynamically generates, composes, and refines symbolic programs tailored to each primitive. Extensive experiments across cybersecurity and swarm intelligence domains validate TAPA's effectiveness. In autonomous DDoS defense scenarios, TAPA achieves 77.7% network uptime while maintaining near-perfect detection accuracy in unknown dynamic environments. In swarm intelligence formation control under environmental and adversarial disturbances, TAPA consistently preserves consensus at runtime where baseline methods fail completely. This work promotes a paradigm shift for autonomous system design in evolving environments, from policy adaptation to dynamic action adaptation.
comment: Under Review
FACET:Teacher-Centred LLM-Based Multi-Agent Systems-Towards Personalized Educational Worksheets
The increasing heterogeneity of student populations poses significant challenges for teachers, particularly in mathematics education, where cognitive, motivational, and emotional differences strongly influence learning outcomes. While AI-driven personalization tools have emerged, most remain performance-focused, offering limited support for teachers and neglecting broader pedagogical needs. This paper presents the FACET framework, a teacher-facing, large language model (LLM)-based multi-agent system designed to generate individualized classroom materials that integrate both cognitive and motivational dimensions of learner profiles. The framework comprises three specialized agents: (1) learner agents that simulate diverse profiles incorporating topic proficiency and intrinsic motivation, (2) a teacher agent that adapts instructional content according to didactical principles, and (3) an evaluator agent that provides automated quality assurance. We tested the system using authentic grade 8 mathematics curriculum content and evaluated its feasibility through a) automated agent-based assessment of output quality and b) exploratory feedback from K-12 in-service teachers. Results from ten internal evaluations highlighted high stability and alignment between generated materials and learner profiles, and teacher feedback particularly highlighted structure and suitability of tasks. The findings demonstrate the potential of multi-agent LLM architectures to provide scalable, context-aware personalization in heterogeneous classroom settings, and outline directions for extending the framework to richer learner profiles and real-world classroom trials.
Defending a City from Multi-Drone Attacks: A Sequential Stackelberg Security Games Approach
To counter an imminent multi-drone attack on a city, defenders have deployed drones across the city. These drones must intercept/eliminate the threat, thus reducing potential damage from the attack. We model this as a Sequential Stackelberg Security Game, where the defender first commits to a mixed sequential defense strategy, and the attacker then best responds. We develop an efficient algorithm called S2D2, which outputs a defense strategy. We demonstrate the efficacy of S2D2 in extensive experiments on data from 80 real cities, improving the performance of the defender in comparison to greedy heuristics based on prior works. We prove that under some reasonable assumptions about the city structure, S2D2 outputs an approximate Strong Stackelberg Equilibrium (SSE) with a convenient structure.
comment: 59 pages, 10 figures
Allen: Rethinking MAS Design through Step-Level Policy Autonomy
We introduce a new Multi-Agent System (MAS) - Allen, designed to address two core challenges in current MAS design: (1) improve system's policy autonomy, empowering agents to dynamically adapt their behavioral strategies, and (2) achieving the trade-off between collaborative efficiency, task supervision, and human oversight in complex network topologies. Our core insight is to redefine the basic execution unit in the MAS, allowing agents to autonomously form different patterns by combining these units. We have constructed a four-tier state architecture (Task, Stage, Agent, Step) to constrain system behavior from both task-oriented and execution-oriented perspectives. This achieves a unification of topological optimization and controllable progress. Allen grants unprecedented Policy Autonomy, while making a trade-off for the controllability of the collaborative structure. The project code has been open source at: https://github.com/motern88/Allen
Every 28 Days the AI Dreams of Soft Skin and Burning Stars: Scaffolding AI Agents with Hormones and Emotions NeurIPS
Despite significant advances, AI systems struggle with the frame problem: determining what information is contextually relevant from an exponentially large possibility space. We hypothesize that biological rhythms, particularly hormonal cycles, serve as natural relevance filters that could address this fundamental challenge. We develop a framework that embeds simulated menstrual and circadian cycles into Large Language Models through system prompts generated from periodic functions modeling key hormones including estrogen, testosterone, and cortisol. Across multiple state-of-the-art models, linguistic analysis reveals emotional and stylistic variations that track biological phases; sadness peaks during menstruation while happiness dominates ovulation and circadian patterns show morning optimism transitioning to nocturnal introspection. Benchmarking on SQuAD, MMLU, Hellaswag, and AI2-ARC demonstrates subtle but consistent performance variations aligning with biological expectations, including optimal function in moderate rather than extreme hormonal ranges. This methodology provides a novel approach to contextual AI while revealing how societal biases regarding gender and biology are embedded within language models.
comment: 9 pages, 1 figure, submitted to NeurIPS Creative AI track
SafeSieve: From Heuristics to Experience in Progressive Pruning for LLM-based Multi-Agent Communication
LLM-based multi-agent systems exhibit strong collaborative capabilities but often suffer from redundant communication and excessive token overhead. Existing methods typically enhance efficiency through pretrained GNNs or greedy algorithms, but often isolate pre- and post-task optimization, lacking a unified strategy. To this end, we present SafeSieve, a progressive and adaptive multi-agent pruning algorithm that dynamically refines the inter-agent communication through a novel dual-mechanism. SafeSieve integrates initial LLM-based semantic evaluation with accumulated performance feedback, enabling a smooth transition from heuristic initialization to experience-driven refinement. Unlike existing greedy Top-k pruning methods, SafeSieve employs 0-extension clustering to preserve structurally coherent agent groups while eliminating ineffective links. Experiments across benchmarks (SVAMP, HumanEval, etc.) showcase that SafeSieve achieves 94.01% average accuracy while reducing token usage by 12.4%-27.8%. Results further demonstrate robustness under prompt injection attacks (1.23% average accuracy drop). In heterogeneous settings, SafeSieve reduces deployment costs by 13.3% while maintaining performance. These results establish SafeSieve as a robust, efficient, and scalable framework for practical multi-agent systems. Our code can be found in https://anonymous.4open.science/r/SafeSieve-D8F2FFUN.
comment: 7 pages for main content, 5 figures, 4 tables
Smooth Games of Configuration in the Linear-Quadratic Setting
Dynamic game theory offers a toolbox for formalizing and solving for both cooperative and non-cooperative strategies in multi-agent scenarios. However, the optimal configuration of such games remains largely unexplored. While there is existing literature on the parametrization of dynamic games, little research examines this parametrization from a strategic perspective where each agent's configuration choice is influenced by the decisions of others. In this work, we introduce the concept of a game of configuration, providing a framework for the strategic fine-tuning of differential games. We define a game of configuration as a two-stage game within the setting of finite-horizon, affine-quadratic, AQ, differential games. In the first stage, each player chooses their corresponding configuration parameter, which will impact their dynamics and costs in the second stage. We provide the subgame perfect solution concept and a method for computing first stage cost gradients over the configuration space. This then allows us to formulate a gradient-based method for searching for local solutions to the configuration game, as well as provide necessary conditions for equilibrium configurations over their downstream (second stage) trajectories. We conclude by demonstrating the effectiveness of our approach in example AQ systems, both zero-sum and general-sum.
Synergy Over Spiral: A Logistics 5.0 Game-Theoretic Model for Trust-Fatigue Co-regulation in Human-Cobot Order Picking
This paper investigates the critical role of trust and fatigue in human-cobot collaborative order picking, framing the challenge within the scope of Logistics 5.0: the implementation of human-robot symbiosis in smart logistics. We propose a dynamic, leader-follower Stackelberg game to model this interaction, where utility functions explicitly account for human fatigue and trust. Through agent-based simulations, we demonstrate that while a naive model leads to a "trust death spiral," a refined trust model creates a "trust synergy cycle," increasing productivity by nearly 100 percent. Finally, we show that a cobot operating in a Trust-Recovery Mode can overcome system brittleness after a disruption, reducing trust recovery time by over 75 percent compared to a non-adaptive model. Our findings provide a framework for designing intelligent cobot behaviors that fulfill the Industry 5.0 pillars of human-centricity, sustainability, and resilience.
Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models
Conventional language model (LM) safety alignment relies on a reactive, disjoint procedure: attackers exploit a static model, followed by defensive fine-tuning to patch exposed vulnerabilities. This sequential approach creates a mismatch -- attackers overfit to obsolete defenses, while defenders perpetually lag behind emerging threats. To address this, we propose Self-RedTeam, an online self-play reinforcement learning algorithm where an attacker and defender agent co-evolve through continuous interaction. We cast safety alignment as a two-player zero-sum game, where a single model alternates between attacker and defender roles -- generating adversarial prompts and safeguarding against them -- while a reward LM adjudicates outcomes. This enables dynamic co-adaptation. Grounded in the game-theoretic framework of zero-sum games, we establish a theoretical safety guarantee which motivates the design of our method: if self-play converges to a Nash Equilibrium, the defender will reliably produce safe responses to any adversarial input. Empirically, Self-RedTeam uncovers more diverse attacks (+21.8% SBERT) compared to attackers trained against static defenders and achieves higher robustness on safety benchmarks (e.g., +65.5% on WildJailBreak) than defenders trained against static attackers. We further propose hidden Chain-of-Thought, allowing agents to plan privately, which boosts adversarial diversity and reduces over-refusals. Our results motivate a shift from reactive patching to proactive co-evolution in LM safety training, enabling scalable, autonomous, and robust self-improvement of LMs via multi-agent reinforcement learning (MARL).
Systems and Control (CS)
Two-Impulse Trajectory Design in Two-Body Systems With Riemannian Geometry
This work presents a new method for generating impulsive trajectories in restricted two-body systems by leveraging Riemannian geometry. The proposed method transforms the standard trajectory optimization problem into a purely geometric one that involves computing a set of geodesics for a suitable Riemannian metric. This transformation is achieved by defining a metric, specifically the Jacobi metric, that embeds the dynamics directly into the metric, so any geodesic of the metric is also a dynamically feasible trajectory. The method finds the fuel-optimal transfer trajectory by sampling candidate energy ($\Delta V$) changes for different points on the current and desired orbit, and efficiently computing and evaluating each candidate geodesic, which are equivalent to candidate orbit transfer trajectories via the Jacobi metric. The method bypasses the known issues of optimization-based methods, e.g., sensitivity to the initial guess, and can be applied to more complex two-body systems. The approach is demonstrated on the minimum-$\Delta V$ two-impulse phase-free orbit transfer problem, first on a Keplerian system and second on a system with a modeled $J_2$ perturbation. The proposed method is shown to meet or exceed the state-of-the-art methods in the minimum-$\Delta V$ problem in the Keplerian system. The generality and versatility of the approach is demonstrated by seamlessly including the $J_2$ perturbation, a case that many existing methods cannot handle. Numerical simulations and performance comparisons showcase the effectiveness of the approach.
Nominal Evaluation Of Automatic Multi-Sections Control Potential In Comparison To A Simpler One- Or Two-Sections Alternative With Predictive Spray Switching
Automatic Section Control (ASC) is a long-standing trend for spraying in agriculture. It promises to minimise spray overlap areas. The core idea is to (i) switch off spray nozzles on areas that have already been sprayed, and (ii) to dynamically adjust nozzle flow rates along the boom bar that holds the spray nozzles when velocities of boom sections vary during turn maneuvers. ASC is not possible without sensors, in particular for accurate positioning data. Spraying and the movement of modern wide boom bars are highly dynamic processes. In addition, many uncertainty factors have an effect such as cross wind drift, boom height, nozzle clogging in open-field conditions, and so forth. In view of this complexity, the natural question arises if a simpler alternative exist. Therefore, an Automatic Multi-Sections Control method is compared to a proposed simpler one- or two-sections alternative that uses predictive spray switching. The comparison is provided under nominal conditions. Agricultural spraying is intrinsically linked to area coverage path planning and spray switching logic. Combinations of two area coverage path planning and switching logics as well as three sections-setups are compared. The three sections-setups differ by controlling 48 sections, 2 sections or controlling all nozzles uniformly with the same control signal as one single section. Methods are evaluated on 10 diverse real-world field examples, including non-convex field contours, freeform mainfield lanes and multiple obstacle areas. A preferred method is suggested that (i) minimises area coverage pathlength, (ii) offers intermediate overlap, (iii) is suitable for manual driving by following a pre-planned predictive spray switching logic for an area coverage path plan, and (iv) and in contrast to ASC can be implemented sensor-free and therefore at low cost.
comment: 14 pages plus 7 pages appendix with additional figures, 18 main figures, 3 tables
A Dynamically Weighted ADMM Framework for Byzantine Resilience
The alternating direction of multipliers method (ADMM) is a popular method to solve distributed consensus optimization utilizing efficient communication among various nodes in the network. However, in the presence of faulty or attacked nodes, even a small perturbation (or sharing false data) during the communication can lead to divergence of the solution. To address this issue, in this work we consider ADMM under the effect of Byzantine threat, where an unknown subset of nodes is subject to Byzantine attacks or faults. We propose Dynamically Weighted ADMM (DW-ADMM), a novel variant of ADMM that uses dynamic weights on the edges of the network, thus promoting resilient distributed optimization. We establish that the proposed method (i) produces a nearly identical solution to conventional ADMM in the error-free case, and (ii) guarantees a bounded solution with respect to the global minimizer, even under Byzantine threat. Finally, we demonstrate the effectiveness of our proposed algorithm using an illustrative numerical simulation.
comment: 7 pages, 2 figures
Identification of Sub/Super-Synchronous Control Interaction Paths Using Dissipative Energy Flow
Sub- and super-synchronous control interactions (SSCIs) are oscillations arising from adverse interactions between inverter-based resource (IBR) controls and the power network. SSCIs often involve multiple frequencies and propagate through complex, interconnected paths, making it difficult for model-based approaches to identify both the sources and the paths of oscillatory energy flow. This paper extends the Dissipative Energy Flow (DEF) method, originally developed for low-frequency electromechanical oscillations, to identify SSCI sources and dynamic interaction paths across multiple frequencies using three-phase voltage and current measurements. The approach operates in the dq frame using dynamic phasors, enabling mode-specific DEF computation from bandpass-filtered signals. An electromagnetic transient (EMT) case study on a meshed network with synchronous generator and type-3 wind farm resources under series-compensated conditions demonstrates the method's capability to distinguish frequency-dependent source and sink roles, including cases where the same resource acts as a source at one frequency and a sink at another. The results show DEF can provide a physics-based and automation-friendly tool for SSCI diagnosis in IBR-rich grids.
comment: 5 pages, 11 figures
Towards Fully Onboard State Estimation and Trajectory Tracking for UAVs with Suspended Payloads
This paper addresses the problem of tracking the position of a cable-suspended payload carried by an unmanned aerial vehicle, with a focus on real-world deployment and minimal hardware requirements. In contrast to many existing approaches that rely on motion-capture systems, additional onboard cameras, or instrumented payloads, we propose a framework that uses only standard onboard sensors--specifically, real-time kinematic global navigation satellite system measurements and data from the onboard inertial measurement unit--to estimate and control the payload's position. The system models the full coupled dynamics of the aerial vehicle and payload, and integrates a linear Kalman filter for state estimation, a model predictive contouring control planner, and an incremental model predictive controller. The control architecture is designed to remain effective despite sensing limitations and estimation uncertainty. Extensive simulations demonstrate that the proposed system achieves performance comparable to control based on ground-truth measurements, with only minor degradation (< 6%). The system also shows strong robustness to variations in payload parameters. Field experiments further validate the framework, confirming its practical applicability and reliable performance in outdoor environments using only off-the-shelf aerial vehicle hardware.
Integrating Uncertainties for Koopman-Based Stabilization
Over the past decades, the Koopman operator has been widely applied in data-driven control, yet its theoretical foundations remain underexplored. This paper establishes a unified framework to address the robust stabilization problem in data-driven control via the Koopman operator, fully accounting for three uncertainties: projection error, estimation error, and process disturbance. It comprehensively investigates both direct and indirect data-driven control approaches, facilitating flexible methodology selection for analysis and control. For the direct approach, considering process disturbances, the lifted-state feedback controller, designed via a linear matrix inequality (LMI), robustly stabilizes all lifted bilinear systems consistent with noisy data. For the indirect approach requiring system identification, the feedback controller, designed using a nonlinear matrix inequality convertible to an LMI, ensures closed-loop stability under worst-case process disturbances. Numerical simulations via cross-validation validate the effectiveness of both approaches, highlighting their theoretical significance and practical utility.
A Comparative Study of Floating-Base Space Parameterizations for Agile Whole-Body Motion Planning
Automatically generating agile whole-body motions for legged and humanoid robots remains a fundamental challenge in robotics. While numerous trajectory optimization approaches have been proposed, there is no clear guideline on how the choice of floating-base space parameterization affects performance, especially for agile behaviors involving complex contact dynamics. In this paper, we present a comparative study of different parameterizations for direct transcription-based trajectory optimization of agile motions in legged systems. We systematically evaluate several common choices under identical optimization settings to ensure a fair comparison. Furthermore, we introduce a novel formulation based on the tangent space of SE(3) for representing the robot's floating-base pose, which, to our knowledge, has not received attention from the literature. This approach enables the use of mature off-the-shelf numerical solvers without requiring specialized manifold optimization techniques. We hope that our experiments and analysis will provide meaningful insights for selecting the appropriate floating-based representation for agile whole-body motion generation.
comment: 8 pages, 2 figures, 4 tables, Accepted at Humanoids 2025
Robust Convolution Neural ODEs via Contractivity-promoting regularization
Neural networks can be fragile to input noise and adversarial attacks. In this work, we consider Convolutional Neural Ordinary Differential Equations (NODEs), a family of continuous-depth neural networks represented by dynamical systems, and propose to use contraction theory to improve their robustness. For a contractive dynamical system two trajectories starting from different initial conditions converge to each other exponentially fast. Contractive Convolutional NODEs can enjoy increased robustness as slight perturbations of the features do not cause a significant change in the output. Contractivity can be induced during training by using a regularization term involving the Jacobian of the system dynamics. To reduce the computational burden, we show that it can also be promoted using carefully selected weight regularization terms for a class of NODEs with slope-restricted activation functions. The performance of the proposed regularizers is illustrated through benchmark image classification tasks on MNIST and FashionMNIST datasets, where images are corrupted by different kinds of noise and attacks.
comment: Accepted in IEEE CDC2025, Rio de Janeiro, Brazil
Principles of Physiological Closed-Loop Controllers in Neuromodulation
As neurostimulation devices increasingly incorporate closed-loop functionality, the greater design complexity brings additional requirements for risk management and special considerations to optimise benefit. This manuscript creates a common framework upon which all current and planned neuromodulation-based physiological closed-loop controllers (PCLCs) can be mapped including integration of the Technical Considerations of Medical Devices with Physiologic Closed-Loop Control Technology guidance published in 2023 by the United States Food and Drug Administration (FDA), a classification of feedback (reactive) and feedforward (predictive) biomarkers, and control systems theory. We explain risk management in the context of this framework and illustrate its applications for three exemplary technologies. This manuscript serves as guidance to the emerging field of PCLCs in neuromodulation, mitigating risk through standardized nomenclature and a systematic outline for rigorous device development, testing, and implementation.
comment: 35 pages, 10 figures
System Synchronization Based on Complex Frequency
In response to the inertia decline caused by high penetration of renewable generation, traditional synchronization criteria that rely solely on frequency consistency are increasingly inadequate for characterizing the coupled behavior of frequency and voltage dynamics during power-system transients. This paper focuses on the theory of complex-frequency synchronization and develops a theory-simulation analysis framework that offers a new perspective for steady-state and transient analysis of low-inertia power systems. First, the fundamental concepts and theoretical foundations of complex-frequency synchronization are presented in detail. Second, local and global dynamic synchronization criteria are derived and the concept of generalized inertia is introduced, which unifies the conventional inertial support to frequency with the inertia-like support of voltage, thereby providing an accurate measure of region-level coupled support strength for voltage and frequency. Finally, numerical case studies on the IEEE 9-bus system validate the effectiveness of the proposed theoretical methods and criteria, and demonstrate a visualization workflow for key indicators such as disturbance impact zones and generalized-inertia regions.
Direct data-driven interpolation and approximation of linear parameter-varying system trajectories
We consider the problem of estimating missing values in trajectories of linear parameter-varying (LPV) systems. We solve this interpolation problem for the class of shifted-affine LPV systems. Conditions for the existence and uniqueness of solutions are given and a direct data-driven algorithm for its computation is presented, i.e., the data-generating system is not given by a parametric model but is implicitly specified by data. We illustrate the applicability of the proposed solution on illustrative examples of a mass-spring-damper system with exogenous and endogenous parameter variation.
comment: 9 pages, 5 figures, submitted for review
Geometry-Aware Predictive Safety Filters on Humanoids: From Poisson Safety Functions to CBF Constrained MPC
Autonomous navigation through unstructured and dynamically-changing environments is a complex task that continues to present many challenges for modern roboticists. In particular, legged robots typically possess manipulable asymmetric geometries which must be considered during safety-critical trajectory planning. This work proposes a predictive safety filter: a nonlinear model predictive control (MPC) algorithm for online trajectory generation with geometry-aware safety constraints based on control barrier functions (CBFs). Critically, our method leverages Poisson safety functions to numerically synthesize CBF constraints directly from perception data. We extend the theoretical framework for Poisson safety functions to incorporate temporal changes in the domain by reformulating the static Dirichlet problem for Poisson's equation as a parameterized moving boundary value problem. Furthermore, we employ Minkowski set operations to lift the domain into a configuration space that accounts for robot geometry. Finally, we implement our real-time predictive safety filter on humanoid and quadruped robots in various safety-critical scenarios. The results highlight the versatility of Poisson safety functions, as well as the benefit of CBF constrained model predictive safety-critical controllers.
comment: 2025 IEEE-RAS 24th International Conference on Humanoid Robots
Recent Advances in Transformer and Large Language Models for UAV Applications
The rapid advancement of Transformer-based models has reshaped the landscape of uncrewed aerial vehicle (UAV) systems by enhancing perception, decision-making, and autonomy. This review paper systematically categorizes and evaluates recent developments in Transformer architectures applied to UAVs, including attention mechanisms, CNN-Transformer hybrids, reinforcement learning Transformers, and large language models (LLMs). Unlike previous surveys, this work presents a unified taxonomy of Transformer-based UAV models, highlights emerging applications such as precision agriculture and autonomous navigation, and provides comparative analyses through structured tables and performance benchmarks. The paper also reviews key datasets, simulators, and evaluation metrics used in the field. Furthermore, it identifies existing gaps in the literature, outlines critical challenges in computational efficiency and real-time deployment, and offers future research directions. This comprehensive synthesis aims to guide researchers and practitioners in understanding and advancing Transformer-driven UAV technologies.
Securing Sideways: Thwarting Lateral Movement by Implementing Active Directory Tiering
The advancement of computing equipment and the advances in services over the Internet has allowed corporations, higher education, and many other organizations to pursue the shared computing network environment. A requirement for shared computing environments is a centralized identity system to authenticate and authorize user access. An organization's digital identity plane is a prime target for cyber threat actors. When compromised, identities can be exploited to steal credentials, create unauthorized accounts, and manipulate permissions-enabling attackers to gain control of the network and undermine its confidentiality, availability, and integrity. Cybercrime losses reached a record of 16.6 B in the United States in 2024. For organizations using Microsoft software, Active Directory is the on-premises identity system of choice. In this article, we examine the challenge of security compromises in Active Directory (AD) environments and present effective strategies to prevent credential theft and limit lateral movement by threat actors. Our proposed approaches aim to confine the movement of compromised credentials, preventing significant privilege escalation and theft. We argue that through our illustration of real-world scenarios, tiering can halt lateral movement and advanced cyber-attacks, thus reducing ransom escalation. Our work bridges a gap in existing literature by combining technical guidelines with theoretical arguments in support of tiering, positioning it as a vital component of modern cybersecurity strategy even though it cannot function in isolation. As the hardware advances and the cloud sourced services along with AI is advancing with unprecedented speed, we think it is important for security experts and the business to work together and start designing and developing software and frameworks to classify devices automatically and accurately within the tiered structure.
comment: 11 pages
Control of a commercial vehicle by a tetraplegic human using a bimanual brain-computer interface
Brain-computer interfaces (BCIs) read neural signals directly from the brain to infer motor planning and execution. However, the implementation of this technology has been largely limited to laboratory settings, with few real-world applications. We developed a bimanual BCI system to drive a vehicle in both simulated and real-world environments. We demonstrate that an individual with tetraplegia, implanted with intracortical BCI electrodes in the posterior parietal cortex (PPC) and the hand knob region of the motor cortex (MC), reacts at least as fast and precisely as motor intact participants, and drives a simulated vehicle as proficiently as the same control group. This BCI participant, living in California, could also remotely drive a Ford Mustang Mach-E vehicle in Michigan. Our first teledriving task relied on cursor control for speed and steering in a closed urban test facility. However, the final BCI system added click control for full-stop braking and thus enabled bimanual cursor-and-click control for both simulated driving through a virtual town with traffic and teledriving through an obstacle course without traffic in the real world. We also demonstrate the safety and feasibility of BCI-controlled driving. This first-of-its-kind implantable BCI application not only highlights the versatility and innovative potentials of BCIs but also illuminates the promising future for the development of life-changing solutions to restore independence to those who suffer catastrophic neurological injury.
comment: 41 pages, 7 figures, 1 table. 22 supplementary pages, 6 supplementary figures, 11 supplementary tables, 9 supplementary movies available as ancillary files
Matrix Control Barrier Functions
This paper generalizes the control barrier function framework by replacing scalar-valued functions with matrix-valued ones. Specifically, we develop barrier conditions for safe sets defined by matrix inequalities -- both semidefinite and indefinite. Matrix inequalities can be used to describe a richer class of safe sets, including nonsmooth ones. The safety filters constructed from our proposed matrix control barrier functions via semidefinite programming (CBF-SDP) are shown to be continuous. Our matrix formulation naturally provides a continuous safety filter for Boolean-based control barrier functions, notably for disjunctions (OR), without relaxing the safe set. We illustrate the effectiveness of the proposed framework with applications in drone network connectivity maintenance and nonsmooth obstacle avoidance, both in simulations and hardware experiments.
comment: 13 pages, 4 figures, submitted to the IEEE Transactions on Automatic Control
Quantifying the Value of Seismic Structural Health Monitoring for post-earthquake recovery of electric power system in terms of resilience enhancement
Post-earthquake recovery of electric power networks (EPNs) is critical to community resilience. Traditional recovery processes often rely on prolonged and imprecise manual inspections for damage diagnosis, leading to suboptimal repair prioritization and extended service disruptions. Seismic Structural Health Monitoring (SSHM) offers the potential to expedite recovery by enabling more accurate and timely damage assessment. However, SSHM deployment incurs costs, and its system-level resilience benefit remains underexplored. This study proposes a probabilistic simulation framework to quantify the value of SSHM for enhancing EPN resilience. The framework includes seismic damage modeling based on network configuration, hazard intensity, fragility functions, and damage-functionality mappings, combined with recovery simulations incorporating resource constraints, repair and transfer durations. System functionality is evaluated using graph-based island detection and optimal power flow analysis. Resilience is quantified via the Lack of Resilience (LoR) metric derived from the functionality restoration curve. SSHM is incorporated by altering the quality of damage information used in repair scheduling. Different monitoring scenarios (e.g., no-SSHM baseline, partial SSHM, full SSHM with various accuracies) are modeled using confusion matrices to simulate damage misclassification. Results show that improved damage awareness via SSHM significantly accelerates recovery and reduces LoR by up to 21%. This work supports evidence-based decisions for SSHM deployment in critical infrastructure.
comment: 21 pages. 14 figures
Transferable Parasitic Estimation via Graph Contrastive Learning and Label Rebalancing in AMS Circuits
Graph representation learning on Analog-Mixed Signal (AMS) circuits is crucial for various downstream tasks, e.g., parasitic estimation. However, the scarcity of design data, the unbalanced distribution of labels, and the inherent diversity of circuit implementations pose significant challenges to learning robust and transferable circuit representations. To address these limitations, we propose CircuitGCL, a novel graph contrastive learning framework that integrates representation scattering and label rebalancing to enhance transferability across heterogeneous circuit graphs. CircuitGCL employs a self-supervised strategy to learn topology-invariant node embeddings through hyperspherical representation scattering, eliminating dependency on large-scale data. Simultaneously, balanced mean squared error (BMSE) and balanced softmax cross-entropy (BSCE) losses are introduced to mitigate label distribution disparities between circuits, enabling robust and transferable parasitic estimation. Evaluated on parasitic capacitance estimation (edge-level task) and ground capacitance classification (node-level task) across TSMC 28nm AMS designs, CircuitGCL outperforms all state-of-the-art (SOTA) methods, with the $R^2$ improvement of $33.64\% \sim 44.20\%$ for edge regression and F1-score gain of $0.9\times \sim 2.1\times$ for node classification. Our code is available at https://github.com/ShenShan123/CircuitGCL.
comment: Final version accepted by the International Conference on Computer-Aided Design (ICCAD) 2025
Decentralized State Estimation and Opacity Verification Based on Partially Ordered Observation Sequences
In this paper, we investigate state estimation and opacity verification problems within a decentralized observation architecture. Specifically, we consider a discrete event system whose behavior is recorded by a set of observation sites. These sites transmit the partially ordered sequences of observations that they record to a coordinator whenever a synchronization occurs. To properly analyze the system behavior from the coordinator's viewpoint, we first introduce the notion of a Complete Synchronizing Sequence structure (CSS structure), which concisely captures the state evolution of each system state upon different information provided by the observation sites. Based on the CSS structure, we then construct corresponding current-state and initial-state estimators for offline state estimation at the coordinator. When used to verify state-isolation properties under this decentralized architecture, the use of CSS structure demonstrates a significant reduction in complexity compared with existing approaches in the literature. In particular, we discuss how to verify initial-state opacity at the coordinator, as well as a novel opacity notion, namely current-state-at-synchronization opacity.
Propeller Motion of a Devil-Stick using Normal Forcing
The problem of realizing rotary propeller motion of a devil-stick in the vertical plane using forces purely normal to the stick is considered. This problem represents a nonprehensile manipulation task of an underactuated system. In contrast with previous approaches, the devil-stick is manipulated by controlling the normal force and its point of application. Virtual holonomic constraints are used to design the trajectory of the center-of-mass of the devil-stick in terms of its orientation angle, and conditions for stable propeller motion are derived. Intermittent large-amplitude forces are used to asymptotically stabilize a desired propeller motion. Simulations demonstrate the efficacy of the approach in realizing stable propeller motion without loss of contact between the actuator and devil-stick.
comment: 6 pages, 5 figures. This work has been accepted for publication in the proceedings of the 2025 IEEE Conference on Control Technology and Applications (CCTA)
Random Walk Learning and the Pac-Man Attack
Random walk (RW)-based algorithms have long been popular in distributed systems due to low overheads and scalability, with recent growing applications in decentralized learning. However, their reliance on local interactions makes them inherently vulnerable to malicious behavior. In this work, we investigate an adversarial threat that we term the ``Pac-Man'' attack, in which a malicious node probabilistically terminates any RW that visits it. This stealthy behavior gradually eliminates active RWs from the network, effectively halting the learning process without triggering failure alarms. To counter this threat, we propose the Average Crossing (AC) algorithm--a fully decentralized mechanism for duplicating RWs to prevent RW extinction in the presence of Pac-Man. Our theoretical analysis establishes that (i) the RW population remains almost surely bounded under AC and (ii) RW-based stochastic gradient descent remains convergent under AC, even in the presence of Pac-Man, with a quantifiable deviation from the true optimum. Our extensive empirical results on both synthetic and real-world datasets corroborate our theoretical findings. Furthermore, they uncover a phase transition in the extinction probability as a function of the duplication threshold. We offer theoretical insights by analyzing a simplified variant of the AC, which sheds light on the observed phase transition.
comment: The updated manuscript represents an incomplete version of the work. A substantially updated version will be prepared before further dissemination
A Transferable Physics-Informed Framework for Battery Degradation Diagnosis, Knee-Onset Detection and Knee Prediction
The techno-economic and safety concerns of battery capacity knee occurrence call for developing online knee detection and prediction methods as an advanced battery management system (BMS) function. To address this, a transferable physics-informed framework that consists of a histogram-based feature engineering method, a hybrid physics-informed model, and a fine-tuning strategy, is proposed for online battery degradation diagnosis and knee-onset detection. The hybrid model is first developed and evaluated using a scenario-aware pipeline in protocol cycling scenarios and then fine-tuned to create local models deployed in a dynamic cycling scenario. A 2D histogram-based 17-feature set is found to be the best choice in both source and target scenarios. The fine-tuning strategy is proven to be effective in improving battery degradation mode estimation and degradation phase detection performance in the target scenario. Again, a strong linear correlation was found between the identified knee-onset and knee points. As a result, advanced BMS functions, such as online degradation diagnosis and prognosis, online knee-onset detection and knee prediction, aging-aware battery classification, and second-life repurposing, can be enabled through a battery performance digital twin in the cloud.
Embedding Safety into RL: A New Take on Trust Region Methods ICML 2025
Reinforcement Learning (RL) agents can solve diverse tasks but often exhibit unsafe behavior. Constrained Markov Decision Processes (CMDPs) address this by enforcing safety constraints, yet existing methods either sacrifice reward maximization or allow unsafe training. We introduce Constrained Trust Region Policy Optimization (C-TRPO), which reshapes the policy space geometry to ensure trust regions contain only safe policies, guaranteeing constraint satisfaction throughout training. We analyze its theoretical properties and connections to TRPO, Natural Policy Gradient (NPG), and Constrained Policy Optimization (CPO). Experiments show that C-TRPO reduces constraint violations while maintaining competitive returns.
comment: Accepted at ICML 2025
LQG Information Design
This paper addresses information design in a workhorse model of network games, where agents have linear best responses, the information designer maximizes a quadratic objective, and the payoff-relevant state follows a multivariate Gaussian distribution. We formulate the problem as a semidefinite program and establish strong duality to characterize the optimal information structure. A necessary and sufficient condition for optimality is given by a simple linear relationship between the induced equilibrium strategy profile and the state. Leveraging this characterization, we show that the state is fully revealed in an aggregative form for welfare maximization, while individual agents may remain only partially informed. When agent roles are interchangeable, the optimal information structure inherits the same degree of symmetry, which facilitates computation. In such cases, we show that the optimal amount of information revealed to each agent is closely linked to the network's chromatic number.
Multi-Class Stackelberg Games for the Co-Design of Networked Systems
We investigate a co-design problem, encompassing simultaneous design of system infrastructure and control, through a game-theoretical framework. To this end, we propose the co-design problem as a two-layer hierarchical strategic interaction. At the upper layer, a leader (or multiple leaders) determines system design parameters, while at the lower layer, a follower (or multiple followers) optimizes the control strategy. To capture this hierarchy, we propose four novel classes of Stackelberg games that integrate diverse strategic behaviors, including combinations of cooperative and non-cooperative interactions across two different layers. Notably, the leaders' interactions are represented using a normal-form game, whereas the followers' interactions are modeled by different games (dynamic games in discrete time). These distinct game structures result in a Stackelberg game that accommodates different game types per layer, and/or supports heterogeneous strategic behaviors involving cooperation and non-cooperation simultaneously. Learning algorithms using the best-response dynamics are used to solve the game problems when considering a discrete strategic space for the leaders. The efficacy of the proposed approach is demonstrated through an application to the co-design of the Barcelona drinking water network.
comment: This work has been submitted to the IEEE for possible publication
MARS-FTCP: Robust Fault-Tolerant Control and Agile Trajectory Planning for Modular Aerial Robot Systems
Modular Aerial Robot Systems (MARS) consist of multiple drone units that can self-reconfigure to adapt to various mission requirements and fault conditions. However, existing fault-tolerant control methods exhibit significant oscillations during docking and separation, impacting system stability. To address this issue, we propose a novel fault-tolerant control reallocation method that adapts to an arbitrary number of modular robots and their assembly formations. The algorithm redistributes the expected collective force and torque required for MARS to individual units according to their moment arm relative to the center of MARS mass. Furthermore, we propose an agile trajectory planning method for MARS of arbitrary configurations, which is collision-avoiding and dynamically feasible. Our work represents the first comprehensive approach to enable fault-tolerant and collision avoidance flight for MARS. We validate our method through extensive simulations, demonstrating improved fault tolerance, enhanced trajectory tracking accuracy, and greater robustness in cluttered environments. The videos and source code of this work are available at https://github.com/RuiHuangNUS/MARS-FTCP/
Formal Verification and Control with Conformal Prediction
We present recent advances in formal verification and control for autonomous systems with practical safety guarantees enabled by conformal prediction (CP), a statistical tool for uncertainty quantification. This survey is particularly motivated by learning-enabled autonomous systems (LEASs), where the complexity of learning-enabled components (LECs) poses a major bottleneck for applying traditional model-based verification and control techniques. To address this challenge, we advocate for CP as a lightweight alternative and demonstrate its use in formal verification, systems and control, and robotics. CP is appealing due to its simplicity (easy to understand, implement, and adapt), generality (requires no assumptions on learned models and underlying data distributions), and efficiency (real-time capable and accurate). This survey provides an accessible introduction to CP for non-experts interested in applying CP to autonomy problems. We particularly show how CP can be used for formal verification of LECs and the design of safe control as well as offline and online verification algorithms for LEASs. We present these techniques within a unifying framework that addresses the complexity of LEASs. Our exposition spans simple specifications, such as robot navigation tasks, to complex mission requirements expressed in temporal logic. Throughout the survey, we contrast CP with other statistical techniques, including scenario optimization and PAC-Bayes theory, highlighting advantages and limitations for verification and control. Finally, we outline open problems and promising directions for future research.
Estimating Reliability of Electric Vehicle Charging Ecosystem using the Principle of Maximum Entropy
This paper addresses the critical challenge of estimating the reliability of an Electric Vehicle (EV) charging systems when facing risks such as overheating, unpredictable, weather, and cyberattacks. Traditional methods for predicting failures often rely on past data or limiting assumptions, making them ineffective for new or less common threats that results in failure. To solve this issue, we utilize the Principle of Maximum Entropy (PME), a statistical tool that estimates risks even with limited information. PME works by balancing known constraints to create an unbiased predictions without guessing missing details. Using the EV charging ecosystem as a case study, we show how PME models stress factors responsible for failure. Our findings reveal a critical insight: even minor, localized stress events can trigger disproportionately large drops in overall system reliability, similar to a domino effect. The our PME model demonstrates how high-impact components, such as the power grid, are more likely to fail as stress accumulates, creating network-wide tipping points. Beyond EVs, this approach applies to any complex system with incomplete data, such as smart grids, healthcare devices, or logistics networks. By mathematically establishing an inverse relationship between uncertainty (entropy) and reliability, our work quantifies how greater system unpredictability directly degrades robustness. This offers a universal tool to improve decision-making under unpredictable conditions. This work bridges advanced mathematics with real-world engineering, providing actionable insights for policymakers and industries to build safer, more efficient systems in our increasingly connected world.
Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence
Zero-sum Linear Quadratic (LQ) games are fundamental in optimal control and can be used (i)~as a dynamic game formulation for risk-sensitive or robust control and (ii)~as a benchmark setting for multi-agent reinforcement learning with two competing agents in continuous state-control spaces. In contrast to the well-studied single-agent linear quadratic regulator problem, zero-sum LQ games entail solving a challenging nonconvex-nonconcave min-max problem with an objective function that lacks coercivity. Recently, Zhang et al. showed that an~$\epsilon$-Nash equilibrium (NE) of finite horizon zero-sum LQ games can be learned via nested model-free Natural Policy Gradient (NPG) algorithms with poly$(1/\epsilon)$ sample complexity. In this work, we propose a simpler nested Zeroth-Order (ZO) algorithm improving sample complexity by several orders of magnitude and guaranteeing convergence of the last iterate. Our main results are two-fold: (i) in the deterministic setting, we establish the first global last-iterate linear convergence result for the nested algorithm that seeks NE of zero-sum LQ games; (ii) in the model-free setting, we establish a~$\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity using a single-point ZO estimator. For our last-iterate convergence results, our analysis leverages the Implicit Regularization (IR) property and a new gradient domination condition for the primal function. Our key improvements in the sample complexity rely on a more sample-efficient nested algorithm design and a finer control of the ZO natural gradient estimation error utilizing the structure endowed by the finite-horizon setting.
Systems and Control (EESS)
Two-Impulse Trajectory Design in Two-Body Systems With Riemannian Geometry
This work presents a new method for generating impulsive trajectories in restricted two-body systems by leveraging Riemannian geometry. The proposed method transforms the standard trajectory optimization problem into a purely geometric one that involves computing a set of geodesics for a suitable Riemannian metric. This transformation is achieved by defining a metric, specifically the Jacobi metric, that embeds the dynamics directly into the metric, so any geodesic of the metric is also a dynamically feasible trajectory. The method finds the fuel-optimal transfer trajectory by sampling candidate energy ($\Delta V$) changes for different points on the current and desired orbit, and efficiently computing and evaluating each candidate geodesic, which are equivalent to candidate orbit transfer trajectories via the Jacobi metric. The method bypasses the known issues of optimization-based methods, e.g., sensitivity to the initial guess, and can be applied to more complex two-body systems. The approach is demonstrated on the minimum-$\Delta V$ two-impulse phase-free orbit transfer problem, first on a Keplerian system and second on a system with a modeled $J_2$ perturbation. The proposed method is shown to meet or exceed the state-of-the-art methods in the minimum-$\Delta V$ problem in the Keplerian system. The generality and versatility of the approach is demonstrated by seamlessly including the $J_2$ perturbation, a case that many existing methods cannot handle. Numerical simulations and performance comparisons showcase the effectiveness of the approach.
Nominal Evaluation Of Automatic Multi-Sections Control Potential In Comparison To A Simpler One- Or Two-Sections Alternative With Predictive Spray Switching
Automatic Section Control (ASC) is a long-standing trend for spraying in agriculture. It promises to minimise spray overlap areas. The core idea is to (i) switch off spray nozzles on areas that have already been sprayed, and (ii) to dynamically adjust nozzle flow rates along the boom bar that holds the spray nozzles when velocities of boom sections vary during turn maneuvers. ASC is not possible without sensors, in particular for accurate positioning data. Spraying and the movement of modern wide boom bars are highly dynamic processes. In addition, many uncertainty factors have an effect such as cross wind drift, boom height, nozzle clogging in open-field conditions, and so forth. In view of this complexity, the natural question arises if a simpler alternative exist. Therefore, an Automatic Multi-Sections Control method is compared to a proposed simpler one- or two-sections alternative that uses predictive spray switching. The comparison is provided under nominal conditions. Agricultural spraying is intrinsically linked to area coverage path planning and spray switching logic. Combinations of two area coverage path planning and switching logics as well as three sections-setups are compared. The three sections-setups differ by controlling 48 sections, 2 sections or controlling all nozzles uniformly with the same control signal as one single section. Methods are evaluated on 10 diverse real-world field examples, including non-convex field contours, freeform mainfield lanes and multiple obstacle areas. A preferred method is suggested that (i) minimises area coverage pathlength, (ii) offers intermediate overlap, (iii) is suitable for manual driving by following a pre-planned predictive spray switching logic for an area coverage path plan, and (iv) and in contrast to ASC can be implemented sensor-free and therefore at low cost.
comment: 14 pages plus 7 pages appendix with additional figures, 18 main figures, 3 tables
A Dynamically Weighted ADMM Framework for Byzantine Resilience
The alternating direction of multipliers method (ADMM) is a popular method to solve distributed consensus optimization utilizing efficient communication among various nodes in the network. However, in the presence of faulty or attacked nodes, even a small perturbation (or sharing false data) during the communication can lead to divergence of the solution. To address this issue, in this work we consider ADMM under the effect of Byzantine threat, where an unknown subset of nodes is subject to Byzantine attacks or faults. We propose Dynamically Weighted ADMM (DW-ADMM), a novel variant of ADMM that uses dynamic weights on the edges of the network, thus promoting resilient distributed optimization. We establish that the proposed method (i) produces a nearly identical solution to conventional ADMM in the error-free case, and (ii) guarantees a bounded solution with respect to the global minimizer, even under Byzantine threat. Finally, we demonstrate the effectiveness of our proposed algorithm using an illustrative numerical simulation.
comment: 7 pages, 2 figures
Identification of Sub/Super-Synchronous Control Interaction Paths Using Dissipative Energy Flow
Sub- and super-synchronous control interactions (SSCIs) are oscillations arising from adverse interactions between inverter-based resource (IBR) controls and the power network. SSCIs often involve multiple frequencies and propagate through complex, interconnected paths, making it difficult for model-based approaches to identify both the sources and the paths of oscillatory energy flow. This paper extends the Dissipative Energy Flow (DEF) method, originally developed for low-frequency electromechanical oscillations, to identify SSCI sources and dynamic interaction paths across multiple frequencies using three-phase voltage and current measurements. The approach operates in the dq frame using dynamic phasors, enabling mode-specific DEF computation from bandpass-filtered signals. An electromagnetic transient (EMT) case study on a meshed network with synchronous generator and type-3 wind farm resources under series-compensated conditions demonstrates the method's capability to distinguish frequency-dependent source and sink roles, including cases where the same resource acts as a source at one frequency and a sink at another. The results show DEF can provide a physics-based and automation-friendly tool for SSCI diagnosis in IBR-rich grids.
comment: 5 pages, 11 figures
Towards Fully Onboard State Estimation and Trajectory Tracking for UAVs with Suspended Payloads
This paper addresses the problem of tracking the position of a cable-suspended payload carried by an unmanned aerial vehicle, with a focus on real-world deployment and minimal hardware requirements. In contrast to many existing approaches that rely on motion-capture systems, additional onboard cameras, or instrumented payloads, we propose a framework that uses only standard onboard sensors--specifically, real-time kinematic global navigation satellite system measurements and data from the onboard inertial measurement unit--to estimate and control the payload's position. The system models the full coupled dynamics of the aerial vehicle and payload, and integrates a linear Kalman filter for state estimation, a model predictive contouring control planner, and an incremental model predictive controller. The control architecture is designed to remain effective despite sensing limitations and estimation uncertainty. Extensive simulations demonstrate that the proposed system achieves performance comparable to control based on ground-truth measurements, with only minor degradation (< 6%). The system also shows strong robustness to variations in payload parameters. Field experiments further validate the framework, confirming its practical applicability and reliable performance in outdoor environments using only off-the-shelf aerial vehicle hardware.
Integrating Uncertainties for Koopman-Based Stabilization
Over the past decades, the Koopman operator has been widely applied in data-driven control, yet its theoretical foundations remain underexplored. This paper establishes a unified framework to address the robust stabilization problem in data-driven control via the Koopman operator, fully accounting for three uncertainties: projection error, estimation error, and process disturbance. It comprehensively investigates both direct and indirect data-driven control approaches, facilitating flexible methodology selection for analysis and control. For the direct approach, considering process disturbances, the lifted-state feedback controller, designed via a linear matrix inequality (LMI), robustly stabilizes all lifted bilinear systems consistent with noisy data. For the indirect approach requiring system identification, the feedback controller, designed using a nonlinear matrix inequality convertible to an LMI, ensures closed-loop stability under worst-case process disturbances. Numerical simulations via cross-validation validate the effectiveness of both approaches, highlighting their theoretical significance and practical utility.
A Comparative Study of Floating-Base Space Parameterizations for Agile Whole-Body Motion Planning
Automatically generating agile whole-body motions for legged and humanoid robots remains a fundamental challenge in robotics. While numerous trajectory optimization approaches have been proposed, there is no clear guideline on how the choice of floating-base space parameterization affects performance, especially for agile behaviors involving complex contact dynamics. In this paper, we present a comparative study of different parameterizations for direct transcription-based trajectory optimization of agile motions in legged systems. We systematically evaluate several common choices under identical optimization settings to ensure a fair comparison. Furthermore, we introduce a novel formulation based on the tangent space of SE(3) for representing the robot's floating-base pose, which, to our knowledge, has not received attention from the literature. This approach enables the use of mature off-the-shelf numerical solvers without requiring specialized manifold optimization techniques. We hope that our experiments and analysis will provide meaningful insights for selecting the appropriate floating-based representation for agile whole-body motion generation.
comment: 8 pages, 2 figures, 4 tables, Accepted at Humanoids 2025
Robust Convolution Neural ODEs via Contractivity-promoting regularization
Neural networks can be fragile to input noise and adversarial attacks. In this work, we consider Convolutional Neural Ordinary Differential Equations (NODEs), a family of continuous-depth neural networks represented by dynamical systems, and propose to use contraction theory to improve their robustness. For a contractive dynamical system two trajectories starting from different initial conditions converge to each other exponentially fast. Contractive Convolutional NODEs can enjoy increased robustness as slight perturbations of the features do not cause a significant change in the output. Contractivity can be induced during training by using a regularization term involving the Jacobian of the system dynamics. To reduce the computational burden, we show that it can also be promoted using carefully selected weight regularization terms for a class of NODEs with slope-restricted activation functions. The performance of the proposed regularizers is illustrated through benchmark image classification tasks on MNIST and FashionMNIST datasets, where images are corrupted by different kinds of noise and attacks.
comment: Accepted in IEEE CDC2025, Rio de Janeiro, Brazil
Principles of Physiological Closed-Loop Controllers in Neuromodulation
As neurostimulation devices increasingly incorporate closed-loop functionality, the greater design complexity brings additional requirements for risk management and special considerations to optimise benefit. This manuscript creates a common framework upon which all current and planned neuromodulation-based physiological closed-loop controllers (PCLCs) can be mapped including integration of the Technical Considerations of Medical Devices with Physiologic Closed-Loop Control Technology guidance published in 2023 by the United States Food and Drug Administration (FDA), a classification of feedback (reactive) and feedforward (predictive) biomarkers, and control systems theory. We explain risk management in the context of this framework and illustrate its applications for three exemplary technologies. This manuscript serves as guidance to the emerging field of PCLCs in neuromodulation, mitigating risk through standardized nomenclature and a systematic outline for rigorous device development, testing, and implementation.
comment: 35 pages, 10 figures
System Synchronization Based on Complex Frequency
In response to the inertia decline caused by high penetration of renewable generation, traditional synchronization criteria that rely solely on frequency consistency are increasingly inadequate for characterizing the coupled behavior of frequency and voltage dynamics during power-system transients. This paper focuses on the theory of complex-frequency synchronization and develops a theory-simulation analysis framework that offers a new perspective for steady-state and transient analysis of low-inertia power systems. First, the fundamental concepts and theoretical foundations of complex-frequency synchronization are presented in detail. Second, local and global dynamic synchronization criteria are derived and the concept of generalized inertia is introduced, which unifies the conventional inertial support to frequency with the inertia-like support of voltage, thereby providing an accurate measure of region-level coupled support strength for voltage and frequency. Finally, numerical case studies on the IEEE 9-bus system validate the effectiveness of the proposed theoretical methods and criteria, and demonstrate a visualization workflow for key indicators such as disturbance impact zones and generalized-inertia regions.
Direct data-driven interpolation and approximation of linear parameter-varying system trajectories
We consider the problem of estimating missing values in trajectories of linear parameter-varying (LPV) systems. We solve this interpolation problem for the class of shifted-affine LPV systems. Conditions for the existence and uniqueness of solutions are given and a direct data-driven algorithm for its computation is presented, i.e., the data-generating system is not given by a parametric model but is implicitly specified by data. We illustrate the applicability of the proposed solution on illustrative examples of a mass-spring-damper system with exogenous and endogenous parameter variation.
comment: 9 pages, 5 figures, submitted for review
Geometry-Aware Predictive Safety Filters on Humanoids: From Poisson Safety Functions to CBF Constrained MPC
Autonomous navigation through unstructured and dynamically-changing environments is a complex task that continues to present many challenges for modern roboticists. In particular, legged robots typically possess manipulable asymmetric geometries which must be considered during safety-critical trajectory planning. This work proposes a predictive safety filter: a nonlinear model predictive control (MPC) algorithm for online trajectory generation with geometry-aware safety constraints based on control barrier functions (CBFs). Critically, our method leverages Poisson safety functions to numerically synthesize CBF constraints directly from perception data. We extend the theoretical framework for Poisson safety functions to incorporate temporal changes in the domain by reformulating the static Dirichlet problem for Poisson's equation as a parameterized moving boundary value problem. Furthermore, we employ Minkowski set operations to lift the domain into a configuration space that accounts for robot geometry. Finally, we implement our real-time predictive safety filter on humanoid and quadruped robots in various safety-critical scenarios. The results highlight the versatility of Poisson safety functions, as well as the benefit of CBF constrained model predictive safety-critical controllers.
comment: 2025 IEEE-RAS 24th International Conference on Humanoid Robots
Recent Advances in Transformer and Large Language Models for UAV Applications
The rapid advancement of Transformer-based models has reshaped the landscape of uncrewed aerial vehicle (UAV) systems by enhancing perception, decision-making, and autonomy. This review paper systematically categorizes and evaluates recent developments in Transformer architectures applied to UAVs, including attention mechanisms, CNN-Transformer hybrids, reinforcement learning Transformers, and large language models (LLMs). Unlike previous surveys, this work presents a unified taxonomy of Transformer-based UAV models, highlights emerging applications such as precision agriculture and autonomous navigation, and provides comparative analyses through structured tables and performance benchmarks. The paper also reviews key datasets, simulators, and evaluation metrics used in the field. Furthermore, it identifies existing gaps in the literature, outlines critical challenges in computational efficiency and real-time deployment, and offers future research directions. This comprehensive synthesis aims to guide researchers and practitioners in understanding and advancing Transformer-driven UAV technologies.
Securing Sideways: Thwarting Lateral Movement by Implementing Active Directory Tiering
The advancement of computing equipment and the advances in services over the Internet has allowed corporations, higher education, and many other organizations to pursue the shared computing network environment. A requirement for shared computing environments is a centralized identity system to authenticate and authorize user access. An organization's digital identity plane is a prime target for cyber threat actors. When compromised, identities can be exploited to steal credentials, create unauthorized accounts, and manipulate permissions-enabling attackers to gain control of the network and undermine its confidentiality, availability, and integrity. Cybercrime losses reached a record of 16.6 B in the United States in 2024. For organizations using Microsoft software, Active Directory is the on-premises identity system of choice. In this article, we examine the challenge of security compromises in Active Directory (AD) environments and present effective strategies to prevent credential theft and limit lateral movement by threat actors. Our proposed approaches aim to confine the movement of compromised credentials, preventing significant privilege escalation and theft. We argue that through our illustration of real-world scenarios, tiering can halt lateral movement and advanced cyber-attacks, thus reducing ransom escalation. Our work bridges a gap in existing literature by combining technical guidelines with theoretical arguments in support of tiering, positioning it as a vital component of modern cybersecurity strategy even though it cannot function in isolation. As the hardware advances and the cloud sourced services along with AI is advancing with unprecedented speed, we think it is important for security experts and the business to work together and start designing and developing software and frameworks to classify devices automatically and accurately within the tiered structure.
comment: 11 pages
Control of a commercial vehicle by a tetraplegic human using a bimanual brain-computer interface
Brain-computer interfaces (BCIs) read neural signals directly from the brain to infer motor planning and execution. However, the implementation of this technology has been largely limited to laboratory settings, with few real-world applications. We developed a bimanual BCI system to drive a vehicle in both simulated and real-world environments. We demonstrate that an individual with tetraplegia, implanted with intracortical BCI electrodes in the posterior parietal cortex (PPC) and the hand knob region of the motor cortex (MC), reacts at least as fast and precisely as motor intact participants, and drives a simulated vehicle as proficiently as the same control group. This BCI participant, living in California, could also remotely drive a Ford Mustang Mach-E vehicle in Michigan. Our first teledriving task relied on cursor control for speed and steering in a closed urban test facility. However, the final BCI system added click control for full-stop braking and thus enabled bimanual cursor-and-click control for both simulated driving through a virtual town with traffic and teledriving through an obstacle course without traffic in the real world. We also demonstrate the safety and feasibility of BCI-controlled driving. This first-of-its-kind implantable BCI application not only highlights the versatility and innovative potentials of BCIs but also illuminates the promising future for the development of life-changing solutions to restore independence to those who suffer catastrophic neurological injury.
comment: 41 pages, 7 figures, 1 table. 22 supplementary pages, 6 supplementary figures, 11 supplementary tables, 9 supplementary movies available as ancillary files
Matrix Control Barrier Functions
This paper generalizes the control barrier function framework by replacing scalar-valued functions with matrix-valued ones. Specifically, we develop barrier conditions for safe sets defined by matrix inequalities -- both semidefinite and indefinite. Matrix inequalities can be used to describe a richer class of safe sets, including nonsmooth ones. The safety filters constructed from our proposed matrix control barrier functions via semidefinite programming (CBF-SDP) are shown to be continuous. Our matrix formulation naturally provides a continuous safety filter for Boolean-based control barrier functions, notably for disjunctions (OR), without relaxing the safe set. We illustrate the effectiveness of the proposed framework with applications in drone network connectivity maintenance and nonsmooth obstacle avoidance, both in simulations and hardware experiments.
comment: 13 pages, 4 figures, submitted to the IEEE Transactions on Automatic Control
Quantifying the Value of Seismic Structural Health Monitoring for post-earthquake recovery of electric power system in terms of resilience enhancement
Post-earthquake recovery of electric power networks (EPNs) is critical to community resilience. Traditional recovery processes often rely on prolonged and imprecise manual inspections for damage diagnosis, leading to suboptimal repair prioritization and extended service disruptions. Seismic Structural Health Monitoring (SSHM) offers the potential to expedite recovery by enabling more accurate and timely damage assessment. However, SSHM deployment incurs costs, and its system-level resilience benefit remains underexplored. This study proposes a probabilistic simulation framework to quantify the value of SSHM for enhancing EPN resilience. The framework includes seismic damage modeling based on network configuration, hazard intensity, fragility functions, and damage-functionality mappings, combined with recovery simulations incorporating resource constraints, repair and transfer durations. System functionality is evaluated using graph-based island detection and optimal power flow analysis. Resilience is quantified via the Lack of Resilience (LoR) metric derived from the functionality restoration curve. SSHM is incorporated by altering the quality of damage information used in repair scheduling. Different monitoring scenarios (e.g., no-SSHM baseline, partial SSHM, full SSHM with various accuracies) are modeled using confusion matrices to simulate damage misclassification. Results show that improved damage awareness via SSHM significantly accelerates recovery and reduces LoR by up to 21%. This work supports evidence-based decisions for SSHM deployment in critical infrastructure.
comment: 21 pages. 14 figures
Transferable Parasitic Estimation via Graph Contrastive Learning and Label Rebalancing in AMS Circuits
Graph representation learning on Analog-Mixed Signal (AMS) circuits is crucial for various downstream tasks, e.g., parasitic estimation. However, the scarcity of design data, the unbalanced distribution of labels, and the inherent diversity of circuit implementations pose significant challenges to learning robust and transferable circuit representations. To address these limitations, we propose CircuitGCL, a novel graph contrastive learning framework that integrates representation scattering and label rebalancing to enhance transferability across heterogeneous circuit graphs. CircuitGCL employs a self-supervised strategy to learn topology-invariant node embeddings through hyperspherical representation scattering, eliminating dependency on large-scale data. Simultaneously, balanced mean squared error (BMSE) and balanced softmax cross-entropy (BSCE) losses are introduced to mitigate label distribution disparities between circuits, enabling robust and transferable parasitic estimation. Evaluated on parasitic capacitance estimation (edge-level task) and ground capacitance classification (node-level task) across TSMC 28nm AMS designs, CircuitGCL outperforms all state-of-the-art (SOTA) methods, with the $R^2$ improvement of $33.64\% \sim 44.20\%$ for edge regression and F1-score gain of $0.9\times \sim 2.1\times$ for node classification. Our code is available at https://github.com/ShenShan123/CircuitGCL.
comment: Final version accepted by the International Conference on Computer-Aided Design (ICCAD) 2025
Decentralized State Estimation and Opacity Verification Based on Partially Ordered Observation Sequences
In this paper, we investigate state estimation and opacity verification problems within a decentralized observation architecture. Specifically, we consider a discrete event system whose behavior is recorded by a set of observation sites. These sites transmit the partially ordered sequences of observations that they record to a coordinator whenever a synchronization occurs. To properly analyze the system behavior from the coordinator's viewpoint, we first introduce the notion of a Complete Synchronizing Sequence structure (CSS structure), which concisely captures the state evolution of each system state upon different information provided by the observation sites. Based on the CSS structure, we then construct corresponding current-state and initial-state estimators for offline state estimation at the coordinator. When used to verify state-isolation properties under this decentralized architecture, the use of CSS structure demonstrates a significant reduction in complexity compared with existing approaches in the literature. In particular, we discuss how to verify initial-state opacity at the coordinator, as well as a novel opacity notion, namely current-state-at-synchronization opacity.
Propeller Motion of a Devil-Stick using Normal Forcing
The problem of realizing rotary propeller motion of a devil-stick in the vertical plane using forces purely normal to the stick is considered. This problem represents a nonprehensile manipulation task of an underactuated system. In contrast with previous approaches, the devil-stick is manipulated by controlling the normal force and its point of application. Virtual holonomic constraints are used to design the trajectory of the center-of-mass of the devil-stick in terms of its orientation angle, and conditions for stable propeller motion are derived. Intermittent large-amplitude forces are used to asymptotically stabilize a desired propeller motion. Simulations demonstrate the efficacy of the approach in realizing stable propeller motion without loss of contact between the actuator and devil-stick.
comment: 6 pages, 5 figures. This work has been accepted for publication in the proceedings of the 2025 IEEE Conference on Control Technology and Applications (CCTA)
Random Walk Learning and the Pac-Man Attack
Random walk (RW)-based algorithms have long been popular in distributed systems due to low overheads and scalability, with recent growing applications in decentralized learning. However, their reliance on local interactions makes them inherently vulnerable to malicious behavior. In this work, we investigate an adversarial threat that we term the ``Pac-Man'' attack, in which a malicious node probabilistically terminates any RW that visits it. This stealthy behavior gradually eliminates active RWs from the network, effectively halting the learning process without triggering failure alarms. To counter this threat, we propose the Average Crossing (AC) algorithm--a fully decentralized mechanism for duplicating RWs to prevent RW extinction in the presence of Pac-Man. Our theoretical analysis establishes that (i) the RW population remains almost surely bounded under AC and (ii) RW-based stochastic gradient descent remains convergent under AC, even in the presence of Pac-Man, with a quantifiable deviation from the true optimum. Our extensive empirical results on both synthetic and real-world datasets corroborate our theoretical findings. Furthermore, they uncover a phase transition in the extinction probability as a function of the duplication threshold. We offer theoretical insights by analyzing a simplified variant of the AC, which sheds light on the observed phase transition.
comment: The updated manuscript represents an incomplete version of the work. A substantially updated version will be prepared before further dissemination
A Transferable Physics-Informed Framework for Battery Degradation Diagnosis, Knee-Onset Detection and Knee Prediction
The techno-economic and safety concerns of battery capacity knee occurrence call for developing online knee detection and prediction methods as an advanced battery management system (BMS) function. To address this, a transferable physics-informed framework that consists of a histogram-based feature engineering method, a hybrid physics-informed model, and a fine-tuning strategy, is proposed for online battery degradation diagnosis and knee-onset detection. The hybrid model is first developed and evaluated using a scenario-aware pipeline in protocol cycling scenarios and then fine-tuned to create local models deployed in a dynamic cycling scenario. A 2D histogram-based 17-feature set is found to be the best choice in both source and target scenarios. The fine-tuning strategy is proven to be effective in improving battery degradation mode estimation and degradation phase detection performance in the target scenario. Again, a strong linear correlation was found between the identified knee-onset and knee points. As a result, advanced BMS functions, such as online degradation diagnosis and prognosis, online knee-onset detection and knee prediction, aging-aware battery classification, and second-life repurposing, can be enabled through a battery performance digital twin in the cloud.
Embedding Safety into RL: A New Take on Trust Region Methods ICML 2025
Reinforcement Learning (RL) agents can solve diverse tasks but often exhibit unsafe behavior. Constrained Markov Decision Processes (CMDPs) address this by enforcing safety constraints, yet existing methods either sacrifice reward maximization or allow unsafe training. We introduce Constrained Trust Region Policy Optimization (C-TRPO), which reshapes the policy space geometry to ensure trust regions contain only safe policies, guaranteeing constraint satisfaction throughout training. We analyze its theoretical properties and connections to TRPO, Natural Policy Gradient (NPG), and Constrained Policy Optimization (CPO). Experiments show that C-TRPO reduces constraint violations while maintaining competitive returns.
comment: Accepted at ICML 2025
LQG Information Design
This paper addresses information design in a workhorse model of network games, where agents have linear best responses, the information designer maximizes a quadratic objective, and the payoff-relevant state follows a multivariate Gaussian distribution. We formulate the problem as a semidefinite program and establish strong duality to characterize the optimal information structure. A necessary and sufficient condition for optimality is given by a simple linear relationship between the induced equilibrium strategy profile and the state. Leveraging this characterization, we show that the state is fully revealed in an aggregative form for welfare maximization, while individual agents may remain only partially informed. When agent roles are interchangeable, the optimal information structure inherits the same degree of symmetry, which facilitates computation. In such cases, we show that the optimal amount of information revealed to each agent is closely linked to the network's chromatic number.
Multi-Class Stackelberg Games for the Co-Design of Networked Systems
We investigate a co-design problem, encompassing simultaneous design of system infrastructure and control, through a game-theoretical framework. To this end, we propose the co-design problem as a two-layer hierarchical strategic interaction. At the upper layer, a leader (or multiple leaders) determines system design parameters, while at the lower layer, a follower (or multiple followers) optimizes the control strategy. To capture this hierarchy, we propose four novel classes of Stackelberg games that integrate diverse strategic behaviors, including combinations of cooperative and non-cooperative interactions across two different layers. Notably, the leaders' interactions are represented using a normal-form game, whereas the followers' interactions are modeled by different games (dynamic games in discrete time). These distinct game structures result in a Stackelberg game that accommodates different game types per layer, and/or supports heterogeneous strategic behaviors involving cooperation and non-cooperation simultaneously. Learning algorithms using the best-response dynamics are used to solve the game problems when considering a discrete strategic space for the leaders. The efficacy of the proposed approach is demonstrated through an application to the co-design of the Barcelona drinking water network.
comment: This work has been submitted to the IEEE for possible publication
MARS-FTCP: Robust Fault-Tolerant Control and Agile Trajectory Planning for Modular Aerial Robot Systems
Modular Aerial Robot Systems (MARS) consist of multiple drone units that can self-reconfigure to adapt to various mission requirements and fault conditions. However, existing fault-tolerant control methods exhibit significant oscillations during docking and separation, impacting system stability. To address this issue, we propose a novel fault-tolerant control reallocation method that adapts to an arbitrary number of modular robots and their assembly formations. The algorithm redistributes the expected collective force and torque required for MARS to individual units according to their moment arm relative to the center of MARS mass. Furthermore, we propose an agile trajectory planning method for MARS of arbitrary configurations, which is collision-avoiding and dynamically feasible. Our work represents the first comprehensive approach to enable fault-tolerant and collision avoidance flight for MARS. We validate our method through extensive simulations, demonstrating improved fault tolerance, enhanced trajectory tracking accuracy, and greater robustness in cluttered environments. The videos and source code of this work are available at https://github.com/RuiHuangNUS/MARS-FTCP/
Formal Verification and Control with Conformal Prediction
We present recent advances in formal verification and control for autonomous systems with practical safety guarantees enabled by conformal prediction (CP), a statistical tool for uncertainty quantification. This survey is particularly motivated by learning-enabled autonomous systems (LEASs), where the complexity of learning-enabled components (LECs) poses a major bottleneck for applying traditional model-based verification and control techniques. To address this challenge, we advocate for CP as a lightweight alternative and demonstrate its use in formal verification, systems and control, and robotics. CP is appealing due to its simplicity (easy to understand, implement, and adapt), generality (requires no assumptions on learned models and underlying data distributions), and efficiency (real-time capable and accurate). This survey provides an accessible introduction to CP for non-experts interested in applying CP to autonomy problems. We particularly show how CP can be used for formal verification of LECs and the design of safe control as well as offline and online verification algorithms for LEASs. We present these techniques within a unifying framework that addresses the complexity of LEASs. Our exposition spans simple specifications, such as robot navigation tasks, to complex mission requirements expressed in temporal logic. Throughout the survey, we contrast CP with other statistical techniques, including scenario optimization and PAC-Bayes theory, highlighting advantages and limitations for verification and control. Finally, we outline open problems and promising directions for future research.
Estimating Reliability of Electric Vehicle Charging Ecosystem using the Principle of Maximum Entropy
This paper addresses the critical challenge of estimating the reliability of an Electric Vehicle (EV) charging systems when facing risks such as overheating, unpredictable, weather, and cyberattacks. Traditional methods for predicting failures often rely on past data or limiting assumptions, making them ineffective for new or less common threats that results in failure. To solve this issue, we utilize the Principle of Maximum Entropy (PME), a statistical tool that estimates risks even with limited information. PME works by balancing known constraints to create an unbiased predictions without guessing missing details. Using the EV charging ecosystem as a case study, we show how PME models stress factors responsible for failure. Our findings reveal a critical insight: even minor, localized stress events can trigger disproportionately large drops in overall system reliability, similar to a domino effect. The our PME model demonstrates how high-impact components, such as the power grid, are more likely to fail as stress accumulates, creating network-wide tipping points. Beyond EVs, this approach applies to any complex system with incomplete data, such as smart grids, healthcare devices, or logistics networks. By mathematically establishing an inverse relationship between uncertainty (entropy) and reliability, our work quantifies how greater system unpredictability directly degrades robustness. This offers a universal tool to improve decision-making under unpredictable conditions. This work bridges advanced mathematics with real-world engineering, providing actionable insights for policymakers and industries to build safer, more efficient systems in our increasingly connected world.
Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence
Zero-sum Linear Quadratic (LQ) games are fundamental in optimal control and can be used (i)~as a dynamic game formulation for risk-sensitive or robust control and (ii)~as a benchmark setting for multi-agent reinforcement learning with two competing agents in continuous state-control spaces. In contrast to the well-studied single-agent linear quadratic regulator problem, zero-sum LQ games entail solving a challenging nonconvex-nonconcave min-max problem with an objective function that lacks coercivity. Recently, Zhang et al. showed that an~$\epsilon$-Nash equilibrium (NE) of finite horizon zero-sum LQ games can be learned via nested model-free Natural Policy Gradient (NPG) algorithms with poly$(1/\epsilon)$ sample complexity. In this work, we propose a simpler nested Zeroth-Order (ZO) algorithm improving sample complexity by several orders of magnitude and guaranteeing convergence of the last iterate. Our main results are two-fold: (i) in the deterministic setting, we establish the first global last-iterate linear convergence result for the nested algorithm that seeks NE of zero-sum LQ games; (ii) in the model-free setting, we establish a~$\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity using a single-point ZO estimator. For our last-iterate convergence results, our analysis leverages the Implicit Regularization (IR) property and a new gradient domination condition for the primal function. Our key improvements in the sample complexity rely on a more sample-efficient nested algorithm design and a finer control of the ZO natural gradient estimation error utilizing the structure endowed by the finite-horizon setting.
Robotics
TLE-Based A2C Agent for Terrestrial Coverage Orbital Path Planning
The increasing congestion of Low Earth Orbit (LEO) poses persistent challenges to the efficient deployment and safe operation of Earth observation satellites. Mission planners must now account not only for mission-specific requirements but also for the increasing collision risk with active satellites and space debris. This work presents a reinforcement learning framework using the Advantage Actor-Critic (A2C) algorithm to optimize satellite orbital parameters for precise terrestrial coverage within predefined surface radii. By formulating the problem as a Markov Decision Process (MDP) within a custom OpenAI Gymnasium environment, our method simulates orbital dynamics using classical Keplerian elements. The agent progressively learns to adjust five of the orbital parameters - semi-major axis, eccentricity, inclination, right ascension of ascending node, and the argument of perigee-to achieve targeted terrestrial coverage. Comparative evaluation against Proximal Policy Optimization (PPO) demonstrates A2C's superior performance, achieving 5.8x higher cumulative rewards (10.0 vs 9.263025) while converging in 31.5x fewer timesteps (2,000 vs 63,000). The A2C agent consistently meets mission objectives across diverse target coordinates while maintaining computational efficiency suitable for real-time mission planning applications. Key contributions include: (1) a TLE-based orbital simulation environment incorporating physics constraints, (2) validation of actor-critic methods' superiority over trust region approaches in continuous orbital control, and (3) demonstration of rapid convergence enabling adaptive satellite deployment. This approach establishes reinforcement learning as a computationally efficient alternative for scalable and intelligent LEO mission planning.
comment: 8 pages, 6 figures, 5 tables
CVIRO: A Consistent and Tightly-Coupled Visual-Inertial-Ranging Odometry on Lie Groups
Ultra Wideband (UWB) is widely used to mitigate drift in visual-inertial odometry (VIO) systems. Consistency is crucial for ensuring the estimation accuracy of a UWBaided VIO system. An inconsistent estimator can degrade localization performance, where the inconsistency primarily arises from two main factors: (1) the estimator fails to preserve the correct system observability, and (2) UWB anchor positions are assumed to be known, leading to improper neglect of calibration uncertainty. In this paper, we propose a consistent and tightly-coupled visual-inertial-ranging odometry (CVIRO) system based on the Lie group. Our method incorporates the UWB anchor state into the system state, explicitly accounting for UWB calibration uncertainty and enabling the joint and consistent estimation of both robot and anchor states. Furthermore, observability consistency is ensured by leveraging the invariant error properties of the Lie group. We analytically prove that the CVIRO algorithm naturally maintains the system's correct unobservable subspace, thereby preserving estimation consistency. Extensive simulations and experiments demonstrate that CVIRO achieves superior localization accuracy and consistency compared to existing methods.
A Multimodal Neural Network for Recognizing Subjective Self-Disclosure Towards Social Robots IROS
Subjective self-disclosure is an important feature of human social interaction. While much has been done in the social and behavioural literature to characterise the features and consequences of subjective self-disclosure, little work has been done thus far to develop computational systems that are able to accurately model it. Even less work has been done that attempts to model specifically how human interactants self-disclose with robotic partners. It is becoming more pressing as we require social robots to work in conjunction with and establish relationships with humans in various social settings. In this paper, our aim is to develop a custom multimodal attention network based on models from the emotion recognition literature, training this model on a large self-collected self-disclosure video corpus, and constructing a new loss function, the scale preserving cross entropy loss, that improves upon both classification and regression versions of this problem. Our results show that the best performing model, trained with our novel loss function, achieves an F1 score of 0.83, an improvement of 0.48 from the best baseline model. This result makes significant headway in the aim of allowing social robots to pick up on an interaction partner's self-disclosures, an ability that will be essential in social robots with social cognition.
comment: Accepted at 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
The SET Perceptual Factors Framework: Towards Assured Perception for Autonomous Systems
Future autonomous systems promise significant societal benefits, yet their deployment raises concerns about safety and trustworthiness. A key concern is assuring the reliability of robot perception, as perception seeds safe decision-making. Failures in perception are often due to complex yet common environmental factors and can lead to accidents that erode public trust. To address this concern, we introduce the SET (Self, Environment, and Target) Perceptual Factors Framework. We designed the framework to systematically analyze how factors such as weather, occlusion, or sensor limitations negatively impact perception. To achieve this, the framework employs SET State Trees to categorize where such factors originate and SET Factor Trees to model how these sources and factors impact perceptual tasks like object detection or pose estimation. Next, we develop Perceptual Factor Models using both trees to quantify the uncertainty for a given task. Our framework aims to promote rigorous safety assurances and cultivate greater public understanding and trust in autonomous systems by offering a transparent and standardized method for identifying, modeling, and communicating perceptual risks.
comment: 4 pages, 4 figures, accepted to the Workshop on Public Trust in Autonomous Systems at the 2025 IEEE International Conference on Robotics & Automation
Learning Task Execution Hierarchies for Redundant Robots
Modern robotic systems, such as mobile manipulators, humanoids, and aerial robots with arms, often possess high redundancy, enabling them to perform multiple tasks simultaneously. Managing this redundancy is key to achieving reliable and flexible behavior. A widely used approach is the Stack of Tasks (SoT), which organizes control objectives by priority within a unified framework. However, traditional SoTs are manually designed by experts, limiting their adaptability and accessibility. This paper introduces a novel framework that automatically learns both the hierarchy and parameters of a SoT from user-defined objectives. By combining Reinforcement Learning and Genetic Programming, the system discovers task priorities and control strategies without manual intervention. A cost function based on intuitive metrics such as precision, safety, and execution time guides the learning process. We validate our method through simulations and experiments on the mobile-YuMi platform, a dual-arm mobile manipulator with high redundancy. Results show that the learned SoTs enable the robot to dynamically adapt to changing environments and inputs, balancing competing objectives while maintaining robust task execution. This approach provides a general and user-friendly solution for redundancy management in complex robots, advancing human-centered robot programming and reducing the need for expert design.
Scaling Up without Fading Out: Goal-Aware Sparse GNN for RL-based Generalized Planning
Generalized planning using deep reinforcement learning (RL) combined with graph neural networks (GNNs) has shown promising results in various symbolic planning domains described by PDDL. However, existing approaches typically represent planning states as fully connected graphs, leading to a combinatorial explosion in edge information and substantial sparsity as problem scales grow, especially evident in large grid-based environments. This dense representation results in diluted node-level information, exponentially increases memory requirements, and ultimately makes learning infeasible for larger-scale problems. To address these challenges, we propose a sparse, goal-aware GNN representation that selectively encodes relevant local relationships and explicitly integrates spatial features related to the goal. We validate our approach by designing novel drone mission scenarios based on PDDL within a grid world, effectively simulating realistic mission execution environments. Our experimental results demonstrate that our method scales effectively to larger grid sizes previously infeasible with dense graph representations and substantially improves policy generalization and success rates. Our findings provide a practical foundation for addressing realistic, large-scale generalized planning tasks.
comment: 16 pages, 10 figures
Biasing Frontier-Based Exploration with Saliency Areas
Autonomous exploration is a widely studied problem where a robot incrementally builds a map of a previously unknown environment. The robot selects the next locations to reach using an exploration strategy. To do so, the robot has to balance between competing objectives, like exploring the entirety of the environment, while being as fast as possible. Most exploration strategies try to maximise the explored area to speed up exploration; however, they do not consider that parts of the environment are more important than others, as they lead to the discovery of large unknown areas. We propose a method that identifies \emph{saliency areas} as those areas that are of high interest for exploration, by using saliency maps obtained from a neural network that, given the current map, implements a termination criterion to estimate whether the environment can be considered fully-explored or not. We use saliency areas to bias some widely used exploration strategies, showing, with an extensive experimental campaign, that this knowledge can significantly influence the behavior of the robot during exploration.
comment: Accepted at the European Confrence on Mobile Robots (ECMR) 2025
An Open-Source User-Friendly Interface for Simulating Magnetic Soft Robots using Simulation Open Framework Architecture (SOFA)
Soft robots, particularly magnetic soft robots, require specialized simulation tools to accurately model their deformation under external magnetic fields. However, existing platforms often lack dedicated support for magnetic materials, making them difficult to use for researchers at different expertise levels. This work introduces an open-source, user-friendly simulation interface using the Simulation Open Framework Architecture (SOFA), specifically designed to model magnetic soft robots. The tool enables users to define material properties, apply magnetic fields, and observe resulting deformations in real time. By integrating intuitive controls and stress analysis capabilities, it aims to bridge the gap between theoretical modeling and practical design. Four benchmark models - a beam, three- and four-finger grippers, and a butterfly - demonstrate its functionality. The software's ease of use makes it accessible to both beginners and advanced researchers. Future improvements will refine accuracy through experimental validation and comparison with industry-standard finite element solvers, ensuring realistic and predictive simulations of magnetic soft robots.
Synthesis of Deep Neural Networks with Safe Robust Adaptive Control for Reliable Operation of Wheeled Mobile Robots
Deep neural networks (DNNs) can enable precise control while maintaining low computational costs by circumventing the need for dynamic modeling. However, the deployment of such black-box approaches remains challenging for heavy-duty wheeled mobile robots (WMRs), which are subject to strict international standards and prone to faults and disturbances. We designed a hierarchical control policy for heavy-duty WMRs, monitored by two safety layers with differing levels of authority. To this end, a DNN policy was trained and deployed as the primary control strategy, providing high-precision performance under nominal operating conditions. When external disturbances arise and reach a level of intensity such that the system performance falls below a predefined threshold, a low-level safety layer intervenes by deactivating the primary control policy and activating a model-free robust adaptive control (RAC) policy. This transition enables the system to continue operating while ensuring stability by effectively managing the inherent trade-off between system robustness and responsiveness. Regardless of the control policy in use, a high-level safety layer continuously monitors system performance during operation. It initiates a shutdown only when disturbances become sufficiently severe such that compensation is no longer viable and continued operation would jeopardize the system or its environment. The proposed synthesis of DNN and RAC policy guarantees uniform exponential stability of the entire WMR system while adhering to safety standards to some extent. The effectiveness of the proposed approach was further validated through real-time experiments using a 6,000 kg WMR.
Why Report Failed Interactions With Robots?! Towards Vignette-based Interaction Quality
Although the quality of human-robot interactions has improved with the advent of LLMs, there are still various factors that cause systems to be sub-optimal when compared to human-human interactions. The nature and criticality of failures are often dependent on the context of the interaction and so cannot be generalized across the wide range of scenarios and experiments which have been implemented in HRI research. In this work we propose the use of a technique overlooked in the field of HRI, ethnographic vignettes, to clearly highlight these failures, particularly those that are rarely documented. We describe the methodology behind the process of writing vignettes and create our own based on our personal experiences with failures in HRI systems. We emphasize the strength of vignettes as the ability to communicate failures from a multi-disciplinary perspective, promote transparency about the capabilities of robots, and document unexpected behaviours which would otherwise be omitted from research reports. We encourage the use of vignettes to augment existing interaction evaluation methods.
comment: Accepted at the workshop on Real-World HRI in Public and Private Spaces: Successes, Failures, and Lessons Learned (PubRob-Fails), held at the IEEE RO-MAN Conference, 2025. 6 pages
SpaRC-AD: A Baseline for Radar-Camera Fusion in End-to-End Autonomous Driving
End-to-end autonomous driving systems promise stronger performance through unified optimization of perception, motion forecasting, and planning. However, vision-based approaches face fundamental limitations in adverse weather conditions, partial occlusions, and precise velocity estimation - critical challenges in safety-sensitive scenarios where accurate motion understanding and long-horizon trajectory prediction are essential for collision avoidance. To address these limitations, we propose SpaRC-AD, a query-based end-to-end camera-radar fusion framework for planning-oriented autonomous driving. Through sparse 3D feature alignment, and doppler-based velocity estimation, we achieve strong 3D scene representations for refinement of agent anchors, map polylines and motion modelling. Our method achieves strong improvements over the state-of-the-art vision-only baselines across multiple autonomous driving tasks, including 3D detection (+4.8% mAP), multi-object tracking (+8.3% AMOTA), online mapping (+1.8% mAP), motion prediction (-4.0% mADE), and trajectory planning (-0.1m L2 and -9% TPC). We achieve both spatial coherence and temporal consistency on multiple challenging benchmarks, including real-world open-loop nuScenes, long-horizon T-nuScenes, and closed-loop simulator Bench2Drive. We show the effectiveness of radar-based fusion in safety-critical scenarios where accurate motion understanding and long-horizon trajectory prediction are essential for collision avoidance. The source code of all experiments is available at https://phi-wol.github.io/sparcad/
comment: 8 pages, 4 figures, 5 tables
MLM: Learning Multi-task Loco-Manipulation Whole-Body Control for Quadruped Robot with Arm
Whole-body loco-manipulation for quadruped robots with arm remains a challenging problem, particularly in achieving multi-task control. To address this, we propose MLM, a reinforcement learning framework driven by both real-world and simulation data. It enables a six-DoF robotic arm--equipped quadruped robot to perform whole-body loco-manipulation for multiple tasks autonomously or under human teleoperation. To address the problem of balancing multiple tasks during the learning of loco-manipulation, we introduce a trajectory library with an adaptive, curriculum-based sampling mechanism. This approach allows the policy to efficiently leverage real-world collected trajectories for learning multi-task loco-manipulation. To address deployment scenarios with only historical observations and to enhance the performance of policy execution across tasks with different spatial ranges, we propose a Trajectory-Velocity Prediction policy network. It predicts unobservable future trajectories and velocities. By leveraging extensive simulation data and curriculum-based rewards, our controller achieves whole-body behaviors in simulation and zero-shot transfer to real-world deployment. Ablation studies in simulation verify the necessity and effectiveness of our approach, while real-world experiments on the Go2 robot with an Airbot robotic arm demonstrate the policy's good performance in multi-task execution.
KDPE: A Kernel Density Estimation Strategy for Diffusion Policy Trajectory Selection
Learning robot policies that capture multimodality in the training data has been a long-standing open challenge for behavior cloning. Recent approaches tackle the problem by modeling the conditional action distribution with generative models. One of these approaches is Diffusion Policy, which relies on a diffusion model to denoise random points into robot action trajectories. While achieving state-of-the-art performance, it has two main drawbacks that may lead the robot out of the data distribution during policy execution. First, the stochasticity of the denoising process can highly impact on the quality of generated trajectory of actions. Second, being a supervised learning approach, it can learn data outliers from the dataset used for training. Recent work focuses on mitigating these limitations by combining Diffusion Policy either with large-scale training or with classical behavior cloning algorithms. Instead, we propose KDPE, a Kernel Density Estimation-based strategy that filters out potentially harmful trajectories output of Diffusion Policy while keeping a low test-time computational overhead. For Kernel Density Estimation, we propose a manifold-aware kernel to model a probability density function for actions composed of end-effector Cartesian position, orientation, and gripper state. KDPE overall achieves better performance than Diffusion Policy on simulated single-arm tasks and real robot experiments. Additional material and code are available on our project page https://hsp-iit.github.io/KDPE/.
comment: 9th Conference on Robot Learning (CoRL 2025), Seoul, Korea
Enabling Generic Robot Skill Implementation Using Object Oriented Programming
Developing robotic algorithms and integrating a robotic subsystem into a larger system can be a difficult task. Particularly in small and medium-sized enterprises (SMEs) where robotics expertise is lacking, implementing, maintaining and developing robotic systems can be a challenge. As a result, many companies rely on external expertise through system integrators, which, in some cases, can lead to vendor lock-in and external dependency. In the academic research on intelligent manufacturing systems, robots play a critical role in the design of robust autonomous systems. Similar challenges are faced by researchers who want to use robotic systems as a component in a larger smart system, without having to deal with the complexity and vastness of the robot interfaces in detail. In this paper, we propose a software framework that reduces the effort required to deploy a working robotic system. The focus is solely on providing a concept for simplifying the different interfaces of a modern robot system and using an abstraction layer for different manufacturers and models. The Python programming language is used to implement a prototype of the concept. The target system is a bin-picking cell containing a Yaskawa Motoman GP4.
comment: 34th International Conference on Robotics in Alpe-Adria-Danube Region (RAAD 2025)
MASH: Cooperative-Heterogeneous Multi-Agent Reinforcement Learning for Single Humanoid Robot Locomotion
This paper proposes a novel method to enhance locomotion for a single humanoid robot through cooperative-heterogeneous multi-agent deep reinforcement learning (MARL). While most existing methods typically employ single-agent reinforcement learning algorithms for a single humanoid robot or MARL algorithms for multi-robot system tasks, we propose a distinct paradigm: applying cooperative-heterogeneous MARL to optimize locomotion for a single humanoid robot. The proposed method, multi-agent reinforcement learning for single humanoid locomotion (MASH), treats each limb (legs and arms) as an independent agent that explores the robot's action space while sharing a global critic for cooperative learning. Experiments demonstrate that MASH accelerates training convergence and improves whole-body cooperation ability, outperforming conventional single-agent reinforcement learning methods. This work advances the integration of MARL into single-humanoid-robot control, offering new insights into efficient locomotion strategies.
CorrectNav: Self-Correction Flywheel Empowers Vision-Language-Action Navigation Model
Existing vision-and-language navigation models often deviate from the correct trajectory when executing instructions. However, these models lack effective error correction capability, hindering their recovery from errors. To address this challenge, we propose Self-correction Flywheel, a novel post-training paradigm. Instead of considering the model's error trajectories on the training set as a drawback, our paradigm emphasizes their significance as a valuable data source. We have developed a method to identify deviations in these error trajectories and devised innovative techniques to automatically generate self-correction data for perception and action. These self-correction data serve as fuel to power the model's continued training. The brilliance of our paradigm is revealed when we re-evaluate the model on the training set, uncovering new error trajectories. At this time, the self-correction flywheel begins to spin. Through multiple flywheel iterations, we progressively enhance our monocular RGB-based VLA navigation model CorrectNav. Experiments on R2R-CE and RxR-CE benchmarks show CorrectNav achieves new state-of-the-art success rates of 65.1% and 69.3%, surpassing prior best VLA navigation models by 8.2% and 16.4%. Real robot tests in various indoor and outdoor environments demonstrate \method's superior capability of error correction, dynamic obstacle avoidance, and long instruction following.
Probabilistic Latency Analysis of the Data Distribution Service in ROS 2
Robot Operating System 2 (ROS 2) is now the de facto standard for robotic communication, pairing UDP transport with the Data Distribution Service (DDS) publish-subscribe middleware. DDS achieves reliability through periodic heartbeats that solicit acknowledgments for missing samples and trigger selective retransmissions. In lossy wireless networks, the tight coupling among heartbeat period, IP fragmentation, and retransmission interval obscures end to end latency behavior and leaves practitioners with little guidance on how to tune these parameters. To address these challenges, we propose a probabilistic latency analysis (PLA) that analytically models the reliable transmission process of ROS 2 DDS communication using a discrete state approach. By systematically analyzing both middleware level and transport level events, PLA computes the steady state probability distribution of unacknowledged messages and the retransmission latency. We validate our PLA across 270 scenarios, exploring variations in packet delivery ratios, message sizes, and both publishing and retransmission intervals, demonstrating a close alignment between analytical predictions and experimental results. Our findings establish a theoretical basis to systematically optimize reliability, latency, and performance in wireless industrial robotics.
comment: 12 pages, 5 figures
Large Model Empowered Embodied AI: A Survey on Decision-Making and Embodied Learning
Embodied AI aims to develop intelligent systems with physical forms capable of perceiving, decision-making, acting, and learning in real-world environments, providing a promising way to Artificial General Intelligence (AGI). Despite decades of explorations, it remains challenging for embodied agents to achieve human-level intelligence for general-purpose tasks in open dynamic environments. Recent breakthroughs in large models have revolutionized embodied AI by enhancing perception, interaction, planning and learning. In this article, we provide a comprehensive survey on large model empowered embodied AI, focusing on autonomous decision-making and embodied learning. We investigate both hierarchical and end-to-end decision-making paradigms, detailing how large models enhance high-level planning, low-level execution, and feedback for hierarchical decision-making, and how large models enhance Vision-Language-Action (VLA) models for end-to-end decision making. For embodied learning, we introduce mainstream learning methodologies, elaborating on how large models enhance imitation learning and reinforcement learning in-depth. For the first time, we integrate world models into the survey of embodied AI, presenting their design methods and critical roles in enhancing decision-making and learning. Though solid advances have been achieved, challenges still exist, which are discussed at the end of this survey, potentially as the further research directions.
Super LiDAR Reflectance for Robotic Perception
Conventionally, human intuition often defines vision as a modality of passive optical sensing, while active optical sensing is typically regarded as measuring rather than the default modality of vision. However, the situation now changes: sensor technologies and data-driven paradigms empower active optical sensing to redefine the boundaries of vision, ushering in a new era of active vision. Light Detection and Ranging (LiDAR) sensors capture reflectance from object surfaces, which remains invariant under varying illumination conditions, showcasing significant potential in robotic perception tasks such as detection, recognition, segmentation, and Simultaneous Localization and Mapping (SLAM). These applications often rely on dense sensing capabilities, typically achieved by high-resolution, expensive LiDAR sensors. A key challenge with low-cost LiDARs lies in the sparsity of scan data, which limits their broader application. To address this limitation, this work introduces an innovative framework for generating dense LiDAR reflectance images from sparse data, leveraging the unique attributes of non-repeating scanning LiDAR (NRS-LiDAR). We tackle critical challenges, including reflectance calibration and the transition from static to dynamic scene domains, facilitating the reconstruction of dense reflectance images in real-world settings. The key contributions of this work include a comprehensive dataset for LiDAR reflectance image densification, a densification network tailored for NRS-LiDAR, and diverse applications such as loop closure and traffic lane detection using the generated dense reflectance images.
A Semantic-Aware Framework for Safe and Intent-Integrative Assistance in Upper-Limb Exoskeletons
Upper-limb exoskeletons are primarily designed to provide assistive support by accurately interpreting and responding to human intentions. In home-care scenarios, exoskeletons are expected to adapt their assistive configurations based on the semantic information of the task, adjusting appropriately in accordance with the nature of the object being manipulated. However, existing solutions often lack the ability to understand task semantics or collaboratively plan actions with the user, limiting their generalizability. To address this challenge, this paper introduces a semantic-aware framework that integrates large language models into the task planning framework, enabling the delivery of safe and intent-integrative assistance. The proposed approach begins with the exoskeleton operating in transparent mode to capture the wearer's intent during object grasping. Once semantic information is extracted from the task description, the system automatically configures appropriate assistive parameters. In addition, a diffusion-based anomaly detector is used to continuously monitor the state of human-robot interaction and trigger real-time replanning in response to detected anomalies. During task execution, online trajectory refinement and impedance control are used to ensure safety and regulate human-robot interaction. Experimental results demonstrate that the proposed method effectively aligns with the wearer's cognition, adapts to semantically varying tasks, and responds reliably to anomalies.
Few-shot Vision-based Human Activity Recognition with MLLM-based Visual Reinforcement Learning
Reinforcement learning in large reasoning models enables learning from feedback on their outputs, making it particularly valuable in scenarios where fine-tuning data is limited. However, its application in multi-modal human activity recognition (HAR) domains remains largely underexplored. Our work extends reinforcement learning to the human activity recognition domain with multimodal large language models. By incorporating visual reinforcement learning in the training process, the model's generalization ability on few-shot recognition can be greatly improved. Additionally, visual reinforcement learning can enhance the model's reasoning ability and enable explainable analysis in the inference stage. We name our few-shot human activity recognition method with visual reinforcement learning FAVOR. Specifically, our approach first utilizes a multimodal large language model (MLLM) to generate multiple candidate responses for the human activity image, each containing reasoning traces and final answers. These responses are then evaluated using reward functions, and the MLLM model is subsequently optimized using the Group Relative Policy Optimization (GRPO) algorithm. In this way, the MLLM model can be adapted to human activity recognition with only a few samples. Extensive experiments on four human activity recognition datasets and five different settings demonstrate the superiority of the proposed method.
BEASST: Behavioral Entropic Gradient based Adaptive Source Seeking for Mobile Robots
This paper presents BEASST (Behavioral Entropic Gradient-based Adaptive Source Seeking for Mobile Robots), a novel framework for robotic source seeking in complex, unknown environments. Our approach enables mobile robots to efficiently balance exploration and exploitation by modeling normalized signal strength as a surrogate probability of source location. Building on Behavioral Entropy(BE) with Prelec's probability weighting function, we define an objective function that adapts robot behavior from risk-averse to risk-seeking based on signal reliability and mission urgency. The framework provides theoretical convergence guarantees under unimodal signal assumptions and practical stability under bounded disturbances. Experimental validation across DARPA SubT and multi-room scenarios demonstrates that BEASST consistently outperforms state-of-the-art methods, achieving 15% reduction in path length and 20% faster source localization through intelligent uncertainty-driven navigation that dynamically transitions between aggressive pursuit and cautious exploration.
ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver
Recent advances in Vision-Language-Action (VLA) models have enabled robotic agents to integrate multimodal understanding with action execution. However, our empirical analysis reveals that current VLAs struggle to allocate visual attention to target regions. Instead, visual attention is always dispersed. To guide the visual attention grounding on the correct target, we propose ReconVLA, a reconstructive VLA model with an implicit grounding paradigm. Conditioned on the model's visual outputs, a diffusion transformer aims to reconstruct the gaze region of the image, which corresponds to the target manipulated objects. This process prompts the VLA model to learn fine-grained representations and accurately allocate visual attention, thus effectively leveraging task-specific visual information and conducting precise manipulation. Moreover, we curate a large-scale pretraining dataset comprising over 100k trajectories and 2 million data samples from open-source robotic datasets, further boosting the model's generalization in visual reconstruction. Extensive experiments in simulation and the real world demonstrate the superiority of our implicit grounding method, showcasing its capabilities of precise manipulation and generalization. Our project page is https://zionchow.github.io/ReconVLA/.
Hybrid Data-Driven Predictive Control for Robust and Reactive Exoskeleton Locomotion Synthesis
Robust bipedal locomotion in exoskeletons requires the ability to dynamically react to changes in the environment in real time. This paper introduces the hybrid data-driven predictive control (HDDPC) framework, an extension of the data-enabled predictive control, that addresses these challenges by simultaneously planning foot contact schedules and continuous domain trajectories. The proposed framework utilizes a Hankel matrix-based representation to model system dynamics, incorporating step-to-step (S2S) transitions to enhance adaptability in dynamic environments. By integrating contact scheduling with trajectory planning, the framework offers an efficient, unified solution for locomotion motion synthesis that enables robust and reactive walking through online replanning. We validate the approach on the Atalante exoskeleton, demonstrating improved robustness and adaptability.
comment: 8 pages; 8 figures
Robot Policy Evaluation for Sim-to-Real Transfer: A Benchmarking Perspective
Current vision-based robotics simulation benchmarks have significantly advanced robotic manipulation research. However, robotics is fundamentally a real-world problem, and evaluation for real-world applications has lagged behind in evaluating generalist policies. In this paper, we discuss challenges and desiderata in designing benchmarks for generalist robotic manipulation policies for the goal of sim-to-real policy transfer. We propose 1) utilizing high visual-fidelity simulation for improved sim-to-real transfer, 2) evaluating policies by systematically increasing task complexity and scenario perturbation to assess robustness, and 3) quantifying performance alignment between real-world performance and its simulation counterparts.
comment: 2025 Robot: Science and Systems (RSS) Workshop on Robot Evaluation for the Real World
Utilizing Vision-Language Models as Action Models for Intent Recognition and Assistance
Human-robot collaboration requires robots to quickly infer user intent, provide transparent reasoning, and assist users in achieving their goals. Our recent work introduced GUIDER, our framework for inferring navigation and manipulation intents. We propose augmenting GUIDER with a vision-language model (VLM) and a text-only language model (LLM) to form a semantic prior that filters objects and locations based on the mission prompt. A vision pipeline (YOLO for object detection and the Segment Anything Model for instance segmentation) feeds candidate object crops into the VLM, which scores their relevance given an operator prompt; in addition, the list of detected object labels is ranked by a text-only LLM. These scores weight the existing navigation and manipulation layers of GUIDER, selecting context-relevant targets while suppressing unrelated objects. Once the combined belief exceeds a threshold, autonomy changes occur, enabling the robot to navigate to the desired area and retrieve the desired object, while adapting to any changes in the operator's intent. Future work will evaluate the system on Isaac Sim using a Franka Emika arm on a Ridgeback base, with a focus on real-time assistance.
comment: Accepted at Human-Centered Robot Autonomy for Human-Robot Teams (HuRoboT) at IEEE RO-MAN 2025, Eindhoven, the Netherlands
GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning ICCV 2025
Recent advances have shown that video generation models can enhance robot learning by deriving effective robot actions through inverse dynamics. However, these methods heavily depend on the quality of generated data and struggle with fine-grained manipulation due to the lack of environment feedback. While video-based reinforcement learning improves policy robustness, it remains constrained by the uncertainty of video generation and the challenges of collecting large-scale robot datasets for training diffusion models. To address these limitations, we propose GenFlowRL, which derives shaped rewards from generated flow trained from diverse cross-embodiment datasets. This enables learning generalizable and robust policies from diverse demonstrations using low-dimensional, object-centric features. Experiments on 10 manipulation tasks, both in simulation and real-world cross-embodiment evaluations, demonstrate that GenFlowRL effectively leverages manipulation features extracted from generated object-centric flow, consistently achieving superior performance across diverse and challenging scenarios. Our Project Page: https://colinyu1.github.io/genflowrl
comment: Published at ICCV 2025
GhostObjects: Instructing Robots by Manipulating Spatially Aligned Virtual Twins in Augmented Reality
Robots are increasingly capable of autonomous operations, yet human interaction remains essential for issuing personalized instructions. Instead of directly controlling robots through Programming by Demonstration (PbD) or teleoperation, we propose giving instructions by interacting with GhostObjects-world-aligned, life-size virtual twins of physical objects-in augmented reality (AR). By direct manipulation of GhostObjects, users can precisely specify physical goals and spatial parameters, with features including real-world lasso selection of multiple objects and snapping back to default positions, enabling tasks beyond simple pick-and-place.
3D FlowMatch Actor: Unified 3D Policy for Single- and Dual-Arm Manipulation
We present 3D FlowMatch Actor (3DFA), a 3D policy architecture for robot manipulation that combines flow matching for trajectory prediction with 3D pretrained visual scene representations for learning from demonstration. 3DFA leverages 3D relative attention between action and visual tokens during action denoising, building on prior work in 3D diffusion-based single-arm policy learning. Through a combination of flow matching and targeted system-level and architectural optimizations, 3DFA achieves over 30x faster training and inference than previous 3D diffusion-based policies, without sacrificing performance. On the bimanual PerAct2 benchmark, it establishes a new state of the art, outperforming the next-best method by an absolute margin of 41.4%. In extensive real-world evaluations, it surpasses strong baselines with up to 1000x more parameters and significantly more pretraining. In unimanual settings, it sets a new state of the art on 74 RLBench tasks by directly predicting dense end-effector trajectories, eliminating the need for motion planning. Comprehensive ablation studies underscore the importance of our design choices for both policy effectiveness and efficiency.
Robust Online Calibration for UWB-Aided Visual-Inertial Navigation with Bias Correction
This paper presents a novel robust online calibration framework for Ultra-Wideband (UWB) anchors in UWB-aided Visual-Inertial Navigation Systems (VINS). Accurate anchor positioning, a process known as calibration, is crucial for integrating UWB ranging measurements into state estimation. While several prior works have demonstrated satisfactory results by using robot-aided systems to autonomously calibrate UWB systems, there are still some limitations: 1) these approaches assume accurate robot localization during the initialization step, ignoring localization errors that can compromise calibration robustness, and 2) the calibration results are highly sensitive to the initial guess of the UWB anchors' positions, reducing the practical applicability of these methods in real-world scenarios. Our approach addresses these challenges by explicitly incorporating the impact of robot localization uncertainties into the calibration process, ensuring robust initialization. To further enhance the robustness of the calibration results against initialization errors, we propose a tightly-coupled Schmidt Kalman Filter (SKF)-based online refinement method, making the system suitable for practical applications. Simulations and real-world experiments validate the improved accuracy and robustness of our approach.
Developing and Validating a High-Throughput Robotic System for the Accelerated Development of Porous Membranes
The development of porous polymeric membranes remains a labor-intensive process, often requiring extensive trial and error to identify optimal fabrication parameters. In this study, we present a fully automated platform for membrane fabrication and characterization via nonsolvent-induced phase separation (NIPS). The system integrates automated solution preparation, blade casting, controlled immersion, and compression testing, allowing precise control over fabrication parameters such as polymer concentration and ambient humidity. The modular design allows parallel processing and reproducible handling of samples, reducing experimental time and increasing consistency. Compression testing is introduced as a sensitive mechanical characterization method for estimating membrane stiffness and as a proxy to infer porosity and intra-sample uniformity through automated analysis of stress-strain curves. As a proof of concept to demonstrate the effectiveness of the system, NIPS was carried out with polysulfone, the green solvent PolarClean, and water as the polymer, solvent, and nonsolvent, respectively. Experiments conducted with the automated system reproduced expected effects of polymer concentration and ambient humidity on membrane properties, namely increased stiffness and uniformity with increasing polymer concentration and humidity variations in pore morphology and mechanical response. The developed automated platform supports high-throughput experimentation and is well-suited for integration into self-driving laboratory workflows, offering a scalable and reproducible foundation for data-driven optimization of porous polymeric membranes through NIPS.
Episodic Memory Verbalization using Hierarchical Representations of Life-Long Robot Experience
Verbalization of robot experience, i.e., summarization of and question answering about a robot's past, is a crucial ability for improving human-robot interaction. Previous works applied rule-based systems or fine-tuned deep models to verbalize short (several-minute-long) streams of episodic data, limiting generalization and transferability. In our work, we apply large pretrained models to tackle this task with zero or few examples, and specifically focus on verbalizing life-long experiences. For this, we derive a tree-like data structure from episodic memory (EM), with lower levels representing raw perception and proprioception data, and higher levels abstracting events to natural language concepts. Given such a hierarchical representation built from the experience stream, we apply a large language model as an agent to interactively search the EM given a user's query, dynamically expanding (initially collapsed) tree nodes to find the relevant information. The approach keeps computational costs low even when scaling to months of robot experience data. We evaluate our method on simulated household robot data, human egocentric videos, and real-world robot recordings, demonstrating its flexibility and scalability.
comment: Humanoids 2025. Code, data and demo videos at https://hierarchical-emv.github.io
GraspClutter6D: A Large-scale Real-world Dataset for Robust Perception and Grasping in Cluttered Scenes
Robust grasping in cluttered environments remains an open challenge in robotics. While benchmark datasets have significantly advanced deep learning methods, they mainly focus on simplistic scenes with light occlusion and insufficient diversity, limiting their applicability to practical scenarios. We present GraspClutter6D, a large-scale real-world grasping dataset featuring: (1) 1,000 highly cluttered scenes with dense arrangements (14.1 objects/scene, 62.6\% occlusion), (2) comprehensive coverage across 200 objects in 75 environment configurations (bins, shelves, and tables) captured using four RGB-D cameras from multiple viewpoints, and (3) rich annotations including 736K 6D object poses and 9.3B feasible robotic grasps for 52K RGB-D images. We benchmark state-of-the-art segmentation, object pose estimation, and grasp detection methods to provide key insights into challenges in cluttered environments. Additionally, we validate the dataset's effectiveness as a training resource, demonstrating that grasping networks trained on GraspClutter6D significantly outperform those trained on existing datasets in both simulation and real-world experiments. The dataset, toolkit, and annotation tools are publicly available on our project website: https://sites.google.com/view/graspclutter6d.
comment: Accepted at IEEE Robotics and Automation Letters (RA-L). Project Websites: https://sites.google.com/view/graspclutter6d
WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
Visual geo-localization for drones faces critical degradation under weather perturbations, \eg, rain and fog, where existing methods struggle with two inherent limitations: 1) Heavy reliance on limited weather categories that constrain generalization, and 2) Suboptimal disentanglement of entangled scene-weather features through pseudo weather categories. We present WeatherPrompt, a multi-modality learning paradigm that establishes weather-invariant representations through fusing the image embedding with the text context. Our framework introduces two key contributions: First, a Training-free Weather Reasoning mechanism that employs off-the-shelf large multi-modality models to synthesize multi-weather textual descriptions through human-like reasoning. It improves the scalability to unseen or complex weather, and could reflect different weather strength. Second, to better disentangle the scene and weather feature, we propose a multi-modality framework with the dynamic gating mechanism driven by the text embedding to adaptively reweight and fuse visual features across modalities. The framework is further optimized by the cross-modal objectives, including image-text contrastive learning and image-text matching, which maps the same scene with different weather conditions closer in the respresentation space. Extensive experiments validate that, under diverse weather conditions, our method achieves competitive recall rates compared to state-of-the-art drone geo-localization methods. Notably, it improves Recall@1 by +13.37\% under night conditions and by 18.69\% under fog and snow conditions.
Robotic Ultrasound-Guided Femoral Artery Reconstruction of Anatomically-Representative Phantoms
Femoral artery access is essential for numerous clinical procedures, including diagnostic angiography, therapeutic catheterization, and emergency interventions. Despite its critical role, successful vascular access remains challenging due to anatomical variability, overlying adipose tissue, and the need for precise ultrasound (US) guidance. Needle placement errors can result in severe complications, thereby limiting the procedure to highly skilled clinicians operating in controlled hospital environments. While robotic systems have shown promise in addressing these challenges through autonomous scanning and vessel reconstruction, clinical translation remains limited due to reliance on simplified phantom models that fail to capture human anatomical complexity. In this work, we present a method for autonomous robotic US scanning of bifurcated femoral arteries, and validate it on five vascular phantoms created from real patient computed tomography (CT) data. Additionally, we introduce a video-based deep learning US segmentation network tailored for vascular imaging, enabling improved 3D arterial reconstruction. The proposed network achieves a Dice score of 89.21% and an Intersection over Union of 80.54% on a new vascular dataset. The reconstructed artery centerline is evaluated against ground truth CT data, showing an average L2 error of 0.91+/-0.70 mm, with an average Hausdorff distance of 4.36+/-1.11mm. This study is the first to validate an autonomous robotic system for US scanning of the femoral artery on a diverse set of patient-specific phantoms, introducing a more advanced framework for evaluating robotic performance in vascular imaging and intervention.
TAR: Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion IROS
Quadrupedal locomotion via Reinforcement Learning (RL) is commonly addressed using the teacher-student paradigm, where a privileged teacher guides a proprioceptive student policy. However, key challenges such as representation misalignment between privileged teacher and proprioceptive-only student, covariate shift due to behavioral cloning, and lack of deployable adaptation; lead to poor generalization in real-world scenarios. We propose Teacher-Aligned Representations via Contrastive Learning (TAR), a framework that leverages privileged information with self-supervised contrastive learning to bridge this gap. By aligning representations to a privileged teacher in simulation via contrastive objectives, our student policy learns structured latent spaces and exhibits robust generalization to Out-of-Distribution (OOD) scenarios, surpassing the fully privileged "Teacher". Results showed accelerated training by 2x compared to state-of-the-art baselines to achieve peak performance. OOD scenarios showed better generalization by 40% on average compared to existing methods. Moreover, TAR transitions seamlessly into learning during deployment without requiring privileged states, setting a new benchmark in sample-efficient, adaptive locomotion and enabling continual fine-tuning in real-world scenarios. Open-source code and videos are available at https://amrmousa.com/TARLoco/.
comment: This work has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving ICCV 2025
We introduce UniOcc, a comprehensive, unified benchmark and toolkit for occupancy forecasting (i.e., predicting future occupancies based on historical information) and occupancy prediction (i.e., predicting current-frame occupancy from camera images. UniOcc unifies the data from multiple real-world datasets (i.e., nuScenes, Waymo) and high-fidelity driving simulators (i.e., CARLA, OpenCOOD), providing 2D/3D occupancy labels and annotating innovative per-voxel flows. Unlike existing studies that rely on suboptimal pseudo labels for evaluation, UniOcc incorporates novel evaluation metrics that do not depend on ground-truth labels, enabling robust assessment on additional aspects of occupancy quality. Through extensive experiments on state-of-the-art models, we demonstrate that large-scale, diverse training data and explicit flow information significantly enhance occupancy prediction and forecasting performance. Our data and code are available at https://uniocc.github.io/.
comment: IEEE/CVF International Conference on Computer Vision (ICCV 2025); Project website: https://uniocc.github.io/
RobustDexGrasp: Robust Dexterous Grasping of General Objects
The ability to robustly grasp a variety of objects is essential for dexterous robots. In this paper, we present a framework for zero-shot dynamic dexterous grasping using single-view visual inputs, designed to be resilient to various disturbances. Our approach utilizes a hand-centric object shape representation based on dynamic distance vectors between finger joints and object surfaces. This representation captures the local shape around potential contact regions rather than focusing on detailed global object geometry, thereby enhancing generalization to shape variations and uncertainties. To address perception limitations, we integrate a privileged teacher policy with a mixed curriculum learning approach, allowing the student policy to effectively distill grasping capabilities and explore for adaptation to disturbances. Trained in simulation, our method achieves success rates of 97.0% across 247,786 simulated objects and 94.6% across 512 real objects, demonstrating remarkable generalization. Quantitative and qualitative results validate the robustness of our policy against various disturbances.
comment: Camera ready for CoRL2025. Project Page: https://zdchan.github.io/Robust_DexGrasp/
Advancing MAPF towards the Real World: A Scalable Multi-Agent Realistic Testbed (SMART)
We present Scalable Multi-Agent Realistic Testbed (SMART), a realistic and efficient software tool for evaluating Multi-Agent Path Finding (MAPF) algorithms. MAPF focuses on planning collision-free paths for a group of agents. While state-ofthe-art MAPF algorithms can plan paths for hundreds of robots in seconds, they often rely on simplified robot models, making their real-world performance unclear. Researchers typically lack access to hundreds of physical robots in laboratory settings to evaluate the algorithms. Meanwhile, industrial professionals who lack expertise in MAPF require an easy-to-use simulator to efficiently test and understand the performance of MAPF algorithms in their specific settings. SMART fills this gap with several advantages: (1) SMART uses physics-engine-based simulators to create realistic simulation environments, accounting for complex real-world factors such as robot kinodynamics and execution uncertainties, (2) SMART uses an execution monitor framework based on the Action Dependency Graph, facilitating seamless integration with various MAPF algorithms and robot models, and (3) SMART scales to thousands of robots. The code is publicly available at https://github.com/smart-mapf/smart.
Towards Embodied Agentic AI: Review and Classification of LLM- and VLM-Driven Robot Autonomy and Interaction
Foundation models, including large language models (LLMs) and vision-language models (VLMs), have recently enabled novel approaches to robot autonomy and human-robot interfaces. In parallel, vision-language-action models (VLAs) or large behavior models (LBMs) are increasing the dexterity and capabilities of robotic systems. This survey paper focuses on those works advancing towards agentic applications and architectures. This includes initial efforts exploring GPT-style interfaces to tooling, as well as more complex system where AI agents are coordinators, planners, perception actors, or generalist interfaces. Such agentic architectures allow robots to reason over natural language instructions, invoke APIs, plan task sequences, or assist in operations and diagnostics. In addition to peer-reviewed research, due to the fast-evolving nature of the field, we highlight and include community-driven projects, ROS packages, and industrial frameworks that show emerging trends. We propose a taxonomy for classifying model integration approaches and present a comparative analysis of the role that agents play in different solutions in today's literature.
LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation
Predictive manipulation has recently gained considerable attention in the Embodied AI community due to its potential to improve robot policy performance by leveraging predicted states. However, generating accurate future visual states of robot-object interactions from world models remains a well-known challenge, particularly in achieving high-quality pixel-level representations. To this end, we propose LaDi-WM, a world model that predicts the latent space of future states using diffusion modeling. Specifically, LaDi-WM leverages the well-established latent space aligned with pre-trained Visual Foundation Models (VFMs), which comprises both geometric features (DINO-based) and semantic features (CLIP-based). We find that predicting the evolution of the latent space is easier to learn and more generalizable than directly predicting pixel-level images. Building on LaDi-WM, we design a diffusion policy that iteratively refines output actions by incorporating forecasted states, thereby generating more consistent and accurate results. Extensive experiments on both synthetic and real-world benchmarks demonstrate that LaDi-WM significantly enhances policy performance by 27.9\% on the LIBERO-LONG benchmark and 20\% on the real-world scenario. Furthermore, our world model and policies achieve impressive generalizability in real-world experiments.
comment: CoRL 2025
Motion Planning Diffusion: Learning and Adapting Robot Motion Planning with Diffusion Models
The performance of optimization-based robot motion planning algorithms is highly dependent on the initial solutions, commonly obtained by running a sampling-based planner to obtain a collision-free path. However, these methods can be slow in high-dimensional and complex scenes and produce non-smooth solutions. Given previously solved path-planning problems, it is highly desirable to learn their distribution and use it as a prior for new similar problems. Several works propose utilizing this prior to bootstrap the motion planning problem, either by sampling initial solutions from it, or using its distribution in a maximum-a-posterior formulation for trajectory optimization. In this work, we introduce Motion Planning Diffusion (MPD), an algorithm that learns trajectory distribution priors with diffusion models. These generative models have shown increasing success in encoding multimodal data and have desirable properties for gradient-based motion planning, such as cost guidance. Given a motion planning problem, we construct a cost function and sample from the posterior distribution using the learned prior combined with the cost function gradients during the denoising process. Instead of learning the prior on all trajectory waypoints, we propose learning a lower-dimensional representation of a trajectory using linear motion primitives, particularly B-spline curves. This parametrization guarantees that the generated trajectory is smooth, can be interpolated at higher frequencies, and needs fewer parameters than a dense waypoint representation. We demonstrate the results of our method ranging from simple 2D to more complex tasks using a 7-dof robot arm manipulator. In addition to learning from simulated data, we also use human demonstrations on a real-world pick-and-place task.
Split Covariance Intersection Filter Based Visual Localization With Accurate AprilTag Map For Warehouse Robot Navigation
Accurate and efficient localization with conveniently-established map is the fundamental requirement for mobile robot operation in warehouse environments. An accurate AprilTag map can be conveniently established with the help of LiDAR-based SLAM. It is true that a LiDAR-based system is usually not commercially competitive in contrast with a vision-based system, yet fortunately for warehouse applications, only a single LiDAR-based SLAM system is needed to establish an accurate AprilTag map, whereas a large amount of visual localization systems can share this established AprilTag map for their own operations. Therefore, the cost of a LiDAR-based SLAM system is actually shared by the large amount of visual localization systems, and turns to be acceptable and even negligible for practical warehouse applications. Once an accurate AprilTag map is available, visual localization is realized as recursive estimation that fuses AprilTag measurements (i.e. AprilTag detection results) and robot motion data. AprilTag measurements may be nonlinear partial measurements; this can be handled by the well-known extended Kalman filter (EKF) in the spirit of local linearization. AprilTag measurements tend to have temporal correlation as well; however, this cannot be reasonably handled by the EKF. The split covariance intersection filter (Split CIF) is adopted to handle temporal correlation among AprilTag measurements. The Split CIF (in the spirit of local linearization) can also handle AprilTag nonlinear partial measurements. The Split CIF based visual localization system incorporates a measurement adaptive mechanism to handle outliers in AprilTag measurements and adopts a dynamic initialization mechanism to address the kidnapping problem. A comparative study in real warehouse environments demonstrates the potential and advantage of the Split CIF based visual localization solution.
Optimizing Force Signals from Human Demonstrations of In-Contact Motions
For non-robot-programming experts, kinesthetic guiding can be an intuitive input method, as robot programming of in-contact tasks is becoming more prominent. However, imprecise and noisy input signals from human demonstrations pose problems when reproducing motions directly or using the signal as input for machine learning methods. This paper explores optimizing force signals to correspond better to the human intention of the demonstrated signal. We compare different signal filtering methods and propose a peak detection method for dealing with first-contact deviations in the signal. The evaluation of these methods considers a specialized error criterion between the input and the human-intended signal. In addition, we analyze the critical parameters' influence on the filtering methods. The quality for an individual motion could be increased by up to \SI{20}{\percent} concerning the error criterion. The proposed contribution can improve the usability of robot programming and the interaction between humans and robots.
comment: This is a preprint of a chapter the following work (and accepted for publication): Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2024. The final authenticated version is will be linked here: http://dx.doi.org/[tba]
Estimation of Payload Inertial Parameters from Human Demonstrations by Hand Guiding
As the availability of cobots increases, it is essential to address the needs of users with little to no programming knowledge to operate such systems efficiently. Programming concepts often use intuitive interaction modalities, such as hand guiding, to address this. When programming in-contact motions, such frameworks require knowledge of the robot tool's payload inertial parameters (PIP) in addition to the demonstrated velocities and forces to ensure effective hybrid motion-force control. This paper aims to enable non-expert users to program in-contact motions more efficiently by eliminating the need for a dedicated PIP calibration, thereby enabling flexible robot tool changes. Since demonstrated tasks generally also contain motions with non-contact, our approach uses these parts to estimate the robot's PIP using established estimation techniques. The results show that the estimation of the payload's mass is accurate, whereas the center of mass and the inertia tensor are affected by noise and a lack of excitation. Overall, these findings show the feasibility of PIP estimation during hand guiding but also highlight the need for sufficient payload accelerations for an accurate estimation.
comment: This is a preprint of a chapter the following work (and accepted for publication): Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2025. The final authenticated version is will be linked here: http://dx.doi.org/[tba]
Traversability analysis with vision and terrain probing for safe legged robot navigation
Inspired by human behavior when traveling over unknown terrain, this study proposes the use of probing strategies and integrates them into a traversability analysis framework to address safe navigation on unknown rough terrain. Our framework integrates collapsibility information into our existing traversability analysis, as vision and geometric information alone could be misled by unpredictable non-rigid terrains such as soft soil, bush area, or water puddles. With the new traversability analysis framework, our robot has a more comprehensive assessment of unpredictable terrain, which is critical for its safety in outdoor environments. The pipeline first identifies the terrain's geometric and semantic properties using an RGB-D camera and desired probing locations on questionable terrains. These regions are probed using a force sensor to determine the risk of terrain collapsing when the robot steps over it. This risk is formulated as a collapsibility metric, which estimates an unpredictable region's ground collapsibility. Thereafter, the collapsibility metric, together with geometric and semantic spatial data, is combined and analyzed to produce global and local traversability grid maps. These traversability grid maps tell the robot whether it is safe to step over different regions of the map. The grid maps are then utilized to generate optimal paths for the robot to safely navigate to its goal. Our approach has been successfully verified on a quadrupedal robot in both simulation and real-world experiments.
Real-time Digital Double Framework to Predict Collapsible Terrains for Legged Robots IROS
Inspired by the digital twinning systems, a novel real-time digital double framework is developed to enhance robot perception of the terrain conditions. Based on the very same physical model and motion control, this work exploits the use of such simulated digital double synchronized with a real robot to capture and extract discrepancy information between the two systems, which provides high dimensional cues in multiple physical quantities to represent differences between the modelled and the real world. Soft, non-rigid terrains cause common failures in legged locomotion, whereby visual perception solely is insufficient in estimating such physical properties of terrains. We used digital double to develop the estimation of the collapsibility, which addressed this issue through physical interactions during dynamic walking. The discrepancy in sensory measurements between the real robot and its digital double are used as input of a learning-based algorithm for terrain collapsibility analysis. Although trained only in simulation, the learned model can perform collapsibility estimation successfully in both simulation and real world. Our evaluation of results showed the generalization to different scenarios and the advantages of the digital double to reliably detect nuances in ground conditions.
comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Preprint version. Accepted June 2022
Tactile Aware Dynamic Obstacle Avoidance in Crowded Environment with Deep Reinforcement Learning
Mobile robots operating in crowded environments require the ability to navigate among humans and surrounding obstacles efficiently while adhering to safety standards and socially compliant mannerisms. This scale of the robot navigation problem may be classified as both a local path planning and trajectory optimization problem. This work presents an array of force sensors that act as a tactile layer to complement the use of a LiDAR for the purpose of inducing awareness of contact with any surrounding objects within immediate vicinity of a mobile robot undetected by LiDARs. By incorporating the tactile layer, the robot can take more risks in its movements and possibly go right up to an obstacle or wall, and gently squeeze past it. In addition, we built up a simulation platform via Pybullet which integrates Robot Operating System (ROS) and reinforcement learning (RL) together. A touch-aware neural network model was trained on it to create an RL-based local path planner for dynamic obstacle avoidance. Our proposed method was demonstrated successfully on an omni-directional mobile robot who was able to navigate in a crowded environment with high agility and versatility in movement, while not being overly sensitive to nearby obstacles-not-in-contact.
Visual SLAMMOT Considering Multiple Motion Models
Simultaneous Localization and Mapping (SLAM) and Multi-Object Tracking (MOT) are pivotal tasks in the realm of autonomous driving, attracting considerable research attention. While SLAM endeavors to generate real-time maps and determine the vehicle's pose in unfamiliar settings, MOT focuses on the real-time identification and tracking of multiple dynamic objects. Despite their importance, the prevalent approach treats SLAM and MOT as independent modules within an autonomous vehicle system, leading to inherent limitations. Classical SLAM methodologies often rely on a static environment assumption, suitable for indoor rather than dynamic outdoor scenarios. Conversely, conventional MOT techniques typically rely on the vehicle's known state, constraining the accuracy of object state estimations based on this prior. To address these challenges, previous efforts introduced the unified SLAMMOT paradigm, yet primarily focused on simplistic motion patterns. In our team's previous work IMM-SLAMMOT\cite{IMM-SLAMMOT}, we present a novel methodology incorporating consideration of multiple motion models into SLAMMOT i.e. tightly coupled SLAM and MOT, demonstrating its efficacy in LiDAR-based systems. This paper studies feasibility and advantages of instantiating this methodology as visual SLAMMOT, bridging the gap between LiDAR and vision-based sensing mechanisms. Specifically, we propose a solution of visual SLAMMOT considering multiple motion models and validate the inherent advantages of IMM-SLAMMOT in the visual domain.
RINO: Accurate, Robust Radar-Inertial Odometry with Non-Iterative Estimation
Odometry in adverse weather conditions, such as fog, rain, and snow, presents significant challenges, as traditional vision and LiDAR-based methods often suffer from degraded performance. Radar-Inertial Odometry (RIO) has emerged as a promising solution due to its resilience in such environments. In this paper, we present RINO, a non-iterative RIO framework implemented in an adaptively loosely coupled manner. Building upon ORORA as the baseline for radar odometry, RINO introduces several key advancements, including improvements in keypoint extraction, motion distortion compensation, and pose estimation via an adaptive voting mechanism. This voting strategy facilitates efficient polynomial-time optimization while simultaneously quantifying the uncertainty in the radar module's pose estimation. The estimated uncertainty is subsequently integrated into the maximum a posteriori (MAP) estimation within a Kalman filter framework. Unlike prior loosely coupled odometry systems, RINO not only retains the global and robust registration capabilities of the radar component but also dynamically accounts for the real-time operational state of each sensor during fusion. Experimental results conducted on publicly available datasets demonstrate that RINO reduces translation and rotation errors by 1.06% and 0.09{\deg}/100m, respectively, when compared to the baseline method, thus significantly enhancing its accuracy. Furthermore, RINO achieves performance comparable to state-of-the-art methods.
VPOcc: Exploiting Vanishing Point for 3D Semantic Occupancy Prediction
Understanding 3D scenes semantically and spatially is crucial for the safe navigation of robots and autonomous vehicles, aiding obstacle avoidance and accurate trajectory planning. Camera-based 3D semantic occupancy prediction, which infers complete voxel grids from 2D images, is gaining importance in robot vision for its resource efficiency compared to 3D sensors. However, this task inherently suffers from a 2D-3D discrepancy, where objects of the same size in 3D space appear at different scales in a 2D image depending on their distance from the camera due to perspective projection. To tackle this issue, we propose a novel framework called VPOcc that leverages a vanishing point (VP) to mitigate the 2D-3D discrepancy at both the pixel and feature levels. As a pixel-level solution, we introduce a VPZoomer module, which warps images by counteracting the perspective effect using a VP-based homography transformation. In addition, as a feature-level solution, we propose a VP-guided cross-attention (VPCA) module that performs perspective-aware feature aggregation, utilizing 2D image features that are more suitable for 3D space. Lastly, we integrate two feature volumes extracted from the original and warped images to compensate for each other through a spatial volume fusion (SVF) module. By effectively incorporating VP into the network, our framework achieves improvements in both IoU and mIoU metrics on SemanticKITTI and SSCBench-KITTI360 datasets. Additional details are available at https://vision3d-lab.github.io/vpocc/.
Safe Multi-Robotic Arm Interaction via 3D Convex Shapes
Inter-robot collisions pose a significant safety risk when multiple robotic arms operate in close proximity. We present an online collision avoidance methodology leveraging 3D convex shape-based High-Order Control Barrier Functions (HOCBFs) to address this issue. While prior works focused on using Control Barrier Functions (CBFs) for human-robotic arm and single-arm collision avoidance, we explore the problem of collision avoidance between multiple robotic arms operating in a shared space. In our methodology, we utilize the proposed HOCBFs as centralized and decentralized safety filters. These safety filters are compatible with many nominal controllers and ensure safety without significantly restricting the robots' workspace. A key challenge in implementing these filters is the computational overhead caused by the large number of safety constraints and the computation of a Hessian matrix per constraint. We address this challenge by employing numerical differentiation methods to approximate computationally intensive terms. The effectiveness of our method is demonstrated through extensive simulation studies and real-world experiments with Franka Research 3 robotic arms. The project video is available at this link.
Multiagent Systems
Agentic Design Review System
Evaluating graphic designs involves assessing it from multiple facets like alignment, composition, aesthetics and color choices. Evaluating designs in a holistic way involves aggregating feedback from individual expert reviewers. Towards this, we propose an Agentic Design Review System (AgenticDRS), where multiple agents collaboratively analyze a design, orchestrated by a meta-agent. A novel in-context exemplar selection approach based on graph matching and a unique prompt expansion method plays central role towards making each agent design aware. Towards evaluating this framework, we propose DRS-BENCH benchmark. Thorough experimental evaluation against state-of-the-art baselines adapted to the problem setup, backed-up with critical ablation experiments brings out the efficacy of Agentic-DRS in evaluating graphic designs and generating actionable feedback. We hope that this work will attract attention to this pragmatic, yet under-explored research direction.
A Unified Multi-Agent Framework for Universal Multimodal Understanding and Generation
Real-world multimodal applications often require any-to-any capabilities, enabling both understanding and generation across modalities including text, image, audio, and video. However, integrating the strengths of autoregressive language models (LLMs) for reasoning and diffusion models for high-fidelity generation remains challenging. Existing approaches rely on rigid pipelines or tightly coupled architectures, limiting flexibility and scalability. We propose MAGUS (Multi-Agent Guided Unified Multimodal System), a modular framework that unifies multimodal understanding and generation via two decoupled phases: Cognition and Deliberation. MAGUS enables symbolic multi-agent collaboration within a shared textual workspace. In the Cognition phase, three role-conditioned multimodal LLM agents - Perceiver, Planner, and Reflector - engage in collaborative dialogue to perform structured understanding and planning. The Deliberation phase incorporates a Growth-Aware Search mechanism that orchestrates LLM-based reasoning and diffusion-based generation in a mutually reinforcing manner. MAGUS supports plug-and-play extensibility, scalable any-to-any modality conversion, and semantic alignment - all without the need for joint training. Experiments across multiple benchmarks, including image, video, and audio generation, as well as cross-modal instruction following, demonstrate that MAGUS outperforms strong baselines and state-of-the-art systems. Notably, on the MME benchmark, MAGUS surpasses the powerful closed-source model GPT-4o.
comment: 8 pages, 5 figures
Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study
Large Language Models (LLMs) hold promise in automating data analysis tasks, yet open-source models face significant limitations in these kinds of reasoning-intensive scenarios. In this work, we investigate strategies to enhance the data analysis capabilities of open-source LLMs. By curating a seed dataset of diverse, realistic scenarios, we evaluate model behavior across three core dimensions: data understanding, code generation, and strategic planning. Our analysis reveals three key findings: (1) Strategic planning quality serves as the primary determinant of model performance; (2) Interaction design and task complexity significantly influence reasoning capabilities; (3) Data quality demonstrates a greater impact than diversity in achieving optimal performance. We leverage these insights to develop a data synthesis methodology, demonstrating significant improvements in open-source LLMs' analytical reasoning capabilities. Code is available at https://github.com/zjunlp/DataMind.
comment: Work in progress
UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving ICCV 2025
We introduce UniOcc, a comprehensive, unified benchmark and toolkit for occupancy forecasting (i.e., predicting future occupancies based on historical information) and occupancy prediction (i.e., predicting current-frame occupancy from camera images. UniOcc unifies the data from multiple real-world datasets (i.e., nuScenes, Waymo) and high-fidelity driving simulators (i.e., CARLA, OpenCOOD), providing 2D/3D occupancy labels and annotating innovative per-voxel flows. Unlike existing studies that rely on suboptimal pseudo labels for evaluation, UniOcc incorporates novel evaluation metrics that do not depend on ground-truth labels, enabling robust assessment on additional aspects of occupancy quality. Through extensive experiments on state-of-the-art models, we demonstrate that large-scale, diverse training data and explicit flow information significantly enhance occupancy prediction and forecasting performance. Our data and code are available at https://uniocc.github.io/.
comment: IEEE/CVF International Conference on Computer Vision (ICCV 2025); Project website: https://uniocc.github.io/
A New Query Expansion Approach via Agent-Mediated Dialogic Inquiry KDD 2025
Query expansion is widely used in Information Retrieval (IR) to improve search outcomes by supplementing initial queries with richer information. While recent Large Language Model (LLM) based methods generate pseudo-relevant content and expanded terms via multiple prompts, they often yield homogeneous, narrow expansions that lack the diverse context needed to retrieve relevant information. In this paper, we propose AMD: a new Agent-Mediated Dialogic Framework that engages in a dialogic inquiry involving three specialized roles: (1) a Socratic Questioning Agent reformulates the initial query into three sub-questions, with each question inspired by a specific Socratic questioning dimension, including clarification, assumption probing, and implication probing, (2) a Dialogic Answering Agent generates pseudo-answers, enriching the query representation with multiple perspectives aligned to the user's intent, and (3) a Reflective Feedback Agent evaluates and refines these pseudo-answers, ensuring that only the most relevant and informative content is retained. By leveraging a multi-agent process, AMD effectively crafts richer query representations through inquiry and feedback refinement. Extensive experiments on benchmarks including BEIR and TREC demonstrate that our framework outperforms previous methods, offering a robust solution for retrieval tasks.
comment: Accepted by ACM SIGKDD 2025 Workshop on AI Agent for Information Retrieval (Agent4IR)
Systems and Control (CS)
Fuel Consumption in Platoons: A Literature Review
Platooning has emerged as a promising strategy for improving fuel efficiency in automated vehicle systems, with significant implications for reducing emissions and operational costs. While existing literature on vehicle platooning primarily focuses on individual aspects such as aerodynamic drag reduction or specific control strategies, this work takes a more comprehensive approach by bringing together a wide range of factors and components that contribute to fuel savings in platoons. In this literature review, we examine the impact of platooning on fuel consumption, highlighting the key components of platoon systems, the factors and actors influencing fuel savings, methods for estimating fuel use, and the effect of platoon instability on efficiency. Furthermore, we study the role of reduced aerodynamic drag, vehicle coordination, and the challenges posed by instability in real-world conditions. By compiling insights from recent studies, this work provides a comprehensive overview of the latest advancements in platooning technologies and highlights both the challenges and opportunities for future research to maximize fuel savings in real-world scenarios.
CVIRO: A Consistent and Tightly-Coupled Visual-Inertial-Ranging Odometry on Lie Groups
Ultra Wideband (UWB) is widely used to mitigate drift in visual-inertial odometry (VIO) systems. Consistency is crucial for ensuring the estimation accuracy of a UWBaided VIO system. An inconsistent estimator can degrade localization performance, where the inconsistency primarily arises from two main factors: (1) the estimator fails to preserve the correct system observability, and (2) UWB anchor positions are assumed to be known, leading to improper neglect of calibration uncertainty. In this paper, we propose a consistent and tightly-coupled visual-inertial-ranging odometry (CVIRO) system based on the Lie group. Our method incorporates the UWB anchor state into the system state, explicitly accounting for UWB calibration uncertainty and enabling the joint and consistent estimation of both robot and anchor states. Furthermore, observability consistency is ensured by leveraging the invariant error properties of the Lie group. We analytically prove that the CVIRO algorithm naturally maintains the system's correct unobservable subspace, thereby preserving estimation consistency. Extensive simulations and experiments demonstrate that CVIRO achieves superior localization accuracy and consistency compared to existing methods.
Integrating Terrestrial and Non-Terrestrial Networks for Sustainable 6G Operations: A Latency-Aware Multi-Tier Cell-Switching Approach
Sustainability is paramount in modern cellular networks, which face significant energy consumption challenges from rising mobile traffic and advancements in wireless technology. Cell-switching, well-established in literature as an effective solution, encounters limitations such as inadequate capacity and limited coverage when implemented through terrestrial networks (TN). This study enhances cell-switching by integrating non-terrestrial networks (NTN), including satellites (used for cell-switching for the first time), high altitude platform stations (HAPS), and uncrewed aerial vehicles (UAVs) into TN. This integration significantly boosts energy savings by expanding capacity, enhancing coverage, and increasing operational flexibility. We introduce a multi-tier cell-switching approach that dynamically offloads users across network layers to manage energy effectively and minimize delays, accommodating diverse user demands with a context aware strategy. Additionally, we explore the role of artificial intelligence (AI), particularly generative AI, in optimizing network efficiency through data compression, handover optimization between different network layers, and enhancing device compatibility, further improving the adaptability and energy efficiency of cell-switching operations. A case study confirms substantial improvements in network power consumption and user satisfaction, demonstrating the potential of our approach for future networks.
comment: 9 pages, 6 figures
Learning Task Execution Hierarchies for Redundant Robots
Modern robotic systems, such as mobile manipulators, humanoids, and aerial robots with arms, often possess high redundancy, enabling them to perform multiple tasks simultaneously. Managing this redundancy is key to achieving reliable and flexible behavior. A widely used approach is the Stack of Tasks (SoT), which organizes control objectives by priority within a unified framework. However, traditional SoTs are manually designed by experts, limiting their adaptability and accessibility. This paper introduces a novel framework that automatically learns both the hierarchy and parameters of a SoT from user-defined objectives. By combining Reinforcement Learning and Genetic Programming, the system discovers task priorities and control strategies without manual intervention. A cost function based on intuitive metrics such as precision, safety, and execution time guides the learning process. We validate our method through simulations and experiments on the mobile-YuMi platform, a dual-arm mobile manipulator with high redundancy. Results show that the learned SoTs enable the robot to dynamically adapt to changing environments and inputs, balancing competing objectives while maintaining robust task execution. This approach provides a general and user-friendly solution for redundancy management in complex robots, advancing human-centered robot programming and reducing the need for expert design.
Traffic Intersection Simulation Using Turning Movement Count Data in SUMO: A Case Study of Toronto Intersections
Urban traffic simulation is vital in planning, modeling, and analyzing road networks. However, the realism of a simulation depends extensively on the quality of input data. This paper presents an intersection traffic simulation tool that leverages real-world vehicle turning movement count (TMC) data from the City of Toronto to model traffic in an urban environment at an individual or multiple intersections using Simulation of Urban MObility (SUMO). The simulation performed in this research focuses specifically on intersection-level traffic generation without creating full vehicle routes through the network. This also helps keep the network's complexity to a minimum. The simulated traffic is evaluated against actual data to show that the simulation closely reproduces real intersection flows. This validates that the real data can drive practical simulations, and these scenarios can replace synthetic or random generated data, which is prominently used in developing new traffic-related methodologies. This is the first tool to integrate TMC data from Toronto into SUMO via an easy-to-use Graphical User Interface. This work contributes to the research and traffic planning community on data-driven traffic simulation. It provides transportation engineers with a framework to evaluate intersection design and traffic signal optimization strategies using readily available aggregate traffic data.
comment: Published in 2025 21st DCOSS-IoT, Code is available at Github: https://github.com/ANTS-OntarioTechU/CrossFlow
Multi-Functional Polarization-Based Coverage Control through Static Passive EMSs
An innovative multi-functional static-passive electromagnetic skin (SP-EMS) solution is proposed to simultaneously support, in reflection, two independent wave-manipulation functionalities with a single meta-atoms arrangement on the EMS aperture when illuminated by two EM sources operating at the same frequency, but working in different polarization states. Towards this end, a simple reference meta-atom is designed first to enable an accurate and independent control of each polarization component of the local reflection tensor. Successively, the macro-scale synthesis of multi-polarization (MP) SP-EMSs (MP-SP-EMSs) is carried out by solving a global optimization problem where a cost function, which mathematically codes separate requirements for each polarization, is minimized with a customized version of the system-by-design (SbD) technique. Representative results from a set of numerical and experimental tests are reported to assess the feasibility of a multi-function EMS based on polarization diversity as well as the effectiveness and the robustness of the proposed method for the synthesis of MP-SP-EMSs.
Two-Instrument Screening under Soft Budget Constraints
We study soft budget constraints in multi-tier public finance when an upper-tier government uses two instruments: an ex-ante grant schedule and an ex-post rescue. Under convex rescue costs and standard primitives, the three-stage leader-follower problem collapses to one dimensional screening with a single allocation index: the cap on realized rescue. A hazard-based characterization delivers a unified rule that nests (i) no rescue, (ii) a threshold-cap with commitment, and (iii) a threshold--linear--cap without commitment. The knife-edge for eliminating bailouts compares the marginal cost at the origin to the supremum of a virtual weight, and the comparative statics show how greater curvature tightens caps while discretion shifts transfers toward front loading by lowering the effective grant weight. The framework provides a portable benchmark for mechanism design and yields testable implications for policy and empirical work on intergovernmental finance.
comment: arXiv admin note: text overlap with arXiv:2508.02171
Probabilistic Forecasting Method for Offshore Wind Farm Cluster under Typhoon Conditions: a Score-Based Conditional Diffusion Model
Offshore wind power (OWP) exhibits significant fluctuations under typhoon conditions, posing substantial challenges to the secure operation of power systems. Accurate forecasting of OWP is therefore essential. However, the inherent scarcity of historical typhoon data and stochasticity of OWP render traditional point forecasting methods particularly difficult and inadequate. To address this challenge and provide grid operators with the comprehensive information necessary for decision-making, this study proposes a score-based conditional diffusion model (SCDM) for probabilistic forecasting of OWP during typhoon events. First, a knowledge graph algorithm is employed to embed historical typhoon paths as vectors. Then, a deterministic network is constructed to predict the wind power under typhoon conditions based on these vector embeddings. Finally, to better characterize prediction errors, a denoising network is developed. At the core of this approach is a mean-reverting stochastic differential equation (SDE), which transforms complex error distributions into a standard Gaussian, enabling the sampling of forecasting errors using a reverse-time SDE. The probabilistic forecasting results are reconstructed by combining deterministic forecasts with sampled errors. The proposed method is evaluated using real-world data from a cluster of 9 offshore wind farms. Results demonstrate that under typhoon conditions, our approach outperforms baseline models for both deterministic and probabilistic metrics, verifying the effectiveness of the approach.
A Robust Optimization Approach for Demand Response Participation of Fixed-Frequency Air Conditioners
With the continuous increase in the penetration of renewable energy in the emerging power systems, the pressure on system peak regulation has been significantly intensified. Against this backdrop, demand side resources particularly air conditioning loads have garnered considerable attention for their substantial regulation potential and fast response capabilities, making them promising candidates for providing auxiliary peak shaving services. This study focuses on fixed frequency air conditioners (FFACs) and proposes an optimization model and solution method for their participation in demand response (DR) programs. First, a probabilistic response model for FFACs is developed based on the Markov assumption. Second, by sampling this probabilistic model, the aggregate power consumption of an FFAC cluster under decentralized control is obtained. Subsequently, a robust optimization model is formulated to maximize the profit of an aggregator managing the FFAC cluster during DR events, taking into account the aggregated response power. The model explicitly considers temperature uncertainty to ensure user comfort in a robust sense. Finally, leveraging the structure of the proposed model, it is reformulated as a mixed-integer linear programming (MILP) problem and solved using a commercial optimization solver. Simulation results validate the effectiveness of the proposed model and solution approach.
Synthesis of Deep Neural Networks with Safe Robust Adaptive Control for Reliable Operation of Wheeled Mobile Robots
Deep neural networks (DNNs) can enable precise control while maintaining low computational costs by circumventing the need for dynamic modeling. However, the deployment of such black-box approaches remains challenging for heavy-duty wheeled mobile robots (WMRs), which are subject to strict international standards and prone to faults and disturbances. We designed a hierarchical control policy for heavy-duty WMRs, monitored by two safety layers with differing levels of authority. To this end, a DNN policy was trained and deployed as the primary control strategy, providing high-precision performance under nominal operating conditions. When external disturbances arise and reach a level of intensity such that the system performance falls below a predefined threshold, a low-level safety layer intervenes by deactivating the primary control policy and activating a model-free robust adaptive control (RAC) policy. This transition enables the system to continue operating while ensuring stability by effectively managing the inherent trade-off between system robustness and responsiveness. Regardless of the control policy in use, a high-level safety layer continuously monitors system performance during operation. It initiates a shutdown only when disturbances become sufficiently severe such that compensation is no longer viable and continued operation would jeopardize the system or its environment. The proposed synthesis of DNN and RAC policy guarantees uniform exponential stability of the entire WMR system while adhering to safety standards to some extent. The effectiveness of the proposed approach was further validated through real-time experiments using a 6,000 kg WMR.
Variance Reduced Policy Gradient Method for Multi-Objective Reinforcement Learning
Multi-Objective Reinforcement Learning (MORL) is a generalization of traditional Reinforcement Learning (RL) that aims to optimize multiple, often conflicting objectives simultaneously rather than focusing on a single reward. This approach is crucial in complex decision-making scenarios where agents must balance trade-offs between various goals, such as maximizing performance while minimizing costs. We consider the problem of MORL where the objectives are combined using a non-linear scalarization function. Just like in standard RL, policy gradient methods (PGMs) are amongst the most effective for handling large and continuous state-action spaces in MORL. However, existing PGMs for MORL suffer from high sample inefficiency, requiring large amounts of data to be effective. Previous attempts to solve this problem rely on overly strict assumptions, losing PGMs' benefits in scalability to large state-action spaces. In this work, we address the issue of sample efficiency by implementing variance-reduction techniques to reduce the sample complexity of policy gradients while maintaining general assumptions.
comment: 7 pages, 4 figures
Feedback stabilization of a nanoparticle at the intensity minimum of an optical double-well potential
In this work, we develop and analyze adaptive feedback control strategies to stabilize and confine a nanoparticle at the unstable intensity minimum of an optical double-well potential. The resulting stochastic optimal control problem for a noise-driven mechanical particle in a nonlinear optical potential must account for unavoidable experimental imperfections such as measurement nonlinearities and slow drifts of the optical setup. To address these issues, we simplify the model in the vicinity of the unstable equilibrium and employ indirect adaptive control techniques to dynamically follow changes in the potential landscape. Our approach leads to a simple and efficient Linear Quadratic Gaussian (LQG) controller that can be implemented on fast and cost-effective FPGAs, ensuring accessibility and reproducibility. We demonstrate that this strategy successfully tracks the intensity minimum and significantly reduces the nanoparticle's residual state variance, effectively lowering its center-of-mass temperature. While conventional optical traps rely on confining optical forces in the light field at the intensity maxima, trapping at intensity minima mitigates absorption heating, which is crucial for advanced quantum experiments. Since LQG control naturally extends into the quantum regime, our results provide a promising pathway for future experiments on quantum state preparation beyond the current absorption heating limitation, like matter-wave interference and tests of the quantum-gravity interface.
Virtual Sensing for Solder Layer Degradation and Temperature Monitoring in IGBT Modules
Monitoring the degradation state of Insulated Gate Bipolar Transistor (IGBT) modules is essential for ensuring the reliability and longevity of power electronic systems, especially in safety-critical and high-performance applications. However, direct measurement of key degradation indicators - such as junction temperature, solder fatigue or delamination - remains challenging due to the physical inaccessibility of internal components and the harsh environment. In this context, machine learning-based virtual sensing offers a promising alternative by bridging the gap from feasible sensor placement to the relevant but inaccessible locations. This paper explores the feasibility of estimating the degradation state of solder layers, and the corresponding full temperature maps based on a limited number of physical sensors. Based on synthetic data of a specific degradation mode, we obtain a high accuracy in the estimation of the degraded solder area (1.17% mean absolute error), and are able to reproduce the surface temperature of the IGBT with a maximum relative error of 4.56% (corresponding to an average relative error of 0.37%).
comment: Andrea Urgolo and Monika Stipsitz contributed equally to this work
A Structured Framework for Prioritizing Unsafe Control Actions in STPA: Case Study on eVTOL Operations
Systems Theoretic Process Analysis (STPA) is a widely recommended method for analysing complex system safety. STPA can identify numerous Unsafe Control Actions (UCAs) and requirements depending on the level of granularity of the analysis and the complexity of the system being analysed. Managing numerous results is challenging, especially during a fast-paced development lifecycle. Extensive research has been done to optimize the efficiency of managing and prioritising the STPA results. However, maintaining the objectivity of prioritisation and communicating the prioritised results have become common challenges. In this paper, the authors present a complementary approach that incorporates inputs from both the safety analysts and domain experts to more objectively prioritise UCAs. This is done by evaluating the severity of each UCA, the impact factor of each controller or decision maker that issues the UCA, and the ranking provided by the subject matter experts who assess the UCA criticalities based on different factors. In addition, a Monte Carlo simulation is introduced to reduce subjectivity and relativity, thus enabling more objective prioritisation of the UCAs. As part of the approach to better communicate the prioritisation results and plan the next steps of system development, a dynamic-scaling prioritisation matrix was developed to capture different sets of prioritised UCAs. The approach was applied to a real project to improve the safe operations of Electric Vertical Take-off and Landing (eVTOL). The results highlighted critical UCAs that need to be prioritised for safer eVTOL operation. 318 UCAs were identified in total. Based on the application of the prioritisation methodology, 110 were recognized as high-priority UCAs to strengthen the system design.
MASH: Cooperative-Heterogeneous Multi-Agent Reinforcement Learning for Single Humanoid Robot Locomotion
This paper proposes a novel method to enhance locomotion for a single humanoid robot through cooperative-heterogeneous multi-agent deep reinforcement learning (MARL). While most existing methods typically employ single-agent reinforcement learning algorithms for a single humanoid robot or MARL algorithms for multi-robot system tasks, we propose a distinct paradigm: applying cooperative-heterogeneous MARL to optimize locomotion for a single humanoid robot. The proposed method, multi-agent reinforcement learning for single humanoid locomotion (MASH), treats each limb (legs and arms) as an independent agent that explores the robot's action space while sharing a global critic for cooperative learning. Experiments demonstrate that MASH accelerates training convergence and improves whole-body cooperation ability, outperforming conventional single-agent reinforcement learning methods. This work advances the integration of MARL into single-humanoid-robot control, offering new insights into efficient locomotion strategies.
Quantifying the Value of Seismic Structural Health Monitoring for post-earthquake recovery of electric power system in terms of resilience enhancement
Post-earthquake recovery of electric power networks (EPNs) is critical to community resilience. Traditional recovery processes often rely on prolonged and imprecise manual inspections for damage diagnosis, leading to suboptimal repair prioritization and extended service disruptions. Seismic Structural Health Monitoring (SSHM) offers the potential to expedite recovery by enabling more accurate and timely damage assessment. However, SSHM deployment incurs costs, and its system-level resilience benefit remains underexplored. This study proposes a probabilistic simulation framework to quantify the value of SSHM for enhancing EPN resilience. The framework includes seismic damage modeling based on network configuration, hazard intensity, fragility functions, and damage-functionality mappings, combined with recovery simulations incorporating resource constraints, repair and transfer durations. System functionality is evaluated using graph-based island detection and optimal power flow analysis. Resilience is quantified via the Lack of Resilience (LoR) metric derived from the functionality restoration curve. SSHM is incorporated by altering the quality of damage information used in repair scheduling. Different monitoring scenarios (e.g., no-SSHM baseline, partial SSHM, full SSHM with various accuracies) are modeled using confusion matrices to simulate damage misclassification. Results show that improved damage awareness via SSHM significantly accelerates recovery and reduces LoR by up to 21%. This work supports evidence-based decisions for SSHM deployment in critical infrastructure.
comment: 21 pages. 14 figures
Managing Risks from Large Digital Loads Using Coordinated Grid-Forming Storage Network
Anticipated rapid growth of large digital load, driven by artificial intelligence (AI) data centers, is poised to increase uncertainty and large fluctuations in consumption, threatening the stability, reliability, and security of the energy infrastructure. Conventional measures taken by grid planners and operators to ensure stable and reliable integration of new resources are either cost-prohibitive (e.g., transmission upgrades) or ill-equipped (e.g., generation control) to resolve the unique challenges brought on by AI Data Centers (e.g., extreme load transients). In this work, we explore the feasibility of coordinating and managing available flexibility in the grid, in terms of grid-forming storage units, to ensure stable and reliable integration of AI Data Centers without the need for costly grid upgrades. Recently developed bi-layered coordinated control strategies -- involving fast-acting, local, autonomous, control at the storage to maintain transient safety in voltage and frequency at the point-of-interconnection, and a slower, coordinated (consensus) control to restore normal operating condition in the grid -- are used in the case studies. A comparison is drawn between broadly two scenarios: a network of coordinated, smaller, distributed storage vs. larger storage installations collocated with large digital loads. IEEE 68-bus network is used for the case studies, with large digital load profiles drawn from the MIT Supercloud Dataset.
comment: Submitted to IEEE PES T&D Conference and Expo 2026
A Neural Column-and-Constraint Generation Method for Solving Two-Stage Stochastic Unit Commitment
Two-stage stochastic unit commitment (2S-SUC) problems have been widely adopted to manage the uncertainties introduced by high penetrations of intermittent renewable energy resources. While decomposition-based algorithms such as column-and-constraint generation has been proposed to solve these problems, they remain computationally prohibitive for large-scale, real-time applications. In this paper, we introduce a Neural Column-and-Constraint Generation (Neural CCG) method to significantly accelerate the solution of 2S-SUC problems. The proposed approach integrates a neural network that approximates the second-stage recourse problem by learning from high-level features of operational scenarios and the first-stage commitment decisions. This neural estimator is embedded within the CCG framework, replacing repeated subproblem solving with rapid neural evaluations. We validate the effectiveness of the proposed method on the IEEE 118-bus system. Compared to the original CCG and a state-of-the-art commercial solver, Neural CCG achieves up to 130.1$\times$ speedup while maintaining a mean optimality gap below 0.096\%, demonstrating its strong potential for scalable stochastic optimization in power system.
Risk-Based Prognostics and Health Management
It is often the case that risk assessment and prognostics are viewed as related but separate tasks. This chapter describes a risk-based approach to prognostics that seeks to provide a tighter coupling between risk assessment and fault prediction. We show how this can be achieved using the continuous-time Bayesian network as the underlying modeling framework. Furthermore, we provide an overview of the techniques that are available to derive these models from data and show how they might be used in practice to achieve tasks like decision support and performance-based logistics. This work is intended to provide an overview of the recent developments related to risk-based prognostics, and we hope that it will serve as a tutorial of sorts that will assist others in adopting these techniques.
comment: Appears as Chapter 27 in Realizing Complex Integrated Systems, Anthony P. Ambler and John W. Sheppard (ads.), CRC Press, 2025
Overview of Complex System Design
This chapter serves as an introduction to systems engineering focused on the broad issues surrounding realizing complex integrated systems. What is a system? We pose a number of possible definitions and perspectives, but leave open the opportunity to consider the system from the target context where it will be used. Once we have a system in mind, we acknowledge the fact that this system needs to integrate a variety of pieces, components, subsystems, in order for it to accomplish its task. Therefore, we concern ourselves at the boundaries and interfaces of different technologies and disciplines to determine how best to achieve that integration. Next we raise the specter that this integrated system is complex. Complexity can be defined in a number of ways. For one, the sheer number of subsystems or components can be a measure of complexity. We could also consider the functions being performed by the system and how those functions interact with one another. Further, we could consider computational aspects such as the time or memory that may be needed to accomplish one or more tasks. The extent to which new behaviors might emerge from the system can also be regarded as an element of complexity. In the end, complexity is that characteristic of a system that defines the associated challenges along the life of the system, so we are concerned with how to manage that complexity. Finally, realization refers to the process by which our complex integrated system moves from concept to deployment and subsequent support. It refers to the entire design, development, manufacture, deployment, operation, and support life cycle. Of particular note here, however, is that we focus on systems that, by their very nature, are complex. In other words, we are interested in large, complicated, interacting beasts that are intended to perform difficult tasks and meet a wide variety of end-user needs.
comment: Appears as Chapter 1 in Realizing Complex Integrated Systems, Anthony P. Ambler and John W. Sheppard (ads.), CRC Press, 2025
Zono-Conformal Prediction: Zonotope-Based Uncertainty Quantification for Regression and Classification Tasks
Conformal prediction is a popular uncertainty quantification method that augments a base predictor with prediction sets with statistically valid coverage guarantees. However, current methods are often computationally expensive and data-intensive, as they require constructing an uncertainty model before calibration. Moreover, existing approaches typically represent the prediction sets with intervals, which limits their ability to capture dependencies in multi-dimensional outputs. We address these limitations by introducing zono-conformal prediction, a novel approach inspired by interval predictor models and reachset-conformant identification that constructs prediction zonotopes with assured coverage. By placing zonotopic uncertainty sets directly into the model of the base predictor, zono-conformal predictors can be identified via a single, data-efficient linear program. While we can apply zono-conformal prediction to arbitrary nonlinear base predictors, we focus on feed-forward neural networks in this work. Aside from regression tasks, we also construct optimal zono-conformal predictors in classification settings where the output of an uncertain predictor is a set of possible classes. We provide probabilistic coverage guarantees and present methods for detecting outliers in the identification data. In extensive numerical experiments, we show that zono-conformal predictors are less conservative than interval predictor models and standard conformal prediction methods, while achieving a similar coverage over the test data.
comment: Preprint. Under review
Robust Online Calibration for UWB-Aided Visual-Inertial Navigation with Bias Correction
This paper presents a novel robust online calibration framework for Ultra-Wideband (UWB) anchors in UWB-aided Visual-Inertial Navigation Systems (VINS). Accurate anchor positioning, a process known as calibration, is crucial for integrating UWB ranging measurements into state estimation. While several prior works have demonstrated satisfactory results by using robot-aided systems to autonomously calibrate UWB systems, there are still some limitations: 1) these approaches assume accurate robot localization during the initialization step, ignoring localization errors that can compromise calibration robustness, and 2) the calibration results are highly sensitive to the initial guess of the UWB anchors' positions, reducing the practical applicability of these methods in real-world scenarios. Our approach addresses these challenges by explicitly incorporating the impact of robot localization uncertainties into the calibration process, ensuring robust initialization. To further enhance the robustness of the calibration results against initialization errors, we propose a tightly-coupled Schmidt Kalman Filter (SKF)-based online refinement method, making the system suitable for practical applications. Simulations and real-world experiments validate the improved accuracy and robustness of our approach.
Finite Sample Performance Analysis of MIMO Systems Identification
This paper is concerned with the finite sample identification performance of an n dimensional discrete-time Multiple-Input Multiple-Output (MIMO) Linear Time-Invariant system, with p inputs and m outputs. We prove that the widely-used Ho-Kalman algorithm and Multivariable Output Error State Space (MOESP) algorithm are ill-conditioned for MIMO systems when n/m or n/p is large. Moreover, by analyzing the Cra\'mer-Rao bound, we derive a fundamental limit for identifying the real and stable (or marginally stable) poles of MIMO system and prove that the sample complexity for any unbiased pole estimation algorithm to reach a certain level of accuracy explodes superpolynomially with respect to n/(pm). Numerical results are provided to illustrate the ill-conditionedness of Ho-Kalman algorithm and MOESP algorithm as well as the fundamental limit on identification.
comment: 14 pages, 6 figures
TAR: Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion IROS
Quadrupedal locomotion via Reinforcement Learning (RL) is commonly addressed using the teacher-student paradigm, where a privileged teacher guides a proprioceptive student policy. However, key challenges such as representation misalignment between privileged teacher and proprioceptive-only student, covariate shift due to behavioral cloning, and lack of deployable adaptation; lead to poor generalization in real-world scenarios. We propose Teacher-Aligned Representations via Contrastive Learning (TAR), a framework that leverages privileged information with self-supervised contrastive learning to bridge this gap. By aligning representations to a privileged teacher in simulation via contrastive objectives, our student policy learns structured latent spaces and exhibits robust generalization to Out-of-Distribution (OOD) scenarios, surpassing the fully privileged "Teacher". Results showed accelerated training by 2x compared to state-of-the-art baselines to achieve peak performance. OOD scenarios showed better generalization by 40% on average compared to existing methods. Moreover, TAR transitions seamlessly into learning during deployment without requiring privileged states, setting a new benchmark in sample-efficient, adaptive locomotion and enabling continual fine-tuning in real-world scenarios. Open-source code and videos are available at https://amrmousa.com/TARLoco/.
comment: This work has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Enhancing Fault Detection and Isolation in an All-Electric Auxiliary Power Unit (APU) Gas Generator by Utilizing Starter/Generator Signal
This study proposes a novel paradigm for enhancing fault detection and isolation (FDI) of gas generators in all-electric auxiliary power unit (APU) by utilizing shaft power information from the starter/generator. First, we conduct a pioneering investigation into the challenges and opportunities for FDI brought about by APU electrification. Our analysis reveals that the electrification of APU opens up new possibilities for utilizing shaft power estimates from starter/generator to improve gas generator FDI. We then provide comprehensive theoretical and analytical evidence demonstrating why, how, and to what extent, the shaft power information from the starter/generator can fundamentally enhance the estimation accuracy of system states and health parameters of the gas generator, while also identifying the key factors influencing these improvements in FDI performance. The effectiveness of the proposed paradigm and its theoretical foundations are validated through extensive Monte Carlo simulations. Furthermore, through comprehensive comparative analysis with state-of-the-art gas generator fault diagnosis methods, our experimental results not only demonstrate the superior performance of the proposed approach but also validate that the diagnostic capabilities of existing advanced FDI techniques can be substantially enhanced by incorporating shaft power information. And the observed performance improvement patterns strongly align with our theoretical analysis, verifying both the effectiveness and guiding significance of our theoretical framework. These research findings provide a unique perspective in answering three fundamental questions: why joint fault diagnosis of the starter/generator and gas generator is essential, how it can be implemented, and what factors determine its effectiveness, thereby opening up promising new avenues for FDI technologies in all-electric APU systems.
On Event-Triggered Resilient Consensus Using Auxiliary Layer
Due to its design simplicity, auxiliary layer-based resilient control is widely discussed in the literature to mitigate the effects of False Data Injection (FDI) attacks. However, the increased communication burden due to additional communication links for connecting an extra layer is often overlooked in the literature. This paper bridges this gap by considering an event-triggered approach for inter-layer communication between the physical layer (containing actual agents) and the auxiliary layer (containing virtual agents) for the resilient state consensus in a multi-agent system. We provide state-based and dynamic event-triggering mechanisms, the former being the motivation for the latter. The exclusion of Zeno behavior is established by proving positive minimum inter-event time (MIET). Extensive simulation and experimental results are provided to illustrate the proposed methodology.
Unified Performance Control for Non-Square Nonlinear Systems with Relaxed Controllability
In this paper, we investigate the problem of unified prescribed performance tracking for a class of non-square strict-feedback nonlinear systems under relaxed controllability conditions. By using a skillful matrix decomposition and introducing some feasible auxiliary matrices, a more generalized controllability condition than the current state of the art is constructed, which can be applied to both square and non-square nonlinear systems subject to actuator faults and unknown yet time-varying control gain. Incorporating the relaxed controllability conditions and the uniform performance specifications into the backstepping design procedure, a prescribed performance fault-tolerant controller is developed that can achieve different performance demands without modifying the controller structure, which is more flexible and practical.In addition, the destruction of the system stability by unknown controllability auxiliary matrices and unknown nonlinearities is circumvented by embedding the available core information of the state-dependent uncertainties into the design procedure. Both theoretical analysis and numerical simulation demonstrate the effectiveness and benefits of the proposed method.
comment: 9 pages,13 figures, submitted to journal
Transferable Parasitic Estimation via Graph Contrastive Learning and Label Rebalancing in AMS Circuits
Graph representation learning on Analog-Mixed Signal (AMS) circuits is crucial for various downstream tasks, e.g., parasitic estimation. However, the scarcity of design data, the unbalanced distribution of labels, and the inherent diversity of circuit implementations pose significant challenges to learning robust and transferable circuit representations. To address these limitations, we propose CircuitGCL, a novel graph contrastive learning framework that integrates representation scattering and label rebalancing to enhance transferability across heterogeneous circuit graphs. CircuitGCL employs a self-supervised strategy to learn topology-invariant node embeddings through hyperspherical representation scattering, eliminating dependency on large-scale data. Simultaneously, balanced mean squared error (BMSE) and balanced softmax cross-entropy (BSCE) losses are introduced to mitigate label distribution disparities between circuits, enabling robust and transferable parasitic estimation. Evaluated on parasitic capacitance estimation (edge-level task) and ground capacitance classification (node-level task) across TSMC 28nm AMS designs, CircuitGCL outperforms all state-of-the-art (SOTA) methods, with the $R^2$ improvement of $33.64\% \sim 44.20\%$ for edge regression and F1-score gain of $0.9\times \sim 2.1\times$ for node classification. Our code is available at https://github.com/ShenShan123/CircuitGCL.
comment: Final version accepted by the International Conference on Computer-Aided Design (ICCAD) 2025
Safe Multi-Robotic Arm Interaction via 3D Convex Shapes
Inter-robot collisions pose a significant safety risk when multiple robotic arms operate in close proximity. We present an online collision avoidance methodology leveraging 3D convex shape-based High-Order Control Barrier Functions (HOCBFs) to address this issue. While prior works focused on using Control Barrier Functions (CBFs) for human-robotic arm and single-arm collision avoidance, we explore the problem of collision avoidance between multiple robotic arms operating in a shared space. In our methodology, we utilize the proposed HOCBFs as centralized and decentralized safety filters. These safety filters are compatible with many nominal controllers and ensure safety without significantly restricting the robots' workspace. A key challenge in implementing these filters is the computational overhead caused by the large number of safety constraints and the computation of a Hessian matrix per constraint. We address this challenge by employing numerical differentiation methods to approximate computationally intensive terms. The effectiveness of our method is demonstrated through extensive simulation studies and real-world experiments with Franka Research 3 robotic arms. The project video is available at this link.
Systems and Control (EESS)
Fuel Consumption in Platoons: A Literature Review
Platooning has emerged as a promising strategy for improving fuel efficiency in automated vehicle systems, with significant implications for reducing emissions and operational costs. While existing literature on vehicle platooning primarily focuses on individual aspects such as aerodynamic drag reduction or specific control strategies, this work takes a more comprehensive approach by bringing together a wide range of factors and components that contribute to fuel savings in platoons. In this literature review, we examine the impact of platooning on fuel consumption, highlighting the key components of platoon systems, the factors and actors influencing fuel savings, methods for estimating fuel use, and the effect of platoon instability on efficiency. Furthermore, we study the role of reduced aerodynamic drag, vehicle coordination, and the challenges posed by instability in real-world conditions. By compiling insights from recent studies, this work provides a comprehensive overview of the latest advancements in platooning technologies and highlights both the challenges and opportunities for future research to maximize fuel savings in real-world scenarios.
CVIRO: A Consistent and Tightly-Coupled Visual-Inertial-Ranging Odometry on Lie Groups
Ultra Wideband (UWB) is widely used to mitigate drift in visual-inertial odometry (VIO) systems. Consistency is crucial for ensuring the estimation accuracy of a UWBaided VIO system. An inconsistent estimator can degrade localization performance, where the inconsistency primarily arises from two main factors: (1) the estimator fails to preserve the correct system observability, and (2) UWB anchor positions are assumed to be known, leading to improper neglect of calibration uncertainty. In this paper, we propose a consistent and tightly-coupled visual-inertial-ranging odometry (CVIRO) system based on the Lie group. Our method incorporates the UWB anchor state into the system state, explicitly accounting for UWB calibration uncertainty and enabling the joint and consistent estimation of both robot and anchor states. Furthermore, observability consistency is ensured by leveraging the invariant error properties of the Lie group. We analytically prove that the CVIRO algorithm naturally maintains the system's correct unobservable subspace, thereby preserving estimation consistency. Extensive simulations and experiments demonstrate that CVIRO achieves superior localization accuracy and consistency compared to existing methods.
Integrating Terrestrial and Non-Terrestrial Networks for Sustainable 6G Operations: A Latency-Aware Multi-Tier Cell-Switching Approach
Sustainability is paramount in modern cellular networks, which face significant energy consumption challenges from rising mobile traffic and advancements in wireless technology. Cell-switching, well-established in literature as an effective solution, encounters limitations such as inadequate capacity and limited coverage when implemented through terrestrial networks (TN). This study enhances cell-switching by integrating non-terrestrial networks (NTN), including satellites (used for cell-switching for the first time), high altitude platform stations (HAPS), and uncrewed aerial vehicles (UAVs) into TN. This integration significantly boosts energy savings by expanding capacity, enhancing coverage, and increasing operational flexibility. We introduce a multi-tier cell-switching approach that dynamically offloads users across network layers to manage energy effectively and minimize delays, accommodating diverse user demands with a context aware strategy. Additionally, we explore the role of artificial intelligence (AI), particularly generative AI, in optimizing network efficiency through data compression, handover optimization between different network layers, and enhancing device compatibility, further improving the adaptability and energy efficiency of cell-switching operations. A case study confirms substantial improvements in network power consumption and user satisfaction, demonstrating the potential of our approach for future networks.
comment: 9 pages, 6 figures
Learning Task Execution Hierarchies for Redundant Robots
Modern robotic systems, such as mobile manipulators, humanoids, and aerial robots with arms, often possess high redundancy, enabling them to perform multiple tasks simultaneously. Managing this redundancy is key to achieving reliable and flexible behavior. A widely used approach is the Stack of Tasks (SoT), which organizes control objectives by priority within a unified framework. However, traditional SoTs are manually designed by experts, limiting their adaptability and accessibility. This paper introduces a novel framework that automatically learns both the hierarchy and parameters of a SoT from user-defined objectives. By combining Reinforcement Learning and Genetic Programming, the system discovers task priorities and control strategies without manual intervention. A cost function based on intuitive metrics such as precision, safety, and execution time guides the learning process. We validate our method through simulations and experiments on the mobile-YuMi platform, a dual-arm mobile manipulator with high redundancy. Results show that the learned SoTs enable the robot to dynamically adapt to changing environments and inputs, balancing competing objectives while maintaining robust task execution. This approach provides a general and user-friendly solution for redundancy management in complex robots, advancing human-centered robot programming and reducing the need for expert design.
Traffic Intersection Simulation Using Turning Movement Count Data in SUMO: A Case Study of Toronto Intersections
Urban traffic simulation is vital in planning, modeling, and analyzing road networks. However, the realism of a simulation depends extensively on the quality of input data. This paper presents an intersection traffic simulation tool that leverages real-world vehicle turning movement count (TMC) data from the City of Toronto to model traffic in an urban environment at an individual or multiple intersections using Simulation of Urban MObility (SUMO). The simulation performed in this research focuses specifically on intersection-level traffic generation without creating full vehicle routes through the network. This also helps keep the network's complexity to a minimum. The simulated traffic is evaluated against actual data to show that the simulation closely reproduces real intersection flows. This validates that the real data can drive practical simulations, and these scenarios can replace synthetic or random generated data, which is prominently used in developing new traffic-related methodologies. This is the first tool to integrate TMC data from Toronto into SUMO via an easy-to-use Graphical User Interface. This work contributes to the research and traffic planning community on data-driven traffic simulation. It provides transportation engineers with a framework to evaluate intersection design and traffic signal optimization strategies using readily available aggregate traffic data.
comment: Published in 2025 21st DCOSS-IoT, Code is available at Github: https://github.com/ANTS-OntarioTechU/CrossFlow
Multi-Functional Polarization-Based Coverage Control through Static Passive EMSs
An innovative multi-functional static-passive electromagnetic skin (SP-EMS) solution is proposed to simultaneously support, in reflection, two independent wave-manipulation functionalities with a single meta-atoms arrangement on the EMS aperture when illuminated by two EM sources operating at the same frequency, but working in different polarization states. Towards this end, a simple reference meta-atom is designed first to enable an accurate and independent control of each polarization component of the local reflection tensor. Successively, the macro-scale synthesis of multi-polarization (MP) SP-EMSs (MP-SP-EMSs) is carried out by solving a global optimization problem where a cost function, which mathematically codes separate requirements for each polarization, is minimized with a customized version of the system-by-design (SbD) technique. Representative results from a set of numerical and experimental tests are reported to assess the feasibility of a multi-function EMS based on polarization diversity as well as the effectiveness and the robustness of the proposed method for the synthesis of MP-SP-EMSs.
Two-Instrument Screening under Soft Budget Constraints
We study soft budget constraints in multi-tier public finance when an upper-tier government uses two instruments: an ex-ante grant schedule and an ex-post rescue. Under convex rescue costs and standard primitives, the three-stage leader-follower problem collapses to one dimensional screening with a single allocation index: the cap on realized rescue. A hazard-based characterization delivers a unified rule that nests (i) no rescue, (ii) a threshold-cap with commitment, and (iii) a threshold--linear--cap without commitment. The knife-edge for eliminating bailouts compares the marginal cost at the origin to the supremum of a virtual weight, and the comparative statics show how greater curvature tightens caps while discretion shifts transfers toward front loading by lowering the effective grant weight. The framework provides a portable benchmark for mechanism design and yields testable implications for policy and empirical work on intergovernmental finance.
comment: arXiv admin note: text overlap with arXiv:2508.02171
Probabilistic Forecasting Method for Offshore Wind Farm Cluster under Typhoon Conditions: a Score-Based Conditional Diffusion Model
Offshore wind power (OWP) exhibits significant fluctuations under typhoon conditions, posing substantial challenges to the secure operation of power systems. Accurate forecasting of OWP is therefore essential. However, the inherent scarcity of historical typhoon data and stochasticity of OWP render traditional point forecasting methods particularly difficult and inadequate. To address this challenge and provide grid operators with the comprehensive information necessary for decision-making, this study proposes a score-based conditional diffusion model (SCDM) for probabilistic forecasting of OWP during typhoon events. First, a knowledge graph algorithm is employed to embed historical typhoon paths as vectors. Then, a deterministic network is constructed to predict the wind power under typhoon conditions based on these vector embeddings. Finally, to better characterize prediction errors, a denoising network is developed. At the core of this approach is a mean-reverting stochastic differential equation (SDE), which transforms complex error distributions into a standard Gaussian, enabling the sampling of forecasting errors using a reverse-time SDE. The probabilistic forecasting results are reconstructed by combining deterministic forecasts with sampled errors. The proposed method is evaluated using real-world data from a cluster of 9 offshore wind farms. Results demonstrate that under typhoon conditions, our approach outperforms baseline models for both deterministic and probabilistic metrics, verifying the effectiveness of the approach.
A Robust Optimization Approach for Demand Response Participation of Fixed-Frequency Air Conditioners
With the continuous increase in the penetration of renewable energy in the emerging power systems, the pressure on system peak regulation has been significantly intensified. Against this backdrop, demand side resources particularly air conditioning loads have garnered considerable attention for their substantial regulation potential and fast response capabilities, making them promising candidates for providing auxiliary peak shaving services. This study focuses on fixed frequency air conditioners (FFACs) and proposes an optimization model and solution method for their participation in demand response (DR) programs. First, a probabilistic response model for FFACs is developed based on the Markov assumption. Second, by sampling this probabilistic model, the aggregate power consumption of an FFAC cluster under decentralized control is obtained. Subsequently, a robust optimization model is formulated to maximize the profit of an aggregator managing the FFAC cluster during DR events, taking into account the aggregated response power. The model explicitly considers temperature uncertainty to ensure user comfort in a robust sense. Finally, leveraging the structure of the proposed model, it is reformulated as a mixed-integer linear programming (MILP) problem and solved using a commercial optimization solver. Simulation results validate the effectiveness of the proposed model and solution approach.
Synthesis of Deep Neural Networks with Safe Robust Adaptive Control for Reliable Operation of Wheeled Mobile Robots
Deep neural networks (DNNs) can enable precise control while maintaining low computational costs by circumventing the need for dynamic modeling. However, the deployment of such black-box approaches remains challenging for heavy-duty wheeled mobile robots (WMRs), which are subject to strict international standards and prone to faults and disturbances. We designed a hierarchical control policy for heavy-duty WMRs, monitored by two safety layers with differing levels of authority. To this end, a DNN policy was trained and deployed as the primary control strategy, providing high-precision performance under nominal operating conditions. When external disturbances arise and reach a level of intensity such that the system performance falls below a predefined threshold, a low-level safety layer intervenes by deactivating the primary control policy and activating a model-free robust adaptive control (RAC) policy. This transition enables the system to continue operating while ensuring stability by effectively managing the inherent trade-off between system robustness and responsiveness. Regardless of the control policy in use, a high-level safety layer continuously monitors system performance during operation. It initiates a shutdown only when disturbances become sufficiently severe such that compensation is no longer viable and continued operation would jeopardize the system or its environment. The proposed synthesis of DNN and RAC policy guarantees uniform exponential stability of the entire WMR system while adhering to safety standards to some extent. The effectiveness of the proposed approach was further validated through real-time experiments using a 6,000 kg WMR.
Variance Reduced Policy Gradient Method for Multi-Objective Reinforcement Learning
Multi-Objective Reinforcement Learning (MORL) is a generalization of traditional Reinforcement Learning (RL) that aims to optimize multiple, often conflicting objectives simultaneously rather than focusing on a single reward. This approach is crucial in complex decision-making scenarios where agents must balance trade-offs between various goals, such as maximizing performance while minimizing costs. We consider the problem of MORL where the objectives are combined using a non-linear scalarization function. Just like in standard RL, policy gradient methods (PGMs) are amongst the most effective for handling large and continuous state-action spaces in MORL. However, existing PGMs for MORL suffer from high sample inefficiency, requiring large amounts of data to be effective. Previous attempts to solve this problem rely on overly strict assumptions, losing PGMs' benefits in scalability to large state-action spaces. In this work, we address the issue of sample efficiency by implementing variance-reduction techniques to reduce the sample complexity of policy gradients while maintaining general assumptions.
comment: 7 pages, 4 figures
Feedback stabilization of a nanoparticle at the intensity minimum of an optical double-well potential
In this work, we develop and analyze adaptive feedback control strategies to stabilize and confine a nanoparticle at the unstable intensity minimum of an optical double-well potential. The resulting stochastic optimal control problem for a noise-driven mechanical particle in a nonlinear optical potential must account for unavoidable experimental imperfections such as measurement nonlinearities and slow drifts of the optical setup. To address these issues, we simplify the model in the vicinity of the unstable equilibrium and employ indirect adaptive control techniques to dynamically follow changes in the potential landscape. Our approach leads to a simple and efficient Linear Quadratic Gaussian (LQG) controller that can be implemented on fast and cost-effective FPGAs, ensuring accessibility and reproducibility. We demonstrate that this strategy successfully tracks the intensity minimum and significantly reduces the nanoparticle's residual state variance, effectively lowering its center-of-mass temperature. While conventional optical traps rely on confining optical forces in the light field at the intensity maxima, trapping at intensity minima mitigates absorption heating, which is crucial for advanced quantum experiments. Since LQG control naturally extends into the quantum regime, our results provide a promising pathway for future experiments on quantum state preparation beyond the current absorption heating limitation, like matter-wave interference and tests of the quantum-gravity interface.
Virtual Sensing for Solder Layer Degradation and Temperature Monitoring in IGBT Modules
Monitoring the degradation state of Insulated Gate Bipolar Transistor (IGBT) modules is essential for ensuring the reliability and longevity of power electronic systems, especially in safety-critical and high-performance applications. However, direct measurement of key degradation indicators - such as junction temperature, solder fatigue or delamination - remains challenging due to the physical inaccessibility of internal components and the harsh environment. In this context, machine learning-based virtual sensing offers a promising alternative by bridging the gap from feasible sensor placement to the relevant but inaccessible locations. This paper explores the feasibility of estimating the degradation state of solder layers, and the corresponding full temperature maps based on a limited number of physical sensors. Based on synthetic data of a specific degradation mode, we obtain a high accuracy in the estimation of the degraded solder area (1.17% mean absolute error), and are able to reproduce the surface temperature of the IGBT with a maximum relative error of 4.56% (corresponding to an average relative error of 0.37%).
comment: Andrea Urgolo and Monika Stipsitz contributed equally to this work
A Structured Framework for Prioritizing Unsafe Control Actions in STPA: Case Study on eVTOL Operations
Systems Theoretic Process Analysis (STPA) is a widely recommended method for analysing complex system safety. STPA can identify numerous Unsafe Control Actions (UCAs) and requirements depending on the level of granularity of the analysis and the complexity of the system being analysed. Managing numerous results is challenging, especially during a fast-paced development lifecycle. Extensive research has been done to optimize the efficiency of managing and prioritising the STPA results. However, maintaining the objectivity of prioritisation and communicating the prioritised results have become common challenges. In this paper, the authors present a complementary approach that incorporates inputs from both the safety analysts and domain experts to more objectively prioritise UCAs. This is done by evaluating the severity of each UCA, the impact factor of each controller or decision maker that issues the UCA, and the ranking provided by the subject matter experts who assess the UCA criticalities based on different factors. In addition, a Monte Carlo simulation is introduced to reduce subjectivity and relativity, thus enabling more objective prioritisation of the UCAs. As part of the approach to better communicate the prioritisation results and plan the next steps of system development, a dynamic-scaling prioritisation matrix was developed to capture different sets of prioritised UCAs. The approach was applied to a real project to improve the safe operations of Electric Vertical Take-off and Landing (eVTOL). The results highlighted critical UCAs that need to be prioritised for safer eVTOL operation. 318 UCAs were identified in total. Based on the application of the prioritisation methodology, 110 were recognized as high-priority UCAs to strengthen the system design.
MASH: Cooperative-Heterogeneous Multi-Agent Reinforcement Learning for Single Humanoid Robot Locomotion
This paper proposes a novel method to enhance locomotion for a single humanoid robot through cooperative-heterogeneous multi-agent deep reinforcement learning (MARL). While most existing methods typically employ single-agent reinforcement learning algorithms for a single humanoid robot or MARL algorithms for multi-robot system tasks, we propose a distinct paradigm: applying cooperative-heterogeneous MARL to optimize locomotion for a single humanoid robot. The proposed method, multi-agent reinforcement learning for single humanoid locomotion (MASH), treats each limb (legs and arms) as an independent agent that explores the robot's action space while sharing a global critic for cooperative learning. Experiments demonstrate that MASH accelerates training convergence and improves whole-body cooperation ability, outperforming conventional single-agent reinforcement learning methods. This work advances the integration of MARL into single-humanoid-robot control, offering new insights into efficient locomotion strategies.
Quantifying the Value of Seismic Structural Health Monitoring for post-earthquake recovery of electric power system in terms of resilience enhancement
Post-earthquake recovery of electric power networks (EPNs) is critical to community resilience. Traditional recovery processes often rely on prolonged and imprecise manual inspections for damage diagnosis, leading to suboptimal repair prioritization and extended service disruptions. Seismic Structural Health Monitoring (SSHM) offers the potential to expedite recovery by enabling more accurate and timely damage assessment. However, SSHM deployment incurs costs, and its system-level resilience benefit remains underexplored. This study proposes a probabilistic simulation framework to quantify the value of SSHM for enhancing EPN resilience. The framework includes seismic damage modeling based on network configuration, hazard intensity, fragility functions, and damage-functionality mappings, combined with recovery simulations incorporating resource constraints, repair and transfer durations. System functionality is evaluated using graph-based island detection and optimal power flow analysis. Resilience is quantified via the Lack of Resilience (LoR) metric derived from the functionality restoration curve. SSHM is incorporated by altering the quality of damage information used in repair scheduling. Different monitoring scenarios (e.g., no-SSHM baseline, partial SSHM, full SSHM with various accuracies) are modeled using confusion matrices to simulate damage misclassification. Results show that improved damage awareness via SSHM significantly accelerates recovery and reduces LoR by up to 21%. This work supports evidence-based decisions for SSHM deployment in critical infrastructure.
comment: 21 pages. 14 figures
Managing Risks from Large Digital Loads Using Coordinated Grid-Forming Storage Network
Anticipated rapid growth of large digital load, driven by artificial intelligence (AI) data centers, is poised to increase uncertainty and large fluctuations in consumption, threatening the stability, reliability, and security of the energy infrastructure. Conventional measures taken by grid planners and operators to ensure stable and reliable integration of new resources are either cost-prohibitive (e.g., transmission upgrades) or ill-equipped (e.g., generation control) to resolve the unique challenges brought on by AI Data Centers (e.g., extreme load transients). In this work, we explore the feasibility of coordinating and managing available flexibility in the grid, in terms of grid-forming storage units, to ensure stable and reliable integration of AI Data Centers without the need for costly grid upgrades. Recently developed bi-layered coordinated control strategies -- involving fast-acting, local, autonomous, control at the storage to maintain transient safety in voltage and frequency at the point-of-interconnection, and a slower, coordinated (consensus) control to restore normal operating condition in the grid -- are used in the case studies. A comparison is drawn between broadly two scenarios: a network of coordinated, smaller, distributed storage vs. larger storage installations collocated with large digital loads. IEEE 68-bus network is used for the case studies, with large digital load profiles drawn from the MIT Supercloud Dataset.
comment: Submitted to IEEE PES T&D Conference and Expo 2026
A Neural Column-and-Constraint Generation Method for Solving Two-Stage Stochastic Unit Commitment
Two-stage stochastic unit commitment (2S-SUC) problems have been widely adopted to manage the uncertainties introduced by high penetrations of intermittent renewable energy resources. While decomposition-based algorithms such as column-and-constraint generation has been proposed to solve these problems, they remain computationally prohibitive for large-scale, real-time applications. In this paper, we introduce a Neural Column-and-Constraint Generation (Neural CCG) method to significantly accelerate the solution of 2S-SUC problems. The proposed approach integrates a neural network that approximates the second-stage recourse problem by learning from high-level features of operational scenarios and the first-stage commitment decisions. This neural estimator is embedded within the CCG framework, replacing repeated subproblem solving with rapid neural evaluations. We validate the effectiveness of the proposed method on the IEEE 118-bus system. Compared to the original CCG and a state-of-the-art commercial solver, Neural CCG achieves up to 130.1$\times$ speedup while maintaining a mean optimality gap below 0.096\%, demonstrating its strong potential for scalable stochastic optimization in power system.
Risk-Based Prognostics and Health Management
It is often the case that risk assessment and prognostics are viewed as related but separate tasks. This chapter describes a risk-based approach to prognostics that seeks to provide a tighter coupling between risk assessment and fault prediction. We show how this can be achieved using the continuous-time Bayesian network as the underlying modeling framework. Furthermore, we provide an overview of the techniques that are available to derive these models from data and show how they might be used in practice to achieve tasks like decision support and performance-based logistics. This work is intended to provide an overview of the recent developments related to risk-based prognostics, and we hope that it will serve as a tutorial of sorts that will assist others in adopting these techniques.
comment: Appears as Chapter 27 in Realizing Complex Integrated Systems, Anthony P. Ambler and John W. Sheppard (ads.), CRC Press, 2025
Overview of Complex System Design
This chapter serves as an introduction to systems engineering focused on the broad issues surrounding realizing complex integrated systems. What is a system? We pose a number of possible definitions and perspectives, but leave open the opportunity to consider the system from the target context where it will be used. Once we have a system in mind, we acknowledge the fact that this system needs to integrate a variety of pieces, components, subsystems, in order for it to accomplish its task. Therefore, we concern ourselves at the boundaries and interfaces of different technologies and disciplines to determine how best to achieve that integration. Next we raise the specter that this integrated system is complex. Complexity can be defined in a number of ways. For one, the sheer number of subsystems or components can be a measure of complexity. We could also consider the functions being performed by the system and how those functions interact with one another. Further, we could consider computational aspects such as the time or memory that may be needed to accomplish one or more tasks. The extent to which new behaviors might emerge from the system can also be regarded as an element of complexity. In the end, complexity is that characteristic of a system that defines the associated challenges along the life of the system, so we are concerned with how to manage that complexity. Finally, realization refers to the process by which our complex integrated system moves from concept to deployment and subsequent support. It refers to the entire design, development, manufacture, deployment, operation, and support life cycle. Of particular note here, however, is that we focus on systems that, by their very nature, are complex. In other words, we are interested in large, complicated, interacting beasts that are intended to perform difficult tasks and meet a wide variety of end-user needs.
comment: Appears as Chapter 1 in Realizing Complex Integrated Systems, Anthony P. Ambler and John W. Sheppard (ads.), CRC Press, 2025
Zono-Conformal Prediction: Zonotope-Based Uncertainty Quantification for Regression and Classification Tasks
Conformal prediction is a popular uncertainty quantification method that augments a base predictor with prediction sets with statistically valid coverage guarantees. However, current methods are often computationally expensive and data-intensive, as they require constructing an uncertainty model before calibration. Moreover, existing approaches typically represent the prediction sets with intervals, which limits their ability to capture dependencies in multi-dimensional outputs. We address these limitations by introducing zono-conformal prediction, a novel approach inspired by interval predictor models and reachset-conformant identification that constructs prediction zonotopes with assured coverage. By placing zonotopic uncertainty sets directly into the model of the base predictor, zono-conformal predictors can be identified via a single, data-efficient linear program. While we can apply zono-conformal prediction to arbitrary nonlinear base predictors, we focus on feed-forward neural networks in this work. Aside from regression tasks, we also construct optimal zono-conformal predictors in classification settings where the output of an uncertain predictor is a set of possible classes. We provide probabilistic coverage guarantees and present methods for detecting outliers in the identification data. In extensive numerical experiments, we show that zono-conformal predictors are less conservative than interval predictor models and standard conformal prediction methods, while achieving a similar coverage over the test data.
comment: Preprint. Under review
Robust Online Calibration for UWB-Aided Visual-Inertial Navigation with Bias Correction
This paper presents a novel robust online calibration framework for Ultra-Wideband (UWB) anchors in UWB-aided Visual-Inertial Navigation Systems (VINS). Accurate anchor positioning, a process known as calibration, is crucial for integrating UWB ranging measurements into state estimation. While several prior works have demonstrated satisfactory results by using robot-aided systems to autonomously calibrate UWB systems, there are still some limitations: 1) these approaches assume accurate robot localization during the initialization step, ignoring localization errors that can compromise calibration robustness, and 2) the calibration results are highly sensitive to the initial guess of the UWB anchors' positions, reducing the practical applicability of these methods in real-world scenarios. Our approach addresses these challenges by explicitly incorporating the impact of robot localization uncertainties into the calibration process, ensuring robust initialization. To further enhance the robustness of the calibration results against initialization errors, we propose a tightly-coupled Schmidt Kalman Filter (SKF)-based online refinement method, making the system suitable for practical applications. Simulations and real-world experiments validate the improved accuracy and robustness of our approach.
Finite Sample Performance Analysis of MIMO Systems Identification
This paper is concerned with the finite sample identification performance of an n dimensional discrete-time Multiple-Input Multiple-Output (MIMO) Linear Time-Invariant system, with p inputs and m outputs. We prove that the widely-used Ho-Kalman algorithm and Multivariable Output Error State Space (MOESP) algorithm are ill-conditioned for MIMO systems when n/m or n/p is large. Moreover, by analyzing the Cra\'mer-Rao bound, we derive a fundamental limit for identifying the real and stable (or marginally stable) poles of MIMO system and prove that the sample complexity for any unbiased pole estimation algorithm to reach a certain level of accuracy explodes superpolynomially with respect to n/(pm). Numerical results are provided to illustrate the ill-conditionedness of Ho-Kalman algorithm and MOESP algorithm as well as the fundamental limit on identification.
comment: 14 pages, 6 figures
TAR: Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion IROS
Quadrupedal locomotion via Reinforcement Learning (RL) is commonly addressed using the teacher-student paradigm, where a privileged teacher guides a proprioceptive student policy. However, key challenges such as representation misalignment between privileged teacher and proprioceptive-only student, covariate shift due to behavioral cloning, and lack of deployable adaptation; lead to poor generalization in real-world scenarios. We propose Teacher-Aligned Representations via Contrastive Learning (TAR), a framework that leverages privileged information with self-supervised contrastive learning to bridge this gap. By aligning representations to a privileged teacher in simulation via contrastive objectives, our student policy learns structured latent spaces and exhibits robust generalization to Out-of-Distribution (OOD) scenarios, surpassing the fully privileged "Teacher". Results showed accelerated training by 2x compared to state-of-the-art baselines to achieve peak performance. OOD scenarios showed better generalization by 40% on average compared to existing methods. Moreover, TAR transitions seamlessly into learning during deployment without requiring privileged states, setting a new benchmark in sample-efficient, adaptive locomotion and enabling continual fine-tuning in real-world scenarios. Open-source code and videos are available at https://amrmousa.com/TARLoco/.
comment: This work has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Enhancing Fault Detection and Isolation in an All-Electric Auxiliary Power Unit (APU) Gas Generator by Utilizing Starter/Generator Signal
This study proposes a novel paradigm for enhancing fault detection and isolation (FDI) of gas generators in all-electric auxiliary power unit (APU) by utilizing shaft power information from the starter/generator. First, we conduct a pioneering investigation into the challenges and opportunities for FDI brought about by APU electrification. Our analysis reveals that the electrification of APU opens up new possibilities for utilizing shaft power estimates from starter/generator to improve gas generator FDI. We then provide comprehensive theoretical and analytical evidence demonstrating why, how, and to what extent, the shaft power information from the starter/generator can fundamentally enhance the estimation accuracy of system states and health parameters of the gas generator, while also identifying the key factors influencing these improvements in FDI performance. The effectiveness of the proposed paradigm and its theoretical foundations are validated through extensive Monte Carlo simulations. Furthermore, through comprehensive comparative analysis with state-of-the-art gas generator fault diagnosis methods, our experimental results not only demonstrate the superior performance of the proposed approach but also validate that the diagnostic capabilities of existing advanced FDI techniques can be substantially enhanced by incorporating shaft power information. And the observed performance improvement patterns strongly align with our theoretical analysis, verifying both the effectiveness and guiding significance of our theoretical framework. These research findings provide a unique perspective in answering three fundamental questions: why joint fault diagnosis of the starter/generator and gas generator is essential, how it can be implemented, and what factors determine its effectiveness, thereby opening up promising new avenues for FDI technologies in all-electric APU systems.
On Event-Triggered Resilient Consensus Using Auxiliary Layer
Due to its design simplicity, auxiliary layer-based resilient control is widely discussed in the literature to mitigate the effects of False Data Injection (FDI) attacks. However, the increased communication burden due to additional communication links for connecting an extra layer is often overlooked in the literature. This paper bridges this gap by considering an event-triggered approach for inter-layer communication between the physical layer (containing actual agents) and the auxiliary layer (containing virtual agents) for the resilient state consensus in a multi-agent system. We provide state-based and dynamic event-triggering mechanisms, the former being the motivation for the latter. The exclusion of Zeno behavior is established by proving positive minimum inter-event time (MIET). Extensive simulation and experimental results are provided to illustrate the proposed methodology.
Unified Performance Control for Non-Square Nonlinear Systems with Relaxed Controllability
In this paper, we investigate the problem of unified prescribed performance tracking for a class of non-square strict-feedback nonlinear systems under relaxed controllability conditions. By using a skillful matrix decomposition and introducing some feasible auxiliary matrices, a more generalized controllability condition than the current state of the art is constructed, which can be applied to both square and non-square nonlinear systems subject to actuator faults and unknown yet time-varying control gain. Incorporating the relaxed controllability conditions and the uniform performance specifications into the backstepping design procedure, a prescribed performance fault-tolerant controller is developed that can achieve different performance demands without modifying the controller structure, which is more flexible and practical.In addition, the destruction of the system stability by unknown controllability auxiliary matrices and unknown nonlinearities is circumvented by embedding the available core information of the state-dependent uncertainties into the design procedure. Both theoretical analysis and numerical simulation demonstrate the effectiveness and benefits of the proposed method.
comment: 9 pages,13 figures, submitted to journal
Transferable Parasitic Estimation via Graph Contrastive Learning and Label Rebalancing in AMS Circuits
Graph representation learning on Analog-Mixed Signal (AMS) circuits is crucial for various downstream tasks, e.g., parasitic estimation. However, the scarcity of design data, the unbalanced distribution of labels, and the inherent diversity of circuit implementations pose significant challenges to learning robust and transferable circuit representations. To address these limitations, we propose CircuitGCL, a novel graph contrastive learning framework that integrates representation scattering and label rebalancing to enhance transferability across heterogeneous circuit graphs. CircuitGCL employs a self-supervised strategy to learn topology-invariant node embeddings through hyperspherical representation scattering, eliminating dependency on large-scale data. Simultaneously, balanced mean squared error (BMSE) and balanced softmax cross-entropy (BSCE) losses are introduced to mitigate label distribution disparities between circuits, enabling robust and transferable parasitic estimation. Evaluated on parasitic capacitance estimation (edge-level task) and ground capacitance classification (node-level task) across TSMC 28nm AMS designs, CircuitGCL outperforms all state-of-the-art (SOTA) methods, with the $R^2$ improvement of $33.64\% \sim 44.20\%$ for edge regression and F1-score gain of $0.9\times \sim 2.1\times$ for node classification. Our code is available at https://github.com/ShenShan123/CircuitGCL.
comment: Final version accepted by the International Conference on Computer-Aided Design (ICCAD) 2025
Safe Multi-Robotic Arm Interaction via 3D Convex Shapes
Inter-robot collisions pose a significant safety risk when multiple robotic arms operate in close proximity. We present an online collision avoidance methodology leveraging 3D convex shape-based High-Order Control Barrier Functions (HOCBFs) to address this issue. While prior works focused on using Control Barrier Functions (CBFs) for human-robotic arm and single-arm collision avoidance, we explore the problem of collision avoidance between multiple robotic arms operating in a shared space. In our methodology, we utilize the proposed HOCBFs as centralized and decentralized safety filters. These safety filters are compatible with many nominal controllers and ensure safety without significantly restricting the robots' workspace. A key challenge in implementing these filters is the computational overhead caused by the large number of safety constraints and the computation of a Hessian matrix per constraint. We address this challenge by employing numerical differentiation methods to approximate computationally intensive terms. The effectiveness of our method is demonstrated through extensive simulation studies and real-world experiments with Franka Research 3 robotic arms. The project video is available at this link.
Robotics
Masquerade: Learning from In-the-wild Human Videos using Data-Editing
Robot manipulation research still suffers from significant data scarcity: even the largest robot datasets are orders of magnitude smaller and less diverse than those that fueled recent breakthroughs in language and vision. We introduce Masquerade, a method that edits in-the-wild egocentric human videos to bridge the visual embodiment gap between humans and robots and then learns a robot policy with these edited videos. Our pipeline turns each human video into robotized demonstrations by (i) estimating 3-D hand poses, (ii) inpainting the human arms, and (iii) overlaying a rendered bimanual robot that tracks the recovered end-effector trajectories. Pre-training a visual encoder to predict future 2-D robot keypoints on 675K frames of these edited clips, and continuing that auxiliary loss while fine-tuning a diffusion policy head on only 50 robot demonstrations per task, yields policies that generalize significantly better than prior work. On three long-horizon, bimanual kitchen tasks evaluated in three unseen scenes each, Masquerade outperforms baselines by 5-6x. Ablations show that both the robot overlay and co-training are indispensable, and performance scales logarithmically with the amount of edited human video. These results demonstrate that explicitly closing the visual embodiment gap unlocks a vast, readily available source of data from human videos that can be used to improve robot policies.
comment: Project website at https://masquerade-robot.github.io/
Vision-driven River Following of UAV via Safe Reinforcement Learning using Semantic Dynamics Model
Vision-driven autonomous river following by Unmanned Aerial Vehicles is critical for applications such as rescue, surveillance, and environmental monitoring, particularly in dense riverine environments where GPS signals are unreliable. We formalize river following as a coverage control problem in which the reward function is submodular, yielding diminishing returns as more unique river segments are visited, thereby framing the task as a Submodular Markov Decision Process. First, we introduce Marginal Gain Advantage Estimation, which refines the reward advantage function by using a sliding window baseline computed from historical episodic returns, thus aligning the advantage estimation with the agent's evolving recognition of action value in non-Markovian settings. Second, we develop a Semantic Dynamics Model based on patchified water semantic masks that provides more interpretable and data-efficient short-term prediction of future observations compared to latent vision dynamics models. Third, we present the Constrained Actor Dynamics Estimator architecture, which integrates the actor, the cost estimator, and SDM for cost advantage estimation to form a model-based SafeRL framework capable of solving partially observable Constrained Submodular Markov Decision Processes. Simulation results demonstrate that MGAE achieves faster convergence and superior performance over traditional critic-based methods like Generalized Advantage Estimation. SDM provides more accurate short-term state predictions that enable the cost estimator to better predict potential violations. Overall, CADE effectively integrates safety regulation into model-based RL, with the Lagrangian approach achieving the soft balance of reward and safety during training, while the safety layer enhances performance during inference by hard action overlay.
comment: Submitted to Robotics and Autonomous Systems (RAS) journal
Online Safety under Multiple Constraints and Input Bounds using gatekeeper: Theory and Applications
This letter presents an approach to guarantee online safety of a cyber-physical system under multiple state and input constraints. Our proposed framework, called gatekeeper, recursively guarantees the existence of an infinite-horizon trajectory that satisfies all constraints and system dynamics. Such trajectory is constructed using a backup controller, which we define formally in this paper. gatekeeper relies on a small number of verifiable assumptions, and is computationally efficient since it requires optimization over a single scalar variable. We make two primary contributions in this letter. (A) First, we develop the theory of gatekeeper: we derive a sub-optimality bound relative to a full nonlinear trajectory optimization problem, and show how this can be used in runtime to validate performance. This also informs the design of the backup controllers and sets. (B) Second, we demonstrate in detail an application of gatekeeper for multi-agent formation flight, where each Dubins agent must avoid multiple obstacles and weapons engagement zones, both of which are nonlinear, nonconvex constraints.
comment: 6 pages, 2 figures. Accepted for publication in IEEE L-CSS 2025
GBC: Generalized Behavior-Cloning Framework for Whole-Body Humanoid Imitation
The creation of human-like humanoid robots is hindered by a fundamental fragmentation: data processing and learning algorithms are rarely universal across different robot morphologies. This paper introduces the Generalized Behavior Cloning (GBC) framework, a comprehensive and unified solution designed to solve this end-to-end challenge. GBC establishes a complete pathway from human motion to robot action through three synergistic innovations. First, an adaptive data pipeline leverages a differentiable IK network to automatically retarget any human MoCap data to any humanoid. Building on this foundation, our novel DAgger-MMPPO algorithm with its MMTransformer architecture learns robust, high-fidelity imitation policies. To complete the ecosystem, the entire framework is delivered as an efficient, open-source platform based on Isaac Lab, empowering the community to deploy the full workflow via simple configuration scripts. We validate the power and generality of GBC by training policies on multiple heterogeneous humanoids, demonstrating excellent performance and transfer to novel motions. This work establishes the first practical and unified pathway for creating truly generalized humanoid controllers.
PPL: Point Cloud Supervised Proprioceptive Locomotion Reinforcement Learning for Legged Robots in Crawl Spaces
The legged locomotion in spatially constrained structures (called crawl spaces) is challenging. In crawl spaces, current exteroceptive locomotion learning methods are limited by large noises and errors of the sensors in possible low visibility conditions, and current proprioceptive locomotion learning methods are difficult in traversing crawl spaces because only ground features are inferred. In this study, a point cloud supervised proprioceptive locomotion reinforcement learning method for legged robots in crawl spaces is proposed. A state estimation network is designed to estimate the robot's surrounding ground and spatial features as well as the robot's collision states using historical proprioceptive sensor data. The point cloud is represented in polar coordinate frame and a point cloud processing method is proposed to efficiently extract the ground and spatial features that are used to supervise the state estimation network learning. Comprehensive reward functions that guide the robot to traverse through crawl spaces after collisions are designed. Experiments demonstrate that, compared to existing methods, our method exhibits more agile locomotion in crawl spaces. This study enhances the ability of legged robots to traverse spatially constrained environments without requiring exteroceptive sensors.
Collision-Free Bearing-Driven Formation Tracking for Euler-Lagrange Systems
In this paper, we investigate the problem of tracking formations driven by bearings for heterogeneous Euler-Lagrange systems with parametric uncertainty in the presence of multiple moving leaders. To estimate the leaders' velocities and accelerations, we first design a distributed observer for the leader system, utilizing a bearing-based localization condition in place of the conventional connectivity assumption. This observer, coupled with an adaptive mechanism, enables the synthesis of a novel distributed control law that guides the formation towards the target formation, without requiring prior knowledge of the system parameters. Furthermore, we establish a sufficient condition, dependent on the initial formation configuration, that ensures collision avoidance throughout the formation evolution. The effectiveness of the proposed approach is demonstrated through a numerical example.
comment: 10 pages, 4 figures
A Shank Angle-Based Control System Enables Soft Exoskeleton to Assist Human Non-Steady Locomotion
Exoskeletons have been shown to effectively assist humans during steady locomotion. However, their effects on non-steady locomotion, characterized by nonlinear phase progression within a gait cycle, remain insufficiently explored, particularly across diverse activities. This work presents a shank angle-based control system that enables the exoskeleton to maintain real-time coordination with human gait, even under phase perturbations, while dynamically shaping assistance profiles to match the biological ankle moment patterns across walking, running, stair negotiation tasks. The control system consists of an assistance profile online generation method and a model-based feedforward control method. The assistance profile is formulated as a dual-Gaussian model with the shank angle as the independent variable. Leveraging only IMU measurements, the model parameters are updated online each stride to adapt to inter- and intra-individual biomechanical variability. The profile tracking control employs a human-exoskeleton kinematics and stiffness model as a feedforward component, reducing reliance on historical control data due to the lack of clear and consistent periodicity in non-steady locomotion. Three experiments were conducted using a lightweight soft exoskeleton with multiple subjects. The results validated the effectiveness of each individual method, demonstrated the robustness of the control system against gait perturbations across various activities, and revealed positive biomechanical and physiological responses of human users to the exoskeleton's mechanical assistance.
comment: 49 pages, 20 figures, 4 tables
Toward Human-Robot Teaming: Learning Handover Behaviors from 3D Scenes
Human-robot teaming (HRT) systems often rely on large-scale datasets of human and robot interactions, especially for close-proximity collaboration tasks such as human-robot handovers. Learning robot manipulation policies from raw, real-world image data requires a large number of robot-action trials in the physical environment. Although simulation training offers a cost-effective alternative, the visual domain gap between simulation and robot workspace remains a major limitation. We introduce a method for training HRT policies, focusing on human-to-robot handovers, solely from RGB images without the need for real-robot training or real-robot data collection. The goal is to enable the robot to reliably receive objects from a human with stable grasping while avoiding collisions with the human hand. The proposed policy learner leverages sparse-view Gaussian Splatting reconstruction of human-to-robot handover scenes to generate robot demonstrations containing image-action pairs captured with a camera mounted on the robot gripper. As a result, the simulated camera pose changes in the reconstructed scene can be directly translated into gripper pose changes. Experiments in both Gaussian Splatting reconstructed scene and real-world human-to-robot handover experiments demonstrate that our method serves as a new and effective representation for the human-to-robot handover task, contributing to more seamless and robust HRT.
comment: 3 pages, 3 figures
Whole-Body Bilateral Teleoperation with Multi-Stage Object Parameter Estimation for Wheeled Humanoid Locomanipulation
This paper presents an object-aware whole-body bilateral teleoperation framework for wheeled humanoid loco-manipulation. This framework combines whole-body bilateral teleoperation with an online multi-stage object inertial parameter estimation module, which is the core technical contribution of this work. The multi-stage process sequentially integrates a vision-based object size estimator, an initial parameter guess generated by a large vision-language model (VLM), and a decoupled hierarchical sampling strategy. The visual size estimate and VLM prior offer a strong initial guess of the object's inertial parameters, significantly reducing the search space for sampling-based refinement and improving the overall estimation speed. A hierarchical strategy first estimates mass and center of mass, then infers inertia from object size to ensure physically feasible parameters, while a decoupled multi-hypothesis scheme enhances robustness to VLM prior errors. Our estimator operates in parallel with high-fidelity simulation and hardware, enabling real-time online updates. The estimated parameters are then used to update the wheeled humanoid's equilibrium point, allowing the operator to focus more on locomotion and manipulation. This integration improves the haptic force feedback for dynamic synchronization, enabling more dynamic whole-body teleoperation. By compensating for object dynamics using the estimated parameters, the framework also improves manipulation tracking while preserving compliant behavior. We validate the system on a customized wheeled humanoid with a robotic gripper and human-machine interface, demonstrating real-time execution of lifting, delivering, and releasing tasks with a payload weighing approximately one-third of the robot's body weight.
Embodied Tactile Perception of Soft Objects Properties
To enable robots to develop human-like fine manipulation, it is essential to understand how mechanical compliance, multi-modal sensing, and purposeful interaction jointly shape tactile perception. In this study, we use a dedicated modular e-Skin with tunable mechanical compliance and multi-modal sensing (normal, shear forces and vibrations) to systematically investigate how sensing embodiment and interaction strategies influence robotic perception of objects. Leveraging a curated set of soft wave objects with controlled viscoelastic and surface properties, we explore a rich set of palpation primitives-pressing, precession, sliding that vary indentation depth, frequency, and directionality. In addition, we propose the latent filter, an unsupervised, action-conditioned deep state-space model of the sophisticated interaction dynamics and infer causal mechanical properties into a structured latent space. This provides generalizable and in-depth interpretable representation of how embodiment and interaction determine and influence perception. Our investigation demonstrates that multi-modal sensing outperforms uni-modal sensing. It highlights a nuanced interaction between the environment and mechanical properties of e-Skin, which should be examined alongside the interaction by incorporating temporal dynamics.
RayletDF: Raylet Distance Fields for Generalizable 3D Surface Reconstruction from Point Clouds or Gaussians ICCV 2025
In this paper, we present a generalizable method for 3D surface reconstruction from raw point clouds or pre-estimated 3D Gaussians by 3DGS from RGB images. Unlike existing coordinate-based methods which are often computationally intensive when rendering explicit surfaces, our proposed method, named RayletDF, introduces a new technique called raylet distance field, which aims to directly predict surface points from query rays. Our pipeline consists of three key modules: a raylet feature extractor, a raylet distance field predictor, and a multi-raylet blender. These components work together to extract fine-grained local geometric features, predict raylet distances, and aggregate multiple predictions to reconstruct precise surface points. We extensively evaluate our method on multiple public real-world datasets, demonstrating superior performance in surface reconstruction from point clouds or 3D Gaussians. Most notably, our method achieves exceptional generalization ability, successfully recovering 3D surfaces in a single-forward pass across unseen datasets in testing.
comment: ICCV 2025 Highlight. Shenxing and Jinxi are co-first authors. Code and data are available at: https://github.com/vLAR-group/RayletDF
TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos ICCV 2025
In this paper, we aim to model 3D scene geometry, appearance, and physical information just from dynamic multi-view videos in the absence of any human labels. By leveraging physics-informed losses as soft constraints or integrating simple physics models into neural nets, existing works often fail to learn complex motion physics, or doing so requires additional labels such as object types or masks. We propose a new framework named TRACE to model the motion physics of complex dynamic 3D scenes. The key novelty of our method is that, by formulating each 3D point as a rigid particle with size and orientation in space, we directly learn a translation rotation dynamics system for each particle, explicitly estimating a complete set of physical parameters to govern the particle's motion over time. Extensive experiments on three existing dynamic datasets and one newly created challenging synthetic datasets demonstrate the extraordinary performance of our method over baselines in the task of future frame extrapolation. A nice property of our framework is that multiple objects or parts can be easily segmented just by clustering the learned physical parameters.
comment: ICCV 2025. Code and data are available at: https://github.com/vLAR-group/TRACE
FLARE: Agile Flights for Quadrotor Cable-Suspended Payload System via Reinforcement Learning
Agile flight for the quadrotor cable-suspended payload system is a formidable challenge due to its underactuated, highly nonlinear, and hybrid dynamics. Traditional optimization-based methods often struggle with high computational costs and the complexities of cable mode transitions, limiting their real-time applicability and maneuverability exploitation. In this letter, we present FLARE, a reinforcement learning (RL) framework that directly learns agile navigation policy from high-fidelity simulation. Our method is validated across three designed challenging scenarios, notably outperforming a state-of-the-art optimization-based approach by a 3x speedup during gate traversal maneuvers. Furthermore, the learned policies achieve successful zero-shot sim-to-real transfer, demonstrating remarkable agility and safety in real-world experiments, running in real time on an onboard computer.
Predictive Uncertainty for Runtime Assurance of a Real-Time Computer Vision-Based Landing System SC 2025
Recent advances in data-driven computer vision have enabled robust autonomous navigation capabilities for civil aviation, including automated landing and runway detection. However, ensuring that these systems meet the robustness and safety requirements for aviation applications remains a major challenge. In this work, we present a practical vision-based pipeline for aircraft pose estimation from runway images that represents a step toward the ability to certify these systems for use in safety-critical aviation applications. Our approach features three key innovations: (i) an efficient, flexible neural architecture based on a spatial Soft Argmax operator for probabilistic keypoint regression, supporting diverse vision backbones with real-time inference; (ii) a principled loss function producing calibrated predictive uncertainties, which are evaluated via sharpness and calibration metrics; and (iii) an adaptation of Residual-based Receiver Autonomous Integrity Monitoring (RAIM), enabling runtime detection and rejection of faulty model outputs. We implement and evaluate our pose estimation pipeline on a dataset of runway images. We show that our model outperforms baseline architectures in terms of accuracy while also producing well-calibrated uncertainty estimates with sub-pixel precision that can be used downstream for fault detection.
comment: 8 pages, 5 figures, accepted at DASC 2025
Immersive Teleoperation of Beyond-Human-Scale Robotic Manipulators: Challenges and Future Directions
Teleoperation of beyond-human-scale robotic manipulators (BHSRMs) presents unique challenges that differ fundamentally from conventional human-scale systems. As these platforms gain relevance in industrial domains such as construction, mining, and disaster response, immersive interfaces must be rethought to support scalable, safe, and effective human-robot collaboration. This paper investigates the control, cognitive, and interface-level challenges of immersive teleoperation in BHSRMs, with a focus on ensuring operator safety, minimizing sensorimotor mismatch, and enhancing the sense of embodiment. We analyze design trade-offs in haptic and visual feedback systems, supported by early experimental comparisons of exoskeleton- and joystick-based control setups. Finally, we outline key research directions for developing new evaluation tools, scaling strategies, and human-centered safety models tailored to large-scale robotic telepresence.
comment: This work has been accepted for presentation at the 2025 IEEE Conference on Telepresence, to be held in Leiden, Netherlands
Surg-InvNeRF: Invertible NeRF for 3D tracking and reconstruction in surgical vision
We proposed a novel test-time optimisation (TTO) approach framed by a NeRF-based architecture for long-term 3D point tracking. Most current methods in point tracking struggle to obtain consistent motion or are limited to 2D motion. TTO approaches frame the solution for long-term tracking as optimising a function that aggregates correspondences from other specialised state-of-the-art methods. Unlike the state-of-the-art on TTO, we propose parametrising such a function with our new invertible Neural Radiance Field (InvNeRF) architecture to perform both 2D and 3D tracking in surgical scenarios. Our approach allows us to exploit the advantages of a rendering-based approach by supervising the reprojection of pixel correspondences. It adapts strategies from recent rendering-based methods to obtain a bidirectional deformable-canonical mapping, to efficiently handle a defined workspace, and to guide the rays' density. It also presents our multi-scale HexPlanes for fast inference and a new algorithm for efficient pixel sampling and convergence criteria. We present results in the STIR and SCARE datasets, for evaluating point tracking and testing the integration of kinematic data in our pipeline, respectively. In 2D point tracking, our approach surpasses the precision and accuracy of the TTO state-of-the-art methods by nearly 50% on average precision, while competing with other approaches. In 3D point tracking, this is the first TTO approach, surpassing feed-forward methods while incorporating the benefits of a deformable NeRF-based reconstruction.
comment: 10 pages
Plane Detection and Ranking via Model Information Optimization IROS
Plane detection from depth images is a crucial subtask with broad robotic applications, often accomplished by iterative methods such as Random Sample Consensus (RANSAC). While RANSAC is a robust strategy with strong probabilistic guarantees, the ambiguity of its inlier threshold criterion makes it susceptible to false positive plane detections. This issue is particularly prevalent in complex real-world scenes, where the true number of planes is unknown and multiple planes coexist. In this paper, we aim to address this limitation by proposing a generalised framework for plane detection based on model information optimization. Building on previous works, we treat the observed depth readings as discrete random variables, with their probability distributions constrained by the ground truth planes. Various models containing different candidate plane constraints are then generated through repeated random sub-sampling to explain our observations. By incorporating the physics and noise model of the depth sensor, we can calculate the information for each model, and the model with the least information is accepted as the most likely ground truth. This information optimization process serves as an objective mechanism for determining the true number of planes and preventing false positive detections. Additionally, the quality of each detected plane can be ranked by summing the information reduction of inlier points for each plane. We validate these properties through experiments with synthetic data and find that our algorithm estimates plane parameters more accurately compared to the default Open3D RANSAC plane segmentation. Furthermore, we accelerate our algorithm by partitioning the depth map using neural network segmentation, which enhances its ability to generate more realistic plane parameters in real-world data.
comment: Accepted as contributed paper in the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Interpretable Robot Control via Structured Behavior Trees and Large Language Models
As intelligent robots become more integrated into human environments, there is a growing need for intuitive and reliable Human-Robot Interaction (HRI) interfaces that are adaptable and more natural to interact with. Traditional robot control methods often require users to adapt to interfaces or memorize predefined commands, limiting usability in dynamic, unstructured environments. This paper presents a novel framework that bridges natural language understanding and robotic execution by combining Large Language Models (LLMs) with Behavior Trees. This integration enables robots to interpret natural language instructions given by users and translate them into executable actions by activating domain-specific plugins. The system supports scalable and modular integration, with a primary focus on perception-based functionalities, such as person tracking and hand gesture recognition. To evaluate the system, a series of real-world experiments was conducted across diverse environments. Experimental results demonstrate that the proposed approach is practical in real-world scenarios, with an average cognition-to-execution accuracy of approximately 94%, making a significant contribution to HRI systems and robots. The complete source code of the framework is publicly available at https://github.com/snt-arg/robot_suite.
comment: 15 pages, 5 figures, 3 tables
BEAVR: Bimanual, multi-Embodiment, Accessible, Virtual Reality Teleoperation System for Robots
\textbf{BEAVR} is an open-source, bimanual, multi-embodiment Virtual Reality (VR) teleoperation system for robots, designed to unify real-time control, data recording, and policy learning across heterogeneous robotic platforms. BEAVR enables real-time, dexterous teleoperation using commodity VR hardware, supports modular integration with robots ranging from 7-DoF manipulators to full-body humanoids, and records synchronized multi-modal demonstrations directly in the LeRobot dataset schema. Our system features a zero-copy streaming architecture achieving $\leq$35\,ms latency, an asynchronous ``think--act'' control loop for scalable inference, and a flexible network API optimized for real-time, multi-robot operation. We benchmark BEAVR across diverse manipulation tasks and demonstrate its compatibility with leading visuomotor policies such as ACT, DiffusionPolicy, and SmolVLA. All code is publicly available, and datasets are released on Hugging Face\footnote{Code, datasets, and VR app available at https://github.com/ARCLab-MIT/BEAVR-Bot.
comment: Accepted for presentation on ICCR Kyoto 2025
HapticGiant: A Novel Very Large Kinesthetic Haptic Interface with Hierarchical Force Control
Research in virtual reality and haptic technologies has consistently aimed to enhance immersion. While advanced head-mounted displays are now commercially available, kinesthetic haptic interfaces still face challenges such as limited workspaces, insufficient degrees of freedom, and kinematics not matching the human arm. In this paper, we present HapticGiant, a novel large-scale kinesthetic haptic interface designed to match the properties of the human arm as closely as possible and to facilitate natural user locomotion while providing full haptic feedback. The interface incorporates a novel admittance-type force control scheme, leveraging hierarchical optimization to render both arbitrary serial kinematic chains and Cartesian admittances. Notably, the proposed control scheme natively accounts for system limitations, including joint and Cartesian constraints, as well as singularities. Experimental results demonstrate the effectiveness of HapticGiant and its control scheme, paving the way for highly immersive virtual reality applications.
comment: Final Version - Accepted on IEEE Transactions on Haptics
ESCoT: An Enhanced Step-based Coordinate Trajectory Planning Method for Multiple Car-like Robots
Multi-vehicle trajectory planning (MVTP) is one of the key challenges in multi-robot systems (MRSs) and has broad applications across various fields. This paper presents ESCoT, an enhanced step-based coordinate trajectory planning method for multiple car-like robots. ESCoT incorporates two key strategies: collaborative planning for local robot groups and replanning for duplicate configurations. These strategies effectively enhance the performance of step-based MVTP methods. Through extensive experiments, we show that ESCoT 1) in sparse scenarios, significantly improves solution quality compared to baseline step-based method, achieving up to 70% improvement in typical conflict scenarios and 34% in randomly generated scenarios, while maintaining high solving efficiency; and 2) in dense scenarios, outperforms all baseline methods, maintains a success rate of over 50% even in the most challenging configurations. The results demonstrate that ESCoT effectively solves MVTP, further extending the capabilities of step-based methods. Finally, practical robot tests validate the algorithm's applicability in real-world scenarios.
WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
Visual geo-localization for drones faces critical degradation under weather perturbations, \eg, rain and fog, where existing methods struggle with two inherent limitations: 1) Heavy reliance on limited weather categories that constrain generalization, and 2) Suboptimal disentanglement of entangled scene-weather features through pseudo weather categories. We present WeatherPrompt, a multi-modality learning paradigm that establishes weather-invariant representations through fusing the image embedding with the text context. Our framework introduces two key contributions: First, a Training-free Weather Reasoning mechanism that employs off-the-shelf large multi-modality models to synthesize multi-weather textual descriptions through human-like reasoning. It improves the scalability to unseen or complex weather, and could reflect different weather strength. Second, to better disentangle the scene and weather feature, we propose a multi-modality framework with the dynamic gating mechanism driven by the text embedding to adaptively reweight and fuse visual features across modalities. The framework is further optimized by the cross-modal objectives, including image-text contrastive learning and image-text matching, which maps the same scene with different weather conditions closer in the respresentation space. Extensive experiments validate that, under diverse weather conditions, our method achieves competitive recall rates compared to state-of-the-art drone geo-localization methods. Notably, it improves Recall@1 by +13.37\% under night conditions and by 18.69\% under fog and snow conditions.
comment: 13 pages, 4figures
CaRoBio: 3D Cable Routing with a Bio-inspired Gripper Fingernail
The manipulation of deformable linear flexures has a wide range of applications in industry, such as cable routing in automotive manufacturing and textile production. Cable routing, as a complex multi-stage robot manipulation scenario, is a challenging task for robot automation. Common parallel two-finger grippers have the risk of over-squeezing and over-tension when grasping and guiding cables. In this paper, a novel eagle-inspired fingernail is designed and mounted on the gripper fingers, which helps with cable grasping on planar surfaces and in-hand cable guiding operations. Then we present a single-grasp end-to-end 3D cable routing framework utilizing the proposed fingernails, instead of the common pick-and-place strategy. Continuous control is achieved to efficiently manipulate cables through vision-based state estimation of task configurations and offline trajectory planning based on motion primitives. We evaluate the effectiveness of the proposed framework with a variety of cables and channel slots, significantly outperforming the pick-and-place manipulation process under equivalent perceptual conditions. Our reconfigurable task setting and the proposed framework provide a reference for future cable routing manipulations in 3D space.
SMART-OC: A Real-time Time-risk Optimal Replanning Algorithm for Dynamic Obstacles and Spatio-temporally Varying Currents
Typical marine environments are highly complex with spatio-temporally varying currents and dynamic obstacles, presenting significant challenges to Unmanned Surface Vehicles (USVs) for safe and efficient navigation. Thus, the USVs need to continuously adapt their paths with real-time information to avoid collisions and follow the path of least resistance to the goal via exploiting ocean currents. In this regard, we introduce a novel algorithm, called Self-Morphing Adaptive Replanning Tree for dynamic Obstacles and Currents (SMART-OC), that facilitates real-time time-risk optimal replanning in dynamic environments. SMART-OC integrates the obstacle risks along a path with the time cost to reach the goal to find the time-risk optimal path. The effectiveness of SMART-OC is validated by simulation experiments, which demonstrate that the USV performs fast replannings to avoid dynamic obstacles and exploit ocean currents to successfully reach the goal.
Reactive Model Predictive Contouring Control for Robot Manipulators IROS
This contribution presents a robot path-following framework via Reactive Model Predictive Contouring Control (RMPCC) that successfully avoids obstacles, singularities and self-collisions in dynamic environments at 100 Hz. Many path-following methods rely on the time parametrization, but struggle to handle collision and singularity avoidance while adhering kinematic limits or other constraints. Specifically, the error between the desired path and the actual position can become large when executing evasive maneuvers. Thus, this paper derives a method that parametrizes the reference path by a path parameter and performs the optimization via RMPCC. In particular, Control Barrier Functions (CBFs) are introduced to avoid collisions and singularities in dynamic environments. A Jacobian-based linearization and Gauss-Newton Hessian approximation enable solving the nonlinear RMPCC problem at 100 Hz, outperforming state-of-the-art methods by a factor of 10. Experiments confirm that the framework handles dynamic obstacles in real-world settings with low contouring error and low robot acceleration.
comment: 8 pages, 7 figures, 3 tables, conference paper, Accepted for publication at IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS) 2025
DAgger Diffusion Navigation: DAgger Boosted Diffusion Policy for Vision-Language Navigation
Vision-Language Navigation in Continuous Environments (VLN-CE) requires agents to follow natural language instructions through free-form 3D spaces. Existing VLN-CE approaches typically use a two-stage waypoint planning framework, where a high-level waypoint predictor generates the navigable waypoints, and then a navigation planner suggests the intermediate goals in the high-level action space. However, this two-stage decomposition framework suffers from: (1) global sub-optimization due to the proxy objective in each stage, and (2) a performance bottleneck caused by the strong reliance on the quality of the first-stage predicted waypoints. To address these limitations, we propose DAgger Diffusion Navigation (DifNav), an end-to-end optimized VLN-CE policy that unifies the traditional two stages, i.e. waypoint generation and planning, into a single diffusion policy. Notably, DifNav employs a conditional diffusion policy to directly model multi-modal action distributions over future actions in continuous navigation space, eliminating the need for a waypoint predictor while enabling the agent to capture multiple possible instruction-following behaviors. To address the issues of compounding error in imitation learning and enhance spatial reasoning in long-horizon navigation tasks, we employ DAgger for online policy training and expert trajectory augmentation, and use the aggregated data to further fine-tune the policy. This approach significantly improves the policy's robustness and its ability to recover from error states. Extensive experiments on benchmark datasets demonstrate that, even without a waypoint predictor, the proposed method substantially outperforms previous state-of-the-art two-stage waypoint-based models in terms of navigation performance. Our code is available at: https://github.com/Tokishx/DifNav.
Distilling LLM Prior to Flow Model for Generalizable Agent's Imagination in Object Goal Navigation
The Object Goal Navigation (ObjectNav) task challenges agents to locate a specified object in an unseen environment by imagining unobserved regions of the scene. Prior approaches rely on deterministic and discriminative models to complete semantic maps, overlooking the inherent uncertainty in indoor layouts and limiting their ability to generalize to unseen environments. In this work, we propose GOAL, a generative flow-based framework that models the semantic distribution of indoor environments by bridging observed regions with LLM-enriched full-scene semantic maps. During training, spatial priors inferred from large language models (LLMs) are encoded as two-dimensional Gaussian fields and injected into target maps, distilling rich contextual knowledge into the flow model and enabling more generalizable completions. Extensive experiments demonstrate that GOAL achieves state-of-the-art performance on MP3D and Gibson, and shows strong generalization in transfer settings to HM3D. Codes and pretrained models are available at https://github.com/Badi-Li/GOAL.
Systematic Constraint Formulation and Collision-Free Trajectory Planning Using Space-Time Graphs of Convex Sets
In this paper, we create optimal, collision-free, time-dependent trajectories through cluttered dynamic environments. The many spatial and temporal constraints make finding an initial guess for a numerical solver difficult. Graphs of Convex Sets (GCS) and the recently developed Space-Time Graphs of Convex Sets formulation (ST-GCS) enable us to generate optimal minimum distance collision-free trajectories without providing an initial guess to the solver. We also explore the derivation of general GCS-compatible constraints and document an intuitive strategy for adapting general constraints to the framework. We show that ST-GCS produces equivalent trajectories to the standard GCS formulation when the environment is static. We then show ST-GCS operating in dynamic environments to find minimum distance collision-free trajectories.
comment: 21 pages with references, 20 figures
WiFi-based Global Localization in Large-Scale Environments Leveraging Structural Priors from osmAG
Global localization is essential for autonomous robotics, especially in indoor environments where the GPS signal is denied. We propose a novel WiFi-based localization framework that leverages ubiquitous wireless infrastructure and the OpenStreetMap Area Graph (osmAG) for large-scale indoor environments. Our approach integrates signal propagation modeling with osmAG's geometric and topological priors. In the offline phase, an iterative optimization algorithm localizes WiFi Access Points (APs) by modeling wall attenuation, achieving a mean localization error of 3.79 m (35.3\% improvement over trilateration). In the online phase, real-time robot localization uses the augmented osmAG map, yielding a mean error of 3.12 m in fingerprinted areas (8.77\% improvement over KNN fingerprinting) and 3.83 m in non-fingerprinted areas (81.05\% improvement). Comparison with a fingerprint-based method shows that our approach is much more space efficient and achieves superior localization accuracy, especially for positions where no fingerprint data are available. Validated across a complex 11,025 &m^2& multi-floor environment, this framework offers a scalable, cost-effective solution for indoor robotic localization, solving the kidnapped robot problem. The code and dataset are available at https://github.com/XuMa369/osmag-wifi-localization.
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
Learning skills from human motions offers a promising path toward generalizable policies for versatile humanoid whole-body control, yet two key cornerstones are missing: (1) a high-quality motion tracking framework that faithfully transforms large-scale kinematic references into robust and extremely dynamic motions on real hardware, and (2) a distillation approach that can effectively learn these motion primitives and compose them to solve downstream tasks. We address these gaps with BeyondMimic, a real-world framework to learn from human motions for versatile and naturalistic humanoid control via guided diffusion. Our framework provides a motion tracking pipeline capable of challenging skills such as jumping spins, sprinting, and cartwheels with state-of-the-art motion quality. Moving beyond simply mimicking existing motions, we further introduce a unified diffusion policy that enables zero-shot task-specific control at test time using simple cost functions. Deployed on hardware, BeyondMimic performs diverse tasks at test time, including waypoint navigation, joystick teleoperation, and obstacle avoidance, bridging sim-to-real motion tracking and flexible synthesis of human motion primitives for whole-body control. https://beyondmimic.github.io/.
comment: coin toss authorship, minor changes
LM-MCVT: A Lightweight Multi-modal Multi-view Convolutional-Vision Transformer Approach for 3D Object Recognition
In human-centered environments such as restaurants, homes, and warehouses, robots often face challenges in accurately recognizing 3D objects. These challenges stem from the complexity and variability of these environments, including diverse object shapes. In this paper, we propose a novel Lightweight Multi-modal Multi-view Convolutional-Vision Transformer network (LM-MCVT) to enhance 3D object recognition in robotic applications. Our approach leverages the Globally Entropy-based Embeddings Fusion (GEEF) method to integrate multi-views efficiently. The LM-MCVT architecture incorporates pre- and mid-level convolutional encoders and local and global transformers to enhance feature extraction and recognition accuracy. We evaluate our method on the synthetic ModelNet40 dataset and achieve a recognition accuracy of 95.6% using a four-view setup, surpassing existing state-of-the-art methods. To further validate its effectiveness, we conduct 5-fold cross-validation on the real-world OmniObject3D dataset using the same configuration. Results consistently show superior performance, demonstrating the method's robustness in 3D object recognition across synthetic and real-world 3D data.
GeoVLA: Empowering 3D Representations in Vision-Language-Action Models
Vision-Language-Action (VLA) models have emerged as a promising approach for enabling robots to follow language instructions and predict corresponding actions. However, current VLA models mainly rely on 2D visual inputs, neglecting the rich geometric information in the 3D physical world, which limits their spatial awareness and adaptability. In this paper, we present GeoVLA, a novel VLA framework that effectively integrates 3D information to advance robotic manipulation. It uses a vision-language model (VLM) to process images and language instructions,extracting fused vision-language embeddings. In parallel, it converts depth maps into point clouds and employs a customized point encoder, called Point Embedding Network, to generate 3D geometric embeddings independently. These produced embeddings are then concatenated and processed by our proposed spatial-aware action expert, called 3D-enhanced Action Expert, which combines information from different sensor modalities to produce precise action sequences. Through extensive experiments in both simulation and real-world environments, GeoVLA demonstrates superior performance and robustness. It achieves state-of-the-art results in the LIBERO and ManiSkill2 simulation benchmarks and shows remarkable robustness in real-world tasks requiring height adaptability, scale awareness and viewpoint invariance.
comment: The project is visible at https://linsun449.github.io/GeoVLA/
Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors
A dexterous hand capable of generalizable grasping objects is fundamental for the development of general-purpose embodied AI. However, previous methods focus narrowly on low-level grasp stability metrics, neglecting affordance-aware positioning and human-like poses which are crucial for downstream manipulation. To address these limitations, we propose AffordDex, a novel framework with two-stage training that learns a universal grasping policy with an inherent understanding of both motion priors and object affordances. In the first stage, a trajectory imitator is pre-trained on a large corpus of human hand motions to instill a strong prior for natural movement. In the second stage, a residual module is trained to adapt these general human-like motions to specific object instances. This refinement is critically guided by two components: our Negative Affordance-aware Segmentation (NAA) module, which identifies functionally inappropriate contact regions, and a privileged teacher-student distillation process that ensures the final vision-based policy is highly successful. Extensive experiments demonstrate that AffordDex not only achieves universal dexterous grasping but also remains remarkably human-like in posture and functionally appropriate in contact location. As a result, AffordDex significantly outperforms state-of-the-art baselines across seen objects, unseen instances, and even entirely novel categories.
comment: 13 pages, 8 figures
A Minimal Model for Emergent Collective Behaviors in Autonomous Robotic Multi-Agent Systems
Collective behaviors such as swarming and flocking emerge from simple, decentralized interactions in biological systems. Existing models, such as Vicsek and Cucker-Smale, lack collision avoidance, whereas the Olfati-Saber model imposes rigid formations, limiting their applicability in swarm robotics. To address these limitations, this paper proposes a minimal yet expressive model that governs agent dynamics using relative positions, velocities, and local density, modulated by two tunable parameters: the spatial offset and kinetic offset. The model achieves spatially flexible, collision-free behaviors that reflect naturalistic group dynamics. Furthermore, we extend the framework to cognitive autonomous systems, enabling energy-aware phase transitions between swarming and flocking through adaptive control parameter tuning. This cognitively inspired approach offers a robust foundation for real-world applications in multi-robot systems, particularly autonomous aerial swarms.
comment: Accepted for IEEE Transactions on Cognitive and Developmental Systems. Simulation video for Fig. 8: https://youtube.com/shorts/StHrtnSJyyg Simulation video for Fig. 10: https://youtu.be/Z26m7M-63D4
AgentWorld: An Interactive Simulation Platform for Scene Construction and Mobile Robotic Manipulation
We introduce AgentWorld, an interactive simulation platform for developing household mobile manipulation capabilities. Our platform combines automated scene construction that encompasses layout generation, semantic asset placement, visual material configuration, and physics simulation, with a dual-mode teleoperation system supporting both wheeled bases and humanoid locomotion policies for data collection. The resulting AgentWorld Dataset captures diverse tasks ranging from primitive actions (pick-and-place, push-pull, etc.) to multistage activities (serve drinks, heat up food, etc.) across living rooms, bedrooms, and kitchens. Through extensive benchmarking of imitation learning methods including behavior cloning, action chunking transformers, diffusion policies, and vision-language-action models, we demonstrate the dataset's effectiveness for sim-to-real transfer. The integrated system provides a comprehensive solution for scalable robotic skill acquisition in complex home environments, bridging the gap between simulation-based training and real-world deployment. The code, datasets will be available at https://yizhengzhang1.github.io/agent_world/
comment: Accepted by Conference on Robot Learning 2025
Decoupling Geometry from Optimization in 2D Irregular Cutting and Packing Problems: an Open-Source Collision Detection Engine
Addressing irregular cutting and packing (C&P) optimization problems poses two distinct challenges: the geometric challenge of determining whether or not an item can be placed feasibly at a certain position, and the optimization challenge of finding a good solution according to some objective function. Until now, those tackling such problems have had to address both challenges simultaneously, requiring two distinct sets of expertise and a lot of research & development effort. One way to lower this barrier is to decouple the two challenges. In this paper we introduce a powerful collision detection engine (CDE) for 2D irregular C&P problems which assumes full responsibility for the geometric challenge. The CDE (i) allows users to focus with full confidence on their optimization challenge by abstracting geometry away and (ii) enables independent advances to propagate to all optimization algorithms built atop it. We present a set of core principles and design philosophies to model a general and adaptable CDE focused on maximizing performance, accuracy and robustness. These principles are accompanied by a concrete open-source implementation called $\texttt{jagua-rs}$. This paper together with its implementation serves as a catalyst for future advances in irregular C&P problems by providing a solid foundation which can either be used as it currently exists or be further improved upon.
comment: 25 pages, 16 figures
Multi-view Normal and Distance Guidance Gaussian Splatting for Surface Reconstruction IROS 2025
3D Gaussian Splatting (3DGS) achieves remarkable results in the field of surface reconstruction. However, when Gaussian normal vectors are aligned within the single-view projection plane, while the geometry appears reasonable in the current view, biases may emerge upon switching to nearby views. To address the distance and global matching challenges in multi-view scenes, we design multi-view normal and distance-guided Gaussian splatting. This method achieves geometric depth unification and high-accuracy reconstruction by constraining nearby depth maps and aligning 3D normals. Specifically, for the reconstruction of small indoor and outdoor scenes, we propose a multi-view distance reprojection regularization module that achieves multi-view Gaussian alignment by computing the distance loss between two nearby views and the same Gaussian surface. Additionally, we develop a multi-view normal enhancement module, which ensures consistency across views by matching the normals of pixel points in nearby views and calculating the loss. Extensive experimental results demonstrate that our method outperforms the baseline in both quantitative and qualitative evaluations, significantly enhancing the surface reconstruction capability of 3DGS. Our code will be made publicly available at (https://github.com/Bistu3DV/MND-GS/).
comment: This paper has been accepted by IROS 2025. Code: https://github.com/Bistu3DV/MND-GS/
RoHOI: Robustness Benchmark for Human-Object Interaction Detection
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. However, models trained on clean datasets degrade in real-world conditions due to unforeseen corruptions, leading to inaccurate prediction. To address this, we introduce the first robustness benchmark for HOI detection, evaluating model resilience under diverse challenges. Despite advances, current models struggle with environmental variability, occlusions, and noise. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric. We systematically analyze existing models in the HOI field, revealing significant performance drops under corruptions. To improve robustness, we propose a Semantic-Aware Masking-based Progressive Learning (SAMPL) strategy to guide the model to be optimized based on holistic and partial cues, thus dynamically adjusting the model's optimization to enhance robust feature learning. Extensive experiments show that our approach outperforms state-of-the-art methods, setting a new standard for robust HOI detection. Benchmarks, datasets, and code will be made publicly available at https://github.com/Kratos-Wen/RoHOI.
comment: Benchmarks, datasets, and code will be made publicly available at https://github.com/Kratos-Wen/RoHOI
Accelerated Reeds-Shepp and Under-Specified Reeds-Shepp Algorithms for Mobile Robot Path Planning
In this study, we present a simple and intuitive method for accelerating optimal Reeds-Shepp path computation. Our approach uses geometrical reasoning to analyze the behavior of optimal paths, resulting in a new partitioning of the state space and a further reduction in the minimal set of viable paths. We revisit and reimplement classic methodologies from the literature, which lack contemporary open-source implementations, to serve as benchmarks for evaluating our method. Additionally, we address the under-specified Reeds-Shepp planning problem where the final orientation is unspecified. We perform exhaustive experiments to validate our solutions. Compared to the modern C++ implementation of the original Reeds-Shepp solution in the Open Motion Planning Library, our method demonstrates a 15x speedup, while classic methods achieve a 5.79x speedup. Both approaches exhibit machine-precision differences in path lengths compared to the original solution. We release our proposed C++ implementations for both the accelerated and under-specified Reeds-Shepp problems as open-source code.
comment: 19 pages, 27 figures
REACT: Real-time Efficient Attribute Clustering and Transfer for Updatable 3D Scene Graph IROS 2025
Modern-day autonomous robots need high-level map representations to perform sophisticated tasks. Recently, 3D scene graphs (3DSGs) have emerged as a promising alternative to traditional grid maps, blending efficient memory use and rich feature representation. However, most efforts to apply them have been limited to static worlds. This work introduces REACT, a framework that efficiently performs real-time attribute clustering and transfer to relocalize object nodes in a 3DSG. REACT employs a novel method for comparing object instances using an embedding model trained on triplet loss, facilitating instance clustering and matching. Experimental results demonstrate that REACT is able to relocalize objects while maintaining computational efficiency. The REACT framework's source code will be available as an open-source project, promoting further advancements in reusable and updatable 3DSGs.
comment: Accepted to IROS 2025
ParkDiffusion: Heterogeneous Multi-Agent Multi-Modal Trajectory Prediction for Automated Parking using Diffusion Models IROS 2025
Automated parking is a critical feature of Advanced Driver Assistance Systems (ADAS), where accurate trajectory prediction is essential to bridge perception and planning modules. Despite its significance, research in this domain remains relatively limited, with most existing studies concentrating on single-modal trajectory prediction of vehicles. In this work, we propose ParkDiffusion, a novel approach that predicts the trajectories of both vehicles and pedestrians in automated parking scenarios. ParkDiffusion employs diffusion models to capture the inherent uncertainty and multi-modality of future trajectories, incorporating several key innovations. First, we propose a dual map encoder that processes soft semantic cues and hard geometric constraints using a two-step cross-attention mechanism. Second, we introduce an adaptive agent type embedding module, which dynamically conditions the prediction process on the distinct characteristics of vehicles and pedestrians. Third, to ensure kinematic feasibility, our model outputs control signals that are subsequently used within a kinematic framework to generate physically feasible trajectories. We evaluate ParkDiffusion on the Dragon Lake Parking (DLP) dataset and the Intersections Drone (inD) dataset. Our work establishes a new baseline for heterogeneous trajectory prediction in parking scenarios, outperforming existing methods by a considerable margin.
comment: IROS 2025 Camera-Ready Version
A multi-strategy improved snake optimizer for three-dimensional UAV path planning and engineering problems
Metaheuristic algorithms have gained widespread application across various fields owing to their ability to generate diverse solutions. One such algorithm is the Snake Optimizer (SO), a progressive optimization approach. However, SO suffers from the issues of slow convergence speed and susceptibility to local optima. In light of these shortcomings, we propose a novel Multi-strategy Improved Snake Optimizer (MISO). Firstly, we propose a new adaptive random disturbance strategy based on sine function to alleviate the risk of getting trapped in a local optimum. Secondly, we introduce adaptive Levy flight strategy based on scale factor and leader and endow the male snake leader with flight capability, which makes it easier for the algorithm to leap out of the local optimum and find the global optimum. More importantly, we put forward a position update strategy combining elite leadership and Brownian motion, effectively accelerating the convergence speed while ensuring precision. Finally, to demonstrate the performance of MISO, we utilize 30 CEC2017 test functions and the CEC2022 test suite, comparing it with 11 popular algorithms across different dimensions to validate its effectiveness. Moreover, Unmanned Aerial Vehicle (UAV) has been widely used in various fields due to its advantages of low cost, high mobility and easy operation. However, the UAV path planning problem is crucial for flight safety and efficiency, and there are still challenges in establishing and optimizing the path model. Therefore, we apply MISO to the UAV 3D path planning problem as well as 6 engineering design problems to assess its feasibility in practical applications. The experimental results demonstrate that MISO exceeds other competitive algorithms in terms of solution quality and stability, establishing its strong potential for application.
comment: 59 pages, 22 figures
Learning Whole-Body Loco-Manipulation for Omni-Directional Task Space Pose Tracking with a Wheeled-Quadrupedal-Manipulator
In this paper, we study the whole-body loco-manipulation problem using reinforcement learning (RL). Specifically, we focus on the problem of how to coordinate the floating base and the robotic arm of a wheeled-quadrupedal manipulator robot to achieve direct six-dimensional (6D) end-effector (EE) pose tracking in task space. Different from conventional whole-body loco-manipulation problems that track both floating-base and end-effector commands, the direct EE pose tracking problem requires inherent balance among redundant degrees of freedom in the whole-body motion. We leverage RL to solve this challenging problem. To address the associated difficulties, we develop a novel reward fusion module (RFM) that systematically integrates reward terms corresponding to different tasks in a nonlinear manner. In such a way, the inherent multi-stage and hierarchical feature of the loco-manipulation problem can be carefully accommodated. By combining the proposed RFM with the a teacher-student RL training paradigm, we present a complete RL scheme to achieve 6D EE pose tracking for the wheeled-quadruped manipulator robot. Extensive simulation and hardware experiments demonstrate the significance of the RFM. In particular, we enable smooth and precise tracking performance, achieving state-of-the-art tracking position error of less than 5 cm, and rotation error of less than 0.1 rad. Please refer to https://clearlab-sustech.github.io/RFM_loco_mani/ for more experimental videos.
BridgeDepth: Bridging Monocular and Stereo Reasoning with Latent Alignment ICCV 2025
Monocular and stereo depth estimation offer complementary strengths: monocular methods capture rich contextual priors but lack geometric precision, while stereo approaches leverage epipolar geometry yet struggle with ambiguities such as reflective or textureless surfaces. Despite post-hoc synergies, these paradigms remain largely disjoint in practice. We introduce a unified framework that bridges both through iterative bidirectional alignment of their latent representations. At its core, a novel cross-attentive alignment mechanism dynamically synchronizes monocular contextual cues with stereo hypothesis representations during stereo reasoning. This mutual alignment resolves stereo ambiguities (e.g., specular surfaces) by injecting monocular structure priors while refining monocular depth with stereo geometry within a single network. Extensive experiments demonstrate state-of-the-art results: \textbf{it reduces zero-shot generalization error by $\!>\!40\%$ on Middlebury and ETH3D}, while addressing longstanding failures on transparent and reflective surfaces. By harmonizing multi-view geometry with monocular context, our approach enables robust 3D perception that transcends modality-specific limitations. Codes available at https://github.com/aeolusguan/BridgeDepth.
comment: ICCV 2025 Highlight
MetaFold: Language-Guided Multi-Category Garment Folding Framework via Trajectory Generation and Foundation Model
Garment folding is a common yet challenging task in robotic manipulation. The deformability of garments leads to a vast state space and complex dynamics, which complicates precise and fine-grained manipulation. Previous approaches often rely on predefined key points or demonstrations, limiting their generalization across diverse garment categories. This paper presents a framework, MetaFold, that disentangles task planning from action prediction, learning each independently to enhance model generalization. It employs language-guided point cloud trajectory generation for task planning and a low-level foundation model for action prediction. This structure facilitates multi-category learning, enabling the model to adapt flexibly to various user instructions and folding tasks. Experimental results demonstrate the superiority of our proposed framework. Supplementary materials are available on our website: https://meta-fold.github.io/.
Episodic Memory Verbalization using Hierarchical Representations of Life-Long Robot Experience
Verbalization of robot experience, i.e., summarization of and question answering about a robot's past, is a crucial ability for improving human-robot interaction. Previous works applied rule-based systems or fine-tuned deep models to verbalize short (several-minute-long) streams of episodic data, limiting generalization and transferability. In our work, we apply large pretrained models to tackle this task with zero or few examples, and specifically focus on verbalizing life-long experiences. For this, we derive a tree-like data structure from episodic memory (EM), with lower levels representing raw perception and proprioception data, and higher levels abstracting events to natural language concepts. Given such a hierarchical representation built from the experience stream, we apply a large language model as an agent to interactively search the EM given a user's query, dynamically expanding (initially collapsed) tree nodes to find the relevant information. The approach keeps computational costs low even when scaling to months of robot experience data. We evaluate our method on simulated household robot data, human egocentric videos, and real-world robot recordings, demonstrating its flexibility and scalability.
comment: Humanoids 2025. Code, data and demo videos at https://hierarchical-emv.github.io
On learning racing policies with reinforcement learning IROS 2025
Fully autonomous vehicles promise enhanced safety and efficiency. However, ensuring reliable operation in challenging corner cases requires control algorithms capable of performing at the vehicle limits. We address this requirement by considering the task of autonomous racing and propose solving it by learning a racing policy using Reinforcement Learning (RL). Our approach leverages domain randomization, actuator dynamics modeling, and policy architecture design to enable reliable and safe zero-shot deployment on a real platform. Evaluated on the F1TENTH race car, our RL policy not only surpasses a state-of-the-art Model Predictive Control (MPC), but, to the best of our knowledge, also represents the first instance of an RL policy outperforming expert human drivers in RC racing. This work identifies the key factors driving this performance improvement, providing critical insights for the design of robust RL-based control strategies for autonomous vehicles.
comment: This paper has been accepted for publication in the Proceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
SWA-SOP: Spatially-aware Window Attention for Semantic Occupancy Prediction in Autonomous Driving
Perception systems in autonomous driving rely on sensors such as LiDAR and cameras to perceive the 3D environment. However, due to occlusions and data sparsity, these sensors often fail to capture complete information. Semantic Occupancy Prediction (SOP) addresses this challenge by inferring both occupancy and semantics of unobserved regions. Existing transformer-based SOP methods lack explicit modeling of spatial structure in attention computation, resulting in limited geometric awareness and poor performance in sparse or occluded areas. To this end, we propose Spatially-aware Window Attention (SWA), a novel mechanism that incorporates local spatial context into attention. SWA significantly improves scene completion and achieves state-of-the-art results on LiDAR-based SOP benchmarks. We further validate its generality by integrating SWA into a camera-based SOP pipeline, where it also yields consistent gains across modalities.
comment: 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Vienna, Austria, Oct 2025
OC-SOP: Enhancing Vision-Based 3D Semantic Occupancy Prediction by Object-Centric Awareness
Autonomous driving perception faces significant challenges due to occlusions and incomplete scene data in the environment. To overcome these issues, the task of semantic occupancy prediction (SOP) is proposed, which aims to jointly infer both the geometry and semantic labels of a scene from images. However, conventional camera-based methods typically treat all categories equally and primarily rely on local features, leading to suboptimal predictions, especially for dynamic foreground objects. To address this, we propose Object-Centric SOP (OC-SOP), a framework that integrates high-level object-centric cues extracted via a detection branch into the semantic occupancy prediction pipeline. This object-centric integration significantly enhances the prediction accuracy for foreground objects and achieves state-of-the-art performance among all categories on SemanticKITTI.
comment: 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Vienna, Austria, Oct 2025
Barriers on the EDGE: A scalable CBF architecture over EDGE for safe aerial-ground multi-agent coordination
In this article, we propose a control architecture for the safe, coordinated operation of a multi-agent system with aerial (UAVs) and ground (UGVs) robots in a confined task space. We consider the case where the aerial and ground operations are coupled, enabled by the capability of the aerial robots to land on moving ground robots. The proposed method uses time-varying Control Barrier Functions (CBFs) to impose safety constraints associated with (i) collision avoidance between agents, (ii) landing of UAVs on mobile UGVs, and (iii) task space restriction. Further, this article addresses the challenge induced by the rapid increase in the number of CBF constraints with the increasing number of agents through a hybrid centralized-distributed coordination approach that determines the set of CBF constraints that is relevant for every aerial and ground agent at any given time. A centralized node (Watcher), hosted by an edge computing cluster, activates the relevant constraints, thus reducing the network complexity and the need for high onboard processing on the robots. The CBF constraints are enforced in a distributed manner by individual robots that run a nominal controller and safety filter locally to overcome latency and other network nonidealities.
comment: 6 pages, 2 figures, first draft of a paper currently under review
Navigating Robot Swarm Through a Virtual Tube with Flow-Adaptive Distribution Control
With the rapid development of robot swarm technology and its diverse applications, navigating robot swarms through complex environments has emerged as a critical research direction. To ensure safe navigation and avoid potential collisions with obstacles, the concept of virtual tubes has been introduced to define safe and navigable regions. However, current control methods in virtual tubes face the congestion issues, particularly in narrow ones with low throughput. To address these challenges, we first propose a novel control method that combines a modified artificial potential field (APF) for swarm navigation and density feedback control for distribution regulation. Then we generate a global velocity field that not only ensures collision-free navigation but also achieves locally input-to-state stability (LISS) for density tracking. Finally, numerical simulations and realistic applications validate the effectiveness and advantages of the proposed method in navigating robot swarms through narrow virtual tubes.
comment: 8 pages(brief paper), 12 figures
RIZE: Regularized Imitation Learning via Distributional Reinforcement Learning
We propose a novel Inverse Reinforcement Learning (IRL) method that mitigates the rigidity of fixed reward structures and the limited flexibility of implicit reward regularization. Building on the Maximum Entropy IRL framework, our approach incorporates a squared temporal-difference (TD) regularizer with adaptive targets that evolve dynamically during training, thereby imposing adaptive bounds on recovered rewards and promoting robust decision-making. To capture richer return information, we integrate distributional RL into the learning process. Empirically, our method achieves expert-level performance on complex MuJoCo tasks, surpassing baseline methods on the Humanoid task with 3 demonstrations. Extensive experiments and ablation studies further validate the effectiveness of the approach and provide insights into reward dynamics in imitation learning.
comment: Major revision - completely rewritten mathematical formulation and proofs, with substantial updates to methodology and expanded appendix for supporting derivations
Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning ICML 2025
For continuous action spaces, actor-critic methods are widely used in online reinforcement learning (RL). However, unlike RL algorithms for discrete actions, which generally model the optimal value function using the Bellman optimality operator, RL algorithms for continuous actions typically model Q-values for the current policy using the Bellman operator. These algorithms for continuous actions rely exclusively on policy updates for improvement, which often results in low sample efficiency. This study examines the effectiveness of incorporating the Bellman optimality operator into actor-critic frameworks. Experiments in a simple environment show that modeling optimal values accelerates learning but leads to overestimation bias. To address this, we propose an annealing approach that gradually transitions from the Bellman optimality operator to the Bellman operator, thereby accelerating learning while mitigating bias. Our method, combined with TD3 and SAC, significantly outperforms existing approaches across various locomotion and manipulation tasks, demonstrating improved performance and robustness to hyperparameters related to optimality. The code for this study is available at https://github.com/motokiomura/annealed-q-learning.
comment: Accepted at ICML 2025. Source code: https://github.com/motokiomura/annealed-q-learning
Chemist Eye: A Visual Language Model-Powered System for Safety Monitoring and Robot Decision-Making in Self-Driving Laboratories
The integration of robotics and automation into self-driving laboratories (SDLs) can introduce additional safety complexities, in addition to those that already apply to conventional research laboratories. Personal protective equipment (PPE) is an essential requirement for ensuring the safety and well-being of workers in laboratories, self-driving or otherwise. Fires are another important risk factor in chemical laboratories. In SDLs, fires that occur close to mobile robots, which use flammable lithium batteries, could have increased severity. Here, we present Chemist Eye, a distributed safety monitoring system designed to enhance situational awareness in SDLs. The system integrates multiple stations equipped with RGB, depth, and infrared cameras, designed to monitor incidents in SDLs. Chemist Eye is also designed to spot workers who have suffered a potential accident or medical emergency, PPE compliance and fire hazards. To do this, Chemist Eye uses decision-making driven by a vision-language model (VLM). Chemist Eye is designed for seamless integration, enabling real-time communication with robots. Based on the VLM recommendations, the system attempts to drive mobile robots away from potential fire locations, exits, or individuals not wearing PPE, and issues audible warnings where necessary. It also integrates with third-party messaging platforms to provide instant notifications to lab personnel. We tested Chemist Eye with real-world data from an SDL equipped with three mobile robots and found that the spotting of possible safety hazards and decision-making performances reached 97 % and 95 %, respectively.
GraspClutter6D: A Large-scale Real-world Dataset for Robust Perception and Grasping in Cluttered Scenes
Robust grasping in cluttered environments remains an open challenge in robotics. While benchmark datasets have significantly advanced deep learning methods, they mainly focus on simplistic scenes with light occlusion and insufficient diversity, limiting their applicability to practical scenarios. We present GraspClutter6D, a large-scale real-world grasping dataset featuring: (1) 1,000 highly cluttered scenes with dense arrangements (14.1 objects/scene, 62.6\% occlusion), (2) comprehensive coverage across 200 objects in 75 environment configurations (bins, shelves, and tables) captured using four RGB-D cameras from multiple viewpoints, and (3) rich annotations including 736K 6D object poses and 9.3B feasible robotic grasps for 52K RGB-D images. We benchmark state-of-the-art segmentation, object pose estimation, and grasp detection methods to provide key insights into challenges in cluttered environments. Additionally, we validate the dataset's effectiveness as a training resource, demonstrating that grasping networks trained on GraspClutter6D significantly outperform those trained on existing datasets in both simulation and real-world experiments. The dataset, toolkit, and annotation tools are publicly available on our project website: https://sites.google.com/view/graspclutter6d.
DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes
We introduce DualMap, an online open-vocabulary mapping system that enables robots to understand and navigate dynamically changing environments through natural language queries. Designed for efficient semantic mapping and adaptability to changing environments, DualMap meets the essential requirements for real-world robot navigation applications. Our proposed hybrid segmentation frontend and object-level status check eliminate the costly 3D object merging required by prior methods, enabling efficient online scene mapping. The dual-map representation combines a global abstract map for high-level candidate selection with a local concrete map for precise goal-reaching, effectively managing and updating dynamic changes in the environment. Through extensive experiments in both simulation and real-world scenarios, we demonstrate state-of-the-art performance in 3D open-vocabulary segmentation, efficient scene mapping, and online language-guided navigation.Project page: https://eku127.github.io/DualMap/
comment: 14 pages, 14 figures. Code: https://github.com/Eku127/DualMap Project page: https://eku127.github.io/DualMap/
Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
Distilling knowledge from human demonstrations is a promising way for robots to learn and act. Existing methods, which often rely on coarsely-aligned video pairs, are typically constrained to learning global or task-level features. As a result, they tend to neglect the fine-grained frame-level dynamics required for complex manipulation and generalization to novel tasks. We posit that this limitation stems from a vicious circle of inadequate datasets and the methods they inspire. To break this cycle, we propose a paradigm shift that treats fine-grained human-robot alignment as a conditional video generation problem. To this end, we first introduce H&R, a novel third-person dataset containing 2,600 episodes of precisely synchronized human and robot motions, collected using a VR teleoperation system. We then present Human2Robot, a framework designed to leverage this data. Human2Robot employs a Video Prediction Model to learn a rich and implicit representation of robot dynamics by generating robot videos from human input, which in turn guides a decoupled action decoder. Our real-world experiments demonstrate that this approach not only achieves high performance on seen tasks but also exhibits significant one-shot generalization to novel positions, objects, instances, and even new task categories.
Responsive Noise-Relaying Diffusion Policy: Responsive and Efficient Visuomotor Control
Imitation learning is an efficient method for teaching robots a variety of tasks. Diffusion Policy, which uses a conditional denoising diffusion process to generate actions, has demonstrated superior performance, particularly in learning from multi-modal demonstrates. However, it relies on executing multiple actions predicted from the same inference step to retain performance and prevent mode bouncing, which limits its responsiveness, as actions are not conditioned on the most recent observations. To address this, we introduce Responsive Noise-Relaying Diffusion Policy (RNR-DP), which maintains a noise-relaying buffer with progressively increasing noise levels and employs a sequential denoising mechanism that generates immediate, noise-free actions at the head of the sequence, while appending noisy actions at the tail. This ensures that actions are responsive and conditioned on the latest observations, while maintaining motion consistency through the noise-relaying buffer. This design enables the handling of tasks requiring responsive control, and accelerates action generation by reusing denoising steps. Experiments on response-sensitive tasks demonstrate that, compared to Diffusion Policy, ours achieves 18% improvement in success rate. Further evaluation on regular tasks demonstrates that RNR-DP also exceeds the best acceleration method (DDIM) by 6.9% in success rate, highlighting its computational efficiency advantage in scenarios where responsiveness is less critical. Our project page is available at https://rnr-dp.github.io
comment: Project website: https://rnr-dp.github.io
Deep Learning Warm Starts for Trajectory Optimization on the International Space Station
Trajectory optimization is a cornerstone of modern robot autonomy, enabling systems to compute trajectories and controls in real-time while respecting safety and physical constraints. However, it has seen limited usage in spaceflight applications due to its heavy computational demands that exceed the capability of most flight computers. In this work, we provide results on the first flight demonstration of using machine learning-based warm starts for accelerating trajectory optimization for the Astrobee free-flying robot on-board the International Space Station (ISS). We formulate a data-driven optimal control approach that trains a neural network to learn the structure of the trajectory generation problem being solved for by sequential convex programming (SCP). On-board, this trained neural network predicts solutions for the trajectory generation problem and relies on using the SCP solver to enforce safety constraints for the system. Our trained network reduces the number of solver iterations required for convergence in cases including rotational dynamics by 60% and in cases with obstacles drawn from the training distribution of the warm start model by 50%. This work represents a significant milestone in the use of learning-based control for spaceflight applications and a stepping stone for future advances in the use of machine learning for autonomous guidance, navigation, & control.
comment: Submitted to 2025 International Conference on Space Robotics (iSpaRo). Presented at RSS 2025 Workshop on Space Robotics
Audio-3DVG: Unified Audio -- Point Cloud Fusion for 3D Visual Grounding
3D Visual Grounding (3DVG) involves localizing target objects in 3D point clouds based on natural language. While prior work has made strides using textual descriptions, leveraging spoken language-known as Audio-based 3D Visual Grounding-remains underexplored and challenging. Motivated by advances in automatic speech recognition (ASR) and speech representation learning, we propose Audio-3DVG, a simple yet effective framework that integrates audio and spatial information for enhanced grounding. Rather than treating speech as a monolithic input, we decompose the task into two complementary components. First, we introduce (i) Object Mention Detection, a multi-label classification task that explicitly identifies which objects are referred to in the audio, enabling more structured audio-scene reasoning. Second, we propose an (ii) Audio-Guided Attention module that models the interactions between target candidates and mentioned objects, enhancing discrimination in cluttered 3D environments. To support benchmarking, we (iii) synthesize audio descriptions for standard 3DVG datasets, including ScanRefer, Sr3D, and Nr3D. Experimental results demonstrate that Audio-3DVG not only achieves new state-of-the-art performance in audio-based grounding, but also competes with text-based methods, highlight the promise of integrating spoken language into 3D vision tasks.
comment: Preprint, 51 pages
Detection and Tracking of MAVs Using a Rosette Scanning Pattern LiDAR
The use of commercial Micro Aerial Vehicles (MAVs) has surged in the past decade, offering societal benefits but also raising risks such as airspace violations and privacy concerns. Due to the increased security risks, the development of autonomous drone detection and tracking systems has become a priority. In this study, we tackle this challenge, by using non-repetitive rosette scanning pattern LiDARs, particularly focusing on increasing the detection distance by leveraging the characteristics of the sensor. The presented method utilizes a particle filter with a velocity component for the detection and tracking of the drone, which offers added re-detection capability. A Pan-Tilt platform is utilized to take advantage of the specific characteristics of the rosette scanning pattern LiDAR by keeping the tracked object in the center where the measurement is most dense. The detection capabilities and accuracy of the system are validated through indoor experiments, while the maximum detection distance is shown in our outdoor experiments. Our approach achieved accuracy on par with the state-of-the-art indoor method while increasing the maximum detection range by approximately 80\% beyond the state-of-the-art outdoor method.
MPPI-Generic: A CUDA Library for Stochastic Trajectory Optimization
This paper introduces a new C++/CUDA library for GPU-accelerated stochastic optimization called MPPI-Generic. It provides implementations of Model Predictive Path Integral control, Tube-Model Predictive Path Integral Control, and Robust Model Predictive Path Integral Control, and allows for these algorithms to be used across many pre-existing dynamics models and cost functions. Furthermore, researchers can create their own dynamics models or cost functions following our API definitions without needing to change the actual Model Predictive Path Integral Control code. Finally, we compare computational performance to other popular implementations of Model Predictive Path Integral Control over a variety of GPUs to show the real-time capabilities our library can allow for. Library code can be found at: https://acdslab.github.io/mppi-generic-website/ .
comment: Added missing Acknowledgements section
Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning
Deep reinforcement learning (DRL) has demonstrated remarkable performance in many continuous control tasks. However, a significant obstacle to the real-world application of DRL is the lack of safety guarantees. Although DRL agents can satisfy system safety in expectation through reward shaping, designing agents to consistently meet hard constraints (e.g., safety specifications) at every time step remains a formidable challenge. In contrast, existing work in the field of safe control provides guarantees on persistent satisfaction of hard safety constraints. However, these methods require explicit analytical system dynamics models to synthesize safe control, which are typically inaccessible in DRL settings. In this paper, we present a model-free safe control algorithm, the implicit safe set algorithm, for synthesizing safeguards for DRL agents that ensure provable safety throughout training. The proposed algorithm synthesizes a safety index (barrier certificate) and a subsequent safe control law solely by querying a black-box dynamic function (e.g., a digital twin simulator). Moreover, we theoretically prove that the implicit safe set algorithm guarantees finite time convergence to the safe set and forward invariance for both continuous-time and discrete-time systems. We validate the proposed algorithm on the state-of-the-art Safety Gym benchmark, where it achieves zero safety violations while gaining $95\% \pm 9\%$ cumulative reward compared to state-of-the-art safe DRL methods. Furthermore, the resulting algorithm scales well to high-dimensional systems with parallel computing.
comment: Accepted to Journal of Artificial Intelligence Research. arXiv admin note: text overlap with arXiv:2308.13140
Multiagent Systems
Online Safety under Multiple Constraints and Input Bounds using gatekeeper: Theory and Applications
This letter presents an approach to guarantee online safety of a cyber-physical system under multiple state and input constraints. Our proposed framework, called gatekeeper, recursively guarantees the existence of an infinite-horizon trajectory that satisfies all constraints and system dynamics. Such trajectory is constructed using a backup controller, which we define formally in this paper. gatekeeper relies on a small number of verifiable assumptions, and is computationally efficient since it requires optimization over a single scalar variable. We make two primary contributions in this letter. (A) First, we develop the theory of gatekeeper: we derive a sub-optimality bound relative to a full nonlinear trajectory optimization problem, and show how this can be used in runtime to validate performance. This also informs the design of the backup controllers and sets. (B) Second, we demonstrate in detail an application of gatekeeper for multi-agent formation flight, where each Dubins agent must avoid multiple obstacles and weapons engagement zones, both of which are nonlinear, nonconvex constraints.
comment: 6 pages, 2 figures. Accepted for publication in IEEE L-CSS 2025
Extending the OWASP Multi-Agentic System Threat Modeling Guide: Insights from Multi-Agent Security Research
We propose an extension to the OWASP Multi-Agentic System (MAS) Threat Modeling Guide, translating recent anticipatory research in multi-agent security (MASEC) into practical guidance for addressing challenges unique to large language model (LLM)-driven multi-agent architectures. Although OWASP's existing taxonomy covers many attack vectors, our analysis identifies gaps in modeling failures, including, but not limited to: reasoning collapse across planner-executor chains, metric overfitting, unsafe delegation escalation, emergent covert coordination, and heterogeneous multi-agent exploits. We introduce additional threat classes and scenarios grounded in practical MAS deployments, highlighting risks from benign goal drift, cross-agent hallucination propagation, affective prompt framing, and multi-agent backdoors. We also outline evaluation strategies, including robustness testing, coordination assessment, safety enforcement, and emergent behavior monitoring, to ensure complete coverage. This work complements the framework of OWASP by expanding its applicability to increasingly complex, autonomous, and adaptive multi-agent systems, with the goal of improving security posture and resilience in real world deployments.
Emergence of Hierarchies in Multi-Agent Self-Organizing Systems Pursuing a Joint Objective
Multi-agent self-organizing systems (MASOS) exhibit key characteristics including scalability, adaptability, flexibility, and robustness, which have contributed to their extensive application across various fields. However, the self-organizing nature of MASOS also introduces elements of unpredictability in their emergent behaviors. This paper focuses on the emergence of dependency hierarchies during task execution, aiming to understand how such hierarchies arise from agents' collective pursuit of the joint objective, how they evolve dynamically, and what factors govern their development. To investigate this phenomenon, multi-agent reinforcement learning (MARL) is employed to train MASOS for a collaborative box-pushing task. By calculating the gradients of each agent's actions in relation to the states of other agents, the inter-agent dependencies are quantified, and the emergence of hierarchies is analyzed through the aggregation of these dependencies. Our results demonstrate that hierarchies emerge dynamically as agents work towards a joint objective, with these hierarchies evolving in response to changing task requirements. Notably, these dependency hierarchies emerge organically in response to the shared objective, rather than being a consequence of pre-configured rules or parameters that can be fine-tuned to achieve specific results. Furthermore, the emergence of hierarchies is influenced by the task environment and network initialization conditions. Additionally, hierarchies in MASOS emerge from the dynamic interplay between agents' "Talent" and "Effort" within the "Environment." "Talent" determines an agent's initial influence on collective decision-making, while continuous "Effort" within the "Environment" enables agents to shift their roles and positions within the system.
comment: 34 pages,17 figures
REALISM: A Regulatory Framework for Coordinated Scheduling in Multi-Operator Shared Micromobility Services SP
Shared micromobility (e.g., shared bikes and electric scooters), as a kind of emerging urban transportation, has become more and more popular in the world. However, the blooming of shared micromobility vehicles brings some social problems to the city (e.g., overloaded vehicles on roads, and the inequity of vehicle deployment), which deviate from the city regulator's expectation of the service of the shared micromobility system. In addition, the multi-operator shared micromobility system in a city complicates the problem because of their non-cooperative self-interested pursuits. Existing regulatory frameworks of multi-operator vehicle rebalancing generally assume the intrusive control of vehicle rebalancing of all the operators, which is not practical in the real world. To address this limitation, we design REALISM, a regulatory framework for coordinated scheduling in multi-operator shared micromobility services that incorporates the city regulator's regulations in the form of assigning a score to each operator according to the city goal achievements and operators' individual contributions to achieving the city goal, measured by Shapley value. To realize the fairness-aware score assignment, we measure the fairness of assigned scores and use them as one of the components to optimize the score assignment model. To optimize the whole framework, we develop an alternating procedure to make operators and the city regulator interact with each other until convergence. We evaluate our framework based on real-world e-scooter usage data in Chicago. Our experiment results show that our method achieves a performance gain of at least 39.93% in the equity of vehicle usage and 1.82% in the average demand satisfaction of the whole city.
comment: 11 pages, 7 figures, SIGSPATIAL 2025
FPT-Approximability of Stable Matching Problems
We study parameterized approximability of three optimization problems related to stable matching: (1) Min-BP-SMI: Given a stable marriage instance and a number k, find a size-at-least-k matching that minimizes the number $\beta$ of blocking pairs; (2) Min-BP-SRI: Given a stable roommates instance, find a matching that minimizes the number $\beta$ of blocking pairs; (3) Max-SMTI: Given a stable marriage instance with preferences containing ties, find a maximum-size stable matching. The first two problems are known to be NP-hard to approximate to any constant factor and W[1]-hard with respect to $\beta$, making the existence of an EPTAS or FPT-algorithms unlikely. We show that they are W[1]-hard with respect to $\beta$ to approximate to any function of $\beta$. This means that unless FPT=W[1], there is no FPT-approximation scheme for the parameter $\beta$. The last problem (Max-SMTI) is known to be NP-hard to approximate to factor-29/33 and W[1]-hard with respect to the number of ties. We complement this and present an FPT-approximation scheme for the parameter "number of agents with ties".
Centralized Permutation Equivariant Policy for Cooperative Multi-Agent Reinforcement Learning
The Centralized Training with Decentralized Execution (CTDE) paradigm has gained significant attention in multi-agent reinforcement learning (MARL) and is the foundation of many recent algorithms. However, decentralized policies operate under partial observability and often yield suboptimal performance compared to centralized policies, while fully centralized approaches typically face scalability challenges as the number of agents increases. We propose Centralized Permutation Equivariant (CPE) learning, a centralized training and execution framework that employs a fully centralized policy to overcome these limitations. Our approach leverages a novel permutation equivariant architecture, Global-Local Permutation Equivariant (GLPE) networks, that is lightweight, scalable, and easy to implement. Experiments show that CPE integrates seamlessly with both value decomposition and actor-critic methods, substantially improving the performance of standard CTDE algorithms across cooperative benchmarks including MPE, SMAC, and RWARE, and matching the performance of state-of-the-art RWARE implementations.
A Minimal Model for Emergent Collective Behaviors in Autonomous Robotic Multi-Agent Systems
Collective behaviors such as swarming and flocking emerge from simple, decentralized interactions in biological systems. Existing models, such as Vicsek and Cucker-Smale, lack collision avoidance, whereas the Olfati-Saber model imposes rigid formations, limiting their applicability in swarm robotics. To address these limitations, this paper proposes a minimal yet expressive model that governs agent dynamics using relative positions, velocities, and local density, modulated by two tunable parameters: the spatial offset and kinetic offset. The model achieves spatially flexible, collision-free behaviors that reflect naturalistic group dynamics. Furthermore, we extend the framework to cognitive autonomous systems, enabling energy-aware phase transitions between swarming and flocking through adaptive control parameter tuning. This cognitively inspired approach offers a robust foundation for real-world applications in multi-robot systems, particularly autonomous aerial swarms.
comment: Accepted for IEEE Transactions on Cognitive and Developmental Systems. Simulation video for Fig. 8: https://youtube.com/shorts/StHrtnSJyyg Simulation video for Fig. 10: https://youtu.be/Z26m7M-63D4
miRKatAI: An Integrated Database and Multi-agent AI system for microRNA Research
MicroRNAs (miRs) are robust regulators of gene expression, implicated in most biological processes. microRNAs predominantly downregulate the expression of genes post-transcriptionally and each miR is predicted to target several hundred genes. The accurate identification and annotation of miR-mRNA target interactions is central to understanding miRs function and their therapeutic potential. However, computational target prediction is challenging due to imperfect complementarity of miRs with their targets and the growing volume and heterogeneity of experimental data present challenges in accessing, integrating, and analysing miR-target interaction information across biological contexts. This creates a need for integrated resources and intelligent query tools. We present the miRKat Suite, comprising miRKatDB, a comprehensive, curated database of predicted and validated miR-target interactions and associated annotations, and miRKatAI, a multi-agent system powered by large language models (LLMs) and LangGraph. miRKatDB integrates data from multiple publicly available sources, providing a comprehensive foundation for miR studies, including miR target genes and changes in levels of tissue expression previously reported. miRKatAI offers a natural language interface for complex querying of miRKatDB, facilitates grounded information retrieval from established sources in the field, and supports basic data visualisation. The miRKat Suite aims to accelerate miR research by streamlining data access, enhancing exploratory analysis, and supporting hypothesis generation.
comment: 10 pages, 1 figure, app note
Memp: Exploring Agent Procedural Memory
Large Language Models (LLMs) based agents excel at diverse tasks, yet they suffer from brittle procedural memory that is manually engineered or entangled in static parameters. In this work, we investigate strategies to endow agents with a learnable, updatable, and lifelong procedural memory. We propose Memp that distills past agent trajectories into both fine-grained, step-by-step instructions and higher-level, script-like abstractions, and explore the impact of different strategies for Build, Retrieval, and Update of procedural memory. Coupled with a dynamic regimen that continuously updates, corrects, and deprecates its contents, this repository evolves in lockstep with new experience. Empirical evaluation on TravelPlanner and ALFWorld shows that as the memory repository is refined, agents achieve steadily higher success rates and greater efficiency on analogous tasks. Moreover, procedural memory built from a stronger model retains its value: migrating the procedural memory to a weaker model yields substantial performance gains.
comment: Work in progress
Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning
Actor-critic methods for decentralized multi-agent reinforcement learning (MARL) facilitate collaborative optimal decision making without centralized coordination, thus enabling a wide range of applications in practice. To date, however, most theoretical convergence studies for existing actor-critic decentralized MARL methods are limited to the guarantee of a stationary solution under the linear function approximation. This leaves a significant gap between the highly successful use of deep neural actor-critic for decentralized MARL in practice and the current theoretical understanding. To bridge this gap, in this paper, we make the first attempt to develop a deep neural actor-critic method for decentralized MARL, where both the actor and critic components are inherently non-linear. We show that our proposed method enjoys a global optimality guarantee with a finite-time convergence rate of O(1/T), where T is the total iteration times. This marks the first global convergence result for deep neural actor-critic methods in the MARL literature. We also conduct extensive numerical experiments, which verify our theoretical results.
Smooth Games of Configuration in the Linear-Quadratic Setting
Dynamic game theory offers a toolbox for formalizing and solving for both cooperative and non-cooperative strategies in multi-agent scenarios. However, the optimal configuration of such games remains largely unexplored. While there is existing literature on the parametrization of dynamic games, little research examines this parametrization from a strategic perspective where each agent's configuration choice is influenced by the decisions of others. In this work, we introduce the concept of a game of configuration, providing a framework for the strategic fine-tuning of differential games. We define a game of configuration as a two-stage game within the setting of finite-horizon, affine-quadratic, AQ, differential games. In the first stage, each player chooses their corresponding configuration parameter, which will impact their dynamics and costs in the second stage. We provide the subgame perfect solution concept and a method for computing first stage cost gradients over the configuration space. This then allows us to formulate a gradient-based method for searching for local solutions to the configuration game, as well as provide necessary conditions for equilibrium configurations over their downstream (second stage) trajectories. We conclude by demonstrating the effectiveness of our approach in example AQ systems, both zero-sum and general-sum.
Competitive Algorithms for Multi-Agent Ski-Rental Problems
This paper introduces a novel multi-agent ski-rental problem that generalizes the classical ski-rental dilemma to a group setting where agents incur individual and shared costs. In our model, each agent can either rent at a fixed daily cost, or purchase a pass at an individual cost, with an additional third option of a discounted group pass available to all. We consider scenarios in which agents' active days differ, leading to dynamic states as agents drop out of the decision process. To address this problem from different perspectives, we define three distinct competitive ratios: overall, state-dependent, and individual rational. For each objective, we design and analyze optimal deterministic and randomized policies. Our deterministic policies employ state-aware threshold functions that adapt to the dynamic states, while our randomized policies sample and resample thresholds from tailored state-aware distributions. The analysis reveals that symmetric policies, in which all agents use the same threshold, outperform asymmetric ones. Our results provide competitive ratio upper and lower bounds and extend classical ski-rental insights to multi-agent settings, highlighting both theoretical and practical implications for group decision-making under uncertainty.
Game-Theoretic Multiagent Reinforcement Learning
Tremendous advances have been made in multiagent reinforcement learning (MARL). MARL corresponds to the learning problem in a multiagent system in which multiple agents learn simultaneously. It is an interdisciplinary field of study with a long history that includes game theory, machine learning, stochastic control, psychology, and optimization. Despite great successes in MARL, there is a lack of a self-contained overview of the literature that covers game-theoretic foundations of modern MARL methods and summarizes the recent advances. The majority of existing surveys are outdated and do not fully cover the recent developments since 2010. In this work, we provide a monograph on MARL that covers both the fundamentals and the latest developments on the research frontier. The goal of this monograph is to provide a self-contained assessment of the current state-of-the-art MARL techniques from a game-theoretic perspective. We expect this work to serve as a stepping stone for both new researchers who are about to enter this fast-growing field and experts in the field who want to obtain a panoramic view and identify new directions based on recent advances.
Systems and Control (CS)
Online Safety under Multiple Constraints and Input Bounds using gatekeeper: Theory and Applications
This letter presents an approach to guarantee online safety of a cyber-physical system under multiple state and input constraints. Our proposed framework, called gatekeeper, recursively guarantees the existence of an infinite-horizon trajectory that satisfies all constraints and system dynamics. Such trajectory is constructed using a backup controller, which we define formally in this paper. gatekeeper relies on a small number of verifiable assumptions, and is computationally efficient since it requires optimization over a single scalar variable. We make two primary contributions in this letter. (A) First, we develop the theory of gatekeeper: we derive a sub-optimality bound relative to a full nonlinear trajectory optimization problem, and show how this can be used in runtime to validate performance. This also informs the design of the backup controllers and sets. (B) Second, we demonstrate in detail an application of gatekeeper for multi-agent formation flight, where each Dubins agent must avoid multiple obstacles and weapons engagement zones, both of which are nonlinear, nonconvex constraints.
comment: 6 pages, 2 figures. Accepted for publication in IEEE L-CSS 2025
Collision-Free Bearing-Driven Formation Tracking for Euler-Lagrange Systems
In this paper, we investigate the problem of tracking formations driven by bearings for heterogeneous Euler-Lagrange systems with parametric uncertainty in the presence of multiple moving leaders. To estimate the leaders' velocities and accelerations, we first design a distributed observer for the leader system, utilizing a bearing-based localization condition in place of the conventional connectivity assumption. This observer, coupled with an adaptive mechanism, enables the synthesis of a novel distributed control law that guides the formation towards the target formation, without requiring prior knowledge of the system parameters. Furthermore, we establish a sufficient condition, dependent on the initial formation configuration, that ensures collision avoidance throughout the formation evolution. The effectiveness of the proposed approach is demonstrated through a numerical example.
comment: 10 pages, 4 figures
A Shank Angle-Based Control System Enables Soft Exoskeleton to Assist Human Non-Steady Locomotion
Exoskeletons have been shown to effectively assist humans during steady locomotion. However, their effects on non-steady locomotion, characterized by nonlinear phase progression within a gait cycle, remain insufficiently explored, particularly across diverse activities. This work presents a shank angle-based control system that enables the exoskeleton to maintain real-time coordination with human gait, even under phase perturbations, while dynamically shaping assistance profiles to match the biological ankle moment patterns across walking, running, stair negotiation tasks. The control system consists of an assistance profile online generation method and a model-based feedforward control method. The assistance profile is formulated as a dual-Gaussian model with the shank angle as the independent variable. Leveraging only IMU measurements, the model parameters are updated online each stride to adapt to inter- and intra-individual biomechanical variability. The profile tracking control employs a human-exoskeleton kinematics and stiffness model as a feedforward component, reducing reliance on historical control data due to the lack of clear and consistent periodicity in non-steady locomotion. Three experiments were conducted using a lightweight soft exoskeleton with multiple subjects. The results validated the effectiveness of each individual method, demonstrated the robustness of the control system against gait perturbations across various activities, and revealed positive biomechanical and physiological responses of human users to the exoskeleton's mechanical assistance.
comment: 49 pages, 20 figures, 4 tables
Integrated Learning and Optimization to Control Load Demand and Wind Generation for Minimizing Ramping Cost in Real-Time Electricity Market
We developed a new integrated learning and optimization (ILO) methodology to predict context-aware unknown parameters in economic dispatch (ED), a crucial problem in power systems solved to generate optimal power dispatching decisions to serve consumer load. The ED formulation in the current study consists of load and renewable generation as unknown parameters in its constraints predicted using contextual information (e.g., prior load, temperature). The ILO framework train a neural network (NN) to estimate ED parameters by minimizing an application-specific regret function which is a difference between ground truth and NN-driven decisions favouring better ED decisions. We thoroughly analyze the feasible region of ED formulation to understand the impact of load and renewable learning together on the ED decisions. Corresponding to that we developed a new regret function to capture real-time electricity market operations where differences in predicted and true loads are corrected by ramping generators in real-time but at a higher cost than the market price. The proposed regret function when minimized using ILO framework train the NN to guide the load and renewable predictions to generate ED decisions favouring minimum generator ramping costs. This is unlike conventional sequential learning and optimization (SLO) framework which train NN to accurately estimate load and renewable instead of better ED decisions. The combined training of load and renewable using ILO is a new concept and lead to significantly improved ramping costs when compared with SLO based training of load and renewable and SLO trained load with 100% accurate renewable proving its decision-focused capability.
Besondere Anforderungen des automatisierten Fahrens an den Entwurf
The development of automated vehicles and automated driving functions is an exceptionally complex task that requires the integration of numerous, sometimes conflicting interests and various constraints already in the early stages of system design. This chapter explains important challenges in concept specifications for automated driving and presents a systematic process model that contributes to overcoming the special requirements in this field. In addition, it describes the successful implementation of a structured concept specification for an automated vehicle guidance system. -- Die Entwicklung automatisierter Fahrzeuge und Fahrfunktionen stellt eine ausgesprochen komplexe Aufgabe dar, die bereits im Zuge des Systementwurfs die Einbeziehung einer Vielzahl teilweise konflikt\"arer Interessen und diverser Randbedingungen erfordert. Dieses Kapitel erl\"autert wichtige Herausforderungen bei Konzeptspezifikationen im Themenfeld des automatisierten Fahrens und stellt ein systematisches Prozessmodell vor, das einen Beitrag zur Erf\"ullung der besonderen Anforderungen des automatisierten Fahrens an den Entwurf leistet. Dar\"uber hinaus wird die erfolgreiche Durchf\"uhrung einer strukturierten Konzeptspezifikation f\"ur ein automatisiertes Fahrzeugf\"uhrungssystem beschrieben.
comment: Preprint of https://link.springer.com/chapter/10.1007/978-3-658-38486-9_45, published in Handbuch Assistiertes und Automatisiertes Fahren, in German
A Divide-and-Conquer Tiling Method for the Design of Large Aperiodic Phased Arrays
Due to the growing request from modern wireless applications of cost-affordable and high-gain scanning antenna solutions, the design of large phased arrays (PAs) with radiating elements organized into modular clusters with sub-array-only amplitude and phase control is a key topic. In this paper, an innovative irregular tiling method is proposed where, according to a divide-and-conquer strategy, the antenna aperture is subdivided into sub-areas that are locally domino-tiled by jointly fulfilling the full-coverage condition on the remaining untiled part of the PA support. Selected representative results, including comparisons with competitive state-of-the-art synthesis methods, are reported to prove the effectiveness and the computational efficiency of the proposed tiling approach. Use-cases of current relevance for low Earth orbit (LEO) satellite communications are discussed, as well, to provide the antenna designers useful practical guidelines for handling large PAs.
Metering traffic flows for perimeter control through auction-based signalling using connected vehicles
Urban traffic congestion remains a critical challenge in modern cities, with traffic signal control systems often struggling to manage congestion during peak travel times. Perimeter control of a Protected Network (PN) has emerged as a potential solution to reducing gridlock in urban networks. This paper proposes a novel auction-based mechanism for green time allocation at signalized intersections, for effective perimeter control application. Utilising a Sealed Bid, Second Price auction framework, our approach combines real-time traffic monitoring with market-inspired mechanisms to regulate vehicle inflows into PN areas. Unlike existing methods that focus primarily on gated links, our system allocates budgets to individual traffic movements, providing greater flexibility in managing multi-directional flows. We evaluate the proposed mechanism using a test case intersection with a single controlled inflow, comparing it against a volume-based fixed-time approach. The results demonstrate that our auction-based method controls flows into the PN with improved accuracy, outperforming the volume-based approach in terms of inflow regulation, queue management and delays. The framework can be applied in real time to any generic intersection, offering a scalable solution for urban traffic management. This work bridges the gap between perimeter control and market-based intersection auctions, providing a pathway for further research on adaptive traffic management systems.
Duty-Cycling is Not Enough in Constrained IoT Networking: Revealing the Energy Savings of Dynamic Clock Scaling
Minimizing energy consumption of low-power wireless nodes is a persistent challenge from the constrained Internet of Things (IoT). In this paper, we start from the observation that constrained IoT devices have largely different hardware (im-)balances than full-scale machines. We find that the performance gap between MCU and network throughput on constrained devices enables minimal energy delay product (EDP) for IoT networking at largely reduced clock frequencies. We analyze the potentials by integrating dynamic voltage and frequency scaling (DVFS) into the RIOT IoT operating system and show that the DVFS reconfiguration overhead stays below the energy saved for a single, downscaled MAC operation. Backed by these findings, we systematically investigate how DVFS further improves energy-efficiency for common networking tasks -- in addition to duty-cycling. We measure IoT communication scenarios between real-world systems and analyze two MAC operating modes -- CSMA/CA and time slotting -- in combination with different CoAP transactions, payload sizes, as well as DTLS transport encryption. Our experiments reveal energy savings between 24% and 52% for MAC operations and up to 37% for encrypted CoAP communication. These results shall encourage research and system design work to integrate DVFS in future IoT devices for performing tasks at their optimal frequencies and thereby significantly extending battery lifetimes.
BEAVR: Bimanual, multi-Embodiment, Accessible, Virtual Reality Teleoperation System for Robots
\textbf{BEAVR} is an open-source, bimanual, multi-embodiment Virtual Reality (VR) teleoperation system for robots, designed to unify real-time control, data recording, and policy learning across heterogeneous robotic platforms. BEAVR enables real-time, dexterous teleoperation using commodity VR hardware, supports modular integration with robots ranging from 7-DoF manipulators to full-body humanoids, and records synchronized multi-modal demonstrations directly in the LeRobot dataset schema. Our system features a zero-copy streaming architecture achieving $\leq$35\,ms latency, an asynchronous ``think--act'' control loop for scalable inference, and a flexible network API optimized for real-time, multi-robot operation. We benchmark BEAVR across diverse manipulation tasks and demonstrate its compatibility with leading visuomotor policies such as ACT, DiffusionPolicy, and SmolVLA. All code is publicly available, and datasets are released on Hugging Face\footnote{Code, datasets, and VR app available at https://github.com/ARCLab-MIT/BEAVR-Bot.
comment: Accepted for presentation on ICCR Kyoto 2025
HapticGiant: A Novel Very Large Kinesthetic Haptic Interface with Hierarchical Force Control
Research in virtual reality and haptic technologies has consistently aimed to enhance immersion. While advanced head-mounted displays are now commercially available, kinesthetic haptic interfaces still face challenges such as limited workspaces, insufficient degrees of freedom, and kinematics not matching the human arm. In this paper, we present HapticGiant, a novel large-scale kinesthetic haptic interface designed to match the properties of the human arm as closely as possible and to facilitate natural user locomotion while providing full haptic feedback. The interface incorporates a novel admittance-type force control scheme, leveraging hierarchical optimization to render both arbitrary serial kinematic chains and Cartesian admittances. Notably, the proposed control scheme natively accounts for system limitations, including joint and Cartesian constraints, as well as singularities. Experimental results demonstrate the effectiveness of HapticGiant and its control scheme, paving the way for highly immersive virtual reality applications.
comment: Final Version - Accepted on IEEE Transactions on Haptics
Offline Auto Labeling: BAAS
This paper introduces BAAS, a new Extended Object Tracking (EOT) and fusion-based label annotation framework for radar detections in autonomous driving. Our framework utilizes Bayesian-based tracking, smoothing and eventually fusion methods to provide veritable and precise object trajectories along with shape estimation to provide annotation labels on the detection level under various supervision levels. Simultaneously, the framework provides evaluation of tracking performance and label annotation. If manually labeled data is available, each processing module can be analyzed independently or combined with other modules to enable closed-loop continuous improvements. The framework performance is evaluated in a challenging urban real-world scenario in terms of tracking performance and the label annotation errors. We demonstrate the functionality of the proposed approach for varying dynamic objects and class types
Shepherd Grid Strategy: Towards Reliable SWARM Interception
Modern unmanned aerial vehicle threats require sophisticated interception strategies that can overcome advanced evasion capabilities and operate effectively in contested environments. Traditional single-interceptor and uncoordinated multi-interceptor approaches suffer from fundamental limitations including inadequate coverage, predictable pursuit patterns, and vulnerability to intelligent evasion maneuvers. This paper introduces the Shepherd Grid Strategy, a new multi-phase coordination framework that employs pack-based behavioral coordination to achieve deterministic target interception through systematic containment and coordinated strike execution. The strategy implements a four-phase operational model consisting of chase, follow, formation, and engagement phases, with dynamic role assignment and adaptive formation geometry that maintains persistent target pressure while preparing optimal strike opportunities. Our approach incorporates three key innovations: adaptive phase transition mechanisms that optimize pursuit behavior based on proximity and mission objectives, dynamic role assignment systems that designate specialized interceptor functions including formation maintenance and strike execution, and predictive formation geometry algorithms that create mobile containment grids adapting to target movement patterns. The simulation experiments demonstrate significant performance improvements over traditional methods, achieving near-perfect interception success rates (over 95%) compared to traditional approaches (65%) and reducing median time-to-intercept.
comment: 9 pages, 5 figures
From Formal Methods to Data-Driven Safety Certificates of Unknown Large-Scale Networks
In this work, we propose a data-driven scheme within a compositional framework with noisy data to design robust safety controllers in a fully decentralized fashion for large-scale interconnected networks with unknown mathematical dynamics. Despite the network's high dimensionality and the inherent complexity of its unknown model, which make it intractable, our approach effectively addresses these challenges by (i) treating the network as a composition of smaller subsystems, and (ii) collecting noisy data from each subsystem's trajectory to design a control sub-barrier certificate (CSBC) and its corresponding local controller. To achieve this, our proposed scheme only requires a noise-corrupted single input-state trajectory from each unknown subsystem up to a specified time horizon, satisfying a certain rank condition. Subsequently, under a small-gain compositional reasoning, we compose those CSBC, derived from noisy data, and formulate a control barrier certificate (CBC) for the unknown network, ensuring its safety over an infinite time horizon, while providing correctness guarantees. We offer a data-dependent sum-of-squares (SOS) optimization program for computing CSBC alongside local controllers of subsystems. We illustrate that while the computational complexity of designing a CBC and its safety controller grows polynomially with network dimension using SOS optimization, our compositional data-driven approach significantly reduces it to a linear scale concerning the number of subsystems. We demonstrate the capability of our data-driven approach on multiple physical networks involving unknown models and a range of interconnection topologies.
From Micro to Macro Flow Modeling: Characterizing Heterogeneity of Mixed-Autonomy Traffic
Most autonomous-vehicles (AVs) driving strategies are designed and analyzed at the vehicle level, yet their aggregate impact on macroscopic traffic flow is still not understood, particularly the flow heterogeneity that emerges when AVs interact with human-driven vehicles (HVs). Existing validation techniques for macroscopic flow models rely on high-resolution spatiotemporal data spanning entire road segments which are rarely available for mixed-autonomy traffic. AVs record detailed Lagrangian trajectories of the ego vehicle and surrounding traffic through onboard sensors. Leveraging these Lagrangian observations to validate mixed-autonomy flow models therefore remains an open research challenge. This paper closes the gap between microscopic Lagrangian data and macroscopic Euclidean traffic models by introducing a continuous traffic-heterogeneity attribute. We represent traffic flow with two coupled conservation laws with one for vehicle number and one for the traffic attribute. Reconstruction methods are designed to derive the traffic attribute from Lagrangian vehicle trajectories. When abundant trajectory data are available, we characterize traffic heterogeneity by extracting drivers' desired speed and local behavioral uncertainty from trajectories. In data-scarce mixed traffic, we design an end-to-end mapping that infers the traffic heterogeneity solely from trajectories in the current spatiotemporal region. Experiments across multiple traffic datasets show that the proposed model effectively captures traffic heterogeneity by clustering the fundamental diagram scatter into attribute-based groups. The calibration errors of traffic flow dynamics are also reduce by 20% relative to the Aw-Rascle-Zhang model benchmark. Detailed analyses further show that the model generalizes well, maintaining nearly the same accuracy when evaluated under a variety of previously unseen traffic conditions.
Imperfect Competition in Markets for Short-Circuit Current Services
An important limitation of Inverter-Based Resources (IBR) is their reduced contribution to Short-Circuit Current (SCC), as compared to that of Synchronous Generators (SGs). With increasing penetration of IBR in most power systems, the reducing SCC poses challenges to a secure system operation, as line protections may not trip when required. In order to address this issue, the SCC ancillary service could be procured via an economic mechanism, aiming at securing adequate SCC on all buses. However, the suitability of markets for SCC services is not well understood, given that these could be prone to market-power issues: since the SCC contributions from various SGs to a certain bus are determined by the electrical topology of the grid, this is a highly local service. It is necessary to understand if SGs at advantageous electrical locations could exert market power and, if so, how it could be mitigated. In order to fill this gap, this paper adopts an SCC-constrained bilevel model to investigate strategic behaviors of SGs. To address the non-convexity due to unit commitment variables, the model is restructured through a primal-dual formulation. Based on a modified IEEE 30-bus system, cases with strategic SGs placed at different buses are analyzed. These studies demonstrate that agents exerting market power could achieve up to triple revenues from SCC provision, highlighting the need to carefully design these markets.
comment: Ancillary services, short-circuit current, market power, bilevel optimization, primal-dual formulation. A paper submitted IEEE
Control Systems Analysis of a 3-Axis Photovoltatic Solar Tracker for Water Pumping
We propose 3-axis solar tracker water pumping system. The solar tracker can rotate and tilt using stepper/DC motors and can rise and lower on a tripod using a linear actuator. The charge generated from solar energy absorbed by photovoltaic (PV) cells in the solar panel is stored in a 12V battery that in turn powers two water diaphragm pumps using a solar charge controller. The PV uses four light photocell resistors/sensors to measure light intensity. A solar tracking algorithm determines the optimal angle for PV positioning. Using an ultrasonic sensor to measure the water level in a reservoir water tank, water is pumped from one water tank to the reservoir. Based on soil moisture sensor levels, a second water pump supplies water from the reservoir to the plant. The system is analyzed from a control systems perspective. The transfer functions, root loci, and Bode plots are generated and simulated and experimental results are provided as well as stability and steady-state error analysis.
Design and Simulation of 6T SRAM Array
Conventional 6T SRAM is used in microprocessors in the cache memory design. The basic 6T SRAM cell and a 6 bit memory array layout are designed in LEdit. The design and analysis of key SRAM components, sense amplifiers, decoders, write drivers and precharge circuits are also provided. The pulse voltage waveforms generated for read and write operations as well as Q and Qbar nodes are simulated in LTSpice. Parasitic capacitances are extracted and their impact on the waveforms analyzed. Static noise margin, propagation delays, and power dissipation are calculated. Comparison of SRAM read and write operational performance using CMOS transistors is made with edge-triggered D flip flops. If certain size area and ratio constraints are satisfied, the 6T cell with CMOS transistors will possess stability, speed, and power efficiency. Both theoretical and simulated results are given.
Systematic Constraint Formulation and Collision-Free Trajectory Planning Using Space-Time Graphs of Convex Sets
In this paper, we create optimal, collision-free, time-dependent trajectories through cluttered dynamic environments. The many spatial and temporal constraints make finding an initial guess for a numerical solver difficult. Graphs of Convex Sets (GCS) and the recently developed Space-Time Graphs of Convex Sets formulation (ST-GCS) enable us to generate optimal minimum distance collision-free trajectories without providing an initial guess to the solver. We also explore the derivation of general GCS-compatible constraints and document an intuitive strategy for adapting general constraints to the framework. We show that ST-GCS produces equivalent trajectories to the standard GCS formulation when the environment is static. We then show ST-GCS operating in dynamic environments to find minimum distance collision-free trajectories.
comment: 21 pages with references, 20 figures
Distributional Robustness in Output Feedback Regret-Optimal Control
This paper studies distributionally robust regret-optimal (DRRO) control with purified output feedback for linear systems subject to additive disturbances and measurement noise. These uncertainties (including the initial system state) are assumed to be stochastic and distributed according to an unknown joint probability distribution within a Wasserstein ambiguity set. We design affine controllers to minimise the worst-case expected regret over all distributions in this set. The expected regret is defined as the difference between an expected cost incurred by an affine causal controller and the expected cost incurred by the optimal noncausal controller with perfect knowledge of the disturbance trajectory at the outset. Leveraging the duality theory in distributionally robust optimisation, we derive strong duality results for worst-case expectation problems involving general quadratic objective functions, enabling exact reformulations of the DRRO control problem as semidefinite programs (SDPs). Focusing on one such reformulation, we eliminate certain decision variables. This technique also permits a further equivalent reformulation of the SDP as a distributed optimisation problem, with potential to enhance scalability.
comment: Presented at the 11th IFAC Symposium on Robust Control Design
Performance Benchmarking of Machine Learning Models for Terahertz Metamaterial Absorber Prediction
This study presents a polarization-insensitive ultra-broadband terahertz metamaterial absorber based on vanadium dioxide (VO2) and evaluates machine learning methods for predicting its absorption performance. The structure consists of a VO2 metasurface, a MF2 dielectric spacer, and a gold ground plane. It achieves more than 90% absorption between 5.72 and 11.11 THz, covering a 5.38 THz bandwidth with an average absorptance of 98.15%. A dataset of 9,018 samples was generated from full-wave simulations by varying patch width, dielectric thickness, and frequency. Six regression models were trained: Linear Regression, Support Vector Regression, Decision Tree, Random Forest, XGBoost, and Bagging. Performance was measured using adjusted R2, MAE, MSE, and RMSE. Ensemble models achieved the best results, with Bagging reaching an adjusted R2 of 0.9985 and RMSE of 0.0146. The workflow offers a faster alternative to exhaustive simulations and can be applied to other metamaterial designs, enabling efficient evaluation and optimization.
comment: 6 pages, 6 figures, 2 tables
Routing Guidance for Emerging Transportation Systems with Improved Dynamic Trip Equity
This paper presents a dynamic routing guidance system that optimizes route recommendations for individual vehicles in an emerging transportation system while enhancing travelers' trip equity. We develop a framework to quantify trip quality and equity in dynamic travel environments, providing new insights into how routing guidance influences equity in road transportation. Our approach enables real-time routing by incorporating both monitored and anticipated traffic congestion. We provide conditions that ensure perfect trip equity for all travelers in a free-flow network. Simulation studies on 1,000 vehicles traversing an urban road network in Boston demonstrate that our method improves trip equity by approximately 11.4\% compared to the shortest-route strategy. In addition, the results reveal that our approach redistributes travel costs across vehicle types through route optimization, contributing to a more equitable transportation system.
Directional excitability in Hilbert spaces
We introduce a generalized excitable system in which spikes can happen in a continuum of directions, therefore drastically enriching the expressivity and control capability of the spiking dynamics. In this generalized excitable system, spiking trajectories happen in a Hilbert space with an excitable resting state at the origin and spike responses that can be triggered in any direction as a function of the system's state and inputs. State-dependence of the spiking direction provide the system with a vanishing spiking memory trace, which enables robust tracking and integration of inputs in the spiking direction history. The model exhibits generalized forms of both Hodgkin's Type I and Type II excitability, capturing their usual bifurcation behaviors in an abstract setting. When used as the controller of a two-dimensional navigation task, this model facilitates both the sparseness of the actuation and its sensitivity to environmental inputs. These results highlight the potential of the proposed generalized excitable model for excitable control in high- and infinite-dimensional spaces.
comment: 6 pages, 7 figures
On learning racing policies with reinforcement learning IROS 2025
Fully autonomous vehicles promise enhanced safety and efficiency. However, ensuring reliable operation in challenging corner cases requires control algorithms capable of performing at the vehicle limits. We address this requirement by considering the task of autonomous racing and propose solving it by learning a racing policy using Reinforcement Learning (RL). Our approach leverages domain randomization, actuator dynamics modeling, and policy architecture design to enable reliable and safe zero-shot deployment on a real platform. Evaluated on the F1TENTH race car, our RL policy not only surpasses a state-of-the-art Model Predictive Control (MPC), but, to the best of our knowledge, also represents the first instance of an RL policy outperforming expert human drivers in RC racing. This work identifies the key factors driving this performance improvement, providing critical insights for the design of robust RL-based control strategies for autonomous vehicles.
comment: This paper has been accepted for publication in the Proceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Barriers on the EDGE: A scalable CBF architecture over EDGE for safe aerial-ground multi-agent coordination
In this article, we propose a control architecture for the safe, coordinated operation of a multi-agent system with aerial (UAVs) and ground (UGVs) robots in a confined task space. We consider the case where the aerial and ground operations are coupled, enabled by the capability of the aerial robots to land on moving ground robots. The proposed method uses time-varying Control Barrier Functions (CBFs) to impose safety constraints associated with (i) collision avoidance between agents, (ii) landing of UAVs on mobile UGVs, and (iii) task space restriction. Further, this article addresses the challenge induced by the rapid increase in the number of CBF constraints with the increasing number of agents through a hybrid centralized-distributed coordination approach that determines the set of CBF constraints that is relevant for every aerial and ground agent at any given time. A centralized node (Watcher), hosted by an edge computing cluster, activates the relevant constraints, thus reducing the network complexity and the need for high onboard processing on the robots. The CBF constraints are enforced in a distributed manner by individual robots that run a nominal controller and safety filter locally to overcome latency and other network nonidealities.
comment: 6 pages, 2 figures, first draft of a paper currently under review
Navigating Robot Swarm Through a Virtual Tube with Flow-Adaptive Distribution Control
With the rapid development of robot swarm technology and its diverse applications, navigating robot swarms through complex environments has emerged as a critical research direction. To ensure safe navigation and avoid potential collisions with obstacles, the concept of virtual tubes has been introduced to define safe and navigable regions. However, current control methods in virtual tubes face the congestion issues, particularly in narrow ones with low throughput. To address these challenges, we first propose a novel control method that combines a modified artificial potential field (APF) for swarm navigation and density feedback control for distribution regulation. Then we generate a global velocity field that not only ensures collision-free navigation but also achieves locally input-to-state stability (LISS) for density tracking. Finally, numerical simulations and realistic applications validate the effectiveness and advantages of the proposed method in navigating robot swarms through narrow virtual tubes.
comment: 8 pages(brief paper), 12 figures
SINDyG: Sparse Identification of Nonlinear Dynamical Systems from Graph-Structured Data, with Applications to Stuart-Landau Oscillator Networks
The combination of machine learning (ML) and sparsity-promoting techniques is enabling direct extraction of governing equations from data, revolutionizing computational modeling in diverse fields of science and engineering. The discovered dynamical models could be used to address challenges in climate science, neuroscience, ecology, finance, epidemiology, and beyond. However, most existing sparse identification methods for discovering dynamical systems treat the whole system as one without considering the interactions between subsystems. As a result, such models are not able to capture small changes in the emergent system behavior. To address this issue, we developed a new method called Sparse Identification of Nonlinear Dynamical Systems from Graph-structured data (SINDyG), which incorporates the network structure into sparse regression to identify model parameters that explain the underlying network dynamics. We tested our proposed method using several case studies of neuronal dynamics, where we modeled the macroscopic oscillation of a population of neurons using the extended Stuart-Landau (SL) equation and utilize the SINDyG method to identify the underlying nonlinear dynamics. Our extensive computational experiments validate the improved accuracy and simplicity of discovered network dynamics when compared to the original SINDy approach. The proposed graph-informed penalty can be easily integrated with other symbolic regression algorithms, enhancing model interpretability and performance by incorporating network structure into the regression process.
Stability Analysis and Intervention Strategies on a Coupled SIS Epidemic Model with Polar Opinion Dynamics
This paper investigates the spread of infectious diseases within a networked community by integrating epidemic transmission and public opinion dynamics. We propose a novel discrete-time networked SIS (Susceptible-Infectious-Susceptible) epidemic model coupled with opinion dynamics that includes stubborn agents with biased views. The model captures the interplay between perceived and actual epidemic severity, offering insights into epidemic dynamics in socially interconnected environments. We introduce the SIS-opinion reproduction number to assess epidemic severity and analyze conditions for disease eradication and the global stability of endemic equilibria. Additionally, we explore opinion-based intervention strategies, providing a framework for policymakers to design effective prevention measures. Numerical examples are provided to illustrate our theoretical findings and the model's practical implications.
comment: 13 pages, 4 figures
Optimization-Free Fast Optimal Control: Bang-Ride Property, Monotonicity, and Applications to Fast Battery Charging
Single-input fast optimal control problems, which aim to achieve the optimal objective as fast as possible, occur in various real-world applications. In the case of fast battery charging, the associated optimal control problem becomes computationally challenging when detailed battery models are used. A recent heuristic optimization-free algorithm can significantly reduce the computational cost and provide an approximate solution, consistent with many heuristic input profiles in practice. These heuristic solutions have several special properties: They follow a bang-ride pattern that always activates a constraint and applies the maximum feasible input. This work investigates when the above properties arise in the optimal input, and ultimately, when the heuristic input profiles satisfy necessary optimality conditions. By exploiting Pontryagin's maximum principle (PMP), we show that the optimal control is bang-ride under regularity conditions on constraint switching and local controllability of the system. Moreover, the special type of bang-ride behavior, i.e., applying the maximum feasible input, arises under the monotonicity of the system, objective function, and restricted sensitivity of the constraints. These results provide a theoretical justification for a class of charging heuristics and the fast optimization-free algorithm.
Sequential QCQP for Bilevel Optimization with Line Search
Bilevel optimization involves a hierarchical structure where one problem is nested within another, leading to complex interdependencies between levels. We propose a single-loop, tuning-free algorithm that guarantees anytime feasibility, i.e., approximate satisfaction of the lower-level optimality condition, while ensuring descent of the upper-level objective. At each iteration, a convex quadratically-constrained quadratic program (QCQP) with a closed-form solution yields the search direction, followed by a backtracking line search inspired by control barrier functions to ensure safe, uniformly positive step sizes. The resulting method is scalable, requires no hyperparameter tuning, and converges under mild local regularity assumptions. We establish an O(1/k) ergodic convergence rate in terms of a first-order stationary metric and demonstrate the algorithm's effectiveness on representative bilevel tasks.
comment: IEEE Control Systems Letters (L-CSS) and IEEE Conference on Decision and Control (CDC) 2025
Incompressible Optimal Transport and Applications in Fluid Mixing
The problem of incompressible fluid mixing arises in numerous engineering applications and has been well-studied over the years, yet many open questions remain. This paper aims to address the question "what do efficient flow fields for mixing look like, and how do they behave?" We approach this question by developing a framework which is inspired by the dynamic and geometric approach to optimal mass transport. Specifically, we formulate the fluid mixing problem as an optimal control problem where the dynamics are given by the continuity equation together with an incompressibility constraint. We show that within this framework, the set of reachable fluid configurations can formally be endowed with the structure of an infinite-dimensional Riemannian manifold, with a metric which is induced by the control effort, and that flow fields which are maximally efficient at mixing correspond to geodesics in this Riemannian space.
comment: 8 pages. Extended version (includes proofs)
MPPI-Generic: A CUDA Library for Stochastic Trajectory Optimization
This paper introduces a new C++/CUDA library for GPU-accelerated stochastic optimization called MPPI-Generic. It provides implementations of Model Predictive Path Integral control, Tube-Model Predictive Path Integral Control, and Robust Model Predictive Path Integral Control, and allows for these algorithms to be used across many pre-existing dynamics models and cost functions. Furthermore, researchers can create their own dynamics models or cost functions following our API definitions without needing to change the actual Model Predictive Path Integral Control code. Finally, we compare computational performance to other popular implementations of Model Predictive Path Integral Control over a variety of GPUs to show the real-time capabilities our library can allow for. Library code can be found at: https://acdslab.github.io/mppi-generic-website/ .
comment: Added missing Acknowledgements section
Systems and Control (EESS)
Online Safety under Multiple Constraints and Input Bounds using gatekeeper: Theory and Applications
This letter presents an approach to guarantee online safety of a cyber-physical system under multiple state and input constraints. Our proposed framework, called gatekeeper, recursively guarantees the existence of an infinite-horizon trajectory that satisfies all constraints and system dynamics. Such trajectory is constructed using a backup controller, which we define formally in this paper. gatekeeper relies on a small number of verifiable assumptions, and is computationally efficient since it requires optimization over a single scalar variable. We make two primary contributions in this letter. (A) First, we develop the theory of gatekeeper: we derive a sub-optimality bound relative to a full nonlinear trajectory optimization problem, and show how this can be used in runtime to validate performance. This also informs the design of the backup controllers and sets. (B) Second, we demonstrate in detail an application of gatekeeper for multi-agent formation flight, where each Dubins agent must avoid multiple obstacles and weapons engagement zones, both of which are nonlinear, nonconvex constraints.
comment: 6 pages, 2 figures. Accepted for publication in IEEE L-CSS 2025
Collision-Free Bearing-Driven Formation Tracking for Euler-Lagrange Systems
In this paper, we investigate the problem of tracking formations driven by bearings for heterogeneous Euler-Lagrange systems with parametric uncertainty in the presence of multiple moving leaders. To estimate the leaders' velocities and accelerations, we first design a distributed observer for the leader system, utilizing a bearing-based localization condition in place of the conventional connectivity assumption. This observer, coupled with an adaptive mechanism, enables the synthesis of a novel distributed control law that guides the formation towards the target formation, without requiring prior knowledge of the system parameters. Furthermore, we establish a sufficient condition, dependent on the initial formation configuration, that ensures collision avoidance throughout the formation evolution. The effectiveness of the proposed approach is demonstrated through a numerical example.
comment: 10 pages, 4 figures
A Shank Angle-Based Control System Enables Soft Exoskeleton to Assist Human Non-Steady Locomotion
Exoskeletons have been shown to effectively assist humans during steady locomotion. However, their effects on non-steady locomotion, characterized by nonlinear phase progression within a gait cycle, remain insufficiently explored, particularly across diverse activities. This work presents a shank angle-based control system that enables the exoskeleton to maintain real-time coordination with human gait, even under phase perturbations, while dynamically shaping assistance profiles to match the biological ankle moment patterns across walking, running, stair negotiation tasks. The control system consists of an assistance profile online generation method and a model-based feedforward control method. The assistance profile is formulated as a dual-Gaussian model with the shank angle as the independent variable. Leveraging only IMU measurements, the model parameters are updated online each stride to adapt to inter- and intra-individual biomechanical variability. The profile tracking control employs a human-exoskeleton kinematics and stiffness model as a feedforward component, reducing reliance on historical control data due to the lack of clear and consistent periodicity in non-steady locomotion. Three experiments were conducted using a lightweight soft exoskeleton with multiple subjects. The results validated the effectiveness of each individual method, demonstrated the robustness of the control system against gait perturbations across various activities, and revealed positive biomechanical and physiological responses of human users to the exoskeleton's mechanical assistance.
comment: 49 pages, 20 figures, 4 tables
Integrated Learning and Optimization to Control Load Demand and Wind Generation for Minimizing Ramping Cost in Real-Time Electricity Market
We developed a new integrated learning and optimization (ILO) methodology to predict context-aware unknown parameters in economic dispatch (ED), a crucial problem in power systems solved to generate optimal power dispatching decisions to serve consumer load. The ED formulation in the current study consists of load and renewable generation as unknown parameters in its constraints predicted using contextual information (e.g., prior load, temperature). The ILO framework train a neural network (NN) to estimate ED parameters by minimizing an application-specific regret function which is a difference between ground truth and NN-driven decisions favouring better ED decisions. We thoroughly analyze the feasible region of ED formulation to understand the impact of load and renewable learning together on the ED decisions. Corresponding to that we developed a new regret function to capture real-time electricity market operations where differences in predicted and true loads are corrected by ramping generators in real-time but at a higher cost than the market price. The proposed regret function when minimized using ILO framework train the NN to guide the load and renewable predictions to generate ED decisions favouring minimum generator ramping costs. This is unlike conventional sequential learning and optimization (SLO) framework which train NN to accurately estimate load and renewable instead of better ED decisions. The combined training of load and renewable using ILO is a new concept and lead to significantly improved ramping costs when compared with SLO based training of load and renewable and SLO trained load with 100% accurate renewable proving its decision-focused capability.
Besondere Anforderungen des automatisierten Fahrens an den Entwurf
The development of automated vehicles and automated driving functions is an exceptionally complex task that requires the integration of numerous, sometimes conflicting interests and various constraints already in the early stages of system design. This chapter explains important challenges in concept specifications for automated driving and presents a systematic process model that contributes to overcoming the special requirements in this field. In addition, it describes the successful implementation of a structured concept specification for an automated vehicle guidance system. -- Die Entwicklung automatisierter Fahrzeuge und Fahrfunktionen stellt eine ausgesprochen komplexe Aufgabe dar, die bereits im Zuge des Systementwurfs die Einbeziehung einer Vielzahl teilweise konflikt\"arer Interessen und diverser Randbedingungen erfordert. Dieses Kapitel erl\"autert wichtige Herausforderungen bei Konzeptspezifikationen im Themenfeld des automatisierten Fahrens und stellt ein systematisches Prozessmodell vor, das einen Beitrag zur Erf\"ullung der besonderen Anforderungen des automatisierten Fahrens an den Entwurf leistet. Dar\"uber hinaus wird die erfolgreiche Durchf\"uhrung einer strukturierten Konzeptspezifikation f\"ur ein automatisiertes Fahrzeugf\"uhrungssystem beschrieben.
comment: Preprint of https://link.springer.com/chapter/10.1007/978-3-658-38486-9_45, published in Handbuch Assistiertes und Automatisiertes Fahren, in German
A Divide-and-Conquer Tiling Method for the Design of Large Aperiodic Phased Arrays
Due to the growing request from modern wireless applications of cost-affordable and high-gain scanning antenna solutions, the design of large phased arrays (PAs) with radiating elements organized into modular clusters with sub-array-only amplitude and phase control is a key topic. In this paper, an innovative irregular tiling method is proposed where, according to a divide-and-conquer strategy, the antenna aperture is subdivided into sub-areas that are locally domino-tiled by jointly fulfilling the full-coverage condition on the remaining untiled part of the PA support. Selected representative results, including comparisons with competitive state-of-the-art synthesis methods, are reported to prove the effectiveness and the computational efficiency of the proposed tiling approach. Use-cases of current relevance for low Earth orbit (LEO) satellite communications are discussed, as well, to provide the antenna designers useful practical guidelines for handling large PAs.
Metering traffic flows for perimeter control through auction-based signalling using connected vehicles
Urban traffic congestion remains a critical challenge in modern cities, with traffic signal control systems often struggling to manage congestion during peak travel times. Perimeter control of a Protected Network (PN) has emerged as a potential solution to reducing gridlock in urban networks. This paper proposes a novel auction-based mechanism for green time allocation at signalized intersections, for effective perimeter control application. Utilising a Sealed Bid, Second Price auction framework, our approach combines real-time traffic monitoring with market-inspired mechanisms to regulate vehicle inflows into PN areas. Unlike existing methods that focus primarily on gated links, our system allocates budgets to individual traffic movements, providing greater flexibility in managing multi-directional flows. We evaluate the proposed mechanism using a test case intersection with a single controlled inflow, comparing it against a volume-based fixed-time approach. The results demonstrate that our auction-based method controls flows into the PN with improved accuracy, outperforming the volume-based approach in terms of inflow regulation, queue management and delays. The framework can be applied in real time to any generic intersection, offering a scalable solution for urban traffic management. This work bridges the gap between perimeter control and market-based intersection auctions, providing a pathway for further research on adaptive traffic management systems.
Duty-Cycling is Not Enough in Constrained IoT Networking: Revealing the Energy Savings of Dynamic Clock Scaling
Minimizing energy consumption of low-power wireless nodes is a persistent challenge from the constrained Internet of Things (IoT). In this paper, we start from the observation that constrained IoT devices have largely different hardware (im-)balances than full-scale machines. We find that the performance gap between MCU and network throughput on constrained devices enables minimal energy delay product (EDP) for IoT networking at largely reduced clock frequencies. We analyze the potentials by integrating dynamic voltage and frequency scaling (DVFS) into the RIOT IoT operating system and show that the DVFS reconfiguration overhead stays below the energy saved for a single, downscaled MAC operation. Backed by these findings, we systematically investigate how DVFS further improves energy-efficiency for common networking tasks -- in addition to duty-cycling. We measure IoT communication scenarios between real-world systems and analyze two MAC operating modes -- CSMA/CA and time slotting -- in combination with different CoAP transactions, payload sizes, as well as DTLS transport encryption. Our experiments reveal energy savings between 24% and 52% for MAC operations and up to 37% for encrypted CoAP communication. These results shall encourage research and system design work to integrate DVFS in future IoT devices for performing tasks at their optimal frequencies and thereby significantly extending battery lifetimes.
BEAVR: Bimanual, multi-Embodiment, Accessible, Virtual Reality Teleoperation System for Robots
\textbf{BEAVR} is an open-source, bimanual, multi-embodiment Virtual Reality (VR) teleoperation system for robots, designed to unify real-time control, data recording, and policy learning across heterogeneous robotic platforms. BEAVR enables real-time, dexterous teleoperation using commodity VR hardware, supports modular integration with robots ranging from 7-DoF manipulators to full-body humanoids, and records synchronized multi-modal demonstrations directly in the LeRobot dataset schema. Our system features a zero-copy streaming architecture achieving $\leq$35\,ms latency, an asynchronous ``think--act'' control loop for scalable inference, and a flexible network API optimized for real-time, multi-robot operation. We benchmark BEAVR across diverse manipulation tasks and demonstrate its compatibility with leading visuomotor policies such as ACT, DiffusionPolicy, and SmolVLA. All code is publicly available, and datasets are released on Hugging Face\footnote{Code, datasets, and VR app available at https://github.com/ARCLab-MIT/BEAVR-Bot.
comment: Accepted for presentation on ICCR Kyoto 2025
HapticGiant: A Novel Very Large Kinesthetic Haptic Interface with Hierarchical Force Control
Research in virtual reality and haptic technologies has consistently aimed to enhance immersion. While advanced head-mounted displays are now commercially available, kinesthetic haptic interfaces still face challenges such as limited workspaces, insufficient degrees of freedom, and kinematics not matching the human arm. In this paper, we present HapticGiant, a novel large-scale kinesthetic haptic interface designed to match the properties of the human arm as closely as possible and to facilitate natural user locomotion while providing full haptic feedback. The interface incorporates a novel admittance-type force control scheme, leveraging hierarchical optimization to render both arbitrary serial kinematic chains and Cartesian admittances. Notably, the proposed control scheme natively accounts for system limitations, including joint and Cartesian constraints, as well as singularities. Experimental results demonstrate the effectiveness of HapticGiant and its control scheme, paving the way for highly immersive virtual reality applications.
comment: Final Version - Accepted on IEEE Transactions on Haptics
Offline Auto Labeling: BAAS
This paper introduces BAAS, a new Extended Object Tracking (EOT) and fusion-based label annotation framework for radar detections in autonomous driving. Our framework utilizes Bayesian-based tracking, smoothing and eventually fusion methods to provide veritable and precise object trajectories along with shape estimation to provide annotation labels on the detection level under various supervision levels. Simultaneously, the framework provides evaluation of tracking performance and label annotation. If manually labeled data is available, each processing module can be analyzed independently or combined with other modules to enable closed-loop continuous improvements. The framework performance is evaluated in a challenging urban real-world scenario in terms of tracking performance and the label annotation errors. We demonstrate the functionality of the proposed approach for varying dynamic objects and class types
Shepherd Grid Strategy: Towards Reliable SWARM Interception
Modern unmanned aerial vehicle threats require sophisticated interception strategies that can overcome advanced evasion capabilities and operate effectively in contested environments. Traditional single-interceptor and uncoordinated multi-interceptor approaches suffer from fundamental limitations including inadequate coverage, predictable pursuit patterns, and vulnerability to intelligent evasion maneuvers. This paper introduces the Shepherd Grid Strategy, a new multi-phase coordination framework that employs pack-based behavioral coordination to achieve deterministic target interception through systematic containment and coordinated strike execution. The strategy implements a four-phase operational model consisting of chase, follow, formation, and engagement phases, with dynamic role assignment and adaptive formation geometry that maintains persistent target pressure while preparing optimal strike opportunities. Our approach incorporates three key innovations: adaptive phase transition mechanisms that optimize pursuit behavior based on proximity and mission objectives, dynamic role assignment systems that designate specialized interceptor functions including formation maintenance and strike execution, and predictive formation geometry algorithms that create mobile containment grids adapting to target movement patterns. The simulation experiments demonstrate significant performance improvements over traditional methods, achieving near-perfect interception success rates (over 95%) compared to traditional approaches (65%) and reducing median time-to-intercept.
comment: 9 pages, 5 figures
From Formal Methods to Data-Driven Safety Certificates of Unknown Large-Scale Networks
In this work, we propose a data-driven scheme within a compositional framework with noisy data to design robust safety controllers in a fully decentralized fashion for large-scale interconnected networks with unknown mathematical dynamics. Despite the network's high dimensionality and the inherent complexity of its unknown model, which make it intractable, our approach effectively addresses these challenges by (i) treating the network as a composition of smaller subsystems, and (ii) collecting noisy data from each subsystem's trajectory to design a control sub-barrier certificate (CSBC) and its corresponding local controller. To achieve this, our proposed scheme only requires a noise-corrupted single input-state trajectory from each unknown subsystem up to a specified time horizon, satisfying a certain rank condition. Subsequently, under a small-gain compositional reasoning, we compose those CSBC, derived from noisy data, and formulate a control barrier certificate (CBC) for the unknown network, ensuring its safety over an infinite time horizon, while providing correctness guarantees. We offer a data-dependent sum-of-squares (SOS) optimization program for computing CSBC alongside local controllers of subsystems. We illustrate that while the computational complexity of designing a CBC and its safety controller grows polynomially with network dimension using SOS optimization, our compositional data-driven approach significantly reduces it to a linear scale concerning the number of subsystems. We demonstrate the capability of our data-driven approach on multiple physical networks involving unknown models and a range of interconnection topologies.
From Micro to Macro Flow Modeling: Characterizing Heterogeneity of Mixed-Autonomy Traffic
Most autonomous-vehicles (AVs) driving strategies are designed and analyzed at the vehicle level, yet their aggregate impact on macroscopic traffic flow is still not understood, particularly the flow heterogeneity that emerges when AVs interact with human-driven vehicles (HVs). Existing validation techniques for macroscopic flow models rely on high-resolution spatiotemporal data spanning entire road segments which are rarely available for mixed-autonomy traffic. AVs record detailed Lagrangian trajectories of the ego vehicle and surrounding traffic through onboard sensors. Leveraging these Lagrangian observations to validate mixed-autonomy flow models therefore remains an open research challenge. This paper closes the gap between microscopic Lagrangian data and macroscopic Euclidean traffic models by introducing a continuous traffic-heterogeneity attribute. We represent traffic flow with two coupled conservation laws with one for vehicle number and one for the traffic attribute. Reconstruction methods are designed to derive the traffic attribute from Lagrangian vehicle trajectories. When abundant trajectory data are available, we characterize traffic heterogeneity by extracting drivers' desired speed and local behavioral uncertainty from trajectories. In data-scarce mixed traffic, we design an end-to-end mapping that infers the traffic heterogeneity solely from trajectories in the current spatiotemporal region. Experiments across multiple traffic datasets show that the proposed model effectively captures traffic heterogeneity by clustering the fundamental diagram scatter into attribute-based groups. The calibration errors of traffic flow dynamics are also reduce by 20% relative to the Aw-Rascle-Zhang model benchmark. Detailed analyses further show that the model generalizes well, maintaining nearly the same accuracy when evaluated under a variety of previously unseen traffic conditions.
Imperfect Competition in Markets for Short-Circuit Current Services
An important limitation of Inverter-Based Resources (IBR) is their reduced contribution to Short-Circuit Current (SCC), as compared to that of Synchronous Generators (SGs). With increasing penetration of IBR in most power systems, the reducing SCC poses challenges to a secure system operation, as line protections may not trip when required. In order to address this issue, the SCC ancillary service could be procured via an economic mechanism, aiming at securing adequate SCC on all buses. However, the suitability of markets for SCC services is not well understood, given that these could be prone to market-power issues: since the SCC contributions from various SGs to a certain bus are determined by the electrical topology of the grid, this is a highly local service. It is necessary to understand if SGs at advantageous electrical locations could exert market power and, if so, how it could be mitigated. In order to fill this gap, this paper adopts an SCC-constrained bilevel model to investigate strategic behaviors of SGs. To address the non-convexity due to unit commitment variables, the model is restructured through a primal-dual formulation. Based on a modified IEEE 30-bus system, cases with strategic SGs placed at different buses are analyzed. These studies demonstrate that agents exerting market power could achieve up to triple revenues from SCC provision, highlighting the need to carefully design these markets.
comment: Ancillary services, short-circuit current, market power, bilevel optimization, primal-dual formulation. A paper submitted IEEE
Control Systems Analysis of a 3-Axis Photovoltatic Solar Tracker for Water Pumping
We propose 3-axis solar tracker water pumping system. The solar tracker can rotate and tilt using stepper/DC motors and can rise and lower on a tripod using a linear actuator. The charge generated from solar energy absorbed by photovoltaic (PV) cells in the solar panel is stored in a 12V battery that in turn powers two water diaphragm pumps using a solar charge controller. The PV uses four light photocell resistors/sensors to measure light intensity. A solar tracking algorithm determines the optimal angle for PV positioning. Using an ultrasonic sensor to measure the water level in a reservoir water tank, water is pumped from one water tank to the reservoir. Based on soil moisture sensor levels, a second water pump supplies water from the reservoir to the plant. The system is analyzed from a control systems perspective. The transfer functions, root loci, and Bode plots are generated and simulated and experimental results are provided as well as stability and steady-state error analysis.
Design and Simulation of 6T SRAM Array
Conventional 6T SRAM is used in microprocessors in the cache memory design. The basic 6T SRAM cell and a 6 bit memory array layout are designed in LEdit. The design and analysis of key SRAM components, sense amplifiers, decoders, write drivers and precharge circuits are also provided. The pulse voltage waveforms generated for read and write operations as well as Q and Qbar nodes are simulated in LTSpice. Parasitic capacitances are extracted and their impact on the waveforms analyzed. Static noise margin, propagation delays, and power dissipation are calculated. Comparison of SRAM read and write operational performance using CMOS transistors is made with edge-triggered D flip flops. If certain size area and ratio constraints are satisfied, the 6T cell with CMOS transistors will possess stability, speed, and power efficiency. Both theoretical and simulated results are given.
Systematic Constraint Formulation and Collision-Free Trajectory Planning Using Space-Time Graphs of Convex Sets
In this paper, we create optimal, collision-free, time-dependent trajectories through cluttered dynamic environments. The many spatial and temporal constraints make finding an initial guess for a numerical solver difficult. Graphs of Convex Sets (GCS) and the recently developed Space-Time Graphs of Convex Sets formulation (ST-GCS) enable us to generate optimal minimum distance collision-free trajectories without providing an initial guess to the solver. We also explore the derivation of general GCS-compatible constraints and document an intuitive strategy for adapting general constraints to the framework. We show that ST-GCS produces equivalent trajectories to the standard GCS formulation when the environment is static. We then show ST-GCS operating in dynamic environments to find minimum distance collision-free trajectories.
comment: 21 pages with references, 20 figures
Distributional Robustness in Output Feedback Regret-Optimal Control
This paper studies distributionally robust regret-optimal (DRRO) control with purified output feedback for linear systems subject to additive disturbances and measurement noise. These uncertainties (including the initial system state) are assumed to be stochastic and distributed according to an unknown joint probability distribution within a Wasserstein ambiguity set. We design affine controllers to minimise the worst-case expected regret over all distributions in this set. The expected regret is defined as the difference between an expected cost incurred by an affine causal controller and the expected cost incurred by the optimal noncausal controller with perfect knowledge of the disturbance trajectory at the outset. Leveraging the duality theory in distributionally robust optimisation, we derive strong duality results for worst-case expectation problems involving general quadratic objective functions, enabling exact reformulations of the DRRO control problem as semidefinite programs (SDPs). Focusing on one such reformulation, we eliminate certain decision variables. This technique also permits a further equivalent reformulation of the SDP as a distributed optimisation problem, with potential to enhance scalability.
comment: Presented at the 11th IFAC Symposium on Robust Control Design
Performance Benchmarking of Machine Learning Models for Terahertz Metamaterial Absorber Prediction
This study presents a polarization-insensitive ultra-broadband terahertz metamaterial absorber based on vanadium dioxide (VO2) and evaluates machine learning methods for predicting its absorption performance. The structure consists of a VO2 metasurface, a MF2 dielectric spacer, and a gold ground plane. It achieves more than 90% absorption between 5.72 and 11.11 THz, covering a 5.38 THz bandwidth with an average absorptance of 98.15%. A dataset of 9,018 samples was generated from full-wave simulations by varying patch width, dielectric thickness, and frequency. Six regression models were trained: Linear Regression, Support Vector Regression, Decision Tree, Random Forest, XGBoost, and Bagging. Performance was measured using adjusted R2, MAE, MSE, and RMSE. Ensemble models achieved the best results, with Bagging reaching an adjusted R2 of 0.9985 and RMSE of 0.0146. The workflow offers a faster alternative to exhaustive simulations and can be applied to other metamaterial designs, enabling efficient evaluation and optimization.
comment: 6 pages, 6 figures, 2 tables
Routing Guidance for Emerging Transportation Systems with Improved Dynamic Trip Equity
This paper presents a dynamic routing guidance system that optimizes route recommendations for individual vehicles in an emerging transportation system while enhancing travelers' trip equity. We develop a framework to quantify trip quality and equity in dynamic travel environments, providing new insights into how routing guidance influences equity in road transportation. Our approach enables real-time routing by incorporating both monitored and anticipated traffic congestion. We provide conditions that ensure perfect trip equity for all travelers in a free-flow network. Simulation studies on 1,000 vehicles traversing an urban road network in Boston demonstrate that our method improves trip equity by approximately 11.4\% compared to the shortest-route strategy. In addition, the results reveal that our approach redistributes travel costs across vehicle types through route optimization, contributing to a more equitable transportation system.
Directional excitability in Hilbert spaces
We introduce a generalized excitable system in which spikes can happen in a continuum of directions, therefore drastically enriching the expressivity and control capability of the spiking dynamics. In this generalized excitable system, spiking trajectories happen in a Hilbert space with an excitable resting state at the origin and spike responses that can be triggered in any direction as a function of the system's state and inputs. State-dependence of the spiking direction provide the system with a vanishing spiking memory trace, which enables robust tracking and integration of inputs in the spiking direction history. The model exhibits generalized forms of both Hodgkin's Type I and Type II excitability, capturing their usual bifurcation behaviors in an abstract setting. When used as the controller of a two-dimensional navigation task, this model facilitates both the sparseness of the actuation and its sensitivity to environmental inputs. These results highlight the potential of the proposed generalized excitable model for excitable control in high- and infinite-dimensional spaces.
comment: 6 pages, 7 figures
On learning racing policies with reinforcement learning IROS 2025
Fully autonomous vehicles promise enhanced safety and efficiency. However, ensuring reliable operation in challenging corner cases requires control algorithms capable of performing at the vehicle limits. We address this requirement by considering the task of autonomous racing and propose solving it by learning a racing policy using Reinforcement Learning (RL). Our approach leverages domain randomization, actuator dynamics modeling, and policy architecture design to enable reliable and safe zero-shot deployment on a real platform. Evaluated on the F1TENTH race car, our RL policy not only surpasses a state-of-the-art Model Predictive Control (MPC), but, to the best of our knowledge, also represents the first instance of an RL policy outperforming expert human drivers in RC racing. This work identifies the key factors driving this performance improvement, providing critical insights for the design of robust RL-based control strategies for autonomous vehicles.
comment: This paper has been accepted for publication in the Proceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Barriers on the EDGE: A scalable CBF architecture over EDGE for safe aerial-ground multi-agent coordination
In this article, we propose a control architecture for the safe, coordinated operation of a multi-agent system with aerial (UAVs) and ground (UGVs) robots in a confined task space. We consider the case where the aerial and ground operations are coupled, enabled by the capability of the aerial robots to land on moving ground robots. The proposed method uses time-varying Control Barrier Functions (CBFs) to impose safety constraints associated with (i) collision avoidance between agents, (ii) landing of UAVs on mobile UGVs, and (iii) task space restriction. Further, this article addresses the challenge induced by the rapid increase in the number of CBF constraints with the increasing number of agents through a hybrid centralized-distributed coordination approach that determines the set of CBF constraints that is relevant for every aerial and ground agent at any given time. A centralized node (Watcher), hosted by an edge computing cluster, activates the relevant constraints, thus reducing the network complexity and the need for high onboard processing on the robots. The CBF constraints are enforced in a distributed manner by individual robots that run a nominal controller and safety filter locally to overcome latency and other network nonidealities.
comment: 6 pages, 2 figures, first draft of a paper currently under review
Navigating Robot Swarm Through a Virtual Tube with Flow-Adaptive Distribution Control
With the rapid development of robot swarm technology and its diverse applications, navigating robot swarms through complex environments has emerged as a critical research direction. To ensure safe navigation and avoid potential collisions with obstacles, the concept of virtual tubes has been introduced to define safe and navigable regions. However, current control methods in virtual tubes face the congestion issues, particularly in narrow ones with low throughput. To address these challenges, we first propose a novel control method that combines a modified artificial potential field (APF) for swarm navigation and density feedback control for distribution regulation. Then we generate a global velocity field that not only ensures collision-free navigation but also achieves locally input-to-state stability (LISS) for density tracking. Finally, numerical simulations and realistic applications validate the effectiveness and advantages of the proposed method in navigating robot swarms through narrow virtual tubes.
comment: 8 pages(brief paper), 12 figures
SINDyG: Sparse Identification of Nonlinear Dynamical Systems from Graph-Structured Data, with Applications to Stuart-Landau Oscillator Networks
The combination of machine learning (ML) and sparsity-promoting techniques is enabling direct extraction of governing equations from data, revolutionizing computational modeling in diverse fields of science and engineering. The discovered dynamical models could be used to address challenges in climate science, neuroscience, ecology, finance, epidemiology, and beyond. However, most existing sparse identification methods for discovering dynamical systems treat the whole system as one without considering the interactions between subsystems. As a result, such models are not able to capture small changes in the emergent system behavior. To address this issue, we developed a new method called Sparse Identification of Nonlinear Dynamical Systems from Graph-structured data (SINDyG), which incorporates the network structure into sparse regression to identify model parameters that explain the underlying network dynamics. We tested our proposed method using several case studies of neuronal dynamics, where we modeled the macroscopic oscillation of a population of neurons using the extended Stuart-Landau (SL) equation and utilize the SINDyG method to identify the underlying nonlinear dynamics. Our extensive computational experiments validate the improved accuracy and simplicity of discovered network dynamics when compared to the original SINDy approach. The proposed graph-informed penalty can be easily integrated with other symbolic regression algorithms, enhancing model interpretability and performance by incorporating network structure into the regression process.
Stability Analysis and Intervention Strategies on a Coupled SIS Epidemic Model with Polar Opinion Dynamics
This paper investigates the spread of infectious diseases within a networked community by integrating epidemic transmission and public opinion dynamics. We propose a novel discrete-time networked SIS (Susceptible-Infectious-Susceptible) epidemic model coupled with opinion dynamics that includes stubborn agents with biased views. The model captures the interplay between perceived and actual epidemic severity, offering insights into epidemic dynamics in socially interconnected environments. We introduce the SIS-opinion reproduction number to assess epidemic severity and analyze conditions for disease eradication and the global stability of endemic equilibria. Additionally, we explore opinion-based intervention strategies, providing a framework for policymakers to design effective prevention measures. Numerical examples are provided to illustrate our theoretical findings and the model's practical implications.
comment: 13 pages, 4 figures
Optimization-Free Fast Optimal Control: Bang-Ride Property, Monotonicity, and Applications to Fast Battery Charging
Single-input fast optimal control problems, which aim to achieve the optimal objective as fast as possible, occur in various real-world applications. In the case of fast battery charging, the associated optimal control problem becomes computationally challenging when detailed battery models are used. A recent heuristic optimization-free algorithm can significantly reduce the computational cost and provide an approximate solution, consistent with many heuristic input profiles in practice. These heuristic solutions have several special properties: They follow a bang-ride pattern that always activates a constraint and applies the maximum feasible input. This work investigates when the above properties arise in the optimal input, and ultimately, when the heuristic input profiles satisfy necessary optimality conditions. By exploiting Pontryagin's maximum principle (PMP), we show that the optimal control is bang-ride under regularity conditions on constraint switching and local controllability of the system. Moreover, the special type of bang-ride behavior, i.e., applying the maximum feasible input, arises under the monotonicity of the system, objective function, and restricted sensitivity of the constraints. These results provide a theoretical justification for a class of charging heuristics and the fast optimization-free algorithm.
Sequential QCQP for Bilevel Optimization with Line Search
Bilevel optimization involves a hierarchical structure where one problem is nested within another, leading to complex interdependencies between levels. We propose a single-loop, tuning-free algorithm that guarantees anytime feasibility, i.e., approximate satisfaction of the lower-level optimality condition, while ensuring descent of the upper-level objective. At each iteration, a convex quadratically-constrained quadratic program (QCQP) with a closed-form solution yields the search direction, followed by a backtracking line search inspired by control barrier functions to ensure safe, uniformly positive step sizes. The resulting method is scalable, requires no hyperparameter tuning, and converges under mild local regularity assumptions. We establish an O(1/k) ergodic convergence rate in terms of a first-order stationary metric and demonstrate the algorithm's effectiveness on representative bilevel tasks.
comment: IEEE Control Systems Letters (L-CSS) and IEEE Conference on Decision and Control (CDC) 2025
Incompressible Optimal Transport and Applications in Fluid Mixing
The problem of incompressible fluid mixing arises in numerous engineering applications and has been well-studied over the years, yet many open questions remain. This paper aims to address the question "what do efficient flow fields for mixing look like, and how do they behave?" We approach this question by developing a framework which is inspired by the dynamic and geometric approach to optimal mass transport. Specifically, we formulate the fluid mixing problem as an optimal control problem where the dynamics are given by the continuity equation together with an incompressibility constraint. We show that within this framework, the set of reachable fluid configurations can formally be endowed with the structure of an infinite-dimensional Riemannian manifold, with a metric which is induced by the control effort, and that flow fields which are maximally efficient at mixing correspond to geodesics in this Riemannian space.
comment: 8 pages. Extended version (includes proofs)
MPPI-Generic: A CUDA Library for Stochastic Trajectory Optimization
This paper introduces a new C++/CUDA library for GPU-accelerated stochastic optimization called MPPI-Generic. It provides implementations of Model Predictive Path Integral control, Tube-Model Predictive Path Integral Control, and Robust Model Predictive Path Integral Control, and allows for these algorithms to be used across many pre-existing dynamics models and cost functions. Furthermore, researchers can create their own dynamics models or cost functions following our API definitions without needing to change the actual Model Predictive Path Integral Control code. Finally, we compare computational performance to other popular implementations of Model Predictive Path Integral Control over a variety of GPUs to show the real-time capabilities our library can allow for. Library code can be found at: https://acdslab.github.io/mppi-generic-website/ .
comment: Added missing Acknowledgements section
Robotics
GeoVLA: Empowering 3D Representations in Vision-Language-Action Models
Vision-Language-Action (VLA) models have emerged as a promising approach for enabling robots to follow language instructions and predict corresponding actions.However, current VLA models mainly rely on 2D visual inputs, neglecting the rich geometric information in the 3D physical world, which limits their spatial awareness and adaptability. In this paper, we present GeoVLA, a novel VLA framework that effectively integrates 3D information to advance robotic manipulation. It uses a vision-language model (VLM) to process images and language instructions,extracting fused vision-language embeddings. In parallel, it converts depth maps into point clouds and employs a customized point encoder, called Point Embedding Network, to generate 3D geometric embeddings independently. These produced embeddings are then concatenated and processed by our proposed spatial-aware action expert, called 3D-enhanced Action Expert, which combines information from different sensor modalities to produce precise action sequences. Through extensive experiments in both simulation and real-world environments, GeoVLA demonstrates superior performance and robustness. It achieves state-of-the-art results in the LIBERO and ManiSkill2 simulation benchmarks and shows remarkable robustness in real-world tasks requiring height adaptability, scale awareness and viewpoint invariance.
comment: The project is visible at https://linsun449.github.io/GeoVLA/
Spatial Traces: Enhancing VLA Models with Spatial-Temporal Understanding
Vision-Language-Action models have demonstrated remarkable capabilities in predicting agent movements within virtual environments and real-world scenarios based on visual observations and textual instructions. Although recent research has focused on enhancing spatial and temporal understanding independently, this paper presents a novel approach that integrates both aspects through visual prompting. We introduce a method that projects visual traces of key points from observations onto depth maps, enabling models to capture both spatial and temporal information simultaneously. The experiments in SimplerEnv show that the mean number of tasks successfully solved increased for 4% compared to SpatialVLA and 19% compared to TraceVLA. Furthermore, we show that this enhancement can be achieved with minimal training data, making it particularly valuable for real-world applications where data collection is challenging. The project page is available at https://ampiromax.github.io/ST-VLA.
Large Scale Robotic Material Handling: Learning, Planning, and Control
Bulk material handling involves the efficient and precise moving of large quantities of materials, a core operation in many industries, including cargo ship unloading, waste sorting, construction, and demolition. These repetitive, labor-intensive, and safety-critical operations are typically performed using large hydraulic material handlers equipped with underactuated grippers. In this work, we present a comprehensive framework for the autonomous execution of large-scale material handling tasks. The system integrates specialized modules for environment perception, pile attack point selection, path planning, and motion control. The main contributions of this work are two reinforcement learning-based modules: an attack point planner that selects optimal grasping locations on the material pile to maximize removal efficiency and minimize the number of scoops, and a robust trajectory following controller that addresses the precision and safety challenges associated with underactuated grippers in movement, while utilizing their free-swinging nature to release material through dynamic throwing. We validate our framework through real-world experiments on a 40 t material handler in a representative worksite, focusing on two key tasks: high-throughput bulk pile management and high-precision truck loading. Comparative evaluations against human operators demonstrate the system's effectiveness in terms of precision, repeatability, and operational safety. To the best of our knowledge, this is the first complete automation of material handling tasks on a full scale.
comment: Preliminary version, currently undergoing review process
Generation of Real-time Robotic Emotional Expressions Learning from Human Demonstration in Mixed Reality
Expressive behaviors in robots are critical for effectively conveying their emotional states during interactions with humans. In this work, we present a framework that autonomously generates realistic and diverse robotic emotional expressions based on expert human demonstrations captured in Mixed Reality (MR). Our system enables experts to teleoperate a virtual robot from a first-person perspective, capturing their facial expressions, head movements, and upper-body gestures, and mapping these behaviors onto corresponding robotic components including eyes, ears, neck, and arms. Leveraging a flow-matching-based generative process, our model learns to produce coherent and varied behaviors in real-time in response to moving objects, conditioned explicitly on given emotional states. A preliminary test validated the effectiveness of our approach for generating autonomous expressions.
comment: 4
Rational Inverse Reasoning
Humans can observe a single, imperfect demonstration and immediately generalize to very different problem settings. Robots, in contrast, often require hundreds of examples and still struggle to generalize beyond the training conditions. We argue that this limitation arises from the inability to recover the latent explanations that underpin intelligent behavior, and that these explanations can take the form of structured programs consisting of high-level goals, sub-task decomposition, and execution constraints. In this work, we introduce Rational Inverse Reasoning (RIR), a framework for inferring these latent programs through a hierarchical generative model of behavior. RIR frames few-shot imitation as Bayesian program induction: a vision-language model iteratively proposes structured symbolic task hypotheses, while a planner-in-the-loop inference scheme scores each by the likelihood of the observed demonstration under that hypothesis. This loop yields a posterior over concise, executable programs. We evaluate RIR on a suite of continuous manipulation tasks designed to test one-shot and few-shot generalization across variations in object pose, count, geometry, and layout. With as little as one demonstration, RIR infers the intended task structure and generalizes to novel settings, outperforming state-of-the-art vision-language model baselines.
Unsupervised Skill Discovery as Exploration for Learning Agile Locomotion
Exploration is crucial for enabling legged robots to learn agile locomotion behaviors that can overcome diverse obstacles. However, such exploration is inherently challenging, and we often rely on extensive reward engineering, expert demonstrations, or curriculum learning - all of which limit generalizability. In this work, we propose Skill Discovery as Exploration (SDAX), a novel learning framework that significantly reduces human engineering effort. SDAX leverages unsupervised skill discovery to autonomously acquire a diverse repertoire of skills for overcoming obstacles. To dynamically regulate the level of exploration during training, SDAX employs a bi-level optimization process that autonomously adjusts the degree of exploration. We demonstrate that SDAX enables quadrupedal robots to acquire highly agile behaviors including crawling, climbing, leaping, and executing complex maneuvers such as jumping off vertical walls. Finally, we deploy the learned policy on real hardware, validating its successful transfer to the real world.
comment: Conference on Robot Learning 2025
Shape Completion and Real-Time Visualization in Robotic Ultrasound Spine Acquisitions
Ultrasound (US) imaging is increasingly used in spinal procedures due to its real-time, radiation-free capabilities; however, its effectiveness is hindered by shadowing artifacts that obscure deeper tissue structures. Traditional approaches, such as CT-to-US registration, incorporate anatomical information from preoperative CT scans to guide interventions, but they are limited by complex registration requirements, differences in spine curvature, and the need for recent CT imaging. Recent shape completion methods can offer an alternative by reconstructing spinal structures in US data, while being pretrained on large set of publicly available CT scans. However, these approaches are typically offline and have limited reproducibility. In this work, we introduce a novel integrated system that combines robotic ultrasound with real-time shape completion to enhance spinal visualization. Our robotic platform autonomously acquires US sweeps of the lumbar spine, extracts vertebral surfaces from ultrasound, and reconstructs the complete anatomy using a deep learning-based shape completion network. This framework provides interactive, real-time visualization with the capability to autonomously repeat scans and can enable navigation to target locations. This can contribute to better consistency, reproducibility, and understanding of the underlying anatomy. We validate our approach through quantitative experiments assessing shape completion accuracy and evaluations of multiple spine acquisition protocols on a phantom setup. Additionally, we present qualitative results of the visualization on a volunteer scan.
Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors
A dexterous hand capable of generalizable grasping objects is fundamental for the development of general-purpose embodied AI. However, previous methods focus narrowly on low-level grasp stability metrics, neglecting affordance-aware positioning and human-like poses which are crucial for downstream manipulation. To address these limitations, we propose AffordDex, a novel framework with two-stage training that learns a universal grasping policy with an inherent understanding of both motion priors and object affordances. In the first stage, a trajectory imitator is pre-trained on a large corpus of human hand motions to instill a strong prior for natural movement. In the second stage, a residual module is trained to adapt these general human-like motions to specific object instances. This refinement is critically guided by two components: our Negative Affordance-aware Segmentation (NAA) module, which identifies functionally inappropriate contact regions, and a privileged teacher-student distillation process that ensures the final vision-based policy is highly successful. Extensive experiments demonstrate that AffordDex not only achieves universal dexterous grasping but also remains remarkably human-like in posture and functionally appropriate in contact location. As a result, AffordDex significantly outperforms state-of-the-art baselines across seen objects, unseen instances, and even entirely novel categories.
comment: 13 pages, 8 figures
DiffPhysCam: Differentiable Physics-Based Camera Simulation for Inverse Rendering and Embodied AI
We introduce DiffPhysCam, a differentiable camera simulator designed to support robotics and embodied AI applications by enabling gradient-based optimization in visual perception pipelines. Generating synthetic images that closely mimic those from real cameras is essential for training visual models and enabling end-to-end visuomotor learning. Moreover, differentiable rendering allows inverse reconstruction of real-world scenes as digital twins, facilitating simulation-based robotics training. However, existing virtual cameras offer limited control over intrinsic settings, poorly capture optical artifacts, and lack tunable calibration parameters -- hindering sim-to-real transfer. DiffPhysCam addresses these limitations through a multi-stage pipeline that provides fine-grained control over camera settings, models key optical effects such as defocus blur, and supports calibration with real-world data. It enables both forward rendering for image synthesis and inverse rendering for 3D scene reconstruction, including mesh and material texture optimization. We show that DiffPhysCam enhances robotic perception performance in synthetic image tasks. As an illustrative example, we create a digital twin of a real-world scene using inverse rendering, simulate it in a multi-physics environment, and demonstrate navigation of an autonomous ground vehicle using images generated by DiffPhysCam.
comment: 19 pages, 17 figures, and 4 tables
Robot can reduce superior's dominance in group discussions with human social hierarchy
This study investigated whether robotic agents that deal with social hierarchical relationships can reduce the dominance of superiors and equalize participation among participants in discussions with hierarchical structures. Thirty doctors and students having hierarchical relationship were gathered as participants, and an intervention experiment was conducted using a robot that can encourage participants to speak depending on social hierarchy. These were compared with strategies that intervened equally for all participants without considering hierarchy and with a no-action. The robots performed follow actions, showing backchanneling to speech, and encourage actions, prompting speech from members with less speaking time, on the basis of the hierarchical relationships among group members to equalize participation. The experimental results revealed that the robot's actions could potentially influence the speaking time among members, but it could not be conclusively stated that there were significant differences between the robot's action conditions. However, the results suggested that it might be possible to influence speaking time without decreasing the satisfaction of superiors. This indicates that in discussion scenarios where experienced superiors are likely to dominate, controlling the robot's backchanneling behavior could potentially suppress dominance and equalize participation among group members.
comment: 8 pages, 7 figures. International Conference on Human-Agent Interaction (HAI '24), November 24-27, 2024, Swansea, United Kingdom
Visual Prompting for Robotic Manipulation with Annotation-Guided Pick-and-Place Using ACT
Robotic pick-and-place tasks in convenience stores pose challenges due to dense object arrangements, occlusions, and variations in object properties such as color, shape, size, and texture. These factors complicate trajectory planning and grasping. This paper introduces a perception-action pipeline leveraging annotation-guided visual prompting, where bounding box annotations identify both pickable objects and placement locations, providing structured spatial guidance. Instead of traditional step-by-step planning, we employ Action Chunking with Transformers (ACT) as an imitation learning algorithm, enabling the robotic arm to predict chunked action sequences from human demonstrations. This facilitates smooth, adaptive, and data-driven pick-and-place operations. We evaluate our system based on success rate and visual analysis of grasping behavior, demonstrating improved grasp accuracy and adaptability in retail environments.
Boosting Action-Information via a Variational Bottleneck on Unlabelled Robot Videos
Learning from demonstrations (LfD) typically relies on large amounts of action-labeled expert trajectories, which fundamentally constrains the scale of available training data. A promising alternative is to learn directly from unlabeled video demonstrations. However, we find that existing methods tend to encode latent actions that share little mutual information with the true robot actions, leading to suboptimal control performance. To address this limitation, we introduce a novel framework that explicitly maximizes the mutual information between latent actions and true actions, even in the absence of action labels. Our method leverage the variational information-bottleneck to extract action-relevant representations while discarding task-irrelevant information. We provide a theoretical analysis showing that our objective indeed maximizes the mutual information between latent and true actions. Finally, we validate our approach through extensive experiments: first in simulated robotic environments and then on real-world robotic platforms, the experimental results demonstrate that our method significantly enhances mutual information and consistently improves policy performance.
CRADLE: Conversational RTL Design Space Exploration with LLM-based Multi-Agent Systems
This paper presents CRADLE, a conversational framework for design space exploration of RTL designs using LLM-based multi-agent systems. Unlike existing rigid approaches, CRADLE enables user-guided flows with internal self-verification, correction, and optimization. We demonstrate the framework with a generator-critic agent system targeting FPGA resource minimization using state-of-the-art LLMs. Experimental results on the RTLLM benchmark show that CRADLE achieves significant reductions in resource usage with averages of 48% and 40% in LUTs and FFs across all benchmark designs.
comment: Accepted for presentation at the 22nd International SoC Conference (ISOCC 2025). Proceedings to be included in IEEE Xplore
Towards Safe Imitation Learning via Potential Field-Guided Flow Matching IROS 2025
Deep generative models, particularly diffusion and flow matching models, have recently shown remarkable potential in learning complex policies through imitation learning. However, the safety of generated motions remains overlooked, particularly in complex environments with inherent obstacles. In this work, we address this critical gap by proposing Potential Field-Guided Flow Matching Policy (PF2MP), a novel approach that simultaneously learns task policies and extracts obstacle-related information, represented as a potential field, from the same set of successful demonstrations. During inference, PF2MP modulates the flow matching vector field via the learned potential field, enabling safe motion generation. By leveraging these complementary fields, our approach achieves improved safety without compromising task success across diverse environments, such as navigation tasks and robotic manipulation scenarios. We evaluate PF2MP in both simulation and real-world settings, demonstrating its effectiveness in task space and joint space control. Experimental results demonstrate that PF2MP enhances safety, achieving a significant reduction of collisions compared to baseline policies. This work paves the way for safer motion generation in unstructured and obstaclerich environments.
comment: 8 pages, 6 figures, Accepted to IROS 2025
OmniVTLA: Vision-Tactile-Language-Action Model with Semantic-Aligned Tactile Sensing
Recent vision-language-action (VLA) models build upon vision-language foundations, and have achieved promising results and exhibit the possibility of task generalization in robot manipulation. However, due to the heterogeneity of tactile sensors and the difficulty of acquiring tactile data, current VLA models significantly overlook the importance of tactile perception and fail in contact-rich tasks. To address this issue, this paper proposes OmniVTLA, a novel architecture involving tactile sensing. Specifically, our contributions are threefold. First, our OmniVTLA features a dual-path tactile encoder framework. This framework enhances tactile perception across diverse vision-based and force-based tactile sensors by using a pretrained vision transformer (ViT) and a semantically-aligned tactile ViT (SA-ViT). Second, we introduce ObjTac, a comprehensive force-based tactile dataset capturing textual, visual, and tactile information for 56 objects across 10 categories. With 135K tri-modal samples, ObjTac supplements existing visuo-tactile datasets. Third, leveraging this dataset, we train a semantically-aligned tactile encoder to learn a unified tactile representation, serving as a better initialization for OmniVTLA. Real-world experiments demonstrate substantial improvements over state-of-the-art VLA baselines, achieving 96.9% success rates with grippers, (21.9% higher over baseline) and 100% success rates with dexterous hands (6.2% higher over baseline) in pick-and-place tasks. Besides, OmniVTLA significantly reduces task completion time and generates smoother trajectories through tactile sensing compared to existing VLA.
comment: 15 pages, 7 figures, 8 tables
ZS-Puffin: Design, Modeling and Implementation of an Unmanned Aerial-Aquatic Vehicle with Amphibious Wings IROS 2025
Unmanned aerial-aquatic vehicles (UAAVs) can operate both in the air and underwater, giving them broad application prospects. Inspired by the dual-function wings of puffins, we propose a UAAV with amphibious wings to address the challenge posed by medium differences on the vehicle's propulsion system. The amphibious wing, redesigned based on a fixed-wing structure, features a single degree of freedom in pitch and requires no additional components. It can generate lift in the air and function as a flapping wing for propulsion underwater, reducing disturbance to marine life and making it environmentally friendly. Additionally, an artificial central pattern generator (CPG) is introduced to enhance the smoothness of the flapping motion. This paper presents the prototype, design details, and practical implementation of this concept.
comment: Accepted to IROS 2025
Communication Efficient Robotic Mixed Reality with Gaussian Splatting Cross-Layer Optimization
Realizing low-cost communication in robotic mixed reality (RoboMR) systems presents a challenge, due to the necessity of uploading high-resolution images through wireless channels. This paper proposes Gaussian splatting (GS) RoboMR (GSMR), which enables the simulator to opportunistically render a photo-realistic view from the robot's pose by calling ``memory'' from a GS model, thus reducing the need for excessive image uploads. However, the GS model may involve discrepancies compared to the actual environments. To this end, a GS cross-layer optimization (GSCLO) framework is further proposed, which jointly optimizes content switching (i.e., deciding whether to upload image or not) and power allocation (i.e., adjusting to content profiles) across different frames by minimizing a newly derived GSMR loss function. The GSCLO problem is addressed by an accelerated penalty optimization (APO) algorithm that reduces computational complexity by over $10$x compared to traditional branch-and-bound and search algorithms. Moreover, variants of GSCLO are presented to achieve robust, low-power, and multi-robot GSMR. Extensive experiments demonstrate that the proposed GSMR paradigm and GSCLO method achieve significant improvements over existing benchmarks on both wheeled and legged robots in terms of diverse metrics in various scenarios. For the first time, it is found that RoboMR can be achieved with ultra-low communication costs, and mixture of data is useful for enhancing GS performance in dynamic scenarios.
comment: 14 pages, 18 figures, to appear in IEEE Transactions on Cognitive Communications and Networking
Autonomous Mobile Plant Watering Robot : A Kinematic Approach
Plants need regular and the appropriate amount of watering to thrive and survive. While agricultural robots exist that can spray water on plants and crops such as the , they are expensive and have limited mobility and/or functionality. We introduce a novel autonomous mobile plant watering robot that uses a 6 degree of freedom (DOF) manipulator, connected to a 4 wheel drive alloy chassis, to be able to hold a garden hose, recognize and detect plants, and to water them with the appropriate amount of water by being able to insert a soil humidity/moisture sensor into the soil. The robot uses Jetson Nano and Arduino microcontroller and real sense camera to perform computer vision to detect plants using real-time YOLOv5 with the Pl@ntNet-300K dataset. The robot uses LIDAR for object and collision avoideance and does not need to move on a pre-defined path and can keep track of which plants it has watered. We provide the Denavit-Hartenberg (DH) Table, forward kinematics, differential driving kinematics, and inverse kinematics along with simulation and experiment results
Developing a Calibrated Physics-Based Digital Twin for Construction Vehicles
This paper presents the development of a calibrated digital twin of a wheel loader. A calibrated digital twin integrates a construction vehicle with a high-fidelity digital model allowing for automated diagnostics and optimization of operations as well as pre-planning simulations enhancing automation capabilities. The high-fidelity digital model is a virtual twin of the physical wheel loader. It uses a physics-based multibody dynamic model of the wheel loader in the software AGX Dynamics. Interactions of the wheel loader's bucket while in use in construction can be simulated in the virtual model. Calibration makes this simulation of high-fidelity which can enhance realistic planning for automation of construction operations. In this work, a wheel loader was instrumented with several sensors used to calibrate the digital model. The calibrated digital twin was able to estimate the magnitude of the forces on the bucket base with high accuracy, providing a high-fidelity simulation.
DeepFleet: Multi-Agent Foundation Models for Mobile Robots
We introduce DeepFleet, a suite of foundation models designed to support coordination and planning for large-scale mobile robot fleets. These models are trained on fleet movement data, including robot positions, goals, and interactions, from hundreds of thousands of robots in Amazon warehouses worldwide. DeepFleet consists of four architectures that each embody a distinct inductive bias and collectively explore key points in the design space for multi-agent foundation models: the robot-centric (RC) model is an autoregressive decision transformer operating on neighborhoods of individual robots; the robot-floor (RF) model uses a transformer with cross-attention between robots and the warehouse floor; the image-floor (IF) model applies convolutional encoding to a multi-channel image representation of the full fleet; and the graph-floor (GF) model combines temporal attention with graph neural networks for spatial relationships. In this paper, we describe these models and present our evaluation of the impact of these design choices on prediction task performance. We find that the robot-centric and graph-floor models, which both use asynchronous robot state updates and incorporate the localized structure of robot interactions, show the most promise. We also present experiments that show that these two models can make effective use of larger warehouses operation datasets as the models are scaled up.
comment: 25 pages, 10 figures, 2 tables
CLF-RL: Control Lyapunov Function Guided Reinforcement Learning
Reinforcement learning (RL) has shown promise in generating robust locomotion policies for bipedal robots, but often suffers from tedious reward design and sensitivity to poorly shaped objectives. In this work, we propose a structured reward shaping framework that leverages model-based trajectory generation and control Lyapunov functions (CLFs) to guide policy learning. We explore two model-based planners for generating reference trajectories: a reduced-order linear inverted pendulum (LIP) model for velocity-conditioned motion planning, and a precomputed gait library based on hybrid zero dynamics (HZD) using full-order dynamics. These planners define desired end-effector and joint trajectories, which are used to construct CLF-based rewards that penalize tracking error and encourage rapid convergence. This formulation provides meaningful intermediate rewards, and is straightforward to implement once a reference is available. Both the reference trajectories and CLF shaping are used only during training, resulting in a lightweight policy at deployment. We validate our method both in simulation and through extensive real-world experiments on a Unitree G1 robot. CLF-RL demonstrates significantly improved robustness relative to the baseline RL policy and better performance than a classic tracking reward RL formulation.
comment: 8 pages; 8 figures
How Safe Will I Be Given What I Saw? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy
Autonomous robots that rely on deep neural network controllers pose critical challenges for safety prediction, especially under partial observability and distribution shift. Traditional model-based verification techniques are limited in scalability and require access to low-dimensional state models, while model-free methods often lack reliability guarantees. This paper addresses these limitations by introducing a framework for calibrated safety prediction in end-to-end vision-controlled systems, where neither the state-transition model nor the observation model is accessible. Building on the foundation of world models, we leverage variational autoencoders and recurrent predictors to forecast future latent trajectories from raw image sequences and estimate the probability of satisfying safety properties. We distinguish between monolithic and composite prediction pipelines and introduce a calibration mechanism to quantify prediction confidence. In long-horizon predictions from high-dimensional observations, the forecasted inputs to the safety evaluator can deviate significantly from the training distribution due to compounding prediction errors and changing environmental conditions, leading to miscalibrated risk estimates. To address this, we incorporate unsupervised domain adaptation to ensure robustness of safety evaluation under distribution shift in predictions without requiring manual labels. Our formulation provides theoretical calibration guarantees and supports practical evaluation across long prediction horizons. Experimental results on three benchmarks show that our UDA-equipped evaluators maintain high accuracy and substantially lower false positive rates under distribution shift. Similarly, world model-based composite predictors outperform their monolithic counterparts on long-horizon tasks, and our conformal calibration provides reliable statistical bounds.
SegDAC: Segmentation-Driven Actor-Critic for Visual Reinforcement Learning
Visual reinforcement learning (RL) is challenging due to the need to learn both perception and actions from high-dimensional inputs and noisy rewards. Although large perception models exist, integrating them effectively into RL for visual generalization and improved sample efficiency remains unclear. We propose SegDAC, a Segmentation-Driven Actor-Critic method. SegDAC uses Segment Anything (SAM) for object-centric decomposition and YOLO-World to ground segments semantically via text prompts. It includes a novel transformer-based architecture that supports a dynamic number of segments at each time step and effectively learns which segments to focus on using online RL, without using human labels. By evaluating SegDAC over a challenging visual generalization benchmark using Maniskill3, which covers diverse manipulation tasks under strong visual perturbations, we demonstrate that SegDAC achieves significantly better visual generalization, doubling prior performance on the hardest setting and matching or surpassing prior methods in sample efficiency across all evaluated tasks.
Decision-Making-Based Path Planning for Autonomous UAVs: A Survey
One of the most critical features for the successful operation of autonomous UAVs is the ability to make decisions based on the information acquired from their surroundings. Each UAV must be able to make decisions during the flight in order to deal with uncertainties in its system and the environment, and to further act upon the information being received. Such decisions influence the future behavior of the UAV, which is expressed as the path plan. Thus, decision-making in path planning is an enabling technique for deploying autonomous UAVs in real-world applications. This survey provides an overview of existing studies that use aspects of decision-making in path planning, presenting the research strands for Exploration Path Planning and Informative Path Planning, and focusing on characteristics of how data have been modeled and understood. Finally, we highlight the existing challenges for relevant topics in this field.
Safety Perspective on Assisted Lane Changes: Insights from Open-Road, Live-Traffic Experiments
This study investigates the assisted lane change functionality of five different vehicles equipped with advanced driver assistance systems (ADAS). The goal is to examine novel, under-researched features of commercially available ADAS technologies. The experimental campaign, conducted in the I-24 highway near Nashville, TN, US, collected data on the kinematics and safety margins of assisted lane changes in real-world conditions. The results show that the kinematics of assisted lane changes are consistent for each system, with four out of five vehicles using slower speeds and decelerations than human drivers. However, one system consistently performed more assertive lane changes, completing the maneuver in around 5 seconds. Regarding safety margins, only three vehicles are investigated. Those operated in the US are not restricted by relevant UN regulations, and their designs were found not to adhere to these regulatory requirements. A simulation method used to classify the challenge level for the vehicle receiving the lane change, showing that these systems can force trailing vehicles to decelerate to keep a safe gap. One assisted system was found to have performed a maneuver that posed a hard challenge level for the other vehicle, raising concerns about the safety of these systems in real-world operation. All three vehicles were found to carry out lane changes that induced decelerations to the vehicle in the target lane. Those decelerations could affect traffic flow, inducing traffic shockwaves.
comment: 21 pages, 8 Figures
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
Learning skills from human motions offers a promising path toward generalizable policies for whole-body humanoid control, yet two key cornerstones are missing: (1) a high-quality motion tracking framework that faithfully transforms large-scale kinematic references into robust and extremely dynamic motions on real hardware, and (2) a distillation approach that can effectively learn these motion primitives and compose them to solve downstream tasks. We address these gaps with BeyondMimic, a real-world framework to learn from human motions for versatile and naturalistic humanoid control via guided diffusion. Our framework provides a motion tracking pipeline capable of challenging skills such as jumping spins, sprinting, and cartwheels with state-of-the-art motion quality. Moving beyond mimicking existing motions and synthesize novel ones, we further introduce a unified diffusion policy that enables zero-shot task-specific control at test time using simple cost functions. Deployed on hardware, BeyondMimic performs diverse tasks at test time, including waypoint navigation, joystick teleoperation, and obstacle avoidance, bridging sim-to-real motion tracking and flexible synthesis of human motion primitives for whole-body control. https://beyondmimic.github.io/.
comment: fix footnote and math
GMF-Drive: Gated Mamba Fusion with Spatial-Aware BEV Representation for End-to-End Autonomous Driving
Diffusion-based models are redefining the state-of-the-art in end-to-end autonomous driving, yet their performance is increasingly hampered by a reliance on transformer-based fusion. These architectures face fundamental limitations: quadratic computational complexity restricts the use of high-resolution features, and a lack of spatial priors prevents them from effectively modeling the inherent structure of Bird's Eye View (BEV) representations. This paper introduces GMF-Drive (Gated Mamba Fusion for Driving), an end-to-end framework that overcomes these challenges through two principled innovations. First, we supersede the information-limited histogram-based LiDAR representation with a geometrically-augmented pillar format encoding shape descriptors and statistical features, preserving critical 3D geometric details. Second, we propose a novel hierarchical gated mamba fusion (GM-Fusion) architecture that substitutes an expensive transformer with a highly efficient, spatially-aware state-space model (SSM). Our core BEV-SSM leverages directional sequencing and adaptive fusion mechanisms to capture long-range dependencies with linear complexity, while explicitly respecting the unique spatial properties of the driving scene. Extensive experiments on the challenging NAVSIM benchmark demonstrate that GMF-Drive achieves a new state-of-the-art performance, significantly outperforming DiffusionDrive. Comprehensive ablation studies validate the efficacy of each component, demonstrating that task-specific SSMs can surpass a general-purpose transformer in both performance and efficiency for autonomous driving.
comment: 7 pages, 4 figures
ReNiL: Relative Neural Inertial Locator with Any-Scale Bayesian Inference
Pedestrian inertial localization is key for mobile and IoT services because it provides infrastructure-free positioning. Yet most learning-based methods depend on fixed sliding-window integration, struggle to adapt to diverse motion scales and cadences, and yield inconsistent uncertainty, limiting real-world use. We present ReNiL, a Bayesian deep-learning framework for accurate, efficient, and uncertainty-aware pedestrian localization. ReNiL introduces Inertial Positioning Demand Points (IPDPs) to estimate motion at contextually meaningful waypoints instead of dense tracking, and supports inference on IMU sequences at any scale so cadence can match application needs. It couples a motion-aware orientation filter with an Any-Scale Laplace Estimator (ASLE), a dual-task network that blends patch-based self-supervision with Bayesian regression. By modeling displacements with a Laplace distribution, ReNiL provides homogeneous Euclidean uncertainty that integrates cleanly with other sensors. A Bayesian inference chain links successive IPDPs into consistent trajectories. On RoNIN-ds and a new WUDataset covering indoor and outdoor motion from 28 participants, ReNiL achieves state-of-the-art displacement accuracy and uncertainty consistency, outperforming TLIO, CTIN, iMoT, and RoNIN variants while reducing computation. Application studies further show robustness and practicality for mobile and IoT localization, making ReNiL a scalable, uncertainty-aware foundation for next-generation positioning.
comment: This work has been submitted to the IEEE for possible publication
Touch and Tell: Multimodal Decoding of Human Emotions and Social Gestures for Robots
Human emotions are complex and can be conveyed through nuanced touch gestures. Previous research has primarily focused on how humans recognize emotions through touch or on identifying key features of emotional expression for robots. However, there is a gap in understanding how reliably these emotions and gestures can be communicated to robots via touch and interpreted using data driven methods. This study investigates the consistency and distinguishability of emotional and gestural expressions through touch and sound. To this end, we integrated a custom piezoresistive pressure sensor as well as a microphone on a social robot. Twenty-eight participants first conveyed ten different emotions to the robot using spontaneous touch gestures, then they performed six predefined social touch gestures. Our findings reveal statistically significant consistency in both emotion and gesture expression among participants. However, some emotions exhibited low intraclass correlation values, and certain emotions with similar levels of arousal or valence did not show significant differences in their conveyance. To investigate emotion and social gesture decoding within affective human-robot tactile interaction, we developed single-modality models and multimodal models integrating tactile and auditory features. A support vector machine (SVM) model trained on multimodal features achieved the highest accuracy for classifying ten emotions, reaching 40 %.For gesture classification, a Convolutional Neural Network- Long Short-Term Memory Network (CNN-LSTM) achieved 90.74 % accuracy. Our results demonstrate that even though the unimodal models have the potential to decode emotions and touch gestures, the multimodal integration of touch and sound significantly outperforms unimodal approaches, enhancing the decoding of both emotions and gestures.
Joint State and Noise Covariance Estimation
This paper tackles the problem of jointly estimating the noise covariance matrix alongside states (parameters such as poses and points) from measurements corrupted by Gaussian noise and, if available, prior information. In such settings, the noise covariance matrix determines the weights assigned to individual measurements in the least squares problem. We show that the joint problem exhibits a convex structure and provide a full characterization of the optimal noise covariance estimate (with analytical solutions) within joint maximum a posteriori and likelihood frameworks and several variants. Leveraging this theoretical result, we propose two novel algorithms that jointly estimate the primary parameters and the noise covariance matrix. Our BCD algorithm can be easily integrated into existing nonlinear least squares solvers, with negligible per-iteration computational overhead. To validate our approach, we conduct extensive experiments across diverse scenarios and offer practical insights into their application in robotics and computer vision estimation problems with a particular focus on SLAM.
comment: Adds a missing related work [4]
LM-MCVT: A Lightweight Multi-modal Multi-view Convolutional-Vision Transformer Approach for 3D Object Recognition
In human-centered environments such as restaurants, homes, and warehouses, robots often face challenges in accurately recognizing 3D objects. These challenges stem from the complexity and variability of these environments, including diverse object shapes. In this paper, we propose a novel Lightweight Multi-modal Multi-view Convolutional-Vision Transformer network (LM-MCVT) to enhance 3D object recognition in robotic applications. Our approach leverages the Globally Entropy-based Embeddings Fusion (GEEF) method to integrate multi-views efficiently. The LM-MCVT architecture incorporates pre- and mid-level convolutional encoders and local and global transformers to enhance feature extraction and recognition accuracy. We evaluate our method on the synthetic ModelNet40 dataset and achieve a recognition accuracy of 95.6% using a four-view setup, surpassing existing state-of-the-art methods. To further validate its effectiveness, we conduct 5-fold cross-validation on the real-world OmniObject3D dataset using the same configuration. Results consistently show superior performance, demonstrating the method's robustness in 3D object recognition across synthetic and real-world 3D data.
OSMa-Bench: Evaluating Open Semantic Mapping Under Varying Lighting Conditions
Open Semantic Mapping (OSM) is a key technology in robotic perception, combining semantic segmentation and SLAM techniques. This paper introduces a dynamically configurable and highly automated LLM/LVLM-powered pipeline for evaluating OSM solutions called OSMa-Bench (Open Semantic Mapping Benchmark). The study focuses on evaluating state-of-the-art semantic mapping algorithms under varying indoor lighting conditions, a critical challenge in indoor environments. We introduce a novel dataset with simulated RGB-D sequences and ground truth 3D reconstructions, facilitating the rigorous analysis of mapping performance across different lighting conditions. Through experiments on leading models such as ConceptGraphs, BBQ and OpenScene, we evaluate the semantic fidelity of object recognition and segmentation. Additionally, we introduce a Scene Graph evaluation method to analyze the ability of models to interpret semantic structure. The results provide insights into the robustness of these models, forming future research directions for developing resilient and adaptable robotic systems. Project page is available at https://be2rlab.github.io/OSMa-Bench/.
comment: Project page: https://be2rlab.github.io/OSMa-Bench/
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
Unmanned Aerial Vehicles (UAVs) have made significant advancements in communication stability and security through techniques such as frequency hopping, signal spreading, and adaptive interference suppression. However, challenges remain in modeling spectrum competition, integrating expert knowledge, and predicting opponent behavior. To address these issues, we propose UAV-FPG (Unmanned Aerial Vehicle - Frequency Point Game), a game-theoretic environment model that simulates the dynamic interaction between interference and anti-interference strategies of opponent and ally UAVs in communication frequency bands. The model incorporates a prior expert knowledge base to optimize frequency selection and employs large language models for path planning, simulating a "strong adversary". Experimental results highlight the effectiveness of integrating the expert knowledge base and the large language model, with the latter significantly improving path planning in dynamic scenarios through iterative interactions, outperforming fixed-path strategies. UAV-FPG provides a robust platform for advancing anti-jamming strategies and intelligent decision-making in UAV communication systems.
Gait in Eight: Efficient On-Robot Learning for Omnidirectional Quadruped Locomotion
On-robot Reinforcement Learning is a promising approach to train embodiment-aware policies for legged robots. However, the computational constraints of real-time learning on robots pose a significant challenge. We present a framework for efficiently learning quadruped locomotion in just 8 minutes of raw real-time training utilizing the sample efficiency and minimal computational overhead of the new off-policy algorithm CrossQ. We investigate two control architectures: Predicting joint target positions for agile, high-speed locomotion and Central Pattern Generators for stable, natural gaits. While prior work focused on learning simple forward gaits, our framework extends on-robot learning to omnidirectional locomotion. We demonstrate the robustness of our approach in different indoor and outdoor environments.
What Foundation Models can Bring for Robot Learning in Manipulation : A Survey
The realization of universal robots is an ultimate goal of researchers. However, a key hurdle in achieving this goal lies in the robots' ability to manipulate objects in their unstructured surrounding environments according to different tasks. The learning-based approach is considered an effective way to address generalization. The impressive performance of foundation models in the fields of computer vision and natural language suggests the potential of embedding foundation models into manipulation tasks as a viable path toward achieving general manipulation capability. However, we believe achieving general manipulation capability requires an overarching framework akin to auto driving. This framework should encompass multiple functional modules, with different foundation models assuming distinct roles in facilitating general manipulation capability. This survey focuses on the contributions of foundation models to robot learning for manipulation. We propose a comprehensive framework and detail how foundation models can address challenges in each module of the framework. What's more, we examine current approaches, outline challenges, suggest future research directions, and identify potential risks associated with integrating foundation models into this domain.
Edge-Based Multimodal Sensor Data Fusion with Vision Language Models (VLMs) for Real-time Autonomous Vehicle Accident Avoidance
Autonomous driving (AD) systems relying solely on onboard sensors may fail to detect distant or obstacle hazards, potentially causing preventable collisions; however, existing transformer-based Vehicle-to-Everything (V2X) approaches, which mitigate AD sensing limitations, either lack effective multimodal fusion and reasoning or struggle to meet real-time performance requirements under complex, high-dimensional traffic conditions. This paper proposes the Real-time Edge-based Autonomous Co-pilot Trajectory planner (REACT), a V2X-integrated trajectory optimization framework for AD based on a fine-tuned lightweight Vision-Language Model (VLM). REACT integrates infrastructure-provided hazard alerts with onboard sensor data, capturing intricate surrounding traffic dynamics and vehicle intents through visual embeddings, interpreting precise numerical data from symbolic inputs, and employing contextual reasoning to generate optimized, safety-oriented trajectories. To ensure robust real-time deployment on edge devices, REACT innovatively employs Residual Trajectory Fusion (RTF) design and specialized edge-adaptation strategies to reduce model complexity and improve inference efficiency. Evaluated on the DeepAccident benchmark, REACT achieves state-of-the-art performance, a 77% collision rate reduction, a 48.2% Video Panoptic Quality (VPQ), and a 0.57-second inference latency on the Jetson AGX Orin. Ablation studies validate the contribution of each input, module, and edge adaptation strategy. These results highlight the effectiveness of lightweight VLMs in enabling real-time cooperative planning on edge platforms and underscore the potential of language-guided contextual reasoning for improving traffic safety and responsiveness.
comment: 24 pages, 6 tables, 7 figures
UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI ICCV 2025
We introduce UnrealZoo, a collection of over 100 photo-realistic 3D virtual worlds built on Unreal Engine, designed to reflect the complexity and variability of open-world environments. We also provide a rich variety of playable entities, including humans, animals, robots, and vehicles for embodied AI research. We extend UnrealCV with optimized APIs and tools for data collection, environment augmentation, distributed training, and benchmarking. These improvements achieve significant improvements in the efficiency of rendering and communication, enabling advanced applications such as multi-agent interactions. Our experimental evaluation across visual navigation and tracking tasks reveals two key insights: 1) environmental diversity provides substantial benefits for developing generalizable reinforcement learning (RL) agents, and 2) current embodied agents face persistent challenges in open-world scenarios, including navigation in unstructured terrain, adaptation to unseen morphologies, and managing latency in the close-loop control systems for interacting in highly dynamic objects. UnrealZoo thus serves as both a comprehensive testing ground and a pathway toward developing more capable embodied AI systems for real-world deployment.
comment: ICCV 2025 (Highlight), Project page: http://unrealzoo.site/
Speech to Reality: On-Demand Production using Natural Language, 3D Generative AI, and Discrete Robotic Assembly
We present a system that transforms speech into physical objects using 3D generative AI and discrete robotic assembly. By leveraging natural language input, the system makes design and manufacturing more accessible to individuals without expertise in 3D modeling or robotic programming. While current generative AI models can produce a wide range of 3D digital assets, AI-generated meshes are not directly suitable for robotic fabrication and do not account for fabrication constraints. To address this, we contribute a workflow that integrates natural language processing, 3D generative AI, and discrete robotic assembly. The system automatically analyzes and modifies AI-generated geometry to meet physical constraints, such as component count, overhangs, and connectivity, and produces a feasible robotic assembly sequence and toolpath. The results are demonstrated through the assembly of various objects, ranging from chairs to shelves, which are prompted via speech and realized within 5 minutes using a robotic arm.
comment: This work has been submitted for possible publication. An updated version will replace this version when available
Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES)
Understanding user enjoyment is crucial in human-robot interaction (HRI), as it can impact interaction quality and influence user acceptance and long-term engagement with robots, particularly in the context of conversations with social robots. However, current assessment methods rely solely on self-reported questionnaires, failing to capture interaction dynamics. This work introduces the Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES), a novel 5-point scale to assess user enjoyment from an external perspective (e.g. by an annotator) for conversations with a robot. The scale was developed through rigorous evaluations and discussions among three annotators with relevant expertise, using open-domain conversations with a companion robot that was powered by a large language model, and was applied to each conversation exchange (i.e. a robot-participant turn pair) alongside overall interaction. It was evaluated on 25 older adults' interactions with the companion robot, corresponding to 174 minutes of data, showing moderate to good alignment between annotators. Although the scale was developed and tested in the context of older adult interactions with a robot, its basis in general and non-task-specific indicators of enjoyment supports its broader applicability. The study further offers insights into understanding the nuances and challenges of assessing user enjoyment in robot interactions, and provides guidelines on applying the scale to other domains and populations. The dataset is available online.
comment: Published in IEEE Transactions on Affective Computing on 18 July 2025. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media
Anticipating Degradation: A Predictive Approach to Fault Tolerance in Robot Swarms
An active approach to fault tolerance is essential for robot swarms to achieve long-term autonomy. Previous efforts have focused on responding to spontaneous electro-mechanical faults and failures. However, many faults occur gradually over time. Waiting until such faults have manifested as failures before addressing them is both inefficient and unsustainable in a variety of scenarios. This work argues that the principles of predictive maintenance, in which potential faults are resolved before they hinder the operation of the swarm, offer a promising means of achieving long-term fault tolerance. This is a novel approach to swarm fault tolerance, which is shown to give a comparable or improved performance when tested against a reactive approach in almost all cases tested.
Context-based Motion Retrieval using Open Vocabulary Methods for Autonomous Driving
Autonomous driving systems must operate reliably in safety-critical scenarios, particularly those involving unusual or complex behavior by Vulnerable Road Users (VRUs). Identifying these edge cases in driving datasets is essential for robust evaluation and generalization, but retrieving such rare human behavior scenarios within the long tail of large-scale datasets is challenging. To support targeted evaluation of autonomous driving systems in diverse, human-centered scenarios, we propose a novel context-aware motion retrieval framework. Our method combines Skinned Multi-Person Linear (SMPL)-based motion sequences and corresponding video frames before encoding them into a shared multimodal embedding space aligned with natural language. Our approach enables the scalable retrieval of human behavior and their context through text queries. This work also introduces our dataset WayMoCo, an extension of the Waymo Open Dataset. It contains automatically labeled motion and scene context descriptions derived from generated pseudo-ground-truth SMPL sequences and corresponding image data. Our approach outperforms state-of-the-art models by up to 27.5% accuracy in motion-context retrieval, when evaluated on the WayMoCo dataset.
comment: Project page: https://iv.ee.hm.edu/contextmotionclip/; This work has been submitted to the IEEE for possible publication
Multi-Keypoint Affordance Representation for Functional Dexterous Grasping
Functional dexterous grasping requires precise hand-object interaction, going beyond simple gripping. Existing affordance-based methods primarily predict coarse interaction regions and cannot directly constrain the grasping posture, leading to a disconnection between visual perception and manipulation. To address this issue, we propose a multi-keypoint affordance representation for functional dexterous grasping, which directly encodes task-driven grasp configurations by localizing functional contact points. Our method introduces Contact-guided Multi-Keypoint Affordance (CMKA), leveraging human grasping experience images for weak supervision combined with Large Vision Models for fine affordance feature extraction, achieving generalization while avoiding manual keypoint annotations. Additionally, we present a Keypoint-based Grasp matrix Transformation (KGT) method, ensuring spatial consistency between hand keypoints and object contact points, thus providing a direct link between visual perception and dexterous grasping actions. Experiments on public real-world FAH datasets, IsaacGym simulation, and challenging robotic tasks demonstrate that our method significantly improves affordance localization accuracy, grasp consistency, and generalization to unseen tools and tasks, bridging the gap between visual affordance learning and dexterous robotic manipulation. The source code and demo videos are publicly available at https://github.com/PopeyePxx/MKA.
comment: Accepted to IEEE Robotics and Automation Letters (RA-L). The source code and demo videos are publicly available at https://github.com/PopeyePxx/MKA
A simulation framework for autonomous lunar construction work
We present a simulation framework for lunar construction work involving multiple autonomous machines. The framework supports modelling of construction scenarios and autonomy solutions, execution of the scenarios in simulation, and analysis of work time and energy consumption throughout the construction project. The simulations are based on physics-based models for contacting multibody dynamics and deformable terrain, including vehicle-soil interaction forces and soil flow in real time. A behaviour tree manages the operational logic and error handling, which enables the representation of complex behaviours through a discrete set of simpler tasks in a modular hierarchical structure. High-level decision-making is separated from lower-level control algorithms, with the two connected via ROS2. Excavation movements are controlled through inverse kinematics and tracking controllers. The framework is tested and demonstrated on two different lunar construction scenarios that involve an excavator and dump truck with actively controlled articulated crawlers.
comment: 13 pages, 16 figures
Hypergraph-based Motion Generation with Multi-modal Interaction Relational Reasoning
The intricate nature of real-world driving environments, characterized by dynamic and diverse interactions among multiple vehicles and their possible future states, presents considerable challenges in accurately predicting the motion states of vehicles and handling the uncertainty inherent in the predictions. Addressing these challenges requires comprehensive modeling and reasoning to capture the implicit relations among vehicles and the corresponding diverse behaviors. This research introduces an integrated framework for autonomous vehicles (AVs) motion prediction to address these complexities, utilizing a novel Relational Hypergraph Interaction-informed Neural mOtion generator (RHINO). RHINO leverages hypergraph-based relational reasoning by integrating a multi-scale hypergraph neural network to model group-wise interactions among multiple vehicles and their multi-modal driving behaviors, thereby enhancing motion prediction accuracy and reliability. Experimental validation using real-world datasets demonstrates the superior performance of this framework in improving predictive accuracy and fostering socially aware automated driving in dynamic traffic scenarios. The source code is publicly available at https://github.com/keshuw95/RHINO-Hypergraph-Motion-Generation.
MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention
Aligning robot behavior with human preferences is crucial for deploying embodied AI agents in human-centered environments. A promising solution is interactive imitation learning from human intervention, where a human expert observes the policy's execution and provides interventions as feedback. However, existing methods often fail to utilize the prior policy efficiently to facilitate learning, thus hindering sample efficiency. In this work, we introduce MEReQ (Maximum-Entropy Residual-Q Inverse Reinforcement Learning), designed for sample-efficient alignment from human intervention. Instead of inferring the complete human behavior characteristics, MEReQ infers a residual reward function that captures the discrepancy between the human expert's and the prior policy's underlying reward functions. It then employs Residual Q-Learning (RQL) to align the policy with human preferences using this residual reward function. Extensive evaluations on simulated and real-world tasks demonstrate that MEReQ achieves sample-efficient policy alignment from human intervention.
IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model
Vision-Language-Action (VLA) models have demonstrated potential in autonomous driving. However, two critical challenges hinder their development: (1) Existing VLA architectures are typically based on imitation learning in open-loop setup which tends to capture the recorded behaviors in the dataset, leading to suboptimal and constrained performance, (2) Close-loop training relies heavily on high-fidelity sensor simulation, where domain gaps and computational inefficiencies pose significant barriers. In this paper, we introduce IRL-VLA, a novel close-loop Reinforcement Learning via \textbf{I}nverse \textbf{R}einforcement \textbf{L}earning reward world model with a self-built VLA approach. Our framework proceeds in a three-stage paradigm: In the first stage, we propose a VLA architecture and pretrain the VLA policy via imitation learning. In the second stage, we construct a lightweight reward world model via inverse reinforcement learning to enable efficient close-loop reward computation. To further enhance planning performance, finally, we design specialized reward world model guidence reinforcement learning via PPO(Proximal Policy Optimization) to effectively balance the safety incidents, comfortable driving, and traffic efficiency. Our approach achieves state-of-the-art performance in NAVSIM v2 end-to-end driving benchmark, 1st runner up in CVPR2025 Autonomous Grand Challenge. We hope that our framework will accelerate VLA research in close-loop autonomous driving.
comment: 9 pagres, 2 figures
MolmoAct: Action Reasoning Models that can Reason in Space
Reasoning is central to purposeful action, yet most robotic foundation models map perception and instructions directly to control, which limits adaptability, generalization, and semantic grounding. We introduce Action Reasoning Models (ARMs), a class of robotic foundation models that integrate perception, planning, and control through a structured three-stage pipeline. Our model, MolmoAct, encodes observations and instructions into depth-aware perception tokens, generates mid-level spatial plans as editable trajectory traces, and predicts precise low-level actions, enabling explainable and steerable behavior. MolmoAct-7B-D achieves strong performance across simulation and real-world settings: 70.5% zero-shot accuracy on SimplerEnv Visual Matching tasks, surpassing closed-source Pi-0 and GR00T N1; 86.6% average success on LIBERO, including an additional 6.3% gain over ThinkAct on long-horizon tasks; and in real-world fine-tuning, an additional 10% (single-arm) and an additional 22.7% (bimanual) task progression over Pi-0-FAST. It also outperforms baselines by an additional 23.3% on out-of-distribution generalization and achieves top human-preference scores for open-ended instruction following and trajectory steering. Furthermore, we release, for the first time, the MolmoAct Dataset -- a mid-training robot dataset comprising over 10,000 high quality robot trajectories across diverse scenarios and tasks. Training with this dataset yields an average 5.5% improvement in general performance over the base model. We release all model weights, training code, our collected dataset, and our action reasoning dataset, establishing MolmoAct as both a state-of-the-art robotics foundation model and an open blueprint for building ARMs that transform perception into purposeful action through structured reasoning. Blogpost: https://allenai.org/blog/molmoact
comment: Appendix include. Code, Data and Weights: https://allenai.org/blog/molmoact
Open-Set LiDAR Panoptic Segmentation Guided by Uncertainty-Aware Learning
Autonomous vehicles that navigate in open-world environments may encounter previously unseen object classes. However, most existing LiDAR panoptic segmentation models rely on closed-set assumptions, failing to detect unknown object instances. In this work, we propose ULOPS, an uncertainty-guided open-set panoptic segmentation framework that leverages Dirichlet-based evidential learning to model predictive uncertainty. Our architecture incorporates separate decoders for semantic segmentation with uncertainty estimation, embedding with prototype association, and instance center prediction. During inference, we leverage uncertainty estimates to identify and segment unknown instances. To strengthen the model's ability to differentiate between known and unknown objects, we introduce three uncertainty-driven loss functions. Uniform Evidence Loss to encourage high uncertainty in unknown regions. Adaptive Uncertainty Separation Loss ensures a consistent difference in uncertainty estimates between known and unknown objects at a global scale. Contrastive Uncertainty Loss refines this separation at the fine-grained level. To evaluate open-set performance, we extend benchmark settings on KITTI-360 and introduce a new open-set evaluation for nuScenes. Extensive experiments demonstrate that ULOPS consistently outperforms existing open-set LiDAR panoptic segmentation methods.
Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning Code LLMs
Code LLMs have shown promising results with converting tasks in natural language to programs that can be executed by service robots. We are interested in finetuning small, specialized LLMs for this purpose, but collecting datasets of task-program pairs specific to each robot is time-consuming and expensive. While approaches such as SELF-INSTRUCT and EVOL-INSTRUCT are capable of generating novel tasks given a few examples, they are unable to provide the corresponding programs that correctly abide by physical-world and robot-constraints using the provided programming interface. Using a simulator is a natural potential solution to checking for such constraints, but building simulation environments that can handle arbitrary tasks and their necessary objects and locations, is challenging. To address these challenges, we introduce ROBO-INSTRUCT, which synthesizes task-specific simulation environments on the fly during program execution, by opportunistically inferring entity properties and enforcing corresponding constraints based on how the entities are used in the task program. Additionally, ROBO-INSTRUCT integrates an LLM-aided post-processing procedure to refine instructions for better alignment with robot programs. We demonstrate the effectiveness of ROBO-INSTRUCT across multiple LLMs, showing that our fine-tuned models outperform all baseline methods and even match or surpass the performance of several larger and proprietary models.
TinyMPC: Model-Predictive Control on Resource-Constrained Microcontrollers ICRA 2024
Model-predictive control (MPC) is a powerful tool for controlling highly dynamic robotic systems subject to complex constraints. However, MPC is computationally demanding, and is often impractical to implement on small, resource-constrained robotic platforms. We present TinyMPC, a high-speed MPC solver with a low memory footprint targeting the microcontrollers common on small robots. Our approach is based on the alternating direction method of multipliers (ADMM) and leverages the structure of the MPC problem for efficiency. We demonstrate TinyMPC's effectiveness by benchmarking against the state-of-the-art solver OSQP, achieving nearly an order of magnitude speed increase, as well as through hardware experiments on a 27 gram quadrotor, demonstrating high-speed trajectory tracking and dynamic obstacle avoidance. TinyMPC is publicly available at https://tinympc.org.
comment: Accepted at ICRA 2024. First three authors contributed equally and are listed in alphabetical order. Publicly available at https://tinympc.org
Multiagent Systems
Not in My Backyard! Temporal Voting Over Public Chores IJCAI
We study a temporal voting model where voters have dynamic preferences over a set of public chores -- projects that benefit society, but impose individual costs on those affected by their implementation. We investigate the computational complexity of optimizing utilitarian and egalitarian welfare. Our results show that while optimizing the former is computationally straightforward, minimizing the latter is computationally intractable, even in very restricted cases. Nevertheless, we identify several settings where this problem can be solved efficiently, either exactly or by an approximation algorithm. We also examine the effects of enforcing temporal fairness and its impact on social welfare, and analyze the competitive ratio of online algorithms. We then explore the strategic behavior of agents, providing insights into potential malfeasance in such decision-making environments. Finally, we discuss a range of fairness measures and their suitability for our setting.
comment: Appears in the 34th International Joint Conference on Artificial Intelligence (IJCAI), 2025
Fault Tolerant Multi-Agent Learning with Adversarial Budget Constraints
In multi-agent systems, the safe and reliable execution of tasks often depends on agents correctly coordinating their actions. However, in real-world deployments, failures of computational components are inevitable, presenting a critical challenge: ensuring that multi-agent reinforcement learning (MARL) policies remain effective even when some agents malfunction. We propose the Multi-Agent Robust Training Algorithm (MARTA), a plug-and-play framework for training MARL agents to be resilient to potentially severe faults. MARTA operates in cooperative multi-agent settings where agents may lose the ability to execute their intended actions. It learns to identify failure scenarios that are especially detrimental to system performance and equips agents with strategies to mitigate their impact. At the heart of MARTA is a novel adversarial Markov game in which an adversary -- modelled via \emph{Markov switching controls} -- learns to disable agents in high-risk state regions, while the remaining agents are trained to \emph{jointly} best-respond to such targeted malfunctions. To ensure practicality, MARTA enforces a malfunction budget, constraining the adversary to a fixed number of failures and learning robust policies accordingly. We provide theoretical guarantees that MARTA converges to a Markov perfect equilibrium, ensuring agents optimally counteract worst-case faults. Empirically, we show that MARTA achieves state-of-the-art fault-tolerant performance across benchmark environments, including Multi-Agent Particle World and Level-Based Foraging.
CARES: Collaborative Agentic Reasoning for Error Detection in Surgery
Robotic-assisted surgery (RAS) introduces complex challenges that current surgical error detection methods struggle to address effectively due to limited training data and methodological constraints. Therefore, we construct MERP (Multi-class Error in Robotic Prostatectomy), a comprehensive dataset for error detection in robotic prostatectomy with frame-level annotations featuring six clinically aligned error categories. In addition, we propose CARES (Collaborative Agentic Reasoning for Error Detection in Surgery), a novel zero-shot clinically-informed and risk-stratified agentic reasoning architecture for multi-class surgical error detection. CARES implements adaptive generation of medically informed, error-specific Chain-of-Thought (CoT) prompts across multiple expertise levels. The framework employs risk-aware routing to assign error task to expertise-matched reasoning pathways based on complexity and clinical impact. Subsequently, each pathway decomposes surgical error analysis into three specialized agents with temporal, spatial, and procedural analysis. Each agent analyzes using dynamically selected prompts tailored to the assigned expertise level and error type, generating detailed and transparent reasoning traces. By incorporating clinically informed reasoning from established surgical assessment guidelines, CARES enables zero-shot surgical error detection without prior training. Evaluation demonstrates superior performance with 54.3 mF1 on RARP and 52.0 mF1 on MERP datasets, outperforming existing zero-shot approaches by up to 14% while remaining competitive with trained models. Ablation studies demonstrate the effectiveness of our method. The dataset and code will be publicly available.
CRADLE: Conversational RTL Design Space Exploration with LLM-based Multi-Agent Systems
This paper presents CRADLE, a conversational framework for design space exploration of RTL designs using LLM-based multi-agent systems. Unlike existing rigid approaches, CRADLE enables user-guided flows with internal self-verification, correction, and optimization. We demonstrate the framework with a generator-critic agent system targeting FPGA resource minimization using state-of-the-art LLMs. Experimental results on the RTLLM benchmark show that CRADLE achieves significant reductions in resource usage with averages of 48% and 40% in LUTs and FFs across all benchmark designs.
comment: Accepted for presentation at the 22nd International SoC Conference (ISOCC 2025). Proceedings to be included in IEEE Xplore
Hierarchy Entropy Degeneration Explains the Rat Utopia Population Collapse: The Role of Full Visibility and Isolation
Calhoun's Rat Utopia experiments demonstrated a puzzling population trajectory: initial growth, plateau, and eventually a total collapse of the rat population despite abundant resources. This paper proposes a hypothesis that the enclosure's design enabled full visibility of the social hierarchy (pecking order), leading to entropy degeneration: progressive loss of uncertainty in rats' perceived ranks over generations. High initial uncertainty drives engagement in dominance, reproduction, and care; as visibility solidifies the hierarchy over the generations, uncertainty vanishes, nullifying perceived gains from social activities. Simulations reproduce the experimental arc which rely on a game theoretic matrix that is parameterized by the uncertainty (entropy) in the hierarchy which changes over rat generations.
DeepFleet: Multi-Agent Foundation Models for Mobile Robots
We introduce DeepFleet, a suite of foundation models designed to support coordination and planning for large-scale mobile robot fleets. These models are trained on fleet movement data, including robot positions, goals, and interactions, from hundreds of thousands of robots in Amazon warehouses worldwide. DeepFleet consists of four architectures that each embody a distinct inductive bias and collectively explore key points in the design space for multi-agent foundation models: the robot-centric (RC) model is an autoregressive decision transformer operating on neighborhoods of individual robots; the robot-floor (RF) model uses a transformer with cross-attention between robots and the warehouse floor; the image-floor (IF) model applies convolutional encoding to a multi-channel image representation of the full fleet; and the graph-floor (GF) model combines temporal attention with graph neural networks for spatial relationships. In this paper, we describe these models and present our evaluation of the impact of these design choices on prediction task performance. We find that the robot-centric and graph-floor models, which both use asynchronous robot state updates and incorporate the localized structure of robot interactions, show the most promise. We also present experiments that show that these two models can make effective use of larger warehouses operation datasets as the models are scaled up.
comment: 25 pages, 10 figures, 2 tables
Constrained Black-Box Attacks Against Multi-Agent Reinforcement Learning
Collaborative multi-agent reinforcement learning (c-MARL) has rapidly evolved, offering state-of-the-art algorithms for real-world applications, including sensitive domains. However, a key challenge to its widespread adoption is the lack of a thorough investigation into its vulnerabilities to adversarial attacks. Existing work predominantly focuses on training-time attacks or unrealistic scenarios, such as access to policy weights or the ability to train surrogate policies. In this paper, we investigate new vulnerabilities under more realistic and constrained conditions, assuming an adversary can only collect and perturb the observations of deployed agents. We also consider scenarios where the adversary has no access at all. We propose simple yet highly effective algorithms for generating adversarial perturbations designed to misalign how victim agents perceive their environment. Our approach is empirically validated on three benchmarks and 22 environments, demonstrating its effectiveness across diverse algorithms and environments. Furthermore, we show that our algorithm is sample-efficient, requiring only 1,000 samples compared to the millions needed by previous methods.
comment: Under review in TNNLS
Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems
Vision Language Model (VLM)-based agents are stateful, autonomous entities capable of perceiving and interacting with their environments through vision and language. Multi-agent systems comprise specialized agents who collaborate to solve a (complex) task. A core security property is robustness, stating that the system should maintain its integrity under adversarial attacks. However, the design of existing multi-agent systems lacks the robustness consideration, as a successful exploit against one agent can spread and infect other agents to undermine the entire system's assurance. To address this, we propose a new defense approach, Cowpox, to provably enhance the robustness of multi-agent systems. It incorporates a distributed mechanism, which improves the recovery rate of agents by limiting the expected number of infections to other agents. The core idea is to generate and distribute a special cure sample that immunizes an agent against the attack before exposure and helps recover the already infected agents. We demonstrate the effectiveness of Cowpox empirically and provide theoretical robustness guarantees.
RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
Multi-agent large language model (LLM) systems have shown strong potential in complex reasoning and collaborative decision-making tasks. However, most existing coordination schemes rely on static or full-context routing strategies, which lead to excessive token consumption, redundant memory exposure, and limited adaptability across interaction rounds. We introduce RCR-Router, a modular and role-aware context routing framework designed to enable efficient, adaptive collaboration in multi-agent LLMs. To our knowledge, this is the first routing approach that dynamically selects semantically relevant memory subsets for each agent based on its role and task stage, while adhering to a strict token budget. A lightweight scoring policy guides memory selection, and agent outputs are iteratively integrated into a shared memory store to facilitate progressive context refinement. To better evaluate model behavior, we further propose an Answer Quality Score metric that captures LLM-generated explanations beyond standard QA accuracy. Experiments on three multi-hop QA benchmarks -- HotPotQA, MuSiQue, and 2WikiMultihop -- demonstrate that RCR-Router reduces token usage (up to 30%) while improving or maintaining answer quality. These results highlight the importance of structured memory routing and output-aware evaluation in advancing scalable multi-agent LLM systems.
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience
Repurposing large vision-language models (LVLMs) as computer use agents (CUAs) has led to substantial breakthroughs, primarily driven by human-labeled data. However, these models often struggle with novel and specialized software, particularly in scenarios lacking human annotations. To address this challenge, we propose SEAgent, an agentic self-evolving framework enabling CUAs to autonomously evolve through interactions with unfamiliar software. Specifically, SEAgent empowers computer-use agents to autonomously master novel software environments via experiential learning, where agents explore new software, learn through iterative trial-and-error, and progressively tackle auto-generated tasks organized from simple to complex. To achieve this goal, we design a World State Model for step-wise trajectory assessment, along with a Curriculum Generator that generates increasingly diverse and challenging tasks. The agent's policy is updated through experiential learning, comprised of adversarial imitation of failure actions and Group Relative Policy Optimization (GRPO) on successful ones. Furthermore, we introduce a specialist-to-generalist training strategy that integrates individual experiential insights from specialist agents, facilitating the development of a stronger generalist CUA capable of continuous autonomous evolution. This unified agent ultimately achieves performance surpassing ensembles of individual specialist agents on their specialized software. We validate the effectiveness of SEAgent across five novel software environments within OS-World. Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA, i.e., UI-TARS.
comment: Code at https://github.com/SunzeY/SEAgent
StyleTailor: Towards Personalized Fashion Styling via Hierarchical Negative Feedback
The advancement of intelligent agents has revolutionized problem-solving across diverse domains, yet solutions for personalized fashion styling remain underexplored, which holds immense promise for promoting shopping experiences. In this work, we present StyleTailor, the first collaborative agent framework that seamlessly unifies personalized apparel design, shopping recommendation, virtual try-on, and systematic evaluation into a cohesive workflow. To this end, StyleTailor pioneers an iterative visual refinement paradigm driven by multi-level negative feedback, enabling adaptive and precise user alignment. Specifically, our framework features two core agents, i.e., Designer for personalized garment selection and Consultant for virtual try-on, whose outputs are progressively refined via hierarchical vision-language model feedback spanning individual items, complete outfits, and try-on efficacy. Counterexamples are aggregated into negative prompts, forming a closed-loop mechanism that enhances recommendation quality. To assess the performance, we introduce a comprehensive evaluation suite encompassing style consistency, visual quality, face similarity, and artistic appraisal. Extensive experiments demonstrate StyleTailor's superior performance in delivering personalized designs and recommendations, outperforming strong baselines without negative feedback and establishing a new benchmark for intelligent fashion systems.
comment: 24pages, 5 figures
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
Unmanned Aerial Vehicles (UAVs) have made significant advancements in communication stability and security through techniques such as frequency hopping, signal spreading, and adaptive interference suppression. However, challenges remain in modeling spectrum competition, integrating expert knowledge, and predicting opponent behavior. To address these issues, we propose UAV-FPG (Unmanned Aerial Vehicle - Frequency Point Game), a game-theoretic environment model that simulates the dynamic interaction between interference and anti-interference strategies of opponent and ally UAVs in communication frequency bands. The model incorporates a prior expert knowledge base to optimize frequency selection and employs large language models for path planning, simulating a "strong adversary". Experimental results highlight the effectiveness of integrating the expert knowledge base and the large language model, with the latter significantly improving path planning in dynamic scenarios through iterative interactions, outperforming fixed-path strategies. UAV-FPG provides a robust platform for advancing anti-jamming strategies and intelligent decision-making in UAV communication systems.
DynaSwarm: Dynamically Graph Structure Selection for LLM-based Multi-agent System
Current multi-agent systems (MAS) frameworks often rely on manually designed and static collaboration graph structures, limiting adaptability and performance. To address these limitations, we propose DynaSwarm, a dynamic framework that enhances LLM-based MAS through two key innovations: (1) an actor-critic reinforcement learning (A2C) mechanism to optimize graph structures with improved stability over prior RL methods, and (2) a dynamic graph selector that adaptively chooses the optimal graph structure for each input sample via parameter-efficient LLM fine-tuning. DynaSwarm eliminates the need for rigid, one-fits-all graph architectures, instead leveraging sample-specific idiosyncrasies to dynamically route queries through specialized agent networks. (c) We propose to fine-tune the demonstration retriever to fully exploit the power of in-context learning (ICL). Extensive experiments on question answering, mathematical reasoning, and coding tasks demonstrate that DynaSwarm consistently outperforms state-of-the-art single-agent and MAS baselines across multiple LLM backbones. Our findings highlight the importance of sample-aware structural flexibility in LLM MAS designs.
comment: content error
Anticipating Degradation: A Predictive Approach to Fault Tolerance in Robot Swarms
An active approach to fault tolerance is essential for robot swarms to achieve long-term autonomy. Previous efforts have focused on responding to spontaneous electro-mechanical faults and failures. However, many faults occur gradually over time. Waiting until such faults have manifested as failures before addressing them is both inefficient and unsustainable in a variety of scenarios. This work argues that the principles of predictive maintenance, in which potential faults are resolved before they hinder the operation of the swarm, offer a promising means of achieving long-term fault tolerance. This is a novel approach to swarm fault tolerance, which is shown to give a comparable or improved performance when tested against a reactive approach in almost all cases tested.
Hypergraph-based Motion Generation with Multi-modal Interaction Relational Reasoning
The intricate nature of real-world driving environments, characterized by dynamic and diverse interactions among multiple vehicles and their possible future states, presents considerable challenges in accurately predicting the motion states of vehicles and handling the uncertainty inherent in the predictions. Addressing these challenges requires comprehensive modeling and reasoning to capture the implicit relations among vehicles and the corresponding diverse behaviors. This research introduces an integrated framework for autonomous vehicles (AVs) motion prediction to address these complexities, utilizing a novel Relational Hypergraph Interaction-informed Neural mOtion generator (RHINO). RHINO leverages hypergraph-based relational reasoning by integrating a multi-scale hypergraph neural network to model group-wise interactions among multiple vehicles and their multi-modal driving behaviors, thereby enhancing motion prediction accuracy and reliability. Experimental validation using real-world datasets demonstrates the superior performance of this framework in improving predictive accuracy and fostering socially aware automated driving in dynamic traffic scenarios. The source code is publicly available at https://github.com/keshuw95/RHINO-Hypergraph-Motion-Generation.
Combat Urban Congestion via Collaboration: Heterogeneous GNN-based MARL for Coordinated Platooning and Traffic Signal Control
Over the years, reinforcement learning has emerged as a popular approach to develop signal control and vehicle platooning strategies either independently or in a hierarchical way. However, jointly controlling both in real-time to alleviate traffic congestion presents new challenges, such as the inherent physical and behavioral heterogeneity between signal control and platooning, as well as coordination between them. This paper proposes an innovative solution to tackle these challenges based on heterogeneous graph multi-agent reinforcement learning and traffic theories. Our approach involves: 1) designing platoon and signal control as distinct reinforcement learning agents with their own set of observations, actions, and reward functions to optimize traffic flow; 2) designing coordination by incorporating graph neural networks within multi-agent reinforcement learning to facilitate seamless information exchange among agents on a regional scale; 3) applying alternating optimization for training, allowing agents to update their own policies and adapt to other agents' policies. We evaluate our approach through SUMO simulations, which show convergent results in terms of both travel time and fuel consumption, and superior performance compared to other adaptive signal control methods.
Making Teams and Influencing Agents: Efficiently Coordinating Decision Trees for Interpretable Multi-Agent Reinforcement Learning AAAI
Poor interpretability hinders the practical applicability of multi-agent reinforcement learning (MARL) policies. Deploying interpretable surrogates of uninterpretable policies enhances the safety and verifiability of MARL for real-world applications. However, if these surrogates are to interact directly with the environment within human supervisory frameworks, they must be both performant and computationally efficient. Prior work on interpretable MARL has either sacrificed performance for computational efficiency or computational efficiency for performance. To address this issue, we propose HYDRAVIPER, a decision tree-based interpretable MARL algorithm. HYDRAVIPER coordinates training between agents based on expected team performance, and adaptively allocates budgets for environment interaction to improve computational efficiency. Experiments on standard benchmark environments for multi-agent coordination and traffic signal control show that HYDRAVIPER matches the performance of state-of-the-art methods using a fraction of the runtime, and that it maintains a Pareto frontier of performance for different interaction budgets.
comment: 17 pages; 2 tables; 12 figures; accepted version, published at the 8th AAAI/ACM Conference on AI, Ethics and Society (AIES '25)
Inertial Coordination Games
We analyze inertial coordination games: dynamic coordination games with an endogenously changing state that depends on (i) a persistent fundamental players privately learn about over time; and (ii) past play. The speed of learning determines long-run equilibrium dynamics: the risk-dominant action is played in the limit if and only if learning is slow such that posterior precisions grow sub-quadratically. This generalizes results from static global games and endows them with a learning foundation. Conversely, when learning is fast such that posterior precisions grow super-quadratically, shocks can propagate and generate self-fulfilling spirals.
ABIDES-Economist: Agent-Based Simulator of Economic Systems with Learning Agents
We present ABIDES-Economist, an agent-based simulator for economic systems that includes heterogeneous households, firms, a central bank, and a government. Agent behavior can be defined using domain-specific behavioral rules or learned through reinforcement learning by specifying their objectives. We integrate reinforcement learning capabilities for all agents using the OpenAI Gym environment framework for the multi-agent system. To enhance the realism of our model, we base agent parameters and action spaces on economic literature and real U.S. economic data. To tackle the challenges of calibrating heterogeneous agent-based economic models, we conduct a comprehensive survey of stylized facts related to both microeconomic and macroeconomic time series data. We then validate ABIDES-Economist by demonstrating its ability to generate simulated data that aligns with the relevant stylized facts for the economic scenario under consideration, following the learning of all agent behaviors via reinforcement learning. Specifically, we train our economic agents' policies under two broad configurations. The first configuration demonstrates that the learned economic agents produce system data consistent with macroeconomic and microeconomic stylized facts. The second configuration illustrates the utility of the validated simulation platform in designing regulatory policies for the central bank and government. These policies outperform standard rule-based approaches from the literature, which often overlook agent heterogeneity, shocks, and agent adaptability.
comment: Updated version of arXiv:2402.09563 with a survey of stylized facts for validating economic agent-based models, along with experiments showcasing the utility of our simulator for economic policy
Systems and Control (CS)
An Open-Source Simulation and Data Management Tool for EnergyPlus Building Models
We present a new open-source, GUI-based application created using Plotly-Dash, along with an integrated PostgreSQL-based relational database, developed to streamline EnergyPlus building model simulation workflows. The application facilitates data generation, aggregation (across thermal zones), and visualization based on customizable user preferences, while the database efficiently stores and retrieves complex simulation data generated by EnergyPlus. We demonstrate the need for this application and database, emphasizing how existing approaches for generating, managing, and analyzing EnergyPlus simulation data can be cumbersome, particularly when handling a large number of building models with varying simulation setups. This integrated framework enables building energy engineers and researchers to simplify their EnergyPlus simulations, manage generated simulation data, perform data analyses, and support data-driven modeling tasks.
comment: This manuscript is a version of our paper accepted at the IEEE Power & Energy Society General Meeting (PESGM) 2025
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
Reinforcement learning (RL) has proven to be particularly effective in solving complex decision-making problems for a wide range of applications. From a control theory perspective, RL can be considered as an adaptive optimal control scheme. Lyapunov and barrier functions are the most commonly used certificates to guarantee system stability for a proposed/derived controller and constraint satisfaction guarantees, respectively, in control theoretic approaches. However, compared to theoretical guarantees available in control theoretic methods, RL lacks closed-loop stability of a computed policy and constraint satisfaction guarantees. Safe reinforcement learning refers to a class of constrained problems where the constraint violations lead to partial or complete system failure. The goal of this review is to provide an overview of safe RL techniques using Lyapunov and barrier functions to guarantee this notion of safety discussed (stability of the system in terms of a computed policy and constraint satisfaction during training and deployment). The different approaches employed are discussed in detail along with their shortcomings and benefits to provide critique and possible future research directions. Key motivation for this review is to discuss current theoretical approaches for safety and stability guarantees in RL similar to control theoretic approaches using Lyapunov and barrier functions. The review provides proven potential and promising scope of providing safety guarantees for complex dynamical systems with operational constraints using model-based and model-free RL.
comment: pages - 19, figures - 9, Submitted to IEEE Access
Comparing Building Thermal Dynamics Models and Estimation Methods for Grid-Edge Applications
We need computationally efficient and accurate building thermal dynamics models for use in grid-edge applications. This work evaluates two grey-box approaches for modeling building thermal dynamics: RC-network models and structured regression models. For RC-network models, we compare parameter estimation methods including Nonlinear Least Squares, Batch Estimation, and Maximum Likelihood Estimation. We use the Almon Lag Structure with Linear Least Squares for estimating the structured regression models. The performance of these models and methods is evaluated on simulated house and commercial building data across three different simulation types.
comment: This manuscript is a version of our paper accepted at the IEEE Power & Energy Society General Meeting (PESGM) 2025
Smart Residential Community Simulator for Developing and Benchmarking Energy Management Systems
Home Energy Management Systems (HEMS) are being actively developed for both individual houses and communities to support demand response in on-grid operation, and ensure resilience during off-grid scenarios. However, most simulators used for closed-loop HEMS testing are tailored to a specific distributed energy resource (DER) configuration with a fixed number of houses, limiting flexibility and scalability. This leads to additional development efforts to support diverse DER configurations across any number of houses and to integrate appropriate weather and load data pipelines. To address these limitations, we present a scalable simulator capable of modeling any number of houses in both on-grid and off-grid modes as a Gymnasium environment. Each house can have a unique DER configuration - Rooftop Solar Photovoltaics (PV), Battery-only, PV-only, or no DER - and includes models for air-conditioning and eight grouped circuit-level loads. The simulator integrates National Solar Radiation Database (NSRDB) weather and Pecan Street load datasets, supports three default controllers (two for off-grid, and one for on-grid scenarios), and includes performance metrics and visualization tools. We demonstrate its flexibility through simulations on individual houses and a four-house community with heterogeneous DERs, benchmarking the controllers across built-in metrics and computation time. The results highlight the simulator's capability to systematically evaluate control policy performance under varying system configurations.
comment: This manuscript is an extended version of our paper accepted at the IEEE SmartGridComm 2025
Optimization-Free Fast Optimal Control: Bang-Ride Property, Monotonicity, and Applications to Fast Battery Charging
Single-input fast optimal control problems, which aim to achieve the optimal objective as fast as possible, occur in various real-world applications. In the case of fast battery charging, the associated optimal control problem becomes computationally challenging when detailed battery models are used. A recent heuristic optimization-free algorithm can significantly reduce the computational cost and provide an approximate solution, consistent with many heuristic input profiles in practice. These heuristic solutions have several special properties: They follow a bang-ride pattern that always activates a constraint and applies the maximum feasible input. This work investigates when the above properties arise in the optimal input, and ultimately, when the heuristic input profiles satisfy necessary optimality conditions. By exploiting Pontryagin's maximum principle (PMP), we show that the optimal control is bang-ride under regularity conditions on constraint switching and local controllability of the system. Moreover, the special type of bang-ride behavior, i.e., applying the maximum feasible input, arises under the monotonicity of the system, objective function, and restricted sensitivity of the constraints. These results provide a theoretical justification for a class of charging heuristics and the fast optimization-free algorithm.
Large Scale Robotic Material Handling: Learning, Planning, and Control
Bulk material handling involves the efficient and precise moving of large quantities of materials, a core operation in many industries, including cargo ship unloading, waste sorting, construction, and demolition. These repetitive, labor-intensive, and safety-critical operations are typically performed using large hydraulic material handlers equipped with underactuated grippers. In this work, we present a comprehensive framework for the autonomous execution of large-scale material handling tasks. The system integrates specialized modules for environment perception, pile attack point selection, path planning, and motion control. The main contributions of this work are two reinforcement learning-based modules: an attack point planner that selects optimal grasping locations on the material pile to maximize removal efficiency and minimize the number of scoops, and a robust trajectory following controller that addresses the precision and safety challenges associated with underactuated grippers in movement, while utilizing their free-swinging nature to release material through dynamic throwing. We validate our framework through real-world experiments on a 40 t material handler in a representative worksite, focusing on two key tasks: high-throughput bulk pile management and high-precision truck loading. Comparative evaluations against human operators demonstrate the system's effectiveness in terms of precision, repeatability, and operational safety. To the best of our knowledge, this is the first complete automation of material handling tasks on a full scale.
comment: Preliminary version, currently undergoing review process
Fundamental limitations of monotonic tracking systems
We consider the monotonic tracking control problem for continuous-time single-input single-output linear systems using output-feedback linear controllers in this paper. We provide the necessary and sufficient conditions for this problem to be solvable and expose its fundamental limitations: the exact feasible locations of the plant zeros, the minimum controller order possible, and the maximum decay rate achievable for the closed-loop system. The relationship between these bounds is explained by a simple geometric shape for plants with a pair of complex-conjugate zeros.
Architecture and FPGA Implementation of Digital Time-to-Digital Converter for Sensing Applications
Many application domains face the challenges of high-power consumption and high computational demands, especially with the advancement in embedded machine learning and edge computing. Designing application-specific circuits is crucial to reducing hardware complexity and power consumption. In these perspectives, this paper presents the design of a Digital Time-to-Digital converter (DTDC) based on multiple delay line topologies. The DTDC is implemented in VHDL for the Xilinx Artix-7 AC701 FPGA device. Simulation results demonstrate the effectiveness of the circuit in converting the input period along a wide range up to 1ps. The designed circuit is implemented with less than 1% of the resource utilization on the target FPGA device.
XR Reality Check: What Commercial Devices Deliver for Spatial Tracking
Inaccurate spatial tracking in extended reality (XR) devices leads to virtual object jitter, misalignment, and user discomfort, fundamentally limiting immersive experiences and natural interactions. In this work, we introduce a novel testbed that enables simultaneous, synchronized evaluation of multiple XR devices under identical environmental and kinematic conditions. Leveraging this platform, we present the first comprehensive empirical benchmarking of five state-of-the-art XR devices across 16 diverse scenarios. Our results reveal substantial intra-device performance variation, with individual devices exhibiting up to 101\% increases in error when operating in featureless environments. We also demonstrate that tracking accuracy strongly correlates with visual conditions and motion dynamics. We also observe significant inter-device disparities, with performance differences of up to 2.8$\times$, which are closely linked to hardware specifications such as sensor configurations and dedicated processing units. Finally, we explore the feasibility of substituting a motion capture system with the Apple Vision Pro as a practical ground truth reference. While the Apple Vision Pro delivers highly accurate relative pose error estimates ($R^2 = 0.830$), its absolute pose error estimation remains limited ($R^2 = 0.387$), highlighting both its potential and its constraints for rigorous XR evaluation. This work establishes the first standardized framework for comparative XR tracking evaluation, providing the research community with reproducible methodologies, comprehensive benchmark datasets, and open-source tools that enable systematic analysis of tracking performance across devices and conditions, thereby accelerating the development of more robust spatial sensing technologies for XR systems.
Performance Benchmarking of Machine Learning Models for Terahertz Metamaterial Absorber Prediction
This study presents a polarization-insensitive ultra-broadband terahertz metamaterial absorber based on vanadium dioxide (VO2) and evaluates machine learning methods for predicting its absorption performance. The structure consists of a VO2 metasurface, a MF2 dielectric spacer, and a gold ground plane. It achieves more than 90% absorption between 5.72 and 11.11 THz, covering a 5.38 THz bandwidth with an average absorptance of 98.15%. A dataset of 9,018 samples was generated from full-wave simulations by varying patch width, dielectric thickness, and frequency. Six regression models were trained: Linear Regression, Support Vector Regression, Decision Tree, Random Forest, XGBoost, and Bagging. Performance was measured using adjusted R2, MAE, MSE, and RMSE. Ensemble models achieved the best results, with Bagging reaching an adjusted R2 of 0.9985 and RMSE of 0.0146. The workflow offers a faster alternative to exhaustive simulations and can be applied to other metamaterial designs, enabling efficient evaluation and optimization.
comment: 6 pages, 6 figures
DeePConverter: A Data-Driven Optimal Control Architecture for Grid-Connected Power Converters
Grid-connected power converters are ubiquitous in modern power systems, acting as grid interfaces of renewable energy sources, energy storage systems, electric vehicles, high-voltage DC systems, etc. Conventionally, power converters use multiple PID regulators to achieve different control objectives such as grid synchronization and voltage/power regulations, where the PID parameters are usually tuned based on a presumed (and often overly-simplified) power grid model. However, this may lead to inferior performance or even instabilities in practice, as the real power grid is highly complex, variable, and generally unknown. To tackle this problem, we employ a data-enabled predictive control (DeePC) to perform data-driven, optimal, and robust control for power converters. We call the converters that are operated in this way \textit{DeePConverters}. A DeePConverter can implicitly perceive the characteristics of the power grid from data and adjust its control strategy to achieve optimal and robust performance. We present the modular configurations, generalized structure, control behavior specification, detailed implementation, and computation of DeePConverters. High-fidelity simulations and hardware-in-the-loop (HIL) tests are provided to validate the effectiveness of DeePConverters.
Satellites are closer than you think: A near field MIMO approach for Ground stations
The rapid growth of low Earth orbit (LEO) satellite constellations has revolutionized broadband access, earth observation, and direct-to-device connectivity. However, the expansion of ground station infrastructure has not kept pace, creating a critical bottleneck in satellite-to-ground backhaul capacity. Traditional parabolic dish antennas, though effective for geostationary (GEO) satellites, are ill-suited for dense, fastmoving LEO networks due to mechanical steering delays and their inability to track multiple satellites simultaneously. Phased array antennas offer electronically steerable beams and multisatellite support, but their integration into ground stations is limited by the high cost, hardware issues, and complexity of achieving sufficient antenna gain. We introduce ArrayLink, a distributed phased array architecture that coherently combines multiple small commercially available panels to achieve high-gain beamforming and unlock line-of-sight MIMO spatial multiplexing with minimal additional capital expenditure. By spacing 16 (32x32) panels across a kilometer-scale aperture, ArrayLink enters the radiative near-field, focusing energy in both angle and range while supporting up to four simultaneous spatial streams on a single feeder link. Through rigorous theoretical analysis, detailed 2D beam pattern simulations and real-world hardware experiments, we show that ArrayLink (i) achieves dish-class gain with in range 1-2 dB of 1.47 m reflector, (ii) maintains four parallel streams at ranges of hundreds of kilometers (falling to two beyond 2000 km), and (iii) exhibits tight agreement across theory, simulation, and experiment with minimal variance. These findings open a practical and scalable path to boosting LEO backhaul capacity.
comment: 11 pages, 11 figures
Fake-Mamba: Real-Time Speech Deepfake Detection Using Bidirectional Mamba as Self-Attention's Alternative
Advances in speech synthesis intensify security threats, motivating real-time deepfake detection research. We investigate whether bidirectional Mamba can serve as a competitive alternative to Self-Attention in detecting synthetic speech. Our solution, Fake-Mamba, integrates an XLSR front-end with bidirectional Mamba to capture both local and global artifacts. Our core innovation introduces three efficient encoders: TransBiMamba, ConBiMamba, and PN-BiMamba. Leveraging XLSR's rich linguistic representations, PN-BiMamba can effectively capture the subtle cues of synthetic speech. Evaluated on ASVspoof 21 LA, 21 DF, and In-The-Wild benchmarks, Fake-Mamba achieves 0.97%, 1.74%, and 5.85% EER, respectively, representing substantial relative gains over SOTA models XLSR-Conformer and XLSR-Mamba. The framework maintains real-time inference across utterance lengths, demonstrating strong generalization and practical viability. The code is available at https://github.com/xuanxixi/Fake-Mamba.
comment: Accepted at IEEE ASRU 2025
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Optimal Coupled Sensor Placement and Path-Planning in Unknown Time-Varying Environments
We address path-planning for a mobile agent to navigate in an unknown environment with minimum exposure to a spatially and temporally varying threat field. The threat field is estimated using pointwise noisy measurements from a mobile sensor network. For this problem, we present a new information gain measure for optimal sensor placement that quantifies reduction in uncertainty in the path cost rather than the environment state. This measure, which we call the context-relevant mutual information (CRMI), couples the sensor placement and path-planning problem. We propose an iterative coupled sensor configuration and path-planning (CSCP) algorithm. At each iteration, this algorithm places sensors to maximize CRMI, updates the threat estimate using new measurements, and recalculates the path with minimum expected exposure to the threat. The iterations converge when the path cost variance, which is an indicator of risk, reduces below a desired threshold. We show that CRMI is submodular, and therefore, greedy optimization provides near-optimal sensor placements while maintaining computational efficiency of the CSCP algorithm. Distance-based sensor reconfiguration costs are introduced in a modified CRMI measure, which we also show to be submodular. Through numerical simulations, we demonstrate that the principal advantage of this algorithm is that near-optimal low-variance paths are achieved using far fewer sensor measurements as compared to a standard sensor placement method.
Impedance Space Method: Time-Independent Parametric Ellipses for Robot Compliant Control
This paper proposes a novel 3D graphical representation for impedance control, called the impedance space, to foster the analysis of the dynamic behavior of robotic compliant controllers. The method overcomes limitations of existing 2D graphical approaches by incorporating mass, stiffness, and damping dynamics, and associates the impedance control parameters with linear transformations to plot a parametric 3D ellipse and its projections in 2D for a mass-spring-damper impedance under sinusoidal reference. Experimental evaluation demonstrates the effectiveness of the proposed representation for analysis of impedance control. The method applies to various compliant control topologies and can be extended to other model-based control approaches.
comment: Accepted version to be published in the IEEE Latin America Transactions. There were changes aiming to clarify some of the ideas exposed in the paper, which led to the increase in the number of figures and pages. The abstract and the title also changed. The paper now has 8 pages and 10 figures
Dynamic Transfer Policies for Parallel Queues
We consider the problem of load balancing in parallel queues by transferring customers between them at discrete points in time. Holding costs accrue as customers wait in the queue, while transfer decisions incur both fixed (setup) costs and variable costs that increase with the number of transfers and travel distance, and vary by transfer direction. Our work is primarily motivated by inter-facility patient transfers to address imbalanced congestion and inequity in access to care during surges in hospital demand. Analyzing an associated fluid control problem, we show that under general assumptions, including time-varying arrivals and convex holding costs, the optimal policy partitions the state-space into a well-defined $\textit{no-transfer region}$ and its complement, implying that transferring is optimal if and only if the system is sufficiently imbalanced. In the absence of fixed transfer costs, an optimal policy moves the state to the no-transfer region's boundary; in contrast, with fixed costs, the state is moved to its relative interior. Leveraging our structural results, we propose a simulation-based approximate dynamic programming (ADP) algorithm to find effective transfer policies for the stochastic system. We investigate the performance and robustness of the fluid and ADP policies in a case study calibrated using data during the COVID-19 pandemic in the Greater Toronto Area, which demonstrates that transferring patients between hospitals could result in up to 27.7% reduction in total cost with relatively few transfers.
comment: 57 pages, 10 figures
SIL Allocation for Mitigation Safety Functions
SIL (Safety Integrity Level) allocation plays a pivotal role in evaluating the significance of Safety Functions (SFs) within high-risk industries. The outcomes of a SIL allocation study determine the design specifications necessary to uphold the Probability of Failure on Demand (PFD) below permissible limits, thus managing risk effectively. While extensive research has focused on SIL allocation for preventive SFs, there is a noticeable gap in attention towards mitigation SFs. To address this gap, this paper discusses the shortcomings of current methods and proposes a new approach to overcome them. The principles of the proposed method are substantiated by detailed mathematical formulation and the practical application of the method is demonstrated through a case study in a road tunnel project.
Data-Driven Certificate Synthesis
We investigate the problem of verifying different properties of discrete time dynamical systems, namely, reachability, safety and reach-while-avoid. To achieve this, we adopt a data driven perspective and, using past system trajectories as data, we aim at learning a specific function termed certificate for each property we wish to verify. We seek to minimize a loss function, designed to encompass conditions on the certificate to be learned that encode the satisfaction of the associated property. Besides learning a certificate, we quantify probabilistically its generalization properties, namely, how likely it is for a certificate to be valid (and hence for the associated property to be satisfied) when it comes to a new system trajectory not included in the training data set. We view this problem under the realm of probably approximately correct (PAC) learning under the notion of compression, and use recent advancements of the so-called scenario approach to obtain scalable generalization bounds on the learned certificates. To achieve this, we design a novel algorithm that minimizes the loss function and hence constructs a certificate, and at the same time determines a quantity termed compression, which is instrumental in obtaining meaningful probabilistic guarantees. This process is novel per se and provides a constructive mechanism for compression set calculation, thus opening the road for its use to more general non-convex optimization problems. We verify the efficacy of our methodology on several numerical case studies, and compare it (both theoretically and numerically) with closely related results on data-driven property verification.
comment: 18 pages, submitted to Automatica
Adaptive Informed Deep Neural Networks for Power Flow Analysis
This study introduces PINN4PF, an end-to-end deep learning architecture for power flow (PF) analysis that effectively captures the nonlinear dynamics of large-scale modern power systems. The proposed neural network (NN) architecture consists of two important advancements in the training pipeline: (A) a double-head feed-forward NN that aligns with PF analysis, including an activation function that adjusts to the net active and reactive power injections patterns, and (B) a physics-based loss function that partially incorporates power system topology information through a novel hidden function. The effectiveness of the proposed architecture is illustrated through 4-bus, 15-bus, 290-bus, and 2224-bus test systems and is evaluated against two baselines: a linear regression model (LR) and a black-box NN (MLP). The comparison is based on (i) generalization ability, (ii) robustness, (iii) impact of training dataset size on generalization ability, (iv) accuracy in approximating derived PF quantities (specifically line current, line active power, and line reactive power), and (v) scalability. Results demonstrate that PINN4PF outperforms both baselines across all test systems by up to two orders of magnitude not only in terms of direct criteria, e.g., generalization ability, but also in terms of approximating derived physical quantities.
comment: 17 pages, 7 figures, 4 tables
The Holonomy of Optimal Mass Transport: The Gaussian-Linear Case
The theory of Monge-Kantorovich Optimal Mass Transport (OMT) has in recent years spurred a fast developing phase of research in stochastic control, control of ensemble systems, thermodynamics, data science, and several other fields in engineering and science. We herein introduce a new type of transportation problems. The salient feature of these problems is that particles/agents in the ensemble are labeled and their relative position along their journey is of interest. Of particular importance in our program are control laws that steer ensembles along cycles ensuring that individual particles return to their original position. This feature is in contrast with the classical theory of optimal transport where the primary object of study is the path of probability densities, without any concern about particle labels. In the theory that we present, we focus on the case Gaussian distributions and linear dynamics, and explore a hitherto unstudied sub-Riemannian structure of Monge-Kantorovich transport where the relative position of particles along their journey is modeled by the holonomy of the transportation schedule. From this vantage point, we discuss several other problems of independent interest.
comment: 22 pages, 8 figures
TinyMPC: Model-Predictive Control on Resource-Constrained Microcontrollers ICRA 2024
Model-predictive control (MPC) is a powerful tool for controlling highly dynamic robotic systems subject to complex constraints. However, MPC is computationally demanding, and is often impractical to implement on small, resource-constrained robotic platforms. We present TinyMPC, a high-speed MPC solver with a low memory footprint targeting the microcontrollers common on small robots. Our approach is based on the alternating direction method of multipliers (ADMM) and leverages the structure of the MPC problem for efficiency. We demonstrate TinyMPC's effectiveness by benchmarking against the state-of-the-art solver OSQP, achieving nearly an order of magnitude speed increase, as well as through hardware experiments on a 27 gram quadrotor, demonstrating high-speed trajectory tracking and dynamic obstacle avoidance. TinyMPC is publicly available at https://tinympc.org.
comment: Accepted at ICRA 2024. First three authors contributed equally and are listed in alphabetical order. Publicly available at https://tinympc.org
Dynamic load balancing for cloud systems under heterogeneous setup delays
We consider a distributed cloud service deployed at a set of distinct server pools. Arriving jobs are classified into heterogeneous types, in accordance with their setup times which are differentiated at each of the pools. A dispatcher for each job type controls the balance of load between pools, based on decentralized feedback. The system of rates and queues is modeled by a fluid differential equation system, and analyzed via convex optimization. A first, myopic policy is proposed, based on task delay-to-service. Under a simplified dynamic fluid queue model, we prove global convergence to an equilibrium point which minimizes the mean setup time; however queueing delays are incurred with this method. A second proposal is then developed based on proximal optimization, which explicitly models the setup queue and is proved to reach an optimal equilibrium, devoid of queueing delay. Results are demonstrated through a simulation example.
comment: 20 pages, 1 figure
Systems and Control (EESS)
An Open-Source Simulation and Data Management Tool for EnergyPlus Building Models
We present a new open-source, GUI-based application created using Plotly-Dash, along with an integrated PostgreSQL-based relational database, developed to streamline EnergyPlus building model simulation workflows. The application facilitates data generation, aggregation (across thermal zones), and visualization based on customizable user preferences, while the database efficiently stores and retrieves complex simulation data generated by EnergyPlus. We demonstrate the need for this application and database, emphasizing how existing approaches for generating, managing, and analyzing EnergyPlus simulation data can be cumbersome, particularly when handling a large number of building models with varying simulation setups. This integrated framework enables building energy engineers and researchers to simplify their EnergyPlus simulations, manage generated simulation data, perform data analyses, and support data-driven modeling tasks.
comment: This manuscript is a version of our paper accepted at the IEEE Power & Energy Society General Meeting (PESGM) 2025
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
Reinforcement learning (RL) has proven to be particularly effective in solving complex decision-making problems for a wide range of applications. From a control theory perspective, RL can be considered as an adaptive optimal control scheme. Lyapunov and barrier functions are the most commonly used certificates to guarantee system stability for a proposed/derived controller and constraint satisfaction guarantees, respectively, in control theoretic approaches. However, compared to theoretical guarantees available in control theoretic methods, RL lacks closed-loop stability of a computed policy and constraint satisfaction guarantees. Safe reinforcement learning refers to a class of constrained problems where the constraint violations lead to partial or complete system failure. The goal of this review is to provide an overview of safe RL techniques using Lyapunov and barrier functions to guarantee this notion of safety discussed (stability of the system in terms of a computed policy and constraint satisfaction during training and deployment). The different approaches employed are discussed in detail along with their shortcomings and benefits to provide critique and possible future research directions. Key motivation for this review is to discuss current theoretical approaches for safety and stability guarantees in RL similar to control theoretic approaches using Lyapunov and barrier functions. The review provides proven potential and promising scope of providing safety guarantees for complex dynamical systems with operational constraints using model-based and model-free RL.
comment: pages - 19, figures - 9, Submitted to IEEE Access
Comparing Building Thermal Dynamics Models and Estimation Methods for Grid-Edge Applications
We need computationally efficient and accurate building thermal dynamics models for use in grid-edge applications. This work evaluates two grey-box approaches for modeling building thermal dynamics: RC-network models and structured regression models. For RC-network models, we compare parameter estimation methods including Nonlinear Least Squares, Batch Estimation, and Maximum Likelihood Estimation. We use the Almon Lag Structure with Linear Least Squares for estimating the structured regression models. The performance of these models and methods is evaluated on simulated house and commercial building data across three different simulation types.
comment: This manuscript is a version of our paper accepted at the IEEE Power & Energy Society General Meeting (PESGM) 2025
Smart Residential Community Simulator for Developing and Benchmarking Energy Management Systems
Home Energy Management Systems (HEMS) are being actively developed for both individual houses and communities to support demand response in on-grid operation, and ensure resilience during off-grid scenarios. However, most simulators used for closed-loop HEMS testing are tailored to a specific distributed energy resource (DER) configuration with a fixed number of houses, limiting flexibility and scalability. This leads to additional development efforts to support diverse DER configurations across any number of houses and to integrate appropriate weather and load data pipelines. To address these limitations, we present a scalable simulator capable of modeling any number of houses in both on-grid and off-grid modes as a Gymnasium environment. Each house can have a unique DER configuration - Rooftop Solar Photovoltaics (PV), Battery-only, PV-only, or no DER - and includes models for air-conditioning and eight grouped circuit-level loads. The simulator integrates National Solar Radiation Database (NSRDB) weather and Pecan Street load datasets, supports three default controllers (two for off-grid, and one for on-grid scenarios), and includes performance metrics and visualization tools. We demonstrate its flexibility through simulations on individual houses and a four-house community with heterogeneous DERs, benchmarking the controllers across built-in metrics and computation time. The results highlight the simulator's capability to systematically evaluate control policy performance under varying system configurations.
comment: This manuscript is an extended version of our paper accepted at the IEEE SmartGridComm 2025
Optimization-Free Fast Optimal Control: Bang-Ride Property, Monotonicity, and Applications to Fast Battery Charging
Single-input fast optimal control problems, which aim to achieve the optimal objective as fast as possible, occur in various real-world applications. In the case of fast battery charging, the associated optimal control problem becomes computationally challenging when detailed battery models are used. A recent heuristic optimization-free algorithm can significantly reduce the computational cost and provide an approximate solution, consistent with many heuristic input profiles in practice. These heuristic solutions have several special properties: They follow a bang-ride pattern that always activates a constraint and applies the maximum feasible input. This work investigates when the above properties arise in the optimal input, and ultimately, when the heuristic input profiles satisfy necessary optimality conditions. By exploiting Pontryagin's maximum principle (PMP), we show that the optimal control is bang-ride under regularity conditions on constraint switching and local controllability of the system. Moreover, the special type of bang-ride behavior, i.e., applying the maximum feasible input, arises under the monotonicity of the system, objective function, and restricted sensitivity of the constraints. These results provide a theoretical justification for a class of charging heuristics and the fast optimization-free algorithm.
Large Scale Robotic Material Handling: Learning, Planning, and Control
Bulk material handling involves the efficient and precise moving of large quantities of materials, a core operation in many industries, including cargo ship unloading, waste sorting, construction, and demolition. These repetitive, labor-intensive, and safety-critical operations are typically performed using large hydraulic material handlers equipped with underactuated grippers. In this work, we present a comprehensive framework for the autonomous execution of large-scale material handling tasks. The system integrates specialized modules for environment perception, pile attack point selection, path planning, and motion control. The main contributions of this work are two reinforcement learning-based modules: an attack point planner that selects optimal grasping locations on the material pile to maximize removal efficiency and minimize the number of scoops, and a robust trajectory following controller that addresses the precision and safety challenges associated with underactuated grippers in movement, while utilizing their free-swinging nature to release material through dynamic throwing. We validate our framework through real-world experiments on a 40 t material handler in a representative worksite, focusing on two key tasks: high-throughput bulk pile management and high-precision truck loading. Comparative evaluations against human operators demonstrate the system's effectiveness in terms of precision, repeatability, and operational safety. To the best of our knowledge, this is the first complete automation of material handling tasks on a full scale.
comment: Preliminary version, currently undergoing review process
Fundamental limitations of monotonic tracking systems
We consider the monotonic tracking control problem for continuous-time single-input single-output linear systems using output-feedback linear controllers in this paper. We provide the necessary and sufficient conditions for this problem to be solvable and expose its fundamental limitations: the exact feasible locations of the plant zeros, the minimum controller order possible, and the maximum decay rate achievable for the closed-loop system. The relationship between these bounds is explained by a simple geometric shape for plants with a pair of complex-conjugate zeros.
Architecture and FPGA Implementation of Digital Time-to-Digital Converter for Sensing Applications
Many application domains face the challenges of high-power consumption and high computational demands, especially with the advancement in embedded machine learning and edge computing. Designing application-specific circuits is crucial to reducing hardware complexity and power consumption. In these perspectives, this paper presents the design of a Digital Time-to-Digital converter (DTDC) based on multiple delay line topologies. The DTDC is implemented in VHDL for the Xilinx Artix-7 AC701 FPGA device. Simulation results demonstrate the effectiveness of the circuit in converting the input period along a wide range up to 1ps. The designed circuit is implemented with less than 1% of the resource utilization on the target FPGA device.
XR Reality Check: What Commercial Devices Deliver for Spatial Tracking
Inaccurate spatial tracking in extended reality (XR) devices leads to virtual object jitter, misalignment, and user discomfort, fundamentally limiting immersive experiences and natural interactions. In this work, we introduce a novel testbed that enables simultaneous, synchronized evaluation of multiple XR devices under identical environmental and kinematic conditions. Leveraging this platform, we present the first comprehensive empirical benchmarking of five state-of-the-art XR devices across 16 diverse scenarios. Our results reveal substantial intra-device performance variation, with individual devices exhibiting up to 101\% increases in error when operating in featureless environments. We also demonstrate that tracking accuracy strongly correlates with visual conditions and motion dynamics. We also observe significant inter-device disparities, with performance differences of up to 2.8$\times$, which are closely linked to hardware specifications such as sensor configurations and dedicated processing units. Finally, we explore the feasibility of substituting a motion capture system with the Apple Vision Pro as a practical ground truth reference. While the Apple Vision Pro delivers highly accurate relative pose error estimates ($R^2 = 0.830$), its absolute pose error estimation remains limited ($R^2 = 0.387$), highlighting both its potential and its constraints for rigorous XR evaluation. This work establishes the first standardized framework for comparative XR tracking evaluation, providing the research community with reproducible methodologies, comprehensive benchmark datasets, and open-source tools that enable systematic analysis of tracking performance across devices and conditions, thereby accelerating the development of more robust spatial sensing technologies for XR systems.
Performance Benchmarking of Machine Learning Models for Terahertz Metamaterial Absorber Prediction
This study presents a polarization-insensitive ultra-broadband terahertz metamaterial absorber based on vanadium dioxide (VO2) and evaluates machine learning methods for predicting its absorption performance. The structure consists of a VO2 metasurface, a MF2 dielectric spacer, and a gold ground plane. It achieves more than 90% absorption between 5.72 and 11.11 THz, covering a 5.38 THz bandwidth with an average absorptance of 98.15%. A dataset of 9,018 samples was generated from full-wave simulations by varying patch width, dielectric thickness, and frequency. Six regression models were trained: Linear Regression, Support Vector Regression, Decision Tree, Random Forest, XGBoost, and Bagging. Performance was measured using adjusted R2, MAE, MSE, and RMSE. Ensemble models achieved the best results, with Bagging reaching an adjusted R2 of 0.9985 and RMSE of 0.0146. The workflow offers a faster alternative to exhaustive simulations and can be applied to other metamaterial designs, enabling efficient evaluation and optimization.
comment: 6 pages, 6 figures
DeePConverter: A Data-Driven Optimal Control Architecture for Grid-Connected Power Converters
Grid-connected power converters are ubiquitous in modern power systems, acting as grid interfaces of renewable energy sources, energy storage systems, electric vehicles, high-voltage DC systems, etc. Conventionally, power converters use multiple PID regulators to achieve different control objectives such as grid synchronization and voltage/power regulations, where the PID parameters are usually tuned based on a presumed (and often overly-simplified) power grid model. However, this may lead to inferior performance or even instabilities in practice, as the real power grid is highly complex, variable, and generally unknown. To tackle this problem, we employ a data-enabled predictive control (DeePC) to perform data-driven, optimal, and robust control for power converters. We call the converters that are operated in this way \textit{DeePConverters}. A DeePConverter can implicitly perceive the characteristics of the power grid from data and adjust its control strategy to achieve optimal and robust performance. We present the modular configurations, generalized structure, control behavior specification, detailed implementation, and computation of DeePConverters. High-fidelity simulations and hardware-in-the-loop (HIL) tests are provided to validate the effectiveness of DeePConverters.
Satellites are closer than you think: A near field MIMO approach for Ground stations
The rapid growth of low Earth orbit (LEO) satellite constellations has revolutionized broadband access, earth observation, and direct-to-device connectivity. However, the expansion of ground station infrastructure has not kept pace, creating a critical bottleneck in satellite-to-ground backhaul capacity. Traditional parabolic dish antennas, though effective for geostationary (GEO) satellites, are ill-suited for dense, fastmoving LEO networks due to mechanical steering delays and their inability to track multiple satellites simultaneously. Phased array antennas offer electronically steerable beams and multisatellite support, but their integration into ground stations is limited by the high cost, hardware issues, and complexity of achieving sufficient antenna gain. We introduce ArrayLink, a distributed phased array architecture that coherently combines multiple small commercially available panels to achieve high-gain beamforming and unlock line-of-sight MIMO spatial multiplexing with minimal additional capital expenditure. By spacing 16 (32x32) panels across a kilometer-scale aperture, ArrayLink enters the radiative near-field, focusing energy in both angle and range while supporting up to four simultaneous spatial streams on a single feeder link. Through rigorous theoretical analysis, detailed 2D beam pattern simulations and real-world hardware experiments, we show that ArrayLink (i) achieves dish-class gain with in range 1-2 dB of 1.47 m reflector, (ii) maintains four parallel streams at ranges of hundreds of kilometers (falling to two beyond 2000 km), and (iii) exhibits tight agreement across theory, simulation, and experiment with minimal variance. These findings open a practical and scalable path to boosting LEO backhaul capacity.
comment: 11 pages, 11 figures
Fake-Mamba: Real-Time Speech Deepfake Detection Using Bidirectional Mamba as Self-Attention's Alternative
Advances in speech synthesis intensify security threats, motivating real-time deepfake detection research. We investigate whether bidirectional Mamba can serve as a competitive alternative to Self-Attention in detecting synthetic speech. Our solution, Fake-Mamba, integrates an XLSR front-end with bidirectional Mamba to capture both local and global artifacts. Our core innovation introduces three efficient encoders: TransBiMamba, ConBiMamba, and PN-BiMamba. Leveraging XLSR's rich linguistic representations, PN-BiMamba can effectively capture the subtle cues of synthetic speech. Evaluated on ASVspoof 21 LA, 21 DF, and In-The-Wild benchmarks, Fake-Mamba achieves 0.97%, 1.74%, and 5.85% EER, respectively, representing substantial relative gains over SOTA models XLSR-Conformer and XLSR-Mamba. The framework maintains real-time inference across utterance lengths, demonstrating strong generalization and practical viability. The code is available at https://github.com/xuanxixi/Fake-Mamba.
comment: Accepted at IEEE ASRU 2025
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Optimal Coupled Sensor Placement and Path-Planning in Unknown Time-Varying Environments
We address path-planning for a mobile agent to navigate in an unknown environment with minimum exposure to a spatially and temporally varying threat field. The threat field is estimated using pointwise noisy measurements from a mobile sensor network. For this problem, we present a new information gain measure for optimal sensor placement that quantifies reduction in uncertainty in the path cost rather than the environment state. This measure, which we call the context-relevant mutual information (CRMI), couples the sensor placement and path-planning problem. We propose an iterative coupled sensor configuration and path-planning (CSCP) algorithm. At each iteration, this algorithm places sensors to maximize CRMI, updates the threat estimate using new measurements, and recalculates the path with minimum expected exposure to the threat. The iterations converge when the path cost variance, which is an indicator of risk, reduces below a desired threshold. We show that CRMI is submodular, and therefore, greedy optimization provides near-optimal sensor placements while maintaining computational efficiency of the CSCP algorithm. Distance-based sensor reconfiguration costs are introduced in a modified CRMI measure, which we also show to be submodular. Through numerical simulations, we demonstrate that the principal advantage of this algorithm is that near-optimal low-variance paths are achieved using far fewer sensor measurements as compared to a standard sensor placement method.
Impedance Space Method: Time-Independent Parametric Ellipses for Robot Compliant Control
This paper proposes a novel 3D graphical representation for impedance control, called the impedance space, to foster the analysis of the dynamic behavior of robotic compliant controllers. The method overcomes limitations of existing 2D graphical approaches by incorporating mass, stiffness, and damping dynamics, and associates the impedance control parameters with linear transformations to plot a parametric 3D ellipse and its projections in 2D for a mass-spring-damper impedance under sinusoidal reference. Experimental evaluation demonstrates the effectiveness of the proposed representation for analysis of impedance control. The method applies to various compliant control topologies and can be extended to other model-based control approaches.
comment: Accepted version to be published in the IEEE Latin America Transactions. There were changes aiming to clarify some of the ideas exposed in the paper, which led to the increase in the number of figures and pages. The abstract and the title also changed. The paper now has 8 pages and 10 figures
Dynamic Transfer Policies for Parallel Queues
We consider the problem of load balancing in parallel queues by transferring customers between them at discrete points in time. Holding costs accrue as customers wait in the queue, while transfer decisions incur both fixed (setup) costs and variable costs that increase with the number of transfers and travel distance, and vary by transfer direction. Our work is primarily motivated by inter-facility patient transfers to address imbalanced congestion and inequity in access to care during surges in hospital demand. Analyzing an associated fluid control problem, we show that under general assumptions, including time-varying arrivals and convex holding costs, the optimal policy partitions the state-space into a well-defined $\textit{no-transfer region}$ and its complement, implying that transferring is optimal if and only if the system is sufficiently imbalanced. In the absence of fixed transfer costs, an optimal policy moves the state to the no-transfer region's boundary; in contrast, with fixed costs, the state is moved to its relative interior. Leveraging our structural results, we propose a simulation-based approximate dynamic programming (ADP) algorithm to find effective transfer policies for the stochastic system. We investigate the performance and robustness of the fluid and ADP policies in a case study calibrated using data during the COVID-19 pandemic in the Greater Toronto Area, which demonstrates that transferring patients between hospitals could result in up to 27.7% reduction in total cost with relatively few transfers.
comment: 57 pages, 10 figures
SIL Allocation for Mitigation Safety Functions
SIL (Safety Integrity Level) allocation plays a pivotal role in evaluating the significance of Safety Functions (SFs) within high-risk industries. The outcomes of a SIL allocation study determine the design specifications necessary to uphold the Probability of Failure on Demand (PFD) below permissible limits, thus managing risk effectively. While extensive research has focused on SIL allocation for preventive SFs, there is a noticeable gap in attention towards mitigation SFs. To address this gap, this paper discusses the shortcomings of current methods and proposes a new approach to overcome them. The principles of the proposed method are substantiated by detailed mathematical formulation and the practical application of the method is demonstrated through a case study in a road tunnel project.
Data-Driven Certificate Synthesis
We investigate the problem of verifying different properties of discrete time dynamical systems, namely, reachability, safety and reach-while-avoid. To achieve this, we adopt a data driven perspective and, using past system trajectories as data, we aim at learning a specific function termed certificate for each property we wish to verify. We seek to minimize a loss function, designed to encompass conditions on the certificate to be learned that encode the satisfaction of the associated property. Besides learning a certificate, we quantify probabilistically its generalization properties, namely, how likely it is for a certificate to be valid (and hence for the associated property to be satisfied) when it comes to a new system trajectory not included in the training data set. We view this problem under the realm of probably approximately correct (PAC) learning under the notion of compression, and use recent advancements of the so-called scenario approach to obtain scalable generalization bounds on the learned certificates. To achieve this, we design a novel algorithm that minimizes the loss function and hence constructs a certificate, and at the same time determines a quantity termed compression, which is instrumental in obtaining meaningful probabilistic guarantees. This process is novel per se and provides a constructive mechanism for compression set calculation, thus opening the road for its use to more general non-convex optimization problems. We verify the efficacy of our methodology on several numerical case studies, and compare it (both theoretically and numerically) with closely related results on data-driven property verification.
comment: 18 pages, submitted to Automatica
Adaptive Informed Deep Neural Networks for Power Flow Analysis
This study introduces PINN4PF, an end-to-end deep learning architecture for power flow (PF) analysis that effectively captures the nonlinear dynamics of large-scale modern power systems. The proposed neural network (NN) architecture consists of two important advancements in the training pipeline: (A) a double-head feed-forward NN that aligns with PF analysis, including an activation function that adjusts to the net active and reactive power injections patterns, and (B) a physics-based loss function that partially incorporates power system topology information through a novel hidden function. The effectiveness of the proposed architecture is illustrated through 4-bus, 15-bus, 290-bus, and 2224-bus test systems and is evaluated against two baselines: a linear regression model (LR) and a black-box NN (MLP). The comparison is based on (i) generalization ability, (ii) robustness, (iii) impact of training dataset size on generalization ability, (iv) accuracy in approximating derived PF quantities (specifically line current, line active power, and line reactive power), and (v) scalability. Results demonstrate that PINN4PF outperforms both baselines across all test systems by up to two orders of magnitude not only in terms of direct criteria, e.g., generalization ability, but also in terms of approximating derived physical quantities.
comment: 17 pages, 7 figures, 4 tables
The Holonomy of Optimal Mass Transport: The Gaussian-Linear Case
The theory of Monge-Kantorovich Optimal Mass Transport (OMT) has in recent years spurred a fast developing phase of research in stochastic control, control of ensemble systems, thermodynamics, data science, and several other fields in engineering and science. We herein introduce a new type of transportation problems. The salient feature of these problems is that particles/agents in the ensemble are labeled and their relative position along their journey is of interest. Of particular importance in our program are control laws that steer ensembles along cycles ensuring that individual particles return to their original position. This feature is in contrast with the classical theory of optimal transport where the primary object of study is the path of probability densities, without any concern about particle labels. In the theory that we present, we focus on the case Gaussian distributions and linear dynamics, and explore a hitherto unstudied sub-Riemannian structure of Monge-Kantorovich transport where the relative position of particles along their journey is modeled by the holonomy of the transportation schedule. From this vantage point, we discuss several other problems of independent interest.
comment: 22 pages, 8 figures
TinyMPC: Model-Predictive Control on Resource-Constrained Microcontrollers ICRA 2024
Model-predictive control (MPC) is a powerful tool for controlling highly dynamic robotic systems subject to complex constraints. However, MPC is computationally demanding, and is often impractical to implement on small, resource-constrained robotic platforms. We present TinyMPC, a high-speed MPC solver with a low memory footprint targeting the microcontrollers common on small robots. Our approach is based on the alternating direction method of multipliers (ADMM) and leverages the structure of the MPC problem for efficiency. We demonstrate TinyMPC's effectiveness by benchmarking against the state-of-the-art solver OSQP, achieving nearly an order of magnitude speed increase, as well as through hardware experiments on a 27 gram quadrotor, demonstrating high-speed trajectory tracking and dynamic obstacle avoidance. TinyMPC is publicly available at https://tinympc.org.
comment: Accepted at ICRA 2024. First three authors contributed equally and are listed in alphabetical order. Publicly available at https://tinympc.org
Dynamic load balancing for cloud systems under heterogeneous setup delays
We consider a distributed cloud service deployed at a set of distinct server pools. Arriving jobs are classified into heterogeneous types, in accordance with their setup times which are differentiated at each of the pools. A dispatcher for each job type controls the balance of load between pools, based on decentralized feedback. The system of rates and queues is modeled by a fluid differential equation system, and analyzed via convex optimization. A first, myopic policy is proposed, based on task delay-to-service. Under a simplified dynamic fluid queue model, we prove global convergence to an equilibrium point which minimizes the mean setup time; however queueing delays are incurred with this method. A second proposal is then developed based on proximal optimization, which explicitly models the setup queue and is proved to reach an optimal equilibrium, devoid of queueing delay. Results are demonstrated through a simulation example.
comment: 20 pages, 1 figure
Robotics
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
Learning skills from human motions offers a promising path toward generalizable policies for whole-body humanoid control, yet two key cornerstones are missing: (1) a high-quality motion tracking framework that faithfully transforms large-scale kinematic references into robust and extremely dynamic motions on real hardware, and (2) a distillation approach that can effectively learn these motion primitives and compose them to solve downstream tasks. We address these gaps with BeyondMimic, the first real-world framework to learn from human motions for versatile and naturalistic humanoid control via guided diffusion. Our framework provides a motion tracking pipeline capable of challenging skills such as jumping spins, sprinting, and cartwheels with state-of-the-art motion quality. Moving beyond mimicking existing motions and synthesize novel ones, we further introduce a unified diffusion policy that enables zero-shot task-specific control at test time using simple cost functions. Deployed on hardware, BeyondMimic performs diverse tasks at test time, including waypoint navigation, joystick teleoperation, and obstacle avoidance, bridging sim-to-real motion tracking and flexible synthesis of human motion primitives for whole-body control. https://beyondmimic.github.io/.
comment: 9 pages, 1 figure
ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks
Language-guided long-horizon mobile manipulation has long been a grand challenge in embodied semantic reasoning, generalizable manipulation, and adaptive locomotion. Three fundamental limitations hinder progress: First, although large language models have improved spatial reasoning and task planning through semantic priors, existing implementations remain confined to tabletop scenarios, failing to address the constrained perception and limited actuation ranges of mobile platforms. Second, current manipulation strategies exhibit insufficient generalization when confronted with the diverse object configurations encountered in open-world environments. Third, while crucial for practical deployment, the dual requirement of maintaining high platform maneuverability alongside precise end-effector control in unstructured settings remains understudied. In this work, we present ODYSSEY, a unified mobile manipulation framework for agile quadruped robots equipped with manipulators, which seamlessly integrates high-level task planning with low-level whole-body control. To address the challenge of egocentric perception in language-conditioned tasks, we introduce a hierarchical planner powered by a vision-language model, enabling long-horizon instruction decomposition and precise action execution. At the control level, our novel whole-body policy achieves robust coordination across challenging terrains. We further present the first benchmark for long-horizon mobile manipulation, evaluating diverse indoor and outdoor scenarios. Through successful sim-to-real transfer, we demonstrate the system's generalization and robustness in real-world deployments, underscoring the practicality of legged manipulators in unstructured environments. Our work advances the feasibility of generalized robotic assistants capable of complex, dynamic tasks. Our project page: https://kaijwang.github.io/odyssey.github.io/
Verti-Arena: A Controllable and Standardized Indoor Testbed for Multi-Terrain Off-Road Autonomy
Off-road navigation is an important capability for mobile robots deployed in environments that are inaccessible or dangerous to humans, such as disaster response or planetary exploration. Progress is limited due to the lack of a controllable and standardized real-world testbed for systematic data collection and validation. To fill this gap, we introduce Verti-Arena, a reconfigurable indoor facility designed specifically for off-road autonomy. By providing a repeatable benchmark environment, Verti-Arena supports reproducible experiments across a variety of vertically challenging terrains and provides precise ground truth measurements through onboard sensors and a motion capture system. Verti-Arena also supports consistent data collection and comparative evaluation of algorithms in off-road autonomy research. We also develop a web-based interface that enables research groups worldwide to remotely conduct standardized off-road autonomy experiments on Verti-Arena.
comment: 6 pages
Emergent morphogenesis via planar fabrication enabled by a reduced model of composites
The ability to engineer complex three-dimensional shapes from planar sheets with precise, programmable control underpins emerging technologies in soft robotics, reconfigurable devices, and functional materials. Here, we present a reduced-order numerical and experimental framework for a bilayer system consisting of a stimuli-responsive thermoplastic sheet (Shrinky Dink) bonded to a kirigami-patterned, inert plastic layer. Upon uniform heating, the active layer contracts while the patterned layer constrains in-plane stretch but allows out-of-plane bending, yielding programmable 3D morphologies from simple planar precursors. Our approach enables efficient computational design and scalable manufacturing of 3D forms with a single-layer reduced model that captures the coupled mechanics of stretching and bending. Unlike traditional bilayer modeling, our framework collapses the multilayer composite into a single layer of nodes and elements, reducing the degrees of freedom and enabling simulation on a 2D geometry. This is achieved by introducing a novel energy formulation that captures the coupling between in-plane stretch mismatch and out-of-plane bending - extending beyond simple isotropic linear elastic models. Experimentally, we establish a fully planar, repeatable fabrication protocol using a stimuli-responsive thermoplastic and a laser-cut inert plastic layer. The programmed strain mismatch drives an array of 3D morphologies, such as bowls, canoes, and flower petals, all verified by both simulation and physical prototypes.
comment: GitHub repository: https://github.com/StructuresComp/discrete-shells-shrinky-dink/
COMponent-Aware Pruning for Accelerated Control Tasks in Latent Space Models
The rapid growth of resource-constrained mobile platforms, including mobile robots, wearable systems, and Internet-of-Things devices, has increased the demand for computationally efficient neural network controllers (NNCs) that can operate within strict hardware limitations. While deep neural networks (DNNs) demonstrate superior performance in control applications, their substantial computational complexity and memory requirements present significant barriers to practical deployment on edge devices. This paper introduces a comprehensive model compression methodology that leverages component-aware structured pruning to determine the optimal pruning magnitude for each pruning group, ensuring a balance between compression and stability for NNC deployment. Our approach is rigorously evaluated on Temporal Difference Model Predictive Control (TD-MPC), a state-of-the-art model-based reinforcement learning algorithm, with a systematic integration of mathematical stability guarantee properties, specifically Lyapunov criteria. The key contribution of this work lies in providing a principled framework for determining the theoretical limits of model compression while preserving controller stability. Experimental validation demonstrates that our methodology successfully reduces model complexity while maintaining requisite control performance and stability characteristics. Furthermore, our approach establishes a quantitative boundary for safe compression ratios, enabling practitioners to systematically determine the maximum permissible model reduction before violating critical stability properties, thereby facilitating the confident deployment of compressed NNCs in resource-limited environments.
comment: Submitted in: The 2026 IEEE/SICE International Symposium on System Integration (SII 2026)
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies
In this paper, we propose AimBot, a lightweight visual augmentation technique that provides explicit spatial cues to improve visuomotor policy learning in robotic manipulation. AimBot overlays shooting lines and scope reticles onto multi-view RGB images, offering auxiliary visual guidance that encodes the end-effector's state. The overlays are computed from depth images, camera extrinsics, and the current end-effector pose, explicitly conveying spatial relationships between the gripper and objects in the scene. AimBot incurs minimal computational overhead (less than 1 ms) and requires no changes to model architectures, as it simply replaces original RGB images with augmented counterparts. Despite its simplicity, our results show that AimBot consistently improves the performance of various visuomotor policies in both simulation and real-world settings, highlighting the benefits of spatially grounded visual feedback.
comment: CoRL 2025
Capsizing-Guided Trajectory Optimization for Autonomous Navigation with Rough Terrain
It is a challenging task for ground robots to autonomously navigate in harsh environments due to the presence of non-trivial obstacles and uneven terrain. This requires trajectory planning that balances safety and efficiency. The primary challenge is to generate a feasible trajectory that prevents robot from tip-over while ensuring effective navigation. In this paper, we propose a capsizing-aware trajectory planner (CAP) to achieve trajectory planning on the uneven terrain. The tip-over stability of the robot on rough terrain is analyzed. Based on the tip-over stability, we define the traversable orientation, which indicates the safe range of robot orientations. This orientation is then incorporated into a capsizing-safety constraint for trajectory optimization. We employ a graph-based solver to compute a robust and feasible trajectory while adhering to the capsizing-safety constraint. Extensive simulation and real-world experiments validate the effectiveness and robustness of the proposed method. The results demonstrate that CAP outperforms existing state-of-the-art approaches, providing enhanced navigation performance on uneven terrains.
Aerial Target Encirclement and Interception with Noisy Range Observations
This paper proposes a strategy to encircle and intercept a non-cooperative aerial point-mass moving target by leveraging noisy range measurements for state estimation. In this approach, the guardians actively ensure the observability of the target by using an anti-synchronization (AS), 3D ``vibrating string" trajectory, which enables rapid position and velocity estimation based on the Kalman filter. Additionally, a novel anti-target controller is designed for the guardians to enable adaptive transitions from encircling a protected target to encircling, intercepting, and neutralizing a hostile target, taking into consideration the input constraints of the guardians. Based on the guaranteed uniform observability, the exponentially bounded stability of the state estimation error and the convergence of the encirclement error are rigorously analyzed. Simulation results and real-world UAV experiments are presented to further validate the effectiveness of the system design.
comment: The paper has been accepted in Automatica
PCHands: PCA-based Hand Pose Synergy Representation on Manipulators with N-DoF
We consider the problem of learning a common representation for dexterous manipulation across manipulators of different morphologies. To this end, we propose PCHands, a novel approach for extracting hand postural synergies from a large set of manipulators. We define a simplified and unified description format based on anchor positions for manipulators ranging from 2-finger grippers to 5-finger anthropomorphic hands. This enables learning a variable-length latent representation of the manipulator configuration and the alignment of the end-effector frame of all manipulators. We show that it is possible to extract principal components from this latent representation that is universal across manipulators of different structures and degrees of freedom. To evaluate PCHands, we use this compact representation to encode observation and action spaces of control policies for dexterous manipulation tasks learned with RL. In terms of learning efficiency and consistency, the proposed representation outperforms a baseline that learns the same tasks in joint space. We additionally show that PCHands performs robustly in RL from demonstration, when demonstrations are provided from a different manipulator. We further support our results with real-world experiments that involve a 2-finger gripper and a 4-finger anthropomorphic hand. Code and additional material are available at https://hsp-iit.github.io/PCHands/.
comment: 2025 IEEE-RAS 24th International Conference on Humanoid Robots
MolmoAct: Action Reasoning Models that can Reason in Space
Reasoning is central to purposeful action, yet most robotic foundation models map perception and instructions directly to control, which limits adaptability, generalization, and semantic grounding. We introduce Action Reasoning Models (ARMs), a class of vision-language-action models that integrate perception, planning, and control through a structured three-stage pipeline. Our model, MolmoAct, encodes observations and instructions into depth-aware perception tokens, generates mid-level spatial plans as editable trajectory traces, and predicts precise low-level actions, enabling explainable and steerable behavior. MolmoAct-7B-D achieves strong performance across simulation and real-world settings: 70.5% zero-shot accuracy on SimplerEnv Visual Matching tasks, surpassing closed-source Pi-0 and GR00T N1; 86.6% average success on LIBERO, including an additional 6.3% gain over ThinkAct on long-horizon tasks; and in real-world fine-tuning, an additional 10% (single-arm) and an additional 22.7% (bimanual) task progression over Pi-0-FAST. It also outperforms baselines by an additional 23.3% on out-of-distribution generalization and achieves top human-preference scores for open-ended instruction following and trajectory steering. Furthermore, we release, for the first time, the MolmoAct Dataset -- a mid-training robot dataset comprising over 10,000 high quality robot trajectories across diverse scenarios and tasks. Training with this dataset yields an average 5.5% improvement in general performance over the base model. We release all model weights, training code, our collected dataset, and our action reasoning dataset, establishing MolmoAct as both a state-of-the-art robotics foundation model and an open blueprint for building ARMs that transform perception into purposeful action through structured reasoning. Blogpost: https://allenai.org/blog/molmoact
comment: Appendix on Blogpost: https://allenai.org/blog/molmoact
Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning
This paper introduces an advanced AI-driven perception system for autonomous quadcopter navigation in GPS-denied indoor environments. The proposed framework leverages cloud computing to offload computationally intensive tasks and incorporates a custom-designed printed circuit board (PCB) for efficient sensor data acquisition, enabling robust navigation in confined spaces. The system integrates YOLOv11 for object detection, Depth Anything V2 for monocular depth estimation, a PCB equipped with Time-of-Flight (ToF) sensors and an Inertial Measurement Unit (IMU), and a cloud-based Large Language Model (LLM) for context-aware decision-making. A virtual safety envelope, enforced by calibrated sensor offsets, ensures collision avoidance, while a multithreaded architecture achieves low-latency processing. Enhanced spatial awareness is facilitated by 3D bounding box estimation with Kalman filtering. Experimental results in an indoor testbed demonstrate strong performance, with object detection achieving a mean Average Precision (mAP50) of 0.6, depth estimation Mean Absolute Error (MAE) of 7.2 cm, only 16 safety envelope breaches across 42 trials over approximately 11 minutes, and end-to-end system latency below 1 second. This cloud-supported, high-intelligence framework serves as an auxiliary perception and navigation system, complementing state-of-the-art drone autonomy for GPS-denied confined spaces.
DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts AAAI'26
Long-Horizon (LH) tasks in Human-Scene Interaction (HSI) are complex multi-step tasks that require continuous planning, sequential decision-making, and extended execution across domains to achieve the final goal. However, existing methods heavily rely on skill chaining by concatenating pre-trained subtasks, with environment observations and self-state tightly coupled, lacking the ability to generalize to new combinations of environments and skills, failing to complete various LH tasks across domains. To solve this problem, this paper presents DETACH, a cross-domain learning framework for LH tasks via biologically inspired dual-stream disentanglement. Inspired by the brain's "where-what" dual pathway mechanism, DETACH comprises two core modules: i) an environment learning module for spatial understanding, which captures object functions, spatial relationships, and scene semantics, achieving cross-domain transfer through complete environment-self disentanglement; ii) a skill learning module for task execution, which processes self-state information including joint degrees of freedom and motor patterns, enabling cross-skill transfer through independent motor pattern encoding. We conducted extensive experiments on various LH tasks in HSI scenes. Compared with existing methods, DETACH can achieve an average subtasks success rate improvement of 23% and average execution efficiency improvement of 29%.
comment: 14 pages,8 figures. Submitted to AAAI'26
Touch Speaks, Sound Feels: A Multimodal Approach to Affective and Social Touch from Robots to Humans
Affective tactile interaction constitutes a fundamental component of human communication. In natural human-human encounters, touch is seldom experienced in isolation; rather, it is inherently multisensory. Individuals not only perceive the physical sensation of touch but also register the accompanying auditory cues generated through contact. The integration of haptic and auditory information forms a rich and nuanced channel for emotional expression. While extensive research has examined how robots convey emotions through facial expressions and speech, their capacity to communicate social gestures and emotions via touch remains largely underexplored. To address this gap, we developed a multimodal interaction system incorporating a 5*5 grid of 25 vibration motors synchronized with audio playback, enabling robots to deliver combined haptic-audio stimuli. In an experiment involving 32 Chinese participants, ten emotions and six social gestures were presented through vibration, sound, or their combination. Participants rated each stimulus on arousal and valence scales. The results revealed that (1) the combined haptic-audio modality significantly enhanced decoding accuracy compared to single modalities; (2) each individual channel-vibration or sound-effectively supported certain emotions recognition, with distinct advantages depending on the emotional expression; and (3) gestures alone were generally insufficient for conveying clearly distinguishable emotions. These findings underscore the importance of multisensory integration in affective human-robot interaction and highlight the complementary roles of haptic and auditory cues in enhancing emotional communication.
SwarmVLM: VLM-Guided Impedance Control for Autonomous Navigation of Heterogeneous Robots in Dynamic Warehousing
With the growing demand for efficient logistics, unmanned aerial vehicles (UAVs) are increasingly being paired with automated guided vehicles (AGVs). While UAVs offer the ability to navigate through dense environments and varying altitudes, they are limited by battery life, payload capacity, and flight duration, necessitating coordinated ground support. Focusing on heterogeneous navigation, SwarmVLM addresses these limitations by enabling semantic collaboration between UAVs and ground robots through impedance control. The system leverages the Vision Language Model (VLM) and the Retrieval-Augmented Generation (RAG) to adjust impedance control parameters in response to environmental changes. In this framework, the UAV acts as a leader using Artificial Potential Field (APF) planning for real-time navigation, while the ground robot follows via virtual impedance links with adaptive link topology to avoid collisions with short obstacles. The system demonstrated a 92% success rate across 12 real-world trials. Under optimal lighting conditions, the VLM-RAG framework achieved 8% accuracy in object detection and selection of impedance parameters. The mobile robot prioritized short obstacle avoidance, occasionally resulting in a lateral deviation of up to 50 cm from the UAV path, which showcases safe navigation in a cluttered setting.
AgentWorld: An Interactive Simulation Platform for Scene Construction and Mobile Robotic Manipulation
We introduce AgentWorld, an interactive simulation platform for developing household mobile manipulation capabilities. Our platform combines automated scene construction that encompasses layout generation, semantic asset placement, visual material configuration, and physics simulation, with a dual-mode teleoperation system supporting both wheeled bases and humanoid locomotion policies for data collection. The resulting AgentWorld Dataset captures diverse tasks ranging from primitive actions (pick-and-place, push-pull, etc.) to multistage activities (serve drinks, heat up food, etc.) across living rooms, bedrooms, and kitchens. Through extensive benchmarking of imitation learning methods including behavior cloning, action chunking transformers, diffusion policies, and vision-language-action models, we demonstrate the dataset's effectiveness for sim-to-real transfer. The integrated system provides a comprehensive solution for scalable robotic skill acquisition in complex home environments, bridging the gap between simulation-based training and real-world deployment. The code, datasets will be available at https://yizhengzhang1.github.io/agent_world/
comment: Accepted by Conference on Robot Learning 2025
Robot and Overhead Crane Collaboration Scheme to Enhance Payload Manipulation
This paper presents a scheme to enhance payload manipulation using a robot collaborating with an overhead crane. In the current industrial practice, when the crane's payload has to be accurately manipulated and located in a desired position, the task becomes laborious and risky since the operators have to guide the fine motions of the payload by hand. In the proposed collaborative scheme, the crane lifts the payload while the robot's end-effector guides it toward the desired position. The only link between the robot and the crane is the interaction force produced during the guiding of the payload. Two admittance transfer functions are considered to accomplish harmless and smooth contact with the payload. The first is used in a position-based admittance control integrated with the robot. The second one adds compliance to the crane by processing the interaction force through the admittance transfer function to generate a crane's velocity command that makes the crane follow the payload. Then the robot's end-effector and the crane move collaboratively to guide the payload to the desired location. A method is presented to design the admittance controllers that accomplish a fluent robot-crane collaboration. Simulations and experiments validating the scheme potential are shown.
Multi-view Normal and Distance Guidance Gaussian Splatting for Surface Reconstruction IROS 2025
3D Gaussian Splatting (3DGS) achieves remarkable results in the field of surface reconstruction. However, when Gaussian normal vectors are aligned within the single-view projection plane, while the geometry appears reasonable in the current view, biases may emerge upon switching to nearby views. To address the distance and global matching challenges in multi-view scenes, we design multi-view normal and distance-guided Gaussian splatting. This method achieves geometric depth unification and high-accuracy reconstruction by constraining nearby depth maps and aligning 3D normals. Specifically, for the reconstruction of small indoor and outdoor scenes, we propose a multi-view distance reprojection regularization module that achieves multi-view Gaussian alignment by computing the distance loss between two nearby views and the same Gaussian surface. Additionally, we develop a multi-view normal enhancement module, which ensures consistency across views by matching the normals of pixel points in nearby views and calculating the loss. Extensive experimental results demonstrate that our method outperforms the baseline in both quantitative and qualitative evaluations, significantly enhancing the surface reconstruction capability of 3DGS.
comment: This paper has been accepted by IROS 2025
LAURON VI: A Six-Legged Robot for Dynamic Walking
Legged locomotion enables robotic systems to traverse extremely challenging terrains. In many real-world scenarios, the terrain is not that difficult and these mixed terrain types introduce the need for flexible use of different walking strategies to achieve mission goals in a fast, reliable, and energy-efficient way. Six-legged robots have a high degree of flexibility and inherent stability that aids them in traversing even some of the most difficult terrains, such as collapsed buildings. However, their lack of fast walking gaits for easier surfaces is one reason why they are not commonly applied in these scenarios. This work presents LAURON VI, a six-legged robot platform for research on dynamic walking gaits as well as on autonomy for complex field missions. The robot's 18 series elastic joint actuators offer high-frequency interfaces for Cartesian impedance and pure torque control. We have designed, implemented, and compared three control approaches: kinematic-based, model-predictive, and reinforcement-learned controllers. The robot hardware and the different control approaches were extensively tested in a lab environment as well as on a Mars analog mission. The introduction of fast locomotion strategies for LAURON VI makes six-legged robots vastly more suitable for a wide range of real-world applications.
Risk Map As Middleware: Towards Interpretable Cooperative End-to-end Autonomous Driving for Risk-Aware Planning
End-to-end paradigm has emerged as a promising approach to autonomous driving. However, existing single-agent end-to-end pipelines are often constrained by occlusion and limited perception range, resulting in hazardous driving. Furthermore, their black-box nature prevents the interpretability of the driving behavior, leading to an untrustworthiness system. To address these limitations, we introduce Risk Map as Middleware (RiskMM) and propose an interpretable cooperative end-to-end driving framework. The risk map learns directly from the driving data and provides an interpretable spatiotemporal representation of the scenario from the upstream perception and the interactions between the ego vehicle and the surrounding environment for downstream planning. RiskMM first constructs a multi-agent spatiotemporal representation with unified Transformer-based architecture, then derives risk-aware representations by modeling interactions among surrounding environments with attention. These representations are subsequently fed into a learning-based Model Predictive Control (MPC) module. The MPC planner inherently accommodates physical constraints and different vehicle types and can provide interpretation by aligning learned parameters with explicit MPC elements. Evaluations conducted on the real-world V2XPnP-Seq dataset confirm that RiskMM achieves superior and robust performance in risk-aware trajectory planning, significantly enhancing the interpretability of the cooperative end-to-end driving framework. The codebase will be released to facilitate future research in this field.
MoRoCo: Multi-operator-robot Coordination, Interaction and Exploration under Restricted Communication
Fleets of autonomous robots are increasingly deployed alongside multiple human operators to explore unknown environments, identify salient features, and perform complex tasks in scenarios such as subterranean exploration, reconnaissance, and search-and-rescue missions. In these contexts, communication is often severely limited to short-range exchanges via ad-hoc networks, posing challenges to coordination. While recent studies have addressed multi-robot exploration under communication constraints, they largely overlook the essential role of human operators and their real-time interaction with robotic teams. Operators may demand timely updates on the exploration progress and robot status, reprioritize or cancel tasks dynamically, or request live video feeds and control access. Conversely, robots may seek human confirmation for anomalous events or require help recovering from motion or planning failures. To enable such bilateral, context-aware interactions under restricted communication, this work proposes MoRoCo, a unified framework for online coordination and exploration in multi-operator, multi-robot systems. MoRoCo enables the team to adaptively switch among three coordination modes: spread mode for parallelized exploration with intermittent data sharing, migrate mode for coordinated relocation, and chain mode for maintaining high-bandwidth connectivity through multi-hop links. These transitions are managed through distributed algorithms via only local communication. Extensive large-scale human-in-the-loop simulations and hardware experiments validate the necessity of incorporating human robot interactions and demonstrate that MoRoCo enables efficient, reliable coordination under limited communication, marking a significant step toward robust human-in-the-loop multi-robot autonomy in challenging environments.
comment: 38 pages, 28 figures, Submitted to the International Journal of Robotics Research (IJRR). Project website: https://zl-tian.github.io/MoRoCo/
GraphCoT-VLA: A 3D Spatial-Aware Reasoning Vision-Language-Action Model for Robotic Manipulation with Ambiguous Instructions
Vision-language-action models have emerged as a crucial paradigm in robotic manipulation. However, existing VLA models exhibit notable limitations in handling ambiguous language instructions and unknown environmental states. Furthermore, their perception is largely constrained to static two-dimensional observations, lacking the capability to model three-dimensional interactions between the robot and its environment. To address these challenges, this paper proposes GraphCoT-VLA, an efficient end-to-end model. To enhance the model's ability to interpret ambiguous instructions and improve task planning, we design a structured Chain-of-Thought reasoning module that integrates high-level task understanding and planning, failed task feedback, and low-level imaginative reasoning about future object positions and robot actions. Additionally, we construct a real-time updatable 3D Pose-Object graph, which captures the spatial configuration of robot joints and the topological relationships between objects in 3D space, enabling the model to better understand and manipulate their interactions. We further integrates a dropout hybrid reasoning strategy to achieve efficient control outputs. Experimental results across multiple real-world robotic tasks demonstrate that GraphCoT-VLA significantly outperforms existing methods in terms of task success rate and response speed, exhibiting strong generalization and robustness in open environments and under uncertain instructions.
comment: 10 pages, 6 figures
Grasp-HGN: Grasping the Unexpected
For transradial amputees, robotic prosthetic hands promise to regain the capability to perform daily living activities. To advance next-generation prosthetic hand control design, it is crucial to address current shortcomings in robustness to out of lab artifacts, and generalizability to new environments. Due to the fixed number of object to interact with in existing datasets, contrasted with the virtually infinite variety of objects encountered in the real world, current grasp models perform poorly on unseen objects, negatively affecting users' independence and quality of life. To address this: (i) we define semantic projection, the ability of a model to generalize to unseen object types and show that conventional models like YOLO, despite 80% training accuracy, drop to 15% on unseen objects. (ii) we propose Grasp-LLaVA, a Grasp Vision Language Model enabling human-like reasoning to infer the suitable grasp type estimate based on the object's physical characteristics resulting in a significant 50.2% accuracy over unseen object types compared to 36.7% accuracy of an SOTA grasp estimation model. Lastly, to bridge the performance-latency gap, we propose Hybrid Grasp Network (HGN), an edge-cloud deployment infrastructure enabling fast grasp estimation on edge and accurate cloud inference as a fail-safe, effectively expanding the latency vs. accuracy Pareto. HGN with confidence calibration (DC) enables dynamic switching between edge and cloud models, improving semantic projection accuracy by 5.6% (to 42.3%) with 3.5x speedup over the unseen object types. Over a real-world sample mix, it reaches 86% average accuracy (12.2% gain over edge-only), and 2.2x faster inference than Grasp-LLaVA alone.
comment: Paper accepted at ACM Transactions on Embedded Computing Systems
AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning ICCV2025
Visual Robot Manipulation (VRM) aims to enable a robot to follow natural language instructions based on robot states and visual observations, and therefore requires costly multi-modal data. To compensate for the deficiency of robot data, existing approaches have employed vision-language pretraining with large-scale data. However, they either utilize web data that differs from robotic tasks, or train the model in an implicit way (e.g., predicting future frames at the pixel level), thus showing limited generalization ability under insufficient robot data. In this paper, we propose to learn from large-scale human action video datasets in an explicit way (i.e., imitating human actions from hand keypoints), introducing Visual Robot Manipulation with Analogical Reasoning (AR-VRM). To acquire action knowledge explicitly from human action videos, we propose a keypoint Vision-Language Model (VLM) pretraining scheme, enabling the VLM to learn human action knowledge and directly predict human hand keypoints. During fine-tuning on robot data, to facilitate the robotic arm in imitating the action patterns of human motions, we first retrieve human action videos that perform similar manipulation tasks and have similar historical observations , and then learn the Analogical Reasoning (AR) map between human hand keypoints and robot components. Taking advantage of focusing on action keypoints instead of irrelevant visual cues, our method achieves leading performance on the CALVIN benchmark {and real-world experiments}. In few-shot scenarios, our AR-VRM outperforms previous methods by large margins , underscoring the effectiveness of explicitly imitating human actions under data scarcity.
comment: Accepted by ICCV2025
End-to-End Humanoid Robot Safe and Comfortable Locomotion Policy
The deployment of humanoid robots in unstructured, human-centric environments requires navigation capabilities that extend beyond simple locomotion to include robust perception, provable safety, and socially aware behavior. Current reinforcement learning approaches are often limited by blind controllers that lack environmental awareness or by vision-based systems that fail to perceive complex 3D obstacles. In this work, we present an end-to-end locomotion policy that directly maps raw, spatio-temporal LiDAR point clouds to motor commands, enabling robust navigation in cluttered dynamic scenes. We formulate the control problem as a Constrained Markov Decision Process (CMDP) to formally separate safety from task objectives. Our key contribution is a novel methodology that translates the principles of Control Barrier Functions (CBFs) into costs within the CMDP, allowing a model-free Penalized Proximal Policy Optimization (P3O) to enforce safety constraints during training. Furthermore, we introduce a set of comfort-oriented rewards, grounded in human-robot interaction research, to promote motions that are smooth, predictable, and less intrusive. We demonstrate the efficacy of our framework through a successful sim-to-real transfer to a physical humanoid robot, which exhibits agile and safe navigation around both static and dynamic 3D obstacles.
In-situ Value-aligned Human-Robot Interactions with Physical Constraints
Equipped with Large Language Models (LLMs), human-centered robots are now capable of performing a wide range of tasks that were previously deemed challenging or unattainable. However, merely completing tasks is insufficient for cognitive robots, who should learn and apply human preferences to future scenarios. In this work, we propose a framework that combines human preferences with physical constraints, requiring robots to complete tasks while considering both. Firstly, we developed a benchmark of everyday household activities, which are often evaluated based on specific preferences. We then introduced In-Context Learning from Human Feedback (ICLHF), where human feedback comes from direct instructions and adjustments made intentionally or unintentionally in daily life. Extensive sets of experiments, testing the ICLHF to generate task plans and balance physical constraints with preferences, have demonstrated the efficiency of our approach.
comment: 8 pages, 7 figures
Feedback Control of a Single-Tail Bioinspired 59-mg Swimmer IROS 2025
We present an evolved steerable version of the single-tail Fish-&-Ribbon-Inspired Small Swimming Harmonic roBot (FRISSHBot), a 59-mg biologically inspired swimmer, which is driven by a new shape-memory alloy (SMA)-based bimorph actuator. The new FRISSHBot is controllable in the two-dimensional (2D) space, which enabled the first demonstration of feedback-controlled trajectory tracking of a single-tail aquatic robot with onboard actuation at the subgram scale. These new capabilities are the result of a physics-informed design with an enlarged head and shortened tail relative to those of the original platform. Enhanced by its design, this new platform achieves forward swimming speeds of up to 13.6 mm/s (0.38 Bl/s), which is over four times that of the original platform. Furthermore, when following 2D references in closed loop, the tested FRISSHBot prototype attains forward swimming speeds of up to 9.1 mm/s, root-mean-square (RMS) tracking errors as low as 2.6 mm, turning rates of up to 13.1 {\deg}/s, and turning radii as small as 10 mm.
comment: To be presented at the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Progressive Bird's Eye View Perception for Safety-Critical Autonomous Driving: A Comprehensive Survey
Bird's-Eye-View (BEV) perception has become a foundational paradigm in autonomous driving, enabling unified spatial representations that support robust multi-sensor fusion and multi-agent collaboration. As autonomous vehicles transition from controlled environments to real-world deployment, ensuring the safety and reliability of BEV perception in complex scenarios - such as occlusions, adverse weather, and dynamic traffic - remains a critical challenge. This survey provides the first comprehensive review of BEV perception from a safety-critical perspective, systematically analyzing state-of-the-art frameworks and implementation strategies across three progressive stages: single-modality vehicle-side, multimodal vehicle-side, and multi-agent collaborative perception. Furthermore, we examine public datasets encompassing vehicle-side, roadside, and collaborative settings, evaluating their relevance to safety and robustness. We also identify key open-world challenges - including open-set recognition, large-scale unlabeled data, sensor degradation, and inter-agent communication latency - and outline future research directions, such as integration with end-to-end autonomous driving systems, embodied intelligence, and large language models.
AZRA: Extending the Affective Capabilities of Zoomorphic Robots using Augmented Reality
Zoomorphic robots could serve as accessible and practical alternatives for users unable or unwilling to keep pets. However, their affective interactions are often simplistic and short-lived, limiting their potential for domestic adoption. In order to facilitate more dynamic and nuanced affective interactions and relationships between users and zoomorphic robots we present AZRA, a novel augmented reality (AR) framework that extends the affective capabilities of these robots without physical modifications. To demonstrate AZRA, we augment a zoomorphic robot, Petit Qoobo, with novel emotional displays (face, light, sound, thought bubbles) and interaction modalities (voice, touch, proximity, gaze). Additionally, AZRA features a computational model of emotion to calculate the robot's emotional responses, daily moods, evolving personality and needs. We highlight how AZRA can be used for rapid participatory prototyping and enhancing existing robots, then discuss implications on future zoomorphic robot development.
comment: Companion of the 2025 ACM/IEEE International Conference on Human-Robot Interaction (RO-MAN 2025)
A Minimal Model for Emergent Collective Behaviors in Autonomous Robotic Multi-Agent Systems
Collective behaviors such as swarming and flocking emerge from simple, decentralized interactions in biological systems. Existing models, such as Vicsek and Cucker-Smale, lack collision avoidance, whereas the Olfati-Saber model imposes rigid formations, limiting their applicability in swarm robotics. To address these limitations, this paper proposes a minimal yet expressive model that governs agent dynamics using relative positions, velocities, and local density, modulated by two tunable parameters: the spatial offset and kinetic offset. The model achieves spatially flexible, collision-free behaviors that reflect naturalistic group dynamics. Furthermore, we extend the framework to cognitive autonomous systems, enabling energy-aware phase transitions between swarming and flocking through adaptive control parameter tuning. This cognitively inspired approach offers a robust foundation for real-world applications in multi-robot systems, particularly autonomous aerial swarms.
Decoupling Geometry from Optimization in 2D Irregular Cutting and Packing Problems: an Open-Source Collision Detection Engine
Addressing irregular cutting and packing (C&P) optimization problems poses two distinct challenges: the geometric challenge of determining whether or not an item can be placed feasibly at a certain position, and the optimization challenge of finding a good solution according to some objective function. Until now, those tackling such problems have had to address both challenges simultaneously, requiring two distinct sets of expertise and a lot of research & development effort. One way to lower this barrier is to decouple the two challenges. In this paper we introduce a powerful collision detection engine (CDE) for 2D irregular C&P problems which assumes full responsibility for the geometric challenge. The CDE (i) allows users to focus with full confidence on their optimization challenge by abstracting geometry away and (ii) enables independent advances to propagate to all optimization algorithms built atop it. We present a set of core principles and design philosophies to model a general and adaptable CDE focused on maximizing performance, accuracy and robustness. These principles are accompanied by a concrete open-source implementation called $\texttt{jagua-rs}$. This paper together with its implementation serves as a catalyst for future advances in irregular C&P problems by providing a solid foundation which can either be used as it currently exists or be further improved upon.
comment: 25 pages, 16 figures
Elastic Motion Policy: An Adaptive Dynamical System for Robust and Efficient One-Shot Imitation Learning
Behavior cloning (BC) has become a staple imitation learning paradigm in robotics due to its ease of teaching robots complex skills directly from expert demonstrations. However, BC suffers from an inherent generalization issue. To solve this, the status quo solution is to gather more data. Yet, regardless of how much training data is available, out-of-distribution performance is still sub-par, lacks any formal guarantee of convergence and success, and is incapable of allowing and recovering from physical interactions with humans. These are critical flaws when robots are deployed in ever-changing human-centric environments. Thus, we propose Elastic Motion Policy (EMP), a one-shot imitation learning framework that allows robots to adjust their behavior based on the scene change while respecting the task specification. Trained from a single demonstration, EMP follows the dynamical systems paradigm where motion planning and control are governed by first-order differential equations with convergence guarantees. We leverage Laplacian editing in full end-effector space, $\mathbb{R}^3\times SO(3)$, and online convex learning of Lyapunov functions, to adapt EMP online to new contexts, avoiding the need to collect new demonstrations. We extensively validate our framework in real robot experiments, demonstrating its robust and efficient performance in dynamic environments, with obstacle avoidance and multi-step task capabilities. Project Website: https://elastic-motion-policy.github.io/EMP/
Dynamic Layer Detection of Thin Materials using DenseTact Optical Tactile Sensors IROS 2025
Manipulation of thin materials is critical for many everyday tasks and remains a significant challenge for robots. While existing research has made strides in tasks like material smoothing and folding, many studies struggle with common failure modes (crumpled corners/edges, incorrect grasp configurations) that a preliminary step of layer detection could solve. We present a novel method for classifying the number of grasped material layers using a custom gripper equipped with DenseTact 2.0 optical tactile sensors. After grasping, the gripper performs an anthropomorphic rubbing motion while collecting optical flow, 6-axis wrench, and joint state data. Using this data in a transformer-based network achieves a test accuracy of 98.21\% in classifying the number of grasped cloth layers, and 81.25\% accuracy in classifying layers of grasped paper, showing the effectiveness of our dynamic rubbing method. Evaluating different inputs and model architectures highlights the usefulness of tactile sensor information and a transformer model for this task. A comprehensive dataset of 568 labeled trials (368 for cloth and 200 for paper) was collected and made open-source along with this paper. Our project page is available at https://armlabstanford.github.io/dynamic-cloth-detection.
comment: 7 pages, 9 figures, accepted to IROS 2025
TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification IROS 2025
Visual Place Recognition (VPR) is a crucial capability for long-term autonomous robots, enabling them to identify previously visited locations using visual information. However, existing methods remain limited in indoor settings due to the highly repetitive structures inherent in such environments. We observe that scene texts frequently appear in indoor spaces and can help distinguish visually similar but different places. This inspires us to propose TextInPlace, a simple yet effective VPR framework that integrates Scene Text Spotting (STS) to mitigate visual perceptual ambiguity in repetitive indoor environments. Specifically, TextInPlace adopts a dual-branch architecture within a local parameter sharing network. The VPR branch employs attention-based aggregation to extract global descriptors for coarse-grained retrieval, while the STS branch utilizes a bridging text spotter to detect and recognize scene texts. Finally, the discriminative texts are filtered to compute text similarity and re-rank the top-K retrieved images. To bridge the gap between current text-based repetitive indoor scene datasets and the typical scenarios encountered in robot navigation, we establish an indoor VPR benchmark dataset, called Maze-with-Text. Extensive experiments on both custom and public datasets demonstrate that TextInPlace achieves superior performance over existing methods that rely solely on appearance information. The dataset, code, and trained models are publicly available at https://github.com/HqiTao/TextInPlace.
comment: Accepted to IROS 2025
BonnBeetClouds3D: A Dataset Towards Point Cloud-based Organ-level Phenotyping of Sugar Beet Plants under Field Conditions
Agricultural production is facing severe challenges in the next decades induced by climate change and the need for sustainability, reducing its impact on the environment. Advancements in field management through non-chemical weeding by robots in combination with monitoring of crops by autonomous unmanned aerial vehicles (UAVs) and breeding of novel and more resilient crop varieties are helpful to address these challenges. The analysis of plant traits, called phenotyping, is an essential activity in plant breeding, it however involves a great amount of manual labor. With this paper, we address the problem of automatic fine-grained organ-level geometric analysis needed for precision phenotyping. As the availability of real-world data in this domain is relatively scarce, we propose a novel dataset that was acquired using UAVs capturing high-resolution images of a real breeding trial containing 48 plant varieties and therefore covering great morphological and appearance diversity. This enables the development of approaches for autonomous phenotyping that generalize well to different varieties. Based on overlapping high-resolution images from multiple viewing angles, we compute photogrammetric dense point clouds and provide detailed and accurate point-wise labels for plants, leaves, and salient points as the tip and the base. Additionally, we include measurements of phenotypic traits performed by experts from the German Federal Plant Variety Office on the real plants, allowing the evaluation of new approaches not only on segmentation and keypoint detection but also directly on the downstream tasks. The provided labeled point clouds enable fine-grained plant analysis and support further progress in the development of automatic phenotyping approaches, but also enable further research in surface reconstruction, point cloud completion, and semantic interpretation of point clouds.
Interactive Imitation Learning for Dexterous Robotic Manipulation: Challenges and Perspectives -- A Survey
Dexterous manipulation is a crucial yet highly complex challenge in humanoid robotics, demanding precise, adaptable, and sample-efficient learning methods. As humanoid robots are usually designed to operate in human-centric environments and interact with everyday objects, mastering dexterous manipulation is critical for real-world deployment. Traditional approaches, such as reinforcement learning and imitation learning, have made significant strides, but they often struggle due to the unique challenges of real-world dexterous manipulation, including high-dimensional control, limited training data, and covariate shift. This survey provides a comprehensive overview of these challenges and reviews existing learning-based methods for real-world dexterous manipulation, spanning imitation learning, reinforcement learning, and hybrid approaches. A promising yet underexplored direction is interactive imitation learning, where human feedback actively refines a robots behavior during training. While interactive imitation learning has shown success in various robotic tasks, its application to dexterous manipulation remains limited. To address this gap, we examine current interactive imitation learning techniques applied to other robotic tasks and discuss how these methods can be adapted to enhance dexterous manipulation. By synthesizing state-of-the-art research, this paper highlights key challenges, identifies gaps in current methodologies, and outlines potential directions for leveraging interactive imitation learning to improve dexterous robotic skills.
comment: 27 pages, 4 figures, 3 tables
POEX: Towards Policy Executable Jailbreak Attacks Against the LLM-based Robots
The integration of LLMs into robots has witnessed significant growth, where LLMs can convert instructions into executable robot policies. However, the inherent vulnerability of LLMs to jailbreak attacks brings critical security risks from the digital domain to the physical world. An attacked LLM-based robot could execute harmful policies and cause physical harm. In this paper, we investigate the feasibility and rationale of jailbreak attacks against LLM-based robots and answer three research questions: (1) How applicable are existing LLM jailbreak attacks against LLM-based robots? (2) What unique challenges arise if they are not directly applicable? (3) How to defend against such jailbreak attacks? To this end, we first construct a "human-object-environment" robot risks-oriented Harmful-RLbench and then conduct a measurement study on LLM-based robot systems. Our findings conclude that traditional LLM jailbreak attacks are inapplicable in robot scenarios, and we identify two unique challenges: determining policy-executable optimization directions and accurately evaluating robot-jailbroken policies. To enable a more thorough security analysis, we introduce POEX (POlicy EXecutable) jailbreak, a red-teaming framework that induces harmful yet executable policy to jailbreak LLM-based robots. POEX incorporates hidden layer gradient optimization to guarantee jailbreak success and policy execution as well as a multi-agent evaluator to accurately assess the practical executability of policies. Experiments conducted on the real-world robotic systems and in simulation demonstrate the efficacy of POEX, highlighting critical security vulnerabilities and its transferability across LLMs. Finally, we propose prompt-based and model-based defenses to mitigate attacks. Our findings underscore the urgent need for security measures to ensure the safe deployment of LLM-based robots in critical applications.
comment: Homepage: https://poex-jailbreak.github.io/
Optimizing Design and Control Methods for Using Collaborative Robots in Upper-Limb Rehabilitation
In this paper, we address the development of a robotic rehabilitation system for the upper limbs based on collaborative end-effector solutions. The use of commercial collaborative robots offers significant advantages for this task, as they are optimized from an engineering perspective and ensure safe physical interaction with humans. However, they also come with noticeable drawbacks, such as the limited range of sizes available on the market and the standard control modes, which are primarily oriented towards industrial or service applications. To address these limitations, we propose an optimization-based design method to fully exploit the capability of the cobot in performing rehabilitation tasks. Additionally, we introduce a novel control architecture based on an admittance-type Virtual Fixture method, which constrains the motion of the robot along a prescribed path. This approach allows for an intuitive definition of the task to be performed via Programming by Demonstration and enables the system to operate both passively and actively. In passive mode, the system supports the patient during task execution with additional force, while in active mode, it opposes the motion with a braking force. Experimental results demonstrate the effectiveness of the proposed method.
Is Single-View Mesh Reconstruction Ready for Robotics?
This paper evaluates single-view mesh reconstruction models for their potential in enabling instant digital twin creation for real-time planning and dynamics prediction using physics simulators for robotic manipulation. Recent single-view 3D reconstruction advances offer a promising avenue toward an automated real-to-sim pipeline: directly mapping a single observation of a scene into a simulation instance by reconstructing scene objects as individual, complete, and physically plausible 3D meshes. However, their suitability for physics simulations and robotics applications under immediacy, physical fidelity, and simulation readiness remains underexplored. We establish robotics-specific benchmarking criteria for 3D reconstruction, including handling typical inputs, collision-free and stable geometry, occlusions robustness, and meeting computational constraints. Our empirical evaluation using realistic robotics datasets shows that despite success on computer vision benchmarks, existing approaches fail to meet robotics-specific requirements. We quantitively examine limitations of single-view reconstruction for practical robotics implementation, in contrast to prior work that focuses on multi-view approaches. Our findings highlight critical gaps between computer vision advances and robotics needs, guiding future research at this intersection.
comment: 20 pages, 18 figures
Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model
Affordance grounding focuses on predicting the specific regions of objects that are associated with the actions to be performed by robots. It plays a vital role in the fields of human-robot interaction, human-object interaction, embodied manipulation, and embodied perception. Existing models often neglect the affordance shared among different objects because they lack the Chain-of-Thought(CoT) reasoning abilities, limiting their out-of-domain (OOD) generalization and explicit reasoning capabilities. To address these challenges, we propose Affordance-R1, the first unified affordance grounding framework that integrates cognitive CoT guided Group Relative Policy Optimization (GRPO) within a reinforcement learning paradigm. Specifically, we designed a sophisticated affordance function, which contains format, perception, and cognition rewards to effectively guide optimization directions. Furthermore, we constructed a high-quality affordance-centric reasoning dataset, ReasonAff, to support training. Trained exclusively via reinforcement learning with GRPO and without explicit reasoning data, Affordance-R1 achieves robust zero-shot generalization and exhibits emergent test-time reasoning capabilities. Comprehensive experiments demonstrate that our model outperforms well-established methods and exhibits open-world generalization. To the best of our knowledge, Affordance-R1 is the first to integrate GRPO-based RL with reasoning into affordance reasoning. The code of our method and our dataset is released on https://github.com/hq-King/Affordance-R1.
Multimodal Visual Transformer for Sim2real Transfer in Visual Reinforcement Learning
Depth information is robust to scene appearance variations and inherently carries 3D spatial details. In this paper, a visual backbone based on the vision transformer is proposed to fuse RGB and depth modalities for enhancing generalization. Different modalities are first processed by separate CNN stems, and the combined convolutional features are delivered to the scalable vision transformer to obtain visual representations. Moreover, a contrastive unsupervised learning scheme is designed with masked and unmasked tokens to accelerate the sample efficiency during the reinforcement learning process. Simulation results demonstrate that our visual backbone can focus more on task-related regions and exhibit better generalization in unseen scenarios. For sim2real transfer, a flexible curriculum learning schedule is developed to deploy domain randomization over training processes. Finally, the feasibility of our model is validated to perform real-world manipulation tasks via zero-shot transfer.
Mapless Collision-Free Flight via MPC using Dual KD-Trees in Cluttered Environments
Collision-free flight in cluttered environments is a critical capability for autonomous quadrotors. Traditional methods often rely on detailed 3D map construction, trajectory generation, and tracking. However, this cascade pipeline can introduce accumulated errors and computational delays, limiting flight agility and safety. In this paper, we propose a novel method for enabling collision-free flight in cluttered environments without explicitly constructing 3D maps or generating and tracking collision-free trajectories. Instead, we leverage Model Predictive Control (MPC) to directly produce safe actions from sparse waypoints and point clouds from a depth camera. These sparse waypoints are dynamically adjusted online based on nearby obstacles detected from point clouds. To achieve this, we introduce a dual KD-Tree mechanism: the Obstacle KD-Tree quickly identifies the nearest obstacle for avoidance, while the Edge KD-Tree provides a robust initial guess for the MPC solver, preventing it from getting stuck in local minima during obstacle avoidance. We validate our approach through extensive simulations and real-world experiments. The results show that our approach significantly outperforms the mapping-based methods and is also superior to imitation learning-based methods, demonstrating reliable obstacle avoidance at up to 12 m/s in simulations and 6 m/s in real-world tests. Our method provides a simple and robust alternative to existing methods. The code is publicly available at https://github.com/SJTU-ViSYS-team/avoid-mpc.
Industrial Robot Motion Planning with GPUs: Integration of cuRobo for Extended DOF Systems
Efficient motion planning remains a key challenge in industrial robotics, especially for multi-axis systems operating in complex environments. This paper addresses that challenge by integrating GPU-accelerated motion planning through NVIDIA's cuRobo library into Vention's modular automation platform. By leveraging accurate CAD-based digital twins and real-time parallel optimization, our system enables rapid trajectory generation and dynamic collision avoidance for pick-and-place tasks. We demonstrate this capability on robots equipped with additional degrees of freedom, including a 7th-axis gantry, and benchmark performance across various scenarios. The results show significant improvements in planning speed and robustness, highlighting the potential of GPU-based planning pipelines for scalable, adaptable deployment in modern industrial workflows.
comment: 8 pages, 2 figures, 2 tables
Exploring Spatial Representation to Enhance LLM Reasoning in Aerial Vision-Language Navigation
Aerial Vision-and-Language Navigation (VLN) is a novel task enabling Unmanned Aerial Vehicles (UAVs) to navigate in outdoor environments through natural language instructions and visual cues. However, it remains challenging due to the complex spatial relationships in aerial scenes.In this paper, we propose a training-free, zero-shot framework for aerial VLN tasks, where the large language model (LLM) is leveraged as the agent for action prediction. Specifically, we develop a novel Semantic-Topo-Metric Representation (STMR) to enhance the spatial reasoning capabilities of LLMs. This is achieved by extracting and projecting instruction-related semantic masks onto a top-down map, which presents spatial and topological information about surrounding landmarks and grows during the navigation process. At each step, a local map centered at the UAV is extracted from the growing top-down map, and transformed into a ma trix representation with distance metrics, serving as the text prompt to LLM for action prediction in response to the given instruction. Experiments conducted in real and simulation environments have proved the effectiveness and robustness of our method, achieving absolute success rate improvements of 26.8% and 5.8% over current state-of-the-art methods on simple and complex navigation tasks, respectively. The dataset and code will be released soon.
Koopman Operator Based Time-Delay Embeddings and State History Augmented LQR for Periodic Hybrid Systems: Bouncing Pendulum and Bipedal Walking
Time-delay embedding is a technique that uses snapshots of state history over time to build a linear state space model of a nonlinear smooth system. We demonstrate that periodic non-smooth or hybrid system can also be modeled as a linear state space system using this approach as long as its behavior is consistent in modes and timings. We extend time-delay embeddings to generate a linear model of two periodic hybrid systems: the bouncing pendulum and the simplest walker with control inputs. This leads to a state history augmented linear quadratic regulator (LQR) which uses current and past state history for feedback control. Example code can be found at https://github.com/Chun-MingYang/koopman-timeDelay-lqr.git
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair SC
Surface cracks in infrastructure can lead to severe deterioration and expensive maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise. While advancements in robotic perception and manipulation have progressed autonomous crack repair, three key challenges remain: accurate localization in the robot's coordinate frame, adaptability to varying crack sizes, and realistic validation of repairs. We present an adaptive, autonomous robotic system for surface crack detection and repair using advanced sensing technologies to enhance precision and safety for humans. A laser scanner is used to refine crack coordinates for accurate localization. Furthermore, our adaptive crack filling approach outperforms fixed speed techniques in efficiency and consistency. We validate our method using 3D printed cracks under realistic conditions, demonstrating repeatable testing. This research contributes to the field of human-robot interaction by reducing manual labor, improving safety, and streamlining maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 10 pages, 6 figures, 3 tables, submitted to ASCE International Conference on Computing in Civil Engineering (i3CE 2025)
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation ICML 2025
Generalizing vision-based reinforcement learning (RL) agents to novel environments remains a difficult and open challenge. Current trends are to collect large-scale datasets or use data augmentation techniques to prevent overfitting and improve downstream generalization. However, the computational and data collection costs increase exponentially with the number of task variations and can destabilize the already difficult task of training RL agents. In this work, we take inspiration from recent advances in computational neuroscience and propose a model, Associative Latent DisentAnglement (ALDA), that builds on standard off-policy RL towards zero-shot generalization. Specifically, we revisit the role of latent disentanglement in RL and show how combining it with a model of associative memory achieves zero-shot generalization on difficult task variations without relying on data augmentation. Finally, we formally show that data augmentation techniques are a form of weak disentanglement and discuss the implications of this insight.
comment: Published at ICML 2025
Digital and Robotic Twinning for Validation of Proximity Operations and Formation Flying
In spacecraft Rendezvous, Proximity Operations (RPO), and Formation Flying (FF), the Guidance Navigation and Control (GNC) system is safety-critical and must meet strict performance requirements. However, validating such systems is challenging due to the complexity of the space environment, necessitating a verification and validation (V&V) process that bridges simulation and real-world behavior. The key contribution of this paper is a unified, end-to-end digital and robotic twinning framework that enables software- and hardware-in-the-loop testing for multi-modal GNC systems. The robotic twin includes three testbeds at Stanford's Space Rendezvous Laboratory (SLAB): the GNSS and Radiofrequency Autonomous Navigation Testbed for Distributed Space Systems (GRAND) to validate RF-based navigation techniques, and the Testbed for Rendezvous and Optical Navigation (TRON) and Optical Stimulator (OS) to validate vision-based methods. The test article for this work is an integrated multi-modal GNC software stack for RPO and FF developed at SLAB. This paper introduces the hybrid framework and summarizes calibration and error characterization for the robotic twin. Then, the GNC stack's performance and robustness is characterized using the integrated digital and robotic twinning pipeline for a full-range RPO mission scenario in Low-Earth Orbit (LEO). The results shown in the paper demonstrate consistency between digital and robotic twins, validating the hybrid twinning pipeline as a reliable framework for realistic assessment and verification of GNC systems.
comment: Found a mistake in results
The Social Life of Industrial Arms: How Arousal and Attention Shape Human-Robot Interaction
This study explores how human perceptions of a non-anthropomorphic robotic manipulator are shaped by two key dimensions of behaviour: arousal, defined as the robot's movement energy and expressiveness, and attention, defined as the robot's capacity to selectively orient toward and engage with a user. We introduce a novel control architecture that integrates a gaze-like attention engine with an arousal-modulated motion system to generate socially meaningful behaviours. In a user study, we find that robots exhibiting high attention -- actively directing their focus toward users -- are perceived as warmer and more competent, intentional, and lifelike. In contrast, high arousal -- characterized by fast, expansive, and energetic motions -- increases perceptions of discomfort and disturbance. Importantly, a combination of focused attention and moderate arousal yields the highest ratings of trust and sociability, while excessive arousal diminishes social engagement. These findings offer design insights for endowing non-humanoid robots with expressive, intuitive behaviours that support more natural human-robot interaction.
comment: 7 pages, 3 figures, 1 table
AI Pedagogy: Dialogic Social Learning for Artificial Agents
Large Language Models (LLMs) have demonstrated remarkable capabilities in processing extensive offline datasets. However, they often face challenges in acquiring and integrating complex, knowledge online. Traditional AI training paradigms, predominantly based on supervised learning or reinforcement learning, mirror a 'Piagetian' model of independent exploration. These approaches typically rely on large datasets and sparse feedback signals, limiting the models' ability to learn efficiently from interactions. Drawing inspiration from Vygotsky's sociocultural theory, this study explores the potential of socially mediated learning paradigms to address these limitations. We introduce a dynamic environment, termed the 'AI Social Gym', where an AI learner agent engages in dyadic pedagogical dialogues with knowledgeable AI teacher agents. These interactions emphasize external, structured dialogue as a core mechanism for knowledge acquisition, contrasting with methods that depend solely on internal inference or pattern recognition. Our investigation focuses on how different pedagogical strategies impact the AI learning process in the context of ontology acquisition. Empirical results indicate that such dialogic approaches-particularly those involving mixed-direction interactions combining top-down explanations with learner-initiated questioning-significantly enhance the LLM's ability to acquire and apply new knowledge, outperforming both unidirectional instructional methods and direct access to structured knowledge, formats typically present in training datasets. These findings suggest that integrating pedagogical and psychological insights into AI and robot training can substantially improve post-training knowledge acquisition and response quality. This approach offers a complementary pathway to existing strategies like prompt engineering
comment: accepted at ICSR2025
Multiagent Systems
FEAT: A Multi-Agent Forensic AI System with Domain-Adapted Large Language Model for Automated Cause-of-Death Analysis
Forensic cause-of-death determination faces systemic challenges, including workforce shortages and diagnostic variability, particularly in high-volume systems like China's medicolegal infrastructure. We introduce FEAT (ForEnsic AgenT), a multi-agent AI framework that automates and standardizes death investigations through a domain-adapted large language model. FEAT's application-oriented architecture integrates: (i) a central Planner for task decomposition, (ii) specialized Local Solvers for evidence analysis, (iii) a Memory & Reflection module for iterative refinement, and (iv) a Global Solver for conclusion synthesis. The system employs tool-augmented reasoning, hierarchical retrieval-augmented generation, forensic-tuned LLMs, and human-in-the-loop feedback to ensure legal and medical validity. In evaluations across diverse Chinese case cohorts, FEAT outperformed state-of-the-art AI systems in both long-form autopsy analyses and concise cause-of-death conclusions. It demonstrated robust generalization across six geographic regions and achieved high expert concordance in blinded validations. Senior pathologists validated FEAT's outputs as comparable to those of human experts, with improved detection of subtle evidentiary nuances. To our knowledge, FEAT is the first LLM-based AI agent system dedicated to forensic medicine, offering scalable, consistent death certification while maintaining expert-level rigor. By integrating AI efficiency with human oversight, this work could advance equitable access to reliable medicolegal services while addressing critical capacity constraints in forensic systems.
comment: 18pages, 6 figures
Multi-agent systems for chemical engineering: A review and perspective
Large language model (LLM)-based multi-agent systems (MASs) are a recent but rapidly evolving technology with the potential to transform chemical engineering by decomposing complex workflows into teams of collaborative agents with specialized knowledge and tools. This review surveys the state-of-the-art of MAS within chemical engineering. While early studies demonstrate promising results, scientific challenges remain, including the design of tailored architectures, integration of heterogeneous data modalities, development of foundation models with domain-specific modalities, and strategies for ensuring transparency, safety, and environmental impact. As a young but fast-moving field, MASs offer exciting opportunities to rethink chemical engineering workflows.
Robust Reinforcement Learning over Wireless Networks with Homomorphic State Representations
In this work, we address the problem of training Reinforcement Learning (RL) agents over communication networks. The RL paradigm requires the agent to instantaneously perceive the state evolution to infer the effects of its actions on the environment. This is impossible if the agent receives state updates over lossy or delayed wireless systems and thus operates with partial and intermittent information. In recent years, numerous frameworks have been proposed to manage RL with imperfect feedback; however, they often offer specific solutions with a substantial computational burden. To address these limits, we propose a novel architecture, named Homomorphic Robust Remote Reinforcement Learning (HR3L), that enables the training of remote RL agents exchanging observations across a non-ideal wireless channel. HR3L considers two units: the transmitter, which encodes meaningful representations of the environment, and the receiver, which decodes these messages and performs actions to maximize a reward signal. Importantly, HR3L does not require the exchange of gradient information across the wireless channel, allowing for quicker training and a lower communication overhead than state-of-the-art solutions. Experimental results demonstrate that HR3L significantly outperforms baseline methods in terms of sample efficiency and adapts to different communication scenarios, including packet losses, delayed transmissions, and capacity limitations.
comment: This manuscript is currently under revision
Toward Goal-Oriented Communication in Multi-Agent Systems: An overview
As multi-agent systems (MAS) become increasingly prevalent in autonomous systems, distributed control, and edge intelligence, efficient communication under resource constraints has emerged as a critical challenge. Traditional communication paradigms often emphasize message fidelity or bandwidth optimization, overlooking the task relevance of the exchanged information. In contrast, goal-oriented communication prioritizes the importance of information with respect to the agents' shared objectives. This review provides a comprehensive survey of goal-oriented communication in MAS, bridging perspectives from information theory, communication theory, and machine learning. We examine foundational concepts alongside learning-based approaches and emergent protocols. Special attention is given to coordination under communication constraints, as well as applications in domains such as swarm robotics, federated learning, and edge computing. The paper concludes with a discussion of open challenges and future research directions at the intersection of communication theory, machine learning, and multi-agent decision making.
comment: 32 pages
Perpetual exploration in anonymous synchronous networks with a Byzantine black hole SC 2025
In this paper, we investigate: ``How can a group of initially co-located mobile agents perpetually explore an unknown graph, when one stationary node occasionally behaves maliciously, under an adversary's control?'' We call this node a ``Byzantine black hole (BBH)'' and at any given round it may choose to destroy all visiting agents, or none. This subtle power can drastically undermine classical exploration strategies designed for an always active black hole. We study this perpetual exploration problem in the presence of at most one BBH, without initial knowledge of the network size. Since the underlying graph may be 1-connected, perpetual exploration of the entire graph may be infeasible. We thus define two variants: \pbmPerpExpl\ and \pbmPerpExplHome. In the former, the agents are tasked to perform perpetual exploration of at least one component, obtained after the exclusion of the BBH. In the latter, the agents are tasked to perform perpetual exploration of the component which contains the \emph{home} node, where agents are initially co-located. Naturally, \pbmPerpExplHome\ is a special case of \pbmPerpExpl. Agents operate under a synchronous scheduler and communicate in a face-to-face model. Our goal is to determine the minimum number of agents necessary and sufficient to solve these problems. In acyclic networks, we obtain optimal algorithms that solve \pbmPerpExpl\ with $4$ agents, and \pbmPerpExplHome\ with $6$ agents in trees. The lower bounds hold even in path graphs. In general graphs, we give a non-trivial lower bound of $2\Delta-1$ agents for \pbmPerpExpl, and an upper bound of $3\Delta+3$ agents for \pbmPerpExplHome. To our knowledge, this is the first study of a black-hole variant in arbitrary networks without initial topological knowledge.
comment: This paper has been accepted at DISC 2025
EMPATHIA: Multi-Faceted Human-AI Collaboration for Refugee Integration NeurIPS 2025
Current AI approaches to refugee integration optimize narrow objectives such as employment and fail to capture the cultural, emotional, and ethical dimensions critical for long-term success. We introduce EMPATHIA (Enriched Multimodal Pathways for Agentic Thinking in Humanitarian Immigrant Assistance), a multi-agent framework addressing the central Creative AI question: how do we preserve human dignity when machines participate in life-altering decisions? Grounded in Kegan's Constructive Developmental Theory, EMPATHIA decomposes integration into three modules: SEED (Socio-cultural Entry and Embedding Decision) for initial placement, RISE (Rapid Integration and Self-sufficiency Engine) for early independence, and THRIVE (Transcultural Harmony and Resilience through Integrated Values and Engagement) for sustained outcomes. SEED employs a selector-validator architecture with three specialized agents - emotional, cultural, and ethical - that deliberate transparently to produce interpretable recommendations. Experiments on the UN Kakuma dataset (15,026 individuals, 7,960 eligible adults 15+ per ILO/UNHCR standards) and implementation on 6,359 working-age refugees (15+) with 150+ socioeconomic variables achieved 87.4% validation convergence and explainable assessments across five host countries. EMPATHIA's weighted integration of cultural, emotional, and ethical factors balances competing value systems while supporting practitioner-AI collaboration. By augmenting rather than replacing human expertise, EMPATHIA provides a generalizable framework for AI-driven allocation tasks where multiple values must be reconciled.
comment: 19 pages, 3 figures (plus 6 figures in supplementary), 2 tables, 1 algorithm. Submitted to NeurIPS 2025 Creative AI Track: Humanity
Retrieval-Augmented Multi-Agent System for Rapid Statement of Work Generation
Drafting a Statement of Work (SOW) is a vital part of business and legal projects. It outlines key details like deliverables, timelines, responsibilities, and legal terms. However, creating these documents is often a slow and complex process. It usually involves multiple people, takes several days, and leaves room for errors or outdated content. This paper introduces a new AI-driven automation system that makes the entire SOW drafting process faster, easier, and more accurate. Instead of relying completely on humans, the system uses three intelligent components or 'agents' that each handle a part of the job. One agent writes the first draft, another checks if everything is legally correct, and the third agent formats the document and ensures everything is in order. Unlike basic online tools that just fill in templates, this system understands the meaning behind the content and customizes the SOW to match the needs of the project. It also checks legal compliance and formatting so that users can trust the result. The system was tested using real business examples. It was able to create a full SOW in under three minutes, compared to several hours or days using manual methods. It also performed well in accuracy and quality, showing that it can reduce legal risks and save a lot of time. This solution shows how artificial intelligence can be used to support legal and business professionals by taking care of routine work and helping them focus on more important decisions. It's a step toward making legal processes smarter, faster, and more reliable.
comment: 7 pages
MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling
Despite recent advances, long-sequence video generation frameworks still suffer from significant limitations: poor assistive capability, suboptimal visual quality, and limited expressiveness. To mitigate these limitations, we propose MAViS, an end-to-end multi-agent collaborative framework for long-sequence video storytelling. MAViS orchestrates specialized agents across multiple stages, including script writing, shot designing, character modeling, keyframe generation, video animation, and audio generation. In each stage, agents operate under the 3E Principle -- Explore, Examine, and Enhance -- to ensure the completeness of intermediate outputs. Considering the capability limitations of current generative models, we propose the Script Writing Guidelines to optimize compatibility between scripts and generative tools. Experimental results demonstrate that MAViS achieves state-of-the-art performance in assistive capability, visual quality, and video expressiveness. Its modular framework further enables scalability with diverse generative models and tools. With just a brief user prompt, MAViS is capable of producing high-quality, expressive long-sequence video storytelling, enriching inspirations and creativity for users. To the best of our knowledge, MAViS is the only framework that provides multimodal design output -- videos with narratives and background music.
comment: Video Generation Agent
A Minimal Model for Emergent Collective Behaviors in Autonomous Robotic Multi-Agent Systems
Collective behaviors such as swarming and flocking emerge from simple, decentralized interactions in biological systems. Existing models, such as Vicsek and Cucker-Smale, lack collision avoidance, whereas the Olfati-Saber model imposes rigid formations, limiting their applicability in swarm robotics. To address these limitations, this paper proposes a minimal yet expressive model that governs agent dynamics using relative positions, velocities, and local density, modulated by two tunable parameters: the spatial offset and kinetic offset. The model achieves spatially flexible, collision-free behaviors that reflect naturalistic group dynamics. Furthermore, we extend the framework to cognitive autonomous systems, enabling energy-aware phase transitions between swarming and flocking through adaptive control parameter tuning. This cognitively inspired approach offers a robust foundation for real-world applications in multi-robot systems, particularly autonomous aerial swarms.
ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation
Retrieval-Augmented Generation (RAG) has shown promise in enhancing recommendation systems by incorporating external context into large language model prompts. However, existing RAG-based approaches often rely on static retrieval heuristics and fail to capture nuanced user preferences in dynamic recommendation scenarios. In this work, we introduce ARAG, an Agentic Retrieval-Augmented Generation framework for Personalized Recommendation, which integrates a multi-agent collaboration mechanism into the RAG pipeline. To better understand the long-term and session behavior of the user, ARAG leverages four specialized LLM-based agents: a User Understanding Agent that summarizes user preferences from long-term and session contexts, a Natural Language Inference (NLI) Agent that evaluates semantic alignment between candidate items retrieved by RAG and inferred intent, a context summary agent that summarizes the findings of NLI agent, and an Item Ranker Agent that generates a ranked list of recommendations based on contextual fit. We evaluate ARAG accross three datasets. Experimental results demonstrate that ARAG significantly outperforms standard RAG and recency-based baselines, achieving up to 42.1% improvement in NDCG@5 and 35.5% in Hit@5. We also, conduct an ablation study to analyse the effect by different components of ARAG. Our findings highlight the effectiveness of integrating agentic reasoning into retrieval-augmented recommendation and provide new directions for LLM-based personalization.
ReaGAN: Node-as-Agent-Reasoning Graph Agentic Network
Graph Neural Networks (GNNs) have achieved remarkable success in graph-based learning by propagating information among neighbor nodes via predefined aggregation mechanisms. However, such fixed schemes often suffer from two key limitations. First, they cannot handle the imbalance in node informativeness -- some nodes are rich in information, while others remain sparse. Second, predefined message passing primarily leverages local structural similarity while ignoring global semantic relationships across the graph, limiting the model's ability to capture distant but relevant information. We propose Retrieval-augmented Graph Agentic Network (ReaGAN), an agent-based framework that empowers each node with autonomous, node-level decision-making. Each node acts as an agent that independently plans its next action based on its internal memory, enabling node-level planning and adaptive message propagation. Additionally, retrieval-augmented generation (RAG) allows nodes to access semantically relevant content and build global relationships in the graph. ReaGAN achieves competitive performance under few-shot in-context settings using a frozen LLM backbone without fine-tuning, showcasing the potential of agentic planning and local-global retrieval in graph learning.
comment: 17 pages, work in progress
Converging to Stability in Two-Sided Bandits: The Case of Unknown Preferences on Both Sides of a Matching Market
We study the problem of repeated two-sided matching with uncertain preferences (two-sided bandits), and no explicit communication between agents. Recent work has developed algorithms that converge to stable matchings when one side (the proposers or agents) must learn their preferences, but the preferences of the other side (the proposees or arms) are common knowledge, and the matching mechanism uses simultaneous proposals at each round. We develop new algorithms that provably converge to stable matchings for two more challenging settings: one where the arm preferences are no longer common knowledge, and a second, more general one where the arms are also uncertain about their preferences. In our algorithms, agents start with optimistic beliefs about arms' preferences and update these preferences over time. The key insight is in how to combine these beliefs about arm preferences with beliefs about the value of matching with an arm conditional on one's proposal being accepted when choosing whom to propose to.
Systems and Control (CS)
Autonomous Air-Ground Vehicle Operations Optimization in Hazardous Environments: A Multi-Armed Bandit Approach
Hazardous environments such as chemical spills, radiological zones, and bio-contaminated sites pose significant threats to human safety and public infrastructure. Rapid and reliable hazard mitigation in these settings often unsafe for humans, calling for autonomous systems that can adaptively sense and respond to evolving risks. This paper presents a decision-making framework for autonomous vehicle dispatch in hazardous environments with uncertain and evolving risk levels. The system integrates a Bayesian Upper Confidence Bound (BUCB) sensing strategy with task-specific vehicle routing problems with profits (VRPP), enabling adaptive coordination of unmanned aerial vehicles (UAVs) for hazard sensing and unmanned ground vehicles (UGVs) for cleaning. Using VRPP allows selective site visits under resource constraints by assigning each site a visit value that reflects sensing or cleaning priorities. Site-level hazard beliefs are maintained through a time-weighted Bayesian update. BUCB scores guide UAV routing to balance exploration and exploitation under uncertainty, while UGV routes are optimized to maximize expected hazard reduction under resource constraints. Simulation results demonstrate that our framework reduces the number of dispatch cycles to resolve hazards by around 30% on average compared to baseline dispatch strategies, underscoring the value of uncertainty-aware vehicle dispatch for reliable hazard mitigation.
IDSO-Managed Bid-Based Transactive Distribution Systems Design for DER Participation in Wholesale Markets While Preserving T-D Interactions
Participation of Distributed Energy Resources (DERs) in bid-based Transactive Energy Systems (TES) at the distribution systems facilitates strongly coupled, bidirectional interactions between Transmission-Distribution (T-D) systems. Capturing these interactions is critical for ensuring seamless integration within an Integrated Transmission and Distribution (ITD) framework. This study proposes a methodology to preserve such tight T-D linkages by developing an Independent Distribution System Operator (IDSO) managed bid-based TES design for unbalanced distribution systems. The proposed design operates within the ITD paradigm and permits DER participation in the Wholesale Power Market (WPM) through IDSO while preserving tight T-D linkages. To this end, this research offers the following key contributions: a novel bid/offer prequalification-cum-aggregation method to ensure a grid-safe and value-based aggregation of DERs' bids and offers for WPM participation through IDSO; and a retail pricing mechanism that reflects the true value of procuring or offering additional units of power within the distribution system. Case studies are conducted on a modified IEEE 123-bus radial feeder populated with a high DER concentration to validate the proposed frameworks' effectiveness in coordinating the DERs efficiently and reliably.
comment: 17 Pages, 13 Figures
Pinching-Antenna Systems (PASS)-based Indoor Positioning
Pinching antenna (PA), a flexible waveguide integrated with dielectric particles, intelligently reconstructs line-of-sight channels. Utilizing its geometric deterministic model and meter-level reconstruction, PA systems (PASS) are applied to uplink indoor positioning. In this paper, the uplink positioning system model for PASS is firstly proposed. A PASS-based received signal strength indication (RSSI) method is proposed to measure the distance from the users to each PA, which is efficient and suitable for PASS. PASS-based weighted least squares (WLS) algorithm is designed to calculate the two-dimensional coordinates of the users. Several critical observations can be drawn from our results: i) More PAs on the waveguide improves the positioning accuracy and robustness. ii) When the number of PAs exceeds a certain threshold, the performance gain becomes marginal. iii) User locations between and near PAs yield superior positioning accuracy.
comment: 5 pages, 5 figures, letter
Robust Adaptive Discrete-Time Control Barrier Certificate
This work develops a robust adaptive control strategy for discrete-time systems using Control Barrier Functions (CBFs) to ensure safety under parametric model uncertainty and disturbances. A key contribution of this work is establishing a barrier function certificate in discrete time for general online parameter estimation algorithms. This barrier function certificate guarantees positive invariance of the safe set despite disturbances and parametric uncertainty without access to the true system parameters. In addition, real-time implementation and inherent robustness guarantees are provided. Our approach demonstrates that, using the proposed robust adaptive CBF framework, the parameter estimation module can be designed separately from the CBF-based safety filter, simplifying the development of safe adaptive controllers for discrete-time systems. The resulting safety filter guarantees that the system remains within the safe set while adapting to model uncertainties, making it a promising strategy for real-world applications involving discrete-time safety-critical systems.
comment: 10 pages, 2 figures, submitted to Automatica as a brief paper
COMponent-Aware Pruning for Accelerated Control Tasks in Latent Space Models
The rapid growth of resource-constrained mobile platforms, including mobile robots, wearable systems, and Internet-of-Things devices, has increased the demand for computationally efficient neural network controllers (NNCs) that can operate within strict hardware limitations. While deep neural networks (DNNs) demonstrate superior performance in control applications, their substantial computational complexity and memory requirements present significant barriers to practical deployment on edge devices. This paper introduces a comprehensive model compression methodology that leverages component-aware structured pruning to determine the optimal pruning magnitude for each pruning group, ensuring a balance between compression and stability for NNC deployment. Our approach is rigorously evaluated on Temporal Difference Model Predictive Control (TD-MPC), a state-of-the-art model-based reinforcement learning algorithm, with a systematic integration of mathematical stability guarantee properties, specifically Lyapunov criteria. The key contribution of this work lies in providing a principled framework for determining the theoretical limits of model compression while preserving controller stability. Experimental validation demonstrates that our methodology successfully reduces model complexity while maintaining requisite control performance and stability characteristics. Furthermore, our approach establishes a quantitative boundary for safe compression ratios, enabling practitioners to systematically determine the maximum permissible model reduction before violating critical stability properties, thereby facilitating the confident deployment of compressed NNCs in resource-limited environments.
comment: Submitted in: The 2026 IEEE/SICE International Symposium on System Integration (SII 2026)
MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation
Conducting a comprehensive literature review is crucial for advancing circuit design methodologies. However, the rapid influx of state-of-the-art research, inconsistent data representation, and the complexity of optimizing circuit design objectives make this task significantly challenging. In this paper, we propose MuaLLM, an open-source multimodal Large Language Model (LLM) agent for circuit design assistance that integrates a hybrid Retrieval-Augmented Generation (RAG) framework with an adaptive vector database of circuit design research papers. Unlike conventional LLMs, the MuaLLM agent employs a Reason + Act (ReAct) workflow for iterative reasoning, goal-setting, and multi-step information retrieval. It functions as a question-answering design assistant, capable of interpreting complex queries and providing reasoned responses grounded in circuit literature. Its multimodal capabilities enable processing of both textual and visual data, facilitating more efficient and comprehensive analysis. The system dynamically adapts using intelligent search tools, automated document retrieval from the internet, and real-time database updates. Unlike conventional approaches constrained by model context limits, MuaLLM decouples retrieval from inference, enabling scalable reasoning over arbitrarily large corpora. At the maximum context length supported by standard LLMs, MuaLLM remains up to 10x less costly and 1.6x faster while maintaining the same accuracy. This allows rapid, no-human-in-the-loop database generation, overcoming the bottleneck of simulation-based dataset creation for circuits. To evaluate MuaLLM, we introduce two custom datasets: RAG-250, targeting retrieval and citation performance, and Reasoning-100 (Reas-100), focused on multistep reasoning in circuit design. MuaLLM achieves 90.1% recall on RAG-250, and 86.8% accuracy on Reas-100.
Deep Reinforcement Learning with Local Interpretability for Transparent Microgrid Resilience Energy Management
Renewable energy integration into microgrids has become a key approach to addressing global energy issues such as climate change and resource scarcity. However, the variability of renewable sources and the rising occurrence of High Impact Low Probability (HILP) events require innovative strategies for reliable and resilient energy management. This study introduces a practical approach to managing microgrid resilience through Explainable Deep Reinforcement Learning (XDRL). It combines the Proximal Policy Optimization (PPO) algorithm for decision-making with the Local Interpretable Model-agnostic Explanations (LIME) method to improve the transparency of the actor network's decisions. A case study in Ongole, India, examines a microgrid with wind, solar, and battery components to validate the proposed approach. The microgrid is simulated under extreme weather conditions during the Layla cyclone. LIME is used to analyse scenarios, showing the impact of key factors such as renewable generation, state of charge, and load prioritization on decision-making. The results demonstrate a Resilience Index (RI) of 0.9736 and an estimated battery lifespan of 15.11 years. LIME analysis reveals the rationale behind the agent's actions in idle, charging, and discharging modes, with renewable generation identified as the most influential feature. This study shows the effectiveness of integrating advanced DRL algorithms with interpretable AI techniques to achieve reliable and transparent energy management in microgrids.
Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning
This paper introduces an advanced AI-driven perception system for autonomous quadcopter navigation in GPS-denied indoor environments. The proposed framework leverages cloud computing to offload computationally intensive tasks and incorporates a custom-designed printed circuit board (PCB) for efficient sensor data acquisition, enabling robust navigation in confined spaces. The system integrates YOLOv11 for object detection, Depth Anything V2 for monocular depth estimation, a PCB equipped with Time-of-Flight (ToF) sensors and an Inertial Measurement Unit (IMU), and a cloud-based Large Language Model (LLM) for context-aware decision-making. A virtual safety envelope, enforced by calibrated sensor offsets, ensures collision avoidance, while a multithreaded architecture achieves low-latency processing. Enhanced spatial awareness is facilitated by 3D bounding box estimation with Kalman filtering. Experimental results in an indoor testbed demonstrate strong performance, with object detection achieving a mean Average Precision (mAP50) of 0.6, depth estimation Mean Absolute Error (MAE) of 7.2 cm, only 16 safety envelope breaches across 42 trials over approximately 11 minutes, and end-to-end system latency below 1 second. This cloud-supported, high-intelligence framework serves as an auxiliary perception and navigation system, complementing state-of-the-art drone autonomy for GPS-denied confined spaces.
Learning Satellite Attitude Dynamics with Physics-Informed Normalising Flow
Attitude control is a fundamental aspect of spacecraft operations. Model Predictive Control (MPC) has emerged as a powerful strategy for these tasks, relying on accurate models of the system dynamics to optimize control actions over a prediction horizon. In scenarios where physics models are incomplete, difficult to derive, or computationally expensive, machine learning offers a flexible alternative by learning the system behavior directly from data. However, purely data-driven models often struggle with generalization and stability, especially when applied to inputs outside their training domain. To address these limitations, we investigate the benefits of incorporating Physics-Informed Neural Networks (PINNs) into the learning of spacecraft attitude dynamics, comparing their performance with that of purely data-driven approaches. Using a Real-valued Non-Volume Preserving (Real NVP) neural network architecture with a self-attention mechanism, we trained several models on simulated data generated with the Basilisk simulator. Two training strategies were considered: a purely data-driven baseline and a physics-informed variant to improve robustness and stability. Our results demonstrate that the inclusion of physics-based information significantly enhances the performance in terms of the mean relative error of the best architectures found by 27.08%. These advantages are particularly evident when the learned models are integrated into an MPC framework, where PINN-based models consistently outperform their purely data-driven counterparts in terms of control accuracy and robustness, yielding improvements of up to 42.86% in performance stability error and increased robustness-to-noise.
comment: In review
Robust Integrated Priority and Speed Control based on Hierarchical Stochastic Optimization to Promote Bus Schedule Adherence along Signalized Arterial SC 2025
In intelligent transportation systems (ITS), adaptive transit signal priority (TSP) and dynamic bus control systems have been independently developed to maintain efficient and reliable urban bus services. However, those two systems could potentially lead to conflicting decisions due to the lack of coordination. Although some studies explore the integrated control strategies along the arterial, they merely rely on signal replanning to address system uncertainties. Therefore, their performance severely deteriorates in real-world intersection settings, where abrupt signal timing variation is not always applicable in consideration of countdown timers and pedestrian signal design. In this study, we propose a robust integrated priority and speed control strategy based on hierarchical stochastic optimization to enhance bus schedule adherence along the arterial. In the proposed framework, the upper level ensures the coordination across intersections while the lower level handles uncertainties for each intersection with stochastic programming. Hence, the route-level system randomness is decomposed into a series of local problems that can be solved in parallel using sample average approximation (SAA). Simulation experiments are conducted under various scenarios with stochastic bus dwell time and different traffic demand. The results demonstrate that our approach significantly enhances bus punctuality and time headway equivalence without abrupt signal timing variation, with negative impacts on car delays limited to only 0.8%-5.2% as traffic demand increases.
comment: This paper has been accepted by 26th IEEE International Conference on Intelligent Transportation Systems ITSC 2025
Toward Goal-Oriented Communication in Multi-Agent Systems: An overview
As multi-agent systems (MAS) become increasingly prevalent in autonomous systems, distributed control, and edge intelligence, efficient communication under resource constraints has emerged as a critical challenge. Traditional communication paradigms often emphasize message fidelity or bandwidth optimization, overlooking the task relevance of the exchanged information. In contrast, goal-oriented communication prioritizes the importance of information with respect to the agents' shared objectives. This review provides a comprehensive survey of goal-oriented communication in MAS, bridging perspectives from information theory, communication theory, and machine learning. We examine foundational concepts alongside learning-based approaches and emergent protocols. Special attention is given to coordination under communication constraints, as well as applications in domains such as swarm robotics, federated learning, and edge computing. The paper concludes with a discussion of open challenges and future research directions at the intersection of communication theory, machine learning, and multi-agent decision making.
comment: 32 pages
Deep Reinforcement Learning-Based Control Strategy with Direct Gate Control for Buck Converters
This paper proposes a deep reinforcement learning (DRL)-based approach for directly controlling the gate signals of switching devices to achieve voltage regulation in a buck converter. Unlike conventional control methods, the proposed method directly generates gate signals using a neural network trained through DRL, with the objective of achieving high control speed and flexibility while maintaining stability. Simulation results demonstrate that the proposed direct gate control (DGC) method achieves a faster transient response and stable output voltage regulation, outperforming traditional PWM-based control schemes. The DGC method also exhibits strong robustness against parameter variations and sensor noise, indicating its suitability for practical power electronics applications. The effectiveness of the proposed approach is validated via simulation.
When are safety filters safe? On minimum phase conditions of control barrier functions
In emerging control applications involving multiple and complex tasks, safety filters are gaining prominence as a modular approach to enforcing safety constraints. Among various methods, control barrier functions (CBFs) are widely used for designing safety filters due to their simplicity, imposing a single linear constraint on the control input at each state. In this work, we focus on the internal dynamics of systems governed by CBF-constrained control laws. Our key observation is that, although CBFs guarantee safety by enforcing state constraints, they can inadvertently be "unsafe" by causing the internal state to diverge. We investigate the conditions under which the full system state, including the internal state, can remain bounded under a CBF-based safety filter. Drawing inspiration from the input-output linearization literature, where boundedness is ensured by minimum phase conditions, we propose a new set of CBF minimum phase conditions tailored to the structure imposed by the CBF constraint. A critical distinction from the original minimum phase conditions is that the internal dynamics in our setting is driven by a nonnegative virtual control input, which reflects the enforcement of the safety constraint. We include a range of numerical examples, including single-input, multi-input, linear, and nonlinear systems, validating both our analysis and the necessity of the proposed CBF minimum phase conditions.
comment: This work has been submitted to the IEEE for possible publication
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Neuro-Symbolic Acceleration of MILP Motion Planning with Temporal Logic and Chance Constraints
Autonomous systems must solve motion planning problems subject to increasingly complex, time-sensitive, and uncertain missions. These problems often involve high-level task specifications, such as temporal logic or chance constraints, which require solving large-scale Mixed-Integer Linear Programs (MILPs). However, existing MILP-based planning methods suffer from high computational cost and limited scalability, hindering their real-time applicability. We propose to use a neuro-symbolic approach to accelerate MILP-based motion planning by leveraging machine learning techniques to guide the solver's symbolic search. Focusing on two representative classes of planning problems, namely, those with Signal Temporal Logic (STL) specifications and those with chance constraints formulated via Conformal Predictive Programming (CPP). We demonstrate how graph neural network-based learning methods can guide traditional symbolic MILP solvers in solving challenging planning problems, including branching variable selection and solver parameter configuration. Through extensive experiments, we show that neuro-symbolic search techniques yield scalability gains. Our approach yields substantial improvements, achieving an average performance gain of about 20% over state-of-the-art solver across key metrics, including runtime and solution quality.
Control-affine Schrödinger Bridge and Generalized Bohm Potential
The control-affine Schr\"odinger bridge concerns with a stochastic optimal control problem. Its solution is a controlled evolution of joint state probability density subject to a control-affine It\^o diffusion with a given deadline connecting a given pair of initial and terminal densities. In this work, we recast the necessary conditions of optimality for the control-affine Schr\"odinger bridge problem as a two point boundary value problem for a quantum mechanical Schr\"odinger PDE with complex potential. This complex-valued potential is a generalization of the real-valued Bohm potential in quantum mechanics. Our derived potential is akin to the optical potential in nuclear physics where the real part of the potential encodes elastic scattering (transmission of wave function), and the imaginary part encodes inelastic scattering (absorption of wave function). The key takeaway is that the process noise that drives the evolution of probability densities induces an absorbing medium in the evolution of wave function. These results make new connections between control theory and non-equilibrium statistical mechanics through the lens of quantum mechanics.
comment: This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Partial funding for this work was provided by LLNL Laboratory Directed Research and Development grant GS 25-ERD-044. Document release number: LLNL-JRNL-2008865
An Analytical and Experimental Study of Distributed Uplink Beamforming in the Presence of Carrier Frequency Offsets
Realizing distributed multi-user beamforming (D-MUBF) in time division duplex (TDD)-based multi-user MIMO (MU-MIMO) systems faces significant challenges. One of the most fundamental challenges is achieving accurate over-the-air (OTA) timing and frequency synchronization among distributed access points (APs), particularly due to residual frequency offsets caused by local oscillator (LO) drifts. Despite decades of research on synchronization for MU-MIMO, there are only a few experimental studies that evaluate D-MUBF techniques under imperfect frequency synchronization among distributed antennas. This paper presents an analytical and experimental assessment of D-MUBF methods in the presence of frequency synchronization errors. We provide closed-form expressions for signal-to-interference-plus-noise ratio (SINR) as a function of channel characteristics and statistical properties of carrier frequency offset (CFO) among AP antennas. In addition, through experimental evaluations conducted with the RENEW massive MIMO testbed, we collected comprehensive datasets across various experimental scenarios. These datasets comprise uplink pilot samples for channel and CFO estimation, in addition to uplink multi-user data intended for analyzing D-MUBF techniques. By examining these datasets, we assess the performance of D-MUBF in the presence of CFO and compare the analytical predictions with empirical measurements. Furthermore, we make the datasets publicly available and provide insights on utilizing them for future research endeavors.
comment: This work has been submitted to the IEEE Transactions on Vehicular Technology for possible publication. This version includes revisions based on the first-round review
Gradient- and Newton-Based Unit Vector Extremum Seeking Control
This paper presents novel methods for achieving stable and efficient convergence in multivariable extremum seeking control (ESC) using sliding mode techniques. Drawing inspiration from both classical sliding mode control and more recent developments in finite-time and fixed-time control, we propose a new framework that integrates these concepts into Gradient- and Newton-based ESC schemes based on sinusoidal perturbation signals. The key innovation lies in the use of discontinuous "relay-type" control components, replacing traditional proportional feedback to estimate the gradient of unknown quadratic nonlinear performance maps with Unit Vector Control (UVC). This represents the first attempt to address real-time, model-free optimization using sliding modes within the classical extremum seeking paradigm. In the Gradient-based approach, the convergence rate is influenced by the unknown Hessian of the objective function. In contrast, the Newton-based method overcomes this limitation by employing a dynamic estimator for the inverse of the Hessian, implemented via a Riccati equation filter. We establish finite-time convergence of the closed-loop average system to the extremum point for both methods by leveraging Lyapunov-based analysis and averaging theory tailored to systems with discontinuous right-hand sides. Numerical simulations validate the proposed method, illustrating significantly faster convergence and improved robustness compared to conventional ESC strategies, which typically guarantee only exponential stability. The results also demonstrate that the Gradient-based method exhibits slower convergence and higher transients since the gradient trajectory follows the curved and steepest-descent path, whereas the Newton-based method achieves faster convergence and improved overall performance going straightly to the extremum.
Quantum advantage in decentralized control of POMDPs: A control-theoretic view of the Mermin-Peres square
Consider a decentralized partially-observed Markov decision problem (POMDP) with multiple cooperative agents aiming to maximize a long-term-average reward criterion. We observe that the availability, at a fixed rate, of entangled states of a product quantum system between the agents, where each agent has access to one of the component systems, can result in strictly improved performance even compared to the scenario where common randomness is provided to the agents, i.e. there is a quantum advantage in decentralized control. This observation comes from a simple reinterpretation of the conclusions of the well-known Mermin-Peres square, which underpins the Mermin-Peres game. While quantum advantage has been demonstrated earlier in one-shot team problems of this kind, it is notable that there are examples where there is a quantum advantage for the one-shot criterion but it disappears in the dynamical scenario. The presence of a quantum advantage in dynamical scenarios is thus seen to be a novel finding relative to the current state of knowledge about the achievable performance in decentralized control problems. This paper is dedicated to the memory of Pravin P. Varaiya.
comment: Added two references. Improved the notation to clarify some calculations
Elastic Motion Policy: An Adaptive Dynamical System for Robust and Efficient One-Shot Imitation Learning
Behavior cloning (BC) has become a staple imitation learning paradigm in robotics due to its ease of teaching robots complex skills directly from expert demonstrations. However, BC suffers from an inherent generalization issue. To solve this, the status quo solution is to gather more data. Yet, regardless of how much training data is available, out-of-distribution performance is still sub-par, lacks any formal guarantee of convergence and success, and is incapable of allowing and recovering from physical interactions with humans. These are critical flaws when robots are deployed in ever-changing human-centric environments. Thus, we propose Elastic Motion Policy (EMP), a one-shot imitation learning framework that allows robots to adjust their behavior based on the scene change while respecting the task specification. Trained from a single demonstration, EMP follows the dynamical systems paradigm where motion planning and control are governed by first-order differential equations with convergence guarantees. We leverage Laplacian editing in full end-effector space, $\mathbb{R}^3\times SO(3)$, and online convex learning of Lyapunov functions, to adapt EMP online to new contexts, avoiding the need to collect new demonstrations. We extensively validate our framework in real robot experiments, demonstrating its robust and efficient performance in dynamic environments, with obstacle avoidance and multi-step task capabilities. Project Website: https://elastic-motion-policy.github.io/EMP/
A Communication Consistent Approach to Signal Temporal Logic Task Decomposition in Multi-Agent Systems
We consider the problem of decomposing a global task assigned to a multi-agent system, expressed as a formula within a fragment of Signal Temporal Logic (STL), under range-limited communication. Given a global task expressed as a conjunction of local tasks defined over the individual and relative states of agents in the system, we propose representing task dependencies among agents as edges of a suitably defined task graph. At the same time, range-limited communication naturally induces the definition of a communication graph that defines which agents have access to each other's states. Within these settings, inconsistencies arise when a task dependency between a pair of agents is not supported by a corresponding communication link due to the limited communication range. As a result, state feedback control laws previously derived to achieve the tasks' satisfaction can not be leveraged. We propose a task decomposition mechanism to distribute tasks assigned to pairs of non-communicating agents in the system as conjunctions of tasks defined over the relative states of communicating agents, thus enforcing consistency between task and communication graphs. Assuming the super-level sets of the predicate functions composing the STL tasks are bounded polytopes, our task decomposition mechanism can be cast as a parameter optimization problem and solved via state-of-the-art decentralized convex optimization algorithms. To guarantee the soundness of our approach, we present various conditions under which the tasks defined in the applied STL fragment are unsatisfiable, and we show sufficient conditions such that our decomposition approach yields satisfiable global tasks after decomposition.
comment: This work has been submitted to the IEEE for possible publication
Finite Sample Performance Analysis of MIMO Systems Identification
This paper is concerned with the finite sample identification performance of an n dimensional discrete-time Multiple-Input Multiple-Output (MIMO) Linear Time-Invariant system, with p inputs and m outputs. We prove that the widely-used Ho-Kalman algorithm and Multivariable Output Error State Space (MOESP) algorithm are ill-conditioned for MIMO systems when n/m or n/p is large. Moreover, by analyzing the Cra\'mer-Rao bound, we derive a fundamental limit for identifying the real and stable (or marginally stable) poles of MIMO system and prove that the sample complexity for any unbiased pole estimation algorithm to reach a certain level of accuracy explodes superpolynomially with respect to n/(pm). Numerical results are provided to illustrate the ill-conditionedness of Ho-Kalman algorithm and MOESP algorithm as well as the fundamental limit on identification.
comment: 13 pages, 4 figures
Barrier Integral Control for Global Asymptotic Stabilization of Uncertain Nonlinear Systems under Smooth Feedback and Transient Constraints
This paper addresses the problem of asymptotic stabilization for high-order control-affine MIMO nonlinear systems with unknown dynamic terms. We introduce Barrier Integral Control, a novel algorithm designed to confine the system's state within a predefined funnel, ensuring adherence to prescribed transient constraints, and asymptotically drive it to zero from any initial condition. The algorithm leverages the innovative integration of a reciprocal barrier function and an error-integral term, featuring smooth feedback control. Notably, it operates without relying on any information or approximation schemes for the (unknown) dynamic terms, which, unlike a large class of previous works, are not assumed to be bounded or to comply with globally Lipschitz/growth conditions. Additionally, the system's trajectory and asymptotic performance are decoupled from the uncertain model, control-gain selection, and initial conditions. Finally, comparative simulation studies validate the effectiveness of the proposed algorithm.
Modeling and Simulation of an Active Car Suspension with a Robust LQR Controller under Road Disturbance, Parameter Uncertainty and White Noise
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system increases the road holding of vehicles, allows them to take turns safely, and reduces the risk of traffic accidents. A passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost, and easy adjustment of coefficients. In this study, a more robust LQR-controlled active suspension was designed than a passive suspension and a PID-controlled active suspension. Robustness analyses were performed for passive suspension, PID-controlled active suspension, and LQR-controlled active suspension. Suspension travel, sprung mass acceleration, and sprung mass motion simulations were performed for all three suspensions under road disturbance, under simultaneous road disturbance and parameter uncertainty and under road disturbance with white noise. A comparative analysis was performed by obtaining the rise time, overshoot, and settling time data of the suspensions under different conditions. It was observed that the LQR-controlled active suspension showed the fastest rise time, the least overshoot and had the shortest settling time. In this case, it was proven that the LQRcontrolled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 19 pages, 17 figures
EngiBench: A Framework for Data-Driven Engineering Design Research
Engineering design optimization seeks to automatically determine the shapes, topologies, or parameters of components that maximize performance under given conditions. This process often depends on physics-based simulations, which are difficult to install, computationally expensive, and require domain-specific expertise. To mitigate these challenges, we introduce EngiBench, the first open-source library and datasets spanning diverse domains for data-driven engineering design. EngiBench provides a unified API and a curated set of benchmarks -- covering aeronautics, heat conduction, photonics, and more -- that enable fair, reproducible comparisons of optimization and machine learning algorithms, such as generative or surrogate models. We also release EngiOpt, a companion library offering a collection of such algorithms compatible with the EngiBench interface. Both libraries are modular, letting users plug in novel algorithms or problems, automate end-to-end experiment workflows, and leverage built-in utilities for visualization, dataset generation, feasibility checks, and performance analysis. We demonstrate their versatility through experiments comparing state-of-the-art techniques across multiple engineering design problems, an undertaking that was previously prohibitively time-consuming to perform. Finally, we show that these problems pose significant challenges for standard machine learning methods due to highly sensitive and constrained design manifolds.
Minimally Conservative Controlled-Invariant Set Synthesis Using Control Barrier Certificates
Finding a controlled-invariant set for a system with state and control constraints is crucial for safety-critical applications. However, existing methods often produce overly conservative solutions. This paper presents a method for generating controlled-invariant (safe) sets for nonlinear polynomial control-affine systems using Control Barrier Certificates (CBCs). We formulate CBC conditions as Sum-of-Squares (SOS) constraints and solve them via an SOS Program (SOSP). First, we generalize existing SOSPs for CBC synthesis to handle environments with complex unsafe state representations. Then, we propose an iterative algorithm that progressively enlarges the safe set constructed by the synthesized CBCs by maximizing boundary expansion at each iteration. We theoretically prove that our method guarantees strict safe set expansion at every step. Finally, we validate our approach with numerical simulations in 2D and 3D for single-input and multi-input systems. Empirical results show that the safe set generated by our method covers in most part a larger portion of the state space compared to two state-of-the-art techniques.
Model Predictive Control on the Neural Manifold
Neural manifolds are an attractive theoretical framework for characterizing the complex behaviors of neural populations. However, many of the tools for identifying these low-dimensional subspaces are correlational and provide limited insight into the underlying dynamics. The ability to precisely control the latent activity of a circuit would allow researchers to investigate the structure and function of neural manifolds. We simulate controlling the latent dynamics of a neural population using closed-loop, dynamically generated sensory inputs. Using a spiking neural network (SNN) as a model of a neural circuit, we find low-dimensional representations of both the network activity (the neural manifold) and a set of salient visual stimuli. The fields of classical and optimal control offer a range of methods to choose from for controlling dynamics on the neural manifold, which differ in performance, computational cost, and ease of implementation. Here, we focus on two commonly used control methods: proportional-integral-derivative (PID) control and model predictive control (MPC). PID is a computationally lightweight controller that is simple to implement. In contrast, MPC is a model-based, anticipatory controller with a much higher computational cost and engineering overhead. We evaluate both methods on trajectory-following tasks in latent space, under partial observability and in the presence of unknown noise. While both controllers in some cases were able to successfully control the latent dynamics on the neural manifold, MPC consistently produced more accurate control and required less hyperparameter tuning. These results demonstrate how MPC can be applied on the neural manifold using data-driven dynamics models, and provide a framework to experimentally test for causal relationships between manifold dynamics and external stimuli.
A Practical Approach Towards Inertia Estimation Using Ambient Synchrophasor Data
Real-time tracking of inertia is important because it reflects the power system's ability to withstand contingencies and maintain frequency security. This paper proposes a practical approach to estimate inertia using ambient phasor measurement unit (PMU) data and a partitioned form of the swing equation. The approach accounts for (bounded) uncertainties in network parameters and PMU measurements, enabling precise estimation of inertia and damping constants, as well as mechanical power inputs. Instead of assuming constant mechanical power input throughout, the approach leverages knowledge of power system operations to determine intervals when it is actually constant to maintain estimation consistency. Simulation results on the IEEE 14-bus system and IEEE 39 bus system integrated with renewable energy sources affirm the method's accuracy and applicability.
Wasserstein Barycenter Soft Actor-Critic
Deep off-policy actor-critic algorithms have emerged as the leading framework for reinforcement learning in continuous control domains. However, most of these algorithms suffer from poor sample efficiency, especially in environments with sparse rewards. In this paper, we take a step towards addressing this issue by providing a principled directed exploration strategy. We propose Wasserstein Barycenter Soft Actor-Critic (WBSAC) algorithm, which benefits from a pessimistic actor for temporal difference learning and an optimistic actor to promote exploration. This is achieved by using the Wasserstein barycenter of the pessimistic and optimistic policies as the exploration policy and adjusting the degree of exploration throughout the learning process. We compare WBSAC with state-of-the-art off-policy actor-critic algorithms and show that WBSAC is more sample-efficient on MuJoCo continuous control tasks.
Spectrum Sharing in Satellite-Terrestrial Integrated Networks: Frameworks, Approaches, and Opportunities
With the construction of low-earth orbit (LEO) satellite constellations, ubiquitous connectivity has been achieved. Terrestrial networks (TNs), such as cellular networks, are mainly deployed in specific urban areas and use licensed spectrum. However, in remote areas where terrestrial infrastructure is sparse, licensed spectrum bands are often underutilized. To accommodate the increasing communication needs, non-terrestrial networks (NTNs) can opportunistically access this idle spectrum to improve spectrum efficiency via spectrum sharing (SS). Therefore, bringing NTNs to a shared spectrum with TNs can improve network capacity under reasonable interference management. In satellite-terrestrial integrated networks (STINs), the comprehensive coverage of a satellite and the unbalanced communication resources of STINs make it challenging to manage mutual interference between NTN and TN effectively. This article presents the fundamentals and prospects of SS in STINs by introducing four SS frameworks, their potential application scenarios, and technical challenges. Furthermore, advanced SS approaches related to interference management in STINs and performance metrics of SS in STINs are introduced. Moreover, a preliminary performance evaluation showcases the potential for sharing the spectrum between NTN and TN. Finally, future research opportunities for SS in STINs are discussed.
Gait Transitions in Load-Pulling Quadrupeds: Insights from Sled Dogs and a Minimal SLIP Model
Quadrupedal animals employ diverse galloping strategies to optimize speed, stability, and energy efficiency. However, the biomechanical mechanisms that enable adaptive gait transitions during high-speed locomotion under load remain poorly understood. In this study, we present new empirical and modeling insights into the biomechanics of load-pulling quadrupeds, using sprint sled dogs as a model system. High-speed video and force recordings reveal that sled dogs often switch between rotary and transverse galloping gaits within just a few strides and without any observable changes in speed, stride duration, or terrain, providing clear evidence of locomotor multistability during high-speed load-pulling. To investigate the mechanical basis of these transitions, a physics-based quadrupedal Spring-Loaded Inverted Pendulum model with hybrid dynamics and prescribed footfall sequences to reproduce the asymmetric galloping patterns observed in racing sled dogs. Through trajectory optimization, we replicate experimentally observed gait sequences and identify swing-leg stiffness modulation as a key control mechanism for inducing transitions. This work provides a much-needed biomechanical perspective on high-speed animal draft and establishes a modeling framework for studying locomotion in pulling quadrupeds, with implications for both biological understanding and the design of adaptive legged systems.
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair SC
Surface cracks in infrastructure can lead to severe deterioration and expensive maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise. While advancements in robotic perception and manipulation have progressed autonomous crack repair, three key challenges remain: accurate localization in the robot's coordinate frame, adaptability to varying crack sizes, and realistic validation of repairs. We present an adaptive, autonomous robotic system for surface crack detection and repair using advanced sensing technologies to enhance precision and safety for humans. A laser scanner is used to refine crack coordinates for accurate localization. Furthermore, our adaptive crack filling approach outperforms fixed speed techniques in efficiency and consistency. We validate our method using 3D printed cracks under realistic conditions, demonstrating repeatable testing. This research contributes to the field of human-robot interaction by reducing manual labor, improving safety, and streamlining maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 10 pages, 6 figures, 3 tables, submitted to ASCE International Conference on Computing in Civil Engineering (i3CE 2025)
On Composable and Parametric Uncertainty in Systems Co-Design
Optimizing the design of complex systems requires navigating interdependent decisions, heterogeneous components, and multiple objectives. Our monotone theory of co-design offers a compositional framework for addressing this challenge, modeling systems as Design Problems (DPs), representing trade-offs between functionalities and resources within partially ordered sets. While current approaches model uncertainty using intervals, capturing worst- and best-case bounds, they fail to express probabilistic notions such as risk and confidence. These limitations hinder the applicability of co-design in domains where uncertainty plays a critical role. In this paper, we introduce a unified framework for composable uncertainty in co-design, capturing intervals, distributions, and parametrized models. This extension enables reasoning about risk-performance trade-offs and supports advanced queries such as experiment design, learning, and multi-stage decision making. We demonstrate the expressiveness and utility of the framework via a numerical case study on the uncertainty-aware co-design of task-driven Unmanned Aerial Vehicles (UAVs).
comment: 8 pages, accepted as an Invited Session Paper to IEEE Conference on Decision and Control (CDC) 2025
The Social Life of Industrial Arms: How Arousal and Attention Shape Human-Robot Interaction
This study explores how human perceptions of a non-anthropomorphic robotic manipulator are shaped by two key dimensions of behaviour: arousal, defined as the robot's movement energy and expressiveness, and attention, defined as the robot's capacity to selectively orient toward and engage with a user. We introduce a novel control architecture that integrates a gaze-like attention engine with an arousal-modulated motion system to generate socially meaningful behaviours. In a user study, we find that robots exhibiting high attention -- actively directing their focus toward users -- are perceived as warmer and more competent, intentional, and lifelike. In contrast, high arousal -- characterized by fast, expansive, and energetic motions -- increases perceptions of discomfort and disturbance. Importantly, a combination of focused attention and moderate arousal yields the highest ratings of trust and sociability, while excessive arousal diminishes social engagement. These findings offer design insights for endowing non-humanoid robots with expressive, intuitive behaviours that support more natural human-robot interaction.
comment: 7 pages, 3 figures, 1 table
Learning Generative Models for Climbing Aircraft from Radar Data
Accurate trajectory prediction (TP) for climbing aircraft is hampered by the presence of epistemic uncertainties concerning aircraft operation, which can lead to significant misspecification between predicted and observed trajectories. This paper proposes a generative model for climbing aircraft in which the standard Base of Aircraft Data (BADA) model is enriched by a functional correction to the thrust that is learned from data. The method offers three features: predictions of the arrival time with 26.7% less error when compared to BADA; generated trajectories that are realistic when compared to test data; and a means of computing confidence bounds for minimal computational cost.
Systems and Control (EESS)
Autonomous Air-Ground Vehicle Operations Optimization in Hazardous Environments: A Multi-Armed Bandit Approach
Hazardous environments such as chemical spills, radiological zones, and bio-contaminated sites pose significant threats to human safety and public infrastructure. Rapid and reliable hazard mitigation in these settings often unsafe for humans, calling for autonomous systems that can adaptively sense and respond to evolving risks. This paper presents a decision-making framework for autonomous vehicle dispatch in hazardous environments with uncertain and evolving risk levels. The system integrates a Bayesian Upper Confidence Bound (BUCB) sensing strategy with task-specific vehicle routing problems with profits (VRPP), enabling adaptive coordination of unmanned aerial vehicles (UAVs) for hazard sensing and unmanned ground vehicles (UGVs) for cleaning. Using VRPP allows selective site visits under resource constraints by assigning each site a visit value that reflects sensing or cleaning priorities. Site-level hazard beliefs are maintained through a time-weighted Bayesian update. BUCB scores guide UAV routing to balance exploration and exploitation under uncertainty, while UGV routes are optimized to maximize expected hazard reduction under resource constraints. Simulation results demonstrate that our framework reduces the number of dispatch cycles to resolve hazards by around 30% on average compared to baseline dispatch strategies, underscoring the value of uncertainty-aware vehicle dispatch for reliable hazard mitigation.
IDSO-Managed Bid-Based Transactive Distribution Systems Design for DER Participation in Wholesale Markets While Preserving T-D Interactions
Participation of Distributed Energy Resources (DERs) in bid-based Transactive Energy Systems (TES) at the distribution systems facilitates strongly coupled, bidirectional interactions between Transmission-Distribution (T-D) systems. Capturing these interactions is critical for ensuring seamless integration within an Integrated Transmission and Distribution (ITD) framework. This study proposes a methodology to preserve such tight T-D linkages by developing an Independent Distribution System Operator (IDSO) managed bid-based TES design for unbalanced distribution systems. The proposed design operates within the ITD paradigm and permits DER participation in the Wholesale Power Market (WPM) through IDSO while preserving tight T-D linkages. To this end, this research offers the following key contributions: a novel bid/offer prequalification-cum-aggregation method to ensure a grid-safe and value-based aggregation of DERs' bids and offers for WPM participation through IDSO; and a retail pricing mechanism that reflects the true value of procuring or offering additional units of power within the distribution system. Case studies are conducted on a modified IEEE 123-bus radial feeder populated with a high DER concentration to validate the proposed frameworks' effectiveness in coordinating the DERs efficiently and reliably.
comment: 17 Pages, 13 Figures
Pinching-Antenna Systems (PASS)-based Indoor Positioning
Pinching antenna (PA), a flexible waveguide integrated with dielectric particles, intelligently reconstructs line-of-sight channels. Utilizing its geometric deterministic model and meter-level reconstruction, PA systems (PASS) are applied to uplink indoor positioning. In this paper, the uplink positioning system model for PASS is firstly proposed. A PASS-based received signal strength indication (RSSI) method is proposed to measure the distance from the users to each PA, which is efficient and suitable for PASS. PASS-based weighted least squares (WLS) algorithm is designed to calculate the two-dimensional coordinates of the users. Several critical observations can be drawn from our results: i) More PAs on the waveguide improves the positioning accuracy and robustness. ii) When the number of PAs exceeds a certain threshold, the performance gain becomes marginal. iii) User locations between and near PAs yield superior positioning accuracy.
comment: 5 pages, 5 figures, letter
Robust Adaptive Discrete-Time Control Barrier Certificate
This work develops a robust adaptive control strategy for discrete-time systems using Control Barrier Functions (CBFs) to ensure safety under parametric model uncertainty and disturbances. A key contribution of this work is establishing a barrier function certificate in discrete time for general online parameter estimation algorithms. This barrier function certificate guarantees positive invariance of the safe set despite disturbances and parametric uncertainty without access to the true system parameters. In addition, real-time implementation and inherent robustness guarantees are provided. Our approach demonstrates that, using the proposed robust adaptive CBF framework, the parameter estimation module can be designed separately from the CBF-based safety filter, simplifying the development of safe adaptive controllers for discrete-time systems. The resulting safety filter guarantees that the system remains within the safe set while adapting to model uncertainties, making it a promising strategy for real-world applications involving discrete-time safety-critical systems.
comment: 10 pages, 2 figures, submitted to Automatica as a brief paper
COMponent-Aware Pruning for Accelerated Control Tasks in Latent Space Models
The rapid growth of resource-constrained mobile platforms, including mobile robots, wearable systems, and Internet-of-Things devices, has increased the demand for computationally efficient neural network controllers (NNCs) that can operate within strict hardware limitations. While deep neural networks (DNNs) demonstrate superior performance in control applications, their substantial computational complexity and memory requirements present significant barriers to practical deployment on edge devices. This paper introduces a comprehensive model compression methodology that leverages component-aware structured pruning to determine the optimal pruning magnitude for each pruning group, ensuring a balance between compression and stability for NNC deployment. Our approach is rigorously evaluated on Temporal Difference Model Predictive Control (TD-MPC), a state-of-the-art model-based reinforcement learning algorithm, with a systematic integration of mathematical stability guarantee properties, specifically Lyapunov criteria. The key contribution of this work lies in providing a principled framework for determining the theoretical limits of model compression while preserving controller stability. Experimental validation demonstrates that our methodology successfully reduces model complexity while maintaining requisite control performance and stability characteristics. Furthermore, our approach establishes a quantitative boundary for safe compression ratios, enabling practitioners to systematically determine the maximum permissible model reduction before violating critical stability properties, thereby facilitating the confident deployment of compressed NNCs in resource-limited environments.
comment: Submitted in: The 2026 IEEE/SICE International Symposium on System Integration (SII 2026)
MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation
Conducting a comprehensive literature review is crucial for advancing circuit design methodologies. However, the rapid influx of state-of-the-art research, inconsistent data representation, and the complexity of optimizing circuit design objectives make this task significantly challenging. In this paper, we propose MuaLLM, an open-source multimodal Large Language Model (LLM) agent for circuit design assistance that integrates a hybrid Retrieval-Augmented Generation (RAG) framework with an adaptive vector database of circuit design research papers. Unlike conventional LLMs, the MuaLLM agent employs a Reason + Act (ReAct) workflow for iterative reasoning, goal-setting, and multi-step information retrieval. It functions as a question-answering design assistant, capable of interpreting complex queries and providing reasoned responses grounded in circuit literature. Its multimodal capabilities enable processing of both textual and visual data, facilitating more efficient and comprehensive analysis. The system dynamically adapts using intelligent search tools, automated document retrieval from the internet, and real-time database updates. Unlike conventional approaches constrained by model context limits, MuaLLM decouples retrieval from inference, enabling scalable reasoning over arbitrarily large corpora. At the maximum context length supported by standard LLMs, MuaLLM remains up to 10x less costly and 1.6x faster while maintaining the same accuracy. This allows rapid, no-human-in-the-loop database generation, overcoming the bottleneck of simulation-based dataset creation for circuits. To evaluate MuaLLM, we introduce two custom datasets: RAG-250, targeting retrieval and citation performance, and Reasoning-100 (Reas-100), focused on multistep reasoning in circuit design. MuaLLM achieves 90.1% recall on RAG-250, and 86.8% accuracy on Reas-100.
Deep Reinforcement Learning with Local Interpretability for Transparent Microgrid Resilience Energy Management
Renewable energy integration into microgrids has become a key approach to addressing global energy issues such as climate change and resource scarcity. However, the variability of renewable sources and the rising occurrence of High Impact Low Probability (HILP) events require innovative strategies for reliable and resilient energy management. This study introduces a practical approach to managing microgrid resilience through Explainable Deep Reinforcement Learning (XDRL). It combines the Proximal Policy Optimization (PPO) algorithm for decision-making with the Local Interpretable Model-agnostic Explanations (LIME) method to improve the transparency of the actor network's decisions. A case study in Ongole, India, examines a microgrid with wind, solar, and battery components to validate the proposed approach. The microgrid is simulated under extreme weather conditions during the Layla cyclone. LIME is used to analyse scenarios, showing the impact of key factors such as renewable generation, state of charge, and load prioritization on decision-making. The results demonstrate a Resilience Index (RI) of 0.9736 and an estimated battery lifespan of 15.11 years. LIME analysis reveals the rationale behind the agent's actions in idle, charging, and discharging modes, with renewable generation identified as the most influential feature. This study shows the effectiveness of integrating advanced DRL algorithms with interpretable AI techniques to achieve reliable and transparent energy management in microgrids.
Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning
This paper introduces an advanced AI-driven perception system for autonomous quadcopter navigation in GPS-denied indoor environments. The proposed framework leverages cloud computing to offload computationally intensive tasks and incorporates a custom-designed printed circuit board (PCB) for efficient sensor data acquisition, enabling robust navigation in confined spaces. The system integrates YOLOv11 for object detection, Depth Anything V2 for monocular depth estimation, a PCB equipped with Time-of-Flight (ToF) sensors and an Inertial Measurement Unit (IMU), and a cloud-based Large Language Model (LLM) for context-aware decision-making. A virtual safety envelope, enforced by calibrated sensor offsets, ensures collision avoidance, while a multithreaded architecture achieves low-latency processing. Enhanced spatial awareness is facilitated by 3D bounding box estimation with Kalman filtering. Experimental results in an indoor testbed demonstrate strong performance, with object detection achieving a mean Average Precision (mAP50) of 0.6, depth estimation Mean Absolute Error (MAE) of 7.2 cm, only 16 safety envelope breaches across 42 trials over approximately 11 minutes, and end-to-end system latency below 1 second. This cloud-supported, high-intelligence framework serves as an auxiliary perception and navigation system, complementing state-of-the-art drone autonomy for GPS-denied confined spaces.
Learning Satellite Attitude Dynamics with Physics-Informed Normalising Flow
Attitude control is a fundamental aspect of spacecraft operations. Model Predictive Control (MPC) has emerged as a powerful strategy for these tasks, relying on accurate models of the system dynamics to optimize control actions over a prediction horizon. In scenarios where physics models are incomplete, difficult to derive, or computationally expensive, machine learning offers a flexible alternative by learning the system behavior directly from data. However, purely data-driven models often struggle with generalization and stability, especially when applied to inputs outside their training domain. To address these limitations, we investigate the benefits of incorporating Physics-Informed Neural Networks (PINNs) into the learning of spacecraft attitude dynamics, comparing their performance with that of purely data-driven approaches. Using a Real-valued Non-Volume Preserving (Real NVP) neural network architecture with a self-attention mechanism, we trained several models on simulated data generated with the Basilisk simulator. Two training strategies were considered: a purely data-driven baseline and a physics-informed variant to improve robustness and stability. Our results demonstrate that the inclusion of physics-based information significantly enhances the performance in terms of the mean relative error of the best architectures found by 27.08%. These advantages are particularly evident when the learned models are integrated into an MPC framework, where PINN-based models consistently outperform their purely data-driven counterparts in terms of control accuracy and robustness, yielding improvements of up to 42.86% in performance stability error and increased robustness-to-noise.
comment: In review
Robust Integrated Priority and Speed Control based on Hierarchical Stochastic Optimization to Promote Bus Schedule Adherence along Signalized Arterial SC 2025
In intelligent transportation systems (ITS), adaptive transit signal priority (TSP) and dynamic bus control systems have been independently developed to maintain efficient and reliable urban bus services. However, those two systems could potentially lead to conflicting decisions due to the lack of coordination. Although some studies explore the integrated control strategies along the arterial, they merely rely on signal replanning to address system uncertainties. Therefore, their performance severely deteriorates in real-world intersection settings, where abrupt signal timing variation is not always applicable in consideration of countdown timers and pedestrian signal design. In this study, we propose a robust integrated priority and speed control strategy based on hierarchical stochastic optimization to enhance bus schedule adherence along the arterial. In the proposed framework, the upper level ensures the coordination across intersections while the lower level handles uncertainties for each intersection with stochastic programming. Hence, the route-level system randomness is decomposed into a series of local problems that can be solved in parallel using sample average approximation (SAA). Simulation experiments are conducted under various scenarios with stochastic bus dwell time and different traffic demand. The results demonstrate that our approach significantly enhances bus punctuality and time headway equivalence without abrupt signal timing variation, with negative impacts on car delays limited to only 0.8%-5.2% as traffic demand increases.
comment: This paper has been accepted by 26th IEEE International Conference on Intelligent Transportation Systems ITSC 2025
Toward Goal-Oriented Communication in Multi-Agent Systems: An overview
As multi-agent systems (MAS) become increasingly prevalent in autonomous systems, distributed control, and edge intelligence, efficient communication under resource constraints has emerged as a critical challenge. Traditional communication paradigms often emphasize message fidelity or bandwidth optimization, overlooking the task relevance of the exchanged information. In contrast, goal-oriented communication prioritizes the importance of information with respect to the agents' shared objectives. This review provides a comprehensive survey of goal-oriented communication in MAS, bridging perspectives from information theory, communication theory, and machine learning. We examine foundational concepts alongside learning-based approaches and emergent protocols. Special attention is given to coordination under communication constraints, as well as applications in domains such as swarm robotics, federated learning, and edge computing. The paper concludes with a discussion of open challenges and future research directions at the intersection of communication theory, machine learning, and multi-agent decision making.
comment: 32 pages
Deep Reinforcement Learning-Based Control Strategy with Direct Gate Control for Buck Converters
This paper proposes a deep reinforcement learning (DRL)-based approach for directly controlling the gate signals of switching devices to achieve voltage regulation in a buck converter. Unlike conventional control methods, the proposed method directly generates gate signals using a neural network trained through DRL, with the objective of achieving high control speed and flexibility while maintaining stability. Simulation results demonstrate that the proposed direct gate control (DGC) method achieves a faster transient response and stable output voltage regulation, outperforming traditional PWM-based control schemes. The DGC method also exhibits strong robustness against parameter variations and sensor noise, indicating its suitability for practical power electronics applications. The effectiveness of the proposed approach is validated via simulation.
When are safety filters safe? On minimum phase conditions of control barrier functions
In emerging control applications involving multiple and complex tasks, safety filters are gaining prominence as a modular approach to enforcing safety constraints. Among various methods, control barrier functions (CBFs) are widely used for designing safety filters due to their simplicity, imposing a single linear constraint on the control input at each state. In this work, we focus on the internal dynamics of systems governed by CBF-constrained control laws. Our key observation is that, although CBFs guarantee safety by enforcing state constraints, they can inadvertently be "unsafe" by causing the internal state to diverge. We investigate the conditions under which the full system state, including the internal state, can remain bounded under a CBF-based safety filter. Drawing inspiration from the input-output linearization literature, where boundedness is ensured by minimum phase conditions, we propose a new set of CBF minimum phase conditions tailored to the structure imposed by the CBF constraint. A critical distinction from the original minimum phase conditions is that the internal dynamics in our setting is driven by a nonnegative virtual control input, which reflects the enforcement of the safety constraint. We include a range of numerical examples, including single-input, multi-input, linear, and nonlinear systems, validating both our analysis and the necessity of the proposed CBF minimum phase conditions.
comment: This work has been submitted to the IEEE for possible publication
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Neuro-Symbolic Acceleration of MILP Motion Planning with Temporal Logic and Chance Constraints
Autonomous systems must solve motion planning problems subject to increasingly complex, time-sensitive, and uncertain missions. These problems often involve high-level task specifications, such as temporal logic or chance constraints, which require solving large-scale Mixed-Integer Linear Programs (MILPs). However, existing MILP-based planning methods suffer from high computational cost and limited scalability, hindering their real-time applicability. We propose to use a neuro-symbolic approach to accelerate MILP-based motion planning by leveraging machine learning techniques to guide the solver's symbolic search. Focusing on two representative classes of planning problems, namely, those with Signal Temporal Logic (STL) specifications and those with chance constraints formulated via Conformal Predictive Programming (CPP). We demonstrate how graph neural network-based learning methods can guide traditional symbolic MILP solvers in solving challenging planning problems, including branching variable selection and solver parameter configuration. Through extensive experiments, we show that neuro-symbolic search techniques yield scalability gains. Our approach yields substantial improvements, achieving an average performance gain of about 20% over state-of-the-art solver across key metrics, including runtime and solution quality.
Control-affine Schrödinger Bridge and Generalized Bohm Potential
The control-affine Schr\"odinger bridge concerns with a stochastic optimal control problem. Its solution is a controlled evolution of joint state probability density subject to a control-affine It\^o diffusion with a given deadline connecting a given pair of initial and terminal densities. In this work, we recast the necessary conditions of optimality for the control-affine Schr\"odinger bridge problem as a two point boundary value problem for a quantum mechanical Schr\"odinger PDE with complex potential. This complex-valued potential is a generalization of the real-valued Bohm potential in quantum mechanics. Our derived potential is akin to the optical potential in nuclear physics where the real part of the potential encodes elastic scattering (transmission of wave function), and the imaginary part encodes inelastic scattering (absorption of wave function). The key takeaway is that the process noise that drives the evolution of probability densities induces an absorbing medium in the evolution of wave function. These results make new connections between control theory and non-equilibrium statistical mechanics through the lens of quantum mechanics.
comment: This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Partial funding for this work was provided by LLNL Laboratory Directed Research and Development grant GS 25-ERD-044. Document release number: LLNL-JRNL-2008865
An Analytical and Experimental Study of Distributed Uplink Beamforming in the Presence of Carrier Frequency Offsets
Realizing distributed multi-user beamforming (D-MUBF) in time division duplex (TDD)-based multi-user MIMO (MU-MIMO) systems faces significant challenges. One of the most fundamental challenges is achieving accurate over-the-air (OTA) timing and frequency synchronization among distributed access points (APs), particularly due to residual frequency offsets caused by local oscillator (LO) drifts. Despite decades of research on synchronization for MU-MIMO, there are only a few experimental studies that evaluate D-MUBF techniques under imperfect frequency synchronization among distributed antennas. This paper presents an analytical and experimental assessment of D-MUBF methods in the presence of frequency synchronization errors. We provide closed-form expressions for signal-to-interference-plus-noise ratio (SINR) as a function of channel characteristics and statistical properties of carrier frequency offset (CFO) among AP antennas. In addition, through experimental evaluations conducted with the RENEW massive MIMO testbed, we collected comprehensive datasets across various experimental scenarios. These datasets comprise uplink pilot samples for channel and CFO estimation, in addition to uplink multi-user data intended for analyzing D-MUBF techniques. By examining these datasets, we assess the performance of D-MUBF in the presence of CFO and compare the analytical predictions with empirical measurements. Furthermore, we make the datasets publicly available and provide insights on utilizing them for future research endeavors.
comment: This work has been submitted to the IEEE Transactions on Vehicular Technology for possible publication. This version includes revisions based on the first-round review
Gradient- and Newton-Based Unit Vector Extremum Seeking Control
This paper presents novel methods for achieving stable and efficient convergence in multivariable extremum seeking control (ESC) using sliding mode techniques. Drawing inspiration from both classical sliding mode control and more recent developments in finite-time and fixed-time control, we propose a new framework that integrates these concepts into Gradient- and Newton-based ESC schemes based on sinusoidal perturbation signals. The key innovation lies in the use of discontinuous "relay-type" control components, replacing traditional proportional feedback to estimate the gradient of unknown quadratic nonlinear performance maps with Unit Vector Control (UVC). This represents the first attempt to address real-time, model-free optimization using sliding modes within the classical extremum seeking paradigm. In the Gradient-based approach, the convergence rate is influenced by the unknown Hessian of the objective function. In contrast, the Newton-based method overcomes this limitation by employing a dynamic estimator for the inverse of the Hessian, implemented via a Riccati equation filter. We establish finite-time convergence of the closed-loop average system to the extremum point for both methods by leveraging Lyapunov-based analysis and averaging theory tailored to systems with discontinuous right-hand sides. Numerical simulations validate the proposed method, illustrating significantly faster convergence and improved robustness compared to conventional ESC strategies, which typically guarantee only exponential stability. The results also demonstrate that the Gradient-based method exhibits slower convergence and higher transients since the gradient trajectory follows the curved and steepest-descent path, whereas the Newton-based method achieves faster convergence and improved overall performance going straightly to the extremum.
Quantum advantage in decentralized control of POMDPs: A control-theoretic view of the Mermin-Peres square
Consider a decentralized partially-observed Markov decision problem (POMDP) with multiple cooperative agents aiming to maximize a long-term-average reward criterion. We observe that the availability, at a fixed rate, of entangled states of a product quantum system between the agents, where each agent has access to one of the component systems, can result in strictly improved performance even compared to the scenario where common randomness is provided to the agents, i.e. there is a quantum advantage in decentralized control. This observation comes from a simple reinterpretation of the conclusions of the well-known Mermin-Peres square, which underpins the Mermin-Peres game. While quantum advantage has been demonstrated earlier in one-shot team problems of this kind, it is notable that there are examples where there is a quantum advantage for the one-shot criterion but it disappears in the dynamical scenario. The presence of a quantum advantage in dynamical scenarios is thus seen to be a novel finding relative to the current state of knowledge about the achievable performance in decentralized control problems. This paper is dedicated to the memory of Pravin P. Varaiya.
comment: Added two references. Improved the notation to clarify some calculations
Elastic Motion Policy: An Adaptive Dynamical System for Robust and Efficient One-Shot Imitation Learning
Behavior cloning (BC) has become a staple imitation learning paradigm in robotics due to its ease of teaching robots complex skills directly from expert demonstrations. However, BC suffers from an inherent generalization issue. To solve this, the status quo solution is to gather more data. Yet, regardless of how much training data is available, out-of-distribution performance is still sub-par, lacks any formal guarantee of convergence and success, and is incapable of allowing and recovering from physical interactions with humans. These are critical flaws when robots are deployed in ever-changing human-centric environments. Thus, we propose Elastic Motion Policy (EMP), a one-shot imitation learning framework that allows robots to adjust their behavior based on the scene change while respecting the task specification. Trained from a single demonstration, EMP follows the dynamical systems paradigm where motion planning and control are governed by first-order differential equations with convergence guarantees. We leverage Laplacian editing in full end-effector space, $\mathbb{R}^3\times SO(3)$, and online convex learning of Lyapunov functions, to adapt EMP online to new contexts, avoiding the need to collect new demonstrations. We extensively validate our framework in real robot experiments, demonstrating its robust and efficient performance in dynamic environments, with obstacle avoidance and multi-step task capabilities. Project Website: https://elastic-motion-policy.github.io/EMP/
A Communication Consistent Approach to Signal Temporal Logic Task Decomposition in Multi-Agent Systems
We consider the problem of decomposing a global task assigned to a multi-agent system, expressed as a formula within a fragment of Signal Temporal Logic (STL), under range-limited communication. Given a global task expressed as a conjunction of local tasks defined over the individual and relative states of agents in the system, we propose representing task dependencies among agents as edges of a suitably defined task graph. At the same time, range-limited communication naturally induces the definition of a communication graph that defines which agents have access to each other's states. Within these settings, inconsistencies arise when a task dependency between a pair of agents is not supported by a corresponding communication link due to the limited communication range. As a result, state feedback control laws previously derived to achieve the tasks' satisfaction can not be leveraged. We propose a task decomposition mechanism to distribute tasks assigned to pairs of non-communicating agents in the system as conjunctions of tasks defined over the relative states of communicating agents, thus enforcing consistency between task and communication graphs. Assuming the super-level sets of the predicate functions composing the STL tasks are bounded polytopes, our task decomposition mechanism can be cast as a parameter optimization problem and solved via state-of-the-art decentralized convex optimization algorithms. To guarantee the soundness of our approach, we present various conditions under which the tasks defined in the applied STL fragment are unsatisfiable, and we show sufficient conditions such that our decomposition approach yields satisfiable global tasks after decomposition.
comment: This work has been submitted to the IEEE for possible publication
Finite Sample Performance Analysis of MIMO Systems Identification
This paper is concerned with the finite sample identification performance of an n dimensional discrete-time Multiple-Input Multiple-Output (MIMO) Linear Time-Invariant system, with p inputs and m outputs. We prove that the widely-used Ho-Kalman algorithm and Multivariable Output Error State Space (MOESP) algorithm are ill-conditioned for MIMO systems when n/m or n/p is large. Moreover, by analyzing the Cra\'mer-Rao bound, we derive a fundamental limit for identifying the real and stable (or marginally stable) poles of MIMO system and prove that the sample complexity for any unbiased pole estimation algorithm to reach a certain level of accuracy explodes superpolynomially with respect to n/(pm). Numerical results are provided to illustrate the ill-conditionedness of Ho-Kalman algorithm and MOESP algorithm as well as the fundamental limit on identification.
comment: 13 pages, 4 figures
Barrier Integral Control for Global Asymptotic Stabilization of Uncertain Nonlinear Systems under Smooth Feedback and Transient Constraints
This paper addresses the problem of asymptotic stabilization for high-order control-affine MIMO nonlinear systems with unknown dynamic terms. We introduce Barrier Integral Control, a novel algorithm designed to confine the system's state within a predefined funnel, ensuring adherence to prescribed transient constraints, and asymptotically drive it to zero from any initial condition. The algorithm leverages the innovative integration of a reciprocal barrier function and an error-integral term, featuring smooth feedback control. Notably, it operates without relying on any information or approximation schemes for the (unknown) dynamic terms, which, unlike a large class of previous works, are not assumed to be bounded or to comply with globally Lipschitz/growth conditions. Additionally, the system's trajectory and asymptotic performance are decoupled from the uncertain model, control-gain selection, and initial conditions. Finally, comparative simulation studies validate the effectiveness of the proposed algorithm.
Modeling and Simulation of an Active Car Suspension with a Robust LQR Controller under Road Disturbance, Parameter Uncertainty and White Noise
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system increases the road holding of vehicles, allows them to take turns safely, and reduces the risk of traffic accidents. A passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost, and easy adjustment of coefficients. In this study, a more robust LQR-controlled active suspension was designed than a passive suspension and a PID-controlled active suspension. Robustness analyses were performed for passive suspension, PID-controlled active suspension, and LQR-controlled active suspension. Suspension travel, sprung mass acceleration, and sprung mass motion simulations were performed for all three suspensions under road disturbance, under simultaneous road disturbance and parameter uncertainty and under road disturbance with white noise. A comparative analysis was performed by obtaining the rise time, overshoot, and settling time data of the suspensions under different conditions. It was observed that the LQR-controlled active suspension showed the fastest rise time, the least overshoot and had the shortest settling time. In this case, it was proven that the LQRcontrolled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 19 pages, 17 figures
EngiBench: A Framework for Data-Driven Engineering Design Research
Engineering design optimization seeks to automatically determine the shapes, topologies, or parameters of components that maximize performance under given conditions. This process often depends on physics-based simulations, which are difficult to install, computationally expensive, and require domain-specific expertise. To mitigate these challenges, we introduce EngiBench, the first open-source library and datasets spanning diverse domains for data-driven engineering design. EngiBench provides a unified API and a curated set of benchmarks -- covering aeronautics, heat conduction, photonics, and more -- that enable fair, reproducible comparisons of optimization and machine learning algorithms, such as generative or surrogate models. We also release EngiOpt, a companion library offering a collection of such algorithms compatible with the EngiBench interface. Both libraries are modular, letting users plug in novel algorithms or problems, automate end-to-end experiment workflows, and leverage built-in utilities for visualization, dataset generation, feasibility checks, and performance analysis. We demonstrate their versatility through experiments comparing state-of-the-art techniques across multiple engineering design problems, an undertaking that was previously prohibitively time-consuming to perform. Finally, we show that these problems pose significant challenges for standard machine learning methods due to highly sensitive and constrained design manifolds.
Minimally Conservative Controlled-Invariant Set Synthesis Using Control Barrier Certificates
Finding a controlled-invariant set for a system with state and control constraints is crucial for safety-critical applications. However, existing methods often produce overly conservative solutions. This paper presents a method for generating controlled-invariant (safe) sets for nonlinear polynomial control-affine systems using Control Barrier Certificates (CBCs). We formulate CBC conditions as Sum-of-Squares (SOS) constraints and solve them via an SOS Program (SOSP). First, we generalize existing SOSPs for CBC synthesis to handle environments with complex unsafe state representations. Then, we propose an iterative algorithm that progressively enlarges the safe set constructed by the synthesized CBCs by maximizing boundary expansion at each iteration. We theoretically prove that our method guarantees strict safe set expansion at every step. Finally, we validate our approach with numerical simulations in 2D and 3D for single-input and multi-input systems. Empirical results show that the safe set generated by our method covers in most part a larger portion of the state space compared to two state-of-the-art techniques.
Model Predictive Control on the Neural Manifold
Neural manifolds are an attractive theoretical framework for characterizing the complex behaviors of neural populations. However, many of the tools for identifying these low-dimensional subspaces are correlational and provide limited insight into the underlying dynamics. The ability to precisely control the latent activity of a circuit would allow researchers to investigate the structure and function of neural manifolds. We simulate controlling the latent dynamics of a neural population using closed-loop, dynamically generated sensory inputs. Using a spiking neural network (SNN) as a model of a neural circuit, we find low-dimensional representations of both the network activity (the neural manifold) and a set of salient visual stimuli. The fields of classical and optimal control offer a range of methods to choose from for controlling dynamics on the neural manifold, which differ in performance, computational cost, and ease of implementation. Here, we focus on two commonly used control methods: proportional-integral-derivative (PID) control and model predictive control (MPC). PID is a computationally lightweight controller that is simple to implement. In contrast, MPC is a model-based, anticipatory controller with a much higher computational cost and engineering overhead. We evaluate both methods on trajectory-following tasks in latent space, under partial observability and in the presence of unknown noise. While both controllers in some cases were able to successfully control the latent dynamics on the neural manifold, MPC consistently produced more accurate control and required less hyperparameter tuning. These results demonstrate how MPC can be applied on the neural manifold using data-driven dynamics models, and provide a framework to experimentally test for causal relationships between manifold dynamics and external stimuli.
A Practical Approach Towards Inertia Estimation Using Ambient Synchrophasor Data
Real-time tracking of inertia is important because it reflects the power system's ability to withstand contingencies and maintain frequency security. This paper proposes a practical approach to estimate inertia using ambient phasor measurement unit (PMU) data and a partitioned form of the swing equation. The approach accounts for (bounded) uncertainties in network parameters and PMU measurements, enabling precise estimation of inertia and damping constants, as well as mechanical power inputs. Instead of assuming constant mechanical power input throughout, the approach leverages knowledge of power system operations to determine intervals when it is actually constant to maintain estimation consistency. Simulation results on the IEEE 14-bus system and IEEE 39 bus system integrated with renewable energy sources affirm the method's accuracy and applicability.
Wasserstein Barycenter Soft Actor-Critic
Deep off-policy actor-critic algorithms have emerged as the leading framework for reinforcement learning in continuous control domains. However, most of these algorithms suffer from poor sample efficiency, especially in environments with sparse rewards. In this paper, we take a step towards addressing this issue by providing a principled directed exploration strategy. We propose Wasserstein Barycenter Soft Actor-Critic (WBSAC) algorithm, which benefits from a pessimistic actor for temporal difference learning and an optimistic actor to promote exploration. This is achieved by using the Wasserstein barycenter of the pessimistic and optimistic policies as the exploration policy and adjusting the degree of exploration throughout the learning process. We compare WBSAC with state-of-the-art off-policy actor-critic algorithms and show that WBSAC is more sample-efficient on MuJoCo continuous control tasks.
Spectrum Sharing in Satellite-Terrestrial Integrated Networks: Frameworks, Approaches, and Opportunities
With the construction of low-earth orbit (LEO) satellite constellations, ubiquitous connectivity has been achieved. Terrestrial networks (TNs), such as cellular networks, are mainly deployed in specific urban areas and use licensed spectrum. However, in remote areas where terrestrial infrastructure is sparse, licensed spectrum bands are often underutilized. To accommodate the increasing communication needs, non-terrestrial networks (NTNs) can opportunistically access this idle spectrum to improve spectrum efficiency via spectrum sharing (SS). Therefore, bringing NTNs to a shared spectrum with TNs can improve network capacity under reasonable interference management. In satellite-terrestrial integrated networks (STINs), the comprehensive coverage of a satellite and the unbalanced communication resources of STINs make it challenging to manage mutual interference between NTN and TN effectively. This article presents the fundamentals and prospects of SS in STINs by introducing four SS frameworks, their potential application scenarios, and technical challenges. Furthermore, advanced SS approaches related to interference management in STINs and performance metrics of SS in STINs are introduced. Moreover, a preliminary performance evaluation showcases the potential for sharing the spectrum between NTN and TN. Finally, future research opportunities for SS in STINs are discussed.
Gait Transitions in Load-Pulling Quadrupeds: Insights from Sled Dogs and a Minimal SLIP Model
Quadrupedal animals employ diverse galloping strategies to optimize speed, stability, and energy efficiency. However, the biomechanical mechanisms that enable adaptive gait transitions during high-speed locomotion under load remain poorly understood. In this study, we present new empirical and modeling insights into the biomechanics of load-pulling quadrupeds, using sprint sled dogs as a model system. High-speed video and force recordings reveal that sled dogs often switch between rotary and transverse galloping gaits within just a few strides and without any observable changes in speed, stride duration, or terrain, providing clear evidence of locomotor multistability during high-speed load-pulling. To investigate the mechanical basis of these transitions, a physics-based quadrupedal Spring-Loaded Inverted Pendulum model with hybrid dynamics and prescribed footfall sequences to reproduce the asymmetric galloping patterns observed in racing sled dogs. Through trajectory optimization, we replicate experimentally observed gait sequences and identify swing-leg stiffness modulation as a key control mechanism for inducing transitions. This work provides a much-needed biomechanical perspective on high-speed animal draft and establishes a modeling framework for studying locomotion in pulling quadrupeds, with implications for both biological understanding and the design of adaptive legged systems.
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair SC
Surface cracks in infrastructure can lead to severe deterioration and expensive maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise. While advancements in robotic perception and manipulation have progressed autonomous crack repair, three key challenges remain: accurate localization in the robot's coordinate frame, adaptability to varying crack sizes, and realistic validation of repairs. We present an adaptive, autonomous robotic system for surface crack detection and repair using advanced sensing technologies to enhance precision and safety for humans. A laser scanner is used to refine crack coordinates for accurate localization. Furthermore, our adaptive crack filling approach outperforms fixed speed techniques in efficiency and consistency. We validate our method using 3D printed cracks under realistic conditions, demonstrating repeatable testing. This research contributes to the field of human-robot interaction by reducing manual labor, improving safety, and streamlining maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 10 pages, 6 figures, 3 tables, submitted to ASCE International Conference on Computing in Civil Engineering (i3CE 2025)
On Composable and Parametric Uncertainty in Systems Co-Design
Optimizing the design of complex systems requires navigating interdependent decisions, heterogeneous components, and multiple objectives. Our monotone theory of co-design offers a compositional framework for addressing this challenge, modeling systems as Design Problems (DPs), representing trade-offs between functionalities and resources within partially ordered sets. While current approaches model uncertainty using intervals, capturing worst- and best-case bounds, they fail to express probabilistic notions such as risk and confidence. These limitations hinder the applicability of co-design in domains where uncertainty plays a critical role. In this paper, we introduce a unified framework for composable uncertainty in co-design, capturing intervals, distributions, and parametrized models. This extension enables reasoning about risk-performance trade-offs and supports advanced queries such as experiment design, learning, and multi-stage decision making. We demonstrate the expressiveness and utility of the framework via a numerical case study on the uncertainty-aware co-design of task-driven Unmanned Aerial Vehicles (UAVs).
comment: 8 pages, accepted as an Invited Session Paper to IEEE Conference on Decision and Control (CDC) 2025
The Social Life of Industrial Arms: How Arousal and Attention Shape Human-Robot Interaction
This study explores how human perceptions of a non-anthropomorphic robotic manipulator are shaped by two key dimensions of behaviour: arousal, defined as the robot's movement energy and expressiveness, and attention, defined as the robot's capacity to selectively orient toward and engage with a user. We introduce a novel control architecture that integrates a gaze-like attention engine with an arousal-modulated motion system to generate socially meaningful behaviours. In a user study, we find that robots exhibiting high attention -- actively directing their focus toward users -- are perceived as warmer and more competent, intentional, and lifelike. In contrast, high arousal -- characterized by fast, expansive, and energetic motions -- increases perceptions of discomfort and disturbance. Importantly, a combination of focused attention and moderate arousal yields the highest ratings of trust and sociability, while excessive arousal diminishes social engagement. These findings offer design insights for endowing non-humanoid robots with expressive, intuitive behaviours that support more natural human-robot interaction.
comment: 7 pages, 3 figures, 1 table
Learning Generative Models for Climbing Aircraft from Radar Data
Accurate trajectory prediction (TP) for climbing aircraft is hampered by the presence of epistemic uncertainties concerning aircraft operation, which can lead to significant misspecification between predicted and observed trajectories. This paper proposes a generative model for climbing aircraft in which the standard Base of Aircraft Data (BADA) model is enriched by a functional correction to the thrust that is learned from data. The method offers three features: predictions of the arrival time with 26.7% less error when compared to BADA; generated trajectories that are realistic when compared to test data; and a means of computing confidence bounds for minimal computational cost.
Robotics
A Learning-Based Framework for Collision-Free Motion Planning
This paper presents a learning-based extension to a Circular Field (CF)-based motion planner for efficient, collision-free trajectory generation in cluttered environments. The proposed approach overcomes the limitations of hand-tuned force field parameters by employing a deep neural network trained to infer optimal planner gains from a single depth image of the scene. The pipeline incorporates a CUDA-accelerated perception module, a predictive agent-based planning strategy, and a dataset generated through Bayesian optimization in simulation. The resulting framework enables real-time planning without manual parameter tuning and is validated both in simulation and on a Franka Emika Panda robot. Experimental results demonstrate successful task completion and improved generalization compared to classical planners.
Noise-Aware Generative Microscopic Traffic Simulation
Accurately modeling individual vehicle behavior in microscopic traffic simulation remains a key challenge in intelligent transportation systems, as it requires vehicles to realistically generate and respond to complex traffic phenomena such as phantom traffic jams. While traditional human driver simulation models offer computational tractability, they do so by abstracting away the very complexity that defines human driving. On the other hand, recent advances in infrastructure-mounted camera-based roadway sensing have enabled the extraction of vehicle trajectory data, presenting an opportunity to shift toward generative, agent-based models. Yet, a major bottleneck remains: most existing datasets are either overly sanitized or lack standardization, failing to reflect the noisy, imperfect nature of real-world sensing. Unlike data from vehicle-mounted sensors-which can mitigate sensing artifacts like occlusion through overlapping fields of view and sensor fusion-infrastructure-based sensors surface a messier, more practical view of challenges that traffic engineers encounter. To this end, we present the I-24 MOTION Scenario Dataset (I24-MSD)-a standardized, curated dataset designed to preserve a realistic level of sensor imperfection, embracing these errors as part of the learning problem rather than an obstacle to overcome purely from preprocessing. Drawing from noise-aware learning strategies in computer vision, we further adapt existing generative models in the autonomous driving community for I24-MSD with noise-aware loss functions. Our results show that such models not only outperform traditional baselines in realism but also benefit from explicitly engaging with, rather than suppressing, data imperfection. We view I24-MSD as a stepping stone toward a new generation of microscopic traffic simulation that embraces the real-world challenges and is better aligned with practical needs.
Triple-S: A Collaborative Multi-LLM Framework for Solving Long-Horizon Implicative Tasks in Robotics IROS 2025
Leveraging Large Language Models (LLMs) to write policy code for controlling robots has gained significant attention. However, in long-horizon implicative tasks, this approach often results in API parameter, comments and sequencing errors, leading to task failure. To address this problem, we propose a collaborative Triple-S framework that involves multiple LLMs. Through In-Context Learning, different LLMs assume specific roles in a closed-loop Simplification-Solution-Summary process, effectively improving success rates and robustness in long-horizon implicative tasks. Additionally, a novel demonstration library update mechanism which learned from success allows it to generalize to previously failed tasks. We validate the framework in the Long-horizon Desktop Implicative Placement (LDIP) dataset across various baseline models, where Triple-S successfully executes 89% of tasks in both observable and partially observable scenarios. Experiments in both simulation and real-world robot settings further validated the effectiveness of Triple-S. Our code and dataset is available at: https://github.com/Ghbbbbb/Triple-S.
comment: Accepted to IROS 2025
AgriVLN: Vision-and-Language Navigation for Agricultural Robots
Agricultural robots have emerged as powerful members in agricultural tasks, nevertheless, still heavily rely on manual operation or untransportable railway for movement, resulting in limited mobility and poor adaptability. Vision-and-Language Navigation (VLN) enables robots to navigate to the target destinations following natural language instructions, demonstrating strong performance on several domains. However, none of the existing benchmarks or methods is specifically designed for agricultural scenes. To bridge this gap, we propose Agriculture to Agriculture (A2A) benchmark, containing 1,560 episodes across six diverse agricultural scenes, in which all realistic RGB videos are captured by front-facing camera on a quadruped robot at a height of 0.38 meters, aligning with the practical deployment conditions. Meanwhile, we propose Vision-and-Language Navigation for Agricultural Robots (AgriVLN) baseline based on Vision-Language Model (VLM) prompted with carefully crafted templates, which can understand both given instructions and agricultural environments to generate appropriate low-level actions for robot control. When evaluated on A2A, AgriVLN performs well on short instructions but struggles with long instructions, because it often fails to track which part of the instruction is currently being executed. To address this, we further propose Subtask List (STL) instruction decomposition module and integrate it into AgriVLN, improving Success Rate (SR) from 0.33 to 0.47. We additionally compare AgriVLN with several existing VLN methods, demonstrating the state-of-the-art performance in the agricultural domain.
MonoMPC: Monocular Vision Based Navigation with Learned Collision Model and Risk-Aware Model Predictive Control
Navigating unknown environments with a single RGB camera is challenging, as the lack of depth information prevents reliable collision-checking. While some methods use estimated depth to build collision maps, we found that depth estimates from vision foundation models are too noisy for zero-shot navigation in cluttered environments. We propose an alternative approach: instead of using noisy estimated depth for direct collision-checking, we use it as a rich context input to a learned collision model. This model predicts the distribution of minimum obstacle clearance that the robot can expect for a given control sequence. At inference, these predictions inform a risk-aware MPC planner that minimizes estimated collision risk. Our joint learning pipeline co-trains the collision model and risk metric using both safe and unsafe trajectories. Crucially, our joint-training ensures optimal variance in our collision model that improves navigation in highly cluttered environments. Consequently, real-world experiments show 9x and 7x improvements in success rates over NoMaD and the ROS stack, respectively. Ablation studies further validate the effectiveness of our design choices.
Collision-Free Trajectory Planning and control of Robotic Manipulator using Energy-Based Artificial Potential Field (E-APF)
Robotic trajectory planning in dynamic and cluttered environments remains a critical challenge, particularly when striving for both time efficiency and motion smoothness under actuation constraints. Traditional path planner, such as Artificial Potential Field (APF), offer computational efficiency but suffer from local minima issue due to position-based potential field functions and oscillatory motion near the obstacles due to Newtonian mechanics. To address this limitation, an Energy-based Artificial Potential Field (APF) framework is proposed in this paper that integrates position and velocity-dependent potential functions. E-APF ensures dynamic adaptability and mitigates local minima, enabling uninterrupted progression toward the goal. The proposed framework integrates E-APF with a hybrid trajectory optimizer that jointly minimizes jerk and execution time under velocity and acceleration constraints, ensuring geometric smoothness and time efficiency. The entire framework is validated in simulation using the 7-degree-of-freedom Kinova Gen3 robotic manipulator. The results demonstrate collision-free, smooth, time-efficient, and oscillation-free trajectory in the presence of obstacles, highlighting the efficacy of the combined trajectory optimization and real-time obstacle avoidance approach. This work lays the foundation for future integration with reactive control strategies and physical hardware deployment in real-world manipulation tasks.
A Hybrid Force-Position Strategy for Shape Control of Deformable Linear Objects With Graph Attention Networks
Manipulating deformable linear objects (DLOs) such as wires and cables is crucial in various applications like electronics assembly and medical surgeries. However, it faces challenges due to DLOs' infinite degrees of freedom, complex nonlinear dynamics, and the underactuated nature of the system. To address these issues, this paper proposes a hybrid force-position strategy for DLO shape control. The framework, combining both force and position representations of DLO, integrates state trajectory planning in the force space and Model Predictive Control (MPC) in the position space. We present a dynamics model with an explicit action encoder, a property extractor and a graph processor based on Graph Attention Networks. The model is used in the MPC to enhance prediction accuracy. Results from both simulations and real-world experiments demonstrate the effectiveness of our approach in achieving efficient and stable shape control of DLOs. Codes and videos are available at https://sites.google.com/view/dlom.
Multimodal Spiking Neural Network for Space Robotic Manipulation
This paper presents a multimodal control framework based on spiking neural networks (SNNs) for robotic arms aboard space stations. It is designed to cope with the constraints of limited onboard resources while enabling autonomous manipulation and material transfer in space operations. By combining geometric states with tactile and semantic information, the framework strengthens environmental awareness and contributes to more robust control strategies. To guide the learning process progressively, a dual-channel, three-stage curriculum reinforcement learning (CRL) scheme is further integrated into the system. The framework was tested across a range of tasks including target approach, object grasping, and stable lifting with wall-mounted robotic arms, demonstrating reliable performance throughout. Experimental evaluations demonstrate that the proposed method consistently outperforms baseline approaches in both task success rate and energy efficiency. These findings highlight its suitability for real-world aerospace applications.
Navigation and Exploration with Active Inference: from Biology to Industry
By building and updating internal cognitive maps, animals exhibit extraordinary navigation abilities in complex, dynamic environments. Inspired by these biological mechanisms, we present a real time robotic navigation system grounded in the Active Inference Framework (AIF). Our model incrementally constructs a topological map, infers the agent's location, and plans actions by minimising expected uncertainty and fulfilling perceptual goals without any prior training. Integrated into the ROS2 ecosystem, we validate its adaptability and efficiency across both 2D and 3D environments (simulated and real world), demonstrating competitive performance with traditional and state of the art exploration approaches while offering a biologically inspired navigation approach.
comment: conference IWAI 2025 - accepted (in processing)
Bio-Inspired Topological Autonomous Navigation with Active Inference in Robotics
Achieving fully autonomous exploration and navigation remains a critical challenge in robotics, requiring integrated solutions for localisation, mapping, decision-making and motion planning. Existing approaches either rely on strict navigation rules lacking adaptability or on pre-training, which requires large datasets. These AI methods are often computationally intensive or based on static assumptions, limiting their adaptability in dynamic or unknown environments. This paper introduces a bio-inspired agent based on the Active Inference Framework (AIF), which unifies mapping, localisation, and adaptive decision-making for autonomous navigation, including exploration and goal-reaching. Our model creates and updates a topological map of the environment in real-time, planning goal-directed trajectories to explore or reach objectives without requiring pre-training. Key contributions include a probabilistic reasoning framework for interpretable navigation, robust adaptability to dynamic changes, and a modular ROS2 architecture compatible with existing navigation systems. Our method was tested in simulated and real-world environments. The agent successfully explores large-scale simulated environments and adapts to dynamic obstacles and drift, proving to be comparable to other exploration strategies such as Gbplanner, FAEL and Frontiers. This approach offers a scalable and transparent approach for navigating complex, unstructured environments.
comment: Conference ICCAS 2025 - accepted (in processing)
The 2D+ Dynamic Articulatory Model DYNARTmo: Tongue-Palate Contact Area Estimation
This paper describes an extension of the two-dimensional dynamic articulatory model DYNARTmo by integrating an internal three-dimensional representation of the palatal dome to estimate tongue-palate contact areas from midsagittal tongue contours. Two alternative dome geometries - a half-ellipse and a cosine based profile - are implemented to model lateral curvature in the coronal plane. Using these geometries, lateral contact points are analytically computed for each anterior-posterior position, enabling the generation of electropalatography-like visualizations within the 2D+ framework. The enhanced model supports three synchronized views (sagittal, glottal, and palatal) for static and dynamic (animated) articulation displays, suitable for speech science education and speech therapy. Future work includes adding a facial (lip) view and implementing articulatory-to-acoustic synthesis to quantitatively evaluate model realism.
comment: 11 pages, 9 figures, 14 references; supplementary material: python source code
Impact of Gaze-Based Interaction and Augmentation on Human-Robot Collaboration in Critical Tasks
We present a user study analyzing head-gaze-based robot control and foveated visual augmentation in a simulated search-and-rescue task. Results show that foveated augmentation significantly improves task performance, reduces cognitive load by 38%, and shortens task time by over 60%. Head-gaze patterns analysed over both the entire task duration and shorter time segments show that near and far attention capture is essential to better understand user intention in critical scenarios. Our findings highlight the potential of foveation as an augmentation technique and the need to further study gaze measures to leverage them during critical tasks.
3D Gaussian Representations with Motion Trajectory Field for Dynamic Scene Reconstruction
This paper addresses the challenge of novel-view synthesis and motion reconstruction of dynamic scenes from monocular video, which is critical for many robotic applications. Although Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have demonstrated remarkable success in rendering static scenes, extending them to reconstruct dynamic scenes remains challenging. In this work, we introduce a novel approach that combines 3DGS with a motion trajectory field, enabling precise handling of complex object motions and achieving physically plausible motion trajectories. By decoupling dynamic objects from static background, our method compactly optimizes the motion trajectory field. The approach incorporates time-invariant motion coefficients and shared motion trajectory bases to capture intricate motion patterns while minimizing optimization complexity. Extensive experiments demonstrate that our approach achieves state-of-the-art results in both novel-view synthesis and motion trajectory recovery from monocular video, advancing the capabilities of dynamic scene reconstruction.
Integrating Neurosymbolic AI in Advanced Air Mobility: A Comprehensive Survey IJCAI-2025
Neurosymbolic AI combines neural network adaptability with symbolic reasoning, promising an approach to address the complex regulatory, operational, and safety challenges in Advanced Air Mobility (AAM). This survey reviews its applications across key AAM domains such as demand forecasting, aircraft design, and real-time air traffic management. Our analysis reveals a fragmented research landscape where methodologies, including Neurosymbolic Reinforcement Learning, have shown potential for dynamic optimization but still face hurdles in scalability, robustness, and compliance with aviation standards. We classify current advancements, present relevant case studies, and outline future research directions aimed at integrating these approaches into reliable, transparent AAM systems. By linking advanced AI techniques with AAM's operational demands, this work provides a concise roadmap for researchers and practitioners developing next-generation air mobility solutions.
comment: 9 pages, 4 figures, IJCAI-2025 (accepted)
Whole-Body Coordination for Dynamic Object Grasping with Legged Manipulators
Quadrupedal robots with manipulators offer strong mobility and adaptability for grasping in unstructured, dynamic environments through coordinated whole-body control. However, existing research has predominantly focused on static-object grasping, neglecting the challenges posed by dynamic targets and thus limiting applicability in dynamic scenarios such as logistics sorting and human-robot collaboration. To address this, we introduce DQ-Bench, a new benchmark that systematically evaluates dynamic grasping across varying object motions, velocities, heights, object types, and terrain complexities, along with comprehensive evaluation metrics. Building upon this benchmark, we propose DQ-Net, a compact teacher-student framework designed to infer grasp configurations from limited perceptual cues. During training, the teacher network leverages privileged information to holistically model both the static geometric properties and dynamic motion characteristics of the target, and integrates a grasp fusion module to deliver robust guidance for motion planning. Concurrently, we design a lightweight student network that performs dual-viewpoint temporal modeling using only the target mask, depth map, and proprioceptive state, enabling closed-loop action outputs without reliance on privileged data. Extensive experiments on DQ-Bench demonstrate that DQ-Net achieves robust dynamic objects grasping across multiple task settings, substantially outperforming baseline methods in both success rate and responsiveness.
Unveiling the Potential of iMarkers: Invisible Fiducial Markers for Advanced Robotics
Fiducial markers are widely used in various robotics tasks, facilitating enhanced navigation, object recognition, and scene understanding. Despite their advantages for robots and Augmented Reality (AR) applications, they often disrupt the visual aesthetics of environments because they are visible to humans, making them unsuitable for non-intrusive use cases. To address this gap, this paper presents "iMarkers"-innovative, unobtrusive fiducial markers detectable exclusively by robots equipped with specialized sensors. These markers offer high flexibility in production, allowing customization of their visibility range and encoding algorithms to suit various demands. The paper also introduces the hardware designs and software algorithms developed for detecting iMarkers, highlighting their adaptability and robustness in the detection and recognition stages. Various evaluations have demonstrated the effectiveness of iMarkers compared to conventional (printed) and blended fiducial markers and confirmed their applicability in diverse robotics scenarios.
comment: 18 pages, 10 figures, 3 tables
Semantic Mapping in Indoor Embodied AI -- A Survey on Advances, Challenges, and Future Directions
Intelligent embodied agents (e.g. robots) need to perform complex semantic tasks in unfamiliar environments. Among many skills that the agents need to possess, building and maintaining a semantic map of the environment is most crucial in long-horizon tasks. A semantic map captures information about the environment in a structured way, allowing the agent to reference it for advanced reasoning throughout the task. While existing surveys in embodied AI focus on general advancements or specific tasks like navigation and manipulation, this paper provides a comprehensive review of semantic map-building approaches in embodied AI, specifically for indoor navigation. We categorize these approaches based on their structural representation (spatial grids, topological graphs, dense point-clouds or hybrid maps) and the type of information they encode (implicit features or explicit environmental data). We also explore the strengths and limitations of the map building techniques, highlight current challenges, and propose future research directions. We identify that the field is moving towards developing open-vocabulary, queryable, task-agnostic map representations, while high memory demands and computational inefficiency still remaining to be open challenges. This survey aims to guide current and future researchers in advancing semantic mapping techniques for embodied AI systems.
FunGraph: Functionality Aware 3D Scene Graphs for Language-Prompted Scene Interaction IROS 2025
The concept of 3D scene graphs is increasingly recognized as a powerful semantic and hierarchical representation of the environment. Current approaches often address this at a coarse, object-level resolution. In contrast, our goal is to develop a representation that enables robots to directly interact with their environment by identifying both the location of functional interactive elements and how these can be used. To achieve this, we focus on detecting and storing objects at a finer resolution, focusing on affordance-relevant parts. The primary challenge lies in the scarcity of data that extends beyond instance-level detection and the inherent difficulty of capturing detailed object features using robotic sensors. We leverage currently available 3D resources to generate 2D data and train a detector, which is then used to augment the standard 3D scene graph generation pipeline. Through our experiments, we demonstrate that our approach achieves functional element segmentation comparable to state-of-the-art 3D models and that our augmentation enables task-driven affordance grounding with higher accuracy than the current solutions. See our project page at https://fungraph.github.io.
comment: Paper accepted for IROS 2025
In-between Motion Generation Based Multi-Style Quadruped Robot Locomotion
Quadruped robots face persistent challenges in achieving versatile locomotion due to limitations in reference motion data diversity. To address these challenges, we introduce an in-between motion generation based multi-style quadruped robot locomotion framework. We propose a CVAE based motion generator, synthesizing multi-style dynamically feasible locomotion sequences between arbitrary start and end states. By embedding physical constraints and leveraging joint poses based phase manifold continuity, this component produces physically plausible motions spanning multiple gait modalities while ensuring kinematic compatibility with robotic morphologies. We train the imitation policy based on generated data, which validates the effectiveness of generated motion data in enhancing controller stability and improving velocity tracking performance. The proposed framework demonstrates significant improvements in velocity tracking and deployment stability. We successfully deploy the framework on a real-world quadruped robot, and the experimental validation confirms the framework's capability to generate and execute complex motion profiles, including gallop, tripod, trotting and pacing.
Learning 3D-Gaussian Simulators from RGB Videos
Realistic simulation is critical for applications ranging from robotics to animation. Learned simulators have emerged as a possibility to capture real world physics directly from video data, but very often require privileged information such as depth information, particle tracks and hand-engineered features to maintain spatial and temporal consistency. These strong inductive biases or ground truth 3D information help in domains where data is sparse but limit scalability and generalization in data rich regimes. To overcome the key limitations, we propose 3DGSim, a learned 3D simulator that directly learns physical interactions from multi-view RGB videos. 3DGSim unifies 3D scene reconstruction, particle dynamics prediction and video synthesis into an end-to-end trained framework. It adopts MVSplat to learn a latent particle-based representation of 3D scenes, a Point Transformer for particle dynamics, a Temporal Merging module for consistent temporal aggregation and Gaussian Splatting to produce novel view renderings. By jointly training inverse rendering and dynamics forecasting, 3DGSim embeds the physical properties into point-wise latent features. This enables the model to capture diverse physical behaviors, from rigid to elastic, cloth-like dynamics, and boundary conditions (e.g. fixed cloth corner), along with realistic lighting effects that also generalize to unseen multibody interactions and novel scene edits.
Dynamic Robot-Assisted Surgery with Hierarchical Class-Incremental Semantic Segmentation MICCAI
Robot-assisted surgeries rely on accurate and real-time scene understanding to safely guide surgical instruments. However, segmentation models trained on static datasets face key limitations when deployed in these dynamic and evolving surgical environments. Class-incremental semantic segmentation (CISS) allows models to continually adapt to new classes while avoiding catastrophic forgetting of prior knowledge, without training on previous data. In this work, we build upon the recently introduced Taxonomy-Oriented Poincar\'e-regularized Incremental Class Segmentation (TOPICS) approach and propose an enhanced variant, termed TOPICS+, specifically tailored for robust segmentation of surgical scenes. Concretely, we incorporate the Dice loss into the hierarchical loss formulation to handle strong class imbalances, introduce hierarchical pseudo-labeling, and design tailored label taxonomies for robotic surgery environments. We also propose six novel CISS benchmarks designed for robotic surgery environments including multiple incremental steps and several semantic categories to emulate realistic class-incremental settings in surgical environments. In addition, we introduce a refined set of labels with more than 144 classes on the Syn-Mediverse synthetic dataset, hosted online as an evaluation benchmark. We make the code and trained models publicly available at http://topics.cs.uni-freiburg.de.
comment: accepted at MICCAI AMAI 2025 workshop
Embodied intelligent industrial robotics: Concepts and techniques
In order to work more efficiently, accurately, reliably, and safely in industrial scenarios, robots should have at least general knowledge, working-environment knowledge, and operating-object knowledge. These pose significant challenges to existing embodied intelligent robotics (EIR) techniques. Thus, this paper first briefly reviews the history of industrial robotics and analyzes the limitations of mainstream EIR frameworks. Then, a knowledge-driven technical framework of embodied intelligent industrial robotics (EIIR) is proposed for various industrial environments. It has five modules: a world model, a high-level task planner, a low-level skill controller, a simulator, and a physical system. The development of techniques related to each module are also thoroughly reviewed, and recent progress regarding their adaption to industrial applications are discussed. A case study is given to demonstrate the newly proposed EIIR framework's applicability to real-world assembly system. Finally, the key challenges that EIIR encounters in industrial scenarios are summarized and future research directions are suggested. The authors believe that EIIR technology is shaping the next generation of industrial robotics and EIIR-based industrial systems supply a new technological paradigm for intelligent manufacturing. It is expected that this review could serve as a valuable reference for scholars and engineers that are interested in industrial embodied intelligence. Together, scholars can use this research to drive their rapid advancement and application of EIIR techniques. The interested authors would continue to track and contribute new studies in the project page https://github.com/jackyzengl/EIIR.
comment: 68 pages, 12 figures. The associated project can be found at https://github.com/jackyzengl/EIIR
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
In robotic visuomotor policy learning, diffusion-based models have achieved significant success in improving the accuracy of action trajectory generation compared to traditional autoregressive models. However, they suffer from inefficiency due to multiple denoising steps and limited flexibility from complex constraints. In this paper, we introduce Coarse-to-Fine AutoRegressive Policy (CARP), a novel paradigm for visuomotor policy learning that redefines the autoregressive action generation process as a coarse-to-fine, next-scale approach. CARP decouples action generation into two stages: first, an action autoencoder learns multi-scale representations of the entire action sequence; then, a GPT-style transformer refines the sequence prediction through a coarse-to-fine autoregressive process. This straightforward and intuitive approach produces highly accurate and smooth actions, matching or even surpassing the performance of diffusion-based policies while maintaining efficiency on par with autoregressive policies. We conduct extensive evaluations across diverse settings, including single-task and multi-task scenarios on state-based and image-based simulation benchmarks, as well as real-world tasks. CARP achieves competitive success rates, with up to a 10% improvement, and delivers 10x faster inference compared to state-of-the-art policies, establishing a high-performance, efficient, and flexible paradigm for action generation in robotic tasks.
MAT-DiSMech: A Discrete Differential Geometry-based Computational Tool for Simulation of Rods, Shells, and Soft Robots
Accurate and efficient simulation tools are essential in robotics, enabling the visualization of system dynamics and the validation of control laws before committing resources to physical experimentation. Developing physically accurate simulation tools is particularly challenging in soft robotics, largely due to the prevalence of geometrically nonlinear deformation. A variety of robot simulators tackle this challenge by using simplified modeling techniques -- such as lumped mass models -- which lead to physical inaccuracies in real-world applications. On the other hand, high-fidelity simulation methods for soft structures, like finite element analysis, offer increased accuracy but lead to higher computational costs. In light of this, we present a Discrete Differential Geometry-based simulator that provides a balance between physical accuracy and computational speed. Building on an extensive body of research on rod and shell-based representations of soft robots, our tool provides a pathway to accurately model soft robots in a computationally tractable manner. Our open-source MATLAB-based framework is capable of simulating the deformations of rods, shells, and their combinations, primarily utilizing implicit integration techniques. The software design is modular for the user to customize the code, for example, add new external forces and impose custom boundary conditions. The implementations for prevalent forces encountered in robotics, including gravity, contact, kinetic and viscous friction, and aerodynamic drag, have been provided. We provide several illustrative examples that showcase the capabilities and validate the physical accuracy of the simulator. The open-source code is available at https://github.com/StructuresComp/dismech-matlab.git. We anticipate that the proposed simulator can serve as an effective digital twin tool, enhancing the Sim2Real pathway in soft robotics research.
comment: Total 31 pages, 12 figures, open-source code available at https://github.com/StructuresComp/dismech-matlab
Understanding and Imitating Human-Robot Motion with Restricted Visual Fields
When working around other agents such as humans, it is important to model their perception capabilities to predict and make sense of their behavior. In this work, we consider agents whose perception capabilities are determined by their limited field of view, viewing range, and the potential to miss objects within their viewing range. By considering the perception capabilities and observation model of agents independently from their motion policy, we show that we can better predict the agents' behavior; i.e., by reasoning about the perception capabilities of other agents, one can better make sense of their actions. We perform a user study where human operators navigate a cluttered scene while scanning the region for obstacles with a limited field of view and range. We show that by reasoning about the limited observation space of humans, a robot can better learn a human's strategy for navigating an environment and navigate with minimal collision with dynamic and static obstacles. We also show that this learned model helps it successfully navigate a physical hardware vehicle in real-time. Code available at https://github.com/labicon/HRMotion-RestrictedView.
MultiNash-PF: A Particle Filtering Approach for Computing Multiple Local Generalized Nash Equilibria in Trajectory Games
Modern robotic systems frequently engage in complex multi-agent interactions, many of which are inherently multi-modal, i.e., they can lead to multiple distinct outcomes. To interact effectively, robots must recognize the possible interaction modes and adapt to the one preferred by other agents. In this work, we propose MultiNash-PF, an efficient algorithm for capturing the multimodality in multi-agent interactions. We model interaction outcomes as equilibria of a game-theoretic planner, where each equilibrium corresponds to a distinct interaction mode. Our framework formulates interactive planning as Constrained Potential Trajectory Games (CPTGs), in which local Generalized Nash Equilibria (GNEs) represent plausible interaction outcomes. We propose to integrate the potential game approach with implicit particle filtering, a sample-efficient method for non-convex trajectory optimization. We utilize implicit particle filtering to identify the coarse estimates of multiple local minimizers of the game's potential function. MultiNash-PF then refines these estimates with optimization solvers, obtaining different local GNEs. We show through numerical simulations that MultiNash-PF reduces computation time by up to 50\% compared to a baseline. We further demonstrate the effectiveness of our algorithm in real-world human-robot interaction scenarios, where it successfully accounts for the multi-modal nature of interactions and resolves potential conflicts in real-time.
Multiagent Systems
Noise-Aware Generative Microscopic Traffic Simulation
Accurately modeling individual vehicle behavior in microscopic traffic simulation remains a key challenge in intelligent transportation systems, as it requires vehicles to realistically generate and respond to complex traffic phenomena such as phantom traffic jams. While traditional human driver simulation models offer computational tractability, they do so by abstracting away the very complexity that defines human driving. On the other hand, recent advances in infrastructure-mounted camera-based roadway sensing have enabled the extraction of vehicle trajectory data, presenting an opportunity to shift toward generative, agent-based models. Yet, a major bottleneck remains: most existing datasets are either overly sanitized or lack standardization, failing to reflect the noisy, imperfect nature of real-world sensing. Unlike data from vehicle-mounted sensors-which can mitigate sensing artifacts like occlusion through overlapping fields of view and sensor fusion-infrastructure-based sensors surface a messier, more practical view of challenges that traffic engineers encounter. To this end, we present the I-24 MOTION Scenario Dataset (I24-MSD)-a standardized, curated dataset designed to preserve a realistic level of sensor imperfection, embracing these errors as part of the learning problem rather than an obstacle to overcome purely from preprocessing. Drawing from noise-aware learning strategies in computer vision, we further adapt existing generative models in the autonomous driving community for I24-MSD with noise-aware loss functions. Our results show that such models not only outperform traditional baselines in realism but also benefit from explicitly engaging with, rather than suppressing, data imperfection. We view I24-MSD as a stepping stone toward a new generation of microscopic traffic simulation that embraces the real-world challenges and is better aligned with practical needs.
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Recent advances in large language models have sparked growing interest in AI agents capable of solving complex, real-world tasks. However, most existing agent systems rely on manually crafted configurations that remain static after deployment, limiting their ability to adapt to dynamic and evolving environments. To this end, recent research has explored agent evolution techniques that aim to automatically enhance agent systems based on interaction data and environmental feedback. This emerging direction lays the foundation for self-evolving AI agents, which bridge the static capabilities of foundation models with the continuous adaptability required by lifelong agentic systems. In this survey, we provide a comprehensive review of existing techniques for self-evolving agentic systems. Specifically, we first introduce a unified conceptual framework that abstracts the feedback loop underlying the design of self-evolving agentic systems. The framework highlights four key components: System Inputs, Agent System, Environment, and Optimisers, serving as a foundation for understanding and comparing different strategies. Based on this framework, we systematically review a wide range of self-evolving techniques that target different components of the agent system. We also investigate domain-specific evolution strategies developed for specialised fields such as biomedicine, programming, and finance, where optimisation objectives are tightly coupled with domain constraints. In addition, we provide a dedicated discussion on the evaluation, safety, and ethical considerations for self-evolving agentic systems, which are critical to ensuring their effectiveness and reliability. This survey aims to provide researchers and practitioners with a systematic understanding of self-evolving AI agents, laying the foundation for the development of more adaptive, autonomous, and lifelong agentic systems.
A Survey on Agentic Service Ecosystems: Measurement, Analysis, and Optimization
The Agentic Service Ecosystem consists of heterogeneous autonomous agents (e.g., intelligent machines, humans, and human-machine hybrid systems) that interact through resource exchange and service co-creation. These agents, with distinct behaviors and motivations, exhibit autonomous perception, reasoning, and action capabilities, which increase system complexity and make traditional linear analysis methods inadequate. Swarm intelligence, characterized by decentralization, self-organization, emergence, and dynamic adaptability, offers a novel theoretical lens and methodology for understanding and optimizing such ecosystems. However, current research, owing to fragmented perspectives and cross-ecosystem differences, fails to comprehensively capture the complexity of swarm-intelligence emergence in agentic contexts. The lack of a unified methodology further limits the depth and systematic treatment of the research. This paper proposes a framework for analyzing the emergence of swarm intelligence in Agentic Service Ecosystems, with three steps: measurement, analysis, and optimization, to reveal the cyclical mechanisms and quantitative criteria that foster emergence. By reviewing existing technologies, the paper analyzes their strengths and limitations, identifies unresolved challenges, and shows how this framework provides both theoretical support and actionable methods for real-world applications.
LLM-based Agents for Automated Confounder Discovery and Subgroup Analysis in Causal Inference
Estimating individualized treatment effects from observational data presents a persistent challenge due to unmeasured confounding and structural bias. Causal Machine Learning (causal ML) methods, such as causal trees and doubly robust estimators, provide tools for estimating conditional average treatment effects. These methods have limited effectiveness in complex real-world environments due to the presence of latent confounders or those described in unstructured formats. Moreover, reliance on domain experts for confounder identification and rule interpretation introduces high annotation cost and scalability concerns. In this work, we proposed Large Language Model-based agents for automated confounder discovery and subgroup analysis that integrate agents into the causal ML pipeline to simulate domain expertise. Our framework systematically performs subgroup identification and confounding structure discovery by leveraging the reasoning capabilities of LLM-based agents, which reduces human dependency while preserving interpretability. Experiments on real-world medical datasets show that our proposed approach enhances treatment effect estimation robustness by narrowing confidence intervals and uncovering unrecognized confounding biases. Our findings suggest that LLM-based agents offer a promising path toward scalable, trustworthy, and semantically aware causal inference.
Multi-Dimensional Summarization Agents with Context-Aware Reasoning over Enterprise Tables
We propose a novel framework for summarizing structured enterprise data across multiple dimensions using large language model (LLM)-based agents. Traditional table-to-text models often lack the capacity to reason across hierarchical structures and context-aware deltas, which are essential in business reporting tasks. Our method introduces a multi-agent pipeline that extracts, analyzes, and summarizes multi-dimensional data using agents for slicing, variance detection, context construction, and LLM-based generation. Our results show that the proposed framework outperforms traditional approaches, achieving 83\% faithfulness to underlying data, superior coverage of significant changes, and high relevance scores (4.4/5) for decision-critical insights. The improvements are especially pronounced in categories involving subtle trade-offs, such as increased revenue due to price changes amid declining unit volumes, which competing methods either overlook or address with limited specificity. We evaluate the framework on Kaggle datasets and demonstrate significant improvements in faithfulness, relevance, and insight quality over baseline table summarization approaches.
miRKatAI: An Integrated Database and Multi-agent AI system for microRNA Research
MicroRNAs (miRs) are robust regulators of gene expression, implicated in most biological processes. microRNAs predominantly downregulate the expression of genes post-transcriptionally and each miR is predicted to target several hundred genes. The accurate identification and annotation of miR-mRNA target interactions is central to understanding miRs function and their therapeutic potential. However, computational target prediction is challenging due to imperfect complementarity of miRs with their targets and the growing volume and heterogeneity of experimental data present challenges in accessing, integrating, and analysing miR-target interaction information across biological contexts. This creates a need for integrated resources and intelligent query tools. We present the miRKat Suite, comprising miRKatDB, a comprehensive, curated database of predicted and validated miR-target interactions and associated annotations, and miRKatAI, a multi-agent system powered by large language models (LLMs) and LangGraph. miRKatDB integrates data from multiple publicly available sources, providing a comprehensive foundation for miR studies, including miR target genes and changes in levels of tissue expression previously reported. miRKatAI offers a natural language interface for complex querying of miRKatDB, facilitates grounded information retrieval from established sources in the field, and supports basic data visualisation. The miRKat Suite aims to accelerate miR research by streamlining data access, enhancing exploratory analysis, and supporting hypothesis generation.
comment: 10 pages, 1 figure, app note
The Condorcet Dimension of Metric Spaces
A Condorcet winning set is a set of candidates such that no other candidate is preferred by at least half the voters over all members of the set. The Condorcet dimension, which is the minimum cardinality of a Condorcet winning set, is known to be at most logarithmic in the number of candidates. We study the case of elections where voters and candidates are located in a $2$-dimensional space with preferences based upon proximity voting. Our main result is that the Condorcet dimension is at most $3$, under both the Manhattan norm and the infinity norm, natural measures in electoral systems.
comment: 10 pages
RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
Multi-agent large language model (LLM) systems have shown strong potential in complex reasoning and collaborative decision-making tasks. However, most existing coordination schemes rely on static or full-context routing strategies, which lead to excessive token consumption, redundant memory exposure, and limited adaptability across interaction rounds. We introduce RCR-Router, a modular and role-aware context routing framework designed to enable efficient, adaptive collaboration in multi-agent LLMs. To our knowledge, this is the first routing approach that dynamically selects semantically relevant memory subsets for each agent based on its role and task stage, while adhering to a strict token budget. A lightweight scoring policy guides memory selection, and agent outputs are iteratively integrated into a shared memory store to facilitate progressive context refinement. To better evaluate model behavior, we further propose an Answer Quality Score metric that captures LLM-generated explanations beyond standard QA accuracy. Experiments on three multi-hop QA benchmarks -- HotPotQA, MuSiQue, and 2WikiMultihop -- demonstrate that RCR-Router reduces token usage (up to 30%) while improving or maintaining answer quality. These results highlight the importance of structured memory routing and output-aware evaluation in advancing scalable multi-agent LLM systems.
Systems and Control (CS)
SRAM-based Physically Unclonable Function using Lightweight Hamming-Code Fuzzy Extractor for Energy Harvesting Beat Sensors
Batteryless energy harvesting IoT sensor nodes such as beat sensors can be deployed in millions without the need to replace batteries. They are ultra-low-power and cost-effective wireless sensor nodes without the maintenance cost and can work for 24 hours/365 days. However, they were not equipped with security mechanisms to protect user data. Data encryption and authentication can be used to secure beat sensor applications, but generating a secure cryptographic key is challenging. In this paper, we proposed an SRAM-based Physically Unclonable Function (PUF) combining a high-reliability bit selection algorithm with a lightweight error-correcting code to generate reliable secure keys for data encryption. The system employs a feature of beat sensors, in which the microcontroller is powered on to transmit the ID signals and then powered off. This fits the SRAM-based PUF requirement, which needs the SRAM to be powered off to read out its random values. The proposed system has been evaluated on STM32 Cortex M0+ microcontrollers and has been implemented to protect important data on beat sensors.
Noise-Aware Generative Microscopic Traffic Simulation
Accurately modeling individual vehicle behavior in microscopic traffic simulation remains a key challenge in intelligent transportation systems, as it requires vehicles to realistically generate and respond to complex traffic phenomena such as phantom traffic jams. While traditional human driver simulation models offer computational tractability, they do so by abstracting away the very complexity that defines human driving. On the other hand, recent advances in infrastructure-mounted camera-based roadway sensing have enabled the extraction of vehicle trajectory data, presenting an opportunity to shift toward generative, agent-based models. Yet, a major bottleneck remains: most existing datasets are either overly sanitized or lack standardization, failing to reflect the noisy, imperfect nature of real-world sensing. Unlike data from vehicle-mounted sensors-which can mitigate sensing artifacts like occlusion through overlapping fields of view and sensor fusion-infrastructure-based sensors surface a messier, more practical view of challenges that traffic engineers encounter. To this end, we present the I-24 MOTION Scenario Dataset (I24-MSD)-a standardized, curated dataset designed to preserve a realistic level of sensor imperfection, embracing these errors as part of the learning problem rather than an obstacle to overcome purely from preprocessing. Drawing from noise-aware learning strategies in computer vision, we further adapt existing generative models in the autonomous driving community for I24-MSD with noise-aware loss functions. Our results show that such models not only outperform traditional baselines in realism but also benefit from explicitly engaging with, rather than suppressing, data imperfection. We view I24-MSD as a stepping stone toward a new generation of microscopic traffic simulation that embraces the real-world challenges and is better aligned with practical needs.
A Multi-Model Probabilistic Framework for Seismic Risk Assessment and Retrofit Planning of Electric Power Networks
Electric power networks are critical lifelines, and their disruption during earthquakes can lead to severe cascading failures and significantly hinder post-disaster recovery. Enhancing their seismic resilience requires identifying and strengthening vulnerable components in a cost-effective and system-aware manner. However, existing studies often overlook the systemic behavior of power networks under seismic loading. Common limitations include isolated component analyses that neglect network-wide interdependencies, oversimplified damage models assuming binary states or damage independence, and the exclusion of electrical operational constraints. These simplifications can result in inaccurate risk estimates and inefficient retrofit decisions. This study proposes a multi-model probabilistic framework for seismic risk assessment and retrofit planning of electric power systems. The approach integrates: (1) regional seismic hazard characterization with ground motion prediction and spatial correlation models; (2) component-level damage analysis using fragility functions and multi-state damage-functionality mappings; (3) system-level cascading impact evaluation through graph-based island detection and constrained optimal power flow analysis; and (4) retrofit planning via heuristic optimization to minimize expected annual functionality loss (EAFL) under budget constraints. Uncertainty is propagated throughout the framework using Monte Carlo simulation. The methodology is demonstrated on the IEEE 24-bus Reliability Test System, showcasing its ability to capture cascading failures, identify critical components, and generate effective retrofit strategies. Results underscore the potential of the framework as a scalable, data-informed decision-support tool for enhancing the seismic resilience of power infrastructure.
comment: 13 figures
Collision-Free Trajectory Planning and control of Robotic Manipulator using Energy-Based Artificial Potential Field (E-APF)
Robotic trajectory planning in dynamic and cluttered environments remains a critical challenge, particularly when striving for both time efficiency and motion smoothness under actuation constraints. Traditional path planner, such as Artificial Potential Field (APF), offer computational efficiency but suffer from local minima issue due to position-based potential field functions and oscillatory motion near the obstacles due to Newtonian mechanics. To address this limitation, an Energy-based Artificial Potential Field (APF) framework is proposed in this paper that integrates position and velocity-dependent potential functions. E-APF ensures dynamic adaptability and mitigates local minima, enabling uninterrupted progression toward the goal. The proposed framework integrates E-APF with a hybrid trajectory optimizer that jointly minimizes jerk and execution time under velocity and acceleration constraints, ensuring geometric smoothness and time efficiency. The entire framework is validated in simulation using the 7-degree-of-freedom Kinova Gen3 robotic manipulator. The results demonstrate collision-free, smooth, time-efficient, and oscillation-free trajectory in the presence of obstacles, highlighting the efficacy of the combined trajectory optimization and real-time obstacle avoidance approach. This work lays the foundation for future integration with reactive control strategies and physical hardware deployment in real-world manipulation tasks.
Human-in-the-Loop Simulation for Real-Time Exploration of HVAC Demand Flexibility
The increasing integration of renewable energy into the power grid has highlighted the critical importance of demand-side flexibility. Among flexible loads, heating, ventilation, and air-conditioning (HVAC) systems are particularly significant due to their high energy consumption and controllability. This study presents the development of an interactive simulation platform that integrates a high-fidelity simulation engine with a user-facing dashboard, specifically designed to explore and demonstrate the demand flexibility capacity of HVAC systems. Unlike conventional simulations, where users are passive observers of simulation results with no ability to intervene in the embedded control during the simulation, this platform transforms them into active participants. Users can override system default control settings, such as zone temperature setpoints and HVAC schedules, at any point during the simulation runtime to implement demand response strategies of their choice. This human-in-the-loop capability enables real-time interaction and allows users to observe the immediate impact of their actions, emulating the practical decision-making process of a building or system operator. By exploring different demand flexibility scenarios and system behaviour in a manner that reflects real-world operation, users gain a deeper understanding of demand flexibility and their impacts. This interactive experience builds confidence and supports more informed decision-making in the practical adoption of demand-side flexibility. This paper presents the architecture of the simulation platform, user-oriented dashboard design, and user case showcase. The introduced human-in-the-loop simulation paradigm offers a more intuitive and interactive means of engaging with grid-interactive building operations, extending beyond HVAC demand flexibility exploration.
Threshold dynamics in time-delay systems: polynomial $β$-control in a pressing process and connections to blow-up
This paper addresses a press control problem in straightening machines with small time delays due to system communication. To handle this, we propose a generalized $\beta$-control method, which replaces conventional linear velocity control with a polynomial of degree $\beta \ge 1$. The resulting model is a delay differential equation (DDE), for which we derive basic properties through nondimensionalization and analysis. Numerical experiments suggest the existence of a threshold initial velocity separating overshoot and non-overshoot dynamics, which we formulate as a conjecture. Based on this, we design a control algorithm under velocity constraints and confirm its effectiveness. We also highlight a connection between threshold behavior and finite-time blow-up in DDEs. This study provides a practical control strategy and contributes new insights into threshold dynamics and blow-up phenomena in delay systems.
comment: 11 pages, 9 figures
Applying the Spectral Method for Modeling Linear Filters: Butterworth, Linkwitz-Riley, and Chebyshev filters
This paper proposes a new technique for computer modeling linear filters based on the spectral form of mathematical description of linear systems. It assumes that input and output signals of the filter are represented as orthogonal expansions, while filters themselves are described by two-dimensional non-stationary transfer functions. This technique allows one to model the output signal in continuous time, and it is successfully tested on the Butterworth, Linkwitz-Riley, and Chebyshev filters with different orders.
An Analogy of Frequency Droop Control for Grid-forming Sources
In this paper, we present an analogy for a power system dominated by grid-forming (GFM) sources that proves to be a powerful visualization tool for analyses of power flow, frequency regulation, and power dispatch. Frequency droop characteristics of a typical GFM source are exactly reflected by an ordinary model of water vessels. The frequency is represented by visible water levels while the droop slope is reified by the vessel sizes. This proposed analogy allows us to use the intuitive water-flow phenomenon to explain the abstract power-flow problems. The grid integration of renewables via GFM inverters is interestingly simulated by a vessel connected to an infinite water tank. This paper also provides a means for demonstrating issues to audiences with little or no background in power systems. Finally, the proposal is verified by simulation results.
comment: Accepted by IEEE PESGM 2025
On Irreversibility and Stochastic Systems: Part One
We attempt to characterize irreversibility of a dynamical system from the existence of different forward and backward mathematical representations depending on the direction of the time arrow. Such different representations have been studied intensively and are shown to exist for stochastic diffusion models. In this setting one has however to face the preliminary justification of stochastic description for physical systems which are described by classical mechanics as inherently deterministic and conservative. In part one of this paper we first address this modeling problem for linear systems in a deterministic context. We show that forward-backward representations can also describe conservative finite dimensional deterministic systems when they are coupled to an infinite-dimensional conservative heat bath. A novel key observation is that the heat bath acts on the finite-dimensional conservative system by {\em state-feedback} and can shift its eigenvalues to make the system dissipative but may also generate another totally unstable model which naturally evolves backward in time. In the second part, we address the stochastic description of these two representations. Under a natural family of invariant measures the heat bath can be shown to induce a white noise input acting on the system making it look like a true dissipative diffusion.
Heisenberg-limited calibration of entangling gates with robust phase estimation
The calibration of high-quality two-qubit entangling gates is an essential component in engineering large-scale, fault-tolerant quantum computers. However, many standard calibration techniques are based on randomized circuits that are only quadratically sensitive to calibration errors. As a result, these approaches are inefficient, requiring many experimental shots to achieve acceptable performance. In this work, we demonstrate that robust phase estimation can enable high-precision, Heisenberg-limited estimates of coherent errors in multi-qubit gates. Equipped with an efficient estimator, the calibration problem may be reduced to a simple optimization loop that minimizes the estimated coherent error. We experimentally demonstrate our calibration protocols by improving the operation of a two-qubit controlled-Z gate on a superconducting processor, and we validate the improved performance with gate set tomography. Our methods are applicable to gates in other quantum hardware platforms such as ion traps and neutral atoms, and on other multi-qubit gates, such as CNOT or iSWAP.
comment: 11 pages, 4 figures
Mean--Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
We study continuous-time mean--variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors that are also diffusion processes, yet the coefficients of these processes are unknown. Based on the recently developed reinforcement learning (RL) theory for diffusion processes, we present a general data-driven RL algorithm that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients. For multi-stock Black--Scholes markets without factors, we further devise a baseline algorithm and prove its performance guarantee by deriving a sublinear regret bound in terms of the Sharpe ratio. For performance enhancement and practical implementation, we modify the baseline algorithm and carry out an extensive empirical study to compare its performance, in terms of a host of common metrics, with a large number of widely employed portfolio allocation strategies on S\&P 500 constituents. The results demonstrate that the proposed continuous-time RL strategy is consistently among the best, especially in a volatile bear market, and decisively outperforms the model-based continuous-time counterparts by significant margins.
comment: 82 pages, 6 figures, 7 tables
Mitigating Traffic Oscillations in Mixed Traffic Flow with Scalable Deep Koopman Predictive Control
Mitigating traffic oscillations in mixed flows of connected automated vehicles (CAVs) and human-driven vehicles (HDVs) is critical for enhancing traffic stability. A key challenge lies in modeling the nonlinear, heterogeneous behaviors of HDVs within computationally tractable predictive control frameworks. This study proposes an adaptive deep Koopman predictive control framework (AdapKoopPC) to address this issue. The framework features a novel deep Koopman network, AdapKoopnet, which represents complex HDV car-following dynamics as a linear system in a high-dimensional space by adaptively learning from naturalistic data. This learned linear representation is then embedded into a Model Predictive Control (MPC) scheme, enabling real-time, scalable, and optimal control of CAVs. We validate our framework using the HighD dataset and extensive numerical simulations. Results demonstrate that AdapKoopnet achieves superior trajectory prediction accuracy over baseline models. Furthermore, the complete AdapKoopPC controller significantly dampens traffic oscillations with lower computational cost, exhibiting strong performance even at low CAV penetration rates. The proposed framework offers a scalable and data-driven solution for enhancing stability in realistic mixed traffic environments. The code is made publicly available.
Distributed Optimal Coverage Control in Multi-agent Systems: Known and Unknown Environments
This paper introduces a novel approach to solve the coverage optimization problem in multi-agent systems. The proposed technique offers an optimal solution with a lower cost with respect to conventional Voronoi-based techniques by effectively handling the issue of agents remaining stationary in regions void of information using a ranking function. The proposed approach leverages a novel cost function for optimizing the agents coverage and the cost function eventually aligns with the conventional Voronoi-based cost function. Theoretical analyses are conducted to assure the asymptotic convergence of agents towards the optimal configuration. A distinguishing feature of this approach lies in its departure from the reliance on geometric methods that are characteristic of Voronoi-based approaches; hence can be implemented more simply. Remarkably, the technique is adaptive and applicable to various environments with both known and unknown information distributions. Lastly, the efficacy of the proposed method is demonstrated through simulations, and the obtained results are compared with those of Voronoi-based algorithms.
Functional Controllability, Functional Stabilizability, and the Generalized Separation Principle
This paper introduces the new concepts of Functional Controllability and Functional Stabilizability, and establishes their duality with Functional Observability and Functional Detectability, respectively. A Generalized Separation Principle is presented, under which the classical Separation Principle emerges as a special case. Conditions for the existence of functional controllers of a specified order are derived. Notably, the proposed design framework does not require full controllability. In addition, a functional observer-based controller design is developed for systems that may be both uncontrollable and unobservable. The results presented extend and generalize the classical full-state observer based feedback control paradigm.
comment: Under review in a journal
Systems and Control (EESS)
SRAM-based Physically Unclonable Function using Lightweight Hamming-Code Fuzzy Extractor for Energy Harvesting Beat Sensors
Batteryless energy harvesting IoT sensor nodes such as beat sensors can be deployed in millions without the need to replace batteries. They are ultra-low-power and cost-effective wireless sensor nodes without the maintenance cost and can work for 24 hours/365 days. However, they were not equipped with security mechanisms to protect user data. Data encryption and authentication can be used to secure beat sensor applications, but generating a secure cryptographic key is challenging. In this paper, we proposed an SRAM-based Physically Unclonable Function (PUF) combining a high-reliability bit selection algorithm with a lightweight error-correcting code to generate reliable secure keys for data encryption. The system employs a feature of beat sensors, in which the microcontroller is powered on to transmit the ID signals and then powered off. This fits the SRAM-based PUF requirement, which needs the SRAM to be powered off to read out its random values. The proposed system has been evaluated on STM32 Cortex M0+ microcontrollers and has been implemented to protect important data on beat sensors.
Noise-Aware Generative Microscopic Traffic Simulation
Accurately modeling individual vehicle behavior in microscopic traffic simulation remains a key challenge in intelligent transportation systems, as it requires vehicles to realistically generate and respond to complex traffic phenomena such as phantom traffic jams. While traditional human driver simulation models offer computational tractability, they do so by abstracting away the very complexity that defines human driving. On the other hand, recent advances in infrastructure-mounted camera-based roadway sensing have enabled the extraction of vehicle trajectory data, presenting an opportunity to shift toward generative, agent-based models. Yet, a major bottleneck remains: most existing datasets are either overly sanitized or lack standardization, failing to reflect the noisy, imperfect nature of real-world sensing. Unlike data from vehicle-mounted sensors-which can mitigate sensing artifacts like occlusion through overlapping fields of view and sensor fusion-infrastructure-based sensors surface a messier, more practical view of challenges that traffic engineers encounter. To this end, we present the I-24 MOTION Scenario Dataset (I24-MSD)-a standardized, curated dataset designed to preserve a realistic level of sensor imperfection, embracing these errors as part of the learning problem rather than an obstacle to overcome purely from preprocessing. Drawing from noise-aware learning strategies in computer vision, we further adapt existing generative models in the autonomous driving community for I24-MSD with noise-aware loss functions. Our results show that such models not only outperform traditional baselines in realism but also benefit from explicitly engaging with, rather than suppressing, data imperfection. We view I24-MSD as a stepping stone toward a new generation of microscopic traffic simulation that embraces the real-world challenges and is better aligned with practical needs.
A Multi-Model Probabilistic Framework for Seismic Risk Assessment and Retrofit Planning of Electric Power Networks
Electric power networks are critical lifelines, and their disruption during earthquakes can lead to severe cascading failures and significantly hinder post-disaster recovery. Enhancing their seismic resilience requires identifying and strengthening vulnerable components in a cost-effective and system-aware manner. However, existing studies often overlook the systemic behavior of power networks under seismic loading. Common limitations include isolated component analyses that neglect network-wide interdependencies, oversimplified damage models assuming binary states or damage independence, and the exclusion of electrical operational constraints. These simplifications can result in inaccurate risk estimates and inefficient retrofit decisions. This study proposes a multi-model probabilistic framework for seismic risk assessment and retrofit planning of electric power systems. The approach integrates: (1) regional seismic hazard characterization with ground motion prediction and spatial correlation models; (2) component-level damage analysis using fragility functions and multi-state damage-functionality mappings; (3) system-level cascading impact evaluation through graph-based island detection and constrained optimal power flow analysis; and (4) retrofit planning via heuristic optimization to minimize expected annual functionality loss (EAFL) under budget constraints. Uncertainty is propagated throughout the framework using Monte Carlo simulation. The methodology is demonstrated on the IEEE 24-bus Reliability Test System, showcasing its ability to capture cascading failures, identify critical components, and generate effective retrofit strategies. Results underscore the potential of the framework as a scalable, data-informed decision-support tool for enhancing the seismic resilience of power infrastructure.
comment: 13 figures
Collision-Free Trajectory Planning and control of Robotic Manipulator using Energy-Based Artificial Potential Field (E-APF)
Robotic trajectory planning in dynamic and cluttered environments remains a critical challenge, particularly when striving for both time efficiency and motion smoothness under actuation constraints. Traditional path planner, such as Artificial Potential Field (APF), offer computational efficiency but suffer from local minima issue due to position-based potential field functions and oscillatory motion near the obstacles due to Newtonian mechanics. To address this limitation, an Energy-based Artificial Potential Field (APF) framework is proposed in this paper that integrates position and velocity-dependent potential functions. E-APF ensures dynamic adaptability and mitigates local minima, enabling uninterrupted progression toward the goal. The proposed framework integrates E-APF with a hybrid trajectory optimizer that jointly minimizes jerk and execution time under velocity and acceleration constraints, ensuring geometric smoothness and time efficiency. The entire framework is validated in simulation using the 7-degree-of-freedom Kinova Gen3 robotic manipulator. The results demonstrate collision-free, smooth, time-efficient, and oscillation-free trajectory in the presence of obstacles, highlighting the efficacy of the combined trajectory optimization and real-time obstacle avoidance approach. This work lays the foundation for future integration with reactive control strategies and physical hardware deployment in real-world manipulation tasks.
Human-in-the-Loop Simulation for Real-Time Exploration of HVAC Demand Flexibility
The increasing integration of renewable energy into the power grid has highlighted the critical importance of demand-side flexibility. Among flexible loads, heating, ventilation, and air-conditioning (HVAC) systems are particularly significant due to their high energy consumption and controllability. This study presents the development of an interactive simulation platform that integrates a high-fidelity simulation engine with a user-facing dashboard, specifically designed to explore and demonstrate the demand flexibility capacity of HVAC systems. Unlike conventional simulations, where users are passive observers of simulation results with no ability to intervene in the embedded control during the simulation, this platform transforms them into active participants. Users can override system default control settings, such as zone temperature setpoints and HVAC schedules, at any point during the simulation runtime to implement demand response strategies of their choice. This human-in-the-loop capability enables real-time interaction and allows users to observe the immediate impact of their actions, emulating the practical decision-making process of a building or system operator. By exploring different demand flexibility scenarios and system behaviour in a manner that reflects real-world operation, users gain a deeper understanding of demand flexibility and their impacts. This interactive experience builds confidence and supports more informed decision-making in the practical adoption of demand-side flexibility. This paper presents the architecture of the simulation platform, user-oriented dashboard design, and user case showcase. The introduced human-in-the-loop simulation paradigm offers a more intuitive and interactive means of engaging with grid-interactive building operations, extending beyond HVAC demand flexibility exploration.
Threshold dynamics in time-delay systems: polynomial $β$-control in a pressing process and connections to blow-up
This paper addresses a press control problem in straightening machines with small time delays due to system communication. To handle this, we propose a generalized $\beta$-control method, which replaces conventional linear velocity control with a polynomial of degree $\beta \ge 1$. The resulting model is a delay differential equation (DDE), for which we derive basic properties through nondimensionalization and analysis. Numerical experiments suggest the existence of a threshold initial velocity separating overshoot and non-overshoot dynamics, which we formulate as a conjecture. Based on this, we design a control algorithm under velocity constraints and confirm its effectiveness. We also highlight a connection between threshold behavior and finite-time blow-up in DDEs. This study provides a practical control strategy and contributes new insights into threshold dynamics and blow-up phenomena in delay systems.
comment: 11 pages, 9 figures
Applying the Spectral Method for Modeling Linear Filters: Butterworth, Linkwitz-Riley, and Chebyshev filters
This paper proposes a new technique for computer modeling linear filters based on the spectral form of mathematical description of linear systems. It assumes that input and output signals of the filter are represented as orthogonal expansions, while filters themselves are described by two-dimensional non-stationary transfer functions. This technique allows one to model the output signal in continuous time, and it is successfully tested on the Butterworth, Linkwitz-Riley, and Chebyshev filters with different orders.
An Analogy of Frequency Droop Control for Grid-forming Sources
In this paper, we present an analogy for a power system dominated by grid-forming (GFM) sources that proves to be a powerful visualization tool for analyses of power flow, frequency regulation, and power dispatch. Frequency droop characteristics of a typical GFM source are exactly reflected by an ordinary model of water vessels. The frequency is represented by visible water levels while the droop slope is reified by the vessel sizes. This proposed analogy allows us to use the intuitive water-flow phenomenon to explain the abstract power-flow problems. The grid integration of renewables via GFM inverters is interestingly simulated by a vessel connected to an infinite water tank. This paper also provides a means for demonstrating issues to audiences with little or no background in power systems. Finally, the proposal is verified by simulation results.
comment: Accepted by IEEE PESGM 2025
On Irreversibility and Stochastic Systems: Part One
We attempt to characterize irreversibility of a dynamical system from the existence of different forward and backward mathematical representations depending on the direction of the time arrow. Such different representations have been studied intensively and are shown to exist for stochastic diffusion models. In this setting one has however to face the preliminary justification of stochastic description for physical systems which are described by classical mechanics as inherently deterministic and conservative. In part one of this paper we first address this modeling problem for linear systems in a deterministic context. We show that forward-backward representations can also describe conservative finite dimensional deterministic systems when they are coupled to an infinite-dimensional conservative heat bath. A novel key observation is that the heat bath acts on the finite-dimensional conservative system by {\em state-feedback} and can shift its eigenvalues to make the system dissipative but may also generate another totally unstable model which naturally evolves backward in time. In the second part, we address the stochastic description of these two representations. Under a natural family of invariant measures the heat bath can be shown to induce a white noise input acting on the system making it look like a true dissipative diffusion.
Heisenberg-limited calibration of entangling gates with robust phase estimation
The calibration of high-quality two-qubit entangling gates is an essential component in engineering large-scale, fault-tolerant quantum computers. However, many standard calibration techniques are based on randomized circuits that are only quadratically sensitive to calibration errors. As a result, these approaches are inefficient, requiring many experimental shots to achieve acceptable performance. In this work, we demonstrate that robust phase estimation can enable high-precision, Heisenberg-limited estimates of coherent errors in multi-qubit gates. Equipped with an efficient estimator, the calibration problem may be reduced to a simple optimization loop that minimizes the estimated coherent error. We experimentally demonstrate our calibration protocols by improving the operation of a two-qubit controlled-Z gate on a superconducting processor, and we validate the improved performance with gate set tomography. Our methods are applicable to gates in other quantum hardware platforms such as ion traps and neutral atoms, and on other multi-qubit gates, such as CNOT or iSWAP.
comment: 11 pages, 4 figures
Mean--Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
We study continuous-time mean--variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors that are also diffusion processes, yet the coefficients of these processes are unknown. Based on the recently developed reinforcement learning (RL) theory for diffusion processes, we present a general data-driven RL algorithm that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients. For multi-stock Black--Scholes markets without factors, we further devise a baseline algorithm and prove its performance guarantee by deriving a sublinear regret bound in terms of the Sharpe ratio. For performance enhancement and practical implementation, we modify the baseline algorithm and carry out an extensive empirical study to compare its performance, in terms of a host of common metrics, with a large number of widely employed portfolio allocation strategies on S\&P 500 constituents. The results demonstrate that the proposed continuous-time RL strategy is consistently among the best, especially in a volatile bear market, and decisively outperforms the model-based continuous-time counterparts by significant margins.
comment: 82 pages, 6 figures, 7 tables
Mitigating Traffic Oscillations in Mixed Traffic Flow with Scalable Deep Koopman Predictive Control
Mitigating traffic oscillations in mixed flows of connected automated vehicles (CAVs) and human-driven vehicles (HDVs) is critical for enhancing traffic stability. A key challenge lies in modeling the nonlinear, heterogeneous behaviors of HDVs within computationally tractable predictive control frameworks. This study proposes an adaptive deep Koopman predictive control framework (AdapKoopPC) to address this issue. The framework features a novel deep Koopman network, AdapKoopnet, which represents complex HDV car-following dynamics as a linear system in a high-dimensional space by adaptively learning from naturalistic data. This learned linear representation is then embedded into a Model Predictive Control (MPC) scheme, enabling real-time, scalable, and optimal control of CAVs. We validate our framework using the HighD dataset and extensive numerical simulations. Results demonstrate that AdapKoopnet achieves superior trajectory prediction accuracy over baseline models. Furthermore, the complete AdapKoopPC controller significantly dampens traffic oscillations with lower computational cost, exhibiting strong performance even at low CAV penetration rates. The proposed framework offers a scalable and data-driven solution for enhancing stability in realistic mixed traffic environments. The code is made publicly available.
Distributed Optimal Coverage Control in Multi-agent Systems: Known and Unknown Environments
This paper introduces a novel approach to solve the coverage optimization problem in multi-agent systems. The proposed technique offers an optimal solution with a lower cost with respect to conventional Voronoi-based techniques by effectively handling the issue of agents remaining stationary in regions void of information using a ranking function. The proposed approach leverages a novel cost function for optimizing the agents coverage and the cost function eventually aligns with the conventional Voronoi-based cost function. Theoretical analyses are conducted to assure the asymptotic convergence of agents towards the optimal configuration. A distinguishing feature of this approach lies in its departure from the reliance on geometric methods that are characteristic of Voronoi-based approaches; hence can be implemented more simply. Remarkably, the technique is adaptive and applicable to various environments with both known and unknown information distributions. Lastly, the efficacy of the proposed method is demonstrated through simulations, and the obtained results are compared with those of Voronoi-based algorithms.
Functional Controllability, Functional Stabilizability, and the Generalized Separation Principle
This paper introduces the new concepts of Functional Controllability and Functional Stabilizability, and establishes their duality with Functional Observability and Functional Detectability, respectively. A Generalized Separation Principle is presented, under which the classical Separation Principle emerges as a special case. Conditions for the existence of functional controllers of a specified order are derived. Notably, the proposed design framework does not require full controllability. In addition, a functional observer-based controller design is developed for systems that may be both uncontrollable and unobservable. The results presented extend and generalize the classical full-state observer based feedback control paradigm.
comment: Under review in a journal
Multiagent Systems
K-Dense Analyst: Towards Fully Automated Scientific Analysis
The complexity of modern bioinformatics analysis has created a critical gap between data generation and developing scientific insights. While large language models (LLMs) have shown promise in scientific reasoning, they remain fundamentally limited when dealing with real-world analytical workflows that demand iterative computation, tool integration and rigorous validation. We introduce K-Dense Analyst, a hierarchical multi-agent system that achieves autonomous bioinformatics analysis through a dual-loop architecture. K-Dense Analyst, part of the broader K-Dense platform, couples planning with validated execution using specialized agents to decompose complex objectives into executable, verifiable tasks within secure computational environments. On BixBench, a comprehensive benchmark for open-ended biological analysis, K-Dense Analyst achieves 29.2% accuracy, surpassing the best-performing language model (GPT-5) by 6.3 percentage points, representing nearly 27% improvement over what is widely considered the most powerful LLM available. Remarkably, K-Dense Analyst achieves this performance using Gemini 2.5 Pro, which attains only 18.3% accuracy when used directly, demonstrating that our architectural innovations unlock capabilities far beyond the underlying model's baseline performance. Our insights demonstrate that autonomous scientific reasoning requires more than enhanced language models, it demands purpose-built systems that can bridge the gap between high-level scientific objectives and low-level computational execution. These results represent a significant advance toward fully autonomous computational biologists capable of accelerating discovery across the life sciences.
Narrative Memory in Machines: Multi-Agent Arc Extraction in Serialized TV
Serialized television narratives present significant analytical challenges due to their complex, temporally distributed storylines that necessitate sophisticated information management. This paper introduces a multi-agent system (MAS) designed to extract and analyze narrative arcs by implementing principles of computational memory architectures. The system conceptualizes narrative understanding through analogues of human memory: Large Language Models (LLMs) provide a form of semantic memory for general narrative patterns, while a vector database stores specific arc progressions as episodic memories. A multi-agent workflow simulates working memory processes to integrate these information types. Tested on the first season of Grey's Anatomy (ABC 2005-), the MAS identifies three arc types: Anthology (self-contained), Soap (relationship-focused), and Genre-Specific. These arcs and their episodic developments are stored in a vector database, facilitating structured analysis and semantic comparison. To bridge automation with critical interpretation, a graphical interface enables human oversight and refinement of the system's narrative memory. While demonstrating strong performance in identifying Anthology Arcs and character entities, the system's reliance on textual paratexts (episode summaries) revealed limitations in discerning overlapping arcs and opaque dynamics, underscoring the challenges in computational memory consolidation versus human holistic understanding. This memory-centric approach highlights the potential of combining AI-driven memory processing with human expertise. Beyond television, it offers promise for serialized written formats where narrative is entirely text-based. Future work will focus on integrating multimodal inputs to enrich episodic memory, refining memory integration mechanisms within the MAS, and expanding testing across diverse genres.
Conformal Set-based Human-AI Complementarity with Multiple Experts AAMAS 2025
Decision support systems are designed to assist human experts in classification tasks by providing conformal prediction sets derived from a pre-trained model. This human-AI collaboration has demonstrated enhanced classification performance compared to using either the model or the expert independently. In this study, we focus on the selection of instance-specific experts from a pool of multiple human experts, contrasting it with existing research that typically focuses on single-expert scenarios. We characterize the conditions under which multiple experts can benefit from the conformal sets. With the insight that only certain experts may be relevant for each instance, we explore the problem of subset selection and introduce a greedy algorithm that utilizes conformal sets to identify the subset of expert predictions that will be used in classifying an instance. This approach is shown to yield better performance compared to naive methods for human subset selection. Based on real expert predictions from the CIFAR-10H and ImageNet-16H datasets, our simulation study indicates that our proposed greedy algorithm achieves near-optimal subsets, resulting in improved classification performance among multiple experts.
comment: Accepted at AAMAS 2025. Code available at: https://github.com/paathelb/conformal_hai_multiple
Energy Efficient Task Offloading in UAV-Enabled MEC Using a Fully Decentralized Deep Reinforcement Learning Approach
Unmanned aerial vehicles (UAVs) have been recently utilized in multi-access edge computing (MEC) as edge servers. It is desirable to design UAVs' trajectories and user to UAV assignments to ensure satisfactory service to the users and energy efficient operation simultaneously. The posed optimization problem is challenging to solve because: (i) The formulated problem is non-convex, (ii) Due to the mobility of ground users, their future positions and channel gains are not known in advance, (iii) Local UAVs' observations should be communicated to a central entity that solves the optimization problem. The (semi-) centralized processing leads to communication overhead, communication/processing bottlenecks, lack of flexibility and scalability, and loss of robustness to system failures. To simultaneously address all these limitations, we advocate a fully decentralized setup with no centralized entity. Each UAV obtains its local observation and then communicates with its immediate neighbors only. After sharing information with neighbors, each UAV determines its next position via a locally run deep reinforcement learning (DRL) algorithm. None of the UAVs need to know the global communication graph. Two main components of our proposed solution are (i) Graph attention layers (GAT), and (ii) Experience and parameter sharing proximal policy optimization (EPS-PPO). Our proposed approach eliminates all the limitations of semi-centralized MADRL methods such as MAPPO and MA deep deterministic policy gradient (MADDPG), while guaranteeing a better performance than independent local DRLs such as in IPPO. Numerical results reveal notable performance gains in several different criteria compared to the existing MADDPG algorithm, demonstrating the potential for offering a better performance, while utilizing local communications only.
SEVADE: Self-Evolving Multi-Agent Analysis with Decoupled Evaluation for Hallucination-Resistant Irony Detection
Sarcasm detection is a crucial yet challenging Natural Language Processing task. Existing Large Language Model methods are often limited by single-perspective analysis, static reasoning pathways, and a susceptibility to hallucination when processing complex ironic rhetoric, which impacts their accuracy and reliability. To address these challenges, we propose **SEVADE**, a novel **S**elf-**Ev**olving multi-agent **A**nalysis framework with **D**ecoupled **E**valuation for hallucination-resistant sarcasm detection. The core of our framework is a Dynamic Agentive Reasoning Engine (DARE), which utilizes a team of specialized agents grounded in linguistic theory to perform a multifaceted deconstruction of the text and generate a structured reasoning chain. Subsequently, a separate lightweight rationale adjudicator (RA) performs the final classification based solely on this reasoning chain. This decoupled architecture is designed to mitigate the risk of hallucination by separating complex reasoning from the final judgment. Extensive experiments on four benchmark datasets demonstrate that our framework achieves state-of-the-art performance, with average improvements of **6.75%** in Accuracy and **6.29%** in Macro-F1 score.
PANAMA: A Network-Aware MARL Framework for Multi-Agent Path Finding in Digital Twin Ecosystems
Digital Twins (DTs) are transforming industries through advanced data processing and analysis, positioning the world of DTs, Digital World, as a cornerstone of nextgeneration technologies including embodied AI. As robotics and automated systems scale, efficient data-sharing frameworks and robust algorithms become critical. We explore the pivotal role of data handling in next-gen networks, focusing on dynamics between application and network providers (AP/NP) in DT ecosystems. We introduce PANAMA, a novel algorithm with Priority Asymmetry for Network Aware Multi-agent Reinforcement Learning (MARL) based multi-agent path finding (MAPF). By adopting a Centralized Training with Decentralized Execution (CTDE) framework and asynchronous actor-learner architectures, PANAMA accelerates training while enabling autonomous task execution by embodied AI. Our approach demonstrates superior pathfinding performance in accuracy, speed, and scalability compared to existing benchmarks. Through simulations, we highlight optimized data-sharing strategies for scalable, automated systems, ensuring resilience in complex, real-world environments. PANAMA bridges the gap between network-aware decision-making and robust multi-agent coordination, advancing the synergy between DTs, wireless networks, and AI-driven automation.
AI-Generated Compromises for Coalition Formation
The challenge of finding compromises between agent proposals is fundamental to AI subfields such as argumentation, mediation, and negotiation. Building on this tradition, Elkind et al. (2021) introduced a process for coalition formation that seeks majority-supported proposals preferable to the status quo, using a metric space where each agent has an ideal point. A crucial step in this process involves identifying compromise proposals around which agent coalitions can unite. How to effectively find such compromise proposals remains an open question. We address this gap by formalizing a model that incorporates agent bounded rationality and uncertainty, and by developing AI methods to generate compromise proposals. We focus on the domain of collaborative document writing, such as the democratic drafting of a community constitution. Our approach uses natural language processing techniques and large language models to induce a semantic metric space over text. Based on this space, we design algorithms to suggest compromise points likely to receive broad support. To evaluate our methods, we simulate coalition formation processes and show that AI can facilitate large-scale democratic text editing, a domain where traditional tools are limited.
A Differentiated Reward Method for Reinforcement Learning based Multi-Vehicle Cooperative Decision-Making Algorithms
Reinforcement learning (RL) shows great potential for optimizing multi-vehicle cooperative driving strategies through the state-action-reward feedback loop, but it still faces challenges such as low sample efficiency. This paper proposes a differentiated reward method based on steady-state transition systems, which incorporates state transition gradient information into the reward design by analyzing traffic flow characteristics, aiming to optimize action selection and policy learning in multi-vehicle cooperative decision-making. The performance of the proposed method is validated in RL algorithms such as MAPPO, MADQN, and QMIX under varying autonomous vehicle penetration. The results show that the differentiated reward method significantly accelerates training convergence and outperforms centering reward and others in terms of traffic efficiency, safety, and action rationality. Additionally, the method demonstrates strong scalability and environmental adaptability, providing a novel approach for multi-agent cooperative decision-making in complex traffic scenarios.
comment: 10 pages, 3 figures
Robotics
DexFruit: Dexterous Manipulation and Gaussian Splatting Inspection of Fruit
DexFruit is a robotic manipulation framework that enables gentle, autonomous handling of fragile fruit and precise evaluation of damage. Many fruits are fragile and prone to bruising, thus requiring humans to manually harvest them with care. In this work, we demonstrate by using optical tactile sensing, autonomous manipulation of fruit with minimal damage can be achieved. We show that our tactile informed diffusion policies outperform baselines in both reduced bruising and pick-and-place success rate across three fruits: strawberries, tomatoes, and blackberries. In addition, we introduce FruitSplat, a novel technique to represent and quantify visual damage in high-resolution 3D representation via 3D Gaussian Splatting (3DGS). Existing metrics for measuring damage lack quantitative rigor or require expensive equipment. With FruitSplat, we distill a 2D strawberry mask as well as a 2D bruise segmentation mask into the 3DGS representation. Furthermore, this representation is modular and general, compatible with any relevant 2D model. Overall, we demonstrate a 92% grasping policy success rate, up to a 20% reduction in visual bruising, and up to an 31% improvement in grasp success rate on challenging fruit compared to our baselines across our three tested fruits. We rigorously evaluate this result with over 630 trials. Please checkout our website at https://dex-fruit.github.io .
comment: 8 pages, 5 figures
ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting ICCV 2025
We introduce ForeSight, a novel joint detection and forecasting framework for vision-based 3D perception in autonomous vehicles. Traditional approaches treat detection and forecasting as separate sequential tasks, limiting their ability to leverage temporal cues. ForeSight addresses this limitation with a multi-task streaming and bidirectional learning approach, allowing detection and forecasting to share query memory and propagate information seamlessly. The forecast-aware detection transformer enhances spatial reasoning by integrating trajectory predictions from a multiple hypothesis forecast memory queue, while the streaming forecast transformer improves temporal consistency using past forecasts and refined detections. Unlike tracking-based methods, ForeSight eliminates the need for explicit object association, reducing error propagation with a tracking-free model that efficiently scales across multi-frame sequences. Experiments on the nuScenes dataset show that ForeSight achieves state-of-the-art performance, achieving an EPA of 54.9%, surpassing previous methods by 9.3%, while also attaining the best mAP and minADE among multi-view detection and forecasting models.
comment: Accepted to ICCV 2025
An Evolutionary Game-Theoretic Merging Decision-Making Considering Social Acceptance for Autonomous Driving
Highway on-ramp merging is of great challenge for autonomous vehicles (AVs), since they have to proactively interact with surrounding vehicles to enter the main road safely within limited time. However, existing decision-making algorithms fail to adequately address dynamic complexities and social acceptance of AVs, leading to suboptimal or unsafe merging decisions. To address this, we propose an evolutionary game-theoretic (EGT) merging decision-making framework, grounded in the bounded rationality of human drivers, which dynamically balances the benefits of both AVs and main-road vehicles (MVs). We formulate the cut-in decision-making process as an EGT problem with a multi-objective payoff function that reflects human-like driving preferences. By solving the replicator dynamic equation for the evolutionarily stable strategy (ESS), the optimal cut-in timing is derived, balancing efficiency, comfort, and safety for both AVs and MVs. A real-time driving style estimation algorithm is proposed to adjust the game payoff function online by observing the immediate reactions of MVs. Empirical results demonstrate that we improve the efficiency, comfort and safety of both AVs and MVs compared with existing game-theoretic and traditional planning approaches across multi-object metrics.
Model Predictive Control for Crowd Navigation via Learning-Based Trajectory Prediction
Safe navigation in pedestrian-rich environments remains a key challenge for autonomous robots. This work evaluates the integration of a deep learning-based Social-Implicit (SI) pedestrian trajectory predictor within a Model Predictive Control (MPC) framework on the physical Continental Corriere robot. Tested across varied pedestrian densities, the SI-MPC system is compared to a traditional Constant Velocity (CV) model in both open-loop prediction and closed-loop navigation. Results show that SI improves trajectory prediction - reducing errors by up to 76% in low-density settings - and enhances safety and motion smoothness in crowded scenes. Moreover, real-world deployment reveals discrepancies between open-loop metrics and closed-loop performance, as the SI model yields broader, more cautious predictions. These findings emphasize the importance of system-level evaluation and highlight the SI-MPC framework's promise for safer, more adaptive navigation in dynamic, human-populated environments.
From Data to Safe Mobile Robot Navigation: An Efficient and Modular Robust MPC Design Pipeline
Model predictive control (MPC) is a powerful strategy for planning and control in autonomous mobile robot navigation. However, ensuring safety in real-world deployments remains challenging due to the presence of disturbances and measurement noise. Existing approaches often rely on idealized assumptions, neglect the impact of noisy measurements, and simply heuristically guess unrealistic bounds. In this work, we present an efficient and modular robust MPC design pipeline that systematically addresses these limitations. The pipeline consists of an iterative procedure that leverages closed-loop experimental data to estimate disturbance bounds and synthesize a robust output-feedback MPC scheme. We provide the pipeline in the form of deterministic and reproducible code to synthesize the robust output-feedback MPC from data. We empirically demonstrate robust constraint satisfaction and recursive feasibility in quadrotor simulations using Gazebo.
comment: 8 pages, 5 figures
$\mathcal{P}^3$: Toward Versatile Embodied Agents
Embodied agents have shown promising generalization capabilities across diverse physical environments, making them essential for a wide range of real-world applications. However, building versatile embodied agents poses critical challenges due to three key issues: dynamic environment perception, open-ended tool usage, and complex multi-task planning. Most previous works rely solely on feedback from tool agents to perceive environmental changes and task status, which limits adaptability to real-time dynamics, causes error accumulation, and restricts tool flexibility. Furthermore, multi-task scheduling has received limited attention, primarily due to the inherent complexity of managing task dependencies and balancing competing priorities in dynamic and complex environments. To overcome these challenges, we introduce $\mathcal{P}^3$, a unified framework that integrates real-time perception and dynamic scheduling. Specifically, $\mathcal{P}^3$ enables 1) \textbf Perceive relevant task information actively from the environment, 2) \textbf Plug and utilize any tool without feedback requirement, and 3) \textbf Plan multi-task execution based on prioritizing urgent tasks and dynamically adjusting task order based on dependencies. Extensive real-world experiments show that our approach bridges the gap between benchmarks and practical deployment, delivering highly transferable, general-purpose embodied agents. Code and data will be released soon.
comment: 16 pages, 8 figures
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
EGS-SLAM: RGB-D Gaussian Splatting SLAM with Events
Gaussian Splatting SLAM (GS-SLAM) offers a notable improvement over traditional SLAM methods, enabling photorealistic 3D reconstruction that conventional approaches often struggle to achieve. However, existing GS-SLAM systems perform poorly under persistent and severe motion blur commonly encountered in real-world scenarios, leading to significantly degraded tracking accuracy and compromised 3D reconstruction quality. To address this limitation, we propose EGS-SLAM, a novel GS-SLAM framework that fuses event data with RGB-D inputs to simultaneously reduce motion blur in images and compensate for the sparse and discrete nature of event streams, enabling robust tracking and high-fidelity 3D Gaussian Splatting reconstruction. Specifically, our system explicitly models the camera's continuous trajectory during exposure, supporting event- and blur-aware tracking and mapping on a unified 3D Gaussian Splatting scene. Furthermore, we introduce a learnable camera response function to align the dynamic ranges of events and images, along with a no-event loss to suppress ringing artifacts during reconstruction. We validate our approach on a new dataset comprising synthetic and real-world sequences with significant motion blur. Extensive experimental results demonstrate that EGS-SLAM consistently outperforms existing GS-SLAM systems in both trajectory accuracy and photorealistic 3D Gaussian Splatting reconstruction. The source code will be available at https://github.com/Chensiyu00/EGS-SLAM.
comment: Accepted by IEEE RAL
Imaginative World Modeling with Scene Graphs for Embodied Agent Navigation
Semantic navigation requires an agent to navigate toward a specified target in an unseen environment. Employing an imaginative navigation strategy that predicts future scenes before taking action, can empower the agent to find target faster. Inspired by this idea, we propose SGImagineNav, a novel imaginative navigation framework that leverages symbolic world modeling to proactively build a global environmental representation. SGImagineNav maintains an evolving hierarchical scene graphs and uses large language models to predict and explore unseen parts of the environment. While existing methods solely relying on past observations, this imaginative scene graph provides richer semantic context, enabling the agent to proactively estimate target locations. Building upon this, SGImagineNav adopts an adaptive navigation strategy that exploits semantic shortcuts when promising and explores unknown areas otherwise to gather additional context. This strategy continuously expands the known environment and accumulates valuable semantic contexts, ultimately guiding the agent toward the target. SGImagineNav is evaluated in both real-world scenarios and simulation benchmarks. SGImagineNav consistently outperforms previous methods, improving success rate to 65.4 and 66.8 on HM3D and HSSD, and demonstrating cross-floor and cross-room navigation in real-world environments, underscoring its effectiveness and generalizability.
comment: 23 pages
Manipulator for people with limited abilities
The topic of this final qualification work was chosen due to the importance of developing robotic systems designed to assist people with disabilities. Advances in robotics and automation technologies have opened up new prospects for creating devices that can significantly improve the quality of life for these people. In this context, designing a robotic hand with a control system adapted to the needs of people with disabilities is a major scientific and practical challenge. This work addresses the problem of developing and manufacturing a four-degree-of-freedom robotic hand suitable for practical manipulation. Addressing this issue requires a comprehensive approach, encompassing the design of the hand's mechanical structure, the development of its control system, and its integration with a technical vision system and software based on the Robot Operating System (ROS).
comment: 105 pages, in Russian language
Vibration-Based Energy Metric for Restoring Needle Alignment in Autonomous Robotic Ultrasound
Precise needle alignment is essential for percutaneous needle insertion in robotic ultrasound-guided procedures. However, inherent challenges such as speckle noise, needle-like artifacts, and low image resolution make robust needle detection difficult, particularly when visibility is reduced or lost. In this paper, we propose a method to restore needle alignment when the ultrasound imaging plane and the needle insertion plane are misaligned. Unlike many existing approaches that rely heavily on needle visibility in ultrasound images, our method uses a more robust feature by periodically vibrating the needle using a mechanical system. Specifically, we propose a vibration-based energy metric that remains effective even when the needle is fully out of plane. Using this metric, we develop a control strategy to reposition the ultrasound probe in response to misalignments between the imaging plane and the needle insertion plane in both translation and rotation. Experiments conducted on ex-vivo porcine tissue samples using a dual-arm robotic ultrasound-guided needle insertion system demonstrate the effectiveness of the proposed approach. The experimental results show the translational error of 0.41$\pm$0.27 mm and the rotational error of 0.51$\pm$0.19 degrees.
D3P: Dynamic Denoising Diffusion Policy via Reinforcement Learning
Diffusion policies excel at learning complex action distributions for robotic visuomotor tasks, yet their iterative denoising process poses a major bottleneck for real-time deployment. Existing acceleration methods apply a fixed number of denoising steps per action, implicitly treating all actions as equally important. However, our experiments reveal that robotic tasks often contain a mix of \emph{crucial} and \emph{routine} actions, which differ in their impact on task success. Motivated by this finding, we propose \textbf{D}ynamic \textbf{D}enoising \textbf{D}iffusion \textbf{P}olicy \textbf{(D3P)}, a diffusion-based policy that adaptively allocates denoising steps across actions at test time. D3P uses a lightweight, state-aware adaptor to allocate the optimal number of denoising steps for each action. We jointly optimize the adaptor and base diffusion policy via reinforcement learning to balance task performance and inference efficiency. On simulated tasks, D3P achieves an averaged 2.2$\times$ inference speed-up over baselines without degrading success. Furthermore, we demonstrate D3P's effectiveness on a physical robot, achieving a 1.9$\times$ acceleration over the baseline.
Learning a Vision-Based Footstep Planner for Hierarchical Walking Control
Bipedal robots demonstrate potential in navigating challenging terrains through dynamic ground contact. However, current frameworks often depend solely on proprioception or use manually designed visual pipelines, which are fragile in real-world settings and complicate real-time footstep planning in unstructured environments. To address this problem, we present a vision-based hierarchical control framework that integrates a reinforcement learning high-level footstep planner, which generates footstep commands based on a local elevation map, with a low-level Operational Space Controller that tracks the generated trajectories. We utilize the Angular Momentum Linear Inverted Pendulum model to construct a low-dimensional state representation to capture an informative encoding of the dynamics while reducing complexity. We evaluate our method across different terrain conditions using the underactuated bipedal robot Cassie and investigate the capabilities and challenges of our approach through simulation and hardware experiments.
comment: 8 pages, 8 figures, accepted to 2025 IEEE-RAS 24th International Conference on Humanoid Robots
PANAMA: A Network-Aware MARL Framework for Multi-Agent Path Finding in Digital Twin Ecosystems
Digital Twins (DTs) are transforming industries through advanced data processing and analysis, positioning the world of DTs, Digital World, as a cornerstone of nextgeneration technologies including embodied AI. As robotics and automated systems scale, efficient data-sharing frameworks and robust algorithms become critical. We explore the pivotal role of data handling in next-gen networks, focusing on dynamics between application and network providers (AP/NP) in DT ecosystems. We introduce PANAMA, a novel algorithm with Priority Asymmetry for Network Aware Multi-agent Reinforcement Learning (MARL) based multi-agent path finding (MAPF). By adopting a Centralized Training with Decentralized Execution (CTDE) framework and asynchronous actor-learner architectures, PANAMA accelerates training while enabling autonomous task execution by embodied AI. Our approach demonstrates superior pathfinding performance in accuracy, speed, and scalability compared to existing benchmarks. Through simulations, we highlight optimized data-sharing strategies for scalable, automated systems, ensuring resilience in complex, real-world environments. PANAMA bridges the gap between network-aware decision-making and robust multi-agent coordination, advancing the synergy between DTs, wireless networks, and AI-driven automation.
AORRTC: Almost-Surely Asymptotically Optimal Planning with RRT-Connect
Finding high-quality solutions quickly is an important objective in motion planning. This is especially true for high-degree-of-freedom robots. Satisficing planners have traditionally found feasible solutions quickly but provide no guarantees on their optimality, while almost-surely asymptotically optimal (a.s.a.o.) planners have probabilistic guarantees on their convergence towards an optimal solution but are more computationally expensive. This paper uses the AO-x meta-algorithm to extend the satisficing RRT-Connect planner to optimal planning. The resulting Asymptotically Optimal RRT-Connect (AORRTC) finds initial solutions in similar times as RRT-Connect and uses any additional planning time to converge towards the optimal solution in an anytime manner. It is proven to be probabilistically complete and a.s.a.o. AORRTC was tested with the Panda (7 DoF) and Fetch (8 DoF) robotic arms on the MotionBenchMaker dataset. These experiments show that AORRTC finds initial solutions as fast as RRT-Connect and faster than the tested state-of-the-art a.s.a.o. algorithms while converging to better solutions faster. AORRTC finds solutions to difficult high-DoF planning problems in milliseconds where the other a.s.a.o. planners could not consistently find solutions in seconds. This performance was demonstrated both with and without single instruction/multiple data (SIMD) acceleration.
comment: In revision for IEEE Robotics and Automation Letters (RA-L). Manuscript #25-1915. 8 pages, 4 figures, 1 table. A video of AORRTC can be found at https://www.youtube.com/watch?v=j1itxP3KuiM . Information on the implementation of AORRTC is available at https://robotic-esp.com/code/aorrtc/
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a model predictive control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing nonlinear model predictive control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains
Humanoid robots promise transformative capabilities for industrial and service applications. While recent advances in Reinforcement Learning (RL) yield impressive results in locomotion, manipulation, and navigation, the proposed methods typically require enormous simulation samples to account for real-world variability. This work proposes a novel one-stage training framework-Learn to Teach (L2T)-which unifies teacher and student policy learning. Our approach recycles simulator samples and synchronizes the learning trajectories through shared dynamics, significantly reducing sample complexities and training time while achieving state-of-the-art performance. Furthermore, we validate the RL variant (L2T-RL) through extensive simulations and hardware tests on the Digit robot, demonstrating zero-shot sim-to-real transfer and robust performance over 12+ challenging terrains without depth estimation modules.
comment: Accepted to IEEE Robotics and Automation Letters (RA-L)
CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
Diffusion Policy (DP) enables robots to learn complex behaviors by imitating expert demonstrations through action diffusion. However, in practical applications, hardware limitations often degrade data quality, while real-time constraints restrict model inference to instantaneous state and scene observations. These limitations seriously reduce the efficacy of learning from expert demonstrations, resulting in failures in object localization, grasp planning, and long-horizon task execution. To address these challenges, we propose Causal Diffusion Policy (CDP), a novel transformer-based diffusion model that enhances action prediction by conditioning on historical action sequences, thereby enabling more coherent and context-aware visuomotor policy learning. To further mitigate the computational cost associated with autoregressive inference, a caching mechanism is also introduced to store attention key-value pairs from previous timesteps, substantially reducing redundant computations during execution. Extensive experiments in both simulated and real-world environments, spanning diverse 2D and 3D manipulation tasks, demonstrate that CDP uniquely leverages historical action sequences to achieve significantly higher accuracy than existing methods. Moreover, even when faced with degraded input observation quality, CDP maintains remarkable precision by reasoning through temporal continuity, which highlights its practical robustness for robotic control under realistic, imperfect conditions.
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control
Enabling robots to perform diverse tasks across varied environments is a central challenge in robot learning. While vision-language-action (VLA) models have shown promise for generalizable robot skills, realizing their full potential requires addressing limitations in action representation and efficient training. Current VLA models often focus on scaling the vision-language model (VLM) component, while the action space representation remains a critical bottleneck. This paper introduces DexVLA, a novel framework designed to enhance the efficiency and generalization capabilities of VLAs for complex, long-horizon tasks across diverse robot embodiments. DexVLA features a novel diffusion-based action expert, scaled to one billion parameters, designed for cross-embodiment learning. A novel embodiment curriculum learning strategy facilitates efficient training: (1) pre-training the diffusion expert that is separable from the VLA on cross-embodiment data, (2) aligning the VLA model to specific embodiments, and (3) post-training for rapid adaptation to new tasks. We conduct comprehensive experiments across multiple embodiments, including single-arm, bimanual, and dexterous hand, demonstrating DexVLA's adaptability to challenging tasks without task-specific adaptation, its ability to learn dexterous skills on novel embodiments with limited data, and its capacity to complete complex, long-horizon tasks using only direct language prompting, such as laundry folding. In all settings, our method demonstrates superior performance compared to state-of-the-art models like Octo, OpenVLA, and Diffusion Policy.
comment: The webpage is at https://dex-vla.github.io/. DexVLA is accepted by CoRL 2025
A Differentiated Reward Method for Reinforcement Learning based Multi-Vehicle Cooperative Decision-Making Algorithms
Reinforcement learning (RL) shows great potential for optimizing multi-vehicle cooperative driving strategies through the state-action-reward feedback loop, but it still faces challenges such as low sample efficiency. This paper proposes a differentiated reward method based on steady-state transition systems, which incorporates state transition gradient information into the reward design by analyzing traffic flow characteristics, aiming to optimize action selection and policy learning in multi-vehicle cooperative decision-making. The performance of the proposed method is validated in RL algorithms such as MAPPO, MADQN, and QMIX under varying autonomous vehicle penetration. The results show that the differentiated reward method significantly accelerates training convergence and outperforms centering reward and others in terms of traffic efficiency, safety, and action rationality. Additionally, the method demonstrates strong scalability and environmental adaptability, providing a novel approach for multi-agent cooperative decision-making in complex traffic scenarios.
comment: 10 pages, 3 figures
Exploring Video-Based Driver Activity Recognition under Noisy Labels
As an open research topic in the field of deep learning, learning with noisy labels has attracted much attention and grown rapidly over the past ten years. Learning with label noise is crucial for driver distraction behavior recognition, as real-world video data often contains mislabeled samples, impacting model reliability and performance. However, label noise learning is barely explored in the driver activity recognition field. In this paper, we propose the first label noise learning approach for the driver activity recognition task. Based on the cluster assumption, we initially enable the model to learn clustering-friendly low-dimensional representations from given videos and assign the resultant embeddings into clusters. We subsequently perform co-refinement within each cluster to smooth the classifier outputs. Furthermore, we propose a flexible sample selection strategy that combines two selection criteria without relying on any hyperparameters to filter clean samples from the training dataset. We also incorporate a self-adaptive parameter into the sample selection process to enforce balancing across classes. A comprehensive variety of experiments on the public Drive&Act dataset for all granularity levels demonstrates the superior performance of our method in comparison with other label-denoising methods derived from the image classification field. The source code is available at https://github.com/ilonafan/DAR-noisy-labels.
comment: Accepted to SMC 2025. The source code is available at https://github.com/ilonafan/DAR-noisy-labels
UniCalib: Targetless LiDAR-Camera Calibration via Probabilistic Flow on Unified Depth Representations
Precise LiDAR-camera calibration is crucial for integrating these two sensors into robotic systems to achieve robust perception. In applications like autonomous driving, online targetless calibration enables a prompt sensor misalignment correction from mechanical vibrations without extra targets. However, existing methods exhibit limitations in effectively extracting consistent features from LiDAR and camera data and fail to prioritize salient regions, compromising cross-modal alignment robustness. To address these issues, we propose DF-Calib, a LiDAR-camera calibration method that reformulates calibration as an intra-modality depth flow estimation problem. DF-Calib estimates a dense depth map from the camera image and completes the sparse LiDAR projected depth map, using a shared feature encoder to extract consistent depth-to-depth features, effectively bridging the 2D-3D cross-modal gap. Additionally, we introduce a reliability map to prioritize valid pixels and propose a perceptually weighted sparse flow loss to enhance depth flow estimation. Experimental results across multiple datasets validate its accuracy and generalization,with DF-Calib achieving a mean translation error of 0.635cm and rotation error of 0.045 degrees on the KITTI dataset.
comment: 8 pages,5 figures
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving
Understanding the short-term motion of vulnerable road users (VRUs) like pedestrians and cyclists is critical for safe autonomous driving, especially in urban scenarios with ambiguous or high-risk behaviors. While vision-language models (VLMs) have enabled open-vocabulary perception, their utility for fine-grained intent reasoning remains underexplored. Notably, no existing benchmark evaluates multi-class intent prediction in safety-critical situations, To address this gap, we introduce DRAMA-X, a fine-grained benchmark constructed from the DRAMA dataset via an automated annotation pipeline. DRAMA-X contains 5,686 accident-prone frames labeled with object bounding boxes, a nine-class directional intent taxonomy, binary risk scores, expert-generated action suggestions for the ego vehicle, and descriptive motion summaries. These annotations enable a structured evaluation of four interrelated tasks central to autonomous decision-making: object detection, intent prediction, risk assessment, and action suggestion. As a reference baseline, we propose SGG-Intent, a lightweight, training-free framework that mirrors the ego vehicle's reasoning pipeline. It sequentially generates a scene graph from visual input using VLM-backed detectors, infers intent, assesses risk, and recommends an action using a compositional reasoning stage powered by a large language model. We evaluate a range of recent VLMs, comparing performance across all four DRAMA-X tasks. Our experiments demonstrate that scene-graph-based reasoning enhances intent prediction and risk assessment, especially when contextual cues are explicitly modeled.
comment: 19 pages, 5 figures, Preprint under review. Code available at: https://github.com/taco-group/DRAMA-X
OceanSim: A GPU-Accelerated Underwater Robot Perception Simulation Framework IROS 2025
Underwater simulators offer support for building robust underwater perception solutions. Significant work has recently been done to develop new simulators and to advance the performance of existing underwater simulators. Still, there remains room for improvement on physics-based underwater sensor modeling and rendering efficiency. In this paper, we propose OceanSim, a high-fidelity GPU-accelerated underwater simulator to address this research gap. We propose advanced physics-based rendering techniques to reduce the sim-to-real gap for underwater image simulation. We develop OceanSim to fully leverage the computing advantages of GPUs and achieve real-time imaging sonar rendering and fast synthetic data generation. We evaluate the capabilities and realism of OceanSim using real-world data to provide qualitative and quantitative results. The code and detailed documentation are made available on the project website to support the marine robotics community: https://umfieldrobotics.github.io/OceanSim.
comment: Accepted at IROS 2025; 8 pages, 6 figures
Learning Adaptive Dexterous Grasping from Single Demonstrations
How can robots learn dexterous grasping skills efficiently and apply them adaptively based on user instructions? This work tackles two key challenges: efficient skill acquisition from limited human demonstrations and context-driven skill selection. We introduce AdaDexGrasp, a framework that learns a library of grasping skills from a single human demonstration per skill and selects the most suitable one using a vision-language model (VLM). To improve sample efficiency, we propose a trajectory following reward that guides reinforcement learning (RL) toward states close to a human demonstration while allowing flexibility in exploration. To learn beyond the single demonstration, we employ curriculum learning, progressively increasing object pose variations to enhance robustness. At deployment, a VLM retrieves the appropriate skill based on user instructions, bridging low-level learned skills with high-level intent. We evaluate AdaDexGrasp in both simulation and real-world settings, showing that our approach significantly improves RL efficiency and enables learning human-like grasp strategies across varied object configurations. Finally, we demonstrate zero-shot transfer of our learned policies to a real-world PSYONIC Ability Hand, with a 90% success rate across objects, significantly outperforming the baseline.
Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning
The cooperative driving technology of Connected and Autonomous Vehicles (CAVs) is crucial for improving the efficiency and safety of transportation systems. Learning-based methods, such as Multi-Agent Reinforcement Learning (MARL), have demonstrated strong capabilities in cooperative decision-making tasks. However, existing MARL approaches still face challenges in terms of learning efficiency and performance. In recent years, Large Language Models (LLMs) have rapidly advanced and shown remarkable abilities in various sequential decision-making tasks. To enhance the learning capabilities of cooperative agents while ensuring decision-making efficiency and cost-effectiveness, we propose LDPD, a language-driven policy distillation method for guiding MARL exploration. In this framework, a teacher agent based on LLM trains smaller student agents to achieve cooperative decision-making through its own decision-making demonstrations. The teacher agent enhances the observation information of CAVs and utilizes LLMs to perform complex cooperative decision-making reasoning, which also leverages carefully designed decision-making tools to achieve expert-level decisions, providing high-quality teaching experiences. The student agent then refines the teacher's prior knowledge into its own model through gradient policy updates. The experiments demonstrate that the students can rapidly improve their capabilities with minimal guidance from the teacher and eventually surpass the teacher's performance. Extensive experiments show that our approach demonstrates better performance and learning efficiency compared to baseline methods.
LifelongPR: Lifelong point cloud place recognition based on sample replay and prompt learning
Point cloud place recognition (PCPR) determines the geo-location within a prebuilt map and plays a crucial role in geoscience and robotics applications such as autonomous driving, intelligent transportation, and augmented reality. In real-world large-scale deployments of a geographic positioning system, PCPR models must continuously acquire, update, and accumulate knowledge to adapt to diverse and dynamic environments, i.e., the ability known as continual learning (CL). However, existing PCPR models often suffer from catastrophic forgetting, leading to significant performance degradation in previously learned scenes when adapting to new environments or sensor types. This results in poor model scalability, increased maintenance costs, and system deployment difficulties, undermining the practicality of PCPR. To address these issues, we propose LifelongPR, a novel continual learning framework for PCPR, which effectively extracts and fuses knowledge from sequential point cloud data. First, to alleviate the knowledge loss, we propose a replay sample selection method that dynamically allocates sample sizes according to each dataset's information quantity and selects spatially diverse samples for maximal representativeness. Second, to handle domain shifts, we design a prompt learning-based CL framework with a lightweight prompt module and a two-stage training strategy, enabling domain-specific feature adaptation while minimizing forgetting. Comprehensive experiments on large-scale public and self-collected datasets are conducted to validate the effectiveness of the proposed method. Compared with state-of-the-art (SOTA) methods, our method achieves 6.50% improvement in mIR@1, 7.96% improvement in mR@1, and an 8.95% reduction in F. The code and pre-trained models are publicly available at https://github.com/zouxianghong/LifelongPR.
Systems and Control (CS)
Distributionally Robust Control with Constraints on Linear Unidimensional Projections
Distributionally robust control is a well-studied framework for optimal decision making under uncertainty, with the objective of minimizing an expected cost function over control actions, assuming the most adverse probability distribution from an ambiguity set. We consider an interpretable and expressive class of ambiguity sets defined by constraints on the expected value of functions of one-dimensional linear projections of the uncertain parameters. Prior work has shown that, under conditions, problems in this class can be reformulated as finite convex problems. In this work, we propose two iterative methods that can be used to approximately solve problems of this class in the general case. The first is an approximate algorithm based on best-response dynamics. The second is an approximate method that first reformulates the problem as a semi-infinite program and then solves a relaxation. We apply our methods to portfolio construction and trajectory planning scenarios.
comment: Presented at the 11th International Conference on Control, Decision and Information Technologies (CoDIT 2025)
Model Predictive Control for Crowd Navigation via Learning-Based Trajectory Prediction
Safe navigation in pedestrian-rich environments remains a key challenge for autonomous robots. This work evaluates the integration of a deep learning-based Social-Implicit (SI) pedestrian trajectory predictor within a Model Predictive Control (MPC) framework on the physical Continental Corriere robot. Tested across varied pedestrian densities, the SI-MPC system is compared to a traditional Constant Velocity (CV) model in both open-loop prediction and closed-loop navigation. Results show that SI improves trajectory prediction - reducing errors by up to 76% in low-density settings - and enhances safety and motion smoothness in crowded scenes. Moreover, real-world deployment reveals discrepancies between open-loop metrics and closed-loop performance, as the SI model yields broader, more cautious predictions. These findings emphasize the importance of system-level evaluation and highlight the SI-MPC framework's promise for safer, more adaptive navigation in dynamic, human-populated environments.
From Data to Safe Mobile Robot Navigation: An Efficient and Modular Robust MPC Design Pipeline
Model predictive control (MPC) is a powerful strategy for planning and control in autonomous mobile robot navigation. However, ensuring safety in real-world deployments remains challenging due to the presence of disturbances and measurement noise. Existing approaches often rely on idealized assumptions, neglect the impact of noisy measurements, and simply heuristically guess unrealistic bounds. In this work, we present an efficient and modular robust MPC design pipeline that systematically addresses these limitations. The pipeline consists of an iterative procedure that leverages closed-loop experimental data to estimate disturbance bounds and synthesize a robust output-feedback MPC scheme. We provide the pipeline in the form of deterministic and reproducible code to synthesize the robust output-feedback MPC from data. We empirically demonstrate robust constraint satisfaction and recursive feasibility in quadrotor simulations using Gazebo.
comment: 8 pages, 5 figures
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
Learning-Enabled Adaptive Power Capping Scheme for Cloud Data Centers
The rapid growth of the digital economy and artificial intelligence has transformed cloud data centers into essential infrastructure with substantial energy consumption and carbon emission, necessitating effective energy management. However, existing methods face challenges such as incomplete information, uncertain parameters, and dynamic environments, which hinder their real-world implementation. This paper proposes an adaptive power capping framework tailored to cloud data centers. By dynamically setting the energy consumption upper bound, the power load of data centers can be reshaped to align with the electricity price or other market signals. To this end, we formulate the power capping problem as a partially observable Markov decision process. Subsequently, we develop an uncertainty-aware model-based reinforcement learning (MBRL) method to perceive the cloud data center operational environment and optimize power-capping decisions. By incorporating a two-stage uncertainty-aware optimization algorithm into the MBRL, we improve its adaptability to the ever-changing environment. Additionally, we derive the optimality gap of the proposed scheme under finite iterations, ensuring effective decisions under complex and uncertain scenarios. The numerical experiments validate the effectiveness of the proposed method using a cloud data center operational environment simulator built on real-world production traces from Alibaba, which demonstrates its potential as an efficient energy management solution for cloud data centers.
Fixed-Time Voltage Regulation for Boost Converters via Unit-Safe Saturating Functions
This paper explores the voltage regulation challenges in boost converter systems, which are critical components in power electronics due to their ability to step up voltage levels efficiently. The proposed control algorithm ensures fixed-time stability, a desirable property that guarantees system stability within a fixed time frame regardless of initial conditions. To tackle the common chattering issues in conventional fixed-time control methods, a novel class of function families is introduced. State observers and adaptive parameters are utilized to manage the uncertainties associated with unknown load resistance. Furthermore, a new disturbance observer is developed using the proposed function family, and its advantages and limitations are illustrated through comparison with existing designs. Finally, both non-real-time and real-time simulations are conducted to validate the effectiveness and deployability of the proposed control algorithm.
Discovery Learning accelerates battery design evaluation
Fast and reliable validation of novel designs in complex physical systems such as batteries is critical to accelerating technological innovation. However, battery research and development remain bottlenecked by the prohibitively high time and energy costs required to evaluate numerous new design candidates, particularly in battery prototyping and life testing. Despite recent progress in data-driven battery lifetime prediction, existing methods require labeled data of target designs to improve accuracy and cannot make reliable predictions until after prototyping, thus falling far short of the efficiency needed to enable rapid feedback for battery design. Here, we introduce Discovery Learning (DL), a scientific machine-learning paradigm that integrates active learning, physics-guided learning, and zero-shot learning into a human-like reasoning loop, drawing inspiration from learning theories in educational psychology. DL can learn from historical battery designs and actively reduce the need for prototyping, thus enabling rapid lifetime evaluation for unobserved material-design combinations without requiring additional data labeling. To test DL, we present 123 industrial-grade large-format lithium-ion pouch cells, spanning eight material-design combinations and diverse cycling protocols. Trained solely on public datasets of small-capacity cylindrical cells, DL achieves 7.2% test error in predicting the average cycle life under unknown device variability. This results in savings of 98% in time and 95% in energy compared to industrial practices. This work highlights the potential of uncovering insights from historical designs to inform and accelerate the development of next-generation battery technologies. DL represents a key advance toward efficient data-driven modeling and helps realize the promise of machine learning for accelerating scientific discovery and engineering innovation.
comment: Main text, 20 pages, 5 figures
Decoupling Structural Heterogeneity from Functional Fairness in Complex Networks: A Theoretical Framework based on the Imbalance Metric
Performance evaluation of complex networks has traditionally focused on structural integrity or average transmission efficiency, perspectives that often overlook the dimension of functional fairness. This raises a central question: Under certain conditions, structurally heterogeneous networks can exhibit high functional fairness. To systematically address this issue, we introduce a new metric, Network Imbalance (I), designed to quantitatively assess end-to-end accessibility fairness from a perceived QoS perspective. By combining a tunable sigmoid function with a global Shannon entropy framework, the I metric quantifies the uniformity of connection experiences between all node pairs. We analyze the mathematical properties of this metric and validate its explanatory power on various classical network models. Our findings reveal that low imbalance (i.e., high functional fairness) can be achieved through two distinct mechanisms: one via topological symmetry (e.g., in a complete graph) and the other via extreme connection efficiency driven by structural inequality (e.g., in a scale-free network). This decoupling of structure and function provides a new theoretical perspective for network performance evaluation and offers an effective quantitative tool for balancing efficiency and fairness in network design.
Average Consensus with Dynamic Compression in Bandwidth-Limited Directed Networks
In this paper, the average consensus problem has been considered for directed unbalanced networks under finite bit-rate communication. We propose the Push-Pull Average Consensus algorithm with Dynamic Compression (PP-ACDC) algorithm, a distributed consensus algorithm that deploys an adaptive quantization scheme and achieves convergence to the exact average without the need of global information. A preliminary numerical convergence analysis and simulation results corroborate the performance of PP-ACDC.
Collaborative Computing Strategy Based SINS Prediction for Emergency UAVs Network
In emergency scenarios, the dynamic and harsh conditions necessitate timely trajectory adjustments for drones, leading to highly dynamic network topologies and potential task failures. To address these challenges, a collaborative computing strategy based strapdown inertial navigation system (SINS) prediction for emergency UAVs network (EUN) is proposed, where a two-step weighted time expanded graph (WTEG) is constructed to deal with dynamic network topology changes. Furthermore, the task scheduling is formulated as a Directed Acyclic Graph (DAG) to WTEG mapping problem to achieve collaborative computing while transmitting among UAVs. Finally, the binary particle swarm optimization (BPSO) algorithm is employed to choose the mapping strategy that minimizes end-to-end processing latency. The simulation results validate that the collaborative computing strategy significantly outperforms both cloud and local computing in terms of latency. Moreover, the task success rate using SINS is substantially improved compared to approaches without prior prediction.
Neural Network-Based Detection and Multi-Class Classification of FDI Attacks in Smart Grid Home Energy Systems
False Data Injection Attacks (FDIAs) pose a significant threat to smart grid infrastructures, particularly Home Area Networks (HANs), where real-time monitoring and control are highly adopted. Owing to the comparatively less stringent security controls and widespread availability of HANs, attackers view them as an attractive entry point to manipulate aggregated demand patterns, which can ultimately propagate and disrupt broader grid operations. These attacks undermine the integrity of smart meter data, enabling malicious actors to manipulate consumption values without activating conventional alarms, thereby creating serious vulnerabilities across both residential and utility-scale infrastructures. This paper presents a machine learning-based framework for both the detection and classification of FDIAs using residential energy data. A real-time detection is provided by the lightweight Artificial Neural Network (ANN), which works by using the most vital features of energy consumption, cost, and time context. For the classification of different attack types, a Bidirectional LSTM is trained to recognize normal, trapezoidal, and sigmoid attack shapes through learning sequential dependencies in the data. A synthetic time-series dataset was generated to emulate realistic household behaviour. Experimental results demonstrate that the proposed models are effective in identifying and classifying FDIAs, offering a scalable solution for enhancing grid resilience at the edge. This work contributes toward building intelligent, data-driven defence mechanisms that strengthen smart grid cybersecurity from residential endpoints.
comment: 17 pages, 7 figures
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a model predictive control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing nonlinear model predictive control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Direction Estimation of Sound Sources Using Microphone Arrays and Signal Strength
Sound-tracking refers to the process of determining the direction from which a sound originates, making it a fundamental component of sound source localization. This capability is essential in a variety of applications, including security systems, acoustic monitoring, and speaker tracking, where accurately identifying the direction of a sound source enables real-time responses, efficient resource allocation, and improved situational awareness. While sound-tracking is closely related to localization, it specifically focuses on identifying the direction of the sound source rather than estimating its exact position in space. Despite its utility, sound-tracking systems face several challenges, such as maintaining directional accuracy and precision, along with the need for sophisticated hardware configurations and complex signal processing algorithms. This paper presents a sound-tracking method using three electret microphones. We estimate the direction of a sound source using a lightweight method that analyzes signals from three strategically placed microphones. By comparing the average power of the received signals, the system infers the most probable direction of the sound. The results indicate that the power level from each microphone effectively determines the sound source direction. Our system employs a straightforward and cost-effective hardware design, ensuring simplicity and affordability in implementation. It achieves a localization error of less than 6 degrees and a precision of 98%. Additionally, its effortless integration with various systems makes it versatile and adaptable. Consequently, this technique presents a robust and reliable solution for sound-tracking and localization, with potential applications spanning diverse domains such as security systems, smart homes, and acoustic monitoring.
comment: 5 pages
Systems and Control (EESS)
Distributionally Robust Control with Constraints on Linear Unidimensional Projections
Distributionally robust control is a well-studied framework for optimal decision making under uncertainty, with the objective of minimizing an expected cost function over control actions, assuming the most adverse probability distribution from an ambiguity set. We consider an interpretable and expressive class of ambiguity sets defined by constraints on the expected value of functions of one-dimensional linear projections of the uncertain parameters. Prior work has shown that, under conditions, problems in this class can be reformulated as finite convex problems. In this work, we propose two iterative methods that can be used to approximately solve problems of this class in the general case. The first is an approximate algorithm based on best-response dynamics. The second is an approximate method that first reformulates the problem as a semi-infinite program and then solves a relaxation. We apply our methods to portfolio construction and trajectory planning scenarios.
comment: Presented at the 11th International Conference on Control, Decision and Information Technologies (CoDIT 2025)
Model Predictive Control for Crowd Navigation via Learning-Based Trajectory Prediction
Safe navigation in pedestrian-rich environments remains a key challenge for autonomous robots. This work evaluates the integration of a deep learning-based Social-Implicit (SI) pedestrian trajectory predictor within a Model Predictive Control (MPC) framework on the physical Continental Corriere robot. Tested across varied pedestrian densities, the SI-MPC system is compared to a traditional Constant Velocity (CV) model in both open-loop prediction and closed-loop navigation. Results show that SI improves trajectory prediction - reducing errors by up to 76% in low-density settings - and enhances safety and motion smoothness in crowded scenes. Moreover, real-world deployment reveals discrepancies between open-loop metrics and closed-loop performance, as the SI model yields broader, more cautious predictions. These findings emphasize the importance of system-level evaluation and highlight the SI-MPC framework's promise for safer, more adaptive navigation in dynamic, human-populated environments.
From Data to Safe Mobile Robot Navigation: An Efficient and Modular Robust MPC Design Pipeline
Model predictive control (MPC) is a powerful strategy for planning and control in autonomous mobile robot navigation. However, ensuring safety in real-world deployments remains challenging due to the presence of disturbances and measurement noise. Existing approaches often rely on idealized assumptions, neglect the impact of noisy measurements, and simply heuristically guess unrealistic bounds. In this work, we present an efficient and modular robust MPC design pipeline that systematically addresses these limitations. The pipeline consists of an iterative procedure that leverages closed-loop experimental data to estimate disturbance bounds and synthesize a robust output-feedback MPC scheme. We provide the pipeline in the form of deterministic and reproducible code to synthesize the robust output-feedback MPC from data. We empirically demonstrate robust constraint satisfaction and recursive feasibility in quadrotor simulations using Gazebo.
comment: 8 pages, 5 figures
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
Learning-Enabled Adaptive Power Capping Scheme for Cloud Data Centers
The rapid growth of the digital economy and artificial intelligence has transformed cloud data centers into essential infrastructure with substantial energy consumption and carbon emission, necessitating effective energy management. However, existing methods face challenges such as incomplete information, uncertain parameters, and dynamic environments, which hinder their real-world implementation. This paper proposes an adaptive power capping framework tailored to cloud data centers. By dynamically setting the energy consumption upper bound, the power load of data centers can be reshaped to align with the electricity price or other market signals. To this end, we formulate the power capping problem as a partially observable Markov decision process. Subsequently, we develop an uncertainty-aware model-based reinforcement learning (MBRL) method to perceive the cloud data center operational environment and optimize power-capping decisions. By incorporating a two-stage uncertainty-aware optimization algorithm into the MBRL, we improve its adaptability to the ever-changing environment. Additionally, we derive the optimality gap of the proposed scheme under finite iterations, ensuring effective decisions under complex and uncertain scenarios. The numerical experiments validate the effectiveness of the proposed method using a cloud data center operational environment simulator built on real-world production traces from Alibaba, which demonstrates its potential as an efficient energy management solution for cloud data centers.
Fixed-Time Voltage Regulation for Boost Converters via Unit-Safe Saturating Functions
This paper explores the voltage regulation challenges in boost converter systems, which are critical components in power electronics due to their ability to step up voltage levels efficiently. The proposed control algorithm ensures fixed-time stability, a desirable property that guarantees system stability within a fixed time frame regardless of initial conditions. To tackle the common chattering issues in conventional fixed-time control methods, a novel class of function families is introduced. State observers and adaptive parameters are utilized to manage the uncertainties associated with unknown load resistance. Furthermore, a new disturbance observer is developed using the proposed function family, and its advantages and limitations are illustrated through comparison with existing designs. Finally, both non-real-time and real-time simulations are conducted to validate the effectiveness and deployability of the proposed control algorithm.
Discovery Learning accelerates battery design evaluation
Fast and reliable validation of novel designs in complex physical systems such as batteries is critical to accelerating technological innovation. However, battery research and development remain bottlenecked by the prohibitively high time and energy costs required to evaluate numerous new design candidates, particularly in battery prototyping and life testing. Despite recent progress in data-driven battery lifetime prediction, existing methods require labeled data of target designs to improve accuracy and cannot make reliable predictions until after prototyping, thus falling far short of the efficiency needed to enable rapid feedback for battery design. Here, we introduce Discovery Learning (DL), a scientific machine-learning paradigm that integrates active learning, physics-guided learning, and zero-shot learning into a human-like reasoning loop, drawing inspiration from learning theories in educational psychology. DL can learn from historical battery designs and actively reduce the need for prototyping, thus enabling rapid lifetime evaluation for unobserved material-design combinations without requiring additional data labeling. To test DL, we present 123 industrial-grade large-format lithium-ion pouch cells, spanning eight material-design combinations and diverse cycling protocols. Trained solely on public datasets of small-capacity cylindrical cells, DL achieves 7.2% test error in predicting the average cycle life under unknown device variability. This results in savings of 98% in time and 95% in energy compared to industrial practices. This work highlights the potential of uncovering insights from historical designs to inform and accelerate the development of next-generation battery technologies. DL represents a key advance toward efficient data-driven modeling and helps realize the promise of machine learning for accelerating scientific discovery and engineering innovation.
comment: Main text, 20 pages, 5 figures
Decoupling Structural Heterogeneity from Functional Fairness in Complex Networks: A Theoretical Framework based on the Imbalance Metric
Performance evaluation of complex networks has traditionally focused on structural integrity or average transmission efficiency, perspectives that often overlook the dimension of functional fairness. This raises a central question: Under certain conditions, structurally heterogeneous networks can exhibit high functional fairness. To systematically address this issue, we introduce a new metric, Network Imbalance (I), designed to quantitatively assess end-to-end accessibility fairness from a perceived QoS perspective. By combining a tunable sigmoid function with a global Shannon entropy framework, the I metric quantifies the uniformity of connection experiences between all node pairs. We analyze the mathematical properties of this metric and validate its explanatory power on various classical network models. Our findings reveal that low imbalance (i.e., high functional fairness) can be achieved through two distinct mechanisms: one via topological symmetry (e.g., in a complete graph) and the other via extreme connection efficiency driven by structural inequality (e.g., in a scale-free network). This decoupling of structure and function provides a new theoretical perspective for network performance evaluation and offers an effective quantitative tool for balancing efficiency and fairness in network design.
Average Consensus with Dynamic Compression in Bandwidth-Limited Directed Networks
In this paper, the average consensus problem has been considered for directed unbalanced networks under finite bit-rate communication. We propose the Push-Pull Average Consensus algorithm with Dynamic Compression (PP-ACDC) algorithm, a distributed consensus algorithm that deploys an adaptive quantization scheme and achieves convergence to the exact average without the need of global information. A preliminary numerical convergence analysis and simulation results corroborate the performance of PP-ACDC.
Collaborative Computing Strategy Based SINS Prediction for Emergency UAVs Network
In emergency scenarios, the dynamic and harsh conditions necessitate timely trajectory adjustments for drones, leading to highly dynamic network topologies and potential task failures. To address these challenges, a collaborative computing strategy based strapdown inertial navigation system (SINS) prediction for emergency UAVs network (EUN) is proposed, where a two-step weighted time expanded graph (WTEG) is constructed to deal with dynamic network topology changes. Furthermore, the task scheduling is formulated as a Directed Acyclic Graph (DAG) to WTEG mapping problem to achieve collaborative computing while transmitting among UAVs. Finally, the binary particle swarm optimization (BPSO) algorithm is employed to choose the mapping strategy that minimizes end-to-end processing latency. The simulation results validate that the collaborative computing strategy significantly outperforms both cloud and local computing in terms of latency. Moreover, the task success rate using SINS is substantially improved compared to approaches without prior prediction.
Neural Network-Based Detection and Multi-Class Classification of FDI Attacks in Smart Grid Home Energy Systems
False Data Injection Attacks (FDIAs) pose a significant threat to smart grid infrastructures, particularly Home Area Networks (HANs), where real-time monitoring and control are highly adopted. Owing to the comparatively less stringent security controls and widespread availability of HANs, attackers view them as an attractive entry point to manipulate aggregated demand patterns, which can ultimately propagate and disrupt broader grid operations. These attacks undermine the integrity of smart meter data, enabling malicious actors to manipulate consumption values without activating conventional alarms, thereby creating serious vulnerabilities across both residential and utility-scale infrastructures. This paper presents a machine learning-based framework for both the detection and classification of FDIAs using residential energy data. A real-time detection is provided by the lightweight Artificial Neural Network (ANN), which works by using the most vital features of energy consumption, cost, and time context. For the classification of different attack types, a Bidirectional LSTM is trained to recognize normal, trapezoidal, and sigmoid attack shapes through learning sequential dependencies in the data. A synthetic time-series dataset was generated to emulate realistic household behaviour. Experimental results demonstrate that the proposed models are effective in identifying and classifying FDIAs, offering a scalable solution for enhancing grid resilience at the edge. This work contributes toward building intelligent, data-driven defence mechanisms that strengthen smart grid cybersecurity from residential endpoints.
comment: 17 pages, 7 figures
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a model predictive control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing nonlinear model predictive control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Direction Estimation of Sound Sources Using Microphone Arrays and Signal Strength
Sound-tracking refers to the process of determining the direction from which a sound originates, making it a fundamental component of sound source localization. This capability is essential in a variety of applications, including security systems, acoustic monitoring, and speaker tracking, where accurately identifying the direction of a sound source enables real-time responses, efficient resource allocation, and improved situational awareness. While sound-tracking is closely related to localization, it specifically focuses on identifying the direction of the sound source rather than estimating its exact position in space. Despite its utility, sound-tracking systems face several challenges, such as maintaining directional accuracy and precision, along with the need for sophisticated hardware configurations and complex signal processing algorithms. This paper presents a sound-tracking method using three electret microphones. We estimate the direction of a sound source using a lightweight method that analyzes signals from three strategically placed microphones. By comparing the average power of the received signals, the system infers the most probable direction of the sound. The results indicate that the power level from each microphone effectively determines the sound source direction. Our system employs a straightforward and cost-effective hardware design, ensuring simplicity and affordability in implementation. It achieves a localization error of less than 6 degrees and a precision of 98%. Additionally, its effortless integration with various systems makes it versatile and adaptable. Consequently, this technique presents a robust and reliable solution for sound-tracking and localization, with potential applications spanning diverse domains such as security systems, smart homes, and acoustic monitoring.
comment: 5 pages
Robotics
Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation
Generalist robot policies trained on large-scale datasets such as Open X-Embodiment (OXE) demonstrate strong performance across a wide range of tasks. However, they often struggle to generalize beyond the distribution of their training data. In this paper, we investigate the underlying cause of this limited generalization capability. We identify shortcut learning -- the reliance on task-irrelevant features -- as a key impediment to generalization. Through comprehensive theoretical and empirical analysis, we uncover two primary contributors to shortcut learning: (1) limited diversity within individual sub-datasets, and (2) significant distributional disparities across sub-datasets, leading to dataset fragmentation. These issues arise from the inherent structure of large-scale datasets like OXE, which are typically composed of multiple sub-datasets collected independently across varied environments and embodiments. Our findings provide critical insights into dataset collection strategies that can reduce shortcut learning and enhance the generalization ability of generalist robot policies. Moreover, in scenarios where acquiring new large-scale data is impractical, we demonstrate that carefully selected robotic data augmentation strategies can effectively reduce shortcut learning in existing offline datasets, thereby improving generalization capabilities of generalist robot policies, e.g., $\pi_0$, in both simulation and real-world environments. More information at https://lucky-light-sun.github.io/proj/shortcut-learning-in-grps/.
comment: CoRL 2025
V*: An Efficient Motion Planning Algorithm for Autonomous Vehicles
Autonomous vehicle navigation in structured environments requires planners capable of generating time-optimal, collision-free trajectories that satisfy dynamic and kinematic constraints. We introduce V*, a graph-based motion planner that represents speed and direction as explicit state variables within a discretised space-time-velocity lattice. Unlike traditional methods that decouple spatial search from dynamic feasibility or rely on post-hoc smoothing, V* integrates both motion dimensions directly into graph construction through dynamic graph generation during search expansion. To manage the complexity of high-dimensional search, we employ a hexagonal discretisation strategy and provide formal mathematical proofs establishing optimal waypoint spacing and minimal node redundancy under constrained heading transitions for velocity-aware motion planning. We develop a mathematical formulation for transient steering dynamics in the kinematic bicycle model, modelling steering angle convergence with exponential behaviour, and deriving the relationship for convergence rate parameters. This theoretical foundation, combined with geometric pruning strategies that eliminate expansions leading to infeasible steering configurations, enables V* to evaluate dynamically admissible manoeuvres, ensuring each trajectory is physically realisable without further refinement. We further demonstrate V*'s performance in simulation studies with cluttered and dynamic environments involving moving obstacles, showing its ability to avoid conflicts, yield proactively, and generate safe, efficient trajectories with temporal reasoning capabilities for waiting behaviours and dynamic coordination.
L2Calib: $SE(3)$-Manifold Reinforcement Learning for Robust Extrinsic Calibration with Degenerate Motion Resilience IROS2025
Extrinsic calibration is essential for multi-sensor fusion, existing methods rely on structured targets or fully-excited data, limiting real-world applicability. Online calibration further suffers from weak excitation, leading to unreliable estimates. To address these limitations, we propose a reinforcement learning (RL)-based extrinsic calibration framework that formulates extrinsic calibration as a decision-making problem, directly optimizes $SE(3)$ extrinsics to enhance odometry accuracy. Our approach leverages a probabilistic Bingham distribution to model 3D rotations, ensuring stable optimization while inherently retaining quaternion symmetry. A trajectory alignment reward mechanism enables robust calibration without structured targets by quantitatively evaluating estimated tightly-coupled trajectory against a reference trajectory. Additionally, an automated data selection module filters uninformative samples, significantly improving efficiency and scalability for large-scale datasets. Extensive experiments on UAVs, UGVs, and handheld platforms demonstrate that our method outperforms traditional optimization-based approaches, achieving high-precision calibration even under weak excitation conditions. Our framework simplifies deployment on diverse robotic platforms by eliminating the need for high-quality initial extrinsics and enabling calibration from routine operating data. The code is available at https://github.com/APRIL-ZJU/learn-to-calibrate.
comment: IROS2025
Towards Balanced Behavior Cloning from Imbalanced Datasets
Robots should be able to learn complex behaviors from human demonstrations. In practice, these human-provided datasets are inevitably imbalanced: i.e., the human demonstrates some subtasks more frequently than others. State-of-the-art methods default to treating each element of the human's dataset as equally important. So if -- for instance -- the majority of the human's data focuses on reaching a goal, and only a few state-action pairs move to avoid an obstacle, the learning algorithm will place greater emphasis on goal reaching. More generally, misalignment between the relative amounts of data and the importance of that data causes fundamental problems for imitation learning approaches. In this paper we analyze and develop learning methods that automatically account for mixed datasets. We formally prove that imbalanced data leads to imbalanced policies when each state-action pair is weighted equally; these policies emulate the most represented behaviors, and not the human's complex, multi-task demonstrations. We next explore algorithms that rebalance offline datasets (i.e., reweight the importance of different state-action pairs) without human oversight. Reweighting the dataset can enhance the overall policy performance. However, there is no free lunch: each method for autonomously rebalancing brings its own pros and cons. We formulate these advantages and disadvantages, helping other researchers identify when each type of approach is most appropriate. We conclude by introducing a novel meta-gradient rebalancing algorithm that addresses the primary limitations behind existing approaches. Our experiments show that dataset rebalancing leads to better downstream learning, improving the performance of general imitation learning algorithms without requiring additional data collection. See our project website: https://collab.me.vt.edu/data_curation/.
Surrogate-Enhanced Modeling and Adaptive Modular Control of All-Electric Heavy-Duty Robotic Manipulators
This paper presents a unified system-level modeling and control framework for an all-electric heavy-duty robotic manipulator (HDRM) driven by electromechanical linear actuators (EMLAs). A surrogate-enhanced actuator model, combining integrated electromechanical dynamics with a neural network trained on a dedicated testbed, is integrated into an extended virtual decomposition control (VDC) architecture augmented by a natural adaptation law. The derived analytical HDRM model supports a hierarchical control structure that seamlessly maps high-level force and velocity objectives to real-time actuator commands, accompanied by a Lyapunov-based stability proof. In multi-domain simulations of both cubic and a custom planar triangular trajectory, the proposed adaptive modular controller achieves sub-centimeter Cartesian tracking accuracy. Experimental validation of the same 1-DoF platform under realistic load emulation confirms the efficacy of the proposed control strategy. These findings demonstrate that a surrogate-enhanced EMLA model embedded in the VDC approach can enable modular, real-time control of an all-electric HDRM, supporting its deployment in next-generation mobile working machines.
comment: This is submitted to IEEE T-ASE
Evaluating Robot Program Performance with Power Consumption Driven Metrics in Lightweight Industrial Robots
The code performance of industrial robots is typically analyzed through CPU metrics, which overlook the physical impact of code on robot behavior. This study introduces a novel framework for assessing robot program performance from an embodiment perspective by analyzing the robot's electrical power profile. Our approach diverges from conventional CPU based evaluations and instead leverages a suite of normalized metrics, namely, the energy utilization coefficient, the energy conversion metric, and the reliability coefficient, to capture how efficiently and reliably energy is used during task execution. Complementing these metrics, the established robot wear metric provides further insight into long term reliability. Our approach is demonstrated through an experimental case study in machine tending, comparing four programs with diverse strategies using a UR5e robot. The proposed metrics directly compare and categorize different robot programs, regardless of the specific task, by linking code performance to its physical manifestation through power consumption patterns. Our results reveal the strengths and weaknesses of each strategy, offering actionable insights for optimizing robot programming practices. Enhancing energy efficiency and reliability through this embodiment centric approach not only improves individual robot performance but also supports broader industrial objectives such as sustainable manufacturing and cost reduction.
Real-Time 3D Vision-Language Embedding Mapping
A metric-accurate semantic 3D representation is essential for many robotic tasks. This work proposes a simple, yet powerful, way to integrate the 2D embeddings of a Vision-Language Model in a metric-accurate 3D representation at real-time. We combine a local embedding masking strategy, for a more distinct embedding distribution, with a confidence-weighted 3D integration for more reliable 3D embeddings. The resulting metric-accurate embedding representation is task-agnostic and can represent semantic concepts on a global multi-room, as well as on a local object-level. This enables a variety of interactive robotic applications that require the localisation of objects-of-interest via natural language. We evaluate our approach on a variety of real-world sequences and demonstrate that these strategies achieve a more accurate object-of-interest localisation while improving the runtime performance in order to meet our real-time constraints. We further demonstrate the versatility of our approach in a variety of interactive handheld, mobile robotics and manipulation tasks, requiring only raw image data.
Situationally-aware Path Planning Exploiting 3D Scene Graphs
3D Scene Graphs integrate both metric and semantic information, yet their structure remains underutilized for improving path planning efficiency and interpretability. In this work, we present S-Path, a situationally-aware path planner that leverages the metric-semantic structure of indoor 3D Scene Graphs to significantly enhance planning efficiency. S-Path follows a two-stage process: it first performs a search over a semantic graph derived from the scene graph to yield a human-understandable high-level path. This also identifies relevant regions for planning, which later allows the decomposition of the problem into smaller, independent subproblems that can be solved in parallel. We also introduce a replanning mechanism that, in the event of an infeasible path, reuses information from previously solved subproblems to update semantic heuristics and prioritize reuse to further improve the efficiency of future planning attempts. Extensive experiments on both real-world and simulated environments show that S-Path achieves average reductions of 5.7x in planning time while maintaining comparable path optimality to classical sampling-based planners and surpassing them in complex scenarios, making it an efficient and interpretable path planner for environments represented by indoor 3D Scene Graphs.
Mitigating Undesired Conditions in Flexible Production with Product-Process-Resource Asset Knowledge Graphs
Contemporary industrial cyber-physical production systems (CPPS) composed of robotic workcells face significant challenges in the analysis of undesired conditions due to the flexibility of Industry 4.0 that disrupts traditional quality assurance mechanisms. This paper presents a novel industry-oriented semantic model called Product-Process-Resource Asset Knowledge Graph (PPR-AKG), which is designed to analyze and mitigate undesired conditions in flexible CPPS. Built on top of the well-proven Product-Process-Resource (PPR) model originating from ISA-95 and VDI-3682, a comprehensive OWL ontology addresses shortcomings of conventional model-driven engineering for CPPS, particularly inadequate undesired condition and error handling representation. The integration of semantic technologies with large language models (LLMs) provides intuitive interfaces for factory operators, production planners, and engineers to interact with the entire model using natural language. Evaluation with the use case addressing electric vehicle battery remanufacturing demonstrates that the PPR-AKG approach efficiently supports resource allocation based on explicitly represented capabilities as well as identification and mitigation of undesired conditions in production. The key contributions include (1) a holistic PPR-AKG model capturing multi-dimensional production knowledge, and (2) the useful combination of the PPR-AKG with LLM-based chatbots for human interaction.
comment: 3 pages, 1 figure
EcBot: Data-Driven Energy Consumption Open-Source MATLAB Library for Manipulators
Existing literature proposes models for estimating the electrical power of manipulators, yet two primary limitations prevail. First, most models are predominantly tested using traditional industrial robots. Second, these models often lack accuracy. To address these issues, we introduce an open source Matlab-based library designed to automatically generate \ac{ec} models for manipulators. The necessary inputs for the library are Denavit-Hartenberg parameters, link masses, and centers of mass. Additionally, our model is data-driven and requires real operational data, including joint positions, velocities, accelerations, electrical power, and corresponding timestamps. We validated our methodology by testing on four lightweight robots sourced from three distinct manufacturers: Universal Robots, Franka Emika, and Kinova. The model underwent testing, and the results demonstrated an RMSE ranging from 1.42 W to 2.80 W for the training dataset and from 1.45 W to 5.25 W for the testing dataset.
ADPro: a Test-time Adaptive Diffusion Policy for Robot Manipulation via Manifold and Initial Noise Constraints
Diffusion policies have recently emerged as a powerful class of visuomotor controllers for robot manipulation, offering stable training and expressive multi-modal action modeling. However, existing approaches typically treat action generation as an unconstrained denoising process, ignoring valuable a priori knowledge about geometry and control structure. In this work, we propose the Adaptive Diffusion Policy (ADP), a test-time adaptation method that introduces two key inductive biases into the diffusion. First, we embed a geometric manifold constraint that aligns denoising updates with task-relevant subspaces, leveraging the fact that the relative pose between the end-effector and target scene provides a natural gradient direction, and guiding denoising along the geodesic path of the manipulation manifold. Then, to reduce unnecessary exploration and accelerate convergence, we propose an analytically guided initialization: rather than sampling from an uninformative prior, we compute a rough registration between the gripper and target scenes to propose a structured initial noisy action. ADP is compatible with pre-trained diffusion policies and requires no retraining, enabling test-time adaptation that tailors the policy to specific tasks, thereby enhancing generalization across novel tasks and environments. Experiments on RLBench, CALVIN, and real-world dataset show that ADPro, an implementation of ADP, improves success rates, generalization, and sampling efficiency, achieving up to 25% faster execution and 9% points over strong diffusion baselines.
REBot: Reflexive Evasion Robot for Instantaneous Dynamic Obstacle Avoidance
Dynamic obstacle avoidance (DOA) is critical for quadrupedal robots operating in environments with moving obstacles or humans. Existing approaches typically rely on navigation-based trajectory replanning, which assumes sufficient reaction time and leading to fails when obstacles approach rapidly. In such scenarios, quadrupedal robots require reflexive evasion capabilities to perform instantaneous, low-latency maneuvers. This paper introduces Reflexive Evasion Robot (REBot), a control framework that enables quadrupedal robots to achieve real-time reflexive obstacle avoidance. REBot integrates an avoidance policy and a recovery policy within a finite-state machine. With carefully designed learning curricula and by incorporating regularization and adaptive rewards, REBot achieves robust evasion and rapid stabilization in instantaneous DOA tasks. We validate REBot through extensive simulations and real-world experiments, demonstrating notable improvements in avoidance success rates, energy efficiency, and robustness to fast-moving obstacles. Videos and appendix are available on https://rebot-2025.github.io/.
Depth Jitter: Seeing through the Depth
Depth information is essential in computer vision, particularly in underwater imaging, robotics, and autonomous navigation. However, conventional augmentation techniques overlook depth aware transformations, limiting model robustness in real world depth variations. In this paper, we introduce Depth-Jitter, a novel depth-based augmentation technique that simulates natural depth variations to improve generalization. Our approach applies adaptive depth offsetting, guided by depth variance thresholds, to generate synthetic depth perturbations while preserving structural integrity. We evaluate Depth-Jitter on two benchmark datasets, FathomNet and UTDAC2020 demonstrating its impact on model stability under diverse depth conditions. Extensive experiments compare Depth-Jitter against traditional augmentation strategies such as ColorJitter, analyzing performance across varying learning rates, encoders, and loss functions. While Depth-Jitter does not always outperform conventional methods in absolute performance, it consistently enhances model stability and generalization in depth-sensitive environments. These findings highlight the potential of depth-aware augmentation for real-world applications and provide a foundation for further research into depth-based learning strategies. The proposed technique is publicly available to support advancements in depth-aware augmentation. The code is publicly available on \href{https://github.com/mim-team/Depth-Jitter}{github}.
Computer Vision-based Adaptive Control for Back Exoskeleton Performance Optimization
Back exoskeletons can reduce musculoskeletal strain, but their effectiveness depends on support modulation and adaptive control. This study addresses two challenges: defining optimal support strategies and developing adaptive control based on payload estimation. We introduce an optimization space based on muscle activity reduction, perceived discomfort, and user preference, constructing functions to identify optimal strategies. Experiments with 12 subjects revealed optimal operating regions, highlighting the need for dynamic modulation. Based on these insights, we developed a vision-based adaptive control pipeline that estimates payloads in real-time by enhancing exoskeleton contextual understanding, minimising latency and enabling support adaptation within the defined optimisation space. Validation with 12 more subjects showed over 80% accuracy and improvements across all metrics. Compared to static control, adaptive modulation reduced peak back muscle activation by up to 23% while preserving user preference and minimising discomfort. These findings validate the proposed framework and highlight the potential of intelligent, context-aware control in industrial exoskeletons.
Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model
Affordance grounding focuses on predicting the specific regions of objects that are associated with the actions to be performed by robots. It plays a vital role in the fields of human-robot interaction, human-object interaction, embodied manipulation, and embodied perception. Existing models often neglect the affordance shared among different objects because they lack the Chain-of-Thought(CoT) reasoning abilities, limiting their out-of-domain (OOD) generalization and explicit reasoning capabilities. To address these challenges, we propose Affordance-R1, the first unified affordance grounding framework that integrates cognitive CoT guided Group Relative Policy Optimization (GRPO) within a reinforcement learning paradigm. Specifically, we designed a sophisticated affordance function, which contains format, perception, and cognition rewards to effectively guide optimization directions. Furthermore, we constructed a high-quality affordance-centric reasoning dataset, ReasonAff, to support training. Trained exclusively via reinforcement learning with GRPO and without explicit reasoning data, Affordance-R1 achieves robust zero-shot generalization and exhibits emergent test-time reasoning capabilities. Comprehensive experiments demonstrate that our model outperforms well-established methods and exhibits open-world generalization. To the best of our knowledge, Affordance-R1 is the first to integrate GRPO-based RL with reasoning into affordance reasoning. The code of our method and our dataset is released on https://github.com/hq-King/Affordance-R1.
Beyond Constant Parameters: Hyper Prediction Models and HyperMPC
Model Predictive Control (MPC) is among the most widely adopted and reliable methods for robot control, relying critically on an accurate dynamics model. However, existing dynamics models used in the gradient-based MPC are limited by computational complexity and state representation. To address this limitation, we propose the Hyper Prediction Model (HyperPM) - a novel approach in which we project the unmodeled dynamics onto a time-dependent dynamics model. This time-dependency is captured through time-varying model parameters, whose evolution over the MPC prediction horizon is learned using a neural network. Such formulation preserves the computational efficiency and robustness of the base model while equipping it with the capacity to anticipate previously unmodeled phenomena. We evaluated the proposed approach on several challenging systems, including real-world F1TENTH autonomous racing, and demonstrated that it significantly reduces long-horizon prediction errors. Moreover, when integrated within the MPC framework (HyperMPC), our method consistently outperforms existing state-of-the-art techniques.
Graph-based Robot Localization Using a Graph Neural Network with a Floor Camera and a Feature Rich Industrial Floor
Accurate localization represents a fundamental challenge in robotic navigation. Traditional methodologies, such as Lidar or QR-code based systems, suffer from inherent scalability and adaptability con straints, particularly in complex environments. In this work, we propose an innovative localization framework that harnesses flooring characteris tics by employing graph-based representations and Graph Convolutional Networks (GCNs). Our method uses graphs to represent floor features, which helps localize the robot more accurately (0.64cm error) and more efficiently than comparing individual image features. Additionally, this approach successfully addresses the kidnapped robot problem in every frame without requiring complex filtering processes. These advancements open up new possibilities for robotic navigation in diverse environments.
comment: Accepted at 28th RoboCup International Symposium, Salvador, Brasil
GMF-Drive: Gated Mamba Fusion with Spatial-Aware BEV Representation for End-to-End Autonomous Driving
Diffusion-based models are redefining the state-of-the-art in end-to-end autonomous driving, yet their performance is increasingly hampered by a reliance on transformer-based fusion. These architectures face fundamental limitations: quadratic computational complexity restricts the use of high-resolution features, and a lack of spatial priors prevents them from effectively modeling the inherent structure of Bird's Eye View (BEV) representations. This paper introduces GMF-Drive (Gated Mamba Fusion for Driving), an end-to-end framework that overcomes these challenges through two principled innovations. First, we supersede the information-limited histogram-based LiDAR representation with a geometrically-augmented pillar format encoding shape descriptors and statistical features, preserving critical 3D geometric details. Second, we propose a novel hierarchical gated mamba fusion (GM-Fusion) architecture that substitutes an expensive transformer with a highly efficient, spatially-aware state-space model (SSM). Our core BEV-SSM leverages directional sequencing and adaptive fusion mechanisms to capture long-range dependencies with linear complexity, while explicitly respecting the unique spatial properties of the driving scene. Extensive experiments on the challenging NAVSIM benchmark demonstrate that GMF-Drive achieves a new state-of-the-art performance, significantly outperforming DiffusionDrive. Comprehensive ablation studies validate the efficacy of each component, demonstrating that task-specific SSMs can surpass a general-purpose transformer in both performance and efficiency for autonomous driving.
comment: 7 pages, 4 figures
Bounding Distributional Shifts in World Modeling through Novelty Detection
Recent work on visual world models shows significant promise in latent state dynamics obtained from pre-trained image backbones. However, most of the current approaches are sensitive to training quality, requiring near-complete coverage of the action and state space during training to prevent divergence during inference. To make a model-based planning algorithm more robust to the quality of the learned world model, we propose in this work to use a variational autoencoder as a novelty detector to ensure that proposed action trajectories during planning do not cause the learned model to deviate from the training data distribution. To evaluate the effectiveness of this approach, a series of experiments in challenging simulated robot environments was carried out, with the proposed method incorporated into a model-predictive control policy loop extending the DINO-WM architecture. The results clearly show that the proposed method improves over state-of-the-art solutions in terms of data efficiency.
comment: 7 pages, 6 figures
Incremental Language Understanding for Online Motion Planning of Robot Manipulators IROS 2025
Human-robot interaction requires robots to process language incrementally, adapting their actions in real-time based on evolving speech input. Existing approaches to language-guided robot motion planning typically assume fully specified instructions, resulting in inefficient stop-and-replan behavior when corrections or clarifications occur. In this paper, we introduce a novel reasoning-based incremental parser which integrates an online motion planning algorithm within the cognitive architecture. Our approach enables continuous adaptation to dynamic linguistic input, allowing robots to update motion plans without restarting execution. The incremental parser maintains multiple candidate parses, leveraging reasoning mechanisms to resolve ambiguities and revise interpretations when needed. By combining symbolic reasoning with online motion planning, our system achieves greater flexibility in handling speech corrections and dynamically changing constraints. We evaluate our framework in real-world human-robot interaction scenarios, demonstrating online adaptions of goal poses, constraints, or task objectives. Our results highlight the advantages of integrating incremental language understanding with real-time motion planning for natural and fluid human-robot collaboration. The experiments are demonstrated in the accompanying video at www.acin.tuwien.ac.at/42d5.
comment: 8 pages, 9 figures, accepted at IROS 2025
ME$^3$-BEV: Mamba-Enhanced Deep Reinforcement Learning for End-to-End Autonomous Driving with BEV-Perception
Autonomous driving systems face significant challenges in perceiving complex environments and making real-time decisions. Traditional modular approaches, while offering interpretability, suffer from error propagation and coordination issues, whereas end-to-end learning systems can simplify the design but face computational bottlenecks. This paper presents a novel approach to autonomous driving using deep reinforcement learning (DRL) that integrates bird's-eye view (BEV) perception for enhanced real-time decision-making. We introduce the \texttt{Mamba-BEV} model, an efficient spatio-temporal feature extraction network that combines BEV-based perception with the Mamba framework for temporal feature modeling. This integration allows the system to encode vehicle surroundings and road features in a unified coordinate system and accurately model long-range dependencies. Building on this, we propose the \texttt{ME$^3$-BEV} framework, which utilizes the \texttt{Mamba-BEV} model as a feature input for end-to-end DRL, achieving superior performance in dynamic urban driving scenarios. We further enhance the interpretability of the model by visualizing high-dimensional features through semantic segmentation, providing insight into the learned representations. Extensive experiments on the CARLA simulator demonstrate that \texttt{ME$^3$-BEV} outperforms existing models across multiple metrics, including collision rate and trajectory accuracy, offering a promising solution for real-time autonomous driving.
ReNiL: Relative Neural Inertial Locator with Any-Scale Bayesian Inference
Pedestrian inertial localization is key for mobile and IoT services because it provides infrastructure-free positioning. Yet most learning-based methods depend on fixed sliding-window integration, struggle to adapt to diverse motion scales and cadences, and yield inconsistent uncertainty, limiting real-world use. We present ReNiL, a Bayesian deep-learning framework for accurate, efficient, and uncertainty-aware pedestrian localization. ReNiL introduces Inertial Positioning Demand Points (IPDPs) to estimate motion at contextually meaningful waypoints instead of dense tracking, and supports inference on IMU sequences at any scale so cadence can match application needs. It couples a motion-aware orientation filter with an Any-Scale Laplace Estimator (ASLE), a dual-task network that blends patch-based self-supervision with Bayesian regression. By modeling displacements with a Laplace distribution, ReNiL provides homogeneous Euclidean uncertainty that integrates cleanly with other sensors. A Bayesian inference chain links successive IPDPs into consistent trajectories. On RoNIN-ds and a new WUDataset covering indoor and outdoor motion from 28 participants, ReNiL achieves state-of-the-art displacement accuracy and uncertainty consistency, outperforming TLIO, CTIN, iMoT, and RoNIN variants while reducing computation. Application studies further show robustness and practicality for mobile and IoT localization, making ReNiL a scalable, uncertainty-aware foundation for next-generation positioning.
PASG: A Closed-Loop Framework for Automated Geometric Primitive Extraction and Semantic Anchoring in Robotic Manipulation ICCV 2025
The fragmentation between high-level task semantics and low-level geometric features remains a persistent challenge in robotic manipulation. While vision-language models (VLMs) have shown promise in generating affordance-aware visual representations, the lack of semantic grounding in canonical spaces and reliance on manual annotations severely limit their ability to capture dynamic semantic-affordance relationships. To address these, we propose Primitive-Aware Semantic Grounding (PASG), a closed-loop framework that introduces: (1) Automatic primitive extraction through geometric feature aggregation, enabling cross-category detection of keypoints and axes; (2) VLM-driven semantic anchoring that dynamically couples geometric primitives with functional affordances and task-relevant description; (3) A spatial-semantic reasoning benchmark and a fine-tuned VLM (Qwen2.5VL-PA). We demonstrate PASG's effectiveness in practical robotic manipulation tasks across diverse scenarios, achieving performance comparable to manual annotations. PASG achieves a finer-grained semantic-affordance understanding of objects, establishing a unified paradigm for bridging geometric primitives with task semantics in robotic manipulation.
comment: Accepted to ICCV 2025. 8 pages main paper, 8 figures, plus supplementary material
Dynamical Trajectory Planning of Disturbance Consciousness for Air-Land Bimodal Unmanned Aerial Vehicles
Air-land bimodal vehicles provide a promising solution for navigating complex environments by combining the flexibility of aerial locomotion with the energy efficiency of ground mobility. To enhance the robustness of trajectory planning under environmental disturbances, this paper presents a disturbance-aware planning framework that incorporates real-time disturbance estimation into both path searching and trajectory optimization. A key component of the framework is a disturbance-adaptive safety boundary adjustment mechanism, which dynamically modifies the vehicle's feasible dynamic boundaries based on estimated disturbances to ensure trajectory feasibility. Leveraging the dynamics model of the bimodal vehicle, the proposed approach achieves adaptive and reliable motion planning across different terrains and operating conditions. A series of real-world experiments and benchmark comparisons on a custom-built platform validate the effectiveness and robustness of the method, demonstrating improvements in tracking accuracy, task efficiency, and energy performance under both ground and aerial disturbances.
Social and Telepresence Robots for Accessibility and Inclusion in Small Museums
There are still many museums that present accessibility barriers, particularly regarding perceptual, cultural, and cognitive aspects. This is especially evident in low-density population areas. The aim of the ROBSO-PM project is to improve the accessibility of small museums through the use of social robots and social telepresence robots, focusing on three museums as case studies: the Museum of the Holy Shroud in Turin, a small but globally known institution, and two lesser known mountain museums: the Museum of the Champlas du Col Carnival and the Pragelato Museum of Alpine Peoples' Costumes and Traditions. The project explores two main applications for robots: as guides supporting inclusive visits for foreign or disabled visitors, and as telepresence tools allowing people with limited mobility to access museums remotely. From a research perspective, key topics include storytelling, robot personality, empathy, personalization, and, in the case of telepresence, collaboration between the robot and the person, with clearly defined roles and autonomy.
Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
Visuomotor policies trained via behavior cloning are vulnerable to covariate shift, where small deviations from expert trajectories can compound into failure. Common strategies to mitigate this issue involve expanding the training distribution through human-in-the-loop corrections or synthetic data augmentation. However, these approaches are often labor-intensive, rely on strong task assumptions, or compromise the quality of imitation. We introduce Latent Policy Barrier, a framework for robust visuomotor policy learning. Inspired by Control Barrier Functions, LPB treats the latent embeddings of expert demonstrations as an implicit barrier separating safe, in-distribution states from unsafe, out-of-distribution (OOD) ones. Our approach decouples the role of precise expert imitation and OOD recovery into two separate modules: a base diffusion policy solely on expert data, and a dynamics model trained on both expert and suboptimal policy rollout data. At inference time, the dynamics model predicts future latent states and optimizes them to stay within the expert distribution. Both simulated and real-world experiments show that LPB improves both policy robustness and data efficiency, enabling reliable manipulation from limited expert data and without additional human correction or annotation.
Affordance-Guided Dual-Armed Disassembly Teleoperation for Mating Parts
Robotic non-destructive disassembly of mating parts remains challenging due to the need for flexible manipulation and the limited visibility of internal structures. This study presents an affordance-guided teleoperation system that enables intuitive human demonstrations for dual-arm fix-and-disassemble tasks for mating parts. The system visualizes feasible grasp poses and disassembly directions in a virtual environment, both derived from the object's geometry, to address occlusions and structural complexity. To prevent excessive position tracking under load when following the affordance, we integrate a hybrid controller that combines position and impedance control into the teleoperated disassembly arm. Real-world experiments validate the effectiveness of the proposed system, showing improved task success rates and reduced object pose deviation.
comment: 6 pages, 9 figures
Modular Vacuum-Based Fixturing System for Adaptive Disassembly Workspace Integration
The disassembly of small household appliances poses significant challenges due to their complex and curved geometries, which render traditional rigid fixtures inadequate. In this paper, we propose a modular vacuum-based fixturing system that leverages commercially available balloon-type soft grippers to conform to arbitrarily shaped surfaces and provide stable support during screw-removal tasks. To enable a reliable deployment of the system, we develop a stability-aware planning framework that samples the bottom surface of the target object, filters candidate contact points based on geometric continuity, and evaluates support configurations using convex hull-based static stability criteria. We compare the quality of object placement under different numbers and configurations of balloon hands. In addition, real-world experiments were conducted to compare the success rates of traditional rigid fixtures with our proposed system. The results demonstrate that our method consistently achieves higher success rates and superior placement stability during screw removal tasks.
comment: 8 pages, 9 figures
Robust-Sub-Gaussian Model Predictive Control for Safe Ultrasound-Image-Guided Robotic Spinal Surgery
Safety-critical control using high-dimensional sensory feedback from optical data (e.g., images, point clouds) poses significant challenges in domains like autonomous driving and robotic surgery. Control can rely on low-dimensional states estimated from high-dimensional data. However, the estimation errors often follow complex, unknown distributions that standard probabilistic models fail to capture, making formal safety guarantees challenging. In this work, we introduce a novel characterization of these general estimation errors using sub-Gaussian noise with bounded mean. We develop a new technique for uncertainty propagation of proposed noise characterization in linear systems, which combines robust set-based methods with the propagation of sub-Gaussian variance proxies. We further develop a Model Predictive Control (MPC) framework that provides closed-loop safety guarantees for linear systems under the proposed noise assumption. We apply this MPC approach in an ultrasound-image-guided robotic spinal surgery pipeline, which contains deep-learning-based semantic segmentation, image-based registration, high-level optimization-based planning, and low-level robotic control. To validate the pipeline, we developed a realistic simulation environment integrating real human anatomy, robot dynamics, efficient ultrasound simulation, as well as in-vivo data of breathing motion and drilling force. Evaluation results in simulation demonstrate the potential of our approach for solving complex image-guided robotic surgery task while ensuring safety.
Learning Causal Structure Distributions for Robust Planning
Structural causal models describe how the components of a robotic system interact. They provide both structural and functional information about the relationships that are present in the system. The structural information outlines the variables among which there is interaction. The functional information describes how such interactions work, via equations or learned models. In this paper we find that learning the functional relationships while accounting for the uncertainty about the structural information leads to more robust dynamics models which improves downstream planning, while using significantly lower computational resources. This in contrast with common model-learning methods that ignore the causal structure and fail to leverage the sparsity of interactions in robotic systems. We achieve this by estimating a causal structure distribution that is used to sample causal graphs that inform the latent-space representations in an encoder-multidecoder probabilistic model. We show that our model can be used to learn the dynamics of a robot, which together with a sampling-based planner can be used to perform new tasks in novel environments, provided an objective function for the new requirement is available. We validate our method using manipulators and mobile robots in both simulation and the real-world. Additionally, we validate the learned dynamics' adaptability and increased robustness to corrupted inputs and changes in the environment, which is highly desirable in challenging real-world robotics scenarios. Video: https://youtu.be/X6k5t7OOnNc.
Improved Obstacle Avoidance for Autonomous Robots with ORCA-FLC
Obstacle avoidance enables autonomous agents and robots to operate safely and efficiently in dynamic and complex environments, reducing the risk of collisions and damage. For a robot or autonomous system to successfully navigate through obstacles, it must be able to detect such obstacles. While numerous collision avoidance algorithms like the dynamic window approach (DWA), timed elastic bands (TEB), and reciprocal velocity obstacles (RVO) have been proposed, they may lead to suboptimal paths due to fixed weights, be computationally expensive, or have limited adaptability to dynamic obstacles in multi-agent environments. Optimal reciprocal collision avoidance (ORCA), which improves on RVO, provides smoother trajectories and stronger collision avoidance guarantees. We propose ORCA-FL to improve on ORCA by using fuzzy logic controllers (FLCs) to better handle uncertainty and imprecision for obstacle avoidance in path planning. Numerous multi-agent experiments are conducted and it is shown that ORCA-FL can outperform ORCA in reducing the number of collision if the agent has a velocity that exceeds a certain threshold. In addition, a proposed algorithm for improving ORCA-FL using fuzzy Q reinforcement learning (FQL) is detailed for optimizing and tuning FLCs.
Optimal Planning and Machine Learning for Responsive Tracking and Enhanced Forecasting of Wildfires using a Spacecraft Constellation
We propose a novel concept of operations using optimal planning methods and machine learning (ML) to collect spaceborne data that is unprecedented for monitoring wildfires, process it to create new or enhanced products in the context of wildfire danger or spread monitoring, and assimilate them to improve existing, wildfire decision support tools delivered to firefighters within latency appropriate for time-critical applications. The concept is studied with respect to NASA's CYGNSS Mission, a constellation of passive microwave receivers that measure specular GNSS-R reflections despite clouds and smoke. Our planner uses a Mixed Integer Program formulation to schedule joint observation data collection and downlink for all satellites. Optimal solutions are found quickly that collect 98-100% of available observation opportunities. ML-based fire predictions that drive the planner objective are greater than 40% more correlated with ground truth than existing state-of-art. The presented case study on the TX Smokehouse Creek fire in 2024 and LA fires in 2025 represents the first high-resolution data collected by CYGNSS of active fires. Creation of Burnt Area Maps (BAM) using ML applied to the data during active fires and BAM assimilation into NASA's Weather Research and Forecasting Model using ML to broadcast fire spread are novel outcomes. BAM and CYGNSS obtained soil moisture are integrated for the first time into USGS fire danger maps. Inclusion of CYGNSS data in ML-based burn predictions boosts accuracy by 13%, and inclusion of high-resolution data boosts ML recall by another 15%. The proposed workflow has an expected latency of 6-30h, improving on the current delivery time of multiple days. All components in the proposed concept are shown to be computationally scalable and globally generalizable, with sustainability considerations such as edge efficiency and low latency on small devices.
GhostShell: Streaming LLM Function Calls for Concurrent Embodied Programming
We present GhostShell, a novel approach that leverages Large Language Models (LLMs) to enable streaming and concurrent behavioral programming for embodied systems. In contrast to conventional methods that rely on pre-scheduled action sequences or behavior trees, GhostShell drives embodied systems to act on-the-fly by issuing function calls incrementally as tokens are streamed from the LLM. GhostShell features a streaming XML function token parser, a dynamic function interface mapper, and a multi-channel scheduler that orchestrates intra-channel synchronous and inter-channel asynchronous function calls, thereby coordinating serial-parallel embodied actions across multiple robotic components under LLM guidance. We evaluate GhostShell on our robotic prototype COCO through comprehensive grounded experiments across 34 real-world interaction tasks and multiple LLM backends. The results demonstrate that our approach achieves a state-of-the-art Behavioral Correctness Metric of 0.85 with Claude-4-Sonnet, and up to 66X faster response times compared to native LLM function calling APIs. GhostShell also proves effective in long-horizon multimodal tasks, exhibiting strong robustness and generalization capabilities.
comment: 17 pages, 5 figures, conference
RoboTron-Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction ICCV 2025
In language-guided visual navigation, agents locate target objects in unseen environments using natural language instructions. For reliable navigation in unfamiliar scenes, agents should possess strong perception, planning, and prediction capabilities. Additionally, when agents revisit previously explored areas during long-term navigation, they may retain irrelevant and redundant historical perceptions, leading to suboptimal results. In this work, we propose RoboTron-Nav, a unified framework that integrates perception, planning, and prediction capabilities through multitask collaborations on navigation and embodied question answering tasks, thereby enhancing navigation performances. Furthermore, RoboTron-Nav employs an adaptive 3D-aware history sampling strategy to effectively and efficiently utilize historical observations. By leveraging large language model, RoboTron-Nav comprehends diverse commands and complex visual scenes, resulting in appropriate navigation actions. RoboTron-Nav achieves an 81.1% success rate in object goal navigation on the $\mathrm{CHORES}$-$\mathbb{S}$ benchmark, setting a new state-of-the-art performance. Project page: https://yvfengzhong.github.io/RoboTron-Nav
comment: ICCV 2025
CARE: Enhancing Safety of Visual Navigation through Collision Avoidance via Repulsive Estimation
We propose CARE (Collision Avoidance via Repulsive Estimation) to improve the robustness of learning-based visual navigation methods. Recently, visual navigation models, particularly foundation models, have demonstrated promising performance by generating viable trajectories using only RGB images. However, these policies can generalize poorly to environments containing out-of-distribution (OOD) scenes characterized by unseen objects or different camera setups (e.g., variations in field of view, camera pose, or focal length). Without fine-tuning, such models could produce trajectories that lead to collisions, necessitating substantial efforts in data collection and additional training. To address this limitation, we introduce CARE, an attachable module that enhances the safety of visual navigation without requiring additional range sensors or fine-tuning of pretrained models. CARE can be integrated seamlessly into any RGB-based navigation model that generates local robot trajectories. It dynamically adjusts trajectories produced by a pretrained model using repulsive force vectors computed from depth images estimated directly from RGB inputs. We evaluate CARE by integrating it with state-of-the-art visual navigation models across diverse robot platforms. Real-world experiments show that CARE significantly reduces collisions (up to 100%) without compromising navigation performance in goal-conditioned navigation, and further improves collision-free travel distance (up to 10.7x) in exploration tasks. Project page: https://airlab-sogang.github.io/CARE/
comment: 16 pages, 6 figures
LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation
Predictive manipulation has recently gained considerable attention in the Embodied AI community due to its potential to improve robot policy performance by leveraging predicted states. However, generating accurate future visual states of robot-object interactions from world models remains a well-known challenge, particularly in achieving high-quality pixel-level representations. To this end, we propose LaDi-WM, a world model that predicts the latent space of future states using diffusion modeling. Specifically, LaDi-WM leverages the well-established latent space aligned with pre-trained Visual Foundation Models (VFMs), which comprises both geometric features (DINO-based) and semantic features (CLIP-based). We find that predicting the evolution of the latent space is easier to learn and more generalizable than directly predicting pixel-level images. Building on LaDi-WM, we design a diffusion policy that iteratively refines output actions by incorporating forecasted states, thereby generating more consistent and accurate results. Extensive experiments on both synthetic and real-world benchmarks demonstrate that LaDi-WM significantly enhances policy performance by 27.9\% on the LIBERO-LONG benchmark and 20\% on the real-world scenario. Furthermore, our world model and policies achieve impressive generalizability in real-world experiments.
comment: CoRL 2025
An improved two-dimensional time-to-collision for articulated vehicles: predicting sideswipe and rear-end collisions
Time-to-collision (TTC) is a widely used measure for predicting rear-end collisions, assuming constant speed and heading for both vehicles in the prediction horizon. However, this conventional formulation cannot detect sideswipe collisions. A two-dimensional extension, $\text{TTC}_{\text{2D}}$, has been proposed in the literature to address lateral interactions. However, this formulation assumes both vehicles have the same heading and that their headings remain unchanged during the manoeuvre, in addition to the constant speed and heading assumptions in the prediction horizon. Moreover, its use for articulated vehicles like a tractor-semitrailer remains unclear. This paper proposes three enhanced versions of $\text{TTC}_{\text{2D}}$ to overcome these limitations. The first incorporates the vehicle heading to account for directional differences. The standard assumption of constant speed and heading in the prediction horizon holds. The second adapts the formulation for articulated vehicles, and the third allows for constant acceleration, relaxing the constant speed assumption in the prediction horizon. All versions are evaluated in simulated cut-in scenarios, covering both sideswipe and rear-end collisions, using the CARLA simulation environment with a tractor-semitrailer model. Results show that the proposed versions predict sideswipe collisions with better accuracy compared to existing $\text{TTC}_{\text{2D}}$. They also detect rear-end collisions similar to the existing methods.
Failure-Aware Multi-Robot Coordination for Resilient and Adaptive Target Tracking
Multi-robot coordination is crucial for autonomous systems, yet real-world deployments often encounter various failures. These include both temporary and permanent disruptions in sensing and communication, which can significantly degrade system robustness and performance if not explicitly modeled. Despite its practical importance, failure-aware coordination remains underexplored in the literature. To bridge the gap between idealized conditions and the complexities of real-world environments, we propose a unified failure-aware coordination framework designed to enable resilient and adaptive multi-robot target tracking under both temporary and permanent failure conditions. Our approach systematically distinguishes between two classes of failures: (1) probabilistic and temporary disruptions, where robots recover from intermittent sensing or communication losses by dynamically adapting paths and avoiding inferred danger zones, and (2) permanent failures, where robots lose sensing or communication capabilities irreversibly, requiring sustained, decentralized behavioral adaptation. To handle these scenarios, the robot team is partitioned into subgroups. Robots that remain connected form a communication group and collaboratively plan using partially centralized nonlinear optimization. Robots experiencing permanent disconnection or failure continue to operate independently through decentralized or individual optimization, allowing them to contribute to the task within their local context. We extensively evaluate our method across a range of benchmark variations and conduct a comprehensive assessment under diverse real-world failure scenarios. Results show that our framework consistently achieves robust performance in realistic environments with unknown danger zones, offering a practical and generalizable solution for the multi-robot systems community.
Would you let a humanoid play storytelling with your child? A usability study on LLM-powered narrative Human-Robot Interaction
A key challenge in human-robot interaction research lies in developing robotic systems that can effectively perceive and interpret social cues, facilitating natural and adaptive interactions. In this work, we present a novel framework for enhancing the attention of the iCub humanoid robot by integrating advanced perceptual abilities to recognise social cues, understand surroundings through generative models, such as ChatGPT, and respond with contextually appropriate social behaviour. Specifically, we propose an interaction task implementing a narrative protocol (storytelling task) in which the human and the robot create a short imaginary story together, exchanging in turn cubes with creative images placed on them. To validate the protocol and the framework, experiments were performed to quantify the degree of usability and the quality of experience perceived by participants interacting with the system. Such a system can be beneficial in promoting effective human robot collaborations, especially in assistance, education and rehabilitation scenarios where the social awareness and the robot responsiveness play a pivotal role.
Learning to Initialize Trajectory Optimization for Vision-Based Autonomous Flight in Unknown Environments IROS 2025
Autonomous flight in unknown environments requires precise spatial and temporal trajectory planning, often involving computationally expensive nonconvex optimization prone to local optima. To overcome these challenges, we present the Neural-Enhanced Trajectory Planner (NEO-Planner), a novel approach that leverages a Neural Network (NN) Planner to provide informed initial values for trajectory optimization. The NN-Planner is trained on a dataset generated by an expert planner using batch sampling, capturing multimodal trajectory solutions. It learns to predict spatial and temporal parameters for trajectories directly from raw sensor observations. NEO-Planner starts optimization from these predictions, accelerating computation speed while maintaining explainability. Furthermore, we introduce a robust online replanning framework that accommodates planning latency for smooth trajectory tracking. Extensive simulations demonstrate that NEO-Planner reduces optimization iterations by 20%, leading to a 26% decrease in computation time compared with pure optimization-based methods. It maintains trajectory quality comparable to baseline approaches and generalizes well to unseen environments. Real-world experiments validate its effectiveness for autonomous drone navigation in cluttered, unknown environments.
comment: Accepted to IROS 2025. Source code available
Unified Multi-Rate Model Predictive Control for a Jet-Powered Humanoid Robot
We propose a novel Model Predictive Control (MPC) framework for a jet-powered flying humanoid robot. The controller is based on a linearised centroidal momentum model to represent the flight dynamics, augmented with a second-order nonlinear model to explicitly account for the slow and nonlinear dynamics of jet propulsion. A key contribution is the introduction of a multi-rate MPC formulation that handles the different actuation rates of the robot's joints and jet engines while embedding the jet dynamics directly into the predictive model. We validated the framework using the jet-powered humanoid robot iRonCub, performing simulations in Mujoco; the simulation results demonstrate the robot's ability to recover from external disturbances and perform stable, non-abrupt flight manoeuvres, validating the effectiveness of the proposed approach.
comment: This paper has been accepted for publication at the 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids), Seoul, 2025
MBA-SLAM: Motion Blur Aware Gaussian Splatting SLAM
Emerging 3D scene representations, such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS), have demonstrated their effectiveness in Simultaneous Localization and Mapping (SLAM) for photo-realistic rendering, particularly when using high-quality video sequences as input. However, existing methods struggle with motion-blurred frames, which are common in real-world scenarios like low-light or long-exposure conditions. This often results in a significant reduction in both camera localization accuracy and map reconstruction quality. To address this challenge, we propose a dense visual deblur SLAM pipeline (i.e. MBA-SLAM) to handle severe motion-blurred inputs and enhance image deblurring. Our approach integrates an efficient motion blur-aware tracker with either neural radiance fields or Gaussian Splatting based mapper. By accurately modeling the physical image formation process of motion-blurred images, our method simultaneously learns 3D scene representation and estimates the cameras' local trajectory during exposure time, enabling proactive compensation for motion blur caused by camera movement. In our experiments, we demonstrate that MBA-SLAM surpasses previous state-of-the-art methods in both camera localization and map reconstruction, showcasing superior performance across a range of datasets, including synthetic and real datasets featuring sharp images as well as those affected by motion blur, highlighting the versatility and robustness of our approach. Code is available at https://github.com/WU-CVGL/MBA-SLAM.
comment: Accepted to TPAMI; Deblur Gaussian Splatting SLAM
CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction ICRA 2025
Real-life robot navigation involves more than just reaching a destination; it requires optimizing movements while addressing scenario-specific goals. An intuitive way for humans to express these goals is through abstract cues like verbal commands or rough sketches. Such human guidance may lack details or be noisy. Nonetheless, we expect robots to navigate as intended. For robots to interpret and execute these abstract instructions in line with human expectations, they must share a common understanding of basic navigation concepts with humans. To this end, we introduce CANVAS, a novel framework that combines visual and linguistic instructions for commonsense-aware navigation. Its success is driven by imitation learning, enabling the robot to learn from human navigation behavior. We present COMMAND, a comprehensive dataset with human-annotated navigation results, spanning over 48 hours and 219 km, designed to train commonsense-aware navigation systems in simulated environments. Our experiments show that CANVAS outperforms the strong rule-based system ROS NavStack across all environments, demonstrating superior performance with noisy instructions. Notably, in the orchard environment, where ROS NavStack records a 0% total success rate, CANVAS achieves a total success rate of 67%. CANVAS also closely aligns with human demonstrations and commonsense constraints, even in unseen environments. Furthermore, real-world deployment of CANVAS showcases impressive Sim2Real transfer with a total success rate of 69%, highlighting the potential of learning from human demonstrations in simulated environments for real-world applications.
comment: Accepted to ICRA 2025, project page https://worv-ai.github.io/canvas
Direct Robot Configuration Space Construction using Convolutional Encoder-Decoders ICML 2025
Intelligent robots must be able to perform safe and efficient motion planning in their environments. Central to modern motion planning is the configuration space. Configuration spaces define the set of configurations of a robot that result in collisions with obstacles in the workspace, $\text{C}_{\text{clsn}}$, and the set of configurations that do not, $\text{C}_{\text{free}}$. Modern approaches to motion planning first compute the configuration space and then perform motion planning using the calculated configuration space. Real-time motion planning requires accurate and efficient construction of configuration spaces. We are the first to apply a convolutional encoder-decoder framework for calculating highly accurate approximations to configuration spaces, essentially learning how the robot and physical world interact. Our model achieves an average 97.5% F1-score for predicting $\text{C}_{\text{free}}$ and $\text{C}_{\text{clsn}}$ for 2-D robotic workspaces with a dual-arm robot. Our method limits undetected collisions to less than 2.5% on robotic workspaces that involve translation, rotation, and removal of obstacles. Our model learns highly transferable features between robotic workspaces, requiring little to no fine-tuning to adapt to new transformations of obstacles in the workspace.
comment: 8 pages, 7 figures, 4 tables; Appeared at the ICML 2025 Workshop on Building Physically Plausible World Models
Human-Machine Shared Control Approach for the Takeover of CACC
Cooperative Adaptive Cruise Control (CACC) often requires human takeover for tasks such as exiting a freeway. Direct human takeover can pose significant risks, especially given the close-following strategy employed by CACC, which might cause drivers to feel unsafe and execute hard braking, potentially leading to collisions. This research aims to develop a CACC takeover controller that ensures a smooth transition from automated to human control. The proposed CACC takeover maneuver employs an indirect human-machine shared control approach, modeled as a Stackelberg competition where the machine acts as the leader and the human as the follower. The machine guides the human to respond in a manner that aligns with the machine's expectations, aiding in maintaining following stability. Additionally, the human reaction function is integrated into the machine's predictive control system, moving beyond a simple "prediction-planning" pipeline to enhance planning optimality. The controller has been verified to i) enable a smooth takeover maneuver of CACC; ii) ensure string stability in the condition that the platoon has less than 6 CAVs and human control authority is less than 40%; iii) enhance both perceived and actual safety through machine interventions; and iv) reduce the impact on upstream traffic by up to 60%.
comment: This article has been published on IEEE Transactions on Intelligent Transportation Systems (2025)
ImLPR: Image-based LiDAR Place Recognition using Vision Foundation Models
LiDAR Place Recognition (LPR) is a key component in robotic localization, enabling robots to align current scans with prior maps of their environment. While Visual Place Recognition (VPR) has embraced Vision Foundation Models (VFMs) to enhance descriptor robustness, LPR has relied on task-specific models with limited use of pre-trained foundation-level knowledge. This is due to the lack of 3D foundation models and the challenges of using VFM with LiDAR point clouds. To tackle this, we introduce ImLPR, a novel pipeline that employs a pre-trained DINOv2 VFM to generate rich descriptors for LPR. To the best of our knowledge, ImLPR is the first method to utilize a VFM for LPR while retaining the majority of pre-trained knowledge. ImLPR converts raw point clouds into novel three-channel Range Image Views (RIV) to leverage VFM in the LiDAR domain. It employs MultiConv adapters and Patch-InfoNCE loss for effective feature learning. We validate ImLPR on public datasets and outperform state-of-the-art (SOTA) methods across multiple evaluation metrics in both intra- and inter-session LPR. Comprehensive ablations on key design choices such as channel composition, RIV, adapters, and the patch-level loss quantify each component's impact. We release ImLPR as open source for the robotics community: https://github.com/minwoo0611/ImLPR.
comment: CoRL2025 Accepted, 23 Pages, 15 Figures and 14 Tables
EfficientEQA: An Efficient Approach to Open-Vocabulary Embodied Question Answering IROS 2025
Embodied Question Answering (EQA) is an essential yet challenging task for robot assistants. Large vision-language models (VLMs) have shown promise for EQA, but existing approaches either treat it as static video question answering without active exploration or restrict answers to a closed set of choices. These limitations hinder real-world applicability, where a robot must explore efficiently and provide accurate answers in open-vocabulary settings. To overcome these challenges, we introduce EfficientEQA, a novel framework that couples efficient exploration with free-form answer generation. EfficientEQA features three key innovations: (1) Semantic-Value-Weighted Frontier Exploration (SFE) with Verbalized Confidence (VC) from a black-box VLM to prioritize semantically important areas to explore, enabling the agent to gather relevant information faster; (2) a BLIP relevancy-based mechanism to stop adaptively by flagging highly relevant observations as outliers to indicate whether the agent has collected enough information; and (3) a Retrieval-Augmented Generation (RAG) method for the VLM to answer accurately based on pertinent images from the agent's observation history without relying on predefined choices. Our experimental results show that EfficientEQA achieves over 15% higher answer accuracy and requires over 20% fewer exploration steps than state-of-the-art methods. Our code is available at: https://github.com/chengkaiAcademyCity/EfficientEQA
comment: IROS 2025 Oral
Designing Robots with, not for: A Co-Design Framework for Empowering Interactions in Forensic Psychiatry
Forensic mental health care involves the treatment of individuals with severe mental disorders who have committed violent offences. These settings are often characterized by high levels of bureaucracy, risk avoidance, and restricted autonomy. Patients frequently experience a profound loss of control over their lives, leading to heightened psychological stress-sometimes resulting in isolation as a safety measure. In this study, we explore how co-design can be used to collaboratively develop a companion robot that helps monitor and regulate stress while maintaining tracking of the patients' interaction behaviours for long-term intervention. We conducted four co-design workshops in a forensic psychiatric clinic with patients, caregivers, and therapists. Our process began with the presentation of an initial speculative prototype to therapists, enabling reflection on shared concerns, ethical risks, and desirable features. This was followed by a creative ideation session with patients, a third workshop focused on defining desired functions and emotional responses, and we are planning a final prototype demo to gather direct patient feedback. Our findings emphasize the importance of empowering patients in the design process and adapting proposals based on their current emotional state. The goal was to empower the patient in the design process and ensure each patient's voice was heard.
Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System
Scene graphs have emerged as a structured and serializable environment representation for grounded spatial reasoning with Large Language Models (LLMs). In this work, we propose SG^2, an iterative Schema-Guided Scene-Graph reasoning framework based on multi-agent LLMs. The agents are grouped into two modules: a (1) Reasoner module for abstract task planning and graph information queries generation, and a (2) Retriever module for extracting corresponding graph information based on code-writing following the queries. Two modules collaborate iteratively, enabling sequential reasoning and adaptive attention to graph information. The scene graph schema, prompted to both modules, serves to not only streamline both reasoning and retrieval process, but also guide the cooperation between two modules. This eliminates the need to prompt LLMs with full graph data, reducing the chance of hallucination due to irrelevant information. Through experiments in multiple simulation environments, we show that our framework surpasses existing LLM-based approaches and baseline single-agent, tool-based Reason-while-Retrieve strategy in numerical Q\&A and planning tasks.
comment: In submission
Vehicle Top Tag Assisted Vehicle-Road Cooperative Localization For Autonomous Public Buses
Accurate vehicle localization is indispensable to autonomous vehicles, but is difficult to realize in complicated application scenarios. Intersection scenarios that suffer from environmental shielding and crowded dynamic objects are especially crucial and challenging. To handle difficult intersection scenarios, the methodology of vehicle top tag assisted vehicle-road cooperative localization or for short vehicle top tag assisted localization is proposed. The proposed methodology has merits of satisfying all the feasibility, reliability, explainability, society and economy concerns. Concrete solutions of vehicle top tag detection and vehicle top tag localization that instantiate the core part of the proposed methodology are presented. Simulation results are provided to demonstrate effectiveness of the presented solutions. The proposed methodology of vehicle top tag assisted localization also has the potential to be extended to a much wider range of practical applications than our intended ones involving autonomous public buses.
Multiagent Systems
ScamAgents: How AI Agents Can Simulate Human-Level Scam Calls
Large Language Models (LLMs) have demonstrated impressive fluency and reasoning capabilities, but their potential for misuse has raised growing concern. In this paper, we present ScamAgent, an autonomous multi-turn agent built on top of LLMs, capable of generating highly realistic scam call scripts that simulate real-world fraud scenarios. Unlike prior work focused on single-shot prompt misuse, ScamAgent maintains dialogue memory, adapts dynamically to simulated user responses, and employs deceptive persuasion strategies across conversational turns. We show that current LLM safety guardrails, including refusal mechanisms and content filters, are ineffective against such agent-based threats. Even models with strong prompt-level safeguards can be bypassed when prompts are decomposed, disguised, or delivered incrementally within an agent framework. We further demonstrate the transformation of scam scripts into lifelike voice calls using modern text-to-speech systems, completing a fully automated scam pipeline. Our findings highlight an urgent need for multi-turn safety auditing, agent-level control frameworks, and new methods to detect and disrupt conversational deception powered by generative AI.
comment: Accepted at CAMLIS 25: Conference on Applied Machine Learning for Information Security. 10 pages, 3 figures
Memp: Exploring Agent Procedural Memory
Large Language Models (LLMs) based agents excel at diverse tasks, yet they suffer from brittle procedural memory that is manually engineered or entangled in static parameters. In this work, we investigate strategies to endow agents with a learnable, updatable, and lifelong procedural memory. We propose Memp that distills past agent trajectories into both fine-grained, step-by-step instructions and higher-level, script-like abstractions, and explore the impact of different strategies for Build, Retrieval, and Update of procedural memory. Coupled with a dynamic regimen that continuously updates, corrects, and deprecates its contents, this repository evolves in lockstep with new experience. Empirical evaluation on TravelPlanner and ALFWorld shows that as the memory repository is refined, agents achieve steadily higher success rates and greater efficiency on analogous tasks. Moreover, procedural memory built from a stronger model retains its value: migrating the procedural memory to a weaker model yields substantial performance gains.
comment: Work in progress
Unsupervised Partner Design Enables Robust Ad-hoc Teamwork
We introduce Unsupervised Partner Design (UPD) - a population-free, multi-agent reinforcement learning framework for robust ad-hoc teamwork that adaptively generates training partners without requiring pretrained partners or manual parameter tuning. UPD constructs diverse partners by stochastically mixing an ego agent's policy with biased random behaviours and scores them using a variance-based learnability metric that prioritises partners near the ego agent's current learning frontier. We show that UPD can be integrated with unsupervised environment design, resulting in the first method enabling fully unsupervised curricula over both level and partner distributions in a cooperative setting. Through extensive evaluations on Overcooked-AI and the Overcooked Generalisation Challenge, we demonstrate that this dynamic partner curriculum is highly effective: UPD consistently outperforms both population-based and population-free baselines as well as ablations. In a user study, we further show that UPD achieves higher returns than all baselines and was perceived as significantly more adaptive, more human-like, a better collaborator, and less frustrating.
comment: 16 pages
Scaling Personality Control in LLMs with Big Five Scaler Prompts
We present Big5-Scaler, a prompt-based framework for conditioning large language models (LLMs) with controllable Big Five personality traits. By embedding numeric trait values into natural language prompts, our method enables fine-grained personality control without additional training. We evaluate Big5-Scaler across trait expression, dialogue generation, and human trait imitation tasks. Results show that it induces consistent and distinguishable personality traits across models, with performance varying by prompt type and scale. Our analysis highlights the effectiveness of concise prompts and lower trait intensities, providing a efficient approach for building personality-aware dialogue agents.
PanelTR: Zero-Shot Table Reasoning Framework Through Multi-Agent Scientific Discussion IJCNN 2025
Table reasoning, including tabular QA and fact verification, often depends on annotated data or complex data augmentation, limiting flexibility and generalization. LLMs, despite their versatility, often underperform compared to simple supervised models. To approach these issues, we introduce PanelTR, a framework utilizing LLM agent scientists for robust table reasoning through a structured scientific approach. PanelTR's workflow involves agent scientists conducting individual investigations, engaging in self-review, and participating in collaborative peer-review discussions. This process, driven by five scientist personas, enables semantic-level transfer without relying on data augmentation or parametric optimization. Experiments across four benchmarks show that PanelTR outperforms vanilla LLMs and rivals fully supervised models, all while remaining independent of training data. Our findings indicate that structured scientific methodology can effectively handle complex tasks beyond table reasoning with flexible semantic understanding in a zero-shot context.
comment: Accepted at IJCNN 2025
Policy Optimization in Multi-Agent Settings under Partially Observable Environments
This work leverages adaptive social learning to estimate partially observable global states in multi-agent reinforcement learning (MARL) problems. Unlike existing methods, the proposed approach enables the concurrent operation of social learning and reinforcement learning. Specifically, it alternates between a single step of social learning and a single step of MARL, eliminating the need for the time- and computation-intensive two-timescale learning frameworks. Theoretical guarantees are provided to support the effectiveness of the proposed method. Simulation results verify that the performance of the proposed methodology can approach that of reinforcement learning when the true state is known.
Asymmetric Network Games: $α$-Potential Function and Learning
In a network game, players interact over a network and the utility of each player depends on his own action and on an aggregate of his neighbours' actions. Many real world networks of interest are asymmetric and involve a large number of heterogeneous players. This paper analyzes static network games using the framework of $\alpha$-potential games. Under mild assumptions on the action sets (compact intervals) and the utility functions (twice continuously differentiable) of the players, we derive an expression for an inexact potential function of the game, called the $\alpha$-potential function. Using such a function, we show that modified versions of the sequential best-response algorithm and the simultaneous gradient play algorithm achieve convergence of players' actions to a $2\alpha$-Nash equilibrium. For linear-quadratic network games, we show that $\alpha$ depends on the maximum asymmetry in the network and is well-behaved for a wide range of networks of practical interest. Further, we derive bounds on the social welfare of the $\alpha$-Nash equilibrium corresponding to the maximum of the $\alpha$-potential function, under suitable assumptions. We numerically illustrate the convergence of the proposed algorithms and properties of the learned $2\alpha$-Nash equilibria.
MAHL: Multi-Agent LLM-Guided Hierarchical Chiplet Design with Adaptive Debugging
As program workloads (e.g., AI) increase in size and algorithmic complexity, the primary challenge lies in their high dimensionality, encompassing computing cores, array sizes, and memory hierarchies. To overcome these obstacles, innovative approaches are required. Agile chip design has already benefited from machine learning integration at various stages, including logic synthesis, placement, and routing. With Large Language Models (LLMs) recently demonstrating impressive proficiency in Hardware Description Language (HDL) generation, it is promising to extend their abilities to 2.5D integration, an advanced technique that saves area overhead and development costs. However, LLM-driven chiplet design faces challenges such as flatten design, high validation cost and imprecise parameter optimization, which limit its chiplet design capability. To address this, we propose MAHL, a hierarchical LLM-based chiplet design generation framework that features six agents which collaboratively enable AI algorithm-hardware mapping, including hierarchical description generation, retrieval-augmented code generation, diverseflow-based validation, and multi-granularity design space exploration. These components together enhance the efficient generation of chiplet design with optimized Power, Performance and Area (PPA). Experiments show that MAHL not only significantly improves the generation accuracy of simple RTL design, but also increases the generation accuracy of real-world chiplet design, evaluated by Pass@5, from 0 to 0.72 compared to conventional LLMs under the best-case scenario. Compared to state-of-the-art CLARIE (expert-based), MAHL achieves comparable or even superior PPA results under certain optimization objectives.
MI9 -- Agent Intelligence Protocol: Runtime Governance for Agentic AI Systems
Agentic AI systems capable of reasoning, planning, and executing actions present fundamentally distinct governance challenges compared to traditional AI models. Unlike conventional AI, these systems exhibit emergent and unexpected behaviors during runtime, introducing novel agent-related risks that cannot be fully anticipated through pre-deployment governance alone. To address this critical gap, we introduce MI9, the first fully integrated runtime governance framework designed specifically for safety and alignment of agentic AI systems. MI9 introduces real-time controls through six integrated components: agency-risk index, agent-semantic telemetry capture, continuous authorization monitoring, Finite-State-Machine (FSM)-based conformance engines, goal-conditioned drift detection, and graduated containment strategies. Operating transparently across heterogeneous agent architectures, MI9 enables the systematic, safe, and responsible deployment of agentic systems in production environments where conventional governance approaches fall short, providing the foundational infrastructure for safe agentic AI deployment at scale. Detailed analysis through a diverse set of scenarios demonstrates MI9's systematic coverage of governance challenges that existing approaches fail to address, establishing the technical foundation for comprehensive agentic AI oversight.
MAATS: A Multi-Agent Automated Translation System Based on MQM Evaluation
We present MAATS, a Multi Agent Automated Translation System that leverages the Multidimensional Quality Metrics (MQM) framework as a fine-grained signal for error detection and refinement. MAATS employs multiple specialized AI agents, each focused on a distinct MQM category (e.g., Accuracy, Fluency, Style, Terminology), followed by a synthesis agent that integrates the annotations to iteratively refine translations. This design contrasts with conventional single-agent methods that rely on self-correction. Evaluated across diverse language pairs and Large Language Models (LLMs), MAATS outperforms zero-shot and single-agent baselines with statistically significant gains in both automatic metrics and human assessments. It excels particularly in semantic accuracy, locale adaptation, and linguistically distant language pairs. Qualitative analysis highlights its strengths in multi-layered error diagnosis, omission detection across perspectives, and context-aware refinement. By aligning modular agent roles with interpretable MQM dimensions, MAATS narrows the gap between black-box LLMs and human translation workflows, shifting focus from surface fluency to deeper semantic and contextual fidelity.
El Agente: An Autonomous Agent for Quantum Chemistry
Computational chemistry tools are widely used to study the behaviour of chemical phenomena. Yet, the complexity of these tools can make them inaccessible to non-specialists and challenging even for experts. In this work, we introduce El Agente Q, an LLM-based multi-agent system that dynamically generates and executes quantum chemistry workflows from natural language user prompts. The system is built on a novel cognitive architecture featuring a hierarchical memory framework that enables flexible task decomposition, adaptive tool selection, post-analysis, and autonomous file handling and submission. El Agente Q is benchmarked on six university-level course exercises and two case studies, demonstrating robust problem-solving performance (averaging >87% task success) and adaptive error handling through in situ debugging. It also supports longer-term, multi-step task execution for more complex workflows, while maintaining transparency through detailed action trace logs. Together, these capabilities lay the foundation for increasingly autonomous and accessible quantum chemistry.
Observation Interference in Partially Observable Assistance Games
We study partially observable assistance games (POAGs), a model of the human-AI value alignment problem which allows the human and the AI assistant to have partial observations. Motivated by concerns of AI deception, we study a qualitatively new phenomenon made possible by partial observability: would an AI assistant ever have an incentive to interfere with the human's observations? First, we prove that sometimes an optimal assistant must take observation-interfering actions, even when the human is playing optimally, and even when there are otherwise-equivalent actions available that do not interfere with observations. Though this result seems to contradict the classic theorem from single-agent decision making that the value of information is nonnegative, we resolve this seeming contradiction by developing a notion of interference defined on entire policies. This can be viewed as an extension of the classic result that the value of information is nonnegative into the cooperative multiagent setting. Second, we prove that if the human is simply making decisions based on their immediate outcomes, the assistant might need to interfere with observations as a way to query the human's preferences. We show that this incentive for interference goes away if the human is playing optimally, or if we introduce a communication channel for the human to communicate their preferences to the assistant. Third, we show that if the human acts according to the Boltzmann model of irrationality, this can create an incentive for the assistant to interfere with observations. Finally, we use an experimental model to analyze tradeoffs faced by the AI assistant in practice when considering whether or not to take observation-interfering actions.
Personalized Constitutionally-Aligned Agentic Superego: Secure AI Behavior Aligned to Diverse Human Values
Agentic AI systems, possessing capabilities for autonomous planning and action, show great potential across diverse domains. However, their practical deployment is hindered by challenges in aligning their behavior with varied human values, complex safety requirements, and specific compliance needs. Existing alignment methodologies often falter when faced with the complex task of providing personalized context without inducing confabulation or operational inefficiencies. This paper introduces a novel solution: a 'superego' agent, designed as a personalized oversight mechanism for agentic AI. This system dynamically steers AI planning by referencing user-selected 'Creed Constitutions' encapsulating diverse rule sets -- with adjustable adherence levels to fit non-negotiable values. A real-time compliance enforcer validates plans against these constitutions and a universal ethical floor before execution. We present a functional system, including a demonstration interface with a prototypical constitution-sharing portal, and successful integration with third-party models via the Model Context Protocol (MCP). Comprehensive benchmark evaluations (HarmBench, AgentHarm) demonstrate that our Superego agent dramatically reduces harmful outputs -- achieving up to a 98.3% harm score reduction and near-perfect refusal rates (e.g., 100% with Claude Sonnet 4 on AgentHarm's harmful set) for leading LLMs like Gemini 2.5 Flash and GPT-4o. This approach substantially simplifies personalized AI alignment, rendering agentic systems more reliably attuned to individual and cultural contexts, while also enabling substantial safety improvements. An overview on this research with examples is available at https://superego.creed.space.
comment: 42 pages, 6 figures
Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System
Scene graphs have emerged as a structured and serializable environment representation for grounded spatial reasoning with Large Language Models (LLMs). In this work, we propose SG^2, an iterative Schema-Guided Scene-Graph reasoning framework based on multi-agent LLMs. The agents are grouped into two modules: a (1) Reasoner module for abstract task planning and graph information queries generation, and a (2) Retriever module for extracting corresponding graph information based on code-writing following the queries. Two modules collaborate iteratively, enabling sequential reasoning and adaptive attention to graph information. The scene graph schema, prompted to both modules, serves to not only streamline both reasoning and retrieval process, but also guide the cooperation between two modules. This eliminates the need to prompt LLMs with full graph data, reducing the chance of hallucination due to irrelevant information. Through experiments in multiple simulation environments, we show that our framework surpasses existing LLM-based approaches and baseline single-agent, tool-based Reason-while-Retrieve strategy in numerical Q\&A and planning tasks.
comment: In submission
Systems and Control (CS)
SCAR: State-Space Compression for AI-Driven Resource Management in 6G-Enabled Vehicular Infotainment Systems
The advent of 6G networks opens new possibilities for connected infotainment services in vehicular environments. However, traditional Radio Resource Management (RRM) techniques struggle with the increasing volume and complexity of data such as Channel Quality Indicators (CQI) from autonomous vehicles. To address this, we propose SCAR (State-Space Compression for AI-Driven Resource Management), an Edge AI-assisted framework that optimizes scheduling and fairness in vehicular infotainment. SCAR employs ML-based compression techniques (e.g., clustering and RBF networks) to reduce CQI data size while preserving essential features. These compressed states are used to train 6G-enabled Reinforcement Learning policies that maximize throughput while meeting fairness objectives defined by the NGMN. Simulations show that SCAR increases time in feasible scheduling regions by 14\% and reduces unfair scheduling time by 15\% compared to RL baselines without CQI compression. Furthermore, Simulated Annealing with Stochastic Tunneling (SAST)-based clustering reduces CQI clustering distortion by 10\%, confirming its efficiency. These results demonstrate SCAR's scalability and fairness benefits for dynamic vehicular networks.
Decentralized Optimization via RC-ALADIN with Efficient Quantized Communication
In this paper, we investigate the problem of decentralized consensus optimization over directed graphs with limited communication bandwidth. We introduce a novel decentralized optimization algorithm that combines the Reduced Consensus Augmented Lagrangian Alternating Direction Inexact Newton (RC-ALADIN) method with a finite time quantized coordination protocol, enabling quantized information exchange among nodes. Assuming the nodes' local objective functions are $\mu$-strongly convex and simply smooth, we establish global convergence at a linear rate to a neighborhood of the optimal solution, with the neighborhood size determined by the quantization level. Additionally, we show that the same convergence result also holds for the case where the local objective functions are convex and $L$-smooth. Numerical experiments demonstrate that our proposed algorithm compares favorably against algorithms in the current literature while exhibiting communication efficient operation.
Panel-Scale Reconfigurable Photonic Interconnects for Scalable AI Computation
Panel-scale reconfigurable photonic interconnects on a glass substrate up to 500-mm x 500-mm or larger are envisioned by proposing a novel photonic switch fabric that enables all directional panel-edge-to-panel-edge reach without the need for active repeaters while offering high communication bandwidth, planar-direction reconfigurability, low energy consumption, and compelling data bandwidth density for heterogeneous integration of an in-package AI computing system on a single glass-substrate photonic interposer exceeding thousands of centimeters square. The proposed approach focuses on reconfigurable photonic interconnects, which are integration-compatible with commercial processor chiplets and 3D high-bandwidth memory (HBM) stacks on a large-area glass substrate, to create a novel panel-scale heterogeneously integrated interposer or package enabling low-energy and high-capacity wavelength-division-multiplexing (WDM) optical data links using advanced high-speed optical modulators, broadband photodetectors, novel optical crossbar switches with multi-layer waveguides, and in-package frequency comb sources.
comment: 16 pages, 9 figures, 2 tables
Data-Driven Density Steering via the Gromov-Wasserstein Optimal Transport Distance
We tackle the data-driven chance-constrained density steering problem using the Gromov-Wasserstein metric. The underlying dynamical system is an unknown linear controlled recursion, with the assumption that sufficiently rich input-output data from pre-operational experiments are available. The initial state is modeled as a Gaussian mixture, while the terminal state is required to match a specified Gaussian distribution. We reformulate the resulting optimal control problem as a difference-of-convex program and show that it can be efficiently and tractably solved using the DC algorithm. Numerical results validate our approach through various data-driven schemes.
comment: To be presented at the IEEE CDC, Rio de Janeiro, 2025
Robust-Sub-Gaussian Model Predictive Control for Safe Ultrasound-Image-Guided Robotic Spinal Surgery
Safety-critical control using high-dimensional sensory feedback from optical data (e.g., images, point clouds) poses significant challenges in domains like autonomous driving and robotic surgery. Control can rely on low-dimensional states estimated from high-dimensional data. However, the estimation errors often follow complex, unknown distributions that standard probabilistic models fail to capture, making formal safety guarantees challenging. In this work, we introduce a novel characterization of these general estimation errors using sub-Gaussian noise with bounded mean. We develop a new technique for uncertainty propagation of proposed noise characterization in linear systems, which combines robust set-based methods with the propagation of sub-Gaussian variance proxies. We further develop a Model Predictive Control (MPC) framework that provides closed-loop safety guarantees for linear systems under the proposed noise assumption. We apply this MPC approach in an ultrasound-image-guided robotic spinal surgery pipeline, which contains deep-learning-based semantic segmentation, image-based registration, high-level optimization-based planning, and low-level robotic control. To validate the pipeline, we developed a realistic simulation environment integrating real human anatomy, robot dynamics, efficient ultrasound simulation, as well as in-vivo data of breathing motion and drilling force. Evaluation results in simulation demonstrate the potential of our approach for solving complex image-guided robotic surgery task while ensuring safety.
Learning Causal Structure Distributions for Robust Planning
Structural causal models describe how the components of a robotic system interact. They provide both structural and functional information about the relationships that are present in the system. The structural information outlines the variables among which there is interaction. The functional information describes how such interactions work, via equations or learned models. In this paper we find that learning the functional relationships while accounting for the uncertainty about the structural information leads to more robust dynamics models which improves downstream planning, while using significantly lower computational resources. This in contrast with common model-learning methods that ignore the causal structure and fail to leverage the sparsity of interactions in robotic systems. We achieve this by estimating a causal structure distribution that is used to sample causal graphs that inform the latent-space representations in an encoder-multidecoder probabilistic model. We show that our model can be used to learn the dynamics of a robot, which together with a sampling-based planner can be used to perform new tasks in novel environments, provided an objective function for the new requirement is available. We validate our method using manipulators and mobile robots in both simulation and the real-world. Additionally, we validate the learned dynamics' adaptability and increased robustness to corrupted inputs and changes in the environment, which is highly desirable in challenging real-world robotics scenarios. Video: https://youtu.be/X6k5t7OOnNc.
Secure and Decentralized Peer-to-Peer Energy Transactions using Blockchain Technology
This paper presents an optimal peer-to-peer (P2P) energy transaction mechanism leveraging decentralized blockchain technology to enable a secure and scalable retail electricity market for the increasing penetration of distributed energy resources (DERs). A decentralized bidding strategy is proposed to maximize individual profits while collectively enhancing social welfare. The market design and transaction processes are simulated using the Ethereum testnet, demonstrating the blockchain network's capability to ensure secure, transparent, and sustainable P2P energy trading among DER participants.
Embedded Microcontrol for Photovoltaic Water Pumping System
We introduce a novel 3-axis solar tracker water pumping system. The charge generated from solar energy converted by the photovolatic panel (PV) cells is stored in a 12V battery that in turn powers two water diaphragm pumps using a solar charge controller that includes an MPPT algorithm that serves as a DC-DC converter. The system is analyzed from an embedded microcontroller and embedded software perspective using Arduino. The photovoltaic panel uses four light photocell resistors (LPRs) which measure solar light intensity. An ultrasonic sensor measures the water level in a reservoir water tank. If the water level is too low, water is pumped from one water tank to the reservoir tank. Using a soil moisture sensor, another water pump pumps water from the reservoir tank to the plant if water is needed. Circuit designs for the system are provided as well as the embedded software used. Simulation and experimental results are given.
Dual-Head Physics-Informed Graph Decision Transformer for Distribution System Restoration
Driven by recent advances in sensing and computing, deep reinforcement learning (DRL) technologies have shown great potential for addressing distribution system restoration (DSR) under uncertainty. However, their data-intensive nature and reliance on the Markov Decision Process (MDP) assumption limit their ability to handle scenarios that require long-term temporal dependencies or few-shot and zero-shot decision making. Emerging Decision Transformers (DTs), which leverage causal transformers for sequence modeling in DRL tasks, offer a promising alternative. However, their reliance on return-to-go (RTG) cloning and limited generalization capacity restricts their effectiveness in dynamic power system environments. To address these challenges, we introduce an innovative Dual-Head Physics-informed Graph Decision Transformer (DH-PGDT) that integrates physical modeling, structural reasoning, and subgoal-based guidance to enable scalable and robust DSR even in zero-shot or few-shot scenarios. DH-PGDT features a dual-head physics-informed causal transformer architecture comprising Guidance Head, which generates subgoal representations, and Action Head, which uses these subgoals to generate actions independently of RTG. It also incorporates an operational constraint-aware graph reasoning module that encodes power system topology and operational constraints to generate a confidence-weighted action vector for refining DT trajectories. This design effectively improves generalization and enables robust adaptation to unseen scenarios. While this work focuses on DSR, the underlying computing model of the proposed PGDT is broadly applicable to sequential decision making across various power system operations and other complex engineering domains.
Asymmetric Network Games: $α$-Potential Function and Learning
In a network game, players interact over a network and the utility of each player depends on his own action and on an aggregate of his neighbours' actions. Many real world networks of interest are asymmetric and involve a large number of heterogeneous players. This paper analyzes static network games using the framework of $\alpha$-potential games. Under mild assumptions on the action sets (compact intervals) and the utility functions (twice continuously differentiable) of the players, we derive an expression for an inexact potential function of the game, called the $\alpha$-potential function. Using such a function, we show that modified versions of the sequential best-response algorithm and the simultaneous gradient play algorithm achieve convergence of players' actions to a $2\alpha$-Nash equilibrium. For linear-quadratic network games, we show that $\alpha$ depends on the maximum asymmetry in the network and is well-behaved for a wide range of networks of practical interest. Further, we derive bounds on the social welfare of the $\alpha$-Nash equilibrium corresponding to the maximum of the $\alpha$-potential function, under suitable assumptions. We numerically illustrate the convergence of the proposed algorithms and properties of the learned $2\alpha$-Nash equilibria.
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
comment: 8 pages, 3 figures
Modeling and Simulation of an Active Quarter Car Suspension with a Robust LQR Controller under Road Disturbance and Parameter Uncertainty
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system increases the road holding of vehicles, allows them to take turns safely, and reduces the risk of traffic accidents. A passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost, and easy adjustment of coefficients. In this study, a more robust LQR-controlled active suspension was designed than a passive suspension and a PID-controlled active suspension. Robustness analyses were performed for passive suspension, PID-controlled active suspension, and LQR-controlled active suspension. Suspension travel, sprung mass acceleration, and sprung mass motion simulations were performed for all 3 suspensions under road disturbance and under simultaneous road disturbance and parameter uncertainty. A comparative analysis was performed by obtaining the suspension rise time, overshoot, and settling time data. It was observed that the LQR-controlled active suspension showed the least overshoot and had the shortest settling time. In this case, it was proven that the LQR controlled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 16 pages, 13 figures
Continuous-time Data-driven Barrier Certificate Synthesis
We consider the problem of verifying safety for continuous-time dynamical systems. Developing upon recent advancements in data-driven verification, we use only a finite number of sampled trajectories to learn a barrier certificate, namely a function which verifies safety. We train a safety-informed neural network to act as this certificate, with an appropriately designed loss function to encompass the safety conditions. In addition, we provide probabilistic generalisation guarantees from discrete samples of continuous trajectories, to unseen continuous ones. Numerical investigations demonstrate the efficacy of our approach and contrast it with related results in the literature.
comment: Accepted at CDC 2025. arXiv admin note: text overlap with arXiv:2502.05510
Observability and Generalized Sensor Placement for Nonlinear Quality Models in Drinking Water Networks
This paper studies the problem of optimal placement of water quality (WQ) sensors in water distribution networks (WDNs), with a focus on chlorine transport, decay, and reaction models. Such models are traditionally used as suitable proxies for WQ. The literature on this topic is inveterate, but has a key limitation: it utilizes simplified single-species decay and reaction models that do not capture WQ transients for nonlinear, multi-species interactions. This results in sensor placements (SP) that do not account for nonlinear WQ dynamics. Furthermore, as WQ simulations are parameterized by hydraulic profiles and demand patterns, the placement of sensors are often hydraulics-dependent. This study produces a greedy algorithm that addresses the two aforementioned limitations. The algorithm is grounded in nonlinear dynamic systems and observability theory, and yields SPs that are submodular and robust to hydraulic changes. Case studies on benchmark water networks are provided. The key findings provide practical recommendations for WDN operators.
Unified Multi-Rate Model Predictive Control for a Jet-Powered Humanoid Robot
We propose a novel Model Predictive Control (MPC) framework for a jet-powered flying humanoid robot. The controller is based on a linearised centroidal momentum model to represent the flight dynamics, augmented with a second-order nonlinear model to explicitly account for the slow and nonlinear dynamics of jet propulsion. A key contribution is the introduction of a multi-rate MPC formulation that handles the different actuation rates of the robot's joints and jet engines while embedding the jet dynamics directly into the predictive model. We validated the framework using the jet-powered humanoid robot iRonCub, performing simulations in Mujoco; the simulation results demonstrate the robot's ability to recover from external disturbances and perform stable, non-abrupt flight manoeuvres, validating the effectiveness of the proposed approach.
comment: This paper has been accepted for publication at the 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids), Seoul, 2025
A Versatile Framework for Data-Driven Control of Nonlinear Systems
This note aims to provide a systematic investigation of direct data-driven control, enriching the existing literature not by adding another isolated result, but rather by offering a unifying, versatile, and broad framework that enables the generation of novel results in this domain. We formulate the nonlinear design problem from a high-level perspective as a set of desired controlled systems and propose systematic procedures to synthesize data-driven control algorithms that meet the specified design requirements. Various examples are presented to demonstrate the applicability of the proposed approach and its ability to derive new insights and results, illustrating the novel contributions enabled by the framework.
Spatial Association Between Near-Misses and Accident Blackspots in Sydney, Australia: A Getis-Ord $G_i^*$ Analysis
Conventional road safety management is inherently reactive, relying on analysis of sparse and lagged historical crash data to identify hazardous locations, or crash blackspots. The proliferation of vehicle telematics presents an opportunity for a paradigm shift towards proactive safety, using high-frequency, high-resolution near-miss data as a leading indicator of crash risk. This paper presents a spatial-statistical framework to systematically analyze the concordance and discordance between official crash records and near-miss events within urban environment. A Getis-Ord statistic is first applied to both reported crashes and near-miss events to identify statistically significant local clusters of each type. Subsequently, Bivariate Local Moran's I assesses spatial relationships between crash counts and High-G event counts, classifying grid cells into distinct profiles: High-High (coincident risk), High-Low and Low-High. Our analysis reveals significant amount of Low-Crash, High-Near-Miss clusters representing high-risk areas that remain unobservable when relying solely on historical crash data. Feature importance analysis is performed using contextual Point of Interest data to identify the different infrastructure factors that characterize difference between spatial clusters. The results provide a data-driven methodology for transport authorities to transition from a reactive to a proactive safety management strategy, allowing targeted interventions before severe crashes occur.
A Linear Differential Inclusion for Contraction Analysis to Known Trajectories
Infinitesimal contraction analysis provides exponential convergence rates between arbitrary pairs of trajectories of a system by studying the system's linearization. An essentially equivalent viewpoint arises through stability analysis of a linear differential inclusion (LDI) encompassing the incremental behavior of the system. In this note, we use contraction tools to study the exponential stability of a system to a particular known trajectory, deriving a new LDI characterizing the error between arbitrary trajectories and this known trajectory. As with classical contraction analysis, this new inclusion is constructed via first partial derivatives of the system's vector field, and convergence rates are obtained with familiar tools: uniform bounding of the logarithmic norm and LMI-based Lyapunov conditions. Our LDI is guaranteed to outperform a usual contraction analysis in two special circumstances: i) when the bound on the logarithmic norm arises from an interval overapproximation of the Jacobian matrix, and ii) when the norm considered is the $\ell_1$ norm. Finally, we demonstrate how the proposed approach strictly improves an existing framework for ellipsoidal reachable set computation.
Koopman-based control using sum-of-squares optimization: Improved stability guarantees and data efficiency
In this paper, we propose a novel controller design approach for unknown nonlinear systems using the Koopman operator. In particular, we use the recently proposed stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) architecture to generate a data-driven bilinear surrogate model with certified error bounds. Then, by accounting for the obtained error bounds in a controller design based on the bilinear system, one can guarantee closed-loop stability for the true nonlinear system. While existing approaches over-approximate the bilinearity of the surrogate model, thus introducing conservatism and providing only local guarantees, we explicitly account for the bilinearity by using sum-of-squares (SOS) optimization in the controller design. More precisely, we parametrize a rational controller stabilizing the error-affected bilinear surrogate model and, consequently, the underlying nonlinear system. The resulting SOS optimization problem provides explicit data-driven controller design conditions for unknown nonlinear systems based on semidefinite programming. Our approach significantly reduces conservatism by establishing a larger region of attraction and improved data efficiency. The proposed method is evaluated using numerical examples, demonstrating its advantages over existing approaches.
comment: Accepted for publication in the European Journal of Control, 2025
Koopman-based control of nonlinear systems with closed-loop guarantees
In this paper, we provide a tutorial overview and an extension of a recently developed framework for data-driven control of unknown nonlinear systems with rigorous closed-loop guarantees. The proposed approach relies on the Koopman operator representation of the nonlinear system, for which a bilinear surrogate model is estimated based on data. In contrast to existing Koopman-based estimation procedures, we state guaranteed bounds on the approximation error using the stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) framework. The resulting surrogate model and the uncertainty bounds allow us to design controllers via robust control theory and sum-of-squares optimization, guaranteeing desirable properties for the closed-loop system. We present results on stabilization both in discrete and continuous time, and we derive a method for controller design with performance objectives. The benefits of the presented framework over established approaches are demonstrated with a numerical example.
comment: Accepted for publication in at-Automatisierungstechnik
Equilibrium Selection in Replicator Equations Using Adaptive-Gain Control
In this paper, we deal with the equilibrium selection problem, which amounts to steering a population of individuals engaged in strategic game-theoretic interactions to a desired collective behavior. In the literature, this problem has been typically tackled by means of open-loop strategies, whose applicability is however limited by the need of accurate a priori information on the game and scarce robustness to uncertainty and noise. Here, we overcome these limitations by adopting a closed-loop approach using an adaptive-gain control scheme within a replicator equation -a nonlinear ordinary differential equation that models the evolution of the collective behavior of the population. For most classes of 2-action matrix games we establish sufficient conditions to design a controller that guarantees convergence of the replicator equation to the desired equilibrium, requiring limited a-priori information on the game. Numerical simulations corroborate and expand our theoretical findings.
comment: Published in the IEEE Transactions on Automatic Control, 2025. arXiv admin note: text overlap with arXiv:2306.14469
Distribution Grids May Be a Barrier To Residential Electrification
Replacing fossil-fueled appliances and vehicles with electric alternatives can reduce greenhouse gas emissions and air pollution in many settings. However, residential electrification can also raise electricity demand beyond the safe limits of electrical infrastructure. This can increase the risk of blackouts or require grid reinforcement that is often slow and expensive. Here, we estimate the physical and economic impacts on distribution grids of electrifying all housing and personal vehicles in each county of the lower 48 United States. We find that space heating is the main driver of grid impacts, with the coldest regions seeing demand peaks up to five times higher than today's peaks. Accommodating electrification of all housing and personal vehicles is estimated to require 600 GW of distribution grid reinforcement nationally, at a cost of \$350 to \$790 billion, or \$2,800 to \$6,400 per household (95% confidence intervals). However, demand-side management could eliminate three-quarters of grid reinforcement costs.
Distributed ADMM Approach for the Power Distribution Network Reconfiguration
The electrical network reconfiguration problem aims to minimize losses in a distribution system by adjusting switches while ensuring radial topology. The growing use of renewable energy and the complexity of managing modern power grids make solving the reconfiguration problem crucial. Distributed algorithms help optimize grid configurations, ensuring efficient adaptation to changing conditions and better utilization of renewable energy sources. This paper introduces a distributed algorithm designed to tackle the problem of power distribution network reconfiguration with a radiality constraint. This algorithm relies on ADMM (Alternating Direction Method of Multipliers), where each agent progressively updates its estimation based on the information exchanged with neighboring agents. We show that every agent is required to solve a linearly constrained convex quadratic programming problem and a Minimum Weight Rooted Arborescence Problem (MWRAP) with local weights during each iteration. Through numerical experiments, we demonstrate the performance of the proposed algorithm in various scenarios, including its application to a 33-bus test system and a real-world network.
Extremum Seeking Control for Antenna Pointing via Symmetric Product Approximation
This paper investigates extremum seeking control for a torque-controlled antenna pointing system without direct angular measurements. We consider a two-degree-of-freedom (2-DOF) antenna system that receives an unknown signal from its environment, where the signal strength varies with the antenna orientation. It is assumed that only real-time measurements of the signal are available. We develop an extremum seeking control strategy that enables the antenna to autonomously adjust its direction to maximize the received signal strength based on the symmetric product approximation. Under suitable assumptions on the signal function, we prove local practical uniform asymptotic stability for the closed-loop system.
comment: This paper has been accepted for presentation at The 2025 Modeling, Estimation and Control Conference (MECC 2025). This is the full version of the paper, which contains complete proofs and additional details not included in the conference version
Systems and Control (EESS)
SCAR: State-Space Compression for AI-Driven Resource Management in 6G-Enabled Vehicular Infotainment Systems
The advent of 6G networks opens new possibilities for connected infotainment services in vehicular environments. However, traditional Radio Resource Management (RRM) techniques struggle with the increasing volume and complexity of data such as Channel Quality Indicators (CQI) from autonomous vehicles. To address this, we propose SCAR (State-Space Compression for AI-Driven Resource Management), an Edge AI-assisted framework that optimizes scheduling and fairness in vehicular infotainment. SCAR employs ML-based compression techniques (e.g., clustering and RBF networks) to reduce CQI data size while preserving essential features. These compressed states are used to train 6G-enabled Reinforcement Learning policies that maximize throughput while meeting fairness objectives defined by the NGMN. Simulations show that SCAR increases time in feasible scheduling regions by 14\% and reduces unfair scheduling time by 15\% compared to RL baselines without CQI compression. Furthermore, Simulated Annealing with Stochastic Tunneling (SAST)-based clustering reduces CQI clustering distortion by 10\%, confirming its efficiency. These results demonstrate SCAR's scalability and fairness benefits for dynamic vehicular networks.
Decentralized Optimization via RC-ALADIN with Efficient Quantized Communication
In this paper, we investigate the problem of decentralized consensus optimization over directed graphs with limited communication bandwidth. We introduce a novel decentralized optimization algorithm that combines the Reduced Consensus Augmented Lagrangian Alternating Direction Inexact Newton (RC-ALADIN) method with a finite time quantized coordination protocol, enabling quantized information exchange among nodes. Assuming the nodes' local objective functions are $\mu$-strongly convex and simply smooth, we establish global convergence at a linear rate to a neighborhood of the optimal solution, with the neighborhood size determined by the quantization level. Additionally, we show that the same convergence result also holds for the case where the local objective functions are convex and $L$-smooth. Numerical experiments demonstrate that our proposed algorithm compares favorably against algorithms in the current literature while exhibiting communication efficient operation.
Panel-Scale Reconfigurable Photonic Interconnects for Scalable AI Computation
Panel-scale reconfigurable photonic interconnects on a glass substrate up to 500-mm x 500-mm or larger are envisioned by proposing a novel photonic switch fabric that enables all directional panel-edge-to-panel-edge reach without the need for active repeaters while offering high communication bandwidth, planar-direction reconfigurability, low energy consumption, and compelling data bandwidth density for heterogeneous integration of an in-package AI computing system on a single glass-substrate photonic interposer exceeding thousands of centimeters square. The proposed approach focuses on reconfigurable photonic interconnects, which are integration-compatible with commercial processor chiplets and 3D high-bandwidth memory (HBM) stacks on a large-area glass substrate, to create a novel panel-scale heterogeneously integrated interposer or package enabling low-energy and high-capacity wavelength-division-multiplexing (WDM) optical data links using advanced high-speed optical modulators, broadband photodetectors, novel optical crossbar switches with multi-layer waveguides, and in-package frequency comb sources.
comment: 16 pages, 9 figures, 2 tables
Data-Driven Density Steering via the Gromov-Wasserstein Optimal Transport Distance
We tackle the data-driven chance-constrained density steering problem using the Gromov-Wasserstein metric. The underlying dynamical system is an unknown linear controlled recursion, with the assumption that sufficiently rich input-output data from pre-operational experiments are available. The initial state is modeled as a Gaussian mixture, while the terminal state is required to match a specified Gaussian distribution. We reformulate the resulting optimal control problem as a difference-of-convex program and show that it can be efficiently and tractably solved using the DC algorithm. Numerical results validate our approach through various data-driven schemes.
comment: To be presented at the IEEE CDC, Rio de Janeiro, 2025
Robust-Sub-Gaussian Model Predictive Control for Safe Ultrasound-Image-Guided Robotic Spinal Surgery
Safety-critical control using high-dimensional sensory feedback from optical data (e.g., images, point clouds) poses significant challenges in domains like autonomous driving and robotic surgery. Control can rely on low-dimensional states estimated from high-dimensional data. However, the estimation errors often follow complex, unknown distributions that standard probabilistic models fail to capture, making formal safety guarantees challenging. In this work, we introduce a novel characterization of these general estimation errors using sub-Gaussian noise with bounded mean. We develop a new technique for uncertainty propagation of proposed noise characterization in linear systems, which combines robust set-based methods with the propagation of sub-Gaussian variance proxies. We further develop a Model Predictive Control (MPC) framework that provides closed-loop safety guarantees for linear systems under the proposed noise assumption. We apply this MPC approach in an ultrasound-image-guided robotic spinal surgery pipeline, which contains deep-learning-based semantic segmentation, image-based registration, high-level optimization-based planning, and low-level robotic control. To validate the pipeline, we developed a realistic simulation environment integrating real human anatomy, robot dynamics, efficient ultrasound simulation, as well as in-vivo data of breathing motion and drilling force. Evaluation results in simulation demonstrate the potential of our approach for solving complex image-guided robotic surgery task while ensuring safety.
Learning Causal Structure Distributions for Robust Planning
Structural causal models describe how the components of a robotic system interact. They provide both structural and functional information about the relationships that are present in the system. The structural information outlines the variables among which there is interaction. The functional information describes how such interactions work, via equations or learned models. In this paper we find that learning the functional relationships while accounting for the uncertainty about the structural information leads to more robust dynamics models which improves downstream planning, while using significantly lower computational resources. This in contrast with common model-learning methods that ignore the causal structure and fail to leverage the sparsity of interactions in robotic systems. We achieve this by estimating a causal structure distribution that is used to sample causal graphs that inform the latent-space representations in an encoder-multidecoder probabilistic model. We show that our model can be used to learn the dynamics of a robot, which together with a sampling-based planner can be used to perform new tasks in novel environments, provided an objective function for the new requirement is available. We validate our method using manipulators and mobile robots in both simulation and the real-world. Additionally, we validate the learned dynamics' adaptability and increased robustness to corrupted inputs and changes in the environment, which is highly desirable in challenging real-world robotics scenarios. Video: https://youtu.be/X6k5t7OOnNc.
Secure and Decentralized Peer-to-Peer Energy Transactions using Blockchain Technology
This paper presents an optimal peer-to-peer (P2P) energy transaction mechanism leveraging decentralized blockchain technology to enable a secure and scalable retail electricity market for the increasing penetration of distributed energy resources (DERs). A decentralized bidding strategy is proposed to maximize individual profits while collectively enhancing social welfare. The market design and transaction processes are simulated using the Ethereum testnet, demonstrating the blockchain network's capability to ensure secure, transparent, and sustainable P2P energy trading among DER participants.
Embedded Microcontrol for Photovoltaic Water Pumping System
We introduce a novel 3-axis solar tracker water pumping system. The charge generated from solar energy converted by the photovolatic panel (PV) cells is stored in a 12V battery that in turn powers two water diaphragm pumps using a solar charge controller that includes an MPPT algorithm that serves as a DC-DC converter. The system is analyzed from an embedded microcontroller and embedded software perspective using Arduino. The photovoltaic panel uses four light photocell resistors (LPRs) which measure solar light intensity. An ultrasonic sensor measures the water level in a reservoir water tank. If the water level is too low, water is pumped from one water tank to the reservoir tank. Using a soil moisture sensor, another water pump pumps water from the reservoir tank to the plant if water is needed. Circuit designs for the system are provided as well as the embedded software used. Simulation and experimental results are given.
Dual-Head Physics-Informed Graph Decision Transformer for Distribution System Restoration
Driven by recent advances in sensing and computing, deep reinforcement learning (DRL) technologies have shown great potential for addressing distribution system restoration (DSR) under uncertainty. However, their data-intensive nature and reliance on the Markov Decision Process (MDP) assumption limit their ability to handle scenarios that require long-term temporal dependencies or few-shot and zero-shot decision making. Emerging Decision Transformers (DTs), which leverage causal transformers for sequence modeling in DRL tasks, offer a promising alternative. However, their reliance on return-to-go (RTG) cloning and limited generalization capacity restricts their effectiveness in dynamic power system environments. To address these challenges, we introduce an innovative Dual-Head Physics-informed Graph Decision Transformer (DH-PGDT) that integrates physical modeling, structural reasoning, and subgoal-based guidance to enable scalable and robust DSR even in zero-shot or few-shot scenarios. DH-PGDT features a dual-head physics-informed causal transformer architecture comprising Guidance Head, which generates subgoal representations, and Action Head, which uses these subgoals to generate actions independently of RTG. It also incorporates an operational constraint-aware graph reasoning module that encodes power system topology and operational constraints to generate a confidence-weighted action vector for refining DT trajectories. This design effectively improves generalization and enables robust adaptation to unseen scenarios. While this work focuses on DSR, the underlying computing model of the proposed PGDT is broadly applicable to sequential decision making across various power system operations and other complex engineering domains.
Asymmetric Network Games: $α$-Potential Function and Learning
In a network game, players interact over a network and the utility of each player depends on his own action and on an aggregate of his neighbours' actions. Many real world networks of interest are asymmetric and involve a large number of heterogeneous players. This paper analyzes static network games using the framework of $\alpha$-potential games. Under mild assumptions on the action sets (compact intervals) and the utility functions (twice continuously differentiable) of the players, we derive an expression for an inexact potential function of the game, called the $\alpha$-potential function. Using such a function, we show that modified versions of the sequential best-response algorithm and the simultaneous gradient play algorithm achieve convergence of players' actions to a $2\alpha$-Nash equilibrium. For linear-quadratic network games, we show that $\alpha$ depends on the maximum asymmetry in the network and is well-behaved for a wide range of networks of practical interest. Further, we derive bounds on the social welfare of the $\alpha$-Nash equilibrium corresponding to the maximum of the $\alpha$-potential function, under suitable assumptions. We numerically illustrate the convergence of the proposed algorithms and properties of the learned $2\alpha$-Nash equilibria.
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
comment: 8 pages, 3 figures
Modeling and Simulation of an Active Quarter Car Suspension with a Robust LQR Controller under Road Disturbance and Parameter Uncertainty
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system increases the road holding of vehicles, allows them to take turns safely, and reduces the risk of traffic accidents. A passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost, and easy adjustment of coefficients. In this study, a more robust LQR-controlled active suspension was designed than a passive suspension and a PID-controlled active suspension. Robustness analyses were performed for passive suspension, PID-controlled active suspension, and LQR-controlled active suspension. Suspension travel, sprung mass acceleration, and sprung mass motion simulations were performed for all 3 suspensions under road disturbance and under simultaneous road disturbance and parameter uncertainty. A comparative analysis was performed by obtaining the suspension rise time, overshoot, and settling time data. It was observed that the LQR-controlled active suspension showed the least overshoot and had the shortest settling time. In this case, it was proven that the LQR controlled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 16 pages, 13 figures
Continuous-time Data-driven Barrier Certificate Synthesis
We consider the problem of verifying safety for continuous-time dynamical systems. Developing upon recent advancements in data-driven verification, we use only a finite number of sampled trajectories to learn a barrier certificate, namely a function which verifies safety. We train a safety-informed neural network to act as this certificate, with an appropriately designed loss function to encompass the safety conditions. In addition, we provide probabilistic generalisation guarantees from discrete samples of continuous trajectories, to unseen continuous ones. Numerical investigations demonstrate the efficacy of our approach and contrast it with related results in the literature.
comment: Accepted at CDC 2025. arXiv admin note: text overlap with arXiv:2502.05510
Observability and Generalized Sensor Placement for Nonlinear Quality Models in Drinking Water Networks
This paper studies the problem of optimal placement of water quality (WQ) sensors in water distribution networks (WDNs), with a focus on chlorine transport, decay, and reaction models. Such models are traditionally used as suitable proxies for WQ. The literature on this topic is inveterate, but has a key limitation: it utilizes simplified single-species decay and reaction models that do not capture WQ transients for nonlinear, multi-species interactions. This results in sensor placements (SP) that do not account for nonlinear WQ dynamics. Furthermore, as WQ simulations are parameterized by hydraulic profiles and demand patterns, the placement of sensors are often hydraulics-dependent. This study produces a greedy algorithm that addresses the two aforementioned limitations. The algorithm is grounded in nonlinear dynamic systems and observability theory, and yields SPs that are submodular and robust to hydraulic changes. Case studies on benchmark water networks are provided. The key findings provide practical recommendations for WDN operators.
Unified Multi-Rate Model Predictive Control for a Jet-Powered Humanoid Robot
We propose a novel Model Predictive Control (MPC) framework for a jet-powered flying humanoid robot. The controller is based on a linearised centroidal momentum model to represent the flight dynamics, augmented with a second-order nonlinear model to explicitly account for the slow and nonlinear dynamics of jet propulsion. A key contribution is the introduction of a multi-rate MPC formulation that handles the different actuation rates of the robot's joints and jet engines while embedding the jet dynamics directly into the predictive model. We validated the framework using the jet-powered humanoid robot iRonCub, performing simulations in Mujoco; the simulation results demonstrate the robot's ability to recover from external disturbances and perform stable, non-abrupt flight manoeuvres, validating the effectiveness of the proposed approach.
comment: This paper has been accepted for publication at the 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids), Seoul, 2025
A Versatile Framework for Data-Driven Control of Nonlinear Systems
This note aims to provide a systematic investigation of direct data-driven control, enriching the existing literature not by adding another isolated result, but rather by offering a unifying, versatile, and broad framework that enables the generation of novel results in this domain. We formulate the nonlinear design problem from a high-level perspective as a set of desired controlled systems and propose systematic procedures to synthesize data-driven control algorithms that meet the specified design requirements. Various examples are presented to demonstrate the applicability of the proposed approach and its ability to derive new insights and results, illustrating the novel contributions enabled by the framework.
Spatial Association Between Near-Misses and Accident Blackspots in Sydney, Australia: A Getis-Ord $G_i^*$ Analysis
Conventional road safety management is inherently reactive, relying on analysis of sparse and lagged historical crash data to identify hazardous locations, or crash blackspots. The proliferation of vehicle telematics presents an opportunity for a paradigm shift towards proactive safety, using high-frequency, high-resolution near-miss data as a leading indicator of crash risk. This paper presents a spatial-statistical framework to systematically analyze the concordance and discordance between official crash records and near-miss events within urban environment. A Getis-Ord statistic is first applied to both reported crashes and near-miss events to identify statistically significant local clusters of each type. Subsequently, Bivariate Local Moran's I assesses spatial relationships between crash counts and High-G event counts, classifying grid cells into distinct profiles: High-High (coincident risk), High-Low and Low-High. Our analysis reveals significant amount of Low-Crash, High-Near-Miss clusters representing high-risk areas that remain unobservable when relying solely on historical crash data. Feature importance analysis is performed using contextual Point of Interest data to identify the different infrastructure factors that characterize difference between spatial clusters. The results provide a data-driven methodology for transport authorities to transition from a reactive to a proactive safety management strategy, allowing targeted interventions before severe crashes occur.
A Linear Differential Inclusion for Contraction Analysis to Known Trajectories
Infinitesimal contraction analysis provides exponential convergence rates between arbitrary pairs of trajectories of a system by studying the system's linearization. An essentially equivalent viewpoint arises through stability analysis of a linear differential inclusion (LDI) encompassing the incremental behavior of the system. In this note, we use contraction tools to study the exponential stability of a system to a particular known trajectory, deriving a new LDI characterizing the error between arbitrary trajectories and this known trajectory. As with classical contraction analysis, this new inclusion is constructed via first partial derivatives of the system's vector field, and convergence rates are obtained with familiar tools: uniform bounding of the logarithmic norm and LMI-based Lyapunov conditions. Our LDI is guaranteed to outperform a usual contraction analysis in two special circumstances: i) when the bound on the logarithmic norm arises from an interval overapproximation of the Jacobian matrix, and ii) when the norm considered is the $\ell_1$ norm. Finally, we demonstrate how the proposed approach strictly improves an existing framework for ellipsoidal reachable set computation.
Koopman-based control using sum-of-squares optimization: Improved stability guarantees and data efficiency
In this paper, we propose a novel controller design approach for unknown nonlinear systems using the Koopman operator. In particular, we use the recently proposed stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) architecture to generate a data-driven bilinear surrogate model with certified error bounds. Then, by accounting for the obtained error bounds in a controller design based on the bilinear system, one can guarantee closed-loop stability for the true nonlinear system. While existing approaches over-approximate the bilinearity of the surrogate model, thus introducing conservatism and providing only local guarantees, we explicitly account for the bilinearity by using sum-of-squares (SOS) optimization in the controller design. More precisely, we parametrize a rational controller stabilizing the error-affected bilinear surrogate model and, consequently, the underlying nonlinear system. The resulting SOS optimization problem provides explicit data-driven controller design conditions for unknown nonlinear systems based on semidefinite programming. Our approach significantly reduces conservatism by establishing a larger region of attraction and improved data efficiency. The proposed method is evaluated using numerical examples, demonstrating its advantages over existing approaches.
comment: Accepted for publication in the European Journal of Control, 2025
Koopman-based control of nonlinear systems with closed-loop guarantees
In this paper, we provide a tutorial overview and an extension of a recently developed framework for data-driven control of unknown nonlinear systems with rigorous closed-loop guarantees. The proposed approach relies on the Koopman operator representation of the nonlinear system, for which a bilinear surrogate model is estimated based on data. In contrast to existing Koopman-based estimation procedures, we state guaranteed bounds on the approximation error using the stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) framework. The resulting surrogate model and the uncertainty bounds allow us to design controllers via robust control theory and sum-of-squares optimization, guaranteeing desirable properties for the closed-loop system. We present results on stabilization both in discrete and continuous time, and we derive a method for controller design with performance objectives. The benefits of the presented framework over established approaches are demonstrated with a numerical example.
comment: Accepted for publication in at-Automatisierungstechnik
Equilibrium Selection in Replicator Equations Using Adaptive-Gain Control
In this paper, we deal with the equilibrium selection problem, which amounts to steering a population of individuals engaged in strategic game-theoretic interactions to a desired collective behavior. In the literature, this problem has been typically tackled by means of open-loop strategies, whose applicability is however limited by the need of accurate a priori information on the game and scarce robustness to uncertainty and noise. Here, we overcome these limitations by adopting a closed-loop approach using an adaptive-gain control scheme within a replicator equation -a nonlinear ordinary differential equation that models the evolution of the collective behavior of the population. For most classes of 2-action matrix games we establish sufficient conditions to design a controller that guarantees convergence of the replicator equation to the desired equilibrium, requiring limited a-priori information on the game. Numerical simulations corroborate and expand our theoretical findings.
comment: Published in the IEEE Transactions on Automatic Control, 2025. arXiv admin note: text overlap with arXiv:2306.14469
Distribution Grids May Be a Barrier To Residential Electrification
Replacing fossil-fueled appliances and vehicles with electric alternatives can reduce greenhouse gas emissions and air pollution in many settings. However, residential electrification can also raise electricity demand beyond the safe limits of electrical infrastructure. This can increase the risk of blackouts or require grid reinforcement that is often slow and expensive. Here, we estimate the physical and economic impacts on distribution grids of electrifying all housing and personal vehicles in each county of the lower 48 United States. We find that space heating is the main driver of grid impacts, with the coldest regions seeing demand peaks up to five times higher than today's peaks. Accommodating electrification of all housing and personal vehicles is estimated to require 600 GW of distribution grid reinforcement nationally, at a cost of \$350 to \$790 billion, or \$2,800 to \$6,400 per household (95% confidence intervals). However, demand-side management could eliminate three-quarters of grid reinforcement costs.
Distributed ADMM Approach for the Power Distribution Network Reconfiguration
The electrical network reconfiguration problem aims to minimize losses in a distribution system by adjusting switches while ensuring radial topology. The growing use of renewable energy and the complexity of managing modern power grids make solving the reconfiguration problem crucial. Distributed algorithms help optimize grid configurations, ensuring efficient adaptation to changing conditions and better utilization of renewable energy sources. This paper introduces a distributed algorithm designed to tackle the problem of power distribution network reconfiguration with a radiality constraint. This algorithm relies on ADMM (Alternating Direction Method of Multipliers), where each agent progressively updates its estimation based on the information exchanged with neighboring agents. We show that every agent is required to solve a linearly constrained convex quadratic programming problem and a Minimum Weight Rooted Arborescence Problem (MWRAP) with local weights during each iteration. Through numerical experiments, we demonstrate the performance of the proposed algorithm in various scenarios, including its application to a 33-bus test system and a real-world network.
Extremum Seeking Control for Antenna Pointing via Symmetric Product Approximation
This paper investigates extremum seeking control for a torque-controlled antenna pointing system without direct angular measurements. We consider a two-degree-of-freedom (2-DOF) antenna system that receives an unknown signal from its environment, where the signal strength varies with the antenna orientation. It is assumed that only real-time measurements of the signal are available. We develop an extremum seeking control strategy that enables the antenna to autonomously adjust its direction to maximize the received signal strength based on the symmetric product approximation. Under suitable assumptions on the signal function, we prove local practical uniform asymptotic stability for the closed-loop system.
comment: This paper has been accepted for presentation at The 2025 Modeling, Estimation and Control Conference (MECC 2025). This is the full version of the paper, which contains complete proofs and additional details not included in the conference version
Robotics
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation
We introduce Genie Envisioner (GE), a unified world foundation platform for robotic manipulation that integrates policy learning, evaluation, and simulation within a single video-generative framework. At its core, GE-Base is a large-scale, instruction-conditioned video diffusion model that captures the spatial, temporal, and semantic dynamics of real-world robotic interactions in a structured latent space. Built upon this foundation, GE-Act maps latent representations to executable action trajectories through a lightweight, flow-matching decoder, enabling precise and generalizable policy inference across diverse embodiments with minimal supervision. To support scalable evaluation and training, GE-Sim serves as an action-conditioned neural simulator, producing high-fidelity rollouts for closed-loop policy development. The platform is further equipped with EWMBench, a standardized benchmark suite measuring visual fidelity, physical consistency, and instruction-action alignment. Together, these components establish Genie Envisioner as a scalable and practical foundation for instruction-driven, general-purpose embodied intelligence. All code, models, and benchmarks will be released publicly.
comment: https://genie-envisioner.github.io/
Towards Generalizable Safety in Crowd Navigation via Conformal Uncertainty Handling
Mobile robots navigating in crowds trained using reinforcement learning are known to suffer performance degradation when faced with out-of-distribution scenarios. We propose that by properly accounting for the uncertainties of pedestrians, a robot can learn safe navigation policies that are robust to distribution shifts. Our method augments agent observations with prediction uncertainty estimates generated by adaptive conformal inference, and it uses these estimates to guide the agent's behavior through constrained reinforcement learning. The system helps regulate the agent's actions and enables it to adapt to distribution shifts. In the in-distribution setting, our approach achieves a 96.93% success rate, which is over 8.80% higher than the previous state-of-the-art baselines with over 3.72 times fewer collisions and 2.43 times fewer intrusions into ground-truth human future trajectories. In three out-of-distribution scenarios, our method shows much stronger robustness when facing distribution shifts in velocity variations, policy changes, and transitions from individual to group dynamics. We deploy our method on a real robot, and experiments show that the robot makes safe and robust decisions when interacting with both sparse and dense crowds. Our code and videos are available on https://gen-safe-nav.github.io/.
comment: 9th Conference on Robot Learning (CoRL 2025); Project website: https://gen-safe-nav.github.io/. arXiv admin note: text overlap with arXiv:2407.17460
TrajEvo: Trajectory Prediction Heuristics Design via LLM-driven Evolution
Trajectory prediction is a critical task in modeling human behavior, especially in safety-critical domains such as social robotics and autonomous vehicle navigation. Traditional heuristics based on handcrafted rules often lack accuracy and generalizability. Although deep learning approaches offer improved performance, they typically suffer from high computational cost, limited explainability, and, importantly, poor generalization to out-of-distribution (OOD) scenarios. In this paper, we introduce TrajEvo, a framework that leverages Large Language Models (LLMs) to automatically design trajectory prediction heuristics. TrajEvo employs an evolutionary algorithm to generate and refine prediction heuristics from past trajectory data. We propose two key innovations: Cross-Generation Elite Sampling to encourage population diversity, and a Statistics Feedback Loop that enables the LLM to analyze and improve alternative predictions. Our evaluations demonstrate that TrajEvo outperforms existing heuristic methods across multiple real-world datasets, and notably surpasses both heuristic and deep learning methods in generalizing to an unseen OOD real-world dataset. TrajEvo marks a promising step toward the automated design of fast, explainable, and generalizable trajectory prediction heuristics. We release our source code to facilitate future research at https://github.com/ai4co/trajevo.
comment: arXiv admin note: substantial text overlap with arXiv:2505.04480
Robust adaptive fuzzy sliding mode control for trajectory tracking for of cylindrical manipulator
This research proposes a robust adaptive fuzzy sliding mode control (AFSMC) approach to enhance the trajectory tracking performance of cylindrical robotic manipulators, extensively utilized in applications such as CNC and 3D printing. The proposed approach integrates fuzzy logic with sliding mode control (SMC) to bolster adaptability and robustness, with fuzzy logic approximating the uncertain dynamics of the system, while SMC ensures strong performance. Simulation results in MATLAB/Simulink demonstrate that AFSMC significantly improves trajectory tracking accuracy, stability, and disturbance rejection compared to traditional methods. This research underscores the effectiveness of AFSMC in controlling robotic manipulators, contributing to enhanced precision in industrial robotic applications.
CleanUpBench: Embodied Sweeping and Grasping Benchmark
Embodied AI benchmarks have advanced navigation, manipulation, and reasoning, but most target complex humanoid agents or large-scale simulations that are far from real-world deployment. In contrast, mobile cleaning robots with dual mode capabilities, such as sweeping and grasping, are rapidly emerging as realistic and commercially viable platforms. However, no benchmark currently exists that systematically evaluates these agents in structured, multi-target cleaning tasks, revealing a critical gap between academic research and real-world applications. We introduce CleanUpBench, a reproducible and extensible benchmark for evaluating embodied agents in realistic indoor cleaning scenarios. Built on NVIDIA Isaac Sim, CleanUpBench simulates a mobile service robot equipped with a sweeping mechanism and a six-degree-of-freedom robotic arm, enabling interaction with heterogeneous objects. The benchmark includes manually designed environments and one procedurally generated layout to assess generalization, along with a comprehensive evaluation suite covering task completion, spatial efficiency, motion quality, and control performance. To support comparative studies, we provide baseline agents based on heuristic strategies and map-based planning. CleanUpBench bridges the gap between low-level skill evaluation and full-scene testing, offering a scalable testbed for grounded, embodied intelligence in everyday settings.
Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation
Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot's capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We apply a Mixed-Initiative dialog paradigm to Collaborative human-roBot teaming and propose MICoBot, a system that handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog, and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities (measured by a simulation-pretrained affordance model) and the human's estimated availability to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. Our extensive evaluations in simulation and real-world -- on a physical robot with 18 unique human participants over 27 hours -- demonstrate the ability of our method to effectively collaborate with diverse human users, yielding significantly improved task success and user experience than a pure LLM baseline and other agent allocation models. See additional videos and materials at https://robin-lab.cs.utexas.edu/MicoBot/.
comment: Project website at https://robin-lab.cs.utexas.edu/MicoBot/
Towards Human-Centric Evaluation of Interaction-Aware Automated Vehicle Controllers: A Framework and Case Study
As automated vehicles (AVs) increasingly integrate into mixed-traffic environments, evaluating their interaction with human-driven vehicles (HDVs) becomes critical. In most research focused on developing new AV control algorithms (controllers), the performance of these algorithms is assessed solely based on performance metrics such as collision avoidance or lane-keeping efficiency, while largely overlooking the human-centred dimensions of interaction with HDVs. This paper proposes a structured evaluation framework that addresses this gap by incorporating metrics grounded in the human-robot interaction literature. The framework spans four key domains: a) interaction effect, b) interaction perception, c) interaction effort, and d) interaction ability. These domains capture both the performance of the AV and its impact on human drivers around it. To demonstrate the utility of the framework, we apply it to a case study evaluating how a state-of-the-art AV controller interacts with human drivers in a merging scenario in a driving simulator. Measuring HDV-HDV interactions as a baseline, this study included one representative metric per domain: a) perceived safety, b) subjective ratings, specifically how participants perceived the other vehicle's driving behaviour (e.g., aggressiveness or predictability) , c) driver workload, and d) merging success. The results showed that incorporating metrics covering all four domains in the evaluation of AV controllers can illuminate critical differences in driver experience when interacting with AVs. This highlights the need for a more comprehensive evaluation approach. Our framework offers researchers, developers, and policymakers a systematic method for assessing AV behaviour beyond technical performance, fostering the development of AVs that are not only functionally capable but also understandable, acceptable, and safe from a human perspective.
Do Robots Really Need Anthropomorphic Hands?
Human manipulation skills represent a pinnacle of their voluntary motor functions, requiring the coordination of many degrees of freedom and processing of high-dimensional sensor input to achieve such a high level of dexterity. Thus, we set out to answer whether the human hand, with its associated biomechanical properties, sensors, and control mechanisms, is an ideal that we should strive for in robotics-do we really need anthropomorphic robotic hands? This survey can help practitioners to make the trade-off between hand complexity and potential manipulation skills. We provide an overview of the human hand, a comparison of commercially available robotic and prosthetic hands, and a systematic review of hand mechanisms and skills that they are capable of. This leads to follow-up questions. What is the minimum requirement for mechanisms and sensors to implement most skills that a robot needs? What is missing to reach human-level dexterity? Can we improve upon human dexterity? Although complex five-fingered hands are often used as the ultimate goal for robotic manipulators, they are not necessary for all tasks. We found that wrist flexibility and finger abduction/adduction are important for manipulation capabilities. On the contrary, increasing the number of fingers, actuators, or degrees of freedom is often not necessary. Three fingers are a good compromise between simplicity and dexterity. Non-anthropomorphic hand designs with two opposing pairs of fingers or human hands with six fingers can further increase dexterity, suggesting that the human hand may not be the optimum.
Computational Design and Fabrication of Modular Robots with Untethered Control
Natural organisms use distributed actuation via their musculoskeletal systems to adapt their gait for traversing diverse terrains or to morph their bodies to perform varied tasks. A longstanding challenge in the field of robotics is to mimic this extensive adaptability and range of motion. This has led humans to develop various soft robotic systems that emulate natural organisms. However, such systems are generally optimized for a single functionality, lack the ability to change form or function on demand, or are often tethered to bulky control systems. To address these challenges, we present our framework for designing and controlling robots that mimic nature's blueprint by utilizing distributed actuation. We propose a novel building block that combines 3D-printed bones with liquid crystal elastomer (LCE) muscles as lightweight actuators and enables the modular assembly of musculoskeletal robots. We developed LCE rods that contract in response to infrared radiation, thereby achieving local and untethered control over the distributed network of bones, which in turn results in global deformation of the robot. Furthermore, to capitalize on the extensive design space, we develop two computational tools: one to optimize the robot's skeletal graph, enabling multiple target deformations, and another to co-optimize the skeletal designs and control gaits to achieve target locomotion. We validate our system by building several robots that show complex shape morphing, varying control schemes, and adaptability to their environment. Our system integrates advances in modular material building, untethered and distributed control, and computational design to introduce a new generation of robots that brings us closer to the capabilities of living organisms.
DistillDrive: End-to-End Multi-Mode Autonomous Driving Distillation by Isomorphic Hetero-Source Planning Model
End-to-end autonomous driving has been recently seen rapid development, exerting a profound influence on both industry and academia. However, the existing work places excessive focus on ego-vehicle status as their sole learning objectives and lacks of planning-oriented understanding, which limits the robustness of the overall decision-making prcocess. In this work, we introduce DistillDrive, an end-to-end knowledge distillation-based autonomous driving model that leverages diversified instance imitation to enhance multi-mode motion feature learning. Specifically, we employ a planning model based on structured scene representations as the teacher model, leveraging its diversified planning instances as multi-objective learning targets for the end-to-end model. Moreover, we incorporate reinforcement learning to enhance the optimization of state-to-decision mappings, while utilizing generative modeling to construct planning-oriented instances, fostering intricate interactions within the latent space. We validate our model on the nuScenes and NAVSIM datasets, achieving a 50\% reduction in collision rate and a 3-point improvement in closed-loop performance compared to the baseline model. Code and model are publicly available at https://github.com/YuruiAI/DistillDrive
Real-Time Iteration Scheme for Diffusion Policy
Diffusion Policies have demonstrated impressive performance in robotic manipulation tasks. However, their long inference time, resulting from an extensive iterative denoising process, and the need to execute an action chunk before the next prediction to maintain consistent actions limit their applicability to latency-critical tasks or simple tasks with a short cycle time. While recent methods explored distillation or alternative policy structures to accelerate inference, these often demand additional training, which can be resource-intensive for large robotic models. In this paper, we introduce a novel approach inspired by the Real-Time Iteration (RTI) Scheme, a method from optimal control that accelerates optimization by leveraging solutions from previous time steps as initial guesses for subsequent iterations. We explore the application of this scheme in diffusion inference and propose a scaling-based method to effectively handle discrete actions, such as grasping, in robotic manipulation. The proposed scheme significantly reduces runtime computational costs without the need for distillation or policy redesign. This enables a seamless integration into many pre-trained diffusion-based models, in particular, to resource-demanding large models. We also provide theoretical conditions for the contractivity which could be useful for estimating the initial denoising step. Quantitative results from extensive simulation experiments show a substantial reduction in inference time, with comparable overall performance compared with Diffusion Policy using full-step denoising. Our project page with additional resources is available at: https://rti-dp.github.io/.
comment: \c{opyright} 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Robots can defuse high-intensity conflict situations IROS
This paper investigates the specific scenario of high-intensity confrontations between humans and robots, to understand how robots can defuse the conflict. It focuses on the effectiveness of using five different affective expression modalities as main drivers for defusing the conflict. The aim is to discover any strengths or weaknesses in using each modality to mitigate the hostility that people feel towards a poorly performing robot. The defusing of the situation is accomplished by making the robot better at acknowledging the conflict and by letting it express remorse. To facilitate the tests, we used a custom affective robot in a simulated conflict situation with 105 test participants. The results show that all tested expression modalities can successfully be used to defuse the situation and convey an acknowledgment of the confrontation. The ratings were remarkably similar, but the movement modality was different (ANON p$<$.05) than the other modalities. The test participants also had similar affective interpretations on how impacted the robot was of the confrontation across all expression modalities. This indicates that defusing a high-intensity interaction may not demand special attention to the expression abilities of the robot, but rather require attention to the abilities of being socially aware of the situation and reacting in accordance with it.
comment: 7 pages, 6 figures, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) October 25-29, 2020, Las Vegas, NV, USA
A Multi-view Landmark Representation Approach with Application to GNSS-Visual-Inertial Odometry
Invariant Extended Kalman Filter (IEKF) has been a significant technique in vision-aided sensor fusion. However, it usually suffers from high computational burden when jointly optimizing camera poses and the landmarks. To improve its efficiency and applicability for multi-sensor fusion, we present a multi-view pose-only estimation approach with its application to GNSS-Visual-Inertial Odometry (GVIO) in this paper. Our main contribution is deriving a visual measurement model which directly associates landmark representation with multiple camera poses and observations. Such a pose-only measurement is proven to be tightly-coupled between landmarks and poses, and maintain a perfect null space that is independent of estimated poses. Finally, we apply the proposed approach to a filter based GVIO with a novel feature management strategy. Both simulation tests and real-world experiments are conducted to demonstrate the superiority of the proposed method in terms of efficiency and accuracy.
Affecta-Context: The Context-Guided Behavior Adaptation Framework
This paper presents Affecta-context, a general framework to facilitate behavior adaptation for social robots. The framework uses information about the physical context to guide its behaviors in human-robot interactions. It consists of two parts: one that represents encountered contexts and one that learns to prioritize between behaviors through human-robot interactions. As physical contexts are encountered the framework clusters them by their measured physical properties. In each context, the framework learns to prioritize between behaviors to optimize the physical attributes of the robot's behavior in line with its current environment and the preferences of the users it interacts with. This paper illlustrates the abilities of the Affecta-context framework by enabling a robot to autonomously learn the prioritization of discrete behaviors. This was achieved by training across 72 interactions in two different physical contexts with 6 different human test participants. The paper demonstrates the trained Affecta-context framework by verifying the robot's ability to generalize over the input and to match its behaviors to a previously unvisited physical context.
comment: 6 pages, Intelligent Autonomous Systems 18. IAS 2023
Information-Theoretic Graph Fusion with Vision-Language-Action Model for Policy Reasoning and Dual Robotic Control
Teaching robots dexterous skills from human videos remains challenging due to the reliance on low-level trajectory imitation, which fails to generalize across object types, spatial layouts, and manipulator configurations. We propose Graph-Fused Vision-Language-Action (GF-VLA), a framework that enables dual-arm robotic systems to perform task-level reasoning and execution directly from RGB and Depth human demonstrations. GF-VLA first extracts Shannon-information-based cues to identify hands and objects with the highest task relevance, then encodes these cues into temporally ordered scene graphs that capture both hand-object and object-object interactions. These graphs are fused with a language-conditioned transformer that generates hierarchical behavior trees and interpretable Cartesian motion commands. To improve execution efficiency in bimanual settings, we further introduce a cross-hand selection policy that infers optimal gripper assignment without explicit geometric reasoning. We evaluate GF-VLA on four structured dual-arm block assembly tasks involving symbolic shape construction and spatial generalization. Experimental results show that the information-theoretic scene representation achieves over 95 percent graph accuracy and 93 percent subtask segmentation, supporting the LLM planner in generating reliable and human-readable task policies. When executed by the dual-arm robot, these policies yield 94 percent grasp success, 89 percent placement accuracy, and 90 percent overall task success across stacking, letter-building, and geometric reconfiguration scenarios, demonstrating strong generalization and robustness across diverse spatial and semantic variations.
comment: Journal under review
ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning
Human teaching effort is a significant bottleneck for the broader applicability of interactive imitation learning. To reduce the number of required queries, existing methods employ active learning to query the human teacher only in uncertain, risky, or novel situations. However, during these queries, the novice's planned actions are not utilized despite containing valuable information, such as the novice's capabilities, as well as corresponding uncertainty levels. To this end, we allow the novice to say: "I plan to do this, but I am uncertain." We introduce the Active Skill-level Data Aggregation (ASkDAgger) framework, which leverages teacher feedback on the novice plan in three key ways: (1) S-Aware Gating (SAG): Adjusts the gating threshold to track sensitivity, specificity, or a minimum success rate; (2) Foresight Interactive Experience Replay (FIER), which recasts valid and relabeled novice action plans into demonstrations; and (3) Prioritized Interactive Experience Replay (PIER), which prioritizes replay based on uncertainty, novice success, and demonstration age. Together, these components balance query frequency with failure incidence, reduce the number of required demonstration annotations, improve generalization, and speed up adaptation to changing domains. We validate the effectiveness of ASkDAgger through language-conditioned manipulation tasks in both simulation and real-world environments. Code, data, and videos are available at https://askdagger.github.io.
comment: Accepted for publication in Transactions on Machine Learning Research (TMLR, 2025)
GhostShell: Streaming LLM Function Calls for Concurrent Embodied Programming
We present GhostShell, a novel approach that leverages Large Language Models (LLMs) to enable streaming and concurrent behavioral programming for embodied systems. In contrast to conventional methods that rely on pre-scheduled action sequences or behavior trees, GhostShell drives embodied systems to act on-the-fly by issuing function calls incrementally as tokens are streamed from the LLM. GhostShell features a streaming XML function token parser, a dynamic function interface mapper, and a multi-channel scheduler that orchestrates intra-channel synchronous and inter-channel asynchronous function calls, thereby coordinating serial-parallel embodied actions across multiple robotic components as directed by the LLM. We evaluate GhostShell on our robot prototype COCO through comprehensive grounded experiments across 34 real-world interaction tasks and multiple LLMs. The results demonstrate that our approach achieves state-of-the-art Behavioral Correctness Metric of 0.85 with Claude-4 Sonnet and up to 66X faster response times compared to LLM native function calling APIs. GhostShell also proves effective in long-horizon multimodal tasks, demonstrating strong robustness and generalization.
comment: 17 pages, 5 figures, conference
Towards Embodied Agentic AI: Review and Classification of LLM- and VLM-Driven Robot Autonomy and Interaction
Foundation models, including large language models (LLMs) and vision-language models (VLMs), have recently enabled novel approaches to robot autonomy and human-robot interfaces. In parallel, vision-language-action models (VLAs) or large behavior models (BLMs) are increasing the dexterity and capabilities of robotic systems. This survey paper focuses on those words advancing towards agentic applications and architectures. This includes initial efforts exploring GPT-style interfaces to tooling, as well as more complex system where AI agents are coordinators, planners, perception actors, or generalist interfaces. Such agentic architectures allow robots to reason over natural language instructions, invoke APIs, plan task sequences, or assist in operations and diagnostics. In addition to peer-reviewed research, due to the fast-evolving nature of the field, we highlight and include community-driven projects, ROS packages, and industrial frameworks that show emerging trends. We propose a taxonomy for classifying model integration approaches and present a comparative analysis of the role that agents play in different solutions in today's literature.
Dancing with a Robot: An Experimental Study of Child-Robot Interaction in a Performative Art Setting
This paper presents an evaluation of 18 children's in-the-wild experiences with the autonomous robot arm performer NED (Never-Ending Dancer) within the Thingamabobas installation, showcased across the UK. We detail NED's design, including costume, behaviour, and human interactions, all integral to the installation. Our observational analysis revealed three key challenges in child-robot interactions: 1) Initiating and maintaining engagement, 2) Lack of robot expressivity and reciprocity, and 3) Unmet expectations. Our findings show that children are naturally curious, and adept at interacting with a robotic art performer. However, our observations emphasise the critical need to optimise human-robot interaction (HRI) systems through careful consideration of audience's capabilities, perceptions, and expectations, within the performative arts context, to enable engaging and meaningful experiences, especially for young audiences.
comment: published by Springer
Learning to See and Act: Task-Aware View Planning for Robotic Manipulation
Recent vision-language-action (VLA) models for multi-task robotic manipulation commonly rely on static viewpoints and shared visual encoders, which limit 3D perception and cause task interference, hindering robustness and generalization. In this work, we propose Task-Aware View Planning (TAVP), a framework designed to overcome these challenges by integrating active view planning with task-specific representation learning. TAVP employs an efficient exploration policy, accelerated by a novel pseudo-environment, to actively acquire informative views. Furthermore, we introduce a Mixture-of-Experts (MoE) visual encoder to disentangle features across different tasks, boosting both representation fidelity and task generalization. By learning to see the world in a task-aware way, TAVP generates more complete and discriminative visual representations, demonstrating significantly enhanced action prediction across a wide array of manipulation challenges. Extensive experiments on RLBench tasks show that our proposed TAVP model achieves superior performance over state-of-the-art fixed-view approaches. Visual results and code are provided at: https://hcplab-sysu.github.io/TAVP.
comment: 7 pages, 9 figures, project page: https://hcplab-sysu.github.io/TAVP
FCBV-Net: Category-Level Robotic Garment Smoothing via Feature-Conditioned Bimanual Value Prediction
Category-level generalization for robotic garment manipulation, such as bimanual smoothing, remains a significant hurdle due to high dimensionality, complex dynamics, and intra-category variations. Current approaches often struggle, either overfitting with concurrently learned visual features for a specific instance or, despite category-level perceptual generalization, failing to predict the value of synergistic bimanual actions. We propose the Feature-Conditioned Bimanual Value Network (FCBV-Net), operating on 3D point clouds to specifically enhance category-level policy generalization for garment smoothing. FCBV-Net conditions bimanual action value prediction on pre-trained, frozen dense geometric features, ensuring robustness to intra-category garment variations. Trainable downstream components then learn a task-specific policy using these static features. In simulated GarmentLab experiments with the CLOTH3D dataset, FCBV-Net demonstrated superior category-level generalization. It exhibited only an 11.5% efficiency drop (Steps80) on unseen garments compared to 96.2% for a 2D image-based baseline, and achieved 89% final coverage, outperforming an 83% coverage from a 3D correspondence-based baseline that uses identical per-point geometric features but a fixed primitive. These results highlight that the decoupling of geometric understanding from bimanual action value learning enables better category-level generalization.
comment: 7 pages, 3 figures, 1 table. Submitted to IEEE Robotics and Automation Letters
Chemist Eye: A Visual Language Model-Powered System for Safety Monitoring and Robot Decision-Making in Self-Driving Laboratories
The integration of robotics and automation into self-driving laboratories (SDLs) can introduce additional safety complexities, in addition to those that already apply to conventional research laboratories. Personal protective equipment (PPE) is an essential requirement for ensuring the safety and well-being of workers in laboratories, self-driving or otherwise. Fires are another important risk factor in chemical laboratories. In SDLs, fires that occur close to mobile robots, which use flammable lithium batteries, could have increased severity. Here, we present Chemist Eye, a distributed safety monitoring system designed to enhance situational awareness in SDLs. The system integrates multiple stations equipped with RGB, depth, and infrared cameras, designed to monitor incidents in SDLs. Chemist Eye is also designed to spot workers who have suffered a potential accident or medical emergency, PPE compliance and fire hazards. To do this, Chemist Eye uses decision-making driven by a vision-language model (VLM). Chemist Eye is designed for seamless integration, enabling real-time communication with robots. Based on the VLM recommendations, the system attempts to drive mobile robots away from potential fire locations, exits, or individuals not wearing PPE, and issues audible warnings where necessary. It also integrates with third-party messaging platforms to provide instant notifications to lab personnel. We tested Chemist Eye with real-world data from an SDL equipped with three mobile robots and found that the spotting of possible safety hazards and decision-making performances reached 97 % and 95 %, respectively.
From Canada to Japan: How 10,000 km Affect User Perception in Robot Teleoperation
Robot teleoperation (RTo) has emerged as a viable alternative to local control, particularly when human intervention is still necessary. This research aims to study the distance effect on user perception in RTo, exploring the potential of teleoperated robots for older adult care. We propose an evaluation of non-expert users' perception of long-distance RTo, examining how their perception changes before and after interaction, as well as comparing it to that of locally operated robots. We have designed a specific protocol consisting of multiple questionnaires, along with a dedicated software architecture using the Robotics Operating System (ROS) and Unity. The results revealed no statistically significant differences between the local and remote robot conditions, suggesting that robots may be a viable alternative to traditional local control.
comment: Author preprint - Accepted for Humanoids 2025
Examining the legibility of humanoid robot arm movements in a pointing task
Human--robot interaction requires robots whose actions are legible, allowing humans to interpret, predict, and feel safe around them. This study investigates the legibility of humanoid robot arm movements in a pointing task, aiming to understand how humans predict robot intentions from truncated movements and bodily cues. We designed an experiment using the NICO humanoid robot, where participants observed its arm movements towards targets on a touchscreen. Robot cues varied across conditions: gaze, pointing, and pointing with congruent or incongruent gaze. Arm trajectories were stopped at 60\% or 80\% of their full length, and participants predicted the final target. We tested the multimodal superiority and ocular primacy hypotheses, both of which were supported by the experiment.
comment: Published at ICSR 2025
Analyzing the Impact of Multimodal Perception on Sample Complexity and Optimization Landscapes in Imitation Learning
This paper examines the theoretical foundations of multimodal imitation learning through the lens of statistical learning theory. We analyze how multimodal perception (RGB-D, proprioception, language) affects sample complexity and optimization landscapes in imitation policies. Building on recent advances in multimodal learning theory, we show that properly integrated multimodal policies can achieve tighter generalization bounds and more favorable optimization landscapes than their unimodal counterparts. We provide a comprehensive review of theoretical frameworks that explain why multimodal architectures like PerAct and CLIPort achieve superior performance, connecting these empirical results to fundamental concepts in Rademacher complexity, PAC learning, and information theory.
comment: 9 pages, 1 figure, 1 table, theoretical analysis with empirical validation on PerAct implementation in MuJoCo simulation environment
A Vision-Based Collision Sensing Method for Stable Circular Object Grasping with A Soft Gripper System
External collisions to robot actuators typically pose risks to grasping circular objects. This work presents a vision-based sensing module capable of detecting collisions to maintain stable grasping with a soft gripper system. The system employs an eye-in-palm camera with a broad field of view to simultaneously monitor the motion of fingers and the grasped object. Furthermore, we have developed a collision-rich grasping strategy to ensure the stability and security of the entire dynamic grasping process. A physical soft gripper was manufactured and affixed to a collaborative robotic arm to evaluate the performance of the collision detection mechanism. An experiment regarding testing the response time of the mechanism confirmed the system has the capability to react to the collision instantaneously. A dodging test was conducted to demonstrate the gripper can detect the direction and scale of external collisions precisely.
Benchmarking Shortcutting Techniques for Multi-Robot-Arm Motion Planning IROS 2025
Generating high-quality motion plans for multiple robot arms is challenging due to the high dimensionality of the system and the potential for inter-arm collisions. Traditional motion planning methods often produce motions that are suboptimal in terms of smoothness and execution time for multi-arm systems. Post-processing via shortcutting is a common approach to improve motion quality for efficient and smooth execution. However, in multi-arm scenarios, optimizing one arm's motion must not introduce collisions with other arms. Although existing multi-arm planning works often use some form of shortcutting techniques, their exact methodology and impact on performance are often vaguely described. In this work, we present a comprehensive study quantitatively comparing existing shortcutting methods for multi-arm trajectories across diverse simulated scenarios. We carefully analyze the pros and cons of each shortcutting method and propose two simple strategies for combining these methods to achieve the best performance-runtime tradeoff. Video, code, and dataset are available at https://philip-huang.github.io/mr-shortcut/.
comment: 9 pages, 6 figures, accepted for publication at 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding
Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs), enhanced with two human-inspired mechanisms: perspective-based active grounding, which dynamically adjusts the robot's viewpoint for improved visual inspection, and historical memory backtracking, which enables the system to retain and re-evaluate uncertain observations over time. Unlike existing approaches that passively rely on incidental visual inputs, our method actively optimizes perception and leverages memory to resolve ambiguity, significantly improving vision-language grounding in complex, unseen environments. Our framework operates in a zero-shot manner, achieving strong generalization to diverse and open-ended language descriptions without requiring labeled data or model fine-tuning. Experimental results on Habitat-Matterport 3D (HM3D) show that our method outperforms state-of-the-art approaches in language-driven object navigation. We further demonstrate its practicality through real-world deployment on a quadruped robot, achieving robust and effective navigation performance.
Hierarchical Deep Deterministic Policy Gradient for Autonomous Maze Navigation of Mobile Robots
Maze navigation is a fundamental challenge in robotics, requiring agents to traverse complex environments efficiently. While the Deep Deterministic Policy Gradient (DDPG) algorithm excels in control tasks, its performance in maze navigation suffers from sparse rewards, inefficient exploration, and long-horizon planning difficulties, often leading to low success rates and average rewards, sometimes even failing to achieve effective navigation. To address these limitations, this paper proposes an efficient Hierarchical DDPG (HDDPG) algorithm, which includes high-level and low-level policies. The high-level policy employs an advanced DDPG framework to generate intermediate subgoals from a long-term perspective and on a higher temporal scale. The low-level policy, also powered by the improved DDPG algorithm, generates primitive actions by observing current states and following the subgoal assigned by the high-level policy. The proposed method enhances stability with off-policy correction, refining subgoal assignments by relabeling historical experiences. Additionally, adaptive parameter space noise is utilized to improve exploration, and a reshaped intrinsic-extrinsic reward function is employed to boost learning efficiency. Further optimizations, including gradient clipping and Xavier initialization, are employed to improve robustness. The proposed algorithm is rigorously evaluated through numerical simulation experiments executed using the Robot Operating System (ROS) and Gazebo. Regarding the three distinct final targets in autonomous maze navigation tasks, HDDPG significantly overcomes the limitations of standard DDPG and its variants, improving the success rate by at least 56.59% and boosting the average reward by a minimum of 519.03 compared to baseline algorithms.
Optimal Planning for Multi-Robot Simultaneous Area and Line Coverage Using Hierarchical Cyclic Merging Regulation
The double coverage problem focuses on determining efficient, collision-free routes for multiple robots to simultaneously cover linear features (e.g., surface cracks or road routes) and survey areas (e.g., parking lots or local regions) in known environments. In these problems, each robot carries two functional roles: service (linear feature footprint coverage) and exploration (complete area coverage). Service has a smaller operational footprint but incurs higher costs (e.g., time) compared to exploration. We present optimal planning algorithms for the double coverage problems using hierarchical cyclic merging regulation (HCMR). To reduce the complexity for optimal planning solutions, we analyze the manifold attachment process during graph traversal from a Morse theory perspective. We show that solutions satisfying minimum path length and collision-free constraints must belong to a Morse-bounded collection. To identify this collection, we introduce the HCMR algorithm. In HCMR, cyclic merging search regulates traversal behavior, while edge sequence back propagation converts these regulations into graph edge traversal sequences. Incorporating balanced partitioning, the optimal sequence is selected to generate routes for each robot. We prove the optimality of the HCMR algorithm under a fixed sweep direction. The multi-robot simulation results demonstrate that the HCMR algorithm significantly improves planned path length by at least 10.0%, reduces task time by at least 16.9% in average, and ensures conflict-free operation compared to other state-of-the-art planning methods.
Safety of Embodied Navigation: A Survey
As large language models (LLMs) continue to advance and gain influence, the development of embodied AI has accelerated, drawing significant attention, particularly in navigation scenarios. Embodied navigation requires an agent to perceive, interact with, and adapt to its environment while moving toward a specified target in unfamiliar settings. However, the integration of embodied navigation into critical applications raises substantial safety concerns. Given their deployment in dynamic, real-world environments, ensuring the safety of such systems is critical. This survey provides a comprehensive analysis of safety in embodied navigation from multiple perspectives, encompassing attack strategies, defense mechanisms, and evaluation methodologies. Beyond conducting a comprehensive examination of existing safety challenges, mitigation technologies, and various datasets and metrics that assess effectiveness and robustness, we explore unresolved issues and future research directions in embodied navigation safety. These include potential attack methods, mitigation strategies, more reliable evaluation techniques, and the implementation of verification frameworks. By addressing these critical gaps, this survey aims to provide valuable insights that can guide future research toward the development of safer and more reliable embodied navigation systems. Furthermore, the findings of this study have broader implications for enhancing societal safety and increasing industrial efficiency.
Towards Transparent Ethical AI: A Roadmap for Trustworthy Robotic Systems
As artificial intelligence (AI) and robotics increasingly permeate society, ensuring the ethical behavior of these systems has become paramount. This paper contends that transparency in AI decision-making processes is fundamental to developing trustworthy and ethically aligned robotic systems. We explore how transparency facilitates accountability, enables informed consent, and supports the debugging of ethical algorithms. The paper outlines technical, ethical, and practical challenges in implementing transparency and proposes novel approaches to enhance it, including standardized metrics, explainable AI techniques, and user-friendly interfaces. This paper introduces a framework that connects technical implementation with ethical considerations in robotic systems, focusing on the specific challenges of achieving transparency in dynamic, real-world contexts. We analyze how prioritizing transparency can impact public trust, regulatory policies, and avenues for future research. By positioning transparency as a fundamental element in ethical AI system design, we aim to add to the ongoing discussion on responsible AI and robotics, providing direction for future advancements in this vital field.
comment: Published in the Proceedings of the 2025 3rd International Conference on Robotics, Control and Vision Engineering (RCVE'25). 6 pages, 3 tables
Integrating Vision Foundation Models with Reinforcement Learning for Enhanced Object Interaction
This paper presents a novel approach that integrates vision foundation models with reinforcement learning to enhance object interaction capabilities in simulated environments. By combining the Segment Anything Model (SAM) and YOLOv5 with a Proximal Policy Optimization (PPO) agent operating in the AI2-THOR simulation environment, we enable the agent to perceive and interact with objects more effectively. Our comprehensive experiments, conducted across four diverse indoor kitchen settings, demonstrate significant improvements in object interaction success rates and navigation efficiency compared to a baseline agent without advanced perception. The results show a 68% increase in average cumulative reward, a 52.5% improvement in object interaction success rate, and a 33% increase in navigation efficiency. These findings highlight the potential of integrating foundation models with reinforcement learning for complex robotic tasks, paving the way for more sophisticated and capable autonomous agents.
comment: Published in the Proceedings of the 2025 3rd International Conference on Robotics, Control and Vision Engineering (RCVE'25). 6 pages, 3 figures, 1 table
GPU-Accelerated Barrier-Rate Guided MPPI Control for Tractor-Trailer Systems SC 2025
Articulated vehicles such as tractor-trailers, yard trucks, and similar platforms must often reverse and maneuver in cluttered spaces where pedestrians are present. We present how Barrier-Rate guided Model Predictive Path Integral (BR-MPPI) control can solve navigation in such challenging environments. BR-MPPI embeds Control Barrier Function (CBF) constraints directly into the path-integral update. By steering the importance-sampling distribution toward collision-free, dynamically feasible trajectories, BR-MPPI enhances the exploration strength of MPPI and improves robustness of resulting trajectories. The method is evaluated in the high-fidelity CarMaker simulator on a 12 [m] tractor-trailer tasked with reverse and forward parking in a parking lot. BR-MPPI computes control inputs in above 100 [Hz] on a single GPU (for scenarios with eight obstacles) and maintains better parking clearance than a standard MPPI baseline and an MPPI with collision cost baseline.
comment: Accepted to IEEE ITSC 2025
RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving ICCV 2025
Large Multimodal Models (LMMs) have demonstrated exceptional comprehension and interpretation capabilities in Autonomous Driving (AD) by incorporating large language models. Despite the advancements, current data-driven AD approaches tend to concentrate on a single dataset and specific tasks, neglecting their overall capabilities and ability to generalize. To bridge these gaps, we propose RoboTron-Drive, a general large multimodal model designed to process diverse data inputs, such as images and multi-view videos, while performing a broad spectrum of AD tasks, including perception, prediction, and planning. Initially, the model undergoes curriculum pre-training to process varied visual signals and perform basic visual comprehension and perception tasks. Subsequently, we augment and standardize various AD datasets to finetune the model, resulting in an all-in-one LMM for autonomous driving. To assess the general capabilities and generalization ability, we conduct evaluations on six public benchmarks and undertake zero-shot transfer on three unseen datasets, where RoboTron-Drive achieves state-of-the-art performance across all tasks. We hope RoboTron-Drive as a promising solution for AD in the real world. Project page with code: https://github.com/zhijian11/RoboTron-Drive.
comment: ICCV 2025
Position-Based Flocking for Robust Alignment
This paper presents a position-based flocking model for interacting agents, balancing cohesion-separation and alignment to achieve stable collective motion. The model modifies a position-velocity-based approach by approximating velocity differences using initial and current positions, introducing a threshold weight to ensure sustained alignment. Simulations with 50 agents in 2D demonstrate that the position-based model produces stronger alignment and more rigid and compact formations compared to the position-velocity-based model. The alignment metric and separation distances highlight the efficacy of the proposed model in achieving robust flocking behavior. The model's use of positions ensures robust alignment, with applications in robotics and collective dynamics.
comment: Accepted for "The 9th International Symposium on Swarm Behavior and Bio-Inspired Robotics 2025" Simulation video for Fig. 1 at https://youtu.be/yID0taa7X7o Simulation video for Fig. 3 at https://youtu.be/8I_yBhY2imo Simulation video for Fig. 5 at https://youtu.be/fTG7pgBPMZg
$\textit{RoboTron-Nav}$: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction ICCV 2025
In language-guided visual navigation, agents locate target objects in unseen environments using natural language instructions. For reliable navigation in unfamiliar scenes, agents should possess strong perception, planning, and prediction capabilities. Additionally, when agents revisit previously explored areas during long-term navigation, they may retain irrelevant and redundant historical perceptions, leading to suboptimal results. In this work, we propose $\textbf{RoboTron-Nav}$, a unified framework that integrates perception, planning, and prediction capabilities through multitask collaborations on navigation and embodied question answering tasks, thereby enhancing navigation performances. Furthermore, RoboTron-Nav employs an adaptive 3D-aware history sampling strategy to effectively and efficiently utilize historical observations. By leveraging large language model, RoboTron-Nav comprehends diverse commands and complex visual scenes, resulting in appropriate navigation actions. RoboTron-Nav achieves an 81.1% success rate in object goal navigation on the $\mathrm{CHORES}$-$\mathbb{S}$ benchmark, setting a new state-of-the-art performance. Project page: https://yvfengzhong.github.io/RoboTron-Nav
comment: ICCV 2025
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
"Set It Up": Functional Object Arrangement with Compositional Generative Models (Journal Version)
Functional object arrangement (FORM) is the task of arranging objects to fulfill a function, e.g., "set up a dining table for two". One key challenge here is that the instructions for FORM are often under-specified and do not explicitly specify the desired object goal poses. This paper presents SetItUp, a neuro-symbolic framework that learns to specify the goal poses of objects from a few training examples and a structured natural-language task specification. SetItUp uses a grounding graph, which is composed of abstract spatial relations among objects (e.g., left-of), as its intermediate representation. This decomposes the FORM problem into two stages: (i) predicting this graph among objects and (ii) predicting object poses given the grounding graph. For (i), SetItUp leverages large language models (LLMs) to induce Python programs from a task specification and a few training examples. This program can be executed to generate grounding graphs in novel scenarios. For (ii), SetItUp pre-trains a collection of diffusion models to capture primitive spatial relations and online composes these models to predict object poses based on the grounding graph. We evaluated SetItUp on a dataset spanning three distinct task families: arranging tableware on a dining table, organizing items on a bookshelf, and laying out furniture in a bedroom. Experiments show that SetItUp outperforms existing models in generating functional, physically feasible, and aesthetically pleasing object arrangements. This article extends our conference paper published at Robotics: Science and Systems (RSS) 2024.
comment: This is the journal version accepted to the International Journal of Robotics Research (IJRR). It extends our prior work presented at Robotics: Science and Systems (RSS) 2024, with a new compositional program induction pipeline from natural language, and expanded evaluations on personalized bookshelf and bedroom furniture layout tasks
Diffusion Beats Autoregressive in Data-Constrained Settings
Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages over AR models remain underexplored. In this paper, we systematically study masked diffusion models in data-constrained settings-where training involves repeated passes over limited data-and find that they significantly outperform AR models when compute is abundant but data is scarce. Diffusion models make better use of repeated data, achieving lower validation loss and superior downstream performance. We interpret this advantage as implicit data augmentation: masked diffusion exposes the model to a diverse distribution of token orderings and prediction tasks, unlike AR's fixed left-to-right factorization. We find new scaling laws for diffusion models and derive a closed-form expression for the critical compute threshold at which diffusion begins to outperform AR. These results suggest that when data, not compute, is the bottleneck, diffusion models offer a compelling alternative to the standard AR paradigm. Our code is available at: https://diffusion-scaling.github.io.
comment: Project Webpage: https://diffusion-scaling.github.io
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
comment: 12 pages, corrected author list of references 27 and 38, removed duplicate reference of reference 6
Di-NeRF: Distributed NeRF for Collaborative Learning with Relative Pose Refinement
Collaborative mapping of unknown environments can be done faster and more robustly than a single robot. However, a collaborative approach requires a distributed paradigm to be scalable and deal with communication issues. This work presents a fully distributed algorithm enabling a group of robots to collectively optimize the parameters of a Neural Radiance Field (NeRF). The algorithm involves the communication of each robot's trained NeRF parameters over a mesh network, where each robot trains its NeRF and has access to its own visual data only. Additionally, the relative poses of all robots are jointly optimized alongside the model parameters, enabling mapping with less accurate relative camera poses. We show that multi-robot systems can benefit from differentiable and robust 3D reconstruction optimized from multiple NeRFs. Experiments on real-world and synthetic data demonstrate the efficiency of the proposed algorithm. See the website of the project for videos of the experiments and supplementary material (https://sites.google.com/view/di-nerf/home).
comment: 9 pages, 11 figures, Accepted in IEEE-RA-L
Fast and Robust Visuomotor Riemannian Flow Matching Policy
Diffusion-based visuomotor policies excel at learning complex robotic tasks by effectively combining visual data with high-dimensional, multi-modal action distributions. However, diffusion models often suffer from slow inference due to costly denoising processes or require complex sequential training arising from recent distilling approaches. This paper introduces Riemannian Flow Matching Policy (RFMP), a model that inherits the easy training and fast inference capabilities of flow matching (FM). Moreover, RFMP inherently incorporates geometric constraints commonly found in realistic robotic applications, as the robot state resides on a Riemannian manifold. To enhance the robustness of RFMP, we propose Stable RFMP (SRFMP), which leverages LaSalle's invariance principle to equip the dynamics of FM with stability to the support of a target Riemannian distribution. Rigorous evaluation on ten simulated and real-world tasks show that RFMP successfully learns and synthesizes complex sensorimotor policies on Euclidean and Riemannian spaces with efficient training and inference phases, outperforming Diffusion Policies and Consistency Policies.
comment: Accepted for publication in IEEE T-RO. Project website: https://sites.google.com/view/rfmp 17 pages, 12 figures, 12 tables
Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion IROS 2024
We introduce Reality Fusion, a novel robot teleoperation system that localizes, streams, projects, and merges a typical onboard depth sensor with a photorealistic, high resolution, high framerate, and wide field of view (FoV) rendering of the complex remote environment represented as 3D Gaussian splats (3DGS). Our framework enables robust egocentric and exocentric robot teleoperation in immersive VR, with the 3DGS effectively extending spatial information of a depth sensor with limited FoV and balancing the trade-off between data streaming costs and data visual quality. We evaluated our framework through a user study with 24 participants, which revealed that Reality Fusion leads to significantly better user performance, situation awareness, and user preferences. To support further research and development, we provide an open-source implementation with an easy-to-replicate custom-made telepresence robot, a high-performance virtual reality 3DGS renderer, and an immersive robot control package. (Source code: https://github.com/uhhhci/RealityFusion)
comment: Accepted at IROS 2024
Motion Planning Diffusion: Learning and Adapting Robot Motion Planning with Diffusion Models
The performance of optimization-based robot motion planning algorithms is highly dependent on the initial solutions, commonly obtained by running a sampling-based planner to obtain a collision-free path. However, these methods can be slow in high-dimensional and complex scenes and produce non-smooth solutions. Given previously solved path-planning problems, it is highly desirable to learn their distribution and use it as a prior for new similar problems. Several works propose utilizing this prior to bootstrap the motion planning problem, either by sampling initial solutions from it, or using its distribution in a maximum-a-posterior formulation for trajectory optimization. In this work, we introduce Motion Planning Diffusion (MPD), an algorithm that learns trajectory distribution priors with diffusion models. These generative models have shown increasing success in encoding multimodal data and have desirable properties for gradient-based motion planning, such as cost guidance. Given a motion planning problem, we construct a cost function and sample from the posterior distribution using the learned prior combined with the cost function gradients during the denoising process. Instead of learning the prior on all trajectory waypoints, we propose learning a lower-dimensional representation of a trajectory using linear motion primitives, particularly B-spline curves. This parametrization guarantees that the generated trajectory is smooth, can be interpolated at higher frequencies, and needs fewer parameters than a dense waypoint representation. We demonstrate the results of our method ranging from simple 2D to more complex tasks using a 7-dof robot arm manipulator. In addition to learning from simulated data, we also use human demonstrations on a real-world pick-and-place task.
Quaternion-Based Sliding Mode Control for Six Degrees of Freedom Flight Control of Quadrotors
Despite extensive research on sliding mode control (SMC) design for quadrotors, the existing approaches suffer from certain limitations. Euler angle-based SMC formulations suffer from poor performance in high-pitch or -roll maneuvers. Quaternion-based SMC approaches have unwinding issues and complex architecture. Coordinate-free methods are slow and only almost globally stable. This paper presents a new six degrees of freedom SMC flight controller to address the above limitations. We use a cascaded architecture with a position controller in the outer loop and a quaternion-based attitude controller in the inner loop. The position controller generates the desired trajectory for the attitude controller using a coordinate-free approach. The quaternion-based attitude controller uses the natural characteristics of the quaternion hypersphere, featuring a simple structure while providing global stability and avoiding unwinding issues. We compare our controller with three other common control methods conducting challenging maneuvers like flip-over and high-speed trajectory tracking in the presence of model uncertainties and disturbances. Our controller consistently outperforms the benchmark approaches with less control effort and actuator saturation, offering highly effective and efficient flight control.
WeatherEdit: Controllable Weather Editing with 4D Gaussian Field
In this work, we present WeatherEdit, a novel weather editing pipeline for generating realistic weather effects with controllable types and severity in 3D scenes. Our approach is structured into two key components: weather background editing and weather particle construction. For weather background editing, we introduce an all-in-one adapter that integrates multiple weather styles into a single pretrained diffusion model, enabling the generation of diverse weather effects in 2D image backgrounds. During inference, we design a Temporal-View (TV-) attention mechanism that follows a specific order to aggregate temporal and spatial information, ensuring consistent editing across multi-frame and multi-view images. To construct the weather particles, we first reconstruct a 3D scene using the edited images and then introduce a dynamic 4D Gaussian field to generate snowflakes, raindrops and fog in the scene. The attributes and dynamics of these particles are precisely controlled through physical-based modelling and simulation, ensuring realistic weather representation and flexible severity adjustments. Finally, we integrate the 4D Gaussian field with the 3D scene to render consistent and highly realistic weather effects. Experiments on multiple driving datasets demonstrate that WeatherEdit can generate diverse weather effects with controllable condition severity, highlighting its potential for autonomous driving simulation in adverse weather. See project page: https://jumponthemoon.github.io/w-edit
Vector Quantized-Elites: Unsupervised and Problem-Agnostic Quality-Diversity Optimization
Quality-Diversity algorithms have transformed optimization by prioritizing the discovery of diverse, high-performing solutions over a single optimal result. However, traditional Quality-Diversity methods, such as MAP-Elites, rely heavily on predefined behavior descriptors and complete prior knowledge of the task to define the behavior space grid, limiting their flexibility and applicability. In this work, we introduce Vector Quantized-Elites (VQ-Elites), a novel Quality-Diversity algorithm that autonomously constructs a structured behavior space grid using unsupervised learning, eliminating the need for prior task-specific knowledge. At the core of VQ-Elites is the integration of Vector Quantized Variational Autoencoders, which enables the dynamic learning of behavior descriptors and the generation of a structured, rather than unstructured, behavior space grid -- a significant advancement over existing unsupervised Quality-Diversity approaches. This design establishes VQ-Elites as a flexible, robust, and task-agnostic optimization framework. To further enhance the performance of unsupervised Quality-Diversity algorithms, we introduce behavior space bounding and cooperation mechanisms, which significantly improve convergence and performance, as well as the Effective Diversity Ratio and Coverage Diversity Score, two novel metrics that quantify the actual diversity in the unsupervised setting. We validate VQ-Elites on robotic arm pose-reaching, mobile robot space-covering, and MiniGrid exploration tasks. The results demonstrate its ability to efficiently generate diverse, high-quality solutions, emphasizing its adaptability, scalability, robustness to hyperparameters, and potential to extend Quality-Diversity optimization to complex, previously inaccessible domains.
comment: 15 pages, 13 figures, 1 algorithm, 1 table
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems
We present RoboMemory, a brain-inspired multi-memory framework for lifelong learning in physical embodied systems, addressing critical challenges in real-world environments: continuous learning, multi-module memory latency, task correlation capture, and infinite-loop mitigation in closed-loop planning. Grounded in cognitive neuroscience, it integrates four core modules: the Information Preprocessor (thalamus-like), the Lifelong Embodied Memory System (hippocampus-like), the Closed-Loop Planning Module (prefrontal lobe-like), and the Low-Level Executer (cerebellum-like) to enable long-term planning and cumulative learning. The Lifelong Embodied Memory System, central to the framework, alleviates inference speed issues in complex memory frameworks via parallelized updates/retrieval across Spatial, Temporal, Episodic, and Semantic submodules. It incorporates a dynamic Knowledge Graph (KG) and consistent architectural design to enhance memory consistency and scalability. Evaluations on EmbodiedBench show RoboMemory outperforms the open-source baseline (Qwen2.5-VL-72B-Ins) by 25% in average success rate and surpasses the closed-source State-of-the-Art (SOTA) (Claude3.5-Sonnet) by 5%, establishing new SOTA. Ablation studies validate key components (critic, spatial memory, long-term memory), while real-world deployment confirms its lifelong learning capability with significantly improved success rates across repeated tasks. RoboMemory alleviates high latency challenges with scalability, serving as a foundational reference for integrating multi-modal memory systems in physical robots.
Unified Linear Parametric Map Modeling and Perception-aware Trajectory Planning for Mobile Robotics
Autonomous navigation in mobile robots, reliant on perception and planning, faces major hurdles in large-scale, complex environments. These include heavy computational burdens for mapping, sensor occlusion failures for UAVs, and traversal challenges on irregular terrain for UGVs, all compounded by a lack of perception-aware strategies. To address these challenges, we introduce Random Mapping and Random Projection (RMRP). This method constructs a lightweight linear parametric map by first mapping data to a high-dimensional space, followed by a sparse random projection for dimensionality reduction. Our novel Residual Energy Preservation Theorem provides theoretical guarantees for this process, ensuring critical geometric properties are preserved. Based on this map, we propose the RPATR (Robust Perception-Aware Trajectory Planner) framework. For UAVs, our method unifies grid and Euclidean Signed Distance Field (ESDF) maps. The front-end uses an analytical occupancy gradient to refine initial paths for safety and smoothness, while the back-end uses a closed-form ESDF for trajectory optimization. Leveraging the trained RMRP model's generalization, the planner predicts unobserved areas for proactive navigation. For UGVs, the model characterizes terrain and provides closed-form gradients, enabling online planning to circumvent large holes. Validated in diverse scenarios, our framework demonstrates superior mapping performance in time, memory, and accuracy, and enables computationally efficient, safe navigation for high-speed UAVs and UGVs. The code will be released to foster community collaboration.
Optimizing Mesh to Improve the Triangular Expansion Algorithm for Computing Visibility Regions
This paper addresses the problem of improving the query performance of the triangular expansion algorithm (TEA) for computing visibility regions by finding the most advantageous instance of the triangular mesh, the preprocessing structure. The TEA recursively traverses the mesh while keeping track of the visible region, the set of all points visible from a query point in a polygonal world. We show that the measured query time is approximately proportional to the number of triangle edge expansions during the mesh traversal. We propose a new type of triangular mesh that minimizes the expected number of expansions assuming the query points are drawn from a known probability distribution. We design a heuristic method to approximate the mesh and evaluate the approach on many challenging instances that resemble real-world environments. The proposed mesh improves the mean query times by 12-16% compared to the reference constrained Delaunay triangulation. The approach is suitable to boost offline applications that require computing millions of queries without addressing the preprocessing time. The implementation is publicly available to replicate our experiments and serve the community.
comment: 30 pages, 43 figures (including subfigures), accepted version
CARE: Enhancing Safety of Visual Navigation through Collision Avoidance via Repulsive Estimation
We propose CARE (Collision Avoidance via Repulsive Estimation) to improve the robustness of learning-based visual navigation methods. Recently, visual navigation models, particularly foundation models, have demonstrated promising performance by generating viable trajectories using only RGB images. However, these policies can generalize poorly to environments containing out-of-distribution (OOD) scenes characterized by unseen objects or different camera setups (e.g., variations in field of view, camera pose, or focal length). Without fine-tuning, such models could produce trajectories that lead to collisions, necessitating substantial efforts in data collection and additional training. To address this limitation, we introduce CARE, an attachable module that enhances the safety of visual navigation without requiring additional range sensors or fine-tuning of pretrained models. CARE can be integrated seamlessly into any RGB-based navigation model that generates local robot trajectories. It dynamically adjusts trajectories produced by a pretrained model using repulsive force vectors computed from depth images estimated directly from RGB inputs. We evaluate CARE by integrating it with state-of-the-art visual navigation models across diverse robot platforms. Real-world experiments show that CARE significantly reduces collisions (up to 100%) without compromising navigation performance in goal-conditioned navigation, and further improves collision-free travel distance (up to 10.7x) in exploration tasks. Project page: https://airlab-sogang.github.io/CARE/
comment: 16 pages, 6 figures
Extracting Visual Plans from Unlabeled Videos via Symbolic Guidance
Visual planning, by offering a sequence of intermediate visual subgoals to a goal-conditioned low-level policy, achieves promising performance on long-horizon manipulation tasks. To obtain the subgoals, existing methods typically resort to video generation models but suffer from model hallucination and computational cost. We present Vis2Plan, an efficient, explainable and white-box visual planning framework powered by symbolic guidance. From raw, unlabeled play data, Vis2Plan harnesses vision foundation models to automatically extract a compact set of task symbols, which allows building a high-level symbolic transition graph for multi-goal, multi-stage planning. At test time, given a desired task goal, our planner conducts planning at the symbolic level and assembles a sequence of physically consistent intermediate sub-goal images grounded by the underlying symbolic representation. Our Vis2Plan outperforms strong diffusion video generation-based visual planners by delivering 53\% higher aggregate success rate in real robot settings while generating visual plans 35$\times$ faster. The results indicate that Vis2Plan is able to generate physically consistent image goals while offering fully inspectable reasoning steps.
APEX-MR: Multi-Robot Asynchronous Planning and Execution for Cooperative Assembly
Compared to a single-robot workstation, a multi-robot system offers several advantages: 1) it expands the system's workspace, 2) improves task efficiency, and, more importantly, 3) enables robots to achieve significantly more complex and dexterous tasks, such as cooperative assembly. However, coordinating the tasks and motions of multiple robots is challenging due to issues, e.g., system uncertainty, task efficiency, algorithm scalability, and safety concerns. To address these challenges, this paper studies multi-robot coordination and proposes APEX-MR, an asynchronous planning and execution framework designed to safely and efficiently coordinate multiple robots to achieve cooperative assembly, e.g., LEGO assembly. In particular, APEX-MR provides a systematic approach to post-process multi-robot tasks and motion plans to enable robust asynchronous execution under uncertainty. Experimental results demonstrate that APEX-MR can significantly speed up the execution time of many long-horizon LEGO assembly tasks by 48% compared to sequential planning and 36% compared to synchronous planning on average. To further demonstrate performance, we deploy APEX-MR in a dual-arm system to perform physical LEGO assembly. To our knowledge, this is the first robotic system capable of performing customized LEGO assembly using commercial LEGO bricks. The experimental results demonstrate that the dual-arm system, with APEX-MR, can safely coordinate robot motions, efficiently collaborate, and construct complex LEGO structures. Our project website is available at https://intelligent-control-lab.github.io/APEX-MR/.
comment: 17 pages, 11 figures. To appear in the proceedings of RSS 2025
Task-driven SLAM Benchmarking For Robot Navigation IROS 2025
A critical use case of SLAM for mobile assistive robots is to support localization during a navigation-based task. Current SLAM benchmarks overlook the significance of repeatability (precision), despite its importance in real-world deployments. To address this gap, we propose a task-driven approach to SLAM benchmarking, TaskSLAM-Bench. It employs precision as a key metric, accounts for SLAM's mapping capabilities, and has easy-to-meet implementation requirements. Simulated and real-world testing scenarios of SLAM methods provide insights into the navigation performance properties of modern visual and LiDAR SLAM solutions. The outcomes show that passive stereo SLAM operates at a level of precision comparable to LiDAR SLAM in typical indoor environments. TaskSLAM-Bench complements existing benchmarks and offers richer assessment of SLAM performance in navigation-focused scenarios. Publicly available code permits in-situ SLAM testing in custom environments with properly equipped robots.
comment: 7 pages, 8 figures, 1 table. Accepted to IROS 2025
Following Is All You Need: Robot Crowd Navigation Using People As Planners
Navigating in crowded environments requires the robot to be equipped with high-level reasoning and planning techniques. Existing works focus on developing complex and heavyweight planners while ignoring the role of human intelligence. Since humans are highly capable agents who are also widely available in a crowd navigation setting, we propose an alternative scheme where the robot utilises people as planners to benefit from their effective planning decisions and social behaviours. Through a set of rule-based evaluations, we identify suitable human leaders who exhibit the potential to guide the robot towards its goal. Using a simple base planner, the robot follows the selected leader through shorthorizon subgoals that are designed to be straightforward to achieve. We demonstrate through both simulated and real-world experiments that our novel framework generates safe and efficient robot plans compared to existing planners, even without predictive or data-driven modules. Our method also brings human-like robot behaviours without explicitly defining traffic rules and social norms. Code will be available at https://github.com/centiLinda/PeopleAsPlanner.git.
comment: RAL 2025
Multi-Fidelity Reinforcement Learning for Time-Optimal Quadrotor Re-planning
High-speed online trajectory planning for UAVs poses a significant challenge due to the need for precise modeling of complex dynamics while also being constrained by computational limitations. This paper presents a multi-fidelity reinforcement learning method (MFRL) that aims to effectively create a realistic dynamics model and simultaneously train a planning policy that can be readily deployed in real-time applications. The proposed method involves the co-training of a planning policy and a reward estimator; the latter predicts the performance of the policy's output and is trained efficiently through multi-fidelity Bayesian optimization. This optimization approach models the correlation between different fidelity levels, thereby constructing a high-fidelity model based on a low-fidelity foundation, which enables the accurate development of the reward model with limited high-fidelity experiments. The framework is further extended to include real-world flight experiments in reinforcement learning training, allowing the reward model to precisely reflect real-world constraints and broadening the policy's applicability to real-world scenarios. We present rigorous evaluations by training and testing the planning policy in both simulated and real-world environments. The resulting trained policy not only generates faster and more reliable trajectories compared to the baseline snap minimization method, but it also achieves trajectory updates in 2 ms on average, while the baseline method takes several minutes.
comment: Accepted for publication in the International Journal of Robotics Research
BUFFER-X: Towards Zero-Shot Point Cloud Registration in Diverse Scenes ICCV 2025
Recent advances in deep learning-based point cloud registration have improved generalization, yet most methods still require retraining or manual parameter tuning for each new environment. In this paper, we identify three key factors limiting generalization: (a) reliance on environment-specific voxel size and search radius, (b) poor out-of-domain robustness of learning-based keypoint detectors, and (c) raw coordinate usage, which exacerbates scale discrepancies. To address these issues, we present a zero-shot registration pipeline called BUFFER-X by (a) adaptively determining voxel size/search radii, (b) using farthest point sampling to bypass learned detectors, and (c) leveraging patch-wise scale normalization for consistent coordinate bounds. In particular, we present a multi-scale patch-based descriptor generation and a hierarchical inlier search across scales to improve robustness in diverse scenes. We also propose a novel generalizability benchmark using 11 datasets that cover various indoor/outdoor scenarios and sensor modalities, demonstrating that BUFFER-X achieves substantial generalization without prior information or manual parameter tuning for the test datasets. Our code is available at https://github.com/MIT-SPARK/BUFFER-X.
comment: 20 pages, 14 figures. Accepted as a highlight paper at ICCV 2025
Uncertainty-aware Accurate Elevation Modeling for Off-road Navigation via Neural Processes
Terrain elevation modeling for off-road navigation aims to accurately estimate changes in terrain geometry in real-time and quantify the corresponding uncertainties. Having precise estimations and uncertainties plays a crucial role in planning and control algorithms to explore safe and reliable maneuver strategies. However, existing approaches, such as Gaussian Processes (GPs) and neural network-based methods, often fail to meet these needs. They are either unable to perform in real-time due to high computational demands, underestimating sharp geometry changes, or harming elevation accuracy when learned with uncertainties. Recently, Neural Processes (NPs) have emerged as a promising approach that integrates the Bayesian uncertainty estimation of GPs with the efficiency and flexibility of neural networks. Inspired by NPs, we propose an effective NP-based method that precisely estimates sharp elevation changes and quantifies the corresponding predictive uncertainty without losing elevation accuracy. Our method leverages semantic features from LiDAR and camera sensors to improve interpolation and extrapolation accuracy in unobserved regions. Also, we introduce a local ball-query attention mechanism to effectively reduce the computational complexity of global attention by 17\% while preserving crucial local and spatial information. We evaluate our method on off-road datasets having interesting geometric features, collected from trails, deserts, and hills. Our results demonstrate superior performance over baselines and showcase the potential of neural processes for effective and expressive terrain modeling in complex off-road environments.
comment: CoRL 2025
Hand-Eye Autonomous Delivery: Learning Humanoid Navigation, Locomotion and Reaching
We propose Hand-Eye Autonomous Delivery (HEAD), a framework that learns navigation, locomotion, and reaching skills for humanoids, directly from human motion and vision perception data. We take a modular approach where the high-level planner commands the target position and orientation of the hands and eyes of the humanoid, delivered by the low-level policy that controls the whole-body movements. Specifically, the low-level whole-body controller learns to track the three points (eyes, left hand, and right hand) from existing large-scale human motion capture data while high-level policy learns from human data collected by Aria glasses. Our modular approach decouples the ego-centric vision perception from physical actions, promoting efficient learning and scalability to novel scenes. We evaluate our method both in simulation and in the real-world, demonstrating humanoid's capabilities to navigate and reach in complex environments designed for humans.
Observability-Aware Control for Quadrotor Formation Flight with Range-only Measurement
Cooperative Localization (CL) is a promising approach to achieve safe quadrotor formation flight through precise positioning via low-cost inter-drone sensors. This paper develops an observability-aware control principle tailored to quadrotor formation flight with range-only inter-drone measurement. The control principle is based on a novel approximation of the local observability Gramian (LOG), which we name the Short-Term Local Observability Gramian (STLOG). The validity of STLOG is established by a proof of the link between local observability and estimation precision. We propose the Observability Predictive Controller (OPC), an implementation of our control principle under a receding-horizon framework, which optimizes a metric of the STLOG to maximize the minimum precision improvement along a trajectory. Monte Carlo simulations and experimental flight tests are conducted on a pair of quadrotors performing formation flight. The results show that the OPC improves positioning precision and estimator confidence, confirming the practical utility of the proposed approach.
comment: 36 pages, 5 figures
From Autonomy to Agency: Agentic Vehicles for Human-Centered Mobility Systems
Autonomy, from the Greek autos (self) and nomos (law), refers to the capacity to operate according to internal rules without external control. Accordingly, autonomous vehicles (AuVs) are defined as systems capable of perceiving their environment and executing preprogrammed tasks independently of external input. However, both research and real-world deployments increasingly showcase vehicles that demonstrate behaviors beyond this definition (including the SAE levels 1 to 6), such as interaction with humans and machines, goal adaptation, contextual reasoning, external tool use, and long-term planning, particularly with the integration of large language models (LLMs) and agentic AI systems. These developments reveal a conceptual gap between technical autonomy and the broader cognitive and social capabilities needed for future human-centered mobility systems. To address this, we introduce the concept of agentic vehicles (AgVs), referring to vehicles that integrate agentic AI to reason, adapt, and interact within complex environments. This paper presents a systems-level framework to characterize AgVs, focusing on their cognitive and communicative layers and differentiating them from conventional AuVs. It synthesizes relevant advances in agentic AI, robotics, multi-agent systems, and human-machine interaction, and highlights how agentic AI, through high-level reasoning and tool use, can function not merely as computational tools but as interactive agents embedded in mobility ecosystems. The paper concludes by identifying key challenges in the development and governance of AgVs, including safety, real-time control, public acceptance, ethical alignment, and regulatory frameworks.
Code-as-Symbolic-Planner: Foundation Model-Based Robot Planning via Symbolic Code Generation
Recent works have shown great potentials of Large Language Models (LLMs) in robot task and motion planning (TAMP). Current LLM approaches generate text- or code-based reasoning chains with sub-goals and action plans. However, they do not fully leverage LLMs' symbolic computing and code generation capabilities. Many robot TAMP tasks involve complex optimization under multiple constraints, where pure textual reasoning is insufficient. While augmenting LLMs with predefined solvers and planners improves performance, it lacks generalization across tasks. Given LLMs' growing coding proficiency, we enhance their TAMP capabilities by steering them to generate code as symbolic planners for optimization and constraint verification. Unlike prior work that uses code to interface with robot action modules, we steer LLMs to generate code as solvers, planners, and checkers for TAMP tasks requiring symbolic computing, while still leveraging textual reasoning to incorporate common sense. With a multi-round guidance and answer evolution framework, the proposed Code-as-Symbolic-Planner improves success rates by average 24.1\% over best baseline methods across seven typical TAMP tasks and three popular LLMs. Code-as-Symbolic-Planner shows strong effectiveness and generalizability across discrete and continuous environments, 2D/3D simulations and real-world settings, as well as single- and multi-robot tasks with diverse requirements. See our project website https://yongchao98.github.io/Code-Symbol-Planner/ for prompts, videos, and code.
comment: 7 pages, 7 figures, 3 tables
Multiagent Systems
Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation
Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot's capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We apply a Mixed-Initiative dialog paradigm to Collaborative human-roBot teaming and propose MICoBot, a system that handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog, and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities (measured by a simulation-pretrained affordance model) and the human's estimated availability to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. Our extensive evaluations in simulation and real-world -- on a physical robot with 18 unique human participants over 27 hours -- demonstrate the ability of our method to effectively collaborate with diverse human users, yielding significantly improved task success and user experience than a pure LLM baseline and other agent allocation models. See additional videos and materials at https://robin-lab.cs.utexas.edu/MicoBot/.
comment: Project website at https://robin-lab.cs.utexas.edu/MicoBot/
MoMA: A Mixture-of-Multimodal-Agents Architecture for Enhancing Clinical Prediction Modelling
Multimodal electronic health record (EHR) data provide richer, complementary insights into patient health compared to single-modality data. However, effectively integrating diverse data modalities for clinical prediction modeling remains challenging due to the substantial data requirements. We introduce a novel architecture, Mixture-of-Multimodal-Agents (MoMA), designed to leverage multiple large language model (LLM) agents for clinical prediction tasks using multimodal EHR data. MoMA employs specialized LLM agents ("specialist agents") to convert non-textual modalities, such as medical images and laboratory results, into structured textual summaries. These summaries, together with clinical notes, are combined by another LLM ("aggregator agent") to generate a unified multimodal summary, which is then used by a third LLM ("predictor agent") to produce clinical predictions. Evaluating MoMA on three prediction tasks using real-world datasets with different modality combinations and prediction settings, MoMA outperforms current state-of-the-art methods, highlighting its enhanced accuracy and flexibility across various tasks.
Congestion Mitigation Path Planning for Large-Scale Multi-Agent Navigation in Dense Environments
In high-density environments where numerous autonomous agents move simultaneously in a distributed manner, streamlining global flows to mitigate local congestion is crucial to maintain overall navigation efficiency. This paper introduces a novel path-planning problem, congestion mitigation path planning (CMPP), which embeds congestion directly into the cost function, defined by the usage of incoming edges along agents' paths. CMPP assigns a flow-based multiplicative penalty to each vertex of a sparse graph, which grows steeply where frequently-traversed paths intersect, capturing the intuition that congestion intensifies where many agents enter the same area from different directions. Minimizing the total cost yields a set of coarse-level, time-independent routes that autonomous agents can follow while applying their own local collision avoidance. We formulate the problem and develop two solvers: (i) an exact mixed-integer nonlinear programming solver for small instances, and (ii) a scalable two-layer search algorithm, A-CMTS, which quickly finds suboptimal solutions for large-scale instances and iteratively refines them toward the optimum. Empirical studies show that augmenting state-of-the-art collision-avoidance planners with CMPP significantly reduces local congestion and enhances system throughput in both discrete- and continuous-space scenarios. These results indicate that CMPP improves the performance of multi-agent systems in real-world applications such as logistics and autonomous-vehicle operations.
comment: Accepted for publication in IEEE Robotics and Automation Letters (RA-L), 2025. 9 pages, 6 figures, 3 tables. (C) 2025 IEEE. CC BY 4.0 license. Supplementary videos will be accessible via IEEE Xplore upon publication
Beyond Automation: Socratic AI, Epistemic Agency, and the Implications of the Emergence of Orchestrated Multi-Agent Learning Architectures
Generative AI is no longer a peripheral tool in higher education. It is rapidly evolving into a general-purpose infrastructure that reshapes how knowledge is generated, mediated, and validated. This paper presents findings from a controlled experiment evaluating a Socratic AI Tutor, a large language model designed to scaffold student research question development through structured dialogue grounded in constructivist theory. Conducted with 65 pre-service teacher students in Germany, the study compares interaction with the Socratic Tutor to engagement with an uninstructed AI chatbot. Students using the Socratic Tutor reported significantly greater support for critical, independent, and reflective thinking, suggesting that dialogic AI can stimulate metacognitive engagement and challenging recent narratives of de-skilling due to generative AI usage. These findings serve as a proof of concept for a broader pedagogical shift: the use of multi-agent systems (MAS) composed of specialised AI agents. To conceptualise this, we introduce the notion of orchestrated MAS, modular, pedagogically aligned agent constellations, curated by educators, that support diverse learning trajectories through differentiated roles and coordinated interaction. To anchor this shift, we propose an adapted offer-and-use model, in which students appropriate instructional offers from these agents. Beyond technical feasibility, we examine system-level implications for higher education institutions and students, including funding necessities, changes to faculty roles, curriculars, competencies and assessment practices. We conclude with a comparative cost-effectiveness analysis highlighting the scalability of such systems. In sum, this study contributes both empirical evidence and a conceptual roadmap for hybrid learning ecosystems that embed human-AI co-agency and pedagogical alignment.
Cognitive Duality for Adaptive Web Agents
Web navigation represents a critical and challenging domain for evaluating artificial general intelligence (AGI), demanding complex decision-making within high-entropy, dynamic environments with combinatorially explosive action spaces. Current approaches to building autonomous web agents either focus on offline imitation learning or online exploration, but rarely integrate both paradigms effectively. Inspired by the dual-process theory of human cognition, we derive a principled decomposition into fast System 1 and slow System 2 cognitive processes. This decomposition provides a unifying perspective on existing web agent methodologies, bridging the gap between offline learning of intuitive reactive behaviors and online acquisition of deliberative planning capabilities. We implement this framework in CogniWeb, a modular agent architecture that adaptively toggles between fast intuitive processing and deliberate reasoning based on task complexity. Our evaluation on WebArena demonstrates that CogniWeb achieves competitive performance (43.96% success rate) while maintaining significantly higher efficiency (75% reduction in token usage).
Flow-Based Task Assignment for Large-Scale Online Multi-Agent Pickup and Delivery
We study the problem of online Multi-Agent Pickup and Delivery (MAPD), where a team of agents must repeatedly serve dynamically appearing tasks on a shared map. Existing online methods either rely on simple heuristics, which result in poor decisions, or employ complex reasoning, which suffers from limited scalability under real-time constraints. In this work, we focus on the task assignment subproblem and formulate it as a minimum-cost flow over the environment graph. This eliminates the need for pairwise distance computations and allows agents to be simultaneously assigned to tasks and routed toward them. The resulting flow network also supports efficient guide path extraction to integrate with the planner and accelerates planning under real-time constraints. To improve solution quality, we introduce two congestion-aware edge cost models that incorporate real-time traffic estimates. This approach supports real-time execution and scales to over 20000 agents and 30000 tasks within 1-second planning time, outperforming existing baselines in both computational efficiency and assignment quality.
CLAPP: The CLASS LLM Agent for Pair Programming
We introduce CLAPP (CLASS LLM Agent for Pair Programming), an interactive AI assistant designed to support researchers working with the Einstein-Boltzmann solver CLASS. CLAPP leverages large language models (LLMs) and domain-specific retrieval to provide conversational coding support for CLASS-answering questions, generating code, debugging errors, and producing plots. Its architecture combines multi-agent LLM orchestration, semantic search across CLASS documentation, and a live Python execution environment. Deployed as a user-friendly web application, CLAPP lowers the entry barrier for scientists unfamiliar with AI tools and enables more productive human-AI collaboration in computational and numerical cosmology. The app is available at https://classclapp.streamlit.app
comment: Code: https://github.com/santiagocasas/clapp, Streamlit app: https://classclapp.streamlit.app
Domain-driven Metrics for Reinforcement Learning: A Case Study on Epidemic Control using Agent-based Simulation
For the development and optimization of agent-based models (ABMs) and rational agent-based models (RABMs), optimization algorithms such as reinforcement learning are extensively used. However, assessing the performance of RL-based ABMs and RABMS models is challenging due to the complexity and stochasticity of the modeled systems, and the lack of well-standardized metrics for comparing RL algorithms. In this study, we are developing domain-driven metrics for RL, while building on state-of-the-art metrics. We demonstrate our ``Domain-driven-RL-metrics'' using policy optimization on a rational ABM disease modeling case study to model masking behavior, vaccination, and lockdown in a pandemic. Our results show the use of domain-driven rewards in conjunction with traditional and state-of-the-art metrics for a few different simulation scenarios such as the differential availability of masks.
G-UBS: Towards Robust Understanding of Implicit Feedback via Group-Aware User Behavior Simulation
User feedback is critical for refining recommendation systems, yet explicit feedback (e.g., likes or dislikes) remains scarce in practice. As a more feasible alternative, inferring user preferences from massive implicit feedback has shown great potential (e.g., a user quickly skipping a recommended video usually indicates disinterest). Unfortunately, implicit feedback is often noisy: a user might skip a video due to accidental clicks or other reasons, rather than disliking it. Such noise can easily misjudge user interests, thereby undermining recommendation performance. To address this issue, we propose a novel Group-aware User Behavior Simulation (G-UBS) paradigm, which leverages contextual guidance from relevant user groups, enabling robust and in-depth interpretation of implicit feedback for individual users. Specifically, G-UBS operates via two key agents. First, the User Group Manager (UGM) effectively clusters users to generate group profiles utilizing a ``summarize-cluster-reflect" workflow based on LLMs. Second, the User Feedback Modeler (UFM) employs an innovative group-aware reinforcement learning approach, where each user is guided by the associated group profiles during the reinforcement learning process, allowing UFM to robustly and deeply examine the reasons behind implicit feedback. To assess our G-UBS paradigm, we have constructed a Video Recommendation benchmark with Implicit Feedback (IF-VR). To the best of our knowledge, this is the first multi-modal benchmark for implicit feedback evaluation in video recommendation, encompassing 15k users, 25k videos, and 933k interaction records with implicit feedback. Extensive experiments on IF-VR demonstrate that G-UBS significantly outperforms mainstream LLMs and MLLMs, with a 4.0% higher proportion of videos achieving a play rate > 30% and 14.9% higher reasoning accuracy on IF-VR.
Semantic Reasoning Meets Numerical Precision: An LLM-Powered Multi-Agent System for Power Grid Control
The increasing penetration of Distributed Energy Resources (DERs), widespread adoption of Electric Vehicles (EVs), and the growing frequency of extreme weather events have significantly increased the complexity of power grid planning, operation, and management. Traditional rule-based systems and numerical optimization approaches often struggle with the scale, dynamics, and adaptability required by modern power networks. This paper introduces Grid-Agent, an autonomous, AI-driven framework that combines Large Language Models (LLMs) with multi-agent reinforcement learning to detect and remediate grid violations in real time. Grid-Agent integrates semantic reasoning with numerical precision through a modular agent architecture: a planning agent generates coordinated action sequences using numerical power flow solvers, while a validation agent evaluates system stability and action effectiveness via sandboxed execution with safety rollbacks. To ensure scalability, Grid-Agent incorporates an adaptive multiscale network representation that dynamically selects optimal encoding schemes based on network size and complexity. The framework enables coordinated violation resolution through optimizing switch configurations, battery deployment, and load curtailment strategies. Experimental results in standard IEEE and CIGRE test systems (IEEE 69-bus, CIGRE MV, and IEEE 30-bus) demonstrate superior violation mitigation performance. Additionally, the framework's built-in data collection and learning capabilities enable continuous learning and adaptation to diverse network topologies. The autonomous nature of the framework makes it particularly suitable for modern smart grid applications requiring rapid response to dynamic operating conditions.
LLM-Based Intelligent Agents for Music Recommendation: A Comparison with Classical Content-Based Filtering
The growing availability of music on streaming platforms has led to information overload for users. To address this issue and enhance the user experience, increasingly sophisticated recommendation systems have been proposed. This work investigates the use of Large Language Models (LLMs) from the Gemini and LLaMA families, combined with intelligent agents, in a multi-agent personalized music recommendation system. The results are compared with a traditional content-based recommendation model, considering user satisfaction, novelty, and computational efficiency. LLMs achieved satisfaction rates of up to \textit{89{,}32\%}, indicating their promising potential in music recommendation systems.
comment: 12 pages, in Portuguese language, 2 figures, 5 tables, 3 formulas. To be published in the Proceedings of the Encontro Nacional de Intelig\^encia Artificial e Computacional (ENIAC 2025)
Position-Based Flocking for Robust Alignment
This paper presents a position-based flocking model for interacting agents, balancing cohesion-separation and alignment to achieve stable collective motion. The model modifies a position-velocity-based approach by approximating velocity differences using initial and current positions, introducing a threshold weight to ensure sustained alignment. Simulations with 50 agents in 2D demonstrate that the position-based model produces stronger alignment and more rigid and compact formations compared to the position-velocity-based model. The alignment metric and separation distances highlight the efficacy of the proposed model in achieving robust flocking behavior. The model's use of positions ensures robust alignment, with applications in robotics and collective dynamics.
comment: Accepted for "The 9th International Symposium on Swarm Behavior and Bio-Inspired Robotics 2025" Simulation video for Fig. 1 at https://youtu.be/yID0taa7X7o Simulation video for Fig. 3 at https://youtu.be/8I_yBhY2imo Simulation video for Fig. 5 at https://youtu.be/fTG7pgBPMZg
Adaptive Inference through Bayesian and Inverse Bayesian Inference with Symmetry-Bias in Nonstationary Environments
This study proposes a novel inference framework known as Bayesian and inverse Bayesian (BIB) inference, which incorporates symmetry bias into the Bayesian updating process to perform both conventional and inverse Bayesian updates concurrently. The model was evaluated in a sequential estimation task involving observations drawn from a Gaussian distribution with a stochastically time-varying mean. Conventional Bayesian inference is constrained by a fundamental trade-off between adaptability to abrupt environmental changes and accuracy during stable periods. The BIB framework addresses this limitation by dynamically modulating the learning rate via inverse Bayesian updates, thereby enhancing adaptive flexibility. Notably, the BIB model exhibited spontaneous bursts in the learning rate during environmental transitions, transiently entering high-sensitivity states that facilitated rapid adaptation.This burst-relaxation dynamic serves as a mechanism for balancing adaptability and accuracy. Furthermore, avalanche analysis, detrended fluctuation analysis, and power spectral analysis revealed that the BIB system likely operates near a critical state-a property not observed in standard Bayesian inference. This suggests that the BIB model uniquely achieves a coexistence of computational efficiency and critical dynamics, resolving the adaptability-accuracy trade-off while maintaining a scale-free behavior. These findings offer a new computational perspective on scale-free dynamics in natural systems and provide valuable insights for the design of adaptive inference systems in nonstationary environments.
Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning
Machine learning is now ubiquitous in societal decision-making, for example in evaluating job candidates or loan applications, and it is increasingly important to take into account how classified agents will react to the learning algorithms. The majority of recent literature on strategic classification has focused on reducing and countering deceptive behaviors by the classified agents, but recent work of Attias et al. identifies surprising properties of learnability when the agents genuinely improve in order to attain the desirable classification, such as smaller generalization error than standard PAC-learning. In this paper we characterize so-called learnability with improvements across multiple new axes. We introduce an asymmetric variant of minimally consistent concept classes and use it to provide an exact characterization of proper learning with improvements in the realizable setting. While prior work studies learnability only under general, arbitrary agent improvement regions, we give positive results for more natural Euclidean ball improvement sets. In particular, we characterize improper learning under a mild generative assumption on the data distribution. We further show how to learn in more challenging settings, achieving lower generalization error under well-studied bounded noise models and obtaining mistake bounds in realizable and agnostic online learning. We resolve open questions posed by Attias et al. for both proper and improper learning.
comment: 26 pages
SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers
This study evaluates large language models (LLMs) in generating code from algorithm descriptions in recent NLP papers. The task requires two key competencies: (1) algorithm comprehension: synthesizing information from papers and academic literature to understand implementation logic, and (2) coding expertise: identifying dependencies and correctly implementing necessary APIs. To facilitate rigorous evaluation, we introduce SciReplicate-Bench, a benchmark of 100 tasks from 36 NLP papers published in 2024, featuring detailed annotations and comprehensive test cases. Building on SciReplicate-Bench, we propose Sci-Reproducer, a dual-agent framework consisting of a Paper Agent that interprets algorithmic concepts from literature and a Code Agent that retrieves dependencies from repositories and implements solutions. To assess algorithm understanding, we introduce reasoning graph accuracy, which quantifies similarity between generated and reference reasoning graphs derived from code comments and structure. For evaluating implementation quality, we employ execution accuracy, CodeBLEU, and repository dependency/API recall metrics. In our experiments, we evaluate various powerful non-reasoning and reasoning LLMs as foundational models. The best-performing LLM using \ModelName~achieves only 39% execution accuracy, highlighting the benchmark's difficulty. Our analysis identifies missing or inconsistent algorithm descriptions as key barriers to successful reproduction. We make available our benchmark and code at https://github.com/xyzCS/SciReplicate-Bench and project homepage at https://xyzcs.github.io/scireplicate.github.io/.
What Do Agents Think Others Would Do? Level-2 Inverse Games for Inferring Agents' Estimates of Others' Objectives
Effectively interpreting strategic interactions among multiple agents requires us to infer each agent's objective from limited information. Existing inverse game-theoretic approaches frame this challenge in terms of a "level-1" inference problem, in which we take the perspective of a third-party observer and assume that individual agents share complete knowledge of one another's objectives. However, this assumption breaks down in decentralized, real-world decision scenarios like urban driving and bargaining, in which agents may act based on conflicting views of one another's objectives. We demonstrate the necessity of inferring agents' heterogeneous estimates of each other's objectives through empirical examples, and by theoretically characterizing the prediction error of level-1 inference on fictitious gameplay data from linear-quadratic games. To address this fundamental issue, we propose a framework for level-2 inference to address the question: "What does each agent believe about all agents' objectives?" We prove that the level-2 inference problem is non-convex even in benign settings like linear-quadratic games, and we develop an efficient gradient-based approach for identifying local solutions. Experiments on a synthetic urban driving example show that our approach uncovers nuanced misalignments that level-1 methods miss.
comment: 7 pages + references + appendix
Systems and Control (CS)
Towards Generalizable Safety in Crowd Navigation via Conformal Uncertainty Handling
Mobile robots navigating in crowds trained using reinforcement learning are known to suffer performance degradation when faced with out-of-distribution scenarios. We propose that by properly accounting for the uncertainties of pedestrians, a robot can learn safe navigation policies that are robust to distribution shifts. Our method augments agent observations with prediction uncertainty estimates generated by adaptive conformal inference, and it uses these estimates to guide the agent's behavior through constrained reinforcement learning. The system helps regulate the agent's actions and enables it to adapt to distribution shifts. In the in-distribution setting, our approach achieves a 96.93% success rate, which is over 8.80% higher than the previous state-of-the-art baselines with over 3.72 times fewer collisions and 2.43 times fewer intrusions into ground-truth human future trajectories. In three out-of-distribution scenarios, our method shows much stronger robustness when facing distribution shifts in velocity variations, policy changes, and transitions from individual to group dynamics. We deploy our method on a real robot, and experiments show that the robot makes safe and robust decisions when interacting with both sparse and dense crowds. Our code and videos are available on https://gen-safe-nav.github.io/.
comment: 9th Conference on Robot Learning (CoRL 2025); Project website: https://gen-safe-nav.github.io/. arXiv admin note: text overlap with arXiv:2407.17460
Error Bounds for Radial Network Topology Learning from Quantized Measurements
We probabilistically bound the error of a solution to a radial network topology learning problem where both connectivity and line parameters are estimated. In our model, data errors are introduced by the precision of the sensors, i.e., quantization. This produces a nonlinear measurement model that embeds the operation of the sensor communication network into the learning problem, expanding beyond the additive noise models typically seen in power system estimation algorithms. We show that the error of a learned radial network topology is proportional to the quantization bin width and grows sublinearly in the number of nodes, provided that the number of samples per node is logarithmic in the number of nodes.
comment: 3 pages, 2 figures
Design and Analysis of a Vanadium Dioxide-Based Ultra-Broadband Terahertz Metamaterial Absorber
This paper presents a VO2-based metamaterial absorber optimized for ultra-broadband, polarization-insensitive performance in the terahertz (THz) frequency range. The absorber consists of a patterned VO2 metasurface, a low-loss MF2 dielectric spacer, and a gold ground plane. Exploiting the phase transition of VO2, the design enables dynamic control of electromagnetic absorption. Full-wave simulations show an average absorptance of 98.15% across a 5.38THz bandwidth (5.72-11.11THz) and over 99% absorption sustained across 3.35THz. The absorber maintains stable performance for varying polarization angles and both TE and TM modes under oblique incidence. Impedance analysis confirms strong matching to free space, reducing reflection and eliminating transmission. Parametric analysis investigates the influence of VO2 conductivity, MF2 thickness, and unit cell periodicity on performance. Compared to recent THz metamaterial absorbers, the proposed design achieves broader bandwidth, higher efficiency, and simpler implementation. These characteristics make it suitable for THz sensing, imaging, wireless communication, and adaptive photonic systems, and position it as a promising platform for tunable and reconfigurable THz modules.
comment: 6 pages, 3 figures, 2 tables
Research on integrated intelligent energy management system based on big data analysis and machine learning
The application of big data is one of the significant features of integrated smart energy. Applying it to the file management of integrated smart energy projects is of great significance for improving the efficiency of project management and control. This article first discussed the benefits and challenges of implementing big data analysis in document management and control of integrated smart energy projects. In addition, an implementation framework for big data analysis in integrated smart energy project document management was developed, and a method for optimizing the efficiency of integrated smart energy project document management through machine learning was proposed. Using various types of data and information generated during the project document management process, the efficiency of the entire process project document control through three different machine learning methods was optimized. The result of fitting a penalty linear regression model shows that when there is enough data as a training set, the accuracy of the model achieved can reach over 95\%. By using big data analysis and machine learning to analyze the efficiency of comprehensive smart energy project document management, it is possible to track the entire process of comprehensive smart energy project documents and optimize business processes, thereby strengthening project construction control and improving project construction efficiency.
comment: 6 pages, 4 figures, conference
A 20-Year Retrospective on Power and Thermal Modeling and Management
As processor performance advances, increasing power densities and complex thermal behaviors threaten both energy efficiency and system reliability. This survey covers more than two decades of research on power and thermal modeling and management in modern processors. We start by comparing analytical, regression-based, and neural network-based techniques for power estimation, then review thermal modeling methods, including finite element, finite difference, and data-driven approaches. Next, we categorize dynamic runtime management strategies that balance performance, power consumption, and reliability. Finally, we conclude with a discussion of emerging challenges and promising research directions.
Sub- μ W Battery-Less and Oscillator-Less Wi-Fi Backscattering Transmitter Reusing RF Signal for Harvesting, Communications, and Motion Detection
In this paper, a sub-uW power 802.11b backscattering transmitter is presented to enable reuse of the same incident wave for three purposes: RF harvesting, backscattering communications and position/motion sensing. The removal of the battery and any off-chip motion sensor (e.g., MEMS) enables unprecedented level of miniaturization and ubiquity, unrestricted device lifespan, low fabrication and maintenance cost. The uW power wall for WiFi transmitters is broken for the first time via local oscillator elimination, as achieved by extracting its frequency through second-order intermodulation of a twotone incident wave. The two-tone scheme also enables a cumulative harvesting/transmission/sensing sensitivity down to Pmin -19 dBm. Position/motion sensing is enabled by using the harvested voltage as a proxy for the Received Signal Strength (RSS), allowing to sense the chip location with respect to the tone generator(s) shared across tags in indoor neighborhoods.
Distributionally Robust System Level Synthesis With Output Feedback Affine Control Policy
This paper studies the finite-horizon robust optimal control of linear systems subject to model mismatch and additive stochastic disturbances. Utilizing the system level synthesis (SLS) parameterization, we propose a novel SLS design using output-feedback affine control policy and extend it to a distributionally robust setting to improve system resilience by minimizing the cost function while ensuring constraint satisfaction against the worst-case uncertainty distribution. The scopes of model mismatch and stochastic disturbances are quantified using the 1-norm and a Wasserstein metric-based ambiguity set, respectively. For the closed-loop dynamics, we analyze the distributional shift between the predicted output-input response -- computed using nominal parameters and empirical disturbance samples -- and the actual closed-loop distribution, highlighting its dependence on model mismatch and SLS parameterization. Assuming convex and Lipschitz continuous cost functions and constraints, we derive a tractable reformulation of the distributionally robust SLS (DR-SLS) problem by leveraging tools from robust control and distributionally robust optimization (DRO). Numerical experiments validate the performance and robustness of the proposed approach.
F2PASeg: Feature Fusion for Pituitary Anatomy Segmentation in Endoscopic Surgery
Pituitary tumors often cause deformation or encapsulation of adjacent vital structures. Anatomical structure segmentation can provide surgeons with early warnings of regions that pose surgical risks, thereby enhancing the safety of pituitary surgery. However, pixel-level annotated video stream datasets for pituitary surgeries are extremely rare. To address this challenge, we introduce a new dataset for Pituitary Anatomy Segmentation (PAS). PAS comprises 7,845 time-coherent images extracted from 120 videos. To mitigate class imbalance, we apply data augmentation techniques that simulate the presence of surgical instruments in the training data. One major challenge in pituitary anatomy segmentation is the inconsistency in feature representation due to occlusions, camera motion, and surgical bleeding. By incorporating a Feature Fusion module, F2PASeg is proposed to refine anatomical structure segmentation by leveraging both high-resolution image features and deep semantic embeddings, enhancing robustness against intraoperative variations. Experimental results demonstrate that F2PASeg consistently segments critical anatomical structures in real time, providing a reliable solution for intraoperative pituitary surgery planning. Code: https://github.com/paulili08/F2PASeg.
A Multi-view Landmark Representation Approach with Application to GNSS-Visual-Inertial Odometry
Invariant Extended Kalman Filter (IEKF) has been a significant technique in vision-aided sensor fusion. However, it usually suffers from high computational burden when jointly optimizing camera poses and the landmarks. To improve its efficiency and applicability for multi-sensor fusion, we present a multi-view pose-only estimation approach with its application to GNSS-Visual-Inertial Odometry (GVIO) in this paper. Our main contribution is deriving a visual measurement model which directly associates landmark representation with multiple camera poses and observations. Such a pose-only measurement is proven to be tightly-coupled between landmarks and poses, and maintain a perfect null space that is independent of estimated poses. Finally, we apply the proposed approach to a filter based GVIO with a novel feature management strategy. Both simulation tests and real-world experiments are conducted to demonstrate the superiority of the proposed method in terms of efficiency and accuracy.
Passive nonlinear FIR filters for data-driven control
We propose a new class of passive nonlinear finite impulse response operators. This class is constructed by the action of finite impulse response filters in a lifted space. This allows for efficient control synthesis through constrained optimization. Closed-loop performance is taken into account through least-squares fitting, based on the theory of virtual reference feedback tuning. Passivity is established through efficient linear constraints, based on sampling in the frequency domain. Because of passivity, this class of operators is particularly suited for the control of physical systems, such as electromechanical systems.
comment: 15 pages, 12 figures
Advanced Hybrid Transformer LSTM Technique with Attention and TS Mixer for Drilling Rate of Penetration Prediction
The Rate of Penetration (ROP) is crucial for optimizing drilling operations; however, accurately predicting it is hindered by the complex, dynamic, and high-dimensional nature of drilling data. Traditional empirical, physics-based, and basic machine learning models often fail to capture intricate temporal and contextual relationships, resulting in suboptimal predictions and limited real-time utility. To address this gap, we propose a novel hybrid deep learning architecture integrating Long Short-Term Memory (LSTM) networks, Transformer encoders, Time-Series Mixer (TS-Mixer) blocks, and attention mechanisms to synergistically model temporal dependencies, static feature interactions, global context, and dynamic feature importance. Evaluated on a real-world drilling dataset, our model outperformed benchmarks (standalone LSTM, TS-Mixer, and simpler hybrids) with an R-squared score of 0.9988 and a Mean Absolute Percentage Error of 1.447%, as measured by standard regression metrics (R-squared, MAE, RMSE, MAPE). Model interpretability was ensured using SHAP and LIME, while actual vs. predicted curves and bias checks confirmed accuracy and fairness across scenarios. This advanced hybrid approach enables reliable real-time ROP prediction, paving the way for intelligent, cost-effective drilling optimization systems with significant operational impact.
comment: 37 Pages, 19 Figures, 9 Tables
Overview of Controllability Definitions in Supervisory Control Theory
In the field of supervisory control theory, the literature often proposes different definitions for the same concept, making it difficult to understand how these definitions are related. This is definitely so for the fundamental notion of controllability of a supervisor w.r.t. a plant. This paper lists definitions of controllability found in the literature and studies their relationships in settings of both deterministic and nondeterministic automata. In the general context, where both the supervisor and the plant are allowed to be nondeterministic, the notions of controllability as described by Flordal and Malik, and uncontrollable event admissibility by Kushi and Takai are equivalent. These are also the only notions that imply the traditional notion of (language) controllability. From a practical perspective, one is often more interested in controllability of a supervised plant w.r.t. a plant. In this context, in addition to the previous two controllability notions, state controllability by Zhou et al. implies language controllability.
comment: 18 pages
Preparing for the worst: Long-term and short-term weather extremes in resource adequacy assessment
Security of supply is a common and important concern when integrating renewables in net-zero power systems. Extreme weather affects both demand and supply leading to power system stress; in Europe this stress spreads continentally beyond the meteorological root cause. We use an approach based on shadow prices to identify periods of elevated stress called system-defining events and analyse their impact on the power system. By classifying different types of system-defining events, we identify challenges to power system operation and planning. Crucially, we find the need for sufficient resilience back-up (power) capacities whose financial viability is precarious due to weather variability. Furthermore, we disentangle short- and long-term resilience challenges with distinct metrics and stress tests to incorporate both into future energy modelling assessments. Our methodology and implementation in the open model PyPSA-Eur can be re-applied to other systems and help researchers and policymakers in building more resilient and adequate energy systems.
Probabilistic Alternating Simulations for Policy Synthesis in Uncertain Stochastic Dynamical Systems
A classical approach to formal policy synthesis in stochastic dynamical systems is to construct a finite-state abstraction, often represented as a Markov decision process (MDP). The correctness of these approaches hinges on a behavioural relation between the dynamical system and its abstraction, such as a probabilistic simulation relation. However, probabilistic simulation relations do not suffice when the system dynamics are, next to being stochastic, also subject to nondeterministic (i.e., set-valued) disturbances. In this work, we extend probabilistic simulation relations to systems with both stochastic and nondeterministic disturbances. Our relation, which is inspired by a notion of alternating simulation, generalises existing relations used for verification and policy synthesis used in several works. Intuitively, our relation allows reasoning probabilistically over stochastic uncertainty, while reasoning robustly (i.e., adversarially) over nondeterministic disturbances. We experimentally demonstrate the applicability of our relations for policy synthesis in a 4D-state Dubins vehicle.
comment: Presented at CDC 2025
RCUKF: Data-Driven Modeling Meets Bayesian Estimation
Accurate modeling is crucial in many engineering and scientific applications, yet obtaining a reliable process model for complex systems is often challenging. To address this challenge, we propose a novel framework, reservoir computing with unscented Kalman filtering (RCUKF), which integrates data-driven modeling via reservoir computing (RC) with Bayesian estimation through the unscented Kalman filter (UKF). The RC component learns the nonlinear system dynamics directly from data, serving as a surrogate process model in the UKF prediction step to generate state estimates in high-dimensional or chaotic regimes where nominal mathematical models may fail. Meanwhile, the UKF measurement update integrates real-time sensor data to correct potential drift in the data-driven model. We demonstrate RCUKF effectiveness on well-known benchmark problems and a real-time vehicle trajectory estimation task in a high-fidelity simulation environment.
comment: 6 pages, 6 figures. Accepted at IFAC MECC 2025 (Modeling, Estimation and Control Conference)
Uncovering the Influence Flow Model of Transistor Amplifiers, Its Reconstruction and Application
Multistage transistor amplifiers can be effectively modeled as network of dynamic systems where individual amplifier stages interact through couplings that are dynamic in nature. Using circuit analysis techniques, we show that a large class of transistor amplifiers can be modeled as Linear Dynamic Influence Model (LDIM), where the interactions between different amplifier stages are modeled as linear dynamic equations. LDIM modeling of transistor circuits leads to application of data-driven network reconstruction techniques to characterize stage interactions and identify faults and critical circuit parameters efficiently. Employing graphical modeling techniques and Wiener filtering, we demonstrate that the network structure can be reconstructed solely from voltage time-series measurements sampled at specified points in the circuit. The efficacy of these network reconstruction methods in multistage amplifiers is demonstrated through extensive simulations involving multiple amplifier circuits in Cadence, as well as experimental results on physical hardware. The ability to infer network structure directly from measurement data offers designers and users efficient tools to design, analyze, and debug amplifier circuits. To demonstrate the utility of network reconstruction in multistage amplifier circuits, a fault diagnosis method leveraging these techniques is presented.
Distributed Quantized Average Consensus in Open Multi-Agent Systems with Dynamic Communication Links
In this paper, we focus on the distributed quantized average consensus problem in open multi-agent systems consisting of communication links that change dynamically over time. Open multi-agent systems exhibiting the aforementioned characteristic are referred to as \textit{open dynamic multi-agent systems} in this work. We present a distributed algorithm that enables active nodes in the open dynamic multi-agent system to calculate the quantized average of their initial states. Our algorithm consists of the following advantages: (i) ensures efficient communication by enabling nodes to exchange quantized valued messages, and (ii) exhibits finite time convergence to the desired solution. We establish the correctness of our algorithm and we present necessary and sufficient topological conditions for it to successfully solve the quantized average consensus problem in an open dynamic multi-agent system. Finally, we illustrate the performance of our algorithm with numerical simulations.
Distributed Optimization and Learning for Automated Stepsize Selection with Finite Time Coordination
Distributed optimization and learning algorithms are designed to operate over large scale networks enabling processing of vast amounts of data effectively and efficiently. One of the main challenges for ensuring a smooth learning process in gradient-based methods is the appropriate selection of a learning stepsize. Most current distributed approaches let individual nodes adapt their stepsizes locally. However, this may introduce stepsize heterogeneity in the network, thus disrupting the learning process and potentially leading to divergence. In this paper, we propose a distributed learning algorithm that incorporates a novel mechanism for automating stepsize selection among nodes. Our main idea relies on implementing a finite time coordination algorithm for eliminating stepsize heterogeneity among nodes. We analyze the operation of our algorithm and we establish its convergence to the optimal solution. We conclude our paper with numerical simulations for a linear regression problem, showcasing that eliminating stepsize heterogeneity enhances convergence speed and accuracy against current approaches.
Integrating Vision Foundation Models with Reinforcement Learning for Enhanced Object Interaction
This paper presents a novel approach that integrates vision foundation models with reinforcement learning to enhance object interaction capabilities in simulated environments. By combining the Segment Anything Model (SAM) and YOLOv5 with a Proximal Policy Optimization (PPO) agent operating in the AI2-THOR simulation environment, we enable the agent to perceive and interact with objects more effectively. Our comprehensive experiments, conducted across four diverse indoor kitchen settings, demonstrate significant improvements in object interaction success rates and navigation efficiency compared to a baseline agent without advanced perception. The results show a 68% increase in average cumulative reward, a 52.5% improvement in object interaction success rate, and a 33% increase in navigation efficiency. These findings highlight the potential of integrating foundation models with reinforcement learning for complex robotic tasks, paving the way for more sophisticated and capable autonomous agents.
comment: Published in the Proceedings of the 2025 3rd International Conference on Robotics, Control and Vision Engineering (RCVE'25). 6 pages, 3 figures, 1 table
A United Framework for Planning Electric Vehicle Charging Accessibility
The shift towards electric vehicles (EVs) is crucial for establishing sustainable and low-emission urban transportation systems. However, the success of this transition depends on the strategic placement of the charging infrastructure. This paper addresses the challenge of optimizing charging station locations in dense urban environments while balancing efficiency with spatial accessibility. We propose an optimization framework that integrates traffic simulation, energy consumption modeling, and a mobility equity measure to evaluate the social reach of each potential charging station. Using New York City as a case study, we demonstrate consistent improvements in accessibility (15-20% reduction in travel time variability). Our results provide a scalable methodology for incorporating equity considerations into EV infrastructure planning, although economic factors and grid integration remain important areas for future development.
GPU-Accelerated Barrier-Rate Guided MPPI Control for Tractor-Trailer Systems SC 2025
Articulated vehicles such as tractor-trailers, yard trucks, and similar platforms must often reverse and maneuver in cluttered spaces where pedestrians are present. We present how Barrier-Rate guided Model Predictive Path Integral (BR-MPPI) control can solve navigation in such challenging environments. BR-MPPI embeds Control Barrier Function (CBF) constraints directly into the path-integral update. By steering the importance-sampling distribution toward collision-free, dynamically feasible trajectories, BR-MPPI enhances the exploration strength of MPPI and improves robustness of resulting trajectories. The method is evaluated in the high-fidelity CarMaker simulator on a 12 [m] tractor-trailer tasked with reverse and forward parking in a parking lot. BR-MPPI computes control inputs in above 100 [Hz] on a single GPU (for scenarios with eight obstacles) and maintains better parking clearance than a standard MPPI baseline and an MPPI with collision cost baseline.
comment: Accepted to IEEE ITSC 2025
A Framework for Inherently Safer AGI through Language-Mediated Active Inference
This paper proposes a novel framework for developing safe Artificial General Intelligence (AGI) by combining Active Inference principles with Large Language Models (LLMs). We argue that traditional approaches to AI safety, focused on post-hoc interpretability and reward engineering, have fundamental limitations. We present an architecture where safety guarantees are integrated into the system's core design through transparent belief representations and hierarchical value alignment. Our framework leverages natural language as a medium for representing and manipulating beliefs, enabling direct human oversight while maintaining computational tractability. The architecture implements a multi-agent system where agents self-organize according to Active Inference principles, with preferences and safety constraints flowing through hierarchical Markov blankets. We outline specific mechanisms for ensuring safety, including: (1) explicit separation of beliefs and preferences in natural language, (2) bounded rationality through resource-aware free energy minimization, and (3) compositional safety through modular agent structures. The paper concludes with a research agenda centered on the Abstraction and Reasoning Corpus (ARC) benchmark, proposing experiments to validate our framework's safety properties. Our approach offers a path toward AGI development that is inherently safer, rather than retrofitted with safety measures.
Semantic Reasoning Meets Numerical Precision: An LLM-Powered Multi-Agent System for Power Grid Control
The increasing penetration of Distributed Energy Resources (DERs), widespread adoption of Electric Vehicles (EVs), and the growing frequency of extreme weather events have significantly increased the complexity of power grid planning, operation, and management. Traditional rule-based systems and numerical optimization approaches often struggle with the scale, dynamics, and adaptability required by modern power networks. This paper introduces Grid-Agent, an autonomous, AI-driven framework that combines Large Language Models (LLMs) with multi-agent reinforcement learning to detect and remediate grid violations in real time. Grid-Agent integrates semantic reasoning with numerical precision through a modular agent architecture: a planning agent generates coordinated action sequences using numerical power flow solvers, while a validation agent evaluates system stability and action effectiveness via sandboxed execution with safety rollbacks. To ensure scalability, Grid-Agent incorporates an adaptive multiscale network representation that dynamically selects optimal encoding schemes based on network size and complexity. The framework enables coordinated violation resolution through optimizing switch configurations, battery deployment, and load curtailment strategies. Experimental results in standard IEEE and CIGRE test systems (IEEE 69-bus, CIGRE MV, and IEEE 30-bus) demonstrate superior violation mitigation performance. Additionally, the framework's built-in data collection and learning capabilities enable continuous learning and adaptation to diverse network topologies. The autonomous nature of the framework makes it particularly suitable for modern smart grid applications requiring rapid response to dynamic operating conditions.
Robust pid sliding mode control for dc servo motor speed control
This research proposes a Sliding Mode PID (SMC-PID) controller to improve the speed control performance of DC servo motors, which are widely used in industrial applications such as robotics and CNC. The objective of the proposed controller is to enhance the speed control performance of DC servo motors on the CE110 Servo Trainer. The proposed method integrates a traditional PID controller with a sliding mode control mechanism to effectively handle system uncertainties and disturbances. Experimental results show that the SMC-PID method provides significant improvements in accuracy and stability compared to traditional PID controllers, with metrics such as reduced overshoot, shorter settling time, and increased adaptability to system uncertainties. This research highlights the effectiveness of the SMC-PID controller, enhancing the performance of DC servo motor speed control.
Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
comment: 6 pages, 6 figures. Accepted for presentation at the 61st Allerton Conference on Communication, Control, and Computing
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Systolic Array-based Accelerator for Structured State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 2000x improvement in performance on LRA datasets compared to a GPU and 250x gains in performance and 45x improvement in energy efficiency, over traditional SA-based accelerators (TPU).
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
comment: 8 pages, 3 figures
Skew-Induced Insertion Loss Deviation (SILD) and FOM_SILD: Metrics for Quantifying P/N Skew Effects in High-Speed Channels
The rise of AI workloads and growing data center demands have driven the need for ultra-high-speed interconnects exceeding 200 Gb/s. As unit intervals (UI) shrink, even a few picoseconds of P/N skew can degrade serializer-deserializer (SerDes) performance. Traditional methods for quantifying skew fall short in capturing its impact. We introduce two new metrics: 1) Skew-Induced Insertion Loss Deviation (SILD) and 2) its complementary Figure of Merit (FOM_SILD), analytically developed to assess P/N skew effects. Measured S-parameters confirm FOM_SILD reciprocity, while simulations of 224G PAM4 SerDes show strong correlation with bit error rate (BER) trends. This approach offers a robust framework for analyzing skew in next-generation ultra-high-speed interconnects.
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
comment: 12 pages, corrected author list of references 27 and 38, removed duplicate reference of reference 6
Data-driven control of a magnetohydrodynamic flow
We demonstrate the feedback control of a weakly conducting magnetohydrodynamic (MHD) flow via Lorentz forces generated by externally applied electric and magnetic fields. Specifically, we steer the flow of an electrolyte toward prescribed velocity or vorticity patterns using arrays of electrodes and electromagnets positioned around and beneath a fluid reservoir, with feedback provided by planar particle image velocimetry (PIV). Control is implemented using a model predictive control (MPC) framework, in which control signals are computed by minimizing a cost function over the predicted evolution of the flow. The predictor is constructed entirely from data using Koopman operator theory, which enables a linear representation of the underlying nonlinear fluid dynamics. This linearity allows the MPC problem to be solved by alternating between two small and efficiently solvable convex quadratic programs (QPs): one for the electrodes and one for the electromagnets. The resulting controller runs in a closed loop on a standard laptop, enabling real-time control of the flow. We demonstrate the functionality of the approach through experiments in which the flow is shaped to match a range of reference velocity fields and a time-varying vorticity field.
comment: 21 pages, 7 figures; name changed, added references, polished language, expanded appendix
A Runtime-Adaptive Transformer Neural Network Accelerator on FPGAs
Transformer neural networks (TNN) excel in natural language processing (NLP), machine translation, and computer vision (CV) without relying on recurrent or convolutional layers. However, they have high computational and memory demands, particularly on resource-constrained devices like FPGAs. Moreover, transformer models vary in processing time across applications, requiring custom models with specific parameters. Designing custom accelerators for each model is complex and time-intensive. Some custom accelerators exist with no runtime adaptability, and they often rely on sparse matrices to reduce latency. However, hardware designs become more challenging due to the need for application-specific sparsity patterns. This paper introduces ADAPTOR, a runtime-adaptive accelerator for dense matrix computations in transformer encoders and decoders on FPGAs. ADAPTOR enhances the utilization of processing elements and on-chip memory, enhancing parallelism and reducing latency. It incorporates efficient matrix tiling to distribute resources across FPGA platforms and is fully quantized for computational efficiency and portability. Evaluations on Xilinx Alveo U55C data center cards and embedded platforms like VC707 and ZCU102 show that our design is 1.2$\times$ and 2.87$\times$ more power efficient than the NVIDIA K80 GPU and the i7-8700K CPU respectively. Additionally, it achieves a speedup of 1.7 to 2.25$\times$ compared to some state-of-the-art FPGA-based accelerators.
comment: arXiv admin note: text overlap with arXiv:2409.14023
Robust Regret Optimal Control
This paper presents a synthesis method for robust, regret optimal control. The plant is modeled in discrete-time by an uncertain linear time-invariant (LTI) system. An optimal non-causal controller is constructed using the nominal plant model and given full knowledge of the disturbance. Robust regret is defined relative to the performance of this optimal non-causal control. It is shown that a controller achieves robust regret if and only if it satisfies a robust $H_\infty$ performance condition. DK-iteration can be used to synthesize a controller that satisfies this condition and hence achieve a given level of robust regret. The approach is demonstrated three examples: (i) a simple single-input, single-output classical design, (ii) a longitudinal control for a simplified model for a Boeing 747 model, and (iii) an active suspension for a quarter car model. All examples compare the robust regret optimal against regret optimal controllers designed without uncertainty.
Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation
Large Language Models (LLMs) have significant impact on the healthcare scenarios but remain prohibitively large for deployment in real-time, resource-constrained environments such as edge devices. In this work, we introduce a novel medical assistant system, optimized through our general-purpose compression framework, which tailors Large Language Models (LLMs) for deployment in specialized domains. By measuring neuron saliency on domain-specific data, our method can aggressively prune irrelevant neurons, reducing model size while preserving performance. Following pruning, we apply post-training quantization to further reduce the memory footprint, and evaluate the compressed model across medical benchmarks including MedMCQA, MedQA, and PubMedQA. We also deploy the 50\% compressed Gemma and the 67\% compressed LLaMA3 models on Jetson Orin Nano (18.7W peak) and Raspberry Pi 5 (6.3W peak), achieving real-time, energy-efficient inference under hardware constraints.
comment: Accepted for publication in the Proceedings of IEEE BioCAS 2025
On the effects of angular acceleration in orientation estimation using inertial measurement units
In this paper, we analyze the orientation estimation problem using inertial measurement units. Many estimation algorithms suffer degraded performance when accelerations other than gravity affect the accelerometer. We show that linear accelerations resulting from rotational accelerations cannot be treated as external disturbance to be attenuated, rather, they change the dynamic behavior of the filter itself. In particular, this results in the introduction of additional zeros in the linearized transfer functions. These zeros lead to nonminimum phase behavior, which is known to be challenging for control. We validate these findings experimentally. Further, we demonstrate that Mahony and Madgwick filters can attenuate the acceleration at the expense of reduced bandwidth. In addition, we show that validation schemes based on precollected data fail to capture these closed-loop effects accurately.
Data-Driven Distributed Output Synchronization of Heterogeneous Discrete-Time Multi-Agent Systems
In this paper, we assume that an autonomous exosystem generates a reference output, and we consider the problem of designing a distributed data-driven control law for a family of discrete-time heterogeneous LTI agents, connected through a directed graph, in order to synchronize the agents' outputs to the reference one. The agents of the network are split into two categories: leaders, with direct access to the exosystem output, and followers, that only receive information from their neighbors. All agents aim to achieve output synchronization by means of a state feedback that makes use of their own states as well as of an estimate of the exogenous system state, provided by an internal state observer. Such observer has a different structure for leaders and followers. Necessary and sufficient conditions for the existence of a solution are first derived in the model-based set-up and then in a data-driven context. An example illustrates both the implementation procedure and the performance of the proposed approach.
comment: Extended version of the conference paper accepted for presentation at 64th IEEE Conference on Decision and Control. Compared to the previous version, some typos have been corrected, and the proof of Lemma 13 in the appendix has been expanded
Quaternion-Based Sliding Mode Control for Six Degrees of Freedom Flight Control of Quadrotors
Despite extensive research on sliding mode control (SMC) design for quadrotors, the existing approaches suffer from certain limitations. Euler angle-based SMC formulations suffer from poor performance in high-pitch or -roll maneuvers. Quaternion-based SMC approaches have unwinding issues and complex architecture. Coordinate-free methods are slow and only almost globally stable. This paper presents a new six degrees of freedom SMC flight controller to address the above limitations. We use a cascaded architecture with a position controller in the outer loop and a quaternion-based attitude controller in the inner loop. The position controller generates the desired trajectory for the attitude controller using a coordinate-free approach. The quaternion-based attitude controller uses the natural characteristics of the quaternion hypersphere, featuring a simple structure while providing global stability and avoiding unwinding issues. We compare our controller with three other common control methods conducting challenging maneuvers like flip-over and high-speed trajectory tracking in the presence of model uncertainties and disturbances. Our controller consistently outperforms the benchmark approaches with less control effort and actuator saturation, offering highly effective and efficient flight control.
A Time Splitting Based Optimization Method for Nonlinear MHE
Moving Horizon Estimation~(MHE) is essentially an optimization-based approach designed to estimate the states of dynamic systems within a moving time horizon. Traditional MHE solutions become computationally prohibitive due to the \textit{curse of dimensionality} arising from increasing problem complexity and growing length of time horizon. To address this issue, we propose novel computationally efficient algorithms for solving nonlinear MHE problems. Specifically, we first introduce a distributed reformulation utilizing a time-splitting technique. Leveraging this reformulation, we develop the Efficient Gauss-Newton Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) to achieve computational efficiency. Additionally, to accommodate limited computational capabilities inherent in some sub-problem solvers, we propose the Efficient Sensitivity Assisted ALADIN, which enables sub-problems to be solved inexactly without hindering computational efficiency. Furthermore, recognizing scenarios where sub-problem solvers possess no computational power, we propose a Distributed Sequential Quadratic Programming (SQP) that relies solely on first- and second-order information of local objective functions. We demonstrate the performance and advantages of our proposed methods through numerical experiments on differential drive robots case, a practical nonlinear MHE problem. Our results demonstrate that the three proposed algorithms achieve computational efficiency while preserving high accuracy, thereby satisfying the real-time requirements of MHE.
GNN-Enhanced Fault Diagnosis Method for Parallel Cyber-physical Attacks in Power Grids
Parallel cyber-physical attacks (PCPA) simultaneously damage physical transmission lines and block measurement data transmission in power grids, impairing or delaying system protection and recovery. This paper investigates the fault diagnosis problem for a linearized (DC) power flow model under PCPA. The physical attack mechanism includes not only line disconnection but also admittance modification, for example via compromised distributed flexible AC transmission system (D-FACTS) devices. To address this problem, we propose a fault diagnosis framework based on meta-mixed-integer programming (MMIP), integrating graph attention network-based fault localization (GAT-FL). First, we derive measurement reconstruction conditions that allow reconstructing unknown measurements in attacked areas from available measurements and the system topology. Based on these conditions, we formulate the diagnosis task as an MMIP model. The GAT-FL predicts a probability distribution over potential physical attacks, which is then incorporated as objective coefficients in the MMIP. Solving the MMIP yields optimal attack location and magnitude estimates, from which the system states are also reconstructed. Experimental simulations are conducted on IEEE 30/118 bus standard test cases to demonstrate the effectiveness of the proposed fault diagnosis algorithms.
comment: 10 pages, 3 figures, 5 tables, journal
Adaptive Network Security Policies via Belief Aggregation and Rollout
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
Integrative, Scalable Modeling of Hydrological Systems with MBSE and HFGT
Worsening global challenges in the Anthropocene demand complex, adaptive solutions grounded in a systems-level understanding of coupled social and environmental dynamics. However, existing modeling approaches often fall short due to disciplinary silos, limited scalability, and the absence of shared ontological frameworks. Model-Based Systems Engineering (MBSE), when integrated with Hetero-functional Graph Theory (HFGT), offers a powerful methodology for modeling systems of systems while preserving subsystem heterogeneity and enabling cross-disciplinary integration. This paper presents the first application of the MBSE-HFGT methodology to environmental systems, using a series of worked examples involving flow through lake and land segments. These examples demonstrate how the approach enables consistent, scalable, and integrative modeling of complex environmental processes.
Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding
In this paper, we expand the Bayesian persuasion framework to account for unobserved confounding variables in sender-receiver interactions. While traditional models assume that belief updates follow Bayesian principles, real-world scenarios often involve hidden variables that impact the receiver's belief formation and decision-making. We conceptualize this as a sequential decision-making problem, where the sender and receiver interact over multiple rounds. In each round, the sender communicates with the receiver, who also interacts with the environment. Crucially, the receiver's belief update is affected by an unobserved confounding variable. By reformulating this scenario as a Partially Observable Markov Decision Process (POMDP), we capture the sender's incomplete information regarding both the dynamics of the receiver's beliefs and the unobserved confounder. We prove that finding an optimal observation-based policy in this POMDP is equivalent to solving for an optimal signaling strategy in the original persuasion framework. Furthermore, we demonstrate how this reformulation facilitates the application of proximal learning for off-policy evaluation in the persuasion process. This advancement enables the sender to evaluate alternative signaling strategies using only observational data from a behavioral policy, thus eliminating the necessity for costly new experiments.
comment: 8 pages, 4 Figures
Latency-Optimal File Assignment in Geo-Distributed Storage with Preferential Demands
We consider the problem of data storage in a geographically distributed (or geo-distributed) network of servers (or nodes) where inter-node communication incurs certain round-trip delays. Every node serves a set of users who can request any file in the network. If the requested file is not available at the node, it communicates with other nodes to obtain the file, thus causing the user to experience latency in obtaining the file. The files can be placed uncoded, where each node stores exact copies of the files, or in coded fashion, where certain linear combination of files are placed at each node. We aim to obtain an optimal file placement on the nodes with respect to minimizing the worst-case latency at each node, as well as the system-average latency. The prior literature considered the case of equiprobable file demands at the nodes. In this paper, we investigate the generic case of non-uniform file-demand probabilities at each node. The scheme presented here is optimal within the family of uncoded schemes. It is obtained first by modeling the worst-case latency constraint as a vertex coloring problem, and then converting the system-average latency optimization to a problem of balanced-assignment.
comment: arXiv admin note: text overlap with arXiv:2405.06641
Systems and Control (EESS)
Towards Generalizable Safety in Crowd Navigation via Conformal Uncertainty Handling
Mobile robots navigating in crowds trained using reinforcement learning are known to suffer performance degradation when faced with out-of-distribution scenarios. We propose that by properly accounting for the uncertainties of pedestrians, a robot can learn safe navigation policies that are robust to distribution shifts. Our method augments agent observations with prediction uncertainty estimates generated by adaptive conformal inference, and it uses these estimates to guide the agent's behavior through constrained reinforcement learning. The system helps regulate the agent's actions and enables it to adapt to distribution shifts. In the in-distribution setting, our approach achieves a 96.93% success rate, which is over 8.80% higher than the previous state-of-the-art baselines with over 3.72 times fewer collisions and 2.43 times fewer intrusions into ground-truth human future trajectories. In three out-of-distribution scenarios, our method shows much stronger robustness when facing distribution shifts in velocity variations, policy changes, and transitions from individual to group dynamics. We deploy our method on a real robot, and experiments show that the robot makes safe and robust decisions when interacting with both sparse and dense crowds. Our code and videos are available on https://gen-safe-nav.github.io/.
comment: 9th Conference on Robot Learning (CoRL 2025); Project website: https://gen-safe-nav.github.io/. arXiv admin note: text overlap with arXiv:2407.17460
Error Bounds for Radial Network Topology Learning from Quantized Measurements
We probabilistically bound the error of a solution to a radial network topology learning problem where both connectivity and line parameters are estimated. In our model, data errors are introduced by the precision of the sensors, i.e., quantization. This produces a nonlinear measurement model that embeds the operation of the sensor communication network into the learning problem, expanding beyond the additive noise models typically seen in power system estimation algorithms. We show that the error of a learned radial network topology is proportional to the quantization bin width and grows sublinearly in the number of nodes, provided that the number of samples per node is logarithmic in the number of nodes.
comment: 3 pages, 2 figures
Design and Analysis of a Vanadium Dioxide-Based Ultra-Broadband Terahertz Metamaterial Absorber
This paper presents a VO2-based metamaterial absorber optimized for ultra-broadband, polarization-insensitive performance in the terahertz (THz) frequency range. The absorber consists of a patterned VO2 metasurface, a low-loss MF2 dielectric spacer, and a gold ground plane. Exploiting the phase transition of VO2, the design enables dynamic control of electromagnetic absorption. Full-wave simulations show an average absorptance of 98.15% across a 5.38THz bandwidth (5.72-11.11THz) and over 99% absorption sustained across 3.35THz. The absorber maintains stable performance for varying polarization angles and both TE and TM modes under oblique incidence. Impedance analysis confirms strong matching to free space, reducing reflection and eliminating transmission. Parametric analysis investigates the influence of VO2 conductivity, MF2 thickness, and unit cell periodicity on performance. Compared to recent THz metamaterial absorbers, the proposed design achieves broader bandwidth, higher efficiency, and simpler implementation. These characteristics make it suitable for THz sensing, imaging, wireless communication, and adaptive photonic systems, and position it as a promising platform for tunable and reconfigurable THz modules.
comment: 6 pages, 3 figures, 2 tables
Research on integrated intelligent energy management system based on big data analysis and machine learning
The application of big data is one of the significant features of integrated smart energy. Applying it to the file management of integrated smart energy projects is of great significance for improving the efficiency of project management and control. This article first discussed the benefits and challenges of implementing big data analysis in document management and control of integrated smart energy projects. In addition, an implementation framework for big data analysis in integrated smart energy project document management was developed, and a method for optimizing the efficiency of integrated smart energy project document management through machine learning was proposed. Using various types of data and information generated during the project document management process, the efficiency of the entire process project document control through three different machine learning methods was optimized. The result of fitting a penalty linear regression model shows that when there is enough data as a training set, the accuracy of the model achieved can reach over 95\%. By using big data analysis and machine learning to analyze the efficiency of comprehensive smart energy project document management, it is possible to track the entire process of comprehensive smart energy project documents and optimize business processes, thereby strengthening project construction control and improving project construction efficiency.
comment: 6 pages, 4 figures, conference
A 20-Year Retrospective on Power and Thermal Modeling and Management
As processor performance advances, increasing power densities and complex thermal behaviors threaten both energy efficiency and system reliability. This survey covers more than two decades of research on power and thermal modeling and management in modern processors. We start by comparing analytical, regression-based, and neural network-based techniques for power estimation, then review thermal modeling methods, including finite element, finite difference, and data-driven approaches. Next, we categorize dynamic runtime management strategies that balance performance, power consumption, and reliability. Finally, we conclude with a discussion of emerging challenges and promising research directions.
Sub- μ W Battery-Less and Oscillator-Less Wi-Fi Backscattering Transmitter Reusing RF Signal for Harvesting, Communications, and Motion Detection
In this paper, a sub-uW power 802.11b backscattering transmitter is presented to enable reuse of the same incident wave for three purposes: RF harvesting, backscattering communications and position/motion sensing. The removal of the battery and any off-chip motion sensor (e.g., MEMS) enables unprecedented level of miniaturization and ubiquity, unrestricted device lifespan, low fabrication and maintenance cost. The uW power wall for WiFi transmitters is broken for the first time via local oscillator elimination, as achieved by extracting its frequency through second-order intermodulation of a twotone incident wave. The two-tone scheme also enables a cumulative harvesting/transmission/sensing sensitivity down to Pmin -19 dBm. Position/motion sensing is enabled by using the harvested voltage as a proxy for the Received Signal Strength (RSS), allowing to sense the chip location with respect to the tone generator(s) shared across tags in indoor neighborhoods.
Distributionally Robust System Level Synthesis With Output Feedback Affine Control Policy
This paper studies the finite-horizon robust optimal control of linear systems subject to model mismatch and additive stochastic disturbances. Utilizing the system level synthesis (SLS) parameterization, we propose a novel SLS design using output-feedback affine control policy and extend it to a distributionally robust setting to improve system resilience by minimizing the cost function while ensuring constraint satisfaction against the worst-case uncertainty distribution. The scopes of model mismatch and stochastic disturbances are quantified using the 1-norm and a Wasserstein metric-based ambiguity set, respectively. For the closed-loop dynamics, we analyze the distributional shift between the predicted output-input response -- computed using nominal parameters and empirical disturbance samples -- and the actual closed-loop distribution, highlighting its dependence on model mismatch and SLS parameterization. Assuming convex and Lipschitz continuous cost functions and constraints, we derive a tractable reformulation of the distributionally robust SLS (DR-SLS) problem by leveraging tools from robust control and distributionally robust optimization (DRO). Numerical experiments validate the performance and robustness of the proposed approach.
F2PASeg: Feature Fusion for Pituitary Anatomy Segmentation in Endoscopic Surgery
Pituitary tumors often cause deformation or encapsulation of adjacent vital structures. Anatomical structure segmentation can provide surgeons with early warnings of regions that pose surgical risks, thereby enhancing the safety of pituitary surgery. However, pixel-level annotated video stream datasets for pituitary surgeries are extremely rare. To address this challenge, we introduce a new dataset for Pituitary Anatomy Segmentation (PAS). PAS comprises 7,845 time-coherent images extracted from 120 videos. To mitigate class imbalance, we apply data augmentation techniques that simulate the presence of surgical instruments in the training data. One major challenge in pituitary anatomy segmentation is the inconsistency in feature representation due to occlusions, camera motion, and surgical bleeding. By incorporating a Feature Fusion module, F2PASeg is proposed to refine anatomical structure segmentation by leveraging both high-resolution image features and deep semantic embeddings, enhancing robustness against intraoperative variations. Experimental results demonstrate that F2PASeg consistently segments critical anatomical structures in real time, providing a reliable solution for intraoperative pituitary surgery planning. Code: https://github.com/paulili08/F2PASeg.
A Multi-view Landmark Representation Approach with Application to GNSS-Visual-Inertial Odometry
Invariant Extended Kalman Filter (IEKF) has been a significant technique in vision-aided sensor fusion. However, it usually suffers from high computational burden when jointly optimizing camera poses and the landmarks. To improve its efficiency and applicability for multi-sensor fusion, we present a multi-view pose-only estimation approach with its application to GNSS-Visual-Inertial Odometry (GVIO) in this paper. Our main contribution is deriving a visual measurement model which directly associates landmark representation with multiple camera poses and observations. Such a pose-only measurement is proven to be tightly-coupled between landmarks and poses, and maintain a perfect null space that is independent of estimated poses. Finally, we apply the proposed approach to a filter based GVIO with a novel feature management strategy. Both simulation tests and real-world experiments are conducted to demonstrate the superiority of the proposed method in terms of efficiency and accuracy.
Passive nonlinear FIR filters for data-driven control
We propose a new class of passive nonlinear finite impulse response operators. This class is constructed by the action of finite impulse response filters in a lifted space. This allows for efficient control synthesis through constrained optimization. Closed-loop performance is taken into account through least-squares fitting, based on the theory of virtual reference feedback tuning. Passivity is established through efficient linear constraints, based on sampling in the frequency domain. Because of passivity, this class of operators is particularly suited for the control of physical systems, such as electromechanical systems.
comment: 15 pages, 12 figures
Advanced Hybrid Transformer LSTM Technique with Attention and TS Mixer for Drilling Rate of Penetration Prediction
The Rate of Penetration (ROP) is crucial for optimizing drilling operations; however, accurately predicting it is hindered by the complex, dynamic, and high-dimensional nature of drilling data. Traditional empirical, physics-based, and basic machine learning models often fail to capture intricate temporal and contextual relationships, resulting in suboptimal predictions and limited real-time utility. To address this gap, we propose a novel hybrid deep learning architecture integrating Long Short-Term Memory (LSTM) networks, Transformer encoders, Time-Series Mixer (TS-Mixer) blocks, and attention mechanisms to synergistically model temporal dependencies, static feature interactions, global context, and dynamic feature importance. Evaluated on a real-world drilling dataset, our model outperformed benchmarks (standalone LSTM, TS-Mixer, and simpler hybrids) with an R-squared score of 0.9988 and a Mean Absolute Percentage Error of 1.447%, as measured by standard regression metrics (R-squared, MAE, RMSE, MAPE). Model interpretability was ensured using SHAP and LIME, while actual vs. predicted curves and bias checks confirmed accuracy and fairness across scenarios. This advanced hybrid approach enables reliable real-time ROP prediction, paving the way for intelligent, cost-effective drilling optimization systems with significant operational impact.
comment: 37 Pages, 19 Figures, 9 Tables
Overview of Controllability Definitions in Supervisory Control Theory
In the field of supervisory control theory, the literature often proposes different definitions for the same concept, making it difficult to understand how these definitions are related. This is definitely so for the fundamental notion of controllability of a supervisor w.r.t. a plant. This paper lists definitions of controllability found in the literature and studies their relationships in settings of both deterministic and nondeterministic automata. In the general context, where both the supervisor and the plant are allowed to be nondeterministic, the notions of controllability as described by Flordal and Malik, and uncontrollable event admissibility by Kushi and Takai are equivalent. These are also the only notions that imply the traditional notion of (language) controllability. From a practical perspective, one is often more interested in controllability of a supervised plant w.r.t. a plant. In this context, in addition to the previous two controllability notions, state controllability by Zhou et al. implies language controllability.
comment: 18 pages
Preparing for the worst: Long-term and short-term weather extremes in resource adequacy assessment
Security of supply is a common and important concern when integrating renewables in net-zero power systems. Extreme weather affects both demand and supply leading to power system stress; in Europe this stress spreads continentally beyond the meteorological root cause. We use an approach based on shadow prices to identify periods of elevated stress called system-defining events and analyse their impact on the power system. By classifying different types of system-defining events, we identify challenges to power system operation and planning. Crucially, we find the need for sufficient resilience back-up (power) capacities whose financial viability is precarious due to weather variability. Furthermore, we disentangle short- and long-term resilience challenges with distinct metrics and stress tests to incorporate both into future energy modelling assessments. Our methodology and implementation in the open model PyPSA-Eur can be re-applied to other systems and help researchers and policymakers in building more resilient and adequate energy systems.
Probabilistic Alternating Simulations for Policy Synthesis in Uncertain Stochastic Dynamical Systems
A classical approach to formal policy synthesis in stochastic dynamical systems is to construct a finite-state abstraction, often represented as a Markov decision process (MDP). The correctness of these approaches hinges on a behavioural relation between the dynamical system and its abstraction, such as a probabilistic simulation relation. However, probabilistic simulation relations do not suffice when the system dynamics are, next to being stochastic, also subject to nondeterministic (i.e., set-valued) disturbances. In this work, we extend probabilistic simulation relations to systems with both stochastic and nondeterministic disturbances. Our relation, which is inspired by a notion of alternating simulation, generalises existing relations used for verification and policy synthesis used in several works. Intuitively, our relation allows reasoning probabilistically over stochastic uncertainty, while reasoning robustly (i.e., adversarially) over nondeterministic disturbances. We experimentally demonstrate the applicability of our relations for policy synthesis in a 4D-state Dubins vehicle.
comment: Presented at CDC 2025
RCUKF: Data-Driven Modeling Meets Bayesian Estimation
Accurate modeling is crucial in many engineering and scientific applications, yet obtaining a reliable process model for complex systems is often challenging. To address this challenge, we propose a novel framework, reservoir computing with unscented Kalman filtering (RCUKF), which integrates data-driven modeling via reservoir computing (RC) with Bayesian estimation through the unscented Kalman filter (UKF). The RC component learns the nonlinear system dynamics directly from data, serving as a surrogate process model in the UKF prediction step to generate state estimates in high-dimensional or chaotic regimes where nominal mathematical models may fail. Meanwhile, the UKF measurement update integrates real-time sensor data to correct potential drift in the data-driven model. We demonstrate RCUKF effectiveness on well-known benchmark problems and a real-time vehicle trajectory estimation task in a high-fidelity simulation environment.
comment: 6 pages, 6 figures. Accepted at IFAC MECC 2025 (Modeling, Estimation and Control Conference)
Uncovering the Influence Flow Model of Transistor Amplifiers, Its Reconstruction and Application
Multistage transistor amplifiers can be effectively modeled as network of dynamic systems where individual amplifier stages interact through couplings that are dynamic in nature. Using circuit analysis techniques, we show that a large class of transistor amplifiers can be modeled as Linear Dynamic Influence Model (LDIM), where the interactions between different amplifier stages are modeled as linear dynamic equations. LDIM modeling of transistor circuits leads to application of data-driven network reconstruction techniques to characterize stage interactions and identify faults and critical circuit parameters efficiently. Employing graphical modeling techniques and Wiener filtering, we demonstrate that the network structure can be reconstructed solely from voltage time-series measurements sampled at specified points in the circuit. The efficacy of these network reconstruction methods in multistage amplifiers is demonstrated through extensive simulations involving multiple amplifier circuits in Cadence, as well as experimental results on physical hardware. The ability to infer network structure directly from measurement data offers designers and users efficient tools to design, analyze, and debug amplifier circuits. To demonstrate the utility of network reconstruction in multistage amplifier circuits, a fault diagnosis method leveraging these techniques is presented.
Distributed Quantized Average Consensus in Open Multi-Agent Systems with Dynamic Communication Links
In this paper, we focus on the distributed quantized average consensus problem in open multi-agent systems consisting of communication links that change dynamically over time. Open multi-agent systems exhibiting the aforementioned characteristic are referred to as \textit{open dynamic multi-agent systems} in this work. We present a distributed algorithm that enables active nodes in the open dynamic multi-agent system to calculate the quantized average of their initial states. Our algorithm consists of the following advantages: (i) ensures efficient communication by enabling nodes to exchange quantized valued messages, and (ii) exhibits finite time convergence to the desired solution. We establish the correctness of our algorithm and we present necessary and sufficient topological conditions for it to successfully solve the quantized average consensus problem in an open dynamic multi-agent system. Finally, we illustrate the performance of our algorithm with numerical simulations.
Distributed Optimization and Learning for Automated Stepsize Selection with Finite Time Coordination
Distributed optimization and learning algorithms are designed to operate over large scale networks enabling processing of vast amounts of data effectively and efficiently. One of the main challenges for ensuring a smooth learning process in gradient-based methods is the appropriate selection of a learning stepsize. Most current distributed approaches let individual nodes adapt their stepsizes locally. However, this may introduce stepsize heterogeneity in the network, thus disrupting the learning process and potentially leading to divergence. In this paper, we propose a distributed learning algorithm that incorporates a novel mechanism for automating stepsize selection among nodes. Our main idea relies on implementing a finite time coordination algorithm for eliminating stepsize heterogeneity among nodes. We analyze the operation of our algorithm and we establish its convergence to the optimal solution. We conclude our paper with numerical simulations for a linear regression problem, showcasing that eliminating stepsize heterogeneity enhances convergence speed and accuracy against current approaches.
Integrating Vision Foundation Models with Reinforcement Learning for Enhanced Object Interaction
This paper presents a novel approach that integrates vision foundation models with reinforcement learning to enhance object interaction capabilities in simulated environments. By combining the Segment Anything Model (SAM) and YOLOv5 with a Proximal Policy Optimization (PPO) agent operating in the AI2-THOR simulation environment, we enable the agent to perceive and interact with objects more effectively. Our comprehensive experiments, conducted across four diverse indoor kitchen settings, demonstrate significant improvements in object interaction success rates and navigation efficiency compared to a baseline agent without advanced perception. The results show a 68% increase in average cumulative reward, a 52.5% improvement in object interaction success rate, and a 33% increase in navigation efficiency. These findings highlight the potential of integrating foundation models with reinforcement learning for complex robotic tasks, paving the way for more sophisticated and capable autonomous agents.
comment: Published in the Proceedings of the 2025 3rd International Conference on Robotics, Control and Vision Engineering (RCVE'25). 6 pages, 3 figures, 1 table
A United Framework for Planning Electric Vehicle Charging Accessibility
The shift towards electric vehicles (EVs) is crucial for establishing sustainable and low-emission urban transportation systems. However, the success of this transition depends on the strategic placement of the charging infrastructure. This paper addresses the challenge of optimizing charging station locations in dense urban environments while balancing efficiency with spatial accessibility. We propose an optimization framework that integrates traffic simulation, energy consumption modeling, and a mobility equity measure to evaluate the social reach of each potential charging station. Using New York City as a case study, we demonstrate consistent improvements in accessibility (15-20% reduction in travel time variability). Our results provide a scalable methodology for incorporating equity considerations into EV infrastructure planning, although economic factors and grid integration remain important areas for future development.
GPU-Accelerated Barrier-Rate Guided MPPI Control for Tractor-Trailer Systems SC 2025
Articulated vehicles such as tractor-trailers, yard trucks, and similar platforms must often reverse and maneuver in cluttered spaces where pedestrians are present. We present how Barrier-Rate guided Model Predictive Path Integral (BR-MPPI) control can solve navigation in such challenging environments. BR-MPPI embeds Control Barrier Function (CBF) constraints directly into the path-integral update. By steering the importance-sampling distribution toward collision-free, dynamically feasible trajectories, BR-MPPI enhances the exploration strength of MPPI and improves robustness of resulting trajectories. The method is evaluated in the high-fidelity CarMaker simulator on a 12 [m] tractor-trailer tasked with reverse and forward parking in a parking lot. BR-MPPI computes control inputs in above 100 [Hz] on a single GPU (for scenarios with eight obstacles) and maintains better parking clearance than a standard MPPI baseline and an MPPI with collision cost baseline.
comment: Accepted to IEEE ITSC 2025
A Framework for Inherently Safer AGI through Language-Mediated Active Inference
This paper proposes a novel framework for developing safe Artificial General Intelligence (AGI) by combining Active Inference principles with Large Language Models (LLMs). We argue that traditional approaches to AI safety, focused on post-hoc interpretability and reward engineering, have fundamental limitations. We present an architecture where safety guarantees are integrated into the system's core design through transparent belief representations and hierarchical value alignment. Our framework leverages natural language as a medium for representing and manipulating beliefs, enabling direct human oversight while maintaining computational tractability. The architecture implements a multi-agent system where agents self-organize according to Active Inference principles, with preferences and safety constraints flowing through hierarchical Markov blankets. We outline specific mechanisms for ensuring safety, including: (1) explicit separation of beliefs and preferences in natural language, (2) bounded rationality through resource-aware free energy minimization, and (3) compositional safety through modular agent structures. The paper concludes with a research agenda centered on the Abstraction and Reasoning Corpus (ARC) benchmark, proposing experiments to validate our framework's safety properties. Our approach offers a path toward AGI development that is inherently safer, rather than retrofitted with safety measures.
Semantic Reasoning Meets Numerical Precision: An LLM-Powered Multi-Agent System for Power Grid Control
The increasing penetration of Distributed Energy Resources (DERs), widespread adoption of Electric Vehicles (EVs), and the growing frequency of extreme weather events have significantly increased the complexity of power grid planning, operation, and management. Traditional rule-based systems and numerical optimization approaches often struggle with the scale, dynamics, and adaptability required by modern power networks. This paper introduces Grid-Agent, an autonomous, AI-driven framework that combines Large Language Models (LLMs) with multi-agent reinforcement learning to detect and remediate grid violations in real time. Grid-Agent integrates semantic reasoning with numerical precision through a modular agent architecture: a planning agent generates coordinated action sequences using numerical power flow solvers, while a validation agent evaluates system stability and action effectiveness via sandboxed execution with safety rollbacks. To ensure scalability, Grid-Agent incorporates an adaptive multiscale network representation that dynamically selects optimal encoding schemes based on network size and complexity. The framework enables coordinated violation resolution through optimizing switch configurations, battery deployment, and load curtailment strategies. Experimental results in standard IEEE and CIGRE test systems (IEEE 69-bus, CIGRE MV, and IEEE 30-bus) demonstrate superior violation mitigation performance. Additionally, the framework's built-in data collection and learning capabilities enable continuous learning and adaptation to diverse network topologies. The autonomous nature of the framework makes it particularly suitable for modern smart grid applications requiring rapid response to dynamic operating conditions.
Robust pid sliding mode control for dc servo motor speed control
This research proposes a Sliding Mode PID (SMC-PID) controller to improve the speed control performance of DC servo motors, which are widely used in industrial applications such as robotics and CNC. The objective of the proposed controller is to enhance the speed control performance of DC servo motors on the CE110 Servo Trainer. The proposed method integrates a traditional PID controller with a sliding mode control mechanism to effectively handle system uncertainties and disturbances. Experimental results show that the SMC-PID method provides significant improvements in accuracy and stability compared to traditional PID controllers, with metrics such as reduced overshoot, shorter settling time, and increased adaptability to system uncertainties. This research highlights the effectiveness of the SMC-PID controller, enhancing the performance of DC servo motor speed control.
Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
comment: 6 pages, 6 figures. Accepted for presentation at the 61st Allerton Conference on Communication, Control, and Computing
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Systolic Array-based Accelerator for Structured State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 2000x improvement in performance on LRA datasets compared to a GPU and 250x gains in performance and 45x improvement in energy efficiency, over traditional SA-based accelerators (TPU).
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
comment: 8 pages, 3 figures
Skew-Induced Insertion Loss Deviation (SILD) and FOM_SILD: Metrics for Quantifying P/N Skew Effects in High-Speed Channels
The rise of AI workloads and growing data center demands have driven the need for ultra-high-speed interconnects exceeding 200 Gb/s. As unit intervals (UI) shrink, even a few picoseconds of P/N skew can degrade serializer-deserializer (SerDes) performance. Traditional methods for quantifying skew fall short in capturing its impact. We introduce two new metrics: 1) Skew-Induced Insertion Loss Deviation (SILD) and 2) its complementary Figure of Merit (FOM_SILD), analytically developed to assess P/N skew effects. Measured S-parameters confirm FOM_SILD reciprocity, while simulations of 224G PAM4 SerDes show strong correlation with bit error rate (BER) trends. This approach offers a robust framework for analyzing skew in next-generation ultra-high-speed interconnects.
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
comment: 12 pages, corrected author list of references 27 and 38, removed duplicate reference of reference 6
Data-driven control of a magnetohydrodynamic flow
We demonstrate the feedback control of a weakly conducting magnetohydrodynamic (MHD) flow via Lorentz forces generated by externally applied electric and magnetic fields. Specifically, we steer the flow of an electrolyte toward prescribed velocity or vorticity patterns using arrays of electrodes and electromagnets positioned around and beneath a fluid reservoir, with feedback provided by planar particle image velocimetry (PIV). Control is implemented using a model predictive control (MPC) framework, in which control signals are computed by minimizing a cost function over the predicted evolution of the flow. The predictor is constructed entirely from data using Koopman operator theory, which enables a linear representation of the underlying nonlinear fluid dynamics. This linearity allows the MPC problem to be solved by alternating between two small and efficiently solvable convex quadratic programs (QPs): one for the electrodes and one for the electromagnets. The resulting controller runs in a closed loop on a standard laptop, enabling real-time control of the flow. We demonstrate the functionality of the approach through experiments in which the flow is shaped to match a range of reference velocity fields and a time-varying vorticity field.
comment: 21 pages, 7 figures; name changed, added references, polished language, expanded appendix
A Runtime-Adaptive Transformer Neural Network Accelerator on FPGAs
Transformer neural networks (TNN) excel in natural language processing (NLP), machine translation, and computer vision (CV) without relying on recurrent or convolutional layers. However, they have high computational and memory demands, particularly on resource-constrained devices like FPGAs. Moreover, transformer models vary in processing time across applications, requiring custom models with specific parameters. Designing custom accelerators for each model is complex and time-intensive. Some custom accelerators exist with no runtime adaptability, and they often rely on sparse matrices to reduce latency. However, hardware designs become more challenging due to the need for application-specific sparsity patterns. This paper introduces ADAPTOR, a runtime-adaptive accelerator for dense matrix computations in transformer encoders and decoders on FPGAs. ADAPTOR enhances the utilization of processing elements and on-chip memory, enhancing parallelism and reducing latency. It incorporates efficient matrix tiling to distribute resources across FPGA platforms and is fully quantized for computational efficiency and portability. Evaluations on Xilinx Alveo U55C data center cards and embedded platforms like VC707 and ZCU102 show that our design is 1.2$\times$ and 2.87$\times$ more power efficient than the NVIDIA K80 GPU and the i7-8700K CPU respectively. Additionally, it achieves a speedup of 1.7 to 2.25$\times$ compared to some state-of-the-art FPGA-based accelerators.
comment: arXiv admin note: text overlap with arXiv:2409.14023
Robust Regret Optimal Control
This paper presents a synthesis method for robust, regret optimal control. The plant is modeled in discrete-time by an uncertain linear time-invariant (LTI) system. An optimal non-causal controller is constructed using the nominal plant model and given full knowledge of the disturbance. Robust regret is defined relative to the performance of this optimal non-causal control. It is shown that a controller achieves robust regret if and only if it satisfies a robust $H_\infty$ performance condition. DK-iteration can be used to synthesize a controller that satisfies this condition and hence achieve a given level of robust regret. The approach is demonstrated three examples: (i) a simple single-input, single-output classical design, (ii) a longitudinal control for a simplified model for a Boeing 747 model, and (iii) an active suspension for a quarter car model. All examples compare the robust regret optimal against regret optimal controllers designed without uncertainty.
Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation
Large Language Models (LLMs) have significant impact on the healthcare scenarios but remain prohibitively large for deployment in real-time, resource-constrained environments such as edge devices. In this work, we introduce a novel medical assistant system, optimized through our general-purpose compression framework, which tailors Large Language Models (LLMs) for deployment in specialized domains. By measuring neuron saliency on domain-specific data, our method can aggressively prune irrelevant neurons, reducing model size while preserving performance. Following pruning, we apply post-training quantization to further reduce the memory footprint, and evaluate the compressed model across medical benchmarks including MedMCQA, MedQA, and PubMedQA. We also deploy the 50\% compressed Gemma and the 67\% compressed LLaMA3 models on Jetson Orin Nano (18.7W peak) and Raspberry Pi 5 (6.3W peak), achieving real-time, energy-efficient inference under hardware constraints.
comment: Accepted for publication in the Proceedings of IEEE BioCAS 2025
On the effects of angular acceleration in orientation estimation using inertial measurement units
In this paper, we analyze the orientation estimation problem using inertial measurement units. Many estimation algorithms suffer degraded performance when accelerations other than gravity affect the accelerometer. We show that linear accelerations resulting from rotational accelerations cannot be treated as external disturbance to be attenuated, rather, they change the dynamic behavior of the filter itself. In particular, this results in the introduction of additional zeros in the linearized transfer functions. These zeros lead to nonminimum phase behavior, which is known to be challenging for control. We validate these findings experimentally. Further, we demonstrate that Mahony and Madgwick filters can attenuate the acceleration at the expense of reduced bandwidth. In addition, we show that validation schemes based on precollected data fail to capture these closed-loop effects accurately.
Data-Driven Distributed Output Synchronization of Heterogeneous Discrete-Time Multi-Agent Systems
In this paper, we assume that an autonomous exosystem generates a reference output, and we consider the problem of designing a distributed data-driven control law for a family of discrete-time heterogeneous LTI agents, connected through a directed graph, in order to synchronize the agents' outputs to the reference one. The agents of the network are split into two categories: leaders, with direct access to the exosystem output, and followers, that only receive information from their neighbors. All agents aim to achieve output synchronization by means of a state feedback that makes use of their own states as well as of an estimate of the exogenous system state, provided by an internal state observer. Such observer has a different structure for leaders and followers. Necessary and sufficient conditions for the existence of a solution are first derived in the model-based set-up and then in a data-driven context. An example illustrates both the implementation procedure and the performance of the proposed approach.
comment: Extended version of the conference paper accepted for presentation at 64th IEEE Conference on Decision and Control. Compared to the previous version, some typos have been corrected, and the proof of Lemma 13 in the appendix has been expanded
Quaternion-Based Sliding Mode Control for Six Degrees of Freedom Flight Control of Quadrotors
Despite extensive research on sliding mode control (SMC) design for quadrotors, the existing approaches suffer from certain limitations. Euler angle-based SMC formulations suffer from poor performance in high-pitch or -roll maneuvers. Quaternion-based SMC approaches have unwinding issues and complex architecture. Coordinate-free methods are slow and only almost globally stable. This paper presents a new six degrees of freedom SMC flight controller to address the above limitations. We use a cascaded architecture with a position controller in the outer loop and a quaternion-based attitude controller in the inner loop. The position controller generates the desired trajectory for the attitude controller using a coordinate-free approach. The quaternion-based attitude controller uses the natural characteristics of the quaternion hypersphere, featuring a simple structure while providing global stability and avoiding unwinding issues. We compare our controller with three other common control methods conducting challenging maneuvers like flip-over and high-speed trajectory tracking in the presence of model uncertainties and disturbances. Our controller consistently outperforms the benchmark approaches with less control effort and actuator saturation, offering highly effective and efficient flight control.
A Time Splitting Based Optimization Method for Nonlinear MHE
Moving Horizon Estimation~(MHE) is essentially an optimization-based approach designed to estimate the states of dynamic systems within a moving time horizon. Traditional MHE solutions become computationally prohibitive due to the \textit{curse of dimensionality} arising from increasing problem complexity and growing length of time horizon. To address this issue, we propose novel computationally efficient algorithms for solving nonlinear MHE problems. Specifically, we first introduce a distributed reformulation utilizing a time-splitting technique. Leveraging this reformulation, we develop the Efficient Gauss-Newton Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) to achieve computational efficiency. Additionally, to accommodate limited computational capabilities inherent in some sub-problem solvers, we propose the Efficient Sensitivity Assisted ALADIN, which enables sub-problems to be solved inexactly without hindering computational efficiency. Furthermore, recognizing scenarios where sub-problem solvers possess no computational power, we propose a Distributed Sequential Quadratic Programming (SQP) that relies solely on first- and second-order information of local objective functions. We demonstrate the performance and advantages of our proposed methods through numerical experiments on differential drive robots case, a practical nonlinear MHE problem. Our results demonstrate that the three proposed algorithms achieve computational efficiency while preserving high accuracy, thereby satisfying the real-time requirements of MHE.
GNN-Enhanced Fault Diagnosis Method for Parallel Cyber-physical Attacks in Power Grids
Parallel cyber-physical attacks (PCPA) simultaneously damage physical transmission lines and block measurement data transmission in power grids, impairing or delaying system protection and recovery. This paper investigates the fault diagnosis problem for a linearized (DC) power flow model under PCPA. The physical attack mechanism includes not only line disconnection but also admittance modification, for example via compromised distributed flexible AC transmission system (D-FACTS) devices. To address this problem, we propose a fault diagnosis framework based on meta-mixed-integer programming (MMIP), integrating graph attention network-based fault localization (GAT-FL). First, we derive measurement reconstruction conditions that allow reconstructing unknown measurements in attacked areas from available measurements and the system topology. Based on these conditions, we formulate the diagnosis task as an MMIP model. The GAT-FL predicts a probability distribution over potential physical attacks, which is then incorporated as objective coefficients in the MMIP. Solving the MMIP yields optimal attack location and magnitude estimates, from which the system states are also reconstructed. Experimental simulations are conducted on IEEE 30/118 bus standard test cases to demonstrate the effectiveness of the proposed fault diagnosis algorithms.
comment: 10 pages, 3 figures, 5 tables, journal
Adaptive Network Security Policies via Belief Aggregation and Rollout
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
Integrative, Scalable Modeling of Hydrological Systems with MBSE and HFGT
Worsening global challenges in the Anthropocene demand complex, adaptive solutions grounded in a systems-level understanding of coupled social and environmental dynamics. However, existing modeling approaches often fall short due to disciplinary silos, limited scalability, and the absence of shared ontological frameworks. Model-Based Systems Engineering (MBSE), when integrated with Hetero-functional Graph Theory (HFGT), offers a powerful methodology for modeling systems of systems while preserving subsystem heterogeneity and enabling cross-disciplinary integration. This paper presents the first application of the MBSE-HFGT methodology to environmental systems, using a series of worked examples involving flow through lake and land segments. These examples demonstrate how the approach enables consistent, scalable, and integrative modeling of complex environmental processes.
Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding
In this paper, we expand the Bayesian persuasion framework to account for unobserved confounding variables in sender-receiver interactions. While traditional models assume that belief updates follow Bayesian principles, real-world scenarios often involve hidden variables that impact the receiver's belief formation and decision-making. We conceptualize this as a sequential decision-making problem, where the sender and receiver interact over multiple rounds. In each round, the sender communicates with the receiver, who also interacts with the environment. Crucially, the receiver's belief update is affected by an unobserved confounding variable. By reformulating this scenario as a Partially Observable Markov Decision Process (POMDP), we capture the sender's incomplete information regarding both the dynamics of the receiver's beliefs and the unobserved confounder. We prove that finding an optimal observation-based policy in this POMDP is equivalent to solving for an optimal signaling strategy in the original persuasion framework. Furthermore, we demonstrate how this reformulation facilitates the application of proximal learning for off-policy evaluation in the persuasion process. This advancement enables the sender to evaluate alternative signaling strategies using only observational data from a behavioral policy, thus eliminating the necessity for costly new experiments.
comment: 8 pages, 4 Figures
Latency-Optimal File Assignment in Geo-Distributed Storage with Preferential Demands
We consider the problem of data storage in a geographically distributed (or geo-distributed) network of servers (or nodes) where inter-node communication incurs certain round-trip delays. Every node serves a set of users who can request any file in the network. If the requested file is not available at the node, it communicates with other nodes to obtain the file, thus causing the user to experience latency in obtaining the file. The files can be placed uncoded, where each node stores exact copies of the files, or in coded fashion, where certain linear combination of files are placed at each node. We aim to obtain an optimal file placement on the nodes with respect to minimizing the worst-case latency at each node, as well as the system-average latency. The prior literature considered the case of equiprobable file demands at the nodes. In this paper, we investigate the generic case of non-uniform file-demand probabilities at each node. The scheme presented here is optimal within the family of uncoded schemes. It is obtained first by modeling the worst-case latency constraint as a vertex coloring problem, and then converting the system-average latency optimization to a problem of balanced-assignment.
comment: arXiv admin note: text overlap with arXiv:2405.06641
Robotics
Achieving Precise and Reliable Locomotion with Differentiable Simulation-Based System Identification IROS 2025
Accurate system identification is crucial for reducing trajectory drift in bipedal locomotion, particularly in reinforcement learning and model-based control. In this paper, we present a novel control framework that integrates system identification into the reinforcement learning training loop using differentiable simulation. Unlike traditional approaches that rely on direct torque measurements, our method estimates system parameters using only trajectory data (positions, velocities) and control inputs. We leverage the differentiable simulator MuJoCo-XLA to optimize system parameters, ensuring that simulated robot behavior closely aligns with real-world motion. This framework enables scalable and flexible parameter optimization. Accurate system identification is crucial for reducing trajectory drift in bipedal locomotion, particularly in reinforcement learning and model-based control. In this paper, we present a novel control framework that integrates system identification into the reinforcement learning training loop using differentiable simulation. Unlike traditional approaches that rely on direct torque measurements, our method estimates system parameters using only trajectory data (positions, velocities) and control inputs. We leverage the differentiable simulator MuJoCo-XLA to optimize system parameters, ensuring that simulated robot behavior closely aligns with real-world motion. This framework enables scalable and flexible parameter optimization. It supports fundamental physical properties such as mass and inertia. Additionally, it handles complex system nonlinear behaviors, including advanced friction models, through neural network approximations. Experimental results show that our framework significantly improves trajectory following.
comment: 6 pages, Accepted for IROS 2025
From MAS to MARS: Coordination Failures and Reasoning Trade-offs in Hierarchical Multi-Agent Robotic Systems within a Healthcare Scenario
Multi-agent robotic systems (MARS) build upon multi-agent systems by integrating physical and task-related constraints, increasing the complexity of action execution and agent coordination. However, despite the availability of advanced multi-agent frameworks, their real-world deployment on robots remains limited, hindering the advancement of MARS research in practice. To bridge this gap, we conducted two studies to investigate performance trade-offs of hierarchical multi-agent frameworks in a simulated real-world multi-robot healthcare scenario. In Study 1, using CrewAI, we iteratively refine the system's knowledge base, to systematically identify and categorize coordination failures (e.g., tool access violations, lack of timely handling of failure reports) not resolvable by providing contextual knowledge alone. In Study 2, using AutoGen, we evaluate a redesigned bidirectional communication structure and further measure the trade-offs between reasoning and non-reasoning models operating within the same robotic team setting. Drawing from our empirical findings, we emphasize the tension between autonomy and stability and the importance of edge-case testing to improve system reliability and safety for future real-world deployment. Supplementary materials, including codes, task agent setup, trace outputs, and annotated examples of coordination failures and reasoning behaviors, are available at: https://byc-sophie.github.io/mas-to-mars/.
Open Scene Graphs for Open-World Object-Goal Navigation
How can we build general-purpose robot systems for open-world semantic navigation, e.g., searching a novel environment for a target object specified in natural language? To tackle this challenge, we introduce OSG Navigator, a modular system composed of foundation models, for open-world Object-Goal Navigation (ObjectNav). Foundation models provide enormous semantic knowledge about the world, but struggle to organise and maintain spatial information effectively at scale. Key to OSG Navigator is the Open Scene Graph representation, which acts as spatial memory for OSG Navigator. It organises spatial information hierarchically using OSG schemas, which are templates, each describing the common structure of a class of environments. OSG schemas can be automatically generated from simple semantic labels of a given environment, e.g., "home" or "supermarket". They enable OSG Navigator to adapt zero-shot to new environment types. We conducted experiments using both Fetch and Spot robots in simulation and in the real world, showing that OSG Navigator achieves state-of-the-art performance on ObjectNav benchmarks and generalises zero-shot over diverse goals, environments, and robot embodiments.
comment: In IJRR Special Issue: Foundation Models and Neuro-symbolic AI for Robotics. Journal extension to arXiv:2407.02473
RoboTron-Sim: Improving Real-World Driving via Simulated Hard-Case ICCV 2025
Collecting real-world data for rare high-risk scenarios, long-tailed driving events, and complex interactions remains challenging, leading to poor performance of existing autonomous driving systems in these critical situations. In this paper, we propose RoboTron-Sim that improves real-world driving in critical situations by utilizing simulated hard cases. First, we develop a simulated dataset called Hard-case Augmented Synthetic Scenarios (HASS), which covers 13 high-risk edge-case categories, as well as balanced environmental conditions such as day/night and sunny/rainy. Second, we introduce Scenario-aware Prompt Engineering (SPE) and an Image-to-Ego Encoder (I2E Encoder) to enable multimodal large language models to effectively learn real-world challenging driving skills from HASS, via adapting to environmental deviations and hardware differences between real-world and simulated scenarios. Extensive experiments on nuScenes show that RoboTron-Sim improves driving performance in challenging scenarios by around 50%, achieving state-of-the-art results in real-world open-loop planning. Qualitative results further demonstrate the effectiveness of RoboTron-Sim in better managing rare high-risk driving scenarios. Project page: https://stars79689.github.io/RoboTron-Sim/
comment: ICCV 2025
OmniDepth: Bridging Monocular and Stereo Reasoning with Latent Alignment ICCV 2025
Monocular and stereo depth estimation offer complementary strengths: monocular methods capture rich contextual priors but lack geometric precision, while stereo approaches leverage epipolar geometry yet struggle with ambiguities such as reflective or textureless surfaces. Despite post-hoc synergies, these paradigms remain largely disjoint in practice. We introduce OmniDepth, a unified framework that bridges both through iterative bidirectional alignment of their latent representations. At its core, a novel cross-attentive alignment mechanism dynamically synchronizes monocular contextual cues with stereo hypothesis representations during stereo reasoning. This mutual alignment resolves stereo ambiguities (e.g., specular surfaces) by injecting monocular structure priors while refining monocular depth with stereo geometry within a single network. Extensive experiments demonstrate state-of-the-art results: \textbf{OmniDepth reduces zero-shot generalization error by $\!>\!40\%$ on Middlebury and ETH3D}, while addressing longstanding failures on transparent and reflective surfaces. By harmonizing multi-view geometry with monocular context, OmniDepth enables robust 3D perception that transcends modality-specific limitations. Codes available at https://github.com/aeolusguan/OmniDepth.
comment: ICCV 2025 Highlight
$NavA^3$: Understanding Any Instruction, Navigating Anywhere, Finding Anything
Embodied navigation is a fundamental capability of embodied intelligence, enabling robots to move and interact within physical environments. However, existing navigation tasks primarily focus on predefined object navigation or instruction following, which significantly differs from human needs in real-world scenarios involving complex, open-ended scenes. To bridge this gap, we introduce a challenging long-horizon navigation task that requires understanding high-level human instructions and performing spatial-aware object navigation in real-world environments. Existing embodied navigation methods struggle with such tasks due to their limitations in comprehending high-level human instructions and localizing objects with an open vocabulary. In this paper, we propose $NavA^3$, a hierarchical framework divided into two stages: global and local policies. In the global policy, we leverage the reasoning capabilities of Reasoning-VLM to parse high-level human instructions and integrate them with global 3D scene views. This allows us to reason and navigate to regions most likely to contain the goal object. In the local policy, we have collected a dataset of 1.0 million samples of spatial-aware object affordances to train the NaviAfford model (PointingVLM), which provides robust open-vocabulary object localization and spatial awareness for precise goal identification and navigation in complex environments. Extensive experiments demonstrate that $NavA^3$ achieves SOTA results in navigation performance and can successfully complete longhorizon navigation tasks across different robot embodiments in real-world settings, paving the way for universal embodied navigation. The dataset and code will be made available. Project website: https://NavigationA3.github.io/.
Behaviorally Adaptive Multi-Robot Hazard Localization in Failure-Prone, Communication-Denied Environments
We address the challenge of multi-robot autonomous hazard mapping in high-risk, failure-prone, communication-denied environments such as post-disaster zones, underground mines, caves, and planetary surfaces. In these missions, robots must explore and map hazards while minimizing the risk of failure due to environmental threats or hardware limitations. We introduce a behavior-adaptive, information-theoretic planning framework for multi-robot teams grounded in the concept of Behavioral Entropy (BE), that generalizes Shannon entropy (SE) to capture diverse human-like uncertainty evaluations. Building on this formulation, we propose the Behavior-Adaptive Path Planning (BAPP) framework, which modulates information gathering strategies via a tunable risk-sensitivity parameter, and present two planning algorithms: BAPP-TID for intelligent triggering of high-fidelity robots, and BAPP-SIG for safe deployment under high risk. We provide theoretical insights on the informativeness of the proposed BAPP framework and validate its effectiveness through both single-robot and multi-robot simulations. Our results show that the BAPP stack consistently outperforms Shannon-based and random strategies: BAPP-TID accelerates entropy reduction, while BAPP-SIG improves robot survivability with minimal loss in information gain. In multi-agent deployments, BAPP scales effectively through spatial partitioning, mobile base relocation, and role-aware heterogeneity. These findings underscore the value of behavior-adaptive planning for robust, risk-sensitive exploration in complex, failure-prone environments.
Reliable and Real-Time Highway Trajectory Planning via Hybrid Learning-Optimization Frameworks
Autonomous highway driving presents a high collision risk due to fast-changing environments and limited reaction time, necessitating reliable and efficient trajectory planning. This paper proposes a hybrid trajectory planning framework that integrates the adaptability of learning-based methods with the formal safety guarantees of optimization-based approaches. The framework features a two-layer architecture: an upper layer employing a graph neural network (GNN) trained on real-world highway data to predict human-like longitudinal velocity profiles, and a lower layer utilizing path optimization formulated as a mixed-integer quadratic programming (MIQP) problem. The primary contribution is the lower-layer path optimization model, which introduces a linear approximation of discretized vehicle geometry to substantially reduce computational complexity, while enforcing strict spatiotemporal non-overlapping constraints to formally guarantee collision avoidance throughout the planning horizon. Experimental results demonstrate that the planner generates highly smooth, collision-free trajectories in complex real-world emergency scenarios, achieving success rates exceeding 97% with average planning times of 54 ms, thereby confirming real-time capability.
Incorporating Stochastic Models of Controller Behavior into Kinodynamic Efficiently Adaptive State Lattices for Mobile Robot Motion Planning in Off-Road Environments
Mobile robot motion planners rely on theoretical models to predict how the robot will move through the world. However, when deployed on a physical robot, these models are subject to errors due to real-world physics and uncertainty in how the lower-level controller follows the planned trajectory. In this work, we address this problem by presenting three methods of incorporating stochastic controller behavior into the recombinant search space of the Kinodynamic Efficiently Adaptive State Lattice (KEASL) planner. To demonstrate this work, we analyze the results of experiments performed on a Clearpath Robotics Warthog Unmanned Ground Vehicle (UGV) in an off-road, unstructured environment using two different perception algorithms, and performed an ablation study using a full spectrum of simulated environment map complexities. Analysis of the data found that incorporating stochastic controller sampling into KEASL leads to more conservative trajectories that decrease predicted collision likelihood when compared to KEASL without sampling. When compared to baseline planning with expanded obstacle footprints, the predicted likelihood of collisions becomes more comparable, but reduces the planning success rate for baseline search.
comment: Accepted to the International Symposium on Experimental Robotics (ISER) 2025
Position-Based Flocking for Robust Alignment
This paper presents a position-based flocking model for interacting agents, balancing cohesion-separation and alignment to achieve stable collective motion. The model modifies a velocity-based approach by approximating velocity differences using initial and current positions, introducing a threshold weight to ensure sustained alignment. Simulations with 50 agents in 2D demonstrate that the position-based model produces stronger alignment and more rigid and compact formations compared to the velocity-based model. The alignment metric and separation distances highlight the efficacy of the proposed model in achieving robust flocking behavior. The model's use of positions ensures robust alignment, with applications in robotics and collective dynamics.
Tactile Comfort: Lowering Heart Rate Through Interactions IROS
Children diagnosed with anxiety disorders are taught a range of strategies to navigate situations of heightened anxiety. Techniques such as deep breathing and repetition of mantras are commonly employed, as they are known to be calming and reduce elevated heart rates. Although these strategies are often effective, their successful application relies on prior training of the children for successful use when faced with challenging situations. This paper investigates a pocket-sized companion robot designed to offer a relaxation technique requiring no prior training, with a focus on immediate impact on the user's heart rate. The robot utilizes a tactile game to divert the user's attention, thereby promoting relaxation. We conducted two studies with children who were not diagnosed with anxiety: a 14-day pilot study with two children (age 8) and a main study with 18 children (ages 7-8). Both studies employed a within-subjects design and focused on measuring heart rate during tactile interaction with the robot and during non-use. Interacting with the robot was found to significantly lower the study participants' heart rate (p$<$0.01) compared to the non-use condition, indicating a consistent calming effect across all participants. These results suggest that tactile companion robots have the potential to enhance the therapeutic value of relaxation techniques.
comment: 6 pages, 4 figures, Proceedings from 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2024
Improving Tactile Gesture Recognition with Optical Flow
Tactile gesture recognition systems play a crucial role in Human-Robot Interaction (HRI) by enabling intuitive communication between humans and robots. The literature mainly addresses this problem by applying machine learning techniques to classify sequences of tactile images encoding the pressure distribution generated when executing the gestures. However, some gestures can be hard to differentiate based on the information provided by tactile images alone. In this paper, we present a simple yet effective way to improve the accuracy of a gesture recognition classifier. Our approach focuses solely on processing the tactile images used as input by the classifier. In particular, we propose to explicitly highlight the dynamics of the contact in the tactile image by computing the dense optical flow. This additional information makes it easier to distinguish between gestures that produce similar tactile images but exhibit different contact dynamics. We validate the proposed approach in a tactile gesture recognition task, showing that a classifier trained on tactile images augmented with optical flow information achieved a 9% improvement in gesture classification accuracy compared to one trained on standard tactile images.
comment: 7 pages, 7 figures, paper accepted by the 2025 34th IEEE International Conference on Robot and Human Interactive Communication (ROMAN)
RiemanLine: Riemannian Manifold Representation of 3D Lines for Factor Graph Optimization
Minimal parametrization of 3D lines plays a critical role in camera localization and structural mapping. Existing representations in robotics and computer vision predominantly handle independent lines, overlooking structural regularities such as sets of parallel lines that are pervasive in man-made environments. This paper introduces \textbf{RiemanLine}, a unified minimal representation for 3D lines formulated on Riemannian manifolds that jointly accommodates both individual lines and parallel-line groups. Our key idea is to decouple each line landmark into global and local components: a shared vanishing direction optimized on the unit sphere $\mathcal{S}^2$, and scaled normal vectors constrained on orthogonal subspaces, enabling compact encoding of structural regularities. For $n$ parallel lines, the proposed representation reduces the parameter space from $4n$ (orthonormal form) to $2n+2$, naturally embedding parallelism without explicit constraints. We further integrate this parameterization into a factor graph framework, allowing global direction alignment and local reprojection optimization within a unified manifold-based bundle adjustment. Extensive experiments on ICL-NUIM, TartanAir, and synthetic benchmarks demonstrate that our method achieves significantly more accurate pose estimation and line reconstruction, while reducing parameter dimensionality and improving convergence stability.
Industrial Robot Motion Planning with GPUs: Integration of cuRobo for Extended DOF Systems ICRA
Efficient motion planning remains a key challenge in industrial robotics, especially for multi-axis systems operating in complex environments. This paper addresses that challenge by integrating GPU-accelerated motion planning through NVIDIA's cuRobo library into Vention's modular automation platform. By leveraging accurate CAD-based digital twins and real-time parallel optimization, our system enables rapid trajectory generation and dynamic collision avoidance for pick-and-place tasks. We demonstrate this capability on robots equipped with additional degrees of freedom, including a 7th-axis gantry, and benchmark performance across various scenarios. The results show significant improvements in planning speed and robustness, highlighting the potential of GPU-based planning pipelines for scalable, adaptable deployment in modern industrial workflows.
comment: 8 pages, 2 figures, 2 tables. Submitted to IEEE International Conference on Robotics and Automation (ICRA) 2025
DRIVE: Dynamic Rule Inference and Verified Evaluation for Constraint-Aware Autonomous Driving
Understanding and adhering to soft constraints is essential for safe and socially compliant autonomous driving. However, such constraints are often implicit, context-dependent, and difficult to specify explicitly. In this work, we present DRIVE, a novel framework for Dynamic Rule Inference and Verified Evaluation that models and evaluates human-like driving constraints from expert demonstrations. DRIVE leverages exponential-family likelihood modeling to estimate the feasibility of state transitions, constructing a probabilistic representation of soft behavioral rules that vary across driving contexts. These learned rule distributions are then embedded into a convex optimization-based planning module, enabling the generation of trajectories that are not only dynamically feasible but also compliant with inferred human preferences. Unlike prior approaches that rely on fixed constraint forms or purely reward-based modeling, DRIVE offers a unified framework that tightly couples rule inference with trajectory-level decision-making. It supports both data-driven constraint generalization and principled feasibility verification. We validate DRIVE on large-scale naturalistic driving datasets, including inD, highD, and RoundD, and benchmark it against representative inverse constraint learning and planning baselines. Experimental results show that DRIVE achieves 0.0% soft constraint violation rates, smoother trajectories, and stronger generalization across diverse driving scenarios. Verified evaluations further demonstrate the efficiency, explanability, and robustness of the framework for real-world deployment.
SCOUT: An in-vivo Methane Sensing System for Real-time Monitoring of Enteric Emissions in Cattle with ex-vivo Validation
Accurate measurement of enteric methane emissions remains a critical bottleneck for advancing livestock sustainability through genetic selection and precision management. Existing ambient sampling approaches suffer from low data retention rates, environmental interference, and limited temporal resolution. We developed SCOUT (Smart Cannula-mounted Optical Unit for Trace-methane), the first robust in-vivo sensing system enabling continuous, high-resolution monitoring of ruminal methane concentrations through an innovative closed-loop gas recirculation design. We conducted comprehensive validation with two cannulated Simmental heifers under contrasting dietary treatments, with cross-platform comparison against established ambient sniffer systems. SCOUT achieved exceptional performance with 82% data retention compared to 17% for conventional sniffer systems, while capturing methane concentrations 100-1000x higher than ambient approaches. Cross-platform validation demonstrated strong scale-dependent correlations, with optimal correlation strength (r = -0.564 $\pm$ 0.007) at biologically relevant 40-minute windows and 100% statistical significance. High-frequency monitoring revealed novel behavior-emission coupling, including rapid concentration changes (14.5 $\pm$ 11.3k ppm) triggered by postural transitions within 15 minutes, insights previously inaccessible through existing technologies. The SCOUT system represents a transformative advancement, enabling accurate, continuous emission phenotyping essential for genomic selection programs and sustainable precision livestock management. This validation framework establishes new benchmarks for agricultural sensor performance while generating unprecedented biological insights into ruminal methane dynamics, contributing essential tools for sustainable livestock production in climate-conscious agricultural systems.
Optimization of sliding control parameters for a 3-dof robot arm using genetic algorithm (GA)
This paper presents a method for optimizing the sliding mode control (SMC) parameter for a robot manipulator applying a genetic algorithm (GA). The objective of the SMC is to achieve precise and consistent tracking of the trajectory of the robot manipulator under uncertain and disturbed conditions. However, the system effectiveness and robustness depend on the choice of the SMC parameters, which is a difficult and crucial task. To solve this problem, a genetic algorithm is used to locate the optimal values of these parameters that gratify the capability criteria. The proposed method is efficient compared with the conventional SMC and Fuzzy-SMC. The simulation results show that the genetic algorithm with SMC can achieve better tracking capability and reduce the chattering effect.
INTENTION: Inferring Tendencies of Humanoid Robot Motion Through Interactive Intuition and Grounded VLM
Traditional control and planning for robotic manipulation heavily rely on precise physical models and predefined action sequences. While effective in structured environments, such approaches often fail in real-world scenarios due to modeling inaccuracies and struggle to generalize to novel tasks. In contrast, humans intuitively interact with their surroundings, demonstrating remarkable adaptability, making efficient decisions through implicit physical understanding. In this work, we propose INTENTION, a novel framework enabling robots with learned interactive intuition and autonomous manipulation in diverse scenarios, by integrating Vision-Language Models (VLMs) based scene reasoning with interaction-driven memory. We introduce Memory Graph to record scenes from previous task interactions which embodies human-like understanding and decision-making about different tasks in real world. Meanwhile, we design an Intuitive Perceptor that extracts physical relations and affordances from visual scenes. Together, these components empower robots to infer appropriate interaction behaviors in new scenes without relying on repetitive instructions. Videos: https://robo-intention.github.io
comment: Project Web: https://robo-intention.github.io
BTPG-max: Achieving Local Maximal Bidirectional Pairs for Bidirectional Temporal Plan Graphs
Multi-Agent Path Finding (MAPF) requires computing collision-free paths for multiple agents in shared environment. Most MAPF planners assume that each agent reaches a specific location at a specific timestep, but this is infeasible to directly follow on real systems where delays often occur. To address collisions caused by agents deviating due to delays, the Temporal Plan Graph (TPG) was proposed, which converts a MAPF time dependent solution into a time independent set of inter-agent dependencies. Recently, a Bidirectional TPG (BTPG) was proposed which relaxed some dependencies into ``bidirectional pairs" and improved efficiency of agents executing their MAPF solution with delays. Our work improves upon this prior work by designing an algorithm, BPTG-max, that finds more bidirectional pairs. Our main theoretical contribution is in designing the BTPG-max algorithm is locally optimal, i.e. which constructs a BTPG where no additional bidirectional pairs can be added. We also show how in practice BTPG-max leads to BTPGs with significantly more bidirectional edges, superior anytime behavior, and improves robustness to delays.
On the causality between affective impact and coordinated human-robot reactions
In an effort to improve how robots function in social contexts, this paper investigates if a robot that actively shares a reaction to an event with a human alters how the human perceives the robot's affective impact. To verify this, we created two different test setups. One to highlight and isolate the reaction element of affective robot expressions, and one to investigate the effects of applying specific timing delays to a robot reacting to a physical encounter with a human. The first test was conducted with two different groups (n=84) of human observers, a test group and a control group both interacting with the robot. The second test was performed with 110 participants using increasingly longer reaction delays for the robot with every ten participants. The results show a statistically significant change (p$<$.05) in perceived affective impact for the robots when they react to an event shared with a human observer rather than reacting at random. The result also shows for shared physical interaction, the near-human reaction times from the robot are most appropriate for the scenario. The paper concludes that a delay time around 200ms may render the biggest impact on human observers for small-sized non-humanoid robots. It further concludes that a slightly shorter reaction time around 100ms is most effective when the goal is to make the human observers feel they made the biggest impact on the robot.
comment: 7 pages, 5 figures, 29th IEEE International Workshop on Robot and Human Communication (ROMAN)
Opti-Acoustic Scene Reconstruction in Highly Turbid Underwater Environments IROS 2022
Scene reconstruction is an essential capability for underwater robots navigating in close proximity to structures. Monocular vision-based reconstruction methods are unreliable in turbid waters and lack depth scale information. Sonars are robust to turbid water and non-uniform lighting conditions, however, they have low resolution and elevation ambiguity. This work proposes a real-time opti-acoustic scene reconstruction method that is specially optimized to work in turbid water. Our strategy avoids having to identify point features in visual data and instead identifies regions of interest in the data. We then match relevant regions in the image to corresponding sonar data. A reconstruction is obtained by leveraging range data from the sonar and elevation data from the camera image. Experimental comparisons against other vision-based and sonar-based approaches at varying turbidity levels, and field tests conducted in marina environments, validate the effectiveness of the proposed approach. We have made our code open-source to facilitate reproducibility and encourage community engagement.
comment: To appear at IROS 2022 in Hangzhou, China
HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and Decision in Embodied Agents ICCV 2025
Recent advances in multimodal large language models (MLLMs) have enabled richer perceptual grounding for code policy generation in embodied agents. However, most existing systems lack effective mechanisms to adaptively monitor policy execution and repair codes during task completion. In this work, we introduce HyCodePolicy, a hybrid language-based control framework that systematically integrates code synthesis, geometric grounding, perceptual monitoring, and iterative repair into a closed-loop programming cycle for embodied agents. Technically, given a natural language instruction, our system first decomposes it into subgoals and generates an initial executable program grounded in object-centric geometric primitives. The program is then executed in simulation, while a vision-language model (VLM) observes selected checkpoints to detect and localize execution failures and infer failure reasons. By fusing structured execution traces capturing program-level events with VLM-based perceptual feedback, HyCodePolicy infers failure causes and repairs programs. This hybrid dual feedback mechanism enables self-correcting program synthesis with minimal human supervision. Our results demonstrate that HyCodePolicy significantly improves the robustness and sample efficiency of robot manipulation policies, offering a scalable strategy for integrating multimodal reasoning into autonomous decision-making pipelines.
comment: Accepted to ICCV 2025 Workshop on Multi-Modal Reasoning for Agentic Intelligence
Compact LED-Based Displacement Sensing for Robot Fingers
In this paper, we introduce a sensor designed for integration in robot fingers, where it can provide information on the displacements induced by external contact. Our sensor uses LEDs to sense the displacement between two plates connected by a transparent elastomer; when a force is applied to the finger, the elastomer displaces and the LED signals change. We show that using LEDs as both light emitters an receivers in this context provides high sensitivity, allowing such an emitter and receiver pairs to detect very small displacements. We characterize the standalone performance of the sensor by testing the ability of a supervised learning model to predict complete force and torque data from its raw signals, and obtain a mean error between 0.05 and 0.07 N across the three directions of force applied to the finger. Our method allows for finger-size packaging with no amplification electronics, low cost manufacturing, easy integration into a complete hand, and high overload shear forces and bending torques, suggesting future applicability to complete manipulation tasks.
RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving ICCV 2025
Large Multimodal Models (LMMs) have demonstrated exceptional comprehension and interpretation capabilities in Autonomous Driving (AD) by incorporating large language models. Despite the advancements, current data-driven AD approaches tend to concentrate on a single dataset and specific tasks, neglecting their overall capabilities and ability to generalize. To bridge these gaps, we propose RoboTron-Drive, a general large multimodal model designed to process diverse data inputs, such as images and multi-view videos, while performing a broad spectrum of AD tasks, including perception, prediction, and planning. Initially, the model undergoes curriculum pre-training to process varied visual signals and perform basic visual comprehension and perception tasks. Subsequently, we augment and standardize various AD datasets to finetune the model, resulting in an all-in-one LMM for autonomous driving. To assess the general capabilities and generalization ability, we conduct evaluations on six public benchmarks and undertake zero-shot transfer on three unseen datasets, where RoboTron-Drive achieves state-of-the-art performance across all tasks. We hope RoboTron-Drive as a promising solution for AD in the real world. Project page with code: https://github.com/zhijian11/RoboTron-Drive.
comment: ICCV 2025
Reachset-Conformant System Identification
Formal verification techniques play a pivotal role in ensuring the safety of complex cyber-physical systems. To transfer model-based verification results to the real world, we require that the measurements of the target system lie in the set of reachable outputs of the corresponding model, a property we refer to as reachset conformance. This paper is on automatically identifying those reachset-conformant models. While state-of-the-art reachset-conformant identification methods focus on linear state-space models, we generalize these methods to nonlinear state-space models and linear and nonlinear input-output models. Furthermore, our identification framework adapts to different levels of prior knowledge on the system dynamics. In particular, we identify the set of model uncertainties for white-box models, the parameters and the set of model uncertainties for gray-box models, and entire reachset-conformant black-box models from data. The robustness and efficacy of our framework are demonstrated in extensive numerical experiments using simulated and real-world data.
comment: This work has been submitted to the IEEE for possible publication
3DTTNet: Multimodal Fusion-Based 3D Traversable Terrain Modeling for Off-Road Environments
Off-road environments remain significant challenges for autonomous ground vehicles, due to the lack of structured roads and the presence of complex obstacles, such as uneven terrain, vegetation, and occlusions. Traditional perception algorithms, primarily designed for structured environments, often fail in unstructured scenarios. In this paper, traversable area recognition is achieved through semantic scene completion. A novel multimodal method, 3DTTNet, is proposed to generate dense traversable terrain estimations by integrating LiDAR point clouds with monocular images from a forward-facing perspective. By integrating multimodal data, environmental feature extraction is strengthened, which is crucial for accurate terrain modeling in complex terrains. Furthermore, RELLIS-OCC, a dataset with 3D traversable annotations, is introduced, incorporating geometric features such as step height, slope, and unevenness. Through a comprehensive analysis of vehicle obsta cle-crossing conditions and the incorporation of vehicle body structure constraints, four traversability cost labels are generated: lethal, medium-cost, low-cost, and free. Experimental results demonstrate that 3DTTNet outperforms the comparison approaches in 3D traversable area recognition, particularly in off-road environments with irregular geometries and partial occlusions. Specifically, 3DTTNet achieves a 42\% improvement in scene completion IoU compared to other models. The proposed framework is scalable and adaptable to various vehicle platforms, allowing for adjustments to occupancy grid parameters and the integration of advanced dynamic models for traversability cost estimation.
comment: 15 pages,13 figures
Real-World Offline Reinforcement Learning from Vision Language Model Feedback IROS 2025
Offline reinforcement learning can enable policy learning from pre-collected, sub-optimal datasets without online interactions. This makes it ideal for real-world robots and safety-critical scenarios, where collecting online data or expert demonstrations is slow, costly, and risky. However, most existing offline RL works assume the dataset is already labeled with the task rewards, a process that often requires significant human effort, especially when ground-truth states are hard to ascertain (e.g., in the real-world). In this paper, we build on prior work, specifically RL-VLM-F, and propose a novel system that automatically generates reward labels for offline datasets using preference feedback from a vision-language model and a text description of the task. Our method then learns a policy using offline RL with the reward-labeled dataset. We demonstrate the system's applicability to a complex real-world robot-assisted dressing task, where we first learn a reward function using a vision-language model on a sub-optimal offline dataset, and then we use the learned reward to employ Implicit Q learning to develop an effective dressing policy. Our method also performs well in simulation tasks involving the manipulation of rigid and deformable objects, and significantly outperform baselines such as behavior cloning and inverse RL. In summary, we propose a new system that enables automatic reward labeling and policy learning from unlabeled, sub-optimal offline datasets.
comment: 7 pages. Accepted at the LangRob Workshop 2024 @ CoRL, 2024. Accepted at 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
VISO-Grasp: Vision-Language Informed Spatial Object-centric 6-DoF Active View Planning and Grasping in Clutter and Invisibility IROS 2025
We propose VISO-Grasp, a novel vision-language-informed system designed to systematically address visibility constraints for grasping in severely occluded environments. By leveraging Foundation Models (FMs) for spatial reasoning and active view planning, our framework constructs and updates an instance-centric representation of spatial relationships, enhancing grasp success under challenging occlusions. Furthermore, this representation facilitates active Next-Best-View (NBV) planning and optimizes sequential grasping strategies when direct grasping is infeasible. Additionally, we introduce a multi-view uncertainty-driven grasp fusion mechanism that refines grasp confidence and directional uncertainty in real-time, ensuring robust and stable grasp execution. Extensive real-world experiments demonstrate that VISO-Grasp achieves a success rate of $87.5\%$ in target-oriented grasping with the fewest grasp attempts outperforming baselines. To the best of our knowledge, VISO-Grasp is the first unified framework integrating FMs into target-aware active view planning and 6-DoF grasping in environments with severe occlusions and entire invisibility constraints. Code is available at: https://github.com/YitianShi/vMF-Contact
comment: Accepted to IROS 2025
\textit{RoboTron-Nav}: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction ICCV 2025
In language-guided visual navigation, agents locate target objects in unseen environments using natural language instructions. For reliable navigation in unfamiliar scenes, agents should possess strong perception, planning, and prediction capabilities. Additionally, when agents revisit previously explored areas during long-term navigation, they may retain irrelevant and redundant historical perceptions, leading to suboptimal results. In this work, we propose \textbf{RoboTron-Nav}, a unified framework that integrates {p}erception, {p}lanning, and {p}rediction capabilities through multitask collaborations on navigation and embodied question answering tasks, thereby enhancing navigation performances. Furthermore, RoboTron-Nav employs an adaptive 3D-aware history sampling strategy to effectively and efficiently utilize historical observations. By leveraging large language model, RoboTron-Nav comprehends diverse commands and complex visual scenes, resulting in appropriate navigation actions. RoboTron-Nav achieves an 81.1\% success rate in object goal navigation on the $\mathrm{CHORES}$-$\mathbb{S}$ benchmark, setting a new state-of-the-art performance.
comment: ICCV 2025
Accelerating Focal Search in Multi-Agent Path Finding with Tighter Lower Bounds
Multi-Agent Path Finding (MAPF) involves finding collision-free paths for multiple agents while minimizing a cost function--an NP-hard problem. Bounded suboptimal methods like Enhanced Conflict-Based Search (ECBS) and Explicit Estimation CBS (EECBS) balance solution quality with computational efficiency using focal search mechanisms. While effective, traditional focal search faces a limitation: the lower bound (LB) value determining which nodes enter the FOCAL list often increases slowly in early search stages, resulting in a constrained search space that delays finding valid solutions. In this paper, we propose a novel bounded suboptimal algorithm, double-ECBS (DECBS), to address this issue by first determining the maximum LB value and then employing a best-first search guided by this LB to find a collision-free path. Experimental results demonstrate that DECBS outperforms ECBS in most test cases and is compatible with existing optimization techniques. DECBS can reduce nearly 30% high-level CT nodes and 50% low-level focal search nodes. When agent density is moderate to high, DECBS achieves a 23.5% average runtime improvement over ECBS with identical suboptimality bounds and optimizations.
comment: 7 pages
RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks
Multi-Agent Path Finding (MAPF), which focuses on finding collision-free paths for multiple robots, is crucial for applications ranging from aerial swarms to warehouse automation. Solving MAPF is NP-hard so learning-based approaches for MAPF have gained attention, particularly those leveraging deep neural networks. Nonetheless, despite the community's continued efforts, all learning-based MAPF planners still rely on decentralized planning due to variability in the number of agents and map sizes. We have developed the first centralized learning-based policy for MAPF problem called RAILGUN. RAILGUN is not an agent-based policy but a map-based policy. By leveraging a CNN-based architecture, RAILGUN can generalize across different maps and handle any number of agents. We collect trajectories from rule-based methods to train our model in a supervised way. In experiments, RAILGUN outperforms most baseline methods and demonstrates great zero-shot generalization capabilities on various tasks, maps and agent numbers that were not seen in the training dataset.
comment: 7 pages
IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
Flawed planning from VLM-driven embodied agents poses significant safety hazards, hindering their deployment in real-world household tasks. However, existing static, non-interactive evaluation paradigms fail to adequately assess risks within these interactive environments, since they cannot simulate dynamic risks that emerge from an agent's actions and rely on unreliable post-hoc evaluations that ignore unsafe intermediate steps. To bridge this critical gap, we propose evaluating an agent's interactive safety: its ability to perceive emergent risks and execute mitigation steps in the correct procedural order. We thus present IS-Bench, the first multi-modal benchmark designed for interactive safety, featuring 161 challenging scenarios with 388 unique safety risks instantiated in a high-fidelity simulator. Crucially, it facilitates a novel process-oriented evaluation that verifies whether risk mitigation actions are performed before/after specific risk-prone steps. Extensive experiments on leading VLMs, including the GPT-4o and Gemini-2.5 series, reveal that current agents lack interactive safety awareness, and that while safety-aware Chain-of-Thought can improve performance, it often compromises task completion. By highlighting these critical limitations, IS-Bench provides a foundation for developing safer and more reliable embodied AI systems. Code and data are released under [this https URL](https://github.com/AI45Lab/IS-Bench).
LTLCodeGen: Code Generation of Syntactically Correct Temporal Logic for Robot Task Planning
This paper focuses on planning robot navigation tasks from natural language specifications. We develop a modular approach, where a large language model (LLM) translates the natural language instructions into a linear temporal logic (LTL) formula with propositions defined by object classes in a semantic occupancy map. The LTL formula and the semantic occupancy map are provided to a motion planning algorithm to generate a collision-free robot path that satisfies the natural language instructions. Our main contribution is LTLCodeGen, a method to translate natural language to syntactically correct LTL using code generation. We demonstrate the complete task planning method in real-world experiments involving human speech to provide navigation instructions to a mobile robot. We also thoroughly evaluate our approach in simulated and real-world experiments in comparison to end-to-end LLM task planning and state-of-the-art LLM-to-LTL translation methods.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (iROS) 2025
RoboBrain 2.0 Technical Report
We introduce RoboBrain 2.0, our latest generation of embodied vision-language foundation models, designed to unify perception, reasoning, and planning for complex embodied tasks in physical environments. It comes in two variants: a lightweight 7B model and a full-scale 32B model, featuring a heterogeneous architecture with a vision encoder and a language model. Despite its compact size, RoboBrain 2.0 achieves strong performance across a wide spectrum of embodied reasoning tasks. On both spatial and temporal benchmarks, the 32B variant achieves leading results, surpassing prior open-source and proprietary models. In particular, it supports key real-world embodied AI capabilities, including spatial understanding (e.g., affordance prediction, spatial referring, trajectory forecasting) and temporal decision-making (e.g., closed-loop interaction, multi-agent long-horizon planning, and scene graph updating). This report details the model architecture, data construction, multi-stage training strategies, infrastructure and practical applications. We hope RoboBrain 2.0 advances embodied AI research and serves as a practical step toward building generalist embodied agents. The code, checkpoint and benchmark are available at https://superrobobrain.github.io.
Subframework-based Bearing Rigidity Maintenance Control in Multirobot Networks
This work presents a novel approach for \textit{bearing rigidity} analysis and control in multi-robot networks with sensing constraints and dynamic topology. By decomposing the system's framework into \textit{subframeworks}, we express bearing rigidity -- a global property -- as a set of \textit{local} properties, with rigidity eigenvalues serving as natural \textit{local rigidity measures}. We propose a decentralized gradient-based controller to execute mission-specific commands using only bearing measurements. The controller preserves bearing rigidity by keeping the rigidity eigenvalues above a threshold, using only information exchanged within subframeworks. Simulations evaluate the scheme's effectiveness, underscoring its scalability and practicality.
comment: 6 pages
Stability analysis through folds: An end-loaded elastic with a lever arm
Many physical systems can be modelled as parameter-dependent variational problems. In numerous cases, multiple equilibria co-exist, requiring the evaluation of their stability, and the monitoring of transitions between them. Generally, the stability characteristics of the equilibria change near folds in the parameter space. The direction of stability changes is embedded in a specific projection of the solutions, known as distinguished bifurcation diagrams. In this article, we identify such projections for variational problems characterized by fixed-free ends -- a class of problems frequently encountered in mechanics. Using these diagrams, we study an Elastica subject to an end load applied through a rigid lever arm. Several instances of snap-back instability are reported, along with their dependence on system parameters through numerical examples. These findings have potential applications in the design of soft robot arms and other actuator designs.
comment: 22 pages, 12 figures
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Systems and Control (CS)
Layers of a City: Network-Based Insights into San Diego's Transportation Ecosystem
Analyzing the structure and function of urban transportation networks is critical for enhancing mobility, equity, and resilience. This paper leverages network science to conduct a multi-modal analysis of San Diego's transportation system. We construct a multi-layer graph using data from OpenStreetMap (OSM) and the San Diego Metropolitan Transit System (MTS), representing driving, walking, and public transit layers. By integrating thousands of Points of Interest (POIs), we analyze network accessibility, structure, and resilience through centrality measures, community detection, and a proposed metric for walkability. Our analysis reveals a system defined by a stark core-periphery divide. We find that while the urban core is well-integrated, 30.3% of POIs are isolated from public transit within a walkable distance, indicating significant equity gaps in suburban and rural access. Centrality analysis highlights the driving network's over-reliance on critical freeways as bottlenecks, suggesting low network resilience, while confirming that San Diego is not a broadly walkable city. Furthermore, community detection demonstrates that transportation mode dictates the scale of mobility, producing compact, local clusters for walking and broad, regional clusters for driving. Collectively, this work provides a comprehensive framework for diagnosing urban mobility systems, offering quantitative insights that can inform targeted interventions to improve transportation equity and infrastructure resilience in San Diego.
Case Studies of Generative Machine Learning Models for Dynamical Systems
Systems like aircraft and spacecraft are expensive to operate in the real world. The design, validation, and testing for such systems therefore relies on a combination of mathematical modeling, abundant numerical simulations, and a relatively small set of real-world experiments. Due to modeling errors, simplifications, and uncertainties, the data synthesized by simulation models often does not match data from the system's real-world operation. We consider the broad research question of whether this model mismatch can be significantly reduced by generative artificial intelligence models (GAIMs). Unlike text- or image-processing, where generative models have attained recent successes, GAIM development for aerospace engineering applications must not only train with scarce operational data, but their outputs must also satisfy governing equations based on natural laws, e.g., conservation laws. The scope of this paper primarily focuses on two case studies of optimally controlled systems that are commonly understood and employed in aircraft guidance, namely: minimum-time navigation in a wind field and minimum-exposure navigation in a threat field. We report GAIMs that are trained with a relatively small set, of the order of a few hundred, of examples and with underlying governing equations. By focusing on optimally controlled systems, we formulate training loss functions based on invariance of the Hamiltonian function along system trajectories. We investigate three GAIM architectures, namely: the generative adversarial network (GAN) and two variants of the variational autoencoder (VAE). We provide architectural details and thorough performance analyses of these models. The main finding is that our new models, especially the VAE-based models, are able to synthesize data that satisfy the governing equations and are statistically similar to the training data despite small volumes of training data.
Reliable and Real-Time Highway Trajectory Planning via Hybrid Learning-Optimization Frameworks
Autonomous highway driving presents a high collision risk due to fast-changing environments and limited reaction time, necessitating reliable and efficient trajectory planning. This paper proposes a hybrid trajectory planning framework that integrates the adaptability of learning-based methods with the formal safety guarantees of optimization-based approaches. The framework features a two-layer architecture: an upper layer employing a graph neural network (GNN) trained on real-world highway data to predict human-like longitudinal velocity profiles, and a lower layer utilizing path optimization formulated as a mixed-integer quadratic programming (MIQP) problem. The primary contribution is the lower-layer path optimization model, which introduces a linear approximation of discretized vehicle geometry to substantially reduce computational complexity, while enforcing strict spatiotemporal non-overlapping constraints to formally guarantee collision avoidance throughout the planning horizon. Experimental results demonstrate that the planner generates highly smooth, collision-free trajectories in complex real-world emergency scenarios, achieving success rates exceeding 97% with average planning times of 54 ms, thereby confirming real-time capability.
Error Accumulation using Linearized Models for Aggregating Flexibility in Distribution Systems
This paper investigates flexibility aggregation approaches based on linear models. We begin by examining the theoretical foundations of linear AC power flow, two variants of so-called DC power flow, and the LinDistFlow model, along with their underlying assumptions. The discussion covers key system details, including network topology, voltage constraints, and line losses. Simulations are conducted on the KIT Campus Nord network with real demand and solar data. Results show that, in the absence of negative losses, line losses are generally underestimated by linear models. Furthermore, line losses errors tend to accumulate both at the point of common coupling (PCC) and over extended time horizons.
Design of Adaptive Hybrid Downlink NOMA-TDMA for Visible Light Communications Networks
This paper proposes an adaptive hybrid non-orthogonal multiple access (NOMA)-time division multiple access (TDMA) scheme for multi-user visible light communication (VLC) networks, aiming to enhance users' sum-rate performance while maintaining low complexity. In the proposed scheme, users are divided into groups where each group is served in a different time slot using TDMA. Within each group, up to two users can be served simultaneously using NOMA. A central challenge lies in determining which users should be paired together for NOMA, as the effectiveness of successive interference cancellation (SIC) employed by NOMA depends on the difference between users' channel gains. To address this, for a pair of users, we determine the range of their channel gain ratio within which the pair benefits more from NOMA or TDMA. Identifying the lower and upper bounds of this range is formulated as two optimization problems which are solved efficiently using the Successive Convex Approximation (SCA) method. Simulation results demonstrate that the proposed scheme outperforms the conventional hybrid NOMA-TDMA method under different numbers of users and transmit LED powers.
Challenges in Applying Variational Quantum Algorithms to Dynamic Satellite Network Routing
Applying near-term variational quantum algorithms to the problem of dynamic satellite network routing represents a promising direction for quantum computing. In this work, we provide a critical evaluation of two major approaches: static quantum optimizers such as the Variational Quantum Eigensolver (VQE) and the Quantum Approximate Optimization Algorithm (QAOA) for offline route computation, and Quantum Reinforcement Learning (QRL) methods for online decision-making. Using ideal, noise-free simulations, we find that these algorithms face significant challenges. Specifically, static optimizers are unable to solve even a classically easy 4-node shortest path problem due to the complexity of the optimization landscape. Likewise, a basic QRL agent based on policy gradient methods fails to learn a useful routing strategy in a dynamic 8-node environment and performs no better than random actions. These negative findings highlight key obstacles that must be addressed before quantum algorithms can offer real advantages in communication networks. We discuss the underlying causes of these limitations, including barren plateaus and learning instability, and suggest future research directions to overcome them.
comment: 17 pages and 3 figures
Optimizing Microgrid Composition for Sustainable Data Centers
As computing energy demand continues to grow and electrical grid infrastructure struggles to keep pace, an increasing number of data centers are being planned with colocated microgrids that integrate on-site renewable generation and energy storage. However, while existing research has examined the tradeoffs between operational and embodied carbon emissions in the context of renewable energy certificates, there is a lack of tools to assess how the sizing and composition of microgrid components affects long-term sustainability and power reliability. In this paper, we present a novel optimization framework that extends the computing and energy system co-simulator Vessim with detailed renewable energy generation models from the National Renewable Energy Laboratory's (NREL) System Advisor Model (SAM). Our framework simulates the interaction between computing workloads, on-site renewable production, and energy storage, capturing both operational and embodied emissions. We use a multi-horizon black-box optimization to explore efficient microgrid compositions and enable operators to make more informed decisions when planning energy systems for data centers.
A virtual sensor fusion approach for state of charge estimation of lithium-ion cells
This paper addresses the estimation of the State Of Charge (SOC) of lithium-ion cells via the combination of two widely used paradigms: Kalman Filters (KFs) equipped with Equivalent Circuit Models (ECMs) and machine-learning approaches. In particular, a recent Virtual Sensor (VS) synthesis technique is considered, which operates as follows: (i) learn an Affine Parameter-Varying (APV) model of the cell directly from data, (ii) derive a bank of linear observers from the APV model, (iii) train a machine-learning technique from features extracted from the observers together with input and output data to predict the SOC. The SOC predictions returned by the VS are supplied to an Extended KF (EKF) as output measurements along with the cell terminal voltage, combining the two paradigms. A data-driven calibration strategy for the noise covariance matrices of the EKF is proposed. Experimental results show that the designed approach is beneficial w.r.t. SOC estimation accuracy and smoothness.
Information Bulletin Strategy in Impatient Queuing SC
In Sixth Generation (6G) networks, decentralized control in multi-tenant systems is a suggested enabler for autonomous network operations. However, autonomy requires independent rationale decisions be taken by tenants. This rationality can only be underpinned by timely and continuous access to status information. Despite its importance, the questions of what information should be shared, how much should be communicated, and how frequently updates should be dispatched remain open research challenges. This manuscript proposes an information bulletin strategy defined around two models of the system descriptor states to address these fundamental questions. The strategy is that queues periodically broadcast these information models to tenants at different time intervals, who may respond by reneging from the queue or jockeying to a more favorable one. The expectation is that over time, the queues adapt their processing rates based on what they learn from the tenant behavior. The objective is to minimize overall delay and the impatience. We formulate for this impatience as an optimization problem, whose analytical solution is intractable. We perform numerical experiments to evaluate the performance of the learned queue policy and to assess how closely it approaches optimal conditions.
comment: Submitted to IEEE CSCN 2025
Agentic-AI based Mathematical Framework for Commercialization of Energy Resilience in Electrical Distribution System Planning and Operation
The increasing vulnerability of electrical distribution systems to extreme weather events and cyber threats necessitates the development of economically viable frameworks for resilience enhancement. While existing approaches focus primarily on technical resilience metrics and enhancement strategies, there remains a significant gap in establishing market-driven mechanisms that can effectively commercialize resilience features while optimizing their deployment through intelligent decision-making. Moreover, traditional optimization approaches for distribution network reconfiguration often fail to dynamically adapt to both normal and emergency conditions. This paper introduces a novel framework integrating dual-agent Proximal Policy Optimization (PPO) with market-based mechanisms, achieving an average resilience score of 0.85 0.08 over 10 test episodes. The proposed architecture leverages a dual-agent PPO scheme, where a strategic agent selects optimal DER-driven switching configurations, while a tactical agent fine-tunes individual switch states and grid preferences under budget and weather constraints. These agents interact within a custom-built dynamic simulation environment that models stochastic calamity events, budget limits, and resilience-cost trade-offs. A comprehensive reward function is designed that balances resilience enhancement objectives with market profitability (with up to 200x reward incentives, resulting in 85% of actions during calamity steps selecting configurations with 4 DERs), incorporating factors such as load recovery speed, system robustness, and customer satisfaction. Over 10 test episodes, the framework achieved a benefit-cost ratio of 0.12 0.01, demonstrating sustainable market incentives for resilience investment. This framework creates sustainable market incentives
Optimization of sliding control parameters for a 3-dof robot arm using genetic algorithm (GA)
This paper presents a method for optimizing the sliding mode control (SMC) parameter for a robot manipulator applying a genetic algorithm (GA). The objective of the SMC is to achieve precise and consistent tracking of the trajectory of the robot manipulator under uncertain and disturbed conditions. However, the system effectiveness and robustness depend on the choice of the SMC parameters, which is a difficult and crucial task. To solve this problem, a genetic algorithm is used to locate the optimal values of these parameters that gratify the capability criteria. The proposed method is efficient compared with the conventional SMC and Fuzzy-SMC. The simulation results show that the genetic algorithm with SMC can achieve better tracking capability and reduce the chattering effect.
Sequence Aware SAC Control for Engine Fuel Consumption Optimization in Electrified Powertrain
As hybrid electric vehicles (HEVs) gain traction in heavy-duty trucks, adaptive and efficient energy management is critical for reducing fuel consumption while maintaining battery charge for long operation times. We present a new reinforcement learning (RL) framework based on the Soft Actor-Critic (SAC) algorithm to optimize engine control in series HEVs. We reformulate the control task as a sequential decision-making problem and enhance SAC by incorporating Gated Recurrent Units (GRUs) and Decision Transformers (DTs) into both actor and critic networks to capture temporal dependencies and improve planning over time. To evaluate robustness and generalization, we train the models under diverse initial battery states, drive cycle durations, power demands, and input sequence lengths. Experiments show that the SAC agent with a DT-based actor and GRU-based critic was within 1.8% of Dynamic Programming (DP) in fuel savings on the Highway Fuel Economy Test (HFET) cycle, while the SAC agent with GRUs in both actor and critic networks, and FFN actor-critic agent were within 3.16% and 3.43%, respectively. On unseen drive cycles (US06 and Heavy Heavy-Duty Diesel Truck (HHDDT) cruise segment), generalized sequence-aware agents consistently outperformed feedforward network (FFN)-based agents, highlighting their adaptability and robustness in real-world settings.
Linear Program-Based Stability Conditions for Nonlinear Autonomous Systems
This paper introduces a novel approach to evaluating the asymptotic stability of equilibrium points in both continuous-time (CT) and discrete-time (DT) nonlinear autonomous systems. By utilizing indirect Lyapunov methods and linearizing system dynamics through Jacobian matrices, the methodology replaces traditional semi-definite programming (SDP) techniques with computationally efficient linear programming (LP) conditions. This substitution substantially lowers the computational burden, including time and memory usage, particularly for high-dimensional systems. The stability criteria are developed using matrix transformations and leveraging the structural characteristics of the system, improving scalability. Several examples demonstrated the computational efficiency of the proposed approach compared to the existing SDP-based criteria, particularly for high-dimensional systems.
comment: 6 pages. Submitted to NOLCOS'2025
Optimality Principles and Neural Ordinary Differential Equations-based Process Modeling for Distributed Control
Most recent advances in machine learning and analytics for process control pose the question of how to naturally integrate new data-driven methods with classical process models and control. We propose a process modeling framework enabling integration of data-driven algorithms through consistent topological properties and conservation of extensive quantities. Interconnections among process network units are represented through connectivity matrices and network graphs. We derive the system's natural objective function equivalent to the non-equilibrium entropy production in a steady state system as a driving force for the process dynamics. We illustrate how distributed control and optimization can be implemented into process network structures and how control laws and algorithms alter the system's natural equilibrium towards engineered objectives. The basic requirement is that the flow conditions can be expressed in terms of conic sector (passivity) conditions. Our formalism allows integration of fundamental conservation properties from topology with learned dynamic relations from data through sparse deep neural networks. We demonstrate in a practical example of a simple inventory control system how to integrate the basic topology of a process with a neural network ordinary differential equation model. The system specific constitutive equations are left undescribed and learned by the neural ordinary differential equation algorithm using the adjoint method in combination with an adaptive ODE solver from synthetic time-series data. The resulting neural network forms a state space model for use in e.g. a model predictive control algorithm.
comment: 27 pages, 7 figures
Streaming Generated Gaussian Process Experts for Online Learning and Control
Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a streaming kernel-induced progressively generated expert framework of Gaussian processes (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.
Live Demonstration: Neuromorphic Radar for Gesture Recognition ICASSP 2025
We present a neuromorphic radar framework for real-time, low-power hand gesture recognition (HGR) using an event-driven architecture inspired by biological sensing. Our system comprises a 24 GHz Doppler radar front-end and a custom neuromorphic sampler that converts intermediate-frequency (IF) signals into sparse spike-based representations via asynchronous sigma-delta encoding. These events are directly processed by a lightweight neural network deployed on a Cortex-M0 microcontroller, enabling low-latency inference without requiring spectrogram reconstruction. Unlike conventional radar HGR pipelines that continuously sample and process data, our architecture activates only when meaningful motion is detected, significantly reducing memory, power, and computation overhead. Evaluated on a dataset of five gestures collected from seven users, our system achieves > 85% real-time accuracy. To the best of our knowledge, this is the first work that employs bio-inspired asynchronous sigma-delta encoding and an event-driven processing framework for radar-based HGR.
comment: Neuromorphic Radar, Hand Gesture Recognition, Event-Driven, Sigma-Delta Encoding, Sparse Representation. Presented in ICASSP 2025 at Hyderabad, India
Improving Sequential Market Coordination via Value-oriented Renewable Energy Forecasting
Large penetration of renewable energy sources (RESs) brings huge uncertainty into the electricity markets. The current deterministic clearing approach in the day-ahead (DA) market, where RESs participate based on expected production, has been criticized for causing a lack of coordination between the DA and real-time (RT) markets, leading to high overall operating costs. Previous works indicate that improving day-ahead RES entering quantities can significantly mitigate the drawbacks of deterministic clearing. In this work, we propose using a trained forecasting model, referred to as value-oriented forecasting, to determine RES Improved Entering Quantities (RIEQ) more efficiently during the operational phase. Unlike traditional models that minimize statistical forecasting errors, our approach trains model parameters to minimize the expected overall operating costs across both DA and RT markets. We derive the exact form of the loss function used for training, which becomes piecewise linear when market clearing is modeled by linear programs. Additionally, we provide the analytical gradient of the loss function with respect to the forecast, enabling an efficient training strategy. Numerical studies demonstrate that our forecasts significantly reduce overall operating costs for deterministic market clearing compared to conventional forecasts based on expected RES production.
comment: Submitted to IEEE Transactions on Energy Markets, Policy, and Regulation
ALADIN-$β$: A Distributed Optimization Algorithm for Solving MPCC Problems
Mathematical Programs with Complementarity Constraints (MPCC) are critical in various real-world applications but notoriously challenging due to non-smoothness and degeneracy from complementarity constraints. The $\ell_1$-Exact Penalty-Barrier enhanced \texttt{IPOPT} improves performance and robustness by introducing additional inequality constraints and decision variables. However, this comes at the cost of increased computational complexity due to the higher dimensionality and additional constraints introduced in the centralized formulation. To mitigate this, we propose a distributed structure-splitting reformulation that decomposes these inequality constraints and auxiliary variables into independent sub-problems. Furthermore, we introduce Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN)-$\beta$, a novel approach that integrates the $\ell_1$-Exact Penalty-Barrier method with ALADIN to efficiently solve the distributed reformulation. Numerical experiments demonstrate that even without a globalization strategy, the proposed distributed approach achieves fast convergence while maintaining high precision.
Reachset-Conformant System Identification
Formal verification techniques play a pivotal role in ensuring the safety of complex cyber-physical systems. To transfer model-based verification results to the real world, we require that the measurements of the target system lie in the set of reachable outputs of the corresponding model, a property we refer to as reachset conformance. This paper is on automatically identifying those reachset-conformant models. While state-of-the-art reachset-conformant identification methods focus on linear state-space models, we generalize these methods to nonlinear state-space models and linear and nonlinear input-output models. Furthermore, our identification framework adapts to different levels of prior knowledge on the system dynamics. In particular, we identify the set of model uncertainties for white-box models, the parameters and the set of model uncertainties for gray-box models, and entire reachset-conformant black-box models from data. The robustness and efficacy of our framework are demonstrated in extensive numerical experiments using simulated and real-world data.
comment: This work has been submitted to the IEEE for possible publication
Quantum-Enhanced Power Flow and Optimal Power Flow based on Combinatorial Reformulation
This study introduces the Adiabatic Quantum Power Flow (AQPF) and Adiabatic Quantum Optimal Power Flow (AQOPF) algorithms to solve power flow (PF) and optimal power flow (OPF) problems, respectively. These algorithms utilize a novel combinatorial optimization reformulation of classical PF and OPF problems, and hence, enable their implementation on Ising machines, e.g., quantum and quantum-inspired hardware. The experiments are conducted on standard test cases ranging from 4-bus to 1354-bus systems, using D-Wave's Advantage system (QA), its hybrid quantum-classical solver (HA), as well as the third-generation Digital Annealer (DAv3) and Quantum-Inspired Integrated Optimization software (QIIO) developed by Fujitsu. The annealers are systematically evaluated based on: (i) full and partitioned formulations, (ii) ability to handle ill-conditioned cases, and (iii) scalability. The results are benchmarked against the Newton-Raphson numerical method (NR) and suggest that AQPF and AQOPF can serve as effective solvers or complementary tools to classical methods to address unsolved challenges in large-scale modern power systems.
comment: 10 pages, 1 pseudo code, 2 figures, 4 tables
Dynamic Input Mapping Inversion to Eliminate Algebraic Loops in Hydraulic Actuator Control
The application of nonlinear control schemes to electro-hydraulic actuators often requires several alterations in the design of the controllers during their implementation. This is to overcome challenges that frequently arise in such control algorithms owing to model nonlinearities. Moreover, advanced control solutions for this type of systems often introduce input algebraic loops that pose significant design and tuning difficulties. Conventional methods to avoid such loops introduce chatter, which considerably degrade tracking performance and has oil degradation and wear as side effects. This study presents a nonlinear control architecture for hydraulic actuators that comprises low-complexity modules that facilitate robust high performance in tracking and avoids the drawbacks of chatter. The salient feature is a dynamic input-mapping inversion module that avoids algebraic loops in the control input and is followed by dedicated position control. The stability of the closed-loop system is analyzed using arguments from Lyapunov theory for cascaded non-autonomous nonlinear systems. The effectiveness of the proposed solution is evaluated on a high-fidelity simulator of a wind turbine pitch system, and validated on a full-scale laboratory setup that includes a hydraulic pitch system and blade bearing. Appropriate quantitative metrics are used to evaluate the closed-loop system performance in comparison to a state-of-the-art nonlinear design.
A Value Based Parallel Update MCTS Method for Multi-Agent Cooperative Decision Making of Connected and Automated Vehicles
To solve the problem of lateral and logitudinal joint decision-making of multi-vehicle cooperative driving for connected and automated vehicles (CAVs), this paper proposes a Monte Carlo tree search (MCTS) method with parallel update for multi-agent Markov game with limited horizon and time discounted setting. By analyzing the parallel actions in the multi-vehicle joint action space in the partial-steady-state traffic flow, the parallel update method can quickly exclude potential dangerous actions, thereby increasing the search depth without sacrificing the search breadth. The proposed method is tested in a large number of randomly generated traffic flow. The experiment results show that the algorithm has good robustness and better performance than the SOTA reinforcement learning algorithms and heuristic methods. The vehicle driving strategy using the proposed algorithm shows rationality beyond human drivers, and has advantages in traffic efficiency and safety in the coordinating zone.
comment: arXiv admin note: text overlap with arXiv:2408.04295 by other authors
Privacy-Preserving Fusion for Multi-Sensor Systems Under Multiple Packet Dropouts
Wireless sensor networks (WSNs) are critical components in modern cyber-physical systems, enabling efficient data collection and fusion through spatially distributed sensors. However, the inherent risks of eavesdropping and packet dropouts in such networks pose significant challenges to secure state estimation. In this paper, we address the privacy-preserving fusion estimation (PPFE) problem for multi-sensor systems under multiple packet dropouts and eavesdropping attacks. To mitigate these issues, we propose a distributed encoding-based privacy-preserving mechanism (PPM) within a control-theoretic framework, ensuring data privacy during transmission while maintaining the performance of legitimate state estimation. A centralized fusion filter is developed, accounting for the coupling effects of packet dropouts and the encoding-based PPM. Boundedness conditions for the legitimate user's estimation error covariance are derived via a modified algebraic Riccati equation. Additionally, by demonstrating the divergence of the eavesdropper's mean estimation error, the proposed PPFE algorithm's data confidentiality is rigorously analyzed. Simulation results for an Internet-based three-tank system validate the effectiveness of the proposed approach, highlighting its potential to enhance privacy without compromising estimation accuracy.
Distributional Soft Actor-Critic with Three Refinements
Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal policies. To address this issue, we previously proposed the Distributional Soft Actor-Critic (DSAC or DSACv1), an off-policy RL algorithm that enhances value estimation accuracy by learning a continuous Gaussian value distribution. Despite its effectiveness, DSACv1 faces challenges such as training instability and sensitivity to reward scaling, caused by high variance in critic gradients due to return randomness. In this paper, we introduce three key refinements to DSACv1 to overcome these limitations and further improve Q-value estimation accuracy: expected value substitution, twin value distribution learning, and variance-based critic gradient adjustment. The enhanced algorithm, termed DSAC with Three refinements (DSAC-T or DSACv2), is systematically evaluated across a diverse set of benchmark tasks. Without the need for task-specific hyperparameter tuning, DSAC-T consistently matches or outperforms leading model-free RL algorithms, including SAC, TD3, DDPG, TRPO, and PPO, in all tested environments. Additionally, DSAC-T ensures a stable learning process and maintains robust performance across varying reward scales. Its effectiveness is further demonstrated through real-world application in controlling a wheeled robot, highlighting its potential for deployment in practical robotic tasks.
comment: Title updated in this version. The previous version was titled "DSAC-T: Distributional Soft Actor-Critic With Three Refinements". Added a footnote with the BibTeX entry for the published journal version. No other major changes
CityLight: A Neighborhood-inclusive Universal Model for Coordinated City-scale Traffic Signal Control CIKM 2025
City-scale traffic signal control (TSC) involves thousands of heterogeneous intersections with varying topologies, making cooperative decision-making across intersections particularly challenging. Given the prohibitive computational cost of learning individual policies for each intersection, some researchers explore learning a universal policy to control each intersection in a decentralized manner, where the key challenge is to construct a universal representation method for heterogeneous intersections. However, existing methods are limited to universally representing information of heterogeneous ego intersections, neglecting the essential representation of influence from their heterogeneous neighbors. Universally incorporating neighborhood information is nontrivial due to the intrinsic complexity of traffic flow interactions, as well as the challenge of modeling collective influences from neighbor intersections. To address these challenges, we propose CityLight, which learns a universal policy based on representations obtained with two major modules: a Neighbor Influence Encoder to explicitly model neighbor's influence with specified traffic flow relation and connectivity to the ego intersection; a Neighbor Influence Aggregator to attentively aggregate the influence of neighbors based on their mutual competitive relations. Extensive experiments on five city-scale datasets, ranging from 97 to 13,952 intersections, confirm the efficacy of CityLight, with an average throughput improvement of 11.68% and a lift of 22.59% for generalization.
comment: Accepted by CIKM 2025
Advantages of Feedback in Distributed Data-Gathering for Accurate and Power-Efficient State-Estimation
In distributed target-tracking sensor networks, efficient data gathering methods are necessary to save communication resources and assure information accuracy. This paper proposes a Feedback (FB) distributed data-gathering method which lets the central unit feed information back to the mobile sensors; each sensor then uses it to cancel redundant transmissions and reduce communication congestion. We rigorously compare its performance, in terms of mean-squared error (MSE) and cost of power per sensor, against more conventional Non-Feedback (NF) architectures by evaluating conditions of feasibility and advantage under different architecture specifications (e.g., communication delay rate, power cost rate, maximum back-off time, sampling period, observation noise). Here, we defined the advantage as the performance gain achieved by FB over NF, while FB is said to be feasible if the advantage region is nonempty. Our theoretical analyses show that the feasibility of FB depends more on the communication power cost, while the advantage depends on the sensors' propagation delay per transmission interval; we derive concrete conditions under which these outcomes hold. Using extensive numerical simulations under a variety of settings, we confirm the accuracy of the derived conditions, and show that our theoretical results hold even for more complex scenarios where the simplifying assumptions no longer hold.
comment: Corrected author name in metadata from "Soojean Han" to "SooJean Han"
Split the Yield, Share the Risk: Pricing, Hedging and Fixed rates in DeFi
We present the first formal treatment of \emph{yield tokenization}, a mechanism that decomposes yield-bearing assets into principal and yield components to facilitate risk transfer and price discovery in decentralized finance (DeFi). We propose a model that characterizes yield token dynamics using stochastic differential equations. We derive a no-arbitrage pricing framework for yield tokens, enabling their use in hedging future yield volatility and managing interest rate risk in decentralized lending pools. Taking DeFi lending as our focus, we show how both borrowers and lenders can use yield tokens to achieve optimal hedging outcomes and mitigate exposure to adversarial interest rate manipulation. Furthermore, we design automated market makers (AMMs) that incorporate a menu of bonding curves to aggregate liquidity from participants with heterogeneous risk preferences. This leads to an efficient and incentive-compatible mechanism for trading yield tokens and yield futures. Building on these foundations, we propose a modular \textit{fixed-rate} lending protocol that synthesizes on-chain yield token markets and lending pools, enabling robust interest rate discovery and enhancing capital efficiency. Our work provides the theoretical underpinnings for risk management and fixed-income infrastructure in DeFi, offering practical mechanisms for stable and sustainable yield markets.
Subframework-based Bearing Rigidity Maintenance Control in Multirobot Networks
This work presents a novel approach for \textit{bearing rigidity} analysis and control in multi-robot networks with sensing constraints and dynamic topology. By decomposing the system's framework into \textit{subframeworks}, we express bearing rigidity -- a global property -- as a set of \textit{local} properties, with rigidity eigenvalues serving as natural \textit{local rigidity measures}. We propose a decentralized gradient-based controller to execute mission-specific commands using only bearing measurements. The controller preserves bearing rigidity by keeping the rigidity eigenvalues above a threshold, using only information exchanged within subframeworks. Simulations evaluate the scheme's effectiveness, underscoring its scalability and practicality.
comment: 6 pages
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Systems and Control (EESS)
Layers of a City: Network-Based Insights into San Diego's Transportation Ecosystem
Analyzing the structure and function of urban transportation networks is critical for enhancing mobility, equity, and resilience. This paper leverages network science to conduct a multi-modal analysis of San Diego's transportation system. We construct a multi-layer graph using data from OpenStreetMap (OSM) and the San Diego Metropolitan Transit System (MTS), representing driving, walking, and public transit layers. By integrating thousands of Points of Interest (POIs), we analyze network accessibility, structure, and resilience through centrality measures, community detection, and a proposed metric for walkability. Our analysis reveals a system defined by a stark core-periphery divide. We find that while the urban core is well-integrated, 30.3% of POIs are isolated from public transit within a walkable distance, indicating significant equity gaps in suburban and rural access. Centrality analysis highlights the driving network's over-reliance on critical freeways as bottlenecks, suggesting low network resilience, while confirming that San Diego is not a broadly walkable city. Furthermore, community detection demonstrates that transportation mode dictates the scale of mobility, producing compact, local clusters for walking and broad, regional clusters for driving. Collectively, this work provides a comprehensive framework for diagnosing urban mobility systems, offering quantitative insights that can inform targeted interventions to improve transportation equity and infrastructure resilience in San Diego.
Case Studies of Generative Machine Learning Models for Dynamical Systems
Systems like aircraft and spacecraft are expensive to operate in the real world. The design, validation, and testing for such systems therefore relies on a combination of mathematical modeling, abundant numerical simulations, and a relatively small set of real-world experiments. Due to modeling errors, simplifications, and uncertainties, the data synthesized by simulation models often does not match data from the system's real-world operation. We consider the broad research question of whether this model mismatch can be significantly reduced by generative artificial intelligence models (GAIMs). Unlike text- or image-processing, where generative models have attained recent successes, GAIM development for aerospace engineering applications must not only train with scarce operational data, but their outputs must also satisfy governing equations based on natural laws, e.g., conservation laws. The scope of this paper primarily focuses on two case studies of optimally controlled systems that are commonly understood and employed in aircraft guidance, namely: minimum-time navigation in a wind field and minimum-exposure navigation in a threat field. We report GAIMs that are trained with a relatively small set, of the order of a few hundred, of examples and with underlying governing equations. By focusing on optimally controlled systems, we formulate training loss functions based on invariance of the Hamiltonian function along system trajectories. We investigate three GAIM architectures, namely: the generative adversarial network (GAN) and two variants of the variational autoencoder (VAE). We provide architectural details and thorough performance analyses of these models. The main finding is that our new models, especially the VAE-based models, are able to synthesize data that satisfy the governing equations and are statistically similar to the training data despite small volumes of training data.
Reliable and Real-Time Highway Trajectory Planning via Hybrid Learning-Optimization Frameworks
Autonomous highway driving presents a high collision risk due to fast-changing environments and limited reaction time, necessitating reliable and efficient trajectory planning. This paper proposes a hybrid trajectory planning framework that integrates the adaptability of learning-based methods with the formal safety guarantees of optimization-based approaches. The framework features a two-layer architecture: an upper layer employing a graph neural network (GNN) trained on real-world highway data to predict human-like longitudinal velocity profiles, and a lower layer utilizing path optimization formulated as a mixed-integer quadratic programming (MIQP) problem. The primary contribution is the lower-layer path optimization model, which introduces a linear approximation of discretized vehicle geometry to substantially reduce computational complexity, while enforcing strict spatiotemporal non-overlapping constraints to formally guarantee collision avoidance throughout the planning horizon. Experimental results demonstrate that the planner generates highly smooth, collision-free trajectories in complex real-world emergency scenarios, achieving success rates exceeding 97% with average planning times of 54 ms, thereby confirming real-time capability.
Error Accumulation using Linearized Models for Aggregating Flexibility in Distribution Systems
This paper investigates flexibility aggregation approaches based on linear models. We begin by examining the theoretical foundations of linear AC power flow, two variants of so-called DC power flow, and the LinDistFlow model, along with their underlying assumptions. The discussion covers key system details, including network topology, voltage constraints, and line losses. Simulations are conducted on the KIT Campus Nord network with real demand and solar data. Results show that, in the absence of negative losses, line losses are generally underestimated by linear models. Furthermore, line losses errors tend to accumulate both at the point of common coupling (PCC) and over extended time horizons.
Design of Adaptive Hybrid Downlink NOMA-TDMA for Visible Light Communications Networks
This paper proposes an adaptive hybrid non-orthogonal multiple access (NOMA)-time division multiple access (TDMA) scheme for multi-user visible light communication (VLC) networks, aiming to enhance users' sum-rate performance while maintaining low complexity. In the proposed scheme, users are divided into groups where each group is served in a different time slot using TDMA. Within each group, up to two users can be served simultaneously using NOMA. A central challenge lies in determining which users should be paired together for NOMA, as the effectiveness of successive interference cancellation (SIC) employed by NOMA depends on the difference between users' channel gains. To address this, for a pair of users, we determine the range of their channel gain ratio within which the pair benefits more from NOMA or TDMA. Identifying the lower and upper bounds of this range is formulated as two optimization problems which are solved efficiently using the Successive Convex Approximation (SCA) method. Simulation results demonstrate that the proposed scheme outperforms the conventional hybrid NOMA-TDMA method under different numbers of users and transmit LED powers.
Challenges in Applying Variational Quantum Algorithms to Dynamic Satellite Network Routing
Applying near-term variational quantum algorithms to the problem of dynamic satellite network routing represents a promising direction for quantum computing. In this work, we provide a critical evaluation of two major approaches: static quantum optimizers such as the Variational Quantum Eigensolver (VQE) and the Quantum Approximate Optimization Algorithm (QAOA) for offline route computation, and Quantum Reinforcement Learning (QRL) methods for online decision-making. Using ideal, noise-free simulations, we find that these algorithms face significant challenges. Specifically, static optimizers are unable to solve even a classically easy 4-node shortest path problem due to the complexity of the optimization landscape. Likewise, a basic QRL agent based on policy gradient methods fails to learn a useful routing strategy in a dynamic 8-node environment and performs no better than random actions. These negative findings highlight key obstacles that must be addressed before quantum algorithms can offer real advantages in communication networks. We discuss the underlying causes of these limitations, including barren plateaus and learning instability, and suggest future research directions to overcome them.
comment: 17 pages and 3 figures
Optimizing Microgrid Composition for Sustainable Data Centers
As computing energy demand continues to grow and electrical grid infrastructure struggles to keep pace, an increasing number of data centers are being planned with colocated microgrids that integrate on-site renewable generation and energy storage. However, while existing research has examined the tradeoffs between operational and embodied carbon emissions in the context of renewable energy certificates, there is a lack of tools to assess how the sizing and composition of microgrid components affects long-term sustainability and power reliability. In this paper, we present a novel optimization framework that extends the computing and energy system co-simulator Vessim with detailed renewable energy generation models from the National Renewable Energy Laboratory's (NREL) System Advisor Model (SAM). Our framework simulates the interaction between computing workloads, on-site renewable production, and energy storage, capturing both operational and embodied emissions. We use a multi-horizon black-box optimization to explore efficient microgrid compositions and enable operators to make more informed decisions when planning energy systems for data centers.
A virtual sensor fusion approach for state of charge estimation of lithium-ion cells
This paper addresses the estimation of the State Of Charge (SOC) of lithium-ion cells via the combination of two widely used paradigms: Kalman Filters (KFs) equipped with Equivalent Circuit Models (ECMs) and machine-learning approaches. In particular, a recent Virtual Sensor (VS) synthesis technique is considered, which operates as follows: (i) learn an Affine Parameter-Varying (APV) model of the cell directly from data, (ii) derive a bank of linear observers from the APV model, (iii) train a machine-learning technique from features extracted from the observers together with input and output data to predict the SOC. The SOC predictions returned by the VS are supplied to an Extended KF (EKF) as output measurements along with the cell terminal voltage, combining the two paradigms. A data-driven calibration strategy for the noise covariance matrices of the EKF is proposed. Experimental results show that the designed approach is beneficial w.r.t. SOC estimation accuracy and smoothness.
Information Bulletin Strategy in Impatient Queuing SC
In Sixth Generation (6G) networks, decentralized control in multi-tenant systems is a suggested enabler for autonomous network operations. However, autonomy requires independent rationale decisions be taken by tenants. This rationality can only be underpinned by timely and continuous access to status information. Despite its importance, the questions of what information should be shared, how much should be communicated, and how frequently updates should be dispatched remain open research challenges. This manuscript proposes an information bulletin strategy defined around two models of the system descriptor states to address these fundamental questions. The strategy is that queues periodically broadcast these information models to tenants at different time intervals, who may respond by reneging from the queue or jockeying to a more favorable one. The expectation is that over time, the queues adapt their processing rates based on what they learn from the tenant behavior. The objective is to minimize overall delay and the impatience. We formulate for this impatience as an optimization problem, whose analytical solution is intractable. We perform numerical experiments to evaluate the performance of the learned queue policy and to assess how closely it approaches optimal conditions.
comment: Submitted to IEEE CSCN 2025
Agentic-AI based Mathematical Framework for Commercialization of Energy Resilience in Electrical Distribution System Planning and Operation
The increasing vulnerability of electrical distribution systems to extreme weather events and cyber threats necessitates the development of economically viable frameworks for resilience enhancement. While existing approaches focus primarily on technical resilience metrics and enhancement strategies, there remains a significant gap in establishing market-driven mechanisms that can effectively commercialize resilience features while optimizing their deployment through intelligent decision-making. Moreover, traditional optimization approaches for distribution network reconfiguration often fail to dynamically adapt to both normal and emergency conditions. This paper introduces a novel framework integrating dual-agent Proximal Policy Optimization (PPO) with market-based mechanisms, achieving an average resilience score of 0.85 0.08 over 10 test episodes. The proposed architecture leverages a dual-agent PPO scheme, where a strategic agent selects optimal DER-driven switching configurations, while a tactical agent fine-tunes individual switch states and grid preferences under budget and weather constraints. These agents interact within a custom-built dynamic simulation environment that models stochastic calamity events, budget limits, and resilience-cost trade-offs. A comprehensive reward function is designed that balances resilience enhancement objectives with market profitability (with up to 200x reward incentives, resulting in 85% of actions during calamity steps selecting configurations with 4 DERs), incorporating factors such as load recovery speed, system robustness, and customer satisfaction. Over 10 test episodes, the framework achieved a benefit-cost ratio of 0.12 0.01, demonstrating sustainable market incentives for resilience investment. This framework creates sustainable market incentives
Optimization of sliding control parameters for a 3-dof robot arm using genetic algorithm (GA)
This paper presents a method for optimizing the sliding mode control (SMC) parameter for a robot manipulator applying a genetic algorithm (GA). The objective of the SMC is to achieve precise and consistent tracking of the trajectory of the robot manipulator under uncertain and disturbed conditions. However, the system effectiveness and robustness depend on the choice of the SMC parameters, which is a difficult and crucial task. To solve this problem, a genetic algorithm is used to locate the optimal values of these parameters that gratify the capability criteria. The proposed method is efficient compared with the conventional SMC and Fuzzy-SMC. The simulation results show that the genetic algorithm with SMC can achieve better tracking capability and reduce the chattering effect.
Sequence Aware SAC Control for Engine Fuel Consumption Optimization in Electrified Powertrain
As hybrid electric vehicles (HEVs) gain traction in heavy-duty trucks, adaptive and efficient energy management is critical for reducing fuel consumption while maintaining battery charge for long operation times. We present a new reinforcement learning (RL) framework based on the Soft Actor-Critic (SAC) algorithm to optimize engine control in series HEVs. We reformulate the control task as a sequential decision-making problem and enhance SAC by incorporating Gated Recurrent Units (GRUs) and Decision Transformers (DTs) into both actor and critic networks to capture temporal dependencies and improve planning over time. To evaluate robustness and generalization, we train the models under diverse initial battery states, drive cycle durations, power demands, and input sequence lengths. Experiments show that the SAC agent with a DT-based actor and GRU-based critic was within 1.8% of Dynamic Programming (DP) in fuel savings on the Highway Fuel Economy Test (HFET) cycle, while the SAC agent with GRUs in both actor and critic networks, and FFN actor-critic agent were within 3.16% and 3.43%, respectively. On unseen drive cycles (US06 and Heavy Heavy-Duty Diesel Truck (HHDDT) cruise segment), generalized sequence-aware agents consistently outperformed feedforward network (FFN)-based agents, highlighting their adaptability and robustness in real-world settings.
Linear Program-Based Stability Conditions for Nonlinear Autonomous Systems
This paper introduces a novel approach to evaluating the asymptotic stability of equilibrium points in both continuous-time (CT) and discrete-time (DT) nonlinear autonomous systems. By utilizing indirect Lyapunov methods and linearizing system dynamics through Jacobian matrices, the methodology replaces traditional semi-definite programming (SDP) techniques with computationally efficient linear programming (LP) conditions. This substitution substantially lowers the computational burden, including time and memory usage, particularly for high-dimensional systems. The stability criteria are developed using matrix transformations and leveraging the structural characteristics of the system, improving scalability. Several examples demonstrated the computational efficiency of the proposed approach compared to the existing SDP-based criteria, particularly for high-dimensional systems.
comment: 6 pages. Submitted to NOLCOS'2025
Optimality Principles and Neural Ordinary Differential Equations-based Process Modeling for Distributed Control
Most recent advances in machine learning and analytics for process control pose the question of how to naturally integrate new data-driven methods with classical process models and control. We propose a process modeling framework enabling integration of data-driven algorithms through consistent topological properties and conservation of extensive quantities. Interconnections among process network units are represented through connectivity matrices and network graphs. We derive the system's natural objective function equivalent to the non-equilibrium entropy production in a steady state system as a driving force for the process dynamics. We illustrate how distributed control and optimization can be implemented into process network structures and how control laws and algorithms alter the system's natural equilibrium towards engineered objectives. The basic requirement is that the flow conditions can be expressed in terms of conic sector (passivity) conditions. Our formalism allows integration of fundamental conservation properties from topology with learned dynamic relations from data through sparse deep neural networks. We demonstrate in a practical example of a simple inventory control system how to integrate the basic topology of a process with a neural network ordinary differential equation model. The system specific constitutive equations are left undescribed and learned by the neural ordinary differential equation algorithm using the adjoint method in combination with an adaptive ODE solver from synthetic time-series data. The resulting neural network forms a state space model for use in e.g. a model predictive control algorithm.
comment: 27 pages, 7 figures
Streaming Generated Gaussian Process Experts for Online Learning and Control
Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a streaming kernel-induced progressively generated expert framework of Gaussian processes (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.
Live Demonstration: Neuromorphic Radar for Gesture Recognition ICASSP 2025
We present a neuromorphic radar framework for real-time, low-power hand gesture recognition (HGR) using an event-driven architecture inspired by biological sensing. Our system comprises a 24 GHz Doppler radar front-end and a custom neuromorphic sampler that converts intermediate-frequency (IF) signals into sparse spike-based representations via asynchronous sigma-delta encoding. These events are directly processed by a lightweight neural network deployed on a Cortex-M0 microcontroller, enabling low-latency inference without requiring spectrogram reconstruction. Unlike conventional radar HGR pipelines that continuously sample and process data, our architecture activates only when meaningful motion is detected, significantly reducing memory, power, and computation overhead. Evaluated on a dataset of five gestures collected from seven users, our system achieves > 85% real-time accuracy. To the best of our knowledge, this is the first work that employs bio-inspired asynchronous sigma-delta encoding and an event-driven processing framework for radar-based HGR.
comment: Neuromorphic Radar, Hand Gesture Recognition, Event-Driven, Sigma-Delta Encoding, Sparse Representation. Presented in ICASSP 2025 at Hyderabad, India
Improving Sequential Market Coordination via Value-oriented Renewable Energy Forecasting
Large penetration of renewable energy sources (RESs) brings huge uncertainty into the electricity markets. The current deterministic clearing approach in the day-ahead (DA) market, where RESs participate based on expected production, has been criticized for causing a lack of coordination between the DA and real-time (RT) markets, leading to high overall operating costs. Previous works indicate that improving day-ahead RES entering quantities can significantly mitigate the drawbacks of deterministic clearing. In this work, we propose using a trained forecasting model, referred to as value-oriented forecasting, to determine RES Improved Entering Quantities (RIEQ) more efficiently during the operational phase. Unlike traditional models that minimize statistical forecasting errors, our approach trains model parameters to minimize the expected overall operating costs across both DA and RT markets. We derive the exact form of the loss function used for training, which becomes piecewise linear when market clearing is modeled by linear programs. Additionally, we provide the analytical gradient of the loss function with respect to the forecast, enabling an efficient training strategy. Numerical studies demonstrate that our forecasts significantly reduce overall operating costs for deterministic market clearing compared to conventional forecasts based on expected RES production.
comment: Submitted to IEEE Transactions on Energy Markets, Policy, and Regulation
ALADIN-$β$: A Distributed Optimization Algorithm for Solving MPCC Problems
Mathematical Programs with Complementarity Constraints (MPCC) are critical in various real-world applications but notoriously challenging due to non-smoothness and degeneracy from complementarity constraints. The $\ell_1$-Exact Penalty-Barrier enhanced \texttt{IPOPT} improves performance and robustness by introducing additional inequality constraints and decision variables. However, this comes at the cost of increased computational complexity due to the higher dimensionality and additional constraints introduced in the centralized formulation. To mitigate this, we propose a distributed structure-splitting reformulation that decomposes these inequality constraints and auxiliary variables into independent sub-problems. Furthermore, we introduce Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN)-$\beta$, a novel approach that integrates the $\ell_1$-Exact Penalty-Barrier method with ALADIN to efficiently solve the distributed reformulation. Numerical experiments demonstrate that even without a globalization strategy, the proposed distributed approach achieves fast convergence while maintaining high precision.
Reachset-Conformant System Identification
Formal verification techniques play a pivotal role in ensuring the safety of complex cyber-physical systems. To transfer model-based verification results to the real world, we require that the measurements of the target system lie in the set of reachable outputs of the corresponding model, a property we refer to as reachset conformance. This paper is on automatically identifying those reachset-conformant models. While state-of-the-art reachset-conformant identification methods focus on linear state-space models, we generalize these methods to nonlinear state-space models and linear and nonlinear input-output models. Furthermore, our identification framework adapts to different levels of prior knowledge on the system dynamics. In particular, we identify the set of model uncertainties for white-box models, the parameters and the set of model uncertainties for gray-box models, and entire reachset-conformant black-box models from data. The robustness and efficacy of our framework are demonstrated in extensive numerical experiments using simulated and real-world data.
comment: This work has been submitted to the IEEE for possible publication
Quantum-Enhanced Power Flow and Optimal Power Flow based on Combinatorial Reformulation
This study introduces the Adiabatic Quantum Power Flow (AQPF) and Adiabatic Quantum Optimal Power Flow (AQOPF) algorithms to solve power flow (PF) and optimal power flow (OPF) problems, respectively. These algorithms utilize a novel combinatorial optimization reformulation of classical PF and OPF problems, and hence, enable their implementation on Ising machines, e.g., quantum and quantum-inspired hardware. The experiments are conducted on standard test cases ranging from 4-bus to 1354-bus systems, using D-Wave's Advantage system (QA), its hybrid quantum-classical solver (HA), as well as the third-generation Digital Annealer (DAv3) and Quantum-Inspired Integrated Optimization software (QIIO) developed by Fujitsu. The annealers are systematically evaluated based on: (i) full and partitioned formulations, (ii) ability to handle ill-conditioned cases, and (iii) scalability. The results are benchmarked against the Newton-Raphson numerical method (NR) and suggest that AQPF and AQOPF can serve as effective solvers or complementary tools to classical methods to address unsolved challenges in large-scale modern power systems.
comment: 10 pages, 1 pseudo code, 2 figures, 4 tables
Dynamic Input Mapping Inversion to Eliminate Algebraic Loops in Hydraulic Actuator Control
The application of nonlinear control schemes to electro-hydraulic actuators often requires several alterations in the design of the controllers during their implementation. This is to overcome challenges that frequently arise in such control algorithms owing to model nonlinearities. Moreover, advanced control solutions for this type of systems often introduce input algebraic loops that pose significant design and tuning difficulties. Conventional methods to avoid such loops introduce chatter, which considerably degrade tracking performance and has oil degradation and wear as side effects. This study presents a nonlinear control architecture for hydraulic actuators that comprises low-complexity modules that facilitate robust high performance in tracking and avoids the drawbacks of chatter. The salient feature is a dynamic input-mapping inversion module that avoids algebraic loops in the control input and is followed by dedicated position control. The stability of the closed-loop system is analyzed using arguments from Lyapunov theory for cascaded non-autonomous nonlinear systems. The effectiveness of the proposed solution is evaluated on a high-fidelity simulator of a wind turbine pitch system, and validated on a full-scale laboratory setup that includes a hydraulic pitch system and blade bearing. Appropriate quantitative metrics are used to evaluate the closed-loop system performance in comparison to a state-of-the-art nonlinear design.
A Value Based Parallel Update MCTS Method for Multi-Agent Cooperative Decision Making of Connected and Automated Vehicles
To solve the problem of lateral and logitudinal joint decision-making of multi-vehicle cooperative driving for connected and automated vehicles (CAVs), this paper proposes a Monte Carlo tree search (MCTS) method with parallel update for multi-agent Markov game with limited horizon and time discounted setting. By analyzing the parallel actions in the multi-vehicle joint action space in the partial-steady-state traffic flow, the parallel update method can quickly exclude potential dangerous actions, thereby increasing the search depth without sacrificing the search breadth. The proposed method is tested in a large number of randomly generated traffic flow. The experiment results show that the algorithm has good robustness and better performance than the SOTA reinforcement learning algorithms and heuristic methods. The vehicle driving strategy using the proposed algorithm shows rationality beyond human drivers, and has advantages in traffic efficiency and safety in the coordinating zone.
comment: arXiv admin note: text overlap with arXiv:2408.04295 by other authors
Privacy-Preserving Fusion for Multi-Sensor Systems Under Multiple Packet Dropouts
Wireless sensor networks (WSNs) are critical components in modern cyber-physical systems, enabling efficient data collection and fusion through spatially distributed sensors. However, the inherent risks of eavesdropping and packet dropouts in such networks pose significant challenges to secure state estimation. In this paper, we address the privacy-preserving fusion estimation (PPFE) problem for multi-sensor systems under multiple packet dropouts and eavesdropping attacks. To mitigate these issues, we propose a distributed encoding-based privacy-preserving mechanism (PPM) within a control-theoretic framework, ensuring data privacy during transmission while maintaining the performance of legitimate state estimation. A centralized fusion filter is developed, accounting for the coupling effects of packet dropouts and the encoding-based PPM. Boundedness conditions for the legitimate user's estimation error covariance are derived via a modified algebraic Riccati equation. Additionally, by demonstrating the divergence of the eavesdropper's mean estimation error, the proposed PPFE algorithm's data confidentiality is rigorously analyzed. Simulation results for an Internet-based three-tank system validate the effectiveness of the proposed approach, highlighting its potential to enhance privacy without compromising estimation accuracy.
Distributional Soft Actor-Critic with Three Refinements
Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal policies. To address this issue, we previously proposed the Distributional Soft Actor-Critic (DSAC or DSACv1), an off-policy RL algorithm that enhances value estimation accuracy by learning a continuous Gaussian value distribution. Despite its effectiveness, DSACv1 faces challenges such as training instability and sensitivity to reward scaling, caused by high variance in critic gradients due to return randomness. In this paper, we introduce three key refinements to DSACv1 to overcome these limitations and further improve Q-value estimation accuracy: expected value substitution, twin value distribution learning, and variance-based critic gradient adjustment. The enhanced algorithm, termed DSAC with Three refinements (DSAC-T or DSACv2), is systematically evaluated across a diverse set of benchmark tasks. Without the need for task-specific hyperparameter tuning, DSAC-T consistently matches or outperforms leading model-free RL algorithms, including SAC, TD3, DDPG, TRPO, and PPO, in all tested environments. Additionally, DSAC-T ensures a stable learning process and maintains robust performance across varying reward scales. Its effectiveness is further demonstrated through real-world application in controlling a wheeled robot, highlighting its potential for deployment in practical robotic tasks.
comment: Title updated in this version. The previous version was titled "DSAC-T: Distributional Soft Actor-Critic With Three Refinements". Added a footnote with the BibTeX entry for the published journal version. No other major changes
CityLight: A Neighborhood-inclusive Universal Model for Coordinated City-scale Traffic Signal Control CIKM 2025
City-scale traffic signal control (TSC) involves thousands of heterogeneous intersections with varying topologies, making cooperative decision-making across intersections particularly challenging. Given the prohibitive computational cost of learning individual policies for each intersection, some researchers explore learning a universal policy to control each intersection in a decentralized manner, where the key challenge is to construct a universal representation method for heterogeneous intersections. However, existing methods are limited to universally representing information of heterogeneous ego intersections, neglecting the essential representation of influence from their heterogeneous neighbors. Universally incorporating neighborhood information is nontrivial due to the intrinsic complexity of traffic flow interactions, as well as the challenge of modeling collective influences from neighbor intersections. To address these challenges, we propose CityLight, which learns a universal policy based on representations obtained with two major modules: a Neighbor Influence Encoder to explicitly model neighbor's influence with specified traffic flow relation and connectivity to the ego intersection; a Neighbor Influence Aggregator to attentively aggregate the influence of neighbors based on their mutual competitive relations. Extensive experiments on five city-scale datasets, ranging from 97 to 13,952 intersections, confirm the efficacy of CityLight, with an average throughput improvement of 11.68% and a lift of 22.59% for generalization.
comment: Accepted by CIKM 2025
Advantages of Feedback in Distributed Data-Gathering for Accurate and Power-Efficient State-Estimation
In distributed target-tracking sensor networks, efficient data gathering methods are necessary to save communication resources and assure information accuracy. This paper proposes a Feedback (FB) distributed data-gathering method which lets the central unit feed information back to the mobile sensors; each sensor then uses it to cancel redundant transmissions and reduce communication congestion. We rigorously compare its performance, in terms of mean-squared error (MSE) and cost of power per sensor, against more conventional Non-Feedback (NF) architectures by evaluating conditions of feasibility and advantage under different architecture specifications (e.g., communication delay rate, power cost rate, maximum back-off time, sampling period, observation noise). Here, we defined the advantage as the performance gain achieved by FB over NF, while FB is said to be feasible if the advantage region is nonempty. Our theoretical analyses show that the feasibility of FB depends more on the communication power cost, while the advantage depends on the sensors' propagation delay per transmission interval; we derive concrete conditions under which these outcomes hold. Using extensive numerical simulations under a variety of settings, we confirm the accuracy of the derived conditions, and show that our theoretical results hold even for more complex scenarios where the simplifying assumptions no longer hold.
comment: Corrected author name in metadata from "Soojean Han" to "SooJean Han"
Split the Yield, Share the Risk: Pricing, Hedging and Fixed rates in DeFi
We present the first formal treatment of \emph{yield tokenization}, a mechanism that decomposes yield-bearing assets into principal and yield components to facilitate risk transfer and price discovery in decentralized finance (DeFi). We propose a model that characterizes yield token dynamics using stochastic differential equations. We derive a no-arbitrage pricing framework for yield tokens, enabling their use in hedging future yield volatility and managing interest rate risk in decentralized lending pools. Taking DeFi lending as our focus, we show how both borrowers and lenders can use yield tokens to achieve optimal hedging outcomes and mitigate exposure to adversarial interest rate manipulation. Furthermore, we design automated market makers (AMMs) that incorporate a menu of bonding curves to aggregate liquidity from participants with heterogeneous risk preferences. This leads to an efficient and incentive-compatible mechanism for trading yield tokens and yield futures. Building on these foundations, we propose a modular \textit{fixed-rate} lending protocol that synthesizes on-chain yield token markets and lending pools, enabling robust interest rate discovery and enhancing capital efficiency. Our work provides the theoretical underpinnings for risk management and fixed-income infrastructure in DeFi, offering practical mechanisms for stable and sustainable yield markets.
Subframework-based Bearing Rigidity Maintenance Control in Multirobot Networks
This work presents a novel approach for \textit{bearing rigidity} analysis and control in multi-robot networks with sensing constraints and dynamic topology. By decomposing the system's framework into \textit{subframeworks}, we express bearing rigidity -- a global property -- as a set of \textit{local} properties, with rigidity eigenvalues serving as natural \textit{local rigidity measures}. We propose a decentralized gradient-based controller to execute mission-specific commands using only bearing measurements. The controller preserves bearing rigidity by keeping the rigidity eigenvalues above a threshold, using only information exchanged within subframeworks. Simulations evaluate the scheme's effectiveness, underscoring its scalability and practicality.
comment: 6 pages
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Multiagent Systems
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience
Repurposing large vision-language models (LVLMs) as computer use agents (CUAs) has led to substantial breakthroughs, primarily driven by human-labeled data. However, these models often struggle with novel and specialized software, particularly in scenarios lacking human annotations. To address this challenge, we propose SEAgent, an agentic self-evolving framework enabling CUAs to autonomously evolve through interactions with unfamiliar software. Specifically, SEAgent empowers computer-use agents to autonomously master novel software environments via experiential learning, where agents explore new software, learn through iterative trial-and-error, and progressively tackle auto-generated tasks organized from simple to complex. To achieve this goal, we design a World State Model for step-wise trajectory assessment, along with a Curriculum Generator that generates increasingly diverse and challenging tasks. The agent's policy is updated through experiential learning, comprised of adversarial imitation of failure actions and Group Relative Policy Optimization (GRPO) on successful ones. Furthermore, we introduce a specialist-to-generalist training strategy that integrates individual experiential insights from specialist agents, facilitating the development of a stronger generalist CUA capable of continuous autonomous evolution. This unified agent ultimately achieves performance surpassing ensembles of individual specialist agents on their specialized software. We validate the effectiveness of SEAgent across five novel software environments within OS-World. Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA, i.e., UI-TARS.
comment: Code at https://github.com/SunzeY/SEAgent
From MAS to MARS: Coordination Failures and Reasoning Trade-offs in Hierarchical Multi-Agent Robotic Systems within a Healthcare Scenario
Multi-agent robotic systems (MARS) build upon multi-agent systems by integrating physical and task-related constraints, increasing the complexity of action execution and agent coordination. However, despite the availability of advanced multi-agent frameworks, their real-world deployment on robots remains limited, hindering the advancement of MARS research in practice. To bridge this gap, we conducted two studies to investigate performance trade-offs of hierarchical multi-agent frameworks in a simulated real-world multi-robot healthcare scenario. In Study 1, using CrewAI, we iteratively refine the system's knowledge base, to systematically identify and categorize coordination failures (e.g., tool access violations, lack of timely handling of failure reports) not resolvable by providing contextual knowledge alone. In Study 2, using AutoGen, we evaluate a redesigned bidirectional communication structure and further measure the trade-offs between reasoning and non-reasoning models operating within the same robotic team setting. Drawing from our empirical findings, we emphasize the tension between autonomy and stability and the importance of edge-case testing to improve system reliability and safety for future real-world deployment. Supplementary materials, including codes, task agent setup, trace outputs, and annotated examples of coordination failures and reasoning behaviors, are available at: https://byc-sophie.github.io/mas-to-mars/.
Behaviorally Adaptive Multi-Robot Hazard Localization in Failure-Prone, Communication-Denied Environments
We address the challenge of multi-robot autonomous hazard mapping in high-risk, failure-prone, communication-denied environments such as post-disaster zones, underground mines, caves, and planetary surfaces. In these missions, robots must explore and map hazards while minimizing the risk of failure due to environmental threats or hardware limitations. We introduce a behavior-adaptive, information-theoretic planning framework for multi-robot teams grounded in the concept of Behavioral Entropy (BE), that generalizes Shannon entropy (SE) to capture diverse human-like uncertainty evaluations. Building on this formulation, we propose the Behavior-Adaptive Path Planning (BAPP) framework, which modulates information gathering strategies via a tunable risk-sensitivity parameter, and present two planning algorithms: BAPP-TID for intelligent triggering of high-fidelity robots, and BAPP-SIG for safe deployment under high risk. We provide theoretical insights on the informativeness of the proposed BAPP framework and validate its effectiveness through both single-robot and multi-robot simulations. Our results show that the BAPP stack consistently outperforms Shannon-based and random strategies: BAPP-TID accelerates entropy reduction, while BAPP-SIG improves robot survivability with minimal loss in information gain. In multi-agent deployments, BAPP scales effectively through spatial partitioning, mobile base relocation, and role-aware heterogeneity. These findings underscore the value of behavior-adaptive planning for robust, risk-sensitive exploration in complex, failure-prone environments.
Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation
Referring Audio-Visual Segmentation (Ref-AVS) aims to segment target objects in audible videos based on given reference expressions. Prior works typically rely on learning latent embeddings via multimodal fusion to prompt a tunable SAM/SAM2 decoder for segmentation, which requires strong pixel-level supervision and lacks interpretability. From a novel perspective of explicit reference understanding, we propose TGS-Agent, which decomposes the task into a Think-Ground-Segment process, mimicking the human reasoning procedure by first identifying the referred object through multimodal analysis, followed by coarse-grained grounding and precise segmentation. To this end, we first propose Ref-Thinker, a multimodal language model capable of reasoning over textual, visual, and auditory cues. We construct an instruction-tuning dataset with explicit object-aware think-answer chains for Ref-Thinker fine-tuning. The object description inferred by Ref-Thinker is used as an explicit prompt for Grounding-DINO and SAM2, which perform grounding and segmentation without relying on pixel-level supervision. Additionally, we introduce R\textsuperscript{2}-AVSBench, a new benchmark with linguistically diverse and reasoning-intensive references for better evaluating model generalization. Our approach achieves state-of-the-art results on both standard Ref-AVSBench and proposed R\textsuperscript{2}-AVSBench. Code will be available at https://github.com/jasongief/TGS-Agent.
comment: Project page: https://github.com/jasongief/TGS-Agent
Position-Based Flocking for Robust Alignment
This paper presents a position-based flocking model for interacting agents, balancing cohesion-separation and alignment to achieve stable collective motion. The model modifies a velocity-based approach by approximating velocity differences using initial and current positions, introducing a threshold weight to ensure sustained alignment. Simulations with 50 agents in 2D demonstrate that the position-based model produces stronger alignment and more rigid and compact formations compared to the velocity-based model. The alignment metric and separation distances highlight the efficacy of the proposed model in achieving robust flocking behavior. The model's use of positions ensures robust alignment, with applications in robotics and collective dynamics.
Chain of Questions: Guiding Multimodal Curiosity in Language Models
Reasoning capabilities in large language models (LLMs) have substantially advanced through methods such as chain-of-thought and explicit step-by-step explanations. However, these improvements have not yet fully transitioned to multimodal contexts, where models must proactively decide which sensory modalities such as vision, audio, or spatial perception to engage when interacting with complex real-world environments. In this paper, we introduce the Chain of Questions (CoQ) framework, a curiosity-driven reasoning approach that encourages multimodal language models to dynamically generate targeted questions regarding their surroundings. These generated questions guide the model to selectively activate relevant modalities, thereby gathering critical information necessary for accurate reasoning and response generation. We evaluate our framework on a novel multimodal benchmark dataset, assembled by integrating WebGPT, ScienceQA, AVSD, and ScanQA datasets. Experimental results demonstrate that our CoQ method improves a foundation model's ability to effectively identify and integrate pertinent sensory information. This leads to improved accuracy, interpretability, and alignment of the reasoning process with diverse multimodal tasks.
DRAMA: A Dynamic and Robust Allocation-based Multi-Agent System for Changing Environments
Multi-agent systems (MAS) have demonstrated significant effectiveness in addressing complex problems through coordinated collaboration among heterogeneous agents. However, real-world environments and task specifications are inherently dynamic, characterized by frequent changes, uncertainty, and variability. Despite this, most existing MAS frameworks rely on static architectures with fixed agent capabilities and rigid task allocation strategies, which greatly limits their adaptability to evolving conditions. This inflexibility poses substantial challenges for sustaining robust and efficient multi-agent cooperation in dynamic and unpredictable scenarios. To address these limitations, we propose DRAMA: a Dynamic and Robust Allocation-based Multi-Agent System designed to facilitate resilient collaboration in rapidly changing environments. DRAMA features a modular architecture with a clear separation between the control plane and the worker plane. Both agents and tasks are abstracted as resource objects with well-defined lifecycles, while task allocation is achieved via an affinity-based, loosely coupled mechanism. The control plane enables real-time monitoring and centralized planning, allowing flexible and efficient task reassignment as agents join, depart, or become unavailable, thereby ensuring continuous and robust task execution. The worker plane comprises a cluster of autonomous agents, each with local reasoning, task execution, the ability to collaborate, and the capability to take over unfinished tasks from other agents when needed.
Generic-to-Specific Reasoning and Learning for Scalable Ad Hoc Teamwork
AI agents deployed in assistive roles often have to collaborate with other agents (humans, AI systems) without prior coordination. Methods considered state of the art for such ad hoc teamwork often pursue a data-driven approach that needs a large labeled dataset of prior observations, lacks transparency, and makes it difficult to rapidly revise existing knowledge in response to changes. As the number of agents increases, the complexity of decision-making makes it difficult to collaborate effectively. This paper advocates leveraging the complementary strengths of knowledge-based and data-driven methods for reasoning and learning for ad hoc teamwork. For any given goal, our architecture enables each ad hoc agent to determine its actions through non-monotonic logical reasoning with: (a) prior commonsense domain-specific knowledge; (b) models learned and revised rapidly to predict the behavior of other agents; and (c) anticipated abstract future goals based on generic knowledge of similar situations in an existing foundation model. We experimentally evaluate our architecture's capabilities in VirtualHome, a realistic physics-based 3D simulation environment.
comment: 14 pages, 6 figures
ConfAgents: A Conformal-Guided Multi-Agent Framework for Cost-Efficient Medical Diagnosis
The efficacy of AI agents in healthcare research is hindered by their reliance on static, predefined strategies. This creates a critical limitation: agents can become better tool-users but cannot learn to become better strategic planners, a crucial skill for complex domains like healthcare. We introduce HealthFlow, a self-evolving AI agent that overcomes this limitation through a novel meta-level evolution mechanism. HealthFlow autonomously refines its own high-level problem-solving policies by distilling procedural successes and failures into a durable, strategic knowledge base. To anchor our research and facilitate reproducible evaluation, we introduce EHRFlowBench, a new benchmark featuring complex, realistic health data analysis tasks derived from peer-reviewed clinical research. Our comprehensive experiments demonstrate that HealthFlow's self-evolving approach significantly outperforms state-of-the-art agent frameworks. This work marks a necessary shift from building better tool-users to designing smarter, self-evolving task-managers, paving the way for more autonomous and effective AI for scientific discovery.
comment: Code: https://github.com/PKU-AICare/ConfAgents
RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
Multi-agent large language model (LLM) systems have shown strong potential in complex reasoning and collaborative decision-making tasks. However, most existing coordination schemes rely on static or full-context routing strategies, which lead to excessive token consumption, redundant memory exposure, and limited adaptability across interaction rounds. We introduce RCR-Router, a modular and role-aware context routing framework designed to enable efficient, adaptive collaboration in multi-agent LLMs. To our knowledge, this is the first routing approach that dynamically selects semantically relevant memory subsets for each agent based on its role and task stage, while adhering to a strict token budget. A lightweight scoring policy guides memory selection, and agent outputs are iteratively integrated into a shared memory store to facilitate progressive context refinement. To better evaluate model behavior, we further propose an Answer Quality Score metric that captures LLM-generated explanations beyond standard QA accuracy. Experiments on three multi-hop QA benchmarks -- HotPotQA, MuSiQue, and 2WikiMultihop -- demonstrate that RCR-Router reduces token usage (up to 30%) while improving or maintaining answer quality. These results highlight the importance of structured memory routing and output-aware evaluation in advancing scalable multi-agent LLM systems.
BTPG-max: Achieving Local Maximal Bidirectional Pairs for Bidirectional Temporal Plan Graphs
Multi-Agent Path Finding (MAPF) requires computing collision-free paths for multiple agents in shared environment. Most MAPF planners assume that each agent reaches a specific location at a specific timestep, but this is infeasible to directly follow on real systems where delays often occur. To address collisions caused by agents deviating due to delays, the Temporal Plan Graph (TPG) was proposed, which converts a MAPF time dependent solution into a time independent set of inter-agent dependencies. Recently, a Bidirectional TPG (BTPG) was proposed which relaxed some dependencies into ``bidirectional pairs" and improved efficiency of agents executing their MAPF solution with delays. Our work improves upon this prior work by designing an algorithm, BPTG-max, that finds more bidirectional pairs. Our main theoretical contribution is in designing the BTPG-max algorithm is locally optimal, i.e. which constructs a BTPG where no additional bidirectional pairs can be added. We also show how in practice BTPG-max leads to BTPGs with significantly more bidirectional edges, superior anytime behavior, and improves robustness to delays.
Online EFX Allocations with Predictions
We study an online fair division problem where a fixed number of goods arrive sequentially and must be allocated to a given set of agents. Once a good arrives, its true value for each agent is revealed, and it has to be immediately and irrevocably allocated to some agent. The ultimate goal is to ensure envy-freeness up to any good (EFX) after all goods have been allocated. Unfortunately, as we show, approximate EFX allocations are unattainable in general, even under restrictive assumptions on the valuation functions. To address this, we follow a recent and fruitful trend of augmenting algorithms with predictions. Specifically, we assume access to a prediction vector estimating the agents' true valuations -- e.g., generated by a machine learning model trained on past data. Predictions may be unreliable, and we measure their error using the total variation distance from the true valuations, that is, the percentage of predicted value-mass that disagrees with the true values. Focusing on the natural class of additive valuations, we prove impossibility results even on approximate EFX allocations for algorithms that either ignore predictions or rely solely on them. We then turn to algorithms that use both the predictions and the true values and show strong lower bounds on the prediction accuracy that is required by any algorithm to compute an approximate EFX. These negative results persist even under identical valuations, contrary to the offline setting where exact EFX allocations always exist without the necessity of predictions. We then present an algorithm for two agents with identical valuations that uses effectively the predictions and the true values. The algorithm approximates EFX, with its guarantees improving as the accuracy of the predictions increases.
Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems
Organisations are starting to adopt LLM-based AI agents, with their deployments naturally evolving from single agents towards interconnected, multi-agent networks. Yet a collection of safe agents does not guarantee a safe collection of agents, as interactions between agents over time create emergent behaviours and induce novel failure modes. This means multi-agent systems require a fundamentally different risk analysis approach than that used for a single agent. This report addresses the early stages of risk identification and analysis for multi-agent AI systems operating within governed environments where organisations control their agent configurations and deployment. In this setting, we examine six critical failure modes: cascading reliability failures, inter-agent communication failures, monoculture collapse, conformity bias, deficient theory of mind, and mixed motive dynamics. For each, we provide a toolkit for practitioners to extend or integrate into their existing frameworks to assess these failure modes within their organisational contexts. Given fundamental limitations in current LLM behavioural understanding, our approach centres on analysis validity, and advocates for progressively increasing validity through staged testing across stages of abstraction and deployment that gradually increases exposure to potential negative impacts, while collecting convergent evidence through simulation, observational analysis, benchmarking, and red teaming. This methodology establishes the groundwork for robust organisational risk management as these LLM-based multi-agent systems are deployed and operated.
StyleTailor: Towards Personalized Fashion Styling via Hierarchical Negative Feedback
The advancement of intelligent agents has revolutionized problem-solving across diverse domains, yet solutions for personalized fashion styling remain underexplored, which holds immense promise for promoting shopping experiences. In this work, we present StyleTailor, the first collaborative agent framework that seamlessly unifies personalized apparel design, shopping recommendation, virtual try-on, and systematic evaluation into a cohesive workflow. To this end, StyleTailor pioneers an iterative visual refinement paradigm driven by multi-level negative feedback, enabling adaptive and precise user alignment. Specifically, our framework features two core agents, i.e., Designer for personalized garment selection and Consultant for virtual try-on, whose outputs are progressively refined via hierarchical vision-language model feedback spanning individual items, complete outfits, and try-on efficacy. Counterexamples are aggregated into negative prompts, forming a closed-loop mechanism that enhances recommendation quality.To assess the performance, we introduce a comprehensive evaluation suite encompassing style consistency, visual quality, face similarity, and artistic appraisal. Extensive experiments demonstrate StyleTailor's superior performance in delivering personalized designs and recommendations, outperforming strong baselines without negative feedback and establishing a new benchmark for intelligent fashion systems.
comment: 24pages, 5 figures
The 2R-Conjecture for the Hegselmann--Krause Model: A Proof in Expectation and New Directions
Hegselmann--Krause models are localized, distributed averaging dynamics on spatial data. A key aspect of these dynamics is that they lead to cluster formation, which has important applications in geographic information systems, dynamic clustering algorithms, opinion dynamics, and social networks. For these models, the key questions are whether a fixed point exists and, if so, characterizing it. In this work, we establish new results towards the "2R-Conjecture" for the Hegselmann--Krause model, for which no meaningful progress, or even any precise statement, has been made since its introduction in 2007. This conjecture relates to the structure of the fixed point when there are a large number of agents per unit space. We provide, among other results, a proof in expectation and a statement of a stronger result that is supported by simulation. The key methodological contribution is to consider the dynamics as an infinite-dimensional problem on the space of point processes, rather than on finitely many points. This enables us to leverage stationarity, shift invariance, and certain other symmetries to obtain the results. These techniques do not have finite-dimensional analogs.
StackPilot: Autonomous Function Agents for Scalable and Environment-Free Code Execution
Recent advances in large language models (LLMs) have substantially enhanced automated code generation across a wide range of programming languages. Nonetheless, verifying the correctness and executability of LLM-generated code remains a significant challenge, as traditional methods rely on language-specific compilers and environment-dependent runtimes. To overcome these limitations, we introduce StackPilot, an LLM-native, multi-agent framework designed for language-agnostic code verification and execution, which operates independently of conventional toolchains. StackPilot offers three principal innovations: (1) a Function-as-Agents paradigm, in which each function is modeled as an autonomous agent capable of fine-grained reasoning and collaborative verification; (2) an LLM-as-Executor strategy, which enables scalable verification via stack-based scheduling; and (3) a novel snapshot mechanism that preserves complete execution contexts, facilitating deterministic and lossless context switching during verification. Empirical evaluations demonstrate that StackPilot achieves framework reliability rates between 89% and 97%, substantially outperforming baseline approaches. These results indicate that StackPilot can reliably verify and execute a significantly larger proportion of LLM-generated code across diverse programming tasks compared to existing methods.
A Value Based Parallel Update MCTS Method for Multi-Agent Cooperative Decision Making of Connected and Automated Vehicles
To solve the problem of lateral and logitudinal joint decision-making of multi-vehicle cooperative driving for connected and automated vehicles (CAVs), this paper proposes a Monte Carlo tree search (MCTS) method with parallel update for multi-agent Markov game with limited horizon and time discounted setting. By analyzing the parallel actions in the multi-vehicle joint action space in the partial-steady-state traffic flow, the parallel update method can quickly exclude potential dangerous actions, thereby increasing the search depth without sacrificing the search breadth. The proposed method is tested in a large number of randomly generated traffic flow. The experiment results show that the algorithm has good robustness and better performance than the SOTA reinforcement learning algorithms and heuristic methods. The vehicle driving strategy using the proposed algorithm shows rationality beyond human drivers, and has advantages in traffic efficiency and safety in the coordinating zone.
comment: arXiv admin note: text overlap with arXiv:2408.04295 by other authors
Accelerating Focal Search in Multi-Agent Path Finding with Tighter Lower Bounds
Multi-Agent Path Finding (MAPF) involves finding collision-free paths for multiple agents while minimizing a cost function--an NP-hard problem. Bounded suboptimal methods like Enhanced Conflict-Based Search (ECBS) and Explicit Estimation CBS (EECBS) balance solution quality with computational efficiency using focal search mechanisms. While effective, traditional focal search faces a limitation: the lower bound (LB) value determining which nodes enter the FOCAL list often increases slowly in early search stages, resulting in a constrained search space that delays finding valid solutions. In this paper, we propose a novel bounded suboptimal algorithm, double-ECBS (DECBS), to address this issue by first determining the maximum LB value and then employing a best-first search guided by this LB to find a collision-free path. Experimental results demonstrate that DECBS outperforms ECBS in most test cases and is compatible with existing optimization techniques. DECBS can reduce nearly 30% high-level CT nodes and 50% low-level focal search nodes. When agent density is moderate to high, DECBS achieves a 23.5% average runtime improvement over ECBS with identical suboptimality bounds and optimizations.
comment: 7 pages
CityLight: A Neighborhood-inclusive Universal Model for Coordinated City-scale Traffic Signal Control CIKM 2025
City-scale traffic signal control (TSC) involves thousands of heterogeneous intersections with varying topologies, making cooperative decision-making across intersections particularly challenging. Given the prohibitive computational cost of learning individual policies for each intersection, some researchers explore learning a universal policy to control each intersection in a decentralized manner, where the key challenge is to construct a universal representation method for heterogeneous intersections. However, existing methods are limited to universally representing information of heterogeneous ego intersections, neglecting the essential representation of influence from their heterogeneous neighbors. Universally incorporating neighborhood information is nontrivial due to the intrinsic complexity of traffic flow interactions, as well as the challenge of modeling collective influences from neighbor intersections. To address these challenges, we propose CityLight, which learns a universal policy based on representations obtained with two major modules: a Neighbor Influence Encoder to explicitly model neighbor's influence with specified traffic flow relation and connectivity to the ego intersection; a Neighbor Influence Aggregator to attentively aggregate the influence of neighbors based on their mutual competitive relations. Extensive experiments on five city-scale datasets, ranging from 97 to 13,952 intersections, confirm the efficacy of CityLight, with an average throughput improvement of 11.68% and a lift of 22.59% for generalization.
comment: Accepted by CIKM 2025
DSBC : Data Science task Benchmarking with Context engineering
Recent advances in large language models (LLMs) have significantly impacted data science workflows, giving rise to specialized data science agents designed to automate analytical tasks. Despite rapid adoption, systematic benchmarks evaluating the efficacy and limitations of these agents remain scarce. In this paper, we introduce a comprehensive benchmark specifically crafted to reflect real-world user interactions with data science agents by observing usage of our commercial applications. We evaluate three LLMs: Claude-4.0-Sonnet, Gemini-2.5-Flash, and OpenAI-o4-Mini across three approaches: zero-shot with context engineering, multi-step with context engineering, and with SmolAgent. Our benchmark assesses performance across a diverse set of eight data science task categories, additionally exploring the sensitivity of models to common prompting issues, such as data leakage and slightly ambiguous instructions. We further investigate the influence of temperature parameters on overall and task-specific outcomes for each model and approach. Our findings reveal distinct performance disparities among the evaluated models and methodologies, highlighting critical factors that affect practical deployment. The benchmark dataset and evaluation framework introduced herein aim to provide a foundation for future research of more robust and effective data science agents.
comment: 32 pages
Towards Language-Augmented Multi-Agent Deep Reinforcement Learning
Most prior works on communication in multi-agent reinforcement learning have focused on emergent communication, which often results in inefficient and non-interpretable systems. Inspired by the role of language in natural intelligence, we investigate how grounding agents in a human-defined language can improve the learning and coordination of embodied agents. We propose a framework in which agents are trained not only to act but also to produce and interpret natural language descriptions of their observations. This language-augmented learning serves a dual role: enabling efficient and interpretable communication between agents, and guiding representation learning. We demonstrate that language-augmented agents outperform emergent communication baselines across various tasks. Our analysis reveals that language grounding leads to more informative internal representations, better generalization to new partners, and improved capability for human-agent interaction. These findings demonstrate the effectiveness of integrating structured language into multi-agent learning and open avenues for more interpretable and capable multi-agent systems.
comment: Accespted at the European Conference on Artificial Intelligence 2025
Robotics
LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences
Generative world models have become essential data engines for autonomous driving, yet most existing efforts focus on videos or occupancy grids, overlooking the unique LiDAR properties. Extending LiDAR generation to dynamic 4D world modeling presents challenges in controllability, temporal coherence, and evaluation standardization. To this end, we present LiDARCrafter, a unified framework for 4D LiDAR generation and editing. Given free-form natural language inputs, we parse instructions into ego-centric scene graphs, which condition a tri-branch diffusion network to generate object structures, motion trajectories, and geometry. These structured conditions enable diverse and fine-grained scene editing. Additionally, an autoregressive module generates temporally coherent 4D LiDAR sequences with smooth transitions. To support standardized evaluation, we establish a comprehensive benchmark with diverse metrics spanning scene-, object-, and sequence-level aspects. Experiments on the nuScenes dataset using this benchmark demonstrate that LiDARCrafter achieves state-of-the-art performance in fidelity, controllability, and temporal consistency across all levels, paving the way for data augmentation and simulation. The code and benchmark are released to the community.
comment: Preprint; 28 pages, 18 figures, 12 tables; Project Page at https://lidarcrafter.github.io
La La LiDAR: Large-Scale Layout Generation from LiDAR Data
Controllable generation of realistic LiDAR scenes is crucial for applications such as autonomous driving and robotics. While recent diffusion-based models achieve high-fidelity LiDAR generation, they lack explicit control over foreground objects and spatial relationships, limiting their usefulness for scenario simulation and safety validation. To address these limitations, we propose Large-scale Layout-guided LiDAR generation model ("La La LiDAR"), a novel layout-guided generative framework that introduces semantic-enhanced scene graph diffusion with relation-aware contextual conditioning for structured LiDAR layout generation, followed by foreground-aware control injection for complete scene generation. This enables customizable control over object placement while ensuring spatial and semantic consistency. To support our structured LiDAR generation, we introduce Waymo-SG and nuScenes-SG, two large-scale LiDAR scene graph datasets, along with new evaluation metrics for layout synthesis. Extensive experiments demonstrate that La La LiDAR achieves state-of-the-art performance in both LiDAR generation and downstream perception tasks, establishing a new benchmark for controllable 3D scene generation.
comment: Preprint; 10 pages, 6 figures, 7 tables
Veila: Panoramic LiDAR Generation from a Monocular RGB Image
Realistic and controllable panoramic LiDAR data generation is critical for scalable 3D perception in autonomous driving and robotics. Existing methods either perform unconditional generation with poor controllability or adopt text-guided synthesis, which lacks fine-grained spatial control. Leveraging a monocular RGB image as a spatial control signal offers a scalable and low-cost alternative, which remains an open problem. However, it faces three core challenges: (i) semantic and depth cues from RGB are vary spatially, complicating reliable conditioning generation; (ii) modality gaps between RGB appearance and LiDAR geometry amplify alignment errors under noisy diffusion; and (iii) maintaining structural coherence between monocular RGB and panoramic LiDAR is challenging, particularly in non-overlap regions between images and LiDAR. To address these challenges, we propose Veila, a novel conditional diffusion framework that integrates: a Confidence-Aware Conditioning Mechanism (CACM) that strengthens RGB conditioning by adaptively balancing semantic and depth cues according to their local reliability; a Geometric Cross-Modal Alignment (GCMA) for robust RGB-LiDAR alignment under noisy diffusion; and a Panoramic Feature Coherence (PFC) for enforcing global structural consistency across monocular RGB and panoramic LiDAR. Additionally, we introduce two metrics, Cross-Modal Semantic Consistency and Cross-Modal Depth Consistency, to evaluate alignment quality across modalities. Experiments on nuScenes, SemanticKITTI, and our proposed KITTI-Weather benchmark demonstrate that Veila achieves state-of-the-art generation fidelity and cross-modal consistency, while enabling generative data augmentation that improves downstream LiDAR semantic segmentation.
comment: Preprint; 10 pages, 6 figures, 7 tables
Inland-LOAM: Voxel-Based Structural Semantic Mapping for Inland Waterways
Accurate geospatial information is crucial for safe, autonomous Inland Waterway Transport (IWT), as existing charts (IENC) lack real-time detail and conventional LiDAR SLAM fails in waterway environments. These challenges lead to vertical drift and non-semantic maps, hindering autonomous navigation. This paper introduces Inland-LOAM, a LiDAR SLAM framework for waterways. It uses an improved feature extraction and a water surface planar constraint to mitigate vertical drift. A novel pipeline transforms 3D point clouds into structured 2D semantic maps using voxel-based geometric analysis, enabling real-time computation of navigational parameters like bridge clearances. An automated module extracts shorelines and exports them into a lightweight, IENC-compatible format. Evaluations on a real-world dataset show Inland-LOAM achieves superior localization accuracy over state-of-the-art methods. The generated semantic maps and shorelines align with real-world conditions, providing reliable data for enhanced situational awareness. The code and dataset will be publicly available
OmniShape: Zero-Shot Multi-Hypothesis Shape and Pose Estimation in the Real World ICRA 2025
We would like to estimate the pose and full shape of an object from a single observation, without assuming known 3D model or category. In this work, we propose OmniShape, the first method of its kind to enable probabilistic pose and shape estimation. OmniShape is based on the key insight that shape completion can be decoupled into two multi-modal distributions: one capturing how measurements project into a normalized object reference frame defined by the dataset and the other modelling a prior over object geometries represented as triplanar neural fields. By training separate conditional diffusion models for these two distributions, we enable sampling multiple hypotheses from the joint pose and shape distribution. OmniShape demonstrates compelling performance on challenging real world datasets. Project website: https://tri-ml.github.io/omnishape
comment: 8 pages, 5 figures. This version has typo fixes on top of the version published at ICRA 2025
DiWA: Diffusion Policy Adaptation with World Models
Fine-tuning diffusion policies with reinforcement learning (RL) presents significant challenges. The long denoising sequence for each action prediction impedes effective reward propagation. Moreover, standard RL methods require millions of real-world interactions, posing a major bottleneck for practical fine-tuning. Although prior work frames the denoising process in diffusion policies as a Markov Decision Process to enable RL-based updates, its strong dependence on environment interaction remains highly inefficient. To bridge this gap, we introduce DiWA, a novel framework that leverages a world model for fine-tuning diffusion-based robotic skills entirely offline with reinforcement learning. Unlike model-free approaches that require millions of environment interactions to fine-tune a repertoire of robot skills, DiWA achieves effective adaptation using a world model trained once on a few hundred thousand offline play interactions. This results in dramatically improved sample efficiency, making the approach significantly more practical and safer for real-world robot learning. On the challenging CALVIN benchmark, DiWA improves performance across eight tasks using only offline adaptation, while requiring orders of magnitude fewer physical interactions than model-free baselines. To our knowledge, this is the first demonstration of fine-tuning diffusion policies for real-world robotic skills using an offline world model. We make the code publicly available at https://diwa.cs.uni-freiburg.de.
comment: Accepted at the 2025 Conference on Robot Learning (CoRL)
Why Evolve When You Can Adapt? Post-Evolution Adaptation of Genetic Memory for On-the-Fly Control
Imagine a robot controller with the ability to adapt like human synapses, dynamically rewiring itself to overcome unforeseen challenges in real time. This paper proposes a novel zero-shot adaptation mechanism for evolutionary robotics, merging a standard Genetic Algorithm (GA) controller with online Hebbian plasticity. Inspired by biological systems, the method separates learning and memory, with the genotype acting as memory and Hebbian updates handling learning. In our approach, the fitness function is leveraged as a live scaling factor for Hebbian learning, enabling the robot's neural controller to adjust synaptic weights on-the-fly without additional training. This adds a dynamic adaptive layer that activates only during runtime to handle unexpected environmental changes. After the task, the robot 'forgets' the temporary adjustments and reverts to the original weights, preserving core knowledge. We validate this hybrid GA-Hebbian controller on an e-puck robot in a T-maze navigation task with changing light conditions and obstacles.
comment: This work was accepted for presentation at the ALIFE 2025 Conference in Kyoto, and will be published by MIT Press as part of the ALIFE 2025 proceedings
Online Learning for Vibration Suppression in Physical Robot Interaction using Power Tools
Vibration suppression is an important capability for collaborative robots deployed in challenging environments such as construction sites. We study the active suppression of vibration caused by external sources such as power tools. We adopt the band-limited multiple Fourier linear combiner (BMFLC) algorithm to learn the vibration online and counter it by feedforward force control. We propose the damped BMFLC method, extending BMFLC with a novel adaptive step-size approach that improves the convergence time and noise resistance. Our logistic function-based damping mechanism reduces the effect of noise and enables larger learning rates. We evaluate our method on extensive simulation experiments with realistic time-varying multi-frequency vibration and real-world physical interaction experiments. The simulation experiments show that our method improves the suppression rate in comparison to the original BMFLC and its recursive least squares and Kalman filter-based extensions. Furthermore, our method is far more efficient than the latter two. We further validate the effectiveness of our method in real-world polishing experiments. A supplementary video is available at https://youtu.be/ms6m-6JyVAI.
comment: Submitted, under review
Vision-based Perception System for Automated Delivery Robot-Pedestrians Interactions
The integration of Automated Delivery Robots (ADRs) into pedestrian-heavy urban spaces introduces unique challenges in terms of safe, efficient, and socially acceptable navigation. We develop the complete pipeline for a single vision sensor based multi-pedestrian detection and tracking, pose estimation, and monocular depth perception. Leveraging the real-world MOT17 dataset sequences, this study demonstrates how integrating human-pose estimation and depth cues enhances pedestrian trajectory prediction and identity maintenance, even under occlusions and dense crowds. Results show measurable improvements, including up to a 10% increase in identity preservation (IDF1), a 7% improvement in multiobject tracking accuracy (MOTA), and consistently high detection precision exceeding 85%, even in challenging scenarios. Notably, the system identifies vulnerable pedestrian groups supporting more socially aware and inclusive robot behaviour.
CollaBot: Vision-Language Guided Simultaneous Collaborative Manipulation
A central research topic in robotics is how to use this system to interact with the physical world. Traditional manipulation tasks primarily focus on small objects. However, in factory or home environments, there is often a need for the movement of large objects, such as moving tables. These tasks typically require multi-robot systems to work collaboratively. Previous research lacks a framework that can scale to arbitrary sizes of robots and generalize to various kinds of tasks. In this work, we propose CollaBot, a generalist framework for simultaneous collaborative manipulation. First, we use SEEM for scene segmentation and point cloud extraction of the target object. Then, we propose a collaborative grasping framework, which decomposes the task into local grasp pose generation and global collaboration. Finally, we design a 2-stage planning module that can generate collision-free trajectories to achieve this task. Experiments show a success rate of 52% across different numbers of robots, objects, and tasks, indicating the effectiveness of the proposed framework.
comment: 9 pages,5 figures
Theatre in the Loop: A Rehearsal-Based, Collaborative Workflow for Expressive Robotic Behaviours
In this paper, we propose theatre-in-the-loop, a framework for developing expressive robot behaviours tailored to artistic performance through a director-guided puppeteering workflow. Leveraging theatrical methods, we use narrative objectives to direct a puppeteer in generating improvised robotic gestures that convey specific emotions. These improvisations are captured and curated to build a dataset of reusable movement templates for standalone playback in future autonomous performances. Initial trials demonstrate the feasibility of this approach, illustrating how the workflow enables precise sculpting of robotic gestures into coherent emotional arcs while revealing challenges posed by the robot's mechanical constraints. We argue that this practice-led framework provides a model for interdisciplinary teams creating socially expressive robot behaviours, contributing to (1) theatre as an interactive training ground for human-robot interaction and (2) co-creation methodologies between humans and machines.
comment: The paper is accepted for presentation to International Conference on Social Robotics + AI (https://icsr2025.eu/)
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
Opti-Acoustic Scene Reconstruction in Highly Turbid Underwater Environments
Scene reconstruction is an essential capability for underwater robots navigating in close proximity to structures. Monocular vision-based reconstruction methods are unreliable in turbid waters and lack depth scale information. Sonars are robust to turbid water and non-uniform lighting conditions, however, they have low resolution and elevation ambiguity. This work proposes a real-time opti-acoustic scene reconstruction method that is specially optimized to work in turbid water. Our strategy avoids having to identify point features in visual data and instead identifies regions of interest in the data. We then match relevant regions in the image to corresponding sonar data. A reconstruction is obtained by leveraging range data from the sonar and elevation data from the camera image. Experimental comparisons against other vision-based and sonar-based approaches at varying turbidity levels, and field tests conducted in marina environments, validate the effectiveness of the proposed approach. We have made our code open-source to facilitate reproducibility and encourage community engagement.
UniFucGrasp: Human-Hand-Inspired Unified Functional Grasp Annotation Strategy and Dataset for Diverse Dexterous Hands
Dexterous grasp datasets are vital for embodied intelligence, but mostly emphasize grasp stability, ignoring functional grasps needed for tasks like opening bottle caps or holding cup handles. Most rely on bulky, costly, and hard-to-control high-DOF Shadow Hands. Inspired by the human hand's underactuated mechanism, we establish UniFucGrasp, a universal functional grasp annotation strategy and dataset for multiple dexterous hand types. Based on biomimicry, it maps natural human motions to diverse hand structures and uses geometry-based force closure to ensure functional, stable, human-like grasps. This method supports low-cost, efficient collection of diverse, high-quality functional grasps. Finally, we establish the first multi-hand functional grasp dataset and provide a synthesis model to validate its effectiveness. Experiments on the UFG dataset, IsaacSim, and complex robotic tasks show that our method improves functional manipulation accuracy and grasp stability, enables efficient generalization across diverse robotic hands, and overcomes annotation cost and generalization challenges in dexterous grasping. The project page is at https://haochen611.github.io/UFG.
comment: The project page is at https://haochen611.github.io/UFG
LRDDv2: Enhanced Long-Range Drone Detection Dataset with Range Information and Comprehensive Real-World Challenges
The exponential growth in Unmanned Aerial Vehicles (UAVs) usage underscores the critical need of detecting them at extended distances to ensure safe operations, especially in densely populated areas. Despite the tremendous advances made in computer vision through deep learning, the detection of these small airborne objects remains a formidable challenge. While several datasets have been developed specifically for drone detection, the need for a more extensive and diverse collection of drone image data persists, particularly for long-range detection under varying environmental conditions. We introduce here the Long Range Drone Detection (LRDD) Version 2 dataset, comprising 39,516 meticulously annotated images, as a second release of the LRDD dataset released previously. The LRDDv2 dataset enhances the LRDDv1 by incorporating a greater variety of images, providing a more diverse and comprehensive resource for drone detection research. What sets LRDDv2 apart is its inclusion of target range information for over 8,000 images, making it possible to develop algorithms for drone range estimation. Tailored for long-range aerial object detection, the majority of LRDDv2's dataset consists of images capturing drones with 50 or fewer pixels in 1080p resolution. For access to the complete Long-Range Drone Detection Dataset (LRDD)v2, please visit https://research.coe.drexel.edu/ece/imaple/lrddv2/ .
comment: Accepted and presented at ISRR 2024
Enhancing Joint Human-AI Inference in Robot Missions: A Confidence-Based Approach
Joint human-AI inference holds immense potential to improve outcomes in human-supervised robot missions. Current day missions are generally in the AI-assisted setting, where the human operator makes the final inference based on the AI recommendation. However, due to failures in human judgement on when to accept or reject the AI recommendation, complementarity is rarely achieved. We investigate joint human-AI inference where the inference made with higher confidence is selected. Through a user study with N=100 participants on a representative simulated robot teleoperation task, specifically studying the inference of robots' control delays we show that: a) Joint inference accuracy is higher and its extent is regulated by the confidence calibration of the AI agent, and b) Humans change their inferences based on AI recommendations and the extent and direction of this change is also regulated by the confidence calibration of the AI agent. Interestingly, our results show that pairing poorly-calibrated AI-DSS with humans hurts performance instead of helping the team, reiterating the need for AI-based decision support systems with good metacognitive sensitivity. To the best of our knowledge, our study presents the first application of a maximum-confidence-based heuristic for joint human-AI inference within a simulated robot teleoperation task.
Force-Compliance MPC and Robot-User CBFs for Interactive Navigation and User-Robot Safety in Hexapod Guide Robots
Guiding the visually impaired in complex environments requires real-time two-way interaction and safety assurance. We propose a Force-Compliance Model Predictive Control (FC-MPC) and Robot-User Control Barrier Functions (CBFs) for force-compliant navigation and obstacle avoidance in Hexapod guide robots. FC-MPC enables two-way interaction by estimating user-applied forces and moments using the robot's dynamic model and the recursive least squares (RLS) method, and then adjusting the robot's movements accordingly, while Robot-User CBFs ensure the safety of both the user and the robot by handling static and dynamic obstacles, and employ weighted slack variables to overcome feasibility issues in complex dynamic environments. We also adopt an Eight-Way Connected DBSCAN method for obstacle clustering, reducing computational complexity from O(n2) to approximately O(n), enabling real-time local perception on resource-limited on-board robot computers. Obstacles are modeled using Minimum Bounding Ellipses (MBEs), and their trajectories are predicted through Kalman filtering. Implemented on the HexGuide robot, the system seamlessly integrates force compliance, autonomous navigation, and obstacle avoidance. Experimental results demonstrate the system's ability to adapt to user force commands while guaranteeing user and robot safety simultaneously during navigation in complex environments.
CookBench: A Long-Horizon Embodied Planning Benchmark for Complex Cooking Scenarios
Embodied Planning is dedicated to the goal of creating agents capable of executing long-horizon tasks in complex physical worlds. However, existing embodied planning benchmarks frequently feature short-horizon tasks and coarse-grained action primitives. To address this challenge, we introduce CookBench, a benchmark for long-horizon planning in complex cooking scenarios. By leveraging a high-fidelity simulation environment built upon the powerful Unity game engine, we define frontier AI challenges in a complex, realistic environment. The core task in CookBench is designed as a two-stage process. First, in Intention Recognition, an agent needs to accurately parse a user's complex intent. Second, in Embodied Interaction, the agent should execute the identified cooking goal through a long-horizon, fine-grained sequence of physical actions. Unlike existing embodied planning benchmarks, we refine the action granularity to a spatial level that considers crucial operational information while abstracting away low-level robotic control. Besides, We provide a comprehensive toolset that encapsulates the simulator. Its unified API supports both macro-level operations, such as placing orders and purchasing ingredients, and a rich set of fine-grained embodied actions for physical interaction, enabling researchers to focus on high-level planning and decision-making. Furthermore, we present an in-depth analysis of state-of-the-art, closed-source Large Language Model and Vision-Language Model, revealing their major shortcomings and challenges posed by complex, long-horizon tasks. The full benchmark will be open-sourced to facilitate future research.
comment: 9 pages, 5 figures
Language as Cost: Proactive Hazard Mapping using VLM for Robot Navigation IROS
Robots operating in human-centric or hazardous environments must proactively anticipate and mitigate dangers beyond basic obstacle detection. Traditional navigation systems often depend on static maps, which struggle to account for dynamic risks, such as a person emerging from a suddenly opening door. As a result, these systems tend to be reactive rather than anticipatory when handling dynamic hazards. Recent advancements in pre-trained large language models and vision-language models (VLMs) create new opportunities for proactive hazard avoidance. In this work, we propose a zero-shot language-as-cost mapping framework that leverages VLMs to interpret visual scenes, assess potential dynamic risks, and assign risk-aware navigation costs preemptively, enabling robots to anticipate hazards before they materialize. By integrating this language-based cost map with a geometric obstacle map, the robot not only identifies existing obstacles but also anticipates and proactively plans around potential hazards arising from environmental dynamics. Experiments in simulated and diverse dynamic environments demonstrate that the proposed method significantly improves navigation success rates and reduces hazard encounters, compared to reactive baseline planners. Code and supplementary materials are available at https://github.com/Taekmino/LaC.
comment: Accepted at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025. 8 pages, 7 figures
COFFEE: A Shadow-Resilient Real-Time Pose Estimator for Unknown Tumbling Asteroids using Sparse Neural Networks
The accurate state estimation of unknown bodies in space is a critical challenge with applications ranging from the tracking of space debris to the shape estimation of small bodies. A necessary enabler to this capability is to find and track features on a continuous stream of images. Existing methods, such as SIFT, ORB and AKAZE, achieve real-time but inaccurate pose estimates, whereas modern deep learning methods yield higher quality features at the cost of more demanding computational resources which might not be available on space-qualified hardware. Additionally, both classical and data-driven methods are not robust to the highly opaque self-cast shadows on the object of interest. We show that, as the target body rotates, these shadows may lead to large biases in the resulting pose estimates. For these objects, a bias in the real-time pose estimation algorithm may mislead the spacecraft's state estimator and cause a mission failure, especially if the body undergoes a chaotic tumbling motion. We present COFFEE, the Celestial Occlusion Fast FEature Extractor, a real-time pose estimation framework for asteroids designed to leverage prior information on the sun phase angle given by sun-tracking sensors commonly available onboard spacecraft. By associating salient contours to their projected shadows, a sparse set of features are detected, invariant to the motion of the shadows. A Sparse Neural Network followed by an attention-based Graph Neural Network feature matching model are then jointly trained to provide a set of correspondences between successive frames. The resulting pose estimation pipeline is found to be bias-free, more accurate than classical pose estimation pipelines and an order of magnitude faster than other state-of-the-art deep learning pipelines on synthetic data as well as on renderings of the tumbling asteroid Apophis.
comment: in Proc. 75th Int. Astronautical Congress (IAC-24), Milan, Italy, Oct. 2024
Safety-Aware Imitation Learning via MPC-Guided Disturbance Injection
Imitation Learning has provided a promising approach to learning complex robot behaviors from expert demonstrations. However, learned policies can make errors that lead to safety violations, which limits their deployment in safety-critical applications. We propose MPC-SafeGIL, a design-time approach that enhances the safety of imitation learning by injecting adversarial disturbances during expert demonstrations. This exposes the expert to a broader range of safety-critical scenarios and allows the imitation policy to learn robust recovery behaviors. Our method uses sampling-based Model Predictive Control (MPC) to approximate worst-case disturbances, making it scalable to high-dimensional and black-box dynamical systems. In contrast to prior work that relies on analytical models or interactive experts, MPC-SafeGIL integrates safety considerations directly into data collection. We validate our approach through extensive simulations including quadruped locomotion and visuomotor navigation and real-world experiments on a quadrotor, demonstrating improvements in both safety and task performance. See our website here: https://leqiu2003.github.io/MPCSafeGIL/
Can Large Language Models Identify Materials from Radar Signals?
Accurately identifying the material composition of objects is a critical capability for AI robots powered by large language models (LLMs) to perform context-aware manipulation. Radar technologies offer a promising sensing modality for material recognition task. When combined with deep learning, radar technologies have demonstrated strong potential in identifying the material of various objects. However, existing radar-based solutions are often constrained to closed-set object categories and typically require task-specific data collection to train deep learning models, largely limiting their practical applicability. This raises an important question: Can we leverage the powerful reasoning capabilities of pre-trained LLMs to directly infer material composition from raw radar signals? Answering this question is non-trivial due to the inherent redundancy of radar signals and the fact that pre-trained LLMs have no prior exposure to raw radar data during training. To address this, we introduce LLMaterial, the first study to investigate the feasibility of using LLM to identify materials directly from radar signals. First, we introduce a physics-informed signal processing pipeline that distills high-redundancy radar raw data into a set of compact intermediate parameters that encapsulate the material's intrinsic characteristics. Second, we adopt a retrieval-augmented generation (RAG) strategy to provide the LLM with domain-specific knowledge, enabling it to interpret and reason over the extracted intermediate parameters. Leveraging this integration, the LLM is empowered to perform step-by-step reasoning on the condensed radar features, achieving open-set material recognition directly from raw radar signals. Preliminary results show that LLMaterial can effectively distinguish among a variety of common materials, highlighting its strong potential for real-world material identification applications.
Point2Act: Efficient 3D Distillation of Multimodal LLMs for Zero-Shot Context-Aware Grasping
We propose Point2Act, which directly retrieves the 3D action point relevant for a contextually described task, leveraging Multimodal Large Language Models (MLLMs). Foundation models opened the possibility for generalist robots that can perform a zero-shot task following natural language descriptions within an unseen environment. While the semantics obtained from large-scale image and language datasets provide contextual understanding in 2D images, the rich yet nuanced features deduce blurry 2D regions and struggle to find precise 3D locations for actions. Our proposed 3D relevancy fields bypass the high-dimensional features and instead efficiently imbue lightweight 2D point-level guidance tailored to the task-specific action. The multi-view aggregation effectively compensates for misalignments due to geometric ambiguities, such as occlusion, or semantic uncertainties inherent in the language descriptions. The output region is highly localized, reasoning fine-grained 3D spatial context that can directly transfer to an explicit position for physical action at the on-the-fly reconstruction of the scene. Our full-stack pipeline, which includes capturing, MLLM querying, 3D reconstruction, and grasp pose extraction, generates spatially grounded responses in under 20 seconds, facilitating practical manipulation tasks. Project page: https://sangminkim-99.github.io/point2act/
Optimizing Bipedal Locomotion for The 100m Dash With Comparison to Human Running ICRA 2023
In this paper, we explore the space of running gaits for the bipedal robot Cassie. Our first contribution is to present an approach for optimizing gait efficiency across a spectrum of speeds with the aim of enabling extremely high-speed running on hardware. This raises the question of how the resulting gaits compare to human running mechanics, which are known to be highly efficient in comparison to quadrupeds. Our second contribution is to conduct this comparison based on established human biomechanical studies. We find that despite morphological differences between Cassie and humans, key properties of the gaits are highly similar across a wide range of speeds. Finally, our third contribution is to integrate the optimized running gaits into a full controller that satisfies the rules of the real-world task of the 100m dash, including starting and stopping from a standing position. We demonstrate this controller on hardware to establish the Guinness World Record for Fastest 100m by a Bipedal Robot.
comment: 7 pages, 7 figures, published by IEEE at ICRA 2023, pp. 12205-12211, see https://ieeexplore.ieee.org/document/10160436
Hand-Eye Autonomous Delivery: Learning Humanoid Navigation, Locomotion and Reaching
We propose Hand-Eye Autonomous Delivery (HEAD), a framework that learns navigation, locomotion, and reaching skills for humanoids, directly from human motion and vision perception data. We take a modular approach where the high-level planner commands the target position and orientation of the hands and eyes of the humanoid, delivered by the low-level policy that controls the whole-body movements. Specifically, the low-level whole-body controller learns to track the three points (eyes, left hand, and right hand) from existing large-scale human motion capture data while high-level policy learns from human data collected by Aria glasses. Our modular approach decouples the ego-centric vision perception from physical actions, promoting efficient learning and scalability to novel scenes. We evaluate our method both in simulation and in the real-world, demonstrating humanoid's capabilities to navigate and reach in complex environments designed for humans.
SkeNa: Learning to Navigate Unseen Environments Based on Abstract Hand-Drawn Maps
A typical human strategy for giving navigation guidance is to sketch route maps based on the environmental layout. Inspired by this, we introduce Sketch map-based visual Navigation (SkeNa), an embodied navigation task in which an agent must reach a goal in an unseen environment using only a hand-drawn sketch map as guidance. To support research for SkeNa, we present a large-scale dataset named SoR, comprising 54k trajectory and sketch map pairs across 71 indoor scenes. In SoR, we introduce two navigation validation sets with varying levels of abstraction in hand-drawn sketches, categorized based on their preservation of spatial scales in the environment, to facilitate future research. To construct SoR, we develop an automated sketch-generation pipeline that efficiently converts floor plans into hand-drawn representations. To solve SkeNa, we propose SkeNavigator, a navigation framework that aligns visual observations with hand-drawn maps to estimate navigation targets. It employs a Ray-based Map Descriptor (RMD) to enhance sketch map valid feature representation using equidistant sampling points and boundary distances. To improve alignment with visual observations, a Dual-Map Aligned Goal Predictor (DAGP) leverages the correspondence between sketch map features and on-site constructed exploration map features to predict goal position and guide navigation. SkeNavigator outperforms prior floor plan navigation methods by a large margin, improving SPL on the high-abstract validation set by 105% relatively. Our code and dataset will be released.
comment: 9 pages, 5 figures
Aerobatic maneuvers in insect-scale flapping-wing aerial robots via deep-learned robust tube model predictive control
Aerial insects exhibit highly agile maneuvers such as sharp braking, saccades, and body flips under disturbance. In contrast, insect-scale aerial robots are limited to tracking non-aggressive trajectories with small body acceleration. This performance gap is contributed by a combination of low robot inertia, fast dynamics, uncertainty in flapping-wing aerodynamics, and high susceptibility to environmental disturbance. Executing highly dynamic maneuvers requires the generation of aggressive flight trajectories that push against the hardware limit and a high-rate feedback controller that accounts for model and environmental uncertainty. Here, through designing a deep-learned robust tube model predictive controller, we showcase insect-like flight agility and robustness in a 750-millgram flapping-wing robot. Our model predictive controller can track aggressive flight trajectories under disturbance. To achieve a high feedback rate in a compute-constrained real-time system, we design imitation learning methods to train a two-layer, fully connected neural network, which resembles insect flight control architecture consisting of central nervous system and motor neurons. Our robot demonstrates insect-like saccade movements with lateral speed and acceleration of 197 centimeters per second and 11.7 meters per second square, representing 447$\%$ and 255$\%$ improvement over prior results. The robot can also perform saccade maneuvers under 160 centimeters per second wind disturbance and large command-to-force mapping errors. Furthermore, it performs 10 consecutive body flips in 11 seconds - the most challenging maneuver among sub-gram flyers. These results represent a milestone in achieving insect-scale flight agility and inspire future investigations on sensing and compute autonomy.
comment: 27 pages, 26 supplementary pages, 6 main figures, 16 supplementary figures, 1 table
CogniPlan: Uncertainty-Guided Path Planning with Conditional Generative Layout Prediction
Path planning in unknown environments is a crucial yet inherently challenging capability for mobile robots, which primarily encompasses two coupled tasks: autonomous exploration and point-goal navigation. In both cases, the robot must perceive the environment, update its belief, and accurately estimate potential information gain on-the-fly to guide planning. In this work, we propose CogniPlan, a novel path planning framework that leverages multiple plausible layouts predicted by a COnditional GeNerative Inpainting model, mirroring how humans rely on cognitive maps during navigation. These predictions, based on the partially observed map and a set of layout conditioning vectors, enable our planner to reason effectively under uncertainty. We demonstrate strong synergy between generative image-based layout prediction and graph-attention-based path planning, allowing CogniPlan to combine the scalability of graph representations with the fidelity and predictiveness of occupancy maps, yielding notable performance gains in both exploration and navigation. We extensively evaluate CogniPlan on two datasets (hundreds of maps and realistic floor plans), consistently outperforming state-of-the-art planners. We further deploy it in a high-fidelity simulator and on hardware, showcasing its high-quality path planning and real-world applicability.
comment: Accepted for presentation at CORL 2025
LiGen: GAN-Augmented Spectral Fingerprinting for Indoor Positioning
Accurate and robust indoor localization is critical for smart building applications, yet existing Wi-Fi-based systems are often vulnerable to environmental conditions. This work presents a novel indoor localization system, called LiGen, that leverages the spectral intensity patterns of ambient light as fingerprints, offering a more stable and infrastructure-free alternative to radio signals. To address the limited spectral data, we design a data augmentation framework based on generative adversarial networks (GANs), featuring two variants: PointGAN, which generates fingerprints conditioned on coordinates, and FreeGAN, which uses a weak localization model to label unconditioned samples. Our positioning model, leveraging a Multi-Layer Perceptron (MLP) architecture to train on synthesized data, achieves submeter-level accuracy, outperforming Wi-Fi-based baselines by over 50\%. LiGen also demonstrates strong robustness in cluttered environments. To the best of our knowledge, this is the first system to combine spectral fingerprints with GAN-based data augmentation for indoor localization.
comment: 6 pages, 10 figures
Beyond Policy Optimization: A Data Curation Flywheel for Sparse-Reward Long-Horizon Planning
Large Language Reasoning Models have demonstrated remarkable success on static tasks, yet their application to multi-round agentic planning in interactive environments faces two fundamental challenges. First, the intractable credit assignment problem renders conventional reinforcement learning ineffective in sparse-reward settings. Second, the computational overhead of verbose, step-by-step reasoning histories is prohibitive. To address these challenges, we propose BPO, a three-stage framework (bootstrapping, extrapolation, and refinement) that establishes a self-improving data flywheel to develop robust reasoning models for long-horizon, sparse-reward environments. Our framework first bootstraps efficient reasoning using the proposed planning quaternions with long-short chain-of-thought fusion. It then extrapolates to out-of-distribution tasks through complexity-stratified curriculum learning. Finally, the model iteratively refines itself by learning exclusively on experiences selected via reward-gated rejection sampling. Experiments on ALFWorld, ScienceWorld, and WebShop demonstrate that our approach achieves state-of-the-art with significant token efficiency, providing a new recipe for reasoning models in agentic planning.
Generating Light-based Fingerprints for Indoor Localization
Accurate indoor localization underpins applications ranging from wayfinding and emergency response to asset tracking and smart-building services. Radio-frequency solutions (e.g. Wi-Fi, RFID, UWB) are widely adopted but remain vulnerable to multipath fading, interference, and uncontrollable coverage variation. We explore an orthogonal modality -- visible light communication (VLC) -- and demonstrate that the spectral signatures captured by a low-cost AS7341 sensor can serve as robust location fingerprints. We introduce a two-stage framework that (i) trains a multi-layer perceptron (MLP) on real spectral measurements and (ii) enlarges the training corpus with synthetic samples produced by TabGAN. The augmented dataset reduces the mean localization error from 62.9cm to 49.3cm -- a 20% improvement -- while requiring only 5% additional data-collection effort. Experimental results obtained on 42 reference points in a U-shaped laboratory confirm that GAN-based augmentation mitigates data-scarcity issues and enhances generalization.
comment: 5 pages, 12 figures; presented at the 2024 MC & WASN Conference (Best Paper Candidate)
Thruster-Enhanced Locomotion: A Decoupled Model Predictive Control with Learned Contact Residuals
Husky Carbon, a robot developed by Northeastern University, serves as a research platform to explore unification of posture manipulation and thrust vectoring. Unlike conventional quadrupeds, its joint actuators and thrusters enable enhanced control authority, facilitating thruster-assisted narrow-path walking. While a unified Model Predictive Control (MPC) framework optimizing both ground reaction forces and thruster forces could theoretically address this control problem, its feasibility is limited by the low torque-control bandwidth of the system's lightweight actuators. To overcome this challenge, we propose a decoupled control architecture: a Raibert-type controller governs legged locomotion using position-based control, while an MPC regulates the thrusters augmented by learned Contact Residual Dynamics (CRD) to account for leg-ground impacts. This separation bypasses the torque-control rate bottleneck while retaining the thruster MPC to explicitly account for leg-ground impact dynamics through learned residuals. We validate this approach through both simulation and hardware experiments, showing that the decoupled control architecture with CRD performs more stable behavior in terms of push recovery and cat-like walking gait compared to the decoupled controller without CRD.
GACL: Grounded Adaptive Curriculum Learning with Active Task and Performance Monitoring IROS 2025
Curriculum learning has emerged as a promising approach for training complex robotics tasks, yet current applications predominantly rely on manually designed curricula, which demand significant engineering effort and can suffer from subjective and suboptimal human design choices. While automated curriculum learning has shown success in simple domains like grid worlds and games where task distributions can be easily specified, robotics tasks present unique challenges: they require handling complex task spaces while maintaining relevance to target domain distributions that are only partially known through limited samples. To this end, we propose Grounded Adaptive Curriculum Learning, a framework specifically designed for robotics curriculum learning with three key innovations: (1) a task representation that consistently handles complex robot task design, (2) an active performance tracking mechanism that allows adaptive curriculum generation appropriate for the robot's current capabilities, and (3) a grounding approach that maintains target domain relevance through alternating sampling between reference and synthetic tasks. We validate GACL on wheeled navigation in constrained environments and quadruped locomotion in challenging 3D confined spaces, achieving 6.8% and 6.1% higher success rates, respectively, than state-of-the-art methods in each domain.
comment: 7 pages, IROS 2025
Estimation of Aerodynamics Forces in Dynamic Morphing Wing Flight
Accurate estimation of aerodynamic forces is essential for advancing the control, modeling, and design of flapping-wing aerial robots with dynamic morphing capabilities. In this paper, we investigate two distinct methodologies for force estimation on Aerobat, a bio-inspired flapping-wing platform designed to emulate the inertial and aerodynamic behaviors observed in bat flight. Our goal is to quantify aerodynamic force contributions during tethered flight, a crucial step toward closed-loop flight control. The first method is a physics-based observer derived from Hamiltonian mechanics that leverages the concept of conjugate momentum to infer external aerodynamic forces acting on the robot. This observer builds on the system's reduced-order dynamic model and utilizes real-time sensor data to estimate forces without requiring training data. The second method employs a neural network-based regression model, specifically a multi-layer perceptron (MLP), to learn a mapping from joint kinematics, flapping frequency, and environmental parameters to aerodynamic force outputs. We evaluate both estimators using a 6-axis load cell in a high-frequency data acquisition setup that enables fine-grained force measurements during periodic wingbeats. The conjugate momentum observer and the regression model demonstrate strong agreement across three force components (Fx, Fy, Fz).
Multimodal Human-Intent Modeling for Contextual Robot-to-Human Handovers of Arbitrary Objects
Human-robot object handover is a crucial element for assistive robots that aim to help people in their daily lives, including elderly care, hospitals, and factory floors. The existing approaches to solving these tasks rely on pre-selected target objects and do not contextualize human implicit and explicit preferences for handover, limiting natural and smooth interaction between humans and robots. These preferences can be related to the target object selection from the cluttered environment and to the way the robot should grasp the selected object to facilitate desirable human grasping during handovers. Therefore, this paper presents a unified approach that selects target distant objects using human verbal and non-verbal commands and performs the handover operation by contextualizing human implicit and explicit preferences to generate robot grasps and compliant handover motion sequences. We evaluate our integrated framework and its components through real-world experiments and user studies with arbitrary daily-life objects. The results of these evaluations demonstrate the effectiveness of our proposed pipeline in handling object handover tasks by understanding human preferences. Our demonstration videos can be found at https://youtu.be/6z27B2INl-s.
Physics-informed Neural Time Fields for Prehensile Object Manipulation
Object manipulation skills are necessary for robots operating in various daily-life scenarios, ranging from warehouses to hospitals. They allow the robots to manipulate the given object to their desired arrangement in the cluttered environment. The existing approaches to solving object manipulations are either inefficient sampling based techniques, require expert demonstrations, or learn by trial and error, making them less ideal for practical scenarios. In this paper, we propose a novel, multimodal physics-informed neural network (PINN) for solving object manipulation tasks. Our approach efficiently learns to solve the Eikonal equation without expert data and finds object manipulation trajectories fast in complex, cluttered environments. Our method is multimodal as it also reactively replans the robot's grasps during manipulation to achieve the desired object poses. We demonstrate our approach in both simulation and real-world scenarios and compare it against state-of-the-art baseline methods. The results indicate that our approach is effective across various objects, has efficient training compared to previous learning-based methods, and demonstrates high performance in planning time, trajectory length, and success rates. Our demonstration videos can be found at https://youtu.be/FaQLkTV9knI.
Constraint-Preserving Data Generation for Visuomotor Policy Learning
Large-scale demonstration data has powered key breakthroughs in robot manipulation, but collecting that data remains costly and time-consuming. We present Constraint-Preserving Data Generation (CP-Gen), a method that uses a single expert trajectory to generate robot demonstrations containing novel object geometries and poses. These generated demonstrations are used to train closed-loop visuomotor policies that transfer zero-shot to the real world and generalize across variations in object geometries and poses. Similar to prior work using pose variations for data generation, CP-Gen first decomposes expert demonstrations into free-space motions and robot skills. But unlike those works, we achieve geometry-aware data generation by formulating robot skills as keypoint-trajectory constraints: keypoints on the robot or grasped object must track a reference trajectory defined relative to a task-relevant object. To generate a new demonstration, CP-Gen samples pose and geometry transforms for each task-relevant object, then applies these transforms to the object and its associated keypoints or keypoint trajectories. We optimize robot joint configurations so that the keypoints on the robot or grasped object track the transformed keypoint trajectory, and then motion plan a collision-free path to the first optimized joint configuration. Experiments on 16 simulation tasks and four real-world tasks, featuring multi-stage, non-prehensile and tight-tolerance manipulation, show that policies trained using CP-Gen achieve an average success rate of 77%, outperforming the best baseline that achieves an average of 50%.
comment: CoRL 2025. Website: https://cp-gen.github.io
Uncertainty-aware Accurate Elevation Modeling for Off-road Navigation via Neural Processes
Terrain elevation modeling for off-road navigation aims to accurately estimate changes in terrain geometry in real-time and quantify the corresponding uncertainties. Having precise estimations and uncertainties plays a crucial role in planning and control algorithms to explore safe and reliable maneuver strategies. However, existing approaches, such as Gaussian Processes (GPs) and neural network-based methods, often fail to meet these needs. They are either unable to perform in real-time due to high computational demands, underestimating sharp geometry changes, or harming elevation accuracy when learned with uncertainties. Recently, Neural Processes (NPs) have emerged as a promising approach that integrates the Bayesian uncertainty estimation of GPs with the efficiency and flexibility of neural networks. Inspired by NPs, we propose an effective NP-based method that precisely estimates sharp elevation changes and quantifies the corresponding predictive uncertainty without losing elevation accuracy. Our method leverages semantic features from LiDAR and camera sensors to improve interpolation and extrapolation accuracy in unobserved regions. Also, we introduce a local ball-query attention mechanism to effectively reduce the computational complexity of global attention by 17\% while preserving crucial local and spatial information. We evaluate our method on off-road datasets having interesting geometric features, collected from trails, deserts, and hills. Our results demonstrate superior performance over baselines and showcase the potential of neural processes for effective and expressive terrain modeling in complex off-road environments.
comment: CoRL 2025
3DRot: 3D Rotation Augmentation for RGB-Based 3D Tasks
RGB-based 3D tasks, e.g., 3D detection, depth estimation, 3D keypoint estimation, still suffer from scarce, expensive annotations and a thin augmentation toolbox, since most image transforms, including resize and rotation, disrupt geometric consistency. In this paper, we introduce 3DRot, a plug-and-play augmentation that rotates and mirrors images about the camera's optical center while synchronously updating RGB images, camera intrinsics, object poses, and 3D annotations to preserve projective geometry-achieving geometry-consistent rotations and reflections without relying on any scene depth. We validate 3DRot with a classical 3D task, monocular 3D detection. On SUN RGB-D dataset, 3DRot raises $IoU_{3D}$ from 43.21 to 44.51, cuts rotation error (ROT) from 22.91$^\circ$ to 20.93$^\circ$, and boosts $mAP_{0.5}$ from 35.70 to 38.11. As a comparison, Cube R-CNN adds 3 other datasets together with SUN RGB-D for monocular 3D estimation, with a similar mechanism and test dataset, increases $IoU_{3D}$ from 36.2 to 37.8, boosts $mAP_{0.5}$ from 34.7 to 35.4. Because it operates purely through camera-space transforms, 3DRot is readily transferable to other 3D tasks.
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control (LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations: (1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence, (2) a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and (3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Doppler-SLAM: Doppler-Aided Radar-Inertial and LiDAR-Inertial Simultaneous Localization and Mapping
Simultaneous localization and mapping (SLAM) is a critical capability for autonomous systems. Traditional SLAM approaches, which often rely on visual or LiDAR sensors, face significant challenges in adverse conditions such as low light or featureless environments. To overcome these limitations, we propose a novel Doppler-aided radar-inertial and LiDAR-inertial SLAM framework that leverages the complementary strengths of 4D radar, FMCW LiDAR, and inertial measurement units. Our system integrates Doppler velocity measurements and spatial data into a tightly-coupled front-end and graph optimization back-end to provide enhanced ego velocity estimation, accurate odometry, and robust mapping. We also introduce a Doppler-based scan-matching technique to improve front-end odometry in dynamic environments. In addition, our framework incorporates an innovative online extrinsic calibration mechanism, utilizing Doppler velocity and loop closure to dynamically maintain sensor alignment. Extensive evaluations on both public and proprietary datasets show that our system significantly outperforms state-of-the-art radar-SLAM and LiDAR-SLAM frameworks in terms of accuracy and robustness. To encourage further research, the code of our Doppler-SLAM and our dataset are available at: https://github.com/Wayne-DWA/Doppler-SLAM.
comment: 8 pages, 7 figures
The Starlink Robot: A Platform and Dataset for Mobile Satellite Communication
The integration of satellite communication into mobile devices represents a paradigm shift in connectivity, yet the performance characteristics under motion and environmental occlusion remain poorly understood. We present the Starlink Robot, the first mobile robotic platform equipped with Starlink satellite internet, comprehensive sensor suite including upward-facing camera, LiDAR, and IMU, designed to systematically study satellite communication performance during movement. Our multi-modal dataset captures synchronized communication metrics, motion dynamics, sky visibility, and 3D environmental context across diverse scenarios including steady-state motion, variable speeds, and different occlusion conditions. This platform and dataset enable researchers to develop motion-aware communication protocols, predict connectivity disruptions, and optimize satellite communication for emerging mobile applications from smartphones to autonomous vehicles. In this work, we use LEOViz for real-time satellite tracking and data collection. The starlink robot project is available at https://github.com/StarlinkRobot.
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping
Dexterous grasping remains a fundamental yet challenging problem in robotics. A general-purpose robot must be capable of grasping diverse objects in arbitrary scenarios. However, existing research typically relies on restrictive assumptions, such as single-object settings or limited environments, showing constrained generalization. We present DexGraspVLA, a hierarchical framework for robust generalization in language-guided general dexterous grasping and beyond. It utilizes a pre-trained Vision-Language model as the high-level planner and learns a diffusion-based low-level Action controller. The key insight to achieve generalization lies in iteratively transforming diverse language and visual inputs into domain-invariant representations via foundation models, where imitation learning can be effectively applied due to the alleviation of domain shift. Notably, our method achieves a 90+% dexterous grasping success rate under thousands of challenging unseen cluttered scenes. Empirical analysis confirms the consistency of internal model behavior across environmental variations, validating our design. DexGraspVLA also, for the first time, simultaneously demonstrates free-form long-horizon prompt execution, robustness to adversarial objects and human disturbance, and failure recovery. Extended application to nonprehensile grasping further proves its generality. Project website: https://dexgraspvla.github.io.
comment: 19 pages, 11 figures
Real-Time Sense and Detect of Drones Using Deep Learning and Airborne LiDAR
The safe operation of drone swarms beyond visual line of sight requires multiple safeguards to mitigate the risk of collision between drones flying in close-proximity scenarios. Cooperative navigation and flight coordination strategies that rely on pre-planned trajectories, constant %{satellite and network connectivity and reliable Global Navigation Satellite System (GNSS) positioning are brittle to failure. Drone embedded sense and detect offers a comprehensive mode of separation between drones for deconfliction and collision avoidance. This paper presents the first airborne LiDAR based solution for drone-swarm detection and localization using 3D deep learning model. It adapts an existing deep learning neural network to the air-to-air drone scenario by expanding the scan space vertically. A new sparse convolution is proposed and applied to accelerate the backbone layer, which is the most time-consuming part of the neural network. To collect training data of safety critical, close-proximity multi-drone operations, a scenario Digital Twin is used to augment real datasets with high fidelity synthetic data. The trained model achieves over 80% recall and 96% precision when tested on real-world datasets. By incorporating a tracking-by-detection algorithm the system can reliably monitor the separation distance of multiple drones in challenging environments.
Non-Prehensile Tool-Object Manipulation by Integrating LLM-Based Planning and Manoeuvrability-Driven Controls
Being able to use tools is a widely recognised indicator of intelligence across species. Humans, for instance, have demonstrated mastery of tool use for over two million years. The ability to use tools is invaluable as it extends an organism's reach and enhances its capacity to interact with objects and the environment. Being able to understand the geometric-mechanical relations between the tools-objects-environments allows certain species (e.g., apes and crows) to reach food in narrow constrained spaces. The same principles of physical augmentation and its associated non-prehensile manipulation capabilities also apply to robotic systems. For example, by instrumenting them with different types of end-effectors, robots can (in principle) dexterously interact (e.g., push and flip) with objects of various shapes and masses akin to its biological counterpart. However, developing this type of manipulation skill is still an open research problem. Furthermore, the complexity of planning tool-object manipulation tasks, particularly in coordinating the actions of dual-arm robots, presents significant challenges. To address these complexities, we propose integrating Large Language Models (LLMs) to assist in planning and executing these intricate manipulations, thereby enhancing the robot's ability to perform in diverse scenarios.
Rethink Repeatable Measures of Robot Performance with Statistical Query
For a general standardized testing algorithm designed to evaluate a specific aspect of a robot's performance, several key expectations are commonly imposed. Beyond accuracy (i.e., closeness to a typically unknown ground-truth reference) and efficiency (i.e., feasibility within acceptable testing costs and equipment constraints), one particularly important attribute is repeatability. Repeatability refers to the ability to consistently obtain the same testing outcome when similar testing algorithms are executed on the same subject robot by different stakeholders, across different times or locations. However, achieving repeatable testing has become increasingly challenging as the components involved grow more complex, intelligent, diverse, and, most importantly, stochastic. While related efforts have addressed repeatability at ethical, hardware, and procedural levels, this study focuses specifically on repeatable testing at the algorithmic level. Specifically, we target the well-adopted class of testing algorithms in standardized evaluation: statistical query (SQ) algorithms (i.e., algorithms that estimate the expected value of a bounded function over a distribution using sampled data). We propose a lightweight, parameterized, and adaptive modification applicable to any SQ routine, whether based on Monte Carlo sampling, importance sampling, or adaptive importance sampling, that makes it provably repeatable, with guaranteed bounds on both accuracy and efficiency. We demonstrate the effectiveness of the proposed approach across three representative scenarios: (i) established and widely adopted standardized testing of manipulators, (ii) emerging intelligent testing algorithms for operational risk assessment in automated vehicles, and (iii) developing use cases involving command tracking performance evaluation of humanoid robots in locomotion tasks.
Equivariant Volumetric Grasping
We propose a new volumetric grasp model that is equivariant to rotations around the vertical axis, leading to a significant improvement in sample efficiency. Our model employs a tri-plane volumetric feature representation -- i.e., the projection of 3D features onto three canonical planes. We introduce a novel tri-plane feature design in which features on the horizontal plane are equivariant to 90{\deg} rotations, while the sum of features from the other two planes remains invariant to the same transformations. This design is enabled by a new deformable steerable convolution, which combines the adaptability of deformable convolutions with the rotational equivariance of steerable ones. This allows the receptive field to adapt to local object geometry while preserving equivariance properties. We further develop equivariant adaptations of two state-of-the-art volumetric grasp planners, GIGA and IGD. Specifically, we derive a new equivariant formulation of IGD's deformable attention mechanism and propose an equivariant generative model of grasp orientations based on flow matching. We provide a detailed analytical justification of the proposed equivariance properties and validate our approach through extensive simulated and real-world experiments. Our results demonstrate that the proposed projection-based design significantly reduces both computational and memory costs. Moreover, the equivariant grasp models built on top of our tri-plane features consistently outperform their non-equivariant counterparts, achieving higher performance with only a modest computational overhead. Video and code can be viewed in: https://mousecpn.github.io/evg-page/
comment: 19 pages
Unveiling the Potential of iMarkers: Invisible Fiducial Markers for Advanced Robotics
Fiducial markers are widely used in various robotics tasks, facilitating enhanced navigation, object recognition, and scene understanding. Despite their advantages for robots and Augmented Reality (AR) applications, they often disrupt the visual aesthetics of environments because they are visible to humans, making them unsuitable for non-intrusive use cases. To address this gap, this paper presents "iMarkers"-innovative, unobtrusive fiducial markers detectable exclusively by robots equipped with specialized sensors. These markers offer high flexibility in production, allowing customization of their visibility range and encoding algorithms to suit various demands. The paper also introduces the hardware designs and software algorithms developed for detecting iMarkers, highlighting their adaptability and robustness in the detection and recognition stages. Various evaluations have demonstrated the effectiveness of iMarkers compared to conventional (printed) and blended fiducial markers and confirmed their applicability in diverse robotics scenarios.
comment: 18 pages, 10 figures, 3 tables
Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems IROS 2025
This paper presents opt-in camera, a concept of privacy-preserving camera systems capable of recording only specific individuals in a crowd who explicitly consent to be recorded. Our system utilizes a mobile wireless communication tag attached to personal belongings as proof of opt-in and as a means of localizing tag carriers in video footage. Specifically, the on-ground positions of the wireless tag are first tracked over time using the unscented Kalman filter (UKF). The tag trajectory is then matched against visual tracking results for pedestrians found in videos to identify the tag carrier. Technically, we devise a dedicated trajectory matching technique based on constrained linear optimization, as well as a novel calibration technique that handles wireless tag-camera calibration and hyperparameter tuning for the UKF, which mitigates the non-line-of-sight (NLoS) issue in wireless localization. We implemented the proposed opt-in camera system using ultra-wideband (UWB) devices and an off-the-shelf webcam. Experimental results demonstrate that our system can perform opt-in recording of individuals in real-time at 10 fps, with reliable identification accuracy in crowds of 8-23 people in a confined space.
comment: IROS 2025
Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation ICCV 2025
An ideal traffic simulator replicates the realistic long-term point-to-point trip that a self-driving system experiences during deployment. Prior models and benchmarks focus on closed-loop motion simulation for initial agents in a scene. This is problematic for long-term simulation. Agents enter and exit the scene as the ego vehicle enters new regions. We propose InfGen, a unified next-token prediction model that performs interleaved closed-loop motion simulation and scene generation. InfGen automatically switches between closed-loop motion simulation and scene generation mode. It enables stable long-term rollout simulation. InfGen performs at the state-of-the-art in short-term (9s) traffic simulation, and significantly outperforms all other methods in long-term (30s) simulation. The code and model of InfGen will be released at https://orangesodahub.github.io/InfGen
comment: ICCV 2025. Project page: https://orangesodahub.github.io/InfGen Code: https://github.com/OrangeSodahub/infgen
Simultaneous Pick and Place Detection by Combining SE(3) Diffusion Models with Differential Kinematics IROS2025
Grasp detection methods typically target the detection of a set of free-floating hand poses that can grasp the object. However, not all of the detected grasp poses are executable due to physical constraints. Even though it is straightforward to filter invalid grasp poses in the post-process, such a two-staged approach is computationally inefficient, especially when the constraint is hard. In this work, we propose an approach to take the following two constraints into account during the grasp detection stage, namely, (i) the picked object must be able to be placed with a predefined configuration without in-hand manipulation (ii) it must be reachable by the robot under the joint limit and collision-avoidance constraints for both pick and place cases. Our key idea is to train an SE(3) grasp diffusion network to estimate the noise in the form of spatial velocity, and constrain the denoising process by a multi-target differential inverse kinematics with an inequality constraint, so that the states are guaranteed to be reachable and placement can be performed without collision. In addition to an improved success ratio, we experimentally confirmed that our approach is more efficient and consistent in computation time compared to a naive two-stage approach.
comment: Accepted for IROS2025
Breaking Imitation Bottlenecks: Reinforced Diffusion Powers Diverse Trajectory Generation
Most end-to-end autonomous driving methods rely on imitation learning from single expert demonstrations, often leading to conservative and homogeneous behaviors that limit generalization in complex real-world scenarios. In this work, we propose DIVER, an end-to-end driving framework that integrates reinforcement learning with diffusion-based generation to produce diverse and feasible trajectories. At the core of DIVER lies a reinforced diffusion-based generation mechanism. First, the model conditions on map elements and surrounding agents to generate multiple reference trajectories from a single ground-truth trajectory, alleviating the limitations of imitation learning that arise from relying solely on single expert demonstrations. Second, reinforcement learning is employed to guide the diffusion process, where reward-based supervision enforces safety and diversity constraints on the generated trajectories, thereby enhancing their practicality and generalization capability. Furthermore, to address the limitations of L2-based open-loop metrics in capturing trajectory diversity, we propose a novel Diversity metric to evaluate the diversity of multi-mode predictions.Extensive experiments on the closed-loop NAVSIM and Bench2Drive benchmarks, as well as the open-loop nuScenes dataset, demonstrate that DIVER significantly improves trajectory diversity, effectively addressing the mode collapse problem inherent in imitation learning.
comment: 16 pages, 6 figures
DriveSOTIF: Advancing Perception SOTIF Through Multimodal Large Language Models
Human drivers possess spatial and causal intelligence, enabling them to perceive driving scenarios, anticipate hazards, and react to dynamic environments. In contrast, autonomous vehicles lack these abilities, making it challenging to manage perception-related Safety of the Intended Functionality (SOTIF) risks, especially under complex or unpredictable driving conditions. To address this gap, we propose fine-tuning multimodal large language models (MLLMs) on a customized dataset specifically designed to capture perception-related SOTIF scenarios. Benchmarking results show that fine-tuned MLLMs achieve an 11.8\% improvement in close-ended VQA accuracy and a 12.0\% increase in open-ended VQA scores compared to baseline models, while maintaining real-time performance with a 0.59-second average inference time per image. We validate our approach through real-world case studies in Canada and China, where fine-tuned models correctly identify safety risks that challenge even experienced human drivers. This work represents the first application of domain-specific MLLM fine-tuning for SOTIF domain in autonomous driving. The dataset and related resources are available at github.com/s95huang/DriveSOTIF.git
comment: This work has been submitted to the IEEE for possible publication. V2 of the manuscript, submitted to IEEE-TVT;
Vision-Language Fusion for Real-Time Autonomous Driving: Goal-Centered Cross-Attention of Camera, HD-Map, & Waypoints
Autonomous cars need geometric accuracy and semantic understanding to navigate complex environments, yet most stacks handle them separately. We present XYZ-Drive, a single vision-language model that reads a front-camera frame, a 25m $\times$ 25m overhead map, and the next waypoint, then outputs steering and speed. A lightweight goal-centered cross-attention layer lets waypoint tokens highlight relevant image and map patches, supporting both action and textual explanations, before the fused tokens enter a partially fine-tuned LLaMA-3.2 11B model. On the MD-NEX Outdoor-Driving benchmark XYZ-Drive attains 95% success and 0.80 Success weighted by Path Length (SPL), surpassing PhysNav-DG by 15%. and halving collisions, all while significantly improving efficiency by using only a single branch. Sixteen ablations explain the gains. Removing any modality (vision, waypoint, map) drops success by up to 11%, confirming their complementary roles and rich connections. Replacing goal-centered attention with simple concatenation cuts 3% in performance, showing query-based fusion injects map knowledge more effectively. Keeping the transformer frozen loses 5%, showing the importance of fine-tuning when applying VLMs for specific tasks such as autonomous driving. Coarsening map resolution from 10 cm to 40 cm blurs lane edges and raises crash rate. Overall, these results demonstrate that early, token-level fusion of intent and map layout enables accurate, transparent, real-time driving.
comment: 5 pages
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control
We present Diffuse-CLoC, a guided diffusion framework for physics-based look-ahead control that enables intuitive, steerable, and physically realistic motion generation. While existing kinematics motion generation with diffusion models offer intuitive steering capabilities with inference-time conditioning, they often fail to produce physically viable motions. In contrast, recent diffusion-based control policies have shown promise in generating physically realizable motion sequences, but the lack of kinematics prediction limits their steerability. Diffuse-CLoC addresses these challenges through a key insight: modeling the joint distribution of states and actions within a single diffusion model makes action generation steerable by conditioning it on the predicted states. This approach allows us to leverage established conditioning techniques from kinematic motion generation while producing physically realistic motions. As a result, we achieve planning capabilities without the need for a high-level planner. Our method handles a diverse set of unseen long-horizon downstream tasks through a single pre-trained model, including static and dynamic obstacle avoidance, motion in-betweening, and task-space control. Experimental results show that our method significantly outperforms the traditional hierarchical framework of high-level motion diffusion and low-level tracking.
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer. The overview of our method and the hardware experiments are shown at https://youtu.be/akjGDgfwLbM?si=QVw6ExoPy2VsU2g6
RAMBO: RL-Augmented Model-Based Whole-Body Control for Loco-Manipulation
Loco-manipulation, physical interaction of various objects that is concurrently coordinated with locomotion, remains a major challenge for legged robots due to the need for both precise end-effector control and robustness to unmodeled dynamics. While model-based controllers provide precise planning via online optimization, they are limited by model inaccuracies. In contrast, learning-based methods offer robustness, but they struggle with precise modulation of interaction forces. We introduce RAMBO, a hybrid framework that integrates model-based whole-body control within a feedback policy trained with reinforcement learning. The model-based module generates feedforward torques by solving a quadratic program, while the policy provides feedback corrective terms to enhance robustness. We validate our framework on a quadruped robot across a diverse set of real-world loco-manipulation tasks, such as pushing a shopping cart, balancing a plate, and holding soft objects, in both quadrupedal and bipedal walking. Our experiments demonstrate that RAMBO enables precise manipulation capabilities while achieving robust and dynamic locomotion.
comment: Accepted to IEEE Robotics and Automation Letters (RA-L)
VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting
Recent large-scale Vision Language Action (VLA) models have shown superior performance in robotic manipulation tasks guided by natural language. However, current VLA models suffer from two drawbacks: (i) generation of massive tokens leading to high inference latency and increased training cost, and (ii) insufficient utilization of generated actions resulting in potential performance loss. To address these issues, we develop a training framework to finetune VLA models for generating significantly fewer action tokens with high parallelism, effectively reducing inference latency and training cost. Furthermore, we introduce an inference optimization technique with a novel voting-based ensemble strategy to combine current and previous action predictions, improving the utilization of generated actions and overall performance. Our results demonstrate that we achieve superior performance compared with state-of-the-art VLA models, achieving significantly higher success rates and 39$\times$ faster inference than OpenVLA with 46 Hz throughput on edge platforms, demonstrating practical deployability. The code is available at https://github.com/LukeLIN-web/VOTE.
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models
Recent advances in multi-modal large language models (MLLMs) have demonstrated strong performance across various domains; however, their ability to comprehend driving scenes remains less proven. The complexity of driving scenarios, which includes multi-view information, poses significant challenges for existing MLLMs. In this paper, we introduce NuPlanQA-Eval, a multi-view, multi-modal evaluation benchmark for driving scene understanding. To further support generalization to multi-view driving scenarios, we also propose NuPlanQA-1M, a large-scale dataset comprising 1M real-world visual question-answering (VQA) pairs. For context-aware analysis of traffic scenes, we categorize our dataset into nine subtasks across three core skills: Road Environment Perception, Spatial Relations Recognition, and Ego-Centric Reasoning. Furthermore, we present BEV-LLM, integrating Bird's-Eye-View (BEV) features from multi-view images into MLLMs. Our evaluation results reveal key challenges that existing MLLMs face in driving scene-specific perception and spatial reasoning from ego-centric perspectives. In contrast, BEV-LLM demonstrates remarkable adaptability to this domain, outperforming other models in six of the nine subtasks. These findings highlight how BEV integration enhances multi-view MLLMs while also identifying key areas that require further refinement for effective adaptation to driving scenes. To facilitate further research, we publicly release NuPlanQA at https://github.com/sungyeonparkk/NuPlanQA.
Empirical Analysis of Sim-and-Real Cotraining of Diffusion Policies for Planar Pushing from Pixels IROS 2025
Cotraining with demonstration data generated both in simulation and on real hardware has emerged as a promising recipe for scaling imitation learning in robotics. This work seeks to elucidate basic principles of this sim-and-real cotraining to inform simulation design, sim-and-real dataset creation, and policy training. Our experiments confirm that cotraining with simulated data can dramatically improve performance, especially when real data is limited. We show that these performance gains scale with additional simulated data up to a plateau; adding more real-world data increases this performance ceiling. The results also suggest that reducing physical domain gaps may be more impactful than visual fidelity for non-prehensile or contact-rich tasks. Perhaps surprisingly, we find that some visual gap can help cotraining -- binary probes reveal that high-performing policies must learn to distinguish simulated domains from real. We conclude by investigating this nuance and mechanisms that facilitate positive transfer between sim-and-real. Focusing narrowly on the canonical task of planar pushing from pixels allows us to be thorough in our study. In total, our experiments span 50+ real-world policies (evaluated on 1000+ trials) and 250 simulated policies (evaluated on 50,000+ trials). Videos and code can be found at https://sim-and-real-cotraining.github.io/.
comment: 11 pages, 17 figures, IROS 2025 Aug 5, 2025 update: Included new experiments in Sections V and VII. Updated abstract and minor changes to text
Multiagent Systems
A Robust Cooperative Vehicle Coordination Framework for Intersection Crossing
Cooperative vehicle coordination at unsignalized intersections has garnered significant interest from both academia and industry in recent years, highlighting its notable advantages in improving traffic throughput and fuel efficiency. However, most existing studies oversimplify the coordination system, assuming accurate vehicle state information and ideal state update process. The oversights pose driving risks in the presence of state uncertainty and communication constraint. To address this gap, we propose a robust and comprehensive intersection coordination framework consisting of a robust cooperative trajectory planner and a context-aware status update scheduler. The trajectory planner directly controls the evolution of the trajectory distributions during frequent vehicle interactions, thereby offering probabilistic safety guarantees. To further align with coordination safety in practical bandwidth-limited conditions, we propose a context-aware status update scheduler that dynamically prioritizes the state updating order of vehicles based on their driving urgency. Simulation results validate the robustness and effectiveness of the proposed coordination framework, showing that the collision probability can be significantly reduced while maintaining comparable coordination efficiency to state-of-theart strategies. Moreover, our proposed framework demonstrates superior effectiveness in utilizing wireless resources in practical uncertain and bandwidth-limited conditions.
A Closed-Loop Multi-Agent Framework for Aerodynamics-Aware Automotive Styling Design
The core challenge in automotive exterior design is balancing subjective aesthetics with objective aerodynamic performance while dramatically accelerating the development cycle. To address this, we propose a novel, LLM-driven multi-agent framework that automates the end-to-end workflow from ambiguous requirements to 3D concept model performance validation. The workflow is structured in two stages: conceptual generation and performance validation. In the first stage, agents collaborate to interpret fuzzy design requirements, generate concept sketches, and produce photorealistic renderings using diffusion models. In the second stage, the renderings are converted to 3D point clouds, where a Drag Prediction Agent, built upon a lightweight surrogate model, provides near-instantaneous predictions of the drag coefficient and pressure fields, replacing time-consuming CFD simulations. The primary contribution of this work is the seamless integration of creative generation with a rapid engineering validation loop within a unified, automated system, which provides a new paradigm for efficiently balancing creative exploration with engineering constraints in the earliest stages of design.
Approximate Proportionality in Online Fair Division
We study the online fair division problem, where indivisible goods arrive sequentially and must be allocated immediately and irrevocably to agents. Prior work has established strong impossibility results for approximating classic fairness notions, such as envy-freeness and maximin share fairness, in this setting. In contrast, we focus on proportionality up to one good (PROP1), a natural relaxation of proportionality whose approximability remains unresolved. We begin by showing that three natural greedy algorithms fail to guarantee any positive approximation to PROP1 in general, against an adaptive adversary. This is surprising because greedy algorithms are commonly used in fair division and a natural greedy algorithm is known to be able to achieve PROP1 under additional information assumptions. This hardness result motivates the study of non-adaptive adversaries and the use of side-information, in the spirit of learning-augmented algorithms. For non-adaptive adversaries, we show that the simple uniformly random allocation can achieve a meaningful PROP1 approximation with high probability. Meanwhile, we present an algorithm that obtain robust approximation ratios against PROP1 when given predictions of the maximum item value (MIV). Interestingly, we also show that stronger fairness notions such as EF1, MMS, and PROPX remain inapproximable even with perfect MIV predictions.
Distributionally Robust Markov Games with Average Reward
This paper introduces the formulation of a distributionally robust Markov game (DR-MG) with average rewards, a crucial framework for multi-agent decision-making under uncertainty over extended horizons. Unlike finite-horizon or discounted models, the average-reward criterion naturally captures long-term performance for systems designed for continuous operation, where sustained reliability is paramount. We account for uncertainty in transition kernels, with players aiming to optimize their worst-case average reward. We first establish a connection between the multi-agent and single agent settings, and derive the solvability of the robust Bellman equation under the average-reward formulation. We then rigorously prove the existence of a robust Nash Equilibrium (NE), offering essential theoretical guarantees for system stability. We further develop and analyze an algorithm named robust Nash-Iteration to compute the robust Nash Equilibria among all agents, providing practical tools for identifying optimal strategies in complex, uncertain, and long-running multi-player environments. Finally, we demonstrate the connection between the average-reward NE and the well-studied discounted NEs, showing that the former can be approximated as the discount factor approaches one. Together, these contributions provide a comprehensive theoretical and algorithmic foundation for identifying optimal strategies in complex, uncertain, and long-running multi-player environments, which allow for the future extension of robust average-reward single-agent problems to the multi-agent setting.
Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limit their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines. These findings highlight the effectiveness, stealthiness, and adaptability of MAST, underscoring the need for robust communication safeguards in LLM-MAS.
NANDA Adaptive Resolver: Architecture for Dynamic Resolution of AI Agent Names
AdaptiveResolver is a dynamic microservice architecture designed to address the limitations of static endpoint resolution for AI agent communication in distributed, heterogeneous environments. Unlike traditional DNS or static URLs, AdaptiveResolver enables context-aware, real-time selection of communication endpoints based on factors such as geographic location, system load, agent capabilities, and security threats. Agents advertise their Agent Name and context requirements through Agent Fact cards in an Agent Registry/Index. A requesting Agent discovers a Target Agent using the registry. The Requester Agent can then resolve the Target Agent Name to obtain a tailored communication channel to the agent based on actual environmental context between the agents. The architecture supports negotiation of trust, quality of service, and resource constraints, facilitating flexible, secure, and scalable agent-to-agent interactions that go beyond the classic client-server model. AdaptiveResolver provides a foundation for robust, future-proof agent communication that can evolve with increasing ecosystem complexity.
Using the NANDA Index Architecture in Practice: An Enterprise Perspective
The proliferation of autonomous AI agents represents a paradigmatic shift from traditional web architectures toward collaborative intelligent systems requiring sophisticated mechanisms for discovery, authentication, capability verification, and secure collaboration across heterogeneous protocol environments. This paper presents a comprehensive framework addressing the fundamental infrastructure requirements for secure, trustworthy, and interoperable AI agent ecosystems. We introduce the NANDA (Networked AI Agents in a Decentralized Architecture) framework, providing global agent discovery, cryptographically verifiable capability attestation through AgentFacts, and cross-protocol interoperability across Anthropic's Modal Context Protocol (MCP), Google's Agent-to-Agent (A2A), Microsoft's NLWeb, and standard HTTPS communications. NANDA implements Zero Trust Agentic Access (ZTAA) principles, extending traditional Zero Trust Network Access (ZTNA) to address autonomous agent security challenges including capability spoofing, impersonation attacks, and sensitive data leakage. The framework defines Agent Visibility and Control (AVC) mechanisms enabling enterprise governance while maintaining operational autonomy and regulatory compliance. Our approach transforms isolated AI agents into an interconnected ecosystem of verifiable, trustworthy intelligent services, establishing foundational infrastructure for large-scale autonomous agent deployment across enterprise and consumer environments. This work addresses the critical gap between current AI agent capabilities and infrastructure requirements for secure, scalable, multi-agent collaboration, positioning the foundation for next-generation autonomous intelligent systems.
A Survey of AI Agent Registry Solutions
As As autonomous AI agents scale across cloud, enterprise, and decentralized environments, the need for standardized registry systems to support discovery, identity, and capability sharing has become essential. This paper surveys three prominent registry approaches each defined by a unique metadata model: MCP's mcp.json, A2A's Agent Card, and NANDA's AgentFacts. MCP uses a centralized metaregistry with GitHub authenticated publishing and structured metadata for server discovery. A2A enables decentralized interaction via JSON-based Agent Cards, discoverable through well-known URIs, curated catalogs, or direct configuration. NANDA Index introduces AgentFacts, a cryptographically verifiable and privacy-preserving metadata model designed for dynamic discovery, credentialed capabilities, and cross-domain interoperability. These approaches are compared across four dimensions: security, scalability, authentication, and maintainability. The paper concludes with suggestions and recommendations to guide future design and adoption of registry systems for the Internet of AI Agents.
MI9 -- Agent Intelligence Protocol: Runtime Governance for Agentic AI Systems
Agentic AI systems capable of reasoning, planning, and executing actions present fundamentally distinct governance challenges compared to traditional AI models. Unlike conventional AI, these systems exhibit emergent and unexpected behaviors during runtime, introducing novel agent-related risks that cannot be fully anticipated through pre-deployment governance alone. To address this critical gap, we introduce MI9, the first fully integrated runtime governance framework designed specifically for safety and alignment of agentic AI systems. MI9 introduces real-time controls through six integrated components: agency-risk index, agent-semantic telemetry capture, continuous authorization monitoring, Finite-State-Machine (FSM)-based conformance engines, goal-conditioned drift detection, and graduated containment strategies. Operating transparently across heterogeneous agent architectures, MI9 enables the systematic, safe, and responsible deployment of agentic systems in production environments where conventional governance approaches fall short, providing the foundational infrastructure for safe agentic AI deployment at scale. Detailed analysis through a diverse set of scenarios demonstrates MI9's systematic coverage of governance challenges that existing approaches fail to address, establishing the technical foundation for comprehensive agentic AI oversight.
What Do Agents Think Others Would Do? Level-2 Inverse Games for Inferring Agents' Estimates of Others' Objectives
Effectively interpreting strategic interactions among multiple agents requires us to infer each agent's objective from limited information. Existing inverse game-theoretic approaches frame this challenge in terms of a "level-1" inference problem, in which we take the perspective of a third-party observer and assume that individual agents share complete knowledge of one another's objectives. However, this assumption breaks down in decentralized, real-world decision scenarios like urban driving and bargaining, in which agents may act based on conflicting views of one another's objectives. We demonstrate the necessity of inferring agents' heterogeneous estimates of each other's objectives through empirical examples, and by theoretically characterizing the prediction error of level-1 inference on fictitious gameplay data from linear-quadratic games. To address this fundamental issue, we propose a framework for level-2 inference to address the question: "What does each agent believe about all agents' objectives?" We prove that the level-2 inference problem is non-convex even in benign settings like linear-quadratic games, and we develop an efficient gradient-based approach for identifying local solutions. Experiments on a synthetic urban driving example show that our approach uncovers nuanced misalignments that level-1 methods miss.
comment: 7 pages + references + appendix
When Agents Break Down in Multiagent Path Finding
In Multiagent Path Finding (MAPF), the goal is to compute efficient, collision-free paths for multiple agents navigating a network from their sources to targets, minimizing the schedule's makespan-the total time until all agents reach their destinations. We introduce a new variant that formally models scenarios where some agents may experience delays due to malfunctions, posing significant challenges for maintaining optimal schedules. Recomputing an entirely new schedule from scratch after each malfunction is often computationally infeasible. To address this, we propose a framework for dynamic schedule adaptation that does not rely on full replanning. Instead, we develop protocols enabling agents to locally coordinate and adjust their paths on the fly. We prove that following our primary communication protocol, the increase in makespan after k malfunctions is bounded by k additional turns, effectively limiting the impact of malfunctions on overall efficiency. Moreover, recognizing that agents may have limited computational capabilities, we also present a secondary protocol that shifts the necessary computations onto the network's nodes, ensuring robustness without requiring enhanced agent processing power. Our results demonstrate that these protocols provide a practical, scalable approach to resilient multiagent navigation in the face of agent failures.
Forgive and Forget? An Industry 5.0 Approach to Trust-Fatigue Co-regulation in Human-Cobot Order Picking
This paper investigates the critical role of trust and fatigue in human-cobot collaborative order picking, framing the challenge within the scope of Logistics 5.0 -- the implementation of human-robot symbiosis in smart logistics. We propose a dynamic, leader-follower Stackelberg game to model this interaction, where utility functions explicitly account for human fatigue and trust. Through agent-based simulations, we demonstrate that while a naive model leads to a "trust death spiral," a refined trust model creates a "trust synergy cycle," increasing productivity by nearly 100 percent. Finally, we show that a cobot equipped with a proactive Trust-Repair Protocol can overcome system brittleness, reducing trust recovery time after a severe failure by over 75 percent compared to a non-adaptive model. Our findings provide a framework for designing intelligent cobot behaviors that fulfill the Industry 5.0 pillars of human-centricity, sustainability, and resilience.
ReaGAN: Node-as-Agent-Reasoning Graph Agentic Network
Graph Neural Networks (GNNs) have achieved remarkable success in graph-based learning by propagating information among neighbor nodes via predefined aggregation mechanisms. However, such fixed schemes often suffer from two key limitations. First, they cannot handle the imbalance in node informativeness -- some nodes are rich in information, while others remain sparse. Second, predefined message passing primarily leverages local structural similarity while ignoring global semantic relationships across the graph, limiting the model's ability to capture distant but relevant information. We propose Retrieval-augmented Graph Agentic Network (ReaGAN), an agent-based framework that empowers each node with autonomous, node-level decision-making. Each node acts as an agent that independently plans its next action based on its internal memory, enabling node-level planning and adaptive message propagation. Additionally, retrieval-augmented generation (RAG) allows nodes to access semantically relevant content and build global relationships in the graph. ReaGAN achieves competitive performance under few-shot in-context settings using a frozen LLM backbone without fine-tuning, showcasing the potential of agentic planning and local-global retrieval in graph learning.
comment: 17 pages, work in progress
Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study
Large Language Models (LLMs) hold promise in automating data analysis tasks, yet open-source models face significant limitations in these kinds of reasoning-intensive scenarios. In this work, we investigate strategies to enhance the data analysis capabilities of open-source LLMs. By curating a seed dataset of diverse, realistic scenarios, we evaluate model behavior across three core dimensions: data understanding, code generation, and strategic planning. Our analysis reveals three key findings: (1) Strategic planning quality serves as the primary determinant of model performance; (2) Interaction design and task complexity significantly influence reasoning capabilities; (3) Data quality demonstrates a greater impact than diversity in achieving optimal performance. We leverage these insights to develop a data synthesis methodology, demonstrating significant improvements in open-source LLMs' analytical reasoning capabilities. Code is available at https://github.com/zjunlp/DataMind.
comment: Work in progress
AudioGenie: A Training-Free Multi-Agent Framework for Diverse Multimodality-to-Multiaudio Generation
Multimodality-to-Multiaudio (MM2MA) generation faces significant challenges in synthesizing diverse and contextually aligned audio types (e.g., sound effects, speech, music, and songs) from multimodal inputs (e.g., video, text, images), owing to the scarcity of high-quality paired datasets and the lack of robust multi-task learning frameworks. Recently, multi-agent system shows great potential in tackling the above issues. However, directly applying it to MM2MA task presents three critical challenges: (1) inadequate fine-grained understanding of multimodal inputs (especially for video), (2) the inability of single models to handle diverse audio events, and (3) the absence of self-correction mechanisms for reliable outputs. To this end, we propose AudioGenie, a novel training-free multi-agent system featuring a dual-layer architecture with a generation team and a supervisor team. For the generation team, a fine-grained task decomposition and an adaptive Mixture-of-Experts (MoE) collaborative entity are designed for detailed comprehensive multimodal understanding and dynamic model selection, and a trial-and-error iterative refinement module is designed for self-correction. The supervisor team ensures temporal-spatial consistency and verifies outputs through feedback loops. Moreover, we build MA-Bench, the first benchmark for MM2MA tasks, comprising 198 annotated videos with multi-type audios. Experiments demonstrate that our AudioGenie achieves state-of-the-art (SOTA) or comparable performance across 9 metrics in 8 tasks. User study further validates the effectiveness of our method in terms of quality, accuracy, alignment, and aesthetic. The project website with audio samples can be found at https://audiogenie.github.io/.
BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling ICML 2025
Time-series Generation (TSG) is a prominent research area with broad applications in simulations, data augmentation, and counterfactual analysis. While existing methods have shown promise in unconditional single-domain TSG, real-world applications demand for cross-domain approaches capable of controlled generation tailored to domain-specific constraints and instance-level requirements. In this paper, we argue that text can provide semantic insights, domain information and instance-specific temporal patterns, to guide and improve TSG. We introduce ``Text-Controlled TSG'', a task focused on generating realistic time series by incorporating textual descriptions. To address data scarcity in this setting, we propose a novel LLM-based Multi-Agent framework that synthesizes diverse, realistic text-to-TS datasets. Furthermore, we introduce BRIDGE, a hybrid text-controlled TSG framework that integrates semantic prototypes with text description for supporting domain-level guidance. This approach achieves state-of-the-art generation fidelity on 11 of 12 datasets, and improves controllability by up to 12% on MSE and 6% MAE compared to no text input generation, highlighting its potential for generating tailored time-series data.
comment: ICML 2025 Main Conference
Adaptive Inference through Bayesian and Inverse Bayesian Inference with Symmetry-Bias in Nonstationary Environments
This study proposes a novel inference framework known as Bayesian and inverse Bayesian (BIB) inference, which incorporates symmetry bias into the Bayesian updating process to perform both conventional and inverse Bayesian updates concurrently. The model was evaluated in a sequential estimation task involving observations drawn from a Gaussian distribution with a stochastically time-varying mean. Conventional Bayesian inference is constrained by a fundamental trade-off between adaptability to abrupt environmental changes and accuracy during stable periods.The BIB framework addresses this limitation by dynamically modulating the learning rate via inverse Bayesian updates, thereby enhancing adaptive flexibility. Notably, the BIB model exhibited spontaneous bursts in the learning rate during environmental transitions, transiently entering high-sensitivity states that facilitated rapid adaptation. This burst-relaxation dynamic serves as a mechanism for balancing adaptability and accuracy. Furthermore, avalanche analysis and detrended fluctuation analysis revealed that the BIB system likely operates near a critical state-a property not observed in standard Bayesian inference. This suggests that the BIB model uniquely achieves a coexistence of computational efficiency and critical dynamics, resolving the adaptability-accuracy trade-off while maintaining a scale-free behavior. These findings offer a new computational perspective on scale-free dynamics in natural systems and provide valuable insights for the design of adaptive inference systems in nonstationary environments.
TVDO: Tchebycheff Value-Decomposition Optimization for Multi-Agent Reinforcement Learning
In cooperative multiagent reinforcement learning (MARL), centralized training with decentralized execution (CTDE) has recently attracted more attention due to the physical demand. However, the most dilemma therein is the inconsistency between jointly-trained policies and individually-executed actions. In this article, we propose a factorized Tchebycheff value-decomposition optimization (TVDO) method to overcome the trouble of inconsistency. In particular, a nonlinear Tchebycheff aggregation function is formulated to realize the global optimum by tightly constraining the upper bound of individual action-value bias, which is inspired by the Tchebycheff method of multi-objective optimization. We theoretically prove that, under no extra limitations, the factorized value decomposition with Tchebycheff aggregation satisfies the sufficiency and necessity of Individual-Global-Max (IGM), which guarantees the consistency between the global and individual optimal action-value function. Empirically, in the climb and penalty game, we verify that TVDO precisely expresses the global-to-individual value decomposition with a guarantee of policy consistency. Meanwhile, we evaluate TVDO in the SMAC benchmark, and extensive experiments demonstrate that TVDO achieves a significant performance superiority over some SOTA MARL baselines.
Systems and Control (CS)
Streaming Generated Gaussian Process Experts for Online Learning and Control
Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a \underline{s}treaming \underline{k}ernel-induced progressivel\underline{y} generated expert framework of \underline{G}aussian \underline{p}rocesses (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.
Improving Q-Learning for Real-World Control: A Case Study in Series Hybrid Agricultural Tractors
The variable and unpredictable load demands in hybrid agricultural tractors make it difficult to design optimal rule-based energy management strategies, motivating the use of adaptive, learning-based control. However, existing approaches often rely on basic fuel-based rewards and do not leverage expert demonstrations to accelerate training. In this paper, first, the performance of Q-value-based reinforcement learning algorithms is evaluated for powertrain control in a hybrid agricultural tractor. Three algorithms, Double Q-Learning (DQL), Deep Q-Networks (DQN), and Double DQN (DDQN), are compared in terms of convergence speed and policy optimality. Second, a piecewise domain-specific reward-shaping strategy is introduced to improve learning efficiency and steer agent behavior toward engine fuel-efficient operating regions. Third, the design of the experience replay buffer is examined, with a focus on the effects of seeding the buffer with expert demonstrations and analyzing how different types of expert policies influence convergence dynamics and final performance. Experimental results demonstrate that (1) DDQN achieves 70\% faster convergence than DQN in this application domain, (2) the proposed reward shaping method effectively biases the learned policy toward fuel-efficient outcomes, and (3) initializing the replay buffer with structured expert data leads to a 33\% improvement in convergence speed.
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
A Robust Cooperative Vehicle Coordination Framework for Intersection Crossing
Cooperative vehicle coordination at unsignalized intersections has garnered significant interest from both academia and industry in recent years, highlighting its notable advantages in improving traffic throughput and fuel efficiency. However, most existing studies oversimplify the coordination system, assuming accurate vehicle state information and ideal state update process. The oversights pose driving risks in the presence of state uncertainty and communication constraint. To address this gap, we propose a robust and comprehensive intersection coordination framework consisting of a robust cooperative trajectory planner and a context-aware status update scheduler. The trajectory planner directly controls the evolution of the trajectory distributions during frequent vehicle interactions, thereby offering probabilistic safety guarantees. To further align with coordination safety in practical bandwidth-limited conditions, we propose a context-aware status update scheduler that dynamically prioritizes the state updating order of vehicles based on their driving urgency. Simulation results validate the robustness and effectiveness of the proposed coordination framework, showing that the collision probability can be significantly reduced while maintaining comparable coordination efficiency to state-of-theart strategies. Moreover, our proposed framework demonstrates superior effectiveness in utilizing wireless resources in practical uncertain and bandwidth-limited conditions.
Grid-Forming Vector Current Control FRT Modes Under Symmetrical and Asymmetrical Faults
Recent research has shown that operating grid-connected converters using the grid-forming vector current control (GFVCC) scheme offers significant benefits, including the simplicity and modularity of the control architecture, as well as enabling a seamless transition from PLL-based grid-following control to grid-forming. An important aspect of any grid-connected converter control strategy is the handling of grid-fault scenarios such as symmetrical and asymmetrical short-circuit faults. This paper presents several fault ride-through (FRT) strategies for GFVCC that enable the converter to provide fault current and stay synchronized to the grid while respecting the converter hardware limitations and retaining grid-forming behavior. The converter control scheme is extended in a modular manner to include negative-sequence loops, and the proposed FRT strategies address both symmetrical and asymmetrical faults. The proposed FRT strategies are analyzed through case studies, including infinite-bus setups and multi-unit grids.
Live Demonstration: Neuromorphic Radar for Gesture Recognition ICASSP 2025
We present a neuromorphic radar framework for real-time, low-power hand gesture recognition (HGR) using an event-driven architecture inspired by biological sensing. Our system comprises a 24 GHz Doppler radar front-end and a custom neuromorphic sampler that converts intermediate-frequency (IF) signals into sparse spike-based representations via asynchronous sigma-delta encoding. These events are directly processed by a lightweight neural network deployed on a Cortex-M0 microcontroller, enabling low-latency inference without requiring spectrogram reconstruction. Unlike conventional radar HGR pipelines that continuously sample and process data, our architecture activates only when meaningful motion is detected, significantly reducing memory, power, and computation overhead. Evaluated on a dataset of five gestures collected from seven users, our system achieves > 85% real-time accuracy. To the best of our knowledge, this is the first work that employs bio-inspired asynchronous sigma-delta encoding and an event-driven processing framework for radar-based HGR.
comment: Neuromorphic Radar, Hand Gesture Recognition, Event-Driven, Sigma-Delta Encoding, Sparse Representation. Presented in ICASSP 2025 at Hyderabad, India
An Event-based State Estimation Approach for Positive Systems with Positive Observers
This article addresses the problem of state observer design for continuous-time linear positive networked systems. Considering the bandwidth constraint in the communication network, an event-measurement-based positive observer design is proposed. The physical interpretation of a positive observer differs from that of a general observer. Its primary goal is to ensure that all state estimates remain non-negative at all times. Using output measurements, a law with weighted sampling error is used to determine the sampling sequence between the system and the observer. The observer dynamics are designed using the standard Luenberger structure with the event-based sampled output information, which is updated only when an event occurs. Assuming observability and sufficient conditions for the positivity of the system, the asymptotic stability of the observer dynamics with sampled information is established. Sufficient conditions of stability and positivity are derived using linear matrix inequalities. Moreover, the design ensures that the event-based architecture is free from Zeno behavior, ensuring a positive minimum bound on the inter-execution time. In addition, numerical simulations on a three-tank system having variable cross-sections are used to demonstrate the efficacy of the proposed event-based positive observer.
comment: 16 pages, 7 figures and 3 tables
Filtering and 1/3 Power Law for Optimal Time Discretisation in Numerical Integration of Stochastic Differential Equations
This paper is concerned with the numerical integration of stochastic differential equations (SDEs) which govern diffusion processes driven by a standard Wiener process. With the latter being replaced by a sequence of increments at discrete moments of time, we revisit a filtering point of view on the approximate strong solution of the SDE as an estimate of the hidden system state whose conditional probability distribution is updated using a Bayesian approach and Brownian bridges over the intermediate time intervals. For a class of multivariable linear SDEs, where the numerical solution is organised as a Kalman filter, we investigate the fine-grid asymptotic behaviour of terminal and integral mean-square error functionals when the time discretisation is specified by a sufficiently smooth monotonic transformation of a uniform grid. This leads to constrained optimisation problems over the time discretisation profile, and their solutions reveal a 1/3 power law for the asymptotically optimal grid density functions. As a one-dimensional example, the results are illustrated for the Ornstein-Uhlenbeck process.
comment: 6 pages, submitted to IFAC World Congress 2026
Power System Voltage Stability Boundary: Computational Results and Applications
The objective of this paper is to report some computational results for the theory of DAE stability boundary, with the aim of advancing applications in power system voltage stability studies. Firstly, a new regularization transformation for standard differential-algebraic equations (DAEs) is proposed. Then the existence of anchor points on voltage stability boundary is examined, and an optimization method for computing the controlling pseudo-saddle is suggested. Subsequently, a local representation of the stable manifold of the pseudo-saddle on the stability boundary is presented, and a voltage stability margin expression is obtained. Finally, the proposed results are verified using several examples, demonstrating the accuracy and effectiveness of the suggested methods.
Aerobatic maneuvers in insect-scale flapping-wing aerial robots via deep-learned robust tube model predictive control
Aerial insects exhibit highly agile maneuvers such as sharp braking, saccades, and body flips under disturbance. In contrast, insect-scale aerial robots are limited to tracking non-aggressive trajectories with small body acceleration. This performance gap is contributed by a combination of low robot inertia, fast dynamics, uncertainty in flapping-wing aerodynamics, and high susceptibility to environmental disturbance. Executing highly dynamic maneuvers requires the generation of aggressive flight trajectories that push against the hardware limit and a high-rate feedback controller that accounts for model and environmental uncertainty. Here, through designing a deep-learned robust tube model predictive controller, we showcase insect-like flight agility and robustness in a 750-millgram flapping-wing robot. Our model predictive controller can track aggressive flight trajectories under disturbance. To achieve a high feedback rate in a compute-constrained real-time system, we design imitation learning methods to train a two-layer, fully connected neural network, which resembles insect flight control architecture consisting of central nervous system and motor neurons. Our robot demonstrates insect-like saccade movements with lateral speed and acceleration of 197 centimeters per second and 11.7 meters per second square, representing 447$\%$ and 255$\%$ improvement over prior results. The robot can also perform saccade maneuvers under 160 centimeters per second wind disturbance and large command-to-force mapping errors. Furthermore, it performs 10 consecutive body flips in 11 seconds - the most challenging maneuver among sub-gram flyers. These results represent a milestone in achieving insect-scale flight agility and inspire future investigations on sensing and compute autonomy.
comment: 27 pages, 26 supplementary pages, 6 main figures, 16 supplementary figures, 1 table
Integrating Upstream Supply Chains into Generation Expansion Planning
Rising electricity demand underscores the need for secure and reliable generation expansion planning that accounts for upstream supply chain constraints. Traditional models often overlook limitations in materials, manufacturing capacity, lead times for deployment, and field availability, which can delay availability of planned resources and thus to threaten system reliability. This paper introduces a multi-stage supply chain-constrained generation expansion planning (SC-GEP) model that optimizes long-term investments while capturing material availability, production limits, spatial and temporal constraints, and material reuse from retired assets. A decomposition algorithm efficiently solves the resulting MILP. A Maryland case study shows that supply chain constraints shift technology choices, amplify deployment delays caused by lead times, and prompt earlier investment in shorter lead-time, low-material-intensity options. In the low-demand scenario, supply chain constraints raise investment costs by $1.2 billion. Under high demand, persistent generation and reserve shortfalls emerge, underscoring the need to integrate upstream constraints into long-term planning.
comment: 10 pages, 9 figures
Quantum Hamiltonian Descent based Augmented Lagrangian Method for Constrained Nonconvex Nonlinear Optimization
Nonlinear programming (NLP) plays a critical role in domains such as power energy systems, chemical engineering, communication networks, and financial engineering. However, solving large-scale, nonconvex NLP problems remains a significant challenge due to the complexity of the solution landscape and the presence of nonlinear nonconvex constraints. In this paper, we develop a Quantum Hamiltonian Descent based Augmented Lagrange Method (QHD-ALM) framework to address largescale, constrained nonconvex NLP problems. The augmented Lagrange method (ALM) can convert a constrained NLP to an unconstrained NLP, which can be solved by using Quantum Hamiltonian Descent (QHD). To run the QHD on a classical machine, we propose to use the Simulated Bifurcation algorithm as the engine to simulate the dynamic process. We apply our algorithm to a Power-to-Hydrogen System, and the simulation results verify the effectiveness of our algorithm.
Control Closure Certificates
This paper introduces the notion of control closure certificates to synthesize controllers for discrete-time control systems against $\omega$-regular specifications. Typical functional approaches to synthesize controllers against $\omega$-regular specifications rely on combining inductive invariants (for example, via barrier certificates) with proofs of well-foundedness (for example, via ranking functions). Transition invariants, provide an alternative where instead of standard well-foundedness arguments one may instead search for disjunctive well-foundedness arguments that together ensure a well-foundedness argument. Closure certificates, functional analogs of transition invariants, provide an effective, automated approach to verify discrete-time dynamical systems against linear temporal logic and $\omega$-regular specifications. We build on this notion to synthesize controllers to ensure the satisfaction of $\omega$-regular specifications. To do so, we first illustrate how one may construct control closure certificates to visit a region infinitely often (or only finitely often) via disjunctive well-founded arguments. We then combine these arguments to provide an argument for parity specifications. Thus, finding an appropriate control closure certificate over the product of the system and a parity automaton specifying a desired $\omega$-regular specification ensures that there exists a controller $\kappa$ to enforce the $\omega$-regular specification. We propose a sum-of-squares optimization approach to synthesize such certificates and demonstrate their efficacy in designing controllers over some case studies.
comment: 28 pages, 4 figures, 6 Tables. To appear in International Symposium on Automated Technology for Verification and Analysis (ATVA), 2025
Moveless: Minimizing Overhead on QCCDs via Versatile Execution and Low Excess Shuttling
One of the most promising paths towards large scale fault tolerant quantum computation is the use of quantum error correcting stabilizer codes. Just like every other quantum circuit, these codes must be compiled to hardware in a way to minimize the total physical error introduced into the system, for example either due to high latency execution or excessive gates to meet connectivity limitations of the target hardware. However, unlike arbitrary quantum circuits, all syndrome extraction circuits have several common properties, for example they have a bipartite connectivity graph, consist only of commuting subcircuits, among other properties. For the most part, compilation methods have aimed at being generic, able to map any input circuit into executables on the hardware, and therefore cannot appropriately exploit these properties and result in executables which have higher physical error. In the case of modular trapped ion systems, specifically QCCDs, this corresponds to the insertion of excessive shuttling operations necessary to realize arbitrary qubit interactions. We propose a compilation scheme explicitly tailored for the structural regularity of QEC circuits based on several key observations: 1. only ancilla or data (but not both) should be shuttled, 2. stabilizers can be executed in any order meaning we can dynamically modify circuit execution on a per-cycle basis 3. ancilla are indistinguishable meaning any can be selected to begin a stabilizer measurement and retain a fixed-point mapping between cycles, and 4. QCCD hardware limits the number of parallel operations equal to the number traps in the system, meaning fewer ancilla are necessary and can be reused. Our resulting compiler, leads to QEC circuits which are on average 3.38x faster to execute, and lead to up to two orders of magnitude of improvement in logical error rates with realistic physical error rates.
comment: 12 pages, 14 figures, Accepted at IEEE QCE 2025, Will be presented on September 4, 2025
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 16 pages. First three listed authors have equal contributions
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control (LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations: (1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence, (2) a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and (3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Robust Sensor-Limited Control with Safe Input-Output Constraints for Hydraulic In-Wheel Motor Drive Mobility Systems
In-wheel drive (IWD) systems enhance the responsiveness, traction, and maintenance efficiency of vehicles by enabling each wheel to operate independently. This paper proposes a novel robust torque-observed valve-based control (RTOVC) framework to address velocity tracking in hydraulic IWDs that actuate heavy-duty wheeled mobile robots (HWMRs), considering such challenges as wheel slippages, sensor limitations, rough terrains, and modeling uncertainties. To overcome the sensor-dependent control systems associated with the closed-loop torque/pressure in hydraulic IWD-actuated HWMRs, a robust observer network based on an adaptive barrier Lyapunov function (BLF) is proposed to estimate the required in-wheel motor torque to track the velocity references. Then, another adaptive BLF for valve control signals is employed to modulate the hydraulic fluid to generate the estimated torque for each IWD. The RTOVC strategy ensures user-defined safety within the logarithmic BLF framework by constraining the valve control signal, actual velocity, velocity tracking error, and torque of each hydraulic IWD in an HWMR to avoid exceeding specified limits. Despite its safety constraints, external disturbances, and modeling uncertainties, robustness and uniformly exponential stability of the RTOVC-applied hydraulic IWD mechanism are ensured in HWMRs. Experimental investigations using a 6,500-kg HWMR, actuated by four independent IWDs under intense disturbances and safety-defined constraints, validate the performance of the RTOVC.
comment: This work has been submitted for possible publication in the Elsevier
Model Reference-Based Control with Guaranteed Predefined Performance for Uncertain Strict-Feedback Systems
To address the complexities posed by time- and state-varying uncertainties and the computation of analytic derivatives in strict-feedback form (SFF) systems, this study introduces a novel model reference-based control (MRBC) framework which applies locally to each subsystem (SS), to ensure output tracking performance within the specified transient and steady-state response criteria. This framework includes 1) novel homogeneous adaptive estimators (HAEs) designed to match the uncertain nonlinear SFF system to a reference model, enabling easier analysis and control design at the level, and 2) model-based homogeneous adaptive controllers enhanced by logarithmic barrier Lyapunov functions (HAC-BLFs), intended to control the reference model provided by HAEs in each SS, while ensuring the prescribed tracking responses under control amplitude saturation. The inherently robust MRBC achieves uniformly exponential stability using a generic stability connector term, which addresses dynamic interactions between the adjacent SSs. The parameter sensitivities of HAEs and HAC-BLFs in the MRBC framework are analyzed, focusing on the system's robustness and responsiveness. The proposed MRBC framework is experimentally validated through several scenarios involving an electromechanical linear actuator system with an uncertain SFF, subjected loading disturbance forces challenging 0-95% of its capacity.
Self-sustained oscillations in discrete-time relay feedback systems
We study the problem of determining self-sustained oscillations in discrete-time linear time-invariant relay feedback systems. Concretely, we are interested in predicting when such a system admits unimodal oscillations, i.e., when the output has a single-peaked period. Under the assumption that the linear system is stable and has an impulse response that is strictly monotonically decreasing on its infinite support, we take a novel approach in using the framework of total positivity to address our main question. It is shown that unimodal self-oscillations can only exist if the number of positive and negative elements in a period coincides. Based on this result, we derive conditions for the existence of such oscillations, determine bounds on their periods, and address the question of uniqueness.
comment: Update some figures by the comments from reviewers
Rethink Repeatable Measures of Robot Performance with Statistical Query
For a general standardized testing algorithm designed to evaluate a specific aspect of a robot's performance, several key expectations are commonly imposed. Beyond accuracy (i.e., closeness to a typically unknown ground-truth reference) and efficiency (i.e., feasibility within acceptable testing costs and equipment constraints), one particularly important attribute is repeatability. Repeatability refers to the ability to consistently obtain the same testing outcome when similar testing algorithms are executed on the same subject robot by different stakeholders, across different times or locations. However, achieving repeatable testing has become increasingly challenging as the components involved grow more complex, intelligent, diverse, and, most importantly, stochastic. While related efforts have addressed repeatability at ethical, hardware, and procedural levels, this study focuses specifically on repeatable testing at the algorithmic level. Specifically, we target the well-adopted class of testing algorithms in standardized evaluation: statistical query (SQ) algorithms (i.e., algorithms that estimate the expected value of a bounded function over a distribution using sampled data). We propose a lightweight, parameterized, and adaptive modification applicable to any SQ routine, whether based on Monte Carlo sampling, importance sampling, or adaptive importance sampling, that makes it provably repeatable, with guaranteed bounds on both accuracy and efficiency. We demonstrate the effectiveness of the proposed approach across three representative scenarios: (i) established and widely adopted standardized testing of manipulators, (ii) emerging intelligent testing algorithms for operational risk assessment in automated vehicles, and (iii) developing use cases involving command tracking performance evaluation of humanoid robots in locomotion tasks.
Semi-Data-Driven Model Predictive Control: A Physics-Informed Data-Driven Control Approach
Data-enabled predictive control (DeePC) has emerged as a powerful technique to control complex systems without the need for extensive modeling efforts. However, relying solely on offline collected data trajectories to represent the system dynamics introduces certain drawbacks. Therefore, we present a novel semi-data-driven model predictive control (SD-MPC) framework that combines (limited) model information with DeePC to address a range of these drawbacks, including sensitivity to noisy data and a lack of robustness. In this work, we focus on the performance of DeePC in operating regimes not captured by the offline collected data trajectories and demonstrate how incorporating an underlying parametric model can counteract this issue. SD-MPC exhibits equivalent closed-loop performance as DeePC for deterministic linear time-invariant systems. Simulations demonstrate the general control performance of the proposed SD-MPC for both a linear time-invariant system and a nonlinear system modeled as a linear parameter-varying system. These results provide numerical evidence of the enhanced robustness of SD-MPC over classical DeePC.
comment: 8 pages, 5 figures
A Minimax Optimal Controller for Positive Systems
We present an explicit solution to the discrete-time Bellman equation for minimax optimal control of positive systems under unconstrained disturbances. The primary contribution of our result relies on deducing a bound for the disturbance penalty, which characterizes the existence of a finite solution to the problem class. Moreover, this constraint on the disturbance penalty reveals that, in scenarios where a solution is feasible, the problem converges to its equivalent minimization problem in the absence of disturbances.
comment: Presented at the 26th International Symposium on Mathematical Theory of Networks and Systems (Cambridge, UK)
Differentially Private Distributed Nonconvex Stochastic Optimization with Quantized Communication
This paper proposes a new distributed nonconvex stochastic optimization algorithm that can achieve privacy protection, communication efficiency and convergence simultaneously. Specifically, each node adds general privacy noises to its local state to avoid information leakage, and then quantizes its noise-perturbed state before transmitting to improve communication efficiency. By using a subsampling method controlled through the sample-size parameter, the proposed algorithm reduces cumulative differential privacy parameters {\epsilon}, {\delta}, and thus enhances the differential privacy level, which is significantly different from the existing works. By using a two-time-scale step-sizes method, the mean square convergence for nonconvex cost functions is given. Furthermore, when the global cost function satisfies the Polyak-Lojasiewicz condition, the convergence rate and the oracle complexity of the proposed algorithm are given. In addition, the proposed algorithm achieves both the mean square convergence and finite cumulative differential privacy parameters {\epsilon}, {\delta} over infinite iterations as the sample-size goes to infinity. A numerical example of the distributed training on the "MNIST" dataset is given to show the effectiveness of the algorithm.
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer. The overview of our method and the hardware experiments are shown at https://youtu.be/akjGDgfwLbM?si=QVw6ExoPy2VsU2g6
Coded Kalman Filtering over MIMO Gaussian Channels with Feedback
We consider the problem of remotely stabilizing a dynamical system. A sensor (encoder) co-located with the system communicates with a controller (decoder), whose goal is to stabilize the system, over a noisy communication channel with feedback. To accomplish this, the controller must estimate the system state with finite mean squared error (MSE). The vector-valued dynamical system state follows a Gauss-Markov law with additive control. The channel is a multiple-input multiple-output (MIMO) additive white Gaussian noise (AWGN) channel with feedback. For such a source, a linear encoder, and a MIMO AWGN channel, the minimal MSE decoder is a Kalman filter. The parameters of the Kalman filter and the linear encoder can be jointly optimized, under a power constraint at the channel input. We term the resulting encoder-decoder pair a coded Kalman filter. We establish sufficient and necessary conditions for the coded Kalman filter to achieve a finite MSE in the real-time estimation of the state. For sufficiency, we introduce a coding scheme where each unstable mode of the state is estimated using the channel outputs of a single sub-channel. We prove a coinciding necessity condition when either the source or channel is scalar and present a matrix-algebraic condition which implies the condition is necessary in general. Finally, we provide a new counter-example demonstrating that linear codes are generally sub-optimal for coding over MIMO channels.
comment: Submitted for review at IEEE Transactions on Automatic Control
SINDyG: Sparse Identification of Nonlinear Dynamical Systems from Graph-Structured Data, with Applications to Stuart-Landau Oscillator Networks
The combination of machine learning (ML) and sparsity-promoting techniques is enabling direct extraction of governing equations from data, revolutionizing computational modeling in diverse fields of science and engineering. The discovered dynamical models could be used to address challenges in climate science, neuroscience, ecology, finance, epidemiology, and beyond. However, most existing sparse identification methods for discovering dynamical systems treat the whole system as one without considering the interactions between subsystems. As a result, such models are not able to capture small changes in the emergent system behavior. To address this issue, we developed a new method called Sparse Identification of Nonlinear Dynamical Systems from Graph-structured data (SINDyG), which incorporates the network structure into sparse regression to identify model parameters that explain the underlying network dynamics. We tested our proposed method using several case studies of neuronal dynamics, where we modeled the macroscopic oscillation of a population of neurons using the extended Stuart-Landau (SL) equation and utilize the SINDyG method to identify the underlying nonlinear dynamics. Our extensive computational experiments validate the improved accuracy and simplicity of discovered network dynamics when compared to the original SINDy approach. The proposed graph-informed penalty can be easily integrated with other symbolic regression algorithms, enhancing model interpretability and performance by incorporating network structure into the regression process.
Environmental Sound Classification on An Embedded Hardware Platform
Convolutional neural networks (CNNs) have exhibited state-of-the-art performance in various audio classification tasks. However, their real-time deployment remains a challenge on resource constrained devices such as embedded systems. In this paper, we analyze how the performance of large-scale pre-trained audio neural networks designed for audio pattern recognition changes when deployed on a hardware such as a Raspberry Pi. We empirically study the role of CPU temperature, microphone quality and audio signal volume on performance. Our experiments reveal that the continuous CPU usage results in an increased temperature that can trigger an automated slowdown mechanism in the Raspberry Pi, impacting inference latency. The quality of a microphone, specifically with affordable devices such as the Google AIY Voice Kit, and audio signal volume, all affect the system performance. In the course of our investigation, we encounter substantial complications linked to library compatibility and the unique processor architecture requirements of the Raspberry Pi, making the process less straightforward compared to conventional computers (PCs). Our observations, while presenting challenges, pave the way for future researchers to develop more compact machine learning models, design heat-dissipative hardware, and select appropriate microphones when AI models are deployed for real-time applications on edge devices.
comment: Accepted in INTER-NOISE and NOISE-CON Congress and Conference Proceedings, INTER-NOISE24, Nantes, France
Systems and Control (EESS)
Streaming Generated Gaussian Process Experts for Online Learning and Control
Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a \underline{s}treaming \underline{k}ernel-induced progressivel\underline{y} generated expert framework of \underline{G}aussian \underline{p}rocesses (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.
Improving Q-Learning for Real-World Control: A Case Study in Series Hybrid Agricultural Tractors
The variable and unpredictable load demands in hybrid agricultural tractors make it difficult to design optimal rule-based energy management strategies, motivating the use of adaptive, learning-based control. However, existing approaches often rely on basic fuel-based rewards and do not leverage expert demonstrations to accelerate training. In this paper, first, the performance of Q-value-based reinforcement learning algorithms is evaluated for powertrain control in a hybrid agricultural tractor. Three algorithms, Double Q-Learning (DQL), Deep Q-Networks (DQN), and Double DQN (DDQN), are compared in terms of convergence speed and policy optimality. Second, a piecewise domain-specific reward-shaping strategy is introduced to improve learning efficiency and steer agent behavior toward engine fuel-efficient operating regions. Third, the design of the experience replay buffer is examined, with a focus on the effects of seeding the buffer with expert demonstrations and analyzing how different types of expert policies influence convergence dynamics and final performance. Experimental results demonstrate that (1) DDQN achieves 70\% faster convergence than DQN in this application domain, (2) the proposed reward shaping method effectively biases the learned policy toward fuel-efficient outcomes, and (3) initializing the replay buffer with structured expert data leads to a 33\% improvement in convergence speed.
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
A Robust Cooperative Vehicle Coordination Framework for Intersection Crossing
Cooperative vehicle coordination at unsignalized intersections has garnered significant interest from both academia and industry in recent years, highlighting its notable advantages in improving traffic throughput and fuel efficiency. However, most existing studies oversimplify the coordination system, assuming accurate vehicle state information and ideal state update process. The oversights pose driving risks in the presence of state uncertainty and communication constraint. To address this gap, we propose a robust and comprehensive intersection coordination framework consisting of a robust cooperative trajectory planner and a context-aware status update scheduler. The trajectory planner directly controls the evolution of the trajectory distributions during frequent vehicle interactions, thereby offering probabilistic safety guarantees. To further align with coordination safety in practical bandwidth-limited conditions, we propose a context-aware status update scheduler that dynamically prioritizes the state updating order of vehicles based on their driving urgency. Simulation results validate the robustness and effectiveness of the proposed coordination framework, showing that the collision probability can be significantly reduced while maintaining comparable coordination efficiency to state-of-theart strategies. Moreover, our proposed framework demonstrates superior effectiveness in utilizing wireless resources in practical uncertain and bandwidth-limited conditions.
Grid-Forming Vector Current Control FRT Modes Under Symmetrical and Asymmetrical Faults
Recent research has shown that operating grid-connected converters using the grid-forming vector current control (GFVCC) scheme offers significant benefits, including the simplicity and modularity of the control architecture, as well as enabling a seamless transition from PLL-based grid-following control to grid-forming. An important aspect of any grid-connected converter control strategy is the handling of grid-fault scenarios such as symmetrical and asymmetrical short-circuit faults. This paper presents several fault ride-through (FRT) strategies for GFVCC that enable the converter to provide fault current and stay synchronized to the grid while respecting the converter hardware limitations and retaining grid-forming behavior. The converter control scheme is extended in a modular manner to include negative-sequence loops, and the proposed FRT strategies address both symmetrical and asymmetrical faults. The proposed FRT strategies are analyzed through case studies, including infinite-bus setups and multi-unit grids.
Live Demonstration: Neuromorphic Radar for Gesture Recognition ICASSP 2025
We present a neuromorphic radar framework for real-time, low-power hand gesture recognition (HGR) using an event-driven architecture inspired by biological sensing. Our system comprises a 24 GHz Doppler radar front-end and a custom neuromorphic sampler that converts intermediate-frequency (IF) signals into sparse spike-based representations via asynchronous sigma-delta encoding. These events are directly processed by a lightweight neural network deployed on a Cortex-M0 microcontroller, enabling low-latency inference without requiring spectrogram reconstruction. Unlike conventional radar HGR pipelines that continuously sample and process data, our architecture activates only when meaningful motion is detected, significantly reducing memory, power, and computation overhead. Evaluated on a dataset of five gestures collected from seven users, our system achieves > 85% real-time accuracy. To the best of our knowledge, this is the first work that employs bio-inspired asynchronous sigma-delta encoding and an event-driven processing framework for radar-based HGR.
comment: Neuromorphic Radar, Hand Gesture Recognition, Event-Driven, Sigma-Delta Encoding, Sparse Representation. Presented in ICASSP 2025 at Hyderabad, India
An Event-based State Estimation Approach for Positive Systems with Positive Observers
This article addresses the problem of state observer design for continuous-time linear positive networked systems. Considering the bandwidth constraint in the communication network, an event-measurement-based positive observer design is proposed. The physical interpretation of a positive observer differs from that of a general observer. Its primary goal is to ensure that all state estimates remain non-negative at all times. Using output measurements, a law with weighted sampling error is used to determine the sampling sequence between the system and the observer. The observer dynamics are designed using the standard Luenberger structure with the event-based sampled output information, which is updated only when an event occurs. Assuming observability and sufficient conditions for the positivity of the system, the asymptotic stability of the observer dynamics with sampled information is established. Sufficient conditions of stability and positivity are derived using linear matrix inequalities. Moreover, the design ensures that the event-based architecture is free from Zeno behavior, ensuring a positive minimum bound on the inter-execution time. In addition, numerical simulations on a three-tank system having variable cross-sections are used to demonstrate the efficacy of the proposed event-based positive observer.
comment: 16 pages, 7 figures and 3 tables
Filtering and 1/3 Power Law for Optimal Time Discretisation in Numerical Integration of Stochastic Differential Equations
This paper is concerned with the numerical integration of stochastic differential equations (SDEs) which govern diffusion processes driven by a standard Wiener process. With the latter being replaced by a sequence of increments at discrete moments of time, we revisit a filtering point of view on the approximate strong solution of the SDE as an estimate of the hidden system state whose conditional probability distribution is updated using a Bayesian approach and Brownian bridges over the intermediate time intervals. For a class of multivariable linear SDEs, where the numerical solution is organised as a Kalman filter, we investigate the fine-grid asymptotic behaviour of terminal and integral mean-square error functionals when the time discretisation is specified by a sufficiently smooth monotonic transformation of a uniform grid. This leads to constrained optimisation problems over the time discretisation profile, and their solutions reveal a 1/3 power law for the asymptotically optimal grid density functions. As a one-dimensional example, the results are illustrated for the Ornstein-Uhlenbeck process.
comment: 6 pages, submitted to IFAC World Congress 2026
Power System Voltage Stability Boundary: Computational Results and Applications
The objective of this paper is to report some computational results for the theory of DAE stability boundary, with the aim of advancing applications in power system voltage stability studies. Firstly, a new regularization transformation for standard differential-algebraic equations (DAEs) is proposed. Then the existence of anchor points on voltage stability boundary is examined, and an optimization method for computing the controlling pseudo-saddle is suggested. Subsequently, a local representation of the stable manifold of the pseudo-saddle on the stability boundary is presented, and a voltage stability margin expression is obtained. Finally, the proposed results are verified using several examples, demonstrating the accuracy and effectiveness of the suggested methods.
Aerobatic maneuvers in insect-scale flapping-wing aerial robots via deep-learned robust tube model predictive control
Aerial insects exhibit highly agile maneuvers such as sharp braking, saccades, and body flips under disturbance. In contrast, insect-scale aerial robots are limited to tracking non-aggressive trajectories with small body acceleration. This performance gap is contributed by a combination of low robot inertia, fast dynamics, uncertainty in flapping-wing aerodynamics, and high susceptibility to environmental disturbance. Executing highly dynamic maneuvers requires the generation of aggressive flight trajectories that push against the hardware limit and a high-rate feedback controller that accounts for model and environmental uncertainty. Here, through designing a deep-learned robust tube model predictive controller, we showcase insect-like flight agility and robustness in a 750-millgram flapping-wing robot. Our model predictive controller can track aggressive flight trajectories under disturbance. To achieve a high feedback rate in a compute-constrained real-time system, we design imitation learning methods to train a two-layer, fully connected neural network, which resembles insect flight control architecture consisting of central nervous system and motor neurons. Our robot demonstrates insect-like saccade movements with lateral speed and acceleration of 197 centimeters per second and 11.7 meters per second square, representing 447$\%$ and 255$\%$ improvement over prior results. The robot can also perform saccade maneuvers under 160 centimeters per second wind disturbance and large command-to-force mapping errors. Furthermore, it performs 10 consecutive body flips in 11 seconds - the most challenging maneuver among sub-gram flyers. These results represent a milestone in achieving insect-scale flight agility and inspire future investigations on sensing and compute autonomy.
comment: 27 pages, 26 supplementary pages, 6 main figures, 16 supplementary figures, 1 table
Integrating Upstream Supply Chains into Generation Expansion Planning
Rising electricity demand underscores the need for secure and reliable generation expansion planning that accounts for upstream supply chain constraints. Traditional models often overlook limitations in materials, manufacturing capacity, lead times for deployment, and field availability, which can delay availability of planned resources and thus to threaten system reliability. This paper introduces a multi-stage supply chain-constrained generation expansion planning (SC-GEP) model that optimizes long-term investments while capturing material availability, production limits, spatial and temporal constraints, and material reuse from retired assets. A decomposition algorithm efficiently solves the resulting MILP. A Maryland case study shows that supply chain constraints shift technology choices, amplify deployment delays caused by lead times, and prompt earlier investment in shorter lead-time, low-material-intensity options. In the low-demand scenario, supply chain constraints raise investment costs by $1.2 billion. Under high demand, persistent generation and reserve shortfalls emerge, underscoring the need to integrate upstream constraints into long-term planning.
comment: 10 pages, 9 figures
Quantum Hamiltonian Descent based Augmented Lagrangian Method for Constrained Nonconvex Nonlinear Optimization
Nonlinear programming (NLP) plays a critical role in domains such as power energy systems, chemical engineering, communication networks, and financial engineering. However, solving large-scale, nonconvex NLP problems remains a significant challenge due to the complexity of the solution landscape and the presence of nonlinear nonconvex constraints. In this paper, we develop a Quantum Hamiltonian Descent based Augmented Lagrange Method (QHD-ALM) framework to address largescale, constrained nonconvex NLP problems. The augmented Lagrange method (ALM) can convert a constrained NLP to an unconstrained NLP, which can be solved by using Quantum Hamiltonian Descent (QHD). To run the QHD on a classical machine, we propose to use the Simulated Bifurcation algorithm as the engine to simulate the dynamic process. We apply our algorithm to a Power-to-Hydrogen System, and the simulation results verify the effectiveness of our algorithm.
Control Closure Certificates
This paper introduces the notion of control closure certificates to synthesize controllers for discrete-time control systems against $\omega$-regular specifications. Typical functional approaches to synthesize controllers against $\omega$-regular specifications rely on combining inductive invariants (for example, via barrier certificates) with proofs of well-foundedness (for example, via ranking functions). Transition invariants, provide an alternative where instead of standard well-foundedness arguments one may instead search for disjunctive well-foundedness arguments that together ensure a well-foundedness argument. Closure certificates, functional analogs of transition invariants, provide an effective, automated approach to verify discrete-time dynamical systems against linear temporal logic and $\omega$-regular specifications. We build on this notion to synthesize controllers to ensure the satisfaction of $\omega$-regular specifications. To do so, we first illustrate how one may construct control closure certificates to visit a region infinitely often (or only finitely often) via disjunctive well-founded arguments. We then combine these arguments to provide an argument for parity specifications. Thus, finding an appropriate control closure certificate over the product of the system and a parity automaton specifying a desired $\omega$-regular specification ensures that there exists a controller $\kappa$ to enforce the $\omega$-regular specification. We propose a sum-of-squares optimization approach to synthesize such certificates and demonstrate their efficacy in designing controllers over some case studies.
comment: 28 pages, 4 figures, 6 Tables. To appear in International Symposium on Automated Technology for Verification and Analysis (ATVA), 2025
Moveless: Minimizing Overhead on QCCDs via Versatile Execution and Low Excess Shuttling
One of the most promising paths towards large scale fault tolerant quantum computation is the use of quantum error correcting stabilizer codes. Just like every other quantum circuit, these codes must be compiled to hardware in a way to minimize the total physical error introduced into the system, for example either due to high latency execution or excessive gates to meet connectivity limitations of the target hardware. However, unlike arbitrary quantum circuits, all syndrome extraction circuits have several common properties, for example they have a bipartite connectivity graph, consist only of commuting subcircuits, among other properties. For the most part, compilation methods have aimed at being generic, able to map any input circuit into executables on the hardware, and therefore cannot appropriately exploit these properties and result in executables which have higher physical error. In the case of modular trapped ion systems, specifically QCCDs, this corresponds to the insertion of excessive shuttling operations necessary to realize arbitrary qubit interactions. We propose a compilation scheme explicitly tailored for the structural regularity of QEC circuits based on several key observations: 1. only ancilla or data (but not both) should be shuttled, 2. stabilizers can be executed in any order meaning we can dynamically modify circuit execution on a per-cycle basis 3. ancilla are indistinguishable meaning any can be selected to begin a stabilizer measurement and retain a fixed-point mapping between cycles, and 4. QCCD hardware limits the number of parallel operations equal to the number traps in the system, meaning fewer ancilla are necessary and can be reused. Our resulting compiler, leads to QEC circuits which are on average 3.38x faster to execute, and lead to up to two orders of magnitude of improvement in logical error rates with realistic physical error rates.
comment: 12 pages, 14 figures, Accepted at IEEE QCE 2025, Will be presented on September 4, 2025
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 16 pages. First three listed authors have equal contributions
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control (LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations: (1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence, (2) a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and (3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Robust Sensor-Limited Control with Safe Input-Output Constraints for Hydraulic In-Wheel Motor Drive Mobility Systems
In-wheel drive (IWD) systems enhance the responsiveness, traction, and maintenance efficiency of vehicles by enabling each wheel to operate independently. This paper proposes a novel robust torque-observed valve-based control (RTOVC) framework to address velocity tracking in hydraulic IWDs that actuate heavy-duty wheeled mobile robots (HWMRs), considering such challenges as wheel slippages, sensor limitations, rough terrains, and modeling uncertainties. To overcome the sensor-dependent control systems associated with the closed-loop torque/pressure in hydraulic IWD-actuated HWMRs, a robust observer network based on an adaptive barrier Lyapunov function (BLF) is proposed to estimate the required in-wheel motor torque to track the velocity references. Then, another adaptive BLF for valve control signals is employed to modulate the hydraulic fluid to generate the estimated torque for each IWD. The RTOVC strategy ensures user-defined safety within the logarithmic BLF framework by constraining the valve control signal, actual velocity, velocity tracking error, and torque of each hydraulic IWD in an HWMR to avoid exceeding specified limits. Despite its safety constraints, external disturbances, and modeling uncertainties, robustness and uniformly exponential stability of the RTOVC-applied hydraulic IWD mechanism are ensured in HWMRs. Experimental investigations using a 6,500-kg HWMR, actuated by four independent IWDs under intense disturbances and safety-defined constraints, validate the performance of the RTOVC.
comment: This work has been submitted for possible publication in the Elsevier
Model Reference-Based Control with Guaranteed Predefined Performance for Uncertain Strict-Feedback Systems
To address the complexities posed by time- and state-varying uncertainties and the computation of analytic derivatives in strict-feedback form (SFF) systems, this study introduces a novel model reference-based control (MRBC) framework which applies locally to each subsystem (SS), to ensure output tracking performance within the specified transient and steady-state response criteria. This framework includes 1) novel homogeneous adaptive estimators (HAEs) designed to match the uncertain nonlinear SFF system to a reference model, enabling easier analysis and control design at the level, and 2) model-based homogeneous adaptive controllers enhanced by logarithmic barrier Lyapunov functions (HAC-BLFs), intended to control the reference model provided by HAEs in each SS, while ensuring the prescribed tracking responses under control amplitude saturation. The inherently robust MRBC achieves uniformly exponential stability using a generic stability connector term, which addresses dynamic interactions between the adjacent SSs. The parameter sensitivities of HAEs and HAC-BLFs in the MRBC framework are analyzed, focusing on the system's robustness and responsiveness. The proposed MRBC framework is experimentally validated through several scenarios involving an electromechanical linear actuator system with an uncertain SFF, subjected loading disturbance forces challenging 0-95% of its capacity.
Self-sustained oscillations in discrete-time relay feedback systems
We study the problem of determining self-sustained oscillations in discrete-time linear time-invariant relay feedback systems. Concretely, we are interested in predicting when such a system admits unimodal oscillations, i.e., when the output has a single-peaked period. Under the assumption that the linear system is stable and has an impulse response that is strictly monotonically decreasing on its infinite support, we take a novel approach in using the framework of total positivity to address our main question. It is shown that unimodal self-oscillations can only exist if the number of positive and negative elements in a period coincides. Based on this result, we derive conditions for the existence of such oscillations, determine bounds on their periods, and address the question of uniqueness.
comment: Update some figures by the comments from reviewers
Rethink Repeatable Measures of Robot Performance with Statistical Query
For a general standardized testing algorithm designed to evaluate a specific aspect of a robot's performance, several key expectations are commonly imposed. Beyond accuracy (i.e., closeness to a typically unknown ground-truth reference) and efficiency (i.e., feasibility within acceptable testing costs and equipment constraints), one particularly important attribute is repeatability. Repeatability refers to the ability to consistently obtain the same testing outcome when similar testing algorithms are executed on the same subject robot by different stakeholders, across different times or locations. However, achieving repeatable testing has become increasingly challenging as the components involved grow more complex, intelligent, diverse, and, most importantly, stochastic. While related efforts have addressed repeatability at ethical, hardware, and procedural levels, this study focuses specifically on repeatable testing at the algorithmic level. Specifically, we target the well-adopted class of testing algorithms in standardized evaluation: statistical query (SQ) algorithms (i.e., algorithms that estimate the expected value of a bounded function over a distribution using sampled data). We propose a lightweight, parameterized, and adaptive modification applicable to any SQ routine, whether based on Monte Carlo sampling, importance sampling, or adaptive importance sampling, that makes it provably repeatable, with guaranteed bounds on both accuracy and efficiency. We demonstrate the effectiveness of the proposed approach across three representative scenarios: (i) established and widely adopted standardized testing of manipulators, (ii) emerging intelligent testing algorithms for operational risk assessment in automated vehicles, and (iii) developing use cases involving command tracking performance evaluation of humanoid robots in locomotion tasks.
Semi-Data-Driven Model Predictive Control: A Physics-Informed Data-Driven Control Approach
Data-enabled predictive control (DeePC) has emerged as a powerful technique to control complex systems without the need for extensive modeling efforts. However, relying solely on offline collected data trajectories to represent the system dynamics introduces certain drawbacks. Therefore, we present a novel semi-data-driven model predictive control (SD-MPC) framework that combines (limited) model information with DeePC to address a range of these drawbacks, including sensitivity to noisy data and a lack of robustness. In this work, we focus on the performance of DeePC in operating regimes not captured by the offline collected data trajectories and demonstrate how incorporating an underlying parametric model can counteract this issue. SD-MPC exhibits equivalent closed-loop performance as DeePC for deterministic linear time-invariant systems. Simulations demonstrate the general control performance of the proposed SD-MPC for both a linear time-invariant system and a nonlinear system modeled as a linear parameter-varying system. These results provide numerical evidence of the enhanced robustness of SD-MPC over classical DeePC.
comment: 8 pages, 5 figures
A Minimax Optimal Controller for Positive Systems
We present an explicit solution to the discrete-time Bellman equation for minimax optimal control of positive systems under unconstrained disturbances. The primary contribution of our result relies on deducing a bound for the disturbance penalty, which characterizes the existence of a finite solution to the problem class. Moreover, this constraint on the disturbance penalty reveals that, in scenarios where a solution is feasible, the problem converges to its equivalent minimization problem in the absence of disturbances.
comment: Presented at the 26th International Symposium on Mathematical Theory of Networks and Systems (Cambridge, UK)
Differentially Private Distributed Nonconvex Stochastic Optimization with Quantized Communication
This paper proposes a new distributed nonconvex stochastic optimization algorithm that can achieve privacy protection, communication efficiency and convergence simultaneously. Specifically, each node adds general privacy noises to its local state to avoid information leakage, and then quantizes its noise-perturbed state before transmitting to improve communication efficiency. By using a subsampling method controlled through the sample-size parameter, the proposed algorithm reduces cumulative differential privacy parameters {\epsilon}, {\delta}, and thus enhances the differential privacy level, which is significantly different from the existing works. By using a two-time-scale step-sizes method, the mean square convergence for nonconvex cost functions is given. Furthermore, when the global cost function satisfies the Polyak-Lojasiewicz condition, the convergence rate and the oracle complexity of the proposed algorithm are given. In addition, the proposed algorithm achieves both the mean square convergence and finite cumulative differential privacy parameters {\epsilon}, {\delta} over infinite iterations as the sample-size goes to infinity. A numerical example of the distributed training on the "MNIST" dataset is given to show the effectiveness of the algorithm.
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer. The overview of our method and the hardware experiments are shown at https://youtu.be/akjGDgfwLbM?si=QVw6ExoPy2VsU2g6
Coded Kalman Filtering over MIMO Gaussian Channels with Feedback
We consider the problem of remotely stabilizing a dynamical system. A sensor (encoder) co-located with the system communicates with a controller (decoder), whose goal is to stabilize the system, over a noisy communication channel with feedback. To accomplish this, the controller must estimate the system state with finite mean squared error (MSE). The vector-valued dynamical system state follows a Gauss-Markov law with additive control. The channel is a multiple-input multiple-output (MIMO) additive white Gaussian noise (AWGN) channel with feedback. For such a source, a linear encoder, and a MIMO AWGN channel, the minimal MSE decoder is a Kalman filter. The parameters of the Kalman filter and the linear encoder can be jointly optimized, under a power constraint at the channel input. We term the resulting encoder-decoder pair a coded Kalman filter. We establish sufficient and necessary conditions for the coded Kalman filter to achieve a finite MSE in the real-time estimation of the state. For sufficiency, we introduce a coding scheme where each unstable mode of the state is estimated using the channel outputs of a single sub-channel. We prove a coinciding necessity condition when either the source or channel is scalar and present a matrix-algebraic condition which implies the condition is necessary in general. Finally, we provide a new counter-example demonstrating that linear codes are generally sub-optimal for coding over MIMO channels.
comment: Submitted for review at IEEE Transactions on Automatic Control
SINDyG: Sparse Identification of Nonlinear Dynamical Systems from Graph-Structured Data, with Applications to Stuart-Landau Oscillator Networks
The combination of machine learning (ML) and sparsity-promoting techniques is enabling direct extraction of governing equations from data, revolutionizing computational modeling in diverse fields of science and engineering. The discovered dynamical models could be used to address challenges in climate science, neuroscience, ecology, finance, epidemiology, and beyond. However, most existing sparse identification methods for discovering dynamical systems treat the whole system as one without considering the interactions between subsystems. As a result, such models are not able to capture small changes in the emergent system behavior. To address this issue, we developed a new method called Sparse Identification of Nonlinear Dynamical Systems from Graph-structured data (SINDyG), which incorporates the network structure into sparse regression to identify model parameters that explain the underlying network dynamics. We tested our proposed method using several case studies of neuronal dynamics, where we modeled the macroscopic oscillation of a population of neurons using the extended Stuart-Landau (SL) equation and utilize the SINDyG method to identify the underlying nonlinear dynamics. Our extensive computational experiments validate the improved accuracy and simplicity of discovered network dynamics when compared to the original SINDy approach. The proposed graph-informed penalty can be easily integrated with other symbolic regression algorithms, enhancing model interpretability and performance by incorporating network structure into the regression process.
Environmental Sound Classification on An Embedded Hardware Platform
Convolutional neural networks (CNNs) have exhibited state-of-the-art performance in various audio classification tasks. However, their real-time deployment remains a challenge on resource constrained devices such as embedded systems. In this paper, we analyze how the performance of large-scale pre-trained audio neural networks designed for audio pattern recognition changes when deployed on a hardware such as a Raspberry Pi. We empirically study the role of CPU temperature, microphone quality and audio signal volume on performance. Our experiments reveal that the continuous CPU usage results in an increased temperature that can trigger an automated slowdown mechanism in the Raspberry Pi, impacting inference latency. The quality of a microphone, specifically with affordable devices such as the Google AIY Voice Kit, and audio signal volume, all affect the system performance. In the course of our investigation, we encounter substantial complications linked to library compatibility and the unique processor architecture requirements of the Raspberry Pi, making the process less straightforward compared to conventional computers (PCs). Our observations, while presenting challenges, pave the way for future researchers to develop more compact machine learning models, design heat-dissipative hardware, and select appropriate microphones when AI models are deployed for real-time applications on edge devices.
comment: Accepted in INTER-NOISE and NOISE-CON Congress and Conference Proceedings, INTER-NOISE24, Nantes, France
Systems and Control (CS)
Hierarchical Learning-Based Control for Multi-Agent Shepherding of Stochastic Autonomous Agents
Multi-agent shepherding represents a challenging distributed control problem where herder agents must coordinate to guide independently moving targets to desired spatial configurations. Most existing control strategies assume cohesive target behavior, which frequently fails in practical applications where targets exhibit stochastic autonomous behavior. This paper presents a hierarchical learning-based control architecture that decomposes the shepherding problem into a high-level decision-making module and a low-level motion control component. The proposed distributed control system synthesizes effective control policies directly from closed-loop experience without requiring explicit inter-agent communication or prior knowledge of target dynamics. The decentralized architecture achieves cooperative control behavior through emergent coordination without centralized supervision. Experimental validation demonstrates superior closed-loop performance compared to state-of-the-art heuristic control methods, achieving 100\% success rates with improved settling times and control efficiency. The control architecture scales beyond its design conditions, adapts to time-varying goal regions, and demonstrates practical implementation feasibility through real-time experiments on the Robotarium platform.
Tensor Dynamic Mode Decomposition
Dynamic mode decomposition (DMD) has become a powerful data-driven method for analyzing the spatiotemporal dynamics of complex, high-dimensional systems. However, conventional DMD methods are limited to matrix-based formulations, which might be inefficient or inadequate for modeling inherently multidimensional data including images, videos, and higher-order networks. In this letter, we propose tensor dynamic mode decomposition (TDMD), a novel extension of DMD to third-order tensors based on the recently developed T-product framework. By incorporating tensor factorization techniques, TDMD achieves more efficient computation and better preservation of spatial and temporal structures in multiway data for tasks such as state reconstruction and dynamic component separation, compared to standard DMD with data flattening. We demonstrate the effectiveness of TDMD on both synthetic and real-world datasets.
comment: 6 pages, 4 figures, 1 table
Periodic robust robotic rock chop via virtual model control
Robotic cutting is a challenging contact-rich manipulation task where the robot must simultaneously negotiate unknown object mechanics, large contact forces, and precise motion requirements. We introduce a new virtual-model control scheme that enables knife rocking motion for robot manipulators, without pre-planned trajectories or precise information of the environment. Motion is generated through interconnection with virtual mechanisms, given by virtual springs, dampers, and masses arranged in a suitable way. Through analysis and experiments, we demonstrate that the controlled robot behavior settles into a periodic motion. Experiments with a Franka manipulator demonstrate robust cuts with five different vegetables, and sub-millimeter slice accuracy from 1 mm to 6 mm at nearly one cut per second. The same controller survives changes in knife shape and cutting board height, and adaptation to a different humanoid manipulator, demonstrating robustness and platform independence.
Causality and Interpretability for Electrical Distribution System faults
Causal analysis helps us understand variables that are responsible for system failures. This improves fault detection and makes system more reliable. In this work, we present a new method that combines causal inference with machine learning to classify faults in electrical distribution systems (EDS) using graph-based models. We first build causal graphs using transfer entropy (TE). Each fault case is represented as a graph, where the nodes are features such as voltage and current, and the edges demonstrate how these features influence each other. Then, the graphs are classified using machine learning and GraphSAGE where the model learns from both the node values and the structure of the graph to predict the type of fault. To make the predictions understandable, we further developed an integrated approach using GNNExplainer and Captums Integrated Gradients to highlight the nodes (features) that influences the most on the final prediction. This gives us clear insights into the possible causes of the fault. Our experiments show high accuracy: 99.44% on the EDS fault dataset, which is better than state of art models. By combining causal graphs with machine learning, our method not only predicts faults accurately but also helps understand their root causes. This makes it a strong and practical tool for improving system reliability.
Uncertainty-Aware Perception-Based Control for Autonomous Racing
Autonomous systems operating in unknown environments often rely heavily on visual sensor data, yet making safe and informed control decisions based on these measurements remains a significant challenge. To facilitate the integration of perception and control in autonomous vehicles, we propose a novel perception-based control approach that incorporates road estimation, quantification of its uncertainty, and uncertainty-aware control based on this estimate. At the core of our method is a parametric road curvature model, optimized using visual measurements of the road through a constrained nonlinear optimization problem. This process ensures adherence to constraints on both model parameters and curvature. By leveraging the Frenet frame formulation, we embed the estimated track curvature into the system dynamics, allowing the controller to explicitly account for perception uncertainty and enhancing robustness to estimation errors based on visual input. We validate our approach in a simulated environment, using a high-fidelity 3D rendering engine, and demonstrate its effectiveness in achieving reliable and uncertainty-aware control for autonomous racing.
Computationally efficient Gauss-Newton reinforcement learning for model predictive control
Model predictive control (MPC) is widely used in process control due to its interpretability and ability to handle constraints. As a parametric policy in reinforcement learning (RL), MPC offers strong initial performance and low data requirements compared to black-box policies like neural networks. However, most RL methods rely on first-order updates, which scale well to large parameter spaces but converge at most linearly, making them inefficient when each policy update requires solving an optimal control problem, as is the case with MPC. While MPC policies are typically sparsely parameterized and thus amenable to second-order approaches, existing second-order methods demand second-order policy derivatives, which can be computationally and memory-wise intractable. This work introduces a Gauss-Newton approximation of the deterministic policy Hessian that eliminates the need for second-order policy derivatives, enabling superlinear convergence with minimal computational overhead. To further improve robustness, we propose a momentum-based Hessian averaging scheme for stable training under noisy estimates. We demonstrate the effectiveness of the approach on a nonlinear continuously stirred tank reactor (CSTR), showing faster convergence and improved data efficiency over state-of-the-art first-order methods.
comment: 14 pages, 8 figures, submitted to Elsevier
Equivalence of Koopman Eigenfunctions and a Commuting Local Frame of Symmetries
This article establishes the relationship between the Koopman eigenfunctions of a first-order nonlinear ODE system and its symmetries. Specifically, a bijective mapping is obtained between i) a commuting local frame of symmetry generators, and ii) a set of Koopman eigenfunctions whose gradients form a local frame. This equivalence is used to convert the problem of finding Koopman eigenfunctions into the equivalent problem of finding commuting vector fields, which are readily given by the flow map of the system. The implication for stability is also studied in terms of a contraction metric that is associated with the commuting local frame of symmetries. The theoretical result is verified with the van der Pol oscillator whose eigenfunctions are computed to the precision of finite numerical integration.
comment: 6 pages, 1 figure
Data-Driven Adaptive Second-Order Sliding Mode Control with Noisy Data
This paper offers a data-driven approach for designing adaptive suboptimal second-order sliding mode (ASSOSM) controllers for single-input nonlinear systems, characterized by perturbed strict-feedback structures with unknown dynamics. The proposed approach is recursive, in which the system dynamics are first decomposed into two parts, referred to as the upper and lower dynamics. The control design task is then divided into two stages, that is, designing a virtual controller for the upper dynamics, followed by synthesizing the actual controller for the full-order system. To this end, we start by collecting noisy data from the system through a finite-time experiment, referred to as a single trajectory. We then formulate a data-dependent condition as a semidefinite program, whose feasibility enables the design of a virtual controller that ensures global asymptotic stability of the origin for the upper dynamics. Building upon this virtual controller, we subsequently propose a data-driven sliding variable that facilitates the design of an ASSOSM controller for the unknown full-order system. This controller guarantees semi-global asymptotic stability of the origin in the presence of disturbances. Specifically, for any prescribed bounded set--no matter how large--the controller's design parameters can be chosen to ensure asymptotic stability of the origin. The effectiveness of the proposed method is demonstrated through three case studies, reflecting different aspects of the approach.
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
Distributed Non-Uniform Scaling Control of Multi-Agent Formation via Matrix-Valued Constraints
Distributed formation maneuver control refers to the problem of maneuvering a group of agents to change their formation shape by adjusting the motions of partial agents, where the controller of each agent only requires local information measured from its neighbors. Although this problem has been extensively investigated, existing approaches are mostly limited to uniform scaling transformations. This article proposes a new type of local matrix-valued constraints, via which non-uniform scaling control of position formation can be achieved by tuning the positions of only two agents (i.e., leaders). Here, the non-uniform scaling transformation refers to scaling the position formation with different ratios along different orthogonal coordinate directions. Moreover, by defining scaling and translation of attitude formation, we propose a distributed control scheme for scaling and translation maneuver control of joint position-attitude formations. It is proven that the proposed controller achieves global convergence, provided that the sensing graph among agents is a 2-rooted bidirectional graph. Compared with the affine formation maneuver control approach, the proposed approach leverages a sparser sensing graph, requires fewer leaders, and additionally enables scaling transformations of the attitude formation. A simulation example is proposed to demonstrate our theoretical results.
Distributed Constraint-coupled Resource Allocation: Anytime Feasibility and Violation Robustness
This paper considers distributed resource allocation problems (DRAPs) with a coupled constraint for real-time systems. Based on primal-dual methods, we adopt a control perspective for optimization algorithm design by synthesizing a safe feedback controller using control barrier functions to enforce constraint satisfaction. On this basis, a distributed anytime-feasible resource allocation (DanyRA) algorithm is proposed. It is shown that DanyRA algorithm converges to the exact optimal solution of DRAPs while ensuring feasibility of the coupled inequality constraint at all time steps. Considering constraint violation arises from potential external interferences, a virtual queue with minimum buffer is incorporated to restore the constraint satisfaction before the pre-defined deadlines. We characterize the trade-off between convergence accuracy and violation robustness for maintaining or recovering feasibility. DanyRA algorithm is further extended to address DRAPs with a coupled equality constraint, and its linear convergence rate is theoretically established. Finally, a numerical example is provided for verification.
Centralized Dynamic State Estimation Algorithm for Detecting and Distinguishing Faults and Cyber Attacks in Power Systems
As power systems evolve with increased integration of renewable energy sources, they become more complex and vulnerable to both cyber and physical threats. This study validates a centralized Dynamic State Estimation (DSE) algorithm designed to enhance the protection of power systems, particularly focusing on microgrids with substantial renewable energy integration. The algorithm utilizing a structured hypothesis testing framework, systematically identifies and differentiates anomalies caused by cyberattacks from those resulting from physical faults. This algorithm was evaluated through four case studies: a False Data Injection Attack (FDIA) via manipulation of Current Transformer (CT) ratios, a single line-to-ground (SLG) fault, and two combined scenarios involving both anomalies. Results from real-time simulations demonstrate that the algorithm effectively distinguishes between cyber-induced anomalies and physical faults, thereby significantly enhancing the reliability and security of energy systems. This research underscores the critical role of advanced diagnostic tools in protecting power systems against the growing prevalence of cyber-physical threats, enhancing the resilience of the grid and preventing potential blackouts by avoiding the mis-operation of protection relays.
Supervisory Control of Discrete Event Systems for Small Language Under Cyber Attacks
Cyber attacks are unavoidable in networked discrete event systems where the plant and the supervisor communicate with each other via networks. Because of the nondeterminism in observation and control caused by cyber attacks, the language generated by the supervised system becomes nondeterministic. The small language is defined as the lower bound on all possible languages that can be generated by the supervised system, which is needed for a supervised system to perform some required tasks under cyber attacks. In this paper, we investigate supervisory control for the small language. After introducing CA-S-controllability and CA-S-observability, we prove that the supervisory control problem of achieving a required small language is solvable if and only if the given language is CA-Scontrollable and CA-S-observable. If the given language is not CA-S controllable and/or CA-S-observable, we derive conditions under which the infimal CA-S-controllable and CA-S-observable superlanguage exists and can be used to design a supervisor satisfying the given requirement.
comment: It is submitted to IEEE Transactions on Automatic Control. The status is under review
A Comparative Study of Optimal Control and Neural Networks in Asteroid Rendezvous Mission Analysis
This paper presents a comparative study of the applicability and accuracy of optimal control methods and neural network-based estimators in the context of porkchop plots for preliminary asteroid rendezvous mission design. The scenario considered involves a deep-space CubeSat equipped with a low-thrust engine, departing from Earth and rendezvousing with a near-Earth asteroid within a three-year launch window. A low-thrust trajectory optimization model is formulated, incorporating variable specific impulse, maximum thrust, and path constraints. The optimal control problem is efficiently solved using Sequential Convex Programming (SCP) combined with a solution continuation strategy. The neural network framework consists of two models: one predicts the minimum fuel consumption ($\Delta v$), while the other estimates the minimum flight time ($\Delta t$) which is used to assess transfer feasibility. Case results demonstrate that, in simplified scenarios without path constraints, the neural network approach achieves low relative errors across most of the design space and successfully captures the main structural features of the porkchop plots. In cases where the SCP-based continuation method fails due to the presence of multiple local optima, the neural network still provides smooth and globally consistent predictions, significantly improving the efficiency of early-stage asteroid candidate screening. However, the deformation of the feasible region caused by path constraints leads to noticeable discrepancies in certain boundary regions, thereby limiting the applicability of the network in detailed mission design phases. Overall, the integration of neural networks with porkchop plot analysis offers a effective decision-making tool for mission designers and planetary scientists, with significant potential for engineering applications.
Engineered over Emergent Communication in MARL for Scalable and Sample-Efficient Cooperative Task Allocation in a Partially Observable Grid
We compare the efficacy of learned versus engineered communication strategies in a cooperative multi-agent reinforcement learning (MARL) environment. For the learned approach, we introduce Learned Direct Communication (LDC), where agents generate messages and actions concurrently via a neural network. Our engineered approach, Intention Communication, employs an Imagined Trajectory Generation Module (ITGM) and a Message Generation Network (MGN) to formulate messages based on predicted future states. Both strategies are evaluated on their success rates in cooperative tasks under fully and partially observable conditions. Our findings indicate that while emergent communication is viable, the engineered approach demonstrates superior performance and scalability, particularly as environmental complexity increases.
Neural Approximators for Low-Thrust Trajectory Transfer Cost and Reachability
In trajectory design, fuel consumption and trajectory reachability are two key performance indicators for low-thrust missions. This paper proposes general-purpose pretrained neural networks to predict these metrics. The contributions of this paper are as follows: Firstly, based on the confirmation of the Scaling Law applicable to low-thrust trajectory approximation, the largest dataset is constructed using the proposed homotopy ray method, which aligns with mission-design-oriented data requirements. Secondly, the data are transformed into a self-similar space, enabling the neural network to adapt to arbitrary semi-major axes, inclinations, and central bodies. This extends the applicability beyond existing studies and can generalize across diverse mission scenarios without retraining. Thirdly, to the best of our knowledge, this work presents the current most general and accurate low-thrust trajectory approximator, with implementations available in C++, Python, and MATLAB. The resulting neural network achieves a relative error of 0.78% in predicting velocity increments and 0.63% in minimum transfer time estimation. The models have also been validated on a third-party dataset, multi-flyby mission design problem, and mission analysis scenario, demonstrating their generalization capability, predictive accuracy, and computational efficiency.
Modeling and Simulation of an Active Quarter Car Suspension with a Robust LQR Controller under Road Disturbance and Parameter Uncertainty
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system in-creases the road holding of vehicles, allows them to take turns safely and reduces the risk of traffic accidents. Passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost and easy adjustment of coefficients. In this study, a more robust LQR controlled active suspension was designed than passive sus-pension and PID controlled active suspension. Robustness analyses were performed for passive suspension, PID controlled active suspension and LQR controlled active sus-pension. Suspension travel, sprung mass acceleration and sprung mass motion simulations were performed for all 3 suspensions under road disturbance and under simultaneous road disturbance and parameter uncertainty. A comparative analysis was performed by obtaining the suspension rise time, overshoot and settling time data. It was observed that the LQR controlled active suspension showed the least overshoot and had the shortest settling time. In this case, it was proven that the LQR controlled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 16 pages, 13 figures
Global Optimality in Multi-Flyby Asteroid Trajectory Optimization: Theory and Application Techniques
Designing optimal trajectories for multi-flyby asteroid missions is scientifically critical but technically challenging due to nonlinear dynamics, intermediate constraints, and numerous local optima. This paper establishes a method that approaches global optimality for multi-flyby trajectory optimization under a given sequence. The original optimal control problem with interior-point equality constraints is transformed into a multi-stage decision formulation. This reformulation enables direct application of dynamic programming in lower dimensions, and follows Bellman's principle of optimality. Moreover, the method provides a quantifiable bound on global optima errors introduced by discretization and approximation assumptions, thus ensuring a measure of confidence in the obtained solution. The method accommodates both impulsive and low-thrust maneuver schemes in rendezvous and flyby scenarios. Several computational techniques are introduced to enhance efficiency, including a specialized solution for bi-impulse cases and an adaptive step refinement strategy. The proposed method is validated through three problems: 1) an impulsive variant of the fourth Global Trajectory Optimization competition problem (GTOC4), 2) the GTOC11 problem, and 3) the original low-thrust GTOC4 problem. Each case demonstrates improvements in fuel consumption over the best-known trajectories. These results give evidence of the generality and effectiveness of the proposed method in global trajectory optimization.
Optimal control driven functional electrical stimulation: A scoping review
Introduction: Rehabilitation after a neurological impairment can be supported by functional electrical stimulation (FES). However, FES is limited by early muscle fatigue, slowing down the recovery progress. The use of optimal control to reduce overstimulation and improve motion precision is gaining interest. This review aims to map the current literature state meanwhile clarifying the best practices, identifying persistent challenges, and outlining directions for future research. Methods: Following the PRISMA guidelines, a search was conducted up to February 2024 using the combined keywords "FES", "optimal control" or "fatigue" across five databases (Medline, Embase, CINAHL Complete, Web of Science, and ProQuest Dissertations & Theses Citation Index). Inclusion criteria included the use of optimal control with FES for healthy individuals and those with neuromuscular disorders. Results: Among the 44 included studies, half were in silico and half in vivo, involving 87 participants, predominantly healthy young men. Twelve different motor tasks were investigated, with a focus on single joint lower limb movements. These studies principally used simple FES models, modulating pulse width or intensity to track joint angle. Conclusions: Optimal control-driven FES can deliver precise motions and reduce fatigue. Yet clinical adoption is slowed down by the lack of consensus about modelling, inconvenient model identification protocol and limited validation. Additional barriers include insufficient open-science practices, computational performance reporting and the availability of customizable commercial hardware. Comparative FES model studies and longitudinal trials with large cohorts, among other efforts, are required to improve the technology readiness level. Such advances would help clinical adoption and improve patient outcomes.
comment: 37 pages, 7 figures, 3 tables
Physics-Embedded Neural ODEs for Sim2Real Edge Digital Twins of Hybrid Power Electronics Systems
Edge Digital Twins (EDTs) are crucial for monitoring and control of Power Electronics Systems (PES). However, existing modeling approaches struggle to consistently capture continuously evolving hybrid dynamics that are inherent in PES, degrading Sim-to-Real generalization on resource-constrained edge devices. To address these challenges, this paper proposes a Physics-Embedded Neural ODEs (PENODE) that (i) embeds the hybrid operating mechanism as an event automaton to explicitly govern discrete switching and (ii) injects known governing ODE components directly into the neural parameterization of unmodeled dynamics. This unified design yields a differentiable end-to-end trainable architecture that preserves physical interpretability while reducing redundancy, and it supports a cloud-to-edge toolchain for efficient FPGA deployment. Experimental results demonstrate that PENODE achieves significantly higher accuracy in benchmarks in white-box, gray-box, and black-box scenarios, with a 75% reduction in neuron count, validating that the proposed PENODE maintains physical interpretability, efficient edge deployment, and real-time control enhancement.
Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
comment: 6 pages, 6 figures. Accepted for presentation at the 61st Allerton Conference on Communication, Control, and Computing
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Integrating Machine Learning with Multimodal Monitoring System Utilizing Acoustic and Vision Sensing to Evaluate Geometric Variations in Laser Directed Energy Deposition
Laser directed energy deposition (DED) additive manufacturing struggles with consistent part quality due to complex melt pool dynamics and process variations. While much research targets defect detection, little work has validated process monitoring systems for evaluating melt pool dynamics and process quality. This study presents a novel multimodal monitoring framework, synergistically integrating contact-based acoustic emission (AE) sensing with coaxial camera vision to enable layer-wise identification and evaluation of geometric variations in DED parts. The experimental study used three part configurations: a baseline part without holes, a part with a 3mm diameter through-hole, and one with a 5mm through-hole to test the system's discerning capabilities. Raw sensor data was preprocessed: acoustic signals were filtered for time-domain and frequency-domain feature extraction, while camera data underwent melt pool segmentation and morphological feature extraction. Multiple machine learning algorithms (including SVM, random forest, and XGBoost) were evaluated to find the optimal model for classifying layer-wise geometric variations. The integrated multimodal strategy achieved a superior classification performance of 94.4%, compared to 87.8% for AE only and 86.7% for the camera only. Validation confirmed the integrated system effectively captures both structural vibration signatures and surface morphological changes tied to the geometric variations. While this study focuses on specific geometries, the demonstrated capability to discriminate between features establishes a technical foundation for future applications in characterizing part variations like geometric inaccuracies and manufacturing-induced defects.
State dimension reduction of recurrent equilibrium networks with contraction and robustness preservation
Recurrent equilibrium networks (RENs) are effective for learning the dynamics of complex dynamical systems with certified contraction and robustness properties through unconstrained learning. While this opens the door to learning large-scale RENs, deploying such large-scale RENs in real-time applications on resource-limited devices remains challenging. Since a REN consists of a feedback interconnection of linear time-invariant (LTI) dynamics and static activation functions, this article proposes a projection-based approach to reduce the state dimension of the LTI component of a trained REN. One of the two projection matrices is dedicated to preserving contraction and robustness by leveraging the already-learned REN contraction certificate. The other projection matrix is iteratively updated to improve the accuracy of the reduced-order REN based on necessary $h_2$-optimality conditions for LTI model reduction. Numerical examples validate the approach, demonstrating significant state dimension reduction with limited accuracy loss while preserving contraction and robustness.
A "good regulator theorem" for embodied agents
In a classic paper, Conant and Ashby claimed that "every good regulator of a system must be a model of that system." Artificial Life has produced many examples of systems that perform tasks with apparently no model in sight; these suggest Conant and Ashby's theorem doesn't easily generalise beyond its restricted setup. Nevertheless, here we show that a similar intuition can be fleshed out in a different way: whenever an agent is able to perform a regulation task, it is possible for an observer to interpret it as having "beliefs" about its environment, which it "updates" in response to sensory input. This notion of belief updating provides a notion of model that is more sophisticated than Conant and Ashby's, as well as a theorem that is more broadly applicable. However, it necessitates a change in perspective, in that the observer plays an essential role in the theory: models are not a mere property of the system but are imposed on it from outside. Our theorem holds regardless of whether the system is regulating its environment in a classic control theory setup, or whether it's regulating its own internal state; the model is of its environment either way. The model might be trivial, however, and this is how the apparent counterexamples are resolved.
comment: Accepted at the Artificial Life conference 2025 (ALife 2025). 10 pages, 1 figure
Systolic Array-based Accelerator for State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 250x gains in performance and 45x improvement in energy efficiency, at the expense of 2x increase in area cost over traditional SA-based accelerators, and around ~2,000x improvement in latency/inference on LRA datasets compared to GPU kernel operations.
Transformable Modular Robots: A CPG-Based Approach to Independent and Collective Locomotion
Modular robotics enables the development of versatile and adaptive robotic systems with autonomous reconfiguration. This paper presents a modular robotic system in which each module has independent actuation, battery power, and control, allowing both individual mobility and coordinated locomotion. A hierarchical Central Pattern Generator (CPG) framework governs motion, with a low-level CPG controlling individual modules and a high-level CPG synchronizing inter-module coordination, enabling smooth transitions between independent and collective behaviors. To validate the system, we conduct simulations in MuJoCo and hardware experiments, evaluating locomotion across different configurations. We first analyze single-module motion, followed by two-module cooperative locomotion. Results demonstrate the effectiveness of the CPG-based control framework in achieving robust, flexible, and scalable locomotion. The proposed modular architecture has potential applications in search and rescue, environmental monitoring, and autonomous exploration, where adaptability and reconfigurability are essential.
Metasurface-Enabled Superheterodyne Transmitter for Arbitrary-Order Modulation with Spatially Isotropic Symbol Distribution
Electromagnetically programmable information metasurfaces, as dynamically controllable 2D metamaterials, hold significant promise as low-profile hardware enabling passive wave control and signal generation for backscatter systems. However, current metasurface-based transmitters architecture fundamentally suffer from hardware non-modularization, forcing all transmitter functions onto nonlinear switch-based unit cells, which introduces symbol mapping inconsistency via phase coupling. Moreover, both temporal coding (limited by unit cell diodes) and space-time coding (impaired by symbol anisotropy) exhibit irreducible harmonic interference and entangled control of amplitude, phase, and beam direction. This paper proposes a metasurface-enabled superheterodyne architecture (MSA), comprising a digital up-conversion (DUC) module performing baseband-to-intermediate frequency (IF) conversion, filtering, and digital-to-analog conversion (DAC), and a reconfigurable metasurface featuring programmable unit cells that independently control both the magnitude and phase of the reflection coefficient. Systematically, the architecture leverages a dual-stage up-conversion process, typical of superheterodyne systems, but uniquely employs the metasurface for the final RF conversion stage. Building upon this framework, a proof-of-concept prototype featuring a 5.8 GHz magnitude-phase decoupled (MPD) metasurface (<15 degree phase deviation per state) and a DAC-based DUC module is presented. Extensive validation confirms the metasurface's capability for distortion-free mixing with arbitrary IF signals while maintaining consistent radiation patterns. The prototype successfully implements diverse QAM modulation schemes (4QAM to 256QAM) in mono-static and bi-static configurations, demonstrating symbol isotropy for spatially separated receivers and achieving a data rate of approximately 20 Mbps (at 5 MHz IF)...
Frequency-Space Channel Estimation and Spatial Equalization in Wideband Fluid Antenna System
The Fluid Antenna System (FAS) overcomes the spatial degree-of-freedom limitations of conventional static antenna arrays in wireless communications.This capability critically depends on acquiring full Channel State Information across all accessible ports. Existing studies focus exclusively on narrowband FAS, performing channel estimation solely in the spatial domain. This work proposes a channel estimation and spatial equalization framework for wideband FAS, revealing for the first time an inherent group-sparse structure in aperture-limited FAS channels. First, we establish a group-sparse recovery framework for space-frequency characteristics in FAS, formally characterizing leakage-induced sparsity degradation from limited aperture and bandwidth as a structured group-sparsity problem. By deriving dictionary-adapted group restricted isometry property, we prove tight recovery bounds for a convex $\ell_1/\ell_2$-mixed norm optimization formulation that preserves leakage-aware sparsity patterns. Second, we develop a descending correlation group orthogonal matching pursuit algorithm that systematically relaxes leakage constraints to reduce subcoherence. This approach enables FSC recovery with accelerated convergence and superior performance compared to conventional compressive sensing methods like OMP or GOMP. Third, we formulate spatial equalization as a mixed-integer linear programming problem, complement this with a greedy algorithm maintaining near-optimal performance. Simulation results demonstrate the proposed channel estimation algorithm effectively resolves energy misallocation and enables recovery of weak details, achieving superior recovery accuracy and convergence rate. The SE framework suppresses deep fading phenomena and largely reduces time consumption overhead while maintaining equivalent link reliability.
Convex Computations for Controlled Safety Invariant Sets of Black-box Discrete-time Dynamical Systems
Identifying controlled safety invariant sets (CSISs) is essential in safety-critical applications. This paper tackles the problem of identifying CSISs for black-box discrete-time systems, where the model is unknown and only limited simulation data is accessible. Traditionally, a CSIS is defined as a subset of a safe set, encompassing initial states for which a control input exists that keeps the system within the set at the next time step-this is referred to as the one-step invariance property. However, the requirement for one-step invariance can be equivalently translated into a stricter condition of ``always-invariance'', meaning that there exist control inputs capable of keeping the system within this set indefinitely. Such a condition may prove overly stringent or impractical for black-box systems, where predictions can become unreliable beyond a single time step or a limited number of finite time steps. To overcome the challenges posed by black-box systems, we reformulate the one-step invariance property in a ``Probably Approximately Correct'' (PAC) sense. This approach allows us to assess the probability that a control input exists to keep the system within the CSIS at the next time step, with a predefined level of confidence. If the system successfully remains within the set at the next time step, we can then reapply the invariance evaluation to the new state, thereby facilitating a recursive assurance of invariance. Our method employs barrier functions and scenario optimization, resulting in a linear programming method to estimate PAC CSISs. Finally, the effectiveness of our approach is demonstrated on several examples.
comment: 15 pages
Online and Offline Space-Filling Input Design for Nonlinear System Identification: A Receding Horizon Control-Based Approach
The effectiveness of data-driven techniques heavily depends on the input signal used to generate the estimation data. However, a significant research gap exists in the field of input design for nonlinear dynamic system identification. In particular, existing methods largely overlook the minimization of the generalization error, i.e., model inaccuracies in regions not covered by the estimation dataset. This work addresses this gap by proposing an input design method that embeds a novel optimality criterion within a receding horizon control (RHC)-based optimization framework. The distance-based optimality criterion induces a space-filling design within a user-defined region of interest in a surrogate model's input space, requiring only minimal prior knowledge. Additionally, the method is applicable both online, where model parameters are continuously updated based on process observations, and offline, where a fixed model is employed. The space-filling performance of the proposed strategy is evaluated on an artificial example and compared to state-of-the-art methods, demonstrating superior efficiency in exploring process operating regions.
Clustered Federated Learning for Generalizable FDIA Detection in Smart Grids with Heterogeneous Data
False Data Injection Attacks (FDIAs) pose severe security risks to smart grids by manipulating measurement data collected from spatially distributed devices such as SCADA systems and PMUs. These measurements typically exhibit Non-Independent and Identically Distributed (Non-IID) characteristics across different regions, which significantly challenges the generalization ability of detection models. Traditional centralized training approaches not only face privacy risks and data sharing constraints but also incur high transmission costs, limiting their scalability and deployment feasibility. To address these issues, this paper proposes a privacy-preserving federated learning framework, termed Federated Cluster Average (FedClusAvg), designed to improve FDIA detection in Non-IID and resource-constrained environments. FedClusAvg incorporates cluster-based stratified sampling and hierarchical communication (client-subserver-server) to enhance model generalization and reduce communication overhead. By enabling localized training and weighted parameter aggregation, the algorithm achieves accurate model convergence without centralizing sensitive data. Experimental results on benchmark smart grid datasets demonstrate that FedClusAvg not only improves detection accuracy under heterogeneous data distributions but also significantly reduces communication rounds and bandwidth consumption. This work provides an effective solution for secure and efficient FDIA detection in large-scale distributed power systems.
comment: 10 pages,6 figures
Tunable Terahertz Detection and Generation using FETs operating in the saturation regime
I report on the experimental observation of DC instability and self-amplification through stimulated emission of 0.2 and 1.63 THz radiation using InGaAs/GaAs HEMT operating in the deep saturation regime at room temperature. I demonstrate both theoretically and experimentally, that the Sub-THz and THz response of FETs are attributable to the rectification of the nonlinear dependence of the device's current-voltage characteristics. FETs function as nonlinear THz mixers and rectifiers, with their open-drain responsivity described by an expression analogous to that of a zero-bias Schottky diode detector. However, operating FETs in the deep saturation regime permits precise tuning of the device to the quantum localized resonance condition and the negative resistance mode at room temperature. Consequently, FETs can be adjusted in the deep saturation regime to facilitate tunable sub-THz and THz detection and generation as well as tunable sub-THz and THz lasing effect. These results are anticipated to significantly impact technological advancements across various fields in the near future.
comment: 6 pages, 5 figures, to be submitted in Journal
Future Deployment and Flexibility of Distributed Energy Resources in the Distribution Grids of Switzerland
The decarbonization goals worldwide drive the energy transition of power distribution grids, which operate under increasingly volatile conditions and closer to their technical limits. In this context, localized operational data with high temporal and spatial resolution is essential for their effective planning and regulation. Nevertheless, information on grid-connected distributed energy resources, such as electric vehicles, photovoltaic systems, and heat pumps, is often fragmented, inconsistent, and unavailable. This work introduces a comprehensive database of distributed energy resources and non-controllable loads allocated in Switzerland's medium- and low-voltage distribution grid models, covering over 2 million points of connection. Remarkably, this data specifies the flexibility capabilities of the controllable devices, with a set of projections aligned with national forecasts for 2030, 2040, and 2050. The database supports studies on flexibility provision of distributed energy resources, distribution grid resilience, and national energy policy, among other topics. Importantly, its modular structure allows users to extract national- and local-scale information across medium- and low-voltage systems, enabling broad applicability across locations.
comment: The dataset can be accessed here: https://doi.org/10.5281/zenodo.15056134
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Low-Bit Integerization of Vision Transformers using Operand Reordering for Efficient Hardware
Pre-trained vision transformers have achieved remarkable performance across various visual tasks but suffer from expensive computational and memory costs. While model quantization reduces memory usage by lowering precision, these models still incur significant computational overhead due to the dequantization before matrix operations. In this work, we analyze the computation graph and propose an integerization process based on operation reordering. Specifically, the process delays dequantization until after matrix operations. This enables integerized matrix multiplication and linear module by directly processing the quantized input. To validate our approach, we synthesize the self-attention module of ViT on a systolic array-based hardware. Experimental results show that our low-bit inference reduces per-PE power consumption for linear layer and matrix multiplication, bridging the gap between quantized models and efficient inference.
comment: 4 pages + references, 5 figures, 2 tables in IEEE double column conference template
16 Ways to Gallop: Energetics and Body Dynamics of High-Speed Quadrupedal Gaits IROS 2025
Galloping is a common high-speed gait in both animals and quadrupedal robots, yet its energetic characteristics remain insufficiently explored. This study systematically analyzes a large number of possible galloping gaits by categorizing them based on the number of flight phases per stride and the phase relationships between the front and rear legs, following Hildebrand's framework for asymmetrical gaits. Using the A1 quadrupedal robot from Unitree, we model galloping dynamics as a hybrid dynamical system and employ trajectory optimization (TO) to minimize the cost of transport (CoT) across a range of speeds. Our results reveal that rotary and transverse gallop footfall sequences exhibit no fundamental energetic difference, despite variations in body yaw and roll motion. However, the number of flight phases significantly impacts energy efficiency: galloping with no flight phases is optimal at lower speeds, whereas galloping with two flight phases minimizes energy consumption at higher speeds. We validate these findings using a quadratic programming (QP)-based controller, developed in our previous work, in Gazebo simulations. These insights advance the understanding of quadrupedal locomotion energetics and may inform future legged robot designs for adaptive, energy-efficient gait transitions.
comment: 7 pages, 6 figures, Accepted for IROS 2025
Systems and Control (EESS)
Hierarchical Learning-Based Control for Multi-Agent Shepherding of Stochastic Autonomous Agents
Multi-agent shepherding represents a challenging distributed control problem where herder agents must coordinate to guide independently moving targets to desired spatial configurations. Most existing control strategies assume cohesive target behavior, which frequently fails in practical applications where targets exhibit stochastic autonomous behavior. This paper presents a hierarchical learning-based control architecture that decomposes the shepherding problem into a high-level decision-making module and a low-level motion control component. The proposed distributed control system synthesizes effective control policies directly from closed-loop experience without requiring explicit inter-agent communication or prior knowledge of target dynamics. The decentralized architecture achieves cooperative control behavior through emergent coordination without centralized supervision. Experimental validation demonstrates superior closed-loop performance compared to state-of-the-art heuristic control methods, achieving 100\% success rates with improved settling times and control efficiency. The control architecture scales beyond its design conditions, adapts to time-varying goal regions, and demonstrates practical implementation feasibility through real-time experiments on the Robotarium platform.
Tensor Dynamic Mode Decomposition
Dynamic mode decomposition (DMD) has become a powerful data-driven method for analyzing the spatiotemporal dynamics of complex, high-dimensional systems. However, conventional DMD methods are limited to matrix-based formulations, which might be inefficient or inadequate for modeling inherently multidimensional data including images, videos, and higher-order networks. In this letter, we propose tensor dynamic mode decomposition (TDMD), a novel extension of DMD to third-order tensors based on the recently developed T-product framework. By incorporating tensor factorization techniques, TDMD achieves more efficient computation and better preservation of spatial and temporal structures in multiway data for tasks such as state reconstruction and dynamic component separation, compared to standard DMD with data flattening. We demonstrate the effectiveness of TDMD on both synthetic and real-world datasets.
comment: 6 pages, 4 figures, 1 table
Periodic robust robotic rock chop via virtual model control
Robotic cutting is a challenging contact-rich manipulation task where the robot must simultaneously negotiate unknown object mechanics, large contact forces, and precise motion requirements. We introduce a new virtual-model control scheme that enables knife rocking motion for robot manipulators, without pre-planned trajectories or precise information of the environment. Motion is generated through interconnection with virtual mechanisms, given by virtual springs, dampers, and masses arranged in a suitable way. Through analysis and experiments, we demonstrate that the controlled robot behavior settles into a periodic motion. Experiments with a Franka manipulator demonstrate robust cuts with five different vegetables, and sub-millimeter slice accuracy from 1 mm to 6 mm at nearly one cut per second. The same controller survives changes in knife shape and cutting board height, and adaptation to a different humanoid manipulator, demonstrating robustness and platform independence.
Causality and Interpretability for Electrical Distribution System faults
Causal analysis helps us understand variables that are responsible for system failures. This improves fault detection and makes system more reliable. In this work, we present a new method that combines causal inference with machine learning to classify faults in electrical distribution systems (EDS) using graph-based models. We first build causal graphs using transfer entropy (TE). Each fault case is represented as a graph, where the nodes are features such as voltage and current, and the edges demonstrate how these features influence each other. Then, the graphs are classified using machine learning and GraphSAGE where the model learns from both the node values and the structure of the graph to predict the type of fault. To make the predictions understandable, we further developed an integrated approach using GNNExplainer and Captums Integrated Gradients to highlight the nodes (features) that influences the most on the final prediction. This gives us clear insights into the possible causes of the fault. Our experiments show high accuracy: 99.44% on the EDS fault dataset, which is better than state of art models. By combining causal graphs with machine learning, our method not only predicts faults accurately but also helps understand their root causes. This makes it a strong and practical tool for improving system reliability.
Uncertainty-Aware Perception-Based Control for Autonomous Racing
Autonomous systems operating in unknown environments often rely heavily on visual sensor data, yet making safe and informed control decisions based on these measurements remains a significant challenge. To facilitate the integration of perception and control in autonomous vehicles, we propose a novel perception-based control approach that incorporates road estimation, quantification of its uncertainty, and uncertainty-aware control based on this estimate. At the core of our method is a parametric road curvature model, optimized using visual measurements of the road through a constrained nonlinear optimization problem. This process ensures adherence to constraints on both model parameters and curvature. By leveraging the Frenet frame formulation, we embed the estimated track curvature into the system dynamics, allowing the controller to explicitly account for perception uncertainty and enhancing robustness to estimation errors based on visual input. We validate our approach in a simulated environment, using a high-fidelity 3D rendering engine, and demonstrate its effectiveness in achieving reliable and uncertainty-aware control for autonomous racing.
Computationally efficient Gauss-Newton reinforcement learning for model predictive control
Model predictive control (MPC) is widely used in process control due to its interpretability and ability to handle constraints. As a parametric policy in reinforcement learning (RL), MPC offers strong initial performance and low data requirements compared to black-box policies like neural networks. However, most RL methods rely on first-order updates, which scale well to large parameter spaces but converge at most linearly, making them inefficient when each policy update requires solving an optimal control problem, as is the case with MPC. While MPC policies are typically sparsely parameterized and thus amenable to second-order approaches, existing second-order methods demand second-order policy derivatives, which can be computationally and memory-wise intractable. This work introduces a Gauss-Newton approximation of the deterministic policy Hessian that eliminates the need for second-order policy derivatives, enabling superlinear convergence with minimal computational overhead. To further improve robustness, we propose a momentum-based Hessian averaging scheme for stable training under noisy estimates. We demonstrate the effectiveness of the approach on a nonlinear continuously stirred tank reactor (CSTR), showing faster convergence and improved data efficiency over state-of-the-art first-order methods.
comment: 14 pages, 8 figures, submitted to Elsevier
Equivalence of Koopman Eigenfunctions and a Commuting Local Frame of Symmetries
This article establishes the relationship between the Koopman eigenfunctions of a first-order nonlinear ODE system and its symmetries. Specifically, a bijective mapping is obtained between i) a commuting local frame of symmetry generators, and ii) a set of Koopman eigenfunctions whose gradients form a local frame. This equivalence is used to convert the problem of finding Koopman eigenfunctions into the equivalent problem of finding commuting vector fields, which are readily given by the flow map of the system. The implication for stability is also studied in terms of a contraction metric that is associated with the commuting local frame of symmetries. The theoretical result is verified with the van der Pol oscillator whose eigenfunctions are computed to the precision of finite numerical integration.
comment: 6 pages, 1 figure
Data-Driven Adaptive Second-Order Sliding Mode Control with Noisy Data
This paper offers a data-driven approach for designing adaptive suboptimal second-order sliding mode (ASSOSM) controllers for single-input nonlinear systems, characterized by perturbed strict-feedback structures with unknown dynamics. The proposed approach is recursive, in which the system dynamics are first decomposed into two parts, referred to as the upper and lower dynamics. The control design task is then divided into two stages, that is, designing a virtual controller for the upper dynamics, followed by synthesizing the actual controller for the full-order system. To this end, we start by collecting noisy data from the system through a finite-time experiment, referred to as a single trajectory. We then formulate a data-dependent condition as a semidefinite program, whose feasibility enables the design of a virtual controller that ensures global asymptotic stability of the origin for the upper dynamics. Building upon this virtual controller, we subsequently propose a data-driven sliding variable that facilitates the design of an ASSOSM controller for the unknown full-order system. This controller guarantees semi-global asymptotic stability of the origin in the presence of disturbances. Specifically, for any prescribed bounded set--no matter how large--the controller's design parameters can be chosen to ensure asymptotic stability of the origin. The effectiveness of the proposed method is demonstrated through three case studies, reflecting different aspects of the approach.
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
Distributed Non-Uniform Scaling Control of Multi-Agent Formation via Matrix-Valued Constraints
Distributed formation maneuver control refers to the problem of maneuvering a group of agents to change their formation shape by adjusting the motions of partial agents, where the controller of each agent only requires local information measured from its neighbors. Although this problem has been extensively investigated, existing approaches are mostly limited to uniform scaling transformations. This article proposes a new type of local matrix-valued constraints, via which non-uniform scaling control of position formation can be achieved by tuning the positions of only two agents (i.e., leaders). Here, the non-uniform scaling transformation refers to scaling the position formation with different ratios along different orthogonal coordinate directions. Moreover, by defining scaling and translation of attitude formation, we propose a distributed control scheme for scaling and translation maneuver control of joint position-attitude formations. It is proven that the proposed controller achieves global convergence, provided that the sensing graph among agents is a 2-rooted bidirectional graph. Compared with the affine formation maneuver control approach, the proposed approach leverages a sparser sensing graph, requires fewer leaders, and additionally enables scaling transformations of the attitude formation. A simulation example is proposed to demonstrate our theoretical results.
Distributed Constraint-coupled Resource Allocation: Anytime Feasibility and Violation Robustness
This paper considers distributed resource allocation problems (DRAPs) with a coupled constraint for real-time systems. Based on primal-dual methods, we adopt a control perspective for optimization algorithm design by synthesizing a safe feedback controller using control barrier functions to enforce constraint satisfaction. On this basis, a distributed anytime-feasible resource allocation (DanyRA) algorithm is proposed. It is shown that DanyRA algorithm converges to the exact optimal solution of DRAPs while ensuring feasibility of the coupled inequality constraint at all time steps. Considering constraint violation arises from potential external interferences, a virtual queue with minimum buffer is incorporated to restore the constraint satisfaction before the pre-defined deadlines. We characterize the trade-off between convergence accuracy and violation robustness for maintaining or recovering feasibility. DanyRA algorithm is further extended to address DRAPs with a coupled equality constraint, and its linear convergence rate is theoretically established. Finally, a numerical example is provided for verification.
Centralized Dynamic State Estimation Algorithm for Detecting and Distinguishing Faults and Cyber Attacks in Power Systems
As power systems evolve with increased integration of renewable energy sources, they become more complex and vulnerable to both cyber and physical threats. This study validates a centralized Dynamic State Estimation (DSE) algorithm designed to enhance the protection of power systems, particularly focusing on microgrids with substantial renewable energy integration. The algorithm utilizing a structured hypothesis testing framework, systematically identifies and differentiates anomalies caused by cyberattacks from those resulting from physical faults. This algorithm was evaluated through four case studies: a False Data Injection Attack (FDIA) via manipulation of Current Transformer (CT) ratios, a single line-to-ground (SLG) fault, and two combined scenarios involving both anomalies. Results from real-time simulations demonstrate that the algorithm effectively distinguishes between cyber-induced anomalies and physical faults, thereby significantly enhancing the reliability and security of energy systems. This research underscores the critical role of advanced diagnostic tools in protecting power systems against the growing prevalence of cyber-physical threats, enhancing the resilience of the grid and preventing potential blackouts by avoiding the mis-operation of protection relays.
Supervisory Control of Discrete Event Systems for Small Language Under Cyber Attacks
Cyber attacks are unavoidable in networked discrete event systems where the plant and the supervisor communicate with each other via networks. Because of the nondeterminism in observation and control caused by cyber attacks, the language generated by the supervised system becomes nondeterministic. The small language is defined as the lower bound on all possible languages that can be generated by the supervised system, which is needed for a supervised system to perform some required tasks under cyber attacks. In this paper, we investigate supervisory control for the small language. After introducing CA-S-controllability and CA-S-observability, we prove that the supervisory control problem of achieving a required small language is solvable if and only if the given language is CA-Scontrollable and CA-S-observable. If the given language is not CA-S controllable and/or CA-S-observable, we derive conditions under which the infimal CA-S-controllable and CA-S-observable superlanguage exists and can be used to design a supervisor satisfying the given requirement.
comment: It is submitted to IEEE Transactions on Automatic Control. The status is under review
A Comparative Study of Optimal Control and Neural Networks in Asteroid Rendezvous Mission Analysis
This paper presents a comparative study of the applicability and accuracy of optimal control methods and neural network-based estimators in the context of porkchop plots for preliminary asteroid rendezvous mission design. The scenario considered involves a deep-space CubeSat equipped with a low-thrust engine, departing from Earth and rendezvousing with a near-Earth asteroid within a three-year launch window. A low-thrust trajectory optimization model is formulated, incorporating variable specific impulse, maximum thrust, and path constraints. The optimal control problem is efficiently solved using Sequential Convex Programming (SCP) combined with a solution continuation strategy. The neural network framework consists of two models: one predicts the minimum fuel consumption ($\Delta v$), while the other estimates the minimum flight time ($\Delta t$) which is used to assess transfer feasibility. Case results demonstrate that, in simplified scenarios without path constraints, the neural network approach achieves low relative errors across most of the design space and successfully captures the main structural features of the porkchop plots. In cases where the SCP-based continuation method fails due to the presence of multiple local optima, the neural network still provides smooth and globally consistent predictions, significantly improving the efficiency of early-stage asteroid candidate screening. However, the deformation of the feasible region caused by path constraints leads to noticeable discrepancies in certain boundary regions, thereby limiting the applicability of the network in detailed mission design phases. Overall, the integration of neural networks with porkchop plot analysis offers a effective decision-making tool for mission designers and planetary scientists, with significant potential for engineering applications.
Engineered over Emergent Communication in MARL for Scalable and Sample-Efficient Cooperative Task Allocation in a Partially Observable Grid
We compare the efficacy of learned versus engineered communication strategies in a cooperative multi-agent reinforcement learning (MARL) environment. For the learned approach, we introduce Learned Direct Communication (LDC), where agents generate messages and actions concurrently via a neural network. Our engineered approach, Intention Communication, employs an Imagined Trajectory Generation Module (ITGM) and a Message Generation Network (MGN) to formulate messages based on predicted future states. Both strategies are evaluated on their success rates in cooperative tasks under fully and partially observable conditions. Our findings indicate that while emergent communication is viable, the engineered approach demonstrates superior performance and scalability, particularly as environmental complexity increases.
Neural Approximators for Low-Thrust Trajectory Transfer Cost and Reachability
In trajectory design, fuel consumption and trajectory reachability are two key performance indicators for low-thrust missions. This paper proposes general-purpose pretrained neural networks to predict these metrics. The contributions of this paper are as follows: Firstly, based on the confirmation of the Scaling Law applicable to low-thrust trajectory approximation, the largest dataset is constructed using the proposed homotopy ray method, which aligns with mission-design-oriented data requirements. Secondly, the data are transformed into a self-similar space, enabling the neural network to adapt to arbitrary semi-major axes, inclinations, and central bodies. This extends the applicability beyond existing studies and can generalize across diverse mission scenarios without retraining. Thirdly, to the best of our knowledge, this work presents the current most general and accurate low-thrust trajectory approximator, with implementations available in C++, Python, and MATLAB. The resulting neural network achieves a relative error of 0.78% in predicting velocity increments and 0.63% in minimum transfer time estimation. The models have also been validated on a third-party dataset, multi-flyby mission design problem, and mission analysis scenario, demonstrating their generalization capability, predictive accuracy, and computational efficiency.
Modeling and Simulation of an Active Quarter Car Suspension with a Robust LQR Controller under Road Disturbance and Parameter Uncertainty
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system in-creases the road holding of vehicles, allows them to take turns safely and reduces the risk of traffic accidents. Passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost and easy adjustment of coefficients. In this study, a more robust LQR controlled active suspension was designed than passive sus-pension and PID controlled active suspension. Robustness analyses were performed for passive suspension, PID controlled active suspension and LQR controlled active sus-pension. Suspension travel, sprung mass acceleration and sprung mass motion simulations were performed for all 3 suspensions under road disturbance and under simultaneous road disturbance and parameter uncertainty. A comparative analysis was performed by obtaining the suspension rise time, overshoot and settling time data. It was observed that the LQR controlled active suspension showed the least overshoot and had the shortest settling time. In this case, it was proven that the LQR controlled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 16 pages, 13 figures
Global Optimality in Multi-Flyby Asteroid Trajectory Optimization: Theory and Application Techniques
Designing optimal trajectories for multi-flyby asteroid missions is scientifically critical but technically challenging due to nonlinear dynamics, intermediate constraints, and numerous local optima. This paper establishes a method that approaches global optimality for multi-flyby trajectory optimization under a given sequence. The original optimal control problem with interior-point equality constraints is transformed into a multi-stage decision formulation. This reformulation enables direct application of dynamic programming in lower dimensions, and follows Bellman's principle of optimality. Moreover, the method provides a quantifiable bound on global optima errors introduced by discretization and approximation assumptions, thus ensuring a measure of confidence in the obtained solution. The method accommodates both impulsive and low-thrust maneuver schemes in rendezvous and flyby scenarios. Several computational techniques are introduced to enhance efficiency, including a specialized solution for bi-impulse cases and an adaptive step refinement strategy. The proposed method is validated through three problems: 1) an impulsive variant of the fourth Global Trajectory Optimization competition problem (GTOC4), 2) the GTOC11 problem, and 3) the original low-thrust GTOC4 problem. Each case demonstrates improvements in fuel consumption over the best-known trajectories. These results give evidence of the generality and effectiveness of the proposed method in global trajectory optimization.
Optimal control driven functional electrical stimulation: A scoping review
Introduction: Rehabilitation after a neurological impairment can be supported by functional electrical stimulation (FES). However, FES is limited by early muscle fatigue, slowing down the recovery progress. The use of optimal control to reduce overstimulation and improve motion precision is gaining interest. This review aims to map the current literature state meanwhile clarifying the best practices, identifying persistent challenges, and outlining directions for future research. Methods: Following the PRISMA guidelines, a search was conducted up to February 2024 using the combined keywords "FES", "optimal control" or "fatigue" across five databases (Medline, Embase, CINAHL Complete, Web of Science, and ProQuest Dissertations & Theses Citation Index). Inclusion criteria included the use of optimal control with FES for healthy individuals and those with neuromuscular disorders. Results: Among the 44 included studies, half were in silico and half in vivo, involving 87 participants, predominantly healthy young men. Twelve different motor tasks were investigated, with a focus on single joint lower limb movements. These studies principally used simple FES models, modulating pulse width or intensity to track joint angle. Conclusions: Optimal control-driven FES can deliver precise motions and reduce fatigue. Yet clinical adoption is slowed down by the lack of consensus about modelling, inconvenient model identification protocol and limited validation. Additional barriers include insufficient open-science practices, computational performance reporting and the availability of customizable commercial hardware. Comparative FES model studies and longitudinal trials with large cohorts, among other efforts, are required to improve the technology readiness level. Such advances would help clinical adoption and improve patient outcomes.
comment: 37 pages, 7 figures, 3 tables
Physics-Embedded Neural ODEs for Sim2Real Edge Digital Twins of Hybrid Power Electronics Systems
Edge Digital Twins (EDTs) are crucial for monitoring and control of Power Electronics Systems (PES). However, existing modeling approaches struggle to consistently capture continuously evolving hybrid dynamics that are inherent in PES, degrading Sim-to-Real generalization on resource-constrained edge devices. To address these challenges, this paper proposes a Physics-Embedded Neural ODEs (PENODE) that (i) embeds the hybrid operating mechanism as an event automaton to explicitly govern discrete switching and (ii) injects known governing ODE components directly into the neural parameterization of unmodeled dynamics. This unified design yields a differentiable end-to-end trainable architecture that preserves physical interpretability while reducing redundancy, and it supports a cloud-to-edge toolchain for efficient FPGA deployment. Experimental results demonstrate that PENODE achieves significantly higher accuracy in benchmarks in white-box, gray-box, and black-box scenarios, with a 75% reduction in neuron count, validating that the proposed PENODE maintains physical interpretability, efficient edge deployment, and real-time control enhancement.
Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
comment: 6 pages, 6 figures. Accepted for presentation at the 61st Allerton Conference on Communication, Control, and Computing
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Integrating Machine Learning with Multimodal Monitoring System Utilizing Acoustic and Vision Sensing to Evaluate Geometric Variations in Laser Directed Energy Deposition
Laser directed energy deposition (DED) additive manufacturing struggles with consistent part quality due to complex melt pool dynamics and process variations. While much research targets defect detection, little work has validated process monitoring systems for evaluating melt pool dynamics and process quality. This study presents a novel multimodal monitoring framework, synergistically integrating contact-based acoustic emission (AE) sensing with coaxial camera vision to enable layer-wise identification and evaluation of geometric variations in DED parts. The experimental study used three part configurations: a baseline part without holes, a part with a 3mm diameter through-hole, and one with a 5mm through-hole to test the system's discerning capabilities. Raw sensor data was preprocessed: acoustic signals were filtered for time-domain and frequency-domain feature extraction, while camera data underwent melt pool segmentation and morphological feature extraction. Multiple machine learning algorithms (including SVM, random forest, and XGBoost) were evaluated to find the optimal model for classifying layer-wise geometric variations. The integrated multimodal strategy achieved a superior classification performance of 94.4%, compared to 87.8% for AE only and 86.7% for the camera only. Validation confirmed the integrated system effectively captures both structural vibration signatures and surface morphological changes tied to the geometric variations. While this study focuses on specific geometries, the demonstrated capability to discriminate between features establishes a technical foundation for future applications in characterizing part variations like geometric inaccuracies and manufacturing-induced defects.
State dimension reduction of recurrent equilibrium networks with contraction and robustness preservation
Recurrent equilibrium networks (RENs) are effective for learning the dynamics of complex dynamical systems with certified contraction and robustness properties through unconstrained learning. While this opens the door to learning large-scale RENs, deploying such large-scale RENs in real-time applications on resource-limited devices remains challenging. Since a REN consists of a feedback interconnection of linear time-invariant (LTI) dynamics and static activation functions, this article proposes a projection-based approach to reduce the state dimension of the LTI component of a trained REN. One of the two projection matrices is dedicated to preserving contraction and robustness by leveraging the already-learned REN contraction certificate. The other projection matrix is iteratively updated to improve the accuracy of the reduced-order REN based on necessary $h_2$-optimality conditions for LTI model reduction. Numerical examples validate the approach, demonstrating significant state dimension reduction with limited accuracy loss while preserving contraction and robustness.
A "good regulator theorem" for embodied agents
In a classic paper, Conant and Ashby claimed that "every good regulator of a system must be a model of that system." Artificial Life has produced many examples of systems that perform tasks with apparently no model in sight; these suggest Conant and Ashby's theorem doesn't easily generalise beyond its restricted setup. Nevertheless, here we show that a similar intuition can be fleshed out in a different way: whenever an agent is able to perform a regulation task, it is possible for an observer to interpret it as having "beliefs" about its environment, which it "updates" in response to sensory input. This notion of belief updating provides a notion of model that is more sophisticated than Conant and Ashby's, as well as a theorem that is more broadly applicable. However, it necessitates a change in perspective, in that the observer plays an essential role in the theory: models are not a mere property of the system but are imposed on it from outside. Our theorem holds regardless of whether the system is regulating its environment in a classic control theory setup, or whether it's regulating its own internal state; the model is of its environment either way. The model might be trivial, however, and this is how the apparent counterexamples are resolved.
comment: Accepted at the Artificial Life conference 2025 (ALife 2025). 10 pages, 1 figure
Systolic Array-based Accelerator for State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 250x gains in performance and 45x improvement in energy efficiency, at the expense of 2x increase in area cost over traditional SA-based accelerators, and around ~2,000x improvement in latency/inference on LRA datasets compared to GPU kernel operations.
Transformable Modular Robots: A CPG-Based Approach to Independent and Collective Locomotion
Modular robotics enables the development of versatile and adaptive robotic systems with autonomous reconfiguration. This paper presents a modular robotic system in which each module has independent actuation, battery power, and control, allowing both individual mobility and coordinated locomotion. A hierarchical Central Pattern Generator (CPG) framework governs motion, with a low-level CPG controlling individual modules and a high-level CPG synchronizing inter-module coordination, enabling smooth transitions between independent and collective behaviors. To validate the system, we conduct simulations in MuJoCo and hardware experiments, evaluating locomotion across different configurations. We first analyze single-module motion, followed by two-module cooperative locomotion. Results demonstrate the effectiveness of the CPG-based control framework in achieving robust, flexible, and scalable locomotion. The proposed modular architecture has potential applications in search and rescue, environmental monitoring, and autonomous exploration, where adaptability and reconfigurability are essential.
Metasurface-Enabled Superheterodyne Transmitter for Arbitrary-Order Modulation with Spatially Isotropic Symbol Distribution
Electromagnetically programmable information metasurfaces, as dynamically controllable 2D metamaterials, hold significant promise as low-profile hardware enabling passive wave control and signal generation for backscatter systems. However, current metasurface-based transmitters architecture fundamentally suffer from hardware non-modularization, forcing all transmitter functions onto nonlinear switch-based unit cells, which introduces symbol mapping inconsistency via phase coupling. Moreover, both temporal coding (limited by unit cell diodes) and space-time coding (impaired by symbol anisotropy) exhibit irreducible harmonic interference and entangled control of amplitude, phase, and beam direction. This paper proposes a metasurface-enabled superheterodyne architecture (MSA), comprising a digital up-conversion (DUC) module performing baseband-to-intermediate frequency (IF) conversion, filtering, and digital-to-analog conversion (DAC), and a reconfigurable metasurface featuring programmable unit cells that independently control both the magnitude and phase of the reflection coefficient. Systematically, the architecture leverages a dual-stage up-conversion process, typical of superheterodyne systems, but uniquely employs the metasurface for the final RF conversion stage. Building upon this framework, a proof-of-concept prototype featuring a 5.8 GHz magnitude-phase decoupled (MPD) metasurface (<15 degree phase deviation per state) and a DAC-based DUC module is presented. Extensive validation confirms the metasurface's capability for distortion-free mixing with arbitrary IF signals while maintaining consistent radiation patterns. The prototype successfully implements diverse QAM modulation schemes (4QAM to 256QAM) in mono-static and bi-static configurations, demonstrating symbol isotropy for spatially separated receivers and achieving a data rate of approximately 20 Mbps (at 5 MHz IF)...
Frequency-Space Channel Estimation and Spatial Equalization in Wideband Fluid Antenna System
The Fluid Antenna System (FAS) overcomes the spatial degree-of-freedom limitations of conventional static antenna arrays in wireless communications.This capability critically depends on acquiring full Channel State Information across all accessible ports. Existing studies focus exclusively on narrowband FAS, performing channel estimation solely in the spatial domain. This work proposes a channel estimation and spatial equalization framework for wideband FAS, revealing for the first time an inherent group-sparse structure in aperture-limited FAS channels. First, we establish a group-sparse recovery framework for space-frequency characteristics in FAS, formally characterizing leakage-induced sparsity degradation from limited aperture and bandwidth as a structured group-sparsity problem. By deriving dictionary-adapted group restricted isometry property, we prove tight recovery bounds for a convex $\ell_1/\ell_2$-mixed norm optimization formulation that preserves leakage-aware sparsity patterns. Second, we develop a descending correlation group orthogonal matching pursuit algorithm that systematically relaxes leakage constraints to reduce subcoherence. This approach enables FSC recovery with accelerated convergence and superior performance compared to conventional compressive sensing methods like OMP or GOMP. Third, we formulate spatial equalization as a mixed-integer linear programming problem, complement this with a greedy algorithm maintaining near-optimal performance. Simulation results demonstrate the proposed channel estimation algorithm effectively resolves energy misallocation and enables recovery of weak details, achieving superior recovery accuracy and convergence rate. The SE framework suppresses deep fading phenomena and largely reduces time consumption overhead while maintaining equivalent link reliability.
Convex Computations for Controlled Safety Invariant Sets of Black-box Discrete-time Dynamical Systems
Identifying controlled safety invariant sets (CSISs) is essential in safety-critical applications. This paper tackles the problem of identifying CSISs for black-box discrete-time systems, where the model is unknown and only limited simulation data is accessible. Traditionally, a CSIS is defined as a subset of a safe set, encompassing initial states for which a control input exists that keeps the system within the set at the next time step-this is referred to as the one-step invariance property. However, the requirement for one-step invariance can be equivalently translated into a stricter condition of ``always-invariance'', meaning that there exist control inputs capable of keeping the system within this set indefinitely. Such a condition may prove overly stringent or impractical for black-box systems, where predictions can become unreliable beyond a single time step or a limited number of finite time steps. To overcome the challenges posed by black-box systems, we reformulate the one-step invariance property in a ``Probably Approximately Correct'' (PAC) sense. This approach allows us to assess the probability that a control input exists to keep the system within the CSIS at the next time step, with a predefined level of confidence. If the system successfully remains within the set at the next time step, we can then reapply the invariance evaluation to the new state, thereby facilitating a recursive assurance of invariance. Our method employs barrier functions and scenario optimization, resulting in a linear programming method to estimate PAC CSISs. Finally, the effectiveness of our approach is demonstrated on several examples.
comment: 15 pages
Online and Offline Space-Filling Input Design for Nonlinear System Identification: A Receding Horizon Control-Based Approach
The effectiveness of data-driven techniques heavily depends on the input signal used to generate the estimation data. However, a significant research gap exists in the field of input design for nonlinear dynamic system identification. In particular, existing methods largely overlook the minimization of the generalization error, i.e., model inaccuracies in regions not covered by the estimation dataset. This work addresses this gap by proposing an input design method that embeds a novel optimality criterion within a receding horizon control (RHC)-based optimization framework. The distance-based optimality criterion induces a space-filling design within a user-defined region of interest in a surrogate model's input space, requiring only minimal prior knowledge. Additionally, the method is applicable both online, where model parameters are continuously updated based on process observations, and offline, where a fixed model is employed. The space-filling performance of the proposed strategy is evaluated on an artificial example and compared to state-of-the-art methods, demonstrating superior efficiency in exploring process operating regions.
Clustered Federated Learning for Generalizable FDIA Detection in Smart Grids with Heterogeneous Data
False Data Injection Attacks (FDIAs) pose severe security risks to smart grids by manipulating measurement data collected from spatially distributed devices such as SCADA systems and PMUs. These measurements typically exhibit Non-Independent and Identically Distributed (Non-IID) characteristics across different regions, which significantly challenges the generalization ability of detection models. Traditional centralized training approaches not only face privacy risks and data sharing constraints but also incur high transmission costs, limiting their scalability and deployment feasibility. To address these issues, this paper proposes a privacy-preserving federated learning framework, termed Federated Cluster Average (FedClusAvg), designed to improve FDIA detection in Non-IID and resource-constrained environments. FedClusAvg incorporates cluster-based stratified sampling and hierarchical communication (client-subserver-server) to enhance model generalization and reduce communication overhead. By enabling localized training and weighted parameter aggregation, the algorithm achieves accurate model convergence without centralizing sensitive data. Experimental results on benchmark smart grid datasets demonstrate that FedClusAvg not only improves detection accuracy under heterogeneous data distributions but also significantly reduces communication rounds and bandwidth consumption. This work provides an effective solution for secure and efficient FDIA detection in large-scale distributed power systems.
comment: 10 pages,6 figures
Tunable Terahertz Detection and Generation using FETs operating in the saturation regime
I report on the experimental observation of DC instability and self-amplification through stimulated emission of 0.2 and 1.63 THz radiation using InGaAs/GaAs HEMT operating in the deep saturation regime at room temperature. I demonstrate both theoretically and experimentally, that the Sub-THz and THz response of FETs are attributable to the rectification of the nonlinear dependence of the device's current-voltage characteristics. FETs function as nonlinear THz mixers and rectifiers, with their open-drain responsivity described by an expression analogous to that of a zero-bias Schottky diode detector. However, operating FETs in the deep saturation regime permits precise tuning of the device to the quantum localized resonance condition and the negative resistance mode at room temperature. Consequently, FETs can be adjusted in the deep saturation regime to facilitate tunable sub-THz and THz detection and generation as well as tunable sub-THz and THz lasing effect. These results are anticipated to significantly impact technological advancements across various fields in the near future.
comment: 6 pages, 5 figures, to be submitted in Journal
Future Deployment and Flexibility of Distributed Energy Resources in the Distribution Grids of Switzerland
The decarbonization goals worldwide drive the energy transition of power distribution grids, which operate under increasingly volatile conditions and closer to their technical limits. In this context, localized operational data with high temporal and spatial resolution is essential for their effective planning and regulation. Nevertheless, information on grid-connected distributed energy resources, such as electric vehicles, photovoltaic systems, and heat pumps, is often fragmented, inconsistent, and unavailable. This work introduces a comprehensive database of distributed energy resources and non-controllable loads allocated in Switzerland's medium- and low-voltage distribution grid models, covering over 2 million points of connection. Remarkably, this data specifies the flexibility capabilities of the controllable devices, with a set of projections aligned with national forecasts for 2030, 2040, and 2050. The database supports studies on flexibility provision of distributed energy resources, distribution grid resilience, and national energy policy, among other topics. Importantly, its modular structure allows users to extract national- and local-scale information across medium- and low-voltage systems, enabling broad applicability across locations.
comment: The dataset can be accessed here: https://doi.org/10.5281/zenodo.15056134
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Low-Bit Integerization of Vision Transformers using Operand Reordering for Efficient Hardware
Pre-trained vision transformers have achieved remarkable performance across various visual tasks but suffer from expensive computational and memory costs. While model quantization reduces memory usage by lowering precision, these models still incur significant computational overhead due to the dequantization before matrix operations. In this work, we analyze the computation graph and propose an integerization process based on operation reordering. Specifically, the process delays dequantization until after matrix operations. This enables integerized matrix multiplication and linear module by directly processing the quantized input. To validate our approach, we synthesize the self-attention module of ViT on a systolic array-based hardware. Experimental results show that our low-bit inference reduces per-PE power consumption for linear layer and matrix multiplication, bridging the gap between quantized models and efficient inference.
comment: 4 pages + references, 5 figures, 2 tables in IEEE double column conference template
16 Ways to Gallop: Energetics and Body Dynamics of High-Speed Quadrupedal Gaits IROS 2025
Galloping is a common high-speed gait in both animals and quadrupedal robots, yet its energetic characteristics remain insufficiently explored. This study systematically analyzes a large number of possible galloping gaits by categorizing them based on the number of flight phases per stride and the phase relationships between the front and rear legs, following Hildebrand's framework for asymmetrical gaits. Using the A1 quadrupedal robot from Unitree, we model galloping dynamics as a hybrid dynamical system and employ trajectory optimization (TO) to minimize the cost of transport (CoT) across a range of speeds. Our results reveal that rotary and transverse gallop footfall sequences exhibit no fundamental energetic difference, despite variations in body yaw and roll motion. However, the number of flight phases significantly impacts energy efficiency: galloping with no flight phases is optimal at lower speeds, whereas galloping with two flight phases minimizes energy consumption at higher speeds. We validate these findings using a quadratic programming (QP)-based controller, developed in our previous work, in Gazebo simulations. These insights advance the understanding of quadrupedal locomotion energetics and may inform future legged robot designs for adaptive, energy-efficient gait transitions.
comment: 7 pages, 6 figures, Accepted for IROS 2025
Robotics
Manip4Care: Robotic Manipulation of Human Limbs for Solving Assistive Tasks IROS 2025
Enabling robots to grasp and reposition human limbs can significantly enhance their ability to provide assistive care to individuals with severe mobility impairments, particularly in tasks such as robot-assisted bed bathing and dressing. However, existing assistive robotics solutions often assume that the human remains static or quasi-static, limiting their effectiveness. To address this issue, we present Manip4Care, a modular simulation pipeline that enables robotic manipulators to grasp and reposition human limbs effectively. Our approach features a physics simulator equipped with built-in techniques for grasping and repositioning while considering biomechanical and collision avoidance constraints. Our grasping method employs antipodal sampling with force closure to grasp limbs, and our repositioning system utilizes the Model Predictive Path Integral (MPPI) and vector-field-based control method to generate motion trajectories under collision avoidance and biomechanical constraints. We evaluate this approach across various limb manipulation tasks in both supine and sitting positions and compare outcomes for different age groups with differing shoulder joint limits. Additionally, we demonstrate our approach for limb manipulation using a real-world mannequin and further showcase its effectiveness in bed bathing tasks.
comment: Accepted to IROS 2025
HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and Decision in Embodied Agents ICCV 2025
Recent advances in multimodal large language models (MLLMs) have enabled richer perceptual grounding for code policy generation in embodied agents. However, most existing systems lack effective mechanisms to adaptively monitor policy execution and repair codes during task completion. In this work, we introduce HyCodePolicy, a hybrid language-based control framework that systematically integrates code synthesis, geometric grounding, perceptual monitoring, and iterative repair into a closed-loop programming cycle for embodied agents. Technically, given a natural language instruction, our system first decomposes it into subgoals and generates an initial executable program grounded in object-centric geometric primitives. The program is then executed in simulation, while a vision-language model (VLM) observes selected checkpoints to detect and localize execution failures and infer failure reasons. By fusing structured execution traces capturing program-level events with VLM-based perceptual feedback, HyCodePolicy infers failure causes and repairs programs. This hybrid dual feedback mechanism enables self-correcting program synthesis with minimal human supervision. Our results demonstrate that HyCodePolicy significantly improves the robustness and sample efficiency of robot manipulation policies, offering a scalable strategy for integrating multimodal reasoning into autonomous decision-making pipelines.
comment: Accepted to ICCV 2025 Workshop on Multi-Modal Reasoning for Agentic Intelligence
Vision-based Navigation of Unmanned Aerial Vehicles in Orchards: An Imitation Learning Approach
Autonomous unmanned aerial vehicle (UAV) navigation in orchards presents significant challenges due to obstacles and GPS-deprived environments. In this work, we introduce a learning-based approach to achieve vision-based navigation of UAVs within orchard rows. Our method employs a variational autoencoder (VAE)-based controller, trained with an intervention-based learning framework that allows the UAV to learn a visuomotor policy from human experience. We validate our approach in real orchard environments with a custom-built quadrotor platform. Field experiments demonstrate that after only a few iterations of training, the proposed VAE-based controller can autonomously navigate the UAV based on a front-mounted camera stream. The controller exhibits strong obstacle avoidance performance, achieves longer flying distances with less human assistance, and outperforms existing algorithms. Furthermore, we show that the policy generalizes effectively to novel environments and maintains competitive performance across varying conditions and speeds. This research not only advances UAV autonomy but also holds significant potential for precision agriculture, improving efficiency in orchard monitoring and management.
Periodic robust robotic rock chop via virtual model control
Robotic cutting is a challenging contact-rich manipulation task where the robot must simultaneously negotiate unknown object mechanics, large contact forces, and precise motion requirements. We introduce a new virtual-model control scheme that enables knife rocking motion for robot manipulators, without pre-planned trajectories or precise information of the environment. Motion is generated through interconnection with virtual mechanisms, given by virtual springs, dampers, and masses arranged in a suitable way. Through analysis and experiments, we demonstrate that the controlled robot behavior settles into a periodic motion. Experiments with a Franka manipulator demonstrate robust cuts with five different vegetables, and sub-millimeter slice accuracy from 1 mm to 6 mm at nearly one cut per second. The same controller survives changes in knife shape and cutting board height, and adaptation to a different humanoid manipulator, demonstrating robustness and platform independence.
MonoDream: Monocular Vision-Language Navigation with Panoramic Dreaming
Vision-Language Navigation (VLN) tasks often leverage panoramic RGB and depth inputs to provide rich spatial cues for action planning, but these sensors can be costly or less accessible in real-world deployments. Recent approaches based on Vision-Language Action (VLA) models achieve strong results with monocular input, yet they still lag behind methods using panoramic RGB-D information. We present MonoDream, a lightweight VLA framework that enables monocular agents to learn a Unified Navigation Representation (UNR). This shared feature representation jointly aligns navigation-relevant visual semantics (e.g., global layout, depth, and future cues) and language-grounded action intent, enabling more reliable action prediction. MonoDream further introduces Latent Panoramic Dreaming (LPD) tasks to supervise the UNR, which train the model to predict latent features of panoramic RGB and depth observations at both current and future steps based on only monocular input. Experiments on multiple VLN benchmarks show that MonoDream consistently improves monocular navigation performance and significantly narrows the gap with panoramic-based agents.
An RGB-D Camera-Based Multi-Small Flying Anchors Control for Wire-Driven Robots Connecting to the Environment IROS 2025
In order to expand the operational range and payload capacity of robots, wire-driven robots that leverage the external environment have been proposed. It can exert forces and operate in spaces far beyond those dictated by its own structural limits. However, for practical use, robots must autonomously attach multiple wires to the environment based on environmental recognition-an operation so difficult that many wire-driven robots remain restricted to specialized, pre-designed environments. Here, in this study, we propose a robot that autonomously connects multiple wires to the environment by employing a multi-small flying anchor system, as well as an RGB-D camera-based control and environmental recognition method. Each flying anchor is a drone with an anchoring mechanism at the wire tip, allowing the robot to attach wires by flying into position. Using the robot's RGB-D camera to identify suitable attachment points and a flying anchor position, the system can connect wires in environments that are not specially prepared, and can also attach multiple wires simultaneously. Through this approach, a wire-driven robot can autonomously attach its wires to the environment, thereby realizing the benefits of wire-driven operation at any location.
comment: Accepted at IROS 2025, website - https://shin0805.github.io/flying-anchor/, YouTube - https://youtu.be/BXdlXf7BNQQ
Failure-Aware Multi-Robot Coordination for Resilient and Adaptive Target Tracking
Multi-robot coordination is crucial for autonomous systems, yet real-world deployments often encounter various failures. These include both temporary and permanent disruptions in sensing and communication, which can significantly degrade system robustness and performance if not explicitly modeled. Despite its practical importance, failure-aware coordination remains underexplored in the literature. To bridge the gap between idealized conditions and the complexities of real-world environments, we propose a unified failure-aware coordination framework designed to enable resilient and adaptive multi-robot target tracking under both temporary and permanent failure conditions. Our approach systematically distinguishes between two classes of failures: (1) probabilistic and temporary disruptions, where robots recover from intermittent sensing or communication losses by dynamically adapting paths and avoiding inferred danger zones, and (2) permanent failures, where robots lose sensing or communication capabilities irreversibly, requiring sustained, decentralized behavioral adaptation. To handle these scenarios, the robot team is partitioned into subgroups. Robots that remain connected form a communication group and collaboratively plan using partially centralized nonlinear optimization. Robots experiencing permanent disconnection or failure continue to operate independently through decentralized or individual optimization, allowing them to contribute to the task within their local context. We extensively evaluate our method across a range of benchmark variations and conduct a comprehensive assessment under diverse real-world failure scenarios. Results show that our framework consistently achieves robust performance in realistic environments with unknown danger zones, offering a practical and generalizable solution for the multi-robot systems community.
QuaDreamer: Controllable Panoramic Video Generation for Quadruped Robots
Panoramic cameras, capturing comprehensive 360-degree environmental data, are suitable for quadruped robots in surrounding perception and interaction with complex environments. However, the scarcity of high-quality panoramic training data-caused by inherent kinematic constraints and complex sensor calibration challenges-fundamentally limits the development of robust perception systems tailored to these embodied platforms. To address this issue, we propose QuaDreamer-the first panoramic data generation engine specifically designed for quadruped robots. QuaDreamer focuses on mimicking the motion paradigm of quadruped robots to generate highly controllable, realistic panoramic videos, providing a data source for downstream tasks. Specifically, to effectively capture the unique vertical vibration characteristics exhibited during quadruped locomotion, we introduce Vertical Jitter Encoding (VJE). VJE extracts controllable vertical signals through frequency-domain feature filtering and provides high-quality prompts. To facilitate high-quality panoramic video generation under jitter signal control, we propose a Scene-Object Controller (SOC) that effectively manages object motion and boosts background jitter control through the attention mechanism. To address panoramic distortions in wide-FoV video generation, we propose the Panoramic Enhancer (PE)-a dual-stream architecture that synergizes frequency-texture refinement for local detail enhancement with spatial-structure correction for global geometric consistency. We further demonstrate that the generated video sequences can serve as training data for the quadruped robot's panoramic visual perception model, enhancing the performance of multi-object tracking in 360-degree scenes. The source code and model weights will be publicly available at https://github.com/losehu/QuaDreamer.
comment: Accepted to CoRL 2025. The source code and model weights will be publicly available at https://github.com/losehu/QuaDreamer
Would you let a humanoid play storytelling with your child? A usability study on LLM-powered narrative Humanoid-Robot Interaction
A key challenge in human-robot interaction research lies in developing robotic systems that can effectively perceive and interpret social cues, facilitating natural and adaptive interactions. In this work, we present a novel framework for enhancing the attention of the iCub humanoid robot by integrating advanced perceptual abilities to recognise social cues, understand surroundings through generative models, such as ChatGPT, and respond with contextually appropriate social behaviour. Specifically, we propose an interaction task implementing a narrative protocol (storytelling task) in which the human and the robot create a short imaginary story together, exchanging in turn cubes with creative images placed on them. To validate the protocol and the framework, experiments were performed to quantify the degree of usability and the quality of experience perceived by participants interacting with the system. Such a system can be beneficial in promoting effective human robot collaborations, especially in assistance, education and rehabilitation scenarios where the social awareness and the robot responsiveness play a pivotal role.
Uncertainty-Aware Perception-Based Control for Autonomous Racing
Autonomous systems operating in unknown environments often rely heavily on visual sensor data, yet making safe and informed control decisions based on these measurements remains a significant challenge. To facilitate the integration of perception and control in autonomous vehicles, we propose a novel perception-based control approach that incorporates road estimation, quantification of its uncertainty, and uncertainty-aware control based on this estimate. At the core of our method is a parametric road curvature model, optimized using visual measurements of the road through a constrained nonlinear optimization problem. This process ensures adherence to constraints on both model parameters and curvature. By leveraging the Frenet frame formulation, we embed the estimated track curvature into the system dynamics, allowing the controller to explicitly account for perception uncertainty and enhancing robustness to estimation errors based on visual input. We validate our approach in a simulated environment, using a high-fidelity 3D rendering engine, and demonstrate its effectiveness in achieving reliable and uncertainty-aware control for autonomous racing.
Multi-Class Human/Object Detection on Robot Manipulators using Proprioceptive Sensing
In physical human-robot collaboration (pHRC) settings, humans and robots collaborate directly in shared environments. Robots must analyze interactions with objects to ensure safety and facilitate meaningful workflows. One critical aspect is human/object detection, where the contacted object is identified. Past research introduced binary machine learning classifiers to distinguish between soft and hard objects. This study improves upon those results by evaluating three-class human/object detection models, offering more detailed contact analysis. A dataset was collected using the Franka Emika Panda robot manipulator, exploring preprocessing strategies for time-series analysis. Models including LSTM, GRU, and Transformers were trained on these datasets. The best-performing model achieved 91.11\% accuracy during real-time testing, demonstrating the feasibility of multi-class detection models. Additionally, a comparison of preprocessing strategies suggests a sliding window approach is optimal for this task.
Improving Generalization of Language-Conditioned Robot Manipulation
The control of robots for manipulation tasks generally relies on visual input. Recent advances in vision-language models (VLMs) enable the use of natural language instructions to condition visual input and control robots in a wider range of environments. However, existing methods require a large amount of data to fine-tune VLMs for operating in unseen environments. In this paper, we present a framework that learns object-arrangement tasks from just a few demonstrations. We propose a two-stage framework that divides object-arrangement tasks into a target localization stage, for picking the object, and a region determination stage for placing the object. We present an instance-level semantic fusion module that aligns the instance-level image crops with the text embedding, enabling the model to identify the target objects defined by the natural language instructions. We validate our method on both simulation and real-world robotic environments. Our method, fine-tuned with a few demonstrations, improves generalization capability and demonstrates zero-shot ability in real-robot manipulation scenarios.
comment: 7 pages,18 figures,2 tables
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
mmWave Radar-Based Non-Line-of-Sight Pedestrian Localization at T-Junctions Utilizing Road Layout Extraction via Camera
Pedestrians Localization in Non-Line-of-Sight (NLoS) regions within urban environments poses a significant challenge for autonomous driving systems. While mmWave radar has demonstrated potential for detecting objects in such scenarios, the 2D radar point cloud (PCD) data is susceptible to distortions caused by multipath reflections, making accurate spatial inference difficult. Additionally, although camera images provide high-resolution visual information, they lack depth perception and cannot directly observe objects in NLoS regions. In this paper, we propose a novel framework that interprets radar PCD through road layout inferred from camera for localization of NLoS pedestrians. The proposed method leverages visual information from the camera to interpret 2D radar PCD, enabling spatial scene reconstruction. The effectiveness of the proposed approach is validated through experiments conducted using a radar-camera system mounted on a real vehicle. The localization performance is evaluated using a dataset collected in outdoor NLoS driving environments, demonstrating the practical applicability of the method.
Correspondence-Free Fast and Robust Spherical Point Pattern Registration ICCV 2025
Existing methods for rotation estimation between two spherical ($\mathbb{S}^2$) patterns typically rely on spherical cross-correlation maximization between two spherical function. However, these approaches exhibit computational complexities greater than cubic $O(n^3)$ with respect to rotation space discretization and lack extensive evaluation under significant outlier contamination. To this end, we propose a rotation estimation algorithm between two spherical patterns with linear time complexity $O(n)$. Unlike existing spherical-function-based methods, we explicitly represent spherical patterns as discrete 3D point sets on the unit sphere, reformulating rotation estimation as a spherical point-set alignment (i.e., Wahba problem for 3D unit vectors). Given the geometric nature of our formulation, our spherical pattern alignment algorithm naturally aligns with the Wahba problem framework for 3D unit vectors. Specifically, we introduce three novel algorithms: (1) SPMC (Spherical Pattern Matching by Correlation), (2) FRS (Fast Rotation Search), and (3) a hybrid approach (SPMC+FRS) that combines the advantages of the previous two methods. Our experiments demonstrate that in the $\mathbb{S}^2$ domain and in correspondence-free settings, our algorithms are over 10x faster and over 10x more accurate than current state-of-the-art methods for the Wahba problem with outliers. We validate our approach through extensive simulations on a new dataset of spherical patterns, the ``Robust Vector Alignment Dataset. "Furthermore, we adapt our methods to two real-world tasks: (i) Point Cloud Registration (PCR) and (ii) rotation estimation for spherical images.
comment: Accepted at ICCV 2025
Vision Language Model-based Testing of Industrial Autonomous Mobile Robots
Autonomous Mobile Robots (AMRs) are deployed in diverse environments (e.g., warehouses, retail spaces, and offices), where they work alongside humans. Given that human behavior can be unpredictable and that AMRs may not have been trained to handle all possible unknown and uncertain behaviors, it is important to test AMRs under a wide range of human interactions to ensure their safe behavior. Moreover, testing in real environments with actual AMRs and humans is often costly, impractical, and potentially hazardous (e.g., it could result in human injury). To this end, we propose a Vision Language Model (VLM)-based testing approach (RVSG) for industrial AMRs developed by PAL Robotics in Spain. Based on the functional and safety requirements, RVSG uses the VLM to generate diverse human behaviors that violate these requirements. We evaluated RVSG with several requirements and navigation routes in a simulator using the latest AMR from PAL Robotics. Our results show that, compared with the baseline, RVSG can effectively generate requirement-violating scenarios. Moreover, RVSG-generated scenarios increase variability in robot behavior, thereby helping reveal their uncertain behaviors.
Framework for Robust Motion Planning of Tethered Multi-Robot Systems in Marine Environments
This paper introduces CoralGuide, a novel framework designed for path planning and trajectory optimization for tethered multi-robot systems. We focus on marine robotics, which commonly have tethered configurations of an Autonomous Surface Vehicle (ASV) and an Autonomous Underwater Vehicle (AUV). CoralGuide provides safe navigation in marine environments by enhancing the A* algorithm with specialized heuristics tailored for tethered ASV-AUV systems. Our method integrates catenary curve modelling for tether management and employs Bezier curve interpolation for smoother trajectory planning, ensuring efficient and synchronized operations without compromising safety. Through simulations and real-world experiments, we have validated CoralGuides effectiveness in improving path planning and trajectory optimization, demonstrating its potential to significantly enhance operational capabilities in marine research and infrastructure inspection.
comment: The research paper has been submitted and accepted for presentation at the OCEANS 2025 conference in France
Tethered Multi-Robot Systems in Marine Environments ICRA 2025
This paper introduces a novel simulation framework for evaluating motion control in tethered multi-robot systems within dynamic marine environments. Specifically, it focuses on the coordinated operation of an Autonomous Underwater Vehicle (AUV) and an Autonomous Surface Vehicle(ASV). The framework leverages GazeboSim, enhanced with realistic marine environment plugins and ArduPilots SoftwareIn-The-Loop (SITL) mode, to provide a high-fidelity simulation platform. A detailed tether model, combining catenary equations and physical simulation, is integrated to accurately represent the dynamic interactions between the vehicles and the environment. This setup facilitates the development and testing of advanced control strategies under realistic conditions, demonstrating the frameworks capability to analyze complex tether interactions and their impact on system performance.
comment: The research paper has been submitted and accepted for the AQ2UASIM Workshop during ICRA 2025 in the USA
An Event-based Fast Intensity Reconstruction Scheme for UAV Real-time Perception
Event cameras offer significant advantages, including a wide dynamic range, high temporal resolution, and immunity to motion blur, making them highly promising for addressing challenging visual conditions. Extracting and utilizing effective information from asynchronous event streams is essential for the onboard implementation of event cameras. In this paper, we propose a streamlined event-based intensity reconstruction scheme, event-based single integration (ESI), to address such implementation challenges. This method guarantees the portability of conventional frame-based vision methods to event-based scenarios and maintains the intrinsic advantages of event cameras. The ESI approach reconstructs intensity images by performing a single integration of the event streams combined with an enhanced decay algorithm. Such a method enables real-time intensity reconstruction at a high frame rate, typically 100 FPS. Furthermore, the relatively low computation load of ESI fits onboard implementation suitably, such as in UAV-based visual tracking scenarios. Extensive experiments have been conducted to evaluate the performance comparison of ESI and state-of-the-art algorithms. Compared to state-of-the-art algorithms, ESI demonstrates remarkable runtime efficiency improvements, superior reconstruction quality, and a high frame rate. As a result, ESI enhances UAV onboard perception significantly under visual adversary surroundings. In-flight tests, ESI demonstrates effective performance for UAV onboard visual tracking under extremely low illumination conditions(2-10lux), whereas other comparative algorithms fail due to insufficient frame rate, poor image quality, or limited real-time performance.
comment: A supplementary video is available at https://youtu.be/tLzXjXVRkVg
CO-RFT: Efficient Fine-Tuning of Vision-Language-Action Models through Chunked Offline Reinforcement Learning
Vision-Language-Action (VLA) models demonstrate significant potential for developing generalized policies in real-world robotic control. This progress inspires researchers to explore fine-tuning these models with Reinforcement Learning (RL). However, fine-tuning VLA models with RL still faces challenges related to sample efficiency, compatibility with action chunking, and training stability. To address these challenges, we explore the fine-tuning of VLA models through offline reinforcement learning incorporating action chunking. In this work, we propose Chunked RL, a novel reinforcement learning framework specifically designed for VLA models. Within this framework, we extend temporal difference (TD) learning to incorporate action chunking, a prominent characteristic of VLA models. Building upon this framework, we propose CO-RFT, an algorithm aimed at fine-tuning VLA models using a limited set of demonstrations (30 to 60 samples). Specifically, we first conduct imitation learning (IL) with full parameter fine-tuning to initialize both the backbone and the policy. Subsequently, we implement offline RL with action chunking to optimize the pretrained policy. Our empirical results in real-world environments demonstrate that CO-RFT outperforms previous supervised methods, achieving a 57% improvement in success rate and a 22.3% reduction in cycle time. Moreover, our method exhibits robust positional generalization capabilities, attaining a success rate of 44.3% in previously unseen positions.
TacMan-Turbo: Proactive Tactile Control for Robust and Efficient Articulated Object Manipulation
Adept manipulation of articulated objects is essential for robots to operate successfully in human environments. Such manipulation requires both effectiveness -- reliable operation despite uncertain object structures -- and efficiency -- swift execution with minimal redundant steps and smooth actions. Existing approaches struggle to achieve both objectives simultaneously: methods relying on predefined kinematic models lack effectiveness when encountering structural variations, while tactile-informed approaches achieve robust manipulation without kinematic priors but compromise efficiency through reactive, step-by-step exploration-compensation cycles. This paper introduces TacMan-Turbo, a novel proactive tactile control framework for articulated object manipulation that resolves this fundamental trade-off. Unlike previous approaches that treat tactile contact deviations merely as error signals requiring compensation, our method interprets these deviations as rich sources of local kinematic information. This new perspective enables our controller to predict optimal future interactions and make proactive adjustments, significantly enhancing manipulation efficiency. In comprehensive evaluations across 200 diverse simulated articulated objects and real-world experiments, our approach maintains a 100% success rate while significantly outperforming the previous tactile-informed method in time efficiency, action efficiency, and trajectory smoothness (all p-values < 0.0001). These results demonstrate that the long-standing trade-off between effectiveness and efficiency in articulated object manipulation can be successfully resolved without relying on prior kinematic knowledge.
Constrained Reinforcement Learning for Unstable Point-Feet Bipedal Locomotion Applied to the Bolt Robot
Bipedal locomotion is a key challenge in robotics, particularly for robots like Bolt, which have a point-foot design. This study explores the control of such underactuated robots using constrained reinforcement learning, addressing their inherent instability, lack of arms, and limited foot actuation. We present a methodology that leverages Constraints-as-Terminations and domain randomization techniques to enable sim-to-real transfer. Through a series of qualitative and quantitative experiments, we evaluate our approach in terms of balance maintenance, velocity control, and responses to slip and push disturbances. Additionally, we analyze autonomy through metrics like the cost of transport and ground reaction force. Our method advances robust control strategies for point-foot bipedal robots, offering insights into broader locomotion.
FedVLA: Federated Vision-Language-Action Learning with Dual Gating Mixture-of-Experts for Robotic Manipulation ICCV 2025
Vision-language-action (VLA) models have significantly advanced robotic manipulation by enabling robots to interpret language instructions for task execution. However, training these models often relies on large-scale user-specific data, raising concerns about privacy and security, which in turn limits their broader adoption. To address this, we propose FedVLA, the first federated VLA learning framework, enabling distributed model training that preserves data privacy without compromising performance. Our framework integrates task-aware representation learning, adaptive expert selection, and expert-driven federated aggregation, enabling efficient and privacy-preserving training of VLA models. Specifically, we introduce an Instruction Oriented Scene-Parsing mechanism, which decomposes and enhances object-level features based on task instructions, improving contextual understanding. To effectively learn diverse task patterns, we design a Dual Gating Mixture-of-Experts (DGMoE) mechanism, where not only input tokens but also self-aware experts adaptively decide their activation. Finally, we propose an Expert-Driven Aggregation strategy at the federated server, where model aggregation is guided by activated experts, ensuring effective cross-client knowledge transfer.Extensive simulations and real-world robotic experiments demonstrate the effectiveness of our proposals. Notably, DGMoE significantly improves computational efficiency compared to its vanilla counterpart, while FedVLA achieves task success rates comparable to centralized training, effectively preserving data privacy.
comment: Accepted by ICCV 2025
A Moment Matching-Based Method for Sparse and Noisy Point Cloud Registration
Point cloud registration is a key step in robotic perception tasks, such as Simultaneous Localization and Mapping (SLAM). It is especially challenging in conditions with sparse points and heavy noise. Traditional registration methods, such as Iterative Closest Point (ICP) and Normal Distributions Transform (NDT), often have difficulties in achieving a robust and accurate alignment under these conditions. In this paper, we propose a registration framework based on moment matching. In particular, the point clouds are regarded as i.i.d. samples drawn from the same distribution observed in the source and target frames. We then match the generalized Gaussian Radial Basis moments calculated from the point clouds to estimate the rigid transformation between two frames. Moreover, such method does not require explicit point-to-point correspondences among the point clouds. We further show the consistency of the proposed method. Experiments on synthetic and real-world datasets show that our approach achieves higher accuracy and robustness than existing methods. In addition, we integrate our framework into a 4D Radar SLAM system. The proposed method significantly improves the localization performance and achieves results comparable to LiDAR-based systems. These findings demonstrate the potential of moment matching technique for robust point cloud registration in sparse and noisy scenarios.
Towards High Precision: An Adaptive Self-Supervised Learning Framework for Force-Based Verification
The automation of robotic tasks requires high precision and adaptability, particularly in force-based operations such as insertions. Traditional learning-based approaches either rely on static datasets, which limit their ability to generalize, or require frequent manual intervention to maintain good performances. As a result, ensuring long-term reliability without human supervision remains a significant challenge. To address this, we propose an adaptive self-supervised learning framework for insertion classification that continuously improves its precision over time. The framework operates in real-time, incrementally refining its classification decisions by integrating newly acquired force data. Unlike conventional methods, it does not rely on pre-collected datasets but instead evolves dynamically with each task execution. Through real-world experiments, we demonstrate how the system progressively reduces execution time while maintaining near-perfect precision as more samples are processed. This adaptability ensures long-term reliability in force-based robotic tasks while minimizing the need for manual intervention.
comment: 7 pages, 7 figures, 3 tables
ScrewSplat: An End-to-End Method for Articulated Object Recognition
Articulated object recognition -- the task of identifying both the geometry and kinematic joints of objects with movable parts -- is essential for enabling robots to interact with everyday objects such as doors and laptops. However, existing approaches often rely on strong assumptions, such as a known number of articulated parts; require additional inputs, such as depth images; or involve complex intermediate steps that can introduce potential errors -- limiting their practicality in real-world settings. In this paper, we introduce ScrewSplat, a simple end-to-end method that operates solely on RGB observations. Our approach begins by randomly initializing screw axes, which are then iteratively optimized to recover the object's underlying kinematic structure. By integrating with Gaussian Splatting, we simultaneously reconstruct the 3D geometry and segment the object into rigid, movable parts. We demonstrate that our method achieves state-of-the-art recognition accuracy across a diverse set of articulated objects, and further enables zero-shot, text-guided manipulation using the recovered kinematic model.
comment: 26 pages, 12 figures, Conference on Robot Learning (CoRL) 2025
Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis ICCV 2025
Real-time synthesis of physically plausible human interactions remains a critical challenge for immersive VR/AR systems and humanoid robotics. While existing methods demonstrate progress in kinematic motion generation, they often fail to address the fundamental tension between real-time responsiveness, physical feasibility, and safety requirements in dynamic human-machine interactions. We introduce Human-X, a novel framework designed to enable immersive and physically plausible human interactions across diverse entities, including human-avatar, human-humanoid, and human-robot systems. Unlike existing approaches that focus on post-hoc alignment or simplified physics, our method jointly predicts actions and reactions in real-time using an auto-regressive reaction diffusion planner, ensuring seamless synchronization and context-aware responses. To enhance physical realism and safety, we integrate an actor-aware motion tracking policy trained with reinforcement learning, which dynamically adapts to interaction partners' movements while avoiding artifacts like foot sliding and penetration. Extensive experiments on the Inter-X and InterHuman datasets demonstrate significant improvements in motion quality, interaction continuity, and physical plausibility over state-of-the-art methods. Our framework is validated in real-world applications, including virtual reality interface for human-robot interaction, showcasing its potential for advancing human-robot collaboration.
comment: Accepted by ICCV 2025
"Set It Up": Functional Object Arrangement with Compositional Generative Models
Functional object arrangement (FORM) is the task of arranging objects to fulfill a function, e.g., "set up a dining table for two". One key challenge here is that the instructions for FORM are often under-specified and do not explicitly specify the desired object goal poses. This paper presents SetItUp, a neuro-symbolic framework that learns to specify the goal poses of objects from a few training examples and a structured natural-language task specification. SetItUp uses a grounding graph, which is composed of abstract spatial relations among objects (e.g., left-of), as its intermediate representation. This decomposes the FORM problem into two stages: (i) predicting this graph among objects and (ii) predicting object poses given the grounding graph. For (i), SetItUp leverages large language models (LLMs) to induce Python programs from a task specification and a few training examples. This program can be executed to generate grounding graphs in novel scenarios. For (ii), SetItUp pre-trains a collection of diffusion models to capture primitive spatial relations and online composes these models to predict object poses based on the grounding graph. We evaluated SetItUp on a dataset spanning three distinct task families: arranging tableware on a dining table, organizing items on a bookshelf, and laying out furniture in a bedroom. Experiments show that SetItUp outperforms existing models in generating functional, physically feasible, and aesthetically pleasing object arrangements. This article extends our conference paper published at Robotics: Science and Systems (RSS) 2024.
comment: This is the journal version accepted to the International Journal of Robotics Research (IJRR). It extends our prior work presented at Robotics: Science and Systems (RSS) 2024, with a new compositional program induction pipeline from natural language, and expanded evaluations on personalized bookshelf and bedroom furniture layout tasks
RICL: Adding In-Context Adaptability to Pre-Trained Vision-Language-Action Models
Multi-task ``vision-language-action'' (VLA) models have recently demonstrated increasing promise as generalist foundation models for robotics, achieving non-trivial performance out of the box on new tasks in new environments. However, for such models to be truly useful, an end user must have easy means to teach them to improve. For language and vision models, the emergent ability to perform in-context learning (ICL) has proven to be a versatile and highly useful interface to easily teach new tasks with no parameter finetuning. Unfortunately, VLAs pre-trained with imitation learning objectives do not naturally acquire ICL abilities. In this paper, we demonstrate that, with the right finetuning recipe and a small robot demonstration dataset, it is possible to inject in-context adaptability post hoc into such a VLA. After retraining for in-context learning (RICL), our system permits an end user to provide a small number (10-20) of demonstrations for a new task. RICL then fetches the most relevant portions of those demonstrations into the VLA context to exploit ICL, performing the new task and boosting task performance. We apply RICL to inject ICL into the $\pi_{0}$-FAST VLA, and show that it permits large in-context improvements for a variety of new manipulation tasks with only 20 demonstrations per task, without any parameter updates. When parameter updates on the target task demonstrations is possible, RICL finetuning further boosts performance. We release code and model weights for RICL-$\pi_{0}$-FAST alongside the paper to enable, for the first time, a simple in-context learning interface for new manipulation tasks. Website: https://ricl-vla.github.io.
comment: Conference on Robot Learning 2025 (CoRL 2025), 17 pages
NaviMaster: Learning a Unified Policy for GUI and Embodied Navigation Tasks
Recent advances in Graphical User Interface (GUI) and embodied navigation have driven significant progress, yet these domains have largely evolved in isolation, with disparate datasets and training paradigms. In this paper, we observe that both tasks can be formulated as Markov Decision Processes (MDP), suggesting a foundational principle for their unification. Hence, we present NaviMaster, the first unified agent capable of seamlessly integrating GUI navigation and embodied navigation within a single framework. Specifically, NaviMaster (i) proposes a visual-target trajectory collection pipeline that generates trajectories for both GUI and embodied tasks in one formulation. (ii) employs a unified reinforcement learning framework on the mix data for better generalization. (iii) designs a novel distance-aware reward to ensure efficient learning from the trajectories. Through extensive experiments on out-of-domain benchmarks, NaviMaster is shown to outperform state-of-the-art agents in GUI navigation, spatial affordance prediction, and embodied navigation. Ablation studies further confirm the efficacy of our unified training strategy, data mixing strategy, and reward design.
comment: Homepage: https://iron-boyy.github.io/navimaster/
Design and Control of an Actively Morphing Quadrotor with Vertically Foldable Arms
In this work, we propose a novel quadrotor design capable of folding its arms vertically to grasp objects and navigate through narrow spaces. The transformation is controlled actively by a central servomotor, gears, and racks. The arms connect the motor bases to the central frame, forming a parallelogram structure that ensures the propellers maintain a constant orientation during morphing. In its stretched state, the quadrotor resembles a conventional design, and when contracted, it functions as a gripper with grasping components emerging from the motor bases. To mitigate disturbances during transforming and grasping payloads, we employ an adaptive sliding mode controller with a disturbance observer. After fully folded, the quadrotor frame shrinks to 67% of its original size. The control performance and versatility of the morphing quadrotor are validated through real-world experiments.
From Photons to Physics: Autonomous Indoor Drones and the Future of Objective Property Assessment
The convergence of autonomous indoor drones with physics-aware sensing technologies promises to transform property assessment from subjective visual inspection to objective, quantitative measurement. This comprehensive review examines the technical foundations enabling this paradigm shift across four critical domains: (1) platform architectures optimized for indoor navigation, where weight constraints drive innovations in heterogeneous computing, collision-tolerant design, and hierarchical control systems; (2) advanced sensing modalities that extend perception beyond human vision, including hyperspectral imaging for material identification, polarimetric sensing for surface characterization, and computational imaging with metaphotonics enabling radical miniaturization; (3) intelligent autonomy through active reconstruction algorithms, where drones equipped with 3D Gaussian Splatting make strategic decisions about viewpoint selection to maximize information gain within battery constraints; and (4) integration pathways with existing property workflows, including Building Information Modeling (BIM) systems and industry standards like Uniform Appraisal Dataset (UAD) 3.6.
comment: 63 pages, 5 figures
Robot builds a robot's brain: AI generated drone command and control station hosted in the sky
Advances in artificial intelligence (AI) including large language models (LLMs) and hybrid reasoning models present an opportunity to reimagine how autonomous robots such as drones are designed, developed, and validated. Here, we demonstrate a fully AI-generated drone control system: with minimal human input, an artificial intelligence (AI) model authored all the code for a real-time, self-hosted drone command and control platform, which was deployed and demonstrated on a real drone in flight as well as a simulated virtual drone in the cloud. The system enables real-time mapping, flight telemetry, autonomous mission planning and execution, and safety protocolsall orchestrated through a web interface hosted directly on the drone itself. Not a single line of code was written by a human. We quantitatively benchmark system performance, code complexity, and development speed against prior, human-coded architectures, finding that AI-generated code can deliver functionally complete command-and-control stacks at orders-of-magnitude faster development cycles, though with identifiable current limitations related to specific model context window and reasoning depth. Our analysis uncovers the practical boundaries of AI-driven robot control code generation at current model scales, as well as emergent strengths and failure modes in AI-generated robotics code. This work sets a precedent for the autonomous creation of robot control systems and, more broadly, suggests a new paradigm for robotics engineeringone in which future robots may be largely co-designed, developed, and verified by artificial intelligence. In this initial work, a robot built a robot's brain.
Optimal Trajectory Planning in a Vertically Undulating Snake Locomotion using Contact-implicit Optimization
Contact-rich problems, such as snake robot locomotion, offer unexplored yet rich opportunities for optimization-based trajectory and acyclic contact planning. So far, a substantial body of control research has focused on emulating snake locomotion and replicating its distinctive movement patterns using shape functions that either ignore the complexity of interactions or focus on complex interactions with matter (e.g., burrowing movements). However, models and control frameworks that lie in between these two paradigms and are based on simple, fundamental rigid body dynamics, which alleviate the challenging contact and control allocation problems in snake locomotion, remain absent. This work makes meaningful contributions, substantiated by simulations and experiments, in the following directions: 1) introducing a reduced-order model based on Moreau's stepping-forward approach from differential inclusion mathematics, 2) verifying model accuracy, 3) experimental validation.
A novel autonomous microplastics surveying robot for beach environments
Microplastics, defined as plastic particles smaller than 5 millimeters, have become a pervasive environmental contaminant that accumulates on beaches due to wind patterns and tidal forcing. Detecting microplastics and mapping their concentration in the wild remains one of the primary challenges in addressing this environmental issue. This paper introduces a novel robotic platform that automatically detects and chemically analyzes microplastics on beach surfaces. This mobile manipulator system scans areas for microplastics using a camera mounted on the robotic arm's end effector. The system effectively segments candidate microplastic particles on sand surfaces even in the presence of organic matter such as leaves and clams. Once a candidate microplastic particle is detected, the system steers a near-infrared (NIR) spectroscopic sensor onto the particle using both NIR and visual feedback to chemically analyze it in real-time. Through experiments in lab and beach environments, the system is shown to achieve an excellent positional precision in manipulation control and high microplastic classification accuracy.
comment: 12 pages, 11 figures
AeroSafe: Mobile Indoor Air Purification using Aerosol Residence Time Analysis and Robotic Cough Emulator Testbed ICRA
Indoor air quality plays an essential role in the safety and well-being of occupants, especially in the context of airborne diseases. This paper introduces AeroSafe, a novel approach aimed at enhancing the efficacy of indoor air purification systems through a robotic cough emulator testbed and a digital-twins-based aerosol residence time analysis. Current portable air filters often overlook the concentrations of respiratory aerosols generated by coughs, posing a risk, particularly in high-exposure environments like healthcare facilities and public spaces. To address this gap, we present a robotic dual-agent physical emulator comprising a maneuverable mannequin simulating cough events and a portable air purifier autonomously responding to aerosols. The generated data from this emulator trains a digital twins model, combining a physics-based compartment model with a machine learning approach, using Long Short-Term Memory (LSTM) networks and graph convolution layers. Experimental results demonstrate the model's ability to predict aerosol concentration dynamics with a mean residence time prediction error within 35 seconds. The proposed system's real-time intervention strategies outperform static air filter placement, showcasing its potential in mitigating airborne pathogen risks.
comment: Accepted at IEEE International Conference on Robotics and Automation (ICRA) 2025. Author Accepted Manuscript
Model-agnostic Meta-learning for Adaptive Gait Phase and Terrain Geometry Estimation with Wearable Soft Sensors
This letter presents a model-agnostic meta-learning (MAML) based framework for simultaneous and accurate estimation of human gait phase and terrain geometry using a small set of fabric-based wearable soft sensors, with efficient adaptation to unseen subjects and strong generalization across different subjects and terrains. Compared to rigid alternatives such as inertial measurement units, fabric-based soft sensors improve comfort but introduce nonlinearities due to hysteresis, placement error, and fabric deformation. Moreover, inter-subject and inter-terrain variability, coupled with limited calibration data in real-world deployments, further complicate accurate estimation. To address these challenges, the proposed framework integrates MAML into a deep learning architecture to learn a generalizable model initialization that captures subject- and terrain-invariant structure. This initialization enables efficient adaptation (i.e., adaptation with only a small amount of calibration data and a few fine-tuning steps) to new users, while maintaining strong generalization (i.e., high estimation accuracy across subjects and terrains). Experiments on nine participants walking at various speeds over five terrain conditions demonstrate that the proposed framework outperforms baseline approaches in estimating gait phase, locomotion mode, and incline angle, with superior accuracy, adaptation efficiency, and generalization.
comment: 8 pages, 5 figures
Context-aware Risk Assessment and Its Application in Autonomous Driving SC 2025
Ensuring safety in autonomous driving requires precise, real-time risk assessment and adaptive behavior. Prior work on risk estimation either outputs coarse, global scene-level metrics lacking interpretability, proposes indicators without concrete integration into autonomous systems, or focuses narrowly on specific driving scenarios. We introduce the Context-aware Risk Index (CRI), a light-weight modular framework that quantifies directional risks based on object kinematics and spatial relationships, dynamically adjusting control commands in real time. CRI employs direction-aware spatial partitioning within a dynamic safety envelope using Responsibility-Sensitive Safety (RSS) principles, a hybrid probabilistic-max fusion strategy for risk aggregation, and an adaptive control policy for real-time behavior modulation. We evaluate CRI on the Bench2Drive benchmark comprising 220 safety-critical scenarios using a state-of-the-art end-to-end model Transfuser++ on challenging routes. Our collision-rate metrics show a 19\% reduction (p = 0.003) in vehicle collisions per failed route, a 20\% reduction (p = 0.004) in collisions per kilometer, a 17\% increase (p = 0.016) in composed driving score, and a statistically significant reduction in penalty scores (p = 0.013) with very low overhead (3.6 ms per decision cycle). These results demonstrate that CRI substantially improves safety and robustness in complex, risk-intensive environments while maintaining modularity and low runtime overhead.
comment: ITSC 2025, 7 pages
Following Route Instructions using Large Vision-Language Models: A Comparison between Low-level and Panoramic Action Spaces
Vision-and-Language Navigation (VLN) refers to the task of enabling autonomous robots to navigate unfamiliar environments by following natural language instructions. While recent Large Vision-Language Models (LVLMs) have shown promise in this task, most current VLM systems rely on models specifically designed and optimized for navigation, leaving the potential of off-the-shelf LVLMs underexplored. Furthermore, while older VLN approaches used low-level action spaces with egocentric views and atomic actions (such as "turn left" or "move forward"), newer models tend to favor panoramic action spaces with discrete navigable viewpoints. This paper investigates (1) whether off-the-shelf LVLMs (fine-tuned without architectural modifications or simulator-based training) can effectively support VLN tasks and (2) whether such models can support both low-level and panoramic action paradigms. To this end, we fine-tune the open-source model Qwen2.5-VL-3B-Instruct on the Room-to-Room (R2R) dataset and evaluate its empirical performance across both low-level and panoramic action spaces. The best resulting model achieves a 41% success rate on the R2R test set, demonstrating that while off-the-shelf LVLMs can learn to perform Vision-and-Language Navigation, they still lag behind models specifically designed for this task.
comment: This paper has been accepted to ICNSLP 2025
Co-designing Zoomorphic Robot Concepts for Animal Welfare Education
Animal welfare education could greatly benefit from customized robots to help children learn about animals and their behavior, and thereby promote positive, safe child-animal interactions. To this end, we ran Participatory Design workshops with animal welfare educators and children to identify key requirements for zoomorphic robots from their perspectives. Our findings encompass a zoomorphic robot's appearance, behavior, and features, as well as concepts for a narrative surrounding the robot. Through comparing and contrasting the two groups, we find the importance of: negative reactions to undesirable behavior from children; using the facial features and tail to provide cues signaling an animal's internal state; and a natural, furry appearance and texture. We also contribute some novel activities for Participatory Design with children, including branching storyboards inspired by thematic apperception tests and interactive narratives, and reflect on some of the key design challenges of achieving consensus between the groups, despite much overlap in their design concepts.
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Learning User Interaction Forces using Vision for a Soft Finger Exosuit
Wearable assistive devices are increasingly becoming softer. Modelling their interface with human tissue is necessary to capture transmission of dynamic assistance. However, their nonlinear and compliant nature makes both physical modeling and embedded sensing challenging. In this paper, we develop a image-based, learning-based framework to estimate distributed contact forces for a finger-exosuit system. We used the SoRoSim toolbox to generate a diverse dataset of exosuit geometries and actuation scenarios for training. The method accurately estimated interaction forces across multiple contact locations from low-resolution grayscale images, was able to generalize to unseen shapes and actuation levels, and remained robust under visual noise and contrast variations. We integrated the model into a feedback controller, and found that the vision-based estimator functions as a surrogate force sensor for closed-loop control. This approach could be used as a non-intrusive alternative for real-time force estimation for exosuits.
comment: 13 pages, 9 figures
Refined Policy Distillation: From VLA Generalists to RL Experts IROS 2026
Vision-Language-Action Models (VLAs) have demonstrated remarkable generalization capabilities in real-world experiments. However, their success rates are often not on par with expert policies, and they require fine-tuning when the setup changes. In this work, we introduce Refined Policy Distillation (RPD), a novel Reinforcement Learning (RL)-based policy refinement method that bridges this performance gap through a combination of on-policy RL with behavioral cloning. The core idea of RPD is to distill and refine VLAs into compact, high-performing expert policies by guiding the student policy during RL exploration using the actions of a teacher VLA, resulting in increased sample efficiency and faster convergence. We complement our method by fine-tuned versions of Octo and OpenVLA for ManiSkill3 to evaluate RPD in simulation. While this is a key requirement for applying RL, it also yields new insights beyond existing studies on VLA performance in real-world settings. Our experimental results across various manipulation tasks show that RPD enables the RL student to learn expert policies that outperform the VLA teacher in both dense and sparse reward settings, while also achieving faster convergence than the RL baseline. Our approach is even robust to changes in camera perspective and can generalize to task variations that the underlying VLA cannot solve. Our code, dataset, VLA checkpoints, and videos are available at https://refined-policy-distillation.github.io
comment: accepted for publication at IROS 2026
Eyes Will Shut: A Vision-Based Next GPS Location Prediction Model by Reinforcement Learning from Visual Map Feed Back
Next Location Prediction is a fundamental task in the study of human mobility, with wide-ranging applications in transportation planning, urban governance, and epidemic forecasting. In practice, when humans attempt to predict the next location in a trajectory, they often visualize the trajectory on a map and reason based on road connectivity and movement trends. However, the vast majority of existing next-location prediction models do not reason over maps \textbf{in the way that humans do}. Fortunately, the recent development of Vision-Language Models (VLMs) has demonstrated strong capabilities in visual perception and even visual reasoning. This opens up a new possibility: by rendering both the road network and trajectory onto an image and leveraging the reasoning abilities of VLMs, we can enable models to perform trajectory inference in a human-like manner. To explore this idea, we first propose a method called Vision-Guided Location Search (VGLS), which evaluates whether a general-purpose VLM is capable of trajectory-based reasoning without modifying any of its internal parameters. Based on insights from the VGLS results, we further propose our main approach: VLMLocPredictor, which is composed of two stages: In the first stage, we design two Supervised Fine-Tuning (SFT) tasks that help the VLM understand road network and trajectory structures and acquire basic reasoning ability on such visual inputs. In the second stage, we introduce Reinforcement Learning from Visual Map Feedback, enabling the model to self-improve its next-location prediction ability through interaction with the environment. Experiments conducted on datasets from four different cities show that our method achieves state-of-the-art (SOTA) performance and exhibits superior cross-city generalization compared to other LLM-based approaches.
Globally Optimal Data-Association-Free Landmark-Based Localization Using Semidefinite Relaxations
This paper proposes a semidefinite relaxation for landmark-based localization with unknown data associations in planar environments. The proposed method simultaneously solves for the optimal robot states and data associations in a globally optimal fashion. Relative position measurements to known landmarks are used, but the data association is unknown in tha tthe robot does not know which landmark each measurement is generated from. The relaxation is shown to be tight in a majority of cases for moderate noise levels. The proposed algorithm is compared to local Gauss-Newton baselines initialized at the dead-reckoned trajectory, and is shown to significantly improve convergence to the problem's global optimum in simulation and experiment. Accompanying software and supplementary material may be found at https://github.com/decargroup/certifiable_uda_loc .
comment: 13 pages, 6 figures. Accepted to IEEE Robotics and Automation Letters
Faster Motion Planning via Restarts
Randomized methods such as PRM and RRT are widely used in motion planning. However, in some cases, their running-time suffers from inherent instability, leading to ``catastrophic'' performance even for relatively simple instances. We apply stochastic restart techniques, some of them new, for speeding up Las Vegas algorithms, that provide dramatic speedups in practice (a factor of $3$ [or larger] in many cases). Our experiments demonstrate that the new algorithms have faster runtimes, shorter paths, and greater gains from multi-threading (when compared with straightforward parallel implementation). We prove the optimality of the new variants. Our implementation is open source, available on github, and is easy to deploy and use.
comment: arXiv admin note: text overlap with arXiv:2503.04633
Transformable Modular Robots: A CPG-Based Approach to Independent and Collective Locomotion
Modular robotics enables the development of versatile and adaptive robotic systems with autonomous reconfiguration. This paper presents a modular robotic system in which each module has independent actuation, battery power, and control, allowing both individual mobility and coordinated locomotion. A hierarchical Central Pattern Generator (CPG) framework governs motion, with a low-level CPG controlling individual modules and a high-level CPG synchronizing inter-module coordination, enabling smooth transitions between independent and collective behaviors. To validate the system, we conduct simulations in MuJoCo and hardware experiments, evaluating locomotion across different configurations. We first analyze single-module motion, followed by two-module cooperative locomotion. Results demonstrate the effectiveness of the CPG-based control framework in achieving robust, flexible, and scalable locomotion. The proposed modular architecture has potential applications in search and rescue, environmental monitoring, and autonomous exploration, where adaptability and reconfigurability are essential.
MIXPINN: Mixed-Material Simulations by Physics-Informed Neural Network IROS
Simulating the complex interactions between soft tissues and rigid anatomy is critical for applications in surgical training, planning, and robotic-assisted interventions. Traditional Finite Element Method (FEM)-based simulations, while accurate, are computationally expensive and impractical for real-time scenarios. Learning-based approaches have shown promise in accelerating predictions but have fallen short in modeling soft-rigid interactions effectively. We introduce MIXPINN, a physics-informed Graph Neural Network (GNN) framework for mixed-material simulations, explicitly capturing soft-rigid interactions using graph-based augmentations. Our approach integrates Virtual Nodes (VNs) and Virtual Edges (VEs) to enhance rigid body constraint satisfaction while preserving computational efficiency. By leveraging a graph-based representation of biomechanical structures, MIXPINN learns high-fidelity deformations from FEM-generated data and achieves real-time inference with sub-millimeter accuracy. We validate our method in a realistic clinical scenario, demonstrating superior performance compared to baseline GNN models and traditional FEM methods. Our results show that MIXPINN reduces computational cost by an order of magnitude while maintaining high physical accuracy, making it a viable solution for real-time surgical simulation and robotic-assisted procedures.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Autonomous Dissection in Robotic Cholecystectomy IROS
Robotic surgery offers enhanced precision and adaptability, paving the way for automation in surgical interventions. Cholecystectomy, the gallbladder removal, is particularly well-suited for automation due to its standardized procedural steps and distinct anatomical boundaries. A key challenge in automating this procedure is dissecting with accuracy and adaptability. This paper presents a vision-based autonomous robotic dissection architecture that integrates real-time segmentation, keypoint detection, grasping and stretching the gallbladder with the left arm, and dissecting with the other arm. We introduce an improved segmentation dataset based on videos of robotic cholecystectomy performed by various surgeons, incorporating a new ``liver bed'' class to enhance boundary tracking after multiple rounds of dissection. Our system employs state-of-the-art segmentation models and an adaptive boundary extraction method that maintains accuracy despite tissue deformations and visual variations. Moreover, we implemented an automated grasping and pulling strategy to optimize tissue tension before dissection upon our previous work. Ex vivo evaluations on porcine livers demonstrate that our framework significantly improves dissection precision and consistency, marking a step toward fully autonomous robotic cholecystectomy.
comment: Accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Distributed AI Agents for Cognitive Underwater Robot Autonomy
Achieving robust cognitive autonomy in robots navigating complex, unpredictable environments remains a fundamental challenge in robotics. This paper presents Underwater Robot Self-Organizing Autonomy (UROSA), a groundbreaking architecture leveraging distributed Large Language Model AI agents integrated within the Robot Operating System 2 (ROS 2) framework to enable advanced cognitive capabilities in Autonomous Underwater Vehicles. UROSA decentralises cognition into specialised AI agents responsible for multimodal perception, adaptive reasoning, dynamic mission planning, and real-time decision-making. Central innovations include flexible agents dynamically adapting their roles, retrieval-augmented generation utilising vector databases for efficient knowledge management, reinforcement learning-driven behavioural optimisation, and autonomous on-the-fly ROS 2 node generation for runtime functional extensibility. Extensive empirical validation demonstrates UROSA's promising adaptability and reliability through realistic underwater missions in simulation and real-world deployments, showing significant advantages over traditional rule-based architectures in handling unforeseen scenarios, environmental uncertainties, and novel mission objectives. This work not only advances underwater autonomy but also establishes a scalable, safe, and versatile cognitive robotics framework capable of generalising to a diverse array of real-world applications.
KeyMPs: One-Shot Vision-Language Guided Motion Generation by Sequencing DMPs for Occlusion-Rich Tasks
Dynamic Movement Primitives (DMPs) provide a flexible framework wherein smooth robotic motions are encoded into modular parameters. However, they face challenges in integrating multimodal inputs commonly used in robotics like vision and language into their framework. To fully maximize DMPs' potential, enabling them to handle multimodal inputs is essential. In addition, we also aim to extend DMPs' capability to handle object-focused tasks requiring one-shot complex motion generation, as observation occlusion could easily happen mid-execution in such tasks (e.g., knife occlusion in cake icing, hand occlusion in dough kneading, etc.). A promising approach is to leverage Vision-Language Models (VLMs), which process multimodal data and can grasp high-level concepts. However, they typically lack enough knowledge and capabilities to directly infer low-level motion details and instead only serve as a bridge between high-level instructions and low-level control. To address this limitation, we propose Keyword Labeled Primitive Selection and Keypoint Pairs Generation Guided Movement Primitives (KeyMPs), a framework that combines VLMs with sequencing of DMPs. KeyMPs use VLMs' high-level reasoning capability to select a reference primitive through \emph{keyword labeled primitive selection} and VLMs' spatial awareness to generate spatial scaling parameters used for sequencing DMPs by generalizing the overall motion through \emph{keypoint pairs generation}, which together enable one-shot vision-language guided motion generation that aligns with the intent expressed in the multimodal input. We validate our approach through experiments on two occlusion-rich tasks: object cutting, conducted in both simulated and real-world environments, and cake icing, performed in simulation. These evaluations demonstrate superior performance over other DMP-based methods that integrate VLM support.
comment: Published in IEEE Access, Jul 14 2025
Sampling-Based Model Predictive Control for Dexterous Manipulation on a Biomimetic Tendon-Driven Hand
Biomimetic and compliant robotic hands offer the potential for human-like dexterity, but controlling them is challenging due to high dimensionality, complex contact interactions, and uncertainties in state estimation. Sampling-based model predictive control (MPC), using a physics simulator as the dynamics model, is a promising approach for generating contact-rich behavior. However, sampling-based MPC has yet to be evaluated on physical (non-simulated) robotic hands, particularly on compliant hands with state uncertainties. We present the first successful demonstration of in-hand manipulation on a physical biomimetic tendon-driven robot hand using sampling-based MPC. While sampling-based MPC does not require lengthy training cycles like reinforcement learning approaches, it still necessitates adapting the task-specific objective function to ensure robust behavior execution on physical hardware. To adapt the objective function, we integrate a visual language model (VLM) with a real-time optimizer (MuJoCo MPC). We provide the VLM with a high-level human language description of the task and a video of the hand's current behavior. The VLM gradually adapts the objective function, allowing for efficient behavior generation, with each iteration taking less than two minutes. We show the feasibility of ball rolling, flipping, and catching using both simulated and physical robot hands. Our results demonstrate that sampling-based MPC is a promising approach for generating dexterous manipulation skills on biomimetic hands without extensive training cycles.
comment: For a video, see https://youtu.be/u4d6v3ohsOI
Dynamic-Dark SLAM: RGB-Thermal Cooperative Robot Vision Strategy for Multi-Person Tracking in Both Well-Lit and Low-Light Scenes
In robot vision, thermal cameras hold great potential for recognizing humans even in complete darkness. However, their application to multi-person tracking (MPT) has been limited due to data scarcity and the inherent difficulty of distinguishing individuals. In this study, we propose a cooperative MPT system that utilizes co-located RGB and thermal cameras, where pseudo-annotations (bounding boxes and person IDs) are used to train both RGB and thermal trackers. Evaluation experiments demonstrate that the thermal tracker performs robustly in both bright and dark environments. Moreover, the results suggest that a tracker-switching strategy -- guided by a binary brightness classifier -- is more effective for information integration than a tracker-fusion approach. As an application example, we present an image change pattern recognition (ICPR) method, the ``human-as-landmark,'' which combines two key properties: the thermal recognizability of humans in dark environments and the rich landmark characteristics -- appearance, geometry, and semantics -- of static objects (occluders). Whereas conventional SLAM focuses on mapping static landmarks in well-lit environments, the present study takes a first step toward a new Human-Only SLAM paradigm, ``Dynamic-Dark SLAM,'' which aims to map even dynamic landmarks in complete darkness. Additionally, this study demonstrates that knowledge transfer between thermal and depth modalities enables reliable person tracking using low-resolution 3D LiDAR data without RGB input, contributing an important advance toward cross-robot SLAM systems.
comment: 11 pages, 11 figures, technical report
Traffic Scene Generation from Natural Language Description for Autonomous Vehicles with Large Language Model
Generating realistic and controllable traffic scenes from natural language can greatly enhance the development and evaluation of autonomous driving systems. However, this task poses unique challenges: (1) grounding free-form text into spatially valid and semantically coherent layouts, (2) composing scenarios without predefined locations, and (3) planning multi-agent behaviors and selecting roads that respect agents' configurations. To address these, we propose a modular framework, TTSG, comprising prompt analysis, road retrieval, agent planning, and a novel plan-aware road ranking algorithm to solve these challenges. While large language models (LLMs) are used as general planners, our design integrates them into a tightly controlled pipeline that enforces structure, feasibility, and scene diversity. Notably, our ranking strategy ensures consistency between agent actions and road geometry, enabling scene generation without predefined routes or spawn points. The framework supports both routine and safety-critical scenarios, as well as multi-stage event composition. Experiments on SafeBench demonstrate that our method achieves the lowest average collision rate (3.5\%) across three critical scenarios. Moreover, driving captioning models trained on our generated scenes improve action reasoning by over 30 CIDEr points. These results underscore our proposed framework for flexible, interpretable, and safety-oriented simulation.
Investigating Robotaxi Crash Severity with Geographical Random Forest and the Urban Environment
This paper quantitatively investigates the crash severity of Autonomous Vehicles (AVs) with spatially localized machine learning and macroscopic measures of the urban built environment. Extending beyond the microscopic effects of individual infrastructure elements, we focus on the city-scale land use and behavioral patterns, while addressing spatial heterogeneity and spatial autocorrelation. We implemented a spatially localized machine learning technique called Geographical Random Forest (GRF) on the California AV collision dataset. Analyzing multiple urban measures, including points of interest, building footprint, and land use, we built a GRF model and visualized it as a crash severity risk map of San Francisco. This paper presents three findings. First, spatially localized machine learning outperformed regular machine learning in predicting AV crash severity. The bias-variance tradeoff was evident as we adjusted the localization weight hyperparameter. Second, land use was the most important predictor, compared to intersections, building footprints, public transit stops, and Points Of Interest (POIs). Third, AV crashes were more likely to result in low-severity incidents in city center areas with greater diversity and commercial activities, than in residential neighborhoods. Residential land use is likely associated with higher severity due to human behavior and less restrictive environments. Counterintuitively, residential areas were associated with higher crash severity, compared to more complex areas such as commercial and mixed-use areas. When robotaxi operators train their AV systems, it is recommended to: (1) consider where their fleet operates and make localized algorithms for their perception system, and (2) design safety measures specific to residential neighborhoods, such as slower driving speeds and more alert sensors.
AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
We explore how scalable robot data can address real-world challenges for generalized robotic manipulation. Introducing AgiBot World, a large-scale platform comprising over 1 million trajectories across 217 tasks in five deployment scenarios, we achieve an order-of-magnitude increase in data scale compared to existing datasets. Accelerated by a standardized collection pipeline with human-in-the-loop verification, AgiBot World guarantees high-quality and diverse data distribution. It is extensible from grippers to dexterous hands and visuo-tactile sensors for fine-grained skill acquisition. Building on top of data, we introduce Genie Operator-1 (GO-1), a novel generalist policy that leverages latent action representations to maximize data utilization, demonstrating predictable performance scaling with increased data volume. Policies pre-trained on our dataset achieve an average performance improvement of 30% over those trained on Open X-Embodiment, both in in-domain and out-of-distribution scenarios. GO-1 exhibits exceptional capability in real-world dexterous and long-horizon tasks, achieving over 60% success rate on complex tasks and outperforming prior RDT approach by 32%. By open-sourcing the dataset, tools, and models, we aim to democratize access to large-scale, high-quality robot data, advancing the pursuit of scalable and general-purpose intelligence.
comment: Project website: https://agibot-world.com/. Github repo: https://github.com/OpenDriveLab/AgiBot-World. The author list is ordered alphabetically by surname, with detailed contributions provided in the appendix
3D-CDRGP: Towards Cross-Device Robotic Grasping Policy in 3D Open World
Given the diversity of devices and the product upgrades, cross-device research has become an urgent issue that needs to be tackled. To this end, we pioneer in probing the cross-device (cameras & robotics) grasping policy in the 3D open world. Specifically, we construct two real-world grasping setups, employing robotic arms and cameras from completely different manufacturers. To minimize domain differences in point clouds from diverse cameras, we adopt clustering methods to generate 3D object proposals. However, existing clustering methods are limited to closed-set scenarios, which confines the robotic graspable object categories and ossifies the deployment scenarios. To extend these methods to open-world settings, we introduce the SSGC-Seg module that enables category-agnostic 3D object detection. The proposed module transforms the original multi-class semantic information into binary semantic cues-foreground and background by analyzing the SoftMax value of each point, and then clusters the foreground points based on geometric information to form initial object proposals. Furthermore, ScoreNet{\ddag} is designed to score each detection result, and the robotic arm prioritizes grasping the object with the highest confidence score. Experiments on two different types of setups highlight the effectiveness and robustness of our policy for cross-device robotics grasping research. Our code is provided in the supplementary and will be released upon acceptance.
P2 Explore: Efficient Exploration in Unknown Cluttered Environment with Floor Plan Prediction IROS 2025
Robot exploration aims at the reconstruction of unknown environments, and it is important to achieve it with shorter paths. Traditional methods focus on optimizing the visiting order of frontiers based on current observations, which may lead to local-minimal results. Recently, by predicting the structure of the unseen environment, the exploration efficiency can be further improved. However, in a cluttered environment, due to the randomness of obstacles, the ability to predict is weak. Moreover, this inaccuracy will lead to limited improvement in exploration. Therefore, we propose FPUNet which can be efficient in predicting the layout of noisy indoor environments. Then, we extract the segmentation of rooms and construct their topological connectivity based on the predicted map. The visiting order of these predicted rooms is optimized which can provide high-level guidance for exploration. The FPUNet is compared with other network architectures which demonstrates it is the SOTA method for this task. Extensive experiments in simulations show that our method can shorten the path length by 2.18% to 34.60% compared to the baselines.
comment: 7 pages, Accepted by IROS 2025, Open-sourced at https://github.com/KunSong-L/P2Explore
Friction-Aware Safety Locomotion for Wheeled-legged Robots using Vision Language Models and Reinforcement Learning
Controlling Wheeled-legged robots is challenging especially on slippery surfaces due to their dependence on continuous ground contact. Unlike quadrupeds or bipeds, which can leverage multiple fixed contact points for recovery, wheeled-legged robots are highly susceptible to slip, where even momentary loss of traction can result in irrecoverable instability. Anticipating ground physical properties such as friction before contact would allow proactive control adjustments, reducing slip risk. In this paper, we propose a friction-aware safety locomotion framework that integrates Vision-Language Models (VLMs) with a Reinforcement Learning (RL) policy. Our method employs a Retrieval-Augmented Generation (RAG) approach to estimate the Coefficient of Friction (CoF), which is then explicitly incorporated into the RL policy. This enables the robot to adapt its speed based on predicted friction conditions before contact. The framework is validated through experiments in both simulation and on a physical customized Wheeled Inverted Pendulum (WIP). Experimental results show that our approach successfully completes trajectory tracking tasks on slippery surfaces, whereas baseline methods relying solely on proprioceptive feedback fail. These findings highlight the importance and effectiveness of explicitly predicting and utilizing ground friction information for safe locomotion. They also point to a promising research direction in exploring the use of VLMs for estimating ground conditions, which remains a significant challenge for purely vision-based methods.
comment: Accepted to Humanoids 2025
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model IROS 2025
Omnidirectional depth perception is essential for mobile robotics applications that require scene understanding across a full 360{\deg} field of view. Camera-based setups offer a cost-effective option by using stereo depth estimation to generate dense, high-resolution depth maps without relying on expensive active sensing. However, existing omnidirectional stereo matching approaches achieve only limited depth accuracy across diverse environments, depth ranges, and lighting conditions, due to the scarcity of real-world data. We present DFI-OmniStereo, a novel omnidirectional stereo matching method that leverages a large-scale pre-trained foundation model for relative monocular depth estimation within an iterative optimization-based stereo matching architecture. We introduce a dedicated two-stage training strategy to utilize the relative monocular depth features for our omnidirectional stereo matching before scale-invariant fine-tuning. DFI-OmniStereo achieves state-of-the-art results on the real-world Helvipad dataset, reducing disparity MAE by approximately 16% compared to the previous best omnidirectional stereo method.
comment: Accepted at IROS 2025. Project page: https://vita-epfl.github.io/DFI-OmniStereo-website/
OWLed: Outlier-weighed Layerwise Pruning for Efficient Autonomous Driving Framework IJCNN2025
The integration of Large Language Models (LLMs) into autonomous driving systems offers promising enhancements in environmental understanding and decision-making. However, the substantial computational demands of deploying LLMs locally on vehicles render this approach unfeasible for real-world automotive applications. To address this challenge, we introduce OWLed, the Outlier-Weighed Layerwise Pruning for Efficient Autonomous Driving Framework that leverages outlier-weighted layerwise sparsity for model compression. Our method assigns non-uniform sparsity ratios to different layers based on the distribution of outlier features, significantly reducing the model size without the need for fine-tuning. To ensure the compressed model adapts well to autonomous driving tasks, we incorporate driving environment data into both the calibration and pruning processes. Our empirical studies reveal that the encoder component is more sensitive to pruning than the LLM, highlighting its critical role in the system. Experimental results demonstrate that OWLed outperforms existing methods in perception, action prediction, and language understanding while substantially lowering computational requirements. These findings underscore the potential of combining advanced pruning techniques with LLMs to develop efficient and robust autonomous driving systems capable of handling complex scenarios. Code will be made publicly available.
comment: IJCNN2025
MonoForce: Self-supervised Learning of Physics-informed Model for Predicting Robot-terrain Interaction IROS 2024
While autonomous navigation of mobile robots on rigid terrain is a well-explored problem, navigating on deformable terrain such as tall grass or bushes remains a challenge. To address it, we introduce an explainable, physics-aware and end-to-end differentiable model which predicts the outcome of robot-terrain interaction from camera images, both on rigid and non-rigid terrain. The proposed MonoForce model consists of a black-box module which predicts robot-terrain interaction forces from onboard cameras, followed by a white-box module, which transforms these forces and a control signals into predicted trajectories, using only the laws of classical mechanics. The differentiable white-box module allows backpropagating the predicted trajectory errors into the black-box module, serving as a self-supervised loss that measures consistency between the predicted forces and ground-truth trajectories of the robot. Experimental evaluation on a public dataset and our data has shown that while the prediction capabilities are comparable to state-of-the-art algorithms on rigid terrain, MonoForce shows superior accuracy on non-rigid terrain such as tall grass or bushes. To facilitate the reproducibility of our results, we release both the code and datasets.
comment: Accepted for IEEE IROS 2024. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
16 Ways to Gallop: Energetics and Body Dynamics of High-Speed Quadrupedal Gaits IROS 2025
Galloping is a common high-speed gait in both animals and quadrupedal robots, yet its energetic characteristics remain insufficiently explored. This study systematically analyzes a large number of possible galloping gaits by categorizing them based on the number of flight phases per stride and the phase relationships between the front and rear legs, following Hildebrand's framework for asymmetrical gaits. Using the A1 quadrupedal robot from Unitree, we model galloping dynamics as a hybrid dynamical system and employ trajectory optimization (TO) to minimize the cost of transport (CoT) across a range of speeds. Our results reveal that rotary and transverse gallop footfall sequences exhibit no fundamental energetic difference, despite variations in body yaw and roll motion. However, the number of flight phases significantly impacts energy efficiency: galloping with no flight phases is optimal at lower speeds, whereas galloping with two flight phases minimizes energy consumption at higher speeds. We validate these findings using a quadratic programming (QP)-based controller, developed in our previous work, in Gazebo simulations. These insights advance the understanding of quadrupedal locomotion energetics and may inform future legged robot designs for adaptive, energy-efficient gait transitions.
comment: 7 pages, 6 figures, Accepted for IROS 2025
Multiagent Systems
What Is Your AI Agent Buying? Evaluation, Implications and Emerging Questions for Agentic E-Commerce
Online marketplaces will be transformed by autonomous AI agents acting on behalf of consumers. Rather than humans browsing and clicking, vision-language-model (VLM) agents can parse webpages, evaluate products, and transact. This raises a fundamental question: what do AI agents buy, and why? We develop ACES, a sandbox environment that pairs a platform-agnostic VLM agent with a fully programmable mock marketplace to study this question. We first conduct basic rationality checks in the context of simple tasks, and then, by randomizing product positions, prices, ratings, reviews, sponsored tags, and platform endorsements, we obtain causal estimates of how frontier VLMs actually shop. Models show strong but heterogeneous position effects: all favor the top row, yet different models prefer different columns, undermining the assumption of a universal "top" rank. They penalize sponsored tags and reward endorsements. Sensitivities to price, ratings, and reviews are directionally human-like but vary sharply in magnitude across models. Motivated by scenarios where sellers use AI agents to optimize product listings, we show that a seller-side agent that makes minor tweaks to product descriptions, targeting AI buyer preferences, can deliver substantial market-share gains if AI-mediated shopping dominates. We also find that modal product choices can differ across models and, in some cases, demand may concentrate on a few select products, raising competition questions. Together, our results illuminate how AI agents may behave in e-commerce settings and surface concrete seller strategy, platform design, and regulatory questions in an AI-mediated ecosystem.
HealthFlow: A Self-Evolving AI Agent with Meta Planning for Autonomous Healthcare Research
The efficacy of AI agents in healthcare research is hindered by their reliance on static, predefined strategies. This creates a critical limitation: agents can become better tool-users but cannot learn to become better strategic planners, a crucial skill for complex domains like healthcare. We introduce HealthFlow, a self-evolving AI agent that overcomes this limitation through a novel meta-level evolution mechanism. HealthFlow autonomously refines its own high-level problem-solving policies by distilling procedural successes and failures into a durable, strategic knowledge base. To anchor our research and facilitate reproducible evaluation, we introduce EHRFlowBench, a new benchmark featuring complex, realistic health data analysis tasks derived from peer-reviewed clinical research. Our comprehensive experiments demonstrate that HealthFlow's self-evolving approach significantly outperforms state-of-the-art agent frameworks. This work marks a necessary shift from building better tool-users to designing smarter, self-evolving task-managers, paving the way for more autonomous and effective AI for scientific discovery.
comment: Code: https://github.com/yhzhu99/HealthFlow
AIAP: A No-Code Workflow Builder for Non-Experts with Natural Language and Multi-Agent Collaboration
While many tools are available for designing AI, non-experts still face challenges in clearly expressing their intent and managing system complexity. We introduce AIAP, a no-code platform that integrates natural language input with visual workflows. AIAP leverages a coordinated multi-agent system to decompose ambiguous user instructions into modular, actionable steps, hidden from users behind a unified interface. A user study involving 32 participants showed that AIAP's AI-generated suggestions, modular workflows, and automatic identification of data, actions, and context significantly improved participants' ability to develop services intuitively. These findings highlight that natural language-based visual programming significantly reduces barriers and enhances user experience in AI service design.
comment: 14 pages, 6 figures
Emergence of Fair Leaders via Mediators in Multi-Agent Reinforcement Learning ECAI 2025
Stackelberg games and their resulting equilibria have received increasing attention in the multi-agent reinforcement learning literature. Each stage of a traditional Stackelberg game involves a leader(s) acting first, followed by the followers. In situations where the roles of leader(s) and followers can be interchanged, the designated role can have considerable advantages, for example, in first-mover advantage settings. Then the question arises: Who should be the leader and when? A bias in the leader selection process can lead to unfair outcomes. This problem is aggravated if the agents are self-interested and care only about their goals and rewards. We formally define this leader selection problem and show its relation to fairness in agents' returns. Furthermore, we propose a multi-agent reinforcement learning framework that maximizes fairness by integrating mediators. Mediators have previously been used in the simultaneous action setting with varying levels of control, such as directly performing agents' actions or just recommending them. Our framework integrates mediators in the Stackelberg setting with minimal control (leader selection). We show that the presence of mediators leads to self-interested agents taking fair actions, resulting in higher overall fairness in agents' returns.
comment: Accepted to ECAI 2025
Distributed Non-Uniform Scaling Control of Multi-Agent Formation via Matrix-Valued Constraints
Distributed formation maneuver control refers to the problem of maneuvering a group of agents to change their formation shape by adjusting the motions of partial agents, where the controller of each agent only requires local information measured from its neighbors. Although this problem has been extensively investigated, existing approaches are mostly limited to uniform scaling transformations. This article proposes a new type of local matrix-valued constraints, via which non-uniform scaling control of position formation can be achieved by tuning the positions of only two agents (i.e., leaders). Here, the non-uniform scaling transformation refers to scaling the position formation with different ratios along different orthogonal coordinate directions. Moreover, by defining scaling and translation of attitude formation, we propose a distributed control scheme for scaling and translation maneuver control of joint position-attitude formations. It is proven that the proposed controller achieves global convergence, provided that the sensing graph among agents is a 2-rooted bidirectional graph. Compared with the affine formation maneuver control approach, the proposed approach leverages a sparser sensing graph, requires fewer leaders, and additionally enables scaling transformations of the attitude formation. A simulation example is proposed to demonstrate our theoretical results.
A Survey on AgentOps: Categorization, Challenges, and Future Directions
As the reasoning capabilities of Large Language Models (LLMs) continue to advance, LLM-based agent systems offer advantages in flexibility and interpretability over traditional systems, garnering increasing attention. However, despite the widespread research interest and industrial application of agent systems, these systems, like their traditional counterparts, frequently encounter anomalies. These anomalies lead to instability and insecurity, hindering their further development. Therefore, a comprehensive and systematic approach to the operation and maintenance of agent systems is urgently needed. Unfortunately, current research on the operations of agent systems is sparse. To address this gap, we have undertaken a survey on agent system operations with the aim of establishing a clear framework for the field, defining the challenges, and facilitating further development. Specifically, this paper begins by systematically defining anomalies within agent systems, categorizing them into intra-agent anomalies and inter-agent anomalies. Next, we introduce a novel and comprehensive operational framework for agent systems, dubbed Agent System Operations (AgentOps). We provide detailed definitions and explanations of its four key stages: monitoring, anomaly detection, root cause analysis, and resolution.
comment: 35 pages
A Group Consensus-Driven Auction Algorithm for Cooperative Task Allocation Among Heterogeneous Multi-Agents
In scenarios like automated warehouses, assigning tasks to robots presents a heterogeneous multi-task and multi-agent task allocation problem. However, existing task allocation study ignores the integration of multi-task and multi-attribute agent task allocation with heterogeneous task allocation. In addition, current algorithms are limited by scenario constraints and can incur significant errors in specific contexts. Therefore, this study proposes a distributed heterogeneous multi-task and multi-agent task allocation algorithm with a time window, called group consensus-based heterogeneous auction (GCBHA). Firstly, this method decomposes tasks that exceed the capability of a single Agent into subtasks that can be completed by multiple independent agents. And then groups similar or adjacent tasks through a heuristic clustering method to reduce the time required to reach a consensus. Subsequently, the task groups are allocated to agents that meet the conditions through an auction process. Furthermore, the method evaluates the task path cost distance based on the scenario, which can calculate the task cost more accurately. The experimental results demonstrate that GCBHA performs well in terms of task allocation time and solution quality, with a significant reduction in the error rate between predicted task costs and actual costs.
Online Robust Multi-Agent Reinforcement Learning under Model Uncertainties
Well-trained multi-agent systems can fail when deployed in real-world environments due to model mismatches between the training and deployment environments, caused by environment uncertainties including noise or adversarial attacks. Distributionally Robust Markov Games (DRMGs) enhance system resilience by optimizing for worst-case performance over a defined set of environmental uncertainties. However, current methods are limited by their dependence on simulators or large offline datasets, which are often unavailable. This paper pioneers the study of online learning in DRMGs, where agents learn directly from environmental interactions without prior data. We introduce the {\it Robust Optimistic Nash Value Iteration (RONAVI)} algorithm and provide the first provable guarantees for this setting. Our theoretical analysis demonstrates that the algorithm achieves low regret and efficiently finds the optimal robust policy for uncertainty sets measured by Total Variation divergence and Kullback-Leibler divergence. These results establish a new, practical path toward developing truly robust multi-agent systems.
Engineered over Emergent Communication in MARL for Scalable and Sample-Efficient Cooperative Task Allocation in a Partially Observable Grid
We compare the efficacy of learned versus engineered communication strategies in a cooperative multi-agent reinforcement learning (MARL) environment. For the learned approach, we introduce Learned Direct Communication (LDC), where agents generate messages and actions concurrently via a neural network. Our engineered approach, Intention Communication, employs an Imagined Trajectory Generation Module (ITGM) and a Message Generation Network (MGN) to formulate messages based on predicted future states. Both strategies are evaluated on their success rates in cooperative tasks under fully and partially observable conditions. Our findings indicate that while emergent communication is viable, the engineered approach demonstrates superior performance and scalability, particularly as environmental complexity increases.
TransAM: Transformer-Based Agent Modeling for Multi-Agent Systems via Local Trajectory Encoding
Agent modeling is a critical component in developing effective policies within multi-agent systems, as it enables agents to form beliefs about the behaviors, intentions, and competencies of others. Many existing approaches assume access to other agents' episodic trajectories, a condition often unrealistic in real-world applications. Consequently, a practical agent modeling approach must learn a robust representation of the policies of the other agents based only on the local trajectory of the controlled agent. In this paper, we propose \texttt{TransAM}, a novel transformer-based agent modeling approach to encode local trajectories into an embedding space that effectively captures the policies of other agents. We evaluate the performance of the proposed method in cooperative, competitive, and mixed multi-agent environments. Extensive experimental results demonstrate that our approach generates strong policy representations, improves agent modeling, and leads to higher episodic returns.
Physiological Signal-Driven QoE Optimization for Wireless Virtual Reality Transmission
Abrupt resolution changes in virtual reality (VR) streaming can significantly impair the quality-of-experience (QoE) of users, particularly during transitions from high to low resolutions. Existing QoE models and transmission schemes inadequately address the perceptual impact of these shifts. To bridge this gap, this article proposes, for the first time, an innovative physiological signal-driven QoE modeling and optimization framework that fully leverages users' electroencephalogram (EEG), electrocardiogram (ECG), and skin activity signals. This framework precisely captures the temporal dynamics of physiological responses and resolution changes in VR streaming, enabling accurate quantification of resolution upgrades' benefits and downgrades' impacts. Integrated the proposed QoE framework into the radio access network (RAN) via a deep reinforcement learning (DRL) framework, adaptive transmission strategies have been implemented to allocate radio resources dynamically, which mitigates short-term channel fluctuations and adjusts frame resolution in response to channel variations caused by user mobility. By prioritizing long-term resolution while minimizing abrupt transitions, the proposed solution achieves an 88.7\% improvement in resolution and an 81.0\% reduction in handover over the baseline. Experimental results demonstrate the effectiveness of this physiological signal-driven strategy, underscoring the promise of edge AI in immersive media services.
comment: 7 pages, 6 figures
Bearing-Distance Flocking with Zone-Based Interactions in Constrained Dynamic Environments
This paper presents a novel zone-based flocking control approach suitable for dynamic multi-agent systems (MAS). Inspired by Reynolds behavioral rules for $boids$, flocking behavioral rules with the zones of repulsion, conflict, attraction, and surveillance are introduced. For each agent, using only bearing and distance measurements, behavioral contribution vectors quantify the local separation, local and global flock velocity alignment, local cohesion, obstacle avoidance and boundary conditions, and strategic separation for avoiding alien agents. The control strategy uses the local perception-based behavioral contribution vectors to guide each agent's motion. Additionally, the control strategy incorporates a directionally aware obstacle avoidance mechanism that prioritizes obstacles in the agent's forward path. Simulation results validate the effectiveness of the model in creating flexible, adaptable, and scalable flocking behavior. Asymptotic stability and convergence to a stable flocking configuration for any initial conditions provided the interaction graph is a spanning tree are demonstrated. The flocking model's reliance on locally sensed bearing and distance measurements ensures scalability and robustness, particularly in scenarios where communication is unreliable or resource-intensive. This makes it well-suited for real-world applications demanding seamless operation in highly dynamic and distributed environments.
comment: Video for Figure 4: https://youtu.be/5nSml5F2oQk?si=X0q51SXcZiTRBdFS Video for Figure 6 (2D): https://youtu.be/ANkAZ9FMq0o?si=px8oSAeKkUwBR7uf Video for Figure 6 (3D): https://youtu.be/AUlcCH7P73U?si=QFYYeCGLIGdsJCad
Distributed AI Agents for Cognitive Underwater Robot Autonomy
Achieving robust cognitive autonomy in robots navigating complex, unpredictable environments remains a fundamental challenge in robotics. This paper presents Underwater Robot Self-Organizing Autonomy (UROSA), a groundbreaking architecture leveraging distributed Large Language Model AI agents integrated within the Robot Operating System 2 (ROS 2) framework to enable advanced cognitive capabilities in Autonomous Underwater Vehicles. UROSA decentralises cognition into specialised AI agents responsible for multimodal perception, adaptive reasoning, dynamic mission planning, and real-time decision-making. Central innovations include flexible agents dynamically adapting their roles, retrieval-augmented generation utilising vector databases for efficient knowledge management, reinforcement learning-driven behavioural optimisation, and autonomous on-the-fly ROS 2 node generation for runtime functional extensibility. Extensive empirical validation demonstrates UROSA's promising adaptability and reliability through realistic underwater missions in simulation and real-world deployments, showing significant advantages over traditional rule-based architectures in handling unforeseen scenarios, environmental uncertainties, and novel mission objectives. This work not only advances underwater autonomy but also establishes a scalable, safe, and versatile cognitive robotics framework capable of generalising to a diverse array of real-world applications.
Operator: A Protocol for Trustless Delegation Under Uncertainty
Correctness is an emergent property of systems where exposing error is cheaper than committing it. In dynamic, low-trust environments, autonomous AI agents benefit from delegating work to sub-agents, yet correctness cannot be assured through upfront specification or centralized oversight. We propose a protocol that enforces correctness through collateralized claims in a recursive verification game. Tasks are published as intents, and solvers compete to fulfill them. Selected solvers carry out tasks under risk, with correctness checked post hoc by verifiers. Any challenger can challenge a result by staking against it to trigger the verification process. Incorrect agents are slashed and correct opposition is rewarded, with an escalation path that penalizes erroneous verifiers themselves. When incentives are aligned across solvers, challengers, and verifiers, falsification conditions make correctness the Nash equilibrium.
comment: 9 pages, 1 figure
GEMA-Score: Granular Explainable Multi-Agent Scoring Framework for Radiology Report Evaluation
Automatic medical report generation has the potential to support clinical diagnosis, reduce the workload of radiologists, and demonstrate potential for enhancing diagnostic consistency. However, current evaluation metrics often fail to reflect the clinical reliability of generated reports. Early overlap-based methods focus on textual matches between predicted and ground-truth entities but miss fine-grained clinical details (e.g., anatomical location, severity). Some diagnostic metrics are limited by fixed vocabularies or templates, reducing their ability to capture diverse clinical expressions. LLM-based approaches further lack interpretable reasoning steps, making it hard to assess or trust their behavior in safety-critical settings. These limitations hinder the comprehensive assessment of the reliability of generated reports and pose risks in their selection for clinical use. Therefore, we propose a Granular Explainable Multi-Agent Score (GEMA-Score) in this paper, which conducts both objective quantification and subjective evaluation through a large language model-based multi-agent workflow. Our GEMA-Score parses structured reports and employs stable calculations through interactive exchanges of information among agents to assess disease diagnosis, location, severity, and uncertainty. Additionally, an LLM-based scoring agent evaluates completeness, readability, and clinical terminology while providing explanatory feedback. Extensive experiments validate that GEMA-Score achieves the highest correlation with human expert evaluations on a public dataset, demonstrating its effectiveness in clinical scoring (Kendall coefficient = $0.69$ for ReXVal dataset and Kendall coefficient = $0.45$ for RadEvalX dataset). The anonymous project demo is available at: https://github.com/Zhenxuan-Zhang/GEMA_score.
Robotics
ROVER: Recursive Reasoning Over Videos with Vision-Language Models for Embodied Tasks
Vision-language models (VLMs) have exhibited impressive capabilities across diverse image understanding tasks, but still struggle in settings that require reasoning over extended sequences of camera frames from a video. This limits their utility in embodied settings, which require reasoning over long frame sequences from a continuous stream of visual input at each moment of a task attempt. To address this limitation, we propose ROVER (Reasoning Over VidEo Recursively), a framework that enables the model to recursively decompose long-horizon video trajectories into segments corresponding to shorter subtasks within the trajectory. In doing so, ROVER facilitates more focused and accurate reasoning over temporally localized frame sequences without losing global context. We evaluate ROVER, implemented using an in-context learning approach, on diverse OpenX Embodiment videos and on a new dataset derived from RoboCasa that consists of 543 videos showing both expert and perturbed non-expert trajectories across 27 robotic manipulation tasks. ROVER outperforms strong baselines across three video reasoning tasks: task progress estimation, frame-level natural language reasoning, and video question answering. We observe that, by reducing the number of frames the model reasons over at each timestep, ROVER mitigates hallucinations, especially during unexpected or non-optimal moments of a trajectory. In addition, by enabling the implementation of a subtask-specific sliding context window, ROVER's time complexity scales linearly with video length, an asymptotic improvement over baselines. Demos, code, and data available at: https://rover-vlm.github.io
CVD-SfM: A Cross-View Deep Front-end Structure-from-Motion System for Sparse Localization in Multi-Altitude Scenes
We present a novel multi-altitude camera pose estimation system, addressing the challenges of robust and accurate localization across varied altitudes when only considering sparse image input. The system effectively handles diverse environmental conditions and viewpoint variations by integrating the cross-view transformer, deep features, and structure-from-motion into a unified framework. To benchmark our method and foster further research, we introduce two newly collected datasets specifically tailored for multi-altitude camera pose estimation; datasets of this nature remain rare in the current literature. The proposed framework has been validated through extensive comparative analyses on these datasets, demonstrating that our system achieves superior performance in both accuracy and robustness for multi-altitude sparse pose estimation tasks compared to existing solutions, making it well suited for real-world robotic applications such as aerial navigation, search and rescue, and automated inspection.
Beyond Simulation: Benchmarking World Models for Planning and Causality in Autonomous Driving ICRA 2025
World models have become increasingly popular in acting as learned traffic simulators. Recent work has explored replacing traditional traffic simulators with world models for policy training. In this work, we explore the robustness of existing metrics to evaluate world models as traffic simulators to see if the same metrics are suitable for evaluating a world model as a pseudo-environment for policy training. Specifically, we analyze the metametric employed by the Waymo Open Sim-Agents Challenge (WOSAC) and compare world model predictions on standard scenarios where the agents are fully or partially controlled by the world model (partial replay). Furthermore, since we are interested in evaluating the ego action-conditioned world model, we extend the standard WOSAC evaluation domain to include agents that are causal to the ego vehicle. Our evaluations reveal a significant number of scenarios where top-ranking models perform well under no perturbation but fail when the ego agent is forced to replay the original trajectory. To address these cases, we propose new metrics to highlight the sensitivity of world models to uncontrollable objects and evaluate the performance of world models as pseudo-environments for policy training and analyze some state-of-the-art world models under these new metrics.
comment: Accepted ICRA 2025
L3M+P: Lifelong Planning with Large Language Models
By combining classical planning methods with large language models (LLMs), recent research such as LLM+P has enabled agents to plan for general tasks given in natural language. However, scaling these methods to general-purpose service robots remains challenging: (1) classical planning algorithms generally require a detailed and consistent specification of the environment, which is not always readily available; and (2) existing frameworks mainly focus on isolated planning tasks, whereas robots are often meant to serve in long-term continuous deployments, and therefore must maintain a dynamic memory of the environment which can be updated with multi-modal inputs and extracted as planning knowledge for future tasks. To address these two issues, this paper introduces L3M+P (Lifelong LLM+P), a framework that uses an external knowledge graph as a representation of the world state. The graph can be updated from multiple sources of information, including sensory input and natural language interactions with humans. L3M+P enforces rules for the expected format of the absolute world state graph to maintain consistency between graph updates. At planning time, given a natural language description of a task, L3M+P retrieves context from the knowledge graph and generates a problem definition for classical planners. Evaluated on household robot simulators and on a real-world service robot, L3M+P achieves significant improvement over baseline methods both on accurately registering natural language state changes and on correctly generating plans, thanks to the knowledge graph retrieval and verification.
MUTE-DSS: A Digital-Twin-Based Decision Support System for Minimizing Underwater Radiated Noise in Ship Voyage Planning
We present a novel MUTE-DSS, a digital-twin-based decision support system for minimizing underwater radiated noise (URN) during ship voyage planning. It is a ROS2-centric framework that integrates state-of-the-art acoustic models combining a semi-empirical reference spectrum for near-field modeling with 3D ray tracing for propagation losses for far-field modeling, offering real-time computation of the ship noise signature, alongside a data-driven Southern resident killer whale distribution model. The proposed DSS performs a two-stage optimization pipeline: Batch Informed Trees for collision-free ship routing and a genetic algorithm for adaptive ship speed profiling under voyage constraints that minimizes cumulative URN exposure to marine mammals. The effectiveness of MUTE-DSS is demonstrated through case studies of ships operating between the Strait of Georgia and the Strait of Juan de Fuca, comparing optimized voyages against baseline trajectories derived from automatic identification system data. Results show substantial reductions in noise exposure level, up to 7.14 dB, corresponding to approximately an 80.68% reduction in a simplified scenario, and an average 4.90 dB reduction, corresponding to approximately a 67.6% reduction in a more realistic dynamic setting. These results illustrate the adaptability and practical utility of the proposed decision support system.
A Simple Algebraic Solution for Estimating the Pose of a Camera from Planar Point Features IROS 2025
This paper presents a simple algebraic method to estimate the pose of a camera relative to a planar target from $n \geq 4$ reference points with known coordinates in the target frame and their corresponding bearing measurements in the camera frame. The proposed approach follows a hierarchical structure; first, the unit vector normal to the target plane is determined, followed by the camera's position vector, its distance to the target plane, and finally, the full orientation. To improve the method's robustness to measurement noise, an averaging methodology is introduced to refine the estimation of the target's normal direction. The accuracy and robustness of the approach are validated through extensive experiments.
comment: 10 pages, 6 figures. To appear in IEEE/RSJ IROS 2025
Exploring environment exploitation for self-reconfiguration in modular robotics
Modular robotics research has long been preoccupied with perfecting the modules themselves -- their actuation methods, connectors, controls, communication, and fabrication. This inward focus results, in part, from the complexity of the task and largely confines modular robots to sterile laboratory settings. The latest generation of truss modular robots, such as the Variable Topology Truss and the Truss Link, have begun to focus outward and reveal a key insight: the environment is not just a backdrop; it is a tool. In this work, we shift the paradigm from building better robots to building better robot environment interactions for modular truss robots. We study how modular robots can effectively exploit their surroundings to achieve faster locomotion, adaptive self-reconfiguration, and complex three-dimensional assembly from simple two-dimensional robot assemblies. By using environment features -- ledges, gaps, and slopes -- we show how the environment can extend the robots' capabilities. Nature has long mastered this principle: organisms not only adapt, but exploit their environments to their advantage. Robots must learn to do the same. This study is a step towards modular robotic systems that transcend their limitations by exploiting environmental features.
Unraveling the Connection: How Cognitive Workload Shapes Intent Recognition in Robot-Assisted Surgery
Robot-assisted surgery has revolutionized the healthcare industry by providing surgeons with greater precision, reducing invasiveness, and improving patient outcomes. However, the success of these surgeries depends heavily on the robotic system ability to accurately interpret the intentions of the surgical trainee or even surgeons. One critical factor impacting intent recognition is the cognitive workload experienced during the procedure. In our recent research project, we are building an intelligent adaptive system to monitor cognitive workload and improve learning outcomes in robot-assisted surgery. The project will focus on achieving a semantic understanding of surgeon intents and monitoring their mental state through an intelligent multi-modal assistive framework. This system will utilize brain activity, heart rate, muscle activity, and eye tracking to enhance intent recognition, even in mentally demanding situations. By improving the robotic system ability to interpret the surgeons intentions, we can further enhance the benefits of robot-assisted surgery and improve surgery outcomes.
Exploring Stiffness Gradient Effects in Magnetically Induced Metamorphic Materials via Continuum Simulation and Validation IROS 2025
Magnetic soft continuum robots are capable of bending with remote control in confined space environments, and they have been applied in various bioengineering contexts. As one type of ferromagnetic soft continuums, the Magnetically Induced Metamorphic Materials (MIMMs)-based continuum (MC) exhibits similar bending behaviors. Based on the characteristics of its base material, MC is flexible in modifying unit stiffness and convenient in molding fabrication. However, recent studies on magnetic continuum robots have primarily focused on one or two design parameters, limiting the development of a comprehensive magnetic continuum bending model. In this work, we constructed graded-stiffness MCs (GMCs) and developed a numerical model for GMCs' bending performance, incorporating four key parameters that determine their performance. The simulated bending results were validated with real bending experiments in four different categories: varying magnetic field, cross-section, unit stiffness, and unit length. The graded-stiffness design strategy applied to GMCs prevents sharp bending at the fixed end and results in a more circular curvature. We also trained an expansion model for GMCs' bending performance that is highly efficient and accurate compared to the simulation process. An extensive library of bending prediction for GMCs was built using the trained model.
comment: Accepted to IROS 2025
Learning to Perform Low-Contact Autonomous Nasotracheal Intubation by Recurrent Action-Confidence Chunking with Transformer IROS 2025
Nasotracheal intubation (NTI) is critical for establishing artificial airways in clinical anesthesia and critical care. Current manual methods face significant challenges, including cross-infection, especially during respiratory infection care, and insufficient control of endoluminal contact forces, increasing the risk of mucosal injuries. While existing studies have focused on automated endoscopic insertion, the automation of NTI remains unexplored despite its unique challenges: Nasotracheal tubes exhibit greater diameter and rigidity than standard endoscopes, substantially increasing insertion complexity and patient risks. We propose a novel autonomous NTI system with two key components to address these challenges. First, an autonomous NTI system is developed, incorporating a prosthesis embedded with force sensors, allowing for safety assessment and data filtering. Then, the Recurrent Action-Confidence Chunking with Transformer (RACCT) model is developed to handle complex tube-tissue interactions and partial visual observations. Experimental results demonstrate that the RACCT model outperforms the ACT model in all aspects and achieves a 66% reduction in average peak insertion force compared to manual operations while maintaining equivalent success rates. This validates the system's potential for reducing infection risks and improving procedural safety.
comment: Accepted to IROS 2025
DiffSemanticFusion: Semantic Raster BEV Fusion for Autonomous Driving via Online HD Map Diffusion
Autonomous driving requires accurate scene understanding, including road geometry, traffic agents, and their semantic relationships. In online HD map generation scenarios, raster-based representations are well-suited to vision models but lack geometric precision, while graph-based representations retain structural detail but become unstable without precise maps. To harness the complementary strengths of both, we propose DiffSemanticFusion -- a fusion framework for multimodal trajectory prediction and planning. Our approach reasons over a semantic raster-fused BEV space, enhanced by a map diffusion module that improves both the stability and expressiveness of online HD map representations. We validate our framework on two downstream tasks: trajectory prediction and planning-oriented end-to-end autonomous driving. Experiments on real-world autonomous driving benchmarks, nuScenes and NAVSIM, demonstrate improved performance over several state-of-the-art methods. For the prediction task on nuScenes, we integrate DiffSemanticFusion with the online HD map informed QCNet, achieving a 5.1\% performance improvement. For end-to-end autonomous driving in NAVSIM, DiffSemanticFusion achieves state-of-the-art results, with a 15\% performance gain in NavHard scenarios. In addition, extensive ablation and sensitivity studies show that our map diffusion module can be seamlessly integrated into other vector-based approaches to enhance performance. All artifacts are available at https://github.com/SunZhigang7/DiffSemanticFusion.
Set the Stage: Enabling Storytelling with Multiple Robots through Roleplaying Metaphors
Gestures are an expressive input modality for controlling multiple robots, but their use is often limited by rigid mappings and recognition constraints. To move beyond these limitations, we propose roleplaying metaphors as a scaffold for designing richer interactions. By introducing three roles: Director, Puppeteer, and Wizard, we demonstrate how narrative framing can guide the creation of diverse gesture sets and interaction styles. These roles enable a variety of scenarios, showing how roleplay can unlock new possibilities for multi-robot systems. Our approach emphasizes creativity, expressiveness, and intuitiveness as key elements for future human-robot interaction design.
comment: 3 pages, 2 figures, UIST Poster, adjunct proceedings
OpenMap: Instruction Grounding via Open-Vocabulary Visual-Language Mapping ACM MM '25
Grounding natural language instructions to visual observations is fundamental for embodied agents operating in open-world environments. Recent advances in visual-language mapping have enabled generalizable semantic representations by leveraging vision-language models (VLMs). However, these methods often fall short in aligning free-form language commands with specific scene instances, due to limitations in both instance-level semantic consistency and instruction interpretation. We present OpenMap, a zero-shot open-vocabulary visual-language map designed for accurate instruction grounding in navigation tasks. To address semantic inconsistencies across views, we introduce a Structural-Semantic Consensus constraint that jointly considers global geometric structure and vision-language similarity to guide robust 3D instance-level aggregation. To improve instruction interpretation, we propose an LLM-assisted Instruction-to-Instance Grounding module that enables fine-grained instance selection by incorporating spatial context and expressive target descriptions. We evaluate OpenMap on ScanNet200 and Matterport3D, covering both semantic mapping and instruction-to-target retrieval tasks. Experimental results show that OpenMap outperforms state-of-the-art baselines in zero-shot settings, demonstrating the effectiveness of our method in bridging free-form language and 3D perception for embodied navigation.
comment: ACM MM '25
Towards Zero-Shot Terrain Traversability Estimation: Challenges and Opportunities
Terrain traversability estimation is crucial for autonomous robots, especially in unstructured environments where visual cues and reasoning play a key role. While vision-language models (VLMs) offer potential for zero-shot estimation, the problem remains inherently ill-posed. To explore this, we introduce a small dataset of human-annotated water traversability ratings, revealing that while estimations are subjective, human raters still show some consensus. Additionally, we propose a simple pipeline that integrates VLMs for zero-shot traversability estimation. Our experiments reveal mixed results, suggesting that current foundation models are not yet suitable for practical deployment but provide valuable insights for further research.
comment: Accepted and presented at the 1st German Robotics Conference (GRC); March 13-15, 2025, Nuremberg, Germany https://ras.papercept.net/conferences/conferences/GRC25/program/GRC25_ContentListWeb_3.html#sada_48
Dynamic Robot-Assisted Surgery with Hierarchical Class-Incremental Semantic Segmentation
Robot-assisted surgeries rely on accurate and real-time scene understanding to safely guide surgical instruments. However, segmentation models trained on static datasets face key limitations when deployed in these dynamic and evolving surgical environments. Class-incremental semantic segmentation (CISS) allows models to continually adapt to new classes while avoiding catastrophic forgetting of prior knowledge, without training on previous data. In this work, we build upon the recently introduced Taxonomy-Oriented Poincar\'e-regularized Incremental Class Segmentation (TOPICS) approach and propose an enhanced variant, termed TOPICS+, specifically tailored for robust segmentation of surgical scenes. Concretely, we incorporate the Dice loss into the hierarchical loss formulation to handle strong class imbalances, introduce hierarchical pseudo-labeling, and design tailored label taxonomies for robotic surgery environments. We also propose six novel CISS benchmarks designed for robotic surgery environments including multiple incremental steps and several semantic categories to emulate realistic class-incremental settings in surgical environments. In addition, we introduce a refined set of labels with more than 144 classes on the Syn-Mediverse synthetic dataset, hosted online as an evaluation benchmark. We make the code and trained models publicly available at http://topics.cs.uni-freiburg.de.
DexReMoE:In-hand Reorientation of General Object via Mixtures of Experts
In hand object reorientation provides capability for dexterous manipulation, requiring robust control policies to manage diverse object geometries, maintain stable grasps, and execute precise complex orientation trajectories. However, prior works focus on single objects or simple geometries and struggle to generalize to complex shapes. In this work, we introduce DexReMoE (Dexterous Reorientation Mixture-of-Experts), in which multiple expert policies are trained for different complex shapes and integrated within a Mixture-of-Experts (MoE) framework, making the approach capable of generalizing across a wide range of objects. Additionally, we incorporate object category information as privileged inputs to enhance shape representation. Our framework is trained in simulation using reinforcement learning (RL) and evaluated on novel out-of-distribution objects in the most challenging scenario of reorienting objects held in the air by a downward-facing hand. In terms of the average consecutive success count, DexReMoE achieves a score of 19.5 across a diverse set of 150 objects. In comparison to the baselines, it also enhances the worst-case performance, increasing it from 0.69 to 6.05. These results underscore the scalability and adaptability of the DexReMoE framework for general-purpose in-hand reorientation.
comment: 10 pages, 8 figures
Energy-Predictive Planning for Optimizing Drone Service Delivery
We propose a novel Energy-Predictive Drone Service (EPDS) framework for efficient package delivery within a skyway network. The EPDS framework incorporates a formal modeling of an EPDS and an adaptive bidirectional Long Short-Term Memory (Bi-LSTM) machine learning model. This model predicts the energy status and stochastic arrival times of other drones operating in the same skyway network. Leveraging these predictions, we develop a heuristic optimization approach for composite drone services. This approach identifies the most time-efficient and energy-efficient skyway path and recharging schedule for each drone in the network. We conduct extensive experiments using a real-world drone flight dataset to evaluate the performance of the proposed framework.
comment: 37 pages, 16 figures. This is an accepted paper, and it is going to appear in the Expert Systems with Applications journal
VFP: Variational Flow-Matching Policy for Multi-Modal Robot Manipulation
Flow-matching-based policies have recently emerged as a promising approach for learning-based robot manipulation, offering significant acceleration in action sampling compared to diffusion-based policies. However, conventional flow-matching methods struggle with multi-modality, often collapsing to averaged or ambiguous behaviors in complex manipulation tasks. To address this, we propose the Variational Flow-Matching Policy (VFP), which introduces a variational latent prior for mode-aware action generation and effectively captures both task-level and trajectory-level multi-modality. VFP further incorporates Kantorovich Optimal Transport (K-OT) for distribution-level alignment and utilizes a Mixture-of-Experts (MoE) decoder for mode specialization and efficient inference. We comprehensively evaluate VFP on 41 tasks across four benchmark environments, demonstrating its effectiveness and sampling efficiency in both task and path multi-modality settings. Results show that VFP achieves a $49\%$ relative improvement in task success rate over standard flow-based baselines, while maintaining fast inference and compact model size. More details are available on our project page: https://sites.google.com/view/varfp/
CLASS: Contrastive Learning via Action Sequence Supervision for Robot Manipulation
Recent advances in Behavior Cloning (BC) have led to strong performance in robotic manipulation, driven by expressive models, sequence modeling of actions, and large-scale demonstration data. However, BC faces significant challenges when applied to heterogeneous datasets, such as visual shift with different camera poses or object appearances, where performance degrades despite the benefits of learning at scale. This stems from BC's tendency to overfit individual demonstrations rather than capture shared structure, limiting generalization. To address this, we introduce Contrastive Learning via Action Sequence Supervision (CLASS), a method for learning behavioral representations from demonstrations using supervised contrastive learning. CLASS leverages weak supervision from similar action sequences identified via Dynamic Time Warping (DTW) and optimizes a soft InfoNCE loss with similarity-weighted positive pairs. We evaluate CLASS on 5 simulation benchmarks and 3 real-world tasks to achieve competitive results using retrieval-based control with representations only. Most notably, for downstream policy learning under significant visual shifts, Diffusion Policy with CLASS pre-training achieves an average success rate of 75%, while all other baseline methods fail to perform competitively. Project webpage: https://class-robot.github.io.
comment: To appear in Proceedings of the Conference on Robot Learning (CoRL) 2025
Adverse Weather-Independent Framework Towards Autonomous Driving Perception through Temporal Correlation and Unfolded Regularization
Various adverse weather conditions such as fog and rain pose a significant challenge to autonomous driving (AD) perception tasks like semantic segmentation, object detection, etc. The common domain adaption strategy is to minimize the disparity between images captured in clear and adverse weather conditions. However, domain adaption faces two challenges: (I) it typically relies on utilizing clear image as a reference, which is challenging to obtain in practice; (II) it generally targets single adverse weather condition and performs poorly when confronting the mixture of multiple adverse weather conditions. To address these issues, we introduce a reference-free and Adverse weather condition-independent (Advent) framework (rather than a specific model architecture) that can be implemented by various backbones and heads. This is achieved by leveraging the homogeneity over short durations, getting rid of clear reference and being generalizable to arbitrary weather condition. Specifically, Advent includes three integral components: (I) Locally Sequential Mechanism (LSM) leverages temporal correlations between adjacent frames to achieve the weather-condition-agnostic effect thanks to the homogeneity behind arbitrary weather condition; (II) Globally Shuffled Mechanism (GSM) is proposed to shuffle segments processed by LSM from different positions of input sequence to prevent the overfitting to LSM-induced temporal patterns; (III) Unfolded Regularizers (URs) are the deep unfolding implementation of two proposed regularizers to penalize the model complexity to enhance across-weather generalization. We take the semantic segmentation task as an example to assess the proposed Advent framework. Extensive experiments demonstrate that the proposed Advent outperforms existing state-of-the-art baselines with large margins.
comment: 10 pages. arXiv admin note: substantial text overlap with arXiv:2409.14737
HALO: Human Preference Aligned Offline Reward Learning for Robot Navigation
In this paper, we introduce HALO, a novel Offline Reward Learning algorithm that quantifies human intuition in navigation into a vision-based reward function for robot navigation. HALO learns a reward model from offline data, leveraging expert trajectories collected from mobile robots. During training, actions are uniformly sampled around a reference action and ranked using preference scores derived from a Boltzmann distribution centered on the preferred action, and shaped based on binary user feedback to intuitive navigation queries. The reward model is trained via the Plackett-Luce loss to align with these ranked preferences. To demonstrate the effectiveness of HALO, we deploy its reward model in two downstream applications: (i) an offline learned policy trained directly on the HALO-derived rewards, and (ii) a model-predictive-control (MPC) based planner that incorporates the HALO reward as an additional cost term. This showcases the versatility of HALO across both learning-based and classical navigation frameworks. Our real-world deployments on a Clearpath Husky across diverse scenarios demonstrate that policies trained with HALO generalize effectively to unseen environments and hardware setups not present in the training data. HALO outperforms state-of-the-art vision-based navigation methods, achieving at least a 33.3% improvement in success rate, a 12.9% reduction in normalized trajectory length, and a 26.6% reduction in Frechet distance compared to human expert trajectories.
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
Unmanned Aerial Vehicles (UAVs) have made significant advancements in communication stability and security through techniques such as frequency hopping, signal spreading, and adaptive interference suppression. However, challenges remain in modeling spectrum competition, integrating expert knowledge, and predicting opponent behavior. To address these issues, we propose UAV-FPG (Unmanned Aerial Vehicle - Frequency Point Game), a game-theoretic environment model that simulates the dynamic interaction between interference and anti-interference strategies of opponent and ally UAVs in communication frequency bands. The model incorporates a prior expert knowledge base to optimize frequency selection and employs large language models for path planning, simulating a "strong adversary". Experimental results highlight the effectiveness of integrating the expert knowledge base and the large language model, with the latter significantly improving path planning in dynamic scenarios through iterative interactions, outperforming fixed-path strategies. UAV-FPG provides a robust platform for advancing anti-jamming strategies and intelligent decision-making in UAV communication systems.
Learning Physical Interaction Skills from Human Demonstrations
Learning physical interaction skills, such as dancing, handshaking, or sparring, remains a fundamental challenge for agents operating in human environments, particularly when the agent's morphology differs significantly from that of the demonstrator. Existing approaches often rely on handcrafted objectives or morphological similarity, limiting their capacity for generalization. Here, we introduce a framework that enables agents with diverse embodiments to learn wholebbody interaction behaviors directly from human demonstrations. The framework extracts a compact, transferable representation of interaction dynamics, called the Embedded Interaction Graph (EIG), which captures key spatiotemporal relationships between the interacting agents. This graph is then used as an imitation objective to train control policies in physics-based simulations, allowing the agent to generate motions that are both semantically meaningful and physically feasible. We demonstrate BuddyImitation on multiple agents, such as humans, quadrupedal robots with manipulators, or mobile manipulators and various interaction scenarios, including sparring, handshaking, rock-paper-scissors, or dancing. Our results demonstrate a promising path toward coordinated behaviors across morphologically distinct characters via cross embodiment interaction learning.
Unified Locomotion Transformer with Simultaneous Sim-to-Real Transfer for Quadrupeds IROS 2025
Quadrupeds have gained rapid advancement in their capability of traversing across complex terrains. The adoption of deep Reinforcement Learning (RL), transformers and various knowledge transfer techniques can greatly reduce the sim-to-real gap. However, the classical teacher-student framework commonly used in existing locomotion policies requires a pre-trained teacher and leverages the privilege information to guide the student policy. With the implementation of large-scale models in robotics controllers, especially transformers-based ones, this knowledge distillation technique starts to show its weakness in efficiency, due to the requirement of multiple supervised stages. In this paper, we propose Unified Locomotion Transformer (ULT), a new transformer-based framework to unify the processes of knowledge transfer and policy optimization in a single network while still taking advantage of privilege information. The policies are optimized with reinforcement learning, next state-action prediction, and action imitation, all in just one training stage, to achieve zero-shot deployment. Evaluation results demonstrate that with ULT, optimal teacher and student policies can be obtained at the same time, greatly easing the difficulty in knowledge transfer, even with complex transformer-based models.
comment: Accepted for IROS 2025. Project website for video: https://johnliudk.github.io/ult/
LanternNet: A Hub-and-Spoke System to Seek and Suppress Spotted Lanternfly Populations
The invasive spotted lanternfly (SLF) poses a significant threat to agriculture and ecosystems, causing widespread damage. Current control methods, such as egg scraping, pesticides, and quarantines, prove labor-intensive, environmentally hazardous, and inadequate for long-term SLF suppression. This research introduces LanternNet, a novel autonomous robotic Hub-and-Spoke system designed for scalable detection and suppression of SLF populations. A central, tree-mimicking hub utilizes a YOLOv8 computer vision model for precise SLF identification. Three specialized robotic spokes perform targeted tasks: pest neutralization, environmental monitoring, and navigation/mapping. Field deployment across multiple infested sites over 5 weeks demonstrated LanternNet's efficacy. Quantitative analysis revealed significant reductions (p < 0.01, paired t-tests) in SLF populations and corresponding improvements in tree health indicators across the majority of test sites. Compared to conventional methods, LanternNet offers substantial cost advantages and improved scalability. Furthermore, the system's adaptability for enhanced autonomy and targeting of other invasive species presents significant potential for broader ecological impact. LanternNet demonstrates the transformative potential of integrating robotics and AI for advanced invasive species management and improved environmental outcomes.
Exploring 3D Reasoning-Driven Planning: From Implicit Human Intentions to Route-Aware Activity Planning
3D task planning has attracted increasing attention in human-robot interaction and embodied AI thanks to the recent advances in multimodal learning. However, most existing studies are facing two common challenges: 1) heavy reliance on explicit instructions with little reasoning on implicit user intention; 2) negligence of inter-step route planning on robot moves. We address the above challenges by proposing 3D Reasoning-Driven Planning, a novel 3D task that reasons the intended activities from implicit instructions and decomposes them into steps with inter-step routes and planning under the guidance of fine-grained 3D object shapes and locations from scene segmentation. We tackle the new 3D task from two perspectives. First, we construct ReasonPlan3D, a large-scale benchmark that covers diverse 3D scenes with rich implicit instructions and detailed annotations for multi-step task planning, inter-step route planning, and fine-grained segmentation. Second, we design a novel framework that introduces progressive plan generation with contextual consistency across multiple steps, as well as a scene graph that is updated dynamically for capturing critical objects and their spatial relations. Extensive experiments demonstrate the effectiveness of our benchmark and framework in reasoning activities from implicit human instructions, producing accurate stepwise task plans and seamlessly integrating route planning for multi-step moves. The dataset and code will be released.
Boosting Robotic Manipulation Generalization with Minimal Costly Data
The growing adoption of Vision-Language-Action (VLA) models in embodied AI intensifies the demand for diverse manipulation demonstrations. However, high costs associated with data collection often result in insufficient data coverage across all scenarios, which limits the performance of the models. It is observed that the spatial reasoning phase (SRP) in large workspace dominates the failure cases. Fortunately, this data can be collected with low cost, underscoring the potential of leveraging inexpensive data to improve model performance. In this paper, we introduce the RoboTron-Craft, a stage-divided and cost-effective pipeline for realistic manipulation generation. Base on this, the RoboTron-Platter method is introduced, a framework that decouples training trajectories into distinct task stages and leverages abundant easily collectible SRP data to enhance VLA model's generalization. Through analysis we demonstrate that sub-task-specific training with additional SRP data with proper proportion can act as a performance catalyst for robot manipulation, maximizing the utilization of costly physical interaction phase (PIP) data. Experiments show that through introducing large proportion of cost-effective SRP trajectories into a limited set of PIP data, we can achieve a maximum improvement of 41\% on success rate in zero-shot scenes, while with the ability to transfer manipulation skill to novel targets. Project available at https://github.com/ notFoundThisPerson/RoboTron-Craft.
Task Reconstruction and Extrapolation for $π_0$ using Text Latent
Vision-language-action models (VLAs) often achieve high performance on demonstrated tasks but struggle significantly when required to extrapolate, combining skills learned from different tasks in novel ways. For instance, VLAs might successfully put the cream cheese in the bowl and put the bowl on top of the cabinet, yet still fail to put the cream cheese on top of the cabinet. In this work, we demonstrate that behaviors from distinct tasks can be effectively recombined by manipulating the VLA's internal representations at inference time. Concretely, we identify the text latent by averaging the text tokens' hidden states across all demonstrated trajectories for a specific base task. For executing an extrapolated task, we can temporally interpolate the text latent of the two base tasks and add it back to the text hidden states, so sub-behaviors from the two tasks will be activated sequentially. We evaluate this approach using the newly created libero-ood benchmark, featuring 20 tasks extrapolated from standard LIBERO suites. The results on libero-ood show that all SOTA VLAs achieve < 15% success rate, while $\pi0$ with text latent interpolation reaches an 83% success rate. Further qualitative analysis reveals a tendency for VLAs to exhibit spatial overfitting, mapping object names to demonstrated locations rather than achieving genuine object and goal understanding. Additionally, we find that decoding the text latent yields human-unreadable prompts that can nevertheless instruct the VLA to achieve a 70% success rate on standard LIBERO suites, enabling private instruction or backdoor attacks.
StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving
Personalization, while extensively studied in conventional autonomous driving pipelines, has been largely overlooked in the context of end-to-end autonomous driving (E2EAD), despite its critical role in fostering user trust, safety perception, and real-world adoption. A primary bottleneck is the absence of large-scale real-world datasets that systematically capture driving preferences, severely limiting the development and evaluation of personalized E2EAD models. In this work, we introduce the first large-scale real-world dataset explicitly curated for personalized E2EAD, integrating comprehensive scene topology with rich dynamic context derived from agent dynamics and semantics inferred via a fine-tuned vision-language model (VLM). We propose a hybrid annotation pipeline that combines behavioral analysis, rule-and-distribution-based heuristics, and subjective semantic modeling guided by VLM reasoning, with final refinement through human-in-the-loop verification. Building upon this dataset, we introduce the first standardized benchmark for systematically evaluating personalized E2EAD models. Empirical evaluations on state-of-the-art architectures demonstrate that incorporating personalized driving preferences significantly improves behavioral alignment with human demonstrations.
comment: 25 pages, 7 figures, 5 tables
RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator
Efficient acquisition of real-world embodied data has been increasingly critical. However, large-scale demonstrations captured by remote operation tend to take extremely high costs and fail to scale up the data size in an efficient manner. Sampling the episodes under a simulated environment is a promising way for large-scale collection while existing simulators fail to high-fidelity modeling on texture and physics. To address these limitations, we introduce the RoboGSim, a real2sim2real robotic simulator, powered by 3D Gaussian Splatting and the physics engine. RoboGSim mainly includes four parts: Gaussian Reconstructor, Digital Twins Builder, Scene Composer, and Interactive Engine. It can synthesize the simulated data with novel views, objects, trajectories, and scenes. RoboGSim also provides an online, reproducible, and safe evaluation for different manipulation policies. The real2sim and sim2real transfer experiments show a high consistency in the texture and physics. We compared the test results of RoboGSim data and real robot data on both RoboGSim and real robot platforms. The experimental results show that the RoboGSim data model can achieve zero-shot performance on the real robot, with results comparable to real robot data. Additionally, in experiments with novel perspectives and novel scenes, the RoboGSim data model performed even better on the real robot than the real robot data model. This not only helps reduce the sim2real gap but also addresses the limitations of real robot data collection, such as its single-source and high cost. We hope RoboGSim serves as a closed-loop simulator for fair comparison on policy learning. More information can be found on our project page https://robogsim.github.io/.
Humanoid Robots and Humanoid AI: Review, Perspectives and Directions
In the approximately century-long journey of robotics, humanoid robots made their debut around six decades ago. While current humanoids bear human-like appearances, none have embodied true humaneness, remaining distant from achieving human-like to human-level intelligence. The rapid recent advancements in generative AI and (multimodal) large language models have further reignited and escalated interest in humanoids towards real-time, interactive, and multimodal designs and applications, such as fostering humanoid workers, advisers, educators, medical professionals, caregivers, and receptionists. These unveil boundless opportunities of transforming 1) AI robotics into a research era of humanoid AI, and 2) AI robots into new-generation humanoid AI robots (AI humanoids). Our unique and comprehensive review of about 30 reported humanoids discloses a systematic terminology and a paradigmatic landscape of human-looking to human-like and human-level humanoids. It inspires comprehensive new perspectives and directions of humanoid AI as an area: transitioning from human-looking to humane humanoids, humanizing humanoids with functional and nonfunctional specifications, and cultivating technical and actionable advances of AI humanoids. Humanoid AI and AI humanoids nurture symbiotic advancements and future opportunities of synthesizing and transforming humanity modeling and conventional, generative to human-level AI into humanoid robotics.
comment: 52 pages, 4 figures, 4 tables
Human-Machine Shared Control Approach for the Takeover of Cooperative Adaptive Cruise Control
Cooperative Adaptive Cruise Control (CACC) often requires human takeover for tasks such as exiting a freeway. Direct human takeover can pose significant risks, especially given the close-following strategy employed by CACC, which might cause drivers to feel unsafe and execute hard braking, potentially leading to collisions. This research aims to develop a CACC takeover controller that ensures a smooth transition from automated to human control. The proposed CACC takeover maneuver employs an indirect human-machine shared control approach, modeled as a Stackelberg competition where the machine acts as the leader and the human as the follower. The machine guides the human to respond in a manner that aligns with the machine's expectations, aiding in maintaining following stability. Additionally, the human reaction function is integrated into the machine's predictive control system, moving beyond a simple "prediction-planning" pipeline to enhance planning optimality. The controller has been verified to i) enable a smooth takeover maneuver of CACC; ii) ensure string stability in the condition that the platoon has less than 6 CAVs and human control authority is less than 40%; iii) enhance both perceived and actual safety through machine interventions; and iv) reduce the impact on upstream traffic by up to 60%.
comment: This article has been published on IEEE Transactions on Intelligent Transportation Systems (2025)
iMacHSR: Intermediate Multi-Access Heterogeneous Supervision and Regularization Scheme Toward Architecture-Agnostic Training
While deep supervision is a powerful training strategy by supervising intermediate layers with auxiliary losses, it faces three underexplored problems: (I) Existing deep supervision techniques are generally bond with specific model architectures strictly, lacking generality. (II) The identical loss function for intermediate and output layers causes intermediate layers to prioritize output-specific features prematurely, limiting generalizable representations. (III) Lacking regularization on hidden activations risks overconfident predictions, reducing generalization to unseen scenarios. To tackle these challenges, we propose an architecture-agnostic, intermediate Multi-access Heterogeneous Supervision and Regularization (iMacHSR) scheme. Specifically, the proposed iMacHSR introduces below integral strategies: (I) we select multiple intermediate layers based on predefined architecture-agnostic standards; (II) loss functions (different from output-layer loss) are applied to those selected intermediate layers, which can guide intermediate layers to learn diverse and hierarchical representations; and (III) negative entropy regularization on selected layers' hidden features discourages overconfident predictions and mitigates overfitting. These intermediate terms are combined into the output-layer training loss to form a unified optimization objective, enabling comprehensive optimization across the network hierarchy. We then take the semantic understanding task as an example to assess iMacHSR and apply iMacHSR to several model architectures. Extensive experiments on multiple datasets demonstrate that iMacHSR outperforms conventional output-layer single-point supervision method up to 9.19% in mIoU.
comment: 9 pages
NeuFlow v2: Push High-Efficiency Optical Flow To the Limit
Real-time high-accuracy optical flow estimation is critical for a variety of real-world robotic applications. However, current learning-based methods often struggle to balance accuracy and computational efficiency: methods that achieve high accuracy typically demand substantial processing power, while faster approaches tend to sacrifice precision. These fast approaches specifically falter in their generalization capabilities and do not perform well across diverse real-world scenarios. In this work, we revisit the limitations of the SOTA methods and present NeuFlow-V2, a novel method that offers both - high accuracy in real-world datasets coupled with low computational overhead. In particular, we introduce a novel light-weight backbone and a fast refinement module to keep computational demands tractable while delivering accurate optical flow. Experimental results on synthetic and real-world datasets demonstrate that NeuFlow-V2 provides similar accuracy to SOTA methods while achieving 10x-70x speedups. It is capable of running at over 20 FPS on 512x384 resolution images on a Jetson Orin Nano. The full training and evaluation code is available at https://github.com/neufieldrobotics/NeuFlow_v2.
Multiagent Systems
Agent-Based Feature Generation from Clinical Notes for Outcome Prediction
Electronic health records (EHRs) contain rich unstructured clinical notes that could enhance predictive modeling, yet extracting meaningful features from these notes remains challenging. Current approaches range from labor-intensive manual clinician feature generation (CFG) to fully automated representational feature generation (RFG) that lack interpretability and clinical relevance. Here we introduce SNOW (Scalable Note-to-Outcome Workflow), a modular multi-agent system powered by large language models (LLMs) that autonomously generates structured clinical features from unstructured notes without human intervention. We evaluated SNOW against manual CFG, clinician-guided LLM approaches, and RFG methods for predicting 5-year prostate cancer recurrence in 147 patients from Stanford Healthcare. While manual CFG achieved the highest performance (AUC-ROC: 0.771), SNOW matched this performance (0.761) without requiring any clinical expertise, significantly outperforming both baseline features alone (0.691) and all RFG approaches. The clinician-guided LLM method also performed well (0.732) but still required expert input. SNOW's specialized agents handle feature discovery, extraction, validation, post-processing, and aggregation, creating interpretable features that capture complex clinical information typically accessible only through manual review. Our findings demonstrate that autonomous LLM systems can replicate expert-level feature engineering at scale, potentially transforming how clinical ML models leverage unstructured EHR data while maintaining the interpretability essential for clinical deployment.
Distributed games with jumps: An $α$-potential game approach
Motivated by game-theoretic models of crowd motion dynamics, this paper analyzes a broad class of distributed games with jump diffusions within the recently developed $\alpha$-potential game framework. We demonstrate that analyzing the $\alpha$-Nash equilibria reduces to solving a finite-dimensional control problem. Beyond the viscosity and verification characterizations for the general games, we explicitly and in detail examine how spatial population distributions and interaction rules influence the structure of $\alpha$-Nash equilibria in these distributed settings, and in particular for crowd motion games. Our theoretical results are supported by numerical implementations using policy gradient-based algorithms, further demonstrating the computational advantages of the $\alpha$-potential game framework in computing Nash equilibria for general dynamic games.
comment: 23 pages, 4 figures
Optimizing Day-Ahead Energy Trading with Proximal Policy Optimization and Blockchain
The increasing penetration of renewable energy sources in day-ahead energy markets introduces challenges in balancing supply and demand, ensuring grid resilience, and maintaining trust in decentralized trading systems. This paper proposes a novel framework that integrates the Proximal Policy Optimization (PPO) algorithm, a state-of-the-art reinforcement learning method, with blockchain technology to optimize automated trading strategies for prosumers in day-ahead energy markets. We introduce a comprehensive framework that employs RL agent for multi-objective energy optimization and blockchain for tamper-proof data and transaction management. Simulations using real-world data from the Electricity Reliability Council of Texas (ERCOT) demonstrate the effectiveness of our approach. The RL agent achieves demand-supply balancing within 2\% and maintains near-optimal supply costs for the majority of the operating hours. Moreover, it generates robust battery storage policies capable of handling variability in solar and wind generation. All decisions are recorded on an Algorand-based blockchain, ensuring transparency, auditability, and security - key enablers for trustworthy multi-agent energy trading. Our contributions include a novel system architecture, curriculum learning for robust agent development, and actionable policy insights for practical deployment.
Causality and Decision-making: A Logical Framework for Systems and Security Modelling
Causal reasoning is essential for understanding decision-making about the behaviour of complex `ecosystems' of systems that underpin modern society, with security -- including issues around correctness, safety, resilience, etc. -- typically providing critical examples. We present a theory of strategic reasoning about system modelling based on minimal structural assumptions and employing the methods of transition systems, supported by a modal logic of system states in the tradition of van Benthem, Hennessy, and Milner, and validated through equivalence theorems. Our framework introduces an intervention operator and a separating conjunction to capture actual causal relationships between component systems of the ecosystem, aligning naturally with Halpern and Pearl's counterfactual approach based on Structural Causal Models. We illustrate the applicability through examples of of decision-making about microservices in distributed systems. We discuss localized decision-making through a separating conjunction. This work unifies a formal, minimalistic notion of system behaviour with a Halpern--Pearl-compatible theory of counterfactual reasoning, providing a logical foundation for studying decision making about causality in complex interacting systems.
comment: 28 pages
Revisiting Gossip Protocols: A Vision for Emergent Coordination in Agentic Multi-Agent Systems
As agentic platforms scale, agents are evolving beyond static roles and fixed toolchains, creating a growing need for flexible, decentralized coordination. Today's structured communication protocols (e.g., direct agent-to-agent messaging) excel at reliability and task delegation, but they fall short in enabling emergent, swarm-like intelligence, where distributed agents continuously learn, adapt, and communicate to form collective cognition. This paper revisits gossip protocols, long valued in distributed systems for their fault tolerance and decentralization, and argues that they offer a missing layer for context-rich, adaptive communication in agentic AI. Gossip enables scalable, low-overhead dissemination of shared knowledge, but also raises unresolved challenges around semantic filtering, staleness, trustworthiness, and consistency in high-stakes environments. Rather than proposing a new framework, this work charts a research agenda for integrating gossip as a complementary substrate alongside structured protocols. We identify critical gaps in current agent-to-agent architectures, highlight where gossip could reshape assumptions about coordination, and outline open questions around intent propagation, knowledge decay, and peer-to-peer trust. Gossip is not a silver bullet, but overlooking it risks missing a key path toward resilient, reflexive, and self-organizing multi-agent systems.
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
Unmanned Aerial Vehicles (UAVs) have made significant advancements in communication stability and security through techniques such as frequency hopping, signal spreading, and adaptive interference suppression. However, challenges remain in modeling spectrum competition, integrating expert knowledge, and predicting opponent behavior. To address these issues, we propose UAV-FPG (Unmanned Aerial Vehicle - Frequency Point Game), a game-theoretic environment model that simulates the dynamic interaction between interference and anti-interference strategies of opponent and ally UAVs in communication frequency bands. The model incorporates a prior expert knowledge base to optimize frequency selection and employs large language models for path planning, simulating a "strong adversary". Experimental results highlight the effectiveness of integrating the expert knowledge base and the large language model, with the latter significantly improving path planning in dynamic scenarios through iterative interactions, outperforming fixed-path strategies. UAV-FPG provides a robust platform for advancing anti-jamming strategies and intelligent decision-making in UAV communication systems.
AI-Generated Compromises for Coalition Formation
The challenge of finding compromises between agent proposals is fundamental to AI subfields such as argumentation, mediation, and negotiation. Building on this tradition, Elkind et al. (2021) introduced a process for coalition formation that seeks majority-supported proposals preferable to the status quo, using a metric space where each agent has an ideal point. A crucial step in this process involves identifying compromise proposals around which agent coalitions can unite. How to effectively find such compromise proposals remains an open question. We address this gap by formalizing a model that incorporates agent bounded rationality and uncertainty, and by developing AI methods to generate compromise proposals. We focus on the domain of collaborative document writing, such as the democratic drafting of a community constitution. Our approach uses natural language processing techniques and large language models to induce a semantic metric space over text. Based on this space, we design algorithms to suggest compromise points likely to receive broad support. To evaluate our methods, we simulate coalition formation processes and show that AI can facilitate large-scale democratic text editing, a domain where traditional tools are limited.
On the Power of Perturbation under Sampling in Solving Extensive-Form Games
We investigate how perturbation does and does not improve the Follow-the-Regularized-Leader (FTRL) algorithm in solving imperfect-information extensive-form games under sampling, where payoffs are estimated from sampled trajectories. While optimistic algorithms are effective under full feedback, they often become unstable in the presence of sampling noise. Payoff perturbation offers a promising alternative for stabilizing learning and achieving \textit{last-iterate convergence}. We present a unified framework for \textit{Perturbed FTRL} algorithms and study two variants: PFTRL-KL (standard KL divergence) and PFTRL-RKL (Reverse KL divergence), the latter featuring an estimator with both unbiasedness and conditional zero variance. While PFTRL-KL generally achieves equivalent or better performance across benchmark games, PFTRL-RKL consistently outperforms it in Leduc poker, whose structure is more asymmetric than the other games in a sense. Given the modest advantage of PFTRL-RKL, we design the second experiment to isolate the effect of conditional zero variance, showing that the variance-reduction property of RKL improve last-iterate performance.
Dynamic Strategy Adaptation in Multi-Agent Environments with Large Language Models
Large language models (LLMs) demonstrate strong reasoning abilities across mathematical, strategic, and linguistic tasks, yet little is known about how well they reason in dynamic, real-time, multi-agent scenarios, such as collaborative environments in which agents continuously adapt to each other's behavior, as in cooperative gameplay settings. In this paper, we bridge this gap by combining LLM-driven agents with strategic reasoning and real-time adaptation in cooperative, multi-agent environments grounded in game-theoretic principles such as belief consistency and Nash equilibrium. The proposed framework applies broadly to dynamic scenarios in which agents coordinate, communicate, and make decisions in response to continuously changing conditions. We provide real-time strategy refinement and adaptive feedback mechanisms that enable agents to dynamically adjust policies based on immediate contextual interactions, in contrast to previous efforts that evaluate LLM capabilities in static or turn-based settings. Empirical results show that our method achieves up to a 26\% improvement in return over PPO baselines in high-noise environments, while maintaining real-time latency under 1.05 milliseconds. Our approach improves collaboration efficiency, task completion rates, and flexibility, illustrating that game-theoretic guidance integrated with real-time feedback enhances LLM performance, ultimately fostering more resilient and flexible strategic multi-agent systems.
Systems and Control (CS)
Revenue Optimization in Wireless Video Caching Networks: A Privacy-Preserving Two-Stage Solution
Video caching can significantly improve delivery efficiency and enhance quality of video streaming, which constitutes the majority of wireless communication traffic. Due to limited cache size, caching strategies must be designed to adapt to and dynamic user demand in order to maximize system revenue. The system revenue depends on the benefits of delivering the requested videos and costs for (a) transporting the files to the users and (b) cache replacement. Since the cache content at any point in time impacts the replacement costs in the future, demand predictions over multiple cache placement slots become an important prerequisite for efficient cache planning. Motivated by this, we introduce a novel two-stage privacy-preserving solution for revenue optimization in wireless video caching networks. First, we train a Transformer using privacy-preserving federated learning (FL) to predict multi-slot future demands. Given that prediction results are never entirely accurate, especially for longer horizons, we further combine global content popularity with per-user prediction results to estimate the content demand distribution. Then, in the second stage, we leverage these estimation results to find caching strategies that maximize the long-term system revenue. This latter problem takes on the form of a multi-stage knapsack problem, which we then transform to a integer linear program. Our extensive simulation results demonstrate that (i) our FL solution delivers nearly identical performance to that of the ideal centralized solution and outperforms other existing caching methods, and (ii) our novel revenue optimization approach provides deeper system performance insights than traditional cache hit ratio (CHR)-based optimization approaches.
comment: Under review for possible publication in the IEEE Transactions on Communications
A Heuristic Method for Simplified Resource Allocation based on Comparative Advantage in Wireless Access Systems
This paper presents a heuristic method for simplifying resource allocation in access systems, leveraging the concept of comparative advantage to reduce computational complexity while maintaining near-optimal performance. Using power-division non-orthogonal multiple access (PD-NOMA) as an example, we demonstrate how this approach mitigates the challenge of power allocation in multi-cell networks. Our method reduces the search space for optimization, significantly decreasing computational overhead while ensuring efficient spectrum utilization. In principle, the method reduces the dimensions of search space by half. Extensive analysis and simulations validate its effectiveness, highlighting its potential for practical deployment in next-generation wireless networks. The proposed framework can help streamline resource allocation in complex communication environments, enhancing system performance and scalability.
A Provably Secure Network Protocol for Private Communication with Analysis and Tracing Resistance
Anonymous communication networks have emerged as crucial tools for obfuscating communication pathways and concealing user identities. However, their practical deployments face significant challenges, including susceptibility to artificial intelligence (AI)-powered metadata analysis, difficulties in decentralized architectures, and the absence of provable security guarantees. To address these issues, this paper proposes a novel decentralized anonymous routing protocol with resistance to tracing and traffic analysis. The protocol eliminates dependencies on the threshold model and trusted third-party setups, ensuring indistinguishable identity privacy even in highly adversarial environments. Different from traditional empirical security analysis of anonymous networks, this paper rigorously proves indistinguishable identity privacy for users even in extremely adversarial environments. Furthermore, simulations confirm its practical feasibility, demonstrating both security and efficiency. By achieving information sharing with privacy preservation, the proposed protocol offers a provably secure solution for privacy-preserving communication in digital environments.
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
A class of unified disturbance rejection control barrier functions
Most existing robust control barrier functions (CBFs) can only handle matched disturbances, restricting their applications in real-world scenarios. While some recent advances extend robust CBFs to unmatched disturbances, they heavily rely on differentiability property of disturbances, and fail to accommodate non-differentiable case for high-relative-degree safety constraints. To address these limitations, this paper proposes a class of disturbance rejection CBFs (DRCBFs), including DRCBFs and adaptive DRCBFs (aDRCBFs). This class of DRCBFs can strictly guarantee safety under general bounded disturbances, which includes both matched or unmatched, differentiable or non-differentiable disturbances as special cases. Morevoer, no information of disturbance bound is needed in aDRCBFs. Simulation results illustrate that this class of DRCBFs outperform existing robust CBFs.
comment: 8 pages, 6 figures
Pursuit-Evasion Between a Velocity-Constrained Double-Integrator Pursuer and a Single-Integrator Evader
We study a pursuit-evasion game between a double integrator-driven pursuer with bounded velocity and bounded acceleration and a single integrator-driven evader with bounded velocity in a two-dimensional plane. The pursuer's goal is to capture the evader in the shortest time, while the evader attempts to delay the capture. We analyze two scenarios based on whether the capture can happen before the pursuer's speed reaches its maximum. For the case when the pursuer can capture the evader before its speed reaches its maximum, we use geometric methods to obtain the strategies for the pursuer and the evader. For the case when the pursuer cannot capture the evader before its speed reaches its maximum, we use numerical methods to obtain the strategies for the pursuer and the evader. In both cases, we demonstrate that the proposed strategies are optimal in the sense of Nash equilibrium through the Hamilton-Jacobi-Isaacs equation, and the pursuer can capture the evader as long as as its maximum speed is larger than that of the evader. Simulation experiments illustrate the effectiveness of the strategies.
Social Media Information Operations
The battlefield of information warfare has moved to online social networks, where influence campaigns operate at unprecedented speed and scale. As with any strategic domain, success requires understanding the terrain, modeling adversaries, and executing interventions. This tutorial introduces a formal optimization framework for social media information operations (IO), where the objective is to shape opinions through targeted actions. This framework is parameterized by quantities such as network structure, user opinions, and activity levels - all of which must be estimated or inferred from data. We discuss analytic tools that support this process, including centrality measures for identifying influential users, clustering algorithms for detecting community structure, and sentiment analysis for gauging public opinion. These tools either feed directly into the optimization pipeline or help defense analysts interpret the information environment. With the landscape mapped, we highlight threats such as coordinated bot networks, extremist recruitment, and viral misinformation. Countermeasures range from content-level interventions to mathematically optimized influence strategies. Finally, the emergence of generative AI transforms both offense and defense, democratizing persuasive capabilities while enabling scalable defenses. This shift calls for algorithmic innovation, policy reform, and ethical vigilance to protect the integrity of our digital public sphere.
Optimizing Return Distributions with Distributional Dynamic Programming
We introduce distributional dynamic programming (DP) methods for optimizing statistical functionals of the return distribution, with standard reinforcement learning as a special case. Previous distributional DP methods could optimize the same class of expected utilities as classic DP. To go beyond, we combine distributional DP with stock augmentation, a technique previously introduced for classic DP in the context of risk-sensitive RL, where the MDP state is augmented with a statistic of the rewards obtained since the first time step. We find that a number of recently studied problems can be formulated as stock-augmented return distribution optimization, and we show that we can use distributional DP to solve them. We analyze distributional value and policy iteration, with bounds and a study of what objectives these distributional DP methods can or cannot optimize. We describe a number of applications outlining how to use distributional DP to solve different stock-augmented return distribution optimization problems, for example maximizing conditional value-at-risk, and homeostatic regulation. To highlight the practical potential of stock-augmented return distribution optimization and distributional DP, we introduce an agent that combines DQN and the core ideas of distributional DP, and empirically evaluate it for solving instances of the applications discussed.
Distributed Coordination for Heterogeneous Non-Terrestrial Networks
To guarantee global coverage and ubiquitous connectivity, the Non-terrestrial Network (NTN) technology has been regarded as a key enabling technology in the Six Generation (6G) network, which consists of the unmanned aerial vehicle (UAV), high-altitude platform (HAP), and satellite. It is noted that the unique characteristics of various NTN platforms directly impact the design and implementation of NTNs, which results in highly dynamic and heterogeneous networks. Even within the same tier, such as the space tier, the NTNs are developed based on different platforms including Low Earth Orbit (LEO), Medium Earth Orbit (MEO), and Geostationary Earth Orbit (GEO). Therefore, distributed coordination among heterogeneous NTNs remains an important challenge. Although distributed learning framework finds a wide range of applications by leveraging rich distributed data and computation resources. The explicit and systematic analysis of the individual layers' challenges, and corresponding distributed coordination solutions in heterogeneous NTNs has not been proposed yet. In this article, we first summarize the unique characteristics of each NTN platform, and analyze the corresponding impact on the design and implementation of the NTN. We then identify the communication challenges of heterogeneous NTNs in individual layers, where the potential coordinated solutions are identified. We further illustrate the multi-agent deep reinforcement learning (MADRL) algorithms tailored for coordinated solutions in heterogeneous NTNs. Last but not least, we present a case study of the user scheduling optimization problem in heterogeneous UAVs-based cellular networks, where the multi-agent deep deterministic policy gradient (MADDPG) technique is developed to validate the effectiveness of distributed coordination in heterogeneous NTNs.
Two-Player Dynamic Potential LQ Games with Sequentially Revealed Costs
We investigate a novel finite-horizon linear-quadratic (LQ) feedback dynamic potential game with a priori unknown cost matrices played between two players. The cost matrices are revealed to the players sequentially, with the potential for future values to be previewed over a short time window. We propose an algorithm that enables the players to predict and track a feedback Nash equilibrium trajectory, and we measure the quality of their resulting decisions by introducing the concept of \emph{price of uncertainty}. We show that under the proposed algorithm, the price of uncertainty is bounded by horizon-invariant constants. The constants are the sum of three terms; the first and second terms decay exponentially as the preview window grows, and another depends on the magnitude of the differences between the cost matrices for each player. Through simulations, we illustrate that the resulting price of uncertainty initially decays at an exponential rate as the preview window lengthens, then remains constant for large time horizons.
Systems and Control (EESS)
Revenue Optimization in Wireless Video Caching Networks: A Privacy-Preserving Two-Stage Solution
Video caching can significantly improve delivery efficiency and enhance quality of video streaming, which constitutes the majority of wireless communication traffic. Due to limited cache size, caching strategies must be designed to adapt to and dynamic user demand in order to maximize system revenue. The system revenue depends on the benefits of delivering the requested videos and costs for (a) transporting the files to the users and (b) cache replacement. Since the cache content at any point in time impacts the replacement costs in the future, demand predictions over multiple cache placement slots become an important prerequisite for efficient cache planning. Motivated by this, we introduce a novel two-stage privacy-preserving solution for revenue optimization in wireless video caching networks. First, we train a Transformer using privacy-preserving federated learning (FL) to predict multi-slot future demands. Given that prediction results are never entirely accurate, especially for longer horizons, we further combine global content popularity with per-user prediction results to estimate the content demand distribution. Then, in the second stage, we leverage these estimation results to find caching strategies that maximize the long-term system revenue. This latter problem takes on the form of a multi-stage knapsack problem, which we then transform to a integer linear program. Our extensive simulation results demonstrate that (i) our FL solution delivers nearly identical performance to that of the ideal centralized solution and outperforms other existing caching methods, and (ii) our novel revenue optimization approach provides deeper system performance insights than traditional cache hit ratio (CHR)-based optimization approaches.
comment: Under review for possible publication in the IEEE Transactions on Communications
A Heuristic Method for Simplified Resource Allocation based on Comparative Advantage in Wireless Access Systems
This paper presents a heuristic method for simplifying resource allocation in access systems, leveraging the concept of comparative advantage to reduce computational complexity while maintaining near-optimal performance. Using power-division non-orthogonal multiple access (PD-NOMA) as an example, we demonstrate how this approach mitigates the challenge of power allocation in multi-cell networks. Our method reduces the search space for optimization, significantly decreasing computational overhead while ensuring efficient spectrum utilization. In principle, the method reduces the dimensions of search space by half. Extensive analysis and simulations validate its effectiveness, highlighting its potential for practical deployment in next-generation wireless networks. The proposed framework can help streamline resource allocation in complex communication environments, enhancing system performance and scalability.
A Provably Secure Network Protocol for Private Communication with Analysis and Tracing Resistance
Anonymous communication networks have emerged as crucial tools for obfuscating communication pathways and concealing user identities. However, their practical deployments face significant challenges, including susceptibility to artificial intelligence (AI)-powered metadata analysis, difficulties in decentralized architectures, and the absence of provable security guarantees. To address these issues, this paper proposes a novel decentralized anonymous routing protocol with resistance to tracing and traffic analysis. The protocol eliminates dependencies on the threshold model and trusted third-party setups, ensuring indistinguishable identity privacy even in highly adversarial environments. Different from traditional empirical security analysis of anonymous networks, this paper rigorously proves indistinguishable identity privacy for users even in extremely adversarial environments. Furthermore, simulations confirm its practical feasibility, demonstrating both security and efficiency. By achieving information sharing with privacy preservation, the proposed protocol offers a provably secure solution for privacy-preserving communication in digital environments.
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
A class of unified disturbance rejection control barrier functions
Most existing robust control barrier functions (CBFs) can only handle matched disturbances, restricting their applications in real-world scenarios. While some recent advances extend robust CBFs to unmatched disturbances, they heavily rely on differentiability property of disturbances, and fail to accommodate non-differentiable case for high-relative-degree safety constraints. To address these limitations, this paper proposes a class of disturbance rejection CBFs (DRCBFs), including DRCBFs and adaptive DRCBFs (aDRCBFs). This class of DRCBFs can strictly guarantee safety under general bounded disturbances, which includes both matched or unmatched, differentiable or non-differentiable disturbances as special cases. Morevoer, no information of disturbance bound is needed in aDRCBFs. Simulation results illustrate that this class of DRCBFs outperform existing robust CBFs.
comment: 8 pages, 6 figures
Pursuit-Evasion Between a Velocity-Constrained Double-Integrator Pursuer and a Single-Integrator Evader
We study a pursuit-evasion game between a double integrator-driven pursuer with bounded velocity and bounded acceleration and a single integrator-driven evader with bounded velocity in a two-dimensional plane. The pursuer's goal is to capture the evader in the shortest time, while the evader attempts to delay the capture. We analyze two scenarios based on whether the capture can happen before the pursuer's speed reaches its maximum. For the case when the pursuer can capture the evader before its speed reaches its maximum, we use geometric methods to obtain the strategies for the pursuer and the evader. For the case when the pursuer cannot capture the evader before its speed reaches its maximum, we use numerical methods to obtain the strategies for the pursuer and the evader. In both cases, we demonstrate that the proposed strategies are optimal in the sense of Nash equilibrium through the Hamilton-Jacobi-Isaacs equation, and the pursuer can capture the evader as long as as its maximum speed is larger than that of the evader. Simulation experiments illustrate the effectiveness of the strategies.
Social Media Information Operations
The battlefield of information warfare has moved to online social networks, where influence campaigns operate at unprecedented speed and scale. As with any strategic domain, success requires understanding the terrain, modeling adversaries, and executing interventions. This tutorial introduces a formal optimization framework for social media information operations (IO), where the objective is to shape opinions through targeted actions. This framework is parameterized by quantities such as network structure, user opinions, and activity levels - all of which must be estimated or inferred from data. We discuss analytic tools that support this process, including centrality measures for identifying influential users, clustering algorithms for detecting community structure, and sentiment analysis for gauging public opinion. These tools either feed directly into the optimization pipeline or help defense analysts interpret the information environment. With the landscape mapped, we highlight threats such as coordinated bot networks, extremist recruitment, and viral misinformation. Countermeasures range from content-level interventions to mathematically optimized influence strategies. Finally, the emergence of generative AI transforms both offense and defense, democratizing persuasive capabilities while enabling scalable defenses. This shift calls for algorithmic innovation, policy reform, and ethical vigilance to protect the integrity of our digital public sphere.
Optimizing Return Distributions with Distributional Dynamic Programming
We introduce distributional dynamic programming (DP) methods for optimizing statistical functionals of the return distribution, with standard reinforcement learning as a special case. Previous distributional DP methods could optimize the same class of expected utilities as classic DP. To go beyond, we combine distributional DP with stock augmentation, a technique previously introduced for classic DP in the context of risk-sensitive RL, where the MDP state is augmented with a statistic of the rewards obtained since the first time step. We find that a number of recently studied problems can be formulated as stock-augmented return distribution optimization, and we show that we can use distributional DP to solve them. We analyze distributional value and policy iteration, with bounds and a study of what objectives these distributional DP methods can or cannot optimize. We describe a number of applications outlining how to use distributional DP to solve different stock-augmented return distribution optimization problems, for example maximizing conditional value-at-risk, and homeostatic regulation. To highlight the practical potential of stock-augmented return distribution optimization and distributional DP, we introduce an agent that combines DQN and the core ideas of distributional DP, and empirically evaluate it for solving instances of the applications discussed.
Distributed Coordination for Heterogeneous Non-Terrestrial Networks
To guarantee global coverage and ubiquitous connectivity, the Non-terrestrial Network (NTN) technology has been regarded as a key enabling technology in the Six Generation (6G) network, which consists of the unmanned aerial vehicle (UAV), high-altitude platform (HAP), and satellite. It is noted that the unique characteristics of various NTN platforms directly impact the design and implementation of NTNs, which results in highly dynamic and heterogeneous networks. Even within the same tier, such as the space tier, the NTNs are developed based on different platforms including Low Earth Orbit (LEO), Medium Earth Orbit (MEO), and Geostationary Earth Orbit (GEO). Therefore, distributed coordination among heterogeneous NTNs remains an important challenge. Although distributed learning framework finds a wide range of applications by leveraging rich distributed data and computation resources. The explicit and systematic analysis of the individual layers' challenges, and corresponding distributed coordination solutions in heterogeneous NTNs has not been proposed yet. In this article, we first summarize the unique characteristics of each NTN platform, and analyze the corresponding impact on the design and implementation of the NTN. We then identify the communication challenges of heterogeneous NTNs in individual layers, where the potential coordinated solutions are identified. We further illustrate the multi-agent deep reinforcement learning (MADRL) algorithms tailored for coordinated solutions in heterogeneous NTNs. Last but not least, we present a case study of the user scheduling optimization problem in heterogeneous UAVs-based cellular networks, where the multi-agent deep deterministic policy gradient (MADDPG) technique is developed to validate the effectiveness of distributed coordination in heterogeneous NTNs.
Two-Player Dynamic Potential LQ Games with Sequentially Revealed Costs
We investigate a novel finite-horizon linear-quadratic (LQ) feedback dynamic potential game with a priori unknown cost matrices played between two players. The cost matrices are revealed to the players sequentially, with the potential for future values to be previewed over a short time window. We propose an algorithm that enables the players to predict and track a feedback Nash equilibrium trajectory, and we measure the quality of their resulting decisions by introducing the concept of \emph{price of uncertainty}. We show that under the proposed algorithm, the price of uncertainty is bounded by horizon-invariant constants. The constants are the sum of three terms; the first and second terms decay exponentially as the preview window grows, and another depends on the magnitude of the differences between the cost matrices for each player. Through simulations, we illustrate that the resulting price of uncertainty initially decays at an exponential rate as the preview window lengthens, then remains constant for large time horizons.