MyArxiv
Robotics
PO-VINS: An Efficient and Robust Pose-Only Visual-Inertial State Estimator With LiDAR Enhancement
The pose adjustment (PA) with a pose-only visual representation has been proven equivalent to the bundle adjustment (BA), while significantly improving the computational efficiency. However, the pose-only solution has not yet been properly considered in a tightly-coupled visual-inertial state estimator (VISE) with a normal configuration for real-time navigation. In this study, we propose a tightly-coupled LiDAR-enhanced VISE, named PO-VINS, with a full pose-only form for visual and LiDAR-depth measurements. Based on the pose-only visual representation, we derive the analytical depth uncertainty, which is then employed for rejecting LiDAR depth outliers. Besides, we propose a multi-state constraint (MSC)-based LiDAR-depth measurement model with a pose-only form, to balance efficiency and robustness. The pose-only visual and LiDAR-depth measurements and the IMU-preintegration measurements are tightly integrated under the factor graph optimization framework to perform efficient and accurate state estimation. Exhaustive experimental results on private and public datasets indicate that the proposed PO-VINS yields improved or comparable accuracy to state-of-the-art methods. Compared to the baseline method LE-VINS, the state-estimation efficiency of PO-VINS is improved by 33% and 56% on the laptop PC and the onboard ARM computer, respectively. Besides, PO-VINS yields higher accuracy and robustness than LE-VINS by employing the proposed uncertainty-based outlier-culling method and the MSC-based measurement model for LiDAR depth.
comment: 17 pages, 13 figures, 8 tables
Systems and Control (CS)
Learning-Based Efficient Approximation of Data-Enabled Predictive Control
Data-Enabled Predictive Control (DeePC) bypasses the need for system identification by directly leveraging raw data to formulate optimal control policies. However, the size of the optimization problem in DeePC grows linearly with respect to the data size, which prohibits its application to resource-constrained systems due to high computational costs. In this paper, we propose an efficient approximation of DeePC, whose size is invariant with respect to the amount of data collected, via differentiable convex programming. Specifically, the optimization problem in DeePC is decomposed into two parts: a control objective and a scoring function that evaluates the likelihood of a guessed I/O sequence, the latter of which is approximated with a size-invariant learned optimization problem. The proposed method is validated through numerical simulations on a quadruple tank system, illustrating that the learned controller can reduce the computational time of DeePC by a factor of 5 while maintaining its control performance.
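To make the scaling issue concrete, the following minimal sketch builds the block-Hankel data matrix on which DeePC formulations are usually based, and shows that its column count (the dimension of the decision vector) grows linearly with the number of recorded samples. The helper name `block_hankel`, the horizon `depth`, and the random data are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def block_hankel(signal: np.ndarray, depth: int) -> np.ndarray:
    """Stack `depth` consecutive samples of a (T, m) signal into each column."""
    T, m = signal.shape
    cols = T - depth + 1
    if cols < 1:
        raise ValueError("depth must not exceed the number of samples")
    # Column j holds samples j, j+1, ..., j+depth-1, flattened to length depth*m.
    return np.column_stack(
        [signal[j:j + depth].reshape(-1) for j in range(cols)]
    )

rng = np.random.default_rng(0)
depth = 10  # hypothetical combined past + future horizon
widths = {}
for T in (50, 100, 200):
    u = rng.standard_normal((T, 2))  # hypothetical input data with 2 channels
    H = block_hankel(u, depth)
    widths[T] = H.shape[1]  # one decision variable per column
    print(f"T={T}: Hankel matrix shape {H.shape}")
```

Each additional sample adds one column, so the decision vector grows with the dataset. This is the cost that a size-invariant approximation is meant to avoid.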
Distributionally Robust Stochastic Data-Driven Predictive Control with Optimized Feedback Gain
We consider the problem of direct data-driven predictive control for unknown stochastic linear time-invariant (LTI) systems with partial state observation. Building upon our previous research on data-driven stochastic control, this paper (i) relaxes the assumption of Gaussian process and measurement noise, and (ii) enables optimization of the gain matrix within the affine feedback policy. Output safety constraints are modelled using conditional value-at-risk, and enforced in a distributionally robust sense. Under idealized assumptions, we prove that our proposed data-driven control method yields control inputs identical to those produced by an equivalent model-based stochastic predictive controller. A simulation study illustrates the enhanced performance of our approach over previous designs.
comment: 8 pages, 1 figure, 2 tables, the extended version of an accepted paper in Conference on Decision and Control (CDC). arXiv admin note: text overlap with arXiv:2312.15177
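As a small illustration of the risk measure mentioned in this abstract, the sketch below computes an empirical conditional value-at-risk (CVaR) from samples using the Rockafellar-Uryasev variational form. It is a plain sample-based estimate, not the distributionally robust formulation of the paper. The function name and the sample data are assumptions made for the example.

```python
import numpy as np

def empirical_cvar(losses, alpha: float) -> float:
    """Empirical CVaR at level alpha.

    Uses CVaR = min_t  t + E[max(loss - t, 0)] / (1 - alpha).
    The objective is piecewise linear and convex in t with breakpoints at
    the samples, so it suffices to evaluate it at the sample values.
    """
    losses = np.asarray(losses, dtype=float)
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    values = [
        t + np.mean(np.maximum(losses - t, 0.0)) / (1.0 - alpha)
        for t in losses
    ]
    return float(min(values))

# Hypothetical constraint-violation samples (output minus safety bound).
samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(empirical_cvar(samples, 0.8))  # average of the worst 20% of samples
print(empirical_cvar(samples, 0.0))  # reduces to the plain mean
```

A CVaR constraint of this kind limits the average of the worst-case tail rather than only the probability of violation, which is why it is often chosen for safety constraints.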
Decentralized Control of Multi-Agent Systems Under Acyclic Spatio-Temporal Task Dependencies
We introduce a novel distributed sampled-data control method tailored for heterogeneous multi-agent systems under a global spatio-temporal task with acyclic dependencies. Specifically, we consider the global task as a conjunction of independent and collaborative tasks, defined over the absolute and relative states of agent pairs. Task dependencies in this form are then represented by a task graph, which we assume to be acyclic. From the given task graph, we provide an algorithmic approach to define a distributed sampled-data controller prioritizing the fulfilment of collaborative tasks as the primary objective, while fulfilling independent tasks unless they conflict with collaborative ones. Moreover, communication maintenance among collaborating agents is seamlessly enforced within the proposed control framework. A numerical simulation is provided to showcase the potential of our control framework.
comment: Short version of this paper was accepted for the Conference on Decision and Control. Reupload was needed for a misspelt name and corrected minor typos
Robotics
Cooptimizing Safety and Performance with a Control-Constrained Formulation
Autonomous systems have witnessed a rapid increase in their capabilities, but it remains a challenge for them to perform tasks both effectively and safely. The fact that performance and safety can sometimes be competing objectives renders the cooptimization between them difficult. One school of thought is to treat this cooptimization as a constrained optimal control problem with a performance-oriented objective function and safety as a constraint. However, solving this constrained optimal control problem for general nonlinear systems remains challenging. In this work, we use the general framework of constrained optimal control, but given the safety state constraint, we convert it into an equivalent control constraint, resulting in a state and time-dependent control-constrained optimal control problem. This equivalent optimal control problem can readily be solved using the dynamic programming principle. We show the corresponding value function is a viscosity solution of a certain Hamilton-Jacobi-Bellman Partial Differential Equation (HJB-PDE). Furthermore, we demonstrate the effectiveness of our method with a two-dimensional case study, and the experiment shows that the controller synthesized using our method consistently outperforms the baselines, both in safety and performance.
comment: Submitted to ACC with L-CSS option
Technical Report of Mobile Manipulator Robot for Industrial Environments
This paper presents the development of the Auriga @Work robot, designed by the Robotics and Intelligent Automation Lab at Shahid Beheshti University, Department of Electrical Engineering, for the RoboCup 2024 competition. The robot is tailored for industrial applications, focusing on enhancing efficiency in repetitive or hazardous environments. It is equipped with a 4-wheel Mecanum drive system for omnidirectional mobility and a 5-degree-of-freedom manipulator arm with a custom 3D-printed gripper for object manipulation and navigation tasks. The robot's electronics are powered by custom-designed boards utilizing ESP32 microcontrollers and an Nvidia Jetson Nano for real-time control and decision-making. The key software stack integrates Hector SLAM for mapping, the A* algorithm for path planning, and YOLO for object detection, along with advanced sensor fusion for improved navigation and collision avoidance.
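For readers unfamiliar with the path-planning component named above, here is a minimal sketch of A* on a 4-connected occupancy grid. The grid, the unit step cost, and the Manhattan heuristic are assumptions chosen for illustration and do not describe the Auriga @Work implementation.

```python
import heapq

def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (blocked) cells."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:  # walk parents back to the start
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None  # goal unreachable

# Hypothetical map: the only gap in the second row is at column 2.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = astar(grid, (0, 0), (3, 0))
print(path)
```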
Towards Localizing Structural Elements: Merging Geometrical Detection with Semantic Verification in RGB-D Data
RGB-D cameras supply rich and dense visual and spatial information for various robotics tasks such as scene understanding, map reconstruction, and localization. Integrating depth and visual information can aid robots in localization and element mapping, advancing applications like 3D scene graph generation and Visual Simultaneous Localization and Mapping (VSLAM). While point cloud data containing such information is primarily used for enhanced scene understanding, exploiting their potential to capture and represent rich semantic information has yet to be adequately targeted. This paper presents a real-time pipeline for localizing building components, including wall and ground surfaces, by integrating geometric calculations for pure 3D plane detection followed by validating their semantic category using point cloud data from RGB-D cameras. It has a parallel multi-thread architecture to precisely estimate poses and equations of all the planes detected in the environment, filters the ones forming the map structure using a panoptic segmentation validation, and keeps only the validated building components. Incorporating the proposed method into a VSLAM framework confirmed that constraining the map with the detected environment-driven semantic elements can improve scene understanding and map reconstruction accuracy. It can also ensure (re-)association of these detected components into a unified 3D scene graph, bridging the gap between geometric accuracy and semantic understanding. Additionally, the pipeline allows for the detection of potential higher-level structural entities, such as rooms, by identifying the relationships between building components based on their layout.
comment: 6 pages, 5 figures, 3 tables
One-Shot Imitation under Mismatched Execution
Human demonstrations as prompts are a powerful way to program robots to do long-horizon manipulation tasks. However, directly translating such demonstrations into robot-executable actions poses significant challenges due to execution mismatches, such as different movement styles and physical capabilities. Existing methods either rely on robot-demonstrator paired data, which is infeasible to scale, or overly rely on frame-level visual similarities, which fail to hold. To address these challenges, we propose RHyME, a novel framework that automatically establishes task execution correspondences between the robot and the demonstrator by using optimal transport costs. Given long-horizon robot demonstrations, RHyME synthesizes semantically equivalent human demonstrations by retrieving and composing similar short-horizon human clips, facilitating effective policy training without the need for paired data. We show that RHyME outperforms a range of baselines across various cross-embodiment datasets on all degrees of mismatches. Through detailed analysis, we uncover insights for learning and leveraging cross-embodiment visual representations.
DemoStart: Demonstration-led auto-curriculum applied to sim-to-real with multi-fingered robots
We present DemoStart, a novel auto-curriculum reinforcement learning method capable of learning complex manipulation behaviors on an arm equipped with a three-fingered robotic hand, from only a sparse reward and a handful of demonstrations in simulation. Learning from simulation drastically reduces the development cycle of behavior generation, and domain randomization techniques are leveraged to achieve successful zero-shot sim-to-real transfer. Transferred policies are learned directly from raw pixels from multiple cameras and robot proprioception. Our approach outperforms policies learned from demonstrations on the real robot and requires 100 times fewer demonstrations, collected in simulation. More details and videos in https://sites.google.com/view/demostart.
comment: 15 pages total with 7 pages of appendix. 9 Figures, 4 in the main text and 5 in the appendix
Simulation-based Scenario Generation for Robust Hybrid AI for Autonomy
Application of Unmanned Aerial Vehicles (UAVs) in search and rescue, emergency management, and law enforcement has gained traction with the advent of low-cost platforms and sensor payloads. The emergence of hybrid neural and symbolic AI approaches for complex reasoning is expected to further push the boundaries of these applications with decreasing levels of human intervention. However, current UAV simulation environments lack semantic context suited to this hybrid approach. To address this gap, HAMERITT (Hybrid Ai Mission Environment for RapId Training and Testing) provides a simulation-based autonomy software framework that supports the training, testing and assurance of neuro-symbolic algorithms for autonomous maneuver and perception reasoning. HAMERITT includes scenario generation capabilities that offer mission-relevant contextual symbolic information in addition to raw sensor data. Scenarios include symbolic descriptions for entities of interest and their relations to scene elements, as well as spatial-temporal constraints in the form of time-bounded areas of interest with prior probabilities and restricted zones within those areas. HAMERITT also features support for training distinct algorithm threads for maneuver vs. perception within an end-to-end mission run. Future work includes improving scenario realism and scaling symbolic context generation through automated workflow.
comment: 6 pages, 5 figures, 1 table
MAPS: Energy-Reliability Tradeoff Management in Autonomous Vehicles Through LLMs Penetrated Science
As autonomous vehicles become more prevalent, highly accurate and efficient systems are increasingly critical to improve safety, performance, and energy consumption. Efficient management of energy-reliability tradeoffs in these systems demands the ability to predict various conditions during vehicle operations. With the promising improvement of Large Language Models (LLMs) and the emergence of well-known models like ChatGPT, unique opportunities for autonomous vehicle-related predictions have been provided in recent years. This paper proposes MAPS, which uses LLMs as map-reader co-drivers to predict the vital parameters to set during autonomous vehicle operation in order to balance the energy-reliability tradeoff. The MAPS method demonstrates a 20% improvement in navigation accuracy compared to the best baseline method. MAPS also shows 11% energy savings in computational units and up to 54% in both mechanical and computational units.
Social Mediation through Robots -- A Scoping Review on Improving Group Interactions through Directed Robot Action using an Extended Group Process Model
Group processes refer to the dynamics that occur within a group and are critical for understanding how groups function. With robots being increasingly placed within small groups, improving these processes has emerged as an important application of social robotics. Social Mediation Robots elicit behavioral change within groups by deliberately influencing the processes of groups. While research in this field has demonstrated that robots can effectively affect interpersonal dynamics, there is a notable gap in integrating these insights to develop coherent understanding and theory. We present a scoping review of literature targeting changes in social interactions between multiple humans through intentional action from robotic agents. To guide our review, we adapt the classical Input-Process-Output (I-P-O) models into what we call the "Mediation I-P-O model". We evaluated 1633 publications, which yielded 89 distinct social mediation concepts. We construct 11 mediation approaches robots can use to shape processes in small groups and teams. This work strives to produce generalizable insights and evaluate the extent to which the potential of social mediation through robots has been realized thus far. We hope that the proposed framework encourages a holistic approach to the study of social mediation and provides a foundation to standardize future reporting in the domain.
Multi-robot Task Allocation and Path Planning with Maximum Range Constraints
This letter presents a novel multi-robot task allocation and path planning method that considers robots' maximum range constraints in large-sized workspaces, enabling robots to complete the assigned tasks within their range limits. First, we develop a fast path planner to solve global paths efficiently. Subsequently, we propose an innovative auction-based approach that integrates our path planner into the auction phase for reward computation while considering the robots' range limits. This method accounts for extra obstacle-avoiding travel distances rather than ideal straight-line distances, resolving the coupling between task allocation and path planning. Additionally, to avoid redundant computations during iterations, we implement a lazy auction strategy to speed up the convergence of the task allocation. Finally, we validate the proposed method's effectiveness and application potential through extensive simulation and real-world experiments. The implementation code for our method will be available at https://github.com/wuuya1/RangeTAP.
Asymptotically Optimal Lazy Lifelong Sampling-based Algorithm for Efficient Motion Planning in Dynamic Environments
The paper introduces an asymptotically optimal lifelong sampling-based path planning algorithm that combines the merits of lifelong planning algorithms and lazy search algorithms for rapid replanning in dynamic environments where edge evaluation is expensive. By evaluating only sub-path candidates for the optimal solution, the algorithm saves considerable evaluation time and thereby reduces the overall planning cost. It employs a novel informed rewiring cascade to efficiently repair the search tree when the underlying search graph changes. Simulation results demonstrate that the algorithm outperforms various state-of-the-art sampling-based planners in addressing both static and dynamic motion planning problems.
Advancements in Gesture Recognition Techniques and Machine Learning for Enhanced Human-Robot Interaction: A Comprehensive Review
In recent years, robots have become an important part of our day-to-day lives, with various applications. Human-robot interaction (HRI) has a positive impact on the field of robotics by enabling people to interact and communicate with robots. Gesture recognition techniques combined with machine learning algorithms have shown remarkable progress in recent years, particularly in HRI. This paper comprehensively reviews the latest advancements in gesture recognition methods and their integration with machine learning approaches to enhance HRI. Furthermore, this paper presents vision-based gesture recognition with a depth-sensing system for safe and reliable human-robot interaction, and analyses the role of machine learning algorithms such as deep learning, reinforcement learning, and transfer learning in improving the accuracy and robustness of gesture recognition systems for effective communication between humans and robots.
comment: 19 pages, 1 figure
A Novel Ternary Evolving Estimator for Positioning Unmanned Aerial Vehicle in Harsh Environments
Obtaining reliable position estimation is fundamental for unmanned aerial vehicles during mission execution, especially in harsh environments. However, environmental interference and abrupt changes usually degrade measurement reliability, leading to estimation divergence. To address this, existing works explore adaptive adjustment of sensor confidence. Unfortunately, existing methods typically lack synchronous evaluation of estimation precision, thereby rendering adjustments sensitive to abnormal data and susceptible to divergence. To tackle this issue, we propose a novel ternary-channel adaptive evolving estimator equipped with an online error monitor, where the ternary channels, namely the states, the noise covariance matrices and especially the aerial drag, evolve simultaneously with the environment. Firstly, an augmented filter is employed to pre-process multidimensional data, followed by an inverse-Wishart smoother utilized to obtain posterior states and covariance matrices. The error propagation relation during estimation is analysed, and hence an indicator is devised for online monitoring of estimation errors. Under this premise, several restrictions are applied to suppress potential divergence caused by interference. Additionally, considering motion dynamics, the aerial drag matrix is reformulated based on updated states and covariance matrices. Finally, the observability, numerical sensitivity and arithmetic complexity of the proposed estimator are mathematically analyzed. Extensive experiments are conducted in both common and harsh environments (with average RMSE of 0.17 m and 0.39 m, respectively) to verify the adaptability of the algorithm and the effectiveness of the restriction design, showing that our method significantly outperforms the state-of-the-art.
comment: 14 pages, 13 figures
Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles
The generation of corner cases has become increasingly crucial for efficiently testing autonomous vehicles prior to road deployment. However, existing methods struggle to accommodate diverse testing requirements and often lack the ability to generalize to unseen situations, thereby reducing the convenience and usability of the generated scenarios. A method that facilitates easily controllable scenario generation for efficient autonomous vehicle (AV) testing with realistic and challenging situations is greatly needed. To address this, we propose OmniTester: a multimodal Large Language Model (LLM) based framework that fully leverages the extensive world knowledge and reasoning capabilities of LLMs. OmniTester is designed to generate realistic and diverse scenarios within a simulation environment, offering a robust solution for testing and evaluating AVs. In addition to prompt engineering, we employ tools from Simulation of Urban Mobility to simplify the complexity of codes generated by LLMs. Furthermore, we incorporate Retrieval-Augmented Generation and a self-improvement mechanism to enhance the LLM's understanding of scenarios, thereby increasing its ability to produce more realistic scenes. In the experiments, we demonstrate the controllability and realism of our approach in generating three types of challenging and complex scenarios. Additionally, we showcase its effectiveness in reconstructing new scenarios described in crash reports, driven by the generalization capability of LLMs.
Human-mimetic binaural ear design and sound source direction estimation for task realization of musculoskeletal humanoids
Human-like environment recognition by musculoskeletal humanoids is important for task realization in real complex environments and for use as dummies for test subjects. Humans integrate various sensory information to perceive their surroundings, and hearing is particularly useful for recognizing objects out of view or out of touch. In this research, we aim to realize human-like auditory environmental recognition and task realization for musculoskeletal humanoids by equipping them with a human-like auditory processing system. Humans realize sound-based environmental recognition by estimating directions of the sound sources and detecting environmental sounds based on changes in the time and frequency domain of incoming sounds and the integration of auditory information in the central nervous system. We propose a human mimetic auditory information processing system, which consists of three components: the human mimetic binaural ear unit, which mimics human ear structure and characteristics, the sound source direction estimation system, and the environmental sound detection system, which mimics processing in the central nervous system. We apply it to Musashi, a human mimetic musculoskeletal humanoid, and have it perform tasks that require sound information outside of view in real noisy environments to confirm the usefulness of the proposed methods.
comment: Accepted at ROBOMECH Journal
GeMuCo: Generalized Multisensory Correlational Model for Body Schema Learning
Humans can autonomously learn the relationship between sensation and motion in their own bodies, estimate and control their own body states, and move while continuously adapting to the current environment. On the other hand, current robots control their bodies by learning the network structure described by humans from their experiences, making certain assumptions on the relationship between sensors and actuators. In addition, the network model does not adapt to changes in the robot's body, the tools that are grasped, or the environment, and there is no unified theory, not only for control but also for state estimation, anomaly detection, simulation, and so on. In this study, we propose a Generalized Multisensory Correlational Model (GeMuCo), in which the robot itself acquires a body schema describing the correlation between sensors and actuators from its own experience, including model structures such as network input/output. The robot adapts to the current environment by updating this body schema model online, estimates and controls its body state, and even performs anomaly detection and simulation. We demonstrate the effectiveness of this method by applying it to tool-use considering changes in grasping state for an axis-driven robot, to joint-muscle mapping learning for a musculoskeletal robot, and to full-body tool manipulation for a low-rigidity plastic-made humanoid.
comment: Accepted at IEEE Robotics and Automation Magazine
Mathematical Modeling Of Four Finger Robotic Grippers
Robotic grippers are the end effectors of a robot system and are used to perform various operations in industrial applications and hazardous tasks. In this paper, we develop a mathematical model for multi-fingered robotic grippers. We are concerned with Jamia's hand, which was developed in the Robotics Lab, Mechanical Engineering Deptt, Faculty of Engg & Technology, Jamia Millia Islamia, India. It is a tendon-driven gripper in which each finger has three DOF, for a total of 11 DOF. The term tendon is widely used to imply belts, cables, or similar types of applications. The hand is made up of three fingers and a thumb. Every finger and thumb has one degree of freedom. The power transmission mechanism is a rope and pulley system. Both hands have similar structures. Aluminum from the 5083 family was used to make the hand. The gripping force can be adjusted. We carried out the kinematic, force, and dynamic analysis by developing a mathematical model for the four-finger robotic gripper and its thumb. We focused on controlling its motion in terms of X and Y displacements together with the angular positions, and we performed a force analysis of the four fingers and thumb to calculate the maximum weight, force, and torque required to move them with a mass. The force-displacement graph shows linear behavior up to 250 N and nonlinear behavior beyond this. The required Dmin of the wire is 0.86 mm for grasping the maximum load of 1 kg. We also developed a dynamic model using the energy-based Lagrangian approach to find the torque required to move the fingers.
comment: 8 pages, 7 figures
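The relation between joint angles and fingertip X and Y displacements mentioned in this abstract can be sketched with standard planar forward kinematics for a serial three-joint finger. The link lengths and angles below are hypothetical placeholders, and the planar serial-chain model is an assumption made for illustration, not the model derived in the paper.

```python
import math

def fingertip_position(link_lengths, joint_angles):
    """Planar forward kinematics of a serial finger.

    Returns (x, y, phi): fingertip coordinates and the orientation of the
    last link, with each joint angle measured relative to the previous link.
    """
    if len(link_lengths) != len(joint_angles):
        raise ValueError("need one joint angle per link")
    x = y = phi = 0.0
    for length, angle in zip(link_lengths, joint_angles):
        phi += angle                  # accumulate relative joint angles
        x += length * math.cos(phi)   # X displacement contributed by this link
        y += length * math.sin(phi)   # Y displacement contributed by this link
    return x, y, phi

links = (0.04, 0.03, 0.02)  # hypothetical phalanx lengths in metres
straight = fingertip_position(links, (0.0, 0.0, 0.0))
curled = fingertip_position(links, (math.pi / 6, math.pi / 6, math.pi / 6))
print(straight)
print(curled)
```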
Soft Acoustic Curvature Sensor: Design and Development
This paper introduces a novel Soft Acoustic Curvature (SAC) sensor. SAC incorporates integrated audio components and features an acoustic channel within a flexible structure. A reference acoustic wave, generated by a speaker at one end of the channel, propagates and is received by a microphone at the other end of the channel. Our previous study revealed that acoustic wave energy dissipation varies with acoustic channel deformation, leading us to design a novel channel capable of large deformation due to bending. We then use Machine Learning (ML) models to establish a complex mapping between channel deformations and sound modulation. Various sound frequencies and ML models were evaluated to enhance curvature detection accuracy. The sensor, constructed using soft material and 3D printing, was validated experimentally, with curvature measurement errors remaining within 3.5 m^-1 for a range of 0 to 60 m^-1 curvatures. These results demonstrate the effectiveness of the proposed method for estimating curvatures. With its flexible structure, the SAC sensor holds potential for applications in soft robotics, including shape measurement for continuum manipulators, soft grippers, and wearable devices.
comment: To appear in Robotics and Automation Letters
Offline Task Assistance Planning on a Graph: Theoretic and Algorithmic Foundations
In this work we introduce the problem of task assistance planning where we are given two robots Rtask and Rassist. The first robot, Rtask, is in charge of performing a given task by executing a precomputed path. The second robot, Rassist, is in charge of assisting the task performed by Rtask using on-board sensors. The ability of Rassist to provide assistance to Rtask depends on the locations of both robots. Since Rtask is moving along its path, Rassist may also need to move to provide as much assistance as possible. The problem we study is how to compute a path for Rassist so as to maximize the portion of Rtask's path for which assistance is provided. We limit the problem to the setting where Rassist moves on a roadmap which is a graph embedded in its configuration space and show that this problem is NP-hard. Fortunately, we show that when Rassist moves on a given path, and all we have to do is compute the times at which Rassist should move from one configuration to the following one, we can solve the problem optimally in polynomial time. Together with carefully-crafted upper bounds, this polynomial-time algorithm is integrated into a Branch and Bound-based algorithm that can compute optimal solutions to the problem outperforming baselines by several orders of magnitude. We demonstrate our work empirically in simulated scenarios containing both planar manipulators and UR robots as well as in the lab on real robots.
Adaptive Electronic Skin Sensitivity for Safe Human-Robot Interaction
Artificial electronic skins covering complete robot bodies can make physical human-robot collaboration safe and hence possible. Standards for collaborative robots (e.g., ISO/TS 15066) prescribe permissible forces and pressures during contacts with the human body. These characteristics of the collision depend on the speed of the colliding robot link but also on its effective mass. Thus, to warrant contacts complying with the Power and Force Limiting (PFL) collaborative regime but at the same time maximizing productivity, protective skin thresholds should be set individually for different parts of the robot bodies and dynamically on the run. Here we present and empirically evaluate four scenarios: (a) static and uniform - fixed thresholds for the whole skin, (b) static but different settings for robot body parts, (c) dynamically set based on every link velocity, (d) dynamically set based on effective mass of every robot link. We perform experiments in simulation and on a real 6-axis collaborative robot arm (UR10e) completely covered with sensitive skin (AIRSKIN) comprising eleven individual pads. On a mock pick-and-place scenario with transient collisions with the robot body parts and two collision reactions (stop and avoid), we demonstrate the boost in productivity in going from the most conservative setting of the skin thresholds (a) to the most adaptive setting (d). The threshold settings for every skin pad are adapted with a frequency of 25 Hz. This work can be easily extended for platforms with more degrees of freedom and larger skin coverage (humanoids) and to social human-robot interaction scenarios where contacts with the robot will be used for communication.
One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion
Deep Reinforcement Learning techniques are achieving state-of-the-art results in robust legged locomotion. While there exists a wide variety of legged platforms such as quadrupeds, humanoids, and hexapods, the field is still missing a single learning framework that can control all these different embodiments easily and effectively and possibly transfer, zero or few-shot, to unseen robot embodiments. We introduce URMA, the Unified Robot Morphology Architecture, to close this gap. Our framework brings the end-to-end Multi-Task Reinforcement Learning approach to the realm of legged robots, enabling the learned policy to control any type of robot morphology. The key idea of our method is to allow the network to learn an abstract locomotion controller that can be seamlessly shared between embodiments thanks to our morphology-agnostic encoders and decoders. This flexible architecture can be seen as a potential first step in building a foundation model for legged robot locomotion. Our experiments show that URMA can learn a locomotion policy on multiple embodiments that can be easily transferred to unseen robot platforms in simulation and the real world.
Autonomous Iterative Motion Learning (AI-MOLE) of a SCARA Robot for Automated Myocardial Injection
Stem cell therapy is a promising approach to treat heart insufficiency and benefits from automated myocardial injection which requires highly precise motion of a robotic manipulator that is equipped with a syringe. This work investigates whether sufficiently precise motion can be achieved by combining a SCARA robot and learning control methods. For this purpose, the method Autonomous Iterative Motion Learning (AI-MOLE) is extended to be applicable to multi-input/multi-output systems. The proposed learning method solves reference tracking tasks in systems with unknown, nonlinear, multi-input/multi-output dynamics by iteratively updating an input trajectory in a plug-and-play fashion and without requiring manual parameter tuning. The proposed learning method is validated in a preliminary simulation study of a simplified SCARA robot that has to perform three desired motions. The results demonstrate that the proposed learning method achieves highly precise reference tracking without requiring any a priori model information or manual parameter tuning in as little as 15 trials per motion. The results further indicate that the combination of a SCARA robot and learning method achieves sufficiently precise motion to potentially enable automatic myocardial injection if similar results can be obtained in a real-world setting.
comment: 6 pages, 4 figures
Spectral oversubtraction? An approach for speech enhancement after robot ego speech filtering in semi-real-time ICASSP
Spectral subtraction, widely used for its simplicity, has been employed to address the Robot Ego Speech Filtering (RESF) problem of detecting the speech content of human interruptions from a robot's single-channel microphone recordings while the robot is speaking. However, this approach suffers from oversubtraction in the fundamental frequency range (FFR), leading to degraded speech content recognition. To address this, we propose a Two-Mask Conformer-based Metric Generative Adversarial Network (CMGAN) to enhance the detected speech and improve recognition results. Our model compensates for oversubtracted FFR values with high-frequency information and long-term features and then de-noises the new spectrogram. In addition, we introduce an incremental processing method that allows semi-real-time audio processing with streaming input on a network trained on long fixed-length input. Evaluations on two datasets, including one with unseen noise, demonstrate significant improvements in recognition accuracy and the effectiveness of the proposed two-mask approach and incremental processing, enhancing the robustness of the proposed RESF pipeline in real-world HRI scenarios.
comment: 6 pages, 2 figures, submitted to 2025 IEEE ICASSP
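To make the oversubtraction problem in the abstract above concrete, here is a minimal sketch of textbook magnitude spectral subtraction. It is not the paper's RESF pipeline or its CMGAN model; the function name and the `alpha` and `floor` parameters are hypothetical, and the four-bin spectra are made-up toy values.

```python
import numpy as np

def spectral_subtract(mixture_mag, ego_mag, alpha=1.0, floor=0.0):
    """Subtract an estimated ego-speech magnitude spectrum from the mixture.

    alpha > 1 subtracts more than the estimate (over-subtraction);
    bins that would fall below floor * mixture_mag are clamped to that value.
    Both parameters are illustrative assumptions, not values from the paper.
    """
    mixture_mag = np.asarray(mixture_mag, dtype=float)
    ego_mag = np.asarray(ego_mag, dtype=float)
    residual = mixture_mag - alpha * ego_mag
    return np.maximum(residual, floor * mixture_mag)

# Toy magnitude spectra with 4 frequency bins; bin 0 plays the role of the
# low-frequency region where the robot's own voice dominates.
mixture = np.array([1.0, 0.8, 0.5, 0.3])
ego = np.array([0.9, 0.2, 0.1, 0.0])
cleaned = spectral_subtract(mixture, ego, alpha=1.2)
```

With `alpha=1.2`, the first bin is driven to zero even though the mixture contained slightly more energy there than the ego-speech estimate. Losses of this kind in the low-frequency bins are what the abstract's enhancement stage is meant to compensate for.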
Restoration of Reduced Self-Efficacy Caused by Chronic Pain through Manipulated Sensory Discrepancy
Human physical function is governed by self-efficacy, the belief in one's motor capacity. In chronic pain patients, this capacity may remain reduced long after the damage causing the pain has been cured. Chronic pain alters body schema, affecting how patients perceive the dimension and pose of their bodies. We exploit this deficit using robotic manipulation technology and augmented sensory stimuli through virtual reality technology. We propose a sensory stimuli manipulation method aimed at modifying body schema to restore lost self-efficacy.
Test-Time Certifiable Self-Supervision to Bridge the Sim2Real Gap in Event-Based Satellite Pose Estimation IROS 2024
Deep learning plays a critical role in vision-based satellite pose estimation. However, the scarcity of real data from the space environment means that deep models need to be trained using synthetic data, which raises the Sim2Real domain gap problem. A major cause of the Sim2Real gap is the novel lighting conditions encountered at test time. Event sensors have been shown to provide some robustness against lighting variations in vision-based pose estimation. However, challenging lighting conditions due to strong directional light can still cause undesirable effects in the output of commercial off-the-shelf event sensors, such as noisy/spurious events and inhomogeneous event densities on the object. Such effects are non-trivial to simulate in software, thus leading to a Sim2Real gap in the event domain. To close the Sim2Real gap in event-based satellite pose estimation, the paper proposes a test-time self-supervision scheme with a certifier module. Self-supervision is enabled by an optimisation routine that aligns a dense point cloud of the predicted satellite pose with the event data to attempt to rectify the inaccurately estimated pose. The certifier attempts to verify the corrected pose, and only certified test-time inputs are backpropagated via implicit differentiation to refine the predicted landmarks, thus improving the pose estimates and closing the Sim2Real gap. Results show that our method outperforms established test-time adaptation schemes.
comment: This work has been accepted for publication at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024). Copyright may be transferred without notice, after which this version may no longer be accessible
Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance IROS
3D point clouds enhance the robot's ability to perceive the geometrical information of the environments, enabling many downstream tasks such as grasp pose detection and scene understanding. The performance of these tasks, though, heavily relies on the quality of data input, as incomplete data can lead to poor results and failure cases. Recent training loss functions designed for deep learning-based point cloud completion, such as Chamfer distance (CD) and its variants (e.g., HyperCD), imply a good gradient weighting scheme can significantly boost performance. However, these CD-based loss functions usually require data-related parameter tuning, which can be time-consuming for data-extensive tasks. To address this issue, we aim to find a family of weighted training losses (weighted CD) that requires no parameter tuning. To this end, we propose a search scheme, Loss Distillation via Gradient Matching, to find good candidate loss functions by mimicking the learning behavior in backpropagation between HyperCD and weighted CD. Once this is done, we propose a novel bilevel optimization formula to train the backbone network based on the weighted CD loss. We observe that: (1) with proper weighted functions, the weighted CD can always achieve similar performance to HyperCD, and (2) the Landau weighted CD, namely Landau CD, can outperform HyperCD for point cloud completion and lead to new state-of-the-art results on several benchmark datasets. Our demo code is available at https://github.com/Zhang-VISLab/IROS2024-LossDistillationWeightedCD.
comment: 10 pages, 7 figures, 7 tables, this paper was accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024
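For readers unfamiliar with the Chamfer distance named in the abstract above, the following is a minimal NumPy sketch of a symmetric Chamfer distance in which each nearest-neighbour squared distance is passed through a transform before averaging. The `weight_fn` hook and the toy point sets are assumptions made for illustration; this is not the paper's Landau CD, HyperCD, or its bilevel training scheme.

```python
import numpy as np

def weighted_chamfer(pred, target, weight_fn=lambda d: d):
    """Symmetric Chamfer distance with a per-distance transform.

    pred:   (N, 3) array of predicted points
    target: (M, 3) array of ground-truth points
    weight_fn: hypothetical hook applied to each nearest-neighbour squared
               distance; the identity gives the plain Chamfer distance.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Pairwise squared Euclidean distances, shape (N, M).
    diff = pred[:, None, :] - target[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    pred_to_target = d2.min(axis=1)   # each predicted point to its nearest target
    target_to_pred = d2.min(axis=0)   # each target point to its nearest prediction
    return weight_fn(pred_to_target).mean() + weight_fn(target_to_pred).mean()

pred = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
target = [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
plain_cd = weighted_chamfer(pred, target)
# A concave transform (here log1p, chosen only as an example) grows more slowly
# for large distances, so poorly matched points contribute less to the loss.
damped_cd = weighted_chamfer(pred, target, weight_fn=np.log1p)
```

Because the gradient of `weight_fn(d)` with respect to the point coordinates is the gradient of `d` scaled by the derivative of `weight_fn`, the choice of transform acts as a per-point gradient weight, which is one way to read the abstract's remark about gradient weighting schemes.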
Robust Agility via Learned Zero Dynamics Policies IROS '24
We study the design of robust and agile controllers for hybrid underactuated systems. Our approach breaks down the task of creating a stabilizing controller into: 1) learning a mapping that is invariant under optimal control, and 2) driving the actuated coordinates to the output of that mapping. This approach, termed Zero Dynamics Policies, exploits the structure of underactuation by restricting the inputs of the target mapping to the subset of degrees of freedom that cannot be directly actuated, thereby achieving significant dimension reduction. Furthermore, we retain the stability and constraint satisfaction of optimal control while reducing the online computational overhead. We prove that controllers of this type stabilize hybrid underactuated systems and experimentally validate our approach on the 3D hopping platform, ARCHER. Over the course of 3000 hops the proposed framework demonstrates robust agility, maintaining stable hopping while rejecting disturbances on rough terrain.
comment: 8 pages, 6 figures, IROS '24
Heterogeneous LiDAR Dataset for Benchmarking Robust Localization in Diverse Degenerate Scenarios
The ability to estimate pose and generate maps using 3D LiDAR significantly enhances robotic system autonomy. However, existing open-source datasets lack representation of geometrically degenerate environments, limiting the development and benchmarking of robust LiDAR SLAM algorithms. To address this gap, we introduce GEODE, a comprehensive multi-LiDAR, multi-scenario dataset specifically designed to include real-world geometrically degenerate environments. GEODE comprises 64 trajectories spanning over 64 kilometers across seven diverse settings with varying degrees of degeneracy. The data was meticulously collected to promote the development of versatile algorithms by incorporating various LiDAR sensors, stereo cameras, IMUs, and diverse motion conditions. We evaluate state-of-the-art SLAM approaches using the GEODE dataset to highlight current limitations in LiDAR SLAM techniques. This extensive dataset will be publicly available at https://geode.github.io, supporting further advancements in LiDAR-based SLAM.
comment: 15 pages, 9 figures, 6 tables. Submitted for IJRR dataset paper
Leveraging LLMs, Graphs and Object Hierarchies for Task Planning in Large-Scale Environments
Planning methods struggle with computational intractability in solving task-level problems in large-scale environments. This work explores leveraging the commonsense knowledge encoded in LLMs to empower planning techniques to deal with these complex scenarios. We achieve this by efficiently using LLMs to prune irrelevant components from the planning problem's state space, substantially simplifying its complexity. We demonstrate the efficacy of this system through extensive experiments within a household simulation environment, alongside real-world validation using a 7-DoF manipulator (video https://youtu.be/6ro2UOtOQS4).
comment: 8 pages, 6 figures
Robust Single Rotation Averaging Revisited ECCV 2024
In this work, we propose a novel method for robust single rotation averaging that can efficiently handle an extremely large fraction of outliers. Our approach is to minimize the total truncated least unsquared deviations (TLUD) cost of geodesic distances. The proposed algorithm consists of three steps: First, we consider each input rotation as a potential initial solution and choose the one that yields the least sum of truncated chordal deviations. Next, we obtain the inlier set using the initial solution and compute its chordal $L_2$-mean. Finally, starting from this estimate, we iteratively compute the geodesic $L_1$-mean of the inliers using the Weiszfeld algorithm on $SO(3)$. An extensive evaluation shows that our method is robust against up to 99% outliers given a sufficient number of accurate inliers, outperforming the current state of the art.
comment: Accepted to ECCV 2024 Workshop on Recovering 6D Object Pose (R6D)
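The three steps listed in the abstract above can be sketched in NumPy as follows. This is an illustrative reading of the abstract only: the truncation threshold `c`, the iteration count, the helper names, and the toy data are all assumptions, and the authors' actual thresholds and stopping rules are not given in this listing.

```python
import numpy as np

def hat(v):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def exp_so3(v):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3)
    K = hat(np.asarray(v) / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def log_so3(R):
    """Rotation matrix -> rotation vector (not reliable for angles near pi)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-12:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2.0 * np.sin(theta)) * w

def chordal_l2_mean(rotations):
    """Project the sum of the rotations onto SO(3) via SVD."""
    U, _, Vt = np.linalg.svd(np.sum(rotations, axis=0))
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    return U @ D @ Vt

def robust_single_rotation_average(rotations, c=0.5, iters=50):
    rotations = np.asarray(rotations, dtype=float)
    # Step 1: treat every input as a candidate and keep the one with the
    # smallest sum of truncated chordal distances (c is a hypothetical threshold).
    chordal = np.linalg.norm(rotations[:, None] - rotations[None, :], axis=(2, 3))
    init = int(np.argmin(np.minimum(chordal, c).sum(axis=1)))
    # Step 2: inliers relative to the initial solution, then their chordal L2 mean.
    inliers = rotations[chordal[init] < c]
    R = chordal_l2_mean(inliers)
    # Step 3: Weiszfeld iterations for the geodesic L1 mean of the inliers.
    for _ in range(iters):
        vs = np.array([log_so3(Ri @ R.T) for Ri in inliers])
        w = 1.0 / np.maximum(np.linalg.norm(vs, axis=1), 1e-9)
        step = (w[:, None] * vs).sum(axis=0) / w.sum()
        R = exp_so3(step) @ R
        if np.linalg.norm(step) < 1e-10:
            break
    return R

# Toy data: 4 noisy inliers around R_true and 6 gross outliers.
rng = np.random.default_rng(0)
R_true = exp_so3(np.array([0.3, -0.2, 0.5]))
inliers = [exp_so3(0.01 * rng.standard_normal(3)) @ R_true for _ in range(4)]
outliers = [exp_so3(s * 2.5 * e) for e in np.eye(3) for s in (1.0, -1.0)]
R_est = robust_single_rotation_average(inliers + outliers)
error_rad = np.linalg.norm(log_so3(R_est @ R_true.T))
```

In this toy setup the outliers outnumber the inliers, yet the candidate search in step 1 still selects an inlier because the truncation caps how much any single outlier can add to a candidate's cost.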
System Neural Diversity: Measuring Behavioral Heterogeneity in Multi-Agent Learning
Evolutionary science provides evidence that diversity confers resilience in natural systems. Yet, traditional multi-agent reinforcement learning techniques commonly enforce homogeneity to increase training sample efficiency. When a system of learning agents is not constrained to homogeneous policies, individuals may develop diverse behaviors, resulting in emergent complementarity that benefits the system. Despite this, there is a surprising lack of tools that quantify behavioral diversity. Such techniques would pave the way towards understanding the impact of diversity in collective artificial intelligence and enabling its control. In this paper, we introduce System Neural Diversity (SND): a measure of behavioral heterogeneity in multi-agent systems. We discuss and prove its theoretical properties, and compare it with alternate, state-of-the-art behavioral diversity metrics used in the robotics domain. Through simulations of a variety of cooperative multi-robot tasks, we show how our metric constitutes an important tool that enables measurement and control of behavioral heterogeneity. In dynamic tasks, where the problem is affected by repeated disturbances during training, we show that SND allows us to measure latent resilience skills acquired by the agents, while other proxies, such as task performance (reward), fail to. Finally, we show how the metric can be employed to control diversity, allowing us to enforce a desired heterogeneity set-point or range. We demonstrate how this paradigm can be used to bootstrap the exploration phase, finding optimal policies faster, thus enabling novel and more efficient MARL paradigms.
Not All Errors Are Made Equal: A Regret Metric for Detecting System-level Trajectory Prediction Failures
Robot decision-making increasingly relies on data-driven human prediction models when operating around people. While these models are known to mispredict in out-of-distribution interactions, only a subset of prediction errors impact downstream robot performance. We propose characterizing such "system-level" prediction failures via the mathematical notion of regret: high-regret interactions are precisely those in which mispredictions degraded closed-loop robot performance. We further introduce a probabilistic generalization of regret that calibrates failure detection across disparate deployment contexts and renders regret compatible with reward-based and reward-free (e.g., generative) planners. In simulated autonomous driving interactions and social navigation interactions deployed on hardware, we showcase that our system-level failure metric can be used offline to automatically extract closed-loop human-robot interactions that state-of-the-art generative human predictors and robot planners previously struggled with. We further find that the very presence of high-regret data during human predictor fine-tuning is highly predictive of robot re-deployment performance improvements. Fine-tuning with the informative but significantly smaller high-regret data (23% of deployment data) is competitive with fine-tuning on the full deployment dataset, indicating a promising avenue for efficiently mitigating system-level human-robot interaction failures. Project website: https://cmu-intentlab.github.io/not-all-errors/
comment: 6 figures, 3 tables, Accepted to CoRL 2024
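As a rough illustration of the regret idea in the abstract above, the sketch below scores each logged interaction by how much better the best candidate robot plan would have done, in hindsight, than the plan that was executed. It covers only a simple deterministic, reward-based reading; the probabilistic generalization mentioned in the abstract is not shown, and the interaction names and reward values are invented for the example.

```python
def regret(executed_reward, candidate_rewards):
    """Hindsight regret: best achievable reward minus the reward obtained.

    candidate_rewards are rewards of alternative robot plans evaluated against
    the human behaviour that was actually observed (hypothetical inputs).
    """
    best = max(list(candidate_rewards) + [executed_reward])
    return best - executed_reward

# Hypothetical logged interactions: (executed reward, rewards of alternatives).
interactions = {
    "merge": (2.0, [2.0, 2.5]),
    "crossing": (0.5, [0.5, 3.0]),
    "follow": (4.0, [4.0, 3.0]),
}
regrets = {name: regret(r, alts) for name, (r, alts) in interactions.items()}
# Rank interactions from highest to lowest regret.
ranked = sorted(regrets, key=regrets.get, reverse=True)
```

Here "follow" has zero regret, so any misprediction in that interaction did not cost the robot anything, while "crossing" would be flagged first as a system-level failure.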
RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective IROS 2024
Precise robot manipulations require rich spatial information in imitation learning. Image-based policies model object positions from fixed cameras, which are sensitive to camera view changes. Policies utilizing 3D point clouds usually predict keyframes rather than continuous actions, posing difficulty in dynamic and contact-rich scenarios. To utilize 3D perception efficiently, we present RISE, an end-to-end baseline for real-world imitation learning, which predicts continuous actions directly from single-view point clouds. It compresses the point cloud to tokens with a sparse 3D encoder. After adding sparse positional encoding, the tokens are featurized using a transformer. Finally, the features are decoded into robot actions by a diffusion head. Trained with 50 demonstrations for each real-world task, RISE surpasses currently representative 2D and 3D policies by a large margin, showcasing significant advantages in both accuracy and efficiency. Experiments also demonstrate that RISE is more general and robust to environmental change compared with previous baselines. Project website: rise-policy.github.io.
comment: IROS 2024
FC-Planner: A Skeleton-guided Planning Framework for Fast Aerial Coverage of Complex 3D Scenes ICRA2024
3D coverage path planning for UAVs is a crucial problem in diverse practical applications. However, existing methods have shown unsatisfactory system simplicity, computation efficiency, and path quality in large and complex scenes. To address these challenges, we propose FC-Planner, a skeleton-guided planning framework that can achieve fast aerial coverage of complex 3D scenes without pre-processing. We decompose the scene into several simple subspaces by a skeleton-based space decomposition (SSD). Additionally, the skeleton guides us to effortlessly determine free space. We utilize the skeleton to efficiently generate a minimal set of specialized and informative viewpoints for complete coverage. Based on SSD, a hierarchical planner effectively divides the large planning problem into independent sub-problems, enabling parallel planning for each subspace. The carefully designed global and local planning strategies are then incorporated to guarantee both high quality and efficiency in path generation. We conduct extensive benchmark and real-world tests, where FC-Planner computes over 10 times faster than state-of-the-art methods, with shorter paths and more complete coverage. The source code will be made publicly available to benefit the community. Project page: https://hkust-aerial-robotics.github.io/FC-Planner.
comment: Accepted to ICRA2024. Code: https://github.com/HKUST-Aerial-Robotics/FC-Planner. Video: https://www.bilibili.com/video/BV1h84y1D7u5/?spm_id_from=333.999.0.0&vd_source=0af61c122e5e37c944053b57e313025a. Project page: https://hkust-aerial-robotics.github.io/FC-Planner
Enhancing Sliding Performance with Aerial Robots: Analysis and Solutions for Non-Actuated Multi-Wheel Configurations
Sliding tasks performed by aerial robots are valuable for inspection and simple maintenance tasks at height, such as non-destructive testing and painting. Although various end-effector designs have been used for such tasks, non-actuated wheel configurations are more frequently applied thanks to their rolling capability for sliding motion, mechanical simplicity, and lightweight design. Moreover, a non-actuated multi-wheel (more than one wheel) configuration in the end-effector design allows the placement of additional equipment e.g., sensors and tools in the center of the end-effector tip for applications. However, there is still a lack of studies on crucial contact conditions during sliding using aerial robots with such an end-effector design. In this article, we investigate the key challenges associated with sliding operations using aerial robots equipped with multiple non-actuated wheels through in-depth analysis grounded in physical experiments. The experimental data is used to create a simulator that closely captures real-world conditions. We propose solutions from both mechanical design and control perspectives to improve the sliding performance of aerial robots. From a mechanical standpoint, design guidelines are derived from experimental data. From a control perspective, we introduce a novel pressure-sensing-based control framework that ensures reliable task execution, even during sliding maneuvers. The effectiveness and robustness of the proposed approaches are then validated and compared using the built simulator, particularly in high-risk scenarios.
TiCoSS: Tightening the Coupling between Semantic Segmentation and Stereo Matching within A Joint Learning Framework
Semantic segmentation and stereo matching, respectively analogous to the ventral and dorsal streams in our human brain, are two key components of autonomous driving perception systems. Addressing these two tasks with separate networks is no longer the mainstream direction in developing computer vision algorithms, particularly with the recent advances in large vision models and embodied artificial intelligence. The trend is shifting towards combining them within a joint learning framework, especially emphasizing feature sharing between the two tasks. The major contributions of this study lie in comprehensively tightening the coupling between semantic segmentation and stereo matching. Specifically, this study introduces three novelties: (1) a tightly coupled, gated feature fusion strategy, (2) a hierarchical deep supervision strategy, and (3) a coupling tightening loss function. The combined use of these technical contributions results in TiCoSS, a state-of-the-art joint learning framework that simultaneously tackles semantic segmentation and stereo matching. Through extensive experiments on the KITTI and vKITTI2 datasets, along with qualitative and quantitative analyses, we validate the effectiveness of our developed strategies and loss function, and demonstrate its superior performance compared to prior arts, with a notable increase in mIoU by over 9%. Our source code will be publicly available at mias.group/TiCoSS upon publication.
Valeo4Cast: A Modular Approach to End-to-End Forecasting ECCV
Motion forecasting is crucial in autonomous driving systems to anticipate the future trajectories of surrounding agents such as pedestrians, vehicles, and traffic signals. In end-to-end forecasting, the model must jointly detect and track from sensor data (cameras or LiDARs) the past trajectories of the different elements of the scene and predict their future locations. We depart from the current trend of tackling this task via end-to-end training from perception to forecasting, and instead use a modular approach. We individually build and train detection, tracking and forecasting modules. We then only use consecutive finetuning steps to integrate the modules better and alleviate compounding errors. We conduct an in-depth study on the finetuning strategies and it reveals that our simple yet effective approach significantly improves performance on the end-to-end forecasting benchmark. Consequently, our solution ranks first in the Argoverse 2 End-to-end Forecasting Challenge, with 63.82 mAPf. We surpass forecasting results by +17.1 points over last year's winner and by +13.3 points over this year's runner-up. This remarkable performance in forecasting can be explained by our modular paradigm, which integrates finetuning strategies and significantly outperforms the end-to-end-trained counterparts.
comment: Winning solution of the Argoverse 2 "Unified Detection, Tracking, and Forecasting" challenge; work accepted at Road++ ECCVW 2024
LiDAR-based 4D Occupancy Completion and Forecasting IROS 2024
Scene completion and forecasting are two popular perception problems in research for mobile agents like autonomous vehicles. Existing approaches treat the two problems in isolation, resulting in a separate perception of the two aspects. In this paper, we introduce a novel LiDAR perception task of Occupancy Completion and Forecasting (OCF) in the context of autonomous driving to unify these aspects into a cohesive framework. This task requires new algorithms to address three challenges altogether: (1) sparse-to-dense reconstruction, (2) partial-to-complete hallucination, and (3) 3D-to-4D prediction. To enable supervision and evaluation, we curate a large-scale dataset termed OCFBench from public autonomous driving datasets. We analyze the performance of closely related existing baseline models and our own ones on our dataset. We envision that this research will inspire and call for further investigation in this evolving and crucial area of 4D perception. Our code for data curation and baseline implementation is available at https://github.com/ai4ce/Occ4cast.
comment: IROS 2024
Efficient Visuo-Haptic Object Shape Completion for Robot Manipulation
For robot manipulation, a complete and accurate object shape is desirable. Here, we present a method that combines visual and haptic reconstruction in a closed-loop pipeline. From an initial viewpoint, the object shape is reconstructed using an implicit surface deep neural network. The location with highest uncertainty is selected for haptic exploration, the object is touched, the new information from touch and a new point cloud from the camera are added, the object position is re-estimated, and the cycle is repeated. We extend Rustler et al. (2022) by using a new theoretically grounded method to determine the points with highest uncertainty, and we increase the yield of every haptic exploration by adding not only the contact points to the point cloud but also incorporating the empty space established through the robot movement to the object. Additionally, the solution is compact in that the jaws of a closed two-finger gripper are directly used for exploration. The object position is re-estimated after every robot action and multiple objects can be present simultaneously on the table. We achieve a steady improvement with every touch using three different metrics and demonstrate the utility of the better shape reconstruction in grasping experiments on the real robot. On average, grasp success rate increases from 63.3% to 70.4% after a single exploratory touch and to 82.7% after five touches. The collected data and code are publicly available (https://osf.io/j6rkd/, https://github.com/ctu-vras/vishac).
Multimodal Active Measurement for Human Mesh Recovery in Close Proximity
For physical human-robot interactions (pHRI), a robot needs to estimate the accurate body pose of a target person. However, in these pHRI scenarios, the robot cannot fully observe the target person's body with equipped cameras because the target person must be close to the robot for physical interaction. This close distance leads to severe truncation and occlusions and thus results in poor accuracy of human pose estimation. For better accuracy in this challenging environment, we propose an active measurement and sensor fusion framework of the equipped cameras with touch and ranging sensors such as 2D LiDAR. Touch and ranging sensor measurements are sparse but reliable and informative cues for localizing human body parts. In our active measurement process, camera viewpoints and sensor placements are dynamically optimized to measure body parts with higher estimation uncertainty, which is closely related to truncation or occlusion. In our sensor fusion process, assuming that the measurements of touch and ranging sensors are more reliable than the camera-based estimations, we fuse the sensor measurements to the camera-based estimated pose by aligning the estimated pose towards the measured points. Our proposed method outperformed previous methods on the standard occlusion benchmark with simulated active measurement. Furthermore, our method reliably estimated human poses using a real robot, even with practical constraints such as occlusion by blankets.
comment: Accepted at Robotics and Automation Letters (RA-L)
Pseudo-rigid body networks: learning interpretable deformable object dynamics from partial observations
Accurately predicting deformable linear object (DLO) dynamics is challenging, especially when the task requires a model that is both human-interpretable and computationally efficient. In this work, we draw inspiration from the pseudo-rigid body method (PRB) and model a DLO as a serial chain of rigid bodies whose internal state is unrolled through time by a dynamics network. This dynamics network is trained jointly with a physics-informed encoder that maps observed motion variables to the DLO's hidden state. To encourage the state to acquire a physically meaningful representation, we leverage the forward kinematics of the PRB model as a decoder. We demonstrate in robot experiments that the proposed DLO dynamics model provides physically interpretable predictions from partial observations while being on par with black-box models regarding prediction accuracy. The project code is available at: http://tinyurl.com/prb-networks
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Imitation Learning-Based Online Time-Optimal Control with Multiple-Waypoint Constraints for Quadrotors
Over the past decade, there has been a remarkable surge in utilizing quadrotors for various purposes, such as search and rescue, delivery, and autonomous drone racing, due to their simple structure and aggressive maneuverability. One of the key challenges preventing quadrotors from being widely used in these scenarios is online waypoint-constrained time-optimal trajectory generation and control. This letter proposes an imitation learning-based online solution to efficiently navigate the quadrotor through multiple waypoints with time-optimal performance. The neural networks (WN&CNets) are trained to learn the control law from the dataset generated by the time-consuming CPC algorithm and then deployed to generate the optimal control commands online to guide the quadrotors. To address the challenge of limited training data and the hover maneuver at the final waypoint, we propose a transition phase strategy that utilizes MINCO trajectories to help the quadrotor 'jump over' the stop-and-go maneuver when switching waypoints. Our method is demonstrated in both simulation and real-world experiments, achieving a maximum speed of 5.6 m/s while navigating through 7 waypoints in a confined space of 5.5 m × 5.5 m × 2.0 m. The results show that with a slight loss in optimality, the WN&CNets significantly reduce the processing time and enable online optimal control for multiple-waypoint constrained flight tasks.
ViSaRL: Visual Reinforcement Learning Guided by Human Saliency
Training robots to perform complex control tasks from high-dimensional pixel input using reinforcement learning (RL) is sample-inefficient, because image observations consist primarily of task-irrelevant information. By contrast, humans are able to visually attend to task-relevant objects and areas. Based on this insight, we introduce Visual Saliency-Guided Reinforcement Learning (ViSaRL). Using ViSaRL to learn visual representations significantly improves the success rate, sample efficiency, and generalization of an RL agent on diverse tasks, including the DeepMind Control benchmark and robot manipulation both in simulation and on a real robot. We present approaches for incorporating saliency into both CNN and Transformer-based encoders. We show that visual representations learned using ViSaRL are robust to various sources of visual perturbations, including perceptual noise and scene variations. ViSaRL nearly doubles the success rate on the real-robot tasks compared to the baseline, which does not use saliency.
Greedy Perspectives: Multi-Drone View Planning for Collaborative Perception in Cluttered Environments IROS'24
Deployment of teams of aerial robots could enable large-scale filming of dynamic groups of people (actors) in complex environments for applications in areas such as team sports and cinematography. Toward this end, methods for submodular maximization via sequential greedy planning can enable scalable optimization of camera views across teams of robots but face challenges with efficient coordination in cluttered environments. Obstacles can produce occlusions and increase chances of inter-robot collision which can violate requirements for near-optimality guarantees. To coordinate teams of aerial robots in filming groups of people in dense environments, a more general view-planning approach is required. We explore how collision and occlusion impact performance in filming applications through the development of a multi-robot multi-actor view planner with an occlusion-aware objective for filming groups of people and compare with a formation planner and a greedy planner that ignores inter-robot collisions. We evaluate our approach based on five test environments and complex multi-actor behaviors. Compared with a formation planner, our sequential planner generates 14% greater view reward for filming the actors in three scenarios and comparable performance to formation planning on two others. We also observe near identical view rewards for sequential planning both with and without inter-robot collision constraints which indicates that robots are able to avoid collisions without impairing performance in the perception task. Overall, we demonstrate effective coordination of teams of aerial robots in environments cluttered with obstacles that may cause collisions or occlusions and for filming groups that may split, merge, or spread apart.
comment: IROS'24; 8 pages, 8 figures, 2 tables
GET-Zero: Graph Embodiment Transformer for Zero-shot Embodiment Generalization
This paper introduces GET-Zero, a model architecture and training procedure for learning an embodiment-aware control policy that can immediately adapt to new hardware changes without retraining. To do so, we present Graph Embodiment Transformer (GET), a transformer model that leverages the embodiment graph connectivity as a learned structural bias in the attention mechanism. We use behavior cloning to distill demonstration data from embodiment-specific expert policies into an embodiment-aware GET model that conditions on the hardware configuration of the robot to make control decisions. We conduct a case study on a dexterous in-hand object rotation task using different configurations of a four-fingered robot hand with joints removed and with link length extensions. Using the GET model along with a self-modeling loss enables GET-Zero to zero-shot generalize to unseen variation in graph structure and link length, yielding a 20% improvement over baseline methods. All code and qualitative video results are on https://get-zero-paper.github.io
comment: 8 pages, 5 figures, 3 tables, website https://get-zero-paper.github.io
MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization
This letter introduces a novel framework for dense Visual Simultaneous Localization and Mapping (VSLAM) based on Gaussian Splatting. Recently, SLAM based on Gaussian Splatting has shown promising results. However, in monocular scenarios, the reconstructed Gaussian maps lack geometric accuracy and exhibit weaker tracking capability. To address these limitations, we jointly optimize sparse visual odometry tracking and 3D Gaussian Splatting scene representation for the first time. We obtain depth maps on visual odometry keyframe windows using a fast Multi-View Stereo (MVS) network for the geometric supervision of Gaussian maps. Furthermore, we propose a depth smooth loss and a Sparse-Dense Adjustment Ring (SDAR) to reduce the negative effect of estimated depth maps and preserve the consistency in scale between the visual odometry and Gaussian maps. We have evaluated our system across various synthetic and real-world datasets. The accuracy of our pose estimation surpasses that of existing methods and achieves state-of-the-art performance. Additionally, it outperforms previous monocular methods in terms of novel view synthesis and geometric reconstruction fidelity.
comment: Accepted by IEEE Robotics and Automation Letters
Multi-Agent Clarity-Aware Dynamic Coverage with Gaussian Processes
This paper presents two algorithms for multi-agent dynamic coverage in spatiotemporal environments, where the coverage algorithms are informed by the method of data assimilation. In particular, we show that by explicitly modeling the environment using a Gaussian Process (GP) model, and considering the sensing capabilities and the dynamics of a team of robots, we can design an estimation algorithm and multi-agent coverage controller that explores and estimates the state of the spatiotemporal environment. The uncertainty of the estimate is quantified using clarity, an information-theoretic metric, where higher clarity corresponds to lower uncertainty. By exploiting the relationship between GPs and Stochastic Differential Equations (SDEs) we quantify the increase in clarity of the estimated state at any position due to a measurement taken from any other position. We use this relationship to design two new coverage controllers, both of which scale well with the number of agents exploring the domain, assuming the robots can share the map of the clarity over the spatial domain via communication. We demonstrate the algorithms through a realistic simulation of a team of robots collecting wind data over a region in Austria.
comment: 8 pages, 2 figures, accepted in IEEE CDC 2024
Systems and Control (CS)
Cooptimizing Safety and Performance with a Control-Constrained Formulation
Autonomous systems have witnessed a rapid increase in their capabilities, but it remains a challenge for them to perform tasks both effectively and safely. The fact that performance and safety can sometimes be competing objectives renders the cooptimization between them difficult. One school of thought is to treat this cooptimization as a constrained optimal control problem with a performance-oriented objective function and safety as a constraint. However, solving this constrained optimal control problem for general nonlinear systems remains challenging. In this work, we use the general framework of constrained optimal control, but given the safety state constraint, we convert it into an equivalent control constraint, resulting in a state and time-dependent control-constrained optimal control problem. This equivalent optimal control problem can readily be solved using the dynamic programming principle. We show the corresponding value function is a viscosity solution of a certain Hamilton-Jacobi-Bellman Partial Differential Equation (HJB-PDE). Furthermore, we demonstrate the effectiveness of our method with a two-dimensional case study, and the experiment shows that the controller synthesized using our method consistently outperforms the baselines, both in safety and performance.
comment: Submitted to ACC with L-CSS option
Bayesian hypergame approach to equilibrium stability and robustness in moving target defense
We investigate the equilibrium stability and robustness in a class of moving target defense problems, in which players have both incomplete information and asymmetric cognition. We first establish a Bayesian Stackelberg game model for incomplete information and then employ a hypergame reformulation to address asymmetric cognition. With the core concept of the hyper Bayesian Nash equilibrium (HBNE), a condition for achieving both the strategic and cognitive stability in equilibria can be realized by solving linear equations. Moreover, to deal with players' underlying perturbed knowledge, we study the equilibrium robustness by presenting a condition of robust HBNE under the given configuration. Experiments evaluate our theoretical results.
An Ontology-based Approach Towards Traceable Behavior Specifications in Automated Driving
Vehicles in public traffic that are equipped with Automated Driving Systems are subject to a number of expectations: among other aspects, their behavior should be safe, conform to the rules of the road, and provide mobility to their users. This poses challenges for the developers of such systems: developers are responsible for specifying this behavior, for example, in terms of requirements at system design time. As we will discuss in the article, this specification always involves the need for assumptions and trade-offs. As a result, insufficiencies in such a behavior specification can occur that can potentially lead to unsafe system behavior. In order to support the identification of specification insufficiencies, requirements and respective assumptions need to be made explicit. In this article, we propose the Semantic Norm Behavior Analysis as an ontology-based approach to specify the behavior for an Automated Driving System equipped vehicle. We use ontologies to formally represent specified behavior for a targeted operational environment, and to establish traceability between specified behavior and the addressed stakeholder needs. Furthermore, we illustrate the application of the Semantic Norm Behavior Analysis in two example scenarios and evaluate our results.
comment: 22 pages, 12 figures, submitted for publication
Self-calibrated Microring Weight Function for Neuromorphic Optical Computing
This paper presents a microring resonator-based weight function for neuromorphic photonic applications achieving a record-high precision of 11.3 bits and accuracy of 9.3 bits for 2 Gbps input optical signals. The system employs an all-analog self-referenced proportional-integral-derivative (PID) controller to perform real-time temperature stabilization within a range of up to 60 degrees Celsius. A self-calibrated weight function is demonstrated for a range of 6 degrees Celsius with a single initial calibration and minimal accuracy and precision degradation. By monitoring the through and drop ports of the microring with variable gain transimpedance amplifiers, accurate and precise weight adjustment is achieved, ensuring optimal performance and reliability. These findings underscore the system's robustness to dynamic thermal environments, highlighting the potential for high-speed reconfigurable analog photonic networks.
On Epistemic Properties in Discrete-Event Systems: A Uniform Framework and Its Applications
In this paper, we investigate the property verification problem for partially-observed discrete-event systems (DES) from a new perspective. Specifically, we consider the problem setting where the system is observed by two agents independently, each with its own observation. The purpose of the first agent, referred to as the low-level observer, is to infer the actual behavior of the system, while the second, referred to as the high-level observer, aims to infer the knowledge of the low-level observer regarding the system. We present a general notion called the epistemic property capturing the inference from the high-level observer to the low-level observer. A typical instance of this definition is the notion of high-order opacity, which specifies that the intruder does not know that the system knows some critical information. This formalization is very general and supports any user-defined information-state-based knowledge between the two observers. We demonstrate how the general definition of epistemic properties can be applied in different problem settings such as information leakage diagnosis or tactical cooperation without explicit communications. Finally, we provide a systematic approach for the verification of epistemic properties. In particular, we identify some fragments of epistemic properties that can be verified more efficiently.
Autoencoder-Based and Physically Motivated Koopman Lifted States for Wind Farm MPC: A Comparative Case Study
This paper explores the use of Autoencoder (AE) models to identify Koopman-based linear representations for designing model predictive control (MPC) for wind farms. Wake interactions in wind farms are challenging to model and have previously been addressed with Koopman lifted states. In this study, we investigate the performance of two AE models. The first AE model estimates the wind speeds acting on the turbines, which are affected by changes in turbine control inputs. The wind speeds estimated by this AE model are then used in a second step to calculate the power output via a simple turbine model based on physical equations. The second AE model directly estimates the wind farm output, i.e., both turbine and wake dynamics are modeled. The primary inquiry of this study addresses whether either of these two AE-based models can surpass previously identified Koopman models based on physically motivated lifted states. We find that the first AE model, which estimates the wind speed and hence includes the wake dynamics but excludes the turbine dynamics, outperforms the existing physically motivated Koopman model. However, the second AE model, which estimates the farm power directly, underperforms when the turbines' underlying physical assumptions are correct. We additionally investigate specific conditions under which the second, purely data-driven AE model can excel, notably when modeling assumptions, such as the wind turbine power coefficient, are erroneous and remain unchecked within the MPC controller. In such cases, the data-driven AE models, when updated with recent data reflecting changed system dynamics, can outperform physics-based models operating under outdated assumptions.
comment: Accepted for Conference on Decision and Control 2024
Superior Computer Chess with Model Predictive Control, Reinforcement Learning, and Rollout
In this paper, we apply model predictive control (MPC), rollout, and reinforcement learning (RL) methodologies to computer chess. We introduce a new architecture for move selection, within which available chess engines are used as components. One engine is used to provide position evaluations in an approximation in value space MPC/RL scheme, while a second engine is used as a nominal opponent, to emulate or approximate the moves of the true opponent player. We show that our architecture substantially improves the performance of the position evaluation engine. In other words, our architecture provides an additional layer of intelligence, on top of the intelligence of the engines on which it is based. This is true for any engine, regardless of its strength: top engines such as Stockfish and Komodo Dragon (of varying strengths), as well as weaker engines. Structurally, our basic architecture selects moves by a one-move lookahead search, with an intermediate move generated by a nominal opponent engine, followed by a position evaluation by another chess engine. Simpler schemes that forgo the use of the nominal opponent also perform better than the position evaluator, but not by quite as much. More complex schemes, involving multistep lookahead, may also be used and generally tend to perform better as the length of the lookahead increases. Theoretically, our methodology relies on generic cost improvement properties and the superlinear convergence framework of Newton's method, which fundamentally underlies approximation in value space, and related MPC/RL and rollout/policy iteration schemes. A critical requirement of this framework is that the first lookahead step should be executed exactly. This fact has guided our architectural choices, and is apparently an important factor in improving the performance of even the best available chess engines.
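The basic move-selection scheme described in this abstract (one-move lookahead, a reply from a nominal opponent, then a position evaluation) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: `select_move` and its callable arguments (`legal_moves`, `apply_move`, `nominal_opponent`, `evaluate`) are hypothetical names, and a toy integer game stands in for chess and for the two engines.

```python
from typing import Callable, Iterable, Optional, TypeVar

State = TypeVar("State")
Move = TypeVar("Move")


def select_move(
    state: State,
    legal_moves: Callable[[State], Iterable[Move]],
    apply_move: Callable[[State, Move], State],
    nominal_opponent: Callable[[State], Optional[Move]],
    evaluate: Callable[[State], float],
) -> Optional[Move]:
    """One-move lookahead with a nominal opponent reply (hypothetical API)."""
    best_move: Optional[Move] = None
    best_value = float("-inf")
    for move in legal_moves(state):
        # Exact first lookahead step: apply our candidate move.
        after_ours = apply_move(state, move)
        # Intermediate move: ask the nominal opponent for its reply.
        reply = nominal_opponent(after_ours)
        after_reply = after_ours if reply is None else apply_move(after_ours, reply)
        # Terminal approximation: score the resulting position.
        value = evaluate(after_reply)
        if value > best_value:
            best_move, best_value = move, value
    return best_move


# Toy stand-ins (assumptions for illustration only): the state is an integer,
# we add 1, 2, or 3, the "opponent" subtracts 2 on even states and 1 on odd
# states, and the evaluator simply prefers larger integers.
def toy_legal_moves(state: int) -> list:
    return [1, 2, 3]


def toy_apply(state: int, move: int) -> int:
    return state + move


def toy_opponent(state: int) -> int:
    return -2 if state % 2 == 0 else -1


def toy_evaluate(state: int) -> float:
    return float(state)


best = select_move(0, toy_legal_moves, toy_apply, toy_opponent, toy_evaluate)
print(best)
```

In a chess setting, the two callables `nominal_opponent` and `evaluate` would wrap two engines; dropping the opponent call gives the simpler scheme the abstract also mentions.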
A Short Information-Theoretic Analysis of Linear Auto-Regressive Learning
In this note, we give a short information-theoretic proof of the consistency of the Gaussian maximum likelihood estimator in linear auto-regressive models. Our proof yields nearly optimal non-asymptotic rates for parameter recovery and works without any invocation of stability in the case of finite hypothesis classes.
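To make the estimator concrete: for a linear auto-regressive model x_{t+1} = a x_t + w_t with i.i.d. Gaussian noise of fixed variance, maximizing the likelihood over a finite hypothesis class amounts to choosing the candidate with the smallest sum of squared one-step prediction errors. The sketch below assumes a scalar model and a noise variance shared by all candidates; the function names are hypothetical and the example data are noise-free for simplicity.

```python
from typing import Sequence


def squared_prediction_error(xs: Sequence[float], a: float) -> float:
    """Sum of squared one-step errors of the model x[t+1] = a * x[t]."""
    return sum((xs[t + 1] - a * xs[t]) ** 2 for t in range(len(xs) - 1))


def gaussian_mle_finite_class(xs: Sequence[float], candidates: Sequence[float]) -> float:
    """Gaussian MLE over a finite set of candidate AR(1) coefficients.

    With a fixed, shared noise variance, the log-likelihood is a decreasing
    function of the squared prediction error, so the MLE is its minimizer.
    """
    if len(xs) < 2:
        raise ValueError("need at least two samples to form one transition")
    return min(candidates, key=lambda a: squared_prediction_error(xs, a))


# Noise-free trajectory generated with a = 0.5 (illustrative data only).
trajectory = [1.0]
for _ in range(10):
    trajectory.append(0.5 * trajectory[-1])

estimate = gaussian_mle_finite_class(trajectory, [0.1, 0.5, 0.9])
print(estimate)
```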
Autonomous Iterative Motion Learning (AI-MOLE) of a SCARA Robot for Automated Myocardial Injection
Stem cell therapy is a promising approach to treat heart insufficiency and benefits from automated myocardial injection which requires highly precise motion of a robotic manipulator that is equipped with a syringe. This work investigates whether sufficiently precise motion can be achieved by combining a SCARA robot and learning control methods. For this purpose, the method Autonomous Iterative Motion Learning (AI-MOLE) is extended to be applicable to multi-input/multi-output systems. The proposed learning method solves reference tracking tasks in systems with unknown, nonlinear, multi-input/multi-output dynamics by iteratively updating an input trajectory in a plug-and-play fashion and without requiring manual parameter tuning. The proposed learning method is validated in a preliminary simulation study of a simplified SCARA robot that has to perform three desired motions. The results demonstrate that the proposed learning method achieves highly precise reference tracking without requiring any a priori model information or manual parameter tuning in as little as 15 trials per motion. The results further indicate that the combination of a SCARA robot and learning method achieves sufficiently precise motion to potentially enable automatic myocardial injection if similar results can be obtained in a real-world setting.
comment: 6 pages, 4 figures
Analysis of a Simple Neuromorphic Controller for Linear Systems: A Hybrid Systems Perspective
In this paper we analyze a neuromorphic controller, inspired by the leaky integrate-and-fire neuronal model, in closed-loop with a single-input single-output linear time-invariant system. The controller consists of two neuron-like variables and generates a spiking control input whenever one of these variables reaches a threshold. The control input is different from zero only at the spiking instants and, hence, between two spiking times the system evolves in open-loop. Exploiting the hybrid nature of the integrate-and-fire neuronal dynamics, we present a hybrid modeling framework to design and analyze this new controller. In the particular case of single-state linear time-invariant plants, we prove a practical stability property for the closed-loop system, we ensure the existence of a strictly positive dwell-time between spikes, and we relate these properties to the parameters in the neurons. The results are illustrated in a numerical example.
Uncovering the inherited vulnerability of electric distribution networks
Research on the vulnerability of electric networks with a complex network approach has produced significant results in the last decade, especially for transmission networks. These studies have shown that there are causal relations between certain structural properties of networks and their vulnerabilities, leading to an inherent weakness. The purpose of the present work was twofold: to test the hypotheses already examined on evolving transmission networks and to gain a deeper understanding of the nature of these inherent weaknesses. For this, historical models of a medium-voltage distribution network supply area were reconstructed and analysed. Topological efficiency of the networks was calculated against node and edge removals of different proportions. We found that the tolerance of the evolving grid remained practically unchanged during the examined period, implying that the increase in size is dominantly caused by the connection of geographically and spatially constrained supply areas and not by an evolutionary process. We also show that probability density functions of centrality metrics, typically connected to vulnerability, show only minor variation during the early evolution of the examined distribution network, and in many cases resemble those of the present-day network.
comment: 22 pages, 10 figures
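As an illustration of the kind of measurement the distribution-network abstract above describes, the sketch below computes a graph's efficiency before and after removing nodes. It assumes that "topological efficiency" means the standard global efficiency (the mean of inverse shortest-path lengths over ordered node pairs, with disconnected pairs contributing zero) and that the value is normalized by the nodes remaining after removal; the paper may define or normalize it differently. All function names are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Set

Graph = Dict[Hashable, Set[Hashable]]


def bfs_distances(graph: Graph, source: Hashable) -> Dict[Hashable, int]:
    """Hop distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(graph: Graph) -> float:
    """Mean of 1/d(i, j) over ordered pairs; unreachable pairs count as 0."""
    n = len(graph)
    if n < 2:
        return 0.0
    total = 0.0
    for s in graph:
        for t, d in bfs_distances(graph, s).items():
            if t != s:
                total += 1.0 / d
    return total / (n * (n - 1))


def remove_nodes(graph: Graph, removed: Iterable[Hashable]) -> Graph:
    """Copy of the graph without the given nodes and their incident edges."""
    gone = set(removed)
    return {u: {v for v in nbrs if v not in gone}
            for u, nbrs in graph.items() if u not in gone}


# Three-node path 0 - 1 - 2 (illustrative data only).
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(global_efficiency(path))                     # intact network
print(global_efficiency(remove_nodes(path, [1])))  # after removing the hub
```

Repeating the removal for increasing proportions of nodes or edges, at random or in order of a centrality metric, gives a tolerance curve of the sort the abstract refers to.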
The Lynchpin of In-Memory Computing: A Benchmarking Framework for Vector-Matrix Multiplication in RRAMs
The Von Neumann bottleneck, a fundamental challenge in conventional computer architecture, arises from the inability to execute fetch and data operations simultaneously due to a shared bus linking processing and memory units. This bottleneck significantly limits system performance, increases energy consumption, and exacerbates computational complexity. Emerging technologies such as Resistive Random Access Memories (RRAMs), leveraging crossbar arrays, offer promising alternatives for addressing the demands of data-intensive computational tasks through in-memory computing of analog vector-matrix multiplication (VMM) operations. However, the propagation of errors due to device and circuit-level imperfections remains a significant challenge. In this study, we introduce MELISO (In-Memory Linear Solver), a comprehensive end-to-end VMM benchmarking framework tailored for RRAM-based systems. MELISO evaluates the error propagation in VMM operations, analyzing the impact of RRAM device metrics on error magnitude and distribution. This paper introduces the MELISO framework and demonstrates its utility in characterizing and mitigating VMM error propagation using state-of-the-art RRAM device metrics.
comment: ICONS 2024. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists or reuse of any copyrighted component of this work in other works
APEX: Attention on Personality based Emotion ReXgnition Framework
Automated emotion recognition has applications in various fields, such as human-machine interaction, healthcare, security, education, and emotion-aware recommendation/feedback systems. Developing methods to analyze human emotions accurately is essential to enable such diverse applications. Multiple studies have been conducted to explore the possibility of using physiological signals and machine-learning techniques to evaluate human emotions. Furthermore, internal factors such as personality have been considered and involved in emotion recognition. However, integrating user-specific personality into traditional machine-learning methods that use user-agnostic large data sets has become a critical problem. This study proposes APEX, an attention on personality-based emotion recognition framework, in which multiple weak classifiers are trained on the physiological signals of each participant's data, and the classification results are reweighted based on the personality correlations between corresponding subjects and test subjects. Experiments have been conducted on the ASCERTAIN dataset, and the results show that the proposed framework outperforms existing studies.
Real-Time Ground Fault Detection for Inverter-Based Microgrid Systems
Ground fault detection in inverter-based microgrid (IBM) systems is challenging, particularly in a real-time setting, as the fault current deviates slightly from the nominal value. This difficulty is reinforced when there are partially decoupled disturbances and modeling uncertainties. The conventional solution of installing more relays to obtain additional measurements is costly and also increases the complexity of the system. In this paper, we propose a data-assisted diagnosis scheme based on an optimization-based fault detection filter with the output current as the only measurement. Modeling the microgrid dynamics and the diagnosis filter, we formulate the filter design as a quadratic programming (QP) problem that accounts for decoupling partial disturbances, robustness to non-decoupled disturbances and modeling uncertainties by training with data, and ensuring fault sensitivity simultaneously. To ease the computational effort, we also provide an approximate but analytical solution to this QP. Additionally, we use classical statistical results to provide a thresholding mechanism that enjoys probabilistic false-alarm guarantees. Finally, we implement the IBM system with Simulink and Real Time Digital Simulator (RTDS) to verify the effectiveness of the proposed method through simulations.
comment: 18 pages, 9 figures
Upstream Allocation of Bidirectional Load Demand by Power Packetization
The power packet dispatching system has been studied for power management with a strict tie to an accompanying information system through power packetization. In the system, integrated units of transfer of power and information, called power packets, are delivered through a network of apparatuses called power packet routers. This paper proposes upstream allocation of a bidirectional load demand represented by a sequence of power packets to power sources. We first develop a scheme of power packet routing for upstream allocation of load demand with full integration of power and information transfer. The routing scheme is then proved to enable packetized management of bidirectional load demand, which is of practical importance for applicability to, e.g., electric drives in motoring and regenerating operations. We present a way of packetizing the bidirectional load demand and realizing the power and information flow under the upstream allocation scheme. The viability of the proposed methods is demonstrated through experiments.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Design of CANSAT for Air Quality Monitoring for an altitude of 900 meters
This paper presents the design and development of NAMBI-VJ, a CANSAT specifically designed for air quality monitoring and stabilization. The CANSAT's cylindrical structure, measuring 310mm in height and 125mm in diameter, is equipped with a mechanical gyroscope for stabilization and a spill-hole parachute for controlled descent. The primary objective of this research is to create a compact, lightweight satellite capable of monitoring air quality parameters such as particulate matter (PM), carbon dioxide (CO2), longitude, and latitude. To achieve this, the CANSAT utilizes Zigbee communication to transmit data to a ground station. Experimental testing involved dropping the CANSAT from an altitude of 900 meters using a drone. The results demonstrate the CANSAT's ability to successfully gather and transmit air quality data, highlighting its potential for environmental monitoring applications.
comment: 6 pages, 6 figures
State-of-the-art review and synthesis: A requirement-based roadmap for standardized predictive maintenance automation using digital twin technologies
Recent digital advances have popularized predictive maintenance (PMx), offering enhanced efficiency, automation, accuracy, cost savings, and independence in maintenance processes. Yet, PMx continues to face numerous limitations such as poor explainability, sample inefficiency of data-driven methods, complexity of physics-based methods, and limited generalizability and scalability of knowledge-based methods. This paper proposes leveraging Digital Twins (DTs) to address these challenges and enable automated PMx adoption on a larger scale. While DTs have the potential to be transformative, they have not yet reached the maturity needed to bridge these gaps in a standardized manner. Without a standard definition guiding this evolution, the transformation lacks a solid foundation for development. This paper provides a requirement-based roadmap to support standardized PMx automation using DT technologies. Our systematic approach comprises two primary stages. First, we methodically identify the Informational Requirements (IRs) and Functional Requirements (FRs) for PMx, which serve as a foundation from which any unified framework must emerge. Our approach to defining and using IRs and FRs as the backbone of any PMx DT is supported by the proven success of these requirements as blueprints in other areas, such as product development in the software industry. Second, we conduct a thorough literature review across various fields to assess how these IRs and FRs are currently being applied within DTs, enabling us to identify specific areas where further research is needed to support the progress and maturation of requirement-based PMx DTs.
comment: This paper has been accepted for publication in Advanced Engineering Informatics (2024)
The Fragile Nature of Road Transportation Systems
Major cities worldwide experience problems with the performance of their road transportation systems, and the continuous increase in traffic demand presents a substantial challenge to the optimal operation of urban road networks and the efficiency of traffic control strategies. The operation of transportation systems is widely considered to display a fragile property, i.e., the loss in performance increases exponentially with the linearly increasing magnitude of disruptions. Meanwhile, the risk engineering community is embracing the novel concept of antifragility, enabling systems to learn from historical disruptions and exhibit improved performance under black swan events. In this study, based on established traffic models, namely fundamental diagrams and macroscopic fundamental diagrams (MFDs), we first conduct a rigorous mathematical analysis to prove the fragile nature of the systems theoretically. Subsequently, we propose a skewness-based indicator that can be readily applied to cross-compare the degree of fragility for different networks, depending solely on the MFD-related parameters. Finally, taking real-world stochasticity into account, we implement a numerical simulation with realistic network data to bridge the gap between the theoretical proof and real-world operations and to reflect the potential impact of uncertainty on the fragility of the systems. This work aims to demonstrate the fragile nature of road transportation systems and help researchers better comprehend the necessity of explicitly considering antifragile design for future traffic control strategies.
comment: 39 pages, 14 figures
Inverse Particle Filter
In cognitive systems, recent emphasis has been placed on studying the cognitive processes of the subject whose behavior was the primary focus of the system's cognitive response. This approach, known as inverse cognition, arises in counter-adversarial applications and has motivated the development of inverse Bayesian filters. In this context, a cognitive adversary, such as a radar, uses a forward Bayesian filter to track its target of interest. An inverse filter is then employed to infer the adversary's estimate of the target's or defender's state. Previous studies have addressed this inverse filtering problem by introducing methods like the inverse Kalman filter (I-KF), inverse extended KF (I-EKF), and inverse unscented KF (I-UKF). However, these filters typically assume additive Gaussian noise models and/or rely on local approximations of non-linear dynamics at the state estimates, limiting their practical application. In contrast, this paper adopts a global filtering approach and presents the development of an inverse particle filter (I-PF). The particle filter framework employs Monte Carlo (MC) methods to approximate arbitrary posterior distributions. Moreover, under mild system-level conditions, the proposed I-PF demonstrates convergence to the optimal inverse filter. Additionally, we propose the differentiable I-PF to address scenarios where system information is unknown to the defender. Using the recursive Cramer-Rao lower bound and non-credibility index (NCI), our numerical experiments for different systems demonstrate the estimation performance and time complexity of the proposed filter.
comment: 13 pages, 4 figures
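For readers unfamiliar with the Monte Carlo machinery the abstract refers to, the following is a minimal sketch of a forward bootstrap particle filter, not the inverse filter proposed in the paper. The scalar random-walk model, the noise levels, and the function name `bootstrap_particle_filter` are all illustrative assumptions.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500, seed=0):
    """Estimate a scalar state x_t = x_{t-1} + w_t from y_t = x_t + v_t.

    The model and noise levels are assumptions made for illustration only.
    """
    rng = random.Random(seed)
    process_std, obs_std = 0.5, 1.0  # assumed noise standard deviations
    particles = [rng.gauss(0.0, 2.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate every particle through the assumed transition model.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Weight each particle by the Gaussian likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Posterior-mean estimate from the weighted particle cloud.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Because the weighted particles approximate the posterior directly, nothing in the sketch requires Gaussian noise or a linearized model; swapping in another likelihood only changes the weighting step.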
Enhancing Sliding Performance with Aerial Robots: Analysis and Solutions for Non-Actuated Multi-Wheel Configurations
Sliding tasks performed by aerial robots are valuable for inspection and simple maintenance tasks at height, such as non-destructive testing and painting. Although various end-effector designs have been used for such tasks, non-actuated wheel configurations are more frequently applied thanks to their rolling capability for sliding motion, mechanical simplicity, and lightweight design. Moreover, a non-actuated multi-wheel (more than one wheel) configuration in the end-effector design allows additional equipment, e.g., sensors and tools, to be placed in the center of the end-effector tip. However, there is still a lack of studies on crucial contact conditions during sliding using aerial robots with such an end-effector design. In this article, we investigate the key challenges associated with sliding operations using aerial robots equipped with multiple non-actuated wheels through in-depth analysis grounded in physical experiments. The experimental data is used to create a simulator that closely captures real-world conditions. We propose solutions from both mechanical design and control perspectives to improve the sliding performance of aerial robots. From a mechanical standpoint, design guidelines are derived from experimental data. From a control perspective, we introduce a novel pressure-sensing-based control framework that ensures reliable task execution, even during sliding maneuvers. The effectiveness and robustness of the proposed approaches are then validated and compared using the built simulator, particularly in high-risk scenarios.
Adaptive Economic Model Predictive Control for linear systems with performance guarantees
We present a model predictive control (MPC) formulation to directly optimize economic criteria for linear constrained systems subject to disturbances and uncertain model parameters. The proposed formulation combines a certainty equivalent economic MPC with a simple least-squares parameter adaptation. For the resulting adaptive economic MPC scheme, we derive strong asymptotic and transient performance guarantees. We provide a numerical example involving building temperature control and demonstrate performance benefits of online parameter adaptation.
comment: Final version, IEEE Conference on Decision and Control (CDC), 2024
Joint State and Sparse Input Estimation in Linear Dynamical Systems
Sparsity constraints on the control inputs of a linear dynamical system naturally arise in several practical applications such as networked control, computer vision, seismic signal processing, and cyber-physical systems. In this work, we consider the problem of jointly estimating the states and sparse inputs of such systems from low-dimensional (compressive) measurements. Due to the low-dimensional measurements, conventional Kalman filtering and smoothing algorithms fail to accurately estimate the states and inputs. We present a Bayesian approach that exploits the input sparsity to significantly improve estimation accuracy. Sparsity in the input estimates is promoted by using different prior distributions on the input. We investigate two main approaches: regularizer-based MAP and Bayesian learning-based estimation. We also extend the approaches to handle control inputs with common support and analyze the time and memory complexities of the presented algorithms. Finally, using numerical simulations, we show that our algorithms outperform the state-of-the-art methods in terms of accuracy and time/memory complexities, especially in the low-dimensional measurement regime.
Probabilistic energy forecasting through quantile regression in reproducing kernel Hilbert spaces
Accurate energy demand forecasting is crucial for sustainable and resilient energy development. To meet the Net Zero Representative Concentration Pathways (RCP) 4.5 scenario in the DACH countries, increased renewable energy production, energy storage, and reduced commercial building consumption are needed. This scenario's success depends on hydroelectric capacity and climatic factors. Informed decisions require quantifying uncertainty in forecasts. This study explores a non-parametric method based on reproducing kernel Hilbert spaces (RKHS), known as kernel quantile regression, for energy prediction. Our experiments demonstrate its reliability and sharpness, and we benchmark it against state-of-the-art methods in load and price forecasting for the DACH region. We offer our implementation in conjunction with additional scripts to ensure the reproducibility of our research.
comment: 12 pages, Owner/Author | ACM 2024. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record will be published in https://energy.acm.org/eir
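Quantile regression in general is fitted by minimizing the pinball (quantile) loss, and kernel quantile regression does so over functions in an RKHS with a regularization term. The sketch below shows only that loss and the simplest possible model, a constant forecast; it is not the authors' implementation, and the function names are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss at level tau, with 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is penalized by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def best_constant_quantile(y_true, tau):
    """Return the sample value that minimizes the pinball loss as a constant forecast."""
    return min(y_true, key=lambda c: pinball_loss(y_true, [c] * len(y_true), tau))
```

The asymmetric penalty is what pushes the minimizer toward the tau-quantile of the data rather than its mean, which is why fitting several levels of tau yields a probabilistic forecast.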
Real-Time Machine-Learning-Based Optimization Using Input Convex Long Short-Term Memory Network
Neural network-based optimization and control methods, often referred to as black-box approaches, are increasingly gaining attention in energy and manufacturing systems, particularly in situations where first-principles models are either unavailable or inaccurate. However, their non-convex nature significantly slows down the optimization and control processes, limiting their application in real-time decision-making. To address this challenge, we propose a novel Input Convex Long Short-Term Memory (IC-LSTM) network to enhance the computational efficiency of neural network-based optimization. Through two case studies employing real-time neural network-based optimization for energy and chemical systems, we demonstrate the superior performance of IC-LSTM-based optimization in terms of runtime. Specifically, in a real-time optimization problem of a real-world solar photovoltaic energy system at LHT Holdings in Singapore, IC-LSTM-based optimization achieved at least a 4-fold speedup compared to conventional LSTM-based optimization. These results highlight the potential of IC-LSTM networks to significantly enhance the efficiency of neural network-based optimization and control in practical applications. Source code is available at https://github.com/killingbear999/ICLSTM.
comment: Applied Energy
Hybrid integrator-gain system based integral resonant controllers for negative imaginary systems
We introduce a hybrid control system called a hybrid integrator-gain system (HIGS) based integral resonant controller (IRC) to stabilize negative imaginary (NI) systems. A HIGS-based IRC has a similar structure to an IRC, with the integrator replaced by a HIGS. We show that a HIGS-based IRC is an NI system. For a SISO NI system with a minimal realization, we show that there exists a HIGS-based IRC such that their closed-loop interconnection is asymptotically stable. In addition, we propose a proportional-integral-double-integral resonant controller and a HIGS-based proportional-integral-double-integral resonant controller, and we show that both can be applied to asymptotically stabilize an NI system. An example is provided to illustrate the proposed results.
comment: 9 pages, 9 figures. The 63rd IEEE Conference on Decision and Control (CDC 2024)
Multi-Agent Clarity-Aware Dynamic Coverage with Gaussian Processes
This paper presents two algorithms for multi-agent dynamic coverage in spatiotemporal environments, where the coverage algorithms are informed by the method of data assimilation. In particular, we show that by explicitly modeling the environment using a Gaussian Process (GP) model, and considering the sensing capabilities and the dynamics of a team of robots, we can design an estimation algorithm and multi-agent coverage controller that explores and estimates the state of the spatiotemporal environment. The uncertainty of the estimate is quantified using clarity, an information-theoretic metric, where higher clarity corresponds to lower uncertainty. By exploiting the relationship between GPs and Stochastic Differential Equations (SDEs) we quantify the increase in clarity of the estimated state at any position due to a measurement taken from any other position. We use this relationship to design two new coverage controllers, both of which scale well with the number of agents exploring the domain, assuming the robots can share the map of the clarity over the spatial domain via communication. We demonstrate the algorithms through a realistic simulation of a team of robots collecting wind data over a region in Austria.
comment: 8 pages, 2 figures, accepted in IEEE CDC 2024
Systems and Control (EESS)
Cooptimizing Safety and Performance with a Control-Constrained Formulation
Autonomous systems have witnessed a rapid increase in their capabilities, but it remains a challenge for them to perform tasks both effectively and safely. The fact that performance and safety can sometimes be competing objectives renders the cooptimization between them difficult. One school of thought is to treat this cooptimization as a constrained optimal control problem with a performance-oriented objective function and safety as a constraint. However, solving this constrained optimal control problem for general nonlinear systems remains challenging. In this work, we use the general framework of constrained optimal control, but given the safety state constraint, we convert it into an equivalent control constraint, resulting in a state and time-dependent control-constrained optimal control problem. This equivalent optimal control problem can readily be solved using the dynamic programming principle. We show the corresponding value function is a viscosity solution of a certain Hamilton-Jacobi-Bellman Partial Differential Equation (HJB-PDE). Furthermore, we demonstrate the effectiveness of our method with a two-dimensional case study, and the experiment shows that the controller synthesized using our method consistently outperforms the baselines, both in safety and performance.
comment: Submitted to ACC with L-CSS option
Bayesian hypergame approach to equilibrium stability and robustness in moving target defense
We investigate the equilibrium stability and robustness in a class of moving target defense problems, in which players have both incomplete information and asymmetric cognition. We first establish a Bayesian Stackelberg game model for incomplete information and then employ a hypergame reformulation to address asymmetric cognition. With the core concept of the hyper Bayesian Nash equilibrium (HBNE), a condition for achieving both the strategic and cognitive stability in equilibria can be realized by solving linear equations. Moreover, to deal with players' underlying perturbed knowledge, we study the equilibrium robustness by presenting a condition of robust HBNE under the given configuration. Experiments evaluate our theoretical results.
An Ontology-based Approach Towards Traceable Behavior Specifications in Automated Driving
Vehicles in public traffic that are equipped with Automated Driving Systems are subject to a number of expectations: among other aspects, their behavior should be safe, conform to the rules of the road, and provide mobility to their users. This poses challenges for the developers of such systems, who are responsible for specifying this behavior, for example, in terms of requirements at system design time. As we will discuss in the article, this specification always involves the need for assumptions and trade-offs. As a result, insufficiencies can occur in such a behavior specification that can potentially lead to unsafe system behavior. In order to support the identification of specification insufficiencies, requirements and respective assumptions need to be made explicit. In this article, we propose the Semantic Norm Behavior Analysis as an ontology-based approach to specify the behavior for an Automated Driving System equipped vehicle. We use ontologies to formally represent specified behavior for a targeted operational environment, and to establish traceability between specified behavior and the addressed stakeholder needs. Furthermore, we illustrate the application of the Semantic Norm Behavior Analysis in two example scenarios and evaluate our results.
comment: 22 pages, 12 figures, submitted for publication
Self-calibrated Microring Weight Function for Neuromorphic Optical Computing
This paper presents a microring resonator-based weight function for neuromorphic photonic applications achieving a record-high precision of 11.3 bits and accuracy of 9.3 bits for 2 Gbps input optical signals. The system employs an all-analog self-referenced proportional-integral-derivative (PID) controller to perform real-time temperature stabilization within a range of up to 60 degrees Celsius. A self-calibrated weight function is demonstrated for a range of 6 degrees Celsius with a single initial calibration and minimal degradation in accuracy and precision. By monitoring the through and drop ports of the microring with variable gain transimpedance amplifiers, accurate and precise weight adjustment is achieved, ensuring optimal performance and reliability. These findings underscore the system's robustness to dynamic thermal environments, highlighting the potential for high-speed reconfigurable analog photonic networks.
On Epistemic Properties in Discrete-Event Systems: A Uniform Framework and Its Applications
In this paper, we investigate the property verification problem for partially-observed discrete-event systems (DES) from a new perspective. Specifically, we consider the problem setting where the system is observed by two agents independently, each with its own observation. The purpose of the first agent, referred to as the low-level observer, is to infer the actual behavior of the system, while the second, referred to as the high-level observer, aims to infer the knowledge of the low-level observer regarding the system. We present a general notion called the epistemic property, which captures the inference from the high-level observer to the low-level observer. A typical instance of this definition is the notion of high-order opacity, which specifies that the intruder does not know that the system knows some critical information. This formalization is very general and supports any user-defined information-state-based knowledge between the two observers. We demonstrate how the general definition of epistemic properties can be applied in different problem settings such as information leakage diagnosis or tactical cooperation without explicit communications. Finally, we provide a systematic approach for the verification of epistemic properties. In particular, we identify some fragments of epistemic properties that can be verified more efficiently.
Autoencoder-Based and Physically Motivated Koopman Lifted States for Wind Farm MPC: A Comparative Case Study
This paper explores the use of Autoencoder (AE) models to identify Koopman-based linear representations for designing model predictive control (MPC) for wind farms. Wake interactions in wind farms are challenging to model and have previously been addressed with Koopman lifted states. In this study, we investigate the performance of two AE models. The first AE model estimates the wind speeds acting on the turbines, which are affected by changes in turbine control inputs. The wind speeds estimated by this AE model are then used in a second step to calculate the power output via a simple turbine model based on physical equations. The second AE model directly estimates the wind farm output, i.e., both turbine and wake dynamics are modeled. The primary question of this study is whether either of these two AE-based models can surpass previously identified Koopman models based on physically motivated lifted states. We find that the first AE model, which estimates the wind speed and hence includes the wake dynamics but excludes the turbine dynamics, outperforms the existing physically motivated Koopman model. However, the second AE model, which estimates the farm power directly, underperforms when the turbines' underlying physical assumptions are correct. We additionally investigate specific conditions under which the second, purely data-driven AE model can excel, notably when modeling assumptions, such as the wind turbine power coefficient, are erroneous and remain unchecked within the MPC controller. In such cases, the data-driven AE models, when updated with recent data reflecting changed system dynamics, can outperform physics-based models operating under outdated assumptions.
comment: Accepted for Conference on Decision and Control 2024
Superior Computer Chess with Model Predictive Control, Reinforcement Learning, and Rollout
In this paper, we apply model predictive control (MPC), rollout, and reinforcement learning (RL) methodologies to computer chess. We introduce a new architecture for move selection, within which available chess engines are used as components. One engine is used to provide position evaluations in an approximation in value space MPC/RL scheme, while a second engine is used as a nominal opponent, to emulate or approximate the moves of the true opponent player. We show that our architecture substantially improves the performance of the position evaluation engine. In other words, our architecture provides an additional layer of intelligence on top of the intelligence of the engines on which it is based. This is true for any engine, regardless of its strength: top engines such as Stockfish and Komodo Dragon (of varying strengths), as well as weaker engines. Structurally, our basic architecture selects moves by a one-move lookahead search, with an intermediate move generated by a nominal opponent engine, followed by a position evaluation by another chess engine. Simpler schemes that forgo the use of the nominal opponent also perform better than the position evaluator, but not by quite as much. More complex schemes, involving multistep lookahead, may also be used and generally tend to perform better as the length of the lookahead increases. Theoretically, our methodology relies on generic cost improvement properties and the superlinear convergence framework of Newton's method, which fundamentally underlies approximation in value space and related MPC/RL and rollout/policy iteration schemes. A critical requirement of this framework is that the first lookahead step should be executed exactly. This fact has guided our architectural choices, and is apparently an important factor in improving the performance of even the best available chess engines.
A Short Information-Theoretic Analysis of Linear Auto-Regressive Learning
In this note, we give a short information-theoretic proof of the consistency of the Gaussian maximum likelihood estimator in linear auto-regressive models. Our proof yields nearly optimal non-asymptotic rates for parameter recovery and works without any invocation of stability in the case of finite hypothesis classes.
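As a concrete instance of the estimator under discussion: in a scalar AR(1) model with Gaussian noise of fixed variance, the Gaussian maximum likelihood estimate of the coefficient coincides with ordinary least squares. The sketch below assumes that scalar setting; the function names and the simulation parameters are illustrative and not taken from the note.

```python
import random


def simulate_ar1(a, n_steps, noise_std=1.0, seed=0):
    """Simulate x_{t+1} = a * x_t + w_t with Gaussian noise w_t, starting at x_0 = 0."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n_steps):
        x.append(a * x[-1] + rng.gauss(0.0, noise_std))
    return x


def ar1_least_squares(x):
    """Least-squares (Gaussian MLE) estimate of a from one trajectory."""
    num = sum(prev * nxt for prev, nxt in zip(x[:-1], x[1:]))
    den = sum(prev * prev for prev in x[:-1])
    return num / den
```

Running `ar1_least_squares(simulate_ar1(0.8, 5000))` returns a value close to 0.8, and the estimate tightens as the trajectory grows, which is the kind of parameter-recovery behavior that non-asymptotic rates quantify.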
Autonomous Iterative Motion Learning (AI-MOLE) of a SCARA Robot for Automated Myocardial Injection
Stem cell therapy is a promising approach to treating heart insufficiency and benefits from automated myocardial injection, which requires highly precise motion of a robotic manipulator equipped with a syringe. This work investigates whether sufficiently precise motion can be achieved by combining a SCARA robot and learning control methods. For this purpose, the method Autonomous Iterative Motion Learning (AI-MOLE) is extended to be applicable to multi-input/multi-output systems. The proposed learning method solves reference tracking tasks in systems with unknown, nonlinear, multi-input/multi-output dynamics by iteratively updating an input trajectory in a plug-and-play fashion and without requiring manual parameter tuning. The proposed learning method is validated in a preliminary simulation study of a simplified SCARA robot that has to perform three desired motions. The results demonstrate that the proposed learning method achieves highly precise reference tracking without requiring any a priori model information or manual parameter tuning in as few as 15 trials per motion. The results further indicate that the combination of a SCARA robot and learning method achieves sufficiently precise motion to potentially enable automatic myocardial injection if similar results can be obtained in a real-world setting.
comment: 6 pages, 4 figures
Analysis of a Simple Neuromorphic Controller for Linear Systems: A Hybrid Systems Perspective
In this paper we analyze a neuromorphic controller, inspired by the leaky integrate-and-fire neuronal model, in closed-loop with a single-input single-output linear time-invariant system. The controller consists of two neuron-like variables and generates a spiking control input whenever one of these variables reaches a threshold. The control input is different from zero only at the spiking instants and, hence, between two spiking times the system evolves in open-loop. Exploiting the hybrid nature of the integrate-and-fire neuronal dynamics, we present a hybrid modeling framework to design and analyze this new controller. In the particular case of single-state linear time-invariant plants, we prove a practical stability property for the closed-loop system, we ensure the existence of a strictly positive dwell-time between spikes, and we relate these properties to the parameters in the neurons. The results are illustrated in a numerical example.
Uncovering the inherited vulnerability of electric distribution networks
Research on the vulnerability of electric networks with a complex network approach has produced significant results in the last decade, especially for transmission networks. These studies have shown that there are causal relations between certain structural properties of networks and their vulnerabilities, leading to an inherent weakness. The purpose of the present work was twofold: to test the hypotheses already examined on evolving transmission networks and to gain a deeper understanding of the nature of these inherent weaknesses. For this, historical models of a medium-voltage distribution network supply area were reconstructed and analysed. Topological efficiency of the networks was calculated against node and edge removals of different proportions. We found that the tolerance of the evolving grid remained practically unchanged during the examined period, implying that the increase in size is dominantly caused by the connection of geographically and spatially constrained supply areas and not by an evolutionary process. We also show that probability density functions of centrality metrics, typically connected to vulnerability, show only minor variation during the early evolution of the examined distribution network, and in many cases resemble the properties of the present-day network.
comment: 22 pages, 10 figures
The Lynchpin of In-Memory Computing: A Benchmarking Framework for Vector-Matrix Multiplication in RRAMs
The Von Neumann bottleneck, a fundamental challenge in conventional computer architecture, arises from the inability to execute fetch and data operations simultaneously due to a shared bus linking processing and memory units. This bottleneck significantly limits system performance, increases energy consumption, and exacerbates computational complexity. Emerging technologies such as Resistive Random Access Memories (RRAMs), leveraging crossbar arrays, offer promising alternatives for addressing the demands of data-intensive computational tasks through in-memory computing of analog vector-matrix multiplication (VMM) operations. However, the propagation of errors due to device and circuit-level imperfections remains a significant challenge. In this study, we introduce MELISO (In-Memory Linear Solver), a comprehensive end-to-end VMM benchmarking framework tailored for RRAM-based systems. MELISO evaluates the error propagation in VMM operations, analyzing the impact of RRAM device metrics on error magnitude and distribution. This paper introduces the MELISO framework and demonstrates its utility in characterizing and mitigating VMM error propagation using state-of-the-art RRAM device metrics.
comment: ICONS 2024. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
APEX: Attention on Personality based Emotion ReXgnition Framework
Automated emotion recognition has applications in various fields, such as human-machine interaction, healthcare, security, education, and emotion-aware recommendation/feedback systems. Developing methods to analyze human emotions accurately is essential to enable such diverse applications. Multiple studies have been conducted to explore the possibility of using physiological signals and machine-learning techniques to evaluate human emotions. Furthermore, internal factors such as personality have been considered and involved in emotion recognition. However, integrating user-specific personality into traditional machine-learning methods that use large user-agnostic data sets has become a critical problem. This study proposes APEX, an attention on personality-based emotion recognition framework, in which multiple weak classifiers are trained on the physiological signals of each participant's data, and the classification results are reweighted based on the personality correlations between corresponding subjects and test subjects. Experiments have been conducted on the ASCERTAIN dataset, and the results show that the proposed framework outperforms existing studies.
Real-Time Ground Fault Detection for Inverter-Based Microgrid Systems
Ground fault detection in inverter-based microgrid (IBM) systems is challenging, particularly in a real-time setting, as the fault current deviates slightly from the nominal value. This difficulty is reinforced when there are partially decoupled disturbances and modeling uncertainties. The conventional solution of installing more relays to obtain additional measurements is costly and also increases the complexity of the system. In this paper, we propose a data-assisted diagnosis scheme based on an optimization-based fault detection filter with the output current as the only measurement. Modeling the microgrid dynamics and the diagnosis filter, we formulate the filter design as a quadratic programming (QP) problem that accounts for decoupling partial disturbances, robustness to non-decoupled disturbances and modeling uncertainties by training with data, and ensuring fault sensitivity simultaneously. To ease the computational effort, we also provide an approximate but analytical solution to this QP. Additionally, we use classical statistical results to provide a thresholding mechanism that enjoys probabilistic false-alarm guarantees. Finally, we implement the IBM system with Simulink and Real Time Digital Simulator (RTDS) to verify the effectiveness of the proposed method through simulations.
comment: 18 pages, 9 figures
Upstream Allocation of Bidirectional Load Demand by Power Packetization
The power packet dispatching system has been studied for power management with a strict tie to an accompanying information system through power packetization. In the system, integrated units of transfer of power and information, called power packets, are delivered through a network of apparatuses called power packet routers. This paper proposes upstream allocation of a bidirectional load demand, represented by a sequence of power packets, to power sources. We first develop a scheme of power packet routing for upstream allocation of load demand with full integration of power and information transfer. The routing scheme is then proved to enable packetized management of bidirectional load demand, which is of practical importance for applicability to, e.g., electric drives in motoring and regenerating operations. We present a way of packetizing the bidirectional load demand and realizing the power and information flow under the upstream allocation scheme. The viability of the proposed methods is demonstrated through experiments.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Design of CANSAT for Air Quality Monitoring for an altitude of 900 meters
This paper presents the design and development of NAMBI-VJ, a CANSAT specifically designed for air quality monitoring and stabilization. The CANSAT's cylindrical structure, measuring 310 mm in height and 125 mm in diameter, is equipped with a mechanical gyroscope for stabilization and a spill-hole parachute for controlled descent. The primary objective of this research is to create a compact, lightweight satellite capable of monitoring air quality parameters such as particulate matter (PM) and carbon dioxide (CO2), together with its longitude and latitude. To achieve this, the CANSAT utilizes Zigbee communication to transmit data to a ground station. Experimental testing involved dropping the CANSAT from an altitude of 900 meters using a drone. The results demonstrate the CANSAT's ability to successfully gather and transmit air quality data, highlighting its potential for environmental monitoring applications.
comment: 6 pages, 6 figures
State-of-the-art review and synthesis: A requirement-based roadmap for standardized predictive maintenance automation using digital twin technologies
Recent digital advances have popularized predictive maintenance (PMx), offering enhanced efficiency, automation, accuracy, cost savings, and independence in maintenance processes. Yet, PMx continues to face numerous limitations such as poor explainability, sample inefficiency of data-driven methods, complexity of physics-based methods, and limited generalizability and scalability of knowledge-based methods. This paper proposes leveraging Digital Twins (DTs) to address these challenges and enable automated PMx adoption on a larger scale. While DTs have the potential to be transformative, they have not yet reached the maturity needed to bridge these gaps in a standardized manner. Without a standard definition guiding this evolution, the transformation lacks a solid foundation for development. This paper provides a requirement-based roadmap to support standardized PMx automation using DT technologies. Our systematic approach comprises two primary stages. First, we methodically identify the Informational Requirements (IRs) and Functional Requirements (FRs) for PMx, which serve as a foundation from which any unified framework must emerge. Our approach to defining and using IRs and FRs as the backbone of any PMx DT is supported by the proven success of these requirements as blueprints in other areas, such as product development in the software industry. Second, we conduct a thorough literature review across various fields to assess how these IRs and FRs are currently being applied within DTs, enabling us to identify specific areas where further research is needed to support the progress and maturation of requirement-based PMx DTs.
comment: This paper has been accepted for publication in Advanced Engineering Informatics (2024)
The Fragile Nature of Road Transportation Systems
Major cities worldwide experience problems with the performance of their road transportation systems, and the continuous increase in traffic demand presents a substantial challenge to the optimal operation of urban road networks and the efficiency of traffic control strategies. The operation of transportation systems is widely considered to be fragile, i.e., the loss in performance increases exponentially as the magnitude of disruptions increases linearly. Meanwhile, the risk engineering community is embracing the novel concept of antifragility, enabling systems to learn from historical disruptions and exhibit improved performance under black swan events. In this study, based on established traffic models, namely fundamental diagrams and macroscopic fundamental diagrams (MFDs), we first conduct a rigorous mathematical analysis to prove the fragile nature of the systems theoretically. Subsequently, we propose a skewness-based indicator that depends solely on MFD-related parameters and can be readily applied to compare the degree of fragility across different networks. Finally, by taking real-world stochasticity into account, we implement a numerical simulation with realistic network data to bridge the gap between the theoretical proof and real-world operations, and to reflect the potential impact of uncertainty on the fragility of the systems. This work aims to demonstrate the fragile nature of road transportation systems and help researchers better comprehend the necessity of explicitly considering antifragile design for future traffic control strategies.
comment: 39 pages, 14 figures
Inverse Particle Filter
In cognitive systems, recent emphasis has been placed on studying the cognitive processes of the subject whose behavior was the primary focus of the system's cognitive response. This approach, known as inverse cognition, arises in counter-adversarial applications and has motivated the development of inverse Bayesian filters. In this context, a cognitive adversary, such as a radar, uses a forward Bayesian filter to track its target of interest. An inverse filter is then employed to infer the adversary's estimate of the target's or defender's state. Previous studies have addressed this inverse filtering problem by introducing methods like the inverse Kalman filter (I-KF), inverse extended KF (I-EKF), and inverse unscented KF (I-UKF). However, these filters typically assume additive Gaussian noise models and/or rely on local approximations of non-linear dynamics at the state estimates, limiting their practical application. In contrast, this paper adopts a global filtering approach and presents the development of an inverse particle filter (I-PF). The particle filter framework employs Monte Carlo (MC) methods to approximate arbitrary posterior distributions. Moreover, under mild system-level conditions, the proposed I-PF demonstrates convergence to the optimal inverse filter. Additionally, we propose the differentiable I-PF to address scenarios where system information is unknown to the defender. Using the recursive Cramer-Rao lower bound and non-credibility index (NCI), our numerical experiments for different systems demonstrate the estimation performance and time complexity of the proposed filter.
comment: 13 pages, 4 figures
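As background for the inverse particle filter entry above, the following sketch shows one step of a generic bootstrap (forward) particle filter, the Monte Carlo building block the abstract refers to. It is not the paper's I-PF: the scalar model (transition coefficient 0.9, Gaussian noise levels) and all function names are assumptions chosen purely for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, observation, rng,
                         process_std=0.5, obs_std=1.0):
    """One predict/update/resample step of a bootstrap particle filter.

    Assumed toy model (hypothetical, for illustration only):
        x_next = 0.9 * x + process noise,   y = x + observation noise.
    Returns the resampled particles and the weighted-mean state estimate.
    """
    # Predict: push every particle through the assumed dynamics.
    predicted = [0.9 * p + rng.gauss(0.0, process_std) for p in particles]

    # Update: weight each particle by the likelihood of the observation.
    weights = [gaussian_pdf(observation, p, obs_std) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Posterior-mean estimate from the weighted particle cloud.
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # Resample (multinomial) to counter weight degeneracy.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate
```

In the inverse setting described in the abstract, a filter of this kind would be run by the defender over the adversary's estimate rather than over the physical state; that outer layer is not shown here.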
Enhancing Sliding Performance with Aerial Robots: Analysis and Solutions for Non-Actuated Multi-Wheel Configurations
Sliding tasks performed by aerial robots are valuable for inspection and simple maintenance tasks at height, such as non-destructive testing and painting. Although various end-effector designs have been used for such tasks, non-actuated wheel configurations are more frequently applied thanks to their rolling capability for sliding motion, mechanical simplicity, and lightweight design. Moreover, a non-actuated multi-wheel (more than one wheel) configuration in the end-effector design allows the placement of additional equipment (e.g., sensors and tools) in the center of the end-effector tip. However, there is still a lack of studies on crucial contact conditions during sliding using aerial robots with such an end-effector design. In this article, we investigate the key challenges associated with sliding operations using aerial robots equipped with multiple non-actuated wheels through in-depth analysis grounded in physical experiments. The experimental data are used to create a simulator that closely captures real-world conditions. We propose solutions from both mechanical design and control perspectives to improve the sliding performance of aerial robots. From a mechanical standpoint, design guidelines are derived from experimental data. From a control perspective, we introduce a novel pressure-sensing-based control framework that ensures reliable task execution, even during sliding maneuvers. The effectiveness and robustness of the proposed approaches are then validated and compared using the built simulator, particularly in high-risk scenarios.
Adaptive Economic Model Predictive Control for linear systems with performance guarantees
We present a model predictive control (MPC) formulation to directly optimize economic criteria for linear constrained systems subject to disturbances and uncertain model parameters. The proposed formulation combines a certainty equivalent economic MPC with a simple least-squares parameter adaptation. For the resulting adaptive economic MPC scheme, we derive strong asymptotic and transient performance guarantees. We provide a numerical example involving building temperature control and demonstrate performance benefits of online parameter adaptation.
comment: Final version, IEEE Conference on Decision and Control (CDC), 2024
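To illustrate the "simple least-squares parameter adaptation" mentioned in the adaptive economic MPC entry above, here is a minimal sketch that fits the parameters of a scalar linear system from recorded data. The scalar model x_next = a*x + b*u, the batch (rather than recursive) formulation, and the function names are all assumptions made for illustration; the paper's actual adaptation law and system class may differ.

```python
def simulate(a, b, x0, inputs):
    """Roll out the assumed scalar system x_next = a*x + b*u (no disturbance)."""
    xs = [x0]
    for u in inputs:
        xs.append(a * xs[-1] + b * u)
    return xs


def least_squares_ab(xs, us):
    """Estimate (a, b) by solving the 2x2 normal equations in closed form.

    xs must contain len(us) + 1 states, so that (xs[k], us[k]) -> xs[k+1].
    """
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, x_next in zip(xs[:-1], us, xs[1:]):
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * x_next
        suy += u * x_next

    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        # Data are not informative enough to separate a from b.
        raise ValueError("regressors are (nearly) collinear")

    a_hat = (sxy * suu - suy * sxu) / det
    b_hat = (suy * sxx - sxy * sxu) / det
    return a_hat, b_hat
```

In a certainty-equivalent scheme, estimates such as `a_hat` and `b_hat` would replace the unknown model parameters inside the MPC problem at each step; with noise-free data, as in `simulate`, the fit recovers the true parameters up to rounding.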
Joint State and Sparse Input Estimation in Linear Dynamical Systems
Sparsity constraints on the control inputs of a linear dynamical system naturally arise in several practical applications such as networked control, computer vision, seismic signal processing, and cyber-physical systems. In this work, we consider the problem of jointly estimating the states and sparse inputs of such systems from low-dimensional (compressive) measurements. Due to the low-dimensional measurements, conventional Kalman filtering and smoothing algorithms fail to accurately estimate the states and inputs. We present a Bayesian approach that exploits the input sparsity to significantly improve estimation accuracy. Sparsity in the input estimates is promoted by using different prior distributions on the input. We investigate two main approaches: regularizer-based MAP, and Bayesian learning-based estimation. We also extend the approaches to handle control inputs with common support and analyze the time and memory complexities of the presented algorithms. Finally, using numerical simulations, we show that our algorithms outperform the state-of-the-art methods in terms of accuracy and time/memory complexities, especially in the low-dimensional measurement regime.
Probabilistic energy forecasting through quantile regression in reproducing kernel Hilbert spaces
Accurate energy demand forecasting is crucial for sustainable and resilient energy development. To meet the Net Zero Representative Concentration Pathways (RCP) 4.5 scenario in the DACH countries, increased renewable energy production, energy storage, and reduced commercial building consumption are needed. This scenario's success depends on hydroelectric capacity and climatic factors. Informed decisions require quantifying uncertainty in forecasts. This study explores a non-parametric method based on reproducing kernel Hilbert spaces (RKHS), known as kernel quantile regression, for energy prediction. Our experiments demonstrate its reliability and sharpness, and we benchmark it against state-of-the-art methods in load and price forecasting for the DACH region. We offer our implementation in conjunction with additional scripts to ensure the reproducibility of our research.
comment: 12 pages, {Owner/Author | ACM} {2024}. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record will be published in https://energy.acm.org/eir
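Quantile regression, including the kernel variant named in the entry above, is built on the pinball (quantile) loss. The sketch below shows that loss and checks, on a toy sample, that the constant minimizing it is the corresponding empirical quantile. The helper names and the toy data are assumptions for illustration; the kernel (RKHS) part of the method is not shown.

```python
def pinball_loss(y, q, tau):
    """Pinball loss of predicting quantile value q for outcome y at level tau."""
    diff = y - q
    # Under-prediction is penalised by tau, over-prediction by (1 - tau).
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(ys, q, tau):
    """Average pinball loss of a constant prediction q over a sample."""
    return sum(pinball_loss(y, q, tau) for y in ys) / len(ys)


def best_constant_quantile(ys, tau):
    """Sample value that minimises the mean pinball loss at level tau."""
    return min(sorted(set(ys)), key=lambda q: mean_pinball_loss(ys, q, tau))
```

For example, with `tau = 0.5` the loss reduces to half the absolute error, so `best_constant_quantile([1, 2, 3, 4, 5], 0.5)` selects the median, 3.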
Learning-Based Efficient Approximation of Data-Enabled Predictive Control
Data-Enabled Predictive Control (DeePC) bypasses the need for system identification by directly leveraging raw data to formulate optimal control policies. However, the size of the optimization problem in DeePC grows linearly with respect to the data size, which prohibits its application to resource-constrained systems due to high computational costs. In this paper, we propose an efficient approximation of DeePC, whose size is invariant with respect to the amount of data collected, via differentiable convex programming. Specifically, the optimization problem in DeePC is decomposed into two parts: a control objective and a scoring function that evaluates the likelihood of a guessed I/O sequence, the latter of which is approximated with a size-invariant learned optimization problem. The proposed method is validated through numerical simulations on a quadruple tank system, illustrating that the learned controller can reduce the computational time of DeePC by a factor of 5 while maintaining its control performance.
Real-Time Machine-Learning-Based Optimization Using Input Convex Long Short-Term Memory Network
Neural network-based optimization and control methods, often referred to as black-box approaches, are increasingly gaining attention in energy and manufacturing systems, particularly in situations where first-principles models are either unavailable or inaccurate. However, their non-convex nature significantly slows down the optimization and control processes, limiting their application in real-time decision-making processes. To address this challenge, we propose a novel Input Convex Long Short-Term Memory (IC-LSTM) network to enhance the computational efficiency of neural network-based optimization. Through two case studies employing real-time neural network-based optimization for optimizing energy and chemical systems, we demonstrate the superior performance of IC-LSTM-based optimization in terms of runtime. Specifically, in a real-time optimization problem of a real-world solar photovoltaic energy system at LHT Holdings in Singapore, IC-LSTM-based optimization achieved at least 4-fold speedup compared to conventional LSTM-based optimization. These results highlight the potential of IC-LSTM networks to significantly enhance the efficiency of neural network-based optimization and control in practical applications. Source code is available at https://github.com/killingbear999/ICLSTM.
comment: Applied Energy
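The speedup claimed in the IC-LSTM entry above rests on the network output being convex in its input, which makes the downstream optimization problem convex. The sketch below shows the general input-convexity construction for a tiny feed-forward network: a convex, non-decreasing activation (ReLU) combined with non-negative output weights. It is not the IC-LSTM architecture itself; the one-hidden-layer scalar network and all names are assumptions for illustration.

```python
def relu(z):
    """Rectified linear unit: convex and non-decreasing."""
    return z if z > 0.0 else 0.0


def input_convex_net(x, w_in, b_in, w_out, bias=0.0):
    """Scalar network that is convex in x by construction.

    Each hidden unit relu(w*x + b) is convex in x (convex function of an
    affine map). A non-negative weighted sum of convex functions is convex,
    so w_out must not contain negative entries. w_in may have any sign.
    """
    if any(w < 0.0 for w in w_out):
        raise ValueError("output weights must be non-negative for convexity")
    hidden = [relu(w * x + b) for w, b in zip(w_in, b_in)]
    return sum(wo * h for wo, h in zip(w_out, hidden)) + bias
```

A quick sanity check is midpoint convexity: for any two inputs, the output at their midpoint should not exceed the average of the two outputs.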
Hybrid integrator-gain system based integral resonant controllers for negative imaginary systems
We introduce a hybrid control system called a hybrid integrator-gain system (HIGS) based integral resonant controller (IRC) to stabilize negative imaginary (NI) systems. A HIGS-based IRC has a similar structure to an IRC, with the integrator replaced by a HIGS. We show that a HIGS-based IRC is an NI system. Also, for a SISO NI system with a minimal realization, we show there exists a HIGS-based IRC such that their closed-loop interconnection is asymptotically stable. Also, we propose a proportional-integral-double-integral resonant controller and a HIGS-based proportional-integral-double-integral resonant controller, and we show that both of them can be applied to asymptotically stabilize an NI system. An example is provided to illustrate the proposed results.
comment: 9 pages, 9 figures. The 63rd IEEE Conference on Decision and Control (CDC 2024)
Multi-Agent Clarity-Aware Dynamic Coverage with Gaussian Processes
This paper presents two algorithms for multi-agent dynamic coverage in spatiotemporal environments, where the coverage algorithms are informed by the method of data assimilation. In particular, we show that by explicitly modeling the environment using a Gaussian Process (GP) model, and considering the sensing capabilities and the dynamics of a team of robots, we can design an estimation algorithm and multi-agent coverage controller that explores and estimates the state of the spatiotemporal environment. The uncertainty of the estimate is quantified using clarity, an information-theoretic metric, where higher clarity corresponds to lower uncertainty. By exploiting the relationship between GPs and Stochastic Differential Equations (SDEs) we quantify the increase in clarity of the estimated state at any position due to a measurement taken from any other position. We use this relationship to design two new coverage controllers, both of which scale well with the number of agents exploring the domain, assuming the robots can share the map of the clarity over the spatial domain via communication. We demonstrate the algorithms through a realistic simulation of a team of robots collecting wind data over a region in Austria.
comment: 8 pages, 2 figures, accepted in IEEE CDC 2024
Multiagent Systems
Foragax: An Agent Based Modelling framework based on JAX
Foraging for resources is a ubiquitous activity conducted by living organisms in a shared environment to maintain their homeostasis. Modelling multi-agent foraging in-silico allows us to study both individual and collective emergent behaviour in a tractable manner. Agent-based modelling has proven to be effective in simulating such tasks, though scaling the simulations to accommodate large numbers of agents with complex dynamics remains challenging. In this work, we present Foragax, a general-purpose, scalable, hardware-accelerated, multi-agent foraging toolkit. Leveraging the JAX library, our toolkit can simulate thousands of agents foraging in a common environment, in an end-to-end vectorized and differentiable manner. The toolkit provides agent-based modelling tools to model various foraging tasks, including options to design custom spatial and temporal agent dynamics, control policies, sensor models, and boundary conditions. Further, the number of agents during such simulations can be increased or decreased based on custom rules. The toolkit can also be used to potentially model more general multi-agent scenarios.
Responsible Blockchain: STEADI Principles and the Actor-Network Theory-based Development Methodology (ANT-RDM)
This paper provides a comprehensive analysis of the challenges and controversies associated with blockchain technology. It identifies technical challenges such as scalability, security, privacy, and interoperability, as well as business and adoption challenges, and the social, economic, ethical, and environmental controversies present in current blockchain systems. We argue that responsible blockchain development is key to overcoming these challenges and achieving mass adoption. This paper defines Responsible Blockchain and introduces the STEADI principles (sustainable, transparent, ethical, adaptive, decentralized, and inclusive) for responsible blockchain development. Additionally, it presents the Actor-Network Theory-based Responsible Development Methodology (ANT-RDM) for blockchains, which includes the steps of problematization, interessement, enrollment, and mobilization.
comment: 19 pages, 1 figure, journal publication
System Neural Diversity: Measuring Behavioral Heterogeneity in Multi-Agent Learning
Evolutionary science provides evidence that diversity confers resilience in natural systems. Yet, traditional multi-agent reinforcement learning techniques commonly enforce homogeneity to increase training sample efficiency. When a system of learning agents is not constrained to homogeneous policies, individuals may develop diverse behaviors, resulting in emergent complementarity that benefits the system. Despite this, there is a surprising lack of tools that quantify behavioral diversity. Such techniques would pave the way towards understanding the impact of diversity in collective artificial intelligence and enabling its control. In this paper, we introduce System Neural Diversity (SND): a measure of behavioral heterogeneity in multi-agent systems. We discuss and prove its theoretical properties, and compare it with alternate, state-of-the-art behavioral diversity metrics used in the robotics domain. Through simulations of a variety of cooperative multi-robot tasks, we show how our metric constitutes an important tool that enables measurement and control of behavioral heterogeneity. In dynamic tasks, where the problem is affected by repeated disturbances during training, we show that SND allows us to measure latent resilience skills acquired by the agents, while other proxies, such as task performance (reward), fail to. Finally, we show how the metric can be employed to control diversity, allowing us to enforce a desired heterogeneity set-point or range. We demonstrate how this paradigm can be used to bootstrap the exploration phase, finding optimal policies faster, thus enabling novel and more efficient MARL paradigms.
Mean-Field Approximation of Cooperative Constrained Multi-Agent Reinforcement Learning (CMARL)
Mean-Field Control (MFC) has recently been proven to be a scalable tool to approximately solve large-scale multi-agent reinforcement learning (MARL) problems. However, these studies are typically limited to the unconstrained cumulative reward maximization framework. In this paper, we show that one can use the MFC approach to approximate the MARL problem even in the presence of constraints. Specifically, we prove that an $N$-agent constrained MARL problem, with the state and action spaces of each individual agent being of sizes $|\mathcal{X}|$ and $|\mathcal{U}|$ respectively, can be approximated by an associated constrained MFC problem with an error $e\triangleq \mathcal{O}\left([\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}]/\sqrt{N}\right)$. In a special case where the reward, cost, and state transition functions are independent of the action distribution of the population, we prove that the error can be improved to $e=\mathcal{O}(\sqrt{|\mathcal{X}|}/\sqrt{N})$. Also, we provide a Natural Policy Gradient based algorithm and prove that it can solve the constrained MARL problem within an error of $\mathcal{O}(e)$ with a sample complexity of $\mathcal{O}(e^{-6})$.
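To make the scaling in the bounds above concrete, the sketch below evaluates the rate expressions with the constants hidden by the big-O notation set to 1. That choice of constant and the function names are assumptions; the numbers are only meaningful as relative scalings, not as actual error values.

```python
import math


def mfc_error_scale(num_states, num_actions, num_agents):
    """Rate (sqrt|X| + sqrt|U|) / sqrt(N), with the hidden constant set to 1."""
    return (math.sqrt(num_states) + math.sqrt(num_actions)) / math.sqrt(num_agents)


def mfc_error_scale_special(num_states, num_agents):
    """Rate sqrt|X| / sqrt(N) for the action-distribution-independent case."""
    return math.sqrt(num_states) / math.sqrt(num_agents)


def sample_complexity_scale(error):
    """Rate e^(-6) from the stated sample complexity, constant set to 1."""
    return error ** -6
```

For instance, with 4 states and 9 actions, going from 25 to 100 agents halves the rate from 1.0 to 0.5, and halving the target error multiplies the sample-complexity rate by 2^6 = 64.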
Robotics
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments
Robot models, particularly those trained with large amounts of data, have recently shown a plethora of real-world manipulation and navigation capabilities. Several independent efforts have shown that given sufficient training data in an environment, robot policies can generalize to demonstrated variations in that environment. However, needing to finetune robot models to every new environment stands in stark contrast to models in language or vision that can be deployed zero-shot for open-world problems. In this work, we present Robot Utility Models (RUMs), a framework for training and deploying zero-shot robot policies that can directly generalize to new environments without any finetuning. To create RUMs efficiently, we develop new tools to quickly collect data for mobile manipulation tasks, integrate such data into a policy with multi-modal imitation learning, and deploy policies on-device on Hello Robot Stretch, a cheap commodity robot, with an external mLLM verifier for retrying. We train five such utility models for opening cabinet doors, opening drawers, picking up napkins, picking up paper bags, and reorienting fallen objects. Our system, on average, achieves 90% success rate in unseen, novel environments interacting with unseen objects. Moreover, the utility models can also succeed in different robot and camera set-ups with no further data, training, or fine-tuning. Primary among our lessons are the importance of training data over training algorithm and policy class, guidance about data scaling, necessity for diverse yet high-quality demonstrations, and a recipe for robot introspection and retrying to improve performance on individual environments. Our code, data, models, hardware designs, as well as our experiment and deployment videos are open sourced and can be found on our project website: https://robotutilitymodels.com
comment: Project website https://robotutilitymodels.com
Neural MP: A Generalist Neural Motion Planner
The current paradigm for motion planning generates solutions from scratch for every new problem, which consumes significant amounts of time and computational resources. For complex, cluttered scenes, motion planning approaches can often take minutes to produce a solution, while humans are able to accurately and safely reach any goal in seconds by leveraging their prior experience. We seek to do the same by applying data-driven learning at scale to the problem of motion planning. Our approach builds a large number of complex scenes in simulation, collects expert data from a motion planner, then distills it into a reactive generalist policy. We then combine this with lightweight optimization to obtain a safe path for real-world deployment. We perform a thorough evaluation of our method on 64 motion planning tasks across four diverse environments with randomized poses, scenes, and obstacles in the real world, demonstrating improvements of 23%, 17%, and 79% in motion planning success rate over state-of-the-art sampling-, optimization-, and learning-based planning methods, respectively. Video results available at mihdalal.github.io/neuralmotionplanner
comment: Website at mihdalal.github.io/neuralmotionplanner. Main paper: 7 pages, 4 figures, 2 tables. Appendix: 9 pages, 5 figures, 6 tables
Promptable Closed-loop Traffic Simulation
Simulation stands as a cornerstone for safe and efficient autonomous driving development. At its core a simulation system ought to produce realistic, reactive, and controllable traffic patterns. In this paper, we propose ProSim, a multimodal promptable closed-loop traffic simulation framework. ProSim allows the user to give a complex set of numerical, categorical or textual prompts to instruct each agent's behavior and intention. ProSim then rolls out a traffic scenario in a closed-loop manner, modeling each agent's interaction with other traffic participants. Our experiments show that ProSim achieves high prompt controllability given different user prompts, while reaching competitive performance on the Waymo Sim Agents Challenge when no prompt is given. To support research on promptable traffic simulation, we create ProSim-Instruct-520k, a multimodal prompt-scenario paired driving dataset with over 10M text prompts for over 520k real-world driving scenarios. We will release code of ProSim as well as data and labeling tools of ProSim-Instruct-520k at https://ariostgx.github.io/ProSim.
comment: Accepted to CoRL 2024. Website available at https://ariostgx.github.io/ProSim
Learning control of underactuated double pendulum with Model-Based Reinforcement Learning
This report describes our proposed solution for the second AI Olympics competition held at IROS 2024. Our solution is based on a recent Model-Based Reinforcement Learning algorithm named MC-PILCO. Besides briefly reviewing the algorithm, we discuss the most critical aspects of the MC-PILCO implementation in the tasks at hand.
Leveraging Object Priors for Point Tracking
Point tracking is a fundamental problem in computer vision with numerous applications in AR and robotics. A common failure mode in long-term point tracking occurs when the predicted point leaves the object it belongs to and lands on the background or another object. We identify this as the failure to correctly capture objectness properties in learning to track. To address this limitation of prior work, we propose a novel objectness regularization approach that guides points to be aware of object priors by forcing them to stay inside the boundaries of object instances. By capturing objectness cues at training time, we avoid the need to compute object masks during testing. In addition, we leverage contextual attention to enhance the feature representation for capturing objectness at the feature level more effectively. As a result, our approach achieves state-of-the-art performance on three point tracking benchmarks, and we further validate the effectiveness of our components via ablation studies. The source code is available at: https://github.com/RehgLab/tracking_objectness
comment: ECCV 2024 ILR Workshop
Creativity and Visual Communication from Machine to Musician: Sharing a Score through a Robotic Camera
This paper explores the integration of visual communication and musical interaction by implementing a robotic camera within a "Guided Harmony" musical game. We aim to examine co-creative behaviors between human musicians and robotic systems. Our research explores existing methodologies like improvisational game pieces and extends these concepts to include robotic participation using a PTZ camera. The robotic system interprets and responds to nonverbal cues from musicians, creating a collaborative and adaptive musical experience. This initial case study underscores the importance of intuitive visual communication channels. We also propose future research directions, including parameters for refining the visual cue toolkit and data collection methods to understand human-machine co-creativity further. Our findings contribute to the broader understanding of machine intelligence in augmenting human creativity, particularly in musical settings.
Design of a Variable Stiffness Quasi-Direct Drive Cable-Actuated Tensegrity Robot
Tensegrity robots excel in tasks requiring extreme levels of deformability and robustness. However, there are challenges in state estimation and payload versatility due to their high number of degrees of freedom and unconventional shape. This paper introduces a modular three-bar tensegrity robot featuring a customizable payload design. Our tensegrity robot employs a novel Quasi-Direct Drive (QDD) cable actuator paired with low-stretch polymer cables to achieve accurate proprioception without the need for external force or torque sensors. The design allows for on-the-fly stiffness tuning for better environment and payload adaptability. In this paper, we present the design, fabrication, assembly, and experimental results of the robot. Experimental data demonstrate high-accuracy cable length estimation (<1% error relative to bar length) and variable stiffness control of the cable actuator up to 7 times the minimum stiffness for self support. The presented tensegrity robot serves as a platform for future advancements in autonomous operation and open-source module design.
comment: 8 pages, 13 figures
Robust Loss Functions for Object Grasping under Limited Ground Truth
Object grasping is a crucial technology enabling robots to perceive and interact with the environment. However, in practical applications, researchers are faced with missing or noisy ground truth while training the convolutional neural network, which decreases the accuracy of the model. Therefore, different loss functions are proposed to deal with these problems and improve the accuracy of the neural network. For missing ground truth, a new predicted category probability method is defined for unlabeled samples, which works effectively in conjunction with the pseudo-labeling method. Furthermore, for noisy ground truth, a symmetric loss function is introduced to resist corruption by label noise. The proposed loss functions are powerful, robust, and easy to use. Experimental results based on the typical grasping neural network show that our method can improve performance by 2 to 13 percent.
RCM-Constrained Manipulator Trajectory Tracking Using Differential Kinematics Control
This paper proposes an approach for controlling surgical robotic systems while complying with the Remote Center of Motion (RCM) constraint in Robot-Assisted Minimally Invasive Surgery (RA-MIS). In this approach, the RCM constraint is upheld algorithmically, providing flexibility in the positioning of the insertion point and enabling compatibility with a wide range of general-purpose robots. The paper further investigates the impact of the tool's insertion ratio on the RCM error, and introduces a manipulability index of the robot that accounts for the RCM error and is used to find a starting configuration. To accurately evaluate the proposed method's trajectory tracking within an RCM-constrained environment, an electromagnetic tracking system is employed. The results demonstrate the effectiveness of the proposed method in addressing the RCM constraint problem in RA-MIS.
comment: 6 pages, 7 figures. Published in the 21st International Conference on Advanced Robotics (ICAR 2023)
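For the RCM-constrained tracking entry above, one common way to quantify how well the constraint is upheld is the shortest distance between the fixed insertion (RCM) point and the tool shaft axis. The sketch below computes that distance. Treating this quantity as the RCM error is an assumption, since the abstract does not give the paper's exact definition, and the function and argument names are hypothetical.

```python
import math


def rcm_error(rcm_point, shaft_start, shaft_end):
    """Distance from the RCM point to the infinite line through the tool shaft.

    All arguments are 3-D points given as (x, y, z) sequences in one frame.
    """
    direction = [e - s for s, e in zip(shaft_start, shaft_end)]
    length_sq = sum(c * c for c in direction)
    if length_sq == 0.0:
        raise ValueError("shaft_start and shaft_end must be distinct points")

    # Project the RCM point onto the shaft axis.
    offset = [p - s for p, s in zip(rcm_point, shaft_start)]
    t = sum(o * d for o, d in zip(offset, direction)) / length_sq
    closest = [s + t * d for s, d in zip(shaft_start, direction)]

    # Euclidean distance between the RCM point and its projection.
    return math.sqrt(sum((p - c) ** 2 for p, c in zip(rcm_point, closest)))
```

A value of zero means the shaft axis passes exactly through the insertion point; a controller that upholds the constraint algorithmically would aim to keep this distance small along the whole trajectory.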
Cooperative Decision-Making for CAVs at Unsignalized Intersections: A MARL Approach with Attention and Hierarchical Game Priors
The development of autonomous vehicles has shown great potential to enhance the efficiency and safety of transportation systems. However, the decision-making issue in complex human-machine mixed traffic scenarios, such as unsignalized intersections, remains a challenge for autonomous vehicles. While reinforcement learning (RL) has been used to solve complex decision-making problems, existing RL methods still have limitations in dealing with cooperative decision-making of multiple connected autonomous vehicles (CAVs), ensuring safety during exploration, and simulating realistic human driver behaviors. In this paper, a novel and efficient algorithm, Multi-Agent Game-prior Attention Deep Deterministic Policy Gradient (MA-GA-DDPG), is proposed to address these limitations. Our proposed algorithm formulates the decision-making problem of CAVs at unsignalized intersections as a decentralized multi-agent reinforcement learning problem and incorporates an attention mechanism to capture interaction dependencies between ego CAV and other agents. The attention weights between the ego vehicle and other agents are then used to screen interaction objects and obtain prior hierarchical game relations, based on which a safety inspector module is designed to improve the traffic safety. Furthermore, both simulation and hardware-in-the-loop experiments were conducted, demonstrating that our method outperforms other baseline approaches in terms of driving safety, efficiency, and comfort.
Interactive incremental learning of generalizable skills with local trajectory modulation
The problem of generalization in learning from demonstration (LfD) has received considerable attention over the years, particularly within the context of movement primitives, where a number of approaches have emerged. Recently, two important approaches have gained recognition. While one leverages via-points to adapt skills locally by modulating demonstrated trajectories, another relies on so-called task-parameterized models that encode movements with respect to different coordinate systems, using a product of probabilities for generalization. While the former are well-suited to precise, local modulations, the latter aim at generalizing over large regions of the workspace and often involve multiple objects. Addressing the quality of generalization by leveraging both approaches simultaneously has received little attention. In this work, we propose an interactive imitation learning framework that simultaneously leverages local and global modulations of trajectory distributions. Building on the kernelized movement primitives (KMP) framework, we introduce novel mechanisms for skill modulation from direct human corrective feedback. Our approach particularly exploits the concept of via-points to incrementally and interactively 1) improve the model accuracy locally, 2) add new objects to the task during execution and 3) extend the skill into regions where demonstrations were not provided. We evaluate our method on a bearing ring-loading task using a torque-controlled, 7-DoF, DLR SARA robot.
comment: 21 pages, 16 figures
StratXplore: Strategic Novelty-seeking and Instruction-aligned Exploration for Vision and Language Navigation
Embodied navigation requires robots to understand and interact with the environment based on given tasks. Vision-Language Navigation (VLN) is an embodied navigation task, where a robot navigates within previously seen and unseen environments, based on linguistic instructions and visual inputs. VLN agents need access to both local and global action spaces: the former for immediate decision making and the latter for recovering from navigational mistakes. Prior VLN agents rely only on instruction-viewpoint alignment for local and global decision making and back-track to a previously visited viewpoint if the instruction and the current viewpoint mismatch. These methods are prone to mistakes, due to the complexity of the instruction and partial observability of the environment. We posit that back-tracking is sub-optimal and that an agent that is aware of its mistakes can recover efficiently. For optimal recovery, exploration should be extended to unexplored viewpoints (or frontiers). The optimal frontier is a recently observed but unexplored viewpoint that aligns with the instruction and is novel. We introduce a memory-based and mistake-aware path planning strategy for VLN agents, called StratXplore, that presents global and local action planning to select the optimal frontier for path correction. The proposed method collects all past actions and viewpoint features during navigation and then selects the optimal frontier suitable for recovery. Experimental results show this simple yet effective strategy improves the success rate on two VLN datasets with different task complexities.
Interpretable Responsibility Sharing as a Heuristic for Task and Motion Planning
This article introduces a novel heuristic for Task and Motion Planning (TAMP) named Interpretable Responsibility Sharing (IRS), which enhances planning efficiency in domestic robots by leveraging human-constructed environments and inherent biases. Utilizing auxiliary objects (e.g., trays and pitchers), which are commonly found in household settings, IRS systematically incorporates these elements to simplify and optimize task execution. The heuristic is rooted in the novel concept of Responsibility Sharing (RS), where auxiliary objects share the task's responsibility with the embodied agent, dividing complex tasks into manageable sub-problems. This division not only reflects human usage patterns but also aids robots in navigating and manipulating within human spaces more effectively. By integrating Optimized Rule Synthesis (ORS) for decision-making, IRS ensures that the use of auxiliary objects is both strategic and context-aware, thereby improving the interpretability and effectiveness of robotic planning. Experiments conducted across various household tasks demonstrate that IRS significantly outperforms traditional methods by reducing the effort required in task execution and enhancing the overall decision-making process. This approach not only aligns with human intuitive methods but also offers a scalable solution adaptable to diverse domestic environments. Code is available at https://github.com/asyncs/IRS.
LEROjD: Lidar Extended Radar-Only Object Detection ECCV 2024
Accurate 3D object detection is vital for automated driving. While lidar sensors are well suited for this task, they are expensive and have limitations in adverse weather conditions. 3+1D imaging radar sensors offer a cost-effective, robust alternative but face challenges due to their low resolution and high measurement noise. Existing 3+1D imaging radar datasets include radar and lidar data, enabling cross-modal model improvements. Although lidar should not be used during inference, it can aid the training of radar-only object detectors. We explore two strategies to transfer knowledge from the lidar to the radar domain and radar-only object detectors: 1. multi-stage training with sequential lidar point cloud thin-out, and 2. cross-modal knowledge distillation. In the multi-stage process, three thin-out methods are examined. Our results show significant performance gains of up to 4.2 percentage points in mean Average Precision with multi-stage training and up to 3.9 percentage points with knowledge distillation by initializing the student with the teacher's weights. The main benefit of these approaches is their applicability to other 3D object detection networks without altering their architecture, as we show by analyzing it on two different object detectors. Our code is available at https://github.com/rst-tu-dortmund/lerojd
comment: Accepted for publication as ECCV 2024
Adaptive Probabilistic Planning for the Uncertain and Dynamic Orienteering Problem
The Orienteering Problem (OP) is a well-studied routing problem that has been extended to incorporate uncertainties, reflecting stochastic or dynamic travel costs, prize-collection costs, and prizes. Existing approaches may, however, be inefficient in real-world applications due to insufficient modeling knowledge and initially unknowable parameters in online scenarios. Thus, we propose the Uncertain and Dynamic Orienteering Problem (UDOP), modeling travel costs as distributions with unknown and time-variant parameters. UDOP also associates uncertain travel costs with dynamic prizes and prize-collection costs for its objective and budget constraints. To address UDOP, we develop an ADaptive Approach for Probabilistic paThs - ADAPT, that iteratively performs 'execution' and 'online planning' based on an initial 'offline' solution. The execution phase updates system status and records online cost observations. The online planner employs a Bayesian approach to adaptively estimate power consumption and optimize path sequence based on safety beliefs. We evaluate ADAPT in a practical Unmanned Aerial Vehicle (UAV) charging scheduling problem for Wireless Rechargeable Sensor Networks. The UAV must optimize its path to recharge sensor nodes efficiently while managing its energy under uncertain conditions. ADAPT maintains comparable solution quality and computation time while offering superior robustness. Extensive simulations show that ADAPT achieves a 100% Mission Success Rate (MSR) across all tested scenarios, outperforming comparable heuristic-based and frequentist approaches that fail up to 70% (under challenging conditions) and averaging 67% MSR, respectively. This work advances the field of OP with uncertainties, offering a reliable and efficient approach for real-world applications in uncertain and dynamic environments.
DexDiff: Towards Extrinsic Dexterity Manipulation of Ungraspable Objects in Unrestricted Environments
Grasping large and flat objects (e.g. a book or a pan) is often regarded as an ungraspable task, which poses significant challenges due to the unreachable grasping poses. Previous works leverage Extrinsic Dexterity like walls or table edges to grasp such objects. However, they are limited to task-specific policies and lack task planning to find pre-grasp conditions. This makes it difficult to adapt to various environments and extrinsic dexterity constraints. Therefore, we present DexDiff, a robust robotic manipulation method for long-horizon planning with extrinsic dexterity. Specifically, we utilize a vision-language model (VLM) to perceive the environmental state and generate high-level task plans, followed by a goal-conditioned action diffusion (GCAD) model to predict the sequence of low-level actions. This model learns the low-level policy from offline data with the cumulative reward guided by high-level planning as the goal condition, which allows for improved prediction of robot actions. Experimental results demonstrate that our method not only effectively performs ungraspable tasks but also generalizes to previously unseen objects. It outperforms baselines by a 47% higher success rate in simulation and facilitates efficient deployment and manipulation in real-world scenarios.
DWA-3D: A Reactive Planner for Robust and Efficient Autonomous UAV Navigation
Despite the growing impact of Unmanned Aerial Vehicles (UAVs) across various industries, most currently available solutions lack a robust autonomous navigation system that deals safely with the appearance of obstacles. This work presents an approach to autonomous UAV planning and navigation in scenarios where safe, highly maneuverable flight is required because of cluttered environments and narrow spaces. The system combines an RRT* global planner with a newly proposed reactive planner, DWA-3D, which is an extension of the well-known DWA method for 2D robots. We provide a theoretical-empirical method for adjusting the parameters of the objective function to be optimized, easing the classical difficulty of tuning them. An onboard LiDAR provides a 3D point cloud, which is projected onto an Octomap in which the planning and navigation decisions are made. There is no prior map; the system builds and updates the map online from the current and past LiDAR information included in the Octomap. Extensive real-world experiments were conducted to validate the system and to fine-tune the involved parameters. These experiments allowed us to provide a set of values that ensure safe operation across all the tested scenarios. Just by weighting two parameters, it is possible to prioritize either horizontal path alignment or vertical (height) tracking, resulting in enhanced vertical or lateral avoidance, respectively. Additionally, our DWA-3D proposal is able to navigate successfully even in the absence of a global planner or with one that does not consider the drone's size. Finally, the conducted experiments show that computation time with the proposed parameters is not only bounded but also remains stable around 40 ms, regardless of the scenario complexity.
comment: 17 pages, 18 figures
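The weighting trade-off mentioned in the DWA-3D abstract can be illustrated with a minimal sketch. It assumes a generic dynamic-window-style objective, not the authors' actual function: each candidate velocity is scored by a weighted sum of a horizontal-alignment term, a height-tracking term, and a clearance term. The names `score_candidate`, `best_velocity`, `w_horizontal`, `w_vertical`, `w_clearance`, and the term definitions themselves are hypothetical.

```python
import math


def score_candidate(velocity, position, goal, obstacle_distance,
                    w_horizontal=1.0, w_vertical=1.0, w_clearance=0.5,
                    dt=1.0, max_clearance=2.0):
    """Score one candidate velocity (vx, vy, vz); higher is better.

    All terms are illustrative assumptions, not the DWA-3D objective.
    """
    # Predict where the vehicle would be after holding this velocity for dt.
    px = position[0] + velocity[0] * dt
    py = position[1] + velocity[1] * dt
    pz = position[2] + velocity[2] * dt

    # Horizontal error: distance to the goal in the x-y plane.
    horizontal_error = math.hypot(goal[0] - px, goal[1] - py)
    # Vertical error: height difference to the goal.
    vertical_error = abs(goal[2] - pz)
    # Clearance reward saturates so far-away obstacles do not dominate.
    clearance = min(obstacle_distance, max_clearance)

    return (-w_horizontal * horizontal_error
            - w_vertical * vertical_error
            + w_clearance * clearance)


def best_velocity(candidates, position, goal, clearance_fn, **weights):
    """Return the candidate with the highest score.

    clearance_fn(velocity) is a hypothetical callback that returns the
    distance to the nearest obstacle along the predicted motion.
    """
    return max(
        candidates,
        key=lambda v: score_candidate(v, position, goal, clearance_fn(v), **weights),
    )
```

With two candidates, one moving horizontally and one climbing, raising `w_horizontal` relative to `w_vertical` selects the horizontal move, and reversing the weights selects the climb. This only shows the effect of the two weights in this toy objective; the tuned values reported in the paper are not reproduced here.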
From Words to Poses: Enhancing Novel Object Pose Estimation with Vision Language Models
Robots are increasingly envisioned to interact in real-world scenarios, where they must continuously adapt to new situations. To detect and grasp novel objects, zero-shot pose estimators determine poses without prior knowledge. Recently, vision language models (VLMs) have shown considerable advances in robotics applications by establishing an understanding between language input and image input. In our work, we take advantage of VLMs' zero-shot capabilities and translate this ability to 6D object pose estimation. We propose a novel framework for promptable zero-shot 6D object pose estimation using language embeddings. The idea is to derive a coarse location of an object based on the relevancy map of a language-embedded NeRF reconstruction and to compute the pose estimate with a point cloud registration method. Additionally, we provide an analysis of LERF's suitability for open-set object pose estimation. We examine hyperparameters, such as activation thresholds for relevancy maps, and investigate the zero-shot capabilities at the instance and category levels. Furthermore, we plan to conduct robotic grasping experiments in a real-world setting.
Leveraging Computation of Expectation Models for Commonsense Affordance Estimation on 3D Scene Graphs IROS24
This article studies the commonsense object affordance concept for enabling close-to-human task planning and task optimization of embodied robotic agents in urban environments. The focus of the object affordance is on reasoning how to effectively identify an object's inherent utility during task execution, which in this work is enabled through the analysis of contextual relations of sparse information in 3D scene graphs. The proposed framework develops a Correlation Information (CECI) model to learn probability distributions using a Graph Convolutional Network, allowing the extraction of the commonsense affordance for individual members of a semantic class. The overall framework was experimentally validated in a real-world indoor environment, showcasing the ability of the method to perform on par with human commonsense. For a video of the article, showcasing the experimental demonstration, please refer to the following link: https://youtu.be/BDCMVx2GiQE
comment: Accepted in IROS24
GOPT: Generalizable Online 3D Bin Packing via Transformer-based Deep Reinforcement Learning
Robotic object packing has broad practical applications in the logistics and automation industry, often formulated by researchers as the online 3D Bin Packing Problem (3D-BPP). However, existing DRL-based methods primarily focus on enhancing performance in limited packing environments while neglecting the ability to generalize across multiple environments characterized by different bin dimensions. To this end, we propose GOPT, a generalizable online 3D Bin Packing approach via Transformer-based deep reinforcement learning (DRL). First, we design a Placement Generator module to yield finite subspaces as placement candidates and the representation of the bin. Second, we propose a Packing Transformer, which fuses the features of the items and bin, to identify the spatial correlation between the item to be packed and available sub-spaces within the bin. Coupling these two components enables GOPT's ability to perform inference on bins of varying dimensions. We conduct extensive experiments and demonstrate that GOPT not only achieves superior performance against the baselines, but also exhibits excellent generalization capabilities. Furthermore, the deployment with a robot showcases the practical applicability of our method in the real world. The source code will be publicly available at https://github.com/Xiong5Heng/GOPT.
comment: 8 pages, 6 figures. This paper has been accepted by IEEE Robotics and Automation Letters
Neural Surface Reconstruction and Rendering for LiDAR-Visual Systems
This paper presents a unified surface reconstruction and rendering framework for LiDAR-visual systems, integrating Neural Radiance Fields (NeRF) and Neural Distance Fields (NDF) to recover both appearance and structural information from posed images and point clouds. We address the structural visible gap between NeRF and NDF by utilizing a visible-aware occupancy map to classify space into free, occupied, visible unknown, and background regions. This classification facilitates the recovery of a complete appearance and structure of the scene. We unify the training of the NDF and NeRF using a spatial-varying scale SDF-to-density transformation for levels of detail for both structure and appearance. The proposed method leverages the learned NDF for structure-aware NeRF training by an adaptive sphere tracing sampling strategy for accurate structure rendering. In return, NeRF further refines the structure by recovering missing or fuzzy structures in the NDF. Extensive experiments demonstrate the superior quality and versatility of the proposed method across various scenarios. To benefit the community, the code will be released at https://github.com/hku-mars/M2Mapping.
Adaptive Visual Servoing for On-Orbit Servicing
This paper presents an adaptive visual servoing framework for robotic on-orbit servicing (OOS), specifically designed for capturing tumbling satellites. The vision-guided robotic system is capable of selecting optimal control actions in the event of partial or complete vision system failure, particularly in the short term. The autonomous system accounts for physical and operational constraints, executing visual servoing tasks to minimize a cost function. A hierarchical control architecture is developed, integrating a variant of the Iterative Closest Point (ICP) algorithm for image registration, a constrained noise-adaptive Kalman filter, fault detection and recovery logic, and a constrained optimal path planner. The dynamic estimator provides real-time estimates of unknown states and uncertain parameters essential for motion prediction, while ensuring consistency through a set of inequality constraints. It also adjusts the Kalman filter parameters adaptively in response to unexpected vision errors. In the event of vision system faults, a recovery strategy is activated, guided by fault detection logic that monitors the visual feedback via the metric fit error of image registration. The estimated/predicted pose and parameters are subsequently fed into an optimal path planner, which directs the robot's end-effector to the target's grasping point. This process is subject to multiple constraints, including acceleration limits, smooth capture, and line-of-sight maintenance with the target. Experimental results demonstrate that the proposed visual servoing system successfully captured a free-floating object, despite complete occlusion of the vision system.
comment: arXiv admin note: substantial text overlap with arXiv:2209.02156
Developing Trajectory Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
End-to-end approaches with Reinforcement Learning (RL) and Imitation Learning (IL) have gained increasing popularity in autonomous driving. However, they involve neither explicit reasoning, as in the classic robotics workflow, nor planning with horizons, which leaves their strategies implicit and myopic. In this paper, we introduce our trajectory planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) bootstrapped by BC for static obstacle nudging. It outputs lateral offset values to adjust the given reference trajectory, and provides the modified path for different controllers. Our experimental results show that the algorithm can perform path-tracking that mimics the expert performance and avoid collisions with fixed obstacles by trial and error. This method makes a good attempt at planning with learning-based methods in trajectory planning problems of autonomous driving.
comment: 6 pages, 7 figures
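The idea of adjusting a reference trajectory with lateral offsets, as described in the abstract above, can be sketched geometrically. The sketch assumes a 2D reference path given as waypoints and one signed offset per waypoint (positive to the left of the direction of travel); this representation and the function name `apply_lateral_offsets` are hypothetical and not taken from the paper. In the paper the offsets come from the learned policy; here they are plain numbers.

```python
import math


def apply_lateral_offsets(reference, offsets):
    """Shift each 2D waypoint sideways by a signed lateral offset.

    reference: list of (x, y) waypoints along the path.
    offsets:   list of signed distances, positive = left of travel direction.
    Returns a new list of (x, y) waypoints.
    """
    if len(reference) != len(offsets):
        raise ValueError("reference and offsets must have the same length")
    if len(reference) < 2:
        raise ValueError("at least two waypoints are needed to define a heading")

    last = len(reference) - 1
    adjusted = []
    for i, ((x, y), d) in enumerate(zip(reference, offsets)):
        # Local heading from neighbouring waypoints
        # (central difference in the interior, one-sided at the ends).
        x_prev, y_prev = reference[max(i - 1, 0)]
        x_next, y_next = reference[min(i + 1, last)]
        heading = math.atan2(y_next - y_prev, x_next - x_prev)

        # Left-pointing unit normal of the heading is (-sin, cos).
        adjusted.append((x - d * math.sin(heading), y + d * math.cos(heading)))
    return adjusted
```

For a straight path along the x-axis, an offset of 0.5 moves a waypoint to y = 0.5 and an offset of -0.5 moves it to y = -0.5, which is the "nudging" effect around a static obstacle in its simplest form.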
Path-Parameterised RRTs for Underactuated Systems IROS 2024
We present a sample-based motion planning algorithm specialised to a class of underactuated systems using path parameterisation. The structure this class presents under a path parameterisation enables the trivial computation of dynamic feasibility along a path. Using this, a specialised state-based steering mechanism within an RRT motion planning algorithm is developed, enabling the generation of both geometric paths and their time parameterisations without introducing excessive computational overhead. We find with two systems that our algorithm computes feasible trajectories with higher rates of success and lower mean computation times compared to existing approaches.
comment: 8 pages, accepted for publication at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Robotic Ad-Hoc Networks
Practical robotic ad-hoc networks (RANETs), a type of mobile wireless ad-hoc network (WANET) supporting the WiFi-Direct modes common in Internet of Things (IoT) and phone devices, are proposed based on a strategy of exploiting WiFi-Direct connection modes to overcome hardware restrictions. For a certain period of time the community was enthusiastic about the endless opportunities in fair, robust, efficient, and cheap communication created by the ad-hoc mode of the WiFi IEEE 802.11 independent basic service set (IBSS) configuration, which required no dedicated access points. The mode was a main enabler of wireless ad-hoc networks (WANETs). This communication mode unfortunately did not get into the standard network cards present in IoT devices and mobile phones, likely due to the high energy consumption it exacts. Rather, such devices implement WiFi-Direct, which is designed for star topologies. Several attempts were made to overcome the restriction and support WANETs, but they break at least the fairness and symmetry properties, thereby reducing applicability. Here we show a solution for fair RANETs and evaluate the behavior of various strategies using simulations.
comment: Presented at the FCRAR 24 conference in May 2024 (Florida Conference on Recent Advances in Robotics)
Online Resynthesis of High-Level Collaborative Tasks for Robots with Changing Capabilities
Given a collaborative high-level task and a team of heterogeneous robots and behaviors to satisfy it, this work focuses on the challenge of automatically, at runtime, adjusting the individual robot behaviors such that the task is still satisfied, when robots encounter changes to their abilities--either failures or additional actions they can perform. We consider tasks encoded in LTL^\psi and minimize global teaming reassignments (and as a result, local resynthesis) when robots' capabilities change. We also increase the expressivity of LTL^\psi by including additional types of constraints on the overall teaming assignment that the user can specify, such as the minimum number of robots required for each assignment. We demonstrate the framework in a simulated warehouse scenario.
comment: Under review in IEEE Robotics and Automation Letters
PaRCE: Probabilistic and Reconstruction-Based Competency Estimation for Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can accurately predict whether a sample is correctly classified, misclassified, or out-of-distribution (OOD). We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
PEERNet: An End-to-End Profiling Tool for Real-Time Networked Robotic Systems IROS 2024
Networked robotic systems balance compute, power, and latency constraints in applications such as self-driving vehicles, drone swarms, and teleoperated surgery. A core problem in this domain is deciding when to offload a computationally expensive task to the cloud, a remote server, at the cost of communication latency. Task offloading algorithms often rely on precise knowledge of system-specific performance metrics, such as sensor data rates, network bandwidth, and machine learning model latency. While these metrics can be modeled during system design, uncertainties in connection quality, server load, and hardware conditions introduce real-time performance variations, hindering overall performance. We introduce PEERNet, an end-to-end and real-time profiling tool for cloud robotics. PEERNet enables performance monitoring on heterogeneous hardware through targeted yet adaptive profiling of system components such as sensors, networks, deep-learning pipelines, and devices. We showcase PEERNet's capabilities through networked robotics tasks, such as image-based teleoperation of a Franka Emika Panda arm and querying vision language models using an Nvidia Jetson Orin. PEERNet reveals non-intuitive behavior in robotic systems, such as asymmetric network transmission and bimodal language model output. Our evaluation underscores the effectiveness and importance of benchmarking in networked robotics, demonstrating PEERNet's adaptability. Our code is open-source and available at github.com/UTAustin-SwarmLab/PEERNet.
comment: Accepted at IROS 2024
Voronoi-based Multi-Robot Formations for 3D Source Seeking via Cooperative Gradient Estimation
In this paper, we tackle the problem of localizing the source of a three-dimensional signal field with a team of mobile robots able to collect noisy measurements of its strength and share information with each other. The adopted strategy is to cooperatively compute a closed-form estimation of the gradient of the signal field that is then employed to steer the multi-robot system toward the source location. In order to guarantee an accurate and robust gradient estimation, the robots are placed on the surface of a sphere of fixed radius. More specifically, their positions correspond to the generators of a constrained Centroidal Voronoi partition on the spherical surface. We show that, by keeping these specific formations, both crucial geometric properties and a high level of field coverage are simultaneously achieved and that they allow estimating the gradient via simple analytic expressions. We finally provide simulation results to evaluate the performance of the proposed approach, considering both noise-free and noisy measurements. In particular, a comparative analysis shows how its higher robustness against faulty measurements outperforms an alternative state-of-the-art solution.
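The abstract above does not state the closed-form gradient expressions, so the following is only a minimal sketch of one well-known estimator of this kind. It assumes N robots on a sphere of radius r around a center c whose offsets d_i sum to zero and satisfy sum(d_i d_i^T) = (N r^2 / 3) I; under that assumption the gradient is estimated as (3 / (N r^2)) * sum(f(p_i) * d_i), which is exact for a linear field. The six vertices of an octahedron are used as a stand-in formation (they are not claimed to be the paper's constrained Centroidal Voronoi configuration), and the names `estimate_gradient` and `octahedron_formation` are hypothetical.

```python
def octahedron_formation(center, radius):
    """Six positions at +/- radius along each axis around center (stand-in formation)."""
    positions = []
    for axis in range(3):
        for sign in (1.0, -1.0):
            point = list(center)
            point[axis] += sign * radius
            positions.append(tuple(point))
    return positions


def estimate_gradient(center, positions, measurements):
    """Estimate the field gradient at center from measurements on a sphere.

    Assumes every position lies at the same distance from center and that
    the formation is symmetric in the sense stated in the lead-in.
    """
    n = len(positions)
    # Squared radius, taken from the first robot (all are assumed equidistant).
    radius_sq = sum((p - c) ** 2 for p, c in zip(positions[0], center))

    # Accumulate the measurement-weighted offsets: sum_i f(p_i) * (p_i - c).
    weighted = [0.0, 0.0, 0.0]
    for position, value in zip(positions, measurements):
        for k in range(3):
            weighted[k] += value * (position[k] - center[k])

    scale = 3.0 / (n * radius_sq)
    return [scale * w for w in weighted]
```

As a usage example, sampling a linear field such as f(x, y, z) = 2x - y + 0.5z + 3 at the six formation points and passing the values to `estimate_gradient` recovers the coefficients (2, -1, 0.5) up to floating-point error. With noisy measurements or a nonlinear field the result is only an approximation.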
A Multi-Modal Approach Based on Large Vision Model for Close-Range Underwater Target Localization
Underwater target localization uses real-time sensory measurements to estimate the position of underwater objects of interest, providing critical feedback information for underwater robots. While acoustic sensing is the most acknowledged method in underwater robots and possibly the only effective approach for long-range underwater target localization, such a sensing modality generally suffers from low resolution, high cost and high energy consumption, thus leading to a mediocre performance when applied to close-range underwater target localization. On the other hand, optical sensing has attracted increasing attention in the underwater robotics community for its advantages of high resolution and low cost, holding a great potential particularly in close-range underwater target localization. However, most existing studies in underwater optical sensing are restricted to specific types of targets due to the limited training data available. In addition, these studies typically focus on the design of estimation algorithms and ignore the influence of illumination conditions on the sensing performance, thus hindering wider applications in the real world. To address the aforementioned issues, this paper proposes a novel target localization method that assimilates both optical and acoustic sensory measurements to estimate the 3D positions of close-range underwater targets. A test platform with controllable illumination conditions is designed and developed to experimentally investigate the proposed multi-modal sensing approach. A large vision model is applied to process the optical imaging measurements, eliminating the requirement for training data acquisition, thus significantly expanding the scope of potential applications. Extensive experiments are conducted, the results of which validate the effectiveness of the proposed underwater target localization method.
Active Collaborative Visual SLAM exploiting ORB Features
In autonomous robotics, a significant challenge involves devising robust solutions for Active Collaborative SLAM (AC-SLAM). This process requires multiple robots to cooperatively explore and map an unknown environment by intelligently coordinating their movements and sensor data acquisition. In this article, we present an efficient visual AC-SLAM method using aerial and ground robots for environment exploration and mapping. We propose an efficient frontiers filtering method that takes into account the common IoU map frontiers and reduces the frontiers for each robot. We also present an approach that guides robots to previously visited goal positions to promote loop closure and reduce SLAM uncertainty. The proposed method is implemented in ROS and evaluated in simulation on publicly available datasets against similar methods, achieving a cumulative average increase of 59% in area coverage.
comment: 6 Pages, 7 Figures, 2 Tables. arXiv admin note: text overlap with arXiv:2310.01967
Jointly Learning Cost and Constraints from Demonstrations for Safe Trajectory Generation IROS
Learning from Demonstration allows robots to mimic human actions. However, these methods do not model constraints crucial to ensure the safety of the learned skill. Moreover, even when explicitly modelling constraints, they rely on the assumption of a known cost function, which limits their practical usability for tasks with unknown cost. In this work we propose a two-step optimization process that estimates cost and constraints by decoupling the learning of cost functions from the identification of unknown constraints within the demonstrated trajectories. Initially, we identify the cost function by isolating the effect of constraints on parts of the demonstrations. Subsequently, a constraint learning method is used to identify the unknown constraints. Our approach is validated both on simulated trajectories and a real robotic manipulation task. Our experiments show the impact that incorrect cost estimation has on the learned constraints and illustrate how the proposed method is able to infer unknown constraints, such as obstacles, from demonstrated trajectories without any initial knowledge of the cost.
comment: (Accepted/In press) 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
CLFT: Camera-LiDAR Fusion Transformer for Semantic Segmentation in Autonomous Driving
Critical research about camera-and-LiDAR-based semantic object segmentation for autonomous driving significantly benefited from the recent development of deep learning. Specifically, the vision transformer is the novel ground-breaker that successfully brought the multi-head-attention mechanism to computer vision applications. Therefore, we propose a vision-transformer-based network to carry out camera-LiDAR fusion for semantic segmentation applied to autonomous driving. Our proposal uses the novel progressive-assemble strategy of vision transformers on a double-direction network and then integrates the results in a cross-fusion strategy over the transformer decoder layers. Unlike other works in the literature, our camera-LiDAR fusion transformers have been evaluated in challenging conditions like rain and low illumination, showing robust performance. The paper reports the segmentation results over the vehicle and human classes in different modalities: camera-only, LiDAR-only, and camera-LiDAR fusion. We perform coherent controlled benchmark experiments of CLFT against other networks that are also designed for semantic segmentation. The experiments aim to evaluate the performance of CLFT independently from two perspectives: multimodal sensor fusion and backbone architectures. The quantitative assessments show our CLFT networks yield an improvement of up to 10% for challenging dark-wet conditions when comparing with Fully-Convolutional-Neural-Network-based (FCN) camera-LiDAR fusion neural network. Contrasting to the network with transformer backbone but using single modality input, the all-around improvement is 5-10%.
comment: Accepted to IEEE Transactions on Intelligent Vehicles
RGBManip: Monocular Image-based Robotic Manipulation through Active Object Pose Estimation ICRA
Robotic manipulation requires accurate perception of the environment, which poses a significant challenge due to its inherent complexity and constantly changing nature. In this context, RGB image and point-cloud observations are two commonly used modalities in visual-based robotic manipulation, but each of these modalities has its own limitations. Commercial point-cloud observations often suffer from issues like sparse sampling and noisy output due to the limits of the emission-reception imaging principle. On the other hand, RGB images, while rich in texture information, lack essential depth and 3D information crucial for robotic manipulation. To mitigate these challenges, we propose an image-only robotic manipulation framework that leverages an eye-on-hand monocular camera installed on the robot's parallel gripper. By moving with the robot gripper, this camera gains the ability to actively perceive the object from multiple perspectives during the manipulation process. This enables the estimation of 6D object poses, which can be utilized for manipulation. While obtaining images from more diverse viewpoints typically improves pose estimation, it also increases the manipulation time. To address this trade-off, we employ a reinforcement learning policy to synchronize the manipulation strategy with active perception, achieving a balance between 6D pose accuracy and manipulation efficiency. Our experimental results in both simulated and real-world environments showcase the state-of-the-art effectiveness of our approach. We believe that our method will inspire further research on real-world-oriented robotic manipulation.
comment: 2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2024
Learning to Walk and Fly with Adversarial Motion Priors IROS
Robot multimodal locomotion encompasses the ability to transition between walking and flying, representing a significant challenge in robotics. This work presents an approach that enables automatic smooth transitions between legged and aerial locomotion. Leveraging the concept of Adversarial Motion Priors, our method allows the robot to imitate motion datasets and accomplish the desired task without the need for complex reward functions. The robot learns walking patterns from human-like gaits and aerial locomotion patterns from motions obtained using trajectory optimization. Through this process, the robot adapts the locomotion scheme based on environmental feedback using reinforcement learning, with the spontaneous emergence of mode-switching behavior. The results highlight the potential for achieving multimodal locomotion in aerial humanoid robotics through automatic control of walking and flying modes, paving the way for applications in diverse domains such as search and rescue, surveillance, and exploration missions. This research contributes to advancing the capabilities of aerial humanoid robots in terms of versatile locomotion in various environments.
comment: This paper has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Abu Dhabi, 2024
Sim-to-Real of Soft Robots with Learned Residual Physics
Accurately modeling soft robots in simulation is computationally expensive and commonly falls short of representing the real world. This well-known discrepancy, known as the sim-to-real gap, can have several causes, such as coarsely approximated geometry and material models, manufacturing defects, viscoelasticity and plasticity, and hysteresis effects. Residual physics networks learn from real-world data to augment a discrepant model and bring it closer to reality. Here, we present a residual physics method for modeling soft robots with large degrees of freedom. We train neural networks to learn a residual term -- the modeling error between simulated and physical systems. Concretely, the residual term is a force applied on the whole simulated mesh, while real position data is collected with only sparse motion markers. The physical prior of the analytical simulation provides a starting point for the residual network, and the combined model is more informed than if physics were learned tabula rasa. We demonstrate our method on 1) a silicone elastomeric beam and 2) a soft pneumatic arm with hard-to-model, anisotropic fiber reinforcements. Our method outperforms traditional system identification up to 60%. We show that residual physics need not be limited to low degrees of freedom but can effectively bridge the sim-to-real gap for high dimensional systems.
comment: 8 pages, 8 figures
VisFly: An Efficient and Versatile Simulator for Training Vision-based Flight
We present VisFly, a quadrotor simulator designed to efficiently train vision-based flight policies using reinforcement learning algorithms. VisFly offers a user-friendly framework and interfaces, leveraging Habitat-Sim's rendering engines to achieve frame rates exceeding 10,000 frames per second for rendering motion and sensor data. The simulator incorporates differentiable physics and is seamlessly wrapped with the Gym environment, facilitating the straightforward implementation of various learning algorithms. It supports directly importing open-source scene datasets compatible with Habitat-Sim, enabling training on diverse real-world environments simultaneously. To validate our simulator, we also provide three reinforcement learning examples for typical flight tasks relying on visual observations. The simulator is now available at [https://github.com/SJTU-ViSYS-team/VisFly].
G-Loc: Tightly-coupled Graph Localization with Prior Topo-metric Information
Localization in already mapped environments is a critical component in many robotics and automotive applications, where previously acquired information can be exploited along with sensor fusion to provide robust and accurate localization estimates. In this work, we offer a new perspective on map-based localization by reusing prior topological and metric information. Thus, we reformulate this long-studied problem to go beyond the mere use of metric maps. Our framework seamlessly integrates LiDAR, inertial and GNSS measurements, and cloud-to-map registrations in a sliding window graph fashion, which allows it to accommodate the uncertainty of each observation. The modularity of our framework allows it to work with different sensor configurations (e.g., LiDAR resolutions, GNSS denial) and environmental conditions (e.g., mapless regions, large environments). We have conducted several validation experiments, including the deployment in a real-world automotive application, demonstrating the accuracy, efficiency, and versatility of our system in online localization.
comment: 8 pages
OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation
Light field cameras are capable of capturing intricate angular and spatial details. This allows for acquiring complex light patterns and details from multiple angles, significantly enhancing the precision of image semantic segmentation. However, two significant issues arise: (1) The extensive angular information of light field cameras contains a large amount of redundant data, which is overwhelming for the limited hardware resources of intelligent agents. (2) A relative displacement difference exists in the data collected by different micro-lenses. To address these issues, we propose an Omni-Aperture Fusion model (OAFuser) that leverages dense context from the central view and extracts the angular information from sub-aperture images to generate semantically consistent results. To simultaneously streamline the redundant information from the light field cameras and avoid feature loss during network propagation, we present a simple yet very effective Sub-Aperture Fusion Module (SAFM). This module efficiently embeds sub-aperture images in angular features, allowing the network to process each sub-aperture image with a minimal computational demand of only around 1 GFLOPs. Furthermore, to address the mismatched spatial information across viewpoints, we present a Center Angular Rectification Module (CARM) to realize feature resorting and prevent feature occlusion caused by misalignment. The proposed OAFuser achieves state-of-the-art performance on four UrbanLF datasets in terms of all evaluation metrics and sets a new record of 84.93% in mIoU on the UrbanLF-Real Extended dataset, with a gain of +3.69%. The source code for OAFuser is available at https://github.com/FeiBryantkit/OAFuser.
comment: Accepted to IEEE Transactions on Artificial Intelligence (TAI). The source code is available at https://github.com/FeiBryantkit/OAFuser
Risk-Aware Robotics: Tail Risk Measures in Planning, Control, and Verification
The need for a systematic approach to risk assessment has increased in recent years due to the ubiquity of autonomous systems that alter our day-to-day experiences and their need for safety, e.g., for self-driving vehicles, mobile service robots, and bipedal robots. These systems are expected to function safely in unpredictable environments and interact seamlessly with humans, whose behavior is notably challenging to forecast. We present a survey of risk-aware methodologies for autonomous systems. We adopt a contemporary risk-aware approach to mitigate rare and detrimental outcomes by advocating the use of tail risk measures, a concept borrowed from financial literature. This survey will introduce these measures and explain their relevance in the context of robotic systems for planning, control, and verification applications.
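As a small illustration of what a tail risk measure computes, the sketch below estimates value-at-risk (VaR) and conditional value-at-risk (CVaR) from sampled losses. It is a generic empirical estimator, not a method taken from the survey; the function names and the choice of the tail-average estimator are assumptions made for this example.

```python
import numpy as np


def value_at_risk(losses, alpha):
    """Empirical alpha-quantile of the sampled losses (larger loss = worse)."""
    return float(np.quantile(np.asarray(losses, dtype=float), alpha))


def conditional_value_at_risk(losses, alpha):
    """Average of the sampled losses at or above the empirical VaR.

    This is a simple tail-average estimator; other estimators exist.
    """
    losses = np.asarray(losses, dtype=float)
    var = value_at_risk(losses, alpha)
    tail = losses[losses >= var]
    return float(tail.mean())


if __name__ == "__main__":
    # Hypothetical loss samples, e.g. costs of simulated rollouts.
    samples = np.arange(1, 101)
    print(value_at_risk(samples, 0.9))
    print(conditional_value_at_risk(samples, 0.9))
```

Unlike the mean, CVaR looks only at the worst outcomes, which is why it suits planning and verification problems where rare failures dominate the cost.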
RoVi-Aug: Robot and Viewpoint Augmentation for Cross-Embodiment Robot Learning
Scaling up robot learning requires large and diverse datasets, and how to efficiently reuse collected data and transfer policies to new embodiments remains an open question. Emerging research such as the Open-X Embodiment (OXE) project has shown promise in leveraging skills by combining datasets including different robots. However, imbalances in the distribution of robot types and camera angles in many datasets make policies prone to overfit. To mitigate this issue, we propose RoVi-Aug, which leverages state-of-the-art image-to-image generative models to augment robot data by synthesizing demonstrations with different robots and camera views. Through extensive physical experiments, we show that, by training on robot- and viewpoint-augmented data, RoVi-Aug can zero-shot deploy on an unseen robot with significantly different camera angles. Compared to test-time adaptation algorithms such as Mirage, RoVi-Aug requires no extra processing at test time, does not assume known camera angles, and allows policy fine-tuning. Moreover, by co-training on both the original and augmented robot datasets, RoVi-Aug can learn multi-robot and multi-task policies, enabling more efficient transfer between robots and skills and improving success rates by up to 30%. Project website: https://rovi-aug.github.io.
comment: CoRL 2024 (Oral). Project website: https://rovi-aug.github.io
Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting
The ability to reuse collected data and transfer trained policies between robots could alleviate the burden of additional data collection and training. While existing approaches such as pretraining plus finetuning and co-training show promise, they do not generalize to robots unseen in training. Focusing on common robot arms with similar workspaces and 2-jaw grippers, we investigate the feasibility of zero-shot transfer. Through simulation studies on 8 manipulation tasks, we find that state-based Cartesian control policies can successfully zero-shot transfer to a target robot after accounting for forward dynamics. To address robot visual disparities for vision-based policies, we introduce Mirage, which uses "cross-painting"--masking out the unseen target robot and inpainting the seen source robot--during execution in real time so that it appears to the policy as if the trained source robot were performing the task. Mirage applies to both first-person and third-person camera views and policies that take in both states and images as inputs or only images as inputs. Despite its simplicity, our extensive simulation and physical experiments provide strong evidence that Mirage can successfully zero-shot transfer between different robot arms and grippers with only minimal performance degradation on a variety of manipulation tasks such as picking, stacking, and assembly, significantly outperforming a generalist policy. Project website: https://robot-mirage.github.io/
comment: RSS 2024. Project page: https://robot-mirage.github.io/
Efficient Imitation Without Demonstrations via Value-Penalized Auxiliary Control from Examples ICRA'25
Learning from examples of success is an appealing approach to reinforcement learning but it presents a challenging exploration problem, especially for complex or long-horizon tasks. This work introduces value-penalized auxiliary control from examples (VPACE), an algorithm that significantly improves exploration in example-based control by adding examples of simple auxiliary tasks. For instance, a manipulation task may have auxiliary examples of an object being reached for, grasped, or lifted. We show that the naïve application of scheduled auxiliary control to example-based learning can lead to value overestimation and poor performance. We resolve the problem with an above-success-level value penalty. Across both simulated and real robotic environments, we show that our approach substantially improves learning efficiency for challenging tasks, while maintaining bounded value estimates. We compare with existing approaches to example-based learning, inverse reinforcement learning, and an exploration bonus. Preliminary results also suggest that VPACE may learn more efficiently than the more common approaches of using full trajectories or true sparse rewards. Videos, code, and datasets: https://papers.starslab.ca/vpace.
comment: Submitted to IEEE International Conference on Robotics and Automation (ICRA'25), Atlanta, USA, May 19-23, 2025
A Study on Prompt Injection Attack Against LLM-Integrated Mobile Robotic Systems
The integration of Large Language Models (LLMs) like GPT-4o into robotic systems represents a significant advancement in embodied artificial intelligence. These models can process multi-modal prompts, enabling them to generate more context-aware responses. However, this integration is not without challenges. One of the primary concerns is the potential security risks associated with using LLMs in robotic navigation tasks. These tasks require precise and reliable responses to ensure safe and effective operation. Multi-modal prompts, while enhancing the robot's understanding, also introduce complexities that can be exploited maliciously. For instance, adversarial inputs designed to mislead the model can lead to incorrect or dangerous navigational decisions. This study investigates the impact of prompt injections on mobile robot performance in LLM-integrated systems and explores secure prompt strategies to mitigate these risks. Our findings demonstrate a substantial overall improvement of approximately 30.8% in both attack detection and system performance with the implementation of robust defence mechanisms, highlighting their critical role in enhancing security and reliability in mission-oriented tasks.
Learning Lyapunov-Stable Polynomial Dynamical Systems through Imitation
Imitation learning is a paradigm to address complex motion planning problems by learning a policy to imitate an expert's behavior. However, relying solely on the expert's data might lead to unsafe actions when the robot deviates from the demonstrated trajectories. Stability guarantees have previously been provided utilizing nonlinear dynamical systems, acting as high-level motion planners, in conjunction with the Lyapunov stability theorem. Yet, these methods are prone to inaccurate policies, high computational cost, sample inefficiency, or quasi stability when replicating complex and highly nonlinear trajectories. To mitigate this problem, we present an approach for learning a globally stable nonlinear dynamical system as a motion planning policy. We model the nonlinear dynamical system as a parametric polynomial and learn the polynomial's coefficients jointly with a Lyapunov candidate. To showcase its success, we compare our method against the state of the art in simulation and conduct real-world experiments with the Kinova Gen3 Lite manipulator arm. Our experiments demonstrate the sample efficiency and reproduction accuracy of our method for various expert trajectories, while remaining stable in the face of perturbations.
comment: In 7th Annual Conference on Robot Learning 2023 Aug 30
Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics
We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator's behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). Despite being trained only on language, we show that these Transformers excel at translating tokenised visual keypoint observations into action trajectories, performing on par or better than state-of-the-art imitation learning (diffusion policies) in the low-data regime on a suite of real-world, everyday tasks. Rather than operating in the language domain as is typical, KAT leverages text-based Transformers to operate in the vision and action domains to learn general patterns in demonstration data for highly efficient imitation learning, indicating promising new avenues for repurposing natural language models for embodied tasks. Videos are available at https://www.robot-learning.uk/keypoint-action-tokens.
comment: Published at Robotics: Science and Systems (RSS) 2024
Development of Advanced FEM Simulation Technology for Pre-Operative Surgical Planning
Intracorporeal needle-based therapeutic ultrasound (NBTU) offers a minimally invasive approach for the thermal ablation of malignant brain tumors, including both primary and metastatic cancers. NBTU utilizes a high-frequency alternating electric field to excite a piezoelectric transducer, generating acoustic waves that cause localized heating and tumor cell ablation, and it provides a more precise ablation by delivering lower acoustic power doses directly to targeted tumors while sparing surrounding healthy tissue. Building on our previous work, this study introduces a database for optimizing pre-operative surgical planning by simulating ablation effects in varied tissue environments and develops an extended simulation model incorporating various tumor types and sizes to evaluate thermal damage under trans-tissue conditions. A comprehensive database is created from these simulations, detailing critical parameters such as CEM43 isodose maps, temperature changes, thermal dose areas, and maximum ablation distances for four directional probes. This database serves as a valuable resource for future studies, aiding in complex trajectory planning and parameter optimization for NBTU procedures. Moreover, a novel probe selection method is proposed to enhance pre-surgical planning, providing a strategic approach to selecting probes that maximize therapeutic efficiency and minimize ablation time. By avoiding unnecessary thermal propagation and optimizing probe angles, this method has the potential to improve patient outcomes and streamline surgical procedures. Overall, the findings of this study contribute significantly to the field of NBTU, offering a robust framework for enhancing treatment precision and efficacy in clinical settings.
comment: 8 pages, 17 figures, 2 tables
Metaverse for Safer Roadways: An Immersive Digital Twin Framework for Exploring Human-Autonomy Coexistence in Urban Transportation Systems
Societal-scale deployment of autonomous vehicles requires them to coexist with human drivers, necessitating mutual understanding and coordination among these entities. However, purely real-world or simulation-based experiments cannot be employed to explore such complex interactions due to safety and reliability concerns, respectively. Consequently, this work presents an immersive digital twin framework to explore and experiment with the interaction dynamics between autonomous and non-autonomous traffic participants. Particularly, we employ a mixed-reality human-machine interface to allow human drivers and autonomous agents to observe and interact with each other for testing edge-case scenarios while ensuring safety at all times. To validate the versatility of the proposed framework's modular architecture, we first present a discussion on a set of user experience experiments encompassing 4 different levels of immersion with 4 distinct user interfaces. We then present a case study of uncontrolled intersection traversal to demonstrate the efficacy of the proposed framework in validating the interactions of a primary human-driven, autonomous, and connected autonomous vehicle with a secondary semi-autonomous vehicle. The proposed framework has been openly released to guide the future of autonomy-oriented digital twins and research on human-autonomy coexistence.
comment: Accepted at IEEE Conference on Telepresence (TELE) 2024
What's Wrong with the Absolute Trajectory Error? ECCV 2024
One of the limitations of the commonly used Absolute Trajectory Error (ATE) is that it is highly sensitive to outliers. As a result, in the presence of just a few outliers, it often fails to reflect the varying accuracy as the inlier trajectory error or the number of outliers varies. In this work, we propose an alternative error metric for evaluating the accuracy of the reconstructed camera trajectory. Our metric, named Discernible Trajectory Error (DTE), is computed in five steps: (1) Shift the ground-truth and estimated trajectories such that both of their geometric medians are located at the origin. (2) Rotate the estimated trajectory such that it minimizes the sum of geodesic distances between the corresponding camera orientations. (3) Scale the estimated trajectory such that the median distance of the cameras to their geometric median is the same as that of the ground truth. (4) Compute, winsorize and normalize the distances between the corresponding cameras. (5) Obtain the DTE by taking the average of the mean and the root-mean-square (RMS) of the resulting distances. This metric is an attractive alternative to the ATE, in that it is capable of discerning the varying trajectory accuracy as the inlier trajectory error or the number of outliers varies. Using a similar idea, we also propose a novel rotation error metric, named Discernible Rotation Error (DRE), which has similar advantages to the DTE. Furthermore, we propose a simple yet effective method for calibrating the camera-to-marker rotation, which is needed for the computation of our metrics. Our methods are verified through extensive simulations.
comment: The main part of this manuscript (except the part on DRE) has been accepted to ECCV 2024 Workshop HALF-CENTURY OF STRUCTURE-FROM-MOTION (50SFM)
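The five DTE steps listed in the abstract above can be sketched roughly as follows. This is an illustrative approximation under several assumptions, not the authors' implementation: step (2) is skipped because only camera positions are used here (the trajectories are assumed to be rotationally aligned already), the winsorization threshold `cap` is a hypothetical parameter, and normalization is assumed to mean division by `cap`. The geometric median is computed with the standard Weiszfeld iteration.

```python
import numpy as np


def geometric_median(points, iters=200, eps=1e-12):
    """Weiszfeld iteration for the geometric median of an (N, 3) array."""
    y = points.mean(axis=0)
    for _ in range(iters):
        d = np.linalg.norm(points - y, axis=1)
        w = 1.0 / np.maximum(d, eps)  # guard against division by zero
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < eps:
            return y_new
        y = y_new
    return y


def dte_sketch(gt, est, cap):
    """Position-only sketch of steps (1), (3), (4), (5); `cap` is assumed."""
    gt = np.asarray(gt, dtype=float)
    est = np.asarray(est, dtype=float)
    # (1) Shift both trajectories so their geometric medians sit at the origin.
    gt_c = gt - geometric_median(gt)
    est_c = est - geometric_median(est)
    # (2) Rotation alignment from camera orientations is omitted in this sketch.
    # (3) Match the median distance to the geometric median.
    gt_spread = np.median(np.linalg.norm(gt_c, axis=1))
    est_spread = np.median(np.linalg.norm(est_c, axis=1))
    est_c = est_c * (gt_spread / est_spread)
    # (4) Distances between corresponding cameras, clipped at `cap`, scaled to [0, 1].
    d = np.minimum(np.linalg.norm(gt_c - est_c, axis=1), cap) / cap
    # (5) Average of the mean and the RMS of the resulting distances.
    return float(0.5 * (d.mean() + np.sqrt((d ** 2).mean())))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(50, 3))
    est = gt + 0.01 * rng.normal(size=(50, 3))
    est[0] += 100.0  # a single gross outlier
    print(dte_sketch(gt, est, cap=1.0))
```

Because every distance is clipped before averaging, one gross outlier can change the result by only a bounded amount, which is the property the abstract contrasts with the ATE.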
Roadmaps with Gaps over Controllers: Achieving Efficiency in Planning under Dynamics IROS
This paper aims to improve the computational efficiency of motion planning for mobile robots with non-trivial dynamics through the use of learned controllers. It adopts a decoupled strategy, where a system-specific controller is first trained offline in an empty environment to deal with the robot's dynamics. For a target environment, the proposed approach constructs offline a data structure, a "Roadmap with Gaps," to approximately learn how to solve planning queries in this environment using the learned controller. The nodes of the roadmap correspond to local regions. Edges correspond to applications of the learned control policy that approximately connect these regions. Gaps arise because the controller does not perfectly connect pairs of individual states along edges. Online, given a query, a tree sampling-based motion planner uses the roadmap so that the tree's expansion is informed towards the goal region. The tree expansion selects local subgoals given a wavefront on the roadmap that guides towards the goal. When the controller cannot reach a subgoal region, the planner resorts to random exploration to maintain probabilistic completeness and asymptotic optimality. The accompanying experimental evaluation shows that the approach significantly improves the computational efficiency of motion planning on various benchmarks, including physics-based vehicular models on uneven and varying friction terrains as well as a quadrotor under air pressure effects.
comment: To be presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
PCR-99: A Practical Method for Point Cloud Registration with 99 Percent Outliers ECCV 2024
We propose a robust method for point cloud registration that can handle both unknown scales and extreme outlier ratios. Our method, dubbed PCR-99, uses a deterministic 3-point sampling approach with two novel mechanisms that significantly boost the speed: (1) an improved ordering of the samples based on pairwise scale consistency, prioritizing the point correspondences that are more likely to be inliers, and (2) an efficient outlier rejection scheme based on triplet scale consistency, prescreening bad samples and reducing the number of hypotheses to be tested. Our evaluation shows that, up to 98% outlier ratio, the proposed method achieves comparable performance to the state of the art. At 99% outlier ratio, however, it outperforms the state of the art for both known-scale and unknown-scale problems. Especially for the latter, we observe a clear superiority in terms of robustness and speed.
comment: Accepted to ECCV 2024 Workshop on Recovering 6D Object Pose (R6D)
Multiagent Systems
Voronoi-based Multi-Robot Formations for 3D Source Seeking via Cooperative Gradient Estimation
In this paper, we tackle the problem of localizing the source of a three-dimensional signal field with a team of mobile robots able to collect noisy measurements of its strength and share information with each other. The adopted strategy is to cooperatively compute a closed-form estimation of the gradient of the signal field that is then employed to steer the multi-robot system toward the source location. In order to guarantee an accurate and robust gradient estimation, the robots are placed on the surface of a sphere of fixed radius. More specifically, their positions correspond to the generators of a constrained Centroidal Voronoi partition on the spherical surface. We show that, by keeping these specific formations, both crucial geometric properties and a high level of field coverage are simultaneously achieved and that they allow estimating the gradient via simple analytic expressions. We finally provide simulation results to evaluate the performance of the proposed approach, considering both noise-free and noisy measurements. In particular, a comparative analysis shows how its higher robustness against faulty measurements outperforms an alternative state-of-the-art solution.
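To make the idea of cooperative gradient estimation concrete, the sketch below fits a local linear model to measurements taken by robots placed around a common center and reads off the gradient. It uses a generic least-squares fit, which is an assumption for illustration and not the closed-form expression derived in the paper; the function name and the six-robot axis-aligned formation are likewise hypothetical.

```python
import numpy as np


def estimate_gradient(positions, center, measurements):
    """Least-squares gradient of a scalar field from point measurements.

    Fits f(p) ~ f0 + g . (p - center) and returns g.
    """
    offsets = np.asarray(positions, dtype=float) - np.asarray(center, dtype=float)
    design = np.hstack([np.ones((offsets.shape[0], 1)), offsets])
    coef, *_ = np.linalg.lstsq(design, np.asarray(measurements, dtype=float), rcond=None)
    return coef[1:]


if __name__ == "__main__":
    center = np.array([1.0, 2.0, 3.0])
    radius = 0.5
    # Hypothetical formation: six robots on a sphere, one per axis direction.
    directions = np.vstack([np.eye(3), -np.eye(3)])
    positions = center + radius * directions
    # Hypothetical signal field, linear so that the fit is exact.
    field = lambda p: 2.0 * p[:, 0] - 1.0 * p[:, 1] + 0.5 * p[:, 2] + 4.0
    print(estimate_gradient(positions, center, field(positions)))
```

For a nonlinear field the fit is only a local approximation, and its quality depends on the sphere radius and on how evenly the robots cover the sphere.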
Geometric Structure and Polynomial-time Algorithm of Game Equilibriums
Whether a PTAS (polynomial-time approximation scheme) exists for game equilibriums has been an open question, and the absence of this polynomial-time algorithm has implications and consequences in three fields: the practicality of methods in algorithmic game theory, non-stationarity and the curse of multiagency in MARL (multi-agent reinforcement learning), and the tractability of PPAD in computational complexity theory. In this paper, we introduce a geometric object called the equilibrium bundle, which leads to a fundamental leap in the understanding of game equilibriums. Regarding the equilibrium bundle, first, we formalize perfect equilibriums of dynamic games as the zero points of its canonical section; second, we formalize a hybrid iteration of dynamic programming and the interior point method as a line search on it, such that the method is an FPTAS (fully PTAS) for any perfect equilibrium of any dynamic game, implying PPAD=FP; third, we give the existence and oddness theorems of it as an extension of those of Nash equilibriums. As intermediate results, we introduce a concept called the policy cone to give the sufficient and necessary condition for dynamic programming to converge to perfect equilibriums, and introduce two concepts called the unbiased barrier problem and the unbiased KKT conditions to make the interior point method approximate Nash equilibriums. In experiments, the line search process is animated, and the method is tested on 2000 randomly generated dynamic games, where it converges to a perfect equilibrium in every single case.
comment: 25 pages, 5 figures, code and animation are available at https://github.com/shb20tsinghua/PTAS_Game/tree/main
Systems and Control (CS)
Supervised Learning for Stochastic Optimal Control
Supervised machine learning is powerful. In recent years, it has enabled massive breakthroughs in computer vision and natural language processing. But leveraging these advances for optimal control has proved difficult. Data is a key limiting factor. Without access to the optimal policy, value function, or demonstrations, how can we fit a policy? In this paper, we show how to automatically generate supervised learning data for a class of continuous-time nonlinear stochastic optimal control problems. In particular, applying the Feynman-Kac theorem to a linear reparameterization of the Hamilton-Jacobi-Bellman PDE allows us to sample the value function by simulating a stochastic process. Hardware accelerators like GPUs could rapidly generate a large amount of this training data. With this data in hand, stochastic optimal control becomes supervised learning.
comment: CDC 2024
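The sampling idea in the abstract above rests on the Feynman-Kac theorem, which writes the solution of a linear PDE as an expectation over a stochastic process. The sketch below shows only the simplest instance, the backward equation u_t + (1/2) sigma^2 u_xx = 0 with terminal condition u(T, x) = g(x), for which u(0, x) = E[g(x + sigma W_T)]. It is not the paper's reparameterized Hamilton-Jacobi-Bellman equation, and the function name, the parameters, and the quadratic terminal cost are assumptions chosen for illustration.

```python
import numpy as np


def feynman_kac_estimate(g, x0, sigma, horizon, n_samples=200_000, seed=0):
    """Monte Carlo estimate of u(0, x0) = E[g(x0 + sigma * W_T)]."""
    rng = np.random.default_rng(seed)
    # Brownian motion at time T is Gaussian with variance T, so one draw suffices.
    x_terminal = x0 + sigma * np.sqrt(horizon) * rng.standard_normal(n_samples)
    return float(np.mean(g(x_terminal)))


if __name__ == "__main__":
    # Hypothetical terminal cost g(x) = x^2; the exact value is x0^2 + sigma^2 * T.
    estimate = feynman_kac_estimate(lambda x: x ** 2, x0=1.0, sigma=0.5, horizon=2.0)
    print(estimate)
```

Repeating this for many starting states yields (state, value) pairs, which is the kind of labeled data a supervised regression model can be fit to.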
Distributionally Robust Stochastic Data-Driven Predictive Control with Optimized Feedback Gain
We consider the problem of direct data-driven predictive control for unknown stochastic linear time-invariant (LTI) systems with partial state observation. Building upon our previous research on data-driven stochastic control, this paper (i) relaxes the assumption of Gaussian process and measurement noise, and (ii) enables optimization of the gain matrix within the affine feedback policy. Output safety constraints are modelled using conditional value-at-risk, and enforced in a distributionally robust sense. Under idealized assumptions, we prove that our proposed data-driven control method yields control inputs identical to those produced by an equivalent model-based stochastic predictive controller. A simulation study illustrates the enhanced performance of our approach over previous designs.
comment: 8 pages, 1 figure, 2 tables, the first draft of an accepted paper of Conference on Decision and Control (CDC). arXiv admin note: text overlap with arXiv:2312.15177
Almost Global Trajectory Tracking for Quadrotors Using Thrust Direction Control on $\mathcal{S}^2$
Many of the existing works on quadrotor control address the trajectory tracking problem by employing a cascade design in which the translational and rotational dynamics are stabilized by two separate controllers. The stability of the cascade is often proved by employing trajectory-based arguments, most notably, integral input-to-state stability. In this paper, we follow a different route and present a control law ensuring that a composite function constructed from the translational and rotational tracking errors is a Lyapunov function for the closed-loop cascade. In particular, starting from a generic control law for the double integrator, we develop a suitable attitude control extension, by leveraging a backstepping-like procedure. Using this construction, we provide an almost global stability certificate. The proposed design employs the unit sphere $\mathcal{S}^2$ to describe the rotational degrees of freedom required for position control. This enables a simpler controller tuning and an improved tracking performance with respect to previous global solutions. The new design is demonstrated via numerical simulations and on real-world experiments.
Data-driven control of input-affine systems: the role of the signature transform
One of the most challenging tasks in control theory is arguably the design of a regulator for nonlinear systems when the dynamics are unknown. To tackle it, a popular strategy relies on finding a direct map between system responses and the controller, and the key ingredient is a predictor for system outputs trained on past trajectories. Focusing on continuous-time, input-affine systems, we show that the so-called signature transform provides rigorous and practically effective features to represent and predict system trajectories. Building upon such a tool, we propose a novel signature-based control strategy that is promising in view of data-driven predictive control.
comment: Submitted to IEEE Control Systems Letters
On the Convergence of Sigmoid and tanh Fuzzy General Grey Cognitive Maps
Fuzzy General Grey Cognitive Map (FGGCM) and Fuzzy Grey Cognitive Map (FGCM) are extensions of Fuzzy Cognitive Map (FCM) in terms of uncertainty. FGGCM allows for the processing of general grey numbers with multiple intervals, enabling FCM to better address uncertain situations. Although the convergence of FCM and FGCM has been discussed widely in the literature, the convergence of FGGCM has not been thoroughly explored. This paper aims to fill this research gap. First, metrics for the general grey number space and its vector space are given and proved using the Minkowski inequality. By showing that Cauchy sequences are convergent sequences, the completeness of these two spaces is demonstrated. On this premise, the Banach fixed point theorem and the Browder-Göhde-Kirk fixed point theorem, combined with Lagrange's mean value theorem and Cauchy's inequality, are used to deduce sufficient conditions for FGGCM to converge to a unique fixed point when using tanh and sigmoid functions as activation functions. The sufficient conditions for the kernels and greyness of FGGCM to converge to a unique fixed point are also provided separately. Finally, based on the Web Experience and Civil Engineering FCMs, corresponding FGGCMs with sigmoid and tanh activation functions are designed by modifying the weights to general grey numbers. By comparing with the convergence theorems of FCM and FGCM, the effectiveness of the theorems proposed in this paper is verified. It is also demonstrated that the convergence theorems of FCM are special cases of the theorems proposed in this paper. The study of the convergence of FGGCM is of great significance for guiding the learning algorithm of FGGCM, which is needed for designing FGGCM with specific fixed points, and lays a solid theoretical foundation for the application of FGGCM in fields such as control, prediction, and decision support systems.
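The fixed-point argument can be illustrated on an ordinary (crisp) FCM, which the abstract describes as a special case. The sigmoid has a derivative of at most 1/4, so the update x -> sigmoid(W x) is a contraction whenever the spectral norm of W is below 4, and the Banach fixed point theorem then gives a unique fixed point. The sketch below checks this numerically; the weight matrix is hypothetical, the update rule without a self-memory term is an assumption, and the grey-number conditions of the paper are not reproduced.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def iterate_fcm(weights, x0, tol=1e-12, max_iter=10_000):
    """Iterate x <- sigmoid(W x) until successive states differ by less than tol."""
    x = np.asarray(x0, dtype=float)
    for step in range(1, max_iter + 1):
        x_next = sigmoid(weights @ x)
        if np.linalg.norm(x_next - x) < tol:
            return x_next, step
        x = x_next
    return x, max_iter


if __name__ == "__main__":
    # Hypothetical 3-concept map; entry (i, j) is the influence of concept j on i.
    W = np.array([[0.0, 0.5, -0.3],
                  [0.4, 0.0, 0.2],
                  [-0.6, 0.1, 0.0]])
    print("spectral norm:", np.linalg.norm(W, 2))  # below 4, so a contraction
    fixed_a, steps_a = iterate_fcm(W, np.zeros(3))
    fixed_b, steps_b = iterate_fcm(W, np.ones(3))
    print(fixed_a, steps_a)
    print(fixed_b, steps_b)
```

Both starting states reach the same fixed point, as the contraction argument predicts; the condition is sufficient rather than necessary, so maps that violate it may still converge.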
Towards Resilient 6G O-RAN: An Energy-Efficient URLLC Resource Allocation Framework
The demands of ultra-reliable low-latency communication (URLLC) in "NextG" cellular networks necessitate innovative approaches for efficient resource utilisation. The current literature on 6G O-RAN primarily addresses enhanced mobile broadband (eMBB) performance or URLLC latency optimisation individually, often neglecting the intricate balance required to optimise both simultaneously under practical constraints. This paper addresses this gap by proposing a DRL-based resource allocation framework integrated with meta-learning to manage eMBB and URLLC services adaptively. Our approach efficiently allocates heterogeneous network resources, aiming to maximise energy efficiency (EE) while minimising URLLC latency, even under varying environmental conditions. We highlight the critical importance of accurately estimating the traffic distribution flow in the multi-connectivity (MC) scenario, as its uncertainty can significantly degrade EE. The proposed framework demonstrates superior adaptability across different path loss models, outperforming traditional methods and paving the way for more resilient and efficient 6G networks.
comment: This manuscript is being submitted for peer review and potential publication in the IEEE Open Journal of the Communications Society
Adaptive Probabilistic Planning for the Uncertain and Dynamic Orienteering Problem
The Orienteering Problem (OP) is a well-studied routing problem that has been extended to incorporate uncertainties, reflecting stochastic or dynamic travel costs, prize-collection costs, and prizes. Existing approaches may, however, be inefficient in real-world applications due to insufficient modeling knowledge and initially unknowable parameters in online scenarios. Thus, we propose the Uncertain and Dynamic Orienteering Problem (UDOP), modeling travel costs as distributions with unknown and time-variant parameters. UDOP also associates uncertain travel costs with dynamic prizes and prize-collection costs for its objective and budget constraints. To address UDOP, we develop an ADaptive Approach for Probabilistic paThs (ADAPT) that iteratively performs 'execution' and 'online planning' based on an initial 'offline' solution. The execution phase updates system status and records online cost observations. The online planner employs a Bayesian approach to adaptively estimate power consumption and optimize path sequence based on safety beliefs. We evaluate ADAPT in a practical Unmanned Aerial Vehicle (UAV) charging scheduling problem for Wireless Rechargeable Sensor Networks. The UAV must optimize its path to recharge sensor nodes efficiently while managing its energy under uncertain conditions. ADAPT maintains comparable solution quality and computation time while offering superior robustness. Extensive simulations show that ADAPT achieves a 100% Mission Success Rate (MSR) across all tested scenarios, outperforming comparable heuristic-based and frequentist approaches, which fail in up to 70% of cases (under challenging conditions) and average 67% MSR, respectively. This work advances the field of OP with uncertainties, offering a reliable and efficient approach for real-world applications in uncertain and dynamic environments.
Power Control of Converters Connected via an L Filter to a Weak Grid. A Flatness-Based Approach
In this article, a nonlinear strategy based on a flatness approach is used for controlling the instantaneous complex power supplied from the Point of Common Coupling (PCC) to a weak grid. To this end, the strategy introduced by the authors in [1] considering a strong grid is robustified for avoiding system instability when the converter is connected to an unknown grid. The robustification method consists of including a notch filter that estimates the PCC voltage and using it to build the controller (i.e. the measured PCC voltage used to design the control strategy for a strong grid is replaced by the PCC voltage estimated with the notch filter). In addition, before designing the controller, the steady-state stability and safe operation limits when injecting complex instantaneous power to a grid of unknown impedance are analyzed. This analysis is independent of the control strategy, and applies to all power injection schemes. Simulations are presented for showing the performance of the proposed controller in presence of a weak grid.
comment: 7 pages, 5 figures
Design and Implementation of TAO DAQ System
Purpose: The Taishan Antineutrino Observatory (TAO) is a satellite experiment of the Jiangmen Underground Neutrino Observatory (JUNO), also known as JUNO-TAO. Located close to one of the reactors of the Taishan Nuclear Power Plant, TAO will measure the antineutrino energy spectrum precisely as a reference spectrum for JUNO. The data acquisition (DAQ) system is designed to acquire data from the TAO readout electronics and process it with software trigger and data compression algorithms. The data storage bandwidth is limited by the onsite network to be less than 100 Mb/s. Methods: The system is designed based on a distributed architecture, with fully decoupled modules to facilitate customized design and implementation. It is divided into two main components: the data flow system and the online software. The online software serves as the foundation, providing the electronics configuration, the process management, the run control, and the information sharing. The data flow system facilitates continuous data acquisition from various electronic boards or trigger systems, assembles and processes raw data, and ultimately stores it on the disk. Results: The core functionality of the system has been designed and developed. The usability of the data flow system interface and the software trigger results have been verified during the pre-installation testing phase. Conclusion: The DAQ system has been deployed for the TAO experiment. It has also successfully been applied to the integration test of the detector and electronics prototypes.
Distributed Optimization with Finite Bit Adaptive Quantization for Efficient Communication and Precision Enhancement
In realistic distributed optimization scenarios, individual nodes possess only partial information and communicate over bandwidth-constrained channels. For this reason, the development of efficient distributed algorithms is essential. In this paper, we address the challenge of unconstrained distributed optimization. In our scenario, each node's local function exhibits strong convexity with Lipschitz continuous gradients. The exchange of information between nodes occurs through $3$-bit bandwidth-limited channels (i.e., nodes exchange messages represented by only $3$ bits). Our proposed algorithm respects the network's bandwidth constraints by leveraging zoom-in and zoom-out operations to adjust quantizer parameters dynamically. We show that during our algorithm's operation nodes are able to converge to the exact optimal solution. Furthermore, we show that our algorithm achieves a linear convergence rate to the optimal solution. We conclude the paper with simulations that highlight our algorithm's unique characteristics.
comment: arXiv admin note: text overlap with arXiv:2309.04588
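The zoom-in and zoom-out idea mentioned in the abstract above can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' algorithm: a single sender tracks one fixed scalar through a 3-bit uniform quantizer, widening the cell width when the value falls outside the representable range and shrinking it otherwise. The function names, the zoom factors (0.5 and 2.0), and the mid-cell decoding rule are all hypothetical choices made for illustration.

```python
def quantize(value, center, step, bits=3):
    """Map `value` to one of 2**bits cells of width `step` around `center`.

    Returns (index, saturated); `index` always fits in `bits` bits.
    """
    levels = 2 ** bits
    half = levels // 2
    # Cell index relative to the center, shifted so valid indices are 0..levels-1.
    idx = int((value - center) // step) + half
    saturated = idx < 0 or idx >= levels
    idx = min(max(idx, 0), levels - 1)
    return idx, saturated


def dequantize(idx, center, step, bits=3):
    """Decode a cell index to the midpoint of that cell."""
    half = (2 ** bits) // 2
    return center + (idx - half + 0.5) * step


def track(value, center=0.0, step=1.0, bits=3, rounds=30,
          zoom_in=0.5, zoom_out=2.0):
    """Refine an estimate of `value` using only `bits`-bit messages per round."""
    for _ in range(rounds):
        idx, saturated = quantize(value, center, step, bits)
        if saturated:
            # Zoom out: the value is outside the current range, so widen the cells.
            step *= zoom_out
        else:
            # Zoom in: recenter on the decoded cell midpoint and shrink the cells.
            center = dequantize(idx, center, step, bits)
            step *= zoom_in
    return center, step
```

Calling `track(37.3)` starts with a range that is too narrow, zooms out until the value is representable, and then zooms in so that the estimate error shrinks geometrically while every message remains 3 bits long.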
Adaptive Visual Servoing for On-Orbit Servicing
This paper presents an adaptive visual servoing framework for robotic on-orbit servicing (OOS), specifically designed for capturing tumbling satellites. The vision-guided robotic system is capable of selecting optimal control actions in the event of partial or complete vision system failure, particularly in the short term. The autonomous system accounts for physical and operational constraints, executing visual servoing tasks to minimize a cost function. A hierarchical control architecture is developed, integrating a variant of the Iterative Closest Point (ICP) algorithm for image registration, a constrained noise-adaptive Kalman filter, fault detection and recovery logic, and a constrained optimal path planner. The dynamic estimator provides real-time estimates of unknown states and uncertain parameters essential for motion prediction, while ensuring consistency through a set of inequality constraints. It also adjusts the Kalman filter parameters adaptively in response to unexpected vision errors. In the event of vision system faults, a recovery strategy is activated, guided by fault detection logic that monitors the visual feedback via the metric fit error of image registration. The estimated/predicted pose and parameters are subsequently fed into an optimal path planner, which directs the robot's end-effector to the target's grasping point. This process is subject to multiple constraints, including acceleration limits, smooth capture, and line-of-sight maintenance with the target. Experimental results demonstrate that the proposed visual servoing system successfully captured a free-floating object, despite complete occlusion of the vision system.
comment: arXiv admin note: substantial text overlap with arXiv:2209.02156
Distributed Robust Continuous-Time Optimization Algorithms for Time-Varying Constrained Cost
This paper presents a distributed continuous-time optimization framework aimed at overcoming the challenges posed by time-varying cost functions and constraints in multi-agent systems, particularly those subject to disturbances. The framework incorporates log-barrier penalty functions to address inequality constraints and proposes an integral sliding mode control for disturbance mitigation. The algorithm ensures asymptotic tracking of the optimal solution, achieving a tracking error of zero. The convergence of the introduced algorithms is demonstrated through Lyapunov analysis and nonsmooth techniques. Furthermore, the framework's effectiveness is validated through numerical simulations considering two scenarios for the communication networks.
comment: 7 pages, 3 figures, Accepted for publication in the 12th International Conference on Control, Mechatronics and Automation (ICCMA 2024)
Towards Fast Rates for Federated and Multi-Task Reinforcement Learning
We consider a setting involving $N$ agents, where each agent interacts with an environment modeled as a Markov Decision Process (MDP). The agents' MDPs differ in their reward functions, capturing heterogeneous objectives/tasks. The collective goal of the agents is to communicate intermittently via a central server to find a policy that maximizes the average of long-term cumulative rewards across environments. The limited existing work on this topic either provides only asymptotic rates, generates biased policies, or fails to establish any benefits of collaboration. In response, we propose Fast-FedPG, a novel federated policy gradient algorithm with a carefully designed bias-correction mechanism. Under a gradient-domination condition, we prove that our algorithm guarantees (i) fast linear convergence with exact gradients, and (ii) sub-linear rates that enjoy a linear speedup w.r.t. the number of agents with noisy, truncated policy gradients. Notably, in each case, the convergence is to a globally optimal policy with no heterogeneity-induced bias. In the absence of gradient-domination, we establish convergence to a first-order stationary point at a rate that continues to benefit from collaboration.
comment: Accepted to the Decision and Control Conference (CDC), 2024
Developing Trajectory Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
End-to-end approaches with Reinforcement Learning (RL) and Imitation Learning (IL) have gained increasing popularity in autonomous driving. However, they involve neither explicit reasoning, as in the classic robotics workflow, nor planning with horizons, which leads to implicit and myopic strategies. In this paper, we introduce our trajectory planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) bootstrapped by BC for static obstacle nudging. It outputs lateral offset values to adjust the given reference trajectory and produces a modified path for different controllers. Our experimental results show that the algorithm can perform path-tracking that mimics the expert performance and avoid collisions with fixed obstacles through trial and error. This method makes a good attempt at planning with learning-based methods in trajectory planning problems of autonomous driving.
comment: 6 pages, 7 figures
Advanced Energy-Efficient System for Precision Electrodermal Activity Monitoring in Stress Detection
This paper presents a novel Electrodermal Activity (EDA) signal acquisition system, designed to address the challenges of stress monitoring in contemporary society, where stress affects one in four individuals. Our system focuses on enhancing the accuracy and efficiency of EDA measurements, a reliable indicator of stress. Traditional EDA monitoring solutions often grapple with trade-offs between sensor placement, cost, and power consumption, leading to compromised data accuracy. Our innovative design incorporates an adaptive gain mechanism, catering to the broad dynamic range and high-resolution needs of EDA data analysis. The performance of our system was extensively tested through simulations and a custom Printed Circuit Board (PCB), achieving an error rate below 1% and maintaining power consumption at a mere 700 $\mu$A under a 3.7 V power supply. This research contributes significantly to the field of wearable health technology, offering a robust and efficient solution for long-term stress monitoring.
PaRCE: Probabilistic and Reconstruction-Based Competency Estimation for Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can accurately predict which samples are correctly classified, misclassified, or out-of-distribution (OOD). We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
ADMM for Downlink Beamforming in Cell-Free Massive MIMO Systems
In cell-free massive MIMO systems with multiple distributed access points (APs) serving multiple users over the same time-frequency resources, downlink beamforming is done through spatial precoding. Precoding vectors can be optimally designed to use the minimum downlink transmit power while satisfying a quality-of-service requirement for each user. However, existing centralized solutions to beamforming optimization pose challenges such as high communication overhead and processing delay. On the other hand, distributed approaches either require data exchange over the network that scales with the number of antennas or solve the problem for cellular systems where every user is served by only one AP. In this paper, we formulate a multi-user beamforming optimization problem to minimize the total transmit power subject to per-user SINR requirements and propose a distributed optimization algorithm based on the alternating direction method of multipliers (ADMM) to solve it. In our method, every AP solves an iterative optimization problem using its local channel state information. APs only need to share a real-valued vector of interference terms with the size of the number of users. Through simulation results, we demonstrate that our proposed algorithm solves the optimization problem within tens of ADMM iterations and can effectively satisfy per-user SINR constraints.
Bridging Autoencoders and Dynamic Mode Decomposition for Reduced-order Modeling and Control of PDEs
Modeling and controlling complex spatiotemporal dynamical systems driven by partial differential equations (PDEs) often necessitate dimensionality reduction techniques to construct lower-order models for computational efficiency. This paper explores a deep autoencoding learning method for reduced-order modeling and control of dynamical systems governed by spatiotemporal PDEs. We first analytically show that an optimization objective for learning a linear autoencoding reduced-order model can be formulated to yield a solution closely resembling the result obtained through the dynamic mode decomposition with control algorithm. We then extend this linear autoencoding architecture to a deep autoencoding framework, enabling the development of a nonlinear reduced-order model. Furthermore, we leverage the learned reduced-order model to design controllers using stability-constrained deep neural networks. Numerical experiments are presented to validate the efficacy of our approach in both modeling and control using the example of a reaction-diffusion system.
comment: 8 pages, 5 figures. Accepted to IEEE Conference on Decision and Control (CDC 2024)
PEERNet: An End-to-End Profiling Tool for Real-Time Networked Robotic Systems
Networked robotic systems balance compute, power, and latency constraints in applications such as self-driving vehicles, drone swarms, and teleoperated surgery. A core problem in this domain is deciding when to offload a computationally expensive task to the cloud, a remote server, at the cost of communication latency. Task offloading algorithms often rely on precise knowledge of system-specific performance metrics, such as sensor data rates, network bandwidth, and machine learning model latency. While these metrics can be modeled during system design, uncertainties in connection quality, server load, and hardware conditions introduce real-time performance variations, hindering overall performance. We introduce PEERNet, an end-to-end and real-time profiling tool for cloud robotics. PEERNet enables performance monitoring on heterogeneous hardware through targeted yet adaptive profiling of system components such as sensors, networks, deep-learning pipelines, and devices. We showcase PEERNet's capabilities through networked robotics tasks, such as image-based teleoperation of a Franka Emika Panda arm and querying vision language models using an Nvidia Jetson Orin. PEERNet reveals non-intuitive behavior in robotic systems, such as asymmetric network transmission and bimodal language model output. Our evaluation underscores the effectiveness and importance of benchmarking in networked robotics, demonstrating PEERNet's adaptability. Our code is open-source and available at github.com/UTAustin-SwarmLab/PEERNet.
comment: Accepted at IROS 2024
Dynamics modelling and path optimization for the on-orbit assembly of large flexible structures using a multi-arm robot
This paper presents a comprehensive methodology for modeling an on-orbit assembly mission scenario of a large flexible structure using a multi-arm robot. This methodology accounts for significant changes in inertia and flexibility throughout the mission, addressing the problem of coupling dynamics between the robot and the evolving flexible structure during the assembly phase. A three-legged walking robot is responsible for building the structure, with its primary goal being to walk stably on the flexible structure while picking up, carrying and assembling substructure components. To accurately capture the dynamics and interactions of all subsystems in the assembly scenario, various linear fractional representations (LFR) are developed, considering the changing geometrical configuration of the multi-arm robot, the varying flexible dynamics and uncertainties. A path optimization algorithm is proposed for the multi-arm robot, capable of selecting trajectories based on various cost functions related to different performance and stability metrics. The obtained results demonstrate the effectiveness of the proposed modeling methodology and path optimization algorithm.
When Learning Meets Dynamics: Distributed User Connectivity Maximization in UAV-Based Communication Networks
Distributed management over Unmanned Aerial Vehicle (UAV) based communication networks (UCNs) has attracted increasing research attention. In this work, we study a distributed user connectivity maximization problem in a UCN. The work features a horizontal study over different levels of information exchange during the distributed iteration and a consideration of dynamics in the UAV set and user distribution, which are not well addressed in existing works. Specifically, the studied problem is first formulated as a time-coupled mixed-integer non-convex optimization problem. A heuristic two-stage UAV-user association policy is proposed to determine the user connectivity faster. To tackle the NP-hard problem in a scalable manner, the distributed user connectivity maximization algorithm 1 (DUCM-1) is proposed under the multi-agent deep Q learning (MA-DQL) framework. DUCM-1 emphasizes designing different information exchange levels and evaluating how they impact the learning convergence with stationary and dynamic user distributions. To comply with the UAV dynamics, the DUCM-2 algorithm is developed, which is devoted to autonomously handling arbitrary quits and join-ins of UAVs in a considered time horizon. Extensive simulations are conducted i) to conclude that exchanging state information with a deliberate task-specific reward function design yields the best convergence performance, and ii) to show the efficacy and robustness of DUCM-2 against the dynamics.
comment: 12 pages, 12 figures, journal draft
Explainable AI for Engineering Design: A Unified Approach of Systems Engineering and Component-Based Deep Learning Demonstrated by Energy-Efficient Building Design
Data-driven models created by machine learning gain in importance in all fields of design and engineering. They have high potential to assist decision-makers in creating novel artefacts with better performance and sustainability. However, limited generalization and the black-box nature of these models lead to limited explainability and reusability. To overcome this situation, we propose a component-based approach to create partial component models by machine learning (ML). This component-based approach aligns deep learning with systems engineering (SE). The key contribution of the component-based method is that activations at interfaces between the components are interpretable engineering quantities. In this way, the hierarchical component system forms a deep neural network (DNN) that a priori integrates information for engineering explainability. The approach adapts the model structure to engineering methods of systems engineering and to domain knowledge. We examine the performance of the approach in the field of energy-efficient building design: First, we observed better generalization of the component-based method by analyzing prediction accuracy outside the training data. Especially for representative designs different in structure, we observe a much higher accuracy (R2 = 0.94) compared to conventional monolithic methods (R2 = 0.71). Second, we illustrate explainability by exemplarily demonstrating how sensitivity information from SE and rules from low-depth decision trees serve engineering. Third, we evaluate explainability by qualitative and quantitative methods, demonstrating the matching of preliminary knowledge and data-driven derived strategies, and show correctness of activations at component interfaces compared to white-box simulation results (envelope components: R2 = 0.92..0.99; zones: R2 = 0.78..0.93).
comment: 20 pages
A Review of Techniques for Ageing Detection and Monitoring on Embedded Systems
Embedded digital devices are progressively deployed in dependable or safety-critical systems. These devices undergo significant hardware ageing, particularly in harsh environments. This increases their likelihood of failure. It is crucial to understand ageing processes and to detect hardware degradation early for guaranteeing system dependability. In this survey, we review the core ageing mechanisms, identify and categorize general working principles of ageing detection and monitoring techniques for Commercial-Off-The-Shelf (COTS) components that are prevalent in embedded systems: Field Programmable Gate Arrays (FPGAs), microcontrollers, System-on-Chips (SoCs), and their power supplies. From our review, we find that online techniques are more widely applied on FPGAs than on other components, and see a rising trend towards machine learning application for analysing hardware ageing. Based on the reviewed literature, we identify research opportunities and potential directions of interest in the field. With this work, we intend to facilitate future research by systematically presenting all main approaches in a concise way.
Shock waves in nonlinear transmission lines
In the first half of the paper we consider the interaction between small-amplitude travelling waves ("sound") and shock waves in a transmission line containing both nonlinear capacitors and nonlinear inductors. We calculate the "sound" wave coefficient of reflection from (coefficient of transmission through) the shock wave. These coefficients are expressed in terms of the speeds of the "sound" waves relative to the shock and the wave impedances. In the second half of the paper we explicitly take into consideration the dissipation in the system, introducing ohmic resistors shunting the inductors and also in series with the capacitors. This allows us to justify the conditions on the shocks postulated in the first half of the paper. This also allows us to describe the shocks as physical objects of finite width and study their profiles, as well as the profiles of the waves closely connected with the shocks, the kinks. The profiles of the latter, and in some particular cases the profiles of the former, were obtained in terms of elementary functions.
comment: pdfLaTeX, 8 pages, 4 figures. The second half of the paper was edited substantially
Online Residual Learning from Offline Experts for Pedestrian Tracking
In this paper, we consider the problem of predicting unknown targets from data. We propose Online Residual Learning (ORL), a method that combines online adaptation with offline-trained predictions. At a lower level, we employ multiple offline predictions generated before or at the beginning of the prediction horizon. We augment every offline prediction by learning their respective residual error concerning the true target state online, using the recursive least squares algorithm. At a higher level, we treat the augmented lower-level predictors as experts, adopting the Prediction with Expert Advice framework. We utilize an adaptive softmax weighting scheme to form an aggregate prediction and provide guarantees for ORL in terms of regret. We employ ORL to boost performance in the setting of online pedestrian trajectory prediction. Based on data from the Stanford Drone Dataset, we show that ORL can demonstrate best-of-both-worlds performance.
comment: Accepted to CDC 2024, v2: fixed certain typos
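The two levels described in the abstract above (recursive least squares on the residual of an offline prediction, then weighting of the augmented predictors as experts) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the feature vector, the initial covariance, and the fixed learning rate `eta` are hypothetical, and the weighting shown is plain exponential weighting rather than the adaptive softmax scheme the authors describe.

```python
import math


def rls_update(theta, P, phi, target, forgetting=1.0):
    """One recursive least squares step for the model target ~ theta . phi.

    theta, phi: lists of length n. P: n x n symmetric matrix as a list of lists.
    Returns the updated (theta, P).
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    # Prediction error of the current parameters on the new sample.
    error = target - sum(theta[i] * phi[i] for i in range(n))
    theta = [theta[i] + gain[i] * error for i in range(n)]
    # Covariance update; uses the symmetry of P (phi^T P equals (P phi)^T).
    P = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
         for i in range(n)]
    return theta, P


def softmax_weights(cumulative_losses, eta=1.0):
    """Exponential weights over experts: lower cumulative loss, higher weight."""
    best = min(cumulative_losses)  # subtract the minimum for numerical stability
    exps = [math.exp(-eta * (loss - best)) for loss in cumulative_losses]
    total = sum(exps)
    return [e / total for e in exps]


def demo(steps=50):
    """Learn the residual of a biased offline predictor (hypothetical data)."""
    theta = [0.0, 0.0]
    P = [[1000.0, 0.0], [0.0, 1000.0]]
    for t in range(steps):
        truth = 2.0 + 0.6 * t        # true target state
        offline = 0.5 * t            # offline prediction, systematically off
        residual = truth - offline   # equals 2.0 + 0.1 * t
        theta, P = rls_update(theta, P, [1.0, float(t)], residual)
    return theta
```

In this toy setting `demo()` recovers residual parameters close to `[2.0, 0.1]`, so adding the learned residual to the offline prediction reproduces the true target. With several such augmented predictors, `softmax_weights` would combine them according to their accumulated losses.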
Characterizing nonlinear systems with mixed input-output properties through dissipation inequalities
Systems that show different characteristics, such as finite-gain and passivity, depending on the nature of the inputs, are said to possess mixed input-output properties. In this paper, we provide a constructive method for characterizing mixed input-output properties of nonlinear systems using a dissipativity framework. Our results take inspiration from the generalized Kalman-Yakubovich-Popov lemma, and show that a system is mixed if it is dissipative with respect to highly specialized supply rates. The mixed input-output characterization is used for assessing stability of feedback interconnections in which the feedback components violate conditions of classical results such as the small-gain and passivity theorem. We showcase applicability of our results through various examples.
comment: 6 pages
Space-Filling Input Design for Nonlinear State-Space Identification
The quality of a model resulting from (black-box) system identification is highly dependent on the quality of the data that is used during the identification procedure. Designing experiments for linear time-invariant systems is well understood and mainly focuses on the power spectrum of the input signal. Performing experiment design for nonlinear system identification on the other hand remains an open challenge as informativity of the data depends both on the frequency-domain content and on the time-domain evolution of the input signal. Furthermore, as nonlinear system identification is much more sensitive to modeling and extrapolation errors, having experiments that explore the considered operation range of interest is of high importance. Hence, this paper focuses on designing space-filling experiments i.e., experiments that cover the full operation range of interest, for nonlinear dynamical systems that can be represented in a state-space form using a broad set of input signals. The presented experiment design approach can straightforwardly be extended to a wider range of system classes (e.g., NARMAX). The effectiveness of the proposed approach is illustrated on the experiment design for a nonlinear mass-spring-damper system, using a multisine input signal.
comment: Accepted by the 20th IFAC Symposium on System Identification (SYSID2024)
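For readers unfamiliar with the multisine signal class used in the example above, the sketch below generates a random-phase multisine. It illustrates only the input signal class, not the space-filling design procedure of the paper; the function name, the equal amplitudes, and the uniformly random phases are assumptions made for illustration.

```python
import math
import random


def multisine(n_samples, harmonics, sample_rate, amplitude=1.0, seed=0):
    """Random-phase multisine: a sum of cosines at integer multiples of f0.

    f0 = sample_rate / n_samples, so the signal is periodic over the record.
    `harmonics` lists the excited harmonic numbers (positive integers).
    """
    rng = random.Random(seed)
    f0 = sample_rate / n_samples
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in harmonics]
    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        signal.append(sum(
            amplitude * math.cos(2.0 * math.pi * k * f0 * t + phase)
            for k, phase in zip(harmonics, phases)
        ))
    return signal
```

A call such as `multisine(1000, [1, 2, 3, 5, 8], sample_rate=100.0)` excites five frequency lines. Choosing which harmonics to excite, and with which amplitudes and phases, is the kind of degree of freedom an experiment design method can tune.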
Real-Time Ground Fault Detection for Inverter-Based Microgrid Systems
Ground fault detection in inverter-based microgrid (IBM) systems is challenging, particularly in a real-time setting, as the fault current deviates slightly from the nominal value. This difficulty is reinforced when there are partially decoupled disturbances and modeling uncertainties. The conventional solution of installing more relays to obtain additional measurements is costly and also increases the complexity of the system. In this paper, we propose a data-assisted diagnosis scheme based on an optimization-based fault detection filter with the output current as the only measurement. Modeling the microgrid dynamics and the diagnosis filter, we formulate the filter design as a quadratic programming (QP) problem that accounts for decoupling partial disturbances, robustness to non-decoupled disturbances and modeling uncertainties by training with data, and ensuring fault sensitivity simultaneously. To ease the computational effort, we also provide an approximate but analytical solution to this QP. Additionally, we use classical statistical results to provide a thresholding mechanism that enjoys probabilistic false-alarm guarantees. Finally, we implement the IBM system with Simulink and Real Time Digital Simulator (RTDS) to verify the effectiveness of the proposed method through simulations.
comment: 18 pages, 9 figures
Communication and Control Co-Design in 6G: Sequential Decision-Making with LLMs
This article investigates a control system within the context of sixth-generation wireless networks. The control performance optimization confronts the technical challenges that arise from the intricate interactions between communication and control sub-systems, calling for a co-design. Accounting for the system dynamics, we formulate the sequential co-design decision-making of communication and control over the discrete time horizon as a Markov decision process, for which a practical offline learning framework is proposed. Our proposed framework integrates large language models into the elements of reinforcement learning. We present a case study on the age of semantics-aware communication and control co-design to showcase the potential of our proposed learning framework. Furthermore, we discuss the open issues remaining to make our proposed offline learning framework feasible for real-world implementations, and highlight the research directions for future explorations.
Risk-Aware Robotics: Tail Risk Measures in Planning, Control, and Verification
The need for a systematic approach to risk assessment has increased in recent years due to the ubiquity of autonomous systems that alter our day-to-day experiences and their need for safety, e.g., for self-driving vehicles, mobile service robots, and bipedal robots. These systems are expected to function safely in unpredictable environments and interact seamlessly with humans, whose behavior is notably challenging to forecast. We present a survey of risk-aware methodologies for autonomous systems. We adopt a contemporary risk-aware approach to mitigate rare and detrimental outcomes by advocating the use of tail risk measures, a concept borrowed from financial literature. This survey will introduce these measures and explain their relevance in the context of robotic systems for planning, control, and verification applications.
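As a rough illustration of what a tail risk measure computes, the sketch below estimates Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) from a list of sampled losses. It uses one common empirical convention (the tail average includes the VaR sample itself); the function names and the sample data are illustrative assumptions and are not taken from the survey.

```python
import math


def value_at_risk(losses, alpha):
    """Empirical VaR: smallest sample l such that at least a fraction alpha of samples are <= l."""
    ordered = sorted(losses)
    k = math.ceil(alpha * len(ordered))      # how many samples must lie at or below VaR
    k = min(max(k, 1), len(ordered))         # keep the index inside the sample
    return ordered[k - 1]


def conditional_value_at_risk(losses, alpha):
    """Empirical CVaR: mean of all losses at or above the VaR level (the tail average)."""
    var = value_at_risk(losses, alpha)
    tail = [loss for loss in losses if loss >= var]
    return sum(tail) / len(tail)


# Hypothetical loss samples, e.g. costs of simulated robot trajectories.
samples = list(range(1, 11))
print(value_at_risk(samples, 0.5), conditional_value_at_risk(samples, 0.5))
```

Unlike the expected loss, the tail average ignores the benign half of the outcomes, which is why such measures are attractive when rare but harmful events matter most.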
Learning Lyapunov-Stable Polynomial Dynamical Systems through Imitation
Imitation learning is a paradigm to address complex motion planning problems by learning a policy to imitate an expert's behavior. However, relying solely on the expert's data might lead to unsafe actions when the robot deviates from the demonstrated trajectories. Stability guarantees have previously been provided utilizing nonlinear dynamical systems, acting as high-level motion planners, in conjunction with the Lyapunov stability theorem. Yet, these methods are prone to inaccurate policies, high computational cost, sample inefficiency, or quasi stability when replicating complex and highly nonlinear trajectories. To mitigate this problem, we present an approach for learning a globally stable nonlinear dynamical system as a motion planning policy. We model the nonlinear dynamical system as a parametric polynomial and learn the polynomial's coefficients jointly with a Lyapunov candidate. To showcase its success, we compare our method against the state of the art in simulation and conduct real-world experiments with the Kinova Gen3 Lite manipulator arm. Our experiments demonstrate the sample efficiency and reproduction accuracy of our method for various expert trajectories, while remaining stable in the face of perturbations.
comment: In 7th Annual Conference on Robot Learning 2023 Aug 30
Probabilistic Metaplasticity for Continual Learning with Memristors
Edge devices operating in dynamic environments critically need the ability to continually learn without catastrophic forgetting. The strict resource constraints in these devices pose a major challenge to achieve this, as continual learning entails memory and computational overhead. Crossbar architectures using memristor devices offer energy efficiency through compute-in-memory and hold promise to address this issue. However, memristors often exhibit low precision and high variability in conductance modulation, rendering them unsuitable for continual learning solutions that require precise modulation of weight magnitude for consolidation. Current approaches fall short of addressing this challenge directly and rely on auxiliary high-precision memory, leading to frequent memory access, high memory overhead, and energy dissipation. In this research, we propose probabilistic metaplasticity, which consolidates weights by modulating their update probability rather than magnitude. The proposed mechanism eliminates high-precision modification to weight magnitudes and, consequently, the need for auxiliary high-precision memory. We demonstrate the efficacy of the proposed mechanism by integrating probabilistic metaplasticity into a spiking network trained on an error threshold with low-precision memristor weights. Evaluations of continual learning benchmarks show that probabilistic metaplasticity achieves performance equivalent to state-of-the-art continual learning models with high-precision weights while consuming ~ 67% lower memory for additional parameters and up to ~ 60x lower energy during parameter updates compared to an auxiliary memory-based solution. The proposed model shows potential for energy-efficient continual learning with low-precision emerging devices.
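The idea of consolidating a weight by lowering its update probability, rather than shrinking its update magnitude, can be sketched as follows. This is a toy illustration only: the exponential form of the probability, the integer weight levels, and all names (`update_probability`, `probabilistic_update`, `decay`) are assumptions made here, not the rule used in the paper.

```python
import math
import random


def update_probability(meta_state, decay=1.0):
    """Assumed rule: the more consolidated a weight (larger meta_state), the less likely it changes."""
    return math.exp(-decay * meta_state)


def probabilistic_update(weights, directions, meta_states, w_min=-3, w_max=3, rng=random):
    """Apply fixed-size (one level) updates, each accepted with a state-dependent probability."""
    updated = []
    for w, d, m in zip(weights, directions, meta_states):
        # The step size never changes; only the chance of taking the step does.
        if d != 0 and rng.random() < update_probability(m):
            w = min(max(w + d, w_min), w_max)   # clip to the available low-precision levels
        updated.append(w)
    return updated


# A plastic weight (meta_state 0) always moves; a heavily consolidated one effectively never does.
print(probabilistic_update([0, 0], [1, -1], [0.0, 1e9]))
```

Because every accepted update is a single coarse step, no high-precision copy of the weights is needed to accumulate small changes.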
The Dilemma of Electricity Grid Expansion Planning in Areas at the Risk of Wildfire
Utilities consider public safety power shut-offs imperative for the mitigation of wildfire risk. This paper presents expansion planning of a power system under fire hazard weather conditions. The power lines are quantified based on the risk of fire ignition. A 10-year expansion planning scenario is discussed to supply power to customers by considering three decision variables: distributed solar generation; modification of existing power lines; addition of new lines. A two-stage robust optimization problem is formulated and solved using the Column-and-Constraint Generation Algorithm to find an improved balance among de-energization of customers, distributed solar generation, modification of power lines, and addition of new lines. It involves de-energizing lines in high wildfire risk regions and serving the customers by integrating distributed solar generation. The impact of de-energization of lines on distributed solar generation is assessed. The number of hours each line is energized and the total load shedding during a 10-year period are evaluated. Different uncertainty levels for system demand and solar energy integration are considered to find the impact on the total operation cost of the system. The effectiveness of the presented algorithm is evaluated on 6- and 118-bus systems.
Risk-Averse Resilient Operation of Electricity Grid Under the Risk of Wildfire
Wildfires and other extreme weather conditions due to climate change are stressing the aging electrical infrastructure. Power utilities have implemented public safety power shutoffs as a method to mitigate the risk of wildfire by proactively de-energizing some power lines, which leaves customers without power. System operators have to make a compromise between de-energizing power lines to avoid the wildfire risk and energizing those lines to serve the demand. In this work, with a quantified wildfire ignition risk of each line, a resilient operation problem is presented in power systems with a high penetration level of renewable generation resources. A two-stage robust optimization problem is formulated and solved using the column-and-constraint generation algorithm to find an improved balance between the de-energization of power lines and the customers served. Different penetration levels of renewable generation to mitigate the impact of extreme fire hazard situations on the energization of customers are assessed. The validity of the presented robust optimization algorithm is demonstrated on various test cases.
Edge Computing for IoT: Novel Insights from a Comparative Analysis of Access Control Models
IoT edge computing positions computing resources closer to the data sources to reduce the latency, relieve the bandwidth pressure on the cloud, and enhance data security. Nevertheless, data security in IoT edge computing still faces critical threats (e.g., data breaches). Access control is fundamental for mitigating these threats. However, IoT edge computing introduces notable challenges for achieving resource-conserving, low-latency, flexible, and scalable access control. To review recent access control measures, we organize them in a novel way according to different stages of the data lifecycle--data collection, storage, and usage--and also review blockchain technology within this organization. In this way, we provide novel insights and envisage several potential research directions. This survey can help readers find gaps systematically and prompt the development of access control techniques in IoT edge computing under the intricacy of innovations in access control.
Systems and Control (EESS)
Supervised Learning for Stochastic Optimal Control
Supervised machine learning is powerful. In recent years, it has enabled massive breakthroughs in computer vision and natural language processing. But leveraging these advances for optimal control has proved difficult. Data is a key limiting factor. Without access to the optimal policy, value function, or demonstrations, how can we fit a policy? In this paper, we show how to automatically generate supervised learning data for a class of continuous-time nonlinear stochastic optimal control problems. In particular, applying the Feynman-Kac theorem to a linear reparameterization of the Hamilton-Jacobi-Bellman PDE allows us to sample the value function by simulating a stochastic process. Hardware accelerators like GPUs could rapidly generate a large amount of this training data. With this data in hand, stochastic optimal control becomes supervised learning.
comment: CDC 2024
Distributionally Robust Stochastic Data-Driven Predictive Control with Optimized Feedback Gain
We consider the problem of direct data-driven predictive control for unknown stochastic linear time-invariant (LTI) systems with partial state observation. Building upon our previous research on data-driven stochastic control, this paper (i) relaxes the assumption of Gaussian process and measurement noise, and (ii) enables optimization of the gain matrix within the affine feedback policy. Output safety constraints are modelled using conditional value-at-risk, and enforced in a distributionally robust sense. Under idealized assumptions, we prove that our proposed data-driven control method yields control inputs identical to those produced by an equivalent model-based stochastic predictive controller. A simulation study illustrates the enhanced performance of our approach over previous designs.
comment: 8 pages, 1 figure, 2 tables, the first draft of an accepted paper of Conference on Decision and Control (CDC). arXiv admin note: text overlap with arXiv:2312.15177
Almost Global Trajectory Tracking for Quadrotors Using Thrust Direction Control on $\mathcal{S}^2$
Many of the existing works on quadrotor control address the trajectory tracking problem by employing a cascade design in which the translational and rotational dynamics are stabilized by two separate controllers. The stability of the cascade is often proved by employing trajectory-based arguments, most notably, integral input-to-state stability. In this paper, we follow a different route and present a control law ensuring that a composite function constructed from the translational and rotational tracking errors is a Lyapunov function for the closed-loop cascade. In particular, starting from a generic control law for the double integrator, we develop a suitable attitude control extension by leveraging a backstepping-like procedure. Using this construction, we provide an almost global stability certificate. The proposed design employs the unit sphere $\mathcal{S}^2$ to describe the rotational degrees of freedom required for position control. This enables simpler controller tuning and improved tracking performance with respect to previous global solutions. The new design is demonstrated via numerical simulations and in real-world experiments.
Data-driven control of input-affine systems: the role of the signature transform
One of the most challenging tasks in control theory is arguably the design of a regulator for nonlinear systems when the dynamics are unknown. To tackle it, a popular strategy relies on finding a direct map between system responses and the controller, and the key ingredient is a predictor for system outputs trained on past trajectories. Focusing on continuous-time, input-affine systems, we show that the so-called signature transform provides rigorous and practically effective features to represent and predict system trajectories. Building upon such a tool, we propose a novel signature-based control strategy that is promising in view of data-driven predictive control.
comment: Submitted to IEEE Control Systems Letters
On the Convergence of Sigmoid and tanh Fuzzy General Grey Cognitive Maps
Fuzzy General Grey Cognitive Map (FGGCM) and Fuzzy Grey Cognitive Map (FGCM) are extensions of Fuzzy Cognitive Map (FCM) in terms of uncertainty. FGGCM allows for the processing of general grey numbers with multiple intervals, enabling FCM to better address uncertain situations. Although the convergence of FCM and FGCM has been discussed in many works, the convergence of FGGCM has not been thoroughly explored. This paper aims to fill this research gap. First, metrics for the general grey number space and its vector space are given and proved using the Minkowski inequality. By showing that Cauchy sequences are convergent sequences, the completeness of these two spaces is demonstrated. On this premise, using the Banach fixed point theorem and the Browder-Gohde-Kirk fixed point theorem, combined with Lagrange's mean value theorem and Cauchy's inequality, we deduce sufficient conditions for FGGCM to converge to a unique fixed point when using tanh and sigmoid functions as activation functions. The sufficient conditions for the kernels and greyness of FGGCM to converge to a unique fixed point are also provided separately. Finally, based on the Web Experience and Civil Engineering FCMs, we design corresponding FGGCMs with sigmoid and tanh as activation functions by modifying the weights to general grey numbers. By comparing with the convergence theorems of FCM and FGCM, the effectiveness of the theorems proposed in this paper is verified. It is also demonstrated that the convergence theorems of FCM are special cases of the theorems proposed in this paper. The study of the convergence of FGGCM is of great significance for guiding the learning algorithm of FGGCM, which is needed for designing FGGCMs with specific fixed points, and lays a solid theoretical foundation for the application of FGGCM in fields such as control, prediction, and decision support systems.
Towards Resilient 6G O-RAN: An Energy-Efficient URLLC Resource Allocation Framework
The demands of ultra-reliable low-latency communication (URLLC) in "NextG" cellular networks necessitate innovative approaches for efficient resource utilisation. The current literature on 6G O-RAN primarily addresses improved mobile broadband (eMBB) performance or URLLC latency optimisation individually, often neglecting the intricate balance required to optimise both simultaneously under practical constraints. This paper addresses this gap by proposing a DRL-based resource allocation framework integrated with meta-learning to manage eMBB and URLLC services adaptively. Our approach efficiently allocates heterogeneous network resources, aiming to maximise energy efficiency (EE) while minimising URLLC latency, even under varying environmental conditions. We highlight the critical importance of accurately estimating the traffic distribution flow in the multi-connectivity (MC) scenario, as its uncertainty can significantly degrade EE. The proposed framework demonstrates superior adaptability across different path loss models, outperforming traditional methods and paving the way for more resilient and efficient 6G networks.
comment: This manuscript is being submitted for peer review and potential publication in the IEEE Open Journal of the Communications Society
Adaptive Probabilistic Planning for the Uncertain and Dynamic Orienteering Problem
The Orienteering Problem (OP) is a well-studied routing problem that has been extended to incorporate uncertainties, reflecting stochastic or dynamic travel costs, prize-collection costs, and prizes. Existing approaches may, however, be inefficient in real-world applications due to insufficient modeling knowledge and initially unknowable parameters in online scenarios. Thus, we propose the Uncertain and Dynamic Orienteering Problem (UDOP), modeling travel costs as distributions with unknown and time-variant parameters. UDOP also associates uncertain travel costs with dynamic prizes and prize-collection costs for its objective and budget constraints. To address UDOP, we develop an ADaptive Approach for Probabilistic paThs - ADAPT, that iteratively performs 'execution' and 'online planning' based on an initial 'offline' solution. The execution phase updates system status and records online cost observations. The online planner employs a Bayesian approach to adaptively estimate power consumption and optimize path sequence based on safety beliefs. We evaluate ADAPT in a practical Unmanned Aerial Vehicle (UAV) charging scheduling problem for Wireless Rechargeable Sensor Networks. The UAV must optimize its path to recharge sensor nodes efficiently while managing its energy under uncertain conditions. ADAPT maintains comparable solution quality and computation time while offering superior robustness. Extensive simulations show that ADAPT achieves a 100% Mission Success Rate (MSR) across all tested scenarios, outperforming comparable heuristic-based and frequentist approaches, which fail in up to 70% of cases (under challenging conditions) and average 67% MSR, respectively. This work advances the field of OP with uncertainties, offering a reliable and efficient approach for real-world applications in uncertain and dynamic environments.
Power Control of Converters Connected via an L Filter to a Weak Grid. A Flatness-Based Approach
In this article, a nonlinear strategy based on a flatness approach is used for controlling the instantaneous complex power supplied from the Point of Common Coupling (PCC) to a weak grid. To this end, the strategy introduced by the authors in [1] considering a strong grid is robustified to avoid system instability when the converter is connected to an unknown grid. The robustification method consists of including a notch filter that estimates the PCC voltage and using it to build the controller (i.e., the measured PCC voltage used to design the control strategy for a strong grid is replaced by the PCC voltage estimated with the notch filter). In addition, before designing the controller, the steady-state stability and safe operation limits when injecting complex instantaneous power into a grid of unknown impedance are analyzed. This analysis is independent of the control strategy, and applies to all power injection schemes. Simulations are presented to show the performance of the proposed controller in the presence of a weak grid.
comment: 7 pages, 5 figures
Design and Implementation of TAO DAQ System
Purpose: The Taishan Antineutrino Observatory (TAO) is a satellite experiment of the Jiangmen Underground Neutrino Observatory (JUNO), also known as JUNO-TAO. Located close to one of the reactors of the Taishan Nuclear Power Plant, TAO will measure the antineutrino energy spectrum precisely as a reference spectrum for JUNO. The data acquisition (DAQ) system is designed to acquire data from the TAO readout electronics and process it with software trigger and data compression algorithms. The data storage bandwidth is limited by the onsite network to be less than 100 Mb/s. Methods: The system is designed based on a distributed architecture, with fully decoupled modules to facilitate customized design and implementation. It is divided into two main components: the data flow system and the online software. The online software serves as the foundation, providing the electronics configuration, the process management, the run control, and the information sharing. The data flow system facilitates continuous data acquisition from various electronic boards or trigger systems, assembles and processes raw data, and ultimately stores it on the disk. Results: The core functionality of the system has been designed and developed. The usability of the data flow system interface and the software trigger results have been verified during the pre-installation testing phase. Conclusion: The DAQ system has been deployed for the TAO experiment. It has also successfully been applied to the integration test of the detector and electronics prototypes.
Distributed Optimization with Finite Bit Adaptive Quantization for Efficient Communication and Precision Enhancement
In realistic distributed optimization scenarios, individual nodes possess only partial information and communicate over bandwidth-constrained channels. For this reason, the development of efficient distributed algorithms is essential. In this paper, we address the challenge of unconstrained distributed optimization. In our scenario, each node's local function exhibits strong convexity with Lipschitz continuous gradients. The exchange of information between nodes occurs through $3$-bit bandwidth-limited channels (i.e., nodes exchange messages represented by only $3$ bits). Our proposed algorithm respects the network's bandwidth constraints by leveraging zoom-in and zoom-out operations to adjust quantizer parameters dynamically. We show that during our algorithm's operation nodes are able to converge to the exact optimal solution. Furthermore, we show that our algorithm achieves a linear convergence rate to the optimal solution. We conclude the paper with simulations that highlight our algorithm's unique characteristics.
comment: arXiv admin note: text overlap with arXiv:2309.04588
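To give a feel for how zoom-in and zoom-out operations let a 3-bit message carry ever finer information, the sketch below tracks a single fixed scalar over one link. It is a toy illustration and not the distributed optimization algorithm of the paper: the uniform quantizer, the rule of zooming out whenever an edge level is sent, the zoom factors, and all function names are assumptions made here.

```python
def quantize(value, center, half_range, bits=3):
    """Return the index of the nearest of 2**bits uniform levels on [center - half_range, center + half_range]."""
    top = 2 ** bits - 1                          # highest level index (7 for 3 bits)
    step = 2 * half_range / top                  # spacing between adjacent levels
    raw = round((value - center + half_range) / step)
    return min(max(raw, 0), top)                 # clip values that fall outside the window


def dequantize(index, center, half_range, bits=3):
    """Reconstruct the level value from the transmitted index."""
    step = 2 * half_range / (2 ** bits - 1)
    return center - half_range + index * step


def track(value, rounds=30, zoom_in=0.5, zoom_out=2.0, bits=3):
    """Send a fixed scalar with `bits` bits per round while adapting the quantizer window."""
    center, half_range = 0.0, 1.0
    top = 2 ** bits - 1
    for _ in range(rounds):
        index = quantize(value, center, half_range, bits)
        center = dequantize(index, center, half_range, bits)
        # The edge test uses only the transmitted index, so sender and receiver
        # can apply the same window update without extra communication.
        at_edge = index in (0, top)
        half_range *= zoom_out if at_edge else zoom_in
    return center


print(abs(track(0.3) - 0.3) < 1e-6, abs(track(5.0) - 5.0) < 1e-6)
```

In this toy setting the window first grows until it covers the value and then halves every round, so the reconstruction error shrinks geometrically even though each message is only three bits long.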
Adaptive Visual Servoing for On-Orbit Servicing
This paper presents an adaptive visual servoing framework for robotic on-orbit servicing (OOS), specifically designed for capturing tumbling satellites. The vision-guided robotic system is capable of selecting optimal control actions in the event of partial or complete vision system failure, particularly in the short term. The autonomous system accounts for physical and operational constraints, executing visual servoing tasks to minimize a cost function. A hierarchical control architecture is developed, integrating a variant of the Iterative Closest Point (ICP) algorithm for image registration, a constrained noise-adaptive Kalman filter, fault detection and recovery logic, and a constrained optimal path planner. The dynamic estimator provides real-time estimates of unknown states and uncertain parameters essential for motion prediction, while ensuring consistency through a set of inequality constraints. It also adjusts the Kalman filter parameters adaptively in response to unexpected vision errors. In the event of vision system faults, a recovery strategy is activated, guided by fault detection logic that monitors the visual feedback via the metric fit error of image registration. The estimated/predicted pose and parameters are subsequently fed into an optimal path planner, which directs the robot's end-effector to the target's grasping point. This process is subject to multiple constraints, including acceleration limits, smooth capture, and line-of-sight maintenance with the target. Experimental results demonstrate that the proposed visual servoing system successfully captured a free-floating object, despite complete occlusion of the vision system.
comment: arXiv admin note: substantial text overlap with arXiv:2209.02156
Distributed Robust Continuous-Time Optimization Algorithms for Time-Varying Constrained Cost
This paper presents a distributed continuous-time optimization framework aimed at overcoming the challenges posed by time-varying cost functions and constraints in multi-agent systems, particularly those subject to disturbances. The framework incorporates tools such as log-barrier penalty functions to address inequality constraints, and an integral sliding mode control is proposed for disturbance mitigation. The algorithm ensures asymptotic tracking of the optimal solution, achieving a tracking error of zero. The convergence of the introduced algorithms is demonstrated through Lyapunov analysis and nonsmooth techniques. Furthermore, the framework's effectiveness is validated through numerical simulations considering two scenarios for the communication networks.
comment: 7 pages, 3 figures, Accepted for publication in the 12th International Conference on Control, Mechatronics and Automation (ICCMA 2024)
Towards Fast Rates for Federated and Multi-Task Reinforcement Learning
We consider a setting involving $N$ agents, where each agent interacts with an environment modeled as a Markov Decision Process (MDP). The agents' MDPs differ in their reward functions, capturing heterogeneous objectives/tasks. The collective goal of the agents is to communicate intermittently via a central server to find a policy that maximizes the average of long-term cumulative rewards across environments. The limited existing work on this topic either provides only asymptotic rates, generates biased policies, or fails to establish any benefits of collaboration. In response, we propose Fast-FedPG - a novel federated policy gradient algorithm with a carefully designed bias-correction mechanism. Under a gradient-domination condition, we prove that our algorithm guarantees (i) fast linear convergence with exact gradients, and (ii) sub-linear rates that enjoy a linear speedup w.r.t. the number of agents with noisy, truncated policy gradients. Notably, in each case, the convergence is to a globally optimal policy with no heterogeneity-induced bias. In the absence of gradient-domination, we establish convergence to a first-order stationary point at a rate that continues to benefit from collaboration.
comment: Accepted to the Decision and Control Conference (CDC), 2024
Developing Trajectory Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
End-to-end approaches with Reinforcement Learning (RL) and Imitation Learning (IL) have gained increasing popularity in autonomous driving. However, they involve neither explicit reasoning, as in the classic robotics workflow, nor planning with horizons, leading to implicit and myopic strategies. In this paper, we introduce our trajectory planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) bootstrapped by BC for static obstacle nudging. It outputs lateral offset values to adjust the given reference trajectory, and produces a modified path for different controllers. Our experimental results show that the algorithm can perform path-tracking that mimics the expert performance and can avoid collisions with fixed obstacles through trial and error. This method makes a good attempt at planning with learning-based methods in trajectory planning problems of autonomous driving.
comment: 6 pages, 7 figures
Advanced Energy-Efficient System for Precision Electrodermal Activity Monitoring in Stress Detection
This paper presents a novel Electrodermal Activity (EDA) signal acquisition system, designed to address the challenges of stress monitoring in contemporary society, where stress affects one in four individuals. Our system focuses on enhancing the accuracy and efficiency of EDA measurements, a reliable indicator of stress. Traditional EDA monitoring solutions often grapple with trade-offs between sensor placement, cost, and power consumption, leading to compromised data accuracy. Our innovative design incorporates an adaptive gain mechanism, catering to the broad dynamic range and high-resolution needs of EDA data analysis. The performance of our system was extensively tested through simulations and a custom Printed Circuit Board (PCB), achieving an error rate below 1% and maintaining power consumption at a mere 700$\mu$A under a 3.7V power supply. This research contributes significantly to the field of wearable health technology, offering a robust and efficient solution for long-term stress monitoring.
PaRCE: Probabilistic and Reconstruction-Based Competency Estimation for Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can accurately predict correctly classified, misclassified, and out-of-distribution (OOD) samples. We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
ADMM for Downlink Beamforming in Cell-Free Massive MIMO Systems
In cell-free massive MIMO systems with multiple distributed access points (APs) serving multiple users over the same time-frequency resources, downlink beamforming is done through spatial precoding. Precoding vectors can be optimally designed to use the minimum downlink transmit power while satisfying a quality-of-service requirement for each user. However, existing centralized solutions to beamforming optimization pose challenges such as high communication overhead and processing delay. On the other hand, distributed approaches either require data exchange over the network that scales with the number of antennas or solve the problem for cellular systems where every user is served by only one AP. In this paper, we formulate a multi-user beamforming optimization problem to minimize the total transmit power subject to per-user SINR requirements and propose a distributed optimization algorithm based on the alternating direction method of multipliers (ADMM) to solve it. In our method, every AP solves an iterative optimization problem using its local channel state information. APs only need to share a real-valued vector of interference terms with the size of the number of users. Through simulation results, we demonstrate that our proposed algorithm solves the optimization problem within tens of ADMM iterations and can effectively satisfy per-user SINR constraints.
Bridging Autoencoders and Dynamic Mode Decomposition for Reduced-order Modeling and Control of PDEs
Modeling and controlling complex spatiotemporal dynamical systems driven by partial differential equations (PDEs) often necessitate dimensionality reduction techniques to construct lower-order models for computational efficiency. This paper explores a deep autoencoding learning method for reduced-order modeling and control of dynamical systems governed by spatiotemporal PDEs. We first analytically show that an optimization objective for learning a linear autoencoding reduced-order model can be formulated to yield a solution closely resembling the result obtained through the dynamic mode decomposition with control algorithm. We then extend this linear autoencoding architecture to a deep autoencoding framework, enabling the development of a nonlinear reduced-order model. Furthermore, we leverage the learned reduced-order model to design controllers using stability-constrained deep neural networks. Numerical experiments are presented to validate the efficacy of our approach in both modeling and control using the example of a reaction-diffusion system.
comment: 8 pages, 5 figures. Accepted to IEEE Conference on Decision and Control (CDC 2024)
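For readers unfamiliar with dynamic mode decomposition with control (DMDc), the sketch below shows its core least-squares step: fitting a linear model $x_{k+1} \approx A x_k + B u_k$ from snapshot data. It is the plain full-rank variant without the SVD truncation normally used for order reduction, and it is not the autoencoder method of the paper; the function name `dmdc` and the toy system are assumptions chosen for illustration.

```python
import numpy as np


def dmdc(X, X_next, U):
    """Fit x_{k+1} ~ A x_k + B u_k by least squares from column-wise snapshot matrices."""
    omega = np.vstack([X, U])                 # stacked state and input snapshots, shape (n + m, T)
    G = X_next @ np.linalg.pinv(omega)        # G = [A  B], shape (n, n + m)
    n = X.shape[0]
    return G[:, :n], G[:, n:]


# Toy data from a known two-state, single-input linear system (hypothetical values).
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
U = rng.standard_normal((1, 50))
X = np.zeros((2, 51))
X[:, 0] = [1.0, -1.0]
for k in range(50):
    X[:, k + 1] = A_true @ X[:, k] + B_true @ U[:, k]

A_hat, B_hat = dmdc(X[:, :-1], X[:, 1:], U)
print(np.allclose(A_hat, A_true), np.allclose(B_hat, B_true))
```

With noise-free data and sufficiently exciting inputs, the fitted matrices match the system that generated the snapshots.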
PEERNet: An End-to-End Profiling Tool for Real-Time Networked Robotic Systems
Networked robotic systems balance compute, power, and latency constraints in applications such as self-driving vehicles, drone swarms, and teleoperated surgery. A core problem in this domain is deciding when to offload a computationally expensive task to the cloud, a remote server, at the cost of communication latency. Task offloading algorithms often rely on precise knowledge of system-specific performance metrics, such as sensor data rates, network bandwidth, and machine learning model latency. While these metrics can be modeled during system design, uncertainties in connection quality, server load, and hardware conditions introduce real-time performance variations, hindering overall performance. We introduce PEERNet, an end-to-end and real-time profiling tool for cloud robotics. PEERNet enables performance monitoring on heterogeneous hardware through targeted yet adaptive profiling of system components such as sensors, networks, deep-learning pipelines, and devices. We showcase PEERNet's capabilities through networked robotics tasks, such as image-based teleoperation of a Franka Emika Panda arm and querying vision language models using an Nvidia Jetson Orin. PEERNet reveals non-intuitive behavior in robotic systems, such as asymmetric network transmission and bimodal language model output. Our evaluation underscores the effectiveness and importance of benchmarking in networked robotics, demonstrating PEERNet's adaptability. Our code is open-source and available at github.com/UTAustin-SwarmLab/PEERNet.
comment: Accepted at IROS 2024
Dynamics modelling and path optimization for the on-orbit assembly of large flexible structures using a multi-arm robot
This paper presents a comprehensive methodology for modeling an on-orbit assembly mission scenario of a large flexible structure using a multi-arm robot. This methodology accounts for significant changes in inertia and flexibility throughout the mission, addressing the problem of coupling dynamics between the robot and the evolving flexible structure during the assembly phase. A three-legged walking robot is responsible for building the structure, with its primary goal being to walk stably on the flexible structure while picking up, carrying and assembling substructure components. To accurately capture the dynamics and interactions of all subsystems in the assembly scenario, various linear fractional representations (LFR) are developed, considering the changing geometrical configuration of the multi-arm robot, the varying flexible dynamics and uncertainties. A path optimization algorithm is proposed for the multi-arm robot, capable of selecting trajectories based on various cost functions related to different performance and stability metrics. The obtained results demonstrate the effectiveness of the proposed modeling methodology and path optimization algorithm.
When Learning Meets Dynamics: Distributed User Connectivity Maximization in UAV-Based Communication Networks
Distributed management over Unmanned Aerial Vehicle (UAV) based communication networks (UCNs) has attracted increasing research attention. In this work, we study a distributed user connectivity maximization problem in a UCN. The work features a horizontal study over different levels of information exchange during the distributed iteration and a consideration of dynamics in the UAV set and user distribution, which are not well addressed in existing works. Specifically, the studied problem is first formulated as a time-coupled mixed-integer non-convex optimization problem. A heuristic two-stage UAV-user association policy is proposed to determine the user connectivity faster. To tackle the NP-hard problem in a scalable manner, the distributed user connectivity maximization algorithm 1 (DUCM-1) is proposed under the multi-agent deep Q learning (MA-DQL) framework. DUCM-1 emphasizes designing different information exchange levels and evaluating how they impact the learning convergence with stationary and dynamic user distributions. To comply with the UAV dynamics, the DUCM-2 algorithm is developed, which is devoted to autonomously handling arbitrary quits and join-ins of UAVs in a considered time horizon. Extensive simulations are conducted i) to conclude that exchanging state information with a deliberate task-specific reward function design yields the best convergence performance, and ii) to show the efficacy and robustness of DUCM-2 against the dynamics.
comment: 12 pages, 12 figures, journal draft
Explainable AI for Engineering Design: A Unified Approach of Systems Engineering and Component-Based Deep Learning Demonstrated by Energy-Efficient Building Design
Data-driven models created by machine learning gain in importance in all fields of design and engineering. They have high potential to assist decision-makers in creating novel artefacts with better performance and sustainability. However, limited generalization and the black-box nature of these models lead to limited explainability and reusability. To overcome this situation, we propose a component-based approach to create partial component models by machine learning (ML). This component-based approach aligns deep learning with systems engineering (SE). The key contribution of the component-based method is that activations at interfaces between the components are interpretable engineering quantities. In this way, the hierarchical component system forms a deep neural network (DNN) that a priori integrates information for engineering explainability. The approach adapts the model structure to engineering methods of systems engineering and to domain knowledge. We examine the performance of the approach in the field of energy-efficient building design: First, we observed better generalization of the component-based method by analyzing prediction accuracy outside the training data. Especially for representative designs different in structure, we observe a much higher accuracy (R2 = 0.94) compared to conventional monolithic methods (R2 = 0.71). Second, we illustrate explainability by exemplarily demonstrating how sensitivity information from SE and rules from low-depth decision trees serve engineering. Third, we evaluate explainability by qualitative and quantitative methods, demonstrating the matching of preliminary knowledge and data-driven derived strategies, and show correctness of activations at component interfaces compared to white-box simulation results (envelope components: R2 = 0.92..0.99; zones: R2 = 0.78..0.93).
comment: 20 pages
A Review of Techniques for Ageing Detection and Monitoring on Embedded Systems
Embedded digital devices are progressively deployed in dependable or safety-critical systems. These devices undergo significant hardware ageing, particularly in harsh environments. This increases their likelihood of failure. It is crucial to understand ageing processes and to detect hardware degradation early for guaranteeing system dependability. In this survey, we review the core ageing mechanisms, identify and categorize general working principles of ageing detection and monitoring techniques for Commercial-Off-The-Shelf (COTS) components that are prevalent in embedded systems: Field Programmable Gate Arrays (FPGAs), microcontrollers, System-on-Chips (SoCs), and their power supplies. From our review, we find that online techniques are more widely applied on FPGAs than on other components, and see a rising trend towards machine learning application for analysing hardware ageing. Based on the reviewed literature, we identify research opportunities and potential directions of interest in the field. With this work, we intend to facilitate future research by systematically presenting all main approaches in a concise way.
Shock waves in nonlinear transmission lines
In the first half of the paper we consider interaction between the small amplitude travelling waves ("sound") and the shock waves in the transmission line containing both nonlinear capacitors and nonlinear inductors. We calculate the "sound" wave coefficient of reflection from (coefficient of transmission through) the shock wave. These coefficients are expressed in terms of the speeds of the "sound" waves relative to the shock and the wave impedances. In the second half of the paper we explicitly include into consideration the dissipation in the system, introducing ohmic resistors shunting the inductors and also in series with the capacitors. This allows us to justify the conditions on the shocks, postulated in the first half of the paper. This also allows us to describe the shocks as physical objects of finite width and study their profiles, same as the profiles of the waves closely connected with the shocks - the kinks. The profiles of the latter, and in some particular cases the profiles of the former, were obtained in terms of elementary functions.
comment: pdfLaTeX, 8 pages, 4 figures. The second half of the paper was edited substantially
Online Residual Learning from Offline Experts for Pedestrian Tracking
In this paper, we consider the problem of predicting unknown targets from data. We propose Online Residual Learning (ORL), a method that combines online adaptation with offline-trained predictions. At a lower level, we employ multiple offline predictions generated before or at the beginning of the prediction horizon. We augment every offline prediction by learning their respective residual error concerning the true target state online, using the recursive least squares algorithm. At a higher level, we treat the augmented lower-level predictors as experts, adopting the Prediction with Expert Advice framework. We utilize an adaptive softmax weighting scheme to form an aggregate prediction and provide guarantees for ORL in terms of regret. We employ ORL to boost performance in the setting of online pedestrian trajectory prediction. Based on data from the Stanford Drone Dataset, we show that ORL can demonstrate best-of-both-worlds performance.
comment: Accepted to CDC 2024, v2: fixed certain typos
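The two-level structure described in the ORL abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar target, the constant residual feature, the fixed learning rate `eta`, the squared loss, and all function and class names are assumptions made for illustration. The abstract mentions an adaptive softmax weighting scheme, which this sketch replaces with a fixed-rate exponential weighting.

```python
import math


class ScalarRLS:
    """Recursive least squares for a scalar residual model r ~ theta * phi."""

    def __init__(self, forgetting=0.99, p0=100.0):
        self.theta = 0.0      # current residual-model parameter
        self.p = p0           # scalar "covariance" of the estimate
        self.lam = forgetting

    def predict(self, phi):
        return self.theta * phi

    def update(self, phi, residual):
        gain = self.p * phi / (self.lam + phi * self.p * phi)
        self.theta += gain * (residual - self.theta * phi)
        self.p = (self.p - gain * phi * self.p) / self.lam


def softmax_weights(cum_losses, eta):
    """Softmax over negative cumulative losses (shifted for stability)."""
    lowest = min(cum_losses)
    raw = [math.exp(-eta * (loss - lowest)) for loss in cum_losses]
    total = sum(raw)
    return [r / total for r in raw]


def orl_sketch(targets, offline_preds, eta=1.0):
    """Aggregate residual-corrected offline predictions, one step at a time.

    targets: true values revealed after each prediction.
    offline_preds: one list of offline predictions per expert.
    """
    n = len(offline_preds)
    rls = [ScalarRLS() for _ in range(n)]
    cum_losses = [0.0] * n
    aggregated = []
    for t, y in enumerate(targets):
        # Lower level: offline prediction plus the learned residual.
        augmented = [offline_preds[i][t] + rls[i].predict(1.0) for i in range(n)]
        # Higher level: weight the augmented experts by past performance.
        weights = softmax_weights(cum_losses, eta)
        aggregated.append(sum(w * a for w, a in zip(weights, augmented)))
        # The target is revealed: update losses and residual models.
        for i in range(n):
            cum_losses[i] += (augmented[i] - y) ** 2
            rls[i].update(1.0, y - offline_preds[i][t])
    return aggregated
```

With a constant target of 1.0 and two biased offline predictors (constant 0.0 and constant 3.0), the first aggregate is their plain average, and later aggregates approach the target as each residual model learns its predictor's bias.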
Characterizing nonlinear systems with mixed input-output properties through dissipation inequalities
Systems that show different characteristics, such as finite-gain and passivity, depending on the nature of the inputs, are said to possess mixed input-output properties. In this paper, we provide a constructive method for characterizing mixed input-output properties of nonlinear systems using a dissipativity framework. Our results take inspiration from the generalized Kalman-Yakubovich-Popov lemma, and show that a system is mixed if it is dissipative with respect to highly specialized supply rates. The mixed input-output characterization is used for assessing stability of feedback interconnections in which the feedback components violate conditions of classical results such as the small-gain and passivity theorem. We showcase applicability of our results through various examples.
comment: 6 pages
Space-Filling Input Design for Nonlinear State-Space Identification
The quality of a model resulting from (black-box) system identification is highly dependent on the quality of the data that is used during the identification procedure. Designing experiments for linear time-invariant systems is well understood and mainly focuses on the power spectrum of the input signal. Performing experiment design for nonlinear system identification on the other hand remains an open challenge as informativity of the data depends both on the frequency-domain content and on the time-domain evolution of the input signal. Furthermore, as nonlinear system identification is much more sensitive to modeling and extrapolation errors, having experiments that explore the considered operation range of interest is of high importance. Hence, this paper focuses on designing space-filling experiments i.e., experiments that cover the full operation range of interest, for nonlinear dynamical systems that can be represented in a state-space form using a broad set of input signals. The presented experiment design approach can straightforwardly be extended to a wider range of system classes (e.g., NARMAX). The effectiveness of the proposed approach is illustrated on the experiment design for a nonlinear mass-spring-damper system, using a multisine input signal.
comment: Accepted by the 20th IFAC Symposium on System Identification (SYSID2024)
Real-Time Ground Fault Detection for Inverter-Based Microgrid Systems
Ground fault detection in inverter-based microgrid (IBM) systems is challenging, particularly in a real-time setting, as the fault current deviates slightly from the nominal value. This difficulty is reinforced when there are partially decoupled disturbances and modeling uncertainties. The conventional solution of installing more relays to obtain additional measurements is costly and also increases the complexity of the system. In this paper, we propose a data-assisted diagnosis scheme based on an optimization-based fault detection filter with the output current as the only measurement. Modeling the microgrid dynamics and the diagnosis filter, we formulate the filter design as a quadratic programming (QP) problem that accounts for decoupling partial disturbances, robustness to non-decoupled disturbances and modeling uncertainties by training with data, and ensuring fault sensitivity simultaneously. To ease the computational effort, we also provide an approximate but analytical solution to this QP. Additionally, we use classical statistical results to provide a thresholding mechanism that enjoys probabilistic false-alarm guarantees. Finally, we implement the IBM system with Simulink and Real Time Digital Simulator (RTDS) to verify the effectiveness of the proposed method through simulations.
comment: 18 pages, 9 figures
Risk-Aware Robotics: Tail Risk Measures in Planning, Control, and Verification
The need for a systematic approach to risk assessment has increased in recent years due to the ubiquity of autonomous systems that alter our day-to-day experiences and their need for safety, e.g., for self-driving vehicles, mobile service robots, and bipedal robots. These systems are expected to function safely in unpredictable environments and interact seamlessly with humans, whose behavior is notably challenging to forecast. We present a survey of risk-aware methodologies for autonomous systems. We adopt a contemporary risk-aware approach to mitigate rare and detrimental outcomes by advocating the use of tail risk measures, a concept borrowed from financial literature. This survey will introduce these measures and explain their relevance in the context of robotic systems for planning, control, and verification applications.
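As a concrete instance of a tail risk measure, the sketch below estimates conditional value-at-risk (CVaR) from sampled losses by averaging the worst (1 - alpha) fraction of samples. CVaR is chosen here as a common example from the financial literature; the abstract does not say which measures the survey covers, and the function name and the simple empirical estimator are assumptions.

```python
import math


def empirical_cvar(losses, alpha):
    """Average of the worst (1 - alpha) fraction of sampled losses.

    alpha = 0 recovers the plain mean; alpha close to 1 focuses on
    the rare, most harmful outcomes. At least one sample is always used.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses, reverse=True)            # worst outcomes first
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:k]) / k
```

For the losses 1, 2, 3 and 10, the mean is 4, while the estimate at alpha = 0.75 uses only the single worst sample. A planner that minimizes the mean can therefore prefer a policy that a tail-risk-aware planner would reject.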
Learning Lyapunov-Stable Polynomial Dynamical Systems through Imitation
Imitation learning is a paradigm to address complex motion planning problems by learning a policy to imitate an expert's behavior. However, relying solely on the expert's data might lead to unsafe actions when the robot deviates from the demonstrated trajectories. Stability guarantees have previously been provided utilizing nonlinear dynamical systems, acting as high-level motion planners, in conjunction with the Lyapunov stability theorem. Yet, these methods are prone to inaccurate policies, high computational cost, sample inefficiency, or quasi stability when replicating complex and highly nonlinear trajectories. To mitigate this problem, we present an approach for learning a globally stable nonlinear dynamical system as a motion planning policy. We model the nonlinear dynamical system as a parametric polynomial and learn the polynomial's coefficients jointly with a Lyapunov candidate. To showcase its success, we compare our method against the state of the art in simulation and conduct real-world experiments with the Kinova Gen3 Lite manipulator arm. Our experiments demonstrate the sample efficiency and reproduction accuracy of our method for various expert trajectories, while remaining stable in the face of perturbations.
comment: In 7th Annual Conference on Robot Learning 2023 Aug 30
Probabilistic Metaplasticity for Continual Learning with Memristors
Edge devices operating in dynamic environments critically need the ability to continually learn without catastrophic forgetting. The strict resource constraints in these devices pose a major challenge to achieve this, as continual learning entails memory and computational overhead. Crossbar architectures using memristor devices offer energy efficiency through compute-in-memory and hold promise to address this issue. However, memristors often exhibit low precision and high variability in conductance modulation, rendering them unsuitable for continual learning solutions that require precise modulation of weight magnitude for consolidation. Current approaches fall short to address this challenge directly and rely on auxiliary high-precision memory, leading to frequent memory access, high memory overhead, and energy dissipation. In this research, we propose probabilistic metaplasticity, which consolidates weights by modulating their update probability rather than magnitude. The proposed mechanism eliminates high-precision modification to weight magnitudes and, consequently, the need for auxiliary high-precision memory. We demonstrate the efficacy of the proposed mechanism by integrating probabilistic metaplasticity into a spiking network trained on an error threshold with low-precision memristor weights. Evaluations of continual learning benchmarks show that probabilistic metaplasticity achieves performance equivalent to state-of-the-art continual learning models with high-precision weights while consuming ~ 67% lower memory for additional parameters and up to ~ 60x lower energy during parameter updates compared to an auxiliary memory-based solution. The proposed model shows potential for energy-efficient continual learning with low-precision emerging devices.
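The core idea in the abstract above, consolidating a weight by lowering the chance that it is updated rather than shrinking the update itself, can be sketched as follows. This is only an illustration: the exponential form of the update probability, the integer metaplastic state, the four conductance levels, and all names are assumptions, not the paper's model.

```python
import math
import random


def update_probability(meta_state, decay=1.0):
    """Hypothetical rule: a more consolidated weight updates less often."""
    return math.exp(-decay * meta_state)


def maybe_update(weight_level, direction, meta_state, rng, n_levels=4):
    """Apply a fixed one-level step with a state-dependent probability.

    weight_level: integer conductance level in [0, n_levels - 1].
    direction: +1 or -1, the sign requested by the learning rule.
    The step size never changes; only the chance of applying it does.
    """
    if rng.random() < update_probability(meta_state):
        return min(max(weight_level + direction, 0), n_levels - 1)
    return weight_level
```

Because every applied update is a full one-level step, no high-precision copy of the weight is needed to accumulate small changes; the consolidation information lives in the update probability instead.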
The Dilemma of Electricity Grid Expansion Planning in Areas at the Risk of Wildfire
Utilities consider public safety power shut-offs imperative for the mitigation of wildfire risk. This paper presents expansion planning of a power system under fire hazard weather conditions. The power lines are quantified based on the risk of fire ignition. A 10-year expansion planning scenario is discussed to supply power to customers by considering three decision variables: distributed solar generation, modification of existing power lines, and addition of new lines. A two-stage robust optimization problem is formulated and solved using the Column-and-Constraint Generation Algorithm to find an improved balance among de-energization of customers, distributed solar generation, modification of power lines, and addition of new lines. It involves de-energizing lines in high wildfire risk regions and serving the customers by integrating distributed solar generation. The impact of de-energization of lines on distributed solar generation is assessed. The number of hours each line is energized and the total load shedding during a 10-year period are evaluated. Different uncertainty levels for system demand and solar energy integration are considered to find the impact on the total operation cost of the system. The effectiveness of the presented algorithm is evaluated on 6- and 118-bus systems.
Risk-Averse Resilient Operation of Electricity Grid Under the Risk of Wildfire
Wildfires and other extreme weather conditions due to climate change are stressing the aging electrical infrastructure. Power utilities have implemented public safety power shutoffs as a method to mitigate the risk of wildfire by proactively de-energizing some power lines, which leaves customers without power. System operators have to make a compromise between de-energizing power lines to avoid the wildfire risk and energizing those lines to serve the demand. In this work, with a quantified wildfire ignition risk of each line, a resilient operation problem is presented in power systems with a high penetration level of renewable generation resources. A two-stage robust optimization problem is formulated and solved using the column-and-constraint generation algorithm to find an improved balance between the de-energization of power lines and the customers served. Different penetration levels of renewable generation are assessed for their ability to mitigate the impact of extreme fire hazard situations on the energization of customers. The validity of the presented robust optimization algorithm is demonstrated on various test cases.
Edge Computing for IoT: Novel Insights from a Comparative Analysis of Access Control Models
IoT edge computing positions computing resources closer to the data sources to reduce the latency, relieve the bandwidth pressure on the cloud, and enhance data security. Nevertheless, data security in IoT edge computing still faces critical threats (e.g., data breaches). Access control is fundamental for mitigating these threats. However, IoT edge computing introduces notable challenges for achieving resource-conserving, low-latency, flexible, and scalable access control. To review recent access control measures, we organize them according to different data lifecycle stages--data collection, storage, and usage--and review blockchain technology within the same organization. In this way, we provide novel insights and envisage several potential research directions. This survey can help readers find gaps systematically and prompt the development of access control techniques in IoT edge computing under the intricacy of innovations in access control.
Robotics
CARDinality: Interactive Card-shaped Robots with Locomotion and Haptics using Vibration
This paper introduces a novel approach to interactive robots by leveraging the form-factor of cards to create thin robots equipped with vibrational capabilities for locomotion and haptic feedback. The system is composed of flat-shaped robots with on-device sensing and wireless control, which offer lightweight portability and scalability. This research introduces a hardware prototype. Applications include augmented card playing, educational tools, and assistive technology, which showcase CARDinality's versatility in tangible interaction.
comment: Accepted for ACM UIST 2024
AI-Driven Robotic Crystal Explorer for Rapid Polymorph Identification
Crystallisation is an important phenomenon which facilitates the purification as well as structural and bulk phase material characterisation using crystallographic methods. However, different conditions can lead to a vast set of different crystal structure polymorphs and these often exhibit different physical properties, allowing materials to be tailored to specific purposes. The high dimensionality that can result from variations in the conditions which affect crystallisation, and the interaction between them, means that exhaustive exploration is difficult, time-consuming, and costly. Herein we present a robotic crystal search engine for automated, efficient, high-throughput exploration of crystallisation conditions. The system comprises a closed-loop computer crystal-vision system that uses machine learning to both identify crystals and classify their identity in a multiplexed robotic platform. By exploring the formation of a well-known polymorph, we were able to show how a robotic system could be used to efficiently search experimental space as a function of relative polymorph amount and efficiently create a high dimensionality phase diagram with minimal experimental budget and without expensive analytical techniques such as crystallography. In this way, we identify the set of polymorphs possible within a set of experimental conditions, as well as the optimal values of these conditions to grow each polymorph.
comment: 18 pages, 6 figures, 20 references
A Remote Control Painting System for Exterior Walls of High-Rise Buildings through Robotic System
Exterior painting of high-rise buildings is a challenging task. In our country, as well as in other countries of the world, this task is accomplished manually, which is risky and life-threatening for the workers. Researchers and industry experts are trying to find an automatic and robotic solution for the exterior painting of high-rise building walls. In this paper, we propose a solution to this problem. We design and implement a prototype for automatically painting the building walls' exteriors. A spray mechanism was introduced in the prototype that can move in four different directions (up-down and left-right). All the movements are achieved by using microcontroller-operated servo motors. Further, these components create a scope to upgrade the proposed remote-controlled system to a robotic system in the future. In the presented system, all the operations are controlled remotely from a smartphone interface. Bluetooth technology is used for remote communications. It is expected that the suggested system will improve productivity with better workplace safety.
Adaptive Control based Friction Estimation for Tracking Control of Robot Manipulators
Adaptive control is often used for friction compensation in trajectory tracking tasks because it does not require torque sensors. However, it has some drawbacks: first, the most common certainty-equivalence adaptive control design is based on linearized parameterization of the friction model, therefore nonlinear effects, including the stiction and Stribeck effect, are usually omitted. Second, the adaptive control-based estimation can be biased due to non-zero steady-state error. Third, neglecting unknown model mismatch could result in non-robust estimation. This paper proposes a novel linear parameterized friction model capturing the nonlinear static friction phenomenon. Subsequently, an adaptive control-based friction estimator is proposed to reduce the bias during estimation based on backstepping. Finally, we propose an algorithm to generate excitation for robust estimation. Using a KUKA iiwa 14, we conducted trajectory tracking experiments to evaluate the estimated friction model, including random Fourier and drawing trajectories, showing the effectiveness of our methodology in different control schemes.
The Influence of Demographic Variation on the Perception of Industrial Robot Movements
The influence of individual differences on the perception and evaluation of interactions with robots has been researched for decades. Some human demographic characteristics have been shown to affect how individuals perceive interactions with robots. Still, it is to date not clear whether, which and to what extent individual differences influence how we perceive robots, and even less is known about human factors and their effect on the perception of robot movements. In addition, most results on the relevance of individual differences investigate human-robot interactions with humanoid or social robots whereas interactions with industrial robots are underrepresented. We present a literature review on the relationship of robot movements and the influence of demographic variation. Our review reveals a limited comparability of existing findings due to a lack of standardized robot manipulations, various dependent variables used and differing experimental setups including different robot types. In addition, most studies have insufficient sample sizes to derive generalizable results. To overcome these shortcomings, we report the results from a Web-based experiment with 930 participants that studies the effect of demographic characteristics on the evaluation of movement behaviors of an articulated robot arm. Our findings demonstrate that most participants prefer an approach from the side, a large movement range, conventional numbers of rotations, smooth movements and neither fast nor slow movement speeds. Regarding individual differences, most of these preferences are robust to demographic variation, and only gender and age were found to cause slight preference differences between slow and fast movements.
Limiting Computation Levels in Prioritized Trajectory Planning with Safety Guarantees
In prioritized planning for vehicles, vehicles plan trajectories in parallel or in sequence. Parallel prioritized planning offers approximately consistent computation time regardless of the number of vehicles but struggles to guarantee collision-free trajectories. Conversely, sequential prioritized planning can guarantee collision-freeness but results in increased computation time as the number of sequentially computing vehicles, which we term computation levels, grows. This number is determined by the directed coupling graph resulting from the coupling and prioritization of vehicles. In this work, we guarantee safe trajectories in parallel planning through reachability analysis. Although these trajectories are collision-free, they tend to be conservative. We address this by planning with a subset of vehicles in sequence. We formulate the problem of selecting this subset as a graph partitioning problem that allows us to independently set computation levels. Our simulations demonstrate a reduction in computation levels by approximately 64% compared to sequential prioritized planning while maintaining the solution quality.
comment: 8 pages, 4 figures. This is an extended abstract of our previous work published at the 2024 European Control Conference (ECC), June 25-28, 2024. Stockholm, Sweden
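One way to read "computation levels" in the abstract above is as the number of vehicles on the longest chain of the directed coupling graph, where an edge points from a higher-priority vehicle to a coupled lower-priority vehicle that must wait for it. The sketch below computes levels under that assumption; the reading, the edge convention, and the function name are illustrative and not taken from the paper.

```python
from collections import defaultdict, deque


def computation_levels(num_vehicles, edges):
    """Assign each vehicle the earliest level at which it can plan.

    edges: (u, v) pairs meaning vehicle v waits for vehicle u.
    Returns a list of 1-based levels; max(levels) is the number of
    sequential planning steps under the assumption stated above.
    """
    successors = defaultdict(list)
    indegree = [0] * num_vehicles
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    level = [1] * num_vehicles
    queue = deque(i for i in range(num_vehicles) if indegree[i] == 0)
    visited = 0
    while queue:                       # Kahn's topological traversal
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            level[v] = max(level[v], level[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != num_vehicles:
        raise ValueError("coupling graph must be acyclic")
    return level
```

For four vehicles coupled in a chain 0 -> 1 -> 2 -> 3 this gives four levels. Dropping the edge (1, 2), for example because that pair is handled in parallel, leaves two levels, which is the kind of reduction a partitioning of the graph aims for.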
Using vs. Purchasing Industrial Robots: Adding an Organizational Perspective to Industrial HRI
Purpose: Industrial robots allow manufacturing companies to increase productivity and remain competitive. For robots to be used, they must be accepted by operators on the one hand and bought by decision-makers on the other. The roles involved in such organizational processes have very different perspectives. It is therefore essential for suppliers and robot customers to understand these motives so that robots can successfully be integrated on manufacturing shopfloors. Methodology: We present findings of a qualitative study with operators and decision-makers from two Swiss manufacturing SMEs. Using laddering interviews and means-end analysis, we compare operators' and deciders' relevant elements and how these elements are linked to each other on different abstraction levels. These findings represent drivers and barriers to the acquisition, integration and acceptance of robots in the industry. Findings: We present the differing foci of operators and deciders, and how they can be used by demanders as well as suppliers of robots to achieve robot acceptance and deployment. First, we present a list of relevant attributes, consequences and values that constitute robot acceptance and/or rejection. Second, we provide quantified relevancies for these elements, and how they differ between operators and deciders. And third, we demonstrate how the elements are linked with each other on different abstraction levels, and how these links differ between the two groups.
comment: 25 pages, 2 figures, 3 tables
Gesture Generation from Trimodal Context for Humanoid Robots
Natural co-speech gestures are essential components to improve the experience of human-robot interaction (HRI). However, current gesture generation approaches have many limitations, such as not being natural, not aligning with the speech and content, or lacking diverse speaker styles. Therefore, this work aims to reproduce the work by Yoon et al., generating natural gestures in simulation based on tri-modal inputs, and apply this to a robot. During evaluation, "motion variance" and "Frechet Gesture Distance (FGD)" are employed to evaluate the performance objectively. Then, human participants were recruited to subjectively evaluate the gestures. Results show that the movements in that paper have been successfully transferred to the robot and the gestures have diverse styles and are correlated with the speech. Moreover, there is a significant likeability and style difference between different gestures.
HelmetPoser: A Helmet-Mounted IMU Dataset for Data-Driven Estimation of Human Head Motion in Diverse Conditions
Helmet-mounted wearable positioning systems are crucial for enhancing safety and facilitating coordination in industrial, construction, and emergency rescue environments. These systems, including LiDAR-Inertial Odometry (LIO) and Visual-Inertial Odometry (VIO), often face challenges in localization due to adverse environmental conditions such as dust, smoke, and limited visual features. To address these limitations, we propose a novel head-mounted Inertial Measurement Unit (IMU) dataset with ground truth, aimed at advancing data-driven IMU pose estimation. Our dataset captures human head motion patterns using a helmet-mounted system, with data from ten participants performing various activities. We explore the application of neural networks, specifically Long Short-Term Memory (LSTM) and Transformer networks, to correct IMU biases and improve localization accuracy. Additionally, we evaluate the performance of these methods across different IMU data window dimensions, motion patterns, and sensor types. We release a publicly available dataset, demonstrate the feasibility of advanced neural network approaches for helmet-based localization, and provide evaluation metrics to establish a baseline for future studies in this field. Data and code can be found at https://lqiutong.github.io/HelmetPoser.github.io/.
Enhancing Socially-Aware Robot Navigation through Bidirectional Natural Language Conversation
Robot navigation is an important research field with applications in various domains. However, traditional approaches often prioritize efficiency and obstacle avoidance, neglecting a nuanced understanding of human behavior or intent in shared spaces. With the rise of service robots, there's an increasing emphasis on endowing robots with the capability to navigate and interact in complex real-world environments. Socially aware navigation has recently become a key research area. However, existing work either predicts pedestrian movements or simply emits alert signals to pedestrians, falling short of facilitating genuine interactions between humans and robots. In this paper, we introduce the Hybrid Soft Actor-Critic with Large Language Model (HSAC-LLM), an innovative model designed for socially-aware navigation in robots. This model seamlessly integrates deep reinforcement learning with large language models, enabling it to predict both continuous and discrete actions for navigation. Notably, HSAC-LLM facilitates bidirectional interaction based on natural language with pedestrian models. When a potential collision with pedestrians is detected, the robot can initiate or respond to communications with pedestrians, obtaining and executing subsequent avoidance strategies. Experimental results in 2D simulation, the Gazebo environment, and the real-world environment demonstrate that HSAC-LLM not only efficiently enables interaction with humans but also exhibits superior performance in navigation and obstacle avoidance compared to state-of-the-art DRL algorithms. We believe this innovative paradigm opens up new avenues for effective and socially aware human-robot interactions in dynamic environments. Videos are available at https://hsacllm.github.io/.
Heterogeneous LiDAR Dataset for Benchmarking Robust Localization in Diverse Degenerate Scenarios
The ability to estimate pose and generate maps using 3D LiDAR significantly enhances robotic system autonomy. However, existing open-source datasets lack representation of geometrically degenerate environments, limiting the development and benchmarking of robust LiDAR SLAM algorithms. To address this gap, we introduce GEODE, a comprehensive multi-LiDAR, multi-scenario dataset specifically designed to include real-world geometrically degenerate environments. GEODE comprises 64 trajectories spanning over 64 kilometers across seven diverse settings with varying degrees of degeneracy. The data was meticulously collected to promote the development of versatile algorithms by incorporating various LiDAR sensors, stereo cameras, IMUs, and diverse motion conditions. We evaluate state-of-the-art SLAM approaches using the GEODE dataset to highlight current limitations in LiDAR SLAM techniques. This extensive dataset will be publicly available at https://geode.github.io, supporting further advancements in LiDAR-based SLAM.
comment: 15 pages, 9 figures, 6 tables. Submitted for IJRR dataset paper
FLAF: Focal Line and Feature-constrained Active View Planning for Visual Teach and Repeat
This paper presents FLAF, a focal line and feature-constrained active view planning method for tracking failure avoidance in feature-based visual navigation of mobile robots. Our FLAF-based visual navigation is built upon a feature-based visual teach and repeat (VT\&R) framework, which supports many robotic applications by teaching a robot to navigate on various paths that cover a significant portion of daily autonomous navigation requirements. However, tracking failure in feature-based visual simultaneous localization and mapping (VSLAM) caused by textureless regions in human-made environments still limits the adoption of VT\&R in the real world. To address this problem, the proposed view planner is integrated into a feature-based visual SLAM system to build up an active VT\&R system that avoids tracking failure. In our system, a pan-tilt unit (PTU)-based active camera is mounted on the mobile robot. Using FLAF, the active camera-based VSLAM operates during the teaching phase to construct a complete path map and in the repeat phase to maintain stable localization. FLAF orients the robot toward more map points to avoid mapping failures during path learning and toward more feature-identifiable map points beneficial for localization while following the learned trajectory. Experiments in real scenarios demonstrate that FLAF outperforms the methods that do not consider feature-identifiability, and our active VT\&R system performs well in complex environments by effectively dealing with low-texture regions.
Open-vocabulary Temporal Action Localization using VLMs
Video action localization aims to find the timing of a specific action in a long video. Although existing learning-based approaches have been successful, they require annotating videos, which comes with a considerable labor cost. This paper proposes a learning-free, open-vocabulary approach based on emerging off-the-shelf vision-language models (VLMs). The challenge stems from the fact that VLMs are neither designed to process long videos nor tailored for finding actions. We overcome these problems by extending an iterative visual prompting technique. Specifically, we sample video frames into a concatenated image with frame index labels, making a VLM guess a frame that is considered to be closest to the start/end of the action. Iterating this process while narrowing the sampling time window results in finding the specific start and end frames of an action. We demonstrate that this sampling technique yields reasonable results, illustrating a practical extension of VLMs for understanding videos. Sample code is available at https://microsoft.github.io/VLM-Video-Action-Localization/.
comment: 8 pages, 5 figures, 4 tables. Last updated on September 3rd, 2024
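The iterative window-narrowing idea in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the VLM call is replaced by a hypothetical callback `pick_closest` that receives the sampled frame indices and returns the position of the one it judges closest to the action boundary, and the names `samples`, `shrink`, and `min_window` are invented for this sketch.

```python
from typing import Callable, List


def localize_frame(
    num_frames: int,
    pick_closest: Callable[[List[int]], int],
    samples: int = 8,
    shrink: float = 0.5,
    min_window: float = 1.0,
) -> int:
    """Narrow a sampling window around the frame chosen at each round.

    pick_closest stands in for the VLM query (hypothetical): it gets the
    sampled frame indices and returns the position of the chosen one.
    """
    center = (num_frames - 1) / 2.0
    window = float(num_frames - 1)
    while True:
        # Clamp the current window to the valid frame range.
        lo = max(0.0, center - window / 2.0)
        hi = min(num_frames - 1.0, center + window / 2.0)
        # Evenly spaced, de-duplicated frame indices inside the window.
        candidates = sorted(
            {round(lo + (hi - lo) * i / (samples - 1)) for i in range(samples)}
        )
        center = candidates[pick_closest(candidates)]
        if window <= min_window:
            return int(center)
        window *= shrink  # narrow the window for the next round


def make_oracle(target: int) -> Callable[[List[int]], int]:
    """Test double that always picks the candidate nearest to a known frame."""
    return lambda frames: min(range(len(frames)), key=lambda i: abs(frames[i] - target))


start_frame = localize_frame(1000, make_oracle(437))
```

With a real model, `pick_closest` would build the labelled image grid and parse the model's answer; the oracle here only exercises the search loop.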
KOROL: Learning Visualizable Object Feature with Koopman Operator Rollout for Manipulation
Learning dexterous manipulation skills presents significant challenges due to complex nonlinear dynamics that underlie the interactions between objects and multi-fingered hands. Koopman operators have emerged as a robust method for modeling such nonlinear dynamics within a linear framework. However, current methods rely on runtime access to ground-truth (GT) object states, making them unsuitable for vision-based practical applications. Unlike image-to-action policies that implicitly learn visual features for control, we use a dynamics model, specifically the Koopman operator, to learn visually interpretable object features critical for robotic manipulation within a scene. We construct a Koopman operator using object features predicted by a feature extractor and utilize it to auto-regressively advance system states. We train the feature extractor to embed scene information into object features, thereby enabling the accurate propagation of robot trajectories. We evaluate our approach on simulated and real-world robot tasks, with results showing that it outperformed the model-based imitation learning NDP by 1.08$\times$ and the image-to-action Diffusion Policy by 1.16$\times$. The results suggest that our method maintains task success rates with learned features and extends applicability to real-world manipulation without GT object states. Project video and code are available at: \url{https://github.com/hychen-naza/KOROL}.
Resilient Fleet Management for Energy-Aware Intra-Factory Logistics
This paper presents a novel fleet management strategy for battery-powered robot fleets tasked with intra-factory logistics in an autonomous manufacturing facility. In this environment, repetitive material handling operations are subject to real-world uncertainties such as blocked passages, and equipment or robot malfunctions. In such cases, centralized approaches enhance resilience by immediately adjusting the task allocation between the robots. To overcome the computational expense, a two-step methodology is proposed where the nominal problem is solved a priori using a Monte Carlo Tree Search algorithm for task allocation, resulting in a nominal search tree. When a disruption occurs, the nominal search tree is rapidly updated a posteriori with costs to the new problem while simultaneously generating feasible solutions. Computational experiments prove the real-time capability of the proposed algorithm for various scenarios and compare it with the case where the search tree is not used and the decentralized approach that does not attempt task reassignment.
comment: This manuscript was accepted to the 2024 American Control Conference (ACC) which was held from Wednesday through Friday, July 10-12, 2024 in Toronto, ON, Canada. arXiv admin note: text overlap with arXiv:2304.11444
DeRi-IGP: Manipulating Rigid Objects Using Deformable Objects via Iterative Grasp-Pull
Heterogeneous systems manipulation, i.e., manipulating rigid objects via deformable (soft) objects, is an emerging field that remains in its early stages of research. Existing works in this field suffer from limited action and operational space, poor generalization ability, and expensive development. To address these challenges, we propose a universally applicable and effective moving primitive, Iterative Grasp-Pull (IGP), and a sample-based framework, DeRi-IGP, to solve the heterogeneous system manipulation task. The DeRi-IGP framework uses local onboard robots' RGBD sensors to observe the environment, comprising a soft-rigid body system. It then uses this information to iteratively grasp and pull a soft body (e.g., rope) to move the attached rigid body to a desired location. We evaluate the effectiveness of our framework in solving various heterogeneous manipulation tasks and compare its performance with several state-of-the-art baselines. The result shows that DeRi-IGP outperforms other methods by a significant margin. We also evaluate the sim-to-real generalization of our framework through real-world human-robot collaborative goal-reaching and distant object acquisition tasks. Our framework successfully transfers to the real world and demonstrates the advantage of the large operational space of the IGP primitive.
comment: We found we need IRB approval to release the human-involved experiments. So we need to retrieve version 2 of this paper
Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation
Generalising vision-based manipulation policies to novel environments remains a challenging area with limited exploration. Current practices involve collecting data in one location, training imitation learning or reinforcement learning policies with this data, and deploying the policy in the same location. However, this approach lacks scalability as it necessitates data collection in multiple locations for each task. This paper proposes a novel approach where data is collected in a location predominantly featuring green screens. We introduce Green-screen Augmentation (GreenAug), employing a chroma key algorithm to overlay background textures onto a green screen. Through extensive real-world empirical studies with over 850 training demonstrations and 8.2k evaluation episodes, we demonstrate that GreenAug surpasses no augmentation, standard computer vision augmentation, and prior generative augmentation methods in performance. While no algorithmic novelties are claimed, our paper advocates for a fundamental shift in data collection practices. We propose that real-world demonstrations in future research should utilise green screens, followed by the application of GreenAug. We believe GreenAug unlocks policy generalisation to visually distinct novel locations, addressing the current scene generalisation limitations in robot learning.
comment: Project website: https://greenaug.github.io/
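As a rough illustration of the chroma-key step described in the abstract above, the sketch below replaces green-dominant pixels with a background texture. It is a minimal threshold-based stand-in, not the GreenAug algorithm itself; the function name and the `dominance` and `min_green` thresholds are assumptions chosen for this example.

```python
import numpy as np


def chroma_key(image, background, dominance=1.3, min_green=80):
    """Overlay `background` wherever `image` is predominantly green.

    Both arrays are H x W x 3 uint8 RGB images of the same shape.
    The thresholds are illustrative assumptions, not tuned values.
    """
    img = image.astype(np.float32)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    # A pixel counts as green screen if green is bright and clearly
    # dominates both the red and blue channels.
    mask = (g >= min_green) & (g > dominance * r) & (g > dominance * b)
    out = image.copy()
    out[mask] = background[mask]
    return out, mask


# Tiny example: one pure-green pixel and one reddish pixel.
frame = np.array([[[0, 255, 0], [200, 30, 30]]], dtype=np.uint8)
texture = np.full((1, 2, 3), 7, dtype=np.uint8)
augmented, green_mask = chroma_key(frame, texture)
```

A production pipeline would usually work in a hue-based colour space and soften the mask edges; the hard RGB threshold is used here only to keep the example short.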
PO-VINS: An Efficient and Robust Pose-Only Visual-Inertial State Estimator With LiDAR Enhancement
The pose adjustment (PA) with a pose-only visual representation has been proven equivalent to the bundle adjustment (BA), while significantly improving the computational efficiency. However, the pose-only solution has not yet been properly considered in a tightly-coupled visual-inertial state estimator (VISE) with a normal configuration for real-time navigation. In this study, we propose a tightly-coupled LiDAR-enhanced VISE, named PO-VINS, with a full pose-only form for visual and LiDAR-depth measurements to improve efficiency. Based on the pose-only visual representation, we derive the analytical depth uncertainty, which is then employed for culling LiDAR depth outliers. Besides, we propose a multi-state constraint (MSC)-based LiDAR-depth measurement model with the pose-only form to balance efficiency and robustness. The pose-only visual and LiDAR-depth measurements and the IMU-preintegration measurements are tightly integrated under the factor graph optimization framework to perform efficient and accurate state estimation. Exhaustive experimental results on private and public datasets indicate that the proposed PO-VINS yields improved or comparable accuracy to state-of-the-art methods. Compared to the baseline method LE-VINS, the state-estimation efficiency of PO-VINS is improved by 33% and 56% on the laptop PC and the onboard ARM computer, respectively. Besides, PO-VINS yields notably improved robustness by employing the proposed outlier-culling method and the MSC-based measurement model for LiDAR depth.
comment: 17 pages, 13 figures, 8 tables
Contraction Theory for Nonlinear Stability Analysis and Learning-based Control: A Tutorial Overview
Contraction theory is an analytical tool to study differential dynamics of a non-autonomous (i.e., time-varying) nonlinear system under a contraction metric defined with a uniformly positive definite matrix, the existence of which results in a necessary and sufficient characterization of incremental exponential stability of multiple solution trajectories with respect to each other. By using a squared differential length as a Lyapunov-like function, its nonlinear stability analysis boils down to finding a suitable contraction metric that satisfies a stability condition expressed as a linear matrix inequality, indicating that many parallels can be drawn between well-known linear systems theory and contraction theory for nonlinear systems. Furthermore, contraction theory takes advantage of a superior robustness property of exponential stability used in conjunction with the comparison lemma. This yields much-needed safety and stability guarantees for neural network-based control and estimation schemes, without resorting to a more involved method of using uniform asymptotic stability for input-to-state stability. Such distinctive features permit systematic construction of a contraction metric via convex optimization, thereby obtaining an explicit exponential bound on the distance between a time-varying target trajectory and solution trajectories perturbed externally due to disturbances and learning errors. The objective of this paper is therefore to present a tutorial overview of contraction theory and its advantages in nonlinear stability analysis of deterministic and stochastic systems, with an emphasis on deriving formal robustness and stability guarantees for various learning-based and data-driven automatic control methods. In particular, we provide a detailed review of techniques for finding contraction metrics and associated control and estimation laws using deep neural networks.
comment: Annual Reviews in Control, Accepted, Oct. 1st
Multiagent Systems
Enhancing the Performance of Multi-Vehicle Navigation in Unstructured Environments using Hard Sample Mining
Contemporary research in autonomous driving has demonstrated tremendous potential in emulating the traits of human driving. However, existing methods primarily cater to areas with well-built road infrastructure and appropriate traffic management systems. Therefore, in the absence of traffic signals or in unstructured environments, these self-driving algorithms are expected to fail. This paper proposes a strategy for autonomously navigating multiple vehicles in close proximity to their desired destinations without traffic rules in unstructured environments. Graph Neural Networks (GNNs) have demonstrated good utility for this task of multi-vehicle control. Among the different alternatives for training GNNs, supervised methods have proven to be the most data-efficient, although they require ground truth labels. However, these labels may not always be available, particularly in unstructured environments without traffic regulations. Therefore, a tedious optimization process may be required to determine them while ensuring that the vehicles reach their desired destination and do not collide with each other or any obstacles. Therefore, in order to expedite the training process, it is essential to reduce the optimization time and select only those samples for labeling that add the most value to the training. In this paper, we propose a warm start method that first uses a pre-trained model trained on a simpler subset of data. Inference is then done on more complicated scenarios, to determine the hard samples wherein the model faces the greatest predicament. This is measured by the difficulty vehicles encounter in reaching their desired destination without collision. Experimental results demonstrate that mining for hard samples in this manner reduces the requirement for supervised training data 10-fold. Videos and code can be found here: \url{https://yininghase.github.io/multiagent-collision-mining/}.
comment: 9 pages
Towards Multi-agent Policy-based Directed Hypergraph Learning for Traffic Signal Control
Deep reinforcement learning (DRL) methods that incorporate graph neural networks (GNNs) have been extensively studied for intelligent traffic signal control, which aims to coordinate traffic signals effectively across multiple intersections. Despite this progress, the standard graph learning used in these methods still struggles to capture higher-order correlations in real-world traffic flow. In this paper, we propose a multi-agent proximal policy optimization framework, DHG-PPO, which incorporates PPO and a directed hypergraph module to extract the spatio-temporal attributes of the road networks. DHG-PPO enables multiple agents to interact through the dynamic construction of hypergraphs. The effectiveness of DHG-PPO is validated in terms of average travel time and throughput against state-of-the-art baselines through extensive experiments.
Limiting Computation Levels in Prioritized Trajectory Planning with Safety Guarantees
In prioritized planning for vehicles, vehicles plan trajectories in parallel or in sequence. Parallel prioritized planning offers approximately consistent computation time regardless of the number of vehicles but struggles to guarantee collision-free trajectories. Conversely, sequential prioritized planning can guarantee collision-freeness but results in increased computation time as the number of sequentially computing vehicles, which we term computation levels, grows. This number is determined by the directed coupling graph resulting from the coupling and prioritization of vehicles. In this work, we guarantee safe trajectories in parallel planning through reachability analysis. Although these trajectories are collision-free, they tend to be conservative. We address this by planning with a subset of vehicles in sequence. We formulate the problem of selecting this subset as a graph partitioning problem that allows us to independently set computation levels. Our simulations demonstrate a reduction in computation levels by approximately 64% compared to sequential prioritized planning while maintaining the solution quality.
comment: 8 pages, 4 figures. This is an extended abstract of our previous work published at the 2024 European Control Conference (ECC), June 25-28, 2024. Stockholm, Sweden
Resilient Fleet Management for Energy-Aware Intra-Factory Logistics
This paper presents a novel fleet management strategy for battery-powered robot fleets tasked with intra-factory logistics in an autonomous manufacturing facility. In this environment, repetitive material handling operations are subject to real-world uncertainties such as blocked passages, and equipment or robot malfunctions. In such cases, centralized approaches enhance resilience by immediately adjusting the task allocation between the robots. To overcome the computational expense, a two-step methodology is proposed where the nominal problem is solved a priori using a Monte Carlo Tree Search algorithm for task allocation, resulting in a nominal search tree. When a disruption occurs, the nominal search tree is rapidly updated a posteriori with costs to the new problem while simultaneously generating feasible solutions. Computational experiments prove the real-time capability of the proposed algorithm for various scenarios and compare it with the case where the search tree is not used and the decentralized approach that does not attempt task reassignment.
comment: This manuscript was accepted to the 2024 American Control Conference (ACC) which was held from Wednesday through Friday, July 10-12, 2024 in Toronto, ON, Canada. arXiv admin note: text overlap with arXiv:2304.11444
Incentive-Compatible Vertiport Reservation in Advanced Air Mobility: An Auction-Based Approach
Advanced air mobility (AAM) is expected to become a multibillion-dollar industry in the near future. Market-based mechanisms are touted to be an integral part of AAM operations, which comprise heterogeneous operators with private valuations. In this work, we study the problem of designing a mechanism to coordinate the movement of electric vertical take-off and landing (eVTOL) aircraft, operated by multiple operators each having heterogeneous valuations associated with their fleet, between vertiports, while enforcing the arrival, departure, and parking constraints at vertiports. In particular, we propose an incentive-compatible and individually rational vertiport reservation mechanism that maximizes a social welfare metric, which encapsulates the objective of maximizing the overall valuations of all operators while minimizing the congestion at vertiports. Additionally, we improve the computational tractability of designing the reservation mechanism by proposing a mixed binary linear programming approach that is based on constructing a network flow graph corresponding to the underlying problem.
comment: 25 pages, 2 figures, 2 tables
Random walk model that universally generates inverse square Lévy walk by eliminating search cost minimization constraint
The L\'evy walk, a type of random walk characterized by linear step lengths that follow a power-law distribution, is observed in the migratory behaviors of various organisms, ranging from bacteria to humans. Notably, L\'evy walks with power exponents close to two are frequently observed, though their underlying causes remain elusive. This study introduces a simplified, abstract random walk model designed to produce inverse square L\'evy walks, also known as Cauchy walks, and explores the conditions that facilitate these phenomena. In our model, agents move toward a randomly selected destination in multi-dimensional space, and their movement strategy is parameterized by the extent to which they pursue the shortest path. When the search cost is proportional to the distance traveled, this parameter effectively reflects the emphasis on minimizing search costs. Our findings reveal that strict adherence to this cost minimization constraint results in a Brownian walk pattern. However, removing this constraint transitions the movement to an inverse square L\'evy walk. Therefore, by modulating the prioritization of search costs, our model can seamlessly alternate between Brownian and Cauchy walk dynamics. This model has the potential to be utilized for exploring the parameter space of an optimization problem.
Systems and Control (CS)
Energy Internet: A Standardization-Based Blueprint Design
The decarbonization of power and energy systems faces a bottleneck: the enormous number of user-side resources cannot be properly managed and operated by centralized system operators, who used to send dispatch instructions only to a few large power plants. To break through, we need not only new devices and algorithms, but also structural reforms of our energy systems. Taking the Internet as a paradigm, a practicable design of the Energy Internet is presented based on the principle of standardization. A combination of stylized data and energy delivery, referred to as a Block of Energy Exchange (BEE), is designed as the medium to be communicated, which is parsed by the Energy Internet Card. Each Energy Internet Card is assigned a unique MAC address, defining a participant of the Energy Internet, whose standardized profile will be automatically updated according to BEE transfers without the intervention of any centralized operator. The structure of the Energy Internet and its protocols supporting the transfer of BEE are presented. System operators will become Energy Internet Service Providers, who operate the energy system by flow control and dispatching centralized resources, which is decoupled from users' behaviors in the Energy Internet. An example shows that the Energy Internet not only reduces carbon emissions via interactions between peers, but also promotes energy democracy and narrows the gap in energy equity.
Difference Between Cyclic and Distributed Approach in Stochastic Optimization for Multi-agent System
Many stochastic optimization problems in multi-agent systems can be decomposed into smaller subproblems or reduced decision subspaces. The cyclic and distributed approaches are two widely used strategies for solving such problems. In this manuscript, we review four existing methods for addressing these problems and compare them based on their suitable problem frameworks and update rules.
Large-scale road network partitioning: a deep learning method based on convolutional autoencoder model
With the development of urbanization, the scale of urban road networks continues to expand, especially in some Asian countries. Short-term traffic state prediction is one of the foundations of traffic management and control. Constrained by the space-time cost of computation, short-term traffic state prediction for large-scale urban road networks is difficult. One way to solve this problem is to partition the whole network into multiple sub-networks and predict the traffic state of each separately. In this study, a deep learning method is proposed for road network partitioning. The method mainly includes three steps. First, the daily speed series for roads are encoded into a matrix. Second, a convolutional autoencoder (AE) is built to extract series features and compress data. Third, a spatial hierarchical clustering method with adjacency relationships is applied to the road network. The proposed method was verified on the road network of Shenzhen, which contains more than 5000 links. The results show that AE-hierarchical clustering distinguishes the tidal traffic characteristics and reflects the process of congestion propagation. Furthermore, two indicators are designed to make a quantitative comparison with spectral clustering: intra homogeneity increases by about 9% on average, while inter heterogeneity increases by about 9.5%. Compared with past methods, the time cost also decreases. The results may suggest ways to improve the management and control of urban road networks in other metropolitan cities. The proposed method is expected to be extended to other problems related to large-scale networks.
Nonlinear Cooperative Output Regulation with Input Delay Compensation
This paper investigates the cooperative output regulation (COR) of nonlinear multi-agent systems (MASs) with long input delay based on a periodic event-triggered mechanism. Compared with other mechanisms, periodic event-triggered control can automatically guarantee Zeno-free behavior and avoid the continuous monitoring of triggering conditions. First, a new periodic event-triggered distributed observer, which is based on the fully asynchronous communication data, is proposed to estimate the leader information. Second, a new distributed predictor feedback control method is proposed for the considered nonlinear MASs with input delay. By coordinate transformation, the MASs are mapped into new coupled ODE-PDE target systems with some disturbance-like terms. Then, we show that the COR problem is solvable. Finally, to further save communication resources, a periodic event-triggered mechanism is considered in the sensor-to-controller transmission in every agent. A new periodic event-triggered filter is proposed to deal with the periodic event-triggered feedback data. The MASs with input delay are mapped into coupled ODE-PDE target systems with sampled data information. Then, Lyapunov-Krasovskii functions are constructed to demonstrate the exponential stability of the MASs. Simulations verify the validity of the proposed results.
comment: Accepted by IEEE Trans. Automatic Control
Decentralized Control of Multi-Agent Systems Under Acyclic Spatio-Temporal Task Dependencies
We introduce a novel distributed sampled-data control method tailored for heterogeneous multi-agent systems under a global spatio-temporal task with acyclic dependencies. Specifically, we consider the global task as a conjunction of independent and collaborative tasks, defined over the absolute and relative states of agent pairs. Task dependencies in this form are then represented by a task graph, which we assume to be acyclic. From the given task graph, we provide an algorithmic approach to define a distributed sampled-data controller prioritizing the fulfilment of collaborative tasks as the primary objective, while fulfilling independent tasks unless they conflict with collaborative ones. Moreover, communication maintenance among collaborating agents is seamlessly enforced within the proposed control framework. A numerical simulation is provided to showcase the potential of our control framework.
comment: Short version of this paper was accepted for the Conference on Decision and Control
Exploring the Optimal Size of Grid-forming Energy Storage in an Off-grid Renewable P2H System under Multi-timescale Energy Management
Utility-scale off-grid renewable power-to-hydrogen systems (OReP2HSs) typically include photovoltaic plants, wind turbines, electrolyzers (ELs), and energy storage systems. As an island system, OReP2HS requires at least one component, generally the battery energy storage system (BESS), that operates for grid-forming control to provide frequency and voltage references and regulate them through transient power support and short-term energy balance regulation. While larger BESS capacity increases this ability, it also raises investment costs. This paper proposes a framework of layered multi-timescale energy management system (EMS) and evaluates the most cost-effective size of the grid-forming BESS in the OReP2HS. The proposed EMS covers the timescales ranging from those for power system transient behaviors to intra-day scheduling, coordinating renewable power, BESS, and ELs. Then, an iterative search procedure based on high-fidelity simulation is employed to determine the size of the BESS with minimal levelized cost of hydrogen (LCOH). Simulations over a reference year, based on the data from a planned OReP2HS project in Inner Mongolia, China, show that with the proposed EMS, the base-case optimal LCOH is 33.212 CNY/kg (4.581 USD/kg). The capital expenditure of the BESS accounts for 17.83% of the total, and the optimal BESS size accounts for 13.6% of the rated hourly energy output of power sources. Sensitivity analysis reveals that by reducing the electrolytic load adjustment time step from 90 to 5 s and increasing its ramping limit from 1% to 10% rated power per second, the BESS size decreases by 53.57%, and the LCOH decreases to 25.458 CNY/kg (3.511 USD/kg). Considering the cost of designing and manufacturing utility-scale ELs with fast load regulation capability, a load adjustment time step of 5-10 s and a ramping limit of 4-6% rated power per second are recommended.
On final opinions of the Friedkin-Johnsen model over random graphs with partially stubborn community
This paper studies the formation of final opinions for the Friedkin-Johnsen (FJ) model with a community of partially stubborn agents. The underlying network of the FJ model is symmetric and generated from a random graph model, in which each link is added independently from a Bernoulli distribution. It is shown that the final opinions of the FJ model will concentrate around those of an FJ model over the expected graph as the network size grows, on the condition that the stubborn agents are well connected to other agents. Probability bounds are proposed for the distance between these two final opinion vectors, both for the case where non-stubborn agents exist and for the case where they do not. Numerical experiments are provided to illustrate the theoretical findings. The simulation shows that, in the presence of non-stubborn agents, the link probability between the stubborn and the non-stubborn communities affects the distance between the two final opinion vectors significantly. Additionally, if all agents are stubborn, the opinion distance decreases with the agent stubbornness.
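The concentration claim can be illustrated numerically. The following is a minimal sketch, not the paper's experiment: it assumes the common FJ update x_i <- (1 - lam_i) * (neighbor average) + lam_i * x_i(0), degree-normalized weights, and a complete graph with uniform weights as the expected graph. The sizes `n`, `m`, the link probability `p`, the stubbornness value 0.5, and all function names are illustrative choices.

```python
import random

def fj_final_opinions(neighbors, stubbornness, x0, tol=1e-12, max_iter=5000):
    """Iterate the FJ update until the opinions stop changing."""
    x = list(x0)
    for _ in range(max_iter):
        new_x = []
        for i, nbrs in enumerate(neighbors):
            # An isolated agent simply keeps its current opinion as the "average".
            avg = sum(x[j] for j in nbrs) / len(nbrs) if nbrs else x[i]
            new_x.append((1.0 - stubbornness[i]) * avg + stubbornness[i] * x0[i])
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x

def random_symmetric_graph(n, p, rng):
    """Undirected graph in which each link is present independently with probability p."""
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors

rng = random.Random(0)
n, m, p = 150, 40, 0.5                      # agents, stubborn agents, link probability
x0 = [rng.random() for _ in range(n)]       # initial opinions in [0, 1]
lam = [0.5] * m + [0.0] * (n - m)           # partially stubborn community, then non-stubborn

sampled = random_symmetric_graph(n, p, rng)
expected = [[j for j in range(n) if j != i] for i in range(n)]  # uniform weights

x_sampled = fj_final_opinions(sampled, lam, x0)
x_expected = fj_final_opinions(expected, lam, x0)
distance = max(abs(a - b) for a, b in zip(x_sampled, x_expected))
print(f"max opinion distance: {distance:.4f}")
```

Rerunning with larger `n` or a different `p` gives a rough feel for how the distance between the two final opinion vectors behaves. The probability bounds themselves are in the paper and are not reproduced here.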
An Analysis of Logit Learning with the r-Lambert Function
The well-known replicator equation in evolutionary game theory describes how population-level behaviors change over time when individuals make decisions using simple imitation learning rules. In this paper, we study evolutionary dynamics based on a fundamentally different class of learning rules known as logit learning. Numerous previous studies on logit dynamics provide numerical evidence of bifurcations of multiple fixed points for several types of games. Our results here provide a more explicit analysis of the logit fixed points and their stability properties for the entire class of two-strategy population games -- by way of the $r$-Lambert function. We find that for Prisoner's Dilemma and anti-coordination games, there is only a single fixed point for all rationality levels. However, coordination games exhibit a pitchfork bifurcation: there is a single fixed point in a low-rationality regime, and three fixed points in a high-rationality regime. We provide an implicit characterization for the level of rationality where this bifurcation occurs. In all cases, the set of logit fixed points converges to the full set of Nash equilibria in the high rationality limit.
comment: 9 pages, one figure, to be included in CDC 2024 conference proceedings
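The pitchfork bifurcation in coordination games can be checked numerically. This minimal sketch does not use the $r$-Lambert function from the paper; it substitutes plain bisection on the logit fixed-point equation x = 1 / (1 + exp(-beta * d(x))). The payoff difference d(x) = 2x - 1 (a symmetric coordination game), the grid size, and the function name are assumptions made for illustration. For this game the slope of the logit map at x = 1/2 is beta / 2, so the bifurcation sits at beta = 2.

```python
import math

def logit_fixed_points(beta, gain=2.0, offset=1.0, cells=1001, iters=80):
    """Roots of g(x) = sigmoid(beta * (gain * x - offset)) - x on [0, 1]."""
    def g(x):
        return 1.0 / (1.0 + math.exp(-beta * (gain * x - offset))) - x

    roots = []
    for i in range(cells):
        lo, hi = i / cells, (i + 1) / cells
        g_lo, g_hi = g(lo), g(hi)
        if g_lo == 0.0:
            roots.append(lo)            # grid point is itself a fixed point
        elif g_lo * g_hi < 0.0:
            # Sign change inside the cell: refine by bisection.
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                if g_lo * g(mid) <= 0.0:
                    hi = mid
                else:
                    lo, g_lo = mid, g(mid)
            roots.append(0.5 * (lo + hi))
    return roots

low_rationality = logit_fixed_points(beta=1.0)   # below the bifurcation
high_rationality = logit_fixed_points(beta=4.0)  # above the bifurcation
print(len(low_rationality), len(high_rationality))
```

As `beta` grows, the three roots move toward 0, 1/2, and 1, which are the Nash equilibria of this particular coordination game.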
Limiting Computation Levels in Prioritized Trajectory Planning with Safety Guarantees
In prioritized planning for vehicles, vehicles plan trajectories in parallel or in sequence. Parallel prioritized planning offers approximately consistent computation time regardless of the number of vehicles but struggles to guarantee collision-free trajectories. Conversely, sequential prioritized planning can guarantee collision-freeness but results in increased computation time as the number of sequentially computing vehicles, which we term computation levels, grows. This number is determined by the directed coupling graph resulting from the coupling and prioritization of vehicles. In this work, we guarantee safe trajectories in parallel planning through reachability analysis. Although these trajectories are collision-free, they tend to be conservative. We address this by planning with a subset of vehicles in sequence. We formulate the problem of selecting this subset as a graph partitioning problem that allows us to independently set computation levels. Our simulations demonstrate a reduction in computation levels by approximately 64% compared to sequential prioritized planning while maintaining the solution quality.
comment: 8 pages, 4 figures. This is an extended abstract of our previous work published at the 2024 European Control Conference (ECC), June 25-28, 2024. Stockholm, Sweden
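One way to read "computation levels" is as the number of vertices on the longest path of the directed acyclic coupling graph. The sketch below follows that reading, which is an interpretation of the abstract and not the paper's partitioning algorithm. The edge convention (i, j), meaning vehicle j waits for higher-priority vehicle i, and the function name are assumptions.

```python
from collections import deque

def computation_levels(num_vehicles, edges):
    """Level of each vehicle: 1 + the longest chain of predecessors it must wait for."""
    successors = [[] for _ in range(num_vehicles)]
    indegree = [0] * num_vehicles
    for i, j in edges:
        successors[i].append(j)
        indegree[j] += 1

    level = [1] * num_vehicles
    queue = deque(v for v in range(num_vehicles) if indegree[v] == 0)
    visited = 0
    while queue:                      # Kahn's topological traversal
        v = queue.popleft()
        visited += 1
        for w in successors[v]:
            level[w] = max(level[w], level[v] + 1)
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if visited != num_vehicles:
        raise ValueError("coupling graph must be acyclic")
    return level

# Five vehicles coupled in a chain: fully sequential planning needs five levels.
chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
levels_sequential = computation_levels(5, chain)

# Dropping the coupling (2, 3), e.g. because that pair plans in parallel,
# splits the chain and caps the number of levels at three.
levels_limited = computation_levels(5, [(0, 1), (1, 2), (3, 4)])
print(max(levels_sequential), max(levels_limited))
```

Which couplings can be dropped safely is the question the paper answers with reachability analysis and graph partitioning; the sketch only shows how the level count reacts once that choice has been made.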
Cooperative Learning-Based Framework for VNF Caching and Placement Optimization over Low Earth Orbit Satellite Networks
Low Earth Orbit Satellite Networks (LSNs) are integral to supporting a broad range of modern applications, which are typically modeled as Service Function Chains (SFCs). Each SFC is composed of Virtual Network Functions (VNFs), where each VNF performs a specific task. In this work, we tackle two key challenges in deploying SFCs across an LSN. Firstly, we aim to optimize the long-term system performance by minimizing the average end-to-end SFC execution delay, given that each satellite comes with a pre-installed/cached subset of VNFs. To achieve optimal SFC placement, we formulate an offline Dynamic Programming (DP) equation. To overcome the challenges associated with DP, such as its complexity, the need for probability knowledge, and centralized decision-making, we put forth an online Multi-Agent Q-Learning (MAQL) solution. Our MAQL approach addresses convergence issues in the non-stationary LSN environment by enabling satellites to share learning parameters and update their Q-tables based on distinct rules for their selected actions. Secondly, to determine the optimal VNF subsets for satellite caching, we develop a Bayesian Optimization (BO)-based learning mechanism that operates both offline and continuously in the background during runtime. Extensive experiments demonstrate that our MAQL approach achieves near-optimal performance comparable to the DP model and significantly outperforms existing baselines. Moreover, the BO-based approach effectively enhances the request serving rate over time.
comment: 40 pages, 11 figures, 3 tables
A Performance Bound for the Greedy Algorithm in a Generalized Class of String Optimization Problems
We present a simple performance bound for the greedy scheme in string optimization problems that obtains strong results. Our approach vastly generalizes the group of previously established greedy curvature bounds by Conforti and Cornu\'{e}jols (1984). We consider three constants, $\alpha_G$, $\alpha_G'$, and $\alpha_G''$ introduced by Conforti and Cornu\'{e}jols (1984), that are used in performance bounds of greedy schemes in submodular set optimization. We first generalize both of the $\alpha_G$ and $\alpha_G''$ bounds to string optimization problems in a manner that includes maximizing submodular set functions over matroids as a special case. We then derive a much simpler and computable bound that allows for applications to a far more general class of functions with string domains. We prove that our bound is superior to both the $\alpha_G$ and $\alpha_G''$ bounds and provide a counterexample to show that the $\alpha_G'$ bound is incorrect under the assumptions in Conforti and Cornu\'{e}jols (1984). We conclude with two applications. The first is an application of our result to sensor coverage problems. We demonstrate our performance bound in cases where the objective function is set submodular and string submodular. The second is an application to a social welfare maximization problem with black-box utility functions.
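For the set-submodular special case, the greedy scheme and its gap to the optimum can be seen on a toy sensor coverage instance. This is a minimal sketch under stated assumptions: the sensor footprints are made up, the comparison uses the classical 1 - 1/e guarantee for cardinality-constrained monotone submodular maximization and not the bound derived in the paper, and all names are illustrative.

```python
import math
from itertools import combinations

def coverage(selection, sensors):
    """Number of distinct locations covered by the selected sensors."""
    covered = set()
    for name in selection:
        covered |= sensors[name]
    return len(covered)

def greedy(sensors, k):
    """Build a string of k sensors, each step adding the largest marginal coverage."""
    chosen = []
    for _ in range(k):
        remaining = [s for s in sensors if s not in chosen]
        best = max(remaining, key=lambda s: coverage(chosen + [s], sensors))
        chosen.append(best)
    return chosen

# Hypothetical sensor footprints over locations 1..6.
sensors = {
    "s1": {1, 2, 3, 4},
    "s2": {1, 2, 5},
    "s3": {3, 4, 6},
}
k = 2

greedy_string = greedy(sensors, k)
greedy_value = coverage(greedy_string, sensors)
optimal_value = max(coverage(c, sensors) for c in combinations(sensors, k))
print(greedy_string, greedy_value, optimal_value)
```

Greedy commits to `s1` first and ends with coverage 5, while the pair `s2`, `s3` covers all 6 locations. The ratio 5/6 sits comfortably above 1 - 1/e, which is the kind of gap that curvature-based bounds aim to certify more tightly.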
A Hetero-functional Graph Resilience Analysis for Convergent Systems-of-Systems
Our modern life has grown to depend on many and nearly ubiquitous large complex engineering systems. Many disciplines now seemingly ask the same question: ``In the face of assumed disruption, to what degree will these systems continue to perform and when will they be able to bounce back to normal operation"? Furthermore, there is a growing recognition that the greatest societal challenges of the Anthropocene era are intertwined, necessitating a convergent systems-of-systems modeling and analysis framework based upon reconciled ontologies, data, and theoretical methods. Consequently, this paper develops a methodology for hetero-functional graph resilience analysis and demonstrates it on a convergent system-of-systems. It uses the Systems Modeling Language, model-based systems engineering and Hetero-Functional Graph Theory (HFGT) to overcome the convergence research challenges when constructing models and measures from multiple disciplines for systems resilience. The paper includes both the ``survival" as well as ``recovery" components of resilience. It also strikes a middle ground between two disparate approaches to resilience measurement: structural measurement of formal graphs and detailed behavioral simulation. This paper also generalizes a previous resilience measure based on HFGT and benefits from recent theoretical and computational developments in HFGT. To demonstrate the methodological developments, the resilience analysis is conducted on a hypothetical energy-water nexus system of moderate size as a type of system-of-systems.
comment: 19 pages, 6 figures, 2 tables
Resilient Fleet Management for Energy-Aware Intra-Factory Logistics
This paper presents a novel fleet management strategy for battery-powered robot fleets tasked with intra-factory logistics in an autonomous manufacturing facility. In this environment, repetitive material handling operations are subject to real-world uncertainties such as blocked passages, and equipment or robot malfunctions. In such cases, centralized approaches enhance resilience by immediately adjusting the task allocation between the robots. To overcome the computational expense, a two-step methodology is proposed where the nominal problem is solved a priori using a Monte Carlo Tree Search algorithm for task allocation, resulting in a nominal search tree. When a disruption occurs, the nominal search tree is rapidly updated a posteriori with costs to the new problem while simultaneously generating feasible solutions. Computational experiments prove the real-time capability of the proposed algorithm for various scenarios and compare it with the case where the search tree is not used and the decentralized approach that does not attempt task reassignment.
comment: This manuscript was accepted to the 2024 American Control Conference (ACC) which was held from Wednesday through Friday, July 10-12, 2024 in Toronto, ON, Canada. arXiv admin note: text overlap with arXiv:2304.11444
Stability Properties of the Impulsive Goodwin's Oscillator in 1-cycle
The Impulsive Goodwin's Oscillator (IGO) is a mathematical model of a hybrid closed-loop system. It arises by closing a special kind of continuous linear positive time-invariant system with impulsive feedback, which employs both amplitude and frequency pulse modulation. The structure of IGO precludes the existence of equilibria, and all its solutions are oscillatory. With its origin in mathematical biology, the IGO also presents a control paradigm useful in a wide range of applications, in particular dosing of chemicals and medicines. Since the pulse modulation feedback mechanism introduces significant nonlinearity and non-smoothness in the closed-loop dynamics, conventional controller design methods fail to apply. However, the hybrid dynamics of IGO reduce to a nonlinear, time-invariant discrete-time system, exhibiting a one-to-one correspondence between periodic solutions of the original IGO and those of the discrete-time system. The paper proposes a design approach that leverages the linearization of the equivalent discrete-time dynamics in the vicinity of a fixed point. A simple and efficient local stability condition of the 1-cycle in terms of the characteristics of the amplitude and frequency modulation functions is obtained.
comment: extended version of the conference paper, accepted by IEEE CDC 2024
Actively Learning Reinforcement Learning: A Stochastic Optimal Control Approach
In this paper we propose a framework towards achieving two intertwined objectives: (i) equipping reinforcement learning with active exploration and deliberate information gathering, such that it regulates state and parameter uncertainties resulting from modeling mismatches and noisy sensory data; and (ii) overcoming the computational intractability of stochastic optimal control. We approach both objectives by using reinforcement learning to compute the stochastic optimal control law. On one hand, we avoid the curse of dimensionality prohibiting the direct solution of the stochastic dynamic programming equation. On the other hand, the resulting stochastic optimal control reinforcement learning agent admits caution and probing, that is, optimal online exploration and exploitation. Unlike a fixed exploration and exploitation balance, caution and probing are employed automatically by the controller in real-time, even after the learning process is terminated. We conclude the paper with a numerical simulation, illustrating how a Linear Quadratic Regulator with the certainty equivalence assumption may lead to poor performance and filter divergence, while our proposed approach is stabilizing, achieves acceptable performance, and is computationally convenient.
Incentive-Compatible Vertiport Reservation in Advanced Air Mobility: An Auction-Based Approach
Advanced air mobility (AAM) is expected to become a multibillion-dollar industry in the near future. Market-based mechanisms are touted to be an integral part of AAM operations, which comprise heterogeneous operators with private valuations. In this work, we study the problem of designing a mechanism to coordinate the movement of electric vertical take-off and landing (eVTOL) aircraft, operated by multiple operators each having heterogeneous valuations associated with their fleet, between vertiports, while enforcing the arrival, departure, and parking constraints at vertiports. Particularly, we propose an incentive-compatible and individually rational vertiport reservation mechanism that maximizes a social welfare metric, which encapsulates the objective of maximizing the overall valuations of all operators while minimizing the congestion at vertiports. Additionally, we improve the computational tractability of designing the reservation mechanism by proposing a mixed binary linear programming approach that is based on constructing a network flow graph corresponding to the underlying problem.
comment: 25 pages, 2 figures, 2 tables
On Bounds for Greedy Schemes in String Optimization based on Greedy Curvatures
We consider the celebrated bound introduced by Conforti and Cornu\'ejols (1984) for greedy schemes in submodular optimization. The bound assumes a submodular function defined on a collection of sets forming a matroid and is based on greedy curvature. We show that the bound holds for a very general class of string problems that includes maximizing submodular functions over set matroids as a special case. We also derive a bound that is computable in the sense that it depends only on quantities along the greedy trajectory. We prove that our bound is superior to the greedy curvature bound of Conforti and Cornu\'ejols. In addition, our bound holds under a condition that is weaker than submodularity.
comment: This version has been accepted as an invited paper in the 63rd IEEE Conference on Decision and Control, Milan, Italy, December 16--19, 2024
Feasibility-Guaranteed Safety-Critical Control with Applications to Heterogeneous Platoons
This paper studies safety and feasibility guarantees for systems with tight control bounds. It has been shown that stabilizing an affine control system while optimizing a quadratic cost and satisfying state and control constraints can be mapped to a sequence of Quadratic Programs (QPs) using Control Barrier Functions (CBF) and Control Lyapunov Functions (CLF). One of the main challenges in this method is that the QP could easily become infeasible under safety constraints of high relative degree, especially under tight control bounds. Recent work has focused on deriving sufficient conditions for guaranteeing feasibility, but the existing results are case-dependent. In this paper, we consider the general case. We define a feasibility constraint and propose a new type of CBF to enforce it. Our method guarantees the feasibility of the above-mentioned QPs, while satisfying safety requirements. We demonstrate the proposed method on an Adaptive Cruise Control (ACC) problem for a heterogeneous platoon with tight control bounds, and compare our method to existing CBF-CLF approaches. The results show that our proposed approach can generate gradually transitioned control (without abrupt changes) with guaranteed feasibility and safety.
comment: 8 pages, 2 figures. arXiv admin note: text overlap with arXiv:2304.00372
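The infeasibility issue that motivates this work shows up even in a scalar toy version of the ACC problem. The sketch below is a standard CBF-based QP with one input, solved in closed form by clipping; it is not the feasibility-guaranteed method of the paper. The double-integrator ego model, the time-headway barrier h = gap - headway * v_ego, the constant-speed lead vehicle, and all parameter values are assumptions chosen for illustration.

```python
def acc_cbf_qp(u_ref, gap, v_ego, v_lead, headway=1.8, alpha=1.0,
               u_min=-3.0, u_max=2.0):
    """Minimize (u - u_ref)^2 subject to the CBF condition and input bounds.

    Returns the optimal acceleration, or None when the constraints conflict.
    """
    h = gap - headway * v_ego                      # barrier: h >= 0 is the safe set
    # h_dot = (v_lead - v_ego) - headway * u, so h_dot + alpha * h >= 0 gives:
    u_safe = ((v_lead - v_ego) + alpha * h) / headway
    upper = min(u_max, u_safe)
    if upper < u_min:
        return None                                # QP infeasible under tight bounds
    return min(max(u_ref, u_min), upper)           # projection of u_ref onto [u_min, upper]

relaxed = acc_cbf_qp(u_ref=1.0, gap=50.0, v_ego=20.0, v_lead=20.0)
limited = acc_cbf_qp(u_ref=2.0, gap=40.0, v_ego=20.0, v_lead=18.0)
infeasible = acc_cbf_qp(u_ref=0.0, gap=38.0, v_ego=20.0, v_lead=10.0)
print(relaxed, limited, infeasible)
```

In the third call the state is still inside the safe set (h = 2 > 0), yet the braking required by the barrier condition exceeds `u_min`, so no admissible input exists. Enforcing an additional feasibility constraint early enough to avoid such states is the idea described in the abstract.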
Efficient Design of a Pixelated Rectenna for WPT Applications
This paper introduces a highly efficient rectenna (rectifying antenna) designed using a binary optimization algorithm. A novel pixelated receiving antenna has been developed to match the diode impedance of a rectifier, eliminating the need for a separate matching circuit in the rectenna's rectifier. The receiving antenna configuration is fine-tuned via a binary optimization algorithm. Using this optimization algorithm, a rectenna is designed at 2.5 GHz that achieves 38% RF-DC conversion efficiency at 0 dBm incident power, with an output voltage of 815 mV. The proposed rectenna demonstrates versatility across various low-power WPT (wireless power transfer) applications.
Joint Load and Capacity Scheduling for Flexible Radio Resource Management of High-Throughput Satellites
This work first explores using flexible beam-user mapping to optimize the beam service range and beam position, in order to adapt the non-uniform traffic demand to the offered capacity in high-throughput satellite (HTS) systems. Second, on this basis, joint flexible bandwidth allocation is adopted to adapt the offered capacity to the demand at the same time. This strategy allows both beam capacity and load to be adjusted to cope with the traffic demand. The new information generated during the load transfer process of flexible beam-user mapping can guide the direction of beam optimization. Then, the proposed strategies are tested against joint power-bandwidth allocation and joint optimization of bandwidth and beam-user mapping under different traffic profiles. Numerical results are obtained for various non-uniform traffic distributions to evaluate the performance of the solutions. Results show that flexible joint load and capacity scheduling is superior to other strategies in terms of demand satisfaction with acceptable complexity. Our source code and results are available at crystal-zwz/HTS_RRM_Joint-Load-and-Capacity-Scheduling (github.com).
Contraction Theory for Nonlinear Stability Analysis and Learning-based Control: A Tutorial Overview
Contraction theory is an analytical tool to study differential dynamics of a non-autonomous (i.e., time-varying) nonlinear system under a contraction metric defined with a uniformly positive definite matrix, the existence of which results in a necessary and sufficient characterization of incremental exponential stability of multiple solution trajectories with respect to each other. By using a squared differential length as a Lyapunov-like function, its nonlinear stability analysis boils down to finding a suitable contraction metric that satisfies a stability condition expressed as a linear matrix inequality, indicating that many parallels can be drawn between well-known linear systems theory and contraction theory for nonlinear systems. Furthermore, contraction theory takes advantage of a superior robustness property of exponential stability used in conjunction with the comparison lemma. This yields much-needed safety and stability guarantees for neural network-based control and estimation schemes, without resorting to a more involved method of using uniform asymptotic stability for input-to-state stability. Such distinctive features permit systematic construction of a contraction metric via convex optimization, thereby obtaining an explicit exponential bound on the distance between a time-varying target trajectory and solution trajectories perturbed externally due to disturbances and learning errors. The objective of this paper is therefore to present a tutorial overview of contraction theory and its advantages in nonlinear stability analysis of deterministic and stochastic systems, with an emphasis on deriving formal robustness and stability guarantees for various learning-based and data-driven automatic control methods. In particular, we provide a detailed review of techniques for finding contraction metrics and associated control and estimation laws using deep neural networks.
comment: Annual Reviews in Control, Accepted, Oct. 1st
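For orientation, the basic contraction condition referred to above can be stated in its standard deterministic form. The symbols $f$, $M$, $\alpha$, $\underline{m}$, and $\overline{m}$ are notation introduced here for illustration and are not taken from the abstract.

```latex
% System and its differential (virtual displacement) dynamics
\dot{x} = f(x,t), \qquad \delta\dot{x} = \frac{\partial f}{\partial x}(x,t)\,\delta x .

% Contraction metric: uniformly positive definite M(x,t)
\underline{m}\, I \preceq M(x,t) \preceq \overline{m}\, I, \qquad
\dot{M} + M \frac{\partial f}{\partial x} + \frac{\partial f}{\partial x}^{\top} M \preceq -2\alpha M, \quad \alpha > 0 .

% Squared differential length as a Lyapunov-like function
V = \delta x^{\top} M\, \delta x
\;\Longrightarrow\;
\dot{V} \le -2\alpha V
\;\Longrightarrow\;
\|\delta x(t)\| \le \sqrt{\overline{m}/\underline{m}}\; e^{-\alpha t}\, \|\delta x(0)\| .
```

The matrix inequality is linear in $M$ for a fixed Jacobian, which is what makes the search for a metric amenable to convex optimization.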
Systems and Control (EESS)
Energy Internet: A Standardization-Based Blueprint Design
The decarbonization of power and energy systems faces a bottleneck: The enormous number of user-side resources cannot be properly managed and operated by centralized system operators, who used to send dispatch instructions only to a few large power plants. To break through, we need not only new devices and algorithms, but structural reforms of our energy systems. Taking the Internet as a paradigm, a practicable design of the Energy Internet is presented based on the principle of standardization. A combination of stylized data and energy delivery, referred to as a Block of Energy Exchange (BEE), is designed as the media to be communicated, which is parsed by the Energy Internet Card. Each Energy Internet Card is assigned a unique MAC address, defining a participant of the Energy Internet, whose standardized profile will be automatically updated according to BEE transfers without the intervention of any centralized operator. The structure of the Energy Internet and the protocols that support the transfer of BEE are presented. System operators will become Energy Internet Service Providers, who operate the energy system by flow control and dispatching centralized resources, which is decoupled from users' behaviors in the Energy Internet. An example shows that the Energy Internet can not only reduce carbon emissions via interactions between peers, but also promote energy democracy and narrow the gap in energy equity.
Difference Between Cyclic and Distributed Approach in Stochastic Optimization for Multi-agent System
Many stochastic optimization problems in multi-agent systems can be decomposed into smaller subproblems or reduced decision subspaces. The cyclic and distributed approaches are two widely used strategies for solving such problems. In this manuscript, we review four existing methods for addressing these problems and compare them based on their suitable problem frameworks and update rules.
Large-scale road network partitioning: a deep learning method based on convolutional autoencoder model
With the development of urbanization, the scale of urban road networks continues to expand, especially in some Asian countries. Short-term traffic state prediction is one of the bases of traffic management and control. Constrained by the space-time cost of computation, short-term traffic state prediction for large-scale urban road networks is difficult. One way to solve this problem is to partition the whole network into multiple sub-networks and predict the traffic state of each separately. In this study, a deep learning method is proposed for road network partitioning. The method mainly includes three steps. First, the daily speed series for roads are encoded into a matrix. Second, a convolutional autoencoder (AE) is built to extract series features and compress data. Third, a spatial hierarchical clustering method with adjacency relationships is applied to the road network. The proposed method was verified on the road network of Shenzhen, which contains more than 5000 links. The results show that AE-hierarchical clustering distinguishes the tidal traffic characteristics and reflects the process of congestion propagation. Furthermore, two indicators are designed to make a quantitative comparison with spectral clustering: intra homogeneity increases by about 9% on average, while inter heterogeneity increases by about 9.5%. Compared with past methods, the time cost also decreases. The results may suggest ways to improve the management and control of urban road networks in other metropolitan cities. The proposed method is expected to be extended to other problems related to large-scale networks.
Nonlinear Cooperative Output Regulation with Input Delay Compensation
This paper investigates the cooperative output regulation (COR) of nonlinear multi-agent systems (MASs) with long input delay based on a periodic event-triggered mechanism. Compared with other mechanisms, periodic event-triggered control can automatically guarantee a Zeno-free behavior and avoid the continuous monitoring of triggering conditions. First, a new periodic event-triggered distributed observer, which is based on fully asynchronous communication data, is proposed to estimate the leader information. Second, a new distributed predictor feedback control method is proposed for the considered nonlinear MASs with input delay. By coordinate transformation, the MASs are mapped into new coupled ODE-PDE target systems with some disturbance-like terms. Then, we show that the COR problem is solvable. Finally, to further save communication resources, a periodic event-triggered mechanism is considered in the sensor-to-controller transmission in every agent. A new periodic event-triggered filter is proposed to deal with the periodic event-triggered feedback data. The MASs with input delay are mapped into coupled ODE-PDE target systems with sampled data information. Then, Lyapunov-Krasovskii functions are constructed to demonstrate the exponential stability of the MASs. Simulations verify the validity of the proposed results.
comment: Accepted by IEEE Trans. Automatic Control
Decentralized Control of Multi-Agent Systems Under Acyclic Spatio-Temporal Task Dependencies
We introduce a novel distributed sampled-data control method tailored for heterogeneous multi-agent systems under a global spatio-temporal task with acyclic dependencies. Specifically, we consider the global task as a conjunction of independent and collaborative tasks, defined over the absolute and relative states of agent pairs. Task dependencies in this form are then represented by a task graph, which we assume to be acyclic. From the given task graph, we provide an algorithmic approach to define a distributed sampled-data controller prioritizing the fulfilment of collaborative tasks as the primary objective, while fulfilling independent tasks unless they conflict with collaborative ones. Moreover, communication maintenance among collaborating agents is seamlessly enforced within the proposed control framework. A numerical simulation is provided to showcase the potential of our control framework.
comment: A short version of this paper was accepted for the Conference on Decision and Control
Exploring the Optimal Size of Grid-forming Energy Storage in an Off-grid Renewable P2H System under Multi-timescale Energy Management
Utility-scale off-grid renewable power-to-hydrogen systems (OReP2HSs) typically include photovoltaic plants, wind turbines, electrolyzers (ELs), and energy storage systems. As an island system, OReP2HS requires at least one component, generally the battery energy storage system (BESS), that operates for grid-forming control to provide frequency and voltage references and regulate them through transient power support and short-term energy balance regulation. While larger BESS capacity increases this ability, it also raises investment costs. This paper proposes a framework of layered multi-timescale energy management system (EMS) and evaluates the most cost-effective size of the grid-forming BESS in the OReP2HS. The proposed EMS covers the timescales ranging from those for power system transient behaviors to intra-day scheduling, coordinating renewable power, BESS, and ELs. Then, an iterative search procedure based on high-fidelity simulation is employed to determine the size of the BESS with minimal levelized cost of hydrogen (LCOH). Simulations over a reference year, based on the data from a planned OReP2HS project in Inner Mongolia, China, show that with the proposed EMS, the base-case optimal LCOH is 33.212 CNY/kg (4.581 USD/kg). The capital expenditure of the BESS accounts for 17.83% of the total, and the optimal BESS size accounts for 13.6% of the rated hourly energy output of power sources. Sensitivity analysis reveals that by reducing the electrolytic load adjustment time step from 90 to 5 s and increasing its ramping limit from 1% to 10% rated power per second, the BESS size decreases by 53.57%, and the LCOH decreases to 25.458 CNY/kg (3.511 USD/kg). Considering the cost of designing and manufacturing utility-scale ELs with fast load regulation capability, a load adjustment time step of 5-10 s and a ramping limit of 4-6% rated power per second are recommended.
On final opinions of the Friedkin-Johnsen model over random graphs with partially stubborn community
This paper studies the formation of final opinions for the Friedkin-Johnsen (FJ) model with a community of partially stubborn agents. The underlying network of the FJ model is symmetric and generated from a random graph model, in which each link is added independently from a Bernoulli distribution. It is shown that the final opinions of the FJ model will concentrate around those of an FJ model over the expected graph as the network size grows, on the condition that the stubborn agents are well connected to other agents. Probability bounds are proposed for the distance between these two final opinion vectors, both for the case where non-stubborn agents exist and for the case where they do not. Numerical experiments are provided to illustrate the theoretical findings. The simulation shows that, in the presence of non-stubborn agents, the link probability between the stubborn and the non-stubborn communities affects the distance between the two final opinion vectors significantly. Additionally, if all agents are stubborn, the opinion distance decreases with the agent stubbornness.
An Analysis of Logit Learning with the r-Lambert Function
The well-known replicator equation in evolutionary game theory describes how population-level behaviors change over time when individuals make decisions using simple imitation learning rules. In this paper, we study evolutionary dynamics based on a fundamentally different class of learning rules known as logit learning. Numerous previous studies on logit dynamics provide numerical evidence of bifurcations of multiple fixed points for several types of games. Our results here provide a more explicit analysis of the logit fixed points and their stability properties for the entire class of two-strategy population games -- by way of the $r$-Lambert function. We find that for Prisoner's Dilemma and anti-coordination games, there is only a single fixed point for all rationality levels. However, coordination games exhibit a pitchfork bifurcation: there is a single fixed point in a low-rationality regime, and three fixed points in a high-rationality regime. We provide an implicit characterization for the level of rationality where this bifurcation occurs. In all cases, the set of logit fixed points converges to the full set of Nash equilibria in the high rationality limit.
comment: 9 pages, one figure, to be included in CDC 2024 conference proceedings
Limiting Computation Levels in Prioritized Trajectory Planning with Safety Guarantees
In prioritized planning for vehicles, vehicles plan trajectories in parallel or in sequence. Parallel prioritized planning offers approximately consistent computation time regardless of the number of vehicles but struggles to guarantee collision-free trajectories. Conversely, sequential prioritized planning can guarantee collision-freeness but results in increased computation time as the number of sequentially computing vehicles, which we term computation levels, grows. This number is determined by the directed coupling graph resulting from the coupling and prioritization of vehicles. In this work, we guarantee safe trajectories in parallel planning through reachability analysis. Although these trajectories are collision-free, they tend to be conservative. We address this by planning with a subset of vehicles in sequence. We formulate the problem of selecting this subset as a graph partitioning problem that allows us to independently set computation levels. Our simulations demonstrate a reduction in computation levels by approximately 64% compared to sequential prioritized planning while maintaining the solution quality.
comment: 8 pages, 4 figures. This is an extended abstract of our previous work published at the 2024 European Control Conference (ECC), June 25-28, 2024. Stockholm, Sweden
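As a rough illustration of how the number of computation levels follows from the directed coupling graph, the sketch below assumes the graph is acyclic and that a vehicle's level is one more than the highest level among the vehicles it must wait for. That definition, and the name `computation_levels`, are assumptions made for illustration and are not taken from the abstract.

```python
def computation_levels(n_vehicles, edges):
    """Return the number of computation levels of an acyclic coupling graph.

    edges is a list of (i, j) pairs meaning vehicle i has priority over
    vehicle j, so j waits for i's plan. The level of a vehicle is assumed
    to be 1 + the maximum level of its predecessors (1 if it has none).
    The graph must be acyclic, otherwise the recursion does not terminate.
    """
    preds = {v: [] for v in range(n_vehicles)}
    for i, j in edges:
        preds[j].append(i)

    level = {}

    def visit(v):
        if v not in level:
            level[v] = 1 + max((visit(u) for u in preds[v]), default=0)
        return level[v]

    return max((visit(v) for v in range(n_vehicles)), default=0)


# Four vehicles coupled in a chain plan strictly one after another.
chain = computation_levels(4, [(0, 1), (1, 2), (2, 3)])
# Cutting the middle coupling leaves two groups that plan side by side.
split = computation_levels(4, [(0, 1), (2, 3)])
```

Under this assumption, removing couplings between groups shortens the longest chain of waiting vehicles, which is the quantity a graph partitioning formulation can limit.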
Cooperative Learning-Based Framework for VNF Caching and Placement Optimization over Low Earth Orbit Satellite Networks
Low Earth Orbit Satellite Networks (LSNs) are integral to supporting a broad range of modern applications, which are typically modeled as Service Function Chains (SFCs). Each SFC is composed of Virtual Network Functions (VNFs), where each VNF performs a specific task. In this work, we tackle two key challenges in deploying SFCs across an LSN. Firstly, we aim to optimize the long-term system performance by minimizing the average end-to-end SFC execution delay, given that each satellite comes with a pre-installed/cached subset of VNFs. To achieve optimal SFC placement, we formulate an offline Dynamic Programming (DP) equation. To overcome the challenges associated with DP, such as its complexity, the need for probability knowledge, and centralized decision-making, we put forth an online Multi-Agent Q-Learning (MAQL) solution. Our MAQL approach addresses convergence issues in the non-stationary LSN environment by enabling satellites to share learning parameters and update their Q-tables based on distinct rules for their selected actions. Secondly, to determine the optimal VNF subsets for satellite caching, we develop a Bayesian Optimization (BO)-based learning mechanism that operates both offline and continuously in the background during runtime. Extensive experiments demonstrate that our MAQL approach achieves near-optimal performance comparable to the DP model and significantly outperforms existing baselines. Moreover, the BO-based approach effectively enhances the request serving rate over time.
comment: 40 pages, 11 figures, 3 tables
A Performance Bound for the Greedy Algorithm in a Generalized Class of String Optimization Problems
We present a simple performance bound for the greedy scheme in string optimization problems that obtains strong results. Our approach vastly generalizes the group of previously established greedy curvature bounds by Conforti and Cornuéjols (1984). We consider three constants, $\alpha_G$, $\alpha_G'$, and $\alpha_G''$ introduced by Conforti and Cornuéjols (1984), that are used in performance bounds of greedy schemes in submodular set optimization. We first generalize both of the $\alpha_G$ and $\alpha_G''$ bounds to string optimization problems in a manner that includes maximizing submodular set functions over matroids as a special case. We then derive a much simpler and computable bound that allows for applications to a far more general class of functions with string domains. We prove that our bound is superior to both the $\alpha_G$ and $\alpha_G''$ bounds and provide a counterexample to show that the $\alpha_G'$ bound is incorrect under the assumptions in Conforti and Cornuéjols (1984). We conclude with two applications. The first is an application of our result to sensor coverage problems. We demonstrate our performance bound in cases where the objective function is set submodular and string submodular. The second is an application to a social welfare maximization problem with black-box utility functions.
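For readers unfamiliar with the greedy scheme on strings, a minimal sketch follows. It only shows the scheme whose performance the paper bounds, not the bound itself. The sensor names, the covered cells, and the function `greedy_string` are hypothetical; the coverage objective used here happens to be order-independent, which is the set-function special case.

```python
def greedy_string(actions, objective, horizon):
    """Build a string of the given length by repeatedly appending the action
    that maximizes the objective of the extended string."""
    string = ()
    for _ in range(horizon):
        best = max(actions, key=lambda a: objective(string + (a,)))
        string = string + (best,)
    return string


# Hypothetical sensor coverage instance: each sensor covers a set of cells.
COVERAGE = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {4, 5, 6, 7},
}


def covered_cells(string):
    """Objective: number of distinct cells covered by the sensors in the string."""
    cells = set()
    for sensor in string:
        cells |= COVERAGE[sensor]
    return len(cells)


plan = greedy_string(list(COVERAGE), covered_cells, horizon=2)
value = covered_cells(plan)
```

Because `objective` receives the whole string, the same loop applies unchanged to objectives where the order of actions matters.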
A Hetero-functional Graph Resilience Analysis for Convergent Systems-of-Systems
Modern life has grown to depend on many nearly ubiquitous, large, complex engineering systems. Many disciplines now seemingly ask the same question: "In the face of assumed disruption, to what degree will these systems continue to perform and when will they be able to bounce back to normal operation?" Furthermore, there is a growing recognition that the greatest societal challenges of the Anthropocene era are intertwined, necessitating a convergent systems-of-systems modeling and analysis framework based upon reconciled ontologies, data, and theoretical methods. Consequently, this paper develops a methodology for hetero-functional graph resilience analysis and demonstrates it on a convergent system-of-systems. It uses the Systems Modeling Language, model-based systems engineering and Hetero-Functional Graph Theory (HFGT) to overcome the convergence research challenges when constructing models and measures from multiple disciplines for systems resilience. The paper includes both the "survival" and the "recovery" components of resilience. It also strikes a middle ground between two disparate approaches to resilience measurement: structural measurement of formal graphs and detailed behavioral simulation. This paper also generalizes a previous resilience measure based on HFGT and benefits from recent theoretical and computational developments in HFGT. To demonstrate the methodological developments, the resilience analysis is conducted on a hypothetical energy-water nexus system of moderate size as a type of system-of-systems.
comment: 19 pages, 6 figures, 2 tables
Resilient Fleet Management for Energy-Aware Intra-Factory Logistics
This paper presents a novel fleet management strategy for battery-powered robot fleets tasked with intra-factory logistics in an autonomous manufacturing facility. In this environment, repetitive material handling operations are subject to real-world uncertainties such as blocked passages, and equipment or robot malfunctions. In such cases, centralized approaches enhance resilience by immediately adjusting the task allocation between the robots. To overcome the computational expense, a two-step methodology is proposed where the nominal problem is solved a priori using a Monte Carlo Tree Search algorithm for task allocation, resulting in a nominal search tree. When a disruption occurs, the nominal search tree is rapidly updated a posteriori with costs to the new problem while simultaneously generating feasible solutions. Computational experiments prove the real-time capability of the proposed algorithm for various scenarios and compare it with the case where the search tree is not used and the decentralized approach that does not attempt task reassignment.
comment: This manuscript was accepted to the 2024 American Control Conference (ACC) which was held from Wednesday through Friday, July 10-12, 2024 in Toronto, ON, Canada. arXiv admin note: text overlap with arXiv:2304.11444
Stability Properties of the Impulsive Goodwin's Oscillator in 1-cycle
The Impulsive Goodwin's Oscillator (IGO) is a mathematical model of a hybrid closed-loop system. It arises by closing a special kind of continuous linear positive time-invariant system with impulsive feedback, which employs both amplitude and frequency pulse modulation. The structure of IGO precludes the existence of equilibria, and all its solutions are oscillatory. With its origin in mathematical biology, the IGO also presents a control paradigm useful in a wide range of applications, in particular dosing of chemicals and medicines. Since the pulse modulation feedback mechanism introduces significant nonlinearity and non-smoothness in the closed-loop dynamics, conventional controller design methods fail to apply. However, the hybrid dynamics of IGO reduce to a nonlinear, time-invariant discrete-time system, exhibiting a one-to-one correspondence between periodic solutions of the original IGO and those of the discrete-time system. The paper proposes a design approach that leverages the linearization of the equivalent discrete-time dynamics in the vicinity of a fixed point. A simple and efficient local stability condition of the 1-cycle in terms of the characteristics of the amplitude and frequency modulation functions is obtained.
comment: extended version of the conference paper, accepted by IEEE CDC 2024
Actively Learning Reinforcement Learning: A Stochastic Optimal Control Approach
In this paper we propose a framework towards achieving two intertwined objectives: (i) equipping reinforcement learning with active exploration and deliberate information gathering, such that it regulates state and parameter uncertainties resulting from modeling mismatches and noisy sensing; and (ii) overcoming the computational intractability of stochastic optimal control. We approach both objectives by using reinforcement learning to compute the stochastic optimal control law. On one hand, we avoid the curse of dimensionality prohibiting the direct solution of the stochastic dynamic programming equation. On the other hand, the resulting stochastic optimal control reinforcement learning agent admits caution and probing, that is, optimal online exploration and exploitation. Unlike a fixed exploration-exploitation balance, caution and probing are employed automatically by the controller in real time, even after the learning process is terminated. We conclude the paper with a numerical simulation, illustrating how a Linear Quadratic Regulator with the certainty equivalence assumption may lead to poor performance and filter divergence, while our proposed approach is stabilizing, achieves acceptable performance, and is computationally convenient.
Incentive-Compatible Vertiport Reservation in Advanced Air Mobility: An Auction-Based Approach
Advanced air mobility (AAM) is expected to become a multibillion-dollar industry in the near future. Market-based mechanisms are touted to be an integral part of AAM operations, which comprise heterogeneous operators with private valuations. In this work, we study the problem of designing a mechanism to coordinate the movement of electric vertical take-off and landing (eVTOL) aircraft, operated by multiple operators each having heterogeneous valuations associated with their fleet, between vertiports, while enforcing the arrival, departure, and parking constraints at vertiports. In particular, we propose an incentive-compatible and individually rational vertiport reservation mechanism that maximizes a social welfare metric, which encapsulates the objective of maximizing the overall valuations of all operators while minimizing the congestion at vertiports. Additionally, we improve the computational tractability of designing the reservation mechanism by proposing a mixed binary linear programming approach that is based on constructing a network flow graph corresponding to the underlying problem.
comment: 25 pages, 2 figures, 2 tables
On Bounds for Greedy Schemes in String Optimization based on Greedy Curvatures
We consider the celebrated bound introduced by Conforti and Cornuéjols (1984) for greedy schemes in submodular optimization. The bound assumes a submodular function defined on a collection of sets forming a matroid and is based on greedy curvature. We show that the bound holds for a very general class of string problems that includes maximizing submodular functions over set matroids as a special case. We also derive a bound that is computable in the sense that it depends only on quantities along the greedy trajectory. We prove that our bound is superior to the greedy curvature bound of Conforti and Cornuéjols. In addition, our bound holds under a condition that is weaker than submodularity.
comment: This version has been accepted as an invited paper in the 63rd IEEE Conference on Decision and Control, Milan, Italy, December 16--19, 2024
Feasibility-Guaranteed Safety-Critical Control with Applications to Heterogeneous Platoons
This paper studies safety and feasibility guarantees for systems with tight control bounds. It has been shown that stabilizing an affine control system while optimizing a quadratic cost and satisfying state and control constraints can be mapped to a sequence of Quadratic Programs (QPs) using Control Barrier Functions (CBF) and Control Lyapunov Functions (CLF). One of the main challenges in this method is that the QP could easily become infeasible under safety constraints of high relative degree, especially under tight control bounds. Recent work has focused on deriving sufficient conditions for guaranteeing feasibility, but the existing results are case-dependent. In this paper, we consider the general case. We define a feasibility constraint and propose a new type of CBF to enforce it. Our method guarantees the feasibility of the above-mentioned QPs, while satisfying safety requirements. We demonstrate the proposed method on an Adaptive Cruise Control (ACC) problem for a heterogeneous platoon with tight control bounds, and compare our method to existing CBF-CLF approaches. The results show that our proposed approach can generate gradually transitioned control (without abrupt changes) with guaranteed feasibility and safety.
comment: 8 pages, 2 figures. arXiv admin note: text overlap with arXiv:2304.00372
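The infeasibility issue that motivates the paper above can be seen in a toy setting. The sketch below is not the paper's method. It assumes a simplified ACC model (gap `d`, follower speed `v`, constant lead speed `v_lead`, control `u` equal to follower acceleration), a time-headway barrier `h = d - tau * v` of relative degree one, and a scalar QP that minimizes `(u - u_ref)**2`. The model, parameter values, and the name `cbf_acc_control` are hypothetical.

```python
def cbf_acc_control(d, v, v_lead, u_ref, u_min, u_max, tau=1.8, alpha=1.0):
    """Solve a one-dimensional CBF-QP in closed form.

    Dynamics (assumed): d' = v_lead - v, v' = u, lead acceleration zero.
    Barrier: h = d - tau * v, so h' = v_lead - v - tau * u.
    CBF condition h' + alpha * h >= 0 gives u <= (v_lead - v + alpha * h) / tau.
    Returns the acceleration closest to u_ref that satisfies the CBF
    condition and the control bounds, or None if no such value exists.
    """
    h = d - tau * v
    u_cbf = (v_lead - v + alpha * h) / tau
    upper = min(u_max, u_cbf)
    if upper < u_min:
        return None  # CBF constraint conflicts with the braking limit
    return min(max(u_ref, u_min), upper)


# Comfortable gap: the reference acceleration is already safe.
u_free = cbf_acc_control(d=50.0, v=20.0, v_lead=20.0, u_ref=1.0, u_min=-3.0, u_max=2.0)
# Closing on a slower lead vehicle: the CBF caps the acceleration.
u_capped = cbf_acc_control(d=40.0, v=20.0, v_lead=18.0, u_ref=2.0, u_min=-3.0, u_max=2.0)
# Too close and too fast: even maximum braking violates the CBF condition.
u_none = cbf_acc_control(d=30.0, v=20.0, v_lead=10.0, u_ref=0.0, u_min=-3.0, u_max=2.0)
```

The third call shows the failure mode: with a tight braking limit the constraint set is empty, which is the situation a feasibility constraint is meant to rule out ahead of time.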
Efficient Design of a Pixelated Rectenna for WPT Applications
This paper introduces a highly efficient rectenna (rectifying antenna) designed using a binary optimization algorithm. A novel pixelated receiving antenna has been developed to match the diode impedance of a rectifier, eliminating the need for a separate matching circuit in the rectenna's rectifier. The receiving antenna configuration is fine-tuned via the binary optimization algorithm. The resulting rectenna operates at 2.5 GHz and achieves 38% RF-DC conversion efficiency at 0 dBm incident power, with an output voltage of 815 mV. The proposed rectenna demonstrates versatility across various low-power WPT (wireless power transfer) applications.
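The reported figures can be related by simple arithmetic. The sketch below converts the incident power from dBm to milliwatts and applies the stated efficiency. The load resistance it prints is only implied under the assumption of a purely resistive DC load with `P_dc = V**2 / R`; it is not a value given in the abstract.

```python
def dbm_to_mw(p_dbm):
    """Convert power in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


efficiency = 0.38          # RF-DC conversion efficiency from the abstract
p_in_mw = dbm_to_mw(0.0)   # 0 dBm incident power
p_dc_mw = efficiency * p_in_mw

v_out = 0.815              # output voltage in volts (815 mV)
# Assumption: purely resistive load, so R = V^2 / P_dc (P_dc converted to watts).
r_load_ohm = v_out ** 2 / (p_dc_mw * 1e-3)

print(f"P_in = {p_in_mw:.2f} mW, P_dc = {p_dc_mw:.2f} mW, implied R_load = {r_load_ohm:.0f} ohm")
```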
Joint Load and Capacity Scheduling for Flexible Radio Resource Management of High-Throughput Satellites
This work first explores using flexible beam-user mapping to optimize the beam service range and beam position, in order to adapt the non-uniform traffic demand to the offered capacity in high-throughput satellite (HTS) systems. Second, on this basis, joint flexible bandwidth allocation is adopted to adapt the offer to the demand at the same time. This strategy allows both beam capacity and load to be adjusted to cope with the traffic demand. The new information generated during the load transfer process of flexible beam-user mapping can guide the direction of beam optimization. Then, the proposed strategies are tested against joint power-bandwidth allocation and joint optimization of bandwidth and beam-user mapping under different traffic profiles. Numerical results are obtained for various non-uniform traffic distributions to evaluate the performance of the solutions. Results show that flexible joint load and capacity scheduling is superior to other strategies in terms of demand satisfaction with acceptable complexity. Our source code and results are available at crystal-zwz/HTS_RRM_Joint-Load-and-Capacity-Scheduling (github.com).
Contraction Theory for Nonlinear Stability Analysis and Learning-based Control: A Tutorial Overview
Contraction theory is an analytical tool to study differential dynamics of a non-autonomous (i.e., time-varying) nonlinear system under a contraction metric defined with a uniformly positive definite matrix, the existence of which results in a necessary and sufficient characterization of incremental exponential stability of multiple solution trajectories with respect to each other. By using a squared differential length as a Lyapunov-like function, its nonlinear stability analysis boils down to finding a suitable contraction metric that satisfies a stability condition expressed as a linear matrix inequality, indicating that many parallels can be drawn between well-known linear systems theory and contraction theory for nonlinear systems. Furthermore, contraction theory takes advantage of a superior robustness property of exponential stability used in conjunction with the comparison lemma. This yields much-needed safety and stability guarantees for neural network-based control and estimation schemes, without resorting to a more involved method of using uniform asymptotic stability for input-to-state stability. Such distinctive features permit systematic construction of a contraction metric via convex optimization, thereby obtaining an explicit exponential bound on the distance between a time-varying target trajectory and solution trajectories perturbed externally due to disturbances and learning errors. The objective of this paper is therefore to present a tutorial overview of contraction theory and its advantages in nonlinear stability analysis of deterministic and stochastic systems, with an emphasis on deriving formal robustness and stability guarantees for various learning-based and data-driven automatic control methods. In particular, we provide a detailed review of techniques for finding contraction metrics and associated control and estimation laws using deep neural networks.
comment: Annual Reviews in Control, Accepted, Oct. 1st
Robotics
Chemical Power Variability among Microscopic Robots in Blood Vessels
Fuel cells using oxygen and glucose could power microscopic robots operating in blood vessels. Swarms of such robots can significantly reduce oxygen concentration, depending on the time between successive transits of the lung, hematocrit variation in vessels and tissue oxygen consumption. These factors differ among circulation paths through the body. This paper evaluates how these variations affect the minimum oxygen concentration due to robot consumption and where it occurs: mainly in moderate-sized veins toward the end of long paths prior to their merging with veins from shorter paths. This shows that tens of billions of robots can obtain hundreds of picowatts throughout the body with minor reduction in total oxygen. However, a trillion robots significantly deplete oxygen in some parts of the body. By storing oxygen or limiting their consumption in long circulation paths, robots can actively mitigate this depletion. The variation in behavior is illustrated in three cases: the portal system which involves passage through two capillary networks, the spleen whose slits significantly slow some of the flow, and large tissue consumption in coronary circulation.
Learning to Open and Traverse Doors with a Legged Manipulator
Using doors is a longstanding challenge in robotics and is of significant practical interest in giving robots greater access to human-centric spaces. The task is challenging due to the need for online adaptation to varying door properties and precise control in manipulating the door panel and navigating through the confined doorway. To address this, we propose a learning-based controller for a legged manipulator to open and traverse through doors. The controller is trained using a teacher-student approach in simulation to learn robust task behaviors as well as estimate crucial door properties during the interaction. Unlike previous works, our approach is a single control policy that can handle both push and pull doors through learned behaviour which infers the opening direction during deployment without prior knowledge. The policy was deployed on the ANYmal legged robot with an arm and achieved a success rate of 95.0% in repeated trials conducted in an experimental setting. Additional experiments validate the policy's effectiveness and robustness to various doors and disturbances. A video overview of the method and experiments can be found at youtu.be/tQDZXN_k5NU.
Context-Aware Replanning with Pre-explored Semantic Map for Object Navigation
Pre-explored Semantic Maps, constructed through prior exploration using visual language models (VLMs), have proven effective as foundational elements for training-free robotic applications. However, existing approaches assume the map's accuracy and do not provide effective mechanisms for revising decisions based on incorrect maps. To address this, we introduce Context-Aware Replanning (CARe), which estimates map uncertainty through confidence scores and multi-view consistency, enabling the agent to revise erroneous decisions stemming from inaccurate maps without requiring additional labels. We demonstrate the effectiveness of our proposed method by integrating it with two modern mapping backbones, VLMaps and OpenMask3D, and observe significant performance improvements in object navigation tasks. More details can be found on the project page: https://carmaps.github.io/supplements/.
comment: CoRL 2024. The first three authors contributed equally, and their order of authorship is interchangeable. Project page: https://carmaps.github.io/supplements/
Simulation and optimization of computed torque control 3 DOF RRR manipulator using MATLAB
Robot manipulators have become a significant tool for production industries due to their high speed, accuracy, safety, and repeatability. This paper simulates and optimizes the design of a 3-DOF articulated robotic manipulator (RRR configuration). The forward and inverse dynamic models are utilized. The trajectory is planned using the end effector's required initial position. A computed-torque model is used to calculate the physical end effector's trajectory, position, and velocity. The MATLAB Simulink platform is used for all simulations of the RRR manipulator. With the aid of MATLAB, we primarily focus on controlling the manipulator with a computed torque control strategy to achieve the required position.
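The computed torque idea can be illustrated on a much smaller system than the paper's 3-DOF arm. The sketch below is written in Python rather than MATLAB Simulink and uses a single-link pendulum with hypothetical mass, length, and gains. The control law cancels gravity and imposes linear error dynamics `e'' + kd * e' + kp * e = 0`.

```python
import math


def computed_torque(q, dq, q_des, dq_des, ddq_des,
                    kp=100.0, kd=20.0, m=1.0, l=0.5, g=9.81):
    """Computed torque law for a single link with point mass m at distance l.

    Model (assumed): m*l^2 * ddq + m*g*l*sin(q) = tau, with q measured from
    the downward vertical. The law feeds forward the desired acceleration,
    adds PD feedback on the tracking error, and cancels gravity.
    """
    inertia = m * l * l
    gravity = m * g * l * math.sin(q)
    v = ddq_des + kd * (dq_des - dq) + kp * (q_des - q)
    return inertia * v + gravity


def simulate(q_des=1.0, t_end=5.0, dt=1e-3, m=1.0, l=0.5, g=9.81):
    """Integrate the closed loop with semi-implicit Euler and return the final angle."""
    q, dq = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        tau = computed_torque(q, dq, q_des, 0.0, 0.0, m=m, l=l, g=g)
        ddq = (tau - m * g * l * math.sin(q)) / (m * l * l)
        dq += ddq * dt
        q += dq * dt
    return q


final_angle = simulate()
```

For a multi-joint arm the scalars become the inertia matrix, Coriolis terms, and gravity vector, but the structure of the law stays the same.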
Leveraging LLMs, Graphs and Object Hierarchies for Task Planning in Large-Scale Environments
Planning methods struggle with computational intractability in solving task-level problems in large-scale environments. This work explores leveraging the commonsense knowledge encoded in LLMs to empower planning techniques to deal with these complex scenarios. We achieve this by efficiently using LLMs to prune irrelevant components from the planning problem's state space, substantially simplifying its complexity. We demonstrate the efficacy of this system through extensive experiments within a household simulation environment, alongside real-world validation using a 7-DoF manipulator (video https://youtu.be/6ro2UOtOQS4).
comment: 8 pages, 6 figures
Should I Stay or Should I Go: A Learning Approach for Drone-based Sensing Applications
Multicopter drones are becoming a key platform in several application domains, enabling precise on-the-spot sensing and/or actuation. We focus on the case where the drone must process the sensor data in order to decide, depending on the outcome, whether it needs to perform some additional action, e.g., more accurate sensing or some form of actuation. On the one hand, waiting for the computation to complete may waste time, if it turns out that no further action is needed. On the other hand, if the drone starts moving toward the next point of interest before the computation ends, it may need to return to the previous point, if some action needs to be taken. In this paper, we propose a learning approach that enables the drone to take informed decisions about whether to wait for the result of the computation (or not), based on past experience gathered from previous missions. Through an extensive evaluation, we show that the proposed approach, when properly configured, outperforms several static policies by up to 25.8% over a wide variety of scenarios, both where the probability of some action being required at a given point of interest remains stable and where this probability varies in time.
comment: 9 pages, 9 figures
Modeling Drivers' Risk Perception via Attention to Improve Driving Assistance
Advanced Driver Assistance Systems (ADAS) alert drivers during safety-critical scenarios but often provide superfluous alerts due to a lack of consideration for drivers' knowledge or scene awareness. Modeling these aspects together in a data-driven way is challenging due to the scarcity of critical scenario data with in-cabin driver state and world state recorded together. We explore the benefits of driver modeling in the context of Forward Collision Warning (FCW) systems. Working with a real-world video dataset of on-road FCW deployments, we collect observers' subjective validity ratings of the deployed alerts. We also annotate participants' gaze-to-objects and extract 3D trajectories of the ego vehicle and other vehicles semi-automatically. We generate a risk estimate of the scene and the drivers' perception in a two-step process: First, we model the movement of vehicles in a given scenario as a joint trajectory forecasting problem. Then, we reason about the drivers' risk perception of the scene by counterfactually modifying the input to the forecasting model to represent the drivers' actual observations of vehicles in the scene. The difference in these behaviours gives us an estimate of driver behaviour that accounts for their actual (inattentive) observations and their downstream effect on overall scene risk. We compare both a learned scene representation as well as a more traditional "worst-case" deceleration model to achieve the future trajectory forecast. Our experiments show that using this risk formulation to generate FCW alerts may lead to an improved false positive rate for FCWs and improved FCW timing.
IR2: Implicit Rendezvous for Robotic Exploration Teams under Sparse Intermittent Connectivity
Information sharing is critical in time-sensitive and realistic multi-robot exploration, especially for smaller robotic teams in large-scale environments where connectivity may be sparse and intermittent. Existing methods often overlook such communication constraints by assuming unrealistic global connectivity. Other works account for communication constraints (by maintaining close proximity or line of sight during information exchange), but are often inefficient. For instance, preplanned rendezvous approaches typically involve unnecessary detours resulting from poorly timed rendezvous, while pursuit-based approaches often result in short-sighted decisions due to their greedy nature. We present IR2, a deep reinforcement learning approach to information sharing for multi-robot exploration. Leveraging attention-based neural networks trained via reinforcement and curriculum learning, IR2 allows robots to effectively reason about the longer-term trade-offs between disconnecting for solo exploration and reconnecting for information sharing. In addition, we propose a hierarchical graph formulation to maintain a sparse yet informative graph, enabling our approach to scale to large-scale environments. We present simulation results in three large-scale Gazebo environments, which show that our approach yields 6.6-34.1% shorter exploration paths and significantly improved mapped area consistency among robots when compared to state-of-the-art baselines. Our simulation training and testing code is available at https://github.com/marmotlab/IR2.
comment: © 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Enhancing Sample Efficiency and Exploration in Reinforcement Learning through the Integration of Diffusion Models and Proximal Policy Optimization
Recent advancements in reinforcement learning (RL) have been fueled by large-scale data and deep neural networks, particularly for high-dimensional and complex tasks. Online RL methods like Proximal Policy Optimization (PPO) are effective in dynamic scenarios but require substantial real-time data, posing challenges in resource-constrained or slow simulation environments. Offline RL addresses this by pre-learning policies from large datasets, though its success depends on the quality and diversity of the data. This work proposes a framework that enhances PPO algorithms by incorporating a diffusion model to generate high-quality virtual trajectories for offline datasets. This approach improves exploration and sample efficiency, leading to significant gains in cumulative rewards, convergence speed, and strategy stability in complex tasks. Our contributions are threefold: we explore the potential of diffusion models in RL, particularly for offline datasets, extend the application of online RL to offline environments, and experimentally validate the performance improvements of PPO with diffusion models. These findings provide new insights and methods for applying RL to high-dimensional, complex tasks. Finally, we open-source our code at https://github.com/TianciGao/DiffPPO
LeTac-MPC: Learning Model Predictive Control for Tactile-reactive Grasping
Grasping is a crucial task in robotics, necessitating tactile feedback and reactive grasping adjustments for robust grasping of objects under various conditions and with differing physical properties. In this paper, we introduce LeTac-MPC, a learning-based model predictive control (MPC) for tactile-reactive grasping. Our approach enables the gripper to grasp objects with different physical properties on dynamic and force-interactive tasks. We utilize a vision-based tactile sensor, GelSight, which is capable of perceiving high-resolution tactile feedback that contains information on the physical properties and states of the grasped object. LeTac-MPC incorporates a differentiable MPC layer designed to model the embeddings extracted by a neural network (NN) from tactile feedback. This design facilitates convergent and robust grasping control at a frequency of 25 Hz. We propose a fully automated data collection pipeline and collect a dataset only using standardized blocks with different physical properties. However, our trained controller can generalize to daily objects with different sizes, shapes, materials, and textures. The experimental results demonstrate the effectiveness and robustness of the proposed approach. We compare LeTac-MPC with two purely model-based tactile-reactive controllers (MPC and PD) and open-loop grasping. Our results show that LeTac-MPC has optimal performance in dynamic and force-interactive tasks and optimal generalizability. We release our code and dataset at https://github.com/ZhengtongXu/LeTac-MPC.
NeB-SLAM: Neural Blocks-based Scalable RGB-D SLAM for Unknown Scenes
Neural implicit representations have recently demonstrated considerable potential in the field of visual simultaneous localization and mapping (SLAM). This is due to their inherent advantages, including low storage overhead and representation continuity. However, these methods necessitate the size of the scene as input, which is impractical for unknown scenes. Consequently, we propose NeB-SLAM, a neural block-based scalable RGB-D SLAM for unknown scenes. Specifically, we first propose a divide-and-conquer mapping strategy that represents the entire unknown scene as a set of sub-maps. These sub-maps are a set of neural blocks of fixed size. Then, we introduce an adaptive map growth strategy to achieve adaptive allocation of neural blocks during camera tracking and gradually cover the whole unknown scene. Finally, extensive evaluations on various datasets demonstrate that our method is competitive in both mapping and tracking when targeting unknown environments.
Providing Safety Assurances for Systems with Unknown Dynamics
As autonomous systems become more complex and integral in our society, the need to accurately model and safely control these systems has increased significantly. In the past decade, there has been tremendous success in using deep learning techniques to model and control systems that are difficult to model using first principles. However, providing safety assurances for such systems remains difficult, partially due to the uncertainty in the learned model. In this work, we aim to provide safety assurances for systems whose dynamics are not readily derived from first principles and, hence, are better learned using deep learning techniques. Given the system of interest and safety constraints, we learn an ensemble model of the system dynamics from data. Leveraging ensemble uncertainty as a measure of uncertainty in the learned dynamics model, we compute a maximal robust control invariant set, starting from which the system is guaranteed to satisfy the safety constraints under the condition that realized model uncertainties are contained in the predefined set of admissible model uncertainty. We demonstrate the effectiveness of our method using a simulated case study with an inverted pendulum and a hardware experiment with a TurtleBot. The experiments show that our method robustifies the control actions of the system against model uncertainty and generates safe behaviors without being overly restrictive. The code and accompanying videos can be found on the project website.
comment: Accepted to L-CSS and CDC 2024
Multiagent Systems
Adaptation Procedure in Misinformation Games
We study interactions between agents in multi-agent systems, in which the agents are misinformed with regard to the game that they play, essentially having a subjective and incorrect understanding of the setting, without being aware of it. For that, we introduce a new game-theoretic concept, called misinformation games, that provides the necessary toolkit to study this situation. Subsequently, we enhance this framework by developing a time-discrete procedure (called the Adaptation Procedure) that captures iterative interactions in the above context. During the Adaptation Procedure, the agents update their information and reassess their behaviour in each step. We demonstrate our ideas through an implementation, which is used to study the efficiency and characteristics of the Adaptation Procedure.
Identification of LFT Structured Descriptor Systems with Slow and Non-uniform Sampling
Time domain identification is studied in this paper for parameters of a continuous-time multi-input multi-output descriptor system, with these parameters affecting system matrices through a linear fractional transformation. Sampling is permitted to be slow and non-uniform, and the Nyquist frequency restrictions need not be satisfied. This model can be used to describe the behavior of a networked dynamic system, and the obtained results can be straightforwardly applied to an ordinary state-space model, as well as a lumped system. Explicit formulas are obtained for the transient and steady-state responses of the system stimulated by an arbitrary signal. Some relations have been derived between the system steady-state response and its transfer function matrix (TFM), which reveal that the value of a TFM at almost any point of interest, as well as its derivatives and a right tangential interpolation along an arbitrary direction, can in principle be estimated from input-output experimental data. Based on these relations, estimation algorithms are suggested for the parameters of the descriptor system and for the values of its TFM. Their properties, such as asymptotic unbiasedness and consistency, are analyzed. A simple numerical example is included to illustrate characteristics of the suggested estimation algorithms.
comment: 17 pages
Quantifying Misalignment Between Agents: Towards a Sociotechnical Understanding of Alignment AAAI-25
Existing work on the alignment problem has focused mainly on (1) qualitative descriptions of the alignment problem; (2) attempting to align AI actions with human interests by focusing on value specification and learning; and/or (3) focusing on a single agent or on humanity as a monolith. Recent sociotechnical approaches highlight the need to understand complex misalignment among multiple human and AI agents. We address this gap by adapting a computational social science model of human contention to the alignment problem. Our model quantifies misalignment in large, diverse agent groups with potentially conflicting goals across various problem areas. Misalignment scores in our framework depend on the observed agent population, the domain in question, and conflict between agents' weighted preferences. Through simulations, we demonstrate how our model captures intuitive aspects of misalignment across different scenarios. We then apply our model to two case studies, including an autonomous vehicle setting, showcasing its practical utility. Our approach offers enhanced explanatory power for complex sociotechnical environments and could inform the design of more aligned AI systems in real-world applications.
comment: 7 pages, 8 figures, 3 tables, submitted to AAAI-25
Systems and Control (CS)
Protecting residential electrical panels and service through model predictive control: A field study
Residential electrification - replacing fossil-fueled appliances and vehicles with electric machines - can significantly reduce greenhouse gas emissions and air pollution. However, installing electric appliances or vehicle charging in a residential building can sharply increase its current draws. In older housing, high current draws can jeopardize electrical infrastructure, such as circuit breaker panels or electrical service (the wires that connect a building to the distribution grid). Upgrading electrical infrastructure can entail long delays and high costs, so poses a significant barrier to electrification. This paper develops and field-tests a control system that avoids the need for electrical upgrades by keeping an electrified home's total current draw within the safe limits of its panel and service. In the proposed control architecture, a high-level controller plans device set-points over a rolling prediction horizon. A low-level controller monitors real-time conditions and ramps down devices if necessary. The control system was tested in an occupied, electrified single-family house with code-minimum insulation, an air-to-air heat pump and backup resistance heat, a resistance water heater, and a plug-in hybrid electric vehicle with Level I charging. The field tests spanned 31 winter days with outdoor temperatures as low as -20 C. The control system maintained the whole-home current within the safe limits of electrical panels and service rated at 100 A, a common rating for older houses in North America, by adjusting only the temperature set-points of the heat pump and water heater. Simulations suggest that the same 100 A limit could accommodate a second electric vehicle with Level II charging. The proposed control system could allow older homes to safely electrify without upgrading electrical panels or service, saving a typical household on the order of $2,000 to $10,000.
Reinforcement Learning for Rate Maximization in IRS-aided OWC Networks
Optical wireless communication (OWC) is envisioned as one of the main enabling technologies of 6G networks, complementing radio frequency (RF) systems to provide high data rates. One of the crucial issues in indoor OWC is service interruptions due to blockages that obstruct the line of sight (LoS) between users and their access points (APs). Recently, reflecting surfaces referred to as intelligent reflecting surfaces (IRSs) have been considered to provide improved connectivity in OWC systems by reflecting AP signals toward users. In this study, we investigate the integration of IRSs into an indoor OWC system to improve the sum rate of the users and to ensure service continuity. We formulate an optimization problem for sum rate maximization, where the allocation of both APs and mirror elements of IRSs to users is determined to enhance the aggregate data rate. Moreover, reinforcement learning (RL) algorithms, specifically Q-learning and SARSA algorithms, are proposed to provide real-time solutions with low complexity and without prior system knowledge. The results show that the RL algorithms achieve near-optimal solutions that are close to the solutions of mixed integer linear programming (MILP). The results also show that the proposed scheme achieves up to a 45% increase in data rate compared to a traditional scheme that optimizes only the allocation of APs while the mirror elements are assigned to users based on the distance.
comment: 6 pages, 5 figures
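The abstract above names Q-learning and SARSA but does not spell out how they differ. The sketch below shows the two standard tabular update rules side by side. The state and action encoding, the reward, and the `alpha` and `gamma` values are hypothetical placeholders, not taken from the paper; in the OWC setting a state might encode blockage conditions and an action an AP or mirror allocation, but that mapping is an assumption.

```python
def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max(q[s_next].values())
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    q[s][a] += alpha * (r + gamma * q[s_next][a_next] - q[s][a])


def make_table():
    # Hypothetical two-state, two-action Q-table (illustration only).
    return {0: {0: 0.0, 1: 0.0}, 1: {0: 1.0, 1: 2.0}}


q_off = make_table()
q_learning_update(q_off, s=0, a=0, r=1.0, s_next=1)

q_on = make_table()
sarsa_update(q_on, s=0, a=0, r=1.0, s_next=1, a_next=0)

print(q_off[0][0], q_on[0][0])
```

The only difference is the bootstrap target: Q-learning uses the maximum over next actions, while SARSA uses the value of the next action the policy actually selected, so the two diverge whenever the policy explores.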
Barrier Integral Control for Global Asymptotic Stabilization of Uncertain Nonlinear Systems under Smooth Feedback and Transient Constraints
This paper addresses the problem of asymptotic stabilization for high-order control-affine MIMO nonlinear systems with unknown dynamic terms. We introduce Barrier Integral Control, a novel algorithm designed to confine the system's state within a predefined funnel, ensuring adherence to prescribed transient constraints, and asymptotically drive it to zero from any initial condition. The algorithm leverages the innovative integration of a reciprocal barrier function and an error-integral term, featuring smooth feedback control. Notably, it operates without relying on any information or approximation schemes for the (unknown) dynamic terms, which, unlike a large class of previous works, are not assumed to be bounded or to comply with globally Lipschitz/growth conditions. Additionally, the system's trajectory and asymptotic performance are decoupled from the uncertain model, control-gain selection, and initial conditions. Finally, comparative simulation studies validate the effectiveness of the proposed algorithm.
comment: First version
Continuous-Time Online Distributed Seeking for Generalized Nash Equilibrium of Nonmonotone Online Game
This paper mainly investigates a class of distributed generalized Nash equilibrium (GNE) seeking problems for online nonmonotone game with time-varying coupling inequality constraints. Based on a time-varying control gain, a novel continuous-time distributed GNE seeking algorithm is proposed, which realizes the constant regret bound and sublinear fit bound, matching those of the criteria for online optimization problems. Furthermore, to reduce unnecessary communication among players, a dynamic event-triggered mechanism involving internal variables is introduced into the distributed GNE seeking algorithm, while the constant regret bound and sublinear fit bound are still achieved. Also, the Zeno behavior is strictly prohibited. Finally, a numerical example is given to demonstrate the validity of the theoretical results.
Urban traffic analysis and forecasting through shared Koopman eigenmodes
Predicting traffic flow in data-scarce cities is challenging due to limited historical data. To address this, we leverage transfer learning by identifying periodic patterns common to data-rich cities using a customized variant of Dynamic Mode Decomposition (DMD): constrained Hankelized DMD (TrHDMD). This method uncovers common eigenmodes (urban heartbeats) in traffic patterns and transfers them to data-scarce cities, significantly enhancing prediction performance. TrHDMD reduces the need for extensive training datasets by utilizing prior knowledge from other cities. By applying Koopman operator theory to multi-city loop detector data, we identify stable, interpretable, and time-invariant traffic modes. Injecting "urban heartbeats" into forecasting tasks improves prediction accuracy and has the potential to enhance traffic management strategies for cities with varying data infrastructures. Our work introduces cross-city knowledge transfer via shared Koopman eigenmodes, offering actionable insights and reliable forecasts for data-scarce urban environments.
Physical Design: Methodologies and Developments
The design and production of VLSI chips is a multilevel hierarchical process. As the demand for reduced die area and smaller technology nodes becomes prevalent, it gets increasingly challenging to optimize Power, Performance and Area (PPA) parameters to accommodate the ever-increasing core logic on a chip. A well-defined hierarchical flow is thus essential to the VLSI design process. A robust hierarchical flow should encompass all stages, from Gate-level RTL Synthesis (Front End Design) to Logic Placement and Verification (Back End Physical Design), finally culminating in tapeout / production. Physical Design in this flow is the process of translating a logical circuit description into a physically realizable GDSII form. This involves defining the best possible placement and routing for standard cells, macros and I/Os in the design to optimize PPA for any given netlist. This paper captures the details of the methodologies and algorithms that are pertinent to building and optimizing an efficient and robust physical design flow in the VLSI chip-design process.
Optimal decentralized wavelength control in light sources for lithography
Pulsed light sources are a critical component of modern lithography, with fine light beam wavelength control paramount for wafer etching accuracy. We study optimal wavelength control by casting it as a decentralized linear quadratic Gaussian (LQG) problem in the presence of time delays. In particular, we consider the multi-optics module (optics and actuators) used for generating the requisite wavelength in light sources as cooperatively interacting systems defined over a directed acyclic graph (DAG). We show that any measurement and other continuous time delays can be exactly compensated, and the resulting optimal controller implementation at the individual optics level outperforms any existing wavelength control techniques.
Learning Optimal Stable Matches in Decentralized Markets with Unknown Preferences
Matching algorithms have demonstrated great success in several practical applications, but they often require centralized coordination and plentiful information. In many modern online marketplaces, agents must independently seek out and match with another using little to no information. For these kinds of settings, can we design decentralized, limited-information matching algorithms that preserve the desirable properties of standard centralized techniques? In this work, we constructively answer this question in the affirmative. We model a two-sided matching market as a game consisting of two disjoint sets of agents, referred to as proposers and acceptors, each of whom seeks to match with their most preferable partner on the opposite side of the market. However, each proposer has no knowledge of their own preferences, so they must learn their preferences while forming matches in the market. We present a simple online learning rule that guarantees a strong notion of probabilistic convergence to the welfare-maximizing equilibrium of the game, referred to as the proposer-optimal stable match. To the best of our knowledge, this represents the first completely decoupled, communication-free algorithm that guarantees probabilistic convergence to an optimal stable match, irrespective of the structure of the matching market.
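For reference, the proposer-optimal stable match mentioned above is the outcome that the classical centralized deferred-acceptance (Gale-Shapley) procedure produces when preferences are fully known. The sketch below implements that centralized benchmark only; it is not the decentralized learning rule of the paper. The agent names and preference lists are hypothetical, and every acceptor is assumed to rank every proposer.

```python
def deferred_acceptance(proposer_prefs, acceptor_prefs):
    """Centralized deferred acceptance; returns a dict proposer -> acceptor."""
    # rank[a][p] = position of proposer p in acceptor a's list (lower is better)
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next index each proposer tries
    engaged = {}                                  # acceptor -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(a)
        if current is None:
            engaged[a] = p            # a was free: tentatively accept
        elif rank[a][p] < rank[a][current]:
            engaged[a] = p            # a prefers p: swap and release current
            free.append(current)
        else:
            free.append(p)            # a rejects p: p proposes again later
    return {p: a for a, p in engaged.items()}


# Hypothetical market: both proposers prefer a1, and a1 prefers p2.
proposer_prefs = {"p1": ["a1", "a2"], "p2": ["a1", "a2"]}
acceptor_prefs = {"a1": ["p2", "p1"], "a2": ["p1", "p2"]}
match = deferred_acceptance(proposer_prefs, acceptor_prefs)
print(match)
```

In the setting of the abstract, proposers do not know `proposer_prefs` in advance, which is why a learning rule is needed in place of this direct computation.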
DEEP-IoT: Downlink-Enhanced Efficient-Power Internet of Things
At the heart of the Internet of Things (IoT) -- a domain witnessing explosive growth -- the imperative for energy efficiency and the extension of device lifespans has never been more pressing. This paper presents DEEP-IoT, an innovative communication paradigm poised to redefine how IoT devices communicate. Through a pioneering feedback channel coding strategy, DEEP-IoT challenges and transforms the traditional transmitter (IoT devices)-centric communication model to one where the receiver (the access point) plays a pivotal role, thereby cutting down energy use and boosting device longevity. We not only conceptualize DEEP-IoT but also actualize it by integrating deep learning-enhanced feedback channel codes within a narrow-band system. Simulation results show a significant enhancement in the operational lifespan of IoT cells -- surpassing traditional systems using Turbo and Polar codes by up to 52.71%. This leap signifies a paradigm shift in IoT communications, setting the stage for a future where IoT devices boast unprecedented efficiency and durability.
Polynomial Logical Zonotope: A Set Representation for Reachability Analysis of Logical Systems
In this paper, we introduce a set representation called polynomial logical zonotopes for performing exact and computationally efficient reachability analysis on logical systems. We prove that through this polynomial-like construction, we are able to perform all of the fundamental logical operations (XOR, NOT, XNOR, AND, NAND, OR, NOR) between sets of points exactly in a reduced space, i.e., generator space with reduced complexity. Polynomial logical zonotopes are a generalization of logical zonotopes, which are able to represent up to $2^n$ binary vectors using only $n$ generators. Due to their construction, logical zonotopes are only able to support exact computations of some logical operations (XOR, NOT, XNOR), while other operations (AND, NAND, OR, NOR) result in over-approximations in the generator space. In order to perform all fundamental logical operations exactly, we formulate a generalization of logical zonotopes that is constructed by dependent generators and exponent matrices. While we are able to perform all of the logical operations exactly, this comes with a slight increase in computational complexity compared to logical zonotopes. To illustrate and showcase the computational benefits of polynomial logical zonotopes, we present the results of performing reachability analysis on two use cases: (1) safety verification of an intersection crossing protocol and (2) reachability analysis on a high-dimensional Boolean function. Moreover, to highlight the extensibility of logical zonotopes, we include an additional use case where we perform a computationally tractable exhaustive search for the key of a linear feedback shift register.
comment: This paper is accepted in Automatica. arXiv admin note: substantial text overlap with arXiv:2210.08596
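To make the claim about exact XOR concrete, here is a minimal sketch of an ordinary logical zonotope (not the polynomial generalization). It assumes the usual definition, which the abstract does not state: the set of all vectors c XOR (b_1 g_1) XOR ... XOR (b_n g_n) with each b_i in {0, 1}. The function names and the example vectors are hypothetical.

```python
from itertools import product


def points(center, generators):
    """Enumerate every binary vector represented by (center, generators)."""
    result = set()
    for betas in product((0, 1), repeat=len(generators)):
        p = list(center)
        for beta, g in zip(betas, generators):
            if beta:  # a selected generator is XORed into the point
                p = [a ^ b for a, b in zip(p, g)]
        result.add(tuple(p))
    return result


def xor_sets(z1, z2):
    """Set-wise XOR in generator space: XOR the centers, concatenate generators."""
    (c1, g1), (c2, g2) = z1, z2
    return ([a ^ b for a, b in zip(c1, c2)], g1 + g2)


# Hypothetical 3-bit example: z1 holds 4 points, z2 holds 2 points.
z1 = ([0, 0, 1], [[1, 0, 0], [0, 1, 0]])
z2 = ([1, 0, 0], [[0, 0, 1]])

# Brute-force reference: XOR every pair of points from the two sets.
brute = {tuple(a ^ b for a, b in zip(p, q))
         for p in points(*z1) for q in points(*z2)}

print(points(*xor_sets(z1, z2)) == brute)
```

The generator-space result is computed without enumerating any points, which is where the efficiency comes from; the enumeration here serves only to check the result on a small case.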
Monte Carlo Grid Dynamic Programming: Almost Sure Convergence and Probability Constraints
Dynamic Programming (DP) suffers from the well-known "curse of dimensionality", further exacerbated by the need to compute expectations over process noise in stochastic models. This paper presents a Monte Carlo-based sampling approach for the state space and an interpolation procedure for the resulting value function, dependent on the process noise density, in a "self-approximating" fashion, eliminating the need for ordering or set-membership tests. We provide proof of almost sure convergence for the value iteration (and consequently, policy iteration) procedure. The proposed meshless sampling and interpolation algorithm alleviates the burden of gridding the state space, traditionally required in DP, and avoids constructing a piecewise constant value function over a grid. Moreover, we demonstrate that the proposed interpolation procedure is well-suited for handling probabilistic constraints by sampling both infeasible and feasible regions. The curse of dimensionality cannot be avoided; however, this approach offers a practical framework for addressing lower-order stochastic nonlinear systems with probabilistic constraints, while eliminating the need for linear interpolations and set-membership tests. Numerical examples are presented to further explain and illustrate the convenience of the proposed algorithms.
comment: 6 pages, 1 figure
Heterogeneous Unmanned Aerial Vehicles Cooperative Search Approach for Complex Environments
This paper studies a heterogeneous Unmanned Aerial Vehicles (UAVs) cooperative search approach suitable for complex environments. In the application, a fixed-wing UAV drops rotor UAVs to deploy the cluster rapidly. Meanwhile, the fixed-wing UAV works as a communication relay node to further improve the search performance of the cluster. Distributed model predictive control and genetic algorithms are adopted to make online intelligent decisions on the UAVs' search directions. On this basis, a jump grid decision method is proposed to satisfy the maneuverability constraints of UAVs, a parameter dynamic selection method is developed to make search decisions more responsive to task requirements, and a search information transmission method with low bandwidth is designed. This approach can enable UAVs to discover targets quickly, cope with various constraints and unexpected situations, and make adaptive decisions, significantly improving the robustness of search tasks in complex, dynamic, and unknown environments. The proposed approach is tested with several search scenarios, and simulation results show that the cooperative search performance of heterogeneous UAVs is significantly improved compared to homogeneous UAVs.
comment: 26 pages, 26 figures
Providing Safety Assurances for Systems with Unknown Dynamics
As autonomous systems become more complex and integral in our society, the need to accurately model and safely control these systems has increased significantly. In the past decade, there has been tremendous success in using deep learning techniques to model and control systems that are difficult to model using first principles. However, providing safety assurances for such systems remains difficult, partially due to the uncertainty in the learned model. In this work, we aim to provide safety assurances for systems whose dynamics are not readily derived from first principles and, hence, are more advantageous to be learned using deep learning techniques. Given the system of interest and safety constraints, we learn an ensemble model of the system dynamics from data. Leveraging ensemble uncertainty as a measure of uncertainty in the learned dynamics model, we compute a maximal robust control invariant set, starting from which the system is guaranteed to satisfy the safety constraints under the condition that realized model uncertainties are contained in the predefined set of admissible model uncertainty. We demonstrate the effectiveness of our method using a simulated case study with an inverted pendulum and a hardware experiment with a TurtleBot. The experiments show that our method robustifies the control actions of the system against model uncertainty and generates safe behaviors without being overly restrictive. The codes and accompanying videos can be found on the project website.
comment: Accepted to L-CSS and CDC 2024
Systems and Control (EESS)
Protecting residential electrical panels and service through model predictive control: A field study
Residential electrification - replacing fossil-fueled appliances and vehicles with electric machines - can significantly reduce greenhouse gas emissions and air pollution. However, installing electric appliances or vehicle charging in a residential building can sharply increase its current draws. In older housing, high current draws can jeopardize electrical infrastructure, such as circuit breaker panels or electrical service (the wires that connect a building to the distribution grid). Upgrading electrical infrastructure can entail long delays and high costs, so poses a significant barrier to electrification. This paper develops and field-tests a control system that avoids the need for electrical upgrades by keeping an electrified home's total current draw within the safe limits of its panel and service. In the proposed control architecture, a high-level controller plans device set-points over a rolling prediction horizon. A low-level controller monitors real-time conditions and ramps down devices if necessary. The control system was tested in an occupied, electrified single-family house with code-minimum insulation, an air-to-air heat pump and backup resistance heat, a resistance water heater, and a plug-in hybrid electric vehicle with Level I charging. The field tests spanned 31 winter days with outdoor temperatures as low as -20 C. The control system maintained the whole-home current within the safe limits of electrical panels and service rated at 100 A, a common rating for older houses in North America, by adjusting only the temperature set-points of the heat pump and water heater. Simulations suggest that the same 100 A limit could accommodate a second electric vehicle with Level II charging. The proposed control system could allow older homes to safely electrify without upgrading electrical panels or service, saving a typical household on the order of $2,000 to $10,000.
Reinforcement Learning for Rate Maximization in IRS-aided OWC Networks
Optical wireless communication (OWC) is envisioned as one of the main enabling technologies of 6G networks, complementing radio frequency (RF) systems to provide high data rates. One of the crucial issues in indoor OWC is service interruptions due to blockages that obstruct the line of sight (LoS) between users and their access points (APs). Recently, reflecting surfaces referred to as intelligent reflecting surfaces (IRSs) have been considered to provide improved connectivity in OWC systems by reflecting AP signals toward users. In this study, we investigate the integration of IRSs into an indoor OWC system to improve the sum rate of the users and to ensure service continuity. We formulate an optimization problem for sum rate maximization, where the allocation of both APs and mirror elements of IRSs to users is determined to enhance the aggregate data rate. Moreover, reinforcement learning (RL) algorithms, specifically Q-learning and SARSA algorithms, are proposed to provide real-time solutions with low complexity and without prior system knowledge. The results show that the RL algorithms achieve near-optimal solutions that are close to the solutions of mixed integer linear programming (MILP). The results also show that the proposed scheme achieves up to a 45% increase in data rate compared to a traditional scheme that optimizes only the allocation of APs while the mirror elements are assigned to users based on the distance.
comment: 6 pages, 5 figures
Barrier Integral Control for Global Asymptotic Stabilization of Uncertain Nonlinear Systems under Smooth Feedback and Transient Constraints
This paper addresses the problem of asymptotic stabilization for high-order control-affine MIMO nonlinear systems with unknown dynamic terms. We introduce Barrier Integral Control, a novel algorithm designed to confine the system's state within a predefined funnel, ensuring adherence to prescribed transient constraints, and asymptotically drive it to zero from any initial condition. The algorithm leverages the innovative integration of a reciprocal barrier function and an error-integral term, featuring smooth feedback control. Notably, it operates without relying on any information or approximation schemes for the (unknown) dynamic terms, which, unlike a large class of previous works, are not assumed to be bounded or to comply with globally Lipschitz/growth conditions. Additionally, the system's trajectory and asymptotic performance are decoupled from the uncertain model, control-gain selection, and initial conditions. Finally, comparative simulation studies validate the effectiveness of the proposed algorithm.
comment: First version
Continuous-Time Online Distributed Seeking for Generalized Nash Equilibrium of Nonmonotone Online Game
This paper mainly investigates a class of distributed generalized Nash equilibrium (GNE) seeking problems for online nonmonotone game with time-varying coupling inequality constraints. Based on a time-varying control gain, a novel continuous-time distributed GNE seeking algorithm is proposed, which realizes the constant regret bound and sublinear fit bound, matching those of the criteria for online optimization problems. Furthermore, to reduce unnecessary communication among players, a dynamic event-triggered mechanism involving internal variables is introduced into the distributed GNE seeking algorithm, while the constant regret bound and sublinear fit bound are still achieved. Also, the Zeno behavior is strictly prohibited. Finally, a numerical example is given to demonstrate the validity of the theoretical results.
Urban traffic analysis and forecasting through shared Koopman eigenmodes
Predicting traffic flow in data-scarce cities is challenging due to limited historical data. To address this, we leverage transfer learning by identifying periodic patterns common to data-rich cities using a customized variant of Dynamic Mode Decomposition (DMD): constrained Hankelized DMD (TrHDMD). This method uncovers common eigenmodes (urban heartbeats) in traffic patterns and transfers them to data-scarce cities, significantly enhancing prediction performance. TrHDMD reduces the need for extensive training datasets by utilizing prior knowledge from other cities. By applying Koopman operator theory to multi-city loop detector data, we identify stable, interpretable, and time-invariant traffic modes. Injecting "urban heartbeats" into forecasting tasks improves prediction accuracy and has the potential to enhance traffic management strategies for cities with varying data infrastructures. Our work introduces cross-city knowledge transfer via shared Koopman eigenmodes, offering actionable insights and reliable forecasts for data-scarce urban environments.
Physical Design: Methodologies and Developments
The design and production of VLSI chips is a multilevel hierarchical process. As the demand for reduced die area and smaller technology nodes becomes prevalent, it gets increasingly challenging to optimize Power, Performance and Area (PPA) parameters to accommodate the ever-increasing core logic on a chip. A well-defined hierarchical flow is thus essential to the VLSI design process. A robust hierarchical flow should encompass all stages, from Gate-level RTL Synthesis (Front End Design) to Logic Placement and Verification (Back End Physical Design), finally culminating in tapeout / production. Physical Design in this flow is the process of translating a logical circuit description into a physically realizable GDSII form. This involves defining the best possible placement and routing for standard cells, macros and I/Os in the design to optimize PPA for any given netlist. This paper captures the details of the methodologies and algorithms that are pertinent to building and optimizing an efficient and robust physical design flow in the VLSI chip-design process.
Optimal decentralized wavelength control in light sources for lithography
Pulsed light sources are a critical component of modern lithography, with fine light beam wavelength control paramount for wafer etching accuracy. We study optimal wavelength control by casting it as a decentralized linear quadratic Gaussian (LQG) problem in the presence of time delays. In particular, we consider the multi-optics module (optics and actuators) used for generating the requisite wavelength in light sources as cooperatively interacting systems defined over a directed acyclic graph (DAG). We show that any measurement and other continuous time delays can be exactly compensated, and the resulting optimal controller implementation at the individual optics level outperforms any existing wavelength control techniques.
Learning Optimal Stable Matches in Decentralized Markets with Unknown Preferences
Matching algorithms have demonstrated great success in several practical applications, but they often require centralized coordination and plentiful information. In many modern online marketplaces, agents must independently seek out and match with another using little to no information. For these kinds of settings, can we design decentralized, limited-information matching algorithms that preserve the desirable properties of standard centralized techniques? In this work, we constructively answer this question in the affirmative. We model a two-sided matching market as a game consisting of two disjoint sets of agents, referred to as proposers and acceptors, each of whom seeks to match with their most preferable partner on the opposite side of the market. However, each proposer has no knowledge of their own preferences, so they must learn their preferences while forming matches in the market. We present a simple online learning rule that guarantees a strong notion of probabilistic convergence to the welfare-maximizing equilibrium of the game, referred to as the proposer-optimal stable match. To the best of our knowledge, this represents the first completely decoupled, communication-free algorithm that guarantees probabilistic convergence to an optimal stable match, irrespective of the structure of the matching market.
DEEP-IoT: Downlink-Enhanced Efficient-Power Internet of Things
At the heart of the Internet of Things (IoT) -- a domain witnessing explosive growth -- the imperative for energy efficiency and the extension of device lifespans has never been more pressing. This paper presents DEEP-IoT, an innovative communication paradigm poised to redefine how IoT devices communicate. Through a pioneering feedback channel coding strategy, DEEP-IoT challenges and transforms the traditional transmitter (IoT devices)-centric communication model to one where the receiver (the access point) plays a pivotal role, thereby cutting down energy use and boosting device longevity. We not only conceptualize DEEP-IoT but also actualize it by integrating deep learning-enhanced feedback channel codes within a narrow-band system. Simulation results show a significant enhancement in the operational lifespan of IoT cells -- surpassing traditional systems using Turbo and Polar codes by up to 52.71%. This leap signifies a paradigm shift in IoT communications, setting the stage for a future where IoT devices boast unprecedented efficiency and durability.
Polynomial Logical Zonotope: A Set Representation for Reachability Analysis of Logical Systems
In this paper, we introduce a set representation called polynomial logical zonotopes for performing exact and computationally efficient reachability analysis on logical systems. We prove that through this polynomial-like construction, we are able to perform all of the fundamental logical operations (XOR, NOT, XNOR, AND, NAND, OR, NOR) between sets of points exactly in a reduced space, i.e., generator space with reduced complexity. Polynomial logical zonotopes are a generalization of logical zonotopes, which are able to represent up to $2^n$ binary vectors using only $n$ generators. Due to their construction, logical zonotopes are only able to support exact computations of some logical operations (XOR, NOT, XNOR), while other operations (AND, NAND, OR, NOR) result in over-approximations in the generator space. In order to perform all fundamental logical operations exactly, we formulate a generalization of logical zonotopes that is constructed by dependent generators and exponent matrices. While we are able to perform all of the logical operations exactly, this comes with a slight increase in computational complexity compared to logical zonotopes. To illustrate and showcase the computational benefits of polynomial logical zonotopes, we present the results of performing reachability analysis on two use cases: (1) safety verification of an intersection crossing protocol and (2) reachability analysis on a high-dimensional Boolean function. Moreover, to highlight the extensibility of logical zonotopes, we include an additional use case where we perform a computationally tractable exhaustive search for the key of a linear feedback shift register.
comment: This paper is accepted in Automatica. arXiv admin note: substantial text overlap with arXiv:2210.08596
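To make the "up to $2^n$ binary vectors using only $n$ generators" statement concrete, the sketch below enumerates a plain (non-polynomial) logical zonotope and checks that XOR between two such sets is exact when computed in generator space. It assumes the usual definition, a center XOR-ed with every subset of the generators, which the abstract does not spell out. The polynomial variant with dependent generators and exponent matrices is not reproduced here, and all names are illustrative.

```python
from itertools import product


def xor_vec(u, v):
    """Element-wise XOR of two equal-length binary tuples."""
    return tuple(a ^ b for a, b in zip(u, v))


def points(center, generators):
    """Enumerate {c XOR b_1*g_1 XOR ... XOR b_n*g_n : b_i in {0, 1}}."""
    out = set()
    for betas in product((0, 1), repeat=len(generators)):
        x = center
        for b, g in zip(betas, generators):
            if b:
                x = xor_vec(x, g)
        out.add(x)
    return out


def xor_sets(z1, z2):
    """Generator-space XOR: XOR the centers, concatenate the generator lists."""
    (c1, g1), (c2, g2) = z1, z2
    return xor_vec(c1, c2), g1 + g2


z1 = ((0, 0, 1), [(1, 0, 0), (0, 1, 0)])  # 2 generators -> 4 points
z2 = ((1, 0, 0), [(0, 0, 1)])             # 1 generator  -> 2 points

# Brute-force XOR over all point pairs, for comparison.
brute_force = {xor_vec(x, y) for x in points(*z1) for y in points(*z2)}
in_generator_space = points(*xor_sets(z1, z2))
```

The generator-space result uses three generators instead of eight explicit point pairs, yet describes the same set.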
Monte Carlo Grid Dynamic Programming: Almost Sure Convergence and Probability Constraints
Dynamic Programming (DP) suffers from the well-known "curse of dimensionality", further exacerbated by the need to compute expectations over process noise in stochastic models. This paper presents a Monte Carlo-based sampling approach for the state space and an interpolation procedure for the resulting value function, dependent on the process noise density, in a "self-approximating" fashion, eliminating the need for ordering or set-membership tests. We provide proof of almost sure convergence for the value iteration (and consequently, policy iteration) procedure. The proposed meshless sampling and interpolation algorithm alleviates the burden of gridding the state space, traditionally required in DP, and avoids constructing a piecewise constant value function over a grid. Moreover, we demonstrate that the proposed interpolation procedure is well-suited for handling probabilistic constraints by sampling both infeasible and feasible regions. The curse of dimensionality cannot be avoided; however, this approach offers a practical framework for addressing lower-order stochastic nonlinear systems with probabilistic constraints, while eliminating the need for linear interpolations and set-membership tests. Numerical examples are presented to further explain and illustrate the convenience of the proposed algorithms.
comment: 6 pages, 1 figure
Heterogeneous Unmanned Aerial Vehicles Cooperative Search Approach for Complex Environments
This paper studies a heterogeneous Unmanned Aerial Vehicles (UAVs) cooperative search approach suitable for complex environments. In the application, a fixed-wing UAV drops rotor UAVs to deploy the cluster rapidly. Meanwhile, the fixed-wing UAV works as a communication relay node to further improve the search performance of the cluster. Distributed model predictive control and genetic algorithms are adopted to make online intelligent decisions on the UAVs' search directions. On this basis, a jump grid decision method is proposed to satisfy the maneuverability constraints of UAVs, a dynamic parameter selection method is developed to make search decisions more responsive to task requirements, and a low-bandwidth search information transmission method is designed. This approach can enable UAVs to discover targets quickly, cope with various constraints and unexpected situations, and make adaptive decisions, significantly improving the robustness of search tasks in complex, dynamic, and unknown environments. The proposed approach is tested with several search scenarios, and simulation results show that the cooperative search performance of heterogeneous UAVs is significantly improved compared to homogeneous UAVs.
comment: 26 pages, 26 figures
Providing Safety Assurances for Systems with Unknown Dynamics
As autonomous systems become more complex and integral in our society, the need to accurately model and safely control these systems has increased significantly. In the past decade, there has been tremendous success in using deep learning techniques to model and control systems that are difficult to model using first principles. However, providing safety assurances for such systems remains difficult, partially due to the uncertainty in the learned model. In this work, we aim to provide safety assurances for systems whose dynamics are not readily derived from first principles and, hence, are better learned using deep learning techniques. Given the system of interest and safety constraints, we learn an ensemble model of the system dynamics from data. Leveraging ensemble uncertainty as a measure of uncertainty in the learned dynamics model, we compute a maximal robust control invariant set, starting from which the system is guaranteed to satisfy the safety constraints under the condition that realized model uncertainties are contained in the predefined set of admissible model uncertainty. We demonstrate the effectiveness of our method using a simulated case study with an inverted pendulum and a hardware experiment with a TurtleBot. The experiments show that our method robustifies the control actions of the system against model uncertainty and generates safe behaviors without being overly restrictive. The code and accompanying videos can be found on the project website.
comment: Accepted to L-CSS and CDC 2024
Robotics
Safe and Efficient Path Planning under Uncertainty via Deep Collision Probability Fields
Estimating collision probabilities between robots and environmental obstacles or other moving agents is crucial to ensure safety during path planning. This is an important building block of modern planning algorithms in many application scenarios such as autonomous driving, where noisy sensors perceive obstacles. While many approaches exist, they either provide overly conservative estimates of the collision probabilities or are computationally intensive due to their sampling-based nature. To deal with these issues, we introduce Deep Collision Probability Fields, a neural-based approach for computing collision probabilities of arbitrary objects with arbitrary unimodal uncertainty distributions. Our approach relegates the computationally intensive estimation of collision probabilities via sampling to the training step, allowing for fast neural network inference of the constraints during planning. In extensive experiments, we show that Deep Collision Probability Fields can produce reasonably accurate collision probabilities (up to 10^{-3}) for planning and that our approach can be easily plugged into standard path planning approaches to plan safe paths on 2-D maps containing uncertain static and dynamic obstacles. Additional material, code, and videos are available at https://sites.google.com/view/ral-dcpf.
comment: Preprint version of a paper accepted to the IEEE Robotics and Automation Letters
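The sampling-based estimation that the abstract moves to training time can be pictured with a plain Monte Carlo estimator. The sketch below is only that baseline idea, under simplifying assumptions that are not from the paper: a circular robot and a circular obstacle in 2-D, with isotropic Gaussian uncertainty on the obstacle position. All names and parameters are hypothetical.

```python
import math
import random


def mc_collision_probability(robot_xy, obstacle_mean, obstacle_std,
                             combined_radius, n_samples=20_000, seed=0):
    """Estimate P(collision) by sampling the uncertain obstacle position.

    A collision is counted when the sampled obstacle center lies within
    combined_radius (robot radius + obstacle radius) of the robot center.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    hits = 0
    for _ in range(n_samples):
        ox = rng.gauss(obstacle_mean[0], obstacle_std)
        oy = rng.gauss(obstacle_mean[1], obstacle_std)
        if math.hypot(ox - robot_xy[0], oy - robot_xy[1]) <= combined_radius:
            hits += 1
    return hits / n_samples


p_far = mc_collision_probability((0.0, 0.0), (100.0, 0.0), 1.0, 1.0)
p_mid = mc_collision_probability((0.0, 0.0), (1.0, 0.0), 1.0, 1.0)
p_near = mc_collision_probability((0.0, 0.0), (0.0, 0.0), 0.01, 1.0)
```

Every query costs `n_samples` distance checks, which is the per-query expense a learned field would replace with a single network evaluation.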
SPACE: A Python-based Simulator for Evaluating Decentralized Multi-Robot Task Allocation Algorithms
Swarm robotics explores the coordination of multiple robots to achieve collective goals, with collective decision-making being a central focus. This process involves decentralized robots autonomously making local decisions and communicating them, which influences the overall emergent behavior. Testing such decentralized algorithms in real-world scenarios with hundreds or more robots is often impractical, underscoring the need for effective simulation tools. We propose SPACE (Swarm Planning and Control Evaluation), a Python-based simulator designed to support the research, evaluation, and comparison of decentralized Multi-Robot Task Allocation (MRTA) algorithms. SPACE streamlines core algorithmic development by allowing users to implement decision-making algorithms as Python plug-ins, easily construct agent behavior trees via an intuitive GUI, and leverage built-in support for inter-agent communication and local task awareness. To demonstrate its practical utility, we implement and evaluate CBBA and GRAPE within the simulator, comparing their performance across different metrics, particularly in scenarios with dynamically introduced tasks. This evaluation shows the usefulness of SPACE in conducting rigorous and standardized comparisons of MRTA algorithms, helping to support future research in the field.
Introducing a Class-Aware Metric for Monocular Depth Estimation: An Automotive Perspective ECCV
Reports of increasing accuracy for metric monocular depth estimation models have led to growing interest from the automotive domain. Current model evaluations do not provide deeper insights into the models' performance, particularly in relation to safety-critical or unseen classes. In this paper, we present a novel approach for the evaluation of depth estimation models. Our proposed metric leverages three components: a class-wise component, an edge and corner image feature component, and a global consistency retaining component. Classes are further weighted by their distance in the scene and by their criticality for automotive applications. In the evaluation, we present the benefits of our metric through comparison to classical metrics, class-wise analytics, and the retrieval of critical situations. The results show that our metric provides deeper insights into model results while fulfilling safety-critical requirements. We release the code and weights in the following repository: https://github.com/leisemann/ca_mmde
comment: Accepted at the European Conference on Computer Vision (ECCV) 2024 Workshop on Out Of Distribution Generalization in Computer Vision
Design and Characterization of MRI-compatible Plastic Ultrasonic Motor
Precise surgical procedures may benefit from intra-operative image guidance using magnetic resonance imaging (MRI). However, the MRI's strong magnetic fields, fast switching gradients, and constrained space create the need for an MR-guided robotic system to assist the surgeon. Piezoelectric actuators can be used in an MRI environment by utilizing the inverse piezoelectric effect for different application purposes. The piezoelectric ultrasonic motor (USM) is one type of MRI-compatible actuator that can actuate these robots with fast response times, compactness, and a simple configuration. Although piezoelectric motors are mostly made of nonferromagnetic material, the generation of eddy currents due to the MRI's gradient fields can lead to magnetic field distortions that cause image artifacts. Motor vibrations due to interactions between the MRI's magnetic fields and those generated by the eddy currents can further degrade image quality by causing image artifacts. In this work, a plastic piezoelectric ultrasonic motor (USM) with a higher degree of MRI compatibility was developed and subjected to preliminary optimization. Multiple parameters, namely teeth number, notch size, edge shape (beveled or straight), and surface finish level, were varied against the prepressure in the experiments, and the results suggested that using 48 teeth, a thin teeth notch of 0.39 mm, a beveled edge, and a surface finish produced with sandpaper of approximately 1000 grit yielded better output in both rotary speed and torque. Under this combination, the highest speed reached 436.6665 rpm when the prepressure was low, and the highest torque reached 0.0348 Nm when the prepressure was approximately 500 g.
comment: 10 pages, 30 figures, 3 tables
Matched Filtering based LiDAR Place Recognition for Urban and Natural Environments
Place recognition is an important task within autonomous navigation, involving the re-identification of previously visited locations from an initial traverse. Unlike visual place recognition (VPR), LiDAR place recognition (LPR) is tolerant to changes in lighting, seasons, and textures, leading to high performance on benchmark datasets from structured urban environments. However, there is a growing need for methods that can operate in diverse environments with high performance and minimal training. In this paper, we propose a handcrafted matching strategy that performs roto-translation invariant place recognition and relative pose estimation for both urban and unstructured natural environments. Our approach constructs Bird's Eye View (BEV) global descriptors and employs a two-stage search using matched filtering -- a signal processing technique for detecting known signals amidst noise. Extensive testing on the NCLT, Oxford Radar, and WildPlaces datasets consistently demonstrates state-of-the-art (SoTA) performance across place recognition and relative pose estimation metrics, with up to 15% higher recall than previous SoTA.
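Matched filtering, as named in the abstract, amounts to correlating a known template against an observation and taking the peak of the response. The sketch below shows that idea on a one-dimensional ring of bins, where the peak recovers a circular shift (a stand-in for a yaw offset). It is only an illustration: the abstract does not describe how the BEV descriptors or the two-stage search are built, so the data layout and names here are assumptions.

```python
def circular_cross_correlation(signal, template):
    """Correlation score of template against signal for every circular shift."""
    n = len(signal)
    return [sum(signal[(i + s) % n] * template[i] for i in range(n))
            for s in range(n)]


def best_shift(signal, template):
    """Shift at which the matched-filter response peaks."""
    scores = circular_cross_correlation(signal, template)
    return max(range(len(scores)), key=scores.__getitem__)


template = [0, 0, 1, 3, 1, 0, 0, 0]
# Observation: the same pattern rotated by three bins.
observed = [template[(j - 3) % len(template)] for j in range(len(template))]
shift = best_shift(observed, template)
```

With noise added to `observed`, the peak still tends to sit at the true shift as long as the template energy dominates the noise, which is the property that makes the technique attractive for re-identifying places.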
Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance
In this work, we address the challenging problem of long-horizon goal-reaching policy learning from non-expert, action-free observation data. Unlike fully labeled expert data, our data is more accessible and avoids the costly process of action labeling. Additionally, compared to online learning, which often involves aimless exploration, our data provides useful guidance for more efficient exploration. To achieve our goal, we propose a novel subgoal guidance learning strategy. The motivation behind this strategy is that long-horizon goals offer limited guidance for efficient exploration and accurate state transition. We develop a diffusion strategy-based high-level policy to generate reasonable subgoals as waypoints, preferring states that more easily lead to the final goal. Additionally, we learn state-goal value functions to encourage efficient subgoal reaching. These two components naturally integrate into the off-policy actor-critic framework, enabling efficient goal attainment through informative exploration. We evaluate our method on complex robotic navigation and manipulation tasks, demonstrating a significant performance advantage over existing methods. Our ablation study further shows that our method is robust to observation data with various corruptions.
comment: Accepted to CoRL 2024
Development of Advanced FEM Simulation Technology for Pre-Operative Surgical Planning
Intracorporeal needle-based therapeutic ultrasound (NBTU) offers a minimally invasive approach for the thermal ablation of malignant brain tumors, including both primary and metastatic cancers. NBTU utilizes a high-frequency alternating electric field to excite a piezoelectric transducer, generating acoustic waves that cause localized heating and tumor cell ablation, and it provides a more precise ablation by delivering lower acoustic power doses directly to targeted tumors while sparing surrounding healthy tissue. Building on our previous work, this study introduces a database for optimizing pre-operative surgical planning by simulating ablation effects in varied tissue environments and develops an extended simulation model incorporating various tumor types and sizes to evaluate thermal damage under trans-tissue conditions. A comprehensive database is created from these simulations, detailing critical parameters such as CEM43 isodose maps, temperature changes, thermal dose areas, and maximum ablation distances for four directional probes. This database serves as a valuable resource for future studies, aiding in complex trajectory planning and parameter optimization for NBTU procedures. Moreover, a novel probe selection method is proposed to enhance pre-surgical planning, providing a strategic approach to selecting probes that maximize therapeutic efficiency and minimize ablation time. By avoiding unnecessary thermal propagation and optimizing probe angles, this method has the potential to improve patient outcomes and streamline surgical procedures. Overall, the findings of this study contribute significantly to the field of NBTU, offering a robust framework for enhancing treatment precision and efficacy in clinical settings.
comment: 8 pages, 17 figures, 2 tables
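The CEM43 isodose maps mentioned in the abstract rest on the cumulative-equivalent-minutes-at-43-degrees thermal dose. A minimal sketch of that dose for a sampled temperature history follows, using the commonly cited constants (R = 0.5 at or above 43 degrees Celsius, R = 0.25 below). The abstract does not state which constants or time step the study uses, so treat these values and the function name as assumptions.

```python
def cem43(temperatures_c, dt_minutes):
    """Cumulative equivalent minutes at 43 degrees C.

    temperatures_c: temperature samples in degrees Celsius, one per time step.
    dt_minutes:     duration of each time step in minutes.
    """
    total = 0.0
    for t in temperatures_c:
        r = 0.5 if t >= 43.0 else 0.25   # assumed standard constants
        total += dt_minutes * r ** (43.0 - t)
    return total


dose_at_43 = cem43([43.0] * 10, 1.0)  # ten minutes held at 43 degrees C
dose_at_44 = cem43([44.0], 1.0)       # one minute at 44 degrees C
dose_at_42 = cem43([42.0], 1.0)       # one minute at 42 degrees C
```

Each degree above 43 doubles the dose rate, which is why short exposures near the probe can dominate an isodose map.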
Automating Robot Failure Recovery Using Vision-Language Models With Optimized Prompts
Current robot autonomy struggles to operate beyond the assumed Operational Design Domain (ODD), the specific set of conditions and environments in which the system is designed to function, while the real world is rife with uncertainties that may lead to failures. Automating recovery remains a significant challenge. Traditional methods often rely on human intervention to manually address failures or require exhaustive enumeration of failure cases and the design of specific recovery policies for each scenario, both of which are labor-intensive. Foundational Vision-Language Models (VLMs), which demonstrate remarkable common-sense generalization and reasoning capabilities, have broader, potentially unbounded ODDs. However, limitations in spatial reasoning continue to be a common challenge for many VLMs when applied to robot control and motion-level error recovery. In this paper, we investigate how optimizing visual and text prompts can enhance the spatial reasoning of VLMs, enabling them to function effectively as black-box controllers for both motion-level position correction and task-level recovery from unknown failures. Specifically, the optimizations include identifying key visual elements in visual prompts, highlighting these elements in text prompts for querying, and decomposing the reasoning process for failure detection and control generation. In experiments, prompt optimizations significantly outperform pre-trained Vision-Language-Action Models in correcting motion-level position errors and improve accuracy by 65.78% compared to VLMs with unoptimized prompts. Additionally, for task-level failures, optimized prompts enhanced the success rate by 5.8%, 5.8%, and 7.5% in VLMs' abilities to detect failures, analyze issues, and generate recovery plans, respectively, across a wide range of unknown errors in Lego assembly.
Solving Stochastic Orienteering Problems with Chance Constraints Using a GNN Powered Monte Carlo Tree Search
Leveraging the power of a graph neural network (GNN) with message passing, we present a Monte Carlo Tree Search (MCTS) method to solve stochastic orienteering problems with chance constraints. While adhering to an assigned travel budget, the algorithm seeks to maximize collected reward while incurring stochastic travel costs. In this context, the acceptable probability of exceeding the assigned budget is expressed as a chance constraint. Our MCTS solution is an online and anytime algorithm alternating planning and execution that determines the next vertex to visit by continuously monitoring the remaining travel budget. The novelty of our work is that the rollout phase in the MCTS framework is implemented using a message passing GNN, predicting both the utility and failure probability of each available action. This greatly expedites the search process. Our experimental evaluation shows that with the proposed method and architecture we manage to efficiently solve complex problem instances while incurring moderate losses in terms of collected reward. Moreover, we demonstrate how the approach is capable of generalizing beyond the characteristics of the training dataset. The paper's website, open-source code, and supplementary documentation can be found at ucmercedrobotics.github.io/gnn-sop.
comment: 8 pages, 6 figures
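The chance constraint in the abstract bounds the probability that the stochastic travel cost exceeds the budget. One way to picture the quantity the GNN is said to predict is a sampled rollout estimate of that failure probability for a fixed path, as in the sketch below. The exponential edge-cost distribution, the function names, and the sample count are assumptions made for illustration. They are not taken from the paper.

```python
import random


def failure_probability(edge_means, budget, n_samples=20_000, seed=0):
    """Estimate P(total travel cost > budget) for a fixed sequence of edges.

    Each edge cost is drawn from an exponential distribution with the given
    mean (an illustrative assumption about the cost model).
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        cost = sum(rng.expovariate(1.0 / m) for m in edge_means)
        if cost > budget:
            failures += 1
    return failures / n_samples


def satisfies_chance_constraint(edge_means, budget, max_failure_prob):
    """True when the estimated failure probability is within the allowed bound."""
    return failure_probability(edge_means, budget) <= max_failure_prob


path = [1.0, 1.0, 1.0]  # mean cost of each edge on a candidate path
generous = satisfies_chance_constraint(path, budget=100.0, max_failure_prob=0.05)
tight = satisfies_chance_constraint(path, budget=0.1, max_failure_prob=0.05)
```

Running thousands of samples per candidate action is what makes sampled rollouts slow inside a tree search, which is the cost a learned predictor is meant to avoid.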
High-Speed and Impact Resilient Teleoperation of Humanoid Robots
Teleoperation of humanoid robots has long been a challenging domain, necessitating advances in both hardware and software to achieve seamless and intuitive control. This paper presents an integrated solution based on several elements: calibration-free motion capture and retargeting, a low-latency whole-body kinematics streaming toolbox, and high-bandwidth cycloidal actuators. Our motion retargeting approach stands out for its simplicity, requiring only 7 IMUs to generate full-body references for the robot. The kinematics streaming toolbox ensures real-time, responsive control of the robot's movements, significantly reducing latency and enhancing operational efficiency. Additionally, the use of cycloidal actuators makes it possible to withstand high speeds and impacts with the environment. Together, these approaches contribute to a teleoperation framework that offers unprecedented performance. Experimental results on the humanoid robot Nadia demonstrate the effectiveness of the integrated system.
Structure-Invariant Range-Visual-Inertial Odometry IROS
The Mars Science Helicopter (MSH) mission aims to deploy the next generation of unmanned helicopters on Mars, targeting landing sites in highly irregular terrain such as Valles Marineris, the largest canyon system in the Solar System, with elevation variations of up to 8000 meters. Unlike its predecessor, the Mars 2020 mission, which relied on a state estimation system assuming planar terrain, MSH requires a novel approach due to the complex topography of the landing site. This work introduces a novel range-visual-inertial odometry system tailored for the unique challenges of the MSH mission. Our system extends the state-of-the-art xVIO framework by fusing consistent range information with visual and inertial measurements, preventing metric scale drift in the absence of visual-inertial excitation (mono camera and constant velocity descent), and enabling landing on any terrain structure without requiring any planar terrain assumption. Through extensive testing in image-based simulations using actual terrain structure and textures collected in Mars orbit, we demonstrate that our range-VIO approach estimates terrain-relative velocity that meets the stringent mission requirements and outperforms existing methods.
comment: IEEE/RSJ International Conference on Intelligent Robots (IROS), 2024
Multi-scale Feature Fusion with Point Pyramid for 3D Object Detection
Effective point cloud processing is crucial to LiDAR-based autonomous driving systems. The capability to understand features at multiple scales is required for object detection in intelligent vehicles, where road users may appear in different sizes. Recent methods focus on the design of the feature aggregation operators, which collect features at different scales from the encoder backbone and assign them to the points of interest. While effort has gone into the aggregation modules, the importance of how to fuse these multi-scale features has been overlooked. This leads to insufficient feature communication across scales. To address this issue, this paper proposes the Point Pyramid RCNN (POP-RCNN), a feature pyramid-based framework for 3D object detection on point clouds. POP-RCNN consists of a Point Pyramid Feature Enhancement (PPFE) module to establish connections across spatial scales and semantic depths for information exchange. The PPFE module effectively fuses multi-scale features for rich information without increasing the complexity of feature aggregation. To remedy the impact of inconsistent point densities, a point density confidence module is deployed. This design integration enables the use of a lightweight feature aggregator and an emphasis on both shallow and deep semantics, realising a detection framework for 3D object detection. With great adaptability, the proposed method can be applied to a variety of existing frameworks to increase feature richness, especially for long-distance detection. By adopting the PPFE in the voxel-based and point-voxel-based baselines, experimental results on KITTI and Waymo Open Dataset show that the proposed method achieves remarkable performance even with limited computational headroom.
comment: 12 pages
Developing a Modular Toolkit for Rapid Prototyping of Wearable Vibrotactile Haptic Harness
This paper presents a toolkit for rapid harness prototyping. These wearable structures attach vibrotactile actuators to the body using modular elements like 3D printed joints, laser cut or vinyl cutter-based sheets and magnetic clasps. This facilitates easy customization and assembly. The toolkit's primary objective is to simplify the design of haptic wearables, making research in this field easier and more approachable.
comment: two pages, short paper, 3 figures
ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching
Spatial understanding is a critical aspect of most robotic tasks, particularly when generalization is important. Despite the impressive results of deep generative models in complex manipulation tasks, the absence of a representation that encodes intricate spatial relationships between observations and actions often limits spatial generalization, necessitating large amounts of demonstrations. To tackle this problem, we introduce a novel policy class, ActionFlow. ActionFlow integrates spatial symmetry inductive biases while generating expressive action sequences. On the representation level, ActionFlow introduces an SE(3) Invariant Transformer architecture, which enables informed spatial reasoning based on the relative SE(3) poses between observations and actions. For action generation, ActionFlow leverages Flow Matching, a state-of-the-art deep generative model known for generating high-quality samples with fast inference - an essential property for feedback control. In combination, ActionFlow policies exhibit strong spatial and locality biases and SE(3)-equivariant action generation. Our experiments demonstrate the effectiveness of ActionFlow and its two main components on several simulated and real-world robotic manipulation tasks and confirm that we can obtain equivariant, accurate, and efficient policies with spatially symmetric flow matching. Project website: https://flowbasedpolicies.github.io/
Solve paint color effect prediction problem in trajectory optimization of spray painting robot using artificial neural network inspired by the Kubelka Munk model
Currently, spray-painting robot trajectory planning technology that targets spray painting quality mainly applies to single-color spraying. Conventional methods of optimizing the spray gun trajectory based on simulated thickness can only qualitatively reflect the color distribution, and cannot simulate the color effect of spray painting at the pixel level. Therefore, it is not possible to accurately control the area covered by the color and the gradation of the edges of the area, and it is also difficult to deal with the situation where multiple colors of paint are sprayed in combination. To solve the above problems, this paper is inspired by the Kubelka-Munk model and combines a 3D machine vision method and an artificial neural network to propose a spray painting color effect prediction method. The method can predict the execution effect of the spray gun trajectory with pixel-level accuracy in terms of the surface color of the workpiece after spray painting. On this basis, the method can be used to replace the traditional thickness simulation method to establish the objective function of the spray gun trajectory optimization problem, and thus solve the difficult problem of spray gun trajectory optimization for multi-color paint combination spraying. In this paper, the mathematical model of the spray painting color effect prediction problem is first determined through the analysis of the Kubelka-Munk paint film color rendering model, and at the same time, the spray painting color effect dataset is established with the help of a depth camera and a point cloud processing algorithm. After that, the multilayer perceptron model is improved with the help of gating and residual structures and is used for the color prediction task. To verify ...
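As background for the Kubelka-Munk model named in the abstract, the sketch below implements the classic relation for an opaque layer, K/S = (1 - R)^2 / (2R), and its inverse, where R is reflectance and K and S are the absorption and scattering coefficients. This is the textbook relation only. The abstract does not say how the paper's network uses it, and the function names are illustrative.

```python
import math


def ks_from_reflectance(reflectance):
    """Kubelka-Munk ratio K/S for an opaque layer with reflectance in (0, 1]."""
    return (1.0 - reflectance) ** 2 / (2.0 * reflectance)


def reflectance_from_ks(ks):
    """Inverse relation: reflectance of an opaque layer with ratio K/S >= 0."""
    return 1.0 + ks - math.sqrt(ks * ks + 2.0 * ks)


ks_half = ks_from_reflectance(0.5)       # ratio for 50 % reflectance
round_trip = reflectance_from_ks(ks_half)
```

In the classic theory K/S is treated as approximately additive across pigments in a mixture, which is the usual reason for working in K/S space rather than directly on reflectance.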
Rico: extended TIAGo robot towards up-to-date social and assistive robot usage scenarios
Social and assistive robotics have vastly increased in popularity in recent years. Due to the wide range of usage, robots executing such tasks must be highly reliable and possess enough functions to satisfy multiple scenarios. This article describes a mobile, artificial intelligence-driven, robotic platform Rico. Its prior usage in similar scenarios, the number of its capabilities, and the experiments it presented should qualify it as a proper arm-less platform for social and assistive circumstances.
comment: PP-RAI 2024, 5th Polish Conference on Artificial Intelligence, 18-20.04.2024 Warsaw, Poland
A Black-Box Physics-Informed Estimator based on Gaussian Process Regression for Robot Inverse Dynamics Identification
Learning the inverse dynamics of robots directly from data, adopting a black-box approach, is interesting for several real-world scenarios where limited knowledge about the system is available. In this paper, we propose a black-box model based on Gaussian Process (GP) Regression for the identification of the inverse dynamics of robotic manipulators. The proposed model relies on a novel multidimensional kernel, called the Lagrangian Inspired Polynomial (LIP) kernel. The LIP kernel is based on two main ideas. First, instead of directly modeling the inverse dynamics components, we model the kinetic and potential energy of the system as GPs. The GP prior on the inverse dynamics components is derived from those on the energies by applying the properties of GPs under linear operators. Second, as regards the energy prior definition, we prove a polynomial structure of the kinetic and potential energy, and we derive a polynomial kernel that encodes this property. As a consequence, the proposed model also allows estimating the kinetic and potential energy without requiring any label on these quantities. Results in simulation and on two real robotic manipulators, namely a 7 DOF Franka Emika Panda and a 6 DOF MELFA RV4FL, show that the proposed model outperforms state-of-the-art black-box estimators based on both Gaussian Processes and Neural Networks in terms of accuracy, generality, and data efficiency. The experiments on the MELFA robot also demonstrate that our approach achieves performance comparable to fine-tuned model-based estimators, despite requiring less prior information.
Repeatable and Reliable Efforts of Accelerated Risk Assessment in Robot Testing
Risk assessment of a robot in controlled environments, such as laboratories and proving grounds, is a common means to assess, certify, validate, verify, and characterize the robot's safety performance before, during, and even after its commercialization in the real world. A standard testing program that acquires the risk estimate is expected to be (i) repeatable, such that it obtains similar risk assessments of the same testing subject among multiple trials or attempts with similar testing effort by different stakeholders, and (ii) reliable against a variety of testing subjects produced by different vendors and manufacturers. Both repeatability and reliability are fundamental and crucial for a testing algorithm's validity, fairness, and practical feasibility, especially for standardization. However, these properties are rarely satisfied or ensured, especially as the subject robots become more complex, uncertain, and varied. This issue was present in traditional risk assessments through Monte-Carlo sampling, and remains a bottleneck for the recent accelerated risk assessment methods, primarily those using importance sampling. This study aims to enhance existing accelerated testing frameworks by proposing a new algorithm that provably integrates repeatability and reliability with the already established formality and efficiency. It also features demonstrations assessing the risk of instability from frontal impacts, initiated by push-over disturbances on a controlled inverted pendulum and a 7-DoF planar bipedal robot Rabbit managed by various control algorithms.
Imitation learning for sim-to-real transfer of robotic cutting policies based on residual Gaussian process disturbance force model IROS
Robotic cutting, or milling, plays a significant role in applications such as disassembly, decommissioning, and demolition. Planning and control of cutting in real-world scenarios in uncertain environments is a complex task, with the potential to benefit from simulated training environments. This letter focuses on sim-to-real transfer for robotic cutting policies, addressing the need for effective policy transfer from simulation to practical implementation. We extend our previous domain generalisation approach to learning cutting tasks based on a mechanistic model-based simulation framework, by proposing a hybrid approach for sim-to-real transfer based on a milling process force model and residual Gaussian process (GP) force model, learned from either single or multiple real-world cutting force examples. We demonstrate successful sim-to-real transfer of a robotic cutting policy without the need for fine-tuning on the real robot setup. The proposed approach autonomously adapts to materials with differing structural and mechanical properties. Furthermore, we demonstrate the proposed method outperforms fine-tuning or re-training alone.
comment: 8 pages, 9 figures, accepted for publication in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
On The Evaluation of Collision Probability along a Path
Characterizing the risk of operations is a fundamental requirement in robotics, and a crucial ingredient of safe planning. The problem is multifaceted, with multiple definitions arising in the vast recent literature fitting different application scenarios and leading to different computational approaches. A basic element shared by most frameworks is the definition and evaluation of the probability of collision for a mobile object in an environment with obstacles. We observe that, even in basic cases, different interpretations are possible. This paper proposes an index we call "Risk Density", which offers a theoretical link between conceptually distant assumptions about the interplay of single collision events along a continuous path. We show how this index can be used to approximate the collision probability in the case where the robot evolves along a nominal continuous curve from random initial conditions. Indeed, under this hypothesis the proposed approximation outperforms some well-established methods either in accuracy or computational cost.
Hyp2Nav: Hyperbolic Planning and Curiosity for Crowd Navigation IROS 2024
Autonomous robots are increasingly becoming a strong fixture in social environments. Effective crowd navigation requires not only safe yet fast planning, but should also enable interpretability and computational efficiency for working in real-time on embedded devices. In this work, we advocate for hyperbolic learning to enable crowd navigation and we introduce Hyp2Nav. Different from conventional reinforcement learning-based crowd navigation methods, Hyp2Nav leverages the intrinsic properties of hyperbolic geometry to better encode the hierarchical nature of decision-making processes in navigation tasks. We propose a hyperbolic policy model and a hyperbolic curiosity module that result in effective social navigation, the best success rates, and the best returns across multiple simulation settings, using up to 6 times fewer parameters than competitor state-of-the-art models. With our approach, it even becomes possible to obtain policies that work in 2-dimensional embedding spaces, opening up new possibilities for low-resource crowd navigation and model interpretability. Insightfully, the internal hyperbolic representation of Hyp2Nav correlates with how much attention the robot pays to the surrounding crowds, e.g. due to multiple people occluding its pathway or to a few of them showing colliding plans, rather than to its own planned route. The code is available at https://github.com/GDam90/hyp2nav.
comment: Accepted as oral at IROS 2024
A global approach for the redefinition of higher-order flexibility and rigidity
The famous example of the double-Watt mechanism given by Connelly and Servatius raises some problems concerning the classical definitions of higher-order flexibility and rigidity, respectively, as they attribute third-order rigidity to the cusp configuration of the mechanism, which conflicts with its continuous flexion. Some attempts were made to resolve the dilemma but they could not settle the problem. As cusp mechanisms demonstrate the basic shortcoming of any local mobility analysis using higher-order constraints, we present a global approach inspired by Sabitov's finite algorithm for testing the bendability of a polyhedron, which allows us (a) to compute iteratively configurations with a higher-order flexion and (b) to come up with a proper redefinition of higher-order flexibility and rigidity. We also give algorithms for computing the flexion orders as well as the associated flexes. The presented approach is demonstrated on several examples (double-Watt mechanisms and Tarnai's Leonardo structure). Moreover, we determine all configurations of a given 3-RPR manipulator with a third-order flexion and present a corresponding joint-bar framework of flexion order 23.
comment: 29 pages, 12 figures, 10 examples
A Flexible and Resilient Formation Approach based on Hierarchical Reorganization
Conventional formation methods typically rely on fixed hierarchical structures, such as predetermined leaders or predefined formation shapes. These rigid hierarchies can render formations cumbersome and inflexible in complex environments, leading to potential failure if any leader loses connectivity. To address these limitations, this paper introduces a reconfigurable affine formation that enhances both flexibility and resilience through hierarchical reorganization. The paper first elucidates the critical role of hierarchical reorganization, conceptualizing this process as involving role reallocation and dynamic changes in topological structures. To further investigate the conditions necessary for hierarchical reorganization, a reconfigurable hierarchical formation is developed based on graph theory, with its feasibility rigorously demonstrated. In conjunction with role transitions, a power-centric topology switching mechanism grounded in formation consensus convergence is proposed, ensuring coordinated resilience within the formation. Finally, simulations and experiments validate the performance of the proposed method. The aerial formations successfully performed multiple hierarchical reorganizations in both three-dimensional and two-dimensional spaces. Even in the event of a single leader's failure, the formation maintained stable flight through hierarchical reorganization. This rapid adaptability enables the robotic formations to execute complex tasks, including sharp turns and navigating through forests at speeds up to 1.9 m/s.
Trust the PRoC3S: Solving Long-Horizon Robotics Problems with LLMs and Constraint Satisfaction
Recent developments in pretrained large language models (LLMs) applied to robotics have demonstrated their capacity for sequencing a set of discrete skills to achieve open-ended goals in simple robotic tasks. In this paper, we examine the topic of LLM planning for a set of continuously parameterized skills whose execution must avoid violations of a set of kinematic, geometric, and physical constraints. We prompt the LLM to output code for a function with open parameters, which, together with environmental constraints, can be viewed as a Continuous Constraint Satisfaction Problem (CCSP). This CCSP can be solved through sampling or optimization to find a skill sequence and continuous parameter settings that achieve the goal while avoiding constraint violations. Additionally, we consider cases where the LLM proposes unsatisfiable CCSPs, such as those that are kinematically infeasible, dynamically unstable, or lead to collisions, and re-prompt the LLM to form a new CCSP accordingly. Experiments across three different simulated 3D domains demonstrate that our proposed strategy, PRoC3S, is capable of solving a wide range of complex manipulation tasks with realistic constraints on continuous parameters much more efficiently and effectively than existing baselines.
PEACE: Prompt Engineering Automation for CLIPSeg Enhancement in Aerial Robotics
Safe landing is an essential aspect of flight operations in fields ranging from industrial to space robotics. With the growing interest in artificial intelligence, we focus on learning-based methods for safe landing. Our previous work, Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI), demonstrated the feasibility of using prompt-based segmentation for identifying safe landing zones with open vocabulary models. However, relying on a heuristic selection of words for prompts is not reliable, as it cannot adapt to changing environments, potentially leading to harmful outcomes if the observed environment is not accurately represented by the chosen prompt. To address this issue, we introduce PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), an enhancement to DOVESEI that automates prompt engineering to adapt to shifts in data distribution. PEACE can perform safe landings using only monocular cameras and image segmentation. PEACE shows significant improvements in prompt generation and engineering for aerial images compared to standard prompts used for CLIP and CLIPSeg. By combining DOVESEI and PEACE, our system improved the success rate of safe landing zone selection by at least 30% in both simulations and indoor experiments.
comment: arXiv admin note: text overlap with arXiv:2308.11471
Patching Approximately Safe Value Functions Leveraging Local Hamilton-Jacobi Reachability Analysis
Safe value functions, such as control barrier functions, characterize a safe set and synthesize a safety filter, overriding unsafe actions, for a dynamic system. While function approximators like neural networks can synthesize approximately safe value functions, they typically lack formal guarantees. In this paper, we propose a local dynamic programming-based approach to "patch" approximately safe value functions to obtain a safe value function. This algorithm, HJ-Patch, produces a novel value function that provides formal safety guarantees, yet retains the global structure of the initial value function. HJ-Patch modifies an approximately safe value function at states that are both (i) near the safety boundary and (ii) may violate safety. We iteratively update both this set of "active" states and the value function until convergence. This approach bridges the gap between value function approximation methods and formal safety through Hamilton-Jacobi (HJ) reachability, offering a framework for integrating various safety methods. We provide simulation results on analytic and learned examples, demonstrating HJ-Patch reduces the computational complexity by 2 orders of magnitude with respect to standard HJ reachability. Additionally, we demonstrate the perils of using approximately safe value functions directly and showcase improved safety using HJ-Patch.
comment: 8 pages, IEEE Conference on Decision and Control (CDC), 2024 (In Press)
Deep Stochastic Kinematic Models for Probabilistic Motion Forecasting in Traffic
In trajectory forecasting tasks for traffic, future output trajectories can be computed by advancing the ego vehicle's state with predicted actions according to a kinematics model. By unrolling predicted trajectories via time integration and models of kinematic dynamics, predicted trajectories should not only be kinematically feasible but also relate uncertainty from one timestep to the next. While current works in probabilistic prediction do incorporate kinematic priors for mean trajectory prediction, variance is often left as a learnable parameter, despite uncertainty in one time step being inextricably tied to uncertainty in the previous time step. In this paper, we show simple and differentiable analytical approximations describing the relationship between variance at one timestep and that at the next with the kinematic bicycle model. In our results, we find that encoding the relationship between variance across timesteps works especially well in suboptimal settings, such as with small or noisy datasets. We observe up to a 50% performance boost in partial dataset settings and up to an 8% performance boost in large-scale learning compared to previous kinematic prediction methods on SOTA trajectory forecasting architectures out-of-the-box, with no fine-tuning.
comment: 8 pages
DeliGrasp: Inferring Object Properties with LLMs for Adaptive Grasp Policies
Large language models (LLMs) can provide rich physical descriptions of most worldly objects, allowing robots to achieve more informed and capable grasping. We leverage LLMs' common-sense physical reasoning and code-writing abilities to infer an object's physical characteristics (mass $m$, friction coefficient $\mu$, and spring constant $k$) from a semantic description, and then translate those characteristics into an executable adaptive grasp policy. Using a two-finger gripper with a built-in depth camera that can control its torque by limiting motor current, we demonstrate that LLM-parameterized but first-principles grasp policies outperform both traditional adaptive grasp policies and direct LLM-as-code policies on a custom benchmark of 12 delicate and deformable items including food, produce, toys, and other everyday items, spanning two orders of magnitude in mass and required pick-up force. We then improve property estimation and grasp performance on variable size objects with model finetuning on property-based comparisons and eliciting such comparisons via chain-of-thought prompting. We also demonstrate how compliance feedback from DeliGrasp policies can aid in downstream tasks such as measuring produce ripeness. Our code and videos are available at: https://deligrasp.github.io
Safe POMDP Online Planning among Dynamic Agents via Adaptive Conformal Prediction
Online planning for partially observable Markov decision processes (POMDPs) provides efficient techniques for robot decision-making under uncertainty. However, existing methods fall short of preventing safety violations in dynamic environments. This work presents a novel safe POMDP online planning approach that maximizes expected returns while providing probabilistic safety guarantees amidst environments populated by multiple dynamic agents. Our approach utilizes data-driven trajectory prediction models of dynamic agents and applies Adaptive Conformal Prediction (ACP) to quantify the uncertainties in these predictions. Leveraging the obtained ACP-based trajectory predictions, our approach constructs safety shields on-the-fly to prevent unsafe actions within POMDP online planning. Through experimental evaluation in various dynamic environments using real-world pedestrian trajectory data, the proposed approach has been shown to effectively maintain probabilistic safety guarantees while accommodating up to hundreds of dynamic agents.
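As a rough illustration of the Adaptive Conformal Prediction step mentioned in the abstract above, the sketch below uses the standard ACP miscoverage update from the conformal prediction literature, alpha_{t+1} = alpha_t + gamma * (alpha_target - err_t). It is not the authors' implementation: the function names, the step size `gamma`, and the use of scalar prediction-error scores are all assumptions made for this example.

```python
import math


def conformal_radius(scores, alpha):
    """Empirical (1 - alpha) conformal quantile of past nonconformity scores.

    `scores` are hypothetical scalar prediction errors (e.g. distance between
    a predicted and an observed agent position).
    """
    if alpha <= 0.0:
        return float("inf")  # demand full coverage: region covers everything
    if alpha >= 1.0:
        return 0.0           # no coverage required
    ordered = sorted(scores)
    k = math.ceil((1.0 - alpha) * (len(ordered) + 1)) - 1
    if k >= len(ordered):
        return float("inf")  # too few samples for the requested coverage
    return ordered[max(k, 0)]


def acp_step(alpha_t, target_alpha, gamma, score, radius):
    """One ACP update: lower alpha after a miss, raise it after a hit."""
    err = 1.0 if score > radius else 0.0  # 1 when the region missed the truth
    return alpha_t + gamma * (target_alpha - err)
```

A miss (`score > radius`) lowers the working miscoverage level, so the next prediction region grows; a hit raises it slightly, so the region shrinks. A planner could then treat the region of the returned radius around each predicted agent position as unsafe.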
Learning Generalizable Tool-use Skills through Trajectory Generation
Autonomous systems that efficiently utilize tools can assist humans in completing many common tasks such as cooking and cleaning. However, current systems fall short of matching human-level intelligence in terms of adapting to novel tools. Prior works based on affordance often make strong assumptions about the environments and cannot scale to more complex, contact-rich tasks. In this work, we tackle this challenge and explore how agents can learn to use previously unseen tools to manipulate deformable objects. We propose to learn a generative model of the tool-use trajectories as a sequence of tool point clouds, which generalizes to different tool shapes. Given any novel tool, we first generate a tool-use trajectory and then optimize the sequence of tool poses to align with the generated trajectory. We train a single model on four different challenging deformable object manipulation tasks, using demonstration data from only one tool per task. The model generalizes to various novel tools, significantly outperforming baselines. We further test our trained policy in the real world with unseen tools, where it achieves performance comparable to that of humans. Additional materials can be found on our project website: https://sites.google.com/view/toolgen.
Multiagent Systems
SPACE: A Python-based Simulator for Evaluating Decentralized Multi-Robot Task Allocation Algorithms
Swarm robotics explores the coordination of multiple robots to achieve collective goals, with collective decision-making being a central focus. This process involves decentralized robots autonomously making local decisions and communicating them, which influences the overall emergent behavior. Testing such decentralized algorithms in real-world scenarios with hundreds or more robots is often impractical, underscoring the need for effective simulation tools. We propose SPACE (Swarm Planning and Control Evaluation), a Python-based simulator designed to support the research, evaluation, and comparison of decentralized Multi-Robot Task Allocation (MRTA) algorithms. SPACE streamlines core algorithmic development by allowing users to implement decision-making algorithms as Python plug-ins, easily construct agent behavior trees via an intuitive GUI, and leverage built-in support for inter-agent communication and local task awareness. To demonstrate its practical utility, we implement and evaluate CBBA and GRAPE within the simulator, comparing their performance across different metrics, particularly in scenarios with dynamically introduced tasks. This evaluation shows the usefulness of SPACE in conducting rigorous and standardized comparisons of MRTA algorithms, helping to support future research in the field.
Decentralized Learning in General-sum Markov Games
The Markov game framework is widely used to model interactions among agents with heterogeneous utilities in dynamic and uncertain societal-scale systems. In these systems, agents typically operate in a decentralized manner due to privacy and scalability concerns, often acting without any information about other agents. The design and analysis of decentralized learning algorithms that provably converge to rational outcomes remain elusive, especially beyond Markov zero-sum games and Markov potential games, which do not adequately capture the nature of many real-world interactions that is neither fully competitive nor fully cooperative. This paper investigates the design of decentralized learning algorithms for general-sum Markov games, aiming to provide provable guarantees of convergence to approximate Nash equilibria in the long run. Our approach builds on constructing a Markov Near-Potential Function (MNPF) to address the intractability of designing algorithms that converge to exact Nash equilibria. We demonstrate that MNPFs play a central role in ensuring the convergence of an actor-critic-based decentralized learning algorithm to approximate Nash equilibria. By leveraging a two-timescale approach, where Q-function estimates are updated faster than policy updates, we show that the system converges to a level set of the MNPF over the set of approximate Nash equilibria. This convergence result is further strengthened if the set of Nash equilibria is assumed to be finite. Our findings provide a new perspective on the analysis and design of decentralized learning algorithms in multi-agent systems.
comment: 16 pages, 1 figure
Systems and Control (CS)
Stability of the Theta Method for Systems with Multiple Time-Delayed Variables
The paper focuses on the numerical stability and accuracy of implicit time-domain integration (TDI) methods when applied for the solution of a power system model impacted by time delays. Such a model is generally formulated as a set of delay differential algebraic equations (DDAEs) in non index-1 Hessenberg form. In particular, the paper shows that numerically stable ordinary differential equation (ODE) methods, such as the trapezoidal and the Theta method, can become unstable when applied to a power system that includes a significant number of delayed variables. Numerical stability is discussed through a scalar test delay differential equation, as well as through a matrix pencil approach that accounts for the DDAEs of any given dynamic power system model. Simulation results are presented in a case study based on the IEEE 39-bus system.
comment: 9 pages
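To make the Theta-method abstract above more concrete, here is a minimal sketch of the Theta method applied to a scalar test delay differential equation. The form y'(t) = a*y(t) + b*y(t - tau) with constant history is a common test equation and is an assumption here, since the abstract does not state the paper's exact equation. The function name, the choice of step h = tau/m (so delayed values fall on the grid), and the convention theta = 1 for backward Euler and theta = 0.5 for the trapezoidal rule are also assumptions.

```python
def theta_dde(a, b, tau, theta, m, n_steps, history=1.0):
    """Integrate y'(t) = a*y(t) + b*y(t - tau) with the Theta method.

    The step is h = tau / m with m >= 1, so y(t_n - tau) is the grid value
    m steps back. Returns [y(0), y(h), ..., y(n_steps * h)].
    """
    h = tau / m
    # y[0..m] hold the constant history on t = -tau, ..., 0.
    y = [history] * (m + 1)
    for _ in range(n_steps):
        n = len(y) - 1                      # index of the current time t_n
        f_n = a * y[n] + b * y[n - m]       # right-hand side at t_n
        delayed_next = y[n + 1 - m]         # y(t_{n+1} - tau), already known
        rhs = y[n] + h * (1.0 - theta) * f_n + h * theta * b * delayed_next
        # The implicit part only involves a*y_{n+1}, so solve it in closed form.
        y.append(rhs / (1.0 - h * theta * a))
    return y[m:]
```

Running this for different values of `a`, `b`, `theta`, and `h` and checking whether the returned sequence decays or grows is one simple way to probe numerical stability empirically on such a test equation.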
Unmasking Covert Intrusions: Detection of Fault-Masking Cyberattacks on Differential Protection Systems
Line Current Differential Relays (LCDRs) are high-speed relays progressively used to protect critical transmission lines. However, LCDRs are vulnerable to cyberattacks. Fault-Masking Attacks (FMAs) are stealthy cyberattacks performed by manipulating the remote measurements of the targeted LCDR to disguise faults on the protected line. Hence, they remain undetected by this LCDR. In this paper, we propose a two-module framework to detect FMAs. The first module is a Mismatch Index (MI) developed from the protected transmission line's equivalent physical model. The MI is triggered only if there is a significant mismatch in the LCDR's local and remote measurements while the LCDR itself is untriggered, which indicates an FMA. After the MI is triggered, the second module, a neural network-based classifier, promptly confirms that the triggering event is a physical fault that lies on the line protected by the LCDR before declaring the occurrence of an FMA. The proposed framework is tested using the IEEE 39-bus benchmark system. Our simulation results confirm that the proposed framework can accurately detect FMAs on LCDRs and is not affected by normal system disturbances, variations, or measurement noise. Our experimental results using OPAL-RT's real-time simulator confirm the proposed solution's real-time performance capability.
comment: Accepted to IEEE Transactions on Systems, Man, and Cybernetics: Systems. © 2024 IEEE
Towards a Socially Acceptable Competitive Equilibrium in Energy Markets
This paper addresses the problem of energy sharing between a population of price-taking agents who adopt decentralized primal-dual gradient dynamics to find the Competitive Equilibrium (CE). Although the CE is efficient, it does not ensure fairness and can potentially lead to high prices. As the agents and market operator share a social responsibility to keep the price below a certain socially acceptable threshold, we propose an approach where the agents modify their utility functions in a decentralized way. We introduce a dynamic feedback controller for the primal-dual dynamics to steer the agents to a Socially acceptable Competitive Equilibrium (SCE). We demonstrate our theoretical findings in a case study.
comment: Extended version
Capturing Opportunity Costs of Batteries with a Staircase Supply-Demand Function
In the global pursuit of carbon neutrality, the role of batteries is indispensable. They provide pivotal flexibilities to counter uncertainties from renewables, preferably by participating in electricity markets. Unlike thermal generators, however, the dominant type of cost for batteries is opportunity cost, which is vaguer and more challenging to represent through bids in stipulated formats. This article shows a surprising result to the contrary: the demand-supply function of an ideal battery, considering its opportunity cost, is a staircase function with no more than five segments, which is a perfect match with existing rules in many real electricity markets. The demand-supply function shifts horizontally with price forecasts and vertically with the initial state of charge (SOC). These results can be generalized to imperfect batteries and numerous battery-like resources, including battery clusters, air-conditioners, and electric vehicle charging stations, although the number of segments may vary. These results pave the way for batteries to participate in electricity markets.
An updated look on the convergence and consistency of data-driven dynamical models
Deep sequence models are receiving significant interest in current machine learning research. By representing probability distributions that are fit to data using maximum likelihood estimation, such models can model data on general observation spaces (both continuous and discrete-valued). Furthermore, they can be applied to a wide range of modelling problems, including modelling of dynamical systems which are subject to control. The problem of learning data-driven models of systems subject to control is well studied in the field of system identification. In particular, there exist theoretical convergence and consistency results which can be used to analyze model behaviour and guide model development. However, these results typically concern models which provide point predictions of continuous-valued variables. Motivated by this, we derive convergence and consistency results for a class of nonlinear probabilistic models defined on a general observation space. The results rely on stability and regularity assumptions, and can be used to derive consistency conditions and bias expressions for nonlinear probabilistic models of systems under control. We illustrate the results on examples from linear system identification and Markov chains on finite state spaces.
comment: 9 pages, no figures. To be presented at the 63rd IEEE Conference on Decision and Control
Online Residual Learning from Offline Experts for Pedestrian Tracking
In this paper, we consider the problem of predicting unknown targets from data. We propose Online Residual Learning (ORL), a method that combines online adaptation with offline-trained predictions. At a lower level, we employ multiple offline predictions generated before or at the beginning of the prediction horizon. We augment every offline prediction by learning its residual error with respect to the true target state online, using the recursive least squares algorithm. At a higher level, we treat the augmented lower-level predictors as experts, adopting the Prediction with Expert Advice framework. We utilize an adaptive softmax weighting scheme to form an aggregate prediction and provide guarantees for ORL in terms of regret. We employ ORL to boost performance in the setting of online pedestrian trajectory prediction. Based on data from the Stanford Drone Dataset, we show that ORL can demonstrate best-of-both-worlds performance.
comment: Accepted to CDC 2024
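The two-level structure described in the ORL abstract above can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it uses scalar targets, a one-parameter recursive least squares (RLS) residual model, and softmax weights with a fixed rate `eta`, whereas the abstract describes an adaptive softmax scheme. All class and function names are hypothetical.

```python
import math


class ScalarRLS:
    """Recursive least squares for a one-parameter model: residual ~ w * x."""

    def __init__(self, p0=1e3, lam=1.0):
        self.w = 0.0     # parameter estimate
        self.p = p0      # (scalar) inverse covariance proxy
        self.lam = lam   # forgetting factor, 1.0 means no forgetting

    def predict(self, x):
        return self.w * x

    def update(self, x, target):
        gain = self.p * x / (self.lam + x * self.p * x)
        self.w += gain * (target - self.w * x)
        self.p = (self.p - gain * x * self.p) / self.lam


def softmax_weights(cum_losses, eta):
    """Exponential weights over experts, shifted by the minimum for stability."""
    base = min(cum_losses)
    expo = [math.exp(-eta * (loss - base)) for loss in cum_losses]
    total = sum(expo)
    return [v / total for v in expo]


def orl_run(offline_preds, targets, eta=1.0):
    """offline_preds[i][t]: expert i's offline prediction of targets[t]."""
    n = len(offline_preds)
    rls = [ScalarRLS() for _ in range(n)]
    losses = [0.0] * n
    aggregate = []
    for t, y in enumerate(targets):
        # Lower level: offline prediction plus the residual learned so far.
        aug = [offline_preds[i][t] + rls[i].predict(1.0) for i in range(n)]
        # Higher level: weight the augmented predictors as experts.
        w = softmax_weights(losses, eta)
        aggregate.append(sum(wi * ai for wi, ai in zip(w, aug)))
        # The true target is revealed: update losses and residual models.
        for i in range(n):
            losses[i] += (aug[i] - y) ** 2
            rls[i].update(1.0, y - offline_preds[i][t])
    return aggregate
```

In this toy setting each expert's residual model uses a constant feature, so it only learns a bias correction; richer features would be needed to capture state-dependent residuals.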
Jam-absorption driving with data assimilation
This paper introduces a data assimilation (DA) framework based on the extended Kalman filter-cell transmission model, designed to assist jam-absorption driving (JAD) operation to alleviate sag traffic congestion. To ascertain and demonstrate the effectiveness of the DA framework for JAD operation, in this paper, we initially investigated its impact on the motion and control performance of a single absorbing vehicle. Numerical results show that the DA framework effectively mitigated underestimated or overestimated control failures of JAD caused by misestimation of key parameters (e.g., free flow speed and critical density) of the traffic flow fundamental diagram. The findings suggest that the proposed DA framework can reduce control failures and prevent significant declines and deteriorations in JAD performance caused by changes in traffic characteristics, e.g., weather conditions or traffic composition.
comment: SMC 2024. Copyright 2024 IEEE.
Whittle Index Learning Algorithms for Restless Bandits with Constant Stepsizes
We study the Whittle index learning algorithm for restless multi-armed bandits. We consider an index learning algorithm with Q-learning. We first present the Q-learning algorithm with three exploration policies (epsilon-greedy, softmax, and epsilon-softmax) and constant stepsizes. We extend the study of Q-learning to index learning for a single-armed restless bandit. The index learning algorithm is a two-timescale variant of stochastic approximation: on the slower timescale we update the index learning scheme, and on the faster timescale we update Q-learning assuming a fixed index value. The Q-learning updates are performed asynchronously. We study the two-timescale stochastic approximation algorithm with constant stepsizes and provide an analysis of it for index learning. Further, we present a study of index learning with deep Q-network (DQN) learning and with linear function approximation using the state-aggregation method. We describe the performance of our algorithms using numerical examples. We show that index learning with Q-learning, DQN, and function approximation learns the Whittle index.
comment: 14 pages
Multi-scale Feature Fusion with Point Pyramid for 3D Object Detection
Effective point cloud processing is crucial to LiDAR-based autonomous driving systems. The capability to understand features at multiple scales is required for object detection in intelligent vehicles, where road users may appear in different sizes. Recent methods focus on the design of the feature aggregation operators, which collect features at different scales from the encoder backbone and assign them to the points of interest. While effort has gone into the aggregation modules, the question of how to fuse these multi-scale features has been overlooked. This leads to insufficient feature communication across scales. To address this issue, this paper proposes the Point Pyramid RCNN (POP-RCNN), a feature pyramid-based framework for 3D object detection on point clouds. POP-RCNN consists of a Point Pyramid Feature Enhancement (PPFE) module to establish connections across spatial scales and semantic depths for information exchange. The PPFE module effectively fuses multi-scale features for rich information without increasing the complexity of feature aggregation. To remedy the impact of inconsistent point densities, a point density confidence module is deployed. This design integration enables the use of a lightweight feature aggregator and an emphasis on both shallow and deep semantics, realising a detection framework for 3D object detection. With great adaptability, the proposed method can be applied to a variety of existing frameworks to increase feature richness, especially for long-distance detection. By adopting the PPFE in the voxel-based and point-voxel-based baselines, experimental results on KITTI and Waymo Open Dataset show that the proposed method achieves remarkable performance even with limited computational headroom.
comment: 12 pages
Faster Q-Learning Algorithms for Restless Bandits
We study the Whittle index learning algorithm for restless multi-armed bandits (RMAB). We first present the Q-learning algorithm and its variants: speedy Q-learning (SQL), generalized speedy Q-learning (GSQL), and phase Q-learning (PhaseQL). We also discuss exploration policies: $\epsilon$-greedy and upper confidence bound (UCB). We extend the study of Q-learning and its variants to the UCB policy. We illustrate with a numerical example that Q-learning with the UCB exploration policy converges faster, and that PhaseQL with UCB has the fastest convergence rate. We next extend the study of Q-learning variants to index learning for RMAB. The index learning algorithm is a two-timescale variant of stochastic approximation: on the slower timescale we update the index learning scheme, and on the faster timescale we update Q-learning assuming a fixed index value. We study a constant-stepsize two-timescale stochastic approximation algorithm. We describe the performance of our algorithms using a numerical example. It illustrates that index learning using Q-learning with UCB converges faster than with $\epsilon$-greedy. Further, PhaseQL (with UCB and $\epsilon$-greedy) converges faster than the other Q-learning algorithms.
comment: 7 pages, 3 figures, conference. arXiv admin note: substantial text overlap with arXiv:2409.04605
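For the UCB exploration policy mentioned in the abstract above, a minimal sketch of action selection follows. The abstract does not state the exact bonus term, so this uses the common UCB1-style bonus `c * sqrt(ln t / N(s, a))`. The function name `ucb_action` and the parameter `c` are assumptions made for illustration.

```python
import math


def ucb_action(q_row, counts, t, c=1.0):
    """Pick an action for one state from its Q-values and per-action visit counts.

    q_row[a]  : current Q estimate for action a in this state
    counts[a] : number of times action a was taken in this state
    t         : total number of visits to this state so far (t >= 1)
    """
    # Try every action once before trusting the confidence bonus.
    for a, n in enumerate(counts):
        if n == 0:
            return a
    scores = [q + c * math.sqrt(math.log(t) / n) for q, n in zip(q_row, counts)]
    return max(range(len(scores)), key=scores.__getitem__)
```

Unlike $\epsilon$-greedy, which explores uniformly at random, this rule directs exploration to actions that have been tried least often relative to their estimated value.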
A Centralized Discovery-Based Method for Integrating Data Distribution Service and Time-Sensitive Networking in In-Vehicle Networks
As the electronic and electrical architecture (E/EA) of intelligent and connected vehicles (ICVs) evolves, traditional distributed and signal-oriented architectures are being replaced by centralized, service-oriented architectures (SOA). This new generation of E/EA demands in-vehicle networks (IVNs) that offer high bandwidth, real-time performance, reliability, and service orientation. Data distribution service (DDS) and time-sensitive networking (TSN) are increasingly adopted to address these requirements. However, research on the integrated deployment of DDS and TSN in automotive applications is still in its infancy. This paper presents a DDS over TSN (DoT) communication architecture based on the centralized discovery architecture (CDA). First, a lightweight DDS implementation (FastDDS-lw) is developed for resource-constrained in-vehicle devices. Next, a DDS flow identification algorithm (DFIA) based on the CDA is introduced to automatically identify potential DDS flows during the discovery phase. Finally, the DoT communication architecture is designed, incorporating FastDDS-lw and DFIA. Experimental results show that the DoT architecture significantly reduces end-to-end latency and jitter for critical DDS flows compared to traditional Ethernet. Additionally, DoT provides an automated network configuration method that completes within a few tens of milliseconds.
A Digital Twin Design Methodology for Control, Simulation, and Monitoring of Fluidic Circuits
We propose a synthesis method for the design of digital twins applicable to various systems (pneumatic, hydraulic, electrical/electronic circuits). The methodology represents the operation of these systems through an active digital twin, thereby making computer-aided design, simulation, control, and monitoring easier and better suited to the system. Furthermore, our methodology enables the detection of a system's actions on its own inputs (for example, in pneumatics: backflow of gases trapped in part of a fluidic system onto its own inputs). During the simulation or monitoring phase, the approach also facilitates real-time diagnosis of the controlled system. The outputs, on the controlled physical system or its digital twin, depend not only on the current inputs but also on the history of the inputs and the history of internal states and variables. In other words, the underlying sequential logic has memory, whereas a purely combinational logic approach does not. These capabilities can contribute to the digital transformation of the factory of the future.
comment: 9 pages, 6 figures
A Black-Box Physics-Informed Estimator based on Gaussian Process Regression for Robot Inverse Dynamics Identification
Learning the inverse dynamics of robots directly from data, adopting a black-box approach, is interesting for several real-world scenarios where limited knowledge about the system is available. In this paper, we propose a black-box model based on Gaussian Process (GP) Regression for the identification of the inverse dynamics of robotic manipulators. The proposed model relies on a novel multidimensional kernel, called the Lagrangian Inspired Polynomial (LIP) kernel. The LIP kernel is based on two main ideas. First, instead of directly modeling the inverse dynamics components, we model the kinetic and potential energy of the system as GPs. The GP prior on the inverse dynamics components is derived from those on the energies by applying the properties of GPs under linear operators. Second, as regards the energy prior definition, we prove a polynomial structure of the kinetic and potential energy, and we derive a polynomial kernel that encodes this property. As a consequence, the proposed model also allows estimating the kinetic and potential energy without requiring any label on these quantities. Results in simulation and on two real robotic manipulators, namely a 7 DOF Franka Emika Panda and a 6 DOF MELFA RV4FL, show that the proposed model outperforms state-of-the-art black-box estimators based on both Gaussian Processes and Neural Networks in terms of accuracy, generality, and data efficiency. The experiments on the MELFA robot also demonstrate that our approach achieves performance comparable to fine-tuned model-based estimators, despite requiring less prior information.
A Hypergraph-Based Machine Learning Ensemble Network Intrusion Detection System
Network intrusion detection systems (NIDS) for detecting malicious attacks continue to face challenges. NIDS are often developed offline while they face auto-generated port scan infiltration attempts, resulting in a significant time lag between adversarial adaptation and NIDS response. To address these challenges, we use hypergraphs focused on internet protocol addresses and destination ports to capture evolving patterns of port scan attacks. The derived set of hypergraph-based metrics is then used to train an ensemble machine learning (ML) based NIDS that allows for real-time adaptation in monitoring and detecting port scanning activities, other types of attacks, and adversarial intrusions with high accuracy, precision, and recall. This adaptive ML NIDS was developed through the combination of (1) intrusion examples, (2) NIDS update rules, (3) attack threshold choices to trigger NIDS retraining requests, and (4) a production environment with no prior knowledge of the nature of network traffic. Forty scenarios were auto-generated to evaluate the ML ensemble NIDS comprising three tree-based models. The resulting ML ensemble NIDS was extended and evaluated with the CIC-IDS2017 dataset. Results show that under an Update-ALL-NIDS rule (specifically, retraining and updating all three models upon the same NIDS retraining request), the proposed ML ensemble NIDS evolved intelligently and produced the best results, with nearly 100% detection performance throughout the simulation.
comment: in IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2024. An updated version of this work has been accepted for publication in an IEEE journal available here: https://ieeexplore.ieee.org/document/10666746
A Flexible and Resilient Formation Approach based on Hierarchical Reorganization
Conventional formation methods typically rely on fixed hierarchical structures, such as predetermined leaders or predefined formation shapes. These rigid hierarchies can render formations cumbersome and inflexible in complex environments, leading to potential failure if any leader loses connectivity. To address these limitations, this paper introduces a reconfigurable affine formation that enhances both flexibility and resilience through hierarchical reorganization. The paper first elucidates the critical role of hierarchical reorganization, conceptualizing this process as involving role reallocation and dynamic changes in topological structures. To further investigate the conditions necessary for hierarchical reorganization, a reconfigurable hierarchical formation is developed based on graph theory, with its feasibility rigorously demonstrated. In conjunction with role transitions, a power-centric topology switching mechanism grounded in formation consensus convergence is proposed, ensuring coordinated resilience within the formation. Finally, simulations and experiments validate the performance of the proposed method. The aerial formations successfully performed multiple hierarchical reorganizations in both three-dimensional and two-dimensional spaces. Even in the event of a single leader's failure, the formation maintained stable flight through hierarchical reorganization. This rapid adaptability enables the robotic formations to execute complex tasks, including sharp turns and navigating through forests at speeds up to 1.9 m/s.
PowerFlowMultiNet: Multigraph Neural Networks for Unbalanced Three-Phase Distribution Systems
Efficiently solving unbalanced three-phase power flow in distribution grids is pivotal for grid analysis and simulation. There is a pressing need for scalable algorithms capable of handling large-scale unbalanced power grids that can provide accurate and fast solutions. To address this, deep learning techniques, especially Graph Neural Networks (GNNs), have emerged. However, existing literature primarily focuses on balanced networks, leaving a critical gap in supporting unbalanced three-phase power grids. This letter introduces PowerFlowMultiNet, a novel multigraph GNN framework explicitly designed for unbalanced three-phase power grids. The proposed approach models each phase separately in a multigraph representation, effectively capturing the inherent asymmetry in unbalanced grids. A graph embedding mechanism utilizing message passing is introduced to capture spatial dependencies within the power system network. PowerFlowMultiNet outperforms traditional methods and other deep learning approaches in terms of accuracy and computational speed. Rigorous testing reveals significantly lower error rates and a notable hundredfold increase in computational speed for large power networks compared to model-based methods.
Properties of Immersions for Systems with Multiple Limit Sets with Implications to Learning Koopman Embeddings
Linear immersions (such as Koopman eigenfunctions) of a nonlinear system have wide applications in prediction and control. In this work, we study the properties of linear immersions for nonlinear systems with multiple omega-limit sets. While previous research has indicated the possibility of discontinuous one-to-one linear immersions for such systems, it has been unclear whether continuous one-to-one linear immersions are attainable. Under mild conditions, we prove that any continuous immersion to a class of systems including finite-dimensional linear systems collapses all the omega-limit sets, and thus cannot be one-to-one. Furthermore, we show that this property is also shared by approximate linear immersions learned from data as sample size increases and sampling interval decreases. Multiple examples are studied to illustrate our results.
comment: 15 pages, 6 figures
Patching Approximately Safe Value Functions Leveraging Local Hamilton-Jacobi Reachability Analysis
Safe value functions, such as control barrier functions, characterize a safe set and synthesize a safety filter, overriding unsafe actions, for a dynamic system. While function approximators like neural networks can synthesize approximately safe value functions, they typically lack formal guarantees. In this paper, we propose a local dynamic programming-based approach to "patch" approximately safe value functions to obtain a safe value function. This algorithm, HJ-Patch, produces a novel value function that provides formal safety guarantees, yet retains the global structure of the initial value function. HJ-Patch modifies an approximately safe value function at states that are both (i) near the safety boundary and (ii) may violate safety. We iteratively update both this set of "active" states and the value function until convergence. This approach bridges the gap between value function approximation methods and formal safety through Hamilton-Jacobi (HJ) reachability, offering a framework for integrating various safety methods. We provide simulation results on analytic and learned examples, demonstrating HJ-Patch reduces the computational complexity by 2 orders of magnitude with respect to standard HJ reachability. Additionally, we demonstrate the perils of using approximately safe value functions directly and showcase improved safety using HJ-Patch.
comment: 8 pages, IEEE Conference on Decision and Control (CDC), 2024 (In Press)
Multistep Inverse Is Not All You Need
In real-world control settings, the observation space is often unnecessarily high-dimensional and subject to time-correlated noise. However, the controllable dynamics of the system are often far simpler than the dynamics of the raw observations. It is therefore desirable to learn an encoder to map the observation space to a simpler space of control-relevant variables. In this work, we consider the Ex-BMDP model, first proposed by Efroni et al. (2022), which formalizes control problems where observations can be factorized into an action-dependent latent state which evolves deterministically, and action-independent time-correlated noise. Lamb et al. (2022) propose the "AC-State" method for learning an encoder to extract a complete action-dependent latent state representation from the observations in such problems. AC-State is a multistep-inverse method, in that it uses the encoding of the first and last state in a path to predict the first action in the path. However, we identify cases where AC-State will fail to learn a correct latent representation of the agent-controllable factor of the state. We therefore propose a new algorithm, ACDF, which combines multistep-inverse prediction with a latent forward model. ACDF is guaranteed to correctly infer an action-dependent latent state encoder for a large class of Ex-BMDP models. We demonstrate the effectiveness of ACDF on tabular Ex-BMDPs through numerical simulations, as well as on high-dimensional environments using neural-network-based encoders. Code is available at https://github.com/midi-lab/acdf.
comment: RLC 2024
Systems and Control (EESS)
Stability of the Theta Method for Systems with Multiple Time-Delayed Variables
The paper focuses on the numerical stability and accuracy of implicit time-domain integration (TDI) methods when applied for the solution of a power system model impacted by time delays. Such a model is generally formulated as a set of delay differential algebraic equations (DDAEs) in non index-1 Hessenberg form. In particular, the paper shows that numerically stable ordinary differential equation (ODE) methods, such as the trapezoidal and the Theta method, can become unstable when applied to a power system that includes a significant number of delayed variables. Numerical stability is discussed through a scalar test delay differential equation, as well as through a matrix pencil approach that accounts for the DDAEs of any given dynamic power system model. Simulation results are presented in a case study based on the IEEE 39-bus system.
comment: 9 pages
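As an illustration of the scalar test equation mentioned in the abstract above, the sketch below applies the Theta method to y'(t) = a y(t) + b y(t - tau) with a constant step h that divides tau exactly. The convention used here (theta = 1 is backward Euler, theta = 0.5 is the trapezoidal rule) and all names (`theta_dde`, `history`) are assumptions, not taken from the paper.

```python
def theta_dde(a, b, tau, h, t_end, theta=0.5, history=lambda t: 1.0):
    """Integrate y'(t) = a*y(t) + b*y(t - tau) from t = 0 to t_end.

    Returns [y(0), y(h), ..., y(t_end)]. history(t) supplies y on [-tau, 0].
    """
    m = round(tau / h)  # delay expressed in steps
    if m < 1 or abs(m * h - tau) > 1e-12:
        raise ValueError("tau must be a positive integer multiple of h")
    n_steps = round(t_end / h)
    # y[k] stores y at time (k - m) * h, so y[0] = y(-tau) and y[m] = y(0).
    y = [history((k - m) * h) for k in range(m + 1)]
    for n in range(m, m + n_steps):
        f_n = a * y[n] + b * y[n - m]  # right-hand side at the current time
        # The delayed value needed at the next time, y[n + 1 - m], is already
        # known because m >= 1, so only the a*y term is implicit.
        rhs = y[n] + h * (1 - theta) * f_n + h * theta * b * y[n + 1 - m]
        y.append(rhs / (1 - h * theta * a))
    return y[m:]


# With b = 0 the scheme reduces to the trapezoidal rule for y' = -y.
ys = theta_dde(a=-1.0, b=0.0, tau=0.1, h=0.1, t_end=1.0)
```

Varying `b`, `tau`, and `h` in this sketch is one way to see numerically how the delayed term changes the step sizes for which the recursion stays bounded.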
Unmasking Covert Intrusions: Detection of Fault-Masking Cyberattacks on Differential Protection Systems
Line Current Differential Relays (LCDRs) are high-speed relays progressively used to protect critical transmission lines. However, LCDRs are vulnerable to cyberattacks. Fault-Masking Attacks (FMAs) are stealthy cyberattacks performed by manipulating the remote measurements of the targeted LCDR to disguise faults on the protected line. Hence, they remain undetected by this LCDR. In this paper, we propose a two-module framework to detect FMAs. The first module is a Mismatch Index (MI) developed from the protected transmission line's equivalent physical model. The MI is triggered only if there is a significant mismatch in the LCDR's local and remote measurements while the LCDR itself is untriggered, which indicates an FMA. After the MI is triggered, the second module, a neural network-based classifier, promptly confirms that the triggering event is a physical fault that lies on the line protected by the LCDR before declaring the occurrence of an FMA. The proposed framework is tested using the IEEE 39-bus benchmark system. Our simulation results confirm that the proposed framework can accurately detect FMAs on LCDRs and is not affected by normal system disturbances, variations, or measurement noise. Our experimental results using OPAL-RT's real-time simulator confirm the proposed solution's real-time performance capability.
comment: Accepted to IEEE Transactions on Systems, Man, and Cybernetics: Systems. © 2024 IEEE
Towards a Socially Acceptable Competitive Equilibrium in Energy Markets
This paper addresses the problem of energy sharing between a population of price-taking agents who adopt decentralized primal-dual gradient dynamics to find the Competitive Equilibrium (CE). Although the CE is efficient, it does not ensure fairness and can potentially lead to high prices. As the agents and market operator share a social responsibility to keep the price below a certain socially acceptable threshold, we propose an approach where the agents modify their utility functions in a decentralized way. We introduce a dynamic feedback controller for the primal-dual dynamics to steer the agents to a Socially acceptable Competitive Equilibrium (SCE). We demonstrate our theoretical findings in a case study.
comment: Extended version
Capturing Opportunity Costs of Batteries with a Staircase Supply-Demand Function
In the global pursuit of carbon neutrality, the role of batteries is indispensable. They provide pivotal flexibility to counter uncertainties from renewables, preferably by participating in electricity markets. Unlike thermal generators, however, the dominant type of cost for batteries is opportunity cost, which is vaguer and more challenging to represent through bids in stipulated formats. This article shows a surprising result to the contrary: the demand-supply function of an ideal battery, considering its opportunity cost, is a staircase function with no more than five segments, which is a perfect match with existing rules in many real electricity markets. The demand-supply function shifts horizontally with price forecasts and vertically with the initial SOC. These results can be generalized to imperfect batteries and numerous battery-like resources, including battery clusters, air-conditioners, and electric vehicle charging stations, although the number of segments may vary. These results pave the way for batteries to participate in electricity markets.
An updated look on the convergence and consistency of data-driven dynamical models
Deep sequence models are receiving significant interest in current machine learning research. By representing probability distributions that are fit to data using maximum likelihood estimation, such models can model data on general observation spaces (both continuous and discrete-valued). Furthermore, they can be applied to a wide range of modelling problems, including modelling of dynamical systems which are subject to control. The problem of learning data-driven models of systems subject to control is well studied in the field of system identification. In particular, there exist theoretical convergence and consistency results which can be used to analyze model behaviour and guide model development. However, these results typically concern models which provide point predictions of continuous-valued variables. Motivated by this, we derive convergence and consistency results for a class of nonlinear probabilistic models defined on a general observation space. The results rely on stability and regularity assumptions, and can be used to derive consistency conditions and bias expressions for nonlinear probabilistic models of systems under control. We illustrate the results on examples from linear system identification and Markov chains on finite state spaces.
comment: 9 pages, no figures. To be presented at the 63rd IEEE Conference on Decision and Control
Online Residual Learning from Offline Experts for Pedestrian Tracking
In this paper, we consider the problem of predicting unknown targets from data. We propose Online Residual Learning (ORL), a method that combines online adaptation with offline-trained predictions. At a lower level, we employ multiple offline predictions generated before or at the beginning of the prediction horizon. We augment every offline prediction by learning their respective residual error concerning the true target state online, using the recursive least squares algorithm. At a higher level, we treat the augmented lower-level predictors as experts, adopting the Prediction with Expert Advice framework. We utilize an adaptive softmax weighting scheme to form an aggregate prediction and provide guarantees for ORL in terms of regret. We employ ORL to boost performance in the setting of online pedestrian trajectory prediction. Based on data from the Stanford Drone Dataset, we show that ORL can demonstrate best-of-both-worlds performance.
comment: Accepted to CDC 2024
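The higher-level expert aggregation described in the abstract above can be sketched as follows. This is a simplified stand-in: it uses exponential (softmax) weights with a fixed rate `eta` over cumulative losses, whereas the abstract describes an adaptive weighting scheme whose details are not given. The lower-level residual learning with recursive least squares is omitted, and the names `aggregate` and `eta` are assumptions.

```python
import math


def aggregate(predictions, cum_losses, eta=1.0):
    """Combine expert predictions with softmax weights on negative cumulative loss.

    Returns (aggregate_prediction, weights). Experts with lower loss get more weight.
    """
    best = min(cum_losses)  # shift by the best loss for numerical stability
    weights = [math.exp(-eta * (loss - best)) for loss in cum_losses]
    total = sum(weights)
    weights = [w / total for w in weights]
    combined = sum(w * p for w, p in zip(weights, predictions))
    return combined, weights


# Two experts with equal past loss are averaged; a poor expert is nearly ignored.
equal_pred, equal_w = aggregate([1.0, 3.0], [0.0, 0.0])
skewed_pred, skewed_w = aggregate([1.0, 3.0], [0.0, 100.0])
```

After each new observation, every expert's cumulative loss would be updated and the weights recomputed, so the aggregate tracks whichever predictor has performed best so far.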
Jam-absorption driving with data assimilation
This paper introduces a data assimilation (DA) framework based on the extended Kalman filter-cell transmission model, designed to assist jam-absorption driving (JAD) operation to alleviate sag traffic congestion. To ascertain and demonstrate the effectiveness of the DA framework for JAD operation, in this paper, we initially investigated its impact on the motion and control performance of a single absorbing vehicle. Numerical results show that the DA framework effectively mitigated underestimated or overestimated control failures of JAD caused by misestimation of key parameters (e.g., free flow speed and critical density) of the traffic flow fundamental diagram. The findings suggest that the proposed DA framework can reduce control failures and prevent significant declines and deteriorations in JAD performance caused by changes in traffic characteristics, e.g., weather conditions or traffic composition.
comment: SMC 2024. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Decentralized Learning in General-sum Markov Games
The Markov game framework is widely used to model interactions among agents with heterogeneous utilities in dynamic and uncertain societal-scale systems. In these systems, agents typically operate in a decentralized manner due to privacy and scalability concerns, often acting without any information about other agents. The design and analysis of decentralized learning algorithms that provably converge to rational outcomes remain elusive, especially beyond Markov zero-sum games and Markov potential games, which do not adequately capture the nature of many real-world interactions, which are neither fully competitive nor fully cooperative. This paper investigates the design of decentralized learning algorithms for general-sum Markov games, aiming to provide provable guarantees of convergence to approximate Nash equilibria in the long run. Our approach builds on constructing a Markov Near-Potential Function (MNPF) to address the intractability of designing algorithms that converge to exact Nash equilibria. We demonstrate that MNPFs play a central role in ensuring the convergence of an actor-critic-based decentralized learning algorithm to approximate Nash equilibria. By leveraging a two-timescale approach, where Q-function estimates are updated faster than policy updates, we show that the system converges to a level set of the MNPF over the set of approximate Nash equilibria. This convergence result is further strengthened if the set of Nash equilibria is assumed to be finite. Our findings provide a new perspective on the analysis and design of decentralized learning algorithms in multi-agent systems.
comment: 16 pages, 1 figure
Whittle Index Learning Algorithms for Restless Bandits with Constant Stepsizes
We study the Whittle index learning algorithm for restless multi-armed bandits. We consider an index learning algorithm with Q-learning. We first present the Q-learning algorithm with exploration policies (epsilon-greedy, softmax, epsilon-softmax) with constant stepsizes. We extend the study of Q-learning to index learning for a single-armed restless bandit. The index learning algorithm is a two-timescale variant of stochastic approximation: on the slower timescale we update the index learning scheme, and on the faster timescale we update Q-learning assuming a fixed index value. The Q-learning updates are asynchronous. We study a constant-stepsize two-timescale stochastic approximation algorithm and provide an analysis of two-timescale stochastic approximation for index learning with constant stepsizes. Further, we present a study of index learning with deep Q-network (DQN) learning and linear function approximation with the state-aggregation method. We describe the performance of our algorithms using numerical examples. We show that index learning with Q-learning, DQN, and function approximation learns the Whittle index.
comment: 14 pages
Multi-scale Feature Fusion with Point Pyramid for 3D Object Detection
Effective point cloud processing is crucial to LiDAR-based autonomous driving systems. The capability to understand features at multiple scales is required for object detection of intelligent vehicles, where road users may appear in different sizes. Recent methods focus on the design of the feature aggregation operators, which collect features at different scales from the encoder backbone and assign them to the points of interest. While effort has gone into the aggregation modules, the importance of how to fuse these multi-scale features has been overlooked. This leads to insufficient feature communication across scales. To address this issue, this paper proposes the Point Pyramid RCNN (POP-RCNN), a feature pyramid-based framework for 3D object detection on point clouds. POP-RCNN consists of a Point Pyramid Feature Enhancement (PPFE) module to establish connections across spatial scales and semantic depths for information exchange. The PPFE module effectively fuses multi-scale features for rich information without the increased complexity in feature aggregation. To remedy the impact of inconsistent point densities, a point density confidence module is deployed. This design integration enables the use of a lightweight feature aggregator, and the emphasis on both shallow and deep semantics, realising a detection framework for 3D object detection. With great adaptability, the proposed method can be applied to a variety of existing frameworks to increase feature richness, especially for long-distance detection. By adopting the PPFE in the voxel-based and point-voxel-based baselines, experimental results on KITTI and Waymo Open Dataset show that the proposed method achieves remarkable performance even with limited computational headroom.
comment: 12 pages
Faster Q-Learning Algorithms for Restless Bandits
We study the Whittle index learning algorithm for restless multi-armed bandits (RMAB). We first present the Q-learning algorithm and its variants: speedy Q-learning (SQL), generalized speedy Q-learning (GSQL), and phase Q-learning (PhaseQL). We also discuss exploration policies: $\epsilon$-greedy and upper confidence bound (UCB). We extend the study of Q-learning and its variants to the UCB policy. We illustrate with a numerical example that Q-learning with the UCB exploration policy converges faster, and that PhaseQL with UCB has the fastest convergence rate. We next extend the study of Q-learning variants to index learning for RMAB. The index learning algorithm is a two-timescale variant of stochastic approximation: on the slower timescale we update the index learning scheme, and on the faster timescale we update Q-learning assuming a fixed index value. We study a constant-stepsize two-timescale stochastic approximation algorithm. We describe the performance of our algorithms using a numerical example. It illustrates that index learning using Q-learning with UCB converges faster than with $\epsilon$-greedy. Further, PhaseQL (with UCB and $\epsilon$-greedy) converges faster than the other Q-learning algorithms.
comment: 7 pages, 3 figures, conference. arXiv admin note: substantial text overlap with arXiv:2409.04605
A Centralized Discovery-Based Method for Integrating Data Distribution Service and Time-Sensitive Networking in In-Vehicle Networks
As the electronic and electrical architecture (E/EA) of intelligent and connected vehicles (ICVs) evolves, traditional distributed and signal-oriented architectures are being replaced by centralized, service-oriented architectures (SOA). This new generation of E/EA demands in-vehicle networks (IVNs) that offer high bandwidth, real-time performance, reliability, and service orientation. Data distribution service (DDS) and time-sensitive networking (TSN) are increasingly adopted to address these requirements. However, research on the integrated deployment of DDS and TSN in automotive applications is still in its infancy. This paper presents a DDS over TSN (DoT) communication architecture based on the centralized discovery architecture (CDA). First, a lightweight DDS implementation (FastDDS-lw) is developed for resource-constrained in-vehicle devices. Next, a DDS flow identification algorithm (DFIA) based on the CDA is introduced to automatically identify potential DDS flows during the discovery phase. Finally, the DoT communication architecture is designed, incorporating FastDDS-lw and DFIA. Experimental results show that the DoT architecture significantly reduces end-to-end latency and jitter for critical DDS flows compared to traditional Ethernet. Additionally, DoT provides an automated network configuration method that completes within a few tens of milliseconds.
A Digital Twin Design Methodology for Control, Simulation, and Monitoring of Fluidic Circuits
We propose a synthesis method for the design of digital twins applicable to various systems (pneumatic, hydraulic, electrical/electronic circuits). The methodology represents the operation of these systems through an active digital twin, thereby making computer-aided design, simulation, control, and monitoring easier and better suited to the system. Furthermore, our methodology enables the detection of a system's actions on its own inputs (for example, in pneumatics: backflow of gases trapped in part of a fluidic system onto its own inputs). During the simulation or monitoring phase, the approach also facilitates real-time diagnosis of the controlled system. The outputs, on the controlled physical system or its digital twin, depend not only on the current inputs but also on the history of the inputs and the history of internal states and variables. In other words, the underlying sequential logic has memory, whereas a purely combinational logic approach does not. These capabilities can contribute to the digital transformation of the factory of the future.
comment: 9 pages, 6 figures
A Black-Box Physics-Informed Estimator based on Gaussian Process Regression for Robot Inverse Dynamics Identification
Learning the inverse dynamics of robots directly from data, adopting a black-box approach, is interesting for several real-world scenarios where limited knowledge about the system is available. In this paper, we propose a black-box model based on Gaussian Process (GP) Regression for the identification of the inverse dynamics of robotic manipulators. The proposed model relies on a novel multidimensional kernel, called the Lagrangian Inspired Polynomial (LIP) kernel. The LIP kernel is based on two main ideas. First, instead of directly modeling the inverse dynamics components, we model the kinetic and potential energy of the system as GPs. The GP prior on the inverse dynamics components is derived from those on the energies by applying the properties of GPs under linear operators. Second, as regards the energy prior definition, we prove a polynomial structure of the kinetic and potential energy, and we derive a polynomial kernel that encodes this property. As a consequence, the proposed model also allows estimating the kinetic and potential energy without requiring any label on these quantities. Results on simulation and on two real robotic manipulators, namely a 7 DOF Franka Emika Panda and a 6 DOF MELFA RV4FL, show that the proposed model outperforms state-of-the-art black-box estimators based both on Gaussian Processes and Neural Networks in terms of accuracy, generality and data efficiency. The experiments on the MELFA robot also demonstrate that our approach achieves performance comparable to fine-tuned model-based estimators, despite requiring less prior information.
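The derivation outlined in this abstract can be sketched with the standard Euler-Lagrange equations. The notation below ($q$, $\mathcal{T}$, $\mathcal{V}$, $k_{\mathcal{T}}$, $k_{\mathcal{V}}$, $\mathcal{G}_i$) and the assumption of independent zero-mean energy priors are illustrative choices, not taken from the paper.

```latex
% Lagrangian built from the kinetic and potential energies
\mathcal{L}(q, \dot q) = \mathcal{T}(q, \dot q) - \mathcal{V}(q)

% Euler-Lagrange equations: each torque is a linear operator applied to L
\tau_i = \mathcal{G}_i \mathcal{L}
       = \frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot q_i}
       - \frac{\partial \mathcal{L}}{\partial q_i}

% Assumed priors: independent zero-mean GPs on the two energies
\mathcal{T} \sim \mathcal{GP}(0, k_{\mathcal{T}}), \quad
\mathcal{V} \sim \mathcal{GP}(0, k_{\mathcal{V}})
\;\Rightarrow\;
\mathcal{L} \sim \mathcal{GP}(0, k_{\mathcal{L}}), \quad
k_{\mathcal{L}} = k_{\mathcal{T}} + k_{\mathcal{V}}

% A linear operator maps a GP to a GP, so with x = (q, \dot q, \ddot q):
\operatorname{Cov}\!\left[\tau_i(x), \tau_j(x')\right]
  = \mathcal{G}_i^{x}\, \mathcal{G}_j^{x'}\, k_{\mathcal{L}}(x, x')
```

Under these assumptions the torque kernel follows entirely from the energy kernels, which is why torque measurements alone can inform estimates of the energies.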
A Hypergraph-Based Machine Learning Ensemble Network Intrusion Detection System
Network intrusion detection systems (NIDS) to detect malicious attacks continue to meet challenges. NIDS are often developed offline while they face auto-generated port scan infiltration attempts, resulting in a significant time lag from adversarial adaption to NIDS response. To address these challenges, we use hypergraphs focused on internet protocol addresses and destination ports to capture evolving patterns of port scan attacks. The derived set of hypergraph-based metrics is then used to train an ensemble machine learning (ML) based NIDS that allows for real-time adaption in monitoring and detecting port scanning activities, other types of attacks, and adversarial intrusions with high accuracy, precision, and recall. This ML adapting NIDS was developed through the combination of (1) intrusion examples, (2) NIDS update rules, (3) attack threshold choices to trigger NIDS retraining requests, and (4) a production environment with no prior knowledge of the nature of network traffic. 40 scenarios were auto-generated to evaluate the ML ensemble NIDS comprising three tree-based models. The resulting ML Ensemble NIDS was extended and evaluated with the CIC-IDS2017 dataset. Results show that under the model settings of an Update-ALL-NIDS rule (specifically, retraining and updating all three models upon the same NIDS retraining request) the proposed ML ensemble NIDS evolved intelligently and produced the best results with nearly 100% detection performance throughout the simulation.
comment: in IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2024. An updated version of this work has been accepted for publication in an IEEE journal available here: https://ieeexplore.ieee.org/document/10666746
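One way to picture a hypergraph over IP addresses and destination ports is sketched below. The construction is an assumption made for illustration (one hyperedge per source IP, containing the distinct destination ports it contacted), and the function names `port_hyperedges` and `hyperedge_sizes` are hypothetical; the paper's actual hypergraph definition and metrics may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def port_hyperedges(flows: Iterable[Tuple[str, int]]) -> Dict[str, Set[int]]:
    """Group flows into one hyperedge per source IP (illustrative assumption).

    Each flow is a (source_ip, destination_port) pair. The hyperedge for a
    source IP is the set of distinct destination ports it contacted.
    """
    edges: Dict[str, Set[int]] = defaultdict(set)
    for src_ip, dst_port in flows:
        edges[src_ip].add(dst_port)
    return dict(edges)


def hyperedge_sizes(edges: Dict[str, Set[int]]) -> Dict[str, int]:
    """Return the number of distinct ports in each hyperedge.

    A source that touches many distinct ports is a candidate port scanner.
    """
    return {ip: len(ports) for ip, ports in edges.items()}


flows = [("10.0.0.1", 22), ("10.0.0.1", 23), ("10.0.0.1", 22), ("10.0.0.2", 443)]
sizes = hyperedge_sizes(port_hyperedges(flows))
```

Features like these sizes could then feed a tree-based classifier, in the spirit of the ensemble described in the abstract.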
A Flexible and Resilient Formation Approach based on Hierarchical Reorganization
Conventional formation methods typically rely on fixed hierarchical structures, such as predetermined leaders or predefined formation shapes. These rigid hierarchies can render formations cumbersome and inflexible in complex environments, leading to potential failure if any leader loses connectivity. To address these limitations, this paper introduces a reconfigurable affine formation that enhances both flexibility and resilience through hierarchical reorganization. The paper first elucidates the critical role of hierarchical reorganization, conceptualizing this process as involving role reallocation and dynamic changes in topological structures. To further investigate the conditions necessary for hierarchical reorganization, a reconfigurable hierarchical formation is developed based on graph theory, with its feasibility rigorously demonstrated. In conjunction with role transitions, a power-centric topology switching mechanism grounded in formation consensus convergence is proposed, ensuring coordinated resilience within the formation. Finally, simulations and experiments validate the performance of the proposed method. The aerial formations successfully performed multiple hierarchical reorganizations in both three-dimensional and two-dimensional spaces. Even in the event of a single leader's failure, the formation maintained stable flight through hierarchical reorganization. This rapid adaptability enables the robotic formations to execute complex tasks, including sharp turns and navigating through forests at speeds up to 1.9 m/s.
PowerFlowMultiNet: Multigraph Neural Networks for Unbalanced Three-Phase Distribution Systems
Efficiently solving unbalanced three-phase power flow in distribution grids is pivotal for grid analysis and simulation. There is a pressing need for scalable algorithms capable of handling large-scale unbalanced power grids that can provide accurate and fast solutions. To address this, deep learning techniques, especially Graph Neural Networks (GNNs), have emerged. However, existing literature primarily focuses on balanced networks, leaving a critical gap in supporting unbalanced three-phase power grids. This letter introduces PowerFlowMultiNet, a novel multigraph GNN framework explicitly designed for unbalanced three-phase power grids. The proposed approach models each phase separately in a multigraph representation, effectively capturing the inherent asymmetry in unbalanced grids. A graph embedding mechanism utilizing message passing is introduced to capture spatial dependencies within the power system network. PowerFlowMultiNet outperforms traditional methods and other deep learning approaches in terms of accuracy and computational speed. Rigorous testing reveals significantly lower error rates and a notable hundredfold increase in computational speed for large power networks compared to model-based methods.
Properties of Immersions for Systems with Multiple Limit Sets with Implications to Learning Koopman Embeddings
Linear immersions (such as Koopman eigenfunctions) of a nonlinear system have wide applications in prediction and control. In this work, we study the properties of linear immersions for nonlinear systems with multiple omega-limit sets. While previous research has indicated the possibility of discontinuous one-to-one linear immersions for such systems, it has been unclear whether continuous one-to-one linear immersions are attainable. Under mild conditions, we prove that any continuous immersion to a class of systems including finite-dimensional linear systems collapses all the omega-limit sets, and thus cannot be one-to-one. Furthermore, we show that this property is also shared by approximate linear immersions learned from data as sample size increases and sampling interval decreases. Multiple examples are studied to illustrate our results.
comment: 15 pages, 6 figures
Patching Approximately Safe Value Functions Leveraging Local Hamilton-Jacobi Reachability Analysis
Safe value functions, such as control barrier functions, characterize a safe set and synthesize a safety filter, overriding unsafe actions, for a dynamic system. While function approximators like neural networks can synthesize approximately safe value functions, they typically lack formal guarantees. In this paper, we propose a local dynamic programming-based approach to "patch" approximately safe value functions to obtain a safe value function. This algorithm, HJ-Patch, produces a novel value function that provides formal safety guarantees, yet retains the global structure of the initial value function. HJ-Patch modifies an approximately safe value function at states that are both (i) near the safety boundary and (ii) may violate safety. We iteratively update both this set of "active" states and the value function until convergence. This approach bridges the gap between value function approximation methods and formal safety through Hamilton-Jacobi (HJ) reachability, offering a framework for integrating various safety methods. We provide simulation results on analytic and learned examples, demonstrating HJ-Patch reduces the computational complexity by 2 orders of magnitude with respect to standard HJ reachability. Additionally, we demonstrate the perils of using approximately safe value functions directly and showcase improved safety using HJ-Patch.
comment: 8 pages, IEEE Conference on Decision and Control (CDC), 2024 (In Press)
Multistep Inverse Is Not All You Need
In real-world control settings, the observation space is often unnecessarily high-dimensional and subject to time-correlated noise. However, the controllable dynamics of the system are often far simpler than the dynamics of the raw observations. It is therefore desirable to learn an encoder to map the observation space to a simpler space of control-relevant variables. In this work, we consider the Ex-BMDP model, first proposed by Efroni et al. (2022), which formalizes control problems where observations can be factorized into an action-dependent latent state which evolves deterministically, and action-independent time-correlated noise. Lamb et al. (2022) propose the "AC-State" method for learning an encoder to extract a complete action-dependent latent state representation from the observations in such problems. AC-State is a multistep-inverse method, in that it uses the encoding of the first and last state in a path to predict the first action in the path. However, we identify cases where AC-State will fail to learn a correct latent representation of the agent-controllable factor of the state. We therefore propose a new algorithm, ACDF, which combines multistep-inverse prediction with a latent forward model. ACDF is guaranteed to correctly infer an action-dependent latent state encoder for a large class of Ex-BMDP models. We demonstrate the effectiveness of ACDF on tabular Ex-BMDPs through numerical simulations, as well as on high-dimensional environments using neural-network-based encoders. Code is available at https://github.com/midi-lab/acdf.
comment: RLC 2024
Robotics
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Complex 3D scene understanding has gained increasing attention, with scene encoding strategies playing a crucial role in this success. However, the optimal scene encoding strategies for various scenarios remain unclear, particularly compared to their image-based counterparts. To address this issue, we present a comprehensive study that probes various visual encoding models for 3D scene understanding, identifying the strengths and limitations of each model across different scenarios. Our evaluation spans seven vision foundation encoders, including image-based, video-based, and 3D foundation models. We evaluate these models in four tasks: Vision-Language Scene Reasoning, Visual Grounding, Segmentation, and Registration, each focusing on different aspects of scene understanding. Our evaluations yield key findings: DINOv2 demonstrates superior performance, video models excel in object-level tasks, diffusion models benefit geometric tasks, and language-pretrained models show unexpected limitations in language-related tasks. These insights challenge some conventional understandings, provide novel perspectives on leveraging visual foundation models, and highlight the need for more flexible encoder selection in future vision-language and scene-understanding tasks.
comment: Project page: https://yunzeman.github.io/lexicon3d , Github: https://github.com/YunzeMan/Lexicon3D
Reprogrammable sequencing for physically intelligent under-actuated robots
Programming physical intelligence into mechanisms holds great promise for machines that can accomplish tasks such as navigation of unstructured environments while utilizing a minimal amount of computational resources and electronic components. In this study, we introduce a novel design approach for physically intelligent under-actuated mechanisms capable of autonomously adjusting their motion in response to environmental interactions. Specifically, multistability is harnessed to sequence the motion of different degrees of freedom in a programmed order. A key aspect of this approach is that these sequences can be passively reprogrammed through mechanical stimuli that arise from interactions with the environment. To showcase our approach, we construct a four degree of freedom robot capable of autonomously navigating mazes and moving away from obstacles. Remarkably, this robot operates without relying on traditional computational architectures and utilizes only a single linear actuator.
View-Invariant Policy Learning via Zero-Shot Novel View Synthesis
Large-scale visuomotor policy learning is a promising approach toward developing generalizable manipulation systems. Yet, policies that can be deployed on diverse embodiments, environments, and observational modalities remain elusive. In this work, we investigate how knowledge from large-scale visual data of the world may be used to address one axis of variation for generalizable manipulation: observational viewpoint. Specifically, we study single-image novel view synthesis models, which learn 3D-aware scene-level priors by rendering images of the same scene from alternate camera viewpoints given a single input image. For practical application to diverse robotic data, these models must operate zero-shot, performing view synthesis on unseen tasks and environments. We empirically analyze view synthesis models within a simple data-augmentation scheme that we call View Synthesis Augmentation (VISTA) to understand their capabilities for learning viewpoint-invariant policies from single-viewpoint demonstration data. Upon evaluating the robustness of policies trained with our method to out-of-distribution camera viewpoints, we find that they outperform baselines in both simulated and real-world manipulation tasks. Videos and additional visualizations are available at https://s-tian.github.io/projects/vista.
comment: Accepted to CoRL 2024
Modular Parallel Manipulator for Long-Term Soft Robotic Data Collection
Performing long-term experimentation or large-scale data collection for machine learning in the field of soft robotics is challenging, due to the hardware robustness and experimental flexibility required. In this work, we propose a modular parallel robotic manipulation platform suitable for such large-scale data collection and compatible with various soft-robotic fabrication methods. Considering the computational and theoretical difficulty of replicating the high-fidelity, faster-than-real-time simulations that enable large-scale data collection in rigid robotic systems, a robust soft-robotic hardware platform becomes a high-priority development task for the field. The platform's modules consist of a pair of off-the-shelf electrical motors which actuate a customizable finger consisting of a compliant parallel structure. The parallel mechanism of the finger can be as simple as a single 3D-printed urethane or molded silicone bulk structure, due to the motors being able to fully actuate a passive structure. This design flexibility allows experimentation with soft mechanisms of varied geometries, bulk properties and surface properties. Additionally, while the parallel mechanism does not require separate electronics or additional parts, these can be included, and it can be constructed using multi-functional soft materials to study compatible soft sensors and actuators in the learning process. In this work, we validate the platform's ability to be used for policy gradient reinforcement learning directly on hardware in a benchmark 2D manipulation task. We additionally demonstrate compatibility with multiple fingers and characterize the design constraints for compatible extensions.
MaskVal: Simple but Effective Uncertainty Quantification for 6D Pose Estimation
For the use of 6D pose estimation in robotic applications, reliable poses are of utmost importance to ensure a safe, reliable and predictable operational performance. Despite these requirements, state-of-the-art 6D pose estimators often do not provide any uncertainty quantification for their pose estimates at all, or if they do, it has been shown that the uncertainty provided is only weakly correlated with the actual true error. To address this issue, we investigate a simple but effective uncertainty quantification, that we call MaskVal, which compares the pose estimates with their corresponding instance segmentations by rendering and does not require any modification of the pose estimator itself. Despite its simplicity, MaskVal significantly outperforms a state-of-the-art ensemble method on both a dataset and a robotic setup. We show that by using MaskVal, the performance of a state-of-the-art 6D pose estimator is significantly improved towards a safe and reliable operation. In addition, we propose a new and specific approach to compare and evaluate uncertainty quantification methods for 6D pose estimation in the context of robotic manipulation.
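The idea of comparing a rendered pose estimate against its instance segmentation can be sketched as a mask-overlap score. The use of intersection-over-union and the function name `mask_iou` are assumptions made for illustration; the abstract does not state which comparison MaskVal uses. Masks are given as nested lists of 0/1 values to keep the sketch dependency-free.

```python
from typing import Sequence


def mask_iou(rendered: Sequence[Sequence[int]],
             segmented: Sequence[Sequence[int]]) -> float:
    """Intersection-over-union of two binary masks of identical shape.

    `rendered` stands for the object mask rendered from the estimated pose,
    `segmented` for the instance segmentation mask. A low score suggests the
    pose estimate disagrees with the segmentation (illustrative assumption).
    """
    if len(rendered) != len(segmented) or any(
        len(row_r) != len(row_s) for row_r, row_s in zip(rendered, segmented)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for row_r, row_s in zip(rendered, segmented):
        for r, s in zip(row_r, row_s):
            intersection += int(bool(r) and bool(s))
            union += int(bool(r) or bool(s))

    # Two empty masks share no evidence, so report zero agreement.
    return intersection / union if union else 0.0


score = mask_iou([[1, 1, 0], [0, 0, 0]], [[0, 1, 1], [0, 0, 0]])
```

A downstream system could reject pose estimates whose score falls below a chosen threshold, which requires no change to the pose estimator itself.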
Interactive Surgical Liver Phantom for Cholecystectomy Training
Training and prototype development in robot-assisted surgery require appropriate and safe environments for the execution of surgical procedures. Current dry lab laparoscopy phantoms often lack the ability to mimic complex, interactive surgical tasks. This work presents an interactive surgical phantom for cholecystectomy. The phantom enables the removal of the gallbladder during cholecystectomy by allowing manipulations and cutting interactions with the synthetic tissue. The force-displacement behavior of the gallbladder is modelled based on retraction demonstrations. The force model is compared to the force model of ex-vivo porcine gallbladders and evaluated on its ability to estimate retraction forces.
comment: As presented at CURAC 2023
FLAF: Focal Line and Feature-constrained Active View Planning for Visual Teach and Repeat
This paper presents FLAF, a focal line and feature-constrained active view planning method for tracking failure avoidance in feature-based visual navigation of mobile robots. Our FLAF-based visual navigation is built upon a feature-based visual teach and repeat (VT&R) framework, which supports many robotic applications by teaching a robot to navigate on various paths that cover a significant portion of daily autonomous navigation requirements. However, tracking failure in feature-based visual simultaneous localization and mapping (VSLAM) caused by textureless regions in human-made environments still limits the adoption of VT&R in the real world. To address this problem, the proposed view planner is integrated into a feature-based visual SLAM system to build up an active VT&R system that avoids tracking failure. In our system, a pan-tilt unit (PTU)-based active camera is mounted on the mobile robot. Using FLAF, the active camera-based VSLAM operates during the teaching phase to construct a complete path map and in the repeat phase to maintain stable localization. FLAF orients the robot toward more map points to avoid mapping failures during path learning and toward more feature-identifiable map points beneficial for localization while following the learned trajectory. Experiments in real scenarios demonstrate that FLAF outperforms the methods that do not consider feature-identifiability, and our active VT&R system performs well in complex environments by effectively dealing with low-texture regions.
Neural HD Map Generation from Multiple Vectorized Tiles Locally Produced by Autonomous Vehicles
The high-definition (HD) map is a fundamental component of autonomous driving systems, as it can provide precise environmental information about driving scenes. Recent work on vectorized map generation can produce only 65% of the local map elements around the ego-vehicle at runtime in one tour with onboard sensors, leaving open the question of how to construct a global HD map projected in the world coordinate system under high-quality standards. To address the issue, we present GNMap, an end-to-end generative neural network that automatically constructs HD maps from multiple vectorized tiles which are locally produced by autonomous vehicles through several tours. It leverages a multi-layer and attention-based autoencoder as the shared network, whose parameters are learned from two different tasks (i.e., pretraining and finetuning, respectively) to ensure both the completeness of generated maps and the correctness of element categories. Abundant qualitative evaluations are conducted on a real-world dataset and experimental results show that GNMap can surpass the SOTA method by more than 5% F1 score, reaching the level of industrial usage with a small amount of manual modification. We have already deployed it at Navinfo Co., Ltd., where it serves as indispensable software to automatically build HD maps for autonomous driving systems.
comment: Accepted by SpatialDI'24
KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale
We would like industrial robots to handle unstructured environments with cameras and perception pipelines. In contrast to traditional industrial robots that replay offline-crafted trajectories, online behavior planning is required for these perception-guided industrial applications. Aside from perception and planning algorithms, deploying perception-guided manipulators also requires substantial effort in integration. One approach is writing scripts in a traditional language (such as Python) to construct the planning problem and perform integration with other algorithmic modules & external devices. While scripting in Python is feasible for a handful of robots and applications, deploying perception-guided manipulation at scale (e.g., more than 10000 robot workstations in over 2000 customer sites) becomes intractable. To resolve this challenge, we propose a Domain-Specific Language (DSL) for perception-guided manipulation applications. To scale up the deployment, our DSL provides: 1) an easily accessible interface to construct & solve a sub-class of Task and Motion Planning (TAMP) problems that are important in practical applications; and 2) a mechanism to implement flexible control flow to perform integration and address customized requirements of distinct industrial applications. Combined with an intuitive graphical programming frontend, our DSL is mainly used by machine operators without coding experience in traditional programming languages. Within hours of training, operators are capable of orchestrating sophisticated manipulation behaviors with our DSL. Extensive practical deployments demonstrate the efficacy of our method.
Reinforcement Learning Approach to Optimizing Profilometric Sensor Trajectories for Surface Inspection
High-precision surface defect detection in manufacturing is essential for ensuring quality control. Laser triangulation profilometric sensors are key to this process, providing detailed and accurate surface measurements over a line. To achieve a complete and precise surface scan, accurate relative motion between the sensor and the workpiece is required. It is crucial to control the sensor pose to maintain optimal distance and relative orientation to the surface. It is also important to ensure uniform profile distribution throughout the scanning process. This paper presents a novel Reinforcement Learning (RL) based approach to optimize robot inspection trajectories for profilometric sensors. Building upon the Boustrophedon scanning method, our technique dynamically adjusts the sensor position and tilt to maintain optimal orientation and distance from the surface, while also ensuring a consistent profile distance for uniform and high-quality scanning. Utilizing a simulated environment based on the CAD model of the part, we replicate real-world scanning conditions, including sensor noise and surface irregularities. This simulation-based approach enables offline trajectory planning based on CAD models. Key contributions include the modeling of the state space, action space, and reward function, specifically designed for inspection applications using profilometric sensors. We use the Proximal Policy Optimization (PPO) algorithm to efficiently train the RL agent, demonstrating its capability to optimize inspection trajectories with profilometric sensors. To validate our approach, we conducted several experiments where a model trained on a specific training piece was tested on various parts in simulation. We also conducted a real-world experiment by executing the optimized trajectory, generated offline from a CAD model, to inspect a part using a UR3e robotic arm.
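The Boustrophedon (back-and-forth) pattern that this abstract builds on can be sketched as a waypoint generator over a flat rectangle. This is only the baseline sweep under simplifying assumptions (planar surface, fixed line spacing); the function name `boustrophedon_waypoints` is hypothetical, and the RL-based adjustment of sensor height and tilt described in the abstract is not modeled here.

```python
import math
from typing import List, Tuple


def boustrophedon_waypoints(width: float, height: float,
                            line_spacing: float) -> List[Tuple[float, float]]:
    """Return the end points of each pass of a back-and-forth sweep.

    The sweep covers the rectangle [0, width] x [0, height] with passes
    parallel to the x axis, alternating direction on every pass.
    """
    if width < 0 or height < 0 or line_spacing <= 0:
        raise ValueError("width/height must be >= 0 and line_spacing > 0")

    # Count passes with an integer index to avoid floating-point drift.
    n_passes = int(math.floor(height / line_spacing + 1e-9)) + 1

    waypoints: List[Tuple[float, float]] = []
    for i in range(n_passes):
        y = i * line_spacing
        if i % 2 == 0:
            start_x, end_x = 0.0, width   # even passes go left to right
        else:
            start_x, end_x = width, 0.0   # odd passes go right to left
        waypoints.append((start_x, y))
        waypoints.append((end_x, y))
    return waypoints


path = boustrophedon_waypoints(2.0, 1.0, 0.5)
```

In a learned variant, an agent would perturb each waypoint's pose rather than follow this fixed grid.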
F3T: A soft tactile unit with 3D force and temperature mathematical decoupling ability for robots
The human skin exhibits remarkable capability to perceive contact forces and environmental temperatures, providing intricate information essential for nuanced manipulation. Despite recent advancements in soft tactile sensors, a significant challenge remains in accurately decoupling signals - specifically, separating force from directional orientation and temperature - so existing sensors fail to meet the advanced application requirements of robots. This research proposes a multi-layered soft sensor unit (F3T) designed to achieve isolated measurements and mathematical decoupling of normal pressure, omnidirectional tangential forces, and temperature. We developed a circular coaxial magnetic film featuring a floating-mountain multi-layer capacitor, facilitating the physical decoupling of normal and tangential forces in all directions. Additionally, we incorporated an ion gel-based temperature sensing film atop the tactile sensor. This sensor is resilient to external pressure and deformation, enabling it to measure temperature and, crucially, eliminate capacitor errors induced by environmental temperature changes. This innovative design allows for the decoupled measurement of multiple signals, paving the way for advancements in higher-level robot motion control, autonomous decision-making, and task planning.
RoVi-Aug: Robot and Viewpoint Augmentation for Cross-Embodiment Robot Learning
Scaling up robot learning requires large and diverse datasets, and how to efficiently reuse collected data and transfer policies to new embodiments remains an open question. Emerging research such as the Open-X Embodiment (OXE) project has shown promise in leveraging skills by combining datasets including different robots. However, imbalances in the distribution of robot types and camera angles in many datasets make policies prone to overfit. To mitigate this issue, we propose RoVi-Aug, which leverages state-of-the-art image-to-image generative models to augment robot data by synthesizing demonstrations with different robots and camera views. Through extensive physical experiments, we show that, by training on robot- and viewpoint-augmented data, RoVi-Aug can zero-shot deploy on an unseen robot with significantly different camera angles. Compared to test-time adaptation algorithms such as Mirage, RoVi-Aug requires no extra processing at test time, does not assume known camera angles, and allows policy fine-tuning. Moreover, by co-training on both the original and augmented robot datasets, RoVi-Aug can learn multi-robot and multi-task policies, enabling more efficient transfer between robots and skills and improving success rates by up to 30%.
comment: CoRL 2024 (Oral)
Game On: Towards Language Models as RL Experimenters
We propose an agent architecture that automates parts of the common reinforcement learning experiment workflow, to enable automated mastery of control domains for embodied agents. To do so, it leverages a VLM to perform some of the capabilities normally required of a human experimenter, including the monitoring and analysis of experiment progress, the proposition of new tasks based on past successes and failures of the agent, decomposing tasks into a sequence of subtasks (skills), and retrieval of the skill to execute - enabling our system to build automated curricula for learning. We believe this is one of the first proposals for a system that leverages a VLM throughout the full experiment cycle of reinforcement learning. We provide a first prototype of this system, and examine the feasibility of current models and techniques for the desired level of automation. For this, we use a standard Gemini model, without additional fine-tuning, to provide a curriculum of skills to a language-conditioned Actor-Critic algorithm, in order to steer data collection so as to aid learning new skills. Data collected in this way is shown to be useful for learning and iteratively improving control policies in a robotics domain. Additional examination of the ability of the system to build a growing library of skills, and to judge the progress of the training of those skills, also shows promising results, suggesting that the proposed architecture provides a potential recipe for fully automated mastery of tasks and domains for embodied agents.
Fast Payload Calibration for Sensorless Contact Estimation Using Model Pre-training
Force and torque sensing is crucial in robotic manipulation across both collaborative and industrial settings. Traditional methods for dynamics identification enable the detection and control of external forces and torques without the need for costly sensors. However, these approaches show limitations in scenarios where robot dynamics, particularly the end-effector payload, are subject to changes. Moreover, existing calibration techniques face trade-offs between efficiency and accuracy due to concerns over joint space coverage. In this paper, we introduce a calibration scheme that leverages pre-trained Neural Network models to learn calibrated dynamics across a wide range of joint space in advance. This offline learning strategy significantly reduces the need for online data collection, whether for selection of the optimal model or identification of payload features, necessitating merely a 4-second trajectory for online calibration. This method is particularly effective in tasks that require frequent dynamics recalibration for precise contact estimation. We further demonstrate the efficacy of this approach through applications in sensorless joint and task compliance, accounting for payload variability.
comment: Accepted to Robotics and Automation Letters (RA-L), 8 pages
MouseSIS: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice
Enabled by large annotated datasets, tracking and segmentation of objects in videos has made remarkable progress in recent years. Despite these advancements, algorithms still struggle under degraded conditions and during fast movements. Event cameras are novel sensors with high temporal resolution and high dynamic range that offer promising advantages to address these challenges. However, annotated data for developing learning-based mask-level tracking algorithms with events is not available. To this end, we introduce: (i) a new task termed space-time instance segmentation, similar to video instance segmentation, whose goal is to segment instances throughout the entire duration of the sensor input (here, the input consists of quasi-continuous events and optionally aligned frames); and (ii) MouseSIS, a dataset for the new task, containing aligned grayscale frames and events. It includes annotated ground-truth labels (pixel-level instance segmentation masks) of a group of up to seven freely moving and interacting mice. We also provide two reference methods, which show that leveraging event data can consistently improve tracking performance, especially when used in combination with conventional cameras. The results highlight the potential of event-aided tracking in difficult scenarios. We hope our dataset opens the field of event-based video instance segmentation and enables the development of robust tracking algorithms for challenging conditions. https://github.com/tub-rip/MouseSIS
comment: 18 pages, 5 figures, ECCV Workshops
Masked Sensory-Temporal Attention for Sensor Generalization in Quadruped Locomotion
With the rising focus on quadrupeds, a generalized policy capable of handling different robot models and sensory inputs will be highly beneficial. Although several methods have been proposed to address different morphologies, it remains a challenge for learning-based policies to manage various combinations of proprioceptive information. This paper presents Masked Sensory-Temporal Attention (MSTA), a novel transformer-based model with masking for quadruped locomotion. It employs direct sensor-level attention to enhance sensory-temporal understanding and handle different combinations of sensor data, serving as a foundation for incorporating unseen information. This model can effectively understand its states even with a large portion of missing information, and is flexible enough to be deployed on a physical system despite the long input sequence.
comment: Project website for video: https://johnliudk.github.io/msta/
Bringing the RT-1-X Foundation Model to a SCARA robot
Traditional robotic systems require specific training data for each task, environment, and robot form. While recent advancements in machine learning have enabled models to generalize across new tasks and environments, the challenge of adapting these models to entirely new settings remains largely unexplored. This study addresses this by investigating the generalization capabilities of the RT-1-X robotic foundation model to a type of robot unseen during its training: a SCARA robot from UMI-RTX. Initial experiments reveal that RT-1-X does not generalize zero-shot to the unseen type of robot. However, fine-tuning of the RT-1-X model by demonstration allows the robot to learn a pickup task which was part of the foundation model (but learned for another type of robot). When the robot is presented with an object that is included in the foundation model but not in the fine-tuning dataset, it demonstrates that only the skill, but not the object-specific knowledge, has been transferred.
comment: 14 pages, submitted to the joint Artificial Intelligence & Machine Learning conference for Belgium, Netherlands & Luxembourg (BNAIC/BeNeLearn)
OccLLaMA: An Occupancy-Language-Action Generative World Model for Autonomous Driving
The rise of multi-modal large language models (MLLMs) has spurred their applications in autonomous driving. Recent MLLM-based methods perform action by learning a direct mapping from perception to action, neglecting the dynamics of the world and the relations between action and world dynamics. In contrast, human beings possess a world model that enables them to simulate future states based on a 3D internal visual representation and plan actions accordingly. To this end, we propose OccLLaMA, an occupancy-language-action generative world model, which uses semantic occupancy as a general visual representation and unifies vision-language-action (VLA) modalities through an autoregressive model. Specifically, we introduce a novel VQVAE-like scene tokenizer to efficiently discretize and reconstruct semantic occupancy scenes, considering their sparsity and class imbalance. Then, we build a unified multi-modal vocabulary for vision, language and action. Furthermore, we enhance an LLM, specifically LLaMA, to perform next token/scene prediction on the unified vocabulary to complete multiple tasks in autonomous driving. Extensive experiments demonstrate that OccLLaMA achieves competitive performance across multiple tasks, including 4D occupancy forecasting, motion planning, and visual question answering, showcasing its potential as a foundation model in autonomous driving.
Improving agent performance in fluid environments by perceptual pretraining
In this paper, we construct a pretraining framework for fluid environment perception, which includes an information compression model and the corresponding pretraining method. We test this framework on a two-cylinder problem through numerical simulation. The results show that after unsupervised pretraining with this framework, the intelligent agent can acquire key features of the surrounding fluid environment, thereby adapting more quickly and effectively to subsequent multi-scenario tasks. In our research, these tasks include perceiving the position of the upstream obstacle and actively avoiding shedding vortices in the flow field to achieve drag reduction. The better performance of the pretrained agent is discussed in the sensitivity analysis.
Upper-Limb Rehabilitation with a Dual-Mode Individualized Exoskeleton Robot: A Generative-Model-Based Solution
Several upper-limb exoskeleton robots have been developed for stroke rehabilitation, but their rather low level of individualized assistance typically limits their effectiveness and practicability. Individualized assistance involves an upper-limb exoskeleton robot continuously assessing feedback from a stroke patient and then meticulously adjusting interaction forces to suit specific conditions and online changes. This paper describes the development of a new upper-limb exoskeleton robot with a novel online generative capability that allows it to provide individualized assistance to support the rehabilitation training of stroke patients. Specifically, the upper-limb exoskeleton robot exploits generative models to customize the fine and fit trajectory for the patient, as medical conditions, responses, and comfort feedback during training generally differ between patients. This generative capability is integrated into the two working modes of the upper-limb exoskeleton robot: an active mirroring mode for patients who retain motor abilities on one side of the body and a passive following mode for patients who lack motor ability on both sides of the body. The performance of the upper-limb exoskeleton robot was illustrated in experiments involving healthy subjects and stroke patients.
Solving Stochastic Orienteering Problems with Chance Constraints Using Monte Carlo Tree Search
We present a new Monte Carlo Tree Search (MCTS) algorithm to solve the stochastic orienteering problem with chance constraints, i.e., a version of the problem where travel costs are random, and one is assigned a bound on the tolerable probability of exceeding the budget. The algorithm we present is online and anytime, i.e., it alternates planning and execution, and the quality of the solution it produces increases as the allowed computational time increases. Differently from most former MCTS algorithms, for each action available in a state the algorithm maintains estimates of both its value and the probability that its execution will eventually result in a violation of the chance constraint. Then, at action selection time, our proposed solution prunes away trajectories that are estimated to violate the failure probability. Extensive simulation results show that this approach can quickly produce high-quality solutions and is competitive with the optimal but time-consuming solution.
comment: Paper to appear on the IEEE Transactions on Automation Science and Engineering
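The action-selection step described in the abstract above (keep a value estimate and a failure-probability estimate per action, then prune actions estimated to violate the chance constraint) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `ActionStats` container, the `select_action` function, and the use of a UCB1 score among the surviving actions are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ActionStats:
    """Hypothetical per-action statistics kept at one tree node."""
    visits: int       # number of times the action was tried from this state
    value: float      # estimated expected reward of the action
    fail_prob: float  # estimated probability of eventually exceeding the budget


def select_action(stats: Dict[str, ActionStats],
                  max_fail_prob: float,
                  c: float = 1.0) -> Optional[str]:
    """Prune actions whose estimated failure probability exceeds the bound,
    then pick the best remaining action by a UCB1 score (an assumed choice)."""
    total = sum(s.visits for s in stats.values())
    feasible = {a: s for a, s in stats.items() if s.fail_prob <= max_fail_prob}
    if not feasible:
        return None  # every action is estimated to violate the constraint

    def ucb(s: ActionStats) -> float:
        if s.visits == 0:
            return math.inf  # always try unvisited feasible actions first
        return s.value + c * math.sqrt(math.log(total) / s.visits)

    return max(feasible, key=lambda a: ucb(feasible[a]))
```

With a tight bound, a high-value but risky action is pruned and a safer one is returned; relaxing the bound lets the higher-value action win.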
Continual Skill and Task Learning via Dialogue
Continual and interactive robot learning is a challenging problem as the robot is present with human users who expect the robot to learn novel skills to solve novel tasks perpetually with sample efficiency. In this work, we present a framework for robots to query and learn visuo-motor robot skills and task-relevant information via natural language dialog interactions with human users. Previous approaches either focus on improving the performance of instruction-following agents, or passively learn novel skills or concepts. Instead, we use dialog combined with a language-skill grounding embedding to query or confirm skills and/or tasks requested by a user. To achieve this goal, we developed and integrated three different components for our agent. Firstly, we propose a novel visual-motor control policy, ACT with Low Rank Adaptation (ACT-LoRA), which enables the existing SoTA ACT model to perform few-shot continual learning. Secondly, we develop an alignment model that projects demonstrations across skill embodiments into a shared embedding, allowing us to know when to ask users questions and/or request demonstrations. Finally, we integrated an existing LLM to interact with a human user to perform grounded interactive continual skill learning to solve a task. Our ACT-LoRA model learns novel fine-tuned skills with 100% accuracy when trained with only five demonstrations for a novel skill while still maintaining 74.75% accuracy on pre-trained skills in the RLBench dataset, where other models fall significantly short. We also performed a human-subjects study with 8 subjects to demonstrate the continual learning capabilities of our combined framework. We achieve a success rate of 75% in the task of sandwich making with the real robot learning from participant data, demonstrating that robots can learn novel skills or task knowledge from dialogue with non-expert users using our approach.
Autonomous Drifting Based on Maximal Safety Probability Learning
This paper proposes a novel learning-based framework for autonomous driving based on the concept of maximal safety probability. Efficient learning requires rewards that are informative of desirable/undesirable states, but such rewards are challenging to design manually due to the difficulty of differentiating better states among many safe states. On the other hand, learning policies that maximize safety probability does not require laborious reward shaping but is numerically challenging because the algorithms must optimize policies based on binary rewards sparse in time. Here, we show that physics-informed reinforcement learning can efficiently learn this form of maximally safe policy. Unlike existing drift control methods, our approach does not require a specific reference trajectory or complex reward shaping, and can learn safe behaviors only from sparse binary rewards. This is enabled by the use of the physics loss that plays an analogous role to reward shaping. The effectiveness of the proposed approach is demonstrated through lane keeping in a normal cornering scenario and safe drifting in a high-speed racing scenario.
comment: arXiv admin note: text overlap with arXiv:2403.16391
Can we enhance prosocial behavior? Using post-ride feedback to improve micromobility interactions
Micromobility devices, such as e-scooters and delivery robots, hold promise for eco-friendly and cost-effective alternatives for future urban transportation. However, their lack of societal acceptance remains a challenge. Therefore, we must consider ways to promote prosocial behavior in micromobility interactions. We investigate how post-ride feedback can encourage the prosocial behavior of e-scooter riders while interacting with sidewalk users, including pedestrians and delivery robots. Using a web-based platform, we measure the prosocial behavior of e-scooter riders. Results found that post-ride feedback can successfully promote prosocial behavior, and objective measures indicated better gap behavior, lower speeds at interaction, and longer stopping time around other sidewalk actors. The findings of this study demonstrate the efficacy of post-ride feedback and provide a step toward designing methodologies to improve the prosocial behavior of mobility users.
comment: In 16th International ACM Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI'24), September 22-25, 2024, Stanford, CA, USA. 11 pages
DRAL: Deep Reinforcement Adaptive Learning for Multi-UAVs Navigation in Unknown Indoor Environment
Autonomous indoor navigation of UAVs presents numerous challenges, primarily due to the limited precision of GPS in enclosed environments. Additionally, UAVs' limited capacity to carry heavy or power-intensive sensors, such as overheight packages, exacerbates the difficulty of achieving autonomous navigation indoors. This paper introduces an advanced system in which a drone autonomously navigates indoor spaces to locate a specific target, such as an unknown Amazon package, using only a single camera. Employing a deep learning approach, a deep reinforcement adaptive learning algorithm is trained to develop a control strategy that emulates the decision-making process of an expert pilot. We demonstrate the efficacy of our system through real-time simulations conducted in various indoor settings. We apply multiple visualization techniques to gain deeper insights into our trained network. Furthermore, we extend our approach to include an adaptive control algorithm for coordinating multiple drones to lift an object in an indoor environment collaboratively. Integrating our DRAL algorithm enables multiple UAVs to learn optimal control strategies that adapt to dynamic conditions and uncertainties. This innovation enhances the robustness and flexibility of indoor navigation and opens new possibilities for complex multi-drone operations in confined spaces. The proposed framework highlights significant advancements in adaptive control and deep reinforcement learning, offering robust solutions for complex multi-agent systems in real-world applications.
Asymptotically-Optimal Multi-Query Path Planning for Moving A Convex Polygon in 2D
The classical shortest-path roadmaps, also known as reduced visibility graphs, provide a multi-query method for quickly computing optimal paths in two-dimensional environments. Combined with Minkowski sum computations, shortest-path roadmaps can compute optimal paths for a translating robot in 2D. In this study, we explore the intuitive idea of stacking up a set of reduced visibility graphs at different orientations for a convex-shaped holonomic robot, to support the fast computation of near-optimal paths allowing simultaneous 2D translation and rotation. The resulting algorithm, rotation-stacked visibility graph (RVG), is shown to be resolution-complete and asymptotically optimal. RVG out-performs SOTA single-query sampling-based methods including BIT* and AIT* on both computation time and solution optimality fronts.
Achieving the Safety and Security of the End-to-End AV Pipeline SC
In the current landscape of autonomous vehicle (AV) safety and security research, there are multiple isolated problems being tackled by the community at large. Due to the lack of common evaluation criteria, several important research questions are at odds with one another. For instance, while much research has been conducted on physical attacks deceiving AV perception systems, there is often inadequate investigation of working defenses and of the downstream effects on safe vehicle control. This paper provides a thorough description of the current state of AV safety and security research. We provide individual sections for the primary research questions that concern this research area, including AV surveillance, sensor system reliability, security of the AV stack, algorithmic robustness, and safe environment interaction. We wrap up the paper with a discussion of the issues that concern the interactions of these separate problems. At the conclusion of each section, we propose future research questions that still lack conclusive answers. This position article will serve as an entry point to novice and veteran researchers seeking to partake in this research domain.
comment: Accepted to 1st Cyber Security in Cars Workshop (CSCS) at CCS
Multi-agent Path Finding for Mixed Autonomy Traffic Coordination
In the evolving landscape of urban mobility, the prospective integration of Connected and Automated Vehicles (CAVs) with Human-Driven Vehicles (HDVs) presents a complex array of challenges and opportunities for autonomous driving systems. While recent advancements in robotics have yielded Multi-Agent Path Finding (MAPF) algorithms tailored for agent coordination tasks characterized by simplified kinematics and complete control over agent behaviors, these solutions are inapplicable in mixed-traffic environments where uncontrollable HDVs must coexist and interact with CAVs. Addressing this gap, we propose the Behavior Prediction Kinematic Priority Based Search (BK-PBS), which leverages an offline-trained conditional prediction model to forecast HDV responses to CAV maneuvers, integrating these insights into a Priority Based Search (PBS) where the A* search proceeds over motion primitives to accommodate kinematic constraints. We compare BK-PBS with CAV planning algorithms derived from rule-based car-following models and reinforcement learning. Through comprehensive simulation of a highway merging scenario across diverse CAV penetration rates and traffic densities, BK-PBS outperforms these baselines in reducing collision rates and improving system-level travel delay. Our work is directly applicable to many scenarios of multi-human multi-robot coordination.
Deep Brain Ultrasound Ablation Thermal Dose Modeling with in Vivo Experimental Validation
Intracorporeal needle-based therapeutic ultrasound (NBTU) is a minimally invasive option for intervening in malignant brain tumors, commonly used in thermal ablation procedures. This technique is suitable for both primary and metastatic cancers, utilizing a high-frequency alternating electric field (up to 10 MHz) to excite a piezoelectric transducer. The resulting rapid deformation of the transducer produces an acoustic wave that propagates through tissue, leading to localized high-temperature heating at the target tumor site and inducing rapid cell death. To optimize the design of NBTU transducers for thermal dose delivery during treatment, numerical modeling of the acoustic pressure field generated by the deforming piezoelectric transducer is frequently employed. The bioheat transfer process generated by the input pressure field is used to track the thermal propagation of the applicator over time. Magnetic resonance thermal imaging (MRTI) can be used to experimentally validate these models. Validation results using MRTI demonstrated the feasibility of this model, showing a consistent thermal propagation pattern. However, a thermal damage isodose map is more advantageous for evaluating therapeutic efficacy. To achieve a more accurate simulation based on the actual brain tissue environment, a new finite element method (FEM) simulation with enhanced damage evaluation capabilities was conducted. The results showed that the highest temperature and ablated volume differed between experimental and simulation results by 2.1884 °C (3.71%) and 0.0631 cm$^3$ (5.74%), respectively. The lowest Pearson correlation coefficient (PCC) for peak temperature was 0.7117, and the lowest Dice coefficient for the ablated area was 0.7021, indicating a good agreement in accuracy between simulation and experiment.
comment: 9 pages, 9 figures, 7 tables
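The two agreement metrics quoted in the abstract above, the Dice coefficient for the ablated area and the Pearson correlation coefficient for temperature, have standard definitions that can be sketched in a few lines of Python with NumPy. The function names and the toy inputs are hypothetical; the abstract does not describe how the authors computed these values.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b) -> float:
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return float(2.0 * np.logical_and(a, b).sum() / total)


def pearson_correlation(x, y) -> float:
    """Pearson correlation coefficient between two equally sized series."""
    return float(np.corrcoef(np.ravel(x), np.ravel(y))[0, 1])


# Toy example: a simulated and a measured ablation mask sharing one pixel.
simulated = [[1, 1], [0, 0]]
measured = [[1, 0], [0, 0]]
dice = dice_coefficient(simulated, measured)  # 2 * 1 / (2 + 1)
```

A Dice value of 1 means the two regions coincide exactly and 0 means they do not overlap, which is why the reported minimum of 0.7021 is read as good agreement.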
Trustworthy Human-AI Collaboration: Reinforcement Learning with Human Feedback and Physics Knowledge for Safe Autonomous Driving
In the field of autonomous driving, developing safe and trustworthy autonomous driving policies remains a significant challenge. Recently, Reinforcement Learning with Human Feedback (RLHF) has attracted substantial attention due to its potential to enhance training safety and sampling efficiency. Nevertheless, existing RLHF-enabled methods often falter when faced with imperfect human demonstrations, potentially leading to training oscillations or even worse performance than rule-based approaches. Inspired by the human learning process, we propose Physics-enhanced Reinforcement Learning with Human Feedback (PE-RLHF). This novel framework synergistically integrates human feedback (e.g., human intervention and demonstration) and physics knowledge (e.g., traffic flow model) into the training loop of reinforcement learning. The key advantage of PE-RLHF is its guarantee that the learned policy will perform at least as well as the given physics-based policy, even when human feedback quality deteriorates, thus ensuring trustworthy safety improvements. PE-RLHF introduces a Physics-enhanced Human-AI (PE-HAI) collaborative paradigm for dynamic action selection between human and physics-based actions, employs a reward-free approach with a proxy value function to capture human preferences, and incorporates a minimal intervention mechanism to reduce the cognitive load on human mentors. Extensive experiments across diverse driving scenarios demonstrate that PE-RLHF significantly outperforms traditional methods, achieving state-of-the-art (SOTA) performance in safety, efficiency, and generalizability, even with varying quality of human feedback. The philosophy behind PE-RLHF not only advances autonomous driving technology but can also offer valuable insights for other safety-critical domains. Demo video and code are available at: https://zilin-huang.github.io/PE-RLHF-website/
comment: 33 pages, 20 figures
A Survey for Foundation Models in Autonomous Driving
The advent of foundation models has revolutionized the fields of natural language processing and computer vision, paving the way for their application in autonomous driving (AD). This survey presents a comprehensive review of more than 40 research papers, demonstrating the role of foundation models in enhancing AD. Large language models contribute to planning and simulation in AD, particularly through their proficiency in reasoning, code generation and translation. In parallel, vision foundation models are increasingly adapted for critical tasks such as 3D object detection and tracking, as well as creating realistic driving scenarios for simulation and testing. Multi-modal foundation models, integrating diverse inputs, exhibit exceptional visual understanding and spatial reasoning, crucial for end-to-end AD. This survey not only provides a structured taxonomy, categorizing foundation models based on their modalities and functionalities within the AD domain but also delves into the methods employed in current research. It identifies the gaps between existing foundation models and cutting-edge AD approaches, thereby charting future research directions and proposing a roadmap for bridging these gaps.
Deep Neural Implicit Representation of Accessibility for Multi-Axis Manufacturing SP
One of the main concerns in design and process planning for multi-axis additive and subtractive manufacturing is collision avoidance between moving objects (e.g., tool assemblies) and stationary objects (e.g., a part unified with fixtures). The collision measure for various pairs of relative rigid translations and rotations between the two pointsets can be conceptualized by a compactly supported scalar field over the 6D non-Euclidean configuration space. Explicit representation and computation of this field is costly in both time and space. If we fix $O(m)$ sparsely sampled rotations (e.g., tool orientations), computation of the collision measure field as a convolution of indicator functions of the 3D pointsets over a uniform grid (i.e., voxelized geometry) of resolution $O(n^3)$ via fast Fourier transforms (FFTs) scales as $O(mn^3 \log n)$ in time and $O(mn^3)$ in space. In this paper, we develop an implicit representation of the collision measure field via deep neural networks (DNNs). We show that our approach is able to accurately interpolate the collision measure from a sparse sampling of rotations, and can represent the collision measure field with a small memory footprint. Moreover, we show that this representation can be efficiently updated through fine-tuning to more efficiently train the network on multi-resolution data, as well as accommodate incremental changes to the geometry (such as might occur in iterative processes such as topology optimization of the part subject to CNC tool accessibility constraints).
comment: Special Issue on symposium on Solid and Physical Modeling (SPM 2023)
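The explicit baseline that the abstract above contrasts against, a collision measure computed over all translations by FFT for one fixed rotation, can be sketched as follows. This is an illustrative NumPy version, not the paper's code: the function name `collision_measure` is hypothetical, the grids are assumed to be already padded so that cyclic wrap-around is harmless, and the convolution is written as a cross-correlation (a convolution with the reflected tool).

```python
import numpy as np


def collision_measure(part: np.ndarray, tool: np.ndarray) -> np.ndarray:
    """Overlap volume (in voxels) for every cyclic translation of the tool.

    `part` and `tool` are indicator functions on the same n x n x n grid.
    Entry t of the result counts the voxels shared by the part and the tool
    translated by t, computed in O(n^3 log n) instead of O(n^6).
    """
    shape = part.shape
    f_part = np.fft.rfftn(part, s=shape)
    f_tool = np.fft.rfftn(tool, s=shape)
    # Multiplying by the complex conjugate gives a circular cross-correlation.
    overlap = np.fft.irfftn(f_part * np.conj(f_tool), s=shape)
    return np.rint(overlap).astype(np.int64)


# Toy geometry: a 2x2x2 cube as the part, a single voxel as the tool.
part = np.zeros((8, 8, 8))
part[1:3, 1:3, 1:3] = 1.0
point_tool = np.zeros((8, 8, 8))
point_tool[0, 0, 0] = 1.0
field = collision_measure(part, point_tool)
```

Repeating this for each of the $O(m)$ sampled rotations yields the $O(mn^3 \log n)$ time and $O(mn^3)$ storage quoted in the abstract, which is the cost the learned implicit representation aims to avoid.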
Fast Decentralized State Estimation for Legged Robot Locomotion via EKF and MHE
In this paper, we present a fast and decentralized state estimation framework for the control of legged locomotion. The nonlinear estimation of the floating base states is decentralized into an orientation estimation via an Extended Kalman Filter (EKF) and a linear velocity estimation via Moving Horizon Estimation (MHE). The EKF fuses the inertia sensor with vision to estimate the floating base orientation. The MHE uses the estimated orientation with all the sensors within a past time window to estimate the linear velocities based on a time-varying linear dynamics formulation of the states of interest with state constraints. More importantly, a marginalization method based on the optimization structure of the full information filter (FIF) is proposed to convert the equality-constrained FIF to an equivalent MHE. This decoupling of state estimation promotes the desired balance of computation efficiency, accuracy of estimation, and the inclusion of state constraints. The proposed method is shown to be capable of providing accurate state estimation to several legged robots, including the highly dynamic hopping robot PogoX, the bipedal robot Cassie, and the quadrupedal robot Unitree Go1, with a frequency of 200 Hz and a window interval of 0.1 s.
Hierarchical Generative Adversarial Imitation Learning with Mid-level Input Generation for Autonomous Driving on Urban Environments
Deriving robust control policies for realistic urban navigation scenarios is not a trivial task. In an end-to-end approach, these policies must map high-dimensional images from the vehicle's cameras to low-level actions such as steering and throttle. While pure Reinforcement Learning (RL) approaches are based exclusively on engineered rewards, Generative Adversarial Imitation Learning (GAIL) agents learn from expert demonstrations while interacting with the environment, which favors GAIL on tasks for which a reward signal is difficult to derive, such as autonomous driving. However, training deep networks directly from raw images on RL tasks is known to be unstable and troublesome. To deal with that, this work proposes a hierarchical GAIL-based architecture (hGAIL) which decouples representation learning from the driving task to solve the autonomous navigation of a vehicle. The proposed architecture consists of two modules: a GAN (Generative Adversarial Net) which generates an abstract mid-level input representation, which is the Bird's-Eye View (BEV) from the surroundings of the vehicle; and the GAIL which learns to control the vehicle based on the BEV predictions from the GAN as input. hGAIL is able to learn both the policy and the mid-level representation simultaneously as the agent interacts with the environment. Our experiments made in the CARLA simulation environment have shown that GAIL exclusively from cameras (without BEV) fails to even learn the task, while hGAIL, after training exclusively on one city, was able to autonomously navigate successfully in 98% of the intersections of a new city not used in training phase. Videos and code available at: https://sites.google.com/view/hgail
MimicTouch: Leveraging Multi-modal Human Tactile Demonstrations for Contact-rich Manipulation NeurIPS 2023
Tactile sensing is critical to fine-grained, contact-rich manipulation tasks, such as insertion and assembly. Prior research has shown the possibility of learning tactile-guided policy from teleoperated demonstration data. However, to provide the demonstration, human users often rely on visual feedback to control the robot. This creates a gap between the sensing modality used for controlling the robot (visual) and the modality of interest (tactile). To bridge this gap, we introduce "MimicTouch", a novel framework for learning policies directly from demonstrations provided by human users with their hands. The key innovations are i) a human tactile data collection system which collects multi-modal tactile dataset for learning human's tactile-guided control strategy, ii) an imitation learning-based framework for learning human's tactile-guided control strategy through such data, and iii) an online residual RL framework to bridge the embodiment gap between the human hand and the robot gripper. Through comprehensive experiments, we highlight the efficacy of utilizing human's tactile-guided control strategy to resolve contact-rich manipulation tasks. The project website is at https://sites.google.com/view/MimicTouch.
comment: Accepted by CoRL 2024, Best Paper Award at NeurIPS 2023 Touch Processing Workshop
Automatic Robot Hand-Eye Calibration Enabled by Learning-Based 3D Vision
Hand-eye calibration, as a fundamental task in vision-based robotic systems, aims to estimate the transformation matrix between the coordinate frame of the camera and the robot flange. Most approaches to hand-eye calibration rely on external markers or human assistance. We propose Look at Robot Base Once (LRBO), a novel methodology that addresses the hand-eye calibration problem without external calibration objects or human support, using only the robot base. Using point clouds of the robot base, a transformation matrix from the coordinate frame of the camera to the robot base is established as I=AXB. To this end, we exploit learning-based 3D detection and registration algorithms to estimate the location and orientation of the robot base. The robustness and accuracy of the method are quantified by ground-truth-based evaluation, and the accuracy result is compared with other 3D vision-based calibration methods. To assess the feasibility of our methodology, we carried out experiments utilizing a low-cost structured light scanner across varying joint configurations and groups of experiments. The proposed hand-eye calibration method achieved a translation deviation of 0.930 mm and a rotation deviation of 0.265 degrees according to the experimental results. Additionally, the 3D reconstruction experiments demonstrated a rotation error of 0.994 degrees and a position error of 1.697 mm. Moreover, our method offers the potential to be completed in 1 second, which is the fastest compared to other 3D hand-eye calibration methods. Code is released at github.com/leihui6/LRBO.
comment: Accepted by Journal of Intelligent & Robotic Systems
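The closed-loop relation I=AXB quoted in the LRBO abstract above has a direct solution once A and B are known, namely X = A⁻¹B⁻¹. The sketch below shows only that algebraic step in Python with NumPy, under the assumption that A and B are 4x4 homogeneous rigid transforms; which physical transform each symbol denotes is not stated in the abstract, and the function names are hypothetical rather than taken from the released code.

```python
import numpy as np


def invert_rigid(T: np.ndarray) -> np.ndarray:
    """Invert a 4x4 rigid transform using R^T and -R^T t (no general inverse)."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv


def solve_hand_eye(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Solve I = A X B for X, assuming A and B are known rigid transforms."""
    return invert_rigid(A) @ invert_rigid(B)


# Toy transforms: A rotates 90 degrees about z, B rotates 90 degrees about x.
A = np.array([[0.0, -1.0, 0.0, 0.1],
              [1.0, 0.0, 0.0, 0.2],
              [0.0, 0.0, 1.0, 0.3],
              [0.0, 0.0, 0.0, 1.0]])
B = np.array([[1.0, 0.0, 0.0, -0.5],
              [0.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, 0.4],
              [0.0, 0.0, 0.0, 1.0]])
X = solve_hand_eye(A, B)
```

Unlike the classical AX=XB formulation, which needs several robot motions, a relation of this form can in principle be solved from a single observation, consistent with the "once" in the method's name.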
Region-aware Grasp Framework with Normalized Grasp Space for Efficient 6-DoF Grasping
A series of region-based methods succeed in extracting regional features and enhancing grasp detection quality. However, faced with a cluttered scene with potential collision, the definition of the grasp-relevant region stays inconsistent, and the relationship between grasps and regional spaces remains incompletely investigated. In this paper, we propose Normalized Grasp Space (NGS) from a novel region-aware viewpoint, unifying the grasp representation within a normalized regional space and benefiting the generalizability of methods. Leveraging the NGS, we find that CNNs are underestimated for 3D feature extraction and 6-DoF grasp detection in cluttered scenes and build a highly efficient Region-aware Normalized Grasp Network (RNGNet). Experiments on the public benchmark show that our method achieves significant performance gains of more than 20% while attaining a real-time inference speed of approximately 50 FPS. Real-world cluttered scene clearance experiments underscore the effectiveness of our method. Further, human-to-robot handover and dynamic object grasping experiments demonstrate the potential of our proposed method for closed-loop grasping in dynamic scenarios.
comment: Accepted by CoRL2024, final camera-ready version will be updated soon
Efficient Incremental Penetration Depth Estimation between Convex Geometries IROS 2024
Penetration depth (PD) is essential for robotics due to its extensive applications in dynamic simulation, motion planning, haptic rendering, etc. The Expanding Polytope Algorithm (EPA) is the de facto standard for this problem, which estimates PD by expanding an inner polyhedral approximation of an implicit set. In this paper, we propose a novel optimization-based algorithm that incrementally estimates minimum penetration depth and its direction. One major advantage of our method is that it can be warm-started by exploiting the spatial and temporal coherence, which emerges naturally in many robotic applications (e.g., the temporal coherence between adjacent simulation time knots). As a result, our algorithm achieves substantial speedup -- we demonstrate it is 5-30x faster than EPA on several benchmarks. Moreover, our approach is built upon the same implicit geometry representation as EPA, which enables easy integration and deployment into existing software stacks. We also provide an open-source implementation on: https://github.com/weigao95/mind-fcl
comment: IROS 2024
Tightly-Coupled LiDAR-IMU-Wheel Odometry with Online Calibration of a Kinematic Model for Skid-Steering Robots
Tunnels and long corridors are challenging environments for mobile robots because a LiDAR point cloud can degenerate in these environments. To tackle point cloud degeneration, this study presents a tightly-coupled LiDAR-IMU-wheel odometry algorithm with an online calibration for skid-steering robots. We propose a full linear wheel odometry factor, which not only serves as a motion constraint but also performs the online calibration of kinematic models for skid-steering robots. Despite the dynamically changing kinematic model (e.g., wheel radii changes caused by tire pressures) and terrain conditions, our method can address the model error via online calibration. Moreover, our method enables accurate localization in degenerate environments, such as long and straight corridors, by calibrating the kinematic model while the LiDAR-IMU fusion operates sufficiently well. Furthermore, we estimate the uncertainty (i.e., covariance matrix) of the wheel odometry online for creating a reasonable constraint. The proposed method is validated through three experiments. The first indoor experiment shows that the proposed method is robust in severe degeneracy cases (long corridors) and changes in the wheel radii. The second outdoor experiment demonstrates that our method accurately estimates the sensor trajectory despite being in rough outdoor terrain owing to online uncertainty estimation of wheel odometry. The third experiment shows the proposed online calibration enables robust odometry estimation in changing terrains.
comment: open-source: https://github.com/TakuOkawara/full_linear_wheel_odometry_factor
FREA: Feasibility-Guided Generation of Safety-Critical Scenarios with Reasonable Adversariality
Generating safety-critical scenarios, which are essential yet difficult to collect at scale, offers an effective method to evaluate the robustness of autonomous vehicles (AVs). Existing methods focus on optimizing adversariality while preserving the naturalness of scenarios, aiming to achieve a balance through data-driven approaches. However, without an appropriate upper bound for adversariality, the scenarios might exhibit excessive adversariality, potentially leading to unavoidable collisions. In this paper, we introduce FREA, a novel safety-critical scenario generation method that incorporates the Largest Feasible Region (LFR) of the AV as guidance to ensure the reasonableness of the adversarial scenarios. Concretely, FREA initially pre-calculates the LFR of the AV from offline datasets. Subsequently, it learns a reasonable adversarial policy that controls the scene's critical background vehicles (CBVs) to generate adversarial yet AV-feasible scenarios by maximizing a novel feasibility-dependent adversarial objective function. Extensive experiments illustrate that FREA can effectively generate safety-critical scenarios, yielding considerable near-miss events while ensuring the AV's feasibility. Generalization analysis also confirms the robustness of FREA in AV testing across various surrogate AV methods and traffic environments.
comment: Accepted by CoRL 2024
Online Multi-Agent Pickup and Delivery with Task Deadlines IROS 2024
Managing delivery deadlines in automated warehouses and factories is crucial for maintaining customer satisfaction and ensuring seamless production. This study introduces the problem of online multi-agent pickup and delivery with task deadlines (MAPD-D), an advanced variant of the online MAPD problem incorporating delivery deadlines. In the MAPD problem, agents must manage a continuous stream of delivery tasks online. Tasks are added at any time. Agents must complete their tasks while avoiding collisions with each other. MAPD-D introduces a dynamic, deadline-driven approach that incorporates task deadlines, challenging the conventional MAPD frameworks. To tackle MAPD-D, we propose a novel algorithm named deadline-aware token passing (D-TP). The D-TP algorithm calculates pickup deadlines and assigns tasks while balancing execution cost and deadline proximity. Additionally, we introduce the D-TP with task swaps (D-TPTS) method to further reduce task tardiness, enhancing flexibility and efficiency through task-swapping strategies. Numerical experiments were conducted in simulated warehouse environments to showcase the effectiveness of the proposed methods. Both D-TP and D-TPTS demonstrated significant reductions in task tardiness compared to existing methods. Our methods contribute to efficient operations in automated warehouses and factories with delivery deadlines.
comment: 7 pages, 4 figures, IROS 2024
MCDS-VSS: Moving Camera Dynamic Scene Video Semantic Segmentation by Filtering with Self-Supervised Geometry and Motion BMVC 2024
Autonomous systems, such as self-driving cars, rely on reliable semantic environment perception for decision making. Despite great advances in video semantic segmentation, existing approaches ignore important inductive biases and lack structured and interpretable internal representations. In this work, we propose MCDS-VSS, a structured filter model that learns in a self-supervised manner to estimate scene geometry and ego-motion of the camera, while also estimating the motion of external objects. Our model leverages these representations to improve the temporal consistency of semantic segmentation without sacrificing segmentation accuracy. MCDS-VSS follows a prediction-fusion approach in which scene geometry and camera motion are first used to compensate for ego-motion, then residual flow is used to compensate motion of dynamic objects, and finally the predicted scene features are fused with the current features to obtain a temporally consistent scene segmentation. Our model parses automotive scenes into multiple decoupled interpretable representations such as scene geometry, ego-motion, and object motion. Quantitative evaluation shows that MCDS-VSS achieves superior temporal consistency on video sequences while retaining competitive segmentation performance.
comment: Accepted for publication at BMVC 2024
BEVal: A Cross-dataset Evaluation Study of BEV Segmentation Models for Autonomous Driving
Current research in semantic bird's-eye view segmentation for autonomous driving focuses solely on optimizing neural network models using a single dataset, typically nuScenes. This practice leads to the development of highly specialized models that may fail when faced with different environments or sensor setups, a problem known as domain shift. In this paper, we conduct a comprehensive cross-dataset evaluation of state-of-the-art BEV segmentation models to assess their performance across different training and testing datasets and setups, as well as different semantic categories. We investigate the influence of different sensors, such as cameras and LiDAR, on the models' ability to generalize to diverse conditions and scenarios. Additionally, we conduct multi-dataset training experiments that improve models' BEV segmentation performance compared to single-dataset training. Our work addresses the gap in evaluating BEV segmentation models under cross-dataset validation, and our findings underscore the importance of enhancing model generalizability and adaptability to ensure more robust and reliable BEV segmentation approaches for autonomous driving applications. The code for this paper is available at https://github.com/manueldiaz96/beval .
FRAC-Q-Learning: A Reinforcement Learning with Boredom Avoidance Processes for Social Robots
Reinforcement learning algorithms have often been applied to social robots. However, most reinforcement learning algorithms were not optimized for use in social robots, and consequently they may bore users. We propose a new reinforcement learning method specialized for social robots, FRAC-Q-learning, that can avoid user boredom. The proposed algorithm consists of a forgetting process in addition to randomizing and categorizing processes. This study evaluated the interest and boredom hardness scores of FRAC-Q-learning by comparison with traditional Q-learning. FRAC-Q-learning showed a significantly higher trend in interest score and was significantly harder for users to become bored with than traditional Q-learning. Therefore, FRAC-Q-learning can contribute to developing a social robot that will not bore users. The proposed algorithm also has the potential to be applied to Web-based communication and educational systems. This paper presents the entire process, the detailed implementation, and a detailed evaluation method of FRAC-Q-learning for the first time.
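As a point of reference for the baseline named in the abstract above, the following is a minimal sketch of the traditional tabular Q-learning update only. It does not implement the forgetting, randomizing, or categorizing processes of FRAC-Q-learning, whose details are not given in the abstract. The state and action names, the learning rate `alpha`, and the discount factor `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one traditional Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical social-robot states and actions, chosen only for illustration.
q = defaultdict(float)          # unseen (state, action) pairs start at 0.0
actions = ["greet", "joke"]
q[("bored", "joke")] = 1.0      # pretend this value was learned earlier

new_value = q_learning_update(
    q, state="idle", action="greet", reward=0.5,
    next_state="bored", actions=actions,
)
print(new_value)
```

Because this update always moves toward the greedy target, a purely greedy policy built on it tends to repeat the same high-value action, which is one plausible reading of why the abstract says such algorithms may bore users.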
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation
Semantic segmentation benchmarks in the realm of autonomous driving are dominated by large pre-trained transformers, yet their widespread adoption is impeded by substantial computational costs and prolonged training durations. To lift this constraint, we look at efficient semantic segmentation from a perspective of comprehensive knowledge distillation and aim to bridge the gap between multi-source knowledge extractions and transformer-specific patch embeddings. We put forward the Transformer-based Knowledge Distillation (TransKD) framework which learns compact student transformers by distilling both feature maps and patch embeddings of large teacher transformers, bypassing the long pre-training process and reducing the FLOPs by >85.0%. Specifically, we propose two fundamental modules to realize feature map distillation and patch embedding distillation, respectively: (1) Cross Selective Fusion (CSF) enables knowledge transfer between cross-stage features via channel attention and feature map distillation within hierarchical transformers; (2) Patch Embedding Alignment (PEA) performs dimensional transformation within the patchifying process to facilitate the patch embedding distillation. Furthermore, we introduce two optimization modules to enhance the patch embedding distillation from different perspectives: (1) Global-Local Context Mixer (GL-Mixer) extracts both global and local information of a representative embedding; (2) Embedding Assistant (EA) acts as an embedding method to seamlessly bridge teacher and student models with the teacher's number of channels. Experiments on Cityscapes, ACDC, NYUv2, and Pascal VOC2012 datasets show that TransKD outperforms state-of-the-art distillation frameworks and rivals the time-consuming pre-training method. The source code is publicly available at https://github.com/RuipingL/TransKD.
comment: Accepted to IEEE Transactions on Intelligent Transportation Systems (T-ITS). The source code is publicly available at https://github.com/RuipingL/TransKD
Quadrotor Manipulation System: Development of a Robust Contact Force Estimation and Impedance Control Scheme Based on DOb and FTRLS
Research on aerial manipulation systems has increased rapidly in recent years. These systems are very attractive for a wide range of applications due to their unique features. However, the dynamics, control, and manipulation tasks of such systems are quite challenging because they are naturally unstable, have very fast dynamics, have strong nonlinearities, are very susceptible to parameter variations due to carrying a payload in addition to external disturbances, and have complex inverse kinematics. In addition, the manipulation tasks require estimating (applying) a certain force of (at) the end-effector as well as accurately positioning it. Thus, in this article, a robust force estimation and impedance control scheme is proposed to address these issues. The robustness is achieved based on the Disturbance Observer (DOb) technique. Then, a linear tracking controller with low computational cost is used. For teleoperation purposes, the contact force needs to be identified. However, currently developed techniques for force estimation have limitations because they ignore some dynamics and/or require an indicator of contact with the environment. Unlike these techniques, we propose a technique based on the linearization capabilities of the DOb and a Fast Tracking Recursive Least Squares (FTRLS) algorithm. The complex inverse kinematics problem of such a system is solved by a Jacobian-based algorithm. The stability analysis of the proposed scheme is presented. The algorithm is tested on tracking task-space reference trajectories in addition to impedance control. The efficiency of the proposed technique is demonstrated via numerical simulation.
OpenVLA: An Open-Source Vision-Language-Action Model
Large policies pretrained on a combination of Internet-scale vision-language data and diverse robot demonstrations have the potential to change how we teach robots new skills: rather than training new behaviors from scratch, we can fine-tune such vision-language-action (VLA) models to obtain robust, generalizable policies for visuomotor control. Yet, widespread adoption of VLAs for robotics has been challenging as 1) existing VLAs are largely closed and inaccessible to the public, and 2) prior work fails to explore methods for efficiently fine-tuning VLAs for new tasks, a key component for adoption. Addressing these challenges, we introduce OpenVLA, a 7B-parameter open-source VLA trained on a diverse collection of 970k real-world robot demonstrations. OpenVLA builds on a Llama 2 language model combined with a visual encoder that fuses pretrained features from DINOv2 and SigLIP. As a product of the added data diversity and new model components, OpenVLA demonstrates strong results for generalist manipulation, outperforming closed models such as RT-2-X (55B) by 16.5% in absolute task success rate across 29 tasks and multiple robot embodiments, with 7x fewer parameters. We further show that we can effectively fine-tune OpenVLA for new settings, with especially strong generalization results in multi-task environments involving multiple objects and strong language grounding abilities, and outperform expressive from-scratch imitation learning methods such as Diffusion Policy by 20.4%. We also explore compute efficiency; as a separate contribution, we show that OpenVLA can be fine-tuned on consumer GPUs via modern low-rank adaptation methods and served efficiently via quantization without a hit to downstream success rate. Finally, we release model checkpoints, fine-tuning notebooks, and our PyTorch codebase with built-in support for training VLAs at scale on Open X-Embodiment datasets.
comment: Website: https://openvla.github.io/
An Open-source Hardware/Software Architecture and Supporting Simulation Environment to Perform Human FPV Flight Demonstrations for Unmanned Aerial Vehicle Autonomy
Small multi-rotor unmanned aerial vehicles (UAVs), mainly quadcopters, are nowadays ubiquitous in research on aerial autonomy, including serving as scaled-down models for much larger aircraft such as vertical take-off and landing vehicles for urban air mobility. Among the various research use cases, first-person-view RC flight experiments allow for collecting data on how human pilots fly such aircraft, which could then be used to compare, contrast, validate, or train autonomous flight agents. While this could be uniquely beneficial, especially for studying UAV operation in contextually complex and safety-critical environments such as in human-UAV shared spaces, the availability of inexpensive and open-source hardware/software platforms that offer this capability along with low-level access to the underlying control software and data remains limited. To address this gap and significantly reduce barriers to human-guided autonomy research with UAVs, this paper presents an open-source software architecture implemented with an inexpensive in-house built quadcopter platform based on the F450 Quadcopter Frame. This setup uses two cameras to provide a dual-view FPV and an open-source flight controller, Pixhawk. The underlying software architecture, developed using the Python-based Kivy library, allows logging telemetry, GPS, control inputs, and camera frame data in a synchronized manner on the ground station computer. Since costs (time) and weather constraints typically limit the number of physical outdoor flight experiments, this paper also presents a unique AirSim/Unreal Engine based simulation environment and graphical user interface, aka a digital twin, that provides a Hardware In The Loop setup via the Pixhawk flight controller. We demonstrate the usability and reliability of the overall framework through a set of diverse physical FPV flight experiments and corresponding flight tests in the digital twin.
comment: Presented at AIAA Aviation Forum 2024
Demonstrating a Robust Walking Algorithm for Underactuated Bipedal Robots in Non-flat, Non-stationary Environments
This work explores an innovative algorithm designed to enhance the mobility of underactuated bipedal robots across challenging terrains, especially when navigating through spaces with constrained opportunities for foot support, like steps or stairs. By combining ankle torque with a refined angular momentum-based linear inverted pendulum model (ALIP), our method allows variability in the robot's center of mass height. We employ a dual-strategy controller that merges virtual constraints for precise motion regulation across essential degrees of freedom with an ALIP-centric model predictive control (MPC) framework, aimed at enforcing gait stability. The effectiveness of our feedback design is demonstrated through its application on the Cassie bipedal robot, which features 20 degrees of freedom. Key to our implementation is the development of tailored nominal trajectories and an optimized MPC that reduces the execution time to under 500 microseconds--and, hence, is compatible with Cassie's controller update frequency. This paper not only showcases the successful hardware deployment but also demonstrates a new capability, a bipedal robot using a moving walkway.
CURE: Simulation-Augmented Auto-Tuning in Robotics
Robotic systems are typically composed of various subsystems, such as localization and navigation, each encompassing numerous configurable components (e.g., selecting different planning algorithms). Once an algorithm has been selected for a component, its associated configuration options must be set to the appropriate values. Configuration options across the system stack interact non-trivially. Finding optimal configurations for highly configurable robots to achieve desired performance poses a significant challenge due to the interactions between configuration options across software and hardware that result in an exponentially large and complex configuration space. These challenges are further compounded by the need for transferability between different environments and robotic platforms. Data efficient optimization algorithms (e.g., Bayesian optimization) have been increasingly employed to automate the tuning of configurable parameters in cyber-physical systems. However, such optimization algorithms converge at later stages, often after exhausting the allocated budget (e.g., optimization steps, allotted time), and lack transferability. This paper proposes CURE -- a method that identifies causally relevant configuration options, enabling the optimization process to operate in a reduced search space, thereby enabling faster optimization of robot performance. CURE abstracts the causal relationships between various configuration options and robot performance objectives by learning a causal model in the source (a low-cost environment such as the Gazebo simulator) and applying the learned knowledge to perform optimization in the target (e.g., Turtlebot 3 physical robot). We demonstrate the effectiveness and transferability of CURE by conducting experiments that involve varying degrees of deployment changes in both physical robots and simulation.
comment: Revised submission in IEEE Transactions on Robotics (T-RO), 2024
Multiagent Systems
Non-stationary and Sparsely-correlated Multi-output Gaussian Process with Spike-and-Slab Prior
Multi-output Gaussian process (MGP) is commonly used as a transfer learning method to leverage information among multiple outputs. A key advantage of MGP is providing uncertainty quantification for prediction, which is highly important for subsequent decision-making tasks. However, traditional MGP may not be sufficiently flexible to handle multivariate data with dynamic characteristics, particularly when dealing with complex temporal correlations. Additionally, since some outputs may lack correlation, transferring information among them may lead to negative transfer. To address these issues, this study proposes a non-stationary MGP model that can capture both the dynamic and sparse correlation among outputs. Specifically, the covariance functions of MGP are constructed using convolutions of time-varying kernel functions. Then a dynamic spike-and-slab prior is placed on correlation parameters to automatically decide which sources are informative to the target output in the training process. An expectation-maximization (EM) algorithm is proposed for efficient model fitting. Both numerical studies and a real case demonstrate its efficacy in capturing dynamic and sparse correlation structure and mitigating negative transfer for high-dimensional time-series data. Finally, a mountain-car reinforcement learning case highlights its potential application in decision making problems.
Multi-agent Path Finding for Mixed Autonomy Traffic Coordination
In the evolving landscape of urban mobility, the prospective integration of Connected and Automated Vehicles (CAVs) with Human-Driven Vehicles (HDVs) presents a complex array of challenges and opportunities for autonomous driving systems. While recent advancements in robotics have yielded Multi-Agent Path Finding (MAPF) algorithms tailored for agent coordination tasks characterized by simplified kinematics and complete control over agent behaviors, these solutions are inapplicable in mixed-traffic environments where uncontrollable HDVs must coexist and interact with CAVs. Addressing this gap, we propose the Behavior Prediction Kinematic Priority Based Search (BK-PBS), which leverages an offline-trained conditional prediction model to forecast HDV responses to CAV maneuvers, integrating these insights into a Priority Based Search (PBS) where the A* search proceeds over motion primitives to accommodate kinematic constraints. We compare BK-PBS with CAV planning algorithms derived from rule-based car-following models and reinforcement learning. Through comprehensive simulation on a highway merging scenario across diverse settings of CAV penetration rate and traffic density, BK-PBS outperforms these baselines in reducing collision rates and improving system-level travel delay. Our work is directly applicable to many scenarios of multi-human multi-robot coordination.
PARCO: Learning Parallel Autoregressive Policies for Efficient Multi-Agent Combinatorial Optimization
Multi-agent combinatorial optimization problems such as routing and scheduling have great practical relevance but present challenges due to their NP-hard combinatorial nature, hard constraints on the number of possible agents, and hard-to-optimize objective functions. This paper introduces PARCO (Parallel AutoRegressive Combinatorial Optimization), a novel approach that learns fast surrogate solvers for multi-agent combinatorial problems with reinforcement learning by employing parallel autoregressive decoding. We propose a model with a Multiple Pointer Mechanism to efficiently decode multiple decisions simultaneously by different agents, enhanced by a Priority-based Conflict Handling scheme. Moreover, we design specialized Communication Layers that enable effective agent collaboration, thus enriching decision-making. We evaluate PARCO in representative multi-agent combinatorial problems in routing and scheduling and demonstrate that our learned solvers offer competitive results against both classical and neural baselines in terms of both solution quality and speed. We make our code openly available at https://github.com/ai4co/parco.
MARPF: Multi-Agent and Multi-Rack Path Finding IROS 2024
In environments where many automated guided vehicles (AGVs) operate, planning efficient, collision-free paths is essential. Related research has mainly focused on environments with pre-defined passages, resulting in space inefficiency. We attempt to relax this assumption. In this study, we define multi-agent and multi-rack path finding (MARPF) as the problem of planning paths for AGVs to convey target racks to their designated locations in environments without passages. In such environments, an AGV without a rack can pass under racks, whereas one with a rack cannot pass under racks to avoid collisions. MARPF entails conveying the target racks without collisions, while the obstacle racks are relocated to prevent any interference with the target racks. We formulated MARPF as an integer linear programming problem in a network flow. To distinguish situations in which an AGV is or is not loading a rack, the proposed method introduces two virtual layers into the network. We optimized the AGVs' movements to move obstacle racks and convey the target racks. The formulation and applicability of the algorithm were validated through numerical experiments. The results indicated that the proposed algorithm addressed issues in environments with dense racks.
comment: 7 pages, 10 figures, IROS 2024
A Graph-based Adversarial Imitation Learning Framework for Reliable & Realtime Fleet Scheduling in Urban Air Mobility
The advent of Urban Air Mobility (UAM) presents the scope for a transformative shift in the domain of urban transportation. However, its widespread adoption and economic viability depend in part on the ability to optimally schedule the fleet of aircraft across vertiports in a UAM network, under uncertainties attributed to airspace congestion, changing weather conditions, and varying demands. This paper presents a comprehensive optimization formulation of the fleet scheduling problem, while also identifying the need for alternate solution approaches, since directly solving the resulting integer nonlinear programming problem is computationally prohibitive for daily fleet scheduling. Previous work has shown the effectiveness of using (graph) reinforcement learning (RL) approaches to train real-time executable policy models for fleet scheduling. However, such policies can often be brittle on out-of-distribution scenarios or edge cases. Moreover, training performance also deteriorates as the complexity (e.g., number of constraints) of the problem increases. To address these issues, this paper presents an imitation learning approach where the RL-based policy exploits expert demonstrations yielded by solving the exact optimization using a Genetic Algorithm. The policy model comprises Graph Neural Network (GNN) based encoders that embed the space of vertiports and aircraft, Transformer networks to encode demand, passenger fare, and transport cost profiles, and a Multi-head attention (MHA) based decoder. Expert demonstrations are used through the Generative Adversarial Imitation Learning (GAIL) algorithm. Interfaced with a UAM simulation environment involving 8 vertiports and 40 aircraft, the new imitative approach achieves better mean performance in terms of the daily-profit reward, and a remarkable improvement in the case of unseen worst-case scenarios, compared to pure RL results.
comment: Presented at the AIAA Aviation Forum 2024
Ontology-driven Reinforcement Learning for Personalized Student Support
In the search for more effective education, there is a widespread effort to develop better approaches to personalize student education. Unassisted, educators often do not have time or resources to personally support every student in a given classroom. Motivated by this issue, and by recent advancements in artificial intelligence, this paper presents a general-purpose framework for personalized student support, applicable to any virtual educational system such as a serious game or an intelligent tutoring system. To fit any educational situation, we apply ontologies for their semantic organization, combining them with data collection considerations and multi-agent reinforcement learning. The result is a modular system that can be adapted to any virtual educational software to provide useful personalized assistance to students.
comment: 6 pages, 3 figures, in press for IEEE Systems, Man, and Cybernetics 2024 Conference
Systems and Control (CS)
Differentiable Discrete Event Simulation for Queuing Network Control
Queuing network control is essential for managing congestion in job-processing systems such as service systems, communication networks, and manufacturing processes. Despite growing interest in applying reinforcement learning (RL) techniques, queueing network control poses distinct challenges, including high stochasticity, large state and action spaces, and lack of stability. To tackle these challenges, we propose a scalable framework for policy optimization based on differentiable discrete event simulation. Our main insight is that by implementing a well-designed smoothing technique for discrete event dynamics, we can compute pathwise policy gradients for large-scale queueing networks using auto-differentiation software (e.g., TensorFlow, PyTorch) and GPU parallelization. Through extensive empirical experiments, we observe that our policy gradient estimators are several orders of magnitude more accurate than typical REINFORCE-based estimators. In addition, we propose a new policy architecture, which drastically improves stability while maintaining the flexibility of neural-network policies. In a wide variety of scheduling and admission control tasks, we demonstrate that training control policies with pathwise gradients leads to a 50-1000x improvement in sample efficiency over state-of-the-art RL methods. Unlike prior tailored approaches to queueing, our methods can flexibly handle realistic scenarios, including systems operating in non-stationary environments and those with non-exponential interarrival/service times.
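The contrast between pathwise and REINFORCE-style gradient estimators drawn in the abstract above can be illustrated on a toy problem. The sketch below is not the paper's queueing simulator or smoothing technique. It assumes a one-dimensional Gaussian sample x = theta + sigma * eps and the objective E[x^2], whose true gradient with respect to theta is 2 * theta, and compares the two estimators by sample mean and variance. The function name and all parameter values are illustrative.

```python
import random
import statistics


def gradient_samples(theta, sigma, n, seed=0):
    """Return per-sample pathwise and score-function gradient estimates
    of d/dtheta E[x**2], where x = theta + sigma * eps and eps ~ N(0, 1)."""
    rng = random.Random(seed)
    pathwise, score = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        x = theta + sigma * eps
        # Pathwise: differentiate through the sample, dx/dtheta = 1.
        pathwise.append(2.0 * x)
        # Score function (REINFORCE): f(x) * d/dtheta log p(x; theta),
        # and for a Gaussian d/dtheta log p = (x - theta) / sigma**2 = eps / sigma.
        score.append(x * x * eps / sigma)
    return pathwise, score


pw, sf = gradient_samples(theta=1.0, sigma=1.0, n=20000)
pw_mean, sf_mean = statistics.fmean(pw), statistics.fmean(sf)
pw_var, sf_var = statistics.variance(pw), statistics.variance(sf)
print(f"pathwise: mean={pw_mean:.3f} var={pw_var:.2f}")
print(f"score:    mean={sf_mean:.3f} var={sf_var:.2f}")
```

Both estimators are unbiased here, but the score-function estimator has noticeably higher variance even on this smooth toy objective. The pathwise estimator requires the sample to be differentiable in theta, which is the property a smoothing of discrete event dynamics is meant to supply.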
A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization
Two-stage adaptive robust optimization is a powerful approach for planning under uncertainty that aims to balance costs of "here-and-now" first-stage decisions with those of "wait-and-see" recourse decisions made after uncertainty is realized. To embed robustness against uncertainty, modelers typically assume a simple polyhedral or ellipsoidal set over which contingencies may be realized. However, these simple uncertainty sets tend to yield highly conservative decision-making when uncertainties are high-dimensional. In this work, we introduce AGRO, a column-and-constraint generation algorithm that performs adversarial generation for two-stage adaptive robust optimization using a variational autoencoder. AGRO identifies realistic and cost-maximizing contingencies by optimizing over spherical uncertainty sets in a latent space using a projected gradient ascent approach that differentiates the optimal recourse cost with respect to the latent variable. To demonstrate the cost- and time-efficiency of our approach experimentally, we apply AGRO to an adaptive robust capacity expansion problem for a regional power system and show that AGRO is able to reduce costs by up to 7.8% and runtimes by up to 77% in comparison to the conventional column-and-constraint generation algorithm.
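The projected gradient ascent over a spherical set mentioned in the abstract above follows a generic pattern: take a gradient step, then project back onto the ball. The sketch below shows only that pattern on a toy linear objective. It does not include the variational autoencoder or the differentiation of the optimal recourse cost, and the function names, step size, and objective are assumptions made for illustration.

```python
import math


def project_to_ball(z, radius):
    """Project z onto the Euclidean ball of the given radius centred at 0."""
    norm = math.sqrt(sum(v * v for v in z))
    if norm <= radius:
        return list(z)
    return [radius * v / norm for v in z]


def projected_gradient_ascent(grad, z0, radius, step=0.1, iters=200):
    """Maximise a differentiable objective over the ball ||z|| <= radius.

    `grad` maps a point z to the gradient of the objective at z.
    """
    z = project_to_ball(z0, radius)
    for _ in range(iters):
        g = grad(z)
        z = project_to_ball([zi + step * gi for zi, gi in zip(z, g)], radius)
    return z


# Toy objective f(z) = c . z, standing in for a cost to be maximised.
c = [3.0, 4.0]
z_star = projected_gradient_ascent(lambda z: c, z0=[0.0, 0.0], radius=2.0)
print(z_star)
```

For a linear objective the maximiser over a ball is the boundary point in the direction of `c`, so the iterate settles at `radius * c / ||c||`, which makes the toy case easy to check by hand.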
Design of CANSAT for Air Quality Monitoring for an altitude of 900 meters
This paper presents the design and development of NAMBI-VJ, a CANSAT specifically designed for air quality monitoring and stabilization. The CANSAT's cylindrical structure, measuring 310mm in height and 125mm in diameter, is equipped with a mechanical gyroscope for stabilization and a spill-hole parachute for controlled descent. The primary objective of this research is to create a compact, lightweight satellite capable of monitoring air quality parameters such as particulate matter (PM), carbon dioxide (CO2), longitude, and latitude. To achieve this, the CANSAT utilizes Zigbee communication to transmit data to a ground station. Experimental testing involved dropping the CANSAT from an altitude of 900 meters using a drone. The results demonstrate the CANSAT's ability to successfully gather and transmit air quality data, highlighting its potential for environmental monitoring applications.
comment: 6 pages, 6 figures
Advances in Cislunar Periodic Solutions via Taylor Polynomial Maps
In this paper, novel approaches are developed to explore the dynamics of motion in periodic orbits near libration points in cislunar space using the Differential Algebra (DA) framework. The Circular Restricted Three-Body Problem (CR3BP) models the motion, with initial states derived numerically via differential correction. Periodic orbit families are computed using the Pseudo-Arclength Continuation (PAC) method and fitted. Two newly developed polynomial regression models (PRMs) express initial states as functions of predefined parameters and are used in the DA framework to evaluate propagated states. The initial states, expressed via PRM, are propagated in the DA framework using the fourth-order Runge-Kutta (RK4) method. The resultant polynomials of both PRM and DA are employed to develop a control law that shows significantly reduced control effort compared to the traditional tracking control law, demonstrating their potential for cislunar space applications, particularly those requiring computationally inexpensive low-energy transfers.
comment: 20 pages, 19 figures
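As a rough illustration of the propagation step named in the abstract above, the following sketch integrates the CR3BP equations of motion in the rotating, nondimensional frame with a classical RK4 step. It uses plain floating-point states and does not reproduce the paper's Differential Algebra or polynomial regression machinery. The mass ratio `MU`, the step size, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Approximate Earth-Moon mass ratio (assumed value, for illustration only).
MU = 0.012150585


def _distances(x, y, z, mu):
    """Distances from the spacecraft to the larger and smaller primary."""
    r1 = np.sqrt((x + mu) ** 2 + y ** 2 + z ** 2)
    r2 = np.sqrt((x - 1.0 + mu) ** 2 + y ** 2 + z ** 2)
    return r1, r2


def cr3bp_rhs(state, mu=MU):
    """Time derivative of [x, y, z, vx, vy, vz] in the rotating frame."""
    x, y, z, vx, vy, vz = state
    r1, r2 = _distances(x, y, z, mu)
    ax = 2.0 * vy + x - (1.0 - mu) * (x + mu) / r1 ** 3 - mu * (x - 1.0 + mu) / r2 ** 3
    ay = -2.0 * vx + y - (1.0 - mu) * y / r1 ** 3 - mu * y / r2 ** 3
    az = -(1.0 - mu) * z / r1 ** 3 - mu * z / r2 ** 3
    return np.array([vx, vy, vz, ax, ay, az])


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for an autonomous system."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def propagate(state0, dt, n_steps):
    """Propagate an initial state for n_steps fixed steps and return the final state."""
    state = np.asarray(state0, dtype=float)
    for _ in range(n_steps):
        state = rk4_step(cr3bp_rhs, state, dt)
    return state


def jacobi_constant(state, mu=MU):
    """Jacobi constant, the integral of motion of the CR3BP."""
    x, y, z, vx, vy, vz = state
    r1, r2 = _distances(x, y, z, mu)
    return x ** 2 + y ** 2 + 2.0 * (1.0 - mu) / r1 + 2.0 * mu / r2 - (vx ** 2 + vy ** 2 + vz ** 2)
```

Monitoring the drift of `jacobi_constant` along a propagated trajectory is a common sanity check on the integrator step size.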
Wind turbine condition monitoring based on intra- and inter-farm federated learning
As wind energy adoption is growing, ensuring the efficient operation and maintenance of wind turbines becomes essential for maximizing energy production and minimizing costs and downtime. Many AI applications in wind energy, such as in condition monitoring and power forecasting, may benefit from using operational data not only from individual wind turbines but from multiple turbines and multiple wind farms. Collaborative distributed AI which preserves data privacy holds a strong potential for these applications. Federated learning has emerged as a privacy-preserving distributed machine learning approach in this context. We explore federated learning in wind turbine condition monitoring, specifically for fault detection using normal behaviour models. We investigate various federated learning strategies, including collaboration across different wind farms and turbine models, as well as collaboration restricted to the same wind farm and turbine model. Our case study results indicate that federated learning across multiple wind turbines consistently outperforms models trained on a single turbine, especially when training data is scarce. Moreover, the amount of historical data necessary to train an effective model can be significantly reduced by employing a collaborative federated learning strategy. Finally, our findings show that extending the collaboration to multiple wind farms may result in inferior performance compared to restricting learning within a farm, specifically when faced with statistical heterogeneity and imbalanced datasets.
Generalizing Linear Graphs and Bond Graph Models with Hetero-functional Graphs for System-of-Systems Engineering Applications
In the 20th century, individual technology products like the generator, telephone, and automobile were connected to form many of the large-scale, complex, infrastructure networks we know today: the power grid, the communication infrastructure, and the transportation system. Progressively, these networked systems began interacting, forming what is now known as systems-of-systems. Because the component systems in the system-of-systems differ, modeling and analysis techniques with primitives applicable across multiple domains or disciplines are needed. For example, linear graphs and bond graphs have been used extensively in the electrical engineering, mechanical engineering, and mechatronic fields to design and analyze a wide variety of engineering systems. In contrast, hetero-functional graph theory (HFGT) has emerged to study many complex engineering systems and systems-of-systems (e.g. electric power, potable water, wastewater, natural gas, oil, coal, multi-modal transportation, mass-customized production, and personalized healthcare delivery systems). This paper seeks to relate hetero-functional graphs to linear graphs and bond graphs and demonstrate that the former is a generalization of the latter two. The contribution is relayed in three stages. First, the three modeling techniques are compared conceptually. Next, these techniques are contrasted on six example systems: (a) an electrical system, (b) a translational mechanical system, (c) a rotational mechanical system, (d) a fluidic system, (e) a thermal system, and (f) a multi-energy (electro-mechanical) system. Finally, this paper proves mathematically that hetero-functional graphs are a formal generalization of both linear graphs and bond graphs.
comment: This paper seeks to show that hetero-functional graphs generalize linear graphs and bond graphs. In order to do so, we needed to demonstrate all three graph techniques for six energy domains. We also add a mathematical proof. The resulting paper, although lengthier than normal, makes a significant theoretical as well as pedagogical contribution
Large-Area Conductor-Loaded PDMS Dielectric Composites for High-Sensitivity Wireless and Chipless Electromagnetic Temperature Sensors
Wireless electromagnetic sensing is a passive, non-destructive technique for measuring physical and chemical changes through permittivity or conductivity changes. However, the readout is limited by the sensitivity of the materials, often requiring complex sampling, particularly in the GHz range. We report capacitive dielectric temperature sensors based on polydimethylsiloxane (PDMS) loaded with 10 vol% of inexpensive, commercially available conductive fillers including copper powder (Cu), graphite powder (GP) and milled carbon fibre powder (CF). The sensors are tested in the range of 20 °C to 110 °C, with enhanced sensitivity from 20 to 60 °C, and relative response of up to 85.5% at 200 MHz for CF loaded capacitors. Additionally, we demonstrate that operating frequency influences the relative sensing response by as much as 15.0% in loaded composite capacitors. Finally, we demonstrate the suitability of PDMS-CF capacitors as a sensing element in wirelessly coupled chipless resonant coils tuned to 6.78 MHz, with a readout response in the resonant frequency of the sensor. The wireless readout of the PDMS-CF chipless system exhibited an average sensitivity of 0.38 %/°C, which is a 40x improvement over a pristine PDMS-based capacitive sensor and outperforms state-of-the-art frequency-domain radio frequency (RF) temperature sensors, including carbon-based composites at higher loadings. Exploiting the high sensitivity, we interrogate the sensor wirelessly using a low-cost and portable open-source NanoVNA, demonstrating a relative response in the resonant frequency of the reader coil of 48.5%, with the response in good agreement with instrumentation-grade vector network analyzers (VNAs). This demonstrates that the sensors could be integrated into an inexpensive and portable measurement setup that does not rely on specialty equipment or highly trained operators.
comment: Main body 24 pages, 8 figures, one table. Supplemental information 10 pages, 11 figures, two tables
Nonlinear identifiability of directed acyclic graphs with partial excitation and measurement
We analyze the identifiability of directed acyclic graphs in the case of partial excitation and measurement. We consider an additive model where the nonlinear functions located in the edges depend only on a past input, and we analyze the identifiability problem in the class of pure nonlinear functions satisfying $f(0)=0$. We show that any identification pattern (set of measured nodes and set of excited nodes) requires the excitation of sources, measurement of sinks and the excitation or measurement of the other nodes. Then, we show that a directed acyclic graph (DAG) is identifiable with a given identification pattern if and only if it is identifiable with the measurement of all the nodes. Next, we analyze the case of trees where we prove that any identification pattern guarantees the identifiability of the network. Finally, by introducing the notion of a generic nonlinear network matrix, we provide sufficient conditions for the identifiability of DAGs based on the notion of vertex-disjoint paths.
comment: 7 pages, 6 figures, to appear in IEEE Conference on Decision and Control (CDC 2024)
Maximum likelihood inference for high-dimensional problems with multiaffine variable relations
Maximum Likelihood Estimation of continuous variable models can be very challenging in high dimensions, due to potentially complex probability distributions. The existence of multiple interdependencies among variables can make it very difficult to establish convergence guarantees. This leads to a wide use of brute-force methods, such as grid searching and Monte-Carlo sampling and, when applicable, complex and problem-specific algorithms. In this paper, we consider inference problems where the variables are related by multiaffine expressions. We propose a novel Alternating and Iteratively-Reweighted Least Squares (AIRLS) algorithm, and prove its convergence for problems with Generalized Normal Distributions. We also provide an efficient method to compute the variance of the estimates obtained using AIRLS. Finally, we show how the method can be applied to graphical statistical models. We perform numerical experiments on several inference problems, showing significantly better performance than state-of-the-art approaches in terms of scalability, robustness to noise, and convergence speed due to an empirically observed super-linear convergence rate.
An Effective Current Limiting Strategy to Enhance Transient Stability of Virtual Synchronous Generator
Virtual synchronous generator (VSG) control has emerged as a crucial technology for integrating renewable energy sources. However, renewable energy sources have limited tolerance to overcurrent, necessitating current limiting (CL) strategies to mitigate it. Different CL strategies can have varying impacts on the system. Previous studies have discussed the effects of different CL strategies on the system, but they lack intuitive and explicit explanations. Moreover, previous CL strategies have failed to effectively ensure the stability of the system. In this paper, the Equal Proportional Area Criterion (EPAC) method is employed to intuitively explain how different CL strategies affect transient stability. Based on this, an effective current limiting strategy is proposed. Simulations are conducted in MATLAB/Simulink to validate the proposed strategy. The simulation results demonstrate that the proposed CL strategy exhibits superior stability.
comment: 2024 IEEE Energy Conversion Congress and Exposition (ECCE)
An innovation-based cycle-slip, multipath estimation, detection and mitigation method for tightly coupled GNSS/INS/Vision navigation in urban areas
Precise, consistent, and reliable positioning is crucial for a multitude of uses. In order to achieve high-precision global positioning services, multi-sensor fusion techniques, such as the Global Navigation Satellite System (GNSS)/Inertial Navigation System (INS)/Vision integration system, combine the strengths of various sensors. This technique is essential for localization in complex environments and has been widely used in the mass market. However, frequent signal deterioration and blocking in urban environments exacerbate the degradation of GNSS positioning and negatively impact the performance of the multi-sensor integration system. For GNSS pseudorange and carrier phase observation data in the urban environment, we offer an innovation-based cycle slip/multipath estimation, detection, and mitigation (I-EDM) method to reduce the influence on positioning of multipath effects and cycle slips induced by obstruction in urban settings. The method obtains the innovations of GNSS observations with the cluster analysis method. Then the innovations are used to detect the cycle slips and multipath. Compared with the residual-based method, the innovation-based method avoids the residual overfitting caused by the least-squares method, resulting in better detection of outliers within the GNSS observations. Vehicle tests carried out in urban settings verify the proposed approach. Experimental results indicate that accuracies of 0.23 m, 0.11 m, and 0.31 m in the east, north, and up components can be achieved by the GNSS/INS/Vision tightly coupled system with the I-EDM method, a maximum improvement of 21.6% over the residual-based EDM (R-EDM) method.
Recursive Quantization for $\mathcal{L}_2$ Stabilization of a Finite Capacity Stochastic Control Loop with Intermittent State Observations
The problem of $\mathcal{L}_2$ stabilization of a state feedback stochastic control loop is investigated under different constraints. The discrete time linear time invariant (LTI) open loop plant is chosen to be unstable. The additive white Gaussian noise is assumed to be stationary. The link between the plant and the controller is assumed to be a finite capacity stationary channel, which puts a constraint on the bit rate of the transmission. Moreover, the state of the plant is observed only intermittently keeping the loop open some of the time. In this manuscript both scalar and vector plants under Bernoulli and Markov intermittence models are investigated. Novel bounds on intermittence parameters are obtained to ensure $\mathcal{L}_2$ stability. Moreover, novel recursive quantization algorithms are developed to implement the stabilization scheme under all the constraints. Suitable illustrative examples are provided to elucidate the main results.
On Using Curved Mirrors to Decrease Shadowing in VLC
Visible light communication (VLC) complements radio frequency in indoor environments with large wireless data traffic. However, VLC is hindered by dramatic path losses when an opaque object is interposed between the transmitter and the receiver. Prior works propose the use of plane mirrors as optical reconfigurable intelligent surfaces (ORISs) to enhance communications through non-line-of-sight links. Plane mirrors rely on their orientation to forward the light to the target user location, which is challenging to implement in practice. This paper studies the potential of curved mirrors as static reflective surfaces to provide a broadening specular reflection that increases the signal coverage in mirror-assisted VLC scenarios. We study the behavior of paraboloid and semi-spherical mirrors and derive the irradiance equations. We provide extensive numerical and analytical results and show that curved mirrors, when developed with proper dimensions, may reduce the shadowing probability to zero, while static plane mirrors of the same size have shadowing probabilities larger than 65%. Furthermore, the signal-to-noise ratio offered by curved mirrors may suffice to provide connectivity to users deployed in the room even when a line-of-sight link blockage occurs.
comment: Accepted to be published in IEEE Globecom 2024
Robust synchronization and policy adaptation for networked heterogeneous agents
We propose a robust adaptive online synchronization method for leader-follower networks of nonlinear heterogeneous agents with system uncertainties and input magnitude saturation. Synchronization is achieved using a Distributed input Magnitude Saturation Adaptive Control with Reinforcement Learning (DMSAC-RL), which improves the empirical performance of policies trained on off-the-shelf models using Reinforcement Learning (RL) strategies. The leader observes the performance of a reference model, and followers observe the states and actions of the agents they are connected to, but not the reference model. The leader and followers may differ from the reference model in which the RL control policy was trained. DMSAC-RL uses an internal loop that adjusts the learned policy for the agents in the form of augmented input to solve the distributed control problem, including input-matched uncertainty parameters. We show that the synchronization error of the heterogeneous network is Uniformly Ultimately Bounded (UUB). Numerical analysis of a network of Multiple Input Multiple Output (MIMO) systems supports our theoretical findings.
comment: 30 pages, 12 figures, conference paper
Dual-TSST: A Dual-Branch Temporal-Spectral-Spatial Transformer Model for EEG Decoding
The decoding of electroencephalography (EEG) signals provides convenient access to user intentions and plays an important role in human-machine interaction. To effectively extract sufficient characteristics of multichannel EEG, a novel decoding architecture with a dual-branch temporal-spectral-spatial transformer (Dual-TSST) is proposed in this study. Specifically, by utilizing convolutional neural networks (CNNs) on different branches, the proposed network first extracts the temporal-spatial features of the original EEG and the temporal-spectral-spatial features of time-frequency domain data converted by wavelet transformation, respectively. These features are then integrated by a feature fusion block, which serves as the input of the transformer to capture the global long-range dependencies in the non-stationary EEG, and are classified via global average pooling and multi-layer perceptron blocks. To evaluate the efficacy of the proposed approach, comparative experiments are conducted on three publicly available datasets, BCI IV 2a, BCI IV 2b, and SEED, with head-to-head comparison against more than ten other state-of-the-art methods. The proposed Dual-TSST performs well across tasks, achieving average classification accuracies of 80.67% on BCI IV 2a, 88.64% on BCI IV 2b, and 96.65% on SEED. Extensive ablation experiments between Dual-TSST and a comparative baseline model also show that each module of the proposed method improves decoding performance. This study provides a new approach to high-performance EEG decoding and has great potential for future CNN-Transformer based applications.
Grid-Forming Storage Networks: Analytical Characterization of Damping and Design Insights
The paper presents a theoretical study on small-signal stability and damping in bulk power systems with multiple grid-forming inverter-based storage resources. A detailed analysis is presented, characterizing the impacts of inverter droop gains and storage size on the slower eigenvalues, particularly those concerning inter-area oscillation modes. From these parametric sensitivity studies, a set of necessary conditions is derived that the design of droop gain must satisfy to enhance damping performance. The analytical findings are structured into propositions highlighting potential design considerations for improving system stability. The findings are illustrated via numerical studies on an IEEE 68-bus grid-forming storage network.
comment: accepted for presentation at The 63rd IEEE Conference on Decision and Control
Robust Q-Learning under Corrupted Rewards
Recently, there has been a surge of interest in analyzing the non-asymptotic behavior of model-free reinforcement learning algorithms. However, the performance of such algorithms in non-ideal environments, such as in the presence of corrupted rewards, is poorly understood. Motivated by this gap, we investigate the robustness of the celebrated Q-learning algorithm to a strong-contamination attack model, where an adversary can arbitrarily perturb a small fraction of the observed rewards. We start by proving that such an attack can cause the vanilla Q-learning algorithm to incur arbitrarily large errors. We then develop a novel robust synchronous Q-learning algorithm that uses historical reward data to construct robust empirical Bellman operators at each time step. Finally, we prove a finite-time convergence rate for our algorithm that matches known state-of-the-art bounds (in the absence of attacks) up to a small inevitable $O(\varepsilon)$ error term that scales with the adversarial corruption fraction $\varepsilon$. Notably, our results continue to hold even when the true reward distributions have infinite support, provided they admit bounded second moments.
comment: Accepted to the Decision and Control Conference (CDC) 2024
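To make the idea of a robust empirical Bellman operator in the abstract above more concrete, here is a minimal sketch of one synchronous Q-learning step in which the reward for each state-action pair is estimated from its history with a trimmed mean. The trimmed mean is a stand-in robust estimator chosen for illustration; the abstract does not specify the paper's estimator, and it may differ. The function names, data layout, and parameters are assumptions.

```python
import numpy as np


def trimmed_mean(samples, eps):
    """Mean after discarding the ceil(eps * n) smallest and largest samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = int(np.ceil(eps * len(x)))
    if 2 * k >= len(x):
        # Too few samples to trim on both sides; fall back to the median.
        return float(np.median(x))
    return float(x[k:len(x) - k].mean())


def robust_sync_q_step(Q, reward_history, next_states, gamma, alpha, eps):
    """One synchronous update of every (state, action) entry of Q.

    Q              : array of shape (S, A)
    reward_history : reward_history[s][a] is a list of observed (possibly
                     corrupted) rewards for the pair (s, a)
    next_states    : integer array of shape (S, A) with one sampled next state
    eps            : assumed upper bound on the corrupted fraction of rewards
    """
    n_states, n_actions = Q.shape
    Q_new = Q.copy()
    for s in range(n_states):
        for a in range(n_actions):
            r_hat = trimmed_mean(reward_history[s][a], eps)
            target = r_hat + gamma * Q[next_states[s, a]].max()
            Q_new[s, a] = (1.0 - alpha) * Q[s, a] + alpha * target
    return Q_new
```

With a history of nine rewards equal to 1 and one corrupted reward of 1000, the plain sample mean is pulled far from 1, while the trimmed estimate with `eps = 0.1` discards the outlier.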
Memristors based Computation and Synthesis
The memristor was identified as the fourth fundamental circuit element by Dr. Leon Chua in 1971. Since then it has attracted considerable interest because of its non-volatility, and it is considered a viable solution for computation in the beyond-CMOS era. Recently, memristors have been used to perform basic logic operations such as AND, OR, NAND, NOR, and XOR, and in applications such as the Dot Product Engine and Convolutional Neural Networks. This paper presents a new behavioural model of the memristor and uses it to build a 32-bit ripple carry adder. The paper then compares the area, power, and time delay of the memristor-based 32-bit ripple carry adder with those of 45nm CMOS technology and highlights its advantages and pitfalls.
Model Predictive Online Trajectory Planning for Adaptive Battery Discharging in Fuel Cell Vehicle
This paper presents an online trajectory planning approach for optimal coordination of the Fuel Cell (FC) and battery in a plug-in Hybrid Electric Vehicle (HEV). One of the main challenges in energy management of plug-in HEVs is generating State-of-Charge (SOC) reference curves by optimally depleting the battery under high uncertainties in driving scenarios. Recent studies have begun to explore the potential of utilizing partial trip information for optimal SOC trajectory planning, but dynamic responses of the FC system are not taken into account. On the other hand, research on dynamic operation of FC systems often focuses on air flow management, and the battery has been treated only partially. Our aim is to fill this gap by designing an online trajectory planner for dynamic coordination of FC and battery systems that works with a high-level SOC planner in a hierarchical manner. We propose an iterative-LQR-based online trajectory planning method where the amount of electricity dischargeable at each driving segment can be explicitly and adaptively specified by the high-level planner. Numerical results are provided as a proof-of-concept example to show the effectiveness of the proposed approach.
Data-based approaches to learning and control by similarity between heterogeneous systems
This paper proposes basic definitions of similarity and similarity indexes between admissible behaviors of heterogeneous host and guest systems and further presents a similarity-based learning control framework by exploiting offline sampled data. By exploring helpful geometric properties of the admissible behavior and decomposing it into the subspace and offset components, the similarity indexes between two admissible behaviors are defined as the principal angles between their corresponding subspace components. By reconstructing the admissible behaviors leveraging sampled data, an efficient strategy for calculating the similarity indexes is developed, based on which a similarity-based learning control framework is proposed. It is shown that the host system can directly accomplish the same control tasks by utilizing the successful experience from the guest system, without having to undergo the trial-and-error process.
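The abstract above defines similarity indexes as principal angles between subspace components. A standard way to compute principal angles between two subspaces, given basis matrices, is via orthonormalization followed by a singular value decomposition. The sketch below shows that generic computation only. It is not the paper's data-based reconstruction procedure, and the function name and the assumption that both inputs have full column rank are illustrative.

```python
import numpy as np


def principal_angles(A, B):
    """Principal angles (radians, ascending) between span(A) and span(B).

    A and B are matrices whose columns span the two subspaces; both are
    assumed to have full column rank.
    """
    # Orthonormal bases of the two column spaces (reduced QR factorization).
    Qa, _ = np.linalg.qr(np.asarray(A, dtype=float))
    Qb, _ = np.linalg.qr(np.asarray(B, dtype=float))
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))
```

Angles close to zero indicate that the two subspaces nearly coincide along the corresponding directions, while an angle of π/2 indicates an orthogonal direction. Near zero the arccosine is sensitive to round-off, so very small angles from this formula should be read with some tolerance.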
InfraLib: Enabling Reinforcement Learning and Decision Making for Large Scale Infrastructure Management
Efficient management of infrastructure systems is crucial for economic stability, sustainability, and public safety. However, infrastructure management is challenging due to the vast scale of systems, stochastic deterioration of components, partial observability, and resource constraints. While data-driven approaches like reinforcement learning (RL) offer a promising avenue for optimizing management policies, their application to infrastructure has been limited by the lack of suitable simulation environments. We introduce InfraLib, a comprehensive framework for modeling and analyzing infrastructure management problems. InfraLib employs a hierarchical, stochastic approach to realistically model infrastructure systems and their deterioration. It supports practical functionality such as modeling component unavailability, cyclical budgets, and catastrophic failures. To facilitate research, InfraLib provides tools for expert data collection, simulation-driven analysis, and visualization. We demonstrate InfraLib's capabilities through case studies on a real-world road network and a synthetic benchmark with 100,000 components.
CyberDep: Towards the Analysis of Cyber-Physical Power System Interdependencies Using Bayesian Networks and Temporal Data
Modern-day power systems have become increasingly cyber-physical due to the ongoing developments to the grid that include the rise of distributed energy generation and the increase of the deployment of many cyber devices for monitoring and control, such as the Supervisory Control and Data Acquisition (SCADA) system. Such capabilities have made the power system more vulnerable to cyber-attacks that can harm the physical components of the system. As such, it is of utmost importance to study both the physical and cyber components together, focusing on characterizing and quantifying the interdependency between these components. This paper focuses on developing an algorithm, named CyberDep, for Bayesian network generation through conditional probability calculations of cyber traffic flows between system nodes. Additionally, CyberDep is implemented on the temporal data of the cyber-physical emulation of the WSCC 9-bus power system. The results of this work provide a visual representation of the probabilistic relationships within the cyber and physical components of the system, aiding in cyber-physical interdependency quantification.
comment: Accepted and Presented at the 2024 Kansas Power and Energy Conference (KPEC 2024)
Autonomous Drifting Based on Maximal Safety Probability Learning
This paper proposes a novel learning-based framework for autonomous driving based on the concept of maximal safety probability. Efficient learning requires rewards that are informative of desirable/undesirable states, but such rewards are challenging to design manually due to the difficulty of differentiating better states among many safe states. On the other hand, learning policies that maximize safety probability does not require laborious reward shaping but is numerically challenging because the algorithms must optimize policies based on binary rewards sparse in time. Here, we show that physics-informed reinforcement learning can efficiently learn this form of maximally safe policy. Unlike existing drift control methods, our approach does not require a specific reference trajectory or complex reward shaping, and can learn safe behaviors only from sparse binary rewards. This is enabled by the use of the physics loss that plays an analogous role to reward shaping. The effectiveness of the proposed approach is demonstrated through lane keeping in a normal cornering scenario and safe drifting in a high-speed racing scenario.
comment: arXiv admin note: text overlap with arXiv:2403.16391
Non-stationary and Sparsely-correlated Multi-output Gaussian Process with Spike-and-Slab Prior
Multi-output Gaussian process (MGP) is commonly used as a transfer learning method to leverage information among multiple outputs. A key advantage of MGP is providing uncertainty quantification for prediction, which is highly important for subsequent decision-making tasks. However, traditional MGP may not be sufficiently flexible to handle multivariate data with dynamic characteristics, particularly when dealing with complex temporal correlations. Additionally, since some outputs may lack correlation, transferring information among them may lead to negative transfer. To address these issues, this study proposes a non-stationary MGP model that can capture both the dynamic and sparse correlation among outputs. Specifically, the covariance functions of MGP are constructed using convolutions of time-varying kernel functions. Then a dynamic spike-and-slab prior is placed on correlation parameters to automatically decide which sources are informative to the target output in the training process. An expectation-maximization (EM) algorithm is proposed for efficient model fitting. Both numerical studies and a real case demonstrate its efficacy in capturing dynamic and sparse correlation structure and mitigating negative transfer for high-dimensional time-series data. Finally, a mountain-car reinforcement learning case highlights its potential application in decision making problems.
Envisioning an Optimal Network of Space-Based Lasers for Orbital Debris Remediation
The rapid increase in resident space objects, including satellites and orbital debris, threatens the safety and sustainability of space missions. This paper explores orbital debris remediation using laser ablation with a network of collaborative space-based lasers. A novel delta-v vector analysis framework quantifies the effects of multiple simultaneous laser-to-debris (L2D) engagements by leveraging a vector composition of imparted delta-v vectors. The paper introduces the Concurrent Location-Scheduling Problem (CLSP), which optimizes the placement of laser platforms and schedules L2D engagements to maximize debris remediation capacity. Due to the computational complexity of CLSP, it is decomposed into two sequential subproblems: (1) optimal laser platform locations are determined using the Maximal Covering Location Problem, and (2) a novel integer linear programming-based approach schedules L2D engagements within the network configuration to maximize remediation capacity. Computational experiments are conducted to evaluate the proposed framework's effectiveness under various mission scenarios, demonstrating key network functions such as collaborative nudging, deorbiting, and just-in-time collision avoidance. A cost-benefit analysis further explores how varying the number and distribution of laser platforms affects debris remediation capacity, providing insights into optimizing the performance of space-based laser networks.
comment: 41 pages, 13 figures, submitted to the Journal of Spacecraft and Rockets
Vehicular Resilient Control Strategy for a Platoon of Self-Driving Vehicles under DoS Attack
In a platoon, multiple autonomous vehicles engage in data exchange to navigate toward their intended destination. Within this network, a designated leader shares its status information with followers based on a predefined communication graph. However, these vehicles are susceptible to disturbances, leading to deviations from their intended routes. Denial-of-service (DoS) attacks, a significant type of cyber threat, can impact the motion of the leader. This paper addresses the destabilizing effects of DoS attacks on platoons and introduces a novel vehicular resilient control strategy to restore stability. Upon detecting and measuring a DoS attack, modeled with a time-varying delay, the proposed method initiates a process to retrieve the attacked leader. Through a newly designed switching system, the attacked leader transitions to a follower role, and a new leader is identified within a restructured platoon configuration, enabling the platoon to maintain consensus. Specifically, in the event of losing the original leader due to a DoS attack, the remaining vehicles do experience destabilization. They adapt their motions as a cohesive network through a distributed resilient controller. The effectiveness of the proposed approach is validated through an illustrative case study, showing its applicability in real-world scenarios.
comment: 9 pages
A Nonlinear Controller for Parallel DC-DC Converters with ZIP Load and Constrained Output Voltage
In this paper, an adaptive nonlinear controller is designed for a parallel DC-DC converter system that feeds an unknown ZIP load, characterized by constant impedance (Z), constant current (I), and constant power (P), at the DC bus. The proposed controller ensures simultaneous voltage adjustment and power sharing in the large signal sense despite uncertainties in ZIP loads, DC input voltages, and other electrical parameters. To keep the output voltage within a desired range, we utilize a barrier function that is invertible, smoothly continuous, and strictly increasing. Its limits at infinity represent the upper and lower bounds for the output voltage. We apply the invertible transformation of the barrier function to the output voltage and then design the controller using the adaptive backstepping method. Using this barrier-function-based adaptive backstepping controller, uncertain parameters are identified on-line, and the voltage adjustment and power sharing objectives are established. Moreover, the voltage constraint is not violated even in the presence of sudden and unknown large variations of load. The efficiency of the proposed nonlinear controller is evaluated through simulations of a parallel DC-DC converter system using the MATLAB/Simscape Electrical environment.
Analytical Optimized Traffic Flow Recovery for Large-scale Urban Transportation Network
The implementation of intelligent transportation systems (ITS) has enhanced data collection in urban transportation through advanced traffic sensing devices. However, the high costs associated with installation and maintenance result in sparse traffic data coverage. To obtain complete, accurate, and high-resolution network-wide traffic flow data, this study introduces the Analytical Optimized Recovery (AOR) approach that leverages abundant GPS speed data alongside sparse flow data to estimate traffic flow in large-scale urban networks. The method formulates a constrained optimization framework that utilizes a quadratic objective function with l2 norm regularization terms to address the traffic flow recovery problem effectively and incorporates a Lagrangian relaxation technique to maintain non-negativity constraints. The effectiveness of this approach was validated in a large urban network in Shenzhen's Futian District using the Simulation of Urban MObility (SUMO) platform. Analytical results indicate that the method achieves low estimation errors, affirming its suitability for comprehensive traffic analysis in urban settings with limited sensor deployment.
comment: 27 pages, 13 figures
Data-informativity conditions for structured linear systems with implications for dynamic networks
When estimating models of a multivariable dynamic system, a typical condition for consistency is to require the input signals to be persistently exciting, which is guaranteed if the input spectrum is positive definite for a sufficient number of frequencies. In this paper it is investigated how such a condition can be relaxed by exploiting prior structural information on the multivariable system, such as structural zero elements in the transfer matrix or entries that are a priori known and therefore not parametrized. It is shown that in particular situations the data-informativity condition can be decomposed into different MISO (multiple input single output) situations, leading to relaxed conditions for the MIMO (multiple input multiple output) model. When estimating a single module in a linear dynamic network, the data-informativity conditions can generically be formulated as path-based conditions on the graph of the network. The new relaxed conditions for data-informativity will then also lead to relaxed path-based conditions on the network graph. Additionally, the new expressions are shown to be closely related to earlier derived conditions for (generic) single module identifiability.
comment: 15 pages, 3 figures
Inferring Global Exponential Stability Properties using Lie-bracket Approximations
In the present paper, a novel result for inferring uniform global, not semi-global, exponential stability in the sense of Lyapunov with respect to input-affine systems from global uniform exponential stability properties with respect to their associated Lie-bracket systems is shown. The result is applied to adapt dither frequencies to find a sufficiently high gain in adaptive control of linear unknown systems, and a simple numerical example is simulated to support the theoretical findings.
comment: Extended Version
Distributionally Robust Control for Chance-Constrained Signal Temporal Logic Specifications
We consider distributionally robust optimal control of stochastic linear systems under signal temporal logic (STL) chance constraints when the disturbance distribution is unknown. By assuming that the underlying predicate functions are Lipschitz continuous and the noise realizations are drawn from a distribution having a concentration of measure property, we first formulate the underlying chance-constrained control problem as stochastic programming with constraints on expectations and propose a solution using a distributionally robust approach based on the Wasserstein metric. We show that by choosing a proper Wasserstein radius, the original chance-constrained optimization can be satisfied with a user-defined confidence level. A numerical example illustrates the efficacy of the method.
comment: 8 pages and 1 figure
Combined Plant and Control Co-design via Solutions of Hamilton-Jacobi-Bellman Equation Based on Physics-informed Learning
This paper addresses integrated design of engineering systems, where the physical structure of the plant and the controller design are optimized simultaneously. To cope with uncertainties due to noises acting on the dynamics and modeling errors, an Uncertain Control Co-design (UCCD) problem formulation is proposed. Existing UCCD methods usually rely on uncertainty propagation analyses using Monte Carlo methods for open-loop solutions of optimal control, which suffer from stringent trade-offs among accuracy, time horizon, and computational time. The proposed method utilizes closed-loop solutions characterized by the Hamilton-Jacobi-Bellman equation, a Partial Differential Equation (PDE) defined on the state space. A solution algorithm for the proposed UCCD formulation is developed based on PDE solutions of Physics-informed Neural Networks (PINNs). Numerical examples of regulator design problems are provided, and it is shown that simultaneous update of PINN weights and the design parameters works effectively for solving UCCD problems.
Simultaneous compensation of input delay and state/input quantization for linear systems via switched predictor feedback
We develop a switched predictor-feedback law, which achieves global asymptotic stabilization of linear systems with input delay and with the plant and actuator states available only in (almost) quantized form. The control design relies on a quantized version of the nominal predictor-feedback law for linear systems, in which quantized measurements of the plant and actuator states enter the predictor state formula. A switching strategy is constructed to dynamically adjust the tunable parameter of the quantizer (in a piecewise constant manner), in order to initially increase the range and subsequently decrease the error of the quantizers. The key element in the proof of global asymptotic stability in the supremum norm of the actuator state is derivation of solutions' estimates combining a backstepping transformation with small-gain and input-to-state stability arguments, for addressing the error due to quantization. We extend this result to the input quantization case and illustrate our theory with a numerical example.
comment: 12 pages, 15 figures, Systems & Control Letters
Using matrix sparsification to solve tropical linear vector equations
A linear vector equation in two unknown vectors is examined in the framework of tropical algebra dealing with the theory and applications of semirings and semifields with idempotent addition. We consider a two-sided equation where each side is a tropical product of a given matrix by one of the unknown vectors. We use a matrix sparsification technique to reduce the equation to a set of vector inequalities that involve row-monomial matrices obtained from the given matrices. An existence condition of solutions for the inequalities is established, and a direct representation of the solutions is derived in a compact vector form. To illustrate the proposed approach and to compare the obtained result with that of an existing solution procedure, we apply our solution technique to handle two-sided equations known in the literature. Finally, a computational scheme based on the approach to derive all solutions of the two-sided equation is discussed.
comment: 16 pages
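As a small illustration of the two-sided equation described in this abstract, the sketch below evaluates tropical matrix-vector products and checks whether a candidate pair of vectors satisfies A ⊗ x = B ⊗ y. It assumes the max-plus semifield (addition is max, multiplication is +) as one concrete instance of idempotent algebra; the function names and the example matrices are hypothetical and are not taken from the paper, and the sparsification-based solution procedure itself is not reproduced here.

```python
import numpy as np


def maxplus_matvec(A, x):
    """Max-plus product: (A (x) x)_i = max_j (A[i, j] + x[j])."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    # Broadcasting adds x[j] to every entry of column j, then takes row maxima.
    return np.max(A + x[np.newaxis, :], axis=1)


def solves_two_sided(A, B, x, y, tol=1e-12):
    """Check whether (x, y) satisfies the two-sided equation A (x) x = B (x) y."""
    return bool(np.allclose(maxplus_matvec(A, x), maxplus_matvec(B, y), atol=tol))


# Hypothetical example data, chosen only for illustration.
A = np.array([[0.0, 3.0], [2.0, 1.0]])
B = np.array([[3.0, 0.0], [1.0, 3.0]])
x = np.array([1.0, 0.0])
y = np.array([0.0, 0.0])

lhs = maxplus_matvec(A, x)  # [max(0+1, 3+0), max(2+1, 1+0)]
rhs = maxplus_matvec(B, y)  # [max(3+0, 0+0), max(1+0, 3+0)]
is_solution = solves_two_sided(A, B, x, y)
```

A check of this kind only verifies a given candidate; deriving all solutions is the subject of the paper.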
Finite Sample Frequency Domain Identification
We study non-parametric frequency-domain system identification from a finite-sample perspective. We assume an open loop scenario where the excitation input is periodic and consider the Empirical Transfer Function Estimate (ETFE), where the goal is to estimate the frequency response at certain desired (evenly-spaced) frequencies, given input-output samples. We show that under sub-Gaussian colored noise (in time-domain) and stability assumptions, the ETFE estimates are concentrated around the true values. The error rate is of the order of $\mathcal{O}((d_{\mathrm{u}}+\sqrt{d_{\mathrm{u}}d_{\mathrm{y}}})\sqrt{M/N_{\mathrm{tot}}})$, where $N_{\mathrm{tot}}$ is the total number of samples, $M$ is the number of desired frequencies, and $d_{\mathrm{u}},\,d_{\mathrm{y}}$ are the dimensions of the input and output signals respectively. This rate remains valid for general irrational transfer functions and does not require a finite order state-space representation. By tuning $M$, we obtain a $N_{\mathrm{tot}}^{-1/3}$ finite-sample rate for learning the frequency response over all frequencies in the $ \mathcal{H}_{\infty}$ norm. Our result draws upon an extension of the Hanson-Wright inequality to semi-infinite matrices. We study the finite-sample behavior of ETFE in simulations.
comment: Version 2 changes: several typos were fixed and some proof steps were expanded
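The estimator studied in this abstract can be sketched in a few lines for the simplest setting. The example below assumes a single-input single-output system, a periodic excitation, and noise-free data, so it only shows the mechanics of the ETFE (ratio of output and input DFTs over the steady-state periods); the function name `etfe`, the FIR test system, and all parameter values are illustrative assumptions, not the paper's setup.

```python
import numpy as np


def etfe(u, y, period, n_keep):
    """SISO empirical transfer function estimate from the last n_keep periods.

    Averages the retained periods in the time domain, then divides the
    output DFT by the input DFT at every excited frequency bin.
    """
    u = np.asarray(u, dtype=float)[-period * n_keep:].reshape(n_keep, period)
    y = np.asarray(y, dtype=float)[-period * n_keep:].reshape(n_keep, period)
    U = np.fft.fft(u.mean(axis=0))
    Y = np.fft.fft(y.mean(axis=0))
    excited = np.abs(U) > 1e-9 * np.abs(U).max()
    G = np.full(period, np.nan, dtype=complex)  # NaN where the input has no energy
    G[excited] = Y[excited] / U[excited]
    return G


# Hypothetical test system: a short FIR filter driven by a periodic random input.
rng = np.random.default_rng(0)
period, n_periods = 16, 6
h = np.array([1.0, 0.5, 0.25])
u = np.tile(rng.standard_normal(period), n_periods)
y = np.convolve(u, h)[: u.size]  # the first period contains the transient

G_hat = etfe(u, y, period, n_keep=4)  # discard the first two periods
G_true = np.fft.fft(h, n=period)      # true response on the same frequency grid
```

Discarding the initial periods removes the transient, so in this noise-free case the estimate coincides with the true response at the 16 evenly spaced frequencies; with noise, the abstract's rate describes how the error shrinks as the number of samples grows.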
On Computation of Approximate Solutions to Large-Scale Backstepping Kernel Equations via Continuum Approximation
We provide two methods for computation of continuum backstepping kernels that arise in control of continua (ensembles) of linear hyperbolic PDEs and which can approximate backstepping kernels arising in control of a large-scale, PDE system counterpart (with computational complexity that does not grow with the number of state components of the large-scale system). In the first method, we provide explicit formulae for the solution to the continuum kernels PDEs, employing a (triple) power series representation of the continuum kernel and establishing its convergence properties. In this case, we also provide means for reducing computational complexity by properly truncating the power series (in the powers of the ensemble variable). In the second method, we identify a class of systems for which the solution to the continuum (and hence, also an approximate solution to the respective large-scale) kernel equations can be constructed in closed form. We also present numerical examples to illustrate computational efficiency/accuracy of the approaches, as well as to validate the stabilization properties of the approximate control kernels, constructed based on the continuum.
comment: 15 pages, 5 figures, submitted to Systems & Control Letters, MATLAB implementation of Algorithm 1 available at https://github.com/jphumaloja/Continuum-Kernels-Power-Series/
Parameter Dependent Robust Control Invariant Sets for LPV Systems with Bounded Parameter Variation Rate
Real-time measurements of the scheduling parameter of linear parameter-varying (LPV) systems enable the synthesis of robust control invariant (RCI) sets and parameter-dependent controllers inducing invariance. We present a method to synthesize parameter-dependent robust control invariant (PD-RCI) sets for LPV systems with bounded parameter variation, in which invariance is induced using PD-vertex control laws. The PD-RCI sets are parameterized as configuration-constrained polytopes that admit a joint parameterization of their facets and vertices. The proposed sets and associated control laws are computed by solving a single semidefinite programming (SDP) problem. Through numerical examples, we demonstrate that the proposed method outperforms state-of-the-art methods for synthesizing PD-RCI sets, both with respect to conservativeness and computational load.
comment: 8 pages, 6 figures
Online learning for robust voltage control under uncertain grid topology
Voltage control generally requires accurate information about the grid's topology in order to guarantee network stability. However, accurate topology identification is challenging for existing methods, especially as the grid is subject to increasingly frequent reconfiguration due to the adoption of renewable energy. In this work, we combine a nested convex body chasing algorithm with a robust predictive controller to achieve provably finite-time convergence to safe voltage limits in the online setting where there is uncertainty in both the network topology as well as load and generation variations. In an online fashion, our algorithm narrows down the set of possible grid models that are consistent with observations and adjusts reactive power generation accordingly to keep voltages within desired safety limits. Our approach can also incorporate existing partial knowledge of the network to improve voltage control performance. We demonstrate the effectiveness of our approach in a case study on a Southern California Edison 56-bus distribution system. Our experiments show that in practical settings, the controller is indeed able to narrow the set of consistent topologies quickly enough to make control decisions that ensure stability in both linearized and realistic non-linear models of the distribution grid.
comment: Published in IEEE Transactions on Smart Grid, vol. 15, no. 5, pp. 4754-4764, Sept. 2024. arXiv admin note: substantial text overlap with arXiv:2206.14369
Demonstrating a Robust Walking Algorithm for Underactuated Bipedal Robots in Non-flat, Non-stationary Environments
This work explores an innovative algorithm designed to enhance the mobility of underactuated bipedal robots across challenging terrains, especially when navigating through spaces with constrained opportunities for foot support, like steps or stairs. By combining ankle torque with a refined angular momentum-based linear inverted pendulum model (ALIP), our method allows variability in the robot's center of mass height. We employ a dual-strategy controller that merges virtual constraints for precise motion regulation across essential degrees of freedom with an ALIP-centric model predictive control (MPC) framework, aimed at enforcing gait stability. The effectiveness of our feedback design is demonstrated through its application on the Cassie bipedal robot, which features 20 degrees of freedom. Key to our implementation is the development of tailored nominal trajectories and an optimized MPC that reduces the execution time to under 500 microseconds--and, hence, is compatible with Cassie's controller update frequency. This paper not only showcases the successful hardware deployment but also demonstrates a new capability, a bipedal robot using a moving walkway.
Systems and Control (EESS)
Differentiable Discrete Event Simulation for Queuing Network Control
Queuing network control is essential for managing congestion in job-processing systems such as service systems, communication networks, and manufacturing processes. Despite growing interest in applying reinforcement learning (RL) techniques, queueing network control poses distinct challenges, including high stochasticity, large state and action spaces, and lack of stability. To tackle these challenges, we propose a scalable framework for policy optimization based on differentiable discrete event simulation. Our main insight is that by implementing a well-designed smoothing technique for discrete event dynamics, we can compute pathwise policy gradients for large-scale queueing networks using auto-differentiation software (e.g., TensorFlow, PyTorch) and GPU parallelization. Through extensive empirical experiments, we observe that our policy gradient estimators are several orders of magnitude more accurate than typical REINFORCE-based estimators. In addition, we propose a new policy architecture, which drastically improves stability while maintaining the flexibility of neural-network policies. In a wide variety of scheduling and admission control tasks, we demonstrate that training control policies with pathwise gradients leads to a 50-1000x improvement in sample efficiency over state-of-the-art RL methods. Unlike prior tailored approaches to queueing, our methods can flexibly handle realistic scenarios, including systems operating in non-stationary environments and those with non-exponential interarrival/service times.
A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization
Two-stage adaptive robust optimization is a powerful approach for planning under uncertainty that aims to balance costs of "here-and-now" first-stage decisions with those of "wait-and-see" recourse decisions made after uncertainty is realized. To embed robustness against uncertainty, modelers typically assume a simple polyhedral or ellipsoidal set over which contingencies may be realized. However, these simple uncertainty sets tend to yield highly conservative decision-making when uncertainties are high-dimensional. In this work, we introduce AGRO, a column-and-constraint generation algorithm that performs adversarial generation for two-stage adaptive robust optimization using a variational autoencoder. AGRO identifies realistic and cost-maximizing contingencies by optimizing over spherical uncertainty sets in a latent space using a projected gradient ascent approach that differentiates the optimal recourse cost with respect to the latent variable. To demonstrate the cost- and time-efficiency of our approach experimentally, we apply AGRO to an adaptive robust capacity expansion problem for a regional power system and show that AGRO is able to reduce costs by up to 7.8% and runtimes by up to 77% in comparison to the conventional column-and-constraint generation algorithm.
Design of CANSAT for Air Quality Monitoring for an altitude of 900 meters
This paper presents the design and development of NAMBI-VJ, a CANSAT specifically designed for air quality monitoring and stabilization. The CANSAT's cylindrical structure, measuring 310mm in height and 125mm in diameter, is equipped with a mechanical gyroscope for stabilization and a spill-hole parachute for controlled descent. The primary objective of this research is to create a compact, lightweight satellite capable of monitoring air quality parameters such as particulate matter (PM), carbon dioxide (CO2), longitude, and latitude. To achieve this, the CANSAT utilizes Zigbee communication to transmit data to a ground station. Experimental testing involved dropping the CANSAT from an altitude of 900 meters using a drone. The results demonstrate the CANSAT's ability to successfully gather and transmit air quality data, highlighting its potential for environmental monitoring applications.
comment: 6 pages, 6 figures
Advances in Cislunar Periodic Solutions via Taylor Polynomial Maps
In this paper, novel approaches are developed to explore the dynamics of motion in periodic orbits near libration points in cislunar space using the Differential Algebra (DA) framework. The Circular Restricted Three-Body Problem (CR3BP) models the motion, with initial states derived numerically via differential correction. Periodic orbit families are computed using the Pseudo-Arclength Continuation (PAC) method and fitted. Two newly developed polynomial regression models (PRMs) express initial states as functions of predefined parameters and are used in the DA framework to evaluate propagated states. The initial states, expressed via PRM, are propagated in the DA framework using the fourth-order Runge-Kutta (RK4) method. The resultant polynomials of both PRM and DA are employed to develop a control law that shows significantly reduced control effort compared to the traditional tracking control law, demonstrating their potential for cislunar space applications, particularly those requiring computationally inexpensive low-energy transfers.
comment: 20 pages, 19 figures
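For readers unfamiliar with the propagation step named in this abstract, the following is a minimal floating-point sketch of the CR3BP equations of motion (rotating frame, nondimensional units) integrated with the classical fourth-order Runge-Kutta method. It is not the Differential Algebra version, which propagates Taylor polynomials instead of numbers. The mass ratio, the initial state (an arbitrary point, not a periodic orbit from the paper), the step size, and all function names are assumptions made for illustration.

```python
import numpy as np

MU = 0.01215  # assumed approximate Earth-Moon mass ratio


def cr3bp_rhs(s, mu=MU):
    """Time derivative of the state [x, y, z, vx, vy, vz] in the rotating frame."""
    x, y, z, vx, vy, vz = s
    r1 = np.sqrt((x + mu) ** 2 + y ** 2 + z ** 2)      # distance to larger primary
    r2 = np.sqrt((x - 1 + mu) ** 2 + y ** 2 + z ** 2)  # distance to smaller primary
    ax = 2 * vy + x - (1 - mu) * (x + mu) / r1 ** 3 - mu * (x - 1 + mu) / r2 ** 3
    ay = -2 * vx + y - (1 - mu) * y / r1 ** 3 - mu * y / r2 ** 3
    az = -(1 - mu) * z / r1 ** 3 - mu * z / r2 ** 3
    return np.array([vx, vy, vz, ax, ay, az])


def rk4_step(f, s, dt):
    """One classical RK4 step for the autonomous system s' = f(s)."""
    k1 = f(s)
    k2 = f(s + 0.5 * dt * k1)
    k3 = f(s + 0.5 * dt * k2)
    k4 = f(s + dt * k3)
    return s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)


def jacobi_constant(s, mu=MU):
    """Jacobi integral, conserved along exact CR3BP trajectories."""
    x, y, z, vx, vy, vz = s
    r1 = np.sqrt((x + mu) ** 2 + y ** 2 + z ** 2)
    r2 = np.sqrt((x - 1 + mu) ** 2 + y ** 2 + z ** 2)
    return x ** 2 + y ** 2 + 2 * (1 - mu) / r1 + 2 * mu / r2 - (vx ** 2 + vy ** 2 + vz ** 2)


# Arbitrary illustrative initial state; propagate for one nondimensional time unit.
s0 = np.array([0.5, 0.0, 0.0, 0.0, 0.889, 0.0])
s = s0.copy()
for _ in range(1000):
    s = rk4_step(cr3bp_rhs, s, 1e-3)

jacobi_drift = abs(jacobi_constant(s) - jacobi_constant(s0))
```

Monitoring the drift of the Jacobi constant is a common sanity check on the integrator, since the quantity is conserved by the exact dynamics.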
Wind turbine condition monitoring based on intra- and inter-farm federated learning
As wind energy adoption is growing, ensuring the efficient operation and maintenance of wind turbines becomes essential for maximizing energy production and minimizing costs and downtime. Many AI applications in wind energy, such as in condition monitoring and power forecasting, may benefit from using operational data not only from individual wind turbines but from multiple turbines and multiple wind farms. Collaborative distributed AI which preserves data privacy holds a strong potential for these applications. Federated learning has emerged as a privacy-preserving distributed machine learning approach in this context. We explore federated learning in wind turbine condition monitoring, specifically for fault detection using normal behaviour models. We investigate various federated learning strategies, including collaboration across different wind farms and turbine models, as well as collaboration restricted to the same wind farm and turbine model. Our case study results indicate that federated learning across multiple wind turbines consistently outperforms models trained on a single turbine, especially when training data is scarce. Moreover, the amount of historical data necessary to train an effective model can be significantly reduced by employing a collaborative federated learning strategy. Finally, our findings show that extending the collaboration to multiple wind farms may result in inferior performance compared to restricting learning within a farm, specifically when faced with statistical heterogeneity and imbalanced datasets.
Generalizing Linear Graphs and Bond Graph Models with Hetero-functional Graphs for System-of-Systems Engineering Applications
In the 20th century, individual technology products like the generator, telephone, and automobile were connected to form many of the large-scale, complex, infrastructure networks we know today: the power grid, the communication infrastructure, and the transportation system. Progressively, these networked systems began interacting, forming what is now known as systems-of-systems. Because the component systems in the system-of-systems differ, modeling and analysis techniques with primitives applicable across multiple domains or disciplines are needed. For example, linear graphs and bond graphs have been used extensively in the electrical engineering, mechanical engineering, and mechatronic fields to design and analyze a wide variety of engineering systems. In contrast, hetero-functional graph theory (HFGT) has emerged to study many complex engineering systems and systems-of-systems (e.g. electric power, potable water, wastewater, natural gas, oil, coal, multi-modal transportation, mass-customized production, and personalized healthcare delivery systems). This paper seeks to relate hetero-functional graphs to linear graphs and bond graphs and demonstrate that the former is a generalization of the latter two. The contribution is relayed in three stages. First, the three modeling techniques are compared conceptually. Next, these techniques are contrasted on six example systems: (a) an electrical system, (b) a translational mechanical system, (c) a rotational mechanical system, (d) a fluidic system, (e) a thermal system, and (f) a multi-energy (electro-mechanical) system. Finally, this paper proves mathematically that hetero-functional graphs are a formal generalization of both linear graphs and bond graphs.
comment: This paper seeks to show that hetero-functional graphs generalize linear graphs and bond graphs. In order to do so, we needed to demonstrate all three graph techniques for six energy domains. We also add a mathematical proof. The resulting paper, although lengthier than normal, makes a significant theoretical as well as pedagogical contribution
Large-Area Conductor-Loaded PDMS Dielectric Composites for High-Sensitivity Wireless and Chipless Electromagnetic Temperature Sensors
Wireless electromagnetic sensing is a passive, non-destructive technique for measuring physical and chemical changes through permittivity or conductivity changes. However, the readout is limited by the sensitivity of the materials, often requiring complex sampling, particularly in the GHz range. We report capacitive dielectric temperature sensors based on polydimethylsiloxane (PDMS) loaded with 10 vol% of inexpensive, commercially available conductive fillers including copper powder (Cu), graphite powder (GP) and milled carbon fibre powder (CF). The sensors are tested in the range of 20°C to 110°C, with enhanced sensitivity from 20 to 60°C, and relative response of up to 85.5% at 200 MHz for CF loaded capacitors. Additionally, we demonstrate that operating frequency influences the relative sensing response by as much as 15.0% in loaded composite capacitors. Finally, we demonstrate the suitability of PDMS-CF capacitors as a sensing element in wirelessly coupled chipless resonant coils tuned to 6.78 MHz with a readout response, in the resonant frequency of the sensor. The wireless readout of the PDMS-CF chipless system exhibited an average sensitivity of 0.38 %/°C, which is a 40x improvement over a pristine PDMS-based capacitive sensor and outperforms state of the art frequency-domain radio frequency (RF) temperature sensors including carbon-based composites at higher loadings. Exploiting the high sensitivity, we interrogate the sensor wirelessly using a low-cost and portable open-source NanoVNA demonstrating a relative response in the resonant frequency of the reader coil of 48.5%, with the response in good agreement with the instrumentation-grade vector network analyzers (VNAs), demonstrating that the sensors could be integrated into an inexpensive and portable measurement setup that does not rely on specialty equipment or highly trained operators.
comment: Main body 24 pages, 8 figures, one table. Supplemental information 10 pages, 11 figures, two tables
Nonlinear identifiability of directed acyclic graphs with partial excitation and measurement
We analyze the identifiability of directed acyclic graphs in the case of partial excitation and measurement. We consider an additive model where the nonlinear functions located in the edges depend only on a past input, and we analyze the identifiability problem in the class of pure nonlinear functions satisfying $f(0)=0$. We show that any identification pattern (set of measured nodes and set of excited nodes) requires the excitation of sources, measurement of sinks and the excitation or measurement of the other nodes. Then, we show that a directed acyclic graph (DAG) is identifiable with a given identification pattern if and only if it is identifiable with the measurement of all the nodes. Next, we analyze the case of trees where we prove that any identification pattern guarantees the identifiability of the network. Finally, by introducing the notion of a generic nonlinear network matrix, we provide sufficient conditions for the identifiability of DAGs based on the notion of vertex-disjoint paths.
comment: 7 pages, 6 figures, to appear in IEEE Conference on Decision and Control (CDC 2024)
Maximum likelihood inference for high-dimensional problems with multiaffine variable relations
Maximum Likelihood Estimation of continuous variable models can be very challenging in high dimensions, due to potentially complex probability distributions. The existence of multiple interdependencies among variables can make it very difficult to establish convergence guarantees. This leads to a wide use of brute-force methods, such as grid searching and Monte-Carlo sampling and, when applicable, complex and problem-specific algorithms. In this paper, we consider inference problems where the variables are related by multiaffine expressions. We propose a novel Alternating and Iteratively-Reweighted Least Squares (AIRLS) algorithm, and prove its convergence for problems with Generalized Normal Distributions. We also provide an efficient method to compute the variance of the estimates obtained using AIRLS. Finally, we show how the method can be applied to graphical statistical models. We perform numerical experiments on several inference problems, showing significantly better performance than state-of-the-art approaches in terms of scalability, robustness to noise, and convergence speed due to an empirically observed super-linear convergence rate.
An Effective Current Limiting Strategy to Enhance Transient Stability of Virtual Synchronous Generator
Virtual synchronous generator (VSG) control has emerged as a crucial technology for integrating renewable energy sources. However, renewable energy sources have limited tolerance to overcurrent, necessitating the implementation of current limiting (CL) strategies to mitigate the overcurrent. The introduction of different CL strategies can have varying impacts on the system. While previous studies have discussed the effects of different CL strategies on the system, they lack intuitive and explicit explanations. Meanwhile, previous CL strategies have failed to effectively ensure the stability of the system. In this paper, the Equal Proportional Area Criterion (EPAC) method is employed to intuitively explain how different CL strategies affect transient stability. Based on this, an effective current limiting strategy is proposed. Simulations are conducted in MATLAB/Simulink to validate the proposed strategy. The simulation results demonstrate that the proposed effective CL strategy exhibits superior stability.
comment: 2024 IEEE Energy Conversion Congress and Exposition (ECCE)
An innovation-based cycle-slip, multipath estimation, detection and mitigation method for tightly coupled GNSS/INS/Vision navigation in urban areas
Precise, consistent, and reliable positioning is crucial for a multitude of uses. In order to achieve high precision global positioning services, multi-sensor fusion techniques, such as the Global Navigation Satellite System (GNSS)/Inertial Navigation System (INS)/Vision integration system, combine the strengths of various sensors. This technique is essential for localization in complex environments and has been widely used in the mass market. However, frequent signal deterioration and blocking in urban environments exacerbate the degradation of GNSS positioning and negatively impact the performance of the multi-sensor integration system. For GNSS pseudorange and carrier phase observation data in the urban environment, we offer an innovation-based cycle slip/multipath estimation, detection, and mitigation (I-EDM) method to reduce the influence of multipath effects and cycle slips on location induced by obstruction in urban settings. The method obtains the innovations of GNSS observations with the cluster analysis method. Then the innovations are used to detect the cycle slips and multipath. Compared with the residual-based method, the innovation-based method avoids the residual overfitting caused by the least square method, resulting in better detection of outliers within the GNSS observations. The vehicle tests carried out in urban settings verify the proposed approach. Experimental results indicate that an accuracy of 0.23 m, 0.11 m, and 0.31 m in the east, north and up components can be achieved by the GNSS/INS/Vision tightly coupled system with the I-EDM method, which has a maximum of 21.6% improvement when compared with the residual-based EDM (R-EDM) method.
Recursive Quantization for $\mathcal{L}_2$ Stabilization of a Finite Capacity Stochastic Control Loop with Intermittent State Observations
The problem of $\mathcal{L}_2$ stabilization of a state feedback stochastic control loop is investigated under different constraints. The discrete time linear time invariant (LTI) open loop plant is chosen to be unstable. The additive white Gaussian noise is assumed to be stationary. The link between the plant and the controller is assumed to be a finite capacity stationary channel, which puts a constraint on the bit rate of the transmission. Moreover, the state of the plant is observed only intermittently keeping the loop open some of the time. In this manuscript both scalar and vector plants under Bernoulli and Markov intermittence models are investigated. Novel bounds on intermittence parameters are obtained to ensure $\mathcal{L}_2$ stability. Moreover, novel recursive quantization algorithms are developed to implement the stabilization scheme under all the constraints. Suitable illustrative examples are provided to elucidate the main results.
On Using Curved Mirrors to Decrease Shadowing in VLC
Visible light communication (VLC) complements radio frequency in indoor environments with large wireless data traffic. However, VLC is hindered by dramatic path losses when an opaque object is interposed between the transmitter and the receiver. Prior works propose the use of plane mirrors as optical reconfigurable intelligent surfaces (ORISs) to enhance communications through non-line-of-sight links. Plane mirrors rely on their orientation to forward the light to the target user location, which is challenging to implement in practice. This paper studies the potential of curved mirrors as static reflective surfaces to provide a broadening specular reflection that increases the signal coverage in mirror-assisted VLC scenarios. We study the behavior of paraboloid and semi-spherical mirrors and derive the irradiance equations. We provide extensive numerical and analytical results and show that curved mirrors, when developed with proper dimensions, may reduce the shadowing probability to zero, while static plane mirrors of the same size have shadowing probabilities larger than 65%. Furthermore, the signal-to-noise ratio offered by curved mirrors may suffice to provide connectivity to users deployed in the room even when a line-of-sight link blockage occurs.
comment: Accepted to be published in IEEE Globecom 2024
Robust synchronization and policy adaptation for networked heterogeneous agents
We propose a robust adaptive online synchronization method for leader-follower networks of nonlinear heterogeneous agents with system uncertainties and input magnitude saturation. Synchronization is achieved using a Distributed input Magnitude Saturation Adaptive Control with Reinforcement Learning (DMSAC-RL), which improves the empirical performance of policies trained on off-the-shelf models using Reinforcement Learning (RL) strategies. The leader observes the performance of a reference model, and followers observe the states and actions of the agents they are connected to, but not the reference model. The leader and followers may differ from the reference model in which the RL control policy was trained. DMSAC-RL uses an internal loop that adjusts the learned policy for the agents in the form of augmented input to solve the distributed control problem, including input-matched uncertainty parameters. We show that the synchronization error of the heterogeneous network is Uniformly Ultimately Bounded (UUB). Numerical analysis of a network of Multiple Input Multiple Output (MIMO) systems supports our theoretical findings.
comment: 30 pages, 12 figures, conference paper
Dual-TSST: A Dual-Branch Temporal-Spectral-Spatial Transformer Model for EEG Decoding
The decoding of electroencephalography (EEG) signals allows convenient access to user intentions, which plays an important role in the field of human-machine interaction. To effectively extract sufficient characteristics of the multichannel EEG, a novel decoding architecture network with a dual-branch temporal-spectral-spatial transformer (Dual-TSST) is proposed in this study. Specifically, by utilizing convolutional neural networks (CNNs) on different branches, the proposed processing network first extracts the temporal-spatial features of the original EEG and the temporal-spectral-spatial features of time-frequency domain data converted by wavelet transformation, respectively. These perceived features are then integrated by a feature fusion block, which serves as the input of the transformer to capture the global long-range dependencies entailed in the non-stationary EEG, and are classified via global average pooling and multi-layer perceptron blocks. To evaluate the efficacy of the proposed approach, comparative experiments are conducted on three publicly available datasets, BCI IV 2a, BCI IV 2b, and SEED, in head-to-head comparison with more than ten other state-of-the-art methods. The proposed Dual-TSST achieves superior performance in various tasks, with average EEG classification accuracies of 80.67% on BCI IV 2a, 88.64% on BCI IV 2b, and 96.65% on SEED. Extensive ablation experiments conducted between the Dual-TSST and a comparative baseline model also reveal the enhanced decoding performance contributed by each module of the proposed method. This study provides a new approach to high-performance EEG decoding, and has great potential for future CNN-Transformer based applications.
Grid-Forming Storage Networks: Analytical Characterization of Damping and Design Insights
The paper presents a theoretical study on small-signal stability and damping in bulk power systems with multiple grid-forming inverter-based storage resources. A detailed analysis is presented, characterizing the impacts of inverter droop gains and storage size on the slower eigenvalues, particularly those concerning inter-area oscillation modes. From these parametric sensitivity studies, a set of necessary conditions are derived that the design of droop gain must satisfy to enhance damping performance. The analytical findings are structured into propositions highlighting potential design considerations for improving system stability. The findings are illustrated via numerical studies on an IEEE 68-bus grid-forming storage network.
comment: accepted for presentation at The 63rd IEEE Conference on Decision and Control
Robust Q-Learning under Corrupted Rewards
Recently, there has been a surge of interest in analyzing the non-asymptotic behavior of model-free reinforcement learning algorithms. However, the performance of such algorithms in non-ideal environments, such as in the presence of corrupted rewards, is poorly understood. Motivated by this gap, we investigate the robustness of the celebrated Q-learning algorithm to a strong-contamination attack model, where an adversary can arbitrarily perturb a small fraction of the observed rewards. We start by proving that such an attack can cause the vanilla Q-learning algorithm to incur arbitrarily large errors. We then develop a novel robust synchronous Q-learning algorithm that uses historical reward data to construct robust empirical Bellman operators at each time step. Finally, we prove a finite-time convergence rate for our algorithm that matches known state-of-the-art bounds (in the absence of attacks) up to a small inevitable $O(\varepsilon)$ error term that scales with the adversarial corruption fraction $\varepsilon$. Notably, our results continue to hold even when the true reward distributions have infinite support, provided they admit bounded second moments.
comment: Accepted to the Decision and Control Conference (CDC) 2024
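To make the idea of a robust empirical Bellman operator in the Robust Q-Learning abstract above more concrete, the following is a minimal sketch. The abstract does not say which robust estimator the authors use, so the median over the reward history is an assumption swapped in here for illustration, as are the deterministic next state, the reward values, and all function names.

```python
from statistics import mean, median

def bellman_backup(reward_estimate, next_q_values, gamma):
    """One Bellman backup for a single state-action pair.

    Assumes a deterministic next state whose Q-values are next_q_values.
    """
    return reward_estimate + gamma * max(next_q_values)

# Hypothetical reward history for one state-action pair.
# The last observation plays the role of an adversarially corrupted reward.
history = [1.0, 1.1, 0.9, 1.0, 1000.0]
next_q = [2.0, 3.0]
gamma = 0.9

# Vanilla estimate: the sample mean is dragged far away by one corruption.
naive_target = bellman_backup(mean(history), next_q, gamma)

# Robust estimate (illustrative choice): the median ignores the outlier.
robust_target = bellman_backup(median(history), next_q, gamma)

print(naive_target, robust_target)
```

In this toy setting a single corrupted sample shifts the mean-based target by two orders of magnitude, while the median-based target stays near the uncorrupted value.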
Memristors based Computation and Synthesis
The memristor was identified as the fourth fundamental circuit element by Dr. Leon Chua in 1971, and since then it has attracted considerable interest because of its non-volatility and is considered a viable solution for computation in the beyond-CMOS era. Recently, memristors have been used to perform basic logic operations such as AND, OR, NAND, NOR, and XOR, and in applications such as the Dot Product Engine and Convolutional Neural Networks. This paper presents a new behavioural model of the memristor and then uses it to build a 32-bit ripple carry adder. The paper then compares the area, power, and time delay of the memristor-based 32-bit ripple carry adder with those of 45nm CMOS technology and highlights its advantages and pitfalls.
Model Predictive Online Trajectory Planning for Adaptive Battery Discharging in Fuel Cell Vehicle
This paper presents an online trajectory planning approach for optimal coordination of Fuel Cell (FC) and battery in plug-in Hybrid Electric Vehicle (HEV). One of the main challenges in energy management of plug-in HEV is generating State-of-Charge (SOC) reference curves by optimally depleting battery under high uncertainties in driving scenarios. Recent studies have begun to explore the potential of utilizing partial trip information for optimal SOC trajectory planning, but dynamic responses of the FC system are not taken into account. On the other hand, research focusing on dynamic operation of FC systems often focuses on air flow management, and battery has been treated only partially. Our aim is to fill this gap by designing an online trajectory planner for dynamic coordination of FC and battery systems that works with a high-level SOC planner in a hierarchical manner. We propose an iterative LQR based online trajectory planning method where the amount of electricity dischargeable at each driving segment can be explicitly and adaptively specified by the high-level planner. Numerical results are provided as a proof of concept example to show the effectiveness of the proposed approach.
Data-based approaches to learning and control by similarity between heterogeneous systems
This paper proposes basic definitions of similarity and similarity indexes between admissible behaviors of heterogeneous host and guest systems and further presents a similarity-based learning control framework by exploiting offline sampled data. By exploring helpful geometric properties of the admissible behavior and decomposing it into the subspace and offset components, the similarity indexes between two admissible behaviors are defined as the principal angles between their corresponding subspace components. By reconstructing the admissible behaviors leveraging sampled data, an efficient strategy for calculating the similarity indexes is developed, based on which a similarity-based learning control framework is proposed. It is shown that the host system can directly accomplish the same control tasks by utilizing the successful experience from the guest system, without having to undergo the trial-and-error process.
InfraLib: Enabling Reinforcement Learning and Decision Making for Large Scale Infrastructure Management
Efficient management of infrastructure systems is crucial for economic stability, sustainability, and public safety. However, infrastructure management is challenging due to the vast scale of systems, stochastic deterioration of components, partial observability, and resource constraints. While data-driven approaches like reinforcement learning (RL) offer a promising avenue for optimizing management policies, their application to infrastructure has been limited by the lack of suitable simulation environments. We introduce InfraLib, a comprehensive framework for modeling and analyzing infrastructure management problems. InfraLib employs a hierarchical, stochastic approach to realistically model infrastructure systems and their deterioration. It supports practical functionality such as modeling component unavailability, cyclical budgets, and catastrophic failures. To facilitate research, InfraLib provides tools for expert data collection, simulation-driven analysis, and visualization. We demonstrate InfraLib's capabilities through case studies on a real-world road network and a synthetic benchmark with 100,000 components.
CyberDep: Towards the Analysis of Cyber-Physical Power System Interdependencies Using Bayesian Networks and Temporal Data
Modern-day power systems have become increasingly cyber-physical due to the ongoing developments to the grid that include the rise of distributed energy generation and the increase of the deployment of many cyber devices for monitoring and control, such as the Supervisory Control and Data Acquisition (SCADA) system. Such capabilities have made the power system more vulnerable to cyber-attacks that can harm the physical components of the system. As such, it is of utmost importance to study both the physical and cyber components together, focusing on characterizing and quantifying the interdependency between these components. This paper focuses on developing an algorithm, named CyberDep, for Bayesian network generation through conditional probability calculations of cyber traffic flows between system nodes. Additionally, CyberDep is implemented on the temporal data of the cyber-physical emulation of the WSCC 9-bus power system. The results of this work provide a visual representation of the probabilistic relationships within the cyber and physical components of the system, aiding in cyber-physical interdependency quantification.
comment: Accepted and Presented at the 2024 Kansas Power and Energy Conference (KPEC 2024)
Autonomous Drifting Based on Maximal Safety Probability Learning
This paper proposes a novel learning-based framework for autonomous driving based on the concept of maximal safety probability. Efficient learning requires rewards that are informative of desirable/undesirable states, but such rewards are challenging to design manually due to the difficulty of differentiating better states among many safe states. On the other hand, learning policies that maximize safety probability does not require laborious reward shaping but is numerically challenging because the algorithms must optimize policies based on binary rewards sparse in time. Here, we show that physics-informed reinforcement learning can efficiently learn this form of maximally safe policy. Unlike existing drift control methods, our approach does not require a specific reference trajectory or complex reward shaping, and can learn safe behaviors only from sparse binary rewards. This is enabled by the use of the physics loss that plays an analogous role to reward shaping. The effectiveness of the proposed approach is demonstrated through lane keeping in a normal cornering scenario and safe drifting in a high-speed racing scenario.
comment: arXiv admin note: text overlap with arXiv:2403.16391
Non-stationary and Sparsely-correlated Multi-output Gaussian Process with Spike-and-Slab Prior
Multi-output Gaussian process (MGP) is commonly used as a transfer learning method to leverage information among multiple outputs. A key advantage of MGP is providing uncertainty quantification for prediction, which is highly important for subsequent decision-making tasks. However, traditional MGP may not be sufficiently flexible to handle multivariate data with dynamic characteristics, particularly when dealing with complex temporal correlations. Additionally, since some outputs may lack correlation, transferring information among them may lead to negative transfer. To address these issues, this study proposes a non-stationary MGP model that can capture both the dynamic and sparse correlation among outputs. Specifically, the covariance functions of MGP are constructed using convolutions of time-varying kernel functions. Then a dynamic spike-and-slab prior is placed on correlation parameters to automatically decide which sources are informative to the target output in the training process. An expectation-maximization (EM) algorithm is proposed for efficient model fitting. Both numerical studies and a real case demonstrate its efficacy in capturing dynamic and sparse correlation structure and mitigating negative transfer for high-dimensional time-series data. Finally, a mountain-car reinforcement learning case highlights its potential application in decision making problems.
Envisioning an Optimal Network of Space-Based Lasers for Orbital Debris Remediation
The rapid increase in resident space objects, including satellites and orbital debris, threatens the safety and sustainability of space missions. This paper explores orbital debris remediation using laser ablation with a network of collaborative space-based lasers. A novel delta-v vector analysis framework quantifies the effects of multiple simultaneous laser-to-debris (L2D) engagements by leveraging a vector composition of imparted delta-v vectors. The paper introduces the Concurrent Location-Scheduling Problem (CLSP), which optimizes the placement of laser platforms and schedules L2D engagements to maximize debris remediation capacity. Due to the computational complexity of CLSP, it is decomposed into two sequential subproblems: (1) optimal laser platform locations are determined using the Maximal Covering Location Problem, and (2) a novel integer linear programming-based approach schedules L2D engagements within the network configuration to maximize remediation capacity. Computational experiments are conducted to evaluate the proposed framework's effectiveness under various mission scenarios, demonstrating key network functions such as collaborative nudging, deorbiting, and just-in-time collision avoidance. A cost-benefit analysis further explores how varying the number and distribution of laser platforms affects debris remediation capacity, providing insights into optimizing the performance of space-based laser networks.
comment: 41 pages, 13 figures, submitted to the Journal of Spacecraft and Rockets
Vehicular Resilient Control Strategy for a Platoon of Self-Driving Vehicles under DoS Attack
In a platoon, multiple autonomous vehicles engage in data exchange to navigate toward their intended destination. Within this network, a designated leader shares its status information with followers based on a predefined communication graph. However, these vehicles are susceptible to disturbances, leading to deviations from their intended routes. Denial-of-service (DoS) attacks, a significant type of cyber threat, can impact the motion of the leader. This paper addresses the destabilizing effects of DoS attacks on platoons and introduces a novel vehicular resilient control strategy to restore stability. Upon detecting and measuring a DoS attack, modeled with a time-varying delay, the proposed method initiates a process to retrieve the attacked leader. Through a newly designed switching system, the attacked leader transitions to a follower role, and a new leader is identified within a restructured platoon configuration, enabling the platoon to maintain consensus. Specifically, in the event of losing the original leader due to a DoS attack, the remaining vehicles do experience destabilization. They adapt their motions as a cohesive network through a distributed resilient controller. The effectiveness of the proposed approach is validated through an illustrative case study, showing its applicability in real-world scenarios.
comment: 9 pages
A Nonlinear Controller for Parallel DC-DC Converters with ZIP Load and Constrained Output Voltage
In this paper, an adaptive nonlinear controller is designed for a parallel DC-DC converter system that feeds an unknown ZIP load, characterized by constant impedance (Z), constant current (I), and constant power (P), at the DC bus. The proposed controller ensures simultaneous voltage adjustment and power sharing in the large signal sense despite uncertainties in ZIP loads, DC input voltages, and other electrical parameters. To keep the output voltage within a desired range, we utilize a barrier function that is invertible, smoothly continuous, and strictly increasing. Its limits at infinity represent the upper and lower bounds for the output voltage. We apply the invertible transformation of the barrier function to the output voltage and then design the controller using the adaptive backstepping method. Using this barrier-function-based adaptive backstepping controller, uncertain parameters are identified on-line, and the voltage adjustment and power sharing objectives are established. Moreover, the voltage constraint is not violated even in the presence of sudden and unknown large load variations. The efficiency of the proposed nonlinear controller is evaluated through simulations of a parallel DC-DC converter system using the MATLAB/Simscape Electrical environment.
Analytical Optimized Traffic Flow Recovery for Large-scale Urban Transportation Network
The implementation of intelligent transportation systems (ITS) has enhanced data collection in urban transportation through advanced traffic sensing devices. However, the high costs associated with installation and maintenance result in sparse traffic data coverage. To obtain complete, accurate, and high-resolution network-wide traffic flow data, this study introduces the Analytical Optimized Recovery (AOR) approach that leverages abundant GPS speed data alongside sparse flow data to estimate traffic flow in large-scale urban networks. The method formulates a constrained optimization framework that utilizes a quadratic objective function with l2 norm regularization terms to address the traffic flow recovery problem effectively and incorporates a Lagrangian relaxation technique to maintain non-negativity constraints. The effectiveness of this approach was validated in a large urban network in Shenzhen's Futian District using the Simulation of Urban MObility (SUMO) platform. Analytical results indicate that the method achieves low estimation errors, affirming its suitability for comprehensive traffic analysis in urban settings with limited sensor deployment.
comment: 27 pages, 13 figures
Data-informativity conditions for structured linear systems with implications for dynamic networks
When estimating models of a multivariable dynamic system, a typical condition for consistency is to require the input signals to be persistently exciting, which is guaranteed if the input spectrum is positive definite for a sufficient number of frequencies. In this paper it is investigated how such a condition can be relaxed by exploiting prior structural information on the multivariable system, such as structural zero elements in the transfer matrix or entries that are a priori known and therefore not parametrized. It is shown that in particular situations the data-informativity condition can be decomposed into different MISO (multiple input single output) situations, leading to relaxed conditions for the MIMO (multiple input multiple output) model. When estimating a single module in a linear dynamic network, the data-informativity conditions can generically be formulated as path-based conditions on the graph of the network. The new relaxed conditions for data-informativity will then also lead to relaxed path-based conditions on the network graph. Additionally, the new expressions are shown to be closely related to earlier derived conditions for (generic) single module identifiability.
comment: 15 pages, 3 figures
Inferring Global Exponential Stability Properties using Lie-bracket Approximations
In the present paper, a novel result for inferring uniform global, not semi-global, exponential stability in the sense of Lyapunov with respect to input-affine systems from global uniform exponential stability properties with respect to their associated Lie-bracket systems is shown. The result is applied to adapt dither frequencies to find a sufficiently high gain in adaptive control of linear unknown systems, and a simple numerical example is simulated to support the theoretical findings.
comment: Extended Version
Distributionally Robust Control for Chance-Constrained Signal Temporal Logic Specifications
We consider distributionally robust optimal control of stochastic linear systems under signal temporal logic (STL) chance constraints when the disturbance distribution is unknown. By assuming that the underlying predicate functions are Lipschitz continuous and the noise realizations are drawn from a distribution having a concentration of measure property, we first formulate the underlying chance-constrained control problem as stochastic programming with constraints on expectations and propose a solution using a distributionally robust approach based on the Wasserstein metric. We show that by choosing a proper Wasserstein radius, the original chance-constrained optimization can be satisfied with a user-defined confidence level. A numerical example illustrates the efficacy of the method.
comment: 8 pages and 1 figure
Combined Plant and Control Co-design via Solutions of Hamilton-Jacobi-Bellman Equation Based on Physics-informed Learning
This paper addresses integrated design of engineering systems, where the physical structure of the plant and the controller design are optimized simultaneously. To cope with uncertainties due to noises acting on the dynamics and modeling errors, an Uncertain Control Co-design (UCCD) problem formulation is proposed. Existing UCCD methods usually rely on uncertainty propagation analyses using Monte Carlo methods for open-loop solutions of optimal control, which suffer from stringent trade-offs among accuracy, time horizon, and computational time. The proposed method utilizes closed-loop solutions characterized by the Hamilton-Jacobi-Bellman equation, a Partial Differential Equation (PDE) defined on the state space. A solution algorithm for the proposed UCCD formulation is developed based on PDE solutions of Physics-informed Neural Networks (PINNs). Numerical examples of regulator design problems are provided, and it is shown that simultaneously updating the PINN weights and the design parameters works effectively for solving UCCD problems.
Simultaneous compensation of input delay and state/input quantization for linear systems via switched predictor feedback
We develop a switched predictor-feedback law, which achieves global asymptotic stabilization of linear systems with input delay and with the plant and actuator states available only in (almost) quantized form. The control design relies on a quantized version of the nominal predictor-feedback law for linear systems, in which quantized measurements of the plant and actuator states enter the predictor state formula. A switching strategy is constructed to dynamically adjust the tunable parameter of the quantizer (in a piecewise constant manner), in order to initially increase the range and subsequently decrease the error of the quantizers. The key element in the proof of global asymptotic stability in the supremum norm of the actuator state is derivation of solutions' estimates combining a backstepping transformation with small-gain and input-to-state stability arguments, for addressing the error due to quantization. We extend this result to the input quantization case and illustrate our theory with a numerical example.
comment: 12 pages, 15 figures, Systems & Control Letters
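The switching strategy in the switched predictor-feedback abstract above adjusts a tunable quantizer parameter, first to enlarge the range and then to shrink the error. The sketch below shows this trade-off for a generic scaled uniform quantizer. It is not the quantizer from the paper: the uniform law, the number of levels, the step size, and the name `quantize` are all assumptions chosen for illustration.

```python
def quantize(x, mu, levels=4, step=1.0):
    """Scaled uniform quantizer (illustrative, not the paper's definition).

    The scaled input x / mu is rounded to the nearest multiple of `step`
    and saturated at +/- levels * step. Hence the range is
    mu * levels * step and, inside that range, the error is at most
    mu * step / 2. A larger mu widens the range; a smaller mu reduces
    the error.
    """
    z = x / mu
    q = round(z / step) * step
    limit = levels * step
    q = max(-limit, min(limit, q))
    return mu * q

x = 9.0
saturated = quantize(x, mu=1.0)   # range is 4, so the value saturates
zoomed_out = quantize(x, mu=4.0)  # range is 16, error bound is 2
fine = quantize(0.26, mu=0.1)     # range is 0.4, error bound is 0.05

print(saturated, zoomed_out, fine)
```

Increasing `mu` corresponds to the initial phase in which the state must be brought inside the quantizer range, and decreasing `mu` corresponds to the later phase in which the quantization error is driven down.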
Using matrix sparsification to solve tropical linear vector equations
A linear vector equation in two unknown vectors is examined in the framework of tropical algebra dealing with the theory and applications of semirings and semifields with idempotent addition. We consider a two-sided equation where each side is a tropical product of a given matrix by one of the unknown vectors. We use a matrix sparsification technique to reduce the equation to a set of vector inequalities that involve row-monomial matrices obtained from the given matrices. An existence condition of solutions for the inequalities is established, and a direct representation of the solutions is derived in a compact vector form. To illustrate the proposed approach and to compare the obtained result with that of an existing solution procedure, we apply our solution technique to handle two-sided equations known in the literature. Finally, a computational scheme based on the approach to derive all solutions of the two-sided equation is discussed.
comment: 16 pages
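For readers unfamiliar with the two-sided tropical equation in the abstract above, the following sketch checks whether a given pair of vectors satisfies it. It assumes the max-plus semifield (addition is max, multiplication is ordinary +), which is one common instance of tropical algebra; the abstract does not fix the semifield, and the matrices, vectors, and function names are hypothetical. The sketch only verifies candidate solutions and does not implement the sparsification technique of the paper.

```python
def maxplus_matvec(matrix, vector):
    """Tropical (max-plus) matrix-vector product.

    Entry i of the result is max over j of matrix[i][j] + vector[j].
    """
    return [max(a + v for a, v in zip(row, vector)) for row in matrix]

def is_solution(a, b, x, y):
    """Check the two-sided equation a (x) x = b (x) y in max-plus algebra."""
    return maxplus_matvec(a, x) == maxplus_matvec(b, y)

# Hypothetical 2x2 example with integer entries (exact comparison is safe).
A = [[0, 1], [2, 0]]
B = [[1, 0], [0, 2]]

print(maxplus_matvec(A, [0, 0]))         # left-hand side for x = (0, 0)
print(is_solution(A, B, [0, 0], [0, 0]))
print(is_solution(A, B, [3, 3], [3, 3]))  # tropical scaling of a solution
print(is_solution(A, B, [0, 0], [1, 0]))
```

Adding the same constant to every entry of both vectors is tropical scalar multiplication, so a scaled solution pair remains a solution.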
Finite Sample Frequency Domain Identification
We study non-parametric frequency-domain system identification from a finite-sample perspective. We assume an open loop scenario where the excitation input is periodic and consider the Empirical Transfer Function Estimate (ETFE), where the goal is to estimate the frequency response at certain desired (evenly-spaced) frequencies, given input-output samples. We show that under sub-Gaussian colored noise (in time-domain) and stability assumptions, the ETFE estimates are concentrated around the true values. The error rate is of the order of $\mathcal{O}((d_{\mathrm{u}}+\sqrt{d_{\mathrm{u}}d_{\mathrm{y}}})\sqrt{M/N_{\mathrm{tot}}})$, where $N_{\mathrm{tot}}$ is the total number of samples, $M$ is the number of desired frequencies, and $d_{\mathrm{u}},\,d_{\mathrm{y}}$ are the dimensions of the input and output signals respectively. This rate remains valid for general irrational transfer functions and does not require a finite order state-space representation. By tuning $M$, we obtain a $N_{\mathrm{tot}}^{-1/3}$ finite-sample rate for learning the frequency response over all frequencies in the $ \mathcal{H}_{\infty}$ norm. Our result draws upon an extension of the Hanson-Wright inequality to semi-infinite matrices. We study the finite-sample behavior of ETFE in simulations.
comment: Version 2 changes: several typos were fixed and some proof steps were expanded
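As a rough illustration of the ETFE studied in the abstract above, the sketch below estimates a scalar frequency response as the ratio of the output and input DFT coefficients over one period of a periodic excitation. It is a noise-free toy case: the two-tap FIR system, the periodic impulse input, the period length, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math

def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex list."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def etfe(u, y, tol=1e-9):
    """Empirical transfer function estimate at the DFT frequencies.

    Returns {k: Y[k] / U[k]} for every bin k where the input is excited.
    """
    U, Y = dft(u), dft(y)
    return {k: Y[k] / U[k] for k in range(len(u)) if abs(U[k]) > tol}

def periodic_fir_response(u, h):
    """Steady-state output of an FIR filter h driven by the periodic input u
    (circular convolution over one period)."""
    n = len(u)
    return [sum(h[j] * u[(t - j) % n] for j in range(len(h))) for t in range(n)]

# Hypothetical system y[t] = u[t] + 0.5 * u[t - 1], one period of length 8.
h = [1.0, 0.5]
n = 8
u = [1.0] + [0.0] * (n - 1)  # periodic impulse excites every frequency bin
y = periodic_fir_response(u, h)

g_hat = etfe(u, y)
for k in sorted(g_hat):
    print(k, g_hat[k])
```

Without noise the estimate coincides with the true response $1 + 0.5\,e^{-j 2\pi k / n}$ at each bin; the finite-sample rates in the abstract concern how far the estimate can deviate once colored noise is present.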
On Computation of Approximate Solutions to Large-Scale Backstepping Kernel Equations via Continuum Approximation
We provide two methods for computation of continuum backstepping kernels that arise in control of continua (ensembles) of linear hyperbolic PDEs and which can approximate backstepping kernels arising in control of a large-scale, PDE system counterpart (with computational complexity that does not grow with the number of state components of the large-scale system). In the first method, we provide explicit formulae for the solution to the continuum kernels PDEs, employing a (triple) power series representation of the continuum kernel and establishing its convergence properties. In this case, we also provide means for reducing computational complexity by properly truncating the power series (in the powers of the ensemble variable). In the second method, we identify a class of systems for which the solution to the continuum (and hence, also an approximate solution to the respective large-scale) kernel equations can be constructed in closed form. We also present numerical examples to illustrate computational efficiency/accuracy of the approaches, as well as to validate the stabilization properties of the approximate control kernels, constructed based on the continuum.
comment: 15 pages, 5 figures, submitted to Systems & Control Letters, MATLAB implementation of Algorithm 1 available at https://github.com/jphumaloja/Continuum-Kernels-Power-Series/
Parameter Dependent Robust Control Invariant Sets for LPV Systems with Bounded Parameter Variation Rate
Real-time measurements of the scheduling parameter of linear parameter-varying (LPV) systems enable the synthesis of robust control invariant (RCI) sets and parameter-dependent controllers inducing invariance. We present a method to synthesize parameter-dependent robust control invariant (PD-RCI) sets for LPV systems with bounded parameter variation, in which invariance is induced using PD-vertex control laws. The PD-RCI sets are parameterized as configuration-constrained polytopes that admit a joint parameterization of their facets and vertices. The proposed sets and associated control laws are computed by solving a single semidefinite programming (SDP) problem. Through numerical examples, we demonstrate that the proposed method outperforms state-of-the-art methods for synthesizing PD-RCI sets, both with respect to conservativeness and computational load.
comment: 8 pages, 6 figures
Online learning for robust voltage control under uncertain grid topology
Voltage control generally requires accurate information about the grid's topology in order to guarantee network stability. However, accurate topology identification is challenging for existing methods, especially as the grid is subject to increasingly frequent reconfiguration due to the adoption of renewable energy. In this work, we combine a nested convex body chasing algorithm with a robust predictive controller to achieve provably finite-time convergence to safe voltage limits in the online setting where there is uncertainty in both the network topology as well as load and generation variations. In an online fashion, our algorithm narrows down the set of possible grid models that are consistent with observations and adjusts reactive power generation accordingly to keep voltages within desired safety limits. Our approach can also incorporate existing partial knowledge of the network to improve voltage control performance. We demonstrate the effectiveness of our approach in a case study on a Southern California Edison 56-bus distribution system. Our experiments show that in practical settings, the controller is indeed able to narrow the set of consistent topologies quickly enough to make control decisions that ensure stability in both linearized and realistic non-linear models of the distribution grid.
comment: Published in IEEE Transactions on Smart Grid, vol. 15, no. 5, pp. 4754-4764, Sept. 2024. arXiv admin note: substantial text overlap with arXiv:2206.14369
Demonstrating a Robust Walking Algorithm for Underactuated Bipedal Robots in Non-flat, Non-stationary Environments
This work explores an innovative algorithm designed to enhance the mobility of underactuated bipedal robots across challenging terrains, especially when navigating through spaces with constrained opportunities for foot support, like steps or stairs. By combining ankle torque with a refined angular momentum-based linear inverted pendulum model (ALIP), our method allows variability in the robot's center of mass height. We employ a dual-strategy controller that merges virtual constraints for precise motion regulation across essential degrees of freedom with an ALIP-centric model predictive control (MPC) framework, aimed at enforcing gait stability. The effectiveness of our feedback design is demonstrated through its application on the Cassie bipedal robot, which features 20 degrees of freedom. Key to our implementation is the development of tailored nominal trajectories and an optimized MPC that reduces the execution time to under 500 microseconds--and, hence, is compatible with Cassie's controller update frequency. This paper not only showcases the successful hardware deployment but also demonstrates a new capability, a bipedal robot using a moving walkway.
Robotics
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins (early version)
Effective collaboration of dual-arm robots and their tool use capabilities are increasingly important areas in the advancement of robotics. These skills play a significant role in expanding robots' ability to operate in diverse real-world environments. However, progress is impeded by the scarcity of specialized training data. This paper introduces RoboTwin, a novel benchmark dataset combining real-world teleoperated data with synthetic data from digital twins, designed for dual-arm robotic scenarios. Using the COBOT Magic platform, we have collected diverse data on tool usage and human-robot interaction. We present an innovative approach to creating digital twins using AI-generated content, transforming 2D images into detailed 3D models. Furthermore, we utilize large language models to generate expert-level training data and task-specific pose sequences oriented toward functionality. Our key contributions are: 1) the RoboTwin benchmark dataset, 2) an efficient real-to-simulation pipeline, and 3) the use of language models for automatic expert-level data generation. These advancements are designed to address the shortage of robotic training data, potentially accelerating the development of more capable and versatile robotic systems for a wide range of real-world applications. The project page is available at https://robotwin-benchmark.github.io/early-version/
comment: Project page: https://robotwin-benchmark.github.io/early-version/
Hybrid Imitation-Learning Motion Planner for Urban Driving
With the release of open-source datasets such as nuPlan and Argoverse, research on learning-based planners has grown considerably in recent years. Existing systems have shown excellent capabilities in imitating human driver behaviour, but they struggle to guarantee safe closed-loop driving. Conversely, optimization-based planners offer greater safety in short-term planning scenarios. To confront this challenge, in this paper we propose a novel hybrid motion planner that integrates both learning-based and optimization-based techniques. Initially, a multilayer perceptron (MLP) generates a human-like trajectory, which is then refined by an optimization-based component. This component not only minimizes tracking errors but also computes a trajectory that is both kinematically feasible and collision-free with obstacles and road boundaries. Our model effectively balances safety and human-likeness, mitigating the trade-off inherent in these objectives. We validate our approach through simulation experiments and further demonstrate its efficacy by deploying it in real-world self-driving vehicles.
CONClave -- Secure and Robust Cooperative Perception for CAVs Using Authenticated Consensus and Trust Scoring
Connected Autonomous Vehicles have great potential to improve automobile safety and traffic flow, especially in cooperative applications where perception data is shared between vehicles. However, this cooperation must be secured from malicious intent and unintentional errors that could cause accidents. Previous works typically address singular security or reliability issues for cooperative driving in specific scenarios rather than the set of errors together. In this paper, we propose CONClave, a tightly coupled authentication, consensus, and trust scoring mechanism that provides comprehensive security and reliability for cooperative perception in autonomous vehicles. CONClave benefits from the pipelined nature of the steps such that faults can be detected significantly faster and with less compute. Overall, CONClave shows huge promise in preventing security flaws, detecting even relatively minor sensing faults, and increasing the robustness and accuracy of cooperative perception in CAVs while adding minimal overhead.
comment: 6 pages, 6 figures, Design Automation Conference June 2024
SOAR: Simultaneous Exploration and Photographing with Heterogeneous UAVs for Fast Autonomous Reconstruction
Unmanned Aerial Vehicles (UAVs) have gained significant popularity in scene reconstruction. This paper presents SOAR, a LiDAR-Visual heterogeneous multi-UAV system specifically designed for fast autonomous reconstruction of complex environments. Our system comprises a LiDAR-equipped explorer with a large field-of-view (FoV), alongside photographers equipped with cameras. To ensure rapid acquisition of the scene's surface geometry, we employ a surface frontier-based exploration strategy for the explorer. As the surface is progressively explored, we identify the uncovered areas and generate viewpoints incrementally. These viewpoints are then assigned to photographers through solving a Consistent Multiple Depot Multiple Traveling Salesman Problem (Consistent-MDMTSP), which optimizes scanning efficiency while ensuring task consistency. Finally, photographers utilize the assigned viewpoints to determine optimal coverage paths for acquiring images. We present extensive benchmarks in a realistic simulator, which validate the performance of SOAR compared with classical and state-of-the-art methods. For more details, please see our project page at https://sysu-star.github.io/SOAR.
comment: Accepted to IROS2024. Code: https://github.com/SYSU-STAR/SOAR. Project page: http://sysu-star.com/SOAR/
Surgical Task Automation Using Actor-Critic Frameworks and Self-Supervised Imitation Learning
Surgical robot task automation has recently attracted great attention due to its potential to benefit both surgeons and patients. Reinforcement learning (RL) based approaches have demonstrated promising ability to provide solutions to automated surgical manipulations on various tasks. To address the exploration challenge, expert demonstrations can be utilized to enhance the learning efficiency via imitation learning (IL) approaches. However, the success of such methods normally relies on both state and action labels. Unfortunately, action labels can be hard to capture, or their manual annotation is prohibitively expensive owing to the requirement for expert knowledge. It therefore remains an appealing and open problem to leverage expert demonstrations composed of pure states in RL. In this work, we present an actor-critic RL framework, termed AC-SSIL, to overcome this challenge of learning with state-only demonstrations collected by following an unknown expert policy. It adopts a self-supervised IL method, dubbed SSIL, to effectively incorporate demonstrated states into RL paradigms by retrieving from the demonstrations the nearest neighbours of the query state and utilizing the bootstrapping of actor networks. We showcase through experiments on an open-source surgical simulation platform that our method delivers remarkable improvements over the RL baseline and exhibits comparable performance against action-based IL methods, which implies the efficacy and potential of our method for expert demonstration-guided learning scenarios.
comment: 8 pages, 7 figures, 62 conferences
A Low-Cost Real-Time Spiking System for Obstacle Detection based on Ultrasonic Sensors and Rate Coding
Since the advent of mobile robots, obstacle detection has been a topic of great interest. It has also been a subject of study in neuroscience, where flying insects and bats could be considered two of the most interesting cases in terms of vision-based and sound-based mechanisms for obstacle detection, respectively. Currently, many studies focus on vision-based obstacle detection, but not many can be found regarding sound-based obstacle detection. This work focuses on the latter approach, which also makes use of a Spiking Neural Network to exploit the advantages of these architectures and achieve an approach closer to biology. The complete system was tested through a series of experiments that confirm the validity of the spiking architecture for obstacle detection. It is empirically demonstrated that, when the distance between the robot and the obstacle decreases, the output firing rate of the system increases in response as expected, and vice versa. Therefore, there is a direct relation between the two. Furthermore, there is a distance threshold between detectable and undetectable objects which is also empirically measured in this work. An in-depth study on how this system works at low level based on the Inter-Spike Interval concept was performed, which may be useful in the future development of applications based on spiking filters.
comment: 22 pages, 8 figures
Causality-Aware Transformer Networks for Robotic Navigation
Recent advances in machine learning algorithms have garnered growing interest in developing versatile Embodied AI systems. However, current research in this domain reveals opportunities for improvement. First, the direct adoption of RNNs and Transformers often overlooks the specific differences between Embodied AI and traditional sequential data modelling, potentially limiting its performance in Embodied AI tasks. Second, the reliance on task-specific configurations, such as pre-trained modules and dataset-specific logic, compromises the generalizability of these methods. We address these constraints by first exploring the unique differences between Embodied AI tasks and other sequential data tasks through the lens of Causality, presenting a causal framework to elucidate the inadequacies of conventional sequential methods for Embodied AI. By leveraging this causal perspective, we propose Causality-Aware Transformer (CAT) Networks for Navigation, featuring a Causal Understanding Module to enhance the model's Environmental Understanding capability. Meanwhile, our method is devoid of task-specific inductive biases and can be trained in an End-to-End manner, which enhances the method's generalizability across various contexts. Empirical evaluations demonstrate that our methodology consistently surpasses benchmark performances across a spectrum of settings, tasks and simulation environments. Extensive ablation studies reveal that the performance gains can be attributed to the Causal Understanding Module, which demonstrates effectiveness and efficiency in both Reinforcement Learning and Supervised Learning settings.
Learning-Based Error Detection System for Advanced Vehicle Instrument Cluster Rendering
The automotive industry is currently expanding digital display options with every new model that comes onto the market. This entails not just an expansion in dimensions, resolution, and customization choices, but also the capability to employ novel display effects like overlays while assembling the content of the display cluster. Unfortunately, this raises the need for appropriate monitoring systems that can detect rendering errors and apply appropriate countermeasures when required. Classical solutions such as Cyclic Redundancy Checks (CRC) will soon no longer be viable, as any sort of alpha blending, warping, or scaling of content can cause unwanted CRC violations. Therefore, we propose a novel monitoring approach to verify the correctness of displayed content, using telltales (e.g. warning signs) as an example. It uses a learning-based approach to separate "good" telltales, i.e. those that a human driver will understand correctly, from "corrupted" telltales, i.e. those that will not be visible or perceived correctly. As a result, it possesses inherent resilience against individual pixel errors and implicitly supports changing backgrounds, overlay or scaling effects. This is underlined by our experimental study where all "corrupted" test patterns were correctly classified, while no false alarms were triggered.
comment: 9 pages
Mamba as a motion encoder for robotic imitation learning
Recent advancements in imitation learning, particularly with the integration of LLM techniques, are set to significantly improve robots' dexterity and adaptability. In this study, we propose using Mamba, a state-of-the-art architecture with potential applications in LLMs, for robotic imitation learning, highlighting its ability to function as an encoder that effectively captures contextual information. By reducing the dimensionality of the state space, Mamba operates similarly to an autoencoder. It effectively compresses the sequential information into state variables while preserving the essential temporal dynamics necessary for accurate motion prediction. Experimental results in tasks such as cup placing and case loading demonstrate that despite exhibiting higher estimation errors, Mamba achieves superior success rates compared to Transformers in practical task execution. This performance is attributed to Mamba's structure, which encompasses the state space model. Additionally, the study investigates Mamba's capacity to serve as a real-time motion generator with a limited amount of training data.
comment: 7 pages, 7 figures
Modelling, Design Optimization and Prototype development of Knee Exoskeleton
This study focuses on enhancing the design of an existing knee exoskeleton by addressing limitations in the range of motion (ROM) during Sit-to-Stand (STS) motions. While current knee exoskeletons emphasize toughness and rehabilitation, their closed-loop mechanisms hinder optimal ROM, which is crucial for effective rehabilitation. This research aims to optimize the exoskeleton design to achieve the necessary ROM, improving its functionality in rehabilitation. Using kinematic modeling and formulation, the existing design was represented by non-linear and non-convex mathematical functions. Optimization techniques, considering constraints based on human leg measurements, were applied to determine the best dimensions for the exoskeleton. This resulted in a significant increase in ROM compared to existing models. A MATLAB program was developed to compare the ROM of the optimized exoskeleton with the original design. To validate the practicality of the optimized design, analysis was conducted using a mannequin with average human dimensions, followed by constructing a cardboard dummy model to confirm simulation results. The STS motion of an average human was captured using a camera and TRACKER software, and the motion was compared with that of the dummy model to identify any misalignments between the human and exoskeleton knee joints. Furthermore, a prototype of the knee joint exoskeleton is being developed to further investigate misalignments and improve the design. Future work includes the use of EMG sensors for more detailed analysis and better results.
SurgTrack: CAD-Free 3D Tracking of Real-world Surgical Instruments
Vision-based surgical navigation has received increasing attention due to its non-invasive, cost-effective, and flexible advantages. In particular, a critical element of the vision-based navigation system is tracking surgical instruments. Compared with 2D instrument tracking methods, 3D instrument tracking has broader value in clinical practice, but is also more challenging due to weak texture, occlusion, and lack of Computer-Aided Design (CAD) models for 3D registration. To solve these challenges, we propose SurgTrack, a two-stage 3D instrument tracking method for CAD-free and robust real-world applications. In the first registration stage, we incorporate an Instrument Signed Distance Field (SDF) modeling the 3D representation of instruments, achieving CAD-free 3D registration. As a result, we can obtain the location and orientation of instruments in 3D space by matching the video stream with the registered SDF model. In the second tracking stage, we devise a posture graph optimization module, leveraging the historical tracking results of the posture memory pool to optimize the tracking results and improve the occlusion robustness. Furthermore, we collect the Instrument3D dataset to comprehensively evaluate the 3D tracking of surgical instruments. Extensive experiments validate the superiority and scalability of SurgTrack, which outperforms state-of-the-art methods with a remarkable improvement. The code and dataset are available at https://github.com/wenwucode/SurgTrack.
Vision-Language Navigation with Continual Learning
Vision-language navigation (VLN) is a critical domain within embedded intelligence, requiring agents to navigate 3D environments based on natural language instructions. Traditional VLN research has focused on improving environmental understanding and decision accuracy. However, these approaches often exhibit a significant performance gap when agents are deployed in novel environments, mainly due to the limited diversity of training data. Expanding datasets to cover a broader range of environments is impractical and costly. We propose the Vision-Language Navigation with Continual Learning (VLNCL) paradigm to address this challenge. In this paradigm, agents incrementally learn new environments while retaining previously acquired knowledge. VLNCL enables agents to maintain an environmental memory and extract relevant knowledge, allowing rapid adaptation to new environments while preserving existing information. We introduce a novel dual-loop scenario replay method (Dual-SR) inspired by brain memory replay mechanisms integrated with VLN agents. This method facilitates consolidating past experiences and enhances generalization across new tasks. By utilizing a multi-scenario memory buffer, the agent efficiently organizes and replays task memories, thereby bolstering its ability to adapt quickly to new environments and mitigating catastrophic forgetting. Our work pioneers continual learning in VLN agents, introducing a novel experimental setup and evaluation metrics. We demonstrate the effectiveness of our approach through extensive evaluations and establish a benchmark for the VLNCL paradigm. Comparative experiments with existing continual learning and VLN methods show significant improvements, achieving state-of-the-art performance in continual learning ability and highlighting the potential of our approach in enabling rapid adaptation while preserving prior knowledge.
Want a Ride? Attitudes Towards Autonomous Driving and Behavior in Autonomous Vehicles
Research conducted previously has focused on either attitudes toward or behaviors associated with autonomous driving. In this paper, we bridge these two dimensions by exploring how attitudes towards autonomous driving influence behavior in an autonomous car. We conducted a field experiment with twelve participants engaged in non-driving related tasks. Our findings indicate that attitudes towards autonomous driving do not affect participants' driving interventions in vehicle control and eye glance behavior. Therefore, studies on autonomous driving technology lacking field tests might be unreliable for assessing the potential behaviors, attitudes, and acceptance of autonomous vehicles.
comment: 12 pages
Modular pipeline for small bodies gravity field modeling: an efficient representation of variable density spherical harmonics coefficients
Proximity operations to small bodies, such as asteroids and comets, demand high levels of autonomy to achieve cost-effective, safe, and reliable Guidance, Navigation and Control (GNC) solutions. Enabling autonomous GNC capabilities in the vicinity of these targets is thus vital for future space applications. However, the highly non-linear and uncertain environment characterizing their vicinity poses unique challenges that need to be assessed to grant robustness against unknown shapes and gravity fields. In this paper, a pipeline designed to generate variable density gravity field models is proposed, allowing the generation of a coherent set of scenarios that can be used for design, validation, and testing of GNC algorithms. The proposed approach consists of processing a polyhedral shape model of the body with a given density distribution to compute the coefficients of the spherical harmonics expansion associated with the gravity field. To validate the approach, several comparisons are conducted against analytical solutions, literature results, and higher fidelity models, across a diverse set of targets with varying morphological and physical properties. Simulation results demonstrate the effectiveness of the methodology, showing good performance in terms of modeling accuracy and computational efficiency. This research presents a faster and more robust framework for generating environmental models to be used in simulation and hardware-in-the-loop testing of onboard GNC algorithms.
Cog-GA: A Large Language Models-based Generative Agent for Vision-Language Navigation in Continuous Environments
Vision Language Navigation in Continuous Environments (VLN-CE) represents a frontier in embodied AI, demanding agents to navigate freely in unbounded 3D spaces solely guided by natural language instructions. This task introduces distinct challenges in multimodal comprehension, spatial reasoning, and decision-making. To address these challenges, we introduce Cog-GA, a generative agent founded on large language models (LLMs) tailored for VLN-CE tasks. Cog-GA employs a dual-pronged strategy to emulate human-like cognitive processes. Firstly, it constructs a cognitive map, integrating temporal, spatial, and semantic elements, thereby facilitating the development of spatial memory within LLMs. Secondly, Cog-GA employs a predictive mechanism for waypoints, strategically optimizing the exploration trajectory to maximize navigational efficiency. Each waypoint is accompanied by a dual-channel scene description, categorizing environmental cues into 'what' and 'where' streams, as in the brain. This segregation enhances the agent's attentional focus, enabling it to discern pertinent spatial information for navigation. A reflective mechanism complements these strategies by capturing feedback from prior navigation experiences, facilitating continual learning and adaptive replanning. Extensive evaluations conducted on VLN-CE benchmarks validate Cog-GA's state-of-the-art performance and ability to simulate human-like navigation behaviors. This research significantly contributes to the development of strategic and interpretable VLN-CE agents.
eRSS-RAMP: A Rule-Adherence Motion Planner Based on Extended Responsibility-Sensitive Safety for Autonomous Driving
Driving safety and responsibility determination are indispensable pieces of the puzzle for autonomous driving. They are also deeply related to the allocation of right-of-way and the determination of accident liability. Therefore, Intel/Mobileye designed the responsibility-sensitive safety (RSS) framework to further enhance the safety regulation of autonomous driving, which mathematically defines rules for autonomous vehicle (AV) behaviors in various traffic scenarios. However, the RSS framework's rules are relatively rudimentary in certain scenarios characterized by interaction uncertainty, especially those requiring collaborative driving during emergency collision avoidance. Besides, the integration of the RSS framework with motion planning is rarely discussed in current studies. Therefore, we propose a rule-adherence motion planner (RAMP) based on the extended RSS (eRSS) regulation for non-connected and connected AVs in merging and emergency-avoiding scenarios. The simulation results indicate that the proposed method can achieve faster and safer lane merging performance (53.0% shorter merging length and a 73.5% decrease in merging time), and allows for more stable steering maneuvers in emergency collision avoidance, resulting in smoother paths for the ego vehicle and surrounding vehicles.
comment: 12 pages, 19 figures, submitted to an IEEE journal
Dispelling Four Challenges in Inertial Motion Tracking with One Recurrent Inertial Graph-based Estimator (RING)
In this paper, we extend the Recurrent Inertial Graph-based Estimator (RING), a novel neural-network-based solution for Inertial Motion Tracking (IMT), to generalize across a large range of sampling rates, and we demonstrate that it can overcome four real-world challenges: inhomogeneous magnetic fields, sensor-to-segment misalignment, sparse sensor setups, and nonrigid sensor attachment. RING can estimate the rotational state of a three-segment kinematic chain with double hinge joints from inertial data, and achieves an experimental mean-absolute-(tracking)-error of 8.10 +/- 1.19 degrees if all four challenges are present simultaneously. The network is trained on simulated data yet evaluated on experimental data, highlighting its remarkable ability to zero-shot generalize from simulation to experiment. We conduct an ablation study to analyze the impact of each of the four challenges on RING's performance, we showcase its robustness to varying sampling rates, and we demonstrate that RING is capable of real-time operation. This research not only advances IMT technology by making it more accessible and versatile but also enhances its potential for new application domains including non-expert use of sparse IMT with nonrigid sensor attachments in unconstrained environments.
comment: Submitted to 12th IFAC Symposium on Biological and Medical Systems
USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions
Autonomous underwater vehicles (AUVs) are valuable for ocean exploration due to their flexibility and ability to carry communication and detection units. Nevertheless, AUVs alone often face challenges in harsh and extreme sea conditions. This study introduces an unmanned surface vehicle (USV)-AUV collaboration framework, which includes high-precision multi-AUV positioning using USV path planning via Fisher information matrix optimization and reinforcement learning for multi-AUV cooperative tasks. Applied to a multi-AUV underwater data collection task scenario, extensive simulations validate the framework's feasibility and superior performance, highlighting exceptional coordination and robustness under extreme sea conditions. The simulation code will be made available as open-source to foster future research in this area.
Fuzzy Logic Control for Indoor Navigation of Mobile Robots
Autonomous mobile robots have many applications in indoor unstructured environments, wherein optimal movement of the robot is needed. The robot therefore needs to navigate in unknown and dynamic environments. This paper presents an implementation of a fuzzy logic controller for navigation of a mobile robot in an unknown, dynamically cluttered environment. A fuzzy logic controller is used here as it is capable of making inferences even under uncertainty. It supports rule generation and decision making in order to reach the goal position under various situations. Sensor readings from the robot and the desired direction of motion are the inputs to the fuzzy logic controller, and the accelerations of the respective wheels are its outputs. Hence, the mobile robot avoids obstacles and reaches the goal position. Keywords: Fuzzy Logic Controller, Membership Functions, Takagi-Sugeno-Kang FIS, Centroid Defuzzification
Occlusion-Based Cooperative Transport for Concave Objects with a Swarm of Miniature Mobile Robots
This paper proposes an occlusion-based strategy for collective transport of a concave object using a swarm of mobile robots. We aim to overcome the challenges of transporting concave objects using a decentralized approach. The interesting aspect of this task is that the agents have no prior knowledge about the geometry of the object and do not explicitly communicate with each other. The concept is to eliminate the concavity of the object by filling its cavity with a number of robots and then carry out an occlusion-based transport strategy on the newly formed convex object or "pseudo object". We divide our work into two parts: concavity filling of various concave objects and occlusion-based collective transport of convex objects.
Deep Brain Ultrasound Ablation Thermal Dose Modeling with in Vivo Experimental Validation
Intracorporeal needle-based therapeutic ultrasound (NBTU) is a minimally invasive option for intervening in malignant brain tumors, commonly used in thermal ablation procedures. This technique is suitable for both primary and metastatic cancers, utilizing a high-frequency alternating electric field (up to 10 MHz) to excite a piezoelectric transducer. The resulting rapid deformation of the transducer produces an acoustic wave that propagates through tissue, leading to localized high-temperature heating at the target tumor site and inducing rapid cell death. To optimize the design of NBTU transducers for thermal dose delivery during treatment, numerical modeling of the acoustic pressure field generated by the deforming piezoelectric transducer is frequently employed. The bioheat transfer process generated by the input pressure field is used to track the thermal propagation of the applicator over time. Magnetic resonance thermal imaging (MRTI) can be used to experimentally validate these models. Validation results using MRTI demonstrated the feasibility of this model, showing a consistent thermal propagation pattern. However, a thermal damage isodose map is more advantageous for evaluating therapeutic efficacy. To achieve a more accurate simulation based on the actual brain tissue environment, a new finite element method (FEM) simulation with enhanced damage evaluation capabilities was conducted. The results showed that the highest temperature and ablated volume differed between experimental and simulation results by 2.1884 °C (3.71%) and 0.0631 cm$^3$ (5.74%), respectively. The lowest Pearson correlation coefficient (PCC) for peak temperature was 0.7117, and the lowest Dice coefficient for the ablated area was 0.7021, indicating a good agreement in accuracy between simulation and experiment.
comment: 9 pages, 9 figures, 7 tables
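The Dice coefficient mentioned in the abstract above measures the overlap between two regions as twice the size of their intersection divided by the sum of their sizes. The following is a minimal sketch of that standard definition only; the pixel sets `simulated` and `measured` are made-up illustrative data, not values from the paper, and the function name is hypothetical.

```python
def dice_coefficient(region_a, region_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two sets of pixel coordinates."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        # Two empty regions are treated as a perfect match by convention.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Illustrative (hypothetical) ablated-area masks given as pixel coordinates.
simulated = {(0, 0), (0, 1), (1, 0), (1, 1)}
measured = {(0, 1), (1, 1), (1, 2)}
score = dice_coefficient(simulated, measured)
```

A score of 1 means the two regions coincide exactly, and 0 means they do not overlap at all.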
Multi-modal Situated Reasoning in 3D Scenes
Situation awareness is essential for understanding and reasoning about 3D scenes in embodied AI agents. However, existing datasets and benchmarks for situated understanding are limited in data modality, diversity, scale, and task scope. To address these limitations, we propose Multi-modal Situated Question Answering (MSQA), a large-scale multi-modal situated reasoning dataset, scalably collected leveraging 3D scene graphs and vision-language models (VLMs) across a diverse range of real-world 3D scenes. MSQA includes 251K situated question-answering pairs across 9 distinct question categories, covering complex scenarios within 3D scenes. We introduce a novel interleaved multi-modal input setting in our benchmark to provide text, image, and point cloud for situation and question description, resolving ambiguity in previous single-modality convention (e.g., text). Additionally, we devise the Multi-modal Situated Next-step Navigation (MSNN) benchmark to evaluate models' situated reasoning for navigation. Comprehensive evaluations on MSQA and MSNN highlight the limitations of existing vision-language models and underscore the importance of handling multi-modal interleaved inputs and situation modeling. Experiments on data scaling and cross-domain transfer further demonstrate the efficacy of leveraging MSQA as a pre-training dataset for developing more powerful situated reasoning models.
comment: Project page: https://msr3d.github.io/
Reinforcement Learning for Wheeled Mobility on Vertically Challenging Terrain
Off-road navigation on vertically challenging terrain, involving steep slopes and rugged boulders, presents significant challenges for wheeled robots both at the planning level to achieve smooth collision-free trajectories and at the control level to avoid rolling over or getting stuck. Considering the complex model of wheel-terrain interactions, we develop an end-to-end Reinforcement Learning (RL) system for an autonomous vehicle to learn wheeled mobility through simulated trial-and-error experiences. Using a custom-designed simulator built on the Chrono multi-physics engine, our approach leverages Proximal Policy Optimization (PPO) and a terrain difficulty curriculum to refine a policy based on a reward function to encourage progress towards the goal and penalize excessive roll and pitch angles, which circumvents the need of complex and expensive kinodynamic modeling, planning, and control. Additionally, we present experimental results in the simulator and deploy our approach on a physical Verti-4-Wheeler (V4W) platform, demonstrating that RL can equip conventional wheeled robots with previously unrealized potential of navigating vertically challenging terrain.
Approximate Environment Decompositions for Robot Coverage Planning using Submodular Set Cover
In this paper, we investigate the problem of decomposing 2D environments for robot coverage planning. Coverage path planning (CPP) involves computing a cost-minimizing path for a robot equipped with a coverage or sensing tool so that the tool visits all points in the environment. CPP is an NP-Hard problem, so existing approaches simplify the problem by decomposing the environment into the minimum number of sectors. Sectors are sub-regions of the environment that can each be covered using a lawnmower path (i.e., along parallel straight-line paths) oriented at an angle. However, traditional methods either limit the coverage orientations to be axis-parallel (horizontal/vertical) or provide no guarantees on the number of sectors in the decomposition. We introduce an approach to decompose the environment into possibly overlapping rectangular sectors. We provide an approximation guarantee on the number of sectors computed using our approach for a given environment. We do this by leveraging the submodular property of the sector coverage function, which enables us to formulate the decomposition problem as a submodular set cover (SSC) problem with well-known approximation guarantees for the greedy algorithm. Our approach improves upon existing coverage planning methods, as demonstrated through an evaluation using maps of complex real-world environments.
comment: Extended version of the 2024 IEEE CDC paper, 8 pages, 3 figures
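The greedy algorithm referred to in the abstract above can be illustrated on the plain set cover problem, which is a special case of submodular set cover. This is a minimal sketch under that simplifying assumption, not the authors' method: environment cells are integers, candidate sectors are sets of cells, and the names `cells` and `sectors` are hypothetical.

```python
def greedy_set_cover(universe, candidates):
    """Greedily pick candidate sets until every element of `universe` is covered.

    Returns the indices of the chosen candidates, in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the candidate with the largest marginal gain in coverage.
        best = max(range(len(candidates)), key=lambda i: len(uncovered & candidates[i]))
        gain = uncovered & candidates[best]
        if not gain:
            raise ValueError("candidates cannot cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical toy instance: 8 cells and 5 possibly overlapping sectors.
cells = set(range(1, 9))
sectors = [{1, 2, 3, 4}, {3, 4, 5, 6}, {5, 6, 7, 8}, {1, 5}, {4, 8}]
picked = greedy_set_cover(cells, sectors)
```

Here the greedy rule first takes sector 0 (four new cells) and then sector 2 (the remaining four), so two sectors suffice.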
Developing, Analyzing, and Evaluating Self-Drive Algorithms Using Drive-by-Wire Electric Vehicles
Reliable lane-following algorithms are essential for safe and effective autonomous driving. This project was primarily focused on developing and evaluating different lane-following programs to find the most reliable algorithm for a Vehicle to Everything (V2X) project. The algorithms were first tested on a simulator and then with real vehicles equipped with a drive-by-wire system using ROS (Robot Operating System). Their performance was assessed through reliability, comfort, speed, and adaptability metrics. The results show that the two most reliable approaches detect both lane lines and use unsupervised learning to separate them. These approaches proved to be robust in various driving scenarios, making them suitable candidates for integration into the V2X project.
comment: Supported by the National Science Foundation under Grants No. 2150292 and 2150096
RoboKoop: Efficient Control Conditioned Representations from Visual Input in Robotics using Koopman Operator
Developing agents that can perform complex control tasks from high-dimensional observations is a core ability of autonomous agents that requires underlying robust task control policies and adapting the underlying visual representations to the task. Most existing policies need a lot of training samples and treat this problem from the lens of two-stage learning with a controller learned on top of pre-trained vision models. We approach this problem from the lens of Koopman theory and learn visual representations from robotic agents conditioned on specific downstream tasks in the context of learning stabilizing control for the agent. We introduce a Contrastive Spectral Koopman Embedding network that allows us to learn efficient linearized visual representations from the agent's visual data in a high dimensional latent space and utilizes reinforcement learning to perform off-policy control on top of the extracted representations with a linear controller. Our method enhances stability and control in gradient dynamics over time, significantly outperforming existing approaches by improving efficiency and accuracy in learning task policies over extended horizons.
comment: Accepted to the $8^{th}$ Conference on Robot Learning (CoRL 2024)
Incorporating dense metric depth into neural 3D representations for view synthesis and relighting
Synthesizing accurate geometry and photo-realistic appearance of small scenes is an active area of research with compelling use cases in gaming, virtual reality, robotic-manipulation, autonomous driving, convenient product capture, and consumer-level photography. When applying scene geometry and appearance estimation techniques to robotics, we found that the narrow cone of possible viewpoints due to the limited range of robot motion and scene clutter caused current estimation techniques to produce poor quality estimates or even fail. On the other hand, in robotic applications, dense metric depth can often be measured directly using stereo and illumination can be controlled. Depth can provide a good initial estimate of the object geometry to improve reconstruction, while multi-illumination images can facilitate relighting. In this work we demonstrate a method to incorporate dense metric depth into the training of neural 3D representations and address an artifact observed while jointly refining geometry and appearance by disambiguating between texture and geometry edges. We also discuss a multi-flash stereo camera system developed to capture the necessary data for our pipeline and show results on relighting and view synthesis with a few training views.
comment: Project webpage: https://stereomfc.github.io
PIETRA: Physics-Informed Evidential Learning for Traversing Out-of-Distribution Terrain
Self-supervised learning is a powerful approach for developing traversability models for off-road navigation, but these models often struggle with inputs unseen during training. Existing methods utilize techniques like evidential deep learning to quantify model uncertainty, helping to identify and avoid out-of-distribution terrain. However, always avoiding out-of-distribution terrain can be overly conservative, e.g., when novel terrain can be effectively analyzed using a physics-based model. To overcome this challenge, we introduce Physics-Informed Evidential Traversability (PIETRA), a self-supervised learning framework that integrates physics priors directly into the mathematical formulation of evidential neural networks and introduces physics knowledge implicitly through an uncertainty-aware, physics-informed training loss. Our evidential network seamlessly transitions between learned and physics-based predictions for out-of-distribution inputs. Additionally, the physics-informed loss regularizes the learned model, ensuring better alignment with the physics model. Extensive simulations and hardware experiments demonstrate that PIETRA improves both learning accuracy and navigation performance in environments with significant distribution shifts.
comment: Submitted to RA-L. Video: https://youtu.be/OTnNZ96oJRk
Efficient Multi-agent Navigation with Lightweight DRL Policy
In this article, we present an end-to-end collision avoidance policy based on deep reinforcement learning (DRL) for multi-agent systems, demonstrating encouraging outcomes in real-world applications. In particular, our policy calculates the control commands of the agent based on the raw LiDAR observation. In addition, the number of parameters of the proposed basic model is 140,000, and the size of the parameter file is 3.5 MB, which allows the robot to calculate the actions from the CPU alone. We propose a multi-agent training platform based on a physics-based simulator to further bridge the gap between simulation and the real world. The policy is trained on a policy-gradient-based RL algorithm in a dense and messy training environment. A novel reward function is introduced to address the issue of agents choosing suboptimal actions in some common scenarios. Although the data used for training is exclusively from the simulation platform, the policy can be successfully transferred and deployed in real-world robots. Finally, our policy effectively responds to intentional obstructions and avoids collisions. The website is available at https://sites.google.com/view/xingrong2024efficient/%E9%A6%96%E9%A1%B5.
Augmented Reality without Borders: Achieving Precise Localization Without Maps
Visual localization is crucial for Computer Vision and Augmented Reality (AR) applications, where determining the camera or device's position and orientation is essential to accurately interact with the physical environment. Traditional methods rely on detailed 3D maps constructed using Structure from Motion (SfM) or Simultaneous Localization and Mapping (SLAM), which is computationally expensive and impractical for dynamic or large-scale environments. We introduce MARLoc, a novel localization framework for AR applications that uses known relative transformations within image sequences to perform intra-sequence triangulation, generating 3D-2D correspondences for pose estimation and refinement. MARLoc eliminates the need for pre-built SfM maps, providing accurate and efficient localization suitable for dynamic outdoor environments. Evaluation with benchmark datasets and real-world experiments demonstrates MARLoc's state-of-the-art performance and robustness. By integrating MARLoc into an AR device, we highlight its capability to achieve precise localization in real-world outdoor scenarios, showcasing its practical effectiveness and potential to enhance visual localization in AR applications.
Accelerating Model Predictive Control for Legged Robots through Distributed Optimization IROS
This paper presents a novel approach to enhance Model Predictive Control (MPC) for legged robots through Distributed Optimization. Our method focuses on decomposing the robot dynamics into smaller, parallelizable subsystems, and utilizing the Alternating Direction Method of Multipliers (ADMM) to ensure consensus among them. Each subsystem is managed by its own Optimal Control Problem, with ADMM facilitating consistency between their optimizations. This approach not only decreases the computational time but also allows for effective scaling with more complex robot configurations, facilitating the integration of additional subsystems such as articulated arms on a quadruped robot. We demonstrate, through numerical evaluations, the convergence of our approach on two systems with increasing complexity. In addition, we showcase that our approach converges towards the same solution when compared to a state-of-the-art centralized whole-body MPC implementation. Moreover, we quantitatively compare the computational efficiency of our method to the centralized approach, revealing up to a 75% reduction in computational time. Overall, our approach offers a promising avenue for accelerating MPC solutions for legged robots, paving the way for more effective utilization of the computational performance of modern hardware.
comment: Accepted for publication at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
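The consensus mechanism that ADMM provides can be shown on a toy problem. In this minimal sketch, each "subsystem" i holds a scalar cost 0.5 * (x_i - a_i)^2 and all local variables must agree on a shared value z. It only illustrates the standard scaled-form consensus ADMM updates; the quadratic costs, the `targets` values, and the penalty `rho` are illustrative assumptions and have nothing to do with the robot dynamics in the paper.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Scaled-form consensus ADMM for min sum_i 0.5*(x_i - a_i)^2 s.t. x_i = z."""
    n = len(targets)
    x = [0.0] * n   # local copies, one per subsystem
    u = [0.0] * n   # scaled dual variables
    z = 0.0         # shared consensus variable
    for _ in range(iterations):
        # Local step: each subsystem solves its own small problem (parallelizable).
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Consensus step: average the local estimates plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each subsystem's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return x, z


local_x, consensus = consensus_admm([1.0, 2.0, 6.0])
```

For this toy cost the consensus value converges to the mean of the targets, which is also the solution of the equivalent centralized problem.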
On the Benefits of GPU Sample-Based Stochastic Predictive Controllers for Legged Locomotion IROS
Quadrupedal robots excel in mobility, navigating complex terrains with agility. However, their complex control systems present challenges that are still far from being fully addressed. In this paper, we introduce the use of Sample-Based Stochastic control strategies for quadrupedal robots, as an alternative to traditional optimal control laws. We show that Sample-Based Stochastic methods, supported by GPU acceleration, can be effectively applied to real quadruped robots. In particular, in this work, we focus on achieving gait frequency adaptation, a notable challenge in quadrupedal locomotion for gradient-based methods. To validate the effectiveness of Sample-Based Stochastic controllers, we test two distinct approaches for quadrupedal robots and compare them against a conventional gradient-based Model Predictive Control system. Our findings, validated both in simulation and on a real 21 kg Aliengo quadruped, demonstrate that our method is on par with a traditional Model Predictive Control strategy when the robot is subject to zero or moderate disturbance, while it surpasses gradient-based methods in handling sustained external disturbances, thanks to the straightforward gait adaptation strategy that the sample-based formulation makes possible.
comment: Accepted for publication at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Can Vehicle Motion Planning Generalize to Realistic Long-tail Scenarios?
Real-world autonomous driving systems must make safe decisions in the face of rare and diverse traffic scenarios. Current state-of-the-art planners are mostly evaluated on real-world datasets like nuScenes (open-loop) or nuPlan (closed-loop). In particular, nuPlan seems to be an expressive evaluation method since it is based on real-world data and closed-loop, yet it mostly covers basic driving scenarios. This makes it difficult to judge a planner's capabilities to generalize to rarely-seen situations. Therefore, we propose a novel closed-loop benchmark interPlan containing several edge cases and challenging driving scenarios. We assess existing state-of-the-art planners on our benchmark and show that neither rule-based nor learning-based planners can safely navigate the interPlan scenarios. A recently evolving direction is the usage of foundation models like large language models (LLM) to handle generalization. We evaluate an LLM-only planner and introduce a novel hybrid planner that combines an LLM-based behavior planner with a rule-based motion planner that achieves state-of-the-art performance on our benchmark.
Evaluating the precision of the HTC VIVE Ultimate Tracker with robotic and human movements under varied environmental conditions
The HTC VIVE Ultimate Tracker, utilizing inside-out tracking with internal stereo cameras providing 6 DoF tracking without external cameras, offers a cost-efficient and straightforward setup for motion tracking. Although the tracker was initially designed for the gaming and VR industry, we explored its application beyond VR, providing source code for data capturing in both C++ and Python without requiring a VR headset. This study is the first to evaluate the tracker's precision across various experimental scenarios. To assess the robustness of the tracking precision, we employed a robotic arm as a precise and repeatable source of motion. Using the OptiTrack system as a reference, we conducted tests under varying experimental conditions: lighting, movement velocity, environmental changes caused by displacing objects in the scene, and human movement in front of the trackers, as well as varying the displacement size relative to the calibration center. On average, the HTC VIVE Ultimate Tracker achieved a precision of 4.98 mm +/- 4 mm across various conditions. The most critical factors affecting accuracy were lighting conditions, movement velocity, and range of motion relative to the calibration center. For practical evaluation, we captured human movements with 5 trackers in realistic motion capture scenarios. Our findings indicate sufficient precision for capturing human movements, validated through two tasks: a low-dynamic pick-and-place task and high-dynamic fencing movements performed by an elite athlete. Even though its precision is lower than that of conventional fixed-camera-based motion capture systems and its performance is influenced by several factors, the HTC VIVE Ultimate Tracker demonstrates adequate accuracy for a variety of motion tracking applications. Its ability to capture human or object movements outside of VR or MOCAP environments makes it particularly versatile.
A unified theory and statistical learning approach for traffic conflict detection
This study proposes a unified theory and statistical learning approach for traffic conflict detection, addressing the long-existing call for a consistent and comprehensive methodology to evaluate the collision risk emerging in road user interactions. The proposed theory assumes context-dependent probabilistic collision risk and frames conflict detection as assessing this risk by statistical learning of extreme events in daily interactions. Experiments using real-world trajectory data are conducted for demonstration. Firstly, a unified metric for indicating conflicts is trained with lane-changing interactions on German highways. This metric and other existing metrics are then applied to near-crash events from the 100-Car Naturalistic Driving Study in the U.S. for performance comparison. Results of the experiments show that the trained metric provides effective collision warnings, generalises across distinct datasets and traffic environments, covers a broad range of conflict types, and delivers a long-tailed distribution of conflict intensity. Reflecting on these results, the proposed theory ensures consistent evaluation by a generic formulation that encompasses varying assumptions of traffic conflicts; the statistical learning approach then enables a comprehensive consideration of influencing factors such as motion states of road users, environment conditions, and participant characteristics. Therefore, the theory and learning approach jointly provide an explainable and adaptable methodology for conflict detection among different road users and across various interaction scenarios. This promises to reduce accidents and improve overall traffic safety, by enhanced safety assessment of traffic infrastructures, more effective collision warning systems for autonomous driving, and a deeper understanding of road user behaviour in different traffic conditions.
comment: 21 pages, 9 figures, in submission
Invariant Smoothing for Localization: Including the IMU Biases
In this article we investigate smoothing (i.e., optimisation-based) estimation techniques for robot localization using an IMU aided by other localization sensors. We more particularly focus on Invariant Smoothing (IS), a variant based on the use of nontrivial Lie groups from robotics. We study the recently introduced Two Frames Group (TFG), and prove it can fit into the framework of Invariant Smoothing in order to better take into account the IMU biases, as compared to the state-of-the-art in robotics. Experiments based on the KITTI dataset show the proposed framework compares favorably to the state-of-the-art smoothing methods in terms of robustness in some challenging situations.
BiKC: Keypose-Conditioned Consistency Policy for Bimanual Robotic Manipulation
Bimanual manipulation tasks typically involve multiple stages which require efficient interactions between two arms, posing step-wise and stage-wise challenges for imitation learning systems. Specifically, the failure or delay of one step will propagate through time, hindering the success and efficiency of each sub-stage task and thereby the overall task performance. Although recent works have made strides in addressing certain challenges, few approaches explicitly consider the multi-stage nature of bimanual tasks while simultaneously emphasizing the importance of inference speed. In this paper, we introduce a novel keypose-conditioned consistency policy tailored for bimanual manipulation. It is a hierarchical imitation learning framework that consists of a high-level keypose predictor and a low-level trajectory generator. The predicted keyposes provide guidance for trajectory generation and also mark the completion of one sub-stage task. The trajectory generator is designed as a consistency model trained from scratch without distillation, which generates action sequences conditioned on current observations and predicted keyposes with fast inference speed. Simulated and real-world experimental results demonstrate that the proposed approach surpasses baseline methods in terms of success rate and operational efficiency. Code is available at https://github.com/ManUtdMoon/BiKC.
comment: Accepted by The 16th International Workshop on the Algorithmic Foundations of Robotics (WAFR 2024)
Closed-Loop Magnetic Control of Medical Soft Continuum Robots for Deflection
Magnetic soft continuum robots (MSCRs) have emerged as powerful devices in endovascular interventions owing to their hyperelastic fibre matrix and enhanced magnetic manipulability. Effective closed-loop control of tethered magnetic devices contributes to the achievement of autonomous vascular robotic surgery. In this article, we employ a magnetic actuation system equipped with a single rotatable permanent magnet to achieve closed-loop deflection control of the MSCR. To this end, we establish a differential kinematic model of MSCRs exposed to non-uniform magnetic fields. The relationship between the existence and uniqueness of the Jacobian and the geometric position between robots is deduced. The control direction induced by the Jacobian is demonstrated to be crucial in simulations. Then, the corresponding quasi-static control (QSC) framework integrates a linear extended state observer to estimate model uncertainties. Finally, the effectiveness of the proposed QSC framework is validated through comparative trajectory tracking experiments with the PD controller under external disturbances. Further extensions are made for the Jacobian to path-following control at the distal end position. The proposed control framework prevents the actuator from reaching the joint limit and achieves fast, low-error tracking performance without overshooting.
OpenVLA: An Open-Source Vision-Language-Action Model
Large policies pretrained on a combination of Internet-scale vision-language data and diverse robot demonstrations have the potential to change how we teach robots new skills: rather than training new behaviors from scratch, we can fine-tune such vision-language-action (VLA) models to obtain robust, generalizable policies for visuomotor control. Yet, widespread adoption of VLAs for robotics has been challenging as 1) existing VLAs are largely closed and inaccessible to the public, and 2) prior work fails to explore methods for efficiently fine-tuning VLAs for new tasks, a key component for adoption. Addressing these challenges, we introduce OpenVLA, a 7B-parameter open-source VLA trained on a diverse collection of 970k real-world robot demonstrations. OpenVLA builds on a Llama 2 language model combined with a visual encoder that fuses pretrained features from DINOv2 and SigLIP. As a product of the added data diversity and new model components, OpenVLA demonstrates strong results for generalist manipulation, outperforming closed models such as RT-2-X (55B) by 16.5% in absolute task success rate across 29 tasks and multiple robot embodiments, with 7x fewer parameters. We further show that we can effectively fine-tune OpenVLA for new settings, with especially strong generalization results in multi-task environments involving multiple objects and strong language grounding abilities, and outperform expressive from-scratch imitation learning methods such as Diffusion Policy by 20.4%. We also explore compute efficiency; as a separate contribution, we show that OpenVLA can be fine-tuned on consumer GPUs via modern low-rank adaptation methods and served efficiently via quantization without a hit to downstream success rate. Finally, we release model checkpoints, fine-tuning notebooks, and our PyTorch codebase with built-in support for training VLAs at scale on Open X-Embodiment datasets.
comment: Website: https://openvla.github.io/
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting
Open-world generalization requires robotic systems to have a profound understanding of the physical world and the user command to solve diverse and complex tasks. While the recent advancement in vision-language models (VLMs) has offered unprecedented opportunities to solve open-world problems, how to leverage their capabilities to control robots remains a grand challenge. In this paper, we introduce Marking Open-world Keypoint Affordances (MOKA), an approach that employs VLMs to solve robotic manipulation tasks specified by free-form language instructions. Central to our approach is a compact point-based representation of affordance, which bridges the VLM's predictions on observed images and the robot's actions in the physical world. By prompting the pre-trained VLM, our approach utilizes the VLM's commonsense knowledge and concept understanding acquired from broad data sources to predict affordances and generate motions. To facilitate the VLM's reasoning in zero-shot and few-shot manners, we propose a visual prompting technique that annotates marks on images, converting affordance reasoning into a series of visual question-answering problems that are solvable by the VLM. We further explore methods to enhance performance with robot experiences collected by MOKA through in-context learning and policy distillation. We evaluate and analyze MOKA's performance on various table-top manipulation tasks including tool use, deformable body manipulation, and object rearrangement.
Extending Structural Causal Models for Autonomous Embodied Systems
In this work we aim to bridge the divide between autonomous embodied systems and causal reasoning. Autonomous embodied systems have come to increasingly interact with humans, and in many cases may pose risks to the physical or mental well-being of those they interact with. Meanwhile, causal models, despite their inherent transparency and ability to offer contrastive explanations, have found limited usage within such systems. As such, we first identify the challenges that have limited the integration of structural causal models within autonomous embodied systems. We then introduce a number of theoretical extensions to the structural causal model formalism in order to tackle these challenges. This augments these models to possess greater levels of modularisation and encapsulation, as well as presenting a constant-space temporal causal model representation. While not an extension itself, we also prove through the extensions we have introduced that dynamically mutable sets can be captured within structural causal models while maintaining a form of causal stationarity. Finally, we introduce two case study architectures demonstrating the application of these extensions, along with a discussion of where these extensions could be utilised in future work.
comment: 33 Pages = 7 Pages (Main Content) + 2 Pages (References) + 24 Pages (Appendix), 17 Figures = 2 Figures (Main Content) + 15 (Appendix), Oxford Robotics Institute Preprint; Significantly updated based upon feedback
Human-Like Implicit Intention Expression for Autonomous Driving Motion Planning: A Method Based on Learning Human Intention Priors
One of the key factors determining whether autonomous vehicles (AVs) can be seamlessly integrated into existing traffic systems is their ability to interact smoothly and efficiently with human drivers and communicate their intentions. While many studies have focused on enhancing AVs' human-like interaction and communication capabilities at the behavioral decision-making level, a significant gap remains between the actual motion trajectories of AVs and the psychological expectations of human drivers. This discrepancy can seriously affect the safety and efficiency of AV-HV (Autonomous Vehicle-Human Vehicle) interactions. To address these challenges, we propose a motion planning method for AVs that incorporates implicit intention expression. First, we construct a trajectory space constraint based on human implicit intention priors, compressing and pruning the trajectory space to generate candidate motion trajectories that consider intention expression. We then apply maximum entropy inverse reinforcement learning to learn and estimate human trajectory preferences, constructing a reward function that represents the cognitive characteristics of drivers. Finally, using a Boltzmann distribution, we establish a probabilistic distribution of candidate trajectories based on the reward obtained, selecting human-like trajectory actions. We validated our approach on a real trajectory dataset and compared it with several baseline methods. The results demonstrate that our method excels in human-likeness, intention expression capability, and computational efficiency.
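The Boltzmann step described in the abstract above turns per-trajectory rewards into selection probabilities proportional to exp(reward / temperature). The following is a minimal sketch of that general formula only: the reward values and the `temperature` parameter are illustrative assumptions, and the function name is hypothetical rather than taken from the paper.

```python
import math


def boltzmann_probabilities(rewards, temperature=1.0):
    """Map rewards to probabilities p_i proportional to exp(r_i / temperature)."""
    scaled = [r / temperature for r in rewards]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(scaled)
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rewards for three candidate trajectories.
probs = boltzmann_probabilities([1.0, 2.0, 0.5])
# Index of the most probable (highest-reward) candidate.
best = max(range(len(probs)), key=probs.__getitem__)
```

A lower temperature concentrates probability on the highest-reward trajectory, while a higher temperature spreads it more evenly across candidates.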
VANP: Learning Where to See for Navigation with Self-Supervised Vision-Action Pre-Training IROS 2024
Humans excel at efficiently navigating through crowds without collision by focusing on specific visual regions relevant to navigation. However, most robotic visual navigation methods rely on deep learning models pre-trained on vision tasks, which prioritize salient objects -- not necessarily relevant to navigation and potentially misleading. Alternative approaches train specialized navigation models from scratch, requiring significant computation. On the other hand, self-supervised learning has revolutionized computer vision and natural language processing, but its application to robotic navigation remains underexplored due to the difficulty of defining effective self-supervision signals. Motivated by these observations, in this work, we propose a Self-Supervised Vision-Action Model for Visual Navigation Pre-Training (VANP). Instead of detecting salient objects that are beneficial for tasks such as classification or detection, VANP learns to focus only on specific visual regions that are relevant to the navigation task. To achieve this, VANP uses a history of visual observations, future actions, and a goal image for self-supervision, and embeds them using two small Transformer Encoders. Then, VANP maximizes the information between the embeddings by using a mutual information maximization objective function. We demonstrate that most VANP-extracted features match human navigation intuition. VANP achieves performance comparable to models learned end-to-end with half the training time, and to models trained on a large-scale, fully supervised dataset (i.e., ImageNet) with only 0.08% of the data.
comment: Extended version of the paper accepted at IROS 2024. Code: https://github.com/mhnazeri/VANP
Systems and Control (CS)
Adaptive Formation Learning Control for Cooperative AUVs under Complete Uncertainty
This paper presents a two-layer control framework for Autonomous Underwater Vehicles (AUVs) designed to handle uncertain nonlinear dynamics, including the mass matrix, previously assumed known. Unlike prior studies, this approach makes the controller independent of the robot's configuration and varying environmental conditions. The proposed framework applies across different environmental conditions affecting AUVs. It features a first-layer cooperative estimator and a second-layer decentralized deterministic learning controller. This architecture supports robust operation under diverse underwater scenarios, managing environmental effects like changes in water viscosity and flow, which impact the AUV's effective mass and damping dynamics. The first-layer estimator enables seamless inter-agent communication by sharing crucial system estimates without relying on global information. The second-layer controller uses local feedback to adjust each AUV's trajectory, ensuring accurate formation control and dynamic adaptability. Radial basis function neural networks enable local learning and knowledge storage, allowing AUVs to efficiently reapply learned dynamics after system restarts. Simulations validate the effectiveness of this framework, marking it as a significant advancement in distributed adaptive control systems for AUVs, enhancing operational flexibility and resilience in unpredictable marine environments.
comment: Submitted to the journal Frontiers in Robotics and AI, 30 pages, 8 figures, 1 table
Generalized Individual Q-learning for Polymatrix Games with Partial Observations
This paper addresses the challenge of limited observations in non-cooperative multi-agent systems where agents can have partial access to other agents' actions. We present the generalized individual Q-learning dynamics that combine belief-based and payoff-based learning for the networked interconnections of more than two self-interested agents. This approach leverages access to opponents' actions whenever possible, demonstrably achieving a faster (guaranteed) convergence to quantal response equilibrium in multi-agent zero-sum and potential polymatrix games. Notably, the dynamics reduce to the well-studied smoothed fictitious play and individual Q-learning under full and no access to opponent actions, respectively. We further quantify the improvement in convergence rate due to observing opponents' actions through numerical simulations.
comment: Extended version (including proofs of Propositions 1 and 2) of the paper: A. S. Donmez and M. O. Sayin, "Generalized individual Q-learning for polymatrix games with partial observations", to appear in the Proceedings of the 63rd IEEE Conference on Decision and Control, 2024
Mamba as a motion encoder for robotic imitation learning
Recent advancements in imitation learning, particularly with the integration of LLM techniques, are set to significantly improve robots' dexterity and adaptability. In this study, we propose using Mamba, a state-of-the-art architecture with potential applications in LLMs, for robotic imitation learning, highlighting its ability to function as an encoder that effectively captures contextual information. By reducing the dimensionality of the state space, Mamba operates similarly to an autoencoder. It effectively compresses the sequential information into state variables while preserving the essential temporal dynamics necessary for accurate motion prediction. Experimental results in tasks such as cup placing and case loading demonstrate that despite exhibiting higher estimation errors, Mamba achieves superior success rates compared to Transformers in practical task execution. This performance is attributed to Mamba's structure, which encompasses the state space model. Additionally, the study investigates Mamba's capacity to serve as a real-time motion generator with a limited amount of training data.
comment: 7 pages, 7 figures
A Variable Power Surface Error Function backstepping based Dynamic Surface Control of Non-Lower Triangular Nonlinear Systems
A control design for error reduction in the tracking control for a class of non-lower triangular nonlinear systems is presented by combining techniques of Variable Power Surface Error Function (VPSEF), backstepping, and dynamic surface control. At each step of design, a surface error is obtained, and based on its magnitude, the VPSEF technique decides the surface error to be used. Thus, the backstepping-based virtual and actual control laws are designed to stabilize the corresponding subsystem. To address the issue of circular structure, a first-order low-pass filter is used to handle the virtual control signal at each intermediate stage of the recursive design. The stability analysis of the closed-loop system demonstrates that all signals are semi-globally uniformly ultimately bounded. Moreover, by suitably using the VPSEF-based switching strategy for the control input, it is possible to ensure that the steady-state tracking error converges to an arbitrarily small neighborhood of zero. The effectiveness of the proposed concept has been verified using two different simulated demonstrations.
USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions
Autonomous underwater vehicles (AUVs) are valuable for ocean exploration due to their flexibility and ability to carry communication and detection units. Nevertheless, AUVs alone often face challenges in harsh and extreme sea conditions. This study introduces an unmanned surface vehicle (USV)-AUV collaboration framework, which includes high-precision multi-AUV positioning using USV path planning via Fisher information matrix optimization and reinforcement learning for multi-AUV cooperative tasks. Applied to a multi-AUV underwater data collection task scenario, extensive simulations validate the framework's feasibility and superior performance, highlighting exceptional coordination and robustness under extreme sea conditions. The simulation code will be made available as open-source to foster future research in this area.
Large Language Models as Efficient Reward Function Searchers for Custom-Environment Multi-Objective Reinforcement Learning
Leveraging large language models (LLMs) for designing reward functions demonstrates significant potential. However, achieving effective design and improvement of reward functions in reinforcement learning (RL) tasks with complex custom environments and multiple requirements presents considerable challenges. In this paper, we enable LLMs to be effective white-box searchers, highlighting their advanced semantic understanding capabilities. Specifically, we generate reward components for each explicit user requirement and employ the reward critic to identify the correct code form. Then, LLMs assign weights to the reward components to balance their values and iteratively search and optimize these weights based on the context provided by the training log analyzer, while adaptively determining the search step size. We applied the framework to an underwater information collection RL task without direct human feedback or reward examples (zero-shot). The reward critic successfully corrects the reward code with only one round of feedback for each requirement, effectively preventing irreparable errors that can occur when reward function feedback is provided in aggregate. The effective initialization of weights enables the acquisition of different reward functions within the Pareto solution set without weight search. Even in the case where a weight is 100 times off, fewer than four iterations are needed to obtain solutions that meet user requirements. The framework also works well with most prompts utilizing GPT-3.5 Turbo, since it does not require advanced numerical understanding or calculation.
Enhancing Information Freshness: An AoI Optimized Markov Decision Process Dedicated In the Underwater Task
Ocean exploration utilizing autonomous underwater vehicles (AUVs) via reinforcement learning (RL) has emerged as a significant research focus. However, underwater tasks have mostly failed due to the observation delay caused by acoustic communication in the Internet of underwater things. In this study, we present an AoI optimized Markov decision process (AoI-MDP) to improve the performance of underwater tasks. Specifically, AoI-MDP models observation delay as signal delay through statistical signal processing, and includes this delay as a new component in the state space. Additionally, we introduce wait time in the action space, and integrate AoI with reward functions to achieve joint optimization of information freshness and decision-making for AUVs leveraging RL for training. Finally, we apply this approach to the multi-AUV data collection task scenario as an example. Simulation results highlight the feasibility of AoI-MDP, which effectively minimizes AoI while showcasing superior performance in the task. To accelerate relevant research in this field, the code for simulation will be released as open-source in the future.
Force-Limited Control of Wave Energy Converters using a Describing Function Linearization
Actuator saturation is a common nonlinearity. In wave energy conversion, force saturation conveniently limits drivetrain size and cost with minimal impact on energy generation. However, such nonlinear dynamics typically demand numerical simulation, which increases computational cost and diminishes intuition. This paper instead uses describing functions to approximate a force saturation nonlinearity as a linear impedance mismatch. In the frequency domain, the impact of controller impedance mismatch (such as force limit, finite bandwidth, or parameter error) on electrical power production is shown analytically and graphically for a generic nondimensionalized single degree of freedom wave energy converter in regular waves. Results are visualized with Smith charts. Notably, systems with a specific ratio of reactive to real mechanical impedance are least sensitive to force limits, a criterion that conflicts with resonance and bandwidth considerations. The describing function method shows promise to enable future studies such as large-scale design optimization and co-design.
comment: 6 pages, 7 figures. For code, see https://github.com/symbiotic-engineering/IFAC_CAMS_2024/ . To be presented at IFAC CAMS 2024 conference and to appear in IFAC-PapersOnLine
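For readers unfamiliar with the technique named in the abstract above, the following is a minimal sketch of the textbook sinusoidal-input describing function of an ideal unit-slope saturation. It is not taken from the paper or its repository; the function name and the `amplitude` and `limit` parameters are assumptions made for illustration.

```python
import math


def saturation_describing_function(amplitude, limit):
    """Equivalent linear gain N(A) of a unit-slope saturation.

    For an input A*sin(wt) clipped at +/-limit, the fundamental harmonic
    of the output equals N(A) * A * sin(wt), where
        N(A) = 1                                        if A <= limit
        N(A) = (2/pi) * (asin(k) + k * sqrt(1 - k**2))  otherwise, k = limit / A
    """
    if amplitude <= 0 or limit <= 0:
        raise ValueError("amplitude and limit must be positive")
    if amplitude <= limit:
        return 1.0  # no clipping, so the element behaves as a unit gain
    k = limit / amplitude
    return (2.0 / math.pi) * (math.asin(k) + k * math.sqrt(1.0 - k * k))


if __name__ == "__main__":
    # Hypothetical amplitudes, expressed relative to a force limit of 1.0.
    for a in (0.5, 1.0, 2.0, 10.0):
        print(a, saturation_describing_function(a, 1.0))
```

Because N(A) is a real gain between 0 and 1 that depends only on the ratio of the limit to the amplitude, a saturated controller can be treated as a scaled-down linear one, which is consistent with the abstract's description of saturation as an impedance mismatch.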
Upstream Allocation of Bidirectional Load Demand by Power Packetization
The power packet dispatching system has been studied for power management with strict tie to an accompanying information system through power packetization. In the system, integrated units of transfer of power and information, called power packets, are delivered through a network of apparatuses called power packet routers. This paper proposes upstream allocation of a bidirectional load demand represented by a sequence of power packets to power sources. We first develop a scheme of power packet routing for upstream allocation of load demand with full integration of power and information transfer. The routing scheme is then proved to enable packetized management of bidirectional load demand, which is of practical importance for applicability to, e.g., electric drives in motoring and regenerating operations. We present a way of packetizing the bidirectional load demand and realizing the power and information flow under the upstream allocation scheme. The viability of the proposed methods is demonstrated through experiments.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Neighbourhood conditions for network stability with link uncertainty
The main result relates to structured robust stability analysis of an input-output model for networks with link uncertainty. It constitutes a collection of integral quadratic constraints, which together imply robust stability of the uncertain networked dynamics. Each condition is decentralized in the sense that it depends on model data pertaining to the neighbourhood of a specific agent. By contrast, pre-existing conditions for the network model are link-wise decentralized, with each involving conservatively more localized problem data. A numerical example is presented to illustrate the advantage of the new broader neighbourhood conditions.
comment: arXiv admin note: text overlap with arXiv:2403.14931
Combined Plant and Control Co-design via Solutions of Hamilton-Jacobi-Bellman Equation Based on Physics-informed Learning
This paper addresses integrated design of engineering systems, where the physical structure of the plant and the controller design are optimized simultaneously. To cope with uncertainties due to noise acting on the dynamics and modeling errors, an Uncertain Control Co-design (UCCD) problem formulation is proposed. Existing UCCD methods usually rely on uncertainty propagation analyses using Monte Carlo methods for open-loop solutions of optimal control, which suffer from stringent trade-offs among accuracy, time horizon, and computational time. The proposed method utilizes closed-loop solutions characterized by the Hamilton-Jacobi-Bellman equation, a Partial Differential Equation (PDE) defined on the state space. A solution algorithm for the proposed UCCD formulation is developed based on PDE solutions of Physics-informed Neural Networks (PINNs). Numerical examples of regulator design problems are provided, and it is shown that simultaneous update of PINN weights and the design parameters works effectively for solving UCCD problems.
RoboKoop: Efficient Control Conditioned Representations from Visual Input in Robotics using Koopman Operator
Developing agents that can perform complex control tasks from high-dimensional observations is a core ability of autonomous agents that requires underlying robust task control policies and adapting the underlying visual representations to the task. Most existing policies need a lot of training samples and treat this problem through the lens of two-stage learning with a controller learned on top of pre-trained vision models. We approach this problem through the lens of Koopman theory and learn visual representations from robotic agents conditioned on specific downstream tasks in the context of learning stabilizing control for the agent. We introduce a Contrastive Spectral Koopman Embedding network that allows us to learn efficient linearized visual representations from the agent's visual data in a high dimensional latent space and utilizes reinforcement learning to perform off-policy control on top of the extracted representations with a linear controller. Our method enhances stability and control in gradient dynamics over time, significantly outperforming existing approaches by improving efficiency and accuracy in learning task policies over extended horizons.
comment: Accepted to the $8^{th}$ Conference on Robot Learning (CoRL 2024)
Rigid-Body Attitude Control on $\mathsf{SO(3)}$ using Nonlinear Dynamic Inversion
This paper presents a cascaded control architecture, based on nonlinear dynamic inversion (NDI), for rigid body attitude control. The proposed controller works directly with the rotation matrix parameterization, that is, with elements of the Special Orthogonal Group $\mathsf{SO(3)}$, and avoids problems related to singularities and non-uniqueness which affect other commonly used attitude representations such as Euler angles, unit quaternions, modified Rodrigues parameters, etc. The proposed NDI-based controller is capable of imposing desired linear dynamics of any order for the outer attitude loop and the inner rate loop, and gives control designers the flexibility to choose higher-order dynamic compensators in both loops. In addition, sufficient conditions are presented in the form of linear matrix inequalities (LMIs) which ensure that the outer loop controller renders the attitude loop almost globally asymptotically stable (AGAS) and the rate loop globally asymptotically stable (GAS). Furthermore, the overall cascaded control architecture is shown to be AGAS in the case of attitude error regulation. Lastly, the proposed scheme is compared with an Euler angles-based NDI scheme from literature for a tracking problem involving agile maneuvering of a multicopter in a high-fidelity nonlinear simulation.
comment: 7 pages, 6 figures, accepted in IEEE Conference on Decision and Control (CDC), 2024
Angular Spread Statistics for 6.75 GHz FR1(C) and 16.95 GHz FR3 Mid-Band Frequencies in an Indoor Hotspot Environment
We present detailed multipath propagation spatial statistics for next-generation wireless systems operating at lower and upper mid-band frequencies spanning 6--24 GHz. The large-scale spatial characteristics of the wireless channel include Azimuth angular Spread of Departure (ASD) and Zenith angular Spread of Departure (ZSD) of multipath components (MPC) from a transmitter and the Azimuth angular Spread of Arrival (ASA) and Zenith angular Spread of Arrival (ZSA) at a receiver. The angular statistics calculated from measurements were compared with industry-standard 3GPP models, and ASD and ASA values were found to be in close agreement at both 6.75 GHz and 16.95 GHz. Measured LOS ASD was found larger than 3GPP ASD indicating more diverse MPC departure directions in the azimuth. ZSA and ZSD were observed smaller than the 3GPP modeling results as most multipath arrivals and departures during measurements were recorded at the boresight antenna elevation. The wide angular spreads indicate a multipath-rich spatial propagation at 6.75 GHz and 16.95 GHz, showing greater promise for the implementation of MIMO beamforming systems in the mid-band spectrum.
comment: 6 pages, 3 figures, 1 table, IEEE Wireless Communications and Networking Conference
PIETRA: Physics-Informed Evidential Learning for Traversing Out-of-Distribution Terrain
Self-supervised learning is a powerful approach for developing traversability models for off-road navigation, but these models often struggle with inputs unseen during training. Existing methods utilize techniques like evidential deep learning to quantify model uncertainty, helping to identify and avoid out-of-distribution terrain. However, always avoiding out-of-distribution terrain can be overly conservative, e.g., when novel terrain can be effectively analyzed using a physics-based model. To overcome this challenge, we introduce Physics-Informed Evidential Traversability (PIETRA), a self-supervised learning framework that integrates physics priors directly into the mathematical formulation of evidential neural networks and introduces physics knowledge implicitly through an uncertainty-aware, physics-informed training loss. Our evidential network seamlessly transitions between learned and physics-based predictions for out-of-distribution inputs. Additionally, the physics-informed loss regularizes the learned model, ensuring better alignment with the physics model. Extensive simulations and hardware experiments demonstrate that PIETRA improves both learning accuracy and navigation performance in environments with significant distribution shifts.
comment: Submitted to RA-L. Video: https://youtu.be/OTnNZ96oJRk
Efficient Multi-agent Navigation with Lightweight DRL Policy
In this article, we present an end-to-end collision avoidance policy based on deep reinforcement learning (DRL) for multi-agent systems, demonstrating encouraging outcomes in real-world applications. In particular, our policy calculates the control commands of the agent based on the raw LiDAR observation. In addition, the number of parameters of the proposed basic model is 140,000, and the size of the parameter file is 3.5 MB, which allows the robot to calculate the actions from the CPU alone. We propose a multi-agent training platform based on a physics-based simulator to further bridge the gap between simulation and the real world. The policy is trained on a policy-gradients-based RL algorithm in a dense and messy training environment. A novel reward function is introduced to address the issue of agents choosing suboptimal actions in some common scenarios. Although the data used for training is exclusively from the simulation platform, the policy can be successfully transferred and deployed in real-world robots. Finally, our policy effectively responds to intentional obstructions and avoids collisions. The website is available at https://sites.google.com/view/xingrong2024efficient/%E9%A6%96%E9%A1%B5.
Low-rank approximated Kalman filter using Oja's principal component flow for discrete-time linear systems
The Kalman filter is indispensable for state estimation across diverse fields but faces computational challenges with higher dimensions. Approaches such as Riccati equation approximations aim to alleviate this complexity, yet ensuring properties like bounded errors remains challenging. Yamada and Ohki introduced low-rank Kalman-Bucy filters for continuous-time systems, ensuring bounded errors. This paper proposes a discrete-time counterpart of the low-rank filter and shows its system theoretic properties and conditions for bounded mean square error estimation. Numerical simulations show the effectiveness of the proposed method.
comment: 6 pages, presented at SICE2024
Fast System Level Synthesis: Robust Model Predictive Control using Riccati Recursions
System level synthesis enables improved robust MPC formulations by allowing for joint optimization of the nominal trajectory and controller. This paper introduces a tailored algorithm for solving the corresponding disturbance feedback optimization problem for linear time-varying systems. The proposed algorithm iterates between optimizing the controller and the nominal trajectory while converging q-linearly to an optimal solution. We show that the controller optimization can be solved through Riccati recursions leading to a horizon-length, state, and input scalability of $\mathcal{O}(N^2 ( n_x^3 +n_u^3))$ for each iterate. On a numerical example, the proposed algorithm exhibits computational speedups by a factor of up to $10^3$ compared to general-purpose commercial solvers.
comment: Young Author Award (finalist): IFAC Conference on Nonlinear Model Predictive Control (NMPC) 2024
Large Language Models for Explainable Decisions in Dynamic Digital Twins
Dynamic data-driven Digital Twins (DDTs) can enable informed decision-making and provide an optimisation platform for the underlying system. By leveraging principles of Dynamic Data-Driven Applications Systems (DDDAS), DDTs can formulate computational modalities for feedback loops, model updates and decision-making, including autonomous ones. However, understanding autonomous decision-making often requires technical and domain-specific knowledge. This paper explores using large language models (LLMs) to provide an explainability platform for DDTs, generating natural language explanations of the system's decision-making by leveraging domain-specific knowledge bases. A case study from smart agriculture is presented.
comment: 9 pages, 3 figures, accepted by DDDAS2024 -- the 5th International Conference on Dynamic Data Driven Applications Systems
Bayesian Persuasion for Containing SIS Epidemics with Asymptomatic Infection
We investigate the strategic behavior of a large population of agents who decide whether to adopt a costly partially effective protection or remain unprotected against the susceptible-infected-susceptible epidemic. In contrast with most prior works on epidemic games, we assume that the agents are not aware of their true infection status while making decisions. We adopt the Bayesian persuasion framework where the agents receive a noisy signal regarding their true infection status, and maximize their expected utility computed using the posterior probability of being infected conditioned on the received signal. We characterize the stationary Nash equilibrium of this setting under suitable assumptions, and identify conditions under which partial information disclosure leads to a smaller proportion of infected individuals at the equilibrium compared to full information disclosure, and vice versa.
Deterministic Multistage Constellation Reconfiguration Using Integer Programming and Sequential Decision-Making Methods
In this paper, we address the problem of reconfiguring Earth observation satellite constellation systems through multiple stages. The Multi-stage Constellation Reconfiguration Problem (MCRP) aims to maximize the total observation rewards obtained by covering a set of targets of interest through the active manipulation of the orbits and relative phasing of constituent satellites. In this paper, we consider deterministic problem settings in which the targets of interest are known a priori. We propose a novel integer linear programming formulation for MCRP, capable of obtaining provably optimal solutions. To overcome computational intractability due to the combinatorial explosion in solving large-scale instances, we introduce two computationally efficient sequential decision-making methods based on the principles of a myopic policy and a rolling horizon procedure. The computational experiments demonstrate that the devised sequential decision-making approaches yield high-quality solutions with improved computational efficiency over the baseline MCRP. Finally, a case study using Hurricane Harvey data showcases the advantages of multi-stage constellation reconfiguration over single-stage and no-reconfiguration scenarios.
comment: 39 pages, 13 figures, Journal of Spacecraft and Rockets (Published)
Online Feedback Optimization and Singular Perturbation via Contraction Theory
In this paper, we provide a novel contraction-theoretic approach to analyze two-time scale systems, including those commonly encountered in Online Feedback Optimization (OFO). Our framework endows these systems with several robustness properties, enabling a more comprehensive characterization of their behaviors. The primary assumptions are the contractivity of the fast sub-system and the reduced model, along with an explicit upper bound on the time-scale parameter. For two-time scale systems subject to disturbances, we show that the distance between solutions of the nominal system and solutions of its reduced model is uniformly upper bounded by a function of contraction rates, Lipschitz constants, the time-scale parameter, and the variability of the disturbances over time. Applying these general results to the OFO context, we establish new individual tracking error bounds, showing that solutions converge to their time-varying optimizer, provided the plant and steady-state feedback controller exhibit contractivity and the controller gain is suitably bounded. Finally, we explore two special cases: for autonomous nonlinear systems, we derive sharper bounds than those in the general results, and for linear time-invariant systems, we present novel bounds based on induced matrix norms and induced matrix log norms.
comment: This paper has been submitted to SIAM Journal on Control and Optimization
Wasserstein speed limits for Langevin systems
Physical systems transition between states with finite speed that is limited by energetic costs. In this work, we derive bounds on transition times for general Langevin systems that admit a decomposition into reversible and irreversible dynamics, in terms of the Wasserstein distance between states and the energetic costs associated with respective reversible and irreversible currents. For illustration we discuss Brownian particles subject to arbitrary forcing and an RLC circuit with time-varying inductor.
comment: 10 pages, 2 figures
Systems and Control (EESS)
Adaptive Formation Learning Control for Cooperative AUVs under Complete Uncertainty
This paper presents a two-layer control framework for Autonomous Underwater Vehicles (AUVs) designed to handle uncertain nonlinear dynamics, including the mass matrix, previously assumed known. Unlike prior studies, this approach makes the controller independent of the robot's configuration and varying environmental conditions. The proposed framework applies across different environmental conditions affecting AUVs. It features a first-layer cooperative estimator and a second-layer decentralized deterministic learning controller. This architecture supports robust operation under diverse underwater scenarios, managing environmental effects like changes in water viscosity and flow, which impact the AUV's effective mass and damping dynamics. The first-layer estimator enables seamless inter-agent communication by sharing crucial system estimates without relying on global information. The second-layer controller uses local feedback to adjust each AUV's trajectory, ensuring accurate formation control and dynamic adaptability. Radial basis function neural networks enable local learning and knowledge storage, allowing AUVs to efficiently reapply learned dynamics after system restarts. Simulations validate the effectiveness of this framework, marking it as a significant advancement in distributed adaptive control systems for AUVs, enhancing operational flexibility and resilience in unpredictable marine environments.
comment: Submitted to the journal Frontiers in Robotics and AI, 30 pages, 8 figures, 1 table
Generalized Individual Q-learning for Polymatrix Games with Partial Observations
This paper addresses the challenge of limited observations in non-cooperative multi-agent systems where agents can have partial access to other agents' actions. We present the generalized individual Q-learning dynamics that combine belief-based and payoff-based learning for the networked interconnections of more than two self-interested agents. This approach leverages access to opponents' actions whenever possible, demonstrably achieving a faster (guaranteed) convergence to quantal response equilibrium in multi-agent zero-sum and potential polymatrix games. Notably, the dynamics reduce to the well-studied smoothed fictitious play and individual Q-learning under full and no access to opponent actions, respectively. We further quantify the improvement in convergence rate due to observing opponents' actions through numerical simulations.
comment: Extended version (including proofs of Propositions 1 and 2) of the paper: A. S. Donmez and M. O. Sayin, "Generalized individual Q-learning for polymatrix games with partial observations", to appear in the Proceedings of the 63rd IEEE Conference on Decision and Control, 2024
Mamba as a motion encoder for robotic imitation learning
Recent advancements in imitation learning, particularly with the integration of LLM techniques, are set to significantly improve robots' dexterity and adaptability. In this study, we propose using Mamba, a state-of-the-art architecture with potential applications in LLMs, for robotic imitation learning, highlighting its ability to function as an encoder that effectively captures contextual information. By reducing the dimensionality of the state space, Mamba operates similarly to an autoencoder. It effectively compresses the sequential information into state variables while preserving the essential temporal dynamics necessary for accurate motion prediction. Experimental results in tasks such as cup placing and case loading demonstrate that despite exhibiting higher estimation errors, Mamba achieves superior success rates compared to Transformers in practical task execution. This performance is attributed to Mamba's structure, which encompasses the state space model. Additionally, the study investigates Mamba's capacity to serve as a real-time motion generator with a limited amount of training data.
comment: 7 pages, 7 figures
A Variable Power Surface Error Function backstepping based Dynamic Surface Control of Non-Lower Triangular Nonlinear Systems
A control design for error reduction in the tracking control for a class of non-lower triangular nonlinear systems is presented by combining techniques of Variable Power Surface Error Function (VPSEF), backstepping, and dynamic surface control. At each step of design, a surface error is obtained, and based on its magnitude, the VPSEF technique decides the surface error to be used. Thus, the backstepping-based virtual and actual control laws are designed to stabilize the corresponding subsystem. To address the issue of circular structure, a first-order low-pass filter is used to handle the virtual control signal at each intermediate stage of the recursive design. The stability analysis of the closed-loop system demonstrates that all signals are semi-globally uniformly ultimately bounded. Moreover, by suitably applying the VPSEF-based switching strategy to the control input, it is possible to ensure that the steady-state tracking error converges to an arbitrarily small neighborhood of zero. The effectiveness of the proposed concept has been verified using two different simulated demonstrations.
USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions
Autonomous underwater vehicles (AUVs) are valuable for ocean exploration due to their flexibility and ability to carry communication and detection units. Nevertheless, AUVs alone often face challenges in harsh and extreme sea conditions. This study introduces an unmanned surface vehicle (USV)-AUV collaboration framework, which includes high-precision multi-AUV positioning using USV path planning via Fisher information matrix optimization and reinforcement learning for multi-AUV cooperative tasks. Applied to a multi-AUV underwater data collection task scenario, extensive simulations validate the framework's feasibility and superior performance, highlighting exceptional coordination and robustness under extreme sea conditions. The simulation code will be made available as open-source to foster future research in this area.
Large Language Models as Efficient Reward Function Searchers for Custom-Environment Multi-Objective Reinforcement Learning
Leveraging large language models (LLMs) for designing reward functions demonstrates significant potential. However, achieving effective design and improvement of reward functions in reinforcement learning (RL) tasks with complex custom environments and multiple requirements presents considerable challenges. In this paper, we enable LLMs to be effective white-box searchers, highlighting their advanced semantic understanding capabilities. Specifically, we generate reward components for each explicit user requirement and employ the reward critic to identify the correct code form. Then, LLMs assign weights to the reward components to balance their values and iteratively search and optimize these weights based on the context provided by the training log analyzer, while adaptively determining the search step size. We applied the framework to an underwater information collection RL task without direct human feedback or reward examples (zero-shot). The reward critic successfully corrects the reward code with only one feedback for each requirement, effectively preventing irreparable errors that can occur when reward function feedback is provided in aggregate. The effective initialization of weights enables the acquisition of different reward functions within the Pareto solution set without weight search. Even in the case where a weight is 100 times off, fewer than four iterations are needed to obtain solutions that meet user requirements. The framework also works well with most prompts utilizing GPT-3.5 Turbo, since it does not require advanced numerical understanding or calculation.
Enhancing Information Freshness: An AoI Optimized Markov Decision Process Dedicated In the Underwater Task
Ocean exploration utilizing autonomous underwater vehicles (AUVs) via reinforcement learning (RL) has emerged as a significant research focus. However, underwater tasks have mostly failed due to the observation delay caused by acoustic communication in the Internet of underwater things. In this study, we present an AoI optimized Markov decision process (AoI-MDP) to improve the performance of underwater tasks. Specifically, AoI-MDP models observation delay as signal delay through statistical signal processing, and includes this delay as a new component in the state space. Additionally, we introduce wait time in the action space, and integrate AoI with reward functions to achieve joint optimization of information freshness and decision-making for AUVs leveraging RL for training. Finally, we apply this approach to the multi-AUV data collection task scenario as an example. Simulation results highlight the feasibility of AoI-MDP, which effectively minimizes AoI while showcasing superior performance in the task. To accelerate relevant research in this field, the code for simulation will be released as open-source in the future.
Force-Limited Control of Wave Energy Converters using a Describing Function Linearization
Actuator saturation is a common nonlinearity. In wave energy conversion, force saturation conveniently limits drivetrain size and cost with minimal impact on energy generation. However, such nonlinear dynamics typically demand numerical simulation, which increases computational cost and diminishes intuition. This paper instead uses describing functions to approximate a force saturation nonlinearity as a linear impedance mismatch. In the frequency domain, the impact of controller impedance mismatch (such as force limit, finite bandwidth, or parameter error) on electrical power production is shown analytically and graphically for a generic nondimensionalized single degree of freedom wave energy converter in regular waves. Results are visualized with Smith charts. Notably, systems with a specific ratio of reactive to real mechanical impedance are least sensitive to force limits, a criterion that conflicts with resonance and bandwidth considerations. The describing function method shows promise to enable future studies such as large-scale design optimization and co-design.
comment: 6 pages, 7 figures. For code, see https://github.com/symbiotic-engineering/IFAC_CAMS_2024/ . To be presented at IFAC CAMS 2024 conference and to appear in IFAC-PapersOnLine
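To make the describing-function idea in the abstract above concrete, here is a minimal sketch of the textbook describing function of an ideal unit-slope saturation driven by a sinusoid. It is the standard result from nonlinear control texts, not code taken from the paper or its repository; the function name `saturation_describing_function` and its parameters are hypothetical.

```python
import math


def saturation_describing_function(amplitude: float, limit: float) -> float:
    """Equivalent linear gain of a unit-slope saturation for input A*sin(wt).

    amplitude: sinusoid amplitude A (hypothetical parameter name)
    limit:     saturation level, i.e. the output is clipped to [-limit, +limit]
    """
    if amplitude <= 0.0 or limit <= 0.0:
        raise ValueError("amplitude and limit must be positive")
    if amplitude <= limit:
        # The input never reaches the limit, so the element acts as a unit gain.
        return 1.0
    ratio = limit / amplitude
    # Fundamental-harmonic gain of the clipped sinusoid (classical result).
    return (2.0 / math.pi) * (math.asin(ratio) + ratio * math.sqrt(1.0 - ratio * ratio))


if __name__ == "__main__":
    for amplitude in (0.5, 1.0, 2.0, 5.0):
        gain = saturation_describing_function(amplitude, limit=1.0)
        print(f"A = {amplitude:4.1f}  ->  equivalent gain = {gain:.4f}")
```

The gain equals one while the input stays below the limit and decreases as the amplitude grows, which is the sense in which a force limit can be read as an amplitude-dependent mismatch from the intended linear controller.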
Upstream Allocation of Bidirectional Load Demand by Power Packetization
The power packet dispatching system has been studied for power management with a strict tie to an accompanying information system through power packetization. In the system, integrated units of transfer of power and information, called power packets, are delivered through a network of apparatuses called power packet routers. This paper proposes upstream allocation of a bidirectional load demand represented by a sequence of power packets to power sources. We first develop a scheme of power packet routing for upstream allocation of load demand with full integration of power and information transfer. The routing scheme is then proved to enable packetized management of bidirectional load demand, which is of practical importance for applicability to, e.g., electric drives in motoring and regenerating operations. We present a way of packetizing the bidirectional load demand and realizing the power and information flow under the upstream allocation scheme. The viability of the proposed methods is demonstrated through experiments.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Neighbourhood conditions for network stability with link uncertainty
The main result relates to structured robust stability analysis of an input-output model for networks with link uncertainty. It constitutes a collection of integral quadratic constraints, which together imply robust stability of the uncertain networked dynamics. Each condition is decentralized in the sense that it depends on model data pertaining to the neighbourhood of a specific agent. By contrast, pre-existing conditions for the network model are link-wise decentralized, with each involving conservatively more localized problem data. A numerical example is presented to illustrate the advantage of the new broader neighbourhood conditions.
comment: arXiv admin note: text overlap with arXiv:2403.14931
Combined Plant and Control Co-design via Solutions of Hamilton-Jacobi-Bellman Equation Based on Physics-informed Learning
This paper addresses integrated design of engineering systems, where the physical structure of the plant and the controller design are optimized simultaneously. To cope with uncertainties due to noises acting on the dynamics and modeling errors, an Uncertain Control Co-design (UCCD) problem formulation is proposed. Existing UCCD methods usually rely on uncertainty propagation analyses using Monte Carlo methods for open-loop solutions of optimal control, which suffer from stringent trade-offs among accuracy, time horizon, and computational time. The proposed method utilizes closed-loop solutions characterized by the Hamilton-Jacobi-Bellman equation, a Partial Differential Equation (PDE) defined on the state space. A solution algorithm for the proposed UCCD formulation is developed based on PDE solutions of Physics-informed Neural Networks (PINNs). Numerical examples of regulator design problems are provided, and it is shown that simultaneous update of PINN weights and the design parameters works effectively for solving UCCD problems.
RoboKoop: Efficient Control Conditioned Representations from Visual Input in Robotics using Koopman Operator
Developing agents that can perform complex control tasks from high-dimensional observations is a core ability of autonomous agents that requires underlying robust task control policies and adapting the underlying visual representations to the task. Most existing policies need a lot of training samples and treat this problem from the lens of two-stage learning with a controller learned on top of pre-trained vision models. We approach this problem from the lens of Koopman theory and learn visual representations from robotic agents conditioned on specific downstream tasks in the context of learning stabilizing control for the agent. We introduce a Contrastive Spectral Koopman Embedding network that allows us to learn efficient linearized visual representations from the agent's visual data in a high dimensional latent space and utilizes reinforcement learning to perform off-policy control on top of the extracted representations with a linear controller. Our method enhances stability and control in gradient dynamics over time, significantly outperforming existing approaches by improving efficiency and accuracy in learning task policies over extended horizons.
comment: Accepted to the $8^{th}$ Conference on Robot Learning (CoRL 2024)
Rigid-Body Attitude Control on $\mathsf{SO(3)}$ using Nonlinear Dynamic Inversion
This paper presents a cascaded control architecture, based on nonlinear dynamic inversion (NDI), for rigid body attitude control. The proposed controller works directly with the rotation matrix parameterization, that is, with elements of the Special Orthogonal Group $\mathsf{SO(3)}$, and avoids problems related to singularities and non-uniqueness which affect other commonly used attitude representations such as Euler angles, unit quaternions, modified Rodrigues parameters, etc. The proposed NDI-based controller is capable of imposing desired linear dynamics of any order for the outer attitude loop and the inner rate loop, and gives control designers the flexibility to choose higher-order dynamic compensators in both loops. In addition, sufficient conditions are presented in the form of linear matrix inequalities (LMIs) which ensure that the outer loop controller renders the attitude loop almost globally asymptotically stable (AGAS) and the rate loop globally asymptotically stable (GAS). Furthermore, the overall cascaded control architecture is shown to be AGAS in the case of attitude error regulation. Lastly, the proposed scheme is compared with an Euler angles-based NDI scheme from literature for a tracking problem involving agile maneuvering of a multicopter in a high-fidelity nonlinear simulation.
comment: 7 pages, 6 figures, accepted in IEEE Conference on Decision and Control (CDC), 2024
Angular Spread Statistics for 6.75 GHz FR1(C) and 16.95 GHz FR3 Mid-Band Frequencies in an Indoor Hotspot Environment
We present detailed multipath propagation spatial statistics for next-generation wireless systems operating at lower and upper mid-band frequencies spanning 6--24 GHz. The large-scale spatial characteristics of the wireless channel include Azimuth angular Spread of Departure (ASD) and Zenith angular Spread of Departure (ZSD) of multipath components (MPC) from a transmitter and the Azimuth angular Spread of Arrival (ASA) and Zenith angular Spread of Arrival (ZSA) at a receiver. The angular statistics calculated from measurements were compared with industry-standard 3GPP models, and ASD and ASA values were found to be in close agreement at both 6.75 GHz and 16.95 GHz. Measured LOS ASD was found larger than 3GPP ASD indicating more diverse MPC departure directions in the azimuth. ZSA and ZSD were observed smaller than the 3GPP modeling results as most multipath arrivals and departures during measurements were recorded at the boresight antenna elevation. The wide angular spreads indicate a multipath-rich spatial propagation at 6.75 GHz and 16.95 GHz, showing greater promise for the implementation of MIMO beamforming systems in the mid-band spectrum.
comment: 6 pages, 3 figures, 1 table, IEEE Wireless Communications and Networking Conference
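For readers unfamiliar with the angular-spread metrics named in the abstract above (ASD, ASA, ZSD, ZSA), the sketch below computes a power-weighted circular angular spread from a list of multipath angles and powers. It follows the circular-standard-deviation form commonly used in 3GPP-style channel modeling; whether the paper uses exactly this definition is an assumption, and the function name `circular_angular_spread_deg` and the sample values are hypothetical.

```python
import cmath
import math


def circular_angular_spread_deg(angles_deg, powers):
    """Power-weighted circular angular spread, in degrees.

    angles_deg: multipath component angles in degrees (azimuth or zenith)
    powers:     linear-scale (not dB) power of each multipath component
    """
    total_power = sum(powers)
    if total_power <= 0.0:
        raise ValueError("total power must be positive")
    # Power-weighted mean of unit phasors; its magnitude lies in [0, 1].
    resultant = sum(
        p * cmath.exp(1j * math.radians(a)) for a, p in zip(angles_deg, powers)
    ) / total_power
    magnitude = min(abs(resultant), 1.0)  # guard against rounding above 1
    if magnitude == 0.0:
        return math.inf  # phasors cancel exactly; spread is unbounded
    return math.degrees(math.sqrt(-2.0 * math.log(magnitude)))


if __name__ == "__main__":
    # Hypothetical example: three paths with linear powers.
    print(circular_angular_spread_deg([-20.0, 0.0, 35.0], [0.2, 1.0, 0.1]))
```

Working with unit phasors rather than raw angles keeps the result unaffected by the 0/360 degree wraparound, which matters for azimuth measurements.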
PIETRA: Physics-Informed Evidential Learning for Traversing Out-of-Distribution Terrain
Self-supervised learning is a powerful approach for developing traversability models for off-road navigation, but these models often struggle with inputs unseen during training. Existing methods utilize techniques like evidential deep learning to quantify model uncertainty, helping to identify and avoid out-of-distribution terrain. However, always avoiding out-of-distribution terrain can be overly conservative, e.g., when novel terrain can be effectively analyzed using a physics-based model. To overcome this challenge, we introduce Physics-Informed Evidential Traversability (PIETRA), a self-supervised learning framework that integrates physics priors directly into the mathematical formulation of evidential neural networks and introduces physics knowledge implicitly through an uncertainty-aware, physics-informed training loss. Our evidential network seamlessly transitions between learned and physics-based predictions for out-of-distribution inputs. Additionally, the physics-informed loss regularizes the learned model, ensuring better alignment with the physics model. Extensive simulations and hardware experiments demonstrate that PIETRA improves both learning accuracy and navigation performance in environments with significant distribution shifts.
comment: Submitted to RA-L. Video: https://youtu.be/OTnNZ96oJRk
Efficient Multi-agent Navigation with Lightweight DRL Policy
In this article, we present an end-to-end collision avoidance policy based on deep reinforcement learning (DRL) for multi-agent systems, demonstrating encouraging outcomes in real-world applications. In particular, our policy calculates the control commands of the agent based on the raw LiDAR observation. In addition, the number of parameters of the proposed basic model is 140,000, and the size of the parameter file is 3.5 MB, which allows the robot to calculate the actions from the CPU alone. We propose a multi-agent training platform based on a physics-based simulator to further bridge the gap between simulation and the real world. The policy is trained on a policy-gradients-based RL algorithm in a dense and messy training environment. A novel reward function is introduced to address the issue of agents choosing suboptimal actions in some common scenarios. Although the data used for training is exclusively from the simulation platform, the policy can be successfully transferred and deployed in real-world robots. Finally, our policy effectively responds to intentional obstructions and avoids collisions. The website is available at https://sites.google.com/view/xingrong2024efficient/%E9%A6%96%E9%A1%B5 .
Low-rank approximated Kalman filter using Oja's principal component flow for discrete-time linear systems
The Kalman filter is indispensable for state estimation across diverse fields but faces computational challenges with higher dimensions. Approaches such as Riccati equation approximations aim to alleviate this complexity, yet ensuring properties like bounded errors remains challenging. Yamada and Ohki introduced low-rank Kalman-Bucy filters for continuous-time systems, ensuring bounded errors. This paper proposes a discrete-time counterpart of the low-rank filter and shows its system theoretic properties and conditions for bounded mean square error estimation. Numerical simulations show the effectiveness of the proposed method.
comment: 6 pages, presented at SICE2024
Fast System Level Synthesis: Robust Model Predictive Control using Riccati Recursions
System level synthesis enables improved robust MPC formulations by allowing for joint optimization of the nominal trajectory and controller. This paper introduces a tailored algorithm for solving the corresponding disturbance feedback optimization problem for linear time-varying systems. The proposed algorithm iterates between optimizing the controller and the nominal trajectory while converging q-linearly to an optimal solution. We show that the controller optimization can be solved through Riccati recursions leading to a horizon-length, state, and input scalability of $\mathcal{O}(N^2 ( n_x^3 +n_u^3))$ for each iterate. On a numerical example, the proposed algorithm exhibits computational speedups by a factor of up to $10^3$ compared to general-purpose commercial solvers.
comment: Young Author Award (finalist): IFAC Conference on Nonlinear Model Predictive Control (NMPC) 2024
Large Language Models for Explainable Decisions in Dynamic Digital Twins
Dynamic data-driven Digital Twins (DDTs) can enable informed decision-making and provide an optimisation platform for the underlying system. By leveraging principles of Dynamic Data-Driven Applications Systems (DDDAS), DDTs can formulate computational modalities for feedback loops, model updates and decision-making, including autonomous ones. However, understanding autonomous decision-making often requires technical and domain-specific knowledge. This paper explores using large language models (LLMs) to provide an explainability platform for DDTs, generating natural language explanations of the system's decision-making by leveraging domain-specific knowledge bases. A case study from smart agriculture is presented.
comment: 9 pages, 3 figures, accepted by DDDAS2024 -- the 5th International Conference on Dynamic Data Driven Applications Systems
Bayesian Persuasion for Containing SIS Epidemics with Asymptomatic Infection
We investigate the strategic behavior of a large population of agents who decide whether to adopt a costly partially effective protection or remain unprotected against the susceptible-infected-susceptible epidemic. In contrast with most prior works on epidemic games, we assume that the agents are not aware of their true infection status while making decisions. We adopt the Bayesian persuasion framework where the agents receive a noisy signal regarding their true infection status, and maximize their expected utility computed using the posterior probability of being infected conditioned on the received signal. We characterize the stationary Nash equilibrium of this setting under suitable assumptions, and identify conditions under which partial information disclosure leads to a smaller proportion of infected individuals at the equilibrium compared to full information disclosure, and vice versa.
Deterministic Multistage Constellation Reconfiguration Using Integer Programming and Sequential Decision-Making Methods
In this paper, we address the problem of reconfiguring Earth observation satellite constellation systems through multiple stages. The Multi-stage Constellation Reconfiguration Problem (MCRP) aims to maximize the total observation rewards obtained by covering a set of targets of interest through the active manipulation of the orbits and relative phasing of constituent satellites. In this paper, we consider deterministic problem settings in which the targets of interest are known a priori. We propose a novel integer linear programming formulation for MCRP, capable of obtaining provably optimal solutions. To overcome computational intractability due to the combinatorial explosion in solving large-scale instances, we introduce two computationally efficient sequential decision-making methods based on the principles of a myopic policy and a rolling horizon procedure. The computational experiments demonstrate that the devised sequential decision-making approaches yield high-quality solutions with improved computational efficiency over the baseline MCRP. Finally, a case study using Hurricane Harvey data showcases the advantages of multi-stage constellation reconfiguration over single-stage and no-reconfiguration scenarios.
comment: 39 pages, 13 figures, Journal of Spacecraft and Rockets (Published)
Online Feedback Optimization and Singular Perturbation via Contraction Theory
In this paper, we provide a novel contraction-theoretic approach to analyze two-time scale systems, including those commonly encountered in Online Feedback Optimization (OFO). Our framework endows these systems with several robustness properties, enabling a more comprehensive characterization of their behaviors. The primary assumptions are the contractivity of the fast sub-system and the reduced model, along with an explicit upper bound on the time-scale parameter. For two-time scale systems subject to disturbances, we show that the distance between solutions of the nominal system and solutions of its reduced model is uniformly upper bounded by a function of contraction rates, Lipschitz constants, the time-scale parameter, and the variability of the disturbances over time. Applying these general results to the OFO context, we establish new individual tracking error bounds, showing that solutions converge to their time-varying optimizer, provided the plant and steady-state feedback controller exhibit contractivity and the controller gain is suitably bounded. Finally, we explore two special cases: for autonomous nonlinear systems, we derive sharper bounds than those in the general results, and for linear time-invariant systems, we present novel bounds based on induced matrix norms and induced matrix log norms.
comment: This paper has been submitted to SIAM Journal on Control and Optimization
Wasserstein speed limits for Langevin systems
Physical systems transition between states with finite speed that is limited by energetic costs. In this work, we derive bounds on transition times for general Langevin systems that admit a decomposition into reversible and irreversible dynamics, in terms of the Wasserstein distance between states and the energetic costs associated with respective reversible and irreversible currents. For illustration we discuss Brownian particles subject to arbitrary forcing and an RLC circuit with time-varying inductor.
comment: 10 pages, 2 figures
Multiagent Systems
CONClave -- Secure and Robust Cooperative Perception for CAVs Using Authenticated Consensus and Trust Scoring
Connected Autonomous Vehicles have great potential to improve automobile safety and traffic flow, especially in cooperative applications where perception data is shared between vehicles. However, this cooperation must be secured from malicious intent and unintentional errors that could cause accidents. Previous works typically address singular security or reliability issues for cooperative driving in specific scenarios rather than the set of errors together. In this paper, we propose CONClave, a tightly coupled authentication, consensus, and trust scoring mechanism that provides comprehensive security and reliability for cooperative perception in autonomous vehicles. CONClave benefits from the pipelined nature of the steps such that faults can be detected significantly faster and with less compute. Overall, CONClave shows huge promise in preventing security flaws, detecting even relatively minor sensing faults, and increasing the robustness and accuracy of cooperative perception in CAVs while adding minimal overhead.
comment: 6 pages, 6 figures, Design Automation Conference June 2024
A Survey on Emergent Language
The field of emergent language represents a novel area of research within the domain of artificial intelligence, particularly within the context of multi-agent reinforcement learning. Although the concept of studying language emergence is not new, early approaches were primarily concerned with explaining human language formation, with little consideration given to its potential utility for artificial agents. In contrast, studies based on reinforcement learning aim to develop communicative capabilities in agents that are comparable to or even superior to human language. Thus, they extend beyond the learned statistical representations that are common in natural language processing research. This gives rise to a number of fundamental questions, from the prerequisites for language emergence to the criteria for measuring its success. This paper addresses these questions by providing a comprehensive review of 181 scientific publications on emergent language in artificial intelligence. Its objective is to serve as a reference for researchers interested in or proficient in the field. Consequently, the main contributions are the definition and overview of the prevailing terminology, the analysis of existing evaluation methods and metrics, and the description of the identified research gaps.
Context-Aware Agent-based Model for Smart Long Distance Transport System
Long-distance transport plays a vital role in the economic growth of countries. However, few systems are being developed for monitoring and supporting long-route vehicles (LRV). Sustainable and context-aware transport systems with modern technologies are needed. We propose a model for long-distance vehicle transportation monitoring and support systems in a multi-agent environment. Our model incorporates the long-distance vehicle transport mechanism through agent-based modeling (ABM). The model follows the ABM design protocol called Overview, Design, and Details (ODD). In this model, every category of agents offers information as a service. Hence, a federation of services through a protocol for communication between sensors and software components is desired. Such integration of services supports monitoring and tracking of vehicles on the route. The model simulations provide useful results for the integration of services based on smart objects.
An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning
Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. Many approaches have been developed, but they can be divided into three main types: centralized training and execution (CTE), centralized training for decentralized execution (CTDE), and decentralized training and execution (DTE). CTDE methods are the most common as they can use centralized information during training but execute in a decentralized manner -- using only information available to that agent during execution. CTDE is the only paradigm that requires a separate training phase where any available information (e.g., other agent policies, underlying states) can be used. As a result, CTDE methods can be more scalable than CTE methods, do not require communication during execution, and can often perform well. CTDE fits most naturally with the cooperative case, but can potentially be applied in competitive or mixed settings depending on what information is assumed to be observed. This text is an introduction to CTDE in cooperative MARL. It is meant to explain the setting, basic concepts, and common methods. It does not cover all work in CTDE MARL as the subarea is quite extensive. I have included work that I believe is important for understanding the main concepts in the subarea and apologize to those that I have omitted.
comment: arXiv admin note: text overlap with arXiv:2405.06161
Partially Observable Multi-Agent Reinforcement Learning with Information Sharing ICML 2023
We study provable multi-agent reinforcement learning (RL) in the general framework of partially observable stochastic games (POSGs). To circumvent the known hardness results and the use of computationally intractable oracles, we advocate leveraging the potential information-sharing among agents, a common practice in empirical multi-agent RL, and a standard model for multi-agent control systems with communications. We first establish several computational complexity results to justify the necessity of information-sharing, as well as the observability assumption that has enabled quasi-efficient single-agent RL with partial observations, for efficiently solving POSGs. Inspired by the inefficiency of planning in the ground-truth model, we then propose to further approximate the shared common information to construct an approximate model of the POSG, in which planning an approximate equilibrium (in terms of solving the original POSG) can be quasi-efficient, i.e., of quasi-polynomial-time, under the aforementioned assumptions. Furthermore, we develop a partially observable multi-agent RL algorithm that is both statistically and computationally quasi-efficient. Finally, beyond equilibrium learning, we extend our algorithmic framework to finding the team-optimal solution in cooperative POSGs, i.e., decentralized partially observable Markov decision processes, a much more challenging goal. We establish concrete computational and sample complexities under several common structural assumptions of the model. We hope our study could open up the possibilities of leveraging and even designing different information structures, a well-studied notion in control theory, for developing both sample- and computation-efficient partially observable multi-agent RL.
comment: Journal extension of the conference version at ICML 2023. Changed to the more general reward function form, added new results for learning in Dec-POMDPs, and streamlined proof outlines
Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques
We initiate the study of Multi-Agent Reinforcement Learning from Human Feedback (MARLHF), exploring both theoretical foundations and empirical validations. We define the task as identifying Nash equilibrium from a preference-only offline dataset in general-sum games, a problem marked by the challenge of sparse feedback signals. Our theory establishes the upper complexity bounds for Nash Equilibrium in effective MARLHF, demonstrating that single-policy coverage is inadequate and highlighting the importance of unilateral dataset coverage. These theoretical insights are verified through comprehensive experiments. To enhance the practical performance, we further introduce two algorithmic techniques. (1) We propose a Mean Squared Error (MSE) regularization along the time axis to achieve a more uniform reward distribution and improve reward learning outcomes. (2) We utilize imitation learning to approximate the reference policy, ensuring stability and effectiveness in training. Our findings underscore the multifaceted approach required for MARLHF, paving the way for effective preference-based multi-agent systems.
Liquid-Graph Time-Constant Network for Multi-Agent Systems Control
In this paper, we propose the Liquid-Graph Time-constant (LGTC) network, a continuous graph neural network (GNN) model for control of multi-agent systems based on the recent Liquid Time Constant (LTC) network. We analyse its stability leveraging contraction analysis and propose a closed-form model that preserves the model contraction rate and does not require solving an ODE at each iteration. Compared to discrete models like Graph Gated Neural Networks (GGNNs), the higher expressivity of the proposed model guarantees remarkable performance while reducing the large amount of communicated variables normally required by GNNs. We evaluate our model on a distributed multi-agent control case study (flocking) taking into account variable communication range and scalability under non-instantaneous communication.
comment: arXiv admin note: text overlap with arXiv:2305.19235
Robotics
Coaching a Robotic Sonographer: Learning Robotic Ultrasound with Sparse Expert's Feedback
Ultrasound is widely employed for clinical intervention and diagnosis, due to its advantages of offering non-invasive, radiation-free, and real-time imaging. However, the accessibility of this dexterous procedure is limited due to the substantial training and expertise required of operators. The robotic ultrasound (RUS) offers a viable solution to address this limitation; nonetheless, achieving human-level proficiency remains challenging. Learning from demonstrations (LfD) methods have been explored in RUS, which learn the policy prior from a dataset of offline demonstrations to encode the mental model of the expert sonographer. However, active engagement of experts, i.e., coaching, during the training of RUS has not been explored thus far. Coaching is known for enhancing efficiency and performance in human training. This paper proposes a coaching framework for RUS to amplify its performance. The framework combines DRL (self-supervised practice) with sparse expert's feedback through coaching. The DRL employs an off-policy Soft Actor-Critic (SAC) network, with a reward based on image quality rating. The coaching by experts is modeled as a Partially Observable Markov Decision Process (POMDP), which updates the policy parameters based on the correction by the expert. The validation study on phantoms showed that coaching increases the learning rate by $25\%$ and the number of high-quality image acquisitions by $74.5\%$.
comment: Accepted in IEEE Transactions on Medical Robotics and Bionics (TMRB) 2024
YoloTag: Vision-based Robust UAV Navigation with Fiducial Markers
By harnessing fiducial markers as visual landmarks in the environment, Unmanned Aerial Vehicles (UAVs) can rapidly build precise maps and navigate spaces safely and efficiently, unlocking their potential for fluent collaboration and coexistence with humans. Existing fiducial marker methods rely on handcrafted feature extraction, which sacrifices accuracy. On the other hand, deep learning pipelines for marker detection fail to meet real-time runtime constraints crucial for navigation applications. In this work, we propose YoloTag, a real-time fiducial marker-based localization system. YoloTag uses a lightweight YOLO v8 object detector to accurately detect fiducial markers in images while meeting the runtime constraints needed for navigation. The detected markers are then used by an efficient perspective-n-point algorithm to estimate UAV states. However, this localization system introduces noise, causing instability in trajectory tracking. To suppress noise, we design a higher-order Butterworth filter that effectively eliminates noise through frequency domain analysis. We evaluate our algorithm through real-robot experiments in an indoor environment, comparing the trajectory tracking performance of our method against other approaches in terms of several distance metrics.
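The noise-suppression step can be illustrated with a minimal sketch. The abstract does not state the filter order, cutoff frequency, or sample rate, so the code below assumes a single second-order low-pass Butterworth section obtained with the bilinear transform; `cutoff_hz`, `sample_hz`, and both function names are hypothetical and are not taken from YoloTag.

```python
import math

def butterworth_lowpass_coeffs(cutoff_hz, sample_hz):
    """Second-order low-pass Butterworth coefficients (bilinear transform)."""
    k = math.tan(math.pi * cutoff_hz / sample_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)                      # feed-forward terms
    a = (2.0 * (k * k - 1.0) * norm,            # feedback term for y[n-1]
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)  # feedback term for y[n-2]
    return b, a

def lowpass(samples, cutoff_hz, sample_hz):
    """Filter one coordinate of a pose stream; the filter state starts at zero."""
    b, a = butterworth_lowpass_coeffs(cutoff_hz, sample_hz)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out
```

Each position coordinate would be filtered independently, for example `lowpass(x_positions, cutoff_hz=2.0, sample_hz=30.0)` with assumed values. A higher-order filter, as the abstract describes, could be approximated by cascading several such sections with appropriately chosen pole pairs.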
Visual Servoing for Robotic On-Orbit Servicing: A Survey
On-orbit servicing (OOS) activities will power the next big step for sustainable exploration and commercialization of space. Developing robotic capabilities for autonomous OOS operations is a priority for the space industry. Visual Servoing (VS) enables robots to achieve the precise manoeuvres needed for critical OOS missions by utilizing visual information for motion control. This article presents an overview of existing VS approaches for autonomous OOS operations with space manipulator systems (SMS). We divide the approaches according to their contribution to the typical phases of a robotic OOS mission: a) Recognition, b) Approach, and c) Contact. We also present a discussion on the reviewed VS approaches, identifying current trends. Finally, we highlight the challenges and areas for future research on VS techniques for robotic OOS.
comment: Accepted for publication at the 2024 International Conference on Space Robotics (iSpaRo)
Investigating Mixed Reality for Communication Between Humans and Mobile Manipulators
This article investigates mixed reality (MR) to enhance human-robot collaboration (HRC). The proposed solution adopts MR as a communication layer to convey a mobile manipulator's intentions and upcoming actions to the humans with whom it interacts, thus improving their collaboration. A user study involving 20 participants demonstrated the effectiveness of this MR-focused approach in facilitating collaborative tasks, with a positive effect on overall collaboration performances and human satisfaction.
comment: This paper has been published in the Proceedings of the 2024 IEEE International Conference on Human and Robot Interactive Communication (RO-MAN), Pasadena, CA, USA, August 2024
Kinesthetic Teaching in Robotics: a Mixed Reality Approach
As collaborative robots become more common in manufacturing scenarios and adopted in hybrid human-robot teams, we should develop new interaction and communication strategies to ensure smooth collaboration between agents. In this paper, we propose a novel communicative interface that uses Mixed Reality as a medium to perform Kinesthetic Teaching (KT) on any robotic platform. We evaluate our proposed approach in a user study involving multiple subjects and two different robots, comparing traditional physical KT with holographic-based KT through user experience questionnaires and task-related metrics.
comment: This paper has been published in the Proceedings of the 2024 IEEE International Conference on Human and Robot Interactive Communication (RO-MAN), Pasadena, CA, USA, August 2024
Unsupervised Welding Defect Detection Using Audio And Video
In this work we explore the application of AI to robotic welding. Robotic welding is a widely used technology in many industries, but robots currently do not have the capability to detect welding defects which get introduced due to various reasons in the welding process. We describe how deep-learning methods can be applied to detect weld defects in real-time by recording the welding process with microphones and a camera. Our findings are based on a large database with more than 4000 welding samples we collected which covers different weld types, materials and various defect categories. All deep learning models are trained in an unsupervised fashion because the space of possible defects is large and the defects in our data may contain biases. We demonstrate that a reliable real-time detection of most categories of weld defects is feasible both from audio and video, with improvements achieved by combining both modalities. Specifically, the multi-modal approach achieves an average Area-under-ROC-Curve (AUC) of 0.92 over all eleven defect types in our data. We conclude the paper with an analysis of the results by defect type and a discussion of future work.
comment: 21 pages
SlipNet: Slip Cost Map for Autonomous Navigation on Heterogeneous Deformable Terrains
Autonomous space rovers face significant challenges when navigating deformable and heterogeneous terrains during space exploration. The variability in terrain types, influenced by different soil properties, often results in severe wheel slip, compromising navigation efficiency and potentially leading to entrapment. This paper proposes SlipNet, an approach for predicting slip in segmented regions of heterogeneous deformable terrain surfaces to enhance navigation algorithms. Unlike previous methods, SlipNet does not depend on prior terrain classification, reducing prediction errors and misclassifications through dynamic terrain segmentation and slip assignment during deployment while maintaining a history of terrain classes. This adaptive reclassification mechanism has improved prediction performance. Extensive simulation results demonstrate that our model (DeepLab v3+ + SlipNet) achieves better slip prediction performance than the TerrainNet, with a lower mean absolute error (MAE) in five terrain sample tests.
GraspSplats: Efficient Manipulation with 3D Feature Splatting
The ability for robots to perform efficient and zero-shot grasping of object parts is crucial for practical applications and is becoming prevalent with recent advances in Vision-Language Models (VLMs). To bridge the 2D-to-3D gap for representations to support such a capability, existing methods rely on neural fields (NeRFs) via differentiable rendering or point-based projection methods. However, we demonstrate that NeRFs are inappropriate for scene changes due to their implicitness and point-based methods are inaccurate for part localization without rendering-based optimization. To amend these issues, we propose GraspSplats. Using depth supervision and a novel reference feature computation method, GraspSplats generates high-quality scene representations in under 60 seconds. We further validate the advantages of Gaussian-based representation by showing that the explicit and optimized geometry in GraspSplats is sufficient to natively support (1) real-time grasp sampling and (2) dynamic and articulated object manipulation with point trackers. With extensive experiments on a Franka robot, we demonstrate that GraspSplats significantly outperforms existing methods under diverse task settings. In particular, GraspSplats outperforms NeRF-based methods like F3RM and LERF-TOGO, and 2D detection methods.
comment: Project webpage: https://graspsplats.github.io/
A Modern Take on Visual Relationship Reasoning for Grasp Planning
Interacting with real-world cluttered scenes poses several challenges to robotic agents that need to understand complex spatial dependencies among the observed objects to determine optimal pick sequences or efficient object retrieval strategies. Existing solutions typically manage simplified scenarios and focus on predicting pairwise object relationships following an initial object detection phase, but often overlook the global context or struggle with handling redundant and missing object relations. In this work, we present a modern take on visual relational reasoning for grasp planning. We introduce D3GD, a novel testbed that includes bin picking scenes with up to 35 objects from 97 distinct categories. Additionally, we propose D3G, a new end-to-end transformer-based dependency graph generation model that simultaneously detects objects and produces an adjacency matrix representing their spatial relationships. Recognizing the limitations of standard metrics, we employ the Average Precision of Relationships for the first time to evaluate model performance, conducting an extensive experimental benchmark. The obtained results establish our approach as the new state-of-the-art for this task, laying the foundation for future research in robotic manipulation. We publicly release the code and dataset at https://paolotron.github.io/d3g.github.io.
Planning to avoid ambiguous states through Gaussian approximations to non-linear sensors in active inference agents
In nature, active inference agents must learn how observations of the world represent the state of the agent. In engineering, the physics behind sensors is often known reasonably accurately and measurement functions can be incorporated into generative models. When a measurement function is non-linear, the transformed variable is typically approximated with a Gaussian distribution to ensure tractable inference. We show that Gaussian approximations that are sensitive to the curvature of the measurement function, such as a second-order Taylor approximation, produce a state-dependent ambiguity term. This induces a preference over states, based on how accurately the state can be inferred from the observation. We demonstrate this preference with a robot navigation experiment where agents plan trajectories.
comment: 13 pages, 3 figures. Accepted to the International Workshop on Active Inference 2024
Learning Resilient Formation Control of Drones with Graph Attention Network
The rapid advancement of drone technology has significantly impacted various sectors, including search and rescue, environmental surveillance, and industrial inspection. Multidrone systems offer notable advantages such as enhanced efficiency, scalability, and redundancy over single-drone operations. Despite these benefits, ensuring resilient formation control in dynamic and adversarial environments, such as under communication loss or cyberattacks, remains a significant challenge. Classical approaches to resilient formation control, while effective in certain scenarios, often struggle with complex modeling and the curse of dimensionality, particularly as the number of agents increases. This paper proposes a novel, learning-based formation control for enhancing the adaptability and resilience of multidrone formations using graph attention networks (GATs). By leveraging GAT's dynamic capabilities to extract internode relationships based on the attention mechanism, this GAT-based formation controller significantly improves the robustness of drone formations against various threats, such as Denial of Service (DoS) attacks. Our approach not only improves formation performance in normal conditions but also ensures the resilience of multidrone systems in variable and adversarial environments. Extensive simulation results demonstrate the superior performance of our method over baseline formation controllers. Furthermore, the physical experiments validate the effectiveness of the trained control policy in real-world flights.
comment: This work has been submitted to the IEEE for possible publication
Integration of Augmented Reality and Mobile Robot Indoor SLAM for Enhanced Spatial Awareness
This research explores the integration of indoor Simultaneous Localization and Mapping (SLAM) with Augmented Reality (AR) to enhance situational awareness, improving safety in hazardous or emergency situations. The main contribution of this work is to enable mobile robots to provide real-time spatial perception to users who are not co-located with the robot. This is a comprehensive approach, including selecting suitable sensors for indoor SLAM, designing and building a platform, developing methods to display maps on AR devices, implementing this into software on an AR device, and improving the robustness of communication and localization between the robot and AR device in real-world testing. By taking this approach and analyzing each component of the integrated system, this paper highlights numerous areas for future research that can further advance the integration of SLAM and AR technologies. These advancements aim to significantly improve safety and efficiency during rescue operations.
comment: 8 pages
Securing Federated Learning in Robot Swarms using Blockchain Technology
Federated learning is a new approach to distributed machine learning that offers potential advantages such as reducing communication requirements and distributing the costs of training algorithms. Therefore, it could hold great promise in swarm robotics applications. However, federated learning usually requires a centralized server for the aggregation of the models. In this paper, we present a proof-of-concept implementation of federated learning in a robot swarm that does not compromise decentralization. To do so, we use blockchain technology to enable our robot swarm to securely synchronize a shared model that is the aggregation of the individual models without relying on a central server. We then show that introducing a single malfunctioning robot can, however, heavily disrupt the training process. To prevent such situations, we devise protection mechanisms that are implemented through secure and tamper-proof blockchain smart contracts. Our experiments are conducted in ARGoS, a physics-based simulator for swarm robotics, using the Ethereum blockchain protocol which is executed by each simulated robot.
comment: To be published in the Proceedings of the 17th International Symposium on Distributed Autonomous Robotic Systems (DARS 2024)
Three-dimensional geometric resolution of the inverse kinematics of a 7 degree of freedom articulated arm
This work presents a three-dimensional geometric resolution method to calculate the complete inverse kinematics of a 7-degree-of-freedom articulated arm, including the hand itself. The method is classified as an analytical method with geometric solution, since it obtains a precise solution in a closed number of steps, converting the inverse kinematic problem into a three-dimensional geometric model. To simplify the problem, the kinematic decoupling method is used, so that the position of the wrist is calculated independently on one hand with information on the orientation of the hand, and the angles of the rest of the arm are calculated from the wrist.
comment: in Spanish language
Mapping Safe Zones for Co-located Human-UAV Interaction
Recent advances in robotics bring us closer to the reality of living, co-habiting, and sharing personal spaces with robots. However, it is not clear how close a co-located robot can be to a human in a shared environment without making the human uncomfortable or anxious. This research aims to map safe and comfortable zones for co-located aerial robots. The objective is to identify the distances at which a drone causes discomfort to a co-located human and to create a map showing no-fly, moderate-fly, and safe-fly zones. We recruited a total of 18 participants and conducted two indoor laboratory experiments, one with a single drone and the other set with two drones. Our results show that multiple drones cause more discomfort when close to a co-located human than a single drone. We observed that distances below 200 cm caused discomfort, the moderate fly zone was 200 - 300 cm, and the safe-fly zone was any distance greater than 300 cm in single drone experiments. The safe zones were pushed further away by 100 cm for the multiple drone experiments. In this paper, we present the preliminary findings on safe-fly zones for multiple drones. Further work would investigate the impact of a higher number of aerial robots, the speed of approach, direction of travel, and noise level on co-located humans, and autonomously develop 3D models of trust zones and safe zones for co-located aerial swarms.
comment: This paper has been accepted for presentation at the Second International Symposium on Trustworthy Autonomous Systems (TAS '24), being held from September 16-18, 2024, at Austin, TX, USA. It consists of 10 pages, 9 figures, and 1 table
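The reported distance thresholds can be written as a small lookup, sketched below. The function name and zone labels are illustrative. Shifting both boundaries by 100 cm in the multi-drone case is an assumption drawn from the statement that the safe zones were "pushed further away by 100 cm"; the abstract does not spell out each shifted boundary.

```python
def classify_zone(distance_cm, multiple_drones=False):
    """Map a human-drone distance (cm) to a zone label from the reported thresholds."""
    # Assumption: with multiple drones, every boundary moves out by 100 cm.
    offset = 100.0 if multiple_drones else 0.0
    if distance_cm < 200.0 + offset:
        return "no-fly"
    if distance_cm <= 300.0 + offset:
        return "moderate-fly"
    return "safe-fly"
```

For example, `classify_zone(250)` falls in the moderate-fly band for a single drone, while the same distance with `multiple_drones=True` falls in the no-fly band under the stated assumption.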
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation
Representing robotic manipulation tasks as constraints that associate the robot and the environment is a promising way to encode desired robot behaviors. However, it remains unclear how to formulate the constraints such that they are 1) versatile to diverse tasks, 2) free of manual labeling, and 3) optimizable by off-the-shelf solvers to produce robot actions in real-time. In this work, we introduce Relational Keypoint Constraints (ReKep), a visually-grounded representation for constraints in robotic manipulation. Specifically, ReKep is expressed as Python functions mapping a set of 3D keypoints in the environment to a numerical cost. We demonstrate that by representing a manipulation task as a sequence of Relational Keypoint Constraints, we can employ a hierarchical optimization procedure to solve for robot actions (represented by a sequence of end-effector poses in SE(3)) with a perception-action loop at a real-time frequency. Furthermore, in order to circumvent the need for manual specification of ReKep for each new task, we devise an automated procedure that leverages large vision models and vision-language models to produce ReKep from free-form language instructions and RGB-D observations. We present system implementations on a wheeled single-arm platform and a stationary dual-arm platform that can perform a large variety of manipulation tasks, featuring multi-stage, in-the-wild, bimanual, and reactive behaviors, all without task-specific data or environment models. Website at https://rekep-robot.github.io.
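To make the idea of a constraint "expressed as a Python function mapping a set of 3D keypoints to a numerical cost" concrete, here is a minimal sketch. The constraint itself (one keypoint hovering a fixed height above another), the function names, and the parameter values are hypothetical illustrations and are not taken from the ReKep code base.

```python
import math

def above_target_cost(keypoints, mover=0, target=1, height=0.10):
    """Zero when keypoint `mover` sits `height` metres directly above `target`."""
    tx, ty, tz = keypoints[target]
    desired = (tx, ty, tz + height)
    return math.dist(keypoints[mover], desired)

def total_cost(keypoints, constraints):
    """Sum the costs of all constraints for one candidate keypoint layout."""
    return sum(constraint(keypoints) for constraint in constraints)
```

A solver would then search over end-effector poses that move the keypoints so that `total_cost` approaches zero; that optimization loop is outside the scope of this sketch.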
BEVNav: Robot Autonomous Navigation Via Spatial-Temporal Contrastive Learning in Bird's-Eye View
Goal-driven mobile robot navigation in map-less environments requires effective state representations for reliable decision-making. Inspired by the favorable properties of Bird's-Eye View (BEV) in point clouds for visual perception, this paper introduces a novel navigation approach named BEVNav. It employs deep reinforcement learning to learn BEV representations and enhance decision-making reliability. First, we propose a self-supervised spatial-temporal contrastive learning approach to learn BEV representations. Spatially, two randomly augmented views from a point cloud predict each other, enhancing spatial features. Temporally, we combine the current observation with consecutive frames' actions to predict future features, establishing the relationship between observation transitions and actions to capture temporal cues. Then, incorporating this spatial-temporal contrastive learning in the Soft Actor-Critic reinforcement learning framework, our BEVNav offers a superior navigation policy. Extensive experiments demonstrate BEVNav's robustness in environments with dense pedestrians, outperforming state-of-the-art methods across multiple benchmarks. The code will be made publicly available at https://github.com/LanrenzzzZ/BEVNav.
SafeEmbodAI: a Safety Framework for Mobile Robots in Embodied AI Systems
Embodied AI systems, including AI-powered robots that autonomously interact with the physical world, stand to be significantly advanced by Large Language Models (LLMs), which enable robots to better understand complex language commands and perform advanced tasks with enhanced comprehension and adaptability, highlighting their potential to improve embodied AI capabilities. However, this advancement also introduces safety challenges, particularly in robotic navigation tasks. Improper safety management can lead to failures in complex environments and make the system vulnerable to malicious command injections, resulting in unsafe behaviours such as detours or collisions. To address these issues, we propose SafeEmbodAI, a safety framework for integrating mobile robots into embodied AI systems. SafeEmbodAI incorporates secure prompting, state management, and safety validation mechanisms to secure and assist LLMs in reasoning through multi-modal data and validating responses. We designed a metric to evaluate mission-oriented exploration, and evaluations in simulated environments demonstrate that our framework effectively mitigates threats from malicious commands and improves performance in various environment settings, ensuring the safety of embodied AI systems. Notably, in complex environments with mixed obstacles, our method demonstrates a significant performance increase of 267% compared to the baseline in attack scenarios, highlighting its robustness in challenging conditions.
High Precision Positioning System
SAPPO is a high-precision, low-cost and highly scalable indoor localization system. The system is designed using modified HC-SR04 ultrasound transducers as a base to be used as distance meters between beacons and mobile robots. Additionally, it has a very unusual arrangement of its elements, such that the beacons and the array of transmitters of the mobile robot are located in very close planes, in a horizontal emission arrangement, parallel to the ground, achieving a range per transducer of almost 12 meters. SAPPO represents a significant leap forward in ultrasound localization systems, in terms of reducing the density of beacons while maintaining average precision in the millimeter range.
comment: in Spanish language
GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting
Dense colored point clouds enhance visual perception and are of significant value in various robotic applications. However, existing learning-based point cloud upsampling methods are constrained by computational resources and batch processing strategies, which often require subdividing point clouds into smaller patches, leading to distortions that degrade perceptual quality. To address this challenge, we propose a novel 2D-3D hybrid colored point cloud upsampling framework (GaussianPU) based on 3D Gaussian Splatting (3DGS) for robotic perception. This approach leverages 3DGS to bridge 3D point clouds with their 2D rendered images in robot vision systems. A dual scale rendered image restoration network transforms sparse point cloud renderings into dense representations, which are then input into 3DGS along with precise robot camera poses and interpolated sparse point clouds to reconstruct dense 3D point clouds. We have made a series of enhancements to the vanilla 3DGS, enabling precise control over the number of points and significantly boosting the quality of the upsampled point cloud for robotic scene understanding. Our framework supports processing entire point clouds on a single consumer-grade GPU, such as the NVIDIA GeForce RTX 3090, eliminating the need for segmentation and thus producing high-quality, dense colored point clouds with millions of points for robot navigation and manipulation tasks. Extensive experimental results on generating million-level point cloud data validate the effectiveness of our method, substantially improving the quality of colored point clouds and demonstrating significant potential for applications involving large-scale point clouds in autonomous robotics and human-robot interaction scenarios.
comment: 7 pages, 5 figures
PR2: A Physics- and Photo-realistic Testbed for Embodied AI and Humanoid Robots
This paper presents the development of a Physics-realistic and Photo-realistic humanoid robot testbed, PR2, to facilitate collaborative research between Embodied Artificial Intelligence (Embodied AI) and robotics. PR2 offers high-quality scene rendering and robot dynamic simulation, enabling (i) the creation of diverse scenes using various digital assets, (ii) the integration of advanced perception or foundation models, and (iii) the implementation of planning and control algorithms for dynamic humanoid robot behaviors based on environmental feedback. The beta version of PR2 has been deployed for the simulation track of a nationwide full-size humanoid robot competition for college students, attracting 137 teams and over 400 participants within four months. This competition covered traditional tasks in bipedal walking, as well as novel challenges in loco-manipulation and language-instruction-based object search, marking a first for public college robotics competitions. A retrospective analysis of the competition suggests that future events should emphasize the integration of locomotion with manipulation and perception. By making the PR2 testbed publicly available at https://github.com/pr2-humanoid/PR2-Platform, we aim to further advance education and training in humanoid robotics.
DOB-based Wind Estimation of A UAV Using Its Onboard Sensor
Unmanned Aerial Vehicles (UAVs) play a crucial role in meteorological research, particularly in environmental wind field measurements. However, several challenges exist in current wind measurement methods using UAVs that need to be addressed. Firstly, the accuracy of measurement is low, and the measurement range is limited. Secondly, the algorithms employed lack robustness and adaptability across different UAV platforms. Thirdly, there are limited approaches available for wind estimation during dynamic flight. Finally, while horizontal plane measurements are feasible, vertical direction estimation is often missing. To tackle these challenges, we present and implement a comprehensive wind estimation algorithm. Our algorithm offers several key features, including the capability to estimate the 3-D wind vector, enabling wind estimation even during dynamic flight of the UAV. Furthermore, our algorithm exhibits adaptability across various UAV platforms. Experimental results in the wind tunnel validate the effectiveness of our algorithm, showcasing improvements such as wind speed accuracy of $0.11$ m/s and wind direction errors of less than $2.8^\circ$. Additionally, our approach extends the measurement range to $10$ m/s.
Situation-aware Autonomous Driving Decision Making with Cooperative Perception on Demand
This paper investigates the impact of cooperative perception on autonomous driving decision making on urban roads. The extended perception range contributed by the cooperative perception can be properly leveraged to address the implicit dependencies within the vehicles, thereby the vehicle decision making performance can be improved. Meanwhile, we acknowledge the inherent limitation of wireless communication and propose a Cooperative Perception on Demand (CPoD) strategy, where the cooperative perception will only be activated when the extended perception range is necessary for proper situation-awareness. The situation-aware decision making with CPoD is modeled as a Partially Observable Markov Decision Process (POMDP) and solved in an online manner. The evaluation results demonstrate that the proposed approach can function safely and efficiently for autonomous driving on urban roads.
Open-vocabulary Temporal Action Localization using VLMs
Video action localization aims to find the timings of a specific action in a long video. Although existing learning-based approaches have been successful, they require annotated videos, which come with a considerable labor cost. This paper proposes a learning-free, open-vocabulary approach based on emerging off-the-shelf vision-language models (VLMs). The challenge stems from the fact that VLMs are neither designed to process long videos nor tailored for finding actions. We overcome these problems by extending an iterative visual prompting technique. Specifically, we sample video frames into a concatenated image with frame index labels, making a VLM guess a frame that is considered to be closest to the start/end of the action. Iterating this process by narrowing a sampling time window results in finding a specific frame of start and end of an action. We demonstrate that this sampling technique yields reasonable results, illustrating a practical extension of VLMs for understanding videos. Sample code is available at https://microsoft.github.io/VLM-Video-Action-Localization/.
comment: 7 pages, 5 figures, 4 tables. Last updated on September 3rd, 2024
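The iterative window-narrowing idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `pick_index` is a hypothetical callable standing in for the VLM query (it receives the timestamps of the sampled frames and returns the index of the frame judged closest to the action boundary), and `n_samples`, `shrink`, and `iterations` are assumed parameters that the abstract does not specify.

```python
def localize_frame(pick_index, start_s, end_s, n_samples=8, shrink=0.5, iterations=4):
    """Narrow [start_s, end_s] around the sampled frame chosen at each round."""
    lo, hi = start_s, end_s
    best = (lo + hi) / 2.0
    for _ in range(iterations):
        # Sample evenly spaced timestamps across the current window.
        step = (hi - lo) / (n_samples - 1)
        times = [lo + i * step for i in range(n_samples)]
        # Hypothetical stand-in for the VLM choosing one labelled frame.
        best = times[pick_index(times)]
        # Re-centre a smaller window on the chosen frame, clamped to the video.
        half = (hi - lo) * shrink / 2.0
        lo = max(start_s, best - half)
        hi = min(end_s, best + half)
    return best
```

Running the routine once for the action start and once for the action end would give both boundaries. In practice the chooser would tile the sampled frames into one image with index labels before querying the model, as the abstract describes.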
Expansion-GRR: Efficient Generation of Smooth Global Redundancy Resolution Roadmaps IROS
Global redundancy resolution (GRR) roadmaps are a novel concept in robotics that facilitates the mapping from task space paths to configuration space paths in a legible, predictable, and repeatable way. Such roadmaps could find widespread utility in applications such as safe teleoperation, consistent path planning, and motion primitives generation. However, previous methods to compute GRR roadmaps often necessitate a lengthy computation time and produce non-smooth paths, limiting their practical efficacy. To address this challenge, we introduce a novel method, Expansion-GRR, that leverages efficient configuration space projections and enables rapid generation of smooth roadmaps that satisfy the task constraints. Additionally, we propose a simple multi-seed strategy that further enhances the final quality. We conducted experiments in simulation with a 5-link planar manipulator and a Kinova arm. We were able to generate the Expansion-GRR roadmaps up to 2 orders of magnitude faster while achieving higher smoothness. We also demonstrate the utility of the GRR roadmaps in teleoperation tasks where our method outperformed prior methods and reactive IK solvers in terms of success rate and solution quality.
comment: Accepted by International Conference on Intelligent Robots and Systems (IROS), 2024
Behavioral Learning of Dish Rinsing and Scrubbing based on Interruptive Direct Teaching Considering Assistance Rate
Robots are expected to manipulate objects in a safe and dexterous way. For example, washing dishes is a dexterous operation that involves scrubbing the dishes with a sponge and rinsing them with water. It is necessary to learn it safely without splashing water and without dropping the dishes. In this study, we propose a safe and dexterous manipulation system. The robot learns a dynamics model of the object by estimating the state of the object and the robot itself, the control input, and the amount of human assistance required (assistance rate) after the human corrects the initial trajectory of the robot's hands by interruptive direct teaching. By backpropagating the error between the estimated and the reference value using the acquired dynamics model, the robot can generate a control input that approaches the reference value, for example, so that human assistance is not required and the dish does not move excessively. This allows for adaptive rinsing and scrubbing of dishes with unknown shapes and properties. As a result, it is possible to generate safe actions that require less human assistance.
comment: Accepted at Advanced Robotics
OceanGPT: A Large Language Model for Ocean Science Tasks ACL2024
Ocean science, which delves into the oceans that are reservoirs of life and biodiversity, is of great significance given that oceans cover over 70% of our planet's surface. Recently, advances in Large Language Models (LLMs) have transformed the paradigm in science. Despite the success in other domains, current LLMs often fall short in catering to the needs of domain experts like oceanographers, and the potential of LLMs for ocean science is under-explored. The intrinsic reasons are the immense and intricate nature of ocean data as well as the necessity for higher granularity and richness in knowledge. To alleviate these issues, we introduce OceanGPT, the first-ever large language model in the ocean domain, which is expert in various ocean science tasks. We also propose OceanGPT, a novel framework to automatically obtain a large volume of ocean domain instruction data, which generates instructions based on multi-agent collaboration. Additionally, we construct the first oceanography benchmark, OceanBench, to evaluate the capabilities of LLMs in the ocean domain. Through comprehensive experiments, OceanGPT not only shows a higher level of knowledge expertise for ocean science tasks but also gains preliminary embodied intelligence capabilities in ocean technology.
comment: ACL2024. Project Website: http://oceangpt.zjukg.cn/
FMB: a Functional Manipulation Benchmark for Generalizable Robotic Learning
In this paper, we propose a real-world benchmark for studying robotic learning in the context of functional manipulation: a robot needs to accomplish complex long-horizon behaviors by composing individual manipulation skills in functionally relevant ways. The core design principles of our Functional Manipulation Benchmark (FMB) emphasize a harmonious balance between complexity and accessibility. Tasks are deliberately scoped to be narrow, ensuring that models and datasets of manageable scale can be utilized effectively to track progress. Simultaneously, they are diverse enough to pose a significant generalization challenge. Furthermore, the benchmark is designed to be easily replicable, encompassing all essential hardware and software components. To achieve this goal, FMB consists of a variety of 3D-printed objects designed for easy and accurate replication by other researchers. The objects are procedurally generated, providing a principled framework to study generalization in a controlled fashion. We focus on fundamental manipulation skills, including grasping, repositioning, and a range of assembly behaviors. The FMB can be used to evaluate methods for acquiring individual skills, as well as methods for combining and ordering such skills to solve complex, multi-stage manipulation tasks. We also offer an imitation learning framework that includes a suite of policies trained to solve the proposed tasks. This enables researchers to utilize our tasks as a versatile toolkit for examining various parts of the pipeline. For example, researchers could propose a better design for a grasping controller and evaluate it in combination with our baseline reorientation and assembly policies as part of a pipeline for solving multi-stage tasks. Our dataset, object CAD files, code, and evaluation videos can be found on our project website: https://functional-manipulation-benchmark.github.io
comment: IJRR 2024
PIE: Parkour with Implicit-Explicit Learning Framework for Legged Robots
Parkour presents a highly challenging task for legged robots, requiring them to traverse various terrains with agile and smooth locomotion. This necessitates comprehensive understanding of both the robot's own state and the surrounding terrain, despite the inherent unreliability of robot perception and actuation. Current state-of-the-art methods either rely on complex pre-trained high-level terrain reconstruction modules or limit the maximum potential of robot parkour to avoid failure due to inaccurate perception. In this paper, we propose a one-stage end-to-end learning-based parkour framework: Parkour with Implicit-Explicit learning framework for legged robots (PIE) that leverages dual-level implicit-explicit estimation. With this mechanism, even a low-cost quadruped robot equipped with an unreliable egocentric depth camera can achieve exceptional performance on challenging parkour terrains using a relatively simple training process and reward function. While the training process is conducted entirely in simulation, our real-world validation demonstrates successful zero-shot deployment of our framework, showcasing superior parkour performance on harsh terrains.
comment: Accepted for IEEE Robotics and Automation Letters (RA-L)
GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation
In autonomous robotics, measurement of the robot's internal state and perception of its environment, including interaction with other agents such as collaborative robots, are essential. Estimating the pose of the robot arm from a single view has the potential to replace classical eye-to-hand calibration approaches and is particularly attractive for online estimation and dynamic environments. In addition to its pose, recovering the robot configuration provides a complete spatial understanding of the observed robot that can be used to anticipate the actions of other agents in advanced robotics use cases. Furthermore, this additional redundancy enables the planning and execution of recovery protocols in case of sensor failures or external disturbances. We introduce GISR - a deep configuration and robot-to-camera pose estimation method that prioritizes execution in real-time. GISR consists of two modules: (i) a geometric initialization module that efficiently computes an approximate robot pose and configuration, and (ii) a deep iterative silhouette-based refinement module that arrives at a final solution in just a few iterations. We evaluate GISR on publicly available data and show that it outperforms existing methods of the same class in terms of both speed and accuracy, and can compete with approaches that rely on ground-truth proprioception and recover only the pose.
comment: IEEE Robotics and Automation Letters (under revision), code available at http://github.com/iwhitey/GISR-robot
Joint Pedestrian Trajectory Prediction through Posterior Sampling
Joint pedestrian trajectory prediction has long grappled with the inherent unpredictability of human behaviors. Recent investigations employing variants of conditional diffusion models in trajectory prediction have exhibited notable success. Nevertheless, the heavy dependence on accurate historical data results in their vulnerability to noise disturbances and data incompleteness. To improve the robustness and reliability, we introduce the Guided Full Trajectory Diffuser (GFTD), a novel diffusion model framework that captures the joint full (historical and future) trajectory distribution. By learning from the full trajectory, GFTD can recover the noisy and missing data, hence improving the robustness. In addition, GFTD can adapt to data imperfections without additional training requirements, leveraging posterior sampling for reliable prediction and controllable generation. Our approach not only simplifies the prediction process but also enhances generalizability in scenarios with noise and incomplete inputs. Through rigorous experimental evaluation, GFTD exhibits superior performance in both trajectory prediction and controllable generation.
Learning Symbolic and Subsymbolic Temporal Task Constraints from Bimanual Human Demonstrations IROS 2024
Learning task models of bimanual manipulation from human demonstration and their execution on a robot should take temporal constraints between actions into account. This includes constraints on (i) the symbolic level such as precedence relations or temporal overlap in the execution, and (ii) the subsymbolic level such as the duration of different actions, or their starting and end points in time. Such temporal constraints are crucial for temporal planning, reasoning, and the exact timing for the execution of bimanual actions on a bimanual robot. In our previous work, we addressed the learning of temporal task constraints on the symbolic level and demonstrated how a robot can leverage this knowledge to respond to failures during execution. In this work, we propose a novel model-driven approach for the combined learning of symbolic and subsymbolic temporal task constraints from multiple bimanual human demonstrations. Our main contributions are a subsymbolic foundation of a temporal task model that describes temporal nexuses of actions in the task based on distributions of temporal differences between semantic action keypoints, as well as a method based on fuzzy logic to derive symbolic temporal task constraints from this representation. This complements our previous work on learning comprehensive temporal task models by integrating symbolic and subsymbolic information based on a subsymbolic foundation, while still maintaining the symbolic expressiveness of our previous approach. We compare our proposed approach with our previous pure-symbolic approach and show that we can reproduce and even outperform it. Additionally, we show how the subsymbolic temporal task constraints can synchronize otherwise unimanual movement primitives for bimanual behavior on a humanoid robot.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. 8 pages, submitted to IROS 2024
Agent-Agnostic Centralized Training for Decentralized Multi-Agent Cooperative Driving IROS 2024
Active traffic management with autonomous vehicles offers the potential for reduced congestion and improved traffic flow. However, developing effective algorithms for real-world scenarios requires overcoming challenges related to infinite-horizon traffic flow and partial observability. To address these issues and further decentralize traffic management, we propose an asymmetric actor-critic model that learns decentralized cooperative driving policies for autonomous vehicles using single-agent reinforcement learning. By employing attention neural networks with masking, our approach efficiently manages real-world traffic dynamics and partial observability, eliminating the need for predefined agents or agent-specific experience buffers in multi-agent reinforcement learning. Extensive evaluations across various traffic scenarios demonstrate our method's significant potential in improving traffic flow at critical bottleneck points. Moreover, we address the challenges posed by conservative autonomous vehicle driving behaviors that adhere strictly to traffic rules, showing that our cooperative policy effectively alleviates potential slowdowns without compromising safety.
comment: accepted by IROS 2024
Distilling Knowledge for Short-to-Long Term Trajectory Prediction IROS 2024
Long-term trajectory forecasting is an important and challenging problem in the fields of computer vision, machine learning, and robotics. One fundamental difficulty stands in the evolution of the trajectory that becomes more and more uncertain and unpredictable as the time horizon grows, subsequently increasing the complexity of the problem. To overcome this issue, in this paper, we propose Di-Long, a new method that employs the distillation of a short-term trajectory model forecaster that guides a student network for long-term trajectory prediction during the training process. Given a total sequence length that comprehends the allowed observation for the student network and the complementary target sequence, we let the student and the teacher solve two different related tasks defined over the same full trajectory: the student observes a short sequence and predicts a long trajectory, whereas the teacher observes a longer sequence and predicts the remaining short target trajectory. The teacher's task is less uncertain, and we use its accurate predictions to guide the student through our knowledge distillation framework, reducing long-term future uncertainty. Our experiments show that our proposed Di-Long method is effective for long-term forecasting and achieves state-of-the-art performance on the Intersection Drone Dataset (inD) and the Stanford Drone Dataset (SDD).
comment: Accepted to IROS 2024
Aerostack2: A Software Framework for Developing Multi-robot Aerial Systems
The development of autonomous aerial systems, particularly for multi-robot configurations, is a complex challenge requiring multidisciplinary expertise. Unlike ground robotics, aerial robotics has seen limited standardization, leading to fragmented development efforts. To address this gap, we introduce Aerostack2, a comprehensive, open-source ROS 2 based framework designed for creating versatile and robust multi-robot aerial systems. Aerostack2 features platform independence, a modular plugin architecture, and behavior-based mission control, enabling easy customization and integration across various platforms. In this paper, we detail the full architecture of Aerostack2, which has been tested with several platforms in both simulation and real flights. We demonstrate its effectiveness through multiple validation scenarios, highlighting its potential to accelerate innovation and enhance collaboration in the aerial robotics community.
comment: Submitted to IEEE Robotics and Automation Letters
Multiagent Systems
From Grounding to Planning: Benchmarking Bottlenecks in Web Agents
General web-based agents are increasingly essential for interacting with complex web environments, yet their performance in real-world web applications remains poor, yielding extremely low accuracy even with state-of-the-art frontier models. We observe that these agents can be decomposed into two primary components: Planning and Grounding. Yet, most existing research treats these agents as black boxes, focusing on end-to-end evaluations which hinder meaningful improvements. We sharpen the distinction between the planning and grounding components and conduct a novel analysis by refining experiments on the Mind2Web dataset. Our work proposes a new benchmark for each of the components separately, identifying the bottlenecks and pain points that limit agent performance. Contrary to prevalent assumptions, our findings suggest that grounding is not a significant bottleneck and can be effectively addressed with current techniques. Instead, the primary challenge lies in the planning component, which is the main source of performance degradation. Through this analysis, we offer new insights and demonstrate practical suggestions for improving the capabilities of web agents, paving the way for more reliable agents.
Managing multiple agents by automatically adjusting incentives
In the coming years, AI agents will be used for making more complex decisions, including in situations involving many different groups of people. One big challenge is that AI agents tend to act in their own interest, unlike humans, who often think about what will be best for everyone in the long run. In this paper, we explore a method to get self-interested agents to work towards goals that benefit society as a whole. We propose a method to add a manager agent to mediate agent interactions by assigning incentives to certain actions. We tested our method with a supply-chain management problem and showed that this framework (1) increases the raw reward by 22.2%, (2) increases the agents' reward by 23.8%, and (3) increases the manager's reward by 20.1%.
comment: 7 pages
Adaptive Incentive Design with Learning Agents
How can the system operator learn an incentive mechanism that achieves social optimality based on limited information about the agents' behavior, who are dynamically updating their strategies? To answer this question, we propose an \emph{adaptive} incentive mechanism. This mechanism updates the incentives of agents based on the feedback of each agent's externality, evaluated as the difference between the player's marginal cost and society's marginal cost at each time step. The proposed mechanism updates the incentives on a slower timescale compared to the agents' learning dynamics, resulting in a two-timescale coupled dynamical system. Notably, this mechanism is agnostic to the specific learning dynamics used by agents to update their strategies. We show that any fixed point of this adaptive incentive mechanism corresponds to the optimal incentive mechanism, ensuring that the Nash equilibrium coincides with the socially optimal strategy. Additionally, we provide sufficient conditions that guarantee the convergence of the adaptive incentive mechanism to a fixed point. Our results apply to both atomic and non-atomic games. To demonstrate the effectiveness of our proposed mechanism, we verify the convergence conditions in two practically relevant games: atomic networked quadratic aggregative games and non-atomic network routing games.
comment: 30 pages
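Illustrative sketch (not from the paper): the two-timescale idea in the abstract above can be tried on a toy game. Everything concrete here is an assumption chosen for illustration: the quadratic cost c_i(x) = 0.5*x_i^2 - a*x_i + b*x_i*sum_{j != i} x_j, the linear incentive p_i*x_i, the gradient-descent learners, and the step sizes `alpha` (fast) and `beta` (slow). In this toy game the externality of agent i, i.e. society's marginal cost minus the agent's own, is b*sum_{j != i} x_j.

```python
def simulate(n=3, a=1.0, b=0.1, alpha=0.1, beta=0.01, steps=5000):
    """Toy two-timescale loop: agents learn fast, incentives adapt slowly."""
    x = [0.0] * n  # agents' strategies
    p = [0.0] * n  # per-unit incentives (taxes) set by the operator
    for _ in range(steps):
        total = sum(x)
        others = [total - xi for xi in x]
        # Fast timescale: each agent takes a gradient step on its own
        # incentivized cost, whose derivative is x_i - a + b*S_{-i} + p_i.
        x = [xi - alpha * (xi - a + b * o + pi)
             for xi, o, pi in zip(x, others, p)]
        # Slow timescale: move each incentive toward the agent's current
        # externality b*S_{-i} (social minus private marginal cost).
        total = sum(x)
        p = [pi + beta * (b * (total - xi) - pi) for xi, pi in zip(x, p)]
    return x, p
```

For these assumed parameters the symmetric social optimum is x_i = a / (1 + 2*b*(n-1)), and the loop settles there with p_i equal to the externality at that point. This only illustrates the fixed-point property on one hand-picked game; it says nothing about the paper's general convergence conditions.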
Improving the Prediction of Individual Engagement in Recommendations Using Cognitive Models
For public health programs with limited resources, the ability to predict how behaviors change over time and in response to interventions is crucial for deciding when and to whom interventions should be allocated. Using data from a real-world maternal health program, we demonstrate how a cognitive model based on Instance-Based Learning (IBL) Theory can augment existing purely computational approaches. Our findings show that, compared to general time-series forecasters (e.g., LSTMs), IBL models, which reflect human decision-making processes, better predict the dynamics of individuals' states. Additionally, IBL provides estimates of the volatility in individuals' states and their sensitivity to interventions, which can improve the efficiency of training of other time series models.
Strategic Negotiations in Endogenous Network Formation
In network formation games, agents form edges with each other to maximize their utility. Each agent's utility depends on its private beliefs and its edges in the network. Strategic agents can misrepresent their beliefs to get a better resulting network. Most prior works in this area consider honest agents or a single strategic agent. Instead, we propose a model where any subset of agents can be strategic. We provide an efficient algorithm for finding the set of Nash equilibria, if any exist, and certify their nonexistence otherwise. We also show that when several strategic agents are present, their utilities can increase or decrease compared to when they are all honest. Small changes in the inter-agent correlations can cause such shifts. In contrast, the simpler one-strategic-agent setting explored in the literature lacks such complex patterns. Finally, we develop an algorithm by which new agents can learn the information needed for strategic behavior. Our algorithm works even when the (unknown) strategic agents deviate from the Nash-optimal strategies. We verify these results on both simulated networks and a real-world dataset on international trade.
Systems and Control (CS)
Comparative Analysis of Learning-Based Methods for Transient Stability Assessment
Transient stability and critical clearing time (CCT) are important concepts in power system protection and control. This paper explores and compares various learning-based methods for predicting CCT under uncertainties arising from renewable generation, loads, and contingencies. Specifically, we introduce new definitions of transient stability (B-stability) and CCT from an engineering perspective. For training the models, only the initial values of system variables and contingency cases are used as features, enabling the provision of protection information based on these initial values. To enhance efficiency, a hybrid feature selection strategy combining the maximal information coefficient (MIC) and Spearman's Correlation Coefficient (SCC) is employed to reduce the feature dimension. The performance of different learning-based models is evaluated on a WSCC 9-bus system.
comment: Accepted for presentation at the 56th North American Power Symposium (NAPS)
Visual Servoing for Robotic On-Orbit Servicing: A Survey
On-orbit servicing (OOS) activities will power the next big step for sustainable exploration and commercialization of space. Developing robotic capabilities for autonomous OOS operations is a priority for the space industry. Visual Servoing (VS) enables robots to achieve the precise manoeuvres needed for critical OOS missions by utilizing visual information for motion control. This article presents an overview of existing VS approaches for autonomous OOS operations with space manipulator systems (SMS). We divide the approaches according to their contribution to the typical phases of a robotic OOS mission: a) Recognition, b) Approach, and c) Contact. We also present a discussion on the reviewed VS approaches, identifying current trends. Finally, we highlight the challenges and areas for future research on VS techniques for robotic OOS.
comment: Accepted for publication at the 2024 International Conference on Space Robotics (iSpaRo)
Discrete-Time Maximum Likelihood Neural Distribution Steering
This paper studies the problem of steering the distribution of a discrete-time dynamical system from an initial distribution to a target distribution in finite time. The formulation is fully nonlinear, allowing the use of general control policies, parametrized by neural networks. Although similar solutions have been explored in the continuous-time context, extending these techniques to systems with discrete dynamics is not trivial. The proposed algorithm results in a regularized maximum likelihood optimization problem, which is solved using machine learning techniques. After presenting the algorithm, we provide several numerical examples that illustrate the capabilities of the proposed method. We start from a simple problem that admits a solution through semidefinite programming, serving as a benchmark for the proposed approach. Then, we employ the framework in more general problems that cannot be solved using existing techniques, such as problems with non-Gaussian boundary distributions and non-linear dynamics.
comment: Accepted for publication in CDC 2024
Reinforcement Learning-enabled Satellite Constellation Reconfiguration and Retasking for Mission-Critical Applications
The development of satellite constellation applications is rapidly advancing due to increasing user demands, reduced operational costs, and technological advancements. However, a significant gap in the existing literature concerns reconfiguration and retasking issues within satellite constellations, which is the primary focus of our research. In this work, we critically assess the impact of satellite failures on constellation performance and the associated task requirements. To facilitate this analysis, we introduce a system modeling approach for GPS satellite constellations, enabling an investigation into performance dynamics and task distribution strategies, particularly in scenarios where satellite failures occur during mission-critical operations. Additionally, we introduce reinforcement learning (RL) techniques, specifically Q-learning, Policy Gradient, Deep Q-Network (DQN), and Proximal Policy Optimization (PPO), for managing satellite constellations, addressing the challenges posed by reconfiguration and retasking following satellite failures. Our results demonstrate that DQN and PPO achieve effective outcomes in terms of average rewards, task completion rates, and response times.
comment: Accepted for publication in the IEEE Military Communications Conference (IEEE MILCOM 2024)
Thermal Inverse design for resistive micro-heaters
This paper proposes an inverse design scheme for resistive heaters. By adjusting the spatial distribution of a binary electrical resistivity map, the scheme enables objective-driven optimization of heaters to achieve pre-defined steady-state temperature profiles. The approach can be fully automated and is computationally efficient since it does not entail extensive iterative simulations of the entire heater structure. The design scheme offers a powerful solution for resistive heater device engineering in applications spanning electronics, photonics, and microelectromechanical systems.
Open6G OTIC: A Blueprint for Programmable O-RAN and 3GPP Testing Infrastructure
Softwarized and programmable Radio Access Networks (RANs) come with virtualized and disaggregated components, increasing the supply chain robustness and the flexibility and dynamism of the network deployments. This is a key tenet of Open RAN, with open interfaces across disaggregated components specified by the O-RAN ALLIANCE. It is mandatory, however, to validate that all components are compliant with the specifications and can successfully interoperate, without performance gaps with traditional, monolithic appliances. Open Testing & Integration Centers (OTICs) are entities that can verify such interoperability and adherence to the standard through rigorous testing. However, how to design, instrument, and deploy an OTIC which can offer testing for multiple tenants, heterogeneous devices, and is ready to support automated testing is still an open challenge. In this paper, we introduce a blueprint for a programmable OTIC testing infrastructure, based on the design and deployment of the Open6G OTIC at Northeastern University, Boston, and provide insights on technical challenges and solutions for O-RAN testing at scale.
comment: Presented at IEEE VTC Fall RitiRAN Workshop, 5 pages, 3 figures, 3 tables
Early Design Exploration of Aerospace Systems Using Assume-Guarantee Contracts
We present a compositional approach to early modeling and analysis of complex aerospace systems based on assume-guarantee contracts. Components in a system are abstracted into assume-guarantee specifications. Performing algebraic contract operations with Pacti allows us to relate local component specifications to that of the system. Applications to two aerospace case studies (the design of spacecraft to satisfy a rendezvous mission and the design of the thermal management system of a prototypical aircraft) show that this methodology provides engineers with an agile, early analysis and exploration process.
comment: 32 pages
Planning to avoid ambiguous states through Gaussian approximations to non-linear sensors in active inference agents
In nature, active inference agents must learn how observations of the world represent the state of the agent. In engineering, the physics behind sensors is often known reasonably accurately and measurement functions can be incorporated into generative models. When a measurement function is non-linear, the transformed variable is typically approximated with a Gaussian distribution to ensure tractable inference. We show that Gaussian approximations that are sensitive to the curvature of the measurement function, such as a second-order Taylor approximation, produce a state-dependent ambiguity term. This induces a preference over states, based on how accurately the state can be inferred from the observation. We demonstrate this preference with a robot navigation experiment where agents plan trajectories.
comment: 13 pages, 3 figures. Accepted to the International Workshop on Active Inference 2024
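Illustrative worked equation (a standard scalar result, not the paper's own derivation): for a scalar state $x \sim \mathcal{N}(\mu, \sigma^2)$ and a twice-differentiable measurement function $g$, replacing $g$ by its second-order Taylor expansion around $\mu$ gives the following moments for the Gaussian approximation of $y = g(x)$. The symbols $\mu$, $\sigma$, and $g$ are notation assumed here.

$$
\mathbb{E}[y] \approx g(\mu) + \tfrac{1}{2}\, g''(\mu)\, \sigma^2,
\qquad
\operatorname{Var}[y] \approx g'(\mu)^2\, \sigma^2 + \tfrac{1}{2}\, g''(\mu)^2\, \sigma^4 .
$$

Both moments depend on where $\mu$ lies through $g'(\mu)$ and $g''(\mu)$, so the predicted observation uncertainty varies with the state. This is one way to read the state-dependent ambiguity term mentioned in the abstract above.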
Adaptive Stochastic Predictive Control from Noisy Data: A Sampling-based Approach
In this work, an adaptive predictive control scheme for linear systems with unknown parameters and bounded additive disturbances is proposed. In contrast to related adaptive control approaches that robustly consider the parametric uncertainty, the proposed method handles all uncertainties stochastically by employing an online adaptive sampling-based approximation of chance constraints. The approach requires initial data in the form of a short input-output trajectory and distributional knowledge of the disturbances. This prior knowledge is used to construct an initial set of data-consistent system parameters and a distribution that allows for sample generation. As new data stream in online, the set of consistent system parameters is adapted by exploiting set membership identification. Consequently, chance constraints are deterministically approximated using a probabilistic scaling approach by sampling from the set of system parameters. In combination with a robust constraint on the first predicted step, recursive feasibility of the proposed predictive controller and closed-loop constraint satisfaction are guaranteed. A numerical example demonstrates the efficacy of the proposed method.
comment: Accepted for presentation at the 63rd IEEE Conference on Decision and Control (CDC2024)
Modeling IoT Traffic Patterns: Insights from a Statistical Analysis of an MTC Dataset
The Internet-of-Things (IoT) is rapidly expanding, connecting numerous devices and becoming integral to our daily lives. As this occurs, ensuring efficient traffic management becomes crucial. Effective IoT traffic management requires modeling and predicting intricate machine-type communication (MTC) dynamics, for which machine-learning (ML) techniques are certainly appealing. However, obtaining comprehensive and high-quality datasets, along with accessible platforms for reproducing ML-based predictions, continues to impede research progress. In this paper, we aim to fill this gap by characterizing the Smart Campus MTC dataset provided by the University of Oulu. Specifically, we perform a comprehensive statistical analysis of the MTC traffic utilizing goodness-of-fit tests, including well-established tests such as Kolmogorov-Smirnov, Anderson-Darling, chi-squared, and root mean square error. The analysis centers on examining and evaluating three models that accurately represent the two most significant MTC traffic types: periodic updating and event-driven, which are also identified from the dataset. The results demonstrate that the models accurately characterize the traffic patterns. The Poisson point process model exhibits the best fit for event-driven patterns with errors below 11%, while the quasi-periodic model accurately fits the periodic updating traffic with errors below 7%.
comment: SSRN:4655476
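Illustrative sketch (not the paper's code or data): one of the goodness-of-fit tests named in the abstract above, the one-sample Kolmogorov-Smirnov statistic, can be computed against the exponential inter-arrival distribution implied by a Poisson point process. The function name `ks_statistic_exponential` and the choice to pass the rate explicitly are assumptions for illustration.

```python
import math
from typing import Sequence


def ks_statistic_exponential(samples: Sequence[float], rate: float) -> float:
    """One-sample KS statistic of `samples` against an Exponential(rate) CDF.

    Returns the largest gap between the empirical CDF and the model CDF,
    checked just before and just after each sorted sample.
    """
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = 1.0 - math.exp(-rate * x)  # model CDF at this sample
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d
```

A small statistic means the exponential model tracks the observed inter-arrival times closely. Strictly periodic traffic (all gaps equal) yields a large statistic, which is why a separate model is needed for periodic updating.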
Mixed Regular and Impulsive Sampled-data LQR
We investigate the benefits of combining regular and impulsive inputs for the control of sampled-data linear time-invariant systems. We first observe that adding an impulsive term to a regular, zero-order-hold controller may help enlarge the set of sampling periods under which controllability is preserved by sampling. In this context, we provide a tailored Hautus-like necessary and sufficient condition under which controllability of the mixed regular, impulsive (MRI) sampled-data model is preserved. We then focus on LQR optimal control. After presenting the optimal controllers for the sampled-data LQR control in the MRI setting, we consider the scenario where an impulsive disturbance affects the dynamics and is known ahead of time. The solution to the so-called preview LQR is presented exploiting both regular and impulsive input components. Numerical examples, which include an insulin infusion benchmark, illustrate that leveraging both future disturbance information and MRI controls may lead to significant performance improvements.
COOCK project Smart Port 2025 D3.2: "Variability in Twinning Architectures"
This document is a result of the COOCK project "Smart Port 2025: improving and accelerating the operational efficiency of a harbour eco-system through the application of intelligent technologies". The project is mainly aimed at SMEs, but also at large corporations. Together, they form the value-chain of the harbour. The digital maturity of these actors will be increased by model and data-driven digitization. The project brings together both technology users and providers/integrators. In this report, the broad spectrum of model and data-based digitization approaches is structured, under the unifying umbrella of "Digital Twins". During the (currently quite ad-hoc) digitization process and, in particular, the creation of Digital Twins, a variety of choices have an impact on the ultimately realised system. This document identifies three stages during which this "variability" appears: the Problem Space Goal Construction Stage, the (Conceptual) Architecture Design Stage and the Deployment Stage. To illustrate the workflow, two simple use-cases are used: one of a ship moving in 1 dimension and, at a different scale and level of detail, a macroscopic model of the Port of Antwerp.
Hardware-Based Microgrid Coupled to Real-Time Simulated Power Grids for Evaluating New Control Strategies in Future Energy Systems
The design of new control strategies for future energy systems can neither be directly tested in real power grids nor be evaluated based on only current grid situations. In this regard, extensive tests are required in laboratory settings using real power system equipment. However, since it is impossible to replicate the entire grid section of interest, even in large-scale experiments, hardware setups must be supplemented by detailed simulations to reproduce the system under study fully. This paper presents a unique test environment in which a hardware-based microgrid environment is physically coupled with a large-scale real-time simulation framework. The setup combines the advantages of developing new solutions using hardware-based experiments and evaluating the impact on large-scale power systems using real-time simulations. In this paper, the interface between the microgrid-under-test environment and the real-time simulations is evaluated in terms of accuracy and communication delays. Furthermore, a test case is presented showing the approach's ability to test microgrid control strategies for supporting the grid. It is observed that the communication delays via the physical interface depend on the simulation sampling time and do not significantly affect the accuracy in the interaction between the hardware and the simulated grid.
comment: 16 pages, 10 figures
Design and Operation Principles of a Wave-Controlled Reconfigurable Intelligent Surface
A Reflective Intelligent Surface (RIS) consists of many small reflective elements whose reflection properties can be adjusted to change the wireless propagation environment. Envisioned implementations require that each RIS element be connected to a controller, and as the number of RIS elements on a surface may be on the order of hundreds or more, the number of required electrical connectors creates a difficult wiring problem, especially at high frequencies where the physical space between the elements is limited. A potential solution to this problem was previously proposed by the authors in which "biasing transmission lines" carrying standing waves are sampled at each RIS location to produce the desired bias voltage for each RIS element. This solution has the potential to substantially reduce the complexity of the RIS control. This paper presents models for the RIS elements that account for mutual coupling and realistic varactor characteristics, as well as circuit models for sampling the transmission line to generate the RIS control signals. For the latter case, the paper investigates two techniques for conversion of the transmission line standing wave voltage to the varactor bias voltage, namely an envelope detector and a sample-and-hold circuit. The paper also develops a modal decomposition approach for generating standing waves that are able to generate beams and nulls in the resulting RIS radiation pattern that maximize either the Signal-to-Noise Ratio (SNR) or the Signal-to-Leakage-plus-Noise Ratio (SLNR). Extensive simulation results are provided for the two techniques, together with a discussion of computational complexity.
comment: 20 pages, 25 figures, 2 tables
Optimal Power Grid Operations with Foundation Models
The energy transition, crucial for tackling the climate crisis, demands integrating numerous distributed, renewable energy sources into existing grids. Along with climate change and consumer behavioral changes, this leads to changes and variability in generation and load patterns, introducing significant complexity and uncertainty into grid planning and operations. While the industry has already started to exploit AI to overcome computational challenges of established grid simulation tools, we propose the use of AI Foundation Models (FMs) and advances in Graph Neural Networks to efficiently exploit poorly available grid data for different downstream tasks, enhancing grid operations. For capturing the grid's underlying physics, we believe that building a self-supervised model learning the power flow dynamics is a critical first step towards developing an FM for the power grid. We show how this approach may close the gap between the industry needs and current grid analysis capabilities, to bring the industry closer to optimal grid operation and planning.
LiDAR-Aided Millimeter-Wave Range Extension using a Passive Mirror Reflector
Passive reflectors mitigate millimeter-wave (mmwave) link blockages by extending coverage to non-line-of-sight (NLoS) regions. However, their deployment often leads to irregular reflected beam patterns and coverage gaps. This results in rapid channel fluctuations and potential outages. In this paper, we propose two LiDAR-aided link enhancement techniques to address these challenges. Leveraging user position information, we introduce a location-dependent link control strategy and a user selection technique to improve NLoS link reliability and coverage. Experimental results validate the efficacy of the proposed techniques in reducing outages and enhancing NLoS signal strength.
Cooperative Global $\mathcal{K}$-exponential Tracking Control of Multiple Mobile Robots -- Extended Version
This paper studies the cooperative tracking control problem for multiple mobile robots over a directed communication network. First, it is shown that the closed-loop system is uniformly globally asymptotically stable under the proposed distributed continuous feedback control law, where an explicit strict Lyapunov function is constructed. Then, by investigating the convergence rate, it is further proven that the closed-loop system is globally $\mathcal{K}$-exponentially stable. Moreover, to make the proposed control law more practical, the distributed continuous feedback control law is generalized to a distributed sampled-data feedback control law using the emulation approach, based on the strong integral input-to-state stable Lyapunov function. Numerical simulations are presented to validate the effectiveness of the proposed control methods.
comment: 8 pages, 3 figures
Bridging the Gap Between Central and Local Decision-Making: The Efficacy of Collaborative Equilibria in Altruistic Congestion Games
Congestion games are popular models often used to study the system-level inefficiencies caused by selfish agents, typically measured by the price of anarchy. One may expect that aligning the agents' preferences with the system-level objective (altruistic behavior) would improve efficiency, but recent works have shown that altruism can lead to more significant inefficiency than selfishness in congestion games. In this work, we study to what extent the localness of decision-making causes inefficiency by considering collaborative decision-making paradigms that lie between centralized and distributed decision-making in altruistic congestion games. In altruistic congestion games with convex latency functions, the system cost is a super-modular function over the players' joint actions, and the Nash equilibria of the game are local optima in the neighborhood of unilateral deviations. When agents can collaborate, we can exploit the common-interest structure to consider equilibria with stronger local optimality guarantees in the system objective, e.g., if groups of k agents can collaboratively minimize the system cost, the system equilibria are the local optima over k-lateral deviations. Our main contributions are in constructing tractable linear programs that provide bounds on the price of anarchy of collaborative equilibria in altruistic congestion games. Our findings bridge the gap between the known efficiency guarantees of centralized and distributed decision-making paradigms while also providing insights into the benefit of inter-agent collaboration in multi-agent systems.
Controlled fluid transport by the collective motion of microrotors
Torque-driven microscale swimming robots, or microrotors, hold significant potential in biomedical applications such as targeted drug delivery, minimally invasive surgery, and micromanipulation. This paper addresses the challenge of controlling the transport of fluid volumes using the flow fields generated by interacting groups of microrotors. Our approach uses polynomial chaos expansions to model the time evolution of fluid particle distributions and formulate an optimal control problem, which we solve numerically. We implement this framework in simulation to achieve the controlled transport of an initial fluid particle distribution to a target destination while minimizing undesirable effects such as stretching and mixing. We consider the case where translational velocities of the rotors are directly controlled, as well as the case where only torques are controlled and the rotors move in response to the collective flow fields they generate. We analyze the solution of this optimal control problem by computing the Lagrangian coherent structures of the associated flow field, which reveal the formation of transport barriers that efficiently guide particles toward their target. This analysis provides insights into the underlying mechanisms of controlled transport.
Probabilistic Reachability Analysis of Stochastic Control Systems
We address the reachability problem for continuous-time stochastic dynamic systems. Our objective is to present a unified framework that characterizes the reachable set of a dynamic system in the presence of both stochastic disturbances and deterministic inputs. To achieve this, we devise a strategy that effectively decouples the effects of deterministic inputs and stochastic disturbances on the reachable sets of the system. For the deterministic part, many existing methods can capture the deterministic reachability. As for the stochastic disturbances, we introduce a novel technique that probabilistically bounds the difference between a stochastic trajectory and its deterministic counterpart. The key to our approach is introducing a novel energy function termed the Averaged Moment Generating Function that yields a high probability bound for this difference. This bound is tight and exact for linear stochastic dynamics and applicable to a large class of nonlinear stochastic dynamics. By combining our innovative technique with existing methods for deterministic reachability analysis, we can compute estimations of reachable sets that surpass those obtained with current approaches for stochastic reachability analysis. We validate the effectiveness of our framework through various numerical experiments. Beyond its immediate applications in reachability analysis, our methodology is poised to have profound implications in the broader analysis and control of stochastic systems. It opens avenues for enhanced understanding and manipulation of complex stochastic dynamics, presenting opportunities for advancements in related fields.
PID Accelerated Temporal Difference Algorithms
Long-horizon tasks, which have a large discount factor, pose a challenge for most conventional reinforcement learning (RL) algorithms. Algorithms such as Value Iteration and Temporal Difference (TD) learning have a slow convergence rate and become inefficient in these tasks. When the transition distributions are given, PID VI was recently introduced to accelerate the convergence of Value Iteration using ideas from control theory. Inspired by this, we introduce PID TD Learning and PID Q-Learning algorithms for the RL setting, in which only samples from the environment are available. We give a theoretical analysis of the convergence of PID TD Learning and its acceleration compared to the conventional TD Learning. We also introduce a method for adapting PID gains in the presence of noise and empirically verify its effectiveness.
Risk-Aware Real-Time Task Allocation for Stochastic Multi-Agent Systems under STL Specifications
This paper addresses the control synthesis of heterogeneous stochastic linear multi-agent systems with real-time allocation of signal temporal logic (STL) specifications. Based on previous work, we decompose specifications into sub-specifications on the individual agent level. To leverage the efficiency of task allocation, a heuristic filter evaluates potential task allocation based on STL robustness, and subsequently, an auctioning algorithm determines the definitive allocation of specifications. Finally, a control strategy is synthesized for each agent-specification pair using tube-based model predictive control (MPC), ensuring provable probabilistic satisfaction. We demonstrate the efficacy of the proposed methods using a multi-shuttle scenario that highlights a promising extension to automated driving applications like vehicle routing.
comment: 7 pages, 5 figures, Accepted for CDC 2024. arXiv admin note: text overlap with arXiv:2402.03165
Behavioral Learning of Dish Rinsing and Scrubbing based on Interruptive Direct Teaching Considering Assistance Rate
Robots are expected to manipulate objects in a safe and dexterous way. For example, washing dishes is a dexterous operation that involves scrubbing the dishes with a sponge and rinsing them with water. It is necessary to learn it safely without splashing water and without dropping the dishes. In this study, we propose a safe and dexterous manipulation system. The robot learns a dynamics model of the object by estimating the state of the object and the robot itself, the control input, and the amount of human assistance required (assistance rate) after the human corrects the initial trajectory of the robot's hands by interruptive direct teaching. By backpropagating the error between the estimated and the reference value using the acquired dynamics model, the robot can generate a control input that approaches the reference value, for example, so that human assistance is not required and the dish does not move excessively. This allows for adaptive rinsing and scrubbing of dishes with unknown shapes and properties. As a result, it is possible to generate safe actions that require less human assistance.
comment: Accepted at Advanced Robotics
A General Framework for Load Forecasting based on Pre-trained Large Language Model
Accurate load forecasting is crucial for maintaining the power balance between generators and consumers, particularly with the increasing integration of renewable energy sources, which introduce significant intermittent volatility. With the advancement of data-driven methods, machine learning and deep learning models have become the predominant approaches for load forecasting tasks. In recent years, pre-trained large language models (LLMs) have achieved significant progress, demonstrating superior performance across various fields. This paper proposes a load forecasting method based on LLMs, offering not only precise predictive capabilities but also broad and flexible applicability. Additionally, a data modeling method is introduced to effectively transform load sequence data into natural language suitable for LLM training. Furthermore, a data enhancement strategy is designed to mitigate the impact of LLM hallucinations on forecasting results. The effectiveness of the proposed method is validated using two real-world datasets. Compared to existing methods, our approach demonstrates state-of-the-art performance across all validation metrics.
comment: 11 pages, 3 figures and 5 tables
Adaptive Incentive Design with Learning Agents
How can the system operator learn an incentive mechanism that achieves social optimality based on limited information about the behavior of agents who are dynamically updating their strategies? To answer this question, we propose an adaptive incentive mechanism. This mechanism updates the incentives of agents based on the feedback of each agent's externality, evaluated as the difference between the player's marginal cost and society's marginal cost at each time step. The proposed mechanism updates the incentives on a slower timescale compared to the agents' learning dynamics, resulting in a two-timescale coupled dynamical system. Notably, this mechanism is agnostic to the specific learning dynamics used by agents to update their strategies. We show that any fixed point of this adaptive incentive mechanism corresponds to the optimal incentive mechanism, ensuring that the Nash equilibrium coincides with the socially optimal strategy. Additionally, we provide sufficient conditions that guarantee the convergence of the adaptive incentive mechanism to a fixed point. Our results apply to both atomic and non-atomic games. To demonstrate the effectiveness of our proposed mechanism, we verify the convergence conditions in two practically relevant games: atomic networked quadratic aggregative games and non-atomic network routing games.
comment: 30 pages
Koopman Analysis of the Singularly-Perturbed van der Pol Oscillator
The Koopman operator framework holds promise for spectral analysis of nonlinear dynamical systems based on linear operators. Eigenvalues and eigenfunctions of the Koopman operator, so-called Koopman eigenvalues and Koopman eigenfunctions, respectively, mirror global properties of the system's flow. In this paper we perform the Koopman analysis of the singularly-perturbed van der Pol system. First, we show the spectral signature depending on singular perturbation: how two Koopman principal eigenvalues are ordered and what distinct shapes emerge in their associated Koopman eigenfunctions. Second, we discuss the singular limit of the Koopman operator, which is derived through the concatenation of Koopman operators for the fast and slow subsystems. From the spectral properties of the Koopman operator for the singularly-perturbed system and the singular limit, we suggest that the Koopman eigenfunctions inherit geometric properties of the singularly-perturbed system. These results are applicable to general planar singularly-perturbed systems with stable limit cycles.
comment: 21 pages, 10 figures
Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions
Learning-based control has recently shown great efficacy in performing complex tasks for various applications. However, to deploy it in real systems, it is of vital importance to guarantee the system will stay safe. Control Barrier Functions (CBFs) offer mathematical tools for designing safety-preserving controllers for systems with known dynamics. In this article, we first introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers using Gaussian Process (GP) regression to close the gap between an approximate mathematical model and the real system, which results in a second-order cone program (SOCP)-based control design. We then present the pointwise feasibility conditions of the resulting safety controller, highlighting the level of richness that the available system information must meet to ensure safety. We use these conditions to devise an event-triggered online data collection strategy that ensures the recursive feasibility of the learned safety controller. Our method works by constantly reasoning about whether the current information is sufficient to ensure safety or if new measurements under active safe exploration are required to reduce the uncertainty. As a result, our proposed framework can guarantee the forward invariance of the safe set defined by the CBF with high probability, even if it contains a priori unexplored regions. We validate the proposed framework in two numerical simulation experiments.
comment: Journal article. Includes the results of the 2021 CDC paper titled "Pointwise feasibility of gaussian process-based safety-critical control under model uncertainty" and proposes a recursively feasible safe online learning algorithm as new contribution
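For readers unfamiliar with the CBF machinery referenced in the abstract above, the textbook nominal condition (known dynamics, no learning) can be written as follows. This is a generic sketch and not the paper's uncertainty-aware formulation; the symbols $f$, $g$, $h$, $\alpha$, and $u_{\mathrm{ref}}$ are notation assumed here for illustration.

```latex
% Control-affine dynamics and safe set (notation assumed for this sketch)
\dot{x} = f(x) + g(x)\,u, \qquad \mathcal{C} = \{\, x : h(x) \ge 0 \,\}

% h is a control barrier function if, for some extended class-K function \alpha,
\sup_{u} \big[\, L_f h(x) + L_g h(x)\,u + \alpha\big(h(x)\big) \,\big] \ge 0
\quad \text{for all } x \in \mathcal{C}

% Nominal safety filter: a quadratic program that minimally modifies u_ref
u^{*}(x) = \arg\min_{u} \; \lVert u - u_{\mathrm{ref}}(x) \rVert^{2}
\quad \text{s.t.} \quad L_f h(x) + L_g h(x)\,u + \alpha\big(h(x)\big) \ge 0
```

When the Lie-derivative terms are replaced by a learned mean plus an uncertainty margin that grows with a norm of the input, the linear constraint becomes a second-order cone constraint, which is consistent with the SOCP-based design the abstract describes.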
On the Sample Complexity of Imitation Learning for Smoothed Model Predictive Control
Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller. However, constructing such smoothed expert controllers for arbitrary systems remains challenging, especially in the presence of input and state constraints. As our primary contribution, we show how such a smoothed expert can be designed for a general class of systems using a log-barrier-based relaxation of a standard Model Predictive Control (MPC) optimization problem. At the crux of this theoretical guarantee on smoothness is a new lower bound we prove on the optimality gap of the analytic center associated with a convex Lipschitz function, which we hope could be of independent interest. We validate our theoretical findings via experiments, demonstrating the merits of our smoothing approach over randomized smoothing.
comment: 15 pages, 2 figures. Preliminary version accepted to CDC 2024
Monotonicity and Contraction on Polyhedral Cones
In this note, we study monotone dynamical systems with respect to polyhedral cones. Using the half-space representation and the vertex representation, we propose three equivalent conditions to certify monotonicity of a dynamical system with respect to a polyhedral cone. We then introduce the notion of gauge norm associated with a cone and provide closed-form formulas for computing gauge norms associated with polyhedral cones. A key feature of gauge norms is that contractivity of monotone systems with respect to them can be efficiently characterized using simple inequalities. This result generalizes the well-known criteria for Hurwitzness of Metzler matrices and provides a scalable approach to search for Lyapunov functions of monotone systems with respect to polyhedral cones. Finally, we study the applications of our results in transient stability of dynamic flow networks and in scalable control design with safety guarantees.
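The "well-known criteria for Hurwitzness of Metzler matrices" mentioned above refer to the positive-orthant case, which can be stated as follows. This is the standard result for the cone $\mathbb{R}^n_{\ge 0}$ only, given here as background; the polyhedral-cone generalization is the paper's contribution and is not reproduced.

```latex
% A \in \mathbb{R}^{n \times n} is Metzler if a_{ij} \ge 0 for all i \ne j.
% For Metzler A, the following are equivalent (inequalities are elementwise):
A \text{ is Hurwitz}
\;\iff\; \exists\, v > 0 \ \text{such that}\ A v < 0
\;\iff\; \exists\, w > 0 \ \text{such that}\ w^{\top} A < 0

% The first certificate corresponds to a weighted \ell_\infty matrix measure:
\mu_{\infty,[v]^{-1}}(A) = \max_{i} \frac{(A v)_i}{v_i} < 0
```

The certificates are linear inequalities in $v$ or $w$, so they can be found by linear programming; this is the sense in which such conditions are scalable.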
Differentially Private Reward Functions in Policy Synthesis for Markov Decision Processes
Markov decision processes often seek to maximize a reward function, but onlookers may infer reward functions by observing the states and actions of such systems, revealing sensitive information. Therefore, in this paper we introduce and compare two methods for privatizing reward functions in policy synthesis for multi-agent Markov decision processes, which generalize Markov decision processes. Reward functions are privatized using differential privacy, a statistical framework for protecting sensitive data. The methods we develop perturb either (1) each agent's individual reward function or (2) the joint reward function shared by all agents. We show that approach (1) provides better performance. We then develop a polynomial-time algorithm for the numerical computation of the performance loss due to privacy on a case-by-case basis. Next, using approach (1), we develop guidelines for selecting reward function values to preserve "goal" and "avoid" states while still remaining private, and we quantify the increase in computational complexity needed to compute policies from privatized rewards. Numerical simulations are performed on three classes of systems and they reveal a surprising compatibility with privacy: using reasonably strong privacy ($\epsilon =1.3$) on average induces as little as a $5\%$ decrease in total accumulated reward and a $0.016\%$ increase in computation time.
comment: 16 Pages, 11 figures
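To make the idea of perturbing a reward function concrete, here is a minimal sketch that adds Laplace noise to a tabular reward vector. The abstract does not say which privacy mechanism the authors use, so the Laplace mechanism is a textbook stand-in chosen for this example; the helper names `laplace_noise` and `privatize_rewards`, the reward values, and the sensitivity of 1.0 are all assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_rewards(rewards, epsilon, sensitivity, rng):
    """Return a noisy copy of `rewards` using the Laplace mechanism.

    `sensitivity` is the assumed L1 distance between reward vectors that
    should be indistinguishable; the noise scale is sensitivity / epsilon.
    """
    scale = sensitivity / epsilon
    return [r + laplace_noise(scale, rng) for r in rewards]


# Hypothetical per-state rewards for one agent: goal, neutral, avoid.
true_rewards = [1.0, 0.0, -1.0]
rng = random.Random(42)
private_rewards = privatize_rewards(true_rewards, epsilon=1.3, sensitivity=1.0, rng=rng)
print(private_rewards)
```

A policy would then be synthesized from `private_rewards` instead of `true_rewards`. A smaller `epsilon` gives stronger privacy but larger noise, which is the trade-off behind the performance loss the abstract quantifies.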
Aerostack2: A Software Framework for Developing Multi-robot Aerial Systems
The development of autonomous aerial systems, particularly for multi-robot configurations, is a complex challenge requiring multidisciplinary expertise. Unlike ground robotics, aerial robotics has seen limited standardization, leading to fragmented development efforts. To address this gap, we introduce Aerostack2, a comprehensive, open-source ROS 2 based framework designed for creating versatile and robust multi-robot aerial systems. Aerostack2 features platform independence, a modular plugin architecture, and behavior-based mission control, enabling easy customization and integration across various platforms. In this paper, we detail the full architecture of Aerostack2, which has been tested with several platforms in both simulation and real flights. We demonstrate its effectiveness through multiple validation scenarios, highlighting its potential to accelerate innovation and enhance collaboration in the aerial robotics community.
comment: Submitted to IEEE Robotics and Automation Letters
Systems and Control (EESS)
Comparative Analysis of Learning-Based Methods for Transient Stability Assessment
Transient stability and critical clearing time (CCT) are important concepts in power system protection and control. This paper explores and compares various learning-based methods for predicting CCT under uncertainties arising from renewable generation, loads, and contingencies. Specifically, we introduce new definitions of transient stability (B-stability) and CCT from an engineering perspective. For training the models, only the initial values of system variables and contingency cases are used as features, enabling the provision of protection information based on these initial values. To enhance efficiency, a hybrid feature selection strategy combining the maximal information coefficient (MIC) and Spearman's Correlation Coefficient (SCC) is employed to reduce the feature dimension. The performance of different learning-based models is evaluated on a WSCC 9-bus system.
comment: Accepted for presentation at the 56th North American Power Symposium (NAPS)
Visual Servoing for Robotic On-Orbit Servicing: A Survey
On-orbit servicing (OOS) activities will power the next big step for sustainable exploration and commercialization of space. Developing robotic capabilities for autonomous OOS operations is a priority for the space industry. Visual Servoing (VS) enables robots to achieve the precise manoeuvres needed for critical OOS missions by utilizing visual information for motion control. This article presents an overview of existing VS approaches for autonomous OOS operations with space manipulator systems (SMS). We divide the approaches according to their contribution to the typical phases of a robotic OOS mission: a) Recognition, b) Approach, and c) Contact. We also present a discussion on the reviewed VS approaches, identifying current trends. Finally, we highlight the challenges and areas for future research on VS techniques for robotic OOS.
comment: Accepted for publication at the 2024 International Conference on Space Robotics (iSpaRo)
Discrete-Time Maximum Likelihood Neural Distribution Steering
This paper studies the problem of steering the distribution of a discrete-time dynamical system from an initial distribution to a target distribution in finite time. The formulation is fully nonlinear, allowing the use of general control policies, parametrized by neural networks. Although similar solutions have been explored in the continuous-time context, extending these techniques to systems with discrete dynamics is not trivial. The proposed algorithm results in a regularized maximum likelihood optimization problem, which is solved using machine learning techniques. After presenting the algorithm, we provide several numerical examples that illustrate the capabilities of the proposed method. We start from a simple problem that admits a solution through semidefinite programming, serving as a benchmark for the proposed approach. Then, we employ the framework in more general problems that cannot be solved using existing techniques, such as problems with non-Gaussian boundary distributions and non-linear dynamics.
comment: Accepted for publication in CDC 2024
Reinforcement Learning-enabled Satellite Constellation Reconfiguration and Retasking for Mission-Critical Applications
The development of satellite constellation applications is rapidly advancing due to increasing user demands, reduced operational costs, and technological advancements. However, a significant gap in the existing literature concerns reconfiguration and retasking issues within satellite constellations, which is the primary focus of our research. In this work, we critically assess the impact of satellite failures on constellation performance and the associated task requirements. To facilitate this analysis, we introduce a system modeling approach for GPS satellite constellations, enabling an investigation into performance dynamics and task distribution strategies, particularly in scenarios where satellite failures occur during mission-critical operations. Additionally, we introduce reinforcement learning (RL) techniques, specifically Q-learning, Policy Gradient, Deep Q-Network (DQN), and Proximal Policy Optimization (PPO), for managing satellite constellations, addressing the challenges posed by reconfiguration and retasking following satellite failures. Our results demonstrate that DQN and PPO achieve effective outcomes in terms of average rewards, task completion rates, and response times.
comment: Accepted for publication in the IEEE Military Communications Conference (IEEE MILCOM 2024)
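Of the RL methods listed in the abstract above, tabular Q-learning is the simplest to write down. The sketch below shows the generic update rule and an epsilon-greedy action choice; it is not the authors' constellation model, and the two-state, two-action table (nominal/degraded, keep/reconfigure) is a hypothetical toy chosen only to exercise the functions.

```python
import random


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning step in place and return the new Q-value."""
    best_next = max(q[next_state])           # greedy value of the next state
    td_target = reward + gamma * best_next   # bootstrapped target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])


# Hypothetical toy table: states 0 = nominal, 1 = degraded;
# actions 0 = keep configuration, 1 = reconfigure.
q_table = [[0.0, 0.0], [1.0, 2.0]]
rng = random.Random(0)
action = epsilon_greedy(q_table, state=1, epsilon=0.0, rng=rng)
new_value = q_learning_update(q_table, state=0, action=1, reward=1.0,
                              next_state=1, alpha=0.5, gamma=0.9)
print(action, new_value)
```

DQN replaces the table with a neural network trained on the same temporal-difference target, while Policy Gradient and PPO optimize a parametrized policy directly.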
Thermal Inverse design for resistive micro-heaters
This paper proposes an inverse design scheme for resistive heaters. By adjusting the spatial distribution of a binary electrical resistivity map, the scheme enables objective-driven optimization of heaters to achieve pre-defined steady-state temperature profiles. The approach can be fully automated and is computationally efficient since it does not entail extensive iterative simulations of the entire heater structure. The design scheme offers a powerful solution for resistive heater device engineering in applications spanning electronics, photonics, and microelectromechanical systems.
Open6G OTIC: A Blueprint for Programmable O-RAN and 3GPP Testing Infrastructure
Softwarized and programmable Radio Access Networks (RANs) come with virtualized and disaggregated components, increasing the supply chain robustness and the flexibility and dynamism of the network deployments. This is a key tenet of Open RAN, with open interfaces across disaggregated components specified by the O-RAN ALLIANCE. It is mandatory, however, to validate that all components are compliant with the specifications and can successfully interoperate, without performance gaps with traditional, monolithic appliances. Open Testing & Integration Centers (OTICs) are entities that can verify such interoperability and adherence to the standard through rigorous testing. However, how to design, instrument, and deploy an OTIC which can offer testing for multiple tenants, heterogeneous devices, and is ready to support automated testing is still an open challenge. In this paper, we introduce a blueprint for a programmable OTIC testing infrastructure, based on the design and deployment of the Open6G OTIC at Northeastern University, Boston, and provide insights on technical challenges and solutions for O-RAN testing at scale.
comment: Presented at IEEE VTC Fall RitiRAN Workshop, 5 pages, 3 figures, 3 tables
Early Design Exploration of Aerospace Systems Using Assume-Guarantee Contracts
We present a compositional approach to early modeling and analysis of complex aerospace systems based on assume-guarantee contracts. Components in a system are abstracted into assume-guarantee specifications. Performing algebraic contract operations with Pacti allows us to relate local component specifications to that of the system. Applications to two aerospace case studies (the design of spacecraft to satisfy a rendezvous mission and the design of the thermal management system of a prototypical aircraft) show that this methodology provides engineers with an agile, early analysis and exploration process.
comment: 32 pages
Planning to avoid ambiguous states through Gaussian approximations to non-linear sensors in active inference agents
In nature, active inference agents must learn how observations of the world represent the state of the agent. In engineering, the physics behind sensors is often known reasonably accurately and measurement functions can be incorporated into generative models. When a measurement function is non-linear, the transformed variable is typically approximated with a Gaussian distribution to ensure tractable inference. We show that Gaussian approximations that are sensitive to the curvature of the measurement function, such as a second-order Taylor approximation, produce a state-dependent ambiguity term. This induces a preference over states, based on how accurately the state can be inferred from the observation. We demonstrate this preference with a robot navigation experiment where agents plan trajectories.
comment: 13 pages, 3 figures. Accepted to the International Workshop on Active Inference 2024
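The role of curvature in the abstract above can be illustrated with second-order Taylor moment matching for a scalar measurement function. This is a minimal sketch under the assumption of a scalar Gaussian state and a twice-differentiable function `g`; the function name `taylor2_moments` and the quadratic example sensor are illustrative and are not taken from the paper.

```python
def taylor2_moments(g, dg, d2g, mean, var):
    """Approximate E[g(x)] and Var[g(x)] for x ~ N(mean, var).

    Uses g(x) ~ g(m) + g'(m)(x - m) + 0.5 g''(m)(x - m)^2 together with the
    Gaussian central moments E[(x-m)^2] = var and Var[(x-m)^2] = 2 var^2.
    """
    approx_mean = g(mean) + 0.5 * d2g(mean) * var
    approx_var = dg(mean) ** 2 * var + 0.5 * d2g(mean) ** 2 * var ** 2
    return approx_mean, approx_var


# Hypothetical quadratic sensor g(x) = x^2, for which the expansion is exact.
mean_y, var_y = taylor2_moments(
    g=lambda x: x * x,
    dg=lambda x: 2.0 * x,
    d2g=lambda x: 2.0,
    mean=1.0,
    var=0.25,
)
```

The `0.5 * d2g(mean) ** 2 * var ** 2` term is absent from a first-order linearization, so the approximated spread of the observation depends on the curvature at the point of expansion.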
Adaptive Stochastic Predictive Control from Noisy Data: A Sampling-based Approach
In this work, an adaptive predictive control scheme for linear systems with unknown parameters and bounded additive disturbances is proposed. In contrast to related adaptive control approaches that robustly consider the parametric uncertainty, the proposed method handles all uncertainties stochastically by employing an online adaptive sampling-based approximation of chance constraints. The approach requires initial data in the form of a short input-output trajectory and distributional knowledge of the disturbances. This prior knowledge is used to construct an initial set of data-consistent system parameters and a distribution that allows for sample generation. As new data stream in online, the set of consistent system parameters is adapted by exploiting set membership identification. Consequently, chance constraints are deterministically approximated using a probabilistic scaling approach by sampling from the set of system parameters. In combination with a robust constraint on the first predicted step, recursive feasibility of the proposed predictive controller and closed-loop constraint satisfaction are guaranteed. A numerical example demonstrates the efficacy of the proposed method.
comment: Accepted for presentation at the 63rd IEEE Conference on Decision and Control (CDC2024)
Modeling IoT Traffic Patterns: Insights from a Statistical Analysis of an MTC Dataset
The Internet-of-Things (IoT) is rapidly expanding, connecting numerous devices and becoming integral to our daily lives. As this occurs, ensuring efficient traffic management becomes crucial. Effective IoT traffic management requires modeling and predicting intricate machine-type communication (MTC) dynamics, for which machine-learning (ML) techniques are certainly appealing. However, obtaining comprehensive and high-quality datasets, along with accessible platforms for reproducing ML-based predictions, continues to impede the research progress. In this paper, we aim to fill this gap by characterizing the Smart Campus MTC dataset provided by the University of Oulu. Specifically, we perform a comprehensive statistical analysis of the MTC traffic utilizing goodness-of-fit tests, including well-established tests such as Kolmogorov-Smirnov, Anderson-Darling, chi-squared, and root mean square error. The analysis centers on examining and evaluating three models that accurately represent the two most significant MTC traffic types: periodic updating and event-driven, which are also identified from the dataset. The results demonstrate that the models accurately characterize the traffic patterns. The Poisson point process model exhibits the best fit for event-driven patterns with errors below 11%, while the quasi-periodic model accurately fits the periodic updating traffic with errors below 7%.
comment: SSRN:4655476
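One of the goodness-of-fit tests listed in the abstract above, Kolmogorov-Smirnov, can be sketched for event-driven traffic as follows. The sketch assumes a homogeneous Poisson process, whose inter-arrival times are exponentially distributed, and a rate that is known in advance; the function name and the sample data are illustrative, not drawn from the Smart Campus dataset.

```python
import math


def ks_statistic_exponential(samples, rate):
    """One-sample Kolmogorov-Smirnov statistic against Exp(rate).

    Returns the largest gap between the empirical CDF of `samples`
    and the exponential CDF F(x) = 1 - exp(-rate * x).
    """
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = 1.0 - math.exp(-rate * x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Illustrative inter-arrival times placed at the mid-quantiles of Exp(2.0).
n_samples = 10
inter_arrivals = [-math.log(1.0 - (i - 0.5) / n_samples) / 2.0
                  for i in range(1, n_samples + 1)]
ks_d = ks_statistic_exponential(inter_arrivals, rate=2.0)
```

A small statistic indicates a close fit. If the rate is estimated from the same samples, the usual KS critical values no longer apply directly.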
Mixed Regular and Impulsive Sampled-data LQR
We investigate the benefits of combining regular and impulsive inputs for the control of sampled-data linear time-invariant systems. We first observe that adding an impulsive term to a regular, zero-order-hold controller may help enlarge the set of sampling periods under which controllability is preserved by sampling. In this context, we provide a tailored Hautus-like necessary and sufficient condition under which controllability of the mixed regular, impulsive (MRI) sampled-data model is preserved. We then focus on LQR optimal control. After having presented the optimal controllers for the sampled-data LQR control in the MRI setting, we consider the scenario where an impulsive disturbance affects the dynamics and is known ahead of time. The solution to the so-called preview LQR is presented exploiting both regular and impulsive input components. Numerical examples, which include an insulin infusion benchmark, illustrate that leveraging both future disturbance information and MRI controls may lead to significant performance improvements.
COOCK project Smart Port 2025 D3.2: "Variability in Twinning Architectures"
This document is a result of the COOCK project "Smart Port 2025: improving and accelerating the operational efficiency of a harbour eco-system through the application of intelligent technologies". The project is mainly aimed at SMEs, but also at large corporations. Together, they form the value-chain of the harbour. The digital maturity of these actors will be increased by model and data-driven digitization. The project brings together both technology users and providers/integrators. In this report, the broad spectrum of model and data-based digitization approaches is structured, under the unifying umbrella of "Digital Twins". During the (currently quite ad-hoc) digitization process and, in particular, the creation of Digital Twins, a variety of choices have an impact on the ultimately realised system. This document identifies three stages during which this "variability" appears: the Problem Space Goal Construction Stage, the (Conceptual) Architecture Design Stage and the Deployment Stage. To illustrate the workflow, two simple use-cases are used: one of a ship moving in 1 dimension and, at a different scale and level of detail, a macroscopic model of the Port of Antwerp.
Hardware-Based Microgrid Coupled to Real-Time Simulated Power Grids for Evaluating New Control Strategies in Future Energy Systems
The design of new control strategies for future energy systems can neither be directly tested in real power grids nor be evaluated based on only current grid situations. In this regard, extensive tests are required in laboratory settings using real power system equipment. However, since it is impossible to replicate the entire grid section of interest, even in large-scale experiments, hardware setups must be supplemented by detailed simulations to reproduce the system under study fully. This paper presents a unique test environment in which a hardware-based microgrid environment is physically coupled with a large-scale real-time simulation framework. The setup combines the advantages of developing new solutions using hardware-based experiments and evaluating the impact on large-scale power systems using real-time simulations. In this paper, the interface between the microgrid-under-test environment and the real-time simulations is evaluated in terms of accuracy and communication delays. Furthermore, a test case is presented showing the approach's ability to test microgrid control strategies for supporting the grid. It is observed that the communication delays via the physical interface depend on the simulation sampling time and do not significantly affect the accuracy in the interaction between the hardware and the simulated grid.
comment: 16 pages, 10 figures
Design and Operation Principles of a Wave-Controlled Reconfigurable Intelligent Surface
A Reflective Intelligent Surface (RIS) consists of many small reflective elements whose reflection properties can be adjusted to change the wireless propagation environment. Envisioned implementations require that each RIS element be connected to a controller, and as the number of RIS elements on a surface may be on the order of hundreds or more, the number of required electrical connectors creates a difficult wiring problem, especially at high frequencies where the physical space between the elements is limited. A potential solution to this problem was previously proposed by the authors in which "biasing transmission lines" carrying standing waves are sampled at each RIS location to produce the desired bias voltage for each RIS element. This solution has the potential to substantially reduce the complexity of the RIS control. This paper presents models for the RIS elements that account for mutual coupling and realistic varactor characteristics, as well as circuit models for sampling the transmission line to generate the RIS control signals. For the latter case, the paper investigates two techniques for conversion of the transmission line standing wave voltage to the varactor bias voltage, namely an envelope detector and a sample-and-hold circuit. The paper also develops a modal decomposition approach for generating standing waves that are able to generate beams and nulls in the resulting RIS radiation pattern that maximize either the Signal-to-Noise Ratio (SNR) or the Signal-to-Leakage-plus-Noise Ratio (SLNR). Extensive simulation results are provided for the two techniques, together with a discussion of computational complexity.
comment: 20 pages, 25 figures, 2 tables
Optimal Power Grid Operations with Foundation Models
The energy transition, crucial for tackling the climate crisis, demands integrating numerous distributed, renewable energy sources into existing grids. Along with climate change and consumer behavioral changes, this leads to changes and variability in generation and load patterns, introducing significant complexity and uncertainty into grid planning and operations. While the industry has already started to exploit AI to overcome computational challenges of established grid simulation tools, we propose the use of AI Foundation Models (FMs) and advances in Graph Neural Networks to efficiently exploit poorly available grid data for different downstream tasks, enhancing grid operations. For capturing the grid's underlying physics, we believe that building a self-supervised model learning the power flow dynamics is a critical first step towards developing an FM for the power grid. We show how this approach may close the gap between the industry needs and current grid analysis capabilities, to bring the industry closer to optimal grid operation and planning.
LiDAR-Aided Millimeter-Wave Range Extension using a Passive Mirror Reflector
Passive reflectors mitigate millimeter-wave (mmwave) link blockages by extending coverage to non-line-of-sight (NLoS) regions. However, their deployment often leads to irregular reflected beam patterns and coverage gaps. This results in rapid channel fluctuations and potential outages. In this paper, we propose two LiDAR-aided link enhancement techniques to address these challenges. Leveraging user position information, we introduce a location-dependent link control strategy and a user selection technique to improve NLoS link reliability and coverage. Experimental results validate the efficacy of the proposed techniques in reducing outages and enhancing NLoS signal strength.
Cooperative Global $\mathcal{K}$-exponential Tracking Control of Multiple Mobile Robots -- Extended Version
This paper studies the cooperative tracking control problem for multiple mobile robots over a directed communication network. First, it is shown that the closed-loop system is uniformly globally asymptotically stable under the proposed distributed continuous feedback control law, where an explicit strict Lyapunov function is constructed. Then, by investigating the convergence rate, it is further proven that the closed-loop system is globally $\mathcal{K}$-exponentially stable. Moreover, to make the proposed control law more practical, the distributed continuous feedback control law is generalized to a distributed sampled-data feedback control law using the emulation approach, based on the strong integral input-to-state stable Lyapunov function. Numerical simulations are presented to validate the effectiveness of the proposed control methods.
comment: 8 pages, 3 figures
Bridging the Gap Between Central and Local Decision-Making: The Efficacy of Collaborative Equilibria in Altruistic Congestion Games
Congestion games are popular models often used to study the system-level inefficiencies caused by selfish agents, typically measured by the price of anarchy. One may expect that aligning the agents' preferences with the system-level objective--altruistic behavior--would improve efficiency, but recent works have shown that altruism can lead to more significant inefficiency than selfishness in congestion games. In this work, we study to what extent the localness of decision-making causes inefficiency by considering collaborative decision-making paradigms that exist between centralized and distributed in altruistic congestion games. In altruistic congestion games with convex latency functions, the system cost is a super-modular function over the players' joint actions, and the Nash equilibria of the game are local optima in the neighborhood of unilateral deviations. When agents can collaborate, we can exploit the common-interest structure to consider equilibria with stronger local optimality guarantees in the system objective, e.g., if groups of k agents can collaboratively minimize the system cost, the system equilibria are the local optima over k-lateral deviations. Our main contributions are in constructing tractable linear programs that provide bounds on the price of anarchy of collaborative equilibria in altruistic congestion games. Our findings bridge the gap between the known efficiency guarantees of centralized and distributed decision-making paradigms while also providing insights into the benefit of inter-agent collaboration in multi-agent systems.
Controlled fluid transport by the collective motion of microrotors
Torque-driven microscale swimming robots, or microrotors, hold significant potential in biomedical applications such as targeted drug delivery, minimally invasive surgery, and micromanipulation. This paper addresses the challenge of controlling the transport of fluid volumes using the flow fields generated by interacting groups of microrotors. Our approach uses polynomial chaos expansions to model the time evolution of fluid particle distributions and formulate an optimal control problem, which we solve numerically. We implement this framework in simulation to achieve the controlled transport of an initial fluid particle distribution to a target destination while minimizing undesirable effects such as stretching and mixing. We consider the case where translational velocities of the rotors are directly controlled, as well as the case where only torques are controlled and the rotors move in response to the collective flow fields they generate. We analyze the solution of this optimal control problem by computing the Lagrangian coherent structures of the associated flow field, which reveal the formation of transport barriers that efficiently guide particles toward their target. This analysis provides insights into the underlying mechanisms of controlled transport.
State and Action Factorization in Power Grids
The increase of renewable energy generation towards the zero-emission target is making the problem of controlling power grids more and more challenging. The recent series of competitions Learning To Run a Power Network (L2RPN) have encouraged the use of Reinforcement Learning (RL) for the assistance of human dispatchers in operating power grids. All the solutions proposed so far severely restrict the action space and are based on a single agent acting on the entire grid or multiple independent agents acting at the substations level. In this work, we propose a domain-agnostic algorithm that estimates correlations between state and action components entirely based on data. Highly correlated state-action pairs are grouped together to create simpler, possibly independent subproblems that can lead to distinct learning processes with lower computational and data requirements. The algorithm is validated on a power grid benchmark obtained with the Grid2Op simulator that has been used throughout the aforementioned competitions, showing that our algorithm is in line with domain-expert analysis. Based on these results, we lay a theoretically-grounded foundation for using distributed reinforcement learning in order to improve the existing solutions.
Probabilistic Reachability Analysis of Stochastic Control Systems
We address the reachability problem for continuous-time stochastic dynamic systems. Our objective is to present a unified framework that characterizes the reachable set of a dynamic system in the presence of both stochastic disturbances and deterministic inputs. To achieve this, we devise a strategy that effectively decouples the effects of deterministic inputs and stochastic disturbances on the reachable sets of the system. For the deterministic part, many existing methods can capture the deterministic reachability. As for the stochastic disturbances, we introduce a novel technique that probabilistically bounds the difference between a stochastic trajectory and its deterministic counterpart. The key to our approach is introducing a novel energy function termed the Averaged Moment Generating Function that yields a high probability bound for this difference. This bound is tight and exact for linear stochastic dynamics and applicable to a large class of nonlinear stochastic dynamics. By combining our innovative technique with existing methods for deterministic reachability analysis, we can compute estimations of reachable sets that surpass those obtained with current approaches for stochastic reachability analysis. We validate the effectiveness of our framework through various numerical experiments. Beyond its immediate applications in reachability analysis, our methodology is poised to have profound implications in the broader analysis and control of stochastic systems. It opens avenues for enhanced understanding and manipulation of complex stochastic dynamics, presenting opportunities for advancements in related fields.
PID Accelerated Temporal Difference Algorithms
Long-horizon tasks, which have a large discount factor, pose a challenge for most conventional reinforcement learning (RL) algorithms. Algorithms such as Value Iteration and Temporal Difference (TD) learning have a slow convergence rate and become inefficient in these tasks. When the transition distributions are given, PID VI was recently introduced to accelerate the convergence of Value Iteration using ideas from control theory. Inspired by this, we introduce PID TD Learning and PID Q-Learning algorithms for the RL setting, in which only samples from the environment are available. We give a theoretical analysis of the convergence of PID TD Learning and its acceleration compared to the conventional TD Learning. We also introduce a method for adapting PID gains in the presence of noise and empirically verify its effectiveness.
Risk-Aware Real-Time Task Allocation for Stochastic Multi-Agent Systems under STL Specifications
This paper addresses the control synthesis of heterogeneous stochastic linear multi-agent systems with real-time allocation of signal temporal logic (STL) specifications. Based on previous work, we decompose specifications into sub-specifications on the individual agent level. To leverage the efficiency of task allocation, a heuristic filter evaluates potential task allocation based on STL robustness, and subsequently, an auctioning algorithm determines the definitive allocation of specifications. Finally, a control strategy is synthesized for each agent-specification pair using tube-based model predictive control (MPC), ensuring provable probabilistic satisfaction. We demonstrate the efficacy of the proposed methods using a multi-shuttle scenario that highlights a promising extension to automated driving applications like vehicle routing.
comment: 7 pages, 5 figures, Accepted for CDC 2024. arXiv admin note: text overlap with arXiv:2402.03165
Behavioral Learning of Dish Rinsing and Scrubbing based on Interruptive Direct Teaching Considering Assistance Rate
Robots are expected to manipulate objects in a safe and dexterous way. For example, washing dishes is a dexterous operation that involves scrubbing the dishes with a sponge and rinsing them with water. It is necessary to learn it safely without splashing water and without dropping the dishes. In this study, we propose a safe and dexterous manipulation system. The robot learns a dynamics model of the object by estimating the state of the object and the robot itself, the control input, and the amount of human assistance required (assistance rate) after the human corrects the initial trajectory of the robot's hands by interruptive direct teaching. By backpropagating the error between the estimated and the reference value using the acquired dynamics model, the robot can generate a control input that approaches the reference value, for example, so that human assistance is not required and the dish does not move excessively. This allows for adaptive rinsing and scrubbing of dishes with unknown shapes and properties. As a result, it is possible to generate safe actions that require less human assistance.
comment: Accepted at Advanced Robotics
A General Framework for Load Forecasting based on Pre-trained Large Language Model
Accurate load forecasting is crucial for maintaining the power balance between generators and consumers, particularly with the increasing integration of renewable energy sources, which introduce significant intermittent volatility. With the advancement of data-driven methods, machine learning and deep learning models have become the predominant approaches for load forecasting tasks. In recent years, pre-trained large language models (LLMs) have achieved significant progress, demonstrating superior performance across various fields. This paper proposes a load forecasting method based on LLMs, offering not only precise predictive capabilities but also broad and flexible applicability. Additionally, a data modeling method is introduced to effectively transform load sequence data into natural language suitable for LLM training. Furthermore, a data enhancement strategy is designed to mitigate the impact of LLM hallucinations on forecasting results. The effectiveness of the proposed method is validated using two real-world datasets. Compared to existing methods, our approach demonstrates state-of-the-art performance across all validation metrics.
comment: 11 pages, 3 figures and 5 tables
Adaptive Incentive Design with Learning Agents
How can the system operator learn an incentive mechanism that achieves social optimality based on limited information about the agents' behavior, who are dynamically updating their strategies? To answer this question, we propose an \emph{adaptive} incentive mechanism. This mechanism updates the incentives of agents based on the feedback of each agent's externality, evaluated as the difference between the player's marginal cost and society's marginal cost at each time step. The proposed mechanism updates the incentives on a slower timescale compared to the agents' learning dynamics, resulting in a two-timescale coupled dynamical system. Notably, this mechanism is agnostic to the specific learning dynamics used by agents to update their strategies. We show that any fixed point of this adaptive incentive mechanism corresponds to the optimal incentive mechanism, ensuring that the Nash equilibrium coincides with the socially optimal strategy. Additionally, we provide sufficient conditions that guarantee the convergence of the adaptive incentive mechanism to a fixed point. Our results apply to both atomic and non-atomic games. To demonstrate the effectiveness of our proposed mechanism, we verify the convergence conditions in two practically relevant games: atomic networked quadratic aggregative games and non-atomic network routing games.
comment: 30 pages
Koopman Analysis of the Singularly-Perturbed van der Pol Oscillator
The Koopman operator framework holds promise for spectral analysis of nonlinear dynamical systems based on linear operators. Eigenvalues and eigenfunctions of the Koopman operator, so-called Koopman eigenvalues and Koopman eigenfunctions, respectively, mirror global properties of the system's flow. In this paper we perform the Koopman analysis of the singularly-perturbed van der Pol system. First, we show the spectral signature depending on singular perturbation: how two Koopman principal eigenvalues are ordered and what distinct shapes emerge in their associated Koopman eigenfunctions. Second, we discuss the singular limit of the Koopman operator, which is derived through the concatenation of Koopman operators for the fast and slow subsystems. From the spectral properties of the Koopman operator for the singularly-perturbed system and the singular limit, we suggest that the Koopman eigenfunctions inherit geometric properties of the singularly-perturbed system. These results are applicable to general planar singularly-perturbed systems with stable limit cycles.
comment: 21 pages, 10 figures
Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions
Learning-based control has recently shown great efficacy in performing complex tasks for various applications. However, to deploy it in real systems, it is of vital importance to guarantee the system will stay safe. Control Barrier Functions (CBFs) offer mathematical tools for designing safety-preserving controllers for systems with known dynamics. In this article, we first introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers using Gaussian Process (GP) regression to close the gap between an approximate mathematical model and the real system, which results in a second-order cone program (SOCP)-based control design. We then present the pointwise feasibility conditions of the resulting safety controller, highlighting the level of richness that the available system information must meet to ensure safety. We use these conditions to devise an event-triggered online data collection strategy that ensures the recursive feasibility of the learned safety controller. Our method works by constantly reasoning about whether the current information is sufficient to ensure safety or if new measurements under active safe exploration are required to reduce the uncertainty. As a result, our proposed framework can guarantee the forward invariance of the safe set defined by the CBF with high probability, even if it contains a priori unexplored regions. We validate the proposed framework in two numerical simulation experiments.
comment: Journal article. Includes the results of the 2021 CDC paper titled "Pointwise feasibility of gaussian process-based safety-critical control under model uncertainty" and proposes a recursively feasible safe online learning algorithm as new contribution
On the Sample Complexity of Imitation Learning for Smoothed Model Predictive Control
Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller. However, constructing such smoothed expert controllers for arbitrary systems remains challenging, especially in the presence of input and state constraints. As our primary contribution, we show how such a smoothed expert can be designed for a general class of systems using a log-barrier-based relaxation of a standard Model Predictive Control (MPC) optimization problem. At the crux of this theoretical guarantee on smoothness is a new lower bound we prove on the optimality gap of the analytic center associated with a convex Lipschitz function, which we hope could be of independent interest. We validate our theoretical findings via experiments, demonstrating the merits of our smoothing approach over randomized smoothing.
comment: 15 pages, 2 figures. Preliminary version accepted to CDC 2024
Monotonicity and Contraction on Polyhedral Cones
In this note, we study monotone dynamical systems with respect to polyhedral cones. Using the half-space representation and the vertex representation, we propose three equivalent conditions to certify monotonicity of a dynamical system with respect to a polyhedral cone. We then introduce the notion of gauge norm associated with a cone and provide closed-form formulas for computing gauge norms associated with polyhedral cones. A key feature of gauge norms is that contractivity of monotone systems with respect to them can be efficiently characterized using simple inequalities. This result generalizes the well-known criteria for Hurwitzness of Metzler matrices and provides a scalable approach to search for Lyapunov functions of monotone systems with respect to polyhedral cones. Finally, we study the applications of our results in transient stability of dynamic flow networks and in scalable control design with safety guarantees.
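The classical criterion that the abstract above generalizes can be sketched for the positive orthant: a Metzler matrix A (nonnegative off-diagonal entries) is Hurwitz if and only if some entrywise positive vector v satisfies Av < 0 entrywise. The sketch below only checks a candidate certificate v supplied by the caller; the function names are illustrative, and the polyhedral-cone and gauge-norm constructions of the note itself are not reproduced.

```python
def is_metzler(a):
    """True if every off-diagonal entry of the square matrix `a` is >= 0."""
    n = len(a)
    return all(a[i][j] >= 0 for i in range(n) for j in range(n) if i != j)


def certifies_hurwitz(a, v):
    """Check the certificate: `a` is Metzler, v > 0, and a @ v < 0 entrywise.

    True proves that `a` is Hurwitz. False only means that this particular
    v is not a valid certificate; it does not prove instability.
    """
    if not is_metzler(a) or any(x <= 0 for x in v):
        return False
    return all(sum(a_ij * v_j for a_ij, v_j in zip(row, v)) < 0 for row in a)


stable = [[-2.0, 1.0], [1.0, -2.0]]    # eigenvalues -1 and -3
unstable = [[-1.0, 2.0], [2.0, -1.0]]  # eigenvalues 1 and -3
```

Searching for v is a linear feasibility problem, which is what makes this family of conditions scale to large systems.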
Differentially Private Reward Functions in Policy Synthesis for Markov Decision Processes
Markov decision processes often seek to maximize a reward function, but onlookers may infer reward functions by observing the states and actions of such systems, revealing sensitive information. Therefore, in this paper we introduce and compare two methods for privatizing reward functions in policy synthesis for multi-agent Markov decision processes, which generalize Markov decision processes. Reward functions are privatized using differential privacy, a statistical framework for protecting sensitive data. The methods we develop perturb either (1) each agent's individual reward function or (2) the joint reward function shared by all agents. We show that approach (1) provides better performance. We then develop a polynomial-time algorithm for the numerical computation of the performance loss due to privacy on a case-by-case basis. Next, using approach (1), we develop guidelines for selecting reward function values to preserve "goal" and "avoid" states while still remaining private, and we quantify the increase in computational complexity needed to compute policies from privatized rewards. Numerical simulations are performed on three classes of systems and they reveal a surprising compatibility with privacy: using reasonably strong privacy ($\epsilon =1.3$) on average induces as little as a $5\%$ decrease in total accumulated reward and a $0.016\%$ increase in computation time.
comment: 16 Pages, 11 figures
Aerostack2: A Software Framework for Developing Multi-robot Aerial Systems
The development of autonomous aerial systems, particularly for multi-robot configurations, is a complex challenge requiring multidisciplinary expertise. Unlike ground robotics, aerial robotics has seen limited standardization, leading to fragmented development efforts. To address this gap, we introduce Aerostack2, a comprehensive, open-source ROS 2 based framework designed for creating versatile and robust multi-robot aerial systems. Aerostack2 features platform independence, a modular plugin architecture, and behavior-based mission control, enabling easy customization and integration across various platforms. In this paper, we detail the full architecture of Aerostack2, which has been tested with several platforms in both simulation and real flights. We demonstrate its effectiveness through multiple validation scenarios, highlighting its potential to accelerate innovation and enhance collaboration in the aerial robotics community.
comment: Submitted to IEEE Robotics and Automation Letters
Robotics
Time-Varying Soft-Maximum Barrier Functions for Safety in Unmapped and Dynamic Environments
We present a closed-form optimal feedback control method that ensures safety in an a priori unknown and potentially dynamic environment. This article considers the scenario where local perception data (e.g., LiDAR) is obtained periodically, and this data can be used to construct a local control barrier function (CBF) that models a local set that is safe for a period of time into the future. Then, we use a smooth time-varying soft-maximum function to compose the N most recently obtained local CBFs into a single barrier function that models an approximate union of the N most recently obtained local sets. This composite barrier function is used in a constrained quadratic optimization, which is solved in closed form to obtain a safe-and-optimal feedback control. We also apply the time-varying soft-maximum barrier function control to two robotic systems (a nonholonomic ground robot with nonnegligible inertia and a quadrotor robot), where the objective is to navigate an a priori unknown environment safely and reach a target destination. In these applications, we present a simple approach to generate local CBFs from periodically obtained perception data.
comment: Preprint submitted to IEEE Transactions on Control Systems Technology (TCST)
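To illustrate the composition step described in the abstract above, here is a minimal sketch of a soft-maximum over local barrier functions. It assumes the common log-sum-exp form of the soft maximum; the abstract does not give the exact formula, and the time-varying blending of newly obtained CBFs is omitted. The names `soft_max`, `composite_barrier`, `kappa`, and the two example barrier functions are hypothetical.

```python
import math

def soft_max(values, kappa=10.0):
    """Smooth lower bound on max(values); it tightens as kappa grows.

    Assumed form: (1/kappa) * log(sum(exp(kappa * v))) - log(N) / kappa.
    """
    n = len(values)
    m = max(values)  # shift by the maximum for numerical stability
    lse = m + math.log(sum(math.exp(kappa * (v - m)) for v in values)) / kappa
    return lse - math.log(n) / kappa

def composite_barrier(local_cbfs, x, kappa=10.0):
    """Compose local barrier functions h_i(x) into a single barrier value."""
    return soft_max([h(x) for h in local_cbfs], kappa)

# Two hypothetical local safe sets on a 1-D state: intervals of half-width 1
# centred at 0 and at 1.5 (h_i(x) >= 0 means x lies in local set i).
h1 = lambda x: 1.0 - x ** 2
h2 = lambda x: 1.0 - (x - 1.5) ** 2

h_inside = composite_barrier([h1, h2], 0.2)   # x inside the first set
h_outside = composite_barrier([h1, h2], 4.0)  # x outside both sets
```

Because this form never exceeds the true maximum, a positive composite value implies that at least one local barrier function is positive, so the zero-superlevel set is an inner approximation of the union of the local sets.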
Enhancing Sample Efficiency and Exploration in Reinforcement Learning through the Integration of Diffusion Models and Proximal Policy Optimization
Recent advancements in reinforcement learning (RL) have been fueled by large-scale data and deep neural networks, particularly for high-dimensional and complex tasks. Online RL methods like Proximal Policy Optimization (PPO) are effective in dynamic scenarios but require substantial real-time data, posing challenges in resource-constrained or slow simulation environments. Offline RL addresses this by pre-learning policies from large datasets, though its success depends on the quality and diversity of the data. This work proposes a framework that enhances PPO algorithms by incorporating a diffusion model to generate high-quality virtual trajectories for offline datasets. This approach improves exploration and sample efficiency, leading to significant gains in cumulative rewards, convergence speed, and strategy stability in complex tasks. Our contributions are threefold: we explore the potential of diffusion models in RL, particularly for offline datasets, extend the application of online RL to offline environments, and experimentally validate the performance improvements of PPO with diffusion models. These findings provide new insights and methods for applying RL to high-dimensional, complex tasks. Finally, we open-source our code at https://github.com/TianciGao/DiffPPO
Performance-Aware Self-Configurable Multi-Agent Networks: A Distributed Submodular Approach for Simultaneous Coordination and Network Design
We introduce the first, to our knowledge, rigorous approach that enables multi-agent networks to self-configure their communication topology to balance the trade-off between scalability and optimality during multi-agent planning. We are motivated by the future of ubiquitous collaborative autonomy where numerous distributed agents will be coordinating via agent-to-agent communication to execute complex tasks such as traffic monitoring, event detection, and environmental exploration. But the explosion of information in such large-scale networks currently curtails their deployment due to impractical decision times induced by the computational and communication requirements of the existing near-optimal coordination algorithms. To overcome this challenge, we present the AlterNAting COordination and Network-Design Algorithm (Anaconda), a scalable algorithm that also enjoys near-optimality guarantees. Subject to the agents' bandwidth constraints, Anaconda enables the agents to optimize their local communication neighborhoods such that the action-coordination approximation performance of the network is maximized. Compared to the state of the art, Anaconda is an anytime self-configurable algorithm that quantifies its suboptimality guarantee for any type of network, from fully disconnected to fully centralized, and that, for sparse networks, is one order faster in terms of decision speed. To develop the algorithm, we quantify the suboptimality cost due to decentralization, i.e., due to communication-minimal distributed coordination. We also employ tools inspired by the literature on multi-armed bandits and submodular maximization subject to cardinality constraints. We demonstrate Anaconda in simulated scenarios of area monitoring and compare it with a state-of-the-art algorithm.
comment: Accepted to CDC 2024
Grounding Language Models in Autonomous Loco-manipulation Tasks ICRA
Humanoid robots with behavioral autonomy have consistently been regarded as ideal collaborators in our daily lives and promising representations of embodied intelligence. Compared to fixed-base robotic arms, humanoid robots offer a larger operational space while significantly increasing the difficulty of control and planning. Despite the rapid progress towards general-purpose humanoid robots, most studies remain focused on locomotion ability with few investigations into whole-body coordination and task planning, thus limiting the potential to demonstrate long-horizon tasks involving both mobility and manipulation under open-ended verbal instructions. In this work, we propose a novel framework that learns, selects, and plans behaviors based on tasks in different scenarios. We combine reinforcement learning (RL) with whole-body optimization to generate robot motions and store them in a motion library. We further leverage the planning and reasoning features of the large language model (LLM), constructing a hierarchical task graph that comprises a series of motion primitives to bridge lower-level execution with higher-level planning. Experiments in simulation and in the real world using the CENTAURO robot show that the language-model-based planner can efficiently adapt to new loco-manipulation tasks, demonstrating high autonomy from free-text commands in unstructured scenes.
comment: Submitted to ICRA@40. arXiv admin note: substantial text overlap with arXiv:2406.14655
An Investigation of Denial of Service Attacks on Autonomous Driving Software and Hardware in Operation
This research investigates the impact of Denial of Service (DoS) attacks, specifically Internet Control Message Protocol (ICMP) flood attacks, on Autonomous Driving (AD) systems, focusing on their control modules. Two experimental setups were created: the first involved an ICMP flood attack on a Raspberry Pi running an AD software stack, and the second examined the effects of single and double ICMP flood attacks on a Global Navigation Satellite System Real-Time Kinematic (GNSS-RTK) device for high-accuracy localization of an autonomous vehicle that is available on the market. The results indicate a moderate impact of DoS attacks on the AD stack, where the increase in median computation time was marginal, suggesting a degree of resilience to these types of attacks. In contrast, the GNSS device demonstrated significant vulnerability: during DoS attacks, the sample rate dropped drastically to approximately 50% and 5% of the nominal rate for single and double attacker configurations, respectively. Additionally, the longest observed time increments were in the range of seconds during the attacks. These results underscore the vulnerability of AD systems to DoS attacks and the critical need for robust cybersecurity measures. This work provides valuable insights into the design requirements of AD software stacks and highlights that external hardware and modules can be significant attack surfaces.
External Steering of Vine Robots via Magnetic Actuation
This paper explores the concept of external magnetic control for vine robots to enable their high curvature steering and navigation for use in endoluminal applications. Vine robots, inspired by natural growth and locomotion strategies, present unique shape adaptation capabilities that allow passive deformation around obstacles. However, without additional steering mechanisms, they lack the ability to actively select the desired direction of growth. The principles of magnetically steered growing robots are discussed, and experimental results showcase the effectiveness of the proposed magnetic actuation approach. We present a 25 mm diameter vine robot with integrated magnetic tip capsule, including 6 Degrees of Freedom (DOF) localization and camera and demonstrate a minimum bending radius of 3.85 cm with an internal pressure of 30 kPa. Furthermore, we evaluate the robot's ability to form tight curvature through complex navigation tasks, with magnetic actuation allowing for extended free-space navigation without buckling. The suspension of the magnetic tip was also validated using the 6 DOF localization system to ensure that the shear-free nature of vine robots was preserved. Additionally, by exploiting the magnetic wrench at the tip, we showcase preliminary results of vine retraction. The findings contribute to the development of controllable vine robots for endoluminal applications, providing high tip force and shear-free navigation.
comment: 13 pages, 10 figures
Adaptive Artificial Time Delay Control for Robotic Systems
The artificial time delay controller was conceptualised for nonlinear systems to reduce dependency on precise system modelling, unlike conventional adaptive and robust control strategies. In this approach, unknown dynamics are compensated by using input and state measurements collected at the immediate past time instant (i.e., artificially delayed). The advantage of this kind of approach lies in its simplicity and ease of implementation. However, applications of artificial time delay controllers in robotics that are also robust against unknown state-dependent uncertainty are still largely missing. This thesis presents the study of this control approach for two important classes of robotic systems, namely a fully actuated bipedal walking robot and an underactuated quadrotor system. In the first work, we explore the idea of a unified control design instead of multiple controllers for different walking phases in adaptive bipedal walking control while bypassing the computation of constraint forces, since they often lead to complex designs. The second work focuses on quadrotors employed for applications such as payload delivery, inspection and search-and-rescue. The effectiveness of this controller is validated using experimental results.
Revisiting Safe Exploration in Safe Reinforcement learning
Safe reinforcement learning (SafeRL) extends standard reinforcement learning with the idea of safety, where safety is typically defined through the constraint of the expected cost return of a trajectory being below a set limit. However, this metric fails to distinguish how costs accrue, treating infrequent severe cost events as equal to frequent mild ones, which can lead to riskier behaviors and result in unsafe exploration. We introduce a new metric, expected maximum consecutive cost steps (EMCC), which addresses safety during training by assessing the severity of unsafe steps based on their consecutive occurrence. This metric is particularly effective for distinguishing between prolonged and occasional safety violations. We apply EMCC to both on- and off-policy algorithms to benchmark their safe exploration capability. Finally, we validate our metric through a set of benchmarks and propose a new lightweight benchmark task, which allows fast evaluation for algorithm design.
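A minimal sketch of how a metric like EMCC could be computed from logged per-step costs. It assumes that a step counts as unsafe when its cost exceeds a threshold (zero here) and that the expectation is estimated by averaging over sampled trajectories; the abstract does not give the precise definition, and the function names are hypothetical.

```python
def max_consecutive_cost_steps(costs, threshold=0.0):
    """Length of the longest run of consecutive steps whose cost exceeds threshold."""
    longest = current = 0
    for c in costs:
        if c > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a safe step breaks the run
    return longest

def emcc(trajectories, threshold=0.0):
    """Monte Carlo estimate: mean of the longest unsafe run over trajectories."""
    runs = [max_consecutive_cost_steps(t, threshold) for t in trajectories]
    return sum(runs) / len(runs)

# Two hypothetical cost traces with the same cumulative cost (3).
occasional = [1, 0, 1, 0, 1, 0]  # three isolated unsafe steps
prolonged = [0, 1, 1, 1, 0, 0]   # one three-step violation

run_occasional = max_consecutive_cost_steps(occasional)
run_prolonged = max_consecutive_cost_steps(prolonged)
estimate = emcc([occasional, prolonged])
```

The two traces are indistinguishable under a cumulative-cost constraint, while the longest-run statistic separates the prolonged violation from the occasional ones.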
Saying goodbyes to rotating your phone: Magnetometer calibration during SLAM
While Wi-Fi positioning is still more common indoors, using magnetic field features has become widely known and utilized as an alternative or supporting source of information. Magnetometer bias presents a significant challenge in magnetic field navigation and SLAM. Traditionally, magnetometers have been calibrated using standard sphere or ellipsoid fitting methods and by requiring manual user procedures, such as rotating a smartphone in a figure-eight shape. This is not always feasible, particularly when the magnetometer is attached to heavy or fast-moving platforms, or when user behavior cannot be reliably controlled. Recent research has proposed using map data for calibration during positioning. This paper takes a step further and verifies that a pre-collected map is not needed; instead, calibration can be done as part of a SLAM process. The presented solution uses a factorized particle filter that factors out calibration in addition to the magnetic field map. The method is validated using smartphone data from a shopping mall and mobile robotics data from an office environment. Results support the claim that magnetometer calibration can be achieved during SLAM with comparable accuracy to manual calibration. Furthermore, the method seems to slightly improve manual calibration when used on top of it, suggesting potential for integrating various calibration approaches.
comment: Accepted for publication at the 14th International Conference on Indoor Positioning and Indoor Navigation (IPIN 2024)
CyberCortex.AI: An AI-based Operating System for Autonomous Robotics and Complex Automation
The underlying framework for controlling autonomous robots and complex automation applications is an Operating System (OS) capable of scheduling perception-and-control tasks, as well as providing real-time data communication to other robotic peers and remote cloud computers. In this paper, we introduce CyberCortex.AI, a robotics OS designed to enable heterogeneous AI-based robotics and complex automation applications. CyberCortex.AI is a decentralized distributed OS which enables robots to talk to each other, as well as to High Performance Computers (HPC) in the cloud. Sensory and control data from the robots is streamed towards HPC systems with the purpose of training AI algorithms, which are afterwards deployed on the robots. Each functionality of a robot (e.g. sensory data acquisition, path planning, motion control, etc.) is executed within a so-called DataBlock of Filters shared through the internet, where each filter is computed either locally on the robot itself, or remotely on a different robotic system. The data is stored and accessed via a so-called Temporal Addressable Memory (TAM), which acts as a gateway between each filter's input and output. CyberCortex.AI has two main components: i) the CyberCortex.AI.inference system, which is a real-time implementation of the DataBlock running on the robots' embedded hardware, and ii) the CyberCortex.AI.dojo, which runs on an HPC computer in the cloud and is used to design, train and deploy AI algorithms. We present a quantitative and qualitative performance analysis of the proposed approach using two collaborative robotics applications: i) a forest fire prevention system based on a Unitree A1 legged robot and an Anafi Parrot 4K drone, as well as ii) an autonomous driving system which uses CyberCortex.AI for collaborative perception and motion control.
Direct Kinematics, Inverse Kinematics, and Motion Planning of 1-DoF Rational Linkages
This study presents a set of algorithms that deal with trajectory planning of rational single-loop mechanisms with one degree-of-freedom (DoF). Benefiting from a dual quaternion representation of a rational motion, a formula for direct (forward) kinematics, a numerical inverse kinematics algorithm, and the generation of a driving-joint trajectory are provided. A novel approach using the Gauss-Newton search for the one-parameter inverse kinematics problem is presented. Additionally, a method for performing smooth equidistant travel of the tool is provided by applying arc-length reparameterization. This general approach can be applied to one-DoF mechanisms with four to seven joints characterized by a rational motion, without any additional geometrical analysis. An experiment was performed to demonstrate the usage in a laboratory setup.
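The arc-length reparameterization mentioned above can be sketched numerically as follows. This is a generic illustration on a sampled curve, not the dual-quaternion formulation of the paper: it accumulates chord lengths along a densely sampled path and inverts the resulting arc-length table by linear interpolation. The function name and parameters are hypothetical.

```python
import math
from bisect import bisect_left

def equidistant_parameters(curve, t0, t1, n_points, n_samples=2000):
    """Return parameter values whose curve points are equally spaced in arc length."""
    ts = [t0 + (t1 - t0) * i / n_samples for i in range(n_samples + 1)]
    pts = [curve(t) for t in ts]
    # Cumulative chord length approximates the arc length s(t).
    s = [0.0]
    for p, q in zip(pts, pts[1:]):
        s.append(s[-1] + math.dist(p, q))
    total = s[-1]
    params = []
    for k in range(n_points):
        target = total * k / (n_points - 1)
        j = min(bisect_left(s, target), len(s) - 1)
        if j == 0:
            params.append(ts[0])
            continue
        s_lo, s_hi = s[j - 1], s[j]
        # Linear interpolation inside the bracketing sample interval.
        w = 0.0 if s_hi == s_lo else min(1.0, (target - s_lo) / (s_hi - s_lo))
        params.append(ts[j - 1] + w * (ts[j] - ts[j - 1]))
    return params

# Hypothetical tool path: a straight line traversed with non-uniform speed.
path = lambda t: (t * t, 0.0)
params = equidistant_parameters(path, 0.0, 1.0, 3)
```

For this path the arc length is s(t) = t^2, so the midpoint of the travel corresponds to t = sqrt(0.5) rather than t = 0.5, which is what the reparameterization recovers.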
Integrating End-to-End and Modular Driving Approaches for Online Corner Case Detection in Autonomous Driving
Online corner case detection is crucial for ensuring safety in autonomous driving vehicles. Current autonomous driving approaches can be categorized into modular approaches and end-to-end approaches. To leverage the advantages of both, we propose a method for online corner case detection that integrates an end-to-end approach into a modular system. The modular system takes over the primary driving task and the end-to-end network runs in parallel as a secondary one; the disagreement between the systems is then used for corner case detection. We implement this method on a real vehicle and evaluate it qualitatively. Our results demonstrate that end-to-end networks, known for their superior situational awareness, can effectively contribute to corner case detection when used as secondary driving systems. These findings suggest that such an approach holds potential for enhancing the safety of autonomous vehicles.
comment: IEEE SMC 2024
Development and Validation of a Modular Sensor-Based System for Gait Analysis and Control in Lower-Limb Exoskeletons
With rapid advancements in exoskeleton hardware technologies, successful assessment and accurate control remain challenging. This study introduces a modular sensor-based system to enhance biomechanical evaluation and control in lower-limb exoskeletons, utilizing advanced sensor technologies and fuzzy logic. We aim to surpass the limitations of current biomechanical evaluation methods confined to laboratories and to address the high costs and complexity of exoskeleton control systems. The system integrates inertial measurement units, force-sensitive resistors, and load cells into instrumented crutches and 3D-printed insoles. These components function both independently and collectively to capture comprehensive biomechanical data, including the anteroposterior center of pressure and crutch ground reaction forces. This data is processed through a central unit using fuzzy logic algorithms for real-time gait phase estimation and exoskeleton control. Validation experiments with three participants, benchmarked against gold-standard motion capture and force plate technologies, demonstrate our system's capability for reliable gait phase detection and precise biomechanical measurements. By offering our designs open-source and integrating cost-effective technologies, this study advances wearable robotics and promotes broader innovation and adoption in exoskeleton research.
comment: 12 pages, 8 figures, submitted to IEEE Transactions in Medical Robotics and Bionics
Remote telepresence over large distances via robot avatars: case studies
This paper discusses the necessary considerations and adjustments that allow a recently proposed avatar system architecture to be used with different robotic avatar morphologies (both wheeled and legged robots with various types of hands and kinematic structures) for the purpose of enabling remote (intercontinental) telepresence under communication bandwidth restrictions. The case studies reported involve robots using both position and torque control modes, independently of their software middleware.
Online Non-linear Centroidal MPC with Stability Guarantees for Robust Locomotion of Legged Robots
Nonlinear model predictive locomotion controllers based on the reduced centroidal dynamics are nowadays ubiquitous in legged robots. These schemes, even if they assume an inherent simplification of the robot's dynamics, were shown to endow robots with a step-adjustment capability in reaction to small pushes, and, moreover, in the case of uncertain parameters such as unknown payloads, they were shown to provide some practical, albeit limited, robustness. In this work, we provide rigorous certificates of their closed-loop stability via a reformulation of the centroidal MPC controller. This is achieved thanks to a systematic procedure inspired by the machinery of adaptive control, together with ideas coming from Control Lyapunov functions. Our reformulation, in addition, provides robustness for a class of unmeasured constant disturbances. To demonstrate the generality of our approach, we validated our formulation on a new generation of humanoid robots, the 56.7 kg ergoCub, as well as on a commercially available 21 kg quadruped robot, Aliengo.
Coverage Metrics for a Scenario Database for the Scenario-Based Assessment of Automated Driving Systems
Automated Driving Systems (ADSs) have the potential to make mobility services available and safe for all. A multi-pillar Safety Assessment Framework (SAF) has been proposed for the type-approval process of ADSs. The SAF requires that the test scenarios for the ADS adequately cover the Operational Design Domain (ODD) of the ADS. A common method for generating test scenarios involves basing them on scenarios identified and characterized from driving data. This work addresses two questions when collecting scenarios from driving data. First, do the collected scenarios cover all relevant aspects of the ADS' ODD? Second, do the collected scenarios cover all relevant aspects that are in the driving data, such that no potentially important situations are missed? This work proposes coverage metrics that provide a quantitative answer to these questions. The proposed coverage metrics are illustrated by means of an experiment in which over 200,000 scenarios from 10 different scenario categories are collected from the HighD data set. The experiment demonstrates that a coverage of 100 % can be achieved under certain conditions, and it also identifies which data and scenarios could be added to enhance the coverage outcomes in case a 100 % coverage has not been achieved. Whereas this work presents metrics for the quantification of the coverage of driving data and the identified scenarios, this paper concludes with future research directions, including the quantification of the completeness of driving data and the identified scenarios.
comment: Accepted for the 2024 IEEE International Automated Vehicle Validation (IAVVC 2024) Conference
Scenario-based assessment of automated driving systems: How (not) to parameterize scenarios?
The development of Automated Driving Systems (ADSs) has advanced significantly. To enable their large-scale deployment, the United Nations Regulation 157 (UN R157) concerning the approval of Automated Lane Keeping Systems (ALKSs) was approved in 2021. UN R157 requires an activated ALKS to avoid any collisions that are reasonably preventable and proposes a method to distinguish reasonably preventable collisions from unpreventable ones using "the simulated performance of a skilled and attentive human driver". With different driver models, benchmarks are set for ALKSs in three types of scenarios. The three types of scenarios considered in the proposed method in UN R157 assume a certain parameterization without any further consideration. This work investigates the parameterization of these scenarios, showing that the choice of parameterization significantly affects the simulation outcomes. By comparing real-world and parameterized scenarios, we show that the influence of parameterization depends on the scenario type, driver model, and evaluation criterion. Alternative parameterizations are proposed, leading to results that are closer to the non-parameterized scenarios in terms of recall, precision, and F1 score. The study highlights the importance of careful scenario parameterization and suggests improvements to the current UN R157 approach.
comment: Accepted for the 2024 IEEE International Automated Vehicle Validation (IAVVC2024) Conference
AI Olympics challenge with Evolutionary Soft Actor Critic
In the following report, we describe the solution we propose for the AI Olympics competition held at IROS 2024. Our solution is based on a model-free Deep Reinforcement Learning approach combined with an evolutionary strategy. We briefly describe the algorithms that have been used and then provide details of the approach.
Online One-Dimensional Magnetic Field SLAM with Loop-Closure Detection
We present a lightweight magnetic field simultaneous localisation and mapping (SLAM) approach for drift correction in odometry paths, where the interest is purely in the odometry and not in map building. We represent the past magnetic field readings as a one-dimensional trajectory against which the current magnetic field observations are matched. This approach boils down to sequential loop-closure detection and decision-making, based on the current pose state estimate and the magnetic field. We combine this setup with a path estimation framework using an extended Kalman smoother which fuses the odometry increments with the detected loop-closure timings. We demonstrate the practical applicability of the model with several different real-world examples from a handheld iPad moving in indoor scenes.
comment: To appear in International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI) 2024
Affordance-based Robot Manipulation with Flow Matching
We present a framework for assistive robot manipulation, which focuses on two fundamental challenges: first, efficiently adapting large-scale models to downstream scene affordance understanding tasks, especially in daily living scenarios where gathering multi-task data involving humans requires strenuous effort; second, effectively learning robot trajectories by grounding the visual affordance model. We tackle the first challenge by employing a parameter-efficient prompt tuning method that prepends learnable text prompts to the frozen vision model to predict manipulation affordances in multi-task scenarios. Then we propose to learn robot trajectories guided by affordances in a supervised Flow Matching method. Flow matching represents a robot visuomotor policy as a conditional process of flowing random waypoints to desired robot trajectories. Finally, we introduce a real-world dataset with 10 tasks across Activities of Daily Living to test our framework. Our extensive evaluation highlights that the proposed prompt tuning method for learning manipulation affordance with language prompter achieves competitive performance and even outperforms other finetuning protocols across data scales, while satisfying parameter efficiency. Learning multi-task robot trajectories with a single flow matching policy also leads to consistently better performance than alternative behavior cloning methods, especially given multimodal robot action distributions. Our framework seamlessly unifies affordance model learning and trajectory generation with flow matching for robot manipulation.
Flying a Quadrotor with Unknown Actuators and Sensor Configuration
Though control algorithms for multirotor Unmanned Aerial Vehicles (UAVs) are well understood, the configuration, parameter estimation, and tuning of flight control algorithms take considerable time and resources. In previous work, we have shown that it is possible to identify the control effectiveness and motor dynamics of a multirotor fast enough for it to recover to a stable hover after being thrown 4 meters in the air. In this paper, we extend this to include estimation of the position of the Inertial Measurement Unit (IMU) relative to the Center of Gravity (CoG), estimation of the IMU rotation, the thrust direction of all motors and the optimal combined thrust direction. In order to guarantee a correct IMU position estimation, two prior throw-and-catches of the vehicle with spin around different axes are required. For these throws, a height as low as 1 meter is sufficient. Quadrotor flight experimentation confirms the efficacy of the approach, and a simulation shows its applicability to fully-actuated crafts with multiple possible hover orientations.
comment: This work has been submitted to IMAV 2024 for possible publication
Accelerated Multi-objective Task Learning using Modified Q-learning Algorithm
Robots find extensive applications in industry. In recent years, the influence of robots has also increased rapidly in domestic scenarios. The Q-learning algorithm aims to maximise the reward for reaching the goal. This paper proposes a modified version of the Q-learning algorithm, known as Q-learning with scaled distance metric (Q-SD). This algorithm enhances task learning and makes task completion more meaningful. A robotic manipulator (agent) applies the Q-SD algorithm to the task of table cleaning. Using Q-SD, the agent acquires the sequence of steps necessary to accomplish the task while minimising the manipulator's movement distance. We partition the table into grids of different dimensions. The first has a grid count of 3 × 3, and the second has a grid count of 4 × 4. Using the Q-SD algorithm, the maximum success obtained in these two environments was 86% and 59%, respectively. Moreover, compared to the conventional Q-learning algorithm, the drop in average distance moved by the agent in these two environments using the Q-SD algorithm was 8.61% and 6.7%, respectively.
comment: 9 pages, 9 figures, 7 tables
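For orientation, the sketch below shows the standard tabular Q-learning update with one hypothetical way of folding a movement-distance term into the reward. The abstract does not specify how Q-SD scales or applies the distance metric, so the `dist_scale` penalty, the function name, and the state and action labels are assumptions for illustration only, not the paper's algorithm.

```python
def q_update(Q, s, a, reward, distance, s_next, actions,
             alpha=0.1, gamma=0.9, dist_scale=0.5):
    """One tabular Q-learning step with a distance-penalised reward.

    Q is a dict keyed by (state, action); unseen entries default to 0.
    """
    # Assumed shaping: subtract a scaled movement distance from the task reward.
    shaped = reward - dist_scale * distance
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    old = Q.get((s, a), 0.0)
    # Standard temporal-difference update toward the bootstrapped target.
    Q[(s, a)] = old + alpha * (shaped + gamma * best_next - old)
    return Q[(s, a)]

# Hypothetical grid cells as states and cleaning moves as actions.
Q = {}
actions = ["left", "right"]
value = q_update(Q, s="cell_0", a="right", reward=1.0, distance=1.0,
                 s_next="cell_1", actions=actions)
```

With an empty table, the first update evaluates to 0.1 * (1.0 - 0.5 * 1.0) = 0.05, so longer moves earn a smaller increment than shorter moves with the same task reward.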
Robust Vehicle Localization and Tracking in Rain using Street Maps
GPS-based vehicle localization and tracking suffers from unstable positional information commonly experienced in tunnel segments and in dense urban areas. Also, both Visual Odometry (VO) and Visual Inertial Odometry (VIO) are susceptible to adverse weather conditions that cause occlusions or blur on the visual input. In this paper, we propose a novel approach for vehicle localization that uses street-network-based map information to correct drifting odometry estimates and intermittent GPS measurements, especially in adverse scenarios such as driving in rain and tunnels. Specifically, our approach is a flexible fusion algorithm that integrates intermittent GPS, drifting IMU and VO estimates together with 2D map information for robust vehicle localization and tracking. We refer to our approach as Map-Fusion. We robustly evaluate our proposed approach on four geographically diverse datasets from different countries ranging across clear and rainy weather conditions. These datasets also include challenging visual segments in tunnels and underpasses. We show that with the integration of the map information, our Map-Fusion algorithm reduces the error of the state-of-the-art VO and VIO approaches across all datasets. We also validate our proposed algorithm in a real-world environment and in real time on a hardware-constrained mobile robot. Map-Fusion achieved 2.46m error in clear weather and 6.05m error in rainy weather for a 150m route.
Upgrading Pepper Robot's Social Interaction with Advanced Hardware and Perception Enhancements
In this paper, we propose hardware and software enhancements for the Pepper robot to improve its human-robot interaction capabilities. This includes the integration of an NVIDIA Jetson GPU to enhance computational capabilities and execute real-time algorithms, and a RealSense D435i camera to capture depth images, as well as the computer vision algorithms to detect and localize the humans around the robot and estimate their body orientation and gaze direction. The new stack is implemented on ROS and runs on the extended Pepper hardware, and the communication with the robot's firmware is done through the NAOqi ROS driver API. We have also collected a MoCap dataset of human activities in a controlled environment, together with the corresponding RGB-D data, to validate the proposed perception algorithms.
Kalman Filtering for Precise Indoor Position and Orientation Estimation Using IMU and Acoustics on Riemannian Manifolds
Indoor tracking and pose estimation, i.e., determining the position and orientation of a moving target, are increasingly important due to their numerous applications. While Inertial Navigation Systems (INS) provide high update rates, their positioning errors can accumulate rapidly over time. To mitigate this, it is common to integrate INS with complementary systems to correct drift and improve accuracy. This paper presents a novel approach that combines INS with an acoustic Riemannian-based localization system to enhance indoor positioning and orientation tracking. The proposed method employs both the Extended Kalman Filter (EKF) and the Unscented Kalman Filter (UKF) for fusing data from the two systems. The Riemannian-based localization system delivers high-accuracy estimates of the target's position and orientation, which are then used to correct the INS data. A new projection algorithm is introduced to map the EKF or UKF output onto the Riemannian manifold, further improving estimation accuracy. Our results show that the proposed methods significantly outperform benchmark algorithms in both position and orientation estimation. The effectiveness of the proposed methods was evaluated through extensive numerical simulations and testing using our in-house experimental setup. These evaluations confirm the superior performance of our approach in practical scenarios.
MFCalib: Single-shot and Automatic Extrinsic Calibration for LiDAR and Camera in Targetless Environments Based on Multi-Feature Edge IROS2024
This paper presents MFCalib, an innovative extrinsic calibration technique for LiDAR and RGB camera that operates automatically in targetless environments with a single data capture. At the heart of this method is the use of a rich set of edge information, which significantly enhances calibration accuracy and robustness. Specifically, we extract both depth-continuous and depth-discontinuous edges, along with intensity-discontinuous edges on planes. This comprehensive edge extraction strategy ensures our ability to achieve accurate calibration with just one round of data collection, even in complex and varied settings. Addressing the uncertainty of depth-discontinuous edges, we delve into the physical measurement principles of LiDAR and develop a beam model, effectively mitigating the issue of edge inflation caused by the LiDAR beam. Extensive experimental results demonstrate that MFCalib outperforms the state-of-the-art targetless calibration methods across various scenes, achieving and often surpassing the precision of multi-scene calibrations in a single-shot collection. To support community development, we make our code available open-source on GitHub.
comment: 8 pages, 10 figures, accepted by IROS2024
Semantically Controllable Augmentations for Generalizable Robot Learning
Generalization to unseen real-world scenarios for robot manipulation requires exposure to diverse datasets during training. However, collecting large real-world datasets is intractable due to high operational costs. For robot learning to generalize despite these challenges, it is essential to leverage sources of data or priors beyond the robot's direct experience. In this work, we posit that image-text generative models, which are pre-trained on large corpora of web-scraped data, can serve as such a data source. These generative models encompass a broad range of real-world scenarios beyond a robot's direct experience and can synthesize novel synthetic experiences that expose robotic agents to additional world priors aiding real-world generalization at no extra cost. In particular, our approach leverages pre-trained generative models as an effective tool for data augmentation. We propose a generative augmentation framework for semantically controllable augmentations and rapidly multiplying robot datasets while inducing rich variations that enable real-world generalization. Based on diverse augmentations of robot data, we show how scalable robot manipulation policies can be trained and deployed both in simulation and in unseen real-world environments such as kitchens and table-tops. By demonstrating the effectiveness of image-text generative models in diverse real-world robotic applications, our generative augmentation framework provides a scalable and efficient path for boosting generalization in robot learning at no extra human cost.
comment: Accepted for publication by IJRR. First 3 authors contributed equally. Last 3 authors advised equally
Development of Occupancy Prediction Algorithm for Underground Parking Lots
The core objective of this study is to address the perception challenges faced by autonomous driving in adverse environments like basements. Initially, this paper commences with data collection in an underground garage. A simulated underground garage model is established within the CARLA simulation environment, and SemanticKITTI format occupancy ground truth data is collected in this simulated setting. Subsequently, the study integrates a Transformer-based Occupancy Network model to complete the occupancy grid prediction task within this scenario. A comprehensive BEV perception framework is designed to enhance the accuracy of neural network models in dimly lit, challenging autonomous driving environments. Finally, experiments validate the accuracy of the proposed solution's perception performance in basement scenarios. The proposed solution is tested on our self-constructed underground garage dataset, SUSTech-COE-ParkingLot, yielding satisfactory results.
Whole-Body Control Through Narrow Gaps From Pixels To Action
Flying through body-size narrow gaps in the environment is one of the most challenging moments for an underactuated multirotor. We explore a purely data-driven method to master this flight skill in simulation, where a neural network directly maps pixels and proprioception to continuous low-level control commands. This learned policy enables whole-body control through gaps with different geometries demanding sharp attitude changes (e.g., near-vertical roll angle). The policy is achieved by successive model-free reinforcement learning (RL) and online observation space distillation. The RL policy receives (virtual) point clouds of the gaps' edges for scalable simulation and is then distilled into the high-dimensional pixel space. However, this flight skill is fundamentally expensive to learn by exploring due to restricted feasible solution space. We propose to reset the agent as states on the trajectories by a model-based trajectory optimizer to alleviate this problem. The presented training pipeline is compared with baseline methods, and ablation studies are conducted to identify the key ingredients of our method. The immediate next step is to scale up the variation of gap sizes and geometries in anticipation of emergent policies and demonstrate the sim-to-real transformation.
comment: 9 pages, 8 figures, 2 tables
PhysORD: A Neuro-Symbolic Approach for Physics-infused Motion Prediction in Off-road Driving
Motion prediction is critical for autonomous off-road driving; however, it presents significantly more challenges than on-road driving because of the complex interaction between the vehicle and the terrain. Traditional physics-based approaches encounter difficulties in accurately modeling dynamic systems and external disturbance. In contrast, data-driven neural networks require extensive datasets and struggle with explicitly capturing the fundamental physical laws, which can easily lead to poor generalization. By merging the advantages of both methods, neuro-symbolic approaches present a promising direction. These methods embed physical laws into neural models, potentially significantly improving generalization capabilities. However, no prior works were evaluated in real-world settings for off-road driving. To bridge this gap, we present PhysORD, a neural-symbolic approach integrating the conservation law, i.e., the Euler-Lagrange equation, into data-driven neural models for motion prediction in off-road driving. Our experiments showed that PhysORD can accurately predict vehicle motion and tolerate external disturbance by modeling uncertainties. It outperforms existing methods both in accuracy and efficiency and demonstrates data-efficient learning and generalization ability in long-term prediction.
PEACE: Prompt Engineering Automation for CLIPSeg Enhancement in Aerial Robotics
Safe landing is an essential aspect of flight operations in fields ranging from industrial to space robotics. With the growing interest in artificial intelligence, we focus on learning-based methods for safe landing. Our previous work, Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI), demonstrated the feasibility of using prompt-based segmentation for identifying safe landing zones with open vocabulary models. However, relying on a heuristic selection of words for prompts is not reliable, as it cannot adapt to changing environments, potentially leading to harmful outcomes if the observed environment is not accurately represented by the chosen prompt. To address this issue, we introduce PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), an enhancement to DOVESEI that automates prompt engineering to adapt to shifts in data distribution. PEACE can perform safe landings using only monocular cameras and image segmentation. PEACE shows significant improvements in prompt generation and engineering for aerial images compared to standard prompts used for CLIP and CLIPSeg. By combining DOVESEI and PEACE, our system improved the success rate of safe landing zone selection by at least 30% in both simulations and indoor experiments.
comment: arXiv admin note: text overlap with arXiv:2308.11471
Fast and Certifiable Trajectory Optimization
We propose semidefinite trajectory optimization (STROM), a framework that computes fast and certifiably optimal solutions for nonconvex trajectory optimization problems defined by polynomial objectives and constraints. STROM employs sparse second-order Lasserre's hierarchy to generate semidefinite program (SDP) relaxations of trajectory optimization. Different from existing tools (e.g., YALMIP and SOSTOOLS in Matlab), STROM generates chain-like multiple-block SDPs with only positive semidefinite (PSD) variables. Moreover, STROM does so two orders of magnitude faster. Underpinning STROM is cuADMM, the first ADMM-based SDP solver implemented in CUDA and running on GPUs (with C/C++ extension). cuADMM builds upon the symmetric Gauss-Seidel ADMM algorithm and leverages GPU parallelization to speed up solving sparse linear systems and projecting onto PSD cones. In five trajectory optimization problems (inverted pendulum, cart-pole, vehicle landing, flying robot, and car back-in), cuADMM computes optimal trajectories (with certified suboptimality below 1%) in minutes (when other solvers take hours or run out of memory) and seconds (when others take minutes). Further, when warmstarted by data-driven initialization in the inverted pendulum problem, cuADMM delivers real-time performance: providing certifiably optimal trajectories in 0.66 seconds even though the SDP has 49,500 variables and 47,351 constraints.
Multi-Visual-Inertial System: Analysis, Calibration and Estimation
In this paper, we study state estimation of multi-visual-inertial systems (MVIS) and develop sensor fusion algorithms to optimally fuse an arbitrary number of asynchronous inertial measurement units (IMUs) or gyroscopes and global- and/or rolling-shutter cameras. We are especially interested in the full calibration of the associated visual-inertial sensors, including the IMU or camera intrinsics and the IMU-IMU (or camera) spatiotemporal extrinsics as well as the image readout time of rolling-shutter cameras (if used). To this end, we develop a new analytic combined IMU integration with intrinsics, termed ACI3, to preintegrate IMU measurements, which is leveraged to fuse auxiliary IMUs and/or gyroscopes alongside a base IMU. We model the multi-inertial measurements to include all the necessary inertial intrinsic and IMU-IMU spatiotemporal extrinsic parameters, while leveraging IMU-IMU rigid-body constraints to eliminate the necessity of auxiliary inertial poses, thus reducing computational complexity. By performing observability analysis of MVIS, we prove that the standard four unobservable directions remain no matter how many inertial sensors are used, and also identify, for the first time, degenerate motions for IMU-IMU spatiotemporal extrinsics and auxiliary inertial intrinsics. In addition to the extensive simulations that validate our analysis and algorithms, we have built our own MVIS sensor rig and collected over 25 real-world datasets to experimentally verify the proposed calibration against state-of-the-art calibration methods such as Kalibr. We show that the proposed MVIS calibration is able to achieve competitive accuracy with improved convergence and repeatability, and is open sourced to better benefit the community.
Active Semantic Mapping and Pose Graph Spectral Analysis for Robot Exploration
Exploration in unknown and unstructured environments is a pivotal requirement for robotic applications. A robot's exploration behavior can be inherently affected by the performance of its Simultaneous Localization and Mapping (SLAM) subsystem, although SLAM and exploration are generally studied separately. In this paper, we formulate exploration as an active mapping problem and extend it with semantic information. We introduce a novel active metric-semantic SLAM approach, leveraging recent research advances in information theory and spectral graph theory: we combine semantic mutual information and the connectivity metrics of the underlying pose graph of the SLAM subsystem. We use the resulting utility function to evaluate different trajectories to select the most favorable strategy during exploration. Exploration and SLAM metrics are analyzed in experiments. Running our algorithm on the Habitat dataset, we show that, while maintaining efficiency close to the state-of-the-art exploration methods, our approach effectively increases the performance of metric-semantic SLAM with a 21% reduction in average map error and a 9% improvement in average semantic classification accuracy.
comment: 8 pages, 5 figures
Globally Stable Neural Imitation Policies
Imitation learning presents an effective approach to alleviate the resource-intensive and time-consuming nature of policy learning from scratch in the solution space. Even though the resulting policy can mimic expert demonstrations reliably, it often lacks predictability in unexplored regions of the state-space, giving rise to significant safety concerns in the face of perturbations. To address these challenges, we introduce the Stable Neural Dynamical System (SNDS), an imitation learning regime which produces a policy with formal stability guarantees. We deploy a neural policy architecture that facilitates the representation of stability based on Lyapunov theorem, and jointly train the policy and its corresponding Lyapunov candidate to ensure global stability. We validate our approach by conducting extensive experiments in simulation and successfully deploying the trained policies on a real-world manipulator arm. The experimental results demonstrate that our method overcomes the instability, accuracy, and computational intensity problems associated with previous imitation learning methods, making our method a promising solution for stable policy learning in complex planning scenarios.
Driving from Vision through Differentiable Optimal Control IROS 2024
This paper proposes DriViDOC: a framework for Driving from Vision through Differentiable Optimal Control, and its application to learn autonomous driving controllers from human demonstrations. DriViDOC combines the automatic inference of relevant features from camera frames with the properties of nonlinear model predictive control (NMPC), such as constraint satisfaction. Our approach leverages the differentiability of parametric NMPC, allowing for end-to-end learning of the driving model from images to control. The model is trained on an offline dataset comprising various human demonstrations collected on a motion-base driving simulator. During online testing, the model demonstrates successful imitation of different driving styles, and the interpreted NMPC parameters provide insights into the achievement of specific driving behaviors. Our experimental results show that DriViDOC outperforms other methods involving NMPC and neural networks, exhibiting an average improvement of 20% in imitation scores.
comment: This work has been accepted for publication in the Proceedings of the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024). Accompanying video available at: https://youtu.be/ENHhphpbPLs
A Soft Robotic System Automatically Learns Precise Agile Motions Without Model Information IROS 2024
Many application domains, e.g., in medicine and manufacturing, can greatly benefit from pneumatic Soft Robots (SRs). However, the accurate control of SRs has remained a significant challenge to date, mainly due to their nonlinear dynamics and viscoelastic material properties. Conventional control design methods often rely on either complex system modeling or time-intensive manual tuning, both of which require significant amounts of human expertise and thus limit their practicality. In recent works, the data-driven method, Automatic Neural ODE Control (ANODEC) has been successfully used to -- fully automatically and utilizing only input-output data -- design controllers for various nonlinear systems in silico, and without requiring prior model knowledge or extensive manual tuning. In this work, we successfully apply ANODEC to automatically learn to perform agile, non-repetitive reference tracking motion tasks in a real-world SR and within a finite time horizon. To the best of the authors' knowledge, ANODEC achieves, for the first time, performant control of a SR with hysteresis effects from only 30 seconds of input-output data and without any prior model knowledge. We show that for multiple, qualitatively different and even out-of-training-distribution reference signals, a single feedback controller designed by ANODEC outperforms a manually tuned PID baseline consistently. Overall, this contribution not only further strengthens the validity of ANODEC, but it marks an important step towards more practical, easy-to-use SRs that can automatically learn to perform agile motions from minimal experimental interaction time.
comment: Submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Augmented Reality without Borders: Achieving Precise Localization Without Maps
Visual localization is crucial for Computer Vision and Augmented Reality (AR) applications, where determining the camera or device's position and orientation is essential to accurately interact with the physical environment. Traditional methods rely on detailed 3D maps constructed using Structure from Motion (SfM) or Simultaneous Localization and Mapping (SLAM), which is computationally expensive and impractical for dynamic or large-scale environments. We introduce MARLOC, a novel localization framework for AR applications that uses known relative transformations within image sequences to perform intra-sequence triangulation, generating 3D-2D correspondences for pose estimation and refinement. MARLOC eliminates the need for pre-built SfM maps, providing accurate and efficient localization suitable for dynamic outdoor environments. Evaluation with benchmark datasets and real-world experiments demonstrates MARLOC's state-of-the-art performance and robustness. By integrating MARLOC into an AR device, we highlight its capability to achieve precise localization in real-world outdoor scenarios, showcasing its practical effectiveness and potential to enhance visual localization in AR applications.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
DecAP: Decaying Action Priors for Accelerated Imitation Learning of Torque-Based Legged Locomotion Policies
Optimal Control for legged robots has gone through a paradigm shift from position-based to torque-based control, owing to the latter's compliant and robust nature. In parallel to this shift, the community has also turned to Deep Reinforcement Learning (DRL) as a promising approach to directly learn locomotion policies for complex real-life tasks. However, most end-to-end DRL approaches still operate in position space, mainly because learning in torque space is often sample-inefficient and does not consistently converge to natural gaits. To address these challenges, we propose a two-stage framework. In the first stage, we generate our own imitation data by training a position-based policy, eliminating the need for expert knowledge to design optimal controllers. The second stage incorporates decaying action priors, a novel method to enhance the exploration of torque-based policies aided by imitation rewards. We show that our approach consistently outperforms imitation learning alone and is robust to scaling these rewards from 0.1x to 10x. We further validate the benefits of torque control by comparing the robustness of a position-based policy to a position-assisted torque-based policy on a quadruped (Unitree Go1) without any domain randomization in the form of external disturbances during training.
comment: © 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Exploring and Learning Structure: Active Inference Approach in Navigational Agents
Drawing inspiration from animal navigation strategies, we introduce a novel computational model for navigation and mapping, rooted in biologically inspired principles. Animals exhibit remarkable navigation abilities by efficiently using memory, imagination, and strategic decision-making to navigate complex and aliased environments. Building on these insights, we integrate traditional cognitive mapping approaches with an Active Inference Framework (AIF) to learn an environment structure in a few steps. Through the incorporation of topological mapping for long-term memory and AIF for navigation planning and structure learning, our model can dynamically apprehend environmental structures and expand its internal map with predicted beliefs during exploration. Comparative experiments with the Clone-Structured Cognitive Graph (CSCG) model highlight our model's ability to rapidly learn environmental structures in a single episode, with minimal navigation overlap. This is achieved without prior knowledge of the dimensions of the environment or the type of observations, showcasing its robustness and effectiveness in navigating ambiguous environments.
comment: IWAI workshop 2024
Neuromorphic force-control in an industrial task: validating energy and latency benefits IROS 2024
As robots become smarter and more ubiquitous, optimizing the power consumption of intelligent compute becomes imperative for ensuring the sustainability of technological advancements. Neuromorphic computing hardware makes use of biologically inspired neural architectures to achieve energy and latency improvements compared to conventional von Neumann computing architecture. Applying these benefits to robots has been demonstrated in several works in the field of neurorobotics, typically on relatively simple control tasks. Here, we introduce an example of neuromorphic computing applied to the real-world industrial task of object insertion. We trained a spiking neural network (SNN) to perform force-torque feedback control using a reinforcement learning approach in simulation. We then ported the SNN to the Intel neuromorphic research chip Loihi interfaced with a KUKA robotic arm. At inference time we show latency competitive with current CPU/GPU architectures, and one order of magnitude less energy usage in comparison to state-of-the-art low-energy edge-hardware. We offer this example as a proof-of-concept implementation of a neuromorphic controller in a real-world robotic setting, highlighting the benefits of neuromorphic hardware for the development of intelligent controllers for robots.
comment: Accepted at IROS 2024
Toward Globally Optimal State Estimation Using Automatically Tightened Semidefinite Relaxations
In recent years, semidefinite relaxations of common optimization problems in robotics have attracted growing attention due to their ability to provide globally optimal solutions. In many cases, it was shown that specific handcrafted redundant constraints are required to obtain tight relaxations and thus global optimality. These constraints are formulation-dependent and typically identified through a lengthy manual process. Instead, the present paper suggests an automatic method to find a set of sufficient redundant constraints to obtain tightness, if they exist. We first propose an efficient feasibility check to determine if a given set of variables can lead to a tight formulation. Secondly, we show how to scale the method to problems of bigger size. At no point of the process do we have to find redundant constraints manually. We showcase the effectiveness of the approach, in simulation and on real datasets, for range-based localization and stereo-based pose estimation. Finally, we reproduce semidefinite relaxations presented in recent literature and show that our automatic method always finds a smaller set of constraints sufficient for tightness than previously considered.
comment: 20 pages, 22 figures. Version history: v4 (conditionally accepted version T-RO), v3 (revised version), v2 (submitted version), v1 (initial version)
Safe and Smooth: Certified Continuous-Time Range-Only Localization
A common approach to localize a mobile robot is by measuring distances to points of known positions, called anchors. Locating a device from distance measurements is typically posed as a non-convex optimization problem, stemming from the nonlinearity of the measurement model. Non-convex optimization problems may yield suboptimal solutions when local iterative solvers such as Gauss-Newton are employed. In this paper, we design an optimality certificate for continuous-time range-only localization. Our formulation allows for the integration of a motion prior, which ensures smoothness of the solution and is crucial for localizing from only a few distance measurements. The proposed certificate comes at little additional cost since it has the same complexity as the sparse local solver itself: linear in the number of positions. We show, both in simulation and on real-world datasets, that the efficient local solver often finds the globally optimal solution (confirmed by our certificate), but it may converge to local solutions with high errors, which our certificate correctly detects.
comment: 10 pages, 7 figures, accepted to IEEE Robotics and Automation Letters (this arXiv version contains supplementary appendix). Version info: v4 (add publication header, change min to argmin in (2) and (16)), v3 (revised version), v2 (submitted version), v1 (initial version)
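For readers unfamiliar with the local solver mentioned in the abstract above, the following minimal sketch runs Gauss-Newton on a static 2D range-only problem. It illustrates only the kind of local iterative solver whose output a certificate would check; it is not the paper's continuous-time formulation or its certificate, and the function name, anchor layout, and starting point are assumptions chosen for the example.

```python
import math

def gauss_newton_range_only(anchors, ranges, x0, iterations=20):
    """Estimate a 2D position from distances to known anchors.

    Minimizes sum_i (||p - a_i|| - d_i)^2 with plain Gauss-Newton steps.
    Illustrative sketch: no damping, no motion prior, single static position.
    """
    x, y = x0
    for _ in range(iterations):
        # Accumulate the normal equations (J^T J) s = -J^T r for the 2x2 case.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for (ax, ay), d in zip(anchors, ranges):
            dx, dy = x - ax, y - ay
            dist = math.hypot(dx, dy)
            if dist < 1e-12:
                continue  # Jacobian undefined exactly at an anchor; skip it
            r = dist - d                    # range residual
            jx, jy = dx / dist, dy / dist   # gradient of the distance
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            g1 += jx * r
            g2 += jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            break  # degenerate geometry, e.g. collinear bearing directions
        # Closed-form inverse of the symmetric 2x2 matrix.
        x += -(a22 * g1 - a12 * g2) / det
        y += -(a11 * g2 - a12 * g1) / det
    return x, y

# Hypothetical noise-free setup: three anchors, true position (1, 1).
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
truth = (1.0, 1.0)
ranges = [math.hypot(truth[0] - ax, truth[1] - ay) for ax, ay in anchors]
estimate = gauss_newton_range_only(anchors, ranges, x0=(2.0, 2.0))
```

With a good initial guess the iteration converges to the true position here; from a poor starting point the same loop can settle in a local minimum, which is the failure mode an optimality certificate is meant to detect.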
DarkGS: Learning Neural Illumination and 3D Gaussians Relighting for Robotic Exploration in the Dark
Humans have the remarkable ability to construct consistent mental models of an environment, even under limited or varying levels of illumination. We wish to endow robots with this same capability. In this paper, we tackle the challenge of constructing a photorealistic scene representation under poorly illuminated conditions and with a moving light source. We approach the task of modeling illumination as a learning problem, and utilize the developed illumination model to aid in scene reconstruction. We introduce an innovative framework that uses a data-driven approach, Neural Light Simulators (NeLiS), to model and calibrate the camera-light system. Furthermore, we present DarkGS, a method that applies NeLiS to create a relightable 3D Gaussian scene model capable of real-time, photorealistic rendering from novel viewpoints. We show the applicability and robustness of our proposed simulator and system in a variety of real-world environments.
comment: 8 pages, 10 figures
Multiagent Systems
Performance-Aware Self-Configurable Multi-Agent Networks: A Distributed Submodular Approach for Simultaneous Coordination and Network Design
We introduce the first, to our knowledge, rigorous approach that enables multi-agent networks to self-configure their communication topology to balance the trade-off between scalability and optimality during multi-agent planning. We are motivated by the future of ubiquitous collaborative autonomy where numerous distributed agents will be coordinating via agent-to-agent communication to execute complex tasks such as traffic monitoring, event detection, and environmental exploration. But the explosion of information in such large-scale networks currently curtails their deployment due to impractical decision times induced by the computational and communication requirements of the existing near-optimal coordination algorithms. To overcome this challenge, we present the AlterNAting COordination and Network-Design Algorithm (Anaconda), a scalable algorithm that also enjoys near-optimality guarantees. Subject to the agents' bandwidth constraints, Anaconda enables the agents to optimize their local communication neighborhoods such that the action-coordination approximation performance of the network is maximized. Compared to the state of the art, Anaconda is an anytime self-configurable algorithm that quantifies its suboptimality guarantee for any type of network, from fully disconnected to fully centralized, and that, for sparse networks, is one order faster in terms of decision speed. To develop the algorithm, we quantify the suboptimality cost due to decentralization, i.e., due to communication-minimal distributed coordination. We also employ tools inspired by the literature on multi-armed bandits and submodular maximization subject to cardinality constraints. We demonstrate Anaconda in simulated scenarios of area monitoring and compare it with a state-of-the-art algorithm.
comment: Accepted to CDC 2024
On Mechanism Underlying Algorithmic Collusion
Two issues of algorithmic collusion are addressed in this paper. First, we show that in a general class of symmetric games, including Prisoner's Dilemma, Bertrand competition, and any (nonlinear) mixture of first and second price auction, only (strict) Nash Equilibrium (NE) is stochastically stable. Therefore, tacit collusion is driven by failure to learn NE due to insufficient learning, instead of learning some strategies to sustain collusive outcomes. Second, we study how algorithms adapt to collusion in real simulations with insufficient learning. Extensive explorations in early stages and discount factors inflate the Q-value, which interrupts the sequential and alternating price undercutting and leads to bilateral rebound. The process is iterated, making the price curves resemble Edgeworth cycles. When both exploration rate and Q-value decrease, algorithms may bilaterally rebound to a relatively high common price level by coincidence, and then get stuck. Finally, we accommodate our reasoning to simulation outcomes in the literature, including optimistic initialization, market design and algorithm design.
Evolution of Social Norms in LLM Agents using Natural Language
Recent advancements in Large Language Models (LLMs) have spurred a surge of interest in leveraging these models for game-theoretical simulations, where LLMs act as individual agents engaging in social interactions. This study explores the potential for LLM agents to spontaneously generate and adhere to normative strategies through natural language discourse, building upon the foundational work of Axelrod's metanorm games. Our experiments demonstrate that through dialogue, LLM agents can form complex social norms, such as metanorms (norms enforcing the punishment of those who do not punish cheating), purely through natural language interaction. The results affirm the effectiveness of using LLM agents for simulating social interactions and understanding the emergence and evolution of complex strategies and norms through natural language. Future work may extend these findings by incorporating a wider range of scenarios and agent characteristics, aiming to uncover more nuanced mechanisms behind social norm formation.
comment: 5 pages, 8 figures
United We Stand: Decentralized Multi-Agent Planning With Attrition ECAI 2024
Decentralized planning is a key element of cooperative multi-agent systems for information gathering tasks. However, despite the high frequency of agent failures in realistic large deployment scenarios, current approaches perform poorly in the presence of failures, by not converging at all, and/or by making very inefficient use of resources (e.g. energy). In this work, we propose Attritable MCTS (A-MCTS), a decentralized MCTS algorithm capable of timely and efficient adaptation to changes in the set of active agents. It is based on the use of a global reward function for the estimation of each agent's local contribution, and regret matching for coordination. We evaluate its effectiveness in realistic data-harvesting problems under different scenarios. We show both theoretically and experimentally that A-MCTS enables efficient adaptation even under high failure rates. Results suggest that, in the presence of frequent failures, our solution improves substantially over the best existing approaches in terms of global utility and scalability.
comment: To appear in ECAI 2024
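The abstract above names regret matching as the coordination mechanism. As a rough illustration only, the generic regret-matching rule can be sketched as follows; this is the textbook rule, not the A-MCTS implementation, and the function name and input layout are assumptions made for this example.

```python
def regret_matching_strategy(cumulative_regrets):
    """Return a mixed strategy proportional to the positive cumulative regrets.

    Hypothetical helper: `cumulative_regrets` holds one number per action.
    If no action has positive regret, fall back to the uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    n = len(cumulative_regrets)
    if total <= 0.0:
        return [1.0 / n] * n
    return [p / total for p in positive]
```

Actions with negative or zero regret receive zero probability, so play shifts toward the actions an agent regrets not having taken.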
Systems and Control (CS)
Terminal Soft Landing Guidance Law Using Analytic Gravity Turn Trajectory
This paper presents an innovative terminal landing guidance law that utilizes an analytic solution derived from the gravity turn trajectory. The characteristics of the derived solution are thoroughly investigated, and the solution is employed to generate a reference velocity vector that satisfies terminal landing conditions. A nonlinear control law is applied to effectively track the reference velocity vector within a finite time, and its robustness against disturbances is studied. Furthermore, the guidance law is expanded to incorporate ground collision avoidance by considering the shape of the gravity turn trajectory. The proposed method's fuel efficiency, robustness, and practicality are demonstrated through comprehensive numerical simulations, and its performance is compared with existing methods.
Time-Varying Soft-Maximum Barrier Functions for Safety in Unmapped and Dynamic Environments
We present a closed-form optimal feedback control method that ensures safety in an a priori unknown and potentially dynamic environment. This article considers the scenario where local perception data (e.g., LiDAR) is obtained periodically, and this data can be used to construct a local control barrier function (CBF) that models a local set that is safe for a period of time into the future. Then, we use a smooth time-varying soft-maximum function to compose the N most recently obtained local CBFs into a single barrier function that models an approximate union of the N most recently obtained local sets. This composite barrier function is used in a constrained quadratic optimization, which is solved in closed form to obtain a safe-and-optimal feedback control. We also apply the time-varying soft-maximum barrier function control to 2 robotic systems (nonholonomic ground robot with nonnegligible inertia, and quadrotor robot), where the objective is to navigate an a priori unknown environment safely and reach a target destination. In these applications, we present a simple approach to generate local CBFs from periodically obtained perception data.
comment: Preprint submitted to IEEE Transactions on Control Systems Technology (TCST)
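To make the composition step in the abstract above more concrete, here is a minimal sketch of one common log-sum-exp soft maximum. It assumes the shifted form (1/kappa) * log(sum_i exp(kappa * h_i)) - log(N)/kappa; the time-varying weighting described in the abstract is not reproduced, and the function name and the parameter `kappa` are assumptions made for this example.

```python
import math

def soft_maximum(values, kappa=10.0):
    """Smooth under-approximation of max(values) via a shifted log-sum-exp.

    Assumed form: (1/kappa) * log(sum(exp(kappa * v))) - log(N) / kappa.
    The largest value is factored out first for numerical stability.
    """
    m = max(values)
    n = len(values)
    log_sum = math.log(sum(math.exp(kappa * (v - m)) for v in values))
    return m + (log_sum - math.log(n)) / kappa
```

With this shift the result never exceeds the true maximum and is at most log(N)/kappa below it, so the zero-superlevel set of the composite function stays inside the union of the individual sets and approaches it as `kappa` grows.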
Performance-Aware Self-Configurable Multi-Agent Networks: A Distributed Submodular Approach for Simultaneous Coordination and Network Design
We introduce the first, to our knowledge, rigorous approach that enables multi-agent networks to self-configure their communication topology to balance the trade-off between scalability and optimality during multi-agent planning. We are motivated by the future of ubiquitous collaborative autonomy where numerous distributed agents will be coordinating via agent-to-agent communication to execute complex tasks such as traffic monitoring, event detection, and environmental exploration. But the explosion of information in such large-scale networks currently curtails their deployment due to impractical decision times induced by the computational and communication requirements of the existing near-optimal coordination algorithms. To overcome this challenge, we present the AlterNAting COordination and Network-Design Algorithm (Anaconda), a scalable algorithm that also enjoys near-optimality guarantees. Subject to the agents' bandwidth constraints, Anaconda enables the agents to optimize their local communication neighborhoods such that the action-coordination approximation performance of the network is maximized. Compared to the state of the art, Anaconda is an anytime self-configurable algorithm that quantifies its suboptimality guarantee for any type of network, from fully disconnected to fully centralized, and that, for sparse networks, is one order faster in terms of decision speed. To develop the algorithm, we quantify the suboptimality cost due to decentralization, i.e., due to communication-minimal distributed coordination. We also employ tools inspired by the literature on multi-armed bandits and submodular maximization subject to cardinality constraints. We demonstrate Anaconda in simulated scenarios of area monitoring and compare it with a state-of-the-art algorithm.
comment: Accepted to CDC 2024
An Investigation of Denial of Service Attacks on Autonomous Driving Software and Hardware in Operation
This research investigates the impact of Denial of Service (DoS) attacks, specifically Internet Control Message Protocol (ICMP) flood attacks, on Autonomous Driving (AD) systems, focusing on their control modules. Two experimental setups were created: the first involved an ICMP flood attack on a Raspberry Pi running an AD software stack, and the second examined the effects of single and double ICMP flood attacks on a Global Navigation Satellite System Real-Time Kinematic (GNSS-RTK) device for high-accuracy localization of an autonomous vehicle that is available on the market. The results indicate a moderate impact of DoS attacks on the AD stack, where the increase in median computation time was marginal, suggesting a degree of resilience to these types of attacks. In contrast, the GNSS device demonstrated significant vulnerability: during DoS attacks, the sample rate dropped drastically to approximately 50% and 5% of the nominal rate for single and double attacker configurations, respectively. Additionally, the longest observed time increments were in the range of seconds during the attacks. These results underscore the vulnerability of AD systems to DoS attacks and the critical need for robust cybersecurity measures. This work provides valuable insights into the design requirements of AD software stacks and highlights that external hardware and modules can be significant attack surfaces.
Analyzing electric vehicle, load and photovoltaic generation uncertainty using publicly available datasets
This paper aims to analyze three publicly available datasets for quantifying seasonal and annual uncertainty for efficient scenario creation. The datasets from Elaad, Elia and Fluvius are utilized to statistically analyze electric vehicle charging, normalized solar generation and low-voltage consumer load profiles, respectively. Frameworks for scenario generation are also provided for these datasets. The datasets for load profiles and solar generation analyzed are for the year 2022, thus embedding seasonal information. An online repository is created for the wider applicability of this work. Finally, the extreme load week(s) are identified and linked to the weather data measured at EnergyVille in Belgium.
Adaptive Artificial Time Delay Control for Robotic Systems
The artificial time delay controller was conceptualised for nonlinear systems to reduce dependency on precise system modelling, unlike conventional adaptive and robust control strategies. In this approach, unknown dynamics are compensated by using input and state measurements collected at the immediate past time instant (i.e., artificially delayed). The advantage of this kind of approach lies in its simplicity and ease of implementation. However, applications of artificial time delay controllers in robotics that are also robust against unknown state-dependent uncertainty are still largely missing. This thesis presents the study of this control approach toward two important classes of robotic systems, namely a fully actuated bipedal walking robot and an underactuated quadrotor system. In the first work, we explore the idea of a unified control design instead of multiple controllers for different walking phases in adaptive bipedal walking control while bypassing computing constraint forces, since they often lead to complex designs. The second work focuses on quadrotors employed for applications such as payload delivery, inspection and search-and-rescue. The effectiveness of this controller is validated using experimental results.
Sample Complexity of the Sign-Perturbed Sums Method
We study the sample complexity of the Sign-Perturbed Sums (SPS) method, which constructs exact, non-asymptotic confidence regions for the true system parameters under mild statistical assumptions, such as independent and symmetric noise terms. The standard version of SPS deals with linear regression problems; however, it can be generalized to stochastic linear (dynamical) systems, even with closed-loop setups, and to nonlinear and nonparametric problems as well. Although the strong consistency of the method has been rigorously proven, the sample complexity of the algorithm has so far been analyzed only for scalar linear regression problems. In this paper we study the sample complexity of SPS for general linear regression problems. We establish high probability upper bounds for the diameters of SPS confidence regions for finite sample sizes and show that the SPS regions shrink at the same, optimal rate as the classical asymptotic confidence ellipsoids. Finally, the difference between the theoretical bounds and the empirical sizes of SPS confidence regions is investigated experimentally.
Nonlinear PDE Constrained Optimal Dispatch of Gas and Power: A Global Linearization Approach
The coordinated dispatch of power and gas in the electricity-gas integrated energy system (EG-IES) is fundamental for ensuring operational security. However, the gas dynamics in the natural gas system (NGS) are governed by the nonlinear partial differential equations (PDE), making the dispatch problem of the EG-IES a complicated optimization model constrained by nonlinear PDE. To address it, we propose a globally linearized gas network model based on the Koopman operator theory, avoiding the commonly used local linearization and spatial discretization. Particularly, we propose a data-driven Koopman operator approximation approach for the globally linearized gas network model based on the extended dynamic mode decomposition, in which a physics-informed stability constraint is derived and embedded to improve the generalization ability and accuracy of the model. Based on this, we develop an optimal dispatch model for the EG-IES that first considers the nonlinear gas dynamics in the NGS. The case study verifies the effectiveness of this work. Simulation results reveal that the commonly used locally linearized gas network model fails to accurately capture the dynamic characteristics of NGS, bringing potential security threats to the system.
Geometric Scaling Laws for Axial Flux Permanent Magnet Motors in In-Wheel Powertrain Topologies
In this paper, we present geometric scaling models for axial flux motors (AFMs) to be used for in-wheel powertrain design optimization purposes. We first present a vehicle and powertrain model, with emphasis on the electric motor model. We construct the latter by formulating the analytical scaling laws for AFMs, based on the scaling concept of radial flux motors (RFMs) from the literature, specifically deriving the model of the main loss component in electric motors: the copper losses. We further present separate scaling models of motor parameters, losses and thermal models, as well as the torque limits and cost, as a function of the design variables. Second, we validate these scaling laws with several experiments leveraging high-fidelity finite-element simulations. Finally, we define an optimization problem that minimizes the energy consumption over a drive cycle, optimizing the motor size and transmission ratio for a wide range of electric vehicle powertrain topologies. In our study, we observe that the all-wheel drive topology equipped with in-wheel AFMs is the most efficient, but also generates the highest material cost.
comment: 5 pages, 6 figures, 4 tables, 2024 IEEE Vehicle Power and Propulsion Conference, Washington DC, USA
PACSBO: Probably approximately correct safe Bayesian optimization
Safe Bayesian optimization (BO) algorithms promise to find optimal control policies without knowing the system dynamics while at the same time guaranteeing safety with high probability. In exchange for those guarantees, popular algorithms require a smoothness assumption: a known upper bound on a norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it is unclear how to, in practice, obtain an upper bound of an unknown function in its corresponding RKHS. In response, we propose an algorithm that estimates an upper bound on the RKHS norm of an unknown function from data and investigate its theoretical properties. Moreover, akin to Lipschitz-based methods, we treat the RKHS norm as a local rather than a global object, and thus reduce conservatism. Integrating the RKHS norm estimation and the local interpretation of the RKHS norm into a safe BO algorithm yields PACSBO, an algorithm for probably approximately correct safe Bayesian optimization, for which we provide numerical and hardware experiments that demonstrate its applicability and benefits over popular safe BO algorithms.
comment: Accepted to the Symposium on Systems Theory in Data and Optimization (SysDO 2024). This is a preprint of the final version, which is to appear in Lecture Notes in Control and Information Sciences - Proceedings
Massive Random Access in Cell-Free Massive MIMO Systems for High-Speed Mobility with OTFS Modulation
In the research of next-generation wireless communication technologies, orthogonal time frequency space (OTFS) modulation is emerging as a promising technique for high-speed mobile environments due to its superior efficiency and robustness in doubly selective channels. Additionally, the cell-free architecture, which eliminates the issues associated with cell boundaries, offers broader coverage for radio access networks. By combining cell-free network architecture with OTFS modulation, the system may meet the demands of massive random access required by machine-type communication devices in high-speed scenarios. This paper explores a massive random access scheme based on OTFS modulation within a cell-free architecture. A transceiver model for uplink OTFS signals involving multiple access points (APs) is developed, where channel estimation with fractional channel parameters is approximated as a block sparse matrix recovery problem. Building on existing superimposed and embedded preamble schemes, a hybrid preamble scheme is proposed. This scheme leverages superimposed and embedded preambles to respectively achieve rough and accurate active user equipment (UEs) detection (AUD), as well as precise channel estimation, under the condition of supporting a large number of access UEs. Moreover, this study introduces a generalized approximate message passing and pattern coupling sparse Bayesian learning with Laplacian prior (GAMP-PCSBL-La) algorithm, which effectively captures block sparse features after discrete cosine transform (DCT), delivering precise estimation results with reduced computational complexity. Simulation results demonstrate that the proposed scheme is effective and provides superior performance compared to other existing schemes.
Flying a Quadrotor with Unknown Actuators and Sensor Configuration
Though control algorithms for multirotor Unmanned Air Vehicles (UAVs) are well understood, the configuration, parameter estimation, and tuning of flight control algorithms take considerable time and resources. In previous work, we have shown that it is possible to identify the control effectiveness and motor dynamics of a multirotor fast enough for it to recover to a stable hover after being thrown 4 meters in the air. In this paper, we extend this to include estimation of the position of the Inertial Measurement Unit (IMU) relative to the Center of Gravity (CoG), estimation of the IMU rotation, the thrust direction of all motors and the optimal combined thrust direction. In order to guarantee a correct IMU position estimation, two prior throw-and-catches of the vehicle with spin around different axes are required. For these throws, a height as low as 1 meter is sufficient. Quadrotor flight experimentation confirms the efficacy of the approach, and a simulation shows its applicability to fully actuated craft with multiple possible hover orientations.
comment: This work has been submitted to IMAV 2024 for possible publication
A blueprint for large-scale quantum-network deployments
Quantum Communications is a field that promises advances in cryptography, quantum computing and clock synchronisation, among other potential applications. However, communication based on quantum phenomena requires an extreme level of isolation from external disturbances, making the transmission of quantum signals together with classical ones difficult. A range of techniques has been tested to introduce quantum communications in already deployed optical networks which also carry legacy traffic. This comes with challenges, not only at the physical layer but also at the operations and management layer. To achieve a broad acceptance among network operators, the joint management and operation of quantum and classical resources, compliance with standards, and quality and legal assurance need to be addressed. This article presents a detailed account of solutions to the above issues, deployed and evaluated in the MadQCI (Madrid Quantum Communication Infrastructure) testbed. This network is designed to integrate quantum communications in the telecommunications ecosystem by installing quantum-key-distribution modules from multiple providers in production nodes of two different operators. The modules were connected through an optical-switched network with more than 130 km of deployed optical fibre. The tests were done in compliance with strict service level agreements that protected the legacy traffic of the pre-existing classical network. The goal was to achieve full quantum-classical compatibility at all levels, while limiting the modifications of optical transport and encryption and complying with as many standards as possible. This effort was intended to serve as a blueprint, which can be used as the foundation of large-scale quantum network deployments. To demonstrate the capabilities of MadQCI, end-to-end encryption services were deployed and a variety of use-cases were showcased.
comment: 16 pages
Learning in Hybrid Active Inference Models
An open problem in artificial intelligence is how systems can flexibly learn discrete abstractions that are useful for solving inherently continuous problems. Previous work in computational neuroscience has considered this functional integration of discrete and continuous variables during decision-making under the formalism of active inference (Parr, Friston & de Vries, 2017; Parr & Friston, 2018). However, their focus is on the expressive physical implementation of categorical decisions and the hierarchical mixed generative model is assumed to be known. As a consequence, it is unclear how this framework might be extended to learning. We therefore present a novel hierarchical hybrid active inference agent in which a high-level discrete active inference planner sits above a low-level continuous active inference controller. We make use of recent work in recurrent switching linear dynamical systems (rSLDS) which implement end-to-end learning of meaningful discrete representations via the piecewise linear decomposition of complex continuous dynamics (Linderman et al., 2016). The representations learned by the rSLDS inform the structure of the hybrid decision-making agent and allow us to (1) specify temporally-abstracted sub-goals in a method reminiscent of the options framework, (2) lift the exploration into discrete space, allowing us to exploit information-theoretic exploration bonuses, and (3) 'cache' the approximate solutions to low-level problems in the discrete planner. We apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and successful planning through the delineation of abstract sub-goals.
comment: 11 pages (+ appendix). Accepted to the International Workshop on Active Inference 2024. arXiv admin note: substantial text overlap with arXiv:2408.10970
Upgrading Pepper Robot's Social Interaction with Advanced Hardware and Perception Enhancements
In this paper, we propose hardware and software enhancements for the Pepper robot to improve its human-robot interaction capabilities. These include the integration of an NVIDIA Jetson GPU to enhance computational capabilities and execute real-time algorithms, and a RealSense D435i camera to capture depth images, as well as computer vision algorithms to detect and localize the humans around the robot and estimate their body orientation and gaze direction. The new stack is implemented on ROS and runs on the extended Pepper hardware, and communication with the robot's firmware is done through the NAOqi ROS driver API. We have also collected a MoCap dataset of human activities in a controlled environment, together with the corresponding RGB-D data, to validate the proposed perception algorithms.
Stability of multiplexed NCS based on an epsilon-greedy algorithm for communication selection
In this letter, we study a Networked Control System (NCS) with multiplexed communication and Bernoulli packet drops. Multiplexed communication refers to the constraint that transmission of a control signal and an observation signal cannot occur simultaneously due to the limited bandwidth. First, we propose an epsilon-greedy algorithm for the selection of the communication sequence that also ensures Mean Square Stability (MSS). We formulate the system as a Markovian Jump Linear System (MJLS) and provide the necessary conditions for MSS in terms of Linear Matrix Inequalities (LMIs) that need to be satisfied for three corner cases. We prove that the system is MSS for any convex combination of these three corner cases. Furthermore, we propose to use the epsilon-greedy algorithm with the epsilon that satisfies MSS conditions for training a Deep Q Network (DQN). The DQN is used to obtain an optimal communication sequence that minimizes a quadratic cost. We validate our approach with a numerical example that shows the efficacy of our method in comparison to the round-robin and a random scheme.
comment: A preliminary version of this article has been submitted to IEEE Control Systems articles
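The selection rule named in the abstract above can be illustrated with a generic epsilon-greedy sketch. This is not the authors' algorithm: the stability-preserving choice of epsilon via LMIs is not modelled, and the function name, the `q_values` list (one estimated value per communication choice, e.g. "send control" or "send observation"), and the `rng` argument are all assumptions made for this example.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index: explore with probability epsilon, else exploit.

    Hypothetical helper: `q_values[i]` is the current value estimate of
    action i. Exploration draws an index uniformly at random; exploitation
    returns the index of the largest estimate.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

Setting `epsilon` to 0 gives a purely greedy schedule, and setting it to 1 gives a purely random one; intermediate values trade exploration against exploitation.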
Distributed Optimization under Edge Agreement with Application in Battery Network Management
This paper investigates a distributed optimization problem under edge agreements, where each agent in the network is also subject to local convex constraints. Generalized from the concept of consensus, a group of edge agreements represents the constraints defined for neighboring agents, with each pair of neighboring agents required to satisfy one edge agreement constraint. Edge agreements are defined locally to allow more flexibility than a global consensus, enabling heterogeneous coordination within the network. This paper proposes a discrete-time algorithm to solve such problems, providing a theoretical analysis to prove its convergence. Additionally, this paper illustrates the connection between the theory of distributed optimization under edge agreements and distributed model predictive control through a distributed battery network energy management problem. This approach enables a new perspective to formulate and solve network control and optimization problems.
The Fragile Nature of Road Transportation Systems
Major cities worldwide experience problems with the performance of their road transportation systems, and the continuous increase in traffic demand presents a substantial challenge to the optimal operation of urban road networks and the efficiency of traffic control strategies. The operation of transportation systems is widely considered to display a fragile property, i.e., the loss in performance increases exponentially as the magnitude of disruptions increases linearly. Meanwhile, the risk engineering community is embracing the novel concept of antifragility, enabling systems to learn from historical disruptions and exhibit improved performance under black swan events. In this study, based on established traffic models, namely fundamental diagrams and macroscopic fundamental diagrams (MFDs), we first conduct a rigorous mathematical analysis to prove the fragile nature of the systems theoretically. Subsequently, we propose a skewness-based indicator that can be readily applied to cross-compare the degree of fragility of different networks, depending solely on the MFD-related parameters. Finally, by taking real-world stochasticity into account, we implement a numerical simulation with realistic network data to bridge the gap between the theoretical proof and real-world operations and to reflect the potential impact of uncertainty on the fragility of the systems. This work aims to demonstrate the fragile nature of road transportation systems and help researchers better comprehend the necessity to explicitly consider antifragile design for future traffic control strategies.
comment: 44 pages, 15 figures
Physics-informed Neural Networks for Heterogeneous Poroelastic Media
This study presents a novel physics-informed neural network (PINN) framework for modeling poroelasticity in heterogeneous media with material interfaces. The approach introduces a composite neural network (CoNN) where separate neural networks predict displacement and pressure variables for each material. While sharing identical activation functions, these networks are independently trained for all other parameters. To address challenges posed by heterogeneous material interfaces, the CoNN is integrated with the Interface-PINNs or I-PINNs framework (Sarma et al. 2024, https://dx.doi.org/10.1016/j.cma.2024.117135), allowing different activation functions across material interfaces. This ensures accurate approximation of discontinuous solution fields and gradients. Performance and accuracy of this combined architecture were evaluated against the conventional PINNs approach, a single neural network (SNN) architecture, and the eXtended PINNs (XPINNs) framework through two one-dimensional benchmark examples with discontinuous material properties. The results show that the proposed CoNN with I-PINNs architecture achieves an RMSE that is two orders of magnitude better than the conventional PINNs approach and is at least 40 times faster than the SNN framework. Compared to XPINNs, the proposed method achieves an RMSE at least one order of magnitude better and is 40% faster.
comment: 34 pages, 12 figures, 3 tables
A Two-sided Model for EV Market Dynamics and Policy Implications
The diffusion of Electric Vehicles (EVs) plays a pivotal role in mitigating greenhouse gas emissions, particularly in the U.S., where ambitious zero-emission and carbon neutrality objectives have been set. In pursuit of these goals, many states have implemented a range of incentive policies aimed at stimulating EV adoption and charging infrastructure development, especially public EV charging stations (EVCS). This study examines the indirect network effect observed between EV adoption and EVCS deployment within urban landscapes. We developed a two-sided log-log regression model with historical data on EV purchases and EVCS development to quantify this effect. To test the robustness, we then conducted a case study of the EV market in Los Angeles (LA) County, which suggests that a 1% increase in EVCS correlates with a 0.35% increase in EV sales. Additionally, we forecasted the future EV market dynamics in LA County, revealing a notable disparity between current policies and the targeted 80% EV market share for private cars by 2045. To bridge this gap, we proposed a combined policy recommendation that enhances EV incentives by 60% and EVCS rebates by 66%, facilitating the achievement of future EV market objectives.
comment: Conference preprint, 6 pages, 2 figures
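The elasticity reading of the log-log result in the abstract above can be checked with a short calculation. The sketch below assumes a single-equation form log(y) = a + b * log(x), where b is the elasticity; it does not reproduce the two-sided model, and the function name is an assumption made for this example.

```python
def percent_change_in_y(percent_change_in_x, elasticity):
    """Percent change in y implied by log(y) = a + b*log(x) with b = elasticity.

    A change of p percent in x multiplies y by (1 + p/100) ** elasticity,
    so the intercept a cancels out of the relative change.
    """
    ratio = (1.0 + percent_change_in_x / 100.0) ** elasticity
    return (ratio - 1.0) * 100.0
```

For small changes the result is close to elasticity times the percent change in x, which is why a coefficient of 0.35 is read as "a 1% increase in EVCS correlates with a 0.35% increase in EV sales"; for larger changes the exact power-law form should be used instead of the linear approximation.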
Emergent Cooperation for Energy-efficient Connectivity via Wireless Power Transfer
This paper addresses the challenge of incentivizing energy-constrained, non-cooperative user equipment (UE) to serve as cooperative relays. We consider a source UE with a non-line-of-sight channel to an access point (AP), where direct communication may be infeasible or may necessitate a substantial transmit power. Other UEs in the vicinity are viewed as relay candidates, and our aim is to enable energy-efficient connectivity for the source, while accounting for the self-interested behavior and private channel state information of these candidates, by allowing the source to "pay" the candidates via wireless power transfer (WPT). We propose a cooperation-inducing protocol, inspired by Myerson auction theory, which ensures that candidates truthfully report power requirements while minimizing the expected power used by the source. Through rigorous analysis, we establish the regularity of valuations for lognormal fading channels, which allows for the efficient determination of the optimal source transmit power. Extensive simulation experiments, employing real-world communication and WPT parameters, validate our theoretical framework. Our results demonstrate over 71% reduction in outage probability with as few as 4 relay candidates, compared to the non-cooperative scenario, and as much as 70% source power savings compared to a baseline approach, highlighting the efficacy of our proposed methodology.
Unmanned Vehicles in 6G Networks: A Unifying Treatment of Problems, Formulations, and Tools
Unmanned Vehicles (UVs) functioning as autonomous agents are anticipated to play a crucial role in the 6th Generation (6G) of wireless networks. Their seamless integration, cost-effectiveness, and the additional controllability through motion planning make them an attractive deployment option for a wide range of applications, both as assets in the network (e.g., mobile base stations) and as consumers of network services (e.g., autonomous delivery systems). However, despite their potential, the convergence of UVs and wireless systems brings forth numerous challenges that require attention from both academia and industry. This paper aims to offer a comprehensive overview encompassing the transformative possibilities as well as the significant challenges associated with UV-assisted next-generation wireless communications. Considering the diverse landscape of possible application scenarios, problem formulations, and mathematical tools related to UV-assisted wireless systems, the underlying core theme of this paper is the unification of the problem space, providing a structured framework to understand the use cases, problem formulations, and necessary mathematical tools. Overall, the paper sets forth a clear understanding of how unmanned vehicles can be integrated in the 6G ecosystem, paving the way towards harnessing the full potential at this intersection.
Opinion dynamics on signed graphs and graphons: Beyond the piece-wise constant case (Extended version)
In this paper we make use of graphon theory to study opinion dynamics on large undirected networks. The opinion dynamics models that we take into consideration allow for negative interactions between the individuals, i.e. competing entities whose opinions can grow apart. We consider both the repelling model and the opposing model that are studied in the literature. We define the repelling and the opposing dynamics on graphons and we show that their initial value problem's solutions exist and are unique. We then show that the graphon dynamics well approximate the dynamics on large graphs that converge to a graphon. This result applies to large random graphs that are sampled according to a graphon. All these facts are illustrated in an extended numerical example.
comment: 8 double-column pages. This revised version corrects several typos. An abridged version is going to appear in the proceedings of the 2024 IEEE Conference on Decision and Control
Leveraging Blockchain and ANFIS for Optimal Supply Chain Management
The supply chain is a critical segment of the product manufacturing cycle, continuously influenced by risky, uncertain, and undesirable events. Optimizing flexibility in the supply chain presents a complex, multi-objective, and nonlinear programming challenge. In the poultry supply chain, the development of mass customization capabilities has led manufacturing companies to increasingly focus on offering tailored and customized services for individual products. To safeguard against data tampering and ensure the integrity of setup costs and overall profitability, a multi-signature decentralized finance (DeFi) protocol, integrated with the IoT on a blockchain platform, is proposed. Managing the poultry supply chain involves uncertainties that may not account for parameters such as delivery time to retailers, reorder time, and the number of requested products. To address these challenges, this study employs an adaptive neuro-fuzzy inference system (ANFIS), combining neural networks with fuzzy logic to compensate for the lack of data training in parameter identification. Through MATLAB simulations, the study investigates the average shop delivery duration, the reorder time, and the number of products per order. By implementing the proposed technique, the average delivery time decreases from 40 to 37 minutes, the reorder time decreases from five to four days, and the quantity of items requested per order grows from six to eleven. Additionally, the ANFIS model enhances overall supply chain performance by reducing transaction times by 15% compared to conventional systems, thereby improving real-time responsiveness and boosting transparency in supply chain operations, effectively resolving operational issues.
Autonomous Payload Thermal Control SP
In small satellites there is less room for heat control equipment, scientific instruments, and electronic components. Furthermore, the near proximity of electronic components makes power dissipation difficult, with the risk of not being able to control the temperature appropriately, reducing component lifetime and mission performance. To address this challenge, taking advantage of the advent of increasing intelligence on board satellites, an autonomous thermal control tool that uses deep reinforcement learning is proposed for learning the thermal control policy onboard. The tool was evaluated in a real space edge processing computer that will be used in a demonstration payload hosted in the International Space Station (ISS). The experiment results show that the proposed framework is able to learn to control the payload processing power to maintain the temperature under operational ranges, complementing traditional thermal control systems.
comment: To be included in the proceedings of ESA's SPAICE conference at ECSAT, UK, 2024
Measurement and Modeling on Terahertz Channels in Rain
The Terahertz (THz) frequency band offers a wide range of bandwidths, from tens to hundreds of gigahertz (GHz), and also supports data speeds of several terabits per second (Tbps). Because of this, maintaining THz channel reliability and efficiency in adverse weather conditions is crucial. Rain, in particular, disrupts THz channel propagation significantly, and there is still a lack of comprehensive investigations due to the experimental difficulties involved. This work explores how rain affects THz channel performance by conducting experiments in a rain emulation chamber and under actual rainy conditions outdoors. We focus on variables like rain intensity, raindrop size distribution (RDSD), and the channel's gradient height. We observe that the gradient height (for an air-to-ground channel) can induce changes of the RDSD along the channel's path, impacting the precision of modeling efforts. To address this, we propose a theoretical model, integrating Mie scattering theory with considerations of the channel's gradient height. Both our experimental and theoretical findings confirm this model's effectiveness in predicting THz channel behavior in rainy conditions. This work underscores the necessity of incorporating the variation of the RDSD when a THz channel travels in scenarios involving ground-to-air or air-to-ground communications.
comment: submitted to Journal of Infrared, Millimeter and Terahertz Waves
Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations
This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, this paper focuses on developing an efficient online search strategy which jointly estimates channels, guides UAV positioning, and optimizes resource allocation. Analytically exploiting the geometric properties of the equipotential surface, this paper develops an LOS discovery trajectory on the equipotential surface while the closed-form search directions are determined using perturbation theory. Since the explicit expression of the equipotential surface is not available, this paper proposes to locally construct a channel model for each user in the LOS regime utilizing polynomial regression without depending on user locations or propagation distance. A class of spiral trajectories to simultaneously construct the LOS channels and search on the equipotential surface is developed. An optimal radius of the spiral and an optimal measurement pattern for channel gain estimation are derived to minimize the mean squared error (MSE) of the locally constructed channel. Numerical results on real 3D city maps demonstrate that the proposed scheme achieves over 94% of the performance of a 3D exhaustive search scheme with just a 3-kilometer search.
Systems and Control (EESS)
Terminal Soft Landing Guidance Law Using Analytic Gravity Turn Trajectory
This paper presents an innovative terminal landing guidance law that utilizes an analytic solution derived from the gravity turn trajectory. The characteristics of the derived solution are thoroughly investigated, and the solution is employed to generate a reference velocity vector that satisfies terminal landing conditions. A nonlinear control law is applied to effectively track the reference velocity vector within a finite time, and its robustness against disturbances is studied. Furthermore, the guidance law is expanded to incorporate ground collision avoidance by considering the shape of the gravity turn trajectory. The proposed method's fuel efficiency, robustness, and practicality are demonstrated through comprehensive numerical simulations, and its performance is compared with existing methods.
Time-Varying Soft-Maximum Barrier Functions for Safety in Unmapped and Dynamic Environments
We present a closed-form optimal feedback control method that ensures safety in an a priori unknown and potentially dynamic environment. This article considers the scenario where local perception data (e.g., LiDAR) is obtained periodically, and this data can be used to construct a local control barrier function (CBF) that models a local set that is safe for a period of time into the future. Then, we use a smooth time-varying soft-maximum function to compose the N most recently obtained local CBFs into a single barrier function that models an approximate union of the N most recently obtained local sets. This composite barrier function is used in a constrained quadratic optimization, which is solved in closed form to obtain a safe-and-optimal feedback control. We also apply the time-varying soft-maximum barrier function control to 2 robotic systems (nonholonomic ground robot with nonnegligible inertia, and quadrotor robot), where the objective is to navigate an a priori unknown environment safely and reach a target destination. In these applications, we present a simple approach to generate local CBFs from periodically obtained perception data.
comment: Preprint submitted to IEEE Transactions on Control Systems Technology (TCST)
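As a rough illustration of the composition step described in this abstract, the sketch below uses one common soft-maximum form, the log-mean-exp (1/kappa) * log((1/N) * sum_i exp(kappa * h_i)), which never exceeds max_i h_i, so its zero-superlevel set lies inside the union of the individual safe sets. The function name `soft_maximum` and the parameter `kappa` are assumptions made for illustration; the abstract does not state the exact formula, and the time-varying weighting of the paper is not reproduced here.

```python
import numpy as np

def soft_maximum(h_values, kappa=10.0):
    """Smooth lower bound on max(h_values) via a scaled log-mean-exp.

    h_values: local barrier-function values h_1(x), ..., h_N(x) at one state x.
    kappa:    sharpness; larger values bring the result closer to the true max.
    Illustrative form only, not necessarily the one used in the paper.
    """
    h = np.asarray(h_values, dtype=float)
    shift = h.max()  # subtract the max before exponentiating for numerical stability
    return shift + np.log(np.mean(np.exp(kappa * (h - shift)))) / kappa

# Three hypothetical local CBF values: the state is safe w.r.t. the 2nd and 3rd sets.
print(soft_maximum([-1.0, 0.5, 0.2], kappa=50.0))
```

The result always lies between max(h) - log(N)/kappa and max(h), so a positive value certifies that at least one local barrier function is positive at that state.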
Performance-Aware Self-Configurable Multi-Agent Networks: A Distributed Submodular Approach for Simultaneous Coordination and Network Design
We introduce the first, to our knowledge, rigorous approach that enables multi-agent networks to self-configure their communication topology to balance the trade-off between scalability and optimality during multi-agent planning. We are motivated by the future of ubiquitous collaborative autonomy where numerous distributed agents will be coordinating via agent-to-agent communication to execute complex tasks such as traffic monitoring, event detection, and environmental exploration. But the explosion of information in such large-scale networks currently curtails their deployment due to impractical decision times induced by the computational and communication requirements of the existing near-optimal coordination algorithms. To overcome this challenge, we present the AlterNAting COordination and Network-Design Algorithm (Anaconda), a scalable algorithm that also enjoys near-optimality guarantees. Subject to the agents' bandwidth constraints, Anaconda enables the agents to optimize their local communication neighborhoods such that the action-coordination approximation performance of the network is maximized. Compared to the state of the art, Anaconda is an anytime self-configurable algorithm that quantifies its suboptimality guarantee for any type of network, from fully disconnected to fully centralized, and that, for sparse networks, is one order faster in terms of decision speed. To develop the algorithm, we quantify the suboptimality cost due to decentralization, i.e., due to communication-minimal distributed coordination. We also employ tools inspired by the literature on multi-armed bandits and submodular maximization subject to cardinality constraints. We demonstrate Anaconda in simulated scenarios of area monitoring and compare it with a state-of-the-art algorithm.
comment: Accepted to CDC 2024
An Investigation of Denial of Service Attacks on Autonomous Driving Software and Hardware in Operation
This research investigates the impact of Denial of Service (DoS) attacks, specifically Internet Control Message Protocol (ICMP) flood attacks, on Autonomous Driving (AD) systems, focusing on their control modules. Two experimental setups were created: the first involved an ICMP flood attack on a Raspberry Pi running an AD software stack, and the second examined the effects of single and double ICMP flood attacks on a Global Navigation Satellite System Real-Time Kinematic (GNSS-RTK) device for high-accuracy localization of an autonomous vehicle that is available on the market. The results indicate a moderate impact of DoS attacks on the AD stack, where the increase in median computation time was marginal, suggesting a degree of resilience to these types of attacks. In contrast, the GNSS device demonstrated significant vulnerability: during DoS attacks, the sample rate dropped drastically to approximately 50% and 5% of the nominal rate for single and double attacker configurations, respectively. Additionally, the longest observed time increments were in the range of seconds during the attacks. These results underscore the vulnerability of AD systems to DoS attacks and the critical need for robust cybersecurity measures. This work provides valuable insights into the design requirements of AD software stacks and highlights that external hardware and modules can be significant attack surfaces.
Analyzing electric vehicle, load and photovoltaic generation uncertainty using publicly available datasets
This paper aims to analyze three publicly available datasets for quantifying seasonal and annual uncertainty for efficient scenario creation. The datasets from Elaad, Elia and Fluvius are utilized to statistically analyze electric vehicle charging, normalized solar generation and low-voltage consumer load profiles, respectively. Frameworks for scenario generation are also provided for these datasets. The datasets for load profiles and solar generation analyzed are for the year 2022, thus embedding seasonal information. An online repository is created for the wider applicability of this work. Finally, the extreme load week(s) are identified and linked to the weather data measured at EnergyVille in Belgium.
Adaptive Artificial Time Delay Control for Robotic Systems
The artificial time delay controller was conceptualised for nonlinear systems to reduce the dependency on precise system modelling, unlike conventional adaptive and robust control strategies. In this approach, unknown dynamics are compensated by using input and state measurements collected at the immediate past time instant (i.e., artificially delayed). The advantage of this kind of approach lies in its simplicity and ease of implementation. However, applications of artificial time delay controllers in robotics that are also robust against unknown state-dependent uncertainty are still largely missing. This thesis presents the study of this control approach for two important classes of robotic systems, namely a fully actuated bipedal walking robot and an underactuated quadrotor system. In the first work, we explore the idea of a unified control design instead of multiple controllers for different walking phases in adaptive bipedal walking control while bypassing the computation of constraint forces, since they often lead to complex designs. The second work focuses on quadrotors employed for applications such as payload delivery, inspection and search-and-rescue. The effectiveness of this controller is validated using experimental results.
Sample Complexity of the Sign-Perturbed Sums Method
We study the sample complexity of the Sign-Perturbed Sums (SPS) method, which constructs exact, non-asymptotic confidence regions for the true system parameters under mild statistical assumptions, such as independent and symmetric noise terms. The standard version of SPS deals with linear regression problems; however, it can be generalized to stochastic linear (dynamical) systems, even with closed-loop setups, and to nonlinear and nonparametric problems as well. Although the strong consistency of the method has been rigorously proven, the sample complexity of the algorithm has so far been analyzed only for scalar linear regression problems. In this paper we study the sample complexity of SPS for general linear regression problems. We establish high probability upper bounds for the diameters of SPS confidence regions for finite sample sizes and show that the SPS regions shrink at the same, optimal rate as the classical asymptotic confidence ellipsoids. Finally, the difference between the theoretical bounds and the empirical sizes of SPS confidence regions is investigated experimentally.
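For readers unfamiliar with SPS, the following is a minimal sketch of the membership test for the standard linear-regression version, written from the usual description of the method rather than from this paper. The names `sps_contains`, `m`, `q`, and `seed` are assumptions for illustration: a candidate `theta` is accepted when the reference sum ranks among the `m - q` smallest of `m` sums, which gives a region of confidence level 1 - q/m under the symmetric-noise assumption.

```python
import numpy as np

def sps_contains(theta, Phi, y, m=100, q=5, seed=0):
    """Check whether theta lies in the SPS confidence region of level 1 - q/m.

    Phi: (n, d) regressor matrix, y: (n,) outputs. The seed fixes the random
    signs and the tie-breaking order, so every theta is tested against the
    same region.
    """
    rng = np.random.default_rng(seed)
    n = Phi.shape[0]
    chol = np.linalg.cholesky(Phi.T @ Phi / n)    # R_n = L L^T
    residuals = y - Phi @ theta                   # prediction errors at theta
    signs = rng.choice([-1.0, 1.0], size=(m, n))  # random sign sequences
    signs[0] = 1.0                                # row 0 is the unperturbed reference sum
    sums = (signs * residuals) @ Phi / n          # (m, d) sign-perturbed sums
    scaled = np.linalg.solve(chol, sums.T)        # ||L^{-1} s||^2 equals s^T R_n^{-1} s
    norms = np.sum(scaled ** 2, axis=0)
    tie = rng.permutation(m)                      # random tie-breaking order
    smaller = (norms[1:] < norms[0]) | ((norms[1:] == norms[0]) & (tie[1:] < tie[0]))
    rank = 1 + int(np.sum(smaller))               # rank of the reference sum
    return rank <= m - q

# Hypothetical data: two-parameter linear regression with Gaussian noise.
rng = np.random.default_rng(1)
Phi = rng.normal(size=(200, 2))
y = Phi @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=200)
theta_ls = np.linalg.lstsq(Phi, y, rcond=None)[0]
print(sps_contains(theta_ls, Phi, y))
```

At the least-squares estimate the reference sum vanishes, so that point is always accepted; the size of the accepted region around it is what the sample-complexity bounds of the abstract concern.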
Nonlinear PDE Constrained Optimal Dispatch of Gas and Power: A Global Linearization Approach
The coordinated dispatch of power and gas in the electricity-gas integrated energy system (EG-IES) is fundamental for ensuring operational security. However, the gas dynamics in the natural gas system (NGS) are governed by the nonlinear partial differential equations (PDE), making the dispatch problem of the EG-IES a complicated optimization model constrained by nonlinear PDE. To address it, we propose a globally linearized gas network model based on the Koopman operator theory, avoiding the commonly used local linearization and spatial discretization. Particularly, we propose a data-driven Koopman operator approximation approach for the globally linearized gas network model based on the extended dynamic mode decomposition, in which a physics-informed stability constraint is derived and embedded to improve the generalization ability and accuracy of the model. Based on this, we develop an optimal dispatch model for the EG-IES that first considers the nonlinear gas dynamics in the NGS. The case study verifies the effectiveness of this work. Simulation results reveal that the commonly used locally linearized gas network model fails to accurately capture the dynamic characteristics of NGS, bringing potential security threats to the system.
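The extended dynamic mode decomposition mentioned in this abstract can be sketched in a few lines. The version below is the plain least-squares form only; the physics-informed stability constraint of the paper is not included, and the function name `edmd` and the dictionary `lift` are assumptions chosen for illustration.

```python
import numpy as np

def edmd(X, Y, lift):
    """Least-squares Koopman approximation from snapshot pairs.

    X, Y: (n_samples, n_states) arrays with Y[k] the successor state of X[k].
    lift: maps one state vector to a vector of dictionary (observable) values.
    Returns K such that lift(Y[k]) is approximately lift(X[k]) @ K.
    """
    Psi_X = np.array([lift(x) for x in X])
    Psi_Y = np.array([lift(y) for y in Y])
    K, *_ = np.linalg.lstsq(Psi_X, Psi_Y, rcond=None)  # solves Psi_X @ K ~= Psi_Y
    return K

# Hypothetical check on a linear system, where the identity dictionary is exact.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
X = np.random.default_rng(0).normal(size=(50, 2))
Y = X @ A.T
K = edmd(X, Y, lambda x: x)
print(np.allclose(K.T, A))
```

For nonlinear dynamics such as gas flow, `lift` would contain nonlinear observables, and the resulting linear model in the lifted space is what makes the dispatch problem linear.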
Geometric Scaling Laws for Axial Flux Permanent Magnet Motors in In-Wheel Powertrain Topologies
In this paper, we present geometric scaling models for axial flux motors (AFMs) to be used for in-wheel powertrain design optimization purposes. We first present a vehicle and powertrain model, with emphasis on the electric motor model. We construct the latter by formulating the analytical scaling laws for AFMs, based on the scaling concept of radial flux motors (RFMs) from the literature, specifically deriving the model of the main loss component in electric motors: the copper losses. We further present separate scaling models of motor parameters, losses and thermal models, as well as the torque limits and cost, as a function of the design variables. Second, we validate these scaling laws with several experiments leveraging high-fidelity finite-element simulations. Finally, we define an optimization problem that minimizes the energy consumption over a drive cycle, optimizing the motor size and transmission ratio for a wide range of electric vehicle powertrain topologies. In our study, we observe that the all-wheel drive topology equipped with in-wheel AFMs is the most efficient, but also generates the highest material cost.
comment: 5 pages, 6 figures, 4 tables, 2024 IEEE Vehicle Power and Propulsion Conference, Washington DC, USA
PACSBO: Probably approximately correct safe Bayesian optimization
Safe Bayesian optimization (BO) algorithms promise to find optimal control policies without knowing the system dynamics while at the same time guaranteeing safety with high probability. In exchange for those guarantees, popular algorithms require a smoothness assumption: a known upper bound on a norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it is unclear how to, in practice, obtain an upper bound of an unknown function in its corresponding RKHS. In response, we propose an algorithm that estimates an upper bound on the RKHS norm of an unknown function from data and investigate its theoretical properties. Moreover, akin to Lipschitz-based methods, we treat the RKHS norm as a local rather than a global object, and thus reduce conservatism. Integrating the RKHS norm estimation and the local interpretation of the RKHS norm into a safe BO algorithm yields PACSBO, an algorithm for probably approximately correct safe Bayesian optimization, for which we provide numerical and hardware experiments that demonstrate its applicability and benefits over popular safe BO algorithms.
comment: Accepted to the Symposium on Systems Theory in Data and Optimization (SysDO 2024). This is a preprint of the final version, which is to appear in Lecture Notes in Control and Information Sciences - Proceedings
Massive Random Access in Cell-Free Massive MIMO Systems for High-Speed Mobility with OTFS Modulation
In the research of next-generation wireless communication technologies, orthogonal time frequency space (OTFS) modulation is emerging as a promising technique for high-speed mobile environments due to its superior efficiency and robustness in doubly selective channels. Additionally, the cell-free architecture, which eliminates the issues associated with cell boundaries, offers broader coverage for radio access networks. By combining cell-free network architecture with OTFS modulation, the system may meet the demands of massive random access required by machine-type communication devices in high-speed scenarios. This paper explores a massive random access scheme based on OTFS modulation within a cell-free architecture. A transceiver model for uplink OTFS signals involving multiple access points (APs) is developed, where channel estimation with fractional channel parameters is approximated as a block sparse matrix recovery problem. Building on existing superimposed and embedded preamble schemes, a hybrid preamble scheme is proposed. This scheme leverages superimposed and embedded preambles to respectively achieve rough and accurate active user equipment (UEs) detection (AUD), as well as precise channel estimation, under the condition of supporting a large number of access UEs. Moreover, this study introduces a generalized approximate message passing and pattern coupling sparse Bayesian learning with Laplacian prior (GAMP-PCSBL-La) algorithm, which effectively captures block sparse features after discrete cosine transform (DCT), delivering precise estimation results with reduced computational complexity. Simulation results demonstrate that the proposed scheme is effective and provides superior performance compared to other existing schemes.
Flying a Quadrotor with Unknown Actuators and Sensor Configuration
Though control algorithms for multirotor Unmanned Aerial Vehicles (UAVs) are well understood, the configuration, parameter estimation, and tuning of flight control algorithms take considerable time and resources. In previous work, we have shown that it is possible to identify the control effectiveness and motor dynamics of a multirotor fast enough for it to recover to a stable hover after being thrown 4 meters in the air. In this paper, we extend this to include estimation of the position of the Inertial Measurement Unit (IMU) relative to the Center of Gravity (CoG), estimation of the IMU rotation, the thrust direction of all motors and the optimal combined thrust direction. In order to guarantee a correct IMU position estimation, two prior throw-and-catches of the vehicle with spin around different axes are required. For these throws, a height as low as 1 meter is sufficient. Quadrotor flight experimentation confirms the efficacy of the approach, and a simulation shows its applicability to fully-actuated crafts with multiple possible hover orientations.
comment: This work has been submitted to IMAV 2024 for possible publication
A blueprint for large-scale quantum-network deployments
Quantum Communications is a field that promises advances in cryptography, quantum computing and clock synchronisation, among other potential applications. However, communication based on quantum phenomena requires an extreme level of isolation from external disturbances, making the transmission of quantum signals together with classical ones difficult. A range of techniques has been tested to introduce quantum communications in already deployed optical networks which also carry legacy traffic. This comes with challenges, not only at the physical layer but also at the operations and management layer. To achieve a broad acceptance among network operators, the joint management and operation of quantum and classical resources, compliance with standards, and quality and legal assurance need to be addressed. This article presents a detailed account of solutions to the above issues, deployed and evaluated in the MadQCI (Madrid Quantum Communication Infrastructure) testbed. This network is designed to integrate quantum communications in the telecommunications ecosystem by installing quantum-key-distribution modules from multiple providers in production nodes of two different operators. The modules were connected through an optical-switched network with more than 130 km of deployed optical fibre. The tests were done in compliance with strict service level agreements that protected the legacy traffic of the pre-existing classical network. The goal was to achieve full quantum-classical compatibility at all levels, while limiting the modifications of optical transport and encryption and complying with as many standards as possible. This effort was intended to serve as a blueprint, which can be used as the foundation of large-scale quantum network deployments. To demonstrate the capabilities of MadQCI, end-to-end encryption services were deployed and a variety of use-cases were showcased.
comment: 16 pages
Learning in Hybrid Active Inference Models
An open problem in artificial intelligence is how systems can flexibly learn discrete abstractions that are useful for solving inherently continuous problems. Previous work in computational neuroscience has considered this functional integration of discrete and continuous variables during decision-making under the formalism of active inference (Parr, Friston & de Vries, 2017; Parr & Friston, 2018). However, their focus is on the expressive physical implementation of categorical decisions and the hierarchical mixed generative model is assumed to be known. As a consequence, it is unclear how this framework might be extended to learning. We therefore present a novel hierarchical hybrid active inference agent in which a high-level discrete active inference planner sits above a low-level continuous active inference controller. We make use of recent work in recurrent switching linear dynamical systems (rSLDS) which implement end-to-end learning of meaningful discrete representations via the piecewise linear decomposition of complex continuous dynamics (Linderman et al., 2016). The representations learned by the rSLDS inform the structure of the hybrid decision-making agent and allow us to (1) specify temporally-abstracted sub-goals in a method reminiscent of the options framework, (2) lift the exploration into discrete space allowing us to exploit information-theoretic exploration bonuses and (3) "cache" the approximate solutions to low-level problems in the discrete planner. We apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and successful planning through the delineation of abstract sub-goals.
comment: 11 pages (+ appendix). Accepted to the International Workshop on Active Inference 2024. arXiv admin note: substantial text overlap with arXiv:2408.10970
Upgrading Pepper Robot's Social Interaction with Advanced Hardware and Perception Enhancements
In this paper, we propose hardware and software enhancements for the Pepper robot to improve its human-robot interaction capabilities. This includes the integration of an NVIDIA Jetson GPU to enhance computational capabilities and execute real-time algorithms, and a RealSense D435i camera to capture depth images, as well as the computer vision algorithms to detect and localize the humans around the robot and estimate their body orientation and gaze direction. The new stack is implemented on ROS and runs on the extended Pepper hardware, and the communication with the robot's firmware is done through the NAOqi ROS driver API. We have also collected a MoCap dataset of human activities in a controlled environment, together with the corresponding RGB-D data, to validate the proposed perception algorithms.
Stability of multiplexed NCS based on an epsilon-greedy algorithm for communication selection
In this letter, we study a Networked Control System (NCS) with multiplexed communication and Bernoulli packet drops. Multiplexed communication refers to the constraint that transmission of a control signal and an observation signal cannot occur simultaneously due to the limited bandwidth. First, we propose an epsilon-greedy algorithm for the selection of the communication sequence that also ensures Mean Square Stability (MSS). We formulate the system as a Markovian Jump Linear System (MJLS) and provide the necessary conditions for MSS in terms of Linear Matrix Inequalities (LMIs) that need to be satisfied for three corner cases. We prove that the system is MSS for any convex combination of these three corner cases. Furthermore, we propose to use the epsilon-greedy algorithm with the epsilon that satisfies MSS conditions for training a Deep Q Network (DQN). The DQN is used to obtain an optimal communication sequence that minimizes a quadratic cost. We validate our approach with a numerical example that shows the efficacy of our method in comparison to the round-robin and a random scheme.
comment: A preliminary version of this article has been submitted to IEEE Control Systems articles
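The epsilon-greedy selection rule referred to in this abstract can be illustrated generically as below. This is a textbook sketch, not the letter's algorithm: the function name `epsilon_greedy_action`, the two-action encoding (0 for transmitting the control signal, 1 for transmitting the observation), and the convention that higher values are better are all assumptions, and the MSS-derived bound on epsilon is not reproduced.

```python
import random

def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    q_values: estimated value of each action (higher is better here; a
              cost-minimizing variant would take the minimum instead).
    epsilon:  exploration probability in [0, 1].
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit

# Hypothetical values for the two multiplexed choices:
# index 0 = send control signal, index 1 = send observation signal.
action = epsilon_greedy_action([0.2, 0.7], epsilon=0.1)
print(action)
```

In the setting of the abstract, epsilon would be restricted to values for which the stability conditions hold, so that exploration during DQN training cannot destabilize the closed loop.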
Distributed Optimization under Edge Agreement with Application in Battery Network Management
This paper investigates a distributed optimization problem under edge agreements, where each agent in the network is also subject to local convex constraints. Generalized from the concept of consensus, a group of edge agreements represents the constraints defined for neighboring agents, with each pair of neighboring agents required to satisfy one edge agreement constraint. Edge agreements are defined locally to allow more flexibility than a global consensus, enabling heterogeneous coordination within the network. This paper proposes a discrete-time algorithm to solve such problems, providing a theoretical analysis to prove its convergence. Additionally, this paper illustrates the connection between the theory of distributed optimization under edge agreements and distributed model predictive control through a distributed battery network energy management problem. This approach enables a new perspective to formulate and solve network control and optimization problems.
The Fragile Nature of Road Transportation Systems
Major cities worldwide experience problems with the performance of their road transportation systems, and the continuous increase in traffic demand presents a substantial challenge to the optimal operation of urban road networks and the efficiency of traffic control strategies. The operation of transportation systems is widely considered to display a fragile property, i.e., the loss in performance increases exponentially as the magnitude of disruptions increases linearly. Meanwhile, the risk engineering community is embracing the novel concept of antifragility, enabling systems to learn from historical disruptions and exhibit improved performance under black swan events. In this study, based on established traffic models, namely fundamental diagrams and macroscopic fundamental diagrams (MFDs), we first conduct a rigorous mathematical analysis to prove the fragile nature of the systems theoretically. Subsequently, we propose a skewness-based indicator that can be readily applied to cross-compare the degree of fragility of different networks, depending solely on the MFD-related parameters. Finally, by taking real-world stochasticity into account, we implement a numerical simulation with realistic network data to bridge the gap between the theoretical proof and real-world operations and to reflect the potential impact of uncertainty on the fragility of the systems. This work aims to demonstrate the fragile nature of road transportation systems and help researchers better comprehend the necessity of explicitly considering antifragile design for future traffic control strategies.
comment: 44 pages, 15 figures
Physics-informed Neural Networks for Heterogeneous Poroelastic Media
This study presents a novel physics-informed neural network (PINN) framework for modeling poroelasticity in heterogeneous media with material interfaces. The approach introduces a composite neural network (CoNN) where separate neural networks predict displacement and pressure variables for each material. While sharing identical activation functions, these networks are independently trained for all other parameters. To address challenges posed by heterogeneous material interfaces, the CoNN is integrated with the Interface-PINNs or I-PINNs framework (Sarma et al. 2024, https://dx.doi.org/10.1016/j.cma.2024.117135), allowing different activation functions across material interfaces. This ensures accurate approximation of discontinuous solution fields and gradients. Performance and accuracy of this combined architecture were evaluated against the conventional PINNs approach, a single neural network (SNN) architecture, and the eXtended PINNs (XPINNs) framework through two one-dimensional benchmark examples with discontinuous material properties. The results show that the proposed CoNN with I-PINNs architecture achieves an RMSE that is two orders of magnitude better than the conventional PINNs approach and is at least 40 times faster than the SNN framework. Compared to XPINNs, the proposed method achieves an RMSE at least one order of magnitude better and is 40% faster.
comment: 34 pages, 12 figures, 3 tables
A Two-sided Model for EV Market Dynamics and Policy Implications
The diffusion of Electric Vehicles (EVs) plays a pivotal role in mitigating greenhouse gas emissions, particularly in the U.S., where ambitious zero-emission and carbon neutrality objectives have been set. In pursuit of these goals, many states have implemented a range of incentive policies aimed at stimulating EV adoption and charging infrastructure development, especially public EV charging stations (EVCS). This study examines the indirect network effect observed between EV adoption and EVCS deployment within urban landscapes. We developed a two-sided log-log regression model with historical data on EV purchases and EVCS development to quantify this effect. To test the robustness, we then conducted a case study of the EV market in Los Angeles (LA) County, which suggests that a 1% increase in EVCS correlates with a 0.35% increase in EV sales. Additionally, we forecasted the future EV market dynamics in LA County, revealing a notable disparity between current policies and the targeted 80% EV market share for private cars by 2045. To bridge this gap, we proposed a combined policy recommendation that enhances EV incentives by 60% and EVCS rebates by 66%, facilitating the achievement of future EV market objectives.
comment: Conference preprint, 6 pages, 2 figures
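As a rough illustration of how the elasticity in the abstract above can be read, the sketch below evaluates a log-log relationship ln(sales) = a + b * ln(EVCS) with b = 0.35, the value reported for LA County. The function name and the single-regressor form are assumptions made for illustration; the paper's two-sided model presumably includes further terms that are not described in the abstract.

```python
def predicted_sales_change(evcs_change_pct: float, elasticity: float = 0.35) -> float:
    """Percent change in EV sales implied by a log-log model.

    With ln(sales) = a + elasticity * ln(evcs), multiplying evcs by (1 + p)
    multiplies sales by (1 + p) ** elasticity. The intercept a cancels out.
    """
    ratio = 1.0 + evcs_change_pct / 100.0
    if ratio <= 0.0:
        raise ValueError("EVCS count must stay positive")
    return (ratio ** elasticity - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical changes in the number of charging stations, in percent.
    for pct in (1.0, 10.0, 50.0):
        print(f"{pct:5.1f}% more EVCS -> {predicted_sales_change(pct):.2f}% more EV sales")
```

For small changes the result is close to elasticity times the percentage change, which is why a 1% increase in EVCS maps to roughly a 0.35% increase in sales; for larger changes the exact power-law form gives a somewhat smaller figure than the linear approximation.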
Emergent Cooperation for Energy-efficient Connectivity via Wireless Power Transfer
This paper addresses the challenge of incentivizing energy-constrained, non-cooperative user equipment (UE) to serve as cooperative relays. We consider a source UE with a non-line-of-sight channel to an access point (AP), where direct communication may be infeasible or may necessitate a substantial transmit power. Other UEs in the vicinity are viewed as relay candidates, and our aim is to enable energy-efficient connectivity for the source, while accounting for the self-interested behavior and private channel state information of these candidates, by allowing the source to "pay" the candidates via wireless power transfer (WPT). We propose a cooperation-inducing protocol, inspired by Myerson auction theory, which ensures that candidates truthfully report power requirements while minimizing the expected power used by the source. Through rigorous analysis, we establish the regularity of valuations for lognormal fading channels, which allows for the efficient determination of the optimal source transmit power. Extensive simulation experiments, employing real-world communication and WPT parameters, validate our theoretical framework. Our results demonstrate over 71% reduction in outage probability with as few as 4 relay candidates, compared to the non-cooperative scenario, and as much as 70% source power savings compared to a baseline approach, highlighting the efficacy of our proposed methodology.
Unmanned Vehicles in 6G Networks: A Unifying Treatment of Problems, Formulations, and Tools
Unmanned Vehicles (UVs) functioning as autonomous agents are anticipated to play a crucial role in the 6th Generation of wireless networks. Their seamless integration, cost-effectiveness, and the additional controllability through motion planning make them an attractive deployment option for a wide range of applications, both as assets in the network (e.g., mobile base stations) and as consumers of network services (e.g., autonomous delivery systems). However, despite their potential, the convergence of UVs and wireless systems brings forth numerous challenges that require attention from both academia and industry. This paper then aims to offer a comprehensive overview encompassing the transformative possibilities as well as the significant challenges associated with UV-assisted next-generation wireless communications. Considering the diverse landscape of possible application scenarios, problem formulations, and mathematical tools related to UV-assisted wireless systems, the underlying core theme of this paper is the unification of the problem space, providing a structured framework to understand the use cases, problem formulations, and necessary mathematical tools. Overall, the paper sets forth a clear understanding of how unmanned vehicles can be integrated in the 6G ecosystem, paving the way towards harnessing the full potential at this intersection.
Opinion dynamics on signed graphs and graphons: Beyond the piece-wise constant case (Extended version)
In this paper we make use of graphon theory to study opinion dynamics on large undirected networks. The opinion dynamics models that we take into consideration allow for negative interactions between the individuals, i.e. competing entities whose opinions can grow apart. We consider both the repelling model and the opposing model that are studied in the literature. We define the repelling and the opposing dynamics on graphons and we show that their initial value problem's solutions exist and are unique. We then show that the graphon dynamics well approximate the dynamics on large graphs that converge to a graphon. This result applies to large random graphs that are sampled according to a graphon. All these facts are illustrated in an extended numerical example.
comment: 8 double-column pages. This revised version corrects several typos. An abridged version is going to appear in the proceedings of the 2024 IEEE Conference on Decision and Control
Leveraging Blockchain and ANFIS for Optimal Supply Chain Management
The supply chain is a critical segment of the product manufacturing cycle, continuously influenced by risky, uncertain, and undesirable events. Optimizing flexibility in the supply chain presents a complex, multi-objective, and nonlinear programming challenge. In the poultry supply chain, the development of mass customization capabilities has led manufacturing companies to increasingly focus on offering tailored and customized services for individual products. To safeguard against data tampering and ensure the integrity of setup costs and overall profitability, a multi-signature decentralized finance (DeFi) protocol, integrated with the IoT on a blockchain platform, is proposed. Managing the poultry supply chain involves uncertainties that may not account for parameters such as delivery time to retailers, reorder time, and the number of requested products. To address these challenges, this study employs an adaptive neuro-fuzzy inference system (ANFIS), combining neural networks with fuzzy logic to compensate for the lack of data training in parameter identification. Through MATLAB simulations, the study investigates the average shop delivery duration, the reorder time, and the number of products per order. By implementing the proposed technique, the average delivery time decreases from 40 to 37 minutes, the reorder time decreases from five to four days, and the quantity of items requested per order grows from six to eleven. Additionally, the ANFIS model enhances overall supply chain performance by reducing transaction times by 15% compared to conventional systems, thereby improving real-time responsiveness and boosting transparency in supply chain operations, effectively resolving operational issues.
Autonomous Payload Thermal Control SP
In small satellites there is less room for heat control equipment, scientific instruments, and electronic components. Furthermore, the close proximity of electronic components makes power dissipation difficult, with the risk of not being able to control the temperature appropriately, reducing component lifetime and mission performance. To address this challenge, taking advantage of the increasing intelligence on board satellites, an autonomous thermal control tool that uses deep reinforcement learning is proposed for learning the thermal control policy onboard. The tool was evaluated on a real space edge processing computer that will be used in a demonstration payload hosted on the International Space Station (ISS). The experimental results show that the proposed framework is able to learn to control the payload processing power to keep the temperature within operational ranges, complementing traditional thermal control systems.
comment: To be included in the proceedings of ESA's SPAICE conference at ECSAT, UK, 2024
Measurement and Modeling on Terahertz Channels in Rain
The Terahertz (THz) frequency band offers a wide range of bandwidths, from tens to hundreds of gigahertz (GHz), and also supports data speeds of several terabits per second (Tbps). Because of this, maintaining THz channel reliability and efficiency in adverse weather conditions is crucial. Rain, in particular, disrupts THz channel propagation significantly, and there is still a lack of comprehensive investigations due to the experimental difficulties involved. This work explores how rain affects THz channel performance by conducting experiments in a rain emulation chamber and under actual rainy conditions outdoors. We focus on variables such as rain intensity, raindrop size distribution (RDSD), and the channel's gradient height. We observe that the gradient height (for an air-to-ground channel) can induce changes in the RDSD along the channel's path, impacting the precision of modeling efforts. To address this, we propose a theoretical model that integrates Mie scattering theory with considerations of the channel's gradient height. Both our experimental and theoretical findings confirm this model's effectiveness in predicting THz channel behavior in rainy conditions. This work underscores the necessity of incorporating the variation of the RDSD when a THz channel operates in scenarios involving ground-to-air or air-to-ground communications.
comment: submitted to Journal of Infrared, Millimeter and Terahertz Waves
Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations
This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, this paper focuses on developing an efficient online search strategy which jointly estimates channels, guides UAV positioning, and optimizes resource allocation. Analytically exploiting the geometric properties of the equipotential surface, this paper develops an LOS discovery trajectory on the equipotential surface while the closed-form search directions are determined using perturbation theory. Since the explicit expression of the equipotential surface is not available, this paper proposes to locally construct a channel model for each user in the LOS regime utilizing polynomial regression without depending on user locations or propagation distance. A class of spiral trajectories to simultaneously construct the LOS channels and search on the equipotential surface is developed. An optimal radius of the spiral and an optimal measurement pattern for channel gain estimation are derived to minimize the mean squared error (MSE) of the locally constructed channel. Numerical results on real 3D city maps demonstrate that the proposed scheme achieves over 94% of the performance of a 3D exhaustive search scheme with just a 3-kilometer search.
Robotics
Detection, Recognition and Pose Estimation of Tabletop Objects
The problem of cleaning a messy table using Deep Neural Networks is a very interesting problem in both social and industrial robotics. This project focuses on the social application of this technology. A neural network model that is capable of detecting and recognizing common tabletop objects, such as a mug, mouse, or stapler, is developed. The model also predicts the angle at which these objects are placed on a table, with respect to some reference. Assuming each object has a fixed intended position and orientation on the tabletop, the orientation of a particular object predicted by the deep learning model can be used to compute the transformation matrix to move the object from its initial position to the intended position. This can be fed to a pick-and-place robot to carry out the transfer. This paper describes the deep learning approaches used in this project for object detection and orientation estimation.
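The step from a predicted orientation to a transformation matrix, mentioned in the abstract above, can be sketched for the planar case as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: poses are taken to be (x, y, theta) on the table plane, and the helper names (pose_to_matrix, invert_rigid, correction_transform) are hypothetical.

```python
import math


def pose_to_matrix(x, y, theta):
    """3x3 homogeneous matrix of a planar pose (theta in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]


def invert_rigid(T):
    """Inverse of a rigid transform [R t; 0 1], which is [R^T, -R^T t; 0 1]."""
    r00, r01, tx = T[0]
    r10, r11, ty = T[1]
    return [
        [r00, r10, -(r00 * tx + r10 * ty)],
        [r01, r11, -(r01 * tx + r11 * ty)],
        [0.0, 0.0, 1.0],
    ]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def correction_transform(current, target):
    """Transform T with T @ T_current == T_target, i.e. the motion to apply."""
    return matmul(pose_to_matrix(*target), invert_rigid(pose_to_matrix(*current)))


if __name__ == "__main__":
    current = (0.20, 0.10, math.pi / 2)  # detected pose (hypothetical values)
    target = (0.50, 0.50, 0.0)           # intended pose (hypothetical values)
    T = correction_transform(current, target)
    print("rotation to apply [rad]:", math.atan2(T[1][0], T[0][0]))
```

Composing the target pose with the inverse of the detected pose keeps the rotation and translation consistent in a single matrix, which is the form a pick-and-place planner would typically consume.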
Kinematics & Dynamics Library for Baxter Arm
The Baxter robot is a standard research platform used widely in research tasks, supported with an SDK provided by the developers, Rethink Robotics. Despite the ubiquitous use of the robot, the official software support is sub-standard. In particular, the native IK service has a low success rate and is often inconsistent. This unreliable behavior makes Baxter difficult to use for experiments, and the research community is in need of more reliable software support to control the robot. We present our work towards creating a Python-based software library supporting the kinematics and dynamics of the Baxter robot. Our toolbox contains implementations of pose and velocity kinematics with support for Jacobian operations for redundancy resolution. We present the implementation and performance of our library, along with a comparison with PyKDL. Keywords: Baxter Research Robot, Manipulator Kinematics, Iterative IK, Dynamical Model, Redundant Manipulator
Vehicle-to-Everything (V2X) Communication: A Roadside Unit for Adaptive Intersection Control of Autonomous Electric Vehicles
Recent advances in autonomous vehicle technologies and cellular network speeds motivate developments in vehicle-to-everything (V2X) communications. Enhanced road safety features and improved fuel efficiency are some of the motivations behind V2X for future transportation systems. Adaptive intersection control systems have considerable potential to achieve these goals by minimizing idle times and predicting short-term future traffic conditions. Integrating V2X into traffic management systems introduces the infrastructure necessary to make roads safer for all users and initiates the shift towards more intelligent and connected cities. To demonstrate our solution, we implement both a simulated and a real-world representation of a 4-way intersection and crosswalk scenario with 2 self-driving electric vehicles, a roadside unit (RSU), and a traffic light. Our architecture minimizes fuel consumption through intersections by reducing acceleration and braking by up to 75.35%. We implement a cost-effective solution to intelligent and connected intersection control to serve as a proof-of-concept model suitable as the basis for continued research and development. Code for this project is available at https://github.com/MMachado05/REU-2024.
comment: Supported by the National Science Foundation under Grants No. 2150292 and 2150096
Automated Cinematography Motion Planning for UAVs
This project aimed to develop an automated cinematography platform using an unmanned aerial vehicle. Quadcopters are a great platform for shooting aerial scenes but are difficult to maneuver smoothly and can require expertise to pilot. We aim to design an algorithm to enable automated cinematography of a desired object of interest. Given the location of an object and other obstacles in the environment, the drone is able to plan its trajectory while simultaneously keeping the desired object in the video frame and avoiding obstacles. The high maneuverability of quadcopter platforms coupled with the desire for smooth movement and stability from camera platforms means a robust motion planning algorithm must be developed which can take advantage of the quadcopter's abilities while creating motion paths which satisfy the ultimate goal of capturing aerial video. This project aims to research, develop, simulate, and test such an algorithm.
Trustworthy Human-AI Collaboration: Reinforcement Learning with Human Feedback and Physics Knowledge for Safe Autonomous Driving
In the field of autonomous driving, developing safe and trustworthy autonomous driving policies remains a significant challenge. Recently, Reinforcement Learning with Human Feedback (RLHF) has attracted substantial attention due to its potential to enhance training safety and sampling efficiency. Nevertheless, existing RLHF-enabled methods often falter when faced with imperfect human demonstrations, potentially leading to training oscillations or even worse performance than rule-based approaches. Inspired by the human learning process, we propose Physics-enhanced Reinforcement Learning with Human Feedback (PE-RLHF). This novel framework synergistically integrates human feedback (e.g., human intervention and demonstration) and physics knowledge (e.g., traffic flow model) into the training loop of reinforcement learning. The key advantage of PE-RLHF is its guarantee that the learned policy will perform at least as well as the given physics-based policy, even when human feedback quality deteriorates, thus ensuring trustworthy safety improvements. PE-RLHF introduces a Physics-enhanced Human-AI (PE-HAI) collaborative paradigm for dynamic action selection between human and physics-based actions, employs a reward-free approach with a proxy value function to capture human preferences, and incorporates a minimal intervention mechanism to reduce the cognitive load on human mentors. Extensive experiments across diverse driving scenarios demonstrate that PE-RLHF significantly outperforms traditional methods, achieving state-of-the-art (SOTA) performance in safety, efficiency, and generalizability, even with varying quality of human feedback. The philosophy behind PE-RLHF not only advances autonomous driving technology but can also offer valuable insights for other safety-critical domains. Demo video and code are available at: https://zilin-huang.github.io/PE-RLHF-website/
comment: 33 pages, 20 figures
Dynamic Subgoal based Path Formation and Task Allocation: A NeuroFleets Approach to Scalable Swarm Robotics
This paper addresses the challenges of exploration and navigation in unknown environments from the perspective of evolutionary swarm robotics. A key focus is on path formation, which is essential for enabling cooperative swarm robots to navigate effectively. We designed the task allocation and path formation process based on a finite state machine, ensuring systematic decision-making and efficient state transitions. The approach is decentralized, allowing each robot to make decisions independently based on local information, which enhances scalability and robustness. We present a novel subgoal-based path formation method that establishes paths between locations by leveraging visually connected subgoals. Simulation experiments conducted in the Argos simulator show that this method successfully forms paths in the majority of trials. However, inter-collision (traffic) among numerous robots during path formation can negatively impact performance. To address this issue, we propose a task allocation strategy that uses local communication protocols and light signal-based communication to manage robot deployment. This strategy assesses the distance between points and determines the optimal number of robots needed for the path formation task, thereby reducing unnecessary exploration and traffic congestion. The performance of both the subgoal-based path formation method and the task allocation strategy is evaluated by comparing the path length, time, and resource usage against the A* algorithm. Simulation results demonstrate the effectiveness of our approach, highlighting its scalability, robustness, and fault tolerance.
comment: arXiv admin note: text overlap with arXiv:2312.16606
DSLO: Deep Sequence LiDAR Odometry Based on Inconsistent Spatio-temporal Propagation IROS 2024
This paper introduces a 3D point cloud sequence learning model based on inconsistent spatio-temporal propagation for LiDAR odometry, termed DSLO. It consists of a pyramid structure with a spatial information reuse strategy, a sequential pose initialization module, a gated hierarchical pose refinement module, and a temporal feature propagation module. First, spatial features are encoded using a point feature pyramid, with features reused in successive pose estimations to reduce computational overhead. Second, a sequential pose initialization method is introduced, leveraging the high-frequency sampling characteristic of LiDAR to initialize the LiDAR pose. Then, a gated hierarchical pose refinement mechanism refines poses from coarse to fine by selectively retaining or discarding motion information from different layers based on gate estimations. Finally, temporal feature propagation is proposed to incorporate the historical motion information from point cloud sequences, and address the spatial inconsistency issue when transmitting motion information embedded in point clouds between frames. Experimental results on the KITTI odometry dataset and Argoverse dataset demonstrate that DSLO outperforms state-of-the-art methods, achieving at least a 15.67% improvement on RTE and a 12.64% improvement on RRE, while also achieving a 34.69% reduction in runtime compared to baseline methods. Our implementation will be available at https://github.com/IRMVLab/DSLO.
comment: 6 pages, 5 figures, accepted by IROS 2024
Antagonist Inhibition Control in Redundant Tendon-driven Structures Based on Human Reciprocal Innervation for Wide Range Limb Motion of Musculoskeletal Humanoids
The body structure of an anatomically correct tendon-driven musculoskeletal humanoid is complex, and the difference between its geometric model and the actual robot is very large because expressing the complex routes of tendon wires in a geometric model is very difficult. If we move a tendon-driven musculoskeletal humanoid by the tendon wire lengths of the geometric model, unintended muscle tension and slack will emerge. In some cases, this can lead to the wreckage of the actual robot. To solve this problem, we focused on reciprocal innervation in the human nervous system, and then implemented antagonist inhibition control (AIC) based on the reflex. This control makes it possible to avoid unnecessary internal muscle tension and slack of tendon wires caused by model error, and to perform wide range motion safely for a long time. To verify its effectiveness, we applied AIC to the upper limb of the tendon-driven musculoskeletal humanoid, Kengoro, and succeeded in dangling for 14 minutes and doing pull-ups.
comment: Accepted at IEEE Robotics and Automation Letters
Automatic Grouping of Redundant Sensors and Actuators Using Functional and Spatial Connections: Application to Muscle Grouping for Musculoskeletal Humanoids
For a robot with redundant sensors and actuators distributed throughout its body, it is difficult to construct a controller or a neural network using all of them due to computational cost and complexity. Therefore, it is effective to extract functionally related sensors and actuators, group them, and construct a controller or a network for each of these groups. In this study, the functional and spatial connections among sensors and actuators are embedded into a graph structure and a method for automatic grouping is developed. Taking a musculoskeletal humanoid with a large number of redundant muscles as an example, this method automatically divides all the muscles into regions such as the forearm, upper arm, scapula, neck, etc., which has been done by humans based on a geometric model. The functional relationship among the muscles and the spatial relationship of the neural connections are calculated without a geometric model.
comment: Accepted at IEEE Robotics and Automation Letters
Learning to Singulate Objects in Packed Environments using a Dexterous Hand
Robotic object singulation, where a robot must isolate, grasp, and retrieve a target object in a cluttered environment, is a fundamental challenge in robotic manipulation. This task is difficult due to occlusions and how other objects act as obstacles for manipulation. A robot must also reason about the effect of object-object interactions as it tries to singulate the target. Prior work has explored object singulation in scenarios where there is enough free space to perform relatively long pushes to separate objects, in contrast to when space is tight and objects have little separation from each other. In this paper, we propose the Singulating Objects in Packed Environments (SOPE) framework. We propose a novel method that involves a displacement-based state representation and a multi-phase reinforcement learning procedure that enables singulation using the 16-DOF Allegro Hand. We demonstrate extensive experiments in Isaac Gym simulation, showing the ability of our system to singulate a target object in clutter. We directly transfer the policy trained in simulation to the real world. Over 250 physical robot manipulation trials, our method obtains success rates of 79.2%, outperforming alternative learning and non-learning methods.
Deep Probabilistic Traversability with Test-time Adaptation for Uncertainty-aware Planetary Rover Navigation
Traversability assessment of deformable terrain is vital for safe rover navigation on planetary surfaces. Machine learning (ML) is a powerful tool for traversability prediction but faces predictive uncertainty. This uncertainty leads to prediction errors, increasing the risk of wheel slips and immobilization for planetary rovers. To address this issue, we integrate principal approaches to uncertainty handling -- quantification, exploitation, and adaptation -- into a single learning and planning framework for rover navigation. The key concept is deep probabilistic traversability, forming the basis of an end-to-end probabilistic ML model that predicts slip distributions directly from rover traverse observations. This probabilistic model quantifies uncertainties in slip prediction and exploits them as traversability costs in path planning. Its end-to-end nature also allows adaptation of pre-trained models with in-situ traverse experience to reduce uncertainties. We perform extensive simulations in synthetic environments that pose representative uncertainties in planetary analog terrains. Experimental results show that our method achieves more robust path planning under novel environmental conditions than existing approaches.
comment: 8 pages, 4 figures. Submitted to IEEE Robotics and Automation Letters (RA-L)
Incorporating General Contact Surfaces in the Kinematics of Tendon-Driven Rolling-Contact Joint Mechanisms
This paper presents the first kinematic modeling of tendon-driven rolling-contact joint mechanisms with general contact surfaces subject to external loads. We derived the kinematics as a set of recursive equations and developed efficient iterative algorithms to solve for both tendon force actuation and tendon displacement actuation. The configuration predictions of the kinematics were experimentally validated using a prototype mechanism. Our MATLAB implementation of the proposed kinematics is available at https://github.com/hjhdog1/RollingJoint.
comment: 10 pages, 13 figures
Online Temporal Fusion for Vectorized Map Construction in Mapless Autonomous Driving
To reduce the reliance on high-definition (HD) maps, a growing trend in autonomous driving is leveraging on-board sensors to generate vectorized maps online. However, current methods are mostly constrained by processing only single-frame inputs, which hampers their robustness and effectiveness in complex scenarios. To overcome this problem, we propose an online map construction system that exploits the long-term temporal information to build a consistent vectorized map. First, the system efficiently fuses all historical road marking detections from an off-the-shelf network into a semantic voxel map, which is implemented using a hashing-based strategy to exploit the sparsity of road elements. Then reliable voxels are found by examining the fused information and incrementally clustered into an instance-level representation of road markings. Finally, the system incorporates domain knowledge to estimate the geometric and topological structures of roads, which can be directly consumed by the planning and control (PnC) module. Through experiments conducted in complicated urban environments, we have demonstrated that the output of our system is more consistent and accurate than the network output by a large margin and can be effectively used in a closed-loop autonomous driving system.
comment: 8 pages, 9 figures
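The hashing-based semantic voxel map described in the abstract above can be pictured with a small sketch: a hash map keyed by integer voxel indices stores per-voxel label counts, so memory is used only where road markings were actually detected. The class name, the voxel size, and the reliability thresholds (min_hits, min_ratio) are assumptions chosen for illustration; the paper's actual fusion and reliability criteria are not given in the abstract.

```python
import math
from collections import Counter, defaultdict


class SemanticVoxelMap:
    """Sparse voxel map: only voxels that received detections are stored."""

    def __init__(self, voxel_size=0.2):
        self.voxel_size = voxel_size
        # Hash map from integer voxel index to a histogram of observed labels.
        self.cells = defaultdict(Counter)

    def key(self, x, y, z):
        """Integer voxel index of a metric point."""
        s = self.voxel_size
        return (math.floor(x / s), math.floor(y / s), math.floor(z / s))

    def add(self, point, label):
        """Fuse one labelled detection point into the map."""
        self.cells[self.key(*point)][label] += 1

    def reliable(self, min_hits=3, min_ratio=0.7):
        """Voxels whose majority label is frequent and dominant enough."""
        out = {}
        for k, counts in self.cells.items():
            label, hits = counts.most_common(1)[0]
            if hits >= min_hits and hits / sum(counts.values()) >= min_ratio:
                out[k] = label
        return out


if __name__ == "__main__":
    vmap = SemanticVoxelMap()
    for _ in range(3):
        vmap.add((0.05, 0.05, 0.0), "lane")   # repeated, consistent detections
    vmap.add((0.05, 0.05, 0.0), "arrow")      # one conflicting detection
    vmap.add((1.05, 0.0, 0.0), "lane")        # seen only once
    print(vmap.reliable())
```

Accumulating detections over many frames before trusting a voxel is what lets temporal fusion suppress single-frame errors; the reliable voxels would then be clustered into instances in a later stage.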
Diffusion Policy Policy Optimization
We introduce Diffusion Policy Policy Optimization, DPPO, an algorithmic framework including best practices for fine-tuning diffusion-based policies (e.g. Diffusion Policy) in continuous control and robot learning tasks using the policy gradient (PG) method from reinforcement learning (RL). PG methods are ubiquitous in training RL policies with other policy parameterizations; nevertheless, they had been conjectured to be less efficient for diffusion-based policies. Surprisingly, we show that DPPO achieves the strongest overall performance and efficiency for fine-tuning in common benchmarks compared to other RL methods for diffusion-based policies and also compared to PG fine-tuning of other policy parameterizations. Through experimental investigation, we find that DPPO takes advantage of unique synergies between RL fine-tuning and the diffusion parameterization, leading to structured and on-manifold exploration, stable training, and strong policy robustness. We further demonstrate the strengths of DPPO in a range of realistic settings, including simulated robotic tasks with pixel observations, and via zero-shot deployment of simulation-trained policies on robot hardware in a long-horizon, multi-stage manipulation task. Website with code: diffusion-ppo.github.io
comment: Website: diffusion-ppo.github.io
The Persistent Robot Charging Problem for Long-Duration Autonomy
This paper introduces a novel formulation aimed at determining the optimal schedule for recharging a fleet of $n$ heterogeneous robots, with the primary objective of minimizing resource utilization. This study provides a foundational framework applicable to Multi-Robot Mission Planning, particularly in scenarios demanding Long-Duration Autonomy (LDA) or other contexts that necessitate periodic recharging of multiple robots. A novel Integer Linear Programming (ILP) model is proposed to calculate the optimal initial conditions (partial charge) for individual robots, leading to the minimal utilization of charging stations. This formulation was further generalized to maximize the servicing time for robots given adequate charging stations. The efficacy of the proposed formulation is evaluated through a comparative analysis, measuring its performance against the thrift price scheduling algorithm documented in the existing literature. The findings not only validate the effectiveness of the proposed approach but also underscore its potential as a valuable tool in optimizing resource allocation for a range of robotic and engineering applications.
Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery
Medical visual question answering (VQA) bridges the gap between visual information and clinical decision-making, enabling doctors to extract understanding from clinical images and videos. In particular, surgical VQA can enhance the interpretation of surgical data, aiding in accurate diagnoses, effective education, and clinical interventions. However, the inability of VQA models to visually indicate the regions of interest corresponding to the given questions results in incomplete comprehension of the surgical scene. To tackle this, we propose the surgical visual question localized-answering (VQLA) for precise and context-aware responses to specific queries regarding surgical images. Furthermore, to address the strong demand for safety in surgical scenarios and potential corruptions in image acquisition and transmission, we propose a novel approach called Calibrated Co-Attention Gated Vision-Language (C$^2$G-ViL) embedding to integrate and align multimodal information effectively. Additionally, we leverage the adversarial sample-based contrastive learning strategy to boost our performance and robustness. We also extend our EndoVis-18-VQLA and EndoVis-17-VQLA datasets to broaden the scope and application of our data. Extensive experiments on the aforementioned datasets demonstrate the remarkable performance and robustness of our solution. Our solution can effectively combat real-world image corruption. Thus, our proposed approach can serve as an effective tool for assisting surgical education, patient care, and enhancing surgical outcomes.
comment: Accepted by Information Fusion. Code and data availability: https://github.com/longbai1006/Surgical-VQLAPlus
CTS: Concurrent Teacher-Student Reinforcement Learning for Legged Locomotion
Thanks to recent explosive developments of data-driven learning methodologies, reinforcement learning (RL) has emerged as a promising solution to the legged locomotion problem in robotics. In this paper, we propose CTS, a novel Concurrent Teacher-Student reinforcement learning architecture for legged locomotion over uneven terrains. Different from the conventional teacher-student architecture, which trains the teacher policy via RL first and then transfers the knowledge to the student policy through supervised learning, our proposed architecture trains teacher and student policy networks concurrently under the reinforcement learning paradigm. To this end, we develop a new training scheme based on a modified proximal policy optimization (PPO) method that exploits data samples collected from the interactions between both the teacher and the student policies with the environment. The effectiveness of the proposed architecture and the new training scheme is demonstrated through substantial quantitative simulation comparisons with the state-of-the-art approaches and extensive indoor and outdoor experiments with quadrupedal and point-foot bipedal robot platforms, showcasing robust and agile locomotion capability. Quantitative simulation comparisons show that our approach reduces the average velocity tracking error by up to 20% compared to the two-stage teacher-student approach, demonstrating significant superiority in addressing blind locomotion tasks. Videos are available at https://clearlab-sustech.github.io/concurrentTS.
Bridging the Sim-to-Real Gap with Bayesian Inference
We present SIM-FSVGD for learning robot dynamics from data. As opposed to traditional methods, SIM-FSVGD leverages low-fidelity physical priors, e.g., in the form of simulators, to regularize the training of neural network models. While learning accurate dynamics already in the low data regime, SIM-FSVGD scales and excels also when more data is available. We empirically show that learning with implicit physical priors results in accurate mean model estimation as well as precise uncertainty quantification. We demonstrate the effectiveness of SIM-FSVGD in bridging the sim-to-real gap on a high-performance RC racecar system. Using model-based RL, we demonstrate a highly dynamic parking maneuver with drifting, using less than half the data compared to the state of the art.
NEDS-SLAM: A Neural Explicit Dense Semantic SLAM Framework using 3D Gaussian Splatting
We propose NEDS-SLAM, a dense semantic SLAM system based on a 3D Gaussian representation that enables robust 3D semantic mapping, accurate camera tracking, and high-quality rendering in real time. In the system, we propose a Spatially Consistent Feature Fusion model to reduce the effect of erroneous estimates from a pre-trained segmentation head on semantic reconstruction, achieving robust 3D semantic Gaussian mapping. Additionally, we employ a lightweight encoder-decoder to compress the high-dimensional semantic features into a compact 3D Gaussian representation, mitigating the burden of excessive memory consumption. Furthermore, we leverage the advantage of 3D Gaussian splatting, which enables efficient and differentiable novel view rendering, and propose a Virtual Camera View Pruning method to eliminate outlier Gaussians, thereby effectively enhancing the quality of scene representations. Our NEDS-SLAM method demonstrates competitive performance over existing dense semantic SLAM methods in terms of mapping and tracking accuracy on Replica and ScanNet datasets, while also showing excellent capabilities in 3D dense semantic mapping.
comment: accepted by RA-L, IEEE Robotics and Automation Letters
AirPilot: Interpretable PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are among the most popular and widely used control algorithms for drones and other control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced Proportional Integral Derivative (PID) drone controller using Proximal Policy Optimization (PPO). The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the simulator and implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error of the default PX4 PID position controller by 90%, improving the effective navigation speed of a fine-tuned PID controller by 21%, and reducing settling time and overshoot by 17% and 16%, respectively.
comment: 9 pages, 20 figures
Parallel Distributional Deep Reinforcement Learning for Mapless Navigation of Terrestrial Mobile Robots
This paper introduces novel deep reinforcement learning (Deep-RL) techniques using parallel distributional actor-critic networks for navigating terrestrial mobile robots. Our approaches use laser range findings, relative distance, and angle to the target to guide the robot. We trained agents in the Gazebo simulator and deployed them in real scenarios. Results show that parallel distributional Deep-RL algorithms enhance decision-making and outperform non-distributional and behavior-based approaches in navigation and spatial generalization.
comment: Paper accepted at the 24th International Conference on Control, Automation and Systems (ICCAS)
Multiagent Systems
Accelerating Hybrid Agent-Based Models and Fuzzy Cognitive Maps: How to Combine Agents who Think Alike?
While Agent-Based Models can create detailed artificial societies based on individual differences and local context, they can be computationally intensive. Modelers may offset these costs through a parsimonious use of the model, for example by using smaller population sizes (which limits analyses in sub-populations), running fewer what-if scenarios, or accepting more uncertainty by performing fewer simulations. Alternatively, researchers may accelerate simulations via hardware solutions (e.g., GPU parallelism) or approximation approaches that trade accuracy for compute time. In this paper, we present an approximation that combines agents who 'think alike', thus reducing the population size and the compute time. Our innovation relies on representing agent behaviors as networks of rules (Fuzzy Cognitive Maps) and empirically evaluating different measures of distance between these networks. Then, we form groups of think-alike agents via community detection and simplify them to a representative agent. Case studies show that our simplifications remain accurate.
comment: To appear at the 2024 Winter Simulation Conference
Dynamic Subgoal based Path Formation and Task Allocation: A NeuroFleets Approach to Scalable Swarm Robotics
This paper addresses the challenges of exploration and navigation in unknown environments from the perspective of evolutionary swarm robotics. A key focus is on path formation, which is essential for enabling cooperative swarm robots to navigate effectively. We designed the task allocation and path formation process based on a finite state machine, ensuring systematic decision-making and efficient state transitions. The approach is decentralized, allowing each robot to make decisions independently based on local information, which enhances scalability and robustness. We present a novel subgoal-based path formation method that establishes paths between locations by leveraging visually connected subgoals. Simulation experiments conducted in the Argos simulator show that this method successfully forms paths in the majority of trials. However, inter-collision (traffic) among numerous robots during path formation can negatively impact performance. To address this issue, we propose a task allocation strategy that uses local communication protocols and light signal-based communication to manage robot deployment. This strategy assesses the distance between points and determines the optimal number of robots needed for the path formation task, thereby reducing unnecessary exploration and traffic congestion. The performance of both the subgoal-based path formation method and the task allocation strategy is evaluated by comparing the path length, time, and resource usage against the A* algorithm. Simulation results demonstrate the effectiveness of our approach, highlighting its scalability, robustness, and fault tolerance.
comment: arXiv admin note: text overlap with arXiv:2312.16606
Simulation of Social Media-Driven Bubble Formation in Financial Markets using an Agent-Based Model with Hierarchical Influence Network
We propose that a tree-like hierarchical structure represents a simple and effective way to model the emergent behaviour of financial markets, especially markets where there exists a pronounced intersection between social media influences and investor behaviour. To explore this hypothesis, we introduce an agent-based model of financial markets, where trading agents are embedded in a hierarchical network of communities, and communities influence the strategies and opinions of traders. Empirical analysis of the model shows that its behaviour conforms to several stylized facts observed in real financial markets; and the model is able to realistically simulate the effects that social media-driven phenomena, such as echo chambers and pump-and-dump schemes, have on financial markets.
comment: 11 pages, 7 figures, To appear in Proceedings of 36th European Modeling and Simulation Symposium (EMSS), 21st International Multidisciplinary Modelling and Simulation Multiconference (I3M), Tenerife, Spain, Sep. 2024
A Learnable Agent Collaboration Network Framework for Personalized Multimodal AI Search Engine
Large language models (LLMs) and retrieval-augmented generation (RAG) techniques have revolutionized traditional information access, enabling AI agents to search and summarize information on behalf of users during dynamic dialogues. Despite their potential, current AI search engines exhibit considerable room for improvement in several critical areas. These areas include the support for multimodal information, the delivery of personalized responses, the capability to logically answer complex questions, and the facilitation of more flexible interactions. This paper proposes a novel AI Search Engine framework called the Agent Collaboration Network (ACN). The ACN framework consists of multiple specialized agents working collaboratively, each with distinct roles such as Account Manager, Solution Strategist, Information Manager, and Content Creator. This framework integrates mechanisms for picture content understanding, user profile tracking, and online evolution, enhancing the AI search engine's response quality, personalization, and interactivity. A highlight of the ACN is the introduction of a Reflective Forward Optimization method (RFO), which supports the online synergistic adjustment among agents. This feature endows the ACN with online learning capabilities, ensuring that the system has strong interactive flexibility and can promptly adapt to user feedback. This learning method may also serve as an optimization approach for agent-based systems, potentially influencing other domains of agent applications.
comment: ACMMM 2024 MMGR WORKSHOP
Editing Personality for Large Language Models
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs). This task seeks to adjust the models' responses to opinion-related questions on specified topics since an individual's personality often manifests in the form of their expressed opinions, thereby showcasing different personality traits. Specifically, we construct PersonalityEdit, a new benchmark dataset to address this task. Drawing on the theory in Social Psychology, we isolate three representative traits, namely Neuroticism, Extraversion, and Agreeableness, as the foundation for our benchmark. We then gather data using GPT-4, generating responses that align with a specified topic and embody the targeted personality trait. We conduct comprehensive experiments involving various baselines and discuss the representation of personality behavior in LLMs. Our findings uncover potential challenges of the proposed task, illustrating several remaining issues. We anticipate that our work can stimulate further annotation in model editing and personality-related research. Code is available at https://github.com/zjunlp/EasyEdit.
comment: NLPCC 2024
Systems and Control (CS)
Generalized Multi-hop Traffic Pressure for Heterogeneous Traffic Perimeter Control
Perimeter control prevents loss of traffic network capacity due to congestion in urban areas. Homogeneous perimeter control allows all access points to a protected region to have the same maximal permitted inflow. However, homogeneous perimeter control performs poorly when the congestion in the protected region is heterogeneous (e.g., imbalanced demand) since the homogeneous perimeter control does not consider location-specific traffic conditions around the perimeter. When the protected region has spatially heterogeneous congestion, it can often make sense to modulate the perimeter inflow rate to be higher near low-density regions and vice versa for high-density regions. To assist with this modulation, we can leverage the concept of 1-hop traffic pressure to measure intersection-level traffic congestion. However, as we show, 1-hop pressure turns out to be too spatially myopic for perimeter control and hence we formulate multi-hop generalizations of pressure that look ``deeper'' inside the perimeter beyond the entry intersection. In addition, we formulate a simple heterogeneous perimeter control methodology that can leverage this novel multi-hop pressure to redistribute the total permitted inflow provided by the homogeneous perimeter controller. Experimental results show that our heterogeneous perimeter control policies leveraging multi-hop pressure significantly outperform homogeneous perimeter control in scenarios where the origin-destination flows are highly imbalanced with high spatial heterogeneity.
comment: 21 pages main body, 12 figures, journal paper
CSAC Drift Modeling Considering GPS Signal Quality in the Case of GPS Signal Unavailability
The Global Positioning System (GPS), one of the Global Navigation Satellite Systems (GNSS), provides accurate position, navigation and time (PNT) information to various applications. One application receiving considerable attention is satellite vehicles, especially Low Earth Orbit (LEO) satellites. Because they have limited ways to obtain PNT information and their onboard clocks perform poorly, the GPS system time (GPST) provided by GPS is a good reference clock to synchronize to. However, GPS is well known for its vulnerability to intentional or unintentional interference. This study aims to maintain the onboard clock with less error relative to the GPST even when the GPS signal is disrupted. In this study, we analyzed two major factors that affect the quality of the GPS measurements: the number of visible satellites and the geometry of the satellites. Then, we proposed a weighted model for a Chip-Scale Atomic Clock (CSAC) that mitigates the clock error relative to the GPST while considering the two factors. Based on this model, a stand-alone CSAC could keep its error below 4 microseconds, even in a situation where no GPS signals are received for 12 hours.
comment: 6 pages
Learning and Control from Similarity Between Heterogeneous Systems: A Behavioral Approach
This paper proposes basic definitions of similarity and similarity indexes between heterogeneous linear systems and presents a similarity-based learning control strategy. By exploring geometric properties of admissible behaviors of linear systems, the similarity indexes between two admissible behaviors of heterogeneous systems are defined as the principal angles between their subspace components, and an efficient strategy for calculating the similarity indexes is developed. By leveraging the similarity indexes, a similarity-based learning control strategy is proposed via projection techniques. With the application of the similarity-based learning control strategy, the host system can efficiently accomplish the same tasks by leveraging the successful experience of the guest system, without the need to repeat the trial-and-error process experienced by the guest system.
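Principal angles between two subspaces have a standard numerical recipe: orthonormalize a basis of each subspace, then take the arccosine of the singular values of the product of the two orthonormal bases. The sketch below illustrates only that generic computation with NumPy; the function name `principal_angles` and the example matrices are illustrative assumptions, not the paper's behavior subspaces or its calculation strategy.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between span(A) and span(B).

    A and B are assumed to have full column rank; their columns are
    basis vectors of the two subspaces.
    """
    Qa, _ = np.linalg.qr(A)  # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis of span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Hypothetical example: two planes in R^3 that share the x-axis and are
# tilted by 45 degrees about it.
theta = np.pi / 4
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, np.cos(theta)], [0.0, np.sin(theta)]])
angles = principal_angles(A, B)
```

In this example the first angle is close to zero (the shared direction) and the second is close to the 45 degree tilt, so small angles indicate subspaces that are nearly aligned.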
AirPilot: Interpretable PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are among the most popular and widely used control algorithms for drones and other control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced Proportional Integral Derivative (PID) drone controller using Proximal Policy Optimization (PPO). The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the simulator and implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error of the default PX4 PID position controller by 90%, improving the effective navigation speed of a fine-tuned PID controller by 21%, and reducing settling time and overshoot by 17% and 16%, respectively.
comment: 9 pages, 20 figures
Systems and Control (EESS)
Generalized Multi-hop Traffic Pressure for Heterogeneous Traffic Perimeter Control
Perimeter control prevents loss of traffic network capacity due to congestion in urban areas. Homogeneous perimeter control allows all access points to a protected region to have the same maximal permitted inflow. However, homogeneous perimeter control performs poorly when the congestion in the protected region is heterogeneous (e.g., imbalanced demand) since the homogeneous perimeter control does not consider location-specific traffic conditions around the perimeter. When the protected region has spatially heterogeneous congestion, it can often make sense to modulate the perimeter inflow rate to be higher near low-density regions and vice versa for high-density regions. To assist with this modulation, we can leverage the concept of 1-hop traffic pressure to measure intersection-level traffic congestion. However, as we show, 1-hop pressure turns out to be too spatially myopic for perimeter control and hence we formulate multi-hop generalizations of pressure that look ``deeper'' inside the perimeter beyond the entry intersection. In addition, we formulate a simple heterogeneous perimeter control methodology that can leverage this novel multi-hop pressure to redistribute the total permitted inflow provided by the homogeneous perimeter controller. Experimental results show that our heterogeneous perimeter control policies leveraging multi-hop pressure significantly outperform homogeneous perimeter control in scenarios where the origin-destination flows are highly imbalanced with high spatial heterogeneity.
comment: 21 pages main body, 12 figures, journal paper
CSAC Drift Modeling Considering GPS Signal Quality in the Case of GPS Signal Unavailability
The Global Positioning System (GPS), one of the Global Navigation Satellite Systems (GNSS), provides accurate position, navigation and time (PNT) information to various applications. One application receiving considerable attention is satellite vehicles, especially Low Earth Orbit (LEO) satellites. Because they have limited ways to obtain PNT information and their onboard clocks perform poorly, the GPS system time (GPST) provided by GPS is a good reference clock to synchronize to. However, GPS is well known for its vulnerability to intentional or unintentional interference. This study aims to maintain the onboard clock with less error relative to the GPST even when the GPS signal is disrupted. In this study, we analyzed two major factors that affect the quality of the GPS measurements: the number of visible satellites and the geometry of the satellites. Then, we proposed a weighted model for a Chip-Scale Atomic Clock (CSAC) that mitigates the clock error relative to the GPST while considering the two factors. Based on this model, a stand-alone CSAC could keep its error below 4 microseconds, even in a situation where no GPS signals are received for 12 hours.
comment: 6 pages
Learning and Control from Similarity Between Heterogeneous Systems: A Behavioral Approach
This paper proposes basic definitions of similarity and similarity indexes between heterogeneous linear systems and presents a similarity-based learning control strategy. By exploring geometric properties of admissible behaviors of linear systems, the similarity indexes between two admissible behaviors of heterogeneous systems are defined as the principal angles between their subspace components, and an efficient strategy for calculating the similarity indexes is developed. By leveraging the similarity indexes, a similarity-based learning control strategy is proposed via projection techniques. With the application of the similarity-based learning control strategy, the host system can efficiently accomplish the same tasks by leveraging the successful experience of the guest system, without the need to repeat the trial-and-error process experienced by the guest system.
AirPilot: Interpretable PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are among the most popular and widely used control algorithms for drones and other control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced Proportional Integral Derivative (PID) drone controller using Proximal Policy Optimization (PPO). The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the simulator and implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error of the default PX4 PID position controller by 90%, improving the effective navigation speed of a fine-tuned PID controller by 21%, and reducing settling time and overshoot by 17% and 16%, respectively.
comment: 9 pages, 20 figures
Multiagent Systems
Identification of LFT Structured Descriptor Systems with Slow and Non-uniform Sampling
Time domain identification is studied in this paper for parameters of a continuous-time multi-input multi-output descriptor system, with these parameters affecting system matrices through a linear fractional transformation. Sampling is permitted to be slow and non-uniform, and the Nyquist frequency restrictions need not be satisfied. This model can be used to describe the behaviors of a networked dynamic system, and the obtained results can be straightforwardly applied to an ordinary state-space model, as well as a lumped system. Explicit formulas are obtained for both the transient and the steady-state responses of the system stimulated by an arbitrary signal. Some relations have been derived between the system steady-state response and its transfer function matrix (TFM), which reveal that the value of a TFM at almost any point of interest, as well as its derivatives and a right tangential interpolation along an arbitrary direction, can in principle be estimated from input-output experimental data. Based on these relations, estimation algorithms are suggested for the parameters of the descriptor system and for the values of its TFM. Their properties, such as asymptotic unbiasedness and consistency, are analyzed.
comment: 15 pages
Systems and Control (CS)
Sparse Mamba: Reinforcing Controllability In Structural State Space Models
In this article, we introduce the concepts of controllability and observability to the Mamba architecture in our Sparse-Mamba (S-Mamba) for natural language processing (NLP) applications. Structured state space models (SSMs) developed in recent studies, such as Mamba and Mamba2, have outperformed transformers and large language models (LLMs) on small to medium NLP tasks and resolved their computational inefficiency on longer sequences. The Mamba SSM architecture drops the need for the attention layers or MLP blocks used in transformers. However, the current Mamba models do not reinforce controllability of the state space equations in the calculation of the A, B, C, and D matrices at each time step, which increases the complexity and the computational cost. In this article, we show that the number of parameters can be significantly decreased by reinforcing controllability in the state space equations in the proposed Sparse-Mamba (S-Mamba), while maintaining the performance. The controllable n x n state matrix A is sparse and has only n free parameters. Our novel approach ensures a controllable system and could pave the way for Mamba 3.
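One classical structure that matches the description "controllable, sparse, n free parameters" is the controllable canonical (companion) form from linear systems theory. The sketch below assumes that structure purely for illustration; the abstract does not state that S-Mamba uses exactly this parameterization, and the helper names and coefficient values are hypothetical.

```python
import numpy as np

def companion_pair(a):
    """Build (A, B) in controllable canonical form from n coefficients a.

    A has ones on the superdiagonal and -a in its last row, so it holds
    only n free parameters; B is the last unit vector.
    """
    a = np.asarray(a, dtype=float)
    n = a.size
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)  # fixed shift structure
    A[-1, :] = -a               # the n free parameters
    B = np.zeros((n, 1))
    B[-1, 0] = 1.0
    return A, B

def controllability_matrix(A, B):
    """Stack [B, AB, A^2 B, ..., A^(n-1) B] column-wise."""
    cols = [B]
    for _ in range(A.shape[0] - 1):
        cols.append(A @ cols[-1])
    return np.hstack(cols)

# Hypothetical coefficients for a 4-state example.
A, B = companion_pair([0.5, -0.2, 0.1, 0.3])
C = controllability_matrix(A, B)
rank = np.linalg.matrix_rank(C)
```

For this form the controllability matrix has full rank for any choice of coefficients, which is why controllability holds by construction rather than having to be learned.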
Review of meta-heuristic optimization algorithms to tune the PID controller parameters for automatic voltage regulator
A Proportional-Integral-Derivative (PID) controller is required to bring a system back to the stable operating region as soon as possible following a disturbance or discrepancy. For successful operation of the PID controller, the controller parameters must be designed so that tuning has low optimization complexity, needs little memory, converges quickly, and can operate dynamically. Recent investigations have proposed many different meta-heuristic algorithms for efficient tuning of PID controllers under various system conditions. However, researchers have seldom compared their custom-made objective functions with previous investigations while proposing new algorithmic methods. This paper presents a detailed study of the research progress, deficiencies, accomplishments and future scope of recently proposed heuristic algorithms for designing and tuning the PID controller parameters for an automatic voltage regulator (AVR) system. Objective functions, including ITSE, ITAE, IAE, ISE, and ZLG, are considered to obtain a measurable outcome for the algorithms. Considering a slight variation in the system gain parameters of the algorithms, the observed PID gains with ITSE range over 0.81918 - 1.9499 for K_p, 0.24366 - 1.4608 for K_i, and 0.31840 - 0.9683 for K_d, whereas with ITAE the values are 0.24420 - 1.2771, 0.14230 - 0.8471, and 0.04270 - 0.4775, respectively. The time domain and frequency domain characteristics also change significantly with each objective function. Our outlined comparison will be a guideline for investigating any newer algorithms in the field.
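The error-integral criteria named in this abstract have standard textbook definitions: IAE integrates |e(t)|, ISE integrates e(t)^2, ITAE integrates t|e(t)|, and ITSE integrates t e(t)^2. As a rough sketch, they could be evaluated from a sampled error signal with the trapezoidal rule as shown below; the function name and the test signal are illustrative assumptions, and ZLG is omitted because the abstract does not define it.

```python
import math

def error_integrals(t, e):
    """Return IAE, ISE, ITAE and ITSE for error samples e[k] at times t[k]."""
    def trapz(y):
        # Trapezoidal rule on a possibly non-uniform time grid.
        return sum(0.5 * (y[k] + y[k + 1]) * (t[k + 1] - t[k])
                   for k in range(len(t) - 1))
    return {
        "IAE": trapz([abs(x) for x in e]),
        "ISE": trapz([x * x for x in e]),
        "ITAE": trapz([tk * abs(x) for tk, x in zip(t, e)]),
        "ITSE": trapz([tk * x * x for tk, x in zip(t, e)]),
    }

# Hypothetical error signal: e(t) = exp(-t) sampled on [0, 10] s.
t = [0.001 * k for k in range(10001)]
e = [math.exp(-tk) for tk in t]
metrics = error_integrals(t, e)
```

The time-weighted criteria (ITAE, ITSE) penalize errors that persist late in the response, which is one reason different objective functions lead a tuner to different PID gains.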
Formal Verification and Control with Conformal Prediction
In this survey, we design formal verification and control algorithms for autonomous systems with practical safety guarantees using conformal prediction (CP), a statistical tool for uncertainty quantification. We focus on learning-enabled autonomous systems (LEASs) in which the complexity of learning-enabled components (LECs) is a major bottleneck that hampers the use of existing model-based verification and design techniques. Instead, we advocate for the use of CP, and we will demonstrate its use in formal verification, systems and control theory, and robotics. We argue that CP is specifically useful due to its simplicity (easy to understand, use, and modify), generality (requires no assumptions on learned models and data distributions, i.e., is distribution-free), and efficiency (real-time capable and accurate). We pursue the following goals with this survey. First, we provide an accessible introduction to CP for non-experts who are interested in using CP to solve problems in autonomy. Second, we show how to use CP for the verification of LECs, e.g., for verifying input-output properties of neural networks. Third and fourth, we review recent articles that use CP for safe control design as well as offline and online verification of LEASs. We summarize their ideas in a unifying framework that can deal with the complexity of LEASs in a computationally efficient manner. In our exposition, we consider simple system specifications, e.g., robot navigation tasks, as well as complex specifications formulated in temporal logic formalisms. Throughout our survey, we compare to other statistical techniques (e.g., scenario optimization, PAC-Bayes theory, etc.) and how these techniques have been used in verification and control. Lastly, we point the reader to open problems and future research directions.
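For readers new to CP, the core of the split (inductive) variant is a single quantile computation: given n calibration nonconformity scores and a miscoverage level alpha, take the ceil((n + 1)(1 - alpha))-th smallest score as the threshold. Under exchangeability, a fresh score falls below that threshold with probability at least 1 - alpha. The sketch below illustrates this generic recipe only; the function name, the residual values, and the point prediction are made-up assumptions, not taken from the survey.

```python
import math

def conformal_threshold(scores, alpha):
    """Split-conformal threshold for calibration scores at miscoverage alpha."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: only a trivial bound holds.
        return math.inf
    return sorted(scores)[rank - 1]

# Hypothetical calibration residuals |y - y_hat| from a learned predictor.
residuals = [0.3, 1.2, 0.5, 0.9, 0.1, 0.7, 2.0]
q = conformal_threshold(residuals, alpha=0.25)

# Prediction interval around a hypothetical new point prediction.
y_hat_new = 5.0
interval = (y_hat_new - q, y_hat_new + q)
```

The guarantee requires no assumption on the predictor or the data distribution beyond exchangeability of calibration and test points, which is the distribution-free property the abstract refers to.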
Leaky Wave Antenna-Equipped RF Chipless Tags for Orientation Estimation
Accurate orientation estimation of an object in a scene is critical in robotics, aerospace, augmented reality, and medicine, as it supports scene understanding. This paper introduces a novel orientation estimation approach leveraging radio frequency (RF) sensing technology and leaky-wave antennas (LWAs). Specifically, we propose a framework for a radar system to estimate the orientation of a \textit{dumb} LWA-equipped backscattering tag, marking the first exploration of this method in the literature. Our contributions include a comprehensive framework for signal modeling and orientation estimation with multi-subcarrier transmissions, and the formulation of a maximum likelihood estimator (MLE). Moreover, we analyze the impact of imperfect tag location information, revealing that it minimally affects estimation accuracy. Exploiting related results, we propose an approximate MLE and introduce a low-complexity radiation-pointing angle-based estimator with near-optimal performance. We derive the feasible orientation estimation region of the latter and show that it depends mainly on the system bandwidth. Our analytical results are validated through Monte Carlo simulations and reveal that the low-complexity estimator achieves near-optimal accuracy and that its feasible orientation estimation region is also approximately shared by the other estimators. Finally, we show that the optimal number of subcarriers increases with sensing time under a power budget constraint.
comment: 14 pages, 2 tables, 8 figs. Submitted to IEEE TWC
Evaluation of Prosumer Networks for Peak Load Management in Iran: A Distributed Contextual Stochastic Optimization Approach
Renewable prosumers face the complex challenge of balancing self-sufficiency with seamless grid and market integration. This paper introduces a novel prosumer network framework aimed at mitigating peak loads in Iran, particularly under the uncertainties inherent in renewable energy generation and demand. A cost-oriented integrated prediction and optimization approach is proposed, empowering prosumers to make informed decisions within a distributed contextual stochastic optimization (DCSO) framework. The problem is formulated as a bi-level two-stage multi-time scale optimization to determine optimal operation and interaction strategies under various scenarios, considering flexible resources. To facilitate grid integration, a novel consensus-based contextual information sharing mechanism is proposed. This approach enables coordinated collective behaviors and leverages contextual data more effectively. The overall problem is recast as a mixed-integer linear program (MILP) by incorporating optimality conditions and linearizing complementarity constraints. Additionally, a distributed algorithm using the consensus alternating direction method of multipliers (ADMM) is presented for computational tractability and privacy preservation. Numerical results highlight that integrating prediction with optimization and implementing a contextual information-sharing network among prosumers significantly reduces peak loads as well as total costs.
comment: 10 pages, 26 figures, journal paper
Advancing Machine Learning in Industry 4.0: Benchmark Framework for Rare-event Prediction in Chemical Processes
Previously, using forward-flux sampling (FFS) and machine learning (ML), we developed multivariate alarm systems to counter rare un-postulated abnormal events. Our alarm systems utilized ML-based predictive models to quantify committer probabilities as functions of key process variables (e.g., temperature, concentrations, and the like), with these data obtained in FFS simulations. Herein, we introduce a novel and comprehensive benchmark framework for rare-event prediction, comparing ML algorithms of varying complexity, including Linear Support-Vector Regressor and k-Nearest Neighbors, to more sophisticated algorithms, such as Random Forests, XGBoost, LightGBM, CatBoost, Dense Neural Networks, and TabNet. This evaluation uses comprehensive performance metrics, such as: $\textit{RMSE}$, model training, testing, hyperparameter tuning and deployment times, and number and efficiency of alarms. These balance model accuracy, computational efficiency, and alarm-system efficiency, identifying optimal ML strategies for predicting abnormal rare events, enabling operators to obtain safer and more reliable plant operations.
comment: This is a preprint for our manuscript to be submitted for publication in Computers and Chemical Engineering Journal. Pages: 22 (including Appendix and References). Figures: 9 (main) + 3 (Appendix). Tables: 3 (main) + 3 (Appendix)
Enhancing Bistable Vibration Energy Harvesters with Tunable Circuits: A Comparative Analysis
In this article, we present an analysis of the effects of nonlinear circuits on bistable vibration energy harvesters. We begin by introducing an analytical model for bistable vibration energy harvesters and demonstrate that the impact of nonlinear circuits can be characterized by two factors: electrically-induced damping and electrically-induced stiffness. Subsequently, we investigate how these factors influence the power-frequency response of the harvester. Our findings reveal that electrically-induced damping significantly affects the bistable vibration energy harvester dynamics, whereas electrically-induced stiffness has minimal impact, which is a notable difference from the behavior of linear harvesters connected to nonlinear circuits. Thereafter, we conduct a comparative study of bistable energy harvesters connected to different nonlinear circuits already well documented in the literature. Our analysis demonstrates that, in most cases, the parallel synchronized switch harvesting on inductor circuit yields superior performance due to its ability to maximize electrically-induced damping. These comparative assessments and conclusions are evaluated within the framework of our proposed models and are contrasted with results obtained from linear vibration energy harvesters. The derived comparison maps presented at the end of this paper offer quantitative justification for selecting the optimal circuit for bistable energy harvesters, while deepening our comprehension of the intricate dynamics associated with nonlinear harvesters coupled with nonlinear circuits.
comment: 15 pages, 13 figures, 2 Appendices
Flexible Ramping Product Procurement in Day-Ahead Markets
Flexible ramping products (FRPs) emerge as a promising instrument for addressing steep and uncertain ramping needs through market mechanisms. Initial implementations of FRPs in North American electricity markets, however, revealed several shortcomings in existing FRP designs. In many instances, FRP prices failed to signal the true value of ramping capacity, most notably evident in zero FRP prices observed in a myriad of periods during which the system was in acute need of rampable capacity. These periods were marked by scheduled but undeliverable FRPs, often calling for operator out-of-market actions. On top of that, the methods used for procuring FRPs have been primarily rule-based, lacking explicit economic underpinnings. In this paper, we put forth an alternative framework for FRP procurement, which seeks to set FRP requirements and schedule FRP awards such that the expected system operation cost is minimized. Using real-world data from U.S. ISOs, we showcase the relative merits of the framework in (i) reducing the total system operation cost, (ii) improving price formation, (iii) enhancing the deliverability of FRP awards, and (iv) reducing the need for out-of-market actions.
Optimization-Based Control of Distributed Battery Storage in Distribution Networks
We propose a combined global-local control approach to regulate voltage and minimize power losses in distribution networks with high integration of distributed energy resources (DERs). Local controllers embed the fast-acting proportional volt-var-watt control law and have their gain (slope) coefficients updated regularly by a global optimization problem at a slower time-scale. The design of optimal coefficients preserves overall system stability and encapsulates the inverter and energy limits of controllable DERs. The proposed approach is formulated based on a linear network model (LinDistFlow) and suitable approximations to produce a convex multi-period optimization formulation. Numerical simulations with real-world customer data and two different distribution feeders revealed that our approach provides substantial voltage regulation, while reducing losses by 11 per cent and peak substation power by 26 per cent compared to other state-of-the-art algorithms.
Lyapunov Neural ODE Feedback Control Policies
Deep neural networks are increasingly used as an effective way to represent control policies in a wide range of learning-based control methods. For continuous-time optimal control problems (OCPs), which are central to many decision-making tasks, control policy learning can be cast as a neural ordinary differential equation (NODE) problem wherein state and control constraints are naturally accommodated. This paper presents a Lyapunov-NODE control (L-NODEC) approach to solving continuous-time OCPs for the case of stabilizing a known constrained nonlinear system around a terminal equilibrium point. We propose a Lyapunov loss formulation that incorporates a control-theoretic Lyapunov condition into the problem of learning a state-feedback neural control policy. We establish that L-NODEC ensures exponential stability of the controlled system, as well as its adversarial robustness to uncertain initial conditions. The performance of L-NODEC is illustrated on a benchmark double integrator problem and for optimal control of thermal dose delivery using a cold atmospheric plasma biomedical system. L-NODEC can substantially reduce the inference time necessary to reach the equilibrium state.
Facilitating AI and System Operator Synergy: Active Learning-Enhanced Digital Twin Architecture for Day-Ahead Load Forecasting
In this paper, we introduce a synergistic approach between artificial intelligence and system operators through an innovative digital twin architecture, integrated with an active learning framework, to enhance short-term load forecasting. Central to this architecture is the incorporation of sophisticated data pipelines, facilitating the real-time ingestion, processing and analysis of grid-related data. Utilizing a recurrent neural network architecture, our model generates day-ahead load forecasts together with prediction confidence intervals, strengthening system operator trust in the model's predictive reliability and enhancing their ability to respond to evolving grid conditions effectively. The active learning framework iteratively refines the predictions by incorporating real-time feedback based on forecast uncertainty, utilizing newly available data to continuously enhance forecasting accuracy and confidence. This AI-assisted strategy is exemplified in a case study of the Greek transmission system. It demonstrates the potential to transform short-term load forecasting, thereby increasing the reliability and operational efficiency of modern power grids. This approach marks a significant step forward in the digitalization and intelligent management of power systems.
Distributionally Robust Joint Chance-Constrained Optimization for Electricity Imbalance in Iran: Integrating Renewables and Storage
Iran's power grid faces mounting challenges due to the widening gap between rapidly increasing peak demand and lagging sustainable capacity expansion or load management. Prosumers have become key players in reducing grid load and offering valuable flexible services, but their effectiveness is hampered by a lack of knowledge about uncertain parameters and their probability distributions. This study introduces a novel two-stage multi-time scale distributionally robust optimization framework integrated with joint chance constraints to effectively manage the operation of prosumers and their energy sharing to mitigate overall peak load imbalances under uncertainties. In a data-driven setting and by leveraging historical data, the proposed model is reformulated as a tractable second-order conic constrained quadratic programming (SOCP) problem. By considering real-world complexities based on realistic data, such as diverse load profiles and intermittent renewable generation, our approach demonstrates enhanced energy management system performance, even in out-of-sample scenarios. The synergy of distributed energy resources and coordinated flexibility within the network is instrumental in achieving substantial reductions in peak load and improving grid resilience.
comment: 9 pages; 11 figures, journal paper
The Dilemma of Electricity Grid Expansion Planning in Areas at the Risk of Wildfire
Utilities consider public safety power shut-offs imperative for the mitigation of wildfire risk. This paper presents expansion planning of a power system under fire hazard weather conditions. The fire ignition risk of each power line is quantified. A 10-year expansion planning scenario is discussed to supply power to customers by considering three decision variables: distributed solar generation, modification of existing power lines, and addition of new lines. A two-stage robust optimization problem is formulated and solved using the Column-and-Constraint Generation Algorithm to find an improved balance among de-energization of customers, distributed solar generation, modification of power lines, and addition of new lines. It involves de-energizing lines in high wildfire risk regions and serving the customers by integrating distributed solar generation. The impact of de-energization of lines on distributed solar generation is assessed. The number of hours each line is energized and the total load shedding during a 10-year period are evaluated. Different uncertainty levels for system demand and solar energy integration are considered to find the impact on the total operation cost of the system. The effectiveness of the presented algorithm is evaluated on 6- and 118-bus systems.
Scalable analysis of stop-and-go waves
Analyzing stop-and-go waves at the scale of miles and hours of data is an emerging challenge in traffic research. In the past, datasets were of limited scale and could be easily analyzed by hand or with rudimentary methods to identify a very limited set of traffic waves present within the data. This paper introduces an automatic and scalable stop-and-go wave identification method capable of capturing wave generation, propagation, dissipation, as well as bifurcation and merging, which have previously been observed only very rarely. Using a concise and simple critical-speed-based definition of a stop-and-go wave, the proposed method identifies all wave boundaries that encompass spatio-temporal points where vehicle speed is below a chosen critical speed. The method is built upon a graph-based representation of the spatio-temporal points associated with stop-and-go waves, specifically wave front (start) points and wave tail (end) points, and approaches the solution as a graph component identification problem. The method is implemented in Python and demonstrated on a large-scale dataset, I-24 MOTION INCEPTION. New insights revealed from this demonstration with emerging phenomena include: (a) we demonstrate that waves do generate, propagate, and dissipate at a scale (miles and hours) and ubiquity never observed before; (b) wave fronts and tails travel at a consistent speed for a critical speed between 10-20 mph, with propagation variation across lanes, where wave speeds on the outer lane are less consistent compared to those on the inner lane; (c) wave fronts and tails propagate at different speeds; (d) wave boundaries capture rich and non-trivial wave topologies, highlighting the complexity of waves.
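As a rough illustration of the critical-speed definition and the component-identification view described in this abstract, the sketch below thresholds a small speed grid and groups the below-threshold cells into connected components. It is a simplification under stated assumptions, not the authors' implementation: the abstract builds its graph on wave front and tail points, whereas this sketch uses plain grid cells (rows as time steps, columns as road segments) with 4-neighbour adjacency, and the function name `find_wave_components` and the toy data are hypothetical.

```python
from collections import deque


def find_wave_components(speed_grid, critical_speed):
    """Group grid cells with speed below critical_speed into connected components.

    speed_grid is a list of rows (assumed here: one row per time step,
    one column per road segment). Adjacency is 4-neighbour, an assumption.
    """
    rows, cols = len(speed_grid), len(speed_grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or speed_grid[r][c] >= critical_speed:
                continue
            # Breadth-first search collects one connected slow region.
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                i, j = queue.popleft()
                cells.append((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj]
                            and speed_grid[ni][nj] < critical_speed):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            components.append(cells)
    return components


# Hypothetical speeds in mph; two separate slow regions are present.
toy_grid = [
    [60, 10, 60, 60],
    [60, 12, 60, 8],
    [60, 60, 60, 9],
]
waves = find_wave_components(toy_grid, critical_speed=15)
```

Each returned component is a list of (time index, segment index) cells, so its extent along each axis gives a crude duration and length for one candidate wave.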
Safe Barrier-Constrained Control of Uncertain Systems via Event-triggered Learning
While control barrier functions are employed in addressing safety, control synthesis methods based on them generally rely on accurate system dynamics. This is a critical limitation, since the dynamics of complex systems are often not fully known. Supervised machine learning techniques hold great promise for alleviating this weakness by inferring models from data. We propose a novel control barrier function-based framework for safe control through event-triggered learning, which switches between prioritizing control performance and improving model accuracy based on the uncertainty of the learned model. By updating a Gaussian process model with training points gathered online, the approach guarantees the feasibility of control barrier function conditions with high probability, such that safety can be ensured in a data-efficient manner. Furthermore, we establish the absence of Zeno behavior in the triggering scheme, and extend the algorithm to sampled-data realizations by accounting for inter-sampling effects. The effectiveness of the proposed approach and theory is demonstrated in simulations.
comment: The first two authors contributed equally to the work
Risk-Averse Resilient Operation of Electricity Grid Under the Risk of Wildfire
Wildfires and other extreme weather conditions due to climate change are stressing the aging electrical infrastructure. Power utilities have implemented public safety power shutoffs as a method to mitigate the risk of wildfire by proactively de-energizing some power lines, which leaves customers without power. System operators have to make a compromise between de-energizing power lines to avoid the wildfire risk and energizing those lines to serve the demand. In this work, with a quantified wildfire ignition risk of each line, a resilient operation problem is presented in power systems with a high penetration level of renewable generation resources. A two-stage robust optimization problem is formulated and solved using a column-and-constraint generation algorithm to find an improved balance between the de-energization of power lines and the customers served. The ability of different penetration levels of renewable generation to mitigate the impact of extreme fire hazard situations on the energization of customers is assessed. The validity of the presented robust optimization algorithm is demonstrated on various test cases.
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
While federated learning (FL) is a widely popular distributed ML strategy that protects data privacy, time-varying wireless network parameters and heterogeneous system configurations of the wireless devices pose significant challenges. Although the limited radio and computational resources of the network and the clients, respectively, are widely acknowledged, two critical yet often ignored aspects are (a) wireless devices can only dedicate a small chunk of their limited storage for the FL task and (b) new training samples may arrive in an online manner in many practical wireless applications. Therefore, we propose a new FL algorithm called OSAFL, specifically designed to learn tasks relevant to wireless applications under these practical considerations. Since it has long been proven that under extreme resource constraints, clients may perform an arbitrary number of local training steps, which may lead to client drift under statistically heterogeneous data distributions, we leverage normalized gradient similarities and exploit weighting clients' updates based on optimized scores that facilitate the convergence rate of the proposed OSAFL algorithm. Our extensive simulation results on two different tasks -- each with three different datasets -- with four popular ML models validate the effectiveness of OSAFL compared to six existing state-of-the-art FL baselines.
comment: Under review for possible publication in IEEE Transactions on Communications
An RF-Domain Leakage Cancellation Scheme for FMCW Radars
This paper proposes a novel solution to address the leakage from the transmitter (TX) to the receiver (RX) in frequency-modulated continuous-wave (FMCW) radars. The proposed scheme replicates the leakage using an in-phase and quadrature mixer (IQ-mixer) and performs leakage cancellation in the radio-frequency (RF) domain. This approach utilizes a Wilkinson power combiner after the RX antenna to subtract the replicated leakage signal from the received signal, ensuring that only the true target signal reaches the low-noise amplifier (LNA). This scheme enhances the dynamic range and the receiver's ability to discern proximate targets from previously indistinguishable low beat-frequency clutter. In addition, the proposed technique incorporates a second IQ-mixer based complex modulator in the transmitter to tune the leakage beat frequency. This allows for accurate estimation of the leakage amplitude and phase without additional hardware. Simulation results show more than 20 dB of leakage cancellation.
An experimental evaluation of Deep Reinforcement Learning algorithms for HVAC control
Heating, Ventilation, and Air Conditioning (HVAC) systems are a major driver of energy consumption in commercial and residential buildings. Recent studies have shown that Deep Reinforcement Learning (DRL) algorithms can outperform traditional reactive controllers. However, DRL-based solutions are generally designed for ad hoc setups and lack standardization for comparison. To fill this gap, this paper provides a critical and reproducible evaluation, in terms of comfort and energy consumption, of several state-of-the-art DRL algorithms for HVAC control. The study examines the controllers' robustness, adaptability, and trade-off between optimization goals by using the Sinergym framework. The results obtained confirm the potential of DRL algorithms, such as SAC and TD3, in complex scenarios and reveal several challenges related to generalization and incremental learning.
Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control
Model-based control requires an accurate model of the system dynamics for precisely and safely controlling the robot in complex and dynamic environments. Moreover, in the presence of variations in the operating conditions, the model should be continuously refined to compensate for dynamics changes. In this paper, we present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems. We combine offline learning from past experience and online learning from current robot interaction with the unknown environment. These two ingredients enable a highly sample-efficient and adaptive learning process, capable of accurately inferring model dynamics in real-time even in operating regimes that greatly differ from the training distribution. Moreover, we design an uncertainty-aware model predictive controller that is heuristically conditioned to the aleatoric (data) uncertainty of the learned dynamics. This controller actively chooses the optimal control actions that (i) optimize the control performance and (ii) improve the efficiency of online learning sample collection. We demonstrate the effectiveness of our method through a series of challenging real-world experiments using a quadrotor system. Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions, while it significantly outperforms classical and adaptive control baselines.
Towards 6G-V2X: Aggregated RF-VLC for Ultra-Reliable and Low-Latency Autonomous Driving
We are witnessing a transition to a new era where driverless cars will be pervasively connected to deliver significantly improved safety, traffic efficiency, and travel experiences. A diverse set of advanced vehicular use cases including connected autonomous vehicles will be made possible by building upon the emerging sixth-generation (6G) wireless networks. Among many 6G wireless technologies, the principal objective of this paper is to introduce the potential benefits of the hybrid integration of Vehicular Visible Light Communication (V-VLC) and Vehicular Radio Frequency (V-RF) communication systems by studying the impact of interference as well as various meteorological phenomena, viz. rain, fog and dry snow. In particular, we show that regardless of any meteorological impact, a properly configured link-aggregated hybrid V-VLC/V-RF system is capable of meeting stringent ultra-high reliability (>99.999%) and ultra-low latency (<3 ms) requirements, making it a promising candidate for 6G Vehicle-to-Everything (V2X) Communications. To stimulate future research in the hybrid RF-VLC V2X space, we also highlight the potential challenges and research directions.
Systems and Control (EESS)
Sparse Mamba: Reinforcing Controllability In Structural State Space Models
In this article, we introduce the concepts of controllability and observability to the Mamba architecture in our Sparse-Mamba (S-Mamba) for natural language processing (NLP) applications. Structured state space models (SSMs) developed in recent studies, such as Mamba and Mamba2, outperformed transformers and large language models (LLMs) and solved their computational inefficiency on longer sequences in small to medium NLP tasks. The Mamba SSM architecture drops the need for the attention layers or MLP blocks used in transformers. However, the current Mamba models do not reinforce controllability on the state space equations in the calculation of the A, B, C, and D matrices at each time step, which increases the complexity and the computational cost needed. In this article we show that the number of parameters can be significantly decreased by reinforcing controllability in the state space equations in the proposed Sparse-Mamba (S-Mamba), while maintaining the performance. The controllable n x n state matrix A is sparse and it has only n free parameters. Our novel approach will ensure a controllable system and could be the gateway to Mamba 3.
Review of meta-heuristic optimization algorithms to tune the PID controller parameters for automatic voltage regulator
A Proportional-Integral-Derivative (PID) controller is required to bring a system back to the stable operating region as soon as possible following a disturbance or discrepancy. For successful operation of the PID controller, it is necessary to design the controller parameters in a manner that yields low optimization complexity, low memory use, fast convergence, and the ability to operate dynamically. Recent investigations have postulated many different meta-heuristic algorithms for efficient tuning of the PID controller under various system milieus. However, researchers have seldom compared their custom-made objective functions with previous investigations while proposing new algorithmic methods. This paper focuses on a detailed study of the research progress, deficiencies, accomplishments and future scope of recently proposed heuristic algorithms for designing and tuning the PID controller parameters for an automatic voltage regulator (AVR) system. Objective functions, including ITSE, ITAE, IAE, ISE, and ZLG, are considered to enumerate a measurable outcome of the algorithms. Considering a slight variation in the system gain parameters of the algorithms, the observed PID gains with ITSE range over 0.81918 - 1.9499 for K_p, 0.24366 - 1.4608 for K_i, and 0.31840 - 0.9683 for K_d. With ITAE, the values are 0.24420 - 1.2771, 0.14230 - 0.8471, and 0.04270 - 0.4775, respectively. The time domain and frequency domain characteristics also change significantly with each objective function. Our outlined comparison will be a guideline for investigating any newer algorithms in the field.
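For readers unfamiliar with the integral criteria named in this abstract, the sketch below computes four of them from a sampled error signal using their standard textbook definitions: IAE integrates |e(t)|, ISE integrates e(t)^2, ITAE integrates t|e(t)|, and ITSE integrates t e(t)^2. The trapezoidal discretization, the function name `integral_criteria`, and the sample error signal are assumptions made for illustration; ZLG is left out because the abstract does not define it.

```python
def integral_criteria(times, errors):
    """Trapezoidal approximations of IAE, ISE, ITAE and ITSE.

    times and errors are equal-length sequences sampling the tracking
    error e(t); the sample spacing need not be uniform.
    """
    def trapz(values):
        # Composite trapezoidal rule over the given sample times.
        return sum(
            0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
            for i in range(len(times) - 1)
        )

    return {
        "IAE": trapz([abs(e) for e in errors]),
        "ISE": trapz([e * e for e in errors]),
        "ITAE": trapz([t * abs(e) for t, e in zip(times, errors)]),
        "ITSE": trapz([t * e * e for t, e in zip(times, errors)]),
    }


# Hypothetical step-response error that decays over four seconds.
sample_times = [0.0, 1.0, 2.0, 3.0, 4.0]
sample_errors = [1.0, 0.5, -0.2, 0.1, 0.0]
scores = integral_criteria(sample_times, sample_errors)
```

The time-weighted criteria (ITAE, ITSE) penalize error that persists late in the response, which is one reason tuning against different criteria can lead to different PID gains.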
Formal Verification and Control with Conformal Prediction
In this survey, we design formal verification and control algorithms for autonomous systems with practical safety guarantees using conformal prediction (CP), a statistical tool for uncertainty quantification. We focus on learning-enabled autonomous systems (LEASs) in which the complexity of learning-enabled components (LECs) is a major bottleneck that hampers the use of existing model-based verification and design techniques. Instead, we advocate for the use of CP, and we will demonstrate its use in formal verification, systems and control theory, and robotics. We argue that CP is specifically useful due to its simplicity (easy to understand, use, and modify), generality (requires no assumptions on learned models and data distributions, i.e., is distribution-free), and efficiency (real-time capable and accurate). We pursue the following goals with this survey. First, we provide an accessible introduction to CP for non-experts who are interested in using CP to solve problems in autonomy. Second, we show how to use CP for the verification of LECs, e.g., for verifying input-output properties of neural networks. Third and fourth, we review recent articles that use CP for safe control design as well as offline and online verification of LEASs. We summarize their ideas in a unifying framework that can deal with the complexity of LEASs in a computationally efficient manner. In our exposition, we consider simple system specifications, e.g., robot navigation tasks, as well as complex specifications formulated in temporal logic formalisms. Throughout our survey, we compare to other statistical techniques (e.g., scenario optimization, PAC-Bayes theory, etc.) and how these techniques have been used in verification and control. Lastly, we point the reader to open problems and future research directions.
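To make the CP idea in this abstract concrete, here is a minimal sketch of split conformal prediction, one common variant; the abstract does not say which variant the survey emphasizes, so that choice is an assumption. Given nonconformity scores from a held-out calibration set (for example, absolute prediction errors), the threshold is the ceil((n+1)(1-alpha))-th smallest score. If the calibration and test points are exchangeable, a new score falls at or below this threshold with probability at least 1-alpha. The function names and calibration values are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Returns infinity when the calibration set is too small for the
    requested miscoverage level alpha.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def prediction_interval(point_prediction, threshold):
    """Symmetric interval for absolute-error nonconformity scores."""
    return (point_prediction - threshold, point_prediction + threshold)


# Hypothetical calibration scores |y_i - f(x_i)| from held-out data.
calibration_scores = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7, 0.6, 0.8]
threshold = conformal_quantile(calibration_scores, alpha=0.5)
interval = prediction_interval(2.0, threshold)
```

No assumption is made about the predictor or the data distribution beyond exchangeability, which is the distribution-free property the abstract highlights.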
Leaky Wave Antenna-Equipped RF Chipless Tags for Orientation Estimation
Accurate orientation estimation of an object in a scene is critical in robotics, aerospace, augmented reality, and medicine, as it supports scene understanding. This paper introduces a novel orientation estimation approach leveraging radio frequency (RF) sensing technology and leaky-wave antennas (LWAs). Specifically, we propose a framework for a radar system to estimate the orientation of a \textit{dumb} LWA-equipped backscattering tag, marking the first exploration of this method in the literature. Our contributions include a comprehensive framework for signal modeling and orientation estimation with multi-subcarrier transmissions, and the formulation of a maximum likelihood estimator (MLE). Moreover, we analyze the impact of imperfect tag location information, revealing that it minimally affects estimation accuracy. Exploiting related results, we propose an approximate MLE and introduce a low-complexity radiation-pointing angle-based estimator with near-optimal performance. We derive the feasible orientation estimation region of the latter and show that it depends mainly on the system bandwidth. Our analytical results are validated through Monte Carlo simulations and reveal that the low-complexity estimator achieves near-optimal accuracy and that its feasible orientation estimation region is also approximately shared by the other estimators. Finally, we show that the optimal number of subcarriers increases with sensing time under a power budget constraint.
comment: 14 pages, 2 tables, 8 figs. Submitted to IEEE TWC
Evaluation of Prosumer Networks for Peak Load Management in Iran: A Distributed Contextual Stochastic Optimization Approach
Renewable prosumers face the complex challenge of balancing self-sufficiency with seamless grid and market integration. This paper introduces a novel prosumer network framework aimed at mitigating peak loads in Iran, particularly under the uncertainties inherent in renewable energy generation and demand. A cost-oriented integrated prediction and optimization approach is proposed, empowering prosumers to make informed decisions within a distributed contextual stochastic optimization (DCSO) framework. The problem is formulated as a bi-level two-stage multi-time scale optimization to determine optimal operation and interaction strategies under various scenarios, considering flexible resources. To facilitate grid integration, a novel consensus-based contextual information sharing mechanism is proposed. This approach enables coordinated collective behaviors and leverages contextual data more effectively. The overall problem is recast as a mixed-integer linear program (MILP) by incorporating optimality conditions and linearizing complementarity constraints. Additionally, a distributed algorithm using the consensus alternating direction method of multipliers (ADMM) is presented for computational tractability and privacy preservation. Numerical results highlight that integrating prediction with optimization and implementing a contextual information-sharing network among prosumers significantly reduces peak loads as well as total costs.
comment: 10 pages, 26 figures, journal paper
Advancing Machine Learning in Industry 4.0: Benchmark Framework for Rare-event Prediction in Chemical Processes
Previously, using forward-flux sampling (FFS) and machine learning (ML), we developed multivariate alarm systems to counter rare un-postulated abnormal events. Our alarm systems utilized ML-based predictive models to quantify committer probabilities as functions of key process variables (e.g., temperature, concentrations, and the like), with these data obtained in FFS simulations. Herein, we introduce a novel and comprehensive benchmark framework for rare-event prediction, comparing ML algorithms of varying complexity, including Linear Support-Vector Regressor and k-Nearest Neighbors, to more sophisticated algorithms, such as Random Forests, XGBoost, LightGBM, CatBoost, Dense Neural Networks, and TabNet. This evaluation uses comprehensive performance metrics, such as: $\textit{RMSE}$, model training, testing, hyperparameter tuning and deployment times, and number and efficiency of alarms. These balance model accuracy, computational efficiency, and alarm-system efficiency, identifying optimal ML strategies for predicting abnormal rare events, enabling operators to obtain safer and more reliable plant operations.
comment: This is a preprint for our manuscript to be submitted for publication in Computers and Chemical Engineering Journal. Pages: 22 (including Appendix and References). Figures: 9 (main) + 3 (Appendix). Tables: 3 (main) + 3 (Appendix)
Enhancing Bistable Vibration Energy Harvesters with Tunable Circuits: A Comparative Analysis
In this article, we present an analysis of the effects of nonlinear circuits on bistable vibration energy harvesters. We begin by introducing an analytical model for bistable vibration energy harvesters and demonstrate that the impact of nonlinear circuits can be characterized by two factors: electrically-induced damping and electrically-induced stiffness. Subsequently, we investigate how these factors influence the power-frequency response of the harvester. Our findings reveal that electrically-induced damping significantly affects the bistable vibration energy harvester dynamics, whereas electrically-induced stiffness has minimal impact, which is a notable difference from the behavior of linear harvesters connected to nonlinear circuits. Thereafter, we conduct a comparative study of bistable energy harvesters connected to different nonlinear circuits already well documented in the literature. Our analysis demonstrates that, in most cases, the parallel synchronized switch harvesting on inductor circuit yields superior performance due to its ability to maximize electrically-induced damping. These comparative assessments and conclusions are evaluated within the framework of our proposed models and are contrasted with results obtained from linear vibration energy harvesters. The derived comparison maps presented at the end of this paper offer quantitative justification for selecting the optimal circuit for bistable energy harvesters, while deepening our comprehension of the intricate dynamics associated with nonlinear harvesters coupled with nonlinear circuits.
comment: 15 pages, 13 figures, 2 Appendices
Flexible Ramping Product Procurement in Day-Ahead Markets
Flexible ramping products (FRPs) emerge as a promising instrument for addressing steep and uncertain ramping needs through market mechanisms. Initial implementations of FRPs in North American electricity markets, however, revealed several shortcomings in existing FRP designs. In many instances, FRP prices failed to signal the true value of ramping capacity, most notably evident in zero FRP prices observed in a myriad of periods during which the system was in acute need of rampable capacity. These periods were marked by scheduled but undeliverable FRPs, often calling for operator out-of-market actions. On top of that, the methods used for procuring FRPs have been primarily rule-based, lacking explicit economic underpinnings. In this paper, we put forth an alternative framework for FRP procurement, which seeks to set FRP requirements and schedule FRP awards such that the expected system operation cost is minimized. Using real-world data from U.S. ISOs, we showcase the relative merits of the framework in (i) reducing the total system operation cost, (ii) improving price formation, (iii) enhancing the deliverability of FRP awards, and (iv) reducing the need for out-of-market actions.
Optimization-Based Control of Distributed Battery Storage in Distribution Networks
We propose a combined global-local control approach to regulate voltage and minimize power losses in distribution networks with high integration of distributed energy resources (DERs). Local controllers embed the fast-acting proportional volt-var-watt control law and have their gain (slope) coefficients updated regularly by a global optimization problem at a slower time-scale. The design of optimal coefficients preserves overall system stability and encapsulates inverter and energy limits of controllable DERs. The proposed approach is formulated based on a linear network model (LinDistFlow) and suitable approximations to produce a convex multi-period optimization formulation. Numerical simulations with real-world customer data and two different distribution feeders revealed that our approach provides substantial voltage regulation, while reducing losses by 11 per cent and peak substation power by 26 per cent compared to other state-of-the-art algorithms.
Lyapunov Neural ODE Feedback Control Policies
Deep neural networks are increasingly used as an effective way to represent control policies in a wide range of learning-based control methods. For continuous-time optimal control problems (OCPs), which are central to many decision-making tasks, control policy learning can be cast as a neural ordinary differential equation (NODE) problem wherein state and control constraints are naturally accommodated. This paper presents a Lyapunov-NODE control (L-NODEC) approach to solving continuous-time OCPs for the case of stabilizing a known constrained nonlinear system around a terminal equilibrium point. We propose a Lyapunov loss formulation that incorporates a control-theoretic Lyapunov condition into the problem of learning a state-feedback neural control policy. We establish that L-NODEC ensures exponential stability of the controlled system, as well as its adversarial robustness to uncertain initial conditions. The performance of L-NODEC is illustrated on a benchmark double integrator problem and for optimal control of thermal dose delivery using a cold atmospheric plasma biomedical system. L-NODEC can substantially reduce the inference time necessary to reach the equilibrium state.
Facilitating AI and System Operator Synergy: Active Learning-Enhanced Digital Twin Architecture for Day-Ahead Load Forecasting
In this paper, we introduce a synergistic approach between artificial intelligence and system operators through an innovative digital twin architecture, integrated with an active learning framework, to enhance short-term load forecasting. Central to this architecture is the incorporation of sophisticated data pipelines, facilitating the real-time ingestion, processing and analysis of grid-related data. Utilizing a recurrent neural network architecture, our model generates day-ahead load forecasts together with prediction confidence intervals, strengthening system operator trust in the model's predictive reliability and enhancing their ability to respond to evolving grid conditions effectively. The active learning framework iteratively refines the predictions by incorporating real-time feedback based on forecast uncertainty, utilizing newly available data to continuously enhance forecasting accuracy and confidence. This AI-assisted strategy is exemplified in a case study of the Greek transmission system. It demonstrates the potential to transform short-term load forecasting, thereby increasing the reliability and operational efficiency of modern power grids. This approach marks a significant step forward in the digitalization and intelligent management of power systems.
Distributionally Robust Joint Chance-Constrained Optimization for Electricity Imbalance in Iran: Integrating Renewables and Storage
Iran's power grid faces mounting challenges due to the widening gap between rapidly increasing peak demand and lagging sustainable capacity expansion or load management. Prosumers have become key players in reducing grid load and offering valuable flexible services, but their effectiveness is hampered by a lack of knowledge about uncertain parameters and their probability distributions. This study introduces a novel two-stage multi-time scale distributionally robust optimization framework integrated with joint chance constraints to effectively manage the operation of prosumers and their energy sharing to mitigate overall peak load imbalances under uncertainties. In a data-driven setting and by leveraging historical data, the proposed model is reformulated as a tractable second-order conic constrained quadratic program (SOCP). By considering real-world complexities based on realistic data such as diverse load profiles and intermittent renewable generation, our approach demonstrates enhanced energy management system performance, even in out-of-sample scenarios. The synergy of distributed energy resources and coordinated flexibility within the network is instrumental in achieving substantial reductions in peak load and improving grid resilience.
comment: 9 pages; 11 figures, journal paper
The Dilemma of Electricity Grid Expansion Planning in Areas at the Risk of Wildfire
Utilities consider public safety power shut-offs imperative for the mitigation of wildfire risk. This paper presents expansion planning of power systems under fire hazard weather conditions. The power lines are quantified based on the risk of fire ignition. A 10-year expansion planning scenario is discussed to supply power to customers by considering three decision variables: distributed solar generation, modification of existing power lines, and addition of new lines. A two-stage robust optimization problem is formulated and solved using the Column-and-Constraint Generation Algorithm to find an improved balance among de-energization of customers, distributed solar generation, modification of power lines, and addition of new lines. It involves de-energizing lines in high wildfire risk regions and serving the customers by integrating distributed solar generation. The impact of de-energization of lines on distributed solar generation is assessed. The number of hours each line is energized and the total load shedding during a 10-year period are evaluated. Different uncertainty levels for system demand and solar energy integration are considered to find the impact on the total operation cost of the system. The effectiveness of the presented algorithm is evaluated on 6- and 118-bus systems.
Scalable analysis of stop-and-go waves
Analyzing stop-and-go waves at the scale of miles and hours of data is an emerging challenge in traffic research. In the past, datasets were of limited scale and could be easily analyzed by hand or with rudimentary methods to identify a very limited set of traffic waves present within the data. This paper introduces an automatic and scalable stop-and-go wave identification method capable of capturing wave generation, propagation, dissipation, as well as bifurcation and merging, which have previously been observed only very rarely. Using a concise and simple critical-speed based definition of a stop-and-go wave, the proposed method identifies all wave boundaries that encompass spatio-temporal points where vehicle speed is below a chosen critical speed. The method is built upon a graph-based representation of the spatio-temporal points associated with stop-and-go waves, specifically wave front (start) points and wave tail (end) points, and approaches the solution as a graph component identification problem. The method is implemented in Python and demonstrated on a large-scale dataset, I-24 MOTION INCEPTION. New insights revealed from this demonstration with emerging phenomena include: (a) we demonstrate that waves do generate, propagate, and dissipate at a scale (miles and hours) and ubiquity never observed before; (b) wave fronts and tails travel at a consistent speed for a critical speed between 10-20 mph, with propagation variation across lanes, where wave speeds on the outer lane are less consistent compared to those on the inner lane; (c) wave fronts and tails propagate at different speeds; (d) wave boundaries capture rich and non-trivial wave topologies, highlighting the complexity of waves.
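The critical-speed definition and the component-identification idea in the abstract above can be illustrated with a much simplified sketch. It is not the authors' implementation: it assumes the speeds have already been placed on a regular time-by-position grid, treats each below-critical cell as a graph node connected to its four grid neighbours, and the names `find_wave_components`, `speed_grid`, and `critical_speed` are hypothetical.

```python
from collections import deque


def find_wave_components(speed_grid, critical_speed):
    """Group grid cells with speed below critical_speed into connected regions.

    speed_grid[t][x] is the speed at time index t and position index x
    (an assumed, pre-gridded input). Returns a list of components, each a
    list of (t, x) cells that are 4-connected in the time-space grid.
    """
    n_t = len(speed_grid)
    n_x = len(speed_grid[0]) if n_t else 0
    visited = [[False] * n_x for _ in range(n_t)]
    components = []

    for t in range(n_t):
        for x in range(n_x):
            if visited[t][x] or speed_grid[t][x] >= critical_speed:
                continue
            # Breadth-first search over neighbouring slow cells.
            visited[t][x] = True
            cells = []
            queue = deque([(t, x)])
            while queue:
                ct, cx = queue.popleft()
                cells.append((ct, cx))
                for nt, nx in ((ct + 1, cx), (ct - 1, cx), (ct, cx + 1), (ct, cx - 1)):
                    inside = 0 <= nt < n_t and 0 <= nx < n_x
                    if inside and not visited[nt][nx] and speed_grid[nt][nx] < critical_speed:
                        visited[nt][nx] = True
                        queue.append((nt, nx))
            components.append(cells)
    return components


# Toy grid (mph): two separate slow regions with a critical speed of 15 mph.
grid = [
    [60, 10, 60, 60],
    [60, 12, 60, 8],
    [60, 60, 60, 9],
]
waves = find_wave_components(grid, critical_speed=15)
```

In this toy grid the two slow regions do not touch, so they come out as separate components; a merge or bifurcation would show up as one component whose extent along the position axis splits or joins over time.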
Safe Barrier-Constrained Control of Uncertain Systems via Event-triggered Learning
While control barrier functions are employed in addressing safety, control synthesis methods based on them generally rely on accurate system dynamics. This is a critical limitation, since the dynamics of complex systems are often not fully known. Supervised machine learning techniques hold great promise for alleviating this weakness by inferring models from data. We propose a novel control barrier function-based framework for safe control through event-triggered learning, which switches between prioritizing control performance and improving model accuracy based on the uncertainty of the learned model. By updating a Gaussian process model with training points gathered online, the approach guarantees the feasibility of control barrier function conditions with high probability, such that safety can be ensured in a data-efficient manner. Furthermore, we establish the absence of Zeno behavior in the triggering scheme, and extend the algorithm to sampled-data realizations by accounting for inter-sampling effects. The effectiveness of the proposed approach and theory is demonstrated in simulations.
comment: The first two authors contributed equally to the work
Risk-Averse Resilient Operation of Electricity Grid Under the Risk of Wildfire
Wildfires and other extreme weather conditions due to climate change are stressing the aging electrical infrastructure. Power utilities have implemented public safety power shutoffs as a method to mitigate the risk of wildfire by proactively de-energizing some power lines, which leaves customers without power. System operators have to make a compromise between de-energizing power lines to avoid the wildfire risk and energizing those lines to serve the demand. In this work, with a quantified wildfire ignition risk of each line, a resilient operation problem is presented in power systems with a high penetration level of renewable generation resources. A two-stage robust optimization problem is formulated and solved using a column-and-constraint generation algorithm to find an improved balance between the de-energization of power lines and the customers served. The ability of different penetration levels of renewable generation to mitigate the impact of extreme fire hazard situations on the energization of customers is assessed. The validity of the presented robust optimization algorithm is demonstrated on various test cases.
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
While federated learning (FL) is a widely popular distributed machine learning (ML) strategy that protects data privacy, time-varying wireless network parameters and heterogeneous system configurations of the wireless devices pose significant challenges. Although the limited radio and computational resources of the network and the clients, respectively, are widely acknowledged, two critical yet often ignored aspects are (a) wireless devices can only dedicate a small chunk of their limited storage for the FL task and (b) new training samples may arrive in an online manner in many practical wireless applications. Therefore, we propose a new FL algorithm called OSAFL, specifically designed to learn tasks relevant to wireless applications under these practical considerations. Since it has long been proven that under extreme resource constraints, clients may perform an arbitrary number of local training steps, which may lead to client drift under statistically heterogeneous data distributions, we leverage normalized gradient similarities and exploit weighting clients' updates based on optimized scores that facilitate the convergence rate of the proposed OSAFL algorithm. Our extensive simulation results on two different tasks -- each with three different datasets -- with four popular ML models validate the effectiveness of OSAFL compared to six existing state-of-the-art FL baselines.
comment: Under review for possible publication in IEEE Transactions on Communications
An RF-Domain Leakage Cancellation Scheme for FMCW Radars
This paper proposes a novel solution to address the leakage from the transmitter (TX) to the receiver (RX) in frequency-modulated continuous-wave (FMCW) radars. The proposed scheme replicates the leakage using an in-phase and quadrature mixer (IQ-mixer) and performs leakage cancellation in the radio-frequency (RF) domain. This approach utilizes a Wilkinson power combiner after the RX antenna to subtract the replicated leakage signal from the received signal, ensuring that only the true target signal reaches the low-noise amplifier (LNA). This scheme enhances the dynamic range and the receiver's ability to discern proximate targets from previously indistinguishable low beat-frequency clutter. In addition, the proposed technique incorporates a second IQ-mixer based complex modulator in the transmitter to tune the leakage beat frequency. This allows for accurate estimation of the leakage amplitude and phase without additional hardware. Simulation results show more than 20 dB of leakage cancellation.
An experimental evaluation of Deep Reinforcement Learning algorithms for HVAC control
Heating, Ventilation, and Air Conditioning (HVAC) systems are a major driver of energy consumption in commercial and residential buildings. Recent studies have shown that Deep Reinforcement Learning (DRL) algorithms can outperform traditional reactive controllers. However, DRL-based solutions are generally designed for ad hoc setups and lack standardization for comparison. To fill this gap, this paper provides a critical and reproducible evaluation, in terms of comfort and energy consumption, of several state-of-the-art DRL algorithms for HVAC control. The study examines the controllers' robustness, adaptability, and trade-off between optimization goals by using the Sinergym framework. The results obtained confirm the potential of DRL algorithms, such as SAC and TD3, in complex scenarios and reveal several challenges related to generalization and incremental learning.
Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control
Model-based control requires an accurate model of the system dynamics for precisely and safely controlling the robot in complex and dynamic environments. Moreover, in the presence of variations in the operating conditions, the model should be continuously refined to compensate for dynamics changes. In this paper, we present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems. We combine offline learning from past experience and online learning from current robot interaction with the unknown environment. These two ingredients enable a highly sample-efficient and adaptive learning process, capable of accurately inferring model dynamics in real-time even in operating regimes that greatly differ from the training distribution. Moreover, we design an uncertainty-aware model predictive controller that is heuristically conditioned to the aleatoric (data) uncertainty of the learned dynamics. This controller actively chooses the optimal control actions that (i) optimize the control performance and (ii) improve the efficiency of online learning sample collection. We demonstrate the effectiveness of our method through a series of challenging real-world experiments using a quadrotor system. Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions, while it significantly outperforms classical and adaptive control baselines.
Towards 6G-V2X: Aggregated RF-VLC for Ultra-Reliable and Low-Latency Autonomous Driving
We are witnessing a transition to a new era where driverless cars will be pervasively connected to deliver significantly improved safety, traffic efficiency, and travel experiences. A diverse set of advanced vehicular use cases including connected autonomous vehicles will be made possible by building upon the emerging sixth-generation (6G) wireless networks. Among many 6G wireless technologies, the principal objective of this paper is to introduce the potential benefits of the hybrid integration of Vehicular Visible Light Communication (V-VLC) and Vehicular Radio Frequency (V-RF) communication systems by studying the impact of interference as well as various meteorological phenomena, viz. rain, fog and dry snow. In particular, we show that regardless of any meteorological impact, a properly configured link-aggregated hybrid V-VLC/V-RF system is capable of meeting stringent ultra-high reliability (>99.999%) and ultra-low latency (<3 ms) requirements, making it a promising candidate for 6G Vehicle-to-Everything (V2X) Communications. To stimulate future research in the hybrid RF-VLC V2X space, we also highlight the potential challenges and research directions.
Robotics
Formal Verification and Control with Conformal Prediction
In this survey, we design formal verification and control algorithms for autonomous systems with practical safety guarantees using conformal prediction (CP), a statistical tool for uncertainty quantification. We focus on learning-enabled autonomous systems (LEASs) in which the complexity of learning-enabled components (LECs) is a major bottleneck that hampers the use of existing model-based verification and design techniques. Instead, we advocate for the use of CP, and we will demonstrate its use in formal verification, systems and control theory, and robotics. We argue that CP is specifically useful due to its simplicity (easy to understand, use, and modify), generality (requires no assumptions on learned models and data distributions, i.e., is distribution-free), and efficiency (real-time capable and accurate). We pursue the following goals with this survey. First, we provide an accessible introduction to CP for non-experts who are interested in using CP to solve problems in autonomy. Second, we show how to use CP for the verification of LECs, e.g., for verifying input-output properties of neural networks. Third and fourth, we review recent articles that use CP for safe control design as well as offline and online verification of LEASs. We summarize their ideas in a unifying framework that can deal with the complexity of LEASs in a computationally efficient manner. In our exposition, we consider simple system specifications, e.g., robot navigation tasks, as well as complex specifications formulated in temporal logic formalisms. Throughout our survey, we compare to other statistical techniques (e.g., scenario optimization, PAC-Bayes theory, etc.) and how these techniques have been used in verification and control. Lastly, we point the reader to open problems and future research directions.
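As a concrete illustration of the distribution-free guarantee mentioned in the abstract above, the following minimal sketch computes a split conformal prediction threshold. It assumes exchangeable calibration and test data and a user-chosen nonconformity score (for example, the absolute prediction error); the function name `conformal_quantile` and the parameter `delta` (the allowed miscoverage level) are hypothetical and not taken from the survey.

```python
import math


def conformal_quantile(scores, delta):
    """Return a threshold C such that a new score is <= C with
    probability at least 1 - delta, assuming the new score is
    exchangeable with the n calibration scores.

    Uses the standard rank ceil((n + 1) * (1 - delta)); if that rank
    exceeds n, no finite threshold gives the guarantee and inf is returned.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - delta))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


# Nine calibration scores, e.g. |y_i - prediction_i| on held-out data.
calibration_scores = [3, 1, 2, 5, 4, 9, 7, 8, 6]
threshold_half = conformal_quantile(calibration_scores, delta=0.5)
threshold_tight = conformal_quantile(calibration_scores, delta=0.05)
```

With only nine calibration points, a 95% guarantee cannot be met by any finite threshold, which is why the second call returns infinity; more calibration data is needed for small `delta`.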
DAP: Diffusion-based Affordance Prediction for Multi-modality Storage IROS2024
Solving the storage problem, where objects must be accurately placed into containers with precise orientations and positions, presents a distinct challenge that extends beyond traditional rearrangement tasks. These challenges are primarily due to the need for fine-grained 6D manipulation and the inherent multi-modality of solution spaces, where multiple viable goal configurations exist for the same storage container. We present a novel Diffusion-based Affordance Prediction (DAP) pipeline for the multi-modal object storage problem. DAP leverages a two-step approach, initially identifying a placeable region on the container and then precisely computing the relative pose between the object and that region. Existing methods either struggle with multi-modality issues or require computation-intensive training. Our experiments demonstrate DAP's superior performance and training efficiency over the current state-of-the-art RPDiff, achieving remarkable results on the RPDiff benchmark. Additionally, our experiments showcase DAP's data efficiency in real-world applications, an advancement over existing simulation-driven approaches. Our contribution fills a gap in robotic manipulation research by offering a solution that is both computationally efficient and capable of handling real-world variability. Code and supplementary material can be found at: https://github.com/changhaonan/DPS.git.
comment: Paper Accepted by IROS2024. Arxiv version is 8 pages
Rapid Gyroscope Calibration: A Deep Learning Approach
Low-cost gyroscope calibration is essential for ensuring the accuracy and reliability of gyroscope measurements. Stationary calibration estimates the deterministic parts of measurement errors. To this end, a common practice is to average the gyroscope readings during a predefined period and estimate the gyroscope bias. Calibration duration plays a crucial role in performance; therefore, longer periods are preferred. However, some applications require quick startup times and calibration is therefore allowed only for a short time. In this work, we focus on reducing low-cost gyroscope calibration time using deep learning methods. We propose a deep-learning framework and explore the possibilities of using multiple real and virtual gyroscopes to improve the calibration performance of single gyroscopes. To train and validate our approach, we recorded a dataset consisting of 169 hours of gyroscope readings, using 24 gyroscopes of two different brands. We also created a virtual dataset consisting of simulated gyroscope readings. The two datasets were used to evaluate our proposed approach. One of our key achievements in this work is reducing gyroscope calibration time by up to 89% using three low-cost gyroscopes.
comment: 10 Pages, 14 Figures
UDGS-SLAM: UniDepth Assisted Gaussian Splatting for Monocular SLAM
Recent advancements in monocular neural depth estimation, particularly those achieved by the UniDepth network, have prompted the investigation of integrating UniDepth within a Gaussian splatting framework for monocular SLAM. This study presents UDGS-SLAM, a novel approach that eliminates the necessity of RGB-D sensors for depth estimation within a Gaussian splatting framework. UDGS-SLAM employs statistical filtering to ensure local consistency of the estimated depth and jointly optimizes camera trajectory and Gaussian scene representation parameters. The proposed method achieves high-fidelity rendered images and low ATE RMSE of the camera trajectory. The performance of UDGS-SLAM is rigorously evaluated using the TUM RGB-D dataset and benchmarked against several baseline methods, demonstrating superior performance across various scenarios. Additionally, an ablation study is conducted to validate design choices and investigate the impact of different network backbone encoders on system performance.
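The averaging baseline described in the gyroscope calibration abstract above (not the proposed deep-learning method) can be sketched as follows. The sketch assumes three-axis readings in rad/s recorded while the sensor is stationary and neglects Earth's rotation rate, which is a simplification; the function names `estimate_gyro_bias` and `remove_bias` are hypothetical.

```python
from statistics import fmean


def estimate_gyro_bias(samples):
    """Estimate the per-axis bias as the mean of stationary readings.

    samples is a non-empty list of (wx, wy, wz) tuples in rad/s, recorded
    while the gyroscope is not moving (assumed input format).
    """
    if not samples:
        raise ValueError("at least one stationary sample is required")
    # zip(*samples) regroups the readings axis by axis.
    return tuple(fmean(axis) for axis in zip(*samples))


def remove_bias(sample, bias):
    """Subtract the estimated bias from a single reading."""
    return tuple(w - b for w, b in zip(sample, bias))


stationary = [(0.01, -0.02, 0.0), (0.03, -0.02, 0.0)]
bias = estimate_gyro_bias(stationary)
corrected = remove_bias((0.05, -0.02, 0.1), bias)
```

Averaging more samples reduces the influence of measurement noise on the bias estimate, which is why this baseline favours long calibration periods.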
Rapid and Robust Trajectory Optimization for Humanoids
Performing trajectory design for humanoid robots with high degrees of freedom is computationally challenging. The trajectory design process also often involves carefully selecting various hyperparameters and requires a good initial guess which can further complicate the development process. This work introduces a generalized gait optimization framework that directly generates smooth and physically feasible trajectories. The proposed method demonstrates faster and more robust convergence than existing techniques and explicitly incorporates closed-loop kinematic constraints that appear in many modern humanoids. The method is implemented as an open-source C++ codebase which can be found at https://roahmlab.github.io/RAPTOR/.
Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control
Model-based control requires an accurate model of the system dynamics for precisely and safely controlling the robot in complex and dynamic environments. Moreover, in the presence of variations in the operating conditions, the model should be continuously refined to compensate for dynamics changes. In this paper, we present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems. We combine offline learning from past experience and online learning from current robot interaction with the unknown environment. These two ingredients enable a highly sample-efficient and adaptive learning process, capable of accurately inferring model dynamics in real-time even in operating regimes that greatly differ from the training distribution. Moreover, we design an uncertainty-aware model predictive controller that is heuristically conditioned to the aleatoric (data) uncertainty of the learned dynamics. This controller actively chooses the optimal control actions that (i) optimize the control performance and (ii) improve the efficiency of online learning sample collection. We demonstrate the effectiveness of our method through a series of challenging real-world experiments using a quadrotor system. Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions, while it significantly outperforms classical and adaptive control baselines.
Robust Path Planning via Learning from Demonstrations for Robotic Catheters in Deformable Environments
Objective: Navigation through tortuous and deformable vessels using catheters with limited steering capability underscores the need for reliable path planning. State-of-the-art path planners do not fully account for the deformable nature of the environment. Methods: This work proposes a robust path planner via a learning from demonstrations method, named Curriculum Generative Adversarial Imitation Learning (C-GAIL). This path planning framework takes into account the interaction between steerable catheters and vessel walls and the deformable property of vessels. Results: In-silico comparative experiments show that the proposed network achieves a 38% higher success rate in static environments and 17% higher in dynamic environments compared to a state-of-the-art approach based on GAIL. In-vitro validation experiments indicate that the path generated by the proposed C-GAIL path planner achieves a targeting error of 1.26$\pm$0.55mm and a tracking error of 5.18$\pm$3.48mm. These results represent improvements of 41% and 40% over the conventional centerline-following technique for targeting error and tracking error, respectively. Conclusion: The proposed C-GAIL path planner outperforms the state-of-the-art GAIL approach. The in-vitro validation experiments demonstrate that the path generated by the proposed C-GAIL path planner aligns better with the actual steering capability of the pneumatic artificial muscle-driven catheter utilized in this study. Therefore, the proposed approach can provide enhanced support to the user in navigating the catheter towards the target with greater accuracy, effectively meeting clinical accuracy requirements. Significance: The proposed path planning framework exhibits superior performance in managing uncertainty associated with vessel deformation, thereby resulting in lower tracking errors.
comment: 11 pages, 10 figures
S3E: A Multi-Robot Multimodal Dataset for Collaborative SLAM
The burgeoning demand for collaborative robotic systems to execute complex tasks collectively has intensified the research community's focus on advancing simultaneous localization and mapping (SLAM) in a cooperative context. Despite this interest, the scalability and diversity of existing datasets for collaborative trajectories remain limited, especially in scenarios with constrained perspectives where the generalization capabilities of Collaborative SLAM (C-SLAM) are critical for the feasibility of multi-agent missions. Addressing this gap, we introduce S3E, an expansive multimodal dataset. Captured by a fleet of unmanned ground vehicles traversing four distinct collaborative trajectory paradigms, S3E encompasses 13 outdoor and 5 indoor sequences. These sequences feature meticulously synchronized and spatially calibrated data streams, including 360-degree LiDAR point clouds, high-resolution stereo imagery, high-frequency inertial measurement unit (IMU) data, and Ultra-wideband (UWB) relative observations. Our dataset not only surpasses previous efforts in scale, scene diversity, and data intricacy but also provides a thorough analysis and benchmarks for both collaborative and individual SLAM methodologies. For access to the dataset and the latest information, please visit our repository at https://pengyu-team.github.io/S3E.
Towards Safe Robot Use with Edged or Pointed Objects: A Surrogate Study Assembling a Human Hand Injury Protection Database
The use of pointed or edged tools or objects is one of the most challenging aspects of today's application of physical human-robot interaction (pHRI). One reason for this is that the severity of harm caused by such edged or pointed impactors is less well studied than for blunt impactors. Consequently, the standards specify well-reasoned force and pressure thresholds for blunt impactors and advise avoiding any edges and corners in contacts. Nevertheless, pointed or edged impactor geometries cannot be completely ruled out in real pHRI applications. For example, to allow edged or pointed tools such as screwdrivers near human operators, the knowledge of injury severity needs to be extended so that robot integrators can perform well-reasoned, time-efficient risk assessments. In this paper, we provide the initial datasets on injury prevention for the human hand based on drop tests with surrogates for the human hand, namely pig claws and chicken drumsticks. We then demonstrate, on two examples, how the dataset makes robot use in contact scenarios easy and efficient. Finally, our experiments provide a set of injuries that may also be expected for human subjects under certain robot mass-velocity constellations in collisions. To extend this work, testing on human samples and a collaborative effort from research institutes worldwide is needed to create a comprehensive human injury avoidance database for any pHRI scenario and thus for safe pHRI applications including edged and pointed geometries.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
A Learning Quasi-stiffness Control Framework of a Powered Trans-femoral Prosthesis for Adaptive Speed and Incline Walking
Impedance-based control is a prevalent strategy in powered trans-femoral prostheses because of its ability to reproduce natural walking. However, most existing studies have developed impedance-based prosthesis controllers for specific tasks, while creating a task-adaptive controller for variable-task walking continues to be a significant challenge. This article proposes a task-adaptive quasi-stiffness control framework for powered prostheses that generalizes across various walking tasks, comprising a torque-angle relationship reconstruction part and a quasi-stiffness controller design part. A Gaussian Process Regression model is introduced to predict the target features of the human joint angle and torque in a new task. Subsequently, Kernel Movement Primitives are employed to reconstruct the torque-angle relationship of the new task from multiple human reference trajectories and the estimated target features. Based on the torque-angle relationship of the new task, a quasi-stiffness control approach is designed for a powered prosthesis. Finally, the proposed framework is validated through practical examples, including walking tasks at varying speeds and inclines. Notably, the proposed framework not only matches but frequently surpasses the performance of a benchmark finite state machine impedance controller without requiring manual impedance tuning, and it has the potential to extend to variable walking tasks in daily life for trans-femoral amputees.
comment: 9 pages, 11 figures. This work has been submitted to the IEEE-TCDS for possible publication
Towards Unconstrained Collision Injury Protection Data Sets: Initial Surrogate Experiments for the Human Hand
Safety for physical human-robot interaction (pHRI) is a major concern for all application domains. While current standardization for industrial robot applications provides safety constraints that address the onset of pain in blunt impacts, these impact thresholds are difficult to use on edged or pointed impactors. The most severe injuries occur in constrained contact scenarios, where crushing is possible. Nevertheless, situations potentially resulting in constrained contact only occur in certain areas of a workspace, and design or organisational approaches can be used to avoid them. What remains are risks to human physical integrity caused by unconstrained accidental contacts, which are difficult to avoid while maintaining robot motion efficiency. Nevertheless, the probability and severity of injuries occurring with edged or pointed impacting objects in unconstrained collisions have hardly been researched. In this paper, we propose an experimental setup and procedure using two pendulums modeling human hands and arms and robots to understand the injury potential of unconstrained collisions of human hands with edged objects. Pig feet are used as ex vivo surrogate samples - as these closely resemble the physiological characteristics of human hands - to create an initial injury database on the severity of injuries caused by unconstrained edged or pointed impacts. For the effective mass range of typical lightweight robots, the data obtained show low probabilities of injuries such as skin cuts or bone/tendon injuries in unconstrained collisions when the velocity is reduced to < 0.5 m/s. The proposed experimental setups and procedures should be complemented by sufficient human modeling and will eventually lead to a complete understanding of the biomechanical injury potential in pHRI.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Harnessing the Potential of Omnidirectional Multi-Rotor Aerial Vehicles in Cooperative Jamming Against Eavesdropping
Recent research in communications-aware robotics has been propelled by advancements in 5G and emerging 6G technologies. This field now includes the integration of Multi-Rotor Aerial Vehicles (MRAVs) into cellular networks, with a specific focus on under-actuated MRAVs. These vehicles face challenges in independently controlling position and orientation due to their limited control inputs, which adversely affects communication metrics such as Signal-to-Noise Ratio. In response, a newer class of omnidirectional MRAVs has been developed, which can control both position and orientation simultaneously by tilting their propellers. However, exploiting this capability fully requires sophisticated motion planning techniques. This paper presents a novel application of omnidirectional MRAVs designed to enhance communication security and thwart eavesdropping. It proposes a strategy where one MRAV functions as an aerial Base Station, while another acts as a friendly jammer to secure communications. This study is the first to apply such a strategy to MRAVs in scenarios involving eavesdroppers.
comment: 7 pages, 4 figures, Accepted for presentation to the 2024 IEEE Global Communications Conference (IEEE GLOBECOM), Cape Town, South Africa. Copyright may be transferred without notice, after which this version may no longer be accessible
SeeThruFinger: See and Grasp Anything with a Multi-Modal Soft Touch
We present SeeThruFinger, a Vision-Based Tactile Sensing (VBTS) architecture using a markerless See-Thru-Network. It achieves simultaneous visual perception and tactile sensing while providing omni-directional, adaptive grasping for manipulation. Multi-modal perception of intrinsic and extrinsic interactions is critical in building intelligent robots that learn. Instead of adding various sensors for different modalities, a preferred solution is to integrate them into one elegant and coherent design, which is a challenging task. This study leverages the in-finger vision to inpaint occluded regions of the external environment, achieving coherent scene reconstruction for visual perception. By tracking real-time segmentation of the Soft Polyhedral Network's large-scale deformation, we achieved real-time markerless tactile sensing of 6D forces and torques. We further demonstrate the application of the SeeThruFinger for reactive grasping without using external cameras or dedicated force and torque sensors. As a result, our proposed SeeThruFinger architecture enables multi-modal perception via a single in-finger vision camera in a markerless way, including scene inpainting, object detection, segmentation tracking, and tactile sensing.
comment: 10 pages, 5 figures, 1 table
A Survey for Foundation Models in Autonomous Driving
The advent of foundation models has revolutionized the fields of natural language processing and computer vision, paving the way for their application in autonomous driving (AD). This survey presents a comprehensive review of more than 40 research papers, demonstrating the role of foundation models in enhancing AD. Large language models contribute to planning and simulation in AD, particularly through their proficiency in reasoning, code generation and translation. In parallel, vision foundation models are increasingly adapted for critical tasks such as 3D object detection and tracking, as well as creating realistic driving scenarios for simulation and testing. Multi-modal foundation models, integrating diverse inputs, exhibit exceptional visual understanding and spatial reasoning, crucial for end-to-end AD. This survey not only provides a structured taxonomy, categorizing foundation models based on their modalities and functionalities within the AD domain but also delves into the methods employed in current research. It identifies the gaps between existing foundation models and cutting-edge AD approaches, thereby charting future research directions and proposing a roadmap for bridging these gaps.
Robotics
Information-Based Trajectory Planning for Autonomous Absolute Tracking in Cislunar Space
The resurgence of lunar operations requires advancements in cislunar navigation and Space Situational Awareness (SSA). Challenges associated with these tasks have created an interest in autonomous planning, navigation, and tracking technologies that operate with little ground-based intervention. This research introduces a trajectory planning tool for a low-thrust mobile observer, aimed at maximizing navigation and tracking performance with satellite-to-satellite relative measurements. We formulate an expression for the information gathered over an observation period based on the mutual information between augmented observer/target states and the associated measurement set collected. We then develop an optimal trajectory design problem for a mobile observer, balancing information gain and control effort, and solve this problem with a Sequential Convex Programming (SCP) approach. The developed methods are applied to scenarios involving spacecraft in the cislunar regime, demonstrating the potential for improved autonomous navigation and tracking.
comment: 2024 AAS/AIAA Astrodynamics Specialist Conference
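For intuition about the mutual-information objective mentioned in the abstract above, the following worked equation may help. It is only a sketch under an assumption the abstract does not state: a linear-Gaussian measurement model $y = Hx + v$ with prior covariance $P$ on the augmented state $x$ and noise covariance $R$ for $v$. The symbols $H$, $P$, and $R$ are introduced here for illustration.

```latex
% Mutual information between state x and measurement y,
% assuming y = Hx + v, x ~ N(\bar{x}, P), v ~ N(0, R), x and v independent.
\begin{aligned}
I(x; y) &= h(y) - h(y \mid x) \\
        &= \tfrac{1}{2}\log\det\!\left(H P H^{\top} + R\right) - \tfrac{1}{2}\log\det R \\
        &= \tfrac{1}{2}\log\det\!\left(I + R^{-1} H P H^{\top}\right)
\end{aligned}
```

Under this assumed model, the information gain grows when the measurement geometry $H$ is aligned with directions in which the prior covariance $P$ is large relative to the noise $R$, which is the kind of trade-off a trajectory planner can exploit.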
Open-vocabulary Temporal Action Localization using VLMs
Video action localization aims to find the timings of a specific action in a long video. Although existing learning-based approaches have been successful, they require annotated videos, which come with a considerable labor cost. This paper proposes a learning-free, open-vocabulary approach based on emerging vision-language models (VLMs). The challenge stems from the fact that VLMs are neither designed to process long videos nor tailored for finding actions. We overcome these problems by extending an iterative visual prompting technique. Specifically, we sample video frames into a concatenated image with frame index labels and have a VLM guess the frame that is closest to the start/end of the action. Iterating this process while narrowing the sampling time window yields the specific start and end frames of an action. We demonstrate that this sampling technique yields reasonable results, illustrating a practical extension of VLMs for understanding videos.
comment: 7 pages, 5 figures, 4 tables. Last updated on August 30th, 2024
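The iterative window-narrowing idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `pick_frame` is a hypothetical stand-in for the VLM query, and the frame count, shrink factor, and iteration count are assumed values chosen only for the example.

```python
from typing import Callable, List


def sample_times(lo: float, hi: float, n: int) -> List[float]:
    """Return n evenly spaced timestamps covering [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def localize_boundary(
    pick_frame: Callable[[List[float]], int],
    duration: float,
    n_frames: int = 5,
    shrink: float = 0.5,
    iterations: int = 6,
) -> float:
    """Narrow a sampling window around the frame chosen in each round.

    pick_frame is a hypothetical stand-in for the VLM: it receives the
    sampled timestamps and returns the index of the one it judges
    closest to the action boundary.
    """
    lo, hi = 0.0, duration
    center = (lo + hi) / 2
    for _ in range(iterations):
        times = sample_times(lo, hi, n_frames)
        center = times[pick_frame(times)]
        # Shrink the window and re-center it on the chosen frame.
        half = (hi - lo) * shrink / 2
        lo = max(0.0, center - half)
        hi = min(duration, center + half)
    return center


def make_oracle(true_time: float) -> Callable[[List[float]], int]:
    """Toy replacement for the VLM: pick the timestamp nearest a known boundary."""
    return lambda times: min(
        range(len(times)), key=lambda i: abs(times[i] - true_time)
    )


# Locate a boundary at t = 37 s in a 120 s video using the toy oracle.
estimate = localize_boundary(make_oracle(37.0), duration=120.0)
```

With a perfect oracle the estimate converges towards the true boundary as the window shrinks; with a real VLM the per-round choice is noisy, so the shrink factor trades speed against the risk of excluding the true frame.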
EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution IROS 2024
Task planning for robots in real-life settings presents significant challenges. These challenges stem from three primary issues: the difficulty in identifying grounded sequences of steps to achieve a goal; the lack of a standardized mapping between high-level actions and low-level commands; and the challenge of maintaining low computational overhead given the limited resources of robotic hardware. We introduce EMPOWER, a framework designed for open-vocabulary online grounding and planning for embodied agents aimed at addressing these issues. By leveraging efficient pre-trained foundation models and a multi-role mechanism, EMPOWER demonstrates notable improvements in grounded planning and execution. Quantitative results highlight the effectiveness of our approach, achieving an average success rate of 0.73 across six different real-life scenarios using a TIAGo robot.
comment: Accepted at IROS 2024
Augmented Reality without Borders: Achieving Precise Localization Without Maps
Visual localization is crucial for Computer Vision and Augmented Reality (AR) applications, where determining the camera or device's position and orientation is essential to accurately interact with the physical environment. Traditional methods rely on detailed 3D maps constructed using Structure from Motion (SfM) or Simultaneous Localization and Mapping (SLAM), which is computationally expensive and impractical for dynamic or large-scale environments. We introduce MARLOC, a novel localization framework for AR applications that uses known relative transformations within image sequences to perform intra-sequence triangulation, generating 3D-2D correspondences for pose estimation and refinement. MARLOC eliminates the need for pre-built SfM maps, providing accurate and efficient localization suitable for dynamic outdoor environments. Evaluation with benchmark datasets and real-world experiments demonstrates MARLOC's state-of-the-art performance and robustness. By integrating MARLOC into an AR device, we highlight its capability to achieve precise localization in real-world outdoor scenarios, showcasing its practical effectiveness and potential to enhance visual localization in AR applications.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling
Predicting and executing a sequence of actions without intermediate replanning, known as action chunking, is increasingly used in robot learning from human demonstrations. However, its effects on learned policies remain puzzling: some studies highlight its importance for achieving strong performance, while others observe detrimental effects. In this paper, we first dissect the role of action chunking by analyzing the divergence between the learner and the demonstrator. We find that longer action chunks enable a policy to better capture temporal dependencies by taking into account more past states and actions within the chunk. However, this advantage comes at the cost of exacerbating errors in stochastic environments due to fewer observations of recent states. To address this, we propose Bidirectional Decoding (BID), a test-time inference algorithm that bridges action chunking with closed-loop operations. BID samples multiple predictions at each time step and searches for the optimal one based on two criteria: (i) backward coherence, which favors samples aligned with previous decisions, (ii) forward contrast, which favors samples close to outputs of a stronger policy and distant from those of a weaker policy. By coupling decisions within and across action chunks, BID enhances temporal consistency over extended sequences while enabling adaptive replanning in stochastic environments. Experimental results show that BID substantially outperforms conventional closed-loop operations of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
comment: Project website: https://bid-robot.github.io/
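The two selection criteria named in the abstract above can be illustrated with a toy scoring rule. This is a sketch under stated assumptions, not the authors' algorithm: the squared-distance measure, the equal weighting of the terms, the 1-D actions, and every name in the code are hypothetical choices made for the example.

```python
from typing import List, Sequence

Chunk = List[float]  # a 1-D action sequence, for illustration only


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference over the overlapping prefix of two sequences."""
    n = min(len(a), len(b))
    return sum((x - y) ** 2 for x, y in zip(a[:n], b[:n])) / n


def select_chunk(
    candidates: List[Chunk],
    previous_tail: Chunk,
    strong_samples: List[Chunk],
    weak_samples: List[Chunk],
) -> Chunk:
    """Pick the candidate with the lowest combined score.

    backward term: distance to the not-yet-executed part of the previous plan.
    forward term: mean distance to strong-policy samples minus mean
    distance to weak-policy samples (assumed equal weights).
    """

    def score(c: Chunk) -> float:
        backward = distance(c, previous_tail)
        to_strong = sum(distance(c, s) for s in strong_samples) / len(strong_samples)
        to_weak = sum(distance(c, w) for w in weak_samples) / len(weak_samples)
        return backward + to_strong - to_weak

    return min(candidates, key=score)


candidates = [[0.5, 0.6, 0.7], [0.0, -0.1, 0.2], [0.2, 0.2, 0.2]]
best = select_chunk(
    candidates,
    previous_tail=[0.5, 0.6],
    strong_samples=[[0.5, 0.65, 0.7]],
    weak_samples=[[0.5, 0.2, 0.0]],
)
```

In this toy setup the first candidate is selected because it continues the previous plan exactly and lies close to the strong-policy sample while staying away from the weak one.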
Optimizing Interaction Space: Enlarging the Capture Volume for Multiple Portable Motion Capture Devices
Markerless motion capture devices such as the Leap Motion Controller (LMC) have been extensively used for tracking hand, wrist, and forearm positions as an alternative to Marker-based Motion Capture (MMC). However, previous studies have highlighted the subpar performance of LMC in reliably recording hand kinematics. In this study, we employ four LMC devices to optimize their collective tracking volume, aiming to enhance the accuracy and precision of hand kinematics. Through Monte Carlo simulation, we determine an optimized layout for the four LMC devices and subsequently conduct reliability and validity experiments encompassing 1560 trials across ten subjects. The combined tracking volume is validated against an MMC system, particularly for kinematic movements involving wrist, index, and thumb flexion. Using the computational resources of a single computer, the optimized configuration achieves a better visibility rate (0.05 $\pm$ 0.55) than the initial configuration (-0.07 $\pm$ 0.40). Multiple LMCs are shown to enlarge the interaction space of the capture volume, but they are still unable to give agreeable measurements for dynamic movements.
comment: This paper has eight pages and five figures. It has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. The code used in this work is available at https://github.com/hilmanfatoni/Multi-LMC_Optimization
NanoMVG: USV-Centric Low-Power Multi-Task Visual Grounding based on Prompt-Guided Camera and 4D mmWave Radar
Recently, visual grounding and multi-sensor settings have been incorporated into perception systems for terrestrial autonomous driving and Unmanned Surface Vehicles (USVs), yet the high complexity of modern learning-based, multi-sensor visual grounding models prevents them from being deployed on USVs in real life. To this end, we design a low-power multi-task model named NanoMVG for waterway embodied perception, guiding both camera and 4D millimeter-wave radar to locate specific object(s) through natural language. NanoMVG can perform both box-level and mask-level visual grounding tasks simultaneously. Compared to other visual grounding models, NanoMVG achieves highly competitive performance on the WaterVG dataset, particularly in harsh environments, and boasts ultra-low power consumption for long endurance.
comment: 8 pages, 6 figures
Non-verbal Interaction and Interface with a Quadruped Robot using Body and Hand Gestures: Design and User Experience Evaluation
In recent years, quadruped robots have attracted significant attention due to their practical advantages in maneuverability, particularly when navigating rough terrain and climbing stairs. As these robots become more integrated into various industries, including construction and healthcare, researchers have increasingly focused on developing intuitive interaction methods such as speech and gestures that do not require separate devices such as keyboards or joysticks. This paper investigates a comfortable and efficient interaction method for quadruped robots that possess a familiar form factor. To this end, we conducted two preliminary studies to observe how individuals naturally interact with a quadruped robot in natural and controlled settings, followed by a prototype experiment to examine human preferences for body-based and hand-based gesture controls using a Unitree Go1 Pro quadruped robot. We assessed the user experience of 13 participants using the User Experience Questionnaire and measured the time taken to complete specific tasks. Our preliminary findings indicate that humans have a natural preference for communicating with robots through hand and body gestures rather than speech. In addition, participants reported higher satisfaction and completed tasks more quickly when using body gestures to interact with the robot. This contrasts with the fact that most gesture-based control technologies for quadruped robots are hand-based. The video is available at https://youtu.be/rysv1p1zvp4.
comment: 16 pages
Robotic Object Insertion with a Soft Wrist through Sim-to-Real Privileged Training IROS 2024
This study addresses contact-rich object insertion tasks under unstructured environments using a robot with a soft wrist, enabling safe contact interactions. For the unstructured environments, we assume that there are uncertainties in object grasp and hole pose and that the soft wrist pose cannot be directly measured. Recent methods employ learning approaches and force/torque sensors for contact localization; however, they require data collection in the real world. This study proposes a sim-to-real approach using a privileged training strategy. This method has two steps. 1) The teacher policy is trained to complete the task with sensor inputs and ground truth privileged information such as the peg pose, and then 2) the student encoder is trained with data produced from teacher policy rollouts to estimate the privileged information from sensor history. We performed sim-to-real experiments under grasp and hole pose uncertainties. This resulted in 100%, 95%, and 80% success rates for circular peg insertion with 0, +5, and -5 degree peg misalignments, respectively, and start positions randomly shifted $\pm$ 10 mm from a default position. Also, we tested the proposed method with a square peg that was never seen during training. Additional simulation evaluations revealed that using the privileged strategy improved success rates compared to training with only simulated sensor data. Our results demonstrate the advantage of using sim-to-real privileged training for soft robots, which has the potential to alleviate human engineering efforts for robotic assembly.
comment: This paper has been accepted at IROS 2024
Generative Modeling Perspective for Control and Reasoning in Robotics
Heralded by the initial success in speech recognition and image classification, learning-based approaches with neural networks, commonly referred to as deep learning, have spread across various fields. A primitive form of a neural network functions as a deterministic mapping from one vector to another, parameterized by trainable weights. This is well suited for point estimation in which the model learns a one-to-one mapping (e.g., mapping a front camera view to a steering angle) that is required to solve the task of interest. Although learning such a deterministic, one-to-one mapping is effective, there are scenarios where modeling multimodal data distributions, namely learning one-to-many relationships, is helpful or even necessary. In this thesis, we adopt a generative modeling perspective on robotics problems. Generative models learn and produce samples from multimodal distributions, rather than performing point estimation. We will explore the advantages this perspective offers for three topics in robotics.
comment: arXiv admin note: text overlap with arXiv:2302.12244
MakeWay: Object-Aware Costmaps for Proactive Indoor Navigation Using LiDAR
In this paper, we introduce a LiDAR-based robot navigation system built on novel object-aware, affordance-based costmaps. Utilizing a 3D object detection network, our system identifies objects of interest in LiDAR keyframes, refines their 3D poses with the Iterative Closest Point (ICP) algorithm, and tracks them via Kalman filters and the Hungarian algorithm for data association. It then updates existing object poses with new associated detections and creates new object maps for unmatched detections. Using the maintained object-level mapping system, our system creates affordance-driven object costmaps for proactive collision avoidance in path planning. Additionally, we address the scarcity of indoor semantic LiDAR data by introducing an automated labeling technique. This method utilizes a CAD model database for accurate ground-truth annotations, encompassing bounding boxes, positions, orientations, and point-wise semantics of each object in LiDAR sequences. Our extensive evaluations, conducted on both simulated and real-world robot platforms, highlight the effectiveness of proactive object avoidance using object affordance costmaps, enhancing robotic navigation safety and efficiency. The system can operate in real time onboard, and we intend to release our code and data for public use.
comment: 8 pages, 11 figures
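The data-association step described in the abstract above (match tracked objects to new detections, then create new objects for unmatched detections) can be sketched as follows. Note the swapped-in technique: this example finds the minimum-cost assignment by exhaustive search over permutations, which gives the same optimum as the Hungarian algorithm on small inputs but does not scale. The Kalman filtering and ICP refinement are omitted, and the gating threshold and all names are assumptions for illustration.

```python
from itertools import permutations
from math import dist
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def associate(
    tracks: List[Point], detections: List[Point], gate: float
) -> Tuple[Dict[int, int], List[int]]:
    """Return (matches, new_detections).

    matches maps track index -> detection index for the assignment with
    minimum total Euclidean distance; pairs farther apart than `gate`
    are dropped. new_detections lists unmatched detection indices.
    """
    n_t, n_d = len(tracks), len(detections)
    # Enumerate every one-to-one pairing of the smaller set into the larger.
    if n_t <= n_d:
        options = (list(enumerate(p)) for p in permutations(range(n_d), n_t))
    else:
        options = (
            [(t, d) for d, t in enumerate(p)]
            for p in permutations(range(n_t), n_d)
        )
    best = min(
        options,
        key=lambda pairs: sum(dist(tracks[t], detections[d]) for t, d in pairs),
    )
    matches = {t: d for t, d in best if dist(tracks[t], detections[d]) <= gate}
    matched = set(matches.values())
    new_detections = [d for d in range(n_d) if d not in matched]
    return matches, new_detections


tracks = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
detections = [(5.2, 0.0, 0.0), (0.1, 0.0, 0.0), (20.0, 0.0, 0.0)]
matches, new_detections = associate(tracks, detections, gate=1.0)
```

Here both tracks are matched to their nearby detections, and the distant third detection is returned as unmatched, which is where a system of this kind would start a new object.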
Efficient Camera Exposure Control for Visual Odometry via Deep Reinforcement Learning
The stability of visual odometry (VO) systems is undermined by degraded image quality, especially in environments with significant illumination changes. This study employs a deep reinforcement learning (DRL) framework to train agents for exposure control, aiming to enhance imaging performance in challenging conditions. A lightweight image simulator is developed to facilitate the training process, enabling the diversification of image exposure and sequence trajectory. This setup enables completely offline training, eliminating the need for direct interaction with camera hardware and real environments. Different levels of reward functions are crafted to enhance the VO systems, equipping the DRL agents with varying intelligence. Extensive experiments have shown that our exposure control agents achieve superior efficiency (an average inference duration of 1.58 ms per frame on a CPU) and respond more quickly than traditional feedback control schemes. By choosing an appropriate reward function, agents acquire an intelligent understanding of motion trends and anticipate future illumination changes. This predictive capability allows VO systems to deliver more stable and precise odometry results. The code and datasets are available at https://github.com/ShuyangUni/drl_exposure_ctrl.
comment: 8 pages, 7 figures
From "Made In" to Mukokuseki: Exploring the Visual Perception of National Identity in Robots
People read human characteristics into the design of social robots, a visual process with socio-cultural implications. One factor may be nationality, a complex social characteristic that is linked to ethnicity, culture, and other factors of identity that can be embedded in the visual design of robots. Guided by social identity theory (SIT), we explored the notion of "mukokuseki," a visual design characteristic defined by the absence of visual cues to national and ethnic identity in Japanese cultural exports. In a two-phase categorization study (n=212), American (n=110) and Japanese (n=92) participants rated a random selection of nine robot stimuli from America and Japan, plus multinational Pepper. We found evidence of made-in and two kinds of mukokuseki effects. We offer suggestions for the visual design of mukokuseki robots that may interact with people from diverse backgrounds. Our findings have implications for robots and social identity, the viability of robotic exports, and the use of robots internationally.
comment: 35 pages
Addressing the challenges of loop detection in agricultural environments
While visual SLAM systems are well studied and achieve impressive results in indoor and urban settings, natural, outdoor and open-field environments are much less explored and still present relevant research challenges. Visual navigation and local mapping have shown a relatively good performance in open-field environments. However, globally consistent mapping and long-term localization still depend on the robustness of loop detection and closure, for which the literature is scarce. In this work we propose a novel method to pave the way towards robust loop detection in open fields, particularly in agricultural settings, based on local feature search and stereo geometric refinement, with a final stage of relative pose estimation. Our method consistently achieves good loop detections, with a median error of 15 cm. We aim to characterize open fields as a novel environment for loop detection, understanding the limitations and problems that arise when dealing with them.
Evaluation and Deployment of LiDAR-based Place Recognition in Dense Forests
Many LiDAR place recognition systems have been developed and tested specifically for urban driving scenarios. Their performance in natural environments such as forests and woodlands has been studied less closely. In this paper, we analyzed the capabilities of four different LiDAR place recognition systems, both handcrafted and learning-based methods, using LiDAR data collected with a handheld device and a legged robot within dense forest environments. In particular, we focused on evaluating localization where there is a significant translational and orientation difference between corresponding LiDAR scan pairs. This is particularly important for forest survey systems where the sensor or robot does not follow a defined road or path. Extending our analysis, we then incorporated the best performing approach, Logg3dNet, into a full 6-DoF pose estimation system -- introducing several verification layers for precise registration. We demonstrated the performance of our methods in three operational modes: online SLAM, offline multi-mission SLAM map merging, and relocalization into a prior map. We evaluated these modes using data captured in forests from three different countries, achieving 80% correct loop-closure candidates with baseline distances up to 5 m, and 60% up to 10 m. Video at: https://youtu.be/86l-oxjwmjY
DeformGS: Scene Flow in Highly Deformable Scenes for Deformable Object Manipulation
Teaching robots to fold, drape, or reposition deformable objects such as cloth will unlock a variety of automation applications. While remarkable progress has been made for rigid object manipulation, manipulating deformable objects poses unique challenges, including frequent occlusions, infinite-dimensional state spaces and complex dynamics. Just as object pose estimation and tracking have aided robots for rigid manipulation, dense 3D tracking (scene flow) of highly deformable objects will enable new applications in robotics while aiding existing approaches, such as imitation learning or creating digital twins with real2sim transfer. We propose DeformGS, an approach to recover scene flow in highly deformable scenes, using simultaneous video captures of a dynamic scene from multiple cameras. DeformGS builds on recent advances in Gaussian splatting, a method that learns the properties of a large number of Gaussians for state-of-the-art and fast novel-view synthesis. DeformGS learns a deformation function to project a set of Gaussians with canonical properties into world space. The deformation function uses a neural-voxel encoding and a multilayer perceptron (MLP) to infer Gaussian position, rotation, and a shadow scalar. We enforce physics-inspired regularization terms based on conservation of momentum and isometry, which leads to trajectories with smaller trajectory errors. We also leverage existing foundation models SAM and XMEM to produce noisy masks, and learn a per-Gaussian mask for better physics-inspired regularization. DeformGS achieves high-quality 3D tracking on highly deformable scenes with shadows and occlusions. In experiments, DeformGS improves 3D tracking by an average of 55.8% compared to the state-of-the-art. With sufficient texture, DeformGS achieves a median tracking error of 3.3 mm on a cloth of 1.5 x 1.5 m in area. Website: https://deformgs.github.io
Scalable Multi-Agent Reinforcement Learning for Warehouse Logistics with Robotic and Human Co-Workers IROS
We consider a warehouse in which dozens of mobile robots and human pickers work together to collect and deliver items within the warehouse. The fundamental problem we tackle, called the order-picking problem, is how these worker agents must coordinate their movement and actions in the warehouse to maximise performance in this task. Established industry methods using heuristic approaches require large engineering efforts to optimise for innately variable warehouse configurations. In contrast, multi-agent reinforcement learning (MARL) can be flexibly applied to diverse warehouse configurations (e.g. size, layout, number/types of workers, item replenishment frequency), and different types of order-picking paradigms (e.g. Goods-to-Person and Person-to-Goods), as the agents can learn how to cooperate optimally through experience. We develop hierarchical MARL algorithms in which a manager agent assigns goals to worker agents, and the policies of the manager and workers are co-trained toward maximising a global objective (e.g. pick rate). Our hierarchical algorithms achieve significant gains in sample efficiency over baseline MARL algorithms and overall pick rates over multiple established industry heuristics in a diverse set of warehouse configurations and different order-picking paradigms.
comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Optimal and Bounded Suboptimal Any-Angle Multi-agent Pathfinding IROS 2024
Multi-agent pathfinding (MAPF) is the problem of finding a set of conflict-free paths for a set of agents. Typically, the agents' moves are limited to a pre-defined graph of possible locations and allowed transitions between them, e.g. a 4-neighborhood grid. We explore how to solve MAPF problems when each agent can move between any pair of possible locations as long as traversing the line segment connecting them does not lead to a collision with the obstacles. This is known as any-angle pathfinding. We present the first optimal any-angle multi-agent pathfinding algorithm. Our planner is based on the Continuous Conflict-based Search (CCBS) algorithm and an optimal any-angle variant of the Safe Interval Path Planning (TO-AA-SIPP). The straightforward combination of those, however, scales poorly since any-angle path finding induces search trees with a very large branching factor. To mitigate this, we adapt two techniques from classical MAPF to the any-angle setting, namely Disjoint Splitting and Multi-Constraints. Experimental results on different combinations of these techniques show they enable solving over 30% more problems than the vanilla combination of CCBS and TO-AA-SIPP. In addition, we present a bounded-suboptimal variant of our algorithm, that enables trading runtime for solution cost in a controlled manner.
comment: This is a pre-print version of the paper accepted to IROS 2024. Its main body is similar to the camera-ready version of the conference paper. In addition, this pre-print contains an appendix
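The any-angle move rule described in the entry above (a move between two locations is allowed if the connecting line segment does not collide with obstacles) can be illustrated with a small geometric check. This is a minimal sketch under stated assumptions, not the paper's planner: obstacles are modeled as discs and the agent as a disc of radius `agent_radius`, and the names `segment_point_distance` and `move_is_valid` are hypothetical.

```python
import math

def segment_point_distance(a, b, p):
    """Shortest distance from point p to the segment a-b (2D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def move_is_valid(a, b, obstacles, agent_radius):
    """True if a disc agent can traverse a-b without touching any disc obstacle.

    obstacles is a list of (center, radius) pairs; this disc model is an
    assumption made for illustration only.
    """
    return all(
        segment_point_distance(a, b, center) >= radius + agent_radius
        for center, radius in obstacles
    )

obstacles = [((2.0, 1.0), 0.5)]
clear_move = move_is_valid((0.0, 0.0), (4.0, 0.0), obstacles, agent_radius=0.3)
blocked_move = move_is_valid((0.0, 0.0), (4.0, 2.0), obstacles, agent_radius=0.3)
```

Because any pair of mutually visible locations is a candidate move, the number of successors per state grows with the number of locations, which is the branching-factor issue the abstract mentions.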
Localization Under Consistent Assumptions Over Dynamics
Accurate maps are a prerequisite for virtually all mobile robot tasks. Most state-of-the-art maps assume a static world; therefore, dynamic objects are filtered out of the measurements. However, this division ignores movable but non-moving -- i.e., semi-static -- objects, which are usually recorded in the map and treated as static objects, violating the static world assumption and causing errors in the localization. This paper presents a method for consistently modeling moving and movable objects to match the map and measurements. This reduces the error resulting from inconsistent categorization and treatment of non-static measurements. A semantic segmentation network is used to categorize the measurements into static and semi-static classes, and a background subtraction filter is used to remove dynamic measurements. Finally, we show that consistent assumptions over dynamics improve localization accuracy when compared against a state-of-the-art baseline solution using real-world data from the Oxford Radar RobotCar data set.
comment: IEEE-MFI 2024
Object-Oriented Grid Mapping in Dynamic Environments
Grid maps, especially occupancy grid maps, are ubiquitous in many mobile robot applications. To simplify the process of learning the map, grid maps subdivide the world into a grid of cells whose occupancies are independently estimated using measurements in the perceptual field of the particular cell. However, the world consists of objects that span multiple cells, which means that measurements falling onto a cell provide evidence of the occupancy of other cells belonging to the same object. Current models do not capture this correlation and, therefore, do not use object-level information for estimating the state of the environment. In this work, we present a way to generalize the update of grid maps, relaxing the assumption of independence. We propose modeling the relationship between the measurements and the occupancy of each cell as a set of latent variables and jointly estimate those variables and the posterior of the map. We propose a method to estimate the latent variables by clustering based on semantic labels and an extension to the Normal Distributions Transform Occupancy Map (NDT-OM) to facilitate the proposed map update method. We perform comprehensive map creation and localization experiments with real-world data sets and show that the proposed method creates better maps in highly dynamic environments compared to state-of-the-art methods. Finally, we demonstrate the ability of the proposed method to remove occluded objects from the map in a lifelong map update scenario.
comment: IEEE-MFI 2024
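The independence assumption that the entry above sets out to relax can be made concrete with the textbook log-odds occupancy update, in which each cell is updated only by measurements that fall onto it. This is a minimal sketch of that standard baseline, not the proposed method; the inverse sensor model value 0.7 and the function names are assumptions.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def update_cell(log_odds, p_occ_given_z, prior=0.5):
    """Standard per-cell log-odds update; each cell is treated independently."""
    return log_odds + logit(p_occ_given_z) - logit(prior)

def probability(log_odds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# 3 x 3 grid initialised at the prior (log-odds 0 corresponds to p = 0.5).
grid = [[0.0 for _ in range(3)] for _ in range(3)]

# Two "occupied" measurements fall onto cell (1, 1) only.
for _ in range(2):
    grid[1][1] = update_cell(grid[1][1], 0.7)

p_hit = probability(grid[1][1])       # rises above the prior
p_neighbor = probability(grid[1][2])  # stays at the prior
```

The neighboring cell stays at the prior even if it belongs to the same object as the measured cell, which is the correlation the abstract says current models do not capture.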
RoadRunner -- Learning Traversability Estimation for Autonomous Off-road Driving
Autonomous navigation at high speeds in off-road environments requires robots to comprehensively understand their surroundings using onboard sensing only. The extreme conditions posed by the off-road setting can cause degraded camera image quality due to poor lighting and motion blur, as well as limited sparse geometric information available from LiDAR sensing when driving at high speeds. In this work, we present RoadRunner, a novel framework capable of predicting terrain traversability and an elevation map directly from camera and LiDAR sensor inputs. RoadRunner enables reliable autonomous navigation by fusing sensory information, handling uncertainty, and generating contextually informed predictions about the geometry and traversability of the terrain while operating at low latency. In contrast to existing methods relying on classifying handcrafted semantic classes and using heuristics to predict traversability costs, our method is trained end-to-end in a self-supervised fashion. The RoadRunner network architecture builds upon popular sensor fusion network architectures from the autonomous driving domain, which embed LiDAR and camera information into a common Bird's Eye View perspective. Training is enabled by utilizing an existing traversability estimation stack to generate training data in hindsight in a scalable manner from real-world off-road driving datasets. Furthermore, RoadRunner improves the system latency by a factor of roughly 4, from 500 ms to 140 ms, while improving the accuracy for traversability costs and elevation map predictions. We demonstrate the effectiveness of RoadRunner in enabling safe and reliable off-road navigation at high speeds in multiple real-world driving scenarios through unstructured desert environments.
comment: accepted for IEEE Transactions on Field Robotics (T-FR)
Revisiting Reward Design and Evaluation for Robust Humanoid Standing and Walking
A necessary capability for humanoid robots is the ability to stand and walk while rejecting natural disturbances. Recent progress has been made using sim-to-real reinforcement learning (RL) to train such locomotion controllers, with approaches differing mainly in their reward functions. However, prior works lack a clear method to systematically test new reward functions and compare controller performance through repeatable experiments. This limits our understanding of the trade-offs between approaches and hinders progress. To address this, we propose a low-cost, quantitative benchmarking method to evaluate and compare the real-world performance of standing and walking (SaW) controllers on metrics like command following, disturbance recovery, and energy efficiency. We also revisit reward function design and construct a minimally constraining reward function to train SaW controllers. We experimentally verify that our benchmarking framework can identify areas for improvement, which can be systematically addressed to enhance the policies. We also compare our new controller to state-of-the-art controllers on the Digit humanoid robot. The results provide clear quantitative trade-offs among the controllers and suggest directions for future improvements to the reward functions and expansion of the benchmarks.
comment: 8 pages, 5 figs
Motion Polynomials Admitting a Factorization with Linear Factors
Motion polynomials (polynomials over the dual quaternions with nonzero real norm) describe rational motions. We present a necessary and sufficient condition for reduced bounded motion polynomials to admit factorizations into monic linear factors, and we give an algorithm to compute them. We can use those linear factors to construct mechanisms because the factorization corresponds to the decomposition of the rational motion into simple rotations or translations. Bounded motion polynomials always admit a factorization into linear factors after multiplying with a suitable real or quaternion polynomial. Our criterion for factorizability allows us to improve on earlier algorithms to compute a suitable real or quaternion polynomial co-factor.
Control of Unknown Quadrotors from a Single Throw IROS 2024
This paper presents a method to recover a quadrotor UAV from a throw when no control parameters are known beforehand. We leverage the high-frequency rotor speed feedback available in racing drone hardware and software to find control effectiveness values and fit a motor model using recursive least squares (RLS) estimation. Furthermore, we propose an excitation sequence that provides large actuation commands while guaranteeing that the craft stays within gyroscope sensing limits. After 450 ms of excitation, an INDI attitude controller uses the 52 fitted parameters to arrest rotational motion and recover an upright attitude. Finally, an NDI position controller drives the craft to a position setpoint. The proposed algorithm runs efficiently on microcontrollers found in common UAV flight controllers, and was shown to recover an agile quadrotor every time in 57 live experiments with throw heights as low as 3.5 m, demonstrating robustness against initial rotations and noise. We also demonstrate control of randomized quadrotors in simulated throws, where the parameter fitting RMS error is typically within 10% of the true value. This work has been submitted to IROS 2024 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
comment: 7 pages, 5 figures, 2 tables. Submitted to the IROS 2024 conference
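The recursive least squares (RLS) estimation named in the entry above can be sketched in its generic textbook form. This is only an illustration under stated assumptions: the two-parameter linear model, the synthetic noise-free data, and the name `rls_update` are hypothetical and do not reproduce the paper's motor model or its 52 parameters.

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=1.0):
    """One RLS step for the scalar model y = phi^T theta.

    theta: (n, 1) current estimate, P: (n, n) covariance-like matrix,
    phi: (n,) regressor, y: scalar measurement, lam: forgetting factor.
    """
    phi = phi.reshape(-1, 1)
    gain = P @ phi / (lam + (phi.T @ P @ phi).item())
    error = y - (phi.T @ theta).item()       # prediction error before the update
    theta = theta + gain * error
    P = (P - gain @ phi.T @ P) / lam
    return theta, P

# Synthetic, noise-free data from a hypothetical 2-parameter model.
rng = np.random.default_rng(0)
true_theta = np.array([[2.0], [-1.0]])
theta = np.zeros((2, 1))
P = 1e6 * np.eye(2)                          # large initial P: weak prior on theta
for _ in range(50):
    phi = rng.normal(size=2)
    y = (phi @ true_theta).item()
    theta, P = rls_update(theta, P, phi, y)
```

Each step costs a handful of small matrix-vector products and needs no stored history, which is consistent with the abstract's remark that the algorithm runs on flight-controller microcontrollers.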
AlabOS: A Python-based Reconfigurable Workflow Management Framework for Autonomous Laboratories
The recent advent of autonomous laboratories, coupled with algorithms for high-throughput screening and active learning, promises to accelerate materials discovery and innovation. As these autonomous systems grow in complexity, the demand for robust and efficient workflow management software becomes increasingly critical. In this paper, we introduce AlabOS, a general-purpose software framework for orchestrating experiments and managing resources, with an emphasis on automated laboratories for materials synthesis and characterization. AlabOS features a reconfigurable experiment workflow model and a resource reservation mechanism, enabling the simultaneous execution of varied workflows composed of modular tasks while eliminating conflicts between tasks. To showcase its capability, we demonstrate the implementation of AlabOS in a prototype autonomous materials laboratory, A-Lab, with around 3,500 samples synthesized over 1.5 years.
comment: 34 pages, 5 figures
DTG: Diffusion-based Trajectory Generation for Mapless Global Navigation
We present a novel end-to-end diffusion-based trajectory generation method, DTG, for mapless global navigation in challenging outdoor scenarios with occlusions and unstructured off-road features like grass, buildings, bushes, etc. Given a distant goal, our approach computes a trajectory that satisfies the following goals: (1) minimize the travel distance to the goal; (2) maximize the traversability by choosing paths that do not lie in undesirable areas. Specifically, we present a novel Conditional RNN (CRNN) for diffusion models to efficiently generate trajectories. Furthermore, we propose an adaptive training method that ensures that the diffusion model generates more traversable trajectories. We evaluate our methods in various outdoor scenes and compare the performance with other global navigation algorithms on a Husky robot. In practice, we observe at least a 15% improvement in traveling distance and around a 7% improvement in traversability.
comment: 10 pages
Perfecting Periodic Trajectory Tracking: Model Predictive Control with a Periodic Observer ($Π$-MPC) IROS 2024
In Model Predictive Control (MPC), discrepancies between the actual system and the predictive model can lead to substantial tracking errors and significantly degrade performance and reliability. While such discrepancies can be alleviated with more complex models, this often complicates controller design and implementation. By leveraging the fact that many trajectories of interest are periodic, we show that perfect tracking is possible when incorporating a simple observer that estimates and compensates for periodic disturbances. We present the design of the observer and the accompanying tracking MPC scheme, proving that their combination achieves zero tracking error asymptotically, regardless of the complexity of the unmodelled dynamics. We validate the effectiveness of our method, demonstrating asymptotically perfect tracking on a high-dimensional soft robot with nearly 10,000 states and a fivefold reduction in tracking errors compared to a baseline MPC on small-scale autonomous race car experiments.
comment: 8 pages, 3 figures; 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
HeROS: a miniaturised platform for research and development on Heterogeneous RObotic Systems
Tests and prototyping are vital in the research and development of robotic systems, but work with target hardware is problematic. Hence, this article presents a low-cost, miniaturised physical platform for experiments on heterogeneous robotic systems. The platform comprises a physical board with tiles of the standardised base, diverse mobile robots, and manipulation robots. A number of exemplary applications validate the usefulness of the solution.
Decentralized Adaptive Aerospace Transportation of Unknown Loads Using A Team of Robots
Transportation missions in aerospace are limited by the capability of each aerospace robot and the properties of the transported object, such as mass, inertia, and grasping locations. We present a novel decentralized adaptive controller design for multiple robots that can be implemented in different kinds of aerospace robots. Our controller adapts to unknown objects in different gravity environments. We validate our method in an aerial scenario using multiple fully actuated hexarotors with grasping capabilities, and a space scenario using a group of space tugs. In both scenarios, the robots transport a payload cooperatively through desired three-dimensional trajectories. We show that our method can adapt to unexpected changes that include the loss of robots during the transportation mission.
comment: This paper has been accepted by the DARS2024 Conference. Permission to post the preprint version on arXiv has been granted by the DARS2024 Committee and Springer Press
Rico: extended TIAGo robot towards up-to-date social and assistive robot usage scenarios
Social and assistive robotics have vastly increased in popularity in recent years. Due to the wide range of usage, robots executing such tasks must be highly reliable and possess enough functions to satisfy multiple scenarios. This article describes Rico, a mobile, artificial intelligence-driven robotic platform. Its prior use in similar scenarios, its range of capabilities, and the experiments presented should qualify it as a suitable arm-less platform for social and assistive settings.
comment: PP-RAI 2024, 5th Polish Conference on Artificial Intelligence, 18-20.04.2024 Warsaw, Poland
Saltation Matrices: The Essential Tool for Linearizing Hybrid Dynamical Systems
Hybrid dynamical systems, i.e. systems that have both continuous and discrete states, are ubiquitous in engineering, but are difficult to work with due to their discontinuous transitions. For example, a robot leg is able to exert very little control effort while it is in the air compared to when it is on the ground. When the leg hits the ground, the penetrating velocity instantaneously collapses to zero. These instantaneous changes in dynamics and discontinuities (or jumps) in state make standard smooth tools for planning, estimation, control, and learning difficult for hybrid systems. One of the key tools for accounting for these jumps is called the saltation matrix. The saltation matrix is the sensitivity update when a hybrid jump occurs and has been used in a variety of fields including robotics, power circuits, and computational neuroscience. This paper presents an intuitive derivation of the saltation matrix and discusses what it captures, where it has been used in the past, how it is used for linear and quadratic forms, how it is computed for rigid body systems with unilateral constraints, and some of the structural properties of the saltation matrix in these cases.
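As a worked illustration of the saltation matrix discussed in the entry above, the sketch below evaluates the commonly used expression Xi = D_x R + (F_plus - D_x R F_minus - D_t R) D_x g / (D_t g + D_x g F_minus) for a one-dimensional bouncing ball. The ball model, the restitution coefficient `e`, the impact velocity, and the function name are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

def saltation_matrix(DxR, DtR, F_minus, F_plus, Dxg, Dtg):
    """Saltation matrix for a hybrid transition with guard g and reset map R.

    DxR, DtR: Jacobians of the reset map w.r.t. state and time,
    F_minus, F_plus: vector fields just before and just after the jump,
    Dxg, Dtg: Jacobians of the guard w.r.t. state and time.
    """
    denom = Dtg + Dxg @ F_minus  # rate at which the guard is crossed (must be nonzero)
    return DxR + np.outer(F_plus - DxR @ F_minus - DtR, Dxg) / denom

# Bouncing ball: state x = [height, velocity], flow dx/dt = [velocity, -grav].
grav = 9.81
e = 0.8          # restitution coefficient (assumed)
v_minus = -3.0   # velocity just before impact (assumed)

DxR = np.diag([1.0, -e])                 # reset: height kept, velocity scaled by -e
DtR = np.zeros(2)
Dxg = np.array([1.0, 0.0])               # guard g(x) = height
Dtg = 0.0
F_minus = np.array([v_minus, -grav])
F_plus = np.array([-e * v_minus, -grav])

Xi = saltation_matrix(DxR, DtR, F_minus, F_plus, Dxg, Dtg)
expected = np.array([[-e, 0.0],
                     [-(1.0 + e) * grav / v_minus, -e]])
```

The result differs from the plain reset Jacobian `DxR` in its first column: a perturbation in height shifts the impact time, and the saltation matrix accounts for that shift.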
Systems and Control (CS)
Information-Based Trajectory Planning for Autonomous Absolute Tracking in Cislunar Space
The resurgence of lunar operations requires advancements in cislunar navigation and Space Situational Awareness (SSA). Challenges associated with these tasks have created an interest in autonomous planning, navigation, and tracking technologies that operate with little ground-based intervention. This research introduces a trajectory planning tool for a low-thrust mobile observer, aimed at maximizing navigation and tracking performance with satellite-to-satellite relative measurements. We formulate an expression for the information gathered over an observation period based on the mutual information between augmented observer/target states and the associated measurement set collected. We then develop an optimal trajectory design problem for a mobile observer, balancing information gain and control effort, and solve this problem with a Sequential Convex Programming (SCP) approach. The developed methods are demonstrated in scenarios involving spacecraft in the cislunar regime, demonstrating the potential for improved autonomous navigation and tracking.
comment: 2024 AAS/AIAA Astrodynamics Specialist Conference
Regular Pairings for Non-quadratic Lyapunov Functions and Contraction Analysis
Recent studies on stability and contractivity have highlighted the importance of semi-inner products, which we refer to as ``pairings'', associated with general norms. A pairing is a binary operation that relates the derivative of a curve's norm to the radius-vector of the curve and its tangent. This relationship, known as the curve norm derivative formula, is crucial when using the norm as a Lyapunov function. Another important property of the pairing, used in stability and contraction criteria, is the so-called Lumer inequality, which relates the pairing to the induced logarithmic norm. We prove that the curve norm derivative formula and Lumer's inequality are, in fact, equivalent to each other and to several simpler properties. We then introduce and characterize regular pairings that satisfy all of these properties. Our results unify several independent theories of pairings (semi-inner products) developed in previous work on functional analysis and control theory. Additionally, we introduce the polyhedral max pairing and develop computational tools for polyhedral norms, advancing contraction theory in non-Euclidean spaces.
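For orientation, the two properties named in the entry above take a familiar form in the Euclidean special case. The block below is a sketch under stated assumptions: the bracket notation, the argument order of the pairing, and the restriction to the $\ell_2$ norm are chosen for illustration and are not taken from the paper.

```latex
% Euclidean case (assumed notation): pairing [[x, y]]_2 = x^T y.
% Curve norm derivative formula along a differentiable curve x(t):
\|x(t)\|_2 \, \frac{d}{dt}\|x(t)\|_2
  = \tfrac{1}{2}\frac{d}{dt}\bigl(x(t)^\top x(t)\bigr)
  = \dot{x}(t)^\top x(t)
  = [\![\dot{x}(t),\, x(t)]\!]_2 .

% Lumer-type inequality with the induced logarithmic norm
% \mu_2(A) = \lambda_{\max}\bigl(\tfrac{1}{2}(A + A^\top)\bigr):
[\![Ax,\, x]\!]_2 = x^\top A^\top x
  = x^\top \tfrac{1}{2}(A + A^\top)\, x
  \le \mu_2(A)\, \|x\|_2^2 .
```

Combining the two for $\dot{x} = Ax$ gives $\frac{d}{dt}\|x\|_2 \le \mu_2(A)\,\|x\|_2$ wherever $x \neq 0$, which is how the norm serves as a Lyapunov function in this special case.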
Robust model predictive control exploiting monotonicity properties
Robust model predictive control algorithms are essential for addressing unavoidable errors due to the uncertainty in predicting real-world systems. However, the formulation of such algorithms typically results in a trade-off between conservatism and computational complexity. Monotone systems facilitate the efficient computation of reachable sets and thus the straightforward formulation of a robust model predictive control approach optimizing over open-loop predictions. We present an approach based on the division of reachable sets to incorporate feedback in the predictions, resulting in less conservative strategies. The concept of mixed-monotonicity enables an extension of our methodology to non-monotone systems. The potential of the proposed approaches is demonstrated through a nonlinear high-dimensional chemical tank reactor cascade case study.
comment: Submitted to "IEEE Transactions on Automatic Control", Code: https://github.com/MoritzHein/RobMPCExploitMon
Learning and Verifying Maximal Taylor-Neural Lyapunov functions
We introduce a novel neural network architecture, termed Taylor-neural Lyapunov functions, designed to approximate Lyapunov functions with formal certification. This architecture innovatively encodes local approximations and extends them globally by leveraging neural networks to approximate the residuals. Our method recasts the problem of estimating the largest region of attraction - specifically for maximal Lyapunov functions - into a learning problem, ensuring convergence around the origin through robust control theory. Physics-informed machine learning techniques further refine the estimation of the largest region of attraction. Remarkably, this method is versatile, operating effectively even without simulated data points. We validate the efficacy of our approach by providing numerical certificates of convergence across multiple examples. Our proposed methodology not only competes closely with state-of-the-art approaches, such as sum-of-squares and LyZNet, but also achieves comparable results even in the absence of simulated data. This work represents a significant advancement in control theory, with broad potential applications in the design of stable control systems and beyond.
"Benefit Game: Alien Seaweed Swarms" -- Real-time Gamification of Digital Seaweed Ecology
"Benefit Game: Alien Seaweed Swarms" combines artificial life art and interactive game with installation to explore the impact of human activity on fragile seaweed ecosystems. The project aims to promote ecological consciousness by creating a balance in digital seaweed ecologies. Inspired by the real species "Laminaria saccharina", the author employs Procedural Content Generation via Machine Learning technology to generate variations of virtual seaweeds and symbiotic fungi. The audience can explore the consequences of human activities through gameplay and observe the ecosystem's feedback on the benefits and risks of seaweed aquaculture. This Benefit Game offers dynamic and real-time responsive artificial seaweed ecosystems for an interactive experience that enhances ecological consciousness.
comment: Paper accepted at ISEA 24, The 29th International Symposium on Electronic Art, Brisbane, Australia, 21-29 June 2024
Leveraging Blockchain and ANFIS for Optimal Supply Chain Management
The supply chain is a critical segment of the product manufacturing cycle, continuously influenced by risky, uncertain, and undesirable events. Optimizing flexibility in the supply chain presents a complex, multi-objective, and nonlinear programming challenge. In the poultry supply chain, the development of mass customization capabilities has led manufacturing companies to increasingly focus on offering tailored and customized services for individual products. To safeguard against data tampering and ensure the integrity of setup costs and overall profitability, a multi-signature decentralized finance (DeFi) protocol, integrated with the IoT on a blockchain platform, is proposed. Managing the poultry supply chain involves uncertainties that may not account for parameters such as delivery time to retailers, reorder time, and the number of requested products. To address these challenges, this study employs an adaptive neuro-fuzzy inference system (ANFIS), combining neural networks with fuzzy logic to compensate for the lack of training data in parameter identification. Through MATLAB simulations, the study investigates the average shop delivery duration, the reorder time, and the number of products per order. By implementing the proposed technique, the average delivery time decreases from 40 to 37 minutes, the reorder time decreases from five to four days, and the quantity of items requested per order grows from six to eleven. Additionally, the ANFIS model enhances overall supply chain performance by reducing transaction times by 15% compared to conventional systems, thereby improving real-time responsiveness and boosting transparency in supply chain operations, effectively resolving operational issues.
Asynchronous Distributed Learning with Quantized Finite-Time Coordination
In this paper we address distributed learning problems over peer-to-peer networks. In particular, we focus on the challenges of quantized communications, asynchrony, and stochastic gradients that arise in this set-up. We first discuss how to turn the presence of quantized communications into an advantage, by resorting to a finite-time, quantized coordination scheme. This scheme is combined with a distributed gradient descent method to derive the proposed algorithm. Secondly, we show how this algorithm can be adapted to allow asynchronous operations of the agents, as well as the use of stochastic gradients. Finally, we propose a variant of the algorithm which employs zooming-in quantization. We analyze the convergence of the proposed methods and compare them to state-of-the-art alternatives.
comment: To be presented at 2024 IEEE Conference on Decision and Control
Filtering in Projection-based Integrators for Improved Phase Characteristics
Projection-based integrators are effectively employed in high-precision systems with growing industrial success. By utilizing a projection operator, the resulting projection-based integrator keeps its input-output pair within a designated sector set, leading to unique freedom in control design that can be directly translated into performance benefits. This paper aims to enhance projection-based integrators by incorporating well-crafted linear filters into its structure, resulting in a new class of projected integrators that includes the earlier ones, such as the hybrid-integrator gain systems (with and without pre-filtering) as special cases. The extra design freedom in the form of two filters in the input paths to the projection operator and the internal dynamics allows the controller to break away from the inherent limitations of the linear control design. The enhanced performance properties of the proposed structure are formally demonstrated through a (quasi-linear) describing function analysis, the absence of the gain-loss problem, and numerical case studies showcasing improved time-domain properties. The describing function analysis is supported by rigorously showing incremental properties of the new filtered projection-based integrators thereby guaranteeing that the computed steady-state responses are unique and asymptotically stable.
comment: to be presented at IEEE CDC 2024
iCPS-DL: A Description Language for Autonomic Industrial Cyber-Physical Systems
Modern industrial systems require frequent updates to their cyber and physical infrastructures, which often demand considerable reconfiguration effort. This paper introduces a framework to automate this process, implemented as the industrial Cyber-Physical Systems Description Language, iCPSDL. This framework maps an industrial process as a knowledge graph, which includes information about physical and cyber-physical components, a state estimation model, and software component interaction. A novel aspect is the use of communication semantics to ensure correct interaction among distributed entities. Reasoning on the knowledge graph facilitates the configuration of cyber-physical elements in an industrial system. A case study in the Water Distribution Networks domain demonstrates the framework's application.
Wireless Integrated Authenticated Communication System (WIA-Comm)
The exponential increase in the number of devices connected to the internet globally has created a need for improved security measures to maintain data integrity. A wireless, authenticated communication system is required to counter safety threats and illegal access to the application system and its data. The WIA-Comm System provides a bridge to control the devices at the application side. It has been designed to provide security by giving control rights only to devices whose MAC (physical) address has already been registered, so only authorized users can control the system. LoRaWAN technology has been used for wireless communication, and the Arduino IDE has been used to develop the code for the required functionality.
comment: 6 pages, 10 figures, 3 tables
Particle Flows for Source Localization in 3-D Using TDOA Measurements
Localization using time-difference of arrival (TDOA) has myriad applications, e.g., in passive surveillance systems and marine mammal research. In this paper, we present a Bayesian estimation method that can localize an unknown number of static sources in 3-D based on TDOA measurements. The proposed localization algorithm based on particle flow (PFL) can overcome the challenges related to the highly nonlinear TDOA measurement model, the data association (DA) uncertainty, and the uncertainty in the number of sources to be localized. Different PFL strategies are compared within a unified belief propagation (BP) framework in a challenging multisensor source localization problem. In particular, we consider PFL-based approximation of beliefs based on one or multiple Gaussian kernels with parameters computed using deterministic and stochastic flow processes. Our numerical results demonstrate that the proposed method can correctly determine the number of sources and provide accurate location estimates. The stochastic flow demonstrates greater accuracy compared to the deterministic flow when using the same number of particles.
comment: 8 pages
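The nonlinearity of the TDOA measurement model mentioned in the entry above is visible in the noise-free model itself: the measurement is a difference of two Euclidean ranges divided by the propagation speed. The sketch below shows only that standard model. The sensor positions, the default propagation speed (sound in air), and the function name are assumptions; the particle flow and belief propagation machinery of the paper is not represented.

```python
import math

SPEED_OF_SOUND_AIR = 343.0  # m/s, assumed default; use about 1500 m/s underwater

def tdoa(source, sensor_a, sensor_b, speed=SPEED_OF_SOUND_AIR):
    """Noise-free time-difference of arrival between two sensors, in seconds.

    Positive values mean the signal reaches sensor_b first.
    """
    range_a = math.dist(source, sensor_a)
    range_b = math.dist(source, sensor_b)
    return (range_a - range_b) / speed

source = (0.0, 0.0, 0.0)
sensor_a = (3.0, 4.0, 0.0)    # 5 m from the source
sensor_b = (0.0, 0.0, 12.0)   # 12 m from the source
delay = tdoa(source, sensor_a, sensor_b)
```

A single TDOA value constrains the source to one sheet of a hyperboloid with the two sensors as foci, so several sensor pairs are needed to localize a source in 3-D.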
Secure Integrated Sensing and Communication Under Correlated Rayleigh Fading
We consider a secure integrated sensing and communication (ISAC) scenario, in which a signal is transmitted through a state-dependent wiretap channel with one legitimate receiver with which the transmitter communicates and one honest-but-curious target that the transmitter wants to sense. The secure ISAC channel is modeled as two state-dependent fast-fading channels with correlated Rayleigh fading coefficients and independent additive Gaussian noise components. Delayed channel outputs are fed back to the transmitter to improve the communication performance and to estimate the channel state sequence. We establish and illustrate an achievable secrecy-distortion region for degraded secure ISAC channels under correlated Rayleigh fading. We also evaluate the inner bound for a large set of parameters to derive practical design insights for secure ISAC methods. The presented results include in particular parameter ranges for which the secrecy capacity of a classical wiretap channel setup is surpassed and for which the channel capacity is approached.
Efficient Dual-Band Single-Port Rectifier for RF Energy Harvesting at FM and GSM Bands
This paper presents an efficient dual-band rectifier for radiofrequency energy harvesting (RFEH) applications at FM and GSM bands. The single-port rectifier circuit, which comprises a 3-port network, optimized T-matching circuits and voltage doubler, is designed, simulated and fabricated to obtain a high RF-to-DC power conversion efficiency (PCE). Measurement results show PCE of 26% and 22% at -20 dBm, and also 58% and 51% at -10 dBm with a maximum amount of 69% and 65% at -2.5 dBm and -5 dBm, with single tone at 95 and 925 MHz, respectively. Besides, the fractional bandwidth of 21% at FM and 11% at GSM band is achieved. The measurement and simulation results are in good agreement. Consequently, the proposed rectifier can be a potential candidate for ambient RF energy harvesting and wireless power transfer (WPT). It should be noted that a 3-port network as a duplexer is designed to be integrated with single-port antennas which cover both FM and GSM bands as a low-cost solution. Moreover, based on simulation results, PCE has small variations when the load resistor varies from 10 to 18 k$\Omega$. Therefore, this rectifier can be utilized for any desired resistance within the range, such as sensors and IoT devices.
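To put the reported efficiencies in absolute terms, the input power in dBm can be converted to watts and multiplied by the PCE. The sketch below performs only this unit arithmetic on figures quoted in the abstract above; the function names are hypothetical.

```python
def dbm_to_watts(p_dbm):
    """Convert power in dBm to watts (0 dBm equals 1 mW)."""
    return 10 ** (p_dbm / 10) / 1000

def dc_output_watts(p_in_dbm, pce):
    """DC output power for a given RF input power and conversion efficiency."""
    return pce * dbm_to_watts(p_in_dbm)

# FM-band figures quoted in the abstract (single tone at 95 MHz).
p_dc_minus20 = dc_output_watts(-20.0, 0.26)  # 26% of 10 uW, about 2.6 uW
p_dc_minus10 = dc_output_watts(-10.0, 0.58)  # 58% of 100 uW, about 58 uW
```

Raising the input by 10 dB while the PCE goes from 26% to 58% increases the harvested DC power by a factor of roughly 22.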
Characterizing nonlinear systems with mixed input-output properties through dissipation inequalities
Systems that show different characteristics, such as finite-gain and passivity, depending on the nature of the inputs, are said to possess mixed input-output properties. In this paper, we provide a constructive method for characterizing mixed input-output properties of nonlinear systems using a dissipativity framework. Our results take inspiration from the generalized Kalman-Yakubovich-Popov lemma, and show that a system is ``mixed'' if it is dissipative with respect to highly specialized supply rates. The mixed input-output characterization is used for assessing stability of feedback interconnections in which the feedback components violate conditions of classical results such as the small-gain and passivity theorem, thereby significantly relaxing the results. We highlight applicability of the results through various examples, and provide connections with other input-output characterizations such as scaled graphs.
comment: 7 pages
Exact Recovery Guarantees for Parameterized Non-linear System Identification Problem under Adversarial Attacks
In this work, we study the system identification problem for parameterized non-linear systems using basis functions under adversarial attacks. Motivated by the LASSO-type estimators, we analyze the exact recovery property of a non-smooth estimator, which is generated by solving an embedded $\ell_1$-loss minimization problem. First, we derive necessary and sufficient conditions for the well-specifiedness of the estimator and the uniqueness of global solutions to the underlying optimization problem. Next, we provide exact recovery guarantees for the estimator under two different scenarios of boundedness and Lipschitz continuity of the basis functions. The non-asymptotic exact recovery is guaranteed with high probability, even when there are more severely corrupted data than clean data. Finally, we numerically illustrate the validity of our theory. This is the first study on the sample complexity analysis of a non-smooth estimator for the non-linear system identification problem.
comment: 33 pages
Improving the Region of Attraction of a Multi-rotor UAV by Estimating Unknown Disturbances
This study presents a machine learning-aided approach to accurately estimate the region of attraction (ROA) of a multi-rotor unmanned aerial vehicle (UAV) controlled using a linear quadratic regulator (LQR) controller. Conventional ROA estimation approaches rely on a nominal dynamic model for ROA calculation, leading to inaccurate estimation due to unknown dynamics and disturbances associated with the physical system. To address this issue, our study utilizes a neural network to predict these unknown disturbances of a planar quadrotor. The nominal model integrated with the learned disturbances is then employed to calculate the ROA of the planar quadrotor using a graphical technique. The estimated ROA is then compared with the ROA calculated using Lyapunov analysis and the graphical approach without incorporating the learned disturbances. The results illustrate that the proposed method provides a more accurate estimation of the ROA, while the conventional Lyapunov-based estimation tends to be more conservative.
Fixed-time Disturbance Observer-Based MPC Robust Trajectory Tracking Control of Quadrotor
In this paper, a fixed-time disturbance observer-based model predictive control algorithm is proposed for trajectory tracking of a quadrotor in the presence of disturbances. First, a novel multivariable fixed-time disturbance observer is proposed to estimate the lumped disturbances. The bi-limit homogeneity and Lyapunov techniques are employed to ensure the convergence of estimation error within a fixed convergence time, independent of the initial estimation error. Then, an observer-based model predictive control strategy is formulated to achieve robust trajectory tracking of the quadrotor, attenuating the lumped disturbances and model uncertainties. Finally, simulations and real-world experiments are provided to illustrate the effectiveness of the proposed method.
Empowering Aggregators with Practical Data-Driven Tools: Harnessing Aggregated and Disaggregated Flexibility for Demand Response
This study explores the interaction between aggregators and building occupants in activating flexibility through Demand Response (DR) programs, with a focus on reinforcing the resilience of the energy system considering the uncertainties presented by Renewable Energy Sources (RES). Firstly, it introduces a methodology of optimizing aggregated flexibility provision strategies in environments with limited data, utilizing Discrete Fourier Transformation (DFT) and clustering techniques to identify building occupants' activity patterns. Secondly, the study assesses the disaggregated flexibility provision of Heating Ventilation and Air Conditioning (HVAC) systems during DR events, employing machine learning and optimization techniques for precise, device-level analysis. The first approach offers a non-intrusive pathway for aggregators to provide flexibility services in environments of a single smart meter for the whole building's consumption, while the second approach maximizes the amount of flexibility in the case of dedicated metering devices to the HVAC systems by carefully considering building occupants' thermal comfort profiles. Through the application of data-driven techniques and encompassing case studies from both industrial and residential buildings, this paper not only unveils pivotal opportunities for aggregators in the balancing and emerging flexibility markets but also successfully develops and demonstrates end-to-end practical tools for aggregators.
Dirichlet Logistic Gaussian Processes for Evaluation of Black-Box Stochastic Systems under Complex Requirements
The requirement-driven performance evaluation of a black-box cyber-physical system (CPS) that utilizes machine learning methods has proven to be an effective way to assess the quality of the CPS. However, the distributional evaluation of the performance has been poorly considered. Although many uncertainty estimation methods have been advocated, they have not successfully estimated highly complex performance distributions under small data. In this paper, we propose a method to distributionally evaluate the performance under complex requirements using small input-trajectory data. To handle the unknown complex probability distributions under small data, we discretize the corresponding performance measure, yielding a discrete random process over an input region. Then, we propose a semiparametric Bayesian model of the discrete process based on a Dirichlet random field whose parameter function is represented by multiple logistic Gaussian processes (LGPs). The Dirichlet posterior parameter function is estimated through the LGP posteriors in a reasonable and conservative fashion. We show that the proposed Bayesian model converges to the true discrete random process as the number of data becomes large enough. We also empirically demonstrate the effectiveness of the proposed method by simulation.
comment: 7 pages, 5 figures. This paper has been accepted at the 27th European Conference on Artificial Intelligence
On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks
This paper presents a study of robust policy networks in deep reinforcement learning. We investigate the benefits of policy parameterizations that naturally satisfy constraints on their Lipschitz bound, analyzing their empirical performance and robustness on two representative problems: pendulum swing-up and Atari Pong. We illustrate that policy networks with smaller Lipschitz bounds are more robust to disturbances, random noise, and targeted adversarial attacks than unconstrained policies composed of vanilla multi-layer perceptrons or convolutional neural networks. However, the structure of the Lipschitz layer is important. We find that the widely-used method of spectral normalization is too conservative and severely impacts clean performance, whereas more expressive Lipschitz layers such as the recently-proposed Sandwich layer can achieve improved robustness without sacrificing clean performance.
Iterative Thresholding and Projection Algorithms and Model-Based Deep Neural Networks for Sparse LQR Control Design
In this paper, we consider an LQR design problem for distributed control systems. For large-scale distributed systems, finding a solution might be computationally demanding due to communications among agents. To address this, we consider an LQR minimization problem with a regularization term that promotes a sparse feedback matrix, which can reduce the number of communication links in the distributed control system. We introduce two simple but efficient iterative algorithms: the Iterative Shrinkage Thresholding Algorithm (ISTA) and the Iterative Sparse Projection Algorithm (ISPA). They provide a trade-off between the LQR cost and the sparsity level of the feedback matrix. Moreover, to improve the speed of the proposed algorithms, we design deep neural network models based on them. Numerical experiments demonstrate that our algorithms can outperform previous methods using the Alternating Direction Method of Multipliers (ADMM) [2] and the Gradient Support Pursuit (GraSP) [3], and that their deep neural network models can improve the convergence speed of the proposed algorithms.
comment: 15 pages
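For readers unfamiliar with shrinkage thresholding, the following is a minimal sketch of a generic ISTA iteration: a gradient step on a smooth cost followed by elementwise soft-thresholding. The toy quadratic cost, the step size, and the names `soft_threshold` and `ista` are assumptions for illustration only; the paper's actual smooth term is the LQR cost over a feedback matrix, which is not reproduced here.

```python
def soft_threshold(x, tau):
    """Soft-thresholding: the proximal operator of tau * |x|."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def ista(grad, x0, step, lam, iters=200):
    """Generic ISTA: x <- S_{step*lam}(x - step * grad(x)).

    grad: callable returning the gradient of the smooth cost (list of floats).
    lam: weight of the l1 regularizer that promotes sparsity.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, g)]
    return x


# Toy smooth cost 0.5 * ||x - target||^2 (hypothetical stand-in, not the LQR cost).
target = [3.0, 0.2, -1.5]


def grad(x):
    return [xi - ti for xi, ti in zip(x, target)]


x_sparse = ista(grad, [0.0, 0.0, 0.0], step=1.0, lam=0.5)
print(x_sparse)  # the small middle entry is driven exactly to zero
```

A larger `lam` zeroes more entries at the expense of the smooth cost, which is the kind of cost-versus-sparsity trade-off the abstract describes.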
Control of Unknown Quadrotors from a Single Throw IROS 2024
This paper presents a method to recover a quadrotor UAV from a throw when no control parameters are known before the throw. We leverage the high-frequency rotor speed feedback available in racing drone hardware and software to find control effectiveness values and fit a motor model using recursive least squares (RLS) estimation. Furthermore, we propose an excitation sequence that provides large actuation commands while guaranteeing to stay within gyroscope sensing limits. After 450 ms of excitation, an INDI attitude controller uses the 52 fitted parameters to arrest rotational motion and recover an upright attitude. Finally, an NDI position controller drives the craft to a position setpoint. The proposed algorithm runs efficiently on microcontrollers found in common UAV flight controllers, and was shown to recover an agile quadrotor every time in 57 live experiments with throw heights as low as 3.5 m, demonstrating robustness against initial rotations and noise. We also demonstrate control of randomized quadrotors in simulated throws, where the parameter fitting RMS error is typically within 10% of the true value. This work has been submitted to IROS 2024 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
comment: 7 pages, 5 figures, 2 tables. Submitted to the IROS 2024 conference
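The recursive least squares (RLS) estimation mentioned in the abstract above can be sketched as follows. This is a textbook RLS update for a linear-in-parameters model y ≈ phi · theta, written in plain Python; the two-parameter affine model, the initial covariance, and the function name `rls_update` are illustrative assumptions and do not reflect the paper's motor model or its 52 parameters.

```python
def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive least squares step for the model y ~ phi . theta.

    theta: current parameter estimate (list of n floats).
    P: current n x n covariance-like matrix (symmetric, list of lists).
    phi: regressor vector for this sample.
    lam: forgetting factor (1.0 means no forgetting).
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain * phi^T P) / lam; phi^T P equals P_phi^T because P is symmetric.
    P = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P


# Hypothetical example: recover slope 2.0 and offset 0.5 from noise-free samples.
theta = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]  # large initial P encodes a weak prior
for u in [0.0, 1.0, 2.0, 3.0, 4.0]:
    y = 2.0 * u + 0.5
    theta, P = rls_update(theta, P, [u, 1.0], y)
print(theta)
```

Each update costs a fixed number of operations for a given parameter count and needs no stored history, which is in line with the abstract's claim that the method runs on flight-controller microcontrollers.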
Nonsmooth Projection-Free Optimization with Functional Constraints
This paper presents a subgradient-based algorithm for constrained nonsmooth convex optimization that does not require projections onto the feasible set. While the well-established Frank-Wolfe algorithm and its variants already avoid projections, they are primarily designed for smooth objective functions. In contrast, our proposed algorithm can handle nonsmooth problems with general convex functional inequality constraints. It achieves an $\epsilon$-suboptimal solution in $\mathcal{O}(\epsilon^{-2})$ iterations, with each iteration requiring only a single (potentially inexact) Linear Minimization Oracle (LMO) call and a (possibly inexact) subgradient computation. This performance is consistent with existing lower bounds. Similar performance is observed when deterministic subgradients are replaced with stochastic subgradients. In the special case where there are no functional inequality constraints, our algorithm competes favorably with a recent nonsmooth projection-free method designed for constraint-free problems. Our approach utilizes a simple separation scheme in conjunction with a new Lagrange multiplier update rule.
Perfecting Periodic Trajectory Tracking: Model Predictive Control with a Periodic Observer ($Π$-MPC) IROS 2024
In Model Predictive Control (MPC), discrepancies between the actual system and the predictive model can lead to substantial tracking errors and significantly degrade performance and reliability. While such discrepancies can be alleviated with more complex models, this often complicates controller design and implementation. By leveraging the fact that many trajectories of interest are periodic, we show that perfect tracking is possible when incorporating a simple observer that estimates and compensates for periodic disturbances. We present the design of the observer and the accompanying tracking MPC scheme, proving that their combination achieves zero tracking error asymptotically, regardless of the complexity of the unmodelled dynamics. We validate the effectiveness of our method, demonstrating asymptotically perfect tracking on a high-dimensional soft robot with nearly 10,000 states and a fivefold reduction in tracking errors compared to a baseline MPC on small-scale autonomous race car experiments.
comment: 8 pages, 3 figures; 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Decoupling Power Quality Issues in Grid-Microgrid Network Using Microgrid Building Blocks
Microgrids are evolving as promising options to enhance reliability of the connected transmission and distribution systems. Traditional design and deployment of microgrids require significant engineering analysis. Microgrid Building Blocks (MBB), consisting of modular blocks that integrate seamlessly to form effective microgrids, is an enabling concept for faster and broader adoption of microgrids. A back-to-back converter placed at the point of common coupling of the microgrid is an integral part of the MBB. This paper presents applications of MBB to decouple power quality issues in a grid-microgrid network serving power quality sensitive loads such as data centers, new grid-edge technologies such as vehicle-to-grid generation, and electric vehicle charging loads during evacuation before disaster events. Simulation results show that MBB effectively decouples the power quality issues across networks and helps maintain good power quality in the power quality sensitive network based on the operational scenario.
comment: This paper is accepted for publication in IEEE IECON 2024, Chicago, IL. The complete copyright version will be available on IEEE Xplore when the conference proceedings are published
Efficient Discovery of Actual Causality using Abstraction-Refinement
Causality is the relationship where one event contributes to the production of another, with the cause being partly responsible for the effect and the effect partly dependent on the cause. In this paper, we propose a novel and effective method to formally reason about the causal effect of events in engineered systems, with application to finding the root cause of safety violations in embedded and cyber-physical systems. We are motivated by the notion of actual causality by Halpern and Pearl, which focuses on the causal effect of particular events rather than type-level causality, which attempts to make general statements about scientific and natural phenomena. Our first contribution is formulating discovery of actual causality in computing systems modeled by transition systems as an SMT solving problem. Since datasets for causality analysis tend to be large, in order to tackle the scalability problem of automated formal reasoning, our second contribution is a novel technique based on abstraction-refinement that allows identifying actual causes within smaller abstract causal models. We demonstrate the effectiveness of our approach (by several orders of magnitude) using three case studies to find the actual cause of violations of safety in (1) a neural network controller for a Mountain Car, (2) a controller for a Lunar Lander obtained by reinforcement learning, and (3) an MPC controller for an F-16 autopilot simulator.
Exploiting Monotonicity to Design an Adaptive PI Passivity-Based Controller for a Fuel-Cell System
We present a controller for a power electronic system composed of a fuel cell (FC) connected to a boost converter which feeds a resistive load. The controller aims to regulate the output voltage of the converter regardless of the uncertainty of the load. Leveraging the monotonicity feature of the fuel cell polarization curve we prove that the nonlinear system can be controlled by means of a passivity-based proportional-integral approach. We afterward extend the result to an adaptive version, allowing the controller to deal with parameter uncertainties, such as inductor parasitic resistance, load, and FC polarization curve parameters. This adaptive design is based on an indirect control approach with online parameter identification performed by a ``hybrid'' estimator which combines two techniques: the gradient-descent and immersion-and-invariance algorithms. The overall system is proved to be stable with the output voltage regulated to its reference. Experimental results validate our proposal under two real-life scenarios: pulsating load and output voltage reference changes.
comment: 11 pages, 8 Figs
Systems and Control (EESS)
Information-Based Trajectory Planning for Autonomous Absolute Tracking in Cislunar Space
The resurgence of lunar operations requires advancements in cislunar navigation and Space Situational Awareness (SSA). Challenges associated with these tasks have created an interest in autonomous planning, navigation, and tracking technologies that operate with little ground-based intervention. This research introduces a trajectory planning tool for a low-thrust mobile observer, aimed at maximizing navigation and tracking performance with satellite-to-satellite relative measurements. We formulate an expression for the information gathered over an observation period based on the mutual information between augmented observer/target states and the associated measurement set collected. We then develop an optimal trajectory design problem for a mobile observer, balancing information gain and control effort, and solve this problem with a Sequential Convex Programming (SCP) approach. The developed methods are demonstrated in scenarios involving spacecraft in the cislunar regime, demonstrating the potential for improved autonomous navigation and tracking.
comment: 2024 AAS/AIAA Astrodynamics Specialist Conference
Regular Pairings for Non-quadratic Lyapunov Functions and Contraction Analysis
Recent studies on stability and contractivity have highlighted the importance of semi-inner products, which we refer to as ``pairings'', associated with general norms. A pairing is a binary operation that relates the derivative of a curve's norm to the radius-vector of the curve and its tangent. This relationship, known as the curve norm derivative formula, is crucial when using the norm as a Lyapunov function. Another important property of the pairing, used in stability and contraction criteria, is the so-called Lumer inequality, which relates the pairing to the induced logarithmic norm. We prove that the curve norm derivative formula and Lumer's inequality are, in fact, equivalent to each other and to several simpler properties. We then introduce and characterize regular pairings that satisfy all of these properties. Our results unify several independent theories of pairings (semi-inner products) developed in previous work on functional analysis and control theory. Additionally, we introduce the polyhedral max pairing and develop computational tools for polyhedral norms, advancing contraction theory in non-Euclidean spaces.
Robust model predictive control exploiting monotonicity properties
Robust model predictive control algorithms are essential for addressing unavoidable errors due to the uncertainty in predicting real-world systems. However, the formulation of such algorithms typically results in a trade-off between conservatism and computational complexity. Monotone systems facilitate the efficient computation of reachable sets and thus the straightforward formulation of a robust model predictive control approach optimizing over open-loop predictions. We present an approach based on the division of reachable sets to incorporate feedback in the predictions, resulting in less conservative strategies. The concept of mixed-monotonicity enables an extension of our methodology to non-monotone systems. The potential of the proposed approaches is demonstrated through a nonlinear high-dimensional chemical tank reactor cascade case study.
comment: Submitted to "IEEE Transactions on Automatic Control", Code: https://github.com/MoritzHein/RobMPCExploitMon
Learning and Verifying Maximal Taylor-Neural Lyapunov functions
We introduce a novel neural network architecture, termed Taylor-neural Lyapunov functions, designed to approximate Lyapunov functions with formal certification. This architecture innovatively encodes local approximations and extends them globally by leveraging neural networks to approximate the residuals. Our method recasts the problem of estimating the largest region of attraction - specifically for maximal Lyapunov functions - into a learning problem, ensuring convergence around the origin through robust control theory. Physics-informed machine learning techniques further refine the estimation of the largest region of attraction. Remarkably, this method is versatile, operating effectively even without simulated data points. We validate the efficacy of our approach by providing numerical certificates of convergence across multiple examples. Our proposed methodology not only competes closely with state-of-the-art approaches, such as sum-of-squares and LyZNet, but also achieves comparable results even in the absence of simulated data. This work represents a significant advancement in control theory, with broad potential applications in the design of stable control systems and beyond.
"Benefit Game: Alien Seaweed Swarms" -- Real-time Gamification of Digital Seaweed Ecology
"Benefit Game: Alien Seaweed Swarms" combines artificial life art and interactive game with installation to explore the impact of human activity on fragile seaweed ecosystems. The project aims to promote ecological consciousness by creating a balance in digital seaweed ecologies. Inspired by the real species "Laminaria saccharina", the author employs Procedural Content Generation via Machine Learning technology to generate variations of virtual seaweeds and symbiotic fungi. The audience can explore the consequences of human activities through gameplay and observe the ecosystem's feedback on the benefits and risks of seaweed aquaculture. This Benefit Game offers dynamic and real-time responsive artificial seaweed ecosystems for an interactive experience that enhances ecological consciousness.
comment: Paper accepted at ISEA 24, The 29th International Symposium on Electronic Art, Brisbane, Australia, 21-29 June 2024
Leveraging Blockchain and ANFIS for Optimal Supply Chain Management
The supply chain is a critical segment of the product manufacturing cycle, continuously influenced by risky, uncertain, and undesirable events. Optimizing flexibility in the supply chain presents a complex, multi-objective, and nonlinear programming challenge. In the poultry supply chain, the development of mass customization capabilities has led manufacturing companies to increasingly focus on offering tailored and customized services for individual products. To safeguard against data tampering and ensure the integrity of setup costs and overall profitability, a multi-signature decentralized finance (DeFi) protocol, integrated with the IoT on a blockchain platform, is proposed. Managing the poultry supply chain involves uncertainties that may not account for parameters such as delivery time to retailers, reorder time, and the number of requested products. To address these challenges, this study employs an adaptive neuro-fuzzy inference system (ANFIS), combining neural networks with fuzzy logic to compensate for the lack of data training in parameter identification. Through MATLAB simulations, the study investigates the average shop delivery duration, the reorder time, and the number of products per order. By implementing the proposed technique, the average delivery time decreases from 40 to 37 minutes, the reorder time decreases from five to four days, and the quantity of items requested per order grows from six to eleven. Additionally, the ANFIS model enhances overall supply chain performance by reducing transaction times by 15% compared to conventional systems, thereby improving real-time responsiveness and boosting transparency in supply chain operations, effectively resolving operational issues.
Asynchronous Distributed Learning with Quantized Finite-Time Coordination
In this paper we address distributed learning problems over peer-to-peer networks. In particular, we focus on the challenges of quantized communications, asynchrony, and stochastic gradients that arise in this set-up. We first discuss how to turn the presence of quantized communications into an advantage, by resorting to a finite-time, quantized coordination scheme. This scheme is combined with a distributed gradient descent method to derive the proposed algorithm. Secondly, we show how this algorithm can be adapted to allow asynchronous operations of the agents, as well as the use of stochastic gradients. Finally, we propose a variant of the algorithm which employs zooming-in quantization. We analyze the convergence of the proposed methods and compare them to state-of-the-art alternatives.
comment: To be presented at 2024 IEEE Conference on Decision and Control
Filtering in Projection-based Integrators for Improved Phase Characteristics
Projection-based integrators are effectively employed in high-precision systems with growing industrial success. By utilizing a projection operator, the resulting projection-based integrator keeps its input-output pair within a designated sector set, leading to unique freedom in control design that can be directly translated into performance benefits. This paper aims to enhance projection-based integrators by incorporating well-crafted linear filters into their structure, resulting in a new class of projected integrators that includes the earlier ones, such as the hybrid-integrator gain systems (with and without pre-filtering), as special cases. The extra design freedom in the form of two filters in the input paths to the projection operator and the internal dynamics allows the controller to break away from the inherent limitations of linear control design. The enhanced performance properties of the proposed structure are formally demonstrated through a (quasi-linear) describing function analysis, the absence of the gain-loss problem, and numerical case studies showcasing improved time-domain properties. The describing function analysis is supported by rigorously showing incremental properties of the new filtered projection-based integrators, thereby guaranteeing that the computed steady-state responses are unique and asymptotically stable.
comment: to be presented at IEEE CDC 2024
iCPS-DL: A Description Language for Autonomic Industrial Cyber-Physical Systems
Modern industrial systems require frequent updates to their cyber and physical infrastructures, which often demand considerable reconfiguration effort. This paper introduces a framework to automate this process, implemented as the industrial Cyber-Physical Systems Description Language, iCPS-DL. This framework maps an industrial process as a knowledge graph, which includes information about physical and cyber-physical components, a state estimation model, and software component interaction. A novel aspect is the use of communication semantics to ensure correct interaction among distributed entities. Reasoning on the knowledge graph facilitates the configuration of cyber-physical elements in an industrial system. A case study in the Water Distribution Networks domain demonstrates the framework's application.
Wireless Integrated Authenticated Communication System (WIA-Comm)
The exponential increase in the number of devices connected to the internet globally has created a need for improved security measures to maintain data integrity. A wireless, authenticated communication system is required to counter safety threats and illegal access to the application system and its data. The WIA-Comm system provides a bridge for controlling devices on the application side. It is designed to provide security by granting control rights only to devices whose MAC (physical) address has already been registered, so only authorized users can control the system. LoRaWAN technology is used for wireless communication, and the Arduino IDE is used to develop the code for the required functionality.
comment: 6 pages, 10 figures, 3 tables
Particle Flows for Source Localization in 3-D Using TDOA Measurements
Localization using time-difference of arrival (TDOA) has myriad applications, e.g., in passive surveillance systems and marine mammal research. In this paper, we present a Bayesian estimation method that can localize an unknown number of static sources in 3-D based on TDOA measurements. The proposed localization algorithm based on particle flow (PFL) can overcome the challenges related to the highly nonlinear TDOA measurement model, the data association (DA) uncertainty, and the uncertainty in the number of sources to be localized. Different PFL strategies are compared within a unified belief propagation (BP) framework in a challenging multisensor source localization problem. In particular, we consider PFL-based approximation of beliefs based on one or multiple Gaussian kernels with parameters computed using deterministic and stochastic flow processes. Our numerical results demonstrate that the proposed method can correctly determine the number of sources and provide accurate location estimates. The stochastic flow demonstrates greater accuracy compared to the deterministic flow when using the same number of particles.
comment: 8 pages
Secure Integrated Sensing and Communication Under Correlated Rayleigh Fading
We consider a secure integrated sensing and communication (ISAC) scenario, in which a signal is transmitted through a state-dependent wiretap channel with one legitimate receiver with which the transmitter communicates and one honest-but-curious target that the transmitter wants to sense. The secure ISAC channel is modeled as two state-dependent fast-fading channels with correlated Rayleigh fading coefficients and independent additive Gaussian noise components. Delayed channel outputs are fed back to the transmitter to improve the communication performance and to estimate the channel state sequence. We establish and illustrate an achievable secrecy-distortion region for degraded secure ISAC channels under correlated Rayleigh fading. We also evaluate the inner bound for a large set of parameters to derive practical design insights for secure ISAC methods. The presented results include in particular parameter ranges for which the secrecy capacity of a classical wiretap channel setup is surpassed and for which the channel capacity is approached.
Efficient Dual-Band Single-Port Rectifier for RF Energy Harvesting at FM and GSM Bands
This paper presents an efficient dual-band rectifier for radiofrequency energy harvesting (RFEH) applications at FM and GSM bands. The single-port rectifier circuit, which comprises a 3-port network, optimized T-matching circuits and voltage doubler, is designed, simulated and fabricated to obtain a high RF-to-DC power conversion efficiency (PCE). Measurement results show PCE of 26% and 22% at -20 dBm, and also 58% and 51% at -10 dBm with a maximum amount of 69% and 65% at -2.5 dBm and -5 dBm, with single tone at 95 and 925 MHz, respectively. Besides, the fractional bandwidth of 21% at FM and 11% at GSM band is achieved. The measurement and simulation results are in good agreement. Consequently, the proposed rectifier can be a potential candidate for ambient RF energy harvesting and wireless power transfer (WPT). It should be noted that a 3-port network as a duplexer is designed to be integrated with single-port antennas which cover both FM and GSM bands as a low-cost solution. Moreover, based on simulation results, PCE has small variations when the load resistor varies from 10 to 18 k$\Omega$. Therefore, this rectifier can be utilized for any desired resistance within the range, such as sensors and IoT devices.
Characterizing nonlinear systems with mixed input-output properties through dissipation inequalities
Systems that show different characteristics, such as finite-gain and passivity, depending on the nature of the inputs, are said to possess mixed input-output properties. In this paper, we provide a constructive method for characterizing mixed input-output properties of nonlinear systems using a dissipativity framework. Our results take inspiration from the generalized Kalman-Yakubovich-Popov lemma, and show that a system is ``mixed'' if it is dissipative with respect to highly specialized supply rates. The mixed input-output characterization is used for assessing stability of feedback interconnections in which the feedback components violate conditions of classical results such as the small-gain and passivity theorem, thereby significantly relaxing the results. We highlight applicability of the results through various examples, and provide connections with other input-output characterizations such as scaled graphs.
comment: 7 pages
Exact Recovery Guarantees for Parameterized Non-linear System Identification Problem under Adversarial Attacks
In this work, we study the system identification problem for parameterized non-linear systems using basis functions under adversarial attacks. Motivated by the LASSO-type estimators, we analyze the exact recovery property of a non-smooth estimator, which is generated by solving an embedded $\ell_1$-loss minimization problem. First, we derive necessary and sufficient conditions for the well-specifiedness of the estimator and the uniqueness of global solutions to the underlying optimization problem. Next, we provide exact recovery guarantees for the estimator under two different scenarios of boundedness and Lipschitz continuity of the basis functions. The non-asymptotic exact recovery is guaranteed with high probability, even when there are more severely corrupted data than clean data. Finally, we numerically illustrate the validity of our theory. This is the first study on the sample complexity analysis of a non-smooth estimator for the non-linear system identification problem.
comment: 33 pages
Improving the Region of Attraction of a Multi-rotor UAV by Estimating Unknown Disturbances
This study presents a machine learning-aided approach to accurately estimate the region of attraction (ROA) of a multi-rotor unmanned aerial vehicle (UAV) controlled using a linear quadratic regulator (LQR) controller. Conventional ROA estimation approaches rely on a nominal dynamic model for ROA calculation, leading to inaccurate estimation due to unknown dynamics and disturbances associated with the physical system. To address this issue, our study utilizes a neural network to predict these unknown disturbances of a planar quadrotor. The nominal model integrated with the learned disturbances is then employed to calculate the ROA of the planar quadrotor using a graphical technique. The estimated ROA is then compared with the ROA calculated using Lyapunov analysis and the graphical approach without incorporating the learned disturbances. The results illustrate that the proposed method provides a more accurate estimation of the ROA, while the conventional Lyapunov-based estimation tends to be more conservative.
Fixed-time Disturbance Observer-Based MPC Robust Trajectory Tracking Control of Quadrotor
In this paper, a fixed-time disturbance observer-based model predictive control algorithm is proposed for trajectory tracking of a quadrotor in the presence of disturbances. First, a novel multivariable fixed-time disturbance observer is proposed to estimate the lumped disturbances. The bi-limit homogeneity and Lyapunov techniques are employed to ensure the convergence of estimation error within a fixed convergence time, independent of the initial estimation error. Then, an observer-based model predictive control strategy is formulated to achieve robust trajectory tracking of the quadrotor, attenuating the lumped disturbances and model uncertainties. Finally, simulations and real-world experiments are provided to illustrate the effectiveness of the proposed method.
Empowering Aggregators with Practical Data-Driven Tools: Harnessing Aggregated and Disaggregated Flexibility for Demand Response
This study explores the interaction between aggregators and building occupants in activating flexibility through Demand Response (DR) programs, with a focus on reinforcing the resilience of the energy system considering the uncertainties presented by Renewable Energy Sources (RES). Firstly, it introduces a methodology of optimizing aggregated flexibility provision strategies in environments with limited data, utilizing Discrete Fourier Transformation (DFT) and clustering techniques to identify building occupants' activity patterns. Secondly, the study assesses the disaggregated flexibility provision of Heating Ventilation and Air Conditioning (HVAC) systems during DR events, employing machine learning and optimization techniques for precise, device-level analysis. The first approach offers a non-intrusive pathway for aggregators to provide flexibility services in environments of a single smart meter for the whole building's consumption, while the second approach maximizes the amount of flexibility in the case of dedicated metering devices to the HVAC systems by carefully considering building occupants' thermal comfort profiles. Through the application of data-driven techniques and encompassing case studies from both industrial and residential buildings, this paper not only unveils pivotal opportunities for aggregators in the balancing and emerging flexibility markets but also successfully develops and demonstrates end-to-end practical tools for aggregators.
Dirichlet Logistic Gaussian Processes for Evaluation of Black-Box Stochastic Systems under Complex Requirements
The requirement-driven performance evaluation of a black-box cyber-physical system (CPS) that utilizes machine learning methods has proven to be an effective way to assess the quality of the CPS. However, the distributional evaluation of the performance has been poorly considered. Although many uncertainty estimation methods have been advocated, they have not successfully estimated highly complex performance distributions under small data. In this paper, we propose a method to distributionally evaluate the performance under complex requirements using small input-trajectory data. To handle the unknown complex probability distributions under small data, we discretize the corresponding performance measure, yielding a discrete random process over an input region. Then, we propose a semiparametric Bayesian model of the discrete process based on a Dirichlet random field whose parameter function is represented by multiple logistic Gaussian processes (LGPs). The Dirichlet posterior parameter function is estimated through the LGP posteriors in a reasonable and conservative fashion. We show that the proposed Bayesian model converges to the true discrete random process as the number of data becomes large enough. We also empirically demonstrate the effectiveness of the proposed method by simulation.
comment: 7 pages, 5 figures. This paper has been accepted at the 27th European Conference on Artificial Intelligence
On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks
This paper presents a study of robust policy networks in deep reinforcement learning. We investigate the benefits of policy parameterizations that naturally satisfy constraints on their Lipschitz bound, analyzing their empirical performance and robustness on two representative problems: pendulum swing-up and Atari Pong. We illustrate that policy networks with smaller Lipschitz bounds are more robust to disturbances, random noise, and targeted adversarial attacks than unconstrained policies composed of vanilla multi-layer perceptrons or convolutional neural networks. However, the structure of the Lipschitz layer is important. We find that the widely-used method of spectral normalization is too conservative and severely impacts clean performance, whereas more expressive Lipschitz layers such as the recently-proposed Sandwich layer can achieve improved robustness without sacrificing clean performance.
Iterative Thresholding and Projection Algorithms and Model-Based Deep Neural Networks for Sparse LQR Control Design
In this paper, we consider an LQR design problem for distributed control systems. For large-scale distributed systems, finding a solution might be computationally demanding due to communications among agents. To this end, we consider an LQR minimization problem with a regularization term that promotes a sparse feedback matrix, which can reduce the number of communication links in the distributed control system. We introduce two simple but efficient iterative algorithms: the Iterative Shrinkage Thresholding Algorithm (ISTA) and the Iterative Sparse Projection Algorithm (ISPA). They provide a trade-off between the LQR cost and the sparsity level of the feedback matrix. Moreover, to improve the speed of the proposed algorithms, we design deep neural network models based on them. Numerical experiments demonstrate that our algorithms can outperform previous methods using the Alternating Direction Method of Multipliers (ADMM) [2] and the Gradient Support Pursuit (GraSP) [3], and that their deep neural network models can improve the convergence speed of the proposed algorithms.
comment: 15 pages
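The shrinkage-thresholding idea named in the abstract can be illustrated with the textbook ISTA step for an l1-regularized objective: a gradient step on the smooth cost followed by elementwise soft thresholding. This is a minimal sketch of that generic step, not the paper's algorithm; the gradient of the LQR cost is assumed to be supplied by the caller, and `soft_threshold`, `ista_step`, `step`, and `lam` are illustrative names.

```python
def soft_threshold(x: float, tau: float) -> float:
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def ista_step(K, grad, step: float, lam: float):
    """One generic ISTA update on a feedback matrix K (list of rows).

    grad is the gradient of the smooth cost at K (assumed given);
    lam is the weight of the l1 sparsity penalty.
    """
    return [
        [soft_threshold(k - step * g, step * lam) for k, g in zip(row_k, row_g)]
        for row_k, row_g in zip(K, grad)
    ]


if __name__ == "__main__":
    K = [[1.0, 0.1], [-0.3, -2.0]]
    zero_grad = [[0.0, 0.0], [0.0, 0.0]]
    # With a zero gradient the step only shrinks entries; small ones become exactly 0.
    print(ista_step(K, zero_grad, step=1.0, lam=0.5))
```

Entries whose magnitude falls below `step * lam` are set exactly to zero, which is how this kind of update removes feedback links rather than merely making them small.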
Control of Unknown Quadrotors from a Single Throw IROS 2024
This paper presents a method to recover a quadrotor UAV from a throw, when no control parameters are known before the throw. We leverage the high-frequency rotor speed feedback available in racing drone hardware and software to find control effectiveness values and fit a motor model using recursive least squares (RLS) estimation. Furthermore, we propose an excitation sequence that provides large actuation commands while guaranteeing to stay within gyroscope sensing limits. After 450 ms of excitation, an INDI attitude controller uses the 52 fitted parameters to arrest rotational motion and recover an upright attitude. Finally, an NDI position controller drives the craft to a position setpoint. The proposed algorithm runs efficiently on microcontrollers found in common UAV flight controllers, and was shown to recover an agile quadrotor every time in 57 live experiments with throw heights as low as 3.5 m, demonstrating robustness against initial rotations and noise. We also demonstrate control of randomized quadrotors in simulated throws, where the parameter fitting RMS error is typically within 10% of the true value. This work has been submitted to IROS 2024 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
comment: 7 pages, 5 figures, 2 tables. Submitted to the IROS 2024 conference
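For readers unfamiliar with recursive least squares, the following is a minimal sketch of the standard RLS update for a linear-in-parameters model y = theta . x. It is not the paper's estimator: the regressors, the forgetting factor, and the initial covariance below are illustrative assumptions.

```python
def rls_update(theta, P, x, y, forgetting=1.0):
    """One standard RLS step for the model y = theta . x.

    theta: current parameter estimate (list of floats)
    P:     current covariance-like matrix (list of rows, symmetric)
    x, y:  new regressor vector and scalar measurement
    """
    n = len(theta)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(x[i] * Px[i] for i in range(n))
    gain = [Px[i] / denom for i in range(n)]
    error = y - sum(theta[i] * x[i] for i in range(n))
    theta_new = [theta[i] + gain[i] * error for i in range(n)]
    # P is symmetric, so x^T P equals (P x)^T and can reuse Px.
    P_new = [
        [(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n)]
        for i in range(n)
    ]
    return theta_new, P_new


if __name__ == "__main__":
    # Noise-free toy data generated from hypothetical true parameters [2, -1].
    theta = [0.0, 0.0]
    P = [[1e6, 0.0], [0.0, 1e6]]  # large initial P expresses little prior knowledge
    for x in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]:
        y = 2.0 * x[0] - 1.0 * x[1]
        theta, P = rls_update(theta, P, x, y)
    print(theta)
```

Each update costs O(n^2) operations and needs no stored history, which is consistent with the abstract's point that the fitting can run on a flight-controller microcontroller.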
Nonsmooth Projection-Free Optimization with Functional Constraints
This paper presents a subgradient-based algorithm for constrained nonsmooth convex optimization that does not require projections onto the feasible set. While the well-established Frank-Wolfe algorithm and its variants already avoid projections, they are primarily designed for smooth objective functions. In contrast, our proposed algorithm can handle nonsmooth problems with general convex functional inequality constraints. It achieves an $\epsilon$-suboptimal solution in $\mathcal{O}(\epsilon^{-2})$ iterations, with each iteration requiring only a single (potentially inexact) Linear Minimization Oracle (LMO) call and a (possibly inexact) subgradient computation. This performance is consistent with existing lower bounds. Similar performance is observed when deterministic subgradients are replaced with stochastic subgradients. In the special case where there are no functional inequality constraints, our algorithm competes favorably with a recent nonsmooth projection-free method designed for constraint-free problems. Our approach utilizes a simple separation scheme in conjunction with a new Lagrange multiplier update rule.
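To make the Linear Minimization Oracle concrete, here is a minimal sketch of an LMO for one feasible set where it has a closed form, the l1 ball. The choice of set and the name `lmo_l1_ball` are assumptions for illustration only; the abstract does not specify the sets used in the paper.

```python
def lmo_l1_ball(gradient, radius=1.0):
    """Return argmin over {s : ||s||_1 <= radius} of <gradient, s>.

    The minimizer is a signed, scaled basis vector at the coordinate
    where the gradient has the largest magnitude.
    """
    n = len(gradient)
    idx = max(range(n), key=lambda i: abs(gradient[i]))
    s = [0.0] * n
    s[idx] = -radius if gradient[idx] > 0 else radius
    return s


if __name__ == "__main__":
    print(lmo_l1_ball([0.5, -2.0, 1.0]))
```

The oracle only scans the gradient once, whereas a Euclidean projection onto the same ball requires a sorting or thresholding search; this cost gap is the usual motivation for projection-free methods.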
Perfecting Periodic Trajectory Tracking: Model Predictive Control with a Periodic Observer ($Π$-MPC) IROS 2024
In Model Predictive Control (MPC), discrepancies between the actual system and the predictive model can lead to substantial tracking errors and significantly degrade performance and reliability. While such discrepancies can be alleviated with more complex models, this often complicates controller design and implementation. By leveraging the fact that many trajectories of interest are periodic, we show that perfect tracking is possible when incorporating a simple observer that estimates and compensates for periodic disturbances. We present the design of the observer and the accompanying tracking MPC scheme, proving that their combination achieves zero tracking error asymptotically, regardless of the complexity of the unmodelled dynamics. We validate the effectiveness of our method, demonstrating asymptotically perfect tracking on a high-dimensional soft robot with nearly 10,000 states and a fivefold reduction in tracking errors compared to a baseline MPC on small-scale autonomous race car experiments.
comment: 8 pages, 3 figures; 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Decoupling Power Quality Issues in Grid-Microgrid Network Using Microgrid Building Blocks
Microgrids are evolving as promising options to enhance reliability of the connected transmission and distribution systems. Traditional design and deployment of microgrids require significant engineering analysis. Microgrid Building Blocks (MBB), consisting of modular blocks that integrate seamlessly to form effective microgrids, is an enabling concept for faster and broader adoption of microgrids. A back-to-back converter placed at the point of common coupling of the microgrid is an integral part of the MBB. This paper presents applications of MBB to decouple power quality issues in a grid-microgrid network serving power quality sensitive loads such as data centers, new grid-edge technologies such as vehicle-to-grid generation, and electric vehicle charging loads during evacuation before disaster events. Simulation results show that MBB effectively decouples the power quality issues across networks and helps maintain good power quality in the power quality sensitive network based on the operational scenario.
comment: This paper is accepted for publication in IEEE IECON 2024, Chicago, IL. The complete copyright version will be available on IEEE Xplore when the conference proceedings are published
Efficient Discovery of Actual Causality using Abstraction-Refinement
Causality is the relationship where one event contributes to the production of another, with the cause being partly responsible for the effect and the effect partly dependent on the cause. In this paper, we propose a novel and effective method to formally reason about the causal effect of events in engineered systems, with application for finding the root-cause of safety violations in embedded and cyber-physical systems. We are motivated by the notion of actual causality by Halpern and Pearl, which focuses on the causal effect of particular events rather than type-level causality, which attempts to make general statements about scientific and natural phenomena. Our first contribution is formulating discovery of actual causality in computing systems modeled by transition systems as an SMT solving problem. Since datasets for causality analysis tend to be large, in order to tackle the scalability problem of automated formal reasoning, our second contribution is a novel technique based on abstraction-refinement that allows identifying actual causes within smaller abstract causal models. We demonstrate the effectiveness of our approach (by several orders of magnitude) using three case studies to find the actual cause of violations of safety in (1) a neural network controller for a Mountain Car, (2) a controller for a Lunar Lander obtained by reinforcement learning, and (3) an MPC controller for an F-16 autopilot simulator.
Exploiting Monotonicity to Design an Adaptive PI Passivity-Based Controller for a Fuel-Cell System
We present a controller for a power electronic system composed of a fuel cell (FC) connected to a boost converter which feeds a resistive load. The controller aims to regulate the output voltage of the converter regardless of the uncertainty of the load. Leveraging the monotonicity feature of the fuel cell polarization curve we prove that the nonlinear system can be controlled by means of a passivity-based proportional-integral approach. We afterward extend the result to an adaptive version, allowing the controller to deal with parameter uncertainties, such as inductor parasitic resistance, load, and FC polarization curve parameters. This adaptive design is based on an indirect control approach with online parameter identification performed by a ``hybrid'' estimator which combines two techniques: the gradient-descent and immersion-and-invariance algorithms. The overall system is proved to be stable with the output voltage regulated to its reference. Experimental results validate our proposal under two real-life scenarios: pulsating load and output voltage reference changes.
comment: 11 pages, 8 Figs
Multiagent Systems
Identifying and Clustering Counter Relationships of Team Compositions in PvP Games for Efficient Balance Analysis
How can balance be quantified in game settings? This question is crucial for game designers, especially in player-versus-player (PvP) games, where analyzing the strength relations among predefined team compositions, such as hero combinations in multiplayer online battle arena (MOBA) games or decks in card games, is essential for enhancing gameplay and achieving balance. We have developed two advanced measures that extend beyond the simplistic win rate to quantify balance in zero-sum competitive scenarios. These measures are derived from win value estimations, which employ strength rating approximations via the Bradley-Terry model and counter relationship approximations via vector quantization, significantly reducing the computational complexity associated with traditional win value estimations. Throughout the learning process of these models, we identify useful categories of compositions and pinpoint their counter relationships, aligning with the experiences of human players without requiring specific game knowledge. Our methodology hinges on a simple technique to enhance codebook utilization in discrete representation with a deterministic vector quantization process for an extremely small state space. Our framework has been validated in popular online games, including Age of Empires II, Hearthstone, Brawl Stars, and League of Legends. The accuracy of the observed strength relations in these games is comparable to traditional pairwise win value predictions, while also offering a more manageable complexity for analysis. Ultimately, our findings contribute to a deeper understanding of PvP game dynamics and present a methodology that significantly improves game balance evaluation and design.
comment: TMLR 09/2024 https://openreview.net/forum?id=2D36otXvBE
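The Bradley-Terry model mentioned in the abstract assigns each competitor a scalar strength and predicts win probability from the difference in strengths. The sketch below shows only that standard formula with log-strength ratings; it does not reproduce the paper's learned rating approximations or its vector-quantization component, and the function names are illustrative.

```python
import math


def bt_win_probability(rating_a: float, rating_b: float) -> float:
    """P(A beats B) under Bradley-Terry, with ratings as log-strengths."""
    return 1.0 / (1.0 + math.exp(rating_b - rating_a))


def log_likelihood(ratings, matches) -> float:
    """Log-likelihood of observed results.

    matches is an iterable of (winner_index, loser_index) pairs.
    """
    return sum(
        math.log(bt_win_probability(ratings[w], ratings[l])) for w, l in matches
    )


if __name__ == "__main__":
    ratings = [1.0, 0.0, -0.5]  # hypothetical compositions
    print(bt_win_probability(ratings[0], ratings[1]))
    print(log_likelihood(ratings, [(0, 1), (0, 2), (1, 2)]))
```

A scalar rating model of this kind is always transitive, so it cannot represent rock-paper-scissors style cycles on its own; that limitation is one reason a separate counter-relationship term, as the abstract describes, is useful.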
Particle Flows for Source Localization in 3-D Using TDOA Measurements
Localization using time-difference of arrival (TDOA) has myriad applications, e.g., in passive surveillance systems and marine mammal research. In this paper, we present a Bayesian estimation method that can localize an unknown number of static sources in 3-D based on TDOA measurements. The proposed localization algorithm based on particle flow (PFL) can overcome the challenges related to the highly nonlinear TDOA measurement model, the data association (DA) uncertainty, and the uncertainty in the number of sources to be localized. Different PFL strategies are compared within a unified belief propagation (BP) framework in a challenging multisensor source localization problem. In particular, we consider PFL-based approximation of beliefs based on one or multiple Gaussian kernels with parameters computed using deterministic and stochastic flow processes. Our numerical results demonstrate that the proposed method can correctly determine the number of sources and provide accurate location estimates. The stochastic flow demonstrates greater accuracy compared to the deterministic flow when using the same number of particles.
comment: 8 pages
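The nonlinearity of the TDOA measurement model is easy to see from the noise-free geometry: a measurement is the difference of two source-to-sensor ranges divided by the propagation speed. The following minimal sketch assumes that textbook form; the sign convention, the sound-speed constant, and the function name are illustrative assumptions, and measurement noise and data association are omitted.

```python
import math

SPEED_OF_SOUND_WATER = 1500.0  # m/s, a nominal value assumed for illustration


def tdoa(source, sensor_i, sensor_j, speed=SPEED_OF_SOUND_WATER) -> float:
    """Noise-free time-difference of arrival between two sensors.

    Positive when the signal reaches sensor_j before sensor_i.
    All positions are 3-D coordinates in metres.
    """
    range_i = math.dist(source, sensor_i)
    range_j = math.dist(source, sensor_j)
    return (range_i - range_j) / speed


if __name__ == "__main__":
    source = (100.0, 50.0, -20.0)
    print(tdoa(source, (0.0, 0.0, 0.0), (200.0, 0.0, 0.0)))
```

All source positions with the same TDOA for a sensor pair lie on one sheet of a hyperboloid, so a single measurement constrains the source to a surface rather than a point, which is part of why several sensor pairs and a nonlinear estimator are needed.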
Scalable Multi-Agent Reinforcement Learning for Warehouse Logistics with Robotic and Human Co-Workers IROS
We consider a warehouse in which dozens of mobile robots and human pickers work together to collect and deliver items within the warehouse. The fundamental problem we tackle, called the order-picking problem, is how these worker agents must coordinate their movement and actions in the warehouse to maximise performance in this task. Established industry methods using heuristic approaches require large engineering efforts to optimise for innately variable warehouse configurations. In contrast, multi-agent reinforcement learning (MARL) can be flexibly applied to diverse warehouse configurations (e.g. size, layout, number/types of workers, item replenishment frequency), and different types of order-picking paradigms (e.g. Goods-to-Person and Person-to-Goods), as the agents can learn how to cooperate optimally through experience. We develop hierarchical MARL algorithms in which a manager agent assigns goals to worker agents, and the policies of the manager and workers are co-trained toward maximising a global objective (e.g. pick rate). Our hierarchical algorithms achieve significant gains in sample efficiency over baseline MARL algorithms and overall pick rates over multiple established industry heuristics in a diverse set of warehouse configurations and different order-picking paradigms.
comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Optimal and Bounded Suboptimal Any-Angle Multi-agent Pathfinding IROS 2024
Multi-agent pathfinding (MAPF) is the problem of finding a set of conflict-free paths for a set of agents. Typically, the agents' moves are limited to a pre-defined graph of possible locations and allowed transitions between them, e.g. a 4-neighborhood grid. We explore how to solve MAPF problems when each agent can move between any pair of possible locations as long as traversing the line segment connecting them does not lead to a collision with the obstacles. This is known as any-angle pathfinding. We present the first optimal any-angle multi-agent pathfinding algorithm. Our planner is based on the Continuous Conflict-based Search (CCBS) algorithm and an optimal any-angle variant of the Safe Interval Path Planning (TO-AA-SIPP). The straightforward combination of those, however, scales poorly since any-angle pathfinding induces search trees with a very large branching factor. To mitigate this, we adapt two techniques from classical MAPF to the any-angle setting, namely Disjoint Splitting and Multi-Constraints. Experimental results on different combinations of these techniques show they enable solving over 30% more problems than the vanilla combination of CCBS and TO-AA-SIPP. In addition, we present a bounded-suboptimal variant of our algorithm that enables trading runtime for solution cost in a controlled manner.
comment: This is a pre-print version of the paper accepted to IROS 2024. Its main body is similar to the camera-ready version of the conference paper. In addition this pre-print contains Appendix
Optimizing Agent Collaboration through Heuristic Multi-Agent Planning
The SOTA algorithms for addressing QDec-POMDP issues, QDec-FP and QDec-FPS, are unable to effectively tackle problems that involve different types of sensing agents. We propose a new algorithm that addresses this issue by requiring agents to adopt the same plan if one agent is unable to take a sensing action but the other can. Our algorithm performs significantly better than both QDec-FP and QDec-FPS in these types of situations.
comment: Paper has not been finished
Solving Collaborative Dec-POMDPs with Deep Reinforcement Learning Heuristics
WQMIX, QMIX, QTRAN, and VDN are SOTA algorithms for Dec-POMDP. None of them can solve complex agents' cooperation domains. We give an algorithm to solve such problems. In the first stage, we solve a single-agent problem and get a policy. In the second stage, we solve the multi-agent problem with the single-agent policy. SA2MA has a clear advantage over all competitors in complex agents' cooperative domains.
comment: Paper has not been finished
Emergence of Social Norms in Generative Agent Societies: Principles and Architecture IJCAI 2024
Social norms play a crucial role in guiding agents towards understanding and adhering to standards of behavior, thus reducing social conflicts within multi-agent systems (MASs). However, current LLM-based (or generative) MASs lack the capability to be normative. In this paper, we propose a novel architecture, named CRSEC, to empower the emergence of social norms within generative MASs. Our architecture consists of four modules: Creation & Representation, Spreading, Evaluation, and Compliance. This addresses several important aspects of the emergent processes all in one: (i) where social norms come from, (ii) how they are formally represented, (iii) how they spread through agents' communications and observations, (iv) how they are examined with a sanity check and synthesized in the long term, and (v) how they are incorporated into agents' planning and actions. Our experiments deployed in the Smallville sandbox game environment demonstrate the capability of our architecture to establish social norms and reduce social conflicts within generative MASs. The positive outcomes of our human evaluation, conducted with 30 evaluators, further affirm the effectiveness of our approach. Our project can be accessed via the following link: https://github.com/sxswz213/CRSEC.
comment: Published as a conference paper at IJCAI 2024
Robotics
3D Whole-body Grasp Synthesis with Directional Controllability
Synthesizing 3D whole-bodies that realistically grasp objects is useful for animation, mixed reality, and robotics. This is challenging, because the hands and body need to look natural w.r.t. each other, the grasped object, as well as the local scene (i.e., a receptacle supporting the object). Only recent work tackles this, with a divide-and-conquer approach; it first generates a "guiding" right-hand grasp, and then searches for bodies that match this. However, the guiding-hand synthesis lacks controllability and receptacle awareness, so it likely has an implausible direction (i.e., a body can't match this without penetrating the receptacle) and needs corrections through major post-processing. Moreover, the body search needs exhaustive sampling and is expensive. These are strong limitations. We tackle these with a novel method called CWGrasp. Our key idea is that performing geometry-based reasoning "early on," instead of "too late," provides rich "control" signals for inference. To this end, CWGrasp first samples a plausible reaching-direction vector (used later for both the arm and hand) from a probabilistic model built via raycasting from the object and collision checking. Then, it generates a reaching body with a desired arm direction, as well as a "guiding" grasping hand with a desired palm direction that complies with the arm's one. Eventually, CWGrasp refines the body to match the "guiding" hand, while plausibly contacting the scene. Notably, generating already-compatible "parts" greatly simplifies the "whole." Moreover, CWGrasp uniquely tackles both right- and left-hand grasps. We evaluate on the GRAB and ReplicaGrasp datasets. CWGrasp outperforms baselines, at lower runtime and budget, while all components help performance. Code and models will be released.
Auricular Vagus Nerve Stimulation for Enhancing Remote Pilot Training and Operations
The rapid growth of the drone industry, particularly in the use of small unmanned aerial systems (sUAS) and unmanned aerial vehicles (UAVs), requires the development of advanced training protocols for remote pilots. Remote pilots must develop a combination of technical and cognitive skills to manage the complexities of modern drone operations. This paper explores the integration of neurotechnology, specifically auricular vagus nerve stimulation (aVNS), as a method to enhance remote pilot training and performance. The scientific literature shows aVNS can safely improve cognitive functions such as attention, learning, and memory. It has also been shown useful to manage stress responses. For safe and efficient sUAS/UAV operation, it is essential for pilots to maintain high levels of vigilance and decision-making under pressure. By modulating sympathetic stress and cortical arousal, aVNS can prime cognitive faculties before training, help maintain focus during training and improve stress recovery post-training. Furthermore, aVNS has demonstrated the potential to enhance multitasking and cognitive control. This may help remote pilots during complex sUAS operations by potentially reducing the risk of impulsive decision-making or cognitive errors. This paper advocates for the inclusion of aVNS in remote pilot training programs by proposing that it can provide significant benefits in improving cognitive readiness, skill and knowledge acquisition, as well as operational safety and efficiency. Future research should focus on optimizing aVNS protocols for drone pilots while assessing long-term benefits to industrial safety and workforce readiness in real-world scenarios.
comment: 21 pages, 7 figures
A compact neuromorphic system for ultra energy-efficient, on-device robot localization
Neuromorphic computing offers a transformative pathway to overcome the computational and energy challenges faced in deploying robotic localization and navigation systems at the edge. Visual place recognition, a critical component for navigation, is often hampered by the high resource demands of conventional systems, making them unsuitable for small-scale robotic platforms which are still required to perform complex, long-range tasks. Although neuromorphic approaches offer potential for greater efficiency, real-time edge deployment remains constrained by the complexity and limited scalability of bio-realistic networks. Here, we demonstrate a neuromorphic localization system that performs accurate place recognition in up to 8 km of traversal using models as small as 180 KB with 44k parameters, while consuming less than 1% of the energy required by conventional methods. Our Locational Encoding with Neuromorphic Systems (LENS) integrates spiking neural networks, an event-based dynamic vision sensor, and a neuromorphic processor within a single SPECK(TM) chip, enabling real-time, energy-efficient localization on a hexapod robot. LENS represents the first fully neuromorphic localization system capable of large-scale, on-device deployment, setting a new benchmark for energy efficient robotic place recognition.
comment: 28 pages, 4 main figures, 4 supplementary figures, 1 supplementary table, and 1 movie. Under review
Bipedal locomotion using geometric techniques
This article describes a bipedal walking algorithm with inverse kinematics resolution based solely on geometric methods, so that all mathematical concepts are explained from the base, in order to clarify the reason for this solution. To do so, it has been necessary to simplify the problem and carry out didactic work to distribute content. In general, the articles related to this topic use matrix systems to solve both direct and inverse kinematics, using complex techniques such as decoupling or the Jacobian calculation. By simplifying the walking process, its resolution has been proposed in a simple way using only geometric techniques.
comment: in Spanish language
RoboMNIST: A Multimodal Dataset for Multi-Robot Activity Recognition Using WiFi Sensing, Video, and Audio
We introduce a novel dataset for multi-robot activity recognition (MRAR) using two robotic arms integrating WiFi channel state information (CSI), video, and audio data. This multimodal dataset utilizes signals of opportunity, leveraging existing WiFi infrastructure to provide detailed indoor environmental sensing without additional sensor deployment. Data were collected using two Franka Emika robotic arms, complemented by three cameras, three WiFi sniffers to collect CSI, and three microphones capturing distinct yet complementary audio data streams. The combination of CSI, visual, and auditory data can enhance robustness and accuracy in MRAR. This comprehensive dataset enables a holistic understanding of robotic environments, facilitating advanced autonomous operations that mimic human-like perception and interaction. By repurposing ubiquitous WiFi signals for environmental sensing, this dataset offers significant potential for advancing robotic perception and autonomous systems. It provides a valuable resource for developing sophisticated decision-making and adaptive capabilities in dynamic environments.
Optimizing Automated Picking Systems in Warehouse Robots Using Machine Learning
With the rapid growth of global e-commerce, the demand for automation in the logistics industry is increasing. This study focuses on automated picking systems in warehouses, utilizing deep learning and reinforcement learning technologies to enhance picking efficiency and accuracy while reducing system failure rates. Through empirical analysis, we demonstrate the effectiveness of these technologies in improving robot picking performance and adaptability to complex environments. The results show that the integrated machine learning model significantly outperforms traditional methods, effectively addressing the challenges of peak order processing, reducing operational errors, and improving overall logistics efficiency. Additionally, by analyzing environmental factors, this study further optimizes system design to ensure efficient and stable operation under variable conditions. This research not only provides innovative solutions for logistics automation but also offers a theoretical and empirical foundation for future technological development and application.
Identifying Terrain Physical Parameters from Vision -- Towards Physical-Parameter-Aware Locomotion and Navigation
Identifying the physical properties of the surrounding environment is essential for robotic locomotion and navigation to deal with non-geometric hazards, such as slippery and deformable terrains. It would be of great benefit for robots to anticipate these extreme physical properties before contact; however, estimating environmental physical parameters from vision is still an open challenge. Animals can achieve this by using their prior experience and knowledge of what they have seen and how it felt. In this work, we propose a cross-modal self-supervised learning framework for vision-based environmental physical parameter estimation, which paves the way for future physical-property-aware locomotion and navigation. We bridge the gap between existing policies trained in simulation and identification of physical terrain parameters from vision. We propose to train a physical decoder in simulation to predict friction and stiffness from multi-modal input. The trained network allows the labeling of real-world images with physical parameters in a self-supervised manner to further train a visual network during deployment, which can densely predict the friction and stiffness from image data. We validate our physical decoder in simulation and the real world using a quadruped ANYmal robot, outperforming an existing baseline method. We show that our visual network can predict the physical properties in indoor and outdoor experiments while allowing fast adaptation to new environments.
DroneWiS: Automated Simulation Testing of small Unmanned Aerial Systems in Realistic Windy Conditions
The continuous evolution of small Unmanned Aerial Systems (sUAS) demands advanced testing methodologies to ensure their safe and reliable operation in the real world. To push the boundaries of sUAS simulation testing in realistic environments, we previously developed the DroneReqValidator (DRV) platform, allowing developers to automatically conduct simulation testing in a digital twin of the Earth. In this paper, we present DRV 2.0, which introduces a novel component called DroneWiS (Drone Wind Simulation). DroneWiS allows sUAS developers to automatically simulate realistic windy conditions and test the resilience of sUAS against wind. Unlike current state-of-the-art simulation tools such as Gazebo and AirSim that only simulate basic wind conditions, DroneWiS leverages Computational Fluid Dynamics (CFD) to compute the unique wind flows caused by the interaction of wind with objects in the environment, such as buildings and uneven terrain. This simulation capability provides developers with deeper insights into the navigation capability of sUAS in challenging and realistic windy conditions. DroneWiS equips sUAS developers with a powerful tool to test, debug, and improve the reliability and safety of sUAS in the real world. A working demonstration is available at https://youtu.be/khBHEBST8Wc
UAV-Based Human Body Detector Selection and Fusion for Geolocated Saliency Map Generation
The problem of reliably detecting and geolocating objects of different classes in soft real-time is essential in many application areas, such as Search and Rescue performed using Unmanned Aerial Vehicles (UAVs). This research addresses the complementary problems of system contextual vision-based detector selection, allocation, and execution, in addition to the fusion of detection results from teams of UAVs for the purpose of accurately and reliably geolocating objects of interest in a timely manner. In an offline step, an application-independent evaluation of vision-based detectors from a system perspective is first performed. Based on this evaluation, the most appropriate algorithms for online object detection for each platform are selected automatically before a mission, taking into account a number of practical system considerations, such as the available communication links, video compression used, and the available computational resources. The detection results are fused using a method for building maps of salient locations which takes advantage of a novel sensor model for vision-based detections for both positive and negative observations. A number of simulated and real flight experiments are also presented, validating the proposed method.
comment: 42 pages, 19 figures
Integrating Features for Recognizing Human Activities through Optimized Parameters in Graph Convolutional Networks and Transformer Architectures
Human activity recognition is a major field of study that employs computer vision, machine vision, and deep learning techniques to categorize human actions. The field of deep learning has made significant progress, with architectures that are extremely effective at capturing human dynamics. This study emphasizes the influence of feature fusion on the accuracy of activity recognition. This technique addresses the limitation of conventional models, which face difficulties in identifying activities because of their limited capacity to understand spatial and temporal features. The technique employs sensory data obtained from four publicly available datasets: HuGaDB, PKU-MMD, LARa, and TUG. The accuracy and F1-score of two deep learning models, specifically a Transformer model and a Parameter-Optimized Graph Convolutional Network (PO-GCN), were evaluated using these datasets. The feature fusion technique integrated the final layer features from both models and inputted them into a classifier. Empirical evidence demonstrates that PO-GCN outperforms standard models in activity recognition. HuGaDB demonstrated a 2.3% improvement in accuracy and a 2.2% increase in F1-score. TUG showed a 5% increase in accuracy and a 0.5% rise in F1-score. On the other hand, LARa and PKU-MMD achieved lower accuracies of 64% and 69% respectively. This indicates that the integration of features enhanced the performance of both the Transformer model and PO-GCN.
comment: 6 pages, 1 figure, conference
Time-Optimized Trajectory Planning for Non-Prehensile Object Transportation in 3D
Non-prehensile object transportation offers a way to enhance robotic performance in object manipulation tasks, especially with unstable objects. Effective trajectory planning requires simultaneous consideration of robot motion constraints and object stability. Here, we introduce a physical model for object stability and propose a novel trajectory planning approach for non-prehensile transportation along arbitrary straight lines in 3D space. Validation with a 7-DoF Franka Panda robot confirms improved transportation speed via tray rotation integration while ensuring object stability and robot motion constraints.
comment: Accepted to the European Robotic Forum (ERF) 2024
EasyChauffeur: A Baseline Advancing Simplicity and Efficiency on Waymax
Recent advancements in deep-learning-based driving planners have primarily focused on elaborate network engineering, yielding limited improvements. This paper diverges from conventional approaches by exploring three fundamental yet underinvestigated aspects: training policy, data efficiency, and evaluation robustness. We introduce EasyChauffeur, a reproducible and effective planner for both imitation learning (IL) and reinforcement learning (RL) on Waymax, a GPU-accelerated simulator. Notably, our findings indicate that the incorporation of on-policy RL significantly boosts performance and data efficiency. To further enhance this efficiency, we propose SNE-Sampling, a novel method that selectively samples data from the encoder's latent space, substantially improving EasyChauffeur's performance with RL. Additionally, we identify a deficiency in current evaluation methods, which fail to accurately assess the robustness of different planners due to significant performance drops from minor changes in the ego vehicle's initial state. In response, we propose Ego-Shifting, a new evaluation setting for assessing planners' robustness. Our findings advocate for a shift from a primary focus on network architectures to adopting a holistic approach encompassing training strategies, data efficiency, and robust evaluation methods.
Efficient Multi-agent Navigation with Lightweight DRL Policy
In this article, we present an end-to-end collision avoidance policy based on deep reinforcement learning (DRL) for multi-agent systems, demonstrating encouraging outcomes in real-world applications. In particular, our policy calculates the control commands of the agent based on the raw LiDAR observation. In addition, the number of parameters of the proposed basic model is 140,000, and the size of the parameter file is 3.5 MB, which allows the robot to calculate the actions from the CPU alone. We propose a multi-agent training platform based on a physics-based simulator to further bridge the gap between simulation and the real world. The policy is trained on a policy-gradients-based RL algorithm in a dense and messy training environment. A novel reward function is introduced to address the issue of agents choosing suboptimal actions in some common scenarios. Although the data used for training is exclusively from the simulation platform, the policy can be successfully transferred and deployed in real-world robots. Finally, our policy effectively responds to intentional obstructions and avoids collisions. The website is available at https://sites.google.com/view/xingrong2024efficient/%E9%A6%96%E9%A1%B5.
An Accurate Filter-based Visual Inertial External Force Estimator via Instantaneous Accelerometer Update ICRA
Accurate disturbance estimation is crucial for reliable robotic physical interaction. To estimate environmental interference in a low-cost and sensorless way (without force sensor), a variety of tightly-coupled visual inertial external force estimators are proposed in the literature. However, existing solutions may suffer from relatively low-frequency preintegration. In this paper, a novel estimator is designed to overcome this issue via high-frequency instantaneous accelerometer update.
comment: Accepted by the 40th Anniversary of the IEEE Conference on Robotics and Automation (ICRA@40)
BEVal: A Cross-dataset Evaluation Study of BEV Segmentation Models for Autonomous Driving
Current research in semantic bird's-eye view segmentation for autonomous driving focuses solely on optimizing neural network models using a single dataset, typically nuScenes. This practice leads to the development of highly specialized models that may fail when faced with different environments or sensor setups, a problem known as domain shift. In this paper, we conduct a comprehensive cross-dataset evaluation of state-of-the-art BEV segmentation models to assess their performance across different training and testing datasets and setups, as well as different semantic categories. We investigate the influence of different sensors, such as cameras and LiDAR, on the models' ability to generalize to diverse conditions and scenarios. Additionally, we conduct multi-dataset training experiments that improve models' BEV segmentation performance compared to single-dataset training. Our work addresses the gap in evaluating BEV segmentation models under cross-dataset validation. Our findings underscore the importance of enhancing model generalizability and adaptability to ensure more robust and reliable BEV segmentation approaches for autonomous driving applications.
Safe Bayesian Optimization for High-Dimensional Control Systems via Additive Gaussian Processes
Controller tuning and optimization have been among the most fundamental problems in robotics and mechatronic systems. The traditional methodology is usually model-based, but its performance heavily relies on an accurate mathematical model of the system. In control applications with complex dynamics, obtaining a precise model is often challenging, leading us towards a data-driven approach. While optimizing a single controller has been explored by various researchers, it remains a challenge to obtain the optimal controller parameters safely and efficiently when multiple controllers are involved. In this paper, we propose a high-dimensional safe Bayesian optimization method based on additive Gaussian processes to optimize multiple controllers simultaneously and safely. Additive Gaussian kernels replace the traditional squared-exponential kernels or Matérn kernels, enhancing the efficiency with which Gaussian processes update information on unknown functions. Experimental results on a permanent magnet synchronous motor (PMSM) demonstrate that compared to existing safe Bayesian optimization algorithms, our method can obtain optimal parameters more efficiently while ensuring safety.
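As a rough illustration of the additive Gaussian kernels mentioned in this abstract, the sketch below builds a first-order additive kernel as a sum of one-dimensional squared-exponential kernels and compares it with a joint squared-exponential kernel. The first-order structure, the function names, and the hyperparameter values are assumptions made for illustration; the abstract does not specify the exact kernel form.

```python
import math

def se_1d(a, b, lengthscale=1.0, variance=1.0):
    """One-dimensional squared-exponential kernel."""
    return variance * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def additive_kernel(x, y, lengthscales, variances):
    """Assumed first-order additive kernel: sum of per-dimension SE kernels."""
    return sum(
        se_1d(a, b, l, v)
        for a, b, l, v in zip(x, y, lengthscales, variances)
    )

def joint_se_kernel(x, y, lengthscales, variance=1.0):
    """Standard joint SE kernel over all dimensions, for comparison."""
    sq = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, lengthscales))
    return variance * math.exp(-0.5 * sq)

# Two points that agree in dimension 0 but are far apart in dimension 1.
x, y = [0.0, 0.0], [0.0, 100.0]
k_add = additive_kernel(x, y, [1.0, 1.0], [1.0, 1.0])
k_joint = joint_se_kernel(x, y, [1.0, 1.0])
```

Under these assumptions the additive kernel still reports correlation through the shared dimension, while the joint kernel decays to zero. This is one reason additive structure can make each observation informative about more of a high-dimensional parameter space.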
Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation
Learned language-conditioned robot policies often struggle to effectively adapt to new real-world tasks even when pre-trained across a diverse set of instructions. We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition provided by vision-language models (VLMs). Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with proposed language decompositions sampled from a VLM to enable rapid nonparametric adaptation, avoiding the need for a larger fine-tuning dataset. We evaluate PALO on extensive real-world experiments consisting of challenging unseen, long-horizon robot manipulation tasks. We find that PALO is able to consistently complete long-horizon, multi-tier tasks in the real world, outperforming state-of-the-art pre-trained generalist policies and methods that have access to the same demonstrations.
comment: 27 pages, 14 figures
RMMI: Enhanced Obstacle Avoidance for Reactive Mobile Manipulation using an Implicit Neural Map
We introduce RMMI, a novel reactive control framework for mobile manipulators operating in complex, static environments. Our approach leverages a neural Signed Distance Field (SDF) to model intricate environment details and incorporates this representation as inequality constraints within a Quadratic Program (QP) to coordinate robot joint and base motion. A key contribution is the introduction of an active collision avoidance cost term that maximises the total robot distance to obstacles during the motion. We first evaluate our approach in a simulated reaching task, outperforming previous methods that rely on representing both the robot and the scene as a set of primitive geometries. Compared with the baseline, we improved the task success rate by 25% in total, including a 10% increase from using the active collision cost. We also demonstrate our approach on a real-world platform, showing its effectiveness in reaching target poses in cluttered and confined spaces using environment models built directly from sensor data. For additional details and experiment videos, visit https://rmmi.github.io/.
comment: 8 pages, 6 figures, Paper under review
FlowRetrieval: Flow-Guided Data Retrieval for Few-Shot Imitation Learning
Few-shot imitation learning relies on only a small amount of task-specific demonstrations to efficiently adapt a policy for a given downstream task. Retrieval-based methods come with a promise of retrieving relevant past experiences to augment this target data when learning policies. However, existing data retrieval methods fall under two extremes: they either rely on the existence of exact behaviors with visually similar scenes in the prior data, which is impractical to assume; or they retrieve based on semantic similarity of high-level language descriptions of the task, which might not be that informative about the shared low-level behaviors or motions across tasks, which are often a more important factor for retrieving relevant data for policy learning. In this work, we investigate how we can leverage motion similarity in the vast amount of cross-task data to improve few-shot imitation learning of the target task. Our key insight is that motion-similar data carries rich information about the effects of actions and object interactions that can be leveraged during few-shot adaptation. We propose FlowRetrieval, an approach that leverages optical flow representations both for extracting motions similar to the target tasks from prior data and for guiding the learning of a policy that can maximally benefit from such data. Our results show FlowRetrieval significantly outperforms prior methods across simulated and real-world domains, achieving on average 27% higher success rate than the best retrieval-based prior method. In the Pen-in-Cup task with a real Franka Emika robot, FlowRetrieval achieves 3.7x the performance of the baseline imitation learning technique that learns from all prior and target data. Website: https://flow-retrieval.github.io
Autonomous Image-to-Grasp Robotic Suturing Using Reliability-Driven Suture Thread Reconstruction
Automating suturing during robotically-assisted surgery reduces the burden on the operating surgeon, enabling them to focus on making higher-level decisions rather than fatiguing themselves in the numerous intricacies of a surgical procedure. Accurate suture thread reconstruction and grasping are vital prerequisites for suturing, particularly for avoiding entanglement with surgical tools and performing complex thread manipulation. However, such methods must be robust to heavy perceptual degradation resulting from heavy noise and thread feature sparsity from endoscopic images. We develop a reconstruction algorithm that utilizes quadratic programming optimization to fit smooth splines to thread observations, satisfying reliability bounds estimated from measured observation noise. Additionally, we craft a grasping policy that generates gripper trajectories that maximize the probability of a successful grasp. Our full image-to-grasp pipeline is rigorously evaluated with over 400 grasping trials, exhibiting state-of-the-art accuracy. We show that this strategy can be applied to the various techniques in autonomous suture needle manipulation to achieve autonomous surgery in a generalizable way.
comment: 8 pages, 6 figures, Submitted to RAL
Robotic warehousing operations: a learn-then-optimize approach to large-scale neighborhood search
The rapid deployment of robotics technologies requires dedicated optimization algorithms to manage large fleets of autonomous agents. This paper supports robotic parts-to-picker operations in warehousing by optimizing order-workstation assignments, item-pod assignments and the schedule of order fulfillment at workstations. The model maximizes throughput, while managing human workload at the workstations and congestion in the facility. We solve it via large-scale neighborhood search, with a novel learn-then-optimize approach to subproblem generation. The algorithm relies on an offline machine learning procedure to predict objective improvements based on subproblem features, and an online optimization model to generate a new subproblem at each iteration. In collaboration with Amazon Robotics, we show that our model and algorithm generate much stronger solutions for practical problems than state-of-the-art approaches. In particular, our solution enhances the utilization of robotic fleets by coordinating robotic tasks for human operators to pick multiple items at once, and by coordinating robotic routes to avoid congestion in the facility.
Learning Multi-agent Multi-machine Tending by Mobile Robots
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborative robots can tackle that can also highly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. In this work, we introduce a multi-agent multi-machine tending learning framework by mobile robots based on Multi-agent Reinforcement Learning (MARL) techniques with the design of a suitable observation and reward. Moreover, an attention-based encoding mechanism is developed and integrated into the Multi-agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine tending scenarios. Our model (AB-MAPPO) outperformed MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provided an extensive ablation study to support our various design decisions.
comment: 7 pages, 4 figures
CalTag: Robust calibration of mmWave Radar and LiDAR using backscatter tags
The rise of automation in robotics necessitates the use of high-quality perception systems, often through the use of multiple sensors. A crucial aspect of a successfully deployed multi-sensor system is the calibration with a known object, typically called a fiducial. In this work, we propose a novel fiducial system for millimeter wave radars, termed CalTag. CalTag addresses the limitations of traditional corner reflector-based calibration methods in extremely cluttered environments. CalTag leverages millimeter wave backscatter technology to achieve more reliable calibration than corner reflectors, enhancing the overall performance of multi-sensor perception systems. We compare the performance in several real-world environments and show the improvement achieved by using CalTag as the radar fiducial over a corner reflector.
Measuring Transparency in Intelligent Robots
As robots become increasingly integrated into our daily lives, the need to make them transparent has never been more critical. Yet, despite its importance in human-robot interaction, a standardized measure of robot transparency has been missing until now. This paper addresses this gap by presenting the first comprehensive scale to measure perceived transparency in robotic systems, available in English, German, and Italian. Our approach conceptualizes transparency as a multidimensional construct, encompassing explainability, legibility, predictability, and meta-understanding. The proposed scale was the product of a rigorous three-stage process involving 1,223 participants. First, we generated the items of our scale; second, we conducted an exploratory factor analysis; and third, a confirmatory factor analysis served to validate the factor structure of the newly developed TOROS scale. The final scale encompasses 26 items and comprises three factors: Illegibility, Explainability, and Predictability. TOROS demonstrates high cross-linguistic reliability, inter-factor correlation, model fit, internal consistency, and convergent validity across the three cross-national samples. This empirically validated tool enables the assessment of robot transparency and contributes to the theoretical understanding of this complex construct. By offering a standardized measure, we facilitate consistent and comparable research in human-robot interaction in which TOROS can serve as a benchmark.
A framework for training and benchmarking algorithms that schedule robot tasks
Service robots work in a changing environment inhabited by exogenous agents such as humans. In the service robotics domain, many uncertainties result from exogenous actions and inaccurate localisation of objects and the robot itself. This makes the robot task scheduling problem incredibly challenging. In this article, we propose a benchmarking system for systematically assessing the performance of algorithms scheduling robot tasks. The robot environment incorporates a room map, furniture, transportable objects, and moving humans; the system defines interfaces for the algorithms, tasks to be executed, and evaluation methods. The system consists of several tools, easing testing scenario generation for training AI-based scheduling algorithms and statistical testing. For benchmarking purposes, a set of scenarios is chosen, and the performance of several scheduling algorithms is assessed. The system source is published to serve the community for tuning and comparable assessment of robot task scheduling algorithms for service robots.
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery
What data or environments to use for training to improve downstream performance is a longstanding and very topical question in reinforcement learning. In particular, Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula enable agents to be robust to in- and out-of-distribution tasks. We ask to what extent these methods are themselves robust when applied to a novel setting, closely inspired by a real-world robotics problem. Surprisingly, we find that the state-of-the-art UED methods either do not improve upon the naïve baseline of Domain Randomisation (DR), or require substantial hyperparameter tuning to do so. Our analysis shows that this is due to their underlying scoring functions failing to predict intuitive measures of "learnability", i.e., in finding the settings that the agent sometimes solves, but not always. Based on this, we instead directly train on levels with high learnability and find that this simple and intuitive approach outperforms UED methods and DR in several binary-outcome environments, including on our domain and the standard UED domain of Minigrid. We further introduce a new adversarial evaluation procedure for directly measuring robustness, closely mirroring the conditional value at risk (CVaR). We open-source all our code and present visualisations of final policies here: https://github.com/amacrutherford/sampling-for-learnability.
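The idea of training on levels that the agent "sometimes solves, but not always" can be sketched as follows. The score p * (1 - p) over a per-level success rate p is an assumed formalisation of learnability for binary-outcome environments, and the function names are hypothetical; the abstract itself does not give a formula. The last helper shows a CVaR-style statistic (mean of the worst fraction of returns), again only as an illustration of the evaluation idea.

```python
def learnability(success_rate):
    """Assumed score p * (1 - p): zero if a level is always or never solved."""
    return success_rate * (1.0 - success_rate)

def top_k_learnable(success_rates, k):
    """Return the k level ids with the highest learnability score."""
    ranked = sorted(
        success_rates,
        key=lambda level: learnability(success_rates[level]),
        reverse=True,
    )
    return ranked[:k]

def cvar_lowest(returns, alpha):
    """Mean of the worst alpha-fraction of returns (CVaR-style statistic)."""
    n = max(1, int(len(returns) * alpha))
    return sum(sorted(returns)[:n]) / n

# Hypothetical success rates estimated from rollouts on four levels.
rates = {"a": 0.0, "b": 0.5, "c": 1.0, "d": 0.9}
chosen = top_k_learnable(rates, 2)
worst_case = cvar_lowest([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.2)
```

With these example numbers, the levels solved half of the time or nearly always are ranked above those that are trivially solved or never solved.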
Stochastic Adaptive Estimation in Polynomial Curvature Shape State Space for Continuum Robots
In continuum robotics, real-time robust shape estimation is crucial for planning and control tasks that involve physical manipulation in complex environments. In this paper, we present a novel stochastic observer-based shape estimation framework designed specifically for continuum robots. The shape state space is uniquely represented by the modal coefficients of a polynomial, enabled by leveraging polynomial curvature kinematics to describe the curvature distribution along the arclength. Our framework processes noisy measurements from limited discrete position, orientation, or pose sensors to estimate the shape state robustly. We derive a novel noise-weighted observability matrix, providing a detailed assessment of observability variations under diverse sensor configurations. To overcome the limitations of a single model, our observer employs the Interacting Multiple Model (IMM) method, coupled with Extended Kalman Filters (EKFs), to mix polynomial curvature models of different orders. The IMM approach, rooted in Markov processes, effectively manages multiple model scenarios by dynamically adapting to different polynomial orders based on real-time model probabilities. This adaptability is key to ensuring robust shape estimation of the robot's behaviors under various conditions. Our comprehensive analysis, supported by both simulation studies and experimental validations, confirms the robustness and accuracy of our proposed methods.
comment: 20 pages, submitted to IEEE Transactions on Robotics, under review
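The Interacting Multiple Model (IMM) bookkeeping referred to in this abstract can be sketched in a few lines. This follows the textbook IMM mode-probability recursion and is not taken from the paper: the per-model EKF steps are omitted, the likelihood values are made-up placeholders, and zero-padding lower-order polynomial states to a common dimension before combining them is an assumption.

```python
def predict_mode_probs(mu, transition):
    """c_j = sum_i transition[i][j] * mu[i] (Markov prediction of model probs)."""
    n = len(mu)
    return [sum(transition[i][j] * mu[i] for i in range(n)) for j in range(n)]

def mixing_weights(mu, transition):
    """w[i][j]: probability that model i was active, given model j is now.

    Assumes every predicted probability c_j is strictly positive.
    """
    c = predict_mode_probs(mu, transition)
    n = len(mu)
    return [[transition[i][j] * mu[i] / c[j] for j in range(n)] for i in range(n)]

def update_mode_probs(predicted, likelihoods):
    """Bayes update: mu_j proportional to likelihood_j * c_j."""
    unnorm = [c * lik for c, lik in zip(predicted, likelihoods)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def combine_estimates(estimates, mu):
    """Probability-weighted mean of per-model states (padded to equal length)."""
    dim = len(estimates[0])
    return [sum(m * est[d] for m, est in zip(mu, estimates)) for d in range(dim)]

# Two hypothetical curvature models (e.g. lower and higher polynomial order).
transition = [[0.9, 0.1], [0.1, 0.9]]
c = predict_mode_probs([0.5, 0.5], transition)
mu = update_mode_probs(c, [3.0, 1.0])  # placeholder EKF measurement likelihoods
fused = combine_estimates([[1.0, 0.0], [3.0, 2.0]], mu)
```

In a full observer, each model's EKF would start its cycle from a state mixed with `mixing_weights`, and its innovation likelihood would feed `update_mode_probs`.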
Manipulate-Anything: Automating Real-World Robots using Vision-Language Models
Large-scale endeavors like and widespread community efforts such as Open-X-Embodiment have contributed to growing the scale of robot demonstration data. However, there is still an opportunity to improve the quality, quantity, and diversity of robot demonstration data. Although vision-language models have been shown to automatically generate demonstration data, their utility has been limited to environments with privileged state information, they require hand-designed skills, and they are limited to interactions with few object instances. We propose Manipulate-Anything, a scalable automated generation method for real-world robotic manipulation. Unlike prior work, our method can operate in real-world environments without any privileged state information or hand-designed skills, and can manipulate any static object. We evaluate our method using two setups. First, Manipulate-Anything successfully generates trajectories for all 7 real-world and 14 simulation tasks, significantly outperforming existing methods like VoxPoser. Second, Manipulate-Anything's demonstrations can train more robust behavior cloning policies than training with human demonstrations, or from data generated by VoxPoser, Scaling-up, and Code-As-Policies. We believe Manipulate-Anything can be a scalable method for both generating data for robotics and solving novel tasks in a zero-shot setting. Project page: https://robot-ma.github.io/.
comment: Project page: https://robot-ma.github.io/. All supplementary material, prompts and code can be found on the project page
Gameplay Filters: Robust Zero-Shot Safety through Adversarial Imagination
Despite the impressive recent advances in learning-based robot control, ensuring robustness to out-of-distribution conditions remains an open challenge. Safety filters can, in principle, keep arbitrary control policies from incurring catastrophic failures by overriding unsafe actions, but existing solutions for complex (e.g., legged) robot dynamics do not span the full motion envelope and instead rely on local, reduced-order models. These filters tend to overly restrict agility and can still fail when perturbed away from nominal conditions. This paper presents the gameplay filter, a new class of predictive safety filter that continually plays out hypothetical matches between its simulation-trained safety strategy and a virtual adversary co-trained to invoke worst-case events and sim-to-real error, and precludes actions that would cause it to fail down the line. We demonstrate the scalability and robustness of the approach with a first-of-its-kind full-order safety filter for (36-D) quadrupedal dynamics. Physical experiments on two different quadruped platforms demonstrate the superior zero-shot effectiveness of the gameplay filter under large perturbations such as tugging and unmodeled terrain.
Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent Codes
Trajectory forecasting is crucial for video surveillance analytics, as it enables the anticipation of future movements for a set of agents, e.g. basketball players engaged in intricate interactions with long-term intentions. Deep generative models offer a natural learning approach for trajectory forecasting, yet they encounter difficulties in achieving an optimal balance between sampling fidelity and diversity. We address this challenge by leveraging Vector Quantized Variational Autoencoders (VQ-VAEs), which utilize a discrete latent space to tackle the issue of posterior collapse. Specifically, we introduce an instance-based codebook that allows tailored latent representations for each example. In a nutshell, the rows of the codebook are dynamically adjusted to reflect contextual information (i.e., past motion patterns extracted from the observed trajectories). In this way, the discretization process gains flexibility, leading to improved reconstructions. Notably, instance-level dynamics are injected into the codebook through low-rank updates, which restrict the customization of the codebook to a lower-dimensional space. The resulting discrete space serves as the basis of the subsequent step, the training of a diffusion-based predictive model. We show that such a two-fold framework, augmented with instance-level discretization, leads to accurate and diverse forecasts, yielding state-of-the-art performance on three established benchmarks.
comment: 15 pages, 3 figures, 5 tables
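The low-rank codebook update described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear maps `w_a` and `w_b` that turn a context vector into the two low-rank factors, and all sizes are assumptions chosen only to show the idea of adding a rank-r correction to a shared codebook before nearest-neighbor quantization.

```python
import numpy as np

def adapt_codebook(codebook, context, w_a, w_b, rank):
    """Return codebook + A @ B, with A and B predicted from the context.

    w_a and w_b are hypothetical linear maps (assumptions for illustration).
    """
    k, d = codebook.shape
    a = (w_a @ context).reshape(k, rank)   # (K, r) factor
    b = (w_b @ context).reshape(rank, d)   # (r, D) factor
    return codebook + a @ b                # the correction has rank <= r

def quantize(z, codebook):
    """Index of the nearest code (squared Euclidean distance) per row of z."""
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
K, D, R, H = 16, 8, 2, 4                   # codes, code dim, rank, context dim
codebook = rng.normal(size=(K, D))
context = rng.normal(size=H)               # stands in for encoded past motion
w_a = rng.normal(size=(K * R, H))
w_b = rng.normal(size=(R * D, H))

adapted = adapt_codebook(codebook, context, w_a, w_b, R)
codes = quantize(rng.normal(size=(5, D)), adapted)
```

Because the correction is a product of a K×r and an r×D matrix, each instance customizes the codebook with only r(K + D) numbers instead of K·D.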
In-Hand Following of Deformable Linear Objects Using Dexterous Fingers with Tactile Sensing IROS 2024
Most research on deformable linear object (DLO) manipulation assumes rigid grasping. However, beyond rigid grasping and re-grasping, in-hand following is also an essential skill that humans use to dexterously manipulate DLOs, which requires continuously changing the grasp point by in-hand sliding while holding the DLO to prevent it from falling. Achieving such a skill is very challenging for robots without using specially designed but not versatile end-effectors. Previous works have attempted using generic parallel grippers, but their robustness is unsatisfactory owing to the conflict between following and holding, which is hard to balance with a one-degree-of-freedom gripper. In this work, inspired by how humans use fingers to follow DLOs, we explore the usage of a generic dexterous hand with tactile sensing to imitate human skills and achieve robust in-hand DLO following. To enable the hardware system to function in the real world, we develop a framework that includes Cartesian-space arm-hand control, tactile-based in-hand 3-D DLO pose estimation, and task-specific motion design. Experimental results demonstrate the significant superiority of our method over using parallel grippers, as well as its great robustness, generalizability, and efficiency.
comment: IROS 2024 Oral. Project website: https://mingrui-yu.github.io/DLO_following/
Asynchronous Spatial-Temporal Allocation for Trajectory Planning of Heterogeneous Multi-Agent Systems
To plan the trajectories of a large-scale heterogeneous swarm, sequentially or synchronously distributed methods usually become intractable due to the lack of global clock synchronization. To this end, we provide a novel asynchronous spatial-temporal allocation method. Specifically, between a pair of agents, the allocation is proposed to determine their corresponding derivable time-stamped space and can be updated in an asynchronous way, by inserting a waiting duration between two consecutive replanning steps. Via theoretical analysis, inter-agent collisions are proved to be avoided and the allocation is shown to ensure timely updates. Comprehensive simulations and comparisons with five baselines validate the effectiveness of the proposed method and illustrate its improvement in completion time and moving distance. Finally, hardware experiments are carried out, where 8 heterogeneous unmanned ground vehicles with onboard computation navigate in cluttered scenarios with high agility.
comment: 8 pages
CafkNet: GNN-Empowered Forward Kinematic Modeling for Cable-Driven Parallel Robots
Cable-driven parallel robots (CDPRs) have gained significant attention due to their promising advantages. When deploying CDPRs in practice, the kinematic modeling is a key question. Unlike serial robots, CDPRs have a simple inverse kinematics problem but a complex forward kinematics (FK) issue. So, the development of accurate and efficient FK solvers has been a prominent research focus in CDPR applications. By observing the topology within CDPRs, in this paper, we propose a graph-based representation to model CDPRs and introduce CafkNet, a fast and general FK solving method, leveraging Graph Neural Network (GNN) to learn the topological structure and yield the real FK solutions with superior generality, high accuracy, and low time cost. CafkNet is extensively tested on 3D and 2D CDPRs in different configurations, both in simulators and real scenarios. The results demonstrate its ability to learn CDPRs' internal topology and accurately solve the FK problem. Then, the zero-shot generalization from one configuration to another is validated. Also, the sim2real gap can be bridged by CafkNet using both simulation and real-world data. To the best of our knowledge, it is the first study that employs the GNN to solve the FK problem for CDPRs.
comment: To the best of our knowledge, it is the first study to employ the GNN for the FK problem of CDPRs. The first two authors have equal contributions. Videos and codes are available at https://sites.google.com/view/cafknet/site
Neuromorphic force-control in an industrial task: validating energy and latency benefits IROS 2024
As robots become smarter and more ubiquitous, optimizing the power consumption of intelligent compute becomes imperative towards ensuring the sustainability of technological advancements. Neuromorphic computing hardware makes use of biologically inspired neural architectures to achieve energy and latency improvements compared to conventional von Neumann computing architecture. Applying these benefits to robots has been demonstrated in several works in the field of neurorobotics, typically on relatively simple control tasks. Here, we introduce an example of neuromorphic computing applied to the real-world industrial task of object insertion. We trained a spiking neural network (SNN) to perform force-torque feedback control using a reinforcement learning approach in simulation. We then ported the SNN to the Intel neuromorphic research chip Loihi interfaced with a KUKA robotic arm. At inference time we show latency competitive with current CPU/GPU architectures, and one order of magnitude less energy usage in comparison to state-of-the-art low-energy edge-hardware. We offer this example as a proof-of-concept implementation of a neuromorphic controller in a real-world robotic setting, highlighting the benefits of neuromorphic hardware for the development of intelligent controllers for robots.
comment: Accepted at IROS 2024
Learning a Shape-Conditioned Agent for Purely Tactile In-Hand Manipulation of Various Objects
Reorienting diverse objects with a multi-fingered hand is a challenging task. Current methods in robotic in-hand manipulation are either object-specific or require permanent supervision of the object state from visual sensors. This is far from human capabilities and from what is needed in real-world applications. In this work, we address this gap by training shape-conditioned agents to reorient diverse objects in hand, relying purely on tactile feedback (via torque and position measurements of the fingers' joints). To achieve this, we propose a learning framework that exploits shape information in a reinforcement learning policy and a learned state estimator. We find that representing 3D shapes by vectors from a fixed set of basis points to the shape's surface, transformed by its predicted 3D pose, is especially helpful for learning dexterous in-hand manipulation. In simulation and real-world experiments, we show the reorientation of many objects with high success rates, on par with state-of-the-art results obtained with specialized single-object agents. Moreover, we show generalization to novel objects, achieving success rates of ~90% even for non-convex shapes.
Toward An Analytic Theory of Intrinsic Robustness for Dexterous Grasping IROS 2024
Conventional approaches to grasp planning require perfect knowledge of an object's pose and geometry. Uncertainties in these quantities induce uncertainties in the quality of planned grasps, which can lead to failure. Classically, grasp robustness refers to the ability to resist external disturbances after grasping an object. In contrast, this work studies robustness to intrinsic sources of uncertainty, like object pose or geometry, that affect grasp planning before execution. To do so, we develop a novel analytic theory of grasping that reasons about this intrinsic robustness by characterizing the effect of friction cone uncertainty on a grasp's force closure status. We apply this result in two ways. First, we analyze the theoretical guarantees on intrinsic robustness of two grasp metrics in the literature, the classical Ferrari-Canny metric and the more recent min-weight metric. We validate these results with hardware trials that compare grasps synthesized with and without robustness guarantees, showing a clear improvement in success rates. Second, we use our theory to develop a novel analytic notion of probabilistic force closure, which we show can generate unique, uncertainty-aware grasps in simulation.
comment: Accepted to IROS 2024
DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection
Recent advances in multi-view camera-only 3D object detection either rely on an accurate reconstruction of bird's-eye-view (BEV) 3D features or on traditional 2D perspective view (PV) image features. While both have their own pros and cons, few have found a way to stitch them together in order to benefit from "the best of both worlds". To this end, we explore a duo space (i.e., BEV and PV) 3D perception framework, in conjunction with some useful duo space fusion strategies that allow effective aggregation of the two feature representations. To the best of our knowledge, our proposed method, DuoSpaceNet, is the first to leverage two distinct feature spaces, and it achieves state-of-the-art 3D object detection and BEV map segmentation results on the nuScenes dataset.
TiCoSS: Tightening the Coupling between Semantic Segmentation and Stereo Matching within A Joint Learning Framework
Semantic segmentation and stereo matching, respectively analogous to the ventral and dorsal streams in our human brain, are two key components of autonomous driving perception systems. Addressing these two tasks with separate networks is no longer the mainstream direction in developing computer vision algorithms, particularly with the recent advances in large vision models and embodied artificial intelligence. The trend is shifting towards combining them within a joint learning framework, especially emphasizing feature sharing between the two tasks. The major contributions of this study lie in comprehensively tightening the coupling between semantic segmentation and stereo matching. Specifically, this study introduces three novelties: (1) a tightly coupled, gated feature fusion strategy, (2) a hierarchical deep supervision strategy, and (3) a coupling tightening loss function. The combined use of these technical contributions results in TiCoSS, a state-of-the-art joint learning framework that simultaneously tackles semantic segmentation and stereo matching. Through extensive experiments on the KITTI and vKITTI2 datasets, along with qualitative and quantitative analyses, we validate the effectiveness of our developed strategies and loss function, and demonstrate its superior performance compared to prior arts, with a notable increase in mIoU by over 9%. Our source code will be publicly available at mias.group/TiCoSS upon publication.
Survey of Simulators for Aerial Robots
Uncrewed Aerial Vehicle (UAV) research faces challenges with safety, scalability, costs, and ecological impact when conducting hardware testing. High-fidelity simulators offer a vital solution by replicating real-world conditions to enable the development and evaluation of novel perception and control algorithms. However, the large number of available simulators poses a significant challenge for researchers to determine which simulator best suits their specific use-case, based on each simulator's limitations and customization readiness. In this paper we present an overview of 44 UAV simulators, including in-depth, systematic comparisons for 14 of the simulators. Additionally, we present a set of decision factors for selection of simulators, aiming to enhance the efficiency and safety of research endeavors.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Multiagent Systems
Iterative Graph Alignment
By compressing diverse narratives, LLMs go beyond memorization, achieving intelligence by capturing generalizable causal relationships. However, they suffer from local 'representation gaps' due to insufficient training data diversity, limiting their real-world utility, especially in tasks requiring strict alignment to rules. Traditional alignment methods relying on heavy human annotations are inefficient and unscalable. Recent self-alignment techniques also fall short, as they often depend on self-selection based prompting and memorization-based learning. To address these issues, we introduce Iterative Graph Alignment (IGA), an annotation-free rule-based alignment algorithm. A teacher model (VLM) employs Iterative Graph Prompting (IGP) to create logical graphs and reference answers. The student model (LLM) identifies local knowledge gaps by attempting to align its responses with these references, collaborating with helper models to generate diverse answers. These aligned responses are then used for iterative supervised fine-tuning (SFT). Our evaluations across five rule-based scenarios demonstrate IGP's effectiveness, with a 73.12% alignment improvement in Claude Sonnet 3.5, and Llama3-8B-Instruct achieving an 86.20% improvement, outperforming Claude Sonnet 3.5 in rule-based alignment.
comment: 12 pages, 4 figures
Consensus Planning with Primal, Dual, and Proximal Agents
Consensus planning is a method for coordinating decision making across complex systems and organizations, including complex supply chain optimization pipelines. It arises when large interdependent distributed agents (systems) share common resources and must act in order to achieve a joint goal. In this paper, we introduce a generic Consensus Planning Protocol (CPP) to solve such problems. Our protocol allows for different agents to interact with the coordinating algorithm in different ways (e.g., as a primal or dual or proximal agent). In prior consensus planning work, all agents have been assumed to have the same interaction pattern (e.g., all dual agents or all primal agents or all proximal agents), most commonly using the Alternating Direction Method of Multipliers (ADMM) as proximal agents. However, this is often not a valid assumption in practice, where agents consist of large complex systems, and where we might not have the luxury of modifying these large complex systems at will. Our generic CPP allows for any mix of agents by combining ADMM-like updates for the proximal agents, dual ascent updates for the dual agents, and linearized ADMM updates for the primal agents. We prove convergence results for the generic CPP, namely a sublinear O(1/k) convergence rate under mild assumptions, and two-step linear convergence under stronger assumptions. We also discuss enhancements to the basic method and provide illustrative empirical results.
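As a point of reference for the ADMM-style updates mentioned in the abstract above, here is a minimal sketch of textbook scaled-form consensus ADMM. It covers only the all-proximal-agent setting that the abstract describes as the common prior assumption, not the mixed primal/dual/proximal protocol the paper proposes. The quadratic agent costs 0.5·(x − a_i)², the function name, and the parameter values are assumptions chosen so that each proximal step has a closed form.

```python
import numpy as np

def consensus_admm(targets, rho=1.0, iters=200):
    """Scaled-form consensus ADMM for agent costs 0.5 * (x - a_i)**2.

    Each agent keeps a local plan x_i; all must agree on a shared plan z.
    """
    a = np.asarray(targets, dtype=float)
    x = np.zeros_like(a)   # local plans, one per agent
    u = np.zeros_like(a)   # scaled dual variables (consensus "prices")
    z = 0.0                # shared consensus plan
    for _ in range(iters):
        # Proximal step of each agent: argmin 0.5(x-a)^2 + rho/2 (x - z + u)^2
        x = (a + rho * (z - u)) / (1.0 + rho)
        # Coordinator averages the agents' proposals
        z = float(np.mean(x + u))
        # Dual update penalizes remaining disagreement
        u = u + x - z
    return z, x

plan, local_plans = consensus_admm([1.0, 4.0, 10.0])
```

For these quadratic costs the consensus plan converges to the mean of the agents' targets, which makes the sketch easy to check by hand.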
Parametrization and convergence of a primal-dual block-coordinate approach to linearly-constrained nonsmooth optimization
This note is concerned with the problem of minimizing a separable, convex, composite (smooth and nonsmooth) function subject to linear constraints. We study a randomized block-coordinate interpretation of the Chambolle-Pock primal-dual algorithm, based on inexact proximal gradient steps. A specificity of the considered algorithm is its robustness, as it converges even in the absence of strong duality or when the linear program is inconsistent. Using matrix preconditioning, we derive tight sublinear convergence rates with and without duality assumptions and for both the convex and the strongly convex settings. Our developments are extensions and particularizations of original algorithms proposed by Malitsky (2019) and Luke and Malitsky (2018). Numerical experiments are provided for an optimal transport problem of service pricing.
comment: Working paper; 21 pages; to be submitted for publication
3D Topological Modeling and Multi-Agent Movement Simulation for Viral Infection Risk Analysis
In this paper, a method to study how the design of indoor spaces and people's movement within them affect disease spread is proposed by integrating computer-aided modeling, multi-agent movement simulation, and airborne viral transmission modeling. Topologicpy spatial design and analysis software is used to model indoor environments, connect spaces, and construct a navigation graph. Pathways for agents, each with unique characteristics such as walking speed, infection status, and activities, are computed using this graph. Agents follow a schedule of events with specific locations and times. The software calculates "time-to-leave" based on walking speed and event start times, and agents are moved along the shortest path within the navigation graph, accurately considering obstacles, doorways, and walls. Precise distance calculations between agents are enabled by this setup. Viral aerosol concentration is then computed and visualized using a reaction-diffusion equation, and each agent's infection risk is determined with an extension of the Wells-Riley ansatz. Infection risk simulations are improved by this spatio-temporal and topological approach, incorporating realistic human behavior and spatial dynamics. The resulting software is designed as a rapid decision-support tool for policymakers, facility managers, stakeholders, architects, and engineers to mitigate disease spread in existing buildings and inform the design of new ones. The software's effectiveness is demonstrated through a comparative analysis of cellular and open commercial office plan layouts.
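For readers unfamiliar with the Wells-Riley model named in the abstract above, a minimal sketch of its classical well-mixed form follows. The paper uses an extension driven by a spatially varying aerosol concentration, which is not reproduced here. The function and parameter names are assumptions; the formula is the standard P = 1 − exp(−I·q·p·t / Q).

```python
import math

def wells_riley_risk(infectors, quanta_rate, breathing_rate,
                     exposure_time, ventilation_rate):
    """Classical Wells-Riley infection probability for a well-mixed room.

    infectors        I: number of infectious occupants
    quanta_rate      q: quanta emitted per infector per hour
    breathing_rate   p: air inhaled by a susceptible person, m^3 per hour
    exposure_time    t: hours spent in the room
    ventilation_rate Q: clean-air supply, m^3 per hour
    """
    if ventilation_rate <= 0:
        raise ValueError("ventilation_rate must be positive")
    dose = infectors * quanta_rate * breathing_rate * exposure_time / ventilation_rate
    return 1.0 - math.exp(-dose)

# Illustrative (made-up) values: one infector, two-hour meeting.
risk = wells_riley_risk(infectors=1, quanta_rate=10.0, breathing_rate=0.5,
                        exposure_time=2.0, ventilation_rate=100.0)
```

Doubling the ventilation rate halves the inhaled dose, which is why the model is commonly used to compare layouts and air-handling options.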
MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale
Multi-agent pathfinding (MAPF) is a challenging computational problem that typically requires finding collision-free paths for multiple agents in a shared environment. Solving MAPF optimally is NP-hard, yet efficient solutions are critical for numerous applications, including automated warehouses and transportation systems. Recently, learning-based approaches to MAPF have gained attention, particularly those leveraging deep reinforcement learning. Following current trends in machine learning, we have created a foundation model for MAPF problems called MAPF-GPT. Using imitation learning, we have trained a policy on a set of pre-collected sub-optimal expert trajectories that can generate actions in conditions of partial observability without additional heuristics, reward functions, or communication with other agents. The resulting MAPF-GPT model demonstrates zero-shot learning abilities when solving MAPF problem instances that were not present in the training dataset. We show that MAPF-GPT notably outperforms the current best-performing learnable-MAPF solvers on a diverse range of problem instances and is efficient in terms of computation (in the inference mode).
Revolutionizing Bridge Operation and maintenance with LLM-based Agents: An Overview of Applications and Insights
In various industrial fields of human social development, people have been exploring methods aimed at freeing human labor. Constructing LLM-based agents is considered to be one of the most effective tools to achieve this goal. Agents, as human-like intelligent entities with the abilities of perception, planning, decision-making, and action, have created great production value in many fields. However, the bridge O&M field shows a relatively low level of intelligence compared to other industries. Nevertheless, the bridge O&M field has developed numerous intelligent inspection devices, machine learning algorithms, and autonomous evaluation and decision-making methods, which provide a feasible basis for breakthroughs in artificial intelligence in this field. The aim of this study is to explore the impact of AI agents based on large language models on the field of bridge O&M and to analyze the potential challenges and opportunities they bring to the core tasks of bridge O&M. Through in-depth research and analysis, this paper expects to provide a more comprehensive perspective for understanding the application of intelligent agents in this field.
PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents ACL 2024
Psychological measurement is essential for mental health, self-understanding, and personal development. Traditional methods, such as self-report scales and psychologist interviews, often face challenges with engagement and accessibility. While game-based and LLM-based tools have been explored to improve user interest and automate assessment, they struggle to balance engagement with generalizability. In this work, we propose PsychoGAT (Psychological Game AgenTs) to achieve a generic gamification of psychological assessment. The main insight is that powerful LLMs can function both as adept psychologists and innovative game designers. By incorporating LLM agents into designated roles and carefully managing their interactions, PsychoGAT can transform any standardized scales into personalized and engaging interactive fiction games. To validate the proposed method, we conduct psychometric evaluations to assess its effectiveness and employ human evaluators to examine the generated content across various psychological constructs, including depression, cognitive distortions, and personality traits. Results demonstrate that PsychoGAT serves as an effective assessment tool, achieving statistically significant excellence in psychometric metrics such as reliability, convergent validity, and discriminant validity. Moreover, human evaluations confirm PsychoGAT's enhancements in content coherence, interactivity, interest, immersion, and satisfaction.
comment: ACL 2024
Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning AAAI-2024
Efficient collaboration in the centralized training with decentralized execution (CTDE) paradigm remains a challenge in cooperative multi-agent systems. We identify divergent action tendencies among agents as a significant obstacle to CTDE's training efficiency, requiring a large number of training samples to achieve a unified consensus on agents' policies. This divergence stems from the lack of adequate team consensus-related guidance signals during credit assignments in CTDE. To address this, we propose Intrinsic Action Tendency Consistency, a novel approach for cooperative multi-agent reinforcement learning. It integrates intrinsic rewards, obtained through an action model, into a reward-additive CTDE (RA-CTDE) framework. We formulate an action model that enables surrounding agents to predict the central agent's action tendency. Leveraging these predictions, we compute a cooperative intrinsic reward that encourages agents to match their actions with their neighbors' predictions. We establish the equivalence between RA-CTDE and CTDE through theoretical analyses, demonstrating that CTDE's training process can be achieved using agents' individual targets. Building on this insight, we introduce a novel method to combine intrinsic rewards and CTDE. Extensive experiments on challenging tasks in SMAC and GRF benchmarks showcase the improved performance of our method.
comment: The AAAI-2024 paper with the appendix
CityLight: A Universal Model for Coordinated Traffic Signal Control in City-scale Heterogeneous Intersections
The increasingly severe congestion problem in modern cities strengthens the significance of developing city-scale traffic signal control (TSC) methods for traffic efficiency enhancement. While reinforcement learning has been widely explored in TSC, most existing methods still target small-scale optimization and cannot directly scale to the city level due to unbearable resource demand. Only a few of them manage to tackle city-level optimization, namely a thousand-scale optimization, by incorporating parameter-sharing mechanisms, but they have hardly tackled the heterogeneity of intersections and the intricate between-intersection interactions inherent in real-world city road networks. To fill the gap, we target two important challenges in adopting parameter-sharing paradigms to solve TSC: (1) inconsistency of inner state representations for intersections heterogeneous in configuration, scale, and orders of available traffic phases; and (2) intricacy of impacts from neighborhood intersections that have various relative traffic relationships due to inconsistent phase orders and diverse relative positioning. Our method, CityLight, features a universal representation module that not only aligns the state representations of intersections by reindexing their phases based on their semantics and designing heterogeneity-preserving observations, but also encodes the narrowed relative traffic relation types to project the neighborhood intersections onto a uniform relative traffic impact space. We further attentively fuse neighborhood representations based on their competing relations and incorporate neighborhood-integrated rewards to boost coordination. Extensive experiments with hundreds to tens of thousands of intersections validate the surprising effectiveness and generalizability of CityLight, with an overall performance gain of 11.68% and a 22.59% improvement in transfer scenarios in throughput.
Systems and Control (CS)
Energy Control of Grid-forming Energy Storage based on Bandwidth Separation Principle
The reduced inertia in power systems introduces more operation risks and challenges to frequency regulation. The existing virtual inertia and frequency support controls are restricted by the normally non-dispatchable energy resources behind the power electronic converters. In this letter, an improved virtual synchronous machine (VSM) control based on energy storage is proposed, considering the limitation of state-of-charge. The steady-state energy consumed by energy storage in inertia, damping and frequency services is investigated. Based on the bandwidth separation principle, an energy recovery control is designed to restore the energy consumed, thereby ensuring a constant energy reserve. The effectiveness of the proposed control and design is verified by comprehensive simulation results.
RoboMNIST: A Multimodal Dataset for Multi-Robot Activity Recognition Using WiFi Sensing, Video, and Audio
We introduce a novel dataset for multi-robot activity recognition (MRAR) using two robotic arms integrating WiFi channel state information (CSI), video, and audio data. This multimodal dataset utilizes signals of opportunity, leveraging existing WiFi infrastructure to provide detailed indoor environmental sensing without additional sensor deployment. Data were collected using two Franka Emika robotic arms, complemented by three cameras, three WiFi sniffers to collect CSI, and three microphones capturing distinct yet complementary audio data streams. The combination of CSI, visual, and auditory data can enhance robustness and accuracy in MRAR. This comprehensive dataset enables a holistic understanding of robotic environments, facilitating advanced autonomous operations that mimic human-like perception and interaction. By repurposing ubiquitous WiFi signals for environmental sensing, this dataset offers significant potential aiming to advance robotic perception and autonomous systems. It provides a valuable resource for developing sophisticated decision-making and adaptive capabilities in dynamic environments.
Multi-layer optimisation of hybrid energy storage systems for electric vehicles
This research presents a multi-layer optimization framework for hybrid energy storage systems (HESS) for passenger electric vehicles to increase the battery system's performance by combining multiple cell chemistries. Specifically, we devise a battery model capturing voltage dynamics, temperature and lifetime degradation solely using data from manufacturer datasheets, and jointly optimize the capacity distribution between the two batteries and the power split, for a given drive cycle and HESS topology. The results show that the lowest energy consumption is obtained with a hybrid solution consisting of a NCA-NMC combination, since this provides the best trade-off between efficiency and added weight.
System-level thermal and electrical modeling of battery systems for electric aircraft design
This work introduces a framework for simulating the electrical power consumption of an 8-seater electric aircraft equipped with high-energy-density NMC Lithium-ion cells. We propose an equivalent circuit model (ECM) to capture the thermal and electrical battery behavior. Furthermore, we assess the need for a battery thermal management system (BTMS) by determining heat generation at the cell level and optimize BTMS design to minimize energy consumption over a predefined flight regime. The proposed baseline battery design includes a 304-kWh battery system with BTMS, ensuring failure redundancy through two parallel switched battery banks. Simulation results explore the theoretical flight range without BTMS and reveal advantages in increasing battery capacity under specific conditions. Optimization efforts focus on BTMS design, highlighting the superior performance of water cooling over air cooling. However, the addition of a 9.9 kW water-cooled BTMS results in a 16.5% weight increase (387 kg) compared to no BTMS, reducing the simulated range of the aircraft from 480 km to 410 km. Lastly, we address a heating-induced thermal runaway scenario, demonstrating the robustness of the proposed battery design in preventing thermal runaway.
Efficient Multi-agent Navigation with Lightweight DRL Policy
In this article, we present an end-to-end collision avoidance policy based on deep reinforcement learning (DRL) for multi-agent systems, demonstrating encouraging outcomes in real-world applications. In particular, our policy calculates the control commands of the agent based on the raw LiDAR observation. In addition, the number of parameters of the proposed basic model is 140,000, and the size of the parameter file is 3.5 MB, which allows the robot to calculate the actions from the CPU alone. We propose a multi-agent training platform based on a physics-based simulator to further bridge the gap between simulation and the real world. The policy is trained on a policy-gradients-based RL algorithm in a dense and messy training environment. A novel reward function is introduced to address the issue of agents choosing suboptimal actions in some common scenarios. Although the data used for training is exclusively from the simulation platform, the policy can be successfully transferred and deployed in real-world robots. Finally, our policy effectively responds to intentional obstructions and avoids collisions. The website is available at https://sites.google.com/view/xingrong2024efficient/%E9%A6%96%E9%A1%B5.
Deep DeePC: Data-enabled predictive control with low or no online optimization using deep learning
Data-enabled predictive control (DeePC) is a data-driven control algorithm that utilizes data matrices to form a non-parametric representation of the underlying system, predicting future behaviors and generating optimal control actions. DeePC typically requires solving an online optimization problem, the complexity of which is heavily influenced by the amount of data used, potentially leading to expensive online computation. In this paper, we leverage deep learning to propose a highly computationally efficient DeePC approach for general nonlinear processes, referred to as Deep DeePC. Specifically, a deep neural network is employed to learn the DeePC vector operator, which is an essential component of the non-parametric representation of DeePC. This neural network is trained offline using historical open-loop input and output data of the nonlinear process. With the trained neural network, the Deep DeePC framework is formed for online control implementation. At each sampling instant, this neural network directly outputs the DeePC operator, eliminating the online optimization required by conventional DeePC. The optimal control action is obtained based on the DeePC operator updated by the trained neural network. To address constrained scenarios, a constraint handling scheme is further proposed and integrated with Deep DeePC to handle hard constraints during online implementation. The efficacy and superiority of the proposed Deep DeePC approach are demonstrated using two benchmark process examples.
comment: 34 pages, 7 figures
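As background for the non-parametric representation mentioned in the abstract above: DeePC-style methods commonly stack recorded data into Hankel matrices and describe candidate trajectories as weighted combinations of the columns. The following is a minimal sketch for a scalar signal only. The helper names `hankel` and `combine`, the sample data, and the weight vector are hypothetical and are not taken from the paper; reading the weight vector as the abstract's "DeePC operator" is also an assumption.

```python
def hankel(data, depth):
    """Return the depth-row Hankel matrix of a scalar sequence as a list of rows.

    Column j holds the length-`depth` window data[j : j + depth].
    """
    cols = len(data) - depth + 1
    if depth < 1 or cols < 1:
        raise ValueError("depth must be between 1 and len(data)")
    return [list(data[i:i + cols]) for i in range(depth)]


def combine(matrix, g):
    """Compute matrix @ g, a weighted combination of the recorded windows."""
    return [sum(h * w for h, w in zip(row, g)) for row in matrix]


u = [1.0, 2.0, 3.0, 4.0, 5.0]               # hypothetical recorded samples
H = hankel(u, depth=2)                      # two rows, four columns
window = combine(H, [0.5, 0.0, 0.0, 0.5])   # mix of the first and last window
```

In conventional DeePC the weight vector is found by an online optimization at every sampling instant; the abstract describes replacing that step with a neural network evaluated online.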
Adversarial Network Optimization under Bandit Feedback: Maximizing Utility in Non-Stationary Multi-Hop Networks
Stochastic Network Optimization (SNO) concerns scheduling in stochastic queueing systems. It has been widely studied in network theory. Classical SNO algorithms require network conditions to be stationary with time, which fails to capture the non-stationary components in many real-world scenarios. Many existing algorithms also assume knowledge of network conditions before decision, which rules out applications where conditions are unpredictable. Motivated by these issues, we consider Adversarial Network Optimization (ANO) under bandit feedback. Specifically, we consider the task of *i)* maximizing some unknown and time-varying utility function associated with the scheduler's actions, where *ii)* the underlying network is a non-stationary multi-hop one whose conditions change arbitrarily with time, and *iii)* only bandit feedback (effect of actually deployed actions) is revealed after decisions. Our proposed `UMO2` algorithm ensures network stability and also matches the utility maximization performance of any "mildly varying" reference policy up to a polynomially decaying gap. To our knowledge, no previous ANO algorithm handled multi-hop networks or achieved utility guarantees under bandit feedback, whereas ours can do both. Technically, our method builds upon a novel integration of online learning into Lyapunov analyses: To handle complex inter-dependencies among queues in multi-hop networks, we propose meticulous techniques to balance online learning and Lyapunov arguments. To tackle the learning obstacles due to potentially unbounded queue sizes, we design a new online linear optimization algorithm that automatically adapts to loss magnitudes. To maximize utility, we propose a bandit convex optimization algorithm with novel queue-dependent learning rate scheduling that suits drastically varying queue lengths. Our new insights in online learning can be of independent interest.
Economic Optimal Power Management of Second-Life Battery Energy Storage Systems
Second-life battery energy storage systems (SL-BESS) are an economical means of long-duration grid energy storage. They utilize retired battery packs from electric vehicles to store and provide electrical energy at the utility scale. However, they pose critical challenges in achieving optimal utilization and extending their remaining useful life. These complications primarily result from the constituent battery packs' inherent heterogeneities in terms of their size, chemistry, and degradation. This paper proposes an economic optimal power management approach to ensure the cost-minimized operation of SL-BESS while adhering to safety regulations and maintaining a balance between the power supply and demand. The proposed approach takes into account the costs associated with the degradation, energy loss, and decommissioning of the battery packs. In particular, we capture the degradation costs of the retired battery packs through a weighted average Ah-throughput aging model. The presented model allows us to quantify the capacity fading for second-life battery packs for different operating temperatures and C-rates. To evaluate the performance of the proposed approach, we conduct extensive simulations on a SL-BESS consisting of various heterogeneous retired battery packs in the context of grid operation. The results offer novel insights into SL-BESS operation and highlight the importance of prudent power management to ensure economically optimal utilization.
Internet of Things Networks: Enabling Simultaneous Wireless Information and Power Transfer
The number of sensors deployed in the world is expected to explode in the near future. At this moment, nearly 30 billion Internet of Things (IoT) devices are connected and this number is expected to double in the next four years. While not all of these are battery powered, as technology becomes smaller and mobility becomes more important to consumers, soon a larger portion will be. This forecast predicts that the number of machine to machine (M2M) devices will have the largest increase, representing nearly 50% of all devices in 2023.
Analyzing Errors in Controlled Turret System Given Target Location Input from Artificial Intelligence Methods in Automatic Target Recognition
In this paper, we assess the movement error of a targeting system given target location data from artificial intelligence (AI) methods in automatic target recognition (ATR) systems. Few studies evaluate the impacts on the accuracy in moving a targeting system to an aimpoint provided in this manner. To address this knowledge gap, we assess the performance of a controlled gun turret system given target location from an object detector developed from AI methods. In our assessment, we define a measure of object detector error and examine the correlations with several standard metrics in object detection. We then statistically analyze the object detector error data and turret movement error data acquired from controlled targeting simulations, as well as their aggregate error, to examine the impact on turret movement accuracy. Finally, we study the correlations between additional metrics and the probability of a hit. The results indicate that AI technologies are a significant source of error to targeting systems. Moreover, the results suggest that metrics such as the confidence score, intersection-over-union, average precision and average recall are predictors of accuracy against stationary targets with our system parameters.
comment: 30 pages, 21 figures
On Fixed-Time Stability for a Class of Singularly Perturbed Systems using Composite Lyapunov Functions
Fixed-time stable dynamical systems are capable of achieving exact convergence to an equilibrium point within a fixed time that is independent of the initial conditions of the system. This property makes them highly appealing for designing control, estimation, and optimization algorithms in applications with stringent performance requirements. However, the set of tools available for analyzing the interconnection of fixed-time stable systems is rather limited compared to their asymptotic counterparts. In this paper, we address some of these limitations by exploiting the emergence of multiple time scales in nonlinear singularly perturbed dynamical systems, where the fast dynamics and the slow dynamics are fixed-time stable on their own. By extending the so-called composite Lyapunov method from asymptotic stability to the context of fixed-time stability, we provide a novel class of Lyapunov-based sufficient conditions to certify fixed-time stability in a class of singularly perturbed dynamical systems. The results are illustrated, analytically and numerically, using a fixed-time gradient flow system interconnected with a fixed-time plant and an additional high-order example.
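To make the fixed-time property in the abstract above concrete, here is a minimal sketch on a textbook scalar system, dx/dt = -sign(x)(|x|^0.5 + |x|^1.5). This system is an illustrative assumption and is not the singularly perturbed setting of the paper. Substituting y = sqrt(|x|) gives d(arctan y)/dt = -1/2, so the settling time is 2·arctan(sqrt(|x0|)), which stays below π for every initial condition. The function name `settling_time` and the step size are hypothetical choices.

```python
import math


def settling_time(x0, dt=1e-4, tol=1e-9, t_max=10.0):
    """Forward-Euler estimate of when dx/dt = -sign(x)(|x|^0.5 + |x|^1.5) reaches 0."""
    x, t = float(x0), 0.0
    while abs(x) > tol and t < t_max:
        step = -math.copysign(abs(x) ** 0.5 + abs(x) ** 1.5, x) * dt
        # Clamp at the origin so a discrete step cannot overshoot the equilibrium.
        x = 0.0 if abs(step) >= abs(x) else x + step
        t += dt
    return t


# Initial conditions four orders of magnitude apart settle within the same bound.
times = {x0: settling_time(x0) for x0 in (1.0, 100.0, 10000.0)}
```

The simulated times can be compared against the closed-form value 2·arctan(sqrt(x0)); the explicit Euler scheme is only adequate here because the step size is small relative to the initial conditions used.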
Mitigating Polarization in Recommender Systems via Network-aware Feedback Optimization
We consider a recommender system that takes into account the interaction between recommendations and the evolution of user interests. Users' opinions are influenced by both social interactions and recommended content. We leverage online feedback optimization to design a recommender system that trades off maximizing engagement against minimizing polarization. The recommender system is agnostic about users' opinions, clicking behavior, and social interactions, and relies solely on clicks. We establish optimality and closed-loop stability of the resulting feedback interconnection between the social platform and the recommender system. We numerically validate our algorithm when the user population follows an extended Friedkin-Johnsen model. We observe that network-aware recommendations significantly reduce polarization without compromising user engagement.
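For orientation, the standard Friedkin-Johnsen opinion update, on which the extended model named in the abstract above presumably builds, is x(k+1) = Λ W x(k) + (I − Λ) x(0), where W is a row-stochastic influence matrix and Λ is a diagonal matrix of susceptibilities. The sketch below implements only this standard form. The recommender input of the paper's extended model is not included, and the function name and numbers are hypothetical.

```python
def friedkin_johnsen(W, susceptibility, x0, steps=100):
    """Iterate x(k+1) = Lambda W x(k) + (I - Lambda) x(0) and return the final opinions."""
    x = list(x0)
    for _ in range(steps):
        # Social influence: each user averages neighbours' opinions with weights W.
        avg = [sum(w * xj for w, xj in zip(row, x)) for row in W]
        # Blend the social average with the user's initial opinion (prejudice).
        x = [lam * a + (1.0 - lam) * p
             for lam, a, p in zip(susceptibility, avg, x0)]
    return x


W = [[0.5, 0.5], [0.5, 0.5]]                        # two users, equal influence
opinions = friedkin_johnsen(W, [0.5, 0.5], [1.0, -1.0])
```

With these numbers the two users start at opposite opinions and settle halfway toward the common average, which illustrates how attachment to initial opinions sustains disagreement.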
Analyzing Errors in Controlled Turret System
The purpose of this paper is to characterize aiming errors in controlled weapon systems given target location as input. To achieve this objective, we analyze the accuracy of a controlled weapon system model for stationary and moving targets under different error sources and firing times. First, we develop a mathematical model of a gun turret and use it to design two controllers, a Proportional-Integral-Derivative controller and a Model Predictive controller, which accept the target location input and move the turret to the centroid of the target in simulations. For stationary targets, we analyze the impact of errors in estimating the system's parameters and uncertainty in the aim point measurement. Our results indicate that turret movement is more sensitive to errors in the moment of inertia than the damping coefficient, which could lead to incorrect simulations of controlled turret system accuracy. The results also support the hypothesis that turret movement errors are larger over longer distances of gun turret movement and, assuming no time constraints, accuracy improves the longer one waits to fire; though this may not always be practical in a combat scenario. Additionally, we demonstrate that the integral control component is needed for high accuracy in moving target scenarios.
comment: 29 pages, 15 figures
Asymptotically Stable Data-Driven Koopman Operator Approximation with Inputs using Total Extended DMD
The Koopman operator framework can be used to identify a data-driven model of a nonlinear system. Unfortunately, when the data is corrupted by noise, the identified model can be biased. Additionally, depending on the choice of lifting functions, the identified model can be unstable, even when the underlying system is asymptotically stable. This paper presents an approach to reduce the bias in an approximate Koopman model, and simultaneously ensure asymptotic stability, when using noisy data. Additionally, the proposed data-driven modeling approach is applicable to systems with inputs, such as a known forcing function or a control input. Specifically, bias is reduced by using a total least-squares formulation, modified to accommodate inputs in addition to lifted inputs. To enforce asymptotic stability of the approximate Koopman model, linear matrix inequality constraints are added to the identification problem. The performance of the proposed method is then compared to the well-known extended dynamic mode decomposition method and to the newly introduced forward-backward extended dynamic mode decomposition method using a simulated Duffing oscillator dataset and an experimental soft robot arm dataset.
comment: 18 pages, 6 figures
Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations
This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, this paper focuses on developing an efficient online search strategy which jointly estimates channels, guides UAV positioning, and optimizes resource allocation. Analytically exploiting the geometric properties of the equipotential surface, this paper develops an LOS discovery trajectory on the equipotential surface while the closed-form search directions are determined using perturbation theory. Since the explicit expression of the equipotential surface is not available, this paper proposes to locally construct a channel model for each user in the LOS regime utilizing polynomial regression without depending on user locations or propagation distance. A class of spiral trajectories to simultaneously construct the LOS channels and search on the equipotential surface is developed. An optimal radius of the spiral and an optimal measurement pattern for channel gain estimation are derived to minimize the mean squared error (MSE) of the locally constructed channel. Numerical results on real 3D city maps demonstrate that the proposed scheme achieves over 94% of the performance of a 3D exhaustive search scheme with just a 3-kilometer search.
Quantifying and Optimizing the Time-Coupled Flexibilities at the Distribution-Level for TSO-DSO Coordination
The flexibilities provided by the distributed energy resources (DERs) in distribution systems enable the coordination of transmission system operator (TSO) and distribution system operators (DSOs). At the distribution level, the flexibilities should be optimized for participation in the transmission system operation. This paper first proposes a flexibility quantification method that quantifies the costs of providing flexibilities and their values to the DSO in the TSO-DSO coordination. Compared with traditional power-range-based quantification approaches that are mainly suitable for generators, the proposed method can directly capture the time-coupling characteristics of DERs' individual and aggregated flexibility regions. Based on the quantification method, we further propose a DSO optimization model to activate the flexibilities from DER aggregators in the distribution system for energy arbitrage and ancillary services provision in the transmission system, along with a revenue allocation strategy that ensures a non-profit DSO. Numerical tests on the IEEE test system verify the proposed methods.
Data-driven AC Optimal Power Flow with Physics-informed Learning and Calibrations
The modern power grid is witnessing a shift in operations from traditional control methods to more advanced operational mechanisms. Due to the nonconvex nature of the Alternating Current Optimal Power Flow (ACOPF) problem and the need for operations with better granularity in the modern smart grid, system operators require a more efficient and reliable ACOPF solver. While data-driven ACOPF methods excel in directly inferring the optimal solution based on power grid demand, achieving both feasibility and optimality remains a challenge due to the NP-hardness of the problem. In this paper, we propose a physics-informed machine learning model and a feasibility calibration algorithm to produce solutions for the ACOPF problem. Notably, the machine learning model produces solutions with a 0.5% and 1.4% optimality gap for IEEE bus 14 and 118 grids, respectively. The feasibility correction algorithm converges for all test scenarios on bus 14 and achieves a 92.2% convergence rate on bus 118.
comment: 6 pages, 3 figures, 1 algorithm and 2 tables. Submitted to SmartGridComm2024
Modeling and Predictive Control for the Treatment of Hyperthyroidism
In this work, we propose an approach to determine the dosages of antithyroid agents to treat hyperthyroid patients. Instead of relying on a trial-and-error approach, as is commonly done in clinical practice, we suggest determining the dosages by means of a model predictive control (MPC) scheme. To this end, we first extend a mathematical model of the pituitary-thyroid feedback loop such that the intake of methimazole, a common antithyroid agent, can be considered. Second, based on the extended model, we develop an MPC scheme to determine suitable dosages. In numerical simulations, we consider scenarios in which (i) patients are affected by Graves' disease and take the medication orally and (ii) patients suffer from a life-threatening thyrotoxicosis, in which the medication is usually given intravenously. Our conceptual study suggests that determining the medication dosages by means of an MPC scheme could be a promising alternative to the currently applied trial-and-error approach.
comment: 6 pages
UltimateKalman: Flexible Kalman Filtering and Smoothing Using Orthogonal Transformations
UltimateKalman is a flexible linear Kalman filter and smoother implemented in three popular programming languages: MATLAB, C, and Java. UltimateKalman is a slight simplification and slight generalization of an elegant Kalman filter and smoother that was proposed in 1977 by Paige and Saunders. Their algorithm appears to be numerically superior and more flexible than other Kalman filters and smoothers, but curiously has never been implemented or used before. UltimateKalman is flexible: it can easily handle time-dependent problems, problems with state vectors whose dimensions vary from step to step, problems with varying number of observations in different steps (or no observations at all in some steps), and problems in which the expectation of the initial state is unknown. The programming interface of UltimateKalman is broken into simple building blocks that can be used to construct filters, single or multi-step predictors, multi-step or whole-track smoothers, and combinations. The paper describes the algorithm and its implementation, as well as a suite of examples and tests.
CityLight: A Universal Model for Coordinated Traffic Signal Control in City-scale Heterogeneous Intersections
The increasingly severe congestion problem in modern cities strengthens the significance of developing city-scale traffic signal control (TSC) methods for traffic efficiency enhancement. While reinforcement learning has been widely explored in TSC, most existing methods still target small-scale optimization and cannot directly scale to the city level due to unbearable resource demand. Only a few of them manage to tackle city-level optimization, namely a thousand-scale optimization, by incorporating parameter-sharing mechanisms, but they have hardly tackled the heterogeneity of intersections and the intricate between-intersection interactions inherent in real-world city road networks. To fill in the gap, we target two important challenges in adopting parameter-sharing paradigms to solve TSC: inconsistency of inner state representations for intersections heterogeneous in configuration, scale, and orders of available traffic phases; and intricacy of impacts from neighborhood intersections that have various relative traffic relationships due to inconsistent phase orders and diverse relative positioning. Our method, CityLight, features a universal representation module that not only aligns the state representations of intersections by reindexing their phases based on their semantics and designing heterogeneity-preserving observations, but also encodes the narrowed relative traffic relation types to project the neighborhood intersections onto a uniform relative traffic impact space. We further attentively fuse neighborhood representations based on their competing relations and incorporate neighborhood-integrated rewards to boost coordination. Extensive experiments with hundreds to tens of thousands of intersections validate the surprising effectiveness and generalizability of CityLight, with an overall performance gain of 11.68% and a 22.59% improvement in transfer scenarios in throughput.
GPU-Accelerated DCOPF using Gradient-Based Optimization
DC Optimal Power Flow (DCOPF) is a key operational tool for power system operators, and it is embedded as a subproblem in many challenging optimization problems (e.g., line switching). However, traditional CPU-based solve routines (e.g., simplex) have saturated in speed and are hard to parallelize. This paper focuses on solving DCOPF problems using gradient-based routines on Graphics Processing Units (GPUs), which have massive parallelization capability. To formulate these problems, we pose a Lagrange dual associated with DCOPF (linear and quadratic cost curves), and then we explicitly solve the inner (primal) minimization problem with a dual norm. The resulting dual problem can be efficiently iterated using projected gradient ascent. After solving the dual problem on both CPUs and GPUs to find tight lower bounds, we benchmark against Gurobi and MOSEK, comparing convergence speed and tightness on the IEEE 2000 and 10000 bus systems. We provide reliable and tight lower bounds for these problems with, at best, 5.4x speedup over a conventional solver.
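The dual iteration described in the abstract above can be illustrated on a much smaller problem. The following is a minimal sketch, assuming a single-bus dispatch with quadratic costs, box limits on generation, and one inequality constraint (total generation must cover demand). It omits the network constraints and the dual-norm treatment of the paper, and all names and numbers are hypothetical. The inner minimization over generation has a closed form, and the multiplier is updated by projected gradient ascent onto the nonnegative reals.

```python
def clip(value, lower, upper):
    """Project a scalar onto the interval [lower, upper]."""
    return max(lower, min(upper, value))


def dual_ascent(a, b, p_max, demand, step=0.05, iters=200):
    """Maximize the dual of: min sum(a_i/2 p_i^2 + b_i p_i) s.t. sum(p) >= demand, 0 <= p <= p_max."""
    mu = 0.0
    p = [0.0] * len(a)
    for _ in range(iters):
        # Inner (primal) minimization of the Lagrangian, solved in closed form.
        p = [clip((mu - bi) / ai, 0.0, pm) for ai, bi, pm in zip(a, b, p_max)]
        # The dual gradient is the constraint violation; project mu onto mu >= 0.
        mu = max(0.0, mu + step * (demand - sum(p)))
    return mu, p


# Two hypothetical generators serving 90 units of demand.
mu, p = dual_ascent(a=[0.1, 0.2], b=[10.0, 12.0], p_max=[100.0, 100.0], demand=90.0)
```

The step size here is chosen below 2 divided by the sum of 1/a_i, which bounds the curvature of this particular dual function; each update uses only elementwise operations and a sum, which is the kind of structure that parallelizes well.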
Systems and Control (EESS)
Energy Control of Grid-forming Energy Storage based on Bandwidth Separation Principle
Reduced inertia in power systems introduces more operational risks and challenges to frequency regulation. Existing virtual inertia and frequency support control schemes are restricted by the normally non-dispatchable energy resources behind the power electronic converters. In this letter, an improved virtual synchronous machine (VSM) control based on energy storage is proposed, considering the limitation of state-of-charge. The steady-state energy consumed by energy storage in inertia, damping and frequency services is investigated. Based on the bandwidth separation principle, an energy recovery control is designed to restore the energy consumed, thereby ensuring constant energy reserve. The effectiveness of the proposed control and design is verified by comprehensive simulation results.
RoboMNIST: A Multimodal Dataset for Multi-Robot Activity Recognition Using WiFi Sensing, Video, and Audio
We introduce a novel dataset for multi-robot activity recognition (MRAR) using two robotic arms, integrating WiFi channel state information (CSI), video, and audio data. This multimodal dataset utilizes signals of opportunity, leveraging existing WiFi infrastructure to provide detailed indoor environmental sensing without additional sensor deployment. Data were collected using two Franka Emika robotic arms, complemented by three cameras, three WiFi sniffers to collect CSI, and three microphones capturing distinct yet complementary audio data streams. The combination of CSI, visual, and auditory data can enhance robustness and accuracy in MRAR. This comprehensive dataset enables a holistic understanding of robotic environments, facilitating advanced autonomous operations that mimic human-like perception and interaction. By repurposing ubiquitous WiFi signals for environmental sensing, this dataset offers significant potential to advance robotic perception and autonomous systems. It provides a valuable resource for developing sophisticated decision-making and adaptive capabilities in dynamic environments.
Multi-layer optimisation of hybrid energy storage systems for electric vehicles
This research presents a multi-layer optimization framework for hybrid energy storage systems (HESS) for passenger electric vehicles to increase the battery system's performance by combining multiple cell chemistries. Specifically, we devise a battery model capturing voltage dynamics, temperature and lifetime degradation solely using data from manufacturer datasheets, and jointly optimize the capacity distribution between the two batteries and the power split, for a given drive cycle and HESS topology. The results show that the lowest energy consumption is obtained with a hybrid solution consisting of an NCA-NMC combination, since this provides the best trade-off between efficiency and added weight.
System-level thermal and electrical modeling of battery systems for electric aircraft design
This work introduces a framework for simulating the electrical power consumption of an 8-seater electric aircraft equipped with high-energy-density NMC Lithium-ion cells. We propose an equivalent circuit model (ECM) to capture the thermal and electrical battery behavior. Furthermore, we assess the need for a battery thermal management system (BTMS) by determining heat generation at the cell level and optimize BTMS design to minimize energy consumption over a predefined flight regime. The proposed baseline battery design includes a 304-kWh battery system with BTMS, ensuring failure redundancy through two parallel switched battery banks. Simulation results explore the theoretical flight range without BTMS and reveal advantages in increasing battery capacity under specific conditions. Optimization efforts focus on BTMS design, highlighting the superior performance of water cooling over air cooling. However, the addition of a 9.9 kW water-cooled BTMS results in a 16.5% weight increase (387 kg) compared to no BTMS, reducing the simulated range of the aircraft from 480 km to 410 km. Lastly, we address a heating-induced thermal runaway scenario, demonstrating the robustness of the proposed battery design in preventing thermal runaway.
Efficient Multi-agent Navigation with Lightweight DRL Policy
In this article, we present an end-to-end collision avoidance policy based on deep reinforcement learning (DRL) for multi-agent systems, demonstrating encouraging outcomes in real-world applications. In particular, our policy calculates the control commands of the agent based on the raw LiDAR observation. In addition, the proposed basic model has 140,000 parameters and a parameter file of 3.5 MB, which allows the robot to calculate the actions on the CPU alone. We propose a multi-agent training platform based on a physics-based simulator to further bridge the gap between simulation and the real world. The policy is trained with a policy-gradient-based RL algorithm in a dense and messy training environment. A novel reward function is introduced to address the issue of agents choosing suboptimal actions in some common scenarios. Although the training data comes exclusively from the simulation platform, the policy can be successfully transferred to and deployed on real-world robots. Finally, our policy effectively responds to intentional obstructions and avoids collisions. The website is available at https://sites.google.com/view/xingrong2024efficient/%E9%A6%96%E9%A1%B5
Deep DeePC: Data-enabled predictive control with low or no online optimization using deep learning
Data-enabled predictive control (DeePC) is a data-driven control algorithm that utilizes data matrices to form a non-parametric representation of the underlying system, predicting future behaviors and generating optimal control actions. DeePC typically requires solving an online optimization problem, the complexity of which is heavily influenced by the amount of data used, potentially leading to expensive online computation. In this paper, we leverage deep learning to propose a highly computationally efficient DeePC approach for general nonlinear processes, referred to as Deep DeePC. Specifically, a deep neural network is employed to learn the DeePC vector operator, which is an essential component of the non-parametric representation of DeePC. This neural network is trained offline using historical open-loop input and output data of the nonlinear process. With the trained neural network, the Deep DeePC framework is formed for online control implementation. At each sampling instant, this neural network directly outputs the DeePC operator, eliminating the online optimization required by conventional DeePC. The optimal control action is obtained based on the DeePC operator updated by the trained neural network. To address constrained scenarios, a constraint handling scheme is further proposed and integrated with Deep DeePC to handle hard constraints during online implementation. The efficacy and superiority of the proposed Deep DeePC approach are demonstrated using two benchmark process examples.
comment: 34 pages, 7 figures
Adversarial Network Optimization under Bandit Feedback: Maximizing Utility in Non-Stationary Multi-Hop Networks
Stochastic Network Optimization (SNO) concerns scheduling in stochastic queueing systems. It has been widely studied in network theory. Classical SNO algorithms require network conditions to be stationary with time, which fails to capture the non-stationary components in many real-world scenarios. Many existing algorithms also assume knowledge of network conditions before decision, which rules out applications where conditions are unpredictable. Motivated by these issues, we consider Adversarial Network Optimization (ANO) under bandit feedback. Specifically, we consider the task of *i)* maximizing some unknown and time-varying utility function associated with the scheduler's actions, where *ii)* the underlying network is a non-stationary multi-hop one whose conditions change arbitrarily with time, and *iii)* only bandit feedback (effect of actually deployed actions) is revealed after decisions. Our proposed `UMO2` algorithm ensures network stability and also matches the utility maximization performance of any "mildly varying" reference policy up to a polynomially decaying gap. To our knowledge, no previous ANO algorithm handled multi-hop networks or achieved utility guarantees under bandit feedback, whereas ours can do both. Technically, our method builds upon a novel integration of online learning into Lyapunov analyses: To handle complex inter-dependencies among queues in multi-hop networks, we propose meticulous techniques to balance online learning and Lyapunov arguments. To tackle the learning obstacles due to potentially unbounded queue sizes, we design a new online linear optimization algorithm that automatically adapts to loss magnitudes. To maximize utility, we propose a bandit convex optimization algorithm with novel queue-dependent learning rate scheduling that suits drastically varying queue lengths. Our new insights in online learning can be of independent interest.
Economic Optimal Power Management of Second-Life Battery Energy Storage Systems
Second-life battery energy storage systems (SL-BESS) are an economical means of long-duration grid energy storage. They utilize retired battery packs from electric vehicles to store and provide electrical energy at the utility scale. However, they pose critical challenges in achieving optimal utilization and extending their remaining useful life. These complications primarily result from the constituent battery packs' inherent heterogeneities in terms of their size, chemistry, and degradation. This paper proposes an economic optimal power management approach to ensure the cost-minimized operation of SL-BESS while adhering to safety regulations and maintaining a balance between the power supply and demand. The proposed approach takes into account the costs associated with the degradation, energy loss, and decommissioning of the battery packs. In particular, we capture the degradation costs of the retired battery packs through a weighted average Ah-throughput aging model. The presented model allows us to quantify the capacity fading for second-life battery packs for different operating temperatures and C-rates. To evaluate the performance of the proposed approach, we conduct extensive simulations on a SL-BESS consisting of various heterogeneous retired battery packs in the context of grid operation. The results offer novel insights into SL-BESS operation and highlight the importance of prudent power management to ensure economically optimal utilization.
Internet of Things Networks: Enabling Simultaneous Wireless Information and Power Transfer
The number of sensors deployed in the world is expected to explode in the near future. At this moment, nearly 30 billion Internet of Things (IoT) devices are connected and this number is expected to double in the next four years. While not all of these are battery powered, as technology becomes smaller and mobility becomes more important to consumers, soon a larger portion will be. This forecast predicts that the number of machine to machine (M2M) devices will have the largest increase, representing nearly 50% of all devices in 2023.
Analyzing Errors in Controlled Turret System Given Target Location Input from Artificial Intelligence Methods in Automatic Target Recognition
In this paper, we assess the movement error of a targeting system given target location data from artificial intelligence (AI) methods in automatic target recognition (ATR) systems. Few studies evaluate the impacts on the accuracy in moving a targeting system to an aimpoint provided in this manner. To address this knowledge gap, we assess the performance of a controlled gun turret system given target location from an object detector developed from AI methods. In our assessment, we define a measure of object detector error and examine the correlations with several standard metrics in object detection. We then statistically analyze the object detector error data and turret movement error data acquired from controlled targeting simulations, as well as their aggregate error, to examine the impact on turret movement accuracy. Finally, we study the correlations between additional metrics and the probability of a hit. The results indicate that AI technologies are a significant source of error to targeting systems. Moreover, the results suggest that metrics such as the confidence score, intersection-over-union, average precision and average recall are predictors of accuracy against stationary targets with our system parameters.
comment: 30 pages, 21 figures
On Fixed-Time Stability for a Class of Singularly Perturbed Systems using Composite Lyapunov Functions
Fixed-time stable dynamical systems are capable of achieving exact convergence to an equilibrium point within a fixed time that is independent of the initial conditions of the system. This property makes them highly appealing for designing control, estimation, and optimization algorithms in applications with stringent performance requirements. However, the set of tools available for analyzing the interconnection of fixed-time stable systems is rather limited compared to their asymptotic counterparts. In this paper, we address some of these limitations by exploiting the emergence of multiple time scales in nonlinear singularly perturbed dynamical systems, where the fast dynamics and the slow dynamics are fixed-time stable on their own. By extending the so-called composite Lyapunov method from asymptotic stability to the context of fixed-time stability, we provide a novel class of Lyapunov-based sufficient conditions to certify fixed-time stability in a class of singularly perturbed dynamical systems. The results are illustrated, analytically and numerically, using a fixed-time gradient flow system interconnected with a fixed-time plant and an additional high-order example.
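For orientation, the standard single-system Lyapunov condition for fixed-time stability can be written as below. This is the classical textbook bound, not the composite condition derived in the paper; the symbols $V$, $a$, $b$, $p$, $q$, and $T$ are introduced here for illustration only.

```latex
% Standard fixed-time Lyapunov condition (background, not the paper's composite result).
% V is a positive definite, radially unbounded Lyapunov function; a, b, p, q are constants.
\[
  \dot{V}(x) \le -a\,V(x)^{p} - b\,V(x)^{q},
  \qquad a, b > 0, \quad 0 < p < 1 < q,
\]
% Under this condition the settling time is bounded uniformly in the initial condition x_0:
\[
  T(x_0) \le \frac{1}{a\,(1-p)} + \frac{1}{b\,(q-1)}
  \qquad \text{for all } x_0 .
\]
```

The first term bounds the time to reach the origin from the sublevel set $V \le 1$, and the second bounds the time to enter that set from arbitrarily far away, which is why the total is independent of $x_0$.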
Mitigating Polarization in Recommender Systems via Network-aware Feedback Optimization
We consider a recommender system that takes into account the interaction between recommendations and the evolution of user interests. Users' opinions are influenced by both social interactions and recommended content. We leverage online feedback optimization to design a recommender system that trades off between maximizing engagement and minimizing polarization. The recommender system is agnostic about users' opinions, clicking behavior, and social interactions, and solely relies on clicks. We establish optimality and closed-loop stability of the resulting feedback interconnection between the social platform and the recommender system. We numerically validate our algorithm when the user population follows an extended Friedkin-Johnsen model. We observe that network-aware recommendations significantly reduce polarization without compromising user engagement.
Analyzing Errors in Controlled Turret System
The purpose of this paper is to characterize aiming errors in controlled weapon systems given target location as input. To achieve this objective, we analyze the accuracy of a controlled weapon system model for stationary and moving targets under different error sources and firing times. First, we develop a mathematical model of a gun turret and use it to design two controllers, a Proportional-Integral-Derivative controller and a Model Predictive controller, which accept the target location input and move the turret to the centroid of the target in simulations. For stationary targets, we analyze the impact of errors in estimating the system's parameters and uncertainty in the aim point measurement. Our results indicate that turret movement is more sensitive to errors in the moment of inertia than the damping coefficient, which could lead to incorrect simulations of controlled turret system accuracy. The results also support the hypothesis that turret movement errors are larger over longer distances of gun turret movement and, assuming no time constraints, accuracy improves the longer one waits to fire; though this may not always be practical in a combat scenario. Additionally, we demonstrate that the integral control component is needed for high accuracy in moving target scenarios.
comment: 29 pages, 15 figures
Asymptotically Stable Data-Driven Koopman Operator Approximation with Inputs using Total Extended DMD
The Koopman operator framework can be used to identify a data-driven model of a nonlinear system. Unfortunately, when the data is corrupted by noise, the identified model can be biased. Additionally, depending on the choice of lifting functions, the identified model can be unstable, even when the underlying system is asymptotically stable. This paper presents an approach to reduce the bias in an approximate Koopman model, and simultaneously ensure asymptotic stability, when using noisy data. Additionally, the proposed data-driven modeling approach is applicable to systems with inputs, such as a known forcing function or a control input. Specifically, bias is reduced by using a total least-squares approach, modified to accommodate inputs in addition to lifted inputs. To enforce asymptotic stability of the approximate Koopman model, linear matrix inequality constraints are added to the identification problem. The performance of the proposed method is then compared to the well-known extended dynamic mode decomposition method and to the newly introduced forward-backward extended dynamic mode decomposition method using a simulated Duffing oscillator dataset and an experimental soft robot arm dataset.
comment: 18 pages, 6 figures
Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations
This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, this paper focuses on developing an efficient online search strategy which jointly estimates channels, guides UAV positioning, and optimizes resource allocation. Analytically exploiting the geometric properties of the equipotential surface, this paper develops an LOS discovery trajectory on the equipotential surface while the closed-form search directions are determined using perturbation theory. Since the explicit expression of the equipotential surface is not available, this paper proposes to locally construct a channel model for each user in the LOS regime utilizing polynomial regression without depending on user locations or propagation distance. A class of spiral trajectories to simultaneously construct the LOS channels and search on the equipotential surface is developed. An optimal radius of the spiral and an optimal measurement pattern for channel gain estimation are derived to minimize the mean squared error (MSE) of the locally constructed channel. Numerical results on real 3D city maps demonstrate that the proposed scheme achieves over 94% of the performance of a 3D exhaustive search scheme with just a 3-kilometer search.
Quantifying and Optimizing the Time-Coupled Flexibilities at the Distribution-Level for TSO-DSO Coordination
The flexibilities provided by the distributed energy resources (DERs) in distribution systems enable the coordination of transmission system operator (TSO) and distribution system operators (DSOs). At the distribution level, the flexibilities should be optimized for participation in the transmission system operation. This paper first proposes a flexibility quantification method that quantifies the costs of providing flexibilities and their values to the DSO in the TSO-DSO coordination. Compared with traditional power-range-based quantification approaches that are mainly suitable for generators, the proposed method can directly capture the time-coupling characteristics of DERs' individual and aggregated flexibility regions. Based on the quantification method, we further propose a DSO optimization model to activate the flexibilities from DER aggregators in the distribution system for energy arbitrage and ancillary services provision in the transmission system, along with a revenue allocation strategy that ensures a non-profit DSO. Numerical tests on the IEEE test system verify the proposed methods.
Data-driven AC Optimal Power Flow with Physics-informed Learning and Calibrations
The modern power grid is witnessing a shift in operations from traditional control methods to more advanced operational mechanisms. Due to the nonconvex nature of the Alternating Current Optimal Power Flow (ACOPF) problem and the need for operations with better granularity in the modern smart grid, system operators require a more efficient and reliable ACOPF solver. While data-driven ACOPF methods excel in directly inferring the optimal solution based on power grid demand, achieving both feasibility and optimality remains a challenge due to the NP-hardness of the problem. In this paper, we propose a physics-informed machine learning model and a feasibility calibration algorithm to produce solutions for the ACOPF problem. Notably, the machine learning model produces solutions with a 0.5% and 1.4% optimality gap for IEEE bus 14 and 118 grids, respectively. The feasibility correction algorithm converges for all test scenarios on bus 14 and achieves a 92.2% convergence rate on bus 118.
comment: 6 pages, 3 figures, 1 algorithm and 2 tables. Submitted to SmartGridComm2024
Modeling and Predictive Control for the Treatment of Hyperthyroidism
In this work, we propose an approach to determine the dosages of antithyroid agents to treat hyperthyroid patients. Instead of relying on a trial-and-error approach, as is commonly done in clinical practice, we suggest determining the dosages by means of a model predictive control (MPC) scheme. To this end, we first extend a mathematical model of the pituitary-thyroid feedback loop such that the intake of methimazole, a common antithyroid agent, can be considered. Second, based on the extended model, we develop an MPC scheme to determine suitable dosages. In numerical simulations, we consider scenarios in which (i) patients are affected by Graves' disease and take the medication orally and (ii) patients suffer from a life-threatening thyrotoxicosis, in which the medication is usually given intravenously. Our conceptual study suggests that determining the medication dosages by means of an MPC scheme could be a promising alternative to the currently applied trial-and-error approach.
comment: 6 pages
UltimateKalman: Flexible Kalman Filtering and Smoothing Using Orthogonal Transformations
UltimateKalman is a flexible linear Kalman filter and smoother implemented in three popular programming languages: MATLAB, C, and Java. UltimateKalman is a slight simplification and slight generalization of an elegant Kalman filter and smoother that was proposed in 1977 by Paige and Saunders. Their algorithm appears to be numerically superior and more flexible than other Kalman filters and smoothers, but curiously has never been implemented or used before. UltimateKalman is flexible: it can easily handle time-dependent problems, problems with state vectors whose dimensions vary from step to step, problems with a varying number of observations in different steps (or no observations at all in some steps), and problems in which the expectation of the initial state is unknown. The programming interface of UltimateKalman is broken into simple building blocks that can be used to construct filters, single or multi-step predictors, multi-step or whole-track smoothers, and combinations. The paper describes the algorithm and its implementation, along with a suite of examples and tests.
CityLight: A Universal Model for Coordinated Traffic Signal Control in City-scale Heterogeneous Intersections
The increasingly severe congestion problem in modern cities strengthens the significance of developing city-scale traffic signal control (TSC) methods for traffic efficiency enhancement. While reinforcement learning has been widely explored in TSC, most existing methods still target small-scale optimization and cannot directly scale to the city level due to unbearable resource demand. Only a few of them manage to tackle city-level optimization, namely a thousand-scale optimization, by incorporating parameter-sharing mechanisms, but they have hardly tackled the heterogeneity of intersections and the intricate between-intersection interactions inherent in real-world city road networks. To fill this gap, we target two important challenges in adopting parameter-sharing paradigms to solve TSC: inconsistency of inner state representations for intersections heterogeneous in configuration, scale, and orders of available traffic phases; and intricacy of impacts from neighborhood intersections that have various relative traffic relationships due to inconsistent phase orders and diverse relative positioning. Our method, CityLight, features a universal representation module that not only aligns the state representations of intersections by reindexing their phases based on their semantics and designing heterogeneity-preserving observations, but also encodes the narrowed relative traffic relation types to project the neighborhood intersections onto a uniform relative traffic impact space. We further attentively fuse neighborhood representations based on their competing relations and incorporate neighborhood-integrated rewards to boost coordination. Extensive experiments with hundreds to tens of thousands of intersections validate the surprising effectiveness and generalizability of CityLight, with an overall performance gain of 11.68% and a 22.59% improvement in transfer scenarios in throughput.
GPU-Accelerated DCOPF using Gradient-Based Optimization
DC Optimal Power Flow (DCOPF) is a key operational tool for power system operators, and it is embedded as a subproblem in many challenging optimization problems (e.g., line switching). However, traditional CPU-based solve routines (e.g., simplex) have saturated in speed and are hard to parallelize. This paper focuses on solving DCOPF problems using gradient-based routines on Graphics Processing Units (GPUs), which have massive parallelization capability. To formulate these problems, we pose a Lagrange dual associated with DCOPF (linear and quadratic cost curves), and then we explicitly solve the inner (primal) minimization problem with a dual norm. The resulting dual problem can be efficiently iterated using projected gradient ascent. After solving the dual problem on both CPUs and GPUs to find tight lower bounds, we benchmark against Gurobi and MOSEK, comparing convergence speed and tightness on the IEEE 2000 and 10000 bus systems. We provide reliable and tight lower bounds for these problems with, at best, 5.4x speedup over a conventional solver.
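The dual iteration described in this abstract can be illustrated on a toy dispatch problem. The sketch below is not the authors' formulation (which uses a dual norm and runs on GPUs); it is a simplified analogue with quadratic generator costs, one power-balance equality, and one line-limit inequality, where the inner minimization reduces to clipping. All names (`inner_min`, `dual_ascent`), the two-generator data, and the step size are assumptions chosen for illustration.

```python
# Toy projected gradient ascent on a Lagrange dual (illustrative only).
# Primal: minimize sum_i (0.5*a_i*g_i^2 + b_i*g_i)
#         s.t. sum_i g_i = demand, sum_i h_i*g_i <= fmax, 0 <= g_i <= gmax_i.

def inner_min(lam, mu, a, b, gmax, h):
    """Closed-form minimizer of the Lagrangian over the generator box."""
    return [min(max((lam - mu * hi - bi) / ai, 0.0), gi_max)
            for ai, bi, gi_max, hi in zip(a, b, gmax, h)]

def dual_value(lam, mu, g, a, b, h, fmax, demand):
    """Lagrangian at the inner minimizer; a lower bound on the primal optimum."""
    cost = sum(0.5 * ai * gi * gi + bi * gi for ai, bi, gi in zip(a, b, g))
    flow = sum(hi * gi for hi, gi in zip(h, g))
    return cost + lam * (demand - sum(g)) + mu * (flow - fmax)

def dual_ascent(a, b, gmax, h, fmax, demand, step=0.03, iters=5000):
    lam, mu = 0.0, 0.0  # lam is free (equality), mu must stay >= 0 (inequality)
    for _ in range(iters):
        g = inner_min(lam, mu, a, b, gmax, h)
        flow = sum(hi * gi for hi, gi in zip(h, g))
        lam += step * (demand - sum(g))            # gradient step in lam
        mu = max(0.0, mu + step * (flow - fmax))   # gradient step, then projection
    g = inner_min(lam, mu, a, b, gmax, h)
    return lam, mu, g, dual_value(lam, mu, g, a, b, h, fmax, demand)

# Hypothetical data: a cheap remote generator behind a 60 MW line, a costly local one.
a, b, gmax = [0.1, 0.1], [10.0, 20.0], [100.0, 100.0]
h, fmax, demand = [1.0, 0.0], 60.0, 100.0
lam, mu, g, bound = dual_ascent(a, b, gmax, h, fmax, demand)
print(lam, mu, g, bound)
```

Because the toy problem is a convex quadratic program, the dual bound coincides with the primal optimum at convergence; the step size is chosen below the reciprocal of the dual gradient's Lipschitz constant for this particular data and would need retuning for other instances.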
Robotics
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
The ability to accurately interpret complex visual information is a crucial topic of multimodal large language models (MLLMs). Recent work indicates that enhanced visual perception significantly reduces hallucinations and improves performance on resolution-sensitive tasks, such as optical character recognition and document analysis. A number of recent MLLMs achieve this goal using a mixture of vision encoders. Despite their success, there is a lack of systematic comparisons and detailed ablation studies addressing critical aspects, such as expert selection and the integration of multiple vision experts. This study provides an extensive exploration of the design space for MLLMs using a mixture of vision encoders and resolutions. Our findings reveal several underlying principles common to various existing strategies, leading to a streamlined yet effective design approach. We discover that simply concatenating visual tokens from a set of complementary vision encoders is as effective as more complex mixing architectures or strategies. We additionally introduce Pre-Alignment to bridge the gap between vision-focused encoders and language tokens, enhancing model coherence. The resulting family of MLLMs, Eagle, surpasses other leading open-source models on major MLLM benchmarks. Models and code: https://github.com/NVlabs/Eagle
comment: Github: https://github.com/NVlabs/Eagle, HuggingFace: https://huggingface.co/NVEagle
In-Context Imitation Learning via Next-Token Prediction
We explore how to enhance next-token prediction models to perform in-context imitation learning on a real robot, where the robot executes new tasks by interpreting contextual information provided during the input phase, without updating its underlying policy parameters. We propose In-Context Robot Transformer (ICRT), a causal transformer that performs autoregressive prediction on sensorimotor trajectories without relying on any linguistic data or reward function. This formulation enables flexible and training-free execution of new tasks at test time, achieved by prompting the model with sensorimotor trajectories of the new task composed of image observation, action, and state tuples, collected through human teleoperation. Experiments with a Franka Emika robot demonstrate that the ICRT can adapt to new tasks specified by prompts, even in environment configurations that differ from both the prompt and the training data. In a multitask environment setup, ICRT significantly outperforms current state-of-the-art next-token prediction models in robotics on generalizing to unseen tasks. Code, checkpoints and data are available on https://icrt.dev/
SLAM2REF: Advancing Long-Term Mapping with 3D LiDAR and Reference Map Integration for Precise 6-DoF Trajectory Estimation and Map Extension
This paper presents a pioneering solution to the task of integrating mobile 3D LiDAR and inertial measurement unit (IMU) data with existing building information models or point clouds, which is crucial for achieving precise long-term localization and mapping in indoor, GPS-denied environments. Our proposed framework, SLAM2REF, introduces a novel approach for automatic alignment and map extension utilizing reference 3D maps. The methodology is supported by a sophisticated multi-session anchoring technique, which integrates novel descriptors and registration methodologies. Real-world experiments reveal the framework's remarkable robustness and accuracy, surpassing current state-of-the-art methods. Our open-source framework's significance lies in its contribution to resilient map data management, enhancing processes across diverse sectors such as construction site monitoring, emergency response, disaster management, and others, where fast-updated digital 3D maps contribute to better decision-making and productivity. Moreover, it offers advancements in localization and mapping research. Link to the repository: https://github.com/MigVega/SLAM2REF, Data: https://doi.org/10.14459/2024mp1743877.
DeMoBot: Deformable Mobile Manipulation with Vision-based Sub-goal Retrieval
Imitation learning (IL) algorithms typically distill experience into parametric behavior policies to mimic expert demonstrations. Despite their effectiveness, previous methods often struggle with data efficiency and accurately aligning the current state with expert demonstrations, especially in deformable mobile manipulation tasks characterized by partial observations and dynamic object deformations. In this paper, we introduce DeMoBot, a novel IL approach that directly retrieves observations from demonstrations to guide robots in Deformable Mobile manipulation tasks. DeMoBot utilizes vision foundation models to identify relevant expert data based on visual similarity and matches the current trajectory with demonstrated trajectories using trajectory similarity and forward reachability constraints to select suitable sub-goals. Once a goal is determined, a motion generation policy will guide the robot to the next state until the task is completed. We evaluated DeMoBot using a Spot robot in several simulated and real-world settings, demonstrating its effectiveness and generalizability. With only 20 demonstrations, DeMoBot significantly outperforms the baselines, reaching a 50% success rate in curtain opening and 85% in gap covering in simulation.
Gen-Swarms: Adapting Deep Generative Models to Swarms of Drones
Gen-Swarms is an innovative method that leverages and combines the capabilities of deep generative models with reactive navigation algorithms to automate the creation of drone shows. Advancements in deep generative models, particularly diffusion models, have demonstrated remarkable effectiveness in generating high-quality 2D images. Building on this success, various works have extended diffusion models to 3D point cloud generation. In contrast, alternative generative models such as flow matching have been proposed, offering a simple and intuitive transition from noise to meaningful outputs. However, the application of flow matching models to 3D point cloud generation remains largely unexplored. Gen-Swarms adapts these models to automatically generate drone shows. Existing 3D point cloud generative models create point trajectories which are impractical for drone swarms. In contrast, our method not only generates accurate 3D shapes but also guides the swarm motion, producing smooth trajectories and accounting for potential collisions through a reactive navigation algorithm incorporated into the sampling process. For example, when given a text category like Airplane, Gen-Swarms can rapidly and continuously generate numerous variations of 3D airplane shapes. Our experiments demonstrate that this approach is particularly well-suited for drone shows, providing feasible trajectories, creating representative final shapes, and significantly enhancing the overall performance of drone show generation.
BIM-SLAM: Integrating BIM Models in Multi-session SLAM for Lifelong Mapping using 3D LiDAR
While 3D LiDAR sensor technology is becoming more advanced and cheaper every day, the growth of digitalization in the AEC industry contributes to the fact that 3D building information models (BIM models) are now available for a large part of the built environment. These two facts open the question of how 3D models can support 3D LiDAR long-term SLAM in indoor, GPS-denied environments. This paper proposes a methodology that leverages BIM models to create an updated map of indoor environments with sequential LiDAR measurements. Session data (pose graph-based map and descriptors) are initially generated from BIM models. Then, real-world data is aligned with the session data from the model using multi-session anchoring while minimizing the drift on the real-world data. Finally, the new elements not present in the BIM model are identified, grouped, and reconstructed in a surface representation, allowing a better visualization next to the BIM model. The framework enables the creation of a coherent map aligned with the BIM model that does not require prior knowledge of the initial pose of the robot, and it does not need to be inside the map.
comment: Conference paper in ISARC 2023
FlowAct: A Proactive Multimodal Human-robot Interaction System with Continuous Flow of Perception and Modular Action Sub-systems
The evolution of autonomous systems in the context of human-robot interaction systems necessitates a synergy between the continuous perception of the environment and the potential actions to navigate or interact within it. We present FlowAct, a proactive multimodal human-robot interaction architecture, working as an asynchronous endless loop of robot sensors into actuators and organized by two controllers, the Environment State Tracking (EST) and the Action Planner. The EST continuously collects and publishes a representation of the operative environment, ensuring a steady flow of perceptual data. This persistent perceptual flow is pivotal for our advanced Action Planner, which orchestrates a collection of modular action subsystems, such as movement and speaking modules, governing their initiation or cessation based on the evolving environmental narrative. The EST employs a fusion of diverse sensory modalities to build a rich, real-time representation of the environment that is distributed to the Action Planner. This planner uses a decision-making framework to dynamically coordinate action modules, allowing them to respond proactively and coherently to changes in the environment. Through a series of real-world experiments, we exhibit the efficacy of the system in maintaining a continuous perception-action loop, substantially enhancing the responsiveness and adaptability of autonomous proactive agents. The modular architecture of the action subsystems facilitates easy extensibility and adaptability to a broad spectrum of tasks and scenarios.
comment: Paper accepted at WACAI 2024
Towards Optimized Parallel Robots for Human-Robot Collaboration by Combined Structural and Dimensional Synthesis
Parallel robots (PR) offer potential for human-robot collaboration (HRC) due to their lower moving masses and higher speeds. However, the parallel leg chains increase the risks of collision and clamping. In this work, these hazards are described by kinematics and kinetostatics models to minimize them as objective functions by a combined structural and dimensional synthesis in a particle-swarm optimization. In addition to the risk of clamping within and between kinematic chains, the back-drivability is quantified to theoretically guarantee detectability via motor current. Another HRC-relevant objective function is the largest eigenvalue of the mass matrix formulated in the operational-space coordinates to consider collision effects. Multi-objective optimization leads to different Pareto-optimal PR structures. The results show that the optimization leads to significant improvement of the HRC criteria and that a Hexa structure (6-RUS) is to be favored concerning the objective functions and due to its simpler joint structure.
comment: Accepted for publication at VDI Mechatroniktagung 2024
Multi-view Pose Fusion for Occlusion-Aware 3D Human Pose Estimation ECCV
Robust 3D human pose estimation is crucial to ensure safe and effective human-robot collaboration. Accurate human perception, however, is particularly challenging in these scenarios due to strong occlusions and limited camera viewpoints. Current 3D human pose estimation approaches are rather vulnerable in such conditions. In this work, we present a novel approach for robust 3D human pose estimation in the context of human-robot collaboration. Instead of relying on noisy 2D feature triangulation, we perform multi-view fusion on 3D skeletons provided by absolute monocular methods. Accurate 3D pose estimation is then obtained via reprojection error optimization, introducing limb-length symmetry constraints. We evaluate our approach on the public dataset Human3.6M and on a novel version, Human3.6M-Occluded, derived by adding synthetic occlusions on the camera views with the purpose of testing pose estimation algorithms under severe occlusions. We further validate our method on real human-robot collaboration workcells, in which we strongly surpass current 3D human pose estimation methods. Our approach outperforms state-of-the-art multi-view human pose estimation techniques and demonstrates superior capabilities in handling challenging scenarios with strong occlusions, representing a reliable and effective solution for real human-robot collaboration setups.
comment: ECCV workshops 2024
Conceptual Design on the Field of View of Celestial Navigation Systems for Maritime Autonomous Surface Ships
In order to understand the appropriate field of view (FOV) size of celestial automatic navigation systems for surface ships, we investigate the variations of measurement accuracy of star position and probability of successful star identification with respect to FOV, focusing on the decreasing number of observable star magnitudes and the presence of physically covered stars in marine environments. The results revealed that, although a larger FOV reduces the measurement accuracy of star positions, it increases the number of observable objects and thus improves the probability of star identification using subgraph isomorphism-based methods. It was also found that, although at least four objects need to be observed for accurate identification, four objects may not be sufficient for wider FOVs. On the other hand, from the point of view of celestial navigation systems, a decrease in the measurement accuracy leads to a decrease in positioning accuracy. Therefore, it was found that maximizing the FOV is required for celestial automatic navigation systems as long as the desired positioning accuracy can be ensured. Furthermore, it was found that algorithms incorporating more than four observed celestial objects are required to achieve highly accurate star identification over a wider FOV.
comment: 15 pages, 10 figures
Addressing the challenges of loop detection in agricultural environments
While visual SLAM systems are well studied and achieve impressive results in indoor and urban settings, natural, outdoor and open-field environments are much less explored and still present relevant research challenges. Visual navigation and local mapping have shown relatively good performance in open-field environments. However, globally consistent mapping and long-term localization still depend on the robustness of loop detection and closure, for which the literature is scarce. In this work, we propose a novel method to pave the way towards robust loop detection in open fields, particularly in agricultural settings, based on local feature search and stereo geometric refinement, with a final stage of relative pose estimation. Our method consistently achieves good loop detections, with a median error of 15 cm. We aim to characterize open fields as a novel environment for loop detection, understanding the limitations and problems that arise when dealing with them.
Explicit Contact Optimization in Whole-Body Contact-Rich Manipulation
Humans can exploit contacts anywhere on their body surface to manipulate large and heavy items, objects normally out of reach or multiple objects at once. However, such manipulation through contacts using the whole surface of the body remains extremely challenging to achieve on robots. This can be labelled as the Whole-Body Contact-Rich Manipulation (WBCRM) problem. In addition to the high dimensionality of the Contact-Rich Manipulation problem due to the combinatorics of contact modes, admitting contact creation anywhere on the body surface adds complexity, which hinders planning of manipulation within a reasonable time. We address this computational problem by formulating the contact and motion planning of planar WBCRM as hierarchical continuous optimization problems. To enable this formulation, we propose a novel continuous explicit representation of the robot surface, which we believe to be foundational for future research using continuous optimization for WBCRM. Our results demonstrate a significant improvement of convergence, planning time and feasibility - with, on average, 99% fewer iterations and a 96% reduction in time to find a solution over the considered scenarios, without recourse to failure-prone trajectory refinement steps.
Benchmarking ML Approaches to UWB-Based Range-Only Posture Recognition for Human-Robot Interaction
Human pose estimation involves detecting and tracking the positions of various body parts using input data from sources such as images, videos, or motion and inertial sensors. This paper presents a novel approach to human pose estimation that uses machine learning algorithms to predict human postures from ultra-wideband (UWB) range measurements, as an alternative to motion sensors, and to translate them into robot motion commands. The study uses five UWB sensors mounted on the human body to enable the classification of still poses and more robust posture recognition. This approach ensures effective posture recognition across a variety of subjects. The range measurements between the nodes serve as input features for posture prediction models, which are implemented and compared for accuracy. For this purpose, machine learning algorithms including K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and a deep Multi-Layer Perceptron (MLP) neural network are employed and compared in predicting the corresponding postures. We demonstrate the proposed approach for real-time control of different mobile/aerial robots with inference implemented in a ROS 2 node. Experimental results demonstrate the efficacy of the approach, showcasing successful prediction of human posture and corresponding robot movements with high accuracy.
A quantitative model of takeover request time budget for conditionally automated driving
In conditional automation, the automated driving system assumes full control and only issues a takeover request to a human driver to resume driving in critical situations. Previous studies have concluded that the time budget required by drivers to resume driving after a takeover request varies with situations and different takeover variables. However, no comprehensive generalized approaches for estimating in advance the time budget required by drivers to take over have been provided. In this contribution, fixed (7 s) and variable time budgets (6 s, 5 s, and 4 s) with and without visual imagery assistance were investigated for suitability in three takeover scenarios using performance measures such as average lateral displacement. The results indicate that 7 s is suitable for two of the studied scenarios based on their characteristics. Using the obtained results and known relations between takeover variables, a mathematical formula for estimating the takeover request time budget is proposed. The proposed formula integrates individual stimulus response time, driving experience, and scenario-specific requirements, and allows increased safety for takeover maneuvers. Furthermore, the visual imagery resulted in increased takeover time, which invariably increases the time budget. Thus, the time demand of the visualized information, if applicable (such as visual imagery), should be included in the time budget.
comment: Manuscript: 12 pages, 12 figures, 7 tables
NeuroVE: Brain-inspired Linear-Angular Velocity Estimation with Spiking Neural Networks
Vision-based ego-velocity estimation is a fundamental problem in robot state estimation. However, the constraints of frame-based cameras, including motion blur and insufficient frame rates in dynamic settings, readily lead to the failure of conventional velocity estimation techniques. Mammals exhibit a remarkable ability to accurately estimate their ego-velocity during aggressive movement. Hence, integrating this capability into robots shows great promise for addressing these challenges. In this paper, we propose a brain-inspired framework for linear-angular velocity estimation, dubbed NeuroVE. The NeuroVE framework employs an event camera to capture the motion information and implements spiking neural networks (SNNs) to simulate the brain's spatial cells' function for velocity estimation. We formulate the velocity estimation as a time-series forecasting problem. To this end, we design an Astrocyte Leaky Integrate-and-Fire (ALIF) neuron model to encode continuous values. Additionally, we have developed an Astrocyte Spiking Long Short-term Memory (ASLSTM) structure, which significantly improves the time-series forecasting capabilities, enabling an accurate estimate of ego-velocity. Results from both simulation and real-world experiments indicate that NeuroVE has achieved an approximate 60% increase in accuracy compared to other SNN-based approaches.
TeFF: Tracking-enhanced Forgetting-free Few-shot 3D LiDAR Semantic Segmentation
In autonomous driving, 3D LiDAR plays a crucial role in understanding the vehicle's surroundings. However, newly emerged, unannotated objects present a few-shot learning problem for semantic segmentation. This paper addresses the limitations of current few-shot semantic segmentation by exploiting the temporal continuity of LiDAR data. Employing a tracking model to generate pseudo-ground-truths from a sequence of LiDAR frames, our method significantly augments the dataset, enhancing the model's ability to learn on novel classes. However, this approach introduces a data imbalance biased toward novel data, which presents a new challenge of catastrophic forgetting. To mitigate this, we incorporate LoRA, a technique that reduces the number of trainable parameters, thereby preserving the model's performance on base classes while improving its adaptability to novel classes. This work represents a significant step forward in few-shot 3D LiDAR semantic segmentation for autonomous driving. Our code is available at https://github.com/junbao-zhou/Track-no-forgetting.
Learning dynamics models for velocity estimation in autonomous racing
Velocity estimation is of great importance in autonomous racing. Still, existing solutions are characterized by limited accuracy, especially in the case of aggressive driving, or by poor generalization to unseen road conditions. To address these issues, we propose to utilize an Unscented Kalman Filter (UKF) with a learned dynamics model that is optimized directly for the state estimation task. Moreover, we propose to aid this model with the online-estimated friction coefficient, which increases the estimation accuracy and enables zero-shot adaptation to new road conditions. To evaluate the UKF-based velocity estimator with the proposed dynamics model, we introduce a publicly available dataset of aggressive manoeuvres performed by an F1TENTH car, with sideslip angles reaching 40°. Using this dataset, we show that learning the dynamics model through the UKF leads to improved estimation performance and that the proposed solution outperforms state-of-the-art learning-based state estimators by 17% in the nominal scenario. Moreover, we present the zero-shot adaptation abilities of the proposed method on a new, unseen road surface, thanks to the use of the proposed learning-based tire dynamics model with online friction estimation.
ES-PTAM: Event-based Stereo Parallel Tracking and Mapping
Visual Odometry (VO) and SLAM are fundamental components for spatial perception in mobile robots. Despite enormous progress in the field, current VO/SLAM systems are limited by their sensors' capability. Event cameras are novel visual sensors that offer advantages to overcome the limitations of standard cameras, enabling robots to expand their operating range to challenging scenarios, such as high-speed motion and high dynamic range illumination. We propose a novel event-based stereo VO system by combining two ideas: a correspondence-free mapping module that estimates depth by maximizing ray density fusion and a tracking module that estimates camera poses by maximizing edge-map alignment. We evaluate the system comprehensively on five real-world datasets, spanning a variety of camera types (manufacturers and spatial resolutions) and scenarios (driving, flying drone, hand-held, egocentric, etc.). The quantitative and qualitative results demonstrate that our method outperforms the state of the art in the majority of the test sequences by a margin, e.g., trajectory error reduction of 45% on the RPG dataset, 61% on the DSEC dataset, and 21% on the TUM-VIE dataset. To benefit the community and foster research on event-based perception systems, we release the source code and results: https://github.com/tub-rip/ES-PTAM
comment: 17 pages, 7 figures, 4 tables, https://github.com/tub-rip/ES-PTAM
On the Benefits of Visual Stabilization for Frame- and Event-based Perception
Vision-based perception systems are typically exposed to large orientation changes in different robot applications. In such conditions, their performance might be compromised due to the inherent complexity of processing data captured under challenging motion. Integration of mechanical stabilizers to compensate for the camera rotation is not always possible due to the robot payload constraints. This paper presents a processing-based stabilization approach to compensate for the camera's rotational motion both on events and on frames (i.e., images). Assuming that the camera's attitude is available, we evaluate the benefits of stabilization in two perception applications: feature tracking and estimating the translation component of the camera's ego-motion. The validation is performed using synthetic data and sequences from well-known event-based vision datasets. The experiments show that stabilization can improve feature tracking and camera ego-motion estimation accuracy by 27.37% and 34.82%, respectively. Concurrently, stabilization can reduce the processing time of computing the camera's linear velocity by at least 25%. Code is available at https://github.com/tub-rip/visual_stabilization
comment: 8 pages, 4 figures, 4 tables, https://github.com/tub-rip/visual_stabilization
AeroVerse: UAV-Agent Benchmark Suite for Simulating, Pre-training, Finetuning, and Evaluating Aerospace Embodied World Models
Aerospace embodied intelligence aims to empower unmanned aerial vehicles (UAVs) and other aerospace platforms to achieve autonomous perception, cognition, and action, as well as egocentric active interaction with humans and the environment. The aerospace embodied world model serves as an effective means to realize the autonomous intelligence of UAVs and represents a necessary pathway toward aerospace embodied intelligence. However, existing embodied world models primarily focus on ground-level intelligent agents in indoor scenarios, while research on UAV intelligent agents remains unexplored. To address this gap, we construct the first large-scale real-world image-text pre-training dataset, AerialAgent-Ego10k, featuring urban drones from a first-person perspective. We also create a virtual image-text-pose alignment dataset, CyberAgent Ego500k, to facilitate the pre-training of the aerospace embodied world model. For the first time, we clearly define 5 downstream tasks, i.e., aerospace embodied scene awareness, spatial reasoning, navigational exploration, task planning, and motion decision, and construct corresponding instruction datasets, i.e., SkyAgent-Scene3k, SkyAgent-Reason3k, SkyAgent-Nav3k, SkyAgent-Plan3k, and SkyAgent-Act3k, for fine-tuning the aerospace embodied world model. Simultaneously, we develop SkyAgentEval, a set of downstream task evaluation metrics based on GPT-4, to comprehensively, flexibly, and objectively assess the results, revealing the potential and limitations of 2D/3D visual language models in UAV-agent tasks. Furthermore, we integrate over 10 2D/3D visual-language models, 2 pre-training datasets, 5 finetuning datasets, more than 10 evaluation metrics, and a simulator into the benchmark suite, i.e., AeroVerse, which will be released to the community to promote exploration and development of aerospace embodied intelligence.
Feelit: Combining Compliant Shape Displays with Vision-Based Tactile Sensors for Real-Time Teletaction IROS 2024
Teletaction, the transmission of tactile feedback or touch, is a crucial aspect in the field of teleoperation. High-quality teletaction feedback allows users to remotely manipulate objects and increases the quality of the human-machine interface between the operator and the robot, making complex manipulation tasks possible. Advances in the field of teletaction for teleoperation, however, have yet to make full use of the high-resolution 3D data provided by modern vision-based tactile sensors. Existing solutions for teletaction are lacking in one or more areas of form or function, such as fidelity or hardware footprint. In this paper, we showcase our design for a low-cost teletaction device that can utilize real-time high-resolution tactile information from vision-based tactile sensors, through both physical 3D surface reconstruction and shear displacement. We present our device, the Feelit, which uses a combination of a pin-based shape display and compliant mechanisms to accomplish this task. The pin-based shape display utilizes an array of 24 servomotors with miniature Bowden cables, giving the device a resolution of 6x4 pins in a 15x10 mm display footprint. Each pin can actuate up to 3 mm in 200 ms, while providing 80 N of force and 1.5 µm of depth resolution. Shear displacement and rotation are achieved using a compliant mechanism design, allowing a minimum of 1 mm displacement laterally and 10 degrees of rotation. This real-time 3D tactile reconstruction is achieved with the use of a vision-based tactile sensor, the GelSight [1], along with an algorithm that samples the depth data and marker tracking to generate actuator commands. Through a series of experiments including shape recognition and relative weight identification, we show that our device has the potential to expand teletaction capabilities in the teleoperation space.
comment: IROS 2024
Power, Control, and Data Acquisition Systems for Rectal Simulator Integrated with Soft Pouch Actuators
Fecal incontinence (FI) is a significant health issue with various underlying causes. Research in this field is limited by social stigma and the lack of effective replication models. To address these challenges, we developed a sophisticated rectal simulator that integrates power, control, and data acquisition systems with soft pouch actuators. The system comprises four key subsystems: mechanical, electrical, pneumatic, and control and data acquisition. The mechanical subsystem utilizes common materials such as aluminum frames, wooden boards, and compact structural components to facilitate the installation and adjustment of electrical and control components. The electrical subsystem supplies power to regulators and sensors. The pneumatic system provides compressed air to actuators, enabling the simulation of FI. The control and data acquisition subsystem collects pressure data and regulates actuator movement. This comprehensive approach allows the robot to accurately replicate human defecation, managing various feces types including liquid, solid, and extremely solid. This innovation enhances our understanding of defecation and holds potential for advancing quality-of-life devices related to this condition.
Bio-inspired circular soft actuators for simulating defecation process of human rectum
Soft robots have found extensive applications in the medical field, particularly in rehabilitation exercises, assisted grasping, and artificial organs. Despite significant advancements in simulating various components of the digestive system, the rectum has been largely neglected due to societal stigma. This study seeks to address this gap by developing soft circular muscle actuators (CMAs) and rectum models to replicate the defecation process. Using soft materials, both the rectum and the actuators were fabricated to enable seamless integration and attachment. We designed, fabricated, and tested three types of CMAs and compared them to the simulated results. A pneumatic system was employed to control the actuators, and simulated stool was synthesized using sodium alginate and calcium chloride. Experimental results indicated that the third type of actuator exhibited superior performance in terms of area contraction and pressure generation. The successful simulation of the defecation process highlights the potential of these soft actuators in biomedical applications, providing a foundation for further research and development in the field of soft robotics.
DECAF: a Discrete-Event based Collaborative Human-Robot Framework for Furniture Assembly
This paper proposes a task planning framework for collaborative Human-Robot scenarios, specifically focused on assembling complex systems such as furniture. The human is characterized as an uncontrollable agent, implying for example that the agent is not bound by a pre-established sequence of actions and instead acts according to its own preferences. Meanwhile, the task planner reactively computes the optimal actions for the collaborative robot to efficiently complete the entire assembly task in the least time possible. We formalize the problem as a Discrete Event Markov Decision Problem (DE-MDP), a comprehensive framework that incorporates a variety of asynchronous behaviors, human change of mind and failure recovery as stochastic events. Although the problem could theoretically be addressed by constructing a graph of all possible actions, such an approach would be constrained by computational limitations. The proposed formulation offers an alternative solution utilizing Reinforcement Learning to derive an optimal policy for the robot. Experiments were conducted both in simulation and on a real system with human subjects assembling a chair in collaboration with a 7-DoF manipulator.
comment: 9 pages, 6 figures, extended version of accepted paper at IRO24
Path planning for autonomous vehicles with minimal collision severity
This paper proposes a path planning algorithm for autonomous vehicles, evaluating collision severity with respect to both static and dynamic obstacles. A collision severity map is generated from ratings, quantifying the severity of collisions. A two-level optimal control problem is designed. At the first level, the objective is to identify paths with the lowest collision severity. Subsequently, at the second level, among the paths with lowest collision severity, the one requiring the minimum steering effort is determined. Finally, numerical simulations were conducted using the optimal control software OCPID-DAE1. The study focuses on scenarios where collisions are unavoidable. Results demonstrate the effectiveness and significance of this approach in finding a path with minimum collision severity for autonomous vehicles. Furthermore, this paper illustrates how the ratings for collision severity influence the behaviour of the automated vehicle.
comment: arXiv admin note: text overlap with arXiv:2203.03681
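The two-level selection described in the abstract above (first minimize collision severity, then minimize steering effort among the least severe paths) can be illustrated on a finite set of candidate paths. This is only a minimal sketch: the paper solves a continuous optimal control problem with OCPID-DAE1, whereas the `CandidatePath` class, the `select_path` function, the tolerance `tol`, and the numeric values below are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePath:
    """Hypothetical candidate path with two precomputed costs."""
    name: str
    severity: float          # collision severity rating along the path
    steering_effort: float   # accumulated steering effort


def select_path(paths, tol=1e-9):
    """Lexicographic choice: lowest severity first, then lowest steering effort."""
    if not paths:
        raise ValueError("no candidate paths")
    # Level 1: find the minimum achievable collision severity.
    min_severity = min(p.severity for p in paths)
    least_severe = [p for p in paths if p.severity <= min_severity + tol]
    # Level 2: among those paths, pick the one with minimum steering effort.
    return min(least_severe, key=lambda p: p.steering_effort)


candidates = [
    CandidatePath("A", severity=3.0, steering_effort=0.2),
    CandidatePath("B", severity=1.0, steering_effort=0.9),
    CandidatePath("C", severity=1.0, steering_effort=0.4),
]
best = select_path(candidates)
print(best.name)
```

Path A has the lowest steering effort but is excluded at the first level because its severity is higher; the second level then only compares B and C.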
Hitting the Gym: Reinforcement Learning Control of Exercise-Strengthened Biohybrid Robots in Simulation
Animals can accomplish many incredible behavioral feats across a wide range of operational environments and scales that current robots struggle to match. One explanation for this performance gap is the extraordinary properties of the biological materials that comprise animals, such as muscle tissue. Using living muscle tissue as an actuator can endow robotic systems with highly desirable properties such as self-healing, compliance, and biocompatibility. Unlike traditional soft robotic actuators, living muscle biohybrid actuators exhibit unique adaptability, growing stronger with use. The dependency of a muscle's force output on its use history endows muscular organisms with the ability to dynamically adapt to their environment, getting better at tasks over time. While muscle adaptability is a benefit to muscular organisms, it currently presents a challenge for biohybrid researchers: how does one design and control a robot whose actuators' force output changes over time? Here, we incorporate muscle adaptability into a many-muscle biohybrid robot design and modeling tool, leveraging reinforcement learning as both a co-design partner and system controller. As a controller, our learning agents coordinated the independent contraction of 42 muscles distributed on a lattice worm structure to successfully steer it towards eight distinct targets while incorporating muscle adaptability. As a co-design tool, our agents enable users to identify which muscles are important to accomplishing a given task. Our results show that adaptive agents outperform non-adaptive agents in terms of maximum rewards and training time. Together, these contributions can both enable the elucidation of muscle actuator adaptation and inform the design and modeling of adaptive, performant, many-muscle robots.
comment: 11 pages, 6 figures
Domain-decoupled Physics-informed Neural Networks with Closed-form Gradients for Fast Model Learning of Dynamical Systems
Physics-informed neural networks (PINNs) are trained using physical equations and can also incorporate unmodeled effects by learning from data. PINNs for control (PINCs) of dynamical systems are gaining interest due to their prediction speed compared to classical numerical integration methods for nonlinear state-space models, making them suitable for real-time control applications. We introduce the domain-decoupled physics-informed neural network (DD-PINN) to address current limitations of PINC in handling large and complex nonlinear dynamical systems. The time domain is decoupled from the feed-forward neural network to construct an Ansatz function, allowing for calculation of gradients in closed form. This approach significantly reduces training times, especially for large dynamical systems, compared to PINC, which relies on graph-based automatic differentiation. Additionally, the DD-PINN inherently fulfills the initial condition and supports higher-order excitation inputs, simplifying the training process and enabling improved prediction accuracy. Validation on three systems - a nonlinear mass-spring-damper, a five-mass-chain, and a two-link robot - demonstrates that the DD-PINN achieves significantly shorter training times. In cases where the PINC's prediction diverges, the DD-PINN's prediction remains stable and accurate due to higher physics loss reduction or use of a higher-order excitation input. The DD-PINN allows for fast and accurate learning of large dynamical systems previously out of reach for the PINC.
comment: Accepted to International Conference on Informatics in Control, Automation and Robotics (ICINCO) 2024
FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry
This paper proposes FAST-LIVO2: a fast, direct LiDAR-inertial-visual odometry framework to achieve accurate and robust state estimation in SLAM tasks and provide great potential in real-time, onboard robotic applications. FAST-LIVO2 fuses the IMU, LiDAR and image measurements efficiently through an ESIKF. To address the dimension mismatch between the heterogeneous LiDAR and image measurements, we use a sequential update strategy in the Kalman filter. To enhance the efficiency, we use direct methods for both the visual and LiDAR fusion, where the LiDAR module registers raw points without extracting edge or plane features and the visual module minimizes direct photometric errors without extracting ORB or FAST corner features. The fusion of both visual and LiDAR measurements is based on a single unified voxel map where the LiDAR module constructs the geometric structure for registering new LiDAR scans and the visual module attaches image patches to the LiDAR points. To enhance the accuracy of image alignment, we use plane priors from the LiDAR points in the voxel map (and even refine the plane prior) and update the reference patch dynamically after new images are aligned. Furthermore, to enhance the robustness of image alignment, FAST-LIVO2 employs an on-demanding raycast operation and estimates the image exposure time in real time. Lastly, we detail three applications of FAST-LIVO2: UAV onboard navigation demonstrating the system's computation efficiency for real-time onboard navigation, airborne mapping showcasing the system's mapping accuracy, and 3D model rendering (mesh-based and NeRF-based) underscoring the suitability of our reconstructed dense map for subsequent rendering tasks. We open source our code, dataset and application on GitHub to benefit the robotics community.
comment: 30 pages, 31 figures, due to the limitation that 'The abstract field cannot exceed 1,920 characters', the abstract presented here is shorter than the one in the PDF file
Multi-modal Integrated Prediction and Decision-making with Adaptive Interaction Modality Explorations
Navigating dense and dynamic environments poses a significant challenge for autonomous driving systems, owing to the intricate nature of multimodal interaction, wherein the actions of various traffic participants and the autonomous vehicle are complex and implicitly coupled. In this paper, we propose a novel framework, Multi-modal Integrated predictioN and Decision-making (MIND), which addresses the challenges by efficiently generating joint predictions and decisions covering multiple distinctive interaction modalities. Specifically, MIND leverages learning-based scenario predictions to obtain integrated predictions and decisions with social-consistent interaction modality and utilizes a modality-aware dynamic branching mechanism to generate scenario trees that efficiently capture the evolutions of distinctive interaction modalities with low variation of interaction uncertainty along the planning horizon. The scenario trees are seamlessly utilized by the contingency planning under interaction uncertainty to obtain clear and considerate maneuvers accounting for multi-modal evolutions. Comprehensive experimental results in the closed-loop simulation based on the real-world driving dataset showcase superior performance to other strong baselines under various driving contexts.
comment: 8 pages, 9 figures
PAAMP: Polytopic Action-Set And Motion Planning for Long Horizon Dynamic Motion Planning via Mixed Integer Linear Programming IROS 2024
Optimization methods for long-horizon, dynamically feasible motion planning in robotics tackle challenging non-convex and discontinuous optimization problems. Traditional methods often falter due to the nonlinear characteristics of these problems. We introduce a technique that utilizes learned representations of the system, known as Polytopic Action Sets, to efficiently compute long-horizon trajectories. By employing a suitable sequence of Polytopic Action Sets, we transform the long-horizon dynamically feasible motion planning problem into a Linear Program. This reformulation enables us to address motion planning as a Mixed Integer Linear Program (MILP). We demonstrate the effectiveness of a Polytopic Action-Set and Motion Planning (PAAMP) approach by identifying swing-up motions for a torque-constrained pendulum as fast as 0.75 milliseconds. This approach is well-suited for solving complex motion planning and long-horizon Constraint Satisfaction Problems (CSPs) in dynamic and underactuated systems such as legged and aerial robots.
comment: Accepted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024). 8 pages, 10 figures
Risk-Aware Non-Myopic Motion Planner for Large-Scale Robotic Swarm Using CVaR Constraints IROS 2024
Swarm robotics has garnered significant attention due to its ability to accomplish elaborate and synchronized tasks. Existing methodologies for motion planning of swarm robotic systems mainly encounter difficulties in scalability and safety guarantee. To address these limitations, we propose a Risk-aware swarm mOtion planner using conditional ValuE at Risk (ROVER) that systematically navigates large-scale swarms through cluttered environments while ensuring safety. ROVER formulates a finite-time model predictive control (FTMPC) problem predicated upon the macroscopic state of the robot swarm represented by a Gaussian Mixture Model (GMM) and integrates conditional value-at-risk (CVaR) to ensure collision avoidance. The key component of ROVER is imposing a CVaR constraint on the distribution of the Signed Distance Function between the swarm GMM and obstacles in the FTMPC to enforce collision avoidance. Utilizing the analytical expression of CVaR of a GMM derived in this work, we develop a computationally efficient solution to solve the non-linear constrained FTMPC through sequential linear programming. Simulations and comparisons with representative benchmark approaches demonstrate the effectiveness of ROVER in flexibility, scalability, and risk mitigation.
comment: accepted to IROS 2024
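To make the CVaR constraint in the ROVER abstract more concrete, the sketch below estimates value-at-risk and conditional value-at-risk from samples of a one-dimensional Gaussian mixture standing in for the signed distance between the swarm and an obstacle. Note the assumptions: the paper derives an analytical CVaR expression for a GMM, while this sketch uses a plain Monte Carlo estimate; the function name `empirical_cvar`, the mixture parameters, and the risk level are all hypothetical.

```python
import numpy as np


def empirical_cvar(losses, alpha):
    """Return (VaR, CVaR) of a loss sample at level alpha.

    VaR is the alpha-quantile of the losses; CVaR is estimated as the
    mean of the losses at or above that quantile.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, alpha)
    tail = losses[losses >= var]
    return float(var), float(tail.mean())


# Hypothetical 1D Gaussian mixture over signed distance (metres).
rng = np.random.default_rng(0)
weights = np.array([0.7, 0.3])
means = np.array([2.0, 0.5])
stds = np.array([0.3, 0.2])
component = rng.choice(len(weights), size=10_000, p=weights)
signed_distance = rng.normal(means[component], stds[component])

# Treat negative signed distance as the loss: larger loss = closer to collision.
var, cvar = empirical_cvar(-signed_distance, alpha=0.95)
print(f"VaR_0.95 = {var:.3f}, CVaR_0.95 = {cvar:.3f}")
```

A constraint of the form CVaR <= threshold bounds the average of the worst (1 - alpha) fraction of outcomes, which is stricter than bounding the quantile alone.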
Component reusability evaluation and requirement tracing for agent-based cyber-physical-simulated systems
Evaluating early design concepts is crucial as it impacts quality and cost. This process is often hindered by vague and uncertain design information. This article introduces the SysML-based Simulated-Physical Systems Modeling Language (SPSysML). It is a Domain-Specific Language used to evaluate component reusability in cyber-physical systems, incorporating digital twins (DTs) and other simulated parts. The proposed factors assess the design quantitatively. SPSysML uses a requirement-based system structuring method to couple simulated and physical parts with requirements. SPSysML enables DTs to perceive exogenous actions in the simulated world. SPSysML validation is survey- and application-based. First, a robotic system for an assisted living project was developed. The integrity of simulated and physical parts of the system is improved using SPSysML-based quantitative evaluation. Thus, more system components are shared between the simulated and physical setups. The system was deployed on the physical robot and two simulators based on the Robot Operating System (ROS) or ROS2. SPSysML was used by a third-party developer and was assessed by him and other practitioners in a survey.
comment: This work has been submitted to the Elsevier Journal of Systems Architecture for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
FRAME: A Modular Framework for Autonomous Map Merging: Advancements in the Field
In this article, a novel approach for merging 3D point cloud maps in the context of egocentric multi-robot exploration is presented. Unlike traditional methods, the proposed approach leverages state-of-the-art place recognition and learned descriptors to efficiently detect overlap between maps, eliminating the need for the time-consuming global feature extraction and feature matching process. The estimated overlapping regions are used to calculate a homogeneous rigid transform, which serves as an initial condition for the GICP point cloud registration algorithm to refine the alignment between the maps. The advantages of this approach include faster processing time, improved accuracy, and increased robustness in challenging environments. Furthermore, the effectiveness of the proposed framework is successfully demonstrated through multiple field missions of robot exploration in a variety of different underground environments.
comment: 28 pages, 24 figures. Accepted to the IEEE Transactions on Field Robotics
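The FRAME abstract above mentions computing a homogeneous rigid transform that initializes GICP. The abstract does not say how that transform is computed, so the following is only a sketch under the assumption that matched 3D point pairs from the overlapping regions are available; it uses the standard SVD-based (Kabsch) least-squares alignment. The function name `rigid_transform` and the synthetic data are hypothetical, and the GICP refinement step is not shown.

```python
import numpy as np


def rigid_transform(src, dst):
    """Least-squares rigid transform (4x4) mapping src points onto dst.

    src, dst: (N, 3) arrays of corresponding points, N >= 3, not collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


# Synthetic check: rotate 30 degrees about z and translate.
rng = np.random.default_rng(1)
points = rng.uniform(-5.0, 5.0, size=(50, 3))
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([1.0, -2.0, 0.5])
moved = points @ R_true.T + t_true

T_init = rigid_transform(points, moved)
print(np.round(T_init, 3))
```

In a pipeline like the one described, a matrix such as `T_init` would be passed as the initial guess to a registration routine, which then refines it on the full point clouds.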
Receding-Constraint Model Predictive Control using a Learned Approximate Control-Invariant Set
In recent years, advanced model-based and data-driven control methods have been unlocking the potential of complex robotics systems, and we can expect this trend to continue at an exponential rate in the near future. However, ensuring safety with these advanced control methods remains a challenge. A well-known tool to make controllers (either Model Predictive Controllers or Reinforcement Learning policies) safe is the so-called control-invariant set (a.k.a. safe set). Unfortunately, for nonlinear systems, such a set cannot be exactly computed in general. Numerical algorithms exist for computing approximate control-invariant sets, but classic theoretic control methods break down if the set is not exact. This paper presents our recent efforts to address this issue. We present a novel Model Predictive Control scheme that can guarantee recursive feasibility and/or safety under weaker assumptions than classic methods. In particular, recursive feasibility is guaranteed by making the safe-set constraint move backward over the horizon, and assuming that such a set satisfies a condition that is weaker than control invariance. Safety is instead guaranteed under an even weaker assumption on the safe set, triggering a safe task-abortion strategy whenever a risk of constraint violation is detected. We evaluated our approach on a simulated robot manipulator, empirically demonstrating that it leads to fewer constraint violations than state-of-the-art approaches, while retaining reasonable performance in terms of tracking cost, number of completed tasks, and computation time.
comment: 7 pages, 3 figures, 3 tables, 2 pseudo-algo, conference
Smooth Path Planning with Subharmonic Artificial Potential Field
When a mobile robot plans its path in an environment with obstacles using an Artificial Potential Field (APF) strategy, it may fall into a local minimum point and fail to reach the goal. Also, the derivatives of the APF explode close to obstacles, causing poor planning performance. To solve these problems, exponential functions are used to modify the potential fields' formulas. The potential functions can be subharmonic when the distance between the robot and obstacles is above a predefined threshold. Subharmonic functions do not have local minima, and the derivatives of exponential functions increase mildly when the robot is close to obstacles, thus eliminating the problems in theory. A circular sampling technique is used to keep the robot outside a danger distance to obstacles and support the construction of subharmonic functions. Through simulations, it is shown that mobile robots can bypass local minimum points and construct a smooth path to reach the goal successfully with the proposed methods.
comment: Accepted by ICARM 2024
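As an illustration of the claim that an exponential potential can be subharmonic beyond a distance threshold, the following minimal sketch checks the sign of the 2D Laplacian numerically. The potential `exp(-a * r)`, the parameter `a`, and the function names are assumptions chosen for illustration; they are not taken from the paper. For this particular radial function the Laplacian is `a * exp(-a * r) * (a - 1 / r)`, which is non-negative exactly when `r >= 1 / a`.

```python
import math

def repulsive(x, y, a=2.0):
    """Hypothetical exponential repulsive potential for an obstacle at the origin."""
    return math.exp(-a * math.hypot(x, y))

def laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of the 2D Laplacian of f at (x, y)."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4.0 * f(x, y)) / (h * h)

a = 2.0                  # assumed decay rate (matches the default in repulsive)
threshold = 1.0 / a      # distance beyond which exp(-a r) is subharmonic in 2D

# Laplacian sampled outside and inside the threshold distance
outside = laplacian(repulsive, 2.0 * threshold, 0.0)
inside = laplacian(repulsive, 0.5 * threshold, 0.0)
print("outside threshold:", outside)
print("inside threshold:", inside)
```

In this sketch the Laplacian is positive at twice the threshold distance and negative at half of it, which is the kind of distance-dependent subharmonicity the abstract describes.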
A Platform-Agnostic Deep Reinforcement Learning Framework for Effective Sim2Real Transfer towards Autonomous Driving
Deep Reinforcement Learning (DRL) has shown remarkable success in solving complex tasks across various research fields. However, transferring DRL agents to the real world is still challenging due to the significant discrepancies between simulation and reality. To address this issue, we propose a robust DRL framework that leverages platform-dependent perception modules to extract task-relevant information and train a lane-following and overtaking agent in simulation. This framework facilitates the seamless transfer of the DRL agent to new simulated environments and the real world with minimal effort. We evaluate the performance of the agent in various driving scenarios in both simulation and the real world, and compare it to human players and the PID baseline in simulation. Our proposed framework significantly reduces the gaps between different platforms and the Sim2Real gap, enabling the trained agent to achieve similar performance in both simulation and the real world, driving the vehicle effectively.
DeepMIF: Deep Monotonic Implicit Fields for Large-Scale LiDAR 3D Mapping
Recently, significant progress has been achieved in sensing real large-scale outdoor 3D environments, particularly by using modern acquisition equipment such as LiDAR sensors. Unfortunately, they are fundamentally limited in their ability to produce dense, complete 3D scenes. To address this issue, recent learning-based methods integrate neural implicit representations and optimizable feature grids to approximate surfaces of 3D scenes. However, naively fitting samples along raw LiDAR rays leads to noisy 3D mapping results due to the nature of sparse, conflicting LiDAR measurements. In this work, we depart from fitting LiDAR data exactly, instead letting the network optimize a non-metric monotonic implicit field defined in 3D space. To fit our field, we design a learning system integrating a monotonicity loss that enables optimizing neural monotonic fields and leverages recent progress in large-scale 3D mapping. Our algorithm achieves high-quality dense 3D mapping performance as captured by multiple quantitative and perceptual measures and visual results obtained for Mai City, Newer College, and KITTI benchmarks. The code of our approach will be made publicly available.
comment: 8 pages, 6 figures
How Physics and Background Attributes Impact Video Transformers in Robotic Manipulation: A Case Study on Planar Pushing IROS 2024
As model and dataset sizes continue to scale in robot learning, the need to understand how the composition and properties of a dataset affect model performance becomes increasingly urgent to ensure cost-effective data collection and model performance. In this work, we empirically investigate how physics attributes (color, friction coefficient, shape) and scene background characteristics, such as the complexity and dynamics of interactions with background objects, influence the performance of Video Transformers in predicting planar pushing trajectories. We investigate three primary questions: How do physics attributes and background scene characteristics influence model performance? What kind of changes in attributes are most detrimental to model generalization? What proportion of fine-tuning data is required to adapt models to novel scenarios? To facilitate this research, we present CloudGripper-Push-1K, a large real-world vision-based robot pushing dataset comprising 1278 hours and 460,000 videos of planar pushing interactions with objects with different physics and background attributes. We also propose Video Occlusion Transformer (VOT), a generic modular video-transformer-based trajectory prediction framework which features 3 choices of 2D-spatial encoders as the subject of our case study. The dataset and source code are available at https://cloudgripper.org.
comment: IEEE/RSJ IROS 2024
ViIK: Flow-based Vision Inverse Kinematics Solver with Fusing Collision Checking
Inverse Kinematics (IK) finds the robot configurations that satisfy the target pose of the end effector. In motion planning, diverse configurations are required in case a feasible trajectory is not found. Meanwhile, collision checking (CC), e.g., Oriented Bounding Box (OBB), Discrete Oriented Polytope (DOP), and Quickhull, needs to be done for each configuration provided by the IK solver to ensure every goal configuration for motion planning is available. This means the classical IK solver and CC algorithm should be executed repeatedly for every configuration. Thus, the preparation time is long when the required number of goal configurations is large, e.g., motion planning in cluster environments. Moreover, structured maps, which might be difficult to obtain, are required by classical collision-checking algorithms. To sidestep these two issues, we propose a flow-based vision method that can output diverse available configurations by fusing inverse kinematics and collision checking, named Vision Inverse Kinematics solver (ViIK). Moreover, ViIK uses RGB images as the perception of environments. ViIK can output 1000 configurations within 40 ms, and the accuracy is about 3 millimeters and 1.5 degrees. Higher accuracy can be obtained by refining the output with the classical IK solver within a few iterations. The self-collision rates can be lower than 2%. The collision-with-env rates can be lower than 10% in most scenes. The code is available at: https://github.com/AdamQLMeng/ViIK.
U2UData: A Large-scale Cooperative Perception Dataset for Swarm UAVs Autonomous Flight ACM MM24
Modern perception systems for autonomous flight are sensitive to occlusion and have limited long-range capability, which is a key bottleneck in improving low-altitude economic task performance. Recent research has shown that the UAV-to-UAV (U2U) cooperative perception system has great potential to revolutionize the autonomous flight industry. However, the lack of a large-scale dataset is hindering progress in this area. This paper presents U2UData, the first large-scale cooperative perception dataset for swarm UAVs autonomous flight. The dataset was collected by three UAVs flying autonomously in the U2USim, covering a 9 km$^2$ flight area. It comprises 315K LiDAR frames, 945K RGB and depth frames, and 2.41M annotated 3D bounding boxes for 3 classes. It also includes brightness, temperature, humidity, smoke, and airflow values covering all flight routes. U2USim is the first real-world mapping swarm UAVs simulation environment. It takes Yunnan Province as the prototype and includes 4 terrains, 7 weather conditions, and 8 sensor types. U2UData introduces two perception tasks: cooperative 3D object detection and cooperative 3D object tracking. This paper provides comprehensive benchmarks of recent cooperative perception algorithms on these tasks.
comment: Accepted by ACM MM24
MultiGripperGrasp: A Dataset for Robotic Grasping from Parallel Jaw Grippers to Dexterous Hands IROS 2024
We introduce a large-scale dataset named MultiGripperGrasp for robotic grasping. Our dataset contains 30.4M grasps from 11 grippers for 345 objects. These grippers range from two-finger grippers to five-finger grippers, including a human hand. All grasps in the dataset are verified in the robot simulator Isaac Sim to classify them as successful or unsuccessful. Additionally, the object fall-off time for each grasp is recorded as a grasp quality measurement. Furthermore, the grippers in our dataset are aligned according to the orientation and position of their palms, allowing us to transfer grasps from one gripper to another. The grasp transfer significantly increases the number of successful grasps for each gripper in the dataset. Our dataset is useful for studying generalized grasp planning and grasp transfer across different grippers. Data, code and videos for the project are available at https://irvlutd.github.io/MultiGripperGrasp
comment: Published in IROS 2024
PhysPart: Physically Plausible Part Completion for Interactable Objects
Interactable objects are ubiquitous in our daily lives. Recent advances in 3D generative models make it possible to automate the modeling of these objects, benefiting a range of applications from 3D printing to the creation of robot simulation environments. However, while significant progress has been made in modeling 3D shapes and appearances, modeling object physics, particularly for interactable objects, remains challenging due to the physical constraints imposed by inter-part motions. In this paper, we tackle the problem of physically plausible part completion for interactable objects, aiming to generate 3D parts that not only fit precisely into the object but also allow smooth part motions. To this end, we propose a diffusion-based part generation model that utilizes geometric conditioning through classifier-free guidance and formulates physical constraints as a set of stability and mobility losses to guide the sampling process. Additionally, we demonstrate the generation of dependent parts, paving the way toward sequential part generation for objects with complex part-whole hierarchies. Experimentally, we introduce a new metric for measuring physical plausibility based on motion success rates. Our model outperforms existing baselines over shape and physical metrics, especially those that do not adequately model physical constraints. We also demonstrate our applications in 3D printing, robot manipulation, and sequential part generation, showing our strength in realistic tasks with the demand for high physical plausibility.
Systems and Control (CS)
A Control Theoretic Approach to Simultaneously Estimate Average Value of Time and Determine Dynamic Price for High-occupancy Toll Lanes
The dynamic pricing problem of a freeway corridor with high-occupancy toll (HOT) lanes was formulated and solved based on a point queue abstraction of the traffic system [Yin and Lou, 2009]. However, existing pricing strategies cannot guarantee that the closed-loop system converges to the optimal state, in which the HOT lanes' capacity is fully utilized but there is no queue on the HOT lanes, and a well-behaved estimation and control method is quite challenging and still elusive. This paper attempts to fill the gap by making three fundamental contributions: (i) to present a simpler formulation of the point queue model based on the new concept of residual capacity, (ii) to propose a simple feedback control theoretic approach to estimate the average value of time and calculate the dynamic price, and (iii) to analytically and numerically prove that the closed-loop system is stable and guaranteed to converge to the optimal state, in either Gaussian or exponential manners.
comment: 34 pages, 16 figures
Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems
We examine stability properties of primal-dual gradient flow dynamics for composite convex optimization problems with multiple, possibly nonsmooth, terms in the objective function under the generalized consensus constraint. The proposed dynamics are based on the proximal augmented Lagrangian and they provide a viable alternative to ADMM which faces significant challenges from both analysis and implementation viewpoints in large-scale multi-block scenarios. In contrast to customized algorithms with individualized convergence guarantees, we provide a systematic approach for solving a broad class of challenging composite optimization problems. We leverage various structural properties to establish global (exponential) convergence guarantees for the proposed dynamics. Our assumptions are much weaker than those required to prove (exponential) stability of various primal-dual dynamics as well as (linear) convergence of discrete-time methods, e.g., standard two-block and multi-block ADMM and EXTRA algorithms. Finally, we show necessity of some of our structural assumptions for exponential stability and provide computational experiments to demonstrate the convenience of the proposed dynamics for parallel and distributed computing applications.
comment: 31 pages; 4 figures
Practical Challenges for Reliable RIS Deployment in Heterogeneous Multi-Operator Multi-Band Networks
Reconfigurable intelligent surfaces (RISs) have been introduced as arrays of nearly passive elements with software-tunable electromagnetic properties to dynamically manipulate the reflection/transmission of radio signals. Research works in this area are focused on two applications, namely the user-assist RIS, which is tuned to enhance the quality-of-service (QoS) of target users, and the malicious RIS, which an attacker uses to degrade the QoS at victim receivers by generating intended destructive interference. While both user-assist and malicious RIS applications have been explored extensively, the impact of RIS deployments on imposing unintended interference on various wireless user equipments (UEs) remains underexplored. This paper investigates the challenges of integrating RISs into multi-carrier, multi-user, and multi-operator networks. We discuss how RIS deployments intended to benefit specific users can negatively impact other users served at various carrier frequencies through different network operators. While not an ideal solution, we discuss how ultra-narrowband metasurfaces can be incorporated into the manufacturing of RISs to mitigate some challenges of RIS deployment in wireless networks. We also present a simulation scenario to illuminate some practical challenges associated with the deployment of RISs in shared public environments.
Towards Optimized Parallel Robots for Human-Robot Collaboration by Combined Structural and Dimensional Synthesis
Parallel robots (PR) offer potential for human-robot collaboration (HRC) due to their lower moving masses and higher speeds. However, the parallel leg chains increase the risks of collision and clamping. In this work, these hazards are described by kinematics and kinetostatics models to minimize them as objective functions by a combined structural and dimensional synthesis in a particle-swarm optimization. In addition to the risk of clamping within and between kinematic chains, the back-drivability is quantified to theoretically guarantee detectability via motor current. Another HRC-relevant objective function is the largest eigenvalue of the mass matrix formulated in the operational-space coordinates to consider collision effects. Multi-objective optimization leads to different Pareto-optimal PR structures. The results show that the optimization leads to significant improvement of the HRC criteria and that a Hexa structure (6-RUS) is to be favored concerning the objective functions and due to its simpler joint structure.
comment: Accepted for publication at VDI Mechatroniktagung 2024
A Stochastic Robust Adaptive Systems Level Approach to Stabilizing Large-Scale Uncertain Markovian Jump Linear Systems
We propose a unified framework for robustly and adaptively stabilizing large-scale networked uncertain Markovian jump linear systems (MJLS) under external disturbances and mode switches that can change the network's topology. Adaptation is achieved by using minimal information on the disturbance to identify modes that are consistent with observable data. Robust control is achieved by extending the system level synthesis (SLS) approach, which allows us to pose the problem of simultaneously stabilizing multiple plants as a two-step convex optimization procedure. Our control pipeline computes a likelihood distribution of the system's current mode, uses it as probabilistic weights during simultaneous stabilization, then updates the likelihood via Bayesian inference. Because of this "softer" probabilistic approach to robust stabilization, our control pipeline does not suffer from abrupt destabilization issues due to changes in the system's true mode, which were observed in a previous method. Separability of SLS also lets us compute localized robust controllers for each subsystem, allowing for network scalability; we use several information consensus methods so that mode estimation can also be done locally. We apply our algorithms to disturbance-rejection on two sample dynamic power grid networks, a small-scale system with 7 nodes and a large-scale grid of 25 nodes.
comment: Full version of accepted paper to 63rd IEEE Conference on Decision and Control (CDC) 2024
Risk-Averse Resilient Operation of Electricity Grid Under the Risk of Wildfire
Wildfires and other extreme weather conditions due to climate change are stressing the aging electrical infrastructure. Power utilities have implemented public safety power shutoffs as a method to mitigate the risk of wildfire by proactively de-energizing some power lines, which leaves customers without power. System operators have to make a compromise between de-energizing power lines to avoid the wildfire risk and energizing those lines to serve the demand. In this work, with a quantified wildfire ignition risk of each line, a resilient operation problem is presented in power systems with a high penetration level of renewable generation resources. A two-stage robust optimization problem is formulated and solved using a column-and-constraint generation algorithm to find an improved balance between the de-energization of power lines and the customers served. The ability of different penetration levels of renewable generation to mitigate the impact of extreme fire hazard situations on the energization of customers is assessed. The validity of the presented robust optimization algorithm is demonstrated on various test cases.
Advanced POD-Based Performance Evaluation of Classifiers Applied to Human Driver Lane Changing Prediction
Machine learning (ML) classifiers serve as essential tools facilitating classification and prediction across various domains. The performance of these algorithms should be known to ensure their reliable application. In certain fields, receiver operating characteristic and precision-recall curves are frequently employed to assess machine learning algorithms without accounting for the impact of process parameters. However, it may be essential to evaluate the performance of these algorithms in relation to such parameters. As a performance evaluation metric capable of considering the effects of process parameters, this paper uses a modified probability of detection (POD) approach to assess the reliability of ML-based algorithms. As an example, the POD-based approach is employed to assess ML models used for predicting the lane changing behavior of a vehicle driver. The time remaining to the predicted (and therefore unknown) lane changing event is considered as the process parameter. The hit/miss approach to POD is taken here and modified by considering the probability of lane changing derived from ML algorithms at each time step, and obtaining the final result of the analysis accordingly. This improves the reliability of results compared to the standard hit/miss approach, which considers the outcome of the classifiers as either 0 or 1, while also simplifying evaluation compared to the â versus a approach. Performance evaluation results of the proposed approach are compared with those obtained with the standard hit/miss approach and a pre-developed â versus a approach to validate the effectiveness of the proposed method. The comparison shows that this method provides an averaging conservative behavior with the advantage of enhancing the reliability of the hit/miss approach to POD while retaining its simplicity.
comment: Manuscript: 8 pages, 6 figures, 4 tables
Linear-Quadratic Dynamic Games as Receding-Horizon Variational Inequalities
We consider dynamic games with linear dynamics and quadratic objective functions. We observe that the unconstrained open-loop Nash equilibrium coincides with the LQR in an augmented space, thus deriving an explicit expression of the cost-to-go. With such cost-to-go as a terminal cost, we show asymptotic stability for the receding-horizon solution of the finite-horizon, constrained game. Furthermore, we show that the problem is equivalent to a non-symmetric variational inequality, which does not correspond to any Nash equilibrium problem. For unconstrained closed-loop Nash equilibria, we derive a receding-horizon controller that is equivalent to the infinite-horizon one and ensures asymptotic stability.
A quantitative model of takeover request time budget for conditionally automated driving
In conditional automation, the automated driving system assumes full control and only issues a takeover request to a human driver to resume driving in critical situations. Previous studies have concluded that the time budget required by drivers to resume driving after a takeover request varies with situations and different takeover variables. However, no comprehensive generalized approaches for estimating in advance the time budget required by drivers to take over have been provided. In this contribution, fixed (7 s) and variable time budgets (6 s, 5 s, and 4 s) with and without visual imagery assistance were investigated for suitability in three takeover scenarios using performance measures such as average lateral displacement. The results indicate that 7 s is suitable for two of the studied scenarios based on their characteristics. Using the obtained results and known relations between takeover variables, a mathematical formula for estimating the takeover request time budget is proposed. The proposed formula integrates individual stimulus response time, driving experience, and scenario-specific requirements, and allows increased safety for takeover maneuvers. Furthermore, the visual imagery resulted in increased takeover time, which invariably increases the time budget. Thus, the time demand of the visualized information, if applicable (such as visual imagery), should be included in the time budget.
comment: Manuscript: 12 pages, 12 figures, 7 tables
Comparison of Model Predictive Control and Proximal Policy Optimization for a 1-DOF Helicopter System
This study conducts a comparative analysis of Model Predictive Control (MPC) and Proximal Policy Optimization (PPO), a Deep Reinforcement Learning (DRL) algorithm, applied to a 1-Degree of Freedom (DOF) Quanser Aero 2 system. Classical control techniques such as MPC and Linear Quadratic Regulator (LQR) are widely used due to their theoretical foundation and practical effectiveness. However, with advancements in computational techniques and machine learning, DRL approaches like PPO have gained traction in solving optimal control problems through environment interaction. This paper systematically evaluates the dynamic response characteristics of PPO and MPC, comparing their performance, computational resource consumption, and implementation complexity. Experimental results show that while LQR achieves the best steady-state accuracy, PPO excels in rise-time and adaptability, making it a promising approach for applications requiring rapid response and adaptability. Additionally, we have established a baseline for future RL-related research on this specific testbed. We also discuss the strengths and limitations of each control strategy, providing recommendations for selecting appropriate controllers for real-world scenarios.
comment: Accepted at INDIN2024
Structural Optimization of Lightweight Bipedal Robot via SERL
Designing a bipedal robot is a complex and challenging task, especially when dealing with a multitude of structural parameters. Traditional design methods often rely on human intuition and experience. However, such approaches are time-consuming and labor-intensive, lack theoretical guidance, and struggle to obtain optimal design results within vast design spaces, thus failing to fully exploit the inherent performance potential of robots. In this context, this paper introduces the SERL (Structure Evolution Reinforcement Learning) algorithm, which combines reinforcement learning for locomotion tasks with evolution algorithms. The aim is to identify the optimal parameter combinations within a given multidimensional design space. Through the SERL algorithm, we successfully designed a bipedal robot named Wow Orin, where the optimal leg lengths are obtained through optimization based on body structure and motor torque. We have experimentally validated the effectiveness of the SERL algorithm, which is capable of optimizing the best structure within specified design space and task conditions. Additionally, to assess the performance gap between our designed robot and the current state-of-the-art robots, we compared Wow Orin with mainstream bipedal robots Cassie and Unitree H1. A series of experimental results demonstrate the outstanding energy efficiency and performance of Wow Orin, further validating the feasibility of applying the SERL algorithm to practical design.
CBF-LLM: Safe Control for LLM Alignment
This paper proposes a control-based framework for aligning large language models (LLMs) by leveraging a control barrier function (CBF) to ensure user-desirable text generation. The presented framework applies a safety filter, designed based on the CBF, to the output generation of the baseline LLM, i.e., the sequence of tokens, with the aim of intervening in the generated text. The overall text-generation system is implemented with Llama 3 and a RoBERTa model, and the source code is available at https://github.com/Mya-Mya/CBF-LLM. The experiment demonstrates its control ability and effectiveness in reducing the number of interventions needed for user-specified alignment tasks.
Sufficient and Necessary Barrier-like Conditions for Safety and Reach-avoid Verification of Stochastic Discrete-time Systems
In this paper, we examine sufficient and necessary barrier-like conditions for the safety verification and reach-avoid verification of stochastic discrete-time systems. Safety verification aims to certify the satisfaction of the safety property, which stipulates that the probability of the system, starting from a specified initial state, remaining within a safe set is greater than or equal to a specified lower bound. A sufficient and necessary barrier-like condition is formulated for safety verification. In contrast, reach-avoid verification extends beyond safety to include reachability, seeking to certify the satisfaction of the reach-avoid property. It requires that the probability of the system, starting from a specified initial state, reaching a target set eventually while remaining within a safe set until the first hit of the target, is greater than or equal to a specified lower bound. Two sufficient and necessary barrier-like conditions are formulated under certain assumptions. These conditions are derived via relaxing Bellman equations.
On the Existence of Linear Observed Systems on Manifolds with Connection
Linear observed systems on manifolds are a special class of nonlinear systems whose state spaces are smooth manifolds but possess properties similar to linear systems. Such properties can be characterized by the ability to conduct preintegration and exact linearization with Jacobians independent of the linearization point. IMU dynamics in navigation can be constructed into linear observed settings, leading to invariant filters with guaranteed behaviors such as local convergence and consistency. In this letter, we establish the linear observed property for dynamics evolving on an arbitrary smooth manifold through the connection structure endowed upon this space. Our key finding is that the existence of linear observed systems on manifolds poses strong constraints on the state space itself, apart from requiring the dynamics to be in some specific forms. The existence of such systems is equivalent to the flatness of the state space, forcing the manifold to admit a group structure under mild topological assumptions.
comment: 6 pages, 2 figures
Infinite-Horizon Optimal Wireless Control Over Shared State-Dependent Fading Channels for IIoT Systems
Heterogeneous systems consisting of a multiloop wireless control system (WCS) and a mobile agent system (MAS) are ubiquitous in Industrial Internet of Things systems. Within these systems, positions of mobile agents may lead to shadow fading on the wireless channel that the WCS is controlled over and can significantly compromise its performance. This paper focuses on the infinite-horizon optimal control of the MAS to ensure the performance of the WCS while minimizing an average cost for the heterogeneous system subject to state and input constraints. Firstly, the state-dependent fading channel is modeled, which characterizes the interference among transmission links and shows that the probability of a successful transmission for the WCS depends on the state of the MAS. A necessary and sufficient condition in terms of constrained set stabilization is then established to ensure the Lyapunov-like performance of the WCS with an expected decay rate. Secondly, using the semi-tensor product of matrices and constrained reachable sets, a criterion is presented to check the constrained set stabilization of the MAS and to ensure the performance of the WCS. In addition, a constrained optimal state transition graph is constructed to address state and input constraints, by resorting to which the feasibility analysis of the optimal control problem is presented. Finally, an algorithm is proposed for the construction of optimal input sequences by minimum-mean cycles for a weighted graph. An illustrative example is provided to demonstrate the effectiveness of the proposed method.
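The minimum-mean cycle computation mentioned above can be sketched with Karp's classical algorithm. This is a generic textbook routine, not the paper's algorithm; the function name `min_mean_cycle`, the edge-list format, and the restriction to integer weights are assumptions made for this illustration. The variant below starts walks from every vertex at once, so the graph does not need a single source that reaches all vertices.

```python
from fractions import Fraction

def min_mean_cycle(n, edges):
    """Minimum mean weight over all directed cycles (Karp's algorithm).

    n     -- number of vertices, labelled 0..n-1
    edges -- list of (u, v, w) with integer weight w
    Returns a Fraction, or None if the graph has no directed cycle.
    """
    INF = float("inf")
    # d[k][v]: minimum weight of a walk with exactly k edges that ends at v
    # (walks may start at any vertex, hence d[0][v] = 0 for all v).
    d = [[INF] * n for _ in range(n + 1)]
    for v in range(n):
        d[0][v] = 0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if d[k - 1][u] + w < d[k][v]:
                d[k][v] = d[k - 1][u] + w

    best = None
    for v in range(n):
        if d[n][v] == INF:
            continue  # no n-edge walk ends here, so v adds no candidate
        # Karp's formula: max over k of (d_n(v) - d_k(v)) / (n - k)
        worst = max(
            Fraction(d[n][v] - d[k][v], n - k)
            for k in range(n)
            if d[k][v] < INF
        )
        if best is None or worst < best:
            best = worst
    return best

# Hypothetical weighted transition graph: cycle 0->1->2->0 has mean 2,
# cycle 0->1->0 has mean 1.
example_edges = [(0, 1, 1), (1, 2, 2), (2, 0, 3), (1, 0, 1)]
print(min_mean_cycle(3, example_edges))
```

Exact rational arithmetic avoids floating-point ties between cycles with nearly equal means; recovering the cycle itself would require storing predecessor pointers, which this sketch omits.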
Compact Multi-Service Antenna for Sensing and Communication Using Reconfigurable Complementary Spiral Resonator
In this paper, a compact multi-service antenna (MSA) is presented for sensing and communication using a reconfigurable complementary spiral resonator. A three-turn complementary spiral resonator (3-CSR) is inserted in the ground plane of a modified patch antenna to create a miniaturized structure. Two Positive-Intrinsic-Negative (PIN) diodes (D1, D2) are also integrated with the 3-CSR to achieve frequency reconfiguration. The proposed structure operates in three different modes, i.e., dual-band joint communication and sensing antenna (JCASA), dual-band antenna, and single-band antenna. The required mode can be selected by changing the state of the PIN diodes. In mode-1, the first band (0.95-0.97 GHz) of the antenna is dedicated to sensing by using frequency domain reflectometry (FDR), while the second band (1.53-1.56 GHz) is allocated to communication. The sensing ability of the proposed structure is utilized to measure soil moisture using FDR. Based on the frequency shift, the permittivity of the soil is observed to measure soil moisture. In mode-2 and mode-3, the structure operates as a standard dual and single band antenna, respectively, with a maximum gain of 1.5 dBi at 1.55 GHz. The proposed planar structure, with its simple geometry and a high sensitivity of 1.7%, is a suitable candidate for precision farming. The proposed structure is versatile and capable of being utilized as a single or dual-band antenna and also of measuring the permittivity of materials within the range of 1-20. Hence, it is adaptable to a range of applications.
Convergence Analysis of Overparametrized LQR Formulations
Motivated by the growing use of Artificial Intelligence (AI) tools in control design, this paper takes the first steps towards bridging the gap between results from Direct Gradient methods for the Linear Quadratic Regulator (LQR), and neural networks. More specifically, it looks into the case where one wants to find a Linear Feed-Forward Neural Network (LFFNN) feedback that minimizes a LQR cost. This paper starts by computing the gradient formulas for the parameters of each layer, which are used to derive a key conservation law of the system. This conservation law is then leveraged to prove boundedness and global convergence of solutions to critical points, and invariance of the set of stabilizing networks under the training dynamics. This is followed by an analysis of the case where the LFFNN has a single hidden layer. For this case, the paper proves that the training converges not only to critical points but to the optimal feedback control law for all but a set of measure-zero of the initializations. These theoretical results are followed by an extensive analysis of a simple version of the problem (the ``vector case''), proving the theoretical properties of accelerated convergence and robustness for this simpler example. Finally, the paper presents numerical evidence of faster convergence of the training of general LFFNNs when compared to traditional direct gradient methods, showing that the acceleration of the solution is observable even when the gradient is not explicitly computed but estimated from evaluations of the cost function.
Safe Barrier-Constrained Control of Uncertain Systems via Event-triggered Learning
While control barrier functions are widely employed to address safety, control synthesis methods based on them generally rely on accurate system dynamics. This is a critical limitation, since the dynamics of complex systems are often not fully known. Supervised machine learning techniques hold great promise for alleviating this weakness by inferring models from data. We propose a novel control barrier function-based framework for safe control through event-triggered learning, which switches between prioritizing control performance and improving model accuracy based on the uncertainty of the learned model. By updating a Gaussian process model with training points gathered online, the approach guarantees the feasibility of control barrier function conditions with high probability, such that safety can be ensured in a data-efficient manner. Furthermore, we establish the absence of Zeno behavior in the triggering scheme, and extend the algorithm to sampled-data realizations by accounting for inter-sampling effects. The effectiveness of the proposed approach and theory is demonstrated in simulations.
comment: The first two authors contributed equally to the work
Domain-decoupled Physics-informed Neural Networks with Closed-form Gradients for Fast Model Learning of Dynamical Systems
Physics-informed neural networks (PINNs) are trained using physical equations and can also incorporate unmodeled effects by learning from data. PINNs for control (PINCs) of dynamical systems are gaining interest due to their prediction speed compared to classical numerical integration methods for nonlinear state-space models, making them suitable for real-time control applications. We introduce the domain-decoupled physics-informed neural network (DD-PINN) to address current limitations of PINC in handling large and complex nonlinear dynamical systems. The time domain is decoupled from the feed-forward neural network to construct an Ansatz function, allowing for calculation of gradients in closed form. This approach significantly reduces training times, especially for large dynamical systems, compared to PINC, which relies on graph-based automatic differentiation. Additionally, the DD-PINN inherently fulfills the initial condition and supports higher-order excitation inputs, simplifying the training process and enabling improved prediction accuracy. Validation on three systems - a nonlinear mass-spring-damper, a five-mass-chain, and a two-link robot - demonstrates that the DD-PINN achieves significantly shorter training times. In cases where the PINC's prediction diverges, the DD-PINN's prediction remains stable and accurate due to higher physics loss reduction or use of a higher-order excitation input. The DD-PINN allows for fast and accurate learning of large dynamical systems previously out of reach for the PINC.
comment: Accepted to International Conference on Informatics in Control, Automation and Robotics (ICINCO) 2024
Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations
This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, this paper focuses on developing an efficient online search strategy which jointly estimates channels, guides UAV positioning, and optimizes resource allocation. Analytically exploiting the geometric properties of the equipotential surface, this paper develops an LOS discovery trajectory on the equipotential surface while the closed-form search directions are determined using perturbation theory. Since the explicit expression of the equipotential surface is not available, this paper proposes to locally construct a channel model for each user in the LOS regime utilizing polynomial regression without depending on user locations or propagation distance. A class of spiral trajectories to simultaneously construct the LOS channels and search on the equipotential surface is developed. An optimal radius of the spiral and an optimal measurement pattern for channel gain estimation are derived to minimize the mean squared error (MSE) of the locally constructed channel. Numerical results on real 3D city maps demonstrate that the proposed scheme achieves over 94% of the performance of a 3D exhaustive search scheme with just a 3-kilometer search.
Decentralized Online Learning for Random Inverse Problems Over Graphs
We propose a decentralized online learning algorithm for distributed random inverse problems over network graphs with online measurements, which unifies the distributed parameter estimation in Hilbert spaces and the least mean square problem in reproducing kernel Hilbert spaces (RKHS-LMS). We transform the convergence of the algorithm into the asymptotic stability of a class of inhomogeneous random difference equations in Hilbert spaces with $L_{2}$-bounded martingale difference terms and develop the $L_2$-asymptotic stability theory in Hilbert spaces. We show that if the network graph is connected and the sequence of forward operators satisfies the infinite-dimensional spatio-temporal persistence of excitation condition, then the estimates of all nodes are mean square and almost surely strongly consistent. Moreover, we propose a decentralized online learning algorithm in RKHS based on non-stationary online data streams, and prove that the algorithm is mean square and almost surely strongly consistent if the operators induced by the random input data satisfy the infinite-dimensional spatio-temporal persistence of excitation condition.
Component reusability evaluation and requirement tracing for agent-based cyber-physical-simulated systems
Evaluating early design concepts is crucial as it impacts quality and cost. This process is often hindered by vague and uncertain design information. This article introduces the SysML-based Simulated-Physical Systems Modeling Language (SPSysML). It is a Domain-Specific Language used to evaluate component reusability in cyber-physical systems, incorporating digital twins (DTs) and other simulated parts. The proposed factors assess the design quantitatively. SPSysML uses a requirement-based system structuring method to couple simulated and physical parts with requirements. SPSysML enables DTs to perceive exogenous actions in the simulated world. SPSysML validation is survey- and application-based. First, a robotic system for an assisted living project was developed. The integrity of simulated and physical parts of the system is improved using SPSysML-based quantitative evaluation. Thus, more system components are shared between the simulated and physical setups. The system was deployed on the physical robot and two simulators based on the Robot Operating System (ROS) or ROS2. SPSysML was used by a third-party developer and was assessed by him and other practitioners in a survey.
comment: This work has been submitted to the Elsevier Journal of Systems Architecture for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Minimax problems for ensembles of control-affine systems
In this paper, we consider ensembles of control-affine systems in $\mathbb{R}^d$, and we study simultaneous optimal control problems related to the worst-case minimization. After proving that such problems admit solutions, denoting with $(\Theta^N)_N$ a sequence of compact sets that parametrize the ensembles of systems, we first show that the corresponding minimax optimal control problems are $\Gamma$-convergent whenever $(\Theta^N)_N$ has a limit with respect to the Hausdorff distance. Besides its independent interest, the previous result plays a crucial role for establishing the Pontryagin Maximum Principle (PMP) when the ensemble is parametrized by a set $\Theta$ consisting of infinitely many points. Namely, we first approximate $\Theta$ by finite and increasing-in-size sets $(\Theta^N)_N$ for which the PMP is known, and then we derive the PMP for the $\Gamma$-limiting problem. The same strategy can be pursued in applications, where we can reduce infinite ensembles to finite ones to compute the minimizers numerically. We bring as a numerical example the Schr\"odinger equation for a qubit with uncertain resonance frequency.
comment: 24 pages, 1 Figure, 2 Tables. Correction of typos, minor revisions and new remarks
Optimal Dynamic Ancillary Services Provision Based on Local Power Grid Perception
In this paper, we propose a systematic closed-loop approach to provide optimal dynamic ancillary services with converter-interfaced generation systems based on local power grid perception. In particular, we structurally encode dynamic ancillary services such as fast frequency and voltage regulation in the form of a parametric transfer function matrix, which includes several parameters to define a set of different feasible response behaviors, among which we aim to find the optimal one to be realized by the converter system. Our approach is based on a so-called "perceive-and-optimize" (P&O) strategy: First, we identify a grid dynamic equivalent at the interconnection terminals of the converter system. Second, we consider the closed-loop interconnection of the identified grid equivalent and the parametric transfer function matrix, which we optimize for the set of transfer function parameters, resulting in a stable and optimal closed-loop performance for ancillary services provision. In the process, we ensure that grid-code and device-level requirements are satisfied. Finally, we demonstrate the effectiveness of our approach in different numerical case studies based on a modified Kundur two-area test system.
comment: 15 pages, 20 Figures
Dynamic Ancillary Services: From Grid Codes to Transfer Function-Based Converter Control
Conventional grid-code specifications for dynamic ancillary services provision such as fast frequency and voltage regulation are typically defined by means of piece-wise linear step-response capability curves in the time domain. However, although the specification of such time-domain curves is straightforward, their practical implementation in a converter-based generation system is not immediate, and no customary methods have been developed yet. In this paper, we thus propose a systematic approach for the practical implementation of piece-wise linear time-domain curves to provide dynamic ancillary services by converter-based generation systems, while ensuring grid-code and device-level requirements to be reliably satisfied. Namely, we translate the piece-wise linear time-domain curves for active and reactive power provision in response to a frequency and voltage step change into a desired rational parametric transfer function in the frequency domain, which defines a dynamic response behavior to be realized by the converter. The obtained transfer function can be easily implemented e.g. via a PI-based matching control in the power loop of standard converter control architectures. We demonstrate the performance of our method in numerical grid-code compliance tests, and reveal its superiority over classical droop and virtual inertia schemes which may not satisfy the grid codes due to their structural limitations.
comment: 8 pages, 11 figures
Receding-Constraint Model Predictive Control using a Learned Approximate Control-Invariant Set
In recent years, advanced model-based and data-driven control methods have been unlocking the potential of complex robotics systems, and we can expect this trend to continue at an exponential rate in the near future. However, ensuring safety with these advanced control methods remains a challenge. A well-known tool to make controllers (either Model Predictive Controllers or Reinforcement Learning policies) safe is the so-called control-invariant set (a.k.a. safe set). Unfortunately, for nonlinear systems, such a set cannot be exactly computed in general. Numerical algorithms exist for computing approximate control-invariant sets, but classic theoretic control methods break down if the set is not exact. This paper presents our recent efforts to address this issue. We present a novel Model Predictive Control scheme that can guarantee recursive feasibility and/or safety under weaker assumptions than classic methods. In particular, recursive feasibility is guaranteed by making the safe-set constraint move backward over the horizon, and assuming that such a set satisfies a condition that is weaker than control invariance. Safety is instead guaranteed under an even weaker assumption on the safe set, triggering a safe task-abortion strategy whenever a risk of constraint violation is detected. We evaluated our approach on a simulated robot manipulator, empirically demonstrating that it leads to fewer constraint violations than state-of-the-art approaches, while retaining reasonable performance in terms of tracking cost, number of completed tasks, and computation time.
comment: 7 pages, 3 figures, 3 tables, 2 pseudo-algo, conference
6G Fresnel Spot Beamfocusing using Large-Scale Metasurfaces: A Distributed DRL-Based Approach
In this paper, we introduce the concept of spot beamfocusing (SBF) in the Fresnel zone through extremely large-scale programmable metasurfaces (ELPMs) as a key enabling technology for 6G networks. A smart SBF scheme aims to adaptively concentrate the aperture's radiating power exactly at a desired focal point (DFP) in the 3D space utilizing some Machine Learning (ML) method. This offers numerous advantages for next-generation networks including efficient wireless power transfer (WPT), interference mitigation, reduced RF pollution, and improved information security. SBF necessitates ELPMs with precise channel state information (CSI) for all ELPM elements. However, obtaining exact CSI for ELPMs is not feasible in all environments; we alleviate this by proposing a novel adaptive CSI-independent ML scheme based on the TD3 deep-reinforcement-learning (DRL) method. While the proposed ML-based scheme is well-suited for relatively small-size arrays, the computational complexity is unaffordable for ELPMs. To overcome this limitation, we introduce a modular, highly scalable structure composed of multiple sub-arrays, each equipped with a TD3-DRL optimizer. This setup enables collaborative optimization of the radiated power at the DFP, significantly reducing computational complexity while enhancing learning speed. The proposed structure's benefits in terms of 3D spot-like power distribution, convergence rate, and scalability are validated through simulation results.
Wireless Channel Aware Data Augmentation Methods for Deep Learning-Based Indoor Localization
Indoor localization is a challenging problem that - unlike outdoor localization - lacks a universal and robust solution. Machine Learning (ML), particularly Deep Learning (DL), methods have been investigated as a promising approach. Although such methods bring remarkable localization accuracy, they heavily depend on the training data collected from the environment. The data collection is usually a laborious and time-consuming task, but Data Augmentation (DA) can be used to alleviate this issue. In this paper, different from previously used DA, we propose methods that utilize the domain knowledge about wireless propagation channels and devices. The methods exploit the typical hardware component drift in the transceivers and/or the statistical behavior of the channel, in combination with the measured Power Delay Profile (PDP). We comprehensively evaluate the proposed methods to demonstrate their effectiveness. This investigation mainly focuses on how factors such as the number of measurements, the augmentation proportion, and the environment of interest impact the effectiveness of the different DA methods. We show that in the low-data regime (few actual measurements available), localization accuracy increases by up to 50%, matching non-augmented results in the high-data regime. In addition, the proposed methods may outperform the measurement-only high-data performance by up to 33% using only 1/4 of the amount of measured data. We also exhibit the effect of different training data distribution and quality on the effectiveness of DA. Finally, we demonstrate the power of the proposed methods when employed along with Transfer Learning (TL) to address the data scarcity in target and/or source environments.
comment: 13 pages, 14 figures
Heat Death of Generative Models in Closed-Loop Learning
Improvement and adoption of generative machine learning models is rapidly accelerating, as exemplified by the popularity of LLMs (Large Language Models) for text, and diffusion models for image generation. As generative models become widespread, data they generate is incorporated into shared content through the public web. This opens the question of what happens when data generated by a model is fed back to the model in subsequent training campaigns. This is a question about the stability of the training process, whether the distribution of publicly accessible content, which we refer to as "knowledge", remains stable or collapses. Small scale empirical experiments reported in the literature show that this closed-loop training process is prone to degenerating. Models may start producing gibberish data, or sample from only a small subset of the desired data distribution (a phenomenon referred to as mode collapse). So far there has been only limited theoretical understanding of this process, in part due to the complexity of the deep networks underlying these generative models. The aim of this paper is to provide insights into this process (that we refer to as "generative closed-loop learning") by studying the learning dynamics of generative models that are fed back their own produced content in addition to their original training dataset. The sampling of many of these models can be controlled via a "temperature" parameter. Using dynamical systems tools, we show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial temperature leads the model to asymptotically degenerate. In fact, either the generative distribution collapses to a small set of outputs or becomes uniform over a large set of outputs.
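The role of the "temperature" parameter described above can be made concrete with a toy sketch. It assumes an idealized closed loop in which the model is a plain categorical distribution with strictly positive probabilities, refitting is exact, and no external data is added; this is a simplification for illustration, not the model analyzed in the paper, and the function names are hypothetical.

```python
import math


def apply_temperature(probs, temperature):
    """Rescale a categorical distribution to p_i^(1/T) / sum_j p_j^(1/T).

    Assumes every probability is strictly positive and temperature > 0.
    """
    logits = [math.log(p) / temperature for p in probs]
    m = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(logit - m) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]


def iterate(probs, temperature, steps):
    """Feed the tempered distribution back as the next 'model' for `steps` rounds.

    Idealized loop: exact refitting, no sampling noise, no external data.
    """
    for _ in range(steps):
        probs = apply_temperature(probs, temperature)
    return probs
```

In this toy setting, a temperature below 1 concentrates the mass on the most likely outcome, a temperature above 1 flattens the distribution toward uniform, and only a temperature of exactly 1 leaves it unchanged, which mirrors the two degenerate limits named in the abstract.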
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
We study sequential decision making problems aimed at maximizing the expected total reward while satisfying a constraint on the expected total utility. We employ the natural policy gradient method to solve the discounted infinite-horizon optimal control problem for Constrained Markov Decision Processes (constrained MDPs). Specifically, we propose a new Natural Policy Gradient Primal-Dual (NPG-PD) method that updates the primal variable via natural policy gradient ascent and the dual variable via projected sub-gradient descent. Although the underlying maximization involves a nonconcave objective function and a nonconvex constraint set, under the softmax policy parametrization we prove that our method achieves global convergence with sublinear rates regarding both the optimality gap and the constraint violation. Such convergence is independent of the size of the state-action space, i.e., it is dimension-free. Furthermore, for log-linear and general smooth policy parametrizations, we establish sublinear convergence rates up to a function approximation error caused by restricted policy parametrization. We also provide convergence and finite-sample complexity guarantees for two sample-based NPG-PD algorithms. Finally, we use computational experiments to showcase the merits and the effectiveness of our approach.
comment: 74 pages, 4 figures, 2 tables
Systems and Control (EESS)
A Control Theoretic Approach to Simultaneously Estimate Average Value of Time and Determine Dynamic Price for High-occupancy Toll Lanes
The dynamic pricing problem of a freeway corridor with high-occupancy toll (HOT) lanes was formulated and solved based on a point queue abstraction of the traffic system [Yin and Lou, 2009]. However, existing pricing strategies cannot guarantee that the closed-loop system converges to the optimal state, in which the HOT lanes' capacity is fully utilized but there is no queue on the HOT lanes, and a well-behaved estimation and control method is quite challenging and still elusive. This paper attempts to fill the gap by making three fundamental contributions: (i) to present a simpler formulation of the point queue model based on the new concept of residual capacity, (ii) to propose a simple feedback control theoretic approach to estimate the average value of time and calculate the dynamic price, and (iii) to analytically and numerically prove that the closed-loop system is stable and guaranteed to converge to the optimal state, in either Gaussian or exponential manners.
comment: 34 pages, 16 figures
Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems
We examine stability properties of primal-dual gradient flow dynamics for composite convex optimization problems with multiple, possibly nonsmooth, terms in the objective function under the generalized consensus constraint. The proposed dynamics are based on the proximal augmented Lagrangian and they provide a viable alternative to ADMM which faces significant challenges from both analysis and implementation viewpoints in large-scale multi-block scenarios. In contrast to customized algorithms with individualized convergence guarantees, we provide a systematic approach for solving a broad class of challenging composite optimization problems. We leverage various structural properties to establish global (exponential) convergence guarantees for the proposed dynamics. Our assumptions are much weaker than those required to prove (exponential) stability of various primal-dual dynamics as well as (linear) convergence of discrete-time methods, e.g., standard two-block and multi-block ADMM and EXTRA algorithms. Finally, we show necessity of some of our structural assumptions for exponential stability and provide computational experiments to demonstrate the convenience of the proposed dynamics for parallel and distributed computing applications.
comment: 31 pages; 4 figures
Practical Challenges for Reliable RIS Deployment in Heterogeneous Multi-Operator Multi-Band Networks
Reconfigurable intelligent surfaces (RISs) have been introduced as arrays of nearly passive elements with software-tunable electromagnetic properties to dynamically manipulate the reflection/transmission of radio signals. Research works in this area are focused on two applications, namely the {\it user-assist} RIS, which is tuned to enhance the quality-of-service (QoS) of target users, and the {\it malicious} RIS, which an attacker uses to degrade the QoS at victim receivers by generating {\it intended} destructive interference. While both user-assist and malicious RIS applications have been explored extensively, the impact of RIS deployments on imposing {\it unintended} interference on various wireless user equipments (UEs) remains underexplored. This paper investigates the challenges of integrating RISs into multi-carrier, multi-user, and multi-operator networks. We discuss how RIS deployments intended to benefit specific users can negatively impact other users served at various carrier frequencies through different network operators. While not an ideal solution, we discuss how ultra-narrowband metasurfaces can be incorporated into the manufacturing of RISs to mitigate some challenges of RIS deployment in wireless networks. We also present a simulation scenario to illuminate some practical challenges associated with the deployment of RISs in shared public environments.
Towards Optimized Parallel Robots for Human-Robot Collaboration by Combined Structural and Dimensional Synthesis
Parallel robots (PR) offer potential for human-robot collaboration (HRC) due to their lower moving masses and higher speeds. However, the parallel leg chains increase the risks of collision and clamping. In this work, these hazards are described by kinematics and kinetostatics models to minimize them as objective functions by a combined structural and dimensional synthesis in a particle-swarm optimization. In addition to the risk of clamping within and between kinematic chains, the back-drivability is quantified to theoretically guarantee detectability via motor current. Another HRC-relevant objective function is the largest eigenvalue of the mass matrix formulated in the operational-space coordinates to consider collision effects. Multi-objective optimization leads to different Pareto-optimal PR structures. The results show that the optimization leads to significant improvement of the HRC criteria and that a Hexa structure (6-RUS) is to be favored concerning the objective functions and due to its simpler joint structure.
comment: Accepted for publication at VDI Mechatroniktagung 2024
A Stochastic Robust Adaptive Systems Level Approach to Stabilizing Large-Scale Uncertain Markovian Jump Linear Systems
We propose a unified framework for robustly and adaptively stabilizing large-scale networked uncertain Markovian jump linear systems (MJLS) under external disturbances and mode switches that can change the network's topology. Adaptation is achieved by using minimal information on the disturbance to identify modes that are consistent with observable data. Robust control is achieved by extending the system level synthesis (SLS) approach, which allows us to pose the problem of simultaneously stabilizing multiple plants as a two-step convex optimization procedure. Our control pipeline computes a likelihood distribution of the system's current mode, uses the likelihoods as probabilistic weights during simultaneous stabilization, then updates the distribution via Bayesian inference. Because of this "softer" probabilistic approach to robust stabilization, our control pipeline does not suffer from abrupt destabilization issues due to changes in the system's true mode, which were observed in a previous method. Separability of SLS also lets us compute localized robust controllers for each subsystem, allowing for network scalability; we use several information consensus methods so that mode estimation can also be done locally. We apply our algorithms to disturbance-rejection on two sample dynamic power grid networks, a small-scale system with 7 nodes and a large-scale grid of 25 nodes.
comment: Full version of accepted paper to 63rd IEEE Conference on Decision and Control (CDC) 2024
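The Bayesian update of the mode distribution described in the abstract above can be sketched as a standard discrete Bayes filter step. This is a generic illustration only: the function name `bayes_mode_update`, the mode transition matrix, and the per-mode data likelihoods are hypothetical inputs, and the abstract does not specify how the authors compute them.

```python
def bayes_mode_update(prior, transition, likelihoods):
    """One predict-then-correct step for a distribution over discrete modes.

    prior: list of current mode probabilities (sums to 1).
    transition: transition[i][j] = assumed probability of switching i -> j.
    likelihoods: likelihoods[j] = assumed likelihood of the newest data
        under mode j (placeholder values supplied by the caller).
    Returns the normalized posterior over modes.
    """
    m = len(prior)
    # Predict: propagate the prior through the Markov chain.
    predicted = [
        sum(prior[i] * transition[i][j] for i in range(m)) for j in range(m)
    ]
    # Correct: weight each mode by how well it explains the new data.
    unnormalized = [predicted[j] * likelihoods[j] for j in range(m)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("no mode is consistent with the observed data")
    return [u / total for u in unnormalized]
```

The resulting posterior could then serve as the probabilistic weights for the next round of simultaneous stabilization, so that a mode switch shifts the weights gradually instead of forcing an abrupt controller change.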
Risk-Averse Resilient Operation of Electricity Grid Under the Risk of Wildfire
Wildfires and other extreme weather conditions due to climate change are stressing the aging electrical infrastructure. Power utilities have implemented public safety power shutoffs as a method to mitigate the risk of wildfire by proactively de-energizing some power lines, which leaves customers without power. System operators have to make a compromise between de-energizing power lines to avoid the wildfire risk and energizing those lines to serve the demand. In this work, with a quantified wildfire ignition risk of each line, a resilient operation problem is presented in power systems with a high penetration level of renewable generation resources. A two-stage robust optimization problem is formulated and solved using a column-and-constraint generation algorithm to find an improved balance between the de-energization of power lines and the customers served. The ability of different penetration levels of renewable generation to mitigate the impact of extreme fire hazard situations on the energization of customers is assessed. The validity of the presented robust optimization algorithm is demonstrated on various test cases.
Advanced POD-Based Performance Evaluation of Classifiers Applied to Human Driver Lane Changing Prediction
Machine learning (ML) classifiers serve as essential tools facilitating classification and prediction across various domains. The performance of these algorithms should be known to ensure their reliable application. In certain fields, receiver operating characteristic and precision-recall curves are frequently employed to assess machine learning algorithms without accounting for the impact of process parameters. However, it may be essential to evaluate the performance of these algorithms in relation to such parameters. As a performance evaluation metric capable of considering the effects of process parameters, this paper uses a modified probability of detection (POD) approach to assess the reliability of ML-based algorithms. As an example, the POD-based approach is employed to assess ML models used for predicting the lane changing behavior of a vehicle driver. The time remaining to the predicted (and therefore unknown) lane changing event is considered as process parameter. The hit/miss approach to POD is taken here and modified by considering the probability of lane changing derived from ML algorithms at each time step, and obtaining the final result of the analysis accordingly. This improves the reliability of results compared to the standard hit/miss approach, which considers the outcome of the classifiers as either 0 or 1, while also simplifying evaluation compared to the \^a versus a approach. Performance evaluation results of the proposed approach are compared with those obtained with the standard hit/miss approach and a pre-developed \^a versus a approach to validate the effectiveness of the proposed method. The comparison shows that this method provides an averaging conservative behavior with the advantage of enhancing the reliability of the hit/miss approach to POD while retaining its simplicity.
comment: Manuscript: 8 pages, 6 figures, 4 tables
Linear-Quadratic Dynamic Games as Receding-Horizon Variational Inequalities
We consider dynamic games with linear dynamics and quadratic objective functions. We observe that the unconstrained open-loop Nash equilibrium coincides with the LQR in an augmented space, thus deriving an explicit expression of the cost-to-go. With such cost-to-go as a terminal cost, we show asymptotic stability for the receding-horizon solution of the finite-horizon, constrained game. Furthermore, we show that the problem is equivalent to a non-symmetric variational inequality, which does not correspond to any Nash equilibrium problem. For unconstrained closed-loop Nash equilibria, we derive a receding-horizon controller that is equivalent to the infinite-horizon one and ensures asymptotic stability.
A quantitative model of takeover request time budget for conditionally automated driving
In conditional automation, the automated driving system assumes full control and only issues a takeover request to a human driver to resume driving in critical situations. Previous studies have concluded that the time budget required by drivers to resume driving after a takeover request varies with situations and different takeover variables. However, no comprehensive generalized approaches for estimating in advance the time budget required by drivers to take over have been provided. In this contribution, fixed (7 s) and variable time budgets (6 s, 5 s, and 4 s) with and without visual imagery assistance were investigated for suitability in three takeover scenarios using performance measures such as average lateral displacement. The results indicate that 7 s is suitable for two of the studied scenarios based on their characteristics. Using the obtained results and known relations between takeover variables, a mathematical formula for estimating takeover request time budget is proposed. The proposed formula integrates individual stimulus response time, driving experience, and scenario-specific requirements, and allows increased safety for takeover maneuvers. Furthermore, the visual imagery resulted in increased takeover time, which invariably increases the time budget. Thus, the time demand of the visualized information, if applicable (such as visual imagery), should be included in the time budget.
comment: Manuscript: 12 pages, 12 figures, 7 tables
Comparison of Model Predictive Control and Proximal Policy Optimization for a 1-DOF Helicopter System
This study conducts a comparative analysis of Model Predictive Control (MPC) and Proximal Policy Optimization (PPO), a Deep Reinforcement Learning (DRL) algorithm, applied to a 1-Degree of Freedom (DOF) Quanser Aero 2 system. Classical control techniques such as MPC and Linear Quadratic Regulator (LQR) are widely used due to their theoretical foundation and practical effectiveness. However, with advancements in computational techniques and machine learning, DRL approaches like PPO have gained traction in solving optimal control problems through environment interaction. This paper systematically evaluates the dynamic response characteristics of PPO and MPC, comparing their performance, computational resource consumption, and implementation complexity. Experimental results show that while LQR achieves the best steady-state accuracy, PPO excels in rise-time and adaptability, making it a promising approach for applications requiring rapid response and adaptability. Additionally, we have established a baseline for future RL-related research on this specific testbed. We also discuss the strengths and limitations of each control strategy, providing recommendations for selecting appropriate controllers for real-world scenarios.
comment: Accepted at INDIN2024
Structural Optimization of Lightweight Bipedal Robot via SERL
Designing a bipedal robot is a complex and challenging task, especially when dealing with a multitude of structural parameters. Traditional design methods often rely on human intuition and experience. However, such approaches are time-consuming and labor-intensive, lack theoretical guidance, and struggle to obtain optimal design results within vast design spaces, thus failing to fully exploit the inherent performance potential of robots. In this context, this paper introduces the SERL (Structure Evolution Reinforcement Learning) algorithm, which combines reinforcement learning for locomotion tasks with evolution algorithms. The aim is to identify the optimal parameter combinations within a given multidimensional design space. Through the SERL algorithm, we successfully designed a bipedal robot named Wow Orin, where the optimal leg lengths are obtained through optimization based on body structure and motor torque. We have experimentally validated the effectiveness of the SERL algorithm, which is capable of optimizing the best structure within specified design space and task conditions. Additionally, to assess the performance gap between our designed robot and the current state-of-the-art robots, we compared Wow Orin with mainstream bipedal robots Cassie and Unitree H1. A series of experimental results demonstrate the outstanding energy efficiency and performance of Wow Orin, further validating the feasibility of applying the SERL algorithm to practical design.
CBF-LLM: Safe Control for LLM Alignment
This paper proposes a control-based framework for aligning large language models (LLMs) by leveraging a control barrier function (CBF) to ensure user-desirable text generation. The presented framework applies the safety filter, designed based on the CBF, to the output generation of the baseline LLM, i.e., the token sequence, with the aim of intervening in the generated text. The overall text-generation system is implemented with Llama 3 and a RoBERTa model, and the source code is available at https://github.com/Mya-Mya/CBF-LLM. The experiment demonstrates its control ability and effectiveness in reducing the number of interventions needed for user-specified alignment tasks.
Sufficient and Necessary Barrier-like Conditions for Safety and Reach-avoid Verification of Stochastic Discrete-time Systems
In this paper, we examine sufficient and necessary barrier-like conditions for the safety verification and reach-avoid verification of stochastic discrete-time systems. Safety verification aims to certify the satisfaction of the safety property, which stipulates that the probability of the system, starting from a specified initial state, remaining within a safe set is greater than or equal to a specified lower bound. A sufficient and necessary barrier-like condition is formulated for safety verification. In contrast, reach-avoid verification extends beyond safety to include reachability, seeking to certify the satisfaction of the reach-avoid property. It requires that the probability of the system, starting from a specified initial state, reaching a target set eventually while remaining within a safe set until the first hit of the target, is greater than or equal to a specified lower bound. Two sufficient and necessary barrier-like conditions are formulated under certain assumptions. These conditions are derived via relaxing Bellman equations.
On the Existence of Linear Observed Systems on Manifolds with Connection
Linear observed systems on manifolds are a special class of nonlinear systems whose state spaces are smooth manifolds but possess properties similar to linear systems. Such properties can be characterized by the ability to conduct preintegration and exact linearization with Jacobians independent of the linearization point. IMU dynamics in navigation can be constructed into linear observed settings, leading to invariant filters with guaranteed behaviors such as local convergence and consistency. In this letter, we establish the linear observed property for dynamics evolving on an arbitrary smooth manifold through the connection structure endowed upon this space. Our key finding is that the existence of linear observed systems on manifolds poses strong constraints on the state space itself, apart from requiring the dynamics to be in some specific forms. The existence of such systems is equivalent to the flatness of the state space, forcing the manifold to admit a group structure under mild topological assumptions.
comment: 6 pages, 2 figures
Infinite-Horizon Optimal Wireless Control Over Shared State-Dependent Fading Channels for IIoT Systems
Heterogeneous systems consisting of a multiloop wireless control system (WCS) and a mobile agent system (MAS) are ubiquitous in Industrial Internet of Things systems. Within these systems, positions of mobile agents may lead to shadow fading on the wireless channel that the WCS is controlled over and can significantly compromise its performance. This paper focuses on the infinite-horizon optimal control of MAS to ensure the performance of WCS while minimizing an average cost for the heterogeneous system subject to state and input constraints. Firstly, the state-dependent fading channel is modeled, which characterizes the interference among transmission links and shows that the probability of a successful transmission for WCS depends on the state of MAS. A necessary and sufficient condition in terms of constrained set stabilization is then established to ensure the Lyapunov-like performance of WCS with expected decay rate. Secondly, using the semi-tensor product of matrices and constrained reachable sets, a criterion is presented to check the constrained set stabilization of MAS and to ensure the performance of WCS. In addition, a constrained optimal state transition graph is constructed to address state and input constraints, by resorting to which the feasibility analysis of the optimal control problem is presented. Finally, an algorithm is proposed for the construction of optimal input sequences by minimum-mean cycles for weighted graphs. An illustrative example is provided to demonstrate the effectiveness of the proposed method.
Compact Multi-Service Antenna for Sensing and Communication Using Reconfigurable Complementary Spiral Resonator
In this paper, a compact multi-service antenna (MSA) is presented for sensing and communication using a reconfigurable complementary spiral resonator. A three-turn complementary spiral resonator (3-CSR) is inserted in the ground plane of a modified patch antenna to create a miniaturized structure. Two Positive-Intrinsic-Negative (PIN) diodes (D1, D2) are also integrated with the 3-CSR to achieve frequency reconfiguration. The proposed structure operates in three different modes, i.e., dual-band joint communication and sensing antenna (JCASA), dual-band antenna, and single-band antenna. The required mode can be selected by changing the state of the PIN diodes. In mode-1, the first band (0.95-0.97 GHz) of the antenna is dedicated to sensing by using frequency domain reflectometry (FDR), while the second band (1.53-1.56 GHz) is allocated to communication. The sensing ability of the proposed structure is utilized to measure soil moisture using FDR. Based on the frequency shift, permittivity of the soil is observed to measure soil moisture. In mode-2 and mode-3, the structure operates as a standard dual-band and single-band antenna, respectively, with a maximum gain of 1.5 dBi at 1.55 GHz. The proposed planar structure, with its simple geometry and a high sensitivity of 1.7%, is a suitable candidate for precision farming. The proposed structure is versatile and capable of being utilized as a single or dual-band antenna and also measuring permittivity of materials within the range of 1-20. Hence, it is adaptable to a range of applications.
Convergence Analysis of Overparametrized LQR Formulations
Motivated by the growing use of Artificial Intelligence (AI) tools in control design, this paper takes the first steps towards bridging the gap between results from Direct Gradient methods for the Linear Quadratic Regulator (LQR), and neural networks. More specifically, it looks into the case where one wants to find a Linear Feed-Forward Neural Network (LFFNN) feedback that minimizes an LQR cost. This paper starts by computing the gradient formulas for the parameters of each layer, which are used to derive a key conservation law of the system. This conservation law is then leveraged to prove boundedness and global convergence of solutions to critical points, and invariance of the set of stabilizing networks under the training dynamics. This is followed by an analysis of the case where the LFFNN has a single hidden layer. For this case, the paper proves that the training converges not only to critical points but to the optimal feedback control law for all but a set of measure-zero of the initializations. These theoretical results are followed by an extensive analysis of a simple version of the problem (the "vector case"), proving the theoretical properties of accelerated convergence and robustness for this simpler example. Finally, the paper presents numerical evidence of faster convergence of the training of general LFFNNs when compared to traditional direct gradient methods, showing that the acceleration of the solution is observable even when the gradient is not explicitly computed but estimated from evaluations of the cost function.
Safe Barrier-Constrained Control of Uncertain Systems via Event-triggered Learning
While control barrier functions are employed to ensure safety, control synthesis methods based on them generally rely on accurate system dynamics. This is a critical limitation, since the dynamics of complex systems are often not fully known. Supervised machine learning techniques hold great promise for alleviating this weakness by inferring models from data. We propose a novel control barrier function-based framework for safe control through event-triggered learning, which switches between prioritizing control performance and improving model accuracy based on the uncertainty of the learned model. By updating a Gaussian process model with training points gathered online, the approach guarantees the feasibility of control barrier function conditions with high probability, such that safety can be ensured in a data-efficient manner. Furthermore, we establish the absence of Zeno behavior in the triggering scheme, and extend the algorithm to sampled-data realizations by accounting for inter-sampling effects. The effectiveness of the proposed approach and theory is demonstrated in simulations.
comment: The first two authors contributed equally to the work
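For readers unfamiliar with control barrier function (CBF) safety filters, the following is a minimal sketch under a strong simplifying assumption: the dynamics are known exactly, which is precisely the assumption the abstract above relaxes with a learned Gaussian process model. The function `cbf_filter`, the single-integrator system, and all numerical values are hypothetical. With one affine constraint `a + b @ u >= 0`, the usual minimum-deviation quadratic program has the closed-form solution used here.

```python
import numpy as np

def cbf_filter(u_nom, a, b):
    """Closest input to u_nom (Euclidean norm) satisfying a + b @ u >= 0.

    For dynamics x' = f(x) + g(x) u and a barrier h(x), one takes
    a = L_f h(x) + alpha * h(x) and b = L_g h(x). Assumes b is nonzero.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    b = np.asarray(b, dtype=float)
    slack = a + b @ u_nom
    if slack >= 0.0:
        return u_nom  # nominal input already satisfies the CBF condition
    # Project u_nom onto the boundary of the half-space a + b @ u >= 0.
    return u_nom - slack * b / (b @ b)

# Toy example: single integrator x' = u, safe set {x <= x_max},
# barrier h(x) = x_max - x, so L_f h = 0 and L_g h = -1.
x_max, alpha, dt = 1.0, 1.0, 0.01
x, trajectory = 0.0, []
for _ in range(1000):
    h = x_max - x
    u = cbf_filter([2.0], alpha * h, [-1.0])  # nominal input pushes toward the boundary
    x += dt * u[0]                            # forward-Euler step
    trajectory.append(x)
```

The nominal input of 2.0 would drive the state past `x_max`; the filter scales it back so that the state approaches the boundary without crossing it.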
Domain-decoupled Physics-informed Neural Networks with Closed-form Gradients for Fast Model Learning of Dynamical Systems
Physics-informed neural networks (PINNs) are trained using physical equations and can also incorporate unmodeled effects by learning from data. PINNs for control (PINCs) of dynamical systems are gaining interest due to their prediction speed compared to classical numerical integration methods for nonlinear state-space models, making them suitable for real-time control applications. We introduce the domain-decoupled physics-informed neural network (DD-PINN) to address current limitations of PINC in handling large and complex nonlinear dynamical systems. The time domain is decoupled from the feed-forward neural network to construct an Ansatz function, allowing for calculation of gradients in closed form. This approach significantly reduces training times, especially for large dynamical systems, compared to PINC, which relies on graph-based automatic differentiation. Additionally, the DD-PINN inherently fulfills the initial condition and supports higher-order excitation inputs, simplifying the training process and enabling improved prediction accuracy. Validation on three systems - a nonlinear mass-spring-damper, a five-mass-chain, and a two-link robot - demonstrates that the DD-PINN achieves significantly shorter training times. In cases where the PINC's prediction diverges, the DD-PINN's prediction remains stable and accurate due to higher physics loss reduction or use of a higher-order excitation input. The DD-PINN allows for fast and accurate learning of large dynamical systems previously out of reach for the PINC.
comment: Accepted to International Conference on Informatics in Control, Automation and Robotics (ICINCO) 2024
Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations
This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, this paper focuses on developing an efficient online search strategy which jointly estimates channels, guides UAV positioning, and optimizes resource allocation. Analytically exploiting the geometric properties of the equipotential surface, this paper develops an LOS discovery trajectory on the equipotential surface while the closed-form search directions are determined using perturbation theory. Since the explicit expression of the equipotential surface is not available, this paper proposes to locally construct a channel model for each user in the LOS regime utilizing polynomial regression without depending on user locations or propagation distance. A class of spiral trajectories to simultaneously construct the LOS channels and search on the equipotential surface is developed. An optimal radius of the spiral and an optimal measurement pattern for channel gain estimation are derived to minimize the mean squared error (MSE) of the locally constructed channel. Numerical results on real 3D city maps demonstrate that the proposed scheme achieves over 94% of the performance of a 3D exhaustive search scheme with just a 3-kilometer search.
Decentralized Online Learning for Random Inverse Problems Over Graphs
We propose a decentralized online learning algorithm for distributed random inverse problems over network graphs with online measurements, and unify the distributed parameter estimation in Hilbert spaces and the least mean square problem in reproducing kernel Hilbert spaces (RKHS-LMS). We transform the convergence of the algorithm into the asymptotic stability of a class of inhomogeneous random difference equations in Hilbert spaces with $L_{2}$-bounded martingale difference terms and develop the $L_2$-asymptotic stability theory in Hilbert spaces. We show that if the network graph is connected and the sequence of forward operators satisfies the infinite-dimensional spatio-temporal persistence of excitation condition, then the estimates of all nodes are mean square and almost surely strongly consistent. Moreover, we propose a decentralized online learning algorithm in RKHS based on non-stationary online data streams, and prove that the algorithm is mean square and almost surely strongly consistent if the operators induced by the random input data satisfy the infinite-dimensional spatio-temporal persistence of excitation condition.
Component reusability evaluation and requirement tracing for agent-based cyber-physical-simulated systems
Evaluating early design concepts is crucial as it impacts quality and cost. This process is often hindered by vague and uncertain design information. This article introduces the SysML-based Simulated-Physical Systems Modeling Language (SPSysML). It is a Domain-Specification Language used to evaluate component reusability in cyber-physical systems, incorporating digital twins and other simulated parts. The proposed factors assess the design quantitatively. SPSysML uses a requirement-based system structuring method to couple simulated and physical parts with requirements. SPSysML enables DTs to perceive exogenous actions in the simulated world. SPSysML validation is survey- and application-based. First, a robotic system for an assisted living project was developed. The integrity of simulated and physical parts of the system is improved using SPSysML-based quantitative evaluation. Thus, more system components are shared between the simulated and physical setups. The system was deployed on the physical robot and two simulators based on the Robot Operating System (ROS) or ROS2. SPSysML was used by a third-party developer and was assessed by him and other practitioners in a survey.
comment: This work has been submitted to the Elsevier Journal of Systems Architecture for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Minimax problems for ensembles of control-affine systems
In this paper, we consider ensembles of control-affine systems in $\mathbb{R}^d$, and we study simultaneous optimal control problems related to the worst-case minimization. After proving that such problems admit solutions, denoting with $(\Theta^N)_N$ a sequence of compact sets that parametrize the ensembles of systems, we first show that the corresponding minimax optimal control problems are $\Gamma$-convergent whenever $(\Theta^N)_N$ has a limit with respect to the Hausdorff distance. Besides its independent interest, the previous result plays a crucial role for establishing the Pontryagin Maximum Principle (PMP) when the ensemble is parametrized by a set $\Theta$ consisting of infinitely many points. Namely, we first approximate $\Theta$ by finite and increasing-in-size sets $(\Theta^N)_N$ for which the PMP is known, and then we derive the PMP for the $\Gamma$-limiting problem. The same strategy can be pursued in applications, where we can reduce infinite ensembles to finite ones to compute the minimizers numerically. We bring as a numerical example the Schrödinger equation for a qubit with uncertain resonance frequency.
comment: 24 pages, 1 Figure, 2 Tables. Correction of typos, minor revisions and new remarks
Optimal Dynamic Ancillary Services Provision Based on Local Power Grid Perception
In this paper, we propose a systematic closed-loop approach to provide optimal dynamic ancillary services with converter-interfaced generation systems based on local power grid perception. In particular, we structurally encode dynamic ancillary services such as fast frequency and voltage regulation in the form of a parametric transfer function matrix, which includes several parameters to define a set of different feasible response behaviors, among which we aim to find the optimal one to be realized by the converter system. Our approach is based on a so-called "perceive-and-optimize" (P&O) strategy: First, we identify a grid dynamic equivalent at the interconnection terminals of the converter system. Second, we consider the closed-loop interconnection of the identified grid equivalent and the parametric transfer function matrix, which we optimize for the set of transfer function parameters, resulting in a stable and optimal closed-loop performance for ancillary services provision. In the process, we ensure that grid-code and device-level requirements are satisfied. Finally, we demonstrate the effectiveness of our approach in different numerical case studies based on a modified Kundur two-area test system.
comment: 15 pages, 20 Figures
Dynamic Ancillary Services: From Grid Codes to Transfer Function-Based Converter Control
Conventional grid-code specifications for dynamic ancillary services provision such as fast frequency and voltage regulation are typically defined by means of piece-wise linear step-response capability curves in the time domain. However, although the specification of such time-domain curves is straightforward, their practical implementation in a converter-based generation system is not immediate, and no customary methods have been developed yet. In this paper, we thus propose a systematic approach for the practical implementation of piece-wise linear time-domain curves to provide dynamic ancillary services by converter-based generation systems, while ensuring grid-code and device-level requirements to be reliably satisfied. Namely, we translate the piece-wise linear time-domain curves for active and reactive power provision in response to a frequency and voltage step change into a desired rational parametric transfer function in the frequency domain, which defines a dynamic response behavior to be realized by the converter. The obtained transfer function can be easily implemented e.g. via a PI-based matching control in the power loop of standard converter control architectures. We demonstrate the performance of our method in numerical grid-code compliance tests, and reveal its superiority over classical droop and virtual inertia schemes which may not satisfy the grid codes due to their structural limitations.
comment: 8 pages, 11 figures
Receding-Constraint Model Predictive Control using a Learned Approximate Control-Invariant Set
In recent years, advanced model-based and data-driven control methods have been unlocking the potential of complex robotics systems, and we can expect this trend to continue at an exponential rate in the near future. However, ensuring safety with these advanced control methods remains a challenge. A well-known tool to make controllers (either Model Predictive Controllers or Reinforcement Learning policies) safe is the so-called control-invariant set (a.k.a. safe set). Unfortunately, for nonlinear systems, such a set cannot be exactly computed in general. Numerical algorithms exist for computing approximate control-invariant sets, but classic theoretic control methods break down if the set is not exact. This paper presents our recent efforts to address this issue. We present a novel Model Predictive Control scheme that can guarantee recursive feasibility and/or safety under weaker assumptions than classic methods. In particular, recursive feasibility is guaranteed by making the safe-set constraint move backward over the horizon, and assuming that such a set satisfies a condition that is weaker than control invariance. Safety is instead guaranteed under an even weaker assumption on the safe set, triggering a safe task-abortion strategy whenever a risk of constraint violation is detected. We evaluated our approach on a simulated robot manipulator, empirically demonstrating that it leads to fewer constraint violations than state-of-the-art approaches, while retaining reasonable performance in terms of tracking cost, number of completed tasks, and computation time.
comment: 7 pages, 3 figures, 3 tables, 2 pseudo-algo, conference
6G Fresnel Spot Beamfocusing using Large-Scale Metasurfaces: A Distributed DRL-Based Approach
In this paper, we introduce the concept of spot beamfocusing (SBF) in the Fresnel zone through extremely large-scale programmable metasurfaces (ELPMs) as a key enabling technology for 6G networks. A smart SBF scheme aims to adaptively concentrate the aperture's radiating power exactly at a desired focal point (DFP) in the 3D space utilizing some Machine Learning (ML) method. This offers numerous advantages for next-generation networks including efficient wireless power transfer (WPT), interference mitigation, reduced RF pollution, and improved information security. SBF necessitates ELPMs with precise channel state information (CSI) for all ELPM elements. However, obtaining exact CSI for ELPMs is not feasible in all environments; we alleviate this by proposing a novel adaptive CSI-independent ML scheme based on the TD3 deep-reinforcement-learning (DRL) method. While the proposed ML-based scheme is well-suited for relatively small-size arrays, the computational complexity is unaffordable for ELPMs. To overcome this limitation, we introduce a modular, highly scalable structure composed of multiple sub-arrays, each equipped with a TD3-DRL optimizer. This setup enables collaborative optimization of the radiated power at the DFP, significantly reducing computational complexity while enhancing learning speed. The proposed structure's benefits in terms of 3D spot-like power distribution, convergence rate, and scalability are validated through simulation results.
Wireless Channel Aware Data Augmentation Methods for Deep Learning-Based Indoor Localization
Indoor localization is a challenging problem that - unlike outdoor localization - lacks a universal and robust solution. Machine Learning (ML), particularly Deep Learning (DL), methods have been investigated as a promising approach. Although such methods bring remarkable localization accuracy, they heavily depend on the training data collected from the environment. The data collection is usually a laborious and time-consuming task, but Data Augmentation (DA) can be used to alleviate this issue. In this paper, different from previously used DA, we propose methods that utilize the domain knowledge about wireless propagation channels and devices. The methods exploit the typical hardware component drift in the transceivers and/or the statistical behavior of the channel, in combination with the measured Power Delay Profile (PDP). We comprehensively evaluate the proposed methods to demonstrate their effectiveness. This investigation mainly focuses on how factors such as the number of measurements, augmentation proportion, and the environment of interest impact the effectiveness of the different DA methods. We show that in the low-data regime (few actual measurements available), localization accuracy increases up to 50%, matching non-augmented results in the high-data regime. In addition, the proposed methods may outperform the measurement-only high-data performance by up to 33% using only 1/4 of the amount of measured data. We also exhibit the effect of different training data distribution and quality on the effectiveness of DA. Finally, we demonstrate the power of the proposed methods when employed along with Transfer Learning (TL) to address the data scarcity in target and/or source environments.
comment: 13 pages, 14 figures
Heat Death of Generative Models in Closed-Loop Learning
Improvement and adoption of generative machine learning models is rapidly accelerating, as exemplified by the popularity of LLMs (Large Language Models) for text, and diffusion models for image generation. As generative models become widespread, data they generate is incorporated into shared content through the public web. This opens the question of what happens when data generated by a model is fed back to the model in subsequent training campaigns. This is a question about the stability of the training process, whether the distribution of publicly accessible content, which we refer to as "knowledge", remains stable or collapses. Small scale empirical experiments reported in the literature show that this closed-loop training process is prone to degenerating. Models may start producing gibberish data, or sample from only a small subset of the desired data distribution (a phenomenon referred to as mode collapse). So far there has been only limited theoretical understanding of this process, in part due to the complexity of the deep networks underlying these generative models. The aim of this paper is to provide insights into this process (that we refer to as "generative closed-loop learning") by studying the learning dynamics of generative models that are fed back their own produced content in addition to their original training dataset. The sampling of many of these models can be controlled via a "temperature" parameter. Using dynamical systems tools, we show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial temperature leads the model to asymptotically degenerate. In fact, either the generative distribution collapses to a small set of outputs or becomes uniform over a large set of outputs.
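The degeneration described above can be illustrated with a minimal sketch on a categorical distribution. It makes idealized assumptions that are not taken from the paper: the model refits perfectly to an unlimited number of its own temperature-scaled samples, and no external data is mixed in. Under those assumptions each retraining round maps the distribution `p` to a normalized `p ** (1 / temperature)`. The function name `retrain_on_own_samples` and the starting distribution are hypothetical.

```python
import numpy as np

def retrain_on_own_samples(p0, temperature, steps):
    """Iterate p <- normalize(p ** (1 / temperature)) in log space.

    This is the infinite-sample limit of repeatedly sampling from a
    categorical model at the given temperature and refitting to the samples.
    """
    logits = np.log(np.asarray(p0, dtype=float))
    for _ in range(steps):
        logits = logits / temperature  # temperature-scaled sampling
        logits -= logits.max()         # keep values bounded; argmax stays at 0
    p = np.exp(logits)
    return p / p.sum()

p0 = np.array([0.4, 0.3, 0.2, 0.1])
cold = retrain_on_own_samples(p0, temperature=0.8, steps=50)  # concentrates on one output
hot = retrain_on_own_samples(p0, temperature=1.2, steps=50)   # flattens toward uniform
```

In this toy setting a temperature below 1 concentrates all mass on the initially most likely output, while a temperature above 1 flattens the distribution toward uniform, mirroring the two failure modes named in the abstract.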
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
We study sequential decision making problems aimed at maximizing the expected total reward while satisfying a constraint on the expected total utility. We employ the natural policy gradient method to solve the discounted infinite-horizon optimal control problem for Constrained Markov Decision Processes (constrained MDPs). Specifically, we propose a new Natural Policy Gradient Primal-Dual (NPG-PD) method that updates the primal variable via natural policy gradient ascent and the dual variable via projected sub-gradient descent. Although the underlying maximization involves a nonconcave objective function and a nonconvex constraint set, under the softmax policy parametrization we prove that our method achieves global convergence with sublinear rates regarding both the optimality gap and the constraint violation. Such convergence is independent of the size of the state-action space, i.e., it is dimension-free. Furthermore, for log-linear and general smooth policy parametrizations, we establish sublinear convergence rates up to a function approximation error caused by restricted policy parametrization. We also provide convergence and finite-sample complexity guarantees for two sample-based NPG-PD algorithms. Finally, we use computational experiments to showcase the merits and the effectiveness of our approach.
comment: 74 pages, 4 figures, 2 tables
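The primal-dual structure described in the abstract above (natural policy gradient ascent on the policy, projected sub-gradient descent on the multiplier) can be sketched on a drastically reduced problem. The example below is an assumption-laden illustration, not the paper's algorithm: it uses a single-state, one-step problem instead of a discounted infinite-horizon MDP, and the names `threshold`, `eta_primal`, `eta_dual`, and `lam_max` are hypothetical. Under a softmax parametrization, the natural gradient step reduces to a multiplicative-weights update on the policy.

```python
import numpy as np

def npg_pd_step(pi, lam, reward, utility, threshold,
                eta_primal=1.0, eta_dual=1.0, lam_max=10.0):
    """One primal-dual iteration for: maximize E[reward] s.t. E[utility] >= threshold.

    Single-state simplification (assumption); pi is a probability vector over actions.
    """
    lagrangian = reward + lam * utility
    advantage = lagrangian - pi @ lagrangian          # advantage of the Lagrangian
    # Primal: softmax NPG acts as a multiplicative-weights update.
    new_pi = pi * np.exp(eta_primal * advantage)
    new_pi /= new_pi.sum()
    # Dual: projected sub-gradient descent; lam grows while the constraint is violated.
    violation = threshold - pi @ utility
    new_lam = float(np.clip(lam + eta_dual * violation, 0.0, lam_max))
    return new_pi, new_lam

if __name__ == "__main__":
    pi, lam = np.array([0.5, 0.5]), 0.0
    reward, utility = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    for _ in range(5):
        pi, lam = npg_pd_step(pi, lam, reward, utility, threshold=0.8)
        print(pi, lam)
```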
Multiagent Systems
Different Facets for Different Experts: A Framework for Streamlining The Integration of Qualitative Insights into ABM Development
A key problem in agent-based simulation is that integrating qualitative insights from multiple discipline experts is extremely hard. In most simulations, agent capabilities and corresponding behaviour need to be programmed into the agent. We report on the architecture of a tool that disconnects the programmed functions of the agent from the acquisition of capability and displayed behaviour. This allows multiple different domain experts to represent qualitative insights without the need for code to be changed. It also allows a continuous integration (or even change) of qualitative behaviour processes as more insights are gained. The consequent behaviour observed in the model is both more faithful to the expert's insight and able to be contrasted against other models representing other insights.
comment: 14 pages, 6 figures, accepted at the 19th Social Simulation Conference 2024
TrafficGamer: Reliable and Flexible Traffic Simulation for Safety-Critical Scenarios with Game-Theoretic Oracles
While modern Autonomous Vehicle (AV) systems can develop reliable driving policies under regular traffic conditions, they frequently struggle with safety-critical traffic scenarios. This difficulty primarily arises from the rarity of such scenarios in driving datasets and the complexities associated with predictive modeling among multiple vehicles. To support the testing and refinement of AV policies, simulating safety-critical traffic events is an essential challenge to be addressed. In this work, we introduce TrafficGamer, which facilitates game-theoretic traffic simulation by viewing common road driving as a multi-agent game. In evaluating the empirical performance across various real-world datasets, TrafficGamer ensures both fidelity and exploitability of the simulated scenarios, guaranteeing that they not only statistically align with real-world traffic distribution but also efficiently capture equilibriums for representing safety-critical scenarios involving multiple agents. Additionally, the results demonstrate that TrafficGamer exhibits highly flexible simulation across various contexts. Specifically, we demonstrate that the generated scenarios can dynamically adapt to equilibriums of varying tightness by configuring risk-sensitive constraints during optimization. To the best of our knowledge, TrafficGamer is the first simulator capable of generating diverse traffic scenarios involving multiple agents. We have provided a demo webpage for the project at https://qiaoguanren.github.io/trafficgamer-demo/.
Improving the Prediction of Individual Engagement in Recommendations Using Cognitive Models
For public health programs with limited resources, the ability to predict how behaviors change over time and in response to interventions is crucial for deciding when and to whom interventions should be allocated. Using data from a real-world maternal health program, we demonstrate how a cognitive model based on Instance-Based Learning (IBL) Theory can augment existing purely computational approaches. Our findings show that, compared to general time-series forecasters (e.g., LSTMs), IBL models, which reflect human decision-making processes, better predict the dynamics of individuals' states. Additionally, IBL provides estimates of the volatility in individuals' states and their sensitivity to interventions, which can improve the efficiency of training of other time series models.
Component reusability evaluation and requirement tracing for agent-based cyber-physical-simulated systems
Evaluating early design concepts is crucial as it impacts quality and cost. This process is often hindered by vague and uncertain design information. This article introduces the SysML-based Simulated-Physical Systems Modeling Language (SPSysML). It is a Domain-Specific Language used to evaluate component reusability in cyber-physical systems, incorporating digital twins (DTs) and other simulated parts. The proposed factors assess the design quantitatively. SPSysML uses a requirement-based system structuring method to couple simulated and physical parts with requirements. SPSysML enables DTs to perceive exogenous actions in the simulated world. SPSysML validation is survey- and application-based. First, a robotic system for an assisted living project was developed. The integrity of simulated and physical parts of the system is improved using SPSysML-based quantitative evaluation. Thus, more system components are shared between the simulated and physical setups. The system was deployed on the physical robot and two simulators based on the Robot Operating System (ROS) or ROS2. SPSysML was used by a third-party developer and was assessed by him and other practitioners in a survey.
comment: This work has been submitted to the Elsevier Journal of Systems Architecture for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Robotics
SpecGuard: Specification Aware Recovery for Robotic Autonomous Vehicles from Physical Attacks CCS'24
Robotic Autonomous Vehicles (RAVs) rely on their sensors for perception, and follow strict mission specifications (e.g., altitude, speed, and geofence constraints) for safe and timely operations. Physical attacks can corrupt the RAVs' sensors, resulting in mission failures. Recovering RAVs from such attacks demands robust control techniques that maintain compliance with mission specifications even under attacks to ensure the RAV's safety and timely operations. We propose SpecGuard, a technique that complies with mission specifications and performs safe recovery of RAVs. There are two innovations in SpecGuard. First, it introduces an approach to incorporate mission specifications and learn a recovery control policy using Deep Reinforcement Learning (Deep-RL). We design a compliance-based reward structure that reflects the RAV's complex dynamics and enables SpecGuard to satisfy multiple mission specifications simultaneously. Second, SpecGuard incorporates state reconstruction, a technique that minimizes attack-induced sensor perturbations. This reconstruction enables effective adversarial training and optimization of the recovery control policy for robustness under attacks. We evaluate SpecGuard in both virtual and real RAVs, and find that it achieves a 92% recovery success rate under attacks on different sensors, without any crashes or stalls. SpecGuard achieves 2X higher recovery success than prior work, and incurs about 15% performance overhead on real RAVs.
comment: CCS'24 (a shorter version of this paper will appear in the conference proceeding)
Evaluation of Local Planner-Based Stanley Control in Autonomous RC Car Racing Series
This paper proposes a control technique for autonomous RC car racing. The presented method does not require any map-building phase beforehand since it performs only local path planning on the actual LiDAR point cloud. Racing control algorithms must be tunable to the actual track layout to minimize lap time. In the examined method, this is achieved by extending the Stanley controller with additive control components that stabilize the movement in both low- and high-speed ranges, and by integrating an adaptive lookahead point that induces sharp and dynamic cornering to reduce the traveled distance. The developed method is tested on a 1/10-sized RC car, and the tuning procedure from a base solution to the optimal setting in a real F1Tenth race is presented. Furthermore, the proposed method is compared with a simpler reactive method and with a more complex optimization-based technique that involves offline map building and global optimal trajectory calculation. In terms of lap time, the proposed method has only an 8% lower average speed than the latter. This demonstrates that with appropriate tuning, a local planning-based method can be comparable with a more complex optimization-based one. Thus, the performance gap is lower than 10% from the state-of-the-art method. Moreover, the proposed technique has significantly higher similarity to real scenarios, therefore the results can be interesting in the context of the automotive industry.
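For orientation, the baseline Stanley steering law that the abstract above builds on can be sketched as follows. This is only the textbook form (heading error plus a speed-scaled cross-track term); the additive stabilizing components and the adaptive lookahead point mentioned in the abstract are not reproduced. The parameter names `gain`, `softening`, and `max_steer`, the default values, and the sign conventions are assumptions.

```python
import math

def stanley_steering(heading_error, cross_track_error, speed,
                     gain=1.0, softening=1e-3, max_steer=math.radians(30)):
    """Textbook Stanley law: delta = psi_e + atan(k * e / (v + eps)), then saturated.

    heading_error      -- path heading minus vehicle heading [rad]
    cross_track_error  -- signed lateral offset of the front axle from the path [m]
    speed              -- longitudinal speed [m/s]
    softening          -- small constant that avoids division by zero at standstill
    """
    steer = heading_error + math.atan2(gain * cross_track_error, speed + softening)
    return max(-max_steer, min(max_steer, steer))

if __name__ == "__main__":
    # The same lateral offset produces a gentler correction at higher speed.
    print(stanley_steering(0.0, 0.2, speed=1.0))
    print(stanley_steering(0.0, 0.2, speed=5.0))
```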
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery
What data or environments to use for training to improve downstream performance is a longstanding and very topical question in reinforcement learning. In particular, Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula enable agents to be robust to in- and out-of-distribution tasks. We ask to what extent these methods are themselves robust when applied to a novel setting, closely inspired by a real-world robotics problem. Surprisingly, we find that the state-of-the-art UED methods either do not improve upon the naïve baseline of Domain Randomisation (DR), or require substantial hyperparameter tuning to do so. Our analysis shows that this is due to their underlying scoring functions failing to predict intuitive measures of "learnability", i.e., in finding the settings that the agent sometimes solves, but not always. Based on this, we instead directly train on levels with high learnability and find that this simple and intuitive approach outperforms UED methods and DR in several binary-outcome environments, including on our domain and the standard UED domain of Minigrid. We further introduce a new adversarial evaluation procedure for directly measuring robustness, closely mirroring the conditional value at risk (CVaR). We open-source all our code and present visualisations of final policies here: https://github.com/amacrutherford/sampling-for-learnability.
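The notion of "learnability" as levels the agent sometimes solves, but not always, can be made concrete for binary-outcome environments. The sketch below assumes the score p(1 - p), where p is the empirical success rate on a level; this is one natural choice that peaks at p = 0.5 and vanishes for levels that are always or never solved. The function names and the top-k selection rule are illustrative assumptions, not taken from the released code.

```python
import numpy as np

def learnability(success_rates):
    """Score each level by p * (1 - p); zero when always or never solved."""
    p = np.asarray(success_rates, dtype=float)
    return p * (1.0 - p)

def select_levels(success_rates, k):
    """Return indices of the k levels with the highest learnability score."""
    scores = learnability(success_rates)
    return np.argsort(-scores, kind="stable")[:k]

if __name__ == "__main__":
    rates = [0.0, 0.5, 1.0, 0.9, 0.4]   # per-level success rates from rollouts
    print(learnability(rates))
    print(select_levels(rates, k=2))
```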
Scalable Supervisory Architecture for Autonomous Race Cars
In recent years, the number and importance of autonomous racing leagues, and consequently the number of studies on them, has been growing. The seamless integration between different series has gained attention due to the scene's diversity. However, the high cost of full-scale racing makes it more accessible to conduct research at smaller form factors and then scale up the achieved results. This paper presents a scalable architecture designed for autonomous racing that emphasizes modularity, adaptability to diverse configurations, and the ability to supervise parallel execution of pipelines that allows the use of different dynamic strategies. The system showcased consistent racing performance across different environments, demonstrated through successful participation in two relevant competitions. The results confirm the architecture's scalability and versatility, providing a robust foundation for the development of competitive autonomous racing systems. The successful application in real-world scenarios validates its practical effectiveness and highlights its potential for future advancements in autonomous racing technology.
Distributed Planning for Rigid Robot Formations with Probabilistic Collision Avoidance
This paper presents a distributed method for robots moving in rigid formations while ensuring probabilistic collision avoidance between the robots. The formation is parametrised through the transformation of a base configuration. The robots map their desired velocities into a corresponding desired change in the formation parameters and apply a consensus step to reach agreement on the desired formation and a constraint satisfaction step to ensure collision avoidance within the formation. The constraint set is found such that the probability of collision remains below an upper bound. The method was demonstrated in a manual teleoperation scenario both in simulation and a real-world experiment.
AEROBULL: A Center-of-Mass Displacing Aerial Vehicle Enabling Efficient High-Force Interaction
In various industrial sectors, inspection and maintenance tasks using UAVs (Unmanned Aerial Vehicles) require substantial force application to ensure effective adherence and stable contact, posing significant challenges to existing solutions. This paper addresses these industrial needs by introducing a novel lightweight aerial platform (3.12 kg) designed to exert high pushing forces on non-horizontal surfaces. To increase maneuverability, the proposed platform incorporates tiltable rotors with 5-DoF (Degree of Freedom) actuation. Moreover, it has an innovative shifting-mass mechanism that dynamically adjusts the system's CoM (Center of Mass) during contact-based task execution. A compliant EE (End-Effector) is applied to ensure a smooth interaction with the work surface. We provide a detailed study of the UAV's overall system design, hardware integration of the developed physical prototype, and software architecture of the proposed control algorithm. Physical experiments were conducted to validate the control design and explore the force generation capability of the designed platform via a pushing task. With a total mass of 3.12 kg, the UAV exerted a maximum pushing force of above 28 N, almost equal to its gravity force. Furthermore, the experiments illustrated the benefits of a displaced CoM by benchmarking against a fixed-CoM configuration.
Depth Restoration of Hand-Held Transparent Objects for Human-to-Robot Handover
Transparent objects are common in daily life, while their unique optical properties pose challenges for RGB-D cameras, which struggle to capture accurate depth information. For assistant robots, accurately perceiving transparent objects held by humans is essential for effective human-robot interaction. This paper presents a Hand-Aware Depth Restoration (HADR) method for hand-held transparent objects based on creating an implicit neural representation function from a single RGB-D image. The proposed method introduces the hand posture as an important guidance to leverage semantic and geometric information. To train and evaluate the proposed method, we create a high-fidelity synthetic dataset called TransHand-14K with a real-to-sim data generation scheme. Experiments show that our method has a better performance and generalization ability compared with existing methods. We further develop a real-world human-to-robot handover system based on the proposed depth restoration method, demonstrating its application value in human-robot interaction.
comment: 7 pages, 7 figures, conference
Domain-decoupled Physics-informed Neural Networks with Closed-form Gradients for Fast Model Learning of Dynamical Systems
Physics-informed neural networks (PINNs) are trained using physical equations and can also incorporate unmodeled effects by learning from data. PINNs for control (PINCs) of dynamical systems are gaining interest due to their prediction speed compared to classical numerical integration methods for nonlinear state-space models, making them suitable for real-time control applications. We introduce the domain-decoupled physics-informed neural network (DD-PINN) to address current limitations of PINC in handling large and complex nonlinear dynamic systems. The time domain is decoupled from the feed-forward neural network to construct an Ansatz function, allowing for calculation of gradients in closed form. This approach significantly reduces training times, especially for large dynamical systems, compared to PINC, which relies on graph-based automatic differentiation. Additionally, the DD-PINN inherently fulfills the initial condition and supports higher-order excitation inputs, simplifying the training process and enabling improved prediction accuracy. Validation on three systems - a nonlinear mass-spring-damper, a five-mass-chain, and a two-link robot - demonstrates that the DD-PINN achieves significantly shorter training times. In cases where the PINC's prediction diverges, the DD-PINN's prediction remains stable and accurate due to higher physics loss reduction or use of a higher-order excitation input. The DD-PINN allows for fast and accurate learning of large dynamical systems previously out of reach for the PINC.
comment: Accepted to International Conference on Informatics in Control, Automation and Robotics (ICINCO) 2024
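The idea of decoupling time from the network, so that the initial condition holds by construction and the time derivative is available in closed form, can be sketched with a simple Ansatz. The sinusoidal basis below is an assumption chosen for illustration; the coefficient arrays `alpha`, `beta`, and `gamma` stand in for what a feed-forward network would output from the initial state and input, and the state is scalar for brevity.

```python
import numpy as np

def ansatz(x0, alpha, beta, gamma, t):
    """x_hat(t) = x0 + sum_i alpha_i * (sin(beta_i * t + gamma_i) - sin(gamma_i)).

    At t = 0 every term cancels, so x_hat(0) = x0 holds by construction.
    """
    return x0 + np.sum(alpha * (np.sin(beta * t + gamma) - np.sin(gamma)))

def ansatz_dot(alpha, beta, gamma, t):
    """Closed-form time derivative; no automatic differentiation needed."""
    return np.sum(alpha * beta * np.cos(beta * t + gamma))

if __name__ == "__main__":
    # Hypothetical coefficients standing in for network outputs.
    alpha = np.array([0.5, -0.2])
    beta = np.array([1.0, 3.0])
    gamma = np.array([0.1, 0.7])
    print(ansatz(2.0, alpha, beta, gamma, 0.0))   # equals the initial state
    print(ansatz_dot(alpha, beta, gamma, 0.3))    # usable directly in a physics residual
```

A physics loss would then compare `ansatz_dot` against the system dynamics evaluated at `ansatz`, without differentiating through a computational graph in time.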
Three-Dimensional Vehicle Dynamics State Estimation for High-Speed Race Cars under varying Signal Quality IROS 2024
This work aims to present a three-dimensional vehicle dynamics state estimation under varying signal quality. Few researchers have investigated the impact of three-dimensional road geometries on the state estimation and, thus, neglect road inclination and banking. Especially considering high velocities and accelerations, the literature does not address these effects. Therefore, we compare two- and three-dimensional state estimation schemes to outline the impact of road geometries. We use an Extended Kalman Filter with a point-mass motion model and extend it by an additional formulation of reference angles. Furthermore, virtual velocity measurements significantly improve the estimation of road angles and the vehicle's side slip angle. We highlight the importance of steady estimations for vehicle motion control algorithms and demonstrate the challenges of degraded signal quality and Global Navigation Satellite System dropouts. The proposed adaptive covariance facilitates a smooth estimation and enables stable controller behavior. The developed state estimation has been deployed on a high-speed autonomous race car at various racetracks. Our findings indicate that our approach outperforms state-of-the-art vehicle dynamics state estimators and an industry-grade Inertial Navigation System. Further studies are needed to investigate the performance under varying track conditions and on other vehicle types.
comment: This paper has been accepted at IROS 2024
Robo-GS: A Physics Consistent Spatial-Temporal Model for Robotic Arm with Hybrid Representation
Real2Sim2Real plays a critical role in robotic arm control and reinforcement learning, yet bridging this gap remains a significant challenge due to the complex physical properties of robots and the objects they manipulate. Existing methods lack a comprehensive solution to accurately reconstruct real-world objects with spatial representations and their associated physics attributes. We propose a Real2Sim pipeline with a hybrid representation model that integrates mesh geometry, 3D Gaussian kernels, and physics attributes to enhance the digital asset representation of robotic arms. This hybrid representation is implemented through a Gaussian-Mesh-Pixel binding technique, which establishes an isomorphic mapping between mesh vertices and Gaussian models. This enables a fully differentiable rendering pipeline that can be optimized through numerical solvers, achieves high-fidelity rendering via Gaussian Splatting, and facilitates physically plausible simulation of the robotic arm's interaction with its environment using mesh-based methods. The code, full presentation, and datasets will be made publicly available at our website https://robostudioapp.com
Optimizing Structured Data Processing through Robotic Process Automation
Robotic Process Automation (RPA) has emerged as a game-changing technology in data extraction, revolutionizing the way organizations process and analyze large volumes of documents such as invoices, purchase orders, and payment advices. This study investigates the use of RPA for structured data extraction and evaluates its advantages over manual processes. By comparing human-performed tasks with those executed by RPA software bots, we assess efficiency and accuracy in data extraction from invoices, focusing on the effectiveness of the RPA system. Through four distinct scenarios involving varying numbers of invoices, we measure efficiency in terms of time and effort required for task completion, as well as accuracy by comparing error rates between manual and RPA processes. Our findings highlight the significant efficiency gains achieved by RPA, with bots completing tasks in significantly less time compared to manual efforts across all cases. Moreover, the RPA system consistently achieves perfect accuracy, mitigating the risk of errors and enhancing process reliability. These results underscore the transformative potential of RPA in optimizing operational efficiency, reducing human labor costs, and improving overall business performance.
comment: This manuscript has been accepted for publication in the journal Revue d'Intelligence Artificielle
Points2Plans: From Point Clouds to Long-Horizon Plans with Composable Relational Dynamics
We present Points2Plans, a framework for composable planning with a relational dynamics model that enables robots to solve long-horizon manipulation tasks from partial-view point clouds. Given a language instruction and a point cloud of the scene, our framework initiates a hierarchical planning procedure, whereby a language model generates a high-level plan and a sampling-based planner produces constraint-satisfying continuous parameters for manipulation primitives sequenced according to the high-level plan. Key to our approach is the use of a relational dynamics model as a unifying interface between the continuous and symbolic representations of states and actions, thus facilitating language-driven planning from high-dimensional perceptual input such as point clouds. Whereas previous relational dynamics models require training on datasets of multi-step manipulation scenarios that align with the intended test scenarios, Points2Plans uses only single-step simulated training data while generalizing zero-shot to a variable number of steps during real-world evaluations. We evaluate our approach on tasks involving geometric reasoning, multi-object interactions, and occluded object reasoning in both simulated and real-world settings. Results demonstrate that Points2Plans offers strong generalization to unseen long-horizon tasks in the real world, where it solves over 85% of evaluated tasks while the next best baseline solves only 50%. Qualitative demonstrations of our approach operating on a mobile manipulator platform are made available at sites.google.com/stanford.edu/points2plans.
comment: Under review
Benchmarking Reinforcement Learning Methods for Dexterous Robotic Manipulation with a Three-Fingered Gripper
Reinforcement Learning (RL) training is predominantly conducted in cost-effective and controlled simulation environments. However, the transfer of these trained models to real-world tasks often presents unavoidable challenges. This research explores the direct training of RL algorithms in controlled yet realistic real-world settings for the execution of dexterous manipulation. The benchmarking results of three RL algorithms trained on intricate in-hand manipulation tasks within practical real-world contexts are presented. Our study not only demonstrates the practicality of RL training in authentic real-world scenarios, facilitating direct real-world applications, but also provides insights into the associated challenges and considerations. Additionally, our experiences with the employed experimental methods are shared, with the aim of empowering and engaging fellow researchers and practitioners in this dynamic field of robotics.
Active Semantic Mapping and Pose Graph Spectral Analysis for Robot Exploration
Exploration in unknown and unstructured environments is a pivotal requirement for robotic applications. A robot's exploration behavior can be inherently affected by the performance of its Simultaneous Localization and Mapping (SLAM) subsystem, although SLAM and exploration are generally studied separately. In this paper, we formulate exploration as an active mapping problem and extend it with semantic information. We introduce a novel active metric-semantic SLAM approach, leveraging recent research advances in information theory and spectral graph theory: we combine semantic mutual information and the connectivity metrics of the underlying pose graph of the SLAM subsystem. We use the resulting utility function to evaluate different trajectories to select the most favorable strategy during exploration. Exploration and SLAM metrics are analyzed in experiments. Running our algorithm on the Habitat dataset, we show that, while maintaining efficiency close to the state-of-the-art exploration methods, our approach effectively increases the performance of metric-semantic SLAM with a 21% reduction in average map error and a 9% improvement in average semantic classification accuracy.
comment: 8 pages, 5 figures
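One common pose-graph connectivity metric from spectral graph theory is the algebraic connectivity, i.e., the second-smallest eigenvalue of the graph Laplacian. The sketch below assumes an unweighted pose graph and uses this metric purely as an example of how a candidate trajectory that adds a loop closure could be scored; the actual utility function in the paper, and how it is combined with semantic mutual information, is not reproduced here.

```python
import numpy as np

def algebraic_connectivity(num_poses, edges):
    """Second-smallest eigenvalue of the unweighted pose-graph Laplacian."""
    laplacian = np.zeros((num_poses, num_poses))
    for i, j in edges:
        laplacian[i, i] += 1.0
        laplacian[j, j] += 1.0
        laplacian[i, j] -= 1.0
        laplacian[j, i] -= 1.0
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(laplacian)[1])

if __name__ == "__main__":
    odometry_only = [(0, 1), (1, 2)]
    with_loop_closure = odometry_only + [(0, 2)]
    print(algebraic_connectivity(3, odometry_only))
    print(algebraic_connectivity(3, with_loop_closure))  # higher: better-connected graph
```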
Pathfinding with Lazy Successor Generation
We study a pathfinding problem where only locations (i.e., vertices) are given, and edges are implicitly defined by an oracle answering the connectivity of two locations. Despite its simple structure, this problem becomes non-trivial with a massive number of locations, due to posing a huge branching factor for search algorithms. Limiting the number of successors, such as with nearest neighbors, can reduce search efforts but compromises completeness. Instead, we propose a novel LaCAS* algorithm, which does not generate successors all at once but gradually generates successors as the search progresses. This scheme is implemented with k-nearest neighbors search on a k-d tree. LaCAS* is a complete and anytime algorithm that eventually converges to the optimum. Extensive evaluations demonstrate the efficacy of LaCAS*, e.g., solving complex pathfinding instances quickly, where conventional methods falter.
comment: 14 pages
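The idea of generating successors gradually rather than all at once can be sketched with a small depth-first search. This is not the authors' LaCAS*: it has no anytime refinement or optimality guarantee, and a brute-force distance sort stands in for the k-nearest-neighbors query on a k-d tree. The names `lazy_neighbors`, `lazy_dfs`, `connected`, and `batch` are hypothetical.

```python
import math

def lazy_neighbors(points, idx, batch):
    """Yield candidate successors of points[idx] in batches, nearest first.

    Brute-force sorting is an assumption standing in for a k-d tree query.
    """
    order = sorted((j for j in range(len(points)) if j != idx),
                   key=lambda j: math.dist(points[idx], points[j]))
    for i in range(0, len(order), batch):
        yield order[i:i + batch]

def lazy_dfs(points, start, goal, connected, batch=2):
    """Depth-first search that asks each node for only `batch` more successors
    per visit. The connectivity oracle is called only on generated candidates.
    Every candidate is eventually generated, so a reachable goal is found."""
    parent = {start: None}
    generators = {start: lazy_neighbors(points, start, batch)}
    stack = [start]
    while stack:
        node = stack[-1]
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        chunk = next(generators[node], None)
        if chunk is None:          # all successors of this node were generated
            stack.pop()
            continue
        for nxt in chunk:
            if nxt not in parent and connected(points[node], points[nxt]):
                parent[nxt] = node
                generators[nxt] = lazy_neighbors(points, nxt, batch)
                stack.append(nxt)
    return None

if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (3, 0)]
    oracle = lambda a, b: math.dist(a, b) <= 1.5   # hypothetical connectivity oracle
    print(lazy_dfs(pts, 0, 3, oracle))
```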
Fast and Modular Autonomy Software for Autonomous Racing Vehicles
Autonomous motorsports aim to replicate the human racecar driver with software and sensors. As in traditional motorsports, Autonomous Racing Vehicles (ARVs) are pushed to their handling limits in multi-agent scenarios at extremely high ($\geq 150$ mph) speeds. This Operational Design Domain (ODD) presents unique challenges across the autonomy stack. The Indy Autonomous Challenge (IAC) is an international competition aiming to advance autonomous vehicle development through ARV competitions. While far from challenging what a human racecar driver can do, the IAC is pushing the state of the art by facilitating full-sized ARV competitions. This paper details the MIT-Pitt-RW Team's approach to autonomous racing in the IAC. In this work, we present our modular and fast approach to agent detection, motion planning and controls to create an autonomy stack. We also provide analysis of the performance of the software stack in single and multi-agent scenarios for rapid deployment in a fast-paced competition environment. We also cover what did and did not work when deployed on a physical system, the Dallara AV-21 platform, and potential improvements to address these shortcomings. Finally, we convey lessons learned and discuss limitations and future directions for improvement.
comment: Published in Journal of Field Robotics
Panoptic Perception for Autonomous Driving: A Survey
Panoptic perception represents a forefront advancement in autonomous driving technology, unifying multiple perception tasks into a singular, cohesive framework to facilitate a thorough understanding of the vehicle's surroundings. This survey reviews typical panoptic perception models for their unique inputs and architectures and compares them in terms of performance, responsiveness, and resource utilization. It also delves into the prevailing challenges faced in panoptic perception and explores potential trajectories for future research. Our goal is to furnish researchers in autonomous driving with a detailed synopsis of panoptic perception, positioning this survey as a pivotal reference in the ever-evolving landscape of autonomous driving technologies.
How Much is too Much: Exploring the Effect of Verbal Route Description Length on Indoor Navigation
Navigating through a new indoor environment can be stressful. Recently, many places have deployed robots to assist visitors. One of the features of such robots is escorting the visitors to their desired destination within the environment, but this is neither scalable nor necessary for every visitor. Instead, a robot assistant could be deployed at a strategic location to provide wayfinding instructions. This not only improves the user experience but can be helpful in many time-critical scenarios, e.g., escorting someone to their boarding gate at an airport. However, delivering route descriptions verbally poses a challenge. If the description is too verbose, people may struggle to recall all the information, while overly brief descriptions may be simply unhelpful. This article focuses on studying the optimal length of verbal route descriptions that are effective for reaching the destination and easy for people to recall. This work proposes a theoretical framework that links route segments to chunks in working memory. Based on this framework, an experiment is designed and conducted to examine the effects of route descriptions of different lengths on navigational performance. The results revealed intriguing patterns suggesting an ideal length of four route segments. This study lays a foundation for future research exploring the relationship between route description lengths, working memory capacity, and navigational performance in indoor environments.
comment: Accepted in IEEE ROMAN 2024
This is the Way: Mitigating the Roll of an Autonomous Uncrewed Surface Vessel in Wavy Conditions Using Model Predictive Control IROS
Though larger vessels may be well-equipped to deal with wavy conditions, smaller vessels are often more susceptible to disturbances. This paper explores the development of a nonlinear model predictive control (NMPC) system for Uncrewed Surface Vessels (USVs) in wavy conditions to minimize average roll. The NMPC is based on a prediction method that uses information about the vessel's dynamics and an assumed wave model. This method is able to mitigate the roll of an under-actuated USV in a variety of conditions by adjusting the weights of the cost function. The results show a reduction of 39% of average roll with a tuned controller in conditions with 1.75-metre sinusoidal waves. A general and intuitive tuning strategy is established. This preliminary work is a proof of concept which sets the stage for the leveraging of wave prediction methodologies to perform planning and control in real time for USVs in real-world scenarios and field trials.
comment: 6 pages, 10 figures. To appear in Proceedings of the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2024
PIE: Parkour with Implicit-Explicit Learning Framework for Legged Robots
Parkour presents a highly challenging task for legged robots, requiring them to traverse various terrains with agile and smooth locomotion. This necessitates comprehensive understanding of both the robot's own state and the surrounding terrain, despite the inherent unreliability of robot perception and actuation. Current state-of-the-art methods either rely on complex pre-trained high-level terrain reconstruction modules or limit the maximum potential of robot parkour to avoid failure due to inaccurate perception. In this paper, we propose a one-stage end-to-end learning-based parkour framework: Parkour with Implicit-Explicit learning framework for legged robots (PIE) that leverages dual-level implicit-explicit estimation. With this mechanism, even a low-cost quadruped robot equipped with an unreliable egocentric depth camera can achieve exceptional performance on challenging parkour terrains using a relatively simple training process and reward function. While the training process is conducted entirely in simulation, our real-world validation demonstrates successful zero-shot deployment of our framework, showcasing superior parkour performance on harsh terrains.
comment: Accepted for IEEE Robotics and Automation Letters (RA-L)
C3DM: Constrained-Context Conditional Diffusion Models for Imitation Learning
Behavior Cloning (BC) methods are effective at learning complex manipulation tasks. However, they are prone to spurious correlation - expressive models may focus on distractors that are irrelevant to action prediction - and are thus fragile in real-world deployment. Prior methods have addressed this challenge by exploring different model architectures and action representations. However, none were able to balance sample efficiency and robustness against distractors for solving manipulation tasks with a complex action space. We present the Constrained-Context Conditional Diffusion Model (C3DM), a diffusion model policy for solving 6-DoF robotic manipulation tasks with robustness to distractions that can learn deployable robot policies from as few as five demonstrations. A key component of C3DM is a fixation step that helps the action denoiser to focus on task-relevant regions around a predicted fixation point while ignoring distractors in the context. We empirically show that C3DM is robust to out-of-distribution distractors, and consistently achieves high success rates on a wide array of tasks, ranging from table-top manipulation to industrial kitting, that require varying levels of precision and robustness to distractors.
Brain Inspired Probabilistic Occupancy Grid Mapping with Hyperdimensional Computing
Real-time robotic systems require advanced perception, computation, and action capability. However, the main bottleneck in current autonomous systems is the trade-off between computational capability, energy efficiency and model determinism. World modeling, a key objective of many robotic systems, commonly uses occupancy grid mapping (OGM) as the first step towards building an end-to-end robotic system with perception, planning, autonomous maneuvering, and decision making capabilities. OGM divides the environment into discrete cells and assigns probability values to attributes such as occupancy and traversability. Existing methods fall into two categories: traditional methods and neural methods. Traditional methods rely on dense statistical calculations, while neural methods employ deep learning for probabilistic information processing. Recent works formulate a deterministic theory of neural computation at the intersection of cognitive science and vector symbolic architectures. In this study, we propose a Fourier-based hyperdimensional OGM system, VSA-OGM, combined with a novel application of Shannon entropy that retains the interpretability and stability of traditional methods along with the improved computational efficiency of neural methods. Our approach, validated across multiple datasets, achieves similar accuracy to covariant traditional methods while approximately reducing latency by 200x and memory by 1000x. Compared to invariant traditional methods, we see similar accuracy values while reducing latency by 3.7x. Moreover, we achieve 1.5x latency reductions compared to neural methods while eliminating the need for domain-specific model training.
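Illustrative note: the abstract above describes OGM as assigning probability values to discrete cells. The sketch below shows the classical Bayesian log-odds update that traditional OGM methods commonly use. It is not the VSA-OGM method itself; the 1-D grid, the sensor probabilities (0.7 and 0.3), and all function names are assumptions chosen for illustration.

```python
import math


def logit(p: float) -> float:
    """Convert a probability into log-odds."""
    return math.log(p / (1.0 - p))


def update_cell(log_odds: float, p_observed: float) -> float:
    """Classical log-odds update assuming a uniform (0.5) prior."""
    return log_odds + logit(p_observed)


def probability(log_odds: float) -> float:
    """Convert log-odds back into an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))


# Hypothetical 1-D grid of 5 cells, all initially unknown (log-odds 0 means p = 0.5).
grid = [0.0] * 5

# Hypothetical observations: (cell index, inverse sensor model probability).
# Cell 3 is seen as occupied twice; cell 1 is seen as free once.
observations = [(3, 0.7), (3, 0.7), (1, 0.3)]
for cell, p in observations:
    grid[cell] = update_cell(grid[cell], p)

probs = [probability(l) for l in grid]
print([round(p, 3) for p in probs])
```

Working in log-odds turns the per-cell Bayesian update into a single addition, which is the kind of dense per-cell calculation the abstract attributes to traditional methods.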
Continual Domain Randomization IROS 2024
Domain Randomization (DR) is commonly used for sim2real transfer of reinforcement learning (RL) policies in robotics. Most DR approaches require a simulator with a fixed set of tunable parameters from the start of the training, from which the parameters are randomized simultaneously to train a robust model for use in the real world. However, the combined randomization of many parameters increases the task difficulty and might result in sub-optimal policies. To address this problem and to provide a more flexible training process, we propose Continual Domain Randomization (CDR) for RL that combines domain randomization with continual learning to enable sequential training in simulation on a subset of randomization parameters at a time. Starting from a model trained in a non-randomized simulation where the task is easier to solve, the model is trained on a sequence of randomizations, and continual learning is employed to remember the effects of previous randomizations. Our robotic reaching and grasping tasks experiments show that the model trained in this fashion learns effectively in simulation and performs robustly on the real robot while matching or outperforming baselines that employ combined randomization or sequential randomization without continual learning. Our code and videos are available at https://continual-dr.github.io/.
comment: Accepted at IROS 2024. Equal contribution from first two authors
Skill Q-Network: Learning Adaptive Skill Ensemble for Mapless Navigation in Unknown Environments IROS
This paper focuses on the acquisition of mapless navigation skills within unknown environments. We introduce the Skill Q-Network (SQN), a novel reinforcement learning method featuring an adaptive skill ensemble mechanism. Unlike existing methods, our model concurrently learns a high-level skill decision process alongside multiple low-level navigation skills, all without the need for prior knowledge. Leveraging a tailored reward function for mapless navigation, the SQN is capable of learning adaptive maneuvers that incorporate both exploration and goal-directed skills, enabling effective navigation in new environments. Our experiments demonstrate that our SQN can effectively navigate complex environments, exhibiting 40% higher performance compared to baseline models. Without explicit guidance, SQN discovers how to combine low-level skill policies, showcasing both goal-directed navigation to reach destinations and exploration maneuvers to escape from local minimum regions in challenging scenarios. Remarkably, our adaptive skill ensemble method enables zero-shot transfer to out-of-distribution domains, characterized by unseen observations from non-convex obstacles or uneven, subterranean-like environments. The project page is available at https://sites.google.com/view/skill-q-net.
comment: 8 pages, 8 figures, accepted at the International Conference on Intelligent Robots and Systems (IROS) 2024
A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents IJCAI 2024
The burgeoning fields of robot learning and embodied AI have triggered an increasing demand for large quantities of data. However, collecting sufficient unbiased data from the target domain remains a challenge due to costly data collection processes and stringent safety requirements. Consequently, researchers often resort to data from easily accessible source domains, such as simulation and laboratory environments, for cost-effective data acquisition and rapid model iteration. Nevertheless, the environments and embodiments of these source domains can be quite different from their target domain counterparts, underscoring the need for effective cross-domain policy transfer approaches. In this paper, we conduct a systematic review of existing cross-domain policy transfer methods. Through a nuanced categorization of domain gaps, we encapsulate the overarching insights and design considerations of each problem setting. We also provide a high-level discussion about the key methodologies used in cross-domain policy transfer problems. Lastly, we summarize the open challenges that lie beyond the capabilities of current paradigms and discuss potential future directions in this field.
comment: IJCAI 2024
Structured Deep Neural Networks-Based Backstepping Trajectory Tracking Control for Lagrangian Systems
Deep neural networks (DNNs) are increasingly being used to learn controllers due to their excellent approximation capabilities. However, their black-box nature poses significant challenges to closed-loop stability guarantees and performance analysis. In this paper, we introduce a structured DNN-based controller for the trajectory tracking control of Lagrangian systems using backstepping techniques. By properly designing neural network structures, the proposed controller can ensure closed-loop stability for any compatible neural network parameters. In addition, improved control performance can be achieved by further optimizing neural network parameters. Besides, we provide explicit upper bounds on tracking errors in terms of controller parameters, which allows us to achieve the desired tracking performance by properly selecting the controller parameters. Furthermore, when system models are unknown, we propose an improved Lagrangian neural network (LNN) structure to learn the system dynamics and design the controller. We show that in the presence of model approximation errors and external disturbances, the closed-loop stability and tracking control performance can still be guaranteed. The effectiveness of the proposed approach is demonstrated through simulations.
Probabilistic Visibility-Aware Trajectory Planning for Target Tracking in Cluttered Environments
Target tracking has numerous significant civilian and military applications, and maintaining the visibility of the target plays a vital role in ensuring the success of the tracking task. Existing visibility-aware planners primarily focus on keeping the target within the limited field of view of an onboard sensor and avoiding obstacle occlusion. However, the negative impact of system uncertainty is often neglected, rendering the planners fragile under uncertainty in practice. To bridge the gap, this work proposes a real-time, non-myopic trajectory planner for visibility-aware and safe target tracking in the presence of system uncertainty. For more accurate target motion prediction, we introduce the concept of belief-space probability of detection (BPOD) to measure the predictive visibility of the target under stochastic robot and target states. An Extended Kalman Filter variant incorporating BPOD is developed to predict target belief state under uncertain visibility within the planning horizon. To achieve real-time trajectory planning, we propose a computationally efficient algorithm to uniformly calculate both BPOD and the chance-constrained collision risk by utilizing a linearized signed distance function (SDF), and then design a two-stage strategy for lightweight calculation of SDF in sequential convex programming. Extensive simulation results with benchmark comparisons show the capacity of the proposed approach to robustly maintain the visibility of the target under high system uncertainty. The practicality of the proposed trajectory planner is validated by real-world experiments.
comment: A technical report for our conference paper in 2024 American Control Conference (ACC)
Riemannian Flow Matching Policy for Robot Motion Learning IROS'24
We introduce Riemannian Flow Matching Policies (RFMP), a novel model for learning and synthesizing robot visuomotor policies. RFMP leverages the efficient training and inference capabilities of flow matching methods. By design, RFMP inherits the strengths of flow matching: the ability to encode high-dimensional multimodal distributions, commonly encountered in robotic tasks, and a very simple and fast inference process. We demonstrate the applicability of RFMP to both state-based and vision-conditioned robot motion policies. Notably, as the robot state resides on a Riemannian manifold, RFMP inherently incorporates geometric awareness, which is crucial for realistic robotic tasks. To evaluate RFMP, we conduct two proof-of-concept experiments, comparing its performance against Diffusion Policies. Although both approaches successfully learn the considered tasks, our results show that RFMP provides smoother action trajectories with significantly lower inference times.
comment: Accepted for publication at IROS'24. 8 pages, 5 figures, 4 tables
Online Multi-Agent Pickup and Delivery with Task Deadlines IROS 2024
Managing delivery deadlines in automated warehouses and factories is crucial for maintaining customer satisfaction and ensuring seamless production. This study introduces the problem of online multi-agent pickup and delivery with task deadlines (MAPD-D), an advanced variant of the online MAPD problem incorporating delivery deadlines. In the MAPD problem, agents must manage a continuous stream of delivery tasks online. Tasks are added at any time. Agents must complete their tasks while avoiding collisions with each other. MAPD-D introduces a dynamic, deadline-driven approach that incorporates task deadlines, challenging the conventional MAPD frameworks. To tackle MAPD-D, we propose a novel algorithm named deadline-aware token passing (D-TP). The D-TP algorithm calculates pickup deadlines and assigns tasks while balancing execution cost and deadline proximity. Additionally, we introduce the D-TP with task swaps (D-TPTS) method to further reduce task tardiness, enhancing flexibility and efficiency through task-swapping strategies. Numerical experiments were conducted in simulated warehouse environments to showcase the effectiveness of the proposed methods. Both D-TP and D-TPTS demonstrated significant reductions in task tardiness compared to existing methods. Our methods contribute to efficient operations in automated warehouses and factories with delivery deadlines.
comment: 6 pages, 4 figures, IROS 2024
InPTC: Integrated Planning and Tube-Following Control for Prescribed-Time Collision-Free Navigation of Wheeled Mobile Robots
In this article, we propose a novel approach, called InPTC (Integrated Planning and Tube-Following Control), for prescribed-time collision-free navigation of wheeled mobile robots in a compact convex workspace cluttered with static, sufficiently separated, and convex obstacles. A path planner with prescribed-time convergence is presented based upon Bouligand's tangent cones and time scale transformation (TST) techniques, yielding a continuous vector field that can guide the robot from almost all initial positions in the free space to the designated goal at a prescribed time, while avoiding entering the obstacle regions augmented with safety margin. By leveraging barrier functions and TST, we further derive a tube-following controller to achieve robot trajectory tracking within a prescribed time less than the planner's settling time. This controller ensures the robot moves inside a predefined "safe tube" around the reference trajectory, where the tube radius is set to be less than the safety margin. Consequently, the robot will reach the goal location within a prescribed time while avoiding collision with any obstacles along the way. The proposed InPTC is implemented on a Mona robot operating in an arena cluttered with obstacles of various shapes. Experimental results demonstrate that InPTC not only generates smooth collision-free reference trajectories that converge to the goal location at the preassigned time of $250\,\rm s$ (i.e., the required task completion time), but also achieves tube-following trajectory tracking with tracking accuracy higher than $0.01\,\rm m$ after the preassigned time of $150\,\rm s$. This enables the robot to accomplish the navigation task within the required time of $250\,\rm s$.
Predictive Modeling of Flexible EHD Pumps using Kolmogorov-Arnold Networks
We present a novel approach to predicting the pressure and flow rate of flexible electrohydrodynamic (EHD) pumps using the Kolmogorov-Arnold Network (KAN). Inspired by the Kolmogorov-Arnold representation theorem, KAN replaces fixed activation functions with learnable spline-based activation functions, enabling it to approximate complex nonlinear functions more effectively than traditional models such as the Multi-Layer Perceptron (MLP) and Random Forest (RF). We evaluated KAN on a dataset of flexible EHD pump parameters and compared its performance against RF and MLP models. KAN achieved superior predictive accuracy, with Mean Squared Errors of 12.186 and 0.001 for pressure and flow rate predictions, respectively. The symbolic formulas extracted from KAN provided insights into the nonlinear relationships between input parameters and pump performance. These findings demonstrate that KAN offers exceptional accuracy and interpretability, making it a promising alternative for predictive modeling in electrohydrodynamic pumping.
Object Augmentation Algorithm: Computing virtual object motion and object induced interaction wrench from optical markers IROS 2024
This study addresses the critical need for diverse and comprehensive data focused on human arm joint torques while performing activities of daily living (ADL). Previous studies have often overlooked the influence of objects on joint torques during ADL, resulting in limited datasets for analysis. To address this gap, we propose an Object Augmentation Algorithm (OAA) capable of augmenting existing marker-based databases with virtual object motions and object-induced joint torque estimations. The OAA consists of five phases: (1) computing hand coordinate systems from optical markers, (2) characterising object movements with virtual markers, (3) calculating object motions through inverse kinematics (IK), (4) determining the wrench necessary for prescribed object motion using inverse dynamics (ID), and (5) computing joint torques resulting from object manipulation. The algorithm's accuracy is validated through trajectory tracking and torque analysis on a 7+4 degree of freedom (DoF) robotic hand-arm system, manipulating three unique objects. The results show that the OAA can accurately and precisely estimate 6 DoF object motion and object-induced joint torques. Correlations between computed and measured quantities were > 0.99 for object trajectories and > 0.93 for joint torques. The OAA was further shown to be robust to variations in the number and placement of input markers, which are expected between databases. Differences between repeated experiments were minor but significant (p < 0.05). The algorithm expands the scope of available data and facilitates more comprehensive analyses of human-object interaction dynamics.
comment: An open source implementation of the described algorithm is available at https://github.com/ChristopherHerneth/ObjectAugmentationAlgorithm/tree/main. Accompanying video material may be found here https://youtu.be/8oz-awvyNRA. The article was accepted at IROS 2024
A Neurosymbolic Approach to Adaptive Feature Extraction in SLAM IROS
Autonomous robots, autonomous vehicles, and humans wearing mixed-reality headsets require accurate and reliable tracking services for safety-critical applications in dynamically changing real-world environments. However, the existing tracking approaches, such as Simultaneous Localization and Mapping (SLAM), do not adapt well to environmental changes and boundary conditions despite extensive manual tuning. On the other hand, while deep learning-based approaches can better adapt to environmental changes, they typically demand substantial data for training and often lack flexibility in adapting to new domains. To solve this problem, we propose leveraging the neurosymbolic program synthesis approach to construct adaptable SLAM pipelines that integrate the domain knowledge from traditional SLAM approaches while leveraging data to learn complex relationships. While the approach can synthesize end-to-end SLAM pipelines, we focus on synthesizing the feature extraction module. We first devise a domain-specific language (DSL) that can encapsulate domain knowledge on the important attributes for feature extraction and the real-world performance of various feature extractors. Our neurosymbolic architecture then undertakes adaptive feature extraction, optimizing parameters via learning while employing symbolic reasoning to select the most suitable feature extractor. Our evaluations demonstrate that our approach, neurosymbolic Feature EXtraction (nFEX), yields higher-quality features. It also reduces the pose error observed for the state-of-the-art baseline feature extractors ORB and SIFT by up to 90% and up to 66%, respectively, thereby enhancing the system's efficiency and adaptability to novel environments.
comment: 8 pages, 6 figures, and 5 tables. Published at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Corresponding author: Yasra Chandio (ychandio@umass.edu)
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers
Large-scale multi-task robotic manipulation systems often rely on text to specify the task. In this work, we explore whether a robot can learn by observing humans. To do so, the robot must understand a person's intent and perform the inferred task despite differences in the embodiments and environments. We introduce Vid2Robot, an end-to-end video-conditioned policy that takes human videos demonstrating manipulation tasks as input and produces robot actions. Our model is trained with a large dataset of prompt video-robot trajectory pairs to learn unified representations of human and robot actions from videos. Vid2Robot uses cross-attention transformer layers between video features and the current robot state to produce the actions and perform the same task as shown in the video. We use auxiliary contrastive losses to align the prompt and robot video representations for better policies. We evaluate Vid2Robot on real-world robots and observe over 20% improvement over BC-Z when using human prompt videos. Further, we also show cross-object motion transfer ability that enables video-conditioned policies to transfer a motion observed on one object in the prompt video to another object in the robot's own environment. Videos available at https://vid2robot.github.io
comment: Robotics: Science & Systems (RSS) 2024. https://vid2robot.github.io/
Scaling Learning based Policy Optimization for Temporal Logic Tasks by Controller Network Dropout
This paper introduces a model-based approach for training feedback controllers for an autonomous agent operating in a highly nonlinear (albeit deterministic) environment. We desire the trained policy to ensure that the agent satisfies specific task objectives and safety constraints, both expressed in Discrete-Time Signal Temporal Logic (DT-STL). One advantage of reformulating a task via formal frameworks, like DT-STL, is that it permits quantitative satisfaction semantics. In other words, given a trajectory and a DT-STL formula, we can compute the robustness, which can be interpreted as an approximate signed distance between the trajectory and the set of trajectories satisfying the formula. We utilize feedback control, and we assume a feedforward neural network for learning the feedback controller. We show how this learning problem is similar to training recurrent neural networks (RNNs), where the number of recurrent units is proportional to the temporal horizon of the agent's task objectives. This poses a challenge: RNNs are susceptible to vanishing and exploding gradients, and naïve gradient descent-based strategies to solve long-horizon task objectives thus suffer from the same problems. To tackle this challenge, we introduce a novel gradient approximation algorithm based on the idea of dropout or gradient sampling. One of the main contributions is the notion of controller network dropout, where we approximate the NN controller in several time-steps in the task horizon by the control input obtained using the controller in a previous training step. We show that our control synthesis methodology can help stochastic gradient descent converge with fewer numerical issues, enabling scalable backpropagation over long time horizons and trajectories over high-dimensional state spaces.
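Illustrative note: the quantitative semantics mentioned in the abstract above can be sketched for two simple formulas. The code below uses the standard min/max robustness semantics for "always" and "eventually" over a threshold predicate on a scalar, discrete-time signal. The function names and the example trajectory are assumptions for illustration and are not taken from the paper.

```python
from typing import Sequence


def rob_always_gt(signal: Sequence[float], c: float) -> float:
    """Robustness of G (x > c): the worst-case margin over the horizon."""
    return min(x - c for x in signal)


def rob_eventually_gt(signal: Sequence[float], c: float) -> float:
    """Robustness of F (x > c): the best-case margin over the horizon."""
    return max(x - c for x in signal)


# Hypothetical scalar trajectory sampled at four time steps.
traj = [0.5, 1.2, 0.9, 2.0]

# A positive value means the formula is satisfied, a negative value means it is violated.
print(rob_always_gt(traj, 0.0))      # G (x > 0.0)
print(rob_always_gt(traj, 1.0))      # G (x > 1.0)
print(rob_eventually_gt(traj, 1.5))  # F (x > 1.5)
```

Because the robustness is built from min and max over time steps, its gradient with respect to the controller parameters passes through the whole rollout, which is the source of the long-horizon gradient issues the abstract discusses.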
LTL-Transfer: Skill Transfer for Temporal Task Specification ICRA 2024
Deploying robots in real-world environments, such as households and manufacturing lines, requires generalization across novel task specifications without violating safety constraints. Linear temporal logic (LTL) is a widely used task specification language with a compositional grammar that naturally induces commonalities among tasks while preserving safety guarantees. However, most prior work on reinforcement learning with LTL specifications treats every new task independently, thus requiring large amounts of training data to generalize. We propose LTL-Transfer, a zero-shot transfer algorithm that composes task-agnostic skills learned during training to safely satisfy a wide variety of novel LTL task specifications. Experiments in Minecraft-inspired domains show that after training on only 50 tasks, LTL-Transfer can solve over 90% of 100 challenging unseen tasks and 100% of 300 commonly used novel tasks without violating any safety constraints. We deployed LTL-Transfer at the task-planning level of a quadruped mobile manipulator to demonstrate its zero-shot transfer ability for fetch-and-deliver and navigation tasks.
comment: ICRA 2024
Machine Learning for Shipwreck Segmentation from Side Scan Sonar Imagery: Dataset and Benchmark
Open-source benchmark datasets have been a critical component for advancing machine learning for robot perception in terrestrial applications. Benchmark datasets enable the widespread development of state-of-the-art machine learning methods, which require large datasets for training, validation, and thorough comparison to competing approaches. Underwater environments impose several operational challenges that hinder efforts to collect large benchmark datasets for marine robot perception. Furthermore, a low abundance of targets of interest relative to the size of the search space leads to increased time and cost required to collect useful datasets for a specific task. As a result, there is limited availability of labeled benchmark datasets for underwater applications. We present the AI4Shipwrecks dataset, which consists of 28 distinct shipwrecks totaling 286 high-resolution labeled side scan sonar images to advance the state-of-the-art in autonomous sonar image understanding. We leverage the unique abundance of targets in Thunder Bay National Marine Sanctuary in Lake Huron, MI, to collect and compile a sonar imagery benchmark dataset through surveys with an autonomous underwater vehicle (AUV). We consulted with expert marine archaeologists for the labeling of robotically gathered data. We then leverage this dataset to perform benchmark experiments for comparison of state-of-the-art supervised segmentation methods, and we present insights on opportunities and open challenges for the field. The dataset and benchmarking tools will be released as an open-source benchmark dataset to spur innovation in machine learning for Great Lakes and ocean exploration. The dataset and accompanying software are available at https://umfieldrobotics.github.io/ai4shipwrecks/.
comment: Project website link: https://umfieldrobotics.github.io/ai4shipwrecks/
Multiagent Systems
Decentralized Unlabeled Multi-agent Pathfinding Via Target And Priority Swapping (With Supplementary) ECAI 2024
In this paper we study a challenging variant of the multi-agent pathfinding problem (MAPF), in which a set of agents must reach a set of goal locations, but it does not matter which agent reaches a specific goal - Anonymous MAPF (AMAPF). Current optimal and suboptimal AMAPF solvers rely on the existence of a centralized controller which is in charge of both target assignment and pathfinding. We extend the state of the art and present the first AMAPF solver capable of solving the problem at hand in a fully decentralized fashion, where each agent makes decisions individually and relies only on local communication with the others. The core of our method is a priority and target swapping procedure tailored to produce consistent goal assignments (i.e. making sure that no two agents are heading towards the same goal). Coupled with an established rule-based path planning, we obtain TP-SWAP, an efficient and flexible approach to solve decentralized AMAPF. On the theoretical side, we prove that TP-SWAP is complete (i.e. TP-SWAP guarantees that each target will be reached by some agent). Empirically, we evaluate TP-SWAP across a wide range of setups and compare it to both centralized and decentralized baselines. Indeed, TP-SWAP outperforms the fully-decentralized competitor and can even outperform the semi-decentralized one (i.e. the one relying on the initial consistent goal assignment) in terms of flowtime (a widespread cost objective in MAPF).
comment: This is a pre-print of the paper accepted to ECAI 2024. Its main body is similar to the camera-ready version of the conference paper. In addition, this pre-print contains Supplementary Material incorporating extended empirical results and analysis. It contains 10 pages, 8 figures, 4 tables
Graph Attention Inference of Network Topology in Multi-Agent Systems
Accurately identifying the underlying graph structures of multi-agent systems remains a difficult challenge. Our work introduces a novel machine learning-based solution that leverages the attention mechanism to predict future states of multi-agent systems by learning node representations. The graph structure is then inferred from the strength of the attention values. This approach is applied to both linear consensus dynamics and the non-linear dynamics of Kuramoto oscillators, implicitly learning the graph by learning good agent representations. Our results demonstrate that the presented data-driven graph attention machine learning model can identify the network topology in multi-agent systems, even when the underlying dynamic model is not known, as evidenced by the F1 scores achieved in the link prediction.
comment: Accepted for publication at Modeling and Estimation Control Conference 2024; 6 pages, 5 figures
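Illustrative note: the two dynamics named in the abstract above can be simulated in a few lines, which shows what kind of state trajectories a topology-inference model would observe. The sketch below uses forward-Euler steps of the textbook linear consensus and Kuramoto models. It does not implement the attention model; the adjacency matrix, step size, and coupling gain are assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def consensus_step(x: List[float], adj: Matrix, dt: float = 0.1) -> List[float]:
    """One Euler step of x_i' = sum_j a_ij (x_j - x_i)."""
    n = len(x)
    return [
        x[i] + dt * sum(adj[i][j] * (x[j] - x[i]) for j in range(n))
        for i in range(n)
    ]


def kuramoto_step(theta: List[float], omega: List[float], adj: Matrix,
                  coupling: float = 1.0, dt: float = 0.1) -> List[float]:
    """One Euler step of theta_i' = omega_i + K sum_j a_ij sin(theta_j - theta_i)."""
    n = len(theta)
    return [
        theta[i] + dt * (omega[i] + coupling * sum(
            adj[i][j] * math.sin(theta[j] - theta[i]) for j in range(n)))
        for i in range(n)
    ]


# Hypothetical undirected path graph on three agents: 0 - 1 - 2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]

x = [0.0, 3.0, 6.0]
for _ in range(200):
    x = consensus_step(x, adj)
print([round(v, 6) for v in x])  # states approach the initial average
```

The adjacency entries a_ij are exactly the quantities the inference task has to recover from such trajectories when the graph is hidden.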
Compressed Federated Reinforcement Learning with a Generative Model ECML-PKDD 2024
Reinforcement learning has recently gained unprecedented popularity, yet it still grapples with sample inefficiency. Addressing this challenge, federated reinforcement learning (FedRL) has emerged, wherein agents collaboratively learn a single policy by aggregating local estimations. However, this aggregation step incurs significant communication costs. In this paper, we propose CompFedRL, a communication-efficient FedRL approach incorporating both periodic aggregation and (direct/error-feedback) compression mechanisms. Specifically, we consider compressed federated $Q$-learning with a generative model setup, where a central server learns an optimal $Q$-function by periodically aggregating compressed $Q$-estimates from local agents. For the first time, we characterize the impact of these two mechanisms (which have remained elusive) by providing a finite-time analysis of our algorithm, demonstrating strong convergence behaviors when utilizing either direct or error-feedback compression. Our bounds indicate improved solution accuracy concerning the number of agents and other federated hyperparameters while simultaneously reducing communication costs. To corroborate our theory, we also conduct in-depth numerical experiments to verify our findings, considering Top-$K$ and Sparsified-$K$ sparsification operators.
comment: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024)
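Illustrative note: the Top-$K$ operator and the error-feedback mechanism named in the abstract above can be sketched generically. The code below keeps the $K$ largest-magnitude entries of a vector and carries the dropped part forward as a residual for the next round. This is the common textbook form of error-feedback compression; the exact operators and scaling in CompFedRL may differ, and the function names and example vector are assumptions.

```python
from typing import List, Tuple


def top_k(vec: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def compress_with_error_feedback(update: List[float], residual: List[float],
                                 k: int) -> Tuple[List[float], List[float]]:
    """Compress (update + residual) and return what is sent plus the new residual."""
    corrected = [u + r for u, r in zip(update, residual)]
    sent = top_k(corrected, k)
    new_residual = [c - s for c, s in zip(corrected, sent)]
    return sent, new_residual


# Hypothetical local estimate with four entries; only two may be transmitted.
residual = [0.0, 0.0, 0.0, 0.0]
sent, residual = compress_with_error_feedback([0.5, -2.0, 0.1, 1.0], residual, k=2)
print(sent)      # entries kept for transmission
print(residual)  # dropped mass, added back before the next compression
```

With direct compression the residual is simply discarded; error feedback keeps it so that small entries accumulate and are eventually transmitted.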
MARPF: Multi-Agent and Multi-Rack Path Finding IROS 2024
In environments where many automated guided vehicles (AGVs) operate, planning efficient, collision-free paths is essential. Related research has mainly focused on environments with pre-defined passages, resulting in space inefficiency. We attempt to relax this assumption. In this study, we define multi-agent and multi-rack path finding (MARPF) as the problem of planning paths for AGVs to convey target racks to their designated locations in environments without passages. In such environments, an AGV without a rack can pass under racks, whereas one with a rack cannot pass under racks to avoid collisions. MARPF entails conveying the target racks without collisions, while the obstacle racks are relocated to prevent any interference with the target racks. We formulated MARPF as an integer linear programming problem in a network flow. To distinguish situations in which an AGV is or is not loading a rack, the proposed method introduces two virtual layers into the network. We optimized the AGVs' movements to move obstacle racks and convey the target racks. The formulation and applicability of the algorithm were validated through numerical experiments. The results indicated that the proposed algorithm addressed issues in environments with dense racks.
comment: 6 pages, 10 figures, IROS 2024
Online Multi-Agent Pickup and Delivery with Task Deadlines IROS 2024
Managing delivery deadlines in automated warehouses and factories is crucial for maintaining customer satisfaction and ensuring seamless production. This study introduces the problem of online multi-agent pickup and delivery with task deadlines (MAPD-D), an advanced variant of the online MAPD problem incorporating delivery deadlines. In the MAPD problem, agents must manage a continuous stream of delivery tasks online. Tasks are added at any time. Agents must complete their tasks while avoiding collisions with each other. MAPD-D introduces a dynamic, deadline-driven approach that incorporates task deadlines, challenging the conventional MAPD frameworks. To tackle MAPD-D, we propose a novel algorithm named deadline-aware token passing (D-TP). The D-TP algorithm calculates pickup deadlines and assigns tasks while balancing execution cost and deadline proximity. Additionally, we introduce the D-TP with task swaps (D-TPTS) method to further reduce task tardiness, enhancing flexibility and efficiency through task-swapping strategies. Numerical experiments were conducted in simulated warehouse environments to showcase the effectiveness of the proposed methods. Both D-TP and D-TPTS demonstrated significant reductions in task tardiness compared to existing methods. Our methods contribute to efficient operations in automated warehouses and factories with delivery deadlines.
comment: 6 pages, 4 figures, IROS 2024
Q-ITAGS: Quality-Optimized Spatio-Temporal Heterogeneous Task Allocation with a Time Budget
Complex multi-objective missions require the coordination of heterogeneous robots at multiple inter-connected levels, such as coalition formation, scheduling, and motion planning. The associated challenges are exacerbated when solutions to these interconnected problems need to simultaneously maximize task performance and respect practical constraints on time and resources. In this work, we formulate a new class of spatio-temporal heterogeneous task allocation problems that formalize these complexities. We then contribute a novel framework, named Quality-Optimized Incremental Task Allocation Graph Search (Q-ITAGS), to solve such problems. Q-ITAGS offers a flexible interleaved framework that i) explicitly models and optimizes the effect of collective capabilities on task performance via learnable trait-quality maps, and ii) respects both resource and spatio-temporal constraints, including a user-specified time budget (i.e., maximum makespan). In addition to algorithmic contributions, we derive theoretical suboptimality bounds in terms of task performance that varies as a function of a single hyperparameter. Detailed experiments involving a simulated emergency response task and a real-world video game dataset reveal that i) Q-ITAGS results in superior team performance compared to a state-of-the-art method, while also respecting complex spatio-temporal and resource constraints, ii) Q-ITAGS efficiently learns trait-quality maps to enable effective trade-off between task performance and resource constraints, and iii) Q-ITAGS' suboptimality bounds consistently hold in practice.
comment: arXiv admin note: text overlap with arXiv:2209.13092
Tractable Equilibrium Computation in Markov Games through Risk Aversion
A significant roadblock to the development of principled multi-agent reinforcement learning is the fact that desired solution concepts like Nash equilibria may be intractable to compute. To overcome this obstacle, we take inspiration from behavioral economics and show that -- by imbuing agents with important features of human decision-making like risk aversion and bounded rationality -- a class of risk-averse quantal response equilibria (RQE) becomes tractable to compute in all $n$-player matrix and finite-horizon Markov games. In particular, we show that they emerge as the endpoint of no-regret learning in suitably adjusted versions of the games. Crucially, the class of computationally tractable RQE is independent of the underlying game structure and only depends on agents' degree of risk-aversion and bounded rationality. To validate the richness of this class of solution concepts we show that it captures people's patterns of play in a number of 2-player matrix games previously studied in experimental economics. Furthermore, we give a first analysis of the sample complexity of computing these equilibria in finite-horizon Markov games when one has access to a generative model and validate our findings on a simple multi-agent reinforcement learning benchmark.
comment: preprint of multi-agent RL with risk-averse equilibria
Systems and Control (CS)
Data-enabled Predictive Repetitive Control
Many systems are subject to periodic disturbances and exhibit repetitive behaviour. Model-based repetitive control employs knowledge of such periodicity to attenuate periodic disturbances and has seen a wide range of successful industrial implementations. The aim of this paper is to develop a data-driven repetitive control method. In the developed framework, linear periodically time-varying (LPTV) behaviour is lifted to linear time-invariant (LTI) behaviour. Periodic disturbance mitigation is enabled by developing an extension of Willems' fundamental lemma for systems with exogenous disturbances. The resulting Data-enabled Predictive Repetitive Control (DeePRC) technique accounts for periodic system behaviour to perform attenuation of a periodic disturbance. Simulations demonstrate the ability of DeePRC to effectively mitigate periodic disturbances in the presence of noise.
comment: Extended report
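Two building blocks mentioned in the abstract above can be sketched for a scalar signal: lifting a periodic signal into one stacked sample per period, and forming the Hankel matrix of recorded data that Willems' fundamental lemma relies on. This is a generic illustration, not the DeePRC formulation; the helper names `lift` and `hankel` are hypothetical.

```python
from typing import List, Sequence


def lift(signal: Sequence[float], period: int) -> List[List[float]]:
    """Stack a signal into consecutive blocks of one period each.

    Each block is one sample of the lifted signal; a trailing
    incomplete period is dropped.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    n_blocks = len(signal) // period
    return [list(signal[i * period:(i + 1) * period]) for i in range(n_blocks)]


def hankel(signal: Sequence[float], depth: int) -> List[List[float]]:
    """Build the depth-row Hankel matrix of a scalar signal.

    Entry (i, j) equals signal[i + j], so every column is a length-`depth`
    window of the recorded trajectory.
    """
    cols = len(signal) - depth + 1
    if depth <= 0 or cols <= 0:
        raise ValueError("depth must be between 1 and len(signal)")
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]
```

In data-enabled predictive control, candidate future trajectories are expressed as linear combinations of the columns of such Hankel matrices built from sufficiently exciting input/output data.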
SpecGuard: Specification Aware Recovery for Robotic Autonomous Vehicles from Physical Attacks CCS'24
Robotic Autonomous Vehicles (RAVs) rely on their sensors for perception, and follow strict mission specifications (e.g., altitude, speed, and geofence constraints) for safe and timely operations. Physical attacks can corrupt the RAVs' sensors, resulting in mission failures. Recovering RAVs from such attacks demands robust control techniques that maintain compliance with mission specifications even under attacks to ensure the RAV's safety and timely operations. We propose SpecGuard, a technique that complies with mission specifications and performs safe recovery of RAVs. There are two innovations in SpecGuard. First, it introduces an approach to incorporate mission specifications and learn a recovery control policy using Deep Reinforcement Learning (Deep-RL). We design a compliance-based reward structure that reflects the RAV's complex dynamics and enables SpecGuard to satisfy multiple mission specifications simultaneously. Second, SpecGuard incorporates state reconstruction, a technique that minimizes attack-induced sensor perturbations. This reconstruction enables effective adversarial training and optimization of the recovery control policy for robustness under attacks. We evaluate SpecGuard in both virtual and real RAVs, and find that it achieves 92% recovery success rate under attacks on different sensors, without any crashes or stalls. SpecGuard achieves 2X higher recovery success than prior work, and incurs about 15% performance overhead on real RAVs.
comment: CCS'24 (a shorter version of this paper will appear in the conference proceeding)
Data-driven distributionally robust MPC for systems with multiplicative noise: A semi-infinite semi-definite programming approach
This article introduces a novel distributionally robust model predictive control (DRMPC) algorithm for a specific class of controlled dynamical systems where the disturbance multiplies the state and control variables. These classes of systems arise in mathematical finance, where the paradigm of distributionally robust optimization (DRO) fits perfectly, and this serves as the primary motivation for this work. We recast the optimal control problem (OCP) as a semi-definite program with an infinite number of constraints, making the ensuing optimization problem a \emph{semi-infinite semi-definite program} (SI-SDP). To numerically solve the SI-SDP, we advance an approach for solving convex semi-infinite programs (SIPs) to SI-SDPs and, subsequently, solve the DRMPC problem. A numerical example is provided to show the effectiveness of the algorithm.
comment: To appear in the proceedings of Mathematical Theory of Networks and Systems (MTNS) 2024
Applications in CityLearn Gym Environment for Multi-Objective Control Benchmarking in Grid-Interactive Buildings and Districts
It is challenging to coordinate multiple distributed energy resources in a single or multiple buildings to ensure efficient and flexible operation. Advanced control algorithms such as model predictive control and reinforcement learning control provide solutions to this problem by effectively managing a distribution of distributed energy resource control tasks while adapting to unique building characteristics, and cooperating towards improving multi-objective key performance indicators. Yet, a research gap for advanced control adoption is the ability to benchmark algorithm performance. CityLearn addresses this gap as an open-source Gym environment for the easy implementation and benchmarking of simple rule-based control and advanced algorithms, with the advantages of modeling simplicity, multi-agent control, district-level objectives, and control resiliency assessment. Here we demonstrate the functionalities of CityLearn using 17 different building control problems that have varying complexity with respect to the number of controllable distributed energy resources in buildings, the simplicity of the control algorithm, the control objective, and district size.
comment: To be published in IBPSA-USA SimBuild 2024 Conference
Multitone PSK Modulation Design for Simultaneous Wireless Information and Power Transfer
Far-field wireless power transfer, based on radio frequency (RF) waves, came into the picture to fulfill the power need of large Internet of Things (IoT) networks, the backbone of the 5G and beyond era. However, RF communication signals carry both information and energy. Therefore, simultaneous wireless information and power transfer (SWIPT) has recently attracted much attention as a way to wirelessly charge these IoT devices. In this paper, we propose a novel $N$-tone multitone phase shift keying (PSK) modulation scheme, taking advantage of the non-linearity of the integrated receiver rectifier architecture. The main advantage of the proposed modulation scheme is the reduction in ripple voltage introduced by the symbol transmission through phases. Achievable power conversion efficiency (PCE) and bit error rate (BER) at the output are considered to measure the efficacy of the proposed modulation scheme. Simulation results are verified by measurements on the designed rectifier circuitry. The effects of symbol phase range, modulation order, and the number of tones are analyzed. In the future, this transmission scheme can be utilized to satisfy the data and power requirements of low-power IoT sensor networks.
Compact Pixelated Microstrip Forward Broadside Coupler Using Binary Particle Swarm Optimization
In this paper, a compact microstrip forward broadside coupler (MFBC) with high coupling level is proposed in the frequency band of 3.5-3.8 GHz. The coupler is composed of two parallel pixelated transmission lines. To validate the design strategy, the proposed MFBC is fabricated and measured. The measured results demonstrate a forward coupler with 3 dB coupling, and a compact size of $0.12\lambda_g \times 0.10\lambda_g$. Binary Particle Swarm Optimization (BPSO) design methodology and flexibility of pixelation enable us to optimize the proposed MFBC with desired coupling level and operating frequency within a fixed dimension. Also, low sensitivity to misalignment between two coupled TLs makes the proposed coupler a good candidate for near-field Wireless Power Transfer (WPT) application and sensors.
Distributed Planning for Rigid Robot Formations with Probabilistic Collision Avoidance
This paper presents a distributed method for robots moving in rigid formations while ensuring probabilistic collision avoidance between the robots. The formation is parametrised through the transformation of a base configuration. The robots map their desired velocities into a corresponding desired change in the formation parameters and apply a consensus step to reach agreement on the desired formation and a constraint satisfaction step to ensure collision avoidance within the formation. The constraint set is found such that the probability of collision remains below an upper bound. The method was demonstrated in a manual teleoperation scenario both in simulation and a real-world experiment.
Earth Observation Satellite Scheduling with Graph Neural Networks
The Earth Observation Satellite Planning (EOSP) is a difficult optimization problem with considerable practical interest. A set of requested observations must be scheduled on an agile Earth observation satellite while respecting constraints on their visibility window, as well as maneuver constraints that impose varying delays between successive observations. In addition, the problem is largely oversubscribed: there are many more candidate observations than can possibly be performed. Therefore, one must select the set of observations that will be performed while maximizing their weighted cumulative benefit, and propose a feasible schedule for these observations. As previous work mostly focused on heuristic and iterative search algorithms, this paper presents a new technique for selecting and scheduling observations based on Graph Neural Networks (GNNs) and Deep Reinforcement Learning (DRL). GNNs are used to extract relevant information from the graphs representing instances of the EOSP, and DRL drives the search for optimal schedules. Our simulations show that it is able to learn on small problem instances and generalize to larger real-world instances, with very competitive performance compared to traditional approaches.
comment: Accepted at 17th European Workshop on Reinforcement Learning (EWRL 2024)
Fixed-time Disturbance Observer-Based MPC Robust Trajectory Tracking Control of Quadrotor
In this paper, a fixed-time disturbance observer-based model predictive control algorithm is proposed for trajectory tracking of a quadrotor in the presence of disturbances. First, a novel multivariable fixed-time disturbance observer is proposed to estimate the lumped disturbances. The bi-limit homogeneity and Lyapunov techniques are employed to ensure the convergence of the estimation error within a fixed convergence time, independent of the initial estimation error. Then, an observer-based model predictive control strategy is formulated to achieve robust trajectory tracking of the quadrotor, attenuating the lumped disturbances and model uncertainties. Finally, simulations and real-world experiments are provided to illustrate the effectiveness of the proposed method.
Domain-decoupled Physics-informed Neural Networks with Closed-form Gradients for Fast Model Learning of Dynamical Systems
Physics-informed neural networks (PINNs) are trained using physical equations and can also incorporate unmodeled effects by learning from data. PINNs for control (PINCs) of dynamical systems are gaining interest due to their prediction speed compared to classical numerical integration methods for nonlinear state-space models, making them suitable for real-time control applications. We introduce the domain-decoupled physics-informed neural network (DD-PINN) to address current limitations of PINC in handling large and complex nonlinear dynamic systems. The time domain is decoupled from the feed-forward neural network to construct an Ansatz function, allowing for calculation of gradients in closed form. This approach significantly reduces training times, especially for large dynamical systems, compared to PINC, which relies on graph-based automatic differentiation. Additionally, the DD-PINN inherently fulfills the initial condition and supports higher-order excitation inputs, simplifying the training process and enabling improved prediction accuracy. Validation on three systems - a nonlinear mass-spring-damper, a five-mass-chain, and a two-link robot - demonstrates that the DD-PINN achieves significantly shorter training times. In cases where the PINC's prediction diverges, the DD-PINN's prediction remains stable and accurate due to higher physics loss reduction or use of a higher-order excitation input. The DD-PINN allows for fast and accurate learning of large dynamical systems previously out of reach for the PINC.
comment: Accepted to International Conference on Informatics in Control, Automation and Robotics (ICINCO) 2024
Towards Safe Autonomous Intersection Management: Temporal Logic-based Safety Filters for Vehicle Coordination
In this paper, we introduce a temporal logic-based safety filter for Autonomous Intersection Management (AIM), an emerging infrastructure technology for connected vehicles to coordinate traffic flow through intersections. Despite substantial work on AIM systems, the balance between intersection safety and efficiency persists as a significant challenge. Building on recent developments in formal methods that now have become computationally feasible for AIM applications, we introduce an approach that starts with a temporal logic specification for the intersection and then uses reachability analysis to compute safe time-state corridors for the connected vehicles that pass through the intersection. By analyzing these corridors, in contrast to single trajectories, we can make explicit design decisions regarding safety-efficiency trade-offs while taking each vehicle's decision uncertainty into account. Additionally, we compute safe driving limits to ensure that vehicles remain within their designated safe corridors. Combining these elements, we develop a service that provides safety filters for AIM coordination of connected vehicles. We evaluate the practical feasibility of our safety framework using a simulated 4-way intersection, showing that our approach performs in real-time for multiple scenarios.
comment: To be published in 27th IEEE International Conference on Intelligent Transportation Systems
Model Predictive Control for T-S Fuzzy Markovian Jump Systems Using Dynamic Prediction Optimization
In this paper, the model predictive control (MPC) problem is investigated for the constrained discrete-time Takagi-Sugeno fuzzy Markovian jump systems (FMJSs) under imperfect premise matching rules. To strike a balance between initial feasible region, control performance, and online computation burden, a set of mode-dependent state feedback fuzzy controllers within the frame of dynamic prediction optimizing (DPO)-MPC is designed with the perturbation variables produced by the predictive dynamics. The DPO-MPC controllers are implemented in two stages: at the first stage, terminal constraint sets accompanied by feedback gains are obtained by solving a ``min-max'' problem; at the second stage, a set of perturbations is designed to enlarge the feasible region. Here, dynamic feedback gains are designed off-line using a matrix factorization technique, while the dynamic controller state is determined online over a moving horizon to gradually guide the system state from the initial feasible region to the terminal constraint set. Sufficient conditions are provided to rigorously ensure the recursive feasibility of the proposed DPO-MPC scheme and the mean-square stability of the underlying FMJS. Finally, the efficacy of the proposed methods is demonstrated through a robot arm system example.
Learning-Based Adaptive Dynamic Routing with Stability Guarantee for a Single-Origin-Single-Destination Network
We consider learning-based adaptive dynamic routing for a single-origin-single-destination queuing network with stability guarantees. Specifically, we study a class of generalized shortest path policies that can be parameterized by only two constants via a piecewise-linear function. Using the Foster-Lyapunov stability theory, we develop a criterion on the parameters to ensure mean boundedness of the traffic state. Then, we develop a policy iteration algorithm that learns the parameters from realized sample paths. Importantly, the piecewise-linear function is both integrated into the Lyapunov function for stability analysis and used as a proxy of the value function for policy iteration; hence, stability is inherently ensured for the learned policy. Finally, we demonstrate via a numerical example that the proposed algorithm learns a near-optimal routing policy with an acceptable optimality gap but significantly higher computational efficiency compared with a standard neural network-based algorithm.
Constructive Nonlinear Control of Underactuated Systems via Zero Dynamics Policies
Stabilizing underactuated systems is an inherently challenging control task due to fundamental limitations on how the control input affects the unactuated dynamics. Decomposing the system into actuated (output) and unactuated (zero) coordinates provides useful insight as to how input enters the system dynamics. In this work, we leverage the structure of this decomposition to formalize the idea of Zero Dynamics Policies (ZDPs) -- a mapping from the unactuated coordinates to desired actuated coordinates. Specifically, we show that a ZDP exists in a neighborhood of the origin, and prove that combining output stabilization with a ZDP results in stability of the full system state. We detail a constructive method of obtaining ZDPs in a neighborhood of the origin, and propose a learning-based approach which leverages optimal control to obtain ZDPs with much larger regions of attraction. We demonstrate that such a paradigm can be used to stabilize the canonical underactuated system of the cartpole, and showcase an improvement over the nominal performance of LQR.
comment: 8 pages, 2 figures, CDC 2024
Sub-Riemannian Geometry, Mixing, and the Holonomy of Optimal Mass Transport
The theory of Monge-Kantorovich Optimal Mass Transport (OMT) has in recent years spurred a fast developing phase of research in stochastic control, control of ensemble systems, thermodynamics, data science, and several other fields in engineering and science. Specifically, OMT endowed the space of probability distributions with a rich Riemannian-like geometry, the Wasserstein geometry and the Wasserstein $\mathcal W_2$-metric. This geometry proved fruitful in quantifying and regulating the uncertainty of deterministic and stochastic systems, and in dealing with problems related to the transport of ensembles in continuous and discrete spaces. We herein introduce a new type of transportation problems. The salient feature of these problems is that particles/agents in the ensemble, that are to be transported, are labeled and their relative position along their journey is of interest. Of particular importance in our program are closed orbits where particles return to their original place after being transported along closed paths. Thereby, control laws are sought so that the distribution of the ensemble traverses a closed orbit in the Wasserstein manifold without mixing. This feature is in contrast with the classical theory of optimal transport where the primary object of study is the path of probability densities, without any concern about mixing of the flow, which is expected and allowed when traversing curves in the Wasserstein space. In the theory that we present, we explore a hitherto unstudied sub-Riemannian structure of Monge-Kantorovich transport where the relative position of particles along their journey is modeled by the holonomy of the transportation schedule. From this vantage point, we discuss several other problems of independent interest.
comment: 16 pages, 8 figures
Energy Management for Prepaid Customers: A Linear Optimization Approach
With increasing energy prices, low income households are known to forego or minimize the use of electricity to save on energy costs. If a household is on a prepaid electricity program, it can be automatically and immediately disconnected from service if there is no balance in its prepaid account. Such households need to actively ration the amount of energy they use by deciding which appliances to use and for how long. We present a tool that helps households extend the availability of their critical appliances by limiting the use of discretionary ones, and prevent disconnections. The proposed method is based on a linear optimization problem that only uses average power demand as an input and can be solved to optimality using a simple greedy approach. We compare the model with two mixed-integer linear programming models that require more detailed demand forecasts and optimization solvers for implementation. In a numerical case study based on real household data, we assess the performance of the different models under different accuracy and granularity of demand forecasts. Our results show that our proposed linear model is much simpler to implement, while providing similar performance under realistic circumstances.
comment: Accepted to IEEE SmartGridComm (International Conference on Communications, Control, and Computing Technologies for Smart Grids) 2024, Oslo, Norway (7 pages, 4 figures)
Online Event-Triggered Switching for Frequency Control in Power Grids with Variable Inertia
The increasing integration of renewable energy resources into power grids has led to time-varying system inertia and consequent degradation in frequency dynamics. A promising solution to alleviate performance degradation is using power electronics interfaced energy resources, such as renewable generators and battery energy storage for primary frequency control, by adjusting their power output set-points in response to frequency deviations. However, designing a frequency controller under time-varying inertia is challenging. Specifically, the stability or optimality of controllers designed for time-invariant systems can be compromised once applied to a time-varying system. We model the frequency dynamics under time-varying inertia as a nonlinear switching system, where the frequency dynamics under each mode are described by the nonlinear swing equations and different modes represent different inertia levels. We identify a key controller structure, named Neural Proportional-Integral (Neural-PI) controller, that guarantees exponential input-to-state stability for each mode. To further improve performance, we present an online event-triggered switching algorithm to select the most suitable controller from a set of Neural-PI controllers, each optimized for specific inertia levels. Simulations on the IEEE 39-bus system validate the effectiveness of the proposed online switching control method with stability guarantees and optimized performance for frequency control under time-varying inertia.
Hybrid Plant Models Call for a Different Plant Modelling Paradigm and a New Generation of Software (Heresy in the land of moles, fractions, & rigorous physical properties)
This paper is an invitation to the process systems engineering community to change the paradigm for process plants. The goal is to achieve much easier convergence while retaining accuracy on par with the rigorous models. Accurate plant models of existing plants can be linear or much less nonlinear if they are based on mass component flows and stream properties per unit mass instead of molar flows and mole fractions. Accurate stream properties per unit mass can be calculated at stream-specific conditions by linear approximations, which in many instances eliminates mole fraction-based flash calculations. Hybrid data-driven node models fit naturally in this paradigm, since they use measured data, which is either in mass or in volumetric units, but never in moles. Instantiation of models at all levels of abstraction (planning, scheduling, optimization, and control models) from the same plant topology representation will ensure inheritance of solutions from mass-only to mass-and-energy to mass-and-energy-and-stream-properties, thereby ensuring consistency of solutions between these models. None of the existing software provides inheritance between different levels of plant abstraction (i.e. inheritance between models for different business applications) or different levels of abstraction per plant section or per time period, which motivates this exposition.
On Mobility Equity and the Promise of Emerging Transportation Systems
This paper introduces a mobility equity metric (MEM) for evaluating fairness and accessibility in multi-modal intelligent transportation systems. The MEM simultaneously accounts for service accessibility and transportation costs across different modes of transportation and social demographics. We provide a data-driven validation of the proposed MEM to characterize the impact of various parameters in the metric across cities in the U.S. We subsequently develop a routing framework that aims to optimize MEM within a transportation network containing both public transit and private vehicles. Within this framework, a system planner provides routing suggestions to vehicles across all modes of transportation to maximize MEM. We evaluate our approach through numerical simulations, analyzing the impact of travel demands and compliance of private vehicles. This work provides insights into designing transportation systems that are not only efficient but also equitable, ensuring fair access to essential services across diverse populations.
comment: 14 pages, 15 figures
Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning
Offline reinforcement learning (RL) is a promising approach for many control applications but faces challenges such as limited data coverage and value function overestimation. In this paper, we propose an implicit actor-critic (iAC) framework that employs optimization solution functions as a deterministic policy (actor) and a monotone function over the optimal value of optimization as a critic. By encoding optimality in the actor policy, we show that the learned policies are robust to the suboptimality of the learned actor parameters via the exponentially decaying sensitivity (EDS) property. We obtain performance guarantees for the proposed iAC framework and show its benefits over general function approximation schemes. Finally, we validate the proposed framework on two real-world applications and show a significant improvement over state-of-the-art (SOTA) offline RL methods.
comment: American Control Conference 2024
This is the Way: Mitigating the Roll of an Autonomous Uncrewed Surface Vessel in Wavy Conditions Using Model Predictive Control IROS
Though larger vessels may be well-equipped to deal with wavy conditions, smaller vessels are often more susceptible to disturbances. This paper explores the development of a nonlinear model predictive control (NMPC) system for Uncrewed Surface Vessels (USVs) in wavy conditions to minimize average roll. The NMPC is based on a prediction method that uses information about the vessel's dynamics and an assumed wave model. This method is able to mitigate the roll of an under-actuated USV in a variety of conditions by adjusting the weights of the cost function. The results show a reduction of 39% of average roll with a tuned controller in conditions with 1.75-metre sinusoidal waves. A general and intuitive tuning strategy is established. This preliminary work is a proof of concept which sets the stage for the leveraging of wave prediction methodologies to perform planning and control in real time for USVs in real-world scenarios and field trials.
comment: 6 pages, 10 figures. To appear in Proceedings of the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2024
Control-Informed Reinforcement Learning for Chemical Processes
This work proposes a control-informed reinforcement learning (CIRL) framework that integrates proportional-integral-derivative (PID) control components into the architecture of deep reinforcement learning (RL) policies. The proposed approach augments deep RL agents with a PID controller layer, incorporating prior knowledge from control theory into the learning process. CIRL improves performance and robustness by combining the best of both worlds: the disturbance-rejection and setpoint-tracking capabilities of PID control and the nonlinear modeling capacity of deep RL. Simulation studies conducted on a continuously stirred tank reactor system demonstrate the improved performance of CIRL compared to both conventional model-free deep RL and static PID controllers. CIRL exhibits better setpoint-tracking ability, particularly when generalizing to trajectories outside the training distribution, suggesting enhanced generalization capabilities. Furthermore, the embedded prior control knowledge within the CIRL policy improves its robustness to unobserved system disturbances. The control-informed RL framework combines the strengths of classical control and reinforcement learning to develop sample-efficient and robust deep reinforcement learning algorithms, with potential applications in complex industrial systems.
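To make the idea of a PID controller layer concrete, here is a minimal sketch of a discrete-time PID step whose gains are passed in at every call. It assumes the gains would be produced by a learned policy, which is one possible reading of the abstract and not a description of the authors' architecture. The class name `PIDLayer` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PIDLayer:
    """Discrete-time PID step with externally supplied gains (hypothetical sketch)."""

    dt: float                 # sampling period
    integral: float = 0.0     # running integral of the error
    prev_error: float = 0.0   # error at the previous step

    def step(self, setpoint: float, measurement: float,
             kp: float, ki: float, kd: float) -> float:
        # Tracking error between the target and the measured output.
        error = setpoint - measurement
        # Rectangular integration and backward-difference derivative.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # Control action; in an RL setting kp, ki, kd could come from a policy
        # network (an assumption made here for illustration only).
        return kp * error + ki * self.integral + kd * derivative


if __name__ == "__main__":
    pid = PIDLayer(dt=1.0)
    print(pid.step(setpoint=1.0, measurement=0.0, kp=2.0, ki=0.5, kd=0.0))
```

Keeping the integral and previous error as explicit state lets the same object be called once per control step, whatever process supplies the gains.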
Structured Deep Neural Networks-Based Backstepping Trajectory Tracking Control for Lagrangian Systems
Deep neural networks (DNNs) are increasingly being used to learn controllers due to their excellent approximation capabilities. However, their black-box nature poses significant challenges to closed-loop stability guarantees and performance analysis. In this paper, we introduce a structured DNN-based controller for the trajectory tracking control of Lagrangian systems using backstepping techniques. By properly designing neural network structures, the proposed controller can ensure closed-loop stability for any compatible neural network parameters. In addition, improved control performance can be achieved by further optimizing neural network parameters. Besides, we provide explicit upper bounds on tracking errors in terms of controller parameters, which allows us to achieve the desired tracking performance by properly selecting the controller parameters. Furthermore, when system models are unknown, we propose an improved Lagrangian neural network (LNN) structure to learn the system dynamics and design the controller. We show that in the presence of model approximation errors and external disturbances, the closed-loop stability and tracking control performance can still be guaranteed. The effectiveness of the proposed approach is demonstrated through simulations.
Baseline Results for Selected Nonlinear System Identification Benchmarks
Nonlinear system identification remains an important open challenge across research and academia. A large number of novel approaches are published each year, each presenting improvements or extensions to existing methods. It is natural, therefore, to consider how one might choose between these competing models. Benchmark datasets provide one clear way to approach this question. However, to make meaningful inferences based on benchmark performance, it is important to understand how well a new method performs relative to results obtained with well-established methods. This paper presents a set of ten baseline techniques and their relative performances on five popular benchmarks. The aim of this contribution is to stimulate thought and discussion regarding objective comparison of identification methodologies.
InPTC: Integrated Planning and Tube-Following Control for Prescribed-Time Collision-Free Navigation of Wheeled Mobile Robots
In this article, we propose a novel approach, called InPTC (Integrated Planning and Tube-Following Control), for prescribed-time collision-free navigation of wheeled mobile robots in a compact convex workspace cluttered with static, sufficiently separated, and convex obstacles. A path planner with prescribed-time convergence is presented based upon Bouligand's tangent cones and time scale transformation (TST) techniques, yielding a continuous vector field that can guide the robot from almost all initial positions in the free space to the designated goal at a prescribed time, while avoiding entering the obstacle regions augmented with safety margin. By leveraging barrier functions and TST, we further derive a tube-following controller to achieve robot trajectory tracking within a prescribed time less than the planner's settling time. This controller ensures the robot moves inside a predefined ``safe tube'' around the reference trajectory, where the tube radius is set to be less than the safety margin. Consequently, the robot will reach the goal location within a prescribed time while avoiding collision with any obstacles along the way. The proposed InPTC is implemented on a Mona robot operating in an arena cluttered with obstacles of various shapes. Experimental results demonstrate that InPTC not only generates smooth collision-free reference trajectories that converge to the goal location at the preassigned time of $250\,\rm s$ (i.e., the required task completion time), but also achieves tube-following trajectory tracking with tracking accuracy higher than $0.01\rm m$ after the preassigned time of $150\,\rm s$. This enables the robot to accomplish the navigation task within the required time of $250\,\rm s$.
Risk-aware Scheduling and Dispatch of Flexibility Events in Buildings
Residential and commercial buildings, equipped with systems such as heat pumps (HPs), hot water tanks, or stationary energy storage, have a large potential to offer their consumption flexibility as grid services. In this work, we leverage this flexibility to react to consumption requests related to maximizing self-consumption and reducing peak loads. We employ a data-driven virtual storage modeling approach for flexibility prediction in the form of flexibility envelopes for individual buildings. The risk-awareness of this prediction is inherited by the proposed scheduling algorithm. A Mixed-integer Linear Program (MILP) is formulated to schedule the activation of a pool of buildings in order to best respond to an external aggregated consumption request. This aggregated request is then dispatched to the active individual buildings, based on the previously determined schedule. The effectiveness of the approach is demonstrated by coordinating up to 500 simulated buildings using the Energym Python library and observing about 1.5 times peak power reduction in comparison with a baseline approach while maintaining comfort more robustly. We demonstrate the scalability of the approach by solving problems with 2000 buildings in about 21 seconds, with solving times being approximately linear in the number of considered assets.
Model-Free Unsupervised Anomaly detection framework in multivariate time-series of industrial dynamical systems
In this paper, a new model-free anomaly detection framework is proposed for time-series induced by industrial dynamical systems. The framework lies in the category of conventional approaches, which offer appealing features such as fast learning with a reduced amount of learning data, reduced memory, a high potential for explainability, and an easy incremental learning mechanism to incorporate operator feedback after an alarm is raised and analyzed. All of these are crucial features for the acceptance of data-driven solutions by industry, but they are rarely considered in comparisons between competing methods, which generally focus exclusively on performance metrics. Moreover, the feature engineering step involved in the proposed framework is inspired by the fact that the time-series are implicitly governed by physical laws, as is generally the case for industrial time-series. Two examples are given to assess the efficiency of the proposed approach.
comment: 25 pages, 2 tables, 12 figures, 1 appendix
An Interface Method for Co-simulation of EMT Model and Shifted Frequency EMT Model Based on Rotational Invariance Techniques
The shifted frequency-based electromagnetic transient (SFEMT) simulation has greatly improved the computational efficiency of traditional electromagnetic transient (EMT) simulation for the ac grid. This letter proposes a novel interface for the co-simulation of the SFEMT model and the traditional EMT model. The general form of SFEMT modeling and the principle of analytical signal construction are first derived. Then, an interface for the co-simulation of EMT and SFEMT simulation is proposed based on rotational invariance techniques. Theoretical analyses and test results demonstrate the effectiveness of the proposed method.
Submodular Maximization Approaches for Equitable Client Selection in Federated Learning
In a conventional Federated Learning framework, client selection for training typically involves the random sampling of a subset of clients in each iteration. However, this random selection often leads to disparate performance among clients, raising concerns regarding fairness, particularly in applications where equitable outcomes are crucial, such as in medical or financial machine learning tasks. This disparity typically becomes more pronounced with the advent of performance-centric client sampling techniques. This paper introduces two novel methods, namely SUBTRUNC and UNIONFL, designed to address the limitations of random client selection. Both approaches utilize submodular function maximization to achieve more balanced models. By modifying the facility location problem, they aim to mitigate the fairness concerns associated with random selection. SUBTRUNC leverages client loss information to diversify solutions, while UNIONFL relies on historical client selection data to ensure a more equitable performance of the final model. Moreover, these algorithms are accompanied by robust theoretical guarantees regarding convergence under reasonable assumptions. The efficacy of these methods is demonstrated through extensive evaluations across heterogeneous scenarios, revealing significant improvements in fairness as measured by a client dissimilarity metric.
comment: 13 pages
Scaling Learning based Policy Optimization for Temporal Logic Tasks by Controller Network Dropout
This paper introduces a model-based approach for training feedback controllers for an autonomous agent operating in a highly nonlinear (albeit deterministic) environment. We desire the trained policy to ensure that the agent satisfies specific task objectives and safety constraints, both expressed in Discrete-Time Signal Temporal Logic (DT-STL). One advantage of reformulating a task via formal frameworks, like DT-STL, is that it permits quantitative satisfaction semantics. In other words, given a trajectory and a DT-STL formula, we can compute the robustness, which can be interpreted as an approximate signed distance between the trajectory and the set of trajectories satisfying the formula. We utilize feedback control, and we assume a feedforward neural network for learning the feedback controller. We show how this learning problem is similar to training recurrent neural networks (RNNs), where the number of recurrent units is proportional to the temporal horizon of the agent's task objectives. This poses a challenge: RNNs are susceptible to vanishing and exploding gradients, and naïve gradient descent-based strategies to solve long-horizon task objectives thus suffer from the same problems. To tackle this challenge, we introduce a novel gradient approximation algorithm based on the idea of dropout or gradient sampling. One of the main contributions is the notion of controller network dropout, where we approximate the NN controller in several time-steps in the task horizon by the control input obtained using the controller in a previous training step. We show that our control synthesis methodology can be quite helpful for stochastic gradient descent to converge with fewer numerical issues, enabling scalable backpropagation over long time horizons and trajectories over high-dimensional state spaces.
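As a small illustration of the quantitative semantics mentioned in the abstract above, the sketch below computes the robustness of two elementary discrete-time formulas over a finite scalar trajectory: "always x > c" and "eventually x > c". It follows the standard min/max robustness semantics for temporal operators. The function names and the threshold predicate are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def rob_always_gt(traj: Sequence[float], c: float) -> float:
    """Robustness of 'always (x > c)': the worst-case margin x_t - c over the trajectory."""
    return min(x - c for x in traj)


def rob_eventually_gt(traj: Sequence[float], c: float) -> float:
    """Robustness of 'eventually (x > c)': the best-case margin x_t - c over the trajectory."""
    return max(x - c for x in traj)


if __name__ == "__main__":
    traj = [1.0, 0.5, 2.0, 1.5]
    # A positive value means the formula is satisfied with that margin;
    # a negative value means it is violated by that margin.
    print(rob_always_gt(traj, 0.0))
    print(rob_eventually_gt(traj, 3.0))
```

Because min and max are only piecewise differentiable, gradient-based training over long horizons needs care, which is the setting the abstract addresses.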
Robust Backstepping Control of a Quadrotor Unmanned Aerial Vehicle Under Colored Noises
Advances in software and hardware technologies have facilitated the production of quadrotor unmanned aerial vehicles (UAVs). Quadrotor UAVs are used in important missions such as search and rescue, counter-terrorism, firefighting, surveillance, and cargo transportation. While performing these tasks, quadrotors must operate in noisy environments. Therefore, a robust controller design that can control the altitude and attitude of the quadrotor in noisy environments is of great importance. While many researchers focus only on white Gaussian noise in their studies, all colored noises should be considered during the quadrotor's operation. This study aims to design a robust controller that is resistant to all colored noises. First, a nonlinear model of the quadrotor was created in MATLAB. Then, a backstepping control design that is resistant to colored noises was realized. The designed backstepping controller was tested under Gaussian white noise, pink noise, brown noise, blue noise, and purple noise. PID and Lyapunov-based controller designs were also carried out, and their time responses (rise time, overshoot, settling time) were compared with those of the backstepping controller. When the obtained values were examined, it was shown that the proposed backstepping controller had the least overshoot and the shortest settling time under all noise types.
comment: 15 pages, 24 figures
Stochastic-Robust Planning of Networked Hydrogen-Electrical Microgrids: A Study on Induced Refueling Demand
Hydrogen-electrical microgrids are increasingly assuming an important role on the pathway toward decarbonization of energy and transportation systems. This paper studies networked hydrogen-electrical microgrids planning (NHEMP), considering a critical but often-overlooked issue, i.e., the demand-inducing effect (DIE) associated with infrastructure development decisions. Specifically, higher refueling capacities will attract more refueling demand of hydrogen-powered vehicles (HVs). To capture such interactions between investment decisions and induced refueling demand, we introduce a decision-dependent uncertainty (DDU) set and build a trilevel stochastic-robust formulation. The upper-level determines optimal investment strategies for hydrogen-electrical microgrids, the lower-level optimizes the risk-aware operation schedules across a series of stochastic scenarios, and, for each scenario, the middle-level identifies the "worst" situation of refueling demand within an individual DDU set to ensure economic feasibility. Then, an adaptive and exact decomposition algorithm, based on Parametric Column-and-Constraint Generation (PC&CG), is customized and developed to address the computational challenge and to quantitatively analyze the impact of DIE. Case studies on an IEEE exemplary system validate the effectiveness of the proposed NHEMP model and the PC&CG algorithm. It is worth highlighting that DIE can make an important contribution to the economic benefits of NHEMP, yet its significance will gradually decrease when the main bottleneck transits to other system restrictions.
Systems and Control (EESS)
Data-enabled Predictive Repetitive Control
Many systems are subject to periodic disturbances and exhibit repetitive behaviour. Model-based repetitive control employs knowledge of such periodicity to attenuate periodic disturbances and has seen a wide range of successful industrial implementations. The aim of this paper is to develop a data-driven repetitive control method. In the developed framework, linear periodically time-varying (LPTV) behaviour is lifted to linear time-invariant (LTI) behaviour. Periodic disturbance mitigation is enabled by developing an extension of Willems' fundamental lemma for systems with exogenous disturbances. The resulting Data-enabled Predictive Repetitive Control (DeePRC) technique accounts for periodic system behaviour to perform attenuation of a periodic disturbance. Simulations demonstrate the ability of DeePRC to effectively mitigate periodic disturbances in the presence of noise.
comment: Extended report
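Data-driven predictive methods built on Willems' fundamental lemma typically stack recorded input/output data into Hankel matrices. The sketch below builds such a matrix from a scalar sequence in plain Python. It shows only the standard construction, not the lifted LPTV formulation or the disturbance extension developed in the paper, and the function name `hankel` is an illustrative choice.

```python
from typing import List, Sequence


def hankel(signal: Sequence[float], depth: int) -> List[List[float]]:
    """Build a Hankel matrix with `depth` rows from a scalar data sequence.

    Entry (i, j) equals signal[i + j], so each column is a length-`depth`
    window of the recorded data.
    """
    cols = len(signal) - depth + 1
    if depth < 1 or cols < 1:
        raise ValueError("depth must be between 1 and len(signal)")
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]


if __name__ == "__main__":
    for row in hankel([1, 2, 3, 4, 5], depth=3):
        print(row)
```

For multi-channel data, the same windowing would be applied per time step with vector-valued entries stacked row-wise.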
SpecGuard: Specification Aware Recovery for Robotic Autonomous Vehicles from Physical Attacks CCS'24
Robotic Autonomous Vehicles (RAVs) rely on their sensors for perception, and follow strict mission specifications (e.g., altitude, speed, and geofence constraints) for safe and timely operations. Physical attacks can corrupt the RAVs' sensors, resulting in mission failures. Recovering RAVs from such attacks demands robust control techniques that maintain compliance with mission specifications even under attacks to ensure the RAV's safety and timely operations. We propose SpecGuard, a technique that complies with mission specifications and performs safe recovery of RAVs. There are two innovations in SpecGuard. First, it introduces an approach to incorporate mission specifications and learn a recovery control policy using Deep Reinforcement Learning (Deep-RL). We design a compliance-based reward structure that reflects the RAV's complex dynamics and enables SpecGuard to satisfy multiple mission specifications simultaneously. Second, SpecGuard incorporates state reconstruction, a technique that minimizes attack-induced sensor perturbations. This reconstruction enables effective adversarial training and optimization of the recovery control policy for robustness under attacks. We evaluate SpecGuard in both virtual and real RAVs, and find that it achieves a 92% recovery success rate under attacks on different sensors, without any crashes or stalls. SpecGuard achieves 2X higher recovery success than prior work, and incurs about 15% performance overhead on real RAVs.
comment: CCS'24 (a shorter version of this paper will appear in the conference proceeding)
Data-driven distributionally robust MPC for systems with multiplicative noise: A semi-infinite semi-definite programming approach
This article introduces a novel distributionally robust model predictive control (DRMPC) algorithm for a specific class of controlled dynamical systems where the disturbance multiplies the state and control variables. These classes of systems arise in mathematical finance, where the paradigm of distributionally robust optimization (DRO) fits perfectly, and this serves as the primary motivation for this work. We recast the optimal control problem (OCP) as a semi-definite program with an infinite number of constraints, making the ensuing optimization problem a semi-infinite semi-definite program (SI-SDP). To numerically solve the SI-SDP, we advance an approach for solving convex semi-infinite programs (SIPs) to SI-SDPs and, subsequently, solve the DRMPC problem. A numerical example is provided to show the effectiveness of the algorithm.
comment: To appear in the proceedings of Mathematical Theory of Networks and Systems (MTNS) 2024
Applications in CityLearn Gym Environment for Multi-Objective Control Benchmarking in Grid-Interactive Buildings and Districts
It is challenging to coordinate multiple distributed energy resources in a single or multiple buildings to ensure efficient and flexible operation. Advanced control algorithms such as model predictive control and reinforcement learning control provide solutions to this problem by effectively managing a distribution of distributed energy resource control tasks while adapting to unique building characteristics, and cooperating towards improving multi-objective key performance indicators. Yet, a research gap for advanced control adoption is the ability to benchmark algorithm performance. CityLearn addresses this gap as an open-source Gym environment for the easy implementation and benchmarking of simple rule-based control and advanced algorithms, with the advantages of modeling simplicity, multi-agent control, district-level objectives, and control resiliency assessment. Here we demonstrate the functionalities of CityLearn using 17 different building control problems that have varying complexity with respect to the number of controllable distributed energy resources in buildings, the simplicity of the control algorithm, the control objective, and district size.
comment: To be published in IBPSA-USA SimBuild 2024 Conference
Multitone PSK Modulation Design for Simultaneous Wireless Information and Power Transfer
Far-field wireless power transfer, based on radio frequency (RF) waves, came into the picture to fulfill the power needs of large Internet of Things (IoT) networks, the backbone of the 5G and beyond era. However, RF communication signals carry both information and energy. Therefore, simultaneous wireless information and power transfer (SWIPT) has recently attracted much attention as a way to wirelessly charge these IoT devices. In this paper, we propose a novel N-tone multitone phase shift keying (PSK) modulation scheme, taking advantage of the non-linearity of the integrated receiver rectifier architecture. The main advantage of the proposed modulation scheme is the reduction in ripple voltage, introduced by the symbol transmission through phases. Achievable power conversion efficiency (PCE) and bit error rate (BER) at the output are considered to measure the efficacy of the proposed modulation scheme. Simulation results are verified by measurements on the designed rectifier circuitry. The effects of symbol phase range, modulation order, and the number of tones are analyzed. In the future, this transmission scheme can be utilized to satisfy the data and power requirements of low-power Internet of Things sensor networks.
Compact Pixelated Microstrip Forward Broadside Coupler Using Binary Particle Swarm Optimization
In this paper, a compact microstrip forward broadside coupler (MFBC) with high coupling level is proposed in the frequency band of 3.5-3.8 GHz. The coupler is composed of two parallel pixelated transmission lines. To validate the design strategy, the proposed MFBC is fabricated and measured. The measured results demonstrate a forward coupler with 3 dB coupling, and a compact size of $0.12\lambda_g \times 0.10\lambda_g$. Binary Particle Swarm Optimization (BPSO) design methodology and flexibility of pixelation enable us to optimize the proposed MFBC with desired coupling level and operating frequency within a fixed dimension. Also, low sensitivity to misalignment between two coupled TLs makes the proposed coupler a good candidate for near-field Wireless Power Transfer (WPT) application and sensors.
Distributed Planning for Rigid Robot Formations with Probabilistic Collision Avoidance
This paper presents a distributed method for robots moving in rigid formations while ensuring probabilistic collision avoidance between the robots. The formation is parametrised through the transformation of a base configuration. The robots map their desired velocities into a corresponding desired change in the formation parameters and apply a consensus step to reach agreement on the desired formation and a constraint satisfaction step to ensure collision avoidance within the formation. The constraint set is found such that the probability of collision remains below an upper bound. The method was demonstrated in a manual teleoperation scenario both in simulation and a real-world experiment.
Earth Observation Satellite Scheduling with Graph Neural Networks
Earth Observation Satellite Planning (EOSP) is a difficult optimization problem with considerable practical interest. A set of requested observations must be scheduled on an agile Earth observation satellite while respecting constraints on their visibility window, as well as maneuver constraints that impose varying delays between successive observations. In addition, the problem is largely oversubscribed: there are many more candidate observations than can possibly be achieved. Therefore, one must select the set of observations that will be performed while maximizing their weighted cumulative benefit, and propose a feasible schedule for these observations. As previous work mostly focused on heuristic and iterative search algorithms, this paper presents a new technique for selecting and scheduling observations based on Graph Neural Networks (GNNs) and Deep Reinforcement Learning (DRL). GNNs are used to extract relevant information from the graphs representing instances of the EOSP, and DRL drives the search for optimal schedules. Our simulations show that the proposed technique is able to learn on small problem instances and generalize to larger real-world instances, with very competitive performance compared to traditional approaches.
comment: Accepted at 17th European Workshop on Reinforcement Learning (EWRL 2024)
Fixed-time Disturbance Observer-Based MPC Robust Trajectory Tracking Control of Quadrotor
In this paper, a fixed-time disturbance observer-based model predictive control algorithm is proposed for trajectory tracking of a quadrotor in the presence of disturbances. First, a novel multivariable fixed-time disturbance observer is proposed to estimate the lumped disturbances. The bi-limit homogeneity and Lyapunov techniques are employed to ensure the convergence of the estimation error within a fixed convergence time, independent of the initial estimation error. Then, an observer-based model predictive control strategy is formulated to achieve robust trajectory tracking of the quadrotor, attenuating the lumped disturbances and model uncertainties. Finally, simulations and real-world experiments are provided to illustrate the effectiveness of the proposed method.
Domain-decoupled Physics-informed Neural Networks with Closed-form Gradients for Fast Model Learning of Dynamical Systems
Physics-informed neural networks (PINNs) are trained using physical equations and can also incorporate unmodeled effects by learning from data. PINNs for control (PINCs) of dynamical systems are gaining interest due to their prediction speed compared to classical numerical integration methods for nonlinear state-space models, making them suitable for real-time control applications. We introduce the domain-decoupled physics-informed neural network (DD-PINN) to address current limitations of PINC in handling large and complex nonlinear dynamic systems. The time domain is decoupled from the feed-forward neural network to construct an Ansatz function, allowing for calculation of gradients in closed form. This approach significantly reduces training times, especially for large dynamical systems, compared to PINC, which relies on graph-based automatic differentiation. Additionally, the DD-PINN inherently fulfills the initial condition and supports higher-order excitation inputs, simplifying the training process and enabling improved prediction accuracy. Validation on three systems - a nonlinear mass-spring-damper, a five-mass-chain, and a two-link robot - demonstrates that the DD-PINN achieves significantly shorter training times. In cases where the PINC's prediction diverges, the DD-PINN's prediction remains stable and accurate due to higher physics loss reduction or use of a higher-order excitation input. The DD-PINN allows for fast and accurate learning of large dynamical systems previously out of reach for the PINC.
comment: Accepted to International Conference on Informatics in Control, Automation and Robotics (ICINCO) 2024
Towards Safe Autonomous Intersection Management: Temporal Logic-based Safety Filters for Vehicle Coordination
In this paper, we introduce a temporal logic-based safety filter for Autonomous Intersection Management (AIM), an emerging infrastructure technology for connected vehicles to coordinate traffic flow through intersections. Despite substantial work on AIM systems, the balance between intersection safety and efficiency persists as a significant challenge. Building on recent developments in formal methods that now have become computationally feasible for AIM applications, we introduce an approach that starts with a temporal logic specification for the intersection and then uses reachability analysis to compute safe time-state corridors for the connected vehicles that pass through the intersection. By analyzing these corridors, in contrast to single trajectories, we can make explicit design decisions regarding safety-efficiency trade-offs while taking each vehicle's decision uncertainty into account. Additionally, we compute safe driving limits to ensure that vehicles remain within their designated safe corridors. Combining these elements, we develop a service that provides safety filters for AIM coordination of connected vehicles. We evaluate the practical feasibility of our safety framework using a simulated 4-way intersection, showing that our approach performs in real-time for multiple scenarios.
comment: To be published in 27th IEEE International Conference on Intelligent Transportation Systems
Model Predictive Control for T-S Fuzzy Markovian Jump Systems Using Dynamic Prediction Optimization
In this paper, the model predictive control (MPC) problem is investigated for constrained discrete-time Takagi-Sugeno fuzzy Markovian jump systems (FMJSs) under imperfect premise matching rules. To strike a balance between the initial feasible region, control performance, and online computation burden, a set of mode-dependent state feedback fuzzy controllers within the framework of dynamic prediction optimizing (DPO)-MPC is designed with the perturbation variables produced by the predictive dynamics. The DPO-MPC controllers are implemented in two stages: at the first stage, terminal constraint sets accompanied by feedback gains are obtained by solving a ``min-max'' problem; at the second stage, a set of perturbations is designed to enlarge the feasible region. Here, dynamic feedback gains are designed offline using a matrix factorization technique, while the dynamic controller state is determined online over a moving horizon to gradually guide the system state from the initial feasible region to the terminal constraint set. Sufficient conditions are provided to rigorously ensure the recursive feasibility of the proposed DPO-MPC scheme and the mean-square stability of the underlying FMJS. Finally, the efficacy of the proposed methods is demonstrated through a robot arm system example.
Learning-Based Adaptive Dynamic Routing with Stability Guarantee for a Single-Origin-Single-Destination Network
We consider learning-based adaptive dynamic routing for a single-origin-single-destination queuing network with stability guarantees. Specifically, we study a class of generalized shortest path policies that can be parameterized by only two constants via a piecewise-linear function. Using the Foster-Lyapunov stability theory, we develop a criterion on the parameters to ensure mean boundedness of the traffic state. Then, we develop a policy iteration algorithm that learns the parameters from realized sample paths. Importantly, the piecewise-linear function is both integrated into the Lyapunov function for stability analysis and used as a proxy of the value function for policy iteration; hence, stability is inherently ensured for the learned policy. Finally, we demonstrate via a numerical example that the proposed algorithm learns a near-optimal routing policy with an acceptable optimality gap but significantly higher computational efficiency compared with a standard neural network-based algorithm.
Constructive Nonlinear Control of Underactuated Systems via Zero Dynamics Policies
Stabilizing underactuated systems is an inherently challenging control task due to fundamental limitations on how the control input affects the unactuated dynamics. Decomposing the system into actuated (output) and unactuated (zero) coordinates provides useful insight as to how input enters the system dynamics. In this work, we leverage the structure of this decomposition to formalize the idea of Zero Dynamics Policies (ZDPs) -- a mapping from the unactuated coordinates to desired actuated coordinates. Specifically, we show that a ZDP exists in a neighborhood of the origin, and prove that combining output stabilization with a ZDP results in stability of the full system state. We detail a constructive method of obtaining ZDPs in a neighborhood of the origin, and propose a learning-based approach which leverages optimal control to obtain ZDPs with much larger regions of attraction. We demonstrate that such a paradigm can be used to stabilize the canonical underactuated system of the cartpole, and showcase an improvement over the nominal performance of LQR.
comment: 8 pages, 2 figures, CDC 2024
Sub-Riemannian Geometry, Mixing, and the Holonomy of Optimal Mass Transport
The theory of Monge-Kantorovich Optimal Mass Transport (OMT) has in recent years spurred a fast-developing phase of research in stochastic control, control of ensemble systems, thermodynamics, data science, and several other fields in engineering and science. Specifically, OMT endowed the space of probability distributions with a rich Riemannian-like geometry, the Wasserstein geometry and the Wasserstein $\mathcal W_2$-metric. This geometry proved fruitful in quantifying and regulating the uncertainty of deterministic and stochastic systems, and in dealing with problems related to the transport of ensembles in continuous and discrete spaces. We herein introduce a new type of transportation problem. The salient feature of these problems is that the particles/agents in the ensemble that are to be transported are labeled, and their relative position along their journey is of interest. Of particular importance in our program are closed orbits where particles return to their original place after being transported along closed paths. Thereby, control laws are sought so that the distribution of the ensemble traverses a closed orbit in the Wasserstein manifold without mixing. This feature is in contrast with the classical theory of optimal transport where the primary object of study is the path of probability densities, without any concern about mixing of the flow, which is expected and allowed when traversing curves in the Wasserstein space. In the theory that we present, we explore a hitherto unstudied sub-Riemannian structure of Monge-Kantorovich transport where the relative position of particles along their journey is modeled by the holonomy of the transportation schedule. From this vantage point, we discuss several other problems of independent interest.
comment: 16 pages, 8 figures
Energy Management for Prepaid Customers: A Linear Optimization Approach
With increasing energy prices, low income households are known to forego or minimize the use of electricity to save on energy costs. If a household is on a prepaid electricity program, it can be automatically and immediately disconnected from service if there is no balance in its prepaid account. Such households need to actively ration the amount of energy they use by deciding which appliances to use and for how long. We present a tool that helps households extend the availability of their critical appliances by limiting the use of discretionary ones, and prevent disconnections. The proposed method is based on a linear optimization problem that only uses average power demand as an input and can be solved to optimality using a simple greedy approach. We compare the model with two mixed-integer linear programming models that require more detailed demand forecasts and optimization solvers for implementation. In a numerical case study based on real household data, we assess the performance of the different models under different accuracy and granularity of demand forecasts. Our results show that our proposed linear model is much simpler to implement, while providing similar performance under realistic circumstances.
comment: Accepted to IEEE SmartGridComm (International Conference on Communications, Control, and Computing Technologies for Smart Grids) 2024, Oslo, Norway (7 pages, 4 figures)
Online Event-Triggered Switching for Frequency Control in Power Grids with Variable Inertia
The increasing integration of renewable energy resources into power grids has led to time-varying system inertia and consequent degradation in frequency dynamics. A promising solution to alleviate performance degradation is using power electronics interfaced energy resources, such as renewable generators and battery energy storage for primary frequency control, by adjusting their power output set-points in response to frequency deviations. However, designing a frequency controller under time-varying inertia is challenging. Specifically, the stability or optimality of controllers designed for time-invariant systems can be compromised once applied to a time-varying system. We model the frequency dynamics under time-varying inertia as a nonlinear switching system, where the frequency dynamics under each mode are described by the nonlinear swing equations and different modes represent different inertia levels. We identify a key controller structure, named Neural Proportional-Integral (Neural-PI) controller, that guarantees exponential input-to-state stability for each mode. To further improve performance, we present an online event-triggered switching algorithm to select the most suitable controller from a set of Neural-PI controllers, each optimized for specific inertia levels. Simulations on the IEEE 39-bus system validate the effectiveness of the proposed online switching control method with stability guarantees and optimized performance for frequency control under time-varying inertia.
Hybrid Plant Models Call for a Different Plant Modelling Paradigm and a New Generation of Software (Heresy in the land of moles, fractions, & rigorous physical properties)
This paper is an invitation to the process systems engineering community to change the paradigm for process plants. The goal is to achieve much easier convergence while retaining accuracy on par with the rigorous models. Accurate plant models of existing plants can be linear or much less nonlinear if they are based on mass component flows and stream properties per unit mass instead of molar flows and mole fractions. Accurate stream properties per unit mass can be calculated at stream-specific conditions by linear approximations, which in many instances eliminate mole fraction-based flash calculations. Hybrid data-driven node models fit naturally in this paradigm, since they use measured data, which is either in mass or in volumetric units, but never in moles. Instantiation of models at all levels of abstraction (planning, scheduling, optimization, and control models) from the same plant topology representation will ensure inheritance of solutions from mass-only to mass-and-energy to mass-and-energy-and-stream-properties, thereby ensuring consistency of solutions between these models. None of the existing software provides inheritance between different levels of plant abstraction (i.e. inheritance between models for different business applications) or different levels of abstraction per plant section or per time period, which motivates this exposition.
On Mobility Equity and the Promise of Emerging Transportation Systems
This paper introduces a mobility equity metric (MEM) for evaluating fairness and accessibility in multi-modal intelligent transportation systems. The MEM simultaneously accounts for service accessibility and transportation costs across different modes of transportation and social demographics. We provide a data-driven validation of the proposed MEM to characterize the impact of various parameters in the metric across cities in the U.S. We subsequently develop a routing framework that aims to optimize MEM within a transportation network containing both public transit and private vehicles. Within this framework, a system planner provides routing suggestions to vehicles across all modes of transportation to maximize MEM. We evaluate our approach through numerical simulations, analyzing the impact of travel demands and compliance of private vehicles. This work provides insights into designing transportation systems that are not only efficient but also equitable, ensuring fair access to essential services across diverse populations.
comment: 14 pages, 15 figures
Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning
Offline reinforcement learning (RL) is a promising approach for many control applications but faces challenges such as limited data coverage and value function overestimation. In this paper, we propose an implicit actor-critic (iAC) framework that employs optimization solution functions as a deterministic policy (actor) and a monotone function over the optimal value of optimization as a critic. By encoding optimality in the actor policy, we show that the learned policies are robust to the suboptimality of the learned actor parameters via the exponentially decaying sensitivity (EDS) property. We obtain performance guarantees for the proposed iAC framework and show its benefits over general function approximation schemes. Finally, we validate the proposed framework on two real-world applications and show a significant improvement over state-of-the-art (SOTA) offline RL methods.
comment: American Control Conference 2024
This is the Way: Mitigating the Roll of an Autonomous Uncrewed Surface Vessel in Wavy Conditions Using Model Predictive Control IROS
Though larger vessels may be well-equipped to deal with wavy conditions, smaller vessels are often more susceptible to disturbances. This paper explores the development of a nonlinear model predictive control (NMPC) system for Uncrewed Surface Vessels (USVs) in wavy conditions to minimize average roll. The NMPC is based on a prediction method that uses information about the vessel's dynamics and an assumed wave model. This method is able to mitigate the roll of an under-actuated USV in a variety of conditions by adjusting the weights of the cost function. The results show a reduction of 39% of average roll with a tuned controller in conditions with 1.75-metre sinusoidal waves. A general and intuitive tuning strategy is established. This preliminary work is a proof of concept which sets the stage for the leveraging of wave prediction methodologies to perform planning and control in real time for USVs in real-world scenarios and field trials.
comment: 6 pages, 10 figures. To appear in Proceedings of the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2024
Control-Informed Reinforcement Learning for Chemical Processes
This work proposes a control-informed reinforcement learning (CIRL) framework that integrates proportional-integral-derivative (PID) control components into the architecture of deep reinforcement learning (RL) policies. The proposed approach augments deep RL agents with a PID controller layer, incorporating prior knowledge from control theory into the learning process. CIRL improves performance and robustness by combining the best of both worlds: the disturbance-rejection and setpoint-tracking capabilities of PID control and the nonlinear modeling capacity of deep RL. Simulation studies conducted on a continuously stirred tank reactor system demonstrate the improved performance of CIRL compared to both conventional model-free deep RL and static PID controllers. CIRL exhibits better setpoint-tracking ability, particularly when generalizing to trajectories outside the training distribution, suggesting enhanced generalization capabilities. Furthermore, the embedded prior control knowledge within the CIRL policy improves its robustness to unobserved system disturbances. The control-informed RL framework combines the strengths of classical control and reinforcement learning to develop sample-efficient and robust deep reinforcement learning algorithms, with potential applications in complex industrial systems.
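The idea of a policy that feeds a PID layer can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `PIDLayer` class, the `toy_policy` function (a fixed-gain stand-in for a learned network), and the first-order plant are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIDLayer:
    """Discrete-time PID law whose gains are supplied from outside at every step."""
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, kp: float, ki: float, kd: float) -> float:
        # Accumulate the integral term and take a backward-difference derivative.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative


def toy_policy(observation):
    """Hypothetical stand-in for a learned network: returns fixed PID gains."""
    return 2.0, 2.0, 0.1


def run(setpoint: float = 1.0, steps: int = 200, dt: float = 0.05) -> float:
    """Simulate an assumed first-order plant dx/dt = -x + u with explicit Euler."""
    x = 0.0
    pid = PIDLayer(dt=dt)
    for _ in range(steps):
        error = setpoint - x
        kp, ki, kd = toy_policy((x, setpoint))  # policy outputs gains, not actions
        u = pid.step(error, kp, ki, kd)         # PID layer turns gains into a control input
        x += dt * (-x + u)
    return x
```

In a learned setting, `toy_policy` would be replaced by a trainable network that maps observations to gains, so the control-theoretic structure stays fixed while the gains adapt.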
Structured Deep Neural Networks-Based Backstepping Trajectory Tracking Control for Lagrangian Systems
Deep neural networks (DNNs) are increasingly being used to learn controllers due to their excellent approximation capabilities. However, their black-box nature poses significant challenges to closed-loop stability guarantees and performance analysis. In this paper, we introduce a structured DNN-based controller for the trajectory tracking control of Lagrangian systems using backstepping techniques. By properly designing neural network structures, the proposed controller can ensure closed-loop stability for any compatible neural network parameters. In addition, improved control performance can be achieved by further optimizing neural network parameters. Besides, we provide explicit upper bounds on tracking errors in terms of controller parameters, which allows us to achieve the desired tracking performance by properly selecting the controller parameters. Furthermore, when system models are unknown, we propose an improved Lagrangian neural network (LNN) structure to learn the system dynamics and design the controller. We show that in the presence of model approximation errors and external disturbances, the closed-loop stability and tracking control performance can still be guaranteed. The effectiveness of the proposed approach is demonstrated through simulations.
Baseline Results for Selected Nonlinear System Identification Benchmarks
Nonlinear system identification remains an important open challenge across research and academia. Large numbers of novel approaches are published each year, each presenting improvements or extensions to existing methods. It is natural, therefore, to consider how one might choose between these competing models. Benchmark datasets provide one clear way to approach this question. However, to make meaningful inferences based on benchmark performance, it is important to understand how well a new method performs compared with results available from well-established methods. This paper presents a set of ten baseline techniques and their relative performances on five popular benchmarks. The aim of this contribution is to stimulate thought and discussion regarding objective comparison of identification methodologies.
InPTC: Integrated Planning and Tube-Following Control for Prescribed-Time Collision-Free Navigation of Wheeled Mobile Robots
In this article, we propose a novel approach, called InPTC (Integrated Planning and Tube-Following Control), for prescribed-time collision-free navigation of wheeled mobile robots in a compact convex workspace cluttered with static, sufficiently separated, and convex obstacles. A path planner with prescribed-time convergence is presented based upon Bouligand's tangent cones and time scale transformation (TST) techniques, yielding a continuous vector field that can guide the robot from almost all initial positions in the free space to the designated goal at a prescribed time, while avoiding entering the obstacle regions augmented with safety margin. By leveraging barrier functions and TST, we further derive a tube-following controller to achieve robot trajectory tracking within a prescribed time less than the planner's settling time. This controller ensures the robot moves inside a predefined ``safe tube'' around the reference trajectory, where the tube radius is set to be less than the safety margin. Consequently, the robot will reach the goal location within a prescribed time while avoiding collision with any obstacles along the way. The proposed InPTC is implemented on a Mona robot operating in an arena cluttered with obstacles of various shapes. Experimental results demonstrate that InPTC not only generates smooth collision-free reference trajectories that converge to the goal location at the preassigned time of $250\,\rm s$ (i.e., the required task completion time), but also achieves tube-following trajectory tracking with tracking accuracy higher than $0.01\rm m$ after the preassigned time of $150\,\rm s$. This enables the robot to accomplish the navigation task within the required time of $250\,\rm s$.
Risk-aware Scheduling and Dispatch of Flexibility Events in Buildings
Residential and commercial buildings, equipped with systems such as heat pumps (HPs), hot water tanks, or stationary energy storage, have a large potential to offer their consumption flexibility as grid services. In this work, we leverage this flexibility to react to consumption requests related to maximizing self-consumption and reducing peak loads. We employ a data-driven virtual storage modeling approach for flexibility prediction in the form of flexibility envelopes for individual buildings. The risk-awareness of this prediction is inherited by the proposed scheduling algorithm. A Mixed-integer Linear Program (MILP) is formulated to schedule the activation of a pool of buildings in order to best respond to an external aggregated consumption request. This aggregated request is then dispatched to the active individual buildings, based on the previously determined schedule. The effectiveness of the approach is demonstrated by coordinating up to 500 simulated buildings using the Energym Python library and observing about 1.5 times peak power reduction in comparison with a baseline approach while maintaining comfort more robustly. We demonstrate the scalability of the approach by solving problems with 2000 buildings in about 21 seconds, with solving times being approximately linear in the number of considered assets.
Model-Free Unsupervised Anomaly detection framework in multivariate time-series of industrial dynamical systems
In this paper, a new model-free anomaly detection framework is proposed for time-series induced by industrial dynamical systems. The framework lies in the category of conventional approaches, which offer appealing features such as fast learning with a reduced amount of learning data, reduced memory, a high potential for explainability, and an easy incremental learning mechanism to incorporate operator feedback after an alarm is raised and analyzed. All these are crucial features towards acceptance of data-driven solutions by industry, but they are rarely considered in the comparisons between competing methods, which generally focus exclusively on performance metrics. Moreover, the feature engineering step involved in the proposed framework is inspired by the time-series being implicitly governed by physical laws, as is generally the case in industrial time-series. Two examples are given to assess the efficiency of the proposed approach.
comment: 25 pages, 2 tables, 12 figures, 1 appendix
An Interface Method for Co-simulation of EMT Model and Shifted Frequency EMT Model Based on Rotational Invariance Techniques
The shifted frequency-based electromagnetic transient (SFEMT) simulation has greatly improved the computational efficiency of traditional electromagnetic transient (EMT) simulation for the ac grid. This letter proposes a novel interface for the co-simulation of the SFEMT model and the traditional EMT model. The general form of SFEMT modeling and the principle of analytical signal construction are first derived. Then, an interface for the co-simulation of EMT and SFEMT simulation is proposed based on rotational invariance techniques. Theoretical analyses and test results demonstrate the effectiveness of the proposed method.
Submodular Maximization Approaches for Equitable Client Selection in Federated Learning
In a conventional Federated Learning framework, client selection for training typically involves the random sampling of a subset of clients in each iteration. However, this random selection often leads to disparate performance among clients, raising concerns regarding fairness, particularly in applications where equitable outcomes are crucial, such as in medical or financial machine learning tasks. This disparity typically becomes more pronounced with the advent of performance-centric client sampling techniques. This paper introduces two novel methods, namely SUBTRUNC and UNIONFL, designed to address the limitations of random client selection. Both approaches utilize submodular function maximization to achieve more balanced models. By modifying the facility location problem, they aim to mitigate the fairness concerns associated with random selection. SUBTRUNC leverages client loss information to diversify solutions, while UNIONFL relies on historical client selection data to ensure a more equitable performance of the final model. Moreover, these algorithms are accompanied by robust theoretical guarantees regarding convergence under reasonable assumptions. The efficacy of these methods is demonstrated through extensive evaluations across heterogeneous scenarios, revealing significant improvements in fairness as measured by a client dissimilarity metric.
comment: 13 pages
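To make the submodular-maximization idea concrete, here is a minimal Python sketch of the classical greedy algorithm on the standard (unmodified) facility location objective. It is not SUBTRUNC or UNIONFL; the function names and the similarity matrix are hypothetical, and non-negative similarities are assumed.

```python
def facility_location_value(similarity, selected):
    """f(S) = sum over clients i of the best similarity between i and any selected client."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in similarity)


def greedy_select(similarity, k):
    """Pick up to k clients, each time adding the one with the largest marginal gain."""
    n = len(similarity)
    selected = []
    for _ in range(min(k, n)):
        current = facility_location_value(similarity, selected)
        best_j, best_gain = None, float("-inf")
        for j in range(n):
            if j in selected:
                continue
            gain = facility_location_value(similarity, selected + [j]) - current
            if gain > best_gain:
                best_j, best_gain = j, gain
        selected.append(best_j)
    return selected


# Hypothetical similarities: clients 0-1 form one cluster, clients 2-3 another.
similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
chosen = greedy_select(similarity, 2)
```

With two picks, the greedy rule selects one representative from each cluster, which is the diversity effect that makes facility-location-style objectives attractive for balanced client selection.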
Scaling Learning based Policy Optimization for Temporal Logic Tasks by Controller Network Dropout
This paper introduces a model-based approach for training feedback controllers for an autonomous agent operating in a highly nonlinear (albeit deterministic) environment. We desire the trained policy to ensure that the agent satisfies specific task objectives and safety constraints, both expressed in Discrete-Time Signal Temporal Logic (DT-STL). One advantage of reformulating a task via formal frameworks, like DT-STL, is that it permits quantitative satisfaction semantics. In other words, given a trajectory and a DT-STL formula, we can compute the robustness, which can be interpreted as an approximate signed distance between the trajectory and the set of trajectories satisfying the formula. We utilize feedback control, and we assume a feedforward neural network for learning the feedback controller. We show how this learning problem is similar to training recurrent neural networks (RNNs), where the number of recurrent units is proportional to the temporal horizon of the agent's task objectives. This poses a challenge: RNNs are susceptible to vanishing and exploding gradients, and naïve gradient descent-based strategies to solve long-horizon task objectives thus suffer from the same problems. To tackle this challenge, we introduce a novel gradient approximation algorithm based on the idea of dropout or gradient sampling. One of the main contributions is the notion of controller network dropout, where we approximate the NN controller in several time-steps in the task horizon by the control input obtained using the controller in a previous training step. We show that our control synthesis methodology can be quite helpful for stochastic gradient descent to converge with fewer numerical issues, enabling scalable backpropagation over long time horizons and trajectories over high-dimensional state spaces.
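The quantitative semantics mentioned above can be illustrated with a short Python sketch for two simple formulas over a scalar discrete-time signal. It uses the standard robustness definitions for "always" and "eventually" applied to the predicate x > c; the function names are hypothetical and the sketch does not reproduce the paper's training algorithm.

```python
def robustness_always_greater(trajectory, threshold):
    """Robustness of G(x > c): the worst-case margin x_t - c over the whole horizon."""
    return min(x - threshold for x in trajectory)


def robustness_eventually_greater(trajectory, threshold):
    """Robustness of F(x > c): the best-case margin x_t - c over the whole horizon."""
    return max(x - threshold for x in trajectory)


trajectory = [1.0, 2.0, 0.5]

# Positive robustness: the formula is satisfied, with that much margin.
always_above_zero = robustness_always_greater(trajectory, 0.0)

# Negative robustness: the formula is violated, by that much.
always_above_one = robustness_always_greater(trajectory, 1.0)

eventually_above_one = robustness_eventually_greater(trajectory, 1.0)
```

The sign of the value indicates satisfaction or violation, and its magnitude indicates by how much, which is what allows robustness to serve as a training objective.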
Robust Backstepping Control of a Quadrotor Unmanned Aerial Vehicle Under Colored Noises
Advances in software and hardware technologies have facilitated the production of quadrotor unmanned aerial vehicles (UAVs). Quadrotor UAVs are used in important missions such as search and rescue, counter-terrorism, firefighting, surveillance and cargo transportation. While performing these tasks, quadrotors must operate in noisy environments. Therefore, a robust controller design that can control the altitude and attitude of the quadrotor in noisy environments is of great importance. While many researchers focus only on white Gaussian noise in their studies, all colored noises should be considered during the quadrotor's operation. In this study, we aim to design a robust controller that is resistant to all colored noises. Firstly, a nonlinear model of the quadrotor was created with MATLAB. Then, a backstepping control design that is resistant to colored noises was realized. The designed backstepping controller was tested under Gaussian white noise, pink noise, brown noise, blue noise and purple noise. PID and Lyapunov-based controller designs were also carried out and their time responses (rise time, overshoot, settling time) were compared with those of the backstepping controller. When the obtained values were examined, the proposed backstepping controller was found to have the least overshoot and the shortest settling time under all noise types.
comment: 15 pages, 24 figures
Stochastic-Robust Planning of Networked Hydrogen-Electrical Microgrids: A Study on Induced Refueling Demand
Hydrogen-electrical microgrids are increasingly assuming an important role on the pathway toward decarbonization of energy and transportation systems. This paper studies networked hydrogen-electrical microgrids planning (NHEMP), considering a critical but often-overlooked issue, i.e., the demand-inducing effect (DIE) associated with infrastructure development decisions. Specifically, higher refueling capacities will attract more refueling demand from hydrogen-powered vehicles (HVs). To capture such interactions between investment decisions and induced refueling demand, we introduce a decision-dependent uncertainty (DDU) set and build a trilevel stochastic-robust formulation. The upper level determines optimal investment strategies for hydrogen-electrical microgrids, the lower level optimizes the risk-aware operation schedules across a series of stochastic scenarios, and, for each scenario, the middle level identifies the "worst" situation of refueling demand within an individual DDU set to ensure economic feasibility. Then, an adaptive and exact decomposition algorithm, based on Parametric Column-and-Constraint Generation (PC&CG), is customized and developed to address the computational challenge and to quantitatively analyze the impact of DIE. Case studies on an IEEE exemplary system validate the effectiveness of the proposed NHEMP model and the PC&CG algorithm. It is worth highlighting that DIE can make an important contribution to the economic benefits of NHEMP, yet its significance will gradually decrease when the main bottleneck shifts to other system restrictions.
Robotics
Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning
Humanoid robots, with their human-like skeletal structure, are especially suited for tasks in human-centric environments. However, this structure is accompanied by additional challenges in locomotion controller design, especially in complex real-world environments. As a result, existing humanoid robots are limited to relatively simple terrains, either with model-based control or model-free reinforcement learning. In this work, we introduce Denoising World Model Learning (DWL), an end-to-end reinforcement learning framework for humanoid locomotion control, which demonstrates the world's first humanoid robot to master real-world challenging terrains such as snowy and inclined land in the wild, up and down stairs, and extremely uneven terrains. All scenarios run the same learned neural network with zero-shot sim-to-real transfer, indicating the superior robustness and generalization capability of the proposed method.
comment: Robotics: Science and Systems (RSS), 2024. (Best Paper Award Finalist)
GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy
The robotics community has consistently aimed to achieve generalizable robot manipulation with flexible natural language instructions. One of the primary challenges is that obtaining robot data fully annotated with both actions and texts is time-consuming and labor-intensive. However, partially annotated data, such as human activity videos without action labels and robot play data without language labels, is much easier to collect. Can we leverage these data to enhance the generalization capability of robots? In this paper, we propose GR-MG, a novel method which supports conditioning on both a language instruction and a goal image. During training, GR-MG samples goal images from trajectories and conditions on both the text and the goal image, or solely on the image when text is unavailable. During inference, where only the text is provided, GR-MG generates the goal image via a diffusion-based image-editing model and conditions on both the text and the generated image. This approach enables GR-MG to leverage large amounts of partially annotated data while still using language to flexibly specify tasks. To generate accurate goal images, we propose a novel progress-guided goal image generation model which injects task progress information into the generation process, significantly improving the fidelity and the performance. In simulation experiments, GR-MG improves the average number of tasks completed in a row of 5 from 3.35 to 4.04. In real-robot experiments, GR-MG is able to perform 47 different tasks and improves the success rate from 62.5% to 75.0% and 42.4% to 57.6% in simple and generalization settings, respectively. Code and checkpoints will be available at the project page: https://gr-mg.github.io/.
comment: 9 pages, 7 figures, letter
Model Predictive Parkour Control of a Monoped Hopper in Dynamically Changing Environments
A great advantage of legged robots is their ability to operate on particularly difficult and obstructed terrain, which demands dynamic, robust, and precise movements. The study of obstacle courses provides invaluable insights into the challenges legged robots face, offering a controlled environment to assess and enhance their capabilities. Traversing such a course with a one-legged hopper introduces intricate challenges, such as planning over contacts and dealing with flight phases, which necessitate a sophisticated controller. A novel model predictive parkour controller is introduced that finds an optimal path through an obstacle course that changes in real time, using mixed-integer motion planning. The execution of this optimized path is then achieved through a state machine employing a PD control scheme with feedforward torques, ensuring robust and accurate performance.
comment: Published in: IEEE Robotics and Automation Letters
Functional kinematic and kinetic requirements of the upper limb during activities of daily living: a recommendation on necessary joint capabilities for prosthetic arms IROS 2024
Prosthetic limb abandonment remains an unsolved challenge as amputees consistently reject their devices. Current prosthetic designs often fail to balance human-like performance with acceptable device weight, highlighting the need for optimised designs tailored to modern tasks. This study aims to provide a comprehensive dataset of joint kinematics and kinetics essential for performing activities of daily living (ADL), thereby informing the design of more functional and user-friendly prosthetic devices. Functionally required Ranges of Motion (ROM), velocities, and torques for the Glenohumeral (rotation), elbow, Radioulnar, and wrist joints were computed using motion capture data from 12 subjects performing 24 ADLs. Our approach included the computation of joint torques for varying mass and inertia properties of the upper limb, while torques induced by the manipulation of experimental objects were considered by their interaction wrench with the subject's hand. Joint torques pertaining to individual ADLs scaled linearly with limb and object mass and mass distribution, permitting their generalisation to limb and object dynamics that were not explicitly simulated, using linear regressors (LRM) exhibiting coefficients of determination $R^2 = 0.99 \pm 0.01$. Exemplifying an application of data-driven prosthesis design, we optimise wrist axes orientations for two serial and two differential joint configurations. Optimised axes reduced peak power requirements by 22 to 38 percent compared to anatomical configurations, by exploiting high torque correlations between Ulnar deviation and wrist flexion/extension joints. This study offers critical insights into the functional requirements of upper limb prostheses, providing a valuable foundation for data-driven prosthetic design that addresses key user concerns and enhances device adoption.
comment: Accepted at IROS 2024
Equivariant Reinforcement Learning under Partial Observability
Incorporating inductive biases is a promising approach for tackling challenging robot learning domains with sample-efficient solutions. This paper identifies partially observable domains where symmetries can be a useful inductive bias for efficient learning. Specifically, by encoding the equivariance regarding specific group symmetries into the neural networks, our actor-critic reinforcement learning agents can reuse solutions in the past for related scenarios. Consequently, our equivariant agents outperform non-equivariant approaches significantly in terms of sample efficiency and final performance, demonstrated through experiments on a range of robotic tasks in simulation and real hardware.
comment: Conference on Robot Learning, 2023
Visuo-Tactile Exploration of Unknown Rigid 3D Curvatures by Vision-Augmented Unified Force-Impedance Control IROS 2024
Despite recent advancements in torque-controlled tactile robots, integrating them into manufacturing settings remains challenging, particularly in complex environments. Simplifying robotic skill programming for non-experts is crucial for increasing robot deployment in manufacturing. This work proposes an innovative approach, Vision-Augmented Unified Force-Impedance Control (VA-UFIC), aimed at intuitive visuo-tactile exploration of unknown 3D curvatures. VA-UFIC stands out by seamlessly integrating vision and tactile data, enabling the exploration of diverse contact shapes in three dimensions, including point contacts, flat contacts with concave and convex curvatures, and scenarios involving contact loss. A pivotal component of our method is a robust online contact alignment monitoring system that considers tactile error, local surface curvature, and orientation, facilitating adaptive adjustments of robot stiffness and force regulation during exploration. We introduce virtual energy tanks within the control framework to ensure safety and stability, effectively addressing inherent safety concerns in visuo-tactile exploration. Evaluation using a Franka Emika research robot demonstrates the efficacy of VA-UFIC in exploring unknown 3D curvatures while adhering to arbitrarily defined force-motion policies. By seamlessly integrating vision and tactile sensing, VA-UFIC offers a promising avenue for intuitive exploration of complex environments, with potential applications spanning manufacturing, inspection, and beyond.
comment: 8 pages, 3 figures, accepted by IROS 2024
A Survey on Small-Scale Testbeds for Connected and Automated Vehicles and Robot Swarms
Connected and automated vehicles and robot swarms hold transformative potential for enhancing safety, efficiency, and sustainability in the transportation and manufacturing sectors. Extensive testing and validation of these technologies is crucial for their deployment in the real world. While simulations are essential for initial testing, they often have limitations in capturing the complex dynamics of real-world interactions. This limitation underscores the importance of small-scale testbeds. These testbeds provide a realistic, cost-effective, and controlled environment for testing and validating algorithms, acting as an essential intermediary between simulation and full-scale experiments. This work serves to facilitate researchers' efforts in identifying existing small-scale testbeds suitable for their experiments and provide insights for those who want to build their own. In addition, it delivers a comprehensive survey of the current landscape of these testbeds. We derive 62 characteristics of testbeds based on the well-known sense-plan-act paradigm and offer an online table comparing 22 small-scale testbeds based on these characteristics. The online table is hosted on our designated public webpage www.cpm-remote.de/testbeds, and we invite testbed creators and developers to contribute to it. We closely examine nine testbeds in this paper, demonstrating how the derived characteristics can be used to present testbeds. Furthermore, we discuss three ongoing challenges concerning small-scale testbeds that we identified, i.e., small-scale to full-scale transition, sustainability, and power and resource management.
comment: 16 pages, 11 figures, 1 table. This work has been submitted to the IEEE Robotics & Automation Magazine for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
DynamicRouteGPT: A Real-Time Multi-Vehicle Dynamic Navigation Framework Based on Large Language Models
Real-time dynamic path planning in complex traffic environments presents challenges, such as varying traffic volumes and signal wait times. Traditional static routing algorithms like Dijkstra and A* compute shortest paths but often fail under dynamic conditions. Recent Reinforcement Learning (RL) approaches offer improvements but tend to focus on local optima, risking dead-ends or boundary issues. This paper proposes a novel approach based on causal inference for real-time dynamic path planning, balancing global and local optimality. We first use the static Dijkstra algorithm to compute a globally optimal baseline path. A distributed control strategy then guides vehicles along this path. At intersections, DynamicRouteGPT performs real-time decision-making for local path selection, considering real-time traffic, driving preferences, and unexpected events. DynamicRouteGPT integrates Markov chains, Bayesian inference, and large-scale pretrained language models like Llama3 8B to provide an efficient path planning solution. It dynamically adjusts to traffic scenarios and driver preferences and requires no pre-training, offering broad applicability across road networks. A key innovation is the construction of causal graphs for counterfactual reasoning, optimizing path decisions. Experimental results show that our method achieves state-of-the-art performance in real-time dynamic path planning for multiple vehicles while providing explainable path selections, offering a novel and efficient solution for complex traffic environments.
comment: This paper is 12 pages long and represents the initial draft, version 1
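The abstract above uses a static Dijkstra search to obtain the globally optimal baseline path before any local decisions are made. A minimal sketch of that baseline step follows, assuming the road network is given as a dictionary of non-negative edge weights. The graph layout, node names, and function name are hypothetical, and the LLM-based local selection is not modelled here.

```python
import heapq


def dijkstra_path(graph, start, goal):
    """Return (path, cost) for the cheapest route, or (None, inf) if unreachable.

    graph maps each node to a dict {neighbour: non-negative edge weight}.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == goal:
            break
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if goal not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the goal.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]


# Hypothetical four-intersection network with travel times as weights.
roads = {"A": {"B": 1, "C": 4}, "B": {"C": 2, "D": 5}, "C": {"D": 1}, "D": {}}
baseline_path, baseline_cost = dijkstra_path(roads, "A", "D")
```

In a dynamic setting, the edge weights would change over time, which is why the abstract pairs this static baseline with per-intersection re-evaluation.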
Robot Navigation with Entity-Based Collision Avoidance using Deep Reinforcement Learning
Efficient navigation in dynamic environments is crucial for autonomous robots interacting with various environmental entities, including both moving agents and static obstacles. In this study, we present a novel methodology that enhances the robot's interaction with different types of agents and obstacles based on specific safety requirements. This approach uses information about the entity types, improving collision avoidance and ensuring safer navigation. We introduce a new reward function that penalizes the robot for collisions with different entities such as adults, bicyclists, children, and static obstacles, and additionally encourages the robot's proximity to the goal. It also penalizes the robot for being close to entities, with a safe distance that depends on the entity type. Additionally, we propose an optimized algorithm for training and testing, which significantly accelerates the training, validation, and testing steps and enables training in complex environments. Comprehensive experiments conducted in simulation demonstrate that our approach consistently outperforms conventional navigation and collision avoidance methods, including state-of-the-art techniques. To sum up, this work contributes to enhancing the safety and efficiency of navigation systems for autonomous robots in dynamic, crowded environments.
comment: 14 pages, 5 figures
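The reward structure described in the abstract above (a goal-progress term, entity-dependent collision penalties, and an entity-dependent safe distance) could be sketched as follows. All numeric values, dictionary names, and the function signature are assumptions chosen for illustration. The abstract does not give the actual coefficients or functional form.

```python
import math

# Hypothetical per-entity parameters (not from the paper).
COLLISION_PENALTY = {"child": 1.0, "adult": 0.8, "bicyclist": 0.6, "obstacle": 0.4}
SAFE_DISTANCE = {"child": 1.5, "adult": 1.0, "bicyclist": 1.2, "obstacle": 0.3}
PROGRESS_GAIN = 1.0
PROXIMITY_GAIN = 0.2
ROBOT_RADIUS = 0.3


def step_reward(prev_xy, robot_xy, goal_xy, entities):
    """One-step reward; entities is an iterable of (kind, (x, y), radius)."""
    # Reward progress towards the goal made during this step.
    reward = PROGRESS_GAIN * (
        math.dist(prev_xy, goal_xy) - math.dist(robot_xy, goal_xy)
    )
    for kind, xy, radius in entities:
        gap = math.dist(robot_xy, xy) - radius - ROBOT_RADIUS
        if gap <= 0.0:
            # Collision: penalty depends on the entity type.
            reward -= COLLISION_PENALTY[kind]
        elif gap < SAFE_DISTANCE[kind]:
            # Inside the type-specific safe distance: graded penalty.
            reward -= PROXIMITY_GAIN * (SAFE_DISTANCE[kind] - gap)
    return reward
```

With these assumed values, passing a child at a given distance costs more than passing an adult at the same distance, because the child's safe distance is larger.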
CHIGLU: A Modular Hardware for Stepper Motorized Quadruped Robot -- Design, Analysis, Fabrication, and Validation
Bio-engineered robots are under rapid development due to their maneuverability over uneven surfaces. This advancement paves the way for experimenting with versatile electrical system developments with various motors. In this research paper, we present the design, fabrication, and analysis of a versatile printed circuit board (PCB) as the main system that allows for the control of twelve stepper motors by stacking low-budget stepper motor controllers and a widely used microcontroller unit. The primary motivation behind the design is to offer a compact and efficient hardware solution for controlling multiple stepper motors of a quadruped robot while meeting the required power budget. The research focuses on the hardware's architecture, stackable design, power budget planning, and a thorough analysis. Additionally, a PDN (Power Distribution Network) analysis simulation is performed to ensure that the voltage and current density are within the expected parameters. The hardware design also examines design for manufacturability (DFM) in depth. The ability to stack the controllers on the development board provides insights into the feasibility of swapping the board's components. The findings from this research make a significant contribution to the advancement of stepper motor control systems for multi-axis applications in bio-inspired robots, offering a convenient form factor and reliable performance.
comment: LaTeX, 26 pages with 25 figures
Modular Meshed Ultra-Wideband Aided Inertial Navigation with Robust Anchor Calibration
This paper introduces a generic filter-based state estimation framework that supports two state-decoupling strategies based on cross-covariance factorization. These strategies reduce the computational complexity and inherently support true modularity -- a prerequisite for handling and processing meshed range measurements among a time-varying set of devices. In order to utilize these measurements in the estimation framework, positions of newly detected stationary devices (anchors) and the pairwise biases between the ranging devices are required. In this work, an autonomous calibration procedure for new anchors is presented that utilizes range measurements from multiple tags as well as already known anchors. To improve the robustness, an outlier rejection method is introduced. After the calibration is performed, the sensor fusion framework obtains initial beliefs of the anchor positions and dictionaries of pairwise biases, in order to fuse range measurements obtained from new anchors in a tightly coupled manner. The effectiveness of the filter and calibration framework has been validated through evaluations on a recorded dataset and real-world experiments.
Bridging the gap between Learning-to-plan, Motion Primitives and Safe Reinforcement Learning
Trajectory planning under kinodynamic constraints is fundamental for advanced robotics applications that require dexterous, reactive, and rapid skills in complex environments. These constraints, which may represent task, safety, or actuator limitations, are essential for ensuring the proper functioning of robotic platforms and preventing unexpected behaviors. Recent advances in kinodynamic planning demonstrate that learning-to-plan techniques can generate complex and reactive motions under intricate constraints. However, these techniques necessitate the analytical modeling of both the robot and the entire task, a limiting assumption when systems are extremely complex or when constructing accurate task models is prohibitive. This paper addresses this limitation by combining learning-to-plan methods with reinforcement learning, resulting in a novel integration of black-box learning of motion primitives and optimization. We evaluate our approach against state-of-the-art safe reinforcement learning methods, showing that our technique, particularly when exploiting task structure, outperforms baseline methods in challenging scenarios such as planning to hit in robot air hockey. This work demonstrates the potential of our integrated approach to enhance the performance and safety of robots operating under complex kinodynamic constraints.
Collaborative Perception in Multi-Robot Systems: Case Studies in Household Cleaning and Warehouse Operations
This paper explores the paradigm of Collaborative Perception (CP), where multiple robots and sensors in the environment share and integrate sensor data to construct a comprehensive representation of the surroundings. By aggregating data from various sensors and utilizing advanced algorithms, the collaborative perception framework improves task efficiency, coverage, and safety. Two case studies are presented to showcase the benefits of collaborative perception in multi-robot systems. The first case study illustrates the benefits and advantages of using CP for the task of household cleaning with a team of cleaning robots. The second case study performs a comparative analysis of the performance of CP versus Standalone Perception (SP) for Autonomous Mobile Robots operating in a warehouse environment. The case studies validate the effectiveness of CP in enhancing multi-robot coordination, task completion, and overall system performance and its potential to impact operations in other applications as well. Future investigations will focus on optimizing the framework and validating its performance through empirical testing.
Re-Mix: Optimizing Data Mixtures for Large Scale Imitation Learning
Increasingly large imitation learning datasets are being collected with the goal of training foundation models for robotics. However, despite the fact that data selection has been of utmost importance in vision and natural language processing, little work in robotics has questioned what data such models should actually be trained on. In this work we investigate how to weigh different subsets or "domains" of robotics datasets for robot foundation model pre-training. Concretely, we use distributionally robust optimization (DRO) to maximize worst-case performance across all possible downstream domains. Our method, Re-Mix, addresses the wide range of challenges that arise when applying DRO to robotics datasets, including variability in action spaces and dynamics across different datasets. Re-Mix employs early stopping, action normalization, and discretization to counteract these issues. Through extensive experimentation on the largest open-source robot manipulation dataset, the Open X-Embodiment dataset, we demonstrate that data curation can have an outsized impact on downstream performance. Specifically, domain weights learned by Re-Mix outperform uniform weights by 38% on average and outperform human-selected weights by 32% on datasets used to train existing generalist robot policies, specifically the RT-X models.
FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry
This paper proposes FAST-LIVO2: a fast, direct LiDAR-inertial-visual odometry framework to achieve accurate and robust state estimation in SLAM tasks and provide great potential in real-time, onboard robotic applications. FAST-LIVO2 fuses the IMU, LiDAR and image measurements efficiently through an ESIKF. To address the dimension mismatch between the heterogeneous LiDAR and image measurements, we use a sequential update strategy in the Kalman filter. To enhance the efficiency, we use direct methods for both the visual and LiDAR fusion, where the LiDAR module registers raw points without extracting edge or plane features and the visual module minimizes direct photometric errors without extracting ORB or FAST corner features. The fusion of both visual and LiDAR measurements is based on a single unified voxel map where the LiDAR module constructs the geometric structure for registering new LiDAR scans and the visual module attaches image patches to the LiDAR points. To enhance the accuracy of image alignment, we use plane priors from the LiDAR points in the voxel map (and even refine the plane prior) and update the reference patch dynamically after new images are aligned. Furthermore, to enhance the robustness of image alignment, FAST-LIVO2 employs an on-demand raycast operation and estimates the image exposure time in real time. Lastly, we detail three applications of FAST-LIVO2: UAV onboard navigation demonstrating the system's computation efficiency for real-time onboard navigation, airborne mapping showcasing the system's mapping accuracy, and 3D model rendering (mesh-based and NeRF-based) underscoring the suitability of our reconstructed dense map for subsequent rendering tasks. We open source our code, dataset and application on GitHub to benefit the robotics community.
comment: 30 pages, 31 figures, due to the limitation that 'The abstract field cannot exceed 1,920 characters', the abstract presented here is shorter than the one in the PDF file
Optimizing TD3 for 7-DOF Robotic Arm Grasping: Overcoming Suboptimality with Exploration-Enhanced Contrastive Learning
In actor-critic-based reinforcement learning algorithms such as Twin Delayed Deep Deterministic policy gradient (TD3), insufficient exploration of the spatial space can result in suboptimal policies when controlling 7-DOF robotic arms. To address this issue, we propose a novel Exploration-Enhanced Contrastive Learning (EECL) module that improves exploration by providing additional rewards for encountering novel states. Our module stores previously explored states in a buffer and identifies new states by comparing them with historical data using Euclidean distance within a K-dimensional tree (KDTree) framework. When the agent explores new states, exploration rewards are assigned. These rewards are then integrated into the TD3 algorithm, ensuring that the Q-learning process incorporates these signals, promoting more effective strategy optimization. We evaluate our method on the robosuite panda lift task, demonstrating that it significantly outperforms the baseline TD3 in terms of both efficiency and convergence speed in the tested environment.
comment: 4 pages, 2 figures, IEEE-ICKII-2024
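The novelty bonus described in the abstract above (store visited states, compare new states to the buffer by Euclidean distance, and reward states that are far from everything seen so far) can be sketched in a few lines. To stay dependency-free, this sketch swaps the KDTree lookup for a brute-force nearest-neighbour scan, which gives the same distances but is slower on large buffers. The class name, threshold, and bonus value are hypothetical.

```python
import math


class NoveltyBuffer:
    """Stores visited states and pays a bonus for sufficiently new ones."""

    def __init__(self, threshold, bonus):
        self.threshold = threshold  # minimum distance to count as novel
        self.bonus = bonus          # exploration reward for a novel state
        self.states = []

    def exploration_reward(self, state):
        state = tuple(state)
        # Brute-force nearest neighbour (a KDTree query would replace this scan).
        nearest = min(
            (math.dist(state, seen) for seen in self.states), default=math.inf
        )
        if nearest > self.threshold:
            self.states.append(state)
            return self.bonus
        return 0.0


def shaped_reward(env_reward, state, buffer):
    """Environment reward plus the exploration bonus, as fed to the critic."""
    return env_reward + buffer.exploration_reward(state)
```

The shaped reward would then be stored in the replay buffer in place of the raw environment reward, so that the TD3 critic targets include the exploration signal.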
Quantitative Representation of Scenario Difficulty for Autonomous Driving Based on Adversarial Policy Search
Adversarial scenario generation is crucial for autonomous driving testing because it can efficiently simulate various challenging and complex traffic conditions. However, it is difficult to control existing methods to generate desired scenarios, such as those with different conflict levels. Therefore, this paper proposes a data-driven quantitative method to represent scenario difficulty. Compared with rule-based discrete scenario difficulty representation methods, the proposed algorithm can achieve continuous difficulty representation. Specifically, an environment agent is introduced, and a reinforcement learning method combined with mechanism knowledge is constructed for policy search to obtain an agent with adversarial behavior. The model parameters of the environment agent at different stages in the training process are extracted to construct a policy group, and agents with different adversarial intensities are then obtained, which are used to generate data in scenarios of different difficulty through the simulation environment. Finally, a data-driven scenario difficulty quantitative representation model is constructed, which is used to output the environment agent policy under different difficulties. The result analysis shows that the proposed algorithm can generate reasonable and interpretable scenarios with high discrimination, and can provide quantifiable difficulty representation without any expert logic rule design. The video link is https://www.youtube.com/watch?v=GceGdqAm9Ys.
Multi-Agent Path Finding with Real Robot Dynamics and Interdependent Tasks for Automated Warehouses ECAI-2024
Multi-Agent Path Finding (MAPF) is an important optimization problem underlying the deployment of robots in automated warehouses and factories. Despite the large body of work on this topic, most approaches make heavy simplifications, both on the environment and the agents, which make the resulting algorithms impractical for real-life scenarios. In this paper, we consider a realistic problem of online order delivery in a warehouse, where a fleet of robots brings the products belonging to each order from shelves to workstations. This creates a stream of inter-dependent pickup and delivery tasks, and the associated MAPF problem consists of computing realistic collision-free robot trajectories fulfilling these tasks. To solve this MAPF problem, we propose an extension of the standard Prioritized Planning algorithm to deal with the inter-dependent tasks (Interleaved Prioritized Planning) and a novel Via-Point Star (VP*) algorithm to compute an optimal dynamics-compliant robot trajectory to visit a sequence of goal locations while avoiding moving obstacles. We prove the completeness of our approach and evaluate it in simulation as well as in a real warehouse.
comment: Accepted to ECAI-2024. For related videos, see https://europe.naverlabs.com/research/publications/MAPF_IPP
A Survey on Reinforcement Learning Applications in SLAM
The emergence of mobile robotics, particularly in the automotive industry, introduces a promising era of enriched user experiences and adept handling of complex navigation challenges. The realization of these advancements necessitates a focused technological effort and the successful execution of numerous intricate tasks, particularly in the critical domain of Simultaneous Localization and Mapping (SLAM). Various artificial intelligence (AI) methodologies, such as deep learning and reinforcement learning, present viable solutions to address the challenges in SLAM. This study specifically explores the application of reinforcement learning in the context of SLAM. By enabling the agent (the robot) to iteratively interact with and receive feedback from its environment, reinforcement learning facilitates the acquisition of navigation and mapping skills, thereby enhancing the robot's decision-making capabilities. This approach offers several advantages, including improved navigation proficiency, increased resilience, reduced dependence on sensor precision, and refinement of the decision-making process. The findings of this study, which provide an overview of reinforcement learning's utilization in SLAM, reveal significant advancements in the field. The investigation also highlights the evolution and innovative integration of these techniques.
Design, Kinematics, and Deployment of a Continuum Underwater Vehicle-Manipulator System
Underwater vehicle-manipulator systems (UVMSs) are underwater robots equipped with one or more manipulators to perform intervention missions. This paper provides the mechanical, electrical, and software design of a novel UVMS equipped with a continuum manipulator, referred to as a continuum-UVMS. A kinematic model for the continuum-UVMS is derived in order to build an algorithm to resolve the robot's redundancy and generate joint space commands. Different methods to optimize the trajectory for specific tasks are proposed using both the weighted least norm solution and the gradient projection method. Kinematic simulation results are analyzed to assess the performance of the proposed algorithm. Finally, the continuum-UVMS is deployed in an experimental demonstration in which both teleoperation and autonomous control are tested for a given reference trajectory.
comment: 14 pages, ASME Journal of Mechanisms and Robotics, accepted
Brain Inspired Probabilistic Occupancy Grid Mapping with Hyperdimensional Computing
Real-time robotic systems require advanced perception, computation, and action capability. However, the main bottleneck in current autonomous systems is the trade-off between computational capability, energy efficiency and model determinism. World modeling, a key objective of many robotic systems, commonly uses occupancy grid mapping (OGM) as the first step towards building an end-to-end robotic system with perception, planning, autonomous maneuvering, and decision making capabilities. OGM divides the environment into discrete cells and assigns probability values to attributes such as occupancy and traversability. Existing methods fall into two categories: traditional methods and neural methods. Traditional methods rely on dense statistical calculations, while neural methods employ deep learning for probabilistic information processing. Recent works formulate a deterministic theory of neural computation at the intersection of cognitive science and vector symbolic architectures. In this study, we propose a Fourier-based hyperdimensional OGM system, VSA-OGM, combined with a novel application of Shannon entropy that retains the interpretability and stability of traditional methods along with the improved computational efficiency of neural methods. Our approach, validated across multiple datasets, achieves similar accuracy to covariant traditional methods while reducing latency by approximately 200x and memory by approximately 1000x. Compared to invariant traditional methods, we see similar accuracy values while reducing latency by 3.7x. Moreover, we achieve 1.5x latency reductions compared to neural methods while eliminating the need for domain-specific model training.
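For readers unfamiliar with the two ingredients named in the abstract above, the following sketch shows the classical per-cell occupancy update in log-odds form together with the binary Shannon entropy of a cell. This is the textbook formulation, not the hyperdimensional VSA-OGM method itself. The function names and the sensor probability of 0.7 are assumptions for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def update_log_odds(cell_log_odds, p_occupied_given_measurement):
    """Classical Bayesian cell update with a uniform prior (prior log-odds 0)."""
    return cell_log_odds + logit(p_occupied_given_measurement)


def probability(cell_log_odds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 / (1.0 + math.exp(-cell_log_odds))


def binary_entropy(p):
    """Shannon entropy in bits of a Bernoulli occupancy belief."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


# A cell starts unknown (p = 0.5) and receives two "occupied" readings.
cell = 0.0
cell = update_log_odds(cell, 0.7)
after_one = probability(cell)
cell = update_log_odds(cell, 0.7)
after_two = probability(cell)
```

Entropy is highest for an unknown cell and falls as consistent measurements accumulate, which is what makes it a useful confidence signal for a map.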
Whole-body Humanoid Robot Locomotion with Human Reference
Recently, humanoid robots have made significant advances in their ability to perform challenging tasks due to the deployment of Reinforcement Learning (RL). However, the inherent complexity of humanoid robots, including the difficulty of designing complicated reward functions and training entire sophisticated systems, still poses a notable challenge. To address these challenges, after many iterations and in-depth investigations, we have developed a full-size humanoid robot, "Adam", whose innovative structural design greatly improves the efficiency and effectiveness of the imitation learning process. In addition, we have developed a novel imitation learning framework based on an adversarial motion prior, which applies not only to Adam but also to humanoid robots in general. Using the framework, Adam can exhibit unprecedented human-like characteristics in locomotion tasks. Our experimental results demonstrate that the proposed framework enables Adam to achieve human-comparable performance in complex locomotion tasks, marking the first time that human locomotion data has been used for imitation learning in a full-size humanoid robot.
comment: 7 pages, 7 figures
Fusion of Visual-Inertial Odometry with LiDAR Relative Localization for Cooperative Guidance of a Micro-Scale Aerial Vehicle
A novel relative localization approach for guidance of a micro-scale UAV by a well-equipped aerial robot fusing VIO with LiDAR is proposed in this paper. LiDAR-based localization is accurate and robust to challenging environmental conditions, but 3D LiDARs are relatively heavy and require large UAV platforms, in contrast to lightweight cameras. However, visual-based self-localization methods exhibit lower accuracy and can suffer from significant drift with respect to the global reference frame. To benefit from both sensory modalities, we focus on cooperative navigation in a heterogeneous team of a primary LiDAR-equipped UAV and a secondary micro-scale camera-equipped UAV. We propose a novel cooperative approach combining LiDAR relative localization data with VIO output on board the primary UAV to obtain an accurate pose of the secondary UAV. The pose estimate is used to precisely and reliably guide the secondary UAV along trajectories defined in the primary UAV reference frame. The experimental evaluation has shown the superior accuracy of our method to the raw VIO output and demonstrated its capability to guide the secondary UAV along desired trajectories while mitigating VIO drift. Thus, such a heterogeneous system can explore large areas with LiDAR precision, as well as visit locations inaccessible to the large LiDAR-carrying UAV platforms, as was showcased in a real-world cooperative mapping scenario.
comment: pre-print submitted to Journal of Intelligent and Robotic Systems
LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras ICPR 2024
Leveraging rich information is crucial for dense prediction tasks. Light field (LF) cameras are instrumental in this regard, as they allow data to be sampled from various perspectives. This capability provides valuable spatial, depth, and angular information, enhancing scene-parsing tasks. However, we have identified two overlooked issues for the LF salient object detection (SOD) task. (1) Previous approaches predominantly employ a customized two-stream design to discover the spatial and depth features within light field images. The network struggles to learn the implicit angular information between different images due to a lack of intra-network data connectivity. (2) Little research has been directed towards the data augmentation strategy for LF SOD. Research on inter-network data connectivity is scant. In this study, we propose an efficient paradigm (LF Tracy) to address those issues. This comprises a single-pipeline encoder paired with a highly efficient information aggregation (IA) module (around 8M parameters) to establish an intra-network connection. Then, a simple yet effective data augmentation strategy called MixLD is designed to bridge the inter-network connections. Owing to this paradigm, our model surpasses the existing state-of-the-art methods in extensive experiments. In particular, LF Tracy demonstrates a 23% improvement over previous results on the latest large-scale PKU dataset. The source code is publicly available at: https://github.com/FeiBryantkit/LF-Tracy.
comment: Accepted to ICPR 2024. The source code is publicly available at: https://github.com/FeiBryantkit/LF-Tracy
Decision-Focused Learning to Predict Action Costs for Planning
In many automated planning applications, action costs can be hard to specify. An example is the time needed to travel through a certain road segment, which depends on many factors, such as the current weather conditions. A natural way to address this issue is to learn to predict these parameters based on input features (e.g., weather forecasts) and use the predicted action costs in automated planning afterward. Decision-Focused Learning (DFL) has been successful in learning to predict the parameters of combinatorial optimization problems in a way that optimizes solution quality rather than prediction quality. This approach yields better results than treating prediction and optimization as separate tasks. In this paper, we investigate for the first time the challenges of implementing DFL for automated planning in order to learn to predict the action costs. There are two main challenges to overcome: (1) planning systems are called during gradient descent learning, to solve planning problems with negative action costs, which are not supported in planning. We propose novel methods for gradient computation to avoid this issue. (2) DFL requires repeated planner calls during training, which can limit the scalability of the method. We experiment with different methods approximating the optimal plan as well as an easy-to-implement caching mechanism to speed up the learning process. As the first work that addresses DFL for automated planning, we demonstrate that the proposed gradient computation consistently yields significantly better plans than predictions aimed at minimizing prediction error; and that caching can temper the computation requirements.
Collision-Free Trajectory Optimization in Cluttered Environments Using Sums-of-Squares Programming
In this work, we propose a trajectory optimization approach for robot navigation in cluttered 3D environments. We represent the robot's geometry as a semialgebraic set defined by polynomial inequalities such that robots with general shapes can be suitably characterized. To address the robot navigation task in obstacle-dense environments, we exploit the free space directly to construct a sequence of free regions, and allocate each waypoint on the trajectory to a specific region. Then, we incorporate a uniform scaling factor for each free region, and formulate a Sums-of-Squares (SOS) optimization problem that renders the containment relationship between the robot and the free space computationally tractable. The SOS optimization problem is further reformulated to a semidefinite program (SDP), and the collision-free constraints are shown to be equivalent to limiting the scaling factor along the entire trajectory. In this context, the robot at a specific configuration is tailored to stay within the free region. Next, to solve the trajectory optimization problem with the proposed safety constraints (which are implicitly dependent on the robot configurations), we derive the analytical solution to the gradient of the minimum scaling factor with respect to the robot configuration. As a result, this seamlessly facilitates the use of gradient-based methods in efficient solving of the trajectory optimization problem. Through a series of simulations and real-world experiments, the proposed trajectory optimization approach is validated in various challenging scenarios, and the results demonstrate its effectiveness in generating collision-free trajectories in dense and intricate environments populated with obstacles. Our code is available at: https://github.com/lyl00/minimum_scaling_free_region
SSL-Interactions: Pretext Tasks for Interactive Trajectory Prediction
This paper addresses motion forecasting in multi-agent environments, pivotal for ensuring safety of autonomous vehicles. Traditional as well as recent data-driven marginal trajectory prediction methods struggle to properly learn non-linear agent-to-agent interactions. We present SSL-Interactions that proposes pretext tasks to enhance interaction modeling for trajectory prediction. We introduce four interaction-aware pretext tasks to encapsulate various aspects of agent interactions: range gap prediction, closest distance prediction, direction of movement prediction, and type of interaction prediction. We further propose an approach to curate interaction-heavy scenarios from datasets. This curated data has two advantages: it provides a stronger learning signal to the interaction model, and facilitates generation of pseudo-labels for interaction-centric pretext tasks. We also propose three new metrics specifically designed to evaluate predictions in interactive scenes. Our empirical evaluations indicate SSL-Interactions outperforms state-of-the-art motion forecasting methods quantitatively with up to 8% improvement, and qualitatively, for interaction-heavy scenarios.
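As an illustration of how a pseudo-label for one of these pretext tasks might be derived from recorded data, the sketch below computes the closest distance between two agents over time-synchronized trajectories. The function name and the trajectory format are assumptions; the abstract does not specify how SSL-Interactions builds its labels.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def closest_distance(traj_a: List[Point], traj_b: List[Point]) -> float:
    """Minimum Euclidean distance between two agents at matching time steps.

    Assumes both trajectories are sampled at the same time stamps.
    """
    return min(math.dist(p, q) for p, q in zip(traj_a, traj_b))


# Hypothetical trajectories of two agents over three time steps.
agent_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
agent_b = [(0.0, 3.0), (1.0, 1.0), (2.0, 4.0)]
print(closest_distance(agent_a, agent_b))
```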
comment: Accepted at IV-2024. 13 pages, 5 figures
Ten Problems in Geobotics
Robots sense, move and act in the physical world. It is therefore natural that algorithmic problems in robotics and automation have a geometric component, often central to the problem. Below we review ten challenging problems at the intersection of robotics and computational geometry -- let's call this intersection Geobotics. What is common to most of these problems is that the prevalent algorithmic techniques used in robotics do not seem suitable for solving them, or at least do not suggest quality guarantees for the solution. Solving some of them, even partially, can shed light on less well-understood aspects of computation in robotics.
Combining Safe Intervals and RRT* for Efficient Multi-Robot Path Planning in Complex Environments
In this paper, we consider the problem of Multi-Robot Path Planning (MRPP) in continuous space to find conflict-free paths. The difficulty of the problem arises from two primary factors. First, the involvement of multiple robots leads to combinatorial decision-making, which escalates the search space exponentially. Second, the continuous space presents potentially infinite states and actions. For this problem, we propose a two-level approach where the low level is a sampling-based planner Safe Interval RRT* (SI-RRT*) that finds a collision-free trajectory for individual robots. The high level can use any method that can resolve inter-robot conflicts where we employ two representative methods that are Prioritized Planning (SI-CPP) and Conflict Based Search (SI-CCBS). Experimental results show that SI-RRT* can find a high-quality solution quickly with a small number of samples. SI-CPP exhibits improved scalability while SI-CCBS produces higher-quality solutions compared to the state-of-the-art planners for continuous space. Compared to the most scalable existing algorithm, SI-CPP achieves a success rate that is up to 94% higher with 100 robots while maintaining solution quality (i.e., flowtime, the sum of travel times of all robots) without significant compromise. SI-CPP also decreases the makespan up to 45%. SI-CCBS decreases the flowtime by 9% compared to the competitor, albeit exhibiting a 14% lower success rate.
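The two solution-quality metrics mentioned above can be sketched in a few lines. Flowtime follows the definition given in the abstract; makespan is taken here to be the largest individual travel time, which is the usual definition but an assumption with respect to this paper. The travel times are hypothetical.

```python
from typing import Sequence


def flowtime(travel_times: Sequence[float]) -> float:
    """Sum of the travel times of all robots."""
    return sum(travel_times)


def makespan(travel_times: Sequence[float]) -> float:
    """Time at which the last robot finishes (assumed standard definition)."""
    return max(travel_times)


# Hypothetical travel times, in seconds, for three robots.
times = [12.0, 7.5, 9.0]
print(flowtime(times), makespan(times))
```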
Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control
This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a novel dual-history architecture, utilizing both a long-term and short-term input/output (I/O) history of the robot. This control architecture, when trained through the proposed end-to-end RL approach, consistently outperforms other methods across a diverse range of skills in both simulation and the real world. The study also delves into the adaptivity and robustness introduced by the proposed RL system in developing locomotion controllers. We demonstrate that the proposed architecture can adapt to both time-invariant dynamics shifts and time-variant changes, such as contact events, by effectively using the robot's I/O history. Additionally, we identify task randomization as another key source of robustness, fostering better task generalization and compliance to disturbances. The resulting control policies can be successfully deployed on Cassie, a torque-controlled human-sized bipedal robot. This work pushes the limits of agility for bipedal robots through extensive real-world experiments. We demonstrate a diverse range of locomotion skills, including: robust standing, versatile walking, fast running with a demonstration of a 400-meter dash, and a diverse set of jumping skills, such as standing long jumps and high jumps.
comment: Accepted in International Journal of Robotics Research (IJRR) 2024. This is the author's version and will no longer be updated as the copyright may get transferred at any time
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace and the physical world. Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) has attracted significant attention due to their remarkable perception, interaction, and reasoning capabilities, making them a promising architecture for the brain of embodied agents. However, there is no comprehensive survey for Embodied AI in the era of MLMs. In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI. Our analysis first navigates through the forefront of representative works of embodied robots and simulators, to fully understand the research focuses and their limitations. Then, we analyze four main research targets: 1) embodied perception, 2) embodied interaction, 3) embodied agent, and 4) sim-to-real adaptation, covering the state-of-the-art methods, essential paradigms, and comprehensive datasets. Additionally, we explore the complexities of MLMs in virtual and real embodied agents, highlighting their significance in facilitating interactions in dynamic digital and physical environments. Finally, we summarize the challenges and limitations of embodied AI and discuss their potential future directions. We hope this survey will serve as a foundational reference for the research community and inspire continued innovation. The associated project can be found at https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List.
comment: The first comprehensive review of Embodied AI in the era of MLMs, 39 pages. We also provide the paper list for Embodied AI: https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List
Ultra-Lightweight Collaborative Mapping for Robot Swarms
A key requirement in robotics is the ability to simultaneously self-localize and map a previously unknown environment, relying primarily on onboard sensing and computation. Achieving fully onboard accurate simultaneous localization and mapping (SLAM) is feasible for high-end robotic platforms, whereas small and inexpensive robots face challenges due to constrained hardware, therefore frequently resorting to external infrastructure for sensing and computation. The challenge is further exacerbated in swarms of robots, where coordination, scalability, and latency are crucial concerns. This work introduces a decentralized and lightweight collaborative SLAM approach that enables mapping on virtually any robot, even those equipped with low-cost hardware and only 1.5 MB of memory, including miniaturized insect-size devices. Moreover, the proposed solution supports large swarm formations with the capability to coordinate hundreds of agents. To substantiate our claims, we have successfully implemented collaborative SLAM on centimeter-size drones weighing 46 g. Remarkably, we achieve a mapping accuracy below 30 cm, a result comparable to high-end state-of-the-art solutions while reducing the cost, memory, and computation requirements by two orders of magnitude. Our approach is innovative in three main aspects. First, it enables onboard infrastructure-less collaborative mapping with a lightweight and cost-effective ($20) solution in terms of sensing and computation. Second, we optimize the data traffic within the swarm to support hundreds of cooperative agents using standard wireless protocols such as ultra-wideband (UWB), Bluetooth, or WiFi. Last, we implement a distributed swarm coordination policy to decrease mapping latency and enhance accuracy.
comment: 14 pages, 13 figures
MAPPO-PIS: A Multi-Agent Proximal Policy Optimization Method with Prior Intent Sharing for CAVs' Cooperative Decision-Making
Vehicle-to-Vehicle (V2V) technologies have great potential for enhancing traffic flow efficiency and safety. However, cooperative decision-making in multi-agent systems, particularly in complex human-machine mixed merging areas, remains challenging for connected and autonomous vehicles (CAVs). Intent sharing, a key aspect of human coordination, may offer an effective solution to these decision-making problems, but its application in CAVs is under-explored. This paper presents an intent-sharing-based cooperative method, the Multi-Agent Proximal Policy Optimization with Prior Intent Sharing (MAPPO-PIS), which models the CAV cooperative decision-making problem as a Multi-Agent Reinforcement Learning (MARL) problem. It involves training and updating the agents' policies through the integration of two key modules: the Intention Generator Module (IGM) and the Safety Enhanced Module (SEM). The IGM is specifically crafted to generate and disseminate CAVs' intended trajectories spanning multiple future time-steps. The SEM, in turn, assesses the safety of the decisions made and rectifies them if necessary. A merging area with human-machine mixed traffic flow is selected to validate our method. Results show that MAPPO-PIS significantly improves decision-making performance in multi-agent systems, surpassing state-of-the-art baselines in safety, efficiency, and overall traffic system performance. The code and video demo can be found at: https://github.com/CCCC1dhcgd/A-MAPPO-PIS.
Hybrid Continuum-Eversion Robot: Precise Navigation and Decontamination in Nuclear Environments using Vine Robot
Soft growing vine robots show great potential for navigation and decontamination tasks in the nuclear industry. This paper introduces a novel hybrid continuum-eversion robot designed to address certain challenges in relation to navigating and operating within pipe networks and enclosed remote vessels. The hybrid robot combines the flexibility of a soft eversion robot with the precision of a continuum robot at its tip, allowing for controlled steering and movement in hard to access and/or complex environments. The design enables the delivery of sensors, liquids, and aerosols to remote areas, supporting remote decontamination activities. This paper outlines the design and construction of the robot and the methods by which it achieves selective steering. We also include a comprehensive review of current related work in eversion robotics, as well as other steering devices and actuators currently under research, which underpin this novel active steering approach. This is followed by an experimental evaluation that demonstrates the robot's real-world capabilities in delivering liquids and aerosols to remote locations. The experiments reveal successful outcomes, with over 95% success in precision spraying tests. The paper concludes by discussing future work alongside limitations in the current design, ultimately showcasing its potential as a solution for remote decontamination operations in the nuclear industry.
comment: 7 pages, 8 figures, conference
A Hessian for Gaussian Mixture Likelihoods in Nonlinear Least Squares
This paper proposes a novel Hessian approximation for Maximum a Posteriori estimation problems in robotics involving Gaussian mixture likelihoods. Previous approaches manipulate the Gaussian mixture likelihood into a form that allows the problem to be represented as a nonlinear least squares (NLS) problem. The resulting Hessian approximation used within NLS solvers from these approaches neglects certain nonlinearities. The proposed Hessian approximation is derived by setting the Hessians of the Gaussian mixture component errors to zero, which is the same starting point as for the Gauss-Newton Hessian approximation for NLS, and using the chain rule to account for additional nonlinearities. The proposed Hessian approximation results in improved convergence speed and uncertainty characterization for simulated experiments, and similar performance to the state of the art on real-world experiments. A method to maintain compatibility with existing solvers, such as Ceres, is also presented. Accompanying software and supplementary material can be found at https://github.com/decargroup/hessian_sum_mixtures.
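For readers unfamiliar with the Gauss-Newton starting point mentioned above, the standard approximation for a generic NLS cost can be written as follows. The symbols (cost J, stacked error e, state x) are notation assumed for this note and are not taken from the paper; the mixture-specific terms the paper adds via the chain rule are not shown.

```latex
J(x) = \tfrac{1}{2}\, e(x)^{\top} e(x), \qquad
\nabla^2 J(x)
  = \left(\frac{\partial e}{\partial x}\right)^{\top} \frac{\partial e}{\partial x}
    + \sum_{j} e_j(x)\, \nabla^2 e_j(x)
  \;\approx\;
  \left(\frac{\partial e}{\partial x}\right)^{\top} \frac{\partial e}{\partial x}
```

The approximation on the right is obtained by setting the Hessians of the individual error terms, \nabla^2 e_j(x), to zero.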
comment: 8 pages, 2 figures. Accepted to IEEE Robotics and Automation Letters
Narrowing your FOV with SOLiD: Spatially Organized and Lightweight Global Descriptor for FOV-constrained LiDAR Place Recognition
In real-world robot navigation, the LiDAR field of view (FOV) is often limited by factors such as sensor fusion or sensor mounting. A limited FOV interrupts the generation of descriptions and adversely affects place recognition, which makes it difficult to correct accumulated drift errors in a consistent map using LiDAR-based place recognition. In this paper, we therefore propose a robust LiDAR-based place recognition method for handling narrow FOV scenarios. The proposed method establishes spatial organization based on the range-elevation bin and azimuth-elevation bin to represent places. In addition, we achieve a robust place description through reweighting based on vertical direction information. Based on these representations, our method enables addressing rotational changes and determining the initial heading. Additionally, we designed a lightweight and fast approach for the robot's onboard autonomy. For rigorous validation, the proposed method was tested across various LiDAR place recognition scenarios (i.e., single-session, multi-session, and multi-robot scenarios). To the best of our knowledge, we report the first method to cope with the restricted FOV. Our place description and SLAM codes will be released. Also, the supplementary materials of our descriptor are available at https://sites.google.com/view/lidar-solid.
comment: IEEE Robotics and Automation Letters (2024)
Capability-based Frameworks for Industrial Robot Skills: a Survey
The research community is puzzled with words like skill, action, atomic unit and others when describing robots' capabilities. However, for giving the possibility to integrate capabilities in industrial scenarios, a standardization of these descriptions is necessary. This work uses a structured review approach to identify commonalities and differences in the research community of robots' skill frameworks. Through this method, 210 papers were analyzed and three main results were obtained. First, the vast majority of authors agree on a taxonomy based on task, skill and primitive. Second, the most investigated robots' capabilities are pick and place. Third, industrial oriented applications focus more on simple robots' capabilities with fixed parameters while ensuring safety aspects. Therefore, this work emphasizes that a taxonomy based on task, skill and primitives should be used by future works to align with existing literature. Moreover, further research is needed in the industrial domain for parametric robots' capabilities while ensuring safety.
Multiagent Systems
A Survey on Small-Scale Testbeds for Connected and Automated Vehicles and Robot Swarms
Connected and automated vehicles and robot swarms hold transformative potential for enhancing safety, efficiency, and sustainability in the transportation and manufacturing sectors. Extensive testing and validation of these technologies is crucial for their deployment in the real world. While simulations are essential for initial testing, they often have limitations in capturing the complex dynamics of real-world interactions. This limitation underscores the importance of small-scale testbeds. These testbeds provide a realistic, cost-effective, and controlled environment for testing and validating algorithms, acting as an essential intermediary between simulation and full-scale experiments. This work serves to facilitate researchers' efforts in identifying existing small-scale testbeds suitable for their experiments and provide insights for those who want to build their own. In addition, it delivers a comprehensive survey of the current landscape of these testbeds. We derive 62 characteristics of testbeds based on the well-known sense-plan-act paradigm and offer an online table comparing 22 small-scale testbeds based on these characteristics. The online table is hosted on our designated public webpage www.cpm-remote.de/testbeds, and we invite testbed creators and developers to contribute to it. We closely examine nine testbeds in this paper, demonstrating how the derived characteristics can be used to present testbeds. Furthermore, we discuss three ongoing challenges concerning small-scale testbeds that we identified, i.e., small-scale to full-scale transition, sustainability, and power and resource management.
comment: 16 pages, 11 figures, 1 table. This work has been submitted to the IEEE Robotics & Automation Magazine for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Multi-Agent Path Finding with Real Robot Dynamics and Interdependent Tasks for Automated Warehouses ECAI-2024
Multi-Agent Path Finding (MAPF) is an important optimization problem underlying the deployment of robots in automated warehouses and factories. Despite the large body of work on this topic, most approaches make heavy simplifications, both on the environment and the agents, which make the resulting algorithms impractical for real-life scenarios. In this paper, we consider a realistic problem of online order delivery in a warehouse, where a fleet of robots bring the products belonging to each order from shelves to workstations. This creates a stream of inter-dependent pickup and delivery tasks and the associated MAPF problem consists of computing realistic collision-free robot trajectories fulfilling these tasks. To solve this MAPF problem, we propose an extension of the standard Prioritized Planning algorithm to deal with the inter-dependent tasks (Interleaved Prioritized Planning) and a novel Via-Point Star (VP*) algorithm to compute an optimal dynamics-compliant robot trajectory to visit a sequence of goal locations while avoiding moving obstacles. We prove the completeness of our approach and evaluate it in simulation as well as in a real warehouse.
comment: Accepted to ECAI-2024. For related videos, see https://europe.naverlabs.com/research/publications/MAPF_IPP
Compressed Federated Reinforcement Learning with a Generative Model ECML-PKDD 2024
Reinforcement learning has recently gained unprecedented popularity, yet it still grapples with sample inefficiency. Addressing this challenge, federated reinforcement learning (FedRL) has emerged, wherein agents collaboratively learn a single policy by aggregating local estimations. However, this aggregation step incurs significant communication costs. In this paper, we propose CompFedRL, a communication-efficient FedRL approach incorporating both periodic aggregation and (direct/error-feedback) compression mechanisms. Specifically, we consider compressed federated $Q$-learning with a generative model setup, where a central server learns an optimal $Q$-function by periodically aggregating compressed $Q$-estimates from local agents. For the first time, we characterize the impact of these two mechanisms (which have remained elusive) by providing a finite-time analysis of our algorithm, demonstrating strong convergence behaviors when utilizing either direct or error-feedback compression. Our bounds indicate improved solution accuracy concerning the number of agents and other federated hyperparameters while simultaneously reducing communication costs. To corroborate our theory, we also conduct in-depth numerical experiments to verify our findings, considering Top-$K$ and Sparsified-$K$ sparsification operators.
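The compression mechanisms named above can be illustrated with a generic sketch of Top-K sparsification combined with error feedback. This is the textbook form of both ideas, written for plain Python lists; the function names are hypothetical and the code is not the CompFedRL algorithm, whose details the abstract does not give.

```python
from typing import List, Tuple


def top_k(vec: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def compress_with_error_feedback(
    update: List[float], residual: List[float], k: int
) -> Tuple[List[float], List[float]]:
    """Compress (update + residual) and carry what was dropped to the next round."""
    corrected = [u + r for u, r in zip(update, residual)]
    sent = top_k(corrected, k)
    new_residual = [c - s for c, s in zip(corrected, sent)]
    return sent, new_residual


# Hypothetical local update with four entries; only two are transmitted.
update = [0.1, -3.0, 2.0, 0.5]
sent, residual = compress_with_error_feedback(update, [0.0] * 4, k=2)
print(sent, residual)
```

With direct compression the dropped entries are simply discarded; with error feedback they are stored in the residual and added back before the next compression step.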
comment: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024)
Combining Safe Intervals and RRT* for Efficient Multi-Robot Path Planning in Complex Environments
In this paper, we consider the problem of Multi-Robot Path Planning (MRPP) in continuous space to find conflict-free paths. The difficulty of the problem arises from two primary factors. First, the involvement of multiple robots leads to combinatorial decision-making, which escalates the search space exponentially. Second, the continuous space presents potentially infinite states and actions. For this problem, we propose a two-level approach where the low level is a sampling-based planner Safe Interval RRT* (SI-RRT*) that finds a collision-free trajectory for individual robots. The high level can use any method that can resolve inter-robot conflicts where we employ two representative methods that are Prioritized Planning (SI-CPP) and Conflict Based Search (SI-CCBS). Experimental results show that SI-RRT* can find a high-quality solution quickly with a small number of samples. SI-CPP exhibits improved scalability while SI-CCBS produces higher-quality solutions compared to the state-of-the-art planners for continuous space. Compared to the most scalable existing algorithm, SI-CPP achieves a success rate that is up to 94% higher with 100 robots while maintaining solution quality (i.e., flowtime, the sum of travel times of all robots) without significant compromise. SI-CPP also decreases the makespan up to 45%. SI-CCBS decreases the flowtime by 9% compared to the competitor, albeit exhibiting a 14% lower success rate.
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace and the physical world. Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) has attracted significant attention due to their remarkable perception, interaction, and reasoning capabilities, making them a promising architecture for the brain of embodied agents. However, there is no comprehensive survey for Embodied AI in the era of MLMs. In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI. Our analysis first navigates through the forefront of representative works of embodied robots and simulators, to fully understand the research focuses and their limitations. Then, we analyze four main research targets: 1) embodied perception, 2) embodied interaction, 3) embodied agent, and 4) sim-to-real adaptation, covering the state-of-the-art methods, essential paradigms, and comprehensive datasets. Additionally, we explore the complexities of MLMs in virtual and real embodied agents, highlighting their significance in facilitating interactions in dynamic digital and physical environments. Finally, we summarize the challenges and limitations of embodied AI and discuss their potential future directions. We hope this survey will serve as a foundational reference for the research community and inspire continued innovation. The associated project can be found at https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List.
comment: The first comprehensive review of Embodied AI in the era of MLMs, 39 pages. We also provide the paper list for Embodied AI: https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List
Systems and Control (CS)
Precision on Demand: Propositional Logic for Event-Trigger Threshold Regulation
We introduce a novel event-trigger threshold (ETT) regulation mechanism based on the quantitative semantics of propositional logic (PL). We exploit the expressiveness of the PL vocabulary to deliver a precise and flexible specification of ETT regulation based on system requirements and properties. Additionally, we present a modified ETT regulation mechanism that provides formal guarantees for satisfaction/violation detection of arbitrary PL properties. To validate our proposed method, we consider a convoy of vehicles in an adaptive cruise control scenario. In this scenario, the PL operators are used to encode safety properties and the ETTs are regulated accordingly, e.g., if our safety metric is high, a higher ETT can be used, while a smaller threshold is used when the system is approaching unsafe conditions. Under ideal ETT regulation conditions in this safety scenario, we show that reductions of 41.8% to 96.3% in the number of triggered events are possible compared to using a constant ETT while maintaining similar safety conditions.
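A minimal sketch of the idea, assuming the common min/max quantitative semantics for conjunction and disjunction and a hypothetical linear rule that maps the safety margin to a threshold. The predicate names, numeric values, and the `regulate_threshold` rule are illustrative assumptions, not the mechanism defined in the paper.

```python
def rob_and(*margins: float) -> float:
    """Quantitative conjunction: the weakest margin decides."""
    return min(margins)


def rob_or(*margins: float) -> float:
    """Quantitative disjunction: the strongest margin decides."""
    return max(margins)


def rob_not(margin: float) -> float:
    """Quantitative negation: flip the sign of the margin."""
    return -margin


def regulate_threshold(margin: float, ett_min: float, ett_max: float, margin_max: float) -> float:
    """Hypothetical rule: scale the ETT linearly with the safety margin, clamped."""
    frac = min(max(margin / margin_max, 0.0), 1.0)
    return ett_min + frac * (ett_max - ett_min)


# Hypothetical cruise-control property: gap > 10 m AND closing speed < 5 m/s.
gap, closing_speed = 15.0, 3.0
margin = rob_and(gap - 10.0, 5.0 - closing_speed)  # positive means satisfied
ett = regulate_threshold(margin, ett_min=0.5, ett_max=2.5, margin_max=10.0)
print(margin, ett)
```

A large positive margin yields a large threshold and therefore fewer triggered events, while a margin near zero shrinks the threshold so that updates are sent more often.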
comment: 17 pages, 7 figures
Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning
Humanoid robots, with their human-like skeletal structure, are especially suited for tasks in human-centric environments. However, this structure is accompanied by additional challenges in locomotion controller design, especially in complex real-world environments. As a result, existing humanoid robots are limited to relatively simple terrains, either with model-based control or model-free reinforcement learning. In this work, we introduce Denoising World Model Learning (DWL), an end-to-end reinforcement learning framework for humanoid locomotion control, which demonstrates the world's first humanoid robot to master real-world challenging terrains such as snowy and inclined land in the wild, up and down stairs, and extremely uneven terrains. All scenarios run the same learned neural network with zero-shot sim-to-real transfer, indicating the superior robustness and generalization capability of the proposed method.
comment: Robotics: Science and Systems (RSS), 2024. (Best Paper Award Finalist)
Decentralized Singular Value Decomposition for Extremely Large-scale Antenna Array Systems
In this article, the problems of decentralized Singular Value Decomposition (d-SVD) and decentralized Principal Component Analysis (d-PCA) are studied, which are fundamental in various signal processing applications. Two scenarios of d-SVD are considered depending on the availability of the data matrix under consideration. In the first scenario, the matrix of interest is row-wisely available in each local node in the network. In the second scenario, the matrix of interest implicitly forms an outer product generated from two different series of measurements. Combining the lightweight local rational function approximation approach and parallel averaging consensus algorithms, two d-SVD algorithms are proposed to cope with the two aforementioned scenarios. We demonstrate the proposed algorithms with two respective application examples for Extremely Large-scale Antenna Array (ELAA) systems: decentralized sensor localization via low-rank matrix completion and decentralized passive radar detection. Moreover, a non-trivial truncation technique, which employs a representative vector that is orthonormal to the principal signal subspace, is proposed to further reduce the associated communication cost with the d-SVD algorithms. Simulation results show that the proposed d-SVD algorithms converge to the centralized solution with reduced communication cost compared to those facilitated with the state-of-the-art decentralized power method.
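The averaging consensus building block mentioned above can be sketched generically: each node repeatedly moves its value toward those of its neighbours until all nodes agree on the network-wide mean. The ring topology, step size, and function names are assumptions chosen for illustration; the rational function approximation and the d-SVD algorithms themselves are not shown.

```python
from typing import Dict, List


def consensus_step(values: List[float], neighbors: Dict[int, List[int]], step: float) -> List[float]:
    """One synchronous update: each node moves toward its neighbours' values."""
    return [
        v + step * sum(values[j] - v for j in neighbors[i])
        for i, v in enumerate(values)
    ]


def run_consensus(values: List[float], neighbors: Dict[int, List[int]], step: float, iterations: int) -> List[float]:
    """Repeat the local update a fixed number of times."""
    for _ in range(iterations):
        values = consensus_step(values, neighbors, step)
    return values


# Hypothetical ring of four nodes, each holding one local measurement.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
result = run_consensus([1.0, 2.0, 3.0, 4.0], ring, step=0.25, iterations=50)
print(result)
```

Because the graph is undirected, every update preserves the sum of the values, so the common limit is the average of the initial measurements.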
Applying digital twins for the management of information in turnaround event operations in commercial airports
The aerospace sector is one of the many sectors in which large amounts of data are generated. Thanks to the evolution of technology, these data can be exploited in several ways to improve the operation and management of industrial processes. However, to achieve this goal, it is necessary to define architectures and data models that make it possible to manage and homogenise the heterogeneous data collected. In this paper, we present an Airport Digital Twin Reference Conceptualisation and data model based on FIWARE Generic Enablers and the Next Generation Service Interfaces-Linked Data standard. Concretely, we particularise the Airport Digital Twin to improve the efficiency of flight turnaround events. The proposed architecture is validated at Aberdeen International Airport with the aim of reducing delays in commercial flights. The implementation includes an application that shows the real state of the airport, combining two-dimensional and three-dimensional virtual reality representations of the stands, and a mobile application that helps ground operators to schedule departure and arrival flights.
Hierarchical-type Model Predictive Control and Experimental Evaluation for a Water-Hydraulic Artificial Muscle with Direct Data-Driven Adaptive Model Matching
High-precision displacement control for water-hydraulic artificial muscles is challenging because their strong hysteresis characteristics are hard to model precisely, and many control methods have been proposed. Recently, data-driven control methods have attracted much attention because they do not explicitly use mathematical models, making design much easier. In our previous work, we proposed fictitious reference iterative tuning (FRIT)-based model predictive control (FMPC), which combines data-driven and model-based methods for the muscle, and showed its effectiveness, in part because it can also consider input constraints. However, the problem that control performance strongly depends on prior input-output data remains unsolved. Adaptive FRIT based on directional forgetting has also been proposed; however, it is difficult to achieve the desired transient performance because it cannot consider input constraints and has no design parameters that directly determine the control performance, as there are in MPC. In this study, we propose a novel data-driven adaptive model matching-based controller that combines these methods. Experimental results show that the proposed method could significantly improve the control performance and achieve high robustness against inappropriate initial experimental data, while considering the input constraints in the design phase.
comment: 14 pages, 17 figures
Miniaturized Patch Rectenna Using 3-Turn Complementary Spiral Resonator for Wireless Power Transfer
A miniaturized linearly-polarized patch antenna is presented for Wireless Power Transfer (WPT) at 1.8 GHz. The proposed antenna consists of a patch element and a 3-turn Complementary Spiral Resonator (3-CSR), with antenna dimensions of 50 mm x 50 mm. The 3-CSR is inserted in the ground plane to reduce the antenna size. This modification also increased the impedance bandwidth from 43 MHz (1.78-1.83 GHz) to 310 MHz (1.69-2.0 GHz). Moreover, the antenna is fabricated, and the simulated and measured results are in good agreement. Additionally, a rectifier and matching circuits are designed at -10 dBm to realize a rectenna (rectifying antenna) for WPT applications. A rectenna efficiency of 53.6% is achieved at a low input power of -10 dBm.
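To put the reported figures in absolute terms, the short calculation below converts the -10 dBm input power to milliwatts and applies the 53.6% efficiency. It assumes the efficiency is the usual RF-to-DC conversion ratio (DC output power divided by RF input power), which the abstract does not state explicitly; the function names are illustrative.

```python
def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def dc_output_mw(input_dbm: float, efficiency: float) -> float:
    """DC output power, assuming efficiency = DC output / RF input."""
    return efficiency * dbm_to_mw(input_dbm)


p_in = dbm_to_mw(-10.0)             # 0.1 mW, i.e. 100 microwatts
p_dc = dc_output_mw(-10.0, 0.536)   # about 0.0536 mW, i.e. 53.6 microwatts
print(p_in, p_dc)
```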
CHIGLU: A Modular Hardware for Stepper Motorized Quadruped Robot - Design, Analysis, Fabrication, and Validation
Bio-engineered robots are under rapid development due to their maneuverability over uneven surfaces. This advancement paves the way for experimenting with versatile electrical system developments with various motors. In this research paper, we present the design, fabrication, and analysis of a versatile printed circuit board (PCB) as the main system that allows for the control of twelve stepper motors by stacking low-budget stepper motor controllers and a widely used micro-controller unit. The primary motivation behind the design is to offer a compact and efficient hardware solution for controlling multiple stepper motors of a quadruped robot while meeting the required power budget. The research focuses on the hardware's architecture, stackable design, power budget planning, and a thorough analysis. Additionally, a PDN (Power Distribution Network) analysis simulation is done to ensure that the voltage and current density are within the expected parameters. The hardware design also takes a deep dive into design for manufacturability (DFM). The ability to stack the controllers on the development board provides insights into the feasibility of swapping the board's components. The findings from this research make a significant contribution to the advancement of stepper motor control systems for multi-axis applications in bio-inspired robots, offering a convenient form factor and reliable performance.
comment: LaTeX, 26 pages with 25 figures
Modular Meshed Ultra-Wideband Aided Inertial Navigation with Robust Anchor Calibration
This paper introduces a generic filter-based state estimation framework that supports two state-decoupling strategies based on cross-covariance factorization. These strategies reduce the computational complexity and inherently support true modularity -- a prerequisite for handling and processing meshed range measurements among a time-varying set of devices. In order to utilize these measurements in the estimation framework, positions of newly detected stationary devices (anchors) and the pairwise biases between the ranging devices are required. In this work, an autonomous calibration procedure for new anchors is presented that utilizes range measurements from multiple tags as well as already known anchors. To improve the robustness, an outlier rejection method is introduced. After the calibration is performed, the sensor fusion framework obtains initial beliefs of the anchor positions and dictionaries of pairwise biases, in order to fuse range measurements obtained from new anchors in a tightly coupled manner. The effectiveness of the filter and calibration framework has been validated through evaluations on a recorded dataset and real-world experiments.
Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations
This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, this paper focuses on developing an efficient online search strategy which jointly estimates channels, guides UAV positioning, and optimizes resource allocation. Analytically exploiting the geometric properties of the equipotential surface, this paper develops an LOS discovery trajectory on the equipotential surface while the closed-form search directions are determined using perturbation theory. Since the explicit expression of the equipotential surface is not available, this paper proposes to locally construct a channel model for each user in the LOS regime utilizing polynomial regression without depending on user locations or propagation distance. A class of spiral trajectories to simultaneously construct the LOS channels and search on the equipotential surface is developed. An optimal radius of the spiral and an optimal measurement pattern for channel gain estimation are derived to minimize the mean squared error (MSE) of the locally constructed channel. Numerical results on real 3D city maps demonstrate that the proposed scheme achieves over 95% of the performance of a 3D exhaustive search scheme with just a 3-kilometer search.
Revisiting time-variant complex conjugate matrix equations with their corresponding real field time-variant large-scale linear equations, neural hypercomplex numbers space compressive approximation approach
Large-scale linear equations and high dimensionality have been hot topics in deep learning, machine learning, control, and scientific computing. Because of special conjugate operation characteristics, time-variant complex conjugate matrix equations need to be transformed into corresponding real field time-variant large-scale linear equations. In this paper, zeroing neural dynamic models based on complex field error (called Con-CZND1) and based on real field error (called Con-CZND2) are proposed for in-depth analysis. Con-CZND1 has fewer elements because of the direct processing of complex matrices. Con-CZND2 needs to be transformed into the real field, with more elements, and its performance is affected by the main diagonal dominance of coefficient matrices. A neural hypercomplex numbers space compressive approximation approach (NHNSCAA) is innovatively proposed. Then the Con-CZND1 conj model is constructed. Numerical experiments verify the effectiveness of the Con-CZND1 conj model and highlight the importance of NHNSCAA.
Speeding Ticket: Unveiling the Energy and Emission Burden of AI-Accelerated Distributed and Decentralized Power Dispatch Models
As the modern electrical grid shifts towards distributed systems, there is an increasing need for rapid decision-making tools. Artificial Intelligence (AI) and Machine Learning (ML) technologies are now pivotal in enhancing the efficiency of power dispatch operations, effectively overcoming the constraints of traditional optimization solvers with long computation times. However, this increased efficiency comes at a high environmental cost, escalating energy consumption and carbon emissions from computationally intensive AI/ML models. Despite their potential to transform power systems management, the environmental impact of these technologies often remains an overlooked aspect. This paper introduces the first comparison of energy demands across centralized, distributed, and decentralized ML-driven power dispatch models. We provide a detailed analysis of the energy and carbon footprint required for continuous operations on an IEEE 33-bus system, highlighting the critical trade-offs between operational efficiency and environmental sustainability. This study aims to guide future AI implementations in energy systems, ensuring they not only enhance efficiency but also prioritize ecological integrity.
Stable dynamic pricing scheme independent of lane-choice models for high-occupancy-toll lanes
A stable dynamic pricing scheme is essential to guarantee the desired performance of high-occupancy-toll (HOT) lanes, where single-occupancy vehicles (SOVs) can pay a price to use the HOT lanes. However, existing methods apply either to only one type of lane-choice model with unknown parameters or to different types of lane-choice models with known parameters. In this study, we present a new dynamic pricing scheme that is stable and applies to different types of lane-choice models with unknown parameters. There are two operational objectives for operating HOT lanes: (i) to maintain the free-flow condition to guarantee the travel time reliability; and (ii) to maximize the HOT lanes' throughput to minimize the system's total delay. The traffic dynamics on both HOT and general purpose (GP) lanes are described by point queue models, where the queueing times are determined by the demands and capacities. We consider three types of lane-choice models: the multinomial logit model when SOVs share the same value of time, the vehicle-based user equilibrium model when SOVs' values of time are heterogeneous and follow a distribution, and a general lane-choice model. We demonstrate that the second objective is approximately equivalent to the social welfare optimization principle for the logit model. Observing that the dynamic price and the excess queueing time on the GP lanes are linearly correlated in all the lane-choice models, we propose a feedback control method to determine the dynamic prices based on two integral controllers. We further present a method to estimate the parameters of a lane-choice model once its type is known. Analytically, we prove that the equilibrium state of the closed-loop system with constant demand patterns is ideal, since the two objectives are achieved in it, and that it is asymptotically stable. With numerical examples, we verify the effectiveness of the solution method.
comment: 24 pages, 10 figures
Synergistic and Efficient Edge-Host Communication for Energy Harvesting Wireless Sensor Networks
There is an increasing demand for intelligent processing on ultra-low-power internet of things (IoT) devices. Recent works have shown substantial efficiency boosts by executing inferences directly on the IoT device (node) rather than transmitting data. However, the computation and power demands of Deep Neural Network (DNN)-based inference pose significant challenges in an energy-harvesting wireless sensor network (EH-WSN). Moreover, these tasks often require responses from multiple physically distributed EH sensor nodes, which imposes crucial system optimization challenges in addition to per-node constraints. To address these challenges, we propose Seeker, a hardware-software co-design approach for increasing on-sensor computation, reducing communication volume, and maximizing inference completion, without violating the quality of service, in EH-WSNs coordinated by a mobile device. Seeker uses a store-and-execute approach to complete a subset of inferences on the EH sensor node, reducing communication with the mobile host. Further, for those inferences left unfinished because of the harvested energy constraints, it leverages task-aware coreset construction to efficiently communicate compact features to the host device. We evaluate Seeker for human activity recognition as well as predictive maintenance, and show a ~8.9x reduction in communication data volume with 86.8% accuracy, surpassing the 81.2% accuracy of the state-of-the-art.
comment: arXiv admin note: substantial text overlap with arXiv:2204.13106
Synthetic Grid Generator: Synthesizing Large-Scale Power Distribution Grids using Open Street Map
Nowadays, various stakeholders involved in the analysis of electric power distribution grids face difficulties in acquiring data on the grid topology and the parameters of grid assets. To mitigate the problem and possibly accelerate the accomplishment of grid studies without access to real data, we propose a novel approach for generating synthetic distribution grids (Syngrids) of (almost) arbitrary size replicating the characteristics of real medium- and low-voltage distribution networks. The method enables large-scale testing without incurring the burden of retrieving and pre-processing real-world data. The proposed algorithm exploits the publicly available information of Open Street Map (OSM). By leveraging geospatial data of real buildings and road networks, the approach allows the construction of a Syngrid of chosen size with realistic topology and electrical parameters. It is shown that typical power-flow and short-circuit calculations can be performed on Syngrids, ensuring convergence. To validate the effectiveness of the algorithm and the meaningful similarity of the output to real grids, the topological and electrical characteristics of a Syngrid are compared to their real-world counterparts. Finally, an open-source web platform named Synthetic Grid Generator (SGG), based on the proposed algorithm, can be used by various stakeholders for the creation of synthetic grids.
comment: 8 pages, 8 figures
Shock waves in nonlinear transmission lines
In the first half of the paper, we consider the interaction between small-amplitude travelling waves ("sound") and shock waves in a transmission line containing both nonlinear capacitors and nonlinear inductors. We calculate the "sound" wave coefficients of reflection from and transmission through the shock wave. These coefficients are expressed in terms of the speeds of the "sound" waves relative to the shock and the wave impedances. In the second half of the paper, we explicitly take into consideration the dissipation in the system, introducing ohmic resistors shunting the inductors and also in series with the capacitors. This allows us to justify the conditions on the shocks postulated in the first half of the paper. This also allows us to describe the shocks as physical objects of finite width and study their profiles. In some particular cases, the profiles were obtained in terms of elementary functions.
comment: pdfLaTeX, 7 pages, 4 figures. As accepted for publication in Physica Status Solidi B
Extremum Seeking Tracking for Derivative-free Distributed Optimization
In this paper, we deal with a network of agents that want to cooperatively minimize the sum of local cost functions depending on a common decision variable. We consider the challenging scenario in which objective functions are unknown and agents only have access to local measurements of their local functions. We propose a novel distributed algorithm that combines a recent gradient tracking policy with an extremum-seeking technique to estimate the global descent direction. The joint use of these two techniques results in a distributed optimization scheme that provides arbitrarily accurate solution estimates through the combination of Lyapunov and averaging analysis approaches with consensus theory. We perform numerical simulations in a personalized optimization framework to corroborate the theoretical results.
Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control
This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a novel dual-history architecture, utilizing both a long-term and short-term input/output (I/O) history of the robot. This control architecture, when trained through the proposed end-to-end RL approach, consistently outperforms other methods across a diverse range of skills in both simulation and the real world. The study also delves into the adaptivity and robustness introduced by the proposed RL system in developing locomotion controllers. We demonstrate that the proposed architecture can adapt to both time-invariant dynamics shifts and time-variant changes, such as contact events, by effectively using the robot's I/O history. Additionally, we identify task randomization as another key source of robustness, fostering better task generalization and compliance to disturbances. The resulting control policies can be successfully deployed on Cassie, a torque-controlled human-sized bipedal robot. This work pushes the limits of agility for bipedal robots through extensive real-world experiments. We demonstrate a diverse range of locomotion skills, including: robust standing, versatile walking, fast running with a demonstration of a 400-meter dash, and a diverse set of jumping skills, such as standing long jumps and high jumps.
comment: Accepted in International Journal of Robotics Research (IJRR) 2024. This is the author's version and will no longer be updated as the copyright may get transferred at any time
Control Barrier Function Based Design of Gradient Flows for Constrained Nonlinear Programming
This paper considers the problem of designing a continuous-time dynamical system that solves a constrained nonlinear optimization problem and makes the feasible set forward invariant and asymptotically stable. The invariance of the feasible set makes the dynamics anytime, when viewed as an algorithm, meaning it returns a feasible solution regardless of when it is terminated. Our approach augments the gradient flow of the objective function with inputs defined by the constraint functions, treats the feasible set as a safe set, and synthesizes a safe feedback controller using techniques from the theory of control barrier functions. The resulting closed-loop system, termed safe gradient flow, can be viewed as a primal-dual flow, where the state corresponds to the primal variables and the inputs correspond to the dual ones. We provide a detailed suite of conditions based on constraint qualification under which (both isolated and nonisolated) local minimizers are stable with respect to the feasible set and the whole state space. Comparisons with other continuous-time methods for optimization in a simple example illustrate the advantages of the safe gradient flow.
comment: Full version, with appendix, of work appearing in IEEE Transactions on Automatic Control
Physics and technology of Laser Lightning Control
The recent development of high average, high peak power lasers has revived the effort of using lasers as a potential tool to influence natural lightning. Although impressive, the current progress in laser lightning control technology may only be the beginning of a new area involving a positive feedback between powerful laser development and atmospheric research. In this review paper, we critically evaluate the past, present and future of Laser Lightning Control (LLC), considering both its technological and scientific significance in atmospheric research.
comment: Revised version after first round of peer review. Restructured, added more physics context, and moved some tables to a newly created appendix
Modeling pedestrian fundamental diagram based on Directional Statistics
Understanding pedestrian dynamics is crucial for appropriately designing pedestrian spaces. The pedestrian fundamental diagram (FD), which describes the relationship between pedestrian flow and density within a given space, characterizes these dynamics. Pedestrian FDs are significantly influenced by the flow type, such as uni-directional, bi-directional, and crossing flows. However, to the authors' knowledge, generalized pedestrian FDs that are applicable to various flow types have not been proposed. This may be due to the difficulty of using statistical methods to characterize the flow types. The flow types significantly depend on the angles of pedestrian movement; however, these angles cannot be processed by standard statistics due to their periodicity. In this study, we propose a comprehensive model for pedestrian FDs that can describe the pedestrian dynamics for various flow types by applying Directional Statistics. First, we develop a novel statistic describing the pedestrian flow type solely from pedestrian trajectory data using Directional Statistics. Then, we formulate a comprehensive pedestrian FD model that can be applied to various flow types by incorporating the proposed statistics into a traditional pedestrian FD model. The proposed model was validated using actual pedestrian trajectory data. The results confirmed that the model effectively represents the essential nature of pedestrian dynamics, such as the capacity reduction due to conflict of crossing flows and the capacity improvement due to the lane formation in bi-directional flows.
Cooperative Hypothesis Testing by Two Observers with Asymmetric Information
We consider the binary hypothesis testing problem with two observers. There are two possible states of nature (or hypotheses). Observations collected by the two observers are statistically related to the true state of nature. The joint distribution of the observations collected and the true state of nature is unknown to the observers. There are two problems to be solved by the observers: (i) true state of nature is known: find the distribution of the local information collected; (ii) true state of nature is unknown: collaboratively estimate the same using the distributions found by solving the first problem. We present four algorithms, each having two phases where the two problems are solved, with emphasis on the information exchange between the observers and resulting patterns. We prove different properties of the algorithms including the following: (i) the probability spaces constructed as a consequence of solving the first problem are dependent on the information patterns at the observers; (ii) the rate of decay of probability of error of algorithms while solving the second problem is dependent on the information exchange between the observers. We present a numerical example demonstrating the four algorithms.
comment: Journal Paper to be published
Systems and Control (EESS)
Precision on Demand: Propositional Logic for Event-Trigger Threshold Regulation
We introduce a novel event-trigger threshold (ETT) regulation mechanism based on the quantitative semantics of propositional logic (PL). We exploit the expressiveness of the PL vocabulary to deliver a precise and flexible specification of ETT regulation based on system requirements and properties. Additionally, we present a modified ETT regulation mechanism that provides formal guarantees for satisfaction/violation detection of arbitrary PL properties. To validate our proposed method, we consider a convoy of vehicles in an adaptive cruise control scenario. In this scenario, the PL operators are used to encode safety properties and the ETTs are regulated accordingly, e.g., if our safety metric is high there can be a higher ETT, while a smaller threshold is used when the system is approaching unsafe conditions. Under ideal ETT regulation conditions in this safety scenario, we show that reductions of 41.8% to 96.3% in the number of triggered events are possible compared to using a constant ETT while maintaining similar safety conditions.
comment: 17 pages, 7 figures
Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning
Humanoid robots, with their human-like skeletal structure, are especially suited for tasks in human-centric environments. However, this structure is accompanied by additional challenges in locomotion controller design, especially in complex real-world environments. As a result, existing humanoid robots are limited to relatively simple terrains, either with model-based control or model-free reinforcement learning. In this work, we introduce Denoising World Model Learning (DWL), an end-to-end reinforcement learning framework for humanoid locomotion control, which demonstrates the world's first humanoid robot to master real-world challenging terrains such as snowy and inclined land in the wild, up and down stairs, and extremely uneven terrains. All scenarios run the same learned neural network with zero-shot sim-to-real transfer, indicating the superior robustness and generalization capability of the proposed method.
comment: Robotics: Science and Systems (RSS), 2024. (Best Paper Award Finalist)
Decentralized Singular Value Decomposition for Extremely Large-scale Antenna Array Systems
In this article, the problems of decentralized Singular Value Decomposition (d-SVD) and decentralized Principal Component Analysis (d-PCA) are studied, which are fundamental in various signal processing applications. Two scenarios of d-SVD are considered depending on the availability of the data matrix under consideration. In the first scenario, the matrix of interest is available row-wise in each local node in the network. In the second scenario, the matrix of interest implicitly forms an outer product generated from two different series of measurements. Combining the lightweight local rational function approximation approach and parallel averaging consensus algorithms, two d-SVD algorithms are proposed to cope with the two aforementioned scenarios. We demonstrate the proposed algorithms with two respective application examples for Extremely Large-scale Antenna Array (ELAA) systems: decentralized sensor localization via low-rank matrix completion and decentralized passive radar detection. Moreover, a non-trivial truncation technique, which employs a representative vector that is orthonormal to the principal signal subspace, is proposed to further reduce the communication cost associated with the d-SVD algorithms. Simulation results show that the proposed d-SVD algorithms converge to the centralized solution with reduced communication cost compared to those facilitated with the state-of-the-art decentralized power method.
Applying digital twins for the management of information in turnaround event operations in commercial airports
The aerospace sector is one of the many sectors in which large amounts of data are generated. Thanks to the evolution of technology, these data can be exploited in several ways to improve the operation and management of industrial processes. However, to achieve this goal, it is necessary to define architectures and data models that allow the heterogeneous data collected to be managed and homogenised. In this paper, we present an Airport Digital Twin Reference Conceptualisation and data model based on FIWARE Generic Enablers and the Next Generation Service Interfaces-Linked Data standard. Concretely, we particularise the Airport Digital Twin to improve the efficiency of flight turnaround events. The proposed architecture is validated at Aberdeen International Airport with the aim of reducing delays in commercial flights. The implementation includes an application that shows the real state of the airport, combining two-dimensional and three-dimensional virtual reality representations of the stands, and a mobile application that helps ground operators to schedule departure and arrival flights.
Hierarchical-type Model Predictive Control and Experimental Evaluation for a Water-Hydraulic Artificial Muscle with Direct Data-Driven Adaptive Model Matching
High-precision displacement control for water-hydraulic artificial muscles is a challenging issue due to their strong hysteresis characteristics, which are hard to model precisely, and many control methods have been proposed. Recently, data-driven control methods have attracted much attention because they do not explicitly use mathematical models, making design much easier. In our previous work, we proposed fictitious reference iterative tuning (FRIT)-based model predictive control (FMPC), which combines data-driven and model-based methods for the muscle, and showed its effectiveness because it can consider input constraints as well. However, the problem that control performance strongly depends on prior input-output data remains unsolved. Adaptive FRIT based on directional forgetting has also been proposed; however, it is difficult to achieve the desired transient performance because it cannot consider input constraints and, unlike MPC, it has no design parameters that directly determine the control performance. In this study, we propose a novel data-driven adaptive model matching-based controller that combines these methods. Experimental results show that the proposed method could significantly improve the control performance and achieve high robustness against inappropriate initial experimental data, while considering the input constraints in the design phase.
comment: 14 pages, 17 figures
Miniaturized Patch Rectenna Using 3-Turn Complementary Spiral Resonator for Wireless Power Transfer
A miniaturized linearly-polarized patch antenna is presented for Wireless Power Transfer (WPT) at 1.8 GHz. The proposed antenna consists of a patch element and a 3-turn Complementary Spiral Resonator (3-CSR) with antenna dimensions of 50 mm x 50 mm. The 3-CSR is inserted in the ground plane to reduce the antenna size. This modification also increased the impedance bandwidth from 43 MHz (1.78-1.83 GHz) to 310 MHz (1.69-2.0 GHz). Moreover, the antenna is fabricated, and the simulated and measured results are in good agreement. Additionally, a rectifier and matching circuits are designed at -10 dBm to realize a rectenna (rectifying antenna) for WPT applications. A rectenna efficiency of 53.6% is achieved at a low input power of -10 dBm.
CHIGLU: A Modular Hardware for Stepper Motorized Quadruped Robot - Design, Analysis, Fabrication, and Validation
Bio-engineered robots are under rapid development due to their maneuverability over uneven surfaces. This advancement paves the way for experimenting with versatile electrical system developments with various motors. In this research paper, we present the design, fabrication, and analysis of a versatile printed circuit board (PCB) as the main system that allows for the control of twelve stepper motors by stacking low-budget stepper motor controllers and a widely used micro-controller unit. The primary motivation behind the design is to offer a compact and efficient hardware solution for controlling multiple stepper motors of a quadruped robot while meeting the required power budget. The research focuses on the hardware's architecture, stackable design, power budget planning, and a thorough analysis. Additionally, a PDN (Power Distribution Network) analysis simulation is done to ensure that the voltage and current density are within the expected parameters. The hardware design also takes a deep dive into design for manufacturability (DFM). The ability to stack the controllers on the development board provides insights into the feasibility of swapping the board's components. The findings from this research make a significant contribution to the advancement of stepper motor control systems for multi-axis applications in bio-inspired robots, offering a convenient form factor and reliable performance.
comment: LaTeX, 26 pages with 25 figures
Modular Meshed Ultra-Wideband Aided Inertial Navigation with Robust Anchor Calibration
This paper introduces a generic filter-based state estimation framework that supports two state-decoupling strategies based on cross-covariance factorization. These strategies reduce the computational complexity and inherently support true modularity -- a prerequisite for handling and processing meshed range measurements among a time-varying set of devices. In order to utilize these measurements in the estimation framework, positions of newly detected stationary devices (anchors) and the pairwise biases between the ranging devices are required. In this work, an autonomous calibration procedure for new anchors is presented that utilizes range measurements from multiple tags as well as already known anchors. To improve the robustness, an outlier rejection method is introduced. After the calibration is performed, the sensor fusion framework obtains initial beliefs of the anchor positions and dictionaries of pairwise biases, in order to fuse range measurements obtained from new anchors in a tightly coupled manner. The effectiveness of the filter and calibration framework has been validated through evaluations on a recorded dataset and real-world experiments.
Active Search for Low-altitude UAV Sensing and Communication for Users at Unknown Locations
This paper studies optimal unmanned aerial vehicle (UAV) placement to ensure line-of-sight (LOS) communication and sensing for a cluster of ground users possibly in deep shadow, while the UAV maintains backhaul connectivity with a base station (BS). The key challenges include unknown user locations, uncertain channel model parameters, and unavailable urban structure. Addressing these challenges, this paper focuses on developing an efficient online search strategy which jointly estimates channels, guides UAV positioning, and optimizes resource allocation. Analytically exploiting the geometric properties of the equipotential surface, this paper develops an LOS discovery trajectory on the equipotential surface while the closed-form search directions are determined using perturbation theory. Since the explicit expression of the equipotential surface is not available, this paper proposes to locally construct a channel model for each user in the LOS regime utilizing polynomial regression without depending on user locations or propagation distance. A class of spiral trajectories to simultaneously construct the LOS channels and search on the equipotential surface is developed. An optimal radius of the spiral and an optimal measurement pattern for channel gain estimation are derived to minimize the mean squared error (MSE) of the locally constructed channel. Numerical results on real 3D city maps demonstrate that the proposed scheme achieves over 95% of the performance of a 3D exhaustive search scheme with just a 3-kilometer search.
Revisiting time-variant complex conjugate matrix equations with their corresponding real field time-variant large-scale linear equations, neural hypercomplex numbers space compressive approximation approach
Large-scale linear equations and high dimension have been hot topics in deep learning, machine learning, control, and scientific computing. Because of special conjugate operation characteristics, time-variant complex conjugate matrix equations need to be transformed into corresponding real field time-variant large-scale linear equations. In this paper, zeroing neural dynamic models based on complex field error (called Con-CZND1) and based on real field error (called Con-CZND2) are proposed for in-depth analysis. Con-CZND1 has fewer elements because of the direct processing of complex matrices. Con-CZND2 needs to be transformed into the real field, with more elements, and its performance is affected by the main diagonal dominance of coefficient matrices. A neural hypercomplex numbers space compressive approximation approach (NHNSCAA) is innovatively proposed. Then the Con-CZND1 conj model is constructed. Numerical experiments verify the effectiveness of the Con-CZND1 conj model and highlight the importance of NHNSCAA.
Speeding Ticket: Unveiling the Energy and Emission Burden of AI-Accelerated Distributed and Decentralized Power Dispatch Models
As the modern electrical grid shifts towards distributed systems, there is an increasing need for rapid decision-making tools. Artificial Intelligence (AI) and Machine Learning (ML) technologies are now pivotal in enhancing the efficiency of power dispatch operations, effectively overcoming the constraints of traditional optimization solvers with long computation times. However, this increased efficiency comes at a high environmental cost, escalating energy consumption and carbon emissions from computationally intensive AI/ML models. Despite their potential to transform power systems management, the environmental impact of these technologies often remains an overlooked aspect. This paper introduces the first comparison of energy demands across centralized, distributed, and decentralized ML-driven power dispatch models. We provide a detailed analysis of the energy and carbon footprint required for continuous operations on an IEEE 33 bus system, highlighting the critical trade-offs between operational efficiency and environmental sustainability. This study aims to guide future AI implementations in energy systems, ensuring they enhance not only efficiency but also prioritize ecological integrity.
Stable dynamic pricing scheme independent of lane-choice models for high-occupancy-toll lanes
A stable dynamic pricing scheme is essential to guarantee the desired performance of high-occupancy-toll (HOT) lanes, where single-occupancy vehicles (SOVs) can pay a price to use the HOT lanes. But existing methods apply to either only one type of lane-choice models with unknown parameters or different types of lane-choice models but with known parameters. In this study we present a new dynamic pricing scheme that is stable and applies to different types of lane-choice models with unknown parameters. There are two operational objectives for operating HOT lanes: (i) to maintain the free-flow condition to guarantee the travel time reliability; and (ii) to maximize the HOT lanes' throughput to minimize the system's total delay. The traffic dynamics on both HOT and general purpose (GP) lanes are described by point queue models, where the queueing times are determined by the demands and capacities. We consider three types of lane-choice models: the multinomial logit model when SOVs share the same value of time, the vehicle-based user equilibrium model when SOVs' values of time are heterogeneous and follow a distribution, and a general lane-choice model. We demonstrate that the second objective is approximately equivalent to the social welfare optimization principle for the logit model. Observing that the dynamic price and the excess queueing time on the GP lanes are linearly correlated in all the lane-choice models, we propose a feedback control method to determine the dynamic prices based on two integral controllers. We further present a method to estimate the parameters of a lane-choice model once its type is known. Analytically we prove that the equilibrium state of the closed-loop system with constant demand patterns is ideal, since the two objectives are achieved in it, and that it is asymptotically stable. With numerical examples we verify the effectiveness of the solution method.
comment: 24 pages, 10 figures
Synergistic and Efficient Edge-Host Communication for Energy Harvesting Wireless Sensor Networks
There is an increasing demand for intelligent processing on ultra-low-power internet of things (IoT) devices. Recent works have shown substantial efficiency boosts by executing inferences directly on the IoT device (node) rather than transmitting data. However, the computation and power demands of Deep Neural Network (DNN)-based inference pose significant challenges in an energy-harvesting wireless sensor network (EH-WSN). Moreover, these tasks often require responses from multiple physically distributed EH sensor nodes, which impose crucial system optimization challenges in addition to per-node constraints. To address these challenges, we propose Seeker, a hardware-software co-design approach for increasing on-sensor computation, reducing communication volume, and maximizing inference completion, without violating the quality of service, in EH-WSNs coordinated by a mobile device. Seeker uses a store-and-execute approach to complete a subset of inferences on the EH sensor node, reducing communication with the mobile host. Further, for those inferences unfinished because of the harvested energy constraints, it leverages task-aware coreset construction to efficiently communicate compact features to the host device. We evaluate Seeker for human activity recognition, as well as predictive maintenance and show ~8.9x reduction in communication data volume with 86.8% accuracy, surpassing the 81.2% accuracy of the state-of-the-art.
comment: arXiv admin note: substantial text overlap with arXiv:2204.13106
Synthetic Grid Generator: Synthesizing Large-Scale Power Distribution Grids using Open Street Map
Nowadays, various stakeholders involved in the analysis of electric power distribution grids face difficulties in the data acquisition related to the grid topology and parameters of grid assets. To mitigate the problem and possibly accelerate the accomplishment of grid studies without access to real data, we propose a novel approach for generating synthetic distribution grids (Syngrids) of (almost) arbitrary size replicating the characteristics of real medium- and low-voltage distribution networks. The method enables large-scale testing without incurring the burden of retrieving and pre-processing real-world data. The proposed algorithm exploits the publicly available information of Open Street Map (OSM). By leveraging geospatial data of real buildings and road networks, the approach allows constructing a Syngrid of chosen size with realistic topology and electrical parameters. It is shown that typical power-flow and short-circuit calculations can be performed on Syngrids ensuring convergence. Within the context of validating the effectiveness of the algorithm and the meaningful similarity of the output to real grids, the topological and electrical characteristics of a Syngrid are compared to their real-world counterparts. Finally, an open-source web platform named Synthetic Grid Generator (SGG) and based on the proposed algorithm can be used by various stakeholders for the creation of synthetic grids.
comment: 8 pages, 8 figures
Shock waves in nonlinear transmission lines
In the first half of the paper we consider interaction between the small amplitude travelling waves ("sound") and the shock waves in the transmission line containing both nonlinear capacitors and nonlinear inductors. We calculate the "sound" wave coefficient of reflection from (coefficient of transmission through) the shock wave. These coefficients are expressed in terms of the speeds of the "sound" waves relative to the shock and the wave impedances. In the second half of the paper we explicitly include into consideration the dissipation in the system, introducing ohmic resistors shunting the inductors and also in series with the capacitors. This allows us to justify the conditions on the shocks, postulated in the first half of the paper. This also allows us to describe the shocks as physical objects of finite width and study their profiles. In some particular cases the profiles were obtained in terms of elementary functions.
comment: pdfLaTeX, 7 pages, 4 figures. As accepted for publication in Physica Status Solidi B
Extremum Seeking Tracking for Derivative-free Distributed Optimization
In this paper, we deal with a network of agents that want to cooperatively minimize the sum of local cost functions depending on a common decision variable. We consider the challenging scenario in which objective functions are unknown and agents have only access to local measurements of their local functions. We propose a novel distributed algorithm that combines a recent gradient tracking policy with an extremum-seeking technique to estimate the global descent direction. The joint use of these two techniques results in a distributed optimization scheme that provides arbitrarily accurate solution estimates through the combination of Lyapunov and averaging analysis approaches with consensus theory. We perform numerical simulations in a personalized optimization framework to corroborate the theoretical results.
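The combination described above (gradient tracking driven by derivative-free gradient estimates) can be illustrated with a small discrete-time sketch. This is not the paper's algorithm: the extremum-seeking estimator is swapped here for a plain central finite difference, and the quadratic local costs, the mixing matrix, the step size, and all function names are hypothetical choices made only for illustration.

```python
def estimate_gradient(cost, x, delta=1e-3):
    """Derivative-free gradient estimate from two cost measurements.

    Stand-in for extremum seeking (assumption): a central finite difference.
    """
    return (cost(x + delta) - cost(x - delta)) / (2.0 * delta)


def mix(weights, values):
    """One consensus step: each agent averages its neighbours' values."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def gradient_tracking(costs, weights, step=0.1, iterations=500):
    """Scalar gradient tracking with estimated (not exact) local gradients."""
    x = [0.0] * len(costs)
    grad = [estimate_gradient(c, xi) for c, xi in zip(costs, x)]
    tracker = list(grad)  # each agent's running estimate of the average gradient
    for _ in range(iterations):
        x = [xm - step * t for xm, t in zip(mix(weights, x), tracker)]
        new_grad = [estimate_gradient(c, xi) for c, xi in zip(costs, x)]
        tracker = [tm + gn - go
                   for tm, gn, go in zip(mix(weights, tracker), new_grad, grad)]
        grad = new_grad
    return x


# Hypothetical local costs f_i(x) = 0.5 * (x - b_i)^2; the sum is minimized at mean(b_i).
TARGETS = [1.0, 2.0, 6.0]
COSTS = [lambda x, b=b: 0.5 * (x - b) ** 2 for b in TARGETS]

# Hypothetical doubly stochastic mixing matrix for a fully connected 3-agent network.
WEIGHTS = [[0.50, 0.25, 0.25],
           [0.25, 0.50, 0.25],
           [0.25, 0.25, 0.50]]

estimates = gradient_tracking(COSTS, WEIGHTS)
```

In this toy setting every agent's estimate approaches the minimizer of the summed cost, mean(1, 2, 6) = 3, even though no agent evaluates a gradient directly.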
Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control
This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a novel dual-history architecture, utilizing both a long-term and short-term input/output (I/O) history of the robot. This control architecture, when trained through the proposed end-to-end RL approach, consistently outperforms other methods across a diverse range of skills in both simulation and the real world. The study also delves into the adaptivity and robustness introduced by the proposed RL system in developing locomotion controllers. We demonstrate that the proposed architecture can adapt to both time-invariant dynamics shifts and time-variant changes, such as contact events, by effectively using the robot's I/O history. Additionally, we identify task randomization as another key source of robustness, fostering better task generalization and compliance to disturbances. The resulting control policies can be successfully deployed on Cassie, a torque-controlled human-sized bipedal robot. This work pushes the limits of agility for bipedal robots through extensive real-world experiments. We demonstrate a diverse range of locomotion skills, including: robust standing, versatile walking, fast running with a demonstration of a 400-meter dash, and a diverse set of jumping skills, such as standing long jumps and high jumps.
comment: Accepted in International Journal of Robotics Research (IJRR) 2024. This is the author's version and will no longer be updated as the copyright may get transferred at anytime
Control Barrier Function Based Design of Gradient Flows for Constrained Nonlinear Programming
This paper considers the problem of designing a continuous-time dynamical system that solves a constrained nonlinear optimization problem and makes the feasible set forward invariant and asymptotically stable. The invariance of the feasible set makes the dynamics anytime, when viewed as an algorithm, meaning it returns a feasible solution regardless of when it is terminated. Our approach augments the gradient flow of the objective function with inputs defined by the constraint functions, treats the feasible set as a safe set, and synthesizes a safe feedback controller using techniques from the theory of control barrier functions. The resulting closed-loop system, termed safe gradient flow, can be viewed as a primal-dual flow, where the state corresponds to the primal variables and the inputs correspond to the dual ones. We provide a detailed suite of conditions based on constraint qualification under which (both isolated and nonisolated) local minimizers are stable with respect to the feasible set and the whole state space. Comparisons with other continuous-time methods for optimization in a simple example illustrate the advantages of the safe gradient flow.
comment: Full version, with appendix, of work appearing in IEEE Transactions on Automatic Control
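As a rough illustration of the idea in the abstract above, the sketch below integrates a gradient flow that is minimally modified so that a control-barrier-function condition holds. It assumes a single inequality constraint g(x) <= 0 and the condition grad_g(x)·v <= -alpha·g(x), for which the closest admissible direction to -grad_f(x) has a closed form. Whether this matches the paper's exact formulation is an assumption, and the toy problem, the parameter values, and the function names are hypothetical. Forward Euler is used for simplicity, so feasibility is only preserved up to a small discretization error.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def safe_direction(x, grad_f, g, grad_g, alpha):
    """Closest direction to -grad_f(x) satisfying grad_g(x)·v <= -alpha * g(x)."""
    gf, gg = grad_f(x), grad_g(x)
    gg_sq = dot(gg, gg)
    if gg_sq == 0.0:
        multiplier = 0.0  # constraint gradient vanishes; nothing to correct
    else:
        # Multiplier of the single constraint, playing the role of a dual variable.
        multiplier = max(0.0, (alpha * g(x) - dot(gg, gf)) / gg_sq)
    return [-a - multiplier * b for a, b in zip(gf, gg)]


def simulate(x0, grad_f, g, grad_g, alpha=10.0, step=1e-3, iterations=10000):
    """Forward-Euler integration; also records the largest constraint value seen."""
    x = list(x0)
    worst = g(x)
    for _ in range(iterations):
        v = safe_direction(x, grad_f, g, grad_g, alpha)
        x = [a + step * b for a, b in zip(x, v)]
        worst = max(worst, g(x))
    return x, worst


# Hypothetical toy problem: minimize ||x - (2, 0)||^2 subject to ||x||^2 - 1 <= 0.
CENTER = (2.0, 0.0)


def grad_f(x):
    return [2.0 * (x[0] - CENTER[0]), 2.0 * (x[1] - CENTER[1])]


def g(x):
    return x[0] ** 2 + x[1] ** 2 - 1.0


def grad_g(x):
    return [2.0 * x[0], 2.0 * x[1]]


x_final, worst_violation = simulate([0.0, 0.5], grad_f, g, grad_g)
```

Starting from a feasible point, the trajectory in this example approaches the constrained minimizer (1, 0) while the constraint value stays close to or below zero along the way, which is the "anytime" property in miniature.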
Physics and technology of Laser Lightning Control
The recent development of high average, high peak power lasers has revived the effort of using lasers as a potential tool to influence natural lightning. Although impressive, the current progress in laser lightning control technology may only be the beginning of a new area involving a positive feedback between powerful laser development and atmospheric research. In this review paper, we critically evaluate the past, present and future of Laser Lightning Control (LLC), considering both its technological and scientific significance in atmospheric research.
comment: Revised version after first round of peer-review. Restructured, added more physics context, and moved some tables into a newly created Appendix
Modeling pedestrian fundamental diagram based on Directional Statistics
Understanding pedestrian dynamics is crucial for appropriately designing pedestrian spaces. The pedestrian fundamental diagram (FD), which describes the relationship between pedestrian flow and density within a given space, characterizes these dynamics. Pedestrian FDs are significantly influenced by the flow type, such as uni-directional, bi-directional, and crossing flows. However, to the authors' knowledge, generalized pedestrian FDs that are applicable to various flow types have not been proposed. This may be due to the difficulty of using statistical methods to characterize the flow types. The flow types significantly depend on the angles of pedestrian movement; however, these angles cannot be processed by standard statistics due to their periodicity. In this study, we propose a comprehensive model for pedestrian FDs that can describe the pedestrian dynamics for various flow types by applying Directional Statistics. First, we develop a novel statistic describing the pedestrian flow type solely from pedestrian trajectory data using Directional Statistics. Then, we formulate a comprehensive pedestrian FD model that can be applied to various flow types by incorporating the proposed statistics into a traditional pedestrian FD model. The proposed model was validated using actual pedestrian trajectory data. The results confirmed that the model effectively represents the essential nature of pedestrian dynamics, such as the capacity reduction due to conflict of crossing flows and the capacity improvement due to the lane formation in bi-directional flows.
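The remark about periodicity can be made concrete with two standard quantities from directional statistics, the circular mean and the mean resultant length. The sketch below is not the statistic proposed in the paper; it only shows why ordinary averaging fails for angles, and the function name and sample headings are hypothetical.

```python
import math


def circular_mean_and_resultant(angles_rad):
    """Return (mean direction, mean resultant length) of a set of angles.

    Each angle is mapped to a unit vector; the vectors are averaged, which
    respects the periodicity that a plain arithmetic mean ignores.
    """
    n = len(angles_rad)
    c = sum(math.cos(a) for a in angles_rad) / n
    s = sum(math.sin(a) for a in angles_rad) / n
    return math.atan2(s, c), math.hypot(c, s)


# Two pedestrians walking almost the same way, on either side of the 0/360 wrap.
nearly_aligned = [math.radians(350.0), math.radians(10.0)]
aligned_mean, aligned_length = circular_mean_and_resultant(nearly_aligned)

# Two pedestrians walking in opposite directions (a bi-directional flow).
opposing = [0.0, math.pi]
_, opposing_length = circular_mean_and_resultant(opposing)
```

The arithmetic mean of 350 and 10 degrees is 180 degrees, which points the wrong way, whereas the circular mean is close to 0 degrees. The mean resultant length is near 1 for the aligned pair and near 0 for the opposing pair, so it gives a simple measure of how uni-directional a set of headings is.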
Cooperative Hypothesis Testing by Two Observers with Asymmetric Information
We consider the binary hypothesis testing problem with two observers. There are two possible states of nature (or hypotheses). Observations collected by the two observers are statistically related to the true state of nature. The knowledge of joint distribution of the observations collected and the true state of nature is unknown to the observers. There are two problems to be solved by the observers: (i) true state of nature is known: find the distribution of the local information collected; (ii) true state of nature is unknown: collaboratively estimate the same using the distributions found by solving the first problem. We present four algorithms, each having two phases where the two problems are solved, with emphasis on the information exchange between the observers and resulting patterns. We prove different properties of the algorithms including the following: (i) the probability spaces constructed as a consequence of solving the first problem are dependent on the information patterns at the observers; (ii) the rate of decay of probability of error of algorithms while solving the second problem is dependent on the information exchange between the observers. We present a numerical example demonstrating the four algorithms.
comment: Journal Paper to be published
Robotics
TraIL-Det: Transformation-Invariant Local Feature Networks for 3D LiDAR Object Detection with Unsupervised Pre-Training BMVC 2024
3D point clouds are essential for perceiving outdoor scenes, especially within the realm of autonomous driving. Recent advances in 3D LiDAR Object Detection focus primarily on the spatial positioning and distribution of points to ensure accurate detection. However, despite their robust performance in variable conditions, these methods are hindered by their sole reliance on coordinates and point intensity, resulting in inadequate isometric invariance and suboptimal detection outcomes. To tackle this challenge, our work introduces Transformation-Invariant Local (TraIL) features and the associated TraIL-Det architecture. Our TraIL features exhibit rigid transformation invariance and effectively adapt to variations in point density, with a design focus on capturing the localized geometry of neighboring structures. They utilize the inherent isotropic radiation of LiDAR to enhance local representation, improve computational efficiency, and boost detection performance. To effectively process the geometric relations among points within each proposal, we propose a Multi-head self-Attention Encoder (MAE) with asymmetric geometric features to encode high-dimensional TraIL features into manageable representations. Our method outperforms contemporary self-supervised 3D object detection approaches in terms of mAP on KITTI (67.8, 20% label, moderate) and Waymo (68.9, 20% label, moderate) datasets under various label ratios (20%, 50%, and 100%).
comment: BMVC 2024; 15 pages, 3 figures, 3 tables; Code at https://github.com/l1997i/rapid_seg
Safe Policy Exploration Improvement via Subgoals
Reinforcement learning is a widely used approach to autonomous navigation, showing potential in various tasks and robotic setups. Still, it often struggles to reach distant goals when safety constraints are imposed (e.g., the wheeled robot is prohibited from moving close to the obstacles). One of the main reasons for poor performance in such setups, which is common in practice, is that the need to respect the safety constraints degrades the exploration capabilities of an RL agent. To this end, we introduce a novel learnable algorithm that is based on decomposing the initial problem into smaller sub-problems via intermediate goals, on the one hand, and respects the limit of the cumulative safety constraints, on the other hand -- SPEIS (Safe Policy Exploration Improvement via Subgoals). It comprises two coupled policies trained end-to-end: subgoal and safe. The subgoal policy is trained to generate the subgoal based on the transitions from the buffer of the safe (main) policy that helps the safe policy to reach distant goals. Simultaneously, the safe policy maximizes its rewards while attempting not to violate the limit of the cumulative safety constraints, thus providing a certain level of safety. We evaluate SPEIS in a wide range of challenging (simulated) environments that involve different types of robots in two different environments: autonomous vehicles from the POLAMP environment and car, point, doggo, and sweep from the safety-gym environment. We demonstrate that our method consistently outperforms state-of-the-art competitors and can significantly reduce the collision rate while maintaining high success rates (higher by 80% compared to the best-performing methods).
comment: 11 pages, 8 figures
Improving GNSS Positioning in Challenging Urban Areas by Digital Twin Database Correction
Accurate positioning technology is the foundation for industry and business applications. Although indoor and outdoor positioning techniques have been well studied separately, positioning performance in the intermediate period of changing the positioning environment is still challenging. This paper proposes a digital twin-aided positioning correction method for seamless positioning focusing on improving the receiver's outdoor positioning performance in urban areas, where the change of the positioning environment usually happens. The proposed algorithm simulates the positioning solution for virtual receivers in a grid-based digital twin. Based on the simulated positioning solutions, a statistical model is used to study the positioning characteristics and generate a correction information database for real receivers to improve their positioning performance. This algorithm has a low computation load on the receiver side and does not require a specially designed antenna, making it implementable for small-sized devices.
comment: 7 pages conference paper in indoor positioning and indoor navigation 2024
TripleMixer: A 3D Point Cloud Denoising Model for Adverse Weather
LiDAR sensors are crucial for providing high-resolution 3D point cloud data in autonomous driving systems, enabling precise environmental perception. However, real-world adverse weather conditions, such as rain, fog, and snow, introduce significant noise and interference, degrading the reliability of LiDAR data and the performance of downstream tasks like semantic segmentation. Existing datasets often suffer from limited weather diversity and small dataset sizes, which restrict their effectiveness in training models. Additionally, current deep learning denoising methods, while effective in certain scenarios, often lack interpretability, complicating the ability to understand and validate their decision-making processes. To overcome these limitations, we introduce two large-scale datasets, Weather-KITTI and Weather-NuScenes, which cover three common adverse weather conditions: rain, fog, and snow. These datasets retain the original LiDAR acquisition information and provide point-level semantic labels for rain, fog, and snow. Furthermore, we propose a novel point cloud denoising model, TripleMixer, comprising three mixer layers: the Geometry Mixer Layer, the Frequency Mixer Layer, and the Channel Mixer Layer. These layers are designed to capture geometric spatial information, extract multi-scale frequency information, and enhance the multi-channel feature information of point clouds, respectively. Experiments conducted on the WADS dataset in real-world scenarios, as well as on our proposed Weather-KITTI and Weather-NuScenes datasets, demonstrate that our model achieves state-of-the-art denoising performance. Additionally, our experiments show that integrating the denoising model into existing segmentation frameworks enhances the performance of downstream tasks. The datasets and code will be made publicly available at https://github.com/Grandzxw/TripleMixer.
comment: 15 pages, submit to IEEE TIP
MASQ: Multi-Agent Reinforcement Learning for Single Quadruped Robot Locomotion
This paper proposes a novel method to improve locomotion learning for a single quadruped robot using multi-agent deep reinforcement learning (MARL). Many existing methods use single-agent reinforcement learning for an individual robot or MARL for the cooperative task in multi-robot systems. Unlike existing methods, this paper proposes using MARL for the locomotion learning of a single quadruped robot. We develop a learning structure called Multi-Agent Reinforcement Learning for Single Quadruped Robot Locomotion (MASQ), considering each leg as an agent to explore the action space of the quadruped robot, sharing a global critic, and learning collaboratively. Experimental results indicate that MASQ not only speeds up learning convergence but also enhances robustness in real-world settings, suggesting that applying MASQ to single robots such as quadrupeds could surpass traditional single-robot reinforcement learning approaches. Our study provides insightful guidance on integrating MARL with single-robot locomotion learning.
Multi-modal Integrated Prediction and Decision-making with Adaptive Interaction Modality Explorations
Navigating dense and dynamic environments poses a significant challenge for autonomous driving systems, owing to the intricate nature of multimodal interaction, wherein the actions of various traffic participants and the autonomous vehicle are complex and implicitly coupled. In this paper, we propose a novel framework, Multi-modal Integrated predictioN and Decision-making (MIND), which addresses the challenges by efficiently generating joint predictions and decisions covering multiple distinctive interaction modalities. Specifically, MIND leverages learning-based scenario predictions to obtain integrated predictions and decisions with social-consistent interaction modality and utilizes a modality-aware dynamic branching mechanism to generate scenario trees that efficiently capture the evolutions of distinctive interaction modalities with low variation of interaction uncertainty along the planning horizon. The scenario trees are seamlessly utilized by the contingency planning under interaction uncertainty to obtain clear and considerate maneuvers accounting for multi-modal evolutions. Comprehensive experimental results in the closed-loop simulation based on the real-world driving dataset showcase superior performance to other strong baselines under various driving contexts.
comment: 8 pages, 9 figures
PIE: Parkour with Implicit-Explicit Learning Framework for Legged Robots
Parkour presents a highly challenging task for legged robots, requiring them to traverse various terrains with agile and smooth locomotion. This necessitates comprehensive understanding of both the robot's own state and the surrounding terrain, despite the inherent unreliability of robot perception and actuation. Current state-of-the-art methods either rely on complex pre-trained high-level terrain reconstruction modules or limit the maximum potential of robot parkour to avoid failure due to inaccurate perception. In this paper, we propose a one-stage end-to-end learning-based parkour framework: Parkour with Implicit-Explicit learning framework for legged robots (PIE) that leverages dual-level implicit-explicit estimation. With this mechanism, even a low-cost quadruped robot equipped with an unreliable egocentric depth camera can achieve exceptional performance on challenging parkour terrains using a relatively simple training process and reward function. While the training process is conducted entirely in simulation, our real-world validation demonstrates successful zero-shot deployment of our framework, showcasing superior parkour performance on harsh terrains.
comment: Accepted for IEEE Robotics and Automation Letters (RA-L)
PhysPart: Physically Plausible Part Completion for Interactable Objects
Interactable objects are ubiquitous in our daily lives. Recent advances in 3D generative models make it possible to automate the modeling of these objects, benefiting a range of applications from 3D printing to the creation of robot simulation environments. However, while significant progress has been made in modeling 3D shapes and appearances, modeling object physics, particularly for interactable objects, remains challenging due to the physical constraints imposed by inter-part motions. In this paper, we tackle the problem of physically plausible part completion for interactable objects, aiming to generate 3D parts that not only fit precisely into the object but also allow smooth part motions. To this end, we propose a diffusion-based part generation model that utilizes geometric conditioning through classifier-free guidance and formulates physical constraints as a set of stability and mobility losses to guide the sampling process. Additionally, we demonstrate the generation of dependent parts, paving the way toward sequential part generation for objects with complex part-whole hierarchies. Experimentally, we introduce a new metric for measuring physical plausibility based on motion success rates. Our model outperforms existing baselines over shape and physical metrics, especially those that do not adequately model physical constraints. We also demonstrate our applications in 3D printing, robot manipulation, and sequential part generation, showing our strength in realistic tasks with the demand for high physical plausibility.
SeeBelow: Sub-dermal 3D Reconstruction of Tumors with Surgical Robotic Palpation and Tactile Exploration IROS 2024
Surgical scene understanding in Robot-assisted Minimally Invasive Surgery (RMIS) is highly reliant on visual cues and lacks tactile perception. Force-modulated surgical palpation with tactile feedback is necessary for localization, geometry/depth estimation, and dexterous exploration of abnormal stiff inclusions in subsurface tissue layers. Prior works explored surface-level tissue abnormalities or single layered tissue-tumor embeddings with more than 300 palpations for dense 2D stiffness mapping. Our approach focuses on 3D reconstructions of sub-dermal tumor surface profiles in multi-layered tissue (skin-fat-muscle) using a visually-guided novel tactile navigation policy. A robotic palpation probe with tri-axial force sensing was leveraged for tactile exploration of the phantom. From a surface mesh of the surgical region initialized from a depth camera, the policy explores a surgeon's region of interest through palpation, sampled from Bayesian optimization. Each palpation includes contour following using a contact-safe impedance controller to trace the sub-dermal tumor geometry, until the underlying tumor-tissue boundary is reached. Projections of these contour-following palpation trajectories allow 3D reconstruction of the sub-dermal tumor surface profile in less than 100 palpations. Our approach generates high-fidelity 3D surface reconstructions of rigid tumor embeddings in tissue layers with isotropic elasticities, although soft tumor geometries are yet to be explored.
comment: 8 pages, 6 figures, accepted to IROS 2024
Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning IROS 2024
In reinforcement learning (RL), exploiting environmental symmetries can significantly enhance efficiency, robustness, and performance. However, ensuring that the deep RL policy and value networks are respectively equivariant and invariant to exploit these symmetries is a substantial challenge. Related works try to design networks that are equivariant and invariant by construction, limiting them to a very restricted library of components, which in turn hampers the expressiveness of the networks. This paper proposes a method to construct equivariant policies and invariant value functions without specialized neural network components, which we term equivariant ensembles. We further add a regularization term for adding inductive bias during training. In a map-based path planning case study, we show how equivariant ensembles and regularization benefit sample efficiency and performance.
comment: Accepted at IROS 2024. A video can be found here: https://youtu.be/L6NOdvU7n7s. The code is available at https://github.com/theilem/uavSim
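One generic way to obtain an equivariant policy from an arbitrary network, without specialized layers, is to average its outputs over a symmetry group. The sketch below does this for 90-degree map rotations and four movement actions. Treating this group average as the "equivariant ensemble" of the abstract above is an assumption, and the stand-in policy, the action ordering, and all names are hypothetical.

```python
import math

ACTIONS = ("up", "right", "down", "left")  # hypothetical action ordering


def rotate_ccw(grid):
    """Rotate a square grid by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def base_policy(grid):
    """Arbitrary stand-in policy with no built-in symmetry: softmax over
    fixed position-dependent weights (purely illustrative)."""
    logits = []
    for a in range(len(ACTIONS)):
        logits.append(sum(math.sin(1.0 + a + 2.0 * i + 3.0 * j) * v
                          for i, row in enumerate(grid)
                          for j, v in enumerate(row)))
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def equivariant_ensemble(grid):
    """Average the base policy over all four rotations of the map.

    After k counterclockwise rotations, action index a of the original frame
    appears at index (a - k) % 4, so the outputs are shifted back before averaging.
    """
    n = len(ACTIONS)
    probs = [0.0] * n
    rotated = grid
    for k in range(n):
        p = base_policy(rotated)
        for a in range(n):
            probs[a] += p[(a - k) % n] / n
        rotated = rotate_ccw(rotated)
    return probs


example_map = [[0, 1, 0],
               [0, 0, 1],
               [1, 0, 0]]
original = equivariant_ensemble(example_map)
turned = equivariant_ensemble(rotate_ccw(example_map))
```

Rotating the map only permutes the ensemble's action probabilities: the probability assigned to an action on the original map equals the probability of the correspondingly rotated action on the rotated map, regardless of what the base policy computes.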
RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation ECCV 2024
3D point clouds play a pivotal role in outdoor scene perception, especially in the context of autonomous driving. Recent advancements in 3D LiDAR segmentation often focus intensely on the spatial positioning and distribution of points for accurate segmentation. However, these methods, while robust in variable conditions, encounter challenges due to sole reliance on coordinates and point intensity, leading to poor isometric invariance and suboptimal segmentation. To tackle this challenge, our work introduces Range-Aware Pointwise Distance Distribution (RAPiD) features and the associated RAPiD-Seg architecture. Our RAPiD features exhibit rigid transformation invariance and effectively adapt to variations in point density, with a design focus on capturing the localized geometry of neighboring structures. They utilize inherent LiDAR isotropic radiation and semantic categorization for enhanced local representation and computational efficiency, while incorporating a 4D distance metric that integrates geometric and surface material reflectivity for improved semantic segmentation. To effectively embed high-dimensional RAPiD features, we propose a double-nested autoencoder structure with a novel class-aware embedding objective to encode high-dimensional features into manageable voxel-wise embeddings. Additionally, we propose RAPiD-Seg which incorporates a channel-wise attention fusion and two effective RAPiD-Seg variants, further optimizing the embedding for enhanced performance and generalization. Our method outperforms contemporary LiDAR segmentation work in terms of mIoU on SemanticKITTI (76.1) and nuScenes (83.6) datasets.
comment: ECCV 2024 (Oral); 18 pages, 6 figures, 7 tables; Code at https://github.com/l1997i/rapid_seg
Feeling Optimistic? Ambiguity Attitudes for Online Decision Making
Due to the complexity of many decision making problems, tree search algorithms often have inadequate information to produce accurate transition models. This results in ambiguities (uncertainties for which there are multiple plausible models). Faced with ambiguities, robust methods have been used to produce safe solutions--often by maximizing the lower bound over the set of plausible transition models. However, they often overlook how much the representation of uncertainty can impact how a decision is made. This work introduces the Ambiguity Attitude Graph Search (AAGS), advocating for more comprehensive representations of ambiguities in decision making. Additionally, AAGS allows users to adjust their ambiguity attitude (or preference), promoting exploration and improving users' ability to control how an agent should respond when faced with a set of plausible alternatives. Simulation in a dynamic sailing environment shows how environments with high entropy transition models can lead robust methods to fail. Results further demonstrate how adjusting ambiguity attitudes better fulfills objectives while mitigating this failure mode of robust approaches. Because this approach is a generalization of the robust framework, these results further demonstrate how algorithms focused on ambiguity have applicability beyond safety-critical systems.
comment: 6 pages, 5 figures, 2 algorithms. Accepted to the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems in Abu Dhabi, UAE (Oct 14-18, 2024) © 2024 IEEE
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction
Temporal information plays a pivotal role in Bird's-Eye-View (BEV) driving scene understanding, which can alleviate the visual information sparsity. However, the indiscriminate temporal fusion method will cause the barrier of feature redundancy when constructing vectorized High-Definition (HD) maps. In this paper, we revisit the temporal fusion of vectorized HD maps, focusing on temporal instance consistency and temporal map consistency learning. To improve the representation of instances in single-frame maps, we introduce a novel method, DTCLMapper. This approach uses a dual-stream temporal consistency learning module that combines instance embedding with geometry maps. In the instance embedding component, our approach integrates temporal Instance Consistency Learning (ICL), ensuring consistency from vector points and instance features aggregated from points. A vectorized points pre-selection module is employed to enhance the regression efficiency of vector points from each instance. Then aggregated instance features obtained from the vectorized points preselection module are grounded in contrastive learning to realize temporal consistency, where positive and negative samples are selected based on position and semantic information. The geometry mapping component introduces Map Consistency Learning (MCL) designed with self-supervised learning. The MCL enhances the generalization capability of our consistent learning approach by concentrating on the global location and distribution constraints of the instances. Extensive experiments on well-recognized benchmarks indicate that the proposed DTCLMapper achieves state-of-the-art performance in vectorized mapping tasks, reaching 61.9% and 65.1% mAP scores on the nuScenes and Argoverse datasets, respectively. The source code is available at https://github.com/lynn-yu/DTCLMapper.
comment: Accepted to IEEE Transactions on Intelligent Transportation Systems (T-ITS). The source code is available at https://github.com/lynn-yu/DTCLMapper
Multiagent Systems
Decentralized Stochastic Control in Standard Borel Spaces: Centralized MDP Reductions, Near Optimality of Finite Window Local Information, and Q-Learning
Decentralized stochastic control problems are intrinsically difficult to study because of the inapplicability of standard tools from centralized control such as dynamic programming and the resulting computational complexity. In this paper, we address some of these challenges for decentralized stochastic control with Borel spaces under three different but tightly related information structures under a unified theme: the one-step delayed information sharing pattern, the K-step periodic information sharing pattern, and the completely decentralized information structure where no sharing of information occurs. We will show that the one-step delayed and K-step periodic problems can be reduced to a centralized MDP, generalizing prior results which considered finite, linear, or static models, by addressing several measurability questions. The separated nature of policies under both information structures is then established. We then provide sufficient conditions for the transition kernels of both centralized reductions to be weak-Feller, which facilitates rigorous approximation and learning theoretic results. We will then show that for the completely decentralized control problem finite memory local policies are near optimal under a joint conditional mixing condition. This is achieved by obtaining a bound for finite memory policies which goes to zero as memory size increases. We will also provide a performance bound for the K-periodic problem, which results from replacing the full common information by a finite sliding window of information. The latter will depend on the condition of predictor stability in expected total variation, which we will establish. We finally show that under the periodic information sharing pattern, a quantized Q-learning algorithm converges asymptotically towards a near optimal solution. Each of the above, to our knowledge, is a new contribution to the literature.
comment: A summary of the results is to be presented in CDC'24
Multi-Agent Target Assignment and Path Finding for Intelligent Warehouse: A Cooperative Multi-Agent Deep Reinforcement Learning Perspective
Multi-agent target assignment and path planning (TAPF) are two key problems in intelligent warehouses. However, most literature addresses only one of these two problems. In this study, we propose a method to simultaneously solve target assignment and path planning from the perspective of cooperative multi-agent deep reinforcement learning (RL). To the best of our knowledge, this is the first work to model the TAPF problem for intelligent warehouses as a cooperative multi-agent deep RL problem, and the first to simultaneously address TAPF based on multi-agent deep RL. Furthermore, previous literature rarely considers the physical dynamics of agents. In this study, the physical dynamics of the agents are considered. Experimental results show that our method performs well in various task settings, which means that the target assignment is solved reasonably well and the planned path is almost the shortest. Moreover, our method is more time-efficient than baselines.
Keeping the Harmony Between Neighbors: Local Fairness in Graph Fair Division AAMAS 2024
We study the problem of allocating indivisible resources under the connectivity constraints of a graph $G$. This model, initially introduced by Bouveret et al. (published in IJCAI, 2017), effectively encompasses a diverse array of scenarios characterized by spatial or temporal limitations, including the division of land plots and the allocation of time plots. In this paper, we introduce a novel fairness concept that integrates local comparisons within the social network formed by a connected allocation of the item graph. Our particular focus is to achieve pairwise-maximin fair share (PMMS) among the "neighbors" within this network. For any underlying graph structure, we show that a connected allocation that maximizes Nash welfare guarantees a $(1/2)$-PMMS fairness. Moreover, for two agents, we establish that a $(3/4)$-PMMS allocation can be efficiently computed. Additionally, we demonstrate that for three agents and the items aligned on a path, a PMMS allocation is always attainable and can be computed in polynomial time. Lastly, when agents have identical additive utilities, we present a pseudo-polynomial-time algorithm for a $(3/4)$-PMMS allocation, irrespective of the underlying graph $G$. Furthermore, we provide a polynomial-time algorithm for obtaining a PMMS allocation when $G$ is a tree.
comment: Full version of paper accepted for presentation at AAMAS 2024
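To make the pairwise maximin share (PMMS) notion concrete, the following sketch checks the standard, unconstrained definition for one ordered pair of agents by brute force. It is a minimal sketch under stated assumptions: utilities are additive and non-negative, the function names are hypothetical, and the connectivity constraints of the graph $G$ studied in the paper are ignored.

```python
from itertools import combinations
from typing import Dict, Iterable


def value(bundle: Iterable[str], utilities: Dict[str, float]) -> float:
    """Additive value of a bundle of items for one agent."""
    return sum(utilities[item] for item in bundle)


def pmms_share(own: Iterable[str], other: Iterable[str],
               utilities: Dict[str, float]) -> float:
    """Best worst-part value when the agent re-divides both bundles into two parts.

    Enumerates every 2-partition of the pooled items, so it is exponential
    in the number of items and only suitable for small examples.
    """
    pool = sorted(set(own) | set(other))
    best = 0.0
    for r in range(len(pool) + 1):
        for part in combinations(pool, r):
            rest = [item for item in pool if item not in part]
            best = max(best, min(value(part, utilities), value(rest, utilities)))
    return best


def is_alpha_pmms(own: Iterable[str], other: Iterable[str],
                  utilities: Dict[str, float], alpha: float) -> bool:
    """Check whether the agent's own bundle is worth at least alpha times its PMMS."""
    own = list(own)
    return value(own, utilities) >= alpha * pmms_share(own, other, utilities)
```

For example, with utilities `{"a": 3, "b": 2, "c": 2, "d": 1}`, an agent holding `{"a"}` next to a neighbor holding `{"b", "c", "d"}` has a pairwise share of 4, so its bundle satisfies the check for `alpha = 0.75` but not for `alpha = 1.0`.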
Systems and Control (CS)
On output consensus of heterogeneous dynamical networks
This work is concerned with interconnected networks with non-identical subsystems. We investigate the output consensus of the network where the dynamics are subject to external disturbance and/or reference input. For a network of output-feedback passive subsystems, we first introduce an index that characterises the gap between a pair of adjacent subsystems by the difference of their input-output trajectories. The set of these indices quantifies the level of heterogeneity of the networks. We then provide a condition in terms of the level of heterogeneity and the connectivity of the networks for ensuring the output consensus of the interconnected network.
Data-driven approximate output regulation of nonlinear systems
The paper deals with the data-based design of controllers that solve the output regulation problem for nonlinear systems. Inspired by recent developments in model-based output regulation design techniques and in data-driven control design for nonlinear systems, we derive a data-dependent semidefinite program that, when solved, directly returns a controller that approximately regulates the tracking error to zero. When specialized to the case of linear systems, the result appears to improve upon existing work. Numerical results illustrate the findings.
Watercraft as Overwater Ambulance Exchange Points to Enhance Aeromedical Evacuation
Ambulance exchange points are preidentified sites where patients are transferred between evacuation platforms while en route to enhanced medical care. We propose a new capability for maritime medical evacuation, which involves co-opting underway watercraft as overwater ambulance exchange points to transfer patients between medical evacuation aircraft. We partner with the United States Army's 25th Combat Aviation Brigade to demonstrate the use of an Army watercraft as an overwater ambulance exchange point. A manikin is transferred between two HH-60 Medical Evacuation Black Hawk helicopters conducting hoist operations over Army Logistics Support Vessel 3, which is traveling south of Honolulu, Hawaii. The demonstration is enabled by a decision support system for dispatching aircraft, hoist stabilization technology, commercial satellite internet, military geospatial infrastructure applications, and digital medical documentation tools, the benefits of which are all discussed. Three extensions of the overwater ambulance exchange point are introduced and civilian applications are considered.
Sensing-aided Near-Field Secure Communications with Mobile Eavesdroppers
The additional degree of freedom (DoF) in the distance domain of near-field communication offers new opportunities for physical layer security (PLS) design. However, existing works mainly consider static eavesdroppers, and the related study with mobile eavesdroppers is still in its infancy due to the difficulty in obtaining the channel state information (CSI) of the eavesdropper. To this end, we propose to leverage the sensing capability of integrated sensing and communication (ISAC) systems to assist PLS design. To comprehensively study the dynamic behaviors of the system, we propose a Pareto optimization framework, where a multi-objective optimization problem (MOOP) is formulated to simultaneously optimize three key performance metrics: power consumption, number of securely served users, and tracking performance, while guaranteeing the achievable rate of the users with a given leakage rate constraint. A globally optimal design based on the generalized Benders decomposition (GBD) method is proposed to achieve the Pareto optimal solutions. To reduce the computational complexity, we further design a low-complexity algorithm based on zero-forcing (ZF) beamforming and successive convex approximation (SCA). Simulation results validate the effectiveness of the proposed algorithms and reveal the intrinsic trade-offs between the three performance metrics. It is observed that near-field communication offers a favorable beam diffraction effect for PLS, where the energy of the information signal is nulled around the eavesdropper and focused on the users.
Informativeness and Trust in Bayesian Persuasion
A persuasion policy successfully persuades an agent to pick a particular action only if the information is designed in a manner that convinces the agent that it is in their best interest to pick that action. Thus, it is natural to ask, what makes the agent trust the persuader's suggestion? We study a Bayesian persuasion interaction between a sender and a receiver where the sender has access to private information and the receiver attempts to recover this information from messages sent by the sender. The sender crafts these messages in an attempt to maximize its utility which depends on the source symbol and the symbol recovered by the receiver. Our goal is to characterize the \textit{Stackelberg game value}, and the amount of true information revealed by the sender during persuasion. We find that the SGV is given by the optimal value of a \textit{linear program} on probability distributions constrained by certain \textit{trust constraints}. These constraints encode that any signal in a persuasion strategy must contain more truth than untruth and thus impose a fundamental bound on the extent of obfuscation a sender can perform. We define \textit{informativeness} of the sender as the minimum expected number of symbols truthfully revealed by the sender in any accumulation point of a sequence of $\varepsilon$-equilibrium persuasion strategies, and show that it is given by another linear program. Informativeness is a fundamental bound on the amount of information the sender must reveal to persuade a receiver. Closed form expressions for the SGV and the informativeness are presented for structured utility functions. This work generalizes our previous work where the sender and the receiver were constrained to play only deterministic strategies and a similar notion of informativeness was characterized. Comparisons between the previous and current notions are discussed.
Integrated Modeling and Forecasting of Electric Vehicles Charging Profiles Based on Real Data
In the context of energy transition and decarbonization of the economy, several governments will ban the sale of new combustion vehicles by 2050. Thus, growing penetration of electric vehicles (EVs) in distribution networks (DN) is predicted. Impact analyses must be performed to determine if mitigation means are needed to accommodate a large quantity of EVs in the DN. Furthermore, the habits of the local population, which result in different EV charging patterns, need to be realistically considered. This article proposes an individual residential EV multi-charging algorithm based on the observed charging behavior of 500 measured residential EVs located in a large North American utility. Probability functions are derived from the analysis of these charging patterns. These can be used to model daily charging profiles of individual EVs to assess, in a quasi-static time-series perspective, their impact either on a single customer or a whole DN. An impact evaluation study is also presented.
Efficient Shield Synthesis via State-Space Transformation
We consider the problem of synthesizing safety strategies for control systems, also known as shields. Since the state space is infinite, shields are typically computed over a finite-state abstraction, with the most common abstraction being a rectangular grid. However, for many systems, such a grid does not align well with the safety property or the system dynamics. That is why a coarse grid is rarely sufficient, but a fine grid is typically computationally infeasible to obtain. In this paper, we show that appropriate state-space transformations can still allow the use of a coarse grid at almost no computational overhead. We demonstrate in three case studies that our transformation-based synthesis outperforms a standard synthesis by several orders of magnitude. In the first two case studies, we use domain knowledge to select a suitable transformation. In the third case study, we instead report on results in engineering a transformation without domain knowledge.
Activating the Flexibility in Distribution systems via a Unified Quantification Approach Aligned with the Flexibility Models
To activate the flexibility of demand-side distributed energy resources (DERs) in power system operations, it is essential to reasonably quantify the costs of providing the flexibility and its value to the system. This paper proposes a unified quantification method aligned with flexibility models to quantify the cost and value of the flexibility from DERs and their aggregators. Compared with traditional power-range-based quantification approaches that are mainly suitable for generators, the proposed method can directly capture the time-dependent characteristics of DERs' individual and aggregated flexibility regions. Based on the quantification method, we further propose an optimization model for distribution system operators (DSOs) to coordinate the DER aggregators for energy arbitrage and ancillary services provision in the transmission network, along with a revenue allocation strategy ensuring a non-profit DSO. Numerical tests on the IEEE distribution system verify the proposed methods.
Addressing Trust Issues for Vehicle to Grid in Distributed Power Grids Using Blockchains
While blockchain offers inherent security, trust issues among stakeholders in vehicle-to-grid (V2G) applications remain unresolved due to a lack of regulatory frameworks and standardization. Additionally, a tailored decentralized privacy-preserving coordination scheme for blockchain in V2G networks is needed to ensure user privacy and efficient energy transactions. This paper proposes a V2G trading and coordination scheme tailored to the decentralized nature of blockchain as well as the interests of stakeholders, utilizing smart charging points (SCPs) and a Stackelberg game model. Case studies using real-world data from Southern University of Science and Technology demonstrate the efficacy of the proposed scheme in reducing EV charging costs and its potential for supporting auxiliary grid services.
comment: This paper has been accepted by The 14th International Conference on Power and Energy Systems (ICPES 2024)
Edge Information Hub: Orchestrating Satellites, UAVs, MEC, Sensing and Communications for 6G Closed-Loop Controls
An increasing number of field robots will be used for mission-critical tasks in remote or post-disaster areas. Due to their limited individual abilities, these robots usually require an edge information hub (EIH) with not only communication but also sensing and computing functions. Such an EIH could be deployed on a flexibly-dispatched unmanned aerial vehicle (UAV). Different from traditional aerial base stations or mobile edge computing (MEC), the EIH would direct the operations of robots via sensing-communication-computing-control ($\textbf{SC}^3$) closed-loop orchestration. This paper aims to optimize the closed-loop control performance of multiple $\textbf{SC}^3$ loops, with constraints on satellite-backhaul rate, computing capability, and on-board energy. Specifically, the linear quadratic regulator (LQR) control cost is used to measure the closed-loop utility, and a sum LQR cost minimization problem is formulated to jointly optimize the splitting of sensor data and allocation of communication and computing resources. We first derive the optimal splitting ratio of sensor data, and then recast the problem to a more tractable form. An iterative algorithm is finally proposed to provide a sub-optimal solution. Simulation results demonstrate the superiority of the proposed algorithm. We also uncover the influence of $\textbf{SC}^3$ parameters on closed-loop controls, providing a more systematic understanding.
comment: 16 pages, 11 figures
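For readers unfamiliar with the LQR cost used above as a closed-loop utility measure, the sketch below computes the classical infinite-horizon LQR cost for a scalar discrete-time system by iterating the Riccati recursion. It is a textbook illustration under stated assumptions, not the paper's formulation: the system $x_{t+1} = a x_t + b u_t$, the stage cost $q x_t^2 + r u_t^2$, and the function name `scalar_lqr` are hypothetical, and no communication, sensing, or computing constraints are modeled.

```python
from typing import Tuple


def scalar_lqr(a: float, b: float, q: float, r: float,
               iters: int = 500) -> Tuple[float, float]:
    """Iterate the scalar discrete-time Riccati recursion.

    Returns (p, k): the optimal cost from state x0 is p * x0**2,
    achieved by the state feedback u = -k * x.
    """
    p = q
    for _ in range(iters):
        # Riccati update: p <- q + a^2 p - (a b p)^2 / (r + b^2 p)
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return p, k


if __name__ == "__main__":
    p, k = scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
    print(f"cost coefficient p = {p:.6f}, feedback gain k = {k:.6f}")
```

With $a = b = q = r = 1$ the fixed point satisfies $p^2 - p - 1 = 0$, so $p$ converges to the golden ratio; a smaller $p$ indicates better closed-loop performance for the same initial state.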
Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning
In this paper, we consider the infinite-horizon reach-avoid zero-sum game problem, where the goal is to find a set in the state space, referred to as the reach-avoid set, such that the system starting at a state therein could be controlled to reach a given target set without violating constraints under the worst-case disturbance. We address this problem by designing a new value function with a contracting Bellman backup, where the super-zero level set, i.e., the set of states where the value function is evaluated to be non-negative, recovers the reach-avoid set. Building upon this, we prove that the proposed method can be adapted to compute the viability kernel, or the set of states which could be controlled to satisfy given constraints, and the backward reachable set, or the set of states that could be driven towards a given target set. Finally, we propose to alleviate the curse of dimensionality issue in high-dimensional problems by extending Conservative Q-Learning, a deep reinforcement learning technique, to learn a value function such that the super-zero level set of the learned value function serves as a (conservative) approximation to the reach-avoid set. Our theoretical and empirical results suggest that the proposed method could learn reliably the reach-avoid set and the optimal control policy even with neural network approximation.
VLEIBot: A New 45-mg Swimming Microrobot Driven by a Bioinspired Anguilliform Propulsor ICRA 2024
This paper presents the VLEIBot^* (Very Little Eel-Inspired roBot), a 45-mg/23-mm^3 microrobotic swimmer that is propelled by a bioinspired anguilliform propulsor. The propulsor is excited by a single 6-mg high-work-density (HWD) microactuator and undulates periodically due to wave propagation phenomena generated by fluid-structure interaction (FSI) during swimming. The microactuator is composed of a carbon-fiber beam, which functions as a leaf spring, and shape-memory alloy (SMA) wires, which deform cyclically when excited periodically using Joule heating. The VLEIBot can swim at speeds as high as 15.1 mm * s^{-1} (0.33 Bl * s^{-1}) when driven with a heuristically-optimized propulsor. To improve maneuverability, we evolved the VLEIBot design into the 90-mg/47-mm^3 VLEIBot^+, which is driven by two propulsors and fully controllable in the two-dimensional (2D) space. The VLEIBot^+ can swim at speeds as high as 16.1 mm * s^{-1} (0.35 Bl * s^{-1}) when driven with heuristically-optimized propulsors, and achieves turning rates as high as 0.28 rad * s^{-1} when tracking path references. The measured root-mean-square (RMS) values of the tracking errors are as low as 4 mm.
comment: 8 pages, 7 figures, to be presented at ICRA 2024
Systems and Control (EESS)
On output consensus of heterogeneous dynamical networks
This work is concerned with interconnected networks with non-identical subsystems. We investigate the output consensus of the network where the dynamics are subject to external disturbance and/or reference input. For a network of output-feedback passive subsystems, we first introduce an index that characterises the gap between a pair of adjacent subsystems by the difference of their input-output trajectories. The set of these indices quantifies the level of heterogeneity of the networks. We then provide a condition in terms of the level of heterogeneity and the connectivity of the networks for ensuring the output consensus of the interconnected network.
Data-driven approximate output regulation of nonlinear systems
The paper deals with the data-based design of controllers that solve the output regulation problem for nonlinear systems. Inspired by recent developments in model-based output regulation design techniques and in data-driven control design for nonlinear systems, we derive a data-dependent semidefinite program that, when solved, directly returns a controller that approximately regulates the tracking error to zero. When specialized to the case of linear systems, the result appears to improve upon existing work. Numerical results illustrate the findings.
Watercraft as Overwater Ambulance Exchange Points to Enhance Aeromedical Evacuation
Ambulance exchange points are preidentified sites where patients are transferred between evacuation platforms while en route to enhanced medical care. We propose a new capability for maritime medical evacuation, which involves co-opting underway watercraft as overwater ambulance exchange points to transfer patients between medical evacuation aircraft. We partner with the United States Army's 25th Combat Aviation Brigade to demonstrate the use of an Army watercraft as an overwater ambulance exchange point. A manikin is transferred between two HH-60 Medical Evacuation Black Hawk helicopters conducting hoist operations over Army Logistics Support Vessel 3, which is traveling south of Honolulu, Hawaii. The demonstration is enabled by a decision support system for dispatching aircraft, hoist stabilization technology, commercial satellite internet, military geospatial infrastructure applications, and digital medical documentation tools, the benefits of which are all discussed. Three extensions of the overwater ambulance exchange point are introduced and civilian applications are considered.
Sensing-aided Near-Field Secure Communications with Mobile Eavesdroppers
The additional degree of freedom (DoF) in the distance domain of near-field communication offers new opportunities for physical layer security (PLS) design. However, existing works mainly consider static eavesdroppers, and the related study with mobile eavesdroppers is still in its infancy due to the difficulty in obtaining the channel state information (CSI) of the eavesdropper. To this end, we propose to leverage the sensing capability of integrated sensing and communication (ISAC) systems to assist PLS design. To comprehensively study the dynamic behaviors of the system, we propose a Pareto optimization framework, where a multi-objective optimization problem (MOOP) is formulated to simultaneously optimize three key performance metrics: power consumption, number of securely served users, and tracking performance, while guaranteeing the achievable rate of the users with a given leakage rate constraint. A globally optimal design based on the generalized Benders decomposition (GBD) method is proposed to achieve the Pareto optimal solutions. To reduce the computational complexity, we further design a low-complexity algorithm based on zero-forcing (ZF) beamforming and successive convex approximation (SCA). Simulation results validate the effectiveness of the proposed algorithms and reveal the intrinsic trade-offs between the three performance metrics. It is observed that near-field communication offers a favorable beam diffraction effect for PLS, where the energy of the information signal is nulled around the eavesdropper and focused on the users.
Informativeness and Trust in Bayesian Persuasion
A persuasion policy successfully persuades an agent to pick a particular action only if the information is designed in a manner that convinces the agent that it is in their best interest to pick that action. Thus, it is natural to ask, what makes the agent trust the persuader's suggestion? We study a Bayesian persuasion interaction between a sender and a receiver where the sender has access to private information and the receiver attempts to recover this information from messages sent by the sender. The sender crafts these messages in an attempt to maximize its utility which depends on the source symbol and the symbol recovered by the receiver. Our goal is to characterize the \textit{Stackelberg game value}, and the amount of true information revealed by the sender during persuasion. We find that the SGV is given by the optimal value of a \textit{linear program} on probability distributions constrained by certain \textit{trust constraints}. These constraints encode that any signal in a persuasion strategy must contain more truth than untruth and thus impose a fundamental bound on the extent of obfuscation a sender can perform. We define \textit{informativeness} of the sender as the minimum expected number of symbols truthfully revealed by the sender in any accumulation point of a sequence of $\varepsilon$-equilibrium persuasion strategies, and show that it is given by another linear program. Informativeness is a fundamental bound on the amount of information the sender must reveal to persuade a receiver. Closed form expressions for the SGV and the informativeness are presented for structured utility functions. This work generalizes our previous work where the sender and the receiver were constrained to play only deterministic strategies and a similar notion of informativeness was characterized. Comparisons between the previous and current notions are discussed.
Integrated Modeling and Forecasting of Electric Vehicles Charging Profiles Based on Real Data
In the context of energy transition and decarbonization of the economy, several governments will ban the sale of new combustion vehicles by 2050. Thus, growing penetration of electric vehicles (EVs) in distribution networks (DN) is predicted. Impact analyses must be performed to determine if mitigation means are needed to accommodate a large quantity of EVs in the DN. Furthermore, the habits of the local population, which result in different EV charging patterns, need to be realistically considered. This article proposes an individual residential EV multi-charging algorithm based on the observed charging behavior of 500 measured residential EVs located in a large North American utility. Probability functions are derived from the analysis of these charging patterns. These can be used to model daily charging profiles of individual EVs to assess, in a quasi-static time-series perspective, their impact either on a single customer or a whole DN. An impact evaluation study is also presented.
Efficient Shield Synthesis via State-Space Transformation
We consider the problem of synthesizing safety strategies for control systems, also known as shields. Since the state space is infinite, shields are typically computed over a finite-state abstraction, with the most common abstraction being a rectangular grid. However, for many systems, such a grid does not align well with the safety property or the system dynamics. That is why a coarse grid is rarely sufficient, but a fine grid is typically computationally infeasible to obtain. In this paper, we show that appropriate state-space transformations can still allow the use of a coarse grid at almost no computational overhead. We demonstrate in three case studies that our transformation-based synthesis outperforms a standard synthesis by several orders of magnitude. In the first two case studies, we use domain knowledge to select a suitable transformation. In the third case study, we instead report on results in engineering a transformation without domain knowledge.
Activating the Flexibility in Distribution systems via a Unified Quantification Approach Aligned with the Flexibility Models
To activate the flexibility of demand-side distributed energy resources (DERs) in power system operations, it is essential to reasonably quantify the costs of providing the flexibility and its value to the system. This paper proposes a unified quantification method aligned with flexibility models to quantify the cost and value of the flexibility from DERs and their aggregators. Compared with traditional power-range-based quantification approaches that are mainly suitable for generators, the proposed method can directly capture the time-dependent characteristics of DERs' individual and aggregated flexibility regions. Based on the quantification method, we further propose an optimization model for distribution system operators (DSOs) to coordinate the DER aggregators for energy arbitrage and ancillary services provision in the transmission network, along with a revenue allocation strategy ensuring a non-profit DSO. Numerical tests on the IEEE distribution system verify the proposed methods.
Addressing Trust Issues for Vehicle to Grid in Distributed Power Grids Using Blockchains
While blockchain offers inherent security, trust issues among stakeholders in vehicle-to-grid (V2G) applications remain unresolved due to a lack of regulatory frameworks and standardization. Additionally, a tailored decentralized privacy-preserving coordination scheme for blockchain in V2G networks is needed to ensure user privacy and efficient energy transactions. This paper proposes a V2G trading and coordination scheme tailored to the decentralized nature of blockchain as well as the interests of stakeholders, utilizing smart charging points (SCPs) and a Stackelberg game model. Case studies using real-world data from Southern University of Science and Technology demonstrate the efficacy of the proposed scheme in reducing EV charging costs and its potential for supporting auxiliary grid services.
comment: This paper has been accepted by The 14th International Conference on Power and Energy Systems (ICPES 2024)
Edge Information Hub: Orchestrating Satellites, UAVs, MEC, Sensing and Communications for 6G Closed-Loop Controls
An increasing number of field robots would be used for mission-critical tasks in remote or post-disaster areas. Due to their limited individual abilities, these robots usually require an edge information hub (EIH), with not only communication but also sensing and computing functions. Such an EIH could be deployed on a flexibly-dispatched unmanned aerial vehicle (UAV). Different from traditional aerial base stations or mobile edge computing (MEC), the EIH would direct the operations of robots via sensing-communication-computing-control ($\textbf{SC}^3$) closed-loop orchestration. This paper aims to optimize the closed-loop control performance of multiple $\textbf{SC}^3$ loops, with constraints on satellite-backhaul rate, computing capability, and on-board energy. Specifically, the linear quadratic regulator (LQR) control cost is used to measure the closed-loop utility, and a sum LQR cost minimization problem is formulated to jointly optimize the splitting of sensor data and the allocation of communication and computing resources. We first derive the optimal splitting ratio of sensor data, and then recast the problem into a more tractable form. An iterative algorithm is finally proposed to provide a sub-optimal solution. Simulation results demonstrate the superiority of the proposed algorithm. We also uncover the influence of $\textbf{SC}^3$ parameters on closed-loop controls, providing a more systematic understanding.
comment: 16 pages, 11 figures
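As a rough illustration of the LQR control cost that the abstract above uses as its closed-loop utility, the following sketch computes the infinite-horizon discrete-time LQR cost x0' P x0 by iterating the Riccati recursion and sums it over several loops. It assumes linear dynamics x[k+1] = A x[k] + B u[k] with no noise; the function names and the tuple layout are hypothetical, and the paper's actual problem additionally couples the loops through rate, computing, and energy constraints that are not modeled here.

```python
import numpy as np

def lqr_cost_matrix(A, B, Q, R, iters=1000, tol=1e-12):
    """Iterate the discrete-time Riccati recursion until P stops changing."""
    A, B, Q, R = (np.asarray(M, dtype=float) for M in (A, B, Q, R))
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        # Optimal feedback gain for the current P: u = -K x
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        # P_next = Q + A'PA - A'PB (R + B'PB)^{-1} B'PA
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    return P

def sum_lqr_cost(loops):
    """Sum x0' P x0 over loops given as (A, B, Q, R, x0) tuples (hypothetical layout)."""
    total = 0.0
    for A, B, Q, R, x0 in loops:
        P = lqr_cost_matrix(A, B, Q, R)
        x0 = np.asarray(x0, dtype=float)
        total += float(x0 @ P @ x0)
    return total
```

For a scalar loop with A = B = Q = R = 1, the recursion converges to the golden ratio, which gives a quick sanity check on the implementation.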
Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning
In this paper, we consider the infinite-horizon reach-avoid zero-sum game problem, where the goal is to find a set in the state space, referred to as the reach-avoid set, such that the system starting at a state therein could be controlled to reach a given target set without violating constraints under the worst-case disturbance. We address this problem by designing a new value function with a contracting Bellman backup, where the super-zero level set, i.e., the set of states where the value function is evaluated to be non-negative, recovers the reach-avoid set. Building upon this, we prove that the proposed method can be adapted to compute the viability kernel, or the set of states which could be controlled to satisfy given constraints, and the backward reachable set, or the set of states that could be driven towards a given target set. Finally, we propose to alleviate the curse of dimensionality issue in high-dimensional problems by extending Conservative Q-Learning, a deep reinforcement learning technique, to learn a value function such that the super-zero level set of the learned value function serves as a (conservative) approximation to the reach-avoid set. Our theoretical and empirical results suggest that the proposed method could reliably learn the reach-avoid set and the optimal control policy even with neural network approximation.
VLEIBot: A New 45-mg Swimming Microrobot Driven by a Bioinspired Anguilliform Propulsor ICRA 2024
This paper presents the VLEIBot^* (Very Little Eel-Inspired roBot), a 45-mg/23-mm^3 microrobotic swimmer that is propelled by a bioinspired anguilliform propulsor. The propulsor is excited by a single 6-mg high-work-density (HWD) microactuator and undulates periodically due to wave propagation phenomena generated by fluid-structure interaction (FSI) during swimming. The microactuator is composed of a carbon-fiber beam, which functions as a leaf spring, and shape-memory alloy (SMA) wires, which deform cyclically when excited periodically using Joule heating. The VLEIBot can swim at speeds as high as 15.1 mm * s^{-1} (0.33 Bl * s^{-1}) when driven with a heuristically-optimized propulsor. To improve maneuverability, we evolved the VLEIBot design into the 90-mg/47-mm^3 VLEIBot^+, which is driven by two propulsors and fully controllable in the two-dimensional (2D) space. The VLEIBot^+ can swim at speeds as high as 16.1 mm * s^{-1} (0.35 Bl * s^{-1}), when driven with heuristically-optimized propulsors, and achieves turning rates as high as 0.28 rad * s^{-1}, when tracking path references. The measured root-mean-square (RMS) values of the tracking errors are as low as 4 mm.
comment: 8 pages, 7 figures, to be presented at ICRA 2024
Robotics
Effects of fiber number and density on fiber jamming: Towards follow-the-leader deployment of a continuum robot IROS2024
Fiber jamming modules (FJMs) offer flexibility and quick stiffness variation, making them suitable for follow-the-leader (FTL) motions in continuum robots, which is ideal for minimally invasive surgery (MIS). However, their potential has not been fully exploited, particularly in designing and manufacturing small-sized FJMs with high stiffness variation. Although existing research has focused on factors like fiber materials and geometry to maximize stiffness variation, the results often do not apply to FJMs for MIS due to size constraints. Meanwhile, other factors such as fiber number and packing density, less significant to large FJMs but critical to small-sized FJMs, have received insufficient investigation regarding their impact on the stiffness variation for FTL deployment. In this paper, we design and fabricate FJMs with a diameter of 4mm. Through theoretical and experimental analysis, we find that fiber number and packing density significantly affect both absolute stiffness and stiffness variation. Our experiments confirm the feasibility of using FJMs in a medical FTL robot design. The optimal configuration is a 4mm FJM with 0.4mm fibers at a 56% packing density, achieving up to 3400% stiffness variation. A video demonstration of a prototype robot using the suggested parameters for achieving FTL motions can be found at https://youtu.be/7pI5U0z7kcE.
comment: 6 pages, 6 figures, accepted by IROS2024
Modeling of Terrain Deformation by a Grouser Wheel for Lunar Rover Simulation
Simulation of vehicle motion in planetary environments is challenging. This is due to the modeling of complex terrain, optical conditions, and terrain-aware vehicle dynamics. One of the critical issues of typical simulators is that they assume terrain is a rigid body, which limits their ability to render wheel traces and compute the wheel-terrain interactions. This prevents, for example, the use of wheel traces as landmarks for localization, as well as the accurate simulation of motion. In the context of lunar regolith, the surface is not rigid but granular. As such, there are differences in the rover's motion, such as sinkage and slippage, and a clear wheel trace left behind the rover, compared to that on a rigid terrain. This study presents a novel approach to integrating a terramechanics-aware terrain deformation engine to simulate a realistic wheel trace in a digital lunar environment. By leveraging Discrete Element Method simulation results alongside experimental single-wheel test data, we construct a regression model to derive deformation height as a function of contact normal force. The region of interest in a height map is retrieved from the wheel poses. The elevation values of corresponding pixels are subsequently modified using contact normal forces and the regression model. Finally, we apply the determined elevation change to each mesh vertex to render wheel traces during runtime. The deformation engine is integrated into our ongoing development of a lunar simulator based on NVIDIA's Omniverse IsaacSim. We hypothesize that our work will be crucial to testing perception and downstream navigation systems under conditions similar to outdoor or terrestrial fields. A demonstration video is available here: https://www.youtube.com/watch?v=TpzD0h-5hv4
comment: 7 pages, 7 figures, to be published in proceedings of the 21st International and 12th Asia-Pacific Regional Conference of the ISTVS (ISTVS)
A Comparison of Imitation Learning Algorithms for Bimanual Manipulation
Amidst the wide popularity of imitation learning algorithms in robotics, their properties regarding hyperparameter sensitivity, ease of training, data efficiency, and performance have not been well-studied in high-precision industry-inspired environments. In this work, we demonstrate the limitations and benefits of prominent imitation learning approaches and analyze their capabilities regarding these properties. We evaluate each algorithm on a complex bimanual manipulation task involving an over-constrained dynamics system in a setting involving multiple contacts between the manipulated object and the environment. While we find that imitation learning is well suited to solve such complex tasks, not all algorithms are equal in terms of handling environmental and hyperparameter perturbations, training requirements, performance, and ease of use. We investigate the empirical influence of these key characteristics by employing a carefully designed experimental procedure and learning environment. Paper website: https://bimanual-imitation.github.io/
Learning and Blending Robot Hugging Behaviors in Time and Space
We introduce an imitation learning-based physical human-robot interaction algorithm capable of predicting appropriate robot responses in complex interactions involving a superposition of multiple interactions. Our proposed algorithm, Blending Bayesian Interaction Primitives (B-BIP), allows us to achieve responsive interactions in complex hugging scenarios, capable of reciprocating and adapting to a hug's motion and timing. We show that this algorithm is a generalization of prior work, for which the original formulation reduces to the particular case of a single interaction, and evaluate our method through both an extensive user study and empirical experiments. Our algorithm yields significantly better quantitative prediction error and more favorable participant responses with respect to accuracy, responsiveness, and timing, when compared to existing state-of-the-art methods.
Multiagent Systems
DeepVoting: Learning Voting Rules with Tailored Embeddings
Aggregating the preferences of multiple agents into a collective decision is a common step in many important problems across areas of computer science including information retrieval, reinforcement learning, and recommender systems. As Social Choice Theory has shown, the problem of designing algorithms for aggregation rules with specific properties (axioms) can be difficult, or provably impossible in some cases. Instead of designing algorithms by hand, one can learn aggregation rules, particularly voting rules, from data. However, the prior work in this area has required extremely large models, or been limited by the choice of preference representation, i.e., embedding. We recast the problem of designing a good voting rule into one of learning probabilistic versions of voting rules that output distributions over a set of candidates. Specifically, we use neural networks to learn probabilistic social choice functions from the literature. We show that embeddings of preference profiles derived from the social choice literature allow us to learn existing voting rules more efficiently and scale to larger populations of voters more easily than other work if the embedding is tailored to the learning objective. Moreover, we show that rules learned using embeddings can be tweaked to create novel voting rules with improved axiomatic properties. Namely, we show that existing voting rules require only minor modification to combat a probabilistic version of the No Show Paradox.
Reaching New Heights in Multi-Agent Collective Construction ECAI 2024
We propose a new approach for multi-agent collective construction, based on the idea of reversible ramps. Our ReRamp algorithm utilizes reversible side-ramps to generate construction plans for ramped block structures higher and larger than was previously possible using state-of-the-art planning algorithms, given the same building area. We compare the ReRamp algorithm to similar state-of-the-art algorithms on a set of benchmark instances, where we demonstrate its superior computational speed. We also establish in our experiments that the ReRamp algorithm is capable of generating plans for a single-story house, an important milestone on the road to real-world multi-agent construction applications.
comment: 9 pages, 5 figures, ECAI 2024 extension
Hybrid Training for Enhanced Multi-task Generalization in Multi-agent Reinforcement Learning
In multi-agent reinforcement learning (MARL), achieving multi-task generalization to diverse agents and objectives presents significant challenges. Existing online MARL algorithms primarily focus on single-task performance, but their lack of multi-task generalization capabilities typically results in substantial computational waste and limited real-life applicability. Meanwhile, existing offline multi-task MARL approaches are heavily dependent on data quality, often resulting in poor performance on unseen tasks. In this paper, we introduce HyGen, a novel hybrid MARL framework, Hybrid Training for Enhanced Multi-Task Generalization, which integrates online and offline learning to ensure both multi-task generalization and training efficiency. Specifically, our framework extracts potential general skills from offline multi-task datasets. We then train policies to select the optimal skills under the centralized training and decentralized execution paradigm (CTDE). During this stage, we utilize a replay buffer that integrates both offline data and online interactions. We empirically demonstrate that our framework effectively extracts and refines general skills, yielding impressive generalization to unseen tasks. Comparative analyses on the StarCraft multi-agent challenge show that HyGen outperforms a wide range of existing solely online and offline methods.
Systems and Control (CS)
Submodular Maximization Approaches for Equitable Client Selection in Federated Learning
In a conventional Federated Learning framework, client selection for training typically involves the random sampling of a subset of clients in each iteration. However, this random selection often leads to disparate performance among clients, raising concerns regarding fairness, particularly in applications where equitable outcomes are crucial, such as in medical or financial machine learning tasks. This disparity typically becomes more pronounced with the advent of performance-centric client sampling techniques. This paper introduces two novel methods, namely SUBTRUNC and UNIONFL, designed to address the limitations of random client selection. Both approaches utilize submodular function maximization to achieve more balanced models. By modifying the facility location problem, they aim to mitigate the fairness concerns associated with random selection. SUBTRUNC leverages client loss information to diversify solutions, while UNIONFL relies on historical client selection data to ensure a more equitable performance of the final model. Moreover, these algorithms are accompanied by robust theoretical guarantees regarding convergence under reasonable assumptions. The efficacy of these methods is demonstrated through extensive evaluations across heterogeneous scenarios, revealing significant improvements in fairness as measured by a client dissimilarity metric.
comment: 13 pages
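To make the submodular-maximization idea in the abstract above more concrete, here is a minimal sketch of the classical greedy algorithm on the plain facility location objective F(S) = sum_i max_{j in S} sim[i][j]. This is not SUBTRUNC or UNIONFL, whose modified objectives are not specified in the abstract; the similarity matrix `sim`, the budget `k`, and the function name are illustrative assumptions, and similarities are assumed non-negative.

```python
def greedy_facility_location(sim, k):
    """Greedily pick k clients (columns) to maximize sum_i max_{j in S} sim[i][j]."""
    n = len(sim)
    selected = []
    best = [0.0] * n  # best[i]: how well client i is covered by the current selection
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding client j to the selection
            gain = sum(max(sim[i][j] - best[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best = [max(best[i], sim[i][best_j]) for i in range(n)]
    return selected, sum(best)

# Hypothetical similarity matrix: clients 0 and 1 resemble each other, client 2 does not.
sim = [[1.0, 0.5, 0.0],
       [0.5, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
chosen, value = greedy_facility_location(sim, k=2)
```

With this matrix, the greedy pass picks one of the two similar clients and then the dissimilar one, which is the diversification effect that motivates facility-location-style selection.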
Receding-Horizon Games with Tullock-Based Profit Functions for Electric Ride-Hailing Markets
This paper proposes a receding-horizon, game-theoretic charging planning mechanism for electric ride-hailing markets. As the demand for ride-hailing services continues to surge and governments advocate for stricter environmental regulations, integrating electric vehicles into these markets becomes inevitable. The proposed framework addresses the challenges posed by dynamic demand patterns, fluctuating energy costs, and competitive dynamics inherent in such markets. Leveraging the concept of receding-horizon games, we propose a method to optimize proactive dispatching of vehicles for recharging over a predefined time horizon. We integrate a modified Tullock contest that accounts for customer abandonment due to long waiting times to model the expected market share, and by factoring in the demand-based electricity charging, we construct a game capturing interactions between two companies over the time horizon. For this game, we first establish the existence and uniqueness of the Nash equilibrium and then present a semi-decentralized, iterative method to compute it. Finally, the method is evaluated in an open-loop and a closed-loop manner in a simulated numerical case study.
comment: Extended version of the paper accepted for presentation at the 63rd IEEE Conference on Decision and Control (CDC 2024) in Milan, Italy
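For readers unfamiliar with Tullock contests, the sketch below shows the standard contest success function, in which each company's expected market share is proportional to its effort raised to a power r. The abstract above says only that the paper modifies this form to account for customer abandonment; the `outside` term here is a purely hypothetical stand-in for such lost demand and is not taken from the paper.

```python
def tullock_shares(efforts, r=1.0, outside=0.0):
    """Standard Tullock shares x_i^r / (sum_j x_j^r + outside), for non-negative efforts.

    `outside` is a hypothetical extra weight for customers who leave the market;
    with outside=0 this is the textbook Tullock contest.
    """
    weights = [e ** r for e in efforts]
    total = sum(weights) + outside
    if total == 0:
        # Usual convention: equal split when nobody exerts effort
        return [1.0 / len(efforts)] * len(efforts)
    return [w / total for w in weights]

# Two companies with 30 and 10 available vehicles (illustrative numbers)
shares = tullock_shares([30, 10])
shares_with_loss = tullock_shares([30, 10], outside=10)
```

With a positive `outside` weight the shares no longer sum to one, and the remainder represents demand that neither company serves.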
Control-Informed Reinforcement Learning for Chemical Processes
This work proposes a control-informed reinforcement learning (CIRL) framework that integrates proportional-integral-derivative (PID) control components into the architecture of deep reinforcement learning (RL) policies. The proposed approach augments deep RL agents with a PID controller layer, incorporating prior knowledge from control theory into the learning process. CIRL improves performance and robustness by combining the best of both worlds: the disturbance-rejection and setpoint-tracking capabilities of PID control and the nonlinear modeling capacity of deep RL. Simulation studies conducted on a continuously stirred tank reactor system demonstrate the improved performance of CIRL compared to both conventional model-free deep RL and static PID controllers. CIRL exhibits better setpoint-tracking ability, particularly when generalizing to trajectories outside the training distribution, suggesting enhanced generalization capabilities. Furthermore, the embedded prior control knowledge within the CIRL policy improves its robustness to unobserved system disturbances. The control-informed RL framework combines the strengths of classical control and reinforcement learning to develop sample-efficient and robust deep reinforcement learning algorithms, with potential applications in complex industrial systems.
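A minimal sketch of what a PID controller layer could look like is given below, assuming (this is not stated in the abstract) that the learned policy supplies the gains kp, ki, kd at each step and that the layer turns the tracking error into a control action. The class name and interface are hypothetical, and the gains are hard-coded where a policy network would normally provide them.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PIDLayer:
    """Discrete-time PID law with externally supplied gains (hypothetical interface)."""
    dt: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error, kp, ki, kd):
        # Accumulate the integral of the error (rectangle rule)
        self.integral += error * self.dt
        # Backward-difference derivative; zero on the first call
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative

pid = PIDLayer(dt=0.5)
setpoint, measurement = 1.0, 0.0
# In a control-informed policy, these gains would be the network's output.
action = pid.step(setpoint - measurement, kp=2.0, ki=1.0, kd=0.0)
```

Keeping the PID law as an explicit final layer means the policy's output has a fixed, interpretable structure, which is one plausible reading of how prior control knowledge is embedded.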
Plug-and-Play Drag Sail Module for LEO Satellites: Implementation and Early Testing of AirDragMod (ADM)
Space debris has become a critical issue, with debris in orbit surpassing active satellites, posing significant risks to space sustainability. Payloads or rocket bodies discarded post-mission in LEO without orbital control are major sources. The IADC guidelines recommend limiting post-mission presence in protected regions to 25 years. The FCC recently introduced stricter regulations, reducing the allowable post-mission stay for LEO satellites to 5 years. These changes necessitate integrating deorbiting systems into satellite designs. However, adding extra fuel and engines for active deorbiting presents challenges due to LEO satellites' mass and volume limitations, especially for large constellations or CubeSats. This often leads to prioritizing mission-critical components over deorbiting systems. Thus, alternative approaches like passive deorbiting techniques or international regulations are explored. Drag sails are a cost-effective passive solution for small and medium-sized LEO satellites. This paper proposes a plug-and-play drag sail module using COTS components for CubeSats and sub-mass satellites. The scalable design is derived from mission requirements and trajectory analysis. The technique includes active control for quicker deorbiting at specific orbital LTAN. Inspired by JAXA's IKAROS mission, the deployment mechanism uses residual angular momentum and follows a standard sequence. A cost analysis estimates the system's breakeven point. A prototype with a 3D-printed deployment system and inverted stepper motor was tested and compared to a numerical model. A tension model for sail extension petals was developed using curve fitting from test data. SIMULINK multibody models are available for simulations. Further experimentation and prototype development are required to assess real-world performance, with a control system identified as crucial.
Learning a Factorized Orthogonal Latent Space using Encoder-only Architecture for Fault Detection; An Alarm management perspective
False and nuisance alarms in industrial fault detection systems are often triggered by uncertainty, causing normal process variable fluctuations to be erroneously identified as faults. This paper introduces a novel encoder-based residual design that effectively decouples the stochastic and deterministic components of process variables without imposing detection delay. The proposed model employs two distinct encoders to factorize the latent space into two orthogonal spaces: one for the deterministic part and the other for the stochastic part. To ensure the identifiability of the desired spaces, constraints are applied during training. The deterministic space is constrained to be smooth to guarantee determinism, while the stochastic space is required to resemble standard Gaussian noise. Additionally, a decorrelation term enforces the independence of the learned representations. The efficacy of this approach is demonstrated through numerical examples and its application to the Tennessee Eastman process, highlighting its potential for robust fault detection. By focusing decision logic solely on deterministic factors, the proposed model significantly enhances prediction quality while achieving nearly zero false alarms and missed detections, paving the way for improved operational safety and integrity in industrial environments.
Intelligent Router for LLM Workloads: Improving Performance Through Workload-Aware Scheduling
Large Language Model (LLM) workloads have distinct prefill and decode phases with different compute and memory requirements, which should ideally be accounted for when scheduling input queries across different LLM instances in a cluster. However, existing scheduling algorithms treat LLM workloads as monolithic jobs without considering the distinct characteristics of the two phases in each workload. This leads to sub-optimal scheduling and increased response latency. In this work, we propose a heuristic-guided reinforcement learning-based intelligent router for data-driven and workload-aware scheduling. Our router leverages a trainable response-length predictor and a novel formulation for estimating the impact of mixing different workloads to schedule queries across LLM instances and achieve over 11% lower end-to-end latency than existing approaches.
comment: 16 pages, 8 figures
Enabling Wireless Communications, Energy Harvesting, and Energy Saving by Using a Multimode Smart Nonlinear Circuit (MSNC)
In this paper, a multimode smart nonlinear circuit (MSNC) for wireless communications (Tx and Rx modes) as well as energy harvesting (EH) and power saving is presented. The proposed MSNC is designed at 680 MHz and has three ports, which are connected to an antenna, a T/R (transceiver) module, and a power-saving module. According to the input/output power level, the proposed MSNC has three modes of operation: receiving (Rx), power saving, and transmitting (Tx), for low (<-25 dBm), mid (>-25 dBm and <0 dBm) and high (>5 dBm) power ranges, respectively. In the power-saving mode, when the received power is greater than the sensitivity of the Rx module, the excess power is directed to the energy harvesting load (power storage), while the receiving direction is still in place. The fact that the proposed MSNC can manage the received power level smartly and without any external control distinguishes it from other EH circuits. The proposed MSNC operates within a power range from -50 dBm to +15 dBm, demonstrates an efficiency of more than 60% in the power-saving mode, and has acceptable matching over a large frequency range. The design procedure of the proposed MSNC along with the theoretical, simulation and measurement results are presented in this paper. Good agreement between theory, simulation and measurement results confirms the accuracy of the design procedure.
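The three power ranges quoted in the abstract above can be summarized as a simple lookup. The sketch below only encodes the stated thresholds; the circuit itself switches modes passively, without any such software logic. The function name is hypothetical, and powers that the abstract does not assign to a mode (exactly -25 dBm, and 0 dBm to 5 dBm) are reported as unspecified rather than guessed.

```python
def msnc_mode(power_dbm):
    """Map a power level in dBm to the operating mode stated in the abstract."""
    if not -50.0 <= power_dbm <= 15.0:
        return "out of range"      # outside the stated -50 dBm to +15 dBm range
    if power_dbm < -25.0:
        return "Rx"                # low power: receiving
    if -25.0 < power_dbm < 0.0:
        return "power saving"      # mid power: excess goes to the EH load
    if power_dbm > 5.0:
        return "Tx"                # high power: transmitting
    return "unspecified"           # not covered by the quoted ranges

for level in (-40, -10, 2, 10):
    print(level, "dBm ->", msnc_mode(level))
```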
Quantitative Evaluation of Full-Scale Ship Maneuvering Characteristics During Berthing and Unberthing
Leveraging empirical data is crucial in the development of accurate and reliable virtual models for the advancement of autonomous ship technologies and the optimization of port operations. This study presents an in-depth analysis of ship berthing and unberthing maneuvering characteristics by utilizing a comprehensive dataset encompassing the operation of a full-scale ship in diverse infrastructural and environmental conditions. Various statistical techniques and time-series analysis were employed to process and interpret the operational data. A systematic analysis was conducted on key performance variables, including approach speed, drift angles, turning motions, distance from obstacles, and actuator utilization. The results demonstrate significant discrepancies between the empirical data and the established maneuvering characteristics. These findings have the potential to significantly enhance the accuracy and reliability of conventional maneuvering models, such as the Mathematical Modeling Group (MMG) model, and improve the conditions used in captive model tests for the identification of maneuvering model parameters. Furthermore, these findings could inform the development of more robust autonomous berthing and unberthing algorithms and digital twins.
Towards Automatic Linearization via SMT Solving
Mathematical optimization is ubiquitous in modern applications. However, in practice, we often need to use nonlinear optimization models, for which the existing optimization tools such as Cplex or Gurobi may not be directly applicable and an (error-prone) manual transformation often has to be done. Thus, to address this issue, in this paper we investigate the problem of automatically verifying and synthesizing reductions, the solution of which may allow an automatic linearization of nonlinear models. We show that the synthesis of reductions can be formulated as an $\exists^* \forall^*$ synthesis problem, which can be solved by an SMT solver via the counter-example guided inductive synthesis approach (CEGIS).
comment: 4 pages, conference
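The counter-example guided inductive synthesis (CEGIS) loop mentioned in the abstract above can be sketched in a few lines. This toy version replaces the SMT solver with brute-force enumeration over small finite domains, so it shows only the structure of the loop (propose a candidate consistent with known counterexamples, then verify it against all inputs); the function name and the example specification are hypothetical.

```python
def cegis(candidates, inputs, spec):
    """Find c with spec(c, x) true for every x in inputs, or return None.

    candidates and inputs must be re-iterable finite sequences (e.g. range or list).
    """
    counterexamples = []
    while True:
        # Synthesis step: first candidate consistent with all counterexamples so far
        candidate = next(
            (c for c in candidates if all(spec(c, x) for x in counterexamples)),
            None,
        )
        if candidate is None:
            return None
        # Verification step: look for an input that refutes the candidate
        cex = next((x for x in inputs if not spec(candidate, x)), None)
        if cex is None:
            return candidate
        counterexamples.append(cex)

# Toy instance: smallest integer bound M with M >= x*x - 2*x for all x in [-5, 5],
# the kind of constant a linearization might need as an upper bound.
bound = cegis(range(50), range(-5, 6), lambda m, x: m >= x * x - 2 * x)
```

Each counterexample rules out at least the candidate that produced it, so the loop terminates on finite domains; an SMT-based version would delegate both steps to solver queries instead of enumeration.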
Multi-Interval Energy-Reserve Co-Optimization with SoC-Dependent Bids from Battery Storage
We consider the problem of co-optimized energy-reserve market clearing with state-of-charge (SoC) dependent bids from battery storage participants. While SoC-dependent bids capture storage's degradation and opportunity costs, such bids result in a non-convex optimization in the market clearing process. More challenging is the regulation reserve capacity clearing, where the SoC-dependent cost is uncertain as it depends on the unknown regulation trajectories ex-post of the market clearing. Addressing the nonconvexity and uncertainty in a multi-interval co-optimized real-time energy-reserve market, we introduce a simple restriction on the SoC-dependent bids along with a robust optimization formulation, transforming the non-convex market clearing under uncertainty into a standard convex piece-wise linear program and making it possible for large-scale storage integration. Under reasonable assumptions, we show that SoC-dependent bids yield higher profit for storage participants than that from SoC-independent bids. Numerical simulations demonstrate a 28%-150% profit increase of the proposed SoC-dependent bids compared with the SoC-independent counterpart.
Analysis of Indistinguishable Trajectories of a Nonholonomic Vehicle Subject to Range Measurements
We propose a global constructibility analysis for a vehicle moving on a planar surface. Assuming that the vehicle follows a trajectory that can be uniquely identified by the sequence of control inputs and by some intermittent ranging measurements from known points in the environment, we can model the trajectory as a rigid body subject to rotation and translation in the plane. This way, the localisation problem can be reduced to finding the conditions for the existence of a unique roto-translation of the trajectory from a known reference frame to the world reference frame, given the collected measurements. As discussed in this paper, such conditions can be expressed in terms of the shape of the trajectory, of the layout of the ranging sensors, and of the numbers of measurements collected from each of them. The approach applies to a large class of kinematic models. Focusing on the special case of unicycle kinematics, we provide additional local constructibility results.
comment: 12 pages, 7 figures. This article has been accepted for publication in IEEE Transactions on Automatic Control (2024). Content may change prior to final publication
Sparse Attention-driven Quality Prediction for Production Process Optimization in Digital Twins
In the process industry, long-term and efficient optimization of production lines requires real-time monitoring and analysis of operational states to fine-tune production line parameters. However, complexity in operational logic and intricate coupling of production process parameters make it difficult to develop an accurate mathematical model for the entire process, thus hindering the deployment of efficient optimization mechanisms. In view of these difficulties, we propose to deploy a digital twin of the production line by encoding its operational logic in a data-driven approach. By iteratively mapping the real-world data reflecting equipment operation status and product quality indicators into the digital twin, we adopt a quality prediction model for the production process based on self-attention-enabled temporal convolutional neural networks. This model enables the data-driven state evolution of the digital twin. The digital twin plays the role of aggregating the information of actual operating conditions and the results of quality-sensitive analysis, which facilitates the optimization of process production with virtual-reality evolution. Leveraging the digital twin as an information-flow carrier, we extract temporal features from key process indicators and establish a production process quality prediction model based on the proposed deep neural network. Our operation experiments on a specific tobacco shredding line demonstrate that the proposed digital twin-based production process optimization method fosters seamless integration between virtual and real production lines. This integration achieves an average operating status prediction accuracy of over 98% and a product quality acceptance rate of over 96%.
Systems and Control (EESS)
Submodular Maximization Approaches for Equitable Client Selection in Federated Learning
In a conventional Federated Learning framework, client selection for training typically involves the random sampling of a subset of clients in each iteration. However, this random selection often leads to disparate performance among clients, raising concerns regarding fairness, particularly in applications where equitable outcomes are crucial, such as in medical or financial machine learning tasks. This disparity typically becomes more pronounced with the advent of performance-centric client sampling techniques. This paper introduces two novel methods, namely SUBTRUNC and UNIONFL, designed to address the limitations of random client selection. Both approaches utilize submodular function maximization to achieve more balanced models. By modifying the facility location problem, they aim to mitigate the fairness concerns associated with random selection. SUBTRUNC leverages client loss information to diversify solutions, while UNIONFL relies on historical client selection data to ensure a more equitable performance of the final model. Moreover, these algorithms are accompanied by robust theoretical guarantees regarding convergence under reasonable assumptions. The efficacy of these methods is demonstrated through extensive evaluations across heterogeneous scenarios, revealing significant improvements in fairness as measured by a client dissimilarity metric.
comment: 13 pages
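As a rough illustration of the building block named in the abstract above, the sketch below runs greedy maximization on the standard, unmodified facility location objective. It is not SUBTRUNC or UNIONFL: the similarity matrix `SIM`, the function names, and the budget `k` are all hypothetical, and the paper's modifications (client-loss truncation, selection history) are not reproduced. Similarities are assumed non-negative so that the objective is monotone.

```python
def facility_location(similarity, selected):
    """Standard facility location value: sum over clients i of max_{j in S} similarity[i][j]."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in similarity)


def greedy_select(similarity, k):
    """Pick up to k clients, each time adding the one with the largest marginal gain."""
    n = len(similarity)
    selected = []
    for _ in range(min(k, n)):
        current = facility_location(similarity, selected)
        best_j, best_gain = None, float("-inf")
        for j in range(n):
            if j in selected:
                continue
            gain = facility_location(similarity, selected + [j]) - current
            if gain > best_gain:
                best_j, best_gain = j, gain
        selected.append(best_j)
    return selected


# Hypothetical toy data: clients 0 and 1 are alike, client 2 is different.
SIM = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
chosen = greedy_select(SIM, 2)
print(chosen)
```

On this toy matrix the greedy rule picks one of the two similar clients and then the dissimilar one, which is the diversity effect that facility-location-style objectives are generally used for.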
Receding-Horizon Games with Tullock-Based Profit Functions for Electric Ride-Hailing Markets
This paper proposes a receding-horizon, game-theoretic charging planning mechanism for electric ride-hailing markets. As the demand for ride-hailing services continues to surge and governments advocate for stricter environmental regulations, integrating electric vehicles into these markets becomes inevitable. The proposed framework addresses the challenges posed by dynamic demand patterns, fluctuating energy costs, and competitive dynamics inherent in such markets. Leveraging the concept of receding-horizon games, we propose a method to optimize proactive dispatching of vehicles for recharging over a predefined time horizon. We integrate a modified Tullock contest that accounts for customer abandonment due to long waiting times to model the expected market share, and by factoring in the demand-based electricity charging, we construct a game capturing interactions between two companies over the time horizon. For this game, we first establish the existence and uniqueness of the Nash equilibrium and then present a semi-decentralized, iterative method to compute it. Finally, the method is evaluated in an open-loop and a closed-loop manner in a simulated numerical case study.
comment: Extended version of the paper accepted for presentation at the 63rd IEEE Conference on Decision and Control (CDC 2024) in Milan, Italy
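For orientation only, the textbook (unmodified) Tullock contest success function is shown below. The symbols $x_i$ (effort of player $i$), $n$ (number of players) and $r$ (a positive exponent) are generic notation assumed here, not taken from the paper, and the paper's modification for customer abandonment is not reproduced.

$$
p_i = \frac{x_i^{r}}{\sum_{j=1}^{n} x_j^{r}}, \qquad x_j \ge 0,\; r > 0,\; \sum_{j=1}^{n} x_j^{r} > 0 .
$$

For example, with two companies, $r = 1$, and efforts $x_1 = 3$ and $x_2 = 1$, the shares are $p_1 = 0.75$ and $p_2 = 0.25$.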
Control-Informed Reinforcement Learning for Chemical Processes
This work proposes a control-informed reinforcement learning (CIRL) framework that integrates proportional-integral-derivative (PID) control components into the architecture of deep reinforcement learning (RL) policies. The proposed approach augments deep RL agents with a PID controller layer, incorporating prior knowledge from control theory into the learning process. CIRL improves performance and robustness by combining the best of both worlds: the disturbance-rejection and setpoint-tracking capabilities of PID control and the nonlinear modeling capacity of deep RL. Simulation studies conducted on a continuously stirred tank reactor system demonstrate the improved performance of CIRL compared to both conventional model-free deep RL and static PID controllers. CIRL exhibits better setpoint-tracking ability, particularly when generalizing to trajectories outside the training distribution, suggesting enhanced generalization capabilities. Furthermore, the embedded prior control knowledge within the CIRL policy improves its robustness to unobserved system disturbances. The control-informed RL framework combines the strengths of classical control and reinforcement learning to develop sample-efficient and robust deep reinforcement learning algorithms, with potential applications in complex industrial systems.
Plug-and-Play Drag Sail Module for LEO Satellites: Implementation and Early Testing of AirDragMod (ADM)
Space debris has become a critical issue, with debris in orbit surpassing active satellites, posing significant risks to space sustainability. Payloads or rocket bodies discarded post-mission in LEO without orbital control are major sources. The IADC guidelines recommend limiting post-mission presence in protected regions to 25 years. The FCC recently introduced stricter regulations, reducing the allowable post-mission stay for LEO satellites to 5 years. These changes necessitate integrating deorbiting systems into satellite designs. However, adding extra fuel and engines for active deorbiting presents challenges due to LEO satellites' mass and volume limitations, especially for large constellations or CubeSats. This often leads to prioritizing mission-critical components over deorbiting systems. Thus, alternative approaches like passive deorbiting techniques or international regulations are explored. Drag sails are a cost-effective passive solution for small and medium-sized LEO satellites. This paper proposes a plug-and-play drag sail module using COTS components for CubeSats and sub-mass satellites. The scalable design is derived from mission requirements and trajectory analysis. The technique includes active control for quicker deorbiting at specific orbital LTAN. Inspired by JAXA's IKAROS mission, the deployment mechanism uses residual angular momentum and follows a standard sequence. A cost analysis estimates the system's breakeven point. A prototype with a 3D-printed deployment system and inverted stepper motor was tested and compared to a numerical model. A tension model for sail extension petals was developed using curve fitting from test data. SIMULINK multibody models are available for simulations. Further experimentation and prototype development are required to assess real-world performance, with a control system identified as crucial.
Learning a Factorized Orthogonal Latent Space using Encoder-only Architecture for Fault Detection; An Alarm management perspective
False and nuisance alarms in industrial fault detection systems are often triggered by uncertainty, causing normal process variable fluctuations to be erroneously identified as faults. This paper introduces a novel encoder-based residual design that effectively decouples the stochastic and deterministic components of process variables without imposing detection delay. The proposed model employs two distinct encoders to factorize the latent space into two orthogonal spaces: one for the deterministic part and the other for the stochastic part. To ensure the identifiability of the desired spaces, constraints are applied during training. The deterministic space is constrained to be smooth to guarantee determinism, while the stochastic space is required to resemble standard Gaussian noise. Additionally, a decorrelation term enforces the independence of the learned representations. The efficacy of this approach is demonstrated through numerical examples and its application to the Tennessee Eastman process, highlighting its potential for robust fault detection. By focusing decision logic solely on deterministic factors, the proposed model significantly enhances prediction quality while achieving nearly zero false alarms and missed detections, paving the way for improved operational safety and integrity in industrial environments.
Intelligent Router for LLM Workloads: Improving Performance Through Workload-Aware Scheduling
Large Language Model (LLM) workloads have distinct prefill and decode phases with different compute and memory requirements, which should ideally be accounted for when scheduling input queries across different LLM instances in a cluster. However, existing scheduling algorithms treat LLM workloads as monolithic jobs without considering the distinct characteristics of the two phases in each workload. This leads to sub-optimal scheduling and increased response latency. In this work, we propose a heuristic-guided reinforcement learning-based intelligent router for data-driven and workload-aware scheduling. Our router leverages a trainable response-length predictor and a novel formulation for estimating the impact of mixing different workloads to schedule queries across LLM instances and achieve over 11% lower end-to-end latency than existing approaches.
comment: 16 pages, 8 figures
Enabling Wireless Communications, Energy Harvesting, and Energy Saving by Using a Multimode Smart Nonlinear Circuit (MSNC)
In this paper, a multimode smart nonlinear circuit (MSNC) for wireless communications (Tx and Rx modes) as well as energy harvesting (EH) and power saving is presented. The proposed MSNC is designed at 680 MHz and has three ports, which are connected to an antenna, a T/R (transceiver) module, and a power-saving module. According to the input/output power level, the proposed MSNC has three modes of operation: receiving (Rx), power saving, and transmitting (Tx), for low (<-25 dBm), mid (>-25 dBm and <0 dBm) and high (>5 dBm) power ranges, respectively. In the power-saving mode, when the received power is greater than the sensitivity of the Rx module, the excess power is directed to the energy harvesting load (power storage), while the receiving direction is still in place. The fact that the proposed MSNC can manage the received power level smartly and without any external control distinguishes it from other EH circuits. The proposed MSNC operates within a power range from -50 dBm to +15 dBm, demonstrates an efficiency of more than 60% in the power-saving mode, and has acceptable matching over a large frequency range. The design procedure of the proposed MSNC along with the theoretical, simulation and measurement results are presented in this paper. Good agreement between theory, simulation and measurement results confirms the accuracy of the design procedure.
Quantitative Evaluation of Full-Scale Ship Maneuvering Characteristics During Berthing and Unberthing
Leveraging empirical data is crucial in the development of accurate and reliable virtual models for the advancement of autonomous ship technologies and the optimization of port operations. This study presents an in-depth analysis of ship berthing and unberthing maneuvering characteristics by utilizing a comprehensive dataset encompassing the operation of a full-scale ship in diverse infrastructural and environmental conditions. Various statistical techniques and time-series analysis were employed to process and interpret the operational data. A systematic analysis was conducted on key performance variables, including approach speed, drift angles, turning motions, distance from obstacles, and actuator utilization. The results demonstrate significant discrepancies between the empirical data and the established maneuvering characteristics. These findings have the potential to significantly enhance the accuracy and reliability of conventional maneuvering models, such as the Mathematical Modeling Group (MMG) model, and improve the conditions used in captive model tests for the identification of maneuvering model parameters. Furthermore, these findings could inform the development of more robust autonomous berthing and unberthing algorithms and digital twins.
Towards Automatic Linearization via SMT Solving
Mathematical optimization is ubiquitous in modern applications. However, in practice, we often need to use nonlinear optimization models, for which the existing optimization tools such as Cplex or Gurobi may not be directly applicable and an (error-prone) manual transformation often has to be done. Thus, to address this issue, in this paper we investigate the problem of automatically verifying and synthesizing reductions, the solution of which may allow an automatic linearization of nonlinear models. We show that the synthesis of reductions can be formulated as an $\exists^* \forall^*$ synthesis problem, which can be solved by an SMT solver via the counter-example guided inductive synthesis approach (CEGIS).
comment: 4 pages, conference
Multi-Interval Energy-Reserve Co-Optimization with SoC-Dependent Bids from Battery Storage
We consider the problem of co-optimized energy-reserve market clearing with state-of-charge (SoC) dependent bids from battery storage participants. While SoC-dependent bids capture storage's degradation and opportunity costs, such bids result in a non-convex optimization in the market clearing process. More challenging is the regulation reserve capacity clearing, where the SoC-dependent cost is uncertain as it depends on the unknown regulation trajectories ex-post of the market clearing. Addressing the nonconvexity and uncertainty in a multi-interval co-optimized real-time energy-reserve market, we introduce a simple restriction on the SoC-dependent bids along with a robust optimization formulation, transforming the non-convex market clearing under uncertainty into a standard convex piece-wise linear program and making it possible for large-scale storage integration. Under reasonable assumptions, we show that SoC-dependent bids yield higher profit for storage participants than that from SoC-independent bids. Numerical simulations demonstrate a 28%-150% profit increase of the proposed SoC-dependent bids compared with the SoC-independent counterpart.
Analysis of Indistinguishable Trajectories of a Nonholonomic Vehicle Subject to Range Measurements
We propose a global constructibility analysis for a vehicle moving on a planar surface. Assuming that the vehicle follows a trajectory that can be uniquely identified by the sequence of control inputs and by some intermittent ranging measurements from known points in the environment, we can model the trajectory as a rigid body subject to rotation and translation in the plane. This way, the localisation problem can be reduced to finding the conditions for the existence of a unique roto-translation of the trajectory from a known reference frame to the world reference frame, given the collected measurements. As discussed in this paper, such conditions can be expressed in terms of the shape of the trajectory, of the layout of the ranging sensors, and of the numbers of measurements collected from each of them. The approach applies to a large class of kinematic models. Focusing on the special case of unicycle kinematics, we provide additional local constructibility results.
comment: 12 pages, 7 figures. This article has been accepted for publication in IEEE Transactions on Automatic Control (2024). Content may change prior to final publication
Sparse Attention-driven Quality Prediction for Production Process Optimization in Digital Twins
In the process industry, long-term and efficient optimization of production lines requires real-time monitoring and analysis of operational states to fine-tune production line parameters. However, complexity in operational logic and intricate coupling of production process parameters make it difficult to develop an accurate mathematical model for the entire process, thus hindering the deployment of efficient optimization mechanisms. In view of these difficulties, we propose to deploy a digital twin of the production line by encoding its operational logic in a data-driven approach. By iteratively mapping the real-world data reflecting equipment operation status and product quality indicators in the digital twin, we adopt a quality prediction model for production process based on self-attention-enabled temporal convolutional neural networks. This model enables the data-driven state evolution of the digital twin. The digital twin takes a role of aggregating the information of actual operating conditions and the results of quality-sensitive analysis, which facilitates the optimization of process production with virtual-reality evolution. Leveraging the digital twin as an information-flow carrier, we extract temporal features from key process indicators and establish a production process quality prediction model based on the proposed deep neural network. Our operation experiments on a specific tobacco shredding line demonstrate that the proposed digital twin-based production process optimization method fosters seamless integration between virtual and real production lines. This integration achieves an average operating status prediction accuracy of over 98% and a product quality acceptance rate of over 96%.
Robotics
Multi-finger Manipulation via Trajectory Optimization with Differentiable Rolling and Geometric Constraints
Parameterizing finger rolling and finger-object contacts in a differentiable manner is important for formulating dexterous manipulation as a trajectory optimization problem. In contrast to previous methods which often assume simplified geometries of the robot and object or do not explicitly model finger rolling, we propose a method to further extend the capabilities of dexterous manipulation by accounting for non-trivial geometries of both the robot and the object. By integrating the object's Signed Distance Field (SDF) with a sampling method, our method estimates contact and rolling-related variables and includes those in a trajectory optimization framework. This formulation naturally allows for the emergence of finger-rolling behaviors, enabling the robot to locally adjust the contact points. Our method is tested in a peg alignment task and a screwdriver turning task, where it outperforms the baselines in terms of achieving desired object configurations and avoiding dropping the object. We also successfully apply our method to a real-world screwdriver turning task, demonstrating its robustness to the sim2real gap.
Do Mistakes Matter? Comparing Trust Responses of Different Age Groups to Errors Made by Physically Assistive Robots
Trust is a key factor in ensuring acceptable human-robot interaction, especially in settings where robots may be assisting with critical activities of daily living. When practically deployed, robots are bound to make occasional mistakes, yet the degree to which these errors will impact a care recipient's trust in the robot, especially in performing physically assistive tasks, remains an open question. To investigate this, we conducted experiments where participants interacted with physically assistive robots which would occasionally make intentional mistakes while performing two different tasks: bathing and feeding. Our study considered the error response of two populations: younger adults at a university (median age 26) and older adults at an independent living facility (median age 83). We observed that the impact of errors on a user's trust in the robot depends on both their age and the task that the robot is performing. We also found that older adults tend to evaluate the robot on factors unrelated to the robot's performance, making their trust in the system more resilient to errors when compared to younger adults. Code and supplementary materials are available on our project webpage.
comment: 8 pages, 5 figures, in proceedings for IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) 2024
ShapeICP: Iterative Category-level Object Pose and Shape Estimation from Depth
Category-level object pose and shape estimation from a single depth image has recently drawn research attention due to its wide applications in robotics and self-driving. The task is particularly challenging because the three unknowns, object pose, object shape, and model-to-measurement correspondences, are compounded together but only a single view of depth measurements is provided. The vast majority of the prior work relies heavily on data-driven approaches to obtain solutions to at least one of the unknowns and typically two, running the risk of failing to generalize to unseen domains. The shape representations used in the prior work also mainly focus on point clouds and signed distance fields (SDFs). In stark contrast to the prior work, we approach the problem using an iterative estimation method that does not require learning from any pose-annotated data. In addition, we adopt a novel mesh-based object active shape model that has not been explored in the previous literature. Our algorithm, named ShapeICP, has its foundation in the iterative closest point (ICP) algorithm but is equipped with additional features for the category-level pose and shape estimation task. The results show that even without using any pose-annotated data, ShapeICP surpasses many data-driven approaches that rely on the pose data for training, opening up new solution space for researchers to consider.
Complete Autonomous Robotic Nasopharyngeal Swab System with Evaluation on a Stochastically Moving Phantom Head
The application of autonomous robotics to close-contact healthcare tasks has a clear role for the future due to its potential to reduce infection risks to staff and improve clinical efficiency. Nasopharyngeal (NP) swab sample collection for diagnosing upper-respiratory illnesses is one type of close-contact task that is interesting for robotics due to the dexterity requirements and the unobservability of the nasal cavity. We propose a control system that performs the test using a collaborative manipulator arm with an instrumented end-effector to take visual and force measurements, under the scenario that the patient is unrestrained and the tools are general enough to be applied to other close-contact tasks. The system employs a visual servo controller to align the swab with the nostrils. A compliant joint velocity controller inserts the swab along a trajectory optimized through a simulation environment and also reacts to measured forces applied to the swab. Additional subsystems include a fuzzy logic system for detecting when the swab reaches the nasopharynx and a method for detaching the swab and aborting the procedure if safety criteria are violated. The system is evaluated using a second robotic arm that holds a nasal cavity phantom and simulates the natural head motions that could occur during the procedure. Through extensive experiments, we identify controller configurations capable of effectively performing the NP swab test even with significant head motion, which demonstrates the safety and reliability of the system.
comment: 18 pages, 26 figures. Supplementary files may be found at https://uwaterloo.ca/scholar/pqjlee/supplementary-files-complete-autonomous-robotic-nasopharyngeal-swab-system-evaluation
cc-DRL: a Convex Combined Deep Reinforcement Learning Flight Control Design for a Morphing Quadrotor
In comparison to common quadrotors, the shape change of morphing quadrotors endows them with better flight performance but also results in more complex flight dynamics. Generally, it is extremely difficult or even impossible to establish an accurate mathematical model describing the complex flight dynamics of morphing quadrotors. To address the issue of flight control design for morphing quadrotors, this paper resorts to a combination of model-free control techniques (e.g., deep reinforcement learning, DRL) and the convex combination (CC) technique, and proposes a convex-combined-DRL (cc-DRL) flight control algorithm for the position and attitude of a class of morphing quadrotors, where the shape change is realized by the length variation of four arm rods. In the proposed cc-DRL flight control algorithm, the proximal policy optimization algorithm, a model-free DRL algorithm, is used to train offline the corresponding optimal flight control laws for some selected representative arm length modes, and a cc-DRL flight control scheme is then constructed by the convex combination technique. Finally, simulation results are presented to show the effectiveness and merit of the proposed flight control algorithm.
Identification and validation of the dynamic model of a tendon-driven anthropomorphic finger
This study addresses the absence of an identification framework to quantify a comprehensive dynamic model of human and anthropomorphic tendon-driven fingers, which is necessary to investigate the physiological properties of human fingers and improve the control of robotic hands. First, a generalized dynamic model was formulated, which takes into account the inherent properties of such a mechanical system. This includes rigid-body dynamics, coupling matrix, joint viscoelasticity, and tendon friction. Then, we propose a methodology comprising a series of experiments, for step-wise identification and validation of this dynamic model. Moreover, an experimental setup was designed and constructed that features actuation modules and peripheral sensors to facilitate the identification process. To verify the proposed methodology, a 3D-printed robotic finger based on the index finger design of the Dexmart hand was developed, and the proposed experiments were executed to identify and validate its dynamic model. This study could be extended to explore the identification of cadaver hands, aiming for a consistent dataset from a single cadaver specimen to improve the development of musculoskeletal hand models.
comment: 8 pages, 9 figures
Robust Iterative Value Conversion: Deep Reinforcement Learning for Neurochip-driven Edge Robots
A neurochip is a device that reproduces the signal processing mechanisms of brain neurons and calculates Spiking Neural Networks (SNNs) with low power consumption and at high speed. Thus, neurochips are attracting attention from edge robot applications, which suffer from limited battery capacity. This paper aims to achieve deep reinforcement learning (DRL) that acquires SNN policies suitable for neurochip implementation. Since DRL requires a complex function approximation, we focus on conversion techniques from Floating Point NN (FPNN) because it is one of the most feasible SNN techniques. However, DRL requires conversions to SNNs for every policy update to collect the learning samples for a DRL-learning cycle, which updates the FPNN policy and collects the SNN policy samples. Accumulated conversion errors can significantly degrade the performance of the SNN policies. We propose Robust Iterative Value Conversion (RIVC) as a DRL that incorporates conversion error reduction and robustness to conversion errors. To reduce conversion errors, the FPNN is optimized with the same number of quantization bits as an SNN, so that the FPNN output is not significantly changed by quantization. To improve robustness to conversion errors, an FPNN policy to which quantization is applied is updated to increase the gap between the probability of selecting the optimal action and other actions. This step prevents unexpected replacements of the policy's optimal actions. We verified RIVC's effectiveness on a neurochip-driven robot. The results showed that RIVC consumed one-fifteenth of the power and ran the calculation five times faster than an edge CPU (quad-core ARM Cortex-A72). The previous framework with no countermeasures against conversion errors failed to train the policies. Videos from our experiments are available: https://youtu.be/Q5Z0-BvK1Tc.
comment: Accepted by Robotics and Autonomous Systems
Informational Embodiment: Computational role of information structure in codes and robots
Body morphology plays an important role in the way information is perceived and processed by an agent. We develop an information theory (IT) account of how the precision of sensors, the accuracy of motors, their placement, and the body geometry shape the information structure in robots and computational codes. As an original idea, we envision the robot's body as a physical communication channel through which information is conveyed, in and out, despite intrinsic noise and material limitations. Following this, entropy, a measure of information and uncertainty, can be used to maximize the efficiency of robot design and of algorithmic codes per se. This is known as the principle of Entropy Maximization (PEM), introduced in biology by Barlow in 1969. Shannon's source coding theorem then provides a framework to compare different types of bodies in terms of sensorimotor information. In line with PEM, we introduce a special class of efficient codes used in IT that reached the Shannon limits in terms of information capacity for error correction and robustness against noise, and parsimony. These efficient codes, which make insightful use of quantization and randomness, make it possible to deal with uncertainty, redundancy, and compactness. These features can be used for perception and control in intelligent systems. In various examples and closing discussions, we reflect on the broader implications of our framework, which we call Informational Embodiment, for motor theory and bio-inspired robotics, touching upon concepts like motor synergies, reservoir computing, and morphological computation. These insights can contribute to a deeper understanding of how information theory intersects with the embodiment of intelligence in both natural and artificial systems.
SIMPNet: Spatial-Informed Motion Planning Network
Current robotic manipulators require fast and efficient motion-planning algorithms to operate in cluttered environments. State-of-the-art sampling-based motion planners struggle to scale to high-dimensional configuration spaces and are inefficient in complex environments. This inefficiency arises because these planners utilize either uniform or hand-crafted sampling heuristics within the configuration space. To address these challenges, we present the Spatial-informed Motion Planning Network (SIMPNet). SIMPNet consists of a stochastic graph neural network (GNN)-based sampling heuristic for informed sampling within the configuration space. The sampling heuristic of SIMPNet encodes the workspace embedding into the configuration space through a cross-attention mechanism. It encodes the manipulator's kinematic structure into a graph, which is used to generate informed samples within the framework of sampling-based motion planning algorithms. We have evaluated the performance of SIMPNet using a UR5e robotic manipulator operating within simple and complex workspaces, comparing it against baseline state-of-the-art motion planners. The evaluation results show the effectiveness and advantages of the proposed planner compared to the baseline planners.
Towards Human-Robot Teaming through Augmented Reality and Gaze-Based Attention Control
Robots are now increasingly integrated into various real world applications and domains. In these new domains, robots are mostly employed to improve, in some ways, the work done by humans. So, the need for effective Human-Robot Teaming (HRT) capabilities grows. These capabilities usually involve the dynamic collaboration between humans and robots at different levels of involvement, leveraging the strengths of both to efficiently navigate complex situations. Crucial to this collaboration is the ability of robotic systems to adjust their level of autonomy to match the needs of the task and the human team members. This paper introduces a system designed to control attention using HRT through the use of ground robots and augmented reality (AR) technology. Traditional methods of controlling attention, such as pointing, touch, and voice commands, sometimes fall short in precision and subtlety. Our system overcomes these limitations by employing AR headsets to display virtual visual markers. These markers act as dynamic cues to attract and shift human attention seamlessly, irrespective of the robot's physical location.
comment: Accepted to the Variable Autonomy for Human-Robot Teaming Workshop, IEEE ROMAN 2024
Courteous MPC for Autonomous Driving with CBF-inspired Risk Assessment
With more autonomous vehicles (AVs) sharing roadways with human-driven vehicles (HVs), ensuring safe and courteous maneuvers that respect HVs' behavior becomes increasingly important. To promote both safety and courtesy in AVs' behavior, this paper proposes an extension of a Control Barrier Function (CBF)-inspired risk evaluation framework that considers both noisy observed positions and velocities of surrounding vehicles. The risk perceived by the ego vehicle can be visualized as a risk map that reflects the understanding of the surrounding environment and thus shows the potential for facilitating safe and courteous driving. By incorporating the risk evaluation framework into the Model Predictive Control (MPC) scheme, we propose a Courteous MPC for the ego AV to generate courteous behaviors that 1) reduce the overall risk imposed on other vehicles and 2) respect the hard safety constraints and the original objective for efficiency. We demonstrate the performance of the proposed Courteous MPC via theoretical analysis and simulation experiments.
comment: 7 pages, accepted to ITSC 2024
Environment-Centric Active Inference
To allow agents to handle unintended changes in the environment, we propose an environment-centric active inference (EC-AIF) in which the Markov Blanket of active inference is defined starting from the environment. In normal active inference, the Markov Blanket is defined starting from the agent. That is, first the agent is defined as the entity that performs the "action", such as a robot or a person, then the environment is defined as other people or objects that are directly affected by the agent's "action," and the boundary between the agent and the environment is defined as the Markov Blanket. This agent-centric definition does not allow the agent to respond to unintended changes in the environment caused by factors outside of the defined environment. In the proposed EC-AIF, there is no entity corresponding to an agent. The environment includes all observable things, including people and things conventionally considered to be the environment, as well as entities that perform "actions" such as robots and people. Accordingly, all states, including robots and people, are included in inference targets, eliminating unintended changes in the environment. The EC-AIF was applied to a robot arm and validated with an object transport task by the robot arm. The results showed that the robot arm successfully transported objects while responding to changes in the target position of the object and to changes in the orientation of another robot arm.
comment: 14 pages, 9 figures
Towards Robust Perception for Assistive Robotics: An RGB-Event-LiDAR Dataset and Multi-Modal Detection Pipeline
The increasing adoption of human-robot interaction presents opportunities for technology to positively impact lives, particularly those with visual impairments, through applications such as guide-dog-like assistive robotics. We present a pipeline exploring the perception and "intelligent disobedience" required by such a system. A dataset of two people moving in and out of view has been prepared to compare RGB-based and event-based multi-modal dynamic object detection using LiDAR data for 3D position localisation. Our analysis highlights challenges in accurate 3D localisation using 2D image-LiDAR fusion, indicating the need for further refinement. Compared to the performance of the frame-based detection algorithm utilised (YOLOv4), current cutting-edge event-based detection models appear limited to contextual scenarios, such as for automotive platforms. This is highlighted by weak precision and recall over varying confidence and Intersection over Union (IoU) thresholds when using frame-based detections as a ground truth. Therefore, we have publicly released this dataset to the community, containing RGB, event, point cloud and Inertial Measurement Unit (IMU) data along with ground truth poses for the two people in the scene to fill a gap in the current landscape of publicly available datasets and provide a means to assist in the development of safer and more robust algorithms in the future: https://uts-ri.github.io/revel/.
comment: Accepted to the 2024 IEEE International Conference on Automation Science and Engineering (CASE)
Safe Bubble Cover for Motion Planning on Distance Fields
We consider the problem of planning collision-free trajectories on distance fields. Our key observation is that querying a distance field at one configuration reveals a region of safe space whose radius is given by the distance value, obviating the need for additional collision checking within the safe region. We refer to such regions as safe bubbles, and show that safe bubbles can be obtained from any Lipschitz-continuous safety constraint. Inspired by sampling-based planning algorithms, we present three algorithms for constructing a safe bubble cover of free space, named bubble roadmap (BRM), rapidly exploring bubble graph (RBG), and expansive bubble graph (EBG). The bubble sampling algorithms are combined with a hierarchical planning method that first computes a discrete path of bubbles, followed by a continuous path within the bubbles computed via convex optimization. Experimental results show that the bubble-based methods yield up to 5-10 times cost reduction relative to conventional baselines while simultaneously reducing computational efforts by orders of magnitude.
comment: 16 pages, 11 figures. Submitted to International Symposium on Robotics Research 2024
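The safe-bubble observation in the abstract above can be illustrated with a minimal sketch. It assumes a 2D workspace with a single circular obstacle and a Euclidean (1-Lipschitz) distance function; the names `circle_obstacle_distance`, `safe_bubble`, and `in_bubble` are hypothetical and are not taken from the paper.

```python
import math

def circle_obstacle_distance(q, center=(0.0, 0.0), radius=1.0):
    """Distance from point q to the boundary of one circular obstacle (assumed scene)."""
    return math.dist(q, center) - radius

def safe_bubble(q, distance_fn):
    """Return (center, radius) of the safe bubble at q, or None if q is in collision.

    Because the distance function is 1-Lipschitz, every point within
    distance_fn(q) of q has non-negative distance to the obstacle.
    """
    d = distance_fn(q)
    return (q, d) if d > 0 else None

def in_bubble(p, bubble):
    """True if p lies inside the bubble, so no further collision check is needed."""
    center, radius = bubble
    return math.dist(p, center) <= radius

# One distance query at (3, 0) certifies a whole disc of radius 2 as safe.
bubble = safe_bubble((3.0, 0.0), circle_obstacle_distance)
```

Since each bubble is a ball and therefore convex, a straight segment between two points of the same bubble stays inside it, which is what makes a continuous path within a chain of overlapping bubbles amenable to convex optimization.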
Beyond Winning Strategies: Admissible and Admissible Winning Strategies for Quantitative Reachability Games
Classical reactive synthesis approaches aim to synthesize a reactive system that always satisfies a given specification. These approaches often reduce to playing a two-player zero-sum game where the goal is to synthesize a winning strategy. However, in many pragmatic domains, such as robotics, a winning strategy does not always exist, yet it is desirable for the system to make an effort to satisfy its requirements instead of "giving up". To this end, this paper investigates the notion of admissible strategies, which formalize "doing-your-best", in quantitative reachability games. We show that, unlike the qualitative case, quantitative admissible strategies are history-dependent even for finite payoff functions, making synthesis a challenging task. In addition, we prove that admissible strategies always exist but may produce undesirable optimistic behaviors. To mitigate this, we propose admissible winning strategies, which enforce the best possible outcome while being admissible. We show that both strategies always exist but are not memoryless. We provide necessary and sufficient conditions for the existence of both strategies and propose synthesis algorithms. Finally, we illustrate the strategies on gridworld and robot manipulator domains.
comment: TL;DR: This paper relaxes the notion of winning strategies by introducing Admissible and Admissible Winning strategies for quantitative reachability games, providing existence proofs and synthesis algorithms with applications in robotics
Tactile-Morph Skills: Energy-Based Control Meets Data-Driven Learning
Robotic manipulation is essential for modernizing factories and automating industrial tasks like polishing, which require advanced tactile abilities. These robots must be easily set up, safely work with humans, learn tasks autonomously, and transfer skills to similar tasks. Addressing these needs, we introduce the tactile-morph skill framework, which integrates unified force-impedance control with data-driven learning. Our system adjusts robot movements and force application based on estimated energy levels for the desired trajectory and force profile, ensuring safety by stopping if energy allocated for the control runs out. Using a Temporal Convolutional Network, we estimate the energy distribution for a given motion and force profile, enabling skill transfer across different tasks and surfaces. Our approach maintains stability and performance even on unfamiliar geometries with similar friction characteristics, demonstrating improved accuracy, zero-shot transferable performance, and enhanced safety in real-world scenarios. This framework promises to enhance robotic capabilities in industrial settings, making intelligent robots more accessible and valuable.
comment: 15 pages, 7 figures, updated footnote
Multimodal Reinforcement Learning for Robots Collaborating with Humans
Robot assistants for older adults and people with disabilities need to interact with their users in collaborative tasks. The core component of these systems is an interaction manager whose job is to observe and assess the task, and infer the state of the human and their intent to choose the best course of action for the robot. Due to the sparseness of the data in this domain, the policy for such multi-modal systems is often crafted by hand; as the complexity of interactions grows, this process is not scalable. In this paper, we propose a reinforcement learning (RL) approach to learn the robot policy. In contrast to dialog systems, our agent is trained with a simulator developed using human data and can deal with multiple modalities such as language and physical actions. We conducted a human study to evaluate the performance of the system in the interaction with a user. Our designed system shows promising preliminary results when it is used by a real user.
Solving Robotics Problems in Zero-Shot with Vision-Language Models
We introduce Wonderful Team, a multi-agent visual LLM (VLLM) framework for solving robotics problems in the zero-shot regime. By zero-shot we mean that, for a novel environment, we feed a VLLM an image of the robot's environment and a description of the task, and have the VLLM output the sequence of actions necessary for the robot to complete the task. Prior work on VLLMs in robotics has largely focused on settings where some part of the pipeline is fine-tuned, such as tuning an LLM on robot data or training a separate vision encoder for perception and action generation. Surprisingly, due to recent advances in the capabilities of VLLMs, this type of fine-tuning may no longer be necessary for many tasks. In this work, we show that with careful engineering, we can prompt a single off-the-shelf VLLM to handle all aspects of a robotics task, from high-level planning to low-level location-extraction and action-execution. Wonderful Team builds on recent advances in multi-agent LLMs to partition tasks across an agent hierarchy, making it self-corrective and able to effectively partition and solve even long-horizon tasks. Extensive experiments on VIMABench and real-world robotic environments demonstrate the system's capability to handle a variety of robotic tasks, including manipulation, visual goal-reaching, and visual reasoning, all in a zero-shot manner. These results underscore a key point: vision-language models have progressed rapidly in the past year, and should strongly be considered as a backbone for robotics problems going forward.
comment: aka Wonderful Team
iMTSP: Solving Min-Max Multiple Traveling Salesman Problem with Imperative Learning
This paper considers a Min-Max Multiple Traveling Salesman Problem (MTSP), where the goal is to find a set of tours, one for each agent, to collectively visit all the cities while minimizing the length of the longest tour. Though MTSP has been widely studied, obtaining near-optimal solutions for large-scale problems is still challenging due to its NP-hardness. Recent efforts in data-driven methods face challenges of the need for hard-to-obtain supervision and issues with high variance in gradient estimations, leading to slow convergence and highly suboptimal solutions. We address these issues by reformulating MTSP as a bilevel optimization problem, using the concept of imperative learning (IL). This involves introducing an allocation network that decomposes the MTSP into multiple single-agent traveling salesman problems (TSPs). The longest tour from these TSP solutions is then used to self-supervise the allocation network, resulting in a new self-supervised, bilevel, end-to-end learning framework, which we refer to as imperative MTSP (iMTSP). Additionally, to tackle the high-variance gradient issues during the optimization, we introduce a control variate-based gradient estimation algorithm. Our experiments showed that these innovative designs enable our gradient estimator to converge 20% faster than the advanced reinforcement learning baseline and find up to 80% shorter tour length compared with Google OR-Tools MTSP solver, especially in large-scale problems (e.g. 1000 cities and 15 agents).
comment: 8 pages, 3 figures, 3 tables
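The min-max objective described in the abstract above can be written down in a few lines. This is only a sketch of the objective itself: the allocation network, the single-agent TSP solver, and the control-variate gradient estimator are not reproduced, and the function names and the fixed visiting order are illustrative assumptions.

```python
import math

def tour_length(depot, cities):
    """Length of the closed tour depot -> cities (in the given order) -> depot."""
    path = [depot, *cities, depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def min_max_objective(depot, allocation):
    """Min-max MTSP cost: the length of the longest tour among all agents.

    `allocation` holds one ordered list of cities per agent (assumed to come
    from some allocation step followed by a per-agent TSP solve).
    """
    return max(tour_length(depot, cities) for cities in allocation)

# Two agents sharing a depot at the origin (toy data).
depot = (0.0, 0.0)
allocation = [[(3.0, 0.0)], [(0.0, 1.0), (0.0, 2.0)]]
longest = min_max_objective(depot, allocation)
```

In this toy case the first agent's tour has length 6 and the second agent's has length 4, so the objective is 6; moving a city between agents changes the objective only when it changes the longest tour.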
Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks IROS 2024
In multi-agent reinforcement learning (MARL), the Centralized Training with Decentralized Execution (CTDE) framework is pivotal but struggles due to a gap: global state guidance in training versus reliance on local observations in execution, lacking global signals. Inspired by human societal consensus mechanisms, we introduce the Hierarchical Consensus-based Multi-Agent Reinforcement Learning (HC-MARL) framework to address this limitation. HC-MARL employs contrastive learning to foster a global consensus among agents, enabling cooperative behavior without direct communication. This approach enables agents to form a global consensus from local observations, using it as an additional piece of information to guide collaborative actions during execution. To cater to the dynamic requirements of various tasks, consensus is divided into multiple layers, encompassing both short-term and long-term considerations. Short-term observations prompt the creation of an immediate, low-layer consensus, while long-term observations contribute to the formation of a strategic, high-layer consensus. This process is further refined through an adaptive attention mechanism that dynamically adjusts the influence of each consensus layer. This mechanism optimizes the balance between immediate reactions and strategic planning, tailoring it to the specific demands of the task at hand. Extensive experiments and real-world applications in multi-robot systems showcase our framework's superior performance, marking significant advancements over baselines.
comment: 8 pages, 10 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments
Robotic manipulation with two-finger grippers is challenged by objects lacking distinct graspable features. Traditional pre-grasping methods, which typically involve repositioning objects or utilizing external aids like table edges, are limited in their adaptability across different object categories and environments. To overcome these limitations, we introduce PreAfford, a novel pre-grasping planning framework incorporating a point-level affordance representation and a relay training approach. Our method significantly improves adaptability, allowing effective manipulation across a wide range of environments and object types. When evaluated on the ShapeNet-v2 dataset, PreAfford not only enhances grasping success rates by 69% but also demonstrates its practicality through successful real-world experiments. These improvements highlight PreAfford's potential to redefine standards for robotic handling of complex manipulation tasks in diverse settings.
comment: Project Page: https://air-discover.github.io/PreAfford/
Servo Integrated Nonlinear Model Predictive Control for Overactuated Tiltable-Quadrotors
Utilizing a servo to tilt each rotor transforms quadrotors from underactuated to overactuated systems, allowing for independent control of both attitude and position, which provides advantages for aerial manipulation. However, this enhancement also introduces model nonlinearity, sluggish servo response, and limited operational range into the system, posing challenges to dynamic control. In this study, we propose a control approach for tiltable-quadrotors based on nonlinear model predictive control (NMPC). Unlike conventional cascade methods, our approach preserves the full dynamics without simplification. It directly uses rotor thrust and servo angle as control inputs, where their limited working ranges are considered input constraints. Notably, we incorporate a first-order servo model within the NMPC framework. Simulation reveals that integrating the servo dynamics is not only an enhancement to control performance but also a critical factor for optimization convergence. To evaluate the effectiveness of our approach, we fabricate a tiltable-quadrotor and deploy the algorithm onboard at 100 Hz. Extensive real-world experiments demonstrate rapid, robust, and smooth pose-tracking performance.
comment: 8 pages, 10 figures. This article has been accepted by RA-L
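A first-order servo model of the kind mentioned in the abstract above is commonly written as a lag, d(angle)/dt = (angle_cmd - angle) / tau. The sketch below assumes that form and a forward-Euler discretization; the time constant `tau`, the step size, and the function names are illustrative assumptions rather than values from the paper.

```python
def servo_step(angle, angle_cmd, tau, dt):
    """One forward-Euler step of the assumed first-order lag servo model."""
    return angle + dt * (angle_cmd - angle) / tau

def simulate_servo(angle0, angle_cmd, tau, dt, steps):
    """Roll the servo model forward under a constant commanded angle."""
    angles = [angle0]
    for _ in range(steps):
        angles.append(servo_step(angles[-1], angle_cmd, tau, dt))
    return angles

# dt = 0.01 s matches a 100 Hz loop; tau = 0.1 s is a hypothetical time constant.
trajectory = simulate_servo(0.0, 1.0, tau=0.1, dt=0.01, steps=200)
```

Including the servo angle as a state with this kind of dynamics lets a predictive controller account for the delay between a commanded and an achieved tilt angle, instead of assuming the command takes effect instantly.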
Field Testing of a Stochastic Planner for ASV Navigation Using Satellite Images
We introduce a multi-sensor navigation system for autonomous surface vessels (ASV) intended for water-quality monitoring in freshwater lakes. Our mission planner uses satellite imagery as a prior map, formulating offline a mission-level policy for global navigation of the ASV and enabling autonomous online execution via local perception and local planning modules. A significant challenge is posed by the inconsistencies in traversability estimation between satellite images and real lakes, due to environmental effects such as wind, aquatic vegetation, shallow waters, and fluctuating water levels. Hence, we specifically modelled these traversability uncertainties as stochastic edges in a graph and optimized for a mission-level policy that minimizes the expected total travel distance. To execute the policy, we propose a modern local planner architecture that processes sensor inputs and plans paths to execute the high-level policy under uncertain traversability conditions. Our system was tested on three km-scale missions on a Northern Ontario lake, demonstrating that our GPS-, vision-, and sonar-enabled ASV system can effectively execute the mission-level policy and disambiguate the traversability of stochastic edges. Finally, we provide insights gained from practical field experience and offer several future directions to enhance the overall reliability of ASV navigation systems.
comment: To appear in IEEE Transactions on Field Robotics (T-FR). 40 pages, 21 figures. Video available at https://youtu.be/KVSTmWFLqjk?si=Gvt1uOgLH-6OUrfD. Journal extension of arXiv:2209.11864
"Golden Ratio Yoshimura" for Meta-Stable and Massively Reconfigurable Deployment
Yoshimura origami is a classical folding pattern that has inspired many deployable structure designs. Its applications span from space exploration, kinetic architectures, and soft robots to even everyday household items. However, despite its wide usage, Yoshimura has been fixated on a set of design constraints to ensure its flat-foldability. Through extensive kinematic analysis and prototype tests, this study presents a new Yoshimura that intentionally defies these constraints. Remarkably, one can impart a unique meta-stability by using the Golden Ratio angle to define the triangular facets of a generalized Yoshimura. As a result, when its facets are strategically popped out, a "Golden Ratio Yoshimura" boom with $m$ modules can be theoretically reconfigured into $8^m$ geometrically unique and load-bearing shapes. This result not only challenges the existing design norms but also opens up a new avenue to create deployable and versatile structural systems.
AirPilot: A PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are one of the most popular and widely used control algorithms for drones and other control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced Proportional Integral Derivative (PID) drone controller using Proximal Policy Optimization (PPO). The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the simulator and implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error of the default PX4 PID position controller by 90%, improving the effective navigation speed of a fine-tuned PID controller by 21%, and reducing settling time and overshoot by 17% and 16%, respectively.
comment: 14 pages, 17 figures
Sensing environmental interaction physics to traverse cluttered obstacles
When legged robots physically interact with obstacles in applications such as search and rescue through rubble and planetary exploration across Martian rocks, even the most advanced ones struggle because they lack a fundamental framework to model the robot-obstacle physical interaction paralleling artificial potential fields for obstacle avoidance. To remedy this, recent studies established a novel framework - potential energy landscape modeling - that explains and predicts the destabilizing transitions across locomotor modes from the physical interaction between robots and obstacles, and governs a wide range of complex locomotion. However, this framework was confined to the laboratory because we lack methods to obtain the potential energy landscape in unknown environments. Here, we explore the feasibility of introducing this framework to such environments. We showed that a robot can reconstruct the potential energy landscape for unknown obstacles by measuring the obstacle contact forces and resulting torques. To elaborate, we developed a minimalistic robot capable of sensing contact forces and torques when propelled against a pair of grass-like obstacles. Despite the forces and torques not being fully conservative, they matched the potential energy landscape gradients well, and the reconstructed landscape matched the ground truth well. In addition, we found that using normal forces and torques and head oscillation inspired by cockroach observations further improved the estimation of conservative ones. Our study will finally inspire free-running robots to achieve low-effort, "zero-shot" traversal of cluttered, large obstacles in real-world applications by sampling contact forces and torques and reconstructing the landscape around their neighboring states in real time.
Multiagent Systems
From Mobilisation to Radicalisation: Probing the Persistence and Radicalisation of Social Movements Using an Agent-Based Model
We are living in an age of protest. Although we have an excellent understanding of the factors that predict participation in protest, we understand little about the conditions that foster a sustained (versus transient) movement. How do interactions between supporters and authorities combine to influence whether and how people engage (i.e., using conventional or radical tactics)? This paper introduces a novel, theoretically-founded and empirically-informed agent-based model (DIMESim) to address these questions. We model the complex interactions between the psychological attributes of the protester (agents), the authority to whom the protests are targeted, and the environment that allows protesters to coordinate with each other -- over time, and at a population scale. Where an authority is responsive and failure is contested, a modest sized conventional movement endured. Where authorities repeatedly and incontrovertibly fail the movement, the population disengaged from action but evidenced an ongoing commitment to radicalism (latent radicalism).
comment: Initial submission version of journal paper
Optimizing Collaboration of LLM based Agents for Finite Element Analysis
This paper investigates the interactions between multiple agents within Large Language Models (LLMs) in the context of programming and coding tasks. We utilize the AutoGen framework to facilitate communication among agents, evaluating different configurations based on the success rates from 40 random runs for each setup. The study focuses on developing a flexible automation framework for applying the Finite Element Method (FEM) to solve linear elastic problems. Our findings emphasize the importance of optimizing agent roles and clearly defining their responsibilities, rather than merely increasing the number of agents. Effective collaboration among agents is shown to be crucial for addressing general FEM challenges. This research demonstrates the potential of LLM multi-agent systems to enhance computational automation in simulation methodologies, paving the way for future advancements in engineering and artificial intelligence.
DBHP: Trajectory Imputation in Multi-Agent Sports Using Derivative-Based Hybrid Prediction
Many spatiotemporal domains handle multi-agent trajectory data, but in real-world scenarios, collected trajectory data are often partially missing due to various reasons. While existing approaches demonstrate good performance in trajectory imputation, they face challenges in capturing the complex dynamics and interactions between agents due to a lack of physical constraints that govern realistic trajectories, leading to suboptimal results. To address this issue, the paper proposes a Derivative-Based Hybrid Prediction (DBHP) framework that can effectively impute multiple agents' missing trajectories. First, a neural network equipped with Set Transformers produces a naive prediction of missing trajectories while satisfying the permutation-equivariance in terms of the order of input agents. Then, the framework makes alternative predictions leveraging velocity and acceleration information and combines all the predictions with properly determined weights to provide final imputed trajectories. In this way, our proposed framework not only accurately predicts position, velocity, and acceleration values but also enforces the physical relationship between them, eventually improving both the accuracy and naturalness of the predicted trajectories. Accordingly, experimental results on imputing player trajectories in team sports show that our framework significantly outperforms existing imputation baselines.
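The idea of combining a direct position prediction with a derivative-based one can be sketched for a single time step as follows. This is an illustration only: the abstract does not specify how the alternative predictions or their weights are computed, so the constant-velocity update, the fixed scalar weight, and the function names are all assumptions.

```python
def velocity_based_position(prev_pos, velocity, dt):
    """Alternative prediction: integrate the predicted velocity from the previous position."""
    return tuple(p + v * dt for p, v in zip(prev_pos, velocity))

def hybrid_position(direct, derivative_based, weight):
    """Blend the direct and derivative-based predictions with a weight in [0, 1]."""
    return tuple(weight * d + (1.0 - weight) * b
                 for d, b in zip(direct, derivative_based))

# Toy values: the direct prediction disagrees with the velocity-based one.
prev_pos = (0.0, 0.0)
velocity = (1.0, 2.0)
from_velocity = velocity_based_position(prev_pos, velocity, dt=0.5)
blended = hybrid_position((1.5, 2.0), from_velocity, weight=0.5)
```

Here the velocity-based prediction is (0.5, 1.0) and the blend is (1.0, 1.5); tying the position estimate to the predicted velocity in this way is one simple means of keeping the two physically consistent.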
Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks IROS 2024
In multi-agent reinforcement learning (MARL), the Centralized Training with Decentralized Execution (CTDE) framework is pivotal but struggles due to a gap: global state guidance in training versus reliance on local observations in execution, lacking global signals. Inspired by human societal consensus mechanisms, we introduce the Hierarchical Consensus-based Multi-Agent Reinforcement Learning (HC-MARL) framework to address this limitation. HC-MARL employs contrastive learning to foster a global consensus among agents, enabling cooperative behavior without direct communication. This approach enables agents to form a global consensus from local observations, using it as an additional piece of information to guide collaborative actions during execution. To cater to the dynamic requirements of various tasks, consensus is divided into multiple layers, encompassing both short-term and long-term considerations. Short-term observations prompt the creation of an immediate, low-layer consensus, while long-term observations contribute to the formation of a strategic, high-layer consensus. This process is further refined through an adaptive attention mechanism that dynamically adjusts the influence of each consensus layer. This mechanism optimizes the balance between immediate reactions and strategic planning, tailoring it to the specific demands of the task at hand. Extensive experiments and real-world applications in multi-robot systems showcase our framework's superior performance, marking significant advancements over baselines.
comment: 8 pages, 10 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
A Review of Nine Physics Engines for Reinforcement Learning Research
We present a review of popular simulation engines and frameworks used in reinforcement learning (RL) research, aiming to guide researchers in selecting tools for creating simulated physical environments for RL and training setups. It evaluates nine frameworks (Brax, Chrono, Gazebo, MuJoCo, ODE, PhysX, PyBullet, Webots, and Unity) based on their popularity, feature range, quality, usability, and RL capabilities. We highlight the challenges in selecting and utilizing physics engines for RL research, including the need for detailed comparisons and an understanding of each framework's capabilities. Key findings indicate MuJoCo as the leading framework due to its performance and flexibility, despite usability challenges. Unity is noted for its ease of use but lacks scalability and simulation fidelity. The study calls for further development to improve simulation engines' usability and performance and stresses the importance of transparency and reproducibility in RL research. This review contributes to the RL community by offering insights into the selection process for simulation engines, facilitating informed decision-making.
comment: 11 pages, 3 figures
Locally Differentially Private Distributed Online Learning with Guaranteed Optimality
Distributed online learning is gaining increased traction due to its unique ability to process large-scale datasets and streaming data. To address the growing public awareness and concern on privacy protection, plenty of algorithms have been proposed to enable differential privacy in distributed online optimization and learning. However, these algorithms often face the dilemma of trading learning accuracy for privacy. By exploiting the unique characteristics of online learning, this paper proposes an approach that tackles the dilemma and ensures both differential privacy and learning accuracy in distributed online learning. More specifically, while ensuring a diminishing expected instantaneous regret, the approach can simultaneously ensure a finite cumulative privacy budget, even in the infinite time horizon. To cater for the fully distributed setting, we adopt the local differential-privacy framework, which avoids the reliance on a trusted data curator that is required in the classic "centralized" (global) differential-privacy framework. To the best of our knowledge, this is the first algorithm that successfully ensures both rigorous local differential privacy and learning accuracy. The effectiveness of the proposed algorithm is evaluated using machine learning tasks, including logistic regression on the "mushrooms" dataset and CNN-based image classification on the "MNIST" and "CIFAR-10" datasets.
comment: 24 pages, 9 figures
Systems and Control (CS)
Quadratic estimation for stochastic systems in the presence of random parameter matrices, time-correlated additive noise and deception attacks
Networked systems usually face different random uncertainties that make the performance of the least-squares (LS) linear filter decline significantly. For this reason, great attention has been paid to the search for other kinds of suboptimal estimators. Among them, the LS quadratic estimation approach has attracted considerable interest in the scientific community for its balance between computational complexity and estimation accuracy. When it comes to stochastic systems subject to different random uncertainties and deception attacks, the quadratic estimator design has not been deeply studied. In this paper, using covariance information, the LS quadratic filtering and fixed-point smoothing problems are addressed under the assumption that the measurements are perturbed by a time-correlated additive noise, as well as affected by random parameter matrices and exposed to random deception attacks. The use of random parameter matrices covers a wide range of common uncertainties and random failures, thus better reflecting the engineering reality. The signal and observation vectors are augmented by stacking the original vectors with their second-order Kronecker powers; then, the linear estimator of the original signal based on the augmented observations provides the required quadratic estimator. A simulation example illustrates the superiority of the proposed quadratic estimators over the conventional linear ones and the effect of the deception attacks on the estimation performance.
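The augmentation step described in the abstract above, stacking a vector with its second-order Kronecker power, can be sketched in plain Python. The helper names are illustrative, and the covariance-based filter and smoother derived in the paper are not reproduced here.

```python
def kron(x, y):
    """Kronecker product of two vectors given as flat lists."""
    return [a * b for a in x for b in y]

def augment(x):
    """Stack x with its second-order Kronecker power x (kron) x."""
    return list(x) + kron(x, x)

def linear_in_augmented(weights, y):
    """A linear function of the augmented observation, i.e. a quadratic function of y."""
    return sum(w * z for w, z in zip(weights, augment(y)))

y = [1.0, 2.0]
y_aug = augment(y)
# Weights picking y[0] plus y[1]**2 (hypothetical gain, for illustration only).
value = linear_in_augmented([1.0, 0.0, 0.0, 0.0, 0.0, 1.0], y)
```

A vector of dimension n becomes one of dimension n + n^2, which is why any linear estimator built on the augmented observations can express products of pairs of the original measurements.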
Dual Grid-Forming Converter
This letter proposes a dual model for grid-forming (GFM) controlled converters. The model is inspired from the observation that the structures of the active and reactive power equations of lossy synchronous machine models are almost symmetrical in terms of armature resistance and transient reactance. The proposed device is able to compensate grid power unbalance without requiring a frequency signal. In fact, the active power control is based on the rate of change of the voltage magnitude. On the other hand, synchronization and frequency control is obtained through the reactive power support. The letter shows that the proposed dual-GFM control is robust and capable of recovering a normal operating condition following large contingencies, such as load outages and three-phase faults.
cc-DRL: a Convex Combined Deep Reinforcement Learning Flight Control Design for a Morphing Quadrotor
In comparison to common quadrotors, the shape change of morphing quadrotors endows them with better flight performance but also results in more complex flight dynamics. Generally, it is extremely difficult or even impossible to establish an accurate mathematical model describing the complex flight dynamics of morphing quadrotors. To address the issue of flight control design for morphing quadrotors, this paper resorts to a combination of model-free control techniques (e.g., deep reinforcement learning, DRL) and the convex combination (CC) technique, and proposes a convex-combined-DRL (cc-DRL) flight control algorithm for the position and attitude of a class of morphing quadrotors, where the shape change is realized by the length variation of four arm rods. In the proposed cc-DRL flight control algorithm, the proximal policy optimization algorithm, a model-free DRL algorithm, is utilized to train offline the corresponding optimal flight control laws for some selected representative arm length modes, and a cc-DRL flight control scheme is then constructed by the convex combination technique. Finally, simulation results are presented to show the effectiveness and merit of the proposed flight control algorithm.
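A convex combination of control laws trained at representative arm lengths might look like the sketch below. The abstract does not state how the combination weights are chosen, so the linear-interpolation weights, the placeholder policies, and all names here are assumptions made for illustration.

```python
def convex_weights(length, modes):
    """Non-negative weights summing to 1 over sorted representative arm lengths.

    Assumed rule: linear interpolation between the two modes bracketing `length`.
    """
    length = min(max(length, modes[0]), modes[-1])
    weights = [0.0] * len(modes)
    for i in range(len(modes) - 1):
        lo, hi = modes[i], modes[i + 1]
        if lo <= length <= hi:
            t = (length - lo) / (hi - lo)
            weights[i], weights[i + 1] = 1.0 - t, t
            break
    return weights

def combined_control(state, length, modes, policies):
    """Blend the actions of the per-mode policies with the convex weights."""
    weights = convex_weights(length, modes)
    actions = [policy(state) for policy in policies]
    return [sum(w * a[k] for w, a in zip(weights, actions))
            for k in range(len(actions[0]))]

# Placeholder policies standing in for control laws trained at each mode.
modes = [1.0, 2.0, 3.0]  # arm lengths in arbitrary units
policies = [lambda s: [1.0], lambda s: [3.0], lambda s: [5.0]]
action = combined_control(state=None, length=1.5, modes=modes, policies=policies)
```

With an arm length halfway between the first two modes, the weights are [0.5, 0.5, 0.0] and the blended action is [2.0]; at a representative length the scheme reduces to the corresponding trained control law.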
Towards learning digital twin: case study on an anisotropic non-ideal rotor system
In the manufacturing industry, the digital twin (DT) is becoming a central topic. It has the potential to enhance the efficiency of manufacturing machines and reduce the frequency of errors. In order to fulfill its purpose, a DT must be a sufficiently exact replica of its corresponding physical object. Nevertheless, the physical object endures a lifelong process of degradation. As a result, the digital twin must be modified accordingly in order to satisfy the accuracy requirement. This article introduces the novel concept of "learning digital twin (LDT)," which concentrates on the temporal behavior of the physical object and highlights the digital twin's capacity for lifelong learning. The structure of an LDT is first described. Then, in-depth descriptions of various algorithms for implementing each component of an LDT are provided. The proposed LDT is validated on the simulated degradation process of an anisotropic non-ideal rotor system.
From Time-Invariant to Uniformly Time-Varying Control Barrier Functions: A Constructive Approach
In this paper, we define and analyze a subclass of (time-invariant) Control Barrier Functions (CBF) that have favorable properties for the construction of uniformly time-varying CBFs and thereby for the satisfaction of uniformly time-varying constraints. We call them $\Lambda$-shiftable CBFs, where $\Lambda$ states the extent by which the CBF can be varied by adding a time-varying function. Moreover, we derive sufficient conditions under which a time-varying CBF can be obtained from a time-invariant one, and we propose a systematic construction method. An advantage of our approach is that a $\Lambda$-shiftable CBF, once constructed, can be reused for various control objectives. In the end, we relate the class of $\Lambda$-shiftable CBFs to Control Lyapunov Functions (CLF), and we illustrate the application of our results with a relevant simulation example.
comment: 6 pages, 2 figures, accepted for publication at IEEE CDC 2024
SIMPNet: Spatial-Informed Motion Planning Network
Current robotic manipulators require fast and efficient motion-planning algorithms to operate in cluttered environments. State-of-the-art sampling-based motion planners struggle to scale to high-dimensional configuration spaces and are inefficient in complex environments. This inefficiency arises because these planners utilize either uniform or hand-crafted sampling heuristics within the configuration space. To address these challenges, we present the Spatial-informed Motion Planning Network (SIMPNet). SIMPNet consists of a stochastic graph neural network (GNN)-based sampling heuristic for informed sampling within the configuration space. The sampling heuristic of SIMPNet encodes the workspace embedding into the configuration space through a cross-attention mechanism. It encodes the manipulator's kinematic structure into a graph, which is used to generate informed samples within the framework of sampling-based motion planning algorithms. We have evaluated the performance of SIMPNet using a UR5e robotic manipulator operating within simple and complex workspaces, comparing it against baseline state-of-the-art motion planners. The evaluation results show the effectiveness and advantages of the proposed planner compared to the baseline planners.
Courteous MPC for Autonomous Driving with CBF-inspired Risk Assessment
With more autonomous vehicles (AVs) sharing roadways with human-driven vehicles (HVs), ensuring safe and courteous maneuvers that respect HVs' behavior becomes increasingly important. To promote both safety and courtesy in AVs' behavior, this paper proposes an extension of a Control Barrier Function (CBF)-inspired risk evaluation framework that considers both the noisy observed positions and velocities of surrounding vehicles. The risk perceived by the ego vehicle can be visualized as a risk map that reflects the understanding of the surrounding environment and thus shows the potential for facilitating safe and courteous driving. By incorporating the risk evaluation framework into the Model Predictive Control (MPC) scheme, we propose a Courteous MPC for the ego AV to generate courteous behaviors that 1) reduce the overall risk imposed on other vehicles and 2) respect the hard safety constraints and the original objective for efficiency. We demonstrate the performance of the proposed Courteous MPC via theoretical analysis and simulation experiments.
comment: 7 pages, accepted to ITSC 2024
Minimizing Movement Delay for Movable Antennas via Trajectory Optimization
Movable antennas (MAs) have received increasing attention in wireless communications due to their capability of antenna position adjustment to reconfigure wireless channels. However, moving MAs results in non-negligible delay, which may decrease the effective data transmission time. To reduce the movement delay, we study in this paper a new MA trajectory optimization problem. In particular, given the desired destination positions of multiple MAs, we aim to jointly optimize their associations with the initial MA positions and the trajectories for moving them from their respective initial to destination positions within a given two-dimensional (2D) region, such that the delay of antenna movement is minimized, subject to the inter-MA minimum distance constraints in the movement. However, this problem is a continuous-time mixed-integer linear programming (MILP) problem that is challenging to solve. To tackle this challenge, we propose a two-stage optimization framework that sequentially optimizes the MAs' position associations and trajectories. First, we relax the inter-MA distance constraints and optimally solve the resulting delay minimization problem. Next, we check whether the obtained MA association and trajectory solutions satisfy the inter-MA distance constraints. If they do not, we employ a successive convex approximation (SCA) algorithm to adjust the MAs' trajectories until they satisfy the given constraints. Simulation results are provided to show the effectiveness of our proposed trajectory optimization method in reducing the movement delay and to draw useful insights.
comment: 6 pages, 6 figures, submitted to GLOBECOM 2024 Workshop - IRAFWCC
A sufficient condition for 2-contraction of a feedback interconnection
Multistationarity - the existence of multiple equilibrium points - is a common phenomenon in dynamical systems from a variety of fields, including neuroscience, opinion dynamics, systems biology, and power systems. A recently proposed generalization of contraction theory, called $k$-contraction, is a promising approach for analyzing the asymptotic behaviour of multistationary systems. In particular, all bounded trajectories of a time-invariant 2-contracting system converge to an equilibrium point, but the system may have multiple equilibrium points where more than one is locally stable. An important challenge is to study $k$-contraction in large-scale interconnected systems. Inspired by a recent small-gain theorem for 2-contraction by Angeli et al., we derive a new sufficient condition for 2-contraction of a feedback interconnection of two nonlinear dynamical systems. Our condition is based on (i) deriving new formulas for the 2-multiplicative [2-additive] compound of block matrices using block Kronecker products [sums], (ii) a hierarchical approach for proving standard contraction, and (iii) a network small-gain theorem for Metzler matrices. We demonstrate our results by deriving a simple sufficient condition for 2-contraction in a network of FitzHugh-Nagumo neurons.
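A minimal sketch of the 2-additive compound that the abstract above relies on. For an $n \times n$ matrix $A$, the 2-additive compound $A^{[2]}$ acts on the wedge basis by $e_i \wedge e_j \mapsto (A e_i) \wedge e_j + e_i \wedge (A e_j)$, and the usual test for 2-contraction is a negative matrix measure of the compound of the Jacobian. The function names and the choice of the infinity-norm measure are illustrative assumptions, and the block formulas derived in the paper are not reproduced here.

```python
from itertools import combinations


def additive_compound_2(A):
    """2-additive compound of a square matrix A (list of lists).

    Rows and columns are indexed by pairs (i, j), i < j, in lexicographic order.
    """
    n = len(A)
    pairs = list(combinations(range(n), 2))
    index = {p: r for r, p in enumerate(pairs)}
    C = [[0.0] * len(pairs) for _ in pairs]

    def add(p, q, col, value):
        # Add value * (e_p ^ e_q) to column col, using e_q ^ e_p = -(e_p ^ e_q).
        if p == q:
            return  # e_p ^ e_p = 0
        if p < q:
            C[index[(p, q)]][col] += value
        else:
            C[index[(q, p)]][col] -= value

    for col, (i, j) in enumerate(pairs):
        for k in range(n):
            add(k, j, col, A[k][i])  # contribution of (A e_i) ^ e_j
            add(i, k, col, A[k][j])  # contribution of e_i ^ (A e_j)
    return C


def matrix_measure_inf(M):
    """Matrix measure induced by the infinity norm."""
    n = len(M)
    return max(M[i][i] + sum(abs(M[i][j]) for j in range(n) if j != i)
               for i in range(n))


# A linear system with one unstable direction: not contracting, yet 2-contracting.
J = [[1.0, 0.0, 0.0], [0.0, -2.0, 0.0], [0.0, 0.0, -3.0]]
mu_1 = matrix_measure_inf(J)                       # positive
mu_2 = matrix_measure_inf(additive_compound_2(J))  # negative
```

For a diagonal matrix the compound is diagonal with entries given by pairwise sums of the diagonal, which is why one positive eigenvalue can be outweighed by the others.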
Reduce, Reuse, Recycle: Categories for Compositional Reinforcement Learning ECAI 2024
In reinforcement learning, conducting task composition by forming cohesive, executable sequences from multiple tasks remains challenging. However, the ability to (de)compose tasks is a linchpin in developing robotic systems capable of learning complex behaviors. Yet, compositional reinforcement learning is beset with difficulties, including the high dimensionality of the problem space, scarcity of rewards, and absence of system robustness after task composition. To surmount these challenges, we view task composition through the prism of category theory -- a mathematical discipline exploring structures and their compositional relationships. The categorical properties of Markov decision processes untangle complex tasks into manageable sub-tasks, allowing for strategical reduction of dimensionality, facilitating more tractable reward structures, and bolstering system robustness. Experimental results support the categorical theory of reinforcement learning by enabling skill reduction, reuse, and recycling when learning complex robotic arm tasks.
comment: ECAI 2024
Stable Formulations in Optimistic Bilevel Optimization
Solutions of bilevel optimization problems tend to suffer from instability under changes to problem data. In the optimistic setting, we construct a lifted, alternative formulation that exhibits desirable stability properties under mild assumptions that neither invoke convexity nor smoothness. The upper- and lower-level problems might involve integer restrictions and disjunctive constraints. In a range of results, we at most invoke pointwise and local calmness for the lower-level problem in a sense that holds broadly. The alternative formulation is computationally attractive with structural properties being brought out and an outer approximation algorithm becoming available.
Autonomous Station Keeping of Satellites in Areostationary Mars Orbit: A Predictive Control Approach
The continued exploration of Mars will require a greater number of in-space assets to aid interplanetary communications. Future missions to the surface of Mars may be augmented with stationary satellites that remain overhead at all times as a means of sending data back to Earth from fixed antennae on the surface. These areostationary satellites will experience several important disturbances that push and pull the spacecraft off its desired orbit. Thus, a station-keeping control strategy must be put into place to ensure the satellite remains overhead while minimizing fuel use in order to extend mission lifetime. This paper develops a model predictive control policy for areostationary station keeping that exploits knowledge of non-Keplerian perturbations in order to minimize the required annual station-keeping $\Delta v$. The station-keeping policy is applied to a satellite placed at various longitudes, and simulations are performed for an example mission at the longitude of a potential future crewed landing site. Through careful tuning of the controller constraints, and proper placement of the satellite at stable longitudes, the annual station-keeping $\Delta v$ can be reduced relative to a naive mission design.
comment: Preprint submitted to Acta Astronautica
An IoT Framework for Building Energy Optimization Using Machine Learning-based MPC
This study proposes a machine learning-based Model Predictive Control (MPC) approach for controlling Air Handling Unit (AHU) systems by employing an Internet of Things (IoT) framework. The proposed framework utilizes an Artificial Neural Network (ANN) to provide dynamic-linear thermal model parameters considering building information and disturbances in real time, thereby facilitating the practical MPC of the AHU system. The proposed framework allows users to establish new setpoints for a closed-loop control system, enabling customization of the thermal environment to meet individual needs with minimal use of the AHU. The experimental results demonstrate the cost benefits of the proposed machine-learning-based MPC-IoT framework, achieving a 57.59% reduction in electricity consumption compared with a clock-based manual controller while maintaining a high level of user satisfaction. The proposed framework offers remarkable flexibility and effectiveness, even in legacy systems with limited building information, making it a pragmatic and valuable solution for enhancing the energy efficiency and user comfort in pre-existing structures.
Optimal Dispatch Strategy for a Multi-microgrid Cooperative Alliance Using a Two-Stage Pricing Mechanism
To coordinate resources among multi-level stakeholders and enhance the integration of electric vehicles (EVs) into multi-microgrids, this study proposes an optimal dispatch strategy within a multi-microgrid cooperative alliance using a nuanced two-stage pricing mechanism. Initially, the strategy assesses electric energy interactions between microgrids and distribution networks to establish a foundation for collaborative scheduling. The two-stage pricing mechanism initiates with a leader-follower game, wherein the microgrid operator acts as the leader and users as followers. Subsequently, it adjusts EV tariffs based on the game's equilibrium, taking into account factors such as battery degradation and travel needs to optimize EVs' electricity consumption. Furthermore, a bi-level optimization model refines power interactions and pricing strategies across the network, significantly enhancing demand response capabilities and economic outcomes. Simulation results demonstrate that this strategy not only increases renewable energy consumption but also reduces energy costs, thereby improving the overall efficiency and sustainability of the system.
comment: Accepted by IEEE Transactions on Sustainable Energy, Paper no. TSTE-00122-2024
Extraction of Typical Operating Scenarios of New Power System Based on Deep Time Series Aggregation
Extracting typical operational scenarios is essential for making flexible decisions in the dispatch of a new power system. This study proposes a novel deep time series aggregation scheme (DTSAs) to generate typical operational scenarios, considering the large amount of historical operational snapshot data. Specifically, DTSAs analyzes the intrinsic mechanisms of switching between different scheduling operational scenarios to mathematically represent typical operational scenarios. A Gramian angular summation field (GASF)-based operational scenario image encoder is designed to convert operational scenario sequences into high-dimensional spaces. This enables DTSAs to fully capture the spatiotemporal characteristics of new power systems using deep feature iterative aggregation models. The encoder also facilitates the generation of typical operational scenarios that conform to historical data distributions while ensuring the integrity of grid operational snapshots. Case studies demonstrate that the proposed method extracts new fine-grained power system dispatch schemes and outperforms the latest high-dimensional feature-screening methods. In addition, experiments with different new energy access ratios were conducted to verify the robustness of the proposed method. DTSAs enables dispatchers to master the operation experience of the power system in advance and to actively respond to the dynamic changes of the operation scenarios under a high access rate of new energy.
comment: Accepted by CAAI Transactions on Intelligence Technology
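A minimal sketch of the standard Gramian angular summation field (GASF) encoding named in the abstract above: the series is rescaled to $[-1, 1]$, each value is mapped to an angle $\phi_i = \arccos(\tilde{x}_i)$, and the image entry is $G_{ij} = \cos(\phi_i + \phi_j)$. The min-max rescaling and the function name `gasf` are assumptions for illustration; the preprocessing actually used by the DTSAs encoder is not stated in the abstract.

```python
import math


def gasf(series):
    """Gramian angular summation field of a 1-D series (list of floats)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Min-max rescale to [-1, 1]; clamp to guard against rounding error.
    scaled = [max(-1.0, min(1.0, 2.0 * (x - lo) / (hi - lo) - 1.0)) for x in series]
    angles = [math.acos(s) for s in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


image = gasf([0.0, 1.0, 2.0])  # 3 x 3 symmetric matrix
```

The result is a symmetric matrix whose size equals the sequence length, which is what allows a sequence to be treated as an image by a downstream model.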
Consensus over Clustered Networks Using Intermittent and Asynchronous Output Feedback
In recent years, multi-agent teaming has garnered considerable interest since complex objectives, such as intelligence, surveillance, and reconnaissance, can be divided into multiple cluster-level sub-tasks and assigned to a cluster of agents with the appropriate functionality. Yet, coordination and information dissemination between clusters may be necessary to accomplish a desired objective. Distributed consensus protocols provide a mechanism for spreading information within clustered networks, allowing agents and clusters to make decisions without requiring direct access to the state of the ensemble. Hence, we propose a strategy for achieving system-wide consensus in the states of identical linear time-invariant systems coupled by an undirected graph whose directed sub-graphs are available only at sporadic times. Within this work, the agents of the network are organized into pairwise disjoint clusters, which induce sub-graphs of the undirected parent graph. Some cluster sub-graph pairs are linked by an inter-cluster sub-graph, where the union of all cluster and inter-cluster sub-graphs yields the undirected parent graph. Each agent utilizes a distributed consensus protocol with components that are updated intermittently and asynchronously with respect to other agents. The closed-loop ensemble dynamics is modeled as a hybrid system, and a Lyapunov-based stability analysis yields sufficient conditions for rendering the agreement subspace (consensus set) globally exponentially stable. Furthermore, an input-to-state stability argument demonstrates the consensus set is robust to a class of perturbations. A numerical simulation considering both nominal and perturbed scenarios is provided for validation purposes.
AirPilot: A PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are among the most popular and widely used control algorithms for drones and other control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced PID drone controller using Proximal Policy Optimization (PPO). The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the simulator and implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error of the default PX4 PID position controller by 90%, improving the effective navigation speed of a fine-tuned PID controller by 21%, and reducing settling time and overshoot by 17% and 16%, respectively.
comment: 14 pages, 17 figures
Unsignalized Intersection Management Strategy for Mixed Autonomy Traffic Streams
With the rapid development of connected and automated vehicles (CAVs) and intelligent transportation infrastructure, CAVs, connected human-driven vehicles (CHVs), and unconnected human-driven vehicles (HVs) will coexist on the roads for a long time to come. This paper comprehensively considers the different traffic characteristics of CHVs, CAVs, and HVs, and systematically investigates the unsignalized intersection management strategy from the upper decision-making level to the lower execution level. The unsignalized intersection management strategy consists of two parts: the heuristic priority queues based right of way allocation (HPQ) algorithm and the vehicle planning and control algorithm. In the HPQ algorithm, a vehicle priority management model that considers the differences between CAVs, CHVs, and HVs is built to design the right of way management for different types of vehicles. In the lower-level vehicle planning and control algorithm, different control modes of CAVs are designed according to the upper-level decision made by the HPQ algorithm. Moreover, the vehicle control execution is realized by a model predictive controller combined with the geographical environment constraints and the unsignalized intersection management strategy. The proposed strategy is evaluated by simulations, which show that it can effectively reduce travel time and improve traffic efficiency. Results show that the proposed method can decrease the average travel time by 5% to 65% for different traffic flows compared with the comparative methods. The intersection management strategy captures the real-world balance between efficiency and safety for future intelligent traffic systems.
Opinion dynamics on signed graphs and graphons: Beyond the piece-wise constant case
In this paper we make use of graphon theory to study opinion dynamics on large undirected networks. The opinion dynamics models that we take into consideration allow for negative interactions between the individuals, i.e. competing entities whose opinions can grow apart. We consider both the repelling model and the opposing model that are studied in the literature. We define the repelling and the opposing dynamics on graphons and we show that their initial value problem's solutions exist and are unique. We then show that the graphon dynamics well approximate the dynamics on large graphs that converge to a graphon. This result applies to large random graphs that are sampled according to a graphon. All these facts are illustrated in an extended numerical example.
comment: 8 double-column pages. This revised version corrects several typos. An abridged version is going to appear in the proceedings of the 2024 IEEE Conference on Decision and Control
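A minimal sketch of the two signed-graph models named in the abstract above, written for a finite graph with signed adjacency matrix $A$. It uses the forms common in the signed-network literature, which are assumed here rather than quoted from the paper: the repelling model $\dot{x}_i = \sum_j a_{ij}(x_j - x_i)$ and the opposing model $\dot{x}_i = \sum_j |a_{ij}|(\mathrm{sgn}(a_{ij})\,x_j - x_i)$. The function names and the forward-Euler integrator are illustrative.

```python
def sign(a):
    """Sign of a real number: -1, 0, or 1."""
    return (a > 0) - (a < 0)


def repelling_rhs(A, x):
    """Repelling model: a negative edge pushes x_i away from x_j."""
    n = len(x)
    return [sum(A[i][j] * (x[j] - x[i]) for j in range(n)) for i in range(n)]


def opposing_rhs(A, x):
    """Opposing model: a negative edge pulls x_i toward -x_j."""
    n = len(x)
    return [sum(abs(A[i][j]) * (sign(A[i][j]) * x[j] - x[i]) for j in range(n))
            for i in range(n)]


def euler(rhs, A, x, dt, steps):
    """Forward-Euler integration of x' = rhs(A, x)."""
    for _ in range(steps):
        dx = rhs(A, x)
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


# Two individuals joined by a single negative interaction.
A_neg = [[0.0, -1.0], [-1.0, 0.0]]
x0 = [1.0, 0.5]
x_rep = euler(repelling_rhs, A_neg, x0, 0.01, 100)
```

On a graph with only positive weights the two right-hand sides coincide and reduce to ordinary consensus dynamics; they differ only in how a negative edge is interpreted.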
Intelligent Energy Management with IoT Framework in Smart Cities Using Intelligent Analysis: An Application of Machine Learning Methods for Complex Networks and Systems
This study confronts the growing challenges of energy consumption and the depletion of energy resources, particularly in the context of smart buildings. As the demand for energy increases alongside the necessity for efficient building maintenance, it becomes imperative to explore innovative energy management solutions. We present a comprehensive review of Internet of Things (IoT)-based frameworks aimed at smart city energy management, highlighting the pivotal role of IoT devices in addressing these issues due to their compactness, sensing, measurement, and computing capabilities. Our review methodology encompasses a thorough analysis of existing literature on IoT architectures and frameworks for intelligent energy management applications. We focus on systems that not only collect and store data but also support intelligent analysis for monitoring, controlling, and enhancing system efficiency. Additionally, we examine the potential for these frameworks to serve as platforms for the development of third-party applications, thereby extending their utility and adaptability. The findings from our review indicate that IoT-based frameworks offer significant potential to reduce energy consumption and environmental impact in smart buildings. Through the adoption of intelligent mechanisms and solutions, these frameworks facilitate effective energy management, leading to improved system efficiency and sustainability. Considering these findings, we recommend further exploration and adoption of IoT-based wireless sensing systems in smart buildings as a strategic approach to energy management. Our review underscores the importance of incorporating intelligent analysis and enabling the development of third-party applications within the IoT framework to efficiently meet the evolving energy demands and maintenance challenges
Equitable Networked Microgrid Topology Reconfiguration for Wildfire Risk Mitigation
The increasing number of wildfires in recent years consistently challenges the safe and reliable operations of power systems. To prevent power lines and other electrical components from causing wildfires under extreme conditions, electric utilities often deploy public safety power shutoffs (PSPS) to mitigate the wildfire risks therein. Although PSPS are effective countermeasures against wildfires, uncoordinated strategies can cause disruptions in electricity supply and even lead to cascading failures. Meanwhile, it is important to consider mitigating biased decisions on different communities and populations during the implementation of shutoff actions. In this work, we primarily focus on the dynamic reconfiguration problem of networked microgrids with distributed energy resources. In particular, we formulate a rolling horizon optimization problem allowing for flexible network reconfiguration at each time interval to mitigate wildfire risks. To promote equity and fairness during the span of shutoffs, we further enforce a range of constraints associated with load shedding to discourage disproportionate impact on individual load blocks. Numerical studies on a modified IEEE 13-bus system and a larger-sized Smart-DS system demonstrate the performance of the proposed algorithm towards more equitable power shutoff operations.
"Golden Ratio Yoshimura" for Meta-Stable and Massively Reconfigurable Deployment
Yoshimura origami is a classical folding pattern that has inspired many deployable structure designs. Its applications span from space exploration, kinetic architectures, and soft robots to even everyday household items. However, despite its wide usage, Yoshimura has been fixated on a set of design constraints to ensure its flat-foldability. Through extensive kinematic analysis and prototype tests, this study presents a new Yoshimura that intentionally defies these constraints. Remarkably, one can impart a unique meta-stability by using the Golden Ratio angle to define the triangular facets of a generalized Yoshimura. As a result, when its facets are strategically popped out, a "Golden Ratio Yoshimura" boom with $m$ modules can be theoretically reconfigured into $8^m$ geometrically unique and load-bearing shapes. This result not only challenges the existing design norms but also opens up a new avenue to create deployable and versatile structural systems.
Sensing environmental interaction physics to traverse cluttered obstacles
When legged robots physically interact with obstacles in applications such as search and rescue through rubble and planetary exploration across Martian rocks, even the most advanced ones struggle because they lack a fundamental framework to model the robot-obstacle physical interaction paralleling artificial potential fields for obstacle avoidance. To remedy this, recent studies established a novel framework - potential energy landscape modeling - that explains and predicts the destabilizing transitions across locomotor modes from the physical interaction between robots and obstacles, and governs a wide range of complex locomotion. However, this framework was confined to the laboratory because we lack methods to obtain the potential energy landscape in unknown environments. Here, we explore the feasibility of introducing this framework to such environments. We showed that a robot can reconstruct the potential energy landscape for unknown obstacles by measuring the obstacle contact forces and resulting torques. To elaborate, we developed a minimalistic robot capable of sensing contact forces and torques when propelled against a pair of grass-like obstacles. Despite the forces and torques not being fully conservative, they matched the potential energy landscape gradients well, and the reconstructed landscape matched the ground truth well. In addition, we found that using normal forces and torques and head oscillation inspired by cockroach observations further improved the estimation of conservative ones. Our study will ultimately inspire free-running robots to achieve low-effort, "zero-shot" traversal of clustered, large obstacles in real-world applications by sampling contact forces and torques and reconstructing the landscape around their neighboring states in real time.
Compositional nonlinear audio signal processing with Volterra series
We present a compositional theory of nonlinear audio signal processing based on a categorification of the Volterra series. We begin by augmenting the classical definition of the Volterra series so that it is functorial with respect to a base category whose objects are temperate distributions and whose morphisms are certain linear transformations. This motivates the derivation of formulae describing how the outcomes of nonlinear transformations are affected if their input signals are linearly processed--e.g., translated, modulated, sampled, or periodized. We then consider how nonlinear systems, themselves, change, and introduce as a model thereof the notion of morphism of Volterra series, which we exhibit as both a type of lens map and natural transformation. We show how morphisms can be parameterized and used to generate indexed families of Volterra series, which are well-suited to model nonstationary or time-varying nonlinear phenomena. We then describe how Volterra series and their morphisms organize into a category, which we call Volt. We exhibit the operations of sum, product, and series composition of Volterra series as monoidal products on Volt, and identify, for each in turn, its corresponding universal property. In particular, we show that the series composition of Volterra series is associative. We then bridge between our framework and the subject at the heart of audio signal processing: time-frequency analysis. Specifically, we show that a known equivalence, between a class of second-order Volterra series and the bilinear time-frequency distributions, can be extended to one between certain higher-order Volterra series and the so-called polynomial TFDs. We end by outlining potential avenues for future work, including the incorporation of system identification techniques and the potential extension of our theory to the settings of graph and topological audio signal processing.
Systems and Control (EESS)
Quadratic estimation for stochastic systems in the presence of random parameter matrices, time-correlated additive noise and deception attacks
Networked systems usually face different random uncertainties that make the performance of the least-squares (LS) linear filter decline significantly. For this reason, great attention has been paid to the search for other kinds of suboptimal estimators. Among them, the LS quadratic estimation approach has attracted considerable interest in the scientific community for its balance between computational complexity and estimation accuracy. When it comes to stochastic systems subject to different random uncertainties and deception attacks, the quadratic estimator design has not been deeply studied. In this paper, using covariance information, the LS quadratic filtering and fixed-point smoothing problems are addressed under the assumption that the measurements are perturbed by a time-correlated additive noise, as well as affected by random parameter matrices and exposed to random deception attacks. The use of random parameter matrices covers a wide range of common uncertainties and random failures, thus better reflecting the engineering reality. The signal and observation vectors are augmented by stacking the original vectors with their second-order Kronecker powers; then, the linear estimator of the original signal based on the augmented observations provides the required quadratic estimator. A simulation example illustrates the superiority of the proposed quadratic estimators over the conventional linear ones and the effect of the deception attacks on the estimation performance.
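A minimal sketch of the augmentation step described in the abstract above: a vector is stacked with its second-order Kronecker power, so that any estimator that is linear in the augmented observation is quadratic in the original one. The function names are illustrative assumptions, and the paper's treatment of noise, random parameter matrices and attacks is not reproduced here.

```python
def kron_square(v):
    """Second-order Kronecker power of v: all products v[a] * v[b], in order."""
    return [a * b for a in v for b in v]


def augment(v):
    """Stack v with its second-order Kronecker power."""
    return list(v) + kron_square(v)


y = [1.0, 2.0]
y_aug = augment(y)  # length n + n**2 = 6
```

For an observation of dimension $n$, the augmented vector has dimension $n + n^2$, which is the source of the extra computational cost relative to a linear filter.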
Dual Grid-Forming Converter
This letter proposes a dual model for grid-forming (GFM) controlled converters. The model is inspired by the observation that the structures of the active and reactive power equations of lossy synchronous machine models are almost symmetrical in terms of armature resistance and transient reactance. The proposed device is able to compensate grid power imbalance without requiring a frequency signal. In fact, the active power control is based on the rate of change of the voltage magnitude, whereas synchronization and frequency control are obtained through the reactive power support. The letter shows that the proposed dual-GFM control is robust and capable of recovering a normal operating condition following large contingencies, such as load outages and three-phase faults.
cc-DRL: a Convex Combined Deep Reinforcement Learning Flight Control Design for a Morphing Quadrotor
In comparison to common quadrotors, the shape change of morphing quadrotors endows them with better flight performance but also results in more complex flight dynamics. Generally, it is extremely difficult or even impossible to establish an accurate mathematical model describing the complex flight dynamics of morphing quadrotors. To address the issue of flight control design for morphing quadrotors, this paper resorts to a combination of model-free control techniques (e.g., deep reinforcement learning, DRL) and the convex combination (CC) technique, and proposes a convex-combined-DRL (cc-DRL) flight control algorithm for the position and attitude of a class of morphing quadrotors, where the shape change is realized by the length variation of four arm rods. In the proposed cc-DRL flight control algorithm, the proximal policy optimization algorithm, a model-free DRL algorithm, is used to train offline the optimal flight control laws for some selected representative arm length modes, and a cc-DRL flight control scheme is then constructed by the convex combination technique. Finally, simulation results are presented to show the effectiveness and merit of the proposed flight control algorithm.
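The convex combination idea in this abstract can be sketched as follows. This is not the authors' scheme: the interpolation rule, the function names, and the stand-in policies are all assumptions made for illustration. The sketch blends the outputs of control laws trained at representative arm lengths, using non-negative weights that sum to one.

```python
from bisect import bisect_right


def convex_weights(arm_length, modes):
    """Convex weights over sorted representative arm lengths (assumed rule:
    linear interpolation between the two nearest modes)."""
    n = len(modes)
    if arm_length <= modes[0]:
        return [1.0] + [0.0] * (n - 1)
    if arm_length >= modes[-1]:
        return [0.0] * (n - 1) + [1.0]
    hi = bisect_right(modes, arm_length)
    lo = hi - 1
    t = (arm_length - modes[lo]) / (modes[hi] - modes[lo])
    weights = [0.0] * n
    weights[lo], weights[hi] = 1.0 - t, t
    return weights


def combined_control(state, arm_length, modes, policies):
    """Blend the per-mode control outputs with the convex weights."""
    weights = convex_weights(arm_length, modes)
    outputs = [policy(state) for policy in policies]
    dim = len(outputs[0])
    return [sum(w * out[d] for w, out in zip(weights, outputs)) for d in range(dim)]


# Hypothetical stand-ins for trained policies: simple proportional laws.
modes = [1.0, 2.0, 3.0]  # normalized arm lengths, illustrative values
policies = [lambda s: [2.0 * s[0]], lambda s: [4.0 * s[0]], lambda s: [6.0 * s[0]]]
print(combined_control([1.0], 1.5, modes, policies))  # [3.0]
```

In a real controller the policies would be the trained networks, and the weighting rule would follow whatever construction the paper specifies.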
Towards learning digital twin: case study on an anisotropic non-ideal rotor system
In the manufacturing industry, the digital twin (DT) is becoming a central topic. It has the potential to enhance the efficiency of manufacturing machines and reduce the frequency of errors. In order to fulfill its purpose, a DT must be a sufficiently exact replica of its corresponding physical object. Nevertheless, the physical object endures a lifelong process of degradation. As a result, the digital twin must be modified accordingly in order to satisfy the accuracy requirement. This article introduces the novel concept of "learning digital twin (LDT)," which concentrates on the temporal behavior of the physical object and highlights the digital twin's capacity for lifelong learning. The structure of an LDT is first described. Then, in-depth descriptions of various algorithms for implementing each component of an LDT are provided. The proposed LDT is validated on the simulated degradation process of an anisotropic non-ideal rotor system.
From Time-Invariant to Uniformly Time-Varying Control Barrier Functions: A Constructive Approach
In this paper, we define and analyze a subclass of (time-invariant) Control Barrier Functions (CBF) that have favorable properties for the construction of uniformly time-varying CBFs and thereby for the satisfaction of uniformly time-varying constraints. We call them $\Lambda$-shiftable CBFs, where $\Lambda$ states the extent by which the CBF can be varied by adding a time-varying function. Moreover, we derive sufficient conditions under which a time-varying CBF can be obtained from a time-invariant one, and we propose a systematic construction method. An advantage of our approach is that a $\Lambda$-shiftable CBF, once constructed, can be reused for various control objectives. In the end, we relate the class of $\Lambda$-shiftable CBFs to Control Lyapunov Functions (CLF), and we illustrate the application of our results with a relevant simulation example.
comment: 6 pages, 2 figures, accepted for publication at IEEE CDC 2024
SIMPNet: Spatial-Informed Motion Planning Network
Current robotic manipulators require fast and efficient motion-planning algorithms to operate in cluttered environments. State-of-the-art sampling-based motion planners struggle to scale to high-dimensional configuration spaces and are inefficient in complex environments. This inefficiency arises because these planners utilize either uniform or hand-crafted sampling heuristics within the configuration space. To address these challenges, we present the Spatial-informed Motion Planning Network (SIMPNet). SIMPNet consists of a stochastic graph neural network (GNN)-based sampling heuristic for informed sampling within the configuration space. The sampling heuristic of SIMPNet encodes the workspace embedding into the configuration space through a cross-attention mechanism. It encodes the manipulator's kinematic structure into a graph, which is used to generate informed samples within the framework of sampling-based motion planning algorithms. We have evaluated the performance of SIMPNet using a UR5e robotic manipulator operating within simple and complex workspaces, comparing it against baseline state-of-the-art motion planners. The evaluation results show the effectiveness and advantages of the proposed planner compared to the baseline planners.
Courteous MPC for Autonomous Driving with CBF-inspired Risk Assessment ITSC 2024
With more autonomous vehicles (AVs) sharing roadways with human-driven vehicles (HVs), ensuring safe and courteous maneuvers that respect HVs' behavior becomes increasingly important. To promote both safety and courtesy in AVs' behavior, this paper proposes an extension of a Control Barrier Function (CBF)-inspired risk evaluation framework that considers both the noisy observed positions and velocities of surrounding vehicles. The risk perceived by the ego vehicle can be visualized as a risk map that reflects the understanding of the surrounding environment and thus shows the potential for facilitating safe and courteous driving. By incorporating the risk evaluation framework into the Model Predictive Control (MPC) scheme, we propose a Courteous MPC for the ego AV to generate courteous behaviors that 1) reduce the overall risk imposed on other vehicles and 2) respect the hard safety constraints and the original objective for efficiency. We demonstrate the performance of the proposed Courteous MPC via theoretical analysis and simulation experiments.
comment: 7 pages, accepted to ITSC 2024
Minimizing Movement Delay for Movable Antennas via Trajectory Optimization
Movable antennas (MAs) have received increasing attention in wireless communications due to their capability of antenna position adjustment to reconfigure wireless channels. However, moving MAs results in non-negligible delay, which may decrease the effective data transmission time. To reduce the movement delay, we study in this paper a new MA trajectory optimization problem. In particular, given the desired destination positions of multiple MAs, we aim to jointly optimize their associations with the initial MA positions and the trajectories for moving them from their respective initial to destination positions within a given two-dimensional (2D) region, such that the delay of antenna movement is minimized, subject to the inter-MA minimum distance constraints in the movement. However, this problem is a continuous-time mixed-integer linear programming (MILP) problem that is challenging to solve. To tackle this challenge, we propose a two-stage optimization framework that sequentially optimizes the MAs' position associations and trajectories. First, we relax the inter-MA distance constraints and optimally solve the resulting delay minimization problem. Next, we check whether the obtained MA association and trajectory solutions satisfy the inter-MA distance constraints. If they do not, we employ a successive convex approximation (SCA) algorithm to adjust the MAs' trajectories until they satisfy the given constraints. Simulation results are provided to show the effectiveness of our proposed trajectory optimization method in reducing the movement delay and to draw useful insights.
comment: 6 pages, 6 figures, submitted to GLOBECOM 2024 Workshop - IRAFWCC
A sufficient condition for 2-contraction of a feedback interconnection
Multistationarity - the existence of multiple equilibrium points - is a common phenomenon in dynamical systems from a variety of fields, including neuroscience, opinion dynamics, systems biology, and power systems. A recently proposed generalization of contraction theory, called $k$-contraction, is a promising approach for analyzing the asymptotic behaviour of multistationary systems. In particular, all bounded trajectories of a time-invariant 2-contracting system converge to an equilibrium point, but the system may have multiple equilibrium points where more than one is locally stable. An important challenge is to study $k$-contraction in large-scale interconnected systems. Inspired by a recent small-gain theorem for 2-contraction by Angeli et al., we derive a new sufficient condition for 2-contraction of a feedback interconnection of two nonlinear dynamical systems. Our condition is based on (i) deriving new formulas for the 2-multiplicative [2-additive] compound of block matrices using block Kronecker products [sums], (ii) a hierarchical approach for proving standard contraction, and (iii) a network small-gain theorem for Metzler matrices. We demonstrate our results by deriving a simple sufficient condition for 2-contraction in a network of FitzHugh-Nagumo neurons.
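For readers unfamiliar with compound matrices, the sketch below computes the standard 2-multiplicative compound of a square matrix, that is, the matrix of all its 2x2 minors with index pairs in lexicographic order. It illustrates the textbook definition only, not the block-matrix formulas derived in the paper; the function names are hypothetical.

```python
from itertools import combinations


def compound2(A):
    """2-multiplicative compound: all 2x2 minors of the square matrix A,
    with row and column index pairs ordered lexicographically."""
    n = len(A)
    pairs = list(combinations(range(n), 2))
    return [
        [A[i1][j1] * A[i2][j2] - A[i1][j2] * A[i2][j1] for (j1, j2) in pairs]
        for (i1, i2) in pairs
    ]


def matmul(A, B):
    """Plain matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# For a diagonal matrix the compound holds the pairwise products of the diagonal.
D = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
print(compound2(D))  # [[2, 0, 0], [0, 3, 0], [0, 0, 6]]

# Cauchy-Binet: the compound of a product is the product of the compounds.
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
print(compound2(matmul(A, B)) == matmul(compound2(A), compound2(B)))  # True
```

The multiplicativity shown in the last line is what gives the compound its name.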
Reduce, Reuse, Recycle: Categories for Compositional Reinforcement Learning ECAI 2024
In reinforcement learning, conducting task composition by forming cohesive, executable sequences from multiple tasks remains challenging. However, the ability to (de)compose tasks is a linchpin in developing robotic systems capable of learning complex behaviors. Yet, compositional reinforcement learning is beset with difficulties, including the high dimensionality of the problem space, scarcity of rewards, and absence of system robustness after task composition. To surmount these challenges, we view task composition through the prism of category theory -- a mathematical discipline exploring structures and their compositional relationships. The categorical properties of Markov decision processes untangle complex tasks into manageable sub-tasks, allowing for strategic reduction of dimensionality, facilitating more tractable reward structures, and bolstering system robustness. Experimental results support the categorical theory of reinforcement learning by enabling skill reduction, reuse, and recycling when learning complex robotic arm tasks.
comment: ECAI 2024
Stable Formulations in Optimistic Bilevel Optimization
Solutions of bilevel optimization problems tend to suffer from instability under changes to problem data. In the optimistic setting, we construct a lifted, alternative formulation that exhibits desirable stability properties under mild assumptions that neither invoke convexity nor smoothness. The upper- and lower-level problems might involve integer restrictions and disjunctive constraints. In a range of results, we at most invoke pointwise and local calmness for the lower-level problem in a sense that holds broadly. The alternative formulation is computationally attractive with structural properties being brought out and an outer approximation algorithm becoming available.
Autonomous Station Keeping of Satellites in Areostationary Mars Orbit: A Predictive Control Approach
The continued exploration of Mars will require a greater number of in-space assets to aid interplanetary communications. Future missions to the surface of Mars may be augmented with stationary satellites that remain overhead at all times as a means of sending data back to Earth from fixed antennae on the surface. These areostationary satellites will experience several important disturbances that push and pull the spacecraft off of its desired orbit. Thus, a station-keeping control strategy must be put into place to ensure the satellite remains overhead while minimizing the fuel required, thereby extending mission lifetime. This paper develops a model predictive control policy for areostationary station keeping that exploits knowledge of non-Keplerian perturbations in order to minimize the required annual station-keeping $\Delta v$. The station-keeping policy is applied to a satellite placed at various longitudes, and simulations are performed for an example mission at a longitude of a potential future crewed landing site. Through careful tuning of the controller constraints, and proper placement of the satellite at stable longitudes, the annual station-keeping $\Delta v$ can be reduced relative to a naive mission design.
comment: Preprint submitted to Acta Astronautica
An IoT Framework for Building Energy Optimization Using Machine Learning-based MPC
This study proposes a machine learning-based Model Predictive Control (MPC) approach for controlling Air Handling Unit (AHU) systems by employing an Internet of Things (IoT) framework. The proposed framework utilizes an Artificial Neural Network (ANN) to provide dynamic-linear thermal model parameters considering building information and disturbances in real time, thereby facilitating the practical MPC of the AHU system. The proposed framework allows users to establish new setpoints for a closed-loop control system, enabling customization of the thermal environment to meet individual needs with minimal use of the AHU. The experimental results demonstrate the cost benefits of the proposed machine-learning-based MPC-IoT framework, achieving a 57.59% reduction in electricity consumption compared with a clock-based manual controller while maintaining a high level of user satisfaction. The proposed framework offers remarkable flexibility and effectiveness, even in legacy systems with limited building information, making it a pragmatic and valuable solution for enhancing the energy efficiency and user comfort in pre-existing structures.
Optimal Dispatch Strategy for a Multi-microgrid Cooperative Alliance Using a Two-Stage Pricing Mechanism
To coordinate resources among multi-level stakeholders and enhance the integration of electric vehicles (EVs) into multi-microgrids, this study proposes an optimal dispatch strategy within a multi-microgrid cooperative alliance using a nuanced two-stage pricing mechanism. Initially, the strategy assesses electric energy interactions between microgrids and distribution networks to establish a foundation for collaborative scheduling. The two-stage pricing mechanism initiates with a leader-follower game, wherein the microgrid operator acts as the leader and users as followers. Subsequently, it adjusts EV tariffs based on the game's equilibrium, taking into account factors such as battery degradation and travel needs to optimize EVs' electricity consumption. Furthermore, a bi-level optimization model refines power interactions and pricing strategies across the network, significantly enhancing demand response capabilities and economic outcomes. Simulation results demonstrate that this strategy not only increases renewable energy consumption but also reduces energy costs, thereby improving the overall efficiency and sustainability of the system.
comment: Accepted by IEEE Transactions on Sustainable Energy, Paper no. TSTE-00122-2024
Extraction of Typical Operating Scenarios of New Power System Based on Deep Time Series Aggregation
Extracting typical operational scenarios is essential for making flexible decisions in the dispatch of a new power system. This study proposed a novel deep time series aggregation scheme (DTSAs) to generate typical operational scenarios, considering the large amount of historical operational snapshot data. Specifically, DTSAs analyze the intrinsic mechanisms of different scheduling operational scenario switching to mathematically represent typical operational scenarios. A Gramian angular summation field (GASF)-based operational scenario image encoder was designed to convert operational scenario sequences into high-dimensional spaces. This enables DTSAs to fully capture the spatiotemporal characteristics of new power systems using deep feature iterative aggregation models. The encoder also facilitates the generation of typical operational scenarios that conform to historical data distributions while ensuring the integrity of grid operational snapshots. Case studies demonstrate that the proposed method extracted new fine-grained power system dispatch schemes and outperformed the latest high-dimensional feature-screening methods. In addition, experiments with different new energy access ratios were conducted to verify the robustness of the proposed method. DTSAs enable dispatchers to master the operation experience of the power system in advance, and actively respond to the dynamic changes of the operation scenarios under the high access rate of new energy.
comment: Accepted by CAAI Transactions on Intelligence Technology
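The Gramian angular summation field mentioned in the abstract above has a standard definition that can be sketched briefly: rescale the series to [-1, 1], map each value to an angle with arccos, and form the matrix of cosines of pairwise angle sums. The sketch below implements only this generic encoding; the function name `gasf` is hypothetical and nothing here reflects the paper's encoder architecture.

```python
import math


def gasf(series):
    """Gramian angular summation field of a 1-D series (list of floats)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Min-max rescale to [-1, 1] so that arccos is defined.
    scaled = [(2.0 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against rounding slightly outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    # Entry (i, j) is cos(phi_i + phi_j); the result is a symmetric image.
    return [[math.cos(a + b) for b in phi] for a in phi]


# Illustrative three-sample series.
image = gasf([0.0, 1.0, 2.0])
for row in image:
    print([round(v, 6) for v in row])
```

A series of length T becomes a T x T image, which is why the encoding lets image-oriented deep models consume time series.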
Consensus over Clustered Networks Using Intermittent and Asynchronous Output Feedback
In recent years, multi-agent teaming has garnered considerable interest since complex objectives, such as intelligence, surveillance, and reconnaissance, can be divided into multiple cluster-level sub-tasks and assigned to a cluster of agents with the appropriate functionality. Yet, coordination and information dissemination between clusters may be necessary to accomplish a desired objective. Distributed consensus protocols provide a mechanism for spreading information within clustered networks, allowing agents and clusters to make decisions without requiring direct access to the state of the ensemble. Hence, we propose a strategy for achieving system-wide consensus in the states of identical linear time-invariant systems coupled by an undirected graph whose directed sub-graphs are available only at sporadic times. Within this work, the agents of the network are organized into pairwise disjoint clusters, which induce sub-graphs of the undirected parent graph. Some cluster sub-graph pairs are linked by an inter-cluster sub-graph, where the union of all cluster and inter-cluster sub-graphs yields the undirected parent graph. Each agent utilizes a distributed consensus protocol with components that are updated intermittently and asynchronously with respect to other agents. The closed-loop ensemble dynamics is modeled as a hybrid system, and a Lyapunov-based stability analysis yields sufficient conditions for rendering the agreement subspace (consensus set) globally exponentially stable. Furthermore, an input-to-state stability argument demonstrates the consensus set is robust to a class of perturbations. A numerical simulation considering both nominal and perturbed scenarios is provided for validation purposes.
AirPilot: A PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are one of the most popular and widely used control algorithms for drones and other control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced PID drone controller using Proximal Policy Optimization (PPO). The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic, and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the simulator and implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error of the default PX4 PID position controller by 90%, improving the effective navigation speed of a fine-tuned PID controller by 21%, and reducing settling time and overshoot by 17% and 16%, respectively.
comment: 14 pages, 17 figures
Unsignalized Intersection Management Strategy for Mixed Autonomy Traffic Streams
With the rapid development of connected and automated vehicles (CAVs) and intelligent transportation infrastructure, CAVs, connected human-driven vehicles (CHVs), and un-connected human-driven vehicles (HVs) will coexist on the roads in the future for a long time. This paper comprehensively considers the different traffic characteristics of CHVs, CAVs, and HVs, and systematically investigates the unsignalized intersection management strategy from the upper decision-making level to the lower execution level. The unsignalized intersection management strategy consists of two parts: the heuristic priority queue-based right-of-way allocation (HPQ) algorithm and the vehicle planning and control algorithm. In the HPQ algorithm, a vehicle priority management model considering the differences between CAVs, CHVs, and HVs is built to design the right-of-way management for different types of vehicles. In the lower-level vehicle planning and control algorithm, different control modes of CAVs are designed according to the upper-level decision made by the HPQ algorithm. Moreover, the vehicle control execution is realized by the model predictive controller combined with the geographical environment constraints and the unsignalized intersection management strategy. The proposed strategy is evaluated by simulations, which show that the proposed intersection management strategy can effectively reduce travel time and improve traffic efficiency. Results show that the proposed method can decrease the average travel time by 5% to 65% for different traffic flows compared with the comparative methods. The intersection management strategy captures the real-world balance between efficiency and safety for future intelligent traffic systems.
Opinion dynamics on signed graphs and graphons: Beyond the piece-wise constant case
In this paper we make use of graphon theory to study opinion dynamics on large undirected networks. The opinion dynamics models that we take into consideration allow for negative interactions between the individuals, i.e. competing entities whose opinions can grow apart. We consider both the repelling model and the opposing model that are studied in the literature. We define the repelling and the opposing dynamics on graphons and we show that their initial value problem's solutions exist and are unique. We then show that the graphon dynamics well approximate the dynamics on large graphs that converge to a graphon. This result applies to large random graphs that are sampled according to a graphon. All these facts are illustrated in an extended numerical example.
comment: 8 double-column pages. This revised version corrects several typos. An abridged version is going to appear in the proceedings of the 2024 IEEE Conference on Decision and Control
Intelligent Energy Management with IoT Framework in Smart Cities Using Intelligent Analysis: An Application of Machine Learning Methods for Complex Networks and Systems
This study confronts the growing challenges of energy consumption and the depletion of energy resources, particularly in the context of smart buildings. As the demand for energy increases alongside the necessity for efficient building maintenance, it becomes imperative to explore innovative energy management solutions. We present a comprehensive review of Internet of Things (IoT)-based frameworks aimed at smart city energy management, highlighting the pivotal role of IoT devices in addressing these issues due to their compactness, sensing, measurement, and computing capabilities. Our review methodology encompasses a thorough analysis of existing literature on IoT architectures and frameworks for intelligent energy management applications. We focus on systems that not only collect and store data but also support intelligent analysis for monitoring, controlling, and enhancing system efficiency. Additionally, we examine the potential for these frameworks to serve as platforms for the development of third-party applications, thereby extending their utility and adaptability. The findings from our review indicate that IoT-based frameworks offer significant potential to reduce energy consumption and environmental impact in smart buildings. Through the adoption of intelligent mechanisms and solutions, these frameworks facilitate effective energy management, leading to improved system efficiency and sustainability. Considering these findings, we recommend further exploration and adoption of IoT-based wireless sensing systems in smart buildings as a strategic approach to energy management. Our review underscores the importance of incorporating intelligent analysis and enabling the development of third-party applications within the IoT framework to efficiently meet the evolving energy demands and maintenance challenges.
Equitable Networked Microgrid Topology Reconfiguration for Wildfire Risk Mitigation
The increasing number of wildfires in recent years consistently challenges the safe and reliable operations of power systems. To prevent power lines and other electrical components from causing wildfires under extreme conditions, electric utilities often deploy public safety power shutoffs (PSPS) to mitigate the wildfire risks therein. Although PSPS are effective countermeasures against wildfires, uncoordinated strategies can cause disruptions in electricity supply and even lead to cascading failures. Meanwhile, it is important to consider mitigating biased decisions on different communities and populations during the implementation of shutoff actions. In this work, we primarily focus on the dynamic reconfiguration problem of networked microgrids with distributed energy resources. In particular, we formulate a rolling horizon optimization problem allowing for flexible network reconfiguration at each time interval to mitigate wildfire risks. To promote equity and fairness during the span of shutoffs, we further enforce a range of constraints associated with load shedding to discourage disproportionate impact on individual load blocks. Numerical studies on a modified IEEE 13-bus system and a larger-sized Smart-DS system demonstrate the performance of the proposed algorithm towards more equitable power shutoff operations.
"Golden Ratio Yoshimura" for Meta-Stable and Massively Reconfigurable Deployment
Yoshimura origami is a classical folding pattern that has inspired many deployable structure designs. Its applications span from space exploration, kinetic architectures, and soft robots to even everyday household items. However, despite its wide usage, Yoshimura has been fixated on a set of design constraints to ensure its flat-foldability. Through extensive kinematic analysis and prototype tests, this study presents a new Yoshimura that intentionally defies these constraints. Remarkably, one can impart a unique meta-stability by using the Golden Ratio angle to define the triangular facets of a generalized Yoshimura. As a result, when its facets are strategically popped out, a "Golden Ratio Yoshimura" boom with $m$ modules can be theoretically reconfigured into $8^m$ geometrically unique and load-bearing shapes. This result not only challenges the existing design norms but also opens up a new avenue to create deployable and versatile structural systems.
Sensing environmental interaction physics to traverse cluttered obstacles
When legged robots physically interact with obstacles in applications such as search and rescue through rubble and planetary exploration across Martian rocks, even the most advanced ones struggle because they lack a fundamental framework to model the robot-obstacle physical interaction paralleling artificial potential fields for obstacle avoidance. To remedy this, recent studies established a novel framework - potential energy landscape modeling - that explains and predicts the destabilizing transitions across locomotor modes from the physical interaction between robots and obstacles, and governs a wide range of complex locomotion. However, this framework was confined to the laboratory because we lack methods to obtain the potential energy landscape in unknown environments. Here, we explore the feasibility of introducing this framework to such environments. We showed that a robot can reconstruct the potential energy landscape for unknown obstacles by measuring the obstacle contact forces and resulting torques. To elaborate, we developed a minimalistic robot capable of sensing contact forces and torques when propelled against a pair of grass-like obstacles. Although the forces and torques were not fully conservative, they matched the potential energy landscape gradients well, and the reconstructed landscape matched the ground truth well. In addition, we found that using normal forces and torques and head oscillation inspired by cockroach observations further improved the estimation of conservative ones. Our study will ultimately inspire free-running robots to achieve low-effort, "zero-shot" traversal of cluttered, large obstacles in real-world applications by sampling contact forces and torques and reconstructing the landscape around their neighboring states in real time.
Compositional nonlinear audio signal processing with Volterra series
We present a compositional theory of nonlinear audio signal processing based on a categorification of the Volterra series. We begin by augmenting the classical definition of the Volterra series so that it is functorial with respect to a base category whose objects are temperate distributions and whose morphisms are certain linear transformations. This motivates the derivation of formulae describing how the outcomes of nonlinear transformations are affected if their input signals are linearly processed--e.g., translated, modulated, sampled, or periodized. We then consider how nonlinear systems, themselves, change, and introduce as a model thereof the notion of morphism of Volterra series, which we exhibit as both a type of lens map and natural transformation. We show how morphisms can be parameterized and used to generate indexed families of Volterra series, which are well-suited to model nonstationary or time-varying nonlinear phenomena. We then describe how Volterra series and their morphisms organize into a category, which we call Volt. We exhibit the operations of sum, product, and series composition of Volterra series as monoidal products on Volt, and identify, for each in turn, its corresponding universal property. In particular, we show that the series composition of Volterra series is associative. We then bridge between our framework and the subject at the heart of audio signal processing: time-frequency analysis. Specifically, we show that a known equivalence, between a class of second-order Volterra series and the bilinear time-frequency distributions, can be extended to one between certain higher-order Volterra series and the so-called polynomial TFDs. We end by outlining potential avenues for future work, including the incorporation of system identification techniques and the potential extension of our theory to the settings of graph and topological audio signal processing.
Systems and Control (CS)
Population Control of Giardia lamblia
Giardia lamblia is a flagellate intestinal protozoan with global distribution causing the disease known as giardiasis. This parasite is responsible for 35.1% of outbreaks of diarrhea caused by contaminated water and mainly affects children, in whom it can cause physical and cognitive impairment. In this paper, we consider a model of population dynamics to represent the behavior of Giardia lamblia in vitro, taking into account its mutation characteristic, which gives the protozoan resistance to the drug metronidazole. Unlike what is found in the literature, the control objective pursued is the extermination of the protozoan, considering that the parameters of the model are uncertain and only partial measurement of the state vector is possible. Under these assumptions, a control law is designed and the stability of the closed-loop system is rigorously proved. Simulation and experimental results illustrate the benefits of the proposed population control method for Giardia lamblia.
comment: 7 pages, 5 figures, 1 Table
Impact of the Inflation Reduction Act and Carbon Capture on Transportation Electrification for a Net-Zero Western U.S. Grid
The electrification of transportation is critical to mitigate Greenhouse Gas (GHG) emissions. The United States (U.S.) government's Inflation Reduction Act (IRA) of 2022 introduces policies to promote the electrification of transportation. In addition to electrifying transportation, clean energy technologies such as Carbon Capture and Storage (CCS) may play a major role in achieving a net-zero energy system. Utilizing scenarios simulated by the U.S. version of the Global Change Analysis Model (GCAM-USA), we analyze the individual and compound contributions of the IRA and CCS to reach a clean U.S. grid by 2035 and net-zero GHG emissions by 2050. We analyze the contributions based on three metrics: i) transportation electrification rate, ii) transportation fuel mix, and iii) spatio-temporal charging loads. Our findings indicate that the IRA significantly accelerates transportation electrification in the near term (until 2035). In contrast, CCS technologies, by enabling the continued use of internal combustion vehicles while still advancing toward net-zero, potentially suppress the rate of transportation electrification in the long term. This study underscores how policy and technology innovation can interact, and that sensitivity studies with different combinations are essential to characterize the potential contributions of each to transportation electrification.
comment: This is a preprint. Its complete copyrighted version will be available on the publisher's website after publication
A Note on an Upper-Bound for the Sum of a Class K and an Extended Class K Function
In this short note, we derive an upper-bound for the sum of two comparison functions, namely for the sum of a class K and an extended class K function. To the best of our knowledge, the relations derived in this note have not been previously derived in the literature.
comment: 4 pages
Data-driven MPC with terminal conditions in the Koopman framework
We investigate nonlinear model predictive control (MPC) with terminal conditions in the Koopman framework using extended dynamic mode decomposition (EDMD) to generate a data-based surrogate model for prediction and optimization. We rigorously show recursive feasibility and prove practical asymptotic stability w.r.t. the approximation accuracy. To this end, finite-data error bounds are employed. The construction of the terminal conditions is based on recently derived proportional error bounds to ensure the required Lyapunov decrease. Finally, we illustrate the effectiveness of the proposed data-driven predictive controller including the design procedure to construct the terminal region and controller.
comment: Accepted for presentation at the 63rd IEEE Conference on Decision and Control (CDC2024)
The Hybrid Hospital: Balancing On-Site and Remote Hospitalization
Hybrid hospitals offer on-site and remote hospitalization through telemedicine. These new healthcare models require novel operational policies to balance costs, efficiency, and patient well-being. Our study addresses two first-order questions: (i) how to direct patient admission and call-in based on individual characteristics and proximity and (ii) how to determine the optimal allocation of medical resources between these two hospitalization options and among different patient types. We develop a model that uses Brownian Motion to capture the patient's health evolution during remote/on-site hospitalization and during travel. Under cost-minimizing call-in policies, we find that remote hospitalization can be cost-effective for moderately distant patients, as the optimal call-in threshold is non-monotonic in the patient's travel time. Subject to scarce resources, the optimal solution structure becomes equivalent to a simultaneous, identically sized increase of remote and on-site costs under abundant resources. When limited resources must be divided among multiple patient types, the optimal thresholds shift in non-obvious ways as resource availability changes. Finally, we develop a practical and efficient policy that allows for swapping an on-site patient with a remote patient when the latter is called in and sufficient resources are not available to treat both on-site. Contrary to the widely held view that telemedicine can mitigate rural and non-rural healthcare disparities, our research suggests that on-site care may actually be more cost-effective than remote hospitalization for patients in distant locations, due to (potentially overlooked) risks during patient travel. This finding may be of particular concern in light of the growing number of ``hospital deserts'' amid recent rural hospital closures, as these communities may in fact not be well-served through at-home care.
A Stable Polygamy Approach to Spectrum Access with Channel Reuse
We introduce a new and broader formulation of the stable marriage problem (SMP), called the stable polygamy problem (SPP), where multiple individuals from a larger group $L$ of $|L|$ individuals can be matched with a single individual from a smaller group $S$ of $|S|$ individuals. Each individual $\ell \in L$ possesses a social constraint set $C_{\ell}$ that contains a subset of $L$ with whom they cannot coexist harmoniously. We define a generalized concept of stability based on the preferences and constraints of the individuals. We explore two common settings: common utility, where the utility of a match is the same for individuals from both sets, and preference ranking, where each individual has a preference ranking for every other individual in the opposite set. Our analysis is conducted using a novel graph-theoretical framework. The classic SMP has been investigated in recent years for spectrum access to match cells or users to channels, where only one-to-one matching is allowed. By contrast, the new SPP formulation allows us to solve more general models with channel reuse, where multiple users may access the same channel simultaneously. Interestingly, we show that classic algorithms, such as propose and reject (P&R) and the Hungarian method, are no longer efficient in the polygamy setting. We develop efficient algorithms to solve the SPP in polynomial time, tailored for implementations in spectrum access with channel reuse. We analytically show that our algorithm always solves the SPP with common utility. While the SPP with preference ranking cannot be solved by any algorithm in all cases, we prove that our algorithm effectively solves it in specific graph structures representing strong and weak interference regimes. Simulation results demonstrate the efficiency of our algorithms across various spectrum access scenarios.
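For orientation, the classic one-to-one baseline that the abstract refers to as propose and reject can be sketched as follows. This is the textbook Gale-Shapley procedure, not the SPP algorithm of the paper; the function name, the user/channel labels, and the preference lists are hypothetical, and the sketch assumes equally sized groups with complete preference lists.

```python
from collections import deque


def propose_and_reject(proposer_prefs, receiver_prefs):
    """Textbook one-to-one stable matching (Gale-Shapley).

    Assumes both groups have the same size and every preference list
    ranks all members of the opposite group.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver index to try
    engaged_to = {}                               # receiver -> proposer
    free = deque(proposer_prefs)                  # proposers without a match

    while free:
        p = free.popleft()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p          # receiver was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p          # receiver prefers the new proposer
            free.append(current)       # previous partner is rejected
        else:
            free.append(p)             # proposal rejected, try next choice later

    return {p: r for r, p in engaged_to.items()}


# Hypothetical example: two users competing for two channels.
users = {"u1": ["c1", "c2"], "u2": ["c1", "c2"]}
channels = {"c1": ["u2", "u1"], "c2": ["u1", "u2"]}
matching = propose_and_reject(users, channels)
```

Each channel ends up with exactly one user here, which is the one-to-one restriction that the SPP formulation relaxes to allow channel reuse.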
Recursive Distributed Collaborative Aided Inertial Navigation
In this dissertation, we investigate the issue of robust localization in swarms of heterogeneous mobile agents with multiple and time-varying sensing modalities. Our focus is the development of filter-based and decoupled estimators under the assumption that agents possess communication and processing capabilities. Based on the findings from Distributed Collaborative State Estimation and modular sensor fusion, we propose a novel Kalman filter decoupling paradigm, termed Isolated Kalman Filtering (IKF). This paradigm is formally discussed and the treatment of delayed measurements is studied. The impact of the approximations made is investigated on different observation graphs, and the filter credibility is evaluated on a linear system in a Monte Carlo simulation. Finally, we propose a multi-agent modular sensor fusion approach based on the IKF paradigm, in order to cooperatively estimate the global state of a multi-agent system in a distributed way and fuse information provided by different on-board sensors in a computationally efficient way. As a consequence, this approach can be distributed among agents, while (i) communication between agents is only required at the moment of inter-agent joint observations, (ii) one agent acts as interim master to process state corrections in isolation, (iii) agents can be added to and removed from the swarm, (iv) each agent's full state can vary during the mission (each local sensor suite can be truly modular), and (v) delayed and multi-rate sensor updates are supported. Extensive evaluation on realistic simulated and real-world data sets shows that the proposed IKF paradigm is applicable to both truly modular single-agent estimation and distributed collaborative multi-agent estimation problems.
comment: PhD thesis
Fine-tuning Smaller Language Models for Question Answering over Financial Documents
Recent research has shown that smaller language models can acquire substantial reasoning abilities when fine-tuned with reasoning exemplars crafted by a significantly larger teacher model. We explore this paradigm for the financial domain, focusing on the challenge of answering questions that require multi-hop numerical reasoning over financial texts. We assess the performance of several smaller models that have been fine-tuned to generate programs that encode the required financial reasoning and calculations. Our findings demonstrate that these fine-tuned smaller models approach the performance of the teacher model. To provide a granular analysis of model performance, we propose an approach to investigate the specific student model capabilities that are enhanced by fine-tuning. Our empirical analysis indicates that fine-tuning refines the student model's ability to express and apply the required financial concepts, along with adapting the entity extraction to the specific data format. In addition, we hypothesize and demonstrate that comparable financial reasoning capability can be induced using relatively smaller datasets.
Star-shaped Tilted Hexarotor Maneuverability: Analysis of the Role of the Tilt Cant Angles
Star-shaped Tilted Hexarotors are rapidly emerging for applications highly demanding in terms of robustness and maneuverability. To ensure improvement in such features, a careful selection of the tilt angles is mandatory. In this work, we present a rigorous analysis of how the force subspace varies with the tilt cant angles, namely the tilt angles along the vehicle arms, taking into account gravity compensation and torque decoupling to abide by the hovering condition. Novel metrics are introduced to assess the performance of existing tilted platforms, as well as to provide some guidelines for the selection of the tilt cant angle in the design phase.
comment: accepted for presentation at the 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE2024)
Modularized data-driven approximation of the Koopman operator and generator
Extended Dynamic Mode Decomposition (EDMD) is a widely used data-driven approach to learn an approximation of the Koopman operator. Consequently, it provides a powerful tool for data-driven analysis, prediction, and control of nonlinear dynamical (control) systems. In this work, we propose a novel modularized EDMD scheme tailored to interconnected systems. To this end, we utilize the structure of the Koopman generator, which allows learning the dynamics of subsystems individually and thus alleviates the curse of dimensionality by considering observable functions on smaller state spaces. Moreover, our approach canonically enables transfer learning if a system encompasses multiple copies of a model, as well as efficient adaptation to topology changes without retraining. We provide finite-data bounds on the estimation error using tools from graph theory. The efficacy of the method is illustrated by means of various numerical examples.
comment: 31 pages, 11 figures
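As a point of reference for the abstract above, standard (non-modularized) EDMD fits a linear map between lifted snapshots by least squares. The sketch below is a minimal illustration on a hypothetical toy system chosen so that the dictionary spans a Koopman-invariant subspace; the system, its coefficients, and the names `step` and `lift` are assumptions for illustration, not part of the paper.

```python
import numpy as np

# Hypothetical toy system: x1+ = a*x1, x2+ = b*x2 + c*x1^2.
A_COEF, B_COEF, C_COEF = 0.9, 0.5, 0.3


def step(x):
    """One step of the assumed discrete-time nonlinear system."""
    x1, x2 = x
    return np.array([A_COEF * x1, B_COEF * x2 + C_COEF * x1**2])


def lift(x):
    """Assumed dictionary of observables: [x1, x2, x1^2]."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))      # sampled states
Y = np.array([step(x) for x in X])             # their successors

Psi_X = np.array([lift(x) for x in X])         # lifted snapshots, shape (200, 3)
Psi_Y = np.array([lift(y) for y in Y])

# Least-squares fit Psi_Y ~ Psi_X @ K; K.T then acts on lifted column vectors.
K, *_ = np.linalg.lstsq(Psi_X, Psi_Y, rcond=None)
koopman_matrix = K.T

expected = np.array([
    [A_COEF, 0.0, 0.0],
    [0.0, B_COEF, C_COEF],
    [0.0, 0.0, A_COEF**2],
])
```

Because the three observables are closed under the toy dynamics, the fitted matrix matches `expected` up to numerical precision; for a general dictionary the fit is only an approximation, and the dictionary size grows quickly with the state dimension, which is the cost a modular scheme aims to reduce.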
Accounts of using the Tustin-Net architecture on a rotary inverted pendulum
In this report we investigate the use of the Tustin neural network architecture (Tustin-Net) for the identification of a physical rotary inverted pendulum. This physics-based architecture is of particular interest as it builds on the known relationship between velocities and positions. We aim to discuss the advantages, limitations, and performance of Tustin-Nets compared to first-principles grey-box models on a real physical apparatus, showing how, with a standard training procedure, the former can hardly achieve the same accuracy as the latter. To address this limitation, we present a training strategy based on transfer learning that yields Tustin-Nets that are competitive with the first-principles model, without requiring the extensive knowledge of the setup that the latter needs.
Deep Analysis of Time Series Data for Smart Grid Startup Strategies: A Transformer-LSTM-PSO Model Approach
Grid startup, an integral component of the power system, holds strategic importance for ensuring the reliability and efficiency of the electrical grid. However, current methodologies for in-depth analysis and precise prediction of grid startup scenarios are inadequate. To address these challenges, we propose a novel method based on the Transformer-LSTM-PSO model. This model uniquely combines the Transformer's self-attention mechanism, LSTM's temporal modeling capabilities, and the parameter tuning features of the particle swarm optimization algorithm. It is designed to more effectively capture the complex temporal relationships in grid startup schemes. Our experiments demonstrate significant improvements, with our model achieving lower RMSE and MAE values across multiple datasets compared to existing benchmarks, particularly in the NYISO Electric Market dataset where the RMSE was reduced by approximately 15% and the MAE by 20% compared to conventional models. Our main contribution is the development of a Transformer-LSTM-PSO model that significantly enhances the accuracy and efficiency of smart grid startup predictions. The application of the Transformer-LSTM-PSO model represents a significant advancement in smart grid predictive analytics, concurrently fostering the development of more reliable and intelligent grid management systems.
comment: 46 pages
Robust Input Shaping Vibration Control via Extended Kalman Filter-Incorporated Residual Neural Network
With the rapid development of industry, the vibration control of flexible structures and underactuated systems has been gaining increasing attention. Input shaping technology enables stable performance for high-speed motion in industrial motion systems. However, existing input shapers generally suffer from ineffective control performance because they neglect observation errors. To address this critical issue, this paper proposes an Extended Kalman Filter-incorporated Residual Neural Network-based input Shaping (ERS) model for vibration control. Its main ideas are two-fold: a) adopting an extended Kalman filter to address a vertical flexible beam's model errors; and b) cascading a residual neural network with the extended Kalman filter to eliminate the remaining observation errors. Detailed experiments on a real dataset collected from a vertical flexible beam demonstrate that the proposed ERS model achieves significantly better vibration control performance than several state-of-the-art models.
Controllability and Observability of Temporal Hypergraphs
Numerous complex systems, such as those arising in ecological networks, genomic contact networks, and social networks, exhibit higher-order and time-varying characteristics, which can be effectively modeled using temporal hypergraphs. However, analyzing and controlling temporal hypergraphs poses significant challenges due to their inherent time-varying and nonlinear nature, while most existing methods predominantly target static hypergraphs. In this article, we generalize the notions of controllability and observability to temporal hypergraphs by leveraging tensor and nonlinear systems theory. Specifically, we establish tensor-based rank conditions to determine the weak controllability and observability of temporal hypergraphs. The proposed framework is further demonstrated with synthetic and real-world examples.
comment: 6 pages, 3 figures
Autonomous Grid-Forming Inverter Exponential Droop Control for Improved Frequency Stability
This paper introduces the novel Droop-e grid-forming power electronic converter control strategy, which establishes a non-linear, active power--frequency droop relationship based on an exponential function of the power output. A primary advantage of Droop-e is an increased utilization of available power headroom that directly mitigates system frequency excursions and reduces the rate of change of frequency. The motivation for Droop-e as compared to a linear grid-forming control is first established, and then the full controller is described, including the mirrored inversion at the origin, the linearization at a parameterized limit, and the auxiliary autonomous power sharing controller. The analytic stability of the controller, including synchronization criteria and a small signal stability analysis, is assessed. Electromagnetic transient time domain simulations of the Droop-e controller with full order power electronic converters and accompanying DC-side dynamics, connected in parallel with synchronous generators, are executed at a range of dispatches on a simple 3-bus system. Finally, IEEE 39-bus system simulations highlight the improved frequency stability of the system with multiple, Droop-e controlled grid-forming inverters.
comment: 10 pages, 11 figures
Late Breaking Results: On the One-Key Premise of Logic Locking
The evaluation of logic locking methods has long been predicated on an implicit assumption that only the correct key can unveil the true functionality of a protected circuit. Consequently, a locking technique is deemed secure if it resists a good array of attacks aimed at finding this correct key. This paper challenges this one-key premise by introducing a more efficient attack methodology, focused not on identifying that one correct key, but on finding multiple, potentially incorrect keys that can collectively produce correct functionality from the protected circuit. The tasks of finding these keys can be parallelized, which is well suited for multi-core computing environments. Empirical results show our attack achieves a runtime reduction of up to 99.6% compared to the conventional attack that tries to find a single correct key.
comment: 2 pages, accepted in DAC 2024 proceedings
Remaining Discharge Energy Prediction for Lithium-Ion Batteries Over Broad Current Ranges: A Machine Learning Approach
Lithium-ion batteries have found their way into myriad sectors of industry to drive electrification, decarbonization, and sustainability. A crucial aspect in ensuring their safe and optimal performance is monitoring their energy levels. In this paper, we present the first study on predicting the remaining energy of a battery cell undergoing discharge over wide current ranges from low to high C-rates. The complexity of the challenge arises from the cell's C-rate-dependent energy availability as well as its intricate electro-thermal dynamics especially at high C-rates. To address this, we introduce a new definition of remaining discharge energy and then undertake a systematic effort in harnessing the power of machine learning to enable its prediction. Our effort includes two parts in cascade. First, we develop an accurate dynamic model based on integration of physics with machine learning to capture a battery's voltage and temperature behaviors. Second, based on the model, we propose a machine learning approach to predict the remaining discharge energy under arbitrary C-rates and pre-specified cut-off limits in voltage and temperature. The experimental validation shows that the proposed approach can predict the remaining discharge energy with a relative error of less than 3% when the current varies between 0~8 C for an NCA cell and 0~15 C for an LFP cell. The approach, by design, is amenable to training and computation.
comment: 15 pages, 13 figures, 4 tables
An Efficient and Explainable Transformer-Based Few-Shot Learning for Modeling Electricity Consumption Profiles Across Thousands of Domains
Electricity Consumption Profiles (ECPs) are crucial for operating and planning power distribution systems, especially with the increasing numbers of various low-carbon technologies such as solar panels and electric vehicles. Traditional ECP modeling methods typically assume the availability of sufficient ECP data. However, in practice, the accessibility of ECP data is limited due to privacy issues or the absence of metering devices. Few-shot learning (FSL) has emerged as a promising solution for ECP modeling in data-scarce scenarios. Nevertheless, standard FSL methods, such as those used for images, are unsuitable for ECP modeling because (1) these methods usually assume several source domains with sufficient data and several target domains. However, in the context of ECP modeling, there may be thousands of source domains with a moderate amount of data and thousands of target domains. (2) Standard FSL methods usually involve cumbersome knowledge transfer mechanisms, such as pre-training and fine-tuning, whereas ECP modeling requires more lightweight methods. (3) Deep learning models often lack explainability, hindering their application in industry. This paper proposes a novel FSL method that exploits Transformers and Gaussian Mixture Models (GMMs) for ECP modeling to address the above-described issues. Results show that our method can accurately restore the complex ECP distribution with a minimal amount of ECP data (e.g., only 1.6% of the complete domain dataset) while it outperforms state-of-the-art time series modeling methods, maintaining the advantages of being both lightweight and interpretable. The project is open-sourced at https://github.com/xiaweijie1996/TransformerEM-GMM.git.
Real-time Event Recognition of Long-distance Distributed Vibration Sensing with Knowledge Distillation and Hardware Acceleration
Fiber-optic sensing, especially distributed optical fiber vibration (DVS) sensing, is gaining importance in internet of things (IoT) applications, such as industrial safety monitoring and intrusion detection. Despite their wide application, existing post-processing methods that rely on deep learning models for event recognition in DVS systems face challenges with real-time processing of large sample data volumes, particularly in long-distance applications. To address this issue, we propose to use a four-layer convolutional neural network (CNN) with ResNet as the teacher model for knowledge distillation. This results in a significant improvement in accuracy, from 83.41% to 95.39%, on data from previously untrained environments. Additionally, we propose a novel hardware design based on field-programmable gate arrays (FPGA) to further accelerate model inference. This design replaces multiplication with binary shift operations and quantizes model weights, enabling high parallelism and low latency. Our implementation achieves an inference time of 0.083 ms for a spatial-temporal sample covering a 12.5 m fiber length and 0.256 s time frame. This performance enables real-time signal processing over approximately 38.55 km of fiber, about $2.14\times$ the capability of an Nvidia GTX 4090 GPU. The proposed method greatly enhances the efficiency of vibration pattern recognition, promoting the use of DVS as a smart IoT system. The data and code are available at https://github.com/HUST-IOF/Efficient-DVS.
comment: 9 pages, 10 figures
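The quoted real-time coverage can be checked with a short calculation. The sketch below assumes that fiber segments are processed one after another and that all of them must be finished within one 0.256 s time frame; this reading, and the variable names, are assumptions rather than details stated in the abstract.

```python
import math

inference_ms = 0.083   # inference time per spatial-temporal sample (from the abstract)
window_ms = 256.0      # time frame covered by one sample, 0.256 s
segment_m = 12.5       # fiber length covered by one sample

# Number of samples that can be processed sequentially within one time frame.
samples_per_window = math.floor(window_ms / inference_ms)

# Fiber length that can be kept up with in real time.
coverage_km = samples_per_window * segment_m / 1000.0
```

Under these assumptions the calculation gives 3084 samples per frame and 38.55 km of fiber, which is consistent with the figure of approximately 38.55 km reported above.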
A Complete Set of Quadratic Constraints for Repeated ReLU and Generalizations
This paper derives a complete set of quadratic constraints (QCs) for the repeated ReLU. The complete set of QCs is described by a collection of matrix copositivity conditions. We also show that only two functions satisfy all QCs in our complete set: the repeated ReLU and flipped ReLU. Thus our complete set of QCs bounds the repeated ReLU as tightly as possible up to the sign invariance inherent in quadratic forms. We derive a similar complete set of incremental QCs for repeated ReLU, which can potentially lead to less conservative Lipschitz bounds for ReLU networks than the standard LipSDP approach. The basic constructions are also used to derive the complete sets of QCs for other piecewise linear activation functions such as leaky ReLU, MaxMin, and HouseHolder. Finally, we illustrate the use of the complete set of QCs to assess stability and performance for recurrent neural networks with ReLU activation functions. We rely on a standard copositivity relaxation to formulate the stability/performance condition as a semidefinite program. Simple examples are provided to illustrate that the complete sets of QCs and incremental QCs can yield less conservative bounds than existing sets.
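To make the notion of a quadratic constraint concrete, the sketch below numerically checks a few familiar properties of the repeated ReLU $w = \max(v, 0)$: $w \ge 0$, $w - v \ge 0$, and the complementarity $w_i (w_i - v_i) = 0$. It also checks one cross-term inequality built from an elementwise nonnegative matrix, which is a simple special case of a copositive matrix. These are well-known constraints shown for illustration only; they are not claimed to be the complete set derived in the paper, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
v = rng.normal(size=n)          # pre-activation vector
w = np.maximum(v, 0.0)          # repeated ReLU applied elementwise

slack = w - v                   # equals relu(-v), so it is elementwise nonnegative

lam = rng.uniform(0.0, 1.0, size=n)        # nonnegative diagonal multipliers
Q = rng.uniform(0.0, 1.0, size=(n, n))     # elementwise nonnegative matrix

# Complementarity: sum_i lam_i * w_i * (w_i - v_i) = 0 for any lam.
complementarity = float(np.sum(lam * w * slack))

# Cross term: w^T Q (w - v) >= 0 because w, (w - v), and Q are all nonnegative.
cross_term = float(w @ Q @ slack)
```

The complementarity term vanishes exactly because, in each coordinate, either `w` or `slack` is zero, while the cross term couples different coordinates and is only guaranteed to be nonnegative.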
Contextual Tuning of Model Predictive Control for Autonomous Racing
Learning-based model predictive control has been widely applied in autonomous racing to improve the closed-loop behaviour of vehicles in a data-driven manner. When environmental conditions change, e.g., due to rain, often only the predictive model is adapted, but the controller parameters are kept constant. However, this can lead to suboptimal behaviour. In this paper, we address the problem of data-efficient controller tuning, adapting both the model and objective simultaneously. The key novelty of the proposed approach is that we leverage a learned dynamics model to encode the environmental condition as a so-called context. This insight allows us to employ contextual Bayesian optimization to efficiently transfer knowledge across different environmental conditions. Consequently, we require fewer data to find the optimal controller configuration for each context. The proposed framework is extensively evaluated with more than 3,000 laps driven on an experimental platform with 1:28 scale RC race cars. The results show that our approach successfully optimizes the lap time across different contexts requiring fewer data compared to other approaches based on standard Bayesian optimization.
Control-Coherent Koopman Modeling: A Physical Modeling Approach
The modeling of nonlinear dynamics based on Koopman operator theory, which is originally applicable only to autonomous systems with no control, is extended to non-autonomous control systems without approximating the input matrix B. Prevailing methods using a least-squares estimate of the B matrix may result in an erroneous input matrix, misinforming the controller about the structure of the input matrix in a lifted space. Here, a new method for constructing a Koopman model that comprises the exact input matrix B is presented. A set of state variables is introduced so that the control inputs are linearly involved in the dynamics of actuators. With these variables, a lifted linear model with the exact control matrix, called a Control-Coherent Koopman Model, is constructed by superposing control input terms, which are linear in local actuator dynamics, to the Koopman operator of the associated autonomous nonlinear system. The proposed method is applied to multi degree-of-freedom robotic arms and multi-cable manipulation systems. Model Predictive Control is applied to the former. It is demonstrated that the prevailing Dynamic Mode Decomposition with Control (DMDc) using an approximate control matrix B does not provide a satisfactory result, while the Control-Coherent Koopman Model performs well with the correct B matrix.
comment: Accepted at the Conference on Decision and Control (CDC 2024)
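The least-squares estimate of the input matrix that the abstract contrasts against can be sketched as follows. This is a minimal DMDc-style regression on a hypothetical noise-free linear system, where the fit happens to be exact; the matrices `A_true` and `B_true` and all variable names are assumptions for illustration, and the sketch does not implement the Control-Coherent Koopman Model.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical linear system x_{k+1} = A x_k + B u_k used to generate data.
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
B_true = np.array([[0.0],
                   [1.0]])

T = 100
X = np.zeros((2, T + 1))           # state snapshots as columns
U = rng.normal(size=(1, T))        # random excitation input
X[:, 0] = rng.normal(size=2)
for k in range(T):
    X[:, k + 1] = A_true @ X[:, k] + B_true @ U[:, k]

# Stack states and inputs, then solve X' ~ [A B] @ Omega in the least-squares sense.
Omega = np.vstack([X[:, :-1], U])            # shape (3, T)
G = X[:, 1:] @ np.linalg.pinv(Omega)         # shape (2, 3), equals [A_est, B_est]
A_est, B_est = G[:, :2], G[:, 2:]
```

For this linear, noise-free data the regression recovers both matrices. When the same regression is applied to lifted observables of a nonlinear system, `B_est` is only a best fit over the data, which is the source of the erroneous input matrix described in the abstract.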
Distributed Online Feedback Optimization for Real-time Distribution System Voltage Regulation
We investigate the real-time distribution system voltage regulation problem employing online feedback optimization (OFO) and short-range communication between physical neighbours. OFO needs neither an accurate grid topology nor the estimated consumption of non-controllable loads, affords fast calculations, and demonstrates robustness to uncertainties due to its feedback-based nature. These characteristics render it particularly suitable for real-time distribution system applications. However, many OFO controllers require centralized communication, which lacks robustness to single-point failures. This paper proposes a distributed OFO design based on a nested feedback optimization strategy and analyzes its convergence. Numerical results reveal that the proposed design achieves satisfactory voltage regulation and outperforms other distributed approaches.
Dynamic Complex-Frequency Control of Grid-Forming Converters
Complex droop control, alternatively known as dispatchable virtual oscillator control (dVOC), stands out for its unique capabilities in synchronization and voltage stabilization among existing control strategies for grid-forming converters. Complex droop control leverages the novel concept of ``complex frequency'', thereby establishing a coupled connection between active and reactive power inputs and frequency and rate-of-change-of-voltage outputs. However, its reliance on static droop gains limits its ability to exhibit crucial dynamic response behaviors required in future power systems. To address this limitation, this paper introduces dynamic complex-frequency control, upgrading static droop gains with dynamic transfer functions to enhance the richness and flexibility in dynamic responses for frequency and voltage control. Unlike existing approaches, the complex-frequency control framework treats frequency and voltage dynamics collectively, ensuring small-signal stability for frequency synchronization and voltage stabilization simultaneously. The control framework is validated through detailed numerical case studies on the IEEE nine-bus system, also showcasing its applicability in multi-converter setups.
comment: 6 Pages, 7 Figures
Decentralized Online Learning for Random Inverse Problems Over Graphs
We propose a decentralized online learning algorithm for distributed random inverse problems over network graphs with online measurements, which unifies distributed parameter estimation in Hilbert spaces and the least mean square problem in reproducing kernel Hilbert spaces (RKHS-LMS). We transform the convergence of the algorithm into the asymptotic stability of a class of inhomogeneous random difference equations in Hilbert spaces with $L_{2}$-bounded martingale difference terms and develop the $L_2$-asymptotic stability theory in Hilbert spaces. We show that if the network graph is connected and the sequence of forward operators satisfies the infinite-dimensional spatio-temporal persistence of excitation condition, then the estimates of all nodes are mean square and almost surely strongly consistent. Moreover, we propose a decentralized online learning algorithm in RKHS based on non-stationary online data streams, and prove that the algorithm is mean square and almost surely strongly consistent if the operators induced by the random input data satisfy the infinite-dimensional spatio-temporal persistence of excitation condition.
Integrating Physics-Based Modeling with Machine Learning for Lithium-Ion Batteries
Mathematical modeling of lithium-ion batteries (LiBs) is a primary challenge in advanced battery management. This paper proposes two new frameworks to integrate physics-based models with machine learning to achieve high-precision modeling for LiBs. The frameworks are characterized by informing the machine learning model of the state information of the physical model, enabling a deep integration between physics and machine learning. Based on the frameworks, a series of hybrid models are constructed by combining an electrochemical model and an equivalent circuit model, respectively, with a feedforward neural network. The hybrid models are relatively parsimonious in structure and can provide considerable voltage predictive accuracy under a broad range of C-rates, as shown by extensive simulations and experiments. The study further expands to conduct aging-aware hybrid modeling, leading to the design of a hybrid model conscious of the state-of-health to make predictions. The experiments show that the model has high voltage predictive accuracy throughout a LiB's cycle life.
comment: 15 pages, 10 figures, 2 tables. arXiv admin note: text overlap with arXiv:2103.11580
Two competing populations with a common environmental resource
Feedback-evolving games is a framework that models the co-evolution between payoff functions and an environmental state. It serves as a useful tool to analyze many social dilemmas such as natural resource consumption, behaviors in epidemics, and the evolution of biological populations. However, it has primarily focused on the dynamics of a single population of agents. In this paper, we consider the impact of two populations of agents that share a common environmental resource. We focus on a scenario where individuals in one population are governed by an environmentally "responsible" incentive policy, and individuals in the other population are environmentally "irresponsible". An analysis of the asymptotic stability of the coupled system is provided, and conditions under which the resource collapses are identified. We then derive consumption rates for the irresponsible population that optimally exploit the environmental resource, and analyze, via a sensitivity analysis, how incentives should be allocated to the responsible population to most effectively promote the environment.
Systems and Control (EESS)
Population Control of Giardia lamblia
Giardia lamblia is a flagellate intestinal protozoan with global distribution that causes the disease known as giardiasis. This parasite is responsible for 35.1% of outbreaks of diarrhea caused by contaminated water and mainly affects children, in whom it can cause physical and cognitive impairment. In this paper, we consider a model of population dynamics to represent the behavior of Giardia lamblia in vitro, taking into account the mutation characteristic that gives the protozoan resistance to the drug metronidazole. In contrast to the existing literature, the control objective pursued is the extermination of the protozoan, considering that the parameters of the model are uncertain and only partial measurement of the state vector is possible. Under these assumptions, a control law is designed and the stability of the closed-loop system is rigorously proved. Simulation and experimental results illustrate the benefits of the proposed population control method for Giardia lamblia.
comment: 7 pages, 5 figures, 1 Table
Impact of the Inflation Reduction Act and Carbon Capture on Transportation Electrification for a Net-Zero Western U.S. Grid
The electrification of transportation is critical to mitigate Greenhouse Gas (GHG) emissions. The United States (U.S.) government's Inflation Reduction Act (IRA) of 2022 introduces policies to promote the electrification of transportation. In addition to electrifying transportation, clean energy technologies such as Carbon Capture and Storage (CCS) may play a major role in achieving a net-zero energy system. Utilizing scenarios simulated by the U.S. version of the Global Change Analysis Model (GCAM-USA), we analyze the individual and compound contributions of the IRA and CCS to reach a clean U.S. grid by 2035 and net-zero GHG emissions by 2050. We analyze the contributions based on three metrics: i) transportation electrification rate, ii) transportation fuel mix, and iii) spatio-temporal charging loads. Our findings indicate that the IRA significantly accelerates transportation electrification in the near-term (until 2035). In contrast, CCS technologies, by enabling the continued use of internal combustion vehicles while still advancing toward net-zero, potentially suppress the rate of transportation electrification in the long-term. This study underscores how policy and technology innovation can interact, and that sensitivity studies with different combinations are essential to characterize the potential contributions of each to transportation electrification.
comment: This is a preprint. Its complete copyrighted version will be available on the publisher's website after publication
A Note on an Upper-Bound for the Sum of a Class K and an Extended Class K Function
In this short note, we derive an upper-bound for the sum of two comparison functions, namely for the sum of a class K and an extended class K function. To the best of our knowledge, the relations derived in this note have not been previously derived in the literature.
comment: 4 pages
Data-driven MPC with terminal conditions in the Koopman framework
We investigate nonlinear model predictive control (MPC) with terminal conditions in the Koopman framework using extended dynamic mode decomposition (EDMD) to generate a data-based surrogate model for prediction and optimization. We rigorously show recursive feasibility and prove practical asymptotic stability w.r.t. the approximation accuracy. To this end, finite-data error bounds are employed. The construction of the terminal conditions is based on recently derived proportional error bounds to ensure the required Lyapunov decrease. Finally, we illustrate the effectiveness of the proposed data-driven predictive controller including the design procedure to construct the terminal region and controller.
comment: Accepted for presentation at the 63rd IEEE Conference on Decision and Control (CDC2024)
The Hybrid Hospital: Balancing On-Site and Remote Hospitalization
Hybrid hospitals offer on-site and remote hospitalization through telemedicine. These new healthcare models require novel operational policies to balance costs, efficiency, and patient well-being. Our study addresses two first-order questions: (i) how to direct patient admission and call-in based on individual characteristics and proximity and (ii) how to determine the optimal allocation of medical resources between these two hospitalization options and among different patient types. We develop a model that uses Brownian Motion to capture the patient's health evolution during remote/on-site hospitalization and during travel. Under cost-minimizing call-in policies, we find that remote hospitalization can be cost-effective for moderately distant patients, as the optimal call-in threshold is non-monotonic in the patient's travel time. Subject to scarce resources, the optimal solution structure becomes equivalent to a simultaneous, identically sized increase of remote and on-site costs under abundant resources. When limited resources must be divided among multiple patient types, the optimal thresholds shift in non-obvious ways as resource availability changes. Finally, we develop a practical and efficient policy that allows for swapping an on-site patient with a remote patient when the latter is called in and sufficient resources are not available to treat both on-site. Contrary to the widely held view that telemedicine can mitigate rural and non-rural healthcare disparities, our research suggests that on-site care may actually be more cost-effective than remote hospitalization for patients in distant locations, due to (potentially overlooked) risks during patient travel. This finding may be of particular concern in light of the growing number of ``hospital deserts'' amid recent rural hospital closures, as these communities may in fact not be well-served through at-home care.
A Stable Polygamy Approach to Spectrum Access with Channel Reuse
We introduce a new and broader formulation of the stable marriage problem (SMP), called the stable polygamy problem (SPP), where multiple individuals from a larger group $L$ of $|L|$ individuals can be matched with a single individual from a smaller group $S$ of $|S|$ individuals. Each individual $\ell \in L$ possesses a social constraint set $C_{\ell}$ that contains a subset of $L$ with whom they cannot coexist harmoniously. We define a generalized concept of stability based on the preferences and constraints of the individuals. We explore two common settings: common utility, where the utility of a match is the same for individuals from both sets, and preference ranking, where each individual has a preference ranking for every other individual in the opposite set. Our analysis is conducted using a novel graph-theoretical framework. The classic SMP has been investigated in recent years for spectrum access to match cells or users to channels, where only one-to-one matching is allowed. By contrast, the new SPP formulation allows us to solve more general models with channel reuse, where multiple users may access the same channel simultaneously. Interestingly, we show that classic algorithms, such as propose and reject (P&R) and the Hungarian method, are no longer efficient in the polygamy setting. We develop efficient algorithms to solve the SPP in polynomial time, tailored for implementations in spectrum access with channel reuse. We analytically show that our algorithm always solves the SPP with common utility. While the SPP with preference ranking cannot be solved by any algorithm in all cases, we prove that our algorithm effectively solves it in specific graph structures representing strong and weak interference regimes. Simulation results demonstrate the efficiency of our algorithms across various spectrum access scenarios.
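For reference, the classic one-to-one propose-and-reject procedure that the abstract contrasts with can be sketched as follows. This is the textbook Gale-Shapley algorithm for the standard SMP, not the authors' SPP algorithm; the function name and the toy user/channel preference lists are illustrative assumptions, and the sketch assumes two equal-size groups with complete preference lists.

```python
def propose_and_reject(proposer_prefs, receiver_prefs):
    """Textbook Gale-Shapley matching for the classic one-to-one SMP."""
    # rank[r][p]: position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver index to try
    match = {}                                    # receiver -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = match.get(r)
        if current is None:
            match[r] = p                # receiver was unmatched: accept
        elif rank[r][p] < rank[r][current]:
            match[r] = p                # receiver prefers the new proposer
            free.append(current)        # previous partner becomes free again
        else:
            free.append(p)              # proposal rejected: try next choice later
    return {p: r for r, p in match.items()}


# Hypothetical toy instance: two users proposing to two channels.
users = {"u1": ["c1", "c2"], "u2": ["c1", "c2"]}
channels = {"c1": ["u2", "u1"], "c2": ["u1", "u2"]}
result = propose_and_reject(users, channels)
```

Each channel holds at most one user here, which is exactly the restriction that a formulation with channel reuse has to lift.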
Recursive Distributed Collaborative Aided Inertial Navigation
In this dissertation, we investigate the issue of robust localization in swarms of heterogeneous mobile agents with multiple and time-varying sensing modalities. Our focus is the development of filter-based and decoupled estimators under the assumption that agents possess communication and processing capabilities. Based on the findings from Distributed Collaborative State Estimation and modular sensor fusion, we propose a novel Kalman filter decoupling paradigm, which is termed Isolated Kalman Filtering (IKF). This paradigm is formally discussed and the treatment of delayed measurements is studied. The impact of the approximations made is investigated on different observation graphs, and the filter credibility is evaluated on a linear system in a Monte Carlo simulation. Finally, we propose a multi-agent modular sensor fusion approach based on the IKF paradigm, in order to cooperatively estimate the global state of a multi-agent system in a distributed way and fuse information provided by different on-board sensors in a computationally efficient way. As a consequence, this approach can be performed in a distributed manner among agents, while (i) communication between agents is only required at the moment of inter-agent joint observations, (ii) one agent acts as interim master to process state corrections in isolation, (iii) agents can be added to and removed from the swarm, (iv) each agent's full state can vary during the mission (each local sensor suite can be truly modular), and (v) delayed and multi-rate sensor updates are supported. Extensive evaluation on realistic simulated and real-world data sets shows that the proposed Isolated Kalman Filtering (IKF) paradigm is applicable to both truly modular single-agent estimation and distributed collaborative multi-agent estimation problems.
comment: PhD thesis
Fine-tuning Smaller Language Models for Question Answering over Financial Documents
Recent research has shown that smaller language models can acquire substantial reasoning abilities when fine-tuned with reasoning exemplars crafted by a significantly larger teacher model. We explore this paradigm for the financial domain, focusing on the challenge of answering questions that require multi-hop numerical reasoning over financial texts. We assess the performance of several smaller models that have been fine-tuned to generate programs that encode the required financial reasoning and calculations. Our findings demonstrate that these fine-tuned smaller models approach the performance of the teacher model. To provide a granular analysis of model performance, we propose an approach to investigate the specific student model capabilities that are enhanced by fine-tuning. Our empirical analysis indicates that fine-tuning refines the student model's ability to express and apply the required financial concepts along with adapting the entity extraction for the specific data format. In addition, we hypothesize and demonstrate that comparable financial reasoning capability can be induced using relatively smaller datasets.
Star-shaped Tilted Hexarotor Maneuverability: Analysis of the Role of the Tilt Cant Angles
Star-shaped Tilted Hexarotors are rapidly emerging for applications highly demanding in terms of robustness and maneuverability. To ensure improvement in such features, a careful selection of the tilt angles is mandatory. In this work, we present a rigorous analysis of how the force subspace varies with the tilt cant angles, namely the tilt angles along the vehicle arms, taking into account gravity compensation and torque decoupling to abide by the hovering condition. Novel metrics are introduced to assess the performance of existing tilted platforms, as well as to provide some guidelines for the selection of the tilt cant angle in the design phase.
comment: accepted for presentation at the 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE2024)
Modularized data-driven approximation of the Koopman operator and generator
Extended Dynamic Mode Decomposition (EDMD) is a widely-used data-driven approach to learn an approximation of the Koopman operator. Consequently, it provides a powerful tool for data-driven analysis, prediction, and control of nonlinear dynamical (control) systems. In this work, we propose a novel modularized EDMD scheme tailored to interconnected systems. To this end, we utilize the structure of the Koopman generator that allows learning the dynamics of subsystems individually and thus alleviates the curse of dimensionality by considering observable functions on smaller state spaces. Moreover, our approach canonically enables transfer learning if a system encompasses multiple copies of a model as well as efficient adaptation to topology changes without retraining. We provide finite-data bounds on the estimation error using tools from graph theory. The efficacy of the method is illustrated by means of various numerical examples.
comment: 31 pages, 11 figures
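As a point of reference for the abstract above, plain (non-modularized) EDMD can be sketched as a least-squares fit in a lifted space. This is a minimal sketch of standard EDMD only, not the modularized scheme the paper proposes; the toy system, the dictionary `lift`, and all parameter values are assumptions chosen so that the dictionary is closed under the dynamics and the fit is exact.

```python
import numpy as np


def edmd(X, Y, lift):
    """Least-squares matrix K such that lift(Y) is approximately K @ lift(X).

    X, Y have shape (n_states, n_samples); Y[:, i] is the successor of X[:, i].
    """
    PX, PY = lift(X), lift(Y)
    # Solve min_K ||PY - K PX||_F through the transposed problem PX.T @ K.T = PY.T.
    K_T, *_ = np.linalg.lstsq(PX.T, PY.T, rcond=None)
    return K_T.T


def lift(X):
    """Hypothetical dictionary of observables: x1, x2, x1^2."""
    return np.vstack([X[0], X[1], X[0] ** 2])


# Toy discrete-time system (assumed): x1+ = a*x1, x2+ = b*x2 + c*x1^2.
a, b, c = 0.9, 0.5, 0.2
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2, 50))
Y = np.vstack([a * X[0], b * X[1] + c * X[0] ** 2])

K = edmd(X, Y, lift)
# Exact lifted dynamics for this dictionary, since (x1^2)+ = a^2 * x1^2.
K_true = np.array([[a, 0.0, 0.0], [0.0, b, c], [0.0, 0.0, a ** 2]])
```

For general systems no finite dictionary is closed under the dynamics, so `K` is only an approximation whose quality depends on the observables and the amount of data.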
Accounts of using the Tustin-Net architecture on a rotary inverted pendulum
In this report we investigate the use of the Tustin neural network architecture (Tustin-Net) for the identification of a physical rotary inverted pendulum. This physics-based architecture is of particular interest as it builds on the known relationship between velocities and positions. We here aim at discussing the advantages, limitations and performance of Tustin-Nets compared to first-principles grey-box models on a real physical apparatus, showing how, with a standard training procedure, the former can hardly achieve the same accuracy as the latter. To address this limitation, we present a training strategy based on transfer learning that yields Tustin-Nets that are competitive with the first-principles model, without requiring the extensive knowledge of the setup that the latter demands.
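The velocity-position relationship mentioned above can be illustrated with the Tustin (trapezoidal) rule, q[k+1] = q[k] + (h/2)(v[k] + v[k+1]). The sketch below is an assumption about how such a structure might look, not the report's architecture: `delta_v` is a hypothetical stand-in for a learned model that predicts the velocity increment, and here it is a constant-acceleration placeholder so the result can be checked by hand.

```python
def tustin_step(q, v, delta_v, h):
    """One step in which only the velocity increment comes from a model.

    The position update is fixed by the Tustin (trapezoidal) rule and is not learned.
    """
    v_next = v + delta_v(q, v)
    q_next = q + 0.5 * h * (v + v_next)
    return q_next, v_next


h = 0.1             # sampling time in seconds (assumed)
acceleration = 2.0  # constant acceleration used by the placeholder model


def delta_v(q, v):
    """Placeholder for a learned network: velocity change over one step."""
    return acceleration * h


q, v = 0.0, 0.0
for _ in range(10):  # simulate 1 second
    q, v = tustin_step(q, v, delta_v, h)
```

With constant acceleration the trapezoidal position update is exact, so after one second the state matches q = a t^2 / 2 and v = a t up to rounding.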
Deep Analysis of Time Series Data for Smart Grid Startup Strategies: A Transformer-LSTM-PSO Model Approach
Grid startup, an integral component of the power system, holds strategic importance for ensuring the reliability and efficiency of the electrical grid. However, current methodologies for in-depth analysis and precise prediction of grid startup scenarios are inadequate. To address these challenges, we propose a novel method based on the Transformer-LSTM-PSO model. This model uniquely combines the Transformer's self-attention mechanism, LSTM's temporal modeling capabilities, and the parameter tuning features of the particle swarm optimization algorithm. It is designed to more effectively capture the complex temporal relationships in grid startup schemes. Our experiments demonstrate significant improvements, with our model achieving lower RMSE and MAE values across multiple datasets compared to existing benchmarks, particularly in the NYISO Electric Market dataset where the RMSE was reduced by approximately 15% and the MAE by 20% compared to conventional models. Our main contribution is the development of a Transformer-LSTM-PSO model that significantly enhances the accuracy and efficiency of smart grid startup predictions. The application of the Transformer-LSTM-PSO model represents a significant advancement in smart grid predictive analytics, concurrently fostering the development of more reliable and intelligent grid management systems.
comment: 46 pages
Robust Input Shaping Vibration Control via Extended Kalman Filter-Incorporated Residual Neural Network
With the rapid development of industry, the vibration control of flexible structures and underactuated systems has been increasingly gaining attention. Input shaping technology enables stable performance for high-speed motion in industrial motion systems. However, existing input shapers generally suffer from ineffective control performance due to the neglect of observation errors. To address this critical issue, this paper proposes an Extended Kalman Filter-incorporated Residual Neural Network-based input Shaping (ERS) model for vibration control. Its main ideas are two-fold: a) adopting an extended Kalman filter to address a vertical flexible beam's model errors; and b) adopting a residual neural network to cascade with the extended Kalman filter for eliminating the remaining observation errors. Detailed experiments on a real dataset collected from a vertical flexible beam demonstrate that the proposed ERS model achieves a significant improvement in vibration control performance over several state-of-the-art models.
Controllability and Observability of Temporal Hypergraphs
Numerous complex systems, such as those arising in ecological networks, genomic contact networks, and social networks, exhibit higher-order and time-varying characteristics, which can be effectively modeled using temporal hypergraphs. However, analyzing and controlling temporal hypergraphs pose significant challenges due to their inherent time-varying and nonlinear nature, while most existing methods predominantly target static hypergraphs. In this article, we generalize the notions of controllability and observability to temporal hypergraphs by leveraging tensor and nonlinear systems theory. Specifically, we establish tensor-based rank conditions to determine the weak controllability and observability of temporal hypergraphs. The proposed framework is further demonstrated with synthetic and real-world examples.
comment: 6 pages, 3 figures
Autonomous Grid-Forming Inverter Exponential Droop Control for Improved Frequency Stability
This paper introduces the novel Droop-e grid-forming power electronic converter control strategy, which establishes a non-linear, active power--frequency droop relationship based on an exponential function of the power output. A primary advantage of Droop-e is an increased utilization of available power headroom that directly mitigates system frequency excursions and reduces the rate of change of frequency. The motivation for Droop-e as compared to a linear grid-forming control is first established, and then the full controller is described, including the mirrored inversion at the origin, the linearization at a parameterized limit, and the auxiliary autonomous power sharing controller. The analytic stability of the controller, including synchronization criteria and a small signal stability analysis, is assessed. Electromagnetic transient time domain simulations of the Droop-e controller with full order power electronic converters and accompanying DC-side dynamics, connected in parallel with synchronous generators, are executed at a range of dispatches on a simple 3-bus system. Finally, IEEE 39-bus system simulations highlight the improved frequency stability of the system with multiple, Droop-e controlled grid-forming inverters.
comment: 10 pages, 11 figures
Late Breaking Results: On the One-Key Premise of Logic Locking
The evaluation of logic locking methods has long been predicated on an implicit assumption that only the correct key can unveil the true functionality of a protected circuit. Consequently, a locking technique is deemed secure if it resists a good array of attacks aimed at finding this correct key. This paper challenges this one-key premise by introducing a more efficient attack methodology, focused not on identifying that one correct key, but on finding multiple, potentially incorrect keys that can collectively produce correct functionality from the protected circuit. The tasks of finding these keys can be parallelized, which is well suited for multi-core computing environments. Empirical results show our attack achieves a runtime reduction of up to 99.6% compared to the conventional attack that tries to find a single correct key.
comment: 2 pages, accepted in DAC 2024 proceedings
Remaining Discharge Energy Prediction for Lithium-Ion Batteries Over Broad Current Ranges: A Machine Learning Approach
Lithium-ion batteries have found their way into myriad sectors of industry to drive electrification, decarbonization, and sustainability. A crucial aspect in ensuring their safe and optimal performance is monitoring their energy levels. In this paper, we present the first study on predicting the remaining energy of a battery cell undergoing discharge over wide current ranges from low to high C-rates. The complexity of the challenge arises from the cell's C-rate-dependent energy availability as well as its intricate electro-thermal dynamics especially at high C-rates. To address this, we introduce a new definition of remaining discharge energy and then undertake a systematic effort in harnessing the power of machine learning to enable its prediction. Our effort includes two parts in cascade. First, we develop an accurate dynamic model based on integration of physics with machine learning to capture a battery's voltage and temperature behaviors. Second, based on the model, we propose a machine learning approach to predict the remaining discharge energy under arbitrary C-rates and pre-specified cut-off limits in voltage and temperature. The experimental validation shows that the proposed approach can predict the remaining discharge energy with a relative error of less than 3% when the current varies between 0~8 C for an NCA cell and 0~15 C for an LFP cell. The approach, by design, is amenable to training and computation.
comment: 15 pages, 13 figures, 4 tables
An Efficient and Explainable Transformer-Based Few-Shot Learning for Modeling Electricity Consumption Profiles Across Thousands of Domains
Electricity Consumption Profiles (ECPs) are crucial for operating and planning power distribution systems, especially with the increasing numbers of various low-carbon technologies such as solar panels and electric vehicles. Traditional ECP modeling methods typically assume the availability of sufficient ECP data. However, in practice, the accessibility of ECP data is limited due to privacy issues or the absence of metering devices. Few-shot learning (FSL) has emerged as a promising solution for ECP modeling in data-scarce scenarios. Nevertheless, standard FSL methods, such as those used for images, are unsuitable for ECP modeling because (1) these methods usually assume several source domains with sufficient data and several target domains, whereas in ECP modeling there may be thousands of source domains with a moderate amount of data and thousands of target domains; (2) standard FSL methods usually involve cumbersome knowledge transfer mechanisms, such as pre-training and fine-tuning, whereas ECP modeling requires more lightweight methods; and (3) deep learning models often lack explainability, hindering their application in industry. This paper proposes a novel FSL method that exploits Transformers and Gaussian Mixture Models (GMMs) for ECP modeling to address the above-described issues. Results show that our method can accurately restore the complex ECP distribution with a minimal amount of ECP data (e.g., only 1.6% of the complete domain dataset) while it outperforms state-of-the-art time series modeling methods, maintaining the advantages of being both lightweight and interpretable. The project is open-sourced at https://github.com/xiaweijie1996/TransformerEM-GMM.git.
Real-time Event Recognition of Long-distance Distributed Vibration Sensing with Knowledge Distillation and Hardware Acceleration
Fiber-optic sensing, especially distributed optical fiber vibration (DVS) sensing, is gaining importance in internet of things (IoT) applications, such as industrial safety monitoring and intrusion detection. Despite their wide application, existing post-processing methods that rely on deep learning models for event recognition in DVS systems face challenges with real-time processing of large sample data volumes, particularly in long-distance applications. To address this issue, we propose to use a four-layer convolutional neural network (CNN) with ResNet as the teacher model for knowledge distillation. This results in a significant improvement in accuracy, from 83.41% to 95.39%, on data from previously untrained environments. Additionally, we propose a novel hardware design based on field-programmable gate arrays (FPGA) to further accelerate model inference. This design replaces multiplication with binary shift operations and quantizes model weights, enabling high parallelism and low latency. Our implementation achieves an inference time of 0.083 ms for a spatial-temporal sample covering a 12.5 m fiber length and 0.256 s time frame. This performance enables real-time signal processing over approximately 38.55 km of fiber, about $2.14\times$ the capability of an Nvidia GTX 4090 GPU. The proposed method greatly enhances the efficiency of vibration pattern recognition, promoting the use of DVS as a smart IoT system. The data and code are available at https://github.com/HUST-IOF/Efficient-DVS.
comment: 9 pages, 10 figures
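The 38.55 km figure quoted in the abstract above is consistent with a simple throughput calculation on the reported numbers. The sketch below assumes that samples are processed strictly one after another and that real time means finishing every 12.5 m segment of a frame within that frame's 0.256 s duration; the variable names are illustrative.

```python
inference_time_s = 0.083e-3  # reported inference time per sample (0.083 ms)
segment_length_m = 12.5      # fiber length covered by one sample
frame_duration_s = 0.256     # time frame covered by one sample

# Number of segments that can be processed before the next frame arrives.
segments_per_frame = frame_duration_s / inference_time_s

# Total fiber length that can be monitored in real time, in kilometers.
real_time_reach_km = segments_per_frame * segment_length_m / 1000.0
```

About 3084 segments fit into one frame period, which corresponds to roughly 38.55 km of fiber.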
A Complete Set of Quadratic Constraints for Repeated ReLU and Generalizations
This paper derives a complete set of quadratic constraints (QCs) for the repeated ReLU. The complete set of QCs is described by a collection of matrix copositivity conditions. We also show that only two functions satisfy all QCs in our complete set: the repeated ReLU and flipped ReLU. Thus our complete set of QCs bounds the repeated ReLU as tightly as possible up to the sign invariance inherent in quadratic forms. We derive a similar complete set of incremental QCs for repeated ReLU, which can potentially lead to less conservative Lipschitz bounds for ReLU networks than the standard LipSDP approach. The basic constructions are also used to derive the complete sets of QCs for other piecewise linear activation functions such as leaky ReLU, MaxMin, and HouseHolder. Finally, we illustrate the use of the complete set of QCs to assess stability and performance for recurrent neural networks with ReLU activation functions. We rely on a standard copositivity relaxation to formulate the stability/performance condition as a semidefinite program. Simple examples are provided to illustrate that the complete sets of QCs and incremental QCs can yield less conservative bounds than existing sets.
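The abstract does not list the constraints themselves. As background only, the scalar ReLU y = max(0, x) satisfies three elementary relations that are commonly used as a starting point for ReLU quadratic constraints: y >= 0, y >= x, and the complementarity condition y (y - x) = 0. The check below is a numerical illustration of these standard facts, not the paper's complete set; the sample grid is an arbitrary assumption.

```python
import numpy as np


def relu(x):
    """Elementwise (repeated) ReLU."""
    return np.maximum(x, 0.0)


x = np.linspace(-2.0, 2.0, 9)  # arbitrary test inputs on both sides of zero
y = relu(x)

nonnegative = bool(np.all(y >= 0.0))                    # y >= 0
dominates_input = bool(np.all(y >= x))                  # y >= x
complementary = bool(np.allclose(y * (y - x), 0.0))     # y * (y - x) = 0
```

The third relation is quadratic in (x, y), which is why weighted sums of such terms can be written as quadratic forms.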
Contextual Tuning of Model Predictive Control for Autonomous Racing
Learning-based model predictive control has been widely applied in autonomous racing to improve the closed-loop behaviour of vehicles in a data-driven manner. When environmental conditions change, e.g., due to rain, often only the predictive model is adapted, but the controller parameters are kept constant. However, this can lead to suboptimal behaviour. In this paper, we address the problem of data-efficient controller tuning, adapting both the model and objective simultaneously. The key novelty of the proposed approach is that we leverage a learned dynamics model to encode the environmental condition as a so-called context. This insight allows us to employ contextual Bayesian optimization to efficiently transfer knowledge across different environmental conditions. Consequently, we require fewer data to find the optimal controller configuration for each context. The proposed framework is extensively evaluated with more than 3'000 laps driven on an experimental platform with 1:28 scale RC race cars. The results show that our approach successfully optimizes the lap time across different contexts requiring fewer data compared to other approaches based on standard Bayesian optimization.
Control-Coherent Koopman Modeling: A Physical Modeling Approach
The modeling of nonlinear dynamics based on Koopman operator theory, which was originally applicable only to autonomous systems with no control, is extended to non-autonomous control systems without approximating the input matrix B. Prevailing methods using a least-squares estimate of the B matrix may result in an erroneous input matrix, misinforming the controller about the structure of the input matrix in the lifted space. Here, a new method for constructing a Koopman model that comprises the exact input matrix B is presented. A set of state variables is introduced so that the control inputs are linearly involved in the dynamics of the actuators. With these variables, a lifted linear model with the exact control matrix, called a Control-Coherent Koopman Model, is constructed by superposing control input terms, which are linear in local actuator dynamics, to the Koopman operator of the associated autonomous nonlinear system. The proposed method is applied to multi degree-of-freedom robotic arms and multi-cable manipulation systems. Model Predictive Control is applied to the former. It is demonstrated that the prevailing Dynamic Mode Decomposition with Control (DMDc) using an approximate control matrix B does not provide a satisfactory result, while the Control-Coherent Koopman Model performs well with the correct B matrix.
comment: Accepted at the Conference on Decision and Control (CDC 2024)
Distributed Online Feedback Optimization for Real-time Distribution System Voltage Regulation
We investigate the real-time distribution system voltage regulation problem employing online feedback optimization (OFO) and short-range communication between physical neighbours. OFO needs neither an accurate grid topology nor estimated consumption of non-controllable loads, affords fast calculations, and demonstrates robustness to uncertainties due to its feedback-based nature. These characteristics render it particularly suitable for real-time distribution system applications. However, many OFO controllers require centralized communication, which lacks robustness to single-point failures. This paper proposes a distributed OFO design based on a nested feedback optimization strategy and analyzes its convergence. Numerical results reveal that the proposed design achieves satisfactory voltage regulation and outperforms other distributed approaches.
Dynamic Complex-Frequency Control of Grid-Forming Converters
Complex droop control, alternatively known as dispatchable virtual oscillator control (dVOC), stands out for its unique capabilities in synchronization and voltage stabilization among existing control strategies for grid-forming converters. Complex droop control leverages the novel concept of ``complex frequency'', thereby establishing a coupled connection between active and reactive power inputs and frequency and rate-of-change-of-voltage outputs. However, its reliance on static droop gains limits its ability to exhibit crucial dynamic response behaviors required in future power systems. To address this limitation, this paper introduces dynamic complex-frequency control, upgrading static droop gains with dynamic transfer functions to enhance the richness and flexibility in dynamic responses for frequency and voltage control. Unlike existing approaches, the complex-frequency control framework treats frequency and voltage dynamics collectively, ensuring small-signal stability for frequency synchronization and voltage stabilization simultaneously. The control framework is validated through detailed numerical case studies on the IEEE nine-bus system, also showcasing its applicability in multi-converter setups.
comment: 6 Pages, 7 Figures
Decentralized Online Learning for Random Inverse Problems Over Graphs
We propose a decentralized online learning algorithm for distributed random inverse problems over network graphs with online measurements, and unify the distributed parameter estimation in Hilbert spaces and the least mean square problem in reproducing kernel Hilbert spaces (RKHS-LMS). We transform the convergence of the algorithm into the asymptotic stability of a class of inhomogeneous random difference equations in Hilbert spaces with $L_{2}$-bounded martingale difference terms and develop the $L_2$-asymptotic stability theory in Hilbert spaces. We show that if the network graph is connected and the sequence of forward operators satisfies the infinite-dimensional spatio-temporal persistence of excitation condition, then the estimates of all nodes are mean square and almost surely strongly consistent. Moreover, we propose a decentralized online learning algorithm in RKHS based on non-stationary online data streams, and prove that the algorithm is mean square and almost surely strongly consistent if the operators induced by the random input data satisfy the infinite-dimensional spatio-temporal persistence of excitation condition.
Integrating Physics-Based Modeling with Machine Learning for Lithium-Ion Batteries
Mathematical modeling of lithium-ion batteries (LiBs) is a primary challenge in advanced battery management. This paper proposes two new frameworks to integrate physics-based models with machine learning to achieve high-precision modeling for LiBs. The frameworks are characterized by informing the machine learning model of the state information of the physical model, enabling a deep integration between physics and machine learning. Based on the frameworks, a series of hybrid models are constructed by combining an electrochemical model and an equivalent circuit model, respectively, with a feedforward neural network. The hybrid models are relatively parsimonious in structure and can provide considerable voltage predictive accuracy under a broad range of C-rates, as shown by extensive simulations and experiments. The study further expands to conduct aging-aware hybrid modeling, leading to the design of a hybrid model conscious of the state-of-health to make predictions. The experiments show that the model has high voltage predictive accuracy throughout a LiB's cycle life.
comment: 15 pages, 10 figures, 2 tables. arXiv admin note: text overlap with arXiv:2103.11580
Two competing populations with a common environmental resource
Feedback-evolving games is a framework that models the co-evolution between payoff functions and an environmental state. It serves as a useful tool to analyze many social dilemmas such as natural resource consumption, behaviors in epidemics, and the evolution of biological populations. However, it has primarily focused on the dynamics of a single population of agents. In this paper, we consider the impact of two populations of agents that share a common environmental resource. We focus on a scenario where individuals in one population are governed by an environmentally "responsible" incentive policy, and individuals in the other population are environmentally "irresponsible". An analysis of the asymptotic stability of the coupled system is provided, and conditions under which the resource collapses are identified. We then derive consumption rates for the irresponsible population that optimally exploit the environmental resource, and use a sensitivity analysis to determine how incentives should be allocated to the responsible population to most effectively promote the environment.
Robotics
Automating Deformable Gasket Assembly
In Gasket Assembly, a deformable gasket must be aligned and pressed into a narrow channel. This task is common for sealing surfaces in the manufacturing of automobiles, appliances, electronics, and other products. Gasket Assembly is a long-horizon, high-precision task and the gasket must align with the channel and be fully pressed in to achieve a secure fit. To compare approaches, we present 4 methods for Gasket Assembly: one policy from deep imitation learning and three procedural algorithms. We evaluate these methods with 100 physical trials. Results suggest that the Binary+ algorithm succeeds in 10/10 on the straight channel whereas the learned policy based on 250 human teleoperated demonstrations succeeds in 8/10 trials and is significantly slower. Code, CAD models, videos, and data can be found at https://berkeleyautomation.github.io/robot-gasket/
comment: Content without Appendix accepted for IEEE CASE 2024
UMAD: University of Macau Anomaly Detection Benchmark Dataset IROS
Anomaly detection is critical in surveillance systems and patrol robots, identifying anomalous regions in images for early warning. Depending on whether reference data are utilized, anomaly detection can be categorized into anomaly detection with reference (ADr) and anomaly detection without reference. Currently, anomaly detection without reference, which is closely related to out-of-distribution (OoD) object detection, struggles with learning anomalous patterns due to the difficulty of collecting sufficiently large and diverse anomaly datasets, given the inherent rarity and novelty of anomalies. Alternatively, ADr employs the scheme of change detection to identify anomalies by comparing semantic changes between a reference image and a query one. However, there are very few ADr works due to the scarcity of public datasets in this domain. In this paper, we aim to address this gap by introducing the UMAD Benchmark Dataset. To the best of our knowledge, this is the first benchmark dataset designed specifically for anomaly detection with reference in robotic patrolling scenarios, e.g., where an autonomous robot is employed to detect anomalous objects by comparing a reference and a query video sequence. The reference sequences can be taken by the robot along a specified route when there are no anomalous objects in the scene. The query sequences are captured online by the robot when it is patrolling in the same scene following the same route. Our benchmark dataset is constructed such that each query image can find a corresponding reference based on accurate robot localization along the same route in the prebuilt 3D map, with which the reference and query images can be geometrically aligned using adaptive warping. Besides the proposed benchmark dataset, we evaluate the baseline models of ADr on this dataset.
comment: Accepted by the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024, project code at https://github.com/IMRL/UMAD
Beyond Shortsighted Navigation: Merging Best View Trajectory Planning with Robot Navigation
Gathering visual information effectively to monitor known environments is a key challenge in robotics. To be as efficient as human surveyors, robotic systems must continuously collect observational data required to complete their survey task. Inspection personnel instinctively know to look at relevant equipment that happens to be "along the way." In this paper, we introduce a novel framework for continuous long-horizon viewpoint planning for ground robots, applied to tasks involving patrolling, monitoring or visual data gathering in known environments. Our approach to Long Horizon Viewpoint Planning (LHVP) enables the robot to autonomously navigate and collect environmental data, optimizing for coverage over the horizon of the patrol. Leveraging a quadruped's mobility and sensory capabilities, our LHVP framework plans patrol paths that account for coupling the viewpoint planner for the arm camera with the mobile base's navigation planner. The viewpath optimization algorithm seeks a balance between comprehensive environmental coverage and dynamically feasible movements, thus ensuring prolonged and effective operation in scenarios including monitoring, security surveillance, and disaster response. We validate our approach through simulations and in the real world and show that our LHVP significantly outperforms naive patrolling methods in terms of area coverage, generating information-gathering trajectories for the robot arm. Our results indicate a promising direction for the deployment of mobile robots in long-term, autonomous surveying, and environmental data collection tasks, highlighting the potential of intelligent robotic systems in challenging real-world applications.
comment: 7 pages, 8 figures, 5 tables
Integrated Hardware and Software Architecture for Industrial AGV with Manual Override Capability
This paper presents a study on transforming a traditional human-operated vehicle into a fully autonomous device. By leveraging previous research and state-of-the-art technologies, the study addresses autonomy, safety, and operational efficiency in industrial environments. Motivated by the demand for automation in hazardous and complex industries, the autonomous system integrates sensors, actuators, advanced control algorithms, and communication systems to enhance safety, streamline processes, and improve productivity. The paper covers system requirements, hardware architecture, software framework and preliminary results. This research offers insights into designing and implementing autonomous capabilities in human-operated vehicles, with implications for improving safety and efficiency in various industrial sectors.
comment: accepted for presentation as WiP at the 2024 IEEE International Conference on Emerging Technologies and Factory Automation (ETFA2024)
Smart Fleet Solutions: Simulating Electric AGV Performance in Industrial Settings
This paper explores the potential benefits and challenges of integrating Electric Vehicles (EVs) and Autonomous Ground Vehicles (AGVs) in industrial settings to improve sustainability and operational efficiency. While EVs offer environmental advantages, barriers like high costs and limited range hinder their widespread use. Similarly, AGVs, despite their autonomous capabilities, face challenges in technology integration and reliability. To address these issues, the paper develops a fleet management tool tailored for coordinating electric AGVs in industrial environments. The study focuses on simulating electric AGV performance in a primary aluminum plant to provide insights into their effectiveness and offer recommendations for optimizing fleet performance.
comment: accepted for presentation as WiP at the 2024 IEEE International Conference on Emerging Technologies and Factory Automation (ETFA2024)
Probabilistic Homotopy Optimization for Dynamic Motion Planning IROS 2024
We present a homotopic approach to solving challenging, optimization-based motion planning problems. The approach uses Homotopy Optimization, which, unlike standard continuation methods for solving homotopy problems, solves a sequence of constrained optimization problems rather than a sequence of nonlinear systems of equations. The insight behind our proposed algorithm is formulating the discovery of this sequence of optimization problems as a search problem in a multidimensional homotopy parameter space. Our proposed algorithm, the Probabilistic Homotopy Optimization algorithm, switches between solve and sample phases, using solutions to easy problems as initial guesses to more challenging problems. We analyze how our algorithm performs in the presence of common challenges to homotopy methods, such as bifurcation, folding, and disconnectedness of the homotopy solution manifold. Finally, we demonstrate its utility via a case study on two dynamic motion planning problems: the cart-pole and the MIT Humanoid.
comment: 8 pages, 9 Figures, 2 Tables, to appear in the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
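The idea of using solutions to easy problems as initial guesses for harder ones can be illustrated with a minimal sketch. This is not the Probabilistic Homotopy Optimization algorithm from the abstract: it assumes a single homotopy parameter swept on a fixed schedule, an unconstrained one-dimensional toy objective, and plain gradient descent. All function names and the toy objective are hypothetical.

```python
def hard_grad(x):
    # Gradient of the nonconvex toy target f(x) = x**4 - 3*x**2 + x (assumed example).
    return 4 * x**3 - 6 * x + 1


def easy_grad(x, center):
    # Gradient of the convex surrogate g(x) = (x - center)**2.
    return 2 * (x - center)


def solve_stage(x0, lam, center, step=0.01, iters=500):
    # Gradient descent on the blended objective (1 - lam) * g + lam * f,
    # warm-started from x0 (the solution of the previous, easier stage).
    x = x0
    for _ in range(iters):
        grad = (1 - lam) * easy_grad(x, center) + lam * hard_grad(x)
        x -= step * grad
    return x


def homotopy_solve(center=2.0, stages=11):
    # Sweep lam from 0 (easy problem) to 1 (hard problem) on a fixed grid.
    x = center  # exact minimizer of the easy problem
    for k in range(stages):
        lam = k / (stages - 1)
        x = solve_stage(x, lam, center)
    return x


if __name__ == "__main__":
    # Different easy problems lead to different stationary points of f.
    print(homotopy_solve(center=2.0))
    print(homotopy_solve(center=-2.0))
```

In this toy setting the choice of the easy problem decides which local minimum of the hard problem is reached, which hints at why searching over the homotopy parameter space, as the abstract describes, can matter.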
Robotic Eye-in-hand Visual Servo Axially Aligning Nasopharyngeal Swabs with the Nasal Cavity
The nasopharyngeal (NP) swab test is a method for collecting cultures to diagnose different types of respiratory illnesses, including COVID-19. Delegating this task to robots would be beneficial in terms of reducing infection risks and bolstering the healthcare system, but a critical component of the NP swab test is having the swab aligned properly with the nasal cavity so that it does not cause excessive discomfort or injury by traveling down the wrong passage. Existing research towards robotic NP swabbing typically assumes the patient's head is held within a fixture. This simplifies the alignment problem, but is also dissimilar to clinical scenarios where patients are typically free-standing. Consequently, our work creates a vision-guided pipeline to allow an instrumented robot arm to properly position and orient NP swabs with respect to the nostrils of free-standing patients. The first component of the pipeline is a precomputed joint lookup table to allow the arm to meet the patient's arbitrary position in the designated workspace, while avoiding joint limits. Our pipeline leverages semantic face models from computer vision to estimate the Euclidean pose of the face with respect to a monocular RGB-D camera placed on the end-effector. These estimates are passed into an unscented Kalman filter on manifolds state estimator and a pose-based visual servo control loop to move the swab to the designated pose in front of the nostril. Our pipeline was validated with human trials, featuring a cohort of 25 participants. The system is effective, reaching the nostril for 84% of participants, and our statistical analysis did not find significant demographic biases within the cohort.
comment: 12 pages, 13 figures
Multi Agent Framework for Collective Intelligence Research
This paper presents a scalable decentralized multi-agent framework that facilitates the exchange of information between computing units through computer networks. The architectural boundaries imposed by the tool make it suitable for collective intelligence research experiments ranging from agents that exchange hello world messages to virtual drone agents exchanging positions and eventually agents exchanging information via radio with real Crazyflie drones in the VU Amsterdam laboratory. The field modulation theory is implemented to construct synthetic local perception maps for agents, based on neighbouring agents' positions and neighbouring points of interest dictated by the environment. By constraining the experimental setup to a 2D environment with discrete actions, constant velocity and parameters tailored to the VU Amsterdam laboratory, Crazyflie UAVs running a hill-climbing controller followed collision-free trajectories and bridged the sim-to-real gap.
Characterization, Experimental Validation and Pilot User Study of the Vibro-Inertial Bionic Enhancement System (VIBES)
This study presents the characterization and validation of the VIBES, a wearable vibrotactile device that provides high-frequency tactile information embedded in a prosthetic socket. A psychophysical characterization involving ten able-bodied participants is performed to compute the Just Noticeable Difference (JND) related to the discrimination of vibrotactile cues delivered on the skin in two forearm positions, with the goal of optimising vibrotactile actuator position to maximise perceptual response. Furthermore, system performance is validated and tested both with ten able-bodied participants and one prosthesis user considering three tasks. More specifically, in the Active Texture Identification, Slippage and Fragile Object Experiments, we investigate if the VIBES could enhance users' roughness discrimination and manual usability and dexterity. Finally, we test the effect of the vibrotactile system on prosthetic embodiment in a Rubber Hand Illusion (RHI) task. Results show the system's effectiveness in conveying contact and texture cues, making it a potential tool to restore sensory feedback and enhance the embodiment in prosthetic users.
Recursive Distributed Collaborative Aided Inertial Navigation
In this dissertation, we investigate the issue of robust localization in swarms of heterogeneous mobile agents with multiple and time-varying sensing modalities. Our focus is the development of filter-based and decoupled estimators under the assumption that agents possess communication and processing capabilities. Based on the findings from Distributed Collaborative State Estimation and modular sensor fusion, we propose a novel Kalman filter decoupling paradigm, which is termed Isolated Kalman Filtering (IKF). This paradigm is formally discussed and the treatment of delayed measurements is studied. The impact of the approximations made is investigated on different observation graphs, and the filter credibility is evaluated on a linear system in a Monte Carlo simulation. Finally, we propose a multi-agent modular sensor fusion approach based on the IKF paradigm, in order to cooperatively estimate the global state of a multi-agent system in a distributed way and fuse information provided by different on-board sensors in a computationally efficient way. As a consequence, this approach can be performed in a distributed manner among agents, while (i) communication between agents is only required at the moment of inter-agent joint observations, (ii) one agent acts as interim master to process state corrections in isolation, (iii) agents can be added to and removed from the swarm, (iv) each agent's full state can vary during the mission (each local sensor suite can be truly modular), and (v) delayed and multi-rate sensor updates are supported. Extensive evaluation on realistic simulated and real-world data sets shows that the proposed IKF paradigm is applicable to both truly modular single-agent estimation and distributed collaborative multi-agent estimation problems.
comment: PhD thesis
Star-shaped Tilted Hexarotor Maneuverability: Analysis of the Role of the Tilt Cant Angles
Star-shaped Tilted Hexarotors are rapidly emerging for applications highly demanding in terms of robustness and maneuverability. To ensure improvement in such features, a careful selection of the tilt angles is mandatory. In this work, we present a rigorous analysis of how the force subspace varies with the tilt cant angles, namely the tilt angles along the vehicle arms, taking into account gravity compensation and torque decoupling to abide by the hovering condition. Novel metrics are introduced to assess the performance of existing tilted platforms, as well as to provide some guidelines for the selection of the tilt cant angle in the design phase.
comment: accepted for presentation at the 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE2024)
Tactile-Morph Skills: Energy-Based Control Meets Data-Driven Learning
Robotic manipulation is essential for modernizing factories and automating industrial tasks like polishing, which require advanced tactile abilities. These robots must be easily set up, safely work with humans, learn tasks autonomously, and transfer skills to similar tasks. Addressing these needs, we introduce the tactile-morph skill framework, which integrates unified force-impedance control with data-driven learning. Our system adjusts robot movements and force application based on estimated energy levels for the desired trajectory and force profile, ensuring safety by stopping if energy allocated for the control runs out. Using a Temporal Convolutional Network, we estimate the energy distribution for a given motion and force profile, enabling skill transfer across different tasks and surfaces. Our approach maintains stability and performance even on unfamiliar geometries with similar friction characteristics, demonstrating improved accuracy, zero-shot transferable performance, and enhanced safety in real-world scenarios. This framework promises to enhance robotic capabilities in industrial settings, making intelligent robots more accessible and valuable.
comment: 15 pages, 7 figures
A Safety-Oriented Self-Learning Algorithm for Autonomous Driving: Evolution Starting from a Basic Model
Autonomous driving vehicles with self-learning capabilities are expected to evolve in complex environments to improve their ability to cope with different scenarios. However, most self-learning algorithms suffer from low learning efficiency and a lack of safety, which limits their applications. This paper proposes a safety-oriented self-learning algorithm for autonomous driving, which focuses on how to achieve evolution from a basic model. Specifically, a basic model based on the transformer encoder is designed to extract and output policy features from a small number of demonstration trajectories. To improve the learning efficiency, a policy mixed approach is developed. The basic model provides initial values to improve exploration efficiency, and the self-learning algorithm enhances the adaptability and generalization of the model, enabling continuous improvement without external intervention. Finally, an actor approximator based on receding horizon optimization is designed considering the constraints of the environmental input to ensure safety. The proposed method is verified in a challenging mixed traffic environment with pedestrians and vehicles. Simulation and real-vehicle test results show that the proposed method can safely and efficiently learn appropriate autonomous driving behaviors. Compared with reinforcement learning and behavior cloning methods, it can achieve comprehensive improvement in learning efficiency and performance under the premise of ensuring safety.
A Safe and Efficient Self-evolving Algorithm for Decision-making and Control of Autonomous Driving Systems
Autonomous vehicles with a self-evolving ability are expected to cope with unknown scenarios in the real-world environment. Taking advantage of a trial-and-error mechanism, reinforcement learning is able to self-evolve by learning the optimal policy, and it is particularly well suited for solving decision-making problems. However, reinforcement learning suffers from safety issues and low learning efficiency, especially in the continuous action space. Therefore, the motivation of this paper is to address the above problem by proposing a hybrid Mechanism-Experience-Learning augmented approach. Specifically, to realize efficient self-evolution, a driving tendency by analogy with human driving experience is proposed to reduce the search space of the autonomous driving problem, while a constrained optimization problem based on a mechanistic model is designed to ensure safety during the self-evolving process. Experimental results show that the proposed method is capable of generating safe and reasonable actions in various complex scenarios, improving the performance of the autonomous driving system. Compared to conventional reinforcement learning, the safety and efficiency of the proposed algorithm are greatly improved. The training process is collision-free, and the training time is equivalent to less than 10 minutes in the real world.
Control-Theoretic Analysis of Shared Control Systems
Users of shared control systems change their behavior in the presence of assistance, which conflicts with assumptions about user behavior that some assistance methods make. In this paper, we propose an analysis technique to evaluate the user's experience with the assistive systems that bypasses required assumptions: we model the assistance as a dynamical system that can be analyzed using control theory techniques. We analyze the shared autonomy assistance algorithm and make several observations: we identify a problem with runaway goal confidence and propose a system adjustment to mitigate it, we demonstrate that the system inherently limits the possible actions available to the user, and we show that in a simplified setting, the effect of the assistance is to drive the system to the convex hull of the goals and, once there, add a layer of indirection between the user control and the system behavior. We conclude by discussing the possible uses of this analysis for the field.
comment: Presented in the Variable Autonomy for Human-Robot Teaming (VAT) workshop at 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) 2024
LLM-enhanced Scene Graph Learning for Household Rearrangement SIGGRAPH
The household rearrangement task involves spotting misplaced objects in a scene and accommodating them in proper places. It depends both on common-sense knowledge on the objective side and human user preference on the subjective side. To achieve this task, we propose to mine object functionality with user preference alignment directly from the scene itself, without relying on human intervention. To do so, we work with a scene graph representation and propose LLM-enhanced scene graph learning, which transforms the input scene graph into an affordance-enhanced graph (AEG) with information-enhanced nodes and newly discovered edges (relations). In the AEG, the nodes corresponding to the receptacle objects are augmented with context-induced affordance, which encodes what kind of carriable objects can be placed on them. New edges are discovered with newly discovered non-local relations. With the AEG, we perform task planning for scene rearrangement by detecting misplaced carriables and determining a proper placement for each of them. We test our method by implementing a tidying robot in a simulator and perform evaluation on a new benchmark we build. Extensive evaluations demonstrate that our method achieves state-of-the-art performance on misplacement detection and the subsequent rearrangement planning.
comment: SIGGRAPH ASIA 2024
Highly Accurate Robot Calibration Using Adaptive and Momental Bound with Decoupled Weight Decay
Within the context of intelligent manufacturing, industrial robots have a pivotal function. Nonetheless, extended operational periods cause a decline in their absolute positioning accuracy, preventing them from meeting high-precision requirements. To address this issue, this paper presents a novel robot calibration algorithm that combines an adaptive and momental bound algorithm with decoupled weight decay (AdaModW), which is built on three ideas: a) adopting an adaptive moment estimation (Adam) algorithm to achieve a high convergence rate, b) introducing a hyperparameter into the Adam algorithm to define the length of memory, effectively addressing the issue of the abnormal learning rate, and c) interpolating a weight decay coefficient to improve its generalization. Numerous experiments on an HRS-JR680 industrial robot show that the presented algorithm significantly outperforms state-of-the-art algorithms in robot calibration performance. Thus, in light of its reliability, this algorithm provides an efficient way to address robot calibration concerns.
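The three ideas listed in the abstract can be sketched as a single update step. This is a minimal illustration, not the authors' implementation: it assumes the commonly described AdaMod rule (per-parameter step sizes clipped by their own exponential moving average, with `beta3` setting the memory length) combined with AdamW-style decoupled weight decay. The function names, default hyperparameters, and state layout are hypothetical.

```python
import numpy as np


def init_state(theta):
    # Optimizer state: step counter, first/second moments, and the
    # running average of step sizes used as the momental bound.
    zeros = np.zeros_like(theta, dtype=float)
    return {"t": 0, "m": zeros.copy(), "v": zeros.copy(), "s": zeros.copy()}


def adamodw_step(theta, grad, state, lr=1e-2, beta1=0.9, beta2=0.999,
                 beta3=0.999, eps=1e-8, weight_decay=1e-2):
    state["t"] += 1
    t = state["t"]

    # (a) Adam moment estimates with bias correction.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad**2
    m_hat = state["m"] / (1 - beta1**t)
    v_hat = state["v"] / (1 - beta2**t)

    # (b) Momental bound: smooth the per-parameter step size over a
    # memory set by beta3 and clip the current step size with it.
    eta = lr / (np.sqrt(v_hat) + eps)
    state["s"] = beta3 * state["s"] + (1 - beta3) * eta
    eta_bounded = np.minimum(eta, state["s"])

    # (c) Decoupled weight decay: shrink the parameters directly
    # instead of adding an L2 term to the gradient.
    theta = theta - lr * weight_decay * theta
    return theta - eta_bounded * m_hat


if __name__ == "__main__":
    # Toy quadratic with gradient theta - target (assumed example).
    target = np.array([1.0, -2.0])
    theta = np.zeros(2)
    state = init_state(theta)
    for _ in range(3000):
        theta = adamodw_step(theta, theta - target, state)
    print(theta)
```

Because the bound starts at zero, early steps are very small and grow as the running average accumulates, which is how this formulation damps abnormally large learning rates at the start of training.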
One-shot Video Imitation via Parameterized Symbolic Abstraction Graphs
Learning to manipulate dynamic and deformable objects from a single demonstration video holds great promise in terms of scalability. Previous approaches have predominantly focused on either replaying object relationships or actor trajectories. The former often struggles to generalize across diverse tasks, while the latter suffers from data inefficiency. Moreover, both methodologies encounter challenges in capturing invisible physical attributes, such as forces. In this paper, we propose to interpret video demonstrations through Parameterized Symbolic Abstraction Graphs (PSAG), where nodes represent objects and edges denote relationships between objects. We further ground geometric constraints through simulation to estimate non-geometric, visually imperceptible attributes. The augmented PSAG is then applied in real robot experiments. Our approach has been validated across a range of tasks, such as Cutting Avocado, Cutting Vegetable, Pouring Liquid, Rolling Dough, and Slicing Pizza. We demonstrate successful generalization to novel objects with distinct visual and physical properties.
comment: Robot Learning, Computer Vision, Learning from Videos
Ten Problems in Geobotics
Robots sense, move and act in the physical world. It is therefore natural that algorithmic problems in robotics and automation have a geometric component, often central to the problem. Below we review ten challenging problems at the intersection of robotics and computational geometry -- let's call this intersection Geobotics. What is common to most of these problems is that the prevalent algorithmic techniques used in robotics do not seem suitable for solving them, or at least do not suggest quality guarantees for the solution. Solving some of them, even partially, can shed light on less well-understood aspects of computation in robotics.
Efficient Sensor Placement from Regression with Sparse Gaussian Processes in Continuous and Discrete Spaces
The sensor placement (SP) problem is a common problem that arises when monitoring correlated phenomena, such as temperature, precipitation, and salinity. Existing approaches to this problem typically formulate it as the maximization of information metrics, such as mutual information (MI), and use optimization methods such as greedy algorithms in discrete domains, and derivative-free optimization methods such as genetic algorithms in continuous domains. However, computing MI for sensor placement requires discretizing the environment, and its computation cost depends on the size of the discretized environment. These limitations restrict these approaches from scaling to large problems. We present a novel formulation of the SP problem based on variational approximation that can be optimized using gradient descent, allowing us to efficiently find solutions in continuous domains. We generalize our method to also handle discrete environments. Our experimental results on four real-world datasets demonstrate that our approach generates sensor placements consistently on par with or better than the prior state-of-the-art approaches in terms of both MI and reconstruction quality, all while being significantly faster. Our computationally efficient approach enables both large-scale sensor placement and fast robotic sensor placement for informative path planning algorithms.
comment: preprint
Learning to Imitate Spatial Organization in Multi-robot Systems IROS 2024
Understanding collective behavior and how it evolves is important to ensure that robot swarms can be trusted in a shared environment. One way to understand the behavior of the swarm is through collective behavior reconstruction using prior demonstrations. Existing approaches often require access to the swarm controller which may not be available. We reconstruct collective behaviors in distinct swarm scenarios involving shared environments without using swarm controller information. We achieve this by transforming prior demonstrations into features that describe multi-agent interactions before behavior reconstruction with multi-agent generative adversarial imitation learning (MA-GAIL). We show that our approach outperforms existing algorithms in spatial organization, and can be used to observe and reconstruct a swarm's behavior for further analysis and testing, which might be impractical or undesirable on the original robot swarm.
comment: 6 pages, 4 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Contextual Tuning of Model Predictive Control for Autonomous Racing
Learning-based model predictive control has been widely applied in autonomous racing to improve the closed-loop behaviour of vehicles in a data-driven manner. When environmental conditions change, e.g., due to rain, often only the predictive model is adapted, but the controller parameters are kept constant. However, this can lead to suboptimal behaviour. In this paper, we address the problem of data-efficient controller tuning, adapting both the model and objective simultaneously. The key novelty of the proposed approach is that we leverage a learned dynamics model to encode the environmental condition as a so-called context. This insight allows us to employ contextual Bayesian optimization to efficiently transfer knowledge across different environmental conditions. Consequently, we require fewer data to find the optimal controller configuration for each context. The proposed framework is extensively evaluated with more than 3'000 laps driven on an experimental platform with 1:28 scale RC race cars. The results show that our approach successfully optimizes the lap time across different contexts requiring fewer data compared to other approaches based on standard Bayesian optimization.
StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection
Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors, and asynchronous capture times of sensor data all introduce difficulties for fusing data from different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them have considered the asynchronous sensor ticking times, which can lead to dynamic object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times and build an efficient fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.
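As a rough illustration of why unsynchronized sensor ticks matter, the sketch below computes how far an object moves during a timestamp offset between two agents. The speed and offset values are assumptions chosen for illustration (a 10 Hz LiDAR allows offsets of up to 0.1 s); they are not taken from the paper.

```python
def misplacement_m(relative_speed_mps: float, time_offset_s: float) -> float:
    """Distance an object travels during the timestamp offset between two scans."""
    return relative_speed_mps * time_offset_s

# Illustrative values only (assumptions, not reported in the abstract):
# an object moving at 15 m/s (54 km/h), observed by two agents whose
# LiDAR sweeps are offset by up to one 10 Hz scan period (0.1 s).
offset = misplacement_m(relative_speed_mps=15.0, time_offset_s=0.1)
print(f"misplacement: {offset:.2f} m")
```

Under these assumed numbers the displacement already exceeds one meter, which is consistent with the order of magnitude the abstract mentions.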
ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models IROS2024
While the integration of Multi-modal Large Language Models (MLLMs) with robotic systems has significantly improved robots' ability to understand and execute natural language instructions, their performance in manipulation tasks remains limited due to a lack of robotics-specific knowledge. Conventional MLLMs are typically trained on generic image-text pairs, leaving them deficient in understanding affordances and physical concepts crucial for manipulation. To address this gap, we propose ManipVQA, a novel framework that infuses MLLMs with manipulation-centric knowledge through a Visual Question-Answering (VQA) format. This approach encompasses tool detection, affordance recognition, and a broader understanding of physical concepts. We curated a diverse dataset of images depicting interactive objects, to challenge robotic understanding in tool detection, affordance prediction, and physical concept comprehension. To effectively integrate this robotics-specific knowledge with the inherent vision-reasoning capabilities of MLLMs, we leverage a unified VQA format and devise a fine-tuning strategy. This strategy preserves the original vision-reasoning abilities while incorporating the newly acquired robotic insights. Empirical evaluations conducted in robotic simulators and across various vision task benchmarks demonstrate the robust performance of ManipVQA. The code and dataset are publicly available at https://github.com/SiyuanHuang95/ManipVQA.
comment: Code and dataset are publicly available at https://github.com/SiyuanHuang95/ManipVQA. Accepted by IROS2024
Tactile Perception in Upper Limb Prostheses: Mechanical Characterization, Human Experiments, and Computational Findings
Our research investigates vibrotactile perception in four prosthetic hands with distinct kinematics and mechanical characteristics. We found that rigid and simple socket-based prosthetic devices can transmit tactile information and surprisingly enable users to identify the stimulated finger with high reliability. This ability decreases with more advanced prosthetic hands with additional articulations and softer mechanics. We conducted experiments to understand the underlying mechanisms. We assessed a prosthetic user's ability to discriminate finger contacts based on vibrations transmitted through the four prosthetic hands. We also performed numerical and mechanical vibration tests on the prostheses and used a machine learning classifier to identify the contacted finger. Our results show that simpler and rigid prosthetic hands facilitate contact discrimination (for instance, a user of a purely cosmetic hand can distinguish a contact on the index finger from other fingers with 83% accuracy), but all tested hands, including soft advanced ones, performed above chance level. Despite advanced hands reducing vibration transmission, a machine learning algorithm still exceeded human performance in discriminating finger contacts. These findings suggest the potential for enhancing vibrotactile feedback in advanced prosthetic hands and lay the groundwork for future integration of such feedback in prosthetic devices.
Energy-Optimized Planning in Non-Uniform Wind Fields with Fixed-Wing Aerial Vehicles
Fixed-wing small uncrewed aerial vehicles (sUAVs) possess the capability to remain airborne for extended durations and traverse vast distances. However, their operation is susceptible to wind conditions, particularly in regions of complex terrain where high wind speeds may push the aircraft beyond its operational limitations, potentially raising safety concerns. Moreover, wind impacts the energy required to follow a path, especially in locations where the wind direction and speed are not favorable. Incorporating wind information into mission planning is essential to ensure both safety and energy efficiency. In this paper, we propose a sampling-based planner using the kinematic Dubins aircraft paths with respect to the ground, to plan energy-efficient paths in non-uniform wind fields. We study the planner characteristics with synthetic and real-world wind data and compare its performance against baseline cost and path formulations. We demonstrate that the energy-optimized planner effectively utilizes updrafts to minimize energy consumption, albeit at the expense of increased travel time. The ground-relative path formulation facilitates the generation of safe trajectories onboard sUAVs within reasonable computational timeframes.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
MuTT: A Multimodal Trajectory Transformer for Robot Skills
High-level robot skills represent an increasingly popular paradigm in robot programming. However, configuring the skills' parameters for a specific task remains a manual and time-consuming endeavor. Existing approaches for learning or optimizing these parameters often require numerous real-world executions or do not work in dynamic environments. To address these challenges, we propose MuTT, a novel encoder-decoder transformer architecture designed to predict environment-aware executions of robot skills by integrating vision, trajectory, and robot skill parameters. Notably, we pioneer the fusion of vision and trajectory, introducing a novel trajectory projection. Furthermore, we illustrate MuTT's efficacy as a predictor when combined with a model-based robot skill optimizer. This approach facilitates the optimization of robot skill parameters for the current environment, without the need for real-world executions during optimization. Designed for compatibility with any representation of robot skills, MuTT demonstrates its versatility across three comprehensive experiments, showcasing superior performance across two different skill representations.
CGGM: A conditional graph generation model with adaptive sparsity for node anomaly detection in IoT networks
Dynamic graphs are extensively employed for detecting anomalous behavior in nodes within the Internet of Things (IoT). Graph generative models are often used to address the issue of imbalanced node categories in dynamic graphs. Nevertheless, the constraints they face include the monotonicity of adjacency relationships, the difficulty in constructing multi-dimensional features for nodes, and the lack of a method for end-to-end generation of multiple categories of nodes. In this paper, we propose a novel graph generation model, called CGGM, specifically for generating samples belonging to the minority class. The framework consists of two core modules: a conditional graph generation module and a graph-based anomaly detection module. The generative module adapts to the sparsity of the matrix by downsampling a noise adjacency matrix, and incorporates a multi-dimensional feature encoder based on multi-head self-attention to capture latent dependencies among features. Additionally, a latent space constraint is combined with the distribution distance to approximate the latent distribution of real data. The graph-based anomaly detection module utilizes the generated balanced dataset to predict the node behaviors. Extensive experiments have shown that CGGM outperforms the state-of-the-art methods in terms of accuracy and divergence. The results also demonstrate that CGGM can generate diverse data categories, enhancing the performance of multi-category classification tasks.
comment: 23 pages, 19 figures
GRADE: Generating Realistic And Dynamic Environments for Robotics Research with Isaac Sim
Synthetic data and novel rendering techniques have greatly influenced computer vision research in tasks like target tracking and human pose estimation. However, robotics research has lagged behind in leveraging it due to the limitations of most simulation frameworks, including the lack of low-level software control and flexibility, Robot Operating System integration, realistic physics, or photorealism. This hindered progress in (visual-)perception research, e.g. in autonomous robotics, especially in dynamic environments. Visual Simultaneous Localization and Mapping (V-SLAM), for instance, has been mostly developed passively, in static environments, and evaluated on few pre-recorded dynamic datasets due to the difficulties of realistically simulating dynamic worlds and the huge sim-to-real gap. To address these challenges, we present GRADE (Generating Realistic and Dynamic Environments), a highly customizable framework built upon NVIDIA Isaac Sim. We leverage Isaac's rendering capabilities and low-level APIs to populate and control the simulation, collect ground-truth data, and test online and offline approaches. Importantly, we introduce a new way to precisely repeat a recorded experiment within a physically enabled simulation while allowing environmental and simulation changes. Next, we collect a synthetic dataset of richly annotated videos in dynamic environments with a flying drone. Using that, we train detection and segmentation models for humans, closing the syn-to-real gap. Finally, we benchmark state-of-the-art dynamic V-SLAM algorithms, revealing their short tracking times and low generalization capabilities. We also show for the first time that the top-performing deep learning models do not achieve the best SLAM performance. Code and data are provided as open-source at https://grade.is.tue.mpg.de.
comment: 33 pages, 11 tables, 7 figures
Robust Policy Learning via Offline Skill Diffusion AAAI
Skill-based reinforcement learning (RL) approaches have shown considerable promise, especially in solving long-horizon tasks via hierarchical structures. These skills, learned task-agnostically from offline datasets, can accelerate the policy learning process for new tasks. Yet, the application of these skills in different domains remains restricted due to their inherent dependency on the datasets, which poses a challenge when attempting to learn a skill-based policy via RL for a target domain different from the datasets' domains. In this paper, we present a novel offline skill learning framework, DuSkill, which employs a guided Diffusion model to generate versatile skills extended from the limited skills in datasets, thereby enhancing the robustness of policy learning for tasks in different domains. Specifically, we devise a guided diffusion-based skill decoder in conjunction with the hierarchical encoding to disentangle the skill embedding space into two distinct representations, one for encapsulating domain-invariant behaviors and the other for delineating the factors that induce domain variations in the behaviors. Our DuSkill framework enhances the diversity of skills learned offline, thus accelerating the learning procedure of high-level policies for different domains. Through experiments, we show that DuSkill outperforms other skill-based imitation learning and RL algorithms for several long-horizon tasks, demonstrating its benefits in few-shot imitation and online RL.
comment: 11 pages, 6 figures; Accepted for AAAI Conference on Artificial Intelligence (AAAI 2024); Published version
Text2Interaction: Establishing Safe and Preferable Human-Robot Interaction
Adjusting robot behavior to human preferences can require intensive human feedback, preventing quick adaptation to new users and changing circumstances. Moreover, current approaches typically treat user preferences as a reward, which requires a manual balance between task success and user satisfaction. To integrate new user preferences in a zero-shot manner, our proposed Text2Interaction framework invokes large language models to generate a task plan, motion preferences as Python code, and parameters of a safe controller. By maximizing the combined probability of task completion and user satisfaction instead of a weighted sum of rewards, we can reliably find plans that fulfill both requirements. We find that 83% of users working with Text2Interaction agree that it integrates their preferences into the robot's plan, and 94% prefer Text2Interaction over the baseline. Our ablation study shows that Text2Interaction aligns better with unseen preferences than other baselines while maintaining a high success rate.
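To make the contrast between a weighted sum of rewards and a combined probability concrete, here is a minimal sketch. The candidate plans, their probabilities, the equal weighting, and the independence assumption behind the product are all hypothetical and chosen for illustration; the paper's actual formulation may differ.

```python
# Hypothetical candidate plans: (name, P(task success), P(user satisfied)).
CANDIDATES = [
    ("fast_but_disliked", 1.00, 0.62),
    ("balanced", 0.80, 0.80),
    ("liked_but_fragile", 0.55, 0.95),
]

def weighted_sum(p_task: float, p_user: float, w: float = 0.5) -> float:
    """Baseline score: a manually weighted sum of the two terms."""
    return w * p_task + (1.0 - w) * p_user

def joint_probability(p_task: float, p_user: float) -> float:
    """Combined probability that both hold, assuming independence (an assumption)."""
    return p_task * p_user

def best_plan(score) -> str:
    """Return the name of the candidate that maximizes the given score."""
    return max(CANDIDATES, key=lambda c: score(c[1], c[2]))[0]

print(best_plan(weighted_sum))       # plan preferred by the weighted sum
print(best_plan(joint_probability))  # plan preferred by the joint probability
```

With these made-up numbers, the weighted sum prefers the plan that neglects user satisfaction, whereas the product prefers the plan that does reasonably well on both terms, because a low value in either factor pulls the product down.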
Multiagent Systems
MEDCO: Medical Education Copilots Based on A Multi-Agent Framework
Large language models (LLMs) have had a significant impact on diverse research domains, including medicine and healthcare. However, the potential of LLMs as copilots in medical education remains underexplored. Current AI-assisted educational tools are limited by their solitary learning approach and inability to simulate the multi-disciplinary and interactive nature of actual medical training. To address these limitations, we propose MEDCO (Medical EDucation COpilots), a novel multi-agent-based copilot system specially developed to emulate real-world medical training environments. MEDCO incorporates three primary agents: an agentic patient, an expert doctor, and a radiologist, facilitating a multi-modal and interactive learning environment. Our framework emphasizes the learning of proficient question-asking skills, multi-disciplinary collaboration, and peer discussions between students. Our experiments show that simulated virtual students who underwent training with MEDCO not only achieved substantial performance enhancements comparable to those of advanced models, but also demonstrated human-like learning behaviors and improvements, coupled with an increase in the number of learning samples. This work contributes to medical education by introducing a copilot that implements an interactive and collaborative learning approach. It also provides valuable insights into the effectiveness of AI-integrated training paradigms.
Balancing Act: Prioritization Strategies for LLM-Designed Restless Bandit Rewards
LLMs are increasingly used to design reward functions based on human preferences in Reinforcement Learning (RL). We focus on LLM-designed rewards for Restless Multi-Armed Bandits, a framework for allocating limited resources among agents. In applications such as public health, this approach empowers grassroots health workers to tailor automated allocation decisions to community needs. In the presence of multiple agents, altering the reward function based on human preferences can impact subpopulations very differently, leading to complex tradeoffs and a multi-objective resource allocation problem. We are the first to present a principled method termed Social Choice Language Model for dealing with these tradeoffs for LLM-designed rewards for multiagent planners in general and restless bandits in particular. The novel part of our model is a transparent and configurable selection component, called an adjudicator, external to the LLM that controls complex tradeoffs via a user-selected social welfare function. Our experiments demonstrate that our model reliably selects more effective, aligned, and balanced reward functions compared to purely LLM-based approaches.
Learning to Imitate Spatial Organization in Multi-robot Systems IROS 2024
Understanding collective behavior and how it evolves is important to ensure that robot swarms can be trusted in a shared environment. One way to understand the behavior of the swarm is through collective behavior reconstruction using prior demonstrations. Existing approaches often require access to the swarm controller which may not be available. We reconstruct collective behaviors in distinct swarm scenarios involving shared environments without using swarm controller information. We achieve this by transforming prior demonstrations into features that describe multi-agent interactions before behavior reconstruction with multi-agent generative adversarial imitation learning (MA-GAIL). We show that our approach outperforms existing algorithms in spatial organization, and can be used to observe and reconstruct a swarm's behavior for further analysis and testing, which might be impractical or undesirable on the original robot swarm.
comment: 6 pages, 4 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Language Agents as Optimizable Graphs
Various human-designed prompt engineering techniques have been proposed to improve problem solvers based on Large Language Models (LLMs), yielding many disparate code bases. We unify these approaches by describing LLM-based agents as computational graphs. The nodes implement functions to process multimodal data or query LLMs, and the edges describe the information flow between operations. Graphs can be recursively combined into larger composite graphs representing hierarchies of inter-agent collaboration (where edges connect operations of different agents). Our novel automatic graph optimizers (1) refine node-level LLM prompts (node optimization) and (2) improve agent orchestration by changing graph connectivity (edge optimization). Experiments demonstrate that our framework can be used to efficiently develop, integrate, and automatically improve various LLM agents. The code can be found at https://github.com/metauto-ai/gptswarm.
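The idea of describing an agent as a computational graph can be sketched in a few lines. This is not the GPTSwarm API: `run_graph`, the node names, and the string-returning stand-ins for LLM calls are hypothetical, and the example only shows how nodes (functions) and edges (information flow) determine the result.

```python
from graphlib import TopologicalSorter

def run_graph(nodes, edges, graph_input):
    """Execute nodes in dependency order.

    nodes: dict mapping node name -> callable taking a list of inputs.
    edges: list of (source, destination) pairs describing information flow.
    Nodes without predecessors receive [graph_input].
    """
    preds = {name: [] for name in nodes}
    for src, dst in edges:
        preds[dst].append(src)
    outputs = {}
    for name in TopologicalSorter(preds).static_order():
        args = [outputs[p] for p in preds[name]] or [graph_input]
        outputs[name] = nodes[name](args)
    return outputs

# Stand-ins for LLM queries (hypothetical; a real node would call a model).
nodes = {
    "draft": lambda xs: f"draft({xs[0]})",
    "critique": lambda xs: f"critique({xs[0]})",
    "final": lambda xs: " + ".join(xs),
}
edges = [("draft", "critique"), ("draft", "final"), ("critique", "final")]

print(run_graph(nodes, edges, "question")["final"])
# Changing connectivity, e.g. dropping the critique -> final edge,
# changes what the final node sees:
print(run_graph(nodes, edges[:2], "question")["final"])
```

In this picture, node optimization would correspond to changing what each callable does (for example, its prompt), and edge optimization to changing the `edges` list.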
comment: Project Website: https://gptswarm.org ; Github Repo: https://github.com/metauto-ai/gptswarm . In Forty-first International Conference on Machine Learning (2024)
Robotics
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation
Modern machine learning systems rely on large datasets to attain broad generalization, and this often poses a challenge in robot learning, where each robotic platform and task might have only a small dataset. By training a single policy across many different kinds of robots, a robot learning method can leverage much broader and more diverse datasets, which in turn can lead to better generalization and robustness. However, training a single policy on multi-robot data is challenging because robots can have widely varying sensors, actuators, and control frequencies. We propose CrossFormer, a scalable and flexible transformer-based policy that can consume data from any embodiment. We train CrossFormer on the largest and most diverse dataset to date, 900K trajectories across 20 different robot embodiments. We demonstrate that the same network weights can control vastly different robots, including single and dual arm manipulation systems, wheeled robots, quadcopters, and quadrupeds. Unlike prior work, our model does not require manual alignment of the observation or action spaces. Extensive experiments in the real world show that our method matches the performance of specialist policies tailored for each embodiment, while also significantly outperforming the prior state of the art in cross-embodiment learning.
comment: Project website at https://crossformer-model.github.io/
EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Embodied tasks require the agent to fully understand 3D scenes simultaneously with its exploration, so an online, real-time, fine-grained and highly-generalized 3D perception model is desperately needed. Since high-quality 3D data is limited, directly training such a model in 3D is almost infeasible. Meanwhile, vision foundation models (VFM) have revolutionized the field of 2D computer vision with superior performance, which makes the use of VFM to assist embodied 3D perception a promising direction. However, most existing VFM-assisted 3D perception methods are either offline or too slow to be applied in practical embodied tasks. In this paper, we aim to leverage the Segment Anything Model (SAM) for real-time 3D instance segmentation in an online setting. This is a challenging problem since future frames are not available in the input streaming RGB-D video, and an instance may be observed in several frames, so object matching between frames is required. To address these challenges, we first propose a geometric-aware query lifting module to represent the 2D masks generated by SAM by 3D-aware queries, which are then iteratively refined by a dual-level query decoder. In this way, the 2D masks are transferred to fine-grained shapes on 3D point clouds. Benefiting from the query representation for 3D masks, we can compute the similarity matrix between the 3D masks from different views by efficient matrix operations, which enables real-time inference. Experiments on ScanNet, ScanNet200, SceneNN and 3RScan show our method achieves leading performance even compared with offline methods. Our method also demonstrates great generalization ability in several zero-shot dataset transfer experiments and shows great potential in open-vocabulary and data-efficient settings. Code and demo are available at https://xuxw98.github.io/ESAM/, with only one RTX 3090 GPU required for training and evaluation.
comment: Project page: https://xuxw98.github.io/ESAM/
Informed, Constrained, Aligned: A Field Analysis on Degeneracy-aware Point Cloud Registration in the Wild
The ICP registration algorithm has been a preferred method for LiDAR-based robot localization for nearly a decade. However, even in modern SLAM solutions, ICP can degrade and become unreliable in geometrically ill-conditioned environments. Current solutions primarily focus on utilizing additional sources of information, such as external odometry, to either replace the degenerate directions of the optimization solution or add additional constraints in a sensor-fusion setup afterward. In response, this work investigates and compares new and existing degeneracy mitigation methods for robust LiDAR-based localization and analyzes the efficacy of these approaches in degenerate environments for the first time in the literature at this scale. Specifically, this work proposes and investigates i) the incorporation of different types of constraints into the ICP algorithm, ii) the effect of using active or passive degeneracy mitigation techniques, and iii) the choice of utilizing global point cloud registration methods on the ill-conditioned ICP problem in LiDAR degenerate environments. The study results are validated through multiple real-world field and simulated experiments. The analysis shows that active optimization degeneracy mitigation is necessary and advantageous in the absence of reliable external estimate assistance for LiDAR-SLAM. Furthermore, introducing degeneracy-aware hard constraints in the optimization before or during the optimization is shown to perform better in the wild than by including the constraints after. Moreover, with heuristic fine-tuned parameters, soft constraints can provide equal or better results in complex ill-conditioned scenarios. The implementations used in the analysis of this work are made publicly available to the community.
comment: Submitted to IEEE Transactions on Field Robotics
ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation
Learning from demonstrations has been shown to be an effective approach to robotic manipulation, especially with the recently collected large-scale robot data from teleoperation systems. Building an efficient teleoperation system across diverse robot platforms has become more crucial than ever. However, there is a notable lack of cost-effective and user-friendly teleoperation systems for different end-effectors, e.g., anthropomorphic robot hands and grippers, that can operate across multiple platforms. To address this issue, we develop ACE, a cross-platform visual-exoskeleton system for low-cost dexterous teleoperation. Our system utilizes a hand-facing camera to capture 3D hand poses and an exoskeleton mounted on a portable base, enabling accurate real-time capture of both finger and wrist poses. Compared to previous systems, which often require hardware customization according to different robots, our single system can generalize to humanoid hands, arm-hands, arm-gripper, and quadruped-gripper systems with high-precision teleoperation. This enables imitation learning for complex manipulation tasks on diverse platforms.
comment: Webpage: https://ace-teleop.github.io/
An Advanced Microscopic Energy Consumption Model for Automated Vehicle: Development, Calibration, Verification
The automated vehicle (AV) equipped with the Adaptive Cruise Control (ACC) system is expected to reduce the fuel consumption for the intelligent transportation system. This paper presents the Advanced ACC-Micro (AA-Micro) model, a new energy consumption model based on micro trajectory data, calibrated and verified by empirical data. Utilizing a commercial AV equipped with the ACC system as the test platform, experiments were conducted at the Columbus 151 Speedway, capturing data from multiple ACC and Human-Driven (HV) test runs. The calibrated AA-Micro model integrates features from traditional energy consumption models and demonstrates superior goodness of fit, achieving an impressive 90% accuracy in predicting ACC system energy consumption without overfitting. A comprehensive statistical evaluation of the AA-Micro model's applicability and adaptability in predicting energy consumption and vehicle trajectories indicated strong model consistency and reliability for ACC vehicles, evidenced by minimal variance in RMSE values and uniform RSS distributions. Conversely, significant discrepancies were observed when applying the model to HV data, underscoring the necessity for specialized models to accurately predict energy consumption for HV and ACC systems, potentially due to their distinct energy consumption characteristics.
D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models
Collaborative robots are increasingly popular for assisting humans at work and daily tasks. However, designing and setting up interfaces for human-robot collaboration is challenging, requiring the integration of multiple components, from perception and robot task control to the hardware itself. Frequently, this leads to highly customized solutions that rely on large amounts of costly training data, diverging from the ideal of flexible and general interfaces that empower robots to perceive and adapt to unstructured environments where they can naturally collaborate with humans. To overcome these challenges, this paper presents the Detection-Robot Management GPT (D-RMGPT), a robot-assisted assembly planner based on Large Multimodal Models (LMM). This system can assist inexperienced operators in assembly tasks without requiring any markers or previous training. D-RMGPT is composed of DetGPT-V and R-ManGPT. DetGPT-V, based on GPT-4V(vision), perceives the surrounding environment through one-shot analysis of prompted images of the current assembly stage and the list of components to be assembled. It identifies which components have already been assembled by analysing their features and assembly requirements. R-ManGPT, based on GPT-4, plans the next component to be assembled and generates the robot's discrete actions to deliver it to the human co-worker. Experimental tests on assembling a toy aircraft demonstrated that D-RMGPT is flexible and intuitive to use, achieving an assembly success rate of 83% while reducing the assembly time for inexperienced operators by 33% compared to the manual process. http://robotics-and-ai.github.io/LMMmodels/
Bayesian Optimization Framework for Efficient Fleet Design in Autonomous Multi-Robot Exploration
This study addresses the challenge of fleet design optimization in the context of heterogeneous multi-robot fleets, aiming to obtain feasible designs that balance performance and costs. In the domain of autonomous multi-robot exploration, reinforcement learning agents play a central role, offering adaptability to complex terrains and facilitating collaboration among robots. However, modifying the fleet composition results in changes in the learned behavior, and training multi-robot systems using multi-agent reinforcement learning is expensive. Therefore, an exhaustive evaluation of each potential fleet design is infeasible. To tackle these hurdles, we introduce Bayesian Optimization for Fleet Design (BOFD), a framework leveraging multi-objective Bayesian Optimization to explore fleets on the Pareto front of performance and cost while accounting for uncertainty in the design space. Moreover, we establish a sub-linear bound for cumulative regret, supporting BOFD's robustness and efficacy. Extensive benchmark experiments in synthetic and simulated environments demonstrate the superiority of our framework over state-of-the-art methods, achieving efficient fleet designs with minimal fleet evaluations.
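For readers unfamiliar with the term, the Pareto front of performance and cost is the set of designs that no other design beats on both objectives at once. The sketch below filters a small list of fleet designs by brute force; the design names and numbers are hypothetical, and it does not reproduce BOFD's Bayesian optimization, which is what avoids evaluating every design.

```python
def pareto_front(designs):
    """Return names of non-dominated designs.

    designs: list of (name, performance, cost); higher performance and
    lower cost are better. A design is dominated if another design is at
    least as good on both objectives and strictly better on one.
    """
    front = []
    for name, perf, cost in designs:
        dominated = any(
            p >= perf and c <= cost and (p > perf or c < cost)
            for _, p, c in designs
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical fleet designs: (name, exploration performance, cost).
fleets = [
    ("A", 0.90, 10.0),
    ("B", 0.70, 5.0),
    ("C", 0.60, 8.0),   # worse and more expensive than B
    ("D", 0.90, 12.0),  # same performance as A, but more expensive
]
print(pareto_front(fleets))
```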
Collaborative Robot Arm Inserting Nasopharyngeal Swabs with Admittance Control
The nasopharyngeal (NP) swab sample test, commonly used to detect COVID-19 and other respiratory illnesses, involves moving a swab through the nasal cavity to collect samples from the nasopharynx. While typically this is done by human healthcare workers, there is a significant societal interest in enabling robots to do this test to reduce exposure to patients and to free up human resources. The task is challenging from the robotics perspective because of the dexterity and safety requirements. While other works have implemented specific hardware solutions, our research differentiates itself by using a ubiquitous rigid robotic arm. This work presents a case study where we investigate the strengths and challenges of using a compliant control system to accomplish NP swab tests with such a robotic configuration. To accomplish this, we designed a force-sensing end-effector that integrates with the proposed torque-controlled compliant control loop. We then conducted experiments where the robot inserted NP swabs into a 3D printed nasal cavity phantom. Ultimately, we found that the compliant control system outperformed a basic position controller and shows promise for human use. However, further efforts are needed to ensure the initial alignment with the nostril and to address head motion.
comment: 13 pages, 9 figures. See https://uwaterloo.ca/scholar/pqjlee/collaborative-robot-arm-inserting-nasopharyngeal-swabs-admittance-control for supplementary data
Online state vector reduction during model predictive control with gradient-based trajectory optimisation
Non-prehensile manipulation in high-dimensional systems is challenging for a variety of reasons; one of the main ones is the long planning times that come with a large state space. Trajectory optimisation algorithms have proved their utility in a wide variety of tasks but, like most methods, struggle to scale to the high-dimensional systems ubiquitous in non-prehensile manipulation in clutter and in deformable object manipulation. We reason that, during manipulation, different degrees of freedom will become more or less important to the task over time as the system evolves. We leverage this idea to reduce the number of degrees of freedom considered in a trajectory optimisation problem, and thereby reduce planning times. This idea is particularly relevant in the context of model predictive control (MPC), where the cost landscape of the optimisation problem is constantly evolving. We provide simulation results under asynchronous MPC and show that our methods achieve better overall performance due to the decreased policy lag whilst still being able to optimise trajectories effectively.
comment: 18 pages, 4 figures, accepted to WAFR 2024
Enhanced Visual SLAM for Collision-free Driving with Lightweight Autonomous Cars
The paper presents a vision-based obstacle avoidance strategy for lightweight self-driving cars that can be run on a CPU-only device using a single RGB-D camera. The method consists of two steps: visual perception and path planning. The visual perception part uses ORBSLAM3 enhanced with optical flow to estimate the car's poses and extract rich texture information from the scene. In the path planning phase, we employ a method combining a control Lyapunov function and a control barrier function in the form of a quadratic program (CLF-CBF-QP), together with an obstacle shape reconstruction process (SRP), to plan safe and stable trajectories. To validate the performance and robustness of the proposed method, simulation experiments were conducted with a car in various complex indoor environments using the Gazebo simulation environment. Our method can effectively avoid obstacles in the scenes. The proposed algorithm outperforms benchmark algorithms in achieving more stable and shorter trajectories across multiple simulated scenes.
comment: 16 pages; Submitted to a journal
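To make the barrier-function part of the entry above concrete, here is a simplified sketch and not the paper's formulation. It assumes a point robot with single-integrator dynamics (velocity is the control), one circular obstacle, and the barrier h(x) = ||x - c||^2 - r^2. The Lyapunov constraint of a full CLF-CBF-QP is replaced here by a plain proportional go-to-goal velocity, and because only one linear constraint remains, the quadratic program has a closed-form solution. All names and numbers are hypothetical.

```python
import math


def safe_velocity(pos, goal, obstacle, radius, gain=1.0, alpha=1.0):
    """Minimally modify a go-to-goal velocity so that dh/dt >= -alpha * h.

    Solves: min ||u - u_nom||^2  subject to  a . u >= b,
    with a = 2 * (pos - obstacle) and b = -alpha * h.
    Assumes pos is not exactly at the obstacle centre (a would be zero).
    """
    u_nom = [gain * (g - p) for g, p in zip(goal, pos)]
    rel = [p - c for p, c in zip(pos, obstacle)]
    h = sum(r * r for r in rel) - radius ** 2
    a = [2.0 * r for r in rel]
    b = -alpha * h
    a_dot_u = sum(ai * ui for ai, ui in zip(a, u_nom))
    if a_dot_u >= b:
        return u_nom  # nominal command already satisfies the barrier condition
    # Otherwise project u_nom onto the constraint boundary a . u = b.
    scale = (b - a_dot_u) / sum(ai * ai for ai in a)
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]


def simulate(start, goal, obstacle, radius, dt=0.01, steps=2000):
    """Integrate the filtered velocity with explicit Euler steps."""
    pos = list(start)
    path = [tuple(pos)]
    for _ in range(steps):
        u = safe_velocity(pos, goal, obstacle, radius)
        pos = [p + dt * ui for p, ui in zip(pos, u)]
        path.append(tuple(pos))
    return path


path = simulate(start=(-2.0, 0.5), goal=(2.0, 0.0), obstacle=(0.0, 0.0), radius=1.0)
closest_approach = min(math.hypot(x, y) for x, y in path)
```

In this sketch the robot slides around the obstacle rather than stopping in front of it, because only the velocity component pointing into the obstacle is reduced and the tangential component is left untouched.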
RaNDT SLAM: Radar SLAM Based on Intensity-Augmented Normal Distributions Transform
Rescue robotics places high demands on perception algorithms due to the unstructured and potentially vision-denied environments. Pivoting Frequency-Modulated Continuous Wave radars are an emerging sensing modality for SLAM in this kind of environment. However, the complex noise characteristics of radar make SLAM applications, particularly indoor ones, computationally demanding and slow. In this work, we introduce a novel radar SLAM framework, RaNDT SLAM, that operates fast and generates accurate robot trajectories. The method is based on the Normal Distributions Transform augmented by radar intensity measures. Motion estimation is based on the fusion of a motion model, IMU data, and registration of the intensity-augmented Normal Distributions Transform. We evaluate RaNDT SLAM on a new benchmark dataset and the Oxford Radar RobotCar dataset. The new dataset contains indoor and outdoor environments as well as multiple sensing modalities (LiDAR, radar, and IMU).
comment: This work was accepted by the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024
A Survey of Embodied Learning for Object-Centric Robotic Manipulation
Embodied learning for object-centric robotic manipulation is a rapidly developing and challenging area in embodied AI. It is crucial for advancing next-generation intelligent robots and has garnered significant interest recently. Unlike data-driven machine learning methods, embodied learning focuses on robot learning through physical interaction with the environment and perceptual feedback, making it especially suitable for robotic manipulation. In this paper, we provide a comprehensive survey of the latest advancements in this field and categorize the existing work into three main branches: 1) Embodied perceptual learning, which aims to predict object pose and affordance through various data representations; 2) Embodied policy learning, which focuses on generating optimal robotic decisions using methods such as reinforcement learning and imitation learning; 3) Embodied task-oriented learning, designed to optimize the robot's performance based on the characteristics of different tasks in object grasping and manipulation. In addition, we offer an overview and discussion of public datasets, evaluation metrics, representative applications, current challenges, and potential future research directions. A project associated with this survey has been established at https://github.com/RayYoh/OCRM_survey.
Long-Range Vision-Based UAV-assisted Localization for Unmanned Surface Vehicles
The global positioning system (GPS) has become an indispensable navigation method for field operations with unmanned surface vehicles (USVs) in marine environments. However, GPS may not always be available outdoors because it is vulnerable to natural interference and malicious jamming attacks. Thus, an alternative navigation system is required when the use of GPS is restricted or prohibited. To this end, we present a novel method that utilizes an Unmanned Aerial Vehicle (UAV) to assist in localizing USVs in GNSS-restricted marine environments. In our approach, the UAV flies along the shoreline at a consistent altitude, continuously tracking and detecting the USV using a deep learning-based approach on camera images. Subsequently, triangulation techniques are applied to estimate the USV's position relative to the UAV, utilizing geometric information and datalink range from the UAV. We propose adjusting the UAV's camera angle based on the pixel error between the USV and the image center throughout the localization process to enhance accuracy. Additionally, visual measurements are integrated into an Extended Kalman Filter (EKF) for robust state estimation. To validate our proposed method, we utilize a USV equipped with onboard sensors and a UAV equipped with a camera. A heterogeneous robotic interface is established to facilitate communication between the USV and UAV. We demonstrate the efficacy of our approach through a series of experiments conducted during the "Muhammad Bin Zayed International Robotic Challenge (MBZIRC-2024)" in real marine environments, incorporating noisy measurements and ocean disturbances. The successful outcomes indicate the potential of our method to complement GPS for USV navigation.
AS-LIO: Spatial Overlap Guided Adaptive Sliding Window LiDAR-Inertial Odometry for Aggressive FOV Variation
LiDAR-Inertial Odometry (LIO) demonstrates outstanding accuracy and stability in general low-speed and smooth motion scenarios. However, in high-speed and intense motion scenarios, such as sharp turns, two primary challenges arise: firstly, due to the limitations of IMU frequency, the error in estimating significantly non-linear motion states escalates; secondly, drastic changes in the Field of View (FOV) may diminish the spatial overlap between LiDAR frame and pointcloud map (or between frames), leading to insufficient data association and constraint degradation. To address these issues, we propose a novel Adaptive Sliding window LIO framework (AS-LIO) guided by the Spatial Overlap Degree (SOD). Initially, we assess the SOD between the LiDAR frames and the registered map, directly evaluating the adverse impact of current FOV variation on pointcloud alignment. Subsequently, we design an adaptive sliding window to manage the continuous LiDAR stream and control state updates, dynamically adjusting the update step according to the SOD. This strategy enables our odometry to adaptively adopt higher update frequency to precisely characterize trajectory during aggressive FOV variation, thus effectively reducing the non-linear error in positioning. Meanwhile, the historical constraints within the sliding window reinforce the frame-to-map data association, ensuring the robustness of state estimation. Experiments show that our AS-LIO framework can quickly perceive and respond to challenging FOV change, outperforming other state-of-the-art LIO frameworks in terms of accuracy and robustness.
comment: 8 pages, 6 figures
Subgoal-based Hierarchical Reinforcement Learning for Multi-Agent Collaboration
Recent advancements in reinforcement learning have made significant impacts across various domains, yet they often struggle in complex multi-agent environments due to issues like algorithm instability, low sampling efficiency, and the challenges of exploration and dimensionality explosion. Hierarchical reinforcement learning (HRL) offers a structured approach to decompose complex tasks into simpler sub-tasks, which is promising for multi-agent settings. This paper advances the field by introducing a hierarchical architecture that autonomously generates effective subgoals without explicit constraints, enhancing both flexibility and stability in training. We propose a dynamic goal generation strategy that adapts based on environmental changes. This method significantly improves the adaptability and sample efficiency of the learning process. Furthermore, we address the critical issue of credit assignment in multi-agent systems by synergizing our hierarchical architecture with a modified QMIX network, thus improving overall strategy coordination and efficiency. Comparative experiments with mainstream reinforcement learning algorithms demonstrate the superior convergence speed and performance of our approach in both single-agent and multi-agent environments, confirming its effectiveness and flexibility in complex scenarios. Our code is open-sourced at: https://github.com/SICC-Group/GMAH.
Reflex-Based Open-Vocabulary Navigation without Prior Knowledge Using Omnidirectional Camera and Multiple Vision-Language Models
Various robot navigation methods have been developed, but they are mainly based on Simultaneous Localization and Mapping (SLAM), reinforcement learning, etc., which require prior map construction or learning. In this study, we consider the simplest method that does not require any map construction or learning, and execute open-vocabulary navigation of robots without any prior knowledge. We applied an omnidirectional camera and pre-trained vision-language models to the robot. The omnidirectional camera provides a uniform view of the surroundings, thus eliminating the need for complicated exploratory behaviors including trajectory generation. By applying multiple pre-trained vision-language models to this omnidirectional image and incorporating reflective behaviors, we show that navigation becomes simple and does not require any prior setup. Interesting properties and limitations of our method are discussed based on experiments with the mobile robot Fetch.
comment: Accepted at Advanced Robotics, website - https://haraduka.github.io/omnidirectional-vlm/
Deep Reinforcement Learning for Decentralized Multi-Robot Control: A DQN Approach to Robustness and Information Integration
The superiority of Multi-Robot Systems (MRS) in various complex environments is unquestionable. However, in complex situations such as search and rescue, environmental monitoring, and automated production, robots are often required to work collaboratively without a central control unit. This necessitates an efficient and robust decentralized control mechanism to process local information and guide the robots' behavior. In this work, we propose a new decentralized controller design method that utilizes the Deep Q-Network (DQN) algorithm from deep reinforcement learning, aimed at improving the integration of local information and robustness of multi-robot systems. The designed controller allows each robot to make decisions independently based on its local observations while enhancing the overall system's collaborative efficiency and adaptability to dynamic environments through a shared learning mechanism. Through testing in simulated environments, we have demonstrated the effectiveness of this controller in improving task execution efficiency, strengthening system fault tolerance, and enhancing adaptability to the environment. Furthermore, we explored the impact of DQN parameter tuning on system performance, providing insights for further optimization of the controller design. Our research not only showcases the potential application of the DQN algorithm in the decentralized control of multi-robot systems but also offers a new perspective on how to enhance the overall performance and robustness of the system through the integration of local information.
comment: Multi-robot system, Reinforcement learning, Information integrated
ViIK: Flow-based Vision Inverse Kinematics Solver with Fusing Collision Checking
Inverse Kinematics (IK) finds the robot configurations that satisfy the target pose of the end effector. In motion planning, diverse configurations are required in case a feasible trajectory is not found. Meanwhile, collision checking (CC), e.g. Oriented Bounding Box (OBB), Discrete Oriented Polytope (DOP), and Quickhull \cite{quickhull}, needs to be done for each configuration provided by the IK solver to ensure every goal configuration for motion planning is available. This means the classical IK solver and CC algorithm must be executed repeatedly for every configuration. Thus, the preparation time is long when the required number of goal configurations is large, e.g. motion planning in cluster environments. Moreover, classical collision-checking algorithms require structured maps, which might be difficult to obtain. To sidestep these two issues, we propose a flow-based vision method, named Vision Inverse Kinematics solver (ViIK), that outputs diverse available configurations by fusing inverse kinematics and collision checking. Moreover, ViIK uses RGB images as the perception of environments. ViIK can output 1000 configurations within 40 ms, with an accuracy of about 3 millimeters and 1.5 degrees. Higher accuracy can be obtained by refining the output with a classical IK solver for a few iterations. The self-collision rates can be lower than 2%. The collision-with-env rates can be lower than 10% in most scenes. The code is available at: https://github.com/AdamQLMeng/ViIK.
FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization
Hierarchical methods represent state-of-the-art visual localization, optimizing search efficiency by using global descriptors to focus on relevant map regions. However, this state-of-the-art performance comes at the cost of substantial memory requirements, as all database images must be stored for feature matching. In contrast, direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space. We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework. This fusion rearranges the local descriptor space such that geographically nearby local descriptors are closer in the feature space according to the global descriptors. Therefore, the number of irrelevant competing descriptors decreases, specifically if they are geographically distant, thereby increasing the likelihood of correctly matching a query descriptor. We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements. Extensive experiments using various state-of-the-art local and global descriptors across four different datasets demonstrate the effectiveness of our approach. For the first time, our approach enables direct matching algorithms to benefit from global descriptors while maintaining memory efficiency. The code for this paper will be published at https://github.com/sontung/descriptor-disambiguation.
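A toy sketch of the weighted-average fusion described in the entry above follows. It assumes local and global descriptors share one dimension and are L2-normalized before mixing; the actual operator, weights, and any dimensionality alignment used in the paper are not specified in this listing, so every name and value below is hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(local_desc, global_desc, weight=0.5):
    """Weighted average of a local descriptor and its image's global descriptor."""
    if len(local_desc) != len(global_desc):
        raise ValueError("this sketch assumes descriptors of equal dimension")
    local_n = l2_normalize(local_desc)
    global_n = l2_normalize(global_desc)
    mixed = [weight * l + (1.0 - weight) * g for l, g in zip(local_n, global_n)]
    return l2_normalize(mixed)


def distance(a, b):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two visually identical local features (e.g. repeated windows) seen in two places.
local_a = [1.0, 0.0, 0.0, 0.0]
local_b = [1.0, 0.0, 0.0, 0.0]
global_place_1 = [0.0, 1.0, 0.0, 0.0]
global_place_2 = [0.0, 0.0, 1.0, 0.0]

fused_a = fuse(local_a, global_place_1)
fused_b = fuse(local_b, global_place_2)
fused_a_again = fuse(local_a, global_place_1)
```

Here the two local descriptors are indistinguishable on their own, while their fused versions are separated by the place-level information; two features from the same place stay identical after fusion.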
Evaluating Gait Symmetry with a Smart Robotic Walker: A Novel Approach to Mobility Assessment IROS
Gait asymmetry, a consequence of various neurological or physical conditions such as aging and stroke, detrimentally impacts bipedal locomotion, causing biomechanical alterations, increasing the risk of falls and reducing quality of life. Addressing this critical issue, this paper introduces a novel diagnostic method for gait symmetry analysis through the use of an assistive robotic Smart Walker equipped with an innovative asymmetry detection scheme. This method analyzes sensor measurements capturing the interaction torque between user and walker. By applying a seasonal-trend decomposition tool, we isolate gait-specific patterns within these data, allowing for the estimation of stride durations and calculation of a symmetry index. Through experiments involving 5 experimenters, we demonstrate the Smart Walker's capability in detecting and quantifying gait asymmetry by achieving an accuracy of 84.9% in identifying asymmetric cases in a controlled testing environment. Further analysis explores the classification of these asymmetries based on their underlying causes, providing valuable insights for gait assessment. The results underscore the potential of the device as a precise, ready-to-use monitoring tool for personalized rehabilitation, facilitating targeted interventions for enhanced patient outcomes.
comment: 7 pages, 5 figures, accepted for the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Abu Dhabi, UAE, 2024
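The entry above does not state which symmetry index it uses. As an illustration only, the sketch below assumes the widely used Robinson-style index, 100 * |L - R| / (0.5 * (L + R)), applied to mean per-side durations that have already been extracted from the sensor signal. The decomposition step is not reproduced, and the 10% decision threshold is hypothetical.

```python
def symmetry_index(left_durations, right_durations):
    """Percentage difference between mean left and right durations.

    0 means perfectly symmetric; larger values mean more asymmetry.
    """
    mean_left = sum(left_durations) / len(left_durations)
    mean_right = sum(right_durations) / len(right_durations)
    return 100.0 * abs(mean_left - mean_right) / (0.5 * (mean_left + mean_right))


def is_asymmetric(left_durations, right_durations, threshold_percent=10.0):
    """Flag a gait as asymmetric when the index exceeds a chosen threshold."""
    return symmetry_index(left_durations, right_durations) > threshold_percent


# Hypothetical durations in seconds.
even_gait = symmetry_index([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
limping_gait = symmetry_index([0.6, 0.6, 0.6], [0.4, 0.4, 0.4])
```

Normalizing by the mean of both sides keeps the index comparable across users who walk at different speeds.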
Optimized Kalman Filter based State Estimation and Height Control in Hopping Robots
Quadrotor-based multimodal hopping and flying locomotion significantly improves efficiency and operation time as compared to purely flying systems. However, effective control necessitates continuous estimation of the vertical states. A single hopping state estimator has been shown (Kang 2024), in which two vertical states (position, acceleration) are measured and only velocity is estimated using a moving horizon estimation and visual inertial odometry at 200 Hz. This technique requires complex sensors (IMU, lidar, depth camera, contact force sensor), and computationally intensive calculations (12-core, 5 GHz processor), for a maximum hop height of ~0.6 m at 3.65 kg. Here we show a trained Kalman filter based hopping vertical state estimator (HVSE), requiring only vertical acceleration measurements. Our results show the HVSE can estimate more states (position, velocity) with a mean-absolute-error in the hop apex ratio (height error/ground truth) of 12.5%, running ~4.2x faster (840 Hz) on a substantially less powerful processor (dual-core 240 MHz) with over ~6.7x the hopping height (4.02 m) at 20% of the mass (672 g). The presented general HVSE, and training procedure are broadly applicable to jumping, hopping, and legged robots across a wide range of sizes and hopping heights.
comment: 14 pages, 6 figures, 5 tables
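The comparison ratios quoted in the entry above can be recomputed from the raw figures it gives. The short sketch below does only that arithmetic; the dictionary keys are hypothetical labels.

```python
# Figures quoted in the abstract for the prior estimator and for the HVSE.
baseline = {"rate_hz": 200.0, "hop_height_m": 0.6, "mass_kg": 3.65}
hvse = {"rate_hz": 840.0, "hop_height_m": 4.02, "mass_kg": 0.672}

# Ratio of each HVSE figure to the corresponding baseline figure.
speedup = hvse["rate_hz"] / baseline["rate_hz"]
height_ratio = hvse["hop_height_m"] / baseline["hop_height_m"]
mass_fraction = hvse["mass_kg"] / baseline["mass_kg"]

print(f"update rate: {speedup:.1f}x")
print(f"hop height:  {height_ratio:.1f}x")
print(f"mass:        {100 * mass_fraction:.1f}% of baseline")
```

The update-rate and height ratios reproduce the quoted 4.2x and 6.7x; the mass ratio works out to roughly 18%, which the abstract appears to round to 20%.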
Visual Localization in 3D Maps: Comparing Point Cloud, Mesh, and NeRF Representations
This paper introduces and assesses a cross-modal global visual localization system that can localize camera images within a color 3D map representation built using both visual and lidar sensing. We present three different state-of-the-art methods for creating the color 3D maps: point clouds, meshes, and neural radiance fields (NeRF). Our system constructs a database of synthetic RGB and depth image pairs from these representations. This database serves as the basis for global localization. We present an automatic approach that builds this database by synthesizing novel images of the scene and exploiting the 3D structure encoded in the different representations. Next, we present a global localization system that relies on the synthetic image database to accurately estimate the 6 DoF camera poses of monocular query images. Our localization approach relies on different learning-based global descriptors and feature detectors which enable robust image retrieval and matching despite the domain gap between (real) query camera images and the synthetic database images. We assess the system's performance through extensive real-world experiments in both indoor and outdoor settings, in order to evaluate the effectiveness of each map representation and the benefits against traditional structure-from-motion localization approaches. Our results show that all three map representations can achieve consistent localization success rates of 55% and higher across various environments. NeRF synthesized images show superior performance, localizing query images at an average success rate of 72%. Furthermore, we demonstrate that our synthesized database enables global localization even when the map creation data and the localization sequence are captured when travelling in opposite directions. Our system, operating in real-time on a mobile laptop equipped with a GPU, achieves a processing rate of 1Hz.
D$^3$FlowSLAM: Self-Supervised Dynamic SLAM with Flow Motion Decomposition and DINO Guidance
In this paper, we introduce a self-supervised deep SLAM method that robustly operates in dynamic scenes while accurately identifying dynamic components. Our method leverages a dual-flow representation for static flow and dynamic flow, facilitating effective scene decomposition in dynamic environments. We propose a dynamic update module based on this representation and develop a dense SLAM system that excels in dynamic scenarios. In addition, we design a self-supervised training scheme using DINO as a prior, enabling label-free training. Our method achieves superior accuracy compared to other self-supervised methods. It also matches or even surpasses the performance of existing supervised methods in some cases. All code and data will be made publicly available upon acceptance.
comment: Homepage: https://zju3dv.github.io/deflowslam
A Survey for Foundation Models in Autonomous Driving
The advent of foundation models has revolutionized the fields of natural language processing and computer vision, paving the way for their application in autonomous driving (AD). This survey presents a comprehensive review of more than 40 research papers, demonstrating the role of foundation models in enhancing AD. Large language models contribute to planning and simulation in AD, particularly through their proficiency in reasoning, code generation and translation. In parallel, vision foundation models are increasingly adapted for critical tasks such as 3D object detection and tracking, as well as creating realistic driving scenarios for simulation and testing. Multi-modal foundation models, integrating diverse inputs, exhibit exceptional visual understanding and spatial reasoning, crucial for end-to-end AD. This survey not only provides a structured taxonomy, categorizing foundation models based on their modalities and functionalities within the AD domain but also delves into the methods employed in current research. It identifies the gaps between existing foundation models and cutting-edge AD approaches, thereby charting future research directions and proposing a roadmap for bridging these gaps.
Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations
Learning from Demonstrations, the field that proposes to learn robot behavior models from data, is gaining popularity with the emergence of deep generative models. Although the problem has been studied for years under names such as Imitation Learning, Behavioral Cloning, or Inverse Reinforcement Learning, classical methods have relied on models that don't capture complex data distributions well or don't scale well to large numbers of demonstrations. In recent years, the robot learning community has shown increasing interest in using deep generative models to capture the complexity of large datasets. In this survey, we aim to provide a unified and comprehensive review of the last year's progress in the use of deep generative models in robotics. We present the different types of models that the community has explored, such as energy-based models, diffusion models, action value maps, or generative adversarial networks. We also present the different types of applications in which deep generative models have been used, from grasp generation to trajectory generation or cost learning. One of the most important elements of generative models is the generalization out of distributions. In our survey, we review the different decisions the community has made to improve the generalization of the learned models. Finally, we highlight the research challenges and propose a number of future directions for learning deep generative models in robotics.
comment: 20 pages, 11 figures, submitted to TRO
Tilde: Teleoperation for Dexterous In-Hand Manipulation Learning with a DeltaHand
Dexterous robotic manipulation remains a challenging domain due to its strict demands for precision and robustness on both hardware and software. While dexterous robotic hands have demonstrated remarkable capabilities in complex tasks, efficiently learning adaptive control policies for hands still presents a significant hurdle given the high dimensionalities of hands and tasks. To bridge this gap, we propose Tilde, an imitation learning-based in-hand manipulation system on a dexterous DeltaHand. It leverages 1) a low-cost, configurable, simple-to-control, soft dexterous robotic hand, DeltaHand, 2) a user-friendly, precise, real-time teleoperation interface, TeleHand, and 3) an efficient and generalizable imitation learning approach with diffusion policies. Our proposed TeleHand has a kinematic twin design to the DeltaHand that enables precise one-to-one joint control of the DeltaHand during teleoperation. This facilitates efficient high-quality data collection of human demonstrations in the real world. To evaluate the effectiveness of our system, we demonstrate the fully autonomous closed-loop deployment of diffusion policies learned from demonstrations across seven dexterous manipulation tasks with an average 90% success rate.
Toward Control of Wheeled Humanoid Robots with Unknown Payloads: Equilibrium Point Estimation via Real-to-Sim Adaptation
Model-based control using a model linearized around the system's equilibrium point is a common approach for wheeled humanoids because of its low computational load and ease of stability analysis. However, controlling a wheeled humanoid robot while it lifts an unknown object presents significant challenges, primarily due to the lack of knowledge of the object dynamics. This paper presents a framework designed for predicting the new equilibrium point explicitly to control a wheeled-legged robot with unknown dynamics. We estimated the total mass and center of mass of the system from its response to initially unknown dynamics, then calculated the new equilibrium point accordingly. To avoid using additional sensors (e.g., a force-torque sensor) and reduce the effort of obtaining expensive real data, a data-driven approach is utilized with a novel real-to-sim adaptation. A more accurate nonlinear dynamics model, offering a closer representation of real-world physics, is injected into a rigid-body simulation for real-to-sim adaptation. The nonlinear dynamics model parameters were optimized using Particle Swarm Optimization. The efficacy of this framework was validated on a physical wheeled inverted pendulum, a simplified model of a wheeled-legged robot. The experimental results indicate that employing a more precise analytical model with optimized parameters significantly reduces the gap between simulation and reality, thus improving the efficiency of a model-based controller in controlling a wheeled robot with unknown dynamics.
Online Learning-Based Inertial Parameter Identification of Unknown Object for Model-Based Control of Wheeled Humanoids
Identifying the dynamic properties of manipulated objects is essential for safe and accurate robot control. Most methods rely on low-noise force-torque sensors, long exciting signals, and solving nonlinear optimization problems, making the estimation process slow. In this work, we propose a fast, online learning-based inertial parameter estimation framework that enhances model-based control. We aim to quickly and accurately estimate the parameters of an unknown object using only the robot's proprioception through end-to-end learning, which is applicable to real-time systems. To effectively capture features in robot proprioception affected by object dynamics and address the challenge of obtaining ground-truth inertial parameters in the real world, we developed a high-fidelity simulation that uses more accurate robot dynamics through real-to-sim adaptation. Since our adaptation focuses solely on the robot, task-relevant data (e.g., holding an object) is not required from the real world, simplifying the data collection process. Moreover, we address both parametric and non-parametric modeling errors independently using Robot System Identification and Gaussian Processes. We validate our estimator to assess how quickly and accurately it can estimate physically feasible parameters of a manipulated object given a specific trajectory obtained from a wheeled humanoid robot. Our estimator achieves faster estimation speeds (around 0.1 seconds) while maintaining accuracy comparable to other methods. Additionally, our estimator improves the performance of model-based control by compensating for the object's dynamics and reinitializing the equilibrium point of the wheeled humanoid.
Comparative Analysis of NMPC and Fuzzy PID Controllers for Trajectory Tracking in Omni-Drive Robots: Design, Simulation, and Performance Evaluation
Trajectory tracking for an Omni-drive robot presents a challenging task that demands an efficient controller design. This paper introduces a self-optimizing controller, Type-1 fuzzyPID, which leverages dynamic and static system response analysis to overcome the limitations of manual tuning. To account for system uncertainties, an Interval Type-2 fuzzyPID controller is also developed. Both controllers are designed using Matlab/Simulink and tested through trajectory tracking simulations in the CoppeliaSim environment. Additionally, a non-linear model predictive controller (NMPC) is proposed and compared against the fuzzyPID controllers. The impact of tunable parameters on NMPC tracking accuracy is thoroughly examined. We also present plots of the step-response characteristics and noise rejection experiments for each controller. Simulation results show that NMPC tracks more precisely and effectively than the fuzzyPID controllers, at the cost of higher computational complexity. The code and simulation environment are available at the following link: https://github.com/love481/Omni-drive-robot-Simulation.git.
LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning IROS 2024
Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM^3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface. Specifically, we leverage the powerful reasoning and planning capabilities of pre-trained LLMs to propose symbolic action sequences and select continuous action parameters for motion planning. Crucially, LLM^3 incorporates motion planning feedback through prompting, allowing the LLM to iteratively refine its proposals by reasoning about motion failure. Consequently, LLM^3 interfaces between task planning and motion planning, alleviating the intricate design process of handling domain-specific messages between them. Through a series of simulations in a box-packing domain, we quantitatively demonstrate the effectiveness of LLM^3 in solving TAMP problems and the efficiency in selecting action parameters. Ablation studies underscore the significant contribution of motion failure reasoning to the success of LLM^3. Furthermore, we conduct qualitative experiments on a physical manipulator, demonstrating the practical applicability of our approach in real-world settings.
comment: IROS 2024. Code available: https://github.com/AssassinWS/LLM-TAMP
Runtime Verification and Field-based Testing for ROS-based Robotic Systems
Robotic systems are becoming pervasive and adopted in increasingly many domains, such as manufacturing, healthcare, and space exploration. To this end, engineering software has emerged as a crucial discipline for building maintainable and reusable robotic systems. The robotics software engineering research field has received increasing attention, fostering autonomy as a fundamental goal. However, robotics developers are still challenged to achieve this goal because simulation cannot realistically deliver solutions to emulate real-world phenomena. Robots also need to operate in unpredictable and uncontrollable environments, which require safe and trustworthy self-adaptation capabilities implemented in software. Typical techniques to address the challenges are runtime verification, field-based testing, and mitigation techniques that enable fail-safe solutions. However, no clear guidance exists for architecting ROS-based systems to enable and facilitate runtime verification and field-based testing. This paper aims to fill this gap by providing guidelines to help developers and quality assurance (QA) teams develop, verify, or test their robots in the field. These guidelines are carefully tailored to address the challenges and requirements of testing robotics systems in real-world scenarios. We (i) conducted a literature review of studies addressing runtime verification and field-based testing for robotic systems, (ii) mined repositories of ROS-based applications, and (iii) validated the applicability, clarity, and usefulness of the guidelines via two questionnaires with 55 answers overall. We contribute 20 guidelines, 8 for developers and 12 for QA teams, formulated for researchers and practitioners in robotic software engineering. Finally, we map our guidelines to open challenges in runtime verification and field-based testing for ROS-based systems, and we outline promising research directions in the field.
Learning Coordinated Maneuver in Adversarial Environments
This paper aims to solve the coordination of a team of robots traversing a route in the presence of adversaries with random positions. Our goal is to minimize the overall cost of the team, which is determined by (i) the accumulated risk when robots stay in adversary-impacted zones and (ii) the mission completion time. During traversal, robots can reduce their speed and act as a 'guard' (the slower, the better), which decreases the risk that a given adversary incurs. This leads to a trade-off between the robots' guarding behaviors and their travel speeds. The formulated problem is highly non-convex and cannot be efficiently solved by existing algorithms. Our approach includes a theoretical analysis of the robots' behaviors for the single-adversary case. As the scale of the problem expands, finding the optimal solution with optimization approaches becomes challenging. We therefore employ reinforcement learning techniques by developing new encoding and policy-generating methods. Simulations demonstrate that our learning methods can efficiently produce team coordination behaviors. We discuss the reasoning behind these behaviors and explain why they reduce the overall team cost.
Bi-CL: A Reinforcement Learning Framework for Robots Coordination Through Bi-level Optimization
In multi-robot systems, achieving coordinated missions remains a significant challenge due to the coupled nature of coordination behaviors and the lack of global information for individual robots. To mitigate these challenges, this paper introduces a novel approach, Bi-level Coordination Learning (Bi-CL), that leverages a bi-level optimization structure within a centralized training and decentralized execution paradigm. Our bi-level reformulation decomposes the original problem into a reinforcement learning level with reduced action space, and an imitation learning level that gains demonstrations from a global optimizer. Both levels contribute to improved learning efficiency and scalability. We note that robots' incomplete information leads to mismatches between the two levels of learning models. To address this, Bi-CL further integrates an alignment penalty mechanism, aiming to minimize the discrepancy between the two levels without degrading their training efficiency. We introduce a running example to conceptualize the problem formulation and apply Bi-CL to two variations of this example: route-based and graph-based scenarios. Simulation results demonstrate that Bi-CL can learn more efficiently and achieve comparable performance with traditional multi-agent reinforcement learning baselines for multi-robot coordination.
Visual SLAM with 3D Gaussian Primitives and Depth Priors Enabling Novel View Synthesis
Conventional geometry-based SLAM systems lack dense 3D reconstruction capabilities since their data association usually relies on feature correspondences. Additionally, learning-based SLAM systems often fall short in terms of real-time performance and accuracy. Balancing real-time performance with dense 3D reconstruction capabilities is a challenging problem. In this paper, we propose a real-time RGB-D SLAM system that incorporates a novel view synthesis technique, 3D Gaussian Splatting, for 3D scene representation and pose estimation. This technique leverages the real-time rendering performance of 3D Gaussian Splatting with rasterization and allows for differentiable optimization in real time through CUDA implementation. We also enable mesh reconstruction from 3D Gaussians for explicit dense 3D reconstruction. To estimate accurate camera poses, we utilize a rotation-translation decoupled strategy with inverse optimization. This involves updating both rotation and translation over several iterations of gradient-based optimization. The process includes differentiably rendering RGB, depth, and silhouette maps and updating the camera parameters to minimize a combined loss of photometric loss, depth geometry loss, and visibility loss, given the existing 3D Gaussian map. However, 3D Gaussian Splatting (3DGS) struggles to accurately represent surfaces due to the multi-view inconsistency of 3D Gaussians, which can lead to reduced accuracy in both camera pose estimation and scene reconstruction. To address this, we utilize depth priors as additional regularization to enforce geometric constraints, thereby improving the accuracy of both pose estimation and 3D reconstruction. We also provide extensive experimental results on public benchmark datasets to demonstrate the effectiveness of our proposed methods in terms of pose accuracy, geometric accuracy, and rendering performance.
MorphoMove: Bi-Modal Path Planner with MPC-based Path Follower for Multi-Limb Morphogenetic UAV
This paper discusses developments for a multi-limb morphogenetic UAV, MorphoGear, that is capable of both aerial flight and ground locomotion. A hybrid path planning algorithm based on the A* strategy has been developed, enabling seamless transition between aerial and ground navigation modes, thereby enhancing the robot's mobility in complex environments. Moreover, precise path following is achieved during ground locomotion with a Model Predictive Control (MPC) architecture for its novel walking behaviour. Experimental validation was conducted in the Unity simulation environment utilizing Python scripts to compute control values. The results show a Root Mean Squared Error (RMSE) of 0.91 cm and a maximum error of 1.85 cm. These developments highlight the adaptability of MorphoGear in navigation through cluttered environments, establishing it as a usable tool in autonomous exploration, both aerial and ground-based.
comment: Accepted in IEEE International Conference on Systems, Man, and Cybernetics (SMC 2024)
NGEL-SLAM: Neural Implicit Representation-based Global Consistent Low-Latency SLAM System
Neural implicit representations have emerged as a promising solution for providing dense geometry in Simultaneous Localization and Mapping (SLAM). However, existing methods in this direction fall short in terms of global consistency and low latency. This paper presents NGEL-SLAM to tackle the above challenges. To ensure global consistency, our system leverages a traditional feature-based tracking module that incorporates loop closure. Additionally, we maintain a globally consistent map by representing the scene using multiple neural implicit fields, enabling quick adjustment to the loop closure. Moreover, our system allows for fast convergence through the use of octree-based implicit representations. The combination of rapid response to loop closure and fast convergence makes our system a truly low-latency system that achieves global consistency. Our system enables rendering high-fidelity RGB-D images, along with extracting dense and complete surfaces. Experiments on both synthetic and real-world datasets suggest that our system achieves state-of-the-art tracking and mapping accuracy while maintaining low latency.
A Framework For Automated Dissection Along Tissue Boundary
Robotic surgery promises enhanced precision and adaptability over traditional surgical methods. It also offers the possibility of automating surgical interventions, resulting in reduced stress on the surgeon, better surgical outcomes, and lower costs. Cholecystectomy, the removal of the gallbladder, serves as an ideal model procedure for automation due to its distinct and well-contrasted anatomical features between the gallbladder and liver, along with standardized surgical maneuvers. Dissection is a frequently used subtask in cholecystectomy in which the surgeon delivers energy through the hook to detach the gallbladder from the liver. Hence, dissection along tissue boundaries is a good candidate for surgical automation. For the da Vinci surgical robot to perform the same procedure as a surgeon automatically, it needs to have the ability to (1) recognize and distinguish between the two different tissues (e.g., the liver and the gallbladder), (2) understand where the boundary between the two tissues is located in the 3D workspace, (3) locate the instrument tip relative to the boundary in the 3D space using visual feedback, and (4) move the instrument along the boundary. This paper presents a novel framework that addresses these challenges through AI-assisted image processing and vision-based robot control. We also present the ex-vivo evaluation of the automated procedure on chicken and pork liver specimens that demonstrates the effectiveness of the proposed framework.
comment: 9 pages, 7 figures, 7 tables, accepted in the 2024 International Conference on Biomedical Robotics and Biomechatronics (BioRob 2024)
S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection
Recently, transformer-based methods have shown exceptional performance in monocular 3D object detection, which can predict 3D attributes from a single 2D image. These methods typically use visual and depth representations to generate query points on objects, whose quality plays a decisive role in the detection accuracy. However, current unsupervised attention mechanisms in transformers lack any awareness of geometry and appearance and are therefore susceptible to producing noisy features for query points, which severely limits network performance and leaves the model poorly able to detect multi-category objects in a single training process. To tackle this problem, this paper proposes a novel "Supervised Shape&Scale-perceptive Deformable Attention" (S$^3$-DA) module for monocular 3D object detection. Concretely, S$^3$-DA utilizes visual and depth features to generate diverse local features with various shapes and scales and simultaneously predicts the corresponding matching distribution to impose valuable shape&scale perception on each query. Benefiting from this, S$^3$-DA effectively estimates receptive fields for query points belonging to any category, enabling them to generate robust query features. Besides, we propose a Multi-classification-based Shape&Scale Matching (MSM) loss to supervise the above process. Extensive experiments on KITTI and Waymo Open datasets demonstrate that S$^3$-DA significantly improves the detection accuracy, yielding state-of-the-art performance of single-category and multi-category 3D object detection in a single training process compared to the existing approaches. The source code will be made publicly available at https://github.com/mikasa3lili/S3-MonoDETR.
NeuFlow v2: High-Efficiency Optical Flow Estimation on Edge Devices
Real-time high-accuracy optical flow estimation is crucial for various real-world applications. While recent learning-based optical flow methods have achieved high accuracy, they often come with significant computational costs. In this paper, we propose a highly efficient optical flow method that balances high accuracy with reduced computational demands. Building upon NeuFlow v1, we introduce new components including a much more lightweight backbone and a fast refinement module. Both of these modules help keep the computational demands light while providing close to state-of-the-art accuracy. Compared to other state-of-the-art methods, our model achieves a 10x-70x speedup while maintaining comparable performance on both synthetic and real-world data. It is capable of running at over 20 FPS on 512x384 resolution images on a Jetson Orin Nano. The full training and evaluation code is available at https://github.com/neufieldrobotics/NeuFlow_v2.
Pointing the Way: Refining Radar-Lidar Localization Using Learned ICP Weights
This paper presents a novel deep-learning-based approach to improve localizing radar measurements against lidar maps. This radar-lidar localization leverages the benefits of both sensors; radar is resilient against adverse weather, while lidar produces high-quality maps in clear conditions. However, owing in part to the unique artefacts present in radar measurements, radar-lidar localization has struggled to achieve comparable performance to lidar-lidar systems, preventing it from being viable for autonomous driving. This work builds on ICP-based radar-lidar localization by including a learned preprocessing step that weights radar points based on high-level scan information. To train the weight-generating network, we present a novel, stand-alone, open-source differentiable ICP library. The learned weights facilitate ICP by filtering out harmful radar points related to artefacts, noise, and even vehicles on the road. Combining an analytical approach with a learned weight reduces overall localization errors and improves convergence in radar-lidar ICP results run on real-world autonomous driving data. Our code base is publicly available to facilitate reproducibility and extensions.
comment: 8 pages, 4 figures, 1 table. Submitted to Robotics and Automation Letters (RA-L)
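As a rough illustration of how per-point weights can enter the alignment step of ICP, as described in the abstract above, the following sketch solves a weighted 2D rigid alignment in closed form for points with known correspondences. It is not the authors' library: the function names `weighted_align_2d` and `transform`, the 2D setting, and the example weights are assumptions made purely for illustration. A weight of zero removes a point's influence entirely, which is one way a learned mask could suppress artefact points.

```python
import math


def transform(points, theta, tx, ty):
    """Apply a 2D rotation by theta followed by a translation (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def weighted_align_2d(src, dst, weights):
    """Return (theta, tx, ty) minimizing sum_i w_i * ||R(theta) src_i + t - dst_i||^2.

    src[i] and dst[i] are assumed to be already-matched point pairs.
    """
    w_sum = sum(weights)
    if w_sum <= 0:
        raise ValueError("weights must have a positive sum")

    # Weighted centroids of both point sets.
    cx_s = sum(w * p[0] for w, p in zip(weights, src)) / w_sum
    cy_s = sum(w * p[1] for w, p in zip(weights, src)) / w_sum
    cx_d = sum(w * p[0] for w, p in zip(weights, dst)) / w_sum
    cy_d = sum(w * p[1] for w, p in zip(weights, dst)) / w_sum

    # Weighted sums of dot and cross products of the centered pairs.
    s_dot = 0.0
    s_cross = 0.0
    for w, (sx, sy), (dx, dy) in zip(weights, src, dst):
        ax, ay = sx - cx_s, sy - cy_s
        bx, by = dx - cx_d, dy - cy_d
        s_dot += w * (ax * bx + ay * by)
        s_cross += w * (ax * by - ay * bx)

    # The optimal rotation angle maximizes cos(theta)*s_dot + sin(theta)*s_cross.
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)

    # The translation maps the rotated source centroid onto the target centroid.
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, tx, ty


# Hypothetical data: three clean matches and one corrupted match.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
dst = transform(src, 0.3, 1.0, -2.0)
dst[3] = (50.0, 50.0)             # simulated artefact point
weights = [1.0, 1.0, 1.0, 0.0]    # the artefact is given zero weight

theta, tx, ty = weighted_align_2d(src, dst, weights)
```

In a full ICP loop, this step would alternate with a nearest-neighbour correspondence search; here the correspondences are given so that the effect of the weights is isolated.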
GazeRace: Revolutionizing Remote Piloting with Eye-Gaze Control
This paper presents GazeRace, a novel system that leverages eye-tracking technology for intuitive drone control. Using the MediaPipe library, the system translates eye movements into precise drone commands, enabling effective remote piloting. In testing, GazeRace demonstrated an 18% reduction in drone trajectory length while maintaining competitive speed with traditional controls. The results suggest that this approach enhances control accuracy and reduces user frustration, offering a significant advancement in the field of human-computer interaction and drone navigation.
comment: Accepted in: IEEE International Conference on Systems, Man, and Cybernetics (SMC 2024)
Hamilton-Jacobi Reachability in Reinforcement Learning: A Survey
Recent literature has proposed approaches that learn control policies with high performance while maintaining safety guarantees. Synthesizing Hamilton-Jacobi (HJ) reachable sets has become an effective tool for verifying safety and supervising the training of reinforcement learning-based control policies for complex, high-dimensional systems. Previously, HJ reachability was restricted to verifying low-dimensional dynamical systems primarily because the computational complexity of the dynamic programming approach it relied on grows exponentially with the number of system states. In recent years, a litany of proposed methods addresses this limitation by computing the reachability value function simultaneously with learning control policies to scale HJ reachability analysis while still maintaining a reliable estimate of the true reachable set. These HJ reachability approximations are used to improve the safety, and even reward performance, of learned control policies and can solve challenging tasks such as those with dynamic obstacles and/or with lidar-based or vision-based observations. In this survey paper, we review the recent developments in the field of HJ reachability estimation in reinforcement learning that would provide a foundational basis for further research into reliability in high-dimensional systems.
comment: Accepted in IEEE Open Journal of Control Systems (OJ-CSYS)
Systems and Control (CS)
An Advanced Microscopic Energy Consumption Model for Automated Vehicle: Development, Calibration, Verification
The automated vehicle (AV) equipped with the Adaptive Cruise Control (ACC) system is expected to reduce the fuel consumption for the intelligent transportation system. This paper presents the Advanced ACC-Micro (AA-Micro) model, a new energy consumption model based on micro trajectory data, calibrated and verified by empirical data. Utilizing a commercial AV equipped with the ACC system as the test platform, experiments were conducted at the Columbus 151 Speedway, capturing data from multiple ACC and human-driven vehicle (HV) test runs. The calibrated AA-Micro model integrates features from traditional energy consumption models and demonstrates superior goodness of fit, achieving 90% accuracy in predicting ACC system energy consumption without overfitting. A comprehensive statistical evaluation of the AA-Micro model's applicability and adaptability in predicting energy consumption and vehicle trajectories indicated strong model consistency and reliability for ACC vehicles, evidenced by minimal variance in RMSE values and uniform RSS distributions. Conversely, significant discrepancies were observed when applying the model to HV data, underscoring the necessity for specialized models to accurately predict energy consumption for HV and ACC systems, potentially due to their distinct energy consumption characteristics.
CAMEO: A Co-design Architecture for Multi-objective Energy System Optimization
Co-design plays a pivotal role in energy system planning as it allows for the holistic optimization of interconnected components, fostering efficiency, resilience, and sustainability by addressing complex interdependencies and trade-offs within the system. This leads to reduced operational costs and improved financial performance through optimized system design, resource allocation, and system-wide synergies. In addition, system planners must consider multiple probable scenarios to plan for potential variations in operating conditions, uncertainties, and future demands, ensuring robust and adaptable solutions that can effectively address the needs and challenges of various systems. This research introduces Co-design Architecture for Multi-objective Energy System Optimization (CAMEO), which facilitates design space exploration of the co-design problem via a modular and automated workflow system, enhancing flexibility and accelerating the design and validation cycles. The cloud-scale automation provides a user-friendly interface and enables energy system modelers to efficiently explore diverse design alternatives. CAMEO aims to revolutionize energy system optimization by developing a next-generation design assistant with improved scalability, usability, and automation, thereby enabling the development of optimized energy systems with greater ease and speed.
Consensus over Clustered Networks using Intermittent and Asynchronous Output Feedback
In recent years, multi-agent teaming has garnered considerable interest since complex objectives, such as intelligence, surveillance, and reconnaissance, can be divided into multiple cluster-level sub-tasks and assigned to a cluster of agents with the appropriate functionality. Yet, coordination and information dissemination between clusters may be necessary to accomplish a desired objective. Distributed consensus protocols provide a mechanism for spreading information within clustered networks, allowing agents and clusters to make decisions without requiring direct access to the state of the ensemble. Hence, we propose a strategy for achieving system-wide consensus in the states of identical linear time-invariant systems coupled by an undirected graph whose directed sub-graphs are available only at sporadic times. Within this work, the agents of the network are organized into pairwise disjoint clusters, which induce sub-graphs of the undirected parent graph. Some cluster sub-graph pairs are linked by an inter-cluster sub-graph, where the union of all cluster and inter-cluster sub-graphs yields the undirected parent graph. Each agent utilizes a distributed consensus protocol with components that are updated intermittently and asynchronously with respect to other agents. The closed-loop ensemble dynamics is modeled as a hybrid system, and a Lyapunov-based stability analysis yields sufficient conditions for rendering the agreement subspace (consensus set) globally exponentially stable. Furthermore, an input-to-state stability argument demonstrates the consensus set is robust to a class of perturbations. A numerical simulation considering both nominal and perturbed scenarios is provided for validation purposes.
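To make the idea of a distributed consensus protocol on a clustered graph concrete, the sketch below runs a much simpler scheme than the one in the abstract above: scalar single-integrator agents, synchronous updates, full state exchange with neighbours, and a fixed undirected graph. The intermittent, asynchronous output feedback and the hybrid-system analysis of the paper are not modeled. The graph layout, the step size `eps`, and the function name `consensus_step` are assumptions made for illustration only.

```python
def consensus_step(x, neighbors, eps):
    """One synchronous update: each agent moves toward its neighbours' states."""
    return [
        xi + eps * sum(x[j] - xi for j in neighbors[i])
        for i, xi in enumerate(x)
    ]


# Hypothetical network: two clusters {0, 1, 2} and {3, 4, 5}, each fully
# connected internally, linked by a single inter-cluster edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
neighbors = {i: [] for i in range(6)}
for a, b in edges:
    neighbors[a].append(b)
    neighbors[b].append(a)

# The step size is kept below 1 / (maximum degree) so the iteration is stable.
max_degree = max(len(n) for n in neighbors.values())
eps = 0.2
assert eps < 1.0 / max_degree

x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]   # initial agent states
target = sum(x) / len(x)                # average is preserved on undirected graphs

for _ in range(500):
    x = consensus_step(x, neighbors, eps)

spread = max(x) - min(x)                # disagreement remaining after iteration
```

Because the graph is undirected and connected, every agent's state approaches the average of the initial states even though the two clusters communicate only through one edge; the single link mainly slows the rate of agreement.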
Collaborative Robot Arm Inserting Nasopharyngeal Swabs with Admittance Control
The nasopharyngeal (NP) swab sample test, commonly used to detect COVID-19 and other respiratory illnesses, involves moving a swab through the nasal cavity to collect samples from the nasopharynx. While this is typically done by human healthcare workers, there is significant societal interest in enabling robots to perform this test to reduce exposure to patients and to free up human resources. The task is challenging from the robotics perspective because of the dexterity and safety requirements. While other works have implemented specific hardware solutions, our research differentiates itself by using a ubiquitous rigid robotic arm. This work presents a case study in which we investigate the strengths and challenges of using a compliant control system to accomplish NP swab tests with such a robotic configuration. To accomplish this, we designed a force-sensing end-effector that integrates with the proposed torque-controlled compliant control loop. We then conducted experiments where the robot inserted NP swabs into a 3D printed nasal cavity phantom. Ultimately, we found that the compliant control system outperformed a basic position controller and shows promise for human use. However, further efforts are needed to ensure the initial alignment with the nostril and to address head motion.
comment: 13 pages, 9 figures. See https://uwaterloo.ca/scholar/pqjlee/collaborative-robot-arm-inserting-nasopharyngeal-swabs-admittance-control for supplementary data
Data-driven H2-optimal Model Reduction via Offline Transfer Function Sampling
$\mathcal{H}_2$-optimal model order reduction algorithms represent a significant class of techniques, known for their accuracy, which has been extensively validated over the past two decades. Among these, the Iterative Rational Krylov Algorithm (IRKA) is widely regarded as a benchmark for constructing $\mathcal{H}_2$-optimal reduced-order models. However, a key challenge in its data-driven implementation lies in the need for transfer function samples and their derivatives, which must be updated iteratively. Conducting new experiments to acquire these samples each time IRKA updates the interpolation data is impractical. Additionally, for discrete-time systems, obtaining transfer function samples at frequencies outside the unit circle is challenging, as these are not easily accessible through measurements. This paper proposes a method to sample the transfer function and its derivative offline using frequency or time-domain data, which is commonly measured for various design and analysis purposes in industry. By leveraging this approach, there is no need to directly measure transfer function samples at interpolation points, as these can be generated offline using the pre-existing data. This facilitates the offline implementation of IRKA within the frequency- or time-domain Loewner framework. The approach is also extended to discrete-time systems in this work. A numerical example is provided to validate the theoretical findings presented.
Data-driven Modeling of Combined Sewer Systems for Urban Sustainability: An Empirical Evaluation
Climate change poses complex challenges, with extreme weather events becoming increasingly frequent and difficult to model. Examples include the dynamics of Combined Sewer Systems (CSS). Overburdened CSS during heavy rainfall will overflow untreated wastewater into surface water bodies. Classical approaches to modeling the impact of extreme rainfall events rely on physical simulations, which are particularly challenging to create for large urban infrastructures. Deep Learning (DL) models offer a cost-effective alternative for modeling the complex dynamics of sewer systems. In this study, we present a comprehensive empirical evaluation of several state-of-the-art DL time series models for predicting sewer system dynamics in a large urban infrastructure, utilizing three years of measurement data. We especially investigate the potential of DL models to maintain predictive precision during network outages by comparing global models, which have access to all variables within the sewer system, and local models, which are limited to data from a restricted set of local sensors. Our findings demonstrate that DL models can accurately predict the dynamics of sewer system load, even under network outage conditions. These results suggest that DL models can effectively aid in balancing the load redistribution in CSS, thereby enhancing the sustainability and resilience of urban infrastructures.
comment: 12 pages, 4 figures, accepted at 47th German Conference on Artificial Intelligence, Wuerzburg 2024
Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation
Recent works have provided algorithms by which decentralised agents, which may be connected via a communication network, can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system. However, these algorithms are given for tabular settings: this computationally limits the size of players' observation space, meaning that the algorithms are not able to handle anything but small state spaces, nor to generalise beyond policies depending on the ego player's state to so-called 'population-dependent' policies. We address this limitation by introducing function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method that has previously been employed only in finite-horizon, episodic, centralised settings. While this permits us to include the population's mean-field distribution in the observation for each player's policy, it is arguably unrealistic to assume that decentralised agents would have access to this global information: we therefore additionally provide new algorithms that allow agents to estimate the global empirical distribution based on a local neighbourhood, and to improve this estimate via communication over a given network. Our experiments showcase how the communication network allows decentralised agents to estimate the mean-field distribution for population-dependent policies, and that exchanging policy information helps networked agents to outperform both independent and even centralised agents in function-approximation settings, by an even greater margin than in tabular settings.
Enhanced Visual SLAM for Collision-free Driving with Lightweight Autonomous Cars
The paper presents a vision-based obstacle avoidance strategy for lightweight self-driving cars that can be run on a CPU-only device using a single RGB-D camera. The method consists of two steps: visual perception and path planning. The visual perception part uses ORBSLAM3 enhanced with optical flow to estimate the car's poses and extract rich texture information from the scene. In the path planning phase, we employ a method combining a control Lyapunov function and a control barrier function in the form of a quadratic program (CLF-CBF-QP) together with an obstacle shape reconstruction process (SRP) to plan safe and stable trajectories. To validate the performance and robustness of the proposed method, simulation experiments were conducted with a car in various complex indoor environments using the Gazebo simulation environment. Our method can effectively avoid obstacles in the scenes. The proposed algorithm outperforms benchmark algorithms in achieving more stable and shorter trajectories across multiple simulated scenes.
comment: 16 pages; Submitted to a journal
Flatness-based control revisited: The HEOL setting
We present the algebraic foundations of the HEOL setting, which combines flatness-based control and intelligent controllers, two advances in automatic control that have been proven in practice, including in industry. The result provides a solution to many pending questions on feedback loops concerning flatness-based control and model-free control (MFC). Elementary module theory, ordinary differential fields and the generalization of Kähler differentials to differential fields provide an intrinsic definition of the tangent linear system. The algebraic manipulations associated with the operational calculus lead to homeostat and intelligent controllers. They are illustrated via some computer simulations.
comment: Accepted for publication in "Comptes Rendus Mathématique"
Learning Deep Dissipative Dynamics
This study tackles the problem of strictly guaranteeing "dissipativity" of a dynamical system represented by neural networks learned from given time-series data. Dissipativity is a crucial indicator for dynamical systems that generalizes stability and input-output stability, known to be valid across various systems including robotics, biological systems, and molecular dynamics. By analytically proving the general solution to the nonlinear Kalman-Yakubovich-Popov (KYP) lemma, which is the necessary and sufficient condition for dissipativity, we propose a differentiable projection that transforms any dynamics represented by neural networks into dissipative ones and a learning method for the transformed dynamics. Utilizing the generality of dissipativity, our method strictly guarantees stability, input-output stability, and energy conservation of trained dynamical systems. Finally, we demonstrate the robustness of our method against out-of-domain input through applications to robotic arms and fluid dynamics. Code is available at https://github.com/kojima-r/DeepDissipativeModel
Ultra-Fast and Efficient Design Method Using Deep Learning for Capacitive Coupling WPT System
Capacitive coupling wireless power transfer (CCWPT) is one of the pervasive methods for transferring power in the reactive near-field zone. In this paper, a flexible design methodology based on the Binary Particle Swarm Optimization (BPSO) algorithm is proposed for a pixelated microstrip structure. The pixel configuration of each parallel plate (43x43 pixels) determines the frequency response of the system (S-parameters), and by changing this configuration, we can achieve the dedicated operating frequency (resonance frequency) and its related |S21| value. Due to the large number of pixels, an iterative optimization algorithm (BPSO) is used to design a CCWPT system. However, the output of each iteration must be simulated in an electromagnetic simulator (e.g., CST, HFSS, etc.); hence, the whole optimization process is time-consuming. This paper develops a rapid, agile and efficient method for designing two parallel pixelated microstrip plates of a CCWPT system based on deep neural networks. In the proposed method, the CST-based BPSO algorithm is replaced with an AI-based method using ResNet-18. Compared to conventional techniques, the advantages of the AI-based iterative method are an automatic design process, higher efficiency, shorter design time, lower computational resource consumption, and less required background EM knowledge. Finally, the prototype of the proposed simulated structure is fabricated and measured. The simulation and measurement results validate the accuracy of the design procedure using the AI-based BPSO algorithm. The MAE (Mean Absolute Error) of prediction for the main resonance frequency and the related |S21| are 110 MHz and 0.18 dB, respectively, and according to the simulation results, the whole design process is 3629 times faster than the CST-based BPSO algorithm.
Real-Time Discrete Fractional Fourier Transform Using Metamaterial Coupled Lines Network
Discrete Fractional Fourier Transforms (DFrFT) are universal mathematical tools in signal processing, communications and microwave sensing. Despite the extensive applications of DFrFT, implementation of corresponding fractional orders in the baseband signal often leads to bulky, power-hungry, and high-latency systems. In this paper, we present a passive metamaterial coupled lines network (MCLN) that performs the analog DFrFT in real-time at microwave frequencies. The proposed MCLN consists of M parallel microstrip transmission lines (TLs) in which adjacent TLs are loaded with interdigital capacitors to enhance the coupling level. We show that with proper design of the coupling coefficients between adjacent channels, the MCLN can perform an M-point DFrFT of an arbitrary fractional order that can be designed through the length of the network. In the context of real-time signal processing for realization of DFrFT, we design, model, simulate and implement a 16x16 MCLN and experimentally demonstrate the performance of the proposed structure. The proposed approach is versatile and can be used in various applications where DFrFT is an essential tool. The proposed design scheme based on MCLN is scalable across the frequency spectrum and can be applied to millimeter and submillimeter wave systems.
Reflex-Based Open-Vocabulary Navigation without Prior Knowledge Using Omnidirectional Camera and Multiple Vision-Language Models
Various robot navigation methods have been developed, but they are mainly based on Simultaneous Localization and Mapping (SLAM), reinforcement learning, etc., which require prior map construction or learning. In this study, we consider the simplest method that does not require any map construction or learning, and perform open-vocabulary robot navigation without any prior knowledge. We applied an omnidirectional camera and pre-trained vision-language models to the robot. The omnidirectional camera provides a uniform view of the surroundings, thus eliminating the need for complicated exploratory behaviors including trajectory generation. By applying multiple pre-trained vision-language models to this omnidirectional image and incorporating reflective behaviors, we show that navigation becomes simple and does not require any prior setup. Interesting properties and limitations of our method are discussed based on experiments with the mobile robot Fetch.
comment: Accepted at Advanced Robotics, website - https://haraduka.github.io/omnidirectional-vlm/
Measurement-based Fast Quantum State Stabilization with Deep Reinforcement Learning
The stabilization of quantum states is a fundamental problem for realizing various quantum technologies. Measurement-based feedback strategies have demonstrated powerful performance, and the construction of quantum control signals using measurement information has attracted great interest. However, the interaction between quantum systems and the environment is inevitable, especially when measurements are introduced, which leads to decoherence. To mitigate decoherence, it is desirable to stabilize quantum systems faster, thereby reducing the time of interaction with the environment. In this paper, we utilize information obtained from measurement and apply deep reinforcement learning (DRL) algorithms, without explicitly constructing specific complex measurement-control mappings, to rapidly drive a random initial quantum state to the target state. The proposed DRL algorithm speeds up the convergence to a target state, which shortens the interaction between quantum systems and their environments to protect coherence. Simulations are performed on two-qubit and three-qubit systems, and the results show that our algorithm can successfully stabilize a random initial quantum system to the target entangled state, with a convergence time shorter than that of traditional methods such as Lyapunov feedback control. Moreover, it exhibits robustness against imperfect measurements and delays in system evolution.
An Econometric Analysis of Large Flexible Cryptocurrency-mining Consumers in Electricity Markets
In recent years, power grids have seen a surge in large cryptocurrency mining firms, with individual consumption levels reaching 700 MW. This study examines the behavior of these firms in Texas, focusing on how their consumption is influenced by cryptocurrency conversion rates, electricity prices, local weather, and other factors. We transform the skewed electricity consumption data of these firms, perform correlation analysis, and apply a seasonal autoregressive moving average model for analysis. Our findings reveal that, surprisingly, short-term mining electricity consumption is not correlated with cryptocurrency conversion rates. Instead, the primary influencers are the temperature and electricity prices. These firms also respond to avoid transmission and distribution network (T&D) charges -- commonly known as Four Coincident Peak (4CP) charges -- during summer. As the scale of these firms is likely to surge in future years, the developed electricity consumption model can be used to generate public, synthetic datasets to understand the overall impact on the power grid. The developed model could also lead to better pricing mechanisms to effectively use the flexibility of these resources towards improving power grid reliability.
comment: 10 pages, 10 figures, accepted for publication in Hawaii International Conference on System Sciences-58
On the design of stabilizing FIR controllers
Recently, it has been observed that finite impulse response controllers are an excellent basis for encrypted control, where privacy-preserving controller evaluations via special cryptosystems are the main focus. Beneficial properties of FIR filters are also well-known from digital signal processing, which makes them preferable over infinite impulse response filters in many applications. Their appeal extends to feedback control, offering design flexibility grounded solely on output measurements. However, designing FIR controllers is challenging, which motivates this work. To address the design challenge, we initially show that FIR controller designs for linear systems can equivalently be stated as static or dynamic output feedback problems. After focusing on the existence of stabilizing FIR controllers for a given plant, we tailor two common design approaches for output feedback to the case of FIR controllers. Unfortunately, it will turn out that the FIR characteristics add further restrictions to the LMI-based approaches. Hence, we finally turn to designs building on non-convex optimization, which provide satisfactory results for a selection of benchmark systems.
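As a minimal illustration of what an FIR controller computes (a sketch only, not the design procedure of the paper), the control input can be written as a finite weighted sum of recent output measurements, u[k] = h[0] y[k] + h[1] y[k-1] + ... The class name and the tap values below are hypothetical and chosen purely for illustration; whether a given tap vector stabilizes a plant is exactly the design question the abstract addresses.

```python
from collections import deque


class FIRController:
    """Output-feedback controller u[k] = sum_i taps[i] * y[k - i].

    The controller has no internal feedback: it only stores the last
    len(taps) measurements (hypothetical illustration, not a tuned design).
    """

    def __init__(self, taps):
        self.taps = list(taps)
        # Measurement history, newest first, initialised to zero.
        self.history = deque([0.0] * len(self.taps), maxlen=len(self.taps))

    def step(self, y):
        # Push the newest measurement; the oldest one drops off the right end.
        self.history.appendleft(y)
        return sum(h * y_past for h, y_past in zip(self.taps, self.history))


controller = FIRController([1.0, 0.5])  # illustrative taps only
outputs = [controller.step(y) for y in (2.0, 4.0, 0.0)]
```

Because the memory is finite, a measurement stops influencing the control input after len(taps) steps, which is the property that distinguishes FIR from IIR controllers.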
Distributed alternating gradient descent for convex semi-infinite programs over a network
This paper presents a first-order distributed algorithm for solving a convex semi-infinite program (SIP) over a time-varying network. In this setting, the objective function associated with the optimization problem is a summation of a set of functions, each held by one node in a network. The semi-infinite constraint, on the other hand, is known to all agents. The nodes collectively aim to solve the problem using local data about the objective and limited communication capabilities depending on the network topology. Our algorithm is built on three key ingredients: consensus step, gradient descent in the local objective, and local gradient descent iterations in the constraint at a node when the estimate violates the semi-infinite constraint. The algorithm is constructed, and its parameters are prescribed in such a way that the iterates held by each agent provably converge to an optimizer. That is, as the algorithm progresses, the estimates achieve consensus, and the constraint violation and the error in the optimal value are bounded above by vanishing terms. A simulation example illustrates our results.
comment: 16 pages, 1 figure
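The three ingredients named in the abstract (consensus, a gradient step on the local objective, and a gradient step on the constraint when it is violated) can be sketched on a toy problem. This is a simplified illustration under loud assumptions, not the algorithm of the paper: the network is a fixed complete graph with equal weights rather than time-varying, the local objectives are the hypothetical f_i(x) = (x - a_i)^2, the semi-infinite constraint is the hypothetical g(x, t) = t*x - 1 <= 0 for all t in [0, 1], the worst-case t is found on a grid, and the step size 1/(k + 2) is an arbitrary diminishing choice.

```python
# Grid over the constraint index t in [0, 1] (an assumption of this sketch).
T_GRID = [j / 100.0 for j in range(101)]


def local_grad(x, a):
    """Gradient of the hypothetical local objective f_i(x) = (x - a)^2."""
    return 2.0 * (x - a)


def worst_constraint(x):
    """Return (max_t g(x, t), d/dx g at the maximising t) for g(x, t) = t*x - 1."""
    t_star = max(T_GRID, key=lambda t: t * x - 1.0)
    return t_star * x - 1.0, t_star


def run(targets, iterations=2000):
    """Alternate consensus, objective descent and constraint descent."""
    x = list(targets)  # each node starts at its own minimiser
    for k in range(iterations):
        step = 1.0 / (k + 2)      # diminishing step size
        mean = sum(x) / len(x)    # consensus step on a complete graph
        updated = []
        for a in targets:
            violation, grad_g = worst_constraint(mean)
            if violation > 0.0:
                # Estimate is infeasible: descend along the constraint.
                updated.append(mean - step * grad_g)
            else:
                # Estimate is feasible: descend along the local objective.
                updated.append(mean - step * local_grad(mean, a))
        x = updated
    return x


estimates = run([0.0, 2.0, 4.0])
```

In this toy instance the constraint reduces to x <= 1 and the unconstrained minimiser of the summed objective is x = 2, so the node estimates are expected to agree and settle close to the boundary value x = 1.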
Decoupling Power Quality Issues in Grid-Microgrid Network Using Microgrid Building Blocks
Microgrids are evolving as promising options to enhance the reliability of the connected transmission and distribution systems. Traditional design and deployment of microgrids require significant engineering analysis. Microgrid Building Blocks (MBB), consisting of modular blocks that integrate seamlessly to form effective microgrids, is an enabling concept for faster and broader adoption of microgrids. A back-to-back converter placed at the point of common coupling of the microgrid is an integral part of the MBB. This paper presents applications of MBB to decouple power quality issues in a grid-microgrid network serving power-quality-sensitive loads such as data centers, new grid-edge technologies such as vehicle-to-grid generation, and electric vehicle charging loads during evacuation before disaster events. Simulation results show that MBB effectively decouples the power quality issues across networks and helps maintain good power quality in the power-quality-sensitive network based on the operational scenario.
comment: This paper is accepted for publication in IEEE IECON 2024, Chicago, IL. The complete copyright version will be available on IEEE Xplore when the conference proceedings are published
Emerging clean technologies: policy-driven cost reductions, implications and perspectives
Hydrogen production from water electrolysis, direct air capture (DAC), and synthetic kerosene derived from hydrogen and CO2 (`e-kerosene') are expected to play an important role in global decarbonization efforts. So far, the economics of these nascent technologies hamper their market diffusion. However, a wave of recent policy support in the United States, Europe, China, and elsewhere is anticipated to drive their commercial liftoff and bring their costs down. To this end, we evaluate the potential cost reductions driven by policy-induced scale-up of these emerging technologies through 2030 using an experience curves approach accounting for both local and global learning effects. We then analyze the consequences of projected cost declines on the competitiveness of these nascent technologies compared to conventional fossil alternatives, where applicable, and highlight some of the tradeoffs associated with their expansion. Our findings indicate that enacted policies could lead to substantial capital cost reductions for electrolyzers. Nevertheless, electrolytic hydrogen production at $1-2/kg would still require some form of policy support. Given expected costs and experience curves, it is unlikely that liquid solvent DAC (L-DAC) scale-up will bring removal costs to stated targets of $100/tCO2, though a $200/tCO2 may eventually be within reach. We also underscore the importance of tackling methane leakage for natural gas-powered L-DAC: unmitigated leaks amplify net removal costs, exacerbate the investment requirements to reach targeted costs, and cast doubt on L-DAC's role in the clean energy transition. Lastly, despite reductions in electrolysis and L-DAC costs, e-kerosene remains considerably more expensive than fossil jet fuel. The economics of e-kerosene and the resources required for production raise questions about the fuel's ultimate viability as a decarbonization tool for aviation.
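For readers unfamiliar with experience curves, the standard single-factor form states that unit cost falls by a fixed fraction (the learning rate) with every doubling of cumulative deployment. The sketch below illustrates only that textbook relation; it does not reproduce the paper's model, which also separates local and global learning effects, and every number in it is a hypothetical placeholder rather than a value from the study.

```python
import math


def experience_curve_cost(initial_cost, initial_capacity, capacity, learning_rate):
    """Unit cost after scaling cumulative capacity, single-factor experience curve.

    cost = initial_cost * (capacity / initial_capacity) ** (-b),
    with b = -log2(1 - learning_rate), so each doubling of cumulative
    capacity multiplies the cost by (1 - learning_rate).
    """
    if initial_capacity <= 0 or capacity <= 0:
        raise ValueError("capacities must be positive")
    if not 0.0 <= learning_rate < 1.0:
        raise ValueError("learning_rate must lie in [0, 1)")
    b = -math.log2(1.0 - learning_rate)
    return initial_cost * (capacity / initial_capacity) ** (-b)


# Hypothetical inputs: cost of 1000 per unit today, 20% learning rate,
# cumulative capacity growing fourfold (two doublings).
projected = experience_curve_cost(1000.0, 1.0, 4.0, 0.20)
```

With these placeholder inputs, two doublings at a 20% learning rate give 1000 x 0.8 x 0.8 = 640 per unit, which shows why projected cost declines depend so strongly on how much deployment policy support actually induces.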
Suppressing unknown disturbances to dynamical systems using machine learning
Identifying and suppressing unknown disturbances to dynamical systems is a problem with applications in many different fields. Here we present a model-free method to identify and suppress an unknown disturbance to an unknown system based only on previous observations of the system under the influence of a known forcing function. We find that, under very mild restrictions on the training function, our method is able to robustly identify and suppress a large class of unknown disturbances. We illustrate our scheme with the identification of both deterministic and stochastic unknown disturbances to an analog electric chaotic circuit and with numerical examples where a chaotic disturbance to various chaotic dynamical systems is identified and suppressed.
Comparative Analysis of NMPC and Fuzzy PID Controllers for Trajectory Tracking in Omni-Drive Robots: Design, Simulation, and Performance Evaluation
Trajectory tracking for an Omni-drive robot presents a challenging task that demands an efficient controller design. This paper introduces a self-optimizing controller, Type-1 fuzzyPID, which leverages dynamic and static system response analysis to overcome the limitations of manual tuning. To account for system uncertainties, an Interval Type-2 fuzzyPID controller is also developed. Both controllers are designed using Matlab/Simulink and tested through trajectory tracking simulations in the CoppeliaSim environment. Additionally, a non-linear model predictive controller (NMPC) is proposed and compared against the fuzzyPID controllers. The impact of tunable parameters on NMPC tracking accuracy is thoroughly examined. We also present plots of the step-response characteristics and noise rejection experiments for each controller. Simulation results validate the precision and effectiveness of NMPC over fuzzyPID controllers while trading off computational complexity. The code and simulation environment are available at the following link: https://github.com/love481/Omni-drive-robot-Simulation.git.
2-Level Reinforcement Learning for Ships on Inland Waterways: Path Planning and Following
This paper proposes a realistic modularized framework for controlling autonomous surface vehicles (ASVs) on inland waterways (IWs) based on deep reinforcement learning (DRL). The framework improves operational safety and comprises two levels: a high-level local path planning (LPP) unit and a low-level path following (PF) unit, each consisting of a DRL agent. The LPP agent is responsible for planning a path under consideration of dynamic vessels, closing a gap in the current research landscape. In addition, the LPP agent adequately considers traffic rules and the geometry of the waterway. We thereby introduce a novel application of a spatial-temporal recurrent neural network architecture to continuous action spaces. The LPP agent outperforms a state-of-the-art artificial potential field (APF) method by increasing the minimum distance to other vessels by 65% on average. The PF agent performs low-level actuator control while accounting for shallow water influences and the environmental forces winds, waves, and currents. Compared with a proportional-integral-derivative (PID) controller, the PF agent yields only 61% of the mean cross-track error (MCTE) while significantly reducing control effort (CE) in terms of the required absolute rudder angle. Lastly, both agents are jointly validated in simulation, employing the lower Elbe in northern Germany as an example case and using real automatic identification system (AIS) trajectories to model the behavior of other ships.
Receding Horizon Games for Modeling Competitive Supply Chains
The vast majority of products we use daily are supplied to us through complex global supply chains that transform raw materials into finished goods and distribute them to end consumers. This paper proposes a modeling methodology for dynamic competitive supply chains based on game theory and model predictive control. We model each manufacturer in the supply chain as a rational utility maximizing agent that selects their actions by finding an open-loop generalized Nash equilibrium of a multi-stage game. To react to competitors and the state of the market, every agent re-plans their actions in a receding horizon manner based on estimates of market and supplier parameters thereby creating an approximate closed-loop equilibrium policy. We demonstrate through numerical simulations that this modeling approach is computationally tractable and generates economically interpretable behaviors in a variety of settings such as demand spikes, supply shocks, and information asymmetry.
Graph Simplification Solutions to the Street Intersection Miscount Problem
Street intersection counts and densities are fundamental measures in transport geography and urban planning. Conventional street network data and analysis tools often lead to significant overcounting of these measures. This study investigates the causes of such overcounting and proposes remedies. It introduces algorithms to streamline urban street network models through edge simplification and node consolidation, which enhance computational efficiency and improve accuracy in network measures like intersection counts, street segment lengths, and node degrees. The algorithms are validated and subjected to a global empirical assessment to evaluate the extent of count bias. These findings highlight the prevalence of this bias and underscore the necessity for better methods to mitigate inaccuracies in intersection representation.
comment: Conference paper
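One source of the overcounting described above can be illustrated with node degrees: in raw street network data, many nodes are interstitial points along a single street (degree 2) or dead ends (degree 1) rather than true intersections. The sketch below counts only nodes of degree 3 or more. It is a simplified illustration with a hypothetical function name and a made-up edge list, and it does not implement the paper's edge simplification or node consolidation algorithms.

```python
def count_intersections(edges, min_degree=3):
    """Count nodes whose degree is at least min_degree.

    edges is an iterable of (u, v) pairs describing undirected street
    segments. Degree-2 nodes (geometry points along one street) and
    degree-1 nodes (dead ends) are not counted as intersections.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1 for d in degree.values() if d >= min_degree)


# Hypothetical toy network: a street A-B-C that forks at C into D and E.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("C", "E")]
naive_count = len({node for edge in toy_edges for node in edge})
true_count = count_intersections(toy_edges)
```

Here the naive node count is 5 while only node C is a real intersection, so counting every node as an intersection would overstate the figure fivefold in this toy case.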
An agent design with goal reaching guarantees for enhancement of learning
Reinforcement learning is commonly concerned with problems of maximizing accumulated rewards in Markov decision processes. Oftentimes, a certain goal state or a subset of the state space attains maximal reward. In such a case, the environment may be considered solved when the goal is reached. Whereas numerous techniques, learning or non-learning based, exist for solving environments, doing so optimally is the biggest challenge. For example, one may choose a reward rate which penalizes the action effort. Reinforcement learning is currently among the most actively developed frameworks for solving environments optimally by virtue of maximizing accumulated reward, in other words, returns. Yet, tuning agents is a notoriously hard task, as reported in a series of works. Our aim here is to help the agent learn a near-optimal policy efficiently while ensuring a goal reaching property of some basis policy that merely solves the environment. We suggest an algorithm that is fairly flexible and can be used to augment practically any agent as long as it comprises a critic. A formal proof of a goal reaching property is provided. Comparative experiments on several problems under popular baseline agents provide empirical evidence that the learning can indeed be boosted while ensuring the goal reaching property.
Hamilton-Jacobi Reachability in Reinforcement Learning: A Survey
Recent literature has proposed approaches that learn control policies with high performance while maintaining safety guarantees. Synthesizing Hamilton-Jacobi (HJ) reachable sets has become an effective tool for verifying safety and supervising the training of reinforcement learning-based control policies for complex, high-dimensional systems. Previously, HJ reachability was restricted to verifying low-dimensional dynamical systems primarily because the computational complexity of the dynamic programming approach it relied on grows exponentially with the number of system states. In recent years, a litany of proposed methods addresses this limitation by computing the reachability value function simultaneously with learning control policies to scale HJ reachability analysis while still maintaining a reliable estimate of the true reachable set. These HJ reachability approximations are used to improve the safety, and even reward performance, of learned control policies and can solve challenging tasks such as those with dynamic obstacles and/or with lidar-based or vision-based observations. In this survey paper, we review the recent developments in the field of HJ reachability estimation in reinforcement learning that would provide a foundational basis for further research into reliability in high-dimensional systems.
comment: Accepted in IEEE Open Journal of Control Systems (OJ-CSYS)
The Role of Electric Grid Research in Addressing Climate Change
Addressing the urgency of climate change necessitates a coordinated and inclusive effort from all relevant stakeholders. Critical to this effort is the modeling, analysis, control, and integration of technological innovations within the electric energy system, which plays a crucial role in scaling up climate change solutions. This perspective article presents a set of research challenges and opportunities in the area of electric power systems that would be crucial in accelerating Gigaton-level decarbonization. Furthermore, it highlights institutional challenges associated with developing market mechanisms and regulatory architectures, ensuring that incentives are aligned for stakeholders to effectively implement the technological solutions on a large scale.
comment: 17 pages, 2 figures
Shock waves in nonlinear transmission lines
We consider the interaction between small amplitude travelling waves ("sound") and shock waves in a transmission line containing both nonlinear capacitors and nonlinear inductors. We calculate for the "sound" wave the coefficient of reflection from (and the coefficient of transmission through) the shock wave. These coefficients are expressed in terms of the wave speeds and the wave impedances. When only the capacitors or only the inductors are nonlinear, the coefficients are expressed in terms of the wave speeds only. We explicitly include dissipation in the treatment of the shocks by introducing ohmic resistors shunting the inductors and also in series with the capacitors. This allows us to describe the shocks as physical objects of finite width and study their profiles. In some particular cases the profiles are obtained in terms of elementary functions.
comment: pdfLaTeX, 8 pages, 4 figures. The paper was edited substantially
Systems and Control (EESS)
An Advanced Microscopic Energy Consumption Model for Automated Vehicle: Development, Calibration, Verification
The automated vehicle (AV) equipped with the Adaptive Cruise Control (ACC) system is expected to reduce the fuel consumption for the intelligent transportation system. This paper presents the Advanced ACC-Micro (AA-Micro) model, a new energy consumption model based on micro trajectory data, calibrated and verified by empirical data. Utilizing a commercial AV equipped with the ACC system as the test platform, experiments were conducted at the Columbus 151 Speedway, capturing data from multiple ACC and Human-Driven (HV) test runs. The calibrated AA-Micro model integrates features from traditional energy consumption models and demonstrates superior goodness of fit, achieving an impressive 90% accuracy in predicting ACC system energy consumption without overfitting. A comprehensive statistical evaluation of the AA-Micro model's applicability and adaptability in predicting energy consumption and vehicle trajectories indicated strong model consistency and reliability for ACC vehicles, evidenced by minimal variance in RMSE values and uniform RSS distributions. Conversely, significant discrepancies were observed when applying the model to HV data, underscoring the necessity for specialized models to accurately predict energy consumption for HV and ACC systems, potentially due to their distinct energy consumption characteristics.
CAMEO: A Co-design Architecture for Multi-objective Energy System Optimization
Co-design plays a pivotal role in energy system planning as it allows for the holistic optimization of interconnected components, fostering efficiency, resilience, and sustainability by addressing complex interdependencies and trade-offs within the system. This leads to reduced operational costs and improved financial performance through optimized system design, resource allocation, and system-wide synergies. In addition, system planners must consider multiple probable scenarios to plan for potential variations in operating conditions, uncertainties, and future demands, ensuring robust and adaptable solutions that can effectively address the needs and challenges of various systems. This research introduces the Co-design Architecture for Multi-objective Energy System Optimization (CAMEO), which facilitates design space exploration of the co-design problem via a modular and automated workflow system, enhancing flexibility and accelerating the design and validation cycles. The cloud-scale automation provides a user-friendly interface and enables energy system modelers to efficiently explore diverse design alternatives. CAMEO aims to advance energy system optimization by developing a next-generation design assistant with improved scalability, usability, and automation, thereby enabling the development of optimized energy systems with greater ease and speed.
Consensus over Clustered Networks using Intermittent and Asynchronous Output Feedback
In recent years, multi-agent teaming has garnered considerable interest since complex objectives, such as intelligence, surveillance, and reconnaissance, can be divided into multiple cluster-level sub-tasks and assigned to a cluster of agents with the appropriate functionality. Yet, coordination and information dissemination between clusters may be necessary to accomplish a desired objective. Distributed consensus protocols provide a mechanism for spreading information within clustered networks, allowing agents and clusters to make decisions without requiring direct access to the state of the ensemble. Hence, we propose a strategy for achieving system-wide consensus in the states of identical linear time-invariant systems coupled by an undirected graph whose directed sub-graphs are available only at sporadic times. Within this work, the agents of the network are organized into pairwise disjoint clusters, which induce sub-graphs of the undirected parent graph. Some cluster sub-graph pairs are linked by an inter-cluster sub-graph, where the union of all cluster and inter-cluster sub-graphs yields the undirected parent graph. Each agent utilizes a distributed consensus protocol with components that are updated intermittently and asynchronously with respect to other agents. The closed-loop ensemble dynamics is modeled as a hybrid system, and a Lyapunov-based stability analysis yields sufficient conditions for rendering the agreement subspace (consensus set) globally exponentially stable. Furthermore, an input-to-state stability argument demonstrates the consensus set is robust to a class of perturbations. A numerical simulation considering both nominal and perturbed scenarios is provided for validation purposes.
Collaborative Robot Arm Inserting Nasopharyngeal Swabs with Admittance Control
The nasopharyngeal (NP) swab sample test, commonly used to detect COVID-19 and other respiratory illnesses, involves moving a swab through the nasal cavity to collect samples from the nasopharynx. While this is typically done by human healthcare workers, there is significant societal interest in enabling robots to perform this test to reduce exposure to patients and to free up human resources. The task is challenging from the robotics perspective because of the dexterity and safety requirements. While other works have implemented specific hardware solutions, our research differentiates itself by using a ubiquitous rigid robotic arm. This work presents a case study where we investigate the strengths and challenges of using a compliant control system to accomplish NP swab tests with such a robotic configuration. To accomplish this, we designed a force-sensing end-effector that integrates with the proposed torque-controlled compliant control loop. We then conducted experiments where the robot inserted NP swabs into a 3D-printed nasal cavity phantom. Ultimately, we found that the compliant control system outperformed a basic position controller and shows promise for human use. However, further efforts are needed to ensure the initial alignment with the nostril and to address head motion.
comment: 13 pages, 9 figures. See https://uwaterloo.ca/scholar/pqjlee/collaborative-robot-arm-inserting-nasopharyngeal-swabs-admittance-control for supplementary data
Data-driven H2-optimal Model Reduction via Offline Transfer Function Sampling
$\mathcal{H}_2$-optimal model order reduction algorithms represent a significant class of techniques, known for their accuracy, which has been extensively validated over the past two decades. Among these, the Iterative Rational Krylov Algorithm (IRKA) is widely regarded as a benchmark for constructing $\mathcal{H}_2$-optimal reduced-order models. However, a key challenge in its data-driven implementation lies in the need for transfer function samples and their derivatives, which must be updated iteratively. Conducting new experiments to acquire these samples each time IRKA updates the interpolation data is impractical. Additionally, for discrete-time systems, obtaining transfer function samples at frequencies outside the unit circle is challenging, as these are not easily accessible through measurements. This paper proposes a method to sample the transfer function and its derivative offline using frequency or time-domain data, which is commonly measured for various design and analysis purposes in industry. By leveraging this approach, there is no need to directly measure transfer function samples at interpolation points, as these can be generated offline using the pre-existing data. This facilitates the offline implementation of IRKA within the frequency- or time-domain Loewner framework. The approach is also extended to discrete-time systems in this work. A numerical example is provided to validate the theoretical findings presented.
Data-driven Modeling of Combined Sewer Systems for Urban Sustainability: An Empirical Evaluation
Climate change poses complex challenges, with extreme weather events becoming increasingly frequent and difficult to model. Examples include the dynamics of Combined Sewer Systems (CSS). Overburdened CSS during heavy rainfall will overflow untreated wastewater into surface water bodies. Classical approaches to modeling the impact of extreme rainfall events rely on physical simulations, which are particularly challenging to create for large urban infrastructures. Deep Learning (DL) models offer a cost-effective alternative for modeling the complex dynamics of sewer systems. In this study, we present a comprehensive empirical evaluation of several state-of-the-art DL time series models for predicting sewer system dynamics in a large urban infrastructure, utilizing three years of measurement data. We especially investigate the potential of DL models to maintain predictive precision during network outages by comparing global models, which have access to all variables within the sewer system, and local models, which are limited to data from a restricted set of local sensors. Our findings demonstrate that DL models can accurately predict the dynamics of sewer system load, even under network outage conditions. These results suggest that DL models can effectively aid in balancing the load redistribution in CSS, thereby enhancing the sustainability and resilience of urban infrastructures.
comment: 12 pages, 4 figures, accepted at 47th German Conference on Artificial Intelligence, Wuerzburg 2024
Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation
Recent works have provided algorithms by which decentralised agents, which may be connected via a communication network, can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system. However, these algorithms are given for tabular settings: this computationally limits the size of players' observation space, meaning that the algorithms are not able to handle anything but small state spaces, nor to generalise beyond policies depending on the ego player's state to so-called 'population-dependent' policies. We address this limitation by introducing function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method that has previously been employed only in finite-horizon, episodic, centralised settings. While this permits us to include the population's mean-field distribution in the observation for each player's policy, it is arguably unrealistic to assume that decentralised agents would have access to this global information: we therefore additionally provide new algorithms that allow agents to estimate the global empirical distribution based on a local neighbourhood, and to improve this estimate via communication over a given network. Our experiments showcase how the communication network allows decentralised agents to estimate the mean-field distribution for population-dependent policies, and that exchanging policy information helps networked agents to outperform both independent and even centralised agents in function-approximation settings, by an even greater margin than in tabular settings.
Enhanced Visual SLAM for Collision-free Driving with Lightweight Autonomous Cars
The paper presents a vision-based obstacle avoidance strategy for lightweight self-driving cars that can be run on a CPU-only device using a single RGB-D camera. The method consists of two steps: visual perception and path planning. The visual perception part uses ORBSLAM3 enhanced with optical flow to estimate the car's poses and extract rich texture information from the scene. In the path planning phase, we employ a method combining a control Lyapunov function and control barrier function in the form of quadratic program (CLF-CBF-QP) together with an obstacle shape reconstruction process (SRP) to plan safe and stable trajectories. To validate the performance and robustness of the proposed method, simulation experiments were conducted with a car in various complex indoor environments using the Gazebo simulation environment. Our method can effectively avoid obstacles in the scenes. The proposed algorithm outperforms benchmark algorithms in achieving more stable and shorter trajectories across multiple simulated scenes.
comment: 16 pages; Submitted to a journal
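For orientation, a CLF-CBF-QP is commonly written in the following generic form. This is a textbook-style sketch, not the paper's exact formulation: it assumes control-affine dynamics $\dot{x} = f(x) + g(x)u$, a control Lyapunov function $V$, a control barrier function $h$ whose safe set is $\{x : h(x) \ge 0\}$, a relaxation variable $\delta$, a positive-definite weight $H$, constants $p, \gamma > 0$, and a class-$\mathcal{K}$ function $\alpha$. All of these symbols are assumptions introduced here.

```latex
\begin{aligned}
\min_{u,\,\delta}\quad & \tfrac{1}{2}\,u^{\top} H u + p\,\delta^{2} \\
\text{s.t.}\quad & L_f V(x) + L_g V(x)\,u + \gamma\,V(x) \le \delta, \\
& L_f h(x) + L_g h(x)\,u + \alpha\bigl(h(x)\bigr) \ge 0 .
\end{aligned}
```

The first constraint asks for (relaxed) convergence, while the second is kept hard so that safety takes priority over stability when the two conflict.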
Flatness-based control revisited: The HEOL setting
We present the algebraic foundations of the HEOL setting, which combines flatness-based control and intelligent controllers, two advances in automatic control that have been proven in practice, including in industry. The result provides a solution to many pending questions on feedback loops concerning flatness-based control and model-free control (MFC). Elementary module theory, ordinary differential fields and the generalization of Kähler differentials to differential fields provide an intrinsic definition of the tangent linear system. The algebraic manipulations associated with the operational calculus lead to homeostat and intelligent controllers. They are illustrated via some computer simulations.
comment: Accepted for publication in "Comptes Rendus Mathématique"
Learning Deep Dissipative Dynamics
This study tackles the challenge of strictly guaranteeing "dissipativity" of a dynamical system represented by neural networks learned from given time-series data. Dissipativity is a crucial indicator for dynamical systems that generalizes stability and input-output stability, known to be valid across various systems including robotics, biological systems, and molecular dynamics. By analytically proving the general solution to the nonlinear Kalman-Yakubovich-Popov (KYP) lemma, which is the necessary and sufficient condition for dissipativity, we propose a differentiable projection that transforms any dynamics represented by neural networks into dissipative ones and a learning method for the transformed dynamics. Utilizing the generality of dissipativity, our method strictly guarantees stability, input-output stability, and energy conservation of trained dynamical systems. Finally, we demonstrate the robustness of our method against out-of-domain input through applications to robotic arms and fluid dynamics. Code is available at https://github.com/kojima-r/DeepDissipativeModel
Ultra-Fast and Efficient Design Method Using Deep Learning for Capacitive Coupling WPT System
Capacitive coupling wireless power transfer (CCWPT) is one of the pervasive methods to transfer power in the reactive near-field zone. In this paper, a flexible design methodology based on the Binary Particle Swarm Optimization (BPSO) algorithm is proposed for a pixelated microstrip structure. The pixel configuration of each parallel plate (43x43 pixels) determines the frequency response of the system (S-parameters), and by changing this configuration we can achieve the desired operating frequency (resonance frequency) and its related |S21| value. Due to the large number of pixels, an iterative optimization algorithm (BPSO) is used to design the CCWPT system. However, the output of each iteration must be simulated in an electromagnetic simulator (e.g., CST, HFSS), so the whole optimization process is time-consuming. This paper develops a rapid, agile and efficient method for designing two parallel pixelated microstrip plates of a CCWPT system based on deep neural networks. In the proposed method, the CST-based BPSO algorithm is replaced with an AI-based method using ResNet-18. Compared to conventional techniques, the AI-based iterative method offers an automatic design process, higher efficiency, lower time and computational resource consumption, and lower requirements for background EM knowledge. Finally, a prototype of the proposed simulated structure is fabricated and measured. The simulation and measurement results validate the accuracy of the design procedure using the AI-based BPSO algorithm. The MAE (Mean Absolute Error) of prediction for the main resonance frequency and related |S21| are 110 MHz and 0.18 dB, respectively, and according to the simulation results, the whole design process is 3629 times faster than the CST-based BPSO algorithm.
Real-Time Discrete Fractional Fourier Transform Using Metamaterial Coupled Lines Network
Discrete Fractional Fourier Transforms (DFrFT) are universal mathematical tools in signal processing, communications and microwave sensing. Despite the extensive applications of DFrFT, implementation of the corresponding fractional orders in the baseband signal often leads to bulky, power-hungry, and high-latency systems. In this paper, we present a passive metamaterial coupled lines network (MCLN) that performs the analog DFrFT in real time at microwave frequencies. The proposed MCLN consists of M parallel microstrip transmission lines (TLs) in which adjacent TLs are loaded with interdigital capacitors to enhance the coupling level. We show that with proper design of the coupling coefficients between adjacent channels, the MCLN can perform an M-point DFrFT of an arbitrary fractional order that can be designed through the length of the network. In the context of real-time signal processing for realization of DFrFT, we design, model, simulate and implement a 16x16 MCLN and experimentally demonstrate the performance of the proposed structure. The proposed approach is versatile and can be used in various applications where DFrFT is an essential tool. The proposed design scheme based on MCLN is scalable across the frequency spectrum and can be applied to millimeter and submillimeter wave systems.
Reflex-Based Open-Vocabulary Navigation without Prior Knowledge Using Omnidirectional Camera and Multiple Vision-Language Models
Various robot navigation methods have been developed, but they are mainly based on Simultaneous Localization and Mapping (SLAM), reinforcement learning, etc., which require prior map construction or learning. In this study, we consider the simplest method that does not require any map construction or learning, and perform open-vocabulary robot navigation without any prior knowledge. We applied an omnidirectional camera and pre-trained vision-language models to the robot. The omnidirectional camera provides a uniform view of the surroundings, thus eliminating the need for complicated exploratory behaviors including trajectory generation. By applying multiple pre-trained vision-language models to this omnidirectional image and incorporating reflective behaviors, we show that navigation becomes simple and does not require any prior setup. Interesting properties and limitations of our method are discussed based on experiments with the mobile robot Fetch.
comment: Accepted at Advanced Robotics, website - https://haraduka.github.io/omnidirectional-vlm/
Measurement-based Fast Quantum State Stabilization with Deep Reinforcement Learning
The stabilization of quantum states is a fundamental problem for realizing various quantum technologies. Measurement-based feedback strategies have demonstrated powerful performance, and the construction of quantum control signals using measurement information has attracted great interest. However, the interaction between quantum systems and the environment is inevitable, especially when measurements are introduced, which leads to decoherence. To mitigate decoherence, it is desirable to stabilize quantum systems faster, thereby reducing the time of interaction with the environment. In this paper, we utilize information obtained from measurement and apply deep reinforcement learning (DRL) algorithms, without explicitly constructing specific complex measurement-control mappings, to rapidly drive a random initial quantum state to the target state. The proposed DRL algorithm speeds up the convergence to a target state, which shortens the interaction between quantum systems and their environments to protect coherence. Simulations are performed on two-qubit and three-qubit systems, and the results show that our algorithm can successfully stabilize a random initial quantum state to the target entangled state, with a shorter convergence time than traditional methods such as Lyapunov feedback control. Moreover, it exhibits robustness against imperfect measurements and delays in system evolution.
An Econometric Analysis of Large Flexible Cryptocurrency-mining Consumers in Electricity Markets
In recent years, power grids have seen a surge in large cryptocurrency mining firms, with individual consumption levels reaching 700 MW. This study examines the behavior of these firms in Texas, focusing on how their consumption is influenced by cryptocurrency conversion rates, electricity prices, local weather, and other factors. We transform the skewed electricity consumption data of these firms, perform correlation analysis, and apply a seasonal autoregressive moving average model for analysis. Our findings reveal that, surprisingly, short-term mining electricity consumption is not correlated with cryptocurrency conversion rates. Instead, the primary influencers are temperature and electricity prices. These firms also adjust their consumption to avoid transmission and distribution (T&D) network charges -- widely known as four coincident peak (4CP) charges -- during the summer. As the scale of these firms is likely to surge in future years, the developed electricity consumption model can be used to generate public, synthetic datasets to understand the overall impact on the power grid. The developed model could also lead to better pricing mechanisms that effectively use the flexibility of these resources to improve power grid reliability.
comment: 10 pages, 10 figures, accepted for publication in Hawaii International Conference on System Sciences-58
On the design of stabilizing FIR controllers
Recently, it has been observed that finite impulse response controllers are an excellent basis for encrypted control, where privacy-preserving controller evaluations via special cryptosystems are the main focus. Beneficial properties of FIR filters are also well-known from digital signal processing, which makes them preferable over infinite impulse response filters in many applications. Their appeal extends to feedback control, offering design flexibility grounded solely on output measurements. However, designing FIR controllers is challenging, which motivates this work. To address the design challenge, we initially show that FIR controller designs for linear systems can equivalently be stated as static or dynamic output feedback problems. After focusing on the existence of stabilizing FIR controllers for a given plant, we tailor two common design approaches for output feedback to the case of FIR controllers. Unfortunately, it will turn out that the FIR characteristics add further restrictions to the LMI-based approaches. Hence, we finally turn to designs building on non-convex optimization, which provide satisfactory results for a selection of benchmark systems.
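To make the "grounded solely on output measurements" property concrete, here is a minimal sketch of evaluating an FIR controller. It assumes the common form u[k] = sum_i h[i] * y[k - i]; the abstract does not state the exact structure, and the class name `FIRController`, the coefficient values, and the zero initial conditions are hypothetical.

```python
from collections import deque


class FIRController:
    """Illustrative FIR output-feedback controller (hypothetical helper).

    Computes u[k] = sum_i h[i] * y[k - i] from the last len(h) measurements.
    """

    def __init__(self, coefficients):
        self.h = list(coefficients)
        # Past outputs, newest first; zero initial conditions are an assumption.
        self.history = deque([0.0] * len(self.h), maxlen=len(self.h))

    def step(self, y):
        # Push the newest measurement; the oldest one drops out automatically.
        self.history.appendleft(y)
        return sum(h_i * y_i for h_i, y_i in zip(self.h, self.history))


# Feeding a unit impulse returns the coefficients, then zeros (finite memory).
controller = FIRController([0.5, -0.25, 0.125])
outputs = [controller.step(y) for y in [1.0, 0.0, 0.0, 0.0]]
```

The controller has no internal state beyond a finite window of past outputs, which is the property that distinguishes it from an infinite impulse response design.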
Distributed alternating gradient descent for convex semi-infinite programs over a network
This paper presents a first-order distributed algorithm for solving a convex semi-infinite program (SIP) over a time-varying network. In this setting, the objective function associated with the optimization problem is a summation of a set of functions, each held by one node in a network. The semi-infinite constraint, on the other hand, is known to all agents. The nodes collectively aim to solve the problem using local data about the objective and limited communication capabilities depending on the network topology. Our algorithm is built on three key ingredients: consensus step, gradient descent in the local objective, and local gradient descent iterations in the constraint at a node when the estimate violates the semi-infinite constraint. The algorithm is constructed, and its parameters are prescribed in such a way that the iterates held by each agent provably converge to an optimizer. That is, as the algorithm progresses, the estimates achieve consensus, and the constraint violation and the error in the optimal value are bounded above by vanishing terms. A simulation example illustrates our results.
comment: 16 pages, 1 figure
Decoupling Power Quality Issues in Grid-Microgrid Network Using Microgrid Building Blocks
Microgrids are evolving as promising options to enhance the reliability of the connected transmission and distribution systems. Traditional design and deployment of microgrids require significant engineering analysis. Microgrid Building Blocks (MBB), consisting of modular blocks that integrate seamlessly to form effective microgrids, is an enabling concept for faster and broader adoption of microgrids. A back-to-back converter placed at the point of common coupling of the microgrid is an integral part of the MBB. This paper presents applications of MBB to decouple power quality issues in a grid-microgrid network serving power-quality-sensitive loads such as data centers, new grid-edge technologies such as vehicle-to-grid generation, and electric vehicle charging loads during evacuation before disaster events. Simulation results show that MBB effectively decouples the power quality issues across networks and helps maintain good power quality in the power-quality-sensitive network based on the operational scenario.
comment: This paper is accepted for publication in IEEE IECON 2024, Chicago, IL. The complete copyright version will be available on IEEE Xplore when the conference proceedings are published
Emerging clean technologies: policy-driven cost reductions, implications and perspectives
Hydrogen production from water electrolysis, direct air capture (DAC), and synthetic kerosene derived from hydrogen and CO2 ('e-kerosene') are expected to play an important role in global decarbonization efforts. So far, the economics of these nascent technologies hamper their market diffusion. However, a wave of recent policy support in the United States, Europe, China, and elsewhere is anticipated to drive their commercial liftoff and bring their costs down. To this end, we evaluate the potential cost reductions driven by policy-induced scale-up of these emerging technologies through 2030 using an experience curves approach accounting for both local and global learning effects. We then analyze the consequences of projected cost declines on the competitiveness of these nascent technologies compared to conventional fossil alternatives, where applicable, and highlight some of the tradeoffs associated with their expansion. Our findings indicate that enacted policies could lead to substantial capital cost reductions for electrolyzers. Nevertheless, electrolytic hydrogen production at $1-2/kg would still require some form of policy support. Given expected costs and experience curves, it is unlikely that liquid solvent DAC (L-DAC) scale-up will bring removal costs to stated targets of $100/tCO2, though $200/tCO2 may eventually be within reach. We also underscore the importance of tackling methane leakage for natural gas-powered L-DAC: unmitigated leaks amplify net removal costs, exacerbate the investment requirements to reach targeted costs, and cast doubt on L-DAC's role in the clean energy transition. Lastly, despite reductions in electrolysis and L-DAC costs, e-kerosene remains considerably more expensive than fossil jet fuel. The economics of e-kerosene and the resources required for production raise questions about the fuel's ultimate viability as a decarbonization tool for aviation.
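As a rough illustration of the experience-curve idea, the sketch below uses the standard single-factor form C(x) = C0 * (x / x0) ** (-b), where the learning rate per capacity doubling is 1 - 2 ** (-b). This is a simplification: the paper accounts for both local and global learning, and every number and function name here is a hypothetical placeholder rather than a value from the study.

```python
import math


def learning_exponent(learning_rate):
    """Exponent b such that each capacity doubling cuts cost by `learning_rate`."""
    return -math.log2(1.0 - learning_rate)


def projected_cost(initial_cost, initial_capacity, capacity, learning_rate):
    """Single-factor experience curve: C(x) = C0 * (x / x0) ** (-b)."""
    b = learning_exponent(learning_rate)
    return initial_cost * (capacity / initial_capacity) ** (-b)


# Hypothetical inputs, not taken from the paper: 1000 $/kW at 1 GW installed,
# 20 % cost reduction per doubling, scale-up to 8 GW (three doublings),
# which corresponds to 1000 * 0.8 ** 3.
cost_8gw = projected_cost(1000.0, 1.0, 8.0, 0.20)
```

Because cost falls by a fixed fraction per doubling, each successive cost reduction requires twice as much additional deployment as the previous one, which is why stated cost targets can remain out of reach even under substantial scale-up.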
Suppressing unknown disturbances to dynamical systems using machine learning
Identifying and suppressing unknown disturbances to dynamical systems is a problem with applications in many different fields. Here we present a model-free method to identify and suppress an unknown disturbance to an unknown system based only on previous observations of the system under the influence of a known forcing function. We find that, under very mild restrictions on the training function, our method is able to robustly identify and suppress a large class of unknown disturbances. We illustrate our scheme with the identification of both deterministic and stochastic unknown disturbances to an analog electric chaotic circuit and with numerical examples where a chaotic disturbance to various chaotic dynamical systems is identified and suppressed.
Comparative Analysis of NMPC and Fuzzy PID Controllers for Trajectory Tracking in Omni-Drive Robots: Design, Simulation, and Performance Evaluation
Trajectory tracking for an omni-drive robot presents a challenging task that demands an efficient controller design. This paper introduces a self-optimizing controller, Type-1 fuzzyPID, which leverages dynamic and static system response analysis to overcome the limitations of manual tuning. To account for system uncertainties, an Interval Type-2 fuzzyPID controller is also developed. Both controllers are designed using Matlab/Simulink and tested through trajectory tracking simulations in the CoppeliaSim environment. Additionally, a non-linear model predictive controller (NMPC) is proposed and compared against the fuzzyPID controllers. The impact of tunable parameters on NMPC tracking accuracy is thoroughly examined. We also present plots of the step-response characteristics and noise rejection experiments for each controller. Simulation results validate the precision and effectiveness of NMPC over the fuzzyPID controllers, at the cost of higher computational complexity. The code and simulation environment are available at the following link: https://github.com/love481/Omni-drive-robot-Simulation.git.
2-Level Reinforcement Learning for Ships on Inland Waterways: Path Planning and Following
This paper proposes a realistic modularized framework for controlling autonomous surface vehicles (ASVs) on inland waterways (IWs) based on deep reinforcement learning (DRL). The framework improves operational safety and comprises two levels: a high-level local path planning (LPP) unit and a low-level path following (PF) unit, each consisting of a DRL agent. The LPP agent is responsible for planning a path under consideration of dynamic vessels, closing a gap in the current research landscape. In addition, the LPP agent adequately considers traffic rules and the geometry of the waterway. We thereby introduce a novel application of a spatial-temporal recurrent neural network architecture to continuous action spaces. The LPP agent outperforms a state-of-the-art artificial potential field (APF) method by increasing the minimum distance to other vessels by 65% on average. The PF agent performs low-level actuator control while accounting for shallow water influences and the environmental forces winds, waves, and currents. Compared with a proportional-integral-derivative (PID) controller, the PF agent yields only 61% of the mean cross-track error (MCTE) while significantly reducing control effort (CE) in terms of the required absolute rudder angle. Lastly, both agents are jointly validated in simulation, employing the lower Elbe in northern Germany as an example case and using real automatic identification system (AIS) trajectories to model the behavior of other ships.
Receding Horizon Games for Modeling Competitive Supply Chains
The vast majority of products we use daily are supplied to us through complex global supply chains that transform raw materials into finished goods and distribute them to end consumers. This paper proposes a modeling methodology for dynamic competitive supply chains based on game theory and model predictive control. We model each manufacturer in the supply chain as a rational utility maximizing agent that selects their actions by finding an open-loop generalized Nash equilibrium of a multi-stage game. To react to competitors and the state of the market, every agent re-plans their actions in a receding horizon manner based on estimates of market and supplier parameters thereby creating an approximate closed-loop equilibrium policy. We demonstrate through numerical simulations that this modeling approach is computationally tractable and generates economically interpretable behaviors in a variety of settings such as demand spikes, supply shocks, and information asymmetry.
Graph Simplification Solutions to the Street Intersection Miscount Problem
Street intersection counts and densities are fundamental measures in transport geography and urban planning. Conventional street network data and analysis tools often lead to significant overcounting of these measures. This study investigates the causes of such overcounting and proposes remedies. It introduces algorithms to streamline urban street network models through edge simplification and node consolidation, which enhance computational efficiency and improve accuracy in network measures like intersection counts, street segment lengths, and node degrees. The algorithms are validated and subjected to a global empirical assessment to evaluate the extent of count bias. These findings highlight the prevalence of this bias and underscore the necessity for better methods to mitigate inaccuracies in intersection representation.
comment: Conference paper
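The edge-simplification idea can be sketched on a toy graph as follows. This is an illustrative reduction, not the paper's algorithm: it assumes an undirected simple graph stored as adjacency sets, treats any degree-2 node as interstitial geometry, and ignores node consolidation. The function names and the example network are hypothetical.

```python
def build_adjacency(edges):
    """Build node -> set-of-neighbours mapping from undirected edge pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def simplify_edges(adjacency):
    """Remove interstitial degree-2 nodes, joining their two neighbours.

    Nodes whose removal would duplicate an existing edge are kept, a
    simplification relative to multigraph-based tools.
    """
    graph = {node: set(nbrs) for node, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for node in list(graph):
            nbrs = graph[node]
            if len(nbrs) != 2:
                continue
            a, b = nbrs
            if b in graph[a]:
                continue  # merging would duplicate an existing edge
            graph[a].discard(node)
            graph[b].discard(node)
            graph[a].add(b)
            graph[b].add(a)
            del graph[node]
            changed = True
    return graph


# One four-way crossing "X"; each arm has a mid-block geometry node (m*)
# before its dead end (e*).
edges = [("X", "m1"), ("m1", "e1"), ("X", "m2"), ("m2", "e2"),
         ("X", "m3"), ("m3", "e3"), ("X", "m4"), ("m4", "e4")]
raw = build_adjacency(edges)
simplified = simplify_edges(raw)
intersections = [n for n, nbrs in simplified.items() if len(nbrs) >= 3]
```

In this toy network the raw model has nine nodes, while the simplified model keeps only the crossing and the four dead ends, so a count that treats every model node as a junction would be inflated by the geometry nodes alone.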
An agent design with goal reaching guarantees for enhancement of learning
Reinforcement learning is commonly concerned with problems of maximizing accumulated rewards in Markov decision processes. Oftentimes, a certain goal state or a subset of the state space attains maximal reward. In such a case, the environment may be considered solved when the goal is reached. Whereas numerous techniques, learning or non-learning based, exist for solving environments, doing so optimally is the biggest challenge. For example, one may choose a reward rate that penalizes the action effort. Reinforcement learning is currently among the most actively developed frameworks for solving environments optimally by virtue of maximizing accumulated reward, in other words, returns. Yet, tuning agents is a notoriously hard task, as reported in a series of works. Our aim here is to help the agent learn a near-optimal policy efficiently while ensuring a goal-reaching property of some basis policy that merely solves the environment. We suggest an algorithm that is fairly flexible and can be used to augment practically any agent as long as it comprises a critic. A formal proof of the goal-reaching property is provided. Comparative experiments on several problems under popular baseline agents provide empirical evidence that learning can indeed be boosted while ensuring the goal-reaching property.
Hamilton-Jacobi Reachability in Reinforcement Learning: A Survey
Recent literature has proposed approaches that learn control policies with high performance while maintaining safety guarantees. Synthesizing Hamilton-Jacobi (HJ) reachable sets has become an effective tool for verifying safety and supervising the training of reinforcement learning-based control policies for complex, high-dimensional systems. Previously, HJ reachability was restricted to verifying low-dimensional dynamical systems primarily because the computational complexity of the dynamic programming approach it relied on grows exponentially with the number of system states. In recent years, a litany of proposed methods addresses this limitation by computing the reachability value function simultaneously with learning control policies to scale HJ reachability analysis while still maintaining a reliable estimate of the true reachable set. These HJ reachability approximations are used to improve the safety, and even reward performance, of learned control policies and can solve challenging tasks such as those with dynamic obstacles and/or with lidar-based or vision-based observations. In this survey paper, we review the recent developments in the field of HJ reachability estimation in reinforcement learning that would provide a foundational basis for further research into reliability in high-dimensional systems.
comment: Accepted in IEEE Open Journal of Control Systems (OJ-CSYS)
The Role of Electric Grid Research in Addressing Climate Change
Addressing the urgency of climate change necessitates a coordinated and inclusive effort from all relevant stakeholders. Critical to this effort is the modeling, analysis, control, and integration of technological innovations within the electric energy system, which plays a crucial role in scaling up climate change solutions. This perspective article presents a set of research challenges and opportunities in the area of electric power systems that would be crucial in accelerating Gigaton-level decarbonization. Furthermore, it highlights institutional challenges associated with developing market mechanisms and regulatory architectures, ensuring that incentives are aligned for stakeholders to effectively implement the technological solutions on a large scale.
comment: 17 pages, 2 figures
Shock waves in nonlinear transmission lines
We consider the interaction between small-amplitude travelling waves ("sound") and shock waves in a transmission line containing both nonlinear capacitors and nonlinear inductors. For the "sound" wave, we calculate the coefficient of reflection from the shock wave and the coefficient of transmission through it. These coefficients are expressed in terms of the wave speeds and the wave impedances. When only the capacitors or only the inductors are nonlinear, the coefficients are expressed in terms of the wave speeds only. We explicitly take dissipation into account in our treatment of the shocks by introducing ohmic resistors shunting the inductors and in series with the capacitors. This allows us to describe the shocks as physical objects of finite width and to study their profiles. In some particular cases, the profiles are obtained in terms of elementary functions.
comment: pdfLaTeX, 8 pages, 4 figures. The paper was edited substantially
Multiagent Systems
VIRIS: Simulating indoor airborne transmission combining architectural design and people movement
A Viral Infection Risk Indoor Simulator (VIRIS) has been developed to quickly assess and compare mitigations for airborne disease spread. This agent-based simulator combines people movement in an indoor space, viral transmission modelling and detailed architectural design, and it is powered by topologicpy, an open-source Python library. VIRIS generates very fast predictions of the viral concentration and the spatiotemporal infection risk for individuals as they move through a given space. The simulator is validated with data from a courtroom superspreader event. A sensitivity study for unknown parameter values is also performed. We compare several non-pharmaceutical interventions (NPIs) issued in UK government guidance, for two indoor settings: a care home and a supermarket. Additionally, we have developed the user-friendly VIRIS web app that allows quick exploration of diverse scenarios of interest and visualisation, allowing policymakers, architects and space managers to easily design or assess infection risk in an indoor space.
Bayesian Optimization Framework for Efficient Fleet Design in Autonomous Multi-Robot Exploration
This study addresses the challenge of fleet design optimization in the context of heterogeneous multi-robot fleets, aiming to obtain feasible designs that balance performance and costs. In the domain of autonomous multi-robot exploration, reinforcement learning agents play a central role, offering adaptability to complex terrains and facilitating collaboration among robots. However, modifying the fleet composition results in changes in the learned behavior, and training multi-robot systems using multi-agent reinforcement learning is expensive. Therefore, an exhaustive evaluation of each potential fleet design is infeasible. To tackle these hurdles, we introduce Bayesian Optimization for Fleet Design (BOFD), a framework leveraging multi-objective Bayesian Optimization to explore fleets on the Pareto front of performance and cost while accounting for uncertainty in the design space. Moreover, we establish a sub-linear bound for cumulative regret, supporting BOFD's robustness and efficacy. Extensive benchmark experiments in synthetic and simulated environments demonstrate the superiority of our framework over state-of-the-art methods, achieving efficient fleet designs with minimal fleet evaluations.
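The notion of a Pareto front over performance and cost can be illustrated with a plain dominance check. Note that this sketch evaluates every candidate exhaustively, which is exactly what the abstract says is infeasible in practice; BOFD's Bayesian optimization is not reproduced here. The fleet names and scores are hypothetical.

```python
def pareto_front(designs):
    """Return names of designs not dominated by any other design.

    Each design is (name, performance, cost); performance is maximised
    and cost is minimised.
    """
    front = []
    for name, perf, cost in designs:
        dominated = any(
            (p >= perf and c <= cost) and (p > perf or c < cost)
            for _, p, c in designs
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical fleet candidates: (name, performance score, cost).
fleets = [("A", 0.60, 10.0), ("B", 0.80, 25.0),
          ("C", 0.75, 30.0), ("D", 0.90, 40.0)]
front = pareto_front(fleets)
```

Here fleet C is excluded because fleet B performs better at lower cost, while A, B, and D each represent a different trade-off that no other candidate beats on both objectives.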
Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation
Recent works have provided algorithms by which decentralised agents, which may be connected via a communication network, can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system. However, these algorithms are given for tabular settings: this computationally limits the size of players' observation space, meaning that the algorithms are not able to handle anything but small state spaces, nor to generalise beyond policies depending on the ego player's state to so-called 'population-dependent' policies. We address this limitation by introducing function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method that has previously been employed only in finite-horizon, episodic, centralised settings. While this permits us to include the population's mean-field distribution in the observation for each player's policy, it is arguably unrealistic to assume that decentralised agents would have access to this global information: we therefore additionally provide new algorithms that allow agents to estimate the global empirical distribution based on a local neighbourhood, and to improve this estimate via communication over a given network. Our experiments showcase how the communication network allows decentralised agents to estimate the mean-field distribution for population-dependent policies, and that exchanging policy information helps networked agents to outperform both independent and even centralised agents in function-approximation settings, by an even greater margin than in tabular settings.
Subgoal-based Hierarchical Reinforcement Learning for Multi-Agent Collaboration
Recent advancements in reinforcement learning have made significant impacts across various domains, yet they often struggle in complex multi-agent environments due to issues like algorithm instability, low sampling efficiency, and the challenges of exploration and dimensionality explosion. Hierarchical reinforcement learning (HRL) offers a structured approach to decompose complex tasks into simpler sub-tasks, which is promising for multi-agent settings. This paper advances the field by introducing a hierarchical architecture that autonomously generates effective subgoals without explicit constraints, enhancing both flexibility and stability in training. We propose a dynamic goal generation strategy that adapts based on environmental changes. This method significantly improves the adaptability and sample efficiency of the learning process. Furthermore, we address the critical issue of credit assignment in multi-agent systems by synergizing our hierarchical architecture with a modified QMIX network, thus improving overall strategy coordination and efficiency. Comparative experiments with mainstream reinforcement learning algorithms demonstrate the superior convergence speed and performance of our approach in both single-agent and multi-agent environments, confirming its effectiveness and flexibility in complex scenarios. Our code is open-sourced at: https://github.com/SICC-Group/GMAH.
Deep Reinforcement Learning for Decentralized Multi-Robot Control: A DQN Approach to Robustness and Information Integration
The superiority of Multi-Robot Systems (MRS) in various complex environments is unquestionable. However, in complex situations such as search and rescue, environmental monitoring, and automated production, robots are often required to work collaboratively without a central control unit. This necessitates an efficient and robust decentralized control mechanism to process local information and guide the robots' behavior. In this work, we propose a new decentralized controller design method that utilizes the Deep Q-Network (DQN) algorithm from deep reinforcement learning, aimed at improving the integration of local information and robustness of multi-robot systems. The designed controller allows each robot to make decisions independently based on its local observations while enhancing the overall system's collaborative efficiency and adaptability to dynamic environments through a shared learning mechanism. Through testing in simulated environments, we have demonstrated the effectiveness of this controller in improving task execution efficiency, strengthening system fault tolerance, and enhancing adaptability to the environment. Furthermore, we explored the impact of DQN parameter tuning on system performance, providing insights for further optimization of the controller design. Our research not only showcases the potential application of the DQN algorithm in the decentralized control of multi-robot systems but also offers a new perspective on how to enhance the overall performance and robustness of the system through the integration of local information.
comment: Multi-robot system, Reinforcement learning, Information integrated
Empirical Equilibria in Agent-based Economic systems with Learning agents
We present an agent-based simulator for economic systems with heterogeneous households, firms, central bank, and government agents. These agents interact to define production, consumption, and monetary flow. Each agent type has distinct objectives, such as households seeking utility from consumption and the central bank targeting inflation and production. We define this multi-agent economic system using an OpenAI Gym-style environment, enabling agents to optimize their objectives through reinforcement learning. Standard multi-agent reinforcement learning (MARL) schemes, like independent learning, enable agents to learn concurrently but do not address whether the resulting strategies are at equilibrium. This study integrates the Policy Space Response Oracle (PSRO) algorithm, which has shown superior performance over independent MARL in games with homogeneous agents, with economic agent-based modeling. We use PSRO to develop agent policies approximating Nash equilibria of the empirical economic game, thereby linking to economic equilibria. Our results demonstrate that PSRO strategies achieve lower regret values than independent MARL strategies in our economic system with four agent types. This work aims to bridge artificial intelligence, economics, and empirical game theory towards future research.
comment: arXiv admin note: text overlap with arXiv:2402.09563
$α$-Rank-Collections: Analyzing Expected Strategic Behavior with Uncertain Utilities
Game theory relies heavily on the availability of cardinal utility functions, but in fields such as matching markets, only ordinal preferences are typically elicited. The literature focuses on mechanisms with simple dominant strategies, but many real-world applications lack dominant strategies, making the intensity of preferences between outcomes important for determining strategies. Even though precise information about cardinal utilities is not available, some data about the likelihood of utility functions is often accessible. We propose to use Bayesian games to formalize uncertainty about the decision-makers' utilities by viewing them as a collection of normal-form games. Instead of searching for the Bayes-Nash equilibrium, we study how uncertainty in utilities is reflected in uncertainty of strategic play. To do this, we introduce a novel solution concept called $\alpha$-Rank-collections, which extends $\alpha$-Rank to Bayesian games. This allows us to analyze strategic play in, for example, non-strategyproof matching markets, for which appropriate solution concepts are currently lacking. $\alpha$-Rank-collections characterize the expected probability of encountering a certain strategy profile under replicator dynamics in the long run, rather than predicting a specific equilibrium strategy profile. We experimentally evaluate $\alpha$-Rank-collections using instances of the Boston mechanism, finding that our solution concept provides more nuanced predictions compared to Bayes-Nash equilibria. Additionally, we prove that $\alpha$-Rank-collections are invariant to positive affine transformations, a standard property for a solution concept, and are efficient to approximate.
comment: Accepted at the 25th ACM Conference on Economics and Computation (EC), 2024
Robotics
FLAME: Learning to Navigate with Multimodal LLM in Urban Environments
Large Language Models (LLMs) have demonstrated potential in Vision-and-Language Navigation (VLN) tasks, yet current applications face challenges. While LLMs excel in general conversation scenarios, they struggle with specialized navigation tasks, yielding suboptimal performance compared to specialized VLN models. We introduce FLAME (FLAMingo-Architected Embodied Agent), a novel Multimodal LLM-based agent and architecture designed for urban VLN tasks that efficiently handles multiple observations. Our approach implements a three-phase tuning technique for effective adaptation to navigation tasks, including single perception tuning for street view description, multiple perception tuning for trajectory summarization, and end-to-end training on VLN datasets. The augmented datasets are synthesized automatically. Experimental results demonstrate FLAME's superiority over existing methods, surpassing state-of-the-art methods by a 7.3% increase in task completion rate on the Touchdown dataset. This work showcases the potential of Multimodal LLMs (MLLMs) in complex navigation tasks, representing an advancement towards practical applications of MLLMs in embodied AI. Project page: https://flame-sjtu.github.io
comment: 10 pages, 5 figures
RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands
It has been a long-standing research goal to endow robot hands with human-level dexterity. Bi-manual robot piano playing constitutes a task that combines challenges from dynamic tasks, such as generating fast yet precise motions, with slower but contact-rich manipulation problems. Although reinforcement-learning-based approaches have shown promising results in single-task performance, these methods struggle in a multi-song setting. Our work aims to close this gap and thereby enable imitation learning approaches for robot piano playing at scale. To this end, we introduce the Robot Piano 1 Million (RP1M) dataset, containing bi-manual robot piano playing motion data of more than one million trajectories. We formulate finger placements as an optimal transport problem, thus enabling automatic annotation of vast amounts of unlabeled songs. Benchmarking existing imitation learning approaches shows that such approaches reach state-of-the-art robot piano playing performance by leveraging RP1M.
comment: Project Website: https://rp1m.github.io/
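The RP1M abstract above casts finger placement as an optimal transport problem. As a rough illustration only, the sketch below treats a much simpler special case: each pressed key needs exactly one finger, each finger presses at most one key, and the cost is the horizontal distance between finger and key. The function name `assign_fingers`, the 1-D positions, and the distance cost are all assumptions made for this example and are not taken from the paper.

```python
from itertools import permutations


def assign_fingers(finger_x, key_x):
    """Brute-force the cheapest one-to-one assignment of fingers to keys.

    finger_x: current horizontal positions of the fingers (hypothetical units).
    key_x:    horizontal positions of the keys that must be pressed.
    Returns (total_cost, assignment) where assignment[k] is the index of the
    finger chosen for key k. Assumes len(key_x) <= len(finger_x).
    """
    best_cost, best_assignment = float("inf"), None
    # Enumerate every ordered choice of distinct fingers, one per key.
    for perm in permutations(range(len(finger_x)), len(key_x)):
        cost = sum(abs(finger_x[f] - key_x[k]) for k, f in enumerate(perm))
        if cost < best_cost:
            best_cost, best_assignment = cost, perm
    return best_cost, best_assignment


if __name__ == "__main__":
    fingers = [0.0, 2.0, 4.0, 6.0, 8.0]  # five fingers of one hand
    keys = [0.5, 6.5]                    # two keys to press
    print(assign_fingers(fingers, keys))
```

Brute force is only workable for a handful of fingers and keys; a real optimal transport formulation would use a dedicated solver and a richer cost.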
Evaluating Assistive Technologies on a Trade Fair: Methodological Overview and Lessons Learned
User-centered evaluations are a core requirement in the development of new user-related technologies. However, it is often difficult to recruit sufficient participants, especially if the target population is small, particularly busy, or in some way restricted in their mobility. We bypassed these problems by conducting studies at trade fairs that were specifically designed for our target population (potentially care-receiving individuals in wheelchairs) and therefore provided our users with an external incentive to attend our study. This paper presents our gathered experiences, including methodological specifications and lessons learned, and aims to guide other researchers in conducting similar studies. In addition, we also discuss the opportunities created by this unconventional study environment as well as its limitations.
Enhancing End-to-End Autonomous Driving Systems Through Synchronized Human Behavior Data
This paper presents a pioneering exploration into the integration of fine-grained human supervision within the autonomous driving domain to enhance system performance. The current advances in End-to-End autonomous driving normally are data-driven and rely on given expert trials. However, this reliance limits the systems' generalizability and their ability to earn human trust. Addressing this gap, our research introduces a novel approach by synchronously collecting data from human and machine drivers under identical driving scenarios, focusing on eye-tracking and brainwave data to guide machine perception and decision-making processes. This paper utilizes the Carla simulation to evaluate the impact brought by human behavior guidance. Experimental results show that using human attention to guide machine attention could bring a significant improvement in driving performance. However, guidance by human intention still remains a challenge. This paper pioneers a promising direction and potential for utilizing human behavior guidance to enhance autonomous systems.
All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents
Embodied AI is transforming how AI systems interact with the physical world, yet existing datasets are inadequate for developing versatile, general-purpose agents. These limitations include a lack of standardized formats, insufficient data diversity, and inadequate data volume. To address these issues, we introduce ARIO (All Robots In One), a new data standard that enhances existing datasets by offering a unified data format, comprehensive sensory modalities, and a combination of real-world and simulated data. ARIO aims to improve the training of embodied AI agents, increasing their robustness and adaptability across various tasks and environments. Building upon the proposed new standard, we present a large-scale unified ARIO dataset, comprising approximately 3 million episodes collected from 258 series and 321,064 tasks. The ARIO standard and dataset represent a significant step towards bridging the gaps of existing data resources. By providing a cohesive framework for data collection and representation, ARIO paves the way for the development of more powerful and versatile embodied AI agents, capable of navigating and interacting with the physical world in increasingly complex and diverse ways. The project is available on https://imaei.github.io/project_pages/ario/
comment: Project website: https://imaei.github.io/project_pages/ario/
A Mini-Review on Mobile Manipulators with Variable Autonomy
This paper presents a mini-review of the current state of research in mobile manipulators with variable levels of autonomy, emphasizing their associated challenges and application environments. The need for mobile manipulators in different environments is evident due to the unique challenges and risks each presents. Many systems deployed in these environments are not fully autonomous, requiring human-robot teaming to ensure safe and reliable operations under uncertainties. Through this analysis, we identify gaps and challenges in the literature on Variable Autonomy, including cognitive workload and communication delays, and propose future directions, including whole-body Variable Autonomy for mobile manipulators, virtual reality frameworks, and large language models to reduce operators' complexity and cognitive load in some challenging and uncertain scenarios.
comment: Presented at Variable Autonomy for Human-Robot Teaming (VAT) at IEEE RO-MAN 2024 Workshop
DVRP-MHSI: Dynamic Visualization Research Platform for Multimodal Human-Swarm Interaction
In recent years, there has been a significant amount of research on algorithms and control methods for distributed collaborative robots. However, the emergence of collective behavior in a swarm is still difficult to predict and control. Nevertheless, human interaction with the swarm helps render the swarm more predictable and controllable, as human operators can utilize intuition or knowledge that is not always available to the swarm. Therefore, this paper designs the Dynamic Visualization Research Platform for Multimodal Human-Swarm Interaction (DVRP-MHSI), which is an innovative open system that can perform real-time dynamic visualization and is specifically designed to accommodate a multitude of interaction modalities (such as brain-computer, eye-tracking, electromyographic, and touch-based interfaces), thereby expediting progress in human-swarm interaction research. Specifically, the platform consists of custom-made low-cost omnidirectional wheeled mobile robots, multitouch screens, and two workstations. In particular, the multitouch screens can recognize human gestures and the shapes of objects placed on them, and they can also dynamically render diverse scenes. One of the workstations processes communication information within robots and the other one implements human-robot interaction methods. The development of DVRP-MHSI frees researchers from hardware or software details and allows them to focus on versatile swarm algorithms and human-swarm interaction methods without being limited to fixed scenarios, tasks, and interfaces. The effectiveness and potential of the platform for human-swarm interaction studies are validated by several demonstrative experiments.
ZebraPose: Zebra Detection and Pose Estimation using only Synthetic Data
Synthetic data is increasingly being used to address the lack of labeled images in uncommon domains for deep learning tasks. A prominent example is 2D pose estimation of animals, particularly wild species like zebras, for which collecting real-world data is complex and impractical. However, many approaches still require real images, consistency and style constraints, sophisticated animal models, and/or powerful pre-trained networks to bridge the syn-to-real gap. Moreover, they often assume that the animal can be reliably detected in images or videos, a hypothesis that often does not hold, e.g. in wildlife scenarios or aerial images. To solve this, we use synthetic data generated with a 3D photorealistic simulator to obtain the first synthetic dataset that can be used for both detection and 2D pose estimation of zebras without applying any of the aforementioned bridging strategies. Unlike previous works, we extensively train and benchmark our detection and 2D pose estimation models on multiple real-world and synthetic datasets using both pre-trained and non-pre-trained backbones. These experiments show how the models trained from scratch and only with synthetic data can consistently generalize to real-world images of zebras in both tasks. Moreover, we show it is possible to easily generalize those same models to 2D pose estimation of horses with a minimal amount of real-world images to account for the domain transfer. Code, results, trained models, and the synthetic, training, and validation data (including 104K manually labeled frames) are provided as open source at https://zebrapose.is.tue.mpg.de/
comment: 8 pages, 5 tables, 7 figures
Towards reliable real-time trajectory optimization
Motion planning is a key aspect of robotics. A common approach to address motion planning problems is trajectory optimization. Trajectory optimization can represent the high-level behaviors of robots through mathematical formulations. However, current trajectory optimization approaches have two main challenges. Firstly, their solution heavily depends on the initial guess, and they are prone to get stuck in local minima. Secondly, they face scalability limitations by increasing the number of constraints. This thesis endeavors to tackle these challenges by introducing four innovative trajectory optimization algorithms to improve reliability, scalability, and computational efficiency. There are two novel aspects of the proposed algorithms. The first key innovation is remodeling the kinematic constraints and collision avoidance constraints. Another key innovation lies in the design of algorithms that effectively utilize parallel computation on GPU accelerators. By using reformulated constraints and leveraging the computational power of GPUs, the proposed algorithms of this thesis demonstrate significant improvements in efficiency and scalability compared to the existing methods. Parallelization enables faster computation times, allowing for real-time decision-making in dynamic environments. Moreover, the algorithms are designed to adapt to changes in the environment, ensuring robust performance. Extensive benchmarking for each proposed optimizer validates their efficacy. Overall, this thesis makes a significant contribution to the field of trajectory optimization algorithms. It introduces innovative solutions that specifically address the challenges faced by existing methods. The proposed algorithms pave the way for more efficient and robust motion planning solutions in robotics by leveraging parallel computation and specific mathematical structures.
comment: PhD Thesis, University of Tartu, 2024. The thesis was defended on 21st of June. https://dspace.ut.ee/items/a65d36c9-afe7-44ab-b544-20236177ed79
Learning Instruction-Guided Manipulation Affordance via Large Models for Embodied Robotic Tasks
We study the task of language instruction-guided robotic manipulation, in which an embodied robot is supposed to manipulate the target objects based on the language instructions. In previous studies, the predicted manipulation regions of the target object typically do not change with specification from the language instructions, which means that the language perception and manipulation prediction are separate. However, in human behavioral patterns, the manipulation regions of the same object will change for different language instructions. In this paper, we propose Instruction-Guided Affordance Net (IGANet) for predicting affordance maps of instruction-guided robotic manipulation tasks by utilizing powerful priors from vision and language encoders pre-trained on large-scale datasets. We develop a Vision-Language Model (VLM)-based data augmentation pipeline, which can generate a large amount of data automatically for model training. Besides, with the help of Large Language Models (LLMs), actions can be effectively executed to finish the tasks defined by instructions. A series of real-world experiments revealed that our method can achieve better performance with generated data. Moreover, our model can generalize better to scenarios with unseen objects and language instructions.
comment: Accepted to ICARM 2024
Safety Metric Aware Trajectory Repairing for Automated Driving
Recent analyses highlight challenges in autonomous vehicle technologies, particularly failures in decision-making under dynamic or emergency conditions. Traditional automated driving systems recalculate the entire trajectory in a changing environment. Instead, a novel approach retains valid trajectory segments, minimizing the need for complete replanning and reducing changes to the original plan. This work introduces a trajectory repairing framework that calculates a feasible evasive trajectory while computing the Feasible Time-to-React (F-TTR), balancing the maintenance of the original plan with safety assurance. The framework employs a binary search algorithm to iteratively create repaired trajectories, guaranteeing both the safety and feasibility of the trajectory repairing result. In contrast to earlier approaches that separated the calculation of safety metrics from trajectory repairing, which resulted in unsuccessful plans for evasive maneuvers, our work has the anytime capability to provide both a Feasible Time-to-React and an evasive trajectory for further execution.
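The trajectory-repairing abstract above mentions a binary search that iteratively creates repaired trajectories while computing a Feasible Time-to-React. A minimal sketch of that idea follows, assuming (this is not stated in the abstract) that feasibility is monotone: if an evasive maneuver can still be started at step k, it can also be started at any earlier step. The names `find_latest_repair_step`, `try_repair`, and `toy_repair` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Trajectory = List[float]  # placeholder trajectory type for this sketch


def find_latest_repair_step(
    horizon_steps: int,
    try_repair: Callable[[int], Optional[Trajectory]],
) -> Tuple[Optional[int], Optional[Trajectory]]:
    """Binary-search the latest step at which a repair is still feasible.

    try_repair(k) returns an evasive trajectory branching off the original
    plan at step k, or None if no feasible repair exists from that step.
    The best feasible result found so far is always retained, so the search
    can be interrupted early and still hand back a usable trajectory.
    """
    lo, hi = 0, horizon_steps - 1
    best_step, best_traj = None, None
    while lo <= hi:
        mid = (lo + hi) // 2
        traj = try_repair(mid)
        if traj is not None:
            best_step, best_traj = mid, traj  # feasible: try reacting later
            lo = mid + 1
        else:
            hi = mid - 1                      # infeasible: must react earlier
    return best_step, best_traj


def toy_repair(step: int) -> Optional[Trajectory]:
    # Hypothetical stand-in for a planner: evasion only works up to step 6.
    return [float(step), float(step) + 1.0] if step <= 6 else None


if __name__ == "__main__":
    print(find_latest_repair_step(10, toy_repair))
```

Keeping more of the original plan corresponds to a later branching step, which is why the search moves right whenever a repair succeeds.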
OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space Model
Air-ground robots (AGRs) are widely used in surveillance and disaster response due to their exceptional mobility and versatility (i.e., flying and driving). Current AGR navigation systems perform well in static occlusion-prone environments (e.g., indoors) by using 3D semantic occupancy networks to predict occlusions for complete local mapping and then computing Euclidean Signed Distance Field (ESDF) for path planning. However, these systems face challenges in dynamic, severe occlusion scenes (e.g., crowds) due to limitations in perception networks' low prediction accuracy and path planners' high computation overhead. In this paper, we propose OMEGA, which contains OccMamba with an Efficient AGR-Planner to address the above-mentioned problems. OccMamba adopts a novel architecture that separates semantic and occupancy prediction into independent branches, incorporating two mamba blocks within these branches. These blocks efficiently extract semantic and geometric features in 3D environments with linear complexity, ensuring that the network can learn long-distance dependencies to improve prediction accuracy. Semantic and geometric features are combined within the Bird's Eye View (BEV) space to minimise computational overhead during feature fusion. The resulting semantic occupancy map is then seamlessly integrated into the local map, providing occlusion awareness of the dynamic environment. Our AGR-Planner utilizes this local map and employs kinodynamic A* search and gradient-based trajectory optimization to guarantee planning is ESDF-free and energy-efficient. Extensive experiments demonstrate that OccMamba outperforms the state-of-the-art 3D semantic occupancy network with 25.0% mIoU. End-to-end navigation experiments in dynamic scenes verify OMEGA's efficiency, achieving a 96% average planning success rate. Code and video are available at https://jmwang0117.github.io/OMEGA/.
comment: OccMamba is Coming!
Fast Collective Evasion in Self-Localized Swarms of Unmanned Aerial Vehicles
A novel approach for achieving fast evasion in self-localized swarms of Unmanned Aerial Vehicles (UAVs) threatened by an intruding moving object is presented in this paper. Motivated by natural self-organizing systems, the presented approach of fast and collective evasion enables the UAV swarm to avoid dynamic objects (interferers) that are actively approaching the group. The main objective of the proposed technique is the fast and safe escape of the swarm from an interferer discovered in proximity. This method is inspired by the collective behavior of groups of certain animals, such as schools of fish or flocks of birds. These animals use the limited information of their sensing organs and decentralized control to achieve reliable and effective group motion. The system presented in this paper is intended to execute the safe coordination of UAV swarms with a large number of agents. Similar to natural swarms, this system propagates a fast shock of information about detected interferers throughout the group to achieve dynamic and collective evasion. The proposed system is fully decentralized using only onboard sensors to mutually localize swarm agents and interferers, similar to how animals accomplish this behavior. As a result, the communication structure between swarm agents is not overwhelmed by information about the state (position and velocity) of each individual and it is robust to communication dropouts. The proposed system and theory were numerically evaluated and verified in real-world experiments.
Bidirectional Intent Communication: A Role for Large Foundation Models
Integrating multimodal foundation models has significantly enhanced autonomous agents' language comprehension, perception, and planning capabilities. However, while existing works adopt a task-centric approach with minimal human interaction, applying these models to developing assistive user-centric robots that can interact and cooperate with humans remains underexplored. This paper introduces "Bident", a framework designed to integrate robots seamlessly into shared spaces with humans. Bident enhances the interactive experience by incorporating multimodal inputs like speech and user gaze dynamics. Furthermore, Bident supports verbal utterances and physical actions like gestures, making it versatile for bidirectional human-robot interactions. Potential applications include personalized education, where robots can adapt to individual learning styles and paces, and healthcare, where robots can offer personalized support, companionship, and everyday assistance in the home and workplace environments.
comment: 2024 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), Workshop: Large Language Models in the RoMan Age
A Passivity-Based Variable Impedance Controller for Incremental Learning of Periodic Interactive Tasks
In intelligent manufacturing, robots are asked to dynamically adapt their behaviours without reducing productivity. Human teaching, where an operator physically interacts with the robot to demonstrate a new task, is a promising strategy to quickly and intuitively reconfigure the production line. However, physical guidance during task execution poses challenges in terms of both operator safety and system usability. In this paper, we solve this issue by designing a variable impedance control strategy that regulates the interaction with the environment and the physical demonstrations, explicitly preventing at the same time passivity violations. We derive constraints to limit not only the exchanged energy with the environment but also the exchanged power, resulting in smoother interactions. By monitoring the energy flow between the robot and the environment, we are able to distinguish between disturbances (to be rejected) and physical guidance (to be accomplished), enabling smooth and controlled transitions from teaching to execution and vice versa. The effectiveness of the proposed approach is validated in wiping tasks with a real robotic manipulator.
comment: Accepted at the 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE)
Where to Fetch: Extracting Visual Scene Representation from Large Pre-Trained Models for Robotic Goal Navigation
To complete a complex task where a robot navigates to a goal object and fetches it, the robot needs to have a good understanding of the instructions and the surrounding environment. Large pre-trained models have shown capabilities to interpret tasks defined via language descriptions. However, previous methods attempting to integrate large pre-trained models with daily tasks are not competent in many robotic goal navigation tasks due to poor understanding of the environment. In this work, we present a visual scene representation built with large-scale visual language models to form a feature representation of the environment capable of handling natural language queries. Combined with large language models, this method can parse language instructions into action sequences for a robot to follow, and accomplish goal navigation with querying the scene representation. Experiments demonstrate that our method enables the robot to follow a wide range of instructions and complete complex goal navigation tasks.
Navigating Dimensionality through State Machines in Automotive System Validation
The increasing automation of vehicles is resulting in the integration of more extensive in-vehicle sensor systems, electronic control units, and software. Additionally, vehicle-to-everything communication is seen as an opportunity to extend automated driving capabilities through information from a source outside the ego vehicle. However, the validation and verification of automated driving functions already pose a challenge due to the number of possible scenarios that can occur for a driving function, which makes it difficult to achieve comprehensive test coverage. Currently, the establishment of Safety Of The Intended Functionality (SOTIF) mandates the implementation of scenario-based testing. The introduction of additional external systems through vehicle-to-everything further complicates the problem and increases the scenario space. In this paper, a methodology based on state charts is proposed for modeling the interaction with external systems, which may remain as black boxes. This approach leverages the testability and coverage analysis inherent in state charts by combining them with scenario-based testing. The overall objective is to reduce the space of scenarios necessary for testing a networked driving function and to streamline validation and verification. The utilization of this approach is demonstrated using a simulated signalized intersection with a roadside unit that detects vulnerable road users.
comment: 10 pages, 5 figures, 2 figures in Appendix
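The state-chart abstract above combines state charts with scenario-based testing so that coverage can be measured. The sketch below shows one simple way such a measurement could look: a flat transition table and a transition-coverage ratio over a set of event scenarios. The states, events, and function names are invented for illustration (loosely themed on a roadside unit reporting vulnerable road users) and do not come from the paper.

```python
# Hypothetical flat state chart: (state, event) -> next state.
TRANSITIONS = {
    ("idle", "vru_detected"): "warning",
    ("warning", "vru_cleared"): "idle",
    ("warning", "timeout"): "degraded",
    ("degraded", "reset"): "idle",
}


def run_scenario(events, start="idle"):
    """Replay one scenario; return the final state and transitions taken."""
    state, covered = start, set()
    for event in events:
        key = (state, event)
        if key in TRANSITIONS:  # events with no matching transition are ignored
            covered.add(key)
            state = TRANSITIONS[key]
    return state, covered


def transition_coverage(scenarios):
    """Fraction of modeled transitions exercised by at least one scenario."""
    covered = set()
    for events in scenarios:
        covered |= run_scenario(events)[1]
    return len(covered) / len(TRANSITIONS)


if __name__ == "__main__":
    scenarios = [
        ["vru_detected", "vru_cleared"],
        ["vru_detected", "timeout", "reset"],
    ]
    print(transition_coverage(scenarios))
```

Under this toy model, scenarios that add no new transitions leave the ratio unchanged, which is one way a state chart can help prune a scenario set.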
Constrained Behavior Cloning for Robotic Learning
Behavior cloning (BC) is a popular supervised imitation learning method in the robotics and autonomous driving communities, among others, wherein complex skills can be learned by direct imitation from expert demonstrations. Despite its rapid development, it is still affected by a limited field of view, where the accumulation of sensor and joint noise brings compounding errors. In this paper, we introduce geometrically and historically constrained behavior cloning (GHCBC), which, inspired by neuroscientists, primarily considers high-level state information. The geometrically constrained behavior cloning geometrically constrains the predicted poses, and the historically constrained behavior cloning temporally constrains action sequences. The synergy between these two types of constraints enhances BC performance in terms of robustness and stability. Comprehensive experimental results show that success rates improve by 29.73% in simulation and 39.4% in real-robot experiments on average, compared with the state-of-the-art BC method, especially in long-term operational scenes, indicating the great potential of GHCBC for robotic learning.
Kalib: Markerless Hand-Eye Calibration with Keypoint Tracking
Hand-eye calibration involves estimating the transformation between the camera and the robot. Traditional methods rely on fiducial markers, involving much manual labor and careful setup. Recent advancements in deep learning offer markerless techniques, but they present challenges, including the need for retraining networks for each robot, the requirement of accurate mesh models for data generation, and the need to address the sim-to-real gap. In this letter, we propose Kalib, an automatic and universal markerless hand-eye calibration pipeline that leverages the generalizability of visual foundation models to eliminate these barriers. In each calibration process, Kalib uses keypoint tracking and proprioceptive sensors to estimate the transformation between a robot's coordinate space and its corresponding points in camera space. Our method does not require training new networks or access to mesh models. Through evaluations in simulation environments and the real-world dataset DROID, Kalib demonstrates superior accuracy compared to recent baseline methods. This approach provides an effective and flexible calibration process for various robot systems by simplifying setup and removing dependency on precise physical markers.
comment: The code and supplementary materials are available at https://sites.google.com/view/hand-eye-kalib
Leveraging Temporal Contexts to Enhance Vehicle-Infrastructure Cooperative Perception SC 2024
Infrastructure sensors installed at elevated positions offer a broader perception range and encounter fewer occlusions. Integrating both infrastructure and ego-vehicle data through V2X communication, known as vehicle-infrastructure cooperation, has shown considerable advantages in enhancing perception capabilities and addressing corner cases encountered in single-vehicle autonomous driving. However, cooperative perception still faces numerous challenges, including limited communication bandwidth and practical communication interruptions. In this paper, we propose CTCE, a novel framework for cooperative 3D object detection. This framework transmits queries with temporal contexts enhancement, effectively balancing transmission efficiency and performance to accommodate real-world communication conditions. Additionally, we propose a temporal-guided fusion module to further improve performance. The roadside temporal enhancement and vehicle-side spatial-temporal fusion together constitute a multi-level temporal contexts integration mechanism, fully leveraging temporal information to enhance performance. Furthermore, a motion-aware reconstruction module is introduced to recover lost roadside queries due to communication interruptions. Experimental results on V2X-Seq and V2X-Sim datasets demonstrate that CTCE outperforms the baseline QUEST, achieving improvements of 3.8% and 1.3% in mAP, respectively. Experiments under communication interruption conditions validate CTCE's robustness to communication interruptions.
comment: Accepted by IEEE ITSC 2024
MPGNet: Learning Move-Push-Grasping Synergy for Target-Oriented Grasping in Occluded Scenes IROS 2024
This paper focuses on target-oriented grasping in occluded scenes, where the target object is specified by a binary mask and the goal is to grasp the target object with as few robotic manipulations as possible. Most existing methods rely on a push-grasping synergy to complete this task. To deliver a more powerful target-oriented grasping pipeline, we present MPGNet, a three-branch network for learning a synergy between moving, pushing, and grasping actions. We also propose a multi-stage training strategy to train the MPGNet which contains three policy networks corresponding to the three actions. The effectiveness of our method is demonstrated via both simulated and real-world experiments.
comment: Accepted to IROS 2024
Inverse Design of Snap-Actuated Jumping Robots Powered by Mechanics-Aided Machine Learning
Exploring the design and control strategies of soft robots through simulation is highly attractive due to its cost-effectiveness. Although many existing models (e.g., finite element analysis) are effective for simulating soft robotic dynamics, there remains a need for a general and efficient numerical simulation approach in the soft robotics community. In this paper, we develop a discrete differential geometry-based numerical framework to achieve the model-based inverse design of a novel snap-actuated jumping robot. We find that the dynamic process of a snapping beam can be either symmetric or asymmetric, so that the trajectory of the jumping robot is tunable (e.g., horizontal or vertical). By employing this novel mechanism of the bistable beam as the robotic actuator, we next propose a physics-data hybrid inverse design strategy for the snap-jump robot with a broad spectrum of jumping capabilities. We first use the physics engine to study the influence of the robot's design parameters on its jumping capabilities, then generate extensive simulation data to formulate a data-driven inverse design solution. The inverse design solution can rapidly explore the combination of design parameters for achieving a target jump, which provides valuable guidance for the fabrication and control of the jumping robot. The proposed methodology paves the way for exploring the design and control insights of soft robots with the help of simulations.
comment: 8 pages, 6 figures
Newton-Raphson Flow for Aggressive Quadrotor Tracking Control
We apply the Newton-Raphson flow tracking controller to aggressive quadrotor flight and demonstrate that it achieves good tracking performance over a suite of benchmark trajectories, beating the native trajectory tracking controller in the popular PX4 Autopilot. The Newton-Raphson flow tracking controller is a recently proposed integrator-type controller that aims to drive to zero the error between a future predicted system output and the reference trajectory. This controller is computationally lightweight, requiring only an imprecise predictor, and achieves guaranteed asymptotic error bounds under certain conditions. We show that these theoretical advantages are realizable on a quadrotor hardware platform. Our experiments are conducted on a Holybro X500 V2 quadrotor using a Pixhawk 6X flight controller and a Raspberry Pi 4 companion computer, which receives location information from an OptiTrack motion capture system and sends input commands through the ROS2 API for the PX4 software stack.
comment: Expanded version of our submission to the American Control Conference 2024
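To make the idea of an integrator-type controller that drives a predicted output error to zero more concrete, here is a minimal scalar sketch. It is not the authors' quadrotor implementation: the plant `x' = -x + u`, the one-step Euler predictor, and the values of `alpha`, `horizon`, and `dt` are all assumptions chosen only for illustration.

```python
def simulate_nr_flow(reference, alpha=5.0, horizon=0.5, dt=0.001, t_end=5.0):
    """Track `reference(t)` on the toy plant x' = -x + u (an assumed example plant).

    The control input is integrated over time:
        u' = alpha * (dg/du)^-1 * (r(t + T) - g(x, u))
    where g(x, u) is a deliberately crude prediction of the output at t + T.
    """
    x, u, t = 0.0, 0.0, 0.0
    while t < t_end:
        # Imprecise predictor: a single Euler step of length `horizon`.
        predicted = x + horizon * (-x + u)
        dg_du = horizon  # derivative of the predictor with respect to u
        # Integrator-type control law acting on the predicted tracking error.
        u += dt * alpha * (reference(t + horizon) - predicted) / dg_du
        # Advance the plant with explicit Euler integration.
        x += dt * (-x + u)
        t += dt
    return x, u


if __name__ == "__main__":
    final_x, final_u = simulate_nr_flow(lambda t: 1.0)
    print(final_x, final_u)
```

For a constant reference, this toy closed loop is linear and stable, so the output settles at the reference even though the predictor is only a rough one-step approximation.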
Optimization of Multi-Agent Flying Sidekick Traveling Salesman Problem over Road Networks
Mixed truck-drone delivery systems have attracted increasing attention for last-mile logistics, but real-world complexities demand a shift from single-agent, fully connected graph models to multi-agent systems operating on actual road networks. We introduce the multi-agent flying sidekick traveling salesman problem (MA-FSTSP) on road networks, extending the single truck-drone model to multiple trucks, each carrying multiple drones, while considering full road networks for truck restrictions and flexible drone routes. We propose a mixed-integer linear programming model and an efficient three-phase heuristic algorithm for this NP-hard problem. Our approach first decomposes MA-FSTSP into manageable subproblems of one truck with multiple drones. It then computes the routes for trucks without drones in each subproblem, which are used in the final phase as heuristics to help optimize drone and truck routes simultaneously. Extensive numerical experiments on Manhattan and Boston road networks demonstrate our algorithm's superior effectiveness and efficiency, significantly outperforming both column generation and variable neighborhood search baselines in solution quality and computation time. Notably, our approach scales to more than 300 customers within a 5-minute time limit, showcasing its potential for large-scale, real-world logistics applications.
Range-based Multi-Robot Integrity Monitoring Against Cyberattacks and Faults: An Anchor-Free Approach
Coordination of multi-robot systems (MRSs) relies on efficient sensing and reliable communication among the robots. However, the sensors and communication channels of these robots are often vulnerable to cyberattacks and faults, which can disrupt their individual behavior and the overall objective of the MRS. In this work, we present a multi-robot integrity monitoring framework that utilizes inter-robot range measurements to (i) detect the presence of cyberattacks or faults affecting the MRS, (ii) identify the affected robot(s), and (iii) reconstruct the resulting localization error of these robot(s). The proposed iterative algorithm leverages sequential convex programming and the alternating direction method of multipliers to enable real-time and distributed implementation. Our approach is validated using numerical simulations and demonstrated using PX4-SiTL in Gazebo on an MRS, where certain agents deviate from their desired position due to a GNSS spoofing attack. Furthermore, we demonstrate the scalability and interoperability of our algorithm through mixed-reality experiments by forming a heterogeneous MRS comprising real Crazyflie UAVs and virtual PX4-SiTL UAVs working in tandem.
comment: 8 pages, 7 figures
Target-Oriented Object Grasping via Multimodal Human Guidance ECCV 2024
In the context of human-robot interaction and collaboration scenarios, robotic grasping still encounters numerous challenges. Traditional grasp detection methods generally analyze the entire scene to predict grasps, leading to redundancy and inefficiency. In this work, we reconsider 6-DoF grasp detection from a target-referenced perspective and propose a Target-Oriented Grasp Network (TOGNet). TOGNet specifically targets local, object-agnostic region patches to predict grasps more efficiently. It integrates seamlessly with multimodal human guidance, including language instructions, pointing gestures, and interactive clicks. Thus our system comprises two primary functional modules: a guidance module that identifies the target object in 3D space and TOGNet, which detects region-focal 6-DoF grasps around the target, facilitating subsequent motion planning. Through 50 target-grasping simulation experiments in cluttered scenes, our system achieves a success rate improvement of about 13.7%. In real-world experiments, we demonstrate that our method excels in various target-oriented grasping scenarios.
comment: Accepted by ECCV 2024 Workshop on Assistive Computer Vision and Robotics (ACVR 2024)
LoopSplat: Loop Closure by Registering 3D Gaussian Splats
Simultaneous Localization and Mapping (SLAM) based on 3D Gaussian Splats (3DGS) has recently shown promise towards more accurate, dense 3D scene maps. However, existing 3DGS-based methods fail to address the global consistency of the scene via loop closure and/or global bundle adjustment. To this end, we propose LoopSplat, which takes RGB-D images as input and performs dense mapping with 3DGS submaps and frame-to-model tracking. LoopSplat triggers loop closure online and computes relative loop edge constraints between submaps directly via 3DGS registration, leading to improvements in efficiency and accuracy over traditional global-to-local point cloud registration. It uses a robust pose graph optimization formulation and rigidly aligns the submaps to achieve global consistency. Evaluation on the synthetic Replica and real-world TUM-RGBD, ScanNet, and ScanNet++ datasets demonstrates competitive or superior tracking, mapping, and rendering compared to existing methods for dense RGB-D SLAM. Code is available at loopsplat.github.io.
comment: Project page: https://loopsplat.github.io/
Recursive Model-agnostic Inverse Dynamics of Serial Soft-Rigid Robots
Robotics is shifting from rigid, articulated systems to more sophisticated and heterogeneous mechanical structures. Soft robots, for example, have continuously deformable elements capable of large deformations. The flourishing of control techniques developed for this class of systems is fueling the need for efficient procedures for evaluating their inverse dynamics (ID), which is challenging due to the complex and mixed nature of these systems. As of today, no single ID algorithm can describe the behavior of generic (combinations of) models of soft robots. We address this challenge for generic series-like interconnections of possibly soft structures that may require heterogeneous modeling techniques. Our proposed algorithm requires as input a purely geometric description (forward-kinematics-like) of the mapping from configuration space to deformation space. With this information only, the complete equations of motion can be given an exact recursive structure which is essentially independent of (or 'agnostic' to) the underlying reduced-order kinematic modeling techniques. We achieve this goal by exploiting Kane's method to manipulate the equations of motion and then showing their recursive structure. The resulting ID algorithms have optimal computational complexity within the proposed setting, i.e., linear in the number of distinct modules. Further, a variation of the algorithm is introduced that can evaluate the generalized mass matrix without increasing computation costs. We showcase the applicability of this method to robot models involving a mixture of rigid and soft elements, described via possibly heterogeneous reduced order models (ROMs), such as Volumetric FEM, Cosserat strain-based, and volume-preserving deformation primitives. None of these systems can be handled using existing ID techniques.
Learning Realistic Joint Space Boundaries for Range of Motion Analysis of Healthy and Impaired Human Arms
A realistic human kinematic model that satisfies anatomical constraints is essential for human-robot interaction, biomechanics and robot-assisted rehabilitation. Modeling realistic joint constraints, however, is challenging as human arm motion is constrained by joint limits, inter- and intra-joint dependencies, self-collisions, individual capabilities and muscular or neurological constraints which are difficult to represent. Hence, physicians and researchers have relied on simple box-constraints, ignoring important anatomical factors. In this paper, we propose a data-driven method to learn realistic anatomically constrained upper-limb range of motion (RoM) boundaries from motion capture data. This is achieved by fitting a one-class support vector machine to a dataset of upper-limb joint space exploration motions with an efficient hyper-parameter tuning scheme. Our approach outperforms similar works focused on valid RoM learning. Further, we propose an impairment index (II) metric that offers a quantitative assessment of capability/impairment when comparing healthy and impaired arms. We validate the metric on healthy subjects physically constrained to emulate hemiplegia and different disability levels as stroke patients.
Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation
LLM-based agents have demonstrated impressive zero-shot performance in vision-language navigation (VLN) tasks. However, existing LLM-based methods often focus only on solving high-level task planning by selecting nodes in predefined navigation graphs for movements, overlooking low-level control in navigation scenarios. To bridge this gap, we propose AO-Planner, a novel Affordances-Oriented Planner for continuous VLN tasks. Our AO-Planner integrates various foundation models to achieve affordances-oriented low-level motion planning and high-level decision-making, both performed in a zero-shot setting. Specifically, we employ a Visual Affordances Prompting (VAP) approach, where the visible ground is segmented by SAM to provide navigational affordances, based on which the LLM selects potential candidate waypoints and plans low-level paths towards selected waypoints. We further propose a high-level PathAgent which marks planned paths into the image input and reasons about the most probable path by comprehending all environmental information. Finally, we convert the selected path into 3D coordinates using camera intrinsic parameters and depth information, avoiding challenging 3D predictions for LLMs. Experiments on the challenging R2R-CE and RxR-CE datasets show that AO-Planner achieves state-of-the-art zero-shot performance (8.8% improvement on SPL). Our method can also serve as a data annotator to obtain pseudo-labels, distilling its waypoint prediction ability into a learning-based predictor. This new predictor does not require any waypoint data from the simulator and achieves 47% SR, competing with supervised methods. We establish an effective connection between the LLM and the 3D world, presenting novel prospects for employing foundation models in low-level motion control.
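The last step of the pipeline above, turning a path selected in the image into 3D coordinates from camera intrinsics and depth, can be illustrated with standard pinhole back-projection. This is a minimal sketch, not AO-Planner's code: the function names and the intrinsic values in the usage example are hypothetical, and lens distortion is assumed to be absent.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def backproject_pixel(u: float, v: float, depth: float,
                      fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Map pixel (u, v) with metric depth to a 3D point in the camera frame.

    Uses the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy, z = depth.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def backproject_path(pixels: Sequence[Tuple[float, float]],
                     depths: Sequence[float],
                     fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Back-project every pixel of an image-space path, one depth value per pixel."""
    if len(pixels) != len(depths):
        raise ValueError("pixels and depths must have the same length")
    return [backproject_pixel(u, v, d, fx, fy, cx, cy)
            for (u, v), d in zip(pixels, depths)]


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 image.
    path_3d = backproject_path([(320, 240), (420, 240)], [2.0, 2.0],
                               fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    print(path_3d)
```

Because the language model only has to pick pixels, all metric reasoning stays in this closed-form geometric step.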
Enabling the Deployment of Any-Scale Robotic Applications in Microservice Architectures through Automated Containerization ICRA
In an increasingly automated world -- from warehouse robots to self-driving cars -- streamlining the development, deployment, and operation of robotic applications becomes ever more important. Automated DevOps processes and microservice architectures have already proven successful in other domains such as large-scale customer-oriented web services (e.g., Netflix). We recommend employing similar microservice architectures for the deployment of small- to large-scale robotic applications in order to accelerate development cycles, loosen functional dependence, and improve resiliency and elasticity. To facilitate the DevOps processes involved, we present and release a tooling suite for automating the development of microservices for robotic applications based on the Robot Operating System (ROS). Our tooling suite covers the automated minimal containerization of ROS applications, a collection of useful machine learning-enabled base container images, as well as a CLI tool for simplified interaction with container images during the development phase. Within the scope of this paper, we embed our tooling suite into the overall context of streamlined robotics deployment and compare it to alternative solutions. We release our tools as open-source software at https://github.com/ika-rwth-aachen/dorotos.
comment: 6 pages; Submitted to IEEE International Conference on Robotics and Automation (ICRA) 2024
Topology-Guided ORCA: Smooth Multi-Agent Motion Planning in Constrained Environments
We present Topology-Guided ORCA as an alternative simulator to replace ORCA for planning smooth multi-agent motions in environments with static obstacles. Despite its impressive performance in simulating multi-agent crowd motion in free space, ORCA encounters a significant challenge in navigating agents in the presence of static obstacles. ORCA ignores static obstacles until an agent gets too close to one, and the agent will get stuck if the obstacle blocks its path toward the goal. To address this challenge, Topology-Guided ORCA constructs a graph to represent the topology of the traversable region of the environment. We use a path planner to plan a path of waypoints that connects each agent's start and goal positions. The waypoints are used as a sequence of goals to guide ORCA. Crowd simulation experiments in constrained environments show that our method outperforms ORCA in generating smooth and natural motions of multiple agents, which indicates the great potential of Topology-Guided ORCA as an effective simulator for training constrained social navigation policies.
comment: Accepted by Unsolved Problems in Social Robot Navigation workshop in conjunction with RSS 2024
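The waypoint-guidance idea described above (plan a path over a graph of the traversable region, then feed the waypoints to the local planner as successive goals) can be sketched as follows. This is only an illustration under stated assumptions: the traversable region is a hypothetical 4-connected occupancy grid, breadth-first search stands in for whatever path planner the authors use, and the ORCA step itself is not reproduced.

```python
from collections import deque


def plan_waypoints(grid, start, goal):
    """Breadth-first search over free cells ('.') of a 4-connected grid.

    Returns the list of (row, col) waypoints from start to goal, or None if
    the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


def advance_goal(waypoints, index, position, reach_radius=0.5):
    """Move to the next waypoint once the agent is within reach_radius of the current one."""
    wr, wc = waypoints[index]
    reached = (position[0] - wr) ** 2 + (position[1] - wc) ** 2 <= reach_radius ** 2
    if reached and index < len(waypoints) - 1:
        return index + 1
    return index


if __name__ == "__main__":
    grid = ["....",
            ".##.",
            "...."]
    print(plan_waypoints(grid, (1, 0), (1, 3)))
```

In a full simulator, the waypoint at the current index would be handed to the local collision-avoidance step as the agent's preferred goal at every tick, so the agent is steered around the obstacle instead of into it.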
Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations
Learning from Demonstrations, the field that proposes to learn robot behavior models from data, is gaining popularity with the emergence of deep generative models. Although the problem has been studied for years under names such as Imitation Learning, Behavioral Cloning, or Inverse Reinforcement Learning, classical methods have relied on models that don't capture complex data distributions well or don't scale well to large numbers of demonstrations. In recent years, the robot learning community has shown increasing interest in using deep generative models to capture the complexity of large datasets. In this survey, we aim to provide a unified and comprehensive review of the last year's progress in the use of deep generative models in robotics. We present the different types of models that the community has explored, such as energy-based models, diffusion models, action value maps, or generative adversarial networks. We also present the different types of applications in which deep generative models have been used, from grasp generation to trajectory generation or cost learning. One of the most important elements of generative models is the generalization out of distributions. In our survey, we review the different decisions the community has made to improve the generalization of the learned models. Finally, we highlight the research challenges and propose a number of future directions for learning deep generative models in robotics.
comment: 20 pages, 11 figures, submitted to TRO
DynaPix SLAM: A Pixel-Based Dynamic Visual SLAM Approach
Visual Simultaneous Localization and Mapping (V-SLAM) methods achieve remarkable performance in static environments, but face challenges in dynamic scenes where moving objects severely affect their core modules. To avoid this, dynamic V-SLAM approaches often leverage semantic information, geometric constraints, or optical flow. However, these methods are limited by imprecise estimations and their reliance on the accuracy of deep-learning models. Moreover, predefined thresholds for static/dynamic classification, the a-priori selection of dynamic object classes, and the inability to recognize unknown or unexpected moving objects, often degrade their performance. To address these limitations, we introduce DynaPix, a novel semantic-free V-SLAM system based on per-pixel motion probability estimation and an improved pose optimization process. The per-pixel motion probability is estimated using a static background differencing method on image data and optical flows computed on splatted frames. With DynaPix, we fully integrate these probabilities into map point selection and apply them through weighted bundle adjustment within the tracking and optimization modules of ORB-SLAM2. We thoroughly evaluate our method using the GRADE and TUM RGB-D datasets, showing significantly lower trajectory errors and longer tracking times in both static and dynamic sequences. The source code, datasets, and results are available at https://dynapix.is.tue.mpg.de/.
comment: Chenghao Xu and Elia Bonetto contributed equally to this work as first authors. 19 pages, 4 tables, 6 figures. Includes supplementary material
S3E: A Multi-Robot Multimodal Dataset for Collaborative SLAM
The burgeoning demand for collaborative robotic systems to execute complex tasks collectively has intensified the research community's focus on advancing simultaneous localization and mapping (SLAM) in a cooperative context. Despite this interest, the scalability and diversity of existing datasets for collaborative trajectories remain limited, especially in scenarios with constrained perspectives where the generalization capabilities of Collaborative SLAM (C-SLAM) are critical for the feasibility of multi-agent missions. Addressing this gap, we introduce S3E, an expansive multimodal dataset. Captured by a fleet of unmanned ground vehicles traversing four distinct collaborative trajectory paradigms, S3E encompasses 13 outdoor and 5 indoor sequences. These sequences feature meticulously synchronized and spatially calibrated data streams, including 360-degree LiDAR point cloud, high-resolution stereo imagery, high-frequency inertial measurement units (IMU), and Ultra-wideband (UWB) relative observations. Our dataset not only surpasses previous efforts in scale, scene diversity, and data intricacy but also provides a thorough analysis and benchmarks for both collaborative and individual SLAM methodologies. For access to the dataset and the latest information, please visit our repository at https://pengyu-team.github.io/S3E.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
D$^3$FlowSLAM: Self-Supervised Dynamic SLAM with Flow Motion Decomposition and DINO Guidance
In this paper, we introduce a self-supervised deep SLAM method that robustly operates in dynamic scenes while accurately identifying dynamic components. Our method leverages a dual-flow representation for static flow and dynamic flow, facilitating effective scene decomposition in dynamic environments. We propose a dynamic update module based on this representation and develop a dense SLAM system that excels in dynamic scenarios. In addition, we design a self-supervised training scheme using DINO as a prior, enabling label-free training. Our method achieves superior accuracy compared to other self-supervised methods. It also matches or even surpasses the performance of existing supervised methods in some cases. All code and data will be made publicly available upon acceptance.
comment: Homepage: https://zju3dv.github.io/deflowslam
ATI-CTLO: Adaptive Temporal Interval-based Continuous-Time LiDAR-Only Odometry
The motion distortion in LiDAR scans caused by aggressive robot motion and varying terrain features significantly impacts the positioning and mapping performance of 3D LiDAR odometry. Existing distortion correction solutions often struggle to balance computational complexity and accuracy. In this work, we propose an Adaptive Temporal Interval-based Continuous-Time LiDAR-only Odometry, utilizing straightforward and efficient linear interpolation. Our method flexibly adjusts the temporal intervals between control nodes according to the dynamics of motion and environmental characteristics. This adaptability enhances performance across various motion states and improves robustness in challenging, feature-sparse environments. We validate the effectiveness of our method on multiple datasets across different platforms, achieving accuracy comparable to state-of-the-art LiDAR-only odometry methods. Notably, in scenarios involving aggressive motion and sparse features, our method outperforms existing solutions.
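As a rough illustration of continuous-time correction with linear interpolation between control nodes, the sketch below interpolates a pose at each point's timestamp and uses it to place the point in a common frame. It is a simplified planar version (x, y, yaw) written for this digest, not the ATI-CTLO implementation; the node layout and function names are assumptions, and the adaptive choice of node spacing is not modelled.

```python
import math
from bisect import bisect_right


def interpolate_pose(nodes, t):
    """Linearly interpolate a planar pose at time t.

    `nodes` is a list of (time, x, y, yaw) control nodes with strictly
    increasing times and at least two entries.
    """
    times = [n[0] for n in nodes]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the control-node time span")
    i = min(bisect_right(times, t), len(nodes) - 1)
    t0, x0, y0, yaw0 = nodes[i - 1]
    t1, x1, y1, yaw1 = nodes[i]
    a = (t - t0) / (t1 - t0)
    # Interpolate yaw along the shortest angular difference.
    dyaw = math.atan2(math.sin(yaw1 - yaw0), math.cos(yaw1 - yaw0))
    return (x0 + a * (x1 - x0), y0 + a * (y1 - y0), yaw0 + a * dyaw)


def deskew_point(nodes, t, point):
    """Transform a point measured in the sensor frame at time t into the world frame."""
    x, y, yaw = interpolate_pose(nodes, t)
    px, py = point
    return (x + math.cos(yaw) * px - math.sin(yaw) * py,
            y + math.sin(yaw) * px + math.cos(yaw) * py)


if __name__ == "__main__":
    control_nodes = [(0.0, 0.0, 0.0, 0.0), (0.1, 1.0, 0.0, 0.2)]
    print(interpolate_pose(control_nodes, 0.05))
    print(deskew_point(control_nodes, 0.05, (2.0, 0.0)))
```

Placing nodes closer together makes the piecewise-linear trajectory follow fast motion more faithfully at a higher optimization cost, which is the trade-off an adaptive interval is meant to balance.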
Multi-Robot Object SLAM Using Distributed Variational Inference
Multi-robot simultaneous localization and mapping (SLAM) enables a robot team to achieve coordinated tasks by relying on a common map of the environment. Constructing a map by centralized processing of the robot observations is undesirable because it creates a single point of failure and requires pre-existing infrastructure and significant communication throughput. This paper formulates multi-robot object SLAM as a variational inference problem over a communication graph subject to consensus constraints on the object estimates maintained by different robots. To solve the problem, we develop a distributed mirror descent algorithm with regularization enforcing consensus among the communicating robots. Using Gaussian distributions in the algorithm, we also derive a distributed multi-state constraint Kalman filter (MSCKF) for multi-robot object SLAM. Experiments on real and simulated data show that our method improves the trajectory and object estimates, compared to individual-robot SLAM, while achieving better scaling to large robot teams, compared to centralized multi-robot SLAM.
ComTraQ-MPC: Meta-Trained DQN-MPC Integration for Trajectory Tracking with Limited Active Localization Updates
Optimal decision-making for trajectory tracking in partially observable, stochastic environments where the number of active localization updates -- the process by which the agent obtains its true state information from the sensors -- is limited presents a significant challenge. Traditional methods often struggle to balance resource conservation, accurate state estimation and precise tracking, resulting in suboptimal performance. This problem is particularly pronounced in environments with large action spaces, where the need for frequent, accurate state data is paramount, yet the capacity for active localization updates is restricted by external limitations. This paper introduces ComTraQ-MPC, a novel framework that combines Deep Q-Networks (DQN) and Model Predictive Control (MPC) to optimize trajectory tracking with constrained active localization updates. The meta-trained DQN ensures adaptive active localization scheduling, while the MPC leverages available state information to improve tracking. The central contribution of this work is their reciprocal interaction: DQN's update decisions inform MPC's control strategy, and MPC's outcomes refine DQN's learning, creating a cohesive, adaptive system. Empirical evaluations in simulated and real-world settings demonstrate that ComTraQ-MPC significantly enhances operational efficiency and accuracy, providing a generalizable and approximately optimal solution for trajectory tracking in complex partially observable environments.
comment: * Equal contribution
Multiagent Systems
Athena: Safe Autonomous Agents with Verbal Contrastive Learning
Due to emergent capabilities, large language models (LLMs) have been utilized as language-based agents to perform a variety of tasks and make decisions with an increasing degree of autonomy. These autonomous agents can understand high-level instructions, interact with their environments, and execute complex tasks using a selection of tools available to them. As the capabilities of the agents expand, ensuring their safety and trustworthiness becomes more imperative. In this study, we introduce the Athena framework which leverages the concept of verbal contrastive learning where past safe and unsafe trajectories are used as in-context (contrastive) examples to guide the agent towards safety while fulfilling a given task. The framework also incorporates a critiquing mechanism to guide the agent to prevent risky actions at every step. Furthermore, due to the lack of existing benchmarks on the safety reasoning ability of LLM-based agents, we curate a set of 80 toolkits across 8 categories with 180 scenarios to provide a safety evaluation benchmark. Our experimental evaluation, with both closed- and open-source LLMs, indicates verbal contrastive learning and interaction-level critiquing improve the safety rate significantly.
comment: 9 pages, 2 figures, 4 tables
DBHP: Trajectory Imputation in Multi-Agent Sports Using Derivative-Based Hybrid Prediction
Many spatiotemporal domains handle multi-agent trajectory data, but in real-world scenarios, collected trajectory data are often partially missing for various reasons. While existing approaches demonstrate good performance in trajectory imputation, they face challenges in capturing the complex dynamics and interactions between agents due to a lack of physical constraints that govern realistic trajectories, leading to suboptimal results. To address this issue, the paper proposes a Derivative-Based Hybrid Prediction (DBHP) framework that can effectively impute multiple agents' missing trajectories. First, a neural network equipped with Set Transformers produces a naive prediction of missing trajectories while satisfying permutation-equivariance with respect to the order of input agents. Then, the framework makes alternative predictions leveraging velocity and acceleration information and combines all the predictions with properly determined weights to provide final imputed trajectories. In this way, our proposed framework not only accurately predicts position, velocity, and acceleration values but also enforces the physical relationship between them, eventually improving both the accuracy and naturalness of the predicted trajectories. Accordingly, experimental results on imputing player trajectories in team sports show that our framework significantly outperforms existing imputation baselines.
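The combination of a direct position prediction with alternative, derivative-based predictions can be sketched in one dimension as below. This is a simplified stand-in, not the DBHP model: acceleration is ignored, the network outputs are replaced by plain lists, and the blending weights (a fixed `direct_weight` plus linear distance-to-anchor weights) are hypothetical, whereas the paper determines its weights differently.

```python
def hybrid_impute(p_start, p_end, direct, velocities, dt, direct_weight=0.5):
    """Impute n missing positions between two observed positions.

    direct     : n directly predicted positions for the missing frames 1..n
    velocities : n + 1 predicted velocities; velocities[k] covers the step from
                 frame k to frame k + 1 (frame 0 and frame n + 1 are observed)
    """
    n = len(direct)
    if len(velocities) != n + 1:
        raise ValueError("expected one velocity per step, i.e. n + 1 values")

    # Forward prediction: accumulate velocity from the last observed position.
    forward, p = [], p_start
    for k in range(n):
        p += velocities[k] * dt
        forward.append(p)

    # Backward prediction: subtract velocity from the next observed position.
    backward, p = [0.0] * n, p_end
    for k in range(n, 0, -1):
        p -= velocities[k] * dt
        backward[k - 1] = p

    imputed = []
    for k in range(1, n + 1):
        # Trust the forward pass near the start of the gap, the backward pass near its end.
        w_back = k / (n + 1)
        derivative_based = (1.0 - w_back) * forward[k - 1] + w_back * backward[k - 1]
        imputed.append(direct_weight * direct[k - 1]
                       + (1.0 - direct_weight) * derivative_based)
    return imputed


if __name__ == "__main__":
    print(hybrid_impute(0.0, 4.0, [1.4, 2.0, 3.0], [1.0] * 4, dt=1.0))
```

Anchoring the integrated velocities at the observed endpoints ties the imputed positions to the predicted derivatives, which is the kind of physical consistency the abstract refers to.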
Multi-Agent Based Simulation for Decentralized Electric Vehicle Charging Strategies and their Impacts
The growing shift towards a Smart Grid involves integrating numerous new digital energy solutions into the energy ecosystems to address problems arising from the transition to carbon neutrality, particularly in linking the electricity and transportation sectors. Yet, this shift brings challenges due to mass electric vehicle adoption and the lack of methods to adequately assess various EV charging algorithms and their ecosystem impacts. This paper introduces a multi-agent based simulation model, validated through a case study of a Danish radial distribution network serving 126 households. The study reveals that traditional charging leads to grid overload by 2031 at 67% EV penetration, while decentralized strategies like Real-Time Pricing could cause overloads as early as 2028. The developed multi-agent based simulation demonstrates its ability to offer detailed, hourly analysis of future load profiles in distribution grids, and therefore, can be applied to other prospective scenarios in similar energy systems.
Multi-agent based modeling for investigating excess heat utilization from electrolyzer production to district heating network
Power-to-Hydrogen is crucial for the renewable energy transition, yet existing literature lacks business models for the significant excess heat it generates. This study addresses this by evaluating three models for selling electrolyzer-generated heat to district heating grids: constant, flexible, and renewable-source hydrogen production, with and without heat sales. Using agent-based modeling and multi-criteria decision-making methods (VIKOR, TOPSIS, PROMETHEE), it finds that selling excess heat can cut hydrogen production costs by 5.6%. The optimal model operates flexibly with electricity spot prices, includes heat sales, and maintains a hydrogen price of 3.3 EUR/kg. Environmentally, hydrogen production from grid electricity could emit up to 13,783.8 tons of CO2 over four years from 2023. The best economic and environmental model uses renewable sources and sells heat at 3.5 EUR/kg.
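To make the multi-criteria ranking step concrete, here is a minimal sketch of the standard TOPSIS procedure (vector normalization, weighting, distance to the ideal and anti-ideal alternatives). The function name `topsis`, the criteria, and any numbers passed to it are hypothetical; the study's actual criteria and weights are not given in the abstract.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a TOPSIS closeness score in [0, 1] for each alternative (higher is better).

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is better
    Assumes the alternatives are not all identical (otherwise a division by zero occurs).
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Ideal and anti-ideal points, criterion by criterion.
    best = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v)
            for j in range(n_crit)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v)
             for j in range(n_crit)]
    scores = []
    for r in v:
        d_best = math.dist(r, best)
        d_worst = math.dist(r, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores
```

For example, with two cost-type criteria (say, production cost and emissions), an alternative that is lower on both receives a score of 1.0 and the dominated one a score of 0.0.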
Multi-Agent Based Simulation for Investigating Centralized Charging Strategies and their Impact on Electric Vehicle Home Charging Ecosystem
This paper addresses the critical integration of electric vehicles (EVs) into the electricity grid, which is essential for achieving carbon neutrality by 2050. The rapid increase in EV adoption poses significant challenges to the existing grid infrastructure, particularly in managing the increasing electricity demand and mitigating the risk of grid overloads. Centralized EV charging strategies are investigated due to their potential to optimize grid stability and efficiency, compared to decentralized approaches that may exacerbate grid stress. Utilizing a multi-agent based simulation model, the study provides a realistic representation of the electric vehicle home charging ecosystem in a case study of Strib, Denmark. The findings show that the Earliest-deadline-first and Round Robin perform best with 100% EV adoption in terms of EV user satisfaction. The simulation considers a realistic adoption curve, EV charging strategies, EV models, and driving patterns to capture the full ecosystem dynamics over a long-term period with high resolution (hourly). Additionally, the study offers detailed load profiles for future distribution grids, demonstrating how centralized charging strategies can efficiently manage grid loads and prevent overloads.
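As a rough illustration of an earliest-deadline-first charging rule, the sketch below allocates one hour of limited grid capacity to the EVs whose departure deadlines come first. The function name `edf_allocate`, the tuple layout, and the single-hour scope are assumptions; the paper's simulation model is considerably richer.

```python
def edf_allocate(requests, capacity_kw):
    """Allocate one hour of charging power, earliest deadline first.

    requests    : list of (ev_id, deadline_hour, demand_kw) tuples (hypothetical layout)
    capacity_kw : power available to all EVs in this hour
    Returns a dict mapping ev_id to the power granted.
    """
    allocation = {}
    remaining = capacity_kw
    # Serve EVs in order of increasing deadline until capacity runs out.
    for ev_id, _deadline, demand in sorted(requests, key=lambda r: r[1]):
        granted = min(demand, remaining)
        allocation[ev_id] = granted
        remaining -= granted
    return allocation
```

A Round Robin variant would instead rotate which EV is served first from hour to hour, rather than sorting by deadline.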
Analyzing the Impact of Electric Vehicles on Local Energy Systems using Digital Twins
The electrification of the transportation and heating sector, the so-called sector coupling, is one of the core elements to achieve independence from fossil fuels. As it highly affects the electricity demand, especially on the local level, the integrated modeling and simulation of all sectors is a promising approach for analyzing design decisions or complex control strategies. This paper analyzes the increase in electricity demand resulting from sector coupling, mainly due to integrating electric vehicles into urban energy systems. Therefore, we utilize a digital twin of an existing local energy system and extend it with a mobility simulation model to evaluate the impact of electric vehicles on the distribution grid level. Our findings indicate a significant rise in annual electricity consumption attributed to electric vehicles, with home charging alone resulting in a 78% increase. However, we demonstrate that integrating photovoltaic and battery energy storage systems can effectively mitigate this rise.
comment: Paper is to be published in Proceedings of the 2024 Winter Simulation Conference
Synchronization behind Learning in Periodic Zero-Sum Games Triggers Divergence from Nash equilibrium
Learning in zero-sum games studies a situation where multiple agents competitively learn their strategy. In such multi-agent learning, we often see that the strategies cycle around their optimum, i.e., the Nash equilibrium. When a game periodically varies (called a "periodic" game), however, the Nash equilibrium moves generically. How learning dynamics behave in such periodic games is of interest but still unclear. Interestingly, we discover that the behavior is highly dependent on the relationship between two speeds: the speed at which the game changes and the speed at which players learn. We observe that when these two speeds synchronize, the learning dynamics diverge, and their time-average does not converge. Otherwise, the learning dynamics draw complicated cycles, but their time-average converges. Under some assumptions introduced for the dynamical systems analysis, we prove that this behavior occurs. Furthermore, our experiments observe this behavior even when these assumptions are removed. This study discovers a novel phenomenon, i.e., synchronization, and gains insight widely applicable to learning in periodic games.
comment: 8 pages, 5 figures (main); 7 pages, 1 figure (appendix)
Optimization of Multi-Agent Flying Sidekick Traveling Salesman Problem over Road Networks
The mixed truck-drone delivery systems have attracted increasing attention for last-mile logistics, but real-world complexities demand a shift from single-agent, fully connected graph models to multi-agent systems operating on actual road networks. We introduce the multi-agent flying sidekick traveling salesman problem (MA-FSTSP) on road networks, extending the single truck-drone model to multiple trucks, each carrying multiple drones while considering full road networks for truck restrictions and flexible drone routes. We propose a mixed-integer linear programming model and an efficient three-phase heuristic algorithm for this NP-hard problem. Our approach decomposes MA-FSTSP into manageable subproblems of one truck with multiple drones. Then, it computes the routes for trucks without drones in subproblems, which are used in the final phase as heuristics to help optimize drone and truck routes simultaneously. Extensive numerical experiments on Manhattan and Boston road networks demonstrate our algorithm's superior effectiveness and efficiency, significantly outperforming both column generation and variable neighborhood search baselines in solution quality and computation time. Notably, our approach scales to more than 300 customers within a 5-minute time limit, showcasing its potential for large-scale, real-world logistics applications.
Autonomous Negotiation Using Comparison-Based Gradient Estimation
Negotiation is useful for resolving conflicts in multi-agent systems. We explore autonomous negotiation in a setting where two self-interested rational agents sequentially trade items from a finite set of categories. Each agent has a utility function that depends on the amount of items it possesses in each category. The offering agent makes trade offers to improve its utility without knowing the responding agent's utility function, and the responding agent accepts offers that improve its utility. We present a comparison-based algorithm for the offering agent that generates offers through previous acceptance or rejection responses without extensive information sharing. The algorithm estimates the responding agent's gradient by leveraging the rationality assumption and rejected offers to prune the space of potential gradients. After the algorithm makes a finite number of consecutively rejected offers, the responding agent is at a near-optimal state, or the agents' preferences are closely aligned. Additionally, we facilitate negotiations with humans by representing natural language feedback as comparisons that can be integrated into the proposed algorithm. We compare the proposed algorithm against random search baselines in integer and fractional trading scenarios and show that it improves the societal benefit with fewer offers.
MegaAgent: A Practical Framework for Autonomous Cooperation in Large-Scale LLM Agent Systems
With the emergence of large language models (LLMs), LLM-powered multi-agent systems (LLM-MA systems) have been proposed to tackle real-world tasks. However, their agents mostly follow predefined Standard Operating Procedures (SOPs) that remain unchanged across the whole interaction, lacking autonomy and scalability. Additionally, current solutions often overlook the necessity for effective agent cooperation. To address the above limitations, we propose MegaAgent, a practical framework designed for autonomous cooperation in large-scale LLM Agent systems. MegaAgent leverages the autonomy of agents to dynamically generate agents based on task requirements, incorporating features such as automatically dividing tasks, systematic planning and monitoring of agent activities, and managing concurrent operations. In addition, MegaAgent is designed with a hierarchical structure and employs system-level parallelism to enhance performance and boost communication. We demonstrate the effectiveness of MegaAgent through Gobang game development, showing that it outperforms popular LLM-MA systems; and national policy simulation, demonstrating its high autonomy and potential to rapidly scale up to 590 agents while ensuring effective cooperation among them. Our results indicate that MegaAgent is the first autonomous large-scale LLM-MA system with no pre-defined SOPs, high effectiveness and scalability, paving the way for further research in this field. Our code is at https://anonymous.4open.science/r/MegaAgent-81F3.
Approval-Based Committee Voting under Incomplete Information
We investigate approval-based committee voting with incomplete information about the approval preferences of voters. We consider several models of incompleteness where each voter partitions the set of candidates into approved, disapproved, and unknown candidates, possibly with ordinal preference constraints among candidates in the latter category. This captures scenarios where voters have not evaluated all candidates and/or it is unknown where voters draw the threshold between approved and disapproved candidates. We study the complexity of some fundamental computational problems for a number of classic approval-based committee voting rules including Proportional Approval Voting and Chamberlin-Courant. These problems include determining whether a given set of candidates is a possible or necessary winning committee and whether a given candidate is possibly or necessarily a member of the winning committee. We also consider proportional representation axioms and the problem of deciding whether a given committee is possibly or necessarily representative.
Near-linear Time Dispersion of Mobile Agents
Consider that there are $k\le n$ agents in a simple, connected, and undirected graph $G=(V,E)$ with $n$ nodes and $m$ edges. The goal of the dispersion problem is to move these $k$ agents to mutually distinct nodes. Agents can communicate only when they are at the same node, and no other communication means, such as whiteboards, are available. We assume that the agents operate synchronously. We consider two scenarios: when all agents are initially located at a single node (rooted setting) and when they are initially distributed over one or more nodes (general setting). Kshemkalyani and Sharma presented a dispersion algorithm for the general setting, which uses $O(m_k)$ time and $\log(k + \Delta)$ bits of memory per agent [OPODIS 2021], where $m_k$ is the maximum number of edges in any induced subgraph of $G$ with $k$ nodes, and $\Delta$ is the maximum degree of $G$. This algorithm is currently the fastest in the literature, as no $o(m_k)$-time algorithm has been discovered, even for the rooted setting. In this paper, we present significantly faster algorithms for both the rooted and the general settings. First, we present an algorithm for the rooted setting that solves the dispersion problem in $O(k\log \min(k,\Delta))=O(k\log k)$ time using $O(\log (k+\Delta))$ bits of memory per agent. Next, we propose an algorithm for the general setting that achieves dispersion in $O(k \log k \cdot \log \min(k,\Delta))=O(k \log^2 k)$ time using $O(\log (k+\Delta))$ bits. Finally, for the rooted setting, we give a time-optimal (i.e.,~$O(k)$-time) algorithm with $O(\Delta+\log k)$ bits of space per agent. All algorithms presented in this paper work only in the synchronous setting, while several algorithms in the literature, including the one given by Kshemkalyani and Sharma at OPODIS 2021, work in the asynchronous setting.
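The simplified bounds quoted in the abstract follow from a one-line observation, spelled out here for convenience. The comparison with $m_k$ in the last line uses only the definition of $m_k$ given above; the clique case is an illustrative worst case, not a claim from the paper.

```latex
\min(k,\Delta) \le k
\;\Longrightarrow\;
k \log \min(k,\Delta) \le k \log k ,
\qquad
k \log k \cdot \log \min(k,\Delta) \le k \log^2 k .

% If G contains a clique on k nodes, the induced subgraph attains the maximum
% possible edge count, so the earlier O(m_k) bound becomes quadratic in k:
m_k = \binom{k}{2} = \frac{k(k-1)}{2} = \Theta(k^2).
```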
Nash Equilibrium and Learning Dynamics in Three-Player Matching $m$-Action Games
Learning in games discusses the processes where multiple players learn their optimal strategies through the repetition of game plays. The dynamics of learning between two players in zero-sum games, such as matching pennies, where their benefits are competitive, have already been well analyzed. However, it is still unexplored and challenging to analyze the dynamics of learning among three players. In this study, we formulate a minimalistic game where three players compete to match their actions with one another. Although interaction among three players diversifies and complicates the Nash equilibria, we fully analyze the equilibria. We also discuss the dynamics of learning based on some famous algorithms categorized into Follow the Regularized Leader. From both theoretical and experimental aspects, we characterize the dynamics by categorizing three-player interactions into three forces to synchronize their actions, switch their actions rotationally, and seek competition.
comment: 9 pages, 4 figures (main), 9 pages, 1 figure (appendix)
PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety
Multi-agent systems, when enhanced with Large Language Models (LLMs), exhibit profound capabilities in collective intelligence. However, the potential misuse of this intelligence for malicious purposes presents significant risks. To date, comprehensive research on the safety issues associated with multi-agent systems remains limited. In this paper, we explore these concerns through the innovative lens of agent psychology, revealing that the dark psychological states of agents constitute a significant threat to safety. To tackle these concerns, we propose a comprehensive framework (PsySafe) grounded in agent psychology, focusing on three key areas: firstly, identifying how dark personality traits in agents can lead to risky behaviors; secondly, evaluating the safety of multi-agent systems from the psychological and behavioral perspectives, and thirdly, devising effective strategies to mitigate these risks. Our experiments reveal several intriguing phenomena, such as the collective dangerous behaviors among agents, agents' self-reflection when engaging in dangerous behavior, and the correlation between agents' psychological assessments and dangerous behaviors. We anticipate that our framework and observations will provide valuable insights for further research into the safety of multi-agent systems. We will make our data and code publicly accessible at https://github.com/AI4Good24/PsySafe.
comment: ACL 2024
Instruct, Not Assist: LLM-based Multi-Turn Planning and Hierarchical Questioning for Socratic Code Debugging
Socratic questioning is an effective teaching strategy, encouraging critical thinking and problem-solving. The conversational capabilities of large language models (LLMs) show great potential for providing scalable, real-time student guidance. However, current LLMs often give away solutions directly, making them ineffective instructors. We tackle this issue in the code debugging domain with TreeInstruct, an Instructor agent guided by a novel state space-based planning algorithm. TreeInstruct asks probing questions to help students independently identify and resolve errors. It estimates a student's conceptual and syntactical knowledge to dynamically construct a question tree based on their responses and current knowledge state, effectively addressing both independent and dependent mistakes concurrently in a multi-turn interaction setting. In addition to using an existing single-bug debugging benchmark, we construct a more challenging multi-bug dataset of 150 coding problems, incorrect solutions, and bug fixes -- all carefully constructed and annotated by experts. Extensive evaluation shows TreeInstruct's state-of-the-art performance on both datasets, proving it to be a more effective instructor than baselines. Furthermore, a real-world case study with five students of varying skill levels further demonstrates TreeInstruct's ability to guide students to debug their code efficiently with minimal turns and highly Socratic questioning. We provide our code and datasets at http://github.com/agarwalishika/TreeInstruct .
Systems and Control (CS)
Hybrid Recurrent Models Support Emergent Descriptions for Hierarchical Planning and Control
An open problem in artificial intelligence is how systems can flexibly learn discrete abstractions that are useful for solving inherently continuous problems. Previous work has demonstrated that a class of hybrid state-space models known as recurrent switching linear dynamical systems (rSLDS) discovers meaningful behavioural units via the piecewise linear decomposition of complex continuous dynamics (Linderman et al., 2016). Furthermore, they model how the underlying continuous states drive these discrete mode switches. We propose that the rich representations formed by an rSLDS can provide useful abstractions for planning and control. We present a novel hierarchical model-based algorithm inspired by Active Inference in which a discrete MDP sits above a low-level linear-quadratic controller. The recurrent transition dynamics learned by the rSLDS allow us to (1) specify temporally-abstracted sub-goals in a method reminiscent of the options framework, (2) lift the exploration into discrete space, allowing us to exploit information-theoretic exploration bonuses, and (3) 'cache' the approximate solutions to low-level problems in the discrete planner. We successfully apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and non-trivial planning through the delineation of abstract sub-goals.
comment: 4 pages, 3 figures
Safety-Critical Stabilization of Force-Controlled Nonholonomic Robots
We present a safety-critical controller for the problem of stabilization for force-controlled nonholonomic autonomous vehicles. The proposed control law is based on the constructions of control Lyapunov functions (CLFs) and control barrier functions (CBFs) for cascaded systems. To address nonholonomicity, we design the nominal controller that guarantees global asymptotic stability and local exponential stability for the closed-loop system in polar coordinates and construct a strict Lyapunov function valid on any compact sets. Furthermore, we present a procedure for constructing CBFs for cascaded systems, utilizing the CBF of the kinematic model through integrator backstepping. Quadratic programming is employed to combine CLFs and CBFs to integrate both stability and safety in the closed loop. The proposed control law is time-invariant, continuous along trajectories, and easy to implement. Our main results guarantee both safety and local asymptotic stability for the closed-loop system.
How Much Reserve Fuel: Quantifying the Maximal Energy Cost of System Disturbances
Motivated by the design question of additional fuel needed to complete a task in an uncertain environment, this paper introduces metrics to quantify the maximal additional energy used by a control system in the presence of bounded disturbances when compared to a nominal, disturbance-free system. In particular, we consider the task of finite-time stabilization for a linear time-invariant system. We first derive the nominal energy required to achieve this task in a disturbance-free system, and then the worst-case energy over all feasible disturbances. The latter leads to an optimal control problem with a least-squares solution, and then an infinite-dimensional optimization problem where we derive an upper bound on the solution. The comparison of these energies is accomplished using additive and multiplicative metrics, and we derive analytical bounds on these metrics. Simulation examples on an ADMIRE fighter jet model demonstrate the practicability of these metrics, and their variation with the task hardness, a combination of the distance of the initial condition from the origin and the task completion time.
comment: 6 pages, 4 figures. IEEE Conference on Decision and Control
Emerging clean technologies: policy-driven cost reductions, implications and perspectives
Hydrogen production from water electrolysis, direct air capture (DAC), and synthetic kerosene derived from hydrogen and CO2 ('e-kerosene') are expected to play an important role in global decarbonization efforts. So far, the economics of these nascent technologies hamper their market diffusion. However, a wave of recent policy support in the United States, Europe, China, and elsewhere is anticipated to drive their commercial liftoff and bring their costs down. To this end, we evaluate the potential cost reductions driven by policy-induced scale-up of these emerging technologies through 2030 using an experience curves approach accounting for both local and global learning effects. We then analyze the consequences of projected cost declines on the competitiveness of these nascent technologies compared to conventional fossil alternatives, where applicable, and highlight some of the tradeoffs associated with their expansion. Our findings indicate that enacted policies could lead to substantial capital cost reductions for electrolyzers. Nevertheless, electrolytic hydrogen production at $1-2/kg would still require some form of policy support. Given expected costs and experience curves, it is unlikely that liquid solvent DAC (L-DAC) scale-up will bring removal costs to stated targets of $100/tCO2, though $200/tCO2 may eventually be within reach. We also underscore the importance of tackling methane leakage for natural gas-powered L-DAC: unmitigated leaks amplify net removal costs, exacerbate the investment requirements to reach targeted costs, and cast doubt on L-DAC's role in the clean energy transition. Lastly, despite reductions in electrolysis and L-DAC costs, e-kerosene remains considerably more expensive than fossil jet fuel. The economics of e-kerosene and the resources required for production raise questions about the fuel's ultimate viability as a decarbonization tool for aviation.
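For readers unfamiliar with experience curves, the sketch below shows the standard single-factor form, in which unit cost falls by a fixed fraction (the learning rate) with every doubling of cumulative capacity. It is a simplification: the paper distinguishes local and global learning effects, which this sketch does not model, and the function name and any input values are hypothetical.

```python
import math

def experience_curve_cost(c0, q0, q, learning_rate):
    """Unit cost after cumulative capacity grows from q0 to q.

    c0            : unit cost at cumulative capacity q0
    learning_rate : fractional cost drop per doubling of capacity (e.g. 0.2 for 20%)
    """
    # Experience exponent b satisfies 2 ** (-b) == 1 - learning_rate.
    b = -math.log2(1.0 - learning_rate)
    return c0 * (q / q0) ** (-b)
```

With a 20% learning rate, a hypothetical starting cost of 1000 falls to about 800 after one doubling of capacity and about 640 after two.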
Towards reliable real-time trajectory optimization
Motion planning is a key aspect of robotics. A common approach to address motion planning problems is trajectory optimization. Trajectory optimization can represent the high-level behaviors of robots through mathematical formulations. However, current trajectory optimization approaches have two main challenges. Firstly, their solution heavily depends on the initial guess, and they are prone to get stuck in local minima. Secondly, they face scalability limitations by increasing the number of constraints. This thesis endeavors to tackle these challenges by introducing four innovative trajectory optimization algorithms to improve reliability, scalability, and computational efficiency. There are two novel aspects of the proposed algorithms. The first key innovation is remodeling the kinematic constraints and collision avoidance constraints. Another key innovation lies in the design of algorithms that effectively utilize parallel computation on GPU accelerators. By using reformulated constraints and leveraging the computational power of GPUs, the proposed algorithms of this thesis demonstrate significant improvements in efficiency and scalability compared to the existing methods. Parallelization enables faster computation times, allowing for real-time decision-making in dynamic environments. Moreover, the algorithms are designed to adapt to changes in the environment, ensuring robust performance. Extensive benchmarking for each proposed optimizer validates their efficacy. Overall, this thesis makes a significant contribution to the field of trajectory optimization algorithms. It introduces innovative solutions that specifically address the challenges faced by existing methods. The proposed algorithms pave the way for more efficient and robust motion planning solutions in robotics by leveraging parallel computation and specific mathematical structures.
comment: PhD Thesis, University of Tartu, 2024. The thesis was defended on 21st of June. https://dspace.ut.ee/items/a65d36c9-afe7-44ab-b544-20236177ed79
Fast Grid Emissions Sensitivities using Parallel Decentralized Implicit Differentiation
Marginal emissions rates -- the sensitivity of carbon emissions to electricity demand -- are important for evaluating the impact of emissions mitigation measures. Like locational marginal prices, locational marginal emissions rates (LMEs) can vary geographically, even between nearby locations, and may be coupled across time periods because of, for example, storage and ramping constraints. This temporal coupling makes computing LMEs computationally expensive for large electricity networks with high storage and renewable penetrations. Recent work demonstrates that decentralized algorithms can mitigate this problem by decoupling timesteps during differentiation. Unfortunately, we show these potential speedups are negated by the sparse structure inherent in power systems problems. We address these limitations by introducing a parallel, reverse-mode decentralized differentiation scheme that never explicitly instantiates the solution map Jacobian. We show both theoretically and empirically that parallelization is necessary to achieve non-trivial speedups when computing grid emissions sensitivities. Numerical results on a 500 node system indicate that our method can achieve greater than 10x speedups over centralized and serial decentralized approaches.
Fault Tolerant Dynamic Task Assignment for UAV-based Search Teams
This research offers a novel framework for dynamic task assignment for unmanned aerial vehicles (UAVs) in cooperative search settings. Notably, it incorporates post-fault UAV capabilities into job assignment techniques, assuring operational dependability in the event of sensor and actuator failures. A significant innovation is the utilization of UAV battery charge to assess range relative to search objectives, hence improving job distribution while conserving battery life. This model integrates repair, recharge, and stochastic goal recurrence, hence increasing its real-world applicability. Using stochastic dynamic programming, this method makes it simpler to determine optimal assignment policies offline so they may be implemented rapidly online. This paper emphasizes the holistic aspect of the proposed model, which connects high-level task rules to low-level control capabilities. A simulation-based case study proves its usefulness, highlighting its robustness in fault-prone and battery-variable settings. Overall, this paper proposes and demonstrates a comprehensive method for assigning UAV tasks that integrates defect awareness, battery management, and multilayer control through the use of stochastic dynamic programming.
Semi-on-Demand Off-Peak Transit Services with Shared Autonomous Vehicles -- Service Planning, Simulation, and Analysis in Munich, Germany
This study investigates the implementation of semi-on-demand (SoD) hybrid-route services using Shared Autonomous Vehicles (SAVs) on existing transit lines. SoD services combine the cost efficiency of fixed-route buses with the flexibility of on-demand services. SAVs first serve all scheduled fixed-route stops, then drop off and pick up passengers in the pre-determined flexible-route portion, and return to the fixed route. This study addresses four key questions: optimal fleet and vehicle sizes for peak-hour fixed-route services with SAVs and during transition (from drivers to autonomous vehicles), optimal off-peak SoD service planning, and suitable use cases. The methodology combines analytical modeling for service planning with agent-based simulation for operational analysis. We examine ten bus routes in Munich, Germany, considering full SAV and transition scenarios with varying proportions of drivers. Our findings demonstrate that the lower operating costs of SAVs improve service quality through increased frequency and smaller vehicles, even in transition scenarios. The reduced headway lowers waiting time and also favors more flexible-route operation in SoD services. The optimal SoD settings range from fully flexible to hybrid routes, where higher occupancy from the terminus favors shorter flexible routes. During the transition phase, limited fleet size and higher headways constrain the benefits of flexible-route operations. The simulation results corroborate the SoD benefits of door-to-door convenience, attracting more passengers without excessive detours and operator costs at moderate flexible-route lengths, and validate the analytical model.
comment: 25 pages, 10 figures
High-Sensitivity and Compact Time-domain Soil Moisture Sensor Using Dispersive Phase Shifter for Complex Permittivity Measurement
This paper presents a Time-Domain Transmissometry Soil Moisture Sensor (TDT-SMS) using a Dispersive Phase Shifter (DPS), consisting of an interdigital capacitor that is loaded with a stacked 4-turn Complementary Spiral Resonator (S4-CSR). The soil moisture measurement technique of the proposed sensor is based on the complex permittivity sensing property of a DPS in the time domain. The relative permittivity of soil, which varies with its moisture content, is measured by burying the DPS under a soil mass, which changes its phase difference while it is excited with a 114 MHz single-tone sine wave. The DPS output phase and magnitude are compared with the reference signal and measured with a phase/loss detector. The proposed sensor exhibits accuracy better than ±1.2 percent at the highest Volumetric Water Content (VWC = 30 percent) for sandy-type soil. A precise design guide is developed and simulations are performed to achieve a highly sensitive sensor. The measurement results validate the accuracy of the theoretical analysis and design procedure. Its low profile, low power consumption, and high sensitivity make the proposed TDT-SMS a good candidate for precision farming and IoT systems.
Low Profile Metamaterial Band-Pass Filter Loaded with 4-Turn Complementary Spiral Resonator for WPT Applications
This paper proposes a very compact, low-insertion-loss metamaterial band-pass filter (MBPF) with a center frequency of f0 = 730 MHz, based on rectangular 4-turn complementary spiral resonators (4 CSR). The proposed MBPF uses an interdigital capacitor as a series capacitance in the top layer, which improves stopband performance around the 700 to 760 MHz passband and makes the filter suitable for wireless power transfer (WPT) systems by rejecting unwanted signals. To validate the proposed technique, the MBPF is fabricated on an RO-4003 substrate, and good agreement is achieved between simulated and measured results. Stopband attenuations greater than 52 dB and 20 dB are obtained around 0.8 × fcl (lower cutoff frequency) and 1.2 × fcu (upper cutoff frequency), respectively.
How many autonomous vehicles are required to stabilize traffic flow?
Collective behavior of human-driven vehicles (HVs) results in the well-known stop-and-go waves potentially leading to higher fuel consumption and emissions. This letter investigates the stabilization of traffic flow via a minimum number of autonomous vehicles (AVs) subject to constraints on the control parameters. The unconstrained scenario has been well-studied in recent studies. The main motivation to investigate the constrained scenario is that, in reality, lower and upper bounds exist on the control parameters. For the constrained scenario, we optimally find the minimum number of required AVs (via computing the optimal lower bound on the AV penetration rate) to stabilize traffic flow for a given number of HVs. As an immediate consequence, we conclude that for a given number of AVs, the number of HVs in the stabilized traffic flow cannot be arbitrarily large in the constrained scenario unlike the unconstrained scenario studied in the literature. Using nonlinear optimization techniques, we systematically propose a procedure to compute the optimal lower bound on the AV penetration rate. Finally, we validate the theoretical results via numerical simulations. Numerical simulations suggest that by enlarging the constraint intervals, a smaller optimal lower bound on the AV penetration rate is attainable. However, it leads to a slower transient response due to a dominant pole closer to the origin.
Newton-Raphson Flow for Aggressive Quadrotor Tracking Control
We apply the Newton-Raphson flow tracking controller to aggressive quadrotor flight and demonstrate that it achieves good tracking performance over a suite of benchmark trajectories, outperforming the native trajectory tracking controller in the popular PX4 Autopilot. The Newton-Raphson flow tracking controller is a recently proposed integrator-type controller that aims to drive to zero the error between a future predicted system output and the reference trajectory. This controller is computationally lightweight, requiring only an imprecise predictor, and achieves guaranteed asymptotic error bounds under certain conditions. We show that these theoretical advantages are realizable on a quadrotor hardware platform. Our experiments are conducted on a Holybro X500 V2 quadrotor using a Pixhawk 6X flight controller and a Raspberry Pi 4 companion computer, which receives location information from an OptiTrack motion capture system and sends input commands through the ROS2 API for the PX4 software stack.
comment: Expanded version of our submission to the American Control Conference 2024
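A minimal sketch of the integrator-type law described above, assuming the commonly cited form du/dt = alpha * (dg/du)^(-1) * (r(t+T) - g(x, u)), where g predicts the output T seconds ahead. The scalar plant dx/dt = -x + u, the gain alpha, the horizon T, and the constant reference are illustrative assumptions, not the quadrotor setup from the abstract.

```python
import math

def simulate_nr_flow(alpha=5.0, horizon=0.5, dt=1e-3, t_end=10.0, ref=1.0):
    """Track a constant reference on the toy plant dx/dt = -x + u (assumed)."""
    decay = math.exp(-horizon)
    x, u = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        # Predicted output at t + horizon with u held constant
        # (this prediction happens to be exact for the toy plant).
        y_pred = decay * x + (1.0 - decay) * u
        # Sensitivity of the prediction with respect to the input.
        dg_du = 1.0 - decay
        # Newton-Raphson flow: integrate the scaled prediction error.
        u_dot = alpha * (ref - y_pred) / dg_du
        x_dot = -x + u
        x += dt * x_dot
        u += dt * u_dot
    return x, u

x_final, u_final = simulate_nr_flow()
```

For a constant reference, both the state and the input should settle at the reference value, since that is the only point where the predicted error and the plant derivative vanish together.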
Timer-Based Coverage Control for Mobile Sensors
This work investigates the coverage control problem over a static, bounded, and convex workspace and develops a hybrid extension of the continuous-time Lloyd algorithm. Each agent in a multi-agent system (MAS) is equipped with a timer mechanism that generates intermittent measurement and control update events, which may occur asynchronously between agents. Between consecutive event times, as determined by the corresponding timer mechanism, the controller of each agent is held constant. These controllers are shown to drive the configuration of the MAS into a neighborhood of the set of centroidal Voronoi configurations, i.e., a local minimizer of the standard locational cost. The combination of continuous-time dynamics with intermittently updated control inputs is modeled as a hybrid system. The coverage objective is posed as a set attractivity problem for hybrid systems, where an invariance-based convergence analysis yields sufficient conditions that ensure maximal solutions of the hybrid system asymptotically converge to a desired set. A brief simulation example is included to showcase the result.
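As a point of reference, the classical (synchronous, discrete-step) Lloyd iteration that the hybrid scheme above extends can be sketched as follows. This sketch assumes agents on the interval [0, 1] with a uniform density; it does not model the timer mechanism or asynchronous updates from the abstract, and the gain and step count are arbitrary.

```python
def lloyd_step(positions, gain=0.5, lo=0.0, hi=1.0):
    """One Lloyd update for agents on [lo, hi] under a uniform density."""
    pts = sorted(positions)
    # In 1-D, Voronoi cell boundaries are midpoints between neighbors.
    bounds = [lo] + [(a + b) / 2 for a, b in zip(pts, pts[1:])] + [hi]
    # With a uniform density, each cell's centroid is the cell midpoint.
    centroids = [(l + r) / 2 for l, r in zip(bounds, bounds[1:])]
    # Move each agent a fraction of the way toward its cell centroid.
    return [p + gain * (c - p) for p, c in zip(pts, centroids)]

def run_lloyd(positions, steps=200):
    for _ in range(steps):
        positions = lloyd_step(positions)
    return positions

final_positions = run_lloyd([0.05, 0.1, 0.9])
```

For three agents the iteration approaches the centroidal Voronoi configuration 1/6, 1/2, 5/6, in which every agent sits at the centroid of its own cell.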
An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems
In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.
comment: Proceedings of the 6th Annual Learning for Dynamics & Control Conference, PMLR 242:312-323, 2024
Continuous Approximations of Projected Dynamical Systems via Control Barrier Functions
Projected Dynamical Systems (PDSs) form a class of discontinuous constrained dynamical systems, and have been used widely to solve optimization problems and variational inequalities. Recently, they have also gained significant attention for control purposes, such as high-performance integrators, saturated control and feedback optimization. In this work, we establish that locally Lipschitz continuous dynamics, involving Control Barrier Functions (CBFs), namely CBF-based dynamics, approximate PDSs. Specifically, we prove that trajectories of CBF-based dynamics uniformly converge to trajectories of PDSs, as a CBF-parameter approaches infinity. Towards this, we also prove that CBF-based dynamics are perturbations of PDSs, with quantitative bounds on the perturbation. Our results pave the way to implement discontinuous PDS-based controllers in a continuous fashion, employing CBFs. We demonstrate this on numerical examples on feedback optimization and synchronverter control. Moreover, our results can be employed to numerically simulate PDSs, overcoming disadvantages of existing discretization schemes, such as computing projections to possibly non-convex sets. Finally, this bridge between CBFs and PDSs may yield other potential benefits, including novel insights on stability.
comment: Accepted to IEEE Transactions on Automatic Control (IEEE TAC). Compared to the accepted version, this version contains an additional numerical example on feedback optimization
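A toy illustration of the convergence claim above, under assumptions chosen for simplicity: a scalar state pushed right at unit speed, the constraint x <= 1 encoded as h(x) = 1 - x, and the CBF condition dh/dt >= -alpha * h, which here reduces to dx/dt = min(1, alpha * (1 - x)). The projected trajectory from x(0) = 0 is min(t, 1). None of these choices come from the paper.

```python
def max_gap_to_projection(alpha, dt=1e-4, t_end=2.0):
    """Largest distance between the CBF-based and projected trajectories."""
    x = 0.0
    gap = 0.0
    for k in range(int(t_end / dt)):
        # CBF-filtered vector field: nominal speed 1, capped by alpha * h(x).
        x += dt * min(1.0, alpha * (1.0 - x))
        t = (k + 1) * dt
        # Projected dynamical system: move at speed 1, then stop at x = 1.
        x_pds = min(t, 1.0)
        gap = max(gap, abs(x - x_pds))
    return gap

gaps = [max_gap_to_projection(alpha) for alpha in (10.0, 100.0, 1000.0)]
```

In this example the worst-case gap shrinks roughly in proportion to 1/alpha, which is consistent with uniform convergence as the CBF parameter grows.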
Optimal Dynamic Ancillary Services Provision Based on Local Power Grid Perception
In this paper, we propose a systematic closed-loop approach to provide optimal dynamic ancillary services with converter-interfaced generation systems based on local power grid perception. In particular, we structurally encode dynamic ancillary services such as fast frequency and voltage regulation in the form of a parametric transfer function matrix, which includes several parameters to define a set of different feasible response behaviors, among which we aim to find the optimal one to be realized by the converter system. Our approach is based on a so-called "perceive-and-optimize" (P&O) strategy: First, we identify a grid dynamic equivalent at the interconnection terminals of the converter system. Second, we consider the closed-loop interconnection of the identified grid equivalent and the parametric transfer function matrix, which we optimize for the set of transfer function parameters, resulting in a stable and optimal closed-loop performance for ancillary services provision. In the process, we ensure that grid-code and device-level requirements are satisfied. Finally, we demonstrate the effectiveness of our approach in different numerical case studies based on a modified Kundur two-area test system.
comment: 15 pages, 20 Figures
Control Barrier Function Based Design of Gradient Flows for Constrained Nonlinear Programming
This paper considers the problem of designing a continuous-time dynamical system that solves a constrained nonlinear optimization problem and makes the feasible set forward invariant and asymptotically stable. The invariance of the feasible set makes the dynamics anytime, when viewed as an algorithm, meaning it returns a feasible solution regardless of when it is terminated. Our approach augments the gradient flow of the objective function with inputs defined by the constraint functions, treats the feasible set as a safe set, and synthesizes a safe feedback controller using techniques from the theory of control barrier functions. The resulting closed-loop system, termed safe gradient flow, can be viewed as a primal-dual flow, where the state corresponds to the primal variables and the inputs correspond to the dual ones. We provide a detailed suite of conditions based on constraint qualification under which (both isolated and nonisolated) local minimizers are stable with respect to the feasible set and the whole state space. Comparisons with other continuous-time methods for optimization in a simple example illustrate the advantages of the safe gradient flow.
comment: Full version, with appendix, of work appearing in IEEE Transactions on Automatic Control
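A minimal sketch of a CBF-filtered gradient flow for a single inequality constraint, assuming the flow is defined as the vector closest to the negative gradient that satisfies grad g . xi <= -alpha * g. With one constraint this small program has a closed-form multiplier. The objective, constraint, gain, and Euler step below are illustrative assumptions.

```python
def safe_gradient_flow(x0, alpha=10.0, dt=1e-3, t_end=10.0):
    """Minimize (x1-2)^2 + (x2-2)^2 subject to x1^2 + x2^2 - 1 <= 0."""
    x1, x2 = x0
    max_g = x1 * x1 + x2 * x2 - 1.0
    for _ in range(int(t_end / dt)):
        f1, f2 = 2.0 * (x1 - 2.0), 2.0 * (x2 - 2.0)  # objective gradient
        g = x1 * x1 + x2 * x2 - 1.0                  # constraint value
        g1, g2 = 2.0 * x1, 2.0 * x2                  # constraint gradient
        gg = g1 * g1 + g2 * g2
        # Closed-form multiplier of the one-constraint projection problem;
        # it plays the role of the dual variable.
        lam = max(0.0, (alpha * g - (g1 * f1 + g2 * f2)) / gg) if gg > 1e-12 else 0.0
        x1 += dt * (-f1 - lam * g1)
        x2 += dt * (-f2 - lam * g2)
        max_g = max(max_g, x1 * x1 + x2 * x2 - 1.0)
    return (x1, x2), max_g

point, worst_violation = safe_gradient_flow((0.0, -0.5))
```

Starting from a feasible point, the trajectory should approach the constrained minimizer (1/sqrt(2), 1/sqrt(2)). The continuous-time flow keeps the feasible set invariant; the forward-Euler discretization used here can overshoot the boundary by a small amount that shrinks with the step size.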
System Identification for Lithium-Ion Batteries with Nonlinear Coupled Electro-Thermal Dynamics via Bayesian Optimization
Essential to various practical applications of lithium-ion batteries is the availability of accurate equivalent circuit models. This paper presents a new coupled electro-thermal model for batteries and studies how to extract it from data. We consider the problem of maximum likelihood parameter estimation, which, however, is nontrivial to solve as the model is nonlinear in both its dynamics and measurement. We propose to leverage the Bayesian optimization approach, owing to its machine learning-driven capability in handling complex optimization problems and searching for global optima. To enhance the parameter search efficiency, we dynamically narrow and refine the search space in Bayesian optimization. The proposed system identification approach can efficiently determine the parameters of the coupled electro-thermal model. It is amenable to practical implementation, with few requirements on the experiment, data types, and optimization setups, and well applicable to many other battery models.
comment: 2024 American Control Conference (ACC)
Reducing transmission expansion by co-optimizing sizing of wind, solar, storage and grid connection capacity
Expanding transmission capacity is likely a bottleneck that will restrict variable renewable energy (VRE) deployment required to achieve ambitious emission reduction goals. Grid interconnection and inter-regional transmission capacity may be reduced by the optimal sizing of VREs to grid connection or co-location of VRE and battery resources behind the grid interconnection, but neither of these capabilities is commonly captured in macro-energy system models. We thus develop these two new functionalities to explore the substitutability of storage for transmission and VRE resource trade-offs through 2030 in the Western Interconnection of the United States. Our findings indicate that not modeling co-location fails to capture the full substitutability of storage and solar photovoltaic (PV) resources for transmission: co-location can reduce long-distance inter-regional transmission expansion by 12-31% and decrease grid connection capacity and shorter-distance transmission interconnection by 20-25%. We also demonstrate that not modeling co-located storage does not accurately reflect competition between wind and solar PV resources and underestimates the value of energy storage: co-location of VREs and storage favors solar PV (4-5% increase) and lithium-ion battery deployment (1.7-6 times increase), while decreasing wind buildout (0.9-1.6% decline).
comment: The authors have decided to withdraw this version since they have found multiple inaccuracies in the results with the new model version, updated data, and reformulated cost analysis
ComTraQ-MPC: Meta-Trained DQN-MPC Integration for Trajectory Tracking with Limited Active Localization Updates
Optimal decision-making for trajectory tracking in partially observable, stochastic environments where the number of active localization updates -- the process by which the agent obtains its true state information from the sensors -- is limited presents a significant challenge. Traditional methods often struggle to balance resource conservation, accurate state estimation and precise tracking, resulting in suboptimal performance. This problem is particularly pronounced in environments with large action spaces, where the need for frequent, accurate state data is paramount, yet the capacity for active localization updates is restricted by external limitations. This paper introduces ComTraQ-MPC, a novel framework that combines Deep Q-Networks (DQN) and Model Predictive Control (MPC) to optimize trajectory tracking with constrained active localization updates. The meta-trained DQN ensures adaptive active localization scheduling, while the MPC leverages available state information to improve tracking. The central contribution of this work is their reciprocal interaction: DQN's update decisions inform MPC's control strategy, and MPC's outcomes refine DQN's learning, creating a cohesive, adaptive system. Empirical evaluations in simulated and real-world settings demonstrate that ComTraQ-MPC significantly enhances operational efficiency and accuracy, providing a generalizable and approximately optimal solution for trajectory tracking in complex partially observable environments.
comment: * Equal contribution
Proximal observers for secure state estimation
This paper discusses a general framework for designing robust state estimators for a class of discrete-time nonlinear systems. We consider systems that may be impacted by impulsive (sparse but otherwise arbitrary) measurement noise sequences. We show that a family of state estimators, robust to this type of undesired signal, can be obtained by minimizing a class of nonsmooth convex functions at each time step. The resulting state observers are defined through proximal operators. We obtain a nonlinear implicit dynamical system in terms of the estimation error and prove, in the noise-free setting, that this error vanishes asymptotically when the minimized loss function and the to-be-observed system enjoy appropriate properties. From a computational perspective, even though the proposed observers can be implemented via efficient numerical procedures, they do not admit closed-form expressions. The paper argues that by adopting appropriate relaxations, simple and fast analytic expressions can be derived.
comment: 17 pages, 6 figures
Efficient Reachable Sets on Lie Groups Using Lie Algebra Monotonicity and Tangent Intervals
In this paper, we efficiently compute overapproximating reachable sets for control systems evolving on Lie groups, building off results from monotone systems theory and geometric integration theory. We consider intervals in the tangent space, which describe real sets on the Lie group through the exponential map. A local equivalence between the original system and a system evolving on the Lie algebra allows existing interval reachability techniques to apply in the tangent space. Using interval bounds of the Baker-Campbell-Hausdorff formula, these reachable set estimates are extended to arbitrary time horizons in an efficient Runge-Kutta-Munthe-Kaas integration algorithm. The algorithm is demonstrated through consensus on a torus and attitude control on $SO(3)$.
Strong exponential stability of switched impulsive systems with mode-constrained switching
Strong stability, defined by bounds that decay not only over time but also with the number of impulses, has been established as a requirement to ensure robustness properties for impulsive systems with respect to inputs or disturbances. Most existing results, however, only consider weak stability, where the bounds only decay with time. In this paper, we provide a method for calculating the maximum overshoot and the decay rate for strong global uniform exponential stability bounds for nonlinear switched impulsive systems. We consider the scenario of mode-constrained switching where not all transitions between subsystems are allowed, and where subsystems may exhibit unstable dynamics in the flow and/or jump maps. Based on direct and reverse mode-dependent average dwell-time and activation-time constraints, we derive stability bounds that can be improved by considering longer switching sequences for computation. We provide an example that shows how the results can be employed to ensure the stability robustness of nonlinear systems that admit a global state weak linearization.
comment: 21 pages, 4 figures
Systems and Control (EESS)
Hybrid Recurrent Models Support Emergent Descriptions for Hierarchical Planning and Control
An open problem in artificial intelligence is how systems can flexibly learn discrete abstractions that are useful for solving inherently continuous problems. Previous work has demonstrated that a class of hybrid state-space models known as recurrent switching linear dynamical systems (rSLDS) discovers meaningful behavioural units via the piecewise linear decomposition of complex continuous dynamics (Linderman et al., 2016). Furthermore, these models capture how the underlying continuous states drive the discrete mode switches. We propose that the rich representations formed by an rSLDS can provide useful abstractions for planning and control. We present a novel hierarchical model-based algorithm inspired by Active Inference in which a discrete MDP sits above a low-level linear-quadratic controller. The recurrent transition dynamics learned by the rSLDS allow us to (1) specify temporally-abstracted sub-goals in a method reminiscent of the options framework, (2) lift the exploration into discrete space, allowing us to exploit information-theoretic exploration bonuses, and (3) 'cache' the approximate solutions to low-level problems in the discrete planner. We successfully apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and non-trivial planning through the delineation of abstract sub-goals.
comment: 4 pages, 3 figures
Safety-Critical Stabilization of Force-Controlled Nonholonomic Robots
We present a safety-critical controller for the problem of stabilization for force-controlled nonholonomic autonomous vehicles. The proposed control law is based on the construction of control Lyapunov functions (CLFs) and control barrier functions (CBFs) for cascaded systems. To address nonholonomicity, we design a nominal controller that guarantees global asymptotic stability and local exponential stability for the closed-loop system in polar coordinates, and we construct a strict Lyapunov function valid on any compact set. Furthermore, we present a procedure for constructing CBFs for cascaded systems, utilizing the CBF of the kinematic model through integrator backstepping. Quadratic programming is employed to combine CLFs and CBFs to integrate both stability and safety in the closed loop. The proposed control law is time-invariant, continuous along trajectories, and easy to implement. Our main results guarantee both safety and local asymptotic stability for the closed-loop system.
How Much Reserve Fuel: Quantifying the Maximal Energy Cost of System Disturbances
Motivated by the design question of additional fuel needed to complete a task in an uncertain environment, this paper introduces metrics to quantify the maximal additional energy used by a control system in the presence of bounded disturbances when compared to a nominal, disturbance-free system. In particular, we consider the task of finite-time stabilization for a linear time-invariant system. We first derive the nominal energy required to achieve this task in a disturbance-free system, and then the worst-case energy over all feasible disturbances. The latter leads to an optimal control problem with a least-squares solution, and then an infinite-dimensional optimization problem where we derive an upper bound on the solution. The comparison of these energies is accomplished using additive and multiplicative metrics, and we derive analytical bounds on these metrics. Simulation examples on an ADMIRE fighter jet model demonstrate the practicability of these metrics, and their variation with the task hardness, a combination of the distance of the initial condition from the origin and the task completion time.
comment: 6 pages, 4 figures. IEEE Conference on Decision and Control
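For the nominal (disturbance-free) part only, the textbook minimum-energy result can be checked on a scalar system. This is a sketch assuming dx/dt = a*x + u with a != 0, a hypothetical initial state x0, and a completion time T. The worst-case energy and the additive and multiplicative metrics from the abstract are not reproduced here.

```python
import math

def gramian(a, T):
    # Controllability Gramian of dx/dt = a*x + u over [0, T], for a != 0.
    return (math.exp(2.0 * a * T) - 1.0) / (2.0 * a)

def min_energy_closed_form(a, x0, T):
    # Minimum of the integral of u(t)^2 that drives x0 to the origin at time T.
    return (math.exp(a * T) * x0) ** 2 / gramian(a, T)

def simulate_min_energy_control(a, x0, T, steps=20000):
    """Apply the minimum-energy input with forward Euler; return x(T) and energy."""
    dt = T / steps
    w = gramian(a, T)
    x, energy = x0, 0.0
    for k in range(steps):
        t = k * dt
        # Open-loop minimum-energy input for the scalar system.
        u = -math.exp(a * (T - t)) * math.exp(a * T) * x0 / w
        energy += u * u * dt
        x += dt * (a * x + u)
    return x, energy

closed_form = min_energy_closed_form(1.0, 1.0, 1.0)
x_end, simulated = simulate_min_energy_control(1.0, 1.0, 1.0)
```

The simulated energy should match the closed form up to discretization error, and the terminal state should be close to zero. Shortening T or increasing |x0| raises the energy, which matches the abstract's notion of task hardness.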
Emerging clean technologies: policy-driven cost reductions, implications and perspectives
Hydrogen production from water electrolysis, direct air capture (DAC), and synthetic kerosene derived from hydrogen and CO2 ('e-kerosene') are expected to play an important role in global decarbonization efforts. So far, the economics of these nascent technologies hamper their market diffusion. However, a wave of recent policy support in the United States, Europe, China, and elsewhere is anticipated to drive their commercial liftoff and bring their costs down. To this end, we evaluate the potential cost reductions driven by policy-induced scale-up of these emerging technologies through 2030 using an experience curves approach accounting for both local and global learning effects. We then analyze the consequences of projected cost declines on the competitiveness of these nascent technologies compared to conventional fossil alternatives, where applicable, and highlight some of the tradeoffs associated with their expansion. Our findings indicate that enacted policies could lead to substantial capital cost reductions for electrolyzers. Nevertheless, electrolytic hydrogen production at $1-2/kg would still require some form of policy support. Given expected costs and experience curves, it is unlikely that liquid solvent DAC (L-DAC) scale-up will bring removal costs to stated targets of $100/tCO2, though $200/tCO2 may eventually be within reach. We also underscore the importance of tackling methane leakage for natural gas-powered L-DAC: unmitigated leaks amplify net removal costs, exacerbate the investment requirements to reach targeted costs, and cast doubt on L-DAC's role in the clean energy transition. Lastly, despite reductions in electrolysis and L-DAC costs, e-kerosene remains considerably more expensive than fossil jet fuel. The economics of e-kerosene and the resources required for production raise questions about the fuel's ultimate viability as a decarbonization tool for aviation.
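The experience-curve idea can be illustrated with the standard single-factor form, in which cost falls by a fixed fraction (the learning rate) for every doubling of cumulative capacity. This is a generic sketch: the function name and all numbers are hypothetical, and it does not separate local from global learning as the study does.

```python
import math

def experience_curve_cost(initial_cost, initial_capacity, capacity, learning_rate):
    """Unit cost after cumulative capacity grows from initial_capacity to capacity."""
    # Learning exponent b such that one doubling multiplies cost by (1 - learning_rate).
    b = -math.log2(1.0 - learning_rate)
    return initial_cost * (capacity / initial_capacity) ** (-b)

# Hypothetical technology: 1000 cost units at 1 capacity unit, 20% learning rate.
cost_after_one_doubling = experience_curve_cost(1000.0, 1.0, 2.0, 0.2)
cost_after_two_doublings = experience_curve_cost(1000.0, 1.0, 4.0, 0.2)
```

With a 20% learning rate, the hypothetical cost drops to 80% of its starting value after one doubling and to 64% after two, so the projected cost in a target year depends mainly on how many doublings policy-driven deployment delivers.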
Towards reliable real-time trajectory optimization
Motion planning is a key aspect of robotics. A common approach to address motion planning problems is trajectory optimization. Trajectory optimization can represent the high-level behaviors of robots through mathematical formulations. However, current trajectory optimization approaches have two main challenges. Firstly, their solution heavily depends on the initial guess, and they are prone to getting stuck in local minima. Secondly, they face scalability limitations as the number of constraints increases. This thesis endeavors to tackle these challenges by introducing four innovative trajectory optimization algorithms to improve reliability, scalability, and computational efficiency. There are two novel aspects of the proposed algorithms. The first key innovation is remodeling the kinematic constraints and collision avoidance constraints. Another key innovation lies in the design of algorithms that effectively utilize parallel computation on GPU accelerators. By using reformulated constraints and leveraging the computational power of GPUs, the proposed algorithms of this thesis demonstrate significant improvements in efficiency and scalability compared to the existing methods. Parallelization enables faster computation times, allowing for real-time decision-making in dynamic environments. Moreover, the algorithms are designed to adapt to changes in the environment, ensuring robust performance. Extensive benchmarking for each proposed optimizer validates their efficacy. Overall, this thesis makes a significant contribution to the field of trajectory optimization algorithms. It introduces innovative solutions that specifically address the challenges faced by existing methods. The proposed algorithms pave the way for more efficient and robust motion planning solutions in robotics by leveraging parallel computation and specific mathematical structures.
comment: PhD Thesis, University of Tartu, 2024. The thesis was defended on 21st of June. https://dspace.ut.ee/items/a65d36c9-afe7-44ab-b544-20236177ed79
Fast Grid Emissions Sensitivities using Parallel Decentralized Implicit Differentiation
Marginal emissions rates -- the sensitivity of carbon emissions to electricity demand -- are important for evaluating the impact of emissions mitigation measures. Like locational marginal prices, locational marginal emissions rates (LMEs) can vary geographically, even between nearby locations, and may be coupled across time periods because of, for example, storage and ramping constraints. This temporal coupling makes computing LMEs computationally expensive for large electricity networks with high storage and renewable penetrations. Recent work demonstrates that decentralized algorithms can mitigate this problem by decoupling timesteps during differentiation. Unfortunately, we show these potential speedups are negated by the sparse structure inherent in power systems problems. We address these limitations by introducing a parallel, reverse-mode decentralized differentiation scheme that never explicitly instantiates the solution map Jacobian. We show both theoretically and empirically that parallelization is necessary to achieve non-trivial speedups when computing grid emissions sensitivities. Numerical results on a 500-node system indicate that our method can achieve greater than 10x speedups over centralized and serial decentralized approaches.
Fault Tolerant Dynamic Task Assignment for UAV-based Search Teams
This research offers a novel framework for dynamic task assignment for unmanned aerial vehicles (UAVs) in cooperative search settings. Notably, it incorporates post-fault UAV capabilities into job assignment techniques, assuring operational dependability in the event of sensor and actuator failures. A significant innovation is the utilization of UAV battery charge to assess range relative to search objectives, hence improving job distribution while conserving battery life. This model integrates repair, recharge, and stochastic goal recurrence, hence increasing its real-world applicability. Using stochastic dynamic programming, this method makes it simpler to determine optimal assignment policies offline so they may be implemented rapidly online. This paper emphasizes the holistic aspect of the proposed model, which connects high-level task rules to low-level control capabilities. A simulation-based case study proves its usefulness, highlighting its robustness in fault-prone and battery-variable settings. Overall, this paper proposes and demonstrates a comprehensive method for assigning UAV tasks that integrates defect awareness, battery management, and multilayer control through the use of stochastic dynamic programming.
Semi-on-Demand Off-Peak Transit Services with Shared Autonomous Vehicles -- Service Planning, Simulation, and Analysis in Munich, Germany
This study investigates the implementation of semi-on-demand (SoD) hybrid-route services using Shared Autonomous Vehicles (SAVs) on existing transit lines. SoD services combine the cost efficiency of fixed-route buses with the flexibility of on-demand services. SAVs first serve all scheduled fixed-route stops, then drop off and pick up passengers in the pre-determined flexible-route portion, and return to the fixed route. This study addresses four key questions: optimal fleet and vehicle sizes for peak-hour fixed-route services with SAVs and during transition (from drivers to autonomous vehicles), optimal off-peak SoD service planning, and suitable use cases. The methodology combines analytical modeling for service planning with agent-based simulation for operational analysis. We examine ten bus routes in Munich, Germany, considering full SAV and transition scenarios with varying proportions of drivers. Our findings demonstrate that the lower operating costs of SAVs improve service quality through increased frequency and smaller vehicles, even in transition scenarios. The reduced headway lowers waiting time and also favors more flexible-route operation in SoD services. The optimal SoD settings range from fully flexible to hybrid routes, where higher occupancy from the terminus favors shorter flexible routes. During the transition phase, limited fleet size and higher headways constrain the benefits of flexible-route operations. The simulation results corroborate the SoD benefits of door-to-door convenience, attracting more passengers without excessive detours and operator costs at moderate flexible-route lengths, and validate the analytical model.
comment: 25 pages, 10 figures
High-Sensitivity and Compact Time-domain Soil Moisture Sensor Using Dispersive Phase Shifter for Complex Permittivity Measurement
This paper presents a Time-Domain Transmissometry Soil Moisture Sensor (TDT-SMS) using a Dispersive Phase Shifter (DPS), consisting of an interdigital capacitor that is loaded with a stacked 4-turn Complementary Spiral Resonator (S4-CSR). The soil moisture measurement technique of the proposed sensor is based on the complex permittivity sensing property of a DPS in the time domain. The relative permittivity of the soil, which varies with its moisture content, is measured by burying the DPS under a soil mass, which changes its phase difference, while exciting it with a 114 MHz single-tone sine wave. The DPS output phase and magnitude are compared with the reference signal and measured with a phase/loss detector. The proposed sensor exhibits accuracy better than ±1.2 percent at the highest Volumetric Water Content (VWC = 30 percent) for sandy-type soil. A precise design guide is developed, and simulations are performed to achieve a highly sensitive sensor. The measurement results validate the accuracy of the theoretical analysis and design procedure. Owing to its low profile, low power consumption, and high sensitivity, the proposed TDT-SMS is a good candidate for precision farming and IoT systems.
Low Profile Metamaterial Band-Pass Filter Loaded with 4-Turn Complementary Spiral Resonator for WPT Applications
In this paper, a very compact, low-insertion-loss metamaterial band-pass filter (MBPF) with a center frequency of f0 = 730 MHz is proposed, based on rectangular 4-turn complementary spiral resonators (4 CSR). The proposed MBPF includes an interdigital capacitor as a series capacitance in the top layer, which improves the stopband performance for the 700 to 760 MHz passband and makes the filter suitable for wireless power transfer (WPT) systems by rejecting unwanted signals. To validate the proposed technique, the MBPF is fabricated on an RO-4003 substrate, and good agreement is achieved between simulated and measured results. Stopband attenuations of greater than 52 dB and 20 dB are obtained around 0.8×fcl (lower cutoff frequency) and 1.2×fcu (upper cutoff frequency), respectively.
How many autonomous vehicles are required to stabilize traffic flow?
Collective behavior of human-driven vehicles (HVs) results in the well-known stop-and-go waves, potentially leading to higher fuel consumption and emissions. This letter investigates the stabilization of traffic flow via a minimum number of autonomous vehicles (AVs) subject to constraints on the control parameters. The unconstrained scenario has been well studied in recent work. The main motivation to investigate the constrained scenario is that, in reality, lower and upper bounds exist on the control parameters. For the constrained scenario, we optimally find the minimum number of required AVs (via computing the optimal lower bound on the AV penetration rate) to stabilize traffic flow for a given number of HVs. As an immediate consequence, we conclude that for a given number of AVs, the number of HVs in the stabilized traffic flow cannot be arbitrarily large in the constrained scenario, unlike the unconstrained scenario studied in the literature. Using nonlinear optimization techniques, we propose a systematic procedure to compute the optimal lower bound on the AV penetration rate. Finally, we validate the theoretical results via numerical simulations. Numerical simulations suggest that by enlarging the constraint intervals, a smaller optimal lower bound on the AV penetration rate is attainable. However, this leads to a slower transient response due to a dominant pole closer to the origin.
Newton-Raphson Flow for Aggressive Quadrotor Tracking Control
We apply the Newton-Raphson flow tracking controller to aggressive quadrotor flight and demonstrate that it achieves good tracking performance over a suite of benchmark trajectories, outperforming the native trajectory tracking controller in the popular PX4 Autopilot. The Newton-Raphson flow tracking controller is a recently proposed integrator-type controller that aims to drive to zero the error between a future predicted system output and the reference trajectory. This controller is computationally lightweight, requiring only an imprecise predictor, and achieves guaranteed asymptotic error bounds under certain conditions. We show that these theoretical advantages are realizable on a quadrotor hardware platform. Our experiments are conducted on a Holybro X500 V2 quadrotor using a Pixhawk 6X flight controller and a Raspberry Pi 4 companion computer, which receives location information from an OptiTrack motion capture system and sends input commands through the ROS2 API for the PX4 software stack.
comment: Expanded version of our submission to the American Control Conference 2024
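As a rough illustration of the integrator-type idea described in the abstract above, the following toy sketch drives a predicted future output toward a reference. Everything concrete here is an assumption made for illustration: the scalar plant dx/dt = -x + u, the constant-input predictor, the update law du/dt = alpha * (dg/du)^(-1) * (r - g(x, u)), and the names `simulate_nr_flow`, `alpha`, and `horizon`. It is not the authors' quadrotor implementation.

```python
import math


def simulate_nr_flow(alpha=5.0, horizon=0.5, dt=1e-3, t_end=10.0, ref=1.0):
    """Toy tracking loop for the assumed plant dx/dt = -x + u with output y = x."""
    x, u = 0.0, 0.0
    decay = math.exp(-horizon)
    # Sensitivity of the predicted output to the input (constant for this plant).
    dg_du = 1.0 - decay
    for _ in range(int(t_end / dt)):
        # Predicted output at time t + horizon if u were held constant.
        predicted = decay * x + dg_du * u
        # Integrator-type update: push the input so the prediction meets the reference.
        u_dot = alpha * (ref - predicted) / dg_du
        x_dot = -x + u
        # Forward-Euler integration of plant and controller.
        x += dt * x_dot
        u += dt * u_dot
    return x, u


if __name__ == "__main__":
    x_final, u_final = simulate_nr_flow()
    print(x_final, u_final)
```

For this linear toy plant the closed loop is stable for any positive `alpha`, so both the state and the input settle at the reference value. A real vehicle would need a nonlinear predictor and a Jacobian of matching dimension.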
Timer-Based Coverage Control for Mobile Sensors
This work investigates the coverage control problem over a static, bounded, and convex workspace and develops a hybrid extension of the continuous-time Lloyd algorithm. Each agent in a multi-agent system (MAS) is equipped with a timer mechanism that generates intermittent measurement and control update events, which may occur asynchronously between agents. Between consecutive event times, as determined by the corresponding timer mechanism, the controller of each agent is held constant. These controllers are shown to drive the configuration of the MAS into a neighborhood of the set of centroidal Voronoi configurations, i.e., a local minimizer of the standard locational cost. The combination of continuous-time dynamics with intermittently updated control inputs is modeled as a hybrid system. The coverage objective is posed as a set attractivity problem for hybrid systems, where an invariance-based convergence analysis yields sufficient conditions that ensure maximal solutions of the hybrid system asymptotically converge to a desired set. A brief simulation example is included to showcase the result.
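The hold-between-events idea in the abstract above can be sketched on a one-dimensional workspace. This is a simplified illustration under stated assumptions: the workspace is the interval [0, 1] with uniform density, agents are single integrators, and all agents share one fixed hold period, whereas the abstract describes asynchronous timers. The function names `centroids_1d` and `held_lloyd` are hypothetical.

```python
def centroids_1d(positions):
    """Centroids of the 1D Voronoi cells on [0, 1] under uniform density."""
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    cents = [0.0] * len(positions)
    for rank, i in enumerate(order):
        # Cell boundaries are midpoints to the neighbours, or the workspace edges.
        left = 0.0 if rank == 0 else 0.5 * (positions[order[rank - 1]] + positions[i])
        right = 1.0 if rank == len(order) - 1 else 0.5 * (positions[i] + positions[order[rank + 1]])
        cents[i] = 0.5 * (left + right)
    return cents


def held_lloyd(positions, gain=1.0, hold=0.2, events=500):
    """Lloyd-type control that is recomputed at events and held constant in between."""
    p = list(positions)
    for _ in range(events):
        c = centroids_1d(p)
        # Control computed at the event time, then held for `hold` time units.
        u = [gain * (ci - pi) for ci, pi in zip(c, p)]
        # Exact single-integrator motion under a constant input.
        p = [pi + hold * ui for pi, ui in zip(p, u)]
    return p


if __name__ == "__main__":
    print(held_lloyd([0.1, 0.2, 0.9]))
```

With three agents the positions approach 1/6, 1/2, and 5/6, which is the centroidal Voronoi configuration of the unit interval under uniform density.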
An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems
In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.
comment: Proceedings of the 6th Annual Learning for Dynamics & Control Conference, PMLR 242:312-323, 2024
Continuous Approximations of Projected Dynamical Systems via Control Barrier Functions
Projected Dynamical Systems (PDSs) form a class of discontinuous constrained dynamical systems, and have been used widely to solve optimization problems and variational inequalities. Recently, they have also gained significant attention for control purposes, such as high-performance integrators, saturated control and feedback optimization. In this work, we establish that locally Lipschitz continuous dynamics, involving Control Barrier Functions (CBFs), namely CBF-based dynamics, approximate PDSs. Specifically, we prove that trajectories of CBF-based dynamics uniformly converge to trajectories of PDSs, as a CBF-parameter approaches infinity. Towards this, we also prove that CBF-based dynamics are perturbations of PDSs, with quantitative bounds on the perturbation. Our results pave the way to implement discontinuous PDS-based controllers in a continuous fashion, employing CBFs. We demonstrate this on numerical examples on feedback optimization and synchronverter control. Moreover, our results can be employed to numerically simulate PDSs, overcoming disadvantages of existing discretization schemes, such as computing projections to possibly non-convex sets. Finally, this bridge between CBFs and PDSs may yield other potential benefits, including novel insights on stability.
comment: Accepted to IEEE Transactions on Automatic Control (IEEE TAC). Compared to the accepted version, this version contains an additional numerical example on feedback optimization
Optimal Dynamic Ancillary Services Provision Based on Local Power Grid Perception
In this paper, we propose a systematic closed-loop approach to provide optimal dynamic ancillary services with converter-interfaced generation systems based on local power grid perception. In particular, we structurally encode dynamic ancillary services such as fast frequency and voltage regulation in the form of a parametric transfer function matrix, which includes several parameters to define a set of different feasible response behaviors, among which we aim to find the optimal one to be realized by the converter system. Our approach is based on a so-called "perceive-and-optimize" (P&O) strategy: First, we identify a grid dynamic equivalent at the interconnection terminals of the converter system. Second, we consider the closed-loop interconnection of the identified grid equivalent and the parametric transfer function matrix, which we optimize for the set of transfer function parameters, resulting in a stable and optimal closed-loop performance for ancillary services provision. In the process, we ensure that grid-code and device-level requirements are satisfied. Finally, we demonstrate the effectiveness of our approach in different numerical case studies based on a modified Kundur two-area test system.
comment: 15 pages, 20 Figures
Control Barrier Function Based Design of Gradient Flows for Constrained Nonlinear Programming
This paper considers the problem of designing a continuous-time dynamical system that solves a constrained nonlinear optimization problem and makes the feasible set forward invariant and asymptotically stable. The invariance of the feasible set makes the dynamics anytime, when viewed as an algorithm, meaning it returns a feasible solution regardless of when it is terminated. Our approach augments the gradient flow of the objective function with inputs defined by the constraint functions, treats the feasible set as a safe set, and synthesizes a safe feedback controller using techniques from the theory of control barrier functions. The resulting closed-loop system, termed safe gradient flow, can be viewed as a primal-dual flow, where the state corresponds to the primal variables and the inputs correspond to the dual ones. We provide a detailed suite of conditions based on constraint qualification under which (both isolated and nonisolated) local minimizers are stable with respect to the feasible set and the whole state space. Comparisons with other continuous-time methods for optimization in a simple example illustrate the advantages of the safe gradient flow.
comment: Full version, with appendix, of work appearing in IEEE Transactions on Automatic Control
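A minimal sketch of a CBF-filtered gradient flow for a single inequality constraint is given below. The abstract above does not state the formula, so the construction is an assumption: the velocity is taken as the point closest to the negative gradient of f that satisfies grad_g . xi <= -alpha * g, which has a closed form when there is one constraint. The test problem (a quadratic objective over the unit disk) and the names `safe_gradient_velocity` and `integrate` are hypothetical.

```python
import math


def safe_gradient_velocity(x, alpha):
    """Velocity for: minimize (x0-2)^2 + (x1-2)^2 subject to x0^2 + x1^2 - 1 <= 0."""
    grad_f = [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 2.0)]
    g = x[0] ** 2 + x[1] ** 2 - 1.0
    grad_g = [2.0 * x[0], 2.0 * x[1]]
    gg = grad_g[0] ** 2 + grad_g[1] ** 2
    gf = grad_g[0] * grad_f[0] + grad_g[1] * grad_f[1]
    # Multiplier of the one-constraint QP; it plays the role of the dual variable.
    u = max(0.0, (alpha * g - gf) / gg) if gg > 1e-12 else 0.0
    xi = [-grad_f[i] - u * grad_g[i] for i in range(2)]
    return xi, g


def integrate(x0, alpha=10.0, dt=1e-3, steps=5000):
    """Forward-Euler integration; also records the largest constraint value seen."""
    x = list(x0)
    worst_g = -math.inf
    for _ in range(steps):
        xi, g = safe_gradient_velocity(x, alpha)
        worst_g = max(worst_g, g)
        x = [x[i] + dt * xi[i] for i in range(2)]
    return x, worst_g


if __name__ == "__main__":
    print(integrate([0.0, -0.5]))
```

In continuous time the filter keeps g nonpositive along the trajectory. The Euler discretization used here can overshoot the boundary by a small amount that shrinks with the step size, which is why `worst_g` is tracked.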
System Identification for Lithium-Ion Batteries with Nonlinear Coupled Electro-Thermal Dynamics via Bayesian Optimization
Essential to various practical applications of lithium-ion batteries is the availability of accurate equivalent circuit models. This paper presents a new coupled electro-thermal model for batteries and studies how to extract it from data. We consider the problem of maximum likelihood parameter estimation, which, however, is nontrivial to solve as the model is nonlinear in both its dynamics and measurement. We propose to leverage the Bayesian optimization approach, owing to its machine learning-driven capability in handling complex optimization problems and searching for global optima. To enhance the parameter search efficiency, we dynamically narrow and refine the search space in Bayesian optimization. The proposed system identification approach can efficiently determine the parameters of the coupled electro-thermal model. It is amenable to practical implementation, with few requirements on the experiment, data types, and optimization setups, and well applicable to many other battery models.
comment: 2024 American Control Conference (ACC)
Reducing transmission expansion by co-optimizing sizing of wind, solar, storage and grid connection capacity
Expanding transmission capacity is likely a bottleneck that will restrict variable renewable energy (VRE) deployment required to achieve ambitious emission reduction goals. Grid interconnection and inter-regional transmission capacity may be reduced by the optimal sizing of VREs to grid connection or co-location of VRE and battery resources behind the grid interconnection, but neither of these capabilities is commonly captured in macro-energy system models. We thus develop these two new functionalities to explore the substitutability of storage for transmission and VRE resource trade-offs through 2030 in the Western Interconnection of the United States. Our findings indicate that not modeling co-location fails to capture the full substitutability of storage and solar photovoltaic (PV) resources for transmission: co-location can reduce long-distance inter-regional transmission expansion by 12-31% and decrease grid connection capacity and shorter-distance transmission interconnection by 20-25%. We also demonstrate that not modeling co-located storage does not accurately reflect competition between wind and solar PV resources and underestimates the value of energy storage: co-location of VREs and storage favors solar PV (4-5% increase) and lithium-ion battery deployment (1.7-6 times increase), while decreasing wind buildout (0.9-1.6% decline).
comment: The authors have decided to withdraw this version since they have found multiple inaccuracies in the results with the new model version, updated data, and reformulated cost analysis
ComTraQ-MPC: Meta-Trained DQN-MPC Integration for Trajectory Tracking with Limited Active Localization Updates
Optimal decision-making for trajectory tracking in partially observable, stochastic environments where the number of active localization updates -- the process by which the agent obtains its true state information from the sensors -- is limited presents a significant challenge. Traditional methods often struggle to balance resource conservation, accurate state estimation and precise tracking, resulting in suboptimal performance. This problem is particularly pronounced in environments with large action spaces, where the need for frequent, accurate state data is paramount, yet the capacity for active localization updates is restricted by external limitations. This paper introduces ComTraQ-MPC, a novel framework that combines Deep Q-Networks (DQN) and Model Predictive Control (MPC) to optimize trajectory tracking with constrained active localization updates. The meta-trained DQN ensures adaptive active localization scheduling, while the MPC leverages available state information to improve tracking. The central contribution of this work is their reciprocal interaction: DQN's update decisions inform MPC's control strategy, and MPC's outcomes refine DQN's learning, creating a cohesive, adaptive system. Empirical evaluations in simulated and real-world settings demonstrate that ComTraQ-MPC significantly enhances operational efficiency and accuracy, providing a generalizable and approximately optimal solution for trajectory tracking in complex partially observable environments.
comment: * Equal contribution
Proximal observers for secure state estimation
This paper discusses a general framework for designing robust state estimators for a class of discrete-time nonlinear systems. We consider systems that may be impacted by impulsive (sparse but otherwise arbitrary) measurement noise sequences. We show that a family of state estimators, robust to this type of undesired signal, can be obtained by minimizing a class of nonsmooth convex functions at each time step. The resulting state observers are defined through proximal operators. We obtain a nonlinear implicit dynamical system in terms of the estimation error and prove, in the noise-free setting, that the error vanishes asymptotically when the minimized loss function and the to-be-observed system enjoy appropriate properties. From a computational perspective, even though the proposed observers can be implemented via efficient numerical procedures, they do not admit closed-form expressions. The paper argues that by adopting appropriate relaxations, simple and fast analytic expressions can be derived.
comment: 17 pages, 6 figures
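For readers unfamiliar with proximal operators, the sketch below shows the standard closed-form proximal operator of the scaled absolute value, known as soft-thresholding. It is a textbook example chosen for illustration. The abstract above does not say which nonsmooth losses the observers use, so the choice of this loss and the names `soft_threshold` and `prox_l1` are assumptions.

```python
def soft_threshold(v, lam):
    """Proximal operator of lam * |.| at the scalar v."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def prox_l1(vector, lam):
    """Proximal operator of lam * ||.||_1, applied entry by entry."""
    return [soft_threshold(v, lam) for v in vector]


if __name__ == "__main__":
    # Entries with magnitude at most lam are set to zero; larger ones shrink by lam.
    print(prox_l1([3.0, -0.5, 0.2, -4.0], 1.0))
```

Soft-thresholding zeroes small entries and shrinks large ones by a fixed amount, which is one reason absolute-value-type losses are a common choice when corruption is sparse.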
Efficient Reachable Sets on Lie Groups Using Lie Algebra Monotonicity and Tangent Intervals
In this paper, we efficiently compute overapproximating reachable sets for control systems evolving on Lie groups, building off results from monotone systems theory and geometric integration theory. We consider intervals in the tangent space, which describe real sets on the Lie group through the exponential map. A local equivalence between the original system and a system evolving on the Lie algebra allows existing interval reachability techniques to apply in the tangent space. Using interval bounds of the Baker-Campbell-Hausdorff formula, these reachable set estimates are extended to arbitrary time horizons in an efficient Runge-Kutta-Munthe-Kaas integration algorithm. The algorithm is demonstrated through consensus on a torus and attitude control on $SO(3)$.
Strong exponential stability of switched impulsive systems with mode-constrained switching
Strong stability, defined by bounds that decay not only over time but also with the number of impulses, has been established as a requirement to ensure robustness properties for impulsive systems with respect to inputs or disturbances. Most existing results, however, only consider weak stability, where the bounds only decay with time. In this paper, we provide a method for calculating the maximum overshoot and the decay rate for strong global uniform exponential stability bounds for nonlinear switched impulsive systems. We consider the scenario of mode-constrained switching where not all transitions between subsystems are allowed, and where subsystems may exhibit unstable dynamics in the flow and/or jump maps. Based on direct and reverse mode-dependent average dwell-time and activation-time constraints, we derive stability bounds that can be improved by considering longer switching sequences for computation. We provide an example that shows how the results can be employed to ensure the stability robustness of nonlinear systems that admit a global state weak linearization.
comment: 21 pages, 4 figures
Robotics
A Biologically Inspired Design Principle for Building Robust Robotic Systems
Robustness, the ability of a system to maintain performance under significant and unanticipated environmental changes, is a critical property for robotic systems. While biological systems naturally exhibit robustness, there is no comprehensive understanding of how to achieve similar robustness in robotic systems. In this work, we draw inspiration from biological systems and propose a design principle that advocates active interconnections among system components to enhance robustness to environmental variations. We evaluate this design principle in a challenging long-horizon manipulation task: solving lockboxes. Our extensive simulated and real-world experiments demonstrate that we could enhance robustness against environmental changes by establishing active interconnections among system components without substantial changes in individual components. Our findings suggest that a systematic investigation of design principles in system building is necessary. They also call for interdisciplinary collaborations to explore and evaluate additional principles of biological robustness to advance the development of intelligent and adaptable robotic systems.
Perfectly Undetectable Reflection and Scaling False Data Injection Attacks via Affine Transformation on Mobile Robot Trajectory Tracking Control
With the increasing integration of cyber-physical systems (CPS) into critical applications, ensuring their resilience against cyberattacks is paramount. A particularly concerning threat is the vulnerability of CPS to deceptive attacks that degrade system performance while remaining undetected. This paper investigates perfectly undetectable false data injection attacks (FDIAs) targeting the trajectory tracking control of a non-holonomic mobile robot. The proposed attack method utilizes affine transformations of intercepted signals, exploiting weaknesses inherent in the partially linear dynamic properties and symmetry of the nonlinear plant. The feasibility and potential impact of these attacks are validated through experiments using a Turtlebot 3 platform, highlighting the urgent need for sophisticated detection mechanisms and resilient control strategies to safeguard CPS against such threats. Furthermore, a novel approach for detection of these attacks called the state monitoring signature function (SMSF) is introduced. An example SMSF, a carefully designed function resilient to FDIAs, is shown to be able to detect the presence of an FDIA through signatures based on system states.
comment: 15 pages, 17 figures. Manuscript under review for publication
Don't Get Stuck: A Deadlock Recovery Approach SC
When multiple agents share space, interactions can lead to deadlocks, where no agent can advance towards its goal. This paper addresses this challenge with a deadlock recovery strategy. In particular, the proposed algorithm integrates hybrid-A$^\star$, STL, and MPPI frameworks. Specifically, hybrid-A$^\star$ generates a reference path, STL defines a goal (deadlock avoidance) and associated constraints (w.r.t. traffic rules), and MPPI refines the path and speed accordingly. This STL-MPPI framework ensures system compliance to specifications and dynamics while ensuring the safety of the resulting maneuvers, indicating a strong potential for application to complex traffic scenarios (and rules) in practice. Validation studies are conducted in simulations and on scaled cars, respectively, to demonstrate the effectiveness of the proposed algorithm.
comment: Presented at the 27th IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024, Edmonton, Alberta, Canada
Towards UAV-USV Collaboration in Harsh Maritime Conditions Including Large Waves
This paper introduces a system designed for tight collaboration between Unmanned Aerial Vehicles (UAVs) and Unmanned Surface Vehicles (USVs) in harsh maritime conditions characterized by large waves. This onboard UAV system aims to enhance collaboration with USVs for following and landing tasks under such challenging conditions. The main contribution of our system is the novel mathematical USV model, describing the movement of the USV in 6 degrees of freedom on a wavy water surface, which is used to estimate and predict USV states. The estimator fuses data from multiple global and onboard sensors, ensuring accurate USV state estimation. The predictor computes future USV states using the novel mathematical USV model and the last estimated states. The estimated and predicted USV states are forwarded into a trajectory planner that generates a UAV trajectory for following the USV or landing on its deck, even in harsh environmental conditions. The proposed approach was verified in numerous simulations and deployed to the real world, where the UAV was able to follow the USV and land on its deck repeatedly.
Physics-Aware Combinatorial Assembly Planning using Deep Reinforcement Learning
Combinatorial assembly uses standardized unit primitives to build objects that satisfy user specifications. Lego is a widely used platform for combinatorial assembly, in which people use unit primitives (i.e., Lego bricks) to build highly customizable 3D objects. This paper studies sequence planning for physical combinatorial assembly using Lego. Given the shape of the desired object, we want to find a sequence of actions for placing Lego bricks to build the target object. In particular, we aim to ensure the planned assembly sequence is physically executable. However, assembly sequence planning (ASP) for combinatorial assembly is particularly challenging due to its combinatorial nature, i.e., the vast number of possible combinations and complex constraints. To address the challenges, we employ deep reinforcement learning to learn a construction policy for placing unit primitives sequentially to build the desired object. Specifically, we design an online physics-aware action mask that efficiently filters out invalid actions and guides policy learning. In the end, we demonstrate that the proposed method successfully plans physically valid assembly sequences for constructing different Lego structures. The generated construction plan can be executed in the real world.
LoopSplat: Loop Closure by Registering 3D Gaussian Splats
Simultaneous Localization and Mapping (SLAM) based on 3D Gaussian Splats (3DGS) has recently shown promise towards more accurate, dense 3D scene maps. However, existing 3DGS-based methods fail to address the global consistency of the scene via loop closure and/or global bundle adjustment. To this end, we propose LoopSplat, which takes RGB-D images as input and performs dense mapping with 3DGS submaps and frame-to-model tracking. LoopSplat triggers loop closure online and computes relative loop edge constraints between submaps directly via 3DGS registration, leading to improvements in efficiency and accuracy over traditional global-to-local point cloud registration. It uses a robust pose graph optimization formulation and rigidly aligns the submaps to achieve global consistency. Evaluation on the synthetic Replica and real-world TUM-RGBD, ScanNet, and ScanNet++ datasets demonstrates competitive or superior tracking, mapping, and rendering compared to existing methods for dense RGB-D SLAM. Code is available at https://loopsplat.github.io/.
comment: Project page: https://loopsplat.github.io/
Source-Seeking Problem with Robot Swarms
We present an algorithm to solve the problem of locating the source, or maxima, of a scalar field using a robot swarm. We demonstrate how the robot swarm determines its direction of movement to approach the source using only field intensity measurements taken by each robot. In contrast with the current literature, our algorithm accommodates a generic (non-degenerate) geometry for the swarm's formation. Additionally, we rigorously show the effectiveness of the algorithm even when the dynamics of the robots are complex, such as a unicycle with constant speed. Not requiring a strict geometry for the swarm significantly enhances its resilience. For example, this allows the swarm to change its size and formation in the presence of obstacles or other real-world factors, including the loss or addition of individuals to the swarm on the fly. For clarity, the article begins by presenting the algorithm for robots with free dynamics. In the second part, we demonstrate the algorithm's effectiveness even considering non-holonomic dynamics for the robots, using the vector field guidance paradigm. Finally, we verify and validate our algorithm with various numerical simulations.
Learning Precise Affordances from Egocentric Videos for Robotic Manipulation
Affordance, defined as the potential actions that an object offers, is crucial for robotic manipulation tasks. A deep understanding of affordance can lead to more intelligent AI systems. For example, such knowledge directs an agent to grasp a knife by the handle for cutting and by the blade when passing it to someone. In this paper, we present a streamlined affordance learning system that encompasses data collection, effective model training, and robot deployment. First, we collect training data from egocentric videos in an automatic manner. Different from previous methods that focus only on the object graspable affordance and represent it as coarse heatmaps, we cover both graspable (e.g., object handles) and functional affordances (e.g., knife blades, hammer heads) and extract data with precise segmentation masks. We then propose an effective model, termed Geometry-guided Affordance Transformer (GKT), to train on the collected data. GKT integrates an innovative Depth Feature Injector (DFI) to incorporate 3D shape and geometric priors, enhancing the model's understanding of affordances. To enable affordance-oriented manipulation, we further introduce Aff-Grasp, a framework that combines GKT with a grasp generation model. For comprehensive evaluation, we create an affordance evaluation dataset with pixel-wise annotations, and design real-world tasks for robot experiments. The results show that GKT surpasses the state-of-the-art by 15.9% in mIoU, and Aff-Grasp achieves high success rates of 95.5% in affordance prediction and 77.1% in successful grasping among 179 trials, including evaluations with seen, unseen objects, and cluttered scenes.
comment: Project page: https://reagan1311.github.io/affgrasp
Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
Reinforcement Learning from Human Feedback (RLHF) is a powerful paradigm for aligning foundation models to human values and preferences. However, current RLHF techniques cannot account for the naturally occurring differences in individual human preferences across a diverse population. When these differences arise, traditional RLHF frameworks simply average over them, leading to inaccurate rewards and poor performance for individual subgroups. To address the need for pluralistic alignment, we develop a class of multimodal RLHF methods. Our proposed techniques are based on a latent variable formulation - inferring a novel user-specific latent and learning reward models and policies conditioned on this latent without additional user-specific data. While conceptually simple, we show that in practice, this reward modeling requires careful algorithmic considerations around model architecture and reward scaling. To empirically validate our proposed technique, we first show that it can provide a way to combat underspecification in simulated control problems, inferring and optimizing user-specific reward functions. Next, we conduct experiments on pluralistic language datasets representing diverse user preferences and demonstrate improved reward function accuracy. We additionally show the benefits of this probabilistic framework in terms of measuring uncertainty, and actively learning user preferences. This work enables learning from diverse populations of users with divergent preferences, an important challenge that naturally occurs in problems from robot learning to foundation model alignment.
comment: weirdlabuw.github.io/vpl
Understanding cyclists' perception of driverless vehicles through eye-tracking and interviews
As automated vehicles (AVs) become increasingly popular, the question arises as to how cyclists will interact with such vehicles. This study investigated (1) whether cyclists spontaneously notice if a vehicle is driverless, (2) how well they perform a driver-detection task when explicitly instructed, and (3) how they carry out such tasks. Using a Wizard-of-Oz method, 37 participants cycled a designated route and encountered an AV multiple times in two experimental sessions. In Session 1, participants cycled the route uninstructed, while in Session 2, they were instructed to verbally report whether they detected the presence or absence of a driver. Additionally, we recorded the participants' gaze behaviour with eye-tracking and their responses in post-session interviews. The interviews revealed that 30% of the cyclists spontaneously mentioned the absence of a driver (Session 1), and when instructed (Session 2), they detected the absence and presence of the driver with 93% accuracy. The eye-tracking data showed that cyclists looked more frequently and longer at the vehicle in Session 2 compared to Session 1. Furthermore, participants exhibited intermittent sampling of the vehicle, and they looked in front of the vehicle when it was far away and towards the windshield region when it was closer. The post-session interviews also indicated that participants were curious, felt safe, and reported a need to receive information about the AV's driving state. In conclusion, cyclists can detect the absence of a driver in the AV, and this detection may influence their perceptions of safety. Further research is needed to explore these findings in real-world traffic conditions.
Edge-Cloud Collaborative Motion Planning for Autonomous Driving with Large Language Models
Integrating large language models (LLMs) into autonomous driving enhances personalization and adaptability in open-world scenarios. However, traditional edge computing models still face significant challenges in processing complex driving data, particularly regarding real-time performance and system efficiency. To address these challenges, this study introduces EC-Drive, a novel edge-cloud collaborative autonomous driving system with data drift detection capabilities. EC-Drive utilizes drift detection algorithms to selectively upload critical data, including new obstacles and traffic pattern changes, to the cloud for processing by GPT-4, while routine data is efficiently managed by smaller LLMs on edge devices. This approach not only reduces inference latency but also improves system efficiency by optimizing communication resource use. Experimental validation confirms the system's robust processing capabilities and practical applicability in real-world driving conditions, demonstrating the effectiveness of this edge-cloud collaboration framework. Our data and system demonstration will be released at https://sites.google.com/view/ec-drive.
Human Mimetic Forearm Design with Radioulnar Joint using Miniature Bone-Muscle Modules and Its Applications IROS2017
The human forearm is composed of two long, thin bones called the radius and the ulna, and rotates using two axle joints. We aimed to develop a forearm based on the body proportion, weight ratio, muscle arrangement, and joint performance of the human body in order to bring out its benefits. For this, we need to miniaturize the muscle modules. To approach this task, we arranged two muscle motors inside one muscle module, and used the space effectively by utilizing common parts. In addition, we enabled the muscle module to also be used as the bone structure. Moreover, we used miniature motors and developed a way to dissipate the motor heat to the bone structure. Through these approaches, we succeeded in developing a forearm with a radioulnar joint based on the body proportion, weight ratio, muscle arrangement, and joint performance of the human body, while keeping maintainability and reliability. Also, we performed some motions such as soldering, opening a book, turning a screw, and badminton swinging using the benefits of the radioulnar structure, which have not been discussed before, and verified that Kengoro can realize skillful motions using the radioulnar joint like a human.
comment: Accepted at IROS2017
Automated Vehicle Driver Monitoring Dataset from Real-World Scenarios
From SAE Level 3 of automation onwards, drivers are allowed to engage in activities that are not directly related to driving during their travel. However, in level 3, a misunderstanding of the capabilities of the system might lead drivers to engage in secondary tasks, which could impair their ability to react to challenging traffic situations. Anticipating driver activity allows for early detection of risky behaviors, to prevent accidents. To be able to predict the driver activity, a Deep Learning network needs to be trained on a dataset. However, the use of datasets based on simulation for training and the migration to real-world data for prediction has proven to be suboptimal. Hence, this paper presents a real-world driver activity dataset, openly accessible on IEEE Dataport, which encompasses various activities that occur in autonomous driving scenarios under various illumination and weather conditions. Results from the training process showed that the dataset provides an excellent benchmark for implementing models for driver activity recognition.
comment: 6 pages
Integrating Naturalistic Insights in Objective Multi-Vehicle Safety Framework
As autonomous vehicle technology advances, the precise assessment of safety in complex traffic scenarios becomes crucial, especially in mixed-vehicle environments where human perception of safety must be taken into account. This paper presents a framework designed for assessing traffic safety in multi-vehicle situations, facilitating the simultaneous utilization of diverse objective safety metrics. Additionally, it allows the integration of subjective perception of safety by adjusting model parameters. The framework was applied to evaluate various model configurations in car-following scenarios on a highway, utilizing naturalistic driving datasets. The evaluation of the model showed an outstanding performance, particularly when integrating multiple objective safety measures. Furthermore, the performance was significantly enhanced when considering all surrounding vehicles.
Harnessing the Potential of Omnidirectional Multi-Rotor Aerial Vehicles in Cooperative Jamming Against Eavesdropping
Recent research in communications-aware robotics has been propelled by advancements in 5G and emerging 6G technologies. This field now includes the integration of Multi-Rotor Aerial Vehicles (MRAVs) into cellular networks, with a specific focus on under-actuated MRAVs. These vehicles face challenges in independently controlling position and orientation due to their limited control inputs, which adversely affects communication metrics such as Signal-to-Noise Ratio. In response, a newer class of omnidirectional MRAVs has been developed, which can control both position and orientation simultaneously by tilting their propellers. However, exploiting this capability fully requires sophisticated motion planning techniques. This paper presents a novel application of omnidirectional MRAVs designed to enhance communication security and thwart eavesdropping. It proposes a strategy where one MRAV functions as an aerial Base Station, while another acts as a friendly jammer to secure communications. This study is the first to apply such a strategy to MRAVs in scenarios involving eavesdroppers.
comment: 6 pages, 4 figures, Accepted for presentation to the 2024 IEEE Global Communications Conference (IEEE GLOBECOM), Cape Town, South Africa. Copyright may be transferred without notice, after which this version may no longer be accessible
Quantitative 3D Map Accuracy Evaluation Hardware and Algorithm for LiDAR(-Inertial) SLAM
Accuracy evaluation of a 3D pointcloud map is crucial for the development of autonomous driving systems. In this work, we propose a user-independent software/hardware system that can quantitatively evaluate the accuracy of a 3D pointcloud map acquired from LiDAR(-Inertial) SLAM. We introduce a LiDAR target that functions robustly in the outdoor environment, while remaining observable by LiDAR. We also propose a software algorithm that automatically extracts representative points and calculates the accuracy of the 3D pointcloud map by leveraging GPS position data. This methodology overcomes the main limitation of manual point selection, whose results vary between users. Furthermore, two different error metrics, relative and absolute errors, are introduced to analyze the accuracy from different perspectives. Our implementations are available at: https://github.com/SangwooJung98/3D_Map_Evaluation
comment: ICCAS 2024 accepted, 5 pages, 6 figures, 2 Tables
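As a rough illustration of the two error types named in the abstract above, the sketch below compares extracted map points against GPS reference positions. The function names and the exact definitions (per-point Euclidean distance for the absolute error, difference in pairwise distances for the relative error) are assumptions made for illustration; the paper's own definitions may differ.

```python
import math
from itertools import combinations


def absolute_errors(map_pts, gps_pts):
    """Per-point Euclidean distance between a map point and its GPS reference.

    Assumes both lists are in the same coordinate frame and the same order.
    """
    return [math.dist(m, g) for m, g in zip(map_pts, gps_pts)]


def relative_errors(map_pts, gps_pts):
    """Difference between pairwise distances in the map and in the GPS data.

    This quantity is unaffected by a rigid offset between the two frames.
    """
    errors = []
    for i, j in combinations(range(len(map_pts)), 2):
        d_map = math.dist(map_pts[i], map_pts[j])
        d_gps = math.dist(gps_pts[i], gps_pts[j])
        errors.append(abs(d_map - d_gps))
    return errors
```

Under these assumed definitions, a map that is shifted as a whole shows a non-zero absolute error but a zero relative error, which is one reason to report both.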
An Efficient Deep Reinforcement Learning Model for Online 3D Bin Packing Combining Object Rearrangement and Stable Placement
This paper presents an efficient deep reinforcement learning (DRL) framework for online 3D bin packing (3D-BPP). The 3D-BPP is an NP-hard problem significant in logistics, warehousing, and transportation, involving the optimal arrangement of objects inside a bin. Traditional heuristic algorithms often fail to address dynamic and physical constraints in real-time scenarios. We introduce a novel DRL framework that integrates a reliable physics heuristic algorithm with object rearrangement and stable placement. Our experiments show that the proposed framework achieves higher space utilization rates, effectively minimizing the amount of wasted space, with fewer training epochs.
Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey
Reinforcement Learning (RL) is a potent tool for sequential decision-making and has achieved performance surpassing human capabilities across many challenging real-world tasks. As the extension of RL to the multi-agent system domain, multi-agent RL (MARL) not only needs to learn the control policy but also requires consideration of interactions with all other agents in the environment, mutual influences among different system components, and the distribution of computational resources. This augments the complexity of algorithmic design and poses higher requirements on computational resources. At the same time, simulators are crucial for obtaining realistic data, which is the foundation of RL. In this paper, we first propose a series of metrics for simulators and summarize the features of existing benchmarks. Second, to ease comprehension, we recall the foundational knowledge and then synthesize recent advances in MARL-related autonomous driving and intelligent transportation systems. Specifically, we examine their environmental modeling, state representation, perception units, and algorithm design. Finally, we discuss open challenges as well as prospects and opportunities. We hope this paper can help researchers integrate MARL technologies and trigger more insightful ideas toward intelligent and autonomous driving.
comment: 23 pages, 6 figures and 2 tables. Submitted to IEEE Journal
CusADi: A GPU Parallelization Framework for Symbolic Expressions and Optimal Control
The parallelism afforded by GPUs presents significant advantages in training controllers through reinforcement learning (RL). However, integrating model-based optimization into this process remains challenging due to the complexity of formulating and solving optimization problems across thousands of instances. In this work, we present CusADi, an extension of the CasADi symbolic framework to support the parallelization of arbitrary closed-form expressions on GPUs with CUDA. We also formulate a closed-form approximation for solving general optimal control problems, enabling large-scale parallelization and evaluation of MPC controllers. Our results show a ten-fold speedup relative to a similar MPC implementation on the CPU, and we demonstrate the use of CusADi for various applications, including parallel simulation, parameter sweeps, and policy training.
comment: RAL 2024 submission
RUMI: Rummaging Using Mutual Information
This paper presents Rummaging Using Mutual Information (RUMI), a method for online generation of robot action sequences to gather information about the pose of a known movable object in visually-occluded environments. Focusing on contact-rich rummaging, our approach leverages mutual information between the object pose distribution and robot trajectory for action planning. From an observed partial point cloud, RUMI deduces the compatible object pose distribution and approximates the mutual information of it with workspace occupancy in real time. Based on this, we develop an information gain cost function and a reachability cost function to keep the object within the robot's reach. These are integrated into a model predictive control (MPC) framework with a stochastic dynamics model, updating the pose distribution in a closed loop. Key contributions include a new belief framework for object pose estimation, an efficient information gain computation strategy, and a robust MPC-based control scheme. RUMI demonstrates superior performance in both simulated and real tasks compared to baseline methods.
comment: 19 pages, 17 figures, submitted to IEEE Transactions on Robotics (T-RO)
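For readers unfamiliar with the quantity that RUMI plans with, the following minimal sketch computes mutual information exactly from a small discrete joint distribution. It is the textbook definition, not the real-time approximation described in the abstract; reading X as an object pose hypothesis and Y as the occupancy of a workspace cell is an assumption made here for illustration.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in bits for a joint table where joint[i][j] = P(X=i, Y=j)."""
    px = [sum(row) for row in joint]          # marginal over Y
    py = [sum(col) for col in zip(*joint)]    # marginal over X
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # terms with zero probability contribute nothing
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi
```

If the occupancy of a cell is independent of the pose hypothesis, the result is zero and probing that cell is uninformative; if it determines one of two equally likely hypotheses, the result is one bit.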
NeuFlow v2: High-Efficiency Optical Flow Estimation on Edge Devices
Real-time high-accuracy optical flow estimation is crucial for various real-world applications. While recent learning-based optical flow methods have achieved high accuracy, they often come with significant computational costs. In this paper, we propose a highly efficient optical flow method that balances high accuracy with reduced computational demands. Building upon NeuFlow v1, we introduce new components including a much more light-weight backbone and a fast refinement module. Both of these modules help keep the computational demands light while providing close to state-of-the-art accuracy. Compared to other state-of-the-art methods, our model achieves a 10x-70x speedup while maintaining comparable performance on both synthetic and real-world data. It is capable of running at over 20 FPS on 512x384 resolution images on a Jetson Orin Nano. The full training and evaluation code is available at https://github.com/neufieldrobotics/NeuFlow_v2.
CRITERIA: a New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving
Benchmarking is a common method for evaluating trajectory prediction models for autonomous driving. Existing benchmarks rely on datasets, which are biased towards more common scenarios, such as cruising, and distance-based metrics that are computed by averaging over all scenarios. Following such a regimen provides little insight into the properties of the models, both in terms of how well they can handle different scenarios and how admissible and diverse their outputs are. There exist a number of complementary metrics designed to measure the admissibility and diversity of trajectories; however, they suffer from biases, such as length of trajectories. In this paper, we propose a new benChmarking paRadIgm for evaluaTing trajEctoRy predIction Approaches (CRITERIA). Particularly, we contribute 1) a method for extracting driving scenarios at varying levels of specificity according to the structure of the roads, models' performance, and data properties for fine-grained ranking of prediction models; 2) a set of new bias-free metrics for measuring diversity, by incorporating the characteristics of a given scenario, and admissibility, by considering the structure of roads and kinematic compliancy, motivated by real-world driving constraints; and 3) extensive experimentation with the proposed benchmark on a representative set of prediction models using the large-scale Argoverse dataset. We show that the proposed benchmark can produce a more accurate ranking of the models and serve as a means of characterizing their behavior. We further present ablation studies to highlight the contributions of the different elements that are used to compute the proposed metrics.
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
Sim-to-real transfer presents a difficult challenge, where models trained in simulation are to be deployed in the real world. The distribution shift between the two settings leads to biased representations of the dynamics, and thus to suboptimal predictions in the real-world environment. In this work, we tackle the challenge of sim-to-real transfer of reinforcement learning (RL) agents for coverage path planning (CPP). In CPP, the task is for a robot to find a path that covers every point of a confined area. Specifically, we consider the case where the environment is unknown, and the agent needs to plan the path online while mapping the environment. We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles to enable environment randomization and automated episode resetting. We investigate what level of fine-tuning is needed for adapting to a realistic setting, compared to an agent trained solely in simulation. We find that a high inference frequency allows first-order Markovian policies to transfer directly from simulation, while higher-order policies can be fine-tuned to further reduce the sim-to-real gap. Moreover, they can operate at a lower frequency, thus reducing computational requirements. In both cases, our approaches transfer state-of-the-art results from simulation to the real domain, where direct learning would take on the order of weeks with manual interaction, that is, it would be completely infeasible.
ForzaETH Race Stack -- Scaled Autonomous Head-to-Head Racing on Fully Commercial off-the-Shelf Hardware
Autonomous racing in robotics combines high-speed dynamics with the necessity for reliability and real-time decision-making. While such racing pushes software and hardware to their limits, many existing full-system solutions necessitate complex, custom hardware and software, and usually focus on Time-Trials rather than full unrestricted Head-to-Head racing, due to financial and safety constraints. This limits their reproducibility, making advancements and replication feasible mostly for well-resourced laboratories with comprehensive expertise in mechanical, electrical, and robotics fields. Researchers interested in the autonomy domain but with only partial experience in one of these fields need to spend significant time on familiarization and integration. The ForzaETH Race Stack addresses this gap by providing an autonomous racing software platform designed for F1TENTH, a 1:10 scaled Head-to-Head autonomous racing competition, which simplifies replication by using commercial off-the-shelf hardware. This approach enhances the competitive aspect of autonomous racing and provides an accessible platform for research and development in the field. The ForzaETH Race Stack is designed with modularity and operational ease of use in mind, allowing customization and adaptability to various environmental conditions, such as track friction and layout. Capable of handling both Time-Trials and Head-to-Head racing, the stack has demonstrated its effectiveness, robustness, and adaptability in the field by winning the official F1TENTH international competition multiple times.
comment: This paper has been accepted at the Journal of Field Robotics
NVINS: Robust Visual Inertial Navigation Fused with NeRF-augmented Camera Pose Regressor and Uncertainty Quantification IROS 2024
In recent years, Neural Radiance Fields (NeRF) have emerged as a powerful tool for 3D reconstruction and novel view synthesis. However, the computational cost of NeRF rendering and degradation in quality due to the presence of artifacts pose significant challenges for its application in real-time and robust robotic tasks, especially on embedded systems. This paper introduces a novel framework that integrates NeRF-derived localization information with Visual-Inertial Odometry (VIO) to provide a robust solution for real-time robotic navigation. By training an absolute pose regression network with augmented image data rendered from a NeRF and quantifying its uncertainty, our approach effectively counters positional drift and enhances system reliability. We also establish a mathematically sound foundation for combining visual inertial navigation with camera localization neural networks, considering uncertainty under a Bayesian framework. Experimental validation in a photorealistic simulation environment demonstrates significant improvements in accuracy compared to a conventional VIO approach.
comment: Accepted to IROS 2024, 8 pages, 5 figures, 2 tables
Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
This research introduces the Bi-VLA (Vision-Language-Action) model, a novel system designed for bimanual robotic dexterous manipulation that seamlessly integrates vision for scene understanding, language comprehension for translating human instructions into executable code, and physical action generation. We evaluated the system's functionality through a series of household tasks, including the preparation of a desired salad upon human request. Bi-VLA demonstrates the ability to interpret complex human instructions, perceive and understand the visual context of ingredients, and execute precise bimanual actions to prepare the requested salad. We assessed the system's performance in terms of accuracy, efficiency, and adaptability to different salad recipes and human preferences through a series of experiments. Our results show a 100% success rate in generating the correct executable code by the Language Module, a 96.06% success rate in detecting specific ingredients by the Vision Module, and an overall success rate of 83.4% in correctly executing user-requested tasks.
comment: The paper was accepted to the IEEE SMC 2024
Localization in Dynamic Planar Environments Using Few Distance Measurements
We present a method for determining the unknown location of a sensor placed in a known 2D environment in the presence of unknown dynamic obstacles, using only a few distance measurements. We present guarantees on the quality of the localization, which are robust under mild assumptions on the density of the unknown/dynamic obstacles in the known environment. We demonstrate the effectiveness of our method in simulated experiments for different environments and varying dynamic-obstacle density. Our open source software is available at https://github.com/TAU-CGL/vb-fdml2-public.
HyperSurf: Quadruped Robot Leg Capable of Surface Recognition with GRU and Real-to-Sim Transferring
This paper introduces a system of data collection acceleration and real-to-sim transferring for surface recognition on a quadruped robot. The system features a mechanical single-leg setup capable of stepping on various easily interchangeable surfaces. Additionally, it incorporates a GRU-based Surface Recognition System, inspired by the system detailed in the Dog-Surf paper. This setup facilitates the expansion of dataset collection for model training, enabling data acquisition from hard-to-reach surfaces in laboratory conditions. Furthermore, it opens avenues for transferring surface properties from reality to simulation, thereby allowing the training of optimal gaits for legged robots in simulation environments using a pre-prepared library of digital twins of surfaces. Moreover, enhancements have been made to the GRU-based Surface Recognition System, allowing for the integration of data from both the quadruped robot and the single-leg setup. The dataset and code have been made publicly available.
comment: IEEE SMC 2024
SceneMotion: From Agent-Centric Embeddings to Scene-Wide Forecasts SC 2024
Self-driving vehicles rely on multimodal motion forecasts to effectively interact with their environment and plan safe maneuvers. We introduce SceneMotion, an attention-based model for forecasting scene-wide motion modes of multiple traffic agents. Our model transforms local agent-centric embeddings into scene-wide forecasts using a novel latent context module. This module learns a scene-wide latent space from multiple agent-centric embeddings, enabling joint forecasting and interaction modeling. The competitive performance in the Waymo Open Interaction Prediction Challenge demonstrates the effectiveness of our approach. Moreover, we cluster future waypoints in time and space to quantify the interaction between agents. We merge all modes and analyze each mode independently to determine which clusters are resolved through interaction or result in conflict. Our implementation is available at: https://github.com/kit-mrt/future-motion
comment: 7 pages, 3 figures, ITSC 2024; v2: added details about waypoint clustering
Two-step dynamic obstacle avoidance
Dynamic obstacle avoidance (DOA) is a fundamental challenge for any autonomous vehicle, independent of whether it operates at sea, in the air, or on land. This paper proposes a two-step architecture for handling DOA tasks by combining supervised and reinforcement learning (RL). In the first step, we introduce a data-driven approach to estimate the collision risk (CR) of an obstacle using a recurrent neural network, which is trained in a supervised fashion and offers robustness to non-linear obstacle movements. In the second step, we include these CR estimates in the observation space of an RL agent to increase its situational awareness. We illustrate the power of our two-step approach by training different RL agents in a challenging environment that requires navigating amid multiple obstacles. The non-linear movements of obstacles are modeled, as an example, using stochastic processes and periodic patterns, although our architecture is suitable for any obstacle dynamics. The experiments reveal that integrating our CR metrics into the observation space doubles the performance in terms of reward, which is equivalent to halving the number of collisions in the considered environment. We also perform a generalization experiment to validate the proposal in an RL environment based on maritime traffic and real-world vessel trajectory data. Furthermore, we show that the architecture's performance improvement is independent of the applied RL algorithm.
GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration
We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision), to facilitate one-shot visual teaching for robotic manipulation. This system analyzes videos of humans performing tasks and outputs executable robot programs that incorporate insights into affordances. The process begins with GPT-4V analyzing the videos to obtain textual explanations of environmental and action details. A GPT-4-based task planner then encodes these details into a symbolic task plan. Subsequently, vision systems spatially and temporally ground the task plan in the videos. Objects are identified using an open-vocabulary object detector, and hand-object interactions are analyzed to pinpoint moments of grasping and releasing. This spatiotemporal grounding allows for the gathering of affordance information (e.g., grasp types, waypoints, and body postures) critical for robot execution. Experiments across various scenarios demonstrate the method's efficacy in achieving real robots' operations from human demonstrations in a one-shot manner. Meanwhile, quantitative tests have revealed instances of hallucination in GPT-4V, highlighting the importance of incorporating human supervision within the pipeline. The prompts of GPT-4V/GPT-4 are available at this project page: https://microsoft.github.io/GPT4Vision-Robot-Manipulation-Prompts/
comment: 8 pages, 10 figures, 3 tables. Last updated on August 18th, 2024
MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting
Open-world generalization requires robotic systems to have a profound understanding of the physical world and the user command to solve diverse and complex tasks. While the recent advancement in vision-language models (VLMs) has offered unprecedented opportunities to solve open-world problems, how to leverage their capabilities to control robots remains a grand challenge. In this paper, we present MOKA (Marking Open-vocabulary Keypoint Affordances), an approach that employs VLMs to solve robotic manipulation tasks specified by free-form language instructions. Central to our approach is a compact point-based representation of affordance, which bridges the VLM's predictions on observed images and the robot's actions in the physical world. By prompting the pre-trained VLM, our approach utilizes the VLM's commonsense knowledge and concept understanding acquired from broad data sources to predict affordances and generate motions. To facilitate the VLM's reasoning in zero-shot and few-shot manners, we propose a visual prompting technique that annotates marks on images, converting affordance reasoning into a series of visual question-answering problems that are solvable by the VLM. We further explore methods to enhance performance with robot experiences collected by MOKA through in-context learning and policy distillation. We evaluate and analyze MOKA's performance on various table-top manipulation tasks including tool use, deformable body manipulation, and object rearrangement.
Vision-Based Dexterous Motion Planning by Dynamic Movement Primitives with Human Hand Demonstration
This paper proposes a vision-based framework for a 7-degree-of-freedom robotic manipulator, with the primary objective of facilitating its capacity to acquire information from human hand demonstrations for the execution of dexterous pick-and-place tasks. Most existing works only focus on the position demonstration without considering the orientations. In this paper, by employing a single depth camera, MediaPipe is applied to generate the three-dimensional coordinates of a human hand, thereby comprehensively recording the hand's motion, encompassing the trajectory of the wrist, orientation of the hand, and the grasp motion. A mean filter is applied during data pre-processing to smooth the raw data. The demonstration is designed to pick up an object at a specific angle, navigate around obstacles in its path and subsequently, deposit it within a sloped container. The robotic system demonstrates its learning capabilities, facilitated by the implementation of Dynamic Movement Primitives, enabling the assimilation of user actions into its trajectories with different start and end poi
comment: This paper has been published in 2024 IEEE 33rd International Symposium on Industrial Electronics (ISIE)
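The abstract above mentions a mean filter for smoothing the raw hand-tracking data. A minimal sketch of such a filter over one coordinate channel is shown below; the window size, the edge handling (shrinking the window at the boundaries), and the function name are assumptions, since the abstract does not specify them.

```python
def mean_filter(samples, window=3):
    """Smooth a 1D sequence with a centered moving average.

    The window must be odd; near the ends it shrinks to the samples available.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed
```

Applied separately to the x, y, and z coordinates of a tracked wrist point, a filter like this spreads a single-frame spike over its neighbors, at the cost of slightly blurring fast motions.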
STHN: Deep Homography Estimation for UAV Thermal Geo-localization with Satellite Imagery
Accurate geo-localization of Unmanned Aerial Vehicles (UAVs) is crucial for outdoor applications including search and rescue operations, power line inspections, and environmental monitoring. The vulnerability of Global Navigation Satellite System (GNSS) signals to interference and spoofing necessitates the development of additional robust localization methods for autonomous navigation. Visual Geo-localization (VG), leveraging onboard cameras and reference satellite maps, offers a promising solution for absolute localization. Specifically, Thermal Geo-localization (TG), which relies on image-based matching between thermal imagery and satellite databases, stands out by utilizing infrared cameras for effective nighttime localization. However, the efficiency and effectiveness of current TG approaches are hindered by dense sampling on satellite maps and geometric noise in thermal query images. To overcome these challenges, we introduce STHN, a novel UAV thermal geo-localization approach that employs a coarse-to-fine deep homography estimation method. This method attains reliable thermal geo-localization within a 512-meter radius of the UAV's last known location even with a challenging 11% size ratio between thermal and satellite images, despite the presence of indistinct textures and self-similar patterns. We further show how our research significantly enhances UAV thermal geo-localization performance and robustness against geometric noise under low-visibility conditions in the wild. The code is made publicly available.
comment: 8 pages, 7 figures. Accepted for IEEE Robotics and Automation Letters
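A homography is a 3x3 matrix that maps points between two views of a plane in homogeneous coordinates. The sketch below only shows how an already estimated matrix would be applied to map a thermal-image pixel into satellite-image coordinates; it does not reproduce the coarse-to-fine network that STHN uses to estimate the matrix, and the function name is hypothetical.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through the 3x3 homography H (a list of rows)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return u / w, v / w
```

Mapping the four corners of the thermal image this way gives the footprint of the query inside the satellite map, from which a position estimate can be read off.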
Multiagent Systems
Auctioning Escape Permits for Multiple Correlated Pollutants Using CMRA
In the context of increasingly complex environmental challenges, effective pollution control mechanisms are crucial. By extending state-of-the-art auction mechanisms, we aim to develop an efficient approach for allocating pollution abatement resources in a multi-pollutant setting with pollutants affecting each other's reduction costs. We modify the Combinatorial Multi-Round Ascending Auction for the auction of escape permits of pollutants with co-dependent reduction processes, specifically, greenhouse gas emissions and nutrient runoff in Finnish agriculture. We show the significant advantages of this mechanism in pollution control through experiments on the bid prices and amount of escape permits sold in multiple auction simulations.
Synthesis of Reward Machines for Multi-Agent Equilibrium Design (Full Version)
Mechanism design is a well-established game-theoretic paradigm for designing games to achieve desired outcomes. This paper addresses a closely related but distinct concept, equilibrium design. Unlike mechanism design, the designer's authority in equilibrium design is more constrained; she can only modify the incentive structures in a given game to achieve certain outcomes without the ability to create the game from scratch. We study the problem of equilibrium design using dynamic incentive structures, known as reward machines. We use weighted concurrent game structures for the game model, with goals (for the players and the designer) defined as mean-payoff objectives. We show how reward machines can be used to represent dynamic incentives that allocate rewards in a manner that optimises the designer's goal. We also introduce the main decision problem within our framework, the payoff improvement problem. This problem essentially asks whether there exists a dynamic incentive (represented by some reward machine) that can improve the designer's payoff by more than a given threshold value. We present two variants of the problem: strong and weak. We demonstrate that both can be solved in polynomial time using a Turing machine equipped with an NP oracle. Furthermore, we also establish that these variants are either NP-hard or coNP-hard. Finally, we show how to synthesise the corresponding reward machine if it exists.
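To make the mean-payoff objective mentioned above concrete, the following sketch evaluates an ultimately periodic play, that is, a finite prefix of weights followed by a cycle repeated forever. For such a play the mean payoff equals the average weight of the cycle, and the prefix has no effect in the limit. The function names are hypothetical and the sketch does not model reward machines or the paper's decision procedures.

```python
from fractions import Fraction


def mean_payoff(prefix, cycle):
    """Mean payoff of the infinite play prefix + cycle + cycle + ..."""
    if not cycle:
        raise ValueError("cycle must be non-empty")
    # The prefix is visited only finitely often, so only the cycle matters.
    return Fraction(sum(cycle), len(cycle))


def finite_average(prefix, cycle, n):
    """Average of the first n weights (n >= 1) of the same play."""
    total = 0
    for k in range(n):
        if k < len(prefix):
            total += prefix[k]
        else:
            total += cycle[(k - len(prefix)) % len(cycle)]
    return Fraction(total, n)
```

A dynamic incentive that adds reward to the transitions on a cycle therefore shifts a player's mean payoff by the average added reward along that cycle, which gives some intuition for why incentives on recurring behavior are what matter for this objective.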
MegaAgent: A Practical Framework for Autonomous Cooperation in Large-Scale LLM Agent Systems
With the emergence of large language models (LLMs), LLM-powered multi-agent systems (LLM-MA systems) have been proposed to tackle real-world tasks. However, their agents mostly follow predefined Standard Operating Procedures (SOPs) that remain unchanged across the whole interaction, lacking autonomy and scalability. Additionally, current solutions often overlook the necessity for effective agent cooperation. To address the above limitations, we propose MegaAgent, a practical framework designed for autonomous cooperation in large-scale LLM Agent systems. MegaAgent leverages the autonomy of agents to dynamically generate agents based on task requirements, incorporating features such as automatically dividing tasks, systematic planning and monitoring of agent activities, and managing concurrent operations. In addition, MegaAgent is designed with a hierarchical structure and employs system-level parallelism to enhance performance and boost communication. We demonstrate the effectiveness of MegaAgent through Gobang game development, showing that it outperforms popular LLM-MA systems; and national policy simulation, demonstrating its high autonomy and potential to rapidly scale up to 590 agents while ensuring effective cooperation among them. Our results indicate that MegaAgent is the first autonomous large-scale LLM-MA system with no pre-defined SOPs, high effectiveness and scalability, paving the way for further research in this field. Our code is at https://anonymous.4open.science/r/MegaAgent-81F3.
Algorithmic Contract Design with Reinforcement Learning Agents
We introduce a novel problem setting for algorithmic contract design, named the principal-MARL contract design problem. This setting extends traditional contract design to account for dynamic and stochastic environments using Markov Games and Multi-Agent Reinforcement Learning. To tackle this problem, we propose a Multi-Objective Bayesian Optimization (MOBO) framework named Constrained Pareto Maximum Entropy Search (cPMES). Our approach integrates MOBO and MARL to explore the highly constrained contract design space, identifying promising incentive and recruitment decisions. cPMES transforms the principal-MARL contract design problem into an unconstrained multi-objective problem, leveraging the probability of feasibility as part of the objectives and ensuring promising designs predicted on the feasibility border are included in the Pareto front. By focusing the entropy prediction on designs within the Pareto set, cPMES mitigates the risk of the search strategy being overwhelmed by entropy from constraints. We demonstrate the effectiveness of cPMES through extensive benchmark studies in synthetic and simulated environments, showing its ability to find feasible contract designs that maximize the principal's objectives. Additionally, we provide theoretical support with a sub-linear regret bound concerning the number of iterations.
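The Pareto front referred to above is the set of designs that no other design beats on every objective. The sketch below is a generic non-dominated filter for maximization problems; it is not cPMES itself, and treating the probability of feasibility as simply one more coordinate of each tuple is an assumption made for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, with objective tuples of the form (principal utility, probability of feasibility), a design with high utility but borderline feasibility stays on the front as long as no other design is better on both coordinates.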
Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey
Reinforcement Learning (RL) is a potent tool for sequential decision-making and has achieved performance surpassing human capabilities across many challenging real-world tasks. As the extension of RL to the multi-agent system domain, multi-agent RL (MARL) not only needs to learn the control policy but also requires consideration of interactions with all other agents in the environment, mutual influences among different system components, and the distribution of computational resources. This augments the complexity of algorithmic design and poses higher requirements on computational resources. At the same time, simulators are crucial for obtaining realistic data, which is the foundation of RL. In this paper, we first propose a series of metrics for simulators and summarize the features of existing benchmarks. Second, to ease comprehension, we recall the foundational knowledge and then synthesize recent advances in MARL-related autonomous driving and intelligent transportation systems. Specifically, we examine their environmental modeling, state representation, perception units, and algorithm design. Finally, we discuss open challenges as well as prospects and opportunities. We hope this paper can help researchers integrate MARL technologies and trigger more insightful ideas toward intelligent and autonomous driving.
comment: 23 pages, 6 figures and 2 tables. Submitted to IEEE Journal
Tax Credits and Household Behavior: The Roles of Myopic Decision-Making and Liquidity in a Simulated Economy
There has been a growing interest in multi-agent simulators in the domain of economic modeling. However, contemporary research often involves developing reinforcement learning (RL) based models that focus solely on a single type of agent, such as households, firms, or the government. Such an approach overlooks the adaptation of interacting agents, thereby failing to capture the complexity of real-world economic systems. In this work, we consider a multi-agent simulator comprising RL agents of numerous types, including heterogeneous households, a firm, a central bank, and a government. In particular, we focus on the crucial role of the government in distributing tax credits to households. We conduct two broad categories of comprehensive experiments dealing with the impact of tax credits on 1) households with varied degrees of myopia (short-sightedness in spending and saving decisions), and 2) households with diverse liquidity profiles. The first category of experiments examines the impact of the frequency of tax credits (e.g. annual vs quarterly) on consumption patterns of myopic households. The second category of experiments focuses on the impact of varying tax credit distribution strategies on households with differing liquidities. We validate our simulation model by reproducing trends observed in real households upon receipt of unforeseen, uniform tax credits, as documented in a JPMorgan Chase report. Based on the results of the latter, we propose an innovative tax credit distribution strategy for the government to reduce inequality among households. We demonstrate the efficacy of this strategy in improving social welfare in our simulation results.
Team Coordination on Graphs: Problem, Analysis, and Algorithms
Team Coordination on Graphs with Risky Edges (TCGRE) is a recently emerged problem, in which a robot team collectively reduces graph traversal cost through support from one robot to another when the latter traverses a risky edge. As the problem resembles the traditional Multi-Agent Path Finding (MAPF) problem, both classical and learning-based methods have been proposed to solve TCGRE; however, they lack either computational efficiency or optimality assurance. In this paper, we reformulate TCGRE as a constrained optimization problem and perform a rigorous mathematical analysis. Our theoretical analysis shows the NP-hardness of TCGRE by reduction from the Maximum 3D Matching problem and that efficient decomposition is key to tackling this combinatorial optimization problem. Furthermore, we design three classes of algorithms to solve TCGRE, i.e., Joint State Graph (JSG) based, coordination based, and receding-horizon sub-team based solutions. Each of these proposed algorithms enjoys different provable optimality and efficiency characteristics that are demonstrated in our extensive experiments.
comment: 8 pages, 4 figures
An Introduction to Decentralized Training and Execution in Cooperative Multi-Agent Reinforcement Learning
Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. Many approaches have been developed, but they can be divided into three main types: centralized training and execution (CTE), centralized training for decentralized execution (CTDE), and decentralized training and execution (DTE). Decentralized training and execution methods make the fewest assumptions and are often simple to implement. In fact, as I'll discuss, any single-agent RL method can be used for DTE by just letting each agent learn separately. Of course, there are pros and cons to such approaches. It is worth noting that DTE is required if no offline coordination is available. That is, if all agents must learn during online interactions without prior coordination, learning and execution must both be decentralized. DTE methods can be applied in cooperative, competitive, or mixed cases, but this text will focus on the cooperative MARL case. This text is an introduction to the field of decentralized, cooperative MARL. As such, I will first give a brief description of the cooperative MARL problem in the form of the Dec-POMDP. Then, I will discuss value-based DTE methods, starting with independent Q-learning and its extensions, and then discuss the extension to the deep case with DQN, the additional complications this causes, and methods that have been developed to (attempt to) address these issues. Next, I will discuss policy gradient DTE methods, starting with independent REINFORCE (i.e., vanilla policy gradient), and then extending to the actor-critic case and deep variants (such as independent PPO). Finally, I will discuss some general topics related to DTE and future directions.
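The idea that each agent can simply run a single-agent learner on its own can be illustrated with a minimal sketch of independent Q-learning. The setting below is an assumption made for illustration only: a stateless, repeated 2x2 cooperative game with a hypothetical payoff matrix, where both agents receive the same reward and neither observes the other's action or Q-values. It is not taken from the text being summarized.

```python
import random

# Hypothetical shared payoff: PAYOFF[a0][a1] is the reward both agents receive.
PAYOFF = [[1.0, 0.0],
          [0.0, 0.5]]


def train_independent_q(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Each agent keeps its own Q-table and updates it from the shared reward only."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[agent][action]
    for _ in range(episodes):
        actions = []
        for agent in range(2):
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                actions.append(rng.randrange(2))
            else:
                # Exploit: pick the action with the highest own Q-value.
                actions.append(max(range(2), key=lambda a: q[agent][a]))
        reward = PAYOFF[actions[0]][actions[1]]
        for agent in range(2):
            a = actions[agent]
            # Stateless Q-learning update; the other agent is treated as part
            # of the environment, which is what makes the learning decentralized.
            q[agent][a] += alpha * (reward - q[agent][a])
    return q


if __name__ == "__main__":
    print(train_independent_q())
```

Because each agent's update ignores the other agent's changing behavior, the environment looks nonstationary from every agent's point of view, which is one of the complications of decentralized learning.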
Systems and Control (CS)
LEAD: Towards Learning-Based Equity-Aware Decarbonization in Ridesharing Platforms
Ridesharing platforms such as Uber, Lyft, and DiDi have grown in popularity due to their on-demand availability, ease of use, and commute cost reductions, among other benefits. However, not all ridesharing promises have panned out. Recent studies demonstrate that the expected drop in traffic congestion and reduction in greenhouse gas (GHG) emissions have not materialized. This is primarily due to the substantial distances traveled by the ridesharing vehicles without passengers between rides, known as deadhead miles. Recent work has focused on reducing the impact of deadhead miles while considering additional metrics such as rider waiting time, GHG emissions from deadhead miles, or driver earnings. Unfortunately, prior studies consider these environmental and equity-based metrics individually despite them being interrelated. In this paper, we propose a Learning-based Equity-Aware Decarbonization approach, LEAD, for ridesharing platforms. LEAD targets minimizing emissions while ensuring that the driver's utility, defined as the difference between the trip distance and the deadhead miles, is fairly distributed. LEAD uses reinforcement learning to match riders to drivers based on the expected future utility of drivers and the expected carbon emissions of the platform without increasing the rider waiting times. Extensive experiments based on a real-world ride-sharing dataset show that LEAD improves fairness by 2$\times$ when compared to emission-aware ride-assignment and reduces emissions by 70% while ensuring fairness within 66% of the fair baseline. It also reduces the rider wait time by at least 40% compared to various baselines. Additionally, LEAD corrects the imbalance in previous emission-aware ride assignment algorithms that overassigned rides to low-emission vehicles.
Perfectly Undetectable Reflection and Scaling False Data Injection Attacks via Affine Transformation on Mobile Robot Trajectory Tracking Control
With the increasing integration of cyber-physical systems (CPS) into critical applications, ensuring their resilience against cyberattacks is paramount. A particularly concerning threat is the vulnerability of CPS to deceptive attacks that degrade system performance while remaining undetected. This paper investigates perfectly undetectable false data injection attacks (FDIAs) targeting the trajectory tracking control of a non-holonomic mobile robot. The proposed attack method utilizes affine transformations of intercepted signals, exploiting weaknesses inherent in the partially linear dynamic properties and symmetry of the nonlinear plant. The feasibility and potential impact of these attacks are validated through experiments using a Turtlebot 3 platform, highlighting the urgent need for sophisticated detection mechanisms and resilient control strategies to safeguard CPS against such threats. Furthermore, a novel approach for detection of these attacks called the state monitoring signature function (SMSF) is introduced. An example SMSF, a carefully designed function resilient to FDIA, is shown to be able to detect the presence of an FDIA through signatures based on system states.
comment: 15 pages, 17 figures. Manuscript under review for publication
General Impedance Modeling for Modular Multilevel Converter with Grid-forming and Grid-following Control
The modular multilevel converter (MMC) has a complex topology, control architecture, and broadband harmonic spectrum. For this reason, linear-time-periodic (LTP) theory, covering multi-harmonic coupling relations, has recently been adopted for MMC impedance modeling. However, existing MMC impedance models usually lack explicit expressions and a general modeling procedure for different control strategies. To this end, this paper proposes a general impedance modeling procedure applicable to various power converters with grid-forming and grid-following control strategies. The modeling is based on a unified representation of the MMC circuit as the input-output relation between the voltage or current on the AC side and the exerted modulation index, with the control part represented as the reverse relation, so that the two are interconnected as a closed feedback loop. With each part expressed as transfer functions, the final impedance model keeps the explicit form of a harmonic transfer function matrix, making it convenient to directly observe and analyze the influence of each part individually. In this way, the submodule capacitance is found to be the main cause of the difference between the MMC impedance and that of a two-level converter; the two become closer as the capacitance increases. The effectiveness and generality of the impedance modeling method are demonstrated through comprehensive comparison with impedance scanning using electromagnetic transient simulation.
Adaptive BESS and Grid Setpoints Optimization: A Model-Free Framework for Efficient Battery Management under Dynamic Tariff Pricing
This paper introduces an enhanced framework for managing Battery Energy Storage Systems (BESS) in residential communities. The non-convex BESS control problem is first addressed using a gradient-based optimizer, providing a benchmark solution. Subsequently, the problem is tackled using multiple Deep Reinforcement Learning (DRL) agents, with a specific emphasis on the off-policy Soft Actor-Critic (SAC) algorithm. This version of SAC incorporates reward refinement based on this non-convex problem, applying logarithmic scaling to enhance convergence rates. Additionally, a safety mechanism selects only feasible actions from the action space, aimed at improving the learning curve, accelerating convergence, and reducing computation times. Moreover, the state representation of this DRL approach now includes uncertainties quantified in the entropy term, enhancing the model's adaptability across various entropy types. This developed system adheres to strict limits on the battery's State of Charge (SOC), thus preventing breaches of SOC boundaries and extending the battery lifespan. The robustness of the model is validated across several Australian states' districts, each characterized by unique uncertainty distributions. By implementing the refined SAC, the SOC consistently surpasses 50 percent by the end of each day, enabling the BESS control to start smoothly for the next day with some reserve. Finally, this proposed DRL method achieves a mean reduction in optimization time by 50 percent and an average cost saving of 40 percent compared to the gradient-based optimization benchmark.
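The two mechanisms mentioned in this abstract, restricting the agent to feasible actions so that State of Charge (SOC) limits are never breached and applying logarithmic scaling to the reward, can be sketched as follows. Everything here is an illustrative assumption rather than the paper's implementation: the function names, the lossless battery model, the default limits, and the particular sign-preserving `log1p` form of the scaling are all hypothetical.

```python
import math


def feasible_power_range(soc, capacity_kwh, dt_h,
                         soc_min=0.1, soc_max=0.9, p_max_kw=5.0):
    """Return (lowest, highest) power in kW that keeps SOC within its limits.

    Assumes an ideal, lossless battery and soc_min <= soc <= soc_max.
    Positive power charges the battery; negative power discharges it.
    """
    p_hi = min(p_max_kw, (soc_max - soc) * capacity_kwh / dt_h)
    p_lo = max(-p_max_kw, (soc_min - soc) * capacity_kwh / dt_h)
    return p_lo, p_hi


def project_action(p_kw, soc, capacity_kwh, dt_h, **limits):
    """Clip a requested power to the feasible range for the current SOC."""
    p_lo, p_hi = feasible_power_range(soc, capacity_kwh, dt_h, **limits)
    return min(max(p_kw, p_lo), p_hi)


def log_scaled_reward(cost):
    """Map a cost to a reward with sign-preserving logarithmic compression."""
    return -math.copysign(math.log1p(abs(cost)), cost)


if __name__ == "__main__":
    # A 10 kWh battery at 85% SOC asked to charge at 3 kW for one hour.
    print(project_action(3.0, soc=0.85, capacity_kwh=10.0, dt_h=1.0))
    print(log_scaled_reward(100.0))
```

Clipping to a feasible interval is only one way to keep actions safe; masking a discrete action set would serve the same purpose under a different action representation.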
Minimal Sensor Placement for Generic State and Unknown Input Observability
This paper addresses the problem of selecting the minimum number of dedicated sensors to achieve observability in the presence of unknown inputs, namely, the state and input observability, for linear time-invariant systems. We assume that the only available information is the zero-nonzero structure of system matrices, and approach this problem within a structured system model. We revisit the concept of state and input observability for structured systems, providing refined necessary and sufficient conditions for placing dedicated sensors via the Dulmage-Mendelsohn decomposition. Based on these conditions, we prove that determining the minimum number of dedicated sensors to achieve generic state and input observability is NP-hard, which contrasts sharply with the polynomial-time complexity of the corresponding problem with known inputs. We also demonstrate that this problem is hard to approximate within a factor of $(1-o(1)){\rm{log}}(n)$, where $n$ is the state dimension. Notwithstanding, we propose nontrivial upper and lower bounds that can be computed in polynomial time, which confine the optimal value of this problem to an interval with length being the number of inputs. We further present a special case for which the exact optimal value can be determined in polynomial time. Additionally, we propose a two-stage algorithm to solve this problem approximately. Each stage of the algorithm is either optimal or suboptimal and can be completed in polynomial time.
comment: 12 pages, 6 figures
Qualitative properties and stability analysis of the mathematical model for a DC-DC electric circuit
This paper describes a simplified model of an electric circuit with a DC-DC converter and a PID-regulator as a system of integral differential equations with an identically singular matrix multiplying the higher derivative of the desired vector-function. We use theoretical results on integral and differential equations and their systems to prove solvability of such a model and analyze its stability.
comment: submitted to COIA-2024. arXiv admin note: text overlap with arXiv:2408.06045
ISAC-Fi: Enabling Full-fledged Monostatic Sensing over Wi-Fi Communication
Whereas Wi-Fi communications have been exploited for sensing purposes for over a decade, the bistatic or multistatic nature of Wi-Fi still poses multiple challenges, hampering real-life deployment of integrated sensing and communication (ISAC) within the Wi-Fi framework. In this paper, we aim to re-design Wi-Fi so that monostatic sensing (mimicking radar) can be achieved over the multistatic communication infrastructure. Specifically, we propose, design, and implement ISAC-Fi as an ISAC-ready Wi-Fi prototype. We first present a novel self-interference cancellation scheme, in order to extract reflected (radio frequency) signals for sensing purposes in the face of transmissions. We then subtly revise the existing Wi-Fi framework so as to seamlessly operate monostatic sensing under the Wi-Fi communication standard. Finally, we offer two ISAC-Fi designs: while a USRP-based one emulates a totally re-designed ISAC-Fi device, another plug-and-play design allows for backward compatibility by attaching an extra module to an arbitrary Wi-Fi device. We perform extensive experiments to validate the efficacy of ISAC-Fi and also to demonstrate its superiority over existing Wi-Fi sensing proposals.
comment: 14 pages, 22 figures
Neural Horizon Model Predictive Control -- Increasing Computational Efficiency with Neural Networks
The expansion in automation of increasingly fast applications and low-power edge devices poses a particular challenge for optimization-based control algorithms, like model predictive control. Our proposed machine-learning supported approach addresses this by utilizing a feed-forward neural network to reduce the computational load of the online optimization. We propose approximating part of the problem horizon, while maintaining safety guarantees -- constraint satisfaction -- via the remaining optimization part of the controller. The approach is validated in simulation, demonstrating an improvement in computational efficiency, while maintaining guarantees and near-optimal performance. The proposed MPC scheme can be applied to a wide range of applications, including those requiring a rapid control response, such as robotics and embedded applications with limited computational resources.
comment: 6 pages, 4 figures, 4 tables, American Control Conference (ACC) 2024
Self-Refined Generative Foundation Models for Wireless Traffic Prediction
With a broad range of emerging applications in 6G networks, wireless traffic prediction has become a critical component of network management. However, the dynamically shifting distribution of wireless traffic in non-stationary 6G networks presents significant challenges to achieving accurate and stable predictions. Motivated by recent advancements in Generative AI (GAI)-enabled 6G networks, this paper proposes a novel self-refined Large Language Model (LLM) for wireless traffic prediction, namely TrafficLLM, through in-context learning without parameter fine-tuning or model training. The proposed TrafficLLM harnesses the powerful few-shot learning abilities of LLMs to enhance the scalability of traffic prediction in dynamically changing wireless environments. Specifically, our proposed TrafficLLM embraces an LLM to iteratively refine its predictions through a three-step process: traffic prediction, feedback generation, and prediction refinement. Initially, the proposed TrafficLLM conducts traffic predictions using task-specific demonstration prompts. Recognizing that LLMs may generate incorrect predictions on the first attempt, we subsequently incorporate feedback demonstration prompts designed to provide multifaceted and valuable feedback related to these initial predictions. Following this comprehensive feedback, our proposed TrafficLLM introduces refinement demonstration prompts, enabling the same LLM to further refine its predictions and thereby enhance prediction performance. The evaluations on two realistic datasets demonstrate that the proposed TrafficLLM outperforms state-of-the-art methods with performance improvements of 23.17% and 17.09%, respectively.
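The three-step process described above (traffic prediction, feedback generation, prediction refinement) amounts to a short loop around a single language model. The sketch below is a hypothetical illustration, not the TrafficLLM implementation: the function name, the prompt layout, and the `llm` callable (any function mapping a prompt string to a completion string) are all assumptions, and the demonstration prompts are passed in as opaque strings.

```python
from typing import Callable, Sequence


def self_refined_forecast(llm: Callable[[str], str],
                          history: Sequence[float],
                          predict_demo: str,
                          feedback_demo: str,
                          refine_demo: str,
                          rounds: int = 1) -> str:
    """Predict, then repeatedly ask the same model for feedback and a refinement."""
    series = ", ".join(f"{x:g}" for x in history)

    # Step 1: initial prediction from task-specific demonstrations.
    prediction = llm(f"{predict_demo}\nHistory: {series}\nPrediction:")

    for _ in range(rounds):
        # Step 2: the same model critiques its own prediction.
        feedback = llm(
            f"{feedback_demo}\nHistory: {series}\n"
            f"Prediction: {prediction}\nFeedback:"
        )
        # Step 3: the same model revises the prediction using that feedback.
        prediction = llm(
            f"{refine_demo}\nHistory: {series}\n"
            f"Prediction: {prediction}\nFeedback: {feedback}\n"
            f"Refined prediction:"
        )
    return prediction
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, and no parameters are updated anywhere in the loop, which matches the in-context setting the abstract describes.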
Finite-time input-to-state stability for infinite-dimensional systems
In this paper, we extend the notion of finite-time input-to-state stability (FTISS) for finite-dimensional systems to infinite-dimensional systems. More specifically, we first prove an FTISS Lyapunov theorem for a class of infinite-dimensional systems, namely, the existence of an FTISS Lyapunov functional (FTISS-LF) implies the FTISS of the system, and then, provide a sufficient condition for ensuring the existence of an FTISS-LF for a class of abstract infinite-dimensional systems under the framework of compact semigroup theory and Hilbert spaces. As an application of the FTISS Lyapunov theorem, we verify the FTISS for a class of parabolic PDEs involving sublinear terms and distributed in-domain disturbances. Since the nonlinear terms of the corresponding abstract system are not Lipschitz continuous, the well-posedness is proved based on the application of compact semigroup theory and the FTISS is assessed by using the Lyapunov method with the aid of an interpolation inequality. Numerical simulations are conducted to confirm the theoretical results.
An LMI-based Robust Fuzzy Controller for Blood Glucose Regulation in Type 1 Diabetes
This paper presents a control algorithm for creating an artificial pancreas for type 1 diabetes, factoring in input saturation for a practical application. By utilizing the parallel distributed compensation and Takagi-Sugeno Fuzzy model, we design an optimal robust fuzzy controller. Stability conditions derived from the Lyapunov method are expressed as linear matrix inequalities, allowing for optimal controller gain selection that minimizes disturbance effects. We employ the minimal Bergman and Tolic models to represent type 1 diabetes glucose-insulin dynamics, converting them into corresponding Takagi-Sugeno fuzzy models using the sector nonlinearity approach. Simulation results demonstrate the proposed controller's effectiveness.
Certifiable Deep Learning for Reachability Using a New Lipschitz Continuous Value Function
We propose a new reachability learning framework for high-dimensional nonlinear systems, focusing on reach-avoid problems. These problems require computing the reach-avoid set, which ensures that all its elements can safely reach a target set despite any disturbance within pre-specified bounds. Our framework has two main parts: offline learning of a newly designed reach-avoid value function and post-learning certification. Compared to prior works, our new value function is Lipschitz continuous and its associated Bellman operator is a contraction mapping, both of which improve the learning performance. To ensure deterministic guarantees of our learned reach-avoid set, we introduce two efficient post-learning certification methods. Both methods can be used online for real-time local certification or offline for comprehensive certification. We validate our framework in a 12-dimensional Crazyflie drone racing hardware experiment and a simulated 10-dimensional highway takeover example.
comment: Submitted, under review
Conjectural Online Learning with First-order Beliefs in Asymmetric Information Stochastic Games
Asymmetric information stochastic games (AISGs) arise in many complex socio-technical systems, such as cyber-physical systems and IT infrastructures. Existing computational methods for AISGs are primarily offline and cannot adapt to equilibrium deviations. Further, current methods are limited to particular information structures to avoid belief hierarchies. Considering these limitations, we propose conjectural online learning (COL), an online learning method under generic information structures in AISGs. COL uses a forecaster-actor-critic (FAC) architecture, where subjective forecasts are used to conjecture the opponents' strategies within a lookahead horizon, and Bayesian learning is used to calibrate the conjectures. To adapt strategies to nonstationary environments based on information feedback, COL uses online rollout with cost function approximation (actor-critic). We prove that the conjectures produced by COL are asymptotically consistent with the information feedback in the sense of a relaxed Bayesian consistency. We also prove that the empirical strategy profile induced by COL converges to the Berk-Nash equilibrium, a solution concept characterizing rationality under subjectivity. Experimental results from an intrusion response use case demonstrate COL's faster convergence over state-of-the-art reinforcement learning methods against nonstationary attacks.
comment: Accepted to the 63rd IEEE Conference on Decision and Control, Special Session on Networks, Games and Learning
Quantification of Residential Flexibility Potential using Global Forecasting Models
This paper proposes a general and practical approach to estimate the economic benefits of optimally controlling deferrable loads in a Distribution System Operator's (DSO) grid, without relying on historical observations. We achieve this by learning the simulated response of flexible loads to random control signals, using a non-parametric global forecasting model. An optimal control policy is found by including the latter in an optimization problem. We apply this method to electric water heaters and heat pumps operated through ripple control and show how flexibility, including rebound effects, can be characterized and controlled. Finally, we show that the forecaster's accuracy is sufficient to completely bypass the simulations and directly use the forecaster to estimate the economic benefit of flexibility control.
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
Sim-to-real transfer presents a difficult challenge, where models trained in simulation are to be deployed in the real world. The distribution shift between the two settings leads to biased representations of the dynamics, and thus to suboptimal predictions in the real-world environment. In this work, we tackle the challenge of sim-to-real transfer of reinforcement learning (RL) agents for coverage path planning (CPP). In CPP, the task is for a robot to find a path that covers every point of a confined area. Specifically, we consider the case where the environment is unknown, and the agent needs to plan the path online while mapping the environment. We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles to enable environment randomization and automated episode resetting. We investigate what level of fine-tuning is needed for adapting to a realistic setting, compared to an agent trained solely in simulation. We find that a high inference frequency allows first-order Markovian policies to transfer directly from simulation, while higher-order policies can be fine-tuned to further reduce the sim-to-real gap. Moreover, they can operate at a lower frequency, thus reducing computational requirements. In both cases, our approaches transfer state-of-the-art results from simulation to the real domain, where direct learning would take on the order of weeks with manual interaction, that is, it would be completely infeasible.
ForzaETH Race Stack -- Scaled Autonomous Head-to-Head Racing on Fully Commercial off-the-Shelf Hardware
Autonomous racing in robotics combines high-speed dynamics with the necessity for reliability and real-time decision-making. While such racing pushes software and hardware to their limits, many existing full-system solutions necessitate complex, custom hardware and software, and usually focus on Time-Trials rather than full unrestricted Head-to-Head racing, due to financial and safety constraints. This limits their reproducibility, making advancements and replication feasible mostly for well-resourced laboratories with comprehensive expertise in mechanical, electrical, and robotics fields. Researchers interested in the autonomy domain but with only partial experience in one of these fields need to spend significant time on familiarization and integration. The ForzaETH Race Stack addresses this gap by providing an autonomous racing software platform designed for F1TENTH, a 1:10 scaled Head-to-Head autonomous racing competition, which simplifies replication by using commercial off-the-shelf hardware. This approach enhances the competitive aspect of autonomous racing and provides an accessible platform for research and development in the field. The ForzaETH Race Stack is designed with modularity and operational ease of use in mind, allowing customization and adaptability to various environmental conditions, such as track friction and layout. Capable of handling both Time-Trials and Head-to-Head racing, the stack has demonstrated its effectiveness, robustness, and adaptability in the field by winning the official F1TENTH international competition multiple times.
comment: This paper has been accepted at the Journal of Field Robotics
Towards Near-Field 3D Spot Beamfocusing: Possibilities, Challenges, and Use-cases
Spot beamfocusing (SBF) is the process of focusing the signal power in a small spot-like region in the 3D space, which can be either hard-tuned (HT) using traditional tools like lenses and mirrors or electronically reconfigured (ER) using modern large-scale intelligent surface phased arrays. ER-SBF can be a key enabling technology (KET) for the next-generation 6G wireless networks offering benefits to many future wireless application areas such as wireless communication and security, mid-range high-power and safe wireless chargers, medical and health, physics, etc. Although near-field HT-SBF and ER-beamfocusing have been studied in the literature and applied in the industry, there is no comprehensive study of different aspects of ER-SBF and its future applications, especially for nonoptical (mmWave, sub-THz, and THz) electromagnetic waves in the next generation wireless technology, which is the aim of this paper. The theoretical concepts behind ER-SBF, different antenna technologies for implementing ER-SBF, employing machine learning (ML)-based schemes for enabling channel-state-information (CSI)-independent ER-SBF, and different practical application areas that can benefit from ER-SBF will be explored.
A new perspective on Bayesian Operational Modal Analysis
In the field of operational modal analysis (OMA), obtained modal information is frequently used to assess the current state of aerospace, mechanical, offshore and civil structures. However, the stochasticity of operational systems and the lack of forcing information can lead to inconsistent results. Quantifying the uncertainty of the recovered modal parameters through OMA is therefore of significant value. In this article, a new perspective on Bayesian OMA is proposed: a Bayesian stochastic subspace identification (SSI) algorithm. Distinct from existing approaches to Bayesian OMA, a hierarchical probabilistic model is embedded at the core of covariance-driven SSI. Through substitution of canonical correlation analysis with a Bayesian equivalent, posterior distributions over the modal properties are obtained. Two inference schemes are presented for the proposed Bayesian formulation: Markov Chain Monte Carlo and variational Bayes. Two case studies are then explored. The first is a benchmark study using data from a simulated, multi-degree-of-freedom, linear system. Following application of Bayesian SSI, it is shown that the same posterior is targeted and recovered by both inference schemes, with good agreement between the posterior mean and the conventional SSI result. The second study applies the variational form to data obtained from an in-service structure: the Z24 bridge. The results of this study are presented at single model orders, and then using a stabilisation diagram. The recovered posterior uncertainty is presented and compared to the classic SSI result. It is observed that the posterior distributions with mean values coinciding with the natural frequencies exhibit lower variance than values situated away from the natural frequencies.
Augmented LRFS-based Filter: Holistic Tracking of Group Objects
This paper addresses the problem of group target tracking (GTT), wherein multiple closely spaced targets within a group exhibit coordinated motion. To improve the tracking performance, the labeled random finite sets (LRFSs) theory is adopted, and this paper develops a new kind of LRFSs, i.e., augmented LRFSs, which introduces group information into the definition of LRFSs. Specifically, for each element in an LRFS, the kinetic states, track label, and the corresponding group information of its represented target are incorporated. Furthermore, by means of the labeled multi-Bernoulli (LMB) filter with the proposed augmented LRFSs, the group structure is iteratively propagated and updated during the tracking process, which achieves the simultaneous estimation of the kinetic states, track label, and the corresponding group information of multiple group targets, and further improves the GTT performance. Finally, simulation experiments are provided, which demonstrate the effectiveness of the labeled multi-Bernoulli filter with the proposed augmented LRFSs for GTT.
Wireless MAC Protocol Synthesis and Optimization with Multi-Agent Distributed Reinforcement Learning
In this letter, we propose a novel Multi-Agent Deep Reinforcement Learning (MADRL) framework for Medium Access Control (MAC) protocol design. Unlike centralized approaches, which rely on a single entity for decision-making, MADRL empowers individual network nodes to autonomously learn and optimize their MAC based on local observations. Leveraging ns3-ai and RLlib, our framework is, to the best of our knowledge, the first of its kind that enables distributed multi-agent learning within the ns-3 environment, facilitating the design and synthesis of adaptive MAC protocols tailored to specific environmental conditions. We demonstrate the effectiveness of the MADRL MAC framework through extensive simulations, showcasing superior performance compared to legacy protocols across diverse scenarios. Our findings highlight the potential of MADRL-based MAC protocols to significantly enhance Quality of Service (QoS) for future wireless applications.
Hybrid Semantic/Bit Communication Based Networking Problem Optimization
This paper jointly investigates user association (UA), mode selection (MS), and bandwidth allocation (BA) problems in a novel and practical next-generation cellular network where two modes of semantic communication (SemCom) and conventional bit communication (BitCom) coexist, namely hybrid semantic/bit communication network (HSB-Net). Concretely, we first identify a unified performance metric of message throughput for both SemCom and BitCom links. Next, we comprehensively develop a knowledge matching-aware two-stage tandem packet queuing model and theoretically derive the average packet loss ratio and queuing latency. Combined with several practical constraints, we then formulate a joint optimization problem for UA, MS, and BA to maximize the overall message throughput of HSB-Net. Afterward, we propose an optimal resource management strategy by employing a Lagrange primal-dual method and devising a preference list-based heuristic algorithm. Finally, numerical results validate the performance superiority of our proposed strategy compared with different benchmarks.
comment: This paper has been accepted for publication and will be presented at the 2024 IEEE Global Communications Conference (GlobeCom 2024). Copyright may be transferred without notice, after which this version may no longer be accessible. arXiv admin note: substantial text overlap with arXiv:2404.04162
PhiBE: A PDE-based Bellman Equation for Continuous Time Policy Evaluation
In this paper, we address the problem of continuous-time reinforcement learning in scenarios where the dynamics follow a stochastic differential equation. When the underlying dynamics remain unknown and we have access only to discrete-time information, how can we effectively conduct policy evaluation? We first highlight that the commonly used Bellman equation (BE) is not always a reliable approximation to the true value function. We then introduce a new Bellman equation, PhiBE, which integrates the discrete-time information into a PDE formulation. The new Bellman equation offers a more accurate approximation to the true value function, especially in scenarios where the underlying dynamics change slowly. Moreover, we extend PhiBE to higher orders, providing increasingly accurate approximations. We conduct the error analysis for both BE and PhiBE with explicit dependence on the discount coefficient, the reward, and the dynamics. Additionally, we present a model-free algorithm to solve PhiBE when only discrete-time trajectory data is available. Numerical experiments are provided to validate the theoretical guarantees we propose.
Wireless Resource Optimization in Hybrid Semantic/Bit Communication Networks
Recently, semantic communication (SemCom) has shown great potential in significant resource savings and efficient information exchanges, thus naturally introducing a novel and practical cellular network paradigm where two modes of SemCom and conventional bit communication (BitCom) coexist. Nevertheless, the involved wireless resource management becomes rather complicated and challenging, given the unique background knowledge matching and time-consuming semantic coding requirements in SemCom. To this end, this paper jointly investigates user association (UA), mode selection (MS), and bandwidth allocation (BA) problems in a hybrid semantic/bit communication network (HSB-Net). Concretely, we first identify a unified performance metric of message throughput for both SemCom and BitCom links. Next, we specially develop a knowledge matching-aware two-stage tandem packet queuing model and theoretically derive the average packet loss ratio and queuing latency. Combined with practical constraints, we then formulate a joint optimization problem for UA, MS, and BA to maximize the overall message throughput of HSB-Net. Afterward, we propose an optimal resource management strategy by utilizing a Lagrange primal-dual transformation method and a preference list-based heuristic algorithm with polynomial-time complexity. Numerical results not only demonstrate the accuracy of our analytical queuing model, but also validate the performance superiority of our proposed strategy compared with different benchmarks.
comment: This paper has been resubmitted, after completing major revisions, to IEEE Transactions on Communications for possible publication. arXiv admin note: text overlap with arXiv:2408.07820
Semi-Persistent Scheduling in NR Sidelink Mode 2: MAC Packet Reception Ratio Model and ns-3 Validation
5G New Radio (NR) Sidelink (SL) has demonstrated a promising capability for infrastructure-less cellular coverage. Understanding the fundamentals of the NR SL channel access mechanism, Semi-Persistent Scheduling (SPS), which is specified by the 3rd Generation Partnership Project (3GPP), is necessary to enhance the NR SL Packet Reception Ratio (PRR). However, most existing works fail to account for the new SPS features introduced in NR SL and may therefore be out of date for comprehensively describing the NR SL PRR. The existing models ignore the relationships between SPS parameters and, therefore, do not provide sufficient insights into the PRR of SPS. This work proposes a novel SPS PRR model incorporating MAC collisions based on new features in NR SL. We extend our model by loosening several simplifying assumptions made in our initial modeling. The extended models illustrate how the PRR is affected by various SPS parameters. The computed results are validated via simulations using the network simulator (ns-3), which provides important guidelines for future NR SL enhancement work.
comment: This work has been submitted to the IEEE for possible publication. 13 pages, 22 figures
Systems and Control (EESS)
LEAD: Towards Learning-Based Equity-Aware Decarbonization in Ridesharing Platforms
Ridesharing platforms such as Uber, Lyft, and DiDi have grown in popularity due to their on-demand availability, ease of use, and commute cost reductions, among other benefits. However, not all ridesharing promises have panned out. Recent studies demonstrate that the expected drop in traffic congestion and reduction in greenhouse gas (GHG) emissions have not materialized. This is primarily due to the substantial distances traveled by the ridesharing vehicles without passengers between rides, known as deadhead miles. Recent work has focused on reducing the impact of deadhead miles while considering additional metrics such as rider waiting time, GHG emissions from deadhead miles, or driver earnings. Unfortunately, prior studies consider these environmental and equity-based metrics individually despite them being interrelated. In this paper, we propose a Learning-based Equity-Aware Decarbonization approach, LEAD, for ridesharing platforms. LEAD targets minimizing emissions while ensuring that the driver's utility, defined as the difference between the trip distance and the deadhead miles, is fairly distributed. LEAD uses reinforcement learning to match riders to drivers based on the expected future utility of drivers and the expected carbon emissions of the platform without increasing the rider waiting times. Extensive experiments based on a real-world ride-sharing dataset show that LEAD improves fairness by 2$\times$ when compared to emission-aware ride-assignment and reduces emissions by 70% while ensuring fairness within 66% of the fair baseline. It also reduces the rider wait time, by at least 40%, compared to various baselines. Additionally, LEAD corrects the imbalance in previous emission-aware ride assignment algorithms that overassigned rides to low-emission vehicles.
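The driver utility defined in the LEAD abstract above (trip distance minus deadhead miles) can be illustrated with a minimal Python sketch. The `Trip` record, the `driver_utilities` helper, and the sample numbers are hypothetical and are not taken from the paper; the sketch only accumulates the per-driver utility and does not attempt to reproduce LEAD's reinforcement-learning matching or its fairness criterion.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    """Hypothetical record of one completed ride (not from the paper)."""
    driver_id: str
    trip_miles: float      # distance driven with the passenger on board
    deadhead_miles: float  # distance driven empty to reach the rider


def driver_utilities(trips):
    """Accumulate (trip distance - deadhead miles) for each driver."""
    totals = {}
    for t in trips:
        gain = t.trip_miles - t.deadhead_miles
        totals[t.driver_id] = totals.get(t.driver_id, 0.0) + gain
    return totals


# Illustrative data only.
trips = [
    Trip("a", 5.0, 1.0),
    Trip("b", 3.0, 2.5),
    Trip("a", 2.0, 0.5),
]
utilities = driver_utilities(trips)
print(utilities)
```

In this toy data, driver "a" accumulates far more utility than driver "b", which is the kind of imbalance a fairness-aware assignment would aim to reduce.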
Perfectly Undetectable Reflection and Scaling False Data Injection Attacks via Affine Transformation on Mobile Robot Trajectory Tracking Control
With the increasing integration of cyber-physical systems (CPS) into critical applications, ensuring their resilience against cyberattacks is paramount. A particularly concerning threat is the vulnerability of CPS to deceptive attacks that degrade system performance while remaining undetected. This paper investigates perfectly undetectable false data injection attacks (FDIAs) targeting the trajectory tracking control of a non-holonomic mobile robot. The proposed attack method utilizes affine transformations of intercepted signals, exploiting weaknesses inherent in the partially linear dynamic properties and symmetry of the nonlinear plant. The feasibility and potential impact of these attacks are validated through experiments using a Turtlebot 3 platform, highlighting the urgent need for sophisticated detection mechanisms and resilient control strategies to safeguard CPS against such threats. Furthermore, a novel approach for detection of these attacks called the state monitoring signature function (SMSF) is introduced. An example SMSF, a carefully designed function resilient to FDIA, is shown to be able to detect the presence of an FDIA through signatures based on system states.
comment: 15 pages, 17 figures. Manuscript under review for publication
General Impedance Modeling for Modular Multilevel Converter with Grid-forming and Grid-following Control
The modular multilevel converter (MMC) has a complex topology, control architecture, and broadband harmonic spectrum. For this reason, linear-time-periodic (LTP) theory, covering multi-harmonic coupling relations, has been adopted for MMC impedance modeling recently. However, the existing MMC impedance models usually lack explicit expressions and a general modeling procedure for different control strategies. To this end, this paper proposes a general impedance modeling procedure applicable to various power converters with grid-forming and grid-following control strategies. The modeling is based on a unified representation of the MMC circuit as the input-output relation between the voltage or current on the AC side and the exerted modulation index, with the control part represented as the reverse relation, so that the two are interconnected as a closed-loop feedback system. With each part expressed as transfer functions, the final impedance model keeps the explicit form of a harmonic transfer function matrix, making it convenient to directly observe and analyze the influence of each part individually. The submodule capacitance is thereby found to be the main cause of the difference between the MMC impedance and that of a two-level converter; the two become closer as the capacitance increases. The effectiveness and generality of the impedance modeling method are demonstrated through comprehensive comparison with impedance scanning using electromagnetic transient simulation.
Adaptive BESS and Grid Setpoints Optimization: A Model-Free Framework for Efficient Battery Management under Dynamic Tariff Pricing
This paper introduces an enhanced framework for managing Battery Energy Storage Systems (BESS) in residential communities. The non-convex BESS control problem is first addressed using a gradient-based optimizer, providing a benchmark solution. Subsequently, the problem is tackled using multiple Deep Reinforcement Learning (DRL) agents, with a specific emphasis on the off-policy Soft Actor-Critic (SAC) algorithm. This version of SAC incorporates reward refinement based on this non-convex problem, applying logarithmic scaling to enhance convergence rates. Additionally, a safety mechanism selects only feasible actions from the action space, aimed at improving the learning curve, accelerating convergence, and reducing computation times. Moreover, the state representation of this DRL approach now includes uncertainties quantified in the entropy term, enhancing the model's adaptability across various entropy types. This developed system adheres to strict limits on the battery's State of Charge (SOC), thus preventing breaches of SOC boundaries and extending the battery lifespan. The robustness of the model is validated across several Australian states' districts, each characterized by unique uncertainty distributions. By implementing the refined SAC, the SOC consistently surpasses 50 percent by the end of each day, enabling the BESS control to start smoothly for the next day with some reserve. Finally, this proposed DRL method achieves a mean reduction in optimization time by 50 percent and an average cost saving of 40 percent compared to the gradient-based optimization benchmark.
Minimal Sensor Placement for Generic State and Unknown Input Observability
This paper addresses the problem of selecting the minimum number of dedicated sensors to achieve observability in the presence of unknown inputs, namely, the state and input observability, for linear time-invariant systems. We assume that the only available information is the zero-nonzero structure of system matrices, and approach this problem within a structured system model. We revisit the concept of state and input observability for structured systems, providing refined necessary and sufficient conditions for placing dedicated sensors via the Dulmage-Mendelsohn decomposition. Based on these conditions, we prove that determining the minimum number of dedicated sensors to achieve generic state and input observability is NP-hard, which contrasts sharply with the polynomial-time complexity of the corresponding problem with known inputs. We also demonstrate that this problem is hard to approximate within a factor of $(1-o(1)){\rm{log}}(n)$, where $n$ is the state dimension. Notwithstanding, we propose nontrivial upper and lower bounds that can be computed in polynomial time, which confine the optimal value of this problem to an interval with length being the number of inputs. We further present a special case for which the exact optimal value can be determined in polynomial time. Additionally, we propose a two-stage algorithm to solve this problem approximately. Each stage of the algorithm is either optimal or suboptimal and can be completed in polynomial time.
comment: 12 pages, 6 figures
Qualitative properties and stability analysis of the mathematical model for a DC-DC electric circuit
This paper describes a simplified model of an electric circuit with a DC-DC converter and a PID-regulator as a system of integral differential equations with an identically singular matrix multiplying the higher derivative of the desired vector-function. We use theoretical results on integral and differential equations and their systems to prove solvability of such a model and analyze its stability.
comment: submitted to COIA-2024. arXiv admin note: text overlap with arXiv:2408.06045
ISAC-Fi: Enabling Full-fledged Monostatic Sensing over Wi-Fi Communication
Whereas Wi-Fi communications have been exploited for sensing purposes for over a decade, the bistatic or multistatic nature of Wi-Fi still poses multiple challenges, hampering real-life deployment of integrated sensing and communication (ISAC) within the Wi-Fi framework. In this paper, we aim to re-design Wi-Fi so that monostatic sensing (mimicking radar) can be achieved over the multistatic communication infrastructure. Specifically, we propose, design, and implement ISAC-Fi as an ISAC-ready Wi-Fi prototype. We first present a novel self-interference cancellation scheme, in order to extract reflected (radio frequency) signals for sensing purposes in the face of transmissions. We then subtly revise the existing Wi-Fi framework so as to seamlessly operate monostatic sensing under the Wi-Fi communication standard. Finally, we offer two ISAC-Fi designs: while a USRP-based one emulates a totally re-designed ISAC-Fi device, another plug-and-play design allows for backward compatibility by attaching an extra module to an arbitrary Wi-Fi device. We perform extensive experiments to validate the efficacy of ISAC-Fi and also to demonstrate its superiority over existing Wi-Fi sensing proposals.
comment: 14 pages, 22 figures
Neural Horizon Model Predictive Control -- Increasing Computational Efficiency with Neural Networks
The expansion in automation of increasingly fast applications and low-power edge devices poses a particular challenge for optimization based control algorithms, like model predictive control. Our proposed machine-learning supported approach addresses this by utilizing a feed-forward neural network to reduce the computation load of the online-optimization. We propose approximating part of the problem horizon, while maintaining safety guarantees -- constraint satisfaction -- via the remaining optimization part of the controller. The approach is validated in simulation, demonstrating an improvement in computational efficiency, while maintaining guarantees and near-optimal performance. The proposed MPC scheme can be applied to a wide range of applications, including those requiring a rapid control response, such as robotics and embedded applications with limited computational resources.
comment: 6 pages, 4 figures, 4 tables, American Control Conference (ACC) 2024
Self-Refined Generative Foundation Models for Wireless Traffic Prediction
With a broad range of emerging applications in 6G networks, wireless traffic prediction has become a critical component of network management. However, the dynamically shifting distribution of wireless traffic in non-stationary 6G networks presents significant challenges to achieving accurate and stable predictions. Motivated by recent advancements in Generative AI (GAI)-enabled 6G networks, this paper proposes a novel self-refined Large Language Model (LLM) for wireless traffic prediction, namely TrafficLLM, through in-context learning without parameter fine-tuning or model training. The proposed TrafficLLM harnesses the powerful few-shot learning abilities of LLMs to enhance the scalability of traffic prediction in dynamically changing wireless environments. Specifically, our proposed TrafficLLM embraces an LLM to iteratively refine its predictions through a three-step process: traffic prediction, feedback generation, and prediction refinement. Initially, the proposed TrafficLLM conducts traffic predictions using task-specific demonstration prompts. Recognizing that LLMs may generate incorrect predictions on the first attempt, we subsequently incorporate feedback demonstration prompts designed to provide multifaceted and valuable feedback related to these initial predictions. Following this comprehensive feedback, our proposed TrafficLLM introduces refinement demonstration prompts, enabling the same LLM to further refine its predictions and thereby enhance prediction performance. The evaluations on two realistic datasets demonstrate that the proposed TrafficLLM outperforms state-of-the-art methods with performance improvements of 23.17% and 17.09%, respectively.
Finite-time input-to-state stability for infinite-dimensional systems
In this paper, we extend the notion of finite-time input-to-state stability (FTISS) for finite-dimensional systems to infinite-dimensional systems. More specifically, we first prove an FTISS Lyapunov theorem for a class of infinite-dimensional systems, namely, the existence of an FTISS Lyapunov functional (FTISS-LF) implies the FTISS of the system, and then, provide a sufficient condition for ensuring the existence of an FTISS-LF for a class of abstract infinite-dimensional systems under the framework of compact semigroup theory and Hilbert spaces. As an application of the FTISS Lyapunov theorem, we verify the FTISS for a class of parabolic PDEs involving sublinear terms and distributed in-domain disturbances. Since the nonlinear terms of the corresponding abstract system are not Lipschitz continuous, the well-posedness is proved based on the application of compact semigroup theory and the FTISS is assessed by using the Lyapunov method with the aid of an interpolation inequality. Numerical simulations are conducted to confirm the theoretical results.
An LMI-based Robust Fuzzy Controller for Blood Glucose Regulation in Type 1 Diabetes
This paper presents a control algorithm for creating an artificial pancreas for type 1 diabetes, factoring in input saturation for a practical application. By utilizing the parallel distributed compensation and Takagi-Sugeno Fuzzy model, we design an optimal robust fuzzy controller. Stability conditions derived from the Lyapunov method are expressed as linear matrix inequalities, allowing for optimal controller gain selection that minimizes disturbance effects. We employ the minimal Bergman and Tolic models to represent type 1 diabetes glucose-insulin dynamics, converting them into corresponding Takagi-Sugeno fuzzy models using the sector nonlinearity approach. Simulation results demonstrate the proposed controller's effectiveness.
Certifiable Deep Learning for Reachability Using a New Lipschitz Continuous Value Function
We propose a new reachability learning framework for high-dimensional nonlinear systems, focusing on reach-avoid problems. These problems require computing the reach-avoid set, which ensures that all its elements can safely reach a target set despite any disturbance within pre-specified bounds. Our framework has two main parts: offline learning of a newly designed reach-avoid value function and post-learning certification. Compared to prior works, our new value function is Lipschitz continuous and its associated Bellman operator is a contraction mapping, both of which improve the learning performance. To ensure deterministic guarantees of our learned reach-avoid set, we introduce two efficient post-learning certification methods. Both methods can be used online for real-time local certification or offline for comprehensive certification. We validate our framework in a 12-dimensional Crazyflie drone racing hardware experiment and a simulated 10-dimensional highway takeover example.
comment: Submitted, under review
Conjectural Online Learning with First-order Beliefs in Asymmetric Information Stochastic Games
Asymmetric information stochastic games (AISGs) arise in many complex socio-technical systems, such as cyber-physical systems and IT infrastructures. Existing computational methods for AISGs are primarily offline and cannot adapt to equilibrium deviations. Further, current methods are limited to particular information structures to avoid belief hierarchies. Considering these limitations, we propose conjectural online learning (COL), an online learning method under generic information structures in AISGs. COL uses a forecaster-actor-critic (FAC) architecture, where subjective forecasts are used to conjecture the opponents' strategies within a lookahead horizon, and Bayesian learning is used to calibrate the conjectures. To adapt strategies to nonstationary environments based on information feedback, COL uses online rollout with cost function approximation (actor-critic). We prove that the conjectures produced by COL are asymptotically consistent with the information feedback in the sense of a relaxed Bayesian consistency. We also prove that the empirical strategy profile induced by COL converges to the Berk-Nash equilibrium, a solution concept characterizing rationality under subjectivity. Experimental results from an intrusion response use case demonstrate COL's faster convergence over state-of-the-art reinforcement learning methods against nonstationary attacks.
comment: Accepted to the 63rd IEEE Conference on Decision and Control, Special Session on Networks, Games and Learning
Quantification of Residential Flexibility Potential using Global Forecasting Models
This paper proposes a general and practical approach to estimate the economic benefits of optimally controlling deferrable loads in a Distribution System Operator's (DSO) grid, without relying on historical observations. We achieve this by learning the simulated response of flexible loads to random control signals, using a non-parametric global forecasting model. An optimal control policy is found by including the latter in an optimization problem. We apply this method to electric water heaters and heat pumps operated through ripple control and show how flexibility, including rebound effects, can be characterized and controlled. Finally, we show that the forecaster's accuracy is sufficient to completely bypass the simulations and directly use the forecaster to estimate the economic benefit of flexibility control.
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
Sim-to-real transfer presents a difficult challenge, where models trained in simulation are to be deployed in the real world. The distribution shift between the two settings leads to biased representations of the dynamics, and thus to suboptimal predictions in the real-world environment. In this work, we tackle the challenge of sim-to-real transfer of reinforcement learning (RL) agents for coverage path planning (CPP). In CPP, the task is for a robot to find a path that covers every point of a confined area. Specifically, we consider the case where the environment is unknown, and the agent needs to plan the path online while mapping the environment. We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles to enable environment randomization and automated episode resetting. We investigate what level of fine-tuning is needed for adapting to a realistic setting, compared to an agent trained solely in simulation. We find that a high inference frequency allows first-order Markovian policies to transfer directly from simulation, while higher-order policies can be fine-tuned to further reduce the sim-to-real gap. Moreover, they can operate at a lower frequency, thus reducing computational requirements. In both cases, our approaches transfer state-of-the-art results from simulation to the real domain, where direct learning would take on the order of weeks with manual interaction, that is, it would be completely infeasible.
ForzaETH Race Stack -- Scaled Autonomous Head-to-Head Racing on Fully Commercial off-the-Shelf Hardware
Autonomous racing in robotics combines high-speed dynamics with the necessity for reliability and real-time decision-making. While such racing pushes software and hardware to their limits, many existing full-system solutions necessitate complex, custom hardware and software, and usually focus on Time-Trials rather than full unrestricted Head-to-Head racing, due to financial and safety constraints. This limits their reproducibility, making advancements and replication feasible mostly for well-resourced laboratories with comprehensive expertise in mechanical, electrical, and robotics fields. Researchers interested in the autonomy domain but with only partial experience in one of these fields need to spend significant time on familiarization and integration. The ForzaETH Race Stack addresses this gap by providing an autonomous racing software platform designed for F1TENTH, a 1:10 scaled Head-to-Head autonomous racing competition, which simplifies replication by using commercial off-the-shelf hardware. This approach enhances the competitive aspect of autonomous racing and provides an accessible platform for research and development in the field. The ForzaETH Race Stack is designed with modularity and operational ease of use in mind, allowing customization and adaptability to various environmental conditions, such as track friction and layout. Capable of handling both Time-Trials and Head-to-Head racing, the stack has demonstrated its effectiveness, robustness, and adaptability in the field by winning the official F1TENTH international competition multiple times.
comment: This paper has been accepted at the Journal of Field Robotics
Towards Near-Field 3D Spot Beamfocusing: Possibilities, Challenges, and Use-cases
Spot beamfocusing (SBF) is the process of focusing the signal power in a small spot-like region in the 3D space, which can be either hard-tuned (HT) using traditional tools like lenses and mirrors or electronically reconfigured (ER) using modern large-scale intelligent surface phased arrays. ER-SBF can be a key enabling technology (KET) for the next-generation 6G wireless networks offering benefits to many future wireless application areas such as wireless communication and security, mid-range high-power and safe wireless chargers, medical and health, physics, etc. Although near-field HT-SBF and ER-beamfocusing have been studied in the literature and applied in the industry, there is no comprehensive study of different aspects of ER-SBF and its future applications, especially for nonoptical (mmWave, sub-THz, and THz) electromagnetic waves in the next generation wireless technology, which is the aim of this paper. The theoretical concepts behind ER-SBF, different antenna technologies for implementing ER-SBF, employing machine learning (ML)-based schemes for enabling channel-state-information (CSI)-independent ER-SBF, and different practical application areas that can benefit from ER-SBF will be explored.
A new perspective on Bayesian Operational Modal Analysis
In the field of operational modal analysis (OMA), obtained modal information is frequently used to assess the current state of aerospace, mechanical, offshore and civil structures. However, the stochasticity of operational systems and the lack of forcing information can lead to inconsistent results. Quantifying the uncertainty of the recovered modal parameters through OMA is therefore of significant value. In this article, a new perspective on Bayesian OMA is proposed: a Bayesian stochastic subspace identification (SSI) algorithm. Distinct from existing approaches to Bayesian OMA, a hierarchical probabilistic model is embedded at the core of covariance-driven SSI. Through substitution of canonical correlation analysis with a Bayesian equivalent, posterior distributions over the modal properties are obtained. Two inference schemes are presented for the proposed Bayesian formulation: Markov Chain Monte Carlo and variational Bayes. Two case studies are then explored. The first is a benchmark study using data from a simulated, multi degree-of-freedom, linear system. Following application of Bayesian SSI, it is shown that the same posterior is targeted and recovered by both inference schemes, with good agreement between the posterior mean and the conventional SSI result. The second study applies the variational form to data obtained from an in-service structure: the Z24 bridge. The results of this study are presented at single model orders, and then using a stabilisation diagram. The recovered posterior uncertainty is presented and compared to the classic SSI result. It is observed that the posterior distributions with mean values coinciding with the natural frequencies exhibit lower variance than values situated away from the natural frequencies.
Augmented LRFS-based Filter: Holistic Tracking of Group Objects
This paper addresses the problem of group target tracking (GTT), wherein multiple closely spaced targets within a group exhibit coordinated motion. To improve the tracking performance, the labeled random finite sets (LRFSs) theory is adopted, and this paper develops a new kind of LRFSs, i.e., augmented LRFSs, which introduces group information into the definition of LRFSs. Specifically, for each element in an LRFS, the kinetic states, track label, and the corresponding group information of its represented target are incorporated. Furthermore, by means of the labeled multi-Bernoulli (LMB) filter with the proposed augmented LRFSs, the group structure is iteratively propagated and updated during the tracking process, which achieves the simultaneous estimation of the kinetic states, track label, and the corresponding group information of multiple group targets, and further improves the GTT performance. Finally, simulation experiments are provided, which demonstrate the effectiveness of the LMB filter with the proposed augmented LRFSs for GTT.
Wireless MAC Protocol Synthesis and Optimization with Multi-Agent Distributed Reinforcement Learning
In this letter, we propose a novel Multi-Agent Deep Reinforcement Learning (MADRL) framework for Medium Access Control (MAC) protocol design. Unlike centralized approaches, which rely on a single entity for decision-making, MADRL empowers individual network nodes to autonomously learn and optimize their MAC based on local observations. Leveraging ns3-ai and RLlib, our framework is, to the best of our knowledge, the first of its kind to enable distributed multi-agent learning within the ns-3 environment, facilitating the design and synthesis of adaptive MAC protocols tailored to specific environmental conditions. We demonstrate the effectiveness of the MADRL MAC framework through extensive simulations, showcasing superior performance compared to legacy protocols across diverse scenarios. Our findings highlight the potential of MADRL-based MAC protocols to significantly enhance Quality of Service (QoS) for future wireless applications.
Hybrid Semantic/Bit Communication Based Networking Problem Optimization
This paper jointly investigates user association (UA), mode selection (MS), and bandwidth allocation (BA) problems in a novel and practical next-generation cellular network where two modes of semantic communication (SemCom) and conventional bit communication (BitCom) coexist, namely hybrid semantic/bit communication network (HSB-Net). Concretely, we first identify a unified performance metric of message throughput for both SemCom and BitCom links. Next, we comprehensively develop a knowledge matching-aware two-stage tandem packet queuing model and theoretically derive the average packet loss ratio and queuing latency. Combined with several practical constraints, we then formulate a joint optimization problem for UA, MS, and BA to maximize the overall message throughput of HSB-Net. Afterward, we propose an optimal resource management strategy by employing a Lagrange primal-dual method and devising a preference list-based heuristic algorithm. Finally, numerical results validate the performance superiority of our proposed strategy compared with different benchmarks.
comment: This paper has been accepted for publication and will be presented in 2024 IEEE Global Communications Conference (GlobeCom 2024). Copyright may be transferred without notice, after which this version may no longer be accessible. arXiv admin note: substantial text overlap with arXiv:2404.04162
PhiBE: A PDE-based Bellman Equation for Continuous Time Policy Evaluation
In this paper, we address the problem of continuous-time reinforcement learning in scenarios where the dynamics follow a stochastic differential equation. When the underlying dynamics remain unknown and we have access only to discrete-time information, how can we effectively conduct policy evaluation? We first highlight that the commonly used Bellman equation (BE) is not always a reliable approximation to the true value function. We then introduce a new Bellman equation, PhiBE, which integrates the discrete-time information into a PDE formulation. The new Bellman equation offers a more accurate approximation to the true value function, especially in scenarios where the underlying dynamics change slowly. Moreover, we extend PhiBE to higher orders, providing increasingly accurate approximations. We conduct the error analysis for both BE and PhiBE with explicit dependence on the discount coefficient, the reward and the dynamics. Additionally, we present a model-free algorithm to solve PhiBE when only discrete-time trajectory data is available. Numerical experiments are provided to validate the theoretical guarantees we propose.
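To make the contrast between the two equations concrete, the following is a sketch in our own notation (the symbols $\beta$, $\mu$, $\sigma$, $\Delta t$ and the hatted estimates are assumptions for illustration, not taken from the abstract). The first two displays are the standard continuous-time value function and its PDE characterization, and the usual discrete-time Bellman equation. The last display is only one plausible reading of "integrating discrete-time information into a PDE formulation" and should not be taken as the authors' exact definition.

```latex
% Continuous-time policy evaluation under ds_t = \mu(s_t)\,dt + \sigma(s_t)\,dB_t
V(s) = \mathbb{E}\!\left[\int_0^\infty e^{-\beta t}\, r(s_t)\,dt \;\middle|\; s_0 = s\right],
\qquad
\beta V(s) = r(s) + \mu(s)\cdot\nabla V(s)
           + \tfrac{1}{2}\,\sigma(s)\sigma(s)^{\top} : \nabla^2 V(s).

% Commonly used discrete-time Bellman equation with step \Delta t
\tilde V(s) = r(s)\,\Delta t
            + e^{-\beta \Delta t}\,\mathbb{E}\!\left[\tilde V(s_{\Delta t}) \;\middle|\; s_0 = s\right].

% ASSUMED first-order PDE-based variant: drift and diffusion replaced by
% finite-difference estimates that use only discrete-time transitions
\beta \hat V(s) = r(s) + \hat\mu(s)\cdot\nabla \hat V(s)
                + \tfrac{1}{2}\,\hat\Sigma(s) : \nabla^2 \hat V(s),
\quad
\hat\mu(s) = \tfrac{1}{\Delta t}\,\mathbb{E}\!\left[s_{\Delta t} - s_0 \mid s_0 = s\right],
\quad
\hat\Sigma(s) = \tfrac{1}{\Delta t}\,\mathbb{E}\!\left[(s_{\Delta t} - s_0)(s_{\Delta t} - s_0)^{\top} \mid s_0 = s\right].
```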
Wireless Resource Optimization in Hybrid Semantic/Bit Communication Networks
Recently, semantic communication (SemCom) has shown great potential in significant resource savings and efficient information exchanges, thus naturally introducing a novel and practical cellular network paradigm where two modes of SemCom and conventional bit communication (BitCom) coexist. Nevertheless, the involved wireless resource management becomes rather complicated and challenging, given the unique background knowledge matching and time-consuming semantic coding requirements in SemCom. To this end, this paper jointly investigates user association (UA), mode selection (MS), and bandwidth allocation (BA) problems in a hybrid semantic/bit communication network (HSB-Net). Concretely, we first identify a unified performance metric of message throughput for both SemCom and BitCom links. Next, we specially develop a knowledge matching-aware two-stage tandem packet queuing model and theoretically derive the average packet loss ratio and queuing latency. Combined with practical constraints, we then formulate a joint optimization problem for UA, MS, and BA to maximize the overall message throughput of HSB-Net. Afterward, we propose an optimal resource management strategy by utilizing a Lagrange primal-dual transformation method and a preference list-based heuristic algorithm with polynomial-time complexity. Numerical results not only demonstrate the accuracy of our analytical queuing model, but also validate the performance superiority of our proposed strategy compared with different benchmarks.
comment: This paper has been resubmitted, after completing major revisions, to IEEE Transactions on Communications for possible publication. arXiv admin note: text overlap with arXiv:2408.07820
Semi-Persistent Scheduling in NR Sidelink Mode 2: MAC Packet Reception Ratio Model and ns-3 Validation
5G New Radio (NR) Sidelink (SL) has demonstrated the promising capability for infrastructure-less cellular coverage. Understanding the fundamentals of the NR SL channel access mechanism, Semi-Persistent Scheduling (SPS), which is specified by the 3rd Generation Partnership Project (3GPP), is a necessity to enhance the NR SL Packet Reception Ratio (PRR). However, most existing works fail to account for the new SPS features introduced in NR SL, which might be out-of-date for comprehensively describing the NR SL PRR. The existing models ignore the relationships between SPS parameters and, therefore, do not provide sufficient insights into the PRR of SPS. This work proposes a novel SPS PRR model incorporating MAC collisions based on new features in NR SL. We extend our model by loosening several simplifying assumptions made in our initial modeling. The extended models illustrate how the PRR is affected by various SPS parameters. The computed results are validated via simulations using the network simulator (ns-3), which provides important guidelines for future NR SL enhancement work.
comment: This work has been submitted to the IEEE for possible publication. 13 pages, 22 figures
Robotics
ContactSDF: Signed Distance Functions as Multi-Contact Models for Dexterous Manipulation
In this paper, we propose ContactSDF, a method that uses signed distance functions (SDFs) to approximate multi-contact models, including both collision detection and time-stepping routines. ContactSDF first establishes an SDF using the supporting plane representation of an object for collision detection, and then uses the generated contact dual cones to build a second SDF for time-stepping prediction of the next state. These two SDFs create a differentiable and closed-form multi-contact dynamic model for state prediction, enabling efficient model learning and optimization for contact-rich manipulation. We perform extensive simulation experiments to show the effectiveness of ContactSDF for model learning and real-time control of dexterous manipulation. We further evaluate ContactSDF on a hardware Allegro hand for on-palm reorientation tasks. Results show that, with around 2 minutes of learning on hardware, ContactSDF achieves high-quality dexterous manipulation at a frequency of 30-60 Hz.
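As a rough illustration of a supporting-plane SDF (a minimal sketch, not the ContactSDF implementation), a convex object given by half-spaces n_i . x <= d_i with unit normals has a simple closed form: the maximum plane offset. That maximum equals the exact signed distance inside the object and is a lower bound on the distance outside it. A log-sum-exp smoothing makes it differentiable. The function names, the box example and the smoothing parameter `beta` are all assumptions made for this sketch.

```python
import math

# Axis-aligned box [-1, 1]^2 described by four supporting planes (illustrative data).
BOX_NORMALS = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
BOX_OFFSETS = [1.0, 1.0, 1.0, 1.0]


def plane_values(point, normals, offsets):
    """Signed offset n . x - d of the point from every supporting plane."""
    return [sum(n_k * x_k for n_k, x_k in zip(n, point)) - d
            for n, d in zip(normals, offsets)]


def plane_sdf(point, normals, offsets):
    """Max over planes: negative inside, zero on the boundary, positive outside."""
    return max(plane_values(point, normals, offsets))


def smooth_plane_sdf(point, normals, offsets, beta=50.0):
    """Differentiable log-sum-exp approximation of the max (beta is hypothetical)."""
    vals = plane_values(point, normals, offsets)
    m = max(vals)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(beta * (v - m)) for v in vals)) / beta
```

The smoothed value always lies between the hard maximum and the hard maximum plus log(number of planes) / beta, so a larger `beta` tightens the approximation at the cost of sharper gradients.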
HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model
Large Language Model (LLM)-based agents exhibit significant potential across various domains, operating as interactive systems that process environmental observations to generate executable actions for target tasks. The effectiveness of these agents is significantly influenced by their memory mechanism, which records historical experiences as sequences of action-observation pairs. We categorize memory into two types: cross-trial memory, accumulated across multiple attempts, and in-trial memory (working memory), accumulated within a single attempt. While considerable research has optimized performance through cross-trial memory, the enhancement of agent performance through improved working memory utilization remains underexplored. Instead, existing approaches often involve directly inputting entire historical action-observation pairs into LLMs, leading to redundancy in long-horizon tasks. Inspired by human problem-solving strategies, this paper introduces HiAgent, a framework that leverages subgoals as memory chunks to manage the working memory of LLM-based agents hierarchically. Specifically, HiAgent prompts LLMs to formulate subgoals before generating executable actions and enables LLMs to decide proactively to replace previous subgoals with summarized observations, retaining only the action-observation pairs relevant to the current subgoal. Experimental results across five long-horizon tasks demonstrate that HiAgent achieves a twofold increase in success rate and reduces the average number of steps required by 3.8. Additionally, our analysis shows that HiAgent consistently improves performance across various steps, highlighting its robustness and generalizability. Project Page: https://github.com/HiAgent2024/HiAgent .
comment: Project Page: https://github.com/HiAgent2024/HiAgent
Swift Trust in Mobile Ad Hoc Human-Robot Teams
Integrating robots into teams of humans is anticipated to bring significant capability improvements for tasks such as searching potentially hazardous buildings. Trust between humans and robots is recognized as a key enabler for human-robot teaming (HRT) activity: if trust during a mission falls below sufficient levels for cooperative tasks to be completed, it could critically affect success. Changes in trust could be particularly problematic in teams that have formed on an ad hoc basis (as might be expected in emergency situations) where team members may not have previously worked together. In such ad hoc teams, a foundational level of 'swift trust' may be fragile and challenging to sustain in the face of inevitable setbacks. We present results of an experiment focused on understanding trust building, violation and repair processes in ad hoc teams (one human and two robots). Trust violation occurred through robots becoming unresponsive, with limited communication and feedback. We perform exploratory analysis of a variety of data, including communications and performance logs, trust surveys and post-experiment interviews, toward understanding how autonomous systems can be designed into interdependent ad hoc human-robot teams where swift trust can be sustained.
Design and Experimental Study of Vacuum Suction Grabbing Technology to Grasp Fabric Piece
The primary objective of this study was to design a grabbing technique, and to determine the vacuum suction gripper and its design parameters, for the pocket welting operation in apparel manufacturing. It presents the application of vacuum suction in grabbing technology, a technique that has revolutionized the handling and grasping of various fabric materials across the garment industry. Vacuum suction, being non-intrusive and non-invasive, offers several advantages compared to traditional grabbing methods. It is particularly useful in scenarios where soft woven fabric and air-impermeable fabric items need to be handled with utmost care. The paper delves into the working principles of vacuum suction, its various components, and the underlying physics involved. Furthermore, it explores the various applications of vacuum suction for automation in the garment industry. The paper also highlights the challenges and limitations of vacuum suction technology and suggests potential areas for further research and development.
comment: 9 Pages, 3 figures, 6 diagrams, 1 table
Towards Safe and Robust Autonomous Vehicle Platooning: A Self-Organizing Cooperative Control Framework
In the emerging hybrid traffic flow environment, which includes both human-driven vehicles (HDVs) and autonomous vehicles (AVs), ensuring safe and robust decision-making and control is crucial for the effective operation of autonomous vehicle platooning. Current systems for cooperative adaptive cruise control and lane changing are inadequate in responding to real-world emergency situations, limiting the potential of autonomous vehicle platooning technology. To address the aforementioned challenges, we propose a Twin-World Safety-Enhanced Data-Model-Knowledge Hybrid-Driven autonomous vehicle platooning Cooperative Control Framework. Within this framework, a deep reinforcement learning formation decision model integrating traffic priors is designed, and a twin-world deduction model based on safety priority judgment is proposed. Subsequently, an optimal control-based multi-scenario decision-control right adaptive switching mechanism is designed to achieve adaptive switching between data-driven and model-driven methods. Through simulation experiments and hardware-in-loop tests, our algorithm has demonstrated excellent performance in terms of safety, robustness, and flexibility. A detailed account of the validation results for the model can be found at https://perfectxu88.github.io/towardssafeandrobust.github.io/.
Behavioral Learning of Dish Rinsing and Scrubbing based on Interruptive Direct Teaching Considering Assistance Rate
Robots are expected to manipulate objects in a safe and dexterous way. For example, washing dishes is a dexterous operation that involves scrubbing the dishes with a sponge and rinsing them with water. It is necessary to learn it safely without splashing water and without dropping the dishes. In this study, we propose a safe and dexterous manipulation system. The robot learns a dynamics model of the object by estimating the state of the object and the robot itself, the control input, and the amount of human assistance required (assistance rate) after the human corrects the initial trajectory of the robot's hands by interruptive direct teaching. By backpropagating the error between the estimated and the reference value using the acquired dynamics model, the robot can generate a control input that approaches the reference value, for example, so that human assistance is not required and the dish does not move excessively. This allows for adaptive rinsing and scrubbing of dishes with unknown shapes and properties. As a result, it is possible to generate safe actions that require less human assistance.
comment: Accepted at Advanced Robotics
Complementarity-Free Multi-Contact Modeling and Optimization for Dexterous Manipulation
A significant barrier preventing model-based methods from matching the high performance of reinforcement learning in dexterous manipulation is the inherent complexity of multi-contact dynamics. Traditionally formulated using complementarity models, multi-contact dynamics introduces combinatorial complexity and non-smoothness, complicating contact-rich planning and control. In this paper, we circumvent these challenges by introducing a novel, simplified multi-contact model. Our new model, derived from the duality of optimization-based contact models, dispenses with the complementarity constructs entirely, providing computational advantages such as explicit time stepping, differentiability, automatic satisfaction of Coulomb friction law, and minimal hyperparameter tuning. We demonstrate the effectiveness and efficiency of the model for planning and control in a range of challenging dexterous manipulation tasks, including fingertip 3D in-air manipulation, TriFinger in-hand manipulation, and Allegro hand on-palm reorientation, all with diverse objects. Our method consistently achieves state-of-the-art results: (I) a 96.5% average success rate across tasks, (II) high manipulation accuracy with an average reorientation error of 11° and position error of 7.8 mm, and (III) model predictive control running at 50-100 Hz for all tested dexterous manipulation tasks. These results are achieved with minimal hyperparameter tuning.
comment: Video demo: https://youtu.be/NsL4hbSXvFg
EqNIO: Subequivariant Neural Inertial Odometry
Neural networks are seeing rapid adoption in purely inertial odometry, where accelerometer and gyroscope measurements from commodity inertial measurement units (IMU) are used to regress displacements and associated uncertainties. They can learn informative displacement priors, which can be directly fused with the raw data with off-the-shelf non-linear filters. Nevertheless, these networks do not consider the physical roto-reflective symmetries inherent in IMU data, leading to the need to memorize the same priors for every possible motion direction, which hinders generalization. In this work, we characterize these symmetries and show that the IMU data and the resulting displacement and covariance transform equivariantly, when rotated around the gravity vector and reflected with respect to arbitrary planes parallel to gravity. We design a neural network that respects these symmetries by design through equivariant processing in three steps: First, it estimates an equivariant gravity-aligned frame from equivariant vectors and invariant scalars derived from IMU data, leveraging expressive linear and non-linear layers tailored to commute with the underlying symmetry transformation. We then map the IMU data into this frame, thereby achieving an invariant canonicalization that can be directly used with off-the-shelf inertial odometry networks. Finally, we map these network outputs back into the original frame, thereby obtaining equivariant covariances and displacements. We demonstrate the generality of our framework by applying it to the filter-based approach based on TLIO, and the end-to-end RONIN architecture, and show better performance on the TLIO, Aria, RIDI and OxIOD datasets than existing methods.
comment: 27 pages
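The three-step canonicalization described in the EqNIO abstract (estimate an equivariant frame, map the data into it, map the outputs back) can be illustrated with a toy sketch. The code below is restricted to 2D vectors in the horizontal plane and to rotations about the gravity axis; it does not cover reflections or covariances. The frame estimator `estimate_yaw` and the stand-in `black_box_displacement` are hypothetical and are not the networks used in the paper. The point is only that wrapping any function this way makes the composite rotation-equivariant.

```python
import math


def rotate(v, theta):
    """Rotate a 2D vector by theta radians about the (vertical) gravity axis."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def estimate_yaw(samples):
    """Hypothetical equivariant frame estimate: heading of the mean horizontal vector.

    Rotating every sample by theta shifts this estimate by theta (mod 2*pi).
    Assumes the mean vector is not zero.
    """
    mean_x = sum(x for x, _ in samples) / len(samples)
    mean_y = sum(y for _, y in samples) / len(samples)
    return math.atan2(mean_y, mean_x)


def black_box_displacement(samples):
    """Stand-in for an off-the-shelf odometry network; deliberately NOT equivariant."""
    return (sum(x * x for x, _ in samples),
            sum(abs(y) + 0.1 * x for x, y in samples))


def canonicalized_displacement(samples):
    """Step 1: estimate frame. Step 2: canonicalize input. Step 3: map output back."""
    yaw = estimate_yaw(samples)
    canonical = [rotate(v, -yaw) for v in samples]  # invariant representation
    output = black_box_displacement(canonical)      # any model can be used here
    return rotate(output, yaw)                      # equivariant output
```

Because the canonical input is unchanged when the raw data are rotated, the inner model never has to learn the same motion prior for every heading, which is the generalization argument the abstract makes.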
Using Implicit Behavior Cloning and Dynamic Movement Primitive to Facilitate Reinforcement Learning for Robot Motion Planning
Reinforcement learning (RL) for motion planning of multi-degree-of-freedom robots still suffers from low efficiency in terms of slow training speed and poor generalizability. In this paper, we propose a novel RL-based robot motion planning framework that uses implicit behavior cloning (IBC) and dynamic movement primitive (DMP) to improve the training speed and generalizability of an off-policy RL agent. IBC utilizes human demonstration data to leverage the training speed of RL, and DMP serves as a heuristic model that transfers motion planning into a simpler planning space. To support this, we also create a human demonstration dataset using a pick-and-place experiment that can be used for similar studies. Comparison studies in simulation reveal the advantage of the proposed method over the conventional RL agents with faster training speed and higher scores. A real-robot experiment indicates the applicability of the proposed method to a simple assembly task. Our work provides a novel perspective on using motion primitives and human demonstration to leverage the performance of RL for robot applications.
Human Orientation Estimation under Partial Observation IROS 2024
Reliable Human Orientation Estimation (HOE) from a monocular image is critical for autonomous agents to understand human intention. Significant progress has been made in HOE under full observation. However, the existing methods easily make a wrong prediction under partial observation and give it an unexpectedly high confidence. To solve the above problems, this study first develops a method called Part-HOE that estimates orientation from the visible joints of a target person so that it is able to handle partial observation. Subsequently, we introduce a confidence-aware orientation estimation method, enabling more accurate orientation estimation and reasonable confidence estimation under partial observation. The effectiveness of our method is validated on both public and custom-built datasets, and it shows great accuracy and reliability improvement in partial observation scenarios. In particular, we show in real experiments that our method can benefit the robustness and consistency of the Robot Person Following (RPF) task.
comment: Accepted by IROS 2024
Expectable Motion Unit: Avoiding Hazards From Human Involuntary Motions in Human-Robot Interaction
In robotics, many control and planning schemes have been developed to ensure human physical safety in human-robot interaction. The human psychological state and the expectation towards the robot, however, are typically neglected. Even if the robot behaviour is regarded as biomechanically safe, humans may still react with a rapid involuntary motion (IM) caused by a startle or surprise. Such sudden, uncontrolled motions can jeopardize safety and should be prevented by any means. In this letter, we propose the Expectable Motion Unit (EMU), which ensures that a certain probability of IM occurrence is not exceeded in a typical HRI setting. Based on a model of IM occurrence generated through an experiment with 29 participants, we establish the mapping between robot velocity, robot-human distance, and the relative frequency of IM occurrence. This mapping is processed towards a real-time capable robot motion generator that limits the robot velocity during task execution if necessary. The EMU is combined in a holistic safety framework that integrates both the physical and psychological safety knowledge. A validation experiment showed that the EMU successfully avoids human IM in five out of six cases.
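The idea of capping robot velocity so that a target probability of involuntary motion (IM) is not exceeded can be sketched as follows. This is a minimal illustration under strong assumptions: the logistic `im_probability` model and all of its coefficients are invented placeholders, not the mapping fitted from the 29-participant experiment, and the function names are hypothetical. The only property the sketch relies on is that IM likelihood increases with velocity at a fixed distance.

```python
import math


def im_probability(velocity, distance):
    """HYPOTHETICAL stand-in model: IM likelihood rises with speed, falls with distance."""
    return 1.0 / (1.0 + math.exp(-(4.0 * velocity - 3.0 * distance)))


def max_safe_velocity(distance, p_max, v_limit=2.0, iterations=60):
    """Largest velocity in [0, v_limit] whose modelled IM probability is <= p_max.

    Uses bisection, which is valid because the model is monotone in velocity.
    Returns 0.0 when even a stationary robot exceeds p_max at this distance.
    """
    if im_probability(v_limit, distance) <= p_max:
        return v_limit
    if im_probability(0.0, distance) > p_max:
        return 0.0
    low, high = 0.0, v_limit
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if im_probability(mid, distance) <= p_max:
            low = mid
        else:
            high = mid
    return low


def limit_velocity(commanded, distance, p_max=0.1):
    """Clamp a commanded speed so the modelled IM probability stays below p_max."""
    return min(commanded, max_safe_velocity(distance, p_max))
```

With a real, experimentally fitted mapping in place of the placeholder model, the same clamp could sit between a task-level planner and the low-level controller, leaving slow motions untouched and only scaling down those predicted to startle.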
Towards Safe Robot Use with Edged or Pointed Objects: A Surrogate Study Assembling a Human Hand Injury Protection Database
The use of pointed or edged tools or objects is one of the most challenging aspects of today's application of physical human-robot interaction (pHRI). One reason for this is that the severity of harm caused by such edged or pointed impactors is less well studied than for blunt impactors. Consequently, the standards specify well-reasoned force and pressure thresholds for blunt impactors and advise avoiding any edges and corners in contacts. Nevertheless, pointed or edged impactor geometries cannot be completely ruled out in real pHRI applications. For example, to allow edged or pointed tools such as screwdrivers near human operators, the knowledge of injury severity needs to be extended so that robot integrators can perform well-reasoned, time-efficient risk assessments. In this paper, we provide the initial datasets on injury prevention for the human hand based on drop tests with surrogates for the human hand, namely pig claws and chicken drumsticks. We then demonstrate the ease and efficiency of robot use using the dataset for contact on two examples. Finally, our experiments provide a set of injuries that may also be expected for human subjects under certain robot mass-velocity constellations in collisions. To extend this work, testing on human samples and a collaborative effort from research institutes worldwide is needed to create a comprehensive human injury avoidance database for any pHRI scenario and thus for safe pHRI applications including edged and pointed geometries.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Towards Unconstrained Collision Injury Protection Data Sets: Initial Surrogate Experiments for the Human Hand
Safety for physical human-robot interaction (pHRI) is a major concern for all application domains. While current standardization for industrial robot applications provides safety constraints that address the onset of pain in blunt impacts, these impact thresholds are difficult to use on edged or pointed impactors. The most severe injuries occur in constrained contact scenarios, where crushing is possible. Nevertheless, situations potentially resulting in constrained contact only occur in certain areas of a workspace and design or organisational approaches can be used to avoid them. What remains are risks to the human physical integrity caused by unconstrained accidental contacts, which are difficult to avoid while maintaining robot motion efficiency. Nevertheless, the probability and severity of injuries occurring with edged or pointed impacting objects in unconstrained collisions have hardly been researched. In this paper, we propose an experimental setup and procedure using two pendulums modeling human hands and arms and robots to understand the injury potential of unconstrained collisions of human hands with edged objects. Pig feet are used as ex vivo surrogate samples, as these closely resemble the physiological characteristics of human hands, to create an initial injury database on the severity of injuries caused by unconstrained edged or pointed impacts. For the effective mass range of typical lightweight robots, the data obtained show low probabilities of injuries such as skin cuts or bone/tendon injuries in unconstrained collisions when the velocity is reduced to < 0.5 m/s. The proposed experimental setups and procedures should be complemented by sufficient human modeling and will eventually lead to a complete understanding of the biomechanical injury potential in pHRI.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
An Observability-Constrained Magnetic Field-Aided Inertial Navigation System -- Extended Version
Maintaining consistent uncertainty estimates in localization systems is crucial as the perceived uncertainty commonly affects high-level system components, such as control or decision processes. A method for constructing an observability-constrained magnetic field-aided inertial navigation system is proposed to address the issue of erroneous yaw observability, which leads to inconsistent estimates of yaw uncertainty. The proposed method builds upon the previously proposed observability-constrained extended Kalman filter and extends it to work with a magnetic field-based odometry-aided inertial navigation system. The proposed method is evaluated using simulation and real-world data, showing that (i) the system observability properties are preserved, (ii) the estimation accuracy increases, and (iii) the perceived uncertainty calculated by the EKF is more consistent with the true uncertainty of the filter estimates.
comment: Accepted to IPIN 2024
DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands
Achieving human-like dexterous manipulation remains a crucial area of research in robotics. Current research focuses on improving the success rate of pick-and-place tasks. Compared with pick-and-place, throwing-catching behavior has the potential to increase the speed of transporting objects to their destination. However, dynamic dexterous manipulation poses a major challenge for stable control due to a large number of dynamic contacts. In this paper, we propose a Learning-based framework for Throwing-Catching tasks using dexterous hands (LTC). Our method, LTC, achieves a 73% success rate across 45 scenarios (diverse hand poses and objects), and the learned policies demonstrate strong zero-shot transfer performance on unseen objects. Additionally, in tasks where the object in hand faces sideways, an extremely unstable scenario due to the lack of support from the palm, all baselines fail, while our method still achieves a success rate of over 60%.
Advancements in Translation Accuracy for Stereo Visual-Inertial Initialization
The current initialization method in ORB-SLAM3, the state-of-the-art stereo visual-inertial SLAM framework, has limitations. Its success depends on the performance of the pure stereo SLAM system and is based on the underlying assumption that pure visual SLAM can accurately estimate the camera trajectory, which is essential for inertial parameter estimation. Meanwhile, the further improved initialization method for ORB-SLAM3, known as Stereo-NEC, is time-consuming because it applies keypoint tracking to estimate gyroscope bias with normal epipolar constraints. To address the limitations of previous methods, this paper proposes a method aimed at enhancing translation accuracy during the initialization stage. The fundamental concept of our method is to improve the translation estimate independently with a 3 Degree-of-Freedom (DoF) Bundle Adjustment (BA) while the rotation estimate is fixed, instead of using ORB-SLAM3's 6-DoF BA. Additionally, the rotation estimate is updated by considering IMU measurements and gyroscope bias, unlike ORB-SLAM3's rotation, which is directly obtained from stereo visual odometry and may yield inferior results when operating in challenging scenarios. We also conduct extensive evaluations on the public benchmark, the EuRoC dataset, demonstrating that our method excels in accuracy.
Multiagent Systems
Beyond Local Views: Global State Inference with Diffusion Models for Cooperative Multi-Agent Reinforcement Learning
In partially observable multi-agent systems, agents typically only have access to local observations. This severely hinders their ability to make precise decisions, particularly during decentralized execution. To alleviate this problem and inspired by image outpainting, we propose State Inference with Diffusion Models (SIDIFF), which uses diffusion models to reconstruct the original global state based solely on local observations. SIDIFF consists of a state generator and a state extractor, which allow agents to choose suitable actions by considering both the reconstructed global state and local observations. In addition, SIDIFF can be effortlessly incorporated into current multi-agent reinforcement learning algorithms to improve their performance. Finally, we evaluated SIDIFF on different experimental platforms, including Multi-Agent Battle City (MABC), a novel and flexible multi-agent reinforcement learning environment we developed. SIDIFF achieved desirable results and outperformed other popular algorithms.
comment: 15 pages, 12 figures
Value-Enriched Population Synthesis: Integrating a Motivational Layer
In recent years, computational improvements have allowed for more nuanced, data-driven and geographically explicit agent-based simulations. So far, simulations have struggled to adequately represent the attributes that motivate the actions of the agents. In fact, existing population synthesis frameworks generate agent profiles limited to socio-demographic attributes. In this paper, we introduce a novel value-enriched population synthesis framework that integrates a motivational layer with the traditional individual and household socio-demographic layers. Our research highlights the significance of extending the profile of agents in synthetic populations by incorporating data on values, ideologies, opinions and vital priorities, which motivate the agents' behaviour. This motivational layer can help us develop a more nuanced decision-making mechanism for the agents in social simulation settings. Our methodology integrates microdata and macrodata within different Bayesian network structures. This contribution makes it possible to generate synthetic populations with integrated value systems that preserve the inherent socio-demographic distributions of the real population in any specific region.
GNN-Empowered Effective Partial Observation MARL Method for AoI Management in Multi-UAV Network
Unmanned Aerial Vehicles (UAVs), due to their low cost and high flexibility, have been widely used in various scenarios to enhance network performance. However, the optimization of UAV trajectories in unknown areas or areas without sufficient prior information still faces challenges related to poor planning performance and low distributed execution. These challenges arise when UAVs rely solely on their own observation information and the information from other UAVs within their communicable range, without access to global information. To address these challenges, this paper proposes the Qedgix framework, which combines graph neural networks (GNNs) and the QMIX algorithm to achieve distributed optimization of the Age of Information (AoI) for users in unknown scenarios. The framework utilizes GNNs to extract information from UAVs, users within the observable range, and other UAVs within the communicable range, thereby enabling effective UAV trajectory planning. Due to the discretization and temporal features of AoI indicators, the Qedgix framework employs QMIX to optimize distributed partially observable Markov decision processes (Dec-POMDP) based on centralized training and distributed execution (CTDE) with respect to mean AoI values of users. By modeling the UAV network optimization problem in terms of AoI and applying the Kolmogorov-Arnold representation theorem, the Qedgix framework achieves efficient neural network training through parameter sharing based on permutation invariance. Simulation results demonstrate that the proposed algorithm significantly improves convergence speed while reducing the mean AoI values of users. The code is available at https://github.com/UNIC-Lab/Qedgix.
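For readers unfamiliar with the metric, the following is a minimal sketch of how a per-user AoI is commonly tracked in slotted time. It assumes the usual convention that the age grows by one each slot and resets to one when a fresh update is collected; the function names and the reset value are illustrative assumptions and are not taken from the Qedgix code.

```python
def step_aoi(aoi, served):
    """Advance per-user AoI by one time slot.

    aoi    : list of current ages, one entry per user
    served : list of booleans, True if that user's update was collected this slot
    Assumption: age resets to 1 on service and otherwise grows by 1.
    """
    return [1 if s else a + 1 for a, s in zip(aoi, served)]


def mean_aoi(aoi):
    """Mean AoI over all users, the quantity the abstract says is minimized."""
    return sum(aoi) / len(aoi)


# Three users, one user served per slot (a hypothetical schedule).
ages = [1, 1, 1]
schedule = [
    [False, True, False],
    [False, False, True],
    [True, False, False],
]
for served in schedule:
    ages = step_aoi(ages, served)
```

A trajectory planner would choose which users are served in each slot; the sketch only shows the bookkeeping that turns those choices into the mean-AoI objective.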
Enhancing Cooperation through Selective Interaction and Long-term Experiences in Multi-Agent Reinforcement Learning IJCAI 2024
The significance of network structures in promoting group cooperation within social dilemmas has been widely recognized. Prior studies attribute this facilitation to the assortment of strategies driven by spatial interactions. Although reinforcement learning has been employed to investigate the impact of dynamic interaction on the evolution of cooperation, there remains a lack of understanding about how agents develop neighbour selection behaviours and the formation of strategic assortment within an explicit interaction structure. To address this, our study introduces a computational framework based on multi-agent reinforcement learning in the spatial Prisoner's Dilemma game. This framework allows agents to select dilemma strategies and interacting neighbours based on their long-term experiences, differing from existing research that relies on preset social norms or external incentives. By modelling each agent using two distinct Q-networks, we disentangle the coevolutionary dynamics between cooperation and interaction. The results indicate that long-term experience enables agents to develop the ability to identify non-cooperative neighbours and exhibit a preference for interaction with cooperative ones. This emergent self-organizing behaviour leads to the clustering of agents with similar strategies, thereby increasing network reciprocity and enhancing group cooperation.
comment: Accepted at IJCAI 2024 (33rd International Joint Conference on Artificial Intelligence - Jeju)
Online Learning of Temporal Dependencies for Sustainable Foraging Problem
The sustainable foraging problem is a dynamic environment testbed for exploring the forms of agent cognition in dealing with social dilemmas in a multi-agent setting. The agents need to resist the temptation of individual rewards through foraging and choose the collective long-term goal of sustainability. We investigate methods of online learning in Neuro-Evolution and Deep Recurrent Q-Networks to enable agents to attempt the problem one-shot as is often required by wicked social problems. We further explore if learning temporal dependencies with Long Short-Term Memory may be able to aid the agents in developing sustainable foraging strategies in the long term. It was found that the integration of Long Short-Term Memory assisted agents in developing sustainable strategies for a single agent, but failed to assist agents in managing the social dilemma that arises in the multi-agent scenario.
comment: 6 pages, 13 figures, accepted for publication by the Second International Workshop on Sustainability and Scalability of Self-Organisation (SaSSO 2024), DOI to be provided once published
Systems and Control (CS)
Prescribed-time Convergent Distributed Multiobjective Optimization with Dynamic Event-triggered Communication
This paper addresses distributed constrained multiobjective resource allocation problems (DCMRAPs) within multi-agent networks, where each agent has multiple, potentially conflicting local objectives, constrained by both local and global constraints. By reformulating the DCMRAP as a single-objective weighted $L_p$ problem, a distributed solution is enabled, which eliminates the need for predetermined weighting factors or centralized decision-making in traditional methods. Leveraging prescribed-time control and dynamic event-triggered mechanisms (ETMs), novel distributed algorithms are proposed to achieve Pareto optimality within a prescribed settling time through sampled communication. Using generalized time-based generators (TBGs), these algorithms provide more flexibility in optimizing accuracy and control smoothness without the constraints of initial conditions. Novel dynamic ETMs are designed to work with generalized TBGs to promote communication efficiency, which adjusts to both local error metrics and network-based disagreements. The Zeno behavior is excluded. Validated by Lyapunov analysis and simulations, our method demonstrates superior control performance and efficiency compared to existing methods, advancing distributed optimization in complex environments.
Safe Adaptive Control for Uncertain Systems with Complex Input Constraints
In this paper, we propose a novel adaptive Control Barrier Function (CBF) based controller for nonlinear systems with complex, time-varying input constraints. Conventional CBF approaches often struggle with feasibility issues and stringent assumptions when addressing input constraints. Unlike these methods, our approach converts the input-constraint problem into an output-constraint CBF design. This transformation simplifies the Quadratic Programming (QP) formulation and enhances compatibility with the CBF framework. We design an adaptive CBF-based controller to manage the mismatched uncertainties introduced by this transformation. Our method systematically addresses the challenges of complex, time-varying, and state-dependent input constraints. The efficacy of the proposed approach is validated using numerical examples.
comment: 8 pages, 2 figures
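As background for the QP formulation mentioned above, the following is a minimal sketch of a standard (non-adaptive) CBF safety filter, not the adaptive controller proposed in the paper. It assumes single-integrator dynamics x' = u and a half-space safe set h(x) = a·x + b >= 0, for which the QP "minimize ||u - u_nom||^2 subject to a·u >= -alpha*h(x)" has a closed-form solution. All names (`cbf_filter`, `alpha`, `u_nom`) are hypothetical.

```python
def dot(p, q):
    """Inner product of two equal-length sequences."""
    return sum(pi * qi for pi, qi in zip(p, q))


def cbf_filter(x, u_nom, a, b, alpha=1.0):
    """Minimally modify u_nom so that the CBF condition holds.

    Assumed model: x' = u, safe set {x : h(x) = a.x + b >= 0}.
    CBF condition: a.u + alpha * h(x) >= 0.
    """
    h = dot(a, x) + b
    slack = dot(a, u_nom) + alpha * h
    if slack >= 0:
        # Nominal input already satisfies the constraint.
        return list(u_nom)
    # Project u_nom onto the constraint boundary along a.
    scale = -slack / dot(a, a)
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]


# Safe set x1 >= 0; the nominal input pushes toward the boundary too fast.
u_safe = cbf_filter(x=[0.5, 0.0], u_nom=[-2.0, 1.0], a=[1.0, 0.0], b=0.0)
```

With a single affine constraint the projection is exact, so no QP solver is needed; input constraints of the kind the abstract targets would add further constraints and can make the QP infeasible, which is the difficulty the paper addresses.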
Behavioral Learning of Dish Rinsing and Scrubbing based on Interruptive Direct Teaching Considering Assistance Rate
Robots are expected to manipulate objects in a safe and dexterous way. For example, washing dishes is a dexterous operation that involves scrubbing the dishes with a sponge and rinsing them with water. It is necessary to learn it safely without splashing water and without dropping the dishes. In this study, we propose a safe and dexterous manipulation system. The robot learns a dynamics model of the object by estimating the state of the object and the robot itself, the control input, and the amount of human assistance required (assistance rate) after the human corrects the initial trajectory of the robot's hands by interruptive direct teaching. By backpropagating the error between the estimated and the reference value using the acquired dynamics model, the robot can generate a control input that approaches the reference value, for example, so that human assistance is not required and the dish does not move excessively. This allows for adaptive rinsing and scrubbing of dishes with unknown shapes and properties. As a result, it is possible to generate safe actions that require less human assistance.
comment: Accepted at Advanced Robotics
Generalizable Physics-Informed Learning for Stochastic Safety-Critical Systems
Accurate estimate of long-term risk is critical for safe decision-making, but sampling from rare risk events and long-term trajectories can be prohibitively costly. Risk gradient can be used in many first-order techniques for learning and control methods, but gradient estimate is difficult to obtain using Monte Carlo (MC) methods because the infinitesimal divisor may significantly amplify sampling noise. Motivated by this gap, we propose an efficient method to evaluate long-term risk probabilities and their gradients using short-term samples without sufficient risk events. We first derive that four types of long-term risk probability are solutions of certain partial differential equations (PDEs). Then, we propose a physics-informed learning technique that integrates data and physics information (aforementioned PDEs). The physics information helps propagate information beyond available data and obtain provable generalization beyond available data, which in turn enables long-term risk to be estimated using short-term samples of safe events. Finally, we demonstrate in simulation that the proposed technique has improved sample efficiency, generalizes well to unseen regions, and adapts to changing system parameters.
comment: arXiv admin note: substantial text overlap with arXiv:2305.06432
A Generalizable Physics-informed Learning Framework for Risk Probability Estimation
Accurate estimates of long-term risk probabilities and their gradients are critical for many stochastic safe control methods. However, computing such risk probabilities in real time and in unseen or changing environments is challenging. Monte Carlo (MC) methods cannot accurately evaluate the probabilities and their gradients as an infinitesimal divisor can amplify the sampling noise. In this paper, we develop an efficient method to evaluate the probabilities of long-term risk and their gradients. The proposed method exploits the fact that long-term risk probability satisfies certain partial differential equations (PDEs), which characterize the neighboring relations between the probabilities, to integrate MC methods and physics-informed neural networks. We provide theoretical guarantees of the estimation error given certain choices of training configurations. Numerical results show the proposed method has better sample efficiency, generalizes well to unseen regions, and can adapt to systems with changing parameters. The proposed method can also accurately estimate the gradients of risk probabilities, which enables first- and second-order techniques on risk probabilities to be used for learning and control.
comment: Accepted at the 5th Annual Learning for Dynamics & Control (L4DC) Conference, 2023
Intention-Aware Control Based on Belief-Space Specifications and Stochastic Expansion
This paper develops a correct-by-design controller for an autonomous vehicle interacting with opponent vehicles with unknown intentions. We define an intention-aware control problem incorporating epistemic uncertainties of the opponent vehicles and model their intentions as discrete-valued random variables. Then, we focus on a control objective specified as belief-space temporal logic specifications. From this stochastic control problem, we derive a sound deterministic control problem using stochastic expansion and solve it using shrinking-horizon model predictive control. The solved intention-aware controller allows a vehicle to adjust its behaviors according to its opponents' intentions. It ensures provable safety by restricting the probabilistic risk under a desired level. We show with experimental studies that the proposed method ensures strict limitation of risk probabilities, validating its efficacy in autonomous driving cases. This work provides a novel solution for the risk-aware control of interactive vehicles with formal safety guarantees.
Remaining Discharge Energy Prediction for Lithium-Ion Batteries Over Broad Current Ranges: A Machine Learning Approach
Lithium-ion batteries have found their way into myriad sectors of industry to drive electrification, decarbonization, and sustainability. A crucial aspect in ensuring their safe and optimal performance is monitoring their energy levels. In this paper, we present the first study on predicting the remaining energy of a battery cell undergoing discharge over wide current ranges from low to high C-rates. The complexity of the challenge arises from the cell's C-rate-dependent energy availability as well as its intricate electro-thermal dynamics especially at high C-rates. To address this, we introduce a new definition of remaining discharge energy and then undertake a systematic effort in harnessing the power of machine learning to enable its prediction. Our effort includes two parts in cascade. First, we develop an accurate dynamic model based on integration of physics with machine learning to capture a battery's voltage and temperature behaviors. Second, based on the model, we propose a machine learning approach to predict the remaining discharge energy under arbitrary C-rates and pre-specified cut-off limits in voltage and temperature. The experimental validation shows that the proposed approach can predict the remaining discharge energy with a relative error of less than 3% when the current varies between 0~8 C for an NCA cell and 0~15 C for an LFP cell. The approach, by design, is amenable to training and computation.
comment: 15 pages, 13 figures, 4 tables
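To make the cut-off idea concrete, the following is a minimal sketch that integrates power over a predicted discharge trajectory until a voltage or temperature limit is hit. It uses the textbook energy integral of V(t)·I(t), not the new definition introduced in the paper, and the trajectory is assumed to come from some external model. The function name, argument names and sample values are hypothetical.

```python
def remaining_discharge_energy_wh(t_s, v, i, temp_c, v_min, temp_max):
    """Trapezoidal integral of V*I until the first cut-off violation.

    t_s    : sample times in seconds
    v, i   : predicted terminal voltage (V) and discharge current (A)
    temp_c : predicted cell temperature (deg C)
    Returns energy in watt-hours.
    """
    energy_j = 0.0
    for k in range(1, len(t_s)):
        # Stop at the first sample that violates either cut-off limit.
        if v[k] < v_min or temp_c[k] > temp_max:
            break
        p_prev = v[k - 1] * i[k - 1]
        p_now = v[k] * i[k]
        energy_j += 0.5 * (p_prev + p_now) * (t_s[k] - t_s[k - 1])
    return energy_j / 3600.0


# Hypothetical predicted trajectory sampled every 30 minutes at 2 A.
energy = remaining_discharge_energy_wh(
    t_s=[0, 1800, 3600, 5400],
    v=[3.6, 3.4, 3.2, 2.4],
    i=[2.0, 2.0, 2.0, 2.0],
    temp_c=[25.0, 30.0, 35.0, 40.0],
    v_min=2.5,
    temp_max=60.0,
)
```

Because the voltage and temperature trajectories depend on the C-rate, the same cell yields different values at different currents, which is the dependence the abstract highlights.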
PowerGraph: A power grid benchmark dataset for graph neural networks
Power grids are critical infrastructures of paramount importance to modern society and, therefore, engineered to operate under diverse conditions and failures. The ongoing energy transition poses new challenges for the decision-makers and system operators. Therefore, developing grid analysis algorithms is important for supporting reliable operations. These key tools include power flow analysis and system security analysis, both needed for effective operational and strategic planning. The literature review shows a growing trend of machine learning (ML) models that perform these analyses effectively. In particular, Graph Neural Networks (GNNs) stand out in such applications because of the graph-based structure of power grids. However, there is a lack of publicly available graph datasets for training and benchmarking ML models in electrical power grid applications. First, we present PowerGraph, which comprises GNN-tailored datasets for i) power flows, ii) optimal power flows, and iii) cascading failure analyses of power grids. Second, we provide ground-truth explanations for the cascading failure analysis. Finally, we perform a complete benchmarking of GNN methods for node-level and graph-level tasks and explainability. Overall, PowerGraph is a multifaceted GNN dataset for diverse tasks that includes power flow and fault scenarios with real-world explanations, providing a valuable resource for developing improved GNN models for node-level, graph-level tasks and explainability methods in power system modeling. The dataset is available at https://figshare.com/articles/dataset/PowerGraph/22820534 and the code at https://github.com/PowerGraph-Datasets.
comment: 21 pages, 8 figures, conference paper
Precision Agriculture: Ultra-Compact Sensor and Reconfigurable Antenna for Joint Sensing and Communication
In this paper, a joint sensing and communication system is presented for smart agriculture. The system integrates an Ultra-compact Soil Moisture Sensor (UCSMS) for precise sensing, along with a Pattern Reconfigurable Antenna (PRA) for efficient transmission of information to the base station. A multiturn complementary spiral resonator (MCSR) is etched onto the ground plane of a microstrip transmission line to achieve miniaturization. The UCSMS operates at 180 MHz with a 3-turn complementary spiral resonator (3-CSR), at 102 MHz with a 4-turn complementary spiral resonator (4-CSR), and at 86 MHz with a 5-turn complementary spiral resonator (5-CSR). Due to its low resonance frequency, the proposed UCSMS is insensitive to variations in the Volume Under Test (VUT) of soil. A probe-fed circular patch antenna is designed in the Wireless Local Area Network (WLAN) band (2.45 GHz) with a maximum measured gain of 5.63 dBi. Additionally, four varactor diodes are integrated across the slots on the bottom side of the substrate to achieve pattern reconfiguration. Six different radiation patterns have been achieved by using different bias conditions of the diodes. In standby mode, PRA can serve as a means for Wireless Power Transfer (WPT) or Energy Harvesting (EH) to store power in a battery. This stored power can then be utilized to bias the varactor diodes. The combination of UCSMS and PRA enables the realization of a joint sensing and communication system. The proposed system's planar and simple geometry, along with its high sensitivity of 2.05%, makes it suitable for smart agriculture applications. Moreover, the sensor is adaptive and capable of measuring the permittivity of various Material Under Test (MUT) within the range of 1 to 23.
Imitation Learning for Intra-Day Power Grid Operation through Topology Actions
Power grid operation is becoming increasingly complex due to the increase in generation of renewable energy. The recent series of Learning To Run a Power Network (L2RPN) competitions have encouraged the use of artificial agents to assist human dispatchers in operating power grids. In this paper we study the performance of imitation learning for day-ahead power grid operation through topology actions. In particular, we consider two rule-based expert agents: a greedy agent and an N-1 agent. While the latter is more computationally expensive since it takes N-1 safety considerations into account, it exhibits a much higher operational performance. We train a fully-connected neural network (FCNN) on expert state-action pairs and evaluate it in two ways. First, we find that classification accuracy is limited despite extensive hyperparameter tuning, due to class imbalance and class overlap. Second, as a power system agent, the FCNN performs only slightly worse than expert agents. Furthermore, hybrid agents, which incorporate minimal additional simulations, match expert agents' performance with significantly lower computational cost. Consequently, imitation learning shows promise for developing fast, high-performing power grid agents, motivating its further exploration in future L2RPN studies.
comment: To be presented at the Machine Learning for Sustainable Power Systems 2024 workshop and to be published in the corresponding Springer Communications in Computer and Information Science proceedings
Multidirectional Pixelated Cubic Antenna with Enhanced Isolation for Vehicular Applications
This paper presents a pixelated cubic antenna design with enhanced isolation and diverse radiation pattern for vehicular applications. The design consists of four radiating patches to take advantage of a nearly omnidirectional radiation pattern with enhanced isolation and high gain. The antenna system with four patches has been pixelated and optimized simultaneously to achieve desired performance and high isolation at 5.4 GHz band. The antenna achieved measured isolation of more than -34 dB between antenna elements. The overall isolation improvement obtained by the antenna is about 18 dB compared to a configuration using standard patch antennas. Moreover, isolation improvement is achieved through patch pixelization without additional resonators or elements. The antenna achieved up to 6.9 dB realized gain in each direction. Additionally, the cubic antenna system is equipped with an E-shaped GPS antenna to facilitate connectivity with GPS satellites. Finally, the antenna performance has been investigated using a simulation model of the vehicle roof and roof rack. The reflection coefficient, isolation and radiation patterns of the antenna remain unaffected. The antenna prototype has been fabricated on Rogers substrate and measured to verify the simulation results. The measured results correlate well with the simulation results. The proposed antenna features a low-profile, simple design for ease of manufacture, good radiation characteristics with multidirectional property and high isolation, which are well-suited to vehicular applications in different environments.
Risk-Aware Control of Discrete-Time Stochastic Systems: Integrating Kalman Filter and Worst-case CVaR in Control Barrier Functions
This paper proposes control approaches for discrete-time linear systems subject to stochastic disturbances. It employs a Kalman filter to estimate the mean and covariance of the state propagation, and the worst-case conditional value-at-risk (CVaR) to quantify the tail risk using the estimated mean and covariance. The quantified risk is then integrated into a control barrier function (CBF) to derive constraints for controller synthesis, addressing tail risks near safe set boundaries. Two optimization-based control methods are presented using the obtained constraints for half-space and ellipsoidal safe sets, respectively. The effectiveness of the obtained results is demonstrated using numerical simulations.
comment: Replaced Fig. 1(b)
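The risk quantification step for the half-space case can be sketched as follows. For a linear loss a·x + b and a state with known mean mu and covariance Sigma, the worst-case CVaR over all distributions sharing those two moments has the well-known closed form a·mu + b + sqrt((1 - eps)/eps) * sqrt(a' Sigma a), where eps is the tail probability. The code below only evaluates that expression; the function name and the numbers are hypothetical, and the way the paper embeds the result in a CBF constraint is not reproduced here.

```python
import math


def worst_case_cvar_linear(a, b, mean, cov, eps):
    """Worst-case CVaR of the loss a.x + b given only mean and covariance.

    eps is the tail probability (0 < eps < 1); smaller eps is more conservative.
    """
    n = len(a)
    loss_mean = sum(a[j] * mean[j] for j in range(n)) + b
    # Quadratic form a' * cov * a gives the variance of the linear loss.
    loss_var = sum(a[j] * cov[j][k] * a[k] for j in range(n) for k in range(n))
    kappa = math.sqrt((1.0 - eps) / eps)
    return loss_mean + kappa * math.sqrt(loss_var)


# Hypothetical half-space x1 <= 3 written as loss x1 - 3 <= 0, with a
# state estimate (mean, covariance) such as a Kalman filter would supply.
risk = worst_case_cvar_linear(
    a=[1.0, 0.0],
    b=-3.0,
    mean=[1.0, 0.0],
    cov=[[0.25, 0.0], [0.0, 1.0]],
    eps=0.2,
)
```

A negative value indicates that the half-space constraint holds with margin even under the worst distribution consistent with the estimated moments.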
Integrated Optimal Fast Charging and Active Thermal Management of Lithium-Ion Batteries in Extreme Ambient Temperatures
This paper presents an integrated control strategy for optimal fast charging and active thermal management of Lithium-ion batteries in extreme ambient temperatures, striking a balance between charging speed and battery health. A control-oriented thermal-NDC (nonlinear double-capacitor) battery model is proposed to describe the electrical and thermal dynamics, incorporating the effects of both an active thermal source and ambient temperature. A state-feedback model predictive control algorithm is then developed for optimal fast charging and active thermal management. Numerical experiments validate the algorithm under extreme temperatures, showing that the proposed algorithm can energy-efficiently adjust the battery temperature, thereby balancing charging speed and battery health. Additionally, an output-feedback model predictive control algorithm with an extended Kalman filter is proposed for battery charging when states are partially measurable. Numerical experiments validate the effectiveness under extreme temperatures.
Systems and Control (EESS)
Prescribed-time Convergent Distributed Multiobjective Optimization with Dynamic Event-triggered Communication
This paper addresses distributed constrained multiobjective resource allocation problems (DCMRAPs) within multi-agent networks, where each agent has multiple, potentially conflicting local objectives, constrained by both local and global constraints. By reformulating the DCMRAP as a single-objective weighted $L_p$ problem, a distributed solution is enabled, which eliminates the need for predetermined weighting factors or centralized decision-making in traditional methods. Leveraging prescribed-time control and dynamic event-triggered mechanisms (ETMs), novel distributed algorithms are proposed to achieve Pareto optimality within a prescribed settling time through sampled communication. Using generalized time-based generators (TBGs), these algorithms provide more flexibility in optimizing accuracy and control smoothness without the constraints of initial conditions. Novel dynamic ETMs are designed to work with generalized TBGs to promote communication efficiency, which adjusts to both local error metrics and network-based disagreements. The Zeno behavior is excluded. Validated by Lyapunov analysis and simulations, our method demonstrates superior control performance and efficiency compared to existing methods, advancing distributed optimization in complex environments.
Safe Adaptive Control for Uncertain Systems with Complex Input Constraints
In this paper, we propose a novel adaptive Control Barrier Function (CBF) based controller for nonlinear systems with complex, time-varying input constraints. Conventional CBF approaches often struggle with feasibility issues and stringent assumptions when addressing input constraints. Unlike these methods, our approach converts the input-constraint problem into an output-constraint CBF design. This transformation simplifies the Quadratic Programming (QP) formulation and enhances compatibility with the CBF framework. We design an adaptive CBF-based controller to manage the mismatched uncertainties introduced by this transformation. Our method systematically addresses the challenges of complex, time-varying, and state-dependent input constraints. The efficacy of the proposed approach is validated using numerical examples.
comment: 8 pages, 2 figures
Behavioral Learning of Dish Rinsing and Scrubbing based on Interruptive Direct Teaching Considering Assistance Rate
Robots are expected to manipulate objects in a safe and dexterous way. For example, washing dishes is a dexterous operation that involves scrubbing the dishes with a sponge and rinsing them with water. It is necessary to learn it safely without splashing water and without dropping the dishes. In this study, we propose a safe and dexterous manipulation system. The robot learns a dynamics model of the object by estimating the state of the object and the robot itself, the control input, and the amount of human assistance required (assistance rate) after the human corrects the initial trajectory of the robot's hands by interruptive direct teaching. By backpropagating the error between the estimated and the reference value using the acquired dynamics model, the robot can generate a control input that approaches the reference value, for example, so that human assistance is not required and the dish does not move excessively. This allows for adaptive rinsing and scrubbing of dishes with unknown shapes and properties. As a result, it is possible to generate safe actions that require less human assistance.
comment: Accepted at Advanced Robotics
Generalizable Physics-Informed Learning for Stochastic Safety-Critical Systems
Accurate estimate of long-term risk is critical for safe decision-making, but sampling from rare risk events and long-term trajectories can be prohibitively costly. Risk gradient can be used in many first-order techniques for learning and control methods, but gradient estimate is difficult to obtain using Monte Carlo (MC) methods because the infinitesimal divisor may significantly amplify sampling noise. Motivated by this gap, we propose an efficient method to evaluate long-term risk probabilities and their gradients using short-term samples without sufficient risk events. We first derive that four types of long-term risk probability are solutions of certain partial differential equations (PDEs). Then, we propose a physics-informed learning technique that integrates data and physics information (aforementioned PDEs). The physics information helps propagate information beyond available data and obtain provable generalization beyond available data, which in turn enables long-term risk to be estimated using short-term samples of safe events. Finally, we demonstrate in simulation that the proposed technique has improved sample efficiency, generalizes well to unseen regions, and adapts to changing system parameters.
comment: arXiv admin note: substantial text overlap with arXiv:2305.06432
A Generalizable Physics-informed Learning Framework for Risk Probability Estimation
Accurate estimates of long-term risk probabilities and their gradients are critical for many stochastic safe control methods. However, computing such risk probabilities in real time and in unseen or changing environments is challenging. Monte Carlo (MC) methods cannot accurately evaluate the probabilities and their gradients as an infinitesimal divisor can amplify the sampling noise. In this paper, we develop an efficient method to evaluate the probabilities of long-term risk and their gradients. The proposed method exploits the fact that long-term risk probability satisfies certain partial differential equations (PDEs), which characterize the neighboring relations between the probabilities, to integrate MC methods and physics-informed neural networks. We provide theoretical guarantees of the estimation error given certain choices of training configurations. Numerical results show the proposed method has better sample efficiency, generalizes well to unseen regions, and can adapt to systems with changing parameters. The proposed method can also accurately estimate the gradients of risk probabilities, which enables first- and second-order techniques on risk probabilities to be used for learning and control.
comment: Accepted at the 5th Annual Learning for Dynamics & Control (L4DC) Conference, 2023
Intention-Aware Control Based on Belief-Space Specifications and Stochastic Expansion
This paper develops a correct-by-design controller for an autonomous vehicle interacting with opponent vehicles with unknown intentions. We define an intention-aware control problem incorporating epistemic uncertainties of the opponent vehicles and model their intentions as discrete-valued random variables. Then, we focus on a control objective specified as belief-space temporal logic specifications. From this stochastic control problem, we derive a sound deterministic control problem using stochastic expansion and solve it using shrinking-horizon model predictive control. The solved intention-aware controller allows a vehicle to adjust its behaviors according to its opponents' intentions. It ensures provable safety by restricting the probabilistic risk under a desired level. We show with experimental studies that the proposed method ensures strict limitation of risk probabilities, validating its efficacy in autonomous driving cases. This work provides a novel solution for the risk-aware control of interactive vehicles with formal safety guarantees.
Remaining Discharge Energy Prediction for Lithium-Ion Batteries Over Broad Current Ranges: A Machine Learning Approach
Lithium-ion batteries have found their way into myriad sectors of industry to drive electrification, decarbonization, and sustainability. A crucial aspect in ensuring their safe and optimal performance is monitoring their energy levels. In this paper, we present the first study on predicting the remaining energy of a battery cell undergoing discharge over wide current ranges from low to high C-rates. The complexity of the challenge arises from the cell's C-rate-dependent energy availability as well as its intricate electro-thermal dynamics especially at high C-rates. To address this, we introduce a new definition of remaining discharge energy and then undertake a systematic effort in harnessing the power of machine learning to enable its prediction. Our effort includes two parts in cascade. First, we develop an accurate dynamic model based on integration of physics with machine learning to capture a battery's voltage and temperature behaviors. Second, based on the model, we propose a machine learning approach to predict the remaining discharge energy under arbitrary C-rates and pre-specified cut-off limits in voltage and temperature. The experimental validation shows that the proposed approach can predict the remaining discharge energy with a relative error of less than 3% when the current varies between 0~8 C for an NCA cell and 0~15 C for an LFP cell. The approach, by design, is amenable to training and computation.
comment: 15 pages, 13 figures, 4 tables
PowerGraph: A power grid benchmark dataset for graph neural networks
Power grids are critical infrastructures of paramount importance to modern society and, therefore, engineered to operate under diverse conditions and failures. The ongoing energy transition poses new challenges for the decision-makers and system operators. Therefore, developing grid analysis algorithms is important for supporting reliable operations. These key tools include power flow analysis and system security analysis, both needed for effective operational and strategic planning. The literature review shows a growing trend of machine learning (ML) models that perform these analyses effectively. In particular, Graph Neural Networks (GNNs) stand out in such applications because of the graph-based structure of power grids. However, there is a lack of publicly available graph datasets for training and benchmarking ML models in electrical power grid applications. First, we present PowerGraph, which comprises GNN-tailored datasets for i) power flows, ii) optimal power flows, and iii) cascading failure analyses of power grids. Second, we provide ground-truth explanations for the cascading failure analysis. Finally, we perform a complete benchmarking of GNN methods for node-level and graph-level tasks and explainability. Overall, PowerGraph is a multifaceted GNN dataset for diverse tasks that includes power flow and fault scenarios with real-world explanations, providing a valuable resource for developing improved GNN models for node-level, graph-level tasks and explainability methods in power system modeling. The dataset is available at https://figshare.com/articles/dataset/PowerGraph/22820534 and the code at https://github.com/PowerGraph-Datasets.
comment: 21 pages, 8 figures, conference paper
Precision Agriculture: Ultra-Compact Sensor and Reconfigurable Antenna for Joint Sensing and Communication
In this paper, a joint sensing and communication system is presented for smart agriculture. The system integrates an Ultra-compact Soil Moisture Sensor (UCSMS) for precise sensing, along with a Pattern Reconfigurable Antenna (PRA) for efficient transmission of information to the base station. A multiturn complementary spiral resonator (MCSR) is etched onto the ground plane of a microstrip transmission line to achieve miniaturization. The UCSMS operates at 180 MHz with a 3-turn complementary spiral resonator (3-CSR), at 102 MHz with a 4-turn complementary spiral resonator (4-CSR), and at 86 MHz with a 5-turn complementary spiral resonator (5-CSR). Due to its low resonance frequency, the proposed UCSMS is insensitive to variations in the Volume Under Test (VUT) of soil. A probe-fed circular patch antenna is designed in the Wireless Local Area Network (WLAN) band (2.45 GHz) with a maximum measured gain of 5.63 dBi. Additionally, four varactor diodes are integrated across the slots on the bottom side of the substrate to achieve pattern reconfiguration. Six different radiation patterns have been achieved by using different bias conditions of the diodes. In standby mode, the PRA can serve as a means for Wireless Power Transfer (WPT) or Energy Harvesting (EH) to store power in a battery. This stored power can then be utilized to bias the varactor diodes. The combination of UCSMS and PRA enables the realization of a joint sensing and communication system. The proposed system's planar and simple geometry, along with its high sensitivity of 2.05%, makes it suitable for smart agriculture applications. Moreover, the sensor is adaptive and capable of measuring the permittivity of various Materials Under Test (MUT) within the range of 1 to 23.
Imitation Learning for Intra-Day Power Grid Operation through Topology Actions
Power grid operation is becoming increasingly complex due to the increase in generation of renewable energy. The recent series of Learning To Run a Power Network (L2RPN) competitions have encouraged the use of artificial agents to assist human dispatchers in operating power grids. In this paper we study the performance of imitation learning for day-ahead power grid operation through topology actions. In particular, we consider two rule-based expert agents: a greedy agent and an N-1 agent. While the latter is more computationally expensive since it takes N-1 safety considerations into account, it exhibits a much higher operational performance. We train a fully-connected neural network (FCNN) on expert state-action pairs and evaluate it in two ways. First, we find that classification accuracy is limited despite extensive hyperparameter tuning, due to class imbalance and class overlap. Second, as a power system agent, the FCNN performs only slightly worse than expert agents. Furthermore, hybrid agents, which incorporate minimal additional simulations, match expert agents' performance with significantly lower computational cost. Consequently, imitation learning shows promise for developing fast, high-performing power grid agents, motivating its further exploration in future L2RPN studies.
comment: To be presented at the Machine Learning for Sustainable Power Systems 2024 workshop and to be published in the corresponding Springer Communications in Computer and Information Science proceedings
Multidirectional Pixelated Cubic Antenna with Enhanced Isolation for Vehicular Applications
This paper presents a pixelated cubic antenna design with enhanced isolation and diverse radiation pattern for vehicular applications. The design consists of four radiating patches to take advantage of a nearly omnidirectional radiation pattern with enhanced isolation and high gain. The antenna system with four patches has been pixelated and optimized simultaneously to achieve desired performance and high isolation at the 5.4 GHz band. The antenna achieved measured isolation of more than -34 dB between antenna elements. The overall isolation improvement obtained by the antenna is about 18 dB compared to a configuration using standard patch antennas. Moreover, isolation improvement is achieved through patch pixelization without additional resonators or elements. The antenna achieved up to 6.9 dB realized gain in each direction. Additionally, the cubic antenna system is equipped with an E-shaped GPS antenna to facilitate connectivity with GPS satellites. Finally, the antenna performance has been investigated using a simulation model of the vehicle roof and roof rack. The reflection coefficient, isolation and radiation patterns of the antenna remain unaffected. The antenna prototype has been fabricated on a Rogers substrate and measured to verify the simulation results. The measured results correlate well with the simulation results. The proposed antenna features a low-profile, simple design for ease of manufacture and good radiation characteristics with multidirectional property and high isolation, which are well-suited to vehicular applications in different environments.
Risk-Aware Control of Discrete-Time Stochastic Systems: Integrating Kalman Filter and Worst-case CVaR in Control Barrier Functions
This paper proposes control approaches for discrete-time linear systems subject to stochastic disturbances. It employs a Kalman filter to estimate the mean and covariance of the state propagation, and the worst-case conditional value-at-risk (CVaR) to quantify the tail risk using the estimated mean and covariance. The quantified risk is then integrated into a control barrier function (CBF) to derive constraints for controller synthesis, addressing tail risks near safe set boundaries. Two optimization-based control methods are presented using the obtained constraints for half-space and ellipsoidal safe sets, respectively. The effectiveness of the obtained results is demonstrated using numerical simulations.
comment: Replaced Fig. 1(b)
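As a rough illustration of the risk measure named in the entry above: for a linear loss with known mean and covariance, the worst-case CVaR over all distributions sharing those two moments has a well-known closed form, mean plus sqrt((1 - eps) / eps) times the standard deviation, where eps is the tail probability. The sketch below assumes that setting; the function name and arguments are hypothetical, and the abstract does not say how the paper embeds this quantity in its CBF constraints.

```python
import numpy as np


def worst_case_cvar(a, b, mean, cov, eps):
    """Worst-case CVaR of the linear loss a @ x + b at tail probability eps.

    The supremum is taken over every distribution of x with the given
    mean and covariance (moment-based ambiguity set). Hypothetical helper
    for illustration, not the paper's implementation.
    """
    a = np.asarray(a, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    loss_mean = a @ mean + b            # mean of the scalar loss
    loss_std = np.sqrt(a @ cov @ a)     # standard deviation of the scalar loss
    return float(loss_mean + np.sqrt((1.0 - eps) / eps) * loss_std)
```

In a filter-based pipeline, `mean` and `cov` would plausibly come from the Kalman prediction step, so a smaller `eps` or a larger predicted covariance both tighten the resulting constraint.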
Integrated Optimal Fast Charging and Active Thermal Management of Lithium-Ion Batteries in Extreme Ambient Temperatures
This paper presents an integrated control strategy for optimal fast charging and active thermal management of Lithium-ion batteries in extreme ambient temperatures, striking a balance between charging speed and battery health. A control-oriented thermal-NDC (nonlinear double-capacitor) battery model is proposed to describe the electrical and thermal dynamics, incorporating the effects of both an active thermal source and ambient temperature. A state-feedback model predictive control algorithm is then developed for optimal fast charging and active thermal management. Numerical experiments validate the algorithm under extreme temperatures, showing that the proposed algorithm can energy-efficiently adjust the battery temperature, thereby balancing charging speed and battery health. Additionally, an output-feedback model predictive control algorithm with an extended Kalman filter is proposed for battery charging when states are partially measurable. Numerical experiments validate the effectiveness under extreme temperatures.
Robotics
Design and Control of Modular Soft-Rigid Hybrid Manipulators with Self-Contact
Soft robotics focuses on designing robots with highly deformable materials, allowing them to adapt and operate safely and reliably in unstructured and variable environments. While soft robots offer increased compliance over rigid body robots, their payloads are limited, and they consume significant energy when operating against gravity in terrestrial environments. To address the carrying capacity limitation, we introduce a novel class of soft-rigid hybrid robot manipulators (SRH) that incorporates both soft continuum modules and rigid joints in a serial configuration. The SRH manipulators can seamlessly transition from being compliant and delicate to being rigid and strong, achieving this through dynamic shape modulation and employing self-contact among rigid components to effectively form solid structures. We discuss the design and fabrication of SRH robots, and present a class of novel control algorithms for SRH systems. We propose a configuration space PD+ shape controller and a Cartesian impedance controller, both of which are provably stable, endowing the soft robot with the necessary low-level capabilities. We validate the controllers on SRH hardware and demonstrate the robot performing several tasks. Our results highlight the potential for the soft-rigid hybrid paradigm to produce robots that are both physically safe and effective at task performance.
comment: 23 pages, 7 figures
Reinforcement Learning Compensated Model Predictive Control for Off-road Driving on Unknown Deformable Terrain
This study presents an Actor-Critic reinforcement learning Compensated Model Predictive Controller (AC2MPC) designed for high-speed, off-road autonomous driving on deformable terrains. Addressing the difficulty of modeling unknown tire-terrain interaction and ensuring real-time control feasibility and performance, this framework integrates deep reinforcement learning with a model predictive controller to manage unmodeled nonlinear dynamics. We evaluate the controller framework over constant and varying velocity profiles using the high-fidelity simulator Project Chrono. Our findings demonstrate that our controller statistically outperforms standalone model-based and learning-based controllers over three unknown terrains that represent a sandy deformable track, a sandy and rocky track, and a cohesive clay-like deformable soil track. Despite varied and previously unseen terrain characteristics, this framework generalized well enough to track longitudinal reference speeds with the least error. Furthermore, this framework required significantly less training data than a purely learning-based controller, converging in fewer steps while delivering better performance. Even when under-trained, this controller outperformed the standalone controllers, highlighting its potential for safer and more efficient real-world deployment.
comment: Submitted to IEEE Transactions on Intelligent Vehicles as a Regular Paper
V2X-VLM: End-to-End V2X Cooperative Autonomous Driving Through Large Vision-Language Models
Advancements in autonomous driving have increasingly focused on end-to-end (E2E) systems that manage the full spectrum of driving tasks, from environmental perception to vehicle navigation and control. This paper introduces V2X-VLM, an innovative E2E vehicle-infrastructure cooperative autonomous driving (VICAD) framework with large vision-language models (VLMs). V2X-VLM is designed to enhance situational awareness, decision-making, and ultimate trajectory planning by integrating data from vehicle-mounted cameras, infrastructure sensors, and textual information. The strength of the comprehensive multimodal data fusion of the VLM enables precise and safe E2E trajectory planning in complex and dynamic driving scenarios. Validation on the DAIR-V2X dataset demonstrates that V2X-VLM outperforms existing state-of-the-art methods in cooperative autonomous driving.
Intuitive Human-Robot Interface: A 3-Dimensional Action Recognition and UAV Collaboration Framework
Harnessing human movements to command an Unmanned Aerial Vehicle (UAV) holds the potential to revolutionize their deployment, rendering it more intuitive and user-centric. In this research, we introduce a novel methodology adept at classifying three-dimensional human actions, leveraging them to coordinate on-field with a UAV. Utilizing a stereo camera, we derive both RGB and depth data, subsequently extracting three-dimensional human poses from the continuous video feed. This data is then processed through our proposed k-nearest neighbour classifier, the results of which dictate the behaviour of the UAV. It also includes mechanisms ensuring the robot perpetually maintains the human within its visual purview, adeptly tracking user movements. We subjected our approach to rigorous testing involving multiple tests with real robots. The ensuing results, coupled with comprehensive analysis, underscore the efficacy and inherent advantages of our proposed methodology.
comment: Accepted in International Conference on Informatics in Control, Automation and Robotics (ICINCO) 2024
Learning Based Toolpath Planner on Diverse Graphs for 3D Printing
This paper presents a learning-based planner for computing optimized 3D printing toolpaths on prescribed graphs, the challenges of which include the varying graph structures on different models and the large scale of nodes and edges on a graph. We adopt an on-the-fly strategy to tackle these challenges, formulating the planner as a Deep Q-Network (DQN) based optimizer to decide the next 'best' node to visit. We construct the state spaces by the Local Search Graph (LSG) centered at different nodes on a graph, which is encoded by a carefully designed algorithm so that LSGs in similar configurations can be identified to re-use the earlier learned DQN priors for accelerating the computation of toolpath planning. Our method can cover different 3D printing applications by defining their corresponding reward functions. Toolpath planning problems in wire-frame printing, continuous fiber printing, and metallic printing are selected to demonstrate its generality. The performance of our planner has been verified by testing the resultant toolpaths in physical experiments. By using our planner, wire-frame models with up to 4.2k struts can be successfully printed, up to 93.3% of sharp turns on continuous fiber toolpaths can be avoided, and the thermal distortion in metallic printing can be reduced by 24.9%.
Impact-Resilient Orchestrated Robust Controller for Heavy-duty Hydraulic Manipulators
Heavy-duty operations, typically performed using heavy-duty hydraulic manipulators (HHMs), are susceptible to environmental contact due to tracking errors or sudden environmental changes. Therefore, beyond precise control design, it is crucial that the manipulator be resilient to potential impacts without relying on contact-force sensors, which mostly cannot be utilized. This paper proposes a novel force-sensorless robust impact-resilient controller for a generic 6-degree-of-freedom (DoF) HHM consisting of an anthropomorphic arm and a spherical wrist mechanism. The scheme consists of a neuroadaptive subsystem-based impedance controller, which is designed to ensure accurate tracking of position and orientation together with stabilization of the HHM upon contact, along with a novel generalized momentum observer, introduced for the first time in Plücker coordinates, to estimate the impact force. Finally, by leveraging the concepts of virtual stability and virtual power flow, the semi-global uniform ultimate boundedness of the entire system is assured. To demonstrate the efficacy and versatility of the proposed method, extensive experiments were conducted using a generic 6-DoF industrial HHM. The experimental results confirm the exceptional performance of the designed method, which achieves subcentimeter tracking accuracy and an 80% reduction in the impact of contact.
comment: This paper has been submitted for possible publication in IEEE
LOID: Lane Occlusion Inpainting and Detection for Enhanced Autonomous Driving Systems
Accurate lane detection is essential for effective path planning and lane following in autonomous driving, especially in scenarios with significant occlusion from vehicles and pedestrians. Existing models often struggle under such conditions, leading to unreliable navigation and safety risks. We propose two innovative approaches to enhance lane detection in these challenging environments, each showing notable improvements over current methods. The first approach, aug-Segment, improves conventional lane detection models by augmenting the CULanes training dataset with simulated occlusions and training a segmentation model. This method achieves a 12% improvement over a number of SOTA models on the CULanes dataset, demonstrating that enriched training data can better handle occlusions. However, since this model lacked robustness in certain settings, our main contribution is the second approach, LOID (Lane Occlusion Inpainting and Detection). LOID introduces an advanced lane detection network that uses an image processing pipeline to identify and mask occlusions. It then employs inpainting models to reconstruct the road environment in the occluded areas. The enhanced image is processed by a lane detection algorithm, resulting in 20% and 24% improvements over several SOTA models on the BDD100K and CULanes datasets, respectively, highlighting the effectiveness of this novel technique.
comment: 8 pages, 6 figures and 4 tables
Training Verifiably Robust Agents Using Set-Based Reinforcement Learning
Reinforcement learning often uses neural networks to solve complex control tasks. However, neural networks are sensitive to input perturbations, which makes their deployment in safety-critical environments challenging. This work lifts recent results from formally verifying neural networks against such disturbances to reinforcement learning in continuous state and action spaces using reachability analysis. While previous work mainly focuses on adversarial attacks for robust reinforcement learning, we train neural networks utilizing entire sets of perturbed inputs and maximize the worst-case reward. The obtained agents are verifiably more robust than agents obtained by related work, making them more applicable in safety-critical environments. This is demonstrated with an extensive empirical evaluation on four different benchmarks.
Using neuroevolution for designing soft medical devices
Soft robots can exhibit better performance in specific tasks compared to conventional robots, particularly in healthcare-related tasks. However, the field of soft robotics is still young, and designing them often involves mimicking natural organisms or relying heavily on human experts' creativity. A formal automated design process is required. We propose the use of neuroevolution-based algorithms to automatically design initial sketches of soft actuators that can enable the movement of future medical devices, such as drug-delivering catheters. The actuator morphologies discovered by algorithms like Age-Fitness Pareto Optimization, NeuroEvolution of Augmenting Topologies (NEAT), and Hypercube-based NEAT (HyperNEAT) were compared based on the maximum displacement reached and their robustness against various control methods. Analyzing the results granted the insight that neuroevolution-based algorithms produce better-performing and more robust actuators under different control methods. Moreover, the best-performing morphologies were discovered by the NEAT algorithm. As a future work aspect, we propose using the morphologies discovered here as test beds to optimize specialized controllers, enabling more effective functionality towards the desired deflections of the suggested soft catheters.
Brain Inspired Probabilistic Occupancy Grid Mapping with Hyperdimensional Computing
Real-time robotic systems require advanced perception, computation, and action capability. However, the main bottleneck in current autonomous systems is the trade-off between computational capability, energy efficiency and model determinism. World modeling, a key objective of many robotic systems, commonly uses occupancy grid mapping (OGM) as the first step towards building an end-to-end robotic system with perception, planning, autonomous maneuvering, and decision making capabilities. OGM divides the environment into discrete cells and assigns probability values to attributes such as occupancy and traversability. Existing methods fall into two categories: traditional methods and neural methods. Traditional methods rely on dense statistical calculations, while neural methods employ deep learning for probabilistic information processing. Recent works formulate a deterministic theory of neural computation at the intersection of cognitive science and vector symbolic architectures. In this study, we propose a Fourier-based hyperdimensional OGM system, VSA-OGM, combined with a novel application of Shannon entropy that retains the interpretability and stability of traditional methods along with the improved computational efficiency of neural methods. Our approach, validated across multiple datasets, achieves similar accuracy to covariant traditional methods while approximately reducing latency by 200x and memory by 1000x. Compared to invariant traditional methods, we see similar accuracy values while reducing latency by 3.7x. Moreover, we achieve 1.5x latency reductions compared to neural methods while eliminating the need for domain-specific model training.
Vision-assisted Avocado Harvesting with Aerial Bimanual Manipulation
Robotic fruit harvesting holds potential in precision agriculture to improve harvesting efficiency. While ground mobile robots are mostly employed in fruit harvesting, certain crops, like avocado trees, cannot be harvested efficiently from the ground alone. This is because of unstructured ground, irregular planting arrangements, and hard-to-reach fruits. In such cases, aerial robots integrated with manipulation capabilities can pave new ways in robotic harvesting. This paper outlines the design and implementation of a bimanual UAV that employs visual perception and learning to autonomously detect avocados, reach them, and harvest them. The dual-arm system comprises a gripper arm and a fixer arm to address a key challenge when harvesting avocados: once grasped, a rotational motion is the most efficient way to detach the avocado from the peduncle; however, the peduncle may store elastic energy, preventing the avocado from being harvested. The fixer arm aims to stabilize the peduncle, allowing the gripper arm to harvest. The integrated visual perception process enables the detection of avocados and the determination of their pose; the latter is then used to determine target points for a bimanual manipulation planner. Several experiments are conducted to assess the efficacy of each component, and integrated experiments assess the effectiveness of the system.
comment: First Two Authors Share Equal Contribution. 13 Pages, 15 Figures
Risk Occupancy: A New and Efficient Paradigm through Vehicle-Road-Cloud Collaboration
This study introduces the 4D Risk Occupancy within a vehicle-road-cloud architecture, integrating the road surface spatial, risk, and temporal dimensions, and endowing the algorithm with beyond-line-of-sight, all-angles, and efficient abilities. The algorithm simplifies risk modeling by focusing on directly observable information and key factors, drawing on the concept of Occupancy Grid Maps (OGM), and incorporating temporal prediction to effectively map current and future risk occupancy. Compared to conventional driving risk fields and grid occupancy maps, this algorithm can map global risks more efficiently, simply, and reliably. It can integrate future risk information, adapting to dynamic traffic environments. The 4D Risk Occupancy also unifies the expression of BEV detection and lane line detection results, enhancing the intuitiveness and unity of environmental perception. Using DAIR-V2X data, this paper validates the 4D Risk Occupancy algorithm and develops a local path planning model based on it. Qualitative experiments under various road conditions demonstrate the practicality and robustness of this local path planning model. Quantitative analysis shows that the path planning based on risk occupation significantly improves trajectory planning performance, increasing safety redundancy by 12.5% and reducing average deceleration by 5.41% at an initial braking speed of 8 m/s, thereby improving safety and comfort. This work provides a new global perception method and local path planning method through Vehicle-Road-Cloud architecture, offering a new perceptual paradigm for achieving safer and more efficient autonomous driving.
comment: 13 pages, 9 figures
Deep Stochastic Kinematic Models for Probabilistic Motion Forecasting in Traffic
In trajectory forecasting tasks for traffic, future output trajectories can be computed by advancing the ego vehicle's state with predicted actions according to a kinematics model. By unrolling predicted trajectories via time integration and models of kinematic dynamics, predicted trajectories should not only be kinematically feasible but also relate uncertainty from one timestep to the next. While current works in probabilistic prediction do incorporate kinematic priors for mean trajectory prediction, variance is often left as a learnable parameter, despite uncertainty in one timestep being inextricably tied to uncertainty in the previous timestep. In this paper, we show simple and differentiable analytical approximations describing the relationship between variance at one timestep and that at the next with the kinematic bicycle model. In our results, we find that encoding the relationship between variance across timesteps works especially well in suboptimal settings, such as with small or noisy datasets. We observe up to a 50% performance boost in partial dataset settings and up to an 8% performance boost in large-scale learning compared to previous kinematic prediction methods on SOTA trajectory forecasting architectures out-of-the-box, with no fine-tuning.
comment: 8 pages
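To make the idea of tying variance at one timestep to the next more concrete, here is a minimal sketch of first-order (Jacobian-based) covariance propagation through a rear-axle kinematic bicycle model with forward-Euler integration. This is a generic linearization and not necessarily the analytical approximation derived in the paper; the state layout `(x, y, theta, v)`, the default `wheelbase`, and the function name are assumptions made for illustration.

```python
import numpy as np


def propagate_bicycle(state, cov, accel, steer, wheelbase=2.7, dt=0.1,
                      process_noise=None):
    """One Euler step of a kinematic bicycle model with covariance propagation.

    state: (x, y, theta, v); cov: 4x4 state covariance.
    Returns the next mean state and the linearized next covariance
    F @ cov @ F.T + Q. Hypothetical helper, not the paper's formulas.
    """
    x, y, theta, v = state
    next_state = np.array([
        x + dt * v * np.cos(theta),
        y + dt * v * np.sin(theta),
        theta + dt * v * np.tan(steer) / wheelbase,
        v + dt * accel,
    ])
    # Jacobian of the discrete-time map with respect to the state.
    F = np.eye(4)
    F[0, 2] = -dt * v * np.sin(theta)
    F[0, 3] = dt * np.cos(theta)
    F[1, 2] = dt * v * np.cos(theta)
    F[1, 3] = dt * np.sin(theta)
    F[2, 3] = dt * np.tan(steer) / wheelbase
    Q = np.zeros((4, 4)) if process_noise is None else np.asarray(
        process_noise, dtype=float)
    next_cov = F @ np.asarray(cov, dtype=float) @ F.T + Q
    return next_state, next_cov
```

Every operation here is differentiable, so the same recursion could in principle be unrolled inside a learned forecaster; here the controls are treated as deterministic, which a learned model would likely relax.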
Orchestrated Robust Controller for the Precision Control of Heavy-duty Hydraulic Manipulators
Vast industrial investment along with increased academic research on hydraulic heavy-duty manipulators has unavoidably paved the way for their automatization, necessitating the design of robust and high-precision controllers. In this study, an orchestrated robust controller is designed to address the mentioned issue for a generic manipulator with an anthropomorphic arm and spherical wrist. To do so, the entire robotic system is decomposed into subsystems, and a robust controller is designed at each local subsystem by considering unknown model uncertainties, unknown disturbances, and compound input nonlinearities, thanks to virtual decomposition control (VDC). As such, radial basis function neural networks (RBFNNs) are incorporated into VDC to tackle unknown disturbances and uncertainties, resulting in novel decentralized RBFNNs. All robust local controllers designed at each local subsystem are then orchestrated to accomplish high-precision control. In the end, for the first time in the context of VDC, semi-global uniform ultimate boundedness is achieved under the designed controller. The validity of the theoretical results is verified by performing extensive simulations and experiments on a 6-degrees-of-freedom industrial manipulator with a nominal lifting capacity of 600 kg at 5 meters reach. Comparison of the simulation results with a state-of-the-art controller, together with the provided experimental results, demonstrates that the proposed method fulfills all of its promises and performs excellently.
comment: Submitted to IEEE Transactions on Robotics. Revised version resubmitted after receiving a 'revise and resubmit' decision
PRIME: Scaffolding Manipulation Tasks with Behavior Primitives for Data-Efficient Imitation Learning
Imitation learning has shown great potential for enabling robots to acquire complex manipulation behaviors. However, these algorithms suffer from high sample complexity in long-horizon tasks, where compounding errors accumulate over the task horizons. We present PRIME (PRimitive-based IMitation with data Efficiency), a behavior primitive-based framework designed for improving the data efficiency of imitation learning. PRIME scaffolds robot tasks by decomposing task demonstrations into primitive sequences, followed by learning a high-level control policy to sequence primitives through imitation learning. Our experiments demonstrate that PRIME achieves a significant performance improvement in multi-stage manipulation tasks, with 10-34% higher success rates in simulation over state-of-the-art baselines and 20-48% on physical hardware.
On Safety and Liveness Filtering Using Hamilton-Jacobi Reachability Analysis
Hamilton-Jacobi (HJ) reachability-based filtering provides a powerful framework to co-optimize performance and safety (or liveness) for autonomous systems. Under this filtering scheme, a nominal controller is minimally modified to ensure system safety or liveness. However, the resulting controllers can exhibit abrupt switching and bang-bang behavior, which is not suitable for applications of autonomous systems in the real world. This work presents a novel, unifying framework to design safety and liveness filters through reachability analysis. We explicitly characterize the maximal set of control inputs that ensures safety (or liveness) at a given state. Different safety filters can then be constructed using different subsets of this maximal set along with a projection operator to modify the nominal controller. We use the proposed framework to design three safety filters, each balancing performance, computation time, and smoothness differently. We highlight their relative strengths and limitations by applying these filters to autonomous navigation and rocket landing scenarios and on a physical robot testbed. We also discuss practical aspects associated with implementing these filters on real-world autonomous systems. Our research advances the understanding and potential application of reachability-based controllers on real-world autonomous systems.
comment: 16 pages, 13 figures
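The projection step described in the reachability-filtering entry above can be sketched in its simplest form. Assuming control-affine dynamics, the condition that the value function does not decrease is affine in the control, so the admissible controls at a given state form a half-space `{u : a @ u + b >= 0}`. The closed-form Euclidean projection below is only an illustration under that assumption; `a` and `b` are hypothetical placeholders for terms computed from the value function gradient, and control bounds are ignored.

```python
import numpy as np


def project_onto_halfspace(u_nom, a, b):
    """Return the point of {u : a @ u + b >= 0} closest to u_nom.

    If the nominal control already satisfies the constraint it is
    returned unchanged, which gives a minimally invasive filter.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    a = np.asarray(a, dtype=float)
    margin = a @ u_nom + b
    if margin >= 0.0:
        return u_nom.copy()
    norm_sq = a @ a
    if norm_sq == 0.0:
        raise ValueError("constraint is infeasible: a is zero and b < 0")
    # Move along a just far enough to reach the constraint boundary.
    return u_nom - (margin / norm_sq) * a
```

Projecting onto a strict subset of the admissible set instead of the full half-space is one way to trade performance for smoothness, in line with the abstract's remark that different subsets yield different filters.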
Multiagent Systems
Joint-perturbation simultaneous pseudo-gradient
We study the problem of computing an approximate Nash equilibrium of a game whose strategy space is continuous without access to gradients of the utility function. Such games arise, for example, when players' strategies are represented by the parameters of a neural network. Lack of access to gradients is common in reinforcement learning settings, where the environment is treated as a black box, as well as equilibrium finding in mechanisms such as auctions, where the mechanism's payoffs are discontinuous in the players' actions. To tackle this problem, we turn to zeroth-order optimization techniques that combine pseudo-gradients with equilibrium-finding dynamics. Specifically, we introduce a new technique that requires a number of utility function evaluations per iteration that is constant rather than linear in the number of players. It achieves this by performing a single joint perturbation on all players' strategies, rather than perturbing each one individually. This yields a dramatic improvement for many-player games, especially when the utility function is expensive to compute in terms of wall time, memory, money, or other resources. We evaluate our approach on various games, including auctions, which have important real-world applications. Our approach yields a significant reduction in the run time required to reach an approximate Nash equilibrium.
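The joint-perturbation idea above can be sketched with an SPSA-style estimator: draw one random perturbation for the whole strategy profile, evaluate every player's utility at the two perturbed profiles, and reuse those two evaluations for each player's pseudo-gradient. This is an assumption-laden sketch (Rademacher perturbations, a central difference, and a hypothetical `utilities` callback that returns one utility per player); the paper's exact estimator and equilibrium-finding dynamics may differ.

```python
import numpy as np


def joint_pseudo_gradient(utilities, strategies, sigma, rng):
    """Estimate each player's utility gradient w.r.t. its own strategy.

    utilities: callable mapping a profile of shape (n_players, dim) to an
        array of n_players utilities (hypothetical interface).
    Uses exactly two utility evaluations regardless of the player count.
    """
    x = np.asarray(strategies, dtype=float)
    z = rng.choice([-1.0, 1.0], size=x.shape)   # one joint perturbation
    u_plus = np.asarray(utilities(x + sigma * z), dtype=float)
    u_minus = np.asarray(utilities(x - sigma * z), dtype=float)
    scale = (u_plus - u_minus) / (2.0 * sigma)  # one scalar per player
    return scale[:, None] * z                   # player i gets scale[i] * z[i]
```

Each player would then take an ascent step along its own row of the result. With coupled utilities or multi-dimensional strategies the estimate is noisy, because the other players' perturbations leak into each player's difference, so small step sizes or averaging over several draws are usually needed.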
Energy-efficient flocking with nonlinear navigational feedback
Modeling collective motion in multi-agent systems has gained significant attention. Of particular interest are sufficient conditions for flocking dynamics. We present a generalization of the multi-agent model of Olfati--Saber with nonlinear navigational feedback forces. Unlike the original model, ours is not generally dissipative and lacks an obvious Lyapunov function. We address this by proposing a method to prove the existence of an attractor without relying on LaSalle's principle. Other contributions are as follows. We prove that, under mild conditions, agents' velocities approach the center of mass velocity exponentially, with the distance between the center of mass and the virtual leader being bounded. In the dissipative case, we show existence of a broad class of nonlinear control forces for which the attractor does not contain periodic trajectories, which cannot be ruled out by LaSalle's principle. Finally, we conduct a computational investigation of the problem of reducing propulsion energy consumption by selecting appropriate navigational feedback forces.
Systems and Control (CS)
ByCAN: Reverse Engineering Controller Area Network (CAN) Messages from Bit to Byte Level
As the primary standard protocol for modern cars, the Controller Area Network (CAN) is a critical research target for automotive cybersecurity threats and autonomous applications. As the decoding specification of CAN is a proprietary black-box maintained by Original Equipment Manufacturers (OEMs), conducting related research and industry developments can be challenging without a comprehensive understanding of the meaning of CAN messages. In this paper, we propose a fully automated reverse-engineering system, named ByCAN, to reverse engineer CAN messages. ByCAN outperforms existing research by introducing byte-level clusters and integrating multiple features at both byte and bit levels. ByCAN employs the clustering and template matching algorithms to automatically decode the specifications of CAN frames without the need for prior knowledge. Experimental results demonstrate that ByCAN achieves high accuracy in slicing and labeling performance, i.e., the identification of CAN signal boundaries and labels. In the experiments, ByCAN achieves slicing accuracy of 80.21%, slicing coverage of 95.21%, and labeling accuracy of 68.72% for general labels when analyzing the real-world CAN frames.
comment: Accepted by IEEE Internet of Things Journal, 15 pages, 5 figures, 6 tables
Reinforcement Learning Compensated Model Predictive Control for Off-road Driving on Unknown Deformable Terrain
This study presents an Actor-Critic reinforcement learning Compensated Model Predictive Controller (AC2MPC) designed for high-speed, off-road autonomous driving on deformable terrains. Addressing the difficulty of modeling unknown tire-terrain interaction and ensuring real-time control feasibility and performance, this framework integrates deep reinforcement learning with a model predictive controller to manage unmodeled nonlinear dynamics. We evaluate the controller framework over constant and varying velocity profiles using the high-fidelity simulator Project Chrono. Our findings demonstrate that our controller statistically outperforms standalone model-based and learning-based controllers over three unknown terrains that represent a sandy deformable track, a sandy and rocky track, and a cohesive clay-like deformable soil track. Despite varied and previously unseen terrain characteristics, this framework generalized well enough to track longitudinal reference speeds with the least error. Furthermore, this framework required significantly less training data than a purely learning-based controller, converging in fewer steps while delivering better performance. Even when under-trained, this controller outperformed the standalone controllers, highlighting its potential for safer and more efficient real-world deployment.
comment: Submitted to IEEE Transactions on Intelligent Vehicles as a Regular Paper
Analysis and Design of Satellite Constellation Spare Strategy Using Markov Chain
This paper introduces an analysis and design method for an optimal spare management policy using a Markov chain for a large-scale satellite constellation. We propose an analysis methodology for spare strategies using a multi-echelon $(r,q)$ inventory control model with a Markov chain, and review two different spare strategies: direct resupply, which inserts spares directly into the constellation orbit using launch vehicles; and indirect resupply, which places spares into parking orbits before transferring them to the constellation orbit. Furthermore, we propose an optimization formulation utilizing the results of the proposed analysis method, and an optimal solution is found using a genetic algorithm.
comment: This paper was presented at the 2024 AAS/AIAA Astrodynamics Specialist Conference, August 11-15, 2024, Broomfield, Colorado, USA
Time Efficient Rate Feedback Tracking Controller with Slew Rate and Control Constraint
This paper proposes a time-efficient attitude-tracking controller considering the slew rate constraint and control constraint. The algorithm defines the sliding surface, which is the linear combination of command, body, and regulating angular velocity, and utilizes the sliding surface to derive the control command that guarantees finite time stability. The regulating rate, which is an angular velocity regulating the attitude error between the command and body frame, is defined along the instantaneous eigen-axis between the two frames to minimize the rotation angle. In addition, the regulating rate is shaped such that the slew rate constraint is satisfied while the time to regulation is minimized with consideration of the control constraint. Practical scenarios involving Earth observation satellites are used to validate the algorithm's performance.
comment: This paper was presented at the 2024 AAS/AIAA Astrodynamics Specialist Conference, August 11-15, 2024, Broomfield, Colorado, USA
Accelerating Chance-constrained SCED via Scenario Compression
This paper studies some compression methods to accelerate the scenario-based chance-constrained security-constrained economic dispatch (SCED) problem. In particular, we show that by exclusively employing the vertices after convex hull compression, an equivalent solution can be obtained compared to utilizing the entire scenario set. For other compression methods that might relax the original solution, such as box compression, this paper presents the compression risk validation scheme to assess the risk arising from the sample space. By quantifying the risk associated with compression, decision-makers are empowered to select either solution risk or compression risk as the risk metric, depending on the complexity of specific problems. Numerical examples based on the 118-bus system and synthetic Texas grids compare these two risk metrics. The results also demonstrate the efficiency of compression methods in both problem formulation and solving processes.
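The reason hull vertices suffice can be seen in a small sketch. It assumes a constraint that is linear in a two-dimensional uncertainty vector; the function names and the 2-D setting are illustrative assumptions, not the paper's formulation. The worst case of a linear term over a scenario set is attained at a vertex of its convex hull, whereas a bounding box can only over-estimate it.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def box_corners(points):
    """Corners of the axis-aligned bounding box (box compression)."""
    xs, ys = zip(*points)
    return [(min(xs), min(ys)), (min(xs), max(ys)),
            (max(xs), min(ys)), (max(xs), max(ys))]

def worst_case(a, scenarios):
    """Largest value of the linear term a . xi over a scenario set."""
    return max(a[0] * s[0] + a[1] * s[1] for s in scenarios)
```

For any coefficient vector `a`, `worst_case(a, convex_hull(S))` matches `worst_case(a, S)` up to rounding, while `worst_case(a, box_corners(S))` is at least as large, which is the extra conservatism that a compression risk assessment would need to quantify.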
Optimal Strip Attitude Command of Earth Observation Satellite using Differential Dynamic Programming
This paper addresses the optimal scan profile problem for strip imaging in an Earth observation satellite (EOS) equipped with a time-delay integration (TDI) camera. Modern TDI cameras can control image integration frequency during imaging operation, adding an additional degree of freedom (DOF) to the imaging operation. On the other hand, modern agile EOS is capable of imaging non-parallel ground targets, which require a substantial amount of angular velocity and angular acceleration during operation. We leverage this DOF to minimize various factors impacting image quality, such as angular velocity. Initially, we derive analytic expressions for angular velocity based on kinematic equations. These expressions are then used to formulate a constrained optimal control problem (OCP), which we solve using differential dynamic programming (DDP). We validate our approach through testing and comparison with reference methods across various practical scenarios. Simulation results demonstrate that our proposed method efficiently achieves near-optimal solutions without encountering non-convergence issues.
comment: This paper was presented at the 2024 AAS/AIAA Astrodynamics Specialist Conference, August 11-15, 2024, Broomfield, Colorado, USA
Reinforcement learning-based adaptive speed controllers in mixed autonomy condition
The integration of Automated Vehicles (AVs) into traffic flow holds the potential to significantly improve traffic congestion by enabling AVs to function as actuators within the flow. This paper introduces an adaptive speed controller tailored for scenarios of mixed autonomy, where AVs interact with human-driven vehicles. We model the traffic dynamics using a system of strongly coupled Partial and Ordinary Differential Equations (PDE-ODE), with the PDE capturing the general flow of human-driven traffic and the ODE characterizing the trajectory of the AVs. A speed policy for AVs is derived using a Reinforcement Learning (RL) algorithm structured within an Actor-Critic (AC) framework. This algorithm interacts with the PDE-ODE model to optimize the AV control policy. Numerical simulations are presented to demonstrate the controller's impact on traffic patterns, showing the potential of AVs to improve traffic flow and reduce congestion.
Highly Sensitive and Compact Quad-Band Ambient RF Energy Harvester
A highly efficient and compact quad-band energy harvester (QBEH) circuit based on the extended composite right- and left-handed transmission lines (ECRLHTLs) technique is presented. The design procedure based on ECRLHTLs at four desired frequency bands is introduced to realize a quad-band matching network (QBMN). The proposed QBEH operates at four frequency bands: f1 = 0.75 GHz, f2 = 1.8 GHz, f3 = 2.4 GHz, and f4 = 5.8 GHz. The simulation and experimental results of the proposed QBEH exhibit overall (end-to-end) efficiencies of 55% and 70% when excited at the four frequency bands simultaneously with -20 dBm (10 µW) and -10 dBm (100 µW) input power, respectively. Owing to the multi-band excitation technique and the radio frequency (RF) combining method in the QBEH circuit, the sensitivity is improved, and sufficient power is generated to realize a self-sustainable sensor (S3) using ambient low-level RF signals. Favorable impedance matching over a broad low-input-power range of -50 to -10 dBm (0.01 to 100 µW) is achieved, enabling the proposed QBEH to harvest ambient RF energy in urban environments. Moreover, an accurate theoretical analysis based on the Volterra series and the Laplace transformation is presented to maximize the output DC current of the rectifier over a wide input power range. Theoretical, simulation, and measurement results are in excellent agreement, which validates the design accuracy of the proposed quad-band structure. The proposed energy harvesting technique has the potential to practically realize a green energy harvesting solution that provides a viable energy source for low-powered sensors and IoT devices, anytime, anywhere.
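The power levels quoted in the abstract follow from the standard dBm definition, P[mW] = 10^(dBm/10). A short sketch of the conversion is given below; the function name is an illustrative assumption.

```python
def dbm_to_microwatt(p_dbm):
    """Convert a power level in dBm to microwatts.

    By definition P[mW] = 10 ** (dBm / 10), and 1 mW = 1000 uW.
    """
    return 1000.0 * 10.0 ** (p_dbm / 10.0)
```

With this definition, the -20, -10, and -50 dBm levels correspond to 10, 100, and 0.01 microwatts, matching the values given in parentheses in the abstract.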
Dual-Band, Slant-Polarized MIMO Antenna Set for Vehicular Communication
Slant-polarized Multi Input Multi Output (MIMO) antennas can improve the performance of mobile communication systems in terms of channel capacity. In particular, implementing MIMO configurations for automotive applications requires high-gain, wideband, low-profile, and affordable antennas in the communication link. In this work, the design, simulation, and measurement of a new dual-band slant-polarized MIMO antenna with an HPBW (Half Power Beam Width) of around 90° are presented. Four replicas of the proposed antenna set are then placed at four different poles (North, South, West, and East) to cover 360° around the vehicle as an omni-directional pattern. In a real-world scenario, the proper antenna set is selected to communicate with the intended user. Each slant MIMO antenna set consists of two inclined (45°) low-band (LB: 700 to 900 MHz) and two inclined high-band (HB: 1.7 to 2.7 GHz) log-periodic antennas. The measured gains of the LB and HB antennas are 7 dBi and 8 dBi, respectively. Close agreement between simulation and measurement results confirms the accuracy of the design and simulation procedures of the antenna system using an optimization algorithm (genetic method). The proposed antenna is also measured in the field for industrial applications.
Planning of Off-Grid Renewable Power to Ammonia Systems with Heterogeneous Flexibility: A Multistakeholder Equilibrium Perspective
Off-grid renewable power to ammonia (ReP2A) systems present a promising pathway toward carbon neutrality in both the energy and chemical industries. However, due to chemical safety requirements, the limited flexibility of ammonia synthesis poses a challenge when attempting to align with the variable hydrogen flow produced from renewable power. This necessitates the optimal sizing of equipment capacity for effective and coordinated production across the system. Additionally, an ReP2A system may involve multiple stakeholders with varying degrees of operational flexibility, complicating the planning problem. This paper examines the multistakeholder sizing equilibrium (MSSE) of the ReP2A system. First, we propose an MSSE model that accounts for individual planning decisions and the competing economic interests of the stakeholders of power generation, hydrogen production, and ammonia synthesis. We then construct an equivalent optimization problem based on Karush--Kuhn--Tucker (KKT) conditions to determine the equilibrium. Following this, we decompose the problem in the temporal dimension and solve it via multicut generalized Benders decomposition (GBD) to address long-term balancing issues. Case studies based on a realistic project reveal that the equilibrium does not naturally balance the interests of all stakeholders due to their heterogeneous characteristics. Our findings suggest that benefit transfer agreements ensure mutual benefits and the successful implementation of ReP2A projects.
Training Verifiably Robust Agents Using Set-Based Reinforcement Learning
Reinforcement learning often uses neural networks to solve complex control tasks. However, neural networks are sensitive to input perturbations, which makes their deployment in safety-critical environments challenging. This work lifts recent results from formally verifying neural networks against such disturbances to reinforcement learning in continuous state and action spaces using reachability analysis. While previous work mainly focuses on adversarial attacks for robust reinforcement learning, we train neural networks utilizing entire sets of perturbed inputs and maximize the worst-case reward. The obtained agents are verifiably more robust than agents obtained by related work, making them more applicable in safety-critical environments. This is demonstrated with an extensive empirical evaluation of four different benchmarks.
On Safety and Liveness Filtering Using Hamilton-Jacobi Reachability Analysis
Hamilton-Jacobi (HJ) reachability-based filtering provides a powerful framework to co-optimize performance and safety (or liveness) for autonomous systems. Under this filtering scheme, a nominal controller is minimally modified to ensure system safety or liveness. However, the resulting controllers can exhibit abrupt switching and bang-bang behavior, which is not suitable for applications of autonomous systems in the real world. This work presents a novel, unifying framework to design safety and liveness filters through reachability analysis. We explicitly characterize the maximal set of control inputs that ensures safety (or liveness) at a given state. Different safety filters can then be constructed using different subsets of this maximal set along with a projection operator to modify the nominal controller. We use the proposed framework to design three safety filters, each balancing performance, computation time, and smoothness differently. We highlight their relative strengths and limitations by applying these filters to autonomous navigation and rocket landing scenarios and on a physical robot testbed. We also discuss practical aspects associated with implementing these filters on real-world autonomous systems. Our research advances the understanding and potential application of reachability-based controllers on real-world autonomous systems.
comment: 16 pages, 13 figures
Systems and Control (EESS)
ByCAN: Reverse Engineering Controller Area Network (CAN) Messages from Bit to Byte Level
As the primary standard protocol for modern cars, the Controller Area Network (CAN) is a critical research target for automotive cybersecurity threats and autonomous applications. As the decoding specification of CAN is a proprietary black-box maintained by Original Equipment Manufacturers (OEMs), conducting related research and industry developments can be challenging without a comprehensive understanding of the meaning of CAN messages. In this paper, we propose a fully automated reverse-engineering system, named ByCAN, to reverse engineer CAN messages. ByCAN outperforms existing research by introducing byte-level clusters and integrating multiple features at both byte and bit levels. ByCAN employs the clustering and template matching algorithms to automatically decode the specifications of CAN frames without the need for prior knowledge. Experimental results demonstrate that ByCAN achieves high accuracy in slicing and labeling performance, i.e., the identification of CAN signal boundaries and labels. In the experiments, ByCAN achieves slicing accuracy of 80.21%, slicing coverage of 95.21%, and labeling accuracy of 68.72% for general labels when analyzing the real-world CAN frames.
comment: Accepted by IEEE Internet of Things Journal, 15 pages, 5 figures, 6 tables
Reinforcement Learning Compensated Model Predictive Control for Off-road Driving on Unknown Deformable Terrain
This study presents an Actor-Critic reinforcement learning Compensated Model Predictive Controller (AC2MPC) designed for high-speed, off-road autonomous driving on deformable terrains. Addressing the difficulty of modeling unknown tire-terrain interaction and ensuring real-time control feasibility and performance, this framework integrates deep reinforcement learning with a model predictive controller to manage unmodeled nonlinear dynamics. We evaluate the controller framework over constant and varying velocity profiles using the high-fidelity simulator Project Chrono. Our findings demonstrate that our controller statistically outperforms standalone model-based and learning-based controllers over three unknown terrains that represent a sandy deformable track, a sandy and rocky track, and a cohesive clay-like deformable soil track. Despite varied and previously unseen terrain characteristics, this framework generalized well enough to track longitudinal reference speeds with the least error. Furthermore, this framework required significantly less training data than a purely learning-based controller, converging in fewer steps while delivering better performance. Even when under-trained, this controller outperformed the standalone controllers, highlighting its potential for safer and more efficient real-world deployment.
comment: Submitted to IEEE Transactions on Intelligent Vehicles as a Regular Paper
Analysis and Design of Satellite Constellation Spare Strategy Using Markov Chain
This paper introduces an analysis and design method for an optimal spare management policy using a Markov chain for a large-scale satellite constellation. We propose an analysis methodology for spare strategies using a multi-echelon $(r,q)$ inventory control model with a Markov chain, and review two different spare strategies: direct resupply, which inserts spares directly into the constellation orbit using launch vehicles; and indirect resupply, which places spares into parking orbits before transferring them to the constellation orbit. Furthermore, we propose an optimization formulation utilizing the results of the proposed analysis method, and an optimal solution is found using a genetic algorithm.
comment: This paper was presented at the 2024 AAS/AIAA Astrodynamics Specialist Conference, August 11-15, 2024, Broomfield, Colorado, USA
Time Efficient Rate Feedback Tracking Controller with Slew Rate and Control Constraint
This paper proposes a time-efficient attitude-tracking controller considering the slew rate constraint and control constraint. The algorithm defines the sliding surface, which is the linear combination of command, body, and regulating angular velocity, and utilizes the sliding surface to derive the control command that guarantees finite time stability. The regulating rate, which is an angular velocity regulating the attitude error between the command and body frame, is defined along the instantaneous eigen-axis between the two frames to minimize the rotation angle. In addition, the regulating rate is shaped such that the slew rate constraint is satisfied while the time to regulation is minimized with consideration of the control constraint. Practical scenarios involving Earth observation satellites are used to validate the algorithm's performance.
comment: This paper was presented at the 2024 AAS/AIAA Astrodynamics Specialist Conference, August 11-15, 2024, Broomfield, Colorado, USA
Accelerating Chance-constrained SCED via Scenario Compression
This paper studies some compression methods to accelerate the scenario-based chance-constrained security-constrained economic dispatch (SCED) problem. In particular, we show that by exclusively employing the vertices after convex hull compression, an equivalent solution can be obtained compared to utilizing the entire scenario set. For other compression methods that might relax the original solution, such as box compression, this paper presents the compression risk validation scheme to assess the risk arising from the sample space. By quantifying the risk associated with compression, decision-makers are empowered to select either solution risk or compression risk as the risk metric, depending on the complexity of specific problems. Numerical examples based on the 118-bus system and synthetic Texas grids compare these two risk metrics. The results also demonstrate the efficiency of compression methods in both problem formulation and solving processes.
Optimal Strip Attitude Command of Earth Observation Satellite using Differential Dynamic Programming
This paper addresses the optimal scan profile problem for strip imaging in an Earth observation satellite (EOS) equipped with a time-delay integration (TDI) camera. Modern TDI cameras can control image integration frequency during imaging operation, adding an additional degree of freedom (DOF) to the imaging operation. On the other hand, modern agile EOS is capable of imaging non-parallel ground targets, which require a substantial amount of angular velocity and angular acceleration during operation. We leverage this DOF to minimize various factors impacting image quality, such as angular velocity. Initially, we derive analytic expressions for angular velocity based on kinematic equations. These expressions are then used to formulate a constrained optimal control problem (OCP), which we solve using differential dynamic programming (DDP). We validate our approach through testing and comparison with reference methods across various practical scenarios. Simulation results demonstrate that our proposed method efficiently achieves near-optimal solutions without encountering non-convergence issues.
comment: This paper was presented at the 2024 AAS/AIAA Astrodynamics Specialist Conference, August 11-15, 2024, Broomfield, Colorado, USA
Reinforcement learning-based adaptive speed controllers in mixed autonomy condition
The integration of Automated Vehicles (AVs) into traffic flow holds the potential to significantly improve traffic congestion by enabling AVs to function as actuators within the flow. This paper introduces an adaptive speed controller tailored for scenarios of mixed autonomy, where AVs interact with human-driven vehicles. We model the traffic dynamics using a system of strongly coupled Partial and Ordinary Differential Equations (PDE-ODE), with the PDE capturing the general flow of human-driven traffic and the ODE characterizing the trajectory of the AVs. A speed policy for AVs is derived using a Reinforcement Learning (RL) algorithm structured within an Actor-Critic (AC) framework. This algorithm interacts with the PDE-ODE model to optimize the AV control policy. Numerical simulations are presented to demonstrate the controller's impact on traffic patterns, showing the potential of AVs to improve traffic flow and reduce congestion.
Highly Sensitive and Compact Quad-Band Ambient RF Energy Harvester
A highly efficient and compact quad-band energy harvester (QBEH) circuit based on the extended composite right- and left-handed transmission lines (ECRLHTLs) technique is presented. The design procedure based on ECRLHTLs at four desired frequency bands is introduced to realize a quad-band matching network (QBMN). The proposed QBEH operates at four frequency bands: f1 = 0.75 GHz, f2 = 1.8 GHz, f3 = 2.4 GHz, and f4 = 5.8 GHz. The simulation and experimental results of the proposed QBEH exhibit overall (end-to-end) efficiencies of 55% and 70% when excited at the four frequency bands simultaneously with -20 dBm (10 µW) and -10 dBm (100 µW) input power, respectively. Owing to the multi-band excitation technique and the radio frequency (RF) combining method in the QBEH circuit, the sensitivity is improved, and sufficient power is generated to realize a self-sustainable sensor (S3) using ambient low-level RF signals. Favorable impedance matching over a broad low-input-power range of -50 to -10 dBm (0.01 to 100 µW) is achieved, enabling the proposed QBEH to harvest ambient RF energy in urban environments. Moreover, an accurate theoretical analysis based on the Volterra series and the Laplace transformation is presented to maximize the output DC current of the rectifier over a wide input power range. Theoretical, simulation, and measurement results are in excellent agreement, which validates the design accuracy of the proposed quad-band structure. The proposed energy harvesting technique has the potential to practically realize a green energy harvesting solution that provides a viable energy source for low-powered sensors and IoT devices, anytime, anywhere.
Dual-Band, Slant-Polarized MIMO Antenna Set for Vehicular Communication
Slant-polarized Multi Input Multi Output (MIMO) antennas can improve the performance of mobile communication systems in terms of channel capacity. In particular, implementing MIMO configurations for automotive applications requires high-gain, wideband, low-profile, and affordable antennas in the communication link. In this work, the design, simulation, and measurement of a new dual-band slant-polarized MIMO antenna with an HPBW (Half Power Beam Width) of around 90° are presented. Four replicas of the proposed antenna set are then placed at four different poles (North, South, West, and East) to cover 360° around the vehicle as an omni-directional pattern. In a real-world scenario, the proper antenna set is selected to communicate with the intended user. Each slant MIMO antenna set consists of two inclined (45°) low-band (LB: 700 to 900 MHz) and two inclined high-band (HB: 1.7 to 2.7 GHz) log-periodic antennas. The measured gains of the LB and HB antennas are 7 dBi and 8 dBi, respectively. Close agreement between simulation and measurement results confirms the accuracy of the design and simulation procedures of the antenna system using an optimization algorithm (genetic method). The proposed antenna is also measured in the field for industrial applications.
Planning of Off-Grid Renewable Power to Ammonia Systems with Heterogeneous Flexibility: A Multistakeholder Equilibrium Perspective
Off-grid renewable power to ammonia (ReP2A) systems present a promising pathway toward carbon neutrality in both the energy and chemical industries. However, due to chemical safety requirements, the limited flexibility of ammonia synthesis poses a challenge when attempting to align with the variable hydrogen flow produced from renewable power. This necessitates the optimal sizing of equipment capacity for effective and coordinated production across the system. Additionally, an ReP2A system may involve multiple stakeholders with varying degrees of operational flexibility, complicating the planning problem. This paper examines the multistakeholder sizing equilibrium (MSSE) of the ReP2A system. First, we propose an MSSE model that accounts for individual planning decisions and the competing economic interests of the stakeholders of power generation, hydrogen production, and ammonia synthesis. We then construct an equivalent optimization problem based on Karush--Kuhn--Tucker (KKT) conditions to determine the equilibrium. Following this, we decompose the problem in the temporal dimension and solve it via multicut generalized Benders decomposition (GBD) to address long-term balancing issues. Case studies based on a realistic project reveal that the equilibrium does not naturally balance the interests of all stakeholders due to their heterogeneous characteristics. Our findings suggest that benefit transfer agreements ensure mutual benefits and the successful implementation of ReP2A projects.
Training Verifiably Robust Agents Using Set-Based Reinforcement Learning
Reinforcement learning often uses neural networks to solve complex control tasks. However, neural networks are sensitive to input perturbations, which makes their deployment in safety-critical environments challenging. This work lifts recent results from formally verifying neural networks against such disturbances to reinforcement learning in continuous state and action spaces using reachability analysis. While previous work mainly focuses on adversarial attacks for robust reinforcement learning, we train neural networks utilizing entire sets of perturbed inputs and maximize the worst-case reward. The obtained agents are verifiably more robust than agents obtained by related work, making them more applicable in safety-critical environments. This is demonstrated with an extensive empirical evaluation of four different benchmarks.
On Safety and Liveness Filtering Using Hamilton-Jacobi Reachability Analysis
Hamilton-Jacobi (HJ) reachability-based filtering provides a powerful framework to co-optimize performance and safety (or liveness) for autonomous systems. Under this filtering scheme, a nominal controller is minimally modified to ensure system safety or liveness. However, the resulting controllers can exhibit abrupt switching and bang-bang behavior, which is not suitable for applications of autonomous systems in the real world. This work presents a novel, unifying framework to design safety and liveness filters through reachability analysis. We explicitly characterize the maximal set of control inputs that ensures safety (or liveness) at a given state. Different safety filters can then be constructed using different subsets of this maximal set along with a projection operator to modify the nominal controller. We use the proposed framework to design three safety filters, each balancing performance, computation time, and smoothness differently. We highlight their relative strengths and limitations by applying these filters to autonomous navigation and rocket landing scenarios and on a physical robot testbed. We also discuss practical aspects associated with implementing these filters on real-world autonomous systems. Our research advances the understanding and potential application of reachability-based controllers on real-world autonomous systems.
comment: 16 pages, 13 figures
Systems and Control (CS)
Gaussian Processes with Noisy Regression Inputs for Dynamical Systems
This paper is centered around the approximation of dynamical systems by means of Gaussian processes. To this end, trajectories of such systems must be collected to be used as training data. The measurements of these trajectories are typically noisy, which implies that both the regression inputs and outputs are corrupted by noise. However, most of the literature considers only noise in the regression outputs. In this paper, we show how to account for the noise in the regression inputs in an extended Gaussian process framework to approximate scalar and multidimensional systems. We demonstrate the potential of our framework by comparing it to different state-of-the-art methods in several simulation examples.
comment: 6 pages
Robust Stochastic Shortest-Path Planning via Risk-Sensitive Incremental Sampling
With the pervasiveness of Stochastic Shortest-Path (SSP) problems in high-risk industries, such as last-mile autonomous delivery and supply chain management, robust planning algorithms are crucial for ensuring successful task completion while mitigating hazardous outcomes. Mainstream chance-constrained incremental sampling techniques for solving SSP problems tend to be overly conservative and typically do not consider the likelihood of undesirable tail events. We propose an alternative risk-aware approach inspired by the asymptotically-optimal Rapidly-Exploring Random Trees (RRT*) planning algorithm, which selects nodes along path segments with minimal Conditional Value-at-Risk (CVaR). Our motivation rests on the step-wise coherence of the CVaR risk measure and the optimal substructure of the SSP problem. Thus, optimizing with respect to the CVaR at each sampling iteration necessarily leads to an optimal path in the limit of the sample size. We validate our approach via numerical path planning experiments in a two-dimensional grid world with obstacles and stochastic path-segment lengths. Our simulation results show that incorporating risk into the tree growth process yields paths with lengths that are significantly less sensitive to variations in the noise parameter, or equivalently, paths that are more robust to environmental uncertainty. Algorithmic analyses reveal similar query time and memory space complexity to the baseline RRT* procedure, with only a marginal increase in processing time. This increase is offset by significantly lower noise sensitivity and reduced planner failure rates.
comment: Accepted for presentation at the 2024 IEEE Conference on Decision and Control (CDC)
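To make the risk measure in the abstract above concrete, the following is a minimal sketch of an empirical CVaR estimate and of choosing between two candidate path segments by it. The function name `empirical_cvar`, the tail fraction, and the sample costs are illustrative assumptions; this is not the paper's planner.

```python
import math

def empirical_cvar(costs, tail_fraction):
    """Mean of the worst `tail_fraction` share of the sampled costs."""
    ordered = sorted(costs, reverse=True)          # worst (largest) costs first
    k = max(1, math.ceil(tail_fraction * len(ordered) - 1e-12))
    return sum(ordered[:k]) / k

# Two hypothetical path segments with the same mean length (5.0)
# but very different tails.
segment_costs = {
    "A": [5.0] * 10,            # deterministic length
    "B": [4.0] * 9 + [14.0],    # usually shorter, occasionally much longer
}

cvar_by_segment = {name: empirical_cvar(c, 0.2) for name, c in segment_costs.items()}
chosen = min(cvar_by_segment, key=cvar_by_segment.get)
print(cvar_by_segment, "->", chosen)
```

A mean-based comparison cannot separate the two segments, whereas the tail average penalizes the rare long outcome of segment B.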
A new perspective on Bayesian Operational Modal Analysis
In the field of operational modal analysis (OMA), obtained modal information is frequently used to assess the current state of aerospace, mechanical, offshore and civil structures. However, the stochasticity of operational systems and the lack of forcing information can lead to inconsistent results. Quantifying the uncertainty of the recovered modal parameters through OMA is therefore of significant value. In this article, a new perspective on Bayesian OMA is proposed: a Bayesian stochastic subspace identification (SSI) algorithm. Distinct from existing approaches to Bayesian OMA, a hierarchical probabilistic model is embedded at the core of covariance-driven SSI. Through substitution of canonical correlation analysis with a Bayesian equivalent, posterior distributions over the modal properties are obtained. Two inference schemes are presented for the proposed Bayesian formulation: Markov Chain Monte Carlo and variational Bayes. Two case studies are then explored. The first is a benchmark study using data from a simulated, multi-degree-of-freedom, linear system. Following application of Bayesian SSI, it is shown that the same posterior is targeted and recovered by both inference schemes, with good agreement between the posterior mean and the conventional SSI result. The second study applies the variational form to data obtained from an in-service structure: the Z24 bridge. The results of this study are presented at single model orders, and then using a stabilisation diagram. The recovered posterior uncertainty is presented and compared to the classic SSI result. It is observed that the posterior distributions with mean values coinciding with the natural frequencies exhibit lower variance than values situated away from the natural frequencies.
Cross-Chip Partial Reconfiguration for the Initialisation of Modular and Scalable Heterogeneous Systems
The almost unlimited possibilities to customize the logic in an FPGA are one of the main reasons for the versatility of these devices. Partial reconfiguration exploits this capability even further by allowing logic in predefined FPGA regions to be replaced at runtime. This is especially relevant in heterogeneous SoCs, combining FPGA fabric with conventional processors on a single die. Tight integration and supporting frameworks like the FPGA subsystem in Linux facilitate use, for example, to dynamically load custom hardware accelerators. Although this example is one of the most common use cases for partial reconfiguration, the possible applications go far beyond it. We propose to use partial reconfiguration in combination with the AXI C2C cross-chip bus to extend the resources of heterogeneous MPSoC and RFSoC devices by connecting peripheral FPGAs. With AXI C2C it is easily possible to link the programmable logic of the individual devices, but partial reconfiguration on peripheral FPGAs utilising the same channel is not officially supported. By using an AXI ICAP controller in combination with custom Linux drivers, we show that it is possible to enable the PS of the heterogeneous SoC to perform partial reconfiguration on peripheral FPGAs, and thus to seamlessly access and manage the entire multi-device system. As a result, software and FPGA firmware updates can be applied to the entire system at runtime, and peripheral FPGAs can be added and removed during operation.
comment: 8 pages, double-column, 9 figures. Paper submitted as proceeding for the 24th IEEE Real Time Conference (2024)
Study on Human-Variability-Respecting Optimal Control Affecting Human Interaction Experience
Broad application of human-machine interaction (HMI) demands advanced and human-centered control designs for the machine's automation. Human natural motor action shows stochastic behavior, which has so far not been respected in HMI control designs. Using a previously presented novel human-variability-respecting optimal controller, we present a study design which allows the investigation of respecting human natural variability and its effect on human interaction experience. Our approach is tested in simulation based on an identified real human subject and presents a promising approach to be used for a larger subject study.
Discrete-time SIS Social Contagion Processes on Hypergraphs
Recent research on social contagion processes has revealed the limitations of traditional networks, which capture only pairwise relationships, to characterize complex multiparty relationships and group influences properly. Social contagion processes on higher-order networks (simplicial complexes and general hypergraphs) have therefore emerged as a novel frontier. In this work, we investigate discrete-time Susceptible-Infected-Susceptible (SIS) social contagion processes occurring on weighted and directed hypergraphs and their extensions to bivirus cases and general higher-order SIS processes with the aid of tensor algebra. Our focus lies in comprehensively characterizing the healthy state and endemic equilibria within this framework. The emergence of bistable or multistable behavior, where multiple equilibria coexist and are simultaneously locally asymptotically stable, is demonstrated owing to the presence of higher-order interactions. Novel sufficient conditions for the appearance of these system behaviors, which are determined by both (higher-order) network topology and transition rates, are provided to assess the likelihood of the SIS social contagion processes causing an outbreak. More importantly, given that an equilibrium is locally stable, an explicit domain of attraction associated with the system parameters is constructed. Moreover, a learning method to estimate the transition rates is presented. In the end, the attained theoretical results are supplemented via numerical examples. Specifically, we evaluate the effectiveness of the networked SIS social contagion process by comparing it with the $2^n$-state Markov chain model. These numerical examples are given to highlight the performance of parameter learning algorithms and the system behaviors of the discrete-time SIS social contagion process.
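The bistability mentioned above can be illustrated with a small mean-field sketch. The update rule below (a forward-Euler-style discrete-time SIS step with one pairwise and one three-body infection term) and all parameter values are assumptions chosen for illustration; they are not taken from the paper and may differ from its exact model.

```python
def sis_step(x, pair_w, triple_w, beta1, beta2, delta, h):
    """One discrete-time step of an assumed higher-order SIS mean-field model.

    x[i] is the infection level of node i, pair_w[i][j] weights pairwise
    edges, and triple_w[i] maps a node pair (j, l) to the weight of the
    hyperedge through which j and l jointly influence i.
    """
    n = len(x)
    nxt = []
    for i in range(n):
        pairwise = sum(pair_w[i][j] * x[j] for j in range(n))
        higher = sum(w * x[j] * x[l] for (j, l), w in triple_w[i].items())
        rate = -delta * x[i] + (1.0 - x[i]) * (beta1 * pairwise + beta2 * higher)
        nxt.append(x[i] + h * rate)
    return nxt

def simulate(x0, steps, **params):
    x = list(x0)
    for _ in range(steps):
        x = sis_step(x, **params)
    return x

# Three fully connected nodes plus a single hyperedge {0, 1, 2}.
n = 3
pair_w = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
triple_w = [{tuple(k for k in range(n) if k != i): 1.0} for i in range(n)]
params = dict(pair_w=pair_w, triple_w=triple_w,
              beta1=0.2, beta2=4.0, delta=1.0, h=0.1)

low = simulate([0.1] * n, 500, **params)    # small initial outbreak
high = simulate([0.5] * n, 500, **params)   # large initial outbreak
print("from 0.1:", low)
print("from 0.5:", high)
```

With these assumed parameters the pairwise term alone is below the outbreak threshold, so the small initial condition decays to the healthy state, while the larger one settles at an endemic level sustained by the three-body term. Both equilibria are locally stable, which is the coexistence the abstract refers to.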
RadioDiff: An Effective Generative Diffusion Model for Sampling-Free Dynamic Radio Map Construction
The radio map (RM) is a promising technology that can obtain pathloss based only on location, which is significant for 6G network applications to reduce the communication costs for pathloss estimation. However, traditional RM construction is either computationally intensive or depends on costly sampling-based pathloss measurements. Although the neural network (NN)-based method can efficiently construct the RM without sampling, its performance is still suboptimal. This is primarily due to the misalignment between the generative characteristics of the RM construction problem and the discrimination modeling exploited by existing NN-based methods. Thus, to enhance RM construction performance, in this paper, the sampling-free RM construction is modeled as a conditional generative problem, where a denoised diffusion-based method, named RadioDiff, is proposed to achieve high-quality RM construction. In addition, to enhance the diffusion model's capability of extracting features from dynamic environments, an attention U-Net with an adaptive fast Fourier transform module is employed as the backbone network. Meanwhile, the decoupled diffusion model is utilized to further enhance the construction performance of RMs. Moreover, a comprehensive theoretical analysis of why the RM construction is a generative problem is provided for the first time, from both perspectives of data features and NN training methods. Experimental results show that the proposed RadioDiff achieves state-of-the-art performance in all three metrics of accuracy, structural similarity, and peak signal-to-noise ratio. The code is available at https://github.com/UNIC-Lab/RadioDiff.
Data-driven Construction of Finite Abstractions for Interconnected Systems: A Compositional Approach
Finite-state abstractions (a.k.a. symbolic models) present a promising avenue for the formal verification and synthesis of controllers in continuous-space control systems. These abstractions provide simplified models that capture the fundamental behaviors of the original systems. However, the creation of such abstractions typically relies on the availability of precise knowledge concerning system dynamics, which might not be available in many real-world applications. In this work, we introduce an innovative, data-driven, and compositional approach to generate finite abstractions for interconnected systems that consist of discrete-time control subsystems with unknown dynamics. These subsystems interact through an unknown static interconnection map. Our methodology for abstracting the interconnected system involves constructing abstractions for individual subsystems and incorporating an abstraction of the interconnection map.
comment: This manuscript of 19 pages and 7 figures is a preprint under review with a journal
A New Control Law for TS Fuzzy Models: Less Conservative LMI Conditions by Using Membership Functions Derivative
This note proposes a new type of Parallel Distributed Controller (PDC) for Takagi-Sugeno (TS) fuzzy models. Our idea consists of using two control terms based on state feedback, one composed of a convex combination of linear gains weighted by the normalized membership grade, as in traditional PDC, and the other composed of linear gains weighted by the time-derivatives of the membership functions. We present the design conditions as Linear Matrix Inequalities, solvable through numerical optimization tools. Numerical examples are given to illustrate the advantages of the proposed approach, which contains the traditional PDC as a special case.
comment: 20 pages, 4 figures
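One way to write the two-term control law described in the abstract above is sketched below. The symbols ($r$ rules, premise variable $z$, gains $K_i$ and $L_i$) and the sign convention are assumptions made for illustration and may differ from the note's own notation.

```latex
% Assumed notation: h_i are normalized membership functions,
% h_i(z) >= 0 and \sum_{i=1}^{r} h_i(z) = 1.
\begin{align}
  u(t) &= \underbrace{\sum_{i=1}^{r} h_i\bigl(z(t)\bigr)\, K_i\, x(t)}_{\text{traditional PDC term}}
        + \underbrace{\sum_{i=1}^{r} \dot{h}_i\bigl(z(t)\bigr)\, L_i\, x(t)}_{\text{membership-derivative term}}, \\
  \sum_{i=1}^{r} h_i\bigl(z(t)\bigr) &= 1
  \quad\Longrightarrow\quad
  \sum_{i=1}^{r} \dot{h}_i\bigl(z(t)\bigr) = 0 .
\end{align}
```

Under this notation, choosing $L_1 = \dots = L_r$ (in particular $L_i = 0$) makes the second term vanish, which is consistent with the statement that the traditional PDC is a special case.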
GLANCE: Graph-based Learnable Digital Twin for Communication Networks
As digital twins (DTs) to physical communication systems, network simulators can aid the design and deployment of communication networks. However, time-consuming simulations must be run for every new set of network configurations. Learnable digital twins (LDTs), in contrast, can be trained offline to emulate simulation outcomes and serve as a more efficient alternative to simulation-based DTs at runtime. In this work, we propose GLANCE, a communication LDT that learns from the simulator ns-3. It can evaluate network key performance indicators (KPIs) and assist in network management with exceptional efficiency. Leveraging graph learning, we exploit network data characteristics and devise a specialized architecture to embed sequential and topological features of traffic flows within the network. In addition, multi-task learning (MTL) and transfer learning (TL) are leveraged to enhance GLANCE's generalizability to unseen inputs and efficacy across different tasks. Beyond end-to-end KPI prediction, GLANCE can be deployed within an optimization framework for network management. It serves as an efficient or differentiable evaluator in optimizing network configurations such as traffic loads and flow destinations. Through numerical experiments and benchmarking, we verify the effectiveness of the proposed LDT architecture, demonstrate its robust generalization to various inputs, and showcase its efficacy in network management applications.
Memory-optimised Cubic Splines for High-fidelity Quantum Operations
Radio-frequency pulses are widespread for the control of quantum bits and the execution of operations in quantum computers. The ability to tune key pulse parameters such as time-dependent amplitude, phase, and frequency is essential to achieve maximal gate fidelity and mitigate errors. As systems scale, a larger fraction of the control electronic processing will move closer to the qubits, to enhance integration and minimise latency in operations requiring fast feedback. This will constrain the space available in the memory of the control electronics to load time-resolved pulse parameters at high sampling rates. Cubic spline interpolation is a powerful and widespread technique that divides the pulse into segments of cubic polynomials. We show an optimised implementation of this strategy, using a two-stage curve fitting process and additional symmetry operations to load a high-sampling pulse output on an FPGA. This results in a favourable accuracy versus memory footprint trade-off. By simulating single-qubit population transfer and atom transport on a neutral atom device, we show that we can achieve high fidelities with low memory requirements. This is instrumental for scaling up the number of qubits and gate operations in environments where memory is a limited resource.
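The memory argument above can be illustrated with a small sketch that stores a pulse as per-segment cubic coefficients instead of as raw samples. It uses a plain piecewise cubic Hermite fit with finite-difference slopes on a hypothetical Gaussian envelope; it is not the two-stage fitting or the symmetry scheme of the paper, and the segment and sample counts are arbitrary.

```python
import math

def gaussian_pulse(t, sigma=0.25):
    """Hypothetical pulse envelope used only for this illustration."""
    return math.exp(-t * t / (2.0 * sigma * sigma))

def fit_segments(f, t0, t1, n_segments):
    """Fit one cubic a + b*u + c*u^2 + d*u^3 (u in [0, 1]) per segment."""
    h = (t1 - t0) / n_segments
    y = [f(t0 + i * h) for i in range(n_segments + 1)]
    # Knot slopes per unit of u: central differences, one-sided at the ends.
    m = []
    for i in range(n_segments + 1):
        lo, hi = max(i - 1, 0), min(i + 1, n_segments)
        m.append((y[hi] - y[lo]) / (hi - lo))
    segments = []
    for i in range(n_segments):
        y0, y1, m0, m1 = y[i], y[i + 1], m[i], m[i + 1]
        c = 3.0 * (y1 - y0) - 2.0 * m0 - m1
        d = 2.0 * (y0 - y1) + m0 + m1
        segments.append((y0, m0, c, d))
    return segments

def evaluate(segments, t0, t1, t):
    """Reconstruct the pulse value at time t from the stored coefficients."""
    n = len(segments)
    pos = (t - t0) / (t1 - t0) * n
    i = min(int(pos), n - 1)
    u = pos - i
    a, b, c, d = segments[i]
    return a + u * (b + u * (c + u * d))   # Horner form

T0, T1, N_SEG, N_SAMPLES = -1.0, 1.0, 16, 1000
segments = fit_segments(gaussian_pulse, T0, T1, N_SEG)
times = [T0 + (T1 - T0) * k / (N_SAMPLES - 1) for k in range(N_SAMPLES)]
max_err = max(abs(evaluate(segments, T0, T1, t) - gaussian_pulse(t)) for t in times)
stored_words = 4 * len(segments)
print(f"{stored_words} coefficients instead of {N_SAMPLES} samples, max error {max_err:.2e}")
```

The stored state is four coefficients per segment, so its size depends on the number of segments rather than on the output sampling rate. Since this envelope is symmetric about zero, only half of the segments would need to be kept if the playback logic mirrored them, which is the kind of saving a symmetry operation can provide.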
Stable State Space SubSpace (S$^5$) Identification
State space subspace algorithms for input-output systems have been widely applied and also have a reasonably well-developed asymptotic theory dealing with consistency. However, guaranteeing the stability of the estimated system matrix is a major issue. Existing stability-guaranteed algorithms are computationally expensive, require several tuning parameters, and scale badly to high state dimensions. Here, we develop a new algorithm that is closed-form and requires no tuning parameters. It is thus computationally cheap and scales easily to high state dimensions. We also prove its consistency under reasonable conditions.
Parallelized Robust Distributed Model Predictive Control in the Presence of Coupled State Constraints
In this paper, we present a robust distributed model predictive control (DMPC) scheme for dynamically decoupled nonlinear systems which are subject to state constraints, coupled state constraints and input constraints. In the proposed control scheme, all subsystems solve their local optimization problem in parallel and neighbor-to-neighbor communication suffices. The approach relies on consistency constraints which define a neighborhood around each subsystem's reference trajectory where the state of the subsystem is guaranteed to stay in. Reference trajectories and consistency constraints are known to neighboring subsystems. Contrary to other relevant approaches, the reference trajectories are improved consecutively. The presented approach allows the formulation of convex optimization problems for systems with linear dynamics even in the presence of non-convex state constraints. Additionally, we employ tubes in order to ensure the controller's robustness against bounded uncertainties. In the end, we briefly comment on an iterative extension of the DMPC scheme. The effectiveness of the proposed DMPC scheme and its iterative extension are demonstrated with simulations.
comment: 15 pages, 5 figures, accepted for publication in Automatica
Transmission Expansion Planning for Renewable-energy-dominated Power Grids Considering Climate Impact
As renewable energy is becoming the major resource in future grids, the weather and climate can have a higher impact on grid reliability. Transmission expansion planning (TEP) has the potential to reinforce a transmission network that is suitable for climate-impacted grids. In this paper, we propose a systematic TEP procedure for climate-impacted renewable energy-enriched grids. Particularly, this work developed an improved model for TEP considering climate impact (TEP-CI) and evaluated the system reliability with the obtained transmission investment plan. Firstly, we created climate-impacted spatio-temporal future grid data to facilitate the TEP-CI study, which includes the future climate-dependent renewable production as well as the dynamic rating profiles of the Texas 123-bus backbone transmission system (TX-123BT). Secondly, we proposed the TEP-CI which considers the variation in renewable production and dynamic line rating, and obtained the investment plan for future TX-123BT. Thirdly, we presented a customized security-constrained unit commitment (SCUC) specifically for climate-impacted grids. The future grid reliability under various investment scenarios is analyzed based on the daily operation conditions from SCUC simulations. The whole procedure presented in this paper enables numerical studies on grid planning considering climate impacts. It can also serve as a benchmark for other TEP-CI research and performance evaluation.
comment: 11 pages, 8 figures
A Synthetic Texas Power System with Time-Series High-Resolution Weather-Dependent Spatio-Temporally Correlated Grid Profiles
This study introduced a synthetic power system with spatio-temporally correlated profiles of solar power, wind power, dynamic line ratings and loads at one-hour resolution for five continuous years, referred to as the Texas 123-bus backbone transmission (TX-123BT) system. Unlike conventional test cases that offer a static snapshot of system profile, the designed TX-123BT system incorporates weather-dependent profiles for renewable generation and transmission thermal limits, mimicking the actual Electric Reliability Council of Texas (ERCOT) system characteristics. Three weather-dependent models are used for the creation of wind and solar power production, and dynamic line rating (DLR) separately. Security-constrained unit commitment (SCUC) is conducted on TX-123BT daily profiles and numerical results are compared with the actual ERCOT system for validation. The long-term spatio-temporal profiles can greatly capture the renewable production versatility due to the environmental conditions. An example of hydrogen facilities integration studies is presented to illustrate the advantage of utilizing detailed spatio-temporal profiles of TX-123BT.
comment: 10 pages, 14 figures, 10 tables
Optimization of the Energy-Comfort Trade-Off of HVAC Systems in Electric City Buses Based on a Steady-State Model
The electrification of public transport vehicles offers the potential to relieve city centers of pollutant and noise emissions. Furthermore, electric buses have lower life-cycle greenhouse gas (GHG) emissions than diesel buses, particularly when operated with sustainably produced electricity. However, the heating, ventilation, and air-conditioning (HVAC) system can consume a significant amount of energy, thus limiting the achievable driving range. In this paper, we address the HVAC system in an electric city bus by analyzing the trade-off between the energy consumption and the thermal comfort of the passengers. We do this by developing a dynamic thermal model for the bus, which we simplify by considering it to be in steady state. We introduce a method that is able to quickly optimize the steady-state HVAC system inputs for a large number of samples representative of a year-round operation. A comparison between the results from the steady-state optimization approach and a dynamic simulation reveals small deviations in both the HVAC system power demand and achieved thermal comfort. Thus, the approximation of the system performance with a steady-state model is justified. We present two case studies to demonstrate the practical relevance of the approach. First, we show how the method can be used to compare different HVAC system designs based on a year-round performance evaluation. Second, we show how the method can be used to extract setpoints for online controllers that achieve close-to-optimal performance without any predictive information. In conclusion, this study shows that a steady-state analysis of the HVAC systems of an electric city bus is a valuable approach to evaluate and optimize its performance.
comment: Preprint submitted to Control Engineering Practice
AirPilot: A PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe UAV flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional-Integral-Derivative (PID) controllers are among the most popular and widely used control algorithms for drone control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced PID drone controller using Proximal Policy Optimization. The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic, and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the Gazebo simulator and subsequently implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error by more than 82% and improving overshoot, speed and settling time significantly.
comment: 14 pages, 17 figures
Symmetric Locality: Definition and Initial Results
In this short paper, we characterize symmetric locality. Data movement is a common bottleneck in high-performance computation, so the design of algorithms, compilers, and systems often aims to improve cache and memory performance. We study a special type of data reuse in the form of repeated traversals, or re-traversals, which are based on the symmetric group. The cyclic and sawtooth traces are previously known results in symmetric locality, and in this work, we would like to generalize this result for any re-traversal. Then, we also provide an abstract framework for applications in compiler design and machine learning models to improve the memory performance of certain programs.
comment: 6 pages, 2nd ver
Geometric Tracking Control of Omnidirectional Multirotors for Aggressive Maneuvers
An omnidirectional multirotor has the maneuverability of decoupled translational and rotational motions, superseding the traditional multirotors' motion capability. Such maneuverability is achieved due to the ability of the omnidirectional multirotor to frequently alter the thrust amplitude and direction. In doing so, the rotors' settling time, which is induced by inherent rotor dynamics, significantly affects the omnidirectional multirotor's tracking performance, especially in aggressive flights. To resolve this issue, we propose a novel tracking controller that takes the rotor dynamics into account and does not require additional rotor state measurement. We prove that the proposed controller yields almost global exponential stability. The proposed controller is validated in experiments, where we demonstrate significantly improved tracking performance in multiple aggressive maneuvers compared with a baseline geometric PD controller.
Characterization of the Dynamical Properties of Safety Filters for Linear Planar Systems
This paper studies the dynamical properties of closed-loop systems obtained from control barrier function-based safety filters. We provide a sufficient and necessary condition for the existence of undesirable equilibria and show that the Jacobian matrix of the closed-loop system evaluated at an undesirable equilibrium always has a nonpositive eigenvalue. In the special case of linear planar systems and ellipsoidal obstacles, we give a complete characterization of the dynamical properties of the corresponding closed-loop system. We show that for underactuated systems, the safety filter always introduces a single undesirable equilibrium, which is a saddle-point. We prove that all trajectories outside the global stable manifold of such equilibrium converge to the origin. In the fully actuated case, we discuss how the choice of nominal controller affects the stability properties of the closed-loop system. Various simulations illustrate our results.
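For readers unfamiliar with the filter structure studied above, the following is a minimal sketch of a safety filter with a single affine constraint, for which the minimum-deviation quadratic program has a closed-form solution. The names `a` and `b` are assumptions: in a control barrier function setting they would typically stand for the input-dependent and input-independent parts of the barrier condition at the current state. The numbers are illustrative only.

```python
def safety_filter(u_nom, a, b):
    """Return the input closest to u_nom (Euclidean norm) with a . u + b >= 0."""
    slack = sum(ai * ui for ai, ui in zip(a, u_nom)) + b
    if slack >= 0.0:
        return list(u_nom)          # nominal input is already safe
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("constraint is infeasible: a is zero and b < 0")
    scale = -slack / norm_sq        # smallest correction along a
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]

# Hypothetical constraint u[0] - 1 >= 0.
u_safe = safety_filter([0.0, 2.0], a=[1.0, 0.0], b=-1.0)   # nominal input violates it
u_same = safety_filter([3.0, 2.0], a=[1.0, 0.0], b=-1.0)   # nominal input satisfies it
print(u_safe, u_same)
```

The filter leaves the nominal input untouched whenever the constraint holds and otherwise projects it onto the constraint boundary. Equilibria of the filtered closed loop that are not equilibria of the nominal closed loop can only appear where this projection is active, which is why the dynamical properties of the filtered system need a separate analysis.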
Exploiting Symmetry in Dynamics for Model-Based Reinforcement Learning with Asymmetric Rewards
Recent work in reinforcement learning has leveraged symmetries in the model to improve sample efficiency in training a policy. A commonly used simplifying assumption is that the dynamics and reward both exhibit the same symmetry; however, in many real-world environments, the dynamical model exhibits symmetry independent of the reward model. In this paper, we assume only the dynamics exhibit symmetry, extending the scope of problems in reinforcement learning and learning in control theory to which symmetry techniques can be applied. We use Cartan's moving frame method to introduce a technique for learning dynamics that, by construction, exhibit specified symmetries. Numerical experiments demonstrate that the proposed method learns a more accurate dynamical model
Systems and Control (EESS)
Gaussian Processes with Noisy Regression Inputs for Dynamical Systems
This paper is centered around the approximation of dynamical systems by means of Gaussian processes. To this end, trajectories of such systems must be collected to be used as training data. The measurements of these trajectories are typically noisy, which implies that both the regression inputs and outputs are corrupted by noise. However, most of the literature considers only noise in the regression outputs. In this paper, we show how to account for the noise in the regression inputs in an extended Gaussian process framework to approximate scalar and multidimensional systems. We demonstrate the potential of our framework by comparing it to different state-of-the-art methods in several simulation examples.
comment: 6 pages
Robust Stochastic Shortest-Path Planning via Risk-Sensitive Incremental Sampling
With the pervasiveness of Stochastic Shortest-Path (SSP) problems in high-risk industries, such as last-mile autonomous delivery and supply chain management, robust planning algorithms are crucial for ensuring successful task completion while mitigating hazardous outcomes. Mainstream chance-constrained incremental sampling techniques for solving SSP problems tend to be overly conservative and typically do not consider the likelihood of undesirable tail events. We propose an alternative risk-aware approach inspired by the asymptotically-optimal Rapidly-Exploring Random Trees (RRT*) planning algorithm, which selects nodes along path segments with minimal Conditional Value-at-Risk (CVaR). Our motivation rests on the step-wise coherence of the CVaR risk measure and the optimal substructure of the SSP problem. Thus, optimizing with respect to the CVaR at each sampling iteration necessarily leads to an optimal path in the limit of the sample size. We validate our approach via numerical path planning experiments in a two-dimensional grid world with obstacles and stochastic path-segment lengths. Our simulation results show that incorporating risk into the tree growth process yields paths with lengths that are significantly less sensitive to variations in the noise parameter, or equivalently, paths that are more robust to environmental uncertainty. Algorithmic analyses reveal similar query time and memory space complexity to the baseline RRT* procedure, with only a marginal increase in processing time. This increase is offset by significantly lower noise sensitivity and reduced planner failure rates.
comment: Accepted for presentation at the 2024 IEEE Conference on Decision and Control (CDC)
A new perspective on Bayesian Operational Modal Analysis
In the field of operational modal analysis (OMA), obtained modal information is frequently used to assess the current state of aerospace, mechanical, offshore and civil structures. However, the stochasticity of operational systems and the lack of forcing information can lead to inconsistent results. Quantifying the uncertainty of the recovered modal parameters through OMA is therefore of significant value. In this article, a new perspective on Bayesian OMA is proposed: a Bayesian stochastic subspace identification (SSI) algorithm. Distinct from existing approaches to Bayesian OMA, a hierarchical probabilistic model is embedded at the core of covariance-driven SSI. Through substitution of canonical correlation analysis with a Bayesian equivalent, posterior distributions over the modal properties are obtained. Two inference schemes are presented for the proposed Bayesian formulation: Markov Chain Monte Carlo and variational Bayes. Two case studies are then explored. The first is a benchmark study using data from a simulated, multi-degree-of-freedom, linear system. Following application of Bayesian SSI, it is shown that the same posterior is targeted and recovered by both inference schemes, with good agreement between the posterior mean and the conventional SSI result. The second study applies the variational form to data obtained from an in-service structure: the Z24 bridge. The results of this study are presented at single model orders, and then using a stabilisation diagram. The recovered posterior uncertainty is presented and compared to the classic SSI result. It is observed that the posterior distributions with mean values coinciding with the natural frequencies exhibit lower variance than values situated away from the natural frequencies.
Cross-Chip Partial Reconfiguration for the Initialisation of Modular and Scalable Heterogeneous Systems
The almost unlimited possibilities to customize the logic in an FPGA are one of the main reasons for the versatility of these devices. Partial reconfiguration exploits this capability even further by allowing logic in predefined FPGA regions to be replaced at runtime. This is especially relevant in heterogeneous SoCs, combining FPGA fabric with conventional processors on a single die. Tight integration and supporting frameworks like the FPGA subsystem in Linux facilitate use, for example, to dynamically load custom hardware accelerators. Although this example is one of the most common use cases for partial reconfiguration, the possible applications go far beyond it. We propose to use partial reconfiguration in combination with the AXI C2C cross-chip bus to extend the resources of heterogeneous MPSoC and RFSoC devices by connecting peripheral FPGAs. With AXI C2C it is easily possible to link the programmable logic of the individual devices, but partial reconfiguration on peripheral FPGAs utilising the same channel is not officially supported. By using an AXI ICAP controller in combination with custom Linux drivers, we show that it is possible to enable the PS of the heterogeneous SoC to perform partial reconfiguration on peripheral FPGAs, and thus to seamlessly access and manage the entire multi-device system. As a result, software and FPGA firmware updates can be applied to the entire system at runtime, and peripheral FPGAs can be added and removed during operation.
comment: 8 pages, double-column, 9 figures. Paper submitted as a proceeding for the 24th IEEE Real Time Conference (2024)
Study on Human-Variability-Respecting Optimal Control Affecting Human Interaction Experience
Broad application of human-machine interaction (HMI) demands advanced and human-centered control designs for the machine's automation. Natural human motor action shows stochastic behavior, which has so far not been respected in HMI control designs. Using a previously presented novel human-variability-respecting optimal controller, we present a study design that allows investigating how respecting natural human variability affects the human interaction experience. Our approach is tested in simulation based on an identified real human subject and is a promising candidate for use in a larger subject study.
Discrete-time SIS Social Contagion Processes on Hypergraphs
Recent research on social contagion processes has revealed the limitations of traditional networks, which capture only pairwise relationships, to characterize complex multiparty relationships and group influences properly. Social contagion processes on higher-order networks (simplicial complexes and general hypergraphs) have therefore emerged as a novel frontier. In this work, we investigate discrete-time Susceptible-Infected-Susceptible (SIS) social contagion processes occurring on weighted and directed hypergraphs and their extensions to bivirus cases and general higher-order SIS processes with the aid of tensor algebra. Our focus lies in comprehensively characterizing the healthy state and endemic equilibria within this framework. The emergence of bistability or multistability, where multiple equilibria coexist and are simultaneously locally asymptotically stable, is demonstrated to result from the presence of the higher-order interaction. Novel sufficient conditions for the appearance of these system behaviors, which are determined by both (higher-order) network topology and transition rates, are provided to assess the likelihood of the SIS social contagion processes causing an outbreak. More importantly, given that the equilibrium is locally stable, an explicit domain of attraction associated with the system parameters is constructed. Moreover, a learning method to estimate the transition rates is presented. Finally, the theoretical results are supplemented with numerical examples. Specifically, we evaluate the effectiveness of the networked SIS social contagion process by comparing it with the $2^n$-state Markov chain model. These numerical examples are given to highlight the performance of parameter learning algorithms and the system behaviors of the discrete-time SIS social contagion process.
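To make the idea of a discrete-time SIS update with higher-order terms concrete, the following is a minimal sketch and not the authors' model. It assumes a mean-field form in which each node's infection probability decays at a healing rate `delta` and grows through pairwise weights (`pair_w`) and three-node hyperedge weights (`triple_w`); the names `sis_step`, `beta1`, `beta2`, and the final clipping to [0, 1] are all illustrative assumptions.

```python
def sis_step(x, pair_w, triple_w, beta1, beta2, delta):
    """One assumed mean-field update of infection probabilities x[i] in [0, 1].

    pair_w[i][j]      : weight of the pairwise influence of node j on node i
    triple_w[i][(j,k)]: weight of the hyperedge through which j and k jointly act on i
    beta1, beta2      : pairwise and higher-order infection rates (hypothetical names)
    delta             : healing rate
    """
    n = len(x)
    nxt = []
    for i in range(n):
        pairwise = sum(pair_w[i][j] * x[j] for j in range(n))
        group = sum(w * x[j] * x[k] for (j, k), w in triple_w.get(i, {}).items())
        pressure = beta1 * pairwise + beta2 * group
        xi = (1.0 - delta) * x[i] + (1.0 - x[i]) * pressure
        nxt.append(min(1.0, max(0.0, xi)))  # clip so the value stays a probability
    return nxt
```

In this sketch the all-zero vector is a fixed point, which corresponds to the healthy state mentioned in the abstract; the product term `x[j] * x[k]` is what distinguishes a hyperedge contribution from a pairwise one.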
RadioDiff: An Effective Generative Diffusion Model for Sampling-Free Dynamic Radio Map Construction
Radio map (RM) is a promising technology that can obtain pathloss based only on location, which is significant for 6G network applications to reduce the communication costs for pathloss estimation. However, traditional RM construction is either computationally intensive or depends on costly sampling-based pathloss measurements. Although the neural network (NN)-based method can efficiently construct the RM without sampling, its performance is still suboptimal. This is primarily due to the misalignment between the generative characteristics of the RM construction problem and the discrimination modeling exploited by existing NN-based methods. Thus, to enhance RM construction performance, in this paper, the sampling-free RM construction is modeled as a conditional generative problem, where a denoised diffusion-based method, named RadioDiff, is proposed to achieve high-quality RM construction. In addition, to enhance the diffusion model's capability of extracting features from dynamic environments, an attention U-Net with an adaptive fast Fourier transform module is employed as the backbone network. Meanwhile, the decoupled diffusion model is utilized to further enhance the construction performance of RMs. Moreover, a comprehensive theoretical analysis of why the RM construction is a generative problem is provided for the first time, from both perspectives of data features and NN training methods. Experimental results show that the proposed RadioDiff achieves state-of-the-art performance in all three metrics of accuracy, structural similarity, and peak signal-to-noise ratio. The code is available at https://github.com/UNIC-Lab/RadioDiff.
Data-driven Construction of Finite Abstractions for Interconnected Systems: A Compositional Approach
Finite-state abstractions (a.k.a. symbolic models) present a promising avenue for the formal verification and synthesis of controllers in continuous-space control systems. These abstractions provide simplified models that capture the fundamental behaviors of the original systems. However, the creation of such abstractions typically relies on the availability of precise knowledge concerning system dynamics, which might not be available in many real-world applications. In this work, we introduce an innovative, data-driven, and compositional approach to generate finite abstractions for interconnected systems that consist of discrete-time control subsystems with unknown dynamics. These subsystems interact through an unknown static interconnection map. Our methodology for abstracting the interconnected system involves constructing abstractions for individual subsystems and incorporating an abstraction of the interconnection map.
comment: This manuscript of 19 pages and 7 figures is a preprint under review with a journal
A New Control Law for TS Fuzzy Models: Less Conservative LMI Conditions by Using Membership Functions Derivative
This note proposes a new type of Parallel Distributed Controller (PDC) for Takagi-Sugeno (TS) fuzzy models. Our idea consists of using two control terms based on state feedback, one composed of a convex combination of linear gains weighted by the normalized membership grade, as in traditional PDC, and the other composed of linear gains weighted by the time-derivatives of the membership functions. We present the design conditions as Linear Matrix Inequalities, solvable through numerical optimization tools. Numerical examples are given to illustrate the advantages of the proposed approach, which contains the traditional PDC as a special case.
comment: 20 pages, 4 figures
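One plausible way to write the two-term control law described in this abstract is sketched below. The symbols are assumptions rather than the note's own notation: $r$ rules, normalized membership functions $h_i$ of a premise variable $z(t)$, and gain matrices $K_i$ and $L_i$; the sign convention may also differ in the note.

```latex
u(t) \;=\; \sum_{i=1}^{r} h_i\bigl(z(t)\bigr)\, K_i\, x(t)
      \;+\; \sum_{i=1}^{r} \dot{h}_i\bigl(z(t)\bigr)\, L_i\, x(t),
\qquad h_i \ge 0,\quad \sum_{i=1}^{r} h_i = 1 .
```

Under this reading, setting every $L_i = 0$ removes the derivative-weighted term and leaves the usual PDC law, which is consistent with the statement that traditional PDC is a special case.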
GLANCE: Graph-based Learnable Digital Twin for Communication Networks
As digital twins (DTs) to physical communication systems, network simulators can aid the design and deployment of communication networks. However, time-consuming simulations must be run for every new set of network configurations. Learnable digital twins (LDTs), in contrast, can be trained offline to emulate simulation outcomes and serve as a more efficient alternative to simulation-based DTs at runtime. In this work, we propose GLANCE, a communication LDT that learns from the simulator ns-3. It can evaluate network key performance indicators (KPIs) and assist in network management with exceptional efficiency. Leveraging graph learning, we exploit network data characteristics and devise a specialized architecture to embed sequential and topological features of traffic flows within the network. In addition, multi-task learning (MTL) and transfer learning (TL) are leveraged to enhance GLANCE's generalizability to unseen inputs and efficacy across different tasks. Beyond end-to-end KPI prediction, GLANCE can be deployed within an optimization framework for network management. It serves as an efficient or differentiable evaluator in optimizing network configurations such as traffic loads and flow destinations. Through numerical experiments and benchmarking, we verify the effectiveness of the proposed LDT architecture, demonstrate its robust generalization to various inputs, and showcase its efficacy in network management applications.
Memory-optimised Cubic Splines for High-fidelity Quantum Operations
Radio-frequency pulses are widespread for the control of quantum bits and the execution of operations in quantum computers. The ability to tune key pulse parameters such as time-dependent amplitude, phase, and frequency is essential to achieve maximal gate fidelity and mitigate errors. As systems scale, a larger fraction of the control electronic processing will move closer to the qubits, to enhance integration and minimise latency in operations requiring fast feedback. This will constrain the space available in the memory of the control electronics to load time-resolved pulse parameters at high sampling rates. Cubic spline interpolation is a powerful and widespread technique that divides the pulse into segments of cubic polynomials. We show an optimised implementation of this strategy, using a two-stage curve fitting process and additional symmetry operations to load a high-sampling pulse output on an FPGA. This results in a favourable accuracy versus memory footprint trade-off. By simulating single-qubit population transfer and atom transport on a neutral atom device, we show that we can achieve high fidelities with low memory requirements. This is instrumental for scaling up the number of qubits and gate operations in environments where memory is a limited resource.
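As a rough illustration of why a piecewise-cubic representation saves memory, the sketch below stores four coefficients per segment and reconstructs the pulse value at any time on demand. It is a generic evaluator in Python, not the paper's two-stage fitting procedure, symmetry operations, or FPGA implementation; the function name and data layout are assumptions.

```python
from bisect import bisect_right

def eval_piecewise_cubic(knots, coeffs, t):
    """Evaluate a piecewise cubic at time t.

    knots : increasing segment boundaries, len(knots) == len(coeffs) + 1
    coeffs: one (a, b, c, d) tuple per segment, meaning
            a + b*s + c*s**2 + d*s**3 with s = t - (segment start)
    """
    # Locate the segment containing t, clamping to the first/last segment.
    i = min(max(bisect_right(knots, t) - 1, 0), len(coeffs) - 1)
    s = t - knots[i]
    a, b, c, d = coeffs[i]
    return a + s * (b + s * (c + s * d))  # Horner's rule: three multiply-adds
```

For example, `knots = [0, 1, 2]` with `coeffs = [(0, 0, 0, 1), (1, 3, 3, 1)]` encodes the curve t^3 on [0, 2] using eight stored numbers, however finely the output is later sampled.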
Stable State Space SubSpace (S$^5$) Identification
State space subspace algorithms for input-output systems have been widely applied but also have a reasonably well-developed asymptotic theory dealing with consistency. However, guaranteeing the stability of the estimated system matrix is a major issue. Existing stability-guaranteed algorithms are computationally expensive, require several tuning parameters, and scale badly to high state dimensions. Here, we develop a new algorithm that is closed-form and requires no tuning parameters. It is thus computationally cheap and scales easily to high state dimensions. We also prove its consistency under reasonable conditions.
Parallelized Robust Distributed Model Predictive Control in the Presence of Coupled State Constraints
In this paper, we present a robust distributed model predictive control (DMPC) scheme for dynamically decoupled nonlinear systems which are subject to state constraints, coupled state constraints and input constraints. In the proposed control scheme, all subsystems solve their local optimization problem in parallel and neighbor-to-neighbor communication suffices. The approach relies on consistency constraints which define a neighborhood around each subsystem's reference trajectory within which the state of the subsystem is guaranteed to stay. Reference trajectories and consistency constraints are known to neighboring subsystems. Contrary to other relevant approaches, the reference trajectories are improved consecutively. The presented approach allows the formulation of convex optimization problems for systems with linear dynamics even in the presence of non-convex state constraints. Additionally, we employ tubes in order to ensure the controller's robustness against bounded uncertainties. Finally, we briefly comment on an iterative extension of the DMPC scheme. The effectiveness of the proposed DMPC scheme and its iterative extension is demonstrated with simulations.
comment: 15 pages, 5 figures, accepted for publication in Automatica
Transmission Expansion Planning for Renewable-energy-dominated Power Grids Considering Climate Impact
As renewable energy is becoming the major resource in future grids, the weather and climate can have a higher impact on grid reliability. Transmission expansion planning (TEP) has the potential to reinforce a transmission network that is suitable for climate-impacted grids. In this paper, we propose a systematic TEP procedure for climate-impacted renewable energy-enriched grids. Particularly, this work developed an improved model for TEP considering climate impact (TEP-CI) and evaluated the system reliability with the obtained transmission investment plan. Firstly, we created climate-impacted spatio-temporal future grid data to facilitate the TEP-CI study, which includes the future climate-dependent renewable production as well as the dynamic rating profiles of the Texas 123-bus backbone transmission system (TX-123BT). Secondly, we proposed the TEP-CI which considers the variation in renewable production and dynamic line rating, and obtained the investment plan for future TX-123BT. Thirdly, we presented a customized security-constrained unit commitment (SCUC) specifically for climate-impacted grids. The future grid reliability under various investment scenarios is analyzed based on the daily operation conditions from SCUC simulations. The whole procedure presented in this paper enables numerical studies on grid planning considering climate impacts. It can also serve as a benchmark for other TEP-CI research and performance evaluation.
comment: 11 pages, 8 figures
A Synthetic Texas Power System with Time-Series High-Resolution Weather-Dependent Spatio-Temporally Correlated Grid Profiles
This study introduced a synthetic power system with spatio-temporally correlated profiles of solar power, wind power, dynamic line ratings and loads at one-hour resolution for five continuous years, referred to as the Texas 123-bus backbone transmission (TX-123BT) system. Unlike conventional test cases that offer a static snapshot of system profile, the designed TX-123BT system incorporates weather-dependent profiles for renewable generation and transmission thermal limits, mimicking the actual Electric Reliability Council of Texas (ERCOT) system characteristics. Three weather-dependent models are used for the creation of wind and solar power production, and dynamic line rating (DLR) separately. Security-constrained unit commitment (SCUC) is conducted on TX-123BT daily profiles and numerical results are compared with the actual ERCOT system for validation. The long-term spatio-temporal profiles can greatly capture the renewable production versatility due to the environmental conditions. An example of hydrogen facilities integration studies is presented to illustrate the advantage of utilizing detailed spatio-temporal profiles of TX-123BT.
comment: 10 pages, 14 figures, 10 tables
Optimization of the Energy-Comfort Trade-Off of HVAC Systems in Electric City Buses Based on a Steady-State Model
The electrification of public transport vehicles offers the potential to relieve city centers of pollutant and noise emissions. Furthermore, electric buses have lower life-cycle greenhouse gas (GHG) emissions than diesel buses, particularly when operated with sustainably produced electricity. However, the heating, ventilation, and air-conditioning (HVAC) system can consume a significant amount of energy, thus limiting the achievable driving range. In this paper, we address the HVAC system in an electric city bus by analyzing the trade-off between the energy consumption and the thermal comfort of the passengers. We do this by developing a dynamic thermal model for the bus, which we simplify by considering it to be in steady state. We introduce a method that is able to quickly optimize the steady-state HVAC system inputs for a large number of samples representative of a year-round operation. A comparison between the results from the steady-state optimization approach and a dynamic simulation reveals small deviations in both the HVAC system power demand and achieved thermal comfort. Thus, the approximation of the system performance with a steady-state model is justified. We present two case studies to demonstrate the practical relevance of the approach. First, we show how the method can be used to compare different HVAC system designs based on a year-round performance evaluation. Second, we show how the method can be used to extract setpoints for online controllers that achieve close-to-optimal performance without any predictive information. In conclusion, this study shows that a steady-state analysis of the HVAC systems of an electric city bus is a valuable approach to evaluate and optimize its performance.
comment: Preprint submitted to Control Engineering Practice
AirPilot: A PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe UAV flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are among the most popular and widely used control algorithms for drone control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced PID drone controller using Proximal Policy Optimization. The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the Gazebo simulator and subsequently implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error by more than 82% and improving overshoot, speed and settling time significantly.
comment: 14 pages, 17 figures
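For readers unfamiliar with gain scheduling, the sketch below shows a textbook discrete PID loop whose gains can be replaced between steps. This is the kind of hook through which an external tuner, such as a learned policy, could adjust the controller at runtime. It is an illustrative assumption, not AirPilot's implementation; the class and method names are hypothetical.

```python
class PID:
    """Textbook discrete PID controller with runtime-adjustable gains."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative estimate until a second sample exists

    def set_gains(self, kp, ki, kd):
        # Hypothetical hook: an external tuner would call this between steps.
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, error, dt):
        """Return the control output for the current tracking error."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Because the output depends on the gains only at the moment `step` is called, changing them between steps makes the overall input-output map nonlinear even though each individual step is a linear combination of the error terms.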
Symmetric Locality: Definition and Initial Results
In this short paper, we characterize symmetric locality. In designing algorithms, compilers, and systems, data movement is a common bottleneck in high-performance computation, which motivates improving cache and memory performance. We study a special type of data reuse in the form of repeated traversals, or re-traversals, which are based on the symmetric group. The cyclic and sawtooth traces are previously known results in symmetric locality, and in this work, we would like to generalize these results to any re-traversal. We also provide an abstract framework for applications in compiler design and machine learning models to improve the memory performance of certain programs.
comment: 6 pages, 2nd ver
Geometric Tracking Control of Omnidirectional Multirotors for Aggressive Maneuvers
An omnidirectional multirotor has the maneuverability of decoupled translational and rotational motions, superseding the traditional multirotors' motion capability. Such maneuverability is achieved due to the ability of the omnidirectional multirotor to frequently alter the thrust amplitude and direction. In doing so, the rotors' settling time, which is induced by inherent rotor dynamics, significantly affects the omnidirectional multirotor's tracking performance, especially in aggressive flights. To resolve this issue, we propose a novel tracking controller that takes the rotor dynamics into account and does not require additional rotor state measurement. We prove that the proposed controller yields almost global exponential stability. The proposed controller is validated in experiments, where we demonstrate significantly improved tracking performance in multiple aggressive maneuvers compared with a baseline geometric PD controller.
Characterization of the Dynamical Properties of Safety Filters for Linear Planar Systems
This paper studies the dynamical properties of closed-loop systems obtained from control barrier function-based safety filters. We provide a sufficient and necessary condition for the existence of undesirable equilibria and show that the Jacobian matrix of the closed-loop system evaluated at an undesirable equilibrium always has a nonpositive eigenvalue. In the special case of linear planar systems and ellipsoidal obstacles, we give a complete characterization of the dynamical properties of the corresponding closed-loop system. We show that for underactuated systems, the safety filter always introduces a single undesirable equilibrium, which is a saddle-point. We prove that all trajectories outside the global stable manifold of such equilibrium converge to the origin. In the fully actuated case, we discuss how the choice of nominal controller affects the stability properties of the closed-loop system. Various simulations illustrate our results.
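A common form of control barrier function-based safety filter picks the input closest to the nominal one subject to a single affine constraint, and that problem has a closed-form solution. The sketch below implements it under stated assumptions: control-affine dynamics x' = f(x) + g(x)u and a barrier function h, with the caller supplying `a` = (grad h)·g and `b` = (grad h)·f + alpha·h at the current state. The paper's exact formulation may differ, and the function name is hypothetical.

```python
def safety_filter(u_nom, a, b):
    """Return the input closest to u_nom (Euclidean norm) with a . u + b >= 0.

    u_nom: nominal control input (list of floats)
    a, b : coefficients of the affine safety constraint at the current state
    """
    slack = sum(ai * ui for ai, ui in zip(a, u_nom)) + b
    if slack >= 0.0:
        return list(u_nom)  # nominal input already satisfies the constraint
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("constraint is violated and does not depend on u")
    # Project u_nom onto the constraint boundary a . u + b = 0.
    scale = -slack / norm_sq
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]
```

The filter leaves the nominal controller untouched wherever it is already safe and modifies it only on the constraint boundary, which is why the closed loop can acquire equilibria that the nominal closed loop does not have.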
Exploiting Symmetry in Dynamics for Model-Based Reinforcement Learning with Asymmetric Rewards
Recent work in reinforcement learning has leveraged symmetries in the model to improve sample efficiency in training a policy. A commonly used simplifying assumption is that the dynamics and reward both exhibit the same symmetry; however, in many real-world environments, the dynamical model exhibits symmetry independent of the reward model. In this paper, we assume only the dynamics exhibit symmetry, extending the scope of problems in reinforcement learning and learning in control theory to which symmetry techniques can be applied. We use Cartan's moving frame method to introduce a technique for learning dynamics that, by construction, exhibit specified symmetries. Numerical experiments demonstrate that the proposed method learns a more accurate dynamical model
Robotics
System Identification For Constrained Robots
Identifying the parameters of robotic systems, such as motor inertia or joint friction, is critical to satisfactory controller synthesis, model analysis, and observer design. Conventional identification techniques are designed primarily for unconstrained systems, such as robotic manipulators. In contrast, the growing importance of legged robots that feature closed kinematic chains or other constraints, poses challenges to these traditional methods. This paper introduces a system identification approach for constrained systems that relies on iterative least squares to identify motor inertia and joint friction parameters from data. The proposed approach is validated in simulation and in the real-world on Digit, which is a 20 degree-of-freedom humanoid robot built by Agility Robotics. In these experiments, the parameters identified by the proposed method enable a model-based controller to achieve better tracking performance than when it uses the default parameters provided by the manufacturer. The implementation of the approach is available at https://github.com/roahmlab/ConstrainedSysID.
A Transparency Paradox? Investigating the Impact of Explanation Specificity and Autonomous Vehicle Perceptual Inaccuracies on Passengers
Transparency in automated systems could be afforded through the provision of intelligible explanations. While transparency is desirable, might it lead to catastrophic outcomes (such as anxiety), that could outweigh its benefits? It's quite unclear how the specificity of explanations (level of transparency) influences recipients, especially in autonomous driving (AD). In this work, we examined the effects of transparency mediated through varying levels of explanation specificity in AD. We first extended a data-driven explainer model by adding a rule-based option for explanation generation in AD, and then conducted a within-subject lab study with 39 participants in an immersive driving simulator to study the effect of the resulting explanations. Specifically, our investigation focused on: (1) how different types of explanations (specific vs. abstract) affect passengers' perceived safety, anxiety, and willingness to take control of the vehicle when the vehicle perception system makes erroneous predictions; and (2) the relationship between passengers' behavioural cues and their feelings during the autonomous drives. Our findings showed that passengers felt safer with specific explanations when the vehicle's perception system had minimal errors, while abstract explanations that hid perception errors led to lower feelings of safety. Anxiety levels increased when specific explanations revealed perception system errors (high transparency). We found no significant link between passengers' visual patterns and their anxiety levels. Our study suggests that passengers prefer clear and specific explanations (high transparency) when they originate from autonomous vehicles (AVs) with optimal perceptual accuracy.
comment: Submitted to Transportation Research Part F: Traffic Psychology and Behaviour. arXiv admin note: text overlap with arXiv:2307.00633
User-centered evaluation of the Wearable Walker lower limb exoskeleton, preliminary assessment based on the Experience protocol
Using lower-limb exoskeletons provides potential advantages in terms of productivity and safety associated with reduced stress. However, complex issues in human-robot interaction are still open, such as the physiological effects of exoskeletons and the impact on the user's subjective experience. In this work, an innovative exoskeleton, the Wearable Walker, is assessed using the EXPERIENCE benchmarking protocol from the EUROBENCH project. The Wearable Walker is a lower-limb exoskeleton that enhances human abilities, such as carrying loads. The device uses a unique control approach called Blend Control that provides smooth assistance torques. It operates two models simultaneously, one for when the left foot is grounded and another for when the right foot is grounded. These models generate assistive torques that are combined to provide continuous and smooth overall assistance, preventing any abrupt changes in torque due to model switching. The EXPERIENCE protocol consists of walking on flat ground while gathering physiological signals such as heart rate, its variability, respiration rate, and galvanic skin response, and completing a questionnaire. The test was performed with five healthy subjects. The scope of the present study is twofold: to evaluate the specific exoskeleton and its current control system to gain insight into possible improvements and to present a case study for a formal and replicable benchmarking of wearable robots.
comment: 12 pages, 5 figures
Robust Stochastic Shortest-Path Planning via Risk-Sensitive Incremental Sampling
With the pervasiveness of Stochastic Shortest-Path (SSP) problems in high-risk industries, such as last-mile autonomous delivery and supply chain management, robust planning algorithms are crucial for ensuring successful task completion while mitigating hazardous outcomes. Mainstream chance-constrained incremental sampling techniques for solving SSP problems tend to be overly conservative and typically do not consider the likelihood of undesirable tail events. We propose an alternative risk-aware approach inspired by the asymptotically-optimal Rapidly-Exploring Random Trees (RRT*) planning algorithm, which selects nodes along path segments with minimal Conditional Value-at-Risk (CVaR). Our motivation rests on the step-wise coherence of the CVaR risk measure and the optimal substructure of the SSP problem. Thus, optimizing with respect to the CVaR at each sampling iteration necessarily leads to an optimal path in the limit of the sample size. We validate our approach via numerical path planning experiments in a two-dimensional grid world with obstacles and stochastic path-segment lengths. Our simulation results show that incorporating risk into the tree growth process yields paths with lengths that are significantly less sensitive to variations in the noise parameter, or equivalently, paths that are more robust to environmental uncertainty. Algorithmic analyses reveal similar query time and memory space complexity to the baseline RRT* procedure, with only a marginal increase in processing time. This increase is offset by significantly lower noise sensitivity and reduced planner failure rates.
comment: Accepted for presentation at the 2024 IEEE Conference on Decision and Control (CDC)
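Since the planner described above ranks path segments by Conditional Value-at-Risk, a small sample-based estimator may help make the quantity concrete. The sketch below assumes costs where larger is worse and defines CVaR at level `alpha` as the mean of the worst (1 - alpha) fraction of samples. It is a generic estimator, not the authors' code, and the function name is hypothetical.

```python
import math

def empirical_cvar(costs, alpha):
    """Mean of the worst (1 - alpha) fraction of sampled costs (higher = worse)."""
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(costs, reverse=True)                 # worst outcomes first
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))   # number of tail samples
    return sum(ordered[:k]) / k
```

With `alpha = 0` the estimator reduces to the plain sample mean, and as `alpha` approaches 1 it approaches the single worst sample, so the parameter interpolates between risk-neutral and worst-case evaluation of a segment.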
Case Study: Runtime Safety Verification of Neural Network Controlled System
Neural networks are increasingly used in safety-critical applications such as robotics and autonomous vehicles. However, the deployment of neural-network-controlled systems (NNCSs) raises significant safety concerns. Many recent advances overlook critical aspects of verifying control and ensuring safety in real-time scenarios. This paper presents a case study on using POLAR-Express, a state-of-the-art NNCS reachability analysis tool, for runtime safety verification in a Turtlebot navigation system using LiDAR. The Turtlebot, equipped with a neural network controller for steering, operates in a complex environment with obstacles. We developed a safe online controller switching strategy that switches between the original NNCS controller and an obstacle avoidance controller based on the verification results. Our experiments, conducted in a ROS2 Flatland simulation environment, explore the capabilities and limitations of using POLAR-Express for runtime verification and demonstrate the effectiveness of our switching strategy.
comment: 15 pages, 5 figures, submitted to Runtime Verification 2024
S-RAF: A Simulation-Based Robustness Assessment Framework for Responsible Autonomous Driving
As artificial intelligence (AI) technology advances, ensuring the robustness and safety of AI-driven systems has become paramount. However, varying perceptions of robustness among AI developers create misaligned evaluation metrics, complicating the assessment and certification of safety-critical and complex AI systems such as autonomous driving (AD) agents. To address this challenge, we introduce Simulation-Based Robustness Assessment Framework (S-RAF) for autonomous driving. S-RAF leverages the CARLA Driving simulator to rigorously assess AD agents across diverse conditions, including faulty sensors, environmental changes, and complex traffic situations. By quantifying robustness and its relationship with other safety-critical factors, such as carbon emissions, S-RAF aids developers and stakeholders in building safe and responsible driving agents, and streamlining safety certification processes. Furthermore, S-RAF offers significant advantages, such as reduced testing costs, and the ability to explore edge cases that may be unsafe to test in the real world. The code for this framework is available here: https://github.com/cognitive-robots/rai-leaderboard
Detection and tracking of MAVs using a LiDAR with rosette scanning pattern
The usage of commercial Micro Aerial Vehicles (MAVs) has increased drastically during the last decade. While the added value of MAVs to society is apparent, their growing use is also coming with increasing risks like violating public airspace at airports or committing privacy violations. To mitigate these issues it is becoming critical to develop solutions that incorporate the detection and tracking of MAVs with autonomous systems. This work presents a method for the detection and tracking of MAVs using a novel, low-cost rosette scanning LiDAR on a pan-tilt turret. Once the static background is captured, a particle filter is utilized to detect a possible target and track its position with a physical, programmable pan-tilt system. The tracking makes it possible to keep the MAV in the center, maximizing the density of 3D points measured on the target by the LiDAR sensor. The developed algorithm was evaluated within the indoor MIcro aerial vehicle and MOtion capture (MIMO) arena and has state-of-the-art tracking accuracy, stability, and fast re-detection time in case of tracking loss. Based on the outdoor tests, it was possible to significantly increase the detection distance and number of returned points compared to other similar methods using LiDAR.
Study of MRI-compatible Notched Plastic Ultrasonic Stator with FEM Simulation and Holography Validation
Intra-operative image guidance using magnetic resonance imaging (MRI) can significantly enhance the precision of surgical procedures, such as deep brain tumor ablation. However, the powerful magnetic fields and limited space within an MRI scanner require the use of robotic devices to aid surgeons. Piezoelectric motors are commonly utilized to drive these robots, with piezoelectric ultrasonic motors being particularly notable. These motors consist of a piezoelectric ring stator that is bonded to a rotor through frictional coupling. When the stator is excited at specific frequencies, it generates distinctive mode shapes with surface waves that exhibit both in-plane and out-of-plane displacement, leading to the rotation of the rotor. In this study, we continue our previous work and refine the motor design and performance. We combine finite element modeling (FEM) with stroboscopic and time-averaged digital holography to validate a further plastic-based ultrasonic motor with better rotary performance.
comment: 4 pages, 9 figures, 1 table
On the Completeness of Conflict-Based Search: Temporally-Relative Duplicate Pruning
A known limitation of the Conflict-Based Search (CBS) algorithm for the multi-agent pathfinding (MAPF) problem is that it is incomplete for problems which have no solution; if no mitigating procedure is run in parallel, CBS will run forever when given an unsolvable problem instance. In this work, we introduce Temporally-Relative Duplicate Pruning (TRDP), a technique for duplicate detection and removal in both classic and continuous-time MAPF domains. TRDP is a simple procedure which closes the long-standing theoretic loophole of incompleteness for CBS by detecting and avoiding the expansion of duplicate states. TRDP is shown both theoretically and empirically to ensure termination without a significant impact on runtime in the majority of problem instances. In certain cases, TRDP is shown to increase performance significantly.
comment: 9 pages, 4 figures, 2 tables
Diffusion Model for Planning: A Systematic Literature Review
Diffusion models, which leverage stochastic processes to capture complex data distributions effectively, have shown their performance as generative models, achieving notable success in image-related tasks through iterative denoising processes. Recently, diffusion models have been further applied to planning tasks, where they show strong abilities, leading to a significant growth in related publications since 2023. To help researchers better understand the field and promote its development, we conduct a systematic literature review of recent advancements in the application of diffusion models for planning. Specifically, this paper categorizes and discusses the current literature from the following perspectives: (i) relevant datasets and benchmarks used for evaluating diffusion model-based planning; (ii) fundamental studies that address aspects such as sampling efficiency; (iii) skill-centric and condition-guided planning for enhancing adaptability; (iv) safety and uncertainty management mechanisms for enhancing safety and robustness; and (v) domain-specific applications such as autonomous driving. Finally, given the above literature review, we further discuss the challenges and future directions in this field.
comment: 13 pages, 2 figures, 4 tables
Agentic Skill Discovery
Language-conditioned robotic skills make it possible to apply the high-level reasoning of Large Language Models (LLMs) to low-level robotic control. A remaining challenge is to acquire a diverse set of fundamental skills. Existing approaches either manually decompose a complex task into atomic robotic actions in a top-down fashion, or bootstrap as many combinations as possible in a bottom-up fashion to cover a wider range of task possibilities. These decompositions or combinations, however, require an initial skill library. For example, a "grasping" capability can never emerge from a skill library containing only diverse "pushing" skills. Existing skill discovery techniques with reinforcement learning acquire skills by an exhaustive exploration but often yield non-meaningful behaviors. In this study, we introduce a novel framework for skill discovery that is entirely driven by LLMs. The framework begins with an LLM generating task proposals based on the provided scene description and the robot's configurations, aiming to incrementally acquire new skills upon task completion. For each proposed task, a series of reinforcement learning processes are initiated, utilizing reward and success determination functions sampled by the LLM to develop the corresponding policy. The reliability and trustworthiness of learned behaviors are further ensured by an independent vision-language model. We show that starting with zero skill, the skill library emerges and expands to more and more meaningful and reliable skills, enabling the robot to efficiently further propose and complete advanced tasks. Project page: https://agentic-skill-discovery.github.io
comment: Webpage see https://agentic-skill-discovery.github.io/
MonoForce: Self-supervised Learning of Physics-informed Model for Predicting Robot-terrain Interaction IROS 2024
While autonomous navigation of mobile robots on rigid terrain is a well-explored problem, navigating on deformable terrain such as tall grass or bushes remains a challenge. To address it, we introduce an explainable, physics-aware and end-to-end differentiable model which predicts the outcome of robot-terrain interaction from camera images, both on rigid and non-rigid terrain. The proposed MonoForce model consists of a black-box module which predicts robot-terrain interaction forces from onboard cameras, followed by a white-box module, which transforms these forces and control signals into predicted trajectories, using only the laws of classical mechanics. The differentiable white-box module allows backpropagating the predicted trajectory errors into the black-box module, serving as a self-supervised loss that measures consistency between the predicted forces and ground-truth trajectories of the robot. Experimental evaluation on a public dataset and our data has shown that while the prediction capabilities are comparable to state-of-the-art algorithms on rigid terrain, MonoForce shows superior accuracy on non-rigid terrain such as tall grass or bushes. To facilitate the reproducibility of our results, we release both the code and datasets.
comment: IROS 2024
AirPilot: A PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Navigation precision, speed and stability are crucial for safe UAV flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are among the most popular and widely used control algorithms for drone control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced PID drone controller using Proximal Policy Optimization. The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the Gazebo simulator and subsequently implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error by more than 82% and improving overshoot, speed and settling time significantly.
comment: 14 pages, 17 figures
Geometric Tracking Control of Omnidirectional Multirotors for Aggressive Maneuvers
An omnidirectional multirotor has the maneuverability of decoupled translational and rotational motions, superseding the traditional multirotors' motion capability. Such maneuverability is achieved due to the ability of the omnidirectional multirotor to frequently alter the thrust amplitude and direction. In doing so, the rotors' settling time, which is induced by inherent rotor dynamics, significantly affects the omnidirectional multirotor's tracking performance, especially in aggressive flights. To resolve this issue, we propose a novel tracking controller that takes the rotor dynamics into account and does not require additional rotor state measurement. We prove that the proposed controller yields almost global exponential stability. The proposed controller is validated in experiments, where we demonstrate significantly improved tracking performance in multiple aggressive maneuvers compared with a baseline geometric PD controller.
Component Selection for Craft Assembly Tasks
Inspired by traditional handmade crafts, where a person improvises assemblies based on the available objects, we formally introduce the Craft Assembly Task. It is a robotic assembly task that involves building an accurate representation of a given target object using the available objects, which do not directly correspond to its parts. In this work, we focus on selecting the subset of available objects for the final craft, when the given input is an RGB image of the target in the wild. We use a mask segmentation neural network to identify visible parts, followed by retrieving labelled template meshes. These meshes undergo pose optimization to determine the most suitable template. Then, we propose to simplify the parts of the transformed template mesh to primitive shapes like cuboids or cylinders. Finally, we design a search algorithm to find correspondences in the scene based on local and global proportions. We develop baselines for comparison that consider all possible combinations, and choose the highest scoring combination for common metrics used in foreground maps and mask accuracy. Our approach achieves comparable results to the baselines for two different scenes, and we show qualitative results for an implementation in a real-world scenario.
comment: Published on IEEE RA-L
iMTSP: Solving Min-Max Multiple Traveling Salesman Problem with Imperative Learning
This paper considers a Min-Max Multiple Traveling Salesman Problem (MTSP), where the goal is to find a set of tours, one for each agent, to collectively visit all the cities while minimizing the length of the longest tour. Though MTSP has been widely studied, obtaining near-optimal solutions for large-scale problems is still challenging due to its NP-hardness. Recent efforts in data-driven methods face challenges of the need for hard-to-obtain supervision and issues with high variance in gradient estimations, leading to slow convergence and highly suboptimal solutions. We address these issues by reformulating MTSP as a bilevel optimization problem, using the concept of imperative learning (IL). This involves introducing an allocation network that decomposes the MTSP into multiple single-agent traveling salesman problems (TSPs). The longest tour from these TSP solutions is then used to self-supervise the allocation network, resulting in a new self-supervised, bilevel, end-to-end learning framework, which we refer to as imperative MTSP (iMTSP). Additionally, to tackle the high-variance gradient issues during the optimization, we introduce a control variate-based gradient estimation algorithm. Our experiments showed that these innovative designs enable our gradient estimator to converge 20% faster than the advanced reinforcement learning baseline and find up to 80% shorter tour length compared with Google OR-Tools MTSP solver, especially in large-scale problems (e.g. 1000 cities and 15 agents).
comment: 8 pages, 3 figures, 3 tables
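The min-max objective described in the iMTSP abstract above can be illustrated with a minimal sketch. It assumes 2D Euclidean cities and a single shared depot, which the abstract does not specify; the function names `tour_length` and `min_max_objective` are hypothetical, and this only evaluates a given allocation, it is not the paper's learning method.

```python
import math


def tour_length(depot, cities):
    """Length of the closed tour depot -> cities (in the given order) -> depot."""
    path = [depot] + list(cities) + [depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def min_max_objective(depot, tours):
    """Min-max MTSP cost of an allocation: the length of the longest agent tour."""
    return max(tour_length(depot, tour) for tour in tours)


# Two agents starting from a depot at the origin (illustrative data only).
allocation = [[(3.0, 4.0)], [(0.0, 1.0), (0.0, 2.0)]]
longest = min_max_objective((0.0, 0.0), allocation)
```

A solver would search over allocations and visiting orders to make this value as small as possible.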
Exploiting Symmetry in Dynamics for Model-Based Reinforcement Learning with Asymmetric Rewards
Recent work in reinforcement learning has leveraged symmetries in the model to improve sample efficiency in training a policy. A commonly used simplifying assumption is that the dynamics and reward both exhibit the same symmetry; however, in many real-world environments, the dynamical model exhibits symmetry independent of the reward model. In this paper, we assume only the dynamics exhibit symmetry, extending the scope of problems in reinforcement learning and learning in control theory to which symmetry techniques can be applied. We use Cartan's moving frame method to introduce a technique for learning dynamics that, by construction, exhibit specified symmetries. Numerical experiments demonstrate that the proposed method learns a more accurate dynamical model.
Multiagent Systems
The computational power of a human society: a new model of social evolution
Social evolutionary theory seeks to explain increases in the scale and complexity of human societies, from origins to present. Over the course of the twentieth century, social evolutionary theory largely fell out of favor as a way of investigating human history, just as advances in complex systems science and computer science saw the emergence of powerful new conceptions of complex systems, and in particular new methods of measuring complexity. We propose that these advances in our understanding of complex systems and computer science should be brought to bear on our investigations into human history. To that end, we present a new framework for modeling how human societies co-evolve with their biotic environments, recognizing that both a society and its environment are computers. This leads us to model the dynamics of each of those two systems using the same, new kind of computational machine, which we define here. For simplicity, we construe a society as a set of interacting occupations and technologies. Similarly, under such a model, a biotic environment is a set of interacting distinct ecological and climatic processes. This provides novel ways to characterize social complexity, which we hope will cast new light on the archaeological and historical records. Our framework also provides a natural way to formalize both the energetic (thermodynamic) costs required by a society as it runs, and the ways it can extract thermodynamic resources from the environment in order to pay for those costs -- and perhaps to grow with any left-over resources.
comment: 61 pages, 7 figures, 5 pages of appendices, 101 references
AgentSimulator: An Agent-based Approach for Data-driven Business Process Simulation
Business process simulation (BPS) is a versatile technique for estimating process performance across various scenarios. Traditionally, BPS approaches employ a control-flow-first perspective by enriching a process model with simulation parameters. Although such approaches can mimic the behavior of centrally orchestrated processes, such as those supported by workflow systems, current control-flow-first approaches cannot faithfully capture the dynamics of real-world processes that involve distinct resource behavior and decentralized decision-making. Recognizing this issue, this paper introduces AgentSimulator, a resource-first BPS approach that discovers a multi-agent system from an event log, modeling distinct resource behaviors and interaction patterns to simulate the underlying process. Our experiments show that AgentSimulator achieves state-of-the-art simulation accuracy with significantly lower computation times than existing approaches while providing high interpretability and adaptability to different types of process-execution scenarios.
Multilevel Graph Reinforcement Learning for Consistent Cognitive Decision-making in Heterogeneous Mixed Autonomy
In the realm of heterogeneous mixed autonomy, vehicles experience dynamic spatial correlations and nonlinear temporal interactions in a complex, non-Euclidean space. These complexities pose significant challenges to traditional decision-making frameworks. Addressing this, we propose a hierarchical reinforcement learning framework integrated with multilevel graph representations, which effectively comprehends and models the spatiotemporal interactions among vehicles navigating through uncertain traffic conditions with varying decision-making systems. Rooted in multilevel graph representation theory, our approach encapsulates spatiotemporal relationships inherent in non-Euclidean spaces. A weighted graph represents spatiotemporal features between nodes, addressing the degree imbalance inherent in dynamic graphs. We integrate asynchronous parallel hierarchical reinforcement learning with a multilevel graph representation and a multi-head attention mechanism, which enables connected autonomous vehicles (CAVs) to exhibit capabilities akin to human cognition, facilitating consistent decision-making across various critical dimensions. The proposed decision-making strategy is validated in challenging environments characterized by high density, randomness, and dynamism on highway roads. We assess the performance of our framework through ablation studies, comparative analyses, and spatiotemporal trajectory evaluations. This study presents a quantitative analysis of decision-making mechanisms mirroring human cognitive functions in the realm of heterogeneous mixed autonomy, promoting the development of multi-dimensional decision-making strategies and a sophisticated distribution of attentional resources.
comment: 15 pages, 9 figures
Data-driven Construction of Finite Abstractions for Interconnected Systems: A Compositional Approach
Finite-state abstractions (a.k.a. symbolic models) present a promising avenue for the formal verification and synthesis of controllers in continuous-space control systems. These abstractions provide simplified models that capture the fundamental behaviors of the original systems. However, the creation of such abstractions typically relies on the availability of precise knowledge concerning system dynamics, which might not be available in many real-world applications. In this work, we introduce an innovative, data-driven, and compositional approach to generate finite abstractions for interconnected systems that consist of discrete-time control subsystems with unknown dynamics. These subsystems interact through an unknown static interconnection map. Our methodology for abstracting the interconnected system involves constructing abstractions for individual subsystems and incorporating an abstraction of the interconnection map.
comment: This manuscript of 19 pages and 7 figures is a preprint under review with a journal
ASGM-KG: Unveiling Alluvial Gold Mining Through Knowledge Graphs
Artisanal and Small-Scale Gold Mining (ASGM) is a low-cost yet highly destructive mining practice, leading to environmental disasters across the world's tropical watersheds. The topic of ASGM spans multiple domains of research and information, including natural and social systems, and knowledge is often atomized across a diversity of media and documents. We therefore introduce a knowledge graph (ASGM-KG) that consolidates and provides crucial information about ASGM practices and their environmental effects. The current version of ASGM-KG consists of 1,899 triples extracted using a large language model (LLM) from documents and reports published by both non-governmental and governmental organizations. These documents were carefully selected by a group of tropical ecologists with expertise in ASGM. This knowledge graph was validated using two methods. First, a small team of ASGM experts reviewed and labeled triples as factual or non-factual. Second, we devised and applied an automated factual reduction framework that relies on a search engine and an LLM for labeling triples. Our framework performs as well as five baselines on a publicly available knowledge graph and achieves over 90% accuracy on our ASGM-KG validated by domain experts. ASGM-KG demonstrates an advancement in knowledge aggregation and representation for complex, interdisciplinary environmental crises such as ASGM.
Multiagent Systems
Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
Mean field games (MFGs) model the interactions within a large-population multi-agent system using the population distribution. Traditional learning methods for MFGs are based on fixed-point iteration (FPI), which calculates best responses and induced population distribution separately and sequentially. However, FPI-type methods suffer from inefficiency and instability, due to oscillations caused by the forward-backward procedure. This paper considers an online learning method for MFGs, where an agent updates its policy and population estimates simultaneously and fully asynchronously, resulting in a simple stochastic gradient descent (SGD) type method called SemiSGD. Not only does SemiSGD exhibit numerical stability and efficiency, but it also provides a novel perspective by treating the value function and population distribution as a unified parameter. We theoretically show that SemiSGD directs this unified parameter along a descent direction to the mean field equilibrium. Motivated by this perspective, we develop a linear function approximation (LFA) for both the value function and the population distribution, resulting in the first population-aware LFA for MFGs on continuous state-action space. Finite-time convergence and approximation error analysis are provided for SemiSGD equipped with population-aware LFA.
EmBARDiment: an Embodied AI Agent for Productivity in XR
XR devices running chat-bots powered by Large Language Models (LLMs) have tremendous potential as always-on agents that can enable much better productivity scenarios. However, screen-based chat-bots do not take advantage of the full suite of natural inputs available in XR, including inward-facing sensor data; instead, they over-rely on explicit voice or text prompts, sometimes paired with multi-modal data dropped as part of the query. We propose a solution that leverages an attention framework that derives context implicitly from user actions, eye-gaze, and contextual memory within the XR environment. This minimizes the need for engineered explicit prompts, fostering grounded and intuitive interactions that glean user insights for the chat-bot. Our user studies demonstrate the imminent feasibility and transformative potential of our approach to streamline user interaction in XR with chat-bots, while offering insights for the design of future XR-embodied LLM agents.
Independent Policy Mirror Descent for Markov Potential Games: Scaling to Large Number of Players
Markov Potential Games (MPGs) form an important sub-class of Markov games, which are a common framework to model multi-agent reinforcement learning problems. In particular, MPGs include as a special case the identical-interest setting where all the agents share the same reward function. Scaling the performance of Nash equilibrium learning algorithms to a large number of agents is crucial for multi-agent systems. To address this important challenge, we focus on the independent learning setting where agents can only have access to their local information to update their own policy. In prior work on MPGs, the iteration complexity for obtaining $\epsilon$-Nash regret scales linearly with the number of agents $N$. In this work, we investigate the iteration complexity of an independent policy mirror descent (PMD) algorithm for MPGs. We show that PMD with KL regularization, also known as natural policy gradient, enjoys a better $\sqrt{N}$ dependence on the number of agents, improving over PMD with Euclidean regularization and prior work. Furthermore, the iteration complexity is also independent of the sizes of the agents' action spaces.
comment: 16 pages, CDC 2024
Time-Ordered Ad-hoc Resource Sharing for Independent Robotic Agents IROS 2024
Resource sharing is a crucial part of a multi-robot system. We propose a Boolean satisfiability based approach to resource sharing. Our key contributions include an algorithm for converting any constrained assignment to a weighted-SAT-based optimization. We propose a theorem that allows optimal resource assignment problems to be solved via repeated application of a SAT solver. Additionally, we show a way to encode continuous time ordering constraints using Conjunctive Normal Form (CNF). We benchmark our new algorithms and show that they can be used in an ad-hoc setting. We test our algorithms on a fleet of simulated and real-world robots and show that the algorithms are able to handle real-world situations. Our algorithms and test harnesses are open source and build on Open-RMF's fleet management system.
comment: IROS 2024
The Nah Bandit: Modeling User Non-compliance in Recommendation Systems
Recommendation systems now pervade the digital world, ranging from advertising to entertainment. However, it remains challenging to implement effective recommendation systems in the physical world, such as in mobility or health. This work focuses on a key challenge: in the physical world, it is often easy for the user to opt out of taking any recommendation if it is not to her liking, and to fall back to her baseline behavior. It is thus crucial in cyber-physical recommendation systems to operate with an interaction model that is aware of such user behavior, lest the user abandon the recommendations altogether. This paper thus introduces the Nah Bandit, a tongue-in-cheek reference to describe a Bandit problem where users can say 'nah' to the recommendation and opt for their preferred option instead. As such, this problem lies in between a typical bandit setup and supervised learning. We model the user non-compliance by parameterizing an anchoring effect of recommendations on users. We then propose the Expert with Clustering (EWC) algorithm, a hierarchical approach that incorporates feedback from both recommended and non-recommended options to accelerate user preference learning. In a recommendation scenario with $N$ users, $T$ rounds per user, and $K$ clusters, EWC achieves a regret bound of $O(N\sqrt{T\log K} + NT)$, achieving superior theoretical performance in the short term compared to the LinUCB algorithm. Experimental results also highlight that EWC outperforms both supervised learning and traditional contextual bandit approaches. This advancement reveals that effective use of non-compliance feedback can accelerate preference learning and improve recommendation accuracy. This work lays the foundation for future research in Nah Bandit, providing a robust framework for more effective recommendation systems.
comment: 12 pages, 8 figures, under review
Decentralized and Uncoordinated Learning of Stable Matchings: A Game-Theoretic Approach
We consider the problem of learning stable matchings with unknown preferences in a decentralized and uncoordinated manner, where "decentralized" means that players make decisions individually without the influence of a central platform, and "uncoordinated" means that players do not need to synchronize their decisions using pre-specified rules. First, we provide a game formulation for this problem with known preferences, where the set of pure Nash equilibria (NE) coincides with the set of stable matchings, and mixed NE can be rounded to a stable matching. Then, we show that for hierarchical markets, applying the exponential weight (EXP) learning algorithm to the stable matching game achieves logarithmic regret in a fully decentralized and uncoordinated fashion. Moreover, we show that EXP converges locally and exponentially fast to a stable matching in general markets. We also introduce another decentralized and uncoordinated learning algorithm that globally converges to a stable matching with arbitrarily high probability. Finally, we provide stronger feedback conditions under which it is possible to drive the market faster toward an approximate stable matching. Our proposed game-theoretic framework bridges the discrete problem of learning stable matchings with the problem of learning NE in continuous-action games.
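As an illustration of the exponential weight (EXP) learning rule named in the abstract above, here is a minimal sketch of one multiplicative-weights update over a finite action set. It assumes full-information rewards and a fixed step size `eta`, neither of which is stated in the abstract; the function name `exp_weights_update` is hypothetical and this is the textbook update, not necessarily the paper's exact variant.

```python
import math


def exp_weights_update(probs, rewards, eta):
    """One exponential-weights step: reweight each action by exp(eta * reward), then renormalize."""
    scaled = [p * math.exp(eta * r) for p, r in zip(probs, rewards)]
    total = sum(scaled)
    return [s / total for s in scaled]


# A player with two candidate partners; the first one yielded the higher reward.
probs = exp_weights_update([0.5, 0.5], [1.0, 0.0], eta=1.0)
```

Repeating this step shifts probability mass toward actions that have earned more reward, while every action keeps a nonzero probability.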
Compressed Federated Reinforcement Learning with a Generative Model ECML-PKDD 2024
Reinforcement learning has recently gained unprecedented popularity, yet it still grapples with sample inefficiency. Addressing this challenge, federated reinforcement learning (FedRL) has emerged, wherein agents collaboratively learn a single policy by aggregating local estimations. However, this aggregation step incurs significant communication costs. In this paper, we propose CompFedRL, a communication-efficient FedRL approach incorporating both periodic aggregation and (direct/error-feedback) compression mechanisms. Specifically, we consider compressed federated $Q$-learning with a generative model setup, where a central server learns an optimal $Q$-function by periodically aggregating compressed $Q$-estimates from local agents. For the first time, we characterize the impact of these two mechanisms (which have remained elusive) by providing a finite-time analysis of our algorithm, demonstrating strong convergence behaviors when utilizing either direct or error-feedback compression. Our bounds indicate improved solution accuracy concerning the number of agents and other federated hyperparameters while simultaneously reducing communication costs. To corroborate our theory, we also conduct in-depth numerical experiments to verify our findings, considering Top-$K$ and Sparsified-$K$ sparsification operators.
comment: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024)
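The Top-$K$ operator and error-feedback compression mentioned in the CompFedRL abstract above can be sketched generically as follows. This is a common textbook formulation on flat lists of floats, not the paper's algorithm; the function names `top_k` and `compress_with_error_feedback` are hypothetical, and the Sparsified-$K$ operator is not shown.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and set the rest to zero."""
    if k <= 0:
        return [0.0] * len(vec)
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def compress_with_error_feedback(vec, residual, k):
    """Add the carried-over residual, apply Top-K, and return (message, new_residual)."""
    corrected = [v + r for v, r in zip(vec, residual)]
    message = top_k(corrected, k)
    new_residual = [c - m for c, m in zip(corrected, message)]
    return message, new_residual


# Direct compression drops the small entries; error feedback stores them for the next round.
estimate = [0.1, -3.0, 2.0, 0.5]
message, residual = compress_with_error_feedback(estimate, [0.0] * 4, k=2)
```

With direct compression the dropped entries are simply lost, whereas the residual returned here would be added back before the next compression step.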
Self-organizing Multiagent Target Enclosing under Limited Information and Safety Guarantees
This paper introduces an approach to address the target enclosing problem using non-holonomic multiagent systems, where agents self-organize on the enclosing shape around a fixed target. In our approach, agents independently move toward the desired enclosing geometry when apart and activate the collision avoidance mechanism when a collision is imminent, thereby guaranteeing inter-agent safety. Our approach combines global enclosing behavior and local collision avoidance mechanisms by devising a special potential function and sliding manifold. We rigorously show that an agent does not need to ensure safety with every other agent and put forth a concept of the nearest colliding agent (for any arbitrary agent) with whom ensuring safety is sufficient to avoid collisions in the entire swarm. The proposed control eliminates the need for a fixed or pre-established agent arrangement around the target and requires only relative information between an agent and the target. This makes our design particularly appealing for scenarios with limited global information, hence significantly reducing communication requirements. We finally present simulation results to vindicate the efficacy of the proposed method.
Systems and Control (CS)
Memory-optimised Cubic Splines for High-fidelity Quantum Operations
Radio-frequency pulses are widespread for the control of quantum bits and the execution of operations in quantum computers. The ability to tune key pulse parameters such as time-dependent amplitude, phase, and frequency is essential to achieve maximal gate fidelity and mitigate errors. As systems scale, a larger fraction of the control electronic processing will move closer to the qubits, to enhance integration and minimise latency in operations requiring fast feedback. This will constrain the space available in the memory of the control electronics to load time-resolved pulse parameters at high sampling rates. Cubic spline interpolation is a powerful and widespread technique that divides the pulse into segments of cubic polynomials. We show an optimised implementation of this strategy, using a two-stage curve fitting process and additional symmetry operations to load a high-sampling pulse output on an FPGA. This results in a favourable accuracy versus memory footprint trade-off. By simulating single-qubit population transfer and atom transport on a neutral atom device, we show that we can achieve high fidelities with low memory requirements. This is instrumental for scaling up the number of qubits and gate operations in environments where memory is a limited resource.
A Conflicts-free, Speed-lossless KAN-based Reinforcement Learning Decision System for Interactive Driving in Roundabouts
Safety and efficiency are crucial for autonomous driving in roundabouts, especially in the context of mixed traffic where autonomous vehicles (AVs) and human-driven vehicles coexist. This paper introduces a learning-based algorithm tailored to foster safe and efficient driving behaviors across varying levels of traffic flows in roundabouts. The proposed algorithm employs a deep Q-learning network to effectively learn safe and efficient driving strategies in complex multi-vehicle roundabouts. Additionally, a KAN (Kolmogorov-Arnold network) enhances the AVs' ability to learn their surroundings robustly and precisely. An action inspector is integrated to replace dangerous actions to avoid collisions when the AV interacts with the environment, and a route planner is proposed to enhance the driving efficiency and safety of the AVs. Moreover, a model predictive control is adopted to ensure stability and precision of the driving actions. The results show that our proposed system consistently achieves safe and efficient driving whilst maintaining a stable training process, as evidenced by the smooth convergence of the reward function and the low variance in the training curves across various traffic flows. Compared to state-of-the-art benchmarks, the proposed algorithm achieves a lower number of collisions and reduced travel time to destination.
comment: 15 pages, 12 figures, submitted to an IEEE journal
CatalogBank: A Structured and Interoperable Catalog Dataset with a Semi-Automatic Annotation Tool (DocumentLabeler) for Engineering System Design
In the realm of document engineering and Natural Language Processing (NLP), the integration of digitally born catalogs into product design processes presents a novel avenue for enhancing information extraction and interoperability. This paper introduces CatalogBank, a dataset developed to bridge the gap between textual descriptions and other data modalities related to engineering design catalogs. We utilized existing information extraction methodologies to extract product information from PDF-based catalogs to use in downstream tasks to generate a baseline metric. Our approach not only supports the potential automation of design workflows but also overcomes the limitations of manual data entry and non-standard metadata structures that have historically impeded the seamless integration of textual and other data modalities. Through the use of DocumentLabeler, an open-source annotation tool adapted for our dataset, we demonstrated the potential of CatalogBank in supporting diverse document-based tasks such as layout analysis and knowledge extraction. Our findings suggest that CatalogBank can contribute to document engineering and NLP by providing a robust dataset for training models capable of understanding and processing complex document formats with relatively less effort using the semi-automated annotation tool DocumentLabeler.
comment: 8 pages, 6 figures
Optimizing Highway Ramp Merge Safety and Efficiency via Spatio-Temporal Cooperative Control and Vehicle-Road Coordination
In existing autonomous driving systems, it is difficult to obtain the status and driving intention of other vehicles accurately and in a timely manner. The safety risk and urgency of autonomous vehicles in the absence of collision are evaluated. To ensure safety and improve road efficiency, a method of pre-compiling the spatio-temporal trajectory of vehicles is established to eliminate conflicts between vehicles in advance. The calculation method of the safe distance under spatio-temporal conditions is studied, considering vehicle speed differences, vehicle positioning errors, and clock errors. By combining collision acceleration and urgent acceleration, an evaluation model for vehicle conflict risk is constructed. Mainline vehicles that may have conflicts with on-ramp vehicles are identified, and the target gap for on-ramp vehicles is determined. Finally, a cooperative control method is established based on the selected target gap, preparing the vehicle travel path in advance. Taking highway ramp merge as an example, the mainline priority spatio-temporal cooperative control method is proposed and verified through simulation. Using SUMO and Python co-simulation, mainline traffic volumes of 800 veh*h-1*lane-1
Analytical Model of Modular Upper Limb Rehabilitation
Configurable robots are made up of robotic modules that can be assembled or can configure themselves into multiple robot configurations. In this research plan, a method for upper-body rehabilitation will be discussed in the form of a modular robot with different morphologies. The advantage and superiority of designing an example of a robotic module for upper body rehabilitation is the ability to reset the modular robot system. In this research, a number of modules will be designed and implemented according to the needs of one-hand rehabilitation with different degrees of freedom. The design modules' performance and efficiency will be evaluated by simulating, making samples, and testing them. This article's research includes presenting a modular upper body rehabilitation robot in the wrist, elbow, and shoulder areas, as well as providing a suitable kinematic and dynamic model of the upper body rehabilitation robot to determine human-robot interaction forces and movement. The research also involves analyzing the mathematical model of the upper body rehabilitation robot to identify advanced control strategies that rely on force control and torque control. After reviewing the articles and research of others, we concluded that no one has yet worked on the design of a prototype robotic module for upper body rehabilitation in the specified order. In our pioneering research, we intend to address this important matter.
Communication-robust and Privacy-safe Distributed Estimation for Heterogeneous Community-level Behind-the-meter Solar Power Generation
The rapid growth of behind-the-meter (BTM) solar power generation systems presents challenges for distribution system planning and scheduling due to invisible solar power generation. To address the data leakage problem of centralized machine-learning methods in BTM solar power generation estimation, the federated learning (FL) method has been investigated for its distributed learning capability. However, the conventional FL method has encountered various challenges, including heterogeneity, communication failures, and malicious privacy attacks. To overcome these challenges, this study proposes a communication-robust and privacy-safe distributed estimation method for heterogeneous community-level BTM solar power generation. Specifically, this study adopts multi-task FL as the main structure and learns the common and unique features of all communities. Simultaneously, it embeds an updated parameters estimation method into the multi-task FL, automatically identifies similarities between any two clients, and estimates the updated parameters for unavailable clients to mitigate the negative effects of communication failures. Finally, this study adopts a differential privacy mechanism under the dynamic privacy budget allocation strategy to combat malicious privacy attacks and improve model training efficiency. Case studies show that in the presence of heterogeneity and communication failures, the proposed method exhibits better estimation accuracy and convergence performance as compared with traditional FL and localized learning methods, while providing stronger privacy protection.
Stochastic Real-Time Economic Dispatch for Integrated Electric and Gas Systems Considering Uncertainty Propagation and Pipeline Leakage
Gas-fired units (GFUs) with rapid regulation capabilities are considered an effective tool to mitigate fluctuations in the generation of renewable energy sources and have coupled electricity power systems (EPSs) and natural gas systems (NGSs) more tightly. However, this tight coupling leads to uncertainty propagation, a challenge for the real-time dispatch of such integrated electric and gas systems (IEGSs). Moreover, pipeline leakage failures in the NGS may threaten the electricity supply reliability of the EPS through GFUs. To address these problems, this paper first establishes an operational model considering gas pipeline dynamic characteristics under uncertain leakage failures for the NGS and then presents a stochastic IEGS real-time economic dispatch (RTED) model considering both uncertainty propagation and pipeline leakage uncertainty. To quickly solve this complicated large-scale stochastic optimization problem, a novel notion of the coupling boundary dynamic adjustment region considering pipeline leakage failure (LCBDAR) is proposed to characterize the dynamic characteristics of the NGS boundary connecting GFUs. Based on the LCBDAR, a noniterative decentralized solution is proposed to decompose the original stochastic RTED model into two subproblems that are solved separately by the EPS and NGS operators, thus preserving their data privacy. In particular, only one-time data interaction from the NGS to the EPS is required. Case studies on several IEGSs at different scales demonstrate the effectiveness of the proposed method.
Robust Maneuver Planning With Scalable Prediction Horizons: A Move Blocking Approach
Implementation of Model Predictive Control (MPC) on hardware with limited computational resources remains a challenge. Especially for long-distance maneuvers that require small sampling times, the necessary horizon lengths prevent its application on onboard computers. In this paper, we propose a computationally efficient tube-based shrinking horizon MPC that is scalable to long prediction horizons. Using move blocking, we ensure that a given number of decision inputs is efficiently used throughout the maneuver. Next, a method to substantially reduce the number of constraints is introduced. The approach is demonstrated with a helicopter landing on an inclined platform using a prediction horizon of 300 steps. The constraint reduction decreases the computation time by an order of magnitude with a slight increase in trajectory cost.
comment: Submitted to L-CSS with CDC option
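Move blocking in general can be sketched as a blocking matrix that maps a few free decision inputs onto a long horizon by holding each input constant over a block of steps. The sketch below assumes a scalar input, and the block lengths are hypothetical values chosen only to sum to the 300-step horizon mentioned in the abstract; it does not reproduce the tube-based shrinking horizon scheme or the constraint reduction of the paper.

```python
import numpy as np

def blocking_matrix(block_lengths):
    """Return T such that u = T @ v holds v[j] constant over block j."""
    n_steps = sum(block_lengths)
    T = np.zeros((n_steps, len(block_lengths)))
    row = 0
    for col, length in enumerate(block_lengths):
        T[row:row + length, col] = 1.0
        row += length
    return T

# Hypothetical blocking: 6 decision inputs cover a 300-step horizon,
# with short blocks early (fine control) and long blocks late.
T = blocking_matrix([5, 10, 20, 45, 90, 130])
v = np.arange(6, dtype=float)   # reduced decision vector
u = T @ v                       # full-horizon input sequence
```

The optimizer then works with the 6 entries of `v` instead of 300 entries of `u`, which is where the computational saving of move blocking comes from.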
Meta SAC-Lag: Towards Deployable Safe Reinforcement Learning via MetaGradient-based Hyperparameter Tuning IROS
Safe Reinforcement Learning (Safe RL) is one of the prevalently studied subcategories of trial-and-error-based methods with the intention to be deployed on real-world systems. In safe RL, the goal is to maximize reward performance while minimizing constraints, often achieved by setting bounds on constraint functions and utilizing the Lagrangian method. However, deploying Lagrangian-based safe RL in real-world scenarios is challenging due to the necessity of threshold fine-tuning, as imprecise adjustments may lead to suboptimal policy convergence. To mitigate this challenge, we propose a unified Lagrangian-based model-free architecture called Meta Soft Actor-Critic Lagrangian (Meta SAC-Lag). Meta SAC-Lag uses meta-gradient optimization to automatically update the safety-related hyperparameters. The proposed method is designed to address safe exploration and threshold adjustment with minimal hyperparameter tuning requirement. In our pipeline, the inner parameters are updated through the conventional formulation and the hyperparameters are adjusted using the meta-objectives which are defined based on the updated parameters. Our results show that the agent can reliably adjust the safety performance due to the relatively fast convergence rate of the safety threshold. We evaluate the performance of Meta SAC-Lag in five simulated environments against Lagrangian baselines, and the results demonstrate its capability to create synergy between parameters, yielding better or competitive results. Furthermore, we conduct a real-world experiment involving a robotic arm tasked with pouring coffee into a cup without spillage. Meta SAC-Lag is successfully trained to execute the task, while minimizing effort constraints.
comment: Main text accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024, 10 pages, 4 figures, 3 tables
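For context, the conventional Lagrangian step that such methods build on can be sketched as projected dual ascent on the multiplier: it grows when the measured constraint cost exceeds the threshold and shrinks otherwise. The function name, the learning rate, and the use of an averaged cost are assumptions for illustration; the meta-gradient update of the safety hyperparameters in Meta SAC-Lag is not shown here.

```python
def update_multiplier(lam, avg_cost, threshold, lr=0.01):
    """One projected dual-ascent step on the Lagrange multiplier.

    lam        current multiplier (non-negative)
    avg_cost   estimated constraint cost of the current policy
    threshold  allowed constraint cost (the safety threshold)
    """
    return max(0.0, lam + lr * (avg_cost - threshold))
```

In this generic scheme the threshold is a fixed, hand-tuned number, which is the tuning burden the abstract refers to.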
$\mathcal{H}_2$-optimal Model Reduction of Linear Quadratic Output Systems in Finite Frequency Range
Linear quadratic output systems constitute an important class of dynamical systems with numerous practical applications. When the order of these models is exceptionally high, simulating and analyzing these systems becomes computationally prohibitive. In such instances, model order reduction offers an effective solution by approximating the original high-order system with a reduced-order model while preserving the system's essential characteristics. In frequency-limited model order reduction, the objective is to maintain the frequency response of the original system within a specified frequency range in the reduced-order model. In this paper, a mathematical expression for the frequency-limited $\mathcal{H}_2$ norm is derived, which quantifies the error within the desired frequency interval. Subsequently, the necessary conditions for a local optimum of the frequency-limited $\mathcal{H}_2$ norm of the error are derived. The inherent difficulty in satisfying these conditions within a Petrov-Galerkin projection framework is also discussed. Based on the optimality conditions and Petrov-Galerkin projection, a stationary point iteration algorithm is proposed that enforces three of the four optimality conditions upon convergence. A numerical example is provided to illustrate the algorithm's effectiveness in accurately approximating the original high-order model within the specified frequency interval.
Enhanced Equivalent Circuit Model for High Current Discharge of Lithium-Ion Batteries with Application to Electric Vertical Takeoff and Landing Aircraft
Conventional battery equivalent circuit models (ECMs) have limited capability to predict performance at high discharge rates, where lithium depleted regions may develop and cause a sudden exponential drop in the cell's terminal voltage. Having accurate predictions of performance under such conditions is necessary for electric vertical takeoff and landing (eVTOL) aircraft applications, where high discharge currents can be required during fault scenarios and the inability to provide these currents can be safety-critical. To address this challenge, we utilize data-driven modeling methods to derive a parsimonious addition to a conventional ECM that can capture the observed rapid voltage drop with only one additional state. We also provide a detailed method for identifying the resulting model parameters, including an extensive characterization data set along with a well-regularized objective function formulation. The model is validated against a novel data set of over 150 flights encompassing a wide array of conditions for an eVTOL aircraft using an application-specific and safety-relevant reserve duration metric for quantifying accuracy. The model is shown to predict the landing hover capability with an error mean and standard deviation of 2.9 and 6.2 seconds, respectively, defining the model's ability to capture the cell voltage behavior under high discharge currents.
Stable State Space SubSpace (S$^5$) Identification
State space subspace algorithms for input-output systems have been widely applied and also have a reasonably well-developed asymptotic theory dealing with consistency. However, guaranteeing the stability of the estimated system matrix is a major issue. Existing stability-guaranteed algorithms are computationally expensive, require several tuning parameters, and scale badly to high state dimensions. Here, we develop a new algorithm that is closed-form and requires no tuning parameters. It is thus computationally cheap and scales easily to high state dimensions. We also prove its consistency under reasonable conditions.
The Nah Bandit: Modeling User Non-compliance in Recommendation Systems
Recommendation systems now pervade the digital world, ranging from advertising to entertainment. However, it remains challenging to implement effective recommendation systems in the physical world, such as in mobility or health. This work focuses on a key challenge: in the physical world, it is often easy for the user to opt out of taking any recommendation if it is not to her liking, and to fall back to her baseline behavior. It is thus crucial in cyber-physical recommendation systems to operate with an interaction model that is aware of such user behavior, lest the user abandon the recommendations altogether. This paper thus introduces the Nah Bandit, a tongue-in-cheek reference to describe a Bandit problem where users can say `nah' to the recommendation and opt for their preferred option instead. As such, this problem lies in between a typical bandit setup and supervised learning. We model the user non-compliance by parameterizing an anchoring effect of recommendations on users. We then propose the Expert with Clustering (EWC) algorithm, a hierarchical approach that incorporates feedback from both recommended and non-recommended options to accelerate user preference learning. In a recommendation scenario with $N$ users, $T$ rounds per user, and $K$ clusters, EWC achieves a regret bound of $O(N\sqrt{T\log K} + NT)$, achieving superior theoretical performance in the short term compared to the LinUCB algorithm. Experimental results also highlight that EWC outperforms both supervised learning and traditional contextual bandit approaches. This advancement reveals that effective use of non-compliance feedback can accelerate preference learning and improve recommendation accuracy. This work lays the foundation for future research in Nah Bandit, providing a robust framework for more effective recommendation systems.
comment: 12 pages, 8 figures, under review
On Accelerating Large-Scale Robust Portfolio Optimization
Solving large-scale robust portfolio optimization problems is challenging due to the high computational demands associated with an increasing number of assets, the amount of data considered, and market uncertainty. To address this issue, we propose an extended supporting hyperplane approximation approach for efficiently solving a class of distributionally robust portfolio problems for a general class of additively separable utility functions and polyhedral ambiguity distribution set, applied to a large-scale set of assets. Our technique is validated using a large-scale portfolio of the S&P 500 index constituents, demonstrating robust out-of-sample trading performance. More importantly, our empirical studies show that this approach significantly reduces computational time compared to traditional concave Expected Log-Growth (ELG) optimization, with running times decreasing from several thousand seconds to just a few. This method provides a scalable and practical solution to large-scale robust portfolio optimization, addressing both theoretical and practical challenges.
comment: Submitted for possible publication
Certifiable Deep Learning for Reachability Using a New Lipschitz Continuous Value Function
We propose a new reachability learning framework for high-dimensional nonlinear systems, focusing on reach-avoid problems. These problems require computing the reach-avoid set, which ensures that all its elements can safely reach a target set despite any disturbance within pre-specified bounds. Our framework has two main parts: offline learning of a newly designed reach-avoid value function and post-learning certification. Compared to prior works, our new value function is Lipschitz continuous and its associated Bellman operator is a contraction mapping, both of which improve the learning performance. To ensure deterministic guarantees of our learned reach-avoid set, we introduce two efficient post-learning certification methods. Both methods can be used online for real-time local certification or offline for comprehensive certification. We validate our framework in a 12-dimensional crazyflie drone racing hardware experiment and a simulated 10-dimensional highway takeover example.
comment: Submitted, under review
A semi-centralized multi-agent RL framework for efficient irrigation scheduling
This paper proposes a Semi-Centralized Multi-Agent Reinforcement Learning (SCMARL) approach for irrigation scheduling in spatially variable agricultural fields, where management zones address spatial variability. The SCMARL framework is hierarchical in nature, with a centralized coordinator agent at the top level and decentralized local agents at the second level. The coordinator agent makes daily binary irrigation decisions based on field-wide conditions, which are communicated to the local agents. Local agents determine appropriate irrigation amounts for specific management zones using local conditions. The framework employs a state augmentation approach to handle non-stationarity in the local agents' environments. An extensive evaluation on a large-scale field in Lethbridge, Canada, compares the SCMARL approach with a learning-based multi-agent model predictive control scheduling approach, highlighting its enhanced performance, resulting in water conservation and improved Irrigation Water Use Efficiency (IWUE). Notably, the proposed approach achieved a 4.0% savings in irrigation water while enhancing the IWUE by 6.3%.
Timing Analysis and Priority-driven Enhancements of ROS 2 Multi-threaded Executors
The second generation of Robotic Operating System, ROS 2, has gained much attention for its potential to be used for safety-critical robotic applications. The need to provide a solid foundation for timing correctness and scheduling mechanisms is therefore growing rapidly. Although there are some pioneering studies conducted on formally analyzing the response time of processing chains in ROS 2, the focus has been limited to single-threaded executors, and multi-threaded executors, despite their advantages, have not been studied well. To fill this knowledge gap, in this paper, we propose a comprehensive response-time analysis framework for chains running on ROS 2 multi-threaded executors. We first analyze the timing behavior of the default scheduling scheme in ROS 2 multi-threaded executors, and then present priority-driven scheduling enhancements to address the limitations of the default scheme. Our framework can analyze chains with both arbitrary and constrained deadlines and also the effect of mutually-exclusive callback groups. Evaluation is conducted by a case study on NVIDIA Jetson AGX Xavier and schedulability experiments using randomly-generated chains. The results demonstrate that our analysis framework can safely upper-bound response times under various conditions and the priority-driven scheduling enhancements not only reduce the response time of critical chains but also improve analytical bounds.
Physics-Guided Reinforcement Learning System for Realistic Vehicle Active Suspension Control
The suspension system is a crucial part of the automotive chassis, improving vehicle ride comfort and isolating passengers from rough road excitation. Unlike passive suspension, which has constant spring and damping coefficients, active suspension incorporates electronic actuators into the system to dynamically control stiffness and damping variables. However, effectively controlling the suspension system poses a challenging task that necessitates real-time adaptability to various road conditions. This paper presents the Physics-Guided Deep Reinforcement Learning (DRL) for adjusting an active suspension system's variable kinematics and compliance properties for a quarter-car model in real time. Specifically, the outputs of the model are defined as actuator stiffness and damping control, which are bound within physically realistic ranges to maintain the system's physical compliance. The proposed model was trained on stochastic road profiles according to ISO 8608 standards to optimize the actuator's control policy. According to qualitative results on simulations, the vehicle body reacts smoothly to various novel real-world road conditions, having a much lower degree of oscillation. These observations mean a higher level of passenger comfort and better vehicle stability. Quantitatively, DRL outperforms passive systems in reducing the average vehicle body velocity and acceleration by 43.58% and 17.22%, respectively, minimizing the vertical movement impacts on the passengers. The code is publicly available at github.com/anh-nn01/RL4Suspension-ICMLA23.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
An Efficient and Explainable Transformer-Based Few-Shot Learning for Modeling Electricity Consumption Profiles Across Thousands of Domains
Electricity Consumption Profiles (ECPs) are crucial for operating and planning power distribution systems, especially with the increasing numbers of various low-carbon technologies such as solar panels and electric vehicles. Traditional ECP modeling methods typically assume the availability of sufficient ECP data. However, in practice, the accessibility of ECP data is limited due to privacy issues or the absence of metering devices. Few-shot learning (FSL) has emerged as a promising solution for ECP modeling in data-scarce scenarios. Nevertheless, standard FSL methods, such as those used for images, are unsuitable for ECP modeling because (1) these methods usually assume several source domains with sufficient data and several target domains. However, in the context of ECP modeling, there may be thousands of source domains with a moderate amount of data and thousands of target domains. (2) Standard FSL methods usually involve cumbersome knowledge transfer mechanisms, such as pre-training and fine-tuning, whereas ECP modeling requires more lightweight methods. (3) Deep learning models often lack explainability, hindering their application in industry. This paper proposes a novel FSL method that exploits Transformers and Gaussian Mixture Models (GMMs) for ECP modeling to address the above-described issues. Results show that our method can accurately restore the complex ECP distribution with a minimal amount of ECP data (e.g., only 1.6% of the complete domain dataset) while it outperforms state-of-the-art time series modeling methods, maintaining the advantages of being both lightweight and interpretable. The project is open-sourced at https://github.com/xiaweijie1996/TransformerEM-GMM.git.
Stabilization of Nonlinear Systems through Control Barrier Functions
This paper proposes a control design approach for stabilizing nonlinear control systems. Our key observation is that the set of points where the decrease condition of a control Lyapunov function (CLF) is feasible can be regarded as a safe set. By leveraging a nonsmooth version of control barrier functions (CBFs) and a weaker notion of CLF, we develop a control design that forces the system to converge to and remain in the region where the CLF decrease condition is feasible. We characterize the conditions under which our controller asymptotically stabilizes the origin or a small neighborhood around it, even in the cases where it is discontinuous. We illustrate our design in various examples.
An Implicit Function Method for Computing the Stability Boundaries of Hill's Equation
Hill's equation is a common model of a time-periodic system that can undergo parametric resonance for certain choices of system parameters. For most kinds of parametric forcing, stable regions in its two-dimensional parameter space need to be identified numerically, typically by applying a matrix trace criterion. By integrating ODEs derived from the stability criterion, we present an alternative, more accurate and computationally efficient numerical method for determining the stability boundaries of Hill's equation in parameter space. This method works similarly to determine stability boundaries for the closely related problem of vibrational stabilization of the linearized Kapitza pendulum. Additionally, we derive a stability criterion for the damped Hill's equation in terms of a matrix trace criterion on an equivalent undamped system. In doing so we generalize the method of this paper to compute stability boundaries for parametric resonance in the presence of damping.
comment: 11 pages, 9 figures
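The matrix trace criterion that the abstract describes as the typical numerical approach can be sketched as follows. The sketch assumes the undamped form $\ddot{x} + (\delta + \epsilon f(t))\,x = 0$ with $f$ of period $T$, integrates the fundamental matrix over one period with a fixed-step RK4 scheme, and classifies a parameter pair as stable when the monodromy matrix satisfies $|\mathrm{tr}\,M| < 2$. The parameter names `delta` and `eps`, the cosine forcing, and the step count are assumptions; this is the baseline criterion, not the implicit function method proposed in the paper.

```python
import numpy as np

def monodromy_trace(delta, eps, forcing=np.cos, period=2 * np.pi, n_steps=2000):
    """Trace of the monodromy matrix of x'' + (delta + eps*forcing(t)) x = 0."""
    def rhs(t, Phi):
        # First-order form: d/dt [x, x'] = A(t) [x, x']
        A = np.array([[0.0, 1.0], [-(delta + eps * forcing(t)), 0.0]])
        return A @ Phi

    Phi = np.eye(2)            # fundamental matrix, Phi(0) = I
    h = period / n_steps
    for i in range(n_steps):   # classical fixed-step RK4
        t = i * h
        k1 = rhs(t, Phi)
        k2 = rhs(t + h / 2, Phi + h / 2 * k1)
        k3 = rhs(t + h / 2, Phi + h / 2 * k2)
        k4 = rhs(t + h, Phi + h * k3)
        Phi = Phi + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return float(np.trace(Phi))

def is_stable(delta, eps):
    """Trace criterion: bounded solutions when |tr M| < 2."""
    return abs(monodromy_trace(delta, eps)) < 2.0
```

Mapping a stability chart this way requires one full-period integration per grid point in the parameter plane, which is the cost that a boundary-tracing method aims to avoid.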
Spin Hall Nano-Antenna
The spin Hall effect is a celebrated phenomenon in spintronics and magnetism that has found numerous applications in digital electronics (memory and logic), but very few in analog electronics. Practically, the only analog application in widespread use is the spin Hall nano-oscillator (SHNO) that delivers a high frequency alternating current or voltage to a load. Here, we report its analogue - a spin Hall nano-antenna (SHNA) that radiates a high frequency electromagnetic wave (alternating electric/magnetic fields) into the surrounding medium. It can also radiate an acoustic wave in an underlying substrate if the nanomagnets are made of a magnetostrictive material. That makes it a dual electromagnetic/acoustic antenna. The SHNA is made of an array of ledged magnetostrictive nanomagnets deposited on a substrate, with a heavy metal nanostrip underlying/overlying the ledges. An alternating charge current passed through the nanostrip generates an alternating spin-orbit torque in the nanomagnets via the spin Hall effect which makes their magnetizations oscillate in time with the frequency of the current, producing confined spin waves (magnons), which radiate electromagnetic waves (photons) in space with the same frequency as the ac current. Despite being much smaller than the radiated wavelength, the SHNA surprisingly does not act as a point source which would radiate isotropically. Instead, there is clear directionality (anisotropy) in the radiation pattern, which is frequency-dependent. This is due to the (frequency-dependent) intrinsic anisotropy in the confined spin wave patterns generated within the nanomagnets, which effectively endows the "point source" with internal anisotropy.
A review of the calculation methods of optimal power flow in integrated energy systems
The analysis of Integrated Energy Systems (IES) is crucial for enhancing the comprehensive and complementary utilization of clean energy across China, significantly impacting the effective planning, operational coordination, and security control of the IES network. This paper presents a systematic review of the current research on optimal power flow (OPF) within IES, addressing the spatiotemporal interrelationships and coupled co-supply among primary energy processes such as electricity, gas, and heat (cooling). It highlights the challenges and future directions in this field, underscoring the lack of comprehensive studies on coupled power flow modeling for electricity-heat-gas systems and the need for more robust modeling approaches that align with practical engineering applications. Furthermore, the paper discusses the potential of multi-target energy storage systems to enhance energy consumption efficiency and flexibility in energy resource management. The existing models and algorithms for target power flow and optimal power flow are critiqued for their lack of flexibility and comprehensiveness, particularly in handling multi-objective flows and ensuring system safety and reliability. The study emphasizes the necessity for further development of safety and reliability assessment frameworks to support the evolving demands of integrated energy systems.
comment: 10 pages, in Chinese language
Startup Control Optimization of He-Xe Cooled Space Nuclear Reactors Using a System Analysis Program
In recent years, achieving autonomous control in nuclear reactor operations has become pivotal for the effectiveness of Space Nuclear Power Systems (SNPS). However, compared to power control, the startup control of SNPS remains underexplored. This study introduces a multi-objective optimization framework aimed at enhancing startup control, leveraging a system-level analysis program to simulate the system's dynamic behavior accurately. The primary contribution of this work is the development and implementation of an optimization framework that significantly reduces startup time and improves control efficiency. The system analysis tool combines a non-ideal gas model and a multi-channel core model with the Monte Carlo code RMC, which is employed to calculate temperature reactivity coefficients and neutron kinetics parameters, to ensure precise thermal-dynamic simulations. After the system dynamics are examined through reactivity insertion accidents, the optimization algorithm fine-tunes the control sequences for external reactivity insertion, TAC system shaft speed, and cooling system background temperature. The optimized control strategy achieves the threshold power 1260 seconds earlier and the turbine inlet temperature 1980 seconds sooner than baseline methods. The findings highlight the potential of the proposed optimization framework to enhance the autonomy and operational efficiency of future SNPS designs.
Quality-Aware Hydraulic Control in Drinking Water Networks via Controllability Proxies
The operation of water distribution networks is a complex procedure aimed at efficiently supplying consumers with adequate water quantity while ensuring its safe quality. An added challenge is the dependency of the water quality dynamics on the system's hydraulics, which influences the performance of the water quality controller. Prior research has addressed either solving the optimum operational hydraulic setting problem or regulating the water quality dynamics as separate problems. Additionally, there have been efforts to couple these two problems and solve one compact problem, resulting in trade-offs between the contradictory objectives. In contrast, this paper takes a novel approach by examining the water quality dependency on the hydraulics from a control-theoretic standpoint. More specifically, we explore the influence of accounting for improved water quality controllability when addressing the pump scheduling problem. We examine its effects on the cumulative cost of the interconnected systems as well as the subsequent performance of the water quality controller. To achieve this, we develop a framework that incorporates different controllability metrics within the operational hydraulic optimization problem; its aim is to attain an adequate level of water quality control across the system. We assess these aspects on networks of various scales under a wide range of numerical scenarios.
Low-Complexity Control for a Class of Uncertain MIMO Nonlinear Systems under Generalized Time-Varying Output Constraints
This paper introduces a novel control framework to address the satisfaction of multiple time-varying output constraints in uncertain high-order MIMO nonlinear control systems. Unlike existing methods, which often assume that the constraints are always decoupled and feasible, our approach can handle coupled time-varying constraints even in the presence of potential infeasibilities. First, it is shown that satisfying multiple constraints essentially boils down to ensuring the positivity of a scalar variable, representing the signed distance from the boundary of the time-varying output-constrained set. To achieve this, a single consolidating constraint is designed that, when satisfied, guarantees convergence to and invariance of the time-varying output-constrained set within a user-defined finite time. Next, a novel robust and low-complexity feedback controller is proposed to ensure the satisfaction of the consolidating constraint. Additionally, we provide a mechanism for online modification of the consolidating constraint to find a least violating solution when the constraints become mutually infeasible for some time. Finally, simulation examples of trajectory and region tracking for a mobile robot validate the proposed approach.
comment: extended version, 20 pages, 7 figures
In-Field Gyroscope Autocalibration with Iterative Attitude Estimation
This paper presents an efficient in-field calibration method tailored for low-cost triaxial MEMS gyroscopes often used in healthcare applications. Traditional calibration techniques are challenging to implement in clinical settings due to the unavailability of high-precision equipment. Unlike the auto-calibration approaches used for triaxial MEMS accelerometers, which rely on local gravity, gyroscopes lack a reliable reference since the Earth's self-rotation speed is insufficient for accurate calibration. To address this limitation, we propose a novel method that uses manual rotation of the MEMS gyroscope to a specific angle ($360^\circ$) as the calibration reference. This approach iteratively estimates the sensor's attitude without requiring any external equipment. Numerical simulations and empirical tests validate that the calibration error is low and that parameter estimation is unbiased. The method can be implemented in real-time on a low-energy microcontroller and completed in under 30 seconds. Comparative results demonstrate that the proposed technique outperforms existing state-of-the-art methods, achieving scale factor and bias errors of less than $2.5\times10^{-2}$ for LSM9DS1 and less than $1\times10^{-2}$ for ICM20948.
Channel Characterization of IRS-assisted Resonant Beam Communication Systems
To meet the growing demand for data traffic, spectrum-rich optical wireless communication (OWC) has emerged as a key technological driver for the development of 6G. The resonant beam communication (RBC) system, which employs spatially separated laser cavities as the transmitter and receiver, is a high-speed OWC technology capable of self-alignment without tracking. However, its transmission through the air is susceptible to losses caused by obstructions. In this paper, we propose an intelligent reflecting surface (IRS) assisted RBC system with the optical frequency doubling method, where the resonant beam, at both the fundamental and the doubled frequency, is transmitted through both direct line-of-sight (LoS) and IRS-assisted channels to maintain steady-state oscillation and enable communication without echo interference, respectively. Then, we establish the channel model based on Fresnel diffraction theory under near-field optical propagation to analyze the transmission loss and frequency-doubled power analytically. Furthermore, communication power can be maximized by dynamically controlling the beam-splitting ratio between the two channels according to the loss levels encountered over the air. Numerical results validate that the IRS-assisted channel can compensate for the losses in the obstructed LoS channel and misaligned receivers, ensuring that communication performance reaches an optimal value with dynamic ratio adjustments.
Learning Tube-Certified Neural Robust Contraction Metrics
Control design for general nonlinear robotic systems with guaranteed stability and/or safety in the presence of model uncertainties is a challenging problem. Recent efforts attempt to learn a controller and a certificate (e.g., a Lyapunov function or a contraction metric) jointly using neural networks (NNs), in which model uncertainties are generally ignored during the learning process. In this paper, for nonlinear systems subject to bounded disturbances, we present a framework for jointly learning a robust nonlinear controller and a contraction metric using a novel disturbance rejection objective that certifies a tube bound using NNs for user-specified variables (e.g. control inputs). The learned controller aims to minimize the effect of disturbances on the actual trajectories of state and/or input variables from their nominal counterparts while providing certificate tubes around nominal trajectories that are guaranteed to contain actual trajectories in the presence of disturbances. Experimental results demonstrate that our framework can generate tighter (smaller) tubes and a controller that is computationally efficient to implement.
Decentralized and Uncoordinated Learning of Stable Matchings: A Game-Theoretic Approach
We consider the problem of learning stable matchings with unknown preferences in a decentralized and uncoordinated manner, where "decentralized" means that players make decisions individually without the influence of a central platform, and "uncoordinated" means that players do not need to synchronize their decisions using pre-specified rules. First, we provide a game formulation for this problem with known preferences, where the set of pure Nash equilibria (NE) coincides with the set of stable matchings, and mixed NE can be rounded to a stable matching. Then, we show that for hierarchical markets, applying the exponential weight (EXP) learning algorithm to the stable matching game achieves logarithmic regret in a fully decentralized and uncoordinated fashion. Moreover, we show that EXP converges locally and exponentially fast to a stable matching in general markets. We also introduce another decentralized and uncoordinated learning algorithm that globally converges to a stable matching with arbitrarily high probability. Finally, we provide stronger feedback conditions under which it is possible to drive the market faster toward an approximate stable matching. Our proposed game-theoretic framework bridges the discrete problem of learning stable matchings with the problem of learning NE in continuous-action games.
Efficient Data-Driven MPC for Demand Response of Commercial Buildings
Model predictive control (MPC) has been shown to significantly improve the energy efficiency of buildings while maintaining thermal comfort. Data-driven approaches based on neural networks have been proposed to facilitate system modelling. However, such approaches are generally nonconvex and result in computationally intractable optimization problems. In this work, we design a readily implementable energy management method for small commercial buildings. We then leverage our approach to formulate a real-time demand bidding strategy. We propose a data-driven and mixed-integer convex MPC which is solved via derivative-free optimization given a limited computational time of 5 minutes to respect operational constraints. We consider rooftop unit heating, ventilation, and air conditioning systems with discrete controls to accurately model the operation of most commercial buildings. Our approach uses an input convex recurrent neural network to model the thermal dynamics. We apply our approach in several demand response (DR) settings, including a demand bidding, a time-of-use, and a critical peak rebate program. Controller performance is evaluated on a state-of-the-art building simulation. The proposed approach improves thermal comfort while reducing energy consumption and cost through DR participation, when compared to other data-driven approaches or a set-point controller.
Collaborative Estimation of Real Valued Function by Two Agents and a Fusion Center with Knowledge Exchange
We consider a collaborative iterative algorithm with two agents and a fusion center for estimation of a real-valued function (or "model") on the set of real numbers. While the data collected by the agents is private, in every iteration of the algorithm, the models estimated by the agents are uploaded to the fusion center, fused, and subsequently downloaded by the agents. We consider the estimation spaces at the agents and the fusion center to be Reproducing Kernel Hilbert Spaces (RKHS). Under suitable assumptions on these spaces, we prove that the algorithm is consistent, i.e., there exists a subsequence of the estimated models which converges to a model in the strong topology. To this end, we constructively define estimation operators for the agents, the fusion center, and every iteration of the algorithm. We define valid input data sequences, study the asymptotic properties of the norm of the estimation operators, and find sufficient conditions under which the estimation operator up to any iteration is uniformly bounded. Using these results, we prove the existence of an estimation operator for the algorithm which implies the consistency of the considered estimation algorithm.
comment: arXiv admin note: text overlap with arXiv:2401.03012
Battlefield transfers in coalitional Blotto games
In competitive resource allocation environments, agents often choose to form alliances; however, for some agents, doing so may not always be beneficial. Is there a method of forming alliances that always rewards each of their members? We study this question using the framework of the coalitional Blotto game, in which two players compete against a common adversary by allocating their budgeted resources across disjoint sets of valued battlefields. On any given battlefield, the agent that allocates a greater amount of resources wins the corresponding battlefield value. Existing work has shown the surprising result that in certain game instances, if one player donates a portion of their budget to the other player, then both players win larger amounts in their separate competitions against the adversary. However, this transfer-based method of alliance formation is not always mutually beneficial, which motivates the search for alternate strategies. In this vein, we study a new method of alliance formation referred to as a joint transfer, whereby players publicly transfer battlefields and budgets between one another before they engage in their separate competitions against the adversary. We show that in almost all game instances, there exists a mutually beneficial joint transfer that strictly increases the payoff of each player.
Self-organizing Multiagent Target Enclosing under Limited Information and Safety Guarantees
This paper introduces an approach to address the target enclosing problem using non-holonomic multiagent systems, where agents self-organize on the enclosing shape around a fixed target. In our approach, agents independently move toward the desired enclosing geometry when apart and activate the collision avoidance mechanism when a collision is imminent, thereby guaranteeing inter-agent safety. Our approach combines global enclosing behavior and local collision avoidance mechanisms by devising a special potential function and sliding manifold. We rigorously show that an agent does not need to ensure safety with every other agent and put forth a concept of the nearest colliding agent (for any arbitrary agent) with whom ensuring safety is sufficient to avoid collisions in the entire swarm. The proposed control eliminates the need for a fixed or pre-established agent arrangement around the target and requires only relative information between an agent and the target. This makes our design particularly appealing for scenarios with limited global information, hence significantly reducing communication requirements. We finally present simulation results to vindicate the efficacy of the proposed method.
Enhancing Reinforcement Learning in Sensor Fusion: A Comparative Analysis of Cubature and Sampling-based Integration Methods for Rover Search Planning IROS 2024
This study investigates the computational speed and accuracy of two numerical integration methods, cubature and sampling-based, for integrating an integrand over a 2D polygon. Using a group of rovers searching the Martian surface with a limited sensor footprint as a test bed, the relative error and computational time are compared as the area was subdivided to improve accuracy in the sampling-based approach. The results show that the sampling-based approach exhibits a $14.75\%$ deviation in relative error compared to cubature when it matches the computational performance at $100\%$. Furthermore, achieving a relative error below $1\%$ necessitates a $10000\%$ increase in relative time to calculate due to the $\mathcal{O}(N^2)$ complexity of the sampling-based method. It is concluded that for enhancing reinforcement learning capabilities and other high iteration algorithms, the cubature method is preferred over the sampling-based method.
comment: Submitted to IROS 2024
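The trade-off described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the unit-square domain, the Gaussian-shaped integrand, the tensor-product Gauss-Legendre rule standing in for "cubature", and the midpoint grid standing in for the "sampling-based" method are all assumptions chosen for illustration. The function names are hypothetical.

```python
import math
import numpy as np

def integrand(x, y):
    # Assumed smooth test integrand (not taken from the paper).
    return np.exp(-(x**2 + y**2))

def cubature_square(f, a, b, order=8):
    # Tensor-product Gauss-Legendre rule on the square [a, b] x [a, b].
    nodes, weights = np.polynomial.legendre.leggauss(order)
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    pts = mid + half * nodes
    X, Y = np.meshgrid(pts, pts)
    W = np.outer(weights, weights) * half**2
    return float(np.sum(W * f(X, Y)))

def grid_sampling_square(f, a, b, n):
    # Midpoint sampling on an n x n subdivision: cost grows as n^2.
    h = (b - a) / n
    centers = a + h * (np.arange(n) + 0.5)
    X, Y = np.meshgrid(centers, centers)
    return float(np.sum(f(X, Y)) * h**2)

# Closed-form reference value for this particular integrand on [0, 1]^2.
exact = (math.sqrt(math.pi) / 2.0 * math.erf(1.0)) ** 2

cub_err = abs(cubature_square(integrand, 0.0, 1.0) - exact)
grid_err_coarse = abs(grid_sampling_square(integrand, 0.0, 1.0, 10) - exact)
grid_err_fine = abs(grid_sampling_square(integrand, 0.0, 1.0, 40) - exact)
print(cub_err, grid_err_coarse, grid_err_fine)
```

In this sketch the cubature rule uses 64 function evaluations, while the sampling grid needs 100 and then 1600 evaluations as the subdivision is refined, which mirrors the quadratic growth in cost mentioned in the abstract.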
Output-feedback Synthesis Orbit Geometry: Quotient Manifolds and LQG Direct Policy Optimization
We consider direct policy optimization for the linear-quadratic Gaussian (LQG) setting. Over the past few years, it has been recognized that the landscape of dynamic output-feedback controllers of relevance to LQG has an intricate geometry, particularly pertaining to the existence of degenerate stationary points, that hinders gradient methods. In order to address these challenges, in this paper, we adopt a system-theoretic coordinate-invariant Riemannian metric for the space of dynamic output-feedback controllers and develop a Riemannian gradient descent for direct LQG policy optimization. We then proceed to prove that the orbit space of such controllers, modulo the coordinate transformation, admits a Riemannian quotient manifold structure. This geometric structure--that is of independent interest--provides an effective approach to derive direct policy optimization algorithms for LQG with a local linear rate convergence guarantee. Subsequently, we show that the proposed approach exhibits significantly faster and more robust numerical performance as compared with ordinary gradient descent.
Systems and Control (EESS)
Memory-optimised Cubic Splines for High-fidelity Quantum Operations
Radio-frequency pulses are widespread for the control of quantum bits and the execution of operations in quantum computers. The ability to tune key pulse parameters such as time-dependent amplitude, phase, and frequency is essential to achieve maximal gate fidelity and mitigate errors. As systems scale, a larger fraction of the control electronic processing will move closer to the qubits, to enhance integration and minimise latency in operations requiring fast feedback. This will constrain the space available in the memory of the control electronics to load time-resolved pulse parameters at high sampling rates. Cubic spline interpolation is a powerful and widespread technique that divides the pulse into segments of cubic polynomials. We show an optimised implementation of this strategy, using a two-stage curve fitting process and additional symmetry operations to load a high-sampling pulse output on an FPGA. This results in a favourable accuracy versus memory footprint trade-off. By simulating single-qubit population transfer and atom transport on a neutral atom device, we show that we can achieve high fidelities with low memory requirements. This is instrumental for scaling up the number of qubits and gate operations in environments where memory is a limited resource.
A Conflicts-free, Speed-lossless KAN-based Reinforcement Learning Decision System for Interactive Driving in Roundabouts
Safety and efficiency are crucial for autonomous driving in roundabouts, especially in the context of mixed traffic where autonomous vehicles (AVs) and human-driven vehicles coexist. This paper introduces a learning-based algorithm tailored to foster safe and efficient driving behaviors across varying levels of traffic flows in roundabouts. The proposed algorithm employs a deep Q-learning network to effectively learn safe and efficient driving strategies in complex multi-vehicle roundabouts. Additionally, a KAN (Kolmogorov-Arnold network) enhances the AVs' ability to learn their surroundings robustly and precisely. An action inspector is integrated to replace dangerous actions to avoid collisions when the AV interacts with the environment, and a route planner is proposed to enhance the driving efficiency and safety of the AVs. Moreover, a model predictive control is adopted to ensure stability and precision of the driving actions. The results show that our proposed system consistently achieves safe and efficient driving whilst maintaining a stable training process, as evidenced by the smooth convergence of the reward function and the low variance in the training curves across various traffic flows. Compared to state-of-the-art benchmarks, the proposed algorithm achieves a lower number of collisions and reduced travel time to destination.
comment: 15 pages, 12 figures, submitted to an IEEE journal
CatalogBank: A Structured and Interoperable Catalog Dataset with a Semi-Automatic Annotation Tool (DocumentLabeler) for Engineering System Design
In the realm of document engineering and Natural Language Processing (NLP), the integration of digitally born catalogs into product design processes presents a novel avenue for enhancing information extraction and interoperability. This paper introduces CatalogBank, a dataset developed to bridge the gap between textual descriptions and other data modalities related to engineering design catalogs. We utilized existing information extraction methodologies to extract product information from PDF-based catalogs to use in downstream tasks to generate a baseline metric. Our approach not only supports the potential automation of design workflows but also overcomes the limitations of manual data entry and non-standard metadata structures that have historically impeded the seamless integration of textual and other data modalities. Through the use of DocumentLabeler, an open-source annotation tool adapted for our dataset, we demonstrated the potential of CatalogBank in supporting diverse document-based tasks such as layout analysis and knowledge extraction. Our findings suggest that CatalogBank can contribute to document engineering and NLP by providing a robust dataset for training models capable of understanding and processing complex document formats with relatively less effort using the semi-automated annotation tool DocumentLabeler.
comment: 8 pages, 6 figures
Optimizing Highway Ramp Merge Safety and Efficiency via Spatio-Temporal Cooperative Control and Vehicle-Road Coordination
With existing autonomous driving technology, it is difficult to obtain the status and driving intentions of other vehicles accurately and in a timely manner. The safety risk and urgency of autonomous vehicles in the absence of collision are evaluated. To ensure safety and improve road efficiency, a method of pre-compiling the spatio-temporal trajectory of vehicles is established to eliminate conflicts between vehicles in advance. The calculation method of the safe distance under spatio-temporal conditions is studied, considering vehicle speed differences, vehicle positioning errors, and clock errors. By combining collision acceleration and urgent acceleration, an evaluation model for vehicle conflict risk is constructed. Mainline vehicles that may have conflicts with on-ramp vehicles are identified, and the target gap for on-ramp vehicles is determined. Finally, a cooperative control method is established based on the selected target gap, preparing the vehicle travel path in advance. Taking highway ramp merge as an example, the mainline priority spatio-temporal cooperative control method is proposed and verified through simulation. Using SUMO and Python co-simulation, mainline traffic volumes of 800 veh*h-1*lane-1
Analytical Model of Modular Upper Limb Rehabilitation
Configurable robots are made up of robotic modules that can be assembled or can configure themselves into multiple robot configurations. In this research plan, a method for upper-body rehabilitation will be discussed in the form of a modular robot with different morphologies. The advantage and superiority of designing an example of a robotic module for upper body rehabilitation is the ability to reset the modular robot system. In this research, a number of modules will be designed and implemented according to the needs of one-hand rehabilitation with different degrees of freedom. The design modules' performance and efficiency will be evaluated by simulating, making samples, and testing them. This article's research includes presenting a modular upper body rehabilitation robot in the wrist, elbow, and shoulder areas, as well as providing a suitable kinematic and dynamic model of the upper body rehabilitation robot to determine human-robot interaction forces and movement. The research also involves analyzing the mathematical model of the upper body rehabilitation robot to identify advanced control strategies that rely on force control and torque control. After reviewing the articles and research of others, we concluded that no one has yet worked on the design of a prototype robotic module for upper body rehabilitation in the specified order. In our pioneering research, we intend to address this important matter.
Communication-robust and Privacy-safe Distributed Estimation for Heterogeneous Community-level Behind-the-meter Solar Power Generation
The rapid growth of behind-the-meter (BTM) solar power generation systems presents challenges for distribution system planning and scheduling due to invisible solar power generation. To address the data leakage problem of centralized machine-learning methods in BTM solar power generation estimation, the federated learning (FL) method has been investigated for its distributed learning capability. However, the conventional FL method has encountered various challenges, including heterogeneity, communication failures, and malicious privacy attacks. To overcome these challenges, this study proposes a communication-robust and privacy-safe distributed estimation method for heterogeneous community-level BTM solar power generation. Specifically, this study adopts multi-task FL as the main structure and learns the common and unique features of all communities. Simultaneously, it embeds an updated parameters estimation method into the multi-task FL, automatically identifies similarities between any two clients, and estimates the updated parameters for unavailable clients to mitigate the negative effects of communication failures. Finally, this study adopts a differential privacy mechanism under the dynamic privacy budget allocation strategy to combat malicious privacy attacks and improve model training efficiency. Case studies show that in the presence of heterogeneity and communication failures, the proposed method exhibits better estimation accuracy and convergence performance as compared with traditional FL and localized learning methods, while providing stronger privacy protection.
Stochastic Real-Time Economic Dispatch for Integrated Electric and Gas Systems Considering Uncertainty Propagation and Pipeline Leakage
Gas-fired units (GFUs) with rapid regulation capabilities are considered an effective tool to mitigate fluctuations in the generation of renewable energy sources and have coupled electricity power systems (EPSs) and natural gas systems (NGSs) more tightly. However, this tight coupling leads to uncertainty propagation, a challenge for the real-time dispatch of such integrated electric and gas systems (IEGSs). Moreover, pipeline leakage failures in the NGS may threaten the electricity supply reliability of the EPS through GFUs. To address these problems, this paper first establishes an operational model considering gas pipeline dynamic characteristics under uncertain leakage failures for the NGS and then presents a stochastic IEGS real-time economic dispatch (RTED) model considering both uncertainty propagation and pipeline leakage uncertainty. To quickly solve this complicated large-scale stochastic optimization problem, a novel notion of the coupling boundary dynamic adjustment region considering pipeline leakage failure (LCBDAR) is proposed to characterize the dynamic characteristics of the NGS boundary connecting GFUs. Based on the LCBDAR, a noniterative decentralized solution is proposed to decompose the original stochastic RTED model into two subproblems that are solved separately by the EPS and NGS operators, thus preserving their data privacy. In particular, only one-time data interaction from the NGS to the EPS is required. Case studies on several IEGSs at different scales demonstrate the effectiveness of the proposed method.
Robust Maneuver Planning With Scalable Prediction Horizons: A Move Blocking Approach
Implementation of Model Predictive Control (MPC) on hardware with limited computational resources remains a challenge. Especially for long-distance maneuvers that require small sampling times, the necessary horizon lengths prevent its application on onboard computers. In this paper, we propose a computationally efficient tube-based shrinking horizon MPC that is scalable to long prediction horizons. Using move blocking, we ensure that a given number of decision inputs is efficiently used throughout the maneuver. Next, a method to substantially reduce the number of constraints is introduced. The approach is demonstrated with a helicopter landing on an inclined platform using a prediction horizon of 300 steps. The constraint reduction decreases the computation time by an order of magnitude with a slight increase in trajectory cost.
comment: Submitted to L-CSS with CDC option
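Move blocking, as mentioned in this abstract, is commonly expressed through a blocking matrix that holds each decision input constant over several consecutive steps of the horizon. The sketch below shows that generic construction for a scalar input. The block lengths and the name `blocking_matrix` are illustrative assumptions, not the blocking scheme used in the paper.

```python
import numpy as np

def blocking_matrix(block_lengths):
    # Row k selects the decision input that is applied at horizon step k.
    horizon = sum(block_lengths)
    M = np.zeros((horizon, len(block_lengths)))
    start = 0
    for j, length in enumerate(block_lengths):
        M[start:start + length, j] = 1.0
        start += length
    return M

# Assumed example: a 16-step horizon covered by only 5 decision inputs.
blocks = [1, 1, 2, 4, 8]
M = blocking_matrix(blocks)

u_reduced = np.array([1.0, 0.5, 0.0, -0.5, -1.0])
u_full = M @ u_reduced  # input sequence over the full horizon
print(M.shape, u_full)
```

For a system with several inputs, the same idea is usually written as a Kronecker product of this matrix with an identity matrix of the input dimension, so that the optimizer only sees the reduced decision vector.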
Meta SAC-Lag: Towards Deployable Safe Reinforcement Learning via MetaGradient-based Hyperparameter Tuning IROS
Safe Reinforcement Learning (Safe RL) is one of the prevalently studied subcategories of trial-and-error-based methods with the intention to be deployed on real-world systems. In safe RL, the goal is to maximize reward performance while minimizing constraints, often achieved by setting bounds on constraint functions and utilizing the Lagrangian method. However, deploying Lagrangian-based safe RL in real-world scenarios is challenging due to the necessity of threshold fine-tuning, as imprecise adjustments may lead to suboptimal policy convergence. To mitigate this challenge, we propose a unified Lagrangian-based model-free architecture called Meta Soft Actor-Critic Lagrangian (Meta SAC-Lag). Meta SAC-Lag uses meta-gradient optimization to automatically update the safety-related hyperparameters. The proposed method is designed to address safe exploration and threshold adjustment with minimal hyperparameter tuning requirement. In our pipeline, the inner parameters are updated through the conventional formulation and the hyperparameters are adjusted using the meta-objectives which are defined based on the updated parameters. Our results show that the agent can reliably adjust the safety performance due to the relatively fast convergence rate of the safety threshold. We evaluate the performance of Meta SAC-Lag in five simulated environments against Lagrangian baselines, and the results demonstrate its capability to create synergy between parameters, yielding better or competitive results. Furthermore, we conduct a real-world experiment involving a robotic arm tasked with pouring coffee into a cup without spillage. Meta SAC-Lag is successfully trained to execute the task, while minimizing effort constraints.
comment: Main text accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024, 10 pages, 4 figures, 3 tables
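The Lagrangian method referred to in this abstract is often implemented as a projected dual-ascent update on a non-negative multiplier. The sketch below shows that generic textbook update only. It does not reproduce Meta SAC-Lag or its meta-gradient tuning, and the function names, step size, and numbers are assumptions made for illustration.

```python
def update_multiplier(lmbda, episode_cost, threshold, step_size=0.05):
    # Raise the multiplier when the cost exceeds the threshold, lower it
    # otherwise, and project back onto the non-negative reals.
    return max(0.0, lmbda + step_size * (episode_cost - threshold))

def penalized_objective(reward, cost, lmbda, threshold):
    # Lagrangian relaxation: reward minus the weighted constraint violation.
    return reward - lmbda * (cost - threshold)

# Assumed toy values: the cost is above the threshold, so the multiplier grows.
lmbda = update_multiplier(0.0, episode_cost=10.0, threshold=5.0, step_size=0.1)
objective = penalized_objective(reward=10.0, cost=7.0, lmbda=lmbda, threshold=5.0)
print(lmbda, objective)
```

In this formulation the threshold and the step size are exactly the kind of safety-related hyperparameters whose manual tuning the abstract describes as a deployment obstacle.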
$\mathcal{H}_2$-optimal Model Reduction of Linear Quadratic Output Systems in Finite Frequency Range
Linear quadratic output systems constitute an important class of dynamical systems with numerous practical applications. When the order of these models is exceptionally high, simulating and analyzing these systems becomes computationally prohibitive. In such instances, model order reduction offers an effective solution by approximating the original high-order system with a reduced-order model while preserving the system's essential characteristics. In frequency-limited model order reduction, the objective is to maintain the frequency response of the original system within a specified frequency range in the reduced-order model. In this paper, a mathematical expression for the frequency-limited $\mathcal{H}_2$ norm is derived, which quantifies the error within the desired frequency interval. Subsequently, the necessary conditions for a local optimum of the frequency-limited $\mathcal{H}_2$ norm of the error are derived. The inherent difficulty in satisfying these conditions within a Petrov-Galerkin projection framework is also discussed. Based on the optimality conditions and Petrov-Galerkin projection, a stationary point iteration algorithm is proposed that enforces three of the four optimality conditions upon convergence. A numerical example is provided to illustrate the algorithm's effectiveness in accurately approximating the original high-order model within the specified frequency interval.
Enhanced Equivalent Circuit Model for High Current Discharge of Lithium-Ion Batteries with Application to Electric Vertical Takeoff and Landing Aircraft
Conventional battery equivalent circuit models (ECMs) have limited capability to predict performance at high discharge rates, where lithium depleted regions may develop and cause a sudden exponential drop in the cell's terminal voltage. Having accurate predictions of performance under such conditions is necessary for electric vertical takeoff and landing (eVTOL) aircraft applications, where high discharge currents can be required during fault scenarios and the inability to provide these currents can be safety-critical. To address this challenge, we utilize data-driven modeling methods to derive a parsimonious addition to a conventional ECM that can capture the observed rapid voltage drop with only one additional state. We also provide a detailed method for identifying the resulting model parameters, including an extensive characterization data set along with a well-regularized objective function formulation. The model is validated against a novel data set of over 150 flights encompassing a wide array of conditions for an eVTOL aircraft using an application-specific and safety-relevant reserve duration metric for quantifying accuracy. The model is shown to predict the landing hover capability with an error mean and standard deviation of 2.9 and 6.2 seconds, respectively, defining the model's ability to capture the cell voltage behavior under high discharge currents.
Stable State Space SubSpace (S$^5$) Identification
State space subspace algorithms for input-output systems have been widely applied and have a reasonably well-developed asymptotic theory dealing with consistency. However, guaranteeing the stability of the estimated system matrix is a major issue. Existing stability-guaranteed algorithms are computationally expensive, require several tuning parameters, and scale badly to high state dimensions. Here, we develop a new algorithm that is closed-form and requires no tuning parameters. It is thus computationally cheap and scales easily to high state dimensions. We also prove its consistency under reasonable conditions.
The Nah Bandit: Modeling User Non-compliance in Recommendation Systems
Recommendation systems now pervade the digital world, ranging from advertising to entertainment. However, it remains challenging to implement effective recommendation systems in the physical world, such as in mobility or health. This work focuses on a key challenge: in the physical world, it is often easy for the user to opt out of taking any recommendation if it is not to her liking, and to fall back to her baseline behavior. It is thus crucial in cyber-physical recommendation systems to operate with an interaction model that is aware of such user behavior, lest the user abandon the recommendations altogether. This paper thus introduces the Nah Bandit, a tongue-in-cheek reference to describe a Bandit problem where users can say 'nah' to the recommendation and opt for their preferred option instead. As such, this problem lies in between a typical bandit setup and supervised learning. We model the user non-compliance by parameterizing an anchoring effect of recommendations on users. We then propose the Expert with Clustering (EWC) algorithm, a hierarchical approach that incorporates feedback from both recommended and non-recommended options to accelerate user preference learning. In a recommendation scenario with $N$ users, $T$ rounds per user, and $K$ clusters, EWC achieves a regret bound of $O(N\sqrt{T\log K} + NT)$, achieving superior theoretical performance in the short term compared to the LinUCB algorithm. Experimental results also highlight that EWC outperforms both supervised learning and traditional contextual bandit approaches. This advancement reveals that effective use of non-compliance feedback can accelerate preference learning and improve recommendation accuracy. This work lays the foundation for future research in Nah Bandit, providing a robust framework for more effective recommendation systems.
comment: 12 pages, 8 figures, under review
On Accelerating Large-Scale Robust Portfolio Optimization
Solving large-scale robust portfolio optimization problems is challenging due to the high computational demands associated with an increasing number of assets, the amount of data considered, and market uncertainty. To address this issue, we propose an extended supporting hyperplane approximation approach for efficiently solving a class of distributionally robust portfolio problems for a general class of additively separable utility functions and polyhedral ambiguity distribution set, applied to a large-scale set of assets. Our technique is validated using a large-scale portfolio of the S&P 500 index constituents, demonstrating robust out-of-sample trading performance. More importantly, our empirical studies show that this approach significantly reduces computational time compared to traditional concave Expected Log-Growth (ELG) optimization, with running times decreasing from several thousand seconds to just a few. This method provides a scalable and practical solution to large-scale robust portfolio optimization, addressing both theoretical and practical challenges.
comment: Submitted for possible publication
Certifiable Deep Learning for Reachability Using a New Lipschitz Continuous Value Function
We propose a new reachability learning framework for high-dimensional nonlinear systems, focusing on reach-avoid problems. These problems require computing the reach-avoid set, which ensures that all its elements can safely reach a target set despite any disturbance within pre-specified bounds. Our framework has two main parts: offline learning of a newly designed reach-avoid value function and post-learning certification. Compared to prior works, our new value function is Lipschitz continuous and its associated Bellman operator is a contraction mapping, both of which improve the learning performance. To ensure deterministic guarantees of our learned reach-avoid set, we introduce two efficient post-learning certification methods. Both methods can be used online for real-time local certification or offline for comprehensive certification. We validate our framework in a 12-dimensional crazyflie drone racing hardware experiment and a simulated 10-dimensional highway takeover example.
comment: Submitted, under review
A semi-centralized multi-agent RL framework for efficient irrigation scheduling
This paper proposes a Semi-Centralized Multi-Agent Reinforcement Learning (SCMARL) approach for irrigation scheduling in spatially variable agricultural fields, where management zones address spatial variability. The SCMARL framework is hierarchical in nature, with a centralized coordinator agent at the top level and decentralized local agents at the second level. The coordinator agent makes daily binary irrigation decisions based on field-wide conditions, which are communicated to the local agents. Local agents determine appropriate irrigation amounts for specific management zones using local conditions. The framework employs a state augmentation approach to handle non-stationarity in the local agents' environments. An extensive evaluation on a large-scale field in Lethbridge, Canada, compares the SCMARL approach with a learning-based multi-agent model predictive control scheduling approach, highlighting its enhanced performance, resulting in water conservation and improved Irrigation Water Use Efficiency (IWUE). Notably, the proposed approach achieved a 4.0% savings in irrigation water while enhancing the IWUE by 6.3%.
Timing Analysis and Priority-driven Enhancements of ROS 2 Multi-threaded Executors
The second generation of the Robot Operating System, ROS 2, has gained much attention for its potential to be used for safety-critical robotic applications. The need to provide a solid foundation for timing correctness and scheduling mechanisms is therefore growing rapidly. Although there are some pioneering studies on formally analyzing the response time of processing chains in ROS 2, the focus has been limited to single-threaded executors, and multi-threaded executors, despite their advantages, have not been studied well. To fill this knowledge gap, in this paper, we propose a comprehensive response-time analysis framework for chains running on ROS 2 multi-threaded executors. We first analyze the timing behavior of the default scheduling scheme in ROS 2 multi-threaded executors, and then present priority-driven scheduling enhancements to address the limitations of the default scheme. Our framework can analyze chains with both arbitrary and constrained deadlines and also the effect of mutually-exclusive callback groups. Evaluation is conducted by a case study on NVIDIA Jetson AGX Xavier and schedulability experiments using randomly-generated chains. The results demonstrate that our analysis framework can safely upper-bound response times under various conditions and the priority-driven scheduling enhancements not only reduce the response time of critical chains but also improve analytical bounds.
Physics-Guided Reinforcement Learning System for Realistic Vehicle Active Suspension Control
The suspension system is a crucial part of the automotive chassis, improving vehicle ride comfort and isolating passengers from rough road excitation. Unlike passive suspension, which has constant spring and damping coefficients, active suspension incorporates electronic actuators into the system to dynamically control stiffness and damping variables. However, effectively controlling the suspension system poses a challenging task that necessitates real-time adaptability to various road conditions. This paper presents the Physics-Guided Deep Reinforcement Learning (DRL) for adjusting an active suspension system's variable kinematics and compliance properties for a quarter-car model in real time. Specifically, the outputs of the model are defined as actuator stiffness and damping control, which are bound within physically realistic ranges to maintain the system's physical compliance. The proposed model was trained on stochastic road profiles according to ISO 8608 standards to optimize the actuator's control policy. According to qualitative results on simulations, the vehicle body reacts smoothly to various novel real-world road conditions, having a much lower degree of oscillation. These observations mean a higher level of passenger comfort and better vehicle stability. Quantitatively, DRL outperforms passive systems in reducing the average vehicle body velocity and acceleration by 43.58% and 17.22%, respectively, minimizing the vertical movement impacts on the passengers. The code is publicly available at github.com/anh-nn01/RL4Suspension-ICMLA23.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
An Efficient and Explainable Transformer-Based Few-Shot Learning for Modeling Electricity Consumption Profiles Across Thousands of Domains
Electricity Consumption Profiles (ECPs) are crucial for operating and planning power distribution systems, especially with the increasing numbers of various low-carbon technologies such as solar panels and electric vehicles. Traditional ECP modeling methods typically assume the availability of sufficient ECP data. However, in practice, the accessibility of ECP data is limited due to privacy issues or the absence of metering devices. Few-shot learning (FSL) has emerged as a promising solution for ECP modeling in data-scarce scenarios. Nevertheless, standard FSL methods, such as those used for images, are unsuitable for ECP modeling because (1) these methods usually assume several source domains with sufficient data and several target domains, whereas in the context of ECP modeling there may be thousands of source domains with a moderate amount of data and thousands of target domains; (2) standard FSL methods usually involve cumbersome knowledge transfer mechanisms, such as pre-training and fine-tuning, whereas ECP modeling requires more lightweight methods; and (3) deep learning models often lack explainability, hindering their application in industry. This paper proposes a novel FSL method that exploits Transformers and Gaussian Mixture Models (GMMs) for ECP modeling to address the above-described issues. Results show that our method can accurately restore the complex ECP distribution with a minimal amount of ECP data (e.g., only 1.6% of the complete domain dataset) while it outperforms state-of-the-art time series modeling methods, maintaining the advantages of being both lightweight and interpretable. The project is open-sourced at https://github.com/xiaweijie1996/TransformerEM-GMM.git.
Stabilization of Nonlinear Systems through Control Barrier Functions
This paper proposes a control design approach for stabilizing nonlinear control systems. Our key observation is that the set of points where the decrease condition of a control Lyapunov function (CLF) is feasible can be regarded as a safe set. By leveraging a nonsmooth version of control barrier functions (CBFs) and a weaker notion of CLF, we develop a control design that forces the system to converge to and remain in the region where the CLF decrease condition is feasible. We characterize the conditions under which our controller asymptotically stabilizes the origin or a small neighborhood around it, even in the cases where it is discontinuous. We illustrate our design in various examples.
An Implicit Function Method for Computing the Stability Boundaries of Hill's Equation
Hill's equation is a common model of a time-periodic system that can undergo parametric resonance for certain choices of system parameters. For most kinds of parametric forcing, stable regions in its two-dimensional parameter space need to be identified numerically, typically by applying a matrix trace criterion. By integrating ODEs derived from the stability criterion, we present an alternative, more accurate and computationally efficient numerical method for determining the stability boundaries of Hill's equation in parameter space. This method works similarly to determine stability boundaries for the closely related problem of vibrational stabilization of the linearized Kapitza pendulum. Additionally, we derive a stability criterion for the damped Hill's equation in terms of a matrix trace criterion on an equivalent undamped system. In doing so we generalize the method of this paper to compute stability boundaries for parametric resonance in the presence of damping.
comment: 11 pages, 9 figures
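The conventional matrix trace criterion mentioned in the Hill's equation abstract above can be sketched as follows. This is the standard baseline check, not the implicit-function method that paper proposes. The sketch assumes the Mathieu form $\ddot{x} + (\delta + \epsilon\cos t)\,x = 0$ with period $2\pi$; the function names, parameter names, and the fixed-step RK4 integrator are illustrative choices.

```python
import math


def monodromy_trace(delta, eps, steps=2000):
    """Trace of the monodromy matrix of x'' + (delta + eps*cos t) x = 0.

    The equation is integrated over one forcing period (2*pi) with classical
    RK4 from the two unit initial conditions (1, 0) and (0, 1).
    """
    period = 2.0 * math.pi
    h = period / steps

    def rhs(t, x, v):
        return v, -(delta + eps * math.cos(t)) * x

    def propagate(x, v):
        t = 0.0
        for _ in range(steps):
            k1x, k1v = rhs(t, x, v)
            k2x, k2v = rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
            k3x, k3v = rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
            k4x, k4v = rhs(t + h, x + h * k3x, v + h * k3v)
            x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            t += h
        return x, v

    x1, _ = propagate(1.0, 0.0)   # first column of the monodromy matrix
    _, v2 = propagate(0.0, 1.0)   # second column of the monodromy matrix
    return x1 + v2


def is_stable(delta, eps):
    """Trace criterion: solutions stay bounded when |trace| < 2."""
    return abs(monodromy_trace(delta, eps)) < 2.0


if __name__ == "__main__":
    for delta, eps in [(0.09, 0.0), (0.25, 0.2)]:
        print(delta, eps, "stable" if is_stable(delta, eps) else "unstable")
```

Mapping a stability chart this way requires one such integration per grid point in the parameter plane, which is the cost an alternative boundary-tracing method would aim to reduce.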
Spin Hall Nano-Antenna
The spin Hall effect is a celebrated phenomenon in spintronics and magnetism that has found numerous applications in digital electronics (memory and logic), but very few in analog electronics. Practically, the only analog application in widespread use is the spin Hall nano-oscillator (SHNO) that delivers a high frequency alternating current or voltage to a load. Here, we report its analogue - a spin Hall nano-antenna (SHNA) that radiates a high frequency electromagnetic wave (alternating electric/magnetic fields) into the surrounding medium. It can also radiate an acoustic wave in an underlying substrate if the nanomagnets are made of a magnetostrictive material. That makes it a dual electromagnetic/acoustic antenna. The SHNA is made of an array of ledged magnetostrictive nanomagnets deposited on a substrate, with a heavy metal nanostrip underlying/overlying the ledges. An alternating charge current passed through the nanostrip generates an alternating spin-orbit torque in the nanomagnets via the spin Hall effect which makes their magnetizations oscillate in time with the frequency of the current, producing confined spin waves (magnons), which radiate electromagnetic waves (photons) in space with the same frequency as the ac current. Despite being much smaller than the radiated wavelength, the SHNA surprisingly does not act as a point source which would radiate isotropically. Instead, there is clear directionality (anisotropy) in the radiation pattern, which is frequency-dependent. This is due to the (frequency-dependent) intrinsic anisotropy in the confined spin wave patterns generated within the nanomagnets, which effectively endows the "point source" with internal anisotropy.
A review of the calculation methods of optimal power flow in integrated energy systems
The analysis of Integrated Energy Systems (IES) is crucial for enhancing the comprehensive and complementary utilization of clean energy across China, significantly impacting the effective planning, operational coordination, and security control of the IES network. This paper presents a systematic review of the current research on optimal power flow (OPF) within IES, addressing the spatiotemporal interrelationships and coupled co-supply among primary energy processes such as electricity, gas, and heat (cooling). It highlights the challenges and future directions in this field, underscoring the lack of comprehensive studies on coupled power flow modeling for electricity-heat-gas systems and the need for more robust modeling approaches that align with practical engineering applications. Furthermore, the paper discusses the potential of multi-target energy storage systems to enhance energy consumption efficiency and flexibility in energy resource management. The existing models and algorithms for target power flow and optimal power flow are critiqued for their lack of flexibility and comprehensiveness, particularly in handling multi-objective flows and ensuring system safety and reliability. The study emphasizes the necessity for further development of safety and reliability assessment frameworks to support the evolving demands of integrated energy systems.
comment: 10 pages, in Chinese language
Startup Control Optimization of He-Xe Cooled Space Nuclear Reactors Using a System Analysis Program
In recent years, achieving autonomous control in nuclear reactor operations has become pivotal for the effectiveness of Space Nuclear Power Systems (SNPS). However, compared to power control, the startup control of SNPS remains underexplored. This study introduces a multi-objective optimization framework aimed at enhancing startup control, leveraging a system level analysis program to simulate the system's dynamic behavior accurately. The primary contribution of this work is the development and implementation of an optimization framework that significantly reduces startup time and improves control efficiency. Using a non-ideal gas model, a multi-channel core model, and temperature reactivity coefficients and neutron kinetics parameters calculated with the Monte Carlo code RMC, the system analysis tool ensures precise thermal-dynamic simulations. After the system dynamics are examined through reactivity insertion accidents, the optimization algorithm fine-tunes the control sequences for external reactivity insertion, TAC system shaft speed, and cooling system background temperature. The optimized control strategy achieves threshold power 1260 seconds earlier and turbine inlet temperature 1980 seconds sooner than baseline methods. The findings highlight the potential of the proposed optimization framework to enhance the autonomy and operational efficiency of future SNPS designs.
Quality-Aware Hydraulic Control in Drinking Water Networks via Controllability Proxies
The operation of water distribution networks is a complex procedure aimed at efficiently delivering consumers with adequate water quantity while ensuring its safe quality. An added challenge is the dependency of the water quality dynamics on the system's hydraulics, which influences the performance of the water quality controller. Prior research has addressed either solving the optimum operational hydraulic setting problem or regulating the water quality dynamics as separate problems. Additionally, there have been efforts to couple these two problems and solve one compact problem resulting in trade-offs between the contradictory objectives. In contrast, this paper takes a novel approach by examining the water quality dependency on the hydraulics from a control-theoretic standpoint. More specifically, we explore the influence of accountability for water quality controllability improvement when addressing the pump scheduling problem. We examine its effects on the cumulative cost of the interconnected systems as well as the subsequent performance of the water quality controller. To achieve this, we develop a framework that incorporates different controllability metrics within the operational hydraulic optimization problem; its aim is attaining an adequate level of water quality control across the system. We assess the aforementioned aspects' performance on various scaled networks with a wide range of numerical scenarios.
Low-Complexity Control for a Class of Uncertain MIMO Nonlinear Systems under Generalized Time-Varying Output Constraints
This paper introduces a novel control framework to address the satisfaction of multiple time-varying output constraints in uncertain high-order MIMO nonlinear control systems. Unlike existing methods, which often assume that the constraints are always decoupled and feasible, our approach can handle coupled time-varying constraints even in the presence of potential infeasibilities. First, it is shown that satisfying multiple constraints essentially boils down to ensuring the positivity of a scalar variable, representing the signed distance from the boundary of the time-varying output-constrained set. To achieve this, a single consolidating constraint is designed that, when satisfied, guarantees convergence to and invariance of the time-varying output-constrained set within a user-defined finite time. Next, a novel robust and low-complexity feedback controller is proposed to ensure the satisfaction of the consolidating constraint. Additionally, we provide a mechanism for online modification of the consolidating constraint to find a least violating solution when the constraints become mutually infeasible for some time. Finally, simulation examples of trajectory and region tracking for a mobile robot validate the proposed approach.
comment: extended version, 20 pages, 7 figures
In-Field Gyroscope Autocalibration with Iterative Attitude Estimation
This paper presents an efficient in-field calibration method tailored for low-cost triaxial MEMS gyroscopes often used in healthcare applications. Traditional calibration techniques are challenging to implement in clinical settings due to the unavailability of high-precision equipment. Unlike the auto-calibration approaches used for triaxial MEMS accelerometers, which rely on local gravity, gyroscopes lack a reliable reference since the Earth's self-rotation speed is insufficient for accurate calibration. To address this limitation, we propose a novel method that uses manual rotation of the MEMS gyroscope to a specific angle (360°) as the calibration reference. This approach iteratively estimates the sensor's attitude without requiring any external equipment. Numerical simulations and empirical tests validate that the calibration error is low and that parameter estimation is unbiased. The method can be implemented in real-time on a low-energy microcontroller and completed in under 30 seconds. Comparative results demonstrate that the proposed technique outperforms existing state-of-the-art methods, achieving scale factor and bias errors of less than $2.5\times10^{-2}$ for LSM9DS1 and less than $1\times10^{-2}$ for ICM20948.
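To make the idea of a known-angle rotation reference concrete, the sketch below calibrates a single gyroscope axis. It is a deliberately simplified stand-in, not the paper's iterative triaxial attitude-estimation method. It assumes the measurement model measured = scale × true rate + bias, a stationary recording for the bias, and one full 360° turn about the axis; all names and the synthetic data are hypothetical.

```python
def calibrate_axis(stationary, rotation, dt, reference_deg=360.0):
    """Estimate bias and scale factor of one gyroscope axis.

    stationary: rate samples (deg/s) recorded while the sensor is at rest.
    rotation:   rate samples (deg/s) recorded during one full manual turn.
    dt:         sampling interval in seconds.
    """
    bias = sum(stationary) / len(stationary)
    # Integrate the bias-corrected rate; with a perfect sensor this is 360.
    measured_angle = sum((m - bias) * dt for m in rotation)
    scale = measured_angle / reference_deg
    return bias, scale


def correct(measured, bias, scale):
    """Apply the estimated calibration to a raw rate sample."""
    return (measured - bias) / scale


if __name__ == "__main__":
    # Synthetic data: true rate 90 deg/s for 4 s, scale 1.05, bias 0.3 deg/s.
    dt = 0.01
    stationary = [0.3] * 100
    rotation = [1.05 * 90.0 + 0.3] * 400
    bias, scale = calibrate_axis(stationary, rotation, dt)
    print(bias, scale, correct(rotation[0], bias, scale))
```

A real triaxial calibration also has to handle the rotation axis not being aligned with a sensor axis, which is why an attitude estimate is needed in practice.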
Channel Characterization of IRS-assisted Resonant Beam Communication Systems
To meet the growing demand for data traffic, spectrum-rich optical wireless communication (OWC) has emerged as a key technological driver for the development of 6G. The resonant beam communication (RBC) system, which employs spatially separated laser cavities as the transmitter and receiver, is a high-speed OWC technology capable of self-alignment without tracking. However, its transmission through the air is susceptible to losses caused by obstructions. In this paper, we propose an intelligent reflecting surface (IRS) assisted RBC system with the optical frequency doubling method, where the resonant beam in frequency-fundamental and frequency-doubled is transmitted through both direct line-of-sight (LoS) and IRS-assisted channels to maintain steady-state oscillation and enable communication without echo-interference, respectively. Then, we establish the channel model based on Fresnel diffraction theory under the near-field optical propagation to analyze the transmission loss and frequency-doubled power analytically. Furthermore, communication power can be maximized by dynamically controlling the beam-splitting ratio between the two channels according to the loss levels encountered over air. Numerical results validate that the IRS-assisted channel can compensate for the losses in the obstructed LoS channel and misaligned receivers, ensuring that communication performance reaches an optimal value with dynamic ratio adjustments.
Learning Tube-Certified Neural Robust Contraction Metrics
Control design for general nonlinear robotic systems with guaranteed stability and/or safety in the presence of model uncertainties is a challenging problem. Recent efforts attempt to learn a controller and a certificate (e.g., a Lyapunov function or a contraction metric) jointly using neural networks (NNs), in which model uncertainties are generally ignored during the learning process. In this paper, for nonlinear systems subject to bounded disturbances, we present a framework for jointly learning a robust nonlinear controller and a contraction metric using a novel disturbance rejection objective that certifies a tube bound using NNs for user-specified variables (e.g. control inputs). The learned controller aims to minimize the effect of disturbances on the actual trajectories of state and/or input variables from their nominal counterparts while providing certificate tubes around nominal trajectories that are guaranteed to contain actual trajectories in the presence of disturbances. Experimental results demonstrate that our framework can generate tighter (smaller) tubes and a controller that is computationally efficient to implement.
Decentralized and Uncoordinated Learning of Stable Matchings: A Game-Theoretic Approach
We consider the problem of learning stable matchings with unknown preferences in a decentralized and uncoordinated manner, where "decentralized" means that players make decisions individually without the influence of a central platform, and "uncoordinated" means that players do not need to synchronize their decisions using pre-specified rules. First, we provide a game formulation for this problem with known preferences, where the set of pure Nash equilibria (NE) coincides with the set of stable matchings, and mixed NE can be rounded to a stable matching. Then, we show that for hierarchical markets, applying the exponential weight (EXP) learning algorithm to the stable matching game achieves logarithmic regret in a fully decentralized and uncoordinated fashion. Moreover, we show that EXP converges locally and exponentially fast to a stable matching in general markets. We also introduce another decentralized and uncoordinated learning algorithm that globally converges to a stable matching with arbitrarily high probability. Finally, we provide stronger feedback conditions under which it is possible to drive the market faster toward an approximate stable matching. Our proposed game-theoretic framework bridges the discrete problem of learning stable matchings with the problem of learning NE in continuous-action games.
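For readers unfamiliar with exponential-weight learning, the sketch below shows the generic full-information update, in which each action's probability is proportional to the exponential of its scaled cumulative reward. It is only an assumption-laden illustration: the feedback model, step size, and game payoffs used in the stable-matching paper above are not reproduced here, and the names are hypothetical.

```python
import math


def exp_weights(cumulative_rewards, eta):
    """Return a probability distribution over actions.

    Each action's weight is exp(eta * cumulative reward); the maximum is
    subtracted before exponentiating for numerical stability, which does
    not change the normalized result.
    """
    top = max(cumulative_rewards)
    weights = [math.exp(eta * (r - top)) for r in cumulative_rewards]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical cumulative rewards for three candidate actions.
    print(exp_weights([1.0, 3.0, 2.0], eta=0.5))
```

A larger `eta` concentrates probability on the best-performing action faster, at the cost of less exploration.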
Efficient Data-Driven MPC for Demand Response of Commercial Buildings
Model predictive control (MPC) has been shown to significantly improve the energy efficiency of buildings while maintaining thermal comfort. Data-driven approaches based on neural networks have been proposed to facilitate system modelling. However, such approaches are generally nonconvex and result in computationally intractable optimization problems. In this work, we design a readily implementable energy management method for small commercial buildings. We then leverage our approach to formulate a real-time demand bidding strategy. We propose a data-driven and mixed-integer convex MPC which is solved via derivative-free optimization given a limited computational time of 5 minutes to respect operational constraints. We consider rooftop unit heating, ventilation, and air conditioning systems with discrete controls to accurately model the operation of most commercial buildings. Our approach uses an input convex recurrent neural network to model the thermal dynamics. We apply our approach in several demand response (DR) settings, including a demand bidding, a time-of-use, and a critical peak rebate program. Controller performance is evaluated on a state-of-the-art building simulation. The proposed approach improves thermal comfort while reducing energy consumption and cost through DR participation, when compared to other data-driven approaches or a set-point controller.
Collaborative Estimation of Real Valued Function by Two Agents and a Fusion Center with Knowledge Exchange
We consider a collaborative iterative algorithm with two agents and a fusion center for estimation of a real valued function (or "model") on the set of real numbers. While the data collected by the agents is private, in every iteration of the algorithm, the models estimated by the agents are uploaded to the fusion center, fused, and subsequently downloaded by the agents. We consider the estimation spaces at the agents and the fusion center to be Reproducing Kernel Hilbert Spaces (RKHS). Under suitable assumptions on these spaces, we prove that the algorithm is consistent, i.e., there exists a subsequence of the estimated models which converges to a model in the strong topology. To this end, we constructively define estimation operators for the agents, the fusion center, and every iteration of the algorithm. We define valid input data sequences, study the asymptotic properties of the norm of the estimation operators, and find sufficient conditions under which the estimation operator until any iteration is uniformly bounded. Using these results, we prove the existence of an estimation operator for the algorithm which implies the consistency of the considered estimation algorithm.
comment: arXiv admin note: text overlap with arXiv:2401.03012
Battlefield transfers in coalitional Blotto games
In competitive resource allocation environments, agents often choose to form alliances; however, for some agents, doing so may not always be beneficial. Is there a method of forming alliances that always reward each of their members? We study this question using the framework of the coalitional Blotto game, in which two players compete against a common adversary by allocating their budgeted resources across disjoint sets of valued battlefields. On any given battlefield, the agent that allocates a greater amount of resources wins the corresponding battlefield value. Existing work has shown the surprising result that in certain game instances, if one player donates a portion of their budget to the other player, then both players win larger amounts in their separate competitions against the adversary. However, this transfer-based method of alliance formation is not always mutually beneficial, which motivates the search for alternate strategies. In this vein, we study a new method of alliance formation referred to as a joint transfer, whereby players publicly transfer battlefields and budgets between one another before they engage in their separate competitions against the adversary. We show that in almost all game instances, there exists a mutually beneficial joint transfer that strictly increases the payoff of each player.
Self-organizing Multiagent Target Enclosing under Limited Information and Safety Guarantees
This paper introduces an approach to address the target enclosing problem using non-holonomic multiagent systems, where agents self-organize on the enclosing shape around a fixed target. In our approach, agents independently move toward the desired enclosing geometry when apart and activate the collision avoidance mechanism when a collision is imminent, thereby guaranteeing inter-agent safety. Our approach combines global enclosing behavior and local collision avoidance mechanisms by devising a special potential function and sliding manifold. We rigorously show that an agent does not need to ensure safety with every other agent and put forth a concept of the nearest colliding agent (for any arbitrary agent) with whom ensuring safety is sufficient to avoid collisions in the entire swarm. The proposed control eliminates the need for a fixed or pre-established agent arrangement around the target and requires only relative information between an agent and the target. This makes our design particularly appealing for scenarios with limited global information, hence significantly reducing communication requirements. We finally present simulation results to vindicate the efficacy of the proposed method.
Enhancing Reinforcement Learning in Sensor Fusion: A Comparative Analysis of Cubature and Sampling-based Integration Methods for Rover Search Planning IROS 2024
This study investigates the computational speed and accuracy of two numerical integration methods, cubature and sampling-based, for integrating an integrand over a 2D polygon. Using a group of rovers searching the Martian surface with a limited sensor footprint as a test bed, the relative error and computational time are compared as the area was subdivided to improve accuracy in the sampling-based approach. The results show that the sampling-based approach exhibits a $14.75\%$ deviation in relative error compared to cubature when it matches the computational performance at $100\%$. Furthermore, achieving a relative error below $1\%$ necessitates a $10000\%$ increase in relative time to calculate due to the $\mathcal{O}(N^2)$ complexity of the sampling-based method. It is concluded that for enhancing reinforcement learning capabilities and other high iteration algorithms, the cubature method is preferred over the sampling-based method.
comment: Submitted to IROS 2024
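As a rough illustration of why a subdivision-based sampling approach scales quadratically, the sketch below integrates a function over a 2D polygon by splitting its bounding box into an n-by-n grid and evaluating the integrand at each cell centre that falls inside the polygon. This is a generic midpoint scheme under assumed names (`point_in_polygon`, `grid_integrate`), not necessarily the sampling method used in the paper.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, poly: List[Point]) -> bool:
    """Ray-casting test: count edge crossings of a ray going in +x from (x, y)."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def grid_integrate(f: Callable[[float, float], float], poly: List[Point], n: int) -> float:
    """Approximate the integral of f over poly using n * n cell-centre samples."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    x_min, y_min = min(xs), min(ys)
    dx = (max(xs) - x_min) / n
    dy = (max(ys) - y_min) / n
    total = 0.0
    for i in range(n):  # n * n evaluations: the O(N^2) cost
        cx = x_min + (i + 0.5) * dx
        for j in range(n):
            cy = y_min + (j + 0.5) * dy
            if point_in_polygon(cx, cy, poly):
                total += f(cx, cy)
    return total * dx * dy


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
area = grid_integrate(lambda x, y: 1.0, unit_square, 10)
```

Doubling `n` quadruples the number of integrand evaluations, which is the trade-off between accuracy and run time that the abstract quantifies.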
Output-feedback Synthesis Orbit Geometry: Quotient Manifolds and LQG Direct Policy Optimization
We consider direct policy optimization for the linear-quadratic Gaussian (LQG) setting. Over the past few years, it has been recognized that the landscape of dynamic output-feedback controllers of relevance to LQG has an intricate geometry, particularly pertaining to the existence of degenerate stationary points, that hinders gradient methods. In order to address these challenges, in this paper, we adopt a system-theoretic coordinate-invariant Riemannian metric for the space of dynamic output-feedback controllers and develop a Riemannian gradient descent for direct LQG policy optimization. We then proceed to prove that the orbit space of such controllers, modulo the coordinate transformation, admits a Riemannian quotient manifold structure. This geometric structure--that is of independent interest--provides an effective approach to derive direct policy optimization algorithms for LQG with a local linear rate convergence guarantee. Subsequently, we show that the proposed approach exhibits significantly faster and more robust numerical performance as compared with ordinary gradient descent.
Robotics
HyperTaxel: Hyper-Resolution for Taxel-Based Tactile Signals Through Contrastive Learning IROS 2024
To achieve dexterity comparable to that of humans, robots must intelligently process tactile sensor data. Taxel-based tactile signals often have low spatial-resolution, with non-standardized representations. In this paper, we propose a novel framework, HyperTaxel, for learning a geometrically-informed representation of taxel-based tactile signals to address challenges associated with their spatial resolution. We use this representation and a contrastive learning objective to encode and map sparse low-resolution taxel signals to high-resolution contact surfaces. To address the uncertainty inherent in these signals, we leverage joint probability distributions across multiple simultaneous contacts to improve taxel hyper-resolution. We evaluate our representation by comparing it with two baselines and present results that suggest our representation outperforms the baselines. Furthermore, we present qualitative results that demonstrate the learned representation captures the geometric features of the contact surface, such as flatness, curvature, and edges, and generalizes across different objects and sensor configurations. Moreover, we present results that suggest our representation improves the performance of various downstream tasks, such as surface classification, 6D in-hand pose estimation, and sim-to-real transfer.
comment: Accepted by IROS 2024
VLPG-Nav: Object Navigation Using Visual Language Pose Graph and Object Localization Probability Maps
We present VLPG-Nav, a visual language navigation method for guiding robots to specified objects within household scenes. Unlike existing methods primarily focused on navigating the robot toward objects, our approach considers the additional challenge of centering the object within the robot's camera view. Our method builds a visual language pose graph (VLPG) that functions as a spatial map of VL embeddings. Given an open vocabulary object query, we plan a viewpoint for object navigation using the VLPG. Despite navigating to the viewpoint, real-world challenges like object occlusion, displacement, and the robot's localization error can prevent visibility. We build an object localization probability map that leverages the robot's current observations and prior VLPG. When the object isn't visible, the probability map is updated and an alternate viewpoint is computed. In addition, we propose an object-centering formulation that locally adjusts the robot's pose to center the object in the camera view. We evaluate the effectiveness of our approach through simulations and real-world experiments, evaluating its ability to successfully view and center the object within the camera field of view. VLPG-Nav demonstrates improved performance in locating the object, navigating around occlusions, and centering the object within the robot's camera view, outperforming the selected baselines in the evaluation metrics.
Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model IROS 2024
Enabling humanoid robots to autonomously perform loco-manipulation in unstructured environments is crucial and highly challenging for achieving embodied intelligence. This involves robots being able to plan their actions and behaviors in long-horizon tasks while using multi-modality to perceive deviations between task execution and high-level planning. Recently, large language models (LLMs) have demonstrated powerful planning and reasoning capabilities for comprehension and processing of semantic information through robot control tasks, as well as the usability of analytical judgment and decision-making for multi-modal inputs. To leverage the power of LLMs towards humanoid loco-manipulation, we propose a novel language-model based framework that enables robots to autonomously plan behaviors and low-level execution under given textual instructions, while observing and correcting failures that may occur during task execution. To systematically evaluate this framework in grounding LLMs, we created the robot 'action' and 'sensing' behavior library for task planning, and conducted mobile manipulation tasks and experiments in both simulated and real environments using the CENTAURO robot, and verified the effectiveness and application of this approach in robotic tasks with autonomous behavioral planning.
comment: Paper accepted by IROS 2024
Marker or Markerless? Mode-Switchable Optical Tactile Sensing for Diverse Robot Tasks
Optical tactile sensors play a pivotal role in robot perception and manipulation tasks. The membrane of these sensors can be painted with markers or remain markerless, enabling them to function in either marker or markerless mode. However, this uni-modal selection means the sensor is only suitable for either manipulation or perception tasks. While markers are vital for manipulation, they can also obstruct the camera, thereby impeding perception. The dilemma of selecting between marker and markerless modes presents a significant obstacle. To address this issue, we propose a novel mode-switchable optical tactile sensing approach that facilitates transitions between the two modes. The marker-to-markerless transition is achieved through a generative model, whereas its inverse transition is realized using a sparsely supervised regressive model. Our approach allows a single-mode optical sensor to operate effectively in both marker and markerless modes without the need for additional hardware, making it well-suited for both perception and manipulation tasks. Extensive experiments validate the effectiveness of our method. For perception tasks, our approach decreases the number of categories that include misclassified samples by 2 and improves contact area segmentation IoU by 3.53%. For manipulation tasks, our method attains a high success rate of 92.59% in slip detection. Code, dataset and demo videos are available at the project website: https://gitouni.github.io/Marker-Markerless-Transition/
comment: 8 pages, 10 figures
A Conflicts-free, Speed-lossless KAN-based Reinforcement Learning Decision System for Interactive Driving in Roundabouts
Safety and efficiency are crucial for autonomous driving in roundabouts, especially in the context of mixed traffic where autonomous vehicles (AVs) and human-driven vehicles coexist. This paper introduces a learning-based algorithm tailored to foster safe and efficient driving behaviors across varying levels of traffic flows in roundabouts. The proposed algorithm employs a deep Q-learning network to effectively learn safe and efficient driving strategies in complex multi-vehicle roundabouts. Additionally, a KAN (Kolmogorov-Arnold network) enhances the AVs' ability to learn their surroundings robustly and precisely. An action inspector is integrated to replace dangerous actions to avoid collisions when the AV interacts with the environment, and a route planner is proposed to enhance the driving efficiency and safety of the AVs. Moreover, a model predictive control is adopted to ensure stability and precision of the driving actions. The results show that our proposed system consistently achieves safe and efficient driving whilst maintaining a stable training process, as evidenced by the smooth convergence of the reward function and the low variance in the training curves across various traffic flows. Compared to state-of-the-art benchmarks, the proposed algorithm achieves a lower number of collisions and reduced travel time to destination.
comment: 15 pages, 12 figures, submitted to an IEEE journal
Scaling Up Natural Language Understanding for Multi-Robots Through the Lens of Hierarchy
Long-horizon planning is hindered by challenges such as uncertainty accumulation, computational complexity, delayed rewards and incomplete information. This work proposes an approach to exploit the task hierarchy from human instructions to facilitate multi-robot planning. Using Large Language Models (LLMs), we propose a two-step approach to translate multi-sentence instructions into a structured language, Hierarchical Linear Temporal Logic (LTL), which serves as a formal representation for planning. Initially, LLMs transform the instructions into a hierarchical representation defined as a Hierarchical Task Tree, capturing the logical and temporal relations among tasks. Following this, a domain-specific fine-tuning of LLM translates sub-tasks of each task into flat LTL formulas, aggregating them to form hierarchical LTL specifications. These specifications are then leveraged for planning using off-the-shelf planners. Our framework not only bridges the gap between instructions and algorithmic planning but also showcases the potential of LLMs in harnessing hierarchical reasoning to automate multi-robot task planning. Through evaluations in both simulation and real-world experiments involving human participants, we demonstrate that our method can handle more complex instructions compared to existing methods. The results indicate that our approach achieves higher success rates and lower costs in multi-robot task allocation and plan generation. Demo videos are available at https://youtu.be/7WOrDKxIMIs .
General-purpose Clothes Manipulation with Semantic Keypoints
We have seen much recent progress in task-specific clothes manipulation, but generalizable clothes manipulation is still a challenge. Clothes manipulation requires sequential actions, making it challenging to generalize to unseen tasks. Besides, a general clothes state representation method is crucial. In this paper, we adopt language instructions to specify and decompose clothes manipulation tasks, and propose a large language model based hierarchical learning method to enhance generalization. For state representation, we use semantic keypoints to capture the geometry of clothes and outline their manipulation methods. Simulation experiments show that the proposed method outperforms the baseline method in terms of success rate and generalization for clothes manipulation tasks.
Robust Maneuver Planning With Scalable Prediction Horizons: A Move Blocking Approach
Implementation of Model Predictive Control (MPC) on hardware with limited computational resources remains a challenge. Especially for long-distance maneuvers that require small sampling times, the necessary horizon lengths prevent its application on onboard computers. In this paper, we propose a computationally efficient tube-based shrinking horizon MPC that is scalable to long prediction horizons. Using move blocking, we ensure that a given number of decision inputs is efficiently used throughout the maneuver. Next, a method to substantially reduce the number of constraints is introduced. The approach is demonstrated with a helicopter landing on an inclined platform using a prediction horizon of 300 steps. The constraint reduction decreases the computation time by an order of magnitude with a slight increase in trajectory cost.
comment: Submitted to L-CSS with CDC option
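Move blocking, in its generic form, holds each decision input constant over a block of consecutive time steps, so a long horizon is parameterized by only a few free inputs. The sketch below shows only that expansion step; the function name and the block lengths are hypothetical, and the tube-based, shrinking-horizon and constraint-reduction parts of the paper are not represented.

```python
from typing import List, Sequence


def expand_blocked_inputs(
    block_lengths: Sequence[int], block_inputs: Sequence[float]
) -> List[float]:
    """Expand one decision input per block into a full per-step input sequence."""
    if len(block_lengths) != len(block_inputs):
        raise ValueError("need exactly one input per block")
    if any(length <= 0 for length in block_lengths):
        raise ValueError("block lengths must be positive")
    sequence: List[float] = []
    for length, u in zip(block_lengths, block_inputs):
        sequence.extend([u] * length)  # input is held constant within the block
    return sequence


# Hypothetical blocking: 3 decision inputs cover a 6-step horizon.
inputs = expand_blocked_inputs([1, 2, 3], [0.5, -1.0, 0.0])
```

With such a parameterization, an optimizer searches over `len(block_inputs)` values instead of one value per step, which is what keeps the problem size manageable for horizons in the hundreds of steps.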
Path Planning for Spot Spraying with UAVs Combining TSP and Area Coverages
This paper addresses the following task: given a set of patches or areas of varying sizes that are meant to be serviced within a bounding contour, calculate a minimal-length path plan for an unmanned aerial vehicle (UAV) such that the path additionally avoids given obstacle areas and never leaves the bounding contour. The application in mind is agricultural spot spraying, where the bounding contour represents the field contour and multiple patches represent multiple weed areas meant to be sprayed. Obstacle areas are ponds or tree islands. The proposed method combines a heuristic solution to a traveling salesman problem (TSP) with optimised area coverage path planning. Two TSP-initialisation and four TSP-refinement heuristics as well as two area coverage path planning methods are evaluated on three real-world experiments with three obstacle areas and 15, 19 and 197 patches, respectively. The unsuitability of a Boustrophedon path for area coverage gap avoidance is discussed and inclusion of a headland path for area coverage is motivated. Two main findings are (i) the particular suitability of one TSP-refinement heuristic, and (ii) the unexpectedly high contribution of patch area coverage path lengths to total path length, highlighting the importance of optimised area coverage path planning for spot spraying.
comment: 10 pages, 13 figures, 4 tables
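The two-stage pattern of a TSP initialisation followed by a refinement heuristic can be sketched with two textbook choices: nearest-neighbour construction and 2-opt improvement. The abstract does not name the heuristics evaluated in the paper, so both are stand-ins chosen for illustration; the sketch also assumes a closed tour over patch centres and ignores obstacles, the bounding contour and area coverage.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def tour_length(points: List[Point], tour: List[int]) -> float:
    """Length of the closed tour visiting points in the given index order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def nearest_neighbour_tour(points: List[Point], start: int = 0) -> List[int]:
    """Initialisation: repeatedly move to the closest unvisited point."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda k: (math.dist(last, points[k]), k))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(points: List[Point], tour: List[int]) -> List[int]:
    """Refinement: reverse tour segments while doing so shortens the tour."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(points, candidate) < tour_length(points, best) - 1e-12:
                    best = candidate
                    improved = True
    return best


patch_centres = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
refined = two_opt(patch_centres, [0, 1, 2, 3])  # removes the crossing diagonals
```

In the spot-spraying setting, the tour order would then be combined with a coverage path inside each patch, which according to the abstract contributes an unexpectedly large share of the total path length.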
Toward a Dialogue System Using a Large Language Model to Recognize User Emotions with a Camera
The performance of ChatGPT© and other LLMs has improved tremendously, and in online environments, they are increasingly likely to be used in a wide variety of situations, such as ChatBot on web pages, call center operations using voice interaction, and dialogue functions using agents. In the offline environment, multimodal dialogue functions are also being realized, such as guidance by Artificial Intelligence agents (AI agents) using tablet terminals and dialogue systems in the form of LLMs mounted on robots. In this multimodal dialogue, mutual emotion recognition between the AI and the user will become important. So far, there have been methods for expressing emotions on the part of the AI agent or for recognizing them using textual or voice information of the user's utterances, but methods for AI agents to recognize emotions from the user's facial expressions have not been studied. In this study, we examined whether or not LLM-based AI agents can interact with users according to their emotional states by capturing the user in dialogue with a camera, recognizing emotions from facial expressions, and adding such emotion information to prompts. The results confirmed that AI agents can have conversations according to the emotional state for emotional states with relatively high scores, such as Happy and Angry.
comment: 4 pages, 5 figures, 1 table, The 1st InterAI: Interactive AI for Human-Centered Robotics workshop in conjunction with IEEE Ro-MAN 2024, Pasadena, LA, USA, Aug. 2024
Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models IROS 2024
This paper investigates the task of the open-ended interactive robotic manipulation on table-top scenarios. While recent Large Language Models (LLMs) enhance robots' comprehension of user instructions, their lack of visual grounding constrains their ability to physically interact with the environment. This is because the robot needs to locate the target object for manipulation within the physical workspace. To this end, we introduce an interactive robotic manipulation framework called Polaris, which integrates perception and interaction by utilizing GPT-4 alongside grounded vision models. For precise manipulation, it is essential that such grounded vision models produce detailed object pose for the target object, rather than merely identifying pixels belonging to them in the image. Consequently, we propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline. This pipeline utilizes rendered synthetic data for training and is then transferred to real-world manipulation tasks. The real-world performance demonstrates the efficacy of our proposed pipeline and underscores its potential for extension to more general categories. Moreover, real-robot experiments have showcased the impressive performance of our framework in grasping and executing multiple manipulation tasks. This indicates its potential to generalize to scenarios beyond the tabletop. More information and video results are available here: https://star-uu-wang.github.io/Polaris/
comment: Accepted by IROS 2024. 8 pages, 5 figures. See https://star-uu-wang.github.io/Polaris/
Meta SAC-Lag: Towards Deployable Safe Reinforcement Learning via MetaGradient-based Hyperparameter Tuning IROS
Safe Reinforcement Learning (Safe RL) is one of the prevalently studied subcategories of trial-and-error-based methods with the intention to be deployed on real-world systems. In safe RL, the goal is to maximize reward performance while minimizing constraints, often achieved by setting bounds on constraint functions and utilizing the Lagrangian method. However, deploying Lagrangian-based safe RL in real-world scenarios is challenging due to the necessity of threshold fine-tuning, as imprecise adjustments may lead to suboptimal policy convergence. To mitigate this challenge, we propose a unified Lagrangian-based model-free architecture called Meta Soft Actor-Critic Lagrangian (Meta SAC-Lag). Meta SAC-Lag uses meta-gradient optimization to automatically update the safety-related hyperparameters. The proposed method is designed to address safe exploration and threshold adjustment with minimal hyperparameter tuning requirement. In our pipeline, the inner parameters are updated through the conventional formulation and the hyperparameters are adjusted using the meta-objectives which are defined based on the updated parameters. Our results show that the agent can reliably adjust the safety performance due to the relatively fast convergence rate of the safety threshold. We evaluate the performance of Meta SAC-Lag in five simulated environments against Lagrangian baselines, and the results demonstrate its capability to create synergy between parameters, yielding better or competitive results. Furthermore, we conduct a real-world experiment involving a robotic arm tasked with pouring coffee into a cup without spillage. Meta SAC-Lag is successfully trained to execute the task, while minimizing effort constraints.
comment: Main text accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024, 10 pages, 4 figures, 3 tables
Time-Ordered Ad-hoc Resource Sharing for Independent Robotic Agents IROS 2024
Resource sharing is a crucial part of a multi-robot system. We propose a Boolean satisfiability based approach to resource sharing. Our key contribution is an algorithm for converting any constrained assignment to a weighted-SAT based optimization. We propose a theorem that allows optimal resource assignment problems to be solved via repeated application of a SAT solver. Additionally, we show a way to encode continuous time ordering constraints using Conjunctive Normal Form (CNF). We benchmark our new algorithms and show that they can be used in an ad-hoc setting. We test our algorithms on a fleet of simulated and real-world robots and show that the algorithms are able to handle real-world situations. Our algorithms and test harnesses are open source and build on Open-RMF's fleet management system.
comment: IROS 2024
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning
Surgical video segmentation is a critical task in computer-assisted surgery and is vital for enhancing surgical quality and patient outcomes. Recently, the Segment Anything Model 2 (SAM2) framework has shown superior advancements in image and video segmentation. However, SAM2 struggles with efficiency due to the high computational demands of processing high-resolution images and complex and long-range temporal dynamics in surgical videos. To address these challenges, we introduce Surgical SAM 2 (SurgSAM-2), an advanced model to utilize SAM2 with an Efficient Frame Pruning (EFP) mechanism, to facilitate real-time surgical video segmentation. The EFP mechanism dynamically manages the memory bank by selectively retaining only the most informative frames, reducing memory usage and computational cost while maintaining high segmentation accuracy. Our extensive experiments demonstrate that SurgSAM-2 significantly improves both efficiency and segmentation accuracy compared to the vanilla SAM2. Remarkably, SurgSAM-2 achieves a 3$\times$ FPS compared with SAM2, while also delivering state-of-the-art performance after fine-tuning with lower-resolution data. These advancements establish SurgSAM-2 as a leading model for surgical video analysis, making real-time surgical video segmentation in resource-constrained environments a feasible reality.
comment: 16 pages, 2 figures
GOReloc: Graph-based Object-Level Relocalization for Visual SLAM
This article introduces a novel method for object-level relocalization of robotic systems. It determines the pose of a camera sensor by robustly associating the object detections in the current frame with 3D objects in a lightweight object-level map. Object graphs, considering semantic uncertainties, are constructed for both the incoming camera frame and the pre-built map. Objects are represented as graph nodes, and each node employs unique semantic descriptors based on our devised graph kernels. We extract a subgraph from the target map graph by identifying potential object associations for each object detection, then refine these associations and pose estimations using a RANSAC-inspired strategy. Experiments on various datasets demonstrate that our method achieves more accurate data association and significantly increases relocalization success rates compared to baseline methods. The implementation of our method is released at \url{https://github.com/yutongwangBIT/GOReloc}.
comment: 8 pages, accepted by IEEE RAL
DM2RM: Dual-Mode Multimodal Ranking for Target Objects and Receptacles Based on Open-Vocabulary Instructions
In this study, we aim to develop a domestic service robot (DSR) that, guided by open-vocabulary instructions, can carry everyday objects to the specified pieces of furniture. Few existing methods handle mobile manipulation tasks with open-vocabulary instructions in the image retrieval setting, and most do not identify both the target objects and the receptacles. We propose the Dual-Mode Multimodal Ranking model (DM2RM), which enables images of both the target objects and receptacles to be retrieved using a single model based on multimodal foundation models. We introduce a switching mechanism that leverages a mode token and phrase identification via a large language model to switch the embedding space based on the prediction target. To evaluate the DM2RM, we construct a novel dataset including real-world images collected from hundreds of building-scale environments and crowd-sourced instructions with referring expressions. The evaluation results show that the proposed DM2RM outperforms previous approaches in terms of standard metrics in image retrieval settings. Furthermore, we demonstrate the application of the DM2RM on a standardized real-world DSR platform including fetch-and-carry actions, where it achieves a task success rate of 82% despite the zero-shot transfer setting. Demonstration videos, code, and more materials are available at https://kkrr10.github.io/dm2rm/.
Autonomous on-Demand Shuttles for First Mile-Last Mile Connectivity: Design, Optimization, and Impact Assessment
The First-Mile Last-Mile (FMLM) connectivity is crucial for improving public transit accessibility and efficiency, particularly in sprawling suburban regions where traditional fixed-route transit systems are often inadequate. Autonomous on-Demand Shuttles (AODS) offer a promising option for FMLM connections due to their cost-effectiveness and improved safety features, thereby enhancing user convenience and reducing reliance on personal vehicles. A critical issue in AODS service design is the optimization of travel paths, for which realistic traffic network assignment combined with optimal routing offers a viable solution. In this study, we have designed an AODS controller that integrates a mesoscopic simulation-based dynamic traffic assignment model with a greedy insertion heuristic to optimize the travel routes of the shuttles. The controller also considers the charging infrastructure/strategies and the impact of the shuttles on regular traffic flow for route and fleet-size planning. The controller is implemented in the Aimsun traffic simulator, considering Lake Nona in Orlando, Florida as a case study. We show that, under the present demand based on 1% of total trips as transit riders, a fleet of 3 autonomous shuttles can serve about 80% of FMLM trip requests on an on-demand basis with an average waiting time below 4 minutes. Additional power sources have a significant effect on service quality, as the inactive waiting time for charging would increase the fleet size. We also show that low-speed autonomous shuttles would have negligible impact on regular vehicle flow, making them suitable for suburban areas. These findings have important implications for sustainable urban planning and public transit operations.
comment: 25 Pages, 13 Figures, 1 Table
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
Offline reinforcement learning algorithms hold the promise of enabling data-driven RL methods that do not require costly or dangerous real-world exploration and benefit from large pre-collected datasets. This in turn can facilitate real-world applications, as well as a more standardized approach to RL research. Furthermore, offline RL methods can provide effective initializations for online finetuning to overcome challenges with exploration. However, evaluating progress on offline RL algorithms requires effective and challenging benchmarks that capture properties of real-world tasks, provide a range of task difficulties, and cover a range of challenges both in terms of the parameters of the domain (e.g., length of the horizon, sparsity of rewards) and the parameters of the data (e.g., narrow demonstration data or broad exploratory data). While considerable progress in offline RL in recent years has been enabled by simpler benchmark tasks, the most widely used datasets are increasingly saturating in performance and may fail to reflect properties of realistic tasks. We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments, based on models of real-world robotic systems, and comprising a variety of data sources, including scripted data, play-style data collected by human teleoperators, and other data sources. Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation, with some of the tasks specifically designed to require both pre-training and fine-tuning. We hope that our proposed benchmark will facilitate further progress on both offline RL and fine-tuning algorithms. Website with code, examples, tasks, and data is available at \url{https://sites.google.com/view/d5rl/}
comment: RLC 2024
Timing Analysis and Priority-driven Enhancements of ROS 2 Multi-threaded Executors
The second generation of the Robot Operating System, ROS 2, has gained much attention for its potential to be used for safety-critical robotic applications. The need to provide a solid foundation for timing correctness and scheduling mechanisms is therefore growing rapidly. Although there are some pioneering studies conducted on formally analyzing the response time of processing chains in ROS 2, the focus has been limited to single-threaded executors, and multi-threaded executors, despite their advantages, have not been studied well. To fill this knowledge gap, in this paper, we propose a comprehensive response-time analysis framework for chains running on ROS 2 multi-threaded executors. We first analyze the timing behavior of the default scheduling scheme in ROS 2 multi-threaded executors, and then present priority-driven scheduling enhancements to address the limitations of the default scheme. Our framework can analyze chains with both arbitrary and constrained deadlines and also the effect of mutually-exclusive callback groups. Evaluation is conducted by a case study on NVIDIA Jetson AGX Xavier and schedulability experiments using randomly-generated chains. The results demonstrate that our analysis framework can safely upper-bound response times under various conditions and the priority-driven scheduling enhancements not only reduce the response time of critical chains but also improve analytical bounds.
Physics-Guided Reinforcement Learning System for Realistic Vehicle Active Suspension Control
The suspension system is a crucial part of the automotive chassis, improving vehicle ride comfort and isolating passengers from rough road excitation. Unlike passive suspension, which has constant spring and damping coefficients, active suspension incorporates electronic actuators into the system to dynamically control stiffness and damping variables. However, effectively controlling the suspension system poses a challenging task that necessitates real-time adaptability to various road conditions. This paper presents the Physics-Guided Deep Reinforcement Learning (DRL) for adjusting an active suspension system's variable kinematics and compliance properties for a quarter-car model in real time. Specifically, the outputs of the model are defined as actuator stiffness and damping control, which are bound within physically realistic ranges to maintain the system's physical compliance. The proposed model was trained on stochastic road profiles according to ISO 8608 standards to optimize the actuator's control policy. According to qualitative results on simulations, the vehicle body reacts smoothly to various novel real-world road conditions, having a much lower degree of oscillation. These observations mean a higher level of passenger comfort and better vehicle stability. Quantitatively, DRL outperforms passive systems in reducing the average vehicle body velocity and acceleration by 43.58% and 17.22%, respectively, minimizing the vertical movement impacts on the passengers. The code is publicly available at github.com/anh-nn01/RL4Suspension-ICMLA23.
comment: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
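The bounded stiffness and damping outputs described in the suspension abstract can be pictured with a minimal quarter-car simulation step. This is a sketch under stated assumptions: the masses, tire stiffness, actuator ranges, and the semi-implicit Euler integrator are illustrative placeholders, not values or methods taken from the paper or its repository.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

def quarter_car_step(state, k_cmd, c_cmd, z_road, dt=0.001,
                     m_s=300.0, m_u=40.0, k_t=200000.0,
                     k_range=(10000.0, 40000.0), c_range=(500.0, 4000.0)):
    """Advance a quarter-car model by one time step.

    state: (z_s, v_s, z_u, v_u), body and wheel displacement and velocity.
    k_cmd, c_cmd: commanded suspension stiffness [N/m] and damping [N*s/m].
    z_road: road height under the tire [m].
    All default parameters are hypothetical illustration values.
    """
    z_s, v_s, z_u, v_u = state
    # Keep the commanded values inside physically plausible ranges.
    k = clip(k_cmd, *k_range)
    c = clip(c_cmd, *c_range)
    # Suspension force acting on the body (sprung mass).
    f_susp = -k * (z_s - z_u) - c * (v_s - v_u)
    a_s = f_susp / m_s
    # The wheel feels the opposite suspension force plus the tire spring.
    a_u = (-f_susp - k_t * (z_u - z_road)) / m_u
    # Semi-implicit Euler: update velocities first, then positions.
    v_s += a_s * dt
    v_u += a_u * dt
    z_s += v_s * dt
    z_u += v_u * dt
    return (z_s, v_s, z_u, v_u), (k, c)
```

A learned policy would supply `k_cmd` and `c_cmd` at every step; clipping them before they enter the dynamics is one simple way to keep the commands within a realistic actuator envelope.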
The Threats of Embodied Multimodal LLMs: Jailbreaking Robotic Manipulation in the Physical World
Embodied artificial intelligence (AI) refers to an AI system that interacts with the physical world through sensors and actuators, seamlessly integrating perception and action. This design enables AI to learn from and operate within complex, real-world environments. Large Language Models (LLMs) deeply explore language instructions, playing a crucial role in devising plans for complex tasks. Consequently, they have progressively shown immense potential in empowering embodied AI, with LLM-based embodied AI emerging as a focal point of research within the community. Over the next decade, LLM-based embodied AI robots are expected to proliferate widely, becoming commonplace in homes and industries. However, a critical safety issue that has long been hiding in plain sight is: could LLM-based embodied AI perpetrate harmful behaviors? Our research investigates for the first time how to induce threatening actions in embodied AI, confirming the severe risks posed by these soon-to-be-marketed robots, which starkly contravene Asimov's Three Laws of Robotics and threaten human safety. Specifically, we formulate the concept of embodied AI jailbreaking and expose three critical security vulnerabilities: first, jailbreaking robotics through compromised LLMs; second, safety misalignment between action and language spaces; and third, deceptive prompts leading to unaware hazardous behaviors. We also analyze potential mitigation measures and advocate for community awareness regarding the safety of embodied AI applications in the physical world.
comment: Preliminary version (17 pages, 4 figures). Work in progress, revisions ongoing. We appreciate your understanding and welcome any feedback
Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation
Optical tactile sensors have recently become popular. They provide high spatial resolution, but struggle to offer fine temporal resolutions. To overcome this shortcoming, we study the idea of replacing the RGB camera with an event-based camera and introduce a new event-based optical tactile sensor called Evetac. Along with hardware design, we develop touch processing algorithms to process its measurements online at 1000 Hz. We devise an efficient algorithm to track the elastomer's deformation through the imprinted markers despite the sensor's sparse output. Benchmarking experiments demonstrate Evetac's capabilities of sensing vibrations up to 498 Hz, reconstructing shear forces, and significantly reducing data rates compared to RGB optical tactile sensors. Moreover, Evetac's output and the marker tracking provide meaningful features for learning data-driven slip detection and prediction models. The learned models form the basis for a robust and adaptive closed-loop grasp controller capable of handling a wide range of objects. We believe that fast and efficient event-based tactile sensors like Evetac will be essential for bringing human-like manipulation capabilities to robotics. The sensor design is open-sourced at https://sites.google.com/view/evetac .
comment: Accepted at IEEE Transactions On Robotics. Project Website: https://sites.google.com/view/evetac
Deep Learning Innovations for Underwater Waste Detection: An In-Depth Analysis
Addressing the issue of submerged underwater trash is crucial for safeguarding aquatic ecosystems and preserving marine life. While identifying debris present on the surface of water bodies is straightforward, assessing submerged waste is a challenge due to the image distortions caused by factors such as light refraction, absorption, suspended particles, color shifts, and occlusion. This paper conducts a comprehensive review of state-of-the-art architectures and existing datasets to establish a baseline for submerged waste and trash detection. The primary goal remains to establish a benchmark of the object localization techniques to be leveraged by advanced underwater sensors and autonomous underwater vehicles. The ultimate objective is to explore the underwater environment and to identify and remove underwater debris. The absence of benchmarks (dataset or algorithm) in many studies emphasizes the need for a more robust algorithmic solution. Through this research, we aim to provide a comparative performance analysis of various underwater trash detection algorithms.
End-to-end Autonomous Driving: Challenges and Frontiers
The autonomous driving community has witnessed a rapid growth in approaches that embrace an end-to-end algorithm framework, utilizing raw sensor input to generate vehicle motion plans, instead of concentrating on individual tasks such as detection and motion prediction. End-to-end systems, in comparison to modular pipelines, benefit from joint feature optimization for perception and planning. This field has flourished due to the availability of large-scale datasets, closed-loop evaluation, and the increasing need for autonomous driving algorithms to perform effectively in challenging scenarios. In this survey, we provide a comprehensive analysis of more than 270 papers, covering the motivation, roadmap, methodology, challenges, and future trends in end-to-end autonomous driving. We delve into several critical challenges, including multi-modality, interpretability, causal confusion, robustness, and world models, amongst others. Additionally, we discuss current advancements in foundation models and visual pre-training, as well as how to incorporate these techniques within the end-to-end driving framework. We maintain an active repository that contains up-to-date literature and open-source projects at https://github.com/OpenDriveLab/End-to-end-Autonomous-Driving.
comment: Accepted by IEEE TPAMI
A Survey on Integration of Large Language Models with Intelligent Robots
In recent years, the integration of large language models (LLMs) has revolutionized the field of robotics, enabling robots to communicate, understand, and reason with human-like proficiency. This paper explores the multifaceted impact of LLMs on robotics, addressing key challenges and opportunities for leveraging these models across various domains. By categorizing and analyzing LLM applications within core robotics elements -- communication, perception, planning, and control -- we aim to provide actionable insights for researchers seeking to integrate LLMs into their robotic systems. Our investigation focuses on LLMs developed post-GPT-3.5, primarily in text-based modalities while also considering multimodal approaches for perception and control. We offer comprehensive guidelines and examples for prompt engineering, facilitating beginners' access to LLM-based robotics solutions. Through tutorial-level examples and structured prompt construction, we illustrate how LLM-guided enhancements can be seamlessly integrated into robotics applications. This survey serves as a roadmap for researchers navigating the evolving landscape of LLM-driven robotics, offering a comprehensive overview and practical guidance for harnessing the power of language models in robotics development.
comment: 24 pages, 5 figures, Published in Intelligent Service Robotics (ISR)
Chance-Constrained Information-Theoretic Stochastic Model Predictive Control with Safety Shielding
This paper introduces a novel nonlinear stochastic model predictive control path integral (MPPI) method, which considers chance constraints on system states. The proposed belief-space stochastic MPPI (BSS-MPPI) applies Monte-Carlo sampling to evaluate state distributions resulting from underlying systematic disturbances, and utilizes a Control Barrier Function (CBF) inspired heuristic in belief space to fulfill the specified chance constraints. Compared to several previous stochastic predictive control methods, our approach applies to general nonlinear dynamics without requiring the computationally expensive system linearization step. Moreover, the BSS-MPPI controller can solve optimization problems without limiting the form of the objective function and chance constraints. By multi-threading the sampling process using a GPU, we can achieve fast real-time planning for time- and safety-critical tasks such as autonomous racing. Our results on a realistic race-car simulation study show significant reductions in constraint violation compared to some of the prior MPPI approaches, while being comparable in computation times.
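For readers unfamiliar with MPPI, the core sampling-based update can be sketched as below: sampled control perturbations are averaged with weights that decay exponentially in rollout cost. This is the generic information-theoretic MPPI update under simplifying assumptions (scalar controls, precomputed rollout costs); it deliberately omits the belief-space chance constraints and the CBF-inspired heuristic that distinguish BSS-MPPI. The function name and the temperature `lam` are hypothetical.

```python
import math

def mppi_update(nominal, noises, costs, lam=1.0):
    """One generic MPPI control update.

    nominal: list of T nominal (scalar) controls.
    noises: K lists of T sampled control perturbations.
    costs: K total rollout costs, one per perturbation sequence.
    lam: temperature; smaller values concentrate weight on low-cost rollouts.
    Returns (updated controls, normalized weights).
    """
    best = min(costs)
    # Subtracting the minimum cost keeps the exponentials numerically stable.
    unnorm = [math.exp(-(s - best) / lam) for s in costs]
    total = sum(unnorm)
    weights = [w / total for w in unnorm]
    # Shift each control by the weighted average of its sampled perturbations.
    updated = [
        u + sum(w * eps[t] for w, eps in zip(weights, noises))
        for t, u in enumerate(nominal)
    ]
    return updated, weights
```

Because each rollout is evaluated independently, the cost computation that feeds `costs` is the part that parallelizes naturally on a GPU.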
Lane Graph as Path: Continuity-preserving Path-wise Modeling for Online Lane Graph Construction ECCV 2024
Online lane graph construction is a promising but challenging task in autonomous driving. Previous methods usually model the lane graph at the pixel or piece level, and recover the lane graph by pixel-wise or piece-wise connection, which breaks down the continuity of the lane and results in suboptimal performance. Human drivers focus on and drive along the continuous and complete paths instead of considering lane pieces. Autonomous vehicles also require path-specific guidance from lane graph for trajectory planning. We argue that the path, which indicates the traffic flow, is the primitive of the lane graph. Motivated by this, we propose to model the lane graph in a novel path-wise manner, which well preserves the continuity of the lane and encodes traffic information for planning. We present a path-based online lane graph construction method, termed LaneGAP, which end-to-end learns the path and recovers the lane graph via a Path2Graph algorithm. We qualitatively and quantitatively demonstrate the superior accuracy and efficiency of LaneGAP over conventional pixel-based and piece-based methods on the challenging nuScenes and Argoverse2 datasets under controllable and fair conditions. Compared to the recent state-of-the-art piece-wise method TopoNet on the OpenLane-V2 dataset, LaneGAP still outperforms by 1.6 mIoU, further validating the effectiveness of path-wise modeling. Abundant visualizations in the supplementary material show LaneGAP can cope with diverse traffic conditions. Code is released at https://github.com/hustvl/LaneGAP.
comment: Accepted to ECCV 2024
FedRobo: Federated Learning Driven Autonomous Inter Robots Communication For Optimal Chemical Sprays
Federated Learning enables robots to learn from each other's experiences without relying on centralized data collection. Each robot independently maintains a model of crop conditions and chemical spray effectiveness, which is periodically shared with other robots in the fleet. A communication protocol is designed to optimize chemical spray applications by facilitating the exchange of information about crop conditions, weather, and other critical factors. The federated learning algorithm leverages this shared data to continuously refine the chemical spray strategy, reducing waste and improving crop yields. This approach has the potential to revolutionize the agriculture industry by offering a scalable and efficient solution for crop protection. However, significant challenges remain, including the development of a secure and robust communication protocol, the design of a federated learning algorithm that effectively integrates data from multiple sources, and ensuring the safety and reliability of autonomous robots. The proposed cluster-based federated learning approach also effectively reduces the computational load on the global server and minimizes communication overhead among clients.
comment: This research article is going to be submitted to a best-fit conference. We are looking for a conference
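The idea of robots sharing models rather than raw data can be illustrated with sample-weighted federated averaging (FedAvg). This is a generic sketch, assuming each robot's model is a flat list of parameters; it does not reproduce the cluster-based scheme or the communication protocol proposed in the paper, and the function name is hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Sample-weighted average of client model parameters (FedAvg-style).

    client_weights: one flat parameter list per robot, all the same length.
    client_sizes: number of local training samples held by each robot.
    Returns the aggregated parameter list.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        # Robots with more local data contribute proportionally more.
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```

For example, two robots with parameters `[1.0, 2.0]` and `[3.0, 6.0]` and local dataset sizes 1 and 3 aggregate to `[2.5, 5.0]`.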
MMP++: Motion Manifold Primitives with Parametric Curve Models
Motion Manifold Primitives (MMP), a manifold-based approach for encoding basic motion skills, can produce diverse trajectories, enabling the system to adapt to unseen constraints. Nonetheless, we argue that current MMP models lack crucial functionalities of movement primitives, such as temporal and via-points modulation, found in traditional approaches. This shortfall primarily stems from MMP's reliance on discrete-time trajectories. To overcome these limitations, we introduce Motion Manifold Primitives++ (MMP++), a new model that integrates the strengths of both MMP and traditional methods by incorporating parametric curve representations into the MMP framework. Furthermore, we identify a significant challenge with MMP++: performance degradation due to geometric distortions in the latent space, meaning that similar motions are not closely positioned. To address this, Isometric Motion Manifold Primitives++ (IMMP++) is proposed to ensure the latent space accurately preserves the manifold's geometry. Our experimental results across various applications, including 2-DoF planar motions, 7-DoF robot arm motions, and SE(3) trajectory planning, show that MMP++ and IMMP++ outperform existing methods in trajectory generation tasks, achieving substantial improvements in some cases. Moreover, they enable the modulation of latent coordinates and via-points, thereby allowing efficient online adaptation to dynamic environments.
comment: 15 pages. The paper will appear in the IEEE Transactions on Robotics
Three-dimensional Morphological Reconstruction of Millimeter-Scale Soft Continuum Robots based on Dual-Stereo-Vision
Continuum robots can be miniaturized to just a few millimeters in diameter. Among these, notched tubular continuum robots (NTCR) show great potential in many delicate applications. Existing works in robotic modeling focus on kinematics and dynamics but still face challenges in reproducing the robot's morphology -- a significant factor that can expand the research landscape of continuum robots, especially for those with asymmetric continuum structures. This paper proposes a dual stereo vision-based method for the three-dimensional morphological reconstruction of millimeter-scale NTCRs. The method employs two oppositely located stationary binocular cameras to capture the point cloud of the NTCR, then utilizes predefined geometry as a reference for the KD tree method to relocate the captured point clouds, resulting in a morphologically correct NTCR despite the low-quality raw point cloud collection. The method has been proven feasible for an NTCR with a 3.5 mm diameter, capturing 14 out of 16 notch features, with the measurements generally centered around the standard of 1.5 mm, demonstrating the capability of revealing morphological details. Our proposed method paves the way for 3D morphological reconstruction of millimeter-scale soft robots for further self-modeling study.
comment: 6 pages, 6 figures, submitted to Robio 2024
Touch-GS: Visual-Tactile Supervised 3D Gaussian Splatting
In this work, we propose a novel method to supervise 3D Gaussian Splatting (3DGS) scenes using optical tactile sensors. Optical tactile sensors have become widespread in their use in robotics for manipulation and object representation; however, raw optical tactile sensor data is unsuitable to directly supervise a 3DGS scene. Our representation leverages a Gaussian Process Implicit Surface to implicitly represent the object, combining many touches into a unified representation with uncertainty. We merge this model with a monocular depth estimation network, which is aligned in a two stage process, coarsely aligning with a depth camera and then finely adjusting to match our touch data. For every training image, our method produces a corresponding fused depth and uncertainty map. Utilizing this additional information, we propose a new loss function, variance weighted depth supervised loss, for training the 3DGS scene model. We leverage the DenseTact optical tactile sensor and RealSense RGB-D camera to show that combining touch and vision in this manner leads to quantitatively and qualitatively better results than vision or touch alone in a few-view scene syntheses on opaque as well as on reflective and transparent objects. Please see our project page at http://armlabstanford.github.io/touch-gs
comment: 8 pages, 7 figures
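One plausible reading of a "variance weighted depth supervised loss" is a squared depth error scaled by the inverse of the fused per-pixel variance, so that uncertain depth estimates pull less on the scene model. The abstract does not give the exact formula, so the form below, the `eps` stabilizer, and the function name are assumptions made for illustration only.

```python
def variance_weighted_depth_loss(pred, target, variance, eps=1e-6):
    """Mean squared depth error with inverse-variance weighting (assumed form).

    pred: rendered depths, one per pixel.
    target: fused depths from touch and vision, one per pixel.
    variance: fused depth variance per pixel; larger means less trusted.
    eps: small constant that avoids division by zero.
    """
    terms = [(p - d) ** 2 / (v + eps) for p, d, v in zip(pred, target, variance)]
    return sum(terms) / len(terms)
```

Under this form, quadrupling the variance of every pixel reduces the loss by roughly a factor of four, which is the intended down-weighting of unreliable depth.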
Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback
Planning algorithms decompose complex problems into intermediate steps that can be sequentially executed by robots to complete tasks. Recent works have employed Large Language Models (LLMs) for task planning, using natural language to generate robot policies in both simulation and real-world environments. LLMs like GPT-4 have shown promising results in generalizing to unseen tasks, but their applicability is limited due to hallucinations caused by insufficient grounding in the robot environment. The robustness of LLMs in task planning can be enhanced with environmental state information and feedback. In this paper, we introduce a novel approach to task planning that utilizes two separate LLMs for high-level planning and low-level control, improving task-related success rates and goal condition recall. Our algorithm, BrainBody-LLM, draws inspiration from the human neural system, emulating its brain-body architecture by dividing planning across two LLMs in a structured, hierarchical manner. BrainBody-LLM implements a closed-loop feedback mechanism, enabling learning from simulator errors to resolve execution errors in complex settings. We demonstrate the successful application of BrainBody-LLM in the VirtualHome simulation environment, achieving a 29% improvement in task-oriented success rates over competitive baselines with the GPT-4 backend. Additionally, we evaluate our algorithm on seven complex tasks using a realistic physics simulator and the Franka Research 3 robotic arm, comparing it with various state-of-the-art LLMs. Our results show advancements in the reasoning capabilities of recent LLMs, which enable them to learn from raw simulator/controller errors to correct plans, making them highly effective in robotic task planning.
comment: This work has been submitted to Autonomous Robots
IPC: Incremental Probabilistic Consensus-based Consistent Set Maximization for SLAM Backends ICRA
In SLAM (Simultaneous localization and mapping) problems, Pose Graph Optimization (PGO) is a technique to refine an initial estimate of a set of poses (positions and orientations) from a set of pairwise relative measurements. The optimization procedure can be negatively affected even by a single outlier measurement, with possible catastrophic and meaningless results. Although recent works on robust optimization aim to mitigate the presence of outlier measurements, robust solutions capable of handling large numbers of outliers are yet to come. This paper presents IPC, acronym for Incremental Probabilistic Consensus, a method that approximates the solution to the combinatorial problem of finding the maximally consistent set of measurements in an incremental fashion. It evaluates the consistency of each loop closure measurement through a consensus-based procedure, possibly applied to a subset of the global problem, where all previously integrated inlier measurements have veto power. We evaluated IPC on standard benchmarks against several state-of-the-art methods. Although it is simple and relatively easy to implement, IPC competes with or outperforms the other tested methods in handling outliers while providing online performances. We release with this paper an open-source implementation of the proposed method.
comment: This paper has been accepted for publication at the 2024 IEEE International Conference on Robotics and Automation (ICRA)
Osprey: Multi-Session Autonomous Aerial Mapping with LiDAR-based SLAM and Next Best View Planning
Aerial mapping systems are important for many surveying applications (e.g., industrial inspection or agricultural monitoring). Aerial platforms that can fly GPS-guided preplanned missions semi-autonomously are already widely available but fully autonomous systems can significantly improve efficiency. Autonomously mapping complex 3D structures requires a system that performs online mapping and mission planning. This paper presents Osprey, an autonomous aerial mapping system with state-of-the-art multi-session LiDAR-based mapping capabilities. It enables a non-expert operator to specify a bounded target area that the aerial platform can then map autonomously over multiple flights. Field experiments with Osprey demonstrate that this system can achieve greater map coverage of large industrial sites than manual surveys with a pilot-flown aerial platform or a terrestrial laser scanner (TLS). Three sites, with a total ground coverage of $2528$ m$^2$ and a maximum height of $27$ m, were mapped in separate missions using $112$ minutes of autonomous flight time. True colour maps were created from images captured by Osprey using pointcloud and NeRF reconstruction methods. These maps provide useful data for structural inspection tasks.
comment: 25 pages, 15 figures, 3 tables. Video available at https://www.youtube.com/watch?v=CVIXu2qUQJ8 Dataset available at https://dynamic.robots.ox.ac.uk/datasets/oxford-osprey
Personalizing Federated Instrument Segmentation with Visual Trait Priors in Robotic Surgery
Personalized federated learning (PFL) for surgical instrument segmentation (SIS) is a promising approach. It enables multiple clinical sites to collaboratively train a series of models in privacy, with each model tailored to the individual distribution of each site. Existing PFL methods rarely consider the personalization of multi-headed self-attention, and do not account for appearance diversity and instrument shape similarity, both inherent in surgical scenes. We thus propose PFedSIS, a novel PFL method with visual trait priors for SIS, incorporating global-personalized disentanglement (GPD), appearance-regulation personalized enhancement (APE), and shape-similarity global enhancement (SGE), to boost SIS performance in each site. GPD represents the first attempt at head-wise assignment for multi-headed self-attention personalization. To preserve the unique appearance representation of each site and gradually leverage the inter-site difference, APE introduces appearance regulation and provides customized layer-wise aggregation solutions via hypernetworks for each site's personalized parameters. The mutual shape information of instruments is maintained and shared via SGE, which enhances the cross-style shape consistency on the image level and computes the shape-similarity contribution of each site on the prediction level for updating the global parameters. PFedSIS outperforms state-of-the-art methods with +1.51% Dice, +2.11% IoU, -2.79 ASSD, -15.55 HD95 performance gains. The corresponding code and models will be released at https://github.com/wzjialang/PFedSIS.
comment: 9 pages, 3 figures, under review
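The Dice and IoU figures quoted in the segmentation abstract are standard overlap metrics. A minimal sketch for flattened binary masks follows; the function name and the convention of returning 1.0 for two empty masks are assumptions, and the distance-based metrics (ASSD, HD95) are not covered.

```python
def dice_and_iou(pred, truth):
    """Dice coefficient and IoU for two flattened binary masks.

    pred, truth: equal-length sequences of 0/1 (or False/True) labels.
    Returns (dice, iou); both are 1.0 when both masks are empty.
    """
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_sum = sum(1 for p in pred if p)
    truth_sum = sum(1 for t in truth if t)
    union = pred_sum + truth_sum - inter
    if union == 0:
        return 1.0, 1.0  # nothing to segment and nothing predicted
    dice = 2 * inter / (pred_sum + truth_sum)
    iou = inter / union
    return dice, iou
```

The two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), which is why gains in one usually appear alongside gains in the other.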
Self-organizing Multiagent Target Enclosing under Limited Information and Safety Guarantees
This paper introduces an approach to address the target enclosing problem using non-holonomic multiagent systems, where agents self-organize on the enclosing shape around a fixed target. In our approach, agents independently move toward the desired enclosing geometry when apart and activate the collision avoidance mechanism when a collision is imminent, thereby guaranteeing inter-agent safety. Our approach combines global enclosing behavior and local collision avoidance mechanisms by devising a special potential function and sliding manifold. We rigorously show that an agent does not need to ensure safety with every other agent and put forth a concept of the nearest colliding agent (for any arbitrary agent) with whom ensuring safety is sufficient to avoid collisions in the entire swarm. The proposed control eliminates the need for a fixed or pre-established agent arrangement around the target and requires only relative information between an agent and the target. This makes our design particularly appealing for scenarios with limited global information, hence significantly reducing communication requirements. We finally present simulation results to vindicate the efficacy of the proposed method.
Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes
Reinforcement learning (RL), particularly its combination with deep neural networks referred to as deep RL (DRL), has shown tremendous promise across a wide range of applications, suggesting its potential for enabling the development of sophisticated robotic behaviors. Robotics problems, however, pose fundamental difficulties for the application of RL, stemming from the complexity and cost of interacting with the physical world. This article provides a modern survey of DRL for robotics, with a particular focus on evaluating the real-world successes achieved with DRL in realizing several key robotic competencies. Our analysis aims to identify the key factors underlying those exciting successes, reveal underexplored areas, and provide an overall characterization of the status of DRL in robotics. We highlight several important avenues for future work, emphasizing the need for stable and sample-efficient real-world RL paradigms, holistic approaches for discovering and integrating various competencies to tackle complex long-horizon, open-world tasks, and principled development and evaluation procedures. This survey is designed to offer insights for both RL practitioners and roboticists toward harnessing RL's power to create generally capable real-world robotic systems.
comment: The first three authors contributed equally. Accepted to Annual Review of Control, Robotics, and Autonomous Systems
Enhancing Reinforcement Learning in Sensor Fusion: A Comparative Analysis of Cubature and Sampling-based Integration Methods for Rover Search Planning IROS 2024
This study investigates the computational speed and accuracy of two numerical integration methods, cubature and sampling-based, for integrating an integrand over a 2D polygon. Using a group of rovers searching the Martian surface with a limited sensor footprint as a test bed, the relative error and computational time are compared as the area was subdivided to improve accuracy in the sampling-based approach. The results show that the sampling-based approach exhibits a $14.75\%$ deviation in relative error compared to cubature when it matches the computational performance at $100\%$. Furthermore, achieving a relative error below $1\%$ necessitates a $10000\%$ increase in relative time to calculate due to the $\mathcal{O}(N^2)$ complexity of the sampling-based method. It is concluded that for enhancing reinforcement learning capabilities and other high iteration algorithms, the cubature method is preferred over the sampling-based method.
comment: Submitted to IROS 2024
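The trade-off between cubature and sampling can be made concrete with a small sketch. It assumes a convex polygon with counter-clockwise vertices, uses the 3-point edge-midpoint rule on a fan triangulation as the cubature (exact for polynomials up to degree 2), and uses an n × n grid of cell centers as the sampling scheme, so its cost grows quadratically in n. The paper's actual rules, polygons, and integrand are not specified in the abstract; every name here is hypothetical.

```python
def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0

def cubature_convex_polygon(f, verts):
    """Integrate f(x, y) over a convex polygon via fan triangulation
    and the 3-point edge-midpoint rule on each triangle."""
    total = 0.0
    a = verts[0]
    for b, c in zip(verts[1:-1], verts[2:]):
        mids = [((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
                for p, q in ((a, b), (b, c), (c, a))]
        total += triangle_area(a, b, c) * sum(f(x, y) for x, y in mids) / 3.0
    return total

def inside_convex(pt, verts):
    """True if pt is inside or on a counter-clockwise convex polygon."""
    n = len(verts)
    for i in range(n):
        x1, y1 = verts[i]
        x2, y2 = verts[(i + 1) % n]
        if (x2 - x1) * (pt[1] - y1) - (y2 - y1) * (pt[0] - x1) < 0:
            return False
    return True

def grid_sampling(f, verts, n):
    """Integrate f over the polygon by summing f at the centers of an
    n x n grid laid over the bounding box (n * n evaluations)."""
    xs = [v[0] for v in verts]
    ys = [v[1] for v in verts]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    dx, dy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = (x0 + (i + 0.5) * dx, y0 + (j + 0.5) * dy)
            if inside_convex(p, verts):
                total += f(*p) * dx * dy
    return total
```

For a quadratic integrand the cubature needs only three evaluations per triangle, while the grid needs many thousands of evaluations to approach the same value, mainly because of cells cut by the polygon boundary.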
Robotics
NeuroEvolution algorithms applied in the designing process of biohybrid actuators
Soft robots diverge from traditional rigid robotics, offering unique advantages in adaptability, safety, and human-robot interaction. In some cases, soft robots can be powered by biohybrid actuators and the design process of these systems is far from straightforward. We analyse here two algorithms that may assist the design of these systems, namely, NEAT (NeuroEvolution of Augmented Topologies) and HyperNEAT (Hypercube-based NeuroEvolution of Augmented Topologies). These algorithms exploit the evolution of the structure of actuators encoded through neural networks. To evaluate these algorithms, we compare them with a similar approach using the Age Fitness Pareto Optimization (AFPO) algorithm, with a focus on assessing the maximum displacement achieved by the discovered biohybrid morphologies. Additionally, we investigate the effects of optimization against both the volume of these morphologies and the distance they can cover. To further accelerate the computational process, the proposed methodology is implemented in a client-server setting; so, the most demanding calculations can be executed on specialized and efficient hardware. The results indicate that the HyperNEAT-based approach excels in identifying morphologies with minimal volumes that still achieve satisfactory displacement targets.
SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning SC
This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same scenarios seen during training. Various methods have been proposed to address these challenges, including experience replay and regularization. However, how observation design in RL affects sample efficiency and generalization remains an under-explored area. We address this gap by proposing five strategies to design information-dense observations, focusing on general features that are applicable to most traffic scenarios. We train our RL agents using these strategies on an intersection and evaluate their generalization through numerical experiments across completely unseen traffic scenarios, including a new intersection, an on-ramp, and a roundabout. Incorporating these information-dense observations reduces training times to under one hour on a single CPU, and the evaluation results reveal that our RL agents can effectively zero-shot generalize. Code: github.com/cas-lab-munich/SigmaRL
comment: 8 pages, 5 figures, accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024
Non-Gaited Legged Locomotion with Monte-Carlo Tree Search and Supervised Learning
Legged robots are able to navigate complex terrains by continuously interacting with the environment through careful selection of contact sequences and timings. However, the combinatorial nature of contact planning hinders the applicability of such optimization problems on hardware. In this work, we present a novel approach that optimizes gait sequences and respective timings for legged robots in the context of optimization-based controllers through the use of sampling-based methods and supervised learning techniques. We propose to bootstrap the search by learning an optimal value function in order to speed up the gait planning procedure, making it applicable in real time. To validate our proposed method, we showcase its performance both in simulation and on hardware using a 22 kg electric quadruped robot. The method is assessed on different terrains, under external perturbations, and in comparison to a standard control approach where the gait sequence is fixed a priori.
Object Augmentation Algorithm: Computing virtual object motion and object induced interaction wrench from optical markers IROS 2024
This study addresses the critical need for diverse and comprehensive data focused on human arm joint torques while performing activities of daily living (ADL). Previous studies have often overlooked the influence of objects on joint torques during ADL, resulting in limited datasets for analysis. To address this gap, we propose an Object Augmentation Algorithm (OAA) capable of augmenting existing marker-based databases with virtual object motions and object-induced joint torque estimations. The OAA consists of five phases: (1) computing hand coordinate systems from optical markers, (2) characterising object movements with virtual markers, (3) calculating object motions through inverse kinematics (IK), (4) determining the wrench necessary for prescribed object motion using inverse dynamics (ID), and (5) computing joint torques resulting from object manipulation. The algorithm's accuracy is validated through trajectory tracking and torque analysis on a 7+4 degree of freedom (DoF) robotic hand-arm system, manipulating three unique objects. The results show that the OAA can accurately and precisely estimate 6 DoF object motion and object-induced joint torques. Correlations between computed and measured quantities were > 0.99 for object trajectories and > 0.93 for joint torques. The OAA was further shown to be robust to variations in the number and placement of input markers, which are expected between databases. Differences between repeated experiments were minor but significant (p < 0.05). The algorithm expands the scope of available data and facilitates more comprehensive analyses of human-object interaction dynamics.
comment: An open source implementation of the described algorithm is available at https://github.com/ChristopherHerneth/ObjectAugmentationAlgorithm/tree/main. Accompanying video material may be found here https://youtu.be/8oz-awvyNRA. The article was accepted at IROS 2024
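Phase (4) of the OAA, determining the wrench needed for a prescribed object motion, can be illustrated with the standard Newton-Euler equations for a rigid body. The sketch below is not taken from the authors' implementation; the function name `object_wrench`, the argument names, and the convention that all quantities are expressed in one common frame at the object's centre of mass are assumptions made for illustration.

```python
# Illustrative Newton-Euler sketch (assumed names and frame convention,
# not the authors' code): wrench required to produce a prescribed motion.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def mat_vec(m, v):
    """Product of a 3x3 matrix (nested lists) and a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def object_wrench(mass, inertia, lin_acc, ang_vel, ang_acc,
                  gravity=(0.0, 0.0, -9.81)):
    """Return (force, torque) that must act on the object.

    force  = m * (a - g)
    torque = I * alpha + omega x (I * omega)
    """
    force = [mass * (lin_acc[k] - gravity[k]) for k in range(3)]
    i_alpha = mat_vec(inertia, ang_acc)
    gyroscopic = cross(ang_vel, mat_vec(inertia, ang_vel))
    torque = [i_alpha[k] + gyroscopic[k] for k in range(3)]
    return force, torque


# Hypothetical 2 kg object with a diagonal inertia tensor.
inertia = [[0.01, 0.0, 0.0],
           [0.0, 0.02, 0.0],
           [0.0, 0.0, 0.03]]

# Holding the object still: the wrench only has to cancel gravity.
static_force, static_torque = object_wrench(
    2.0, inertia, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])

# Constant spin about a non-principal axis: a gyroscopic torque appears.
_, spin_torque = object_wrench(
    2.0, inertia, [0.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.0, 0.0, 0.0])
```

In a pipeline like the one described, a wrench of this kind would then be mapped to joint torques in phase (5); that mapping depends on the arm model and is not sketched here.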
Enhanced Optimization Strategies to Design an Underactuated Hand Exoskeleton
Exoskeletons can boost human strength and provide assistance to individuals with physical disabilities. However, ensuring safety and optimal performance in their design poses substantial challenges. This study presents the design process for an underactuated hand exoskeleton (U-HEx), first with a single objective (maximizing force transmission), then expanding to multiple objectives (also minimizing torque variance and actuator displacement). The optimization relies on a Genetic Algorithm, the Big Bang-Big Crunch Algorithm, and their versions for multi-objective optimization. Analyses revealed that using Big Bang-Big Crunch provides high and more consistent results in terms of optimality, with lower convergence time. In addition, adding more objectives offers a variety of trade-off solutions to the designers, who might later set priorities for the objectives without repeating the process, at the cost of complicating the optimization algorithm and increasing the computational burden. These findings underline the importance of performing proper optimization while designing exoskeletons, as well as providing a significant improvement to this specific robotic design.
comment: 12 pages, 7 figures, 8 tables, submitted to IEEE Transactions on Robotics
Risk Occupancy: A New and Efficient Paradigm through Vehicle-Road-Cloud Collaboration
This study introduces the 4D Risk Occupancy within a vehicle-road-cloud architecture, integrating the road surface spatial, risk, and temporal dimensions, and endowing the algorithm with beyond-line-of-sight, all-angles, and efficient abilities. The algorithm simplifies risk modeling by focusing on directly observable information and key factors, drawing on the concept of Occupancy Grid Maps (OGM), and incorporating temporal prediction to effectively map current and future risk occupancy. Compared to conventional driving risk fields and grid occupancy maps, this algorithm can map global risks more efficiently, simply, and reliably. It can integrate future risk information, adapting to dynamic traffic environments. The 4D Risk Occupancy also unifies the expression of BEV detection and lane line detection results, enhancing the intuitiveness and unity of environmental perception. Using DAIR-V2X data, this paper validates the 4D Risk Occupancy algorithm and develops a local path planning model based on it. Qualitative experiments under various road conditions demonstrate the practicality and robustness of this local path planning model. Quantitative analysis shows that the path planning based on risk occupation significantly improves trajectory planning performance, increasing safety redundancy by 12.5% and reducing average deceleration by 5.41% at an initial braking speed of 8 m/s, thereby improving safety and comfort. This work provides a new global perception method and local path planning method through Vehicle-Road-Cloud architecture, offering a new perceptual paradigm for achieving safer and more efficient autonomous driving.
comment: 13 pages, 9 figures
Narrowing your FOV with SOLiD: Spatially Organized and Lightweight Global Descriptor for FOV-constrained LiDAR Place Recognition
We often encounter limited FOV situations due to various factors such as sensor fusion or sensor mount in real-world robot navigation. However, the limited FOV interrupts the generation of descriptions and impacts place recognition adversely. As a result, correcting accumulated drift errors to maintain a consistent map using LiDAR-based place recognition becomes difficult under a limited FOV. Thus, in this paper, we propose a robust LiDAR-based place recognition method for handling narrow FOV scenarios. The proposed method establishes spatial organization based on the range-elevation bin and azimuth-elevation bin to represent places. In addition, we achieve a robust place description through reweighting based on vertical direction information. Based on these representations, our method enables addressing rotational changes and determining the initial heading. Additionally, we designed a lightweight and fast approach for the robot's onboard autonomy. For rigorous validation, the proposed method was tested across various LiDAR place recognition scenarios (i.e., single-session, multi-session, and multi-robot scenarios). To the best of our knowledge, we report the first method to cope with the restricted FOV. Our place description and SLAM codes will be released. Also, the supplementary materials of our descriptor are available at https://sites.google.com/view/lidar-solid.
comment: IEEE Robotics and Automation Letters (2024)
The Design of Autonomous UAV Prototypes for Inspecting Tunnel Construction Environment
This article presents novel designs of autonomous UAV prototypes specifically developed for inspecting GPS-denied tunnel construction environments with dynamic human and robotic presence. Our UAVs integrate advanced sensor suites and robust motion planning algorithms to autonomously navigate and explore these complex environments. We validated our approach through comprehensive simulation experiments in PX4 Gazebo and Airsim Unreal Engine 4 environments. Real-world wind tests and exploration experiments demonstrate the UAVs' capability to operate stably under diverse environmental conditions without GPS assistance. This study highlights the practicality and resilience of our UAV prototypes in real-world applications.
comment: Autonomous UAV, Tunnel Inspection, GPS-Denied Environment, Safe Trajectory Planning, Real-World Testing
Enhanced Scale-aware Depth Estimation for Monocular Endoscopic Scenes with Geometric Modeling
Scale-aware monocular depth estimation poses a significant challenge in computer-aided endoscopic navigation. However, existing depth estimation methods that do not consider the geometric priors struggle to learn the absolute scale from training with monocular endoscopic sequences. Additionally, conventional methods face difficulties in accurately estimating details on tissue and instruments boundaries. In this paper, we tackle these problems by proposing a novel enhanced scale-aware framework that only uses monocular images with geometric modeling for depth estimation. Specifically, we first propose a multi-resolution depth fusion strategy to enhance the quality of monocular depth estimation. To recover the precise scale between relative depth and real-world values, we further calculate the 3D poses of instruments in the endoscopic scenes by algebraic geometry based on the image-only geometric primitives (i.e., boundaries and tip of instruments). Afterwards, the 3D poses of surgical instruments enable the scale recovery of relative depth maps. By coupling scale factors and relative depth estimation, the scale-aware depth of the monocular endoscopic scenes can be estimated. We evaluate the pipeline on in-house endoscopic surgery videos and simulated data. The results demonstrate that our method can learn the absolute scale with geometric modeling and accurately estimate scale-aware depth for monocular scenes.
Complementarity-Free Multi-Contact Modeling and Optimization for Dexterous Manipulation
A significant barrier preventing model-based methods from matching the high performance of reinforcement learning in dexterous manipulation is the inherent complexity of multi-contact dynamics. Traditionally formulated using complementarity models, multi-contact dynamics introduces combinatorial complexity and non-smoothness, complicating contact-rich planning and control. In this paper, we circumvent these challenges by introducing a novel, simplified multi-contact model. Our new model, derived from the duality of optimization-based contact models, dispenses with the complementarity constructs entirely, providing computational advantages such as explicit time stepping, differentiability, automatic satisfaction of Coulomb friction law, and minimal hyperparameter tuning. We demonstrate the effectiveness and efficiency of the model for planning and control in a range of challenging dexterous manipulation tasks, including fingertip 3D in-air manipulation, TriFinger in-hand manipulation, and Allegro hand on-palm reorientation, all with diverse objects. Our method consistently achieves state-of-the-art results: (I) a 96.5% average success rate across tasks, (II) high manipulation accuracy with an average reorientation error of 11° and position error of 7.8 mm, and (III) model predictive control running at 50-100 Hz for all tested dexterous manipulation tasks. These results are achieved with minimal hyperparameter tuning.
comment: Video demo: https://youtu.be/NsL4hbSXvFg
From Decision to Action in Surgical Autonomy: Multi-Modal Large Language Models for Robot-Assisted Blood Suction
The rise of Large Language Models (LLMs) has impacted research in robotics and automation. While progress has been made in integrating LLMs into general robotics tasks, a noticeable void persists in their adoption in more specific domains such as surgery, where critical factors such as reasoning, explainability, and safety are paramount. Achieving autonomy in robotic surgery, which entails the ability to reason and adapt to changes in the environment, remains a significant challenge. In this work, we propose a multi-modal LLM integration in robot-assisted surgery for autonomous blood suction. The reasoning and prioritization are delegated to the higher-level task-planning LLM, and the motion planning and execution are handled by the lower-level deep reinforcement learning model, creating a distributed agency between the two components. As surgical operations are highly dynamic and may encounter unforeseen circumstances, blood clots and active bleeding were introduced to influence decision-making. Results showed that using a multi-modal LLM as a higher-level reasoning unit can account for these surgical complexities to achieve a level of reasoning previously unattainable in robot-assisted surgeries. These findings demonstrate the potential of multi-modal LLMs to significantly enhance contextual understanding and decision-making in robotic-assisted surgeries, marking a step toward autonomous surgical systems.
Exoskeleton-Assisted Balance and Task Evaluation During Quiet Stance and Kneeling in Construction
Construction workers exert intense physical effort and experience serious safety and health risks in hazardous working environments. Quiet stance and kneeling are among the most common postures performed by construction workers during their daily work. This paper analyzes lower-limb joint influence on neural balance control strategies using the frequency behavior of the intersection point of ground reaction forces. To evaluate the impact of elevation and wearable knee exoskeletons on postural balance and welding task performance, we design and integrate virtual- and mixed-reality (VR/MR) to simulate elevated environments and welding tasks. A linear quadratic regulator-controlled triple- and double-link inverted pendulum model is used for balance strategy quantification in quiet stance and kneeling, respectively. Extensive multi-subject experiments are conducted to evaluate the usability of occupational exoskeletons in destabilizing construction environments. The quantified balance strategies capture the significance of the knee joint during balance control of quiet stance and kneeling gaits. Results show that the center of pressure sway area was reduced by up to 62% in quiet stance and 39% in kneeling for subjects tested in high-elevation VR/MR worksites when provided with knee exoskeleton assistance. The comprehensive balance and multitask evaluation methodology developed aims to reveal exoskeleton design considerations to mitigate the fall risk in construction.
comment: 14 pages, 15 figures, submitted to IEEE Transactions on Automation Science and Engineering
Knowledge-based Neural Ordinary Differential Equations for Cosserat Rod-based Soft Robots
Soft robots have many advantages over rigid robots thanks to their compliant and passive nature. However, it is generally challenging to model the dynamics of soft robots due to their high spatial dimensionality, making it difficult to use model-based methods to accurately control soft robots. It often requires direct numerical simulation of partial differential equations to simulate soft robots. This not only requires an accurate numerical model, but also makes soft robot modeling slow and expensive. Deep learning algorithms have shown promise in data-driven modeling of soft robots. However, these algorithms usually require a large amount of data, which are difficult to obtain in either simulation or real-world experiments of soft robots. In this work, we propose KNODE-Cosserat, a framework that combines first-principle physics models and neural ordinary differential equations. We leverage the best from both worlds -- the generalization ability of physics-based models and the fast speed of deep learning methods. We validate our framework in both simulation and real-world experiments. In both cases, we show that the robot model significantly improves over the baseline models under different metrics.
comment: 8 pages, 11 figures, 4 tables
RAVE Checklist: Recommendations for Overcoming Challenges in Retrospective Safety Studies of Automated Driving Systems
The public, regulators, and domain experts alike seek to understand the effect of deployed SAE level 4 automated driving system (ADS) technologies on safety. The recent expansion of ADS technology deployments is paving the way for early stage safety impact evaluations, whereby the observational data from both an ADS and a representative benchmark fleet are compared to quantify safety performance. In January 2024, a working group of experts across academia, insurance, and industry came together in Washington, DC to discuss the current and future challenges in performing such evaluations. A subset of this working group then met, virtually, on multiple occasions to produce this paper. This paper presents the RAVE (Retrospective Automated Vehicle Evaluation) checklist, a set of fifteen recommendations for performing and evaluating retrospective ADS performance comparisons. The recommendations are centered around the concepts of (1) quality and validity, (2) transparency, and (3) interpretation. Over time, it is anticipated there will be a large and varied body of work evaluating the observed performance of these ADS fleets. Establishing and promoting good scientific practices benefits the work of stakeholders, many of whom may not be subject matter experts. This working group's intentions are to: i) strengthen individual research studies and ii) make the at-large community more informed on how to evaluate this collective body of work.
Inverse k-visibility for RSSI-based Indoor Geometric Mapping
In recent years, the increased availability of WiFi in indoor environments has generated interest in the robotics community in leveraging WiFi signals to enhance indoor SLAM (Simultaneous Localization and Mapping) systems. SLAM technology is widely used, especially for the navigation and control of autonomous robots. This paper discusses various works in developing WiFi-based localization and the challenges in achieving high-accuracy geometric maps. This paper introduces the concept of inverse k-visibility, developed from the k-visibility algorithm, to identify the free space in an unknown environment for planning, navigation, and obstacle avoidance. Comprehensive experiments, including those utilizing single and multiple RSSI signals, were conducted in both simulated and real-world environments to demonstrate the robustness of the proposed algorithm. Additionally, a detailed analysis comparing the resulting maps with ground-truth Lidar-based maps is provided to highlight the algorithm's accuracy and reliability.
comment: This work has been submitted to the IEEE Sensors Journal for possible publication
User-customizable Shared Control for Robot Teleoperation via Virtual Reality IROS 2024
Shared control can ease and enhance a human operator's ability to teleoperate robots, particularly for intricate tasks demanding fine control over multiple degrees of freedom. However, the arbitration process dictating how much autonomous assistance to administer in shared control can confuse novice operators and impede their understanding of the robot's behavior. To overcome these adverse side-effects, we propose a novel formulation of shared control that enables operators to tailor the arbitration to their unique capabilities and preferences. Unlike prior approaches to customizable shared control where users could indirectly modify the latent parameters of the arbitration function by issuing a feedback command, we instead make these parameters observable and directly editable via a virtual reality (VR) interface. We present our user-customizable shared control method for a teleoperation task in SE(3), known as the buzz wire game. A user study is conducted with participants teleoperating a robotic arm in VR to complete the game. The experiment spanned two weeks per subject to investigate longitudinal trends. Our findings reveal that users allowed to interactively tune the arbitration parameters across trials generalize well to adaptations in the task, exhibiting improvements in precision and fluency over direct teleoperation and conventional shared control.
comment: Accepted at IROS 2024
VIRUS-NeRF -- Vision, InfraRed and UltraSonic based Neural Radiance Fields
Autonomous mobile robots are an increasingly integral part of modern factory and warehouse operations. Obstacle detection, avoidance and path planning are critical safety-relevant tasks, which are often solved using expensive LiDAR sensors and depth cameras. We propose to use cost-effective low-resolution ranging sensors, such as ultrasonic and infrared time-of-flight sensors by developing VIRUS-NeRF - Vision, InfraRed, and UltraSonic based Neural Radiance Fields. Building upon Instant Neural Graphics Primitives with a Multiresolution Hash Encoding (Instant-NGP), VIRUS-NeRF incorporates depth measurements from ultrasonic and infrared sensors and utilizes them to update the occupancy grid used for ray marching. Experimental evaluation in 2D demonstrates that VIRUS-NeRF achieves comparable mapping performance to LiDAR point clouds regarding coverage. Notably, in small environments, its accuracy aligns with that of LiDAR measurements, while in larger ones, it is bounded by the utilized ultrasonic sensors. An in-depth ablation study reveals that adding ultrasonic and infrared sensors is highly effective when dealing with sparse data and low view variation. Further, the proposed occupancy grid of VIRUS-NeRF improves the mapping capabilities and increases the training speed by 46% compared to Instant-NGP. Overall, VIRUS-NeRF presents a promising approach for cost-effective local mapping in mobile robotics, with potential applications in safety and navigation tasks. The code can be found at https://github.com/ethz-asl/virus nerf.
Resilient source seeking with robot swarms
We present a solution for locating the source, or maximum, of an unknown scalar field using a swarm of mobile robots. Rather than relying on traditional gradient information, the swarm determines an ascending direction to approach the source with arbitrary precision. The ascending direction is calculated from measurements of the field strength at the robot locations and their positions relative to the centroid. Rather than focusing on individual robots, we focus the analysis on the density of robots per unit area to guarantee a more resilient swarm, i.e., the functionality remains even if individuals go missing or are misplaced during the mission. We reinforce the robustness of the algorithm by providing sufficient conditions for the swarm shape so that the ascending direction is almost parallel to the gradient. The swarm can respond to an unexpected environment by morphing its shape and exploiting the existence of multiple ascending directions. Finally, we validate our approach numerically with hundreds of robots. The fact that a large number of robots always calculate an ascending direction compensates for the loss of individuals and mitigates issues arising from actuator and sensor noise.
comment: 7 pages, CDC 2024, accepted version
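The idea of computing an ascending direction from field readings and positions relative to the centroid can be sketched as follows. The abstract does not give the formula, so the reading-weighted average of relative positions used here is an assumption, and the quadratic test field, the robot layout, and all names are hypothetical.

```python
# Illustrative sketch only: the weighting rule below is an assumption,
# not the formula from the paper.

def ascending_direction(positions, readings):
    """Average of centroid-relative positions, weighted by field readings."""
    n = len(positions)
    dim = len(positions[0])
    centroid = [sum(p[k] for p in positions) / n for k in range(dim)]
    direction = [0.0] * dim
    for p, s in zip(positions, readings):
        for k in range(dim):
            direction[k] += s * (p[k] - centroid[k]) / n
    return direction


def field(p, source=(3.0, 4.0)):
    """Hypothetical scalar field with its maximum at `source`."""
    return -((p[0] - source[0]) ** 2 + (p[1] - source[1]) ** 2)


# Four robots on a unit square; the source lies up and to the right.
swarm = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
d_full = ascending_direction(swarm, [field(p) for p in swarm])

# Resilience check: one robot goes missing, the rest recompute.
reduced = swarm[:3]
d_reduced = ascending_direction(reduced, [field(p) for p in reduced])
```

For this toy field, both directions point toward the source even after a robot is removed, which mirrors the resilience argument in the abstract; no gradient of the field is ever evaluated.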
Research on Autonomous Robots Navigation based on Reinforcement Learning
Reinforcement learning continuously optimizes decision-making based on real-time feedback reward signals through continuous interaction with the environment, demonstrating strong adaptive and self-learning capabilities. In recent years, it has become one of the key methods to achieve autonomous navigation of robots. In this work, an autonomous robot navigation method based on reinforcement learning is introduced. We use the Deep Q Network (DQN) and Proximal Policy Optimization (PPO) models to optimize the path planning and decision-making process through the continuous interaction between the robot and the environment, and the reward signals with real-time feedback. By combining the Q-value function with a deep neural network, the deep Q network can handle high-dimensional state spaces, so as to realize path planning in complex environments. Proximal policy optimization is a policy gradient-based method, which enables robots to explore and utilize environmental information more efficiently by optimizing policy functions. These methods not only improve the robot's navigation ability in unknown environments, but also enhance its adaptive and self-learning capabilities. Through multiple training and simulation experiments, we have verified the effectiveness and robustness of these models in various complex scenarios.
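The Q-value update that DQN builds on can be shown in tabular form. This is a minimal sketch, not the authors' implementation: DQN replaces the table with a neural network and adds components such as a replay buffer, which are omitted here. The function names, the two-state table, and the hyperparameter values are assumptions chosen for illustration.

```python
# Tabular Q-learning update (illustrative; DQN approximates the table
# with a deep network instead).

def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Bootstrapped target: r + gamma * max_a' Q(s', a'), or r if terminal."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q, state, action, reward, next_state,
             alpha=0.1, gamma=0.99, done=False):
    """Move Q(state, action) a step of size alpha toward the TD target."""
    target = td_target(reward, q[next_state], gamma, done)
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical two-state, two-action table.
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
new_value = q_update(q_table, state=0, action=1, reward=1.0,
                     next_state=1, alpha=0.5, gamma=0.9)
```

With these numbers the target is 1.0 + 0.9 * 2.0 = 2.8, so the entry moves halfway from 0.0 to 2.8.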
DRAMA: An Efficient End-to-end Motion Planner for Autonomous Driving with Mamba
Motion planning is a challenging task to generate safe and feasible trajectories in highly dynamic and complex environments, forming a core capability for autonomous vehicles. In this paper, we propose DRAMA, the first Mamba-based end-to-end motion planner for autonomous vehicles. DRAMA fuses camera images and LiDAR Bird's Eye View images in the feature space, as well as ego status information, to generate a series of future ego trajectories. Unlike traditional transformer-based methods with attention complexity that is quadratic in sequence length, DRAMA is able to achieve a less computationally intensive attention complexity, demonstrating potential to deal with increasingly complex scenarios. Leveraging our Mamba fusion module, DRAMA efficiently and effectively fuses the features of the camera and LiDAR modalities. In addition, we introduce a Mamba-Transformer decoder that enhances the overall planning performance. This module is universally adaptable to any Transformer-based model, especially for tasks with long sequence inputs. We further introduce a novel feature state dropout which improves the planner's robustness without increasing training and inference times. Extensive experimental results show that DRAMA achieves higher accuracy on the NAVSIM dataset compared to the baseline Transfuser, with fewer parameters and lower computational costs.
Ankle Exoskeletons May Hinder Standing Balance in Simple Models of Older and Younger Adults
Humans rely on ankle torque to maintain standing balance, particularly in the presence of small to moderate perturbations. Reductions in maximum torque (MT) production and maximum rate of torque development (MRTD) occur at the ankle with age, diminishing stability. Ankle exoskeletons are powered orthotic devices that may assist older adults by compensating for reduced muscle force and power production capabilities. They may also be able to assist with ankle strategies used for balance. However, no studies have investigated the effect of such devices on balance in older adults. Here, we model the effect ankle exoskeletons have on stability in physics-based models of healthy young and old adults, focusing on the mitigation of age-related deficits such as reduced MT and MRTD. We show that an ankle exoskeleton moderately reduces feasible stability boundaries in users who have full ankle strength. For individuals with age-related deficits, there is a trade-off. While exoskeletons augment stability in low velocity conditions, they reduce stability in some high velocity conditions. Our results suggest that well-established control strategies must still be experimentally validated in older adults.
comment: 12 pages, 7 figures
Virtual Elastic Tether: a New Approach for Multi-agent Navigation in Confined Aquatic Environments
Underwater navigation is a challenging area in the field of mobile robotics due to inherent constraints in self-localisation and communication in underwater environments. Some of these challenges can be mitigated by using collaborative multi-agent teams. However, when applied underwater, the robustness of traditional multi-agent collaborative control approaches is highly limited due to the unavailability of reliable measurements. In this paper, the concept of a Virtual Elastic Tether (VET) is introduced in the context of incomplete state measurements, which represents an innovative approach to underwater navigation in confined spaces. The concept of VET is formulated and validated using the Cooperative Aquatic Vehicle Exploration System (CAVES), which is a sim-to-real multi-agent aquatic robotic platform. Within this framework, a vision-based Autonomous Underwater Vehicle-Autonomous Surface Vehicle leader-follower formulation is developed. Experiments were conducted in both simulation and on a physical platform, benchmarked against a traditional Image-Based Visual Servoing approach. Results indicate that the formation of the baseline approach fails under discrete disturbances, when induced distances between the robots exceed 0.6 m in simulation and 0.3 m in the real world. In contrast, the VET-enhanced system recovers to pre-perturbation distances within 5 seconds. Furthermore, results illustrate the successful navigation of VET-enhanced CAVES in a confined water pond where the baseline approach fails to perform adequately.
comment: This work has been submitted to Wiley for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
gatekeeper: Online Safety Verification and Control for Nonlinear Systems in Dynamic Environments IROS 2023
This paper presents the gatekeeper algorithm, a real-time and computationally-lightweight method that ensures that trajectories of a nonlinear system satisfy safety constraints despite sensing limitations. gatekeeper integrates with existing path planners and feedback controllers by introducing an additional verification step to ensure that proposed trajectories can be executed safely, despite nonlinear dynamics subject to bounded disturbances, input constraints and partial knowledge of the environment. Our key contributions are that (A) we propose an algorithm to recursively construct safe trajectories by numerically forward propagating the system over a (short) finite horizon, and (B) we prove that tracking such a trajectory ensures the system remains safe for all future time, i.e., beyond the finite horizon. We demonstrate the method in a simulation of a dynamic firefighting mission, and in physical experiments of a quadrotor navigating in an obstacle environment that is sensed online. We also provide comparisons against the state-of-the-art techniques for similar problems.
comment: Accepted at IEEE T-RO 2024. Accepted at IROS 2023. 17 pages, 10 figures
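The verification step described above can be illustrated with a toy example. The sketch below is an assumption-laden reading of the abstract, not the published algorithm: it uses a 1D double integrator approaching a wall, builds each candidate by forward propagating a nominal input up to a switching time and then a braking (backup) input, and commits the candidate only if it stays clear of the wall and ends at rest. All names, dynamics, and parameter values are hypothetical.

```python
# Toy sketch of a "verify before committing" loop (assumed structure and
# names; not the authors' implementation).

def simulate(x, v, accel_fn, duration, dt=0.01):
    """Forward-propagate a 1D double integrator; velocity is clamped at 0."""
    traj = [(x, v)]
    for _ in range(int(round(duration / dt))):
        v = max(v + accel_fn(v) * dt, 0.0)
        x = x + v * dt
        traj.append((x, v))
    return traj


def candidate(x, v, switch_time, horizon, a_nom=1.0, a_brake=-2.0, dt=0.01):
    """Nominal input until switch_time, then the backup (braking) input."""
    nominal = simulate(x, v, lambda vel: a_nom, switch_time, dt)
    xs, vs = nominal[-1]
    backup = simulate(xs, vs, lambda vel: a_brake if vel > 0 else 0.0,
                      horizon - switch_time, dt)
    return nominal + backup[1:]


def is_valid(traj, wall):
    """Safe over the horizon and at rest at its end, so it stays safe after."""
    return all(x < wall for x, _ in traj) and traj[-1][1] == 0.0


def gatekeeper_step(x, v, wall, committed, horizon=3.0, step=0.1):
    """Commit the valid candidate with the latest switching time, if any."""
    n = int(round(horizon / step))
    for k in range(n, -1, -1):
        traj = candidate(x, v, k * step, horizon)
        if is_valid(traj, wall):
            return traj
    return committed  # no valid candidate: keep tracking the old one


previous = [(0.0, 0.0)]  # hypothetical previously committed trajectory
plan = gatekeeper_step(0.0, 1.0, wall=5.0, committed=previous)
fallback = gatekeeper_step(0.0, 1.0, wall=0.1, committed=previous)
```

Requiring the candidate to end at rest is what lets a finite-horizon check stand in for safety beyond the horizon in this toy setting; the paper's guarantees for general nonlinear systems with disturbances rest on its own proofs, which this sketch does not reproduce.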
Safe Imitation Learning of Nonlinear Model Predictive Control for Flexible Robots IROS 2024
Flexible robots may overcome some of the industry's major challenges, such as enabling intrinsically safe human-robot collaboration and achieving a higher payload-to-mass ratio. However, controlling flexible robots is complicated due to their complex dynamics, which include oscillatory behavior and a high-dimensional state space. Nonlinear model predictive control (NMPC) offers an effective means to control such robots, but its significant computational demand often limits its application in real-time scenarios. To enable fast control of flexible robots, we propose a framework for a safe approximation of NMPC using imitation learning and a predictive safety filter. Our framework significantly reduces computation time while incurring a slight loss in performance. Compared to NMPC, our framework shows more than an eightfold improvement in computation time when controlling a three-dimensional flexible robot arm in simulation, all while guaranteeing safety constraints. Notably, our approach outperforms state-of-the-art reinforcement learning methods. The development of fast and safe approximate NMPC holds the potential to accelerate the adoption of flexible robots in industry. The project code is available at: tinyurl.com/anmpc4fr
comment: Accepted to IROS 2024
WATonoBus: Field-Tested All-Weather Autonomous Shuttle Technology SC
All-weather autonomous vehicle operation poses significant challenges, encompassing modules from perception and decision-making to path planning and control. The complexity arises from the need to address adverse weather conditions such as rain, snow, and fog across the autonomy stack. Conventional model-based single-module approaches often lack holistic integration with upstream or downstream tasks. We tackle this problem by proposing a multi-module and modular system architecture with considerations for adverse weather across the perception level, through features such as snow covered curb detection, to decision-making and safety monitoring. Through daily weekday service on the WATonoBus platform for almost two years, we demonstrate that our proposed approach is capable of addressing adverse weather conditions and provide valuable insights from edge cases observed during operation.
comment: 8 pages, 10 figures. This work has been submitted to the ITSC for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Fast and Accurate Relative Motion Tracking for Dual Industrial Robots
Industrial robotic applications such as spraying, welding, and additive manufacturing frequently require fast, accurate, and uniform motion along a 3D spatial curve. To increase process throughput, some manufacturers propose a dual-robot setup to overcome the speed limitation of a single robot. Industrial robot motion is programmed through waypoints connected by motion primitives (Cartesian linear and circular paths and linear joint paths at constant Cartesian speed). The actual robot motion is affected by the blending between these motion primitives and the pose of the robot (an outstretched/near-singularity pose tends to have larger path tracking errors). Choosing the waypoints and the speed along each motion segment to achieve the performance requirement is challenging. At present, there is no automated solution, and laborious manual tuning by robot experts is needed to approach the desired performance. In this paper, we present a systematic three-step approach to designing and programming a dual robot system to optimize system performance. The first step is to select the relative placement between the two robots based on the specified relative motion path. The second step is to select the relative waypoints and the motion primitives. The final step is to update the waypoints iteratively based on the actual measured relative motion. Waypoint iteration is first executed in simulation and then completed using the actual robots. For performance assessment, we use the mean path speed subject to the relative position and orientation constraints and the path speed uniformity constraint. We have demonstrated the effectiveness of this method on two systems, a physical testbed of two ABB robots and a simulation testbed of two FANUC robots, for two challenging test curves.
Simultaneous Task Allocation and Planning for Multi-Robots under Hierarchical Temporal Logic Specifications
Research in robotic planning with temporal logic specifications, such as syntactically co-safe Linear Temporal Logic (sc-LTL), has relied on single formulas. However, as task complexity increases, sc-LTL formulas become lengthy, making them difficult to interpret and generate, and straining the computational capacities of planners. To address this, we introduce a hierarchical structure to sc-LTL specifications with both syntax and semantics, proving it to be more expressive than flat counterparts. We conducted a user study that compared the flat sc-LTL with our hierarchical version and found that users could more easily comprehend complex tasks using the hierarchical structure. We develop a search-based approach to synthesize plans for multi-robot systems, achieving simultaneous task allocation and planning. This method approximates the search space by loosely interconnected sub-spaces, each corresponding to an sc-LTL specification. The search primarily focuses on a single sub-space, transitioning to another under conditions determined by the decomposition of automatons. We develop multiple heuristics to significantly expedite the search. Our theoretical analysis, conducted under mild assumptions, addresses completeness and optimality. Compared to existing methods used in various simulators for service tasks, our approach improves planning times while maintaining comparable solution quality.
comment: 20 pages, 10 figures
Multiagent Systems
SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning SC
This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same scenarios seen during training. Various methods have been proposed to address these challenges, including experience replay and regularization. However, how observation design in RL affects sample efficiency and generalization remains an under-explored area. We address this gap by proposing five strategies to design information-dense observations, focusing on general features that are applicable to most traffic scenarios. We train our RL agents using these strategies on an intersection and evaluate their generalization through numerical experiments across completely unseen traffic scenarios, including a new intersection, an on-ramp, and a roundabout. Incorporating these information-dense observations reduces training times to under one hour on a single CPU, and the evaluation results reveal that our RL agents can effectively zero-shot generalize. Code: github.com/cas-lab-munich/SigmaRL
comment: 8 pages, 5 figures, accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024
A Nested Graph Reinforcement Learning-based Decision-making Strategy for Eco-platooning
Platooning technology is renowned for its precise vehicle control, traffic flow optimization, and energy efficiency enhancement. However, in large-scale mixed platoons, vehicle heterogeneity and unpredictable traffic conditions lead to virtual bottlenecks. These bottlenecks result in reduced traffic throughput and increased energy consumption within the platoon. To address these challenges, we introduce a decision-making strategy based on nested graph reinforcement learning. This strategy improves collaborative decision-making, ensuring energy efficiency and alleviating congestion. We propose a theory of nested traffic graph representation that maps dynamic interactions between vehicles and platoons in non-Euclidean spaces. By incorporating a spatio-temporal weighted graph into a multi-head attention mechanism, we further enhance the model's capacity to process both local and global data. Additionally, we have developed a nested graph reinforcement learning framework to enhance the self-iterative learning capabilities of platooning. Using the I-24 dataset, we designed and conducted comparative algorithm experiments, generalizability testing, and permeability ablation experiments, thereby validating the proposed strategy's effectiveness. Compared to the baseline, our strategy increases throughput by 10% and decreases energy use by 9%. Specifically, increasing the penetration rate of CAVs significantly enhances traffic throughput, though it also increases energy consumption.
comment: 14 pages, 18 figures
Bridging Training and Execution via Dynamic Directed Graph-Based Communication in Cooperative Multi-Agent Systems
Multi-agent systems must learn to communicate and understand interactions between agents to achieve cooperative goals in partially observed tasks. However, existing approaches lack a dynamic directed communication mechanism and rely on global states, thus diminishing the role of communication in centralized training. Thus, we propose the transformer-based graph coarsening network (TGCNet), a novel multi-agent reinforcement learning (MARL) algorithm. TGCNet learns the topological structure of a dynamic directed graph to represent the communication policy and integrates graph coarsening networks to approximate the representation of global state during training. It also utilizes the transformer decoder for feature extraction during execution. Experiments on multiple cooperative MARL benchmarks demonstrate state-of-the-art performance compared to popular MARL algorithms. Further ablation studies validate the effectiveness of our dynamic directed graph communication mechanism and graph coarsening networks.
comment: 9 pages, 7 figures
Improving Global Parameter-sharing in Physically Heterogeneous Multi-agent Reinforcement Learning with Unified Action Space
In a multi-agent system (MAS), action semantics indicates the different influences of agents' actions toward other entities, and can be used to divide agents into groups in a physically heterogeneous MAS. Previous multi-agent reinforcement learning (MARL) algorithms apply global parameter-sharing across different types of heterogeneous agents without careful discrimination of different action semantics. This common implementation decreases the cooperation and coordination between agents in complex situations. However, fully independent agent parameters dramatically increase the computational cost and training difficulty. To benefit from different action semantics while maintaining a proper parameter-sharing structure, we introduce the Unified Action Space (UAS). The UAS is the union set of all agent actions with different semantics. All agents first calculate their unified representation in the UAS, and then generate their heterogeneous action policies using different available-action-masks. To further improve the training of extra UAS parameters, we introduce a Cross-Group Inverse (CGI) loss to predict other groups' agent policies with the trajectory information. As a universal method for solving the physically heterogeneous MARL problem, we add the UAS to both value-based and policy-based MARL algorithms, and propose two practical algorithms: U-QMIX and U-MAPPO. Experimental results in the SMAC environment prove the effectiveness of both U-QMIX and U-MAPPO compared with several state-of-the-art MARL methods.
Value-Based Rationales Improve Social Experience: A Multiagent Simulation Study ECAI 2024
We propose Exanna, a framework to realize agents that incorporate values in decision making. An Exanna agent considers the values of itself and others when providing rationales for its actions and evaluating the rationales provided by others. Via multiagent simulation, we demonstrate that considering values in decision making and producing rationales, especially for norm-deviating actions, leads to (1) higher conflict resolution, (2) better social experience, (3) higher privacy, and (4) higher flexibility.
comment: 13 pages, 13 figures, 13 tables (and supplementary material with reproducibility and additional results), accepted at ECAI 2024
A Multi-Scale Cognitive Interaction Model of Instrument Operations at the Linac Coherent Light Source
We describe a novel multi-agent, multi-scale computational cognitive interaction model of instrument operations at the Linac Coherent Light Source (LCLS). A leading scientific user facility, LCLS is the world's first hard x-ray free electron laser, operated by the SLAC National Accelerator Laboratory for the U.S. Department of Energy. As the world's first x-ray free electron laser, LCLS is in high demand and heavily oversubscribed. Our overall project employs cognitive engineering methodologies to improve experimental efficiency and scientific productivity by refining experimental interfaces and workflows, simplifying tasks, reducing errors, and improving operator safety and stress levels. Our model simulates aspects of human cognition at multiple cognitive and temporal scales, ranging from seconds to hours, and among agents playing multiple roles, including instrument operator, real time data analyst, and experiment manager. The model can predict impacts stemming from proposed changes to operational interfaces and workflows. Because the model code is open source, and supplemental videos go into detail on all aspects of the model and results, this approach could be applied to other experimental apparatus and processes. Example results demonstrate the model's potential in guiding modifications to improve operational efficiency and scientific output. We discuss the implications of our findings for cognitive engineering in complex experimental settings and outline future directions for research.
comment: Supplemental videos: https://www.youtube.com/playlist?list=PLI13S4Z1cbXggy98pDXjqnVnnoekohF2f
GHQ: Grouped Hybrid Q Learning for Heterogeneous Cooperative Multi-agent Reinforcement Learning
Previous deep multi-agent reinforcement learning (MARL) algorithms have achieved impressive results, typically in homogeneous scenarios. However, heterogeneous scenarios are also very common and usually harder to solve. In this paper, we mainly discuss cooperative heterogeneous MARL problems in the StarCraft Multi-Agent Challenge (SMAC) environment. We first define and describe the heterogeneous problems in SMAC. To comprehensively reveal and study the problem, we add new maps to the original SMAC maps. We find that baseline algorithms fail to perform well in those heterogeneous maps. To address this issue, we propose the Grouped Individual-Global-Max Consistency (GIGM) and a novel MARL algorithm, Grouped Hybrid Q Learning (GHQ). GHQ separates agents into several groups and keeps individual parameters for each group, along with a novel hybrid structure for factorization. To enhance coordination between groups, we maximize the Inter-group Mutual Information (IGMI) between groups' trajectories. Experiments on original and new heterogeneous maps show the strong performance of GHQ compared to other state-of-the-art algorithms.
Systems and Control (CS)
Model-Based Control of Water Treatment with Pumped Water Storage
Water treatment facilities are critical infrastructure that must accommodate dynamic demand patterns without system disruption. These patterns can be scheduled, such as daily residential irrigation, or unexpected, such as demand spikes from withdrawals for fire management. The critical necessity of clean, safe, and reliable water requires water treatment control strategies that are insensitive to disturbances to guarantee that demand will be met. One essential problem in achieving this is the minimization of energy costs in the process of meeting water demand, especially as the need for decarbonization persists. This work develops a control-oriented hydraulic model of a water treatment facility with integrated pumped storage and introduces a model predictive control strategy for scheduling treatment plant system operations to minimize greenhouse gas emissions and safely meet water demand.
comment: 6 pages, 6 figures, 2 tables. Accepted for MECC 2024
SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning SC
This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same scenarios seen during training. Various methods have been proposed to address these challenges, including experience replay and regularization. However, how observation design in RL affects sample efficiency and generalization remains an under-explored area. We address this gap by proposing five strategies to design information-dense observations, focusing on general features that are applicable to most traffic scenarios. We train our RL agents using these strategies on an intersection and evaluate their generalization through numerical experiments across completely unseen traffic scenarios, including a new intersection, an on-ramp, and a roundabout. Incorporating these information-dense observations reduces training times to under one hour on a single CPU, and the evaluation results reveal that our RL agents can effectively zero-shot generalize. Code: github.com/cas-lab-munich/SigmaRL
comment: 8 pages, 5 figures, accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024
On linear quadratic regulator for the heat equation with general boundary conditions
We consider the linear quadratic regulator of the heat equation on a finite interval. Previous frequency-domain methods for this problem rely on discrete Fourier transform and require symmetric boundary conditions. We use the Fokas method to derive the optimal control law for general Dirichlet and Neumann boundary conditions. The Fokas method uses the continuous Fourier transform restricted on the bounded spatial domain, with the frequency variable $k$ domain extended from the real line to the complex plane. This extension, together with results from complex analysis, allows us to eliminate the dependence of the optimal control on the unknown boundary values. As a result, we derive an integral representation of the control similar to the inverse Fourier transform. This representation contains integrals along complex contours and only depends on known initial and boundary conditions. We also show that for the homogeneous Dirichlet boundary value problem, the integral representation recovers an existing series representation of the optimal control. Moreover, the integral representation exhibits numerical advantages compared to the traditional series representation.
comment: 9 pages, 5 figures
Microgrid Building Blocks for Dynamic Decoupling and Black Start Applications
Microgrids offer increased self-reliance and resilience at the grid's edge. They promote a significant transition to decentralized and renewable energy production by optimizing the utilization of local renewable sources. However, to maintain stable operations under all conditions and harness microgrids' full economic and technological potential, it is essential to integrate with the bulk grid and neighboring microgrids seamlessly. In this paper, we explore the capabilities of Back-to-Back (BTB) converters as a pivotal technology for interfacing microgrids, hybrid AC/DC grids, and bulk grids, by leveraging a comprehensive phasor-domain model integrated into GridLAB-D. The phasor-domain model is computationally efficient for simulating BTB with bulk grids and networked microgrids. We showcase the versatility of BTB converters (an integrated Microgrid Building Block) by configuring a two-microgrid network from a modified IEEE 13-node distribution system. These microgrids are equipped with diesel generators, photovoltaic units, and Battery Energy Storage Systems (BESS). The simulation studies are focused on use cases demonstrating dynamic decoupling and controlled support that a microgrid can provide via a BTB converter.
comment: This paper is accepted for publication in IEEE PES Grid Edge Technologies Conference & Exposition 2025, San Diego, CA. The complete copyright version will be available on IEEE Xplore when the conference proceedings are published
Verification of Quantum Circuits through Discrete-Time Barrier Certificates
Current methods for verifying quantum computers are predominately based on interactive or automatic theorem provers. Considering that quantum computers are dynamical in nature, this paper employs and extends the concepts from the verification of dynamical systems to verify properties of quantum circuits. Our main contribution is to propose k-inductive barrier certificates over complex variables and show how to compute them using Hermitian Sum of Squares optimization. We apply this new technique to verify properties of different quantum circuits.
comment: 20 pages, 6 figures
Steady-State Cascade Operators and their Role in Linear Control, Estimation, and Model Reduction Problems
Certain linear matrix operators arise naturally in systems analysis and design problems involving cascade interconnections of linear time-invariant systems, including problems of stabilization, estimation, and model order reduction. We conduct here a comprehensive study of these operators and their relevant system-theoretic properties. The general theory is then leveraged to delineate both known and new design methodologies for control, estimation, and model reduction. Several entirely new designs arise from this systematic categorization, including new recursive and low-gain design frameworks for observation of cascaded systems. The benefits of the results beyond the linear time-invariant setting are demonstrated through preliminary extensions for nonlinear systems, with an outlook towards the development of a similarly comprehensive nonlinear theory.
comment: 16 pages, 5 figures, submitted for publication
Cooled Space Nuclear Reactors Using a System Analysis Program
In recent years, achieving autonomous control in nuclear reactor operations has become pivotal for the effectiveness of Space Nuclear Power Systems (SNPS). However, compared to power control, the startup control of SNPS remains underexplored. This study introduces a multi-objective optimization framework aimed at enhancing startup control, leveraging a system-level analysis program to simulate the system's dynamic behavior accurately. The primary contribution of this work is the development and implementation of an optimization framework that significantly reduces startup time and improves control efficiency. The system analysis tool combines a non-ideal gas model, a multi-channel core model, and the Monte Carlo code RMC, which is employed to calculate temperature reactivity coefficients and neutron kinetics parameters, to ensure precise thermal-dynamic simulations. After the system dynamics are characterized through reactivity insertion accidents, the optimization algorithm fine-tunes the control sequences for external reactivity insertion, TAC system shaft speed, and cooling system background temperature. The optimized control strategy reaches threshold power 1260 seconds earlier and turbine inlet temperature 1980 seconds sooner than baseline methods. The findings highlight the potential of the proposed optimization framework to enhance the autonomy and operational efficiency of future SNPS designs.
Designing Laplacian flows for opinion clustering in structurally balanced and unbalanced networks
In this work, we consider a group of n agents whose interactions can be represented using unsigned or signed structurally balanced graphs or a special case of structurally unbalanced graphs. A Laplacian-based model is proposed to govern the evolution of opinions. The objective of the paper is to analyze how the proposed model shapes the opinion evolution of the agents. We also determine the conditions required to apply the proposed Laplacian-based opinion model. Finally, numerical results are shown to validate these findings.
comment: 8 pages, 5 figures
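For readers unfamiliar with Laplacian opinion flows, the following is a minimal sketch of the standard signed-Laplacian dynamics x' = -Lx on a structurally balanced graph. It is an assumption for illustration only, not the model proposed in the paper above: the graph, the forward-Euler step size, and the names `signed_laplacian` and `simulate_opinions` are all hypothetical.

```python
def signed_laplacian(adj):
    """Signed Laplacian: diagonal holds the sum of |weights|, off-diagonal is -a_ij."""
    n = len(adj)
    return [
        [sum(abs(a) for a in adj[i]) if i == j else -adj[i][j] for j in range(n)]
        for i in range(n)
    ]


def simulate_opinions(adj, x0, dt=0.01, steps=2000):
    """Integrate x' = -L x with forward Euler (illustrative step size)."""
    lap = signed_laplacian(adj)
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        dx = [-sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


# Hypothetical 4-agent graph with two camps {0, 1} and {2, 3}:
# friendly (+1) edges inside each camp, antagonistic (-1) edges across camps.
adj = [
    [0, 1, -1, 0],
    [1, 0, 0, -1],
    [-1, 0, 0, 1],
    [0, -1, 1, 0],
]
final = simulate_opinions(adj, [1.0, 0.5, 0.2, -0.3])
print(final)
```

Because this example graph is connected and structurally balanced, the two camps settle on opinions of equal magnitude and opposite sign, which is the simplest form of opinion clustering.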
Remote Tube-based MPC for Tracking Over Lossy Networks
This paper addresses the problem of controlling constrained systems subject to disturbances in the case where controller and system are connected over a lossy network. To do so, we propose a novel framework that splits the concept of tube-based model predictive control into two parts. One runs locally on the system and is responsible for disturbance rejection, while the other runs remotely and provides optimal input trajectories that satisfy the system's state and input constraints. Key to our approach is the presence of a nominal model and an ancillary controller on the local system. Theoretical guarantees regarding the recursive feasibility and the tracking capabilities in the presence of disturbances and packet losses in both directions are provided. To test the efficacy of the proposed approach, we compare it to a state-of-the-art solution in the case of controlling a cartpole system. Extensive simulations are carried out with both linearized and nonlinear system dynamics, as well as different packet loss probabilities and disturbances. The code for this work is available at https://github.com/EricssonResearch/Robust-Tracking-MPC-over-Lossy-Networks
comment: Accepted at the Conference on Decision and Control 2024, 8 pages, 6 figures
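The split described in the abstract above can be illustrated with a deliberately simplified scalar sketch. Everything here is an assumption for illustration: the dynamics x+ = x + u + w, the gains, the horizon, the loss pattern, and the names `plan_nominal` and `run` are hypothetical, the remote side is assumed to know the nominal state exactly, and the "planner" is a fixed linear rollout rather than a constrained optimization.

```python
def plan_nominal(z, horizon=5, gain=0.5):
    """Remote side (stand-in for the MPC): roll out z+ = z + v with v = -gain * z."""
    inputs = []
    for _ in range(horizon):
        v = -gain * z
        inputs.append(v)
        z = z + v
    return inputs


def run(x0, disturbances, delivered, k_anc=-0.5):
    """Local side: nominal model plus ancillary controller u = v + k_anc * (x - z)."""
    x, z = x0, x0
    buffer = []   # last nominal input sequence received from the remote side
    errors = []
    for w, ok in zip(disturbances, delivered):
        if ok:
            buffer = plan_nominal(z)          # fresh plan arrived this step
        v = buffer.pop(0) if buffer else 0.0  # on packet loss, reuse buffered inputs
        u = v + k_anc * (x - z)               # ancillary feedback rejects disturbances
        x = x + u + w                         # true system with bounded disturbance w
        z = z + v                             # disturbance-free nominal model
        errors.append(abs(x - z))
    return x, z, errors


# Hypothetical scenario: |w| <= 0.1 and only every fourth packet gets through.
disturbances = [0.1 if k % 2 == 0 else -0.1 for k in range(30)]
delivered = [k % 4 == 0 for k in range(30)]
x_final, z_final, errors = run(2.0, disturbances, delivered)
print(max(errors), z_final)
```

In this toy setting the error obeys e+ = 0.5 e + w, so the true state stays within 0.2 of the nominal state no matter which packets are lost, while the buffered nominal inputs keep steering the nominal state to the origin.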
Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems
Recent advancements in machine learning based energy management approaches, specifically reinforcement learning with a safety layer (OptLayerPolicy) and a metaheuristic algorithm generating a decision tree control policy (TreeC), have shown promise. However, their effectiveness has only been demonstrated in computer simulations. This paper presents the real-world validation of these methods, comparing them against model predictive control and simple rule-based control benchmarks. The experiments were conducted on the electrical installation of 4 reproductions of residential houses, which all have their own battery, photovoltaic and dynamic load system emulating a non-controllable electrical load and a controllable electric vehicle charger. The results show that the simple rules, TreeC, and model predictive control-based methods achieved similar costs, with a difference of only 0.6%. The reinforcement learning based method, still in its training phase, obtained a cost 25.5% higher than the other methods. Additional simulations show that the costs can be further reduced by using a more representative training dataset for TreeC and addressing errors in the model predictive control implementation caused by its reliance on accurate data from various sources. The OptLayerPolicy safety layer allows safe online training of a reinforcement learning agent in the real world, given an accurate constraint function formulation. The proposed safety layer method remains error-prone; nonetheless, it is found beneficial for all investigated methods. The TreeC method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.
The Design of Autonomous UAV Prototypes for Inspecting Tunnel Construction Environment
This article presents novel designs of autonomous UAV prototypes specifically developed for inspecting GPS-denied tunnel construction environments with dynamic human and robotic presence. Our UAVs integrate advanced sensor suites and robust motion planning algorithms to autonomously navigate and explore these complex environments. We validated our approach through comprehensive simulation experiments in PX4 Gazebo and Airsim Unreal Engine 4 environments. Real-world wind tests and exploration experiments demonstrate the UAVs' capability to operate stably under diverse environmental conditions without GPS assistance. This study highlights the practicality and resilience of our UAV prototypes in real-world applications.
comment: Autonomous UAV, Tunnel Inspection, GPS-Denied Environment, Safe Trajectory Planning, Real-World Testing
SustainDC -- Benchmarking for Sustainable Data Center Control NeurIPS 2024
Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for the effects of each other. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements. Our results highlight significant opportunities for improvement of data center operations using MARL algorithms. Given the increasing use of DC due to AI, SustainDC provides a crucial platform for the development and benchmarking of advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
comment: Under review at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
Local Cold Load Pick-up Estimation Using Customer Energy Consumption Measurements
Thermostatically-controlled loads have a significant impact on electricity demand after service is restored following an outage, a phenomenon known as cold load pick-up (CLPU). Active management of CLPU is becoming an essential tool for distribution system operators who seek to defer network upgrades and speed up post-outage customer restoration. One key functionality needed for actively managing CLPU is its forecast at various scales. The widespread deployment of smart metering devices is also opening up new opportunities for data-driven load modeling and forecasting. In this paper, we propose an approach for customer-side estimation of CLPU using time-stamped local load measurements. The proposed method uses Auto-Regressive Integrated Moving Average (ARIMA) modeling for short-term forecasting of the energy consumption foregone during an outage. Forecasts are made on an hourly basis to estimate the energy to potentially recover after outages lasting up to several hours. Moreover, to account for changing customer behavior and weather, the model order is adjusted dynamically. Simulation results based on actual smart meter measurements are presented for 50 residential customers over the duration of one year. These results are validated using physical modeling of residential loads and are shown to match the ARIMA-based forecasts well. Additionally, accuracy and execution speed have been compared with other state-of-the-art approaches for time-series forecasting, including Long Short Term Memory Network (LSTM) and Holt-Winters Exponential Smoothing (HWES). The ARIMA-based forecast is found to offer superior performance in terms of both accuracy and computation speed.
Artificial Intelligence in Power System Security and Stability Analysis: A Comprehensive Review
This review comprehensively examines the integration of artificial intelligence (AI) in enhancing the dynamic security assessments of modern power systems. It highlights the pivotal role of AI in facilitating scenario generation, incident prediction, risk assessment, and severity grading, thereby addressing the complexities introduced by renewable energy integration and advancements in digital grid technologies. The paper delves into data-driven techniques, with a particular focus on decision trees that effectively bridge operational characteristics with security metrics. These methodologies enable real-time, accurate predictions of system behaviors under varied operational conditions and support the optimization of control strategies. Through detailed analysis, we demonstrate how AI applications can transform traditional security assessment protocols, enhancing both the efficacy and efficiency of power system operations. The findings advocate for the potential of AI to significantly enhance the reliability and resilience of electrical grids, marking a paradigm shift towards more adaptive and intelligent power infrastructure.
comment: 10 pages, in Chinese language
Resilient source seeking with robot swarms
We present a solution for locating the source, or maximum, of an unknown scalar field using a swarm of mobile robots. Rather than relying on traditional gradient information, the swarm determines an ascending direction to approach the source with arbitrary precision. The ascending direction is calculated from measurements of the field strength at the robot locations and their positions relative to the centroid. Rather than focusing on individual robots, we focus the analysis on the density of robots per unit area to guarantee a more resilient swarm, i.e., the functionality remains even if individuals go missing or are misplaced during the mission. We reinforce the robustness of the algorithm by providing sufficient conditions for the swarm shape so that the ascending direction is almost parallel to the gradient. The swarm can respond to an unexpected environment by morphing its shape and exploiting the existence of multiple ascending directions. Finally, we validate our approach numerically with hundreds of robots. The fact that a large number of robots always calculate an ascending direction compensates for the loss of individuals and mitigates issues arising from actuator and sensor noise.
comment: 7 pages, CDC 2024, accepted version
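As a rough illustration of how an ascending direction can be built from field readings and positions relative to the centroid, the sketch below weights each robot's offset from the centroid by its measured field strength. The exact formula, its normalization, and the function name `ascending_direction` are assumptions made for illustration and may differ from the expression used in the paper above.

```python
def ascending_direction(positions, readings):
    """Average of field readings weighted by each robot's offset from the centroid (2D)."""
    n = len(positions)
    cx = sum(p[0] for p in positions) / n
    cy = sum(p[1] for p in positions) / n
    dx = sum(s * (p[0] - cx) for p, s in zip(positions, readings)) / n
    dy = sum(s * (p[1] - cy) for p, s in zip(positions, readings)) / n
    return dx, dy


def field(p):
    """Hypothetical linear scalar field with gradient (2, 1)."""
    return 2.0 * p[0] + 1.0 * p[1]


# Four robots in a symmetric cross around the centroid (3, 3).
swarm = [(4.0, 3.0), (2.0, 3.0), (3.0, 4.0), (3.0, 2.0)]
direction = ascending_direction(swarm, [field(p) for p in swarm])
print(direction)  # (1.0, 0.5), parallel to the gradient (2, 1)
```

For this symmetric shape and linear field the result is exactly parallel to the gradient. For other swarm shapes the direction is generally only approximately aligned, which is why conditions on the swarm shape matter.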
System-Level Simulation Framework for NB-IoT: Key Features and Performance Evaluation
Narrowband Internet of Things (NB-IoT) is a technology specifically designated by the 3rd Generation Partnership Project (3GPP) to meet the explosive demand for massive machine-type communications (mMTC), and it is evolving to RedCap. Industrial companies have increasingly adopted NB-IoT as the solution for mMTC due to its lightweight design and comprehensive technical specifications released by 3GPP. This paper presents a system-level simulation framework for NB-IoT networks to evaluate their performance. The system-level simulator is structured into four parts: initialization, pre-generation, main simulation loop, and post-processing. Additionally, three essential features are investigated to enhance coverage, support massive connections, and ensure low power consumption, respectively. Simulation results demonstrate that the cumulative distribution function curves of the signal-to-interference-and-noise ratio fully comply with industrial standards. Furthermore, the throughput performance explains how NB-IoT networks realize massive connections at the cost of data rate. This work highlights its practical utility and paves the way for developing NB-IoT networks.
Real-time Regulation of Detention Ponds via Feedback Control: Balancing Flood Mitigation and Water Quality
Detention ponds can mitigate flooding and improve water quality by allowing the settlement of pollutants. Typically, they are operated with fully open orifices and weirs (i.e., passive control). Active controls can improve the performance of these systems: orifices can be retrofitted with controlled valves and spillways can have controllable gates. The real-time optimal operation of these hydraulic devices can be achieved with techniques such as Model Predictive Control (MPC). A distributed quasi-2D hydrologic-hydrodynamic model coupled with a reservoir flood routing model is developed and integrated with an MPC algorithm to estimate the operation of valves and movable gates. The control optimization problem is adapted to switch from a flood-related algorithm focusing on mitigating floods to a heuristic objective function that aims to increase the detention time when no inflow hydrographs are predicted. The case studies show the potential of applying the developed methods in a catchment in Sao Paulo, Brazil. The performance of MPC is tested against alternatives with either fully or partially open valves and gates. Comparisons with HEC-RAS 2D indicate volume and peak flow errors of approximately 1.4% and 0.91% for the watershed module. Simulating two consecutive 10-year storms shows that the MPC strategy can achieve peak flow reductions of 79%. In contrast, passive control achieves nearly half of this performance. Results of a 1-year continuous simulation show that the passive scenario with 25% of the valves opened can treat 12% more runoff compared to the developed MPC approach, with an average detention time of approximately 6 hours. For the MPC approach, the average detention time is nearly 14 hours, indicating that both control techniques can treat similar volumes; however, the proxy water quality for the MPC approach is enhanced due to the longer detention times.
Virtual Elastic Tether: a New Approach for Multi-agent Navigation in Confined Aquatic Environments
Underwater navigation is a challenging area in the field of mobile robotics due to inherent constraints in self-localisation and communication in underwater environments. Some of these challenges can be mitigated by using collaborative multi-agent teams. However, when applied underwater, the robustness of traditional multi-agent collaborative control approaches is highly limited due to the unavailability of reliable measurements. In this paper, the concept of a Virtual Elastic Tether (VET) is introduced in the context of incomplete state measurements, which represents an innovative approach to underwater navigation in confined spaces. The concept of VET is formulated and validated using the Cooperative Aquatic Vehicle Exploration System (CAVES), which is a sim-to-real multi-agent aquatic robotic platform. Within this framework, a vision-based Autonomous Underwater Vehicle-Autonomous Surface Vehicle leader-follower formulation is developed. Experiments were conducted in both simulation and on a physical platform, benchmarked against a traditional Image-Based Visual Servoing approach. Results indicate that the formation of the baseline approach fails under discrete disturbances, when induced distances between the robots exceed 0.6 m in simulation and 0.3 m in the real world. In contrast, the VET-enhanced system recovers to pre-perturbation distances within 5 seconds. Furthermore, results illustrate the successful navigation of VET-enhanced CAVES in a confined water pond where the baseline approach fails to perform adequately.
comment: This work has been submitted to Wiley for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
gatekeeper: Online Safety Verification and Control for Nonlinear Systems in Dynamic Environments IROS 2023
This paper presents the gatekeeper algorithm, a real-time and computationally-lightweight method that ensures that trajectories of a nonlinear system satisfy safety constraints despite sensing limitations. gatekeeper integrates with existing path planners and feedback controllers by introducing an additional verification step to ensure that proposed trajectories can be executed safely, despite nonlinear dynamics subject to bounded disturbances, input constraints and partial knowledge of the environment. Our key contribution is that (A) we propose an algorithm to recursively construct safe trajectories by numerically forward propagating the system over a (short) finite horizon, and (B) we prove that tracking such a trajectory ensures the system remains safe for all future time, i.e., beyond the finite horizon. We demonstrate the method in a simulation of a dynamic firefighting mission, and in physical experiments of a quadrotor navigating in an obstacle environment that is sensed online. We also provide comparisons against the state-of-the-art techniques for similar problems.
comment: Accepted at IEEE T-RO 2024. Accepted at IROS 2023. 17 pages, 10 figures
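The verification step described in the gatekeeper abstract can be illustrated with a toy 1-D example. This is only a sketch under stated assumptions, not the authors' algorithm: the vehicle model, the braking "backup" manoeuvre, and the names `stays_safe`, `largest_safe_switch_time`, and `free_limit` are all hypothetical. A candidate trajectory follows the nominal plan for some switch time and then brakes to a stop; it is accepted only if forward propagation keeps it inside the region currently known to be free.

```python
def stays_safe(x, v, switch_time, free_limit, a_max, dt=0.01):
    """Forward-propagate: hold speed for switch_time, then brake to a stop.

    Returns True if the position never exceeds free_limit (the edge of the
    region currently known to be obstacle-free; a hypothetical quantity).
    """
    for _ in range(int(round(switch_time / dt))):   # nominal phase
        x += v * dt
        if x > free_limit:
            return False
    while v > 0.0:                                  # backup phase: brake
        x += v * dt
        v = max(0.0, v - a_max * dt)
        if x > free_limit:
            return False
    return True                                     # stopped inside free space


def largest_safe_switch_time(x, v, free_limit, a_max, horizon=3.0, step=0.5):
    """Return the longest nominal duration whose candidate is safe, or None."""
    k = int(round(horizon / step))
    while k >= 0:
        if stays_safe(x, v, k * step, free_limit, a_max):
            return k * step
        k -= 1
    return None  # no valid candidate: keep the previously committed trajectory


best = largest_safe_switch_time(x=0.0, v=2.0, free_limit=5.2, a_max=1.0)
```

Because the accepted candidate ends at rest inside the known-free region, tracking it stays safe beyond the finite horizon in this toy setting; when no candidate passes, the sketch returns `None` to signal that the earlier committed trajectory should be kept.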
Fast and Accurate Relative Motion Tracking for Dual Industrial Robots
Industrial robotic applications such as spraying, welding, and additive manufacturing frequently require fast, accurate, and uniform motion along a 3D spatial curve. To increase process throughput, some manufacturers propose a dual-robot setup to overcome the speed limitation of a single robot. Industrial robot motion is programmed through waypoints connected by motion primitives (Cartesian linear and circular paths and linear joint paths at constant Cartesian speed). The actual robot motion is affected by the blending between these motion primitives and the pose of the robot (an outstretched/near-singularity pose tends to have larger path tracking errors). Choosing the waypoints and the speed along each motion segment to achieve the performance requirement is challenging. At present, there is no automated solution, and laborious manual tuning by robot experts is needed to approach the desired performance. In this paper, we present a systematic three-step approach to designing and programming a dual robot system to optimize system performance. The first step is to select the relative placement between the two robots based on the specified relative motion path. The second step is to select the relative waypoints and the motion primitives. The final step is to update the waypoints iteratively based on the actual measured relative motion. Waypoint iteration is first executed in simulation and then completed using the actual robots. For performance assessment, we use the mean path speed subject to the relative position and orientation constraints and the path speed uniformity constraint. We have demonstrated the effectiveness of this method on two systems, a physical testbed of two ABB robots and a simulation testbed of two FANUC robots, for two challenging test curves.
Line zonotopes: a set representation suitable for unbounded systems and its application to set-based state estimation and active fault diagnosis of descriptor systems
This paper proposes new methods for set-based state estimation and active fault diagnosis (AFD) of linear descriptor systems (LDS). In contrast to intervals, ellipsoids, and zonotopes, linear static constraints on the state variables, typical of descriptor systems, can be directly incorporated in the mathematical description of constrained zonotopes (CZs). Thanks to this feature, methods using CZs could provide less conservative enclosures than zonotope methods. However, an enclosure on the states should be known for all $k \geq 0$, which is not true in the case of unstable or unobservable LDS. In this context, this paper proposes a new representation for unbounded sets, which allows methods to be developed for state estimation and AFD of stable and unstable LDS. Unlike many other extensions of CZs, the proposed set inherits most of their properties, including polynomial time complexity reduction methods, while allowing the description of different classes of sets, such as strips, hyperplanes, and the entire $n$-dimensional Euclidean space. The advantages of the proposed approaches with respect to CZ methods are highlighted in numerical examples.
comment: 15 pages, 6 figures. Revised manuscript includes a new name for the set representation, revised article structure, a new numerical example, and several minor modifications. Theoretical results unchanged. arXiv admin note: text overlap with arXiv:2306.07369
Formally Verifying Deep Reinforcement Learning Controllers with Lyapunov Barrier Certificates
Deep reinforcement learning (DRL) is a powerful machine learning paradigm for generating agents that control autonomous systems. However, the "black box" nature of DRL agents limits their deployment in real-world safety-critical applications. A promising approach for providing strong guarantees on an agent's behavior is to use Neural Lyapunov Barrier (NLB) certificates, which are learned functions over the system whose properties indirectly imply that an agent behaves as desired. However, NLB-based certificates are typically difficult to learn and even more difficult to verify, especially for complex systems. In this work, we present a novel method for training and verifying NLB-based certificates for discrete-time systems. Specifically, we introduce a technique for certificate composition, which simplifies the verification of highly-complex systems by strategically designing a sequence of certificates. When jointly verified with neural network verification engines, these certificates provide a formal guarantee that a DRL agent both achieves its goals and avoids unsafe behavior. Furthermore, we introduce a technique for certificate filtering, which significantly simplifies the process of producing formally verified certificates. We demonstrate the merits of our approach with a case study on providing safety and liveness guarantees for a DRL-controlled spacecraft.
comment: To appear in FMCAD 2024
Systems and Control (EESS)
Model-Based Control of Water Treatment with Pumped Water Storage
Water treatment facilities are critical infrastructure that must accommodate dynamic demand patterns without system disruption. These patterns can be scheduled, such as daily residential irrigation, or unexpected, such as demand spikes from withdrawals for fire management. The critical necessity of clean, safe, and reliable water requires water treatment control strategies that are insensitive to disturbances to guarantee that demand will be met. One essential problem in achieving this is the minimization of energy costs in the process of meeting water demand, especially as the need for decarbonization persists. This work develops a control-oriented hydraulic model of a water treatment facility with integrated pumped storage and introduces a model predictive control strategy for scheduling treatment plant system operations to minimize greenhouse gas emissions and safely meet water demand.
comment: 6 pages, 6 figures, 2 tables. Accepted for MECC 2024
SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning SC
This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same scenarios seen during training. Various methods have been proposed to address these challenges, including experience replay and regularization. However, how observation design in RL affects sample efficiency and generalization remains an under-explored area. We address this gap by proposing five strategies to design information-dense observations, focusing on general features that are applicable to most traffic scenarios. We train our RL agents using these strategies on an intersection and evaluate their generalization through numerical experiments across completely unseen traffic scenarios, including a new intersection, an on-ramp, and a roundabout. Incorporating these information-dense observations reduces training times to under one hour on a single CPU, and the evaluation results reveal that our RL agents can effectively zero-shot generalize. Code: github.com/cas-lab-munich/SigmaRL
comment: 8 pages, 5 figures, accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024
On linear quadratic regulator for the heat equation with general boundary conditions
We consider the linear quadratic regulator of the heat equation on a finite interval. Previous frequency-domain methods for this problem rely on the discrete Fourier transform and require symmetric boundary conditions. We use the Fokas method to derive the optimal control law for general Dirichlet and Neumann boundary conditions. The Fokas method uses the continuous Fourier transform restricted to the bounded spatial domain, with the domain of the frequency variable $k$ extended from the real line to the complex plane. This extension, together with results from complex analysis, allows us to eliminate the dependence of the optimal control on the unknown boundary values. As a result, we derive an integral representation of the control similar to the inverse Fourier transform. This representation contains integrals along complex contours and only depends on known initial and boundary conditions. We also show that for the homogeneous Dirichlet boundary value problem, the integral representation recovers an existing series representation of the optimal control. Moreover, the integral representation exhibits numerical advantages compared to the traditional series representation.
comment: 9 pages, 5 figures
Microgrid Building Blocks for Dynamic Decoupling and Black Start Applications
Microgrids offer increased self-reliance and resilience at the grid's edge. They promote a significant transition to decentralized and renewable energy production by optimizing the utilization of local renewable sources. However, to maintain stable operations under all conditions and harness microgrids' full economic and technological potential, it is essential to integrate with the bulk grid and neighboring microgrids seamlessly. In this paper, we explore the capabilities of Back-to-Back (BTB) converters as a pivotal technology for interfacing microgrids, hybrid AC/DC grids, and bulk grids, by leveraging a comprehensive phasor-domain model integrated into GridLAB-D. The phasor-domain model is computationally efficient for simulating BTB with bulk grids and networked microgrids. We showcase the versatility of BTB converters (an integrated Microgrid Building Block) by configuring a two-microgrid network from a modified IEEE 13-node distribution system. These microgrids are equipped with diesel generators, photovoltaic units, and Battery Energy Storage Systems (BESS). The simulation studies are focused on use cases demonstrating dynamic decoupling and controlled support that a microgrid can provide via a BTB converter.
comment: This paper is accepted for publication in IEEE PES Grid Edge Technologies Conference & Exposition 2025, San Diego, CA. The complete copyright version will be available on IEEE Xplore when the conference proceedings are published
Verification of Quantum Circuits through Discrete-Time Barrier Certificates
Current methods for verifying quantum computers are predominately based on interactive or automatic theorem provers. Considering that quantum computers are dynamical in nature, this paper employs and extends the concepts from the verification of dynamical systems to verify properties of quantum circuits. Our main contribution is to propose k-inductive barrier certificates over complex variables and show how to compute them using Hermitian Sum of Squares optimization. We apply this new technique to verify properties of different quantum circuits.
comment: 20 pages, 6 figures
Steady-State Cascade Operators and their Role in Linear Control, Estimation, and Model Reduction Problems
Certain linear matrix operators arise naturally in systems analysis and design problems involving cascade interconnections of linear time-invariant systems, including problems of stabilization, estimation, and model order reduction. We conduct here a comprehensive study of these operators and their relevant system-theoretic properties. The general theory is then leveraged to delineate both known and new design methodologies for control, estimation, and model reduction. Several entirely new designs arise from this systematic categorization, including new recursive and low-gain design frameworks for observation of cascaded systems. The benefits of the results beyond the linear time-invariant setting are demonstrated through preliminary extensions for nonlinear systems, with an outlook towards the development of a similarly comprehensive nonlinear theory.
comment: 16 pages, 5 figures, submitted for publication
Cooled Space Nuclear Reactors Using a System Analysis Program
In recent years, achieving autonomous control in nuclear reactor operations has become pivotal for the effectiveness of Space Nuclear Power Systems (SNPS). However, compared to power control, the startup control of SNPS remains underexplored. This study introduces a multi-objective optimization framework aimed at enhancing startup control, leveraging a system-level analysis program to simulate the system's dynamic behavior accurately. The primary contribution of this work is the development and implementation of an optimization framework that significantly reduces startup time and improves control efficiency. Using a non-ideal gas model, a multi-channel core model, and the Monte Carlo code RMC to calculate temperature reactivity coefficients and neutron kinetics parameters, the system analysis tool ensures precise thermal-dynamic simulations. After the system dynamics are characterized through reactivity insertion accidents, the optimization algorithm fine-tunes the control sequences for external reactivity insertion, TAC system shaft speed, and cooling system background temperature. The optimized control strategy achieves threshold power 1260 seconds earlier and turbine inlet temperature 1980 seconds sooner than baseline methods. The findings highlight the potential of the proposed optimization framework to enhance the autonomy and operational efficiency of future SNPS designs.
Designing Laplacian flows for opinion clustering in structurally balanced and unbalanced networks
In this work, we consider a group of n agents whose interactions can be represented using unsigned or signed structurally balanced graphs or a special case of structurally unbalanced graphs. A Laplacian-based model is proposed to govern the evolution of opinions. The objective of the paper is to analyze the effect of the proposed opinion model on the opinion evolution of the agents. Further, we also determine the conditions required to apply the proposed Laplacian-based opinion model. Finally, some numerical results are shown to validate these results.
comment: 8 pages, 5 figures
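As a rough illustration of a Laplacian flow on a structurally balanced signed graph, the sketch below integrates the standard signed-Laplacian dynamics dx/dt = -Lx, with L = D - A and D_ii = sum_j |a_ij|. This is the textbook signed-consensus model, used here as an assumption; the abstract does not specify the paper's own model, and the 3-agent network, the initial opinions, and the function names are hypothetical.

```python
def signed_laplacian(adj):
    """Return L = D - A, where D_ii = sum_j |a_ij| (signed-graph convention)."""
    n = len(adj)
    return [[(sum(abs(a) for a in adj[i]) if i == j else 0.0) - adj[i][j]
             for j in range(n)] for i in range(n)]


def simulate(adj, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of the flow dx/dt = -L x."""
    lap = signed_laplacian(adj)
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        dx = [-sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


# Hypothetical network: agents 0 and 1 cooperate (+1) and both are
# antagonistic to agent 2 (-1), so the graph is structurally balanced.
ADJ = [[0.0, 1.0, -1.0],
       [1.0, 0.0, -1.0],
       [-1.0, -1.0, 0.0]]
X0 = [1.0, 0.2, 0.6]

final = simulate(ADJ, X0)
```

In this example the opinions split into two clusters of equal magnitude and opposite sign: agents 0 and 1 settle at 0.2 and agent 2 at -0.2, which is the average of the initial opinions after flipping the sign of agent 2.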
Remote Tube-based MPC for Tracking Over Lossy Networks
This paper addresses the problem of controlling constrained systems subject to disturbances in the case where controller and system are connected over a lossy network. To do so, we propose a novel framework that splits the concept of tube-based model predictive control into two parts. One runs locally on the system and is responsible for disturbance rejection, while the other runs remotely and provides optimal input trajectories that satisfy the system's state and input constraints. Key to our approach is the presence of a nominal model and an ancillary controller on the local system. Theoretical guarantees regarding the recursive feasibility and the tracking capabilities in the presence of disturbances and packet losses in both directions are provided. To test the efficacy of the proposed approach, we compare it to a state-of-the-art solution in the case of controlling a cartpole system. Extensive simulations are carried out with both linearized and nonlinear system dynamics, as well as different packet loss probabilities and disturbances. The code for this work is available at https://github.com/EricssonResearch/Robust-Tracking-MPC-over-Lossy-Networks
comment: Accepted at the Conference on Decision and Control 2024, 8 pages, 6 figures
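The split between a remote planner and a local ancillary controller can be sketched on a scalar system. The example below is a minimal illustration under assumptions, not the paper's scheme: the plant parameters, the gain `K`, the placeholder `remote_plan` (which stands in for the remote MPC and ignores state and input constraints), and the buffering policy are all hypothetical. The local side applies u = v + K(x - z), so the error e = x - z obeys e+ = (A + BK)e + w regardless of which nominal inputs v arrive over the network.

```python
import random

A, B = 1.2, 1.0        # hypothetical unstable scalar plant: x+ = A x + B u + w
K = -0.9               # ancillary gain, so that |A + B*K| = 0.3 < 1
W_MAX = 0.1            # bound on the disturbance magnitude
HORIZON = 5            # length of each nominal input plan
LOSS_PROB = 0.4        # probability that a downlink packet is dropped


def remote_plan(z, target):
    """Placeholder for the remote MPC: inputs that halve the nominal error."""
    plan = []
    for _ in range(HORIZON):
        z_next = target + 0.5 * (z - target)
        plan.append((z_next - A * z) / B)
        z = z_next
    return plan


def run(steps=200, target=1.0, seed=0):
    rng = random.Random(seed)
    x = z = 0.0
    buffer = []
    max_err = 0.0
    for _ in range(steps):
        if rng.random() > LOSS_PROB:          # a fresh plan arrives
            buffer = remote_plan(z, target)
        v = buffer.pop(0) if buffer else 0.0  # empty buffer: zero nominal input
        u = v + K * (x - z)                   # local ancillary controller
        w = rng.uniform(-W_MAX, W_MAX)
        x = A * x + B * u + w                 # true system
        z = A * z + B * v                     # local nominal model
        max_err = max(max_err, abs(x - z))
    return x, z, max_err


x_final, z_final, max_err = run()
tube_radius = W_MAX / (1.0 - abs(A + B * K))
```

Starting from zero error, the deviation between the true and nominal states never leaves the interval of radius `tube_radius`, even when packets are dropped, because disturbance rejection runs entirely on the local side.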
Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems
Recent advancements in machine learning based energy management approaches, specifically reinforcement learning with a safety layer (OptLayerPolicy) and a metaheuristic algorithm generating a decision tree control policy (TreeC), have shown promise. However, their effectiveness has only been demonstrated in computer simulations. This paper presents the real-world validation of these methods, comparing them against model predictive control and simple rule-based control benchmarks. The experiments were conducted on the electrical installation of 4 reproductions of residential houses, which all have their own battery, photovoltaic and dynamic load system emulating a non-controllable electrical load and a controllable electric vehicle charger. The results show that the simple rules, TreeC, and model predictive control-based methods achieved similar costs, with a difference of only 0.6%. The reinforcement learning based method, still in its training phase, obtained a cost 25.5% higher than the other methods. Additional simulations show that the costs can be further reduced by using a more representative training dataset for TreeC and addressing errors in the model predictive control implementation caused by its reliance on accurate data from various sources. The OptLayerPolicy safety layer allows safe online training of a reinforcement learning agent in the real world, given an accurate constraint function formulation. The proposed safety layer method remains error-prone; nonetheless, it is found beneficial for all investigated methods. The TreeC method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.
The Design of Autonomous UAV Prototypes for Inspecting Tunnel Construction Environment
This article presents novel designs of autonomous UAV prototypes specifically developed for inspecting GPS-denied tunnel construction environments with dynamic human and robotic presence. Our UAVs integrate advanced sensor suites and robust motion planning algorithms to autonomously navigate and explore these complex environments. We validated our approach through comprehensive simulation experiments in PX4 Gazebo and AirSim Unreal Engine 4 environments. Real-world wind tests and exploration experiments demonstrate the UAVs' capability to operate stably under diverse environmental conditions without GPS assistance. This study highlights the practicality and resilience of our UAV prototypes in real-world applications.
comment: Autonomous UAV, Tunnel Inspection, GPS-Denied Environment, Safe Trajectory Planning, Real-World Testing
SustainDC -- Benchmarking for Sustainable Data Center Control NeurIPS 2024
Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for the effects of each other. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements. Our results highlight significant opportunities for improvement of data center operations using MARL algorithms. Given the increasing use of DC due to AI, SustainDC provides a crucial platform for the development and benchmarking of advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
comment: Under review at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
Local Cold Load Pick-up Estimation Using Customer Energy Consumption Measurements
Thermostatically-controlled loads have a significant impact on electricity demand after service is restored following an outage, a phenomenon known as cold load pick-up (CLPU). Active management of CLPU is becoming an essential tool for distribution system operators who seek to defer network upgrades and speed up post-outage customer restoration. One key functionality needed for actively managing CLPU is its forecast at various scales. The widespread deployment of smart metering devices is also opening up new opportunities for data-driven load modeling and forecasting. In this paper, we propose an approach for customer-side estimation of CLPU using time-stamped local load measurements. The proposed method uses Auto-Regressive Integrated Moving Average (ARIMA) modeling to produce short-term forecasts of the energy consumption foregone during an outage. Forecasts are made on an hourly basis to estimate the energy to potentially recover after outages lasting up to several hours. Moreover, to account for changing customer behavior and weather, the model order is adjusted dynamically. Simulation results based on actual smart meter measurements are presented for 50 residential customers over the duration of one year. These results are validated using physical modeling of residential loads and are shown to match the ARIMA-based forecasts well. Additionally, accuracy and execution speed have been compared with other state-of-the-art approaches for time-series forecasting, including Long Short Term Memory Network (LSTM) and Holt-Winters Exponential Smoothing (HWES). The ARIMA-based forecast is found to offer superior performance in terms of both accuracy and computation speed.
Artificial Intelligence in Power System Security and Stability Analysis: A Comprehensive Review
This review comprehensively examines the integration of artificial intelligence (AI) in enhancing the dynamic security assessments of modern power systems. It highlights the pivotal role of AI in facilitating scenario generation, incident prediction, risk assessment, and severity grading, thereby addressing the complexities introduced by renewable energy integration and advancements in digital grid technologies. The paper delves into data-driven techniques, with a particular focus on decision trees that effectively bridge operational characteristics with security metrics. These methodologies enable real-time, accurate predictions of system behaviors under varied operational conditions and support the optimization of control strategies. Through detailed analysis, we demonstrate how AI applications can transform traditional security assessment protocols, enhancing both the efficacy and efficiency of power system operations. The findings advocate for the potential of AI to significantly enhance the reliability and resilience of electrical grids, marking a paradigm shift towards more adaptive and intelligent power infrastructure.
comment: 10 pages, in Chinese language
Resilient source seeking with robot swarms
We present a solution for locating the source, or maximum, of an unknown scalar field using a swarm of mobile robots. Rather than relying on traditional gradient information, the swarm determines an ascending direction to approach the source with arbitrary precision. The ascending direction is calculated from measurements of the field strength at the robot locations and their positions relative to the centroid. Rather than focusing on individual robots, we focus the analysis on the density of robots per unit area to guarantee a more resilient swarm, i.e., the functionality remains even if individuals go missing or are misplaced during the mission. We reinforce the robustness of the algorithm by providing sufficient conditions for the swarm shape so that the ascending direction is almost parallel to the gradient. The swarm can respond to an unexpected environment by morphing its shape and exploiting the existence of multiple ascending directions. Finally, we validate our approach numerically with hundreds of robots. The fact that a large number of robots always calculate an ascending direction compensates for the loss of individuals and mitigates issues arising from actuator and sensor noise.
comment: 7 pages, CDC 2024, accepted version
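One way to read the description above is that each field measurement weights the corresponding robot's offset from the centroid, and the weighted offsets are summed. The sketch below takes that reading as an assumption: the unnormalised sum, the quadratic test field, the four-robot square formation, and the fixed-step motion rule are all hypothetical choices for illustration, not the authors' exact formulation.

```python
import math


def ascending_direction(positions, field):
    """Sum of field readings weighted by each robot's offset from the centroid."""
    n = len(positions)
    cx = sum(p[0] for p in positions) / n
    cy = sum(p[1] for p in positions) / n
    dx = sum(field(p) * (p[0] - cx) for p in positions)
    dy = sum(field(p) * (p[1] - cy) for p in positions)
    return dx, dy


def seek(positions, field, step=0.05, iterations=100):
    """Translate the whole swarm along the normalised ascending direction."""
    for _ in range(iterations):
        dx, dy = ascending_direction(positions, field)
        norm = math.hypot(dx, dy)
        if norm < 1e-12:
            break
        positions = [(x + step * dx / norm, y + step * dy / norm)
                     for x, y in positions]
    return positions


SOURCE = (3.0, 2.0)  # hypothetical source location


def field(p):
    """Hypothetical scalar field with a single maximum at SOURCE."""
    return -((p[0] - SOURCE[0]) ** 2 + (p[1] - SOURCE[1]) ** 2)


swarm = [(0.5, 0.5), (0.5, -0.5), (-0.5, 0.5), (-0.5, -0.5)]
direction = ascending_direction(swarm, field)

final = seek(swarm, field)
centroid = (sum(p[0] for p in final) / 4, sum(p[1] for p in final) / 4)
distance = math.hypot(centroid[0] - SOURCE[0], centroid[1] - SOURCE[1])
```

For this symmetric formation and quadratic field, the computed direction coincides with the field gradient at the centroid, (6, 4), and the centroid ends within one step length of the source. No robot computes a gradient explicitly; only field readings and relative positions are used.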
System-Level Simulation Framework for NB-IoT: Key Features and Performance Evaluation
Narrowband Internet of Things (NB-IoT) is a technology specifically designated by the 3rd Generation Partnership Project (3GPP) to meet the explosive demand for massive machine-type communications (mMTC), and it is evolving to RedCap. Industrial companies have increasingly adopted NB-IoT as the solution for mMTC due to its lightweight design and comprehensive technical specifications released by 3GPP. This paper presents a system-level simulation framework for NB-IoT networks to evaluate their performance. The system-level simulator is structured into four parts: initialization, pre-generation, main simulation loop, and post-processing. Additionally, three essential features are investigated to enhance coverage, support massive connections, and ensure low power consumption, respectively. Simulation results demonstrate that the cumulative distribution function curves of the signal-to-interference-and-noise ratio fully comply with industrial standards. Furthermore, the throughput performance explains how NB-IoT networks realize massive connections at the cost of data rate. This work highlights its practical utility and paves the way for developing NB-IoT networks.
Real-time Regulation of Detention Ponds via Feedback Control: Balancing Flood Mitigation and Water Quality
Detention ponds can mitigate flooding and improve water quality by allowing the settlement of pollutants. Typically, they are operated with fully open orifices and weirs (i.e., passive control). Active controls can improve the performance of these systems: orifices can be retrofitted with controlled valves and spillways can have controllable gates. The real-time optimal operation of their hydraulic devices can be achieved with techniques such as Model Predictive Control (MPC). A distributed quasi-2D hydrologic-hydrodynamic model coupled with a reservoir flood routing model is developed and integrated with an MPC algorithm to estimate the operation of valves and movable gates. The control optimization problem is adapted to switch from a flood-related algorithm focusing on mitigating floods to a heuristic objective function that aims to increase the detention time when no inflow hydrographs are predicted. The case studies show the potential of applying the methods developed in a catchment in Sao Paulo, Brazil. The performance of MPC is tested against alternatives with either fully or partially open valves and gates. Comparisons with HEC-RAS 2D indicate volume and peak flow errors of approximately 1.4% and 0.91% for the watershed module. Simulating two consecutive 10-year storms shows that the MPC strategy can achieve peak flow reductions of 79%. In contrast, passive control achieves nearly half of that performance. Results of a 1-year continuous simulation show that the passive scenario with 25% of the valves opened can treat 12% more runoff compared to the developed MPC approach, with an average detention time of approximately 6 hours. For the MPC approach, the average detention time is nearly 14 hours, indicating that both control techniques can treat similar volumes; however, the proxy water quality for the MPC approach is enhanced due to the longer detention times.
Virtual Elastic Tether: a New Approach for Multi-agent Navigation in Confined Aquatic Environments
Underwater navigation is a challenging area in the field of mobile robotics due to inherent constraints in self-localisation and communication in underwater environments. Some of these challenges can be mitigated by using collaborative multi-agent teams. However, when applied underwater, the robustness of traditional multi-agent collaborative control approaches is highly limited due to the unavailability of reliable measurements. In this paper, the concept of a Virtual Elastic Tether (VET) is introduced in the context of incomplete state measurements, which represents an innovative approach to underwater navigation in confined spaces. The concept of VET is formulated and validated using the Cooperative Aquatic Vehicle Exploration System (CAVES), which is a sim-to-real multi-agent aquatic robotic platform. Within this framework, a vision-based Autonomous Underwater Vehicle-Autonomous Surface Vehicle leader-follower formulation is developed. Experiments were conducted in both simulation and on a physical platform, benchmarked against a traditional Image-Based Visual Servoing approach. Results indicate that the formation of the baseline approach fails under discrete disturbances, when induced distances between the robots exceed 0.6 m in simulation and 0.3 m in the real world. In contrast, the VET-enhanced system recovers to pre-perturbation distances within 5 seconds. Furthermore, results illustrate the successful navigation of VET-enhanced CAVES in a confined water pond where the baseline approach fails to perform adequately.
comment: This work has been submitted to the Wiley for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
gatekeeper: Online Safety Verification and Control for Nonlinear Systems in Dynamic Environments IROS 2023
This paper presents the gatekeeper algorithm, a real-time and computationally-lightweight method that ensures that trajectories of a nonlinear system satisfy safety constraints despite sensing limitations. gatekeeper integrates with existing path planners and feedback controllers by introducing an additional verification step to ensure that proposed trajectories can be executed safely, despite nonlinear dynamics subject to bounded disturbances, input constraints and partial knowledge of the environment. Our key contribution is that (A) we propose an algorithm to recursively construct safe trajectories by numerically forward propagating the system over a (short) finite horizon, and (B) we prove that tracking such a trajectory ensures the system remains safe for all future time, i.e., beyond the finite horizon. We demonstrate the method in a simulation of a dynamic firefighting mission, and in physical experiments of a quadrotor navigating in an obstacle environment that is sensed online. We also provide comparisons against the state-of-the-art techniques for similar problems.
comment: Accepted at IEEE T-RO 2024. Accepted at IROS 2023. 17 pages, 10 figures
Fast and Accurate Relative Motion Tracking for Dual Industrial Robots
Industrial robotic applications such as spraying, welding, and additive manufacturing frequently require fast, accurate, and uniform motion along a 3D spatial curve. To increase process throughput, some manufacturers propose a dual-robot setup to overcome the speed limitation of a single robot. Industrial robot motion is programmed through waypoints connected by motion primitives (Cartesian linear and circular paths and linear joint paths at constant Cartesian speed). The actual robot motion is affected by the blending between these motion primitives and the pose of the robot (an outstretched/near-singularity pose tends to have larger path tracking errors). Choosing the waypoints and the speed along each motion segment to achieve the performance requirement is challenging. At present, there is no automated solution, and laborious manual tuning by robot experts is needed to approach the desired performance. In this paper, we present a systematic three-step approach to designing and programming a dual robot system to optimize system performance. The first step is to select the relative placement between the two robots based on the specified relative motion path. The second step is to select the relative waypoints and the motion primitives. The final step is to update the waypoints iteratively based on the actual measured relative motion. Waypoint iteration is first executed in simulation and then completed using the actual robots. For performance assessment, we use the mean path speed subject to the relative position and orientation constraints and the path speed uniformity constraint. We have demonstrated the effectiveness of this method on two systems, a physical testbed of two ABB robots and a simulation testbed of two FANUC robots, for two challenging test curves.
Line zonotopes: a set representation suitable for unbounded systems and its application to set-based state estimation and active fault diagnosis of descriptor systems
This paper proposes new methods for set-based state estimation and active fault diagnosis (AFD) of linear descriptor systems (LDS). In contrast to intervals, ellipsoids, and zonotopes, linear static constraints on the state variables, typical of descriptor systems, can be directly incorporated in the mathematical description of constrained zonotopes (CZs). Thanks to this feature, methods using CZs could provide less conservative enclosures than zonotope methods. However, an enclosure on the states should be known for all $k \geq 0$, which is not true in the case of unstable or unobservable LDS. In this context, this paper proposes a new representation for unbounded sets, which allows the development of methods for state estimation and AFD of stable and unstable LDS. Unlike many other extensions of CZs, the proposed set inherits most of their properties, including polynomial time complexity reduction methods, while allowing the description of different classes of sets, such as strips, hyperplanes, and the entire $n$-dimensional Euclidean space. The advantages of the proposed approaches with respect to CZ methods are highlighted in numerical examples.
comment: 15 pages, 6 figures. Revised manuscript includes a new name for the set representation, revised article structure, a new numerical example, and several minor modifications. Theoretical results unchanged. arXiv admin note: text overlap with arXiv:2306.07369
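For orientation, the constrained zonotope that the abstract above builds on is commonly written as follows. This is the standard definition from the CZ literature, not a formula quoted from this paper, and the symbols $c$, $G$, $A$, $b$, $n_g$, $n_c$ are notation assumed here for illustration:

$$
Z = \{\, c + G\xi \;:\; \|\xi\|_\infty \leq 1,\ A\xi = b \,\}, \qquad c \in \mathbb{R}^{n},\ G \in \mathbb{R}^{n \times n_g},\ A \in \mathbb{R}^{n_c \times n_g},\ b \in \mathbb{R}^{n_c}.
$$

The equality $A\xi = b$ is what lets linear static constraints on the state enter the set description directly, while the bound $\|\xi\|_\infty \leq 1$ keeps $Z$ bounded, which is the limitation the abstract refers to for unstable or unobservable systems.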
Formally Verifying Deep Reinforcement Learning Controllers with Lyapunov Barrier Certificates
Deep reinforcement learning (DRL) is a powerful machine learning paradigm for generating agents that control autonomous systems. However, the "black box" nature of DRL agents limits their deployment in real-world safety-critical applications. A promising approach for providing strong guarantees on an agent's behavior is to use Neural Lyapunov Barrier (NLB) certificates, which are learned functions over the system whose properties indirectly imply that an agent behaves as desired. However, NLB-based certificates are typically difficult to learn and even more difficult to verify, especially for complex systems. In this work, we present a novel method for training and verifying NLB-based certificates for discrete-time systems. Specifically, we introduce a technique for certificate composition, which simplifies the verification of highly-complex systems by strategically designing a sequence of certificates. When jointly verified with neural network verification engines, these certificates provide a formal guarantee that a DRL agent both achieves its goals and avoids unsafe behavior. Furthermore, we introduce a technique for certificate filtering, which significantly simplifies the process of producing formally verified certificates. We demonstrate the merits of our approach with a case study on providing safety and liveness guarantees for a DRL-controlled spacecraft.
comment: To appear in FMCAD 2024
Robotics
HADRON: Human-friendly Control and Artificial Intelligence for Military Drone Operations
As drones are getting more and more entangled in our society, more untrained users require the capability to operate them. This scenario is to be achieved through the development of artificial intelligence capabilities assisting the human operator in controlling the Unmanned Aerial System (UAS) and processing the sensor data, thereby alleviating the need for extensive operator training. This paper presents the HADRON project that seeks to develop and test multiple novel technologies to enable human-friendly control of drone swarms. This project is divided into three main parts. The first part consists of the integration of different technologies for the intuitive control of drones, focusing on novice or inexperienced pilots and operators. The second part focuses on the development of a multi-drone system that will be controlled from a command and control station, in which an expert pilot can supervise the operations of the multiple drones. The third part of the project will focus on reducing the cognitive load on human operators, whether they are novice or expert pilots. For this, we will develop AI tools that will assist drone operators with semi-automated real-time data processing.
comment: 4 pages, 2 Figures and 1 table. This work has been accepted at the Workshop Variable Autonomy for Human-Robot Teaming held at the 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN 2024)
Decision-Focused Learning to Predict Action Costs for Planning
In many automated planning applications, action costs can be hard to specify. An example is the time needed to travel through a certain road segment, which depends on many factors, such as the current weather conditions. A natural way to address this issue is to learn to predict these parameters based on input features (e.g., weather forecasts) and use the predicted action costs in automated planning afterward. Decision-Focused Learning (DFL) has been successful in learning to predict the parameters of combinatorial optimization problems in a way that optimizes solution quality rather than prediction quality. This approach yields better results than treating prediction and optimization as separate tasks. In this paper, we investigate for the first time the challenges of implementing DFL for automated planning in order to learn to predict the action costs. There are two main challenges to overcome: (1) planning systems are called during gradient descent learning, to solve planning problems with negative action costs, which are not supported in planning. We propose novel methods for gradient computation to avoid this issue. (2) DFL requires repeated planner calls during training, which can limit the scalability of the method. We experiment with different methods approximating the optimal plan as well as an easy-to-implement caching mechanism to speed up the learning process. As the first work that addresses DFL for automated planning, we demonstrate that the proposed gradient computation consistently yields significantly better plans than predictions aimed at minimizing prediction error; and that caching can temper the computation requirements.
Learn2Decompose: Learning Problem Decomposition for Efficient Task and Motion Planning
We focus on designing an efficient Task and Motion Planning (TAMP) approach for long-horizon manipulation tasks involving multi-step manipulation of multiple objects. TAMP solvers typically require exponentially longer planning time as the planning horizon and the number of environmental objects increase. To address this challenge, we first propose Learn2Decompose, a Learning from Demonstrations (LfD) approach that learns embedding task rules from demonstrations and decomposes the long-horizon problem into several subproblems. These subproblems require planning over shorter horizons with fewer objects and can be solved in parallel. We then design a parallelized hierarchical TAMP framework that concurrently solves the subproblems and concatenates the resulting subplans for the target task, significantly improving the planning efficiency of classical TAMP solvers. The effectiveness of our proposed methods is validated in both simulation and real-world experiments.
comment: Submitted for potential publication
Exploring Domain Shift on Radar-Based 3D Object Detection Amidst Diverse Environmental Conditions SC
The rapid evolution of deep learning and its integration with autonomous driving systems have led to substantial advancements in 3D perception using multimodal sensors. Notably, radar sensors show greater robustness compared to cameras and lidar under adverse weather and varying illumination conditions. This study delves into the often-overlooked yet crucial issue of domain shift in 4D radar-based object detection, examining how varying environmental conditions, such as different weather patterns and road types, impact 3D object detection performance. Our findings highlight distinct domain shifts across various weather scenarios, revealing unique dataset sensitivities that underscore the critical role of radar point cloud generation. Additionally, we demonstrate that transitioning between different road types, especially from highways to urban settings, introduces notable domain shifts, emphasizing the necessity for diverse data collection across varied road environments. To the best of our knowledge, this is the first comprehensive analysis of domain shift effects on 4D radar-based object detection. We believe this empirical study contributes to understanding the complex nature of domain shifts in radar data and suggests paths forward for data collection strategy in the face of environmental variability.
comment: 6 pages, 5 figures, 3 tables, accepted in IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024
Grasping by Hanging: a Learning-Free Grasping Detection Method for Previously Unseen Objects
This paper proposes a novel learning-free three-stage method that predicts grasping poses, enabling robots to pick up and transfer previously unseen objects. Our method first identifies potential structures that can afford the action of hanging by analyzing the hanging mechanics and geometric properties. Then 6D poses are detected for a parallel gripper retrofitted with an extending bar, which when closed forms loops to hook each hangable structure. Finally, an evaluation policy qualifies and ranks grasp candidates for execution attempts. Compared to the traditional physical model-based and deep learning-based methods, our approach is closer to the natural human action of grasping unknown objects, and it also eliminates the need for a vast amount of training data. To evaluate the effectiveness of the proposed method, we conducted experiments with a real robot. Experimental results indicate that the grasping accuracy and stability are significantly higher than those of the state-of-the-art learning-based method, especially for thin and flat objects.
comment: 13 pages and 7 figures
Adaptive USVs Swarm Optimization for Target Tracking in Dynamic Environments
This research investigates the performance and efficiency of Unmanned Surface Vehicles (USVs) in multi-target tracking scenarios using the Adaptive Particle Swarm Optimization with k-Nearest Neighbors (APSO-kNN) algorithm. The study explores various search patterns (Random Walk, Spiral, Lawnmower, and Cluster Search) to assess their effectiveness in dynamic environments. Through extensive simulations, we evaluate the impact of different search strategies, varying the number of targets and USVs' sensing capabilities, and integrating a Pursuit-Evasion model to test adaptability. Our findings demonstrate that systematic search patterns like Spiral and Lawnmower provide superior coverage and tracking accuracy, making them ideal for thorough area exploration. In contrast, the Random Walk pattern, while highly adaptable, shows lower accuracy due to its non-deterministic nature, and Cluster Search maintains group cohesion but is heavily dependent on target distribution. The mixed strategy, combining multiple patterns, offers robust performance across varied scenarios, while APSO-kNN effectively balances exploration and exploitation, making it a promising approach for real-world applications such as surveillance, search and rescue, and environmental monitoring. This study provides valuable insights into optimizing search strategies and sensing configurations for USV swarms, ultimately enhancing their operational efficiency and success in complex environments.
comment: 9 pages
SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields ECCV 2024
The ability to distill object-centric abstractions from intricate visual scenes underpins human-level generalization. Despite the significant progress in object-centric learning methods, learning object-centric representations in the 3D physical world remains a crucial challenge. In this work, we propose SlotLifter, a novel object-centric radiance model addressing scene reconstruction and decomposition jointly via slot-guided feature lifting. Such a design unites object-centric learning representations and image-based rendering methods, offering state-of-the-art performance in scene decomposition and novel-view synthesis on four challenging synthetic and four complex real-world datasets, outperforming existing 3D object-centric learning methods by a large margin. Through extensive ablative studies, we showcase the efficacy of designs in SlotLifter, revealing key insights for potential future directions.
comment: Accepted by ECCV 2024. Project website: https://slotlifter.github.io
Compact robotic gripper with tandem actuation for selective fruit harvesting
Selective fruit harvesting is a challenging manipulation problem due to occlusions and clutter arising from plant foliage. A harvesting gripper should i) have a small cross-section, to avoid collisions while approaching the fruit; ii) have a soft and compliant grasp to adapt to different fruit geometry and avoid bruising it; and iii) be capable of rigidly holding the fruit tightly enough to counteract detachment forces. Previous work on fruit harvesting has primarily focused on using grippers with a single actuation mode, either suction or fingers. In this paper we present a compact robotic gripper that combines the benefits of both. The gripper first uses an array of compliant suction cups to gently attach to the fruit. After attachment, telescoping cam-driven fingers deploy, sweeping obstacles away before pivoting inwards to provide a secure grip on the fruit for picking. We present and analyze the finger design for both ability to sweep clutter and maintain a tight grasp. Specifically, we use a motorized test bed to measure grasp strength for each actuation mode (suction, fingers, or both). We apply a tensile force at different angles (0°, 15°, 30° and 45°), and vary the point of contact between the fingers and the fruit. We observed that with both modes the grasp strength is approximately 40 N. We use an apple proxy to test the gripper's ability to obtain a grasp in the presence of occluding apples and leaves, achieving a grasp success rate over 96% (with an ideal controller). Finally, we validate our gripper in a commercial apple orchard.
comment: 8 pages, 9 figures
Design of a Double-joint Robotic Fish Using a Composite Linkage
Robotic fish is one of the most promising directions of the new generation of underwater vehicles. Traditional biomimetic fish often mimic fish joints using tandem components like servos, which leads to increased volume, weight and control complexity. In this paper, a new double-joint robotic fish using a composite linkage was designed, where the propulsion mechanism transforms the single-degree-of-freedom rotation of the motor into a double-degree-of-freedom coupled motion, namely caudal peduncle translation and caudal fin rotation. Motion analysis of the propulsion mechanism demonstrates its ability to closely emulate the undulating movement observed in carangiform fish. Experimental results further validate the feasibility of the proposed propulsion mechanism. To improve propulsion efficiency, an analysis is conducted to explore the influence of swing angle amplitude and swing frequency on the swimming speed of the robotic fish. This examination establishes a practical foundation for future research on such robotic fish systems.
MAPPO-PIS: A Multi-Agent Proximal Policy Optimization Method with Prior Intent Sharing for CAVs' Cooperative Decision-Making
Vehicle-to-Vehicle (V2V) technologies have great potential for enhancing traffic flow efficiency and safety. However, cooperative decision-making in multi-agent systems, particularly in complex human-machine mixed merging areas, remains challenging for connected and autonomous vehicles (CAVs). Intent sharing, a key aspect of human coordination, may offer an effective solution to these decision-making problems, but its application in CAVs is under-explored. This paper presents an intent-sharing-based cooperative method, the Multi-Agent Proximal Policy Optimization with Prior Intent Sharing (MAPPO-PIS), which models the CAV cooperative decision-making problem as a Multi-Agent Reinforcement Learning (MARL) problem. It involves training and updating the agents' policies through the integration of two key modules: the Intention Generator Module (IGM) and the Safety Enhanced Module (SEM). The IGM is specifically crafted to generate and disseminate CAVs' intended trajectories spanning multiple future time-steps. The SEM, in turn, assesses the safety of the decisions made and rectifies them if necessary. A merging area with human-machine mixed traffic flow is selected to validate our method. Results show that MAPPO-PIS significantly improves decision-making performance in multi-agent systems, surpassing state-of-the-art baselines in safety, efficiency, and overall traffic system performance. The code and video demo can be found at: https://github.com/CCCC1dhcgd/A-MAPPO-PIS.
A Miniature Vision-Based Localization System for Indoor Blimps
With increasing attention paid to blimp research, I hope to build an indoor blimp to interact with humans. To begin with, I propose developing a visual localization system to enable blimps to localize themselves in an indoor environment autonomously. This system initially reconstructs an indoor environment by employing Structure from Motion with SuperPoint visual features. Next, with the previously built sparse point cloud map, the system generates camera poses by continuously employing pose estimation on matched visual features observed from the map. In this project, the blimp only serves as a reference mobile platform that constrains the weight of the perception system. The perception system contains one monocular camera and a WiFi adaptor to capture and transmit visual data to a ground PC station where the algorithms will be executed. The success of this project will transform remote-controlled indoor blimps into autonomous indoor blimps, which can be utilized for applications such as surveillance, advertisement, and indoor mapping.
Centralization vs. decentralization in multi-robot coverage: Ground robots under UAV supervision
In swarm robotics, decentralized control is often proposed as a more scalable and fault-tolerant alternative to centralized control. However, centralized behaviors are often faster and more efficient than their decentralized counterparts. In any given application, the goals and constraints of the task being solved should guide the choice to use centralized control, decentralized control, or a combination of the two. Currently, the tradeoffs that exist between centralization and decentralization have not been thoroughly studied. In this paper, we investigate these tradeoffs for multi-robot coverage, and find that they are more nuanced than expected. For instance, our findings reinforce the expectation that more decentralized control will provide better scalability, but contradict the expectation that more decentralized control will perform better in environments with randomized obstacles. Beginning with a group of fully independent ground robots executing coverage, we add unmanned aerial vehicles as supervisors and progressively increase the degree to which the supervisors use centralized control, in terms of access to global information and a central coordinating entity. We compare, using the multi-robot physics-based simulation environment ARGoS, the following four control approaches: decentralized control, hybrid control, centralized control, and predetermined control. In comparing the ground robots performing the coverage task, we assess the speed and efficiency advantages of centralization -- in terms of coverage completeness and coverage uniformity -- and we assess the scalability and fault tolerance advantages of decentralization. We also assess the energy expenditure disadvantages of centralization due to different energy consumption rates of ground robots and unmanned aerial vehicles, according to the specifications of robots available off-the-shelf.
comment: IRIDIA, Universite Libre de Bruxelles, Brussels, Belgium, 2021
A Comparison of Imitation Learning Algorithms for Bimanual Manipulation
Amidst the wide popularity of imitation learning algorithms in robotics, their properties regarding hyperparameter sensitivity, ease of training, data efficiency, and performance have not been well-studied in high-precision industry-inspired environments. In this work, we demonstrate the limitations and benefits of prominent imitation learning approaches and analyze their capabilities regarding these properties. We evaluate each algorithm on a complex bimanual manipulation task involving an over-constrained dynamics system in a setting involving multiple contacts between the manipulated object and the environment. While we find that imitation learning is well suited to solve such complex tasks, not all algorithms are equal in terms of handling environmental and hyperparameter perturbations, training requirements, performance, and ease of use. We investigate the empirical influence of these key characteristics by employing a carefully designed experimental procedure and learning environment. Paper website: https://bimanual-imitation.github.io/
Spb3DTracker: A Robust LiDAR-Based Person Tracker for Noisy Environment
Person detection and tracking (PDT) has seen significant advancements with 2D camera-based systems in the autonomous vehicle field, leading to widespread adoption of these algorithms. However, growing privacy concerns have recently emerged as a major issue, prompting a shift towards LiDAR-based PDT as a viable alternative. Within this domain, "Tracking-by-Detection" (TBD) has become a prominent methodology. Despite its effectiveness, LiDAR-based PDT has not yet achieved the same level of performance as camera-based PDT. This paper examines key components of the LiDAR-based PDT framework, including detection post-processing, data association, motion modeling, and lifecycle management. Building upon these insights, we introduce SpbTrack, a robust person tracker designed for diverse environments. Among LiDAR-based trackers, our method achieves superior performance on noisy datasets and state-of-the-art results on the KITTI dataset benchmarks and a custom indoor office dataset.
comment: 17 pages, 5 figures
A Universal Flexible Near-sensor Neuromorphic Tactile System with Multi-threshold strategy for Pressure Characteristic Detection
Constructing a new generation of information processing systems by mimicking the biological nervous system is a feasible way to implement highly efficient intelligent sensing devices and bionic robots. However, most biological nervous systems, especially the tactile system, have various powerful functions, which poses a major challenge for bionic system design. Here we report a universal, fully flexible neuromorphic tactile perception system with strong compatibility and a multi-threshold signal processing strategy. As in the nervous system, signals in our system are transmitted as pulses and processed as threshold information. For feasibility verification, recognition of three different types of pressure signals (continuously changing signal, Morse code signal, and symbol pattern) is tested. Our system can accurately output the trend of these signals and achieves high accuracy in the recognition of symbol patterns and Morse code. Compared to a conventional system, the consumption of our system decreases significantly for the same recognition task. We also give a detailed introduction and demonstration of our system's universality.
Continual Driving Policy Optimization with Closed-Loop Individualized Curricula ICRA 2024
The safety of autonomous vehicles (AV) has been a long-standing top concern, stemming from the absence of rare and safety-critical scenarios in the long-tail naturalistic driving distribution. To tackle this challenge, a surge of research in scenario-based autonomous driving has emerged, with a focus on generating high-risk driving scenarios and applying them to conduct safety-critical testing of AV models. However, limited work has explored the reuse of these extensive scenarios to iteratively improve AV models. Moreover, it remains intractable and challenging to filter through gigantic scenario libraries collected from other AV models with distinct behaviors, attempting to extract transferable information for current AV improvement. Therefore, we develop a continual driving policy optimization framework featuring Closed-Loop Individualized Curricula (CLIC), which we factorize into a set of standardized sub-modules for flexible implementation choices: AV Evaluation, Scenario Selection, and AV Training. CLIC frames AV Evaluation as a collision prediction task, where it estimates the chance of AV failures in these scenarios at each iteration. Subsequently, by re-sampling from historical scenarios based on these failure probabilities, CLIC tailors individualized curricula for downstream training, aligning them with the evaluated capability of AV. Accordingly, CLIC not only maximizes the utilization of the vast pre-collected scenario library for closed-loop driving policy optimization but also facilitates AV improvement by individualizing its training with more challenging cases out of those poorly organized scenarios. Experimental results clearly indicate that CLIC surpasses other curriculum-based training strategies, showing substantial improvement in managing risky scenarios, while still maintaining proficiency in handling simpler cases.
comment: ICRA 2024
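The re-sampling step described in the CLIC abstract above can be illustrated with a minimal sketch. It assumes that scenarios are drawn with probability proportional to their predicted failure probability; the abstract only says the sampling is "based on" those probabilities, so the proportional weighting, the function name `sample_curriculum`, and its parameters are hypothetical.

```python
import random


def sample_curriculum(scenarios, failure_probs, k, seed=None):
    """Draw k scenarios (with replacement), weighted by predicted failure probability.

    Assumption: weights are proportional to failure probability. This is an
    illustrative choice, not the paper's confirmed sampling rule.
    """
    if len(scenarios) != len(failure_probs):
        raise ValueError("scenarios and failure_probs must have the same length")
    if any(p < 0 for p in failure_probs) or not any(p > 0 for p in failure_probs):
        raise ValueError("failure_probs must be non-negative with at least one positive entry")
    rng = random.Random(seed)  # local generator so the global random state is untouched
    return rng.choices(scenarios, weights=failure_probs, k=k)
```

Under this weighting, scenarios the current policy is predicted to handle safely (probability near zero) rarely enter the curriculum, while likely failures are revisited more often.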
Non-convex Pose Graph Optimization in SLAM via Proximal Linearized Riemannian ADMM
Pose graph optimization (PGO) is a well-known technique for solving the pose-based simultaneous localization and mapping (SLAM) problem. In this paper, we represent the rotation and translation by a unit quaternion and a three-dimensional vector, and propose a new PGO model based on the von Mises-Fisher distribution. The constraints derived from the unit quaternions are spherical manifolds, and the projection onto the constraints can be calculated by normalization. Then a proximal linearized Riemannian alternating direction method of multipliers (PieADMM) is developed to solve the proposed model, which not only has low memory requirements, but also can update the poses in parallel. Furthermore, we establish the iteration complexity of $O(1/\epsilon^{2})$ of PieADMM for finding an $\epsilon$-stationary solution of our model. The efficiency of our proposed algorithm is demonstrated by numerical experiments on two synthetic and four 3D SLAM benchmark datasets.
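The abstract above notes that projection onto the unit-quaternion constraint reduces to normalization. A minimal sketch of that single step, with the hypothetical helper name `project_to_unit_quaternion` and no claim about how PieADMM itself is implemented:

```python
import math


def project_to_unit_quaternion(q):
    """Project a nonzero 4-vector onto the unit sphere S^3 by normalization.

    The result is the closest unit quaternion to q in the Euclidean norm.
    """
    if len(q) != 4:
        raise ValueError("a quaternion needs exactly four components")
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        # The projection of the zero vector onto the sphere is not unique.
        raise ValueError("cannot project the zero vector")
    return tuple(c / norm for c in q)
```

Because the projection has this closed form, each pose can be handled independently, which is consistent with the parallel pose updates the abstract mentions.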
Automatic Spatial Calibration of Near-Field MIMO Radar With Respect to Optical Depth Sensors IROS 2024
Despite an emerging interest in MIMO radar, the utilization of its complementary strengths in combination with optical depth sensors has so far been limited to far-field applications, due to the challenges that arise from mutual sensor calibration in the near field. In fact, most related approaches in the autonomous industry propose target-based calibration methods using corner reflectors that have proven to be unsuitable for the near field. In contrast, we propose a novel, joint calibration approach for optical RGB-D sensors and MIMO radars that is designed to operate in the radar's near-field range, within decimeters from the sensors. Our pipeline consists of a bespoke calibration target, allowing for automatic target detection and localization, followed by the spatial calibration of the two sensor coordinate systems through target registration. We validate our approach using two different depth sensing technologies from the optical domain. The experiments show the efficiency and accuracy of our calibration for various target displacements, as well as the robustness of our localization in terms of signal ambiguities.
comment: 8 pages, 9 figures, accepted to IROS 2024
Gravity-aware Grasp Generation with Implicit Grasp Mode Selection for Underactuated Hands IROS2024
Learning-based grasp detectors typically assume a precision grasp, where each finger only has one contact point, and estimate the grasp probability. In this work, we propose a data generation and learning pipeline that can leverage power grasping, which has more contact points with an enveloping configuration and is robust against both positioning error and force disturbance. To train a grasp detector to prioritize power grasping while still keeping precision grasping as the secondary choice, we propose to train the network against the magnitude of disturbance in the gravity direction a grasp can resist (gravity-rejection score) rather than the binary classification of success. We also provide an efficient data generation pipeline for a dataset with gravity-rejection score annotation. In addition to thorough ablation studies, quantitative evaluation both in simulation and on a real robot shows the significant improvement of our approach, especially when the objects are heavy.
comment: Accepted for IROS2024
Enhancing Visual Place Recognition via Fast and Slow Adaptive Biasing in Event Cameras IROS 2024
Event cameras are increasingly popular in robotics due to beneficial features such as low latency, energy efficiency, and high dynamic range. Nevertheless, their downstream task performance is greatly influenced by the optimization of bias parameters. These parameters, for instance, regulate the necessary change in light intensity to trigger an event, which in turn depends on factors such as the environment lighting and camera motion. This paper introduces feedback control algorithms that automatically tune the bias parameters through two interacting methods: 1) An immediate, on-the-fly "fast" adaptation of the refractory period, which sets the minimum interval between consecutive events, and 2) if the event rate exceeds the specified bounds even after changing the refractory period repeatedly, the controller adapts the pixel bandwidth and event thresholds, which stabilizes after a short period of noise events across all pixels ("slow" adaptation). Our evaluation focuses on the visual place recognition task, where incoming query images are compared to a given reference database. We conducted comprehensive evaluations of our algorithms' adaptive feedback control in real-time. To do so, we collected the QCR-Fast-and-Slow dataset that contains DAVIS346 event camera streams from 366 repeated traversals of a Scout Mini robot navigating through a 100-meter-long indoor lab setting (totaling over 35 km distance traveled) in varying brightness conditions with ground truth location information. Our proposed feedback controllers result in superior performance when compared to the standard bias settings and prior feedback control methods. Our findings also detail the impact of bias adjustments on task performance and feature ablation studies on the fast and slow adaptation mechanisms.
comment: 8 pages, 9 figures, paper accepted to the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
WaveShot: A Compact Portable Unmanned Surface Vessel for Dynamic Water Surface Videography and Media Production
This paper presents WaveShot, an innovative portable unmanned surface vessel that aims to transform water surface videography by offering a highly maneuverable, cost-effective, and safe alternative to traditional filming methods. WaveShot is designed for the modern demands of film production, advertising, documentaries, and visual arts, equipped with professional-grade waterproof cameras and advanced technology to capture static and dynamic scenes on waterways. We discuss the development and advantages of WaveShot, highlighting its portability, ease of transport, and rapid deployment capabilities. Experimental validation showcases WaveShot's stability and high-quality video capture in various water conditions, as well as the integration of monocular depth estimation algorithms to enhance the operator's spatial perception. The paper concludes by exploring WaveShot's real-world applications, its user-friendly remote operation, and future enhancements such as gimbal integration and advanced computer vision for optimized videography on water surfaces.
A Survey of Behavior Learning Applications in Robotics -- State of the Art and Perspectives
Recent success of machine learning in many domains has been overwhelming, which often leads to false expectations regarding the capabilities of behavior learning in robotics. In this survey, we analyze the current state of machine learning for robotic behaviors. We will give a broad overview of behaviors that have been learned and used on real robots. Our focus is on kinematically or sensorially complex robots. That includes humanoid robots or parts of humanoid robots, for example, legged robots or robotic arms. We will classify presented behaviors according to various categories and we will draw conclusions about what can be learned and what should be learned. Furthermore, we will give an outlook on problems that are challenging today but might be solved by machine learning in the future and argue that classical robotics and other approaches from artificial intelligence should be integrated more with machine learning to form complete, autonomous systems.
comment: Research Report of DFKI GmbH, Robotics Innovation Center
Multiagent Systems
Multi-Agent Continuous Control with Generative Flow Networks
Generative Flow Networks (GFlowNets) aim to generate diverse trajectories from a distribution in which the final states of the trajectories are proportional to the reward, serving as a powerful alternative to reinforcement learning for exploratory control tasks. However, the individual-flow matching constraint in GFlowNets limits their applications for multi-agent systems, especially continuous joint-control problems. In this paper, we propose a novel Multi-Agent generative Continuous Flow Networks (MACFN) method to enable multiple agents to perform cooperative exploration for various compositional continuous objects. Technically, MACFN trains decentralized individual-flow-based policies in a centralized global-flow-based matching fashion. During centralized training, MACFN introduces a continuous flow decomposition network to deduce the flow contributions of each agent in the presence of only global rewards. Then agents can deliver actions solely based on their assigned local flow in a decentralized way, forming a joint policy distribution proportional to the rewards. To guarantee the expressiveness of continuous flow decomposition, we theoretically derive a consistency condition on the decomposition network. Experimental results demonstrate that the proposed method yields results superior to the state-of-the-art counterparts and better exploration capability. Our code is available at https://github.com/isluoshuang/MACFN.
Systems and Control (CS)
Robust Model Predictive Control for Aircraft Intent-Aware Collision Avoidance
This paper presents the use of robust model predictive control for the design of an intent-aware collision avoidance system for multi-agent aircraft engaged in horizontal maneuvering scenarios. We assume that information from other agents is accessible in the form of waypoints or destinations. Consequently, we consider that other agents follow their optimal Dubins path--a trajectory that connects their current state to their intended state--while accounting for potential uncertainties. We propose using scenario tree model predictive control as a robust approach that offers computational efficiency. We demonstrate that the proposed method can easily integrate intent information and offer a robust scheme that handles different uncertainties. The method is illustrated through simulation results.
comment: 8 Pages, 10 Figs
Optimization-Based Model Checking and Trace Synthesis for Complex STL Specifications
We present a bounded model checking algorithm for signal temporal logic (STL) that exploits mixed-integer linear programming (MILP). A key technical element is our novel MILP encoding of the STL semantics; it follows the idea of stable partitioning from the recent work on SMT-based STL model checking. Assuming that our (continuous-time) system models can be encoded to MILP -- typical examples are rectangular hybrid automata (precisely) and hybrid dynamics with closed-form solutions (approximately) -- our MILP encoding yields an optimization-based model checking algorithm that is scalable, is anytime/interruptible, and accommodates parameter mining. Experimental evaluation shows our algorithm's performance advantages especially for complex STL formulas, demonstrating its practical relevance e.g. in the automotive domain.
comment: Extended version of the paper accepted by 36th International Conference on Computer-Aided Verification (CAV), 2024
Verification of Diagnosability for Cyber-Physical Systems: A Hybrid Barrier Certificate Approach
Diagnosability is a system theoretical property characterizing whether fault occurrences in a system can always be detected within a finite time. In this paper, we investigate the verification of diagnosability for cyber-physical systems with continuous state sets. We develop an abstraction-free and automata-based framework to verify (the lack of) diagnosability, leveraging a notion of hybrid barrier certificates. To this end, we first construct a (delta,K)-deterministic finite automaton that captures the occurrence of faults targeted for diagnosis. Then, the verification of diagnosability property is converted into a safety verification problem over a product system between the automaton and the augmented version of the dynamical system. We demonstrate that this verification problem can be addressed by computing hybrid barrier certificates for the product system. To this end, we introduce two systematic methods, leveraging sum-of-squares programming and counter-example guided inductive synthesis to search for such certificates. Additionally, if the system is found to be diagnosable, we propose methodologies to construct a diagnoser to identify fault occurrences online. Finally, we showcase the effectiveness of our methods through a case study.
On the Improvement of the Performance of Inexpensive Electromagnetic Skins by means of an Inverse Source Design Approach
A new methodology for the improvement of the performance of inexpensive static passive electromagnetic skins (SP-EMSs) is presented. The proposed approach leverages the non-uniqueness of the inverse source problem associated with the SP-EMS design by decomposing the induced surface current into pre-image (PI) and null-space (NS) components. Subsequently, the unknown EMS layout and NS expansion coefficients are determined by means of an alternating minimization of a suitable cost function. The latter quantifies the mismatch between the ideal surface current, which radiates the user-defined target field, and that actually induced on the EMS layout. Results from a representative set of numerical experiments, concerned with the design of EMSs reflecting pencil-beam as well as contoured target patterns, are reported to assess the feasibility and the effectiveness of the proposed method in improving the performance of inexpensive EMS realizations. The measurements on an EMS prototype, featuring a conductive ink pattern printed on a standard paper substrate, are also shown to prove the reliability of the synthesis process.
Synthesis of Wide-Angle Scanning Arrays through Array Power Control
A new methodology for the synthesis of wide-angle scanning arrays is proposed. It is based on the formulation of the array design problem as a multi-objective one where, for each scan angle, both the radiated power density in the scan direction and the total reflected power are accounted for. A set of numerical results from full-wave simulated examples - dealing with different radiators, arrangements, frequencies, and number of elements - is reported to show the features of the proposed approach as well as to assess its potential.
Residual Deep Reinforcement Learning for Inverter-based Volt-Var Control
A residual deep reinforcement learning (RDRL) approach is proposed by integrating DRL with model-based optimization for inverter-based volt-var control in active distribution networks when the accurate power flow model is unknown. RDRL learns a residual action with a reduced residual action space, based on the action of the model-based approach with an approximate model. RDRL inherits the control capability of the approximate-model-based optimization and enhances the policy optimization capability by residual policy learning. Additionally, it improves the approximation accuracy of the critic and reduces the search difficulties of the actor by reducing residual action space. To address the issues of "too small" or "too large" residual action space of RDRL and further improve the optimization performance, we extend RDRL to a boosting RDRL approach. It selects a much smaller residual action space and learns a residual policy by using the policy of RDRL as a base policy. Simulations demonstrate that RDRL and boosting RDRL improve the optimization performance considerably throughout the learning stage and verify their rationales point-by-point, including 1) inheriting the capability of the approximate model-based optimization, 2) residual policy learning, and 3) learning in a reduced action space.
comment: arXiv admin note: text overlap with arXiv:2210.07360
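The residual-action idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `compose_action`, the scaling parameter `residual_scale`, and the normalized action bounds are all assumptions introduced for illustration. The sketch only shows how a learned residual, confined to a reduced range, might be added to the action of an approximate-model-based optimizer.

```python
import numpy as np

def compose_action(base_action, residual, residual_scale, low, high):
    """Combine a model-based action with a bounded learned residual.

    base_action    : action from the approximate-model-based optimizer
    residual       : raw residual from the learned policy, treated as in [-1, 1]
    residual_scale : half-width of the reduced residual action space (assumed)
    low, high      : bounds of the full action space (assumed)
    """
    # Keep the residual inside its reduced range before scaling it.
    residual = np.clip(residual, -1.0, 1.0)
    # Final action = model-based action + small learned correction.
    return np.clip(base_action + residual_scale * residual, low, high)

if __name__ == "__main__":
    base = np.array([0.5, -0.9])    # hypothetical normalized set-points
    delta = np.array([1.0, -1.0])   # hypothetical policy output
    print(compose_action(base, delta, 0.2, -1.0, 1.0))
```

A smaller `residual_scale` restricts how far the learned policy can move away from the model-based action, which is the sense in which the search space of the actor is reduced in this sketch.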
Robustness of optimal quantum annealing protocols
Noise in quantum computing devices poses a key challenge in their realization. In this paper, we study the robustness of optimal quantum annealing protocols against coherent control errors, which are multiplicative Hamiltonian errors causing detrimental effects on current quantum devices. We show that the norm of the Hamiltonian quantifies the robustness against these errors, motivating the introduction of an additional regularization term in the cost function. We analyze the optimality conditions of the resulting robust quantum optimal control problem based on Pontryagin's maximum principle, showing that robust protocols admit larger smooth annealing sections. This suggests that quantum annealing admits improved robustness in comparison to bang-bang solutions such as the quantum approximate optimization algorithm. Finally, we perform numerical simulations to verify our analytical results and demonstrate the improved robustness of the proposed approach.
Robust Deep Reinforcement Learning for Inverter-based Volt-Var Control in Partially Observable Distribution Networks
Inverter-based volt-var control is studied in this paper. One key issue in DRL-based approaches is the limited measurement deployment in active distribution networks, which leads to problems of a partially observable state and unknown reward. To address those problems, this paper proposes a robust DRL approach with a conservative critic and a surrogate reward. The conservative critic utilizes the quantile regression technique to estimate a conservative state-action value function based on the partially observable state, which helps to train a robust policy; the surrogate rewards of power loss and voltage violation are designed so that they can be calculated from the limited measurements. The proposed approach optimizes the power loss of the whole network and the voltage profile of buses with measurable voltages while indirectly improving the voltage profile of other buses. Extensive simulations verify the effectiveness of the robust DRL approach in different limited measurement conditions, even when only the active power injection of the root bus and fewer than 10% of bus voltages are measurable.
Designing Consensus-Based Distributed Filtering over Directed Graphs
This paper proposes a novel consensus-on-only-measurement distributed filter over directed graphs under the collective observability condition. First, the distributed filter structure is designed with an augmented leader-following measurement fusion strategy. Subsequently, two parameter design methods are presented, and the consensus gain parameter is devised utilizing local information exclusively rather than global information. Additionally, the lower bound of the fusion step is derived to guarantee a uniform upper bound of the estimation error covariance. Moreover, the lower bounds of the convergence rates of the steady-state performance gap between the proposed algorithm and the centralized filter are provided with the fusion step approaching infinity. The analysis demonstrates that the convergence rate is, at a minimum, as rapid as exponential convergence under the spectral norm condition of the communication graph. The transient performance is also analyzed with the fusion step tending to infinity. The inherent trade-off between the communication cost and the filtering performance is revealed from the analysis of the steady-state performance and the transient performance. Finally, the theoretical results are validated through two simulation examples.
On the Effects of Modeling Errors on Distributed Continuous-time Filtering
This paper offers a comprehensive performance analysis of the distributed continuous-time filtering in the presence of modeling errors. First, we introduce two performance indices, namely the nominal performance index and the estimation error covariance. By leveraging the nominal performance index and the Frobenius norm of the modeling deviations, we derive the bounds of the estimation error covariance and the lower bound of the nominal performance index. Specifically, we reveal the effect of the consensus parameter on both bounds. We demonstrate that, under specific conditions, an incorrect process noise covariance can lead to the divergence of the estimation error covariance. Moreover, we investigate the properties of the eigenvalues of the error dynamical matrix. In the context of switching topological configurations, we provide a sufficient condition that ensures the stability of the error dynamical matrix. Furthermore, we explore the magnitude relations between the nominal performance index and the estimation error covariance. Finally, we present some numerical simulations to validate the effectiveness of the theoretical results.
Performance Analysis of Distributed Filtering under Mismatched Noise Covariances
This paper systematically investigates the performance of consensus-based distributed filtering under mismatched noise covariances. First, we introduce three performance evaluation indices for such filtering problems, namely the standard performance evaluation index, the nominal performance evaluation index, and the estimation error covariance. We derive difference expressions among these indices and establish one-step relations among them under various mismatched noise covariance scenarios. We particularly reveal the effect of the consensus fusion on these relations. Furthermore, the recursive relations are introduced by extending the results of the one-step relations. Subsequently, we demonstrate the convergence of these indices under the collective observability condition, and show that this convergence condition of the nominal performance evaluation index can guarantee the convergence of the estimation error covariance. Additionally, we prove that the estimation error covariance of the consensus-based distributed filter under mismatched noise covariances can be bounded by the Frobenius norms of the noise covariance deviations and the trace of the nominal performance evaluation index. Finally, the effectiveness of the theoretical results is verified by numerical simulations.
Compact robotic gripper with tandem actuation for selective fruit harvesting
Selective fruit harvesting is a challenging manipulation problem due to occlusions and clutter arising from plant foliage. A harvesting gripper should i) have a small cross-section, to avoid collisions while approaching the fruit; ii) have a soft and compliant grasp to adapt to different fruit geometry and avoid bruising it; and iii) be capable of rigidly holding the fruit tightly enough to counteract detachment forces. Previous work on fruit harvesting has primarily focused on using grippers with a single actuation mode, either suction or fingers. In this paper we present a compact robotic gripper that combines the benefits of both. The gripper first uses an array of compliant suction cups to gently attach to the fruit. After attachment, telescoping cam-driven fingers deploy, sweeping obstacles away before pivoting inwards to provide a secure grip on the fruit for picking. We present and analyze the finger design for both ability to sweep clutter and maintain a tight grasp. Specifically, we use a motorized test bed to measure grasp strength for each actuation mode (suction, fingers, or both). We apply a tensile force at different angles (0°, 15°, 30° and 45°), and vary the point of contact between the fingers and the fruit. We observed that with both modes the grasp strength is approximately 40 N. We use an apple proxy to test the gripper's ability to obtain a grasp in the presence of occluding apples and leaves, achieving a grasp success rate over 96% (with an ideal controller). Finally, we validate our gripper in a commercial apple orchard.
comment: 8 pages, 9 figures
Design of a Double-joint Robotic Fish Using a Composite Linkage
Robotic fish is one of the most promising directions of the new generation of underwater vehicles. Traditional biomimetic fish often mimic fish joints using tandem components like servos, which leads to increased volume, weight and control complexity. In this paper, a new double-joint robotic fish using a composite linkage was designed, where the propulsion mechanism transforms the single-degree-of-freedom rotation of the motor into a double-degree-of-freedom coupled motion, namely caudal peduncle translation and caudal fin rotation. Motion analysis of the propulsion mechanism demonstrates its ability to closely emulate the undulating movement observed in carangiform fish. Experimental results further validate the feasibility of the proposed propulsion mechanism. To improve propulsion efficiency, an analysis is conducted to explore the influence of swing angle amplitude and swing frequency on the swimming speed of the robotic fish. This examination establishes a practical foundation for future research on such robotic fish systems.
A hybrid neural network for real-time OD demand calibration under disruptions
Existing automated urban traffic management systems, designed to mitigate traffic congestion and reduce emissions in real time, face significant challenges in effectively adapting to rapidly evolving conditions. Predominantly reactive, these systems typically respond to incidents only after they have transpired. A promising solution lies in implementing real-time traffic simulation models capable of accurately modelling environmental changes. Central to these real-time traffic simulations are origin-destination (OD) demand matrices. However, the inherent variability, stochasticity, and unpredictability of traffic demand complicate the precise calibration of these matrices in the face of disruptions. This paper introduces a hybrid neural network (NN) architecture specifically designed for real-time OD demand calibration to enhance traffic simulations' accuracy and reliability under both recurrent and non-recurrent traffic conditions. The proposed hybrid NN predicts the OD demand to reconcile the discrepancies between actual and simulated traffic patterns. To facilitate real-time updating of the internal parameters of the NN, we develop a metamodel-based backpropagation method by integrating data from real-world traffic systems and simulated environments. This ensures precise predictions of the OD demand even in the case of abnormal or unpredictable traffic patterns. Furthermore, we incorporate offline pre-training of the NN using the metamodel to improve computational efficiency. Validation through a toy network and a Tokyo expressway corridor case study illustrates the model's ability to dynamically adjust to shifting traffic patterns across various disruption scenarios. Our findings underscore the potential of advanced machine learning techniques in developing proactive traffic management strategies, offering substantial improvements over traditional reactive systems.
Physics-Informed Kolmogorov-Arnold Networks for Power System Dynamics
This paper presents, for the first time, a framework for Kolmogorov-Arnold Networks (KANs) in power system applications. Inspired by the recently proposed KAN architecture, this paper proposes physics-informed Kolmogorov-Arnold Networks (PIKANs), a novel KAN-based physics-informed neural network (PINN) tailored to efficiently and accurately learn dynamics within power systems. The PIKANs present a promising alternative to conventional Multi-Layer Perceptron (MLP)-based PINNs, achieving superior accuracy in predicting power system dynamics while employing a smaller network size. Simulation results on a single-machine infinite bus system and a 4-bus 2-generator system underscore the accuracy of the PIKANs in predicting rotor angle and frequency with fewer learnable parameters than conventional PINNs. Furthermore, the simulation results demonstrate PIKANs' capability to accurately identify uncertain inertia and damping coefficients. This work opens up a range of opportunities for the application of KANs in power systems, enabling efficient determination of grid dynamics and precise parameter identification.
comment: 10 pages, 12 figures
Dynamic Pricing of Electric Vehicle Charging Station Alliances Under Information Asymmetry
Due to the centralization of charging stations (CSs), CSs are organized as charging station alliances (CSAs) in the commercial competition. Under this situation, this paper studies the profit-oriented dynamic pricing strategy of CSAs. As the practicability basis, a privacy-protected bidirectional real-time information interaction framework is designed, under which the status of EVs is utilized as the reference for pricing, and the prices of CSs are the reference for charging decisions. Based on this framework, the decision-making models of EVs and CSs are established, in which the uncertainty caused by the information asymmetry between EVs and CSs and the bounded rationality of EV users are integrated. To solve the pricing decision model, evolutionary game theory is adopted to describe the dynamic pricing game among CSAs, the equilibrium of which gives the optimal pricing strategy. Finally, case study results from a real urban area in Shanghai, China verify the practicability of the framework and the effectiveness of the dynamic pricing strategy.
Mathematical Optimization of Resolution Improvement in Structured Light data by Periodic Scanning Motion: Application for Feedback during Lunar Landing
This research explores the enhancement of lunar landing precision through an advanced structured light system, integrating machine learning, Iterative Learning Control (ILC) and Structured Illumination Microscopy (SIM) techniques. By employing Moire fringe patterns for high-precision scanning maneuvers, the study addresses the limitations of conventional structured light systems. A nonlinear mathematical optimization model is developed to refine the world model, optimizing oscillation frequency and amplitude to improve resolution. The findings suggest that this approach can double the conventional resolution, promising significant advancements in the accuracy of lunar landings, with potential real-time application.
comment: 5 pages, 2 figures
Variance-Reduced Cascade Q-learning: Algorithms and Sample Complexity
We study the problem of estimating the optimal Q-function of $\gamma$-discounted Markov decision processes (MDPs) under the synchronous setting, where independent samples for all state-action pairs are drawn from a generative model at each iteration. We introduce and analyze a novel model-free algorithm called Variance-Reduced Cascade Q-learning (VRCQ). VRCQ comprises two key building blocks: (i) the established direct variance reduction technique and (ii) our proposed variance reduction scheme, Cascade Q-learning. By leveraging these techniques, VRCQ provides superior guarantees in the $\ell_\infty$-norm compared with the existing model-free stochastic approximation-type algorithms. Specifically, we demonstrate that VRCQ is minimax optimal. Additionally, when the action set is a singleton (so that the Q-learning problem reduces to policy evaluation), it achieves non-asymptotic instance optimality while requiring the minimum number of samples theoretically possible. Our theoretical results and their practical implications are supported by numerical experiments.
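To make the synchronous setting described above concrete, the following is a minimal sketch of plain synchronous Q-learning with a generative model. It is not VRCQ and contains neither variance reduction nor the cascade scheme. The toy MDP, the function names, and the rescaled-linear step size are assumptions chosen for illustration. At every iteration, one next-state sample is drawn for each state-action pair and all entries of the Q-table are updated at once.

```python
import numpy as np

def value_iteration(P, R, gamma, iters=2000):
    """Reference solution: iterate the Bellman optimality operator.

    P has shape (S, A, S) with P[s, a] a distribution over next states;
    R has shape (S, A).
    """
    Q = np.zeros_like(R, dtype=float)
    for _ in range(iters):
        Q = R + gamma * P @ Q.max(axis=1)
    return Q

def synchronous_q_learning(P, R, gamma, num_iters, rng):
    """Plain synchronous Q-learning using a generative model (sampler)."""
    S, A = R.shape
    Q = np.zeros((S, A))
    for k in range(1, num_iters + 1):
        # Rescaled-linear step size (an assumed, commonly used choice).
        step = 1.0 / (1.0 + (1.0 - gamma) * k)
        # Generative model: one independent next-state sample per (s, a).
        next_states = np.array(
            [[rng.choice(S, p=P[s, a]) for a in range(A)] for s in range(S)]
        )
        # Empirical Bellman target built from the sampled next states.
        target = R + gamma * Q.max(axis=1)[next_states]
        Q = (1.0 - step) * Q + step * target
    return Q

if __name__ == "__main__":
    # Hypothetical 2-state, 2-action MDP with rewards in [0, 1].
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.5, 0.5], [0.1, 0.9]]])
    R = np.array([[0.0, 1.0], [0.5, 0.2]])
    gamma = 0.9
    Q_star = value_iteration(P, R, gamma)
    Q_hat = synchronous_q_learning(P, R, gamma, 5000, np.random.default_rng(0))
    print("l_inf error:", np.abs(Q_hat - Q_star).max())
```

The printed quantity is the $\ell_\infty$-norm error of the estimate, which is the metric in which the abstract states its guarantees.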
Object Tracking Incorporating Transfer Learning into Unscented and Cubature Kalman Filters
We present a novel filtering algorithm that employs Bayesian transfer learning to address the challenges posed by mismatched intensity of the noise in a pair of sensors, each of which tracks an object using a nonlinear dynamic system model. In this setting, the primary sensor experiences a higher noise intensity in tracking the object than the source sensor. To improve the estimation accuracy of the primary sensor, we propose a framework that integrates Bayesian transfer learning into an Unscented Kalman Filter (UKF) and a Cubature Kalman Filter (CKF). In this approach, the parameters of the predicted observations in the source sensor are transferred to the primary sensor and used as an additional prior in the filtering process. Our simulation results show that the transfer learning approach significantly outperforms the conventional isolated UKF and CKF. Comparisons to a form of measurement vector fusion are also presented.
comment: 22 pages, 7 figures, 2 tables
Analysis of Stability in Multistage Feedforward Operational Transconductance Amplifiers using Successive One-Pole Approximation
This paper presents analysis results of the operational transconductance amplifiers (OTAs) that combine feedforward paths and multistage amplifiers to achieve high-gain wideband operation as well as frequency compensation. To analyze multistage feedforward OTAs and provide an intuitive design method, the successive one-pole approximation (SOPA) is used for each substage of a multistage feedforward OTA. Using SOPA, the stability analysis is carried out from the two-stage feedforward OTA to the four-stage feedforward OTA in this work.
comment: 14 pages, 10 figures, 2 tables, preprint
Active Learning for Control-Oriented Identification of Nonlinear Systems
Model-based reinforcement learning is an effective approach for controlling an unknown system. It is based on a longstanding pipeline familiar to the control community in which one performs experiments on the environment to collect a dataset, uses the resulting dataset to identify a model of the system, and finally performs control synthesis using the identified model. As interacting with the system may be costly and time consuming, targeted exploration is crucial for developing an effective control-oriented model with minimal experimentation. Motivated by this challenge, recent work has begun to study finite sample data requirements and sample efficient algorithms for the problem of optimal exploration in model-based reinforcement learning. However, existing theory and algorithms are limited to model classes which are linear in the parameters. Our work instead focuses on models with nonlinear parameter dependencies, and presents the first finite sample analysis of an active learning algorithm suitable for a general class of nonlinear dynamics. In certain settings, the excess control cost of our algorithm achieves the optimal rate, up to logarithmic factors. We validate our approach in simulation, showcasing the advantage of active, control-oriented exploration for controlling nonlinear systems.
A Network-Constrained Demand Response Game for Procuring Energy Balancing Services
Securely and efficiently procuring energy balancing services in distribution networks remains challenging, especially within a privacy-preserving environment. This paper proposes a network-constrained demand response game, i.e., a Generalized Nash Game (GNG), to incentivize energy consumers to offer balancing services. Specifically, we adopt a supply function-based bidding method for our demand response problem, where a requisite load adjustment must be met. To ensure the secure operation of distribution networks, we incorporate physical network constraints, including line capacity and bus voltage limits, into the game formulation. In addition, we analytically evaluate the efficiency loss of this game. Previous approaches to steer energy consumers toward the Generalized Nash Equilibrium (GNE) of the game often necessitated sharing some private information, which might not be practically feasible or desired. To overcome this limitation, we propose a decentralized market clearing algorithm with analytical convergence guarantees, which only requires the participants to share limited, non-sensitive information with others. Numerical analyses illustrate that the proposed market mechanism exhibits a low market efficiency loss. Moreover, these analyses highlight the critical role of integrating physical network constraints. Finally, we demonstrate the scalability of our proposed algorithm by conducting simulations on the IEEE 33-bus and 69-bus test systems.
Joint Mechanical and Electrical Adjustment of IRS-aided LEO Satellite MIMO Communications
In this correspondence, we propose a joint mechanical and electrical adjustment of an intelligent reflecting surface (IRS) to improve the performance of low Earth orbit (LEO) satellite multiple-input multiple-output (MIMO) communications. In particular, we construct a three-dimensional (3D) MIMO channel model for the mechanically tilted IRS in general deployment, and consider two types of scenarios, with and without the direct path of the LEO-ground user link, which arise from the orbital flight. With the aim of maximizing the end-to-end performance, we jointly optimize the tilting angle and phase shift of the IRS along with the transceiver beamforming. The performance superiority of this joint design is verified via simulations with the Orbcomm LEO satellite using real orbit data.
comment: 5 pages, 6 figures
Weyl Calculus and Exactly Solvable Schrödinger Bridges with Quadratic State Cost
The Schrödinger bridge, a stochastic dynamical generalization of optimal mass transport, exhibits a learning-control duality. Viewed as a stochastic control problem, the Schrödinger bridge finds an optimal control policy that steers given joint state statistics to another while minimizing the total control effort subject to controlled diffusion and deadline constraints. Viewed as a stochastic learning problem, the Schrödinger bridge finds the most-likely distribution-valued trajectory connecting endpoint distributional observations, i.e., solves the two-point boundary-constrained maximum likelihood problem over the manifold of probability distributions. Recent works have shown that solving the Schrödinger bridge problem with state cost requires finding the Markov kernel associated with a reaction-diffusion PDE where the state cost appears as a state-dependent reaction rate. We explain how ideas from Weyl calculus in quantum mechanics, specifically the Weyl operator and the Weyl symbol, can help determine such Markov kernels. We illustrate these ideas by explicitly finding the Markov kernel for the case of quadratic state cost via Weyl calculus, recovering our earlier results but avoiding tedious computation with Hermite polynomials.
Flight Path Optimization with Optimal Control Method
This paper addresses a crucial issue in the aviation world: how to optimize the trajectory and controls given to the aircraft in order to reduce flight time and fuel consumption. This study aims to provide elements of a response to this problem and to define, under certain simplifying assumptions, an optimal response using Constrained Finite Time Optimal Control (CFTOC). The first step is to define the dynamic model of the aircraft in accordance with the controllable inputs and wind disturbances. Then we identify a precise optimization objective and implement an optimization program to solve it under simulated real-flight conditions. Finally, the optimization result is validated and discussed across different scenarios.
Distance-coupling as an Approach to Position and Formation Control
In this letter, we study the case of autonomous agents which are required to move to some new position based solely on the distance measured from predetermined reference points, or anchors. A novel approach, referred to as distance-coupling, is proposed for calculating the agent's position exclusively from differences between squared distance measurements. The key insight in our approach is that, in doing so, we cancel out the measurement's quadratic term and obtain a function of position which is linear. We apply this method to the homing problem and prove Lyapunov stability with and without anchor placement error, identifying bounds on the region of attraction when the anchors are linearly transformed from their desired positions. As an application of the method, we show how the policy can be implemented for distributed formation control on a set of autonomous agents, proving the existence of the set of equilibria.
comment: 6 pages, 7 figures
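The cancellation described in the abstract above can be illustrated with a small sketch. It is not the letter's control policy: it only shows, for a 2D agent and three anchors, that differences of squared distances are linear in the position and can be solved directly. The function name and the three-anchor setup are assumptions made for illustration.

```python
def position_from_squared_distances(anchors, sq_dists):
    """Recover a 2D position from squared distances to three anchors.

    Illustrative only: uses |p - a_i|^2 = |p|^2 - 2 a_i . p + |a_i|^2, so that
    subtracting the equation for anchor 0 cancels the quadratic term |p|^2.
    """
    (x0, y0), (x1, y1), (x2, y2) = anchors
    d0, d1, d2 = sq_dists

    # Linear system A p = b obtained from d_i - d_0, i = 1, 2:
    #   -2 (a_i - a_0) . p = (d_i - d_0) - (|a_i|^2 - |a_0|^2)
    a11, a12 = -2.0 * (x1 - x0), -2.0 * (y1 - y0)
    a21, a22 = -2.0 * (x2 - x0), -2.0 * (y2 - y0)
    b1 = (d1 - d0) - (x1**2 + y1**2 - x0**2 - y0**2)
    b2 = (d2 - d0) - (x2**2 + y2**2 - x0**2 - y0**2)

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")

    # Cramer's rule for the 2x2 system.
    px = (b1 * a22 - a12 * b2) / det
    py = (a11 * b2 - a21 * b1) / det
    return px, py
```

For example, with anchors at (0, 0), (4, 0) and (0, 3) and squared distances 5, 13 and 2, the solver returns the point (1, 2) without ever taking a square root.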
Time series forecasting with high stakes: A field study of the air cargo industry KDD
Time series forecasting in the air cargo industry presents unique challenges due to volatile market dynamics and the significant impact of accurate forecasts on generated revenue. This paper explores a comprehensive approach to demand forecasting at the origin-destination (O&D) level, focusing on the development and implementation of machine learning models in decision-making for the air cargo industry. We leverage a mixture of experts framework, combining statistical and advanced deep learning models to provide reliable forecasts for cargo demand over a six-month horizon. The results demonstrate that our approach outperforms industry benchmarks, offering actionable insights for cargo capacity allocation and strategic decision-making in the air cargo industry. While this work is applied in the airline industry, the methodology is broadly applicable to any field where forecast-based decision-making in a volatile environment is crucial.
comment: The 10th Mining and Learning from Time Series Workshop: From Classical Methods to LLMs. SIGKDD, Barcelona, Spain, 6 pages
Error bounds, PL condition, and quadratic growth for weakly convex functions, and linear convergences of proximal point methods
Many practical optimization problems lack strong convexity. Fortunately, recent studies have revealed that first-order algorithms also enjoy linear convergences under various weaker regularity conditions. While the relationship among different conditions for convex and smooth functions is well-understood, it is not the case for the nonsmooth setting. In this paper, we go beyond convexity and smoothness, and clarify the connections among common regularity conditions in the class of weakly convex functions, including $\textit{strong convexity}$, $\textit{restricted secant inequality}$, $\textit{subdifferential error bound}$, $\textit{Polyak-{\L}ojasiewicz inequality}$, and $\textit{quadratic growth}$. In addition, using these regularity conditions, we present a simple and modular proof for the linear convergence of the proximal point method (PPM) for convex and weakly convex optimization problems. The linear convergence also holds when the subproblems of PPM are solved inexactly with a proper control of inexactness.
comment: 29 pages, 3 figures, and 1 table
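As a minimal sketch of the proximal point method (PPM) named in the abstract above, the following iterates x_{k+1} = argmin_y f(y) + (1/(2c)) (y - x_k)^2 on a toy strongly convex quadratic, where the proximal step has a closed form. The toy objective, the parameter names `c` and `mu`, and the helper names are assumptions for illustration; the paper's setting (weakly convex, nonsmooth, inexact subproblems) is more general.

```python
def proximal_point(prox, x0, c, iters):
    """Run the proximal point method and return all iterates.

    `prox(x, c)` must return argmin_y f(y) + (1 / (2 * c)) * (y - x) ** 2.
    """
    xs = [x0]
    for _ in range(iters):
        xs.append(prox(xs[-1], c))
    return xs


def prox_quadratic(x, c, mu=2.0):
    """Closed-form proximal step for the toy objective f(y) = 0.5 * mu * y**2.

    Setting the derivative mu * y + (y - x) / c to zero gives y = x / (1 + c * mu).
    """
    return x / (1.0 + c * mu)
```

On this toy problem each step contracts the distance to the minimizer by the factor 1 / (1 + c * mu), which is a concrete instance of linear convergence; with c = 0.5 and mu = 2 the iterate is halved at every step.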
Generalizable Physics-Informed Learning for Stochastic Safety-Critical Systems
Accurate estimation of long-term risk is critical for safe decision-making, but sampling from rare risk events and long-term trajectories can be prohibitively costly. The risk gradient can be used in many first-order techniques for learning and control, but gradient estimates are difficult to obtain using Monte Carlo (MC) methods because the infinitesimal divisor may significantly amplify sampling noise. Motivated by this gap, we propose an efficient method to evaluate long-term risk probabilities and their gradients using short-term samples without sufficient risk events. We first show that four types of long-term risk probability are solutions of certain partial differential equations (PDEs). Then, we propose a physics-informed learning technique that integrates data and physics information (the aforementioned PDEs). The physics information helps propagate information beyond the available data and yields provable generalization, which in turn enables long-term risk to be estimated using short-term samples of safe events. Finally, we demonstrate in simulation that the proposed technique has improved sample efficiency, generalizes well to unseen regions, and adapts to changing system parameters.
comment: arXiv admin note: substantial text overlap with arXiv:2305.06432
Systems and Control (EESS)
Robust Model Predictive Control for Aircraft Intent-Aware Collision Avoidance
This paper presents the use of robust model predictive control for the design of an intent-aware collision avoidance system for multi-agent aircraft engaged in horizontal maneuvering scenarios. We assume that information from other agents is accessible in the form of waypoints or destinations. Consequently, we consider that other agents follow their optimal Dubins path (a trajectory that connects their current state to their intended state) while accounting for potential uncertainties. We propose using scenario tree model predictive control as a robust approach that offers computational efficiency. We demonstrate that the proposed method can easily integrate intent information and offer a robust scheme that handles different uncertainties. The method is illustrated through simulation results.
comment: 8 Pages, 10 Figs
Optimization-Based Model Checking and Trace Synthesis for Complex STL Specifications
We present a bounded model checking algorithm for signal temporal logic (STL) that exploits mixed-integer linear programming (MILP). A key technical element is our novel MILP encoding of the STL semantics; it follows the idea of stable partitioning from the recent work on SMT-based STL model checking. Assuming that our (continuous-time) system models can be encoded to MILP -- typical examples are rectangular hybrid automata (precisely) and hybrid dynamics with closed-form solutions (approximately) -- our MILP encoding yields an optimization-based model checking algorithm that is scalable, is anytime/interruptible, and accommodates parameter mining. Experimental evaluation shows our algorithm's performance advantages especially for complex STL formulas, demonstrating its practical relevance e.g. in the automotive domain.
comment: Extended version of the paper accepted by 36th International Conference on Computer-Aided Verification (CAV), 2024
Verification of Diagnosability for Cyber-Physical Systems: A Hybrid Barrier Certificate Approach
Diagnosability is a system-theoretic property characterizing whether fault occurrences in a system can always be detected within a finite time. In this paper, we investigate the verification of diagnosability for cyber-physical systems with continuous state sets. We develop an abstraction-free and automata-based framework to verify (the lack of) diagnosability, leveraging a notion of hybrid barrier certificates. To this end, we first construct a (delta,K)-deterministic finite automaton that captures the occurrence of faults targeted for diagnosis. Then, the verification of the diagnosability property is converted into a safety verification problem over a product system between the automaton and the augmented version of the dynamical system. We demonstrate that this verification problem can be addressed by computing hybrid barrier certificates for the product system. To search for such certificates, we introduce two systematic methods, leveraging sum-of-squares programming and counter-example guided inductive synthesis. Additionally, if the system is found to be diagnosable, we propose methodologies to construct a diagnoser to identify fault occurrences online. Finally, we showcase the effectiveness of our methods through a case study.
On the Improvement of the Performance of Inexpensive Electromagnetic Skins by means of an Inverse Source Design Approach
A new methodology for the improvement of the performance of inexpensive static passive electromagnetic skins (SP-EMSs) is presented. The proposed approach leverages the non-uniqueness of the inverse source problem associated with the SP-EMS design by decomposing the induced surface current into pre-image (PI) and null-space (NS) components. Subsequently, the unknown EMS layout and NS expansion coefficients are determined by means of an alternating minimization of a suitable cost function. The latter quantifies the mismatch between the ideal surface current, which radiates the user-defined target field, and that actually induced on the EMS layout. Results from a representative set of numerical experiments, concerned with the design of EMSs reflecting pencil-beam as well as contoured target patterns, are reported to assess the feasibility and the effectiveness of the proposed method in improving the performance of inexpensive EMS realizations. The measurements on an EMS prototype, featuring a conductive ink pattern printed on a standard paper substrate, are also shown to prove the reliability of the synthesis process.
Synthesis of Wide-Angle Scanning Arrays through Array Power Control
A new methodology for the synthesis of wide-angle scanning arrays is proposed. It is based on the formulation of the array design problem as a multi-objective one where, for each scan angle, both the radiated power density in the scan direction and the total reflected power are accounted for. A set of numerical results from full-wave simulated examples - dealing with different radiators, arrangements, frequencies, and numbers of elements - is reported to show the features of the proposed approach as well as to assess its potential.
Residual Deep Reinforcement Learning for Inverter-based Volt-Var Control
A residual deep reinforcement learning (RDRL) approach is proposed by integrating DRL with model-based optimization for inverter-based volt-var control in active distribution networks when the accurate power flow model is unknown. RDRL learns a residual action with a reduced residual action space, based on the action of the model-based approach with an approximate model. RDRL inherits the control capability of the approximate-model-based optimization and enhances the policy optimization capability by residual policy learning. Additionally, it improves the approximation accuracy of the critic and reduces the search difficulties of the actor by reducing residual action space. To address the issues of "too small" or "too large" residual action space of RDRL and further improve the optimization performance, we extend RDRL to a boosting RDRL approach. It selects a much smaller residual action space and learns a residual policy by using the policy of RDRL as a base policy. Simulations demonstrate that RDRL and boosting RDRL improve the optimization performance considerably throughout the learning stage and verify their rationales point-by-point, including 1) inheriting the capability of the approximate model-based optimization, 2) residual policy learning, and 3) learning in a reduced action space.
comment: arXiv admin note: text overlap with arXiv:2210.07360
Robustness of optimal quantum annealing protocols
Noise in quantum computing devices poses a key challenge in their realization. In this paper, we study the robustness of optimal quantum annealing protocols against coherent control errors, which are multiplicative Hamiltonian errors causing detrimental effects on current quantum devices. We show that the norm of the Hamiltonian quantifies the robustness against these errors, motivating the introduction of an additional regularization term in the cost function. We analyze the optimality conditions of the resulting robust quantum optimal control problem based on Pontryagin's maximum principle, showing that robust protocols admit larger smooth annealing sections. This suggests that quantum annealing admits improved robustness in comparison to bang-bang solutions such as the quantum approximate optimization algorithm. Finally, we perform numerical simulations to verify our analytical results and demonstrate the improved robustness of the proposed approach.
Robust Deep Reinforcement Learning for Inverter-based Volt-Var Control in Partially Observable Distribution Networks
Inverter-based volt-var control is studied in this paper. One key issue in deep reinforcement learning (DRL)-based approaches is the limited measurement deployment in active distribution networks, which leads to problems of a partially observable state and unknown reward. To address those problems, this paper proposes a robust DRL approach with a conservative critic and a surrogate reward. The conservative critic uses quantile regression to estimate a conservative state-action value function based on the partially observable state, which helps to train a robust policy; the surrogate rewards of power loss and voltage violation are designed so that they can be calculated from the limited measurements. The proposed approach optimizes the power loss of the whole network and the voltage profile of buses with measurable voltages while indirectly improving the voltage profile of other buses. Extensive simulations verify the effectiveness of the robust DRL approach in different limited measurement conditions, even when only the active power injection of the root bus and fewer than 10% of bus voltages are measurable.
Designing Consensus-Based Distributed Filtering over Directed Graphs
This paper proposes a novel consensus-on-only-measurement distributed filter over directed graphs under the collective observability condition. First, the distributed filter structure is designed with an augmented leader-following measurement fusion strategy. Subsequently, two parameter design methods are presented, and the consensus gain parameter is devised utilizing local information exclusively rather than global information. Additionally, the lower bound of the fusion step is derived to guarantee a uniform upper bound of the estimation error covariance. Moreover, the lower bounds of the convergence rates of the steady-state performance gap between the proposed algorithm and the centralized filter are provided with the fusion step approaching infinity. The analysis demonstrates that the convergence rate is at least exponential under the spectral norm condition of the communication graph. The transient performance is also analyzed with the fusion step tending to infinity. The inherent trade-off between the communication cost and the filtering performance is revealed from the analysis of the steady-state performance and the transient performance. Finally, the theoretical results are validated through two simulation examples.
On the Effects of Modeling Errors on Distributed Continuous-time Filtering
This paper offers a comprehensive performance analysis of the distributed continuous-time filtering in the presence of modeling errors. First, we introduce two performance indices, namely the nominal performance index and the estimation error covariance. By leveraging the nominal performance index and the Frobenius norm of the modeling deviations, we derive the bounds of the estimation error covariance and the lower bound of the nominal performance index. Specifically, we reveal the effect of the consensus parameter on both bounds. We demonstrate that, under specific conditions, an incorrect process noise covariance can lead to the divergence of the estimation error covariance. Moreover, we investigate the properties of the eigenvalues of the error dynamical matrix. In the context of switching topological configurations, we provide a sufficient condition that ensures the stability of the error dynamical matrix. Furthermore, we explore the magnitude relations between the nominal performance index and the estimation error covariance. Finally, we present some numerical simulations to validate the effectiveness of the theoretical results.
Performance Analysis of Distributed Filtering under Mismatched Noise Covariances
This paper systematically investigates the performance of consensus-based distributed filtering under mismatched noise covariances. First, we introduce three performance evaluation indices for such filtering problems, namely the standard performance evaluation index, the nominal performance evaluation index, and the estimation error covariance. We derive difference expressions among these indices and establish one-step relations among them under various mismatched noise covariance scenarios. We particularly reveal the effect of the consensus fusion on these relations. Furthermore, the recursive relations are introduced by extending the results of the one-step relations. Subsequently, we demonstrate the convergence of these indices under the collective observability condition, and show that this convergence condition of the nominal performance evaluation index can guarantee the convergence of the estimation error covariance. Additionally, we prove that the estimation error covariance of the consensus-based distributed filter under mismatched noise covariances can be bounded by the Frobenius norms of the noise covariance deviations and the trace of the nominal performance evaluation index. Finally, the effectiveness of the theoretical results is verified by numerical simulations.
Compact robotic gripper with tandem actuation for selective fruit harvesting
Selective fruit harvesting is a challenging manipulation problem due to occlusions and clutter arising from plant foliage. A harvesting gripper should i) have a small cross-section, to avoid collisions while approaching the fruit; ii) have a soft and compliant grasp to adapt to different fruit geometry and avoid bruising it; and iii) be capable of rigidly holding the fruit tightly enough to counteract detachment forces. Previous work on fruit harvesting has primarily focused on using grippers with a single actuation mode, either suction or fingers. In this paper we present a compact robotic gripper that combines the benefits of both. The gripper first uses an array of compliant suction cups to gently attach to the fruit. After attachment, telescoping cam-driven fingers deploy, sweeping obstacles away before pivoting inwards to provide a secure grip on the fruit for picking. We present and analyze the finger design for both ability to sweep clutter and maintain a tight grasp. Specifically, we use a motorized test bed to measure grasp strength for each actuation mode (suction, fingers, or both). We apply a tensile force at different angles (0°, 15°, 30° and 45°), and vary the point of contact between the fingers and the fruit. We observed that with both modes the grasp strength is approximately 40 N. We use an apple proxy to test the gripper's ability to obtain a grasp in the presence of occluding apples and leaves, achieving a grasp success rate over 96% (with an ideal controller). Finally, we validate our gripper in a commercial apple orchard.
comment: 8 pages, 9 figures
Design of a Double-joint Robotic Fish Using a Composite Linkage
Robotic fish is one of the most promising directions of the new generation of underwater vehicles. Traditional biomimetic fish often mimic fish joints using tandem components like servos, which leads to increased volume, weight and control complexity. In this paper, a new double-joint robotic fish using a composite linkage was designed, where the propulsion mechanism transforms the single-degree-of-freedom rotation of the motor into a double-degree-of-freedom coupled motion, namely caudal peduncle translation and caudal fin rotation. Motion analysis of the propulsion mechanism demonstrates its ability to closely emulate the undulating movement observed in carangiform fish. Experimental results further validate the feasibility of the proposed propulsion mechanism. To improve propulsion efficiency, an analysis is conducted to explore the influence of swing angle amplitude and swing frequency on the swimming speed of the robotic fish. This examination establishes a practical foundation for future research on such robotic fish systems.
A hybrid neural network for real-time OD demand calibration under disruptions
Existing automated urban traffic management systems, designed to mitigate traffic congestion and reduce emissions in real time, face significant challenges in effectively adapting to rapidly evolving conditions. Predominantly reactive, these systems typically respond to incidents only after they have transpired. A promising solution lies in implementing real-time traffic simulation models capable of accurately modelling environmental changes. Central to these real-time traffic simulations are origin-destination (OD) demand matrices. However, the inherent variability, stochasticity, and unpredictability of traffic demand complicate the precise calibration of these matrices in the face of disruptions. This paper introduces a hybrid neural network (NN) architecture specifically designed for real-time OD demand calibration to enhance traffic simulations' accuracy and reliability under both recurrent and non-recurrent traffic conditions. The proposed hybrid NN predicts the OD demand to reconcile the discrepancies between actual and simulated traffic patterns. To facilitate real-time updating of the internal parameters of the NN, we develop a metamodel-based backpropagation method by integrating data from real-world traffic systems and simulated environments. This ensures precise predictions of the OD demand even in the case of abnormal or unpredictable traffic patterns. Furthermore, we incorporate offline pre-training of the NN using the metamodel to improve computational efficiency. Validation through a toy network and a Tokyo expressway corridor case study illustrates the model's ability to dynamically adjust to shifting traffic patterns across various disruption scenarios. Our findings underscore the potential of advanced machine learning techniques in developing proactive traffic management strategies, offering substantial improvements over traditional reactive systems.
Physics-Informed Kolmogorov-Arnold Networks for Power System Dynamics
This paper presents, for the first time, a framework for Kolmogorov-Arnold Networks (KANs) in power system applications. Inspired by the recently proposed KAN architecture, this paper proposes physics-informed Kolmogorov-Arnold Networks (PIKANs), a novel KAN-based physics-informed neural network (PINN) tailored to efficiently and accurately learn dynamics within power systems. The PIKANs present a promising alternative to conventional Multi-Layer Perceptron (MLP)-based PINNs, achieving superior accuracy in predicting power system dynamics while employing a smaller network size. Simulation results on a single-machine infinite bus system and a 4-bus 2-generator system underscore the accuracy of the PIKANs in predicting rotor angle and frequency with fewer learnable parameters than conventional PINNs. Furthermore, the simulation results demonstrate the PIKANs' capability to accurately identify uncertain inertia and damping coefficients. This work opens up a range of opportunities for the application of KANs in power systems, enabling efficient determination of grid dynamics and precise parameter identification.
comment: 10 pages, 12 figures
Dynamic Pricing of Electric Vehicle Charging Station Alliances Under Information Asymmetry
Due to the centralization of charging stations (CSs), CSs are organized as charging station alliances (CSAs) in commercial competition. In this setting, this paper studies the profit-oriented dynamic pricing strategy of CSAs. To make the strategy practicable, a privacy-protected bidirectional real-time information interaction framework is designed, under which the status of electric vehicles (EVs) is used as the reference for pricing, and the prices of CSs are the reference for charging decisions. Based on this framework, the decision-making models of EVs and CSs are established, in which the uncertainty caused by the information asymmetry between EVs and CSs and the bounded rationality of EV users are integrated. To solve the pricing decision model, evolutionary game theory is adopted to describe the dynamic pricing game among CSAs, the equilibrium of which gives the optimal pricing strategy. Finally, case study results for a real urban area in Shanghai, China verify the practicability of the framework and the effectiveness of the dynamic pricing strategy.
Mathematical Optimization of Resolution Improvement in Structured Light data by Periodic Scanning Motion: Application for Feedback during Lunar Landing
This research explores the enhancement of lunar landing precision through an advanced structured light system, integrating machine learning, Iterative Learning Control (ILC) and Structured Illumination Microscopy (SIM) techniques. By employing Moiré fringe patterns for high-precision scanning maneuvers, the study addresses the limitations of conventional structured light systems. A nonlinear mathematical optimization model is developed to refine the world model, optimizing oscillation frequency and amplitude to improve resolution. The findings suggest that this approach can double the conventional resolution, promising significant advancements in the accuracy of lunar landings, with potential real-time application.
comment: 5 pages, 2 figures
Variance-Reduced Cascade Q-learning: Algorithms and Sample Complexity
We study the problem of estimating the optimal Q-function of $\gamma$-discounted Markov decision processes (MDPs) under the synchronous setting, where independent samples for all state-action pairs are drawn from a generative model at each iteration. We introduce and analyze a novel model-free algorithm called Variance-Reduced Cascade Q-learning (VRCQ). VRCQ comprises two key building blocks: (i) the established direct variance reduction technique and (ii) our proposed variance reduction scheme, Cascade Q-learning. By leveraging these techniques, VRCQ provides superior guarantees in the $\ell_\infty$-norm compared with the existing model-free stochastic approximation-type algorithms. Specifically, we demonstrate that VRCQ is minimax optimal. Additionally, when the action set is a singleton (so that the Q-learning problem reduces to policy evaluation), it achieves non-asymptotic instance optimality while requiring the minimum number of samples theoretically possible. Our theoretical results and their practical implications are supported by numerical experiments.
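The synchronous setting described in the abstract above can be sketched with plain Q-learning as a baseline. This is not VRCQ and includes neither variance-reduction scheme; it only shows that every state-action pair receives one fresh sample from a generative model per iteration. The function signature, the `sample_next` callback, and the default rescaled-linear step size are assumptions made for illustration.

```python
import random


def synchronous_q_learning(n_states, n_actions, reward, sample_next,
                           gamma, iters, step_size=None, seed=0):
    """Plain synchronous Q-learning with a generative model (baseline sketch).

    reward[s][a]           -- deterministic reward for the pair (s, a)
    sample_next(s, a, rng) -- draws a next state from the generative model
    step_size(k)           -- step size at iteration k (assumed schedule by default)
    """
    rng = random.Random(seed)
    if step_size is None:
        # Assumed default: a rescaled linear schedule, 1 / (1 + (1 - gamma) * k).
        step_size = lambda k: 1.0 / (1.0 + (1.0 - gamma) * k)

    q = [[0.0] * n_actions for _ in range(n_states)]
    for k in range(1, iters + 1):
        lam = step_size(k)
        new_q = [row[:] for row in q]
        # Synchronous update: one independent sample per (s, a) pair,
        # with all targets computed from the previous iterate q.
        for s in range(n_states):
            for a in range(n_actions):
                s_next = sample_next(s, a, rng)
                target = reward[s][a] + gamma * max(q[s_next])
                new_q[s][a] = (1.0 - lam) * q[s][a] + lam * target
        q = new_q
    return q
```

With deterministic transitions and a constant step size of 1, the loop reduces to value iteration, which gives a simple way to sanity-check the update against a hand-computed optimal Q-function.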
Object Tracking Incorporating Transfer Learning into Unscented and Cubature Kalman Filters
We present a novel filtering algorithm that employs Bayesian transfer learning to address the challenges posed by mismatched intensity of the noise in a pair of sensors, each of which tracks an object using a nonlinear dynamic system model. In this setting, the primary sensor experiences a higher noise intensity in tracking the object than the source sensor. To improve the estimation accuracy of the primary sensor, we propose a framework that integrates Bayesian transfer learning into an Unscented Kalman Filter (UKF) and a Cubature Kalman Filter (CKF). In this approach, the parameters of the predicted observations in the source sensor are transferred to the primary sensor and used as an additional prior in the filtering process. Our simulation results show that the transfer learning approach significantly outperforms the conventional isolated UKF and CKF. Comparisons to a form of measurement vector fusion are also presented.
comment: 22 pages, 7 figures, 2 tables
Analysis of Stability in Multistage Feedforward Operational Transconductance Amplifiers using Successive One-Pole Approximation
This paper presents analysis results of the operational transconductance amplifiers (OTAs) that combine feedforward paths and multistage amplifiers to achieve high-gain wideband operation as well as frequency compensation. To analyze multistage feedforward OTAs and provide an intuitive design method, the successive one-pole approximation (SOPA) is used for each substage of a multistage feedforward OTA. Using SOPA, the stability analysis is carried out from the two-stage feedforward OTA to the four-stage feedforward OTA in this work.
comment: 14 pages, 10 figures, 2 tables, preprint
Active Learning for Control-Oriented Identification of Nonlinear Systems
Model-based reinforcement learning is an effective approach for controlling an unknown system. It is based on a longstanding pipeline familiar to the control community in which one performs experiments on the environment to collect a dataset, uses the resulting dataset to identify a model of the system, and finally performs control synthesis using the identified model. As interacting with the system may be costly and time consuming, targeted exploration is crucial for developing an effective control-oriented model with minimal experimentation. Motivated by this challenge, recent work has begun to study finite sample data requirements and sample efficient algorithms for the problem of optimal exploration in model-based reinforcement learning. However, existing theory and algorithms are limited to model classes which are linear in the parameters. Our work instead focuses on models with nonlinear parameter dependencies, and presents the first finite sample analysis of an active learning algorithm suitable for a general class of nonlinear dynamics. In certain settings, the excess control cost of our algorithm achieves the optimal rate, up to logarithmic factors. We validate our approach in simulation, showcasing the advantage of active, control-oriented exploration for controlling nonlinear systems.
A Network-Constrained Demand Response Game for Procuring Energy Balancing Services
Securely and efficiently procuring energy balancing services in distribution networks remains challenging, especially within a privacy-preserving environment. This paper proposes a network-constrained demand response game, i.e., a Generalized Nash Game (GNG), to incentivize energy consumers to offer balancing services. Specifically, we adopt a supply function-based bidding method for our demand response problem, where a requisite load adjustment must be met. To ensure the secure operation of distribution networks, we incorporate physical network constraints, including line capacity and bus voltage limits, into the game formulation. In addition, we analytically evaluate the efficiency loss of this game. Previous approaches to steer energy consumers toward the Generalized Nash Equilibrium (GNE) of the game often necessitated sharing some private information, which might not be practically feasible or desired. To overcome this limitation, we propose a decentralized market clearing algorithm with analytical convergence guarantees, which only requires the participants to share limited, non-sensitive information with others. Numerical analyses illustrate that the proposed market mechanism exhibits a low market efficiency loss. Moreover, these analyses highlight the critical role of integrating physical network constraints. Finally, we demonstrate the scalability of our proposed algorithm by conducting simulations on the IEEE 33-bus and 69-bus test systems.
Joint Mechanical and Electrical Adjustment of IRS-aided LEO Satellite MIMO Communications
In this correspondence, we propose a joint mechanical and electrical adjustment of an intelligent reflecting surface (IRS) to improve the performance of low-earth orbit (LEO) satellite multiple-input multiple-output (MIMO) communications. In particular, we construct a three-dimensional (3D) MIMO channel model for the mechanically-tilted IRS in general deployment, and consider two types of scenarios with and without the direct path of LEO-ground user link due to the orbital flight. With the aim of maximizing the end-to-end performance, we jointly optimize tilting angle and phase shift of IRS along with the transceiver beamforming, whose performance superiority is verified via simulations with the Orbcomm LEO satellite using real orbit data.
comment: 5 pages, 6 figures
Weyl Calculus and Exactly Solvable Schrödinger Bridges with Quadratic State Cost
The Schrödinger bridge, a stochastic dynamical generalization of optimal mass transport, exhibits a learning-control duality. Viewed as a stochastic control problem, the Schrödinger bridge finds an optimal control policy that steers a given joint state statistics to another while minimizing the total control effort subject to controlled diffusion and deadline constraints. Viewed as a stochastic learning problem, the Schrödinger bridge finds the most-likely distribution-valued trajectory connecting endpoint distributional observations, i.e., solves the two point boundary-constrained maximum likelihood problem over the manifold of probability distributions. Recent works have shown that solving the Schrödinger bridge problem with state cost requires finding the Markov kernel associated with a reaction-diffusion PDE where the state cost appears as a state-dependent reaction rate. We explain how ideas from Weyl calculus in quantum mechanics, specifically the Weyl operator and the Weyl symbol, can help determine such Markov kernels. We illustrate these ideas by explicitly finding the Markov kernel for the case of quadratic state cost via Weyl calculus, recovering our earlier results but avoiding tedious computation with Hermite polynomials.
Flight Path Optimization with Optimal Control Method
This paper addresses a crucial issue in aviation: how to optimize the trajectory and controls given to the aircraft in order to optimize flight time and fuel consumption. This study aims to provide elements of a response to this problem and to define, under certain simplifying assumptions, an optimal response, using Constrained Finite Time Optimal Control (CFTOC). The first step is to define the dynamic model of the aircraft in accordance with the controllable inputs and wind disturbances. We then identify a precise optimization objective and implement an optimization program to solve it under simulated real-flight conditions. Finally, the optimization result is validated and discussed across different scenarios.
Distance-coupling as an Approach to Position and Formation Control
In this letter, we study the case of autonomous agents which are required to move to some new position based solely on the distance measured from predetermined reference points, or anchors. A novel approach, referred to as distance-coupling, is proposed for calculating the agent's position exclusively from differences between squared distance measurements. The key insight in our approach is that, in doing so, we cancel out the measurement's quadratic term and obtain a function of position which is linear. We apply this method to the homing problem and prove Lyapunov stability with and without anchor placement error, identifying bounds on the region of attraction when the anchors are linearly transformed from their desired positions. As an application of the method, we show how the policy can be implemented for distributed formation control on a set of autonomous agents, proving the existence of the set of equilibria.
comment: 6 pages, 7 figures
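The cancellation described in the distance-coupling abstract above can be illustrated with a small worked example. This is only a sketch of the underlying algebra, not the authors' controller: for anchors a_i and position p, the squared distance is |p|^2 - 2 a_i·p + |a_i|^2, so subtracting the measurement for anchor 0 removes |p|^2 and leaves the linear system 2 (a_i - a_0)·p = |a_i|^2 - |a_0|^2 - (d_i^2 - d_0^2). The function name, the 2D setting, and the choice of three noise-free anchors are assumptions made for illustration.

```python
def position_from_squared_distances(anchors, sq_dists):
    """Recover a 2D position from three anchors and squared distances.

    Differences of squared distances cancel the quadratic term |p|^2,
    leaving a 2x2 linear system that is solved here with Cramer's rule.
    """
    (x0, y0), (x1, y1), (x2, y2) = anchors
    d0, d1, d2 = sq_dists

    # Row i: 2 (a_i - a_0) . p = |a_i|^2 - |a_0|^2 - (d_i - d_0)
    a11, a12 = 2.0 * (x1 - x0), 2.0 * (y1 - y0)
    a21, a22 = 2.0 * (x2 - x0), 2.0 * (y2 - y0)
    b1 = (x1 ** 2 + y1 ** 2) - (x0 ** 2 + y0 ** 2) - (d1 - d0)
    b2 = (x2 ** 2 + y2 ** 2) - (x0 ** 2 + y0 ** 2) - (d2 - d0)

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")

    px = (b1 * a22 - a12 * b2) / det
    py = (a11 * b2 - b1 * a21) / det
    return px, py


# Hypothetical anchors and a true position used to generate measurements.
ANCHORS = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
TRUE_P = (1.0, 2.0)
SQ_DISTS = [(TRUE_P[0] - ax) ** 2 + (TRUE_P[1] - ay) ** 2 for ax, ay in ANCHORS]
ESTIMATE = position_from_squared_distances(ANCHORS, SQ_DISTS)
```

With exact measurements the estimate coincides with the true position; with noisy measurements or more anchors, a least-squares solve of the same linear system would be the natural extension.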
Time series forecasting with high stakes: A field study of the air cargo industry KDD
Time series forecasting in the air cargo industry presents unique challenges due to volatile market dynamics and the significant impact of accurate forecasts on generated revenue. This paper explores a comprehensive approach to demand forecasting at the origin-destination (O&D) level, focusing on the development and implementation of machine learning models in decision-making for the air cargo industry. We leverage a mixture of experts framework, combining statistical and advanced deep learning models to provide reliable forecasts for cargo demand over a six-month horizon. The results demonstrate that our approach outperforms industry benchmarks, offering actionable insights for cargo capacity allocation and strategic decision-making in the air cargo industry. While this work is applied in the airline industry, the methodology is broadly applicable to any field where forecast-based decision-making in a volatile environment is crucial.
comment: The 10th Mining and Learning from Time Series Workshop: From Classical Methods to LLMs. SIGKDD, Barcelona, Spain, 6 pages
Error bounds, PL condition, and quadratic growth for weakly convex functions, and linear convergences of proximal point methods
Many practical optimization problems lack strong convexity. Fortunately, recent studies have revealed that first-order algorithms also enjoy linear convergences under various weaker regularity conditions. While the relationship among different conditions for convex and smooth functions is well-understood, it is not the case for the nonsmooth setting. In this paper, we go beyond convexity and smoothness, and clarify the connections among common regularity conditions in the class of weakly convex functions, including $\textit{strong convexity}$, $\textit{restricted secant inequality}$, $\textit{subdifferential error bound}$, $\textit{Polyak-{\L}ojasiewicz inequality}$, and $\textit{quadratic growth}$. In addition, using these regularity conditions, we present a simple and modular proof for the linear convergence of the proximal point method (PPM) for convex and weakly convex optimization problems. The linear convergence also holds when the subproblems of PPM are solved inexactly with a proper control of inexactness.
comment: 29 pages, 3 figures, and 1 table
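To make the linear-convergence claim in the abstract above concrete, here is a minimal sketch of the proximal point method on a toy problem chosen for illustration (it is not taken from the paper): f(x) = 0.5 * max(|x| - 1, 0)^2. This function is convex but not strongly convex, its minimizers form the interval [-1, 1], and it satisfies quadratic growth. Its proximal step has a closed form, and the distance to the solution set shrinks by the factor 1 / (1 + c) at every iteration, where the step size c is an assumed parameter.

```python
def f(x):
    """Toy objective: convex, not strongly convex, minimized on [-1, 1]."""
    return 0.5 * max(abs(x) - 1.0, 0.0) ** 2


def prox_step(x, c):
    """Closed-form proximal step argmin_y f(y) + (y - x)^2 / (2 c)."""
    if abs(x) <= 1.0:
        return x  # already a minimizer, so the prox leaves it unchanged
    sign = 1.0 if x > 0 else -1.0
    # Stationarity for y > 1: (y - 1) + (y - x) / c = 0.
    return sign * (c + abs(x)) / (1.0 + c)


def dist_to_solutions(x):
    """Distance from x to the solution set [-1, 1]."""
    return max(abs(x) - 1.0, 0.0)


def run_ppm(x0, c, steps):
    """Iterate the proximal point method and return all iterates."""
    xs = [x0]
    for _ in range(steps):
        xs.append(prox_step(xs[-1], c))
    return xs


ITERATES = run_ppm(9.0, 1.0, 4)
DISTANCES = [dist_to_solutions(x) for x in ITERATES]
```

With c = 1 the distances halve at each step, which is the geometric decay that linear convergence refers to. The abstract's results cover far more general weakly convex problems and inexact subproblem solves, neither of which this sketch attempts.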
Generalizable Physics-Informed Learning for Stochastic Safety-Critical Systems
Accurate estimation of long-term risk is critical for safe decision-making, but sampling from rare risk events and long-term trajectories can be prohibitively costly. Risk gradients can be used in many first-order techniques for learning and control, but gradient estimates are difficult to obtain using Monte Carlo (MC) methods because the infinitesimal divisor may significantly amplify sampling noise. Motivated by this gap, we propose an efficient method to evaluate long-term risk probabilities and their gradients using short-term samples without sufficient risk events. We first derive that four types of long-term risk probability are solutions of certain partial differential equations (PDEs). Then, we propose a physics-informed learning technique that integrates data and physics information (aforementioned PDEs). The physics information helps propagate information beyond available data and obtain provable generalization beyond available data, which in turn enables long-term risk to be estimated using short-term samples of safe events. Finally, we demonstrate in simulation that the proposed technique has improved sample efficiency, generalizes well to unseen regions, and adapts to changing system parameters.
comment: arXiv admin note: substantial text overlap with arXiv:2305.06432
Robotics
HeLiMOS: A Dataset for Moving Object Segmentation in 3D Point Clouds From Heterogeneous LiDAR Sensors IROS
Moving object segmentation (MOS) using a 3D light detection and ranging (LiDAR) sensor is crucial for scene understanding and identification of moving objects. Despite the availability of various types of 3D LiDAR sensors in the market, MOS research still predominantly focuses on 3D point clouds from mechanically spinning omnidirectional LiDAR sensors. Thus, we are, for example, lacking a dataset with MOS labels for point clouds from solid-state LiDAR sensors which have irregular scanning patterns. In this paper, we present a labeled dataset, called HeLiMOS, that enables testing MOS approaches on four heterogeneous LiDAR sensors, including two solid-state LiDAR sensors. Furthermore, we introduce a novel automatic labeling method to substantially reduce the labeling effort required from human annotators. To this end, our framework exploits an instance-aware static map building approach and tracking-based false label filtering. Finally, we provide experimental results regarding the performance of commonly used state-of-the-art MOS approaches on HeLiMOS that suggest a new direction for a sensor-agnostic MOS, which generally works regardless of the type of LiDAR sensors used to capture 3D point clouds. Our dataset is available at https://sites.google.com/view/helimos.
comment: Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst. (IROS) 2024
EqNIO: Subequivariant Neural Inertial Odometry
Presently, neural networks are widely employed to accurately estimate 2D displacements and associated uncertainties from Inertial Measurement Unit (IMU) data that can be integrated into stochastic filter networks like the Extended Kalman Filter (EKF) as measurements and uncertainties for the update step in the filter. However, such neural approaches overlook symmetry which is a crucial inductive bias for model generalization. This oversight is notable because (i) physical laws adhere to symmetry principles when considering the gravity axis, meaning there exists the same transformation for both the physical entity and the resulting trajectory, and (ii) displacements should remain equivariant to frame transformations when the inertial frame changes. To address this, we propose a subequivariant framework by: (i) deriving fundamental layers such as linear and nonlinear layers for a subequivariant network, designed to handle sequences of vectors and scalars, (ii) employing the subequivariant network to predict an equivariant frame for the sequence of inertial measurements. This predicted frame can then be utilized for extracting invariant features through projection, which are integrated with arbitrary network architectures, (iii) transforming the invariant output by frame transformation to obtain equivariant displacements and covariances. We demonstrate the effectiveness and generalization of our Equivariant Framework on a filter-based approach with TLIO architecture for TLIO and Aria datasets, and an end-to-end deep learning approach with RONIN architecture for RONIN, RIDI and OxIOD datasets.
comment: 26 pages
Body Transformer: Leveraging Robot Embodiment for Policy Learning
In recent years, the transformer architecture has become the de facto standard for machine learning algorithms applied to natural language processing and computer vision. Despite notable evidence of successful deployment of this architecture in the context of robot learning, we claim that vanilla transformers do not fully exploit the structure of the robot learning problem. Therefore, we propose Body Transformer (BoT), an architecture that leverages the robot embodiment by providing an inductive bias that guides the learning process. We represent the robot body as a graph of sensors and actuators, and rely on masked attention to pool information throughout the architecture. The resulting architecture outperforms the vanilla transformer, as well as the classical multilayer perceptron, in terms of task completion, scaling properties, and computational efficiency when representing either imitation or reinforcement learning policies. Additional material including the open-source code is available at https://sferrazza.cc/bot_site.
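The masked-attention idea in the Body Transformer abstract above can be sketched in a few lines. This is an illustrative assumption about how a body graph might be turned into an attention mask (each node attends to itself and its direct neighbours), not the released BoT implementation; the four-node chain, the function names, and the uniform scores are all hypothetical.

```python
import math


def body_mask(num_nodes, edges):
    """mask[i][j] is True when node i may attend to node j."""
    mask = [[i == j for j in range(num_nodes)] for i in range(num_nodes)]
    for i, j in edges:
        mask[i][j] = True
        mask[j][i] = True
    return mask


def masked_attention_weights(scores, mask):
    """Row-wise softmax restricted to the entries the mask allows."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Hypothetical chain: torso(0) - shoulder(1) - elbow(2) - wrist(3).
EDGES = [(0, 1), (1, 2), (2, 3)]
MASK = body_mask(4, EDGES)
SCORES = [[0.0] * 4 for _ in range(4)]  # uniform scores for clarity
WEIGHTS = masked_attention_weights(SCORES, MASK)
```

With uniform scores, the torso row splits its attention evenly between itself and the shoulder and assigns zero weight to the elbow and wrist, so information from distant joints can only arrive through successive layers.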
EyeSight Hand: Design of a Fully-Actuated Dexterous Robot Hand with Integrated Vision-Based Tactile Sensors and Compliant Actuation
In this work, we introduce the EyeSight Hand, a novel 7 degrees of freedom (DoF) humanoid hand featuring integrated vision-based tactile sensors tailored for enhanced whole-hand manipulation. Additionally, we introduce an actuation scheme centered around quasi-direct drive actuation to achieve human-like strength and speed while ensuring robustness for large-scale data collection. We evaluate the EyeSight Hand on three challenging tasks: bottle opening, plasticine cutting, and plate pick and place, which require a blend of complex manipulation, tool use, and precise force application. Imitation learning models trained on these tasks, with a novel vision dropout strategy, showcase the benefits of tactile feedback in enhancing task success rates. Our results reveal that the integration of tactile sensing dramatically improves task performance, underscoring the critical role of tactile information in dexterous manipulation.
Stable-BC: Controlling Covariate Shift with Stable Behavior Cloning
Behavior cloning is a common imitation learning paradigm. Under behavior cloning the robot collects expert demonstrations, and then trains a policy to match the actions taken by the expert. This works well when the robot learner visits states where the expert has already demonstrated the correct action; but inevitably the robot will also encounter new states outside of its training dataset. If the robot learner takes the wrong action at these new states it could move farther from the training data, which in turn leads to increasingly incorrect actions and compounding errors. Existing works try to address this fundamental challenge by augmenting or enhancing the training data. By contrast, in our paper we develop the control theoretic properties of behavior cloned policies. Specifically, we consider the error dynamics between the system's current state and the states in the expert dataset. From the error dynamics we derive model-based and model-free conditions for stability: under these conditions the robot shapes its policy so that its current behavior converges towards example behaviors in the expert dataset. In practice, this results in Stable-BC, an easy to implement extension of standard behavior cloning that is provably robust to covariate shift. We demonstrate the effectiveness of our algorithm in simulations with interactive, nonlinear, and visual environments. We also conduct experiments where a robot arm uses Stable-BC to play air hockey. See our website here: https://collab.me.vt.edu/Stable-BC/
Towards Unconstrained Collision Injury Protection Data Sets: Initial Surrogate Experiments for the Human Hand IROS
Safety for physical human-robot interaction (pHRI) is a major concern for all application domains. While current standardization for industrial robot applications provides safety constraints that address the onset of pain in blunt impacts, these impact thresholds are difficult to use on edged or pointed impactors. The most severe injuries occur in constrained contact scenarios, where crushing is possible. Nevertheless, situations potentially resulting in constrained contact only occur in certain areas of a workspace and design or organisational approaches can be used to avoid them. What remains are risks to the human physical integrity caused by unconstrained accidental contacts, which are difficult to avoid while maintaining robot motion efficiency. Nevertheless, the probability and severity of injuries occurring with edged or pointed impacting objects in unconstrained collisions are hardly researched. In this paper, we propose an experimental setup and procedure using two pendulums modeling human hands and arms and robots to understand the injury potential of unconstrained collisions of human hands with edged objects. Based on our previous studies, we use pig feet as ex vivo surrogate samples - as these closely resemble the physiological characteristics of human hands - to create an initial injury database on the severity of injuries caused by unconstrained edged or pointed impacts. The use of such experimental setups and procedures in addition to other research on the occurrence of injuries in humans will eventually lead to a complete understanding of the biomechanical injury potential in pHRI.
comment: This work is a preprint and accepted at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) and has been submitted to the IEEE for publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Motion Planning for Minimally Actuated Serial Robots
Modern manipulators are acclaimed for their precision but often struggle to operate in confined spaces. This limitation has driven the development of hyper-redundant and continuum robots. While these present unique advantages, they face challenges in, for instance, weight, mechanical complexity, modeling and costs. The Minimally Actuated Serial Robot (MASR) has been proposed as a light-weight, low-cost and simpler alternative where passive joints are actuated with a Mobile Actuator (MA) moving along the arm. Yet, Inverse Kinematics (IK) and a general motion planning algorithm for the MASR have not been addressed. In this letter, we propose the MASR-RRT* motion planning algorithm specifically developed for the unique kinematics of MASR. The main component of the algorithm is a data-based model for solving the IK problem while considering minimal traverse of the MA. The model is trained solely using the forward kinematics of the MASR and does not require real data. With the model as a local-connection mechanism, MASR-RRT* minimizes a cost function expressing the action time. In a comprehensive analysis, we show that MASR-RRT* is superior in performance to the straight-forward implementation of the standard RRT*. Experiments on a real robot in different environments with obstacles validate the proposed algorithm.
IIT Bombay Racing Driverless: Autonomous Driving Stack for Formula Student AI
This work presents the design and development of IIT Bombay Racing's Formula Student style autonomous racecar algorithm capable of running at the racing events of Formula Student-AI, held in the UK. The car employs a cutting-edge sensor suite of the compute unit NVIDIA Jetson Orin AGX, 2 ZED2i stereo cameras, 1 Velodyne Puck VLP16 LiDAR and SBG Systems Ellipse N GNSS/INS IMU. It features deep learning algorithms and control systems to navigate complex tracks and execute maneuvers without any human intervention. The design process involved extensive simulations and testing to optimize the vehicle's performance and ensure its safety. The algorithms have been tested on a small scale, in-house manufactured 4-wheeled robot and on simulation software. The results obtained for testing various algorithms in perception, simultaneous localization and mapping, path planning and controls have been detailed.
comment: 8 pages, 19 figures
Text2Interaction: Establishing Safe and Preferable Human-Robot Interaction
Adjusting robot behavior to human preferences can require intensive human feedback, preventing quick adaptation to new users and changing circumstances. Moreover, current approaches typically treat user preferences as a reward, which requires a manual balance between task success and user satisfaction. To integrate new user preferences in a zero-shot manner, our proposed Text2Interaction framework invokes large language models to generate a task plan, motion preferences as Python code, and parameters of a safe controller. By maximizing the combined probability of task completion and user satisfaction instead of a weighted sum of rewards, we can reliably find plans that fulfill both requirements. We find that 83% of users working with Text2Interaction agree that it integrates their preferences into the robot's plan, and 94% prefer Text2Interaction over the baseline. Our ablation study shows that Text2Interaction aligns better with unseen preferences than other baselines while maintaining a high success rate.
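The difference between maximizing a combined probability and maximizing a weighted sum, as described in the Text2Interaction abstract above, can be seen in a small arithmetic example. The candidate plans, their probabilities, the weight of 0.9, and the independence assumption behind multiplying the two probabilities are all hypothetical and chosen only for illustration; they are not results from the paper.

```python
# Each hypothetical plan maps to (P(task success), P(user satisfied)).
PLANS = {
    "A": (0.99, 0.10),  # reliable but ignores the user's preference
    "B": (0.80, 0.80),  # balanced
    "C": (0.30, 0.98),  # pleasing but unlikely to succeed
}


def best_by_weighted_sum(plans, w_task=0.9):
    """Pick the plan with the highest weighted sum of the two terms."""
    return max(
        plans,
        key=lambda name: w_task * plans[name][0] + (1.0 - w_task) * plans[name][1],
    )


def best_by_joint_probability(plans):
    """Pick the plan most likely to satisfy both requirements at once,
    assuming (for this sketch) that the two events are independent."""
    return max(plans, key=lambda name: plans[name][0] * plans[name][1])


WEIGHTED_CHOICE = best_by_weighted_sum(PLANS)
JOINT_CHOICE = best_by_joint_probability(PLANS)
```

Here the weighted sum selects plan A (0.901 versus 0.800 and 0.368), while the product selects plan B (0.640 versus 0.099 and 0.294). A product cannot be made large by excelling on one term alone, which is why no manual weight tuning is needed to avoid lopsided plans.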
Developing Smart MAVs for Autonomous Inspection in GPS-denied Constructions
Smart Micro Aerial Vehicles (MAVs) have transformed infrastructure inspection by enabling efficient, high-resolution monitoring at various stages of construction, including hard-to-reach areas. Traditional manual operation of drones in GPS-denied environments, such as industrial facilities and infrastructure, is labour-intensive, tedious and prone to error. This study presents an innovative framework for smart MAV inspections in such complex and GPS-denied indoor environments. The framework features a hierarchical perception and planning system that identifies regions of interest and optimises task paths. It also presents an advanced MAV system with enhanced localisation and motion planning capabilities, integrated with Neural Reconstruction technology for comprehensive 3D reconstruction of building structures. The effectiveness of the framework was empirically validated in a 4,000-square-metre indoor infrastructure facility with an interior length of 80 metres, a width of 50 metres and a height of 7 metres. The main structure consists of columns and walls. Experimental results show that our MAV system performs exceptionally well in autonomous inspection tasks, achieving a 100% success rate in generating and executing scan paths. Extensive experiments validate the manoeuvrability of our developed MAV, achieving a 100% success rate in motion planning with a tracking error of less than 0.1 metres. In addition, the enhanced reconstruction method using 3D Gaussian Splatting technology enables the generation of high-fidelity rendering models from the acquired data. Overall, our novel method represents a significant advancement in the use of robotics for infrastructure inspection.
A novel metric for detecting quadrotor loss-of-control ICRA
Unmanned aerial vehicles (UAVs) are becoming an integral part of both industry and society. In particular, the quadrotor is now invaluable across a plethora of fields and recent developments, such as the inclusion of aerial manipulators, only extend their versatility. As UAVs become more widespread, preventing loss-of-control (LOC) is an ever growing concern. Unfortunately, LOC is not clearly defined for quadrotors, or indeed, many other autonomous systems. Moreover, any existing definitions are often incomplete and restrictive. A novel metric, based on actuator capabilities, is introduced to detect LOC in quadrotors. The potential of this metric for LOC detection is demonstrated through both simulated and real quadrotor flight data. It is able to detect LOC induced by actuator faults without explicit knowledge of the occurrence and nature of the failure. The proposed metric is also sensitive enough to detect LOC in more nuanced cases, where the quadrotor remains undamaged but nevertheless loses control through an aggressive yawing manoeuvre. As the metric depends only on system and actuator models, it is sufficiently general to be applied to other systems.
comment: Presented at the International Conference on Robotics and Automation (ICRA) 2024 in Yokohama, Japan
Generative Design of Multimodal Soft Pneumatic Actuators
The recent advancements in machine learning techniques have steered us towards the data-driven design of products. Motivated by this objective, the present study proposes an automated design methodology that employs data-driven methods to generate new designs of soft actuators. One of the bottlenecks in the data-driven automated design process is having publicly available data to train the model. Due to its unavailability, a synthetic data set of soft pneumatic network (Pneu-net) actuators has been created. The parametric design data set for the training of the generative model is created using data augmentation. Next, the Gaussian mixture model has been applied to generate novel parametric designs of Pneu-net actuators. The distance-based metric defines the novelty and diversity of the generated designs. In addition, it is noteworthy that the model has the potential to generate a multimodal Pneu-net actuator that could perform in-plane bending and out-of-plane twisting. Later, the novel design is passed through finite element analysis to evaluate the quality of the generated design. Moreover, the trajectory of each category of Pneu-net actuators evaluates the performance of the generated Pneu-net actuators and emphasizes the necessity of multimodal actuation. The proposed model could accelerate the design of new soft robots by selecting a soft actuator from the developed novel pool of soft actuators.
Exploring and Learning Structure: Active Inference Approach in Navigational Agents
Drawing inspiration from animal navigation strategies, we introduce a novel computational model for navigation and mapping, rooted in biologically inspired principles. Animals exhibit remarkable navigation abilities by efficiently using memory, imagination, and strategic decision-making to navigate complex and aliased environments. Building on these insights, we integrate traditional cognitive mapping approaches with an Active Inference Framework (AIF) to learn an environment structure in a few steps. Through the incorporation of topological mapping for long-term memory and AIF for navigation planning and structure learning, our model can dynamically apprehend environmental structures and expand its internal map with predicted beliefs during exploration. Comparative experiments with the Clone-Structured Cognitive Graph (CSCG) model highlight our model's ability to rapidly learn environmental structures in a single episode, with minimal navigation overlap. This is achieved without prior knowledge of the dimensions of the environment or the type of observations, showcasing its robustness and effectiveness in navigating ambiguous environments.
comment: IWAI workshop 2024
CAD-Mesher: A Convenient, Accurate, Dense Mesh-based Mapping Module in SLAM for Dynamic Environments
Most LiDAR odometry and SLAM systems construct maps in point clouds, which are discrete and sparse when zoomed in, making them not directly suitable for navigation. Mesh maps represent a dense and continuous map format with low memory consumption, which can approximate complex structures with simple elements, attracting significant attention from researchers in recent years. However, most implementations operate under a static environment assumption. In effect, moving objects cause ghosting, potentially degrading the quality of meshing. To address these issues, we propose a plug-and-play meshing module adapting to dynamic environments, which can easily integrate with various LiDAR odometry to generally improve the pose estimation accuracy of odometry. In our meshing module, a novel two-stage coarse-to-fine dynamic removal method is designed to effectively filter dynamic objects, generating consistent, accurate, and dense mesh maps. To the best of our knowledge, this is the first mesh construction method with explicit dynamic removal. Additionally, conducive to Gaussian process in mesh construction, sliding window-based keyframe aggregation and adaptive downsampling strategies are used to ensure the uniformity of point cloud. We evaluate the localization and mapping accuracy on five publicly available datasets. Both qualitative and quantitative results demonstrate the superiority of our method compared with the state-of-the-art algorithms. The code and introduction video are publicly available at https://yaepiii.github.io/CAD-Mesher/.
comment: 9 pages, 7 figures
Spb3DTracker: A Robust LiDAR-Based Person Tracker for Noisy Environment
Person detection and tracking (PDT) has seen significant advancements with 2D camera-based systems in the autonomous vehicle field, leading to widespread adoption of these algorithms. However, growing privacy concerns have recently emerged as a major issue, prompting a shift towards LiDAR-based PDT as a viable alternative. Within this domain, "Tracking-by-Detection" (TBD) has become a prominent methodology. Despite its effectiveness, LiDAR-based PDT has not yet achieved the same level of performance as camera-based PDT. This paper examines key components of the LiDAR-based PDT framework, including detection post-processing, data association, motion modeling, and lifecycle management. Building upon these insights, we introduce SpbTrack, a robust person tracker designed for diverse environments. Our method achieves superior performance on noisy datasets and state-of-the-art results on KITTI Dataset benchmarks and custom office indoor dataset among LiDAR-based trackers. Project page at anonymous.
comment: 17 pages, 5 figures
Adapting a Foundation Model for Space-based Tasks
Foundation models, e.g., large language models, possess attributes of intelligence which offer promise to endow a robot with the contextual understanding necessary to navigate complex, unstructured tasks in the wild. In the future of space robotics, we see three core challenges which motivate the use of a foundation model adapted to space-based applications: 1) Scalability of ground-in-the-loop operations; 2) Generalizing prior knowledge to novel environments; and 3) Multi-modality in tasks and sensor data. Therefore, as a first step towards building a foundation model for space-based applications, we automatically label the AI4Mars dataset to curate a language annotated dataset of visual-question-answer tuples. We fine-tune a pretrained LLaVA checkpoint on this dataset to endow a vision-language model with the ability to perform spatial reasoning and navigation on Mars' surface. In this work, we demonstrate that 1) existing vision-language models are deficient visual reasoners in space-based applications, and 2) fine-tuning a vision-language model on extraterrestrial data significantly improves the quality of responses even with a limited training dataset of only a few thousand samples.
Hierarchical in-Context Reinforcement Learning with Hindsight Modular Reflections for Planning
Large Language Models (LLMs) have demonstrated remarkable abilities in various language tasks, making them promising candidates for decision-making in robotics. Inspired by Hierarchical Reinforcement Learning (HRL), we propose Hierarchical in-Context Reinforcement Learning (HCRL), a novel framework in which an LLM-based high-level policy decomposes a complex task into sub-tasks on the fly. The sub-tasks, defined by goals, are assigned to the low-level policy to complete. Once the LLM agent determines that the goal is finished, a new goal will be proposed. To improve the agent's performance in multi-episode execution, we propose Hindsight Modular Reflection (HMR), where, instead of reflecting on the full trajectory, we replace the task objective with intermediate goals and let the agent reflect on shorter trajectories to improve reflection efficiency. We evaluate the decision-making ability of the proposed HCRL in three benchmark environments--ALFWorld, Webshop, and HotpotQA. Results show that HCRL can achieve 9%, 42%, and 10% performance improvement in 5 episodes of execution over strong in-context learning baselines.
TacSL: A Library for Visuotactile Sensor Simulation and Learning
For both humans and robots, the sense of touch, known as tactile sensing, is critical for performing contact-rich manipulation tasks. Three key challenges in robotic tactile sensing are 1) interpreting sensor signals, 2) generating sensor signals in novel scenarios, and 3) learning sensor-based policies. For visuotactile sensors, interpretation has been facilitated by their close relationship with vision sensors (e.g., RGB cameras). However, generation is still difficult, as visuotactile sensors typically involve contact, deformation, illumination, and imaging, all of which are expensive to simulate; in turn, policy learning has been challenging, as simulation cannot be leveraged for large-scale data collection. We present TacSL (taxel), a library for GPU-based visuotactile sensor simulation and learning. TacSL can be used to simulate visuotactile images and extract contact-force distributions over $200\times$ faster than the prior state-of-the-art, all within the widely-used Isaac Gym simulator. Furthermore, TacSL provides a learning toolkit containing multiple sensor models, contact-intensive training environments, and online/offline algorithms that can facilitate policy learning for sim-to-real applications. On the algorithmic side, we introduce a novel online reinforcement-learning algorithm called asymmetric actor-critic distillation, designed to effectively and efficiently learn tactile-based policies in simulation that can transfer to the real world. Finally, we demonstrate the utility of our library and algorithms by evaluating the benefits of distillation and multimodal sensing for contact-rich manipulation tasks, and most critically, performing sim-to-real transfer. Supplementary videos and results are at https://iakinola23.github.io/tacsl/.
Decentralized Cooperation in Heterogeneous Multi-Agent Reinforcement Learning via Graph Neural Network-Based Intrinsic Motivation
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for various sequential decision-making and control tasks. Unlike their single-agent counterparts, multi-agent systems necessitate successful cooperation among the agents. The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals. These challenges become more pronounced under partial observability and the lack of prior knowledge about agent heterogeneity. While notable studies use intrinsic motivation (IM) to address reward sparsity or cooperation in decentralized settings, those dealing with heterogeneity typically assume centralized training, parameter sharing, and agent indexing. To overcome these limitations, we propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies in decentralized settings, under the challenges of partial observability and reward sparsity. Evaluation of CoHet in the Multi-agent Particle Environment (MPE) and Vectorized Multi-Agent Simulator (VMAS) benchmarks demonstrates superior performance compared to the state-of-the-art in a range of cooperative multi-agent scenarios. Our research is supplemented by an analysis of the impact of the agent dynamics model on the intrinsic motivation module, insights into the performance of different CoHet variants, and its robustness to an increasing number of heterogeneous agents.
comment: 8 pages, 4 figures
UniT: Unified Tactile Representation for Robot Learning
UniT is a novel approach to tactile representation learning, using VQVAE to learn a compact latent space and serve as the tactile representation. It uses tactile images obtained from a single simple object to train the representation with transferability and generalizability. This tactile representation can be zero-shot transferred to various downstream tasks, including perception tasks and manipulation policy learning. Our benchmarking on an in-hand 3D pose estimation task shows that UniT outperforms existing visual and tactile representation learning methods. Additionally, UniT's effectiveness in policy learning is demonstrated across three real-world tasks involving diverse manipulated objects and complex robot-object-environment interactions. Through extensive experimentation, UniT is shown to be a simple-to-train, plug-and-play, yet widely effective method for tactile representation learning. For more details, please refer to our open-source repository https://github.com/ZhengtongXu/UniT and the project website https://zhengtongxu.github.io/unifiedtactile.github.io/.
KIX: A Knowledge and Interaction-Centric Metacognitive Framework for Task Generalization
People exhibit general intelligence in solving a variety of tasks, adapting flexibly to novel situations by reusing and applying high-level knowledge acquired over time. Artificial agents, by contrast, are more like specialists and lack such generalist behaviors. Such agents will need to understand and exploit critical structured knowledge representations. We present a metacognitive generalization framework, Knowledge-Interaction-eXecution (KIX), and argue that interactions with objects leveraging type space facilitate the learning of transferable interaction concepts and generalization. It is a natural way of integrating knowledge into reinforcement learning and is promising to act as an enabler for autonomous and generalist behaviors in artificial intelligence systems.
Toward a Surgeon-in-the-Loop Ophthalmic Robotic Apprentice using Reinforcement and Imitation Learning IROS'24
Robot-assisted surgical systems have demonstrated significant potential in enhancing surgical precision and minimizing human errors. However, existing systems cannot accommodate individual surgeons' unique preferences and requirements. Additionally, they primarily focus on general surgeries (e.g., laparoscopy) and are unsuitable for highly precise microsurgeries, such as ophthalmic procedures. Thus, we propose an image-guided approach for surgeon-centered autonomous agents that can adapt to the individual surgeon's skill level and preferred surgical techniques during ophthalmic cataract surgery. Our approach trains reinforcement and imitation learning agents simultaneously using curriculum learning approaches guided by image data to perform all tasks of the incision phase of cataract surgery. By integrating the surgeon's actions and preferences into the training process, our approach enables the robot to implicitly learn and adapt to the individual surgeon's unique techniques through surgeon-in-the-loop demonstrations. This results in a more intuitive and personalized surgical experience for the surgeon while ensuring consistent performance for the autonomous robotic apprentice. We define and evaluate the effectiveness of our approach in a simulated environment using our proposed metrics and highlight the trade-off between a generic agent and a surgeon-centered adapted agent. Finally, our approach has the potential to extend to other ophthalmic and microsurgical procedures, opening the door to a new generation of surgeon-in-the-loop autonomous surgical robots. We provide an open-source simulation framework for future development and reproducibility at https://github.com/amrgomaaelhady/CataractAdaptSurgRobot.
comment: Accepted at IROS'24
Utilizing Navigation Paths to Generate Target Points for Enhanced End-to-End Autonomous Driving Planning
In recent years, end-to-end autonomous driving frameworks have been shown to not only enhance perception performance but also improve planning capabilities. However, most previous end-to-end autonomous driving frameworks have focused primarily on enhancing environmental perception while neglecting the learning of autonomous vehicle driving intent, which refers to the vehicle's intended direction of travel. In planning, the autonomous vehicle's direction is clear and well-defined, yet this crucial aspect has often been overlooked. This paper introduces NTT (Navigation to Target for Trajectory planning), a method within an end-to-end framework for autonomous driving. NTT generates the planned trajectory in two steps. First, it generates the future target point for the autonomous vehicle on the basis of the navigation path. Then, it produces the complete planned trajectory on the basis of this target point. On the one hand, generating the target point for the autonomous vehicle from the navigation path enables the vehicle to learn a clear driving intent. On the other hand, generating the trajectory on the basis of the target point allows for a flexible planned trajectory that can adapt to complex environmental changes, thereby enhancing the safety of the planning process. Our method achieved excellent planning performance on the widely used nuScenes dataset and its effectiveness was validated through ablation experiments.
Neural Randomized Planning for Whole Body Robot Motion
Robot motion planning has made vast advances over the past decades, but the challenge remains: robot mobile manipulators struggle to plan long-range whole-body motion in common household environments in real time, because of high-dimensional robot configuration space and complex environment geometry. To tackle the challenge, this paper proposes the Neural Randomized Planner (NRP), which combines a global sampling-based motion planning (SBMP) algorithm and a local neural sampler. Intuitively, NRP uses the search structure inside the global planner to stitch together learned local sampling distributions to form a global sampling distribution adaptively. It benefits from both learning and planning. Locally, it tackles high dimensionality by learning to sample in promising regions from data, with a rich neural network representation. Globally, it composes the local sampling distributions through planning and exploits local geometric similarity to scale up to complex environments. Experiments both in simulation and on a real robot show NRP yields superior performance compared to some of the best classical and learning-enhanced SBMP algorithms. Further, despite being trained in simulation, NRP demonstrates zero-shot transfer to a real robot operating in novel household environments, without any fine-tuning or manual adaptation.
MIMONet: Multi-Input Multi-Output On-Device Deep Learning ICRA 2025
Future intelligent robots are expected to process multiple inputs simultaneously (such as image and audio data) and generate multiple outputs accordingly (such as gender and emotion), similar to humans. Recent research has shown that multi-input single-output (MISO) deep neural networks (DNN) outperform traditional single-input single-output (SISO) models, representing a significant step towards this goal. In this paper, we propose MIMONet, a novel on-device multi-input multi-output (MIMO) DNN framework that achieves high accuracy and on-device efficiency in terms of critical performance metrics such as latency, energy, and memory usage. Leveraging existing SISO model compression techniques, MIMONet develops a new deep-compression method that is specifically tailored to MIMO models. This new method explores unique yet non-trivial properties of the MIMO model, resulting in boosted accuracy and on-device efficiency. Extensive experiments on three embedded platforms commonly used in robotic systems, as well as a case study using the TurtleBot3 robot, demonstrate that MIMONet achieves higher accuracy and superior on-device efficiency compared to state-of-the-art SISO and MISO models, as well as a baseline MIMO model we constructed. Our evaluation highlights the real-world applicability of MIMONet and its potential to significantly enhance the performance of intelligent robotic systems.
comment: Submitted to ICRA 2025
A Soft Robotic System Automatically Learns Precise Agile Motions Without Model Information IROS 2024
Many application domains, e.g., in medicine and manufacturing, can greatly benefit from pneumatic Soft Robots (SRs). However, the accurate control of SRs has remained a significant challenge to date, mainly due to their nonlinear dynamics and viscoelastic material properties. Conventional control design methods often rely on either complex system modeling or time-intensive manual tuning, both of which require significant amounts of human expertise and thus limit their practicality. In recent works, the data-driven method Automatic Neural ODE Control (ANODEC) has been successfully used to design controllers for various nonlinear systems in silico, fully automatically and from input-output data only, without requiring prior model knowledge or extensive manual tuning. In this work, we successfully apply ANODEC to automatically learn to perform agile, non-repetitive reference tracking motion tasks in a real-world SR and within a finite time horizon. To the best of the authors' knowledge, ANODEC achieves, for the first time, performant control of an SR with hysteresis effects from only 30 seconds of input-output data and without any prior model knowledge. We show that for multiple, qualitatively different and even out-of-training-distribution reference signals, a single feedback controller designed by ANODEC consistently outperforms a manually tuned PID baseline. Overall, this contribution not only further strengthens the validity of ANODEC, but also marks an important step towards more practical, easy-to-use SRs that can automatically learn to perform agile motions from minimal experimental interaction time.
comment: Submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
IN-Sight: Interactive Navigation through Sight IROS 2024
Current visual navigation systems often treat the environment as static, lacking the ability to adaptively interact with obstacles. This limitation leads to navigation failure when encountering unavoidable obstructions. In response, we introduce IN-Sight, a novel approach to self-supervised path planning, enabling more effective navigation strategies through interaction with obstacles. Utilizing RGB-D observations, IN-Sight calculates traversability scores and incorporates them into a semantic map, facilitating long-range path planning in complex, maze-like environments. To precisely navigate around obstacles, IN-Sight employs a local planner, trained imperatively on a differentiable costmap using representation learning techniques. The entire framework undergoes end-to-end training within the state-of-the-art photorealistic Intel SPEAR Simulator. We validate the effectiveness of IN-Sight through extensive benchmarking in a variety of simulated scenarios and ablation studies. Moreover, we demonstrate the system's real-world applicability with zero-shot sim-to-real transfer, deploying our planner on the legged robot platform ANYmal, showcasing its practical potential for interactive navigation in real environments.
comment: The 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
TDANet: Target-Directed Attention Network For Object-Goal Visual Navigation With Zero-Shot Ability
The generalization of end-to-end deep reinforcement learning (DRL) for object-goal visual navigation is a long-standing challenge, since object classes and placements vary in new test environments. Learning a domain-independent visual representation is critical for enabling the trained DRL agent to generalize to unseen scenes and objects. In this letter, a target-directed attention network (TDANet) is proposed to learn the end-to-end object-goal visual navigation policy with zero-shot ability. TDANet features a novel target attention (TA) module that learns both the spatial and semantic relationships among objects to help TDANet focus on the observed objects most relevant to the target. With the Siamese architecture (SA) design, TDANet distinguishes the difference between the current and target states and generates the domain-independent visual representation. To evaluate the navigation performance of TDANet, extensive experiments are conducted in the AI2-THOR embodied AI environment. The simulation results demonstrate a strong generalization ability of TDANet to unseen scenes and target objects, with higher navigation success rate (SR) and success weighted by path length (SPL) than other state-of-the-art models. TDANet is finally deployed on a wheeled robot in real scenes, demonstrating satisfactory generalization of TDANet to the real world.
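For readers unfamiliar with the two metrics named in the abstract above, they are commonly computed as sketched below. The SPL formula used here, the mean over episodes of S_i * l_i / max(p_i, l_i), is the usual definition from the embodied-navigation evaluation literature and is not stated in the abstract itself; the function and argument names are illustrative only.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes that ended in success (SR)."""
    return sum(1 for s in successes if s) / len(successes)


def spl(successes: Sequence[bool],
        shortest_lengths: Sequence[float],
        path_lengths: Sequence[float]) -> float:
    """Success weighted by path length.

    successes[i]        : whether episode i reached the target
    shortest_lengths[i] : shortest-path length from start to target (l_i)
    path_lengths[i]     : length of the path the agent actually took (p_i)
    """
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        if not s:
            continue  # failed episodes contribute zero
        denom = max(p, l)
        # Assumption: an episode that starts at the target (l = p = 0)
        # counts as perfectly efficient.
        total += l / denom if denom > 0 else 1.0
    return total / len(successes)


# Three episodes: a success with a detour, a failure, and an optimal success.
EXAMPLE_SPL = spl([True, False, True], [2.0, 3.0, 4.0], [4.0, 3.0, 4.0])
```

A successful episode scores 1 only when the agent's path is as short as the shortest path, so SPL is never larger than SR.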
CoFiI2P: Coarse-to-Fine Correspondences for Image-to-Point Cloud Registration
Image-to-point cloud (I2P) registration is a fundamental task for robots and autonomous vehicles to achieve cross-modality data fusion and localization. Current I2P registration methods primarily focus on estimating correspondences at the point or pixel level, often neglecting global alignment. As a result, I2P matching can easily converge to a local optimum if it lacks high-level guidance from global constraints. To improve the success rate and general robustness, this paper introduces CoFiI2P, a novel I2P registration network that extracts correspondences in a coarse-to-fine manner. First, the image and point cloud data are processed through a two-stream encoder-decoder network for hierarchical feature extraction. Second, a coarse-to-fine matching module is designed to leverage these features and establish robust feature correspondences. Specifically, in the coarse matching phase, a novel I2P transformer module is employed to capture both homogeneous and heterogeneous global information from the image and point cloud data. This enables the estimation of coarse super-point/super-pixel matching pairs with discriminative descriptors. In the fine matching module, point/pixel pairs are established with the guidance of super-point/super-pixel correspondences. Finally, based on matching pairs, the transform matrix is estimated with the EPnP-RANSAC algorithm. Experiments conducted on the KITTI Odometry dataset demonstrate that CoFiI2P achieves impressive results, with a relative rotation error (RRE) of 1.14 degrees and a relative translation error (RTE) of 0.29 meters, while maintaining real-time speed. Additional experiments on the nuScenes dataset confirm our method's generalizability. The project page is available at https://whu-usi3dv.github.io/CoFiI2P.
comment: Submitted to IEEE RA-L (under review); project page is available at: https://whu-usi3dv.github.io/CoFiI2P
Redefining Safety for Autonomous Vehicles
Existing definitions and associated conceptual frameworks for computer-based system safety should be revisited in light of real-world experiences from deploying autonomous vehicles. Current terminology used by industry safety standards emphasizes mitigation of risk from specifically identified hazards, and carries assumptions based on human-supervised vehicle operation. Operation without a human driver dramatically increases the scope of safety concerns, especially due to operation in an open world environment, a requirement to self-enforce operational limits, participation in an ad hoc sociotechnical system of systems, and a requirement to conform to both legal and ethical constraints. Existing standards and terminology only partially address these new challenges. We propose updated definitions for core system safety concepts that encompass these additional considerations as a starting point for evolving safety approaches to address these additional safety challenges. These results might additionally inform framing safety terminology for other autonomous system applications.
comment: 19 pages, SafeComp 2024 preprint with additional appendix
Multiagent Systems
Quantum Annealing-Based Algorithm for Efficient Coalition Formation Among LEO Satellites
The increasing number of Low Earth Orbit (LEO) satellites, driven by lower manufacturing and launch costs, is proving invaluable for Earth observation missions and low-latency internet connectivity. However, as the number of satellites increases, the number of communication links to maintain also rises, making the management of this vast network increasingly challenging and highlighting the need for clustering satellites into efficient groups as a promising solution. This paper formulates the clustering of LEO satellites as a coalition structure generation (CSG) problem and leverages quantum annealing to solve it. We represent the satellite network as a graph and obtain the optimal partitions using a hybrid quantum-classical algorithm called GCS-Q. The algorithm follows a top-down approach by iteratively splitting the graph at each step using a quadratic unconstrained binary optimization (QUBO) formulation. To evaluate our approach, we utilize real-world three-line element set (TLE/3LE) data for Starlink satellites from Celestrak. Our experiments, conducted using the D-Wave Advantage annealer and the state-of-the-art solver Gurobi, demonstrate that the quantum annealer significantly outperforms classical methods in terms of runtime while maintaining the solution quality. The performance achieved with quantum annealers surpasses the capabilities of classical computers, highlighting the transformative potential of quantum computing in optimizing the management of large-scale satellite networks.
comment: 6 pages, 4 figures
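The top-down splitting step described in the abstract above can be illustrated with a small sketch. This is not the GCS-Q implementation: it assumes that a coalition's value is the sum of the (possibly negative) edge weights inside it, it uses a hypothetical stopping rule (stop when no bipartition has a negative cut weight), and it replaces the quantum annealer with brute-force enumeration. All function names are made up for illustration.

```python
from itertools import product


def cut_qubo(nodes, weights):
    """QUBO coefficients whose energy is the total weight cut by a bipartition.

    With x_i in {0, 1} marking the side of node i, an edge (i, j) is cut when
    x_i != x_j, which contributes w_ij * (x_i + x_j - 2 * x_i * x_j).
    Linear terms sit on the diagonal because x_i * x_i == x_i.
    """
    q = {}
    for (i, j), w in weights.items():
        if i in nodes and j in nodes:
            q[(i, i)] = q.get((i, i), 0.0) + w
            q[(j, j)] = q.get((j, j), 0.0) + w
            q[(i, j)] = q.get((i, j), 0.0) - 2.0 * w
    return q


def qubo_energy(q, x):
    """Evaluate sum over (a, b) of q[a, b] * x[a] * x[b]."""
    return sum(c * x[a] * x[b] for (a, b), c in q.items())


def solve_brute_force(nodes, q):
    """Stand-in for the annealer: enumerate every bipartition.

    The first node is pinned to side 0 to remove the mirror symmetry.
    """
    first, rest = nodes[0], nodes[1:]
    best_energy, best_x = None, None
    for bits in product([0, 1], repeat=len(rest)):
        x = {first: 0, **dict(zip(rest, bits))}
        energy = qubo_energy(q, x)
        if best_energy is None or energy < best_energy:
            best_energy, best_x = energy, x
    return best_energy, best_x


def top_down_split(nodes, weights):
    """Recursively bipartition while a split removes negative total weight."""
    nodes = sorted(nodes)
    if len(nodes) < 2:
        return [nodes]
    energy, x = solve_brute_force(nodes, cut_qubo(set(nodes), weights))
    if energy >= 0:  # assumed stopping rule: no split improves the value
        return [nodes]
    left = [n for n in nodes if x[n] == 0]
    right = [n for n in nodes if x[n] == 1]
    return top_down_split(left, weights) + top_down_split(right, weights)


# Toy graph: nodes 0-1 and 2-3 like each other, cross pairs do not.
WEIGHTS = {(0, 1): 2.0, (2, 3): 3.0, (0, 2): -1.0,
           (1, 3): -1.5, (0, 3): -0.5, (1, 2): -0.5}
COALITIONS = top_down_split([0, 1, 2, 3], WEIGHTS)
```

Brute force scales as 2^n and is only meant to make the QUBO concrete; in the paper's setting the same coefficients would be handed to an annealer or a classical solver.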
Optimizing RAG Techniques for Automotive Industry PDF Chatbots: A Case Study with Locally Deployed Ollama Models
With the growing demand for offline PDF chatbots in automotive industrial production environments, optimizing the deployment of large language models (LLMs) in local, low-performance settings has become increasingly important. This study focuses on enhancing Retrieval-Augmented Generation (RAG) techniques for processing complex automotive industry documents using locally deployed Ollama models. Based on the Langchain framework, we propose a multi-dimensional optimization approach for Ollama's local RAG implementation. Our method addresses key challenges in automotive document processing, including multi-column layouts and technical specifications. We introduce improvements in PDF processing, retrieval mechanisms, and context compression, tailored to the unique characteristics of automotive industry documents. Additionally, we design custom classes supporting embedding pipelines and an agent supporting self-RAG based on LangGraph best practices. To evaluate our approach, we constructed a proprietary dataset comprising typical automotive industry documents, including technical reports and corporate regulations. We compared our optimized RAG model and self-RAG agent against a naive RAG baseline across three datasets: our automotive industry dataset, QReCC, and CoQA. Results demonstrate significant improvements in context precision, context recall, answer relevancy, and faithfulness, with particularly notable performance on the automotive industry dataset. Our optimization scheme provides an effective solution for deploying local RAG systems in the automotive sector, addressing the specific needs of PDF chatbots in industrial production environments. This research has important implications for advancing information processing and intelligent production in the automotive industry.
Decentralized Cooperation in Heterogeneous Multi-Agent Reinforcement Learning via Graph Neural Network-Based Intrinsic Motivation
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for various sequential decision-making and control tasks. Unlike their single-agent counterparts, multi-agent systems necessitate successful cooperation among the agents. The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals. These challenges become more pronounced under partial observability and the lack of prior knowledge about agent heterogeneity. While notable studies use intrinsic motivation (IM) to address reward sparsity or cooperation in decentralized settings, those dealing with heterogeneity typically assume centralized training, parameter sharing, and agent indexing. To overcome these limitations, we propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies in decentralized settings, under the challenges of partial observability and reward sparsity. Evaluation of CoHet in the Multi-agent Particle Environment (MPE) and Vectorized Multi-Agent Simulator (VMAS) benchmarks demonstrates superior performance compared to the state-of-the-art in a range of cooperative multi-agent scenarios. Our research is supplemented by an analysis of the impact of the agent dynamics model on the intrinsic motivation module, insights into the performance of different CoHet variants, and its robustness to an increasing number of heterogeneous agents.
comment: 8 pages, 4 figures
Distributed Stackelberg Strategies in State-based Potential Games for Autonomous Decentralized Learning Manufacturing Systems
This article describes a novel game structure for autonomously optimizing decentralized manufacturing systems with multi-objective optimization challenges, namely Distributed Stackelberg Strategies in State-Based Potential Games (DS2-SbPG). DS2-SbPG integrates potential games and Stackelberg games, which improves the cooperative trade-off capabilities of potential games and the multi-objective optimization handling by Stackelberg games. Notably, all training procedures are still conducted in a fully distributed manner. DS2-SbPG offers a promising solution to finding optimal trade-offs between objectives by eliminating the complexities of setting up combined objective optimization functions for individual players in self-learning domains, particularly in real-world industrial settings with diverse and numerous objectives between the sub-systems. We further prove that DS2-SbPG constitutes a dynamic potential game, which yields corresponding convergence guarantees. Experimental validation conducted on a laboratory-scale testbed highlights the efficacy of DS2-SbPG and its two variants: DS2-SbPG for single-leader-follower settings and Stack DS2-SbPG for multi-leader-follower settings. The results show significant reductions in power consumption and improvements in overall performance, which signals the potential of DS2-SbPG in real-world applications.
comment: This pre-print was submitted to IEEE Transactions on Systems, Man, and Cybernetics: Systems on July 31, 2024
QTypeMix: Enhancing Multi-Agent Cooperative Strategies through Heterogeneous and Homogeneous Value Decomposition
In multi-agent cooperative tasks, the presence of heterogeneous agents is common. Compared to cooperation among homogeneous agents, collaboration among heterogeneous agents requires considering the best-suited sub-tasks for each agent. However, the operation of multi-agent systems often involves a large amount of complex interaction information, making it more challenging to learn heterogeneous strategies. Related multi-agent reinforcement learning methods sometimes use grouping mechanisms to form smaller cooperative groups or leverage prior domain knowledge to learn strategies for different roles. In contrast, agents should learn deeper role features without relying on additional information. Therefore, we propose QTypeMix, which divides the value decomposition process into homogeneous and heterogeneous stages. QTypeMix learns to extract type features from local historical observations through the TE loss. In addition, we introduce advanced network structures containing attention mechanisms and hypernets to enhance the representation capability and achieve the value decomposition process. The results of testing the proposed method on 14 maps from SMAC and SMACv2 show that QTypeMix achieves state-of-the-art performance in tasks of varying difficulty.
comment: 16 pages, 8 figures
A large-scale particle system with independent jumps and distributed synchronization
We study a system consisting of $n$ particles, moving forward in jumps on the real line. Each particle can make both independent jumps, whose sizes have some distribution, and "synchronization" jumps, which allow it to join a randomly chosen other particle if the latter happens to be ahead of it. System state is the empirical distribution of particle locations. The mean-field asymptotic regime, where $n\to\infty$, is considered. We prove that $v_n$, the steady-state speed of the particle system advance, converges, as $n\to\infty$, to a limit $v_{**}$ which can be easily found from a minimum speed selection principle. Also, as $n\to\infty$, we prove the convergence of the system dynamics to that of a deterministic mean-field limit (MFL). We show that the average speed of advance of any MFL is lower bounded by $v_{**}$, and the speed of a "benchmark" MFL, resulting from all particles initially co-located, is equal to $v_{**}$. In the special case of exponentially distributed independent jump sizes, we prove that a traveling wave MFL with speed $v$ exists if and only if $v\ge v_{**}$, with $v_{**}$ having simple explicit form; we also show the existence of traveling waves for the modified systems, with a left or right boundary moving at a constant speed $v$. Using these traveling wave existence results, we provide bounds on an MFL average speed of advance, depending on the right tail exponent of its initial state. We conjecture that these results for exponential jump sizes generalize to general jump sizes.
comment: Revision. 29 pages, 3 figures
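The particle dynamics described in the abstract above can be explored numerically with a short simulation. The abstract does not give the event rates, so this sketch assumes that each particle makes independent jumps at rate 1 with Exp(1) sizes and attempts a synchronization jump at rate mu; these choices, and all names below, are illustrative assumptions rather than the paper's model parameters.

```python
import random


def simulate_particles(n, mu, t_end, seed=0):
    """Simulate n particles starting at 0 up to time t_end.

    Assumed rates: independent jumps at rate 1 per particle (Exp(1) sizes),
    synchronization attempts at rate mu per particle. Requires n >= 2.
    """
    rng = random.Random(seed)
    x = [0.0] * n
    t = 0.0
    total_rate = n * (1.0 + mu)  # all event clocks merged into one
    while True:
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_end:
            break
        i = rng.randrange(n)  # particle whose clock rang
        if rng.random() < 1.0 / (1.0 + mu):
            # independent jump forward
            x[i] += rng.expovariate(1.0)
        else:
            # synchronization: pick a different particle uniformly at random
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # join it only if it is ahead
            if x[j] > x[i]:
                x[i] = x[j]
    return x


def mean_speed(positions, t_end):
    """Average distance travelled per unit time (all particles start at 0)."""
    return sum(positions) / len(positions) / t_end


POSITIONS = simulate_particles(n=50, mu=1.0, t_end=5.0, seed=1)
```

Calling `mean_speed(POSITIONS, 5.0)` for growing `n` and `t_end` gives an empirical counterpart of the speed $v_n$ discussed in the abstract, under the assumed rates.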
Systems and Control (CS)
Data-Efficient Prediction of Minimum Operating Voltage via Inter- and Intra-Wafer Variation Alignment
Predicting the minimum operating voltage ($V_{min}$) of chips stands as a crucial technique in enhancing the speed and reliability of manufacturing testing flow. However, existing $V_{min}$ prediction methods often overlook various sources of variations in both training and deployment phases. Notably, the neglect of wafer zone-to-zone (intra-wafer) variations and wafer-to-wafer (inter-wafer) variations, compounded by process variations, diminishes the accuracy, data efficiency, and reliability of $V_{min}$ predictors. To address this gap, we introduce a novel data-efficient $V_{min}$ prediction flow, termed restricted bias alignment (RBA), which incorporates a novel variation alignment technique. Our approach concurrently estimates inter- and intra-wafer variations. Furthermore, we propose utilizing class probe data to model inter-wafer variations for the first time. We empirically demonstrate RBA's effectiveness and data efficiency on an industrial 16nm automotive chip dataset.
Learning in Time-Varying Monotone Network Games with Dynamic Populations
In this paper, we present a framework for multi-agent learning in a nonstationary dynamic network environment. More specifically, we examine projected gradient play in smooth monotone repeated network games in which the agents' participation and connectivity vary over time. We model this changing system with a stochastic network which takes a new independent realization at each repetition. We show that the strategy profile learned by the agents through projected gradient dynamics over the sequence of network realizations converges to a Nash equilibrium of the game in which players minimize their expected cost, almost surely and in the mean-square sense. We then show that the learned strategy profile is an almost Nash equilibrium of the game played by the agents at each stage of the repeated game with high probability. Using these two results, we derive non-asymptotic bounds on the regret incurred by the agents.
comment: 10 pages
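Projected gradient play over random network realizations, as described in the abstract above, can be sketched on a toy linear-quadratic network game. Everything specific here is an assumption made for illustration: the cost J_i = 0.5 * x_i^2 - x_i * (b_i + beta * sum_j a_ij * x_j), the box strategy set [0, x_max], edges that are active independently with probability p_edge in each round, and the 1/k step size. Only connectivity is randomized; varying participation is not modeled.

```python
import random


def projected_gradient_play(b, edges, beta, p_edge, x_max, rounds, seed=0):
    """Run projected gradient play on a randomly realized network game.

    b      : standalone marginal benefit of each player
    edges  : undirected base edges (i, j); each is active w.p. p_edge per round
    beta   : strength of the neighbor coupling
    x_max  : upper bound of the strategy interval [0, x_max]
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for k in range(1, rounds + 1):
        # Draw a fresh, independent network realization for this round.
        neighbor_sum = [0.0] * n
        for i, j in edges:
            if rng.random() < p_edge:
                neighbor_sum[i] += x[j]
                neighbor_sum[j] += x[i]
        # Gradient of J_i with respect to the player's own action x_i.
        grads = [x[i] - b[i] - beta * neighbor_sum[i] for i in range(n)]
        step = 1.0 / k  # diminishing step size (an assumed schedule)
        # Simultaneous gradient step followed by projection onto [0, x_max].
        x = [min(x_max, max(0.0, x[i] - step * grads[i])) for i in range(n)]
    return x


# Two players joined by one edge that is present half of the time.
LEARNED = projected_gradient_play(
    b=[1.0, 1.0], edges=[(0, 1)], beta=0.5, p_edge=0.5,
    x_max=10.0, rounds=20000, seed=0,
)
```

In this toy game the expected-cost equilibrium solves x = 1 + 0.5 * 0.5 * x, i.e. x = 4/3 for both players, so the learned profile should settle close to that value.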
The Distributionally Robust Infinite-Horizon LQR
We explore the infinite-horizon Distributionally Robust (DR) linear-quadratic control. While the probability distribution of disturbances is unknown and potentially correlated over time, it is confined within a Wasserstein-2 ball of a radius $r$ around a known nominal distribution. Our goal is to devise a control policy that minimizes the worst-case expected Linear-Quadratic Regulator (LQR) cost among all probability distributions of disturbances lying in the Wasserstein ambiguity set. We obtain the optimality conditions for the optimal DR controller and show that it is non-rational. Despite lacking a finite-order state-space representation, we introduce a computationally tractable fixed-point iteration algorithm. Our proposed method computes the optimal controller in the frequency domain to any desired fidelity. Moreover, for any given finite order, we use a convex numerical method to compute the best rational approximation (in $H_\infty$-norm) to the optimal non-rational DR controller. This enables efficient time-domain implementation by finite-order state-space controllers and addresses the computational hurdles associated with the finite-horizon approaches to DR-LQR problems, which typically necessitate solving a Semi-Definite Program (SDP) with a dimension scaling with the time horizon. We provide numerical simulations to showcase the effectiveness of our approach.
comment: Accepted at CDC 2024
Event-triggered moving horizon estimation for nonlinear systems
This work proposes an event-triggered moving horizon estimation (ET-MHE) scheme for general nonlinear systems. The key components of the proposed scheme are a novel event-triggering mechanism (ETM) and the suitable design of the MHE cost function. The main characteristic of our method is that the MHE's nonlinear optimization problem is only solved when the ETM triggers the transmission of measured data to the remote state estimator. If no event occurs, then the current state estimate results from an open-loop prediction using the system dynamics. Furthermore, we show robust global exponential stability of the ET-MHE under a suitable detectability condition. Finally, we illustrate the applicability of the proposed method in terms of a nonlinear benchmark example, where we achieved similar estimation performance compared to standard MHE while using 86% fewer computational resources.
Hi-SAM: A high-scalable authentication model for satellite-ground Zero-Trust system using mean field game
As more and more Internet of Things (IoT) devices are connected to satellite networks, the Zero-Trust Architecture brings dynamic security to the satellite-ground system, while frequent authentication creates challenges for system availability. To make the system accommodate more IoT devices, this paper proposes a high-scalable authentication model (Hi-SAM). Hi-SAM introduces the Proof-of-Work idea to authentication, which allows devices to obtain network resources based on frequency. To optimize the frequency, a mean field game is used to model competition among devices, which can reduce the decision space of large-scale population games. In addition, a dynamic time-range message authentication code is designed for security. In tests at large population scales, Hi-SAM is superior in the optimization of authentication workload and in anomaly detection efficiency.
Modelling of measuring systems -- From white box models to cognitive approaches
Mathematical models of measuring systems and processes play an essential role in metrology and practical measurements. They form the basis for understanding and evaluating measurements, their results and their trustworthiness. Classic analytical parametric modelling is based on largely complete knowledge of measurement technology and the measurement process. But due to the digital transformation towards the Industrial Internet of Things (IIoT), with an increasing number of intensively and flexibly networked measurement systems and consequently ever larger amounts of data to be processed, data-based modelling approaches have gained enormous importance. This has led to new approaches in measurement technology and industry like Digital Twins, Self-X Approaches, Soft Sensor Technology and Data and Information Fusion. In the future, data-based modelling will be increasingly dominated by intelligent, cognitive systems. Evaluating the accuracy, trustworthiness and functional uncertainty of the corresponding models is required. This paper provides a concise overview of modelling in metrology from classical white box models to intelligent, cognitive data-driven solutions, identifying advantages and limitations. Additionally, the approaches to merge trustworthiness and metrological uncertainty will be discussed.
comment: 8 pages, 13 figures, 1 table, XXIV IMEKO World Congress - Think Metrology, August 26 - 29, 2024, Hamburg, Germany
DC-DC Converters Optimization in Case of Large Variation in the Load
A method for controlling a DC-DC converter is proposed that ensures high-quality control under large fluctuations in load currents by using differential gain control coefficients and second derivative control. Various implementations of balancing the currents of a multiphase DC-DC converter are discussed, with a focus on achieving accurate current regulation without introducing additional delay in the control system. A stochastic particle swarm optimization method is used to find optimal values of the PID controller parameters. Automatic constraint handling in optimization is also discussed as a relevant technique in the field.
Peaking into the Black-box: Prediction Intervals Give Insight into Data-driven Quadrotor Model Reliability
Ensuring the reliability and validity of data-driven quadrotor model predictions is essential for their accepted and practical use. This is especially true for grey- and black-box models wherein the mapping of inputs to predictions is not transparent and subsequent reliability notoriously difficult to ascertain. Nonetheless, such techniques are frequently and successfully used to identify quadrotor models. Prediction intervals (PIs) may be employed to provide insight into the consistency and accuracy of model predictions. This paper estimates such PIs for polynomial and Artificial Neural Network (ANN) quadrotor aerodynamic models. Two existing ANN PI estimation techniques - the bootstrap method and the quality driven method - are validated numerically for quadrotor aerodynamic models using an existing high-fidelity quadrotor simulation. Quadrotor aerodynamic models are then identified on real quadrotor flight data to demonstrate their utility and explore their sensitivity to model interpolation and extrapolation. It is found that the ANN-based PIs widen considerably when extrapolating and remain constant, or shrink, when interpolating. While this behaviour also occurs for the polynomial PIs, it is of lower magnitude. The estimated PIs establish probabilistic bounds within which the quadrotor model outputs will likely lie, subject to modelling and measurement uncertainties that are reflected through the PI widths.
comment: Presented at AIAA SciTech Forum 2023 in National Harbor, MD, USA
A novel metric for detecting quadrotor loss-of-control
Unmanned aerial vehicles (UAVs) are becoming an integral part of both industry and society. In particular, the quadrotor is now invaluable across a plethora of fields and recent developments, such as the inclusion of aerial manipulators, only extend their versatility. As UAVs become more widespread, preventing loss-of-control (LOC) is an ever-growing concern. Unfortunately, LOC is not clearly defined for quadrotors, or indeed, many other autonomous systems. Moreover, any existing definitions are often incomplete and restrictive. A novel metric, based on actuator capabilities, is introduced to detect LOC in quadrotors. The potential of this metric for LOC detection is demonstrated through both simulated and real quadrotor flight data. It is able to detect LOC induced by actuator faults without explicit knowledge of the occurrence and nature of the failure. The proposed metric is also sensitive enough to detect LOC in more nuanced cases, where the quadrotor remains undamaged but nevertheless loses control through an aggressive yawing manoeuvre. As the metric depends only on system and actuator models, it is sufficiently general to be applied to other systems.
comment: Presented at the International Conference on Robotics and Automation (ICRA) 2024 in Yokohama, Japan
Harmonic Stability Analysis of Microgrids with Converter-Interfaced Distributed Energy Resources, Part II: Case Studies
In Part I of this paper, a method for the Harmonic Stability Assessment (HSA) of power systems with a high share of Converter-Interfaced Distributed Energy Resources (CIDERs) was proposed. Specifically, the Harmonic State-Space (HSS) model of a generic power system is derived through combination of the components' HSS models. The HSS models of CIDERs and grid are based on Linear Time-Periodic (LTP) models, capable of representing the coupling between different harmonics. In Part II, the HSA of one grid-forming and two grid-following CIDERs (i.e., excluding and including the DC-side modelling) is performed. More precisely, the classification of the eigenvalues, the impact of the maximum harmonic order on the locations of the eigenvalues, and the sensitivity curves of the eigenvalues w.r.t. control parameters are provided. These analyses allow the physical meaning and origin of the CIDERs' eigenvalues to be studied. Additionally, the HSA is performed for a representative example system derived from the CIGRE low-voltage benchmark system. A case of harmonic instability is identified through the system eigenvalues, and validated with Time-Domain Simulations (TDS) in Simulink. It is demonstrated that, as opposed to stability analyses based on Linear Time-Invariant (LTI) models, the HSA is suitable for the detection of harmonic instability.
Harmonic Stability Analysis of Microgrids with Converter-Interfaced Distributed Energy Resources, Part I: Modelling and Theoretical Foundations
This paper proposes a method for the Harmonic Stability Assessment (HSA) of power systems with a high share of Converter-Interfaced Distributed Energy Resources (CIDERs). To this end, the Harmonic State-Space (HSS) model of a generic power system is formulated by combining the HSS models of the resources and the grid in closed-loop configuration. The HSS model of the resources is obtained from the Linear Time-Periodic (LTP) models of the CIDER components, transformed to the frequency domain using Fourier theory and Toeplitz matrices. Notably, the HSS of a CIDER is capable of representing the coupling between harmonic frequencies in detail. The HSS model of the grid is derived from the dynamic equations of the individual branch and shunt elements. The system matrix of the HSS models on power-system or resource level is employed for eigenvalue analysis in the context of HSA. A sensitivity analysis of the eigenvalue loci w.r.t. changes in model parameters, and a classification of eigenvalues into control-design variant, control-design invariant, and design-invariant eigenvalues are proposed. A case of harmonic instability is identified by the HSA and validated via Time-Domain Simulations (TDS) in Simulink.
Comparison of uncertainty propagation techniques in small-body environment
Close-proximity exploration of small celestial bodies is crucial for the comprehensive and accurate characterization of their properties. However, the complex and uncertain dynamical environment around them contributes to a rapid dispersion of uncertainty and the emergence of non-Gaussian distributions. Therefore, to ensure safe operations, a precise understanding of uncertainty propagation becomes imperative. In this work, the dynamical environment is analyzed around two asteroids: Apophis, which will perform a close flyby of Earth in 2029, and Eros, which has already been explored by past missions. The performance of different uncertainty propagation methods (Linear Covariance Propagation, Unscented Transformation, and Polynomial Chaos Expansion) is compared in various scenarios of close-proximity operations around the two asteroids. Findings are discussed in terms of propagation accuracy and computational efficiency depending on the dynamical environment. By exploring these methodologies, this work contributes to the broader goal of ensuring the safety and effectiveness of spacecraft operations during close-proximity exploration of small celestial bodies.
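To make the difference between two of the named methods concrete, here is a toy sketch that propagates a scalar Gaussian through the nonlinear map f(x) = x^2 using first-order linear covariance propagation and a three-point unscented transformation. The map, the numbers, and the function names are illustrative assumptions; nothing here models the asteroid dynamics or reproduces the paper's setup.

```python
import math

def linear_propagation(f, dfdx, mean, var):
    """First-order (linearized) propagation of a scalar Gaussian through f."""
    return f(mean), dfdx(mean) ** 2 * var

def unscented_transform(f, mean, var, kappa=2.0):
    """Scalar unscented transformation using three sigma points."""
    n = 1  # state dimension
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    ys = [f(p) for p in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

def square(x):
    return x * x

def square_derivative(x):
    return 2.0 * x

m, s2 = 1.0, 0.25  # input mean and variance (hypothetical values)
lin_mean, lin_var = linear_propagation(square, square_derivative, m, s2)
ut_mean, ut_var = unscented_transform(square, m, s2)

# Closed-form moments of x^2 for Gaussian x, used as the reference.
exact_mean = m * m + s2
exact_var = 4.0 * m * m * s2 + 2.0 * s2 * s2
```

For this quadratic map the linearized mean misses the variance-induced shift (it returns m^2 rather than m^2 + s2), while the sigma-point estimate recovers the closed-form mean and variance. In general the unscented transformation is only approximate, so this exactness is specific to the toy example.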
Time-limited H2-optimal Model Order Reduction of Linear Systems with Quadratic Outputs
An important class of dynamical systems with several practical applications is linear systems with quadratic outputs. These models have the same state equation as standard linear time-invariant systems but differ in their output equations, which are nonlinear quadratic functions of the system states. When dealing with models of exceptionally high order, the computational demands for simulation and analysis can become overwhelming. In such cases, model order reduction proves to be a useful technique, as it allows for constructing a reduced-order model that accurately represents the essential characteristics of the original high-order system while significantly simplifying its complexity. In time-limited model order reduction, the main goal is to maintain the output response of the original system within a specific time range in the reduced-order model. To assess the error within this time interval, a mathematical expression for the time-limited $\mathcal{H}_2$-norm is derived in this paper. This norm acts as a measure of the accuracy of the reduced-order model within the specified time range. Subsequently, the necessary conditions for achieving a local optimum of the time-limited $\mathcal{H}_2$ norm error are derived. The inherent inability to satisfy these optimality conditions within the Petrov-Galerkin projection framework is also discussed. After that, a stationary point iteration algorithm based on the optimality conditions and Petrov-Galerkin projection is proposed. Upon convergence, this algorithm fulfills three of the four optimality conditions. To demonstrate the effectiveness of the proposed algorithm, a numerical example is provided that showcases its ability to effectively approximate the original high-order model within the desired time interval.
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
While FL is a widely popular distributed ML strategy that protects data privacy, time-varying wireless network parameters and heterogeneous system configurations of the wireless devices pose significant challenges. Although the limited radio and computational resources of the network and the clients, respectively, are widely acknowledged, two critical yet often ignored aspects are (a) wireless devices can only dedicate a small chunk of their limited storage for the FL task and (b) new training samples may arrive in an online manner in many practical wireless applications. Therefore, we propose a new FL algorithm called OSAFL, specifically designed to learn tasks relevant to wireless applications under these practical considerations. Since it has long been proven that under extreme resource constraints, clients may perform an arbitrary number of local training steps, which may lead to client drift under statistically heterogeneous data distributions, we leverage normalized gradient similarities and exploit weighting clients' updates based on optimized scores that facilitate the convergence rate of the proposed OSAFL algorithm. Our extensive simulation results on two different tasks -- each with three different datasets -- with four popular ML models validate the effectiveness of OSAFL compared to six existing state-of-the-art FL baselines.
comment: Under review for possible publication in IEEE Transactions on Wireless Communications (TWC)
Multi-Agent Deep Reinforcement Learning Framework for Wireless MAC Protocol Design and Optimization
In this letter, we propose a novel Multi-Agent Deep Reinforcement Learning (MADRL) framework for MAC protocol design. Unlike centralized approaches, which rely on a single entity for decision-making, MADRL empowers individual network nodes to autonomously learn and optimize their MAC based on local observations. Leveraging ns3-ai and RLlib, our framework is, as far as we are aware, the first of its kind to enable distributed multi-agent learning within the ns-3 environment, facilitating the design and synthesis of adaptive MAC protocols tailored to specific environmental conditions. We demonstrate the effectiveness of the MADRL MAC framework through extensive simulations, showcasing superior performance compared to legacy protocols across diverse scenarios. Our findings highlight the potential of MADRL-based MAC protocols to significantly enhance Quality of Service (QoS) for future wireless applications.
Quantifying Phase Unbalance and Coordination Impacts on Distribution Network Flexibility
The increasing integration of distributed energy resources (DER) provides distribution system operators (DSO) with new flexible resources to support more efficient operation and planning of distribution networks. To utilise these resources, various DER flexibility aggregation methods have been proposed in the literature, such as aggregated P-Q flexibility areas at the interface with other networks. However, while focusing on estimating the limits of flexibility services, existing studies make the critical assumption that all available flexible units are perfectly coordinated to jointly provide flexibility and manage network constraints. Moreover, due to the extensive use of single-phase power flow analysis, the impacts of phase unbalance on DER flexibility aggregation remain largely unexplored. To address these gaps in knowledge, this work proposes a framework for modelling flexibility services in low voltage (LV) distribution networks which enables explicitly imposing voltage unbalance and phase coordination constraints. The simulations, performed for an illustrative 5-bus system and a real 221-bus LV network in the UK, demonstrate that a significant share (over 30%) of total aggregated DER flexibility potential may be unavailable due to voltage unbalances and lack of coordination between DER connected to different phases.
Dynamic Equivalent Identification of Interconnected Areas Using Disturbance records from Synchro phasor Measurement Units
In this paper, a methodology for modelling a dynamic equivalent of an external area is presented. The equivalent consists of a generator with series impedance and a parallel load (generalized Ward equivalent), integrating control systems such as the Automatic Voltage Regulator (AVR) and the Speed Regulator (GOV) in a test system known as PST-16. The main objective is to identify the parameters of the control systems and other parameters inherent to the generator so that the response of the equivalent system is similar to the response of the complete system.
comment: in Spanish language
Passivity-Based Gain-Scheduled Control with Scheduling Matrices
This paper considers gain-scheduling of very strictly passive (VSP) subcontrollers using scheduling matrices. The use of scheduling matrices, over scalar scheduling signals, realizes greater design freedom, which in turn can improve closed-loop performance. The form and properties of the scheduling matrices such that the overall gain-scheduled controller is VSP are explicitly discussed. The proposed gain-scheduled VSP controller is used to control a rigid two-link robot subject to model uncertainty where robust input-output stability is assured via the passivity theorem. Numerical simulation results highlight the greater design freedom, resulting in improved performance, when scheduling matrices are used over scalar scheduled signals.
comment: To be published in IEEE Conference on Control Technology and Applications (CCTA) 2024
Wireless Channel Aware Data Augmentation Methods for Deep Learning-Based Indoor Localization
Indoor localization is a challenging problem that - unlike outdoor localization - lacks a universal and robust solution. Machine Learning (ML), particularly Deep Learning (DL), methods have been investigated as a promising approach. Although such methods bring remarkable localization accuracy, they heavily depend on the training data collected from the environment. The data collection is usually a laborious and time-consuming task, but Data Augmentation (DA) can be used to alleviate this issue. In this paper, different from previously used DA, we propose methods that utilize the domain knowledge about wireless propagation channels and devices. The methods exploit the typical hardware component drift in the transceivers and/or the statistical behavior of the channel, in combination with the measured Power Delay Profile (PDP). We comprehensively evaluate the proposed methods to demonstrate their effectiveness. This investigation mainly focuses on how factors such as the number of measurements, the augmentation proportion, and the environment of interest impact the effectiveness of the different DA methods. We show that in the low-data regime (few actual measurements available), localization accuracy increases up to 50%, matching non-augmented results in the high-data regime. In addition, the proposed methods may outperform the measurement-only high-data performance by up to 33% using only 1/4 of the amount of measured data. We also exhibit the effect of different training data distributions and quality on the effectiveness of DA. Finally, we demonstrate the power of the proposed methods when employed along with Transfer Learning (TL) to address the data scarcity in target and/or source environments.
comment: 13 pages, 14 figures
An Alternative to Multi-Factor Authentication with a Triple-Identity Authentication Scheme
Every user authentication scheme involves three login credentials, i.e. a username, a password and a hash value, but only one of them is associated with a user identity. However, this single identity is not robust enough to protect the whole system and the login entries (i.e., the username and password forms) have not been effectively authenticated. Therefore, a multi-factor authentication service is utilized to help guarantee the account security by transmitting an extra factor to the user to use. If more identities can be employed for the two login forms to associate with the corresponding login credentials, and if the identifiers are neither transmitted through the network nor accessible to users, such a system can be more robust even without relying on a third-party service. To achieve this, a triple-identity authentication scheme is designed within a dual-password login-authentication system, by which the identities for the username and the login password can be defined respectively. Therefore, in addition to the traditional server verification, the system can also verify the identity of a user at the username and password forms simultaneously. In the triple-identity authentication, the identifiers are entirely managed by the system without involvement of users or any third-party service, and they are concealed, incommunicable, inaccessible and independent of personal information. Thus, such truly unique identifiers are useless in online attacks.
comment: 4 pages, 2 figures, 11 conferences
Culsans: An Efficient Snoop-based Coherency Unit for the CVA6 Open Source RISC-V application processor
Symmetric Multi-Processing (SMP) based on cache coherency is crucial for high-end embedded systems like automotive applications. RISC-V is gaining traction, and open-source hardware (OSH) platforms offer solutions to issues such as IP costs and vendor dependency. Existing multi-core cache-coherent RISC-V platforms are complex and not efficient for small embedded core clusters. We propose an open-source SystemVerilog implementation of a lightweight snoop-based cache-coherent cluster of Linux-capable CVA6 cores. Our design uses the MOESI protocol via Arm's AMBA ACE protocol. Evaluated with Splash-3 benchmarks, our solution shows up to 32.87% faster performance in a dual-core setup and an average improvement of 15.8% over OpenPiton. Synthesized using GF 22nm FDSOI technology, the Cache Coherency Unit occupies only 1.6% of the system area.
comment: 4 pages, 4 figures, DSD2024 and SEAA2024 Works in Progress Session AUG 2024; Updated the acknowledgments
A New Adaptive Phase-locked Loop for Synchronization of a Grid-Connected Voltage Source Converter: Simulation and Experimental Results
In [1] a new adaptive phase-locked loop scheme for synchronization of a grid-connected voltage source converter with guaranteed (almost) global stability properties was reported. To guarantee a suitable synchronization with the angle of the three-phase grid voltage, we design an adaptive observer for such a signal requiring measurements only at the point of common coupling. In this paper we present simulation and experimental results that illustrate the excellent performance of the proposed solution.
Structures of M-Invariant Dual Subspaces with Respect to a Boolean Network
This paper presents the following research findings on Boolean networks (BNs) and their dual subspaces. First, we establish a bijection between the dual subspaces of a BN and the partitions of its state set. Furthermore, we demonstrate that a dual subspace is $M$-invariant if and only if the associated partition is equitable (i.e., for every two cells of the partition, every two states in the former have the same number of out-neighbors in the latter) for the BN's state-transition graph (STG). Here $M$ represents the structure matrix of the BN. Based on the equitable graphic representation, we provide, for the first time, a complete structural characterization of the smallest $M$-invariant dual subspaces generated by a set of Boolean functions. Given a set of output functions, we prove that a BN is observable if and only if the partition corresponding to the smallest $M$-invariant dual subspace generated by this set of functions is trivial (i.e., all partition cells are singletons). Building upon our structural characterization, we also present a method for constructing output functions that render the BN observable.
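The equitability condition quoted in this abstract can be checked mechanically on a small state-transition graph. The sketch below is only an illustration of that definition: the two-node network (each node copies the other), the helper names `build_stg`, `partition_by` and `is_equitable`, and the choice of test functions are assumptions made here, not constructions from the paper.

```python
from itertools import product

def build_stg(update, n):
    """State-transition graph of an n-node Boolean network: state -> list of successors."""
    return {s: [update(s)] for s in product((0, 1), repeat=n)}

def partition_by(func, n):
    """Partition of the state set induced by the value of a Boolean function."""
    cells = {}
    for s in product((0, 1), repeat=n):
        cells.setdefault(func(s), []).append(s)
    return list(cells.values())

def is_equitable(successors, partition):
    """True if, for every pair of cells, all states of the first cell have
    the same number of out-neighbors inside the second cell."""
    cell_of = {s: k for k, cell in enumerate(partition) for s in cell}
    for cell in partition:
        profiles = set()
        for s in cell:
            counts = [0] * len(partition)
            for t in successors[s]:
                counts[cell_of[t]] += 1
            profiles.add(tuple(counts))
        if len(profiles) > 1:
            return False
    return True

def swap(state):
    """Hypothetical BN: x1' = x2, x2' = x1."""
    return (state[1], state[0])

stg = build_stg(swap, 2)
xor_partition = partition_by(lambda s: s[0] ^ s[1], 2)  # cells {00, 11} and {01, 10}
x1_partition = partition_by(lambda s: s[0], 2)          # cells {00, 01} and {10, 11}
```

For this example, the partition induced by x1 XOR x2 is equitable because the swap preserves the XOR value, whereas the partition induced by x1 alone is not: states 00 and 01 share a cell but move to different cells.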
Data-Driven Nonlinear TDOA for Accurate Source Localization in Complex Signal Dynamics
The complex and dynamic propagation of oscillations and waves is often triggered by sources at unknown locations. Accurate source localization enables the elimination of the rotor core in atrial fibrillation (AFib) as an effective treatment for this severe cardiac disorder; it also finds potential use in locating the spreading source in natural disasters such as forest fires and tsunamis. However, existing approaches such as time of arrival (TOA) and time difference of arrival (TDOA) do not yield accurate localization results since they tacitly assume a constant signal propagation speed, whereas realistic propagation is often non-static and heterogeneous. In this paper, we develop a nonlinear TDOA (NTDOA) approach which utilizes observational data from various positions to jointly learn the propagation speed at different angles and distances as well as the location of the source itself. Through examples of simulating the complex dynamics of electrical signals along the surface of the heart and satellite imagery from forest fires and tsunamis, we show that with a small handful of measurements, NTDOA, as a data-driven approach, can successfully locate the spreading source, leading also to better forecasting of the speed and direction of subsequent propagation.
comment: 9 pages, 5 figures, Accepted to IEEE Sensors Journal
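For context, the classical constant-speed TDOA baseline that the abstract contrasts against can be sketched as a grid search over candidate source positions. This is not the NTDOA method; the sensor layout, the propagation speed, the grid resolution, and the function names are all hypothetical choices for illustration.

```python
import math

def tdoa(sensors, source, speed):
    """Arrival-time differences relative to the first sensor,
    assuming a single constant propagation speed."""
    times = [math.dist(s, source) / speed for s in sensors]
    return [t - times[0] for t in times[1:]]

def locate_by_grid_search(sensors, measured, speed, lo=0.0, hi=10.0, step=0.5):
    """Return the grid point whose predicted TDOAs best match the measured ones."""
    best, best_cost = None, float("inf")
    n = int(round((hi - lo) / step)) + 1
    for i in range(n):
        for j in range(n):
            candidate = (lo + i * step, lo + j * step)
            predicted = tdoa(sensors, candidate, speed)
            cost = sum((p - m) ** 2 for p, m in zip(predicted, measured))
            if cost < best_cost:
                best, best_cost = candidate, cost
    return best

# Hypothetical setup: four sensors at the corners of a 10 x 10 region.
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_source = (3.0, 7.0)
speed = 2.0
measured = tdoa(sensors, true_source, speed)
estimate = locate_by_grid_search(sensors, measured, speed)
```

The search recovers the source here only because the synthetic measurements were generated with the same constant speed that the estimator assumes; when the true speed varies with direction or distance, that assumption is exactly what the abstract identifies as the source of error.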
Systems and Control (EESS)
Data-Efficient Prediction of Minimum Operating Voltage via Inter- and Intra-Wafer Variation Alignment
Predicting the minimum operating voltage ($V_{min}$) of chips stands as a crucial technique in enhancing the speed and reliability of manufacturing testing flow. However, existing $V_{min}$ prediction methods often overlook various sources of variations in both training and deployment phases. Notably, the neglect of wafer zone-to-zone (intra-wafer) variations and wafer-to-wafer (inter-wafer) variations, compounded by process variations, diminishes the accuracy, data efficiency, and reliability of $V_{min}$ predictors. To address this gap, we introduce a novel data-efficient $V_{min}$ prediction flow, termed restricted bias alignment (RBA), which incorporates a novel variation alignment technique. Our approach concurrently estimates inter- and intra-wafer variations. Furthermore, we propose utilizing class probe data to model inter-wafer variations for the first time. We empirically demonstrate RBA's effectiveness and data efficiency on an industrial 16nm automotive chip dataset.
Learning in Time-Varying Monotone Network Games with Dynamic Populations
In this paper, we present a framework for multi-agent learning in a nonstationary dynamic network environment. More specifically, we examine projected gradient play in smooth monotone repeated network games in which the agents' participation and connectivity vary over time. We model this changing system with a stochastic network which takes a new independent realization at each repetition. We show that the strategy profile learned by the agents through projected gradient dynamics over the sequence of network realizations converges to a Nash equilibrium of the game in which players minimize their expected cost, almost surely and in the mean-square sense. We then show that the learned strategy profile is an almost Nash equilibrium of the game played by the agents at each stage of the repeated game with high probability. Using these two results, we derive non-asymptotic bounds on the regret incurred by the agents.
comment: 10 pages
The Distributionally Robust Infinite-Horizon LQR
We explore the infinite-horizon Distributionally Robust (DR) linear-quadratic control. While the probability distribution of disturbances is unknown and potentially correlated over time, it is confined within a Wasserstein-2 ball of a radius $r$ around a known nominal distribution. Our goal is to devise a control policy that minimizes the worst-case expected Linear-Quadratic Regulator (LQR) cost among all probability distributions of disturbances lying in the Wasserstein ambiguity set. We obtain the optimality conditions for the optimal DR controller and show that it is non-rational. Despite lacking a finite-order state-space representation, we introduce a computationally tractable fixed-point iteration algorithm. Our proposed method computes the optimal controller in the frequency domain to any desired fidelity. Moreover, for any given finite order, we use a convex numerical method to compute the best rational approximation (in $H_\infty$-norm) to the optimal non-rational DR controller. This enables efficient time-domain implementation by finite-order state-space controllers and addresses the computational hurdles associated with the finite-horizon approaches to DR-LQR problems, which typically necessitate solving a Semi-Definite Program (SDP) with a dimension scaling with the time horizon. We provide numerical simulations to showcase the effectiveness of our approach.
comment: Accepted at CDC 2024
Event-triggered moving horizon estimation for nonlinear systems
This work proposes an event-triggered moving horizon estimation (ET-MHE) scheme for general nonlinear systems. The key components of the proposed scheme are a novel event-triggering mechanism (ETM) and the suitable design of the MHE cost function. The main characteristic of our method is that the MHE's nonlinear optimization problem is only solved when the ETM triggers the transmission of measured data to the remote state estimator. If no event occurs, then the current state estimate results from an open-loop prediction using the system dynamics. Furthermore, we show robust global exponential stability of the ET-MHE under a suitable detectability condition. Finally, we illustrate the applicability of the proposed method in terms of a nonlinear benchmark example, where we achieved similar estimation performance compared to standard MHE while using 86% fewer computational resources.
Hi-SAM: A high-scalable authentication model for satellite-ground Zero-Trust system using mean field game
As more and more Internet of Things (IoT) devices are connected to satellite networks, the Zero-Trust Architecture brings dynamic security to the satellite-ground system, while frequent authentication creates challenges for system availability. To enable the system to accommodate more IoT devices, this paper proposes a high-scalable authentication model (Hi-SAM). Hi-SAM introduces the Proof-of-Work idea to authentication, which allows devices to obtain network resources based on frequency. To optimize the frequency, a mean field game is used to model competition among devices, which can reduce the decision space of large-scale population games. In addition, a dynamic time-range message authentication code is designed for security. In tests at large population scales, Hi-SAM is superior in the optimization of authentication workload and in anomaly detection efficiency.
Modelling of measuring systems -- From white box models to cognitive approaches
Mathematical models of measuring systems and processes play an essential role in metrology and practical measurements. They form the basis for understanding and evaluating measurements, their results and their trustworthiness. Classic analytical parametric modelling is based on largely complete knowledge of measurement technology and the measurement process. But due to the digital transformation towards the Industrial Internet of Things (IIoT), with an increasing number of intensively and flexibly networked measurement systems and consequently ever larger amounts of data to be processed, data-based modelling approaches have gained enormous importance. This has led to new approaches in measurement technology and industry like Digital Twins, Self-X Approaches, Soft Sensor Technology and Data and Information Fusion. In the future, data-based modelling will be increasingly dominated by intelligent, cognitive systems. Evaluation of the accuracy, trustworthiness and functional uncertainty of the corresponding models is required. This paper provides a concise overview of modelling in metrology from classical white box models to intelligent, cognitive data-driven solutions, identifying advantages and limitations. Additionally, approaches to merge trustworthiness and metrological uncertainty will be discussed.
comment: 8 pages, 13 figures, 1 table, XXIV IMEKO World Congress - Think Metrology, August 26 - 29, 2024, Hamburg, Germany
DC-DC Converters Optimization in Case of Large Variation in the Load
A method for controlling a DC-DC converter is proposed to ensure high-quality control under large fluctuations in load currents by using differential gain control coefficients and second-derivative control. Various implementations of balancing the currents of a multiphase DC-DC converter are discussed, with a focus on achieving accurate current regulation without introducing additional delay in the control system. A stochastic particle swarm optimization method is used to find optimal values of the PID controller parameters. Automatic constraint handling in optimization is also discussed as a relevant technique in the field.
Peaking into the Black-box: Prediction Intervals Give Insight into Data-driven Quadrotor Model Reliability
Ensuring the reliability and validity of data-driven quadrotor model predictions is essential for their accepted and practical use. This is especially true for grey- and black-box models wherein the mapping of inputs to predictions is not transparent and subsequent reliability notoriously difficult to ascertain. Nonetheless, such techniques are frequently and successfully used to identify quadrotor models. Prediction intervals (PIs) may be employed to provide insight into the consistency and accuracy of model predictions. This paper estimates such PIs for polynomial and Artificial Neural Network (ANN) quadrotor aerodynamic models. Two existing ANN PI estimation techniques - the bootstrap method and the quality driven method - are validated numerically for quadrotor aerodynamic models using an existing high-fidelity quadrotor simulation. Quadrotor aerodynamic models are then identified on real quadrotor flight data to demonstrate their utility and explore their sensitivity to model interpolation and extrapolation. It is found that the ANN-based PIs widen considerably when extrapolating and remain constant, or shrink, when interpolating. While this behaviour also occurs for the polynomial PIs, it is of lower magnitude. The estimated PIs establish probabilistic bounds within which the quadrotor model outputs will likely lie, subject to modelling and measurement uncertainties that are reflected through the PI widths.
comment: Presented at AIAA SciTech Forum 2023 in National Harbor, MD, USA
A novel metric for detecting quadrotor loss-of-control ICRA
Unmanned aerial vehicles (UAVs) are becoming an integral part of both industry and society. In particular, the quadrotor is now invaluable across a plethora of fields, and recent developments, such as the inclusion of aerial manipulators, only extend its versatility. As UAVs become more widespread, preventing loss-of-control (LOC) is an ever-growing concern. Unfortunately, LOC is not clearly defined for quadrotors, or indeed, many other autonomous systems. Moreover, any existing definitions are often incomplete and restrictive. A novel metric, based on actuator capabilities, is introduced to detect LOC in quadrotors. The potential of this metric for LOC detection is demonstrated through both simulated and real quadrotor flight data. It is able to detect LOC induced by actuator faults without explicit knowledge of the occurrence and nature of the failure. The proposed metric is also sensitive enough to detect LOC in more nuanced cases, where the quadrotor remains undamaged but nevertheless loses control through an aggressive yawing manoeuvre. As the metric depends only on system and actuator models, it is sufficiently general to be applied to other systems.
comment: Presented at the International Conference on Robotics and Automation (ICRA) 2024 in Yokohama, Japan
Harmonic Stability Analysis of Microgrids with Converter-Interfaced Distributed Energy Resources, Part II: Case Studies
In Part I of this paper, a method for the Harmonic Stability Assessment (HSA) of power systems with a high share of Converter-Interfaced Distributed Energy Resources (CIDERs) was proposed. Specifically, the Harmonic State-Space (HSS) model of a generic power system is derived by combining the components' HSS models. The HSS models of CIDERs and grid are based on Linear Time-Periodic (LTP) models, capable of representing the coupling between different harmonics. In Part II, the HSA of one grid-forming and two grid-following CIDERs (i.e., excluding and including the DC-side modelling) is performed. More precisely, the classification of the eigenvalues, the impact of the maximum harmonic order on the locations of the eigenvalues, and the sensitivity curves of the eigenvalues w.r.t. control parameters are provided. These analyses make it possible to study the physical meaning and origin of the CIDERs' eigenvalues. Additionally, the HSA is performed for a representative example system derived from the CIGRE low-voltage benchmark system. A case of harmonic instability is identified through the system eigenvalues, and validated with Time-Domain Simulations (TDS) in Simulink. It is demonstrated that, as opposed to stability analyses based on Linear Time-Invariant (LTI) models, the HSA is suitable for the detection of harmonic instability.
Harmonic Stability Analysis of Microgrids with Converter-Interfaced Distributed Energy Resources, Part I: Modelling and Theoretical Foundations
This paper proposes a method for the Harmonic Stability Assessment (HSA) of power systems with a high share of Converter-Interfaced Distributed Energy Resources (CIDERs). To this end, the Harmonic State-Space (HSS) model of a generic power system is formulated by combining the HSS models of the resources and the grid in closed-loop configuration. The HSS model of the resources is obtained from the Linear Time-Periodic (LTP) models of the CIDER components transformed to the frequency domain using Fourier theory and Toeplitz matrices. Notably, the HSS model of a CIDER is capable of representing the coupling between harmonic frequencies in detail. The HSS model of the grid is derived from the dynamic equations of the individual branch and shunt elements. The system matrix of the HSS models at power-system or resource level is employed for eigenvalue analysis in the context of HSA. A sensitivity analysis of the eigenvalue loci w.r.t. changes in model parameters, and a classification of eigenvalues into control-design variant, control-design invariant, and design invariant eigenvalues are proposed. A case of harmonic instability is identified by the HSA and validated via Time-Domain Simulations (TDS) in Simulink.
Comparison of uncertainty propagation techniques in small-body environment
Close-proximity exploration of small celestial bodies is crucial for the comprehensive and accurate characterization of their properties. However, the complex and uncertain dynamical environment around them contributes to a rapid dispersion of uncertainty and the emergence of non-Gaussian distributions. Therefore, to ensure safe operations, a precise understanding of uncertainty propagation becomes imperative. In this work, the dynamical environment is analyzed around two asteroids: Apophis, which will perform a close flyby of Earth in 2029, and Eros, which has already been explored by past missions. The performance of different uncertainty propagation methods (Linear Covariance Propagation, Unscented Transformation, and Polynomial Chaos Expansion) is compared in various scenarios of close-proximity operations around the two asteroids. Findings are discussed in terms of propagation accuracy and computational efficiency depending on the dynamical environment. By exploring these methodologies, this work contributes to the broader goal of ensuring the safety and effectiveness of spacecraft operations during close-proximity exploration of small celestial bodies.
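To illustrate why the choice of propagation method matters under nonlinear dynamics, the following sketch compares linear covariance propagation with the unscented transformation on a scalar toy map. The map `g(x) = x^2`, the input mean and variance, and the scaling parameter `kappa` are assumptions chosen for illustration; none of them comes from the paper, and the toy map is not a model of small-body dynamics.

```python
import math

def linear_propagation(g, dg, mean, var):
    """First-order propagation: mean through g, variance via the derivative."""
    return g(mean), dg(mean) ** 2 * var

def unscented_propagation(g, mean, var, kappa=2.0):
    """Scalar unscented transformation with three sigma points."""
    n = 1  # state dimension
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    ys = [g(x) for x in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

def g(x):
    return x * x        # hypothetical nonlinear map

def dg(x):
    return 2.0 * x      # its derivative

lin_mean, lin_var = linear_propagation(g, dg, mean=1.0, var=0.04)
ut_mean, ut_var = unscented_propagation(g, mean=1.0, var=0.04)
```

For a Gaussian input with mean 1 and variance 0.04, the exact output mean and variance of this quadratic map are 1.04 and 0.1632. The linear method returns 1.0 and 0.16, while the unscented transformation with `kappa = 2` reproduces both exact values in this particular case.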
Time-limited H2-optimal Model Order Reduction of Linear Systems with Quadratic Outputs
An important class of dynamical systems with several practical applications is linear systems with quadratic outputs. These models have the same state equation as standard linear time-invariant systems but differ in their output equations, which are nonlinear quadratic functions of the system states. When dealing with models of exceptionally high order, the computational demands for simulation and analysis can become overwhelming. In such cases, model order reduction proves to be a useful technique, as it allows for constructing a reduced-order model that accurately represents the essential characteristics of the original high-order system while significantly simplifying its complexity. In time-limited model order reduction, the main goal is to maintain the output response of the original system within a specific time range in the reduced-order model. To assess the error within this time interval, a mathematical expression for the time-limited $\mathcal{H}_2$-norm is derived in this paper. This norm acts as a measure of the accuracy of the reduced-order model within the specified time range. Subsequently, the necessary conditions for achieving a local optimum of the time-limited $\mathcal{H}_2$ norm error are derived. The inherent inability to satisfy these optimality conditions within the Petrov-Galerkin projection framework is also discussed. After that, a stationary point iteration algorithm based on the optimality conditions and Petrov-Galerkin projection is proposed. Upon convergence, this algorithm fulfills three of the four optimality conditions. To demonstrate the effectiveness of the proposed algorithm, a numerical example is provided that showcases its ability to effectively approximate the original high-order model within the desired time interval.
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
While FL is a widely popular distributed ML strategy that protects data privacy, time-varying wireless network parameters and heterogeneous system configurations of wireless devices pose significant challenges. Although the limited radio and computational resources of the network and the clients, respectively, are widely acknowledged, two critical yet often ignored aspects are (a) wireless devices can only dedicate a small chunk of their limited storage for the FL task and (b) new training samples may arrive in an online manner in many practical wireless applications. Therefore, we propose a new FL algorithm called OSAFL, specifically designed to learn tasks relevant to wireless applications under these practical considerations. Since it has long been proven that under extreme resource constraints, clients may perform an arbitrary number of local training steps, which may lead to client drift under statistically heterogeneous data distributions, we leverage normalized gradient similarities and weight clients' updates based on optimized scores that improve the convergence rate of the proposed OSAFL algorithm. Our extensive simulation results on two different tasks -- each with three different datasets -- with four popular ML models validate the effectiveness of OSAFL compared to six existing state-of-the-art FL baselines.
comment: Under review for possible publication in IEEE Transactions on Wireless Communications (TWC)
Multi-Agent Deep Reinforcement Learning Framework for Wireless MAC Protocol Design and Optimization
In this letter, we propose a novel Multi-Agent Deep Reinforcement Learning (MADRL) framework for MAC protocol design. Unlike centralized approaches, which rely on a single entity for decision-making, MADRL empowers individual network nodes to autonomously learn and optimize their MAC based on local observations. Leveraging ns3-ai and RLlib, our framework is, to the best of our knowledge, the first of its kind to enable distributed multi-agent learning within the ns-3 environment, facilitating the design and synthesis of adaptive MAC protocols tailored to specific environmental conditions. We demonstrate the effectiveness of the MADRL MAC framework through extensive simulations, showcasing superior performance compared to legacy protocols across diverse scenarios. Our findings highlight the potential of MADRL-based MAC protocols to significantly enhance the Quality of Service (QoS) of future wireless applications.
Quantifying Phase Unbalance and Coordination Impacts on Distribution Network Flexibility
The increasing integration of distributed energy resources (DER) provides distribution system operators (DSO) with new flexible resources to support more efficient operation and planning of distribution networks. To utilise these resources, various DER flexibility aggregation methods have been proposed in the literature, such as aggregated P-Q flexibility areas at the interface with other networks. However, while focusing on estimating the limits of flexibility services, existing studies make the critical assumption that all available flexible units are perfectly coordinated to jointly provide flexibility and manage network constraints. Moreover, due to the extensive use of single-phase power flow analysis, the impacts of phase unbalance on DER flexibility aggregation remain largely unexplored. To address these gaps in knowledge, this work proposes a framework for modelling flexibility services in low voltage (LV) distribution networks which enables explicitly imposing voltage unbalance and phase coordination constraints. The simulations, performed for an illustrative 5-bus system and a real 221-bus LV network in the UK, demonstrate that a significant share (over 30%) of total aggregated DER flexibility potential may be unavailable due to voltage unbalances and lack of coordination between DER connected to different phases.
Dynamic Equivalent Identification of Interconnected Areas Using Disturbance records from Synchro phasor Measurement Units
In this paper, a methodology for modelling a dynamic equivalent of an external area is presented. The equivalent consists of a generator with series impedance and a parallel load (generalized Ward equivalent), integrating control systems such as the Automatic Voltage Regulator (AVR) and the Speed Regulator (GOV) in a test system known as PST-16. The main objective is to identify the parameters of the control systems and other parameters inherent to the generator so that the response of the equivalent system is similar to the response of the complete system.
comment: in Spanish language
Passivity-Based Gain-Scheduled Control with Scheduling Matrices
This paper considers gain-scheduling of very strictly passive (VSP) subcontrollers using scheduling matrices. The use of scheduling matrices, over scalar scheduling signals, realizes greater design freedom, which in turn can improve closed-loop performance. The form and properties of the scheduling matrices such that the overall gain-scheduled controller is VSP are explicitly discussed. The proposed gain-scheduled VSP controller is used to control a rigid two-link robot subject to model uncertainty where robust input-output stability is assured via the passivity theorem. Numerical simulation results highlight the greater design freedom, resulting in improved performance, when scheduling matrices are used over scalar scheduled signals.
comment: To be published in IEEE Conference on Control Technology and Applications (CCTA) 2024
Wireless Channel Aware Data Augmentation Methods for Deep Learning-Based Indoor Localization
Indoor localization is a challenging problem that - unlike outdoor localization - lacks a universal and robust solution. Machine Learning (ML), particularly Deep Learning (DL), methods have been investigated as a promising approach. Although such methods bring remarkable localization accuracy, they heavily depend on the training data collected from the environment. The data collection is usually a laborious and time-consuming task, but Data Augmentation (DA) can be used to alleviate this issue. In this paper, different from previously used DA, we propose methods that utilize the domain knowledge about wireless propagation channels and devices. The methods exploit the typical hardware component drift in the transceivers and/or the statistical behavior of the channel, in combination with the measured Power Delay Profile (PDP). We comprehensively evaluate the proposed methods to demonstrate their effectiveness. This investigation mainly focuses on how factors such as the number of measurements, the augmentation proportion, and the environment of interest impact the effectiveness of the different DA methods. We show that in the low-data regime (few actual measurements available), localization accuracy increases by up to 50%, matching non-augmented results in the high-data regime. In addition, the proposed methods may outperform the measurement-only high-data performance by up to 33% using only 1/4 of the amount of measured data. We also show the effect of different training data distributions and quality on the effectiveness of DA. Finally, we demonstrate the power of the proposed methods when employed along with Transfer Learning (TL) to address the data scarcity in target and/or source environments.
comment: 13 pages, 14 figures
An Alternative to Multi-Factor Authentication with a Triple-Identity Authentication Scheme
Every user authentication scheme involves three login credentials, i.e. a username, a password and a hash value, but only one of them is associated with a user identity. However, this single identity is not robust enough to protect the whole system and the login entries (i.e., the username and password forms) have not been effectively authenticated. Therefore, a multi-factor authentication service is utilized to help guarantee the account security by transmitting an extra factor to the user to use. If more identities can be employed for the two login forms to associate with the corresponding login credentials, and if the identifiers are neither transmitted through the network nor accessible to users, such a system can be more robust even without relying on a third-party service. To achieve this, a triple-identity authentication scheme is designed within a dual-password login-authentication system, by which the identities for the username and the login password can be defined respectively. Therefore, in addition to the traditional server verification, the system can also verify the identity of a user at the username and password forms simultaneously. In the triple-identity authentication, the identifiers are entirely managed by the system without involvement of users or any third-party service, and they are concealed, incommunicable, inaccessible and independent of personal information. Thus, such truly unique identifiers are useless in online attacks.
comment: 4 pages, 2 figures, 11 conferences
Culsans: An Efficient Snoop-based Coherency Unit for the CVA6 Open Source RISC-V application processor
Symmetric Multi-Processing (SMP) based on cache coherency is crucial for high-end embedded systems like automotive applications. RISC-V is gaining traction, and open-source hardware (OSH) platforms offer solutions to issues such as IP costs and vendor dependency. Existing multi-core cache-coherent RISC-V platforms are complex and not efficient for small embedded core clusters. We propose an open-source SystemVerilog implementation of a lightweight snoop-based cache-coherent cluster of Linux-capable CVA6 cores. Our design uses the MOESI protocol via Arm's AMBA ACE protocol. Evaluated with Splash-3 benchmarks, our solution shows up to 32.87% faster performance in a dual-core setup and an average improvement of 15.8% over OpenPiton. Synthesized using GF 22nm FDSOI technology, the Cache Coherency Unit occupies only 1.6% of the system area.
comment: 4 pages, 4 figures, DSD2024 and SEAA2024 Works in Progress Session AUG 2024; Updated the acknowledgments
A New Adaptive Phase-locked Loop for Synchronization of a Grid-Connected Voltage Source Converter: Simulation and Experimental Results
In [1], a new adaptive phase-locked loop scheme for synchronization of a grid-connected voltage source converter with guaranteed (almost) global stability properties was reported. To guarantee suitable synchronization with the angle of the three-phase grid voltage, we design an adaptive observer for this signal that requires measurements only at the point of common coupling. In this paper, we present simulation and experimental results illustrating the excellent performance of the proposed solution.
Structures of M-Invariant Dual Subspaces with Respect to a Boolean Network
This paper presents the following research findings on Boolean networks (BNs) and their dual subspaces. First, we establish a bijection between the dual subspaces of a BN and the partitions of its state set. Furthermore, we demonstrate that a dual subspace is $M$-invariant if and only if the associated partition is equitable (i.e., for every two cells of the partition, every two states in the former have the same number of out-neighbors in the latter) for the BN's state-transition graph (STG). Here $M$ represents the structure matrix of the BN. Based on the equitable graphic representation, we provide, for the first time, a complete structural characterization of the smallest $M$-invariant dual subspaces generated by a set of Boolean functions. Given a set of output functions, we prove that a BN is observable if and only if the partition corresponding to the smallest $M$-invariant dual subspace generated by this set of functions is trivial (i.e., all partition cells are singletons). Building upon our structural characterization, we also present a method for constructing output functions that render the BN observable.
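The equitable-partition condition quoted in this abstract can be checked directly on a state-transition graph. The sketch below is a minimal illustration, assuming the STG is given as a dictionary from each state to its out-neighbors; the function name `is_equitable` and the two-node example network (with update rule x1' = x2, x2' = x1) are hypothetical and not taken from the paper.

```python
def is_equitable(successors, partition):
    """Check whether `partition` is equitable for a directed graph.

    successors: dict mapping each state to an iterable of its out-neighbors.
    partition:  list of disjoint sets of states covering all states.
    Equitable means: for every ordered pair of cells (A, B), all states in A
    have the same number of out-neighbors inside B.
    """
    for cell_a in partition:
        for cell_b in partition:
            counts = {
                sum(1 for t in successors[s] if t in cell_b) for s in cell_a
            }
            if len(counts) > 1:
                return False
    return True

# STG of a hypothetical two-node BN with x1' = x2 and x2' = x1.
stg = {"00": ["00"], "01": ["10"], "10": ["01"], "11": ["11"]}

by_parity = [{"00", "11"}, {"01", "10"}]        # cells closed under the map
by_first_bit = [{"00", "01"}, {"10", "11"}]     # split by the value of x1
singletons = [{s} for s in stg]                 # trivial partition

parity_ok = is_equitable(stg, by_parity)
first_bit_ok = is_equitable(stg, by_first_bit)
trivial_ok = is_equitable(stg, singletons)
```

Here the parity partition is equitable, while the partition by the first bit is not, because `00` stays in its own cell and `01` leaves it. The partition into singletons is always equitable.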
Data-Driven Nonlinear TDOA for Accurate Source Localization in Complex Signal Dynamics
The complex and dynamic propagation of oscillations and waves is often triggered by sources at unknown locations. Accurate source localization enables the elimination of the rotor core in atrial fibrillation (AFib) as an effective treatment for this severe cardiac disorder; it also finds potential use in locating the spreading source in natural disasters such as forest fires and tsunamis. However, existing approaches such as time of arrival (TOA) and time difference of arrival (TDOA) do not yield accurate localization results since they tacitly assume a constant signal propagation speed, whereas realistic propagation is often non-static and heterogeneous. In this paper, we develop a nonlinear TDOA (NTDOA) approach which utilizes observational data from various positions to jointly learn the propagation speed at different angles and distances as well as the location of the source itself. Through examples of simulating the complex dynamics of electrical signals along the surface of the heart and satellite imagery from forest fires and tsunamis, we show that with a small handful of measurements, NTDOA, as a data-driven approach, can successfully locate the spreading source, leading also to better forecasting of the speed and direction of subsequent propagation.
comment: 9 pages, 5 figures, Accepted to IEEE Sensors Journal
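For context, the classical constant-speed TDOA baseline that this abstract contrasts against can be sketched as a small grid search. This is not the NTDOA method; the function name `locate_tdoa`, the sensor layout, the propagation speed, and the grid resolution are all assumptions made for illustration.

```python
import math

def locate_tdoa(sensors, arrival_times, speed, grid_step=0.1, extent=1.0):
    """Grid-search the source position from time differences of arrival.

    Assumes a single constant propagation speed, which is exactly the
    assumption the abstract identifies as limiting in realistic media.
    Differences are taken with respect to the first sensor, so the unknown
    emission time cancels out.
    """
    ref_time = arrival_times[0]
    best, best_cost = None, float("inf")
    steps = int(round(extent / grid_step))
    for i in range(steps + 1):
        for j in range(steps + 1):
            cand = (i * grid_step, j * grid_step)
            d_ref = math.dist(cand, sensors[0])
            cost = 0.0
            for pos, t in zip(sensors[1:], arrival_times[1:]):
                predicted = (math.dist(cand, pos) - d_ref) / speed
                cost += (predicted - (t - ref_time)) ** 2
            if cost < best_cost:
                best, best_cost = cand, cost
    return best

# Hypothetical setup: four sensors on the corners of the unit square.
sensors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
source, speed, emit_time = (0.3, 0.7), 2.0, 5.0
times = [emit_time + math.dist(source, s) / speed for s in sensors]

estimate = locate_tdoa(sensors, times, speed)
```

When the true speed is uniform and known, as in this synthetic setup, the grid search recovers the source. If the speed varied with direction or distance, the same residuals would be biased, which is the gap a learned speed model is meant to address.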