Aeronautics and Space

Who Handles Orientation? Investigating Invariance in Feature Matching
- Finding matching keypoints between images is a core problem in 3D computer vision. However, modern matchers struggle with large in-plane rotations. A straightforward mitigation is to learn rotation...
A Mamba-Based Multimodal Network for Multiscale Blast-Induced Rapid Structural Damage Assessment
- Accurate and rapid structural damage assessment (SDA) is crucial for post-disaster management, helping responders prioritise resources, plan rescues, and support recovery. Traditional field...
Towards Brain MRI Foundation Models for the Clinic: Findings from the FOMO25 Challenge
- Clinical deployment of automated brain MRI analysis faces a fundamental challenge: clinical data is heterogeneous and noisy, and high-quality labels are prohibitively costly to obtain....
The Impact of Federated Learning on Distributed Remote Sensing Archives
- Remote sensing archives are inherently distributed: Earth observation missions such as Sentinel-1, Sentinel-2, and Sentinel-3 have collectively accumulated more than 5 petabytes of imagery, stored...
ResearchCube: Multi-Dimensional Trade-off Exploration for Research Ideation
- Research ideation requires navigating trade-offs across multiple evaluative dimensions, yet most AI-assisted ideation tools leave this multi-dimensional reasoning unsupported or reduce evaluation...
EdgeCIM: A Hardware-Software Co-Design for CIM-Based Acceleration of Small Language Models
- The growing demand for deploying Small Language Models (SLMs) on edge devices, including laptops, smartphones, and embedded platforms, has exposed fundamental inefficiencies in existing accelerators....
HuiYanEarth-SAR: A Foundation Model for High-Fidelity and Low-Cost Global Remote Sensing Imagery Generation
- Synthetic Aperture Radar (SAR) imagery generation is essential for deepening the study of scattering mechanisms, establishing trustworthy electromagnetic scene models, and fundamentally alleviating...
Observe Less, Understand More: Cost-aware Cross-scale Observation for Remote Sensing Understanding
- Remote sensing understanding inherently requires multi-resolution observation, since different targets and application tasks demand different levels of spatial detail. While low-resolution (LR)...
Scene Change Detection with Vision-Language Representation Learning
- Scene change detection (SCD) is crucial for urban monitoring and navigation but remains challenging in real-world environments due to lighting variations, seasonal shifts, viewpoint differences, and...
Seg2Change: Adapting Open-Vocabulary Semantic Segmentation Model for Remote Sensing Change Detection
- Change detection is a fundamental task in remote sensing, aiming to quantify the impacts of human activities and ecological dynamics on land-cover changes. Existing change detection methods are...
Semantic-Geometric Dual Compression: Training-Free Visual Token Reduction for Ultra-High-Resolution Remote Sensing Understanding
- Multimodal Large Language Models (MLLMs) have demonstrated immense potential in Earth observation. However, the massive visual tokens generated when processing Ultra-High-Resolution (UHR) imagery...
Robust Rate-Splitting Design for Mixed Dual-Polarized Integrated Satellite-Terrestrial Networks Under Polarization Mismatch
- Dual-polarized transmission offers a promising approach to improve spectral efficiency in multiantenna networks by reusing frequency and time resources across orthogonal polarization domains....
WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark
- Existing browser agent benchmarks face a fundamental trilemma: real-website benchmarks lack reproducibility due to content drift, controlled environments sacrifice realism by omitting real-web noise,...
Fast-SegSim: Real-Time Open-Vocabulary Segmentation for Robotics in Simulation
- Open-vocabulary panoptic reconstruction is crucial for advanced robotics and simulation. However, existing 3D reconstruction methods, such as NeRF or Gaussian Splatting variants, often struggle to...
CheeseBench: Evaluating Large Language Models on Rodent Behavioral Neuroscience Paradigms
- We introduce CheeseBench, a benchmark that evaluates large language models (LLMs) on nine classical behavioral neuroscience paradigms (Morris water maze, Barnes maze, T-maze, radial arm maze, star...
BDIViz in Action: Interactive Curation and Benchmarking for Schema Matching Methods
- Schema matching remains fundamental to data integration, yet evaluating and comparing matching methods is hindered by limited benchmark diversity and lack of interactive validation frameworks....
Turning Generators into Retrievers: Unlocking MLLMs for Natural Language-Guided Geo-Localization
- Natural-language Guided Cross-view Geo-localization (NGCG) aims to retrieve geo-tagged satellite imagery using textual descriptions of ground scenes. While recent NGCG methods commonly rely on...
Energy-Efficient Federated Edge Learning For Small-Scale Datasets in Large IoT Networks
- Large-scale Internet of Things (IoT) networks enable intelligent services such as smart cities and autonomous driving, but often face resource constraints. Collecting heterogeneous sensory data,...
AWARE: Adaptive Whole-body Active Rotating Control for Enhanced LiDAR-Inertial Odometry under Human-in-the-Loop Interaction
- Human-in-the-loop (HITL) UAV operation is essential in complex and safety-critical aerial surveying environments, where human operators provide navigation intent while onboard autonomy must maintain...
GeoMeld: Toward Semantically Grounded Foundation Models for Remote Sensing
- Effective foundation modeling in remote sensing requires spatially aligned heterogeneous modalities coupled with semantically grounded supervision, yet such resources remain limited at scale. We...
NexusAI: Enabling Design Space Exploration of Ideas through Cognitive Abstraction and Functional Decomposition
- Large Language Models (LLMs) offer vast potential for creative ideation; however, their standard interaction paradigm often produces unstructured textual outputs that lead users to prematurely...