528 docs tagged with "architecture"

"Taking Stock at FAccT": Using Participatory Design to Co-Create a Vision for the Fairness, Accountability and Transparency Community

As a relatively new forum, ACM FAccT has become a key space for activists and scholars to critically examine emerging AI and ML technologies. It brings...

$τ$-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge

Conversational agents are increasingly deployed in knowledge-intensive settings, where correct behavior depends on retrieving and applying domain-specif...

3DTCR: A Physics-Based Generative Framework for Vortex-Following 3D Reconstruction to Improve Tropical Cyclone Intensity Forecasting

Tropical cyclone (TC) intensity forecasting remains challenging as current numerical and AI-based weather models fail to satisfactorily represent extrem...

A 1/R Law for Kurtosis Contrast in Balanced Mixtures

Kurtosis-based Independent Component Analysis (ICA) weakens in wide, balanced mixtures. We prove a sharp redundancy law: for a standardized projection w...

A Constrained RL Approach for Cost-Efficient Delivery of Latency-Sensitive Applications

Next-generation networks aim to provide performance guarantees to real-time interactive services that require timely and cost-efficient packet delivery....

A Dataset is Worth 1 MB

A dataset server must often distribute the same large payload to many clients, incurring massive communication costs. Since clients frequently operate o...

A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring

Large language models are beginning to show steganographic capabilities. Such capabilities could allow misaligned models to evade oversight mechanisms....

A Dirac-Frenkel-Onsager principle: Instantaneous residual minimization with gauge momentum for nonlinear parametrizations of PDE solutions

Dirac-Frenkel instantaneous residual minimization evolves nonlinear parametrizations of PDE solutions in time, but ill-conditioning can render the param...

A distributed semismooth Newton based augmented Lagrangian method for distributed optimization

This paper proposes a novel distributed semismooth Newton based augmented Lagrangian method for solving a class of optimization problems over networks,...

A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

WebGIS development requires rigor, yet agentic AI frequently fails due to five large language model (LLM) limitations: context constraints, cross-sessio...

A Minimal Agent for Automated Theorem Proving

We propose a minimal agentic baseline that enables systematic comparison across different AI-based theorem prover architectures. This design implements...

A Mixed Diet Makes DINO An Omnivorous Vision Encoder

Pre-trained vision encoders like DINOv2 have demonstrated exceptional performance on unimodal tasks. However, we observe that their feature representati...

A multimodal slice discovery framework for systematic failure detection and explanation in medical image classification

Despite advances in machine learning-based medical image classifiers, the safety and reliability of these systems remain major concerns in practical set...

A Note on How to Remove the $\ln\ln T$ Term from the Squint Bound

In Orabona and Pál [2016], we introduced the shifted KT potentials, to remove the $\ln \ln T$ factor in the parameter-free learning with expert bound. I...

A Novel Computational Framework for Causal Inference: Tree-Based Discretization with ILP-Based Matching

Causal inference is essential for data-driven decision-making, as it aims to uncover causal relationships from observational data. However, identifying...

A novel hybrid approach for positive-valued DAG learning

Causal discovery from observational data remains a fundamental challenge in machine learning and statistics, particularly when variables represent inher...

A Predictive View on Streaming Hidden Markov Models

We develop a predictive-first optimisation framework for streaming hidden Markov models. Unlike classical approaches that prioritise full posterior reco...

A Proper Scoring Rule for Virtual Staining

Generative virtual staining (VS) models for high-throughput screening (HTS) can provide an estimated posterior distribution of possible biological featu...

A Quantitative Characterization of Forgetting in Post-Training

Continual post-training of generative models is widely used, yet a principled understanding of when and why forgetting occurs remains limited. We develo...

A recipe for scalable attention-based MLIPs: unlocking long-range accuracy with all-to-all node attention

Machine-learning interatomic potentials (MLIPs) have advanced rapidly, with many top models relying on strong physics-based inductive biases. However, a...

A Reference Architecture of Reinforcement Learning Frameworks

The surge in reinforcement learning (RL) applications gave rise to diverse supporting technology, such as RL frameworks. However, the architectural patt...

A Stein Identity for q-Gaussians with Bounded Support

Stein's identity is a fundamental tool in machine learning with applications in generative models, stochastic optimization, and other problems involving...

A Systematic Security Evaluation of OpenClaw and Its Variants

Tool-augmented AI agents substantially extend the practical capabilities of large language models, but they also introduce security risks that cannot be...

A theory of learning data statistics in diffusion models, from easy to hard

While diffusion models have emerged as a powerful class of generative models, their learning dynamics remain poorly understood. We address this issue fi...

A Tight Theory of Error Feedback Algorithms in Distributed Optimization

Communication costs are a major bottleneck in distributed learning and first-order optimization. A common approach to alleviate this issue is to compres...

A Tsetlin Machine-driven Intrusion Detection System for Next-Generation IoMT Security

The rapid adoption of the Internet of Medical Things (IoMT) is transforming healthcare by enabling seamless connectivity among medical devices, systems,...

A Two-Stage, Object-Centric Deep Learning Framework for Robust Exam Cheating Detection

Academic integrity continues to face the persistent challenge of examination cheating. Traditional invigilation relies on human observation, which is in...

A two-step sequential approach for hyperparameter selection in finite context models

Finite-context models (FCMs) are widely used for compressing symbolic sequences such as DNA, where predictive performance depends critically on the cont...

A unified perspective on fine-tuning and sampling with diffusion and flow models

We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of a base de...

A Variational Estimator for $L_p$ Calibration Errors

Calibration - the problem of ensuring that predicted probabilities align with observed class frequencies - is a basic desideratum for reliable ML prediction.

Abductive Reasoning with Syllogistic Forms in Large Language Models

Research in AI using Large Language Models (LLMs) is rapidly evolving, and the comparison of their performance with human reasoning has become a key con...

Accurate and Efficient Hybrid-Ensemble Atmospheric Data Assimilation in Latent Space with Uncertainty Quantification

Data assimilation (DA) combines model forecasts and observations to estimate the optimal state of the atmosphere with its uncertainty, providing initial...

Accurate and Reliable Uncertainty Estimates for Deterministic Predictions Extensions to Under and Overpredictions

Computational models support high-stakes decisions across engineering and science, and practitioners increasingly seek probabilistic predictions to quan...

Active Bipartite Ranking with Smooth Posterior Distributions

In this article, bipartite ranking, a statistical learning problem involved in many applications and widely studied in the passive context, is approache...

AdaCubic: An Adaptive Cubic Regularization Optimizer for Deep Learning

A novel regularization technique, AdaCubic, is proposed that adapts the weight of the cubic term. The heart of AdaCubic is an auxiliary optimization pro...

Adaptive Combinatorial Experimental Design: Pareto Optimality for Decision-Making and Inference

In this paper, we provide the first investigation into adaptive combinatorial experimental design, focusing on the trade-off between regret minimization...

Adaptive Conditional Forest Sampling for Spectral Risk Optimisation under Decision-Dependent Uncertainty

Minimising a spectral risk objective, defined as a convex combination of expected cost and Conditional Value-at-Risk (CVaR), is challenging when the unc...

Adaptive Greedy Frame Selection for Long Video Understanding

Large vision-language models (VLMs) are increasingly applied to long-video question answering, yet inference is often bottlenecked by the number of inp...

Adaptive multi-fidelity optimization with fast learning rates

In multi-fidelity optimization, biased approximations of varying costs of the target function are available. This paper studies the problem of optimizin...

Adaptive Querying with AI Persona Priors

We study adaptive querying for learning user-dependent quantities of interest, such as responses to held-out items and psychometric indicators, within t...

Affine-Scaled Attention: Towards Flexible and Stable Transformer Attention

Transformer attention is typically implemented using softmax normalization, which enforces attention weights with unit sum normalization. While effectiv...

AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning

While Multi-Agent Systems (MAS) excel in complex reasoning, they suffer from the cascading impact of erroneous information generated by individual parti...

Agnostic learning in (almost) optimal time via Gaussian surface area

The complexity of learning a concept class under Gaussian marginals in the difficult agnostic model is closely related to its $L_1$-approximability by l...

AI Agents Can Already Autonomously Perform Experimental High Energy Physics

Large language model-based AI agents are now able to autonomously execute substantial portions of a high energy physics (HEP) analysis pipeline with min...

AI-Assisted Unit Test Writing and Test-Driven Code Refactoring: A Case Study

Many software systems originate as prototypes or minimum viable products (MVPs), developed with an emphasis on delivery speed and responsiveness to chan...

AIFIND: Artifact-Aware Interpreting Fine-Grained Alignment for Incremental Face Forgery Detection

As forgery types continue to emerge consistently, Incremental Face Forgery Detection (IFFD) has become a crucial paradigm. However, existing methods typ...

Amortized Optimal Transport from Sliced Potentials

We propose a novel amortized optimization method for predicting optimal transport (OT) plans across multiple pairs of measures by leveraging Kantorovich...

An adaptive wavelet-based PINN for problems with localized high-magnitude source

In recent years, physics-informed neural networks (PINNs) have gained significant attention for solving differential equations, although they suffer fro...

An Agentic Multi-Agent Architecture for Cybersecurity Risk Management

Getting a real cybersecurity risk assessment for a small organization is expensive: a NIST CSF-aligned engagement runs $15,000 on the low end, takes w...

An Efficient Unsupervised Federated Learning Approach for Anomaly Detection in Heterogeneous IoT Networks

Federated learning (FL) is an effective paradigm for distributed environments such as the Internet of Things (IoT), where data from diverse devices with...

An Empirical Study of SFT-DPO Interaction and Parameterization in Small Language Models

Direct Preference Optimization (DPO) is widely used after supervised fine-tuning (SFT) to align language models, yet empirical behavior under small back...

An Independent Safety Evaluation of Kimi K2.5

Kimi K2.5 is an open-weight LLM that rivals closed models across coding, multimodal, and agentic benchmarks, but was released without an accompanying sa...

An Open-Source, Open Data Approach to Activity Classification from Triaxial Accelerometry in an Ambulatory Setting

The accelerometer has become an almost ubiquitous device, providing enormous opportunities in healthcare monitoring beyond step counting or other averag...

ANTIC: Adaptive Neural Temporal In-situ Compressor

The persistent storage requirements for high-resolution, spatiotemporally evolving fields governed by large-scale and high-dimensional partial different...

Approximation and learning of anisotropic and mixed smooth functions by deep ReLU neural networks

This paper studies how efficiently deep ReLU neural networks can approximate and learn smooth functions. When the error is measured in $L^p([0,1]^d)$ no...

ArgLLM-App: An Interactive System for Argumentative Reasoning with Large Language Models

Argumentative LLMs (ArgLLMs) are an existing approach leveraging Large Language Models (LLMs) and computational argumentation for decision-making, with...

ARGUS: Seeing the Influence of Narrative Features on Persuasion in Argumentative Texts

Can narratives make arguments more persuasive? And to this end, which narrative features matter most? Although stories are often seen as powerful tools...

Artificial Intelligence for Detecting Fetal Orofacial Clefts and Advancing Medical Education

Orofacial clefts are among the most common congenital craniofacial abnormalities, yet accurate prenatal detection remains challenging due to the scarcit...

ASMR-Bench: Auditing for Sabotage in ML Research

As AI systems are increasingly used to conduct research autonomously, misaligned systems could introduce subtle flaws that produce misleading results wh...

Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent

The rapid advancement of large language models (LLMs) has enabled powerful authorship inference capabilities, raising growing concerns about unintended...

Assign and Add: A Mechanistic Study of Compositional Arithmetic

Large language models are able to compose skills in order to perform complex tasks, many of which might not have been seen during training. The details...

Asymptotic and Finite-Time Guarantees for Langevin-Based Temperature Annealing in InfoNCE

The InfoNCE loss in contrastive learning depends critically on a temperature parameter, yet its dynamics under fixed versus annealed schedules remain po...

AtManRL: Towards Faithful Reasoning via Differentiable Attention Saliency

Large language models (LLMs) increasingly rely on chain-of-thought (CoT) reasoning to solve complex tasks. Yet ensuring that the reasoning trace both co...

Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning

Safety is a primary challenge in real-world reinforcement learning (RL). Formulating safety requirements as state-wise constraints has become a prominen...

Automated Instruction Revision (AIR): A Structured Comparison of Task Adaptation Strategies for LLM

This paper studies Automated Instruction Revision (AIR), a rule-induction-based method for adapting large language models (LLMs) to downstream tasks usi...

Automated Prediction of Postoperative Pancreatic Fistula Using Preoperative Computed Tomography

Postoperative pancreatic fistula (POPF) is a serious complication after pancreatic resection, increasing morbidity, hospital stay, and healthcare costs....

BAGEL: Benchmarking Animal Knowledge Expertise in Language Models

Large language models have shown strong performance on broad-domain knowledge and reasoning benchmarks, but it remains unclear how well language models...

Balancing Fidelity, Utility, and Privacy in Synthetic Cardiac MRI Generation: A Comparative Study

Deep learning in cardiac MRI (CMR) is fundamentally constrained by both data scarcity and privacy regulations. This study systematically benchmarks thre...

Batch Normalization for Neural Networks on Complex Domains

Riemannian neural networks have proven effective in solving a variety of machine learning tasks. The key to their success lies in the development of pri...

Batched Kernelized Bandits: Refinements and Extensions

In this paper, we consider the problem of black-box optimization with noisy feedback revealed in batches, where the unknown function to optimize has a b...

Bayesian X-Learner: Calibrated Posterior Inference for Heterogeneous Treatment Effects under Heavy-Tailed Outcomes

Conditional Average Treatment Effect (CATE) estimation in practice demands three properties simultaneously: heterogeneous effects $τ(x)$, calibrated unc...

Behavior-dLDS: A decomposed linear dynamical systems model for neural activity partially constrained by behavior

Brain-wide recordings of large-scale networks of neurons now provide an unprecedented view into how the brain drives behavior. However, brain activity c...

BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation

Accurate evaluation is central to the large language model (LLM) ecosystem, guiding model selection and downstream adoption across diverse use cases. In...

Better Learning-Augmented Spanning Tree Algorithms via Metric Forest Completion

We present improved learning-augmented algorithms for finding an approximate minimum spanning tree (MST) for points in an arbitrary metric space. Our wo...

BEVLM: Distilling Semantic Knowledge from LLMs into Bird's-Eye View Representations

The integration of Large Language Models (LLMs) into autonomous driving has attracted growing interest for their strong reasoning and semantic understan...

Beyond Additive Decompositions: Interpretability Through Separability

Interpretable machine learning requires models that are accurate and structurally faithful to the data. Existing explainability methods rely heavily on a...

Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer

Learning-to-Defer routes each input to the expert that minimizes expected cost, but it assumes that the information available to every expert is fixed a...

Beyond Distribution Sharpening: The Importance of Task Rewards

Frontier models have demonstrated exceptional capabilities following the integration of task-reward-based reinforcement learning (RL) into their trainin...

Beyond Final Answers: CRYSTAL Benchmark for Transparent Multimodal Reasoning Evaluation

We introduce CRYSTAL (Clear Reasoning via Yielded Steps, Traceability and Logic), a diagnostic benchmark with 6,372 instan...

Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces

Modern visual world modeling systems increasingly rely on high-capacity architectures and large-scale data to produce plausible motion, yet they often f...

Beyond Mixtures and Products for Ensemble Aggregation: A Likelihood Perspective on Generalized Means

Density aggregation is a central problem in machine learning, for instance when combining predictions from a Deep Ensemble. The choice of aggregation re...

Beyond NNGP: Large Deviations and Feature Learning in Bayesian Neural Networks

We study wide Bayesian neural networks focusing on the rare but statistically dominant fluctuations that govern posterior concentration, beyond Gaussian...

Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD

It is currently difficult to distill discrete diffusion models. In contrast, continuous diffusion literature has many distillation methods th...

Beyond Surface Statistics: Robust Conformal Prediction for LLMs via Internal Representations

Large language models are increasingly deployed in settings where reliability matters, yet output-level uncertainty signals such as token probabilities,...

Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation

Large language models (LLMs) encode vast world knowledge in their parameters, yet they remain fundamentally limited by static knowledge, finite context...

Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators

Neural network accelerators have been widely applied to edge devices for complex tasks like object tracking, image recognition, etc. Previous works have...

BLISSNet: Deep Operator Learning for Fast and Accurate Flow Reconstruction from Sparse Sensor Measurements

Reconstructing fluid flows from sparse sensor measurements is a fundamental challenge in science and engineering. Widely separated measurements and comp...

Boosting deep Reinforcement Learning using pretraining with Logical Options

Deep reinforcement learning agents are often misaligned, as they over-exploit early reward signals. Recently, several symbolic approaches have addressed...

BoSS: A Best-of-Strategies Selector as an Oracle for Deep Active Learning

Active learning (AL) aims to reduce annotation costs while maximizing model performance by iteratively selecting valuable instances. While foundation mo...

Breaking the Tuning Barrier: Zero-Hyperparameters Yield Multi-Corner Analysis Via Learned Priors

Yield Multi-Corner Analysis validates circuits across 25+ Process-Voltage-Temperature corners, resulting in a combinatorial simulation cost of $O(K \times...

Budget-Sensitive Discovery Scoring: A Formally Verified Framework for Evaluating AI-Guided Scientific Selection

Scientific discovery increasingly relies on AI systems to select candidates for expensive experimental validation, yet no principled, budget-aware evalu...

Can Coding Agents Reproduce Findings in Computational Materials Science?

Large language models are increasingly deployed as autonomous coding agents and have achieved remarkably strong performance on software engineering benc...

Can LLMs Understand the Impact of Trauma? Costs and Benefits of LLMs Coding the Interviews of Firearm Violence Survivors

Firearm violence is a pressing public health issue, yet research into survivors' lived experiences remains underfunded and difficult to scale. Qualitati...

Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision

Evidence-grounded reasoning requires more than attaching retrieved text to a prediction: a model should make decisions that depend on whether the provid...

Causal Cellular Context Transfer Learning (C3TL): An Efficient Architecture for Prediction of Unseen Perturbation Effects

Predicting the effects of chemical and genetic perturbations on quantitative cell states is a central challenge in computational biology, molecular medi...

Causal Interpretation of Neural Network Computations with Contribution Decomposition

Understanding how neural networks transform inputs into outputs is crucial for interpreting and manipulating their behavior. Most existing approaches an...

Causality Elicitation from Large Language Models

Large language models (LLMs) are trained on enormous amounts of data and encode knowledge in their parameters. We propose a pipeline to elicit causal re...

Certified and accurate computation of function space norms of deep neural networks

Neural network methods for PDEs require reliable error control in function space norms. However, trained neural networks can typically only be probed at...

Chain-of-Adaptation: Surgical Vision-Language Adaptation with Reinforcement Learning

Conventional fine-tuning on domain-specific datasets can inadvertently alter a model's pretrained multimodal priors, leading to reduced generalization....

Characterising LLM-Generated Competency Questions: a Cross-Domain Empirical Study using Open and Closed Models

Competency Questions (CQs) are a cornerstone of requirement elicitation in ontology engineering. CQs represent requirements as a set of natural language...

Characterization of Gaussian Universality Breakdown in High-Dimensional Empirical Risk Minimization

We study high-dimensional convex empirical risk minimization (ERM) under general non-Gaussian data designs. By heuristically extending the Convex Gaussi...

Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models

The recent advancements in Vision Language Models (VLMs) have demonstrated progress toward true intelligence requiring robust reasoning capabilities. Be...

Chem-PerturBridge: a harmonized compendium of small molecule perturbation transcriptomic effects

Large perturbation models require training data encompassing chemical, cellular, and assay diversity. Current transcriptomic resources for small-molecul...

ChemGraph-XANES: An Agentic Framework for XANES Simulation and Analysis

Computational X-ray absorption near-edge structure (XANES) is widely used to probe local coordination environments, oxidation states, and electronic str...

Choosing the Lens: Strategic Perspective Activation in Context-Dependent Argumentation

The same arguments often need to be evaluated under different external regimes. An agent with influence over the regime has a strategic lever that stand...

Chunk-wise Attention Transducers for Fast and Accurate Streaming Speech-to-Text

We propose Chunk-wise Attention Transducer (CHAT), a novel extension to RNN-T models that processes audio in fixed-size chunks while employing cross-att...

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

LLM agents are expected to complete end-to-end units of work across software tools, business services, and local workspaces. Yet many agent benchmarks f...

Clean Architecture - Dependencies Point Inward

Implement Uncle Bob's Clean Architecture in Python with proper layering, the dependency rule, domain models, service layers, repositories, and framework boundaries.

CLoPA: Continual Low Parameter Adaptation of Interactive Segmentation for Medical Image Annotation

Interactive segmentation enables clinicians to guide annotation, but existing zero-shot models like nnInteractive fail to consistently reach expert-leve...

Clustering Astronomical Orbital Synthetic Data Using Advanced Feature Extraction and Dimensionality Reduction Techniques

The dynamics of Saturn's satellite system offer a rich framework for studying orbital stability and resonance interactions. Traditional methods for anal...

COLD-Steer: Steering Large Language Models via In-Context One-step Learning Dynamics

Activation steering methods enable inference-time control of large language model (LLM) behavior without retraining, but current approaches face a funda...

Collective Kernel EFT for Pre-activation ResNets

In finite-width deep neural networks, the empirical kernel $G$ evolves stochastically across layers. We develop a collective kernel effective field theo...

CoME: Empowering Channel-of-Mobile-Experts with Informative Hybrid-Capabilities Reasoning

Mobile Agents can autonomously execute user instructions, which requires hybrid-capabilities reasoning, including screen summary, subtask planning, acti...

Comparing Classical and Quantum Variational Classifiers on the XOR Problem

Quantum machine learning applies principles such as superposition and entanglement to data processing and optimization. Variational quantum models opera...

Competition-Aware CPC Forecasting with Near-Market Coverage

Cost-per-click (CPC) in paid search is a volatile auction outcome generated by a competitive landscape that is only partially observable from any single...

Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models

Compositional generalization, the ability to recognize familiar parts in novel contexts, is a defining property of intelligent systems. Although modern...

Computing Equilibrium beyond Unilateral Deviation

Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating...

Conditioning Protein Generation via Hopfield Pattern Multiplicity

Protein sequence generation via stochastic attention produces plausible family members from small alignments without training, but treats all stored seq...

Configuration Management - Environment-Driven Apps

Externalize and validate application configuration with python-dotenv, pydantic-settings, secrets management, multi-environment configs, and the 12-factor config principle.

Conformalized Neural Networks for Federated Uncertainty Quantification under Dual Heterogeneity

Federated learning (FL) faces challenges in uncertainty quantification (UQ). Without reliable UQ, FL systems risk deploying overconfident models at unde...

Consolidating Rewarded Perturbations for LLM Post-Training

Post-training of language models is commonly framed as a sample-score-update loop implemented by gradient descent. A recent line of work, exemplified by...

Continuous Orthogonal Mode Decomposition: Haptic Signal Prediction in Tactile Internet

The Tactile Internet demands sub-millisecond latency and ultra-high reliability, as high latency or packet loss could lead to haptic control instability...

Controllable Reasoning Models Are Private Thinkers

AI agents powered by reasoning models require access to sensitive user data. However, their reasoning traces are difficult to control, which can result...

Convergence of Two-Timescale Markovian Stochastic Approximations with Applications in Reinforcement Learning

This work studies the convergence of two-timescale stochastic approximations (SA), a class of iterative algorithms that update two sets of parameters in...

Correcting Split Selection in Online Decision Trees via Anytime-Valid Inference

Bagging-based ensembles, most notably Adaptive Random Forests, are among the strongest performers for learning from data streams. A common denominator a...

Coupled Control, Structured Memory, and Verifiable Action in Agentic AI (SCRAT -- Stochastic Control with Retrieval and Auditable Trajectories): A Comparative Perspective from Squirrel Locomotion and Scatter-Hoarding

Agentic AI is increasingly judged not by fluent output alone but by whether it can act, remember, and verify under partial observability, delay, and str...

Coverage-Aware Web Crawling for Domain-Specific Supplier Discovery via a Web--Knowledge--Web Pipeline

Identifying the full landscape of small and medium-sized enterprises (SMEs) in specialized industry sectors is critical for supply-chain resilience, yet...

Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes

Autonomous agents act through sandboxed containers and microVMs whose state spans filesystems, processes, and runtime artifacts. Checkpoint and restore...

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

GPU kernel optimization is fundamental to modern deep learning but remains a highly specialized task requiring deep hardware expertise. Despite strong p...

CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays

Chest X-ray plays a central role in thoracic diagnosis, and its interpretation inherently requires multi-step, evidence-grounded reasoning. However, lar...

DARE-bench: Evaluating Modeling and Instruction Fidelity of LLMs in Data Science

The fast-growing demands in using Large Language Models (LLMs) to tackle complex multi-step data science tasks create an emergent need for accurate benc...

Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving

Large Language Model (LLM) adapters enable low-cost model specialization, but introduce complex caching and scheduling challenges in distributed serving...

Data Lake vs Warehouse vs Lakehouse for AI Workloads

What each storage architecture does for AI systems, when ML teams need both raw unstructured data and structured query access on the same platform, and how to choose and implement the right architecture in production AI data pipelines.

daVinci-Env: Open SWE Environment Synthesis at Scale

Training capable software engineering (SWE) agents demands large-scale, executable, and verifiable environments that provide dynamic feedback loops for...

Decentralized Proximal Stochastic Gradient Langevin Dynamics

We propose Decentralized Proximal Stochastic Gradient Langevin Dynamics (DE-PSGLD), a decentralized Markov chain Monte Carlo (MCMC) algorithm for sampli...

Decentralized Ranking Aggregation: Gossip Algorithms for Borda and Copeland Consensus

The concept of ranking aggregation plays a central role in preference analysis, and numerous algorithms for calculating median rankings, often originati...

Decoupled Descent: Exact Test Error Tracking Via Approximate Message Passing

In modern parametric model training, full-batch gradient descent (and its variants) suffers due to progressively stronger biasing towards the exact real...

Deep Autocorrelation Modeling for Time-Series Forecasting: Progress and Prospects

Autocorrelation is a defining characteristic of time-series data, where each observation is statistically dependent on its predecessors. In the context...

Deep ensemble graph neural networks for probabilistic cosmic-ray direction and energy reconstruction in autonomous radio arrays

Using advanced machine learning techniques, we developed a method for reconstructing precisely the arrival direction and energy of ultra-high-energy cos...

DEFault++: Automated Fault Detection, Categorization, and Diagnosis for Transformer Architectures

Transformer models are widely deployed in critical AI applications, yet faults in their attention mechanisms, projections, and other internal components...

Defending Quantum Classifiers against Adversarial Perturbations through Quantum Autoencoders

Machine learning models can learn from data samples to carry out various tasks efficiently. When data samples are adversarially manipulated, such as by...

Dependency Injection - Decoupling Components

Master dependency injection in Python from manual constructor injection to DI containers and FastAPI Depends, with testing strategies and architectural trade-offs.

Design Experiments to Compare Multi-armed Bandit Algorithms

Online platforms routinely compare multi-armed bandit algorithms, such as UCB and Thompson Sampling, to select the best-performing policy. Unlike standa...

Design-OS: A Specification-Driven Framework for Engineering System Design with a Control-Systems Design Case

Engineering system design -- whether mechatronic, control, or embedded -- often proceeds in an ad hoc manner, with requirements left implicit and tracea...

Detecting and Suppressing Reward Hacking with Gradient Fingerprints

Reinforcement learning with verifiable rewards (RLVR) typically optimizes for outcome rewards without imposing constraints on intermediate reasoning. Th...

Developing and evaluating a chatbot to support maternal health care

The ability to provide trustworthy maternal health information using phone-based chatbots can have a significant impact, particularly in low-resource se...

Developing the PsyCogMetrics AI Lab to Evaluate Large Language Models and Advance Cognitive Science -- A Three-Cycle Action Design Science Study

This study presents the development of the PsyCogMetrics AI Lab (psycogmetrics.ai), an integrated, cloud-based platform that operationalizes psychometri...

Differentiable Zero-One Loss via Hypersimplex Projections

Recent advances in machine learning have emphasized the integration of structured optimization components into end-to-end differentiable models, enablin...

Directed Social Regard: Surfacing Targeted Advocacy, Opposition, Aid, Harms, and Victimization in Online Media

The language in online platforms, influence operations, and political rhetoric frequently directs a mix of pro-social sentiment (e.g., advocacy, helpful...

Discovering Thermodynamically Admissible Dissipation Potentials via Grammar-Based Symbolic Regression

Constitutive laws for inelastic materials must satisfy strict thermodynamic admissibility requirements, yet current data-driven approaches sacrifice int...

Dissecting Quantization Error: A Concentration-Alignment Perspective

Quantization can drastically increase the efficiency of large language and vision models, but typically incurs an accuracy drop. Recently, function-pres...

Do Foundation Models Know Geometry? Probing Frozen Features for Continuous Physical Measurement

Vision-language models encode continuous geometry that their text pathway fails to express: a 6,000-parameter linear probe extracts hand joint angles at...

Do LLMs Benefit From Their Own Words?

Multi-turn interactions with large language models typically retain the assistant's own past responses in the conversation history. In this work, we rev...

Do Sparse Autoencoders Capture Concept Manifolds?

Sparse autoencoders (SAEs) are widely used to extract interpretable features from neural network representations, often under the implicit assumption th...

Domain-Adapted Retrieval for In-Context Annotation of Pedagogical Dialogue Acts

Automated annotation of pedagogical dialogue is a high-stakes task where LLMs often fail without sufficient domain grounding. We present a domain-adapte...

DSBD: Dual-Aligned Structural Basis Distillation for Graph Domain Adaptation

Graph domain adaptation (GDA) aims to transfer knowledge from a labeled source graph to an unlabeled target graph under distribution shifts. However, ex...

Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal Attacks

Multimodal web agents that process both screenshots and accessibility trees are increasingly deployed to interact with web interfaces, yet their dual-st...

E3-TIR: Enhanced Experience Exploitation for Tool-Integrated Reasoning

While Large Language Models (LLMs) have demonstrated significant potential in Tool-Integrated Reasoning (TIR), existing training paradigms face signific...

EASE: Federated Multimodal Unlearning via Entanglement-Aware Anchor Closure

Federated Multimodal Learning (FML) trains multimodal models across decentralized clients while keeping their image-text pairs private. However, joint e...

EB-RANSAC: Random Sample Consensus based on Energy-Based Model

Random sample consensus (RANSAC), which is based on a repetitive sampling from a given dataset, is one of the most popular robust estimation methods. In...

ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion

Chest X-ray report generation (CXR-RG) has the potential to substantially alleviate radiologists' workload. However, conventional autoregressive vision-...

Effective Biological Representation Learning by Masking Gene Expression

RNA sequencing produces rich and diverse datasets of gene expression, offering compelling insights into cellular state and function that have many appli...

Efficient Discovery of Approximate Causal Abstractions via Neural Mechanism Sparsification

Neural networks are hypothesized to implement interpretable causal mechanisms, yet verifying this requires finding a causal abstraction -- a simpler, hi...

Efficient Multivector Retrieval with Token-Aware Clustering and Hierarchical Indexing

Multivector retrieval models achieve state-of-the-art effectiveness through fine-grained token-level representations, but their deployment incurs substa...

Efficient Refusal Ablation in LLM through Optimal Transport

Safety-aligned language models refuse harmful requests through learned refusal behaviors encoded in their internal representations. Recent activation-ba...

Empowering Heterogeneous Graph Foundation Models via Decoupled Relation Alignment

While Graph Foundation Models (GFMs) have achieved remarkable success in homogeneous graphs, extending them to multi-domain heterogeneous graphs (MDHGs)...

Enhancing AI and Dynamical Subseasonal Forecasts with Probabilistic Bias Correction

Decision-makers rely on weather forecasts to plant crops, manage wildfires, allocate water and energy, and prepare for weather extremes. Today, such for...

Enhancing Authorship Attribution with Synthetic Paintings

Attributing authorship to paintings is a historically complex task, and one of its main challenges is the limited availability of real artworks for trai...

Enhancing Hyperspace Analogue to Language (HAL) Representations via Attention-Based Pooling for Text Classification

The Hyperspace Analogue to Language (HAL) model relies on global word co-occurrence matrices to construct distributional semantic representations. While...

Enhancing Robustness of Federated Learning via Server Learning

This paper explores the use of server learning for enhancing the robustness of federated learning against malicious attacks even when clients' training...

Entropic Projection Alignment: Estimating, Explaining, and Improving Model Performance Under Distribution Shift

We propose a unified framework for addressing three key challenges of distribution shift: (1) estimating a model's performance on an unlabeled target do...

Envisioning the Future, One Step at a Time

Accurately anticipating how complex, diverse scenes will evolve requires models that represent uncertainty, simulate along extended interaction chains,...

ESG-Bench: Benchmarking Long-Context ESG Reports for Hallucination Mitigation

As corporate responsibility increasingly incorporates environmental, social, and governance (ESG) criteria, ESG reporting is becoming a legal requiremen...

Evaluating Stochasticity in Deep Research Agents

Deep Research Agents (DRAs) are promising agentic systems that gather and synthesize information to support research across domains such as financial de...

Evaluating the Progression of Large Language Model Capabilities for Small-Molecule Drug Design

Large Language Models (LLMs) have the potential to accelerate small molecule drug design due to their ability to reason about information from diverse s...

Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction

Leader-follower interaction is an important paradigm in human-robot interaction (HRI). Yet, assigning roles in real time remains challenging for resourc...

Event-Driven Temporal Graph Networks for Asynchronous Multi-Agent Cyber Defense in NetForge_RL

The transition of Multi-Agent Reinforcement Learning (MARL) policies from simulated cyber wargames to operational Security Operations Centers (SOCs) is...

Evolving Jailbreaks: Automated Multi-Objective Long-Tail Attacks on Large Language Models

Large Language Models (LLMs) have been widely deployed, especially through free Web-based applications that expose them to diverse user-generated inputs...

Explainable cluster analysis: a bagging approach

A major limitation of clustering approaches is their lack of explainability: methods rarely provide insight into which features drive the grouping of si...

Explainable Load Forecasting with Covariate-Informed Time Series Foundation Models

Time Series Foundation Models (TSFMs) have recently emerged as general-purpose forecasting models and show considerable potential for applications in en...

Exploiting Subgradient Sparsity in Max-Plus Neural Networks

Deep Neural Networks are powerful tools for solving machine learning problems, but their training often involves dense and costly parameter updates. In...

Exploration Hacking: Can LLMs Learn to Resist RL Training?

Reinforcement learning (RL) has become essential to the post-training of large language models (LLMs) for reasoning, agentic capabilities and alignment....

Fairness under Graph Uncertainty: Achieving Interventional Fairness with Partially Known Causal Graphs over Clusters of Variables

Algorithmic decisions about individuals require predictions that are not only accurate but also fair with respect to sensitive attributes such as gender...

FaultXformer: A Transformer-Encoder Based Fault Classification and Location Identification model in PMU-Integrated Active Electrical Distribution System

Accurate fault detection and localization in electrical distribution systems is crucial, especially with the increasing integration of distributed energ...

Feature-Optimized Vision for Adaptive 3D Scene Reconstruction

Three-dimensional scene reconstruction depends on local image evidence that is both visually discriminative and geometrically useful. Fixed feature thre...

Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models

Transformer-based large language models exhibit in-context learning, enabling adaptation to downstream tasks via few-shot prompting with demonstrations....

Finite Difference Flow Optimization for RL Post-Training of Text-to-Image Models

Reinforcement learning (RL) has become a standard technique for post-training diffusion-based image synthesis models, as it enables learning from reward...

Fixed-Budget Constrained Best Arm Identification in Grouped Bandits

We study fixed-budget constrained best-arm identification in grouped bandits, where each arm consists of multiple independent attributes with stochastic...

FL-MHSM: Spatially-adaptive Fusion and Ensemble Learning for Flood-Landslide Multi-Hazard Susceptibility Mapping at Regional Scale

Existing multi-hazard susceptibility mapping (MHSM) studies often rely on spatially uniform models, treat hazards independently, and provide limited rep...

FlashOptim: Optimizers for Memory Efficient Training

Standard mixed-precision training of neural networks requires many bytes of accelerator memory for each model parameter. These bytes reflect not just th...

FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems

We present FlexiTac, a low-cost, open-source, and scalable piezoresistive tactile sensing solution designed for robotic end-effectors. FlexiTac is a pra...

Flow Matching is Adaptive to Manifold Structures

Flow matching has emerged as a simulation-free alternative to diffusion-based generative modeling, producing samples by solving an ODE whose time-depend...

Fly360: Omnidirectional Obstacle Avoidance within Drone View

Obstacle avoidance in unmanned aerial vehicles (UAVs), as a fundamental capability, has gained increasing attention with the growing focus on spatial in...

Fractals made Practical: Denoising Diffusion as Partitioned Iterated Function Systems

What is a diffusion model actually doing when it turns noise into a photograph? We show that the deterministic DDIM reverse chain operates as a Partitio...

Fraud Type Decomposition and the Observation-Mechanism Taxonomy: Class-Specific Detection Limits in Payment Networks

Fraud detection in payment networks relies on labels generated through heterogeneous and imperfect observation processes, yet existing approaches treat...

From Benchmarking to Reasoning: A Dual-Aspect, Large-Scale Evaluation of LLMs on Vietnamese Legal Text

The complexity of Vietnam's legal texts presents a significant barrier to public access to justice. While Large Language Models offer a promising soluti...

From Experiments to Expertise: Scientific Knowledge Consolidation for AI-Driven Computational Research

While large language models (LLMs) have transformed AI agents into proficient executors of computational materials science, performing a hundred simulat...

From Masks to Pixels and Meaning: A New Taxonomy, Benchmark, and Metrics for VLM Image Tampering

Existing tampering detection benchmarks largely rely on object masks, which severely misalign with the true edit signal: many pixels inside a mask are u...

From Shallow Bayesian Neural Networks to Gaussian Processes: General Convergence, Identifiability and Scalable Inference

In this work, we study scaling limits of shallow Bayesian neural networks (BNNs) via their connection to Gaussian processes (GPs), with an emphasis on s...

Functional Attention: From Pairwise Affinities to Functional Correspondences

Learning mappings between infinite-dimensional function spaces, or operator learning, is essential for many machine learning applications. Although tran...

General Bayesian Policy Learning

This study proposes the General Bayes framework for policy learning. We consider decision problems in which a decision-maker chooses an action from an a...

Generalization and Scaling Laws for Mixture-of-Experts Transformers

We develop a theory of generalization and scaling for Mixture-of-Experts (MoE) Transformers that cleanly separates active per-input capacity from...

Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data

Despite the remarkable empirical success of score-based diffusion models, their statistical guarantees remain underdeveloped. Existing analyses often pr...

Generalized Rapid Action Value Estimation in Memory-Constrained Environments

Generalized Rapid Action Value Estimation (GRAVE) has been shown to be a strong variant within the Monte-Carlo Tree Search (MCTS) family of algorithms f...

Generating DDPM-based Samples from Tilted Distributions

Given $n$ independent samples from a $d$-dimensional probability distribution, our aim is to generate diffusion-based samples from a distribution obtain...

Generating Statistical Charts with Validation-Driven LLM Workflows

Generating diverse, readable statistical charts from tabular data remains challenging for LLMs, as many failures become apparent after rendering and are...

GeoChemAD: Benchmarking Unsupervised Geochemical Anomaly Detection for Mineral Exploration

Geochemical anomaly detection plays a critical role in mineral exploration as deviations from regional geochemical baselines may indicate mineralization...

GeoContra: From Fluent GIS Code to Verifiable Spatial Analysis with Geography-Grounded Repair

Reliable spatial analysis in GIScience requires preserving coordinate semantics, topology, units, and geographic plausibility. Current LLM-based GIS sys...

Geometric regularization of autoencoders via observed stochastic dynamics

Stochastic dynamical systems with slow or metastable behavior evolve, on long time scales, on an unknown low-dimensional manifold in high-dimensional am...

Geometry-Guided Camera Motion Understanding in VideoLLMs

Camera motion is a fundamental geometric signal that shapes visual perception and cinematic style, yet current video-capable vision-language models (Vid...

Giving Sensors a Voice: Multimodal JEPA for Semantic Time-Series Embeddings

Transformer-based architectures have advanced sequence modeling in language and vision, yet general-purpose representation learning for heterogeneous mu...

Global Interpretability via Automated Preprocessing: A Framework Inspired by Psychiatric Questionnaires

Psychiatric questionnaires are highly context sensitive and often only weakly predict subsequent symptom severity, which makes the prognostic relationsh...

Global Optimality for Constrained Exploration via Penalty Regularization

Efficient exploration is a central problem in reinforcement learning and is often formalized as maximizing the entropy of the state-action occupancy mea...

GO-GenZip: Goal-Oriented Generative Sampling and Hybrid Compression

Current network data telemetry pipelines consist of massive streams of fine-grained Key Performance Indicators (KPIs) from multiple distributed sources...

Gradient Boosting within a Single Attention Layer

Transformer attention computes a single softmax-weighted average over values -- a one-pass estimate that cannot correct its own errors. We introduce \em...

Gradient Flow Polarizes Softmax Outputs towards Low-Entropy Solutions

Understanding the intricate non-convex training dynamics of softmax-based models is crucial for explaining the empirical success of transformers. In thi...

Gradient Regularized Newton Boosting Trees with Global Convergence

Gradient Boosting Decision Trees (GBDTs) dominate tabular machine learning, with modern implementations like XGBoost, LightGBM, and CatBoost being based...

Graph-Informed Adversarial Modeling: Infimal Subadditivity of Interpolative Divergences

We study adversarial learning when the target distribution factorizes according to a known Bayesian network. For interpolative divergences, including $(...

Heavy-Tailed and Long-Range Dependent Noise in Stochastic Approximation: A Finite-Time Analysis

Stochastic approximation (SA) is a fundamental iterative framework with broad applications in reinforcement learning and optimization. Classical analyse...

Hexagonal Architecture (Ports and Adapters)

Implement Hexagonal Architecture in Python using Protocol-based ports, swappable adapters, and clear boundaries between application logic and external systems.

Hierarchical Industrial Demand Forecasting with Temporal and Uncertainty Explanations

Hierarchical time-series forecasting is essential for demand prediction across various industries. While machine learning models have obtained significa...

Hierarchical Inference and Closure Learning via Adaptive Surrogates for ODEs and PDEs

Inverse problems are the task of calibrating models to match data. They play a pivotal role in diverse engineering applications by allowing practitioner...

Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis

The Hierarchical Kernel Transformer (HKT) is a multi-scale attention mechanism that processes sequences at L resolution levels via trainable causal down...

Hierarchical Planning with Latent World Models

Model predictive control (MPC) with learned world models has emerged as a promising paradigm for embodied control, particularly for its ability to gener...

Histopathology Image Normalization via Latent Manifold Compaction

Batch effects arising from technical variations in histopathology staining protocols, scanners, and acquisition pipelines pose a persistent challenge fo...

HyCOP: Hybrid Composition Operators for Interpretable Learning of PDEs

We introduce HyCOP, a modular framework that learns parametric PDE solution operators by composing simple modules (advection, diffusion, learned closure...

Hyper Input Convex Neural Networks for Shape Constrained Learning and Optimal Transport

We introduce Hyper Input Convex Neural Networks (HyCNNs), a novel neural network architecture designed for learning convex functions. HyCNNs combine the...

HyperFitS -- Hypernetwork Fitting Spectra for metabolic quantification of ${}^1$H MR spectroscopic imaging

Purpose: Proton magnetic resonance spectroscopic imaging ($^1$H MRSI) enables the mapping of whole-brain metabolite concentrations in-vivo. However, a...

Identifying Causal Effects Using a Single Proxy Variable

Unobserved confounding is a key challenge when estimating causal effects from a treatment on an outcome in scientific applications. In this work, we ass...

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

Much research has been carried out on large language models (LLMs) and LLM-powered agentic workflows. However, many works within the field state emergen...

Improved Scaling Laws via Weak-to-Strong Generalization in Random Feature Ridge Regression

It is increasingly common in machine learning to use learned models to label data and then employ such data to train more capable models. The phenomenon...

Improving Generalization on Cybersecurity Tasks with Multi-Modal Contrastive Learning

The use of ML in cybersecurity has long been impaired by generalization issues: Models that work well in controlled scenarios fail to maintain performan...

InCoder-32B-Thinking: Industrial Code World Model for Thinking

Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reasoning traces showing how engineers reason ab...

Inferential Mechanics Part 1: Causal Mechanistic Theories of Machine Learning in Chemical Biology with Implications

Machine learning techniques are now routinely encountered in research laboratories across the globe. Impressive progress has been made through ML and AI...

Influence Malleability in Linearized Attention: Dual Implications of Non-Convergent NTK Dynamics

Understanding the theoretical foundations of attention mechanisms remains challenging due to their complex, non-linear dynamics. This work reveals a fun...

Information Router for Mitigating Modality Dominance in Vision-Language Models

Vision-language models (VLMs) have demonstrated strong performance across a wide range of benchmarks, yet they often suffer from modality dominance, whe...

Information-geometric adaptive sampling for graph diffusion

Standard diffusion models for graph generation typically rely on uniform time-stepping, an approach that overlooks the non-homogeneous dynamics of distr...

InnerQ: Hardware-aware Tuning-free Quantization of KV Cache for Large Language Models

Reducing the hardware footprint of large language models (LLMs) during decoding is critical for efficient long-sequence generation. A key bottleneck is...

InpaintSLat: Inpainting Structured 3D Latents via Initial Noise Optimization

We present a training-free approach for controllable 3D inpainting based on initial noise optimization. In the structured 3D latent diffusion framework,...

Integrated electro-optic attention nonlinearities for transformers

Transformers have emerged as the dominant neural-network architecture, achieving state-of-the-art performance in language processing and computer vision...

Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists

Existing research infrastructure is fundamentally document-centric, providing citation links between papers but lacking explicit representations of meth...

Invariance-Based Dynamic Regret Minimization

We consider stochastic non-stationary linear bandits where the linear parameter connecting contexts to the reward changes over time. Existing algorithms...

Invariant Transformation and Resampling based Epistemic-Uncertainty Reduction

An artificial intelligence (AI) model can be viewed as a function that maps inputs to outputs in high-dimensional spaces. Once designed and well trained...

Inverse Contextual Bandits without Rewards: Learning from a Non-Stationary Learner via Suffix Imitation

We study the Inverse Contextual Bandit (ICB) problem, in which a learner seeks to optimize a policy while an observer, who cannot access the learner's r...

Inversion-Free Natural Gradient Descent on Riemannian Manifolds

The natural gradient method is widely used in statistical optimization, but its standard formulation assumes a Euclidean parameter space. This paper pro...

Is Human Annotation Necessary? Iterative MBR Distillation for Error Span Detection in Machine Translation

Error Span Detection (ESD) is a crucial subtask in Machine Translation (MT) evaluation, aiming to identify the location and severity of translation erro...

Is More Data Worth the Cost? Dataset Scaling Laws in a Tiny Attention-Only Decoder

Training Transformer language models is expensive, as performance typically improves with increasing dataset size and computational budget. Although sca...

Iterative Identification Closure: Amplifying Causal Identifiability in Linear SEMs

The Half-Trek Criterion (HTC) is the primary graphical tool for determining generic identifiability of causal effect coefficients in linear structural e...

Joint-Centric Dual Contrastive Alignment with Structure-Preserving and Information-Balanced Regularization

We propose HILBERT (HIerarchical Long-sequence Balanced Embedding with Reciprocal contrastive Training), a cross-attentive multimodal framework for lear...

JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models

Adapter-based methods have become a cost-effective approach to continual learning (CL) for Large Language Models (LLMs), by sequentially learning a low-...

Kernel Integrated $R^2$: A Measure of Dependence

We introduce kernel integrated $R^2$, a new measure of statistical dependence that combines the local normalization principle of the recently introduced...

Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning

Recent advances in large language models (LLMs) have increasingly relied on reinforcement learning (RL) to improve their reasoning capabilities. Three a...

KLIP: localized distribution shift detection via KL-divergence with diffusion priors in Inverse Problems

Diffusion models have shown promising performance as data-driven priors for computational imaging, as well as some capacity to detect out-of-distributio...

Kolmogorov-Arnold causal generative models

Causal generative models provide a principled framework for answering observational, interventional, and counterfactual queries from observational data....

L2GTX: From Local to Global Time Series Explanations

Deep learning models achieve high accuracy in time series classification, yet understanding their class-level decision behaviour remains challenging. Ex...

Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions

Grasping the semantics of rare constructions (form-meaning pairings) has been shown to be a challenging problem that has currently only been solved by t...

Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism

Large language models (LLMs) undergo alignment training to avoid harmful behaviors, yet the resulting safeguards remain brittle: jailbreaks routinely by...

Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection

Multi-turn prompt injection follows a known attack path -- trust-building, pivoting, escalation -- but text-level defenses miss covert attacks where indivi...

Latent-GRPO: Group Relative Policy Optimization for Latent Reasoning

Latent reasoning offers a more efficient alternative to explicit reasoning by compressing intermediate reasoning into continuous representations and sub...

Learnability and Privacy Vulnerability are Entangled in a Few Critical Weights

Prior approaches for membership privacy preservation usually update or retrain all weights in neural networks, which is costly and can lead to unnecessa...

Learning Dynamic Belief Graphs for Theory-of-mind Reasoning

Theory of Mind (ToM) reasoning with Large Language Models (LLMs) requires inferring how people's implicit, evolving beliefs shape what they seek and how...

Learning Flexible Job Shop Scheduling under Limited Buffers and Material Kitting Constraints

The Flexible Job Shop Scheduling Problem (FJSP) originates from real production lines, while some practical constraints are often ignored or idealized i...

Learning from Child-Directed Speech in Two-Language Scenarios: A French-English Case Study

Research on developmentally plausible language models has largely focused on English, leaving open questions about multilingual settings. We present a s...

Learning interacting particle systems from unlabeled data

Learning the potentials of interacting particle systems is a fundamental task across various scientific disciplines. A major challenge is that unlabeled...

Learning Rate Transfer in Normalized Transformers

The Normalized Transformer, or nGPT (arXiv:2410.01131), achieves impressive training speedups and does not require weight decay or learning rate warmup....

Learning the Helmholtz equation operator with DeepONet for non-parametric 2D geometries

This paper deals with solving the 2D Helmholtz equation on non-parametric domains, leveraging a physics-informed neural operator network based on the De...

Learning the Signature of Memorization in Autoregressive Language Models

All prior membership inference attacks for fine-tuned language models use hand-crafted heuristics (e.g., loss thresholding, Min-K%, reference calibrati...

Learning to Reason with Insight for Informal Theorem Proving

Although most of the automated theorem-proving approaches depend on formal proof systems, informal theorem proving can align better with large language...

LemmaBench: A Live, Research-Level Benchmark to Evaluate LLM Capabilities in Mathematics

We present a new approach for benchmarking Large Language Model (LLM) capabilities on research-level mathematics. Existing benchmarks largely rely on st...

Linear Models, Variable Selection, Artificial Intelligence

Variable selection in linear regression models has been a problem since hypothesis testing began. Which variables to include or exclude from a model is...

Linear-Core Surrogates: Smooth Loss Functions with Linear Rates for Classification and Structured Prediction

The choice of loss function in classification involves a fundamental trade-off: smooth losses (like Cross-Entropy) enable fast optimization rates but yi...

Lipschitz bounds for integral kernels

Feature maps associated with positive definite kernels play a central role in kernel methods and learning theory, where regularity properties such as Li...

LiveSense: A Real-Time Wi-Fi Sensing Platform for Range-Doppler on COTS Laptop

We present LiveSense - a cross-platform that transforms a commercial off-the-shelf (COTS) Wi-Fi Network Interface Card (NIC) on a laptop into a centimet...

LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis

Electroencephalogram (EEG) signals are vital for automated seizure detection, but their inherent noise makes robust representation learning challenging....

LLM Constitutional Multi-Agent Governance

Large Language Models (LLMs) can generate persuasive influence strategies that shift cooperative behavior in multi-agent populations, but a critical que...

LLM Novice Uplift on Dual-Use, In Silico Biology Tasks

Large language models (LLMs) perform increasingly well on biology benchmarks, but it remains unclear whether they uplift novice users -- i.e., enable hu...

LoASR-Bench: Evaluating Large Speech Language Models on Low-Resource Automatic Speech Recognition Across Language Families

Large language models (LLMs) have driven substantial advances in speech language models (SpeechLMs), yielding strong performance in automatic speech rec...

LoBoost: Fast Model-Native Local Conformal Prediction for Gradient-Boosted Trees

Gradient-boosted decision trees are among the strongest off-the-shelf predictors for tabular regression, but point predictions alone do not quantify unc...

Log-Ratio Propagation on the Simplex: A Theory of Cellwise Contamination for Compositional Data

Compositional data must be analysed through log-ratios: scale invariance, the defining axiom of the field, leaves no alternative. The centred log-ratio...

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Long-context reasoning remains a central challenge for large language models, which often fail to locate and integrate key information in extensive dist...

Low-degree Lower bounds for clustering in moderate dimension

We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. Specifically, w...

Low-Rank Compression of Pretrained Models via Randomized Subspace Iteration

The massive scale of pretrained models has made efficient compression essential for practical deployment. Low-rank decomposition based on the singular v...

Low-Resource Guidance for Controllable Latent Audio Diffusion

Generative audio requires fine-grained controllable outputs, yet most existing methods require model retraining on specific controls or inference-time c...

Lumos-Nexus: Efficient Frequency Bridging with Homogeneous Latent Space for Video Unified Models

Connector-based video unified models have demonstrated strong capability in instruction-grounded video synthesis, but integrating a large high-fidelity...

LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation

Recent advances in diffusion models have significantly improved text-to-video generation, enabling personalized content creation with fine-grained contr...

M-CaStLe: Uncovering Local Causal Structures in Multivariate Space-Time Gridded Data

Causal graph discovery for space-time systems is challenging in high-dimensional gridded data, which often has many more grid cells than temporal observ...

Make It Hard to Hear, Easy to Learn: Long-Form Bengali ASR and Speaker Diarization via Extreme Augmentation and Perfect Alignment

Although Automatic Speech Recognition (ASR) in Bengali has seen significant progress, processing long-duration audio and performing robust speaker diari...

Make Your LVLM KV Cache More Lightweight

Key-Value (KV) cache has become a de facto component of modern Large Vision-Language Models (LVLMs) for inference. While it enhances decoding efficiency...

ManifoldGD: Training-Free Hierarchical Manifold Guidance for Diffusion-Based Dataset Distillation

In recent times, large datasets hinder efficient model training while also containing redundant concepts. Dataset distillation aims to synthesize compac...

Many-Tier Instruction Hierarchy in LLM Agents

Large language model agents receive instructions from many sources -- system messages, user prompts, tool outputs, and more -- each carrying different levels...

Mapping the Methodological Space of Classroom Interaction Research: Scale, Duration, and Modality in an Age of AI

Research on classroom interaction has long been divided between large-scale observation and in-depth ethnographic work. We propose a framework mapping t...

Mapping the Phase Diagram of the Vicsek Model with Machine Learning

In this study, we use machine learning to classify and interpolate the phase structure of the Vicsek flocking model across the three-dimensional paramet...

Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This oc...

MeanFlow Meets Control: Scaling Sampled-Data Control for Swarms

Steering large-scale swarms in only a few control updates is challenging because real systems operate in sampled-data form: control inputs are updated i...

Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation

Recent work on chain-of-thought (CoT) faithfulness reports single aggregate numbers (e.g., DeepSeek-R1 acknowledges hints 39% of the time), implying tha...

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Large language model (LLM) agents are fundamentally bottlenecked by finite context windows on long-horizon tasks. As trajectories grow, retaining tool o...

Memory by Design: Probabilistic Sequence Layers

We introduce the design-model framework: a way to derive efficient recurrent sequence maps from explicit assumptions about memory. A design model writes...

Memory Caching: RNNs with Growing Memory

Transformers have been established as the de facto backbones for most recent advances in sequence modeling, mainly due to their growing memory capacity...

Meritocratic Fairness in Budgeted Combinatorial Multi-armed Bandits via Shapley Values

We propose a new framework for meritocratic fairness in budgeted combinatorial multi-armed bandits with full-bandit feedback (BCMAB-FBF). Unlike semi-ba...

Microservices vs Monolith - Making the Right Choice

Navigate the monolith-to-microservices spectrum with Python - bounded contexts, communication patterns, the modular monolith, and practical decision frameworks.

Mind the Gap: Structure-Aware Consistency in Preference Learning

Preference learning has become the foundation of aligning Large Language Models (LLMs) with human intent. Popular methods, such as Direct Preference Opt...

Minimax Generalized Cross-Entropy

Loss functions play a central role in supervised classification. Cross-entropy (CE) is widely used, whereas the mean absolute error (MAE) loss can offer...

MinShap: A Modified Shapley Value Approach for Feature Selection

Feature selection is a classical problem in statistics and machine learning, and it continues to remain an extremely challenging problem especially in t...

MM-StanceDet: Retrieval-Augmented Multi-modal Multi-agent Stance Detection

Multimodal Stance Detection (MSD) is crucial for understanding public discourse, yet effectively fusing text and image, especially with conflicting sign...

Modality Collapse as Mismatched Decoding: Information-Theoretic Limits of Multimodal LLMs

Multimodal LLMs can process speech and images, but they cannot hear a speaker's voice or see an object's texture. We show this is not a failure of encod...

Mode Seeking meets Mean Seeking for Fast Long Video Generation

Scaling video generation from seconds to minutes faces a critical bottleneck: while short-video data is abundant and high-fidelity, coherent long-form d...

Model Agreement via Anchoring

Numerous lines of work aim to control $\textit{model disagreement}$ -- the extent to which two machine learning models disagree in their predictions. We adop...

Model Selection and Parameter Estimation of Multi-dimensional Gaussian Mixture Model

In this paper, we study the problem of learning multi-dimensional Gaussian Mixture Models (GMMs), with a specific focus on model order selection and eff...

MoDora: Tree-Based Semi-Structured Document Analysis System

Semi-structured documents integrate diverse interleaved data elements (e.g., tables, charts, hierarchical paragraphs) arranged in various and often irre...

Modular Plugin System

Build an extensible CLI tool with plugin discovery, loading, and lifecycle management.

Module 01 - Design Patterns in Python Overview

GoF patterns, SOLID principles, DDD, and Hexagonal Architecture - enterprise design patterns implemented idiomatically in Python.

Module 01 - Object-Oriented Programming Overview

Master Python's object model at engineering depth - classes, instances, dunder methods, encapsulation, inheritance, MRO, composition, abstract base classes, dataclasses, SOLID principles, and production design patterns.

Module 02 - Microservices with Python Overview

FastAPI in depth, gRPC, event-driven architecture, service mesh patterns, and API contracts - building production Python microservices.

Module 05: Architecture & Systems Design - Complete Overview

Design production Python systems with clean architecture, hexagonal architecture, dependency injection, plugin systems, 12-factor methodology, and configuration management. The engineering patterns that separate scripts from systems.

Moment Matters: Mean and Variance Causal Graph Discovery from Heteroscedastic Observational Data

Heteroscedasticity -- where the variance of a variable changes with other variables -- is pervasive in real data, and elucidating why it arises from the...

MOO: A Multi-view Oriented Observations Dataset for Viewpoint Analysis in Cattle Re-Identification

Animal re-identification (ReID) faces critical challenges due to viewpoint variations, particularly in Aerial-Ground (AG-ReID) settings where models mus...

MovieTeller: Tool-augmented Movie Synopsis with ID Consistent Progressive Abstraction

With the explosive growth of digital entertainment, automated video summarization has become indispensable for applications such as content indexing, pe...

MT-PingEval: Evaluating Multi-Turn Collaboration with Private Information Games

We present a scalable methodology for evaluating language models in multi-turn interactions, using a suite of collaborative games that require effective...

Multimodal Optimal Transport for Unsupervised Temporal Segmentation in Surgical Robotics

Recognizing surgical phases and steps from video is a fundamental problem in computer-assisted interventions. Recent approaches increasingly rely on lar...

Multivariate Spatio-Temporal Neural Hawkes Processes

We propose a Multivariate Spatio-Temporal Neural Hawkes Process for modeling complex multivariate event data with spatio-temporal dynamics. The proposed...

MuViT: Multi-Resolution Vision Transformers for Learning Across Scales in Microscopy

Modern microscopy routinely produces gigapixel images that contain structures across multiple spatial scales, from fine cellular morphology to broader t...

MXNorm: Reusing MXFP block scales for efficient tensor normalisation

Matrix multiplication performance has long been the major bottleneck to scaling deep learning workloads, which has stimulated the design of new accelera...

Neural Diffusion Intensity Models for Point Process Data

Cox processes model overdispersed point process data via a latent stochastic intensity, but both nonparametric estimation of the intensity model and pos...

Neural Operators Can Discover Functional Clusters

Operator learning is reshaping scientific computing by amortizing inference across infinite families of problems. While neural operators (NOs) are incre...

Neuro-Symbolic ODE Discovery with Latent Grammar Flow

Understanding natural and engineered systems often relies on symbolic formulations, such as differential equations, which provide interpretability and t...

NOBLE: Accelerating Transformers with Nonlinear Low-Rank Branches

We introduce NOBLE (Nonlinear lOw-rank Branch for Linear Enhancement), an architectural augmentation that adds nonlinear low-rank branches to transforme...

Non-Asymptotic Convergence of Stochastic Iterative Algorithms: A Lyapunov Framework

We survey Lyapunov-based techniques for the finite-time analysis of stochastic iterative algorithms, also known as stochastic approximation (SA) algorit...

NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search

Monte Carlo Tree Search (MCTS) scales poorly in cooperative multi-agent domains because expansion must consider an exponentially large set of joint acti...

Normativity and Productivism: Ableist Intelligence? A Degrowth Analysis of AI Sign Language Translation Tools for Deaf People

Sign languages, of any geographical or accentual variation, understandably face continuous scrutiny under the ever present popularity of verbal dictatio...

Observable Performance Does Not Fully Reflect System Organization: A Multi-Level Analysis of Gait Dynamics Under Occlusal Constraint

In biomechanical systems, observable performance is often used as a proxy for underlying system organization. However, this assumption implicitly presum...

Observationally Informed Adaptive Causal Experimental Design

Randomized Controlled Trials (RCTs) represent the gold standard for causal inference yet remain a scarce resource. While large-scale observational data...

ODEBrain: Continuous-Time EEG Graph for Modeling Dynamic Brain Networks

Modeling neural population dynamics is crucial for foundational neuroscientific research and various clinical applications. Conventional latent variable...

On the Relationship Between Activation Outliers and Feature Death in Sparse Autoencoders

Sparse autoencoders (SAEs) decompose neural network activations into interpretable features, but many learned features never activate, a problem called...

One-Shot Generative Flows: Existence and Obstructions

We study dynamic measure transport for generative modelling in the setting of a stochastic process $X_\bullet$ whose marginals interpolate between a sou...

Online Quantile Regression for Nonparametric Additive Models

This paper introduces a projected functional gradient descent algorithm (P-FGD) for training nonparametric additive quantile regression models in online...

Optimal Spatio-Temporal Decoupling for Bayesian Conformal Prediction

Online Conformal Prediction (CP) struggles to balance temporal adaptability and structural stability. Feedback-driven methods (e.g., Adaptive Conformal...

Optimized Deferral for Imbalanced Settings

Learning algorithms can be significantly improved by routing complex or uncertain inputs to specialized experts, balancing accuracy with computational c...

OT on the Map: Quantifying Domain Shifts in Geographic Space

In computer vision and machine learning for geographic data, out-of-domain generalization is a pervasive challenge, arising from uneven global data cove...

Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading

Most PDE foundation models are pretrained and fine-tuned on fluid-centric benchmarks. Their utility under extreme-loading material dynamics remains uncl...

ParamMem: Augmenting Language Agents with Parametric Reflective Memory

Self-reflection enables language agents to iteratively refine solutions, yet often produces repetitive outputs that limit reasoning performance. Recent...

Partition Function Estimation under Bounded f-Divergence

We study the statistical complexity of estimating partition functions given sample access to a proposal distribution and an unnormalized density ratio f...

Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

While autoregressive Large Vision-Language Models (LVLMs) demonstrate remarkable proficiency in multimodal tasks, they face a 'Visual Signal Dilution' p...

PhyCo: Learning Controllable Physical Priors for Generative Motion

Modern video diffusion models excel at appearance synthesis but still struggle with physical consistency: objects drift, collisions lack realistic rebou...

Physics Informed Viscous Value Representations

Offline goal-conditioned reinforcement learning (GCRL) learns goal-conditioned policies from static pre-collected datasets. However, accurate value esti...

PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization

Recent progress in text-conditioned human motion generation has been largely driven by diffusion models trained on large-scale human motion data. Buildi...

Plug-and-Play Diffusion Meets ADMM: Dual-Variable Coupling for Robust Medical Image Reconstruction

Plug-and-Play diffusion prior (PnPDP) frameworks have emerged as a powerful paradigm for solving imaging inverse problems by treating pretrained generat...

Plugin Systems - Building Extensible Applications

Build extensible Python applications with entry_points, importlib.metadata, stevedore, __init_subclass__, and plugin lifecycle management.

Policy-Aware Design of Large-Scale Factorial Experiments

Digital firms routinely run many online experiments on shared user populations. When product decisions are compositional, such as combinations of interf...

PONTE: Personalized Orchestration for Natural Language Trustworthy Explanations

Explainable Artificial Intelligence (XAI) seeks to enhance the transparency and accountability of machine learning systems, yet most methods follow a on...

Position: agentic AI orchestration should be Bayes-consistent

LLMs excel at predictive tasks and complex reasoning tasks, but many high-value deployments rely on decisions under uncertainty, for example, which tool...

Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization

Transformer-based language models are widespread in today's society. As such, understanding the mechanisms by which they solve structured tasks and pred...

PR3DICTR: A modular AI framework for medical 3D image-based detection and outcome prediction

Three-dimensional medical image data and computer-aided decision making, particularly using deep learning, are becoming increasingly important in the me...

Prediction-powered Inference by Mixture of Experts

The rapidly expanding artificial intelligence (AI) industry has produced diverse yet powerful prediction tools, each with its own network architecture,...

Predictive Coding Graphs are a Superset of Feedforward Neural Networks

Predictive coding graphs (PCGs) are a recently introduced generalization to predictive coding networks, a neuroscience-inspired probabilistic latent var...

Preference Packing: Efficient Preference Optimization for Large Language Models

Resource-efficient training optimization techniques are becoming increasingly important as the size of large language models (LLMs) continues to grow. I...

PRIM-cipal components analysis

Supervised No Free Lunch Theorems (NFLTs) are well studied, yet unsupervised NFLTs remain underexplored. For elliptical distributions, we prove that the...

PRISM: LLM-Guided Semantic Clustering for High-Precision Topics

In this paper, we propose Precision-Informed Semantic Modeling (PRISM), a structured topic modeling framework combining the benefits of rich representat...

PRISM: Pre-alignment via Black-box On-policy Distillation for Multimodal Reinforcement Learning

The standard post-training recipe for large multimodal models (LMMs) applies supervised fine-tuning (SFT) on curated demonstrations followed by reinforc...

Probabilistic Joint and Individual Variation Explained (ProJIVE) for Data Integration

Collecting multiple types of data on the same set of subjects is common in modern scientific applications, including genomics, metabolomics, and neuroim...

Probing the Geometry of Diffusion Models with the String Method

Understanding the geometry of learned distributions is fundamental to improving and interpreting diffusion models, yet systematic tools for exploring th...

Process Reward Agents for Steering Knowledge-Intensive Reasoning

Reasoning in knowledge-intensive domains remains challenging as intermediate steps are often not locally verifiable: unlike math or code, evaluating ste...

Production FastAPI Application

Build a FastAPI app with clean architecture, dependency injection, and proper configuration management.

Prosodic Boundary-Aware Streaming Generation for LLM-Based TTS with Streaming Text Input

Streaming TTS that receives streaming text is essential for interactive systems, yet this scheme faces two major challenges: unnatural prosody due to mi...

PTOPOFL: Privacy-Preserving Personalised Federated Learning via Persistent Homology

Federated learning (FL) faces two structural tensions: gradient sharing enables data-reconstruction attacks, while non-IID client distributions degrade...

Python Clean Architecture Practice Problems & Exercises

Solve 11 Python clean architecture problems (3 Easy, 4 Medium, 4 Hard). Practice dependency rule, domain model with hints, runnable code, and solutions.

Python Configuration Management Practice Problems & Exercises

Solve 11 Python configuration management problems. Covers pydantic settings, environment variables. Hints and solutions.

Python Dependency Injection Practice Problems & Exercises

Solve 11 Python dependency injection problems. Covers DI container, constructor injection, FastAPI dependency. Hints and solutions.

Python Hexagonal Architecture (Ports and Adapters): Practice Problems & Exercises

Solve 11 Python hexagonal architecture (ports and adapters) problems. Covers hexagonal architecture, ports and adapters, protocol port. Hints and solutions.

Python Microservices vs Monolith Practice Problems & Exercises

Solve 11 Python microservices vs monolith problems. Covers microservices vs monolith, bounded context, modular monolith. Hints and solutions.

Python Plugin Systems Practice Problems & Exercises

Solve 11 Python plugin systems problems (3 Easy, 4 Medium, 4 Hard). Practice plugin system, entry_points exercises with hints, runnable code, and solutions.

Python The 12-Factor App Practice Problems & Exercises

Solve 11 Python 12-factor app problems (3 Easy, 4 Medium, 4 Hard). Practice the 12-factor app and 12-factor methodology with hints, runnable code, and solutions.

Quantity Convergence, Quality Divergence: Disentangling Fluency and Accuracy in L2 Mandarin Prosody

While second language (L2) learners may acquire target syntactic word order, mapping this syntax onto appropriate prosodic structures remains a persiste...

Quantum Diffusion Models: Score Reversal Is Not Free in Gaussian Dynamics

Diffusion-based generative modeling suggests reversing a noising semigroup by adding a score drift. For continuous-variable Gaussian Markov dynamics, co...

Quantum Interval Bound Propagation for Certified Training of Quantum Neural Networks

Quantum machine learning is a promising field for efficiently learning features of a dataset to perform a specified task, such as classification. Interv...

RAMoEA-QA: Hierarchical Specialization for Robust Respiratory Audio Question Answering

Conversational generative AI is rapidly entering healthcare, where general-purpose models must integrate heterogeneous patient signals and support diver...

Randomized Subspace Nesterov Accelerated Gradient

Randomized-subspace methods reduce the cost of first-order optimization by using only low-dimensional projected-gradient information, a feature that is...

RANGER: Sparsely-Gated Mixture-of-Experts with Adaptive Retrieval Re-ranking for Pathology Report Generation

Pathology report generation remains a relatively under-explored downstream task, primarily due to the gigapixel scale and complex morphological heteroge...

RAViT: Resolution-Adaptive Vision Transformer

Vision transformers have recently made a breakthrough in computer vision showing excellent performance in terms of precision for numerous applications....

RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video

Self-supervised novel view synthesis (NVS) remains challenging to scale, despite the abundance of video data, largely due to the brittleness of training...

Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories

Recovering camera parameters from images and rendering scenes from novel viewpoints have long been treated as separate tasks in computer vision and grap...

ReAct: Synergizing Reasoning and Acting in Language Models

Engineering breakdown of the ReAct paper (Yao et al., 2022) - the foundation of every AI agent built today. Plain English, production viability rating, implementation notes.

Real-Time Surrogate Modeling for Personalized Blood Flow Prediction and Hemodynamic Analysis

Cardiovascular modeling has rapidly advanced over the past few decades due to the rising needs for health tracking and early detection of cardiovascular...

RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval

We propose RecaLLM, a set of reasoning language models post-trained to make effective use of long-context information. In-context retrieval, which ident...

Recycling Failures: Salvaging Exploration in RLVR via Fine-Grained Off-Policy Guidance

Reinforcement Learning from Verifiable Rewards (RLVR) has emerged as a powerful paradigm for enhancing the complex reasoning capabilities of Large Reaso...

Reflective Context Learning: Studying the Optimization Primitives of Context Space

Generally capable agents must learn from experience in ways that generalize across tasks and environments. The fundamental problems of learning, includi...

Regular Fourier Features for Nonstationary Gaussian Processes

Simulating a Gaussian process requires sampling from a high-dimensional Gaussian distribution, which scales cubically with the number of sample location...

Regularized Online RLHF with Generalized Bilinear Preferences

We consider the problem of contextual online RLHF with general preferences, where the goal is to identify the Nash Equilibrium. We adopt the Generalized...

Reinforcement Learning with Markov Risk Measures and Multipattern Risk Approximation

For a risk-averse finite-horizon Markov Decision Problem, we introduce a special class of Markov coherent risk measures, called mini-batch measures. We...

Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization

We study multiteacher knowledge distillation for low resource abstractive summarization from a reliability aware perspective. We introduce EWAD (Entropy...

Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding

Large language models (LLMs) have revolutionized Text-to-SQL generation, allowing users to query structured data using natural language with growing eas...

Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

Recent research has shown that filtering massive English web corpora into high-quality subsets significantly improves training efficiency. However, for...

Representation Learning for Spatiotemporal Physical Systems

Machine learning approaches to spatiotemporal physical systems have primarily focused on next-frame prediction, with the goal of learning an accurate em...

Research Roadmap: The Evolution of AI Agents

From Chain-of-Thought to production agent architectures. Read the 9 most important agent papers in order — with full engineering context between each one.

Research Roadmap: The Evolution of Multimodal AI

From CLIP to GPT-4V to Gemini. Read the 9 most important multimodal AI papers in order — understanding how vision and language were unified.

Research Roadmap: The Evolution of RAG

Read the 8 most important RAG papers in the right order. From the original Lewis et al. through GraphRAG. Full engineering context between each paper.

Resilient Strategies for Stochastic Systems: How Much Does It Take to Break a Winning Strategy?

We study the problem of resilient strategies in the presence of uncertainty. Resilient strategies enable an agent to make decisions that are robust agai...

Resources for Automated Evaluation of Assistive RAG Systems that Help Readers with News Trustworthiness Assessment

Many readers today struggle to assess the trustworthiness of online news because reliable reporting coexists with misinformation. The TREC 2025 DRAGUN (...

Rethinking Forward Processes for Score-Based Data Assimilation in High Dimensions

Data assimilation is the process of estimating the time-evolving state of a dynamical system by integrating model predictions and noisy observations. It...

Revisiting Gene Ontology Knowledge Discovery with Hierarchical Feature Selection and Virtual Study Group of AI Agents

Large language models have achieved great success in multiple challenging tasks, and their capacity can be further boosted by the emerging agentic AI te...

RewardUQ: A Unified Framework for Uncertainty-Aware Reward Models

Reward models are central to aligning large language models (LLMs) with human preferences. Yet most approaches rely on pointwise reward estimates that o...

Risk-Aware World Model Predictive Control for Generalizable End-to-End Autonomous Driving

With advances in imitation learning (IL) and large-scale driving datasets, end-to-end autonomous driving (E2E-AD) has made great progress recently. Curr...

RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots

Recent advances in robot learning have accelerated progress toward generalist robots that can perform everyday tasks in human environments. Yet it remai...

Robust support vector model based on bounded asymmetric elastic net loss for binary classification

In this paper, we propose a novel bounded asymmetric elastic net ($L_{baen}$) loss function and combine it with the support vector machine (SVM), result...

Robust Unscented Kalman Filtering via Recurrent Meta-Adaptation of Sigma-Point Weights

The Unscented Kalman Filter (UKF) is a ubiquitous tool for nonlinear state estimation; however, its performance is limited by the static parameterizatio...

Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

As Large Language Models (LLMs) transition into autonomous multi-agent ecosystems, robust minimax training becomes essential yet remains prone to instab...

RunAgent: Interpreting Natural-Language Plans with Constraint-Guided Execution

Humans solve problems by executing targeted plans, yet large language models (LLMs) remain unreliable for structured workflow execution. We propose RunA...

SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning

Safety guarantees are a prerequisite to the deployment of reinforcement learning (RL) agents in safety-critical tasks. Often, deployment environments ex...

SafeGen-LLM: Enhancing Safety Generalization in Task Planning for Robotic Systems

Safety-critical task planning in robotic systems remains challenging: classical planners suffer from poor scalability, Reinforcement Learning (RL)-based...

SafeMind: A Risk-Aware Differentiable Control Framework for Adaptive and Safe Quadruped Locomotion

Learning-based quadruped controllers achieve impressive agility but typically lack formal safety guarantees under model uncertainty, perception noise, a...

SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement

Recursive self-improvement is moving from theory to practice: modern systems can critique, revise, and evaluate their own outputs, yet iterative self-mo...

Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model

We study the sample complexity of learning an $ε$-optimal policy in the Stochastic Shortest Path (SSP) problem. We first derive sample complexity bounds...

SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control

While representation and similarity learning have improved the sample efficiency of Reinforcement Learning (RL), they are rarely used to shape policy up...

Scalable Evaluation of the Realism of Synthetic Environmental Augmentations in Images

Evaluation of AI systems often requires synthetic test cases, particularly for rare or safety-critical conditions that are difficult to observe in opera...

Scalable Learning of Multivariate Distributions via Coresets

Efficient and scalable non-parametric or semi-parametric regression analysis and density estimation are of crucial importance to the fields of statistic...

Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments

Large-scale commercial search systems optimize for relevance to drive successful sessions that help users find what they are looking for. To maximize re...

SCOPE: Scene-Contextualized Incremental Few-Shot 3D Segmentation

Incremental Few-Shot (IFS) segmentation aims to learn new categories over time from only a few annotations. Although widely studied in 2D, it remains un...

Seeing is Believing: Robust Vision-Guided Cross-Modal Prompt Learning under Label Noise

Prompt learning is a parameter-efficient approach for vision-language models, yet its robustness under label noise is less investigated. Visual content...

SeeThrough3D: Occlusion Aware 3D Control in Text-to-Image Generation

We identify occlusion reasoning as a fundamental yet overlooked aspect for 3D layout-conditioned generation. It is essential for synthesizing partially...

SELDON: Supernova Explosions Learned by Deep ODE Networks

The discovery rate of optical transients will explode to 10 million public alerts per night once the Vera C. Rubin Observatory's Legacy Survey of Space...

Self-Distilled RLVR

On-policy distillation (OPD) has become a popular training paradigm in the LLM community. This paradigm selects a larger model as the teacher to provide...

Semantic Invariance in Agentic AI

Large Language Models (LLMs) increasingly serve as autonomous reasoning agents in decision support, scientific problem-solving, and multi-agent coordina...

Semantic Rate-Distortion for Bounded Multi-Agent Communication: Capacity-Derived Semantic Spaces and the Communication Cost of Alignment

When two agents of different computational capacities interact with the same environment, they need not compress a common semantic alphabet differently;...

Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models

Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks. However, the truthfulness of their outputs is not guarantee...

Semantics-Aware Caching for Concept Learning

Concept learning is a form of supervised machine learning that operates on knowledge bases in description logics. State-of-the-art concept learners ofte...

Semi-Supervised Generative Learning via Latent Space Distribution Matching

We introduce Latent Space Distribution Matching (LSDM), a novel framework for semi-supervised generative modeling of conditional distributions. LSDM ope...

SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching

Diffusion models achieve state-of-the-art video generation quality, but their inference remains expensive due to the large number of sequential denoisin...

Sentiment Analysis of German Sign Language Fairy Tales

We present a dataset and a model for sentiment analysis of German sign language (DGS) fairy tales. First, we perform sentiment analysis for three levels...

Separating Secrets from Placeholders: A Hybrid CNN-CodeBERT Framework for Three-Class Credential Leakage Detection

Credential leakage in public source code repositories poses a critical security threat, with over 23.8 million secrets exposed in 2024 alone. Existing d...

Sequential Inference for Gaussian Processes: A Signal Processing Perspective

The proliferation of capable and efficient machine learning (ML) models marks one of the strongest methodological shifts in signal processing (SP) in it...

Sharp Convergence Rates for Masked Diffusion Models

Discrete diffusion models have achieved strong empirical performance in text and other symbolic domains, with masked (absorbing-rate) variants emerging...

Sharp description of local minima in the loss landscape of high-dimensional two-layer ReLU neural networks

We study the population loss landscape of two-layer ReLU networks of the form $\sum_{k=1}^K \mathrm{ReLU}(w_k^\top x)$ in a realisable teacher-student s...

Sim-to-Real Transfer for Muscle-Actuated Robots via Generalized Actuator Networks

Tendon drives paired with soft muscle actuation enable faster and safer robots while potentially accelerating skill acquisition. Still, these systems ar...

SimpliHuMoN: Simplifying Human Motion Prediction

Human motion prediction combines the tasks of trajectory forecasting and human pose prediction. For each of the two tasks, specialized models have been...

Sketching the Readout of Large Language Models for Scalable Data Attribution and Valuation

Data attribution and valuation are critical for understanding data-model synergy for Large Language Models (LLMs), yet existing gradient-based methods s...

Skill Reuse as Compression in Agentic RL

Large language model agents trained with reinforcement learning (RL) often learn brittle, task-specific shortcuts. We hypothesize that agents generalize...

SOLID Principles

Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, and Dependency Inversion - applied to production Python.

SOTAlign: Semi-Supervised Alignment of Unimodal Vision and Language Models via Optimal Transport

The Platonic Representation Hypothesis posits that neural networks trained on different modalities converge toward a shared statistical model of the wor...

SPARTA: Scalable and Principled Benchmark of Tree-Structured Multi-hop QA over Text and Tables

Real-world Table-Text question answering (QA) tasks require models that can reason across long text and source tables, traversing multiple hops and exec...

Spatio-Temporal Token Pruning for Efficient High-Resolution GUI Agents

Pure-vision GUI agents provide universal interaction capabilities but suffer from severe efficiency bottlenecks due to the massive spatiotemporal redund...

SPECTRA: Synthetic IR Test Collections with Relevance Oracles and Controlled Distractor Diagnostics

Scalable information retrieval testing needs corpora that are large enough to stress index construction, ranking latency, query routing, and evaluation...

Spectral Alignment in Forward-Backward Representations via Temporal Abstraction

Forward-backward (FB) representations provide a powerful framework for learning the successor representation (SR) in continuous spaces by enforcing a lo...

Splitting Argumentation Frameworks with Collective Attacks and Supports

This work proposes novel splitting techniques for argumentation formalisms that incorporate supports between defeasible elements. We base our studies on...

SpotIt+: Verification-based Text-to-SQL Evaluation with Database Constraints

We present SpotIt+, an open-source tool for evaluating Text-to-SQL systems via bounded equivalence verification. Given a generated SQL query and the gro...

SPPCSO: Adaptive Penalized Estimation Method for High-Dimensional Correlated Data

With the rise of high-dimensional correlated data, multicollinearity poses a significant challenge to model stability, often leading to unstable estimat...

SPRINT: Semi-supervised Prototypical Representation for Few-Shot Class-Incremental Tabular Learning

Real-world systems must continuously adapt to novel concepts from limited data without forgetting previously acquired knowledge. While Few-Shot Class-In...

Stable and Steerable Sparse Autoencoders with Weight Regularization

Sparse autoencoders (SAEs) are widely used to extract human-interpretable features from neural network activations, but their learned features can vary...

State estimations and noise identifications with intermittent corrupted observations via Bayesian variational inference

This paper focuses on the state estimation problem in distributed sensor networks, where intermittent packet dropouts, corrupted observations, and unkno...

Stateful Online Monitoring Catches Distributed Agent Attacks

Language models can find thousands of severe software vulnerabilities, and agents are increasingly being misused for cyberattacks. To avoid detection, a...

Steve-Evolving: Open-World Embodied Self-Evolution via Fine-Grained Diagnosis and Dual-Track Knowledge Distillation

Open-world embodied agents must solve long-horizon tasks where the main bottleneck is not single-step planning quality but how interaction experience is...

Strait: Perceiving Priority and Interference in ML Inference Serving

Machine learning (ML) inference serving systems host deep neural network (DNN) models and schedule incoming inference requests across deployed GPUs. How...

Strategic Algorithmic Monoculture: Experimental Evidence from Coordination Games

AI agents increasingly operate in multi-agent environments where outcomes depend on coordination. We distinguish primary algorithmic monoculture -- base...

Structural interpretability in SVMs with truncated orthogonal polynomial kernels

We study post-training interpretability for Support Vector Machines (SVMs) built from truncated orthogonal polynomial kernels. Since the associated repr...

Structure-Preserving Multi-View Embedding Using Gromov-Wasserstein Optimal Transport

Multi-view data analysis seeks to integrate multiple representations of the same samples in order to recover a coherent low-dimensional structure. Class...

Structured Distillation for Personalized Agent Memory: 11x Token Reduction with Retrieval Preservation

Long conversations with an AI agent create a simple problem for one user: the history is useful, but carrying it verbatim is expensive. We study persona...

Stylistic-STORM (ST-STORM): Perceiving the Semantic Nature of Appearance

One of the dominant paradigms in self-supervised learning (SSL), illustrated by MoCo or DINO, aims to produce robust representations by capturing featur...

SUREON: A Benchmark and Vision-Language-Model for Surgical Reasoning

Surgeons don't just see -- they interpret. When an expert observes a surgical scene, they understand not only what instrument is being used, but why it...

SurvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis

Estimating heterogeneous treatment effects (HTEs) from right-censored survival data is critical in high-stakes applications such as precision medicine a...

Synthetic Computers at Scale for Long-Horizon Productivity Simulation

Realistic long-horizon productivity work is strongly conditioned on user-specific computer environments, where much of the work context is stored and or...

Synthetic data in cryptocurrencies using generative models

Data plays a fundamental role in consolidating markets, services, and products in the digital financial ecosystem. However, the use of real data, especi...

Synthetic Monitoring Environments for Reinforcement Learning

Reinforcement Learning (RL) lacks benchmarks that enable precise, white-box diagnostics of agent behavior. Current environments often entangle complexit...

Takeuchi's Information Criteria as Generalization Measures for DNNs Close to NTK Regime

Generalization measures have been studied extensively in the machine learning community to better characterize generalization gaps. However, establishin...

Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation

Modern optimizers like Adam and Muon are central to training large language models, but their reliance on first- and second-order momenta introduces sig...

Task Complexity Matters: An Empirical Study of Reasoning in LLMs for Sentiment Analysis

Large language models (LLMs) with reasoning capabilities have fueled a compelling narrative that reasoning universally improves performance across langu...

Task-Centric Acceleration of Small-Language Models

Small language models (SLMs) have emerged as efficient alternatives to large language models for task-specific applications. However, they are often emp...

Tell Me What To Learn: Generalizing Neural Memory to be Controllable in Natural Language

Modern machine learning models are deployed in diverse, non-stationary environments where they must continually adapt to new tasks and evolving knowledg...

Temporal Data Requirement for Predicting Unplanned Hospital Readmissions

With the proliferation of Electronic Health Records (EHRs), a critical challenge in building predictive models is determining the optimal historical dat...

Terminology Rarity Predicts Catastrophic Failure in LLM Translation of Low-Resource Ancient Languages: Evidence from Ancient Greek

This study presents the first systematic, reference-free human evaluation of large language model (LLM) machine translation (MT) for Ancient Greek (AG)...

The $\mathbf{Y}$-Combinator for LLMs: Solving Long-Context Rot with $λ$-Calculus

LLMs are increasingly used as general-purpose reasoners, but long inputs remain bottlenecked by a fixed context window. Recursive Language Models (RLMs)...

The 12-Factor App - Building Deployable Python Apps

Apply the 12-Factor App methodology to Python applications with FastAPI, Docker, and PostgreSQL - covering all 12 factors with production-ready code examples.

The Compression Gap: Why Discrete Tokenization Limits Vision-Language-Action Model Scaling

Scaling Vision-Language-Action (VLA) models by upgrading the vision encoder is expected to improve downstream manipulation performance--as it does in vi...

The Dynamic-Probabilistic Consistency Gap in Chaotic Surrogate Modeling

Dynamical systems reconstruction (DSR) aims to learn surrogate models that capture the dynamics underlying time-series data. Reliably deploying these su...

The EpisTwin: A Knowledge Graph-Grounded Neuro-Symbolic Architecture for Personal AI

Personal Artificial Intelligence is currently hindered by the fragmentation of user data across isolated silos. While Retrieval-Augmented Generation off...

The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback

We study the problem of learning in zero-sum matrix games with repeated play and bandit feedback. Specifically, we focus on developing uncoupled algorit...

The logic of KM belief update is contained in the logic of AGM belief revision

For each axiom of KM belief update we provide a corresponding axiom in a modal logic containing three modal operators: a unimodal belief operator $B$, a...

The Robot's Inner Critic: Self-Refinement of Social Behaviors through VLM-based Replanning

Conventional robot social behavior generation has been limited in flexibility and autonomy, relying on predefined motions or human feedback. This study...

The Stability of Online Algorithms in Performative Prediction

The use of algorithmic predictions in decision-making leads to a feedback loop where the models we deploy actively influence the data distributions we s...

Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring

Reward models (RMs) have become an indispensable fixture of the language model (LM) post-training playbook, enabling policy alignment and test-time scal...

Thermodynamic Response Functions in Singular Bayesian Models

Singular statistical models -- including mixtures, matrix factorization, and neural networks -- violate regular asymptotics due to parameter non-identifiabili...

Time Series Foundation Models as Strong Baselines in Transportation Forecasting: A Large-Scale Benchmark Analysis

Accurate forecasting of transportation dynamics is essential for urban mobility and infrastructure planning. Although recent work has achieved strong pe...

To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

Agentic AI architectures augment LLMs with external tools, unlocking strong capabilities. However, tool use is not always beneficial; some calls may be...

TopBench: A Benchmark for Implicit Prediction and Reasoning over Tabular Question Answering

Large Language Models (LLMs) have advanced Table Question Answering, where most queries can be answered by extracting information or simple aggregation....

Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks

The advancement of large language models (LLMs) has accelerated the development of autonomous financial trading systems. While mainstream approaches dep...

Toward Generative Quantum Utility via Correlation-Complexity Map

We propose a Correlation-Complexity Map as a practical diagnostic tool for determining when real-world data distributions are structurally aligned with...

Toward Guarantees for Clinical Reasoning in Vision Language Models via Formal Verification

Vision-language models (VLMs) show promise in drafting radiology reports, yet they frequently suffer from logical inconsistencies, generating diagnostic...

Toward World Models for Epidemiology

World models have emerged as a unifying paradigm for learning latent dynamics, simulating counterfactual futures, and supporting planning under uncertai...

Towards Faithful Multimodal Concept Bottleneck Models

Concept Bottleneck Models (CBMs) are interpretable models that route predictions through a layer of human-interpretable concepts. While widely studied i...

Towards Improving Speaker Distance Estimation through Generative Impulse Response Augmentation

The Room Acoustics and Speaker Distance Estimation (SDE) Challenge at ICASSP 2025 explores the effectiveness of augmented room impulse response (RIR) da...

Transfer Learning for Meta-analysis Under Covariate Shift

Randomized controlled trials often do not represent the populations where decisions are made, and covariate shift across studies can invalidate standard...

Trojan horse hunt in deep forecasting models: Insights from the European Space Agency competition

Forecasting plays a crucial role in modern safety-critical applications, such as space operations. However, the increasing use of deep forecasting model...

TunerDiT: Training-free Progressive Steering of Diffusion Transformer for Multi-Event Video Generation

Text-to-video (T2V) generation faces challenging questions when generating videos with long horizons containing multiple events. Inspired by the intrins...

Turning Trust to Transactions: Tracking Affiliate Marketing and FTC Compliance in YouTube's Influencer Economy

YouTube has evolved into a powerful platform where creators monetize their influence through affiliate marketing, raising concerns about transparen...

Two-Time-Scale Learning Dynamics: A Population View of Neural Network Training

Population-based learning paradigms, including evolutionary strategies, Population-Based Training (PBT), and recent model-merging methods, combine fast...

U-Cast: A Surprisingly Simple and Efficient Frontier Probabilistic AI Weather Forecaster

AI-based weather forecasting now rivals traditional physics-based ensembles, but state-of-the-art (SOTA) models rely on specialized architectures and ma...

Uncertainty Quantification for Multimodal Large Language Models with Incoherence-adjusted Semantic Volume

Despite their capabilities, Multimodal Large Language Models (MLLMs) may produce plausible but erroneous outputs, hindering reliable deployment. Accurat...

Uncovering Physical Drivers of Dark Matter Halo Structures with Auxiliary-Variable-Guided Generative Models

Deep generative models (DGMs) compress high-dimensional data but often entangle distinct physical factors in their latent spaces. We present an auxiliar...

Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models

The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption of RL for post-training Multimodal Large L...

Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset

AI-powered scientific research tools are rapidly being integrated into research workflows, yet the field lacks a clear lens into how researchers use the...

Uniform-Correct Policy Optimization: Breaking RLVR's Indifference to Diversity

Reinforcement Learning with Verifiable Rewards (RLVR) has achieved substantial gains in single-attempt accuracy (Pass@1) on reasoning tasks, yet often s...

Unsupervised Continual Learning for Amortized Bayesian Inference

Amortized Bayesian Inference (ABI) enables efficient posterior estimation using generative neural networks trained on simulated data, but often suffers...

Unsupervised Denoising of Real Clinical Low Dose Liver CT with Perceptual Attention Networks

With the development of deep learning, medical image processing has been widely used to assist clinical research. This paper focuses on the denoising pr...

Using Large Language Models and Knowledge Graphs to Improve the Interpretability of Machine Learning Models in Manufacturing

Explaining Machine Learning (ML) results in a transparent and user-friendly manner remains a challenging task of Explainable Artificial Intelligence (XA...

Utilizing LLMs for Industrial Process Automation

In recent years, a growing number of publications have addressed best practices for using Large Language Models (LLMs) in software engineering. However, most...

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

We present a method to identify a valence-arousal (VA) subspace within large language model representations. From 211k emotion-labeled texts, we derive...

Value Functions as Supermartingale Certificates

Certification methods for stochastic systems provide sufficient proof rules, based on real-valued supermartingale certificates, to determine the almost-...

Var-JEPA: A Variational Formulation of the Joint-Embedding Predictive Architecture -- Bridging Predictive and Generative Self-Supervised Learning

The Joint-Embedding Predictive Architecture (JEPA) is often seen as a non-generative alternative to likelihood-based self-supervised learning, emphasizi...

Variational Garrote for Sparse Inverse Problems

Sparse regularization plays a central role in solving inverse problems arising from incomplete or corrupted measurements. Different regularizers corresp...

VaSST: Variational Inference for Symbolic Regression using Soft Symbolic Trees

Symbolic regression has recently gained traction in AI-driven scientific discovery, aiming to recover explicit closed-form expressions from data that re...

VecMol: Vector-Field Representations for 3D Molecule Generation

Generative modeling of three-dimensional (3D) molecules is a fundamental yet challenging problem in drug discovery and materials science. Existing appro...

VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking

Video agentic models have advanced challenging video-language tasks. However, most agentic approaches still heavily rely on greedy parsing over densely...

Vision-Language Models Suppress Female Representations Under Ambiguous Input

Alignment teaches vision-language models (VLMs) to avoid expressing demographic biases, and when gender is clearly visible they largely succeed. Far les...

VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

Vision-language models (VLMs) still struggle with visual perception tasks such as spatial understanding and viewpoint recognition. One plausible contrib...

VISOR: Agentic Visual Retrieval-Augmented Generation via Iterative Search and Over-horizon Reasoning

Visual Retrieval-Augmented Generation (VRAG) empowers Vision-Language Models to retrieve and reason over visually rich documents. To tackle complex quer...

Visual-ERM: Reward Modeling for Visual Equivalence

Vision-to-code tasks require models to reconstruct structured visual inputs, such as charts, tables, and SVGs, into executable or structured representat...

VL-Calibration: Decoupled Confidence Calibration for Large Vision-Language Models Reasoning

Large Vision Language Models (LVLMs) achieve strong multimodal reasoning but frequently exhibit hallucinations and incorrect responses with high certain...

What Does Flow Matching Bring To TD Learning?

Recent work shows that flow matching can be effective for scalar Q-value function estimation in reinforcement learning (RL), but it remains unclear why...

What Gets Unmasked First? Trajectory Analysis of Diffusion Models for Graph-to-Text Generation

We present the first systematic study of masked diffusion language models (MDLMs) for graph-to-text generation. We analyze MDLM generation trajectories...

When Are Multimodal Predictions Biologically Supported? A Diagnostic Evaluation Framework

Multimodal models in oncology can produce accurate predictions, but accurate prediction does not reveal whether the model has learned biology that is sh...

When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models

While diffusion models have revolutionized visual content generation, their rapid adoption has underscored the critical need to investigate vulnerabilit...

When RAG Chatbots Expose Their Backend: An Anonymized Case Study of Privacy and Security Risks in Patient-Facing Medical AI

Background: Patient-facing medical chatbots based on retrieval-augmented generation (RAG) are increasingly promoted to deliver accessible, grounded heal...

When Right Meets Wrong: Bilateral Context Conditioning with Reward-Confidence Correction for GRPO

Group Relative Policy Optimization (GRPO) has emerged as an effective method for training reasoning models. While it computes advantages based on group...

Who Guards the Guardians? The Challenges of Evaluating Identifiability of Learned Representations

Identifiability in representation learning is commonly evaluated using standard metrics (e.g., MCC, DCI, R^2) on synthetic benchmarks with known ground-...

Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?

Diffusion Language Models (DLMs) are often advertised as enabling parallel token generation, yet practical fast DLMs frequently converge to left-to-righ...

Why Linear Recurrent Memory Works in Partially Observable Reinforcement Learning

The family of linear recurrent neural networks has shown strong performance as recurrent memory units in partially observable reinforcement learning. We...

World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings

Recent work interprets the linear recoverability of geographic and temporal variables from large language model (LLM) hidden states as evidence for worl...