2026

Characterizing In-Context Learning: When Can Transformers Match Standard Learning Algorithms?

Mosab Hawarey

AIR Journal of Mathematics & Computational Sciences, Vol. 2026, AIRMCS2026277, DOI: 10.65737/AIRMCS2026277.

Transformers exhibit remarkable in-context learning (ICL) capabilities—the ability to learn new tasks from examples provided in the context window without weight updates. Despite extensive empirical investigation, a fundamental theoretical question remains unanswered: which function classes can be learned in-context, and which cannot? This gap in our understanding limits principled system design and creates uncertainty about when ICL will succeed or fail. We address this gap by developing a theoretical framework based on Sufficient Statistic Complexity (SSC)—the minimal information that must be extracted from context examples to enable accurate prediction. We prove that function classes with attention-computable sufficient statistics (those expressible as sums over examples) are efficiently ICL-learnable, matching the sample complexity of standard learning algorithms (Theorem 1). Conversely, we prove that function classes requiring combinatorial sufficient statistics—such as sparse parity—cannot be efficiently ICL-learned by any polynomial-size transformer (Theorem 2), establishing a fundamental computational barrier via novel connections to circuit complexity. These results yield a near-complete characterization: we prove a Master Theorem (Theorem 3) establishing necessary and sufficient conditions for ICL-learnability, and a Dichotomy Theorem (Theorem 6) showing that natural function classes are either ICL-Easy (learnable with optimal sample complexity) or ICL-Hard (requiring exponentially more resources). The boundary corresponds to whether learning is parallelizable or inherently sequential. Our framework explains empirical ICL phenomena, provides architectural guidance, and opens new research directions connecting learning theory, circuit complexity, and meta-learning.
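The abstract's key notion of an "attention-computable" sufficient statistic (one expressible as sums over context examples) can be illustrated with a minimal sketch. This is a hypothetical example, not the paper's construction: for noiseless linear regression, the statistics S1 = Σᵢ xᵢxᵢᵀ and S2 = Σᵢ xᵢyᵢ are plain sums over examples, and the prediction rule needs only (S1, S2), so a single uniform-attention aggregation suffices.

```python
import numpy as np

# Hypothetical sketch (not the paper's implementation): for linear
# regression y = w.x, the sufficient statistics are sums over the
# context examples -- the additive form the abstract calls
# "attention-computable", since uniform attention computes such sums.
rng = np.random.default_rng(0)
d, n = 4, 50
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true                     # noiseless labels

S1 = X.T @ X                       # sum of outer products x_i x_i^T
S2 = X.T @ y                       # sum of x_i * y_i
w_hat = np.linalg.solve(S1, S2)    # prediction uses only (S1, S2)
print(np.allclose(w_hat, w_true))  # True
```

Sparse parity, by contrast, admits no such additive statistic: no fixed-dimensional sum over individual examples determines the relevant index set, which is the intuition behind the hardness result in Theorem 2.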

2026

Quantum-Enhanced In-Context Learning for Geopotential Field Estimation: A Theoretical Framework

Mosab Hawarey

To Be Submitted To The AIR Journal of Mathematics & Computational Sciences

We establish a theoretical framework for in-context learning (ICL) of Earth's gravitational potential field using transformer architectures. The geopotential function class ℱ_N with K = (N+1)² spherical harmonic coefficients is proven to be ICL-Easy: it admits an additive sufficient statistic computable by a single attention layer. We derive tight sample complexity bounds showing n_ICL = Θ(Kσ²/ε), and quantify the quantum advantage: for Heisenberg-limited quantum gravimeters, sample requirements reduce by a factor of 10¹² for realistic sensor parameters. We also establish ICL-Hardness for inverse problems via reduction to sparse parity, yielding a complete geodetic dichotomy theorem.
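The coefficient count and sample-complexity scaling in this abstract can be sketched numerically. The functions below are illustrative assumptions (the Θ-bound's constant is taken as 1, which the paper does not specify):

```python
# Back-of-envelope sketch of the abstract's scaling claims.
def num_coefficients(N: int) -> int:
    # A spherical harmonic expansion to degree N has K = (N+1)^2 coefficients.
    return (N + 1) ** 2

def n_icl(N: int, sigma2: float, eps: float) -> float:
    # Sample complexity n_ICL = Theta(K * sigma^2 / eps),
    # with the hidden constant set to 1 for illustration only.
    return num_coefficients(N) * sigma2 / eps

print(num_coefficients(60))  # 3721 coefficients at degree 60
```

Under this reading, a Heisenberg-limited gravimeter that reduces the noise variance σ² reduces n_ICL by the same factor, which is where the abstract's 10¹² figure enters.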

2026

In-Context Learning Characterization of Geo-Foundation Models

Mosab Hawarey

To Be Submitted To The AIR Journal of Mathematics & Computational Sciences

We develop a comprehensive theoretical framework for in-context learning in geo-foundation models, establishing dichotomy theorems that characterize when transformers can efficiently learn geospatial tasks. The framework applies SSC/AC-SSC analysis to distinguish tractable problems with additive sufficient statistics from intractable problems with combinatorial structure. We provide domain-specific sample complexity bounds and computational hardness results across the full spectrum of geospatial applications.

2026

In-Context Learning Characterization of Climate Foundation Models: When Can Transformers Learn Weather and Climate?

Mosab Hawarey

To Be Submitted To The AIR Journal of Mathematics & Computational Sciences

We establish the Climate Prediction Dichotomy Theorem, characterizing when transformer-based foundation models can efficiently perform in-context learning for weather and climate prediction tasks. We prove that spatiotemporal forecasting problems with smooth dynamics are ICL-Easy, achieving sample complexity Θ(d²σ²/ε) for d-dimensional systems, while extreme event prediction and regime transitions are ICL-Hard due to combinatorial sufficient statistics. The framework provides rigorous foundations for deploying climate foundation models in operational forecasting.

2026

In-Context Learning Characterization of Multi-Modal Geo-Foundation Models: When Can Vision-Language Transformers Learn Geospatial Tasks?

Mosab Hawarey

To Be Submitted To The AIR Journal of Mathematics & Computational Sciences

We develop a theoretical framework for multi-modal in-context learning in geo-foundation models that integrate optical imagery, SAR, hyperspectral data, and spatiotemporal information. The Remote Sensing Dichotomy Theorem establishes that continuous-valued estimation tasks (land cover mapping, change magnitude estimation) are ICL-Easy with sample complexity Θ(Kσ²/ε), while discrete classification problems with combinatorial label dependencies are ICL-Hard. We provide matching upper and lower bounds and characterize the role of cross-modal attention in achieving efficient learning.

2026

Chain-of-Thought In-Context Learning for GeoAI: Breaking the Hardness Barrier

Mosab Hawarey

To Be Submitted To The AIR Journal of Mathematics & Computational Sciences

We prove that chain-of-thought (CoT) reasoning can break the computational hardness barrier for ICL-Hard geospatial problems. For tasks with combinatorial sufficient statistics that are ICL-Hard in the standard setting, we show that CoT-augmented transformers can achieve polynomial sample complexity by decomposing complex inference into sequences of tractable sub-problems. We establish theoretical guarantees for CoT-ICL across climate prediction, remote sensing, and navigation domains, demonstrating exponential speedups over standard ICL for problems involving extreme events, discrete classifications, and fault detection.
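The decomposition idea can be made concrete with a toy sketch (hypothetical, not the paper's construction): computing a sparse parity over an index set S in one shot requires a combinatorial statistic, but emitted as a chain of intermediate partial parities it becomes a sequence of trivial two-bit sub-problems, sₜ = sₜ₋₁ XOR x_{iₜ}.

```python
# Toy illustration of CoT decomposition for sparse parity.
# Each intermediate step depends only on the previous partial
# parity and one input bit -- a tractable sub-problem.
def parity_chain(x, S):
    steps = []   # the "chain of thought": running partial parities
    s = 0
    for i in S:
        s ^= x[i]
        steps.append(s)
    return s, steps

bits = [1, 0, 1, 1, 0, 1]
value, trace = parity_chain(bits, S=[0, 2, 5])
print(value)  # 1  (1 XOR 1 XOR 1)
```

Each step in `trace` is individually easy to supervise, which mirrors the abstract's claim that CoT converts an ICL-Hard one-shot inference into polynomially many ICL-Easy ones.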

2026

In-Context Learning Characterization of Navigation and GNSS Foundation Models: A Theoretical Framework for Safety-Critical Positioning

Mosab Hawarey

To Be Submitted To The AIR Journal of Mathematics & Computational Sciences

We establish the Navigation Dichotomy Theorem for GNSS-based positioning systems, proving that continuous state estimation (position, velocity, clock bias) is ICL-Easy with sample complexity Θ(dσ²/ε), while discrete fault identification and integrity monitoring are ICL-Hard due to combinatorial alert logic. The framework provides theoretical foundations for deploying transformer-based foundation models in safety-critical navigation applications, with explicit characterization of when ICL can match Kalman filtering performance and when computational hardness barriers emerge.
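The contrast with Kalman filtering can be sketched in a hypothetical 1D example (not from the paper): estimating a static state such as a clock bias from noisy measurements. For a static state the Kalman estimate reduces to a recursively computed running mean, an additive statistic of exactly the kind the abstract calls ICL-Easy.

```python
# Hypothetical 1D sketch: a static state (e.g. a clock bias) observed
# through noisy measurements. With a diffuse prior, the Kalman estimate
# converges to the sample mean -- an additive sufficient statistic.
def kalman_static(measurements, r=1.0, p0=1e6):
    x, p = 0.0, p0               # state estimate and its variance
    for z in measurements:
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # measurement update
        p = (1 - k) * p          # variance update
    return x

est = kalman_static([5.1, 4.9, 5.0, 5.2, 4.8])  # close to the sample mean
```

Discrete fault identification has no such recursive scalar summary: deciding which subset of satellites is faulty depends jointly on combinations of residuals, which is the combinatorial structure behind the abstract's hardness claim.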