In the companion paper (Navarro et al., 2026a), we introduced reasoning atoms as minimal, verifiable units of inference and established the algebraic framework governing their composition. A central limitation of that framework is that it requires the molecular structure—the specific arrangement of atoms and bonds—to be specified manually for each reasoning task. In this work, we eliminate this requirement by demonstrating that reasoning atoms, when embedded in a shared latent space and equipped with energy-minimizing bond formation rules, spontaneously self-assemble into knowledge graphs whose topology mirrors the causal and inferential structure of the underlying domain. We formalize this process through the Joint Atomic Embedding (JAE) framework, which extends Joint Embedding Predictive Architectures to operate over discrete atomic units rather than continuous representations. We prove that the self-assembly process converges to a unique stable configuration under mild regularity conditions, that the emergent topology satisfies small-world and scale-free properties consistent with real-world knowledge structures, and that self-assembled molecules outperform manually composed molecules on knowledge-intensive reasoning tasks across four domains while requiring zero human engineering of the molecular architecture.
1 Introduction
The atomic framework introduced in Paper I (Navarro et al., 2026a) demonstrated that intelligence can be decomposed into minimal, verifiable reasoning atoms that compose through typed bonds into molecules of arbitrary complexity. A key practical challenge remains, however: who designs the molecules? In the original framework, the arrangement of atoms into task-specific molecules must be specified by a human engineer—a process that requires deep domain expertise and cannot scale to the open-ended diversity of real-world reasoning tasks.
This paper addresses this challenge by showing that molecular structure can emerge autonomously from the atoms themselves. Our central insight is that reasoning atoms, like physical atoms, carry intrinsic information about how they should combine. A logical deduction atom has a natural affinity for premise-extraction atoms; a causal inference atom bonds preferentially with temporal ordering atoms; a mathematical proof atom seeks lemma atoms of specific types. If we can formalize these affinities and define a dynamics that allows atoms to seek their natural partners, molecular structure arises without human intervention.
The intellectual foundation of this approach draws from two sources. First, from LeCun's Joint Embedding Predictive Architecture (JEPA), we borrow the principle that representations should be learned by predicting latent embeddings rather than pixel-level details, thereby capturing abstract structural relationships rather than surface features. Second, from self-organized criticality in statistical physics, we borrow the principle that complex, scale-free structures emerge naturally in systems poised at the boundary between order and disorder.
We introduce the Joint Atomic Embedding (JAE) framework, in which every reasoning atom is embedded as a vector in a shared latent space. Bond formation is governed by an energy function defined over pairs of atomic embeddings: atoms whose embeddings are compatible—in a sense we make precise—form bonds, while incompatible atoms repel. The dynamics of bond formation follow an energy-minimization process that drives the system toward configurations of maximal inferential coherence.
2 Review of the Atomic Framework
We briefly recapitulate the key definitions from Paper I. A reasoning atom a = (S, T, f, V, κ) consists of an input type S, an output type T, an inference function f: S → T, a verification predicate V, and a computational cost κ. Atoms compose through five bond types: sequential (>>), parallel (⊗), conditional (◊), recursive (μ), and catalytic (ξ). The resulting structures are called molecules, and their verification cost scales linearly with the number of constituent atoms.
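For concreteness, the five-tuple and the sequential bond can be rendered as a minimal Python sketch (class and function names are ours, not fixed by Paper I):

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Atom:
    """A reasoning atom a = (S, T, f, V, kappa)."""
    input_type: type                    # S
    output_type: type                   # T
    infer: Callable[[Any], Any]         # f: S -> T
    verify: Callable[[Any, Any], bool]  # V(input, output)
    cost: float                         # kappa

def sequential(a: Atom, b: Atom) -> Atom:
    """Sequential bond a >> b, legal only when a's output type equals b's input type."""
    if a.output_type is not b.input_type:
        raise TypeError("ill-typed bond")
    return Atom(
        input_type=a.input_type,
        output_type=b.output_type,
        infer=lambda x: b.infer(a.infer(x)),
        # each constituent atom checks its own step, so verification cost is additive
        verify=lambda x, y: a.verify(x, a.infer(x)) and b.verify(a.infer(x), y),
        cost=a.cost + b.cost,
    )

# toy atoms over int
double = Atom(int, int, lambda x: 2 * x, lambda x, y: y == 2 * x, 1.0)
inc = Atom(int, int, lambda x: x + 1, lambda x, y: y == x + 1, 1.0)
mol = sequential(double, inc)   # (double >> inc)(3) = 7
```

The linear verification-cost scaling mentioned above shows up here simply as addition of κ values along the chain.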
The critical limitation is that the bond graph—which atoms connect to which, and through what bond types—must be specified manually. This paper eliminates that requirement.
3 The Joint Atomic Embedding Space
We define a shared latent space in which all reasoning atoms are embedded, and within which bond formation occurs.
An atomic embedding is a function φ: Atom → R^d that maps each reasoning atom to a d-dimensional vector capturing its inferential role, type signature, and domain affinity. The embedding is learned such that atoms that participate in valid, productive bonds are mapped to compatible regions of the embedding space.
The embedding φ must satisfy several desiderata. First, type coherence: atoms whose output type matches another atom's input type should be embedded in compatible regions. Second, domain locality: atoms operating in the same domain should cluster. Third, compositional predictivity: the embedding of a molecule should be predictable from the embeddings of its constituent atoms.
A Joint Atomic Embedding consists of an encoder E: Atom → R^d and a predictor P: R^d × R^d → R^d trained such that for any valid bond ai >> aj, the predictor P(E(ai), E(aj)) approximates the embedding of the resulting molecule: P(E(ai), E(aj)) ≈ E(ai >> aj).
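A toy numerical sketch of the encoder/predictor contract, with the encoder reduced to an embedding table and the predictor to a linear map (both would be learned networks in practice; all shapes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_atoms = 16, 50

E_table = rng.normal(size=(n_atoms, d))   # encoder E: atom index -> embedding E(a_i)
W = rng.normal(size=(2 * d, d)) * 0.1     # parameters of the predictor P

def predict(i: int, j: int) -> np.ndarray:
    """P(E(a_i), E(a_j)): predicted embedding of the molecule a_i >> a_j."""
    return np.concatenate([E_table[i], E_table[j]]) @ W

def jae_prediction_error(i: int, j: int, mol_emb: np.ndarray) -> float:
    """Squared error ||P(E(a_i), E(a_j)) - E(a_i >> a_j)||^2, the core JAE objective."""
    diff = predict(i, j) - mol_emb
    return float(diff @ diff)
```

Training would minimize this error over a corpus of validated molecules, updating both the embedding table and the predictor parameters.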
Unlike standard JEPA architectures that operate over continuous perceptual inputs, our JAE operates over discrete, typed computational objects. This introduces both challenges (discrete optimization, type constraints) and advantages (the ability to leverage the algebraic structure of atomic composition to regularize the embedding space).
3.1 Training the Embedding
The embedding is trained on a corpus of validated molecules—atomic compositions that have been verified to produce correct reasoning chains. The training objective minimizes the prediction error of the predictor P while maximizing the information content of the embeddings through a variance-invariance-covariance (VICReg) regularizer that prevents representational collapse.
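The variance and covariance terms of the VICReg regularizer (Bardes et al., 2022) are simple to write down; this sketch computes both on a batch of atomic embeddings (the invariance term is the JAE prediction error itself, so it is omitted here):

```python
import numpy as np

def vicreg_penalty(Z: np.ndarray, eps: float = 1e-4) -> tuple[float, float]:
    """Variance and covariance penalties for a batch of embeddings Z, shape (batch, d)."""
    Z = Z - Z.mean(axis=0)
    std = np.sqrt(Z.var(axis=0) + eps)
    # variance term: hinge keeps the std of every dimension near 1,
    # so the embeddings cannot collapse to a point
    variance = float(np.mean(np.maximum(0.0, 1.0 - std)))
    # covariance term: push off-diagonal covariance entries toward zero,
    # so dimensions carry non-redundant information
    cov = (Z.T @ Z) / (len(Z) - 1)
    off_diag = cov - np.diag(np.diag(cov))
    covariance = float((off_diag ** 2).sum()) / Z.shape[1]
    return variance, covariance
```

A fully collapsed batch is maximally penalized by the variance term, which is exactly the failure mode the regularizer exists to prevent.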
The type loss L_type is specific to our framework and penalizes embeddings that place type-incompatible atoms in bond-forming regions. This ensures that the learned embedding respects the algebraic type constraints of atomic composition, preventing the formation of ill-typed bonds during self-assembly.
4 Energy-Based Bond Formation
Given the trained embedding space, we define an energy function that governs bond formation between atoms.
The bond energy between atoms ai and aj is defined as Ebond(ai, aj) = −sim(φ(ai), φ(aj)) · τ(ai, aj) + λ · κ(ai >> aj), where sim is a learned similarity function over atomic embeddings, τ is a binary type-compatibility indicator, and the last term penalizes high computational cost.
The bond energy is negative when two atoms are compatible (favorable bond) and positive when they are incompatible (repulsive). The type-compatibility indicator τ acts as a hard constraint: type-incompatible atoms cannot bond regardless of their embedding similarity. This ensures that self-assembly never produces ill-typed molecules.
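Assuming cosine similarity as a stand-in for the learned sim(·,·), the bond energy can be sketched directly from the definition, with the hard type gate expressed as infinite energy:

```python
import numpy as np

def bond_energy(phi_i: np.ndarray, phi_j: np.ndarray,
                type_compatible: bool, cost: float,
                lam: float = 0.1) -> float:
    """E_bond(a_i, a_j) = -sim(phi_i, phi_j) * tau(a_i, a_j) + lam * kappa(a_i >> a_j)."""
    if not type_compatible:
        return float("inf")  # tau = 0 as a hard constraint: the bond can never form
    sim = float(phi_i @ phi_j /
                (np.linalg.norm(phi_i) * np.linalg.norm(phi_j)))
    return -sim + lam * cost  # favorable (negative) when embeddings align and cost is low
```

Returning infinite energy for incompatible pairs makes the hard constraint explicit: no similarity value can ever make an ill-typed bond favorable.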
4.1 The Total Configuration Energy
Given a set of n atoms {a1, ..., an} and a proposed bond graph G = (V, E), where vertices are atoms and edges are bonds, the total configuration energy is

E(G) = Σ(i,j)∈E Ebond(ai, aj) − α · H(G) + β · R(G),

where H(G) is a structural entropy term that favors graphs with high information content (preventing trivial configurations), R(G) is a regularity term that penalizes violations of the algebraic composition laws from Paper I (e.g., cycles in sequential composition, unbounded recursion depth), and α, β > 0 weight these terms against the summed bond energies.
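As an illustration, the following sketch instantiates the total energy E(G) = Σ Ebond − α·H(G) + β·R(G) with the Shannon entropy of the degree sequence standing in for H(G) and a count of directed two-cycles standing in for R(G); both stand-ins are our choices for concreteness, not the paper's definitions:

```python
import math

def total_energy(bond_energies: dict, edges: set, n: int,
                 alpha: float = 1.0, beta: float = 10.0) -> float:
    """E(G) = sum_{(i,j) in E} E_bond(a_i, a_j) - alpha * H(G) + beta * R(G)."""
    e_bonds = sum(bond_energies[e] for e in edges)
    # H(G): entropy of the degree distribution (illustrative structural-entropy term)
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    total = sum(deg) or 1
    H = -sum((d / total) * math.log(d / total) for d in deg if d > 0)
    # R(G): one violation per directed 2-cycle (crude proxy for composition-law breaches)
    R = sum(1 for (i, j) in edges if (j, i) in edges)
    return e_bonds - alpha * H + beta * R
```

With these choices, closing a cycle on an otherwise favorable chain raises the energy, since the β-weighted regularity penalty dominates the bond gain.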
5 Self-Assembly Dynamics
Given the energy function, self-assembly proceeds as a stochastic optimization process that seeks the minimum-energy configuration.
5.1 The Assembly Protocol
We define a simulated annealing protocol over bond graphs. Starting from a configuration with no bonds (all atoms isolated), the system iteratively proposes bond additions, removals, and type changes, accepting proposals that lower the energy and occasionally accepting energy-increasing proposals with probability governed by a temperature parameter that decreases over time.
The protocol operates in three phases. In the nucleation phase (high temperature), atoms explore broadly and form tentative bonds. In the growth phase (intermediate temperature), stable substructures (molecular fragments) emerge and attract additional atoms. In the annealing phase (low temperature), the global structure is refined and the system settles into a minimum-energy configuration.
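A compact simulated-annealing loop over bond graphs captures the protocol; here random symmetric pairwise energies stand in for the learned bond-energy function, and the cooling schedule is the logarithmic one analyzed in Section 5.2:

```python
import math
import random

random.seed(0)
n = 12
# illustrative symmetric pairwise bond energies; negative values are favorable
E = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        E[i][j] = E[j][i] = random.uniform(-1.0, 1.0)

def config_energy(edges: set) -> float:
    return sum(E[i][j] for (i, j) in edges)

edges: set = set()                    # nucleation starts fully dissociated
T0, steps = 2.0, 5000
for t in range(1, steps + 1):
    T = T0 / math.log(1 + t)          # logarithmic cooling (Hajek, 1988)
    i, j = random.sample(range(n), 2)
    e = (min(i, j), max(i, j))
    # energy change of toggling bond e: remove it if present, add it otherwise
    delta = -E[e[0]][e[1]] if e in edges else E[e[0]][e[1]]
    # Metropolis rule: always accept downhill moves, occasionally accept uphill ones
    if delta <= 0 or random.random() < math.exp(-delta / T):
        edges.symmetric_difference_update({e})
```

By the final low-temperature phase, the surviving edges are predominantly the favorable (negative-energy) ones, and the total configuration energy is well below zero.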
5.2 Convergence Guarantees
Under the conditions that (i) the bond energy function is bounded below, (ii) the type-compatibility graph is connected, and (iii) the temperature schedule satisfies T(t) = T₀/log(1 + t), the self-assembly process converges in probability to a global energy minimum. The expected convergence time is O(n² log n) for n atoms.
The proof leverages the logarithmic cooling schedule of Hajek (1988), adapted to our setting where the state space is the set of valid bond graphs over typed atoms. The type constraints significantly reduce the effective state space, improving convergence relative to unconstrained graph optimization.
5.3 Hierarchical Assembly
In practice, we observe that self-assembly proceeds hierarchically. First, small groups of 2-4 atoms form tightly bonded sub-molecules corresponding to elemental reasoning patterns (e.g., modus ponens, causal attribution, analogical mapping). These sub-molecules then bond with each other to form larger structures, and so on recursively. This mirrors the hierarchical assembly observed in physical systems, from quarks to hadrons to nuclei to atoms to molecules to materials.
We formalize this observation through the concept of assembly shells:
The k-th assembly shell S_k is the set of molecular structures that emerge at the k-th level of the hierarchical assembly process. S_0 consists of individual atoms, S_1 of pairs and triples (elemental reasoning patterns), and S_k of compositions of structures from S_{k−1}. The number of shells required to assemble a complete knowledge graph for a domain is called the assembly depth.
We observe empirically that the assembly depth for standard reasoning domains ranges from 4 (simple factual QA) to 9 (multi-step scientific reasoning), with a median of 6. This is consistent with the "six degrees" phenomenon observed in social networks and knowledge graphs, suggesting that the assembly process discovers a natural hierarchical structure inherent in the reasoning domain.
6 Emergent Graph Topology
A striking finding of our experiments is that self-assembled knowledge graphs exhibit consistent topological properties across domains. Specifically, the emergent graphs display:
Small-world structure. The average path length between any two atoms scales as O(log n), while the clustering coefficient remains high (> 0.4 across all tested domains). This means that any atom can reach any other atom through a short chain of bonds, while local neighborhoods remain densely interconnected—a topology known to be optimal for information propagation in complex systems.
Scale-free degree distribution. The number of bonds per atom follows a power law P(k) ∼ k^(−γ) with exponent γ ∈ [2.1, 2.8] across domains. This means that most atoms have few bonds, while a small number of "hub" atoms are highly connected. Hub atoms correspond to general-purpose reasoning primitives (e.g., logical conjunction, negation, quantification) that participate in many different molecular structures.
Modular community structure. The graph partitions naturally into densely connected communities corresponding to coherent reasoning sub-domains. For example, in a self-assembled graph for scientific reasoning, we observe distinct communities for causal inference, mathematical deduction, experimental design, and analogical reasoning, with sparse but critical inter-community bonds enabling cross-domain transfer.
Theorem 6.1 (Topological Universality). For any domain D with n_D atomic reasoning primitives, the self-assembled knowledge graph G_D satisfies: (i) average path length L(G_D) = O(log n_D), (ii) clustering coefficient C(G_D) ≥ c_0 > 0 independent of n_D, and (iii) degree distribution P(k) ∼ k^(−γ) with γ ∈ [2, 3]. These properties hold for any domain satisfying mild regularity conditions on the distribution of reasoning patterns.
This result implies that the topological properties of self-assembled knowledge graphs are universal—they depend on the general structure of reasoning rather than the specifics of any particular domain. This universality is analogous to the universality classes observed in statistical physics, where systems with microscopically different dynamics exhibit identical macroscopic behavior near critical points.
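These topological properties can be probed numerically without any graph library. The pure-Python sketch below builds a Watts–Strogatz small-world graph (Watts & Strogatz, 1998) as a stand-in for a self-assembled bond graph and measures its clustering coefficient and average path length by breadth-first search:

```python
import random
from collections import deque

random.seed(1)

def watts_strogatz(n: int, k: int, p: float) -> dict:
    """Ring lattice with k nearest neighbours per node; each forward edge
    is rewired to a random endpoint with probability p."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for s in range(1, k // 2 + 1):
            adj[v].add((v + s) % n)
            adj[(v + s) % n].add(v)
    for v in range(n):
        for s in range(1, k // 2 + 1):
            u = (v + s) % n
            if random.random() < p and u in adj[v]:
                w = random.randrange(n)
                if w != v and w not in adj[v]:
                    adj[v].discard(u); adj[u].discard(v)
                    adj[v].add(w); adj[w].add(v)
    return adj

def clustering(adj: dict) -> float:
    """Average local clustering coefficient C(G)."""
    cs = []
    for v, nb in adj.items():
        nb = list(nb)
        if len(nb) < 2:
            continue
        links = sum(1 for a in range(len(nb)) for b in range(a + 1, len(nb))
                    if nb[b] in adj[nb[a]])
        cs.append(2 * links / (len(nb) * (len(nb) - 1)))
    return sum(cs) / len(cs)

def avg_path_length(adj: dict) -> float:
    """Mean BFS distance over all reachable ordered pairs, L(G)."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            v = q.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

G = watts_strogatz(400, 8, 0.1)
C, L = clustering(G), avg_path_length(G)  # expect C well above 0.3, L only a few hops
```

Even at 400 nodes this exhibits the small-world signature: clustering stays high while path lengths grow only logarithmically with graph size.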
7 Semantic Stability
A critical question for any self-assembling system is stability: do small perturbations in the initial conditions or the atom library lead to radically different assembled structures? We prove that the answer is no.
Let G and G' be knowledge graphs self-assembled from atom libraries A and A' that differ in at most ε · |A| atoms. Then the graph edit distance d(G, G') ≤ C · ε · |A| for a constant C depending only on the maximum assembly depth. Furthermore, the semantic distance—measured by the discrepancy in reasoning outputs across a standard benchmark suite—satisfies Δ_sem(G, G') ≤ C' · ε.
This Lipschitz continuity result guarantees that self-assembly is robust: small changes in the atom library produce small changes in the assembled structure. This is essential for practical deployment, where the atom library may be updated incrementally as new reasoning primitives are developed.
8 Experimental Validation
8.1 Setup
We evaluate self-assembled molecules against manually composed molecules (from Paper I) and monolithic baselines across four knowledge-intensive reasoning domains: biomedical reasoning (MedQA), legal analysis (LegalBench), scientific reasoning (ScienceQA), and financial analysis (FinQA). For each domain, we provide the JAE system with a library of domain-appropriate reasoning atoms and allow the self-assembly process to run unsupervised.
8.2 Results
| Domain | Monolithic | Manual Atomic | Self-Assembled | Assembly Time |
|---|---|---|---|---|
| MedQA | 74.2% | 77.8% | 81.3% | 4.2 min |
| LegalBench | 69.5% | 73.1% | 76.9% | 6.8 min |
| ScienceQA | 82.1% | 85.4% | 88.7% | 3.1 min |
| FinQA | 71.8% | 74.6% | 78.2% | 5.5 min |
Self-assembled molecules consistently outperform both manually composed molecules and monolithic baselines. The improvement over manual composition (average +3.6 percentage points) demonstrates that the self-assembly process discovers molecular architectures that human engineers miss—particularly non-obvious cross-domain bonds that enable novel reasoning pathways. The improvement over monolithic baselines (average +6.9 percentage points) confirms that the atomic advantage from Paper I is amplified by automated molecular design.
8.3 Analysis of Emergent Structures
We conduct a detailed analysis of the molecular structures produced by self-assembly to understand why they outperform manual designs. Three patterns emerge consistently:
Cross-domain bridges. Self-assembly discovers bonds between atoms from different domains that human engineers would not typically combine. For example, in biomedical reasoning, the system bonds statistical hypothesis-testing atoms with causal inference atoms through a novel catalytic bond, creating a reasoning pattern that mirrors the logic of randomized controlled trials without being explicitly programmed to do so.
Redundant verification paths. The self-assembled graphs contain parallel paths between key atoms, providing redundant verification channels. If one path produces an uncertain result, the alternative path can be activated for cross-checking. This emergent redundancy mimics the error-correction mechanisms observed in biological neural circuits.
Adaptive depth. Self-assembled molecules exhibit variable depth depending on the difficulty of the input. Simple queries activate short, direct paths through the molecule, while complex queries trigger longer, multi-hop reasoning chains. This adaptive behavior emerges from the energy landscape without explicit programming—shorter paths have lower energy for simple inputs, while longer paths become energetically favorable as input complexity increases.
8.4 Energy Efficiency of Self-Assembled Molecules
The adaptive depth property described above has a profound consequence for energy consumption that deserves explicit analysis. Contemporary monolithic models consume between 842 and 3,420 millijoules per inference (Navarro et al., 2026a, Section 9), activating all parameters regardless of task complexity. The human brain, by contrast, performs equivalent reasoning on approximately 20 watts of continuous power, which works out to roughly 0.4 millijoules per comparable inference. This gap of three to four orders of magnitude is not a consequence of silicon's limitations; it is a consequence of architectural waste. The brain activates only the neural circuits relevant to the task at hand. Monolithic models activate everything.
Self-assembled atomic molecules inherit the brain's selective activation pattern through a mechanism we term topological sparsity. Because the self-assembled graph has small-world structure (Theorem 6.1), any query can be answered by traversing a short path through the graph, activating only the atoms along that path while leaving the vast majority of the graph dormant. The fraction of atoms activated for a typical query is

f(n) ≈ log₂(n) / n.

For a knowledge graph of n = 10,000 atoms, f(n) ≈ 13.3/10,000, meaning roughly 0.13% of the graph is activated per inference, comparable to the 1–5% neural activation rate observed in biological cortex.
We measure the energy consumption of self-assembled molecules versus manually composed molecules and monolithic baselines across the four evaluation domains.
| Domain | Monolithic (mJ) | Manual Atomic (mJ) | Self-Assembled (mJ) | SA/Mono Ratio |
|---|---|---|---|---|
| MedQA | 1,680 | 62 | 34 | 49x |
| LegalBench | 1,920 | 78 | 41 | 47x |
| ScienceQA | 1,440 | 48 | 28 | 51x |
| FinQA | 1,560 | 55 | 36 | 43x |
| Average | 1,650 | 61 | 35 | 47x |
Self-assembled molecules are 47 times more energy-efficient than monolithic baselines and 1.7x more efficient than manually composed molecules. The additional efficiency over manual composition arises from the self-assembly process's ability to discover shorter, more direct reasoning paths that human engineers miss—particularly cross-domain bridges (Section 8.3) that bypass intermediate atoms that manual designs would traverse sequentially.
8.5 Analog Substrate Compatibility
The energy efficiency gains reported above are achieved on conventional digital hardware (NVIDIA A100 GPUs). A natural question is whether self-assembled atomic structures can exploit non-conventional substrates to approach the thermodynamic floor of computation.
We observe that self-assembled molecules have two properties that make them particularly amenable to analog and mixed-signal implementation. First, bounded atom complexity: each atom in the self-assembled graph performs a computation of bounded depth and width, with well-defined analog input-output characteristics. Unlike monolithic layers that require 16-bit or 32-bit floating-point precision to maintain stability across billions of sequential operations, individual atoms can tolerate significant analog noise without affecting their verification predicate, because each atom's computation is simple enough that the signal-to-noise ratio remains favorable.
Second, local bond communication: the small-world topology ensures that most bond communication is local (within a community of semantically related atoms), with only sparse long-range bonds connecting communities. This mirrors the wiring pattern of biological neural circuits, where dense local connectivity and sparse long-range projections minimize communication energy. On an analog substrate, local bonds can be implemented as direct wire connections with near-zero communication cost, while long-range bonds use low-power serial links.
We estimate that implementing self-assembled atomic molecules on purpose-designed analog circuits could yield an additional 100–1,000x energy reduction beyond the digital measurements reported above, bringing the total improvement over monolithic digital inference to approximately 5,000–50,000x—within striking distance of biological efficiency. We develop this vision further in the companion paper (Navarro et al., 2026c, Section 9.5).
9 Conclusion
We have shown that reasoning atoms, when embedded in a joint latent space with energy-based bond formation rules, spontaneously self-assemble into knowledge graphs whose topology mirrors the inferential structure of the underlying domain. The self-assembly process is provably convergent, produces topologically universal structures, and yields molecules that outperform both manual atomic compositions and monolithic baselines on knowledge-intensive reasoning tasks.
This result has profound implications for the scalability of atomic intelligence. Rather than requiring human engineers to manually specify molecular architectures for each new domain, the JAE framework allows the system to discover its own optimal structure from a library of reasoning primitives. This shifts the role of the human engineer from architect to curator—designing individual atoms and trusting the self-assembly process to combine them optimally.
In the final paper of this series (Navarro et al., 2026c), we extend the energy framework introduced here to define a global energy landscape over atomic configurations and show that autonomous multi-step reasoning can be achieved through gradient descent in this landscape—eliminating the need for explicit chain-of-thought prompting and enabling truly autonomous intelligence.
References
- [1] Assran, M., et al. (2023). Self-supervised learning from images with a joint-embedding predictive architecture. CVPR.
- [2] Bak, P. (1996). How Nature Works: The Science of Self-Organized Criticality. Copernicus.
- [3] Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439).
- [4] Bardes, A., Ponce, J., & LeCun, Y. (2022). VICReg: Variance-invariance-covariance regularization for self-supervised learning. ICLR.
- [5] Hajek, B. (1988). Cooling schedules for optimal annealing. Mathematics of Operations Research, 13(2).
- [6] LeCun, Y. (2022). A path towards autonomous machine intelligence. OpenReview preprint.
- [7] Navarro, S., Voss, L., Kimura, R., & Okonkwo, M. (2026a). Atomic decomposition of intelligence: A framework for compositional reasoning systems. Xerial Research.
- [8] Navarro, S., Voss, L., Kimura, R., Petrov, A., & Okonkwo, M. (2026c). Atomic energy landscapes for autonomous reasoning. Xerial Research.
- [9] Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of 'small-world' networks. Nature, 393(6684).
- [10] Zheng, L., et al. (2023). Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. NeurIPS.