Originally published: August 15, 2025
This article looks at why the relational model remains relevant as AI systems increasingly generate and execute queries. The immediate question is whether the rise of LLM-driven natural language interfaces makes SQL, or the relational model beneath it, less important. My argument is the opposite: if AI is going to operate reliably over data, the formal structure of the relational model matters more, not less.
Because the relational model is often conflated with SQL products, this discussion goes back to fundamentals. It focuses on Codd's original design choices, the later refinements by Chris Date and Hugh Darwen, and the question of what kind of relational interface best supports AI-era systems. For a companion discussion of SQL's implementation limits versus the formal model, see Knowledge Engineering Beyond SQL. For the broader distinction between information, semantics, and knowledge, see What I Mean by Knowledge, Information, and Semantics.
Codd's design was not just about storing data. It was about building a mathematically disciplined information system. His use of first-order logic (FOL) and set theory gave queries a formal basis, made optimization possible, and avoided the navigational complexity of older database models. Treating relations as sets of tuples also gives the model closure: operations on relations yield relations, which makes complex queries easier to build and reason about compositionally.
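The closure property is easy to see concretely. Here is a minimal sketch, not a real DBMS, that models relations as Python frozensets of tuples with hypothetical example data; because every operator returns a relation, the operators compose freely:

```python
# Relations as sets of tuples; every operator returns a relation (closure).

def select(relation, predicate):
    """sigma: keep the tuples satisfying the predicate; result is a relation."""
    return frozenset(t for t in relation if predicate(t))

def project(relation, indices):
    """pi: keep only the listed attribute positions; result is a relation."""
    return frozenset(tuple(t[i] for i in indices) for t in relation)

# Hypothetical employees(name, dept, salary) relation.
employees = frozenset({
    ("ada", "eng", 120),
    ("grace", "eng", 130),
    ("alan", "math", 110),
})

# Closure lets operations nest: the output of select feeds project directly.
eng_names = project(select(employees, lambda t: t[1] == "eng"), [0])
print(eng_names)  # frozenset({('ada',), ('grace',)})
```

The point is structural, not the toy implementation: since selection's output is the same kind of object as its input, arbitrarily deep compositions stay well defined and can be reasoned about algebraically.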
Logical-physical independence is equally important. It separates what is being queried from how the system stores and retrieves it. That separation lets implementers improve storage and indexing without rewriting application logic, and it gives optimizers room to transform queries formally. As AI systems grow more complex, that separation between intent and execution becomes more valuable, not less.
When designing the relational model, Codd carefully considered various logical systems, including second-order logic. He ultimately chose first-order logic for practical and computational reasons. Validity in second-order logic is not even semi-decidable: there is no complete proof system for it, and no general algorithm can determine whether a second-order statement is valid or guarantee that evaluating a second-order query will terminate. For building automated database systems, this was a critical barrier: query processors need to reliably terminate and must allow algorithmic reasoning about query equivalence and optimization.
Beyond decidability, second-order logic presents serious computational complexity. Even its decidable fragments often demand exponential time or space, making them impractical for the technology of the 1970s and still problematic at scale today. Codd needed a model that could be implemented efficiently. FOL mapped naturally to computable operations: relations align with finite sets, quantifiers with loops, and predicates with computable functions. That choice favored computational tractability over maximum expressive power, and the same trade-off still matters for AI systems whose generated queries must execute reliably in production.
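That mapping from FOL to computation can be sketched directly, again with toy data rather than a real engine: over finite relations, predicates become ordinary functions and the quantifiers become bounded loops, so evaluation always terminates:

```python
# Over finite relations, FOL evaluates by bounded iteration:
# predicates are computable functions, quantifiers are loops (any/all).

employees = {("ada", "eng", 120), ("grace", "eng", 130), ("alan", "math", 110)}

def is_eng(t):
    """A predicate as a computable function over tuples."""
    return t[1] == "eng"

# Existential quantifier: "there exists an employee in math".
exists_math = any(t[1] == "math" for t in employees)

# Universal quantifier: "every engineer earns more than 100".
all_eng_over_100 = all(t[2] > 100 for t in employees if is_eng(t))

print(exists_math, all_eng_over_100)  # True True
```

Because the domain is a finite set of tuples, both loops inspect finitely many elements, which is exactly why safe relational queries are guaranteed to terminate.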
Note: Full first-order logic is only semi-decidable: a valid theorem can be confirmed, but invalidity cannot always be established in finite time. The relational model avoids this by restricting itself to relational algebra and (safe) relational calculus, whose queries over finite relations are decidable and always terminate.
The relational model's formal foundations confer distinct advantages, especially as AI systems become more sophisticated and data-intensive.
One key benefit is declarative reasoning. The ability to express what relationships should exist without specifying how to find them creates a clear separation between intent and computation. Unlike procedural or navigational models, the relational model lets systems reason about data relationships mathematically, test query equivalences, infer constraints, and optimize access patterns automatically.
The compositional closure property, where every operation produces a result of the same type (a relation), is another critical advantage. Automated reasoning systems can build complex queries from simple, well-defined parts and transform them algebraically. Many NoSQL systems, in contrast, lack this mathematical closure; their operations often yield varying data types or need external coordination, complicating automated processing.
Furthermore, the model's formal optimization theory is well suited to AI. Query optimization in relational databases is not just heuristic; it is grounded in mathematics. Cost-based optimizers can analyze equivalent expressions and select an efficient execution strategy. That is much harder to achieve in models without a strong algebraic foundation.
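One such algebraic identity, selection pushdown, can be checked on toy data. The sketch below (assumed example relations, joining on the first attribute) demonstrates the rewrite a cost-based optimizer can prove and exploit: when a predicate touches only one side of a join, filtering before the join yields the same relation as filtering after it, usually at lower cost:

```python
from itertools import product

def join(r, s):
    """Toy join on the first attribute of each relation."""
    return frozenset(a + b[1:] for a, b in product(r, s) if a[0] == b[0])

def select(rel, pred):
    return frozenset(t for t in rel if pred(t))

orders = frozenset({(1, "2025-01-02"), (2, "2025-03-14")})  # (id, date)
items  = frozenset({(1, "book"), (1, "pen"), (2, "lamp")})  # (order_id, sku)

pred = lambda t: t[0] == 1  # references only the join key

# Algebraic equivalence: sigma_p(orders JOIN items) == sigma_p(orders) JOIN items
late  = select(join(orders, items), pred)   # filter after the join
early = join(select(orders, pred), items)   # push the filter below the join
print(late == early)  # True
```

A real optimizer enumerates many such equivalent forms and costs them; the guarantee that the forms are interchangeable comes from the algebra, not from heuristics.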
Crucially, the relational model can serve as a mathematically rigorous intermediate representation for AI systems. When a Large Language Model (LLM) translates a natural language request into a database query, the semantic clarity of the target matters. Translating natural language into an FOL- and set-based language such as relational algebra is more reliable than translating it into a loosely structured navigational query language. The relational model's compositional semantics and support for equivalence testing make it a strong semantic compilation target for both humans and machines.
While Codd set the stage, Chris Date and Hugh Darwen's "Third Manifesto" offers important refinements. It directly addresses SQL's deviations from Codd's original vision and sharpens the relational model for systems that depend on semantic consistency.
A core contribution is their insistence on a proper type system. They argue that types are not mere implementation details; they are logical constructs needed for formal reasoning. Their distinction between scalar and nonscalar types, together with proper type inheritance, builds a clearer mathematical foundation. This matters in part because it avoids semantic ambiguities that plague SQL, especially its handling of `NULL` values and three-valued logic.
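The `NULL` ambiguity is observable in any SQL engine. A small demonstration using Python's built-in sqlite3 module: under three-valued logic, `x = NULL` and `x <> NULL` both evaluate to UNKNOWN, so neither predicate ever matches the row holding the missing value:

```python
import sqlite3

# SQL's three-valued logic: comparisons with NULL yield UNKNOWN,
# so a predicate and its apparent negation can both return no rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(1,), (None,)])

eq      = con.execute("SELECT COUNT(*) FROM t WHERE x = NULL").fetchone()[0]
neq     = con.execute("SELECT COUNT(*) FROM t WHERE x <> NULL").fetchone()[0]
is_null = con.execute("SELECT COUNT(*) FROM t WHERE x IS NULL").fetchone()[0]

print(eq, neq, is_null)  # 0 0 1 -- only IS NULL finds the missing value
```

This is precisely the kind of semantic special case that Date and Darwen argue a proper type system should rule out, and that complicates formal reasoning for automated query generators.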
Date and Darwen also emphasize true relational closure and orthogonality. They advocate a model in which every operator strictly produces a relation, avoiding SQL's many special cases and inconsistent return types. That kind of predictability matters for any system that needs to compose operations reliably, including AI systems that build or transform queries automatically.
Beyond that, the Manifesto extends logical-physical independence to data modification, treating assignment as a logical operation. This creates a cleaner basis for reasoning about state changes, which matters as systems increasingly need to interact with dynamic data, temporal relationships, and causal effects. Their push for languages like Tutorial D, which better reflect relational principles, also highlights how far SQL has drifted from the formal model.
These extensions, including Relation-Valued Attributes (RVAs), are also argued to be consistent with FOL. Date and Darwen treat relation types as first-class citizens within an enriched type system. On that view, the logical operations remain first-order; they quantify over a richer domain that includes relations as values. The important point for this article is that added expressiveness does not necessarily require giving up decidability or computational tractability.
Despite the relational model's theoretical elegance, SQL, its most common implementation surface, has practical limits, especially around composability. Queries can be verbose and awkward to build into layered rule systems. As Michael Stonebraker points out, successful database innovations often re-enter the SQL ecosystem. That shows SQL's market pull, but it also raises a question for AI: will AI systems remain constrained by SQL's interface, or will they push more expressive relational layers into wider use?
A useful architectural pattern addresses SQL's weaknesses while keeping the relational model's benefits: the "bridge approach." The idea is to create higher-level abstractions that compile down to SQL, preserving decades of optimization work while offering a cleaner logical interface.
Logica, a declarative logic programming language from Google, is a good example. As part of the Datalog family, Logica extends classical logic programming with features like aggregation and compiles queries to SQL. That gives developers a more compositional rule language while still using mature SQL engines for execution. In combination with systems like DuckDB, this makes the bridge approach especially attractive for analytical and embedded settings.
This bridge pattern suggests a plausible evolution for relational systems in the AI era. Instead of abandoning the relational foundation, developers can build mathematically disciplined languages that offer better compositional properties and cleaner semantic interfaces while still relying on SQL execution infrastructure. The main value is not that any one language, whether Logica, Tutorial D, Malloy, or PRQL, is the final answer. It is that their underlying ideas can improve how relational systems expose logic, composition, and intent.
The relational model will remain relevant in an AI-driven future because it can provide a mathematically rigorous intermediate representation. When AI models, especially LLMs, handle data, they often translate natural language into queries. The relational model's clarity and formal properties, rooted in first-order logic and set theory, make it a strong target for those translations. This also complements the argument in What Does an Ontology Actually Give You?: conceptual clarity and formal structure remain valuable even when interfaces become more natural-language-driven.
Consider the challenge of training AI systems to generate queries for new or specialized functions, including ones not yet fully standardized. One response is to rely on the relational model's mathematical basis. Instead of training LLMs only on examples of a particular query language, they can map natural language into more general mathematical expressions, such as set operations and logical quantifiers. That intermediate layer can then be translated into relational constructs while preserving logical consistency and supporting optimization.
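As a hedged sketch of that intermediate layer (toy data, not an actual LLM pipeline): a request like "which suppliers supply every part?" maps first to a quantified set expression, the classical relational division pattern, which can then be lowered to relational operators or SQL while preserving its logic:

```python
# Intermediate form: { s | forall p in parts: (s, p) in supplies }
# i.e. relational division, expressed directly as set operations.

supplies = {("s1", "bolt"), ("s1", "nut"), ("s2", "bolt")}  # (supplier, part)
parts = {"bolt", "nut"}

suppliers = {s for s, _ in supplies}
answer = {s for s in suppliers
          if all((s, p) in supplies for p in parts)}  # universal quantifier
print(answer)  # {'s1'}
```

The quantified expression, not any particular SQL dialect, carries the meaning; the same intermediate form could be compiled to `NOT EXISTS` subqueries, division operators, or Datalog rules.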
This approach also extends to pure functions, deterministic mappings from input to output without side effects. An AI image-recognition function such as `ImageClassifier(image, criteria) -> label` can be treated relationally as part of a larger query pipeline. That allows AI-native operations to combine with relational queries while still benefiting from composition and optimization.
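A minimal sketch of that idea, with a hypothetical stand-in classifier rather than a real model: extending a relation with a pure function's output turns the function into just another attribute, which downstream relational operators can then filter and join:

```python
# A pure function treated relationally: extend each tuple with the
# function's output, then select on it within one pipeline.
# image_classifier is a hypothetical stand-in for a real model.

def image_classifier(image):
    """Deterministic stand-in: classify by filename suffix."""
    return "cat" if image.endswith("_cat.png") else "other"

images = {("img1_cat.png",), ("img2_dog.png",)}  # unary relation of images

labeled = {(img, image_classifier(img)) for (img,) in images}  # extend
cats = {t for t in labeled if t[1] == "cat"}                   # select
print(cats)  # {('img1_cat.png', 'cat')}
```

Because the function is deterministic and side-effect free, an optimizer is free to reorder or cache its evaluation exactly as it would for any other relational expression.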
Ultimately, Codd's vision, refined by Date and Darwen, remains a good guide for navigating data complexity in the AI age. SQL will likely remain dominant for some time, but the underlying relational principles may increasingly shape the next layer of interfaces built for AI-generated and AI-assisted computation. In that sense, the relational model may become more important because of AI, not less.