Why Definitions Matter

Most AI projects fail not because the technology does not work, but because no one agreed on what the system was supposed to mean. Teams build pipelines, train models, and deploy agents while quietly disagreeing about what a "customer" is, what "current" means, or whether a transaction that was reversed still counts. The system runs. The results are wrong. And no one can explain why, because the meaning was never made explicit.

This article offers working definitions of three terms that get conflated constantly in enterprise AI: data, information, and knowledge. More importantly, it explains what I mean by semantics, because that word in particular has been stretched to cover so many things that it has stopped being useful.

These definitions draw on three sources: Charles Sanders Peirce's semiotics, Ludwig Wittgenstein's philosophy of language, and Ronald Stamper's semiotic ladder as adapted by Jan Dietz in his work on Enterprise Ontology. The goal is not to settle philosophical debates. The goal is to give architects, AI designers, and product teams a vocabulary precise enough to build on.

These are working definitions. I find them useful for thinking clearly about information and knowledge-based systems.

The Semiotic Ladder: How Meaning Emerges in Layers

Ronald Stamper developed the semiotic ladder as part of his work on Organizational Semiotics, extending the classical syntax/semantics/pragmatics division from Charles Morris. Jan Dietz further adapted the ladder for Enterprise Ontology and DEMO, adding the physical and social layers. It offers a useful way to understand how meaning builds up in layers: each level depends on the one below it and contributes something the lower levels simply do not have. If you try to skip levels, you end up assuming things that were never actually provided.

Physical: The material substrate. Electrical impulses, sound waves, marks on paper. No informational structure and no meaning, just physical substance and causal interaction.
Empiric: Encoding appears. How are signals expressed? Roman letters, Morse code, binary digits. The patterns used to represent signals, independent of what they mean.
Syntactic: Structure and rules emerge. Grammar, protocols, data formats. We can distinguish well-formed expressions from malformed ones, but still have no meaning, only formalism.
Semantic: Meaning enters. Signs become associated with concepts. Symbols refer to things, whether real or modeled. This is where most discussions of "semantic technology" claim to operate.
Pragmatic: Intent appears. Why is this being said? What is the purpose? What action or commitment does it imply? Meaning becomes situated in goals and contexts.
Social: Shared conventions, norms, roles, and mutual commitments. Meaning is now embedded in a community of practice, not just an individual interpretation.

Dietz groups these levels into three categories. The lower levels (empirics and syntax) concern form: how signs are encoded and structured. The middle levels (semantics and pragmatics) concern content: what signs mean and why they are used. The top level (social) concerns commitment: the shared agreements that make communication binding. Dietz calls these forma, informa, and performa.
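The progression up the ladder can be sketched in code. The following is a minimal, illustrative Python example; all names, the toy schema, and the threshold are my own assumptions, not part of any standard:

```python
import json

# Physical/empiric: raw bytes on the wire. No structure, no meaning yet.
raw = b'{"type": "order", "amount": 250}'

# Empiric -> syntactic: decode the encoding, then check well-formedness.
text = raw.decode("utf-8")   # an encoding convention (empirics)
message = json.loads(text)   # a grammar: well-formed JSON (syntax)

# Syntactic -> semantic: map the symbols onto agreed concepts.
# The schema is the conceptual commitment; without it, "amount" is
# just a dictionary key, not a monetary value in a known currency.
schema = {"order": {"amount": "monetary value in EUR cents"}}
concept = schema[message["type"]]

# Semantic -> pragmatic: intent. The same meaning serves a purpose.
def should_flag_for_review(msg):
    """Pragmatics: the message is read *in order to* enforce a policy."""
    return msg["amount"] > 10000

print(concept["amount"])                # what the sign means
print(should_flag_for_review(message))  # what we do with that meaning
```

The social level does not appear in the code at all, which is exactly the point: the agreement that "amount" means EUR cents lives in the community that maintains the schema, not in the program.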

This ladder makes something clear: you cannot reach knowledge by operating only at the semantic level. Semantics gives you meaning, but meaning without intent and social context is inert. It does not act. It does not commit. It does not know.

This is why simplistic equations like "knowledge = data + semantics" fail. They skip levels. I will propose an alternative formulation after explaining what I mean by semantics.

Data: Signals Before Interpretation

Data sits at the bottom of our ladder. It is raw symbols without structured meaning: the bytes, the signals, the uninterpreted marks.

Data is necessary but not interesting on its own. It becomes something more only when we impose structure (moving to syntax) and interpretation (moving to semantics). The mistake many systems make is treating data as if it were already meaningful, as if collecting and storing something automatically makes it significant.

Data is what you have before you have made any conceptual commitments. It is the material from which meaning will be constructed, not meaning itself.
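The point can be made concrete with a small sketch. The same eight bytes become different "facts" depending on which interpretation we impose; the bytes themselves commit to none of them:

```python
raw = b"19900115"

# Without a conceptual commitment these bytes are just data.
# Each line below imposes a different schema on the same bytes.
as_integer = int(raw)                     # a number
as_date = (raw[0:4], raw[4:6], raw[6:8])  # year, month, day fields
as_text = raw.decode("ascii")             # a string of digits

print(as_integer, as_date, as_text)
```

Nothing in the data picks out one reading as correct. That choice is made by the interpretation we bring to it, which is precisely the move from data toward semantics.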

Semantics: Conceptual Distinctions, Not Closed Truth

Here is where clarity matters most.

Semantics is the disciplined articulation of conceptual distinctions, expressed in a conceptual schema, that supports shared reference, reasoning, and the possibility of coordinated action. Its validity emerges through use, correction, and contextual interpretation, not through prior closure.

Two philosophical perspectives inform this definition.

Peirce: Meaning Is Triadic

Charles Sanders Peirce argued that meaning is not a simple two-way relationship between a word and a thing. It is triadic:

  1. The sign (or representamen): the form the expression takes.
  2. The object: what the sign refers to.
  3. The interpretant: the understanding or effect the sign produces in an interpreter.

The interpretant is the key element. Meaning does not exist in the sign alone, nor in the object alone. Meaning exists in the relationship between sign, object, and interpretation. Without an interpreter, human or otherwise, there is no meaning.

This has direct consequences for so-called "semantic systems." An OWL ontology or RDF graph contains signs and structural relationships. But the interpretant, the meaning, arises only when something interprets those signs in a context of use. The system itself does not "understand" its contents any more than a dictionary understands the words it contains.
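A toy example makes this concrete. The triples below are illustrative (the `ex:` terms are invented, not from any real ontology); the graph stores signs and relations, but the interpretant is supplied by the code that consumes them:

```python
# A toy triple store: signs and structural relations, nothing more.
triples = [
    ("ex:order42", "ex:hasStatus", "ex:Reversed"),
    ("ex:Reversed", "rdf:type", "ex:TransactionStatus"),
]

# The graph itself does not decide what "Reversed" means for revenue.
# Two consumers supply two different interpretants for the same sign:

def counts_toward_revenue_finance(status):
    # Finance: a reversed transaction is excluded from revenue.
    return status != "ex:Reversed"

def counts_toward_revenue_fraud(status):
    # Fraud analytics: reversed transactions are exactly the interesting ones.
    return True

status = next(o for s, p, o in triples
              if s == "ex:order42" and p == "ex:hasStatus")
print(counts_toward_revenue_finance(status))  # False
print(counts_toward_revenue_fraud(status))    # True
```

Both consumers read the identical sign; the meaning-in-use diverges because the interpretant differs. The triple store is agnostic about which is right.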

Peirce also insisted that referents can be "real or modeled." Not everything a sign refers to must exist in the physical world. Some referents exist because we model them: legal entities, social roles, abstract categories. Their meaning stabilizes through use, not through correspondence to physical objects.

This is why I prefer the term conceptual schema over ontology. Ontology, in its philosophical sense, asks "what exists?" and tends toward fixed, truth-asserting commitments. A conceptual schema asks "what distinctions matter for our purposes?" and allows for provisional, revisable, purpose-relative modeling.

Wittgenstein: Meaning Is Use

Ludwig Wittgenstein argued that, for most cases, the meaning of a word is best understood as its use in the language. Words do not carry meaning like containers. They acquire meaning through participation in structured activities, what he called "language games."

This has practical implications. Meaning is not fixed at design time. A conceptual schema is not validated when you draw it or formalize it. It is validated when:

  1. People actually use its terms to communicate and coordinate.
  2. Misunderstandings surface and the schema is corrected in response.
  3. Its distinctions hold up under new cases the modelers did not anticipate.
Semantics is therefore intended at modeling time, where we aim for certain meanings, but tested at runtime, where we discover whether those meanings actually hold up in practice. This directly contradicts any demand for "deterministic semantic termination," the idea that meaning should be closed, complete, and self-executing.

This does not deny the value of formal semantics. It situates formal semantics within practice rather than above it.

What Semantics Is Not

Given this understanding, we can state precisely what semantics is not:

  1. It is not syntax. Formal structure, consistency checking, and inference operate on form; they do not by themselves supply meaning.
  2. It is not a closed truth assertion. A conceptual schema makes purpose-relative distinctions, not final claims about what exists.
  3. It is not finished at design time. Meaning is validated through use, so semantics remains open to correction and reinterpretation.

Information: Semantics Without Intent

If semantics is meaning, then information is structured meaning that can be communicated, stored, and reasoned over, but that lacks the pragmatic dimension of purpose.

Information tells you what. It does not tell you why or what to do.

Consider the statement: "Person X has birth date January 15, 1990." This is information. It has semantic content. We know what "Person," "birth date," and "January 15, 1990" refer to. But the information alone does not tell us why we are tracking this, what we should do with it, or what commitments follow from it.

The same information might serve multiple purposes:

  1. Verifying that the person is of legal age for a regulated service.
  2. Triggering a birthday greeting in a marketing workflow.
  3. Calculating age-based risk in an insurance model.

The semantics remain the same. The intent changes.

The same is true of axioms, constraints, and business rules. They add structure and discipline to meaning, and they can be evaluated or enforced, but they still do not supply purpose. A rule can be correct and executable and still not be knowledge.

Information is semantics without commitment or intent. Execution does not change that.
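The distinction can be sketched in code. The record and functions below are hypothetical illustrations: the same information, activated by two different intents:

```python
from datetime import date

# The same piece of information...
person = {"name": "X", "birth_date": date(1990, 1, 15)}

def age_on(record, today):
    b = record["birth_date"]
    return today.year - b.year - ((today.month, today.day) < (b.month, b.day))

# ...activated by different intents:

def may_open_account(record, today):
    """Compliance intent: enforce a legal age threshold."""
    return age_on(record, today) >= 18

def birthday_greeting_due(record, today):
    """Marketing intent: same fields, entirely different purpose."""
    b = record["birth_date"]
    return (today.month, today.day) == (b.month, b.day)

today = date(2026, 1, 15)
print(may_open_account(person, today))       # True
print(birthday_greeting_due(person, today))  # True
```

The semantics of `birth_date` never changes between the two functions. What changes is the purpose the information is put to, and that purpose lives in the surrounding practice, not in the record.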

Knowledge: Semantics + Intent + Social Context

Knowledge adds what information lacks: purpose, commitment, and grounding in practice.

Knowledge is not merely structured meaning. It is meaning that has been:

  1. Connected to intent. Knowledge serves goals. It exists within a context of action and decision.
  2. Validated through use. Knowledge is tested after the fact. Did it work? Did it enable good decisions? Did it survive challenge?
  3. Embedded in social practice. Knowledge is shared, maintained, and evolved by communities. It carries weight about what matters, not just what is true.

This is why knowledge requires the full semiotic ladder. You cannot have knowledge operating only at the semantic level. You must also engage the pragmatic level (intent, goals, purpose) and the social level (norms, roles, shared commitments).

Knowledge is what happens when semantics becomes actionable within a community of practice.

Now we can state the formula properly:

Knowledge = (Symbols + Structure + Shared Meaning) × Intention

The multiplication sign matters. Without intention, the other components remain latent, present but not activated into something we would recognize as knowledge. That is also the bridge to the governance question taken up in Intent-Driven AI Delegates, where intention is treated not as a slogan but as an explicit architectural constraint.
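The multiplicative structure can be made concrete with a small sketch (the class and field names are my own, purely illustrative): if any factor is absent, and intention in particular, the product is zero:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Assertion:
    symbols: str                  # the signs
    structure: str                # the syntactic form, e.g. "triple"
    shared_meaning: str           # the agreed concept the signs map to
    intent: Optional[str] = None  # the purpose served, if any

def is_knowledge(a: Assertion) -> bool:
    # The multiplication in the formula: any factor absent, the product is zero.
    return all([a.symbols, a.structure, a.shared_meaning, a.intent])

fact = Assertion("order42 hasStatus Reversed", "triple",
                 "a payment that was undone")
print(is_knowledge(fact))  # False: latent, not yet activated

fact.intent = "exclude reversed payments from recognized revenue"
print(is_knowledge(fact))  # True
```

The first check fails not because anything is wrong with the symbols, structure, or meaning, but because nothing has connected them to a purpose.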

Peirce and Wittgenstein converge here. For Peirce, the interpretant of a knowledge system is operational: knowledge systems are validated by what they enable, by decisions, actions, and coordinations. For Wittgenstein, understanding is not a mental state prior to action but something that shows up in action. To know is to be able to go on, to use the knowledge appropriately in new situations.

Why This Matters for AI

These distinctions have direct consequences for how we design AI systems, and specifically for where we put LLMs in the architecture.

Most systems marketed as "semantic AI" are operating at the structural or logical level. They provide formal syntax, consistency checking, and inference over asserted propositions. Those are valuable capabilities. But they are not semantics in the sense described here. They lack the interpretant. They lack validation through use. They lack the connection to intent. Calling them semantic creates false expectations about what they can guarantee and who is responsible when they get it wrong.

LLMs fit into a well-grounded AI architecture as interpreters, not authorities. They are good at mapping ambiguous natural language to known conceptual elements, surfacing candidate meanings, and flagging missing context. That is genuinely useful. But an LLM does not define what terms mean in your domain. It does not enforce constraints. It does not decide who has authority to act. And it cannot be held accountable for outcomes.
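A minimal sketch of that division of labor, under stated assumptions: `llm_suggest_term` is a placeholder for any LLM call, not a real API, and the vocabulary is invented. The model proposes; the schema, held outside the model, decides what is admissible:

```python
# Hypothetical sketch: the LLM proposes, the schema disposes.

SCHEMA_TERMS = {"Customer", "Prospect", "FormerCustomer"}

def llm_suggest_term(free_text: str) -> str:
    # Placeholder for an LLM mapping ambiguous language to a candidate term.
    return "Customer" if "bought" in free_text else "Prospect"

def interpret(free_text: str) -> str:
    candidate = llm_suggest_term(free_text)
    # Authority stays with the schema: reject anything outside the
    # agreed vocabulary instead of letting the model define meaning.
    if candidate not in SCHEMA_TERMS:
        raise ValueError(f"LLM proposed unknown term: {candidate}")
    return candidate

print(interpret("they bought twice last year"))  # Customer
```

The design choice is the point: the LLM surfaces candidate meanings, but the closed vocabulary, and the people who maintain it, remain the authority over what terms mean.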

The missing layers in most AI system designs are not technical. They are: intent, which defines what purpose the AI is actually serving; accountability, which defines who is responsible when the AI acts; and human judgment, which defines what cannot be delegated at all. A conceptual schema, no matter how rigorous, cannot answer those questions. Neither can an LLM. They require explicit modeling of intent, delegation, and governance, which is the subject of the companion article on Intent-Driven AI Delegates.

Conclusion: Semantic Honesty as Practice

Semantics is not a property you achieve once at design time. It is a practice you maintain as the system runs, as the domain changes, and as use reveals gaps that modeling could not anticipate.

The semiotic ladder makes the stakes clear. Data becomes information when it is structured. Information becomes actionable when it is connected to intent. Action becomes responsible when it is embedded in a social context with clear lines of accountability. Skip any of those steps and you have a system that runs but does not know what it is doing or why.

For AI systems specifically, this means three things. First, stop calling systems semantic when they are really syntactic. The label raises expectations the system cannot meet. Second, make intent explicit. It does not emerge from data on its own, and burying it in code makes it invisible to governance and audit. Third, keep human accountability in place, not because AI is untrustworthy, but because accountability is a social act that only persons can perform.

These are not abstract principles. They are design constraints. Applied consistently, they produce AI systems that are auditable, governable, and resilient to the domain change that every enterprise eventually faces. The companion article on Intent-Driven AI Delegates shows what that looks like in practice, and the prototype work at knowledge-foundation.ai shows what it looks like running.


G. Sawatzky works at the intersection of formal conceptual modeling and AI systems. For more on Intent-Driven AI Delegates, see the companion article Intent-Driven AI Delegates: A Governance Framework for AI Systems.