AI is increasingly accessible. Cloud platforms have lowered the barrier to deploying machine learning. Pre-built models are available off the shelf. Experimentation is easier than it has ever been. And yet, for the majority of enterprises attempting to move AI from pilot to production, the results remain frustratingly inconsistent.
The explanation for that gap is almost never the technology itself. The explanation is the data and knowledge infrastructure beneath it. Without a coherent framework for organizing what an organization knows and how its concepts relate to each other, AI systems lack the context they need to perform reliably at scale. That framework is an ontology, and understanding what it is, what it enables, and how to build it practically is foundational to any serious enterprise AI program.
Why Data Quality Cannot Be Treated as Optional
Two broad categories of AI application drive enterprise interest in the technology. The first encompasses efficiency improvements: getting better insights from large data sets, automating repetitive analytical work, and reducing the cognitive burden on knowledge workers. The second involves more transformative capabilities: personalizing user experiences in real time, responding dynamically to field performance data, or enabling new forms of customer interaction that were not practical with traditional approaches.
Both categories share a non-negotiable prerequisite: clean, consistent data. If product names or categories are missing or inconsistent across applications, that data requires significant cleansing and manipulation before it can support analytics or machine learning of any kind. Data quality is the price of admission for AI. That means the foundational disciplines, including governance processes, quality assurance procedures, data ownership protocols, and content curation practices, must be in place before AI programs can reliably deliver value.
This is where many organizations stumble. They invest in AI capability while treating the underlying data environment as something to be addressed later. Later is always more expensive and more disruptive than earlier would have been.
The Specific Barriers to ROI
Several patterns recur when enterprise AI programs fail to produce the return on investment that justified them.
The most fundamental is a failure to clearly define what the program is actually trying to accomplish. Organizations that approach AI as an experimentation exercise, without a specific problem they are trying to solve, tend to produce demonstrations that impress stakeholders but do not translate into operational impact. Clarity about the target problem is where ROI begins.
A second and closely related barrier involves the transition from pilot to production. Pilots succeed in controlled environments partly because data scientists invest intensive effort in preparing the data, curating it carefully and structuring it to work well with the algorithm being tested. That level of care does not scale automatically. When a pilot moves to production, the data preparation processes need to work at enterprise volume without manual intervention, and organizations frequently discover that the infrastructure to support that does not exist.
A third barrier is tool complexity mismatched to problem complexity. Highly capable AI platforms exist for genuinely complex problems, but applying them to simpler challenges does not improve outcomes. It adds cost, implementation difficulty, and interpretability challenges without proportionate benefit. Matching the sophistication of the tool to the actual difficulty of the problem is a discipline that requires honest assessment of what the business needs.
A fourth barrier is unrealistic expectations set by vendor marketing. Platforms are often sold on aspirational capabilities that require significant organizational prerequisites to realize. When those prerequisites are not in place, the capability does not materialize, and the investment is characterized as a failure of the technology when it was actually a failure of readiness.
What Ontologies Are and What They Do
An ontology is a structured set of organizing principles and the relationships between them. It catalogs the concepts that matter to a business, including its products and services, customer types, organizational structures, processes, and domain knowledge, in a way that is flexible, relationship-rich, and designed to evolve as the business evolves.
The ontology is the knowledge scaffolding of the organization. It defines not just what things are called but what they mean, how they relate to each other, and how those meanings are implemented across different systems and applications. In this sense it is the soul of the business, encoded in a form that machines can work with.
For AI specifically, the ontology provides something that machine learning algorithms cannot reliably generate on their own: reference data. Without reference data, an algorithm does not know what things are named, what product categories exist, what the organization's customer types look like, or how any of these concepts connect. AI systems can generate their own labels through unsupervised learning, but those labels are not always human-interpretable or aligned with the organization's actual terminology. Analysts must then translate machine-generated categories into business language, which defeats much of the efficiency the AI was intended to provide. The ontology makes that translation unnecessary by providing the shared vocabulary from the start.
How Ontologies Are Built and Stored
Building an ontology means developing vocabularies and hierarchies and then defining the relationships between them. The resulting structure can be stored in a standard relational database, a specialized ontology management system, or a graph database. Graph databases are particularly well-suited to ontological knowledge because they are designed to represent and traverse complex networks of relationships efficiently.
A common way to think about the data structure in an ontology is the triple: a subject-predicate-object statement, often described informally as subject-verb-object. For example, a product belongs to a category, a service applies to a product, a product category maps to an industry application. These relationships are the actual data being stored, and the ability to traverse them in multiple directions is what makes ontologies powerful for knowledge retrieval and AI reasoning.
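The triple structure can be sketched in a few lines. The following is a minimal in-memory version, not a production triple store, and the products, services, and predicates are made-up examples; a graph database provides the same pattern-matching at enterprise scale.

```python
# Each triple is a (subject, predicate, object) statement.
triples = [
    ("WidgetPro", "belongs_to", "IndustrialSensors"),
    ("CalibrationService", "applies_to", "WidgetPro"),
    ("IndustrialSensors", "maps_to", "Manufacturing"),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]
```

Because any position can be the wildcard, the same data answers questions in both directions: `query(subject="WidgetPro")` asks what a product relates to, while `query(obj="WidgetPro")` asks what relates to the product.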
The knowledge graph concept extends this further. A corporate knowledge graph connects entities and their relationships across the full breadth of organizational knowledge, making it possible to identify connections that no single database or application would reveal on its own. Finding which employees have expertise relevant to a specific client challenge, tracing which products apply to which industry applications, or mapping which knowledge sources address a specific operational problem all become tractable through a well-built knowledge graph. The ontology is the conceptual foundation on which the knowledge graph is constructed.
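Finding connections "that no single database would reveal" is, mechanically, multi-hop traversal: following chains of relationships from one entity to everything it is linked to. A sketch under assumed, illustrative entities and edges (the employee, product, and industry names are invented for the example):

```python
from collections import deque

# Illustrative adjacency list: entity -> directly related entities.
# In practice each edge would carry a typed relationship from the ontology.
edges = {
    "AliceChen": ["WidgetPro"],              # e.g. has_expertise_in
    "WidgetPro": ["IndustrialSensors"],      # e.g. belongs_to
    "IndustrialSensors": ["Manufacturing"],  # e.g. maps_to
    "Manufacturing": [],
}

def reachable(start):
    """Breadth-first search: every entity connected to `start` by any path."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Traversing from an employee reaches the industries her product expertise ultimately serves, which is the kind of cross-silo connection the article describes; a graph database runs the same traversal over millions of typed edges.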
Common Misunderstandings About Ontologies
Two persistent misunderstandings limit how effectively organizations approach ontology development.
The first is that ontologies are abstract and academic, useful in research contexts but difficult to apply in practice. This perception is changing as more organizations recognize the concrete value that well-designed ontologies deliver, particularly in powering search, enabling product information management, and providing the reference framework that makes AI systems more reliable. The challenge has historically been deployment, since ontology management tools can be complex and the discipline requires sustained attention. A more practical architecture exposes ontological structures to multiple systems through web services rather than building them into individual applications, which makes them more widely usable and easier to maintain.
The second misunderstanding is that building an ontology is a project with an end date. It is not. An ontology is never complete because the business it represents never stops changing. New products, new customer segments, new market categories, and new organizational capabilities all need to be reflected as they emerge. The appropriate comparison is to sales or manufacturing processes: they are ongoing disciplines, not deliverables. The ontology requires the same kind of continuous stewardship.
A Practical Approach to Getting Started
Given the scope of what a comprehensive ontology represents, the practical question is where to begin. The answer is not to attempt to model everything at once. The right approach is to start with the specific business problem that most needs solving, build the ontological structures that address that problem, and expand from there incrementally.
Basic information hygiene, including consistent naming, controlled vocabularies, and defined relationships between core business concepts, provides immediate value and creates the foundation for more sophisticated structures over time. The ontology grows as the organization applies it, each new application revealing additional relationships and requirements that make the whole more complete and more valuable.
The compounding nature of this investment is what makes it strategically significant. Each AI application built on a shared ontological foundation benefits from everything that came before it. Each one that bypasses that foundation creates a new island of data and logic that cannot be shared, reused, or built upon. Over time, the organizations that invest in the foundational work find that their AI capabilities grow coherently and efficiently. Those that do not find themselves rebuilding the same knowledge structures repeatedly, in different forms, at ongoing cost.
This article draws on insights from Seth Earley's interview with Big Data Quarterly, published March 2020.
