Ready to start building your own knowledge models? Atlas lets you start for free and prove value before ever making a commercial decision.
The AI pilot looked promising. The model identified anomalies in historical data, summarized maintenance logs, and answered questions about production reports with surprising accuracy. The vendor demonstration showed enough value to earn attention from the digital transformation team, operations leadership, and information technology. Then the pilot moved closer to production, and the confidence started to fade.
The answers became inconsistent. Engineers questioned the recommendations. Operations managers found edge cases the model could not explain. A project that looked convincing in a controlled environment struggled once it encountered the normal mess of a live manufacturing operation: bad tags, incomplete context, shift-to-shift variation, workarounds, stale alarm limits, and equipment behavior that only makes sense if you know the process history.
Walk through the architecture of almost any manufacturer operating ten or more sites and the pattern becomes familiar. The historian records time-series data from controllers and supervisory systems. Supervisory Control and Data Acquisition (SCADA) generates alarms and operator events. The Manufacturing Execution System (MES) tracks production orders, genealogy, quality records, and execution. Enterprise Resource Planning (ERP) manages materials, inventory, purchasing, and production schedules. Maintenance systems record work orders and asset history. Laboratories contribute quality measurements. Spreadsheets fill the gaps that no enterprise system ever fully addressed.
None of these systems is necessarily broken. Most perform the function they were purchased to perform. The historian records temperatures, pressures, motor currents, and production rates over time. SCADA knows when an alarm occurred and whether an operator acknowledged it. MES understands which batch was running and whether it passed quality inspection. ERP knows which customer order drove production that day. What none of these systems understands by default is how those facts relate to one another.
Over years of expansion, acquisitions, capital projects, and local problem solving, manufacturers connect these systems through hundreds of interfaces. Some are modern application programming interfaces (APIs). Others are database queries written ten years ago by people who have moved on. Some integrations were built by system integrators under project pressure. Others exist because someone in operations created a nightly export that became too important to replace. After enough startups and remediation projects, the same pattern shows up again and again: every integration moves data, but very few preserve knowledge.
This is where AI pilots start to fall apart. The model receives historian data, alarm logs, maintenance history, production records, and maybe a set of documents. On paper, the dataset appears rich. Millions of records suggest that the model should find patterns no human could see. What usually happens instead is that the model finds statistical relationships while remaining blind to operational reality.
Consider a predictive maintenance project monitoring a heat exchanger. The model notices that rising outlet temperature often appears before maintenance activity. After training, it begins flagging similar temperature increases as possible failures. The recommendation sounds reasonable until operations reviews the results. Half of the flagged events occurred during product changeovers where higher temperatures were expected. Several others happened while an upstream process intentionally increased flow. Another group came from a sensor that had been drifting for months before calibration.
An experienced engineer can separate those cases quickly because they understand the process behind the measurements. The AI cannot. Nothing in the data environment tells it that one temperature increase represented normal production, another reflected an intentional process adjustment, and another came from an unreliable instrument. The model is not reasoning incorrectly. It is reasoning from incomplete information. This is not a model problem. It is a data meaning problem.
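To make the heat exchanger example concrete, here is a minimal sketch of what supplying that missing context could look like. It assumes changeovers, intentional process adjustments, and known instrument problems have been recorded as labeled time intervals. The `Interval` type, the function names, and the sample timestamps are all hypothetical illustrations, not part of any particular product or dataset.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Interval:
    """A labeled span of time, such as a changeover or a known sensor fault."""
    start: datetime
    end: datetime
    label: str

    def contains(self, ts: datetime) -> bool:
        # Half-open interval: start is included, end is excluded.
        return self.start <= ts < self.end


def explain_flag(ts: datetime, context: List[Interval]) -> Optional[str]:
    """Return the label of the first context interval covering ts, else None."""
    for interval in context:
        if interval.contains(ts):
            return interval.label
    return None


def triage_flags(
    flags: List[datetime], context: List[Interval]
) -> Tuple[List[datetime], Dict[datetime, str]]:
    """Split model flags into unexplained ones and ones with a known cause."""
    unexplained: List[datetime] = []
    explained: Dict[datetime, str] = {}
    for ts in flags:
        label = explain_flag(ts, context)
        if label is None:
            unexplained.append(ts)
        else:
            explained[ts] = label
    return unexplained, explained


# Hypothetical context an engineer would otherwise carry in their head.
context = [
    Interval(datetime(2024, 3, 1, 6, 0), datetime(2024, 3, 1, 8, 0),
             "product changeover"),
    Interval(datetime(2024, 3, 2, 0, 0), datetime(2024, 3, 9, 0, 0),
             "sensor drift before calibration"),
]

# Hypothetical timestamps where the model flagged rising outlet temperature.
flags = [
    datetime(2024, 3, 1, 7, 15),
    datetime(2024, 3, 4, 12, 0),
    datetime(2024, 3, 10, 3, 30),
]

unexplained, explained = triage_flags(flags, context)
print(unexplained)  # only the 2024-03-10 flag has no recorded explanation
```

The filtering logic is trivial. The hard part is that the context intervals have to exist somewhere in structured form before any code can use them.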
Every manufacturing facility depends on knowledge that exists outside its software systems. Ask a production engineer why a line slows during one product family but not another. Ask a maintenance supervisor which alarms everyone ignores because the threshold was configured years ago for equipment that has since changed. Ask an operator which sequence of alarms indicates a real upset and which sequence is just noise after a restart. The answers usually come fast, and very little of that knowledge exists in structured form.
Good operational decisions are rarely driven by a single measurement. They come from understanding relationships. Engineers know which pumps feed which tanks, which utilities constrain which lines, which process variables influence downstream quality, and which control changes were made after a failure years earlier. They know that two lines sharing compressed air are not fully independent. They know that a certain product behaves differently during winter because cooling water temperatures change. None of that understanding comes from the raw tag values alone.
That is operational knowledge: the structured understanding of how a facility actually works. It includes equipment relationships, process dependencies, operating assumptions, engineering decisions, and the context that explains why observed data should be interpreted one way instead of another. The historian may record that a valve opened. Operational knowledge explains why that valve exists, what equipment depends on it, when it should operate, and what downstream consequences should be expected if it fails.
The missing layer is a semantic model, sometimes called a knowledge model or knowledge graph in industrial contexts. In plain terms, it is a structured, queryable representation of the facility that captures equipment, processes, relationships, and conditions instead of only storing values. It describes the operation as engineers understand it, not just as individual software systems happen to store it.
This layer turns isolated records into meaningful information. It connects assets to lines, lines to processes, processes to products, products to quality outcomes, and events to operating states. It defines common terms so that a dashboard, an analytics workflow, and an AI system do not each invent their own version of the plant. Without this layer, every new consumer integrates against raw sources independently, recreates context from scratch, and embeds another interpretation of the same operation somewhere downstream.
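As a rough illustration of what "structured and queryable" can mean, the sketch below stores facts as subject, predicate, object triples and answers a simple question by following relationships. The `PlantModel` class, the predicate names, and the asset tags are assumptions made up for this example. A production semantic model would normally sit in a graph database or a dedicated modeling platform rather than an in-memory dictionary.

```python
from collections import defaultdict
from typing import DefaultDict, Set, Tuple


class PlantModel:
    """A tiny in-memory triple store: (subject, predicate, object)."""

    def __init__(self) -> None:
        self._out: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)
        self._in: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # Index each fact in both directions so it can be queried either way.
        self._out[(subject, predicate)].add(obj)
        self._in[(predicate, obj)].add(subject)

    def objects(self, subject: str, predicate: str) -> Set[str]:
        """Everything that `subject` points to via `predicate`."""
        return set(self._out.get((subject, predicate), set()))

    def subjects(self, predicate: str, obj: str) -> Set[str]:
        """Everything that points to `obj` via `predicate`."""
        return set(self._in.get((predicate, obj), set()))


model = PlantModel()
model.add("TT-101", "measures", "HX-1")       # temperature tag -> asset
model.add("HX-1", "part_of", "Line 3")        # asset -> line
model.add("P-12", "part_of", "Line 3")
model.add("Line 3", "produces", "Product A")  # line -> product


def assets_for_product(m: PlantModel, product: str) -> Set[str]:
    """Which assets sit on any line that produces this product?"""
    assets: Set[str] = set()
    for line in m.subjects("produces", product):
        assets |= m.subjects("part_of", line)
    return assets


print(sorted(assets_for_product(model, "Product A")))  # ['HX-1', 'P-12']
```

Because the relationships are defined once, a dashboard, an analytics job, and an AI agent can all ask the same question and get the same answer instead of each rebuilding the hierarchy from raw tags.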
That is one reason digital transformation programs often accumulate technical debt faster than they deliver value. One dashboard calculates downtime one way. Another calculates it differently. An analytics project builds its own equipment hierarchy because none exists centrally. An AI initiative creates another mapping because it cannot reuse the previous work. The organization does not just have a data problem. It has too many competing versions of operational meaning.
This layer is usually missing for honest reasons. It is hard to scope, hard to budget, and hard to prove in a ninety-day pilot. It does not fit cleanly inside the ownership model of any single system. Historians preserve history. SCADA supervises control. MES manages execution. ERP coordinates business processes. None of them naturally owns the full representation of how the facility works across assets, processes, products, states, and outcomes.
Building that representation requires someone to make explicit what has always been implicit. That means process engineers, controls engineers, operations leaders, quality teams, maintenance teams, and information technology have to agree on definitions that may have evolved differently across sites for years. This is not easy work, and anyone who says it can be automated away has probably not spent enough time with real plant data.
At multi-site scale, the problem compounds. One site names a packaging line by production area. Another names it after the original equipment manufacturer. A third inherited naming conventions from an acquisition fifteen years ago. The equipment may perform the same function, but every historian, alarm database, and reporting system describes it differently. No central architect can model all of that correctly from headquarters without involving the people who understand each site's operating reality.
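One small piece of that work can be sketched as an alias table that maps each site's local name onto a shared identifier. The site keys, local names, and the canonical identifier below are invented for illustration. In practice the mappings have to be supplied and reviewed by people at each site, which is why the lookup returns `None` for anything unmapped rather than guessing.

```python
from typing import Dict, Optional, Tuple

# (site, local name) -> canonical asset identifier. All values are hypothetical.
ALIASES: Dict[Tuple[str, str], str] = {
    ("site_a", "AREA2-PACK-01"): "packaging_line_1",       # named by production area
    ("site_b", "VENDORX_CARTONER_A"): "packaging_line_1",  # named after the OEM
    ("site_c", "PKG.L1"): "packaging_line_1",              # inherited via acquisition
}


def canonical_asset(site: str, local_name: str) -> Optional[str]:
    """Resolve a site-specific name to the shared identifier, if one is mapped."""
    key = (site.strip().lower(), local_name.strip().upper())
    return ALIASES.get(key)


print(canonical_asset("Site_B", "vendorx_cartoner_a"))  # packaging_line_1
print(canonical_asset("site_d", "LINE-7"))              # None: needs site review
```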
When the semantic layer exists, the change is practical rather than flashy. AI agents can trace a production anomaly through actual equipment relationships instead of guessing from correlated values. A cooling water issue no longer appears as dozens of unrelated alarms across separate systems. It can be understood as one operational event propagating through connected assets because the relationships are part of the model.
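The cooling water case can be sketched as a graph traversal. Given a set of alarmed assets, the code looks for an upstream asset whose downstream reach covers all of them. The `FEEDS` map and the asset names are hypothetical, and a covering asset is only a candidate explanation for an engineer to confirm, not a diagnosis.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical "feeds" relationships: upstream asset -> directly fed assets.
FEEDS: Dict[str, List[str]] = {
    "cooling_water": ["hx_1", "hx_2"],
    "hx_1": ["reactor_a"],
    "hx_2": ["extruder_b"],
}


def downstream_of(asset: str, feeds: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first search for every asset reachable downstream of `asset`."""
    seen: Set[str] = set()
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for child in feeds.get(current, []):
            if child not in seen:  # also guards against cycles in the map
                seen.add(child)
                queue.append(child)
    return seen


def common_sources(alarmed: Set[str], feeds: Dict[str, List[str]]) -> List[str]:
    """Assets whose own position plus downstream reach covers every alarm."""
    candidates = []
    for asset in feeds:
        reach = downstream_of(asset, feeds) | {asset}
        if alarmed <= reach:
            candidates.append(asset)
    return candidates


alarmed = {"reactor_a", "extruder_b", "hx_2"}
print(common_sources(alarmed, FEEDS))  # ['cooling_water']
```

Without the `FEEDS` relationships, the same three alarms are just three rows in three different systems.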
Dashboards also become less fragile. A new reporting requirement does not require every team to rebuild basic definitions from raw sources. Analytics teams spend less time reconstructing context before analysis begins. New sites onboard faster because they can start from an agreed model instead of inventing another local interpretation of familiar processes. The model will still need site-specific work, but the starting point is no longer a blank page.
The larger benefit is that operational knowledge starts to survive personnel turnover. Senior engineers will always take judgment and experience with them when they leave, but they should not take the entire operating model of the facility. Capturing that model in structured form gives future engineers, applications, and AI systems a better foundation than disconnected tags, alarm logs, and tribal memory.
The question for most manufacturing organizations is probably not which AI tool they should buy next. It is whether their data stack contains the semantic foundation that any AI tool needs before it can reason about the operation with confidence. Do you have that foundation, or are you still asking AI to infer the plant from raw signals alone?
