Cold Iron OS: how we build an intelligent organization (today)

TL;DR

You do not buy an intelligent organization. You build one, in layers. The layers you own are the ones that make a plant smarter. Cold Iron OS is our name for that stack: four layers – semantic layer, access, skills, agents – built on top of the systems you already run. This is how we build today. The field moves fast, so like any operating system, Cold Iron OS ships versions. What follows is the mid-2026 version. We will tell you where it is still moving.

Most “AI for manufacturing” is sold as a product. A copilot. A platform. A model with your logo on the login screen. That framing is the problem.

An intelligent organization is not a product you install. It is a stack you build, one layer at a time, on top of the systems you already run – your ERP, your MES, your spreadsheets. (Your ERP runs orders and finance. Your MES tracks the floor.)

Some layers you rent, because they are commodities. Some you own, because they are the business. Get that line right and you end up with something a competitor cannot buy off a shelf – something that survives a vendor swap.

We call that stack Cold Iron OS. It is not a product with a license key. It is an architecture, plus the components we build to fill it: the schema probe, the connectors, the skills, the agents. Here is how it works as of mid-2026.

What is Cold Iron OS?

Layer	What it is	Rent or own?
1. Semantic layer (the foundation)	The governed definitions of what your data means. What “on-time delivery” is. How “scrap rate” is calculated. Which ERP field maps to which MES field.	Own. This is the asset.
2. Access	How agents and people reach your systems – through MCP connectors and CLI tools.	Own the pattern. The tools are swappable.
3. Skills	Packaged, versioned capabilities an agent calls. Know-how written down once and reused.	Build and own the ones that are yours.
4. Agents	Software that reads the semantic layer, calls skills, uses access, and does work – inside a human supervision gate.	Buy the framework. Design the boundaries.

An operating system sits between raw hardware and the work that runs on it. Cold Iron OS is the same idea for a plant. It sits between your raw systems and the AI work, so the work never has to know where your data hides or what scars your systems carry.

One honest note. These four layers are a way to think, not boxes you build in isolation. Real components bundle them. We will show you one that is two layers in a single artifact.

Why is the layer you own worth more than the tool you buy?

Because tools get replaced. The layer underneath is the part you keep.

A tool reads your data and returns an answer. Swap it, and a better one reads the same data next quarter. But your definitions are yours and nobody else’s. They are what your numbers actually mean.

The model is a tenant. The semantic layer is the building. Rent tenants for price and performance. Own the building, always.

“Own your definitions” is not a slogan. It is a small, concrete thing. Owning the definition of on-time delivery looks like one governed line you control:

On-time delivery = ship date on or before the promised date, measured per line item; a partial shipment counts as late.

Write it down once, in a file you own. Then every report and every agent answers the same way.
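To make that concrete, here is a minimal sketch of the definition above as one function that every report and agent could import. The field names (promised_date, ordered_qty, shipped_qty, ship_date) are hypothetical, and so is the reading of “partial shipment” as shipped quantity below ordered quantity. Your ERP will name and record these differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LineItem:
    # Hypothetical field names; map them to your own ERP columns.
    promised_date: date
    ordered_qty: int
    shipped_qty: int
    ship_date: Optional[date]  # None if nothing has shipped yet


def is_on_time(item: LineItem) -> bool:
    """On time = shipped in full on or before the promised date."""
    if item.ship_date is None:
        return False
    if item.shipped_qty < item.ordered_qty:
        return False  # a partial shipment counts as late
    return item.ship_date <= item.promised_date


def on_time_delivery_rate(items: list[LineItem]) -> float:
    """Share of line items that were on time (0.0 for an empty list)."""
    if not items:
        return 0.0
    return sum(is_on_time(i) for i in items) / len(items)


items = [
    LineItem(date(2026, 6, 10), 100, 100, date(2026, 6, 9)),   # early, in full
    LineItem(date(2026, 6, 10), 100, 60, date(2026, 6, 8)),    # partial: late
    LineItem(date(2026, 6, 10), 50, 50, date(2026, 6, 12)),    # shipped late
    LineItem(date(2026, 6, 10), 50, 50, date(2026, 6, 10)),    # on the promised date
]
print(on_time_delivery_rate(items))  # 0.5
```

The point is not the arithmetic. It is that the rule about partial shipments lives in one place, so a dashboard and an agent cannot quietly disagree about it.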

How you get there depends on what you already have. Often the definitions already exist – in your BI tool, your data warehouse, or the SQL behind the board report. The work is to surface them, settle the conflicts, and govern them in one place. Sometimes you sit with your operators and write them from scratch. And when the source system is a black box, our fast path is an automated probe. Point it at the raw, undocumented ERP and it infers a first-pass model in weeks. Not the year a full cleanup takes.

Whatever the method, the slow part is the same. It is agreeing on what to call things. That is a few weeks of sessions with your operators, not a committee that runs forever.

And notice: this is one layer of four. Owning your definitions is necessary. It is not sufficient. Definitions with nothing reading them are inert. That is why the stack keeps going.

How do you give an agent access to your systems – MCP or CLI?

Both. The choice is situational.

MCP, the Model Context Protocol, gives an agent a typed, discoverable set of tools. It is good when you want a clean contract and plug-and-play. The catch: it works best with a server that holds state, and many OT and historian servers do not.

A CLI is a command-line tool. It gives you flexibility. The agent can chain commands you never planned. It costs less context, reuses decades of mature tooling, and runs locally. That last point matters in air-gapped and OT plants – shops cut off from outside networks, running floor systems that cannot host a server.

In practice it is a mix. Own the access pattern: what an agent can reach, and how. Keep the tools behind it swappable. Which to use when is a real engineering tradeoff, and its own piece.
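One way to picture “own the pattern, swap the tools” is a small allowlist that sits in front of interchangeable backends. This is a sketch under assumptions, not our connector code: the class names and resource names are made up, and the two backends are stand-ins that make no real MCP or CLI calls.

```python
from dataclasses import dataclass
from typing import Protocol


class Backend(Protocol):
    """Anything that can serve a read: an MCP client, a CLI wrapper, and so on."""

    def read(self, resource: str) -> str: ...


class FakeCliBackend:
    # Stand-in for a wrapper around a local command-line tool (hypothetical).
    def read(self, resource: str) -> str:
        return f"cli:{resource}"


class FakeMcpBackend:
    # Stand-in for a client of an MCP server (hypothetical, no real MCP calls).
    def read(self, resource: str) -> str:
        return f"mcp:{resource}"


@dataclass
class AccessPattern:
    """The part you own: what an agent may reach, and through which backend."""

    allowed: dict[str, Backend]  # resource name -> backend that serves it

    def read(self, resource: str) -> str:
        backend = self.allowed.get(resource)
        if backend is None:
            raise PermissionError(f"agent may not reach {resource!r}")
        return backend.read(resource)


access = AccessPattern(allowed={
    "erp.open_orders": FakeMcpBackend(),
    "historian.line3_temps": FakeCliBackend(),
})
print(access.read("erp.open_orders"))        # mcp:erp.open_orders
print(access.read("historian.line3_temps"))  # cli:historian.line3_temps
```

Swapping a tool is then a one-line change to the mapping. The agent's view of what it can reach does not move.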

What is a skill, and why build them instead of just prompting?

A skill is a packaged, versioned capability an agent calls. It is know-how, written down once, tested, and reused – instead of re-typed into a prompt every time.

The difference is the difference between a keystroke and an asset. A prompt is gone when the chat closes. A skill is something you keep, improve, and hand to the next agent.

Build the skills that encode how your business does a thing – your inspection routine, your quote logic, your close process. Rent the generic ones.
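As an illustration of “packaged and versioned”, here is a minimal sketch of a skill registry. Everything in it is an assumption made for the example: the Skill and SkillRegistry names, the version tuples, and the toy quote logic with its 20% and 35% margins.

```python
from dataclasses import dataclass
from typing import Callable

Version = tuple[int, int, int]


@dataclass(frozen=True)
class Skill:
    name: str
    version: Version
    run: Callable[..., object]


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: dict[tuple[str, Version], Skill] = {}

    def register(self, skill: Skill) -> None:
        key = (skill.name, skill.version)
        if key in self._skills:
            # A published version is never overwritten; publish a new one.
            raise ValueError(f"{skill.name} {skill.version} already published")
        self._skills[key] = skill

    def latest(self, name: str) -> Skill:
        versions = [s for (n, _), s in self._skills.items() if n == name]
        if not versions:
            raise KeyError(name)
        return max(versions, key=lambda s: s.version)


# Hypothetical quote logic: (material + labour) plus a margin.
def quote_v1(material: float, labour: float) -> float:
    return round((material + labour) * 1.20, 2)


def quote_v2(material: float, labour: float, rush: bool = False) -> float:
    margin = 0.35 if rush else 0.20
    return round((material + labour) * (1 + margin), 2)


registry = SkillRegistry()
registry.register(Skill("quote", (1, 0, 0), quote_v1))
registry.register(Skill("quote", (1, 1, 0), quote_v2))

skill = registry.latest("quote")
print(skill.version, skill.run(100.0, 50.0))  # (1, 1, 0) 180.0
```

Unlike a prompt, the quote logic here has a name, a version history, and something you can test before the next agent picks it up.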

What does an agent actually do, and where does the human stay?

An agent reads the semantic layer, calls skills, uses access, and takes action. All of it happens inside a supervision gate a human controls.

The gate is the point, not a footnote. Today we let agents draft, retrieve, propose, and prepare. That is more useful than it sounds:

a shift-handover summary, built from the day’s data
a nonconformance report, assembled and ready for sign-off
a quote, prepped with the numbers already pulled and checked

We keep a human on the trigger for anything that changes a record, sends a message, or touches money or production. Where the gate sits is a judgment call. And it is moving – more on that below.
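The gate described above can be sketched in a few lines. This is an illustration under assumptions, not our production gate: the action kinds, the Gate class, and the approve callback are all hypothetical names. The only thing taken from the text is the rule that drafting and retrieving run freely, while anything that changes a record, sends a message, or touches money or production waits for a human.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Kind(Enum):
    DRAFT = "draft"
    RETRIEVE = "retrieve"
    CHANGE_RECORD = "change_record"
    SEND_MESSAGE = "send_message"
    MONEY_OR_PRODUCTION = "money_or_production"


# Kinds of action that never run without a human decision.
NEEDS_HUMAN = {Kind.CHANGE_RECORD, Kind.SEND_MESSAGE, Kind.MONEY_OR_PRODUCTION}


@dataclass
class Action:
    kind: Kind
    description: str
    execute: Callable[[], str]


@dataclass
class Gate:
    approve: Callable[[Action], bool]  # the human decision (hypothetical hook)
    log: list[str] = field(default_factory=list)

    def submit(self, action: Action) -> Optional[str]:
        if action.kind in NEEDS_HUMAN and not self.approve(action):
            self.log.append(f"held: {action.description}")
            return None  # nothing was executed
        self.log.append(f"ran: {action.description}")
        return action.execute()


# A reviewer who has not approved anything yet.
gate = Gate(approve=lambda action: False)
summary = gate.submit(Action(Kind.DRAFT, "shift summary", lambda: "summary text"))
order = gate.submit(Action(Kind.MONEY_OR_PRODUCTION, "issue PO", lambda: "PO sent"))
print(summary, order, gate.log)
```

Loosening the gate in an earned place then means removing one kind from NEEDS_HUMAN on purpose, with a log that shows what ran and what was held.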

What does a full Cold Iron OS component look like in practice?

A single component often spans two layers at once. The clearest example we can show publicly is the memory system we built for our own agents.

It is version-controlled files – a running, timestamped record of what the agent knows. Facts go in one place, decisions in another, context in a third. Rules govern how knowledge gets read and written.

That memory system is two layers in one artifact. The files are a semantic layer: governed meaning the agent reads. The search and the read/write rules are skills: the capabilities that make the meaning usable. We did not build a semantic layer and some skills and bolt them together. We built one thing that is both.
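To show the shape of the idea, here is a toy sketch: three plain files for meaning, plus a write rule and a search function as the skills. It is not our memory system. The file names, the one-entry-per-line rule, and the substring search are assumptions made for the example, and version control (for instance a git repository around the directory) is left out.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path

KINDS = ("facts", "decisions", "context")  # one file per kind of knowledge


class Memory:
    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def write(self, kind: str, text: str) -> None:
        """Write rule: known kinds only, one timestamped entry per line, append-only."""
        if kind not in KINDS:
            raise ValueError(f"unknown kind {kind!r}")
        if "\n" in text:
            raise ValueError("one entry per line")
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        with (self.root / f"{kind}.md").open("a", encoding="utf-8") as f:
            f.write(f"{stamp} {text}\n")

    def search(self, term: str) -> list[tuple[str, str]]:
        """Read skill: case-insensitive substring search across all files."""
        hits = []
        for kind in KINDS:
            path = self.root / f"{kind}.md"
            if not path.exists():
                continue
            for line in path.read_text(encoding="utf-8").splitlines():
                if term.lower() in line.lower():
                    hits.append((kind, line))
        return hits


memory = Memory(Path(tempfile.mkdtemp()) / "memory")
memory.write("facts", "Line 3 scrap rate is measured per shift")
memory.write("decisions", "Partial shipments count as late")
for kind, line in memory.search("scrap"):
    print(kind, line)
```

The files are the meaning. The write rule and the search are what make it usable. Neither half is much use without the other.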

Here is the part we like. It is a semantic layer for the agent’s own knowledge. It is us doing to ourselves exactly what we tell a manufacturer to do with their data. We use the same pattern in client work, on real operational data. The internal build is just the one we can open up. We plan to open-source the core of it, so you can see how it works rather than take our word for it.

Where is Cold Iron OS still moving?

Cold Iron OS is in production at each layer. The layers work. What is still moving is where the boundaries between them land. Those shift as the tools and the models mature.

Three places we are still working it out. Each is our current call, plus the thing that will change it.

MCP versus CLI. Today we lean on CLI tools for flexibility and a small footprint, with MCP where a typed contract earns its keep. Both are young. As they mature, our default will shift, and we will shift with it.
How much autonomy to give an agent. We gate hard right now. A human signs off on anything that changes a record or touches production. What will move the gate is model reliability and better guardrails. As those improve, the gate loosens in specific, earned places. We move that line on purpose, not by drift. We would rather be slow than ship a wrong purchase order.
Where a skill ends and an agent begins. Today we package know-how as explicit skills. As base models get better at planning, more of that becomes something an agent just does. We expect the line to drift upward, toward the agent, and we are watching it.

If any of that changes how we build, we will say so. That is the deal with an OS that ships versions.

What is Cold Iron OS?

A four-layer architecture – semantic layer, access, skills, agents – for running AI on a mid-market manufacturer’s real operations. It is not a product you license. It is the way we build, plus the components we build to fill the layers.

How long until the first layer is useful, and what does it take?

The first useful layer is the semantic layer. The first slice of it takes weeks of focused work with your operators, not a multi-year program. You get value from one well-gated agent reading one governed definition, long before the whole stack exists. Start at the foundation. Add layers as you have work for them.

Do I need all four layers at once?

No. A manufacturer with a solid semantic layer and one well-gated agent is further ahead than one that bought a platform and skipped the foundation.

MCP or CLI for connecting an agent to my systems?

Both, situationally. Use MCP for a typed, plug-and-play contract. Use a CLI for flexibility, low context cost, mature tooling, and local or OT environments. In practice it is usually a mix. Either way, you own the access pattern.

How much should an agent do on its own?

As much as you can supervise, and no more. Let it draft, retrieve, and prepare. Keep a human on anything that changes a record, sends a message, or touches money or production. Move that line deliberately, as trust is earned.

Conclusion

Cold Iron OS is four layers. Own the semantic layer. Own your access pattern. Build the skills that are yours. Put agents on top, inside a gate you control. The tools churn. The layers you own compound.

This is the mid-2026 version. Next quarter we will know more, and so will you. We will update it. That is not a hedge. It is the only honest way to write about a field this young, from inside the work.

Build the layers that matter. Rent the rest. Keep a human on the gate.