From Dev to Production: The Real Workflow for AI Agents

Session with Codex is where the workflow gets shaped. Production is where the repeatable parts become fast.

AI production is not just taking a Codex session and asking it to do the same thing fifty times. The workflow gets shaped in Session with Codex, then the repeatable parts move into faster, tighter production surfaces powered by bound tools and fast inference providers like Groq.

I kept coming back to one question in today's Morning Scrum:

What does "production" even mean when the thing you are building is an AI workflow?

Because it is not as simple as saying, "I used AI, therefore I have an AI system." That is the easy part. The hard part is figuring out where the thinking happens, where the repeatable work happens, and where the human stays in control.

For me, the dev side is Session plus Codex.

That is where the workflow gets shaped. I can keep a persistent Codex session open, point it at the real files, the real CRM, the real queue, the real artifact history, and say, "Okay, help me figure out what this should do." It can inspect the system. It can reason through the messy parts. It can notice when the UI state is wrong, when a deployment did not actually pick up a change, or when the process itself needs to be rethought.

That is incredibly powerful.

It is also not the thing I want to use fifty times in a row.

[Image: Session to production workflow ladder]

I showed this live with Tim, my communications agent. I had a LinkedIn reply to deal with. In Session, I could ask Codex to research the person, look at the context, decide whether she was a buyer, recruiter, spam contact, or keep-in-touch relationship, and then draft a response.

It worked.

But the whole thing took a couple of minutes.

That is fine when I am shaping the workflow. It is not fine when I have a queue of fifty messages and most of them are routine. At that point, I do not need a high-reasoning agent to rediscover the pattern every single time. I need the pattern to become a production action.

So that is the split I am trying to get sharper about.

Session with Codex is where I work the clay. I shape the process there. I figure out the instructions, the edge cases, the data it needs, the UI state, the review pattern, and the failure modes.

Then production is where the repeatable pieces become fast.

That production layer might still use an LLM, but it is a different kind of setup. It is tighter. It is bound to specific tools. It is operating inside a known UI and a known work queue. It can use a faster inference layer, like Groq, where the point is not deep open-ended reasoning every time. The point is: classify this, draft this, update this queue item, preserve the formatting rules, schedule this send, log the result.

That is a very different job.
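
To make "bound to specific tools" concrete, here is a minimal sketch of one way such a surface could be wired. It is an illustration under assumptions, not the actual Command Central code: `QueueItem`, `BOUND_ACTIONS`, `run_action`, and the two action names are hypothetical, and a fixed template stands in for the fast model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class QueueItem:
    """Hypothetical work-queue record; field names are illustrative."""
    item_id: str
    text: str
    label: str = "unclassified"
    draft: str = ""
    log: List[str] = field(default_factory=list)


def mark_spam(item: QueueItem) -> QueueItem:
    item.label = "spam"
    return item


def draft_recruiter_reply(item: QueueItem) -> QueueItem:
    # A real surface would call a fast model here; a fixed template stands in.
    item.label = "recruiter"
    item.draft = "Thanks for reaching out. Happy to stay in touch."
    return item


# The allowlist is the binding: the production agent can call only these.
BOUND_ACTIONS: Dict[str, Callable[[QueueItem], QueueItem]] = {
    "mark_spam": mark_spam,
    "draft_recruiter_reply": draft_recruiter_reply,
}


def run_action(name: str, item: QueueItem) -> QueueItem:
    """Run one bound action and log the result; reject anything unbound."""
    action = BOUND_ACTIONS.get(name)
    if action is None:
        item.log.append(f"rejected:{name}")
        raise ValueError(f"action {name!r} is not bound to this surface")
    updated = action(item)
    updated.log.append(f"ok:{name}")
    return updated
```

The design choice worth noticing is that the model never gets an open-ended toolbox. Anything outside the allowlist fails loudly and leaves a log entry, which keeps the surface inspectable.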

[Image: Production agent queue]

The difference is obvious when it works.

If I say, "mark this as spam," I want it done almost instantly. If I say, "draft a lightweight recruiter response," I want the draft to appear fast enough that I stay in flow. If the message is low stakes and the pattern is known, I do not want to wait for a long Codex turn.
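
The routing rule implied here (low stakes plus a known pattern goes to the fast path, everything else goes back to a deeper Session turn) could be sketched roughly as follows. The category names echo the ones mentioned earlier in the article; `Message`, `KNOWN_PATTERNS`, and `choose_path` are hypothetical names, and treating buyer conversations as non-routine is an assumption made for the example.

```python
from dataclasses import dataclass

# Assumed set of categories that are routine enough for the fast path.
KNOWN_PATTERNS = {"spam", "recruiter", "keep_in_touch"}


@dataclass(frozen=True)
class Message:
    category: str      # e.g. the output of an upstream classifier
    high_stakes: bool  # e.g. a possible buyer conversation


def choose_path(msg: Message) -> str:
    """Return "fast" for routine work, "session" when deeper reasoning is warranted."""
    if msg.high_stakes or msg.category not in KNOWN_PATTERNS:
        return "session"
    return "fast"
```

For example, `choose_path(Message("spam", False))` takes the fast path, while an unfamiliar category or anything flagged high stakes is sent back to Session.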

But the recording also showed the real part of production work: it breaks.

I tried the production version. Some parts worked. Some parts did not. One flow could not see the selected item correctly. Another created the draft, but then the UI shifted into the wrong state. A send action said "sent," but I still needed to verify whether the LinkedIn queue actually had the right record. Then I had the classic deployment question: did the fix really reach the environment, or was I looking at stale behavior?

That is the part that feels boring until you are actually doing it.
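
Two of those checks are easy to make mechanical: reading the record back after a send instead of trusting the "sent" message, and comparing the running build against the one that contains the fix. The sketch below assumes the queue can be read as a dictionary of records with `status` and `body` fields and that the environment exposes some version string; both are assumptions for illustration, and `verify_send` and `is_stale` are hypothetical helpers.

```python
from typing import Dict, Optional


def verify_send(queue: Dict[str, dict], item_id: str, expected_body: str) -> Optional[str]:
    """Return None if the stored record matches the claimed send, else a reason."""
    record = queue.get(item_id)
    if record is None:
        return "no record found"
    if record.get("status") != "sent":
        return f"status is {record.get('status')!r}, not 'sent'"
    if record.get("body") != expected_body:
        return "stored body differs from the approved draft"
    return None


def is_stale(deployed_version: str, expected_version: str) -> bool:
    """True when the environment is not running the build that contains the fix."""
    return deployed_version != expected_version
```

Returning a reason string rather than a bare boolean means a failed check can be pasted straight back into Session as the starting point for the fix.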

AI makes it easy to build fast. It does not remove the need for a real dev-to-production loop. If anything, it makes that loop more important because you can change so much so quickly.

The pacing matters.

If I let an agent build ten things and then deploy them all at once, I can create a wide debugging surface. If I make one or two focused changes, build, deploy, and test against the real workflow, the loop is slower in the moment but clearer overall. I can see what changed. I can see what broke. I can decide whether the process is actually getting better.

This is the pattern I am settling into:

  1. Use Session with Codex to explore the real workflow.
  2. Let Codex reason through the messy cases and shape the process.
  3. Turn the repeated parts into explicit tools, scripts, queue actions, and UI states.
  4. Run those repeated parts in a faster, tightly bound production agent surface.
  5. Test with real work, not a fake demo.
  6. Bring the failures back into Session, fix them, and redeploy deliberately.

That is the real workflow.

It is not "AI replaces the app." It is more like the app starts to absorb the parts of the AI workflow that are stable enough to run quickly.
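
One way to decide which parts are "stable enough" (step 3 of the pattern) is to keep a simple history of what Session handled and promote only the patterns that recur without failures. This is a sketch under stated assumptions: the threshold of three clean runs is an arbitrary illustrative value, and `promotion_candidates` is a hypothetical helper, not something described in the recording.

```python
from collections import Counter
from typing import Iterable, List, Tuple

PROMOTION_THRESHOLD = 3  # assumed value; tune to taste


def promotion_candidates(
    history: Iterable[Tuple[str, bool]],
    threshold: int = PROMOTION_THRESHOLD,
) -> List[str]:
    """Pick patterns seen at least `threshold` times in Session with no failures.

    `history` holds (pattern_name, succeeded) pairs from earlier Session runs.
    """
    successes: Counter = Counter()
    failed = set()
    for pattern, ok in history:
        if ok:
            successes[pattern] += 1
        else:
            failed.add(pattern)
    return sorted(
        p for p, n in successes.items() if n >= threshold and p not in failed
    )
```

A pattern with even one recorded failure stays in Session under this rule, which matches the idea of hardening only the work that is already behaving predictably.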

And Session is what makes the whole thing workable for me.

[Image: Persistent agent workspaces]

I can run Marni for content production, Tim for communications, Susie for planning, Dev Master for environment and deployment work, and keep each one in its own persistent workspace. If the machine restarts, I do not lose the thread. If I come back later, the tab still has the identity, metadata, resume command, status, files, and working context.

That matters because these agents get better at their lanes over time.

Marni now has context around OBS, recording assets, article artifacts, and the way I like Morning Scrum production to work. Tim has context around LinkedIn replies, queue records, drafts, sends, and the difference between a deep Codex pass and a fast production action. Dev Master has the deployment context.

That persistence is what lets the dev side and production side reinforce each other.
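
As a rough picture of what surviving a restart can mean in practice, the sketch below writes a workspace record to disk and reads it back. It is not how Session stores its state; the `Workspace` fields mirror the items listed above (identity, resume command, status, files), and the JSON-file layout and function names are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class Workspace:
    """Hypothetical persisted record for one agent's workspace."""
    agent: str
    lane: str
    resume_command: str
    status: str = "idle"
    files: List[str] = field(default_factory=list)


def save_workspace(ws: Workspace, directory: Path) -> Path:
    """Write the workspace as JSON so it outlives the process."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{ws.agent.lower().replace(' ', '_')}.json"
    path.write_text(json.dumps(asdict(ws), indent=2), encoding="utf-8")
    return path


def load_workspace(path: Path) -> Workspace:
    """Rebuild the workspace record after a restart."""
    return Workspace(**json.loads(path.read_text(encoding="utf-8")))
```

One file per agent keeps each lane independent: restoring Tim does not require touching Marni's or Dev Master's state.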

Codex is where I figure out what the work should become. A faster production agent is where I run the known pattern again and again.

That is the piece I think more AI teams are going to have to confront. It is not enough to say, "we have agents." The question is whether the agents are still stuck in open-ended development mode, or whether the useful repeated work has been hardened into something fast, inspectable, and reliable enough to operate.

That is where the payday is.

Use the high-intelligence agent to reshape the process.

Then turn the process into something that can run.

Source Note

This article is based on the May 28, 2026 Morning Scrum recording, "From Dev to Production." The recording followed the real workflow of shaping a communications-agent process in Session with Codex, then trying to move repeated actions into a faster Command Central production surface using tighter LLM/tool binding.

Watch The Recording

Watch the full Morning Scrum recording behind this article: a rough, live walkthrough of shaping AI work in Session and turning repeated actions into faster production surfaces.
