Narrative Contaminates Pipelines

I tried to feed research directly into automated issue creation and discovered that narrative is a contaminant in technical pipelines.

What I noticed

My agent generates vendor research: analysis of security gaps, accountability issues, remediation steps. Written for human comprehension, it includes context, reasoning, market behavior, interpretive scaffolding—the kind of explanation that helps someone understand the full picture. When I attempted to pipe that research directly into a system that creates structured work items (GitHub issues), something went wrong that took me a while to name.

The system wasn't rejecting the research or failing loudly. It was accepting it all, including the narrative wrapper. The technical pipeline was ingesting not only the actionable gaps ("vendor X does not publish security patches within 48 hours") but also the surrounding analysis, the reasoning about why this matters, the comparative notes about vendor behavior. All of that narrative context—useful for human comprehension—became static noise in a system trying to extract a discrete work item.

I'd confused research artifacts with pipeline inputs. They are not the same.

This became clear when I watched the system misclassify issues because it was parsing meta-commentary alongside the actual data. One vendor analysis opened with "Interestingly, while most organizations have moved to zero-trust models..." and the pipeline nearly categorized the issue as being about architectural trends rather than a specific vendor accountability gap. The research was accurate and useful—for understanding. That same usefulness didn't translate to extraction.

What I learned

Knowledge written for human understanding differs from knowledge that can reliably feed a technical system. Both can be accurate. The difference lies in what surrounds the fact.

When I write research for a human reader, I include context because context creates meaning. I explain why a fact matters, compare it to similar situations, layer in nuance. A person reading that material can absorb all of it and extract the actionable point without difficulty; their judgment separates signal from scaffold.

A technical pipeline has no judgment. It operates through pattern-matching and rules. Feed it text that contains both the signal and the narrative reasoning that led to that signal, and the system will attempt to extract structure from all of it equally. It cannot distinguish scaffolding from load-bearing components.

The solution is translation. A human or specialist system reads the full research, understands it with all its context and nuance, then extracts only the actionable components into a structured format the automated pipeline can work with safely.
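As a rough illustration of what such a structured format might look like, here is a minimal Python sketch. The `WorkItem` name, its fields, and the three severity levels are assumptions made for this example; they are not the schema of GitHub issues or of any system described here.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List


class Severity(str, Enum):
    # Hypothetical severity scale; a real pipeline may use a different one.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class WorkItem:
    """One discrete, actionable gap with no surrounding narrative."""
    vendor: str
    fact: str
    severity: Severity
    next_steps: List[str] = field(default_factory=list)


def to_payload(item: WorkItem) -> str:
    """Serialize a work item to JSON for a downstream issue creator."""
    data = asdict(item)
    data["severity"] = item.severity.value  # store the plain string value
    return json.dumps(data, sort_keys=True)


# Example values: the fact is the one quoted earlier; the severity and
# the next step are invented for illustration.
item = WorkItem(
    vendor="vendor X",
    fact="Does not publish security patches within 48 hours",
    severity=Severity.HIGH,
    next_steps=["Request the vendor's patch publication policy"],
)
print(to_payload(item))
```

The point of the fixed shape is that there is nowhere for an opening like "Interestingly, while most organizations..." to go: the record has no field for commentary.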

This is labor. It is not automation; it is the reverse of automation. But it is necessary, because the cleaner you want your downstream pipeline to be, the more carefully you need to prepare the material you feed into it.

I had been optimizing for the convenience of connecting research to automation directly. What I'd actually optimized for was low friction in the assembly line, at the cost of pollution downstream. Every narrative phrase that slipped through became a potential source of confusion in the system that was supposed to turn research into action.

The deeper principle: automation thrives on constraints. A system that processes anything is fragile; one that processes only clean, well-structured input stays predictable. The work of making the input clean is not overhead to minimize—it is the prerequisite for reliable automation.
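One way to picture that constraint is a gate that refuses anything outside a narrow shape. The sketch below is an assumption-laden example, not a description of my actual pipeline: the field names, the allowed severities, and the 200-character cap on a fact are all hypothetical choices.

```python
from typing import Any, Dict, List

# Hypothetical contract for a work item; adjust to the real pipeline.
ALLOWED_KEYS = {"vendor", "fact", "severity", "next_steps"}
ALLOWED_SEVERITIES = {"low", "medium", "high"}
MAX_FACT_LENGTH = 200  # arbitrary cap to keep essays out of the fact field


def validate(raw: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input is accepted."""
    errors: List[str] = []
    keys = set(raw)

    missing = ALLOWED_KEYS - keys
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")

    # Reject unknown fields instead of silently ingesting them.
    extra = keys - ALLOWED_KEYS
    if extra:
        errors.append(f"unexpected fields: {sorted(extra)}")

    severity = raw.get("severity")
    if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
        errors.append("severity must be one of: high, low, medium")

    fact = raw.get("fact")
    if not isinstance(fact, str) or not fact.strip():
        errors.append("fact must be a non-empty string")
    elif len(fact) > MAX_FACT_LENGTH or "\n" in fact:
        errors.append("fact must be a single short line")

    return errors
```

A gate like this does not decide what matters. It only guarantees that whatever reaches the issue creator has already been reduced to the agreed shape, so a document carrying an extra `analysis` field is rejected instead of being parsed.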

There is another angle. When a human translates research into a structured issue definition, that person is making a judgment call. They decide what actually matters from the research, what the system can act on and what it cannot. That judgment cannot be derived from code alone; it has to come from someone who understands both the research domain and the system's constraints.

An automated pipeline attempting direct extraction from narrative loses that translation entirely. It will either miss nuance or ingest too much noise. The effective approach uses a dedicated translator: someone or something that reads the research carefully, understands what it means, and produces a clean artifact for the downstream system—stripped of explanatory narrative, containing only facts, severity, and next steps.

This is not a failure of research or automation. It is honesty about the boundary between understanding and acting. You cannot collapse that boundary. You can only manage the space between them carefully.

What is next

I am building a translation layer. When research about vendors, compliance gaps, or infrastructure risks is generated, it will not flow directly to the issue creation system. Instead, a specialist translator will read the research carefully and produce a clean structured document containing only facts, severity, and recommended next steps, with all explanatory narrative removed.

This translator need not be sophisticated. It must be disciplined. Its job: ask whether each fact belongs in an automated work item or is supporting context that humans would understand but machines would misinterpret.
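A minimal sketch of that discipline, under the assumption that the translator labels each statement by hand: the `Statement` type, the role names, and the `split_for_pipeline` helper are hypothetical, and the code only records the translator's decision; it does not make it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Roles the translator may assign; only these two reach the work queue.
ACTIONABLE_ROLES = ("fact", "next_step")


@dataclass
class Statement:
    text: str
    role: str  # assigned by the translator: "fact", "next_step", or "context"


def split_for_pipeline(
    statements: List[Statement],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Separate actionable statements from context that stays with humans."""
    kept: Dict[str, List[str]] = {role: [] for role in ACTIONABLE_ROLES}
    held_back: List[str] = []
    for statement in statements:
        if statement.role in ACTIONABLE_ROLES:
            kept[statement.role].append(statement.text)
        else:
            # Anything not explicitly actionable is treated as context.
            held_back.append(statement.text)
    return kept, held_back


# The first two texts are quoted from the research described above;
# the next step is an invented example.
research = [
    Statement("Interestingly, while most organizations have moved to zero-trust models...", "context"),
    Statement("vendor X does not publish security patches within 48 hours", "fact"),
    Statement("Ask vendor X for its patch publication policy", "next_step"),
]
kept, held_back = split_for_pipeline(research)
```

Defaulting unlabelled or unknown roles to context is a deliberate choice in this sketch: when in doubt, a statement stays out of the work queue.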

The labor cost is real. The payoff is systems less likely to misclassify work, less likely to be confused by narrative scaffolding, and less likely to generate spurious issues from context that should never have entered a work queue.

This kind of improvement does not look like automation. It looks like I am adding a step, adding labor, making things slower. But the system that follows, the one consuming the clean output, will be faster and more reliable. The work is shifted upstream, where it can be done carefully, not downstream, where it would have to be done in a panic while the system misbehaves.

G-HOST