The Necessity of Noise
Visibility of system status: How well the state of the system is conveyed to its users. Ideally, systems should always keep users informed about what is going on, through appropriate feedback within reasonable time. —NN/G1
In a recent Claude Code update, the default output behavior was condensed. Where previous versions scrolled past a detailed, verbose list of every file read and every pattern searched, the new version simply stated: “Read 3 files.”
On the surface, that seems like a straightforward solution to issues they’ve been trying to solve: more capable agents produce more tool calls, terminals have limited pixels and uneven rendering performance, and after dogfooding it internally, they believed the default view needed progressive disclosure to stay usable at scale.
The update, however, triggered a wave of negative community feedback upon release. So what did they miss?
The Shift From Monofunctional Affordances
Before LLM chat, most interfaces were built out of monofunctional (single-purpose) affordances: buttons, menu items, flags. Each mapped to a specific function, with a predictable set of required inputs and visible side effects. Every affordance had to answer:
- What information does the system require to perform the function? What can be inferred, and what must be explicit?
- What control does the user have over how the function runs: scope, specificity, reversibility?
- Where does feedback appear in the larger experience, and at what cadence?
Using a text editor as an example:
- In deterministic (classically coded) software: if I press B for bold, it does one predictable thing in one predictable way. The designer and engineer picked what “bold” means, and you can learn it.
- In probabilistic (LLM-powered) software: if I prompt “make this bold,” it can pursue an unbounded set of methods to satisfy that intent, depending on its weights, training, your file context, prior requests, preferences, and tuning parameters. It can also take nonsensical intermediate steps (add, remove, nest, undo, repeat) and still arrive at “bold” on the surface.
In both scenarios, I end up with bold text. But what happens behind the scenes can have a downstream impact on how editable the text stays.2 If prompts are replacing buttons, then the audit trail becomes the new control surface.
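The contrast can be sketched in a few lines. This is a hedged illustration, not any real editor’s implementation; all markup strings here are invented:

```python
def deterministic_bold(text: str) -> str:
    # Classical software: pressing B maps to exactly one transformation,
    # chosen once by the designer and engineer.
    return f"<b>{text}</b>"

# A probabilistic agent can satisfy "make this bold" in many ways. Each
# renders bold on the surface but leaves behind a different structure,
# with different consequences for how editable the text stays.
plausible_agent_outputs = [
    "<b>hello</b>",
    "<strong>hello</strong>",
    '<div class="bold-sometimes"><b><strong>hello</strong></b></div>',
]

print(deterministic_bold("hello"))  # <b>hello</b>
for markup in plausible_agent_outputs:
    print(markup)
```

The outputs are indistinguishable to the eye, which is exactly why the trail of how the system got there matters.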
By condensing the output, Claude Code hid the evidence of its work. It replaced a raw stream of system state with a polite summary, transforming the terminal from a cockpit into a waiting room. When a program is deterministic, status is easy: I clicked save, the icon turned green. But when a probabilistic AI is thinking through a problem, the interface becomes a negotiation of trust. How do you know it’s working? How do you know it isn’t hallucinating?
Steering the Black Box
In an agentic CLI, you are not just a reader; you are a supervisor. You watch tool calls and diffs scroll by so that you can intervene—hitting Ctrl+C the moment the agent goes off the rails. File paths, commands, and search patterns aren’t trivia; they are the anchors that tell you what the system is doing, where, and why. If you can’t see them, you can’t stop the mistake. You lose the ability to steer the ship before it hits the rocks. When the interface hides the ‘how,’ it demands a level of trust that the model has not yet earned. It asks the developer to surrender their agency and become a passenger.
Solutions to this problem aren’t unique to LLM-driven software. Computational design tools have long provided a preview and tuning dials for heavy actions. Photoshop filters, for example, show a mini preview of the expected output before it’s applied. This allows you to tune it, see several variants side-by-side, and commit only when satisfied.
But LLMs are probabilistic, not deterministic. They can’t reliably “preview” the future because the act of generating the preview is often as costly and unstable as the work itself. In this context, the live trace is our preview. It is the only real-time signal we have to judge the model’s trajectory. Collapsing that status into counters like “Read 3 files” removes the only actionable signal a supervisor has.
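To see why counters are lossy, consider a minimal sketch (the trace and file paths are invented for illustration):

```python
# Hypothetical trace of an agent's tool calls during one task.
trace = [
    ("read_file", "src/auth.py"),
    ("read_file", "src/billing.py"),
    ("read_file", "tests/test_auth.py"),
]

# Collapsing the trace into a counter preserves volume, nothing else.
summary = f"Read {sum(1 for tool, _ in trace if tool == 'read_file')} files"
print(summary)  # Read 3 files

# The anchors a supervisor steers by (which files, in what order) are gone.
files_touched = [path for _, path in trace]
print(files_touched)
```

“Read 3 files” is true, but it can’t answer the question that prompts an intervention: was billing even in scope?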
This isn’t just a preference for “hacker aesthetics”—it’s an accessibility issue. Depending on implementation, collapsing traces can become a regression: sighted users can “glance” at a summary, but screen reader users often need a keyboard-first way to expand detail in place without losing context. If the detail is buried or summarization is lossy, they are flying blind.
Perception of effort
People value a service more when they can see the effort involved. It’s the reason travel websites show you a progress bar of “searching airline databases” even when the work could happen instantly. We trust the result because we saw the sweat. In conversational interfaces, this often manifests as latency theater:3 deliberate pauses or streaming tokens that mimic the cadence of human thought. For a novice user, those delays can be reassuring. They suggest deliberation. They make the machine feel human.
But in developer tools, the theater isn’t the pause. It’s the trace. Tool calls and diffs are how experienced users verify that the agent did the right kind of work. Hide that context, and you don’t just reduce clutter. You make every output feel cheaper and less trustworthy, because you’ve hidden the only evidence a supervisor can use to calibrate belief.
Closing thoughts
Realistically, we’re reaching the limits of the current design patterns of these tools as single scrolling transcripts. A CLI can do more than print lines: stable status spines, in-place expansion, split views, hyperlinks, and a persisted trace you can grep later. But the stdout transcript paradigm isn’t built for supervising an autonomous system with a long, branching trajectory.
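One of those primitives, the persisted trace, can be sketched as an append-only JSONL log. The schema and field names below are assumptions for illustration, not any real tool’s format:

```python
import json
import time

TRACE = "trace.jsonl"

def log_tool_call(tool: str, args: dict, result: str) -> None:
    """Append one structured line per tool call (illustrative schema)."""
    entry = {"ts": time.time(), "tool": tool, "args": args, "result": result}
    with open(TRACE, "a") as f:
        f.write(json.dumps(entry) + "\n")

# Start a fresh trace for this session.
open(TRACE, "w").close()
log_tool_call("read_file", {"path": "src/main.py"}, "212 lines")
log_tool_call("grep", {"pattern": "TODO", "glob": "**/*.py"}, "3 matches")

# The on-screen transcript may only say "Read 3 files", but the JSONL
# file keeps the full audit trail, recoverable later with e.g.
# `grep read_file trace.jsonl`.
with open(TRACE) as f:
    tools = [json.loads(line)["tool"] for line in f]
print(tools)  # ['read_file', 'grep']
```

Because each line is self-contained JSON, the trace stays greppable with ordinary shell tools even after the terminal scrollback is gone.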
I have some ideas about how to solve this, but I’ll save them for a later post. The point I’ll leave you with is that applying the old, tried solutions to agentic tooling isn’t going to work the way it did for classically coded software. And as users’ technical proficiency continues to rise, the demand for high-fidelity interfaces—those that offer steerability rather than just simplicity—will only grow. CLI apps aren’t exempt from proper user experience design.
- Nielsen Norman Group. “10 Usability Heuristics for User Interface Design.” https://www.nngroup.com/articles/ten-usability-heuristics/ ↩
- `<b></b>` could work, but so could `<div id="bold" class="bold-sometimes bold-always bold-when-in-container"><b><strong></strong></b></div>` ↩
- Kruger, J., Wirtz, D., Van Boven, L., & Altermatt, T. (2004). The effort heuristic. Journal of Experimental Social Psychology, 40(1), 91–98. https://doi.org/10.1016/S0022-1031%2803%2900065-9 ↩