I was a few months into my first full-time engineering role when I got pulled into a business-critical cleanup project.
There were 100+ open remediation tickets, a hard external deadline, and a senior-led automation effort that had already been running for weeks.
On paper, the remaining work looked small.
The ticket basically said:
Fix the remaining formatting issue, deploy the automation, and let it clean up the backlog.
But the situation was stranger than that.
The automation was not stable. It produced unreliable outputs. Tool calls failed. Results were hard to control. Some of the proposed fixes were moving toward more automation around the broken automation — agents to repair the first agent’s mistakes.
I had not owned the project before.
I had no deep background in that domain.
I had limited experience with parts of the stack.
And I was the junior engineer.
Two days later, the board was at zero open items — and the business-critical outcome behind the cleanup had been met.
That contrast raises more than one question.
How did a junior engineer clear 100+ business-critical tickets in two days after a senior-led automation effort had stalled for weeks?
But also:
What actually made that possible?
What can this teach us about solving unfamiliar, messy, cross-system engineering problems?
And what does this reveal about building practical AI automation that works under real business pressure?
This article answers those questions through the cleanup itself.
Not as a polished after-the-fact framework.
Not as a story where the plan was obvious from the beginning.
But as it unfolded: a worried manager, an unavailable previous owner, a misleading ticket, a broken assumption, a rabbit hole that changed the diagnosis, a map that changed the strategy, and a cleanup that revealed where the real automation opportunity was hiding.
The goal is not just to explain what happened.
The goal is to show the operating system behind it — the way of thinking that made it possible, and how the same pattern can be reused for future problems where the stack is unfamiliar, the context is scattered, and the deadline is real.
The moment I raised my hand
Before I got involved, I was not part of the main project.
I was doing normal junior-engineer work: working in existing services, adding API endpoints, helping with internal tooling, learning the stack, and trying to become useful without slowing everyone down.
The cleanup project had been running elsewhere.
From the outside, I knew only fragments: there was an automation effort, it was supposed to clean up a backlog, and it involved tickets, repositories, pull requests, and some kind of agentic workflow.
Then the project appeared in planning.
There was a deadline approaching. The backlog still existed. The automation was supposed to handle it. My manager was worried about the ticket, and the person who had been working on it before was not available at that moment.
I was not formally handed a clean, well-scoped task.
I saw the risk, connected it to fragments I had heard before, and proactively said I could take a look.
The ticket itself sounded narrow:
Fix the remaining formatting issue and deploy the automation.
That wording made the situation sound almost finished.
One last issue.
One deployment.
Then the automation would run over the backlog.
But before committing to that path, I asked a clarifying question that changed the whole direction of the work:
Is the real outcome that the automation gets deployed, or that the tickets are cleared before the deadline?
That sounds like a small question.
It was not.
If the outcome was “deploy the automation,” then the obvious path was to fix the current automation.
If the outcome was “clear the tickets,” then deploying the automation was only one possible path. Maybe the right path. Maybe not.
The answer was clear enough: the tickets had to be cleared.
The automation mattered, but it was not the outcome.
That gave me a different kind of permission.
I no longer had to blindly follow the acceptance criteria if the acceptance criteria were not the fastest route to the business outcome.
So I opened the project.
Cold.
No onboarding.
No deep domain context.
No clear map of the repositories involved.
Just a short ticket, a broken automation, and a board with more than 100 unresolved items.
The “formatting issue”
The first thing I tried to understand was the alleged formatting issue.
The existing automation had created pull requests, and those pull requests were failing in CI.
The failing pipeline step had a name that made it sound like formatting.
So the project assumption became: the agent mostly works, but the pull requests need a formatting fix.
That assumption was expensive.
Because once I pulled the actual context together — ticket data, branches, pull requests, CI logs, runtime logs, repository files — the story changed.
The failure was not formatting.
The pipeline step was called something formatting-related, but the actual check was failing because a required lockfile was missing or inconsistent.
The automation had updated a dependency, but it had not regenerated the lockfile that made the dependency change valid for the repository.
That single detail opened the rabbit hole.
To fix it properly, the automation would need to do much more than edit a dependency version.
It would need to clone the target repository.
It would need dependency tooling to work in the execution environment.
It would need internal package credentials.
It would need to run dependency resolution.
It would need to recover when dependency resolution failed.
It would need to retry version bumps with feedback.
And it would need to do that across repositories that did not all look the same.
This was not one formatting bug.
This was a sign that the automation was much further from reliable execution than the ticket assumed.
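To make the gap concrete, here is a minimal sketch of only the last requirement on that list: retrying version bumps and keeping the failure feedback. It is an illustration, not the project's code. The names `bump_with_feedback` and `fake_resolver` and the version strings are hypothetical, and the resolver is a stub standing in for the clone, credential, and dependency-resolution steps.

```python
from typing import Callable, List, Optional, Tuple


def bump_with_feedback(
    candidates: List[str],
    try_resolve: Callable[[str], Tuple[bool, str]],
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Try candidate versions in order and keep every failure message.

    Returns the first version that resolves (or None) plus the failure log,
    so the next attempt or a human reviewer can see why earlier bumps failed.
    """
    failures: List[Tuple[str, str]] = []
    for version in candidates:
        ok, message = try_resolve(version)
        if ok:
            return version, failures
        failures.append((version, message))
    return None, failures


def fake_resolver(version: str) -> Tuple[bool, str]:
    # Hypothetical stub. A real resolver would have to clone the repository,
    # authenticate against the package registry, and regenerate the lockfile.
    if version == "2.1.0":
        return True, "lockfile regenerated"
    return False, f"conflict while resolving {version}"


chosen, log = bump_with_feedback(["3.0.0", "2.1.0", "2.0.5"], fake_resolver)
print("chosen:", chosen)
for version, reason in log:
    print(version, "->", reason)
```

Even this toy loop only works once everything behind the stub works, which is where the real effort was.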
At that point, I had a choice.
I could spend the deadline trying to make this specific automation path work.
Or I could stop, zoom out, and ask whether this was even the most important part of the backlog.
I chose to stop.
The map I did not have
At that moment, I still did not know what the 100+ tickets actually were.
I knew the automation had failed on a few examples.
But I did not know whether those examples represented the whole backlog.
Were all tickets the same kind of issue?
Were they concentrated in a few repositories?
Were they spread across many technologies?
Were they real unresolved problems?
Were some already fixed?
Were some false positives?
Were some duplicates?
Were some waiting for a decision rather than a code change?
Without that map, fixing the current rabbit hole could have been a local optimization.
So before implementing the “formatting fix,” I analyzed the backlog itself.
I pulled ticket data.
I grouped items by repository, component, technology, current status, and likely resolution path.
I turned the backlog from a long, undifferentiated list into a structured picture of where the work actually was.
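The grouping step itself is small. Here is a minimal sketch, assuming the tickets have already been exported into simple records. The `Ticket` fields, the sample data, and the `cluster_tickets` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    # Hypothetical fields; a real ticket export will look different.
    key: str
    repository: str
    technology: str
    status: str
    likely_resolution: str  # e.g. "already_fixed", "needs_pr", "needs_decision"


def cluster_tickets(tickets, *fields):
    """Group ticket keys by the given attribute names, largest cluster first."""
    clusters = defaultdict(list)
    for ticket in tickets:
        group = tuple(getattr(ticket, name) for name in fields)
        clusters[group].append(ticket.key)
    return sorted(clusters.items(), key=lambda item: len(item[1]), reverse=True)


# Made-up sample data, for illustration only.
sample = [
    Ticket("T-1", "repo-a", "python", "open", "already_fixed"),
    Ticket("T-2", "repo-a", "python", "open", "already_fixed"),
    Ticket("T-3", "repo-b", "container", "open", "needs_pr"),
    Ticket("T-4", "repo-a", "python", "open", "needs_decision"),
]

for group, keys in cluster_tickets(sample, "repository", "likely_resolution"):
    print(group, len(keys), keys)
```

Sorting by cluster size is a deliberate choice: it puts the groups where one resolution pattern could clear the most tickets at the top.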
The picture changed immediately.
The lockfile issue was real, but it was only one part of the landscape.
Some clusters were already resolved in code but still open on the board.
Some were duplicates or byproducts of the same underlying change.
Some needed a decision rather than a pull request.
Some needed a targeted code change.
Some were more complex than a general-purpose agent would reasonably handle under deadline pressure: transitive dependency chains, container builds, ML-related changes, cross-repository effects, and cases where another team needed to review the final result.
The board was not one problem.
It was many different problems wearing the same ticket label.
And that changed the work again.
The first visible progress
Once the backlog had shape, I stopped treating it as 100+ independent tickets.
I treated it as clusters.
I started with the largest groups.
For each group, I did not immediately scale.
I picked one ticket.
I investigated it.
I found the resolution path.
I checked whether the same reasoning applied to a second ticket.
Then a third.
Only then did I start parallelizing.
This was the pattern:
Make one unit work. Capture the reasoning. Test it on the next unit. Adjust. Then scale.
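That loop can be sketched as a small function. It is a sketch under assumptions, not the tooling I used: `pilot_then_scale`, `resolve`, and `confirm` are hypothetical names, and `confirm` stands in for the human check on each pilot case.

```python
def pilot_then_scale(items, resolve, confirm, pilot_size=3):
    """Validate a resolution pattern on a few items before applying it in bulk.

    Returns (done, pending). If a pilot item fails confirmation, the function
    stops and hands back everything from that item onward.
    """
    items = list(items)
    pilot, rest = items[:pilot_size], items[pilot_size:]
    done = []
    for index, item in enumerate(pilot):
        result = resolve(item)
        if not confirm(item, result):
            # The pattern did not transfer: stop and adjust before scaling.
            return done, items[index:]
        done.append((item, result))
    # The pattern held on every pilot item, so apply it to the rest.
    done.extend((item, resolve(item)) for item in rest)
    return done, []


resolved, pending = pilot_then_scale(
    ["T-1", "T-2", "T-3", "T-4", "T-5"],
    resolve=lambda key: f"close {key}: fixed by earlier version bump",
    confirm=lambda key, result: key != "T-3",  # pretend the third check fails
)
print(len(resolved), "resolved;", pending, "sent back for another look")
```

In this made-up run the third pilot case fails, so nothing is scaled and three tickets go back for adjustment.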
That mattered because some of the biggest wins were not code fixes at all.
A group of tickets looked open, but the underlying issue had already been fixed by previous version bumps.
Another group was resolved as a byproduct of a broader dependency change.
Some tickets required classification or acceptance rather than implementation.
A few did require actual pull requests, and those were not always trivial.
But by the time I reached them, I knew they were worth spending time on.
I was not guessing anymore.
The hidden advantage
This is where the story needs the missing piece.
I did not clear the backlog by manually clicking through hundreds of browser tabs like a human crawler.
Before this project, I had already been building local AI-assisted workflows for myself.
I had written scripts and CLIs around internal engineering systems: repositories, pull requests, CI logs, tickets, and runtime outputs. I wanted my local agents to answer questions that normally required stitching context together across many systems.
That work had started as personal leverage.
I wanted to understand large codebases faster.
I wanted to inspect many repositories without opening each one manually.
I wanted agents to read logs, find related pull requests, compare similar cases, and summarize what mattered.
At some point, people had noticed. I had used the setup for cross-repository analysis, and it made some previously slow questions answerable much faster.
So when this cleanup landed, I was not starting with only the official project repository.
I had a local context engine.
It could fetch ticket data, inspect linked branches and pull requests, read CI logs, inspect repository files, compare similar cases, identify stale cases, generate candidate fixes, group tickets by shared root cause, and preserve learned resolution patterns for the next batch.
That was the hidden advantage.
The agents were not magic decision-makers.
They were context amplifiers.
They removed the search, navigation, comparison, and repetition overhead around judgment.
The human role became deciding what was safe:
- Is this ticket actually still valid?
- Is this already fixed?
- Is this a duplicate symptom of another change?
- Is this safe to close?
- Does this require a code change?
- Does this require another team’s review?
- Can this pattern be reused across the next 20 tickets?
That distinction mattered.
I was not replacing judgment with an agent.
I was using agents to create enough context for judgment to happen fast.
The board started moving
As the cleanup progressed, I made the movement visible.
I tracked the backlog by status, repository, and category, and kept the state current as the work moved forward.
Instead of sending vague updates like “working on it,” I could show which parts of the backlog had moved, which clusters were still open, and where the remaining risk was concentrated.
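A status update of that kind can be generated rather than written by hand. Here is a minimal sketch, assuming each ticket has been reduced to a category and an open/closed flag. The category names and the `progress_snapshot` helper are hypothetical.

```python
from collections import Counter


def progress_snapshot(tickets):
    """Summarize open vs. total tickets per category, riskiest bucket first.

    `tickets` is an iterable of (category, is_open) pairs.
    """
    total = Counter()
    still_open = Counter()
    for category, is_open in tickets:
        total[category] += 1
        if is_open:
            still_open[category] += 1
    ordered = sorted(total, key=lambda c: still_open[c], reverse=True)
    return [f"{c}: {still_open[c]} open of {total[c]}" for c in ordered]


# Made-up board state, for illustration only.
board = [
    ("already_fixed", False),
    ("already_fixed", False),
    ("needs_pr", True),
    ("needs_pr", False),
    ("needs_decision", True),
    ("needs_decision", True),
]

for line in progress_snapshot(board):
    print(line)
```

Listing the buckets with the most open tickets first keeps the remaining risk at the top of every update.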
That visibility mattered.
It helped me choose the next cluster.
It helped stakeholders understand progress without interrupting the work.
And it changed the emotional state of the project.
A backlog that had looked like an undifferentiated wall of risk became a shrinking set of known buckets.
That is very different from saying “the agent is almost fixed.”
Making progress visible forces the problem to become concrete.
The last tickets
By the end, the pattern was clear.
Most of the backlog had not needed a dedicated code fix.
It needed investigation, grouping, validation, classification, and only then selective implementation.
Some tickets were stale.
Some were already fixed.
Some shared the same root cause.
Some were resolved by one change that affected multiple tickets.
Some needed a human decision.
Only a minority required dedicated pull requests.
And several of those dedicated fixes were exactly the kind of work I would not have wanted a broad, unstable agent to attempt blindly under a deadline.
Two days after taking over the cleanup, the board had zero open items.
The business-critical outcome was achieved.
The senior-led automation effort had not been useless.
It had exposed a real ambition: reduce a painful operational backlog through automation.
But the cleanup revealed that the original target was too broad for the deadline.
The immediate problem was not that an agent could not write code.
The immediate problem was that nobody had a reliable, current classification of the work.
Once the work was classified, most of it became much easier to resolve.
What the cleanup revealed
Only after the cleanup did the real automation opportunity become obvious.
The first instinct had been to build a fixer: an agent that takes a ticket, changes code, opens a pull request, and somehow handles all the messy edge cases.
But the cleanup showed something different.
The highest-leverage automation was not fixing.
It was triage.
Because if most tickets do not need a dedicated code change, then the system should not begin by trying to edit code.
It should begin by answering:
- Is this still open in reality?
- Is it already fixed?
- Is it a duplicate or byproduct of another issue?
- Does it need a decision?
- Does it need a pull request?
- Which repository and team does it belong to?
- What evidence supports that classification?
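Those questions map naturally onto a structured record. Here is a minimal sketch, assuming a fixed set of verdicts. The `Verdict` labels, the `TriageResult` fields, and the `next_step` routing are hypothetical and are not the production design.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Verdict(Enum):
    STALE = "stale"
    ALREADY_FIXED = "already_fixed"
    DUPLICATE = "duplicate"
    NEEDS_DECISION = "needs_decision"
    NEEDS_PR = "needs_pr"


@dataclass
class TriageResult:
    ticket: str
    verdict: Verdict
    repository: str
    owner_team: str
    evidence: List[str] = field(default_factory=list)


def next_step(result: TriageResult) -> str:
    """Route a classified ticket; a verdict without evidence goes to a human."""
    if not result.evidence:
        return "escalate to a human: no evidence recorded"
    if result.verdict in (Verdict.STALE, Verdict.ALREADY_FIXED, Verdict.DUPLICATE):
        return "propose closing, with evidence attached"
    if result.verdict is Verdict.NEEDS_DECISION:
        return f"ask {result.owner_team} for a decision"
    return f"open a pull request in {result.repository}"


example = TriageResult(
    ticket="T-7",
    verdict=Verdict.ALREADY_FIXED,
    repository="repo-a",
    owner_team="team-x",
    evidence=["fixed version already present on the main branch"],
)
print(next_step(example))
```

The evidence check comes first on purpose: a classification nobody can verify should never trigger a close or a code change.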
That insight became the starting point for the production automation that came later.
But that is a separate article.
This one is about the cleanup — and what made the cleanup possible.
The operating system behind it
Looking back, the result did not come from one magic prompt or one heroic all-nighter.
It came from an operating loop.
1. Decode the request behind the request
The visible ticket said: fix the remaining formatting issue and deploy the automation.
The real business outcome was: clear the backlog before the deadline.
That distinction created room for a better strategy.
2. Build enough context to see the whole system
The first visible failure was a CI step whose name suggested a formatting problem.
The real issue involved dependency artifacts, repository setup, runtime constraints, credentials, and iterative resolution.
Without cross-system context, it would have been easy to solve the wrong problem.
3. Avoid optimizing the first rabbit hole
The lockfile issue was real.
But before spending the deadline on it, I needed to know whether it represented the whole backlog.
It did not.
4. Map before scaling
Structuring the backlog turned a vague pile of tickets into clusters.
Once the clusters were visible, the work became sequenceable.
5. Make one unit work before scaling
For each category, I solved one case first, captured the pattern, tested it on the next cases, and only then scaled to many.
That avoided scaling a broken process.
6. Use agents for context and repetition, not blind authority
The agents were most useful for fetching, comparing, reading, summarizing, drafting, and preserving patterns.
The final judgment stayed with the human.
7. Make progress visible
Progress tracking was not decoration.
It created trust, reduced status-update overhead, and made the shrinking risk visible.
The takeaway
The reason I could clear 100+ tickets in two days was not that I built a bigger agent than the previous team.
It was that I stopped treating the backlog as a code-generation problem.
I treated it as a context, classification, and workflow problem first.
That changed everything.
Under deadline pressure, the winning automation is often not the most autonomous system.
It is the system that makes the real bottleneck visible, compresses the repetitive work, and keeps human judgment exactly where mistakes are expensive.
That was the difference between a stalled automation effort and a cleared board.
In the next article, I’ll show how that cleanup turned into a production-ready triage architecture: structured outputs, deterministic actions, decision labels, and a clean separation between LLM reasoning and system side effects.