In the past few weeks, I’ve become a power user of agentic AI coding tools. Thousands of lines of code have been written, dozens of bugs have been solved, and entire repositories have been refactored. At least 95% of the code was written by the AI. I just made small tweaks here and there, mostly cosmetic changes to styling and text labels. Over these weeks, I’ve learned a lot. This post is a list of tips for people new to AI coding agents.
TL;DR — Click Links for Details
- Tip 1 — Use the Smartest Model Available
- Tip 2 — Delegate Big Chunks, Not One-Liners
- Tip 3 — Plan First, Then Implement
- Tip 4 — Codify Patterns in docs/ and Reuse Them
- Tip 5 — Allow Duplication; Run Refactoring Passes
- Tip 6 — Ask What’s Blocking the Agent
- Tip 7 — Reuse Strong Context Across Tasks
- Tip 8 — Work in Parallel with Multiple Agents
- Tip 9 — Make the Agent Test Its Own Code
 
Tip 1 — Use the Smartest Model Available
I’ve seen people recommend dumb but fast models (Sonnet 3.7) over smart and slow ones (Opus 4.1). I disagree. Default to the smartest model and give it room to think (set high reasoning effort). At the time of writing, the models I use are Opus 4.1 and gpt-5-codex with high reasoning effort.
The smarter the model, the better it is at reading across files, sticking to constraints, and resisting the urge to add unsolicited features. It does what you ask rather than what it imagines you want. Prompt adherence improves with model capability. Speed also matters, but you can always run multiple slower agents in parallel (Tip 8).
Tip 2 — Delegate Big Chunks, Not One-Liners
Your first intuition might be to use AI as an advanced autocomplete. Don’t! Delegate tasks in big chunks and let the agent plan the steps. Small cosmetic tweaks (labels/CSS) are usually faster to do by hand. I never ask the model to “write a for-loop” or follow a list of detailed micro-instructions. If a prompt requires line-by-line instructions (“open X, change Y, then Z”), the task is too granular: either expand it into a feature or do it by hand.
I think of working with AI agents as managing a small team of software developers, not as having a pair-programming partner. Just like with human coworkers, the less you have to micromanage, the better. I'm always testing the limits of the models and trying to push the task size as high as I can. When you can prepare a task, hand it off to an agent, and then go do something else while it's working, you're starting to see real productivity gains.
Tip 3 — Plan First, Then Implement
AI agents tend to do more than you asked for. That’s just how they’ve been trained. They don’t know the scope and demands of your projects, so they often overengineer and add unnecessary features or ‘hardening’ that you didn’t ask for. Planning can significantly mitigate this problem.
What I mean by planning is simply asking the agent to think through the problem and output a plan for the code changes the task requires before it writes any code. In my experience, this has three benefits:
- You can confirm that you and the agent are on the same page as far as the goals and constraints of the task. At this point you can spot flaws in the plan, usually some extra feature you didn’t ask for, and fix them. This is easier than fixing the code afterward.
 - It forces the agent to think through the problem before jumping straight into code. I often have to encourage it to think more and spend more time on a problem instead of rushing to implementation.
 - It gives you a chance to start with a vague request and refine it as you see the agent fill in the gaps. This is useful if you have a vague idea of what you’d like to do but don’t quite know how to perfectly articulate it yet. You can outsource some of that hard work to the agent.
 
Tip 4 — Codify Patterns in docs/ and Reuse Them
Because agents have a limited context window and no real long-term memory, you have to collect important information about the codebase into files that you can feed to the AI when needed.
I usually have a docs folder where I store such files. For example, in a recent project I had a form-view.md file that included common patterns used in all forms in different parts of the app. This included, among other things, styles, server-side and client-side validation, and common components. Then I ask the AI to study that file before giving it a task involving forms. (e.g. “Make sure to adhere to patterns established in @docs/form-view.md”)
I also added a list of these files, along with a short description, to my AGENTS.md. This strategy is extremely helpful in maintaining uniformity of style and coding patterns throughout the codebase.
Important: Don’t write these files yourself; use the AI. Work with the AI to develop one canonical example of what you want to codify, then ask it to write the doc for you. Read the result and adjust anything you don’t like. Remember to keep the doc up to date if you later decide to change a pattern.
Tip 5 — Allow Duplication; Run Refactoring Passes
In my experience, AI agents are very hesitant to change existing code. They’d rather copy an existing function and change one line in it than adapt the existing function to the new requirements. They are terrified of causing bugs unrelated to their current task.
At first glance, this can seem like a terrible trait and is probably one of the main reasons people claim that AI coding agents are only good for quick one-off scripts. And it is true that if you just let them run amok in your codebase, they will turn it into a mess.
Fortunately, they are also quite good at refactoring. You just have to explicitly prompt them to do so. My current strategy is to let the agent write messy code with lots of duplication and then:
- Ask it to write a test for the new feature it just implemented, then have it refactor the code (see the sketch after this list).
 - Periodically ask it to refactor a specific file or function I find messy.
 - Ask it to find opportunities for refactoring. It often comes up with good suggestions I wouldn’t have thought of.
 - If I want to go in and manually change something and find it difficult, I’ll ask the agent to make that part of the code easier to read.
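
To make the first item above concrete, here’s a minimal sketch of the kind of test I mean: a characterization test that pins down the current behaviour before the refactoring pass. The function name, module, and expected values are hypothetical placeholders, not code from a real project.

```python
# A characterization test written *before* a refactoring pass.
# build_invoice_summary is a hypothetical function whose current behaviour
# we want to lock in so the agent can clean up duplication without changing results.
import pytest

from myapp.invoices import build_invoice_summary  # hypothetical module


def test_invoice_summary_is_stable_across_refactor():
    items = [{"name": "Widget", "qty": 2, "unit_price": 9.99}]
    summary = build_invoice_summary(items)
    # These expected values pin the existing behaviour; if the refactor
    # changes them, the agent knows it broke something.
    assert summary["total"] == pytest.approx(19.98)
    assert summary["line_count"] == 1
```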
 
Tip 6 — Ask What’s Blocking the Agent
When the agent struggles to complete a task, don’t just ask it to fix things over and over. Ask it to explain what’s blocking it; the answer often surfaces design flaws and suggests next steps.
I was working on a web app that used the htmx and Alpine.js libraries. If you’re not familiar with them, don’t worry; the details aren’t important. The point is that you can use these libraries in several different ways, and the state of your app can live in the JS runtime (Alpine.js state), in HTML form inputs (htmx), or in a combination of both. I was trying to get my agent to add a simple multiselect component to an HTML form, and it struggled. After it failed a third time, I simply asked why this seemingly easy task was so hard. My prompt was something like: “This is the third time you failed to complete this simple task. Why is it so hard?”
It proceeded to explain that the way the app was using htmx and Alpine together was confusing and recommended that we refactor it for a clearer separation of concerns between the two. I agreed, so we refactored the code and then documented the pattern (Tip 4).
The refactoring and the documentation together made a huge difference in the agent’s performance. It became much more reliable and consistent in its edits. This was probably the single most important learning experience for me during these weeks of AI experimentation.
Tip 7 — Reuse Strong Context Across Tasks
Rebuilding context for every task wastes cycles and invites mistakes. Branch from a known, good state and reuse it across related work.
Both Claude Code and Codex have a feature that lets you go back in the chat history. This saves time because you can encourage the AI to build up good context and then reuse it for multiple tasks. Let’s say you’re working with a specific module of your app and you have multiple (more or less) independent tasks for the agent. You can ask the agent to explain how that module works. Let it explore the codebase and explain it to you. Make sure the explanation is correct before continuing.
The goal is to have the agent accumulate the files and functions relevant to the upcoming tasks in its context. Once you have good context, you can ask the agent to do the first task. When it’s done, you can go back and restore the context the agent built earlier, then ask it to do the next task. Now you don’t have to repeat the context gathering for each task.
Claude Code takes this a step further by letting you initiate a new agent with the current context. This lets you branch off from a good state and spin off multiple agents with their own tasks using the same base context. Alternatively, you can use this technique to launch multiple parallel attempts at a very difficult task. Unfortunately, at the time of writing, Codex doesn’t yet let you start a new session using the context from a previous one.
Tip 8 — Work in Parallel with Multiple Agents
If you find yourself scrolling Twitter while waiting for your agent to finish, run multiple agents in parallel so you always have something meaningful to do.
I recommend always using the smartest model and giving it large tasks and plenty of time to think and gather context. This introduces a new problem: what are you going to do while the AI is working? The solution is simple: run multiple AI agents in parallel so you always have something to work on. If I don’t, I soon find myself watching YouTube and forgetting what I was doing, and my productivity tanks.
Working with multiple agents is a skill you have to learn, and it can feel overwhelming at first. At the moment I find that three agents is the maximum my brain can handle. This number is likely to increase in the future as the agents become capable of longer stretches of autonomous work. I might write a separate post on how I currently wrangle multiple agents. Hint: it involves tmux, Neovim, and Git worktrees.
Tip 9 — Make the Agent Test Its Own Code
Copying failures back and forth stalls progress and hides bugs. Make the agent run tests, read errors, and iterate until green.
This is an important tip. A lot of the frustration people experience with AI comes from having to prompt it to solve endless errors in its code by copying error logs into its context. If you find yourself doing this a lot, stop and think about how you can help the agent detect errors before it declares the task finished. You want the AI to get into a feedback loop where it’s constantly testing and adjusting its code until it complies with requirements.
The more you can make the AI test its code, the less time you have to spend copy-pasting error messages. Even really simple tests like “Make sure the page loads without errors” are useful because they let the AI catch silly mistakes before it stops and declares the task done. If you’re working on a web app, you can also give the agent access to a Playwright MCP server so that it can run quick smoke tests. This is something I’m still experimenting with; so far, the tests have made a bigger difference than the MCP server.
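
As a concrete example of the kind of smoke test I mean, here’s a minimal sketch using pytest and Playwright’s Python sync API. The URL, the form selector, and the assumption that a dev server is already running locally are all placeholders for your own setup.

```python
# A minimal "the page loads without errors" smoke test (pytest + Playwright).
# Assumes `pip install pytest playwright` and `playwright install chromium`,
# and that a dev server is already running at the (hypothetical) URL below.
from playwright.sync_api import sync_playwright


def test_form_page_loads_without_console_errors():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        console_errors = []
        page.on(
            "console",
            lambda msg: console_errors.append(msg.text) if msg.type == "error" else None,
        )

        page.goto("http://localhost:8000/forms/new")  # hypothetical route

        # The page rendered something useful and logged no JS errors.
        assert page.locator("form").count() > 0
        assert not console_errors, f"Console errors on load: {console_errors}"

        browser.close()
```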
If you have a difficult bug, it’s worth instructing the agent to write a test that reproduces it before asking it to fix the bug. Otherwise, you may get stuck in a loop where the AI tries various fixes, claims it has fixed the bug, but when you test, it’s still there.
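
Here’s a sketch of what that looks like in practice: the agent first writes a failing test that captures the reported behaviour, and only then attempts the fix. Everything here (parse_price, the module path, and the comma-decimal bug) is a made-up stand-in for your actual bug.

```python
# A bug-reproduction test written before any fix is attempted.
# Until the bug is actually fixed, this test fails, which keeps the agent
# from claiming victory too early.
import pytest

from myapp.pricing import parse_price  # hypothetical module


def test_parse_price_handles_comma_decimal_separator():
    # Reported bug: "12,50" was parsed as 1250 instead of 12.50.
    assert parse_price("12,50") == pytest.approx(12.50)
```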
What’s Next?
I’m constantly pushing the agents to take on bigger, harder tasks to see where the limits are. As they get better, I expect the size of tasks they can do to grow. At the same time, the amount of time they can work autonomously will also increase. This means that, in the future, I can probably run more agents side by side, each tackling bigger tasks. Perhaps not too far from now, I’ll set up a queue of tasks at the end of the day and check their completed work the next morning.
Now that AI writes the vast majority of the code, the bottleneck has become how much code I can read. So the logical next question is: how can I avoid reading all this code? I need to be able to understand what’s going on in my codebase, but I would also like to delegate some of the reading. This is a work in progress, but the obvious next step is to use AI for code reviews. This might be something I explore in future posts.