AI Optimization Loops | Live Demo
Developer Productivity
AI-Assisted Iterative Optimization Loops
What happens when you give an AI a goal, a metric, and permission to loop?
Two live demos: code coverage and performance optimization
Same codebase, different objectives, same pattern
Claude Code — Anthropic's AI coding agent in the terminal
The Problem
We write code. We manually profile. We manually optimize.
Developer spots a slowdown, fires up VisualVM, reads flame graphs
Writes a fix, re-benchmarks, checks if it helped
Repeat... until the sprint ends or patience runs out
Same for test coverage: write tests, check JaCoCo report, write more tests
80%
of optimization time is spent on measurement, not coding
3-5
manual cycles before most developers move on
The Shift
What if the AI could run the tools, read the output, and decide what to do next?
Claude Code can execute shell commands: mvn, JaCoCo, JFR
It can read structured output and reason about what it means
It can modify code and re-run measurements automatically
The missing piece: a loop with a termination condition
while (goal_not_met) {
    measure()        // run the tool, capture metrics
    analyze()        // reason about what to improve
    apply_change()   // modify code with best optimization
    re_measure()     // verify improvement
}
Demo 1 — Test Coverage
Measure coverage. Write tests. Loop until 92%.
Run JaCoCo
Parse Coverage %
< 92%
Find Gaps
Write Tests
← loop back
≥ 92%
Done!
~10%
Starting coverage (3 tests)
92%+
Target coverage
4-6
Expected iterations
Live Demo
Watch the coverage climb in real time
Starting: 3 tests, ~10% line coverage
The AI reads the JaCoCo report, identifies uncovered classes and methods
Each iteration: new tests appear, coverage jumps
Terminates when coverage ≥ 92%
// The prompt I'll feed to Claude Code:
Run `mvn clean test jacoco:report`, then read
target/site/jacoco/jacoco.csv to determine line coverage %.
If below 92%, analyze the report for uncovered classes,
write JUnit 5 tests targeting those gaps, and re-run.
Keep looping until line coverage ≥ 92%.
Switching to terminal...
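For reference, the coverage check at the heart of the loop is small. In a JaCoCo CSV report, each class row carries `LINE_MISSED` and `LINE_COVERED` columns (indices 7 and 8); summing them across rows gives overall line coverage. A minimal sketch, assuming the default report path — the `CoverageCheck` class name is illustrative, not part of the demo code:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Minimal sketch: compute overall line coverage % from a JaCoCo CSV report.
public class CoverageCheck {
    static double lineCoverage(List<String> csvLines) {
        long missed = 0, covered = 0;
        for (String row : csvLines.subList(1, csvLines.size())) { // skip header row
            String[] cols = row.split(",");
            missed  += Long.parseLong(cols[7]);  // LINE_MISSED
            covered += Long.parseLong(cols[8]);  // LINE_COVERED
        }
        return 100.0 * covered / (missed + covered);
    }

    public static void main(String[] args) throws Exception {
        List<String> rows = Files.readAllLines(Path.of("target/site/jacoco/jacoco.csv"));
        double pct = lineCoverage(rows);
        System.out.printf("line coverage: %.1f%%%n", pct);
        System.exit(pct >= 92.0 ? 0 : 1); // non-zero exit = keep looping
    }
}
```

This is exactly the kind of machine-readable signal the AI parses each iteration to decide whether to stop.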
Demo 2 — Performance Optimization
Benchmark. Analyze. Optimize. Commit only what works.
Run Benchmark
Analyze Timing
Apply Best Fix
Re-Benchmark
≥ 2% faster
Commit + Log
reset failures = 0
← loop back
< 2% gain
failures++
if 3 in a row →
Stop
~6.5s
Starting pipeline time
22x
Expected total speedup
3
Consecutive failures = stop
Live Demo
~6.5 seconds to start. Let's see how low it goes.
Same Java app, deliberate inefficiencies planted
The AI discovers bubble sort, redundant I/O, regex recompilation...
Each successful optimization: git commit + CHANGELOG entry
Watch the CHANGELOG grow in real time
// The prompt I'll feed to Claude Code:
Iteratively optimize this Java app for performance.
Run the benchmark, analyze the timing breakdown,
pick the single most impactful optimization, apply it.
If ≥ 2% faster: commit + update CHANGELOG.
If < 2%: count as failure.
3 consecutive failures = stop.
Switching to terminal...
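The ≥ 2% rule only works if the benchmark produces a repeatable number. A sketch of the kind of wall-clock harness behind it, using median-of-N to dampen noise — the `Bench` class and its method names are stand-ins, not the demo's actual entry points:

```java
import java.util.Arrays;

// Minimal sketch: time a workload and report the median of N runs.
public class Bench {
    static long medianMillis(Runnable workload, int runs) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload.run();
            samples[i] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(samples);
        return samples[runs / 2]; // median is more stable than a single run
    }

    // A change "counts" only if it beats the baseline by >= 2%.
    static boolean improved(long baselineMs, long newMs) {
        return newMs <= baselineMs * 0.98;
    }
}
```

The 2% threshold is the noise floor: anything smaller is indistinguishable from run-to-run variance on a warm JVM.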
Results
The AI found optimizations we planted. Here's what it did.
Check git log for auto-generated commit messages
Check CHANGELOG.md for documented optimizations
Before/after timing comparison from the benchmark output
Optimization                          | Category        | Impact
Bubble sort → Collections.sort()      | Algorithm       | Very Large
File re-reading → load once           | I/O             | Large
Pattern.compile in loop → static      | Object creation | Moderate
String += → StringBuilder             | String handling | Moderate
DateTimeFormatter per call → cached   | Object creation | Moderate
ArrayList scan → HashMap              | Data structure  | Moderate
Redundant copies → single pass        | Memory          | Small
Actual results will vary — the AI may discover them in a different order
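As one concrete example of the mechanical fixes in the table: hoisting `Pattern.compile` out of a loop. `Pattern` instances are immutable and thread-safe, so compiling once into a static field removes per-element recompilation. The surrounding method is illustrative, not the demo's actual code:

```java
import java.util.List;
import java.util.regex.Pattern;

public class EmailFilter {
    // Before: Pattern.compile("...") inside the loop recompiled the regex per element.
    // After: compile once; the immutable, thread-safe Pattern is safely reused.
    private static final Pattern EMAIL =
        Pattern.compile("^[\\w.+-]+@[\\w-]+\\.[\\w.]+$");

    static long countValid(List<String> inputs) {
        return inputs.stream()
                     .filter(s -> EMAIL.matcher(s).matches())
                     .count();
    }
}
```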
Engineering Patterns
The anatomy of an effective AI optimization loop
Measurable goal — a number the AI can parse (coverage %, elapsed ms)
Tool access — the AI runs the same tools you would (mvn, JaCoCo, JFR)
Termination condition — prevents infinite loops (threshold met, or N failures)
Checkpoint discipline — commit on success, revert on failure
Structured output — the AI needs machine-readable results, not just prose
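Put together, the control logic behind these patterns is small. A sketch of the improve-or-revert loop with the consecutive-failure cutoff — the `measure`/`apply`/`commit`/`revert` hooks are placeholders for the real tool invocations (in the demo they shell out to mvn and git):

```java
import java.util.function.LongSupplier;

// Sketch of the loop's control logic: improve-or-revert with a failure cutoff.
public class OptimizationLoop {
    static final double MIN_GAIN = 0.02;  // require >= 2% improvement
    static final int MAX_FAILURES = 3;    // stop after 3 misses in a row

    static long run(LongSupplier measure, Runnable apply,
                    Runnable commit, Runnable revert) {
        long best = measure.getAsLong();  // baseline
        int failures = 0;
        while (failures < MAX_FAILURES) {
            apply.run();                  // try one optimization
            long now = measure.getAsLong();
            if (now <= best * (1 - MIN_GAIN)) {
                commit.run();             // checkpoint the win
                best = now;
                failures = 0;             // reset the streak
            } else {
                revert.run();             // discard the experiment
                failures++;
            }
        }
        return best;
    }
}
```

Injecting the hooks keeps the skeleton tool-agnostic: the same loop drives JaCoCo, a benchmark, or any other metric source.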
1
Prompt to start
N
Iterations automated
0
Manual tool switching
Honest Edges
This is not a silver bullet. Here's where it struggles.
Architectural changes — optimizes within the current design, rarely redesigns
Algorithmic leaps — finds O(n²) → O(n log n), but not novel algorithms
Concurrency subtleties — parallel streams work, but thread-safety reasoning is fragile
Diminishing returns — the 3-failure rule exists because marginal gains get noisy
Best for: mechanical optimizations you'd do yourself given enough time
Broader Applications
The loop pattern applies to any measurable quality metric
  • Security scanning — run SAST, fix findings, re-scan until clean
  • Code quality — run SonarQube, address issues, loop until gate passes
  • API response time — load test, optimize hot path, re-test
  • Bundle size — measure, tree-shake, re-measure
  • Accessibility — run axe, fix violations, re-scan
// The universal pattern:
measurable_goal + tool_access + termination_condition = AI loop
Try It Today
One prompt. One loop. Real results.
Install: npm install -g @anthropic-ai/claude-code
Point it at your codebase with a measurable goal
Define a termination condition in your prompt
Let it run — review the commits when it's done
92%+
Coverage achieved
22x
Performance speedup
1
Prompt each
The demo prompts are in the project README — try them on your own code