AI Optimization Loops | Live Demo
Developer Productivity
AI-Assisted Iterative Optimization Loops
What happens when you give an AI a goal, a metric, and permission to loop?
Two live demos: code coverage and performance optimization
Same codebase, different objectives, same pattern
Claude Code — Anthropic's AI coding agent in the terminal
The Problem
We write code. We manually profile. We manually optimize.
Developer spots a slowdown, fires up VisualVM, reads flame graphs
Writes a fix, re-benchmarks, checks if it helped
Repeat... until the sprint ends or patience runs out
Same for test coverage: write tests, check JaCoCo report, write more tests
80%
of optimization time is spent on measurement, not coding
3-5
manual cycles before most developers move on
The Shift
What if the AI could run the tools, read the output, and decide what to do next?
Claude Code can execute shell commands: mvn, JaCoCo, JFR
It can read structured output and reason about what it means
It can modify code and re-run measurements automatically
The missing piece: a loop with a termination condition
while (goal_not_met) {
    measure()        // run the tool, capture metrics
    analyze()        // reason about what to improve
    apply_change()   // modify code with best optimization
    re_measure()     // verify improvement
}
Demo 1 — Test Coverage
Measure coverage. Write tests. Loop until 92%.
Run JaCoCo
Parse Coverage %
< 92%
Find Gaps
Write Tests
← loop back
≥ 92%
Done!
~10%
Starting coverage (3 tests)
92%+
Target coverage
4-6
Expected iterations
Live Demo
Watch the coverage climb in real time
Starting: 3 tests, ~10% line coverage
The AI reads the JaCoCo report, identifies uncovered classes and methods
Each iteration: new tests appear, coverage jumps
Terminates when coverage ≥ 92%
// The prompt I'll feed to Claude Code:
Run `mvn clean test jacoco:report`, then read
target/site/jacoco/jacoco.csv to determine line coverage %.
If below 92%, analyze the report for uncovered classes,
write JUnit 5 tests targeting those gaps, and re-run.
Keep looping until line coverage ≥ 92%.
Switching to terminal...
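For reference, the coverage check at the heart of the loop is small. In a JaCoCo CSV report, each class row carries `LINE_MISSED` and `LINE_COVERED` columns (indices 7 and 8); summing them across rows gives overall line coverage. A minimal sketch, assuming the default report path — the `CoverageCheck` class name is illustrative, not part of the demo code:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Minimal sketch: compute overall line coverage % from a JaCoCo CSV report.
public class CoverageCheck {
    static double lineCoverage(List<String> csvLines) {
        long missed = 0, covered = 0;
        for (String row : csvLines.subList(1, csvLines.size())) { // skip header row
            String[] cols = row.split(",");
            missed  += Long.parseLong(cols[7]);  // LINE_MISSED
            covered += Long.parseLong(cols[8]);  // LINE_COVERED
        }
        return 100.0 * covered / (missed + covered);
    }

    public static void main(String[] args) throws Exception {
        List<String> rows = Files.readAllLines(Path.of("target/site/jacoco/jacoco.csv"));
        double pct = lineCoverage(rows);
        System.out.printf("line coverage: %.1f%%%n", pct);
        System.exit(pct >= 92.0 ? 0 : 1); // non-zero exit = keep looping
    }
}
```

This is exactly the kind of machine-readable signal the AI parses each iteration to decide whether to stop.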
Demo 2 — Performance Optimization
Benchmark. Analyze. Optimize. Commit only what works.
Run Benchmark
Analyze Timing
Apply Best Fix
Re-Benchmark
≥ 2% faster
Commit + Log
reset failures = 0
← loop back
< 2% gain
failures++
if 3 in a row →
Stop
~6.5s
Starting pipeline time
22x
Expected total speedup
3
Consecutive failures = stop
Live Demo
~6.5 seconds to start. Let's see how low it goes.
Same Java app, deliberate inefficiencies planted
The AI discovers bubble sort, redundant I/O, regex recompilation...
Each successful optimization: git commit + CHANGELOG entry
Watch the CHANGELOG grow in real time
// The prompt I'll feed to Claude Code:
Iteratively optimize this Java app for performance.
Run the benchmark, analyze the timing breakdown,
pick the single most impactful optimization, apply it.
If ≥ 2% faster: commit + update CHANGELOG.
If < 2%: count as failure.
3 consecutive failures = stop.
Switching to terminal...
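The ≥ 2% rule only works if the benchmark produces a repeatable number. A sketch of the kind of wall-clock harness behind it, using median-of-N to dampen noise — the `Bench` class and its method names are stand-ins, not the demo's actual entry points:

```java
import java.util.Arrays;

// Minimal sketch: time a workload and report the median of N runs.
public class Bench {
    static long medianMillis(Runnable workload, int runs) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload.run();
            samples[i] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(samples);
        return samples[runs / 2]; // median is more stable than a single run
    }

    // A change "counts" only if it beats the baseline by >= 2%.
    static boolean improved(long baselineMs, long newMs) {
        return newMs <= baselineMs * 0.98;
    }
}
```

The 2% threshold is the noise floor: anything smaller is indistinguishable from run-to-run variance on a warm JVM.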
Results
The AI found optimizations we planted. Here's what it did.
Check git log for auto-generated commit messages
Check CHANGELOG.md for documented optimizations
Before/after timing comparison from the benchmark output
Optimization                          | Category        | Impact
Bubble sort → Collections.sort()      | Algorithm       | Very Large
File re-reading → load once           | I/O             | Large
Pattern.compile in loop → static      | Object creation | Moderate
String += → StringBuilder             | String handling | Moderate
DateTimeFormatter per call → cached   | Object creation | Moderate
ArrayList scan → HashMap              | Data structure  | Moderate
Redundant copies → single pass        | Memory          | Small
Actual results will vary — the AI may discover them in a different order
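As one concrete example of the mechanical fixes in the table: hoisting `Pattern.compile` out of a loop. `Pattern` instances are immutable and thread-safe, so compiling once into a static field removes per-element recompilation. The surrounding method is illustrative, not the demo's actual code:

```java
import java.util.List;
import java.util.regex.Pattern;

public class EmailFilter {
    // Before: Pattern.compile("...") inside the loop recompiled the regex per element.
    // After: compile once; the immutable, thread-safe Pattern is safely reused.
    private static final Pattern EMAIL =
        Pattern.compile("^[\\w.+-]+@[\\w-]+\\.[\\w.]+$");

    static long countValid(List<String> inputs) {
        return inputs.stream()
                     .filter(s -> EMAIL.matcher(s).matches())
                     .count();
    }
}
```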
Engineering Patterns
The anatomy of an effective AI optimization loop
Measurable goal — a number the AI can parse (coverage %, elapsed ms)
Tool access — the AI runs the same tools you would (mvn, JaCoCo, JFR)
Termination condition — prevents infinite loops (threshold met, or N failures)
Checkpoint discipline — commit on success, revert on failure
Structured output — the AI needs machine-readable results, not just prose
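Put together, the control logic behind these patterns is small. A sketch of the improve-or-revert loop with the consecutive-failure cutoff — the `measure`/`apply`/`commit`/`revert` hooks are placeholders for the real tool invocations (in the demo they shell out to mvn and git):

```java
import java.util.function.LongSupplier;

// Sketch of the loop's control logic: improve-or-revert with a failure cutoff.
public class OptimizationLoop {
    static final double MIN_GAIN = 0.02;  // require >= 2% improvement
    static final int MAX_FAILURES = 3;    // stop after 3 misses in a row

    static long run(LongSupplier measure, Runnable apply,
                    Runnable commit, Runnable revert) {
        long best = measure.getAsLong();  // baseline
        int failures = 0;
        while (failures < MAX_FAILURES) {
            apply.run();                  // try one optimization
            long now = measure.getAsLong();
            if (now <= best * (1 - MIN_GAIN)) {
                commit.run();             // checkpoint the win
                best = now;
                failures = 0;             // reset the streak
            } else {
                revert.run();             // discard the experiment
                failures++;
            }
        }
        return best;
    }
}
```

Injecting the hooks keeps the skeleton tool-agnostic: the same loop drives JaCoCo, a benchmark, or any other metric source.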
1
Prompt to start
N
Iterations automated
0
Manual tool switching
Honest Edges
This is not a silver bullet. Here's where it struggles.
Architectural changes — optimizes within the current design, rarely redesigns
Algorithmic leaps — finds O(n²) → O(n log n), but not novel algorithms
Concurrency subtleties — parallel streams work, but thread-safety reasoning is fragile
Diminishing returns — the 3-failure rule exists because marginal gains get noisy
Best for: mechanical optimizations you'd do yourself given enough time
Broader Applications
The loop pattern applies to any measurable quality metric
  • Security scanning — run SAST, fix findings, re-scan until clean
  • Code quality — run SonarQube, address issues, loop until gate passes
  • API response time — load test, optimize hot path, re-test
  • Bundle size — measure, tree-shake, re-measure
  • Accessibility — run axe, fix violations, re-scan
// The universal pattern:
measurable_goal + tool_access + termination_condition = AI loop
Try It Today
One prompt. One loop. Real results.
Install: npm install -g @anthropic-ai/claude-code
Point it at your codebase with a measurable goal
Define a termination condition in your prompt
Let it run — review the commits when it's done
92%+
Coverage achieved
22x
Performance speedup
1
Prompt each
The demo prompts are in the project README — try them on your own code