Boris sat down with Spotify VP of Engineering Niklas Gustavsson.
Spotify ships 4,500 production deploys a day, and 73% of PRs are now AI-assisted.
Niklas keeps 5 to 10 Claude sessions running in tmux, one per git worktree, agents working in the background.
All of it inside a 20M+ line monorepo. He expected agents to struggle at that size, but it's worked well.
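The one-session-per-worktree setup can be sketched in a few lines. This is a minimal sketch, not Niklas's actual tooling: the `plan_sessions` helper, the `agent/<task>` branch naming, the sibling-directory layout, and the `claude` command name are all assumptions made for illustration. It only builds the `git worktree add` and `tmux new-session` commands, so nothing is executed until you choose to run them.

```python
from pathlib import Path


def plan_sessions(repo: Path, tasks: list[str], agent_cmd: str = "claude") -> list[list[str]]:
    """Build the git and tmux commands for one agent session per worktree.

    Hypothetical helper: branch names, directory layout, and agent_cmd
    are illustrative assumptions, not Spotify's setup.
    """
    commands: list[list[str]] = []
    for task in tasks:
        # Each task gets its own checkout next to the main repo,
        # so background agents never edit the same working tree.
        worktree = repo.parent / f"{repo.name}-{task}"
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", f"agent/{task}", str(worktree)]
        )
        # Detached tmux session named after the task, started in that worktree.
        commands.append(
            ["tmux", "new-session", "-d", "-s", task, "-c", str(worktree), agent_cmd]
        )
    return commands
```

To actually start the sessions, each command list could be passed to `subprocess.run(cmd, check=True)`; `tmux attach -t <task>` then brings any one of them to the foreground.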
Spotify's migration codemods grew into thousands of lines of edge-case handling, because real code exposes too much API surface for static rewrites to cover. Early LLMs barely did better.
Adding a judge to evaluate each agent-written change took PR success from ~25% to 80%.
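One way to picture the judge step is a generate-then-judge loop. The recap does not say how Spotify's judge works, so everything here is an assumption: the `Verdict` type, the `generate` and `judge` callables (stand-ins for model calls), and in particular the retry-with-feedback behavior. A judge could just as well simply filter out bad PRs without retrying.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    approved: bool
    feedback: str = ""


def generate_with_judge(
    generate: Callable[[str], str],    # hypothetical: prompt -> proposed diff
    judge: Callable[[str], Verdict],   # hypothetical: diff -> accept/reject
    task: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first diff the judge approves, or None if all attempts fail."""
    prompt = task
    for _ in range(max_attempts):
        diff = generate(prompt)
        verdict = judge(diff)
        if verdict.approved:
            return diff
        # Feed the rejection reason back so the next attempt can correct it.
        prompt = f"{task}\n\nPrevious attempt was rejected: {verdict.feedback}"
    return None  # nothing passed the judge, so no PR is opened
```

The design point is that only judged output ever becomes a PR, which raises the success rate of the PRs that do get opened.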
All of this leans on verification, the single most important thing when working with agents and the place where most companies underinvest.
Spotify rebuilt its test automation around it, so engineers can confidently guide and supervise agents rather than manually execute repetitive tasks.