Fireside Chat

Humans + Machines: Unlocking Performance in Remote Teams

Summary

Performance management has been broken in most organisations for a long time. It was designed for office environments with ambient visibility, annual cycles, and managers who could track progress through proximity. Remote work removed the ambient visibility. AI is now removing the need for traditional performance proxies entirely. What is left — the actual work of understanding whether someone is doing good work — turns out to require considerably more clarity and considerably less administrative overhead than most companies have built.

At Running Remote 2026, Nadia Vatalidis, Head of People at Doist, and Nick Francis, Chairman of Help Scout, offered a detailed account of where they are in that evolution. Shelby Wolpa of Shelby Wolpa Consulting moderated.

What high performance actually means

Both companies have made a deliberate choice to define high performance in terms of outcomes and business impact, not effort. At Doist, a 105-person company with a large customer base across multiple products, every role has a measurable impact on the business. The bar keeps rising, not because of AI, but because the definition of what is achievable keeps expanding.

At Help Scout, effort is explicitly not a performance dimension. If time and energy are not producing the right outcomes for the business or customers, that is a signal to pivot, not a signal to appreciate the effort. This is a different psychological contract than most organisations operate under, and it requires significant clarity in how expectations are communicated.

The 90/10 principle

Nick Francis’s most useful observation came from watching his own thinking evolve: AI can now handle roughly 90% of the synthesis work in performance evaluation — aggregating peer feedback, summarising goals and outcomes, identifying patterns across six to twelve months of data that a human manager’s brain struggles to hold simultaneously. The recency bias that has always plagued performance reviews — where the last three weeks dominate the last year — gets addressed not through process changes but through AI that remembers everything.

The last 10% is what requires humans. The final accountability. The quality of the actual conversation. The creativity and craft in interpreting what the data means for a specific person in a specific context. The relationship knowledge that no system can hold. Hard conversations and expressions of genuine recognition must happen between people — using AI for these interactions, both leaders agreed, would be soulless and counterproductive. Employees detect it immediately, and it signals that their manager does not actually care.

Doist’s 48-hour performance review cycle

The practical demonstration of this principle was Doist’s February performance review cycle. They ran a complete review in 48 hours, covering all 105 people with self-reviews and peer reviews and using AI summarisation tools. A comparable process at most companies typically takes six weeks. Each employee completed their own review and a maximum of five peer reviews. AI aggregated and summarised all peer input into final summaries, allowing managers to focus on the qualitative last 10%.

Vatalidis was clear about what made this possible: a performance philosophy established before any framework was built. Doist’s career framework, publicly available in their handbook, separates levels from job descriptions and maintains equal salary brackets for ICs and managers at the same level. The framework has been through seven or eight major iterations based on employee feedback. Peer reviews at Doist are not anonymous; they operate under a radical candor norm where direct feedback is safe and expected.

The sprint philosophy: more time given equals more procrastination. Compress the window, pair accountability with the constraint, and people produce better work with less anxiety.

ARR per employee as the north star metric

Help Scout’s performance conversation extends beyond individual reviews to organisational performance metrics. The primary metric tracked at board and C-suite level is ARR per employee — annual recurring revenue divided by headcount. The goal is to continuously increase this without burning people out or reducing headcount through layoffs.
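As a rough illustration of the arithmetic, the metric is simply revenue divided by headcount. The sketch below uses a hypothetical function name and made-up figures; it is not Help Scout’s actual reporting code.

```python
def arr_per_employee(annual_recurring_revenue: float, headcount: int) -> float:
    """Return annual recurring revenue divided by headcount.

    The function name and signature are illustrative assumptions,
    not taken from the session.
    """
    if headcount <= 0:
        raise ValueError("headcount must be a positive integer")
    return annual_recurring_revenue / headcount


# Hypothetical figures for illustration only: $30M ARR, 100 employees.
example = arr_per_employee(30_000_000, 100)
print(f"${example:,.0f} per employee")  # prints "$300,000 per employee"
```

Because headcount is the denominator, the number rises either when revenue grows or when the team shrinks, which is why the stated goal pairs it with avoiding burnout and layoffs.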

The benchmark has been shifting: $250K per employee used to indicate solid performance. $500K is now considered strong outlier performance. Some companies are reaching $1M per employee. Doist maintains extremely low attrition alongside this ambition — more than 50% of employees have been there five or more years, and 21 people celebrated ten-year anniversaries in 2026.

Importantly, performative AI metrics — lines of code generated, percentage of code written by AI, AI session counts — are dismissed by both leaders as meaningless distractions. The question is whether AI initiatives are moving core business KPIs. Everything else is noise.

Where human accountability is non-negotiable

A cautionary example from the panel: a manager replied to an employee’s message about being sick by pasting in a ChatGPT response, complete with the telltale leftover fragment ‘if you’d like me to add.’ The employee saw it. The trust damage from that moment was disproportionate to the effort saved. Using AI to respond to a sick message is not efficiency. It is evidence that the manager does not actually care about the person who is sick.

Hard conversations, genuine recognition, and the meaningful moments of a person’s career must be handled by humans who have thought carefully about what they want to say. AI as a sounding board for preparing those conversations is genuinely useful: Doist uses Gemini Gems AI coaches to improve feedback quality before writing reviews. AI as a replacement for the conversation itself is not.
