The Talent Judgment Gap: Why AI Won’t Solve What Organizations Never Built
Eighty-seven percent of employees believe algorithms could give them fairer feedback than their managers. That number, cited by Gartner last year, is worth sitting with. It often gets interpreted as evidence that AI is ready to take on a bigger role in talent decisions. I read it differently.
Read it as a verdict on the state of human judgment in organizations, and it becomes considerably more uncomfortable. Employees do not trust the people assessing them. They would rather take their chances with a machine. That is not a technology story. It is a talent management story, and one that has been building for a long time.
Having spent two decades working on talent strategy and leadership development across global organizations, I have sat in enough performance calibration sessions and succession discussions to recognize this pattern. The quality of judgment in talent decisions has been a long-standing problem in many organizations. What concerns me is that we are now handing that judgment to AI, and I am not convinced this is the right problem to outsource.
What Talent Decisions Actually Require
In many organizations, the talent data underpinning these decisions is incomplete, inconsistent, or distorted. That is a problem worth fixing. Better data matters, and the investment is overdue. But it was never going to be sufficient on its own.
What AI changes is the consequence of getting the data wrong. Flawed inputs used to produce flawed human judgments: visible, challengeable, and attributable to a person. Now, flawed inputs fed into AI will produce flawed outputs at scale, wrapped in the appearance of objectivity.
Succession, promotion, performance appraisal: every talent leader has sat in these conversations countless times. The evidence that shifts these decisions often lives outside the data. In succession, it is the person’s learning pattern, whether their development history shows genuine acceleration under pressure or steady performance in comfortable conditions. In promotion, it is the reality of the role today: the political complexity, the team they would inherit, the stakeholder dynamics that need managing, and the pressures the function is currently under. In performance, it is context, whether a difficult year reflects a capability gap or circumstances that would have tested anyone in that seat.
Little of that lives in the data, and those are precisely the nuances that determine whether a talent decision is right.
In each of these decisions, the dataset is necessary but not the whole picture. Data captures the what: output, ratings, formal role history. It rarely captures the how: the context someone operated in, the degree of difficulty, or what they absorbed along the way. Closing that gap requires accumulated, contextual judgment, the kind that comes from watching someone navigate challenges, not just measuring their output. The kind that distinguishes between a person who is genuinely growing and one who is exceptionally good at making it look like they are.
The Judgment We Never Developed
Organizations put significant effort into building talent processes: frameworks, rating scales, calibration cadences, and succession templates. Far less effort went into developing the judgment those processes were designed to rely on.
When judgment is not developed, something fills the space. In talent decisions, it tends to be a familiar and predictable set of biases. A strong quarter inflating an entire year’s assessment. One visible failure painting everything else negatively. The people seen most often getting rated highest. Those managing up well or advocating loudest for themselves gaining a persistent advantage.
The conflation of performance and potential is perhaps the most persistent judgment mistake of all. People making assessment calls routinely treat sustained high performance as a signal of leadership readiness, and the nine-box, designed specifically to hold those two dimensions apart, too often becomes a place where they collapse together. An example we have all seen: a high-performing individual contributor is assumed to have management potential, is promoted, and then “surprises” everyone by struggling to lead a team.
This is the context in which AI is now being deployed. Most talent leaders reading this will recognize it. The processes exist. The judgment supporting them has been much harder to build.
What We’re Now Asking AI To Do
The judgment gap was a contained problem when the manager’s role included gathering data, synthesizing inputs, and building direct knowledge of the people they assessed. The work itself created some opportunity to develop judgment, imperfectly, but progressively.
AI changes that.
As systems surface candidates, flag flight risks, score potential, and synthesize performance signals, the analytical work moves to the machine. What remains for the manager is the call. But before the call comes something new: evaluating whether what AI is surfacing is actually right. And we all know that AI has a particular talent for looking very convincing while being completely wrong.
The role of a manager has been elevated. Their primary contribution is now judgment, bringing together the data and the context no system fully captures. Which means a pre-existing judgment gap has become a much larger problem.
Most organizations are moving to deploy AI into talent decisions without fully addressing this. Even if human oversight is built into the design, oversight is only as good as the judgment behind it. Adding “human in the loop” does not answer the harder question: is that human equipped to critically evaluate and make quality decisions based on the data AI is surfacing?
That question of judgment connects directly to accountability. When AI contributes to a consequential talent decision and it goes wrong, who is responsible? As Deloitte’s 2026 research shows, most organizations have no clear answer. Without one, decisions that no one can fully explain or defend become a growing source of risk: eroded employee trust, legal exposure, and leadership pipelines shaped by calls no one can account for.
What This Actually Requires
The answer is not to slow AI adoption in talent management. Used well, AI adds genuine value, reducing information fragmentation, surfacing patterns across large populations, and introducing consistency where pure human judgment was producing bias.
What this requires is parallel investment: in AI capability and in the human judgment that makes AI oversight meaningful.
That means taking seriously how managers are developed as decision-makers, not just process followers. It means calibration conversations that genuinely interrogate the distinction between performance and potential. It means succession discussions where the evidence is stronger, the cognitive shortcuts are named and challenged, and accountability for the call is clear.
It also means designing the accountability architecture before it is needed. Deciding where human judgment begins and algorithmic recommendation ends. Naming who owns the outcome when AI contributes to a consequential talent decision.
Final Thought
AI will keep getting better at surfacing patterns, scoring potential, and flagging risk.
What it will not do is take responsibility for being wrong.
That remains a human job.
The question is whether we are developing the humans who can do it fast enough.