AI Is Flooding HR… But Interviews Should Stay Human

In the last couple of months there have been a few high-profile articles discussing the intersection between AI and Human Resources (or management in general):

  1. “Does HR still need humans?” (Financial Times, Andrew Hill)
  2. “MIT report misunderstood: Shadow AI economy booms while headlines cry failure” (VentureBeat)
  3. “The Job Market Is Hell” (The Atlantic, Annie Lowrey)

Read together, they map the same terrain: AI is crushing the repetitive parts of HR, workers and candidates are already using AI whether companies are ready or not, and the lived experience of hiring feels like a slot machine. The sane strategy is simple: automate the repeatables; keep interviews—where perception and intuition matter—decidedly human.


What the FT is really diagnosing (beneath the headlines)

The FT piece isn’t asking a philosophical question; it’s documenting an economic one. CFOs want operating leverage. AI delivers leverage on tasks that are:

  • high-volume,
  • policy-constrained,
  • and recall-based (find, route, retrieve, acknowledge).

HR is full of those: policy questions, leave calculations, doc lookups, scheduling. Of course chatbots shine there. But the FT also flags two fault lines leaders ignore at their peril:

  • Goodhart’s Law at work. Once you turn a messy people process into a metric (e.g., “time-to-fill”), the organization optimizes for the number and quietly degrades the thing you actually cared about (quality of hire). Full automation is gasoline on that dynamic.
  • Legitimacy risk. Even if a model could auto-decide, employees don’t just need efficient outcomes; they need explainable and fair ones. In sensitive calls (hiring, promotion, termination), perceived fairness is the product. Remove the human and you don’t merely save time—you corrode trust.

Bottom line: automate the edges; don’t hollow out the core. Interviews sit at the core.


What the MIT/VentureBeat story really says about “95% of pilots fail”

The “95% fail” headline is catchy—and misleading. What’s actually happening:

  • Top-down, bespoke pilots often stall because they’re rigid, built far from the workflow, and don’t learn fast enough.
  • Bottom-up adoption is exploding: employees and candidates use flexible, general AI daily because it helps now.

This creates a widening gray market of AI in hiring:

  • Candidates use AI to craft applications, tailor narratives, even rehearse interviews.
  • Interviewers use AI to prep, outline probes, and summarize notes.

That’s not a bug—it’s a signal. The tech works when it augments the human doing the job. It fails when it tries to be the human.


The Atlantic’s ground truth: a “Tinderized” funnel that exhausts everyone

Lowrey’s reporting nails the lived experience: candidates blast AI-written résumés into a void; employers deploy AI filters to survive the flood; almost nobody talks to a human. Time-to-job stretches, quits drop, and early-career and marginalized groups bear disproportionate pain.

Two harmful feedback loops follow:

  1. Noise begets more automation. More spam → more filters → fewer conversations → more spam.
  2. Talent discovery collapses to the obvious. Filters do well on credentials and keywords, poorly on signal hiding in non-obvious trajectories.

The only reliable antidote is earlier and better human conversation—with structure—so non-linear talent isn’t silently weeded out.


Interviews are the “last 20%” for a reason

People don’t go to humans for facts; they go to humans for judgment under ambiguity. In interviews, this is the job:

  • reading trade-offs (speed vs. quality, autonomy vs. control),
  • testing integrity (owning mistakes, handling conflict),
  • weighing context (stage, team, constraints).

These are relational judgments. They benefit from structure (clear competencies and anchored rubrics) but still hinge on perception, intuition, and trust. That’s precisely why interviews should be human-led and AI-assisted, not the other way around.


What “augment, don’t replace” looks like in practice

Here’s a practical split that respects both the economics and the psychology:

Automate (heavily)

  • Drafting job posts and roles from templates
  • Screening logistics: scheduling, reminders, FAQs
  • Policy queries and doc retrieval (benefits, PTO, process)
  • Pre-read assembly: JD + CV synopsis + portfolio highlights

Why: clear rules; low judgment; huge volume.

Augment (human-led, AI-assisted)

  • Interview prep: bespoke probes tied to competencies and the specific candidate narrative
  • In-call scaffolding: timeboxing, follow-up prompts when answers are hand-wavy, live note-to-evidence mapping
  • Post-call synthesis: a first-draft scorecard that you review, edit, and submit

Why: structure increases signal; human owns judgment and accountability.
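
To make "AI drafts, human finishes" concrete, here is a minimal sketch of what a post-call scorecard could look like as data. Everything in it is hypothetical (the Scorecard and EvidenceNote names, the status values); it illustrates the boundary, not any particular product's API:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class EvidenceNote:
    competency: str               # e.g. "trade-off reasoning", "ownership"
    quote: str                    # what the candidate actually said
    rating: Optional[int] = None  # anchored 1-4 rubric score, filled in by a human

@dataclass
class Scorecard:
    candidate_id: str
    interviewer: str
    notes: list[EvidenceNote] = field(default_factory=list)
    status: str = "ai_draft"              # ai_draft -> human_reviewed -> submitted
    recommendation: Optional[str] = None  # "hire" / "no_hire" -- never set by the draft step

    def mark_reviewed(self) -> None:
        # The interviewer has read, edited, and now owns every note and rating.
        self.status = "human_reviewed"

    def submit(self, recommendation: str) -> None:
        # Structural guard: the hire/no-hire call only exists after human review.
        if self.status != "human_reviewed":
            raise ValueError("An AI draft cannot be submitted; review and edit it first.")
        self.recommendation = recommendation
        self.status = "submitted"
```

The useful part isn't the code; it's that the recommendation field has no path into existence except through a named human who has reviewed the draft.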

Never outsource (human-only)

  • Hire/No Hire decisions on finalists
  • Exceptions, references, culture/values calls
  • Sensitive performance or remediation conversations

Why: legitimacy, explainability, and ethics.


If you care about ROI, measure the right things

Leaders often track “time saved.” Fine, but superficial. If we want better hiring and better economics, track:

  • Time-to-first conversation (TFC): from application to the first real human touch. The Atlantic’s “void” shrinks as TFC drops.
  • Signal Velocity (SV): interview end → first usable scorecard. If SV approaches same-day, panels decide sooner with fresher context.
  • Rubric Coverage Rate (RCR): % of must-have competencies with at least one high-quality evidence note. Quantity of interviews is secondary to coverage.
  • Panel Consistency Index (PCI): variance of interviewers’ ratings for the same candidate on the same competency. Lower PCI = fewer avoidable disputes.
  • False-negative review rate: % of later high performers initially screened out. Run spot audits; you will find gold you’re missing.

These turn AI from a vanity demo into an evidence engine—without surrendering judgment.
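
As a rough sketch of how these could fall out of ordinary pipeline data, assuming per-candidate timestamps and scorecard notes are already captured (all field names below are made up, not tied to any ATS):

```python
from datetime import datetime
from statistics import pvariance

# Hypothetical per-candidate record; every field name here is illustrative.
c = {
    "applied_at": datetime(2025, 3, 3, 9, 0),
    "first_human_contact_at": datetime(2025, 3, 6, 14, 0),  # a real conversation, not an auto-reply
    "interview_ended_at": datetime(2025, 3, 10, 16, 0),
    "scorecard_submitted_at": datetime(2025, 3, 10, 18, 30),
    "must_have_competencies": {"ownership", "trade-offs", "communication", "domain depth"},
    "evidence_notes_by_competency": {"ownership": 2, "trade-offs": 1, "communication": 1},
    "panel_ratings_same_competency": [3, 4, 3, 3],           # one candidate, several interviewers
}

# TFC: application -> first real human touch.
tfc = c["first_human_contact_at"] - c["applied_at"]

# SV: interview end -> first usable scorecard.
sv = c["scorecard_submitted_at"] - c["interview_ended_at"]

# RCR: share of must-have competencies with at least one evidence note.
covered = {k for k, n in c["evidence_notes_by_competency"].items() if n > 0}
rcr = len(covered & c["must_have_competencies"]) / len(c["must_have_competencies"])

# PCI: variance of the panel's ratings for the same candidate; lower = fewer avoidable disputes.
pci = pvariance(c["panel_ratings_same_competency"])

# The false-negative review rate needs later performance data, so it's a periodic audit,
# not a dashboard metric.

print(f"TFC: {tfc}, SV: {sv}, RCR: {rcr:.0%}, PCI: {pci:.2f}")
```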


Anti-patterns to avoid (learned the hard way)

  • Score fetishism. Turning a nuanced person into a single AI score invites overconfidence and bias laundering. Keep multi-dimensional rubrics and narrative evidence.
  • Chatbot interviews as default. Useful as a supplement for high-volume roles at the early filtering stages; corrosive when a human-less chat becomes the only gate.
  • Unreviewable automation. If no human can reconstruct why a candidate was rejected, you’ve built a liability machine.
  • Process last. Applying AI to a fuzzy loop makes fuzzy faster. Standardize competencies and rubrics first.

The cultural shift leaders will need to make

  • From “AI decides” to “AI drafts.” Teams will normalize the idea that AI produces first drafts and humans finish them.
  • From heroic interviewers to trained interviewers. “I trust my gut” becomes “I trust my gut inside a structure.”
  • From secrecy to auditability. Notes, rubrics, and rationales get captured and become reviewable. That’s how trust and compliance improve together.

What this means for us

We’ll continue to argue—and design—for a future where AI accelerates the process around interviews and augments the people inside them. We’ll keep interviews human-led by default, and we’ll aim our AI at raising the signal-to-noise of those scarce conversations. The point isn’t to make interviewers optional; it’s to make interviews worth more.

If there’s one line to draw from all three articles, it’s this: Automation will keep eating the repeatables. Let it. But reserve the human core—interviews—for what humans are still uniquely good at: reading ambiguity, weighing trade-offs, and making decisions other humans can trust.