
How to Reduce Manager Favoritism in Reviews With Structured Calibration

Manager favoritism in performance reviews costs companies millions in legal risk and lost talent. Structured calibration reduces bias by requiring managers to defend their ratings in front of peers.

Last updated: March 2026

Performance reviews are supposed to be objective. But across most organizations, they're not.

Managers consistently rate employees they like higher than those they don't, regardless of actual performance. A high-performer who doesn't socialize with the boss gets a "meets expectations." A lower performer who reminds the manager of someone they trust gets "exceeds expectations."

This isn't malice; it's cognitive bias. And it costs companies millions in legal risk, retention, and wasted potential.

Structured calibration is the most direct fix. It works by forcing managers to defend their ratings against each other. This removes the cover of individual "judgment" and exposes bias.

Why Manager Favoritism Happens

The Liking Effect

Managers rate employees who are similar to them higher, even when performance is identical. This "similar-to-me" bias, compounded by visibility, means the extroverted employee who grabs beers after work gets a boost just for being around.

Research from the Center for Talent Innovation found that 64% of high-performing minorities report being overlooked for promotions despite strong work. Visible bias is one factor. Hidden favoritism is the bigger one.

Lack of Reference Points

Without comparison, a manager uses their own internal standard, which is inconsistent. Employee A in January might have gotten a 3.5 for "X level of output." Employee B in May does the same thing and gets a 4.0 simply because the manager is in a different mood, has different recent examples, or remembers Employee A doing something slightly better three months ago.

Distance from Data

Most managers rely on memory and recent events (recency bias). If someone had a great project two quarters ago but isn't top-of-mind, it gets downweighted. If someone had one bad sprint recently, that one poor result colors the entire year.

Structured calibration forces managers to bring data, not just impressions.

What Structured Calibration Is

Calibration is a peer-comparison process where managers across the organization rate their employees using the same scale, then defend those ratings in front of each other.

Instead of each manager rating their team in isolation, managers gather (usually by department or level) and walk through each rating:

  • "I rated Chen as a 4. Here's why: shipped project X on time, mentored three people, resolved that critical bug that saved us..."
  • Peers ask: "How does that compare to what you saw from Smith? Chen got a 4, Smith got a 3.5?"
  • Manager responds with specifics: "Smith is solid but hasn't taken on stretch work. Chen sought out the hardest problem..."

Then the group adjusts. Chen stays a 4, Smith gets bumped to 3.75. If there's disagreement, they dig into the evidence.

The result: ratings that are anchored to job performance, not personality.

How Structured Calibration Reduces Favoritism

1. Removes the Cover of Individual Judgment

Favoritism thrives in private decisions. The moment you have to explain why Employee A got a 4.2 and Employee B (doing similar work) got a 3.8 to a room full of peers, bias becomes visible. Most managers correct their ratings on the spot rather than defend them.

2. Anchors Ratings to Evidence

Calibration forces managers to bring specifics: "Chen shipped 15% more features than average, mentored two people, and owned a critical path item." Vague superiority ("I just think they're better") gets pushed back immediately. Peers ask for examples. Managers who were relying on personality instead of performance get exposed.

3. Reveals Patterns

If one manager rates women consistently lower than another manager rates women at the same level, that pattern becomes visible. If a manager gives their friend a 4.5 for work another manager rated a 3.5 for the same output, it shows.

Single instances look like judgment. Patterns look like bias. Calibration makes patterns visible.
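If your ratings already live in a spreadsheet or HRIS export, a first-pass pattern check can be automated before the calibration meeting. A minimal sketch (the data and the 0.3-point threshold are illustrative, not from any real system):

```python
# Hypothetical sketch: flag managers whose average rating drifts far from
# the group average. Data and the 0.3 threshold are illustrative.
from collections import defaultdict
from statistics import mean

ratings = [
    # (manager, employee, rating)
    ("alice", "e1", 4.5), ("alice", "e2", 4.4), ("alice", "e3", 4.6),
    ("bob",   "e4", 3.4), ("bob",   "e5", 3.6), ("bob",   "e6", 3.5),
]

by_manager = defaultdict(list)
for manager, _, rating in ratings:
    by_manager[manager].append(rating)

overall = mean(r for _, _, r in ratings)
for manager, rs in by_manager.items():
    gap = mean(rs) - overall
    if abs(gap) > 0.3:  # far from the group average: worth a closer look
        print(f"{manager}: avg {mean(rs):.2f} ({gap:+.2f} vs. group {overall:.2f})")
```

A flag like this isn't proof of bias; it tells the facilitator where to ask for evidence first.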

4. Creates Peer Accountability

It's harder to justify unfair ratings when your peers (people you respect and work with daily) are listening and evaluating your judgment. Social pressure, applied fairly, works.

5. Constrains Ratings to Reality

Without calibration, one team has six "exceeds expectations" ratings and another has one. That's not a difference in team performance; it's a difference in rating generosity. Calibration forces consistency. If you want to give six 4s out of eight, you need to defend why your team is that strong compared to peer teams. Usually, you can't.

The Process: Step by Step

Step 1: Prepare Individual Ratings (Pre-Calibration)

Before the calibration meeting, managers rate each of their employees independently, with supporting examples:

  • Rating (1-5 scale or similar)
  • 2-3 specific examples of work that drove the rating
  • Any important context (tenure, ramp-up time, role transitions)

This takes time but prevents managers from gaming the calibration meeting. They're committed to a position first.
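If you want managers to submit these packets in a consistent shape, a simple record works. The fields below mirror the three bullets above; the names and data are illustrative, not a prescribed format:

```python
# Hypothetical sketch of a pre-calibration record; field names are illustrative.
from dataclasses import dataclass

@dataclass
class PreCalibrationRating:
    employee: str
    rating: float          # 1-5 scale
    examples: list[str]    # 2-3 specific examples behind the rating
    context: str = ""      # tenure, ramp-up time, role transitions

entry = PreCalibrationRating(
    employee="Chen",
    rating=4.0,
    examples=["Shipped project X on time", "Mentored three engineers"],
    context="Second year in role",
)
print(entry.employee, entry.rating)
```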

Step 2: Gather by Level and Department

Group managers by the employee level being discussed (e.g., all managers rating individual contributors) and department (or cross-departmentally if you're looking for broader consistency). Typical group size: 4-8 managers.

Step 3: Walk Through Each Rating

Go employee by employee, organized by rating. Start with the highest ratings and work down:

4.5+ (Exceeds Expectations) - Critical Talent

Manager A: "I rated Jane a 4.8. She led the Q1 infrastructure project, completed it two weeks early, and mentored the two newest hires. She's been with us two years and ramped from IC to tech lead in one."

Peer questions:

  • How does that compare to Mark's project delivery?
  • Has Jane ever missed a deadline or had a difficult cross-team situation?
  • Is mentoring part of her role or extra credit?

Manager A's response: "Mark shipped on time but took the standard path. Jane sought out the hardest problem. Mentoring isn't her role, but she does it anyway. She also handled a conflict with the design team proactively."

Result: Jane stays a 4.8. Mark gets reassessed; maybe he was underrated because he's newer. Peers see the difference in impact.

Step 4: Adjust and Document

As the group discusses, ratings may shift. Some go up (underrated in isolation), some go down (favoritism gets corrected). The calibration facilitator documents the final ratings and the reasoning.

Step 5: Manager Feedback Loop

Managers walk away knowing:

  • Their final rating and why it might have shifted
  • How their team's distribution compares to peers
  • Specific examples of what earns each rating level
  • The group's understanding of performance standards

This means next year, they calibrate faster because everyone shares context.

Real Impact: What Changes

Fairness: Employees doing equivalent work earn equivalent ratings. The person who's quiet but shipping gets rated fairly. The charismatic underperformer doesn't get the benefit of the doubt.

Retention: High-performers who were invisible now get recognized. They see the rating and the context in the review, and they know it's real (not just their manager's opinion). That reduces regrettable turnover.

Promotion Equity: When promotions are anchored to calibrated ratings, people from underrepresented groups doing equivalent work advance at rates much closer to those of their majority-group peers. Favoritism was hiding gender and racial bias in promotion decisions. Calibration brings it to light.

Legal Safety: Calibrated ratings create a defensible paper trail. If a termination or low rating gets challenged, you can show the employee was rated fairly against peers doing similar work, with documented evidence. That's gold in litigation.

Manager Development: Managers learn faster what "good performance" actually looks like. In isolation, managers develop weird standards. In calibration, they see 30-40 examples of what earns each rating, from peers' teams. That accelerates judgment.

How to Implement

1. Define Clear Rating Levels

Before you calibrate, define what each rating level actually means:

  • 4.5+: Exceptional contributor. Expanding role, mentoring, taking on hard problems.
  • 4.0-4.4: Strong performer. Meeting all goals, delivering quality work, driving some initiatives.
  • 3.5-3.9: Solid contributor. Meeting expectations, learning, engaged.
  • 3.0-3.4: Developing. Meeting core expectations but needs growth in [specific area].
  • Below 3.0: Underperforming. Performance plan required.

Add concrete examples to each level. "Strong performer shipped 12 features, mentored one person, took on project X" is better than "consistently exceeds expectations."
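To keep adjacent bands from overlapping, treat each band as inclusive of its lower bound. A minimal sketch of that mapping (labels taken from the list above, cutoffs illustrative):

```python
# Hypothetical sketch: map a 1-5 rating to a band label.
# Each band is inclusive of its lower bound, so no rating falls in two bands.
def rating_band(rating: float) -> str:
    if rating >= 4.5:
        return "Exceptional contributor"
    if rating >= 4.0:
        return "Strong performer"
    if rating >= 3.5:
        return "Solid contributor"
    if rating >= 3.0:
        return "Developing"
    return "Underperforming"

print(rating_band(4.8))  # Exceptional contributor
print(rating_band(3.7))  # Solid contributor
```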

2. Train Facilitators

Someone needs to run the calibration meeting (usually an HR leader or department head). The facilitator needs to:

  • Manage time (calibrations can sprawl)
  • Flag potential bias ("Three of your four highest ratings are people you hired")
  • Push for evidence ("What specifically makes them a 4.3?")
  • Keep discussion professional and evidence-focused

3. Set Guardrails

Decide in advance:

  • What's the expected distribution? (Should 30% of people be "exceeds," or 10%?)
  • Who reviews the recommendations for bias? (Usually HR + the department head)
  • How are calibrated ratings communicated to employees? (In the review feedback section)
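The first guardrail can be checked mechanically before the meeting even starts. A minimal sketch, with an illustrative team and a hypothetical 10% "exceeds" target:

```python
# Hypothetical sketch: compare one team's "exceeds" share to the agreed
# guardrail. Ratings and the 10% target are illustrative.
team_ratings = [4.8, 4.2, 3.9, 3.8, 3.6, 3.2, 4.6, 4.5]

exceeds = sum(1 for r in team_ratings if r >= 4.5)
share = exceeds / len(team_ratings)
expected_share = 0.10  # agreed in advance, before calibration

if share > expected_share:
    print(f"{share:.0%} of ratings are 'exceeds' vs. ~{expected_share:.0%} "
          f"expected -- be ready to defend this in calibration.")
```

Exceeding the guardrail isn't automatically wrong; it just means the manager owes the room a strong evidence-based case.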

4. Brief Managers

Managers need to understand:

  • This isn't about overruling their judgment
  • It's about consistency and fairness
  • Ratings may shift, and that's fine
  • They'll get feedback on their rating patterns
  • Next year will feel easier because everyone shares context

The Objections You'll Hear

"This takes too much time."

Yes. And it's worth it. Most calibrations are 4-6 hours for 60-80 employees. That's the annual cost. Compare that to legal settlements from discrimination claims, or the cost of losing one top performer because they weren't recognized.

"Our managers won't want to defend their ratings."

Some won't, at first. That's actually fine; it means bias will surface quickly and get corrected. After one calibration, managers realize that defending solid judgments is easy and that the process is fair. Resistance drops.

"We'll just end up with everyone getting the same rating."

No. Calibration doesn't flatten ratings. It anchors them to reality. Some employees are genuinely stronger than others. Calibration makes that visible and defensible, rather than hidden behind one manager's preference.

Beyond Performance Reviews

Calibration works anywhere comparisons matter:

  • Promotion decisions: Same process, higher stakes. Who's ready for the next level?
  • Bonus distribution: How do we allocate discretionary budget fairly across teams?
  • Succession planning: Who should we develop for leadership?

The principle is the same: bring data, compare peers, defend decisions, adjust.

Getting Started This Quarter

If you want to reduce manager favoritism, start small:

  1. Pick one level (e.g., all individual contributors in engineering)
  2. Define what each rating level means for that population
  3. Have managers come in with independent ratings + examples
  4. Calibrate for 2-3 hours
  5. Collect feedback on the process
  6. Expand next cycle

The first calibration always feels clunky. By the second, it's routine. And the fairness gains are immediate.

Your highest performers will notice. So will people who were underrated. And your legal team will sleep better knowing your ratings are defensible.


Want to see how Confirm makes calibration easier? Schedule a demo to watch how structured performance management eliminates the subjectivity that drives favoritism.
