How to Run Fair Performance Calibration Sessions
Performance calibration sounds like HR jargon, but it's one of the most effective tools for eliminating pay discrimination and ensuring consistent feedback across teams. Yet most calibration sessions are run poorly, often reinforcing the same biases they're meant to fix.
This guide covers what calibration actually is, why it matters, how to structure a fair session, and where technology helps.
What Is Performance Calibration?
Calibration is the practice of bringing managers together to discuss employee ratings, pay decisions, and development plans so they align with company standards. Instead of each manager rating in isolation, the group tests ratings against each other: "Is this person really a 4 out of 5? Let's compare them to the 4s on other teams."
It sounds simple. In practice, most managers have never done it, and it's usually run reactively during review cycles—rushed, without structure, and subject to whoever speaks loudest.
Why Calibration Matters
Pay equity audits usually find the same pattern: women and minorities earn less than peers, even when job titles and tenure match. The cause is rarely conspiracy. It's usually inconsistency. Different managers apply different standards, anchor on different information, or default to existing disparities.
Calibration forces consistency. When managers have to justify a 3 rating compared to a peer's 4, they either defend the difference with evidence or adjust the ratings. That friction surfaces and corrects bias.
Studies show structured calibration sessions reduce unexplained pay gaps by 10–25%.
How to Run a Fair Calibration Session
1. Set Clear Rating Definitions
Before the meeting, define what each rating means in your system:
- 1 = Does not meet expectations. Performance improvement plan.
- 2 = Meets some expectations. May need targeted coaching.
- 3 = Meets expectations. Solid contributor.
- 4 = Exceeds expectations. High performer, ready for advancement.
- 5 = Far exceeds expectations. Top talent.
Document examples for each level specific to your company. "Meets expectations" means different things in engineering and in sales. Make that explicit.
2. Gather Calibration-Ready Data
Prepare for each employee:
- Manager rating and justification (in writing, before the meeting)
- Key accomplishments (from the employee, not just manager memory)
- Feedback from peers or skip-level reports (if available)
- Tenure, role changes, compensation history
- Any documented performance issues or accommodations
Don't wing it from memory. Memory is where bias thrives.
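One way to enforce "calibration-ready" is to check each employee's packet for gaps before the meeting. The sketch below is only an illustration of that idea: the `CalibrationPacket` class, its field names, and the sample accomplishment are all hypothetical, and which items count as required is an assumption you should adjust to your own process.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CalibrationPacket:
    """Pre-meeting data for one employee. All field names are illustrative."""
    employee: str
    manager_rating: int                                     # 1 to 5, per the scale from step 1
    written_justification: str = ""                         # written before the meeting
    accomplishments: list = field(default_factory=list)     # supplied by the employee
    peer_feedback: list = field(default_factory=list)       # optional, if available
    tenure_years: Optional[float] = None
    compensation_history: list = field(default_factory=list)
    documented_issues: list = field(default_factory=list)   # may legitimately be empty


def missing_items(packet: CalibrationPacket) -> list:
    """Return the required items that are still missing from a packet."""
    missing = []
    if not 1 <= packet.manager_rating <= 5:
        missing.append("valid rating (1-5)")
    if not packet.written_justification.strip():
        missing.append("written justification")
    if not packet.accomplishments:
        missing.append("employee-supplied accomplishments")
    if packet.tenure_years is None:
        missing.append("tenure")
    if not packet.compensation_history:
        missing.append("compensation history")
    return missing


# Made-up example: a rating with accomplishments but nothing else on file.
packet = CalibrationPacket(employee="Sarah", manager_rating=4,
                           accomplishments=["Led the billing migration"])
print(missing_items(packet))
```

Peer feedback and documented issues are treated as optional here because the list above marks them "if available" or conditional; an empty packet for either should not block the session.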
3. Structure the Meeting
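- Duration: 2 to 3 hours for a team of 50 to 75 people. Allow 2 to 3 minutes per person, and check the math before you book the room: 75 people at 3 minutes each is 225 minutes, which overruns a 3-hour block.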
- Attendees: Direct managers only, plus one HR facilitator. Not senior leadership watching from the back. That creates political pressure.
- Agenda: Go through each employee once. Don't order the list by job level, in either direction. That signals which roles matter more.
- Facilitation: HR's role is to spot inconsistency ("We rated Sarah a 3 but Michael a 4 for nearly identical accomplishments") and ask for evidence.
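The duration guidance is simple arithmetic, and it is worth running before you send the invite. This is a rough sketch with hypothetical function names; it ignores breaks, setup, and overruns on contested ratings.

```python
def session_minutes(people: int, minutes_per_person: float) -> float:
    """Total discussion time, ignoring breaks and setup."""
    return people * minutes_per_person


def fits(people: int, minutes_per_person: float, block_hours: float) -> bool:
    """True if everyone can be discussed within the scheduled block."""
    return session_minutes(people, minutes_per_person) <= block_hours * 60


# Try the corners of the ranges given above against a 3-hour block.
for people in (50, 75):
    for pace in (2, 3):
        total = session_minutes(people, pace)
        print(f"{people} people at {pace} min each: {total:.0f} min, "
              f"fits 3-hour block: {fits(people, pace, 3)}")
```

Three of the four combinations fit in three hours. The largest group at the slowest pace does not, which suggests either a tighter pace or a second session.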
4. Call Out Inconsistencies
This is the hardest part. It requires uncomfortable conversations. When inconsistencies emerge:
- Ask: "What evidence supports this difference?"
- Push back gently: "That same situation happened with Michael, and we rated him differently. Why?"
- Avoid: "This is biased." Instead: "I don't see the evidence for this difference."
Most inconsistencies resolve when managers articulate them.
5. Document Everything
After calibration, document the final rating, the justification (one sentence), and who was in the room. This creates accountability and a record for legal defensibility.
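If you keep these records in a system rather than a document, a small structured entry is enough. The sketch below assumes a hypothetical `CalibrationDecision` record; the field names, the sample justification, the attendee labels, and the date are made up for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: a decision record should not be edited after the fact
class CalibrationDecision:
    employee: str
    final_rating: int     # 1 to 5
    justification: str    # one sentence
    attendees: tuple      # everyone who was in the room
    session_date: str     # ISO 8601 date string

    def __post_init__(self):
        # Refuse to create a record that is missing what the audit trail needs.
        if not 1 <= self.final_rating <= 5:
            raise ValueError("final_rating must be between 1 and 5")
        if not self.justification.strip():
            raise ValueError("justification is required")
        if not self.attendees:
            raise ValueError("at least one attendee must be logged")


decision = CalibrationDecision(
    employee="Sarah",
    final_rating=4,
    justification="Delivered two major projects ahead of schedule.",
    attendees=("HR facilitator", "Manager A", "Manager B"),
    session_date="2024-03-01",
)
record = json.dumps(asdict(decision), indent=2)
print(record)
```

Rejecting a blank justification at creation time is a design choice: it makes "we'll fill that in later" impossible, which is the point of documenting at all.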
Common Calibration Pitfalls
Pitfall 1: Calibration Without Standards
If "exceeds expectations" means different things in different teams, calibration just normalizes the existing inconsistency. Align definitions first.
Pitfall 2: Not Enough Information
A manager says "I've worked with Sarah for three years, and I know she's a 4." That's anecdote, not data. Push for specifics: What did she accomplish? How did it compare to others? You'll often find the rating was generous or harsh.
Pitfall 3: Group Pressure
Calibration sessions can become popularity contests. The loudest manager or the most likable team wins. Mitigate this by preparing written justifications before the meeting and rotating the order each time.
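Rotating the order can be as simple as shifting the starting point each cycle. This sketch assumes "order" means the sequence in which employees are discussed; the function name and the alphabetical base order are illustrative choices, not a prescribed method.

```python
def rotated_order(employees: list, cycle: int) -> list:
    """Rotate a stable, alphabetical base order by one position per cycle."""
    base = sorted(employees)
    if not base:
        return []
    shift = cycle % len(base)
    return base[shift:] + base[:shift]


team = ["Sarah", "Michael", "Alex"]  # "Alex" is a made-up name for the example
for cycle in range(3):
    print(cycle, rotated_order(team, cycle))
```

A deterministic rotation is easy to explain and audit. A random shuffle would also work if you record the seed or the resulting order.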
Pitfall 4: Pay Decisions Before Calibration
Some companies decide pay raises first, then "calibrate" ratings to match. That defeats the purpose. Calibrate ratings first, then make pay decisions based on the calibrated ratings.
Pitfall 5: Ignoring Tenure and Demographics
Watch for patterns: Are newer hires rated higher? Are men rated higher in technical roles? Are women rated higher on softer skills? If you see a pattern, dig in. It's a red flag.
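A first pass at spotting such patterns is to compare average ratings across groups. The sketch below uses made-up rows and hypothetical column names (`gender`, `tenure_band`). It is a descriptive check only, not a statistical test, and small groups will produce noisy gaps.

```python
from collections import defaultdict
from statistics import mean


def mean_rating_by_group(rows: list, key: str) -> dict:
    """Average rating for each distinct value of `key`, rounded to 2 places."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["rating"])
    return {group: round(mean(values), 2) for group, values in groups.items()}


def rating_gap(rows: list, key: str) -> float:
    """Difference between the highest and lowest group averages."""
    means = mean_rating_by_group(rows, key)
    return round(max(means.values()) - min(means.values()), 2)


# Made-up data for illustration only.
ratings = [
    {"rating": 4, "gender": "M", "tenure_band": "0-2y"},
    {"rating": 4, "gender": "M", "tenure_band": "3y+"},
    {"rating": 3, "gender": "F", "tenure_band": "0-2y"},
    {"rating": 3, "gender": "F", "tenure_band": "3y+"},
    {"rating": 4, "gender": "F", "tenure_band": "0-2y"},
]

for key in ("gender", "tenure_band"):
    print(key, mean_rating_by_group(ratings, key), "gap:", rating_gap(ratings, key))
```

A gap is a prompt to look at the evidence behind the ratings, not proof of bias on its own.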
Pitfall 6: No Calibration for Terminations
If an employee is on track for a performance improvement plan or termination, calibration is your protection. It ensures that a low rating is consistent with how you rate others. Skip this, and you risk an unfair-termination lawsuit.
How Technology Helps
Modern performance management platforms reduce bias in calibration in several ways:
1. Centralized Data
Instead of managers relying on memory, the system surfaces accomplishments, feedback, and historical ratings in one place. Everyone works from the same facts.
2. Forced Justification
A good system requires managers to write down their rating and rationale before calibration. That makes it harder for an offhand remark in the room to favor one person over another.
3. Inconsistency Flagging
When a manager rates someone a 2 and everyone else rates peers a 3 or 4, the system can flag it. It doesn't change the rating automatically; it surfaces the outlier for discussion.
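How a given platform implements flagging will vary. As a minimal sketch, assuming a simple rule of "flag anyone whose rating sits at least one point from the peer-group median", it could look like this. The function name, the threshold, and the names other than Sarah and Michael are hypothetical.

```python
from statistics import median


def flag_outliers(ratings: dict, threshold: float = 1.0) -> list:
    """Names whose rating differs from the peer-group median by at least `threshold`."""
    if not ratings:
        return []
    mid = median(ratings.values())
    return sorted(name for name, rating in ratings.items()
                  if abs(rating - mid) >= threshold)


# Made-up peer group: mostly 3s and 4s, with one 2.
peer_group = {"Sarah": 3, "Michael": 4, "Priya": 3, "Tom": 4, "Dana": 4, "Alex": 2}
print(flag_outliers(peer_group))
```

The function only returns names. It never edits a rating, so the decision stays with the people in the room.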
4. History and Trends
The system can show: "Last cycle, 60% of this team was rated 4 or above. This cycle, it's 30%. What changed?" Without data, you'd never spot that drift.
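The drift check itself is a comparison of two shares. Here is a sketch under stated assumptions: the function names are hypothetical, the 15-point alert threshold is an arbitrary illustration, and the ratings are made up to reproduce the 60% and 30% figures in the example.

```python
def share_at_or_above(ratings: list, cutoff: int = 4) -> float:
    """Fraction of ratings at or above the cutoff (0.0 for an empty list)."""
    if not ratings:
        return 0.0
    return sum(r >= cutoff for r in ratings) / len(ratings)


def distribution_drift(previous: list, current: list, cutoff: int = 4,
                       max_change: float = 0.15):
    """Return (flagged, change), where change is the shift in the top-rated share."""
    change = share_at_or_above(current, cutoff) - share_at_or_above(previous, cutoff)
    return abs(change) > max_change, change


last_cycle = [4, 4, 5, 4, 4, 5, 3, 3, 3, 2]  # 6 of 10 rated 4 or above
this_cycle = [4, 4, 5, 3, 3, 3, 3, 3, 3, 2]  # 3 of 10 rated 4 or above
flagged, change = distribution_drift(last_cycle, this_cycle)
print(f"flagged={flagged}, change={change:+.0%}")
```

A flag here answers "did something change?" and leaves "what changed?" to the discussion: a reorganization, a new manager, or a stricter reading of the scale can all produce the same drop.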
5. Anonymous Feedback Integration
If you have 360-degree feedback, the system can surface patterns without revealing who said what. This reduces the manager's ability to discount feedback because they don't like the source.
6. Legal Defensibility
A documented calibration session, with attendance logged, decisions recorded, and rationales captured, is your proof that pay decisions were fair and consistent. That matters if anyone ever challenges them.
Your First Calibration Session: A Checklist
- Define rating scale and examples (specific to your roles)
- Gather calibration data for every employee (no exceptions)
- Schedule 2–3-hour block with HR facilitator
- Invite direct managers only—no audience
- Send pre-calibration brief with rating definitions and data
- Ask managers to prepare written justifications
- Run the meeting, flagging inconsistencies
- Document final ratings and rationales
- Follow up on any outliers or patterns
- Communicate decisions to employees
- Adjust compensation or development plans based on calibrated ratings
- Schedule the next cycle (quarterly or annually)
Next Steps
If your company has never calibrated, start with one team or function. Learn the process. Then expand. If you already do calibration, audit your last session:
- Was there a documented rating scale?
- Was all data gathered before the meeting?
- Were inconsistencies called out and resolved?
- Are decisions documented?
If you answered "no" to any of these, your calibration is at risk. Fix it before the next cycle.
Performance calibration isn't a silver bullet for pay equity. But it's one of the few tools that gives you evidence. Use it.
This is part of a series on performance management. For frameworks on conducting reviews, running 1-on-1s, and coaching high performers, see our performance management resources.
