Every comp cycle, the same thing happens. A manager submits their team's ratings, and almost everyone is a 4 or 5. Another manager submits theirs, and it's the same. By the time HR pulls the aggregate numbers, 70% of the company "exceeds expectations."
That's not a success story. That's rating inflation, and it quietly corrupts every compensation decision you make.
When ratings mean nothing, raises mean nothing. You lose your ability to differentiate. High performers get the same bump as average ones. People notice. The good ones start looking elsewhere.
Here's how to catch it before it does that damage.
Warning sign 1: Your rating distribution looks like a hockey stick
Pull last cycle's final ratings by manager or department. What does the distribution look like?
A healthy rating distribution for most companies has maybe 10-15% of employees in the top tier, 60-70% in the middle ("meets expectations"), and 10-15% below that. Some variance across teams is normal. Every team having 80% of their people at "exceeds" or above is not.
At one 400-person software company, HR pulled the data before a comp cycle and found that 74% of all ratings submitted were 4 or 5 out of 5. When they broke it down by manager, the spread was telling: some managers gave 90% of their team top ratings, others gave 40%. Same pay bands, same job levels, wildly different standards.
The problem wasn't that some teams were performing better. It was that each manager had internalized a completely different definition of what "exceeds expectations" meant. One thought it meant "shows up and does good work." Another thought it meant "materially raised the bar for the team."
When you see a hockey stick distribution, you're not seeing performance. You're seeing each manager's personal tolerance for uncomfortable conversations.
What to look for: Run a simple pivot on final ratings by manager. Flag anyone where 60% or more of their direct reports landed at the top two rating tiers. That doesn't mean they're wrong, but it does mean they need to justify it in calibration.
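If you'd rather script it than click through a spreadsheet pivot, a few lines of pandas produce the same flag. This is a minimal sketch, assuming your HRIS export is a CSV with manager and rating columns; the file name and column names here are placeholders for whatever your system actually produces:

```python
import pandas as pd

# Hypothetical HRIS export: one row per employee with a 1-5 final rating.
ratings = pd.read_csv("final_ratings.csv")  # columns: employee_id, manager, rating

# Share of each manager's direct reports in the top two tiers (4s and 5s).
top_two = (
    ratings.assign(is_top=ratings["rating"] >= 4)
    .groupby("manager")["is_top"]
    .mean()
    .sort_values(ascending=False)
)

# Anyone at 60% or more goes on the calibration review list.
review_list = top_two[top_two >= 0.60]
print(review_list.map("{:.0%}".format))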
Warning sign 2: Ratings don't connect to outcomes
This one takes a bit more digging, but it's more damning.
Go back to teams or individuals that received top ratings. Did those ratings correlate with actual output? Project completions? Revenue impact? Customer satisfaction?
One HR director at a 200-person consulting firm spent an afternoon matching prior-cycle ratings against billable hours, client NPS, and project delivery metrics for a sample of 50 employees. What she found: virtually no correlation between who got the top ratings and who actually drove the strongest outcomes. Several of the highest-rated employees had average-to-low delivery metrics. Several of the average-rated employees had carried the company's most important client relationships.
Ratings were a measure of how well people communicated to their managers, not how well they performed. Nice people who kept their manager informed got 4s and 5s. Quiet people who did the actual work got 3s.
This doesn't always mean inflation in the traditional sense, but it has the same effect: your ratings don't tell you who to promote, who to invest in, or who to put on a performance plan. They're noise.
What to look for: Pick a recent cohort of highly-rated employees. Cross-reference their ratings against any objective output data you have: sales numbers, delivery metrics, peer feedback scores, time to promotion. If the correlation is weak, your ratings aren't measuring performance. They're measuring visibility.
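If you want to put a number on "weak," a rank correlation is a reasonable check, since 1-5 ratings are ordinal. A sketch, with placeholder metric names standing in for whatever outcome data you actually track:

```python
import pandas as pd

# Hypothetical merged dataset: prior-cycle rating plus outcome data.
# File and metric names below are placeholders.
df = pd.read_csv("ratings_vs_outcomes.csv")
# columns: employee_id, rating, billable_hours, client_nps, delivery_score

# Spearman handles ordinal 1-5 ratings better than Pearson.
outcomes = ["billable_hours", "client_nps", "delivery_score"]
corr = df[["rating"] + outcomes].corr(method="spearman")["rating"].drop("rating")
print(corr.round(2))

# Rule of thumb: values near zero mean the ratings carry
# little signal about these outcomes.
```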
Warning sign 3: Ratings cluster within managers, not across them
Rating inflation isn't always company-wide. Sometimes it starts with specific managers, and it spreads.
Here's how it works: Manager A gives generous ratings because he doesn't want conflict. His reports get slightly better raises. Their peers on other teams notice. Those peers start pushing their own managers to rate more generously. Within two cycles, a manager who used to give calibrated ratings is now feeling pressure to inflate, because being honest means his team falls behind on comp.
At a fintech startup, an HR team noticed that one particular director had given 95% of his team "exceeds" for three consecutive cycles. Over that same period, attrition on that team was near zero, while it was 20%+ elsewhere. When they dug in, they found that other managers had started inflating to keep pace, worried their people would transfer to get the higher ratings and better raises.
The director wasn't malicious. He genuinely liked his team and avoided hard conversations. But the signal he sent cascaded through the organization.
What to look for: Track rating averages by manager over multiple cycles. Look for managers with consistently high averages and low spread (almost everyone rated the same). Then look for neighboring teams where averages have crept up over time. That creep is usually social pressure, not performance gains.
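Both patterns fall out of a simple groupby if you have multi-cycle data. A sketch under the same placeholder-column assumptions as the earlier examples:

```python
import pandas as pd

# Hypothetical multi-cycle export (file and column names are placeholders).
df = pd.read_csv("ratings_all_cycles.csv")
# columns: employee_id, manager, cycle, rating

# Average, spread, and headcount per manager per cycle.
by_mgr = df.groupby(["manager", "cycle"])["rating"].agg(["mean", "std", "count"])

# Pattern 1: consistently high average with low spread
# (almost everyone rated the same).
tight_and_high = by_mgr[(by_mgr["mean"] >= 4.0) & (by_mgr["std"] <= 0.5)]
print(tight_and_high)

# Pattern 2: averages creeping up cycle over cycle.
# Assumes cycle values sort chronologically (e.g. 2023H1, 2023H2, ...).
creep = by_mgr["mean"].unstack("cycle").diff(axis=1)
print(creep)  # positive values show where averages rose from the prior cycle
```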
The 30-minute DIY audit
You don't need a data analyst to run this. You need about 30 minutes and export access to your HRIS.
Step 1: Export last cycle's ratings by manager (15 minutes)
Pull a spreadsheet with: employee ID, manager, department, final rating, rating from previous cycle. You need at least two cycles to spot trends; one is just a snapshot.
Step 2: Build the pivot (5 minutes)
Create a pivot table: manager as rows, rating tiers as columns, employee count as values. Add a column for percentage at top two tiers. Sort by that percentage, descending.
Any manager where 60%+ of reports are in the top two tiers goes on your review list. Any manager where that number jumped by 15+ percentage points from last cycle goes on a separate "urgent review" list.
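In pandas, both lists come out of the same pivot. A sketch, again with placeholder file and column names, assuming a cycle column that marks each row as prior or current:

```python
import pandas as pd

# Hypothetical two-cycle export (file and column names are placeholders).
df = pd.read_csv("ratings_two_cycles.csv")
# columns: employee_id, manager, cycle ("prior" or "current"), rating

pct_top = (
    df.assign(is_top=df["rating"] >= 4)
    .groupby(["manager", "cycle"])["is_top"]
    .mean()
    .unstack("cycle")
)

review = pct_top[pct_top["current"] >= 0.60]                     # review list
urgent = pct_top[pct_top["current"] - pct_top["prior"] >= 0.15]  # urgent list
print(review.round(2))
print(urgent.round(2))
```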
Step 3: Layer in tenure and outcome data (5 minutes)
Pull average tenure for each manager's top-rated employees. If a manager's 4s and 5s are mostly employees with less than 18 months of tenure, that's a flag. New employees rarely "exceed expectations"; by definition, they're still learning the job.
If you have any objective output data, even rough metrics, add that as a column. You're not running a formal analysis. You're looking for obvious mismatches.
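Here's a sketch of the tenure and mismatch checks together, assuming your export includes hire dates and at least one rough output metric (file and column names are placeholders):

```python
import pandas as pd

# Hypothetical export with hire dates and one rough output metric
# (column names are placeholders for whatever your HRIS provides).
df = pd.read_csv("ratings_with_tenure.csv", parse_dates=["hire_date"])
# columns: employee_id, manager, rating, hire_date, output_metric

df["tenure_months"] = (pd.Timestamp("today") - df["hire_date"]).dt.days / 30.4
top_rated = df[df["rating"] >= 4]

# Average tenure of each manager's top-rated reports;
# mostly under ~18 months is a flag.
print(top_rated.groupby("manager")["tenure_months"].mean().round(1))

# Obvious mismatches: top ratings paired with bottom-quartile output.
cutoff = df["output_metric"].quantile(0.25)
print(top_rated[top_rated["output_metric"] <= cutoff])
```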
Step 4: Identify calibration priorities (5 minutes)
Take your review list and map out who needs the most scrutiny in the next calibration session. Prepare two or three questions for each manager whose data looks off: "Walk me through your top-rated employees and what they accomplished that put them in that tier." "How did you define 'exceeds expectations' for this cycle?" "Which of your direct reports, if any, do you think is on track for promotion in the next 12 months?"
Those questions, asked directly in calibration, do more work than any formal process change.
How calibration fixes it
The audit tells you where the problems are. Calibration is where you actually fix them.
Most calibration sessions run like this: managers sit in a room, read out their ratings, and nobody challenges anyone because it's uncomfortable and they all want to leave by noon. That's not calibration. That's a ceremony that makes HR feel like due diligence happened.
Real calibration starts with data in the room. Before the session, HR shares the distribution analysis. Every manager sees their numbers relative to peers. The manager with 85% of their team at "exceeds" walks in knowing that question is coming.
Then the session runs on anchored questions rather than free discussion. For each rating, the facilitator asks: "What did this person accomplish that places them here rather than one tier lower? Give me a specific example." That's it. Simple question, every time.
Before calibration: One company ran their mid-year cycle with no calibration. 71% of employees were rated 4 or 5. Comp decisions were made on those ratings. Three months later, they ran an anonymous engagement survey. Their highest-rated employees gave the lowest engagement scores. Many felt the ratings were meaningless and their actual contributions weren't recognized.
After calibration: The following year, the same company introduced structured calibration with distribution targets and the data-first approach above. The percentage rated 4 or 5 dropped to 38%. Total comp spend stayed flat, but it shifted. The employees who'd carried critical projects got meaningfully higher raises. Engagement scores among the genuinely high performers improved by 22 points.
The ratings didn't just get more accurate. People started trusting the process.
Rating inflation is a systems problem, not a people problem
Most managers who inflate ratings aren't doing it for bad reasons. They like their teams. They want to avoid hard conversations. They want their people to be paid well. Those are human impulses.
The problem is the process. Without calibration, without distribution data, without anyone in the room asking "why is this person a 5?", those impulses run unchecked and compound over cycles.
The companies that get comp right don't have better managers. They have better systems. They build in friction at the right moments: before ratings are finalized, not after. They give managers data to anchor against, not just vibes.
If you're heading into a comp cycle with inflated ratings and you know it, you still have time. Run the 30-minute audit. Bring the numbers to calibration. Ask the uncomfortable questions.
Your high performers will notice the difference. That's the whole point.
Want to see how Confirm handles calibration at scale? Our platform surfaces rating distributions in real time, runs structured calibration workflows, and flags inconsistencies automatically. Book a demo to see it in action.
And if you're earlier in the process, with ratings that feel off but no clear sense of how to fix them, a 30-minute demo can walk you through what calibration looks like when it works. Schedule time here.
