Here's the deal with Lattice's calibration: it works. You can pull ratings distributions, make bulk adjustments, flag statistical outliers, and run a session. For companies doing basic calibration where the main goal is consistent ratings across managers, Lattice is fine.
But when HR teams start digging into why ratings differ, Lattice hits a wall. There's no behavioral evidence underneath the numbers. When Manager A rates 15% higher than the org average, Lattice can show you that. It can't tell you whether Manager A is correct. That's not a minor gap in 2026; it's the central problem HR leaders are trying to solve.
This post breaks down the specific feature differences between Confirm and Lattice on calibration, based on publicly available data and G2 reviews, and explains why the gap matters for teams trying to run fair performance processes.
What Lattice's calibration tool does
Lattice's calibration module has been around long enough to be genuinely functional. The interface is clean. Within a standard review cycle, you can:
- See rating distributions across managers (bell curve views, 9-box grids)
- Adjust individual scores directly or via CSV bulk upload
- Set up calibration sessions without heavy configuration
- Track session completion across the org
- Log comments on rating changes (added in 2025)
The February 2026 update added granular visibility controls in calibration, giving different stakeholders different views of the data. That's a real improvement, though it's table stakes for any enterprise HR workflow.
Lattice also updated its AI features to summarize calibration insights. You can see which managers trend high or low versus peers, and which employees are getting unusual ratings within their cohort. For teams running straightforward calibration (get consistent ratings, run a session, export results), Lattice gets it done. G2 reviewers give it 4.7 stars overall, and calibration is part of that score.
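Under the hood, "trends high or low versus peers" is simple descriptive statistics. Here's a minimal sketch of that kind of rater-level check in Python; it's illustrative only, not Lattice's actual implementation, and the managers, ratings, and threshold are invented:

```python
# Illustrative only: a toy version of rater-level outlier detection,
# not Lattice's actual implementation. Managers, ratings, and the
# one-standard-deviation threshold are all invented.
from collections import defaultdict
from statistics import mean, stdev

ratings = [  # (manager, employee, rating) on a 1-5 scale
    ("ana", "e1", 4), ("ana", "e2", 5), ("ana", "e3", 4),
    ("ben", "e4", 3), ("ben", "e5", 2), ("ben", "e6", 3),
    ("cho", "e7", 3), ("cho", "e8", 4), ("cho", "e9", 3),
]

org_avg = mean(r for _, _, r in ratings)
by_manager = defaultdict(list)
for mgr, _, r in ratings:
    by_manager[mgr].append(r)

# Flag managers whose average deviates from the org mean by more than
# one standard deviation of the manager-level averages.
mgr_avgs = {m: mean(rs) for m, rs in by_manager.items()}
spread = stdev(mgr_avgs.values())
for m, avg in mgr_avgs.items():
    if abs(avg - org_avg) > spread:
        print(f"{m}: avg {avg:.2f} vs org {org_avg:.2f} (possible leniency/severity)")
```

Useful for spotting systematic leniency or severity; it says nothing about whether any individual rating is accurate.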
Where it stops
The issue isn't what Lattice calibration does. It's what it can't do when the conversation gets harder.
Picture a calibration session three hours in. Two managers disagree on a rating. One argues their employee deserves "Exceeds Expectations" based on a strong Q4. The other manager says that employee has been coasting all year. Both could be right. Both are working from memory.
Lattice has nowhere to go from there. The tool can tell you the two managers rate differently from the org average, which is useful for identifying systematic rater patterns. What it can't tell you is whether either manager's specific assessment of this employee is accurate.
G2 reviewers (2025-2026) describe this gap directly, saying it's "hard to surface the full picture when managers disagree." An independent review at Research.com made the same point: "performance calibration tools are simplistic and miss advanced rating adjustments." That tracks with what the tool fundamentally is: a workflow tool built to structure the process, not an evidence tool built to resolve disputes.
For most straightforward calibration needs, that's fine. When you're trying to make defensible comp and promotion decisions, it isn't.
The ONA gap
Confirm approaches calibration from a different starting point. The core difference is Organizational Network Analysis (ONA), which measures actual collaboration patterns across an organization. Who do people turn to when they're stuck? Who connects teams that rarely interact? Who's driving cross-functional decisions even when their name doesn't appear on the output?
ONA data shows what happened. Lattice shows what managers think happened.
That distinction matters because the employees most likely to be systematically underrated are also the hardest to see from a single manager's vantage point: employees who worked across teams, took parental leave during the review period, or contributed to initiatives outside their immediate team. ONA captures contribution patterns that don't appear in any single manager's notes.
When Confirm surfaces a bias flag during a calibration session, it's backed by network data. If a manager rates someone low and ONA shows that person is a core collaboration node others rely on across three departments, that gets surfaced in the session. The same logic runs in reverse: if a manager rates someone highly and ONA shows minimal collaborative activity, that gets flagged too. The data doesn't override the manager's judgment, but it gives the room something concrete to evaluate rather than competing memories.
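To make the mechanics concrete, here's a toy sketch of the idea in Python using networkx. It is not Confirm's actual model; the edges, names, ratings, and thresholds are all hypothetical, and a real ONA pipeline would infer the graph from collaboration signals like meetings, messages, and shared documents:

```python
# A minimal sketch of ONA-vs-rating mismatch flagging, assuming a
# pre-built collaboration graph. All data here is hypothetical.
import networkx as nx

# Hypothetical collaboration edges (in practice inferred from work signals).
edges = [
    ("dana", "eng_lead"), ("dana", "sales_lead"), ("dana", "design_lead"),
    ("eng_lead", "e2"), ("sales_lead", "e3"), ("design_lead", "e4"),
    ("e2", "e3"),
]
G = nx.Graph(edges)

ratings = {"dana": 2, "e2": 4, "e3": 3, "e4": 5}  # hypothetical 1-5 ratings

# Betweenness centrality captures "connects teams that rarely interact".
centrality = nx.betweenness_centrality(G)

for person, rating in ratings.items():
    c = centrality.get(person, 0.0)
    if rating <= 2 and c > 0.3:
        # Low rating despite being a cross-team broker: surface in session.
        print(f"flag: {person} rated {rating} but bridges teams (centrality {c:.2f})")
    elif rating >= 5 and c < 0.05:
        # Top rating with little observed collaboration: surface that too.
        print(f"flag: {person} rated {rating} with minimal network activity")
```

The specific metric isn't the point; the point is that the flag comes from observed behavior rather than from whichever manager argues loudest.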
Lattice's AI summarizes calibration insights, but without behavioral data underneath, those summaries pattern-match on ratings. They can't connect a rating dispute to actual work patterns because Lattice doesn't have that data.
What this costs in practice
The gap shows up in three measurable places.
Calibration time. A typical Lattice calibration cycle runs 2-3 weeks: pre-work, manager preparation, multiple sessions, post-session debates, revisions. Confirm compresses that into a single session, typically 2-4 hours. The efficiency comes from having pre-built employee evidence profiles generated before the session starts. There's no reconstruction phase where managers try to remember what happened six months ago. Thoropass, one of Confirm's customers, completed 98% of performance reviews within 6 days of launch. Their VP of People, Joe Bast, described it as unlike anything he'd seen in terms of completion speed.
Bias outcomes. Confirm customers have documented 40% less bias variance post-calibration compared to their previous process. The three specific bias patterns Confirm monitors are recency bias (over-weighting recent events), affinity bias (rating people similar to yourself higher), and advocacy bias (outcomes driven by which manager argues loudest). These are the most common ways calibration results diverge from actual performance. Bias flags run in real time during sessions, not just as post-hoc reports.
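Recency bias, for example, is mechanically detectable once you have performance signals over time. Here's a simplified sketch, assuming quarterly signal scores on the same 1-5 scale as the final rating; the data and the 0.5-point threshold are invented, and this is not Confirm's actual methodology:

```python
# Hypothetical per-quarter performance signals and final ratings.
quarters = {"e1": [3, 3, 3, 5], "e2": [4, 4, 4, 4], "e3": [5, 4, 2, 2]}
final_ratings = {"e1": 5, "e2": 4, "e3": 2}

for emp, qs in quarters.items():
    full_year = sum(qs) / len(qs)
    recent = qs[-1]
    rating = final_ratings[emp]
    # Flag when the rating sits much closer to Q4 than to the year
    # average, i.e. the reviewer may be over-weighting recent events.
    if abs(rating - recent) + 0.5 < abs(rating - full_year):
        print(f"{emp}: rating {rating} tracks Q4 ({recent}), not full year ({full_year:.1f})")
```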
Demographic equity tracking. After each calibration cycle, Confirm runs a demographic disparity analysis, checking whether rating distributions differ by gender, ethnicity, location, or tenure in ways that can't be accounted for by performance factors. This analysis happens before decisions are finalized, while there's still time to correct them. Lattice offers some demographic reporting, but it's not integrated into the calibration workflow itself.
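As a rough illustration of what that kind of check involves (not Confirm's actual methodology; a real analysis would control for role, level, and tenure, and the data and 0.3-point threshold here are made up):

```python
# A simplified sketch of a group-level rating disparity check.
from collections import defaultdict
from statistics import mean

reviews = [  # (employee, group, rating): hypothetical post-calibration data
    ("e1", "A", 4), ("e2", "A", 5), ("e3", "A", 4), ("e4", "A", 3),
    ("e5", "B", 3), ("e6", "B", 3), ("e7", "B", 4), ("e8", "B", 2),
]

by_group = defaultdict(list)
for _, group, rating in reviews:
    by_group[group].append(rating)

overall = mean(r for _, _, r in reviews)
for group, rs in by_group.items():
    gap = mean(rs) - overall
    if abs(gap) > 0.3:  # escalate for review before ratings are finalized
        print(f"group {group}: mean {mean(rs):.2f}, gap {gap:+.2f} vs overall {overall:.2f}")
```

The design choice that matters is timing: running this before ratings are locked turns it from a compliance report into a correction mechanism.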
What Lattice is better at
Lattice's broader platform covers more territory than Confirm. If your primary requirements include engagement surveys, native compensation planning, and performance management in one system, Lattice makes more sense.
The Workday integration is genuinely strong. Customers regularly describe "the best of both worlds": Workday as system of record, Lattice handling performance. For organizations already deep in Workday, that's a meaningful point in Lattice's favor.
Lattice also competes hard on goal-setting. Multiple customers describe it as one of the cleanest OKR/goal management interfaces on the market. For teams where goal tracking is the primary driver of a performance platform purchase, Lattice's tooling is mature.
For companies with 50-500 employees doing routine review cycles without complex calibration requirements, Lattice's $11/seat base price and breadth of features deliver real value.
Who outgrows Lattice on calibration
The Lattice calibration model works until you need to defend a decision or trace a rating back to its source. For companies where performance outcomes feed directly into comp and promotion decisions that managers have to explain to employees, "we reached consensus in the room" doesn't hold up.
The HR teams that typically start evaluating Confirm after using Lattice tend to share a few characteristics:
Growing teams between 200 and 2,000 employees, where calibration used to be informal and now needs structure and documentation. Lattice handles the structure; the documentation of why decisions were made is thinner.
Post-DEI audit or litigation situations, where HR needs to demonstrate that calibration outcomes weren't influenced by demographic factors. Lattice's audit trail shows who changed what rating. Confirm's audit trail shows what evidence supported the change.
Companies where manager bias has already surfaced as a problem, visible in high performer attrition patterns, demographic rating gaps flagged by legal or finance, or leadership feedback about fairness in the review process.
One customer who evaluated Lattice alongside Confirm and others put it directly: "We reviewed Lattice, Rippling, and Culture Amp. Confirm was the only platform designed to be lightweight, fast, and easy to use." Confirm typically goes live in 1-2 weeks, compared to Lattice's typical 4-8 week implementation timeline.
The feature comparison
| Feature | Confirm | Lattice |
|---|---|---|
| ONA-based bias detection | ✅ Built-in | ❌ Not available |
| AI prioritization of which employees need discussion | ✅ Auto-triage | ⚠️ Requires manual configuration |
| Statistical outlier detection | ✅ AI-powered | ✅ Basic |
| Rating distribution visualization | ✅ Yes | ✅ Yes |
| Multi-level calibration sessions | ✅ Full support | ⚠️ Limited to basic tiers |
| Demographic disparity detection | ✅ Integrated into calibration workflow | ⚠️ Limited, not integrated into sessions |
| Implementation time to first session | 1-2 weeks | 4-8 weeks |
| Pricing | $8/seat all-in | $11+/seat base |
Bottom line
Lattice is a solid performance management platform. Its calibration tools work well for straightforward review cycles where the primary goal is consistency across managers. The gap appears when you need to explain why two managers disagree, catch bias before it corrupts comp decisions, or compress a multi-week process to a single focused session.
Organizational Network Analysis is the feature Lattice doesn't have, and in 2026, it's increasingly where the defensibility of performance decisions lives. HR teams are being held to a higher standard on bias and equity than they were five years ago. Calibration meetings that end with "we reached consensus" aren't sufficient evidence anymore when a promotion decision gets challenged.
If your calibration process currently relies on managers being well-prepared and arguing fairly, it's worth understanding what behavioral evidence would change about that. Not because managers are careless, but because calibration without evidence asks them to do something they were never equipped to do: evaluate performance accurately from memory, in a room full of competing advocates, in a couple of hours.
That's the gap Lattice hasn't closed. Confirm was built specifically to close it.
Want to see how Confirm handles this? Request a demo — we'll walk you through the platform in 30 minutes.
