1. Scientific Rigor

Quality of experimental design, statistical analysis, and model validation.

Exceeds

Experiments are well-designed and conclusions are appropriately qualified. Work would stand up to external peer review.

Meets

Experiments are sound. Conclusions are supported by evidence.

Below

Experimental design has flaws. Conclusions are overstated relative to the evidence.

Example review phrases

  • "Every A/B test they design includes a pre-specified analysis plan, which prevents the multiple-comparison problems other teams run into."
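
For reviewers less familiar with the multiple-comparison problem named in the phrase above, a little arithmetic shows why a pre-specified plan matters. The sketch below is illustrative only: the function names and the count of 20 unplanned metrics are hypothetical, the tests are assumed to be independent, and the Bonferroni correction is shown as one common remedy, not as the method any particular team uses.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test significance threshold that keeps the overall rate near alpha."""
    return alpha / num_tests


ALPHA = 0.05            # conventional significance level
UNPLANNED_METRICS = 20  # hypothetical: metrics inspected with no analysis plan

# One pre-specified primary metric: false-positive risk stays at alpha.
print(f"1 planned metric:     {familywise_error_rate(ALPHA, 1):.2f}")

# Twenty metrics checked after the fact: the risk grows sharply.
print(f"20 unplanned metrics: {familywise_error_rate(ALPHA, UNPLANNED_METRICS):.2f}")

# Stricter per-test threshold needed if all twenty are tested anyway.
print(f"Bonferroni threshold: {bonferroni_threshold(ALPHA, UNPLANNED_METRICS):.4f}")
```

Under these assumptions, checking twenty metrics without a plan gives roughly a 64% chance of at least one spurious "win", compared with 5% for a single pre-specified metric.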

2. Business Impact

Measurable impact of data science work on business metrics.

Exceeds

Data science initiatives are directly tied to revenue, retention, or product quality metrics with measurable results.

Meets

Data science work improves product or business decisions. Impact is felt if not always precisely measured.

Below

Data science work is technically interesting but business impact is unclear.

Example review phrases

  • "The propensity-to-buy model they deployed is now used in 80% of enterprise outreach sequences, and pipeline from those sequences is up 28%."

3. Production Collaboration

Quality of collaboration with data engineering and software engineering to deploy and maintain models.

Exceeds

Models ship to production reliably. Engineering collaboration is proactive. Models are maintained with monitoring.

Meets

Models ship to production with adequate engineering support.

Below

Models remain in notebooks rather than production. Engineering collaboration is difficult.

Example review phrases

  • "The data engineering team said they're the easiest data scientist to work with when productionizing models: code is clean, requirements are clear, handoff is smooth."

Where do these examples come from in real reviews?

Most managers write performance reviews from memory—limited to what they personally observed. Confirm surfaces behavioral evidence from across the organization: who relied on this person, what they drove, how their impact extended beyond their direct manager's line of sight. Reviews written with Confirm's data are more accurate, more defensible, and faster to write.