Unconscious Bias, Performance Evaluation, and Promotion Fact Sheet
- Evaluations of women more often contain strings of stereotypical terms (e.g., “compassionate,” “cooperative”) while words like “accomplishment” are more often used for men.ii
- Women’s evaluations more often contain “grindstone adjectives,” such as “hardworking” or “diligent.” While these terms may not be negative on their own, they are harmful when they take the place of stronger adjectives often used for men (e.g., “innovative” or “brilliant”), and they tend to imply that women have a strong work ethic while men have ability.v
- “Doubt raisers,” or language like “it appears as if her personal life is stable,” appeared in twice as many evaluations written for women as in those written for men.v
- In a review of fellowship applications, women applicants needed to produce more than 99 “impact factors” to be perceived as being as competent as men who produced only 20 “impact factors.”iii
- Evaluations for women were often shorter than those written for men.v
- Evaluations for men contained more “standout” adjectives, such as “exceptional” and “unique.”v
- Evaluations for men were more likely to include specific feedback about how to improve, while evaluations for women were more likely to include vague remarks and negative assessments of their personality.iv
- In a qualitative study, black and minority ethnic employees reported that they wanted more specific feedback on how to improve instead of the vague commendations (“good job!”) they tended to receive.v
- Motherhood penalty: Mentioning that a woman is a mother leads evaluators to rate her as less competent and to recommend a lower salary than when the same employee is described as not having children. Fathers face no such penalty relative to men without children.vi
- In a comparison of numerical performance ratings with written narratives, supervisors routinely gave black employees lower numerical scores than white employees and did not explain the lower scores in their written summaries.vii
- Women and people of color who engage in diversity and equity-promoting endeavors receive diminished performance ratings. White men are neither penalized nor rewarded for promoting diversity.viii
- In a study of 9,000 employees, white men often received higher bonuses than women and minorities even though they were in the same job with the same supervisor and received equivalent performance ratings.ix
The following factors can increase the prevalence of these biases:
- Unconscious biases are more salient and significant when a profession or environment is dominated by a single group (e.g., gender, race).
- Gender discrepancies in performance evaluation and rewards are higher when women are underrepresented in executive roles industry-wide.x
- Unconscious bias is more common when evaluators are under time pressure, are fatigued, or need to make a quick decision.
- Vague or ambiguous performance criteria can foster unconscious gender bias in evaluations.
- Narrow definitions of success can amplify bias because those definitions are likely to be based on the traits and behaviors of those who are already in power.i
- Limited information about performance leads raters to rely on overall impressions, opening the door to bias.
- Evaluators who hold traditional attitudes about gender roles and/or subscribe to beliefs that justify social inequality (e.g., high social dominance orientation or strong belief in meritocracy) are most likely to negatively evaluate women, especially if the woman is perceived as nontraditional. Men are more likely than women to hold traditional attitudes about gender roles.xi
Common Errors in Performance Evaluation
The following are some common errors that managers make during performance reviews. These errors stem from cognitive shortcuts and undermine fair, rational, and meritocratic evaluation and promotion.
- Halo Effect: The evaluator scores a person high on all performance factors based on a positive general impression of the employee.
- Horn Effect: The evaluator scores a person low on all performance factors because of a poor general impression of the employee. The more raters rely on general impressions rather than specific examples, the more these errors will distort outcomes.
- Compatibility: Those who frequently agree with the evaluator (e.g., by nodding their heads) can get better ratings than their performance justifies. Conversely, evaluators may find it difficult to be objective (and to set aside private irritations) with a person who often disagrees.
- Self-comparison: Employees working in positions similar to ones the evaluator has held may do the job differently than the evaluator remembers doing it. This may result in lower evaluations than for those who do work unfamiliar to the evaluator.
- One-asset Associated: Employees strongly associated with a single positive element (e.g., the person with advanced degrees, the graduate of the evaluator's alma mater, the persuasive talker, the good-looking person) can experience upward bias when evaluated. Evaluators often judge employees’ credentials or reputations, rather than what they have actually done for the organization.
- Recency: A mistake made the day before a performance evaluation discussion can outweigh good performance over the previous months.
- Central Tendency: Some evaluators rate all employees as average, giving scores in the middle of the scale and avoiding extreme ratings. This practice can be particularly disadvantageous to employees when other evaluators or managers do not follow the same approach.
- No-complaint Bias: The evaluator tends to treat no news as average performance. If the evaluator has not heard any complaints or compliments about the person, she or he assumes everything is fine and rates the person as average, even though this may not be accurate.
- Lack of Definition: The evaluator rates all employees as average because what constitutes outstanding performance is unclear (e.g., there are no clearly defined performance standards). This lack of clear criteria also exaggerates the effects of many of the other patterns listed here.
ii Trix & Psenka (2003). Exploring the color of glass: Letters of recommendation for female and male medical faculty.
iii Wenneras & Wold (1997). Nepotism and sexism in peer review.
iv Correll & Simard (2016). Vague feedback is holding women back.
v Wyatt, M., & Silvester, J. (2015). Reflections on the labyrinth: Investigating black and minority ethnic leaders' career experiences.
vi Correll (2007). Is there a motherhood penalty?
vii Wilson, K. Y. (2010). An analysis of bias in supervisor narrative comments in performance appraisal.
viii Hekman, Johnson, Foo, & Yang (2017). Does diversity-valuing behavior result in diminished performance ratings for non-white and female leaders?
ix Castilla (2008). Gender, race and meritocracy in organizational careers.
x Joshi, Son, & Roh (2015). When can women close the gap? A meta-analytic test of sex differences in performance and rewards.
xi Basow, S. A. (2016). Evaluation of female leaders: Stereotypes, prejudice, and discrimination.