Paper Session 3
Standardized Tests and Affirmative Action: The Role of Bias and Variance
The University of California suspended through 2024 the requirement that applicants from California submit SAT scores, upending the major role standardized testing has played in college admissions. We study the impact of such decisions, and their interplay with other policies such as affirmative action, on the composition of the admitted class. This paper develops a theoretical framework to study the effect of requiring test scores on academic merit and diversity in college admissions. The model has a college and a set of potential students. Each student has observed application components and group membership, as well as an unobserved skill level drawn from an observed distribution and measured with noise. The college is Bayesian and maximizes an objective that depends on both diversity and merit. It estimates each applicant's true skill level using the observed features and, potentially, their group membership, and then admits students with or without affirmative action. We characterize the trade-off between the (potentially positive) informational role of standardized testing in college admissions and its (negative) exclusionary nature. Dropping test scores may exacerbate disparities by decreasing the amount of information available about each applicant, especially those from non-traditional backgrounds. However, if there are substantial barriers to testing, removing the test requirement improves both academic merit and diversity by increasing the size of the applicant pool. Finally, using application and transcript data from the University of Texas at Austin, we demonstrate how an admissions committee could measure this trade-off in practice to better decide whether to drop its test score requirement.
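The informational mechanism described above can be illustrated with a minimal sketch. The abstract specifies only that the college is Bayesian and that signals of skill are noisy; the Gaussian (normal-normal) form, the specific signal names, and all numbers below are illustrative assumptions, not the paper's actual model.

```python
def posterior_skill(mu_g, tau2_g, scores, sigma2s):
    """Posterior mean of an applicant's skill under an assumed
    normal-normal model.

    mu_g, tau2_g : group-specific prior mean and variance of skill
    scores       : observed noisy signals of skill (e.g. GPA, a test score)
    sigma2s      : noise variance of each signal

    Precisions (inverse variances) add, so each extra signal sharpens
    the posterior; dropping a signal shifts weight back to the prior.
    """
    precision = 1.0 / tau2_g + sum(1.0 / s2 for s2 in sigma2s)
    weighted = mu_g / tau2_g + sum(x / s2 for x, s2 in zip(scores, sigma2s))
    return weighted / precision

# Hypothetical applicant: strong GPA signal plus a test score signal.
with_test = posterior_skill(mu_g=0.0, tau2_g=1.0,
                            scores=[1.2, 1.0], sigma2s=[0.5, 0.4])
# Same applicant evaluated without the test: the estimate shrinks
# further toward the group prior mean mu_g.
without_test = posterior_skill(mu_g=0.0, tau2_g=1.0,
                               scores=[1.2], sigma2s=[0.5])
```

Under these made-up numbers the estimate with the test (about 0.89) exceeds the estimate without it (0.80), illustrating how removing a signal pulls applicants toward their group prior, which is one channel through which dropping scores can reduce the information available about non-traditional applicants.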
From Papers to Programs: Courts, Corporations, Clinics and the Battle over Computerized Psychological Testing
This paper examines the role of technology firms in computerizing personality tests from the 1960s to the 1980s. It focuses on National Computer Systems (NCS)'s development of computer software to interpret the Minnesota Multiphasic Personality Inventory. NCS trumpeted its computerized interpretation as a way to free up clerical labor and mitigate human bias, even as psychologists cautioned that proprietary algorithms risked obscuring decision rules. Clinics, courtrooms, and businesses all had competing interests in the use of computerized personality tests. I argue that test developers promoted computerized psychological tests as technical fixes for bias, even as courts and psychologists pointed to the complex layers of mediation, of both technology and human expertise, embedded in software programs for psychological tests. This paper contributes to histories of computing that emphasize the importance of intellectual property law in software development, to scholarship on the relationship between labor, technology, and expertise, and to scholarship on the history and politics of algorithms.
On the Moral Justification of Statistical Parity
A crucial but often neglected aspect of algorithmic fairness is how we justify enforcing a given fairness metric from a moral perspective. When fairness metrics are proposed, they are typically argued for by highlighting their mathematical properties; the moral assumptions underlying the metric are rarely made explicit. Our aim in this paper is to examine the moral assumptions associated with the statistical fairness criterion of independence (statistical parity). To this end, we consider previous work that discusses the two worldviews "What You See Is What You Get" (WYSIWYG) and "We're All Equal" (WAE) and thereby provides some guidance for clarifying the possible assumptions in the design of algorithms. We present an extension of this work that centers on morality. The most natural moral extension is that independence must be enforced if and only if differences in predictive features (e.g., high school grades and standardized test scores that are predictive of performance at university) between socio-demographic groups are caused by unjust social disparities or measurement errors. Through two counterexamples, we demonstrate that this extension is not universally true. This means that the question of whether independence should be used cannot be satisfactorily answered by considering only the justness of differences in the predictive features.
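The independence criterion itself is simple to state and check: a decision D satisfies statistical parity with respect to group attribute A when P(D=1 | A=a) is equal for all groups a. The sketch below computes the largest gap in positive-decision rates across groups; the data and group labels are hypothetical, chosen only to show the computation.

```python
from collections import defaultdict

def statistical_parity_gap(decisions, groups):
    """Largest difference in positive-decision rates across groups.

    Independence (statistical parity) holds when P(D=1 | A=a) is the
    same for every group a, in which case the gap is 0.
    """
    pos = defaultdict(int)  # positive decisions per group
    tot = defaultdict(int)  # total decisions per group
    for d, a in zip(decisions, groups):
        tot[a] += 1
        pos[a] += d
    rates = {a: pos[a] / tot[a] for a in tot}
    return max(rates.values()) - min(rates.values())

# Hypothetical admission decisions for two socio-demographic groups:
# group A is admitted at rate 3/4, group B at rate 1/4.
decisions = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = statistical_parity_gap(decisions, groups)  # 0.75 - 0.25 = 0.5
```

Note that the gap says nothing about *why* the rates differ; the paper's point is precisely that observing unequal rates, or just features that differ across groups, does not by itself settle whether enforcing a zero gap is morally required.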