“Algorithmic tools are being deployed rapidly and across a broad range of industries, without regulatory or voluntary safeguards in place to prevent these tools from replicating systemic bias and discrimination.”
“Discrimination and bias are not modern inventions,” says the Center
for Democracy & Technology in its comprehensive 2016 report,
Digital Decisions. “However, the scope and scale of automated technology
makes the impact of a biased process faster, wider spread, and harder to
catch or eliminate.”
This isn’t a problem for the future: machine learning already dictates our
world today, and algorithmic tools are reshaping the global marketplace
on a massive scale. According to International Data Corporation (IDC),
worldwide revenues for “cognitive/AI solutions” will mushroom from
nearly $8 billion in 2016 to $47 billion in 2020. The fastest growth is
expected in banking, securities investment services, manufacturing, retail,
and healthcare. “Cognitive/AI systems are quickly becoming a key part of
IT infrastructure and all enterprises need to understand and plan for the
adoption and use of these technologies in their organizations,” says IDC.
One problem, says mathematician, data scientist and former hedge fund
quant Cathy O’Neil, who authored Weapons of Math Destruction, is the
perception that algorithms are neutral because they rely on math — and
people generally trust math. The myth of algorithmic objectivity is a
critical barrier to uncovering and addressing the ways in which algorithmic
decision-making can be unfair.
Examples of algorithmic bias abound.
Facebook’s algorithms, for example, have enabled dozens of brand-name
companies, including Amazon, Verizon, and UPS, to exclude older workers
from job ads, according to an investigative report from ProPublica and
The New York Times. One employment expert called this practice
“blatantly unlawful.” As recently as November 2017, Facebook’s “ethnic
affinity” advertising algorithm allowed housing advertisers to exclude
potential home buyers who were “African Americans, mothers of high
school kids, people interested in wheelchair ramps, Jews, expats from
Argentina and Spanish speakers,” ProPublica found.
In 2015, researchers at Carnegie Mellon found that Google’s online
advertising system showed ads for higher-income executive jobs nearly six
times more often to men than to women.
The inherent risks of AI bias are now readily conceded by tech industry
leaders, who often highlight the need for humans to analyze and take
responsibility for what they do with artificial intelligence.
A 151-page report by Microsoft, published in January 2018, provides multiple examples
of how algorithmic decision-making can go wrong. Among them:
- “An AI system could also be unfair if people do not understand
the limitations of the system, especially if they assume technical
systems are more accurate and precise than people, and therefore
more authoritative. In many cases, the output of an AI system is
actually a prediction. One example might be ‘there is a 70 percent
likelihood that the applicant will default on the loan.’ The AI
system may be highly accurate, meaning that if the bank extends
credit every time to people with the 70 percent ‘risk of default,’
70 percent of those people will, in fact, default. Such a system
may be unfair in application, however, if loan officers incorrectly
interpret ‘70 percent risk of default’ to simply mean ‘bad credit
risk’ and decline to extend credit to everyone with that score —
even though nearly a third of those applicants are predicted to be
a good credit risk. It will be essential to train people to understand
the meaning and implications of AI results to supplement their
decision-making with sound human judgment.”
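The arithmetic in Microsoft’s example is worth making concrete. The short sketch below, in Python with hypothetical numbers, shows why reading “70 percent risk of default” as a blanket rejection discards the roughly 30 percent of flagged applicants whom the model itself predicts will repay.

```python
# Hypothetical numbers illustrating Microsoft's loan example: a calibrated
# model assigns each flagged applicant a 70% chance of default.

applicants_flagged = 1_000            # applicants scored at "70% risk of default"
p_default = 0.70                      # calibrated probability of default

expected_defaults = applicants_flagged * p_default          # ~700 people
expected_repayers = applicants_flagged * (1 - p_default)    # ~300 people

print(f"Expected to default: {expected_defaults:.0f}")
print(f"Expected to repay if given credit: {expected_repayers:.0f}")

# A loan officer who reads "70% risk" as "decline everyone" turns away
# roughly 300 of every 1,000 such applicants whom the model itself
# predicts to be good credit risks.
```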
Algorithms’ flaws carry significant reputational risk for companies, a risk
that greater transparency can help mitigate. Addressing the problem
requires businesses to distinguish between the intention of an algorithmic
tool and its real-world impact. According to the non-profit Data & Society
Research Institute, “the designer of an algorithm may have no intentions
of producing discriminatory results. For example, algorithmically
inferring race with a high degree of accuracy without actually knowing
race is relatively easy. Unless an analyst is testing to make sure that race
is not a factor, the correlates that enable such discrimination to occur can
often go unnoticed.”
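The kind of check Data & Society describes can be sketched simply: test whether a protected attribute can be recovered from the supposedly “neutral” features a model consumes. The illustration below uses synthetic data and hypothetical feature names; a real audit would use the deployed model’s actual inputs.

```python
# Synthetic illustration of the audit Data & Society implies: check whether a
# protected attribute can be inferred from the "neutral" features a model uses.
# Feature names and data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5_000

group = rng.integers(0, 2, size=n)                   # protected attribute (synthetic)
income = rng.normal(50 + 15 * group, 10, size=n)     # proxy, e.g. ZIP-derived income
behavior = rng.normal(0.3 * group, 1.0, size=n)      # weaker proxy, e.g. browsing score
X = np.column_stack([income, behavior])

# If this classifier beats chance by a wide margin, the "neutral" features
# encode the protected attribute and can transmit discrimination unnoticed.
auc = cross_val_score(LogisticRegression(max_iter=1000), X, group,
                      cv=5, scoring="roc_auc").mean()
print(f"Protected attribute recoverable from features (ROC AUC): {auc:.2f}")
```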
Case in point: a 2012 Wall Street Journal investigation into pricing on the
website of Staples, the office supply company, found that users were
offered different prices for the same product depending on their location.
An algorithm’s decision was driven by competition: “ZIP Codes whose
center was farther than 20 miles from a Staples competitor saw higher
prices 67 percent of the time. By contrast, ZIP Codes within 20 miles of a
rival saw the high price least often, only 12 percent of the time.” The
unintended consequence of this formula — which was hidden from
consumers — was that customers in poorer communities were more likely
to be charged higher prices for the same product since Staples had fewer
nearby competitors in these areas.
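A minimal sketch of the pricing rule the Journal describes, with hypothetical ZIP codes and a made-up premium, shows how a formula that mentions only distance to a competitor can still fall hardest on lower-income areas.

```python
# Minimal sketch of a competition-driven pricing rule like the one the WSJ
# found. The 20-mile threshold echoes the article; the premium, prices, and
# ZIP codes are hypothetical.

def quoted_price(base_price: float, miles_to_nearest_rival: float) -> float:
    """Charge a premium where the nearest competitor is far away."""
    return base_price * (1.15 if miles_to_nearest_rival > 20 else 1.0)

# Hypothetical ZIP codes: (label, miles to nearest rival, median income)
zip_codes = [
    ("ZIP A (urban, affluent)",    4,  95_000),
    ("ZIP B (suburban)",          12,  70_000),
    ("ZIP C (rural, low-income)", 35,  38_000),
]

for label, miles, income in zip_codes:
    price = quoted_price(19.99, miles)
    print(f"{label:27s} median income ${income:>7,}  price ${price:.2f}")

# The premium lands on ZIP C, not because the rule looks at income, but
# because competitors cluster in wealthier areas, and distance to a rival
# quietly correlates with income.
```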
Importantly, even when algorithms are intentionally programmed to
counteract bias, discrimination can still occur, especially when cost-
effectiveness is prioritized. In 2016, a study by marketing professors at
the MIT Sloan School of Management and London Business School found that
social media advertising for technology and science jobs that was
“explicitly intended to be gender-neutral in its delivery” resulted in far
fewer women seeing the ad — even though women who saw the ad were
more likely to click on it. How could this be? The researchers concluded
that “women aged 18-35 are a prized demographic and as a consequence
are more expensive to show ads to. This means that an ad algorithm which
simply optimizes ad delivery to be cost-effective, can deliver ads which
are intended to be gender-neutral in what appears to be a discriminatory
way.” To mitigate this discriminatory impact, the study advised that
advertisers “set different budgets for female and male advertising
campaigns, but also further separate out bidding strategies by age as well
as gender, to ensure that they do reach younger women.”
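The mechanism the researchers describe can be illustrated with a toy calculation; all prices, budgets, and the optimizer’s split below are hypothetical. An optimizer that minimizes only cost per impression buys fewer impressions for the more expensive demographic, even under gender-neutral targeting, and equalizing reach requires budgeting each segment separately.

```python
# Toy illustration of the delivery mechanism the study describes.
# All prices, budgets, and the optimizer's split are hypothetical.

budget = 1_000.00                      # total campaign budget
cpm = {"women_18_35": 9.00,            # cost per 1,000 impressions (contested audience)
       "men_18_35":   5.00}

# A purely cost-minimizing optimizer shifts spend toward the cheaper audience.
spend = {"women_18_35": 250.00, "men_18_35": 750.00}   # hypothetical allocation
impressions = {seg: 1_000 * spend[seg] / cpm[seg] for seg in cpm}
print(impressions)   # women see far fewer ads despite gender-neutral targeting

# The study's suggested mitigation: budget each segment separately so that
# reach, not just cost, is controlled. Equal reach means spend proportional
# to each segment's CPM.
total_cpm = sum(cpm.values())
equal_spend = {seg: budget * cpm[seg] / total_cpm for seg in cpm}
equal_reach = {seg: 1_000 * equal_spend[seg] / cpm[seg] for seg in cpm}
print(equal_reach)   # both segments now receive the same number of impressions
```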
U.S. law recognizes a theory of “disparate impact,” which occurs when a
seemingly neutral policy produces a disproportionate adverse effect or
impact on a protected class of people. Some data scientists argue that
“finding a solution to big data’s disparate impact will require more than
best efforts to stamp out prejudice and bias; it will require a wholesale
reexamination of the meanings of ‘discrimination’ and ‘fairness.’”
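One common way disparate impact is screened in practice is a selection-rate comparison, often summarized by the “four-fifths” heuristic used in U.S. employment guidelines. The sketch below uses hypothetical counts to show how a facially neutral screen can be flagged without any protected attribute appearing among its inputs.

```python
# Hypothetical counts showing a selection-rate comparison, the screen behind
# the widely used "four-fifths" heuristic: a group whose selection rate falls
# below 80% of the highest group's rate is flagged for possible adverse impact.

def selection_rate(selected: int, applicants: int) -> float:
    return selected / applicants

rates = {
    "group_a": selection_rate(selected=120, applicants=400),  # 0.30
    "group_b": selection_rate(selected=45,  applicants=300),  # 0.15
}

highest = max(rates.values())
for group, rate in rates.items():
    ratio = rate / highest
    flag = "possible adverse impact" if ratio < 0.8 else "ok"
    print(f"{group}: rate={rate:.2f}  ratio to highest={ratio:.2f}  -> {flag}")

# group_b is flagged even though no protected attribute was ever an input to
# the screen; the disparity in outcomes is what the doctrine looks at.
```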
Most companies treat their algorithms as trade secrets, a corporate asset
to be protected in a metaphorical “black box,” far from public scrutiny.
But many critics agree with Marc Rotenberg, executive director of the
Electronic Privacy Information Center, who says the problem with “black
boxes” is that “even the developers and operators do not fully understand
how outputs are produced.”
“We need to confront the reality that power and authority are
moving from people to machines,” Rotenberg says. “That is why
#AlgorithmicTransparency is one of the great challenges of our era.”
Algorithmic transparency is not about exposing trade secrets; it is about
disclosing the impact of an algorithm’s decision on consumers, and
particularly the most marginalized groups. Only by measuring and
examining the outcomes of an algorithm’s decision — including unintended
consequences — can investors and the public truly understand whether
advantages gained by businesses come at the expense of broader harms
to society.
Case Study: Predictive Hiring
The hiring process provides a good example of the purported benefits and
inherent risks of algorithms. Companies are increasingly using automated
hiring tools that include personality tests, skills tests, questionnaires
and other exams to make judgments about candidates’ employability.
By one estimate, the HR software market is expected to top $10 billion
by 2022.
Some of the biggest marketers of algorithmic tools in the human resources
field include Oracle, whose Taleo software “enables companies to easily
source, recruit, develop, and retain top talent with an engaging, social, and
data-rich talent management software suite”; SAP, whose human resource
software offers to help employers “find the right talent, develop future
leaders, and engage all employees with automated, transparent
processes, and a digital HR experience”; and IBM, which acquired Kenexa,
an HR software developer, in 2012 for $1.3 billion.
Many companies purchase automated human resources tools to reduce
bias in the hiring process. Without a doubt, implementing equitable hiring
practices is a critical and urgent endeavor for economic equity.
But how effective are automated hiring tools at reducing bias?
It’s true that HR software companies promote the supposed neutrality
of their tools as an advantage over traditional hiring methods. For
example, HireVue, one of the more popular big data-driven human
resources software companies, writes, “Not only does AI cut costs
and speed up onboarding, it doesn’t care about an applicant’s club
membership or what their favorite sports team is.” Goldman Sachs and
Unilever have reportedly used technology from HireVue that analyzes the
facial expressions and voice of job candidates to advise hiring managers.
Similarly, the banking giant Citigroup uses Koru, predictive hiring software whose maker says it “identifies the drivers of performance in your company,
increases high quality hires, and reduces bias.” Koru claims to measure
each job candidate for seven “impact skills,” including “Grit, Rigor, Impact,
Teamwork, Curiosity, Ownership, and Polish.” According to Fortune, “the
software uses algorithms that search for signs of grit in past behavior. It’s
less a matter of any individual sign than the accumulation of them. Maybe
a candidate was on the volleyball team. But what really matters is how
long the person persisted — while, say, holding down a full-time job — as
well as the leadership role she attained and the solo projects she
completed. The software can suggest follow-up interview questions that
let employers dig deeper.”
While predictive software companies claim their products help reduce
bias, many data scientists say these algorithms can behave in dangerously
biased ways.
According to the Center for Democracy & Technology, “if training data
for an employment eligibility algorithm consists only of all past hires for a
company, no matter how the target variable is defined, the algorithms may
reproduce past prejudice, defeating efforts to diversify by race, gender,
educational background, skills, or other characteristics.”
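The Center’s point can be demonstrated in a few lines of code: if the training label is simply “was hired in the past,” a model reproduces whatever pattern generated those hires, even when group membership is never given to it. The example below uses synthetic data and a hypothetical “pedigree” proxy feature.

```python
# Synthetic demonstration of the Center's warning: train on the label
# "was hired before" and the model codifies the old pattern, even with no
# access to group membership. The "pedigree" proxy feature is hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 10_000
group = rng.integers(0, 2, size=n)                 # never shown to the model
skill = rng.normal(0, 1, size=n)                   # genuinely job-relevant signal
pedigree = skill + 1.5 * (group == 0) + rng.normal(0, 0.5, size=n)  # proxy

# Historical hiring favored group 0 regardless of skill.
hired = (skill + 2.0 * (group == 0) + rng.normal(0, 1, size=n)) > 1.5

X = np.column_stack([skill, pedigree])
model = LogisticRegression(max_iter=1000).fit(X, hired)
scores = model.predict_proba(X)[:, 1]

for g in (0, 1):
    print(f"group {g}: mean predicted 'hireability' = {scores[group == g].mean():.2f}")

# The gap persists because "pedigree" carries group information: the model
# looks diversity-blind on its face while reproducing the past pattern.
```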
Kelly Trindel, chief analyst for the Equal Employment Opportunity
Commission, has testified:
- “...if the training phase for a big data algorithm happened to
identify a greater pattern of absences for a group of people with
disabilities, it might cluster the relevant people together in what’s
called a high absenteeism risk profile. This profile need not be
tagged as disability; rather, it might appear to be based upon some
common group of financial, consumer or social media behaviors.
It may not be obvious to the employer, or even to the data
scientist who created the algorithm, that subsequent employment
decisions based on this model could discriminate against people
with disabilities.
- Similarly, if most previously successful employees at a firm happen
to be young, white or Asian American men, then the model will
codify success in this way. If married women of a particular age
are more likely to churn, then the model will codify this in proxies
and subsequently predict lower success rates for similar women.
All of this can happen without informing the model of disability
status, age, race or gender, and all while giving the appearance
that the machine is working just as it should, to increase
worker ROI.”
Speaking before the same EEOC hearing in 2016, Dr. Kathleen Lundquist,
CEO of APTMetrics, explained: “[A]lgorithms are trained to predict
outcomes which are themselves the result of previous discrimination.”
This not only has the potential to magnify bias but also appears to
shift responsibility for fairness from human to computer. When
job applicants’ tests are “scored” by algorithms, company hiring managers
and applicants alike lack an understanding of how the decision (or score)
was determined.
Because hiring managers encounter only the candidates the algorithms deem
qualified, there is no way of knowing how often an algorithm is
making unfair rejections. Data scientist Cathy O’Neil notes: “If you think
about it, if you filter people out from even getting an interview, you
never see them again. ... There’s no way to see, to learn that you made a
mistake on that — that person would have been a good employee —
because they’re gone.”
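O’Neil’s point about invisible mistakes can be simulated directly: outcomes are recorded only for candidates the screen lets through, so its false negatives never enter the employer’s data. The numbers below are purely synthetic.

```python
# Synthetic simulation of the feedback gap O'Neil describes: outcomes are
# observed only for candidates the screen lets through, so its false
# negatives never appear in the employer's records.
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
true_quality = rng.normal(0, 1, size=n)                        # never observed directly
screen_score = 0.5 * true_quality + rng.normal(0, 1, size=n)   # imperfect automated screen
passed = screen_score > 1.0                                    # only these get interviews

good = true_quality > 0.5
missed_good = int(np.sum(good & ~passed))        # good candidates the screen rejected

print(f"Candidates the screen let through: {int(passed.sum())} of {n}")
print(f"Good candidates it rejected (invisible to the employer): {missed_good}")

# Later evaluations can only use the `passed` rows, so the `missed_good`
# errors are structurally impossible to detect from the company's own data.
```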