2022 Journal Houssam Kherraz

Technology Ethics: Three Key Takeaways

by Houssam Kherraz, 2022 Design & Technology Fellow

Similar Goals and Similar Tensions With the Broader Professional Community

In 2018, a group of MIT researchers launched TuringBox: a two-sided platform on which users submitted algorithms on one side and “AI examiners” tested and studied them on the other. The hope in launching the project was that academics, engineers, and other experts might use the opportunity to discover biases in AI systems. Today, the platform no longer exists. While I have not been able to find a post-mortem and the main authors have not responded to my requests, I strongly suspect that it did not gain traction with either side of the platform. This failure, I contend, reflects the hurdles facing AI ethicists interested in questions of algorithmic fairness.

On my trip with FASPE, I was fascinated by the large differences in language, approach, and perspective among the different professions. It made me wonder how much can be learned from other professional communities and inspired this piece exploring what AI ethicists can learn from computer security. In other words, it offered a path forward to investigate the question raised by TuringBox.

Fundamentally, both computer security experts and algorithmic fairness experts try to achieve the same goal: fixing issues in systems that are inherently flawed, often because they were designed to deliver quickly without all stakeholders in mind. If you listen to both specialists closely, the similarities become obvious. Many security engineers are quick to point out how hard their job is, how difficult it is to keep data safe when many of our digital systems, including the internet, were initially built without security in mind. They might decry how many vulnerabilities are due to a failure in understanding the different people who interact with their technology and the creative ways in which it can be used (or “abused”). They might grumble about how little most engineers know about security, how their only concern is moving fast. Since no computer system is fully secure, many say that a primary function of their job is simply to be paranoid.

AI ethicists have very similar complaints, albeit using a different vocabulary. They often say that it is impossible to “fix” algorithmic bias in its entirety. All modern machine learning algorithms require a lot of data, which tend to reflect existing human biases. Most algorithms currently in use, and the datasets behind them, were constructed with no consideration toward mitigating algorithmic bias. Ethicists might, then, point out that the design process behind many of these algorithms is not holistic, both in terms of who takes part in the design and who tests the product as a user. The Google Photos incident, for example, saw the auto-album classifier label people with dark skin as gorillas in the first weeks of release.¹ If a resourceful behemoth like Google has failed to include African Americans in its design process, the hope for the rest of the industry to include people from different professional and demographic backgrounds is slim. Finally, the third parallel is that most AI professionals know little about algorithmic bias and are primarily concerned with performance and quickly delivering business value.

At an abstract level, the similarities between the goals and tensions prevalent in each field are striking to me. The computer security field is quite a bit more mature, having existed for longer. But even in its longer past, we can see echoes of the shorter history of algorithmic fairness.

Historical Echoes Between the Two Fields

When computers were first adopted by major institutions and universities, no one was thinking about security. In the 1960s, ARPANET, the precursor to the internet, was designed with an inherent assumption that all actors within the network were competent and non-malicious. Even when security vulnerabilities became obvious, many people did not take them all that seriously. The first computer virus, for example, was not meant to do harm; it was a simple experiment by its creators (who proudly included their names in the virus’s code).² The rapid growth of computer systems came as a surprise, and many enthusiasts tried to break things to see whether they could, without really wondering whether they should. By the time anyone was really paying attention to security, software systems were far too successful and well-dispersed to rebuild from the ground up.

Take, for instance, DNS (Domain Name System), a central system without which the internet would not have scaled as well as it has. DNS converts domain names, like www.faspe-ethics.org, into IP addresses, like 74.208.236.121. As you might guess, its builders had efficiency and scalability rather than security in mind. As a result of this legacy system, security loopholes that enable DDoS attacks are difficult to fix. Were it possible to rebuild DNS from scratch, security would be considered a priority and these loopholes could easily be resolved. But that is not an option, and so security experts have been forced to work with what they have. As hackers discover new flaws, our systems become safer, even as new technologies are introduced with their own new security concerns. Experts, in other words, have had to innovate and retain rather than fundamentally reconstruct.

We can see the same patterns in the development of AI fairness. Algorithmic bias has only recently become a major part of the conversations within industry and academia. A quick search on Google Trends reveals that terms related to AI ethics and algorithmic fairness only saw increased usage in 2016. The first ACM (Association for Computing Machinery) Conference on Fairness, Accountability, and Transparency (ACM FAccT) took place in 2018, only four years ago. Books that popularized these questions, like Cathy O’Neil’s Weapons of Math Destruction (2016), are mostly less than a decade old. Just as computer security was largely an afterthought in the early days, new mainstream algorithms and datasets today generally do not include any analysis of algorithmic bias; it simply is not considered a central part of the design process. For instance, OpenAI recently released a powerful image generation algorithm called DALL-E 2. The related paper³ does not mention any algorithmic bias analysis, even though such a complex system has many problematic behaviors.⁴ Its analysis of the risks of DALL-E 2 is instead buried on a lowly GitHub page.⁵ The lab, founded with a mission to ensure that AI benefits all of humanity, seems content to leave the issue as a later exercise for “experts” as it slowly opens access to the algorithm.

Today, while neither field receives the priority it rightfully deserves, the algorithmic bias space is in the worse position. Very few companies have roles dedicated to mitigating bias and thinking about algorithmic fairness, whereas virtually every company IT division has dedicated security experts. As a software professional, I constantly hear about security patches to open-source software and data breaches at specific companies, but I do not hear nearly as much about fairness issues uncovered in common datasets and algorithms, despite how ubiquitous they are today. The number of people looking into these issues is also much smaller than the number dealing with security vulnerabilities.

Given the similarities, one might assume the communities collaborate and inspire each other today. Surprisingly, I do not see much communication or cross-pollination of ideas. Algorithmic bias experts would likely benefit from the longer and richer history of computer security. Drawing on it could spare them from reinventing the wheel, saving time and avoiding harm as their own field develops.

What Can Algorithmic Bias Experts Learn from the Computer Security Community?

The incentive structures of the computer security sector offer a helpful lesson for algorithmic bias experts. Many companies view digital security vulnerabilities as a serious existential risk. Losing data or intellectual property can kill enterprises, whether through the loss of their competitive advantage or as a result of lawsuits.⁶ In the capitalist system we operate under, risks to the bottom line drive behavior. Successful hacks are ingrained, to various extents, in the minds of business leaders, and so they feel they must take security seriously: for example, by hiring staff dedicated to security, giving them power within their organizations, incentivizing the early discovery of issues via bug bounty programs, and adopting standards like SOC 2 compliance. The flip side of this incentive structure is that clients too are security-conscious. As I have seen in my time in the industry, many scoff if they do not believe a company is doing its best to protect their data, taking their business elsewhere when necessary.

Today, unfortunately, algorithmic bias is still not viewed as particularly risky by businesses. Until there are major lawsuits that scare business leaders or significant PR scandals that lead to loss of profits, it is unlikely that the current status quo will change—the incentives are simply not there. As a result, AI fairness experts do not benefit from the system as it exists, and many of those who do find work at major companies are seen as little more than liminal features working on vanity projects.⁷

So how can the AI industry, specifically as concerns algorithmic fairness, be pushed to adopt higher standards similar to those upheld in computer security? Some might contend that the answer is time—it took decades for computer security to establish itself as crucial, and even then, only after many catastrophic failures and data breaches. Companies now understand that paltry security can cause immense damage. By the same logic, with enough algorithmic bias crises, the industry will have no choice but to make it a first-class concern.

This possible future really concerns me for a few reasons. Firstly, computer security problems tend to lie dormant, only having massive effects once exploited. Algorithmic bias deals with continuous harm—the longer the algorithm is running and making decisions, the more harm is done to one or many groups of people. Additionally, computer security has had the help of the hacking community to push standards higher, especially in the early days. The algorithmic fairness field does not have an activist or hobbyist community comparable in size or in dynamism.

In looking for how such a group might come to exist for algorithmic fairness experts, we should ask: how did the hacking community come to exist? Why have so many people been interested in hacking, especially in the early days when all such activity was illegal? Complete answers to these questions would require a significant socio-psychological and ethnographic study exploring what motivates hackers, how hacking became so prevalent in geek culture and pop culture,⁸ and how such communities grow despite challenges like secrecy and illegality. For our purposes, however, I only conjecture that the growth behind hacking was due to three main reasons, each of which contrasts with the current state of affairs in algorithmic fairness.

Firstly, geek and hacking culture were very intertwined in the early days. They both were part of a broader computer counterculture, with people in them reading the same magazines and frequenting the same forums.⁹ As a result, hacking gained a lot of momentum and social capital in these groups early on. These geek communities were involved in these spaces out of sheer interest, since there was no money to be made that way at the time. It’s no surprise then that many people spent countless unpaid hours finding vulnerabilities in computer systems out of curiosity, seeking credit and praise from their peers. In contrast, algorithmic fairness has not yet gained comparable social capital with communities within either AI or the social sciences. Very few people are looking into computational bias issues as a hobby.

Secondly, beyond curiosity and seeking peer acknowledgement, I suspect hackers chase a sense of accomplishment when they gain unauthorized access to some computer systems—a feeling of power, getting one over on the corporate behemoth. We might conceive of this as a feeling of winning against all odds, of David vanquishing Goliath, of a motivated individual overcoming a team of “experts.” Yet the opposite seems to be the case for many algorithmic bias experts. Once they uncover an issue in a hiring or ranking algorithm, many end up feeling powerless—powerless because the solution to the issue is unclear, and companies are unwilling to put forward the resources needed to figure it out (or sacrifice profits for a fairer system). Consider how the COMPAS algorithm, used in courts to predict recidivism risk, is still being used today despite ProPublica showing in 2016 that it is racist against Black defendants.¹⁰ Powerlessness, and negative feelings in general, are not great motivators for human beings.

Finally, hacking is essentially about gaining a certain level of access when you are not supposed to. Since the whole point is to go where you are not wanted, no permission is needed to get started or try. Algorithmic fairness experts, on the other hand, need access to the algorithms and ideally their historical data to be able to even start “testing” the algorithm for bias. Very few algorithms are publicly accessible, and even those that are will rate limit a user looking into the biases of the algorithm.¹¹ This lack of access could be solved if companies were to give researchers and hobbyists limited access to help uncover issues. This is common in cybersecurity, where companies provide some level of access to hackers to see if they can penetrate further or steal privileged data. For instance, huge competitions are organized at conferences to find vulnerabilities, and many companies have significant bug bounty programs. Such initiatives incentivize the hacker community to contribute to the advancement of cybersecurity, to work with companies directly when they find vulnerabilities rather than sell them to nefarious parties. Both are practically nonexistent for algorithmic bias. In 2021, Twitter organized an algorithmic bias bounty program in partnership with a major security conference, the first of its kind, but there hasn’t been another since.
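To make concrete what “testing” an algorithm for bias can involve, here is a minimal sketch in Python of one simple check: send the same kind of probe to an opaque decision system for members of different groups and compare how often each group receives a favorable outcome. Everything named here is an assumption made for illustration: `toy_decide` is a hypothetical stand-in for the system under audit, the probe records are made up, and the ratio of group rates is only one of many possible fairness measures, not the method of any particular audit mentioned in this piece.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Mapping


def selection_rates(
    decide: Callable[[Mapping], bool],
    records: Iterable[Mapping],
    group_key: str = "group",
) -> Dict[str, float]:
    """Share of records in each group that receive a favorable decision."""
    favorable = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[group_key]
        total[group] += 1
        if decide(record):
            favorable[group] += 1
    return {group: favorable[group] / total[group] for group in total}


def disparity_ratio(rates: Mapping[str, float]) -> float:
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical stand-in for the opaque system being audited. In a real
# audit this would be a call to the deployed algorithm, not local code.
def toy_decide(record: Mapping) -> bool:
    return record["score"] >= 70


# Made-up probe records for illustration only; not real data.
probes = [
    {"group": "A", "score": 82},
    {"group": "A", "score": 74},
    {"group": "A", "score": 65},
    {"group": "A", "score": 91},
    {"group": "B", "score": 71},
    {"group": "B", "score": 58},
    {"group": "B", "score": 66},
    {"group": "B", "score": 69},
]

rates = selection_rates(toy_decide, probes)
ratio = disparity_ratio(rates)
print(rates, round(ratio, 2))
```

The arithmetic is trivial; the difficulty is access. Every call to `decide` would, in practice, be a request to someone else's system, and a meaningful audit needs many such requests, which is exactly what rate limits and closed doors prevent.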

The solution, then, may be simple: have the algorithmic bias community more closely collaborate and team up with the hacker community, especially the subset with a penchant for activism. One could imagine situations in which teams of gray hat hackers provide access to some algorithm, while other teams leverage that access to test and gain insights on algorithmic bias. It’s the perfect match—a community that suffers from a lack of access working with one that loves gaining access. Since most algorithms in use are behind closed doors, if this type of “hacking” were to become more common, companies would be forced to look into their closet and study how their algorithms operate before a group of people unconstrained by NDAs exposes them.¹² We will likely see more “bias” bounty programs, and more resources dedicated to algorithmic fairness. But should the algorithmic fairness community embrace this illegal and ethically questionable route? Is using computer security and its complex history with hacking as an inspiration even wise, given the risks involved and the vulnerabilities that still exist in the security sector?¹³

There is no easy answer to either question. What I do know, however, is that countless unexamined algorithms are deployed and affecting humans at scale right now. The longer it takes for governments and companies to take algorithmic fairness seriously, the more harm will be done. Adversarial activism in collaboration with hackers, and taking inspiration from the computer security field to create a larger community of “bias testing” hobbyists, seem to be effective ways to get more of those algorithms examined sooner. More generally, I suspect the fairness and bias community has a lot to gain by more closely aligning itself culturally and socially with the computer security community, rather than the broader machine learning community. In the absence of other options and with the need so great, it is incumbent upon us to consider this collaboration seriously. Ethical computing demands it.

Houssam Kherraz was a 2022 FASPE Design and Technology Fellow. He is a software engineer at Kensho Technologies.

Notes

1. It’s also interesting that they haven’t truly fixed the issue: https://www.theverge.com/2018/1/12/16882408/google-racist-gorillas-photo-recognition-algorithm-ai.
2. Macdonald, Neil, et al. “Digital Dangers: A Brief History of Computer Security Threats.” Information Security Buzz, 23 Sept. 2014.
3. https://cdn.openai.com/papers/dall-e-2.pdf.
4. https://www.wired.com/story/dall-e-2-ai-text-image-bias-social-media/.
5. https://github.com/openai/dalle-2-preview/blob/main/system-card.md.
6. See the Equifax FTC Settlement, although the amount is laughable given the potential damage: https://www.ftc.gov/enforcement/refunds/equifax-data-breach-settlement.
7. Google has, for example, fired some of its major AI ethicists: https://www.theverge.com/2021/4/13/22370158/google-ai-ethics-timnit-gebru-margaret-mitchell-firing-reputation.
8. Most “techie” characters in movies are hackers, and their very unrealistic “computer skills” tend to be hacking related.
9. 2600: The Hacker Quarterly and others.
10. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
11. Being able to send many requests programmatically to a publicly available algorithm is crucial to gain a broad understanding. Rate-limiting requests/bot prevention algorithms make this substantially harder.
12. Another grim possibility here is that companies decide to solely invest in computer security to prevent access to their algorithms in the first place, rather than invest in making their algorithms fairer.
13. Especially in contrast with other engineering fields like civil or aerospace engineering.