Removing Racist Terminology from Code

In my last post I talked about the actions I plan to take to combat racism in academia and science. I have also been reading with great interest the commitments from both academia and industry. Here is one example of how racism is embedded in the computing community, and how I implicitly contributed to it.

Racist Terminology in Programming

Computer scientists use a number of formalisms to describe computing paradigms. Many of these concepts are decades old and have become the status quo, and some of them have racist or sexist connotations. Further, they are still taught in computer science classrooms. I know of two racist terms used in programming*:

Master & slave terminology describes a relationship among devices (e.g. hard drives) where one device controls the others. This terminology appears in the code bases of many major tech companies, in version control (e.g. Git), and in other commonly used applications. Because of its obviously racist connotations, it was named the most politically incorrect term of 2004.

Blacklist & whitelist terminology describes access control: entities on a blacklist are denied access, and entities on a whitelist are allowed it. While “blacklist” originally referred to the color of the book used to record problematic people’s names in the 1600s, I would argue that the introduction of “whitelist” made this term problematic.

Despite the controversy, these terms have been pervasive in computing. The Black Lives Matter protests have motivated the tech industry to critically examine its code bases for such racist language. As one example, GitHub has started working on renaming its master branch to a neutral term (like main). Even before the recent protests, there were efforts to remove the “master/slave” terminology. In 2003, Los Angeles County officials asked that manufacturers stop using the terms “master” and “slave” on computing equipment. In 2018, the Python programming language dropped master/slave terminology.

My Code

When I read these articles, my first thought was “oh good.” I’ve always had issues with “master/slave” terminology, and I had also felt a little uneasy about “blacklist/whitelist.” I am glad that GitHub is changing its terminology, and I decided to stop using blacklist/whitelist terminology as well.

Yesterday, in a meeting with a student, we were going through the code from one of my published papers, which is stored in a public GitHub repository. The word “blacklist” was everywhere. This must have been an old repository, right? Maybe one of my first projects from graduate school, when I hadn’t fully thought about the racial implications of these words? Wrong.

The problematic GitHub repository is from my most recent paper, published October 2019.

In that paper, I wrote

PathwayCommons reports ubiquitous molecules in their database (“blacklisted nodes” [19]),…

And the term “blacklist” is used to refer to this set of molecules for the rest of the work.1 In re-reading this section, I can tell that I knew the term was inappropriate, because I put it in quotes. When I described another set of molecules we consider, I did not use quotes. I realized this today, just now, when reading my own work. The paper was accepted to a conference, reviewed by multiple anonymous reviewers, and passed editorial approval for publication. I had numerous opportunities to change this writing, and I should have avoided this language from the start.

I am part of the problem.

To start to fix this, I updated the GitHub repository to rename all instances of “blacklist” to a more neutral term (“ubiquitous”, which describes the dataset better). I updated the code, the filenames, and the documentation. The developer log now reads:

Jun 2020: Renamed “blacklisted” files to “ubiquitous” files in response to the racist connotations with using “blacklist” and “whitelist.” Read more about GitHub’s phrase changes, and look through your own repos to remove racist terminology.
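For readers who want to make a similar change in their own repositories, a rename like this can be scripted. The following is a minimal Python sketch under stated assumptions, not the script used for the repository described here: the function names (replace_terms, rename_in_tree), the replacement mapping, and the list of file suffixes are all hypothetical, and the match is case-sensitive and lowercase only.

```python
import os
from pathlib import Path

# Hypothetical mapping for illustration. Both forms map to the same
# neutral term, mirroring the rename described in the developer log.
REPLACEMENTS = {"blacklisted": "ubiquitous", "blacklist": "ubiquitous"}


def replace_terms(text, replacements=REPLACEMENTS):
    """Replace each old term in text, longest term first.

    Longest-first ordering ensures "blacklisted" is handled before its
    prefix "blacklist", which would otherwise leave "ubiquitoused".
    """
    for old in sorted(replacements, key=len, reverse=True):
        text = text.replace(old, replacements[old])
    return text


def rename_in_tree(root, replacements=REPLACEMENTS,
                   suffixes=(".py", ".txt", ".md")):
    """Rewrite file contents and rename files/directories under root.

    Only files whose suffix is in `suffixes` (an assumed list) have their
    contents rewritten; every file and directory name is checked.
    Returns the new paths of renamed entries.
    """
    renamed = []
    # Walk bottom-up so a renamed directory never invalidates paths
    # that still need to be visited beneath it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        if ".git" in Path(dirpath).parts:
            continue  # never touch Git's internal files
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix in suffixes:
                original = path.read_text(encoding="utf-8")
                updated = replace_terms(original, replacements)
                if updated != original:
                    path.write_text(updated, encoding="utf-8")
            new_name = replace_terms(name, replacements)
            if new_name != name:
                path.rename(path.with_name(new_name))
                renamed.append(str(path.with_name(new_name)))
        for name in dirnames:
            if name == ".git":
                continue
            new_name = replace_terms(name, replacements)
            if new_name != name:
                path = Path(dirpath) / name
                path.rename(path.with_name(new_name))
                renamed.append(str(path.with_name(new_name)))
    return renamed
```

Calling rename_in_tree("path/to/repo") on a working copy (the path is a placeholder) would rewrite matching files in place, so it is worth running on a clean checkout and reviewing the resulting diff before committing. A blind text replacement can also change strings that must stay as they are, such as names defined by an external database or API.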

This blog post will serve as a reminder of how quickly my own memory can fade, and how I have implicitly contributed to systemic racism. Racist terminology is easy to remove, and I hope that “master/slave” and “blacklist/whitelist” terms are acknowledged as problematic and that companies follow through on their promises to remove them from code. Like all efforts to correct systemic racial injustice, this is the first step of many that are needed to make this change long-lasting and sustainable.

* If anyone knows of other examples of racist terms in computing, please let me know.
1 The PathwayCommons database also called this set of molecules “blacklisted,” but that is not an excuse to adopt the same terminology if it has racist connotations.
