I adapted this essay from a short address I gave at the Ethical Society of St. Louis on 16 April 2023.
For most of 2012 and 2013, I worked as an editor in the language services unit at the United Nations in Beirut. Most of my colleagues specialized in Arabic-English translation, and even back in 2012, they were using a range of machine translation tools. They were dedicated to working with these technologies for at least two reasons: first, most of them loved Arabic more than Ken loves Barbie, and they were adamant about its future; and second, they were well aware of the UN’s institutional commitment to machine translation, and they were determined that the Beirut office would lead in all matters related to Arabic.
Machine translation came up in many staff meetings. I gathered that it was tedious and time-consuming, unpredictable and unreliable. Everything had to be checked and corrected, and no one had space for what seemed like extra work. Discussing these challenges, one colleague explained that this was not wasted effort. She encouraged our unit to keep on feeding the machine, to keep on giving it examples of correct translations, to keep on building the tool. Machine translation was only as good as the examples fed into it.
Well, it has been more than a decade since then. I moved to Bangkok, the technology moved on, and now everyone is talking about machine learning and AI. For example, I was on the meditation subreddit the other day and scrolled past a comment that hit home:
To which all I can say is: BRO, me also too.
The one thing I know about AI is the thing my colleague knew about machine translation: it is only as good as what we feed it. That echoes a general and unavoidable truth about ourselves. Versions of this relationship – inputs to outputs – are just about everywhere. This month we celebrate Earth Day, and a harrowing inputs-outputs relationship is right there in the pollutants we dump into the environment, which then build up in our own blood.
But I’ll focus now on how this relationship turned up for me, in terms of race. Maybe you can relate to my experience: about a decade ago, the murders of unarmed Black children and adults began showing me something about the data set I was trained on. Trayvon Martin. Eric Garner. And then Michael Brown in Ferguson rocked my world, a world away in Bangkok. And it just kept on happening. Tamir Rice. Sandra Bland. Breonna Taylor. And by the time we got to George Floyd, I knew my data set had blinded me to multiple forms of systemic racial injustice, and I had to own that blindness. The racism I was born into had followed me everywhere I went in the world, and I had replicated it, uncritically, without a gun to my head.
I began to change my inputs: what I followed, read, and watched. I began feeding myself a better data set, one that represented more of the real world and less of the white one. More of the Global South and less of the West. More of the 99% and less of the 1%.
There is power in how we use our attention and focus, in who we include and exclude, in what we replicate and what we disrupt. I encourage you and me and all of us to be intentional and critical about the data sets we train ourselves on. Be as critical as you would be about the food you eat, the air you breathe, and the water you drink. And be careful what you are building, because, as my colleague said in 2012, this is not wasted effort.