Module 6: Ethics, Biases, and Diversity in a Digital World

Going into this week, I had a vague understanding of how algorithms worked and the extent to which personal data was used by corporations to tailor our experiences online. The readings for this week widened that understanding, though, and if this year wasn’t already a conglomeration of tragedy, I would most likely be angry about it, but I am too exhausted to be mad. On top of that, I think I just accepted years ago that my data was nearly open content. Edward Snowden blew the whistle on the NSA when I was in high school, and that is a distinct memory in my mind. I think at that time I just accepted that privacy wasn’t really a thing on the internet. Besides, there is so much data out there that my posts and data seem irrelevant in the grand scheme (people worry about the government spying on them, but I don’t, because who am I to the government? They don’t care about me). But now, realizing that every aspect of who I am as a person is documented in some way based on my behavior online and in real life, I realize how terrifying that can be. At the same time, though, I can’t help but just accept my place as a data point in the world, because what can we do to rage against the machine of Big Data?

A distinction that stood out for me this week is that algorithms function in (at least) two ways: as a function of input data creating output data in a general form, and as a function of your specific input data creating specific output for you. When I tried to explain the Twitter algorithm to my dad a few weeks ago, he couldn’t understand how the algorithm could be explicitly favoring white faces over Black faces. He thought that it depended upon your own activity and didn’t understand why it didn’t change from person to person. This is the first kind of algorithm, the kind discussed in “Erasure, Misrepresentation and Confusion” or “Algorithmic Accountability: A Primer,” which produces the same output for a given input no matter who is using it (in this case, the input is merely the picture posted to Twitter). Each of these cases pointed out glaring flaws in how algorithms operate. In the case of JSTOR, it is clear that the algorithm designed to capture topics is flawed because it cannot recognize topics the way researchers do. In other words, it cannot anticipate broad themes in an article the way a researcher can; it can only recognize specific words. Trying to identify an article as a sum of its parts only goes so far. One solution would be to consult researchers when designing an algorithm that parses their work, but building a new algorithm that way would be a costly (assuming researchers were paid) and lengthy endeavor. Then again, parsing a broad expanse of human knowledge for specific tags is an incredibly complex task even for a computer.

On the other hand, the COMPAS algorithm was fed “bad” data when it was trained. Although we were able to interpret that the outputs were “incorrect,” it was not the fault of the machine; it did as it was trained. Instead, it is a reflection of the justice system itself, making clear a pattern of unfair incarceration. In that sense, the COMPAS algorithm acted almost as a diagnostic tool, performing a similar function to researchers crunching large data to find patterns. The only difference there is that the pattern that was analyzed was produced by a machine, not a person; but the key point is that the machine is incapable of making decisions on its own.

The other type of algorithm, though, as outlined a bit in the Primer and in “Open Data in Cultural Heritage Institutions,” is personalized with your own data. This is the kind of algorithm that people are more familiar with; this is the one that shows you ads on all your social media platforms for that shirt you just considered buying. This is also the kind of algorithm that scares me. I’ve manipulated the algorithm to my advantage before, shopping for something because I know it will then give me deals, or searching for something related to a product I hope exists so that the algorithm can find what I am looking for. But what I didn’t realize was that the data goes far beyond shopping habits or which pages or tweets to show based on your interactions.

Despite being a historian, trained to research, analyze, and draw connections from what the evidence implies, I hadn’t considered that every action we take can act as a reflection of us as people. What struck me particularly was “gamblers” being listed in “Open Data in Cultural Heritage Institutions.” If a company knows you gamble, it is easy for it to make assumptions about your character. Are you reliable with money? Do you have an addictive personality? Can you be trusted? These all seem like stretches based on one fact, but when that fact is added to data about other gamblers, companies can draw statistical conclusions about these questions and make assumptions based on those statistics. As for me, I have recently bought parts to build a computer and I tweet about games a lot. Therefore, a company could classify me as a gamer, which could then reflect on who I am. Am I dedicated to my work? Do I think logically or abstractly? Am I professional, or does gaming make me unprofessional? Even something as innocent as a hobby could affect whether I am hired for a job, based on these tangential questions that can be answered through statistics.

I will cut off my post here as it is running long, but I have a lot to think about regarding algorithms and my personal data. As I mentioned before, though, it all feels hopeless. I can’t protest a system that defines modern life, and I am just a small data point in the world. The best I can hope for is that our technological overlords are ethical in their use of data, and there is yet to be a sign of that being true.

5 Replies to “Module 6: Ethics, Biases, and Diversity in a Digital World”

  1. I share the same defeatist attitude you have regarding my data – sure, I may be the product now, but the convenience provided by services like Google amounts to a “net positive” in my calculus so … harvest away, I suppose. But in that benign sense, could there be some greater good to all of this collection? Will posterity use it as a means to build as good a representation of our lives as we experience them today? Or could it be a metric by which they judge our shortcomings? Algorithms or not, it’s important that we recognize, much as you did in your post, that humans are still the decision-makers. So long as we keep that degree of agency, I think there’s still a hopeful outcome for us yet.

  2. Your point at the very end reminds me of ethics and paywalls for some reason. Paywalls serve two purposes in my mind: they support newspapers as they have largely switched from print to digital (though print still happens) and, I assume, they help pay the writers who regularly produce material for these newspapers. However, there’s an ethical dilemma over which material is open and accessible to all. Today the NYT published a piece that I found fascinating, though I could not read it as it was protected by the NYT paywall. Yet I could go onto the page and read several “free” articles that anyone could read. What are the ethics of publishing one thing freely over another? These articles, of course, related to the current political state, and I bet they offered free content for those interested in the election. Nonetheless, how can we figure out what is ethical in the world when both sides can make cases in that debate? Just an idea that came to mind as I finished reading your post.

  3. You are right; privacy is over. We live our lives almost as public lives. Can I play devil’s advocate for a second? Is it just bad, or does it have both pros and cons? You mentioned an “algorithm that people are more familiar with; this is the one that shows you ads for that shirt you just considered buying on all your social media platforms.”
    So true. It is enough to type “bicycle” into Amazon, and your Facebook account starts featuring ads for bicycles. It’s frustrating, but isn’t it also beneficial? The wider choice might help you find a good bicycle. Google’s autofill, to which Dr. Noble referred, speeds up our searches.
    As Zeigler stated, “Data brokers profit from other people’s information. Those described in their datasets often have no way of knowing how they are being represented, and have no way of questioning or correcting this representation”. Well, again, privacy is kind of over. I can copy-paste here your very last statement: “The best I can hope for is that our technological overlords are ethical in their use of data, and there is yet to be a sign of that being true.”
    I hope the same. I also want to focus on the benefits that came with the end of privacy. I think they do exist, and they help to balance out the dissatisfaction.

  4. Before this week I didn’t have a clear understanding of algorithms and how they used my personal data. I agree that sometimes it can be convenient when I’m thinking about buying something and it targets me with sales and lower prices, but it is also sort of a creepy phenomenon. The way it categorizes people is worrisome, as is how it calculates these statistical conclusions. If there were a way to make algorithms more neutral I would be less wary of them, but the issue I have is how these biases play a role in the erasure of marginalized people and research.

  5. Hey Robert. I think your first paragraph encapsulates a lot of my feelings regarding the readings this week. I was less concerned about the lack of privacy, however, though it is very important, and got caught up in the racism and sexism discussed in Brock’s and Noble’s work. I think I knew about these algorithmic issues in an abstract way, but reading these arguments cemented how big an issue it is and how it can affect our own work.

    About your father’s point – I would be willing to guess that many people think that way, understandably. It highlights again the call-to-action for education regarding search engines, algorithms, and tech. The larger call-to-action is still there, but immediate help can come from people simply understanding the issues presented.
