People vs. Algorithms: Data Ethics in the 21st Century
Welcome! Here you’ll find the syllabus and readings for “People vs. Algorithms: Data Ethics in the 21st Century,” a mentored research experience with guided readings and computing labs. It is taught in the Department of Statistics at Columbia University in Spring 2022, and runs from February 1 to April 26.
- STAT UN 3107 Sec 3 / GR5298 Sec 5 Mentored Research: “People vs. Algorithms: Data Ethics in the 21st Century”
- Tuesdays, 10:10–12:00. Spring 2022.
- Classroom: To be held over Zoom, here.
- Instructors: Jonathan Reeve, Isabelle Zaugg, Tian Zheng
- Email addresses: firstname.lastname@example.org; email@example.com. But please direct all questions to our chatroom on Matrix, where appropriate.
- Chatroom: data-ethics-spr2022 on Matrix
- Website, including course readings: https://data-ethics.jonreeve.com
- Source code: https://github.com/JonathanReeve/course-data-ethics
This interdisciplinary mentored research experience introduces students to the field of data ethics through an exploration of the societal impacts of data-driven technologies. It aims to bridge the philosophy of ethics, humanities and social science scholarship, and computational thinking and practice. We include hands-on, guided lab activities where students wrestle with intellectually challenging ethical questions firsthand. The research experience is designed for students from all disciplinary backgrounds and supports the development of introductory computational skill sets for beginners. An interest in Python is recommended, and a crash course will be provided for students who are new to Python.
By the end of the semester, students will be able to:
- Understand ethical challenges posed by the big data era.
- Analyze public data critically, with a sensitivity towards social issues.
- Develop a clear vision of their own ethical framework, and how best to apply it in the realm of data science.
- Develop skills for data analysis in the Python programming language.
The class is Pass/Fail. To earn a passing grade, you must write at least 2 annotations on each required reading (2 readings per week). (Further details are in the section Readings and Annotations, below.) You can skip 1 week, no questions asked. You are also required to attend ten of our twelve class sessions and to participate actively in the discussion and lab activity, including serving as discussion leader for one or more readings.
Readings and Annotations
For each reading, please write 2-3 annotations on our editions of the text, using hypothes.is. Annotations are not required for videos or other non-textual websites. Links to the texts are provided below. You’ll have to sign up for a hypothes.is account first. Please use your real name as your username, so we know who you are. You may write about anything you want, but it will help to think about ethical problems. Good annotations are:
- Concise (think: a long tweet)
- Observant, rather than evaluative
You may respond to another student’s annotation for one or two of your annotations, if you want. Just make your responses equally thoughtful.
There are no prerequisites. We will use the Python programming language for computational data analysis, and a crash course will be provided for those who are new to Python, on Feb. 5th and 6th.
Please direct all questions to our course chatroom on Matrix.
- Sign up for a user account on hypothes.is, our annotation platform. Please use your real name as your username.
- Sign up for an account on Matrix, and introduce yourself in the course chatroom.
- Download and install Anaconda, a Python distribution, which contains a lot of useful data science packages.
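Once Anaconda is installed, a short script can confirm that Python and a few data science packages are available. This is only a minimal sketch: the package list below is an assumption (the syllabus names only Pandas explicitly), and `check_packages` is a hypothetical helper, not part of the course materials.

```python
import importlib.util
import sys

# Assumed package list for illustration; the syllabus only names Pandas.
PACKAGES = ["pandas", "numpy", "matplotlib"]


def check_packages(names):
    """Return a dict mapping each package name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print the running Python version, then the status of each package.
    print("Python version:", sys.version.split()[0])
    for name, found in check_packages(PACKAGES).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If a package is reported as missing, the chatroom is a good place to ask for help before the first lab.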
If you want some extra help, or want to read a little more about some of the things we’re doing, there are plenty of resources out there. If you want a second opinion about a question, or have questions that we can’t answer in the chatroom, a good website for getting help with programming is Stack Overflow. Also, the Internet is full of Python learning resources. One of my favorites is Codecademy, which has a game-like interactive interface, badges, and more. There’s also the fantastic interactive textbook How to Think Like a Computer Scientist, which is the textbook for Computing in Context, the introduction to Python at Columbia’s Computer Science department.
Jonathan Reeve and a colleague have also put together a few guides for beginning programming:
Note: this schedule is subject to some change, so please check the course website for the most up-to-date version.
Week 0, 2-1: Introduction to the Course
To be read in class:
- Sloane, Mona. 2019. “Inequality Is the Name of the Game: Thoughts on the Emerging Field of Technology, Ethics and Social Justice.” In Weizenbaum Conference, 9. DEU.
Week 0.5, 2-5 and 2-6: Python Bootcamp
Over the weekend of February 5th and 6th, the Department of Statistics will host a Python bootcamp, led by Jonathan Reeve. If you’re not already proficient in Python and data science libraries like Pandas, please attend this event.
Week 1, 2-8: All the data on all the people
- Boyd, Danah, and Kate Crawford. 2012. “Critical Questions for Big Data.” Information, Communication & Society 15 (5): 662–79.
- Thomas, Rachel. 2020. “What Are Ethics and Why Do They Matter?”
Optional readings:
- Sweeney, Latanya. n.d. “All the Data on All the People.” In The Sweet Spot: Harmonizing Technology and Society, 45. Accessed November 23, 2021.
Week 2, 2-15: Ethical frameworks in tech
- Mhlambi, Sabelo. 2020. “From Rationality to Relationality: Ubuntu as an Ethical and Human Rights Framework for Artificial Intelligence Governance.” Carr Center for Human Rights Policy Discussion Paper Series 9.
- Piper, Kelsey. 2019. “Exclusive: Google Cancels AI Ethics Board in Response to Outcry.” Vox.
- “AI Ethics Workshop.” 2019. AI Ethics Workshop.
- “Value Assessment.” n.d. ImaginePhD. Accessed November 25, 2021.
- Franzke, Aline Shakti, Iris Muis, and Mirko Tobias Schäfer. 2021. “Data Ethics Decision Aid (DEDA): A Dialogical Framework for Ethical Inquiry of AI and Data Projects in the Netherlands.” Ethics and Information Technology, January.
- Packer, George. 2013. “Change the World.” The New Yorker.
- Montgomery, Kathryn C. 2015. “Youth and Surveillance in the Facebook Era.” Telecommunications Policy 39 (9): 771–86.
Week 3: Algorithmic racism
- Benjamin, R. 2019. “Introduction: The New Jim Code.” In Race After Technology: Abolitionist Tools for the New Jim Code. Wiley.
- Raji, Deb. 2021. “These Are the Four Most Popular Misconceptions People Have about Race & Gender Bias in Algorithms.” Twitter.
- Thomas, Rachel. 2020. “Not All Types of Bias Are Fixed by Diversifying Your Dataset.”
- Bowker, Geoffrey C, and Susan Leigh Star. 2000. “The Case of Race Classification and Reclassification Under Apartheid.” In Sorting Things Out: Classification and Its Consequences. MIT press.
- “Race + Data Science Lecture Series - The Data Science Institute at Columbia University.” n.d. Accessed November 25, 2021.
- “Surveillance and Race Online: Simone Browne at MozFest.” n.d. Accessed November 25, 2021.
- McIlwain, Charlton. n.d. “Of Course Technology Perpetuates Racism. It Was Designed That Way.” MIT Technology Review. Accessed November 25, 2021.
Week 4: Workers’ rights and data collection
- Ajunwa, Ifeoma. 2020. “The ‘Black Box’ at Work.” Big Data & Society 7 (2): 205395172096618.
- Sloane, Mona. 2020. “Now Is the Time to Rethink AI, Automation and Employee Rights.” BRINK – Conversations and Insights on Global Business.
- Digital Future Society. n.d. “A Walk with Mary Gray - Long Version Interview.” Accessed November 25, 2021.
- Winner, Langdon. 1980. “Do Artifacts Have Politics?” Daedalus, 121–36.
- Seabrook, John. 2019. “The Age of Robot Farmers.” The New Yorker.
- Dilmegani, Cem. 2021. “Top 12 Use Cases / Applications of AI in Manufacturing.”
Week 5: Algorithms in the criminal justice system
- “S3, Episode 1: The Precrime Unit (Jan. 31st, 2019).” 2019. Hi-Phi Nation.
- Hao, Karen, and Jonathan Stray. 2019. “Can You Make AI Fairer Than a Judge? Play Our Courtroom Algorithm Game.” MIT Technology Review.
- Ochigame, Rodrigo. 2019. “The Invention of ‘Ethical AI’: How Big Tech Manipulates Academia to Avoid Regulation.” The Intercept.
- Jorgensen, Renée. 2021. “Algorithms and the Individual in Criminal Law.” Canadian Journal of Philosophy, October, 1–17.
- Brayne, Sarah, and Angèle Christin. 2020. “Technologies of Crime Prediction: The Reception of Algorithms in Policing and Criminal Courts.” Social Problems 68 (3): 608–24.
Week 6: Language diversity and digital justice
- Zaugg, Isabelle. n.d. “Digital Surveillance and Digitally-Disadvantaged Language Communities.” In International Conference Language Technologies for All. UNESCO.
- Coffey, Donavyn. n.d. “Māori Are Trying to Save Their Language from Big Tech.” Wired UK. Accessed November 25, 2021.
- Desir, N., and K. A. Dawkins. 2021. “Columbia Language Justice Perspectives Project.” Columbia Data Science Institute.
- Desir, N., and K. A. Dawkins. 2020. “Columbia Language Justice Perspectives Project.” MapHub.
Week 7: Ethical dimensions of NLP (natural language processing)
- Bender, Emily M et al. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23.
- Metz, Cade, and Daisuke Wakabayashi. 2020. “Google Researcher Says She Was Fired Over Paper Highlighting Bias in A.I.” The New York Times, December.
- Bird, Steven. 2020. “Decolonising Speech and Language Technology.” In Proceedings of the 28th International Conference on Computational Linguistics, 3504–19. Barcelona, Spain (Online): International Committee on Computational Linguistics.
Week 8: Geographic and demographic data ethics
- Shahmirzadi, Atefeh Akbari. 2018. “Mapping, Naming, and Language Justice in the Digital Sphere.” Explorations in Global Language Justice.
- Schor, Paul. 2017. “New Asian Races, New Mixtures, and the ‘Mexican’ Race: Interest in ‘Minor Races.’” In Counting Americans: How the US Census Classified the Nation, 203–19. Oxford University Press.
- Schor, Paul. 2017. “Introduction.” In Counting Americans: How the US Census Classified the Nation, 1–12. Oxford University Press.
- DeLuca, Krystina. 2015. “Selling or Spying: The Legal Implications of Target Marketing Through Geolocation Technologies.” Law School Student Scholarship, January.
Week 9: The data firehose of mobile computing
- Taylor, Linnet. 2015. “No Place to Hide? The Ethics and Analytics of Tracking Mobility Using Mobile Phone Data.” Environment and Planning D: Society and Space 34 (2): 319–36.
- Cohen, Jason. 2021. “These Apps Collect the Most Personal Data.” PCMAG.
- Rooksby, John et al. 2016. “Implementing Ethics for a Mobile App Deployment.” Proceedings of the 28th Australian Conference on Computer-Human Interaction - OzCHI ’16.
Week 10: The two-way mirror of web search
- Ochigame, Rodrigo, and Katherine Ye. 2021. “Search Atlas: Visualizing Divergent Search Results Across Geopolitical Borders.” Designing Interactive Systems Conference 2021, June.
- Noble, S. U. 2018. “A Society, Searching.” In Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.
Week 11: Our bodies, our data
- “WPF Report: The Scoring of America: How Secret Consumer Scores Threaten Your Privacy and Your Future.” n.d. World Privacy Forum. Accessed November 25, 2021.
- Szalavitz, Maia. 2021. “A Drug Addiction Risk Algorithm and Its Grim Toll on Chronic Pain Sufferers.” Wired, August.
- Sweeney, Latanya. 1997. “The Data Map.” The Data Map.
- Sweeney, Latanya. 2013. “Matching Known Patients to Health Records in Washington State Data.” arXiv:1307.1370 [Cs], July.
- Montgomery, Chester, and Kopp. 2018. “Health Wearables: Ensuring Fairness, Preventing Discrimination, and Promoting Equity in an Emerging Internet-of-Things Environment.” Journal of Information Policy 8: 34.
- Duhigg, Charles. 2012. “How Companies Learn Your Secrets.” The New York Times, February.