AI-powered content analysis: Using generative AI to measure media and communication content
Methods tutorial #28834(a), module (political) communication research methods, winter term 2024/2025
Last updated on 2024-11-18 at 17:17
Overview
Large language models (LLMs; starting with Google's BERT) and particularly their implementations as generative or conversational AI tools (e.g., OpenAI's ChatGPT) are increasingly used to measure or classify media and communication content. The idea is simple yet intriguing: Instead of training and employing humans for annotation tasks, researchers describe the concept of interest to a model such as ChatGPT, present the coding unit, and ask for a classification. The first tests of the utility of ChatGPT and similar tools for content analysis were positive to enthusiastic (Gilardi et al., 2023; Heseltine & Clemm von Hohenberg, 2024; Rathje et al., 2024). However, others pointed out the need for more thorough validation and reliability tests (Pangakis et al., 2023; Reiss, 2023). Easy-to-use tools and user-friendly tutorials have brought these methods within reach of the average social scientist (Kjell et al., 2023; Törnberg, 2023, 2024b). Yet (closed-source, commercial) large language models are not entirely understood even by their developers, and their uncritical use has been criticized on ethical grounds (Bender et al., 2021; Spirling, 2023).
In this seminar, we will engage practically with this cutting-edge methodological research. We start with a quick refresher on the basics of quantitative content analysis (both human and computational), focusing on quality criteria and evaluation (validity, reliability, reproducibility, robustness, replicability). We will then attempt an overview of the rapidly developing literature on LLMs' utility for content analysis. The central part of the seminar will be dedicated to small evaluation studies by student teams. Questions can range from understanding a tool's parameters (e.g., What's the effect of a model's "temperature" on reliability and validity?) to practical optimization (e.g., Which prompts work best for a given task?) to critical questions (e.g., Does the classification show gender, racial, or other biases?).
Requirements
- Some prior exposure to (standardized, quantitative) content analysis will be helpful. However, qualitative methods also have their place in evaluating content analysis methods. If you have little experience with the former but can contribute with the latter, make sure to team up with students whose skill set complements yours.
- Prior knowledge of R or Python, applied data analysis, and interacting with application programming interfaces (APIs) will be helpful but is not required. Again, make sure that the teams overall have a balanced skill set.
- You will use your computer to conduct your evaluation study. Credit for commercial APIs (e.g., OpenAI) will be provided within sensible limits.
- This is not a programming class: programming skills are neither required nor taught systematically. We will learn the basics of interacting with an API using R. Code examples will be provided and discussed (see the sketch after this list for a first impression).
- Here are some resources to get started with R:
- Check out R Primers by Andrew Heiss: Browser-based, no installation required.
- Get code snippets for typical tasks: Posit Recipes
- Read (and work through) R for Data Science by Hadley Wickham; make sure to get the 2nd edition.
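To give a first impression of what interacting with an API from R can look like, here is a minimal sketch of a zero-shot classification request to the OpenAI chat completions API, assuming the httr2 and purrr packages and an API key stored in the environment variable OPENAI_API_KEY. The model name, the sentiment prompt, and the example text are placeholders, not the tasks we will use in class.

```r
# Minimal sketch: zero-shot classification via the OpenAI chat completions API.
# Assumes httr2 and purrr are installed and OPENAI_API_KEY is set.
library(httr2)

classify_text <- function(text, temperature = 0) {
  request("https://api.openai.com/v1/chat/completions") |>
    req_auth_bearer_token(Sys.getenv("OPENAI_API_KEY")) |>
    req_body_json(list(
      model = "gpt-4o-mini",  # placeholder model name
      temperature = temperature,
      messages = list(
        list(role = "system",
             content = "Classify the sentiment of the text as positive, negative, or neutral. Answer with one word."),
        list(role = "user", content = text)
      )
    )) |>
    req_perform() |>
    resp_body_json() |>
    purrr::pluck("choices", 1, "message", "content")
}

classify_text("I really enjoyed this seminar!")
```

Temperature defaults to 0 here so that repeated calls are as deterministic as the model allows; we will discuss this and other parameters in class.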
Aims
After the seminar, you should be able to:
- critically evaluate and improve the performance of a classifier in a (computational) content analysis.
- use zero-shot content analysis with generative AI tools in your own research project.
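To give a flavor of the first aim: evaluating a classifier usually means comparing its labels against a human-coded gold standard. A minimal sketch, assuming a made-up data frame with one human label and one LLM label per coding unit:

```r
# Minimal sketch: compare LLM labels against a human-coded gold standard.
# The data frame and its labels are made-up example data.
library(tibble)

results <- tibble(
  human = c("sexist", "not sexist", "sexist", "not sexist", "sexist"),
  llm   = c("sexist", "not sexist", "not sexist", "not sexist", "sexist")
)

accuracy  <- mean(results$human == results$llm)
precision <- sum(results$llm == "sexist" & results$human == "sexist") / sum(results$llm == "sexist")
recall    <- sum(results$llm == "sexist" & results$human == "sexist") / sum(results$human == "sexist")
f1        <- 2 * precision * recall / (precision + recall)

round(c(accuracy = accuracy, precision = precision, recall = recall, f1 = f1), 2)
```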
Tasks
- 5 ECTS (125-150 hours workload)
- Active participation, not graded
- Participation in class: read texts, ask questions, discuss, give feedback to other students
- Short presentation of a published evaluation study report (in teams)
- Not a detailed description, but a summary for the class. The audience should learn a) what kind of questions and studies might be interesting and b) which texts might be worth reading once they have decided on a study idea.
- Plan and conduct an evaluation study (in teams)
- Present the results of your own evaluation study (in teams)
Session plan
Please note that the session plan is subject to change.
(1) 14. 10.: Hello
Class content: Introduction, demo, and organization
Organization: Find a team for the state-of-the-art presentation. The goal is to find a team with a complementary skill set. Select or find an additional text.
Homework:
- Listen to this podcast episode with Petter Törnberg: LLMs in Social Science
- Register your presentation in the Blackboard Wiki.
- Prepare your computer: If you want to actively participate in the computational part of the seminar using the prepared R code, please either prepare your laptop with an up-to-date installation of R (at least version 4.2) and RStudio or create an account at Posit Cloud.
(2) 21. 10.: Refresher: Traditional content analysis (human and computational)
Class content:
- Quick refresher on the basics of quantitative content analysis (both human and computational), focusing on quality criteria and evaluation (validity, reliability, reproducibility, robustness, replicability).
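As a quick illustration of the reliability part of the refresher: in traditional content analysis, agreement between human coders is often summarized with Krippendorff's alpha. A minimal sketch, assuming the irr package; the codings are made-up example data.

```r
# Minimal sketch: intercoder reliability (Krippendorff's alpha) for two coders.
# Assumes the irr package; the codings are made-up example data.
library(irr)

codings <- rbind(
  coder1 = c(1, 0, 1, 1, 0, 1, 0, 0),
  coder2 = c(1, 0, 1, 0, 0, 1, 0, 1)
)

kripp.alpha(codings, method = "nominal")
```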
Texts (if needed):
State of the art: Overview
Class content: Short presentations on current work about LLM-based zero-shot classification
- Short presentations (10-15 minutes)
- One paper presented by two to three participants
Texts: Some recommendations include Alizadeh et al. (2023), Brown et al. (2020), Burnham (2024), Chae & Davidson (2024), Egami et al. (2023), Gilardi et al. (2023), Gupta et al. (2024), He et al. (2023), Heseltine & Clemm von Hohenberg (2024), Hoes et al. (2023), Huang et al. (2023), Kathirgamalingam et al. (2024), Kojima et al. (2023), Kuzman et al. (2023), Lai et al. (2023), Matter et al. (2024), Møller et al. (2024), Ollion et al. (2024), Ornstein et al. (2023), Pangakis et al. (2023), Plaza-del-Arco et al. (2023), Qin et al. (2023), Rathje et al. (2024), Reiss (2023), Schulhoff et al. (2024), Tam et al. (2024), Thalken et al. (2023), Törnberg (2024a), Weber & Reichardt (2023), Yang & Menczer (2023), Zhu et al. (2023), Ziems et al. (2024). You are free to use other texts (check citations in and to these texts to find more). Text assignment will be managed via Blackboard.
(3) 28. 10.: State of the art I
(4) 04. 11.: State of the art II
(5) 11. 11.: State of the art III
Empirical evaluation study
(6) 18. 11.: Introduction to the evaluation study
The central part of the seminar will be dedicated to small evaluation studies by student teams. Questions can range from understanding a tool's parameters (e.g., What's the effect of a model's "temperature" on reliability and validity?) to practical optimization (e.g., Which prompts work best for a given task?) to critical questions (e.g., Does the classification show gender, racial, or other biases?).
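To make the first type of question more concrete, one possible probe design is to classify the same coding units repeatedly at each temperature setting and check how often the repeated runs agree. The sketch below is only an illustration: classify_once() is a random placeholder standing in for an actual API call (such as the one sketched in the Requirements section), and the texts are made up.

```r
# Minimal sketch: probe how temperature affects the stability of repeated runs.
# classify_once() is a placeholder; replace it with a real API call that
# passes the `temperature` parameter to the model.
library(purrr)

classify_once <- function(text, temperature) {
  sample(c("sexist", "not sexist"), 1)  # random placeholder so the code runs
}

texts <- paste("Example post", 1:20)
temperatures <- c(0, 0.7, 1.5)

agreement <- map_dbl(temperatures, function(temp) {
  run1 <- map_chr(texts, classify_once, temperature = temp)
  run2 <- map_chr(texts, classify_once, temperature = temp)
  mean(run1 == run2)  # share of identical labels across the two runs
})

tibble::tibble(temperature = temperatures, repeat_agreement = agreement)
```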
Class content:
Decision about type of evaluation study
- Competition: Multiple groups work on the same task using a training set; at the end, the classifiers are tested against a new test set. The competition would be held on this task: Explainable Detection of Online Sexism (EDOS)
- Free topics: Each group conducts an evaluation study of a classifier for a task of its own choosing, with its own idea and data set
- Find a data set and a task (e.g., on Hugging Face, SemEval, GermEval)
- Use an existing data set or create your own.
Tools and computers
- Introduction to tools for interacting with the API
- Quick computer check
Organization: Form teams for the evaluation study. The goal is to create teams with diverse skill sets. In my experience, three to five people is a good team size, but your preferences might differ.
Homework:
- Register your team for the evaluation study in the Blackboard Wiki.
- One group member: Send me an e-mail to receive an API key for OpenAI (see the note after this list on storing the key safely).
- Start thinking about a task and a data set for your evaluation study.
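A note on handling the API key: please do not paste it into scripts that you share or commit. One common approach, assuming the usethis package, is to store it in your user-level .Renviron file and read it from the environment:

```r
# Minimal sketch: store the OpenAI API key in .Renviron instead of your scripts.
# Assumes the usethis package.
usethis::edit_r_environ()
# In the file that opens, add a line like: OPENAI_API_KEY=<your key>
# Save, restart R, and the key is then available via:
Sys.getenv("OPENAI_API_KEY")
```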
(7) 25. 11.: Design evaluation study
Class content: Support in class and office hours
(8) 02. 12.: Study idea presentations
Class content: Presentations and feedback
(9) 09. 12.: Design evaluation study
(10) 16. 12.: Design evaluation study
Winter break
(11) 06. 01.: Conduct evaluation study
(12) 13. 01.: Conduct evaluation study
(13) 20. 01.: Conduct evaluation study
(14) 27. 01.: Conduct evaluation study
(15) 03. 02.: Conduct evaluation study
(16) 10. 02.: Final presentations
Class content: Presentations (XX minutes per team) and feedback
Teamwork
Teamwork is a crucial part of the methods tutorial. It is also an important soft skill that you will need beyond this seminar and your academic education. Working in a team is more fun, creative, and productive. However, group work can also lead to conflicts. If you have problems in your group, please address them early within the group and/or bring them to me.
Here are some recommendations:
- Division of labor: Distribute tasks and responsibilities early and evenly. But also know that you will learn most about the tasks in which you actively participated.
- Communication: Clarify early on how you want to communicate with each other. Agree on fixed dates for meetings and stick to them. If possible, plan a regular in-person work meeting. Use digital tools such as messengers, e-mail, or video conferences to coordinate.
- Infrastructure & tools:
- Webex can be used for video calls and team chats.
- The university library offers group work spaces that you can book for group meetings on campus.
- Box.FU is a cloud storage solution. You can collaborate on documents and share files.
- Here is a list of software available to all FU students free of charge.
- Of course, you can and should use other tools that make collaboration easier. Please make sure that all group members have access to the tools.
Use of AI tools and plagiarism
You are likely aware of AI tools like ChatGPT that can assist you with various academic tasks. If you haven't explored these tools yet, I encourage you to do so, as they are expected to become integral in both academic and professional settings. Being familiar with these tools and understanding their strengths and weaknesses is crucial. However, some ways of using them are more beneficial for your learning and academic success than others.
Before tackling an assignment with the help of an AI tool, consider what you might miss out on. Assignments are designed to help you practice certain skills (repeatedly), allowing you to improve and deepen your abilities over time. This improvement only occurs if you engage with the tasks independently. Relying on AI tools too early in the process will hinder your skill development. Conversely, not using AI tools at all means missing out on learning about a useful tool.
I recommend approaching each task initially without AI support to practice the necessary skills. Afterwards, compare your work with suggestions from an AI tool. These can be used to enhance your work. However, you will often find that the AIâs suggestions are incorrect or less suitable than your own. By comparing different tools and methods (e.g., prompting strategies), you can discover how to maximize the benefits of AI tools.
[All linked sources in this paragraph are in German, sorry] When submitting academic work, particularly essays or theses, please refer to IfPuK's guidelines on using AI-based tools and plagiarism in the Guide to Academic Writing. You can also take a look at my provisional guideline for using AI in thesis work. It is crucial to document and transparently disclose the use of AI tools and information sources. You alone are responsible for your submitted work, including verifying its correctness and adherence to academic integrity standards. Plagiarism created by an AI tool remains plagiarism, even if you document the AI usage or are unaware of the plagiarized source.
In this class, using AI tools is allowed for the following purposes:
- Assisting in understanding concepts or studies
- Helping gather ideas or create outlines
- Supporting specific steps in the research process (e.g., suggestions for questions or categories, selecting appropriate statistical tests)
- Identifying and correcting grammar, spelling, and punctuation errors
- Working with programming languages (e.g., R or Python)
The following uses of AI tools are not permitted in this class:
- Using primarily AI-generated text (verbatim or edited) in presentations or written assignments without proper citation
- Completing entire tasks, assignments, or papers using AI tools
This seminar is (for most students) not graded. You have the opportunity to engage practically in empirical work and receive feedback and suggestions for improvement. There is no benefit in using dishonest means here, so please don't.
Diversity, equity, and inclusion
My goal is for all students to feel welcome and able to actively participate in this class. I strive to ensure that no one is discriminated against or excluded through course planning and my language. Likewise, I expect all participants to behave respectfully and appreciatively, acknowledging the opinions and experiences of other students. At the same time, it is clear that neither I nor the students will always fully meet this expectation. Therefore, I ask you to inform me or your peers if you feel uncomfortable or observe discriminatory behavior. If you prefer not to do this yourself, you can also appoint a trusted person to do so.
Mental health
Attending university is demanding and, as a time of transition, brings many challenges, both within and outside of your academic work. If you feel overwhelmed, please make use of support services such as the Mental Wellbeing support.point or the Psychological Counseling Service. Feel free to contact me directly or through a trusted person if your situation conflicts with the course requirements.
Contact information
Division Digital Research Methods
Email: marko.bachl@fu-berlin.de
Phone: +49-30-838-61565
Webex: Personal Meeting Room
Office: Garystr. 55, Room 274
Student office hours: Tuesday, 11:00-13:00, please book an appointment.