By now, you’ve probably tried out at least one of the newly launched image-generation apps yourself. You might have tried a prompt like ‘Woman working on her computer, which is a portal to another universe’, or ‘Man with superpowers rescues his neighborhood from a tsunami of lava’. Some people create accurate and convincing (and thus confusing) images of the pope in a puffer jacket; others use the tools to create magazine covers. According to Forbes (2022), two million images are generated with DALL-E 2 every day. However, as these tools develop, academics, technology experts, and queer, BIPOC, feminist, and religious communities are critiquing the ethics of how they work. Meanwhile, we started to wonder what it takes to visualize an inclusive future for visual culture.
Ctrl. Alt. Img. is a creative intervention that uses existing AI tools, such as text-to-image apps, together with art to explore different approaches to image making. Our goal is to offer an alternative image database, generated through the collective action of the public, that broadens the scope of who and what we see in AI-generated images. To realize this project, a cross-disciplinary collective of Affect Lab, Prospektor and Triple Black has joined forces. Inclusive image making requires different perspectives, both within the team and among the audiences we speak to.
Klasien, Babusi and Natalie. Image by Affect Lab.
The future is being visualized by artificial intelligence. The images we see every day in the media are increasingly created by algorithms and image-generation apps. Image banks, big news outlets and social media companies are now turning to technologies like OpenAI’s DALL-E 2, Google Brain’s Imagen and Stability AI’s Stable Diffusion to visualize the world around us. Because these image-generation models are trained on crowd-sourced, unedited data from media sources, including news outlets and archives, we are seeing the direct effects of media misrepresentation through the eyes of generative machines. For underrepresented groups, this leap into the future of technology further deepens the digital and social divide and increases the risk of reinforcing long-held stereotypes. To address machine bias and regain human agency in the interaction with this field, Ctrl. Alt. Img. invites audiences to engage critically and playfully with image-generation tools. It also helps future-proof professionals against the biases that have long plagued media practice.
Through our extensive desktop research, we found that these biases sometimes show themselves as glitches. Drawing on the work of critical thinkers, such as Legacy Russell’s ‘Glitch Feminism’ (2020) and Ruha Benjamin’s ‘Race After Technology’ (2019), we will set up a diverse focus group to hunt for AI glitches in the current visual culture of our society. In Race After Technology, Dr. Ruha Benjamin argues that a glitch is an important sign to pay attention to, rather than a trivial problem to ignore. She introduces the New Jim Code, which she defines as: “the employment of new technologies that reflect and reproduce existing inequities but that are promoted and perceived as more objective or progressive than the discriminatory systems of a previous era” (Benjamin, 2019:5). According to Benjamin, we must examine glitches more closely, as they might point to inequities hidden under the objective or progressive guise of artificial intelligence. Most people are now willing to acknowledge that technology can be faulty; in the tech world this is a well-known critique. However, Benjamin argues that we must dig deeper than simply acknowledging technological glitches, because ‘this is more than a glitch. It is a form of exclusion and subordination built into the ways in which priorities are established and solutions defined in the tech industry.’ (Ibid.:127). One example from Benjamin’s book is Google Maps, whose AI voice can read Roman numerals aloud but fails to recognize the name of the famous human rights activist Malcolm X.
Building on this theory of the glitch, we are looking at how it has played out in the period since these models were introduced. Here, Cigdem Yuksel’s critical research on the representation of Muslim women in Dutch media formed a starting point. She detected bias in the way Muslim women are depicted in the Dutch media, and went on to research it extensively in collaboration with Ewoud Butter. Now, we want to take this a step further by working with people who experience bias in the visual cultures that surround them, in the wake of current AI developments. We aim to give agency back to people who are now underrepresented, and will work with them to visualize a more inclusive future. Babusi Nyoni (Tripleblack Agency) has created a prototype media mirror that allows us to experiment with the interaction between humans and technology. It helps participants of an inclusive focus group to address ‘glitches’ in the systems behind computer vision and, more importantly, to regain agency in the interaction with computer systems by fostering digital and media literacy through interactive data storytelling. Their stories will become central to the installation for the public.
AI text-to-image tools are currently still trained on datasets that reflect exclusionary biases. Underlying biases can be found throughout digital visual culture. Cigdem Yuksel and Ewoud Butter’s research (2020) on the kinds of images of Muslim women offered by ANP, a leading press photo agency in the Netherlands, is a strong example of how bias is unintentionally reinforced through the tagging of press photos. In short, when a photographer tags a photo, it falls into a certain category. Because photographers want to sell their photos and make them easy to find, these categories tend to become simplistic and discriminatory. A similar process underlies the making and distribution of stock photography. Stock photos are used for many purposes, yet few people question what they represent. They make up a large part of the visual culture that surrounds us: you find them in commercials, magazines, company websites and newspapers. According to Giorgia Aiello, professor of visual culture and communication, we have to think critically about who uses these images, who downloads them and how they are produced in the service of sales (2022). It is exactly in this space between user and visual culture that we want to place emphasis on the human. It is in the interaction between human and technology that agency can be taken back. While we raise the public’s awareness of its use of pre-made (AI) images, we also want to stimulate forms of agency in the playful process of image making.
One of the interventions we have been working on is a browser plug-in. The plug-in is primarily aimed at media professionals, but it is free for anyone to use. During an image bank search, for example when selecting images for an article, website or story, the user is confronted with a set of choices. The tool provokes critical thinking and nudges the user to make a conscious choice of image. Its recommendations are based on both embedded and algorithmically inferred data, including metadata tags and the object and pose data extracted by our existing machine perception pipeline. The prototype will soon be tested in practice.
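As an illustration only, a nudge of this kind could be sketched in Python as follows. The record fields, the function names and the 50% threshold are all hypothetical and are not taken from the prototype. The sketch simply assumes that every search result carries a set of embedded tags and a set of algorithmically inferred tags, and flags a chosen image when it repeats tags that dominate the result set.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """One image bank search result. All field names are hypothetical."""
    image_id: str
    embedded_tags: set                                # tags supplied with the photo
    inferred_tags: set = field(default_factory=set)   # e.g. detected objects or poses

    def all_tags(self):
        return self.embedded_tags | self.inferred_tags


def dominant_tags(results, threshold=0.5):
    """Return the tags that appear in more than `threshold` of the results."""
    counts = Counter()
    for record in results:
        counts.update(record.all_tags())
    return {tag for tag, n in counts.items() if n / len(results) > threshold}


def nudge(selected, results, threshold=0.5):
    """Return a question for the user if the chosen image repeats dominant tags."""
    repeated = selected.all_tags() & dominant_tags(results, threshold)
    if not repeated:
        return None  # nothing to flag; the choice departs from the dominant pattern
    return ("Most results share these tags: "
            + ", ".join(sorted(repeated))
            + ". Is this the image you want?")


# Hypothetical search results for illustration
results = [
    ImageRecord("a", {"woman", "headscarf"}, {"looking away"}),
    ImageRecord("b", {"woman", "headscarf"}),
    ImageRecord("c", {"woman", "engineer"}, {"smiling"}),
]
message = nudge(results[0], results)
```

A design like this does not block any choice. It only surfaces the pattern in the search results, so that the decision stays with the person selecting the image.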
We are also integrating different forms of research into an interactive installation that mimics the form of a photo booth. The booth invites members of the public to interact directly with the technology in a playful way. Through data storytelling, the experience narrates how the media sees a user and explains how this will affect their future as image-generation models proliferate. The interaction ends when the participant is confronted with an image of how the ‘machine’ sees them, generated by image-generation technology and printed by the booth as a photo strip. The installation sparks conversation with audiences about biases in the media, how machines perceive people and how this affects society.
As part of our process we also worked with a designer to realize the visual identity of the project using DALL-E. Because the playfulness of image making is core to the project, it is reflected in the visual identity. We realize that the topic can be heavy, dense and complex. Though it is important not to ignore this complexity, we want to approach it in a light and playful way. We truly believe that it is in this playfulness that inclusive human agency can be regained.
Re-imagining visual futures
Because we aim to reach a broad audience, we started to rethink the spaces where we would exhibit the installation. The project might have more impact in ‘messy’ public spaces than in a white cube with the audience it attracts. We are now thinking of showing it in the hallways of municipalities, schools and cultural institutions. With this change comes a different way of thinking about the experience flow of the installation. Who is our audience? What do they know about visual culture and text-to-image tools? While we are convinced that most people have encountered pre-made images and AI technology by now, many of us might not recognise AI creations that easily. Moreover, most of us probably don’t understand how they work at all. We have therefore started to work on an introduction film for the project. We are currently working on a storyboard for the film, for which we will launch a trailer in June. The film will give a short introduction to machine learning and its influence on visual culture, and specifically to why this matters to the person on the street. It includes interview footage of members of marginalized groups with whom we have been working over the years, especially members of BIPOC, queer and religious minority groups who have been systematically sidelined or erased from these technologies and narratives. Based on interviews and sessions with our focus group, we will detect biases in technology and the glitches that reveal them. Together we will rewrite the story of our visual culture and make it more inclusive.
We wonder, what will the future look like if we humanize the digitalised landscape of visual culture? Will we, the average people on the street, have any say in the process? We hope it will be as messy, funny and playful as Ctrl. Alt. Img. imagines it to be.
Examples of the visual identity for the installation
Aiello, G. 2022. Communication, media, espace. Recorded lecture on: https://www.youtube.com/watch?v=QXQSJ6jQqAw
Benjamin, R. 2019. Race after technology: Abolitionist tools for the new Jim code. Polity.
Forbes. 2022. Dall-E Mini and the Future of Artificial Intelligence Art. https://www.forbes.com/sites/qai/2022/10/21/dalle-mini-and-the-future-of-artificial-intelligence-art/?sh=89d387c7d781
Russell, L. 2020. Glitch feminism: A manifesto. Verso.
Yuksel, C., Butter, E. 2020. Moslima. https://www.cigdemyuksel.com/muslima