MediaFutures is a Digital Innovation Hub funded by the European Commission to support startups, artists and collaborative projects between startups and artists with a focus on countering misinformation, disinformation as well as a variety of related challenges in the media sector. With a project duration of three years, MediaFutures is soon coming to an end. Throughout the project we have had a number of interesting experiences that we are sharing with our interview series on Lessons Learned for experts working on similar projects.
For today’s instalment of Lessons Learned, we interviewed Elena Simperl, Professor of Computer Science at King’s College London, and Julià Vicens, Research Scientist at Eurecat, about the experience we gained running design experiments with several projects to carry out research in the fields of computational social science, human-computer interaction or artificial intelligence, as well as the overall approach applied in the MediaFutures support programme.
Elena, Julià, tell us a bit about the approach used in the support programmes of MediaFutures. How does it help to tackle mis- or disinformation?
Elena Simperl: Misinformation in its many forms is a substantial and growing problem for society today. In fact, the scale of how much it is growing and how fast is quite frankly alarming. We support artists and startups to focus their efforts to counter misinformation by providing more information on how misinformation works and spreads. We invited leading organisations in the misinformation field such as Newsguard to share their knowledge with our participants, which they did in our mid-programme meeting in Rome. We dug into how literacy supports media literacy with Onilo in Hamburg. By going into more depth about existing gaps in approaches such as fact checking (for instance, how can facts be checked if no verified dataset of facts exists for a certain area or in a specific language) or media literacy (which is normally aimed at younger people, even though older social media users can be very vulnerable to misinformation) we help participants define their artistic and commercial interventions. A key aspect is the variety of approaches, and MediaFutures facilitates this.
Julià Vicens: We have been supporting teams from different angles. First, we have provided essential training on topics like data ethics, personal data and GDPR, or data visualisation, to mention some examples. However, given the diversity of the projects, we have also provided one-to-one mentorship on very specific aspects. To do that, we reviewed all projects, identifying key areas of potential improvement, and organised individual sessions to understand the needs of the teams in terms of data technologies, legal and ethical aspects of data, social impact, and other topics of interest to the teams. In our case, for instance, we have provided teams with a catalogue of datasets and tools that facilitate different aspects of the development of data-driven projects, from access to data to generative models. We have provided sessions on fine-tuning models, network-based algorithms for detecting the spread of misinformation, and strategies for mitigating biases in the use of artificial intelligence systems, just to mention a few. The consortium as a whole covers most of these topics; however, where the needs of a team were very specific, we pointed them to experts outside the consortium. The support programme is designed to provide expert knowledge on different topics to help projects reach their goals, which ultimately are to create a healthier media ecosystem. Furthermore, potential collaborations in research or experimentation emerge from this exchange and persist even after the support programme; as a result, we have been collaborating beyond MediaFutures.
What are the needs of artists and startups in regard to data?
Elena Simperl: Data is the fundamental infrastructural basis of AI. We always need more (quality) data. A lot of artists and startups are using social media data, as that is where a lot of misinformation happens, but that means that if data is not available, such as when the TikTok API was not opened as expected, it’s harder for teams to address misinformation at its distribution point.
Another requirement is that the data is legally and ethically sound to use. One recent issue has been created by generative AI, such as that involved in creating deepfake videos. Which datasets these generative AIs are trained on is opaque, but as recent interest in conversational agents such as ChatGPT has shown, generative AI is likely to have been trained on publicly available data, including personal data whose use as training data people did not consent to. We therefore worked with a number of artists who preferred to generate their own synthetic datasets rather than use existing ones and get involved in the world of data that is not properly consented for use.
In general though, access to high quality data is still a key limiting factor in creating better tools for all purposes, including countering misinformation.
Julià Vicens: The skillset of teams is very diverse, regardless of whether they are artists or startups. In general, the background of artists in the MediaFutures programme is very technological; some of them are experts in fields like data visualisation or artificial intelligence, while artistic duos or collectives tend to be multidisciplinary. The same happens with startups: some of them have a very research-based approach, others are very focused on business, and others on education. Despite that, in general terms, the needs of both artists and startups are quite similar.
In general, the use of data in MediaFutures projects ranges from using data as a source for producing narratives to sophisticated artificial intelligence models for making deep fakes. As a result, the specific needs differ significantly from project to project. One common need of teams is datasets that fit the purpose of their projects and can provide insights for their artwork, installation or business. In some cases, they are looking for datasets about a specific topic, participatory methods for generating labelled data or techniques for collecting data from digital platforms. This is not exclusive to MediaFutures teams; it applies to a multitude of data-driven projects.
The MediaFutures support programme provides tools for the teams to access open datasets and to create their own datasets. The latter is probably the most interesting point because the vast majority of research and projects are based on the same data sources; for instance, specific social media platforms that have traditionally granted access via API to data samples. Participatory approaches, however, allow the collection of more unique data for generating datasets. We have encouraged the teams to reuse existing datasets, create new ones and return these datasets to the community in open access, following open research principles.
Regarding data technologies, most of the teams are applying data technologies in innovative ways based on their background and knowledge. This is by design since, during the review process of applications, reviewers and jury panels selected teams with the competencies required to deliver successful projects. However, during the support programme, trainers and mentors provide guidelines, methodologies, tools, and other resources, such as research articles, that can improve the development of the projects.
Julià Vicens (EURECAT) evaluating startup and artist teams at the MediaFutures Demo Day in Hamburg
How do the MediaFutures startups and artists contribute to the digital media ecosystem?
Elena Simperl: Whether financially or ideologically motivated, purveyors of misinformation do not abide by legal, technical or moral rules. We need new, playful, narrative, gamified and artistic approaches. This is where MediaFutures comes in. The key strategies for countering misinformation are debunking and inoculation. This means fact checking after someone has come into contact with misinformation, and training them in media literacy before they are exposed, so they are more able to identify and repudiate it when they see it. While projects in MediaFutures do this, they also focus on strategies of data literacy and collective intelligence. They reinforce social links by promoting discussion between different perspectives and beliefs, and they draw attention to hidden behaviours and data. They also help develop critical thinking about algorithms. They are able to integrate hard-to-refute strategies such as narrative and emotion – it’s hard to argue with a work of art!
Julià Vicens: Most of the artworks developed by artists raise awareness on various issues, ranging from spreading pure misinformation to exploring the socio-technical implications of using artificial intelligence for generating non-real content (e.g. generative AI) or manipulating content (e.g. deep fakes). On the other hand, the experience of art is entirely personal and unique to each individual, surpassing the creator’s original intentions. Some projects focus on important topics in our digital world, such as digital literacy, specifically data and media literacy, aiming to help people better understand the complex realm of information, data, and technology we inhabit.
One contribution of the projects developed by startups that I would like to highlight is the creation of bridges between polarised communities or groups that struggle to reach a consensus for various reasons. Some teams have dedicated their efforts to building methodologies, services, and platforms that facilitate consensus-building. Instead of providing platforms where trolls and hate thrive, these spaces foster evidence-based discussions and help individuals with differing views to find common ground and reach agreements.
What are some of the major learnings for you in designing experiments with startups and artists?
Elena Simperl: As a sociotechnical researcher myself, I have known for a long time that interdisciplinarity and collaboration are the future. But a major learning when we designed our experiments was that many more of the artists we worked with are highly technically skilled in working with data than we first anticipated. When the project was initially designed, we assumed that most artists would require some technical support to develop the AI side of their work. In fact, data is an established artistic material, and many artists have developed the technical tools to work with it and were keen to develop these skills further in MediaFutures. This is not to say we did not support many artist/technologist collaborations as well.
What we saw with our third cohort was many teams also using GPT to support their coding. That lets people bring knowledge to skills they would have previously had to spend a long time acquiring. The modes of production are changing, and this is an area to explore in the future.
But there is value for startups and artists in learning from one another’s processes, rather than specific techniques. Enhancements can be made to products by engaging with artistic processes, and artworks can be supported to develop tools and products, as we’ve seen in a number of our Artist Meets startup collaborations. It extends what both sides can do creatively.
Another question is how we evaluate the success of an artwork. To an extent, there is an ‘easy’ way to evaluate the success of a startup offering: does it sell? We are currently working on ways to assess the effectiveness of the artistic interventions and have piloted these with the aim of carrying out larger-scale studies in the near future.
Julià Vicens: Designing experiments with the teams has been very enriching for us. We had discussions on timely topics that cover free speech, disinformation, data biases, large language models, citizen participation, etc. During these three years, research on those topics has advanced a lot and this is also reflected in the projects. Overall, we have seen a fruitful combination of different approaches for tackling common challenges and how these solutions have evolved during the execution of the project.
The variety of projects, together with the combination of teams with very diverse backgrounds, skills and identities, has been very enlightening. We have worked with artistic collectives with knowledge of artificial intelligence, social scientists and entrepreneurs, musicians, etc. We have learnt that the combination of different perspectives to solve a particular challenge, when well managed, is a powerful source of innovation. At the same time, to be honest, we have seen that it also poses big challenges because the timelines of research and business, or the languages of artists and engineers, are not always aligned. Open-mindedness is therefore needed to establish those collaborations.
Overall, I would say that creating actual multidisciplinary teams, where technology experts, scientists, and artists collaborate and experiment together, can lead to an extraordinary experience with potential for groundbreaking innovation.
Elena Simperl (KCL) announcing the winning startup-artist team from the Startup meets Artist Track at the MediaFutures Demo Days in Hamburg
Did you face unexpected challenges during the project implementation, and if yes, how were these challenges overcome?
Elena Simperl: Like everyone running a project in the past three years, we found ourselves facing the new situation of taking a programme designed for in-person implementation and running it online. But something that anyone running this kind of open call needs to think about is: did we specify the right challenges for the potential participants to answer? Consortia running these kinds of projects need the flexibility to critically assess the kinds of applications they are receiving and, if necessary, shift approaches slightly. In our first open call we had four challenges, but we realised that to attract the kind of artists and startups we wanted to support we needed only one – addressing misinformation and disinformation. We implemented this change in the second open call.
Julià Vicens: Undoubtedly, the main challenge we faced was delivering an online support programme instead of an in-person one due to the pandemic; the project started in September 2020. The consortium members physically met in London in 2022 for the first time. Fortunately, in the last cohort, when we transitioned to in-person activities, gathering with teams in Paris, Barcelona, and Rome made a significant difference, generating a positive impact for the teams and the members of the consortium too.
Another challenge, though in this case not totally unexpected, has been striking a balance between supporting teams based on their needs and demands and generating new experiments based on the potential of the projects beyond their initial objectives. We designed experiments to be performed after the residency and accelerator programme, especially with a selection of teams from the first cohort, because during this period the teams were very focused on developing their projects and, for example, experimental data could only be collected after the project was deployed. We changed this approach with the second and third cohorts; we still worked on spotting prospective research actions with teams, but these did not require additional effort from them, and we concentrated more on doing research about their projects and supporting them to achieve their goals.
Would you propose certain policy recommendations at EU and national level that would further enable media and data entrepreneurs to thrive and achieve sustainable growth?
Elena Simperl: Using the term ‘entrepreneur’ here, we’re thinking about projects led by artists, startup projects and collaborations alike. I think it’s important to understand that all can be equally effective and all should be regarded equally – and that includes when it comes to funding! We took artists to Mobile World Congress in Barcelona, which is typically an opportunity that might be offered to startups, and they found it hugely valuable in assisting them with all kinds of aspects, including legitimising planned routes to commercialisation. Likewise, skills such as pitching can be just as relevant to artists. While we don’t want to make artistic residencies the identical twins of startup acceleration, and force square pegs into round holes, it makes sense to let artists access startup opportunities and vice versa.
Europe is ahead of the global curve when thinking about engaging artists in raising public awareness of issues. However, more can be done to raise awareness on both sides of the potential of art-tech collaborations, which we cover in our Data Innovation Toolkit.
It can be more challenging to understand which artists or artworks have the most potential. Selection processes such as those we used in MediaFutures, where all teams go through several phases of the programme to develop their projects and work, help address these concerns.
However, having said that, if we want to think in terms of startups, then developing the artwork ready for exhibition is akin to developing the minimum viable product (MVP). That’s a great place to be, but just as startups require assistance to go from MVP to commercial launch, artists require support to take their finished artwork to the commercial level, i.e., exhibit it. Our partners such as IRCAM are well placed to help artists take this next step, but it requires funding, especially since, as noted above, there are fewer routes to ‘market’.
A really vital part of funding the arts is to recognise the time of artists. It is just as valuable as that of entrepreneurs! When we fund startups, we fund entrepreneurs’ time, whereas often in the arts, it is materials that are funded rather than time. This can lead to an emphasis on the creation of new materials rather than the circular reuse of existing material.
Julià Vicens: From my perspective, and considering my field of research, I believe policy recommendations should be grounded in evidence, leading to evidence-based policies that address the challenges faced by the media ecosystem from a holistic perspective. While technology and legislative actions are necessary, they are not the sole solutions for overcoming these complex challenges. Policies should aim to drive a change of mindset, emphasising not only sustainable growth but also de-growth.
Would you like to share lessons learned and recommendations for other related projects / data accelerators in general or media accelerators in particular?
Elena Simperl: While our participants were very ethically aware, it is still a lot to expect teams, whether startup or artist, to be completely on top of the requirements of GDPR. Hands-on, practical sessions on personal data issues are hugely important and of great value to participants. Even then, it is helpful to enable access to support on GDPR once teams are working with specific datasets and have started to create outputs.
One aspect that we heard about again and again was the value to our participants of meeting other teams and individuals working on, and inspired by, the same issues; in our case, addressing misinformation. Whether the teams were artist, startup, or artist/startup collaborations was less important than the fact they were all motivated by the same goal, and were able to share knowledge on the subject and support each other with input. While in-person meetings facilitated this to an extent, regular meetings held simply to discuss challenges and ideas across teams also helped.
Human-computer interaction was at the heart of our project and detailed information can be found in our Data Innovation Toolkit. As the potential applications of generative AI grow exponentially, the interaction between users or audience and AI is going to become increasingly vital, and media/data accelerators need to make sure they cover this appropriately.
Julià Vicens: These days the value of data is evident. Even platforms like Reddit, which used to give access to data for research and other purposes, are restricting access to their APIs. Organisations and citizens who have access to high-quality data and tools for extracting knowledge and applying it to research, business or, in general, decision-making, have a distinct advantage over those who do not. However, obtaining datasets that contain quality data is not trivial. The quality of data can be both subjective and objective: subjective in the sense that the data may be of high quality for solving a specific problem but not for other problems, and objective in the sense that the data collected could contain multiple biases or errors. So properly designing a data collection process is fundamental, and tedious too. That is why a relevant programme, in my opinion, should concentrate its efforts regarding access to data on three points. First, extracting the full potential of open datasets and performing a data quality assessment of them since, unfortunately, data repositories are rife with low-quality and poorly documented data. Second, generating unique datasets following participatory practices in a non-extractivist way. In this sense, the Data Governance Act is a good starting point as a framework to enhance trust in voluntary data sharing. And, finally, related to the other points, promoting the release of datasets based on data governance agreements and following good data-sharing practices to contribute to the community.
In general, in our residency and accelerator programme, most of the startups and artists that participated were data literate. If this is indeed the case, in my opinion, it is more interesting and enriching to focus on cross-domain impacts, like the socio-technical, than on technicalities such as fine-tuning models or providing computational resources. All the advances in data technologies, from computational power to cutting-edge applications of artificial intelligence models, have an impact that goes beyond purely technical performance. So, mentoring on the implications of using these technologies, and on methodologies to mitigate, for instance, algorithmic biases or the carbon footprint of their models, could be more beneficial for people.
Elena Simperl is Professor of Computer Science at King’s College London, and the Technical Lead of MediaFutures. She is currently leading ACTION, a Horizon 2020 programme that helps 15 citizen science communities around Europe fight pollution, and is also the principal investigator of Data Stories, an EPSRC-funded grant that develops concepts and tools to facilitate ease in everyday engagement with data. She previously led ODINE, an EU-funded incubator for open data businesses, and Data Pitch, which developed a purposive open innovation programme and corporate accelerator enabled by shared data. Elena’s interest in leading initiatives within the scientific community has also taken form through serving as chair and programme chair of the European and International Semantic Web Conference series, the European Data Forum, and the European Semantic Technologies conference.
Julià Vicens is a research scientist specialising in the study of collective behaviour within complex systems. At Fundació Eurecat, he is leading the Computational Social Science research line of the Data Science and Big Data unit, contributing his insights and expertise to cutting-edge research applied to gaining a comprehensive understanding of social and technical systems like culture, cities, media or climate. He has been involved in numerous national and international projects spanning diverse fields such as complex systems, artificial intelligence, human behaviour, and citizen science, among others. Julià is a true advocate for cross-disciplinary collaboration. He devotes his efforts to bridging the gaps between science, technology, education, and the arts, actively contributing to innovative and creative initiatives.
The MediaFutures Lessons Learned interview series was coordinated by the team at Leibniz University of Hannover, L3S Research Center, with support from DEN Institute.