ENO-Prompt
European Narrative Observatory - Predictive Research On Misinformation and Narrative Propagation Trajectories
This page in a nutshell: ENO-Prompt is a project to train a fine-tuned open-source LLM (large language model) that will detect disinformation based on linguistic analysis. It started in September 2024 and ends in February 2026.
- work in progress
The rise of disinformation has traditionally been analyzed through mid-20th-century propaganda frameworks, emphasizing the coordinated and repetitive spread of falsehoods. However, the complexity of today’s digital landscape and the socio-cultural dynamics of a "post-truth" society demand a fresh approach. This research explores how socio-cultural patterns embedded in language can be weaponized to shape and spread misleading narratives.
This project uses Large Language Models (LLMs) and combines them with actionable strategies to achieve three core objectives:
- Improving Detection: Developing efficient processes to identify narratives embedded with misinformation.
- Enhancing Narrative Insights: Investigating not only the supply of misleading content but also the societal "demand" for it. This includes analyzing why certain narratives resonate, reflecting collective fears, desires, and identities.
- Empowering the Information Ecosystem: Providing training and resources to journalists, citizen reporters, fact-checkers, and Wikimedia editors to foster a resilient and trustworthy information environment.
Our project expands disinformation research to platforms like Twitter, Facebook, Instagram, YouTube, TikTok, and Wikipedia, focusing on their pivotal roles in shaping public perception. Supported by the Digital Services Act, the methodology involves:
- Training and Support: Equipping journalists and Digital Commons communities with skills to counter misinformation.
- Policy Recommendations: Addressing threats to information integrity through comprehensive analysis.
Who works on this project?
- Opsci (project lead)
- HUN-REN, Erich Brost Institute, Riga Stradins University, University of Urbino (research institutes, universities)
- Wikimédia France
- Going Public, ADB - EURACTIV, Les Surligneurs, Re:Baltica, Orizzonti Politici (fact checkers, independent media)
Steps
- Data gathering strategy (September-October 2024)
- Data management plan (September-October 2024)
- Creating a matrix to analyse examples of disinformation (November 2024)
- Feed the matrix with examples in English, French, Italian, Romanian, Latvian, Lithuanian, Estonian, and Russian - click here to help (November-December 2024)
- Feed this new database to the LLM (January 2025)
- Next steps to come
Country-Specific Case Studies
France: France faces disinformation that amplifies societal divisions, including false claims about foreign involvement in law enforcement and manipulated translations of President Macron’s speeches. Misinformation targeting LGBTQ+ communities inflames tensions. With the 2024 European Parliament elections approaching, France’s polarized environment makes it a critical case for studying election-related disinformation.
Italy: Economic instability and immigration issues in Italy fuel politically motivated disinformation, particularly from far-right figures. Russian narratives exploit historical ties, and LGBTQ+ misinformation fosters discrimination. Italy’s complex balance between freedom of speech and regulation highlights challenges in combating disinformation ahead of the 2024 elections.
Romania: Romania’s vulnerabilities to disinformation stem from Russian hybrid threats and domestic narratives, such as false military reports near Moldova. LGBTQ+ issues are framed as foreign impositions, heightening social divisions. Romania’s case underscores the geopolitical and cultural factors shaping disinformation.
Baltic Region: Estonia, Latvia, and Lithuania are targets of Russian campaigns exploiting ethnic divides, often via Telegram. Despite governmental interventions, these nations' strategic locations make them critical for understanding resilience to election-related disinformation.
Digital Public Opinion and the DSA
Platforms like TikTok, YouTube, and Twitter/X significantly shape discourse. Adhering to the Digital Services Act, the project will analyze online news and public sentiment using two data streams:
- Keywords and Hashtags: A BERT-based system will expand topic vocabularies (see the sketch after this list).
- Key Opinion Leaders: High-engagement accounts and flagged disinformation sources will be tracked using tools like CooRnet.
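The expansion pipeline itself is not detailed on this page; the following is a minimal sketch of how a BERT-style sentence-embedding model could grow a seed vocabulary of keywords and hashtags by semantic similarity. It assumes the sentence-transformers library, and the seed list, candidate list, and similarity threshold are placeholder examples, not the project's actual vocabularies.

```python
# Minimal sketch: expand a seed vocabulary with semantically close candidates.
# The model, seed terms, candidates, and threshold are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

seed_terms = ["election fraud", "#StopTheSteal", "rigged ballots"]
candidate_terms = ["vote rigging", "#ElectionIntegrity", "ballot harvesting",
                   "football results", "mail-in voting"]

seed_emb = model.encode(seed_terms, convert_to_tensor=True)
cand_emb = model.encode(candidate_terms, convert_to_tensor=True)

# Keep candidates whose best similarity to any seed term exceeds a threshold.
scores = util.cos_sim(cand_emb, seed_emb).max(dim=1).values
expanded = [term for term, score in zip(candidate_terms, scores) if score > 0.5]
print(expanded)
```

In practice such a loop could run iteratively, folding newly accepted terms back into the seed set as monitored conversations evolve.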
LLM to analyse multicultural narratology
The project uses narratology and fine-tuned LLMs (e.g., LLaMa-13B) to analyze disinformation narratives. The LLMs identify key narrative elements, ideological stances, and rhetorical techniques such as sarcasm, metaphor, and clickbait.
Following SMART’s research outputs, a narrative is understood as a representation detailing a change of state, encompassing:
- Change: a sequence of events;
- State: a reflection of societal value systems through narratives;
- Representations: narratives manifest themselves in specific mediums, thus acting as signs;
- Systematic Character: narratives adhere to distinct schemas, making them comprehensible;
- Complexity: narratives can be parsed into their basic units, or "narremes", for granular analysis.
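To make these criteria concrete, here is a minimal sketch of how a narrative and its narremes could be represented as a data structure; the class and field names are illustrative assumptions rather than the project's actual schema.

```python
# Illustrative data structure for the SMART-style narrative criteria.
# Class and field names are assumptions for this sketch, not the project's schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Narreme:
    """A basic narrative unit extracted from a text."""
    actor: str   # who acts or is acted upon
    event: str   # the change of state being narrated
    stance: str  # the evaluative framing attached to the event

@dataclass
class Narrative:
    medium: str        # Representation: where the narrative appears (tweet, video, article)
    schema: str        # Systematic Character: the recognisable story pattern it follows
    value_system: str  # State: the societal values it reflects
    narremes: List[Narreme] = field(default_factory=list)  # Complexity: granular units

example = Narrative(
    medium="tweet",
    schema="betrayal-by-elites",
    value_system="distrust of institutions",
    narremes=[Narreme(actor="the government", event="hides the truth", stance="accusatory")],
)
```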
Narratives aren't just cognitive structures; they resonate emotionally. In debates charged with emotion, narratives often overshadow raw data, influencing human cognition and prompting action. Recognising that narratives exist within communities where perceived realities mold actual realities, it's vital to adopt a systems perspective. Such an approach acknowledges that societal dynamics revolve around meaning-making, and narratives are central to this process.
Using this framework, we can discern what constitutes a narrative and, by extension, detect anomalies or deviations that might indicate disinformation. However, the biggest issue that previous research encountered is the practical impossibility of enhancing qualitative analysis with quantitative methods, which renders the methodologies unscalable and largely non-replicable. In its research design, SMART used NLP techniques such as TF-IDF, which emphasize word frequency without grasping context. In contrast, LLMs (starting with BERT) understand text deeply, capturing contextual meanings and relationships, making them more adept at nuanced text qualification.
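To illustrate that contrast, the sketch below compares a surface-level TF-IDF similarity with a contextual embedding similarity on two sentences that share most of their words but take opposite stances. The example sentences and the embedding model are assumptions for this sketch, and the exact scores will vary with the model chosen.

```python
# Contrast sketch: TF-IDF only measures overlap of surface words, while a
# contextual embedding model can also register differences in meaning and stance.
# The sentences and the model choice are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer, util

a = "The vaccine rollout was a success for public health."
b = "The vaccine rollout was a success only for pharmaceutical profits."

# Bag-of-words view: high similarity, since the sentences share most words.
tfidf = TfidfVectorizer().fit_transform([a, b])
print("TF-IDF similarity:", cosine_similarity(tfidf[0], tfidf[1])[0][0])

# Contextual view: the embeddings also encode framing, not just vocabulary.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
emb = model.encode([a, b], convert_to_tensor=True)
print("Embedding similarity:", util.cos_sim(emb[0], emb[1]).item())
```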
Preliminary tests show that the latest open-source LLMs like LLaMa or Falcon are able to perform numerous layers of text analysis that have been largely unattainable through quantitative methods (communication intent, stance on complex political and social issues, perceived expertise).
Moreover, using the narratological criteria, LLMs prove able to deconstruct text into basic narrative units, or "narremes." By analyzing these narremes, LLMs can determine the underlying narrative structure. This process has already been tested internally at Opsci: for each submitted piece of social media expression, our retrained LLM returns the same set of annotations. Preliminary results of the fine-tuned LLaMa-2 show a significant improvement over past methods like BERTopic, especially with regard to the identification of consistent ideological narratives (such as “pro-nuclear” or “anti-nuclear” discourse). LLMs are very efficient at identifying the stances a post takes on a range of complex political issues and narratives. They also make it possible to map different levels of signification of a given text, such as sarcasm, metaphor, or tonality.

LLMs like LLaMa-2 are trained on multilingual data, even though support for languages with less online documentation may be lacking. At the very least, we were able to correctly test the annotation capabilities of LLaMa-2 in English, French, Italian, and German. We plan to train a single annotation LLM for the different languages of the project and use English as the default language of annotation, as the LLM is still optimised on English sources. Open LLMs with better multilingual support for European languages are currently being trained, for instance by the French company Mistral AI or by the team behind Falcon. We plan to potentially retrain these new models on our instruction datasets if they show a significant gain in capabilities with regard to less-documented languages.
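The project's actual annotation schema and fine-tuned checkpoint are not reproduced on this page; the sketch below shows how a fine-tuned instruction model could be prompted to return a fixed set of annotations as JSON, using the Hugging Face transformers library. The checkpoint name, the prompt, and the annotation fields (stance, tonality, sarcasm, narremes) are placeholders for illustration.

```python
# Sketch of a structured annotation pass with a fine-tuned causal LLM.
# "opsci/llama-2-annotator" is a placeholder name, not a published checkpoint;
# the prompt and the annotation fields are illustrative assumptions.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "opsci/llama-2-annotator"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, device_map="auto")

post = "Of course the 'experts' want us to believe nuclear power is safe..."
prompt = (
    "Annotate the following social media post. Return JSON with the keys "
    '"stance", "tonality", "sarcasm" and "narremes".\n\n'
    f"Post: {post}\nAnnotation:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
completion = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                              skip_special_tokens=True)
annotation = json.loads(completion)  # fails loudly if the model drifts from the schema
print(annotation)
```

Because the fine-tuned model is constrained to a fixed set of keys, its output can be parsed and loaded directly into a structured database, which is one of the motivations for fine-tuning discussed below.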
The project aims to identify the most commonly misdirected statements and argumentative routines and to shed light on their manipulation methods, such as highly codified event-writing standards (taking the form of a "standard-fact"), the invocation of sources and authoritative discourses, or the reliance on specific narrative forms (the "angle"). The project will also seek to characterize and classify the rhetorical methods of misinformation: "clickbait" rhetoric, conspiracy theory appeals, and rumor invocation (Zannettou et al. 2019). A taxonomy proposed by Islam et al. (2020) incorporates terminologies introduced by other publications.
Fine-Tuned Open Source Models Vs State-of-the-Art Generalist Closed Models
Since March 2023, ChatGPT has faced increasing competition from open-source models like LLaMa, Falcon, or MPT. While the LLMs of OpenAI currently remain the state of the art, their lead has eroded and, as of September 2023, external evaluations suggest that very large open models like Llama 70B or Falcon 180B have capabilities comparable to GPT-3.5.
Smaller open LLMs have proven especially effective for “fine-tuning” on a specific corpus or task. Through this process, the LLM can gain a better comprehension of specialised documents, or assimilate a given style or form of communication, than large generalist models. Fine-tuning has become more widespread thanks to the development of economical methods like LoRA: rather than retraining the complete weights of the model, which would require advanced infrastructure, only a small set of additional low-rank adapter weights is trained (a configuration sketch follows the list below). The choice of a fine-tuned model over continuing to use a very large LLM through an API stems from the following considerations:
- Fine-tuned models trained on standardised annotations are more constrained, and their output can easily be integrated into structured databases.
- Very large LLMs are only accessible through APIs due to the massive infrastructure required to run them. This creates a continuous external dependency that is not desirable for an open science project.
- Smaller fine-tuned models run considerably faster and cheaper, especially if the only purpose of the generation is the annotation of a short social media expression. Current tests run by Opsci suggest that about 1,000 tweets could be annotated in slightly more than one minute on a single A100 GPU (40 GB VRAM), whereas the ChatGPT API would take nearly an hour.
- Finally, there are increasing concerns over the energy and ecological costs of LLMs. In contrast with ChatGPT, smaller models like LLaMa 7B or 13B can fit on a single GPU.
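As a minimal sketch of the LoRA approach mentioned above, the following uses the Hugging Face peft library to attach low-rank adapters to a LLaMa-style model while the base weights stay frozen. The base checkpoint, target modules, and hyperparameters are illustrative assumptions, not the project's training setup.

```python
# Minimal LoRA sketch with the Hugging Face peft library.
# Checkpoint name and hyperparameters are illustrative; the Llama-2 weights are gated.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the adapters
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```

The wrapped model can then be passed to a standard training loop on the annotation instruction dataset, which is what keeps the hardware requirements within reach of a single-GPU setup.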
Moreover, multiple ethical and cultural concerns arise over the use of non-fine-tuned LLMs for generation or classification purposes. The formation of social movements often hinges on strategic language use, challenging established narratives and shining a light on marginalized viewpoints. These movements introduce new norms and communication methods. However, when utilizing LLMs, there is a risk of "value-lock": technologies reliant on LLMs might reinforce outdated, non-inclusive beliefs.
For instance, the Black Lives Matter (BLM) movement significantly impacted Wikipedia's content creation and editing. As BLM grew, articles on the shootings of Black individuals not only increased but were also produced more promptly. Historical events of police brutality and shootings also received updates, signifying how movements interlink events to shape unified narratives. Twyman et al. emphasize that these social movements actively redefine minority stories, which often shape the data foundational to LLMs. The Common Crawl Corpus has been employed to train most LLMs. It is reported to have been sanitized by excluding content with a list of roughly 400 contentious terms, predominantly sexual references, a few racial slurs, and symbols related to white supremacy. While this might filter out explicit content and certain hate speech, it could also inadvertently diminish the representation of LGBTQI+ communities online. If we censor discussions from marginalized communities, we miss out on training data that positively portrays these identities and repurposes slurs.
The process of evolving narratives could either be incompletely learned or lost amidst the vast data used for training extensive LLMs, especially if this data isn't frequently updated. Taking into consideration recent experiments that Opsci has conducted, fine-tuning might offer a way to substantially retrain LLMs to allow for a more inclusive classification. This will necessitate meticulous data curation to accurately reflect evolving narratives, as well as methods to assess whether the tuning meets the objective of extending the model towards more challenging viewpoints.
As of September 2023, we will likely opt for Llama 13B, which is probably the most effective model that can still be run on common GPU infrastructure. The base version of the model has repeatedly proven preferable to the “chat” version for multilingual use, as the latter is overtrained on English instructions.
- work in progress