Research:User Engagement in Wikipedia: The Influence of Cultural Identity

Mari-Carmen Marcos
Duration:  2011-11 – 2013-12
Open data project
This page documents a completed research project.

Key Personnel

This research is carried out by a team of researchers based at Universitat Pompeu Fabra, Barcelona, in collaboration with members of the Universitat Politècnica de Catalunya. Key personnel on the project include:

  • Marc Miquel i Ribé, Universitat Pompeu Fabra (UPF)
  • Mari-Carmen Marcos, PhD, Universitat Pompeu Fabra (UPF)
  • Horacio Rodríguez, PhD, Universitat Politècnica de Catalunya (UPC)

Project Summary

The main goal of this study is to understand the influence that identification with ‘Local Content’ exerts on the User Engagement process between Wikipedia and its readers and editors.

Our hypothesis is that this new factor positively affects the other factors of the User Engagement process. To verify it, we propose four specific goals that together cover the relationship between the user (reader and writer) and Wikipedia:

  1. To delimit and quantify the scope of content related to Cultural Identity, verifying its existence and measuring its characteristics in several Wikipedia (WP) language editions.
  2. To evaluate over time the influence of Cultural Identity as a User Engagement (UE) factor in both editing/discussion community dynamics and navigation/reading.
  3. To understand how content related to Cultural Identity is a consequence of both user involvement/activity in Wikipedia and users' attachment to their cultural background.
  4. To assess how Cultural Identity can affect the reading experience with regard to the elements of the Wikipedia article layout.

The hypothesis will be verified on sets of up to 20 language editions during data processing. When focusing on a specific community, we will take the Catalan Wikipedia community as our case.

Background Information

User Engagement (UE) is one of the most popular concepts on the Internet. It emerged after the importance of user-centered design was acknowledged, and it characterizes the quality of the experience between a user and an object. It is used in both the academic and professional spheres, with slightly different meanings. It applies to any technological device and is usually studied through users’ behavior.

One of its working frameworks integrates different psychological theories (Flow, Play and Aesthetics) and showed that UE is composed of factors such as attention, usability, aesthetics, novelty, endurability and involvement (O’Brien, 2008). Although the framework has been used in different studies, it should be noted that content meaning was not generally included as a factor.

Yet it seems undeniable that familiarity with the content and with a technological interface alters use and therefore the UE process. Moreover, identification with a culture (so-called cultural identity) has been described as one of the main human drives or motivations to accomplish a goal. Identification with content can be seen as identification with a particular set of meanings from a culture.

Wikipedia (WP) is a free, collaboratively edited and multilingual Internet encyclopedia available in 285 languages. It is built by communities of volunteers, registered and anonymous, who decide its content by consensus. As a result, each language edition can devote more attention to certain topics than others do. This is the case of ‘local content’, a set of articles that cover topics related to the language or culture, such as territory, language, traditions and societal dynamics.

As its popularity grows, WP is becoming a well-researched object of study. Nonetheless, it has not been explored as a product of UE, an approach that makes sense when a user can easily become a producer, and its multilingual nature is often neglected by studying only the English edition. Our hypothesis is that the UE process between Wikipedia and its readers/editors is also influenced by their cultural identification.

In this research we propose to map this cultural identification onto ‘local content’ as its representative set of articles, and we explore the relationship between users and WP through several studies. To that end, we will divide the object of study into different spaces in order to obtain a holistic approach and reliable results.

Finally, the study will propose an improvement to the UE framework and disseminate its results among the academic community. No less important, the study's conclusions will be useful for the Wikipedia community, which will gain a better understanding of its members and will benefit from interface change proposals based on the insights.


Data Processing (Data Analysis and Natural Language Processing)

In the first phase, Wikipedia itself was the main research object, so it had to be approached by computational means and methodologies. In order to reach general conclusions we chose 20 language editions, ranging from the most edited to very small ones. We proposed an analytical model using different Wikipedia informational structures: textual, relational and quantitative. This involved all kinds of elements, such as edits, links, text and categories, among others. We used techniques such as TF-IDF, PageRank and Semantic Relatedness.
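As an illustration only, TF-IDF and PageRank, two of the techniques named above, can be sketched in a few lines of Python. This is not the project's actual pipeline: the article titles, tokens and links below are invented toy data, the function names are hypothetical, and Semantic Relatedness is left out because the page does not say which measure was used.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {title: {term: weight}} for a dict of title -> list of tokens."""
    n_docs = len(docs)
    # Document frequency: in how many articles each term appears.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    weights = {}
    for title, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[title] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of title -> list of linked titles."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by articles without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy data (assumption): three invented articles and their internal links.
docs = {
    "Sardana": "sardana dance catalonia tradition".split(),
    "Barcelona": "barcelona city catalonia".split(),
    "Physics": "physics science energy".split(),
}
links = {
    "Sardana": ["Barcelona"],
    "Barcelona": ["Sardana", "Physics"],
    "Physics": [],
}

weights = tf_idf(docs)
ranks = pagerank(links)
for title in docs:
    top_term = max(weights[title], key=weights[title].get)
    print(title, top_term, round(ranks[title], 3))
```

In this sketch the textual structure (article tokens) feeds TF-IDF and the relational structure (links between articles) feeds PageRank. A term shared by several articles, such as "catalonia" here, receives a lower TF-IDF weight than a term unique to one article.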

Ethnography and Qualitative Research

In the second phase, we propose using qualitative methods to approach different Wikipedia language communities. We have already conducted two informal surveys of the Catalan community (through the Amical Viquipèdia association), which gave a good indication of a cultural motivation to collaborate on local content. However, qualitative techniques are necessary to reach well-founded conclusions. This will involve recruitment, interview and analysis periods.

User Testing: Eye Tracking and Think Aloud Methods

In the third and last phase, we propose 'Eye Tracking' and 'Think Aloud' as mature User Experience methodologies to understand how users relate to 'local content' in the moment. It will be necessary to take into account all degrees of interest in 'local content'. This involves recruitment, testing and analysis periods. Later, it will also be possible to continue the research by proposing and using metrics to see whether the interface changes users' behavior.


The research will be presented at relevant conferences, seminars and journals.

Wikimedia Policies, Ethics, and Human Subjects Protection

Our foremost priority is to conduct our research in an ethical, respectful, and non-disruptive manner. We will ensure that we conform to strict standards of informed consent and transparency in data collection methods. All participants will be informed about our affiliation, purpose and research goals. We will make all efforts to address any risks associated with participation in this study.

Benefits for the Wikimedia community - Fit to Strategy

This work will help to:

  1. Propose useful engagement guidelines for 'User Experience' in newer MediaWiki versions.
  2. Identify the motivations of editors, both registered and anonymous.
  3. Give a better understanding of Wikipedia content, its strengths and its gaps.
  4. Reinforce a new multicultural neutral point of view (MNPOV).
  5. Help in the overall goal of spreading all human knowledge to all languages.


January-March 2012

  • Develop the API for Wikipedia analysis
  • Process new data from Wikipedia languages
  • Draft report

April-June 2012

  • Data Analysis

June-September 2012

  • Data Analysis
  • Prototyping
  • Eye Tracking Testing

January-March 2013

  • Survey Creation
  • Data Analysis

April-December 2013

  • User testing
  • Data Analysis
  • Drafting, writing and publishing


At the moment the project has no funding.


Marc Miquel – MSc in Telecommunication and degree in Humanities - marcmiquel @