Grants:Project/nschwitter/The Role of Offline Ties of Wikipedians/Midpoint


Report under review
This Project Grant midpoint report has been submitted by the grantee, and is currently being reviewed by WMF staff. You may add comments, responses, or questions to this report's discussion page.



Welcome to this project's midpoint report! This report shares progress and learning from the first half of the grant period up until March 2022.

Summary

In a few short sentences or bullet points, give the main highlights of what happened with your project so far.

  • Finished pre-processing and cleaning of data.
  • Conducted descriptive analysis and wrote up the description.
  • Conducted inferential analysis of 2 main topics and wrote up their results.
  • Presented results on topic 1) at a small conference.
  • Submitted a paper for topic 2) to the WikiWorkshop (which was accepted).
  • Started guidelines for reproducibility and scalability.

Methods and activities

How have you set up your project, and what work has been completed so far?

Describe how you've set up your experiment or pilot, sharing your key focuses so far and including links to any background research or past learning that has guided your decisions. List and describe the activities you've undertaken as part of your project to this point.


This first half of the project continued my work from the past two years. It was mostly research work I conducted on my own to push the academic research (and the research thesis) forward. I work on three topical domains, 1) productivity and collaboration, 2) norm-relevant behaviour/reverting, and 3) elections, and have made substantial progress in all of them. In the following, I briefly mention the statistical methods used and the previous work I build upon, as a number of questions on the grant application focused on these parts. If anyone is interested in any part in particular, please raise it on the discussion page or send me an email and I am more than happy to give more detail.

I worked on the following tasks:

  • Finished pre-processing and cleaning of data.
  • Conducted descriptive analysis.
    • I constructed basic variables to better understand meetups on Wikipedia and get key numbers: Where and when do they take place, how many users take part in these meetings, what does the network that develops between meetup goers look like?
    • I summarised information about the meetups on Wikipedia to highlight problems and dynamics and to generally paint a richer understanding of meetups on Wikipedia.
  • Conducted inferential analysis of productivity and collaboration.
  • Wrote up results on productivity and collaboration.
    • My main interest was finding out whether taking part in meetups had a positive effect on how much users contribute in the future, and whether it affected who they collaborated with.
    • To assess a causal effect of meetups, I constructed a comparable control group of users with a similar pre-meetup activity level and pattern, as well as a similar registration date, so that I could compare meetup attendees with this matched control group.
    • I follow a difference-in-differences approach to analyse the data.
    • I use linear probability models and linear regressions with robust standard errors to assess the effects of meetup attendance.
  • Conducted inferential analysis of reverting behaviour.
  • Wrote up results on reverting behaviour.
    • I tried to replicate a previous study conducted by Piskorski and Gorbatai which was published in the American Journal of Sociology in 2017 [1].
    • In line with this previous research, I concentrate on one year of activity on Wikipedia and check to what extent the number of norms violated, norm punishments conducted, and rewards given and received depends on a user's network (in my case, their online and offline network).
    • I made use of the features of Wikipedia which allow users to undo changes which do not conform to the norms and rules of the website, and which allow users to thank others for edits made. I diverged from the variable operationalisations of Piskorski and Gorbatai (2017).
    • I use multilevel negative binomial models to analyse the data (which have a relatively good fit).
  • Presented results on reverting behaviour at a German conference of network researchers.
  • Started conducting inferential analysis of voting behaviour.
    • I have set up the data, described it, and started preliminary analysis; I will focus on it this month.
    • I will employ linear probability models and focus on four different explananda: 1) Who is running to become administrator, 2) who is winning in elections, 3) who is voting in elections, and 4) who is voting supportively in elections.
  • Started guideline on data collection for reproducibility and scalability.
  • Started guideline on data analysis for reproducibility and scalability.
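The matched control group described above could be built in several ways; this report does not specify the exact procedure. The following is a minimal sketch, assuming greedy one-to-one nearest-neighbour matching on pre-meetup edit counts among users who registered within a fixed window of the attendee. The function name, the dictionary keys, the 90-day window and the toy users are all hypothetical and chosen for illustration only.

```python
from datetime import date


def match_controls(attendees, candidates, max_reg_gap_days=90):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Each user is a dict with the hypothetical keys 'user', 'registered'
    (a date) and 'pre_edits' (edits in a window before the meetup).
    """
    available = list(candidates)
    pairs = []
    for a in attendees:
        # Keep only candidates who registered close to the attendee.
        eligible = [
            c for c in available
            if abs((c["registered"] - a["registered"]).days) <= max_reg_gap_days
        ]
        if not eligible:
            continue  # no comparable control user; attendee stays unmatched
        # Pick the candidate whose pre-meetup activity is closest.
        best = min(eligible, key=lambda c: abs(c["pre_edits"] - a["pre_edits"]))
        available.remove(best)  # each control user is used at most once
        pairs.append((a["user"], best["user"]))
    return pairs


# Invented example users (not taken from the project data).
attendees = [
    {"user": "A1", "registered": date(2010, 3, 1), "pre_edits": 120},
    {"user": "A2", "registered": date(2015, 6, 15), "pre_edits": 15},
]
candidates = [
    {"user": "C1", "registered": date(2010, 4, 10), "pre_edits": 110},
    {"user": "C2", "registered": date(2010, 2, 20), "pre_edits": 300},
    {"user": "C3", "registered": date(2015, 7, 1), "pre_edits": 20},
]

pairs = match_controls(attendees, candidates)
print(pairs)
```

Greedy matching is only one option; its result depends on the order in which attendees are processed, and matching on activity patterns (not just totals) would need additional variables.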

Midpoint outcomes

What are the results of your project or any experiments you’ve worked on so far?

Please discuss anything you have created or changed (organized, built, grown, etc) as a result of your project to date.

Qualitative results:

  • Meetings are generally friendly places of community: These meetups bring together editors of Wikipedia, giving the anonymous usernames a face.
    • Regional-based: Most meetups (the most common form is the informal Stammtisch) are organised locally, and the purpose is mostly to socialise and get to know each other.
    • Project-based: Project-oriented meetings tend to include users from different geographical areas who share a topical interest.
    • Supra-regional: Every once in a while, meetups that require more extensive planning take place and are supra-regional in nature. These can attract dozens of attendees from different parts of the country/area.
  • A community develops at such meetings and there is ample evidence for this (e.g. attending a funeral together after the passing of previous attendees).
  • There also exists conflict at meetings:
    • In some cases, users were disappointed with the social and informal nature of these meetings, for example with the lack of structure and of introductions of participants. While such negative feelings are acknowledged by some, others highlight that joining established meetup cliques also requires effort from newcomers. Even if meetings are not appreciated by all, the regulars seem to become rather defensive about their meetup culture.
    • In some cases, users have stayed away from meetings because of who else was attending. For example, users have mentioned that attendees seem to be a rather selective group of people, made up in particular of administrators. Some users explained that they did not come because they expected those meetups to be "meetings for administrators and insiders instead of for real authors".
    • Users also tend to be rather hesitant to come when journalists are present. Journalists often tend to be seen as external intruders.
    • There are also instances of considerable conflict, which tend to involve the Wikimedia Foundation.
    • Specifically, lines of conflict can occur in cities with community spaces: Community spaces exist in a number of German cities, are generally supported by the Wikimedia Foundation and offer a headquarters for both staff members of the Foundation and engaged Wikipedians. Community spaces often grew out of an active meeting community in a city but, once established, can lead to conflicts. In most cities, they co-exist peacefully (however, the establishment of a community space often leads to a reduction in the frequency of general meetings), but there are cases of disagreement about how things should be organised.
    • The handling of blocked users is a point of discussion: In some instances, some users were explicitly disinvited after having been blocked. Some users agree with this practice, while others speak against it.
    • Such conflicts can lead to a split of the meetup community with alternative meetups organised.
    • (Perceived) Inequality of meeting access: While Wikipedia meetups are generally open to all, a certain reluctance to join them is observable on the organisational pages of multiple regional portals, and skewed distributions of attendee demographics are also sometimes directly discussed (particularly a skewed gender distribution: a female share of 0%-20% appears to be the norm). In many cases, editors who are or consider themselves to be in a minority on Wikipedia (e.g. newcomers, young editors, women) are hesitant to join local meetups.

Quantitative, descriptive results:

  • 4408 meetings were recorded in the German Wikipedia. The first one took place in 2003.
  • Around 3/4 of them can be classified as primarily social, i.e. not having mainly the intention to work on Wikipedia. The proportion of work meetings has increased across the years.
 
  • 89% of the meetings have taken place in Germany, 6% in Austria, 4% in Switzerland. The remaining meetings (about 1%) have taken place in locations all across the globe.

 

  • The average number of attendees per meetup is 8.4 (mean; median of 7) with a minimum of 1 (meaning there were meetups where users were alone) and a maximum of 119 (I excluded very large meetups without a social character because I assume in my analyses that attendees of a meeting have actually met).
 
  • The average number of meetups of a Wikipedian who is in the meetup network at all (i.e. went to at least one meeting) is 9.2 (mean; median of 2) with a minimum of 1 and a maximum of 289 meetups.
 
  • In the user network (the network connecting users who have attended the same meetup) there are 4013 nodes sharing 102738 edges (density of 0.013). This means that 4013 users have taken part in meetings, creating 102738 relationships with each other.
  • The mean of the number of times users have met is 2.3 (median of 1), with a minimum of 1 and a maximum of 153.
  • The degree of Wikipedians relates to the number of other users they have met through meetups. The average degree in the user network is 51.2 (mean; median is 22) with a minimum of 1 and a maximum of 1141.
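To make the network figures above concrete, here is a minimal sketch of how a co-attendance network can be derived from attendance lists and how density and mean degree follow from the node and edge counts. It assumes an undirected network in which two users are tied if they attended at least one common meetup; the function names and the toy attendance lists are invented for illustration and are not the project's actual code or data.

```python
from collections import Counter
from itertools import combinations


def co_attendance_network(meetups):
    """Map each unordered user pair to the number of meetups they share."""
    times_met = Counter()
    for attendees in meetups:
        # Every pair of distinct attendees of one meetup counts as having met.
        for u, v in combinations(sorted(set(attendees)), 2):
            times_met[(u, v)] += 1
    return times_met


def density(n_nodes, n_edges):
    """Share of possible undirected ties that exist: 2E / (N(N-1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def mean_degree(n_nodes, n_edges):
    """Average number of distinct other users met: 2E / N."""
    return 2 * n_edges / n_nodes


# Toy attendance lists (invented user names).
meetups = [["Ann", "Ben", "Cem"], ["Ann", "Ben"], ["Dia"]]
times_met = co_attendance_network(meetups)

# Degree = number of distinct users someone has met at least once.
degree = Counter()
for u, v in times_met:
    degree[u] += 1
    degree[v] += 1

print(times_met[("Ann", "Ben")])  # 2 shared meetups
print(dict(degree))               # Dia attended alone and has no ties

# Plugging in the node and edge counts reported above.
print(round(density(4013, 102738), 3))      # 0.013
print(round(mean_degree(4013, 102738), 1))  # 51.2
```

The last two lines reproduce the reported density of 0.013 and mean degree of 51.2 from the 4013 nodes and 102738 edges.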
 

Quantitative results on productivity: Is there a causal effect of meetup participation on the extent of contribution and collaboration on Wikipedia? Note: Creating reader-friendly graphs on this is a challenge for the future (and to be shown in the final report), see notes below.

  • A control group of similar other users was constructed so that for each meetup date, a difference-in-differences could be calculated, assessing a causal effect of the meetup in both the short and long term.
  • Attending an offline meetup has a positive effect on the contribution behaviour of users. Users do not necessarily increase their contributions after a meetup compared with before; rather, their reduction in contributions is smaller than the reduction a comparable control group experiences.
  • Attendees become more likely to collaborate with each other, but there is no evidence that they shift the extent of their collaboration towards users they have met at a meetup and away from users they have not met.
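The logic of the difference-in-differences comparison can be illustrated with a small worked example. This is only a sketch of the arithmetic behind the estimator, using invented edit counts; the actual analysis relies on regression models with robust standard errors, and the function and variable names below are hypothetical.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences of group means.

    (change among attendees) minus (change among matched controls)
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Invented edit counts per user, before and after a meetup date.
attendees_pre = [100, 80, 60]
attendees_post = [90, 75, 45]   # attendees decline slightly
controls_pre = [100, 80, 60]
controls_post = [70, 50, 30]    # matched controls decline more strongly

effect = diff_in_diff(attendees_pre, attendees_post, controls_pre, controls_post)
print(effect)  # positive although both groups contribute less than before
```

In this toy case both groups reduce their activity, but attendees reduce it by less, so the estimate is positive. This mirrors the pattern described above: a positive effect does not require that attendees edit more after the meetup than before.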


Quantitative results on norm-relevant behaviour: Building upon the theoretical arguments brought forward by James Samuel Coleman[2], I tested to what extent the density of a user's offline network is relevant in explaining their norm-relevant behaviour. Note: Creating reader-friendly graphs on this is a challenge for the future (and to be shown in the final report), see notes below.

  • This was a conceptual replication of the study of Piskorski and Gorbatai (2017) and tested the same hypotheses.
  • I found only partial support for Coleman's argument and only limited importance of the offline network: Actors embedded in dense (online) networks are less frequently the victim of norm violations and less frequently violate norms themselves (supportive of the theoretical argument). I find negative effects of an actor's online network density on their likelihood to punish norm violators on behalf of others and to experience such punishments by others themselves (no support). Regarding rewards: When focusing on all users, I find that users who have previously conducted norm punishments receive more rewards, but there is no evidence of a positive effect of network density on giving rewards.
  • Those attending meetups at all tend to experience both fewer norm violations and norm punishments, and they give and receive more rewards. However, the density of the offline network does not play a noteworthy role in explaining online norm violation and norm enforcement.
  • There is thus no support for Coleman's mechanism based on the offline network, but the results do suggest that those taking part in meetups offline behave differently online.

Finances

Please take some time to update the table in your project finances page. Check that you’ve listed all approved and actual expenditures as instructed. If there are differences between the planned and actual use of funds, please use the column provided there to explain them.

Then, answer the following question here: Have you spent your funds according to plan so far? Please briefly describe any major changes to budget or expenditures that you anticipate for the second half of your project.

The funds are used as a research stipend, as planned. Half of the stipend has been paid out so far. No changes have occurred, and none are anticipated.

Learning

The best thing about trying something new is that you learn from it. We want to follow in your footsteps and learn along with you, and we want to know that you are taking enough risks to learn something really interesting! Please use the below sections to describe what is working and what you plan to change for the second half of your project.

What are the challenges

What challenges or obstacles have you encountered? What will you do differently going forward? Please list these as short bullet points.

  • Most of my obstacles were computational in nature: The sheer amount of Wikipedia data is difficult to work with. I am thankful for the help I have received in this regard and for now having access to improved computational infrastructure. I have shared my lessons learnt in the learning pattern above.
  • The general time constraint is a problem: There are many interesting issues and things worth researching, but my schedule only permits me to follow up on some of them. I have started to explore some meetup dynamics more in-depth, but given my research focus, I need to and will concentrate on more of a quantitative overview of meetups on the German Wikipedia. I am highlighting other avenues as directions for future research.
  • It can be difficult to navigate between the two spaces of academic research and the Wikimedia community and their different needs. While I have mostly been focusing on the more academic side so far, I will try to set up my output in a more engaging way. In the next months, I will focus on unpacking my results in a more community-focused way, making result plots that are easy to understand and convey concise messages, etc.

What is working well

What have you found works best so far? To help spread successful strategies so that they can be of use to others in the movement, rather than writing lots of text here, we'd like you to share your findings in the form of a link to a learning pattern.

Comment: These learning patterns are great, I was not aware of them! I hope to be sharing more learning patterns in the second half of the project and write up my guidelines as learning patterns because they seem more accessible and useful than sharing them as static PDF files.

Next steps and opportunities

What are the next steps and opportunities you’ll be focusing on for the second half of your project? Please list these as short bullet points.

Most of the work in the first half of the funding period was "background work" and did not yet include much outreach. I have enjoyed being in touch with some Wikimedians via mailing lists, appreciated people reaching out to me, and enjoyed writing the monthly reports. I am looking forward to concentrating on sharing results in the second half of the project.

I will work on the following tasks:

  • Finish inferential analysis of election behaviour.
  • Finish write-up of submittable thesis.
  • Outreach activities:
  • Write up guidelines for reproducible research.
  • Prepare data to make it openly shareable, and research options and archives for sharing data.

Grantee reflection

We’d love to hear any thoughts you have on how the experience of being a grantee has been so far. What is one thing that surprised you, or that you particularly enjoyed from the past 6 months?

I am very glad about having received the Wikimedia project grant. It gives me more freedom and time to conduct my research effectively, and it is fantastic to have received recognition through this. It has been great that people have reached out to me via e-mail after reading the project proposal (and if anyone else wanted to do so: Please, I encourage you to get in touch!).


  1. Piskorski, Mikołaj Jan; Gorbatâi, Andreea (2017). "Testing Coleman's social-norm enforcement mechanism: Evidence from Wikipedia". American Journal of Sociology 122 (4): 1183–1222. 
  2. Coleman, James (1990). Foundations of Social Theory.