Grants:IdeaLab/Characterization of Editors on Wikipedia

Status: not selected
Characterization of Editors on Wikipedia
Who is editing on Wikipedia and at what volume do they edit?
Amount: 14,970 USD
Created on: 12:25, 20 March 2015 (UTC)

Project idea

What is the problem you're trying to solve?

While the anonymity of Wikipedia editors’ profiles brings numerous benefits, the Wikipedia community might benefit (and further expand) from an accurate description of the traits and characteristics of its super editors, active editors, and passive editors. The paramount identification of gender trouble on Wikipedia inspired a conversation about gender inequality in online communities, and the 2011 survey began to document this problem. However, it remains unknown how gender, considered on a scale rather than as a binary, might be a precursor to or an influence on the degree and frequency with which someone edits.

Because our understanding and enactment of identity are active as we encounter asymmetrical relationships (Weber, 2009), an intersectionality approach may help contextualize how group membership (e.g. identifying as female and an ethnic minority) and the enactment of one's identity contribute to this gender disparity. Not surprisingly, barriers to editing may exist within interconnected (potentially historical) systems of oppression or discrimination (see Shields, 2008; Shields & Dicicco, 2011), including disruptions in how these systems are translated or transformed online (as cited by Haraway, 1991). The research community can shed further light on these barriers by exploring identity intersections with a scaled approach that accounts for variation in attitudes and behaviors.

Civic engagement, or one’s desire to "give back", is a potentially influential factor when one decides to contribute to an online, open-source community, and it might play a role in or mediate the relationships between our gender identities (and other traits) and our Wikipedia-editing behaviors. It is not surprising that researchers have found that the online Wikipedia community has a strong desire to give back to public knowledge (Cho, Chen, & Chung, 2010; for a review see Jullien, 2012; Okoli et al., 2012). However, are there certain traits or behaviors, or intersections of these traits, that are preventing individuals from engaging on Wikipedia?

Do certain intersections of identities experience unique barriers to editing on Wikipedia? In addition, are individuals with certain attributes more directly drawn to editing in this online community? For example, individuals with restricted interests and those who exhibit repetitive behaviors (i.e. a dimension of autistic traits) might be more inclined to edit on Wikipedia at a higher frequency due to the ease of access to narrow topics (Jordan & Caldwell-Harris, 2012) and/or the availability of pre-packaged online social norms (Benford, 2008; Gillespie-Lynch et al., 2014). However, these individuals may also experience unique barriers as a result of the limited in-person and gesture-based feedback available through online communities (Burke, Kraut, & Williams, 2010; Shane-Simpson et al., under review).

Strong efforts are currently under way to bridge this gender gap. However, in order to intervene effectively, our Wikipedia community must fully understand the barriers to editing as these may exist within our identity constructions and enactment online. In addition, how are these systems translated or transformed online, via Wikipedia engagement?

What is your solution?

In order to accurately explore the main goals of the Inspire Campaign, we must be able to effectively characterize our community. Any interventions that we develop should reflect and match the needs of the target population, which requires a thorough understanding of the traits and behaviors of our community of editors. As a direct extension of the recent gender gap research on Wikipedia and to explore other potential areas of inequality, we’d like to conduct another study that compares the traits of the super-editor, the active editor (moderate editing), and the passive editor (views Wikipedia/Wikimedia content without editing). Super-editors will be defined as the top 2,000 editors on Wikipedia by volume of editing (a smaller subset of whom will be sampled), active editors will be a sample of editors who edit regularly (once or twice a week), and passive editors are those who edit infrequently. The third, passive editor group would be recruited via Amazon Mechanical Turk to ensure a matched sample based on gender and other pertinent traits. Pending funding, editors would be entered into a raffle as an incentive for their participation in the online survey.

The proposed project would use an online self-report survey that is posted on editor talk pages and distributed via Amazon Mechanical Turk (for passive users). The research team has experience conducting online surveys and will monitor responses to identify any potential misuse of the survey (e.g. vandalism) and/or outliers in the data, such as participants who check the first response for each survey item. The project would be implemented only after IRB approval from the lead researcher's academic institution. Every precaution would be taken to ensure participant confidentiality by de-identifying data and using randomized ID codes for participant data.
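The two safeguards described above, flagging participants who check the first response on every item and replacing usernames with randomized ID codes, could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the example usernames, the 1-to-5 response coding, and the 8-character code length are hypothetical and are not part of the proposal.

```python
import secrets


def is_straight_lined(answers, first_option=1):
    """Return True when every answered item carries the first response option."""
    answered = [a for a in answers if a is not None]
    # An empty response cannot be judged, so it is not flagged here.
    return bool(answered) and all(a == first_option for a in answered)


def assign_random_ids(usernames):
    """Map each username to a random code that is unrelated to the name."""
    mapping = {}
    used = set()
    for name in usernames:
        code = secrets.token_hex(4)  # 8 hexadecimal characters (assumed length)
        while code in used:  # redraw on the rare collision
            code = secrets.token_hex(4)
        used.add(code)
        mapping[name] = code
    return mapping


# Hypothetical responses: one list of scaled answers per participant,
# with None marking a skipped item (no forced-choice questions).
responses = {
    "ExampleEditorA": [1, 1, 1, 1, 1],
    "ExampleEditorB": [3, 2, 5, 1, 4],
    "ExampleEditorC": [1, None, 1, 1, 1],
}

flagged = sorted(name for name, answers in responses.items()
                 if is_straight_lined(answers))
ids = assign_random_ids(responses)
# The de-identified data set is keyed by random code only.
deidentified = {ids[name]: answers for name, answers in responses.items()}
print(flagged)
```

A flag of this kind would only mark a response for manual review by the research team; it would not by itself justify excluding a participant.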

As a special precaution to prevent participant fatigue or over-surveying of Wikipedia editors, the research team will also pilot the instrument with volunteer Wikipedia editors and a larger, university-based sample. This piloting phase will solicit feedback from participants on item clarity, instrument length (very important), and alignment between item prompts and responses (validity of the measure items).

The following variables would be assessed via the online survey:

  • Editing Frequency
    • Items assessing this variable would include closed- and open-ended questions about participants' Talk Page engagement and Contributions/Edits on Wikipedia
    • How many edits have you made in the past month on user Talk Pages? Would you consider this to be an accurate depiction of the frequency of your interactions on Talk Pages? These items would be validated against actual edits made by the editor (via their Talk Page).
  • Gender
    • Gender will be explored on a scale of femininity and masculinity rather than as the binary of "male" or "female." Using a multidimensional approach to gender, participants would have the opportunity to choose their expressivity.
  • Age
    • From the perspective of intersectionality in identity formation and expression, age should be considered in the context of the aforementioned and additional variables listed below.
  • Geographic Location
    • Geographic location will be assessed with a drop-down menu of countries from which the participant may be logging onto Wikipedia.
  • Occupation
    • Participants would be asked their employment status and education level, including degree if applicable (unemployed, retired, part-time, undergraduate, graduate, Bachelors, Masters, PhD, etc.)
  • Social Skills (autistic traits)
    • A subset of these items would be directly pulled from the Social Responsiveness Scale-2 (SRS-2), under the social motivation dimension. The number of items included in the final survey would be based on feedback from the instrument piloting.
  • Restricted Interests and Repetitive Behaviors (autistic traits)
    • These items would also be pulled from the SRS-2, and only a subset of that scale's items would be used. In no way does this study seek to expose or diagnose participants as autistic. Since autistic traits are commonly believed to exist in the general population, to varying degrees (see work by Baron-Cohen; although I disagree with some of his other theories), this study would explore what types of traits might relate to specific behaviors on Wikipedia. Furthermore, are there certain characteristics that are more frequently found in super-editors or inactive editors, and how do these traits interact with how we self-identify? Wikipedia allows us to engage in very narrow interests and potentially repetitive behaviors. Consequently, it's likely that those with high levels of these traits might find our community a particularly welcoming opportunity to pursue these narrow interests.
  • Civic Engagement
    • We need more literature that contextualizes how motivation for civic engagement (desire to "give back") might play a role in or mediate the relationships between our self-identified gender (and other traits) and our editing behaviors on Wikipedia. I'd hypothesize that the online Wikipedia community has a strong desire to give back to public knowledge for the greater good.
  • Experiences with Other Online Communities
    • Would you consider yourself "active" on other online communities?
    • Branched Question: If so, which communities do you contribute to?
  • Perceived Barriers
    • These would include barriers that the editor has experienced and would be assessed via open-ended questions. These items directly reflect the need for Wikipedians to identify the accessibility of the community and knowledge for varying groups of Wikipedians.
      • Have you experienced any barriers that have prevented you from editing on Wikipedia? No or Yes response - branched below.
        • If Yes, what has prevented you from editing on Wikipedia?
      • Do you have any suggestions for the Wikipedia community that might enhance your ability to edit on the site?
      • Are there any articles or topics that you feel are "more accessible" for you to edit?
    • Branched Question: In your other online communities, have you experienced similar barriers?
      • If so, what were they?
      • How did they influence your participation?
  • Do you have any additional comments or suggestions for enhancing the accessibility and fluidity of the editing process on Wikipedia?
  • Do you have any suggestions for the Wikipedia community that might encourage you to edit more consistently on Wikipedia?
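The Editing Frequency items above are to be validated against the edits an editor has actually made. A minimal sketch of that comparison is shown below. The function name, the example rows, and the 25% tolerance band are assumptions chosen for illustration; the proposal does not specify how close a self-report must be to count as consistent.

```python
def compare_self_report(self_reported, logged, tolerance=0.25):
    """Compare a self-reported edit count with the logged edit count.

    Returns the signed difference and whether the self-report falls within
    `tolerance`, expressed as a fraction of the logged count (assumed value).
    """
    difference = self_reported - logged
    # Use at least 1 as the base so a logged count of 0 still has a small band.
    allowed = tolerance * max(logged, 1)
    return difference, abs(difference) <= allowed


# Hypothetical rows: (random participant ID, self-reported edits, logged edits)
rows = [
    ("a1f3", 20, 18),
    ("9c2e", 5, 40),
    ("77b0", 0, 0),
]

for participant_id, reported, logged in rows:
    difference, consistent = compare_self_report(reported, logged)
    print(participant_id, difference, consistent)
```

The signed difference keeps the direction of the error, which matters if some groups systematically over- or under-estimate their own editing.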

Goals

Overall, the proposed project seeks to answer the following questions:

1. Who is editing on Wikipedia and at what volume are they editing?
2. Can an intersectional approach to identity development and enactment highlight unique barriers experienced by less active editor populations?
3. What are the personality characteristics of super-editors, active editors, and inactive editors on Wikipedia?
4. How can the Wikipedia community support and further develop our inactive editors into super-editors?

Project plan

Phase I

The online survey instrument will be developed within months 1-2 of this project and participant categorizations (editing group) will be acquired from a Quarry tool, with assistance provided by an experienced programmer - see $2,000 allocation in budget for tool development and assistance with implementation.
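The categorization step could look something like the sketch below once edit counts have been exported (for example, from a Quarry query result). It is an illustration only: the function name, the input layout, and the one-edit-per-week threshold for "active" are assumptions, and no Quarry call is shown. Editors who fall into neither Wikipedia-based group are labelled "other" here, since the passive group is to be recruited separately.

```python
def categorize_editors(edit_stats, top_n=2000, active_min_per_week=1.0):
    """Split editors into 'super', 'active', and 'other' groups.

    edit_stats maps username -> (total_edits, average_edits_per_week).
    The top_n editors by total edits are treated as super-editors; remaining
    editors averaging at least active_min_per_week edits are 'active'.
    """
    ranked = sorted(edit_stats, key=lambda name: edit_stats[name][0], reverse=True)
    super_editors = set(ranked[:top_n])
    groups = {"super": [], "active": [], "other": []}
    for name in ranked:
        _, per_week = edit_stats[name]
        if name in super_editors:
            groups["super"].append(name)
        elif per_week >= active_min_per_week:
            groups["active"].append(name)
        else:
            groups["other"].append(name)
    return groups


# Hypothetical export: username -> (total edits, average edits per week)
stats = {
    "ExampleEditorA": (50000, 120.0),
    "ExampleEditorB": (300, 1.5),
    "ExampleEditorC": (12, 0.1),
}

# top_n is set to 1 only so the tiny example produces all three groups.
groups = categorize_editors(stats, top_n=1)
print(groups)
```

Ranking by total edits and thresholding by weekly rate are two different measures; the proposal leaves open which edit window would be used for each.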

Phase II

Once participants have been identified in the two Wikipedia-based editing categories (super-editor and active editor), a unique link to the survey instrument would be posted on each editor's Talk Page. The research team will aim for a sample size of 500 participants from each of the three groups (N=1,500 total in the study). However, anticipating a 50% attrition/no-response rate, the team plans to post survey links on 1,000 Talk Pages of editors in each Wikipedia-based group (N=2,000 total surveys posted). Amazon Mechanical Turk will be used in a similar manner to obtain n=500 participants (target) who are gender-matched as closely as possible to the two Wikipedia-based samples. In accordance with an IRB review process, participation in the survey would be completely voluntary without forced-choice questions, and data would be de-identified once collected (random IDs assigned). It is anticipated that data collection will occur over a 3-month period.
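The recruitment arithmetic above can be reproduced with a few lines. The function name is hypothetical; the figures (500 completes per group, a 50% response rate, two Wikipedia-based groups, three groups in total) are taken from the paragraph.

```python
import math


def invitations_needed(target_completes, expected_response_rate):
    """Number of invitations to post so expected completes meet the target."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("expected_response_rate must be in (0, 1]")
    return math.ceil(target_completes / expected_response_rate)


target_per_group = 500    # desired completed surveys per editor group
response_rate = 0.5       # 50% attrition/no-response anticipated
wikipedia_groups = 2      # super-editors and active editors
total_groups = 3          # plus the Mechanical Turk group

per_group = invitations_needed(target_per_group, response_rate)
total_posted = per_group * wikipedia_groups
total_sample = target_per_group * total_groups

print(per_group)      # 1000 Talk Pages per Wikipedia-based group
print(total_posted)   # 2000 survey links posted in total
print(total_sample)   # 1500 participants targeted across all groups
```

If the response rate turned out lower than 50%, the same function shows how quickly the number of Talk Page postings would grow; at 25%, for instance, each group would need 2,000 invitations.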

During data collection, the research team will continuously monitor survey progress for each of the participant groups in an effort to identify potential vandalism attempts and non-human survey answers. In addition, the research team will ensure that a matched sample is solicited to capture the voices of underrepresented groups on Wikipedia, such as women.

Once sufficient sample sizes are acquired (target of 500 participants for each editor group), the quantitative survey results will be analyzed using a nested model approach. Qualitative responses will be explored using a grounded theory approach to identify themes and then coded based on the identified themes. The data analysis process will incorporate the assistance of two research assistants and is anticipated to occur over a 2-month period.

Phase III

In the final phase of the project, the research team will seek to appropriately disseminate findings through Wiki and other relevant conferences, and through open-source publications. The team is particularly interested in enhancement of this current dissemination strategy and would greatly welcome feedback on furthering these methods through a more comprehensive approach to dissemination.

Budget

Total Request = $14,970

Community engagement

This project would thrive only with community participation, and I'd greatly welcome feedback on the project during all phases of development, implementation, analysis, and dissemination of results. Due to the nature of this form of research, I'd be limited in community participation (aside from survey participation). However, I'd love to incorporate participant feedback into the survey results. My prior experience taking a participatory action research approach would aid the further development of this project, in the hope that the Wikipedia community can enhance the research design and protocol.

Sustainability

After the grant ends, I'd like to expand upon any of the unique themes that appear in the survey results. The identification and more thorough understanding of the gender gap would help us better characterize the problem so the community could develop effective, feedback-based iterative interventions that might eradicate the problem. In addition, we could target interventions (such as programs and edit-a-thons) towards specific demographic groups that might benefit from such event planning. The research team could also potentially expand this exploratory study to non-English Wikipedias in order to capture whether themes identified on English Wikipedia translate to other Wikipedia communities.

Measures of success

As a research study, success will be measured via the acquisition of the target sample sizes. This project would be deemed successful if the research team is able to further describe the gender gap and how this gap pertains to shared identities. Success will also be explored in the context of extensions beyond this grant, such as program and intervention development based on the results.

Participants

  • Advisor: I have been working with Cshanesimpson, developing this project with her, and advising her on wiki-matters Theredproject (talk) 01:08, 21 March 2015 (UTC)
  • Additional Advisors & Research Team: There will be two additional (not-yet Wikipedian) researchers with PhDs in Developmental Psychology. Both Kristen Gillespie-Lynch and Patricia Brooks will join the team by assisting with the research and instrument design, in addition to the data analysis. Cshanesimpson (talk) 14:29, 2 April 2015 (UTC)

Endorsements

  • There is a demonstrable need for further research in this area. Its findings would be relevant not only to Wikipedia, but to other online communities. This study would also be supportive of further research into the socio-cultural issues that appear online. This is important, necessary work that needs to be done. Mozucat (talk) 19:12, 22 March 2015 (UTC)
  • I think that a good idea would be to start with the French study which is going on at French Wikipedia at the moment. They ask similar questions (except personal characteristics) and they also split editors into frequent and occasional, so it might be interesting to cooperate with them and see which part of their work can be reused — NickK (talk) 22:03, 22 March 2015 (UTC)
    • Wonderful suggestion! It looks like we might have some overlap in interests and I'll definitely reach out to that project. Cshanesimpson (talk) 12:16, 23 March 2015 (UTC)
  • After the responses Cshanesimpson gave me on the Discussion page, and the edits she subsequently made to the Project idea, I definitely endorse this Idea. It is by far one of the best Ideas I've encountered thus far on the Inspire IdeaLab, and I strongly recommend anyone and everyone to consider its proposals. –Nøkkenbuer (talkcontribs) 16:00, 28 March 2015 (UTC)
  • Support. Cshanesimpson has the background and skills to do this work. I'm happy to help by reviewing survey questions and providing feedback re: how to recruit participants. --Mssemantics (talk) 05:32, 31 March 2015 (UTC)
  • Great initiative I fully support it Rberchie (talk) 16:15, 8 April 2015 (UTC)
  • Very supportive of this work and let me know how I can be of help. OR drohowa (talk) 17:43, 20 April 2015 (UTC)
  • P.S. I can help specifically with info on where to publish your results in the Wiki community and perhaps elsewhere. Definitely check out Research:Newsletter if you haven't already. OR drohowa (talk) 17:45, 20 April 2015 (UTC)
Thank you OR drohowa for your endorsement and I'd love to explore (pending funding) these more comprehensive dissemination strategies. Cshanesimpson (talk) 15:51, 25 April 2015 (UTC)

Community notification

Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions.

Budget breakdown:

| Phase | Line Item | Description | Estimated Time | Estimated Cost |
|---|---|---|---|---|
| Phase I | Survey Piloting | Stipends provided for participants in the pilot phase x 50 participants | N/A | $250 |
| Phases I & II | Research Assistant Stipend | Survey piloting and data analysis | 40 hours x 4 assistants | $1,920 total |
| Phase II | Programmer Stipend | Assistance with survey posting to Talk Pages | Stipend-based | $2,000 |
| Phases I, II, and III | Administrative Costs | Instrument development ($1,000), study implementation ($2,000), and data analysis ($2,000) | Stipend-based | $5,000 total |
| Phase II | Participant Incentives | Online Amazon gift cards at $400 per card x 6 cards (raffle prize); these have been used by the research team in the past to keep mailing costs down | N/A | $2,400 |
| Phase III | Travel | Wikimania Conference travel to Esino Lario, Italy | $1,700 x 2 travelers; includes partial costs for flight, hotel, and food | $3,400 total |
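As a quick consistency check, the six line items can be summed and compared with the stated total request of $14,970. The dictionary below simply restates the figures from the budget; the variable names are illustrative, and the per-hour figure assumes that "40 hours x 4 assistants" means 40 hours for each assistant, which the budget does not state explicitly.

```python
# Estimated cost per line item, in USD, as listed in the budget.
line_items = {
    "Survey Piloting": 250,
    "Research Assistant Stipend": 1920,
    "Programmer Stipend": 2000,
    "Administrative Costs": 5000,
    "Participant Incentives": 2400,
    "Travel": 3400,
}

total = sum(line_items.values())
print(total)  # 14970

# Component checks taken from the item descriptions.
admin_parts = 1000 + 2000 + 2000   # instrument, implementation, analysis
incentives = 400 * 6               # six gift cards at $400 each
travel = 1700 * 2                  # two travelers at $1,700 each
pilot_per_participant = 250 / 50   # pilot stipend per participant

# Assumption: each of the 4 assistants works 40 hours.
assistant_hourly_rate = 1920 / (40 * 4)
print(assistant_hourly_rate)  # 12.0
```

The items add up to the requested amount, and each subtotal matches its description.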