Grants talk:Project/Rapid/Chlod/Contributor copyright investigation tool


Follow-up questions

Hello @Chlod, it was lovely being able to connect via a call to get to know you earlier.

Here are some further questions for your response.

  • Regarding the Contributor_copyright_investigation_tool, would you be able to share more on how this aligns with (or duplicates?) https://copyvios.toolforge.org/
  • Is your proposed project applicable specifically to English Wikipedia? Do you have any thoughts on how to make this applicable to other sites in different languages?

Thank you. JChen (WMF) (talk) 03:16, 16 March 2022 (UTC)

Hi, Jacqueline! Let me try to address those questions per bullet point:
  • Earwig's Copyvio Tool is a well-used tool in the copyright cleanup field for detecting copyright violations on pages based on content found online. The tool I'll be making focuses mostly on workflow improvements and on assisting with the manual labor that comes with finding and fixing copyright violations (which may include using Earwig's Copyvio Tool), so the two tools barely overlap. Perhaps their only overlap is that they'd both be used in cleaning up copyright issues.
  • I took a look at the Wikipedia:Contributor copyright investigations (Q15275379) item on Wikidata, which lists CCI-like projects on other wikis, but I've found that most of them are either highly inactive or unmaintained. It's a bit of a shame to admit, but CCI work is very labor-intensive (something that I aim to address with this tool), and not all wikis have a dedicated set of editors to deal with it.
    • Japanese Wikipedia's Wikipedia:著作権問題調査依頼/多数投稿者 includes a list of serial copyright violators, but none of them have dedicated case pages or contribution analyses like the English Wikipedia does. They have a Wikipedia:Copyright problems-like noticeboard, but even that is relatively low-traffic.
    • Italian Wikipedia has a few open CCI cases, which include lists of pages, but they are also relatively unmaintained: Special:RelatedChanges tells me that there haven't been any edits to the cases in the past 30 days.
    • Marathi Wikipedia has a CCI page and, it seems, one active case with thousands of pages to check, which was last edited in 2020. This page also uses MER-C's Contribution Surveyor (the tool we also use on the English Wikipedia), so the script should be able to work there. Since I have experience developing internationalization- and localization-ready scripts, I can make sure to add language support for Marathi if someone experienced in the language can help translate.
    • Since CCI is a rather niche editing field, this really wasn't a surprise to me. Many wikis really don't have the manpower to work on CCI cases, and even the English Wikipedia copyright cleanup team is stretched so thin that we can't actively vet every CCI newcomer's analyses. Perhaps the only exception to that is Wikimedia Commons, but even its patrollers can't constantly check new uploads for copyright violations (though hopefully that can also be worked on soon by another editor). I am personally aware of a large copyright problem on the Tagalog Wikibooks, with many book summaries and contents mercilessly ripped from the internet or from the original source works (such as Ibong Adarna and Florante at Laura). I had planned to help out with it in 2020, but never found the time to do all of the work. Hopefully, by making this process less tedious, starting with the English Wikipedia, more editors will be inclined to help out with CCI work, and that'll eventually bring copyright cleanup out of the darkness for other wikis. One of my earlier grant points was to also perform some sort of outreach for exactly that: driving up the number of editors contributing to CCI. However, I decided to drop that idea for now, since it'll probably be too big of a task to do in the span of a few months. Perhaps it's something to revisit in the future as a community project rather than as part of this grant.
Hope this answers your questions. Feel free to ask for clarifications or just ask more questions in general. Chlod (say hi!) 17:21, 16 March 2022 (UTC)

Request for extension

Copying here an email I sent to the grants administrator last July 29:

Greetings!

I applied for a Wikimedia Foundation Rapid Grant a few months ago to support the development of my proposed tool for assisting contributor copyright investigations on the English Wikipedia. Recent work on the tool has been going at a steady pace; however, an underestimation of the work needed, bugs that appeared during development and took a while to resolve, and some sudden and unexpected events in my personal life have caused the project to go off-schedule. Unfortunately, due to the delays, I'm no longer confident that the project will be completed by its end date (July 30). However, a lot of progress on the tool has already been made (including the main interface as linked in the grant request), and I don't think it would take more than another month to finish the project's activities.

For this reason, I'd like to request a one-month extension of the grant from the Wikimedia Foundation. The extra month would be used to make up for the delays and to complete the rest of the initial feature set planned in the original grant request. If an extension of the project period cannot be allowed, an extension of the submission deadline for the final report would work as well, in which case some time after the end date (and prior to the final report's deadline) would be used to finish the tool. I want to deliver the tool without compromising quality, and I wouldn't want to give editors an unfulfilling experience by releasing an unfinished project.

Please feel free to ask if you have further questions or concerns. Thank you for your consideration, and I hope to hear from you soon.

Some additional context: I had mentioned briefly in an email with MJue (WMF) that I was graduating on June 13, which was the main reason why I requested that the implementation date of the grant be changed. What I had not anticipated was that our school would inform us on short notice that we would not be able to attend a physical graduation unless we attended an on-campus quarantine that spanned the last 3 weeks of May, taking up much of the time I expected to use to develop the tool. These were the "sudden and unexpected events in my personal life" mentioned above. I expected to make up for the delay by doubling the amount of time I'd spend in the remaining months, but that wasn't enough to get everything done. Chlod (say hi!) 14:19, 1 August 2022 (UTC)

Dear @Chlod,
Thank you for reaching out. Yes, we are happy to grant an extension based on your request. The new end date of the project will be 31 August 2022 and the report will be due on 30 September. Thank you and good luck for the rest of the project.
Regards, Jacqueline JChen (WMF) (talk) 11:55, 3 August 2022 (UTC)