Grants talk:Project/ContentMine/WikiFactMine/Final

Report accepted

I'm documenting onwiki that this report is accepted and the grant project is complete. Thank you for all of your work on this project!

--Marti (WMF) (talk) 18:02, 3 August 2020 (UTC)

Question about metrics

Dear Charles Matthews,

The Wikimedia Foundation's grants team is conducting a retrospective during which we are aggregating quantitative data from completed grants. I noticed that the target tables for this grant don't always provide specific numbers. For example, these metrics simply say "Fell short of targets."

  • Use of feeds
    • Feeds actively used on a weekly basis by at least 10 Wikipedia and Wikidata editors by M12
    • Contributions to 1000 Wikidata entries by M12
    • Contribution to 100 Wikipedia entries by M12

I don't believe that you are actively involved in this project anymore, but I thought I would ask if you happen to have records for this grant that would allow you to provide any information about any of these three metrics:

  • Total participants (this includes attendees at events, participants in surveys, online participants, etc)
  • New editors (newly registered users)
  • Content pages created or improved across Wikimedia projects (for the purposes of this project, this would be Wikidata items and Wikipedia articles created or improved)

If by chance it happened to be an easy data pull for you to provide this information, I would appreciate it!

Thank you!

--Marti (WMF) (talk) 18:10, 3 August 2020 (UTC)

@Mjohnson (WMF): What I can provide on participants, which is not represented on the page Grants:Project/ContentMine/WikiFactMine/Final, are approximate numbers for the three events I ran at the Moore Library in the early part of the project. The total attendance was around 45. Tom Arrow gave a hackathon workshop at Montreal: about six people came to that, and he gave a talk at Wikimania itself, when there were about 20 people in the room. I gave a lightning talk at the WikiMed Conference there, and spoke to about 15 people.

Going back to the Moore Library events, their format was a one-hour lecture followed by a two-hour workshop, and the workshop part involved people in editing. I'm pretty sure I submitted figures to WMUK at the time for the editing. I think the numbers of people editing were about 3, then 10, then 7.

I no longer have full records for these events in 2017. If there is more you need, I'll try to give you an idea. Charles Matthews (talk) 18:33, 3 August 2020 (UTC)

Charles Matthews, thank you so much for this feedback. It's super helpful and I appreciate your quick response. In scanning the report, I am looking for a way to represent the number of contributions that were made to Wikidata via software built through this project, so the work will be counted in our aggregated numbers of what grantees have accomplished. This project is especially useful to count since the number may be quite large, and it will help to demonstrate the importance of funding large software projects. At one point, the report says, "WikiFactMine also played a part in the WikiCite initiative, to build up bibliographical content within Wikimedia, by providing a software component that was then applied millions of times on Wikidata." Does this mean that millions of Wikidata items were created through Fatameh? If it is possible for you to provide an approximate estimate of Wikidata items created as a result of the tool, then I can include that as the content created/improved figure for this project. Otherwise, the number will be zero, which seems a shame. I'm just having trouble interpreting the report to know how to represent content created/improved on either Wikidata or Wikipedia in a specific numeric way. Do you have any suggestions? Again, if you don't have precise numbers, a conservative estimate would still be helpful.
Thank you!
--Marti (WMF) (talk) 21:44, 3 August 2020 (UTC)

@Mjohnson (WMF): As far as Wikidata is concerned, there were two things going on at that period in 2017. The more significant one was the development by Tom Arrow of the fatameh tool, which enabled a quicker way to create Wikidata items containing metadata for scientific papers. He wrote it around the time of the WikiCite conference in May of that year. At that time, the number of such items on Wikidata stood at around half a million. By August, the number had reached five million. In fact, ContentMine held a strategy weekend in the middle of August, and I well remember monitoring the number of such items, and taking my Chromebook over to Tom to show him as the number clicked past five million. (It now stands at about 30 million.)

So this was a major software contribution to the development of Wikidata, enabling bots to import bibliographical data into Wikidata. On those items actual citation data - paper A cites paper B - can be held, beginning the large-scale holding of the "citation graph" of open citation data on Wikidata: which remains one of the big points of interest in Wikidata for scientists.

Now, this development was not anticipated in the WikiFactMine grant proposal. That was about text-mining for particular facts in the open scientific literature. Tom also wrote a JavaScript tool by means of which the mined facts could be accessed, from the Wikidata sidebar, for possible inclusion into Wikidata. I had a look at that, but there was no significant progress in importing to Wikidata in that way. The comments under "What didn't work" in the report are learnings relating to why that was the case. Charles Matthews (talk) 03:12, 4 August 2020 (UTC)

Charles Matthews, thank you so much for this feedback, once again! Greatly appreciated.
--Marti (WMF) (talk) 19:13, 4 August 2020 (UTC)