Grants talk:Project/Finding References and Sources for Wikidata


October 11 Proposal Deadline: Reminder to change status to 'proposed'

The deadline for Project Grant submissions this round is October 11th, 2016. To submit your proposal, you must (1) complete the proposal entirely, filling in all empty fields, and (2) change the status from "draft" to "proposed." As soon as you’re ready, you should begin to invite any communities affected by your project to provide feedback on your proposal talkpage.

Warm regards,
Alex Wang (WMF) (talk) 19:44, 11 October 2016 (UTC)

Goals

"This will allow us to rank each source based on the criteria settled upon, and therefore make use of the best source for each statement, ensuring that a certain threshold of quality is maintained at all times."

  • This sounds strange to me. There's no need to choose the best source. Wikidata does fine with listing multiple sources for a claim.
  • Talking about a "threshold of quality" also doesn't inspire confidence in me. A reference to a self-published website might often be better than having no source at all. It would be preferable to have a high-quality source, but in the absence of one there's no reason to delete a low-quality source just because it fails to pass a threshold. ChristianKl (talk) 10:57, 15 November 2016 (UTC)

"As a starting point, a call has been made for tools to help harvest sources, in particular from Wikipedia, which is where about half of the existing Wikidata sources emanate from. Our aim is to investigate this problem beyond Wikipedia and to develop methods and technology to find new, relevant sources automatically online."

There seems to be an assumption that a source has a quality that's independent of the claim it is meant to support. I don't think that's a good assumption. Not all facts in a given newspaper article have the same quality. The main facts usually receive more scrutiny than side facts like the age of a person. If a person publishes on their own website that they were born on 1983-02-03 and a newspaper writes in passing that they were born in 1982, then 1983-02-03 is the higher-quality date. At the same time, if the newspaper says that the person is a professor at university X, the newspaper is the higher-quality source for that claim. ChristianKl (talk) 11:13, 15 November 2016 (UTC)

Comments of Glrx

I would decline this proposal. I buy into the above comments by ChristianKl.

The proposal title is "Finding References and Sources for Wikidata", but that is not the goal of the proposal. Instead the proposal wants to use crowds to support some sort of quality metric. The title is a disconnect. There is one statement in the goals section about developing an app that will use the metric, but we are not told how external statements will be found.

The proposal tells us that 50% of statements are unsourced, and then it states that 34% of the references are inadequate by Wikipedia policy. Those statements do not indicate that there is a need for some sort of automated quality assessment. They may suggest a need for tools that remove references to wikis, or for a tool that digs into the referenced wikis for a suitable reference.

Also, there's no indication that other, biased sources are a problem. I often judge references by the publisher: a book published by a reputable publisher such as Wiley should be good; a book from a small, unknown publishing house is suspect, but I won't reject it out of hand. It's only when push comes to shove that it gets interesting: if a Wiley book says one thing and a small publisher's book says something else, then I lean toward the better publisher. I also go looking for the author's credentials. Then the attack is not so much on the reliability of the reference as on whether the statement is true.

The other day I corrected a Wikidata statement that said the atomic mass of helium (Q560) was 4.002602±0.000002 amu and was referenced to a presumably ultra-reliable PubChem. The problem is that the PubChem reference only states a mass of 4.003. It is IUPAC that gives the more precise value. If we are to judge the reliability of a statement, then we must know that the speaker actually made the statement as well as how much we trust the speaker. Christian raised this point.

There are probably many statements that have very few sources. A census bureau might be the only source for some population statistics, but it would presumably be reliable on its face. There might be many sources for the statement that George Washington (Q23) was the first US president, but then we'll run across sources that say Peyton Randolph (Q963741) was the first president. Crowds will say the former, but qualification suggests the latter. Just for grins, George Washington (Q23) has a statement that Washington was the commanding general of the US Army starting in 1798 and ending in 1788, sourced to en.WP; an unsourced statement says the interval started in 1775 and ended in 1788. I'll go with the unsourced statement. A SPARQL query that finds nonsense intervals might be interesting.
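A minimal sketch of such a query, assuming the start time (P580) and end time (P582) qualifiers on position held (P39) statements and run against the Wikidata Query Service; the property chosen and the LIMIT are illustrative only, not something the proposal specifies:

    import requests

    # Sketch: find "position held" (P39) statements whose "end time" (P582)
    # qualifier precedes the "start time" (P580) qualifier.
    QUERY = """
    SELECT ?item ?statement ?start ?end WHERE {
      ?item p:P39 ?statement .
      ?statement pq:P580 ?start ;
                 pq:P582 ?end .
      FILTER(?end < ?start)
    }
    LIMIT 100
    """

    response = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "nonsense-interval-check/0.1 (example)"},
    )
    for row in response.json()["results"]["bindings"]:
        print(row["item"]["value"], row["start"]["value"], "->", row["end"]["value"])

The same filter could be repeated for any other property that carries start/end qualifiers.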

The proposal doesn't clarify how crowdsourcing will give better reliability than what WD has already. Why will some nebulous metric produce a good reliability metric? An editor entering a fact probably has a good idea of source reliability. If the editor enters an unreliable wiki, then software could ask that editor and other editors to supply a better source.

The proposal is not concrete. It does not persuade me that machine learning will produce a good result or that it is even essential.

Glrx (talk) 02:07, 7 March 2017 (UTC)

Eligibility confirmed, round 1 2017

 

This Project Grants proposal is under review!

We've confirmed your proposal is eligible for round 1 2017 review. Please feel free to ask questions and make changes to this proposal as discussions continue during the community comments period, through the end of 4 April 2017.

The committee's formal review for round 1 2017 begins on 5 April 2017, and grants will be announced 19 May. See the schedule for more details.

Questions? Contact us.

--Marti (WMF) (talk) 19:53, 27 March 2017 (UTC)

Dear Prof. Elena Simperl and Dr Christopher Phethean,

Thank you for submitting this proposal. I have a couple of questions for you:

  • Are you aware of the work on StrepHit being done by Hjfocs? Your project goals seem related to Hjfocs' work, so it would be useful to know to what extent you have all coordinated and are aware of one another's work. Hjfocs, I would appreciate your feedback on this proposal, if you are willing to comment.
Dear Marti (WMF), thanks for your questions. I am answering as a participant in the project, together with Prof. Elena Simperl and Dr Christopher Phethean. Our project is definitely related to Hjfocs's work; although we haven't contacted him yet, we believe that the two projects are complementary in some respects.
Compared to StrepHit, our project aims to build more granular measures of quality. As an example, the Wikidata verifiability policy defines authoritative sources as "trustworthy, free of bias, and up to date". Our system should be able to provide scores for each of these dimensions, in order to gain a deeper understanding of the types of issues affecting Wikidata sources.
An interesting idea for the future would be to investigate how Hjfocs's project and ours can work together to build a comprehensive system for Wikidata that finds references online and provides, for each one, a measure of its different quality dimensions. --Alessandro Piscopo (talk) 14:34, 19 May 2017 (UTC)
  • Can you describe your previous experience engaging with the Wikidata volunteer community? Can you say more about your community engagement over the life of this project? Ideally, we would like to see members of the Wikidata community commenting on this proposal, indicating to what extent this project meets their needs and how highly they prioritize the work you propose to do.
We have already worked with the Wikidata community to gather their opinions about a suitable data quality framework for Wikidata (Data quality framework for Wikidata). With the same purpose of involving the Wikidata community in defining what quality means for Wikidata, we plan to engage it both online, e.g. through Wiki Labels, and offline, by organising hackathons to gather sources and perform quality evaluations. --Alessandro Piscopo (talk) 14:34, 19 May 2017 (UTC)


Kind regards, --Marti (WMF) (talk) 02:40, 3 April 2017 (UTC)

Thanks, Marti, for the heads-up; I was not aware of this proposal. You can find below my general feedback.
Best,
Hjfocs (talk) 18:37, 5 April 2017 (UTC)

General feedback

I totally agree that the addition of reliable references to statements is a crucial process to ensure trust in Wikidata, since this is exactly the high-level goal of StrepHit. By the way, its very first task was the selection of a set of reliable third-party Web sources for the biographical domain. See Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Timeline#Biographies.

Also, I fully agree with the comments posted by ChristianKl and Glrx.

With respect to the project goals and activities, I'm not sure I grasped the specific aims, so I have a lot of questions that may help the team expand the proposal.

Hi Hjfocs, I went through your questions, trying to clarify each point. Please ask if anything is still unclear.--Alessandro Piscopo (talk) 22:47, 19 May 2017 (UTC)
  • Evaluate the authority or quality of different sources using crowdsourcing
How do you plan to do so and which communities would you notify? Maybe the Wikidata community through a survey or a request for comments?
We would like to engage users both online, by using tools such as Wiki Labels, and offline, by means of editathons to evaluate sources and gather new ones. In order to do this, we will seek the collaboration of different local Wikimedia chapters. We might also consider paid crowdsourcing on platforms like CrowdFlower as a means to collect data.
  • Create a ranking of sources based on their scores for different quality dimensions
Which sources? Those already appearing in Wikidata or new ones? I think that a URL whitelist would definitely be useful here, as other applications may benefit from it. See Spinster's comment: d:Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements#A_whitelist_for_sources. On the other hand, I wonder whether a ranking is actually needed.
We want to start by assessing sources already in Wikidata. Rather than an actual ranking, we aim to build a method to produce scores for different aspects of quality. A subsequent step would involve investigating how to apply this method to evaluate sources not yet used in Wikidata.
  • Develop a machine learning approach that identifies features of sources that people would like to see used
I really don't understand this goal. Could you provide a running example?
Authoritativeness is a contextual property. A page from a religious website could be a good source for the creed it refers to. However, it is likely to be unsuitable as a reference about the human genome, for example. The machine learning approach developed within our project should be able to help users identify non-authoritative sources in cases like this, by providing scores for different quality dimensions (e.g. currency and objectivity); a rough sketch of this idea appears after this exchange.
  • Develop an application to locate potential sources using text analysis
This seems to me a very ambitious goal, although I'm not sure how text analysis can be used to locate sources. I'd be really interested in a more specific description of this point.
An attempt to locate news sources for Wikipedia is described in a 2016 paper from Fetahu et al.[1] Sources are found by issuing a query on a search engine and then evaluating the results using a number of (also textual) features. A similar approach could be attempted for Wikidata; the sketch after this exchange illustrates the same search-then-score idea.
  • and choose the best source based on the quality dimensions.
At what level would you do that? I guess at the Wikidata statement level. For instance, given d:Q5921, you would suggest reliable references to unsourced statements like genre rock and roll. This is what the primary sources tool does (depending on the dataset, the source may be questionable or not).
Yes, at statement level.

Hope this helps! --Hjfocs (talk) 18:34, 5 April 2017 (UTC)
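To make the replies above more concrete, here is a rough sketch, under stated assumptions, of how search-based source discovery and per-dimension quality scoring might fit together. The quality dimensions echo the verifiability wording quoted earlier; the feature set, the search_web helper, and the training labels are hypothetical placeholders rather than anything specified in the proposal.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Quality dimensions echoing "trustworthy, free of bias, and up to date";
    # features, labels, and the search helper below are placeholders only.
    DIMENSIONS = ["trustworthiness", "objectivity", "currency"]
    N_FEATURES = 5  # e.g. domain age, citation count, topical match, ...

    def extract_features(url, statement):
        """Hypothetical feature extractor for a (source URL, statement) pair."""
        rng = np.random.default_rng(abs(hash((url, statement))) % (2**32))
        return rng.random(N_FEATURES)

    def search_web(statement):
        """Hypothetical stand-in for a search-engine query (cf. Fetahu et al.)."""
        return ["https://example.org/a", "https://example.org/b"]

    # One model per quality dimension, trained on (crowdsourced) labels;
    # dummy data is used here so the sketch runs as-is.
    rng = np.random.default_rng(0)
    X_train = rng.random((200, N_FEATURES))
    labels = {dim: rng.random(200) for dim in DIMENSIONS}
    models = {dim: RandomForestRegressor(n_estimators=50, random_state=0)
                   .fit(X_train, labels[dim]) for dim in DIMENSIONS}

    # Score candidate sources for one statement, keeping per-dimension scores
    # rather than collapsing them into a single ranking.
    statement = "Q5921 genre rock and roll"
    for url in search_web(statement):
        features = extract_features(url, statement).reshape(1, -1)
        scores = {dim: round(float(models[dim].predict(features)[0]), 2)
                  for dim in DIMENSIONS}
        print(url, scores)

In practice the placeholder labels would come from the crowdsourcing channels mentioned above (Wiki Labels, editathons, or paid platforms such as CrowdFlower), and the feature extractor would use real page content rather than random numbers.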

Staff costs

Hi and thanks for this proposal. In your budget, you indicated $29,706 for staff, which is supposed to cover six months of software development + researcher time. Could you clarify whether this is the time of one or several of the participants (who are, if I'm correct, already paid by the University of Southampton) or of a contractor not listed here? Léna (talk) 19:35, 11 April 2017 (UTC)

Hi Léna, this would primarily be to hire a developer for six months, so they are not listed under the researchers. There would be a small amount to pay for 10% of a researcher's time to support the developer; however, the researcher is on a fixed-term, project-based contract rather than a permanent, full-time one (so not already paid). Thanks, Chris. --Chrisphethean (talk) 09:51, 19 May 2017 (UTC)

Clarify general source validity versus statement-level reference applicability

I want to echo and perhaps attempt to clarify some concerns above. I think it's really important to distinguish between the problems of:

  1. Rating the quality of a particular source of information, such as a scientific journal, newspaper, or other publication venue. As an example, you could imagine producing a system that would attempt to rate the Nature Reviews article collection as more trustworthy than, say, The Onion, with a spectrum in between.
  2. Assessing whether a particular publication, such as a particular journal article, is a satisfactory reference for supporting a particular Wikidata statement.

If you can pull apart the techniques you are developing to address both of these in turn, I think you would have a better, clearer proposal.

Apart from clarifying your objective, it would also be useful to clarify exactly what the inputs to your machine learning model are going to be. I could imagine, for example, learning a lot about what sources Wikipedians consider trustworthy from article edit histories. Alternatively, I suspect you could crowdsource the construction of a fairly robust rule-based ranking system by engaging directly with communities that care a lot about this sort of thing, such as the people who produced MEDRS. --I9606 (talk) 18:42, 18 May 2017 (UTC)

Round 1 2017 decision

 

This project has not been selected for a Project Grant at this time.

We love that you took the chance to creatively improve the Wikimedia movement. The committee has reviewed this proposal and not recommended it for funding. This was a very competitive round with many good ideas, not all of which could be funded in spite of many merits. We appreciate your participation, and we hope you'll continue to stay engaged in the Wikimedia context.


Next steps: Applicants whose proposals are declined are welcome to consider resubmitting their applications in the future. You are welcome to request a consultation with staff to review any concerns with your proposal that contributed to the decision to decline, and to help you determine whether resubmission makes sense for your proposal.

Over the last year, the Wikimedia Foundation has been undergoing a community consultation process to launch a new grants strategy. Our proposed programs are posted on Meta here: Grants Strategy Relaunch 2020-2021. If you have suggestions about how we can improve our programs in the future, you can find information about how to give feedback here: Get involved. We are also currently seeking candidates to serve on regional grants committees and we'd appreciate it if you could help us spread the word to strong candidates--you can find out more here. We will launch our new programs in July 2021. If you are interested in submitting future proposals for funding, stay tuned to learn more about our future programs.

Aggregated feedback from the committee for Finding References and Sources for Wikidata

Scoring rubric and scores:
(A) Impact potential: 4.9
  • Does it have the potential to increase gender diversity in Wikimedia projects, either in terms of content, contributors, or both?
  • Does it have the potential for online impact?
  • Can it be sustained, scaled, or adapted elsewhere after the grant ends?
(B) Community engagement: 4.9
  • Does it have a specific target community and plan to engage it often?
  • Does it have community support?
(C) Ability to execute: 4.5
  • Can the scope be accomplished in the proposed timeframe?
  • Is the budget realistic/efficient?
  • Do the participants have the necessary skills/experience?
(D) Measures of success: 2.8
  • Are there both quantitative and qualitative measures of success?
  • Are they realistic?
  • Can they be measured?
Additional comments from the Committee:
  • The project has limited online impact. Once the grant ends, the project could not be replicated, but it could scale to a new phase.
  • Machine learning is good, but this specific use is really hard from a quality perspective.
  • The project fits with Wikimedia's strategic priorities, but its online impact potential is unclear: the proposal is too vague. It is therefore probably not sustainable, as the "corpus of sources" will become outdated pretty quickly.
  • This proposal wants to address an important issue in Wikidata in an innovative and interesting manner. However, its lack of anchoring in the Wikimedia ecosystem is likely to reduce its potential impact by redoing work already done elsewhere.
  • Very low potential for impact. There is little probability this will have any significant impact on Wikidata as it addresses a wrongly defined issue.
  • The grantees don't specify the numbers for their metrics, the goals don't seem concrete, and they aren't linked to the activities.
  • The approach is not particularly innovative. Due to the vagueness of the proposal, the risks are high and clear measures of success are lacking.
  • Rather innovative, but there is a high risk of getting a useless outcome.
  • Given the needs and goals of the project, I don't see sufficient skills to develop the idea. The budget isn't detailed about the time to be paid for.
  • The budget is not detailed enough. The Wikimedia-related experience of the participants is non-existent. I have doubts about their ability to execute the project.
  • As of the date of this review, budget clarification is still missing.
  • Participants are rather skillful, but don't seem to know Wikidata well enough.
  • No notifications, no endorsements.
  • I have not noticed any.
  • No community engagement, no attempt to work with others, no answers on the talk page.
  • Specification is too vague. No interaction with the feedback they received.
  • They state, “By focusing on a crowdsourcing approach to ranking source quality, community participation is integral to our project,” but they seem to have done little or no outreach to get feedback.
  • The proposal is too vague. It is not clear what the final results will be and whether anybody needs them.
  • Attending Wikimania doesn't seem to be useful for the project itself.
  • Chances are high this project will not have any positive impact on Wikidata.
  1. Fetahu, B., Markert, K., Nejdl, W., & Anand, A. (2016, October). Finding News Citations for Wikipedia. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (pp. 337-346). ACM.