Research talk:Autoconfirmed article creation trial/Work log/2017-09-14

Thursday, September 14, 2017

Today I'll be working on documenting ideas on how we will determine what revisions we get quality predictions for, and what kind of questions we have regarding article creation, deletion, review, account maturity, and account survival.

Predicting quality

We are interested in understanding the quality of newly created articles, as that might allow us to make inferences about article survival, reasons for deletion, etc. There is also the question of whether the quality of new articles has changed in any significant way over time. An important question in that regard is: what revision of a page should we use to predict quality? There might be many candidate revisions for a given article; which one we use depends on what question we seek to answer. In this discussion, we ignore articles created by users with the autopatrol right, because the articles they create do not require review.

The process of reviewing new articles is one of triage; this is reflected in the name of the extension designed to make the process easier, "PageTriage". If we apply the WWI battlefield triage principles described in the lede of the triage article to new articles, we get the following three categories (a rough sketch of this split as probability thresholds follows the list):

  • Articles that are likely to survive, regardless of any edits done to them.
  • Articles that are likely to be deleted, regardless of any edits done to them.
  • Articles where immediate edits can make a positive difference in outcome.
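As a rough illustration only, this split could be operationalized by thresholding a predicted probability of deletion. The cut-off values and the p_deleted score below are hypothetical, not calibrated figures from this project.

```python
def triage_category(p_deleted):
    """Map a hypothetical predicted deletion probability to a triage bucket.

    The 0.2 / 0.8 thresholds are illustrative placeholders.
    """
    if p_deleted < 0.2:
        return "likely to survive"        # approve
    if p_deleted > 0.8:
        return "likely to be deleted"     # apply a deletion tag
    return "immediate edits can help"     # needs attention

print(triage_category(0.05))  # -> likely to survive
print(triage_category(0.50))  # -> immediate edits can help
```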

The first two categories should be straightforward to review: the first gets approved, the second gets an appropriate deletion tag applied. The question is: when does this approval happen, and what is the state of the article at that point? The third category is more uncertain, which could mean that one or more reviewers skip over it because it does not require immediate attention. This leads to a similar challenge of determining what the state of the article was when a triage decision was made.

In an ideal world we would know when someone was looking at an article while having the opportunity to review it. However, that disregards other contributors' ability to edit, e.g. by proposing an article for deletion, adding templates suggesting specific improvements, etc. Those types of edits do not require the reviewer right, although they arguably interfere with the review process. At the same time, it is these types of edits that make selecting the right revision for quality prediction more difficult. For example, we might be interested in predicting the "final quality" of the article as written by its creator, yet that product might be interleaved with edits from other contributors.

So, how do we choose revisions to predict quality for?

Related to this question is an assumption that is the foundation of PWR and WikiTrust:[1][2] a contributor making an edit implicitly approves everything that is left unchanged. In our case, this means that an edit to a new article that does not change the content or propose it for deletion is an implicit review of the article, and arguably a vote for it to stay. These implicit reviews mean that we would want to predict the quality of an article both before and after edit sessions by contributors who did not create said article. This could drastically increase the number of revisions we predict quality for.
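To make the notion of edit sessions concrete, here is a minimal sketch. It assumes the revision history is already available as a chronologically ordered list of dicts with "user" and "timestamp" keys, and uses an assumed one-hour gap to separate sessions; both the data format and the gap are assumptions for illustration.

```python
from datetime import timedelta

SESSION_GAP = timedelta(hours=1)  # assumed cut-off between edit sessions

def session_boundaries(revisions, creator):
    """Yield (before, after) revision pairs bracketing edit sessions
    made by contributors other than the article's creator.

    `revisions` is a chronologically ordered list of dicts with at least
    'user' and 'timestamp' (datetime) keys; the first entry is the
    creating revision.
    """
    i = 1
    while i < len(revisions):
        if revisions[i]["user"] == creator:
            i += 1
            continue
        before = revisions[i - 1]  # last revision before the session starts
        j = i
        # extend the session while the same non-creator keeps editing
        # within the assumed gap
        while (j + 1 < len(revisions)
               and revisions[j + 1]["user"] == revisions[i]["user"]
               and revisions[j + 1]["timestamp"] - revisions[j]["timestamp"] <= SESSION_GAP):
            j += 1
        yield before, revisions[j]  # revision closing the session
        i = j + 1
```

Predicting quality for both revisions in each yielded pair would give the "before and after" comparison described above.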

In this project, we will regard page review as an efficient triage process. Because it is efficient, we expect newly created articles to go through their first review shortly after creation. While this review might not result in any action being taken, we will regard it as "someone took a look at the newly created article". As a result, we will predict the quality of the initial revision of any newly created article.
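A minimal sketch of scoring that initial revision, assuming the public ORES service is used; the model name, endpoint, and response layout below are assumptions based on ORES's documented v3 API and should be adjusted to whatever quality model the project settles on.

```python
import requests

ORES_URL = "https://ores.wikimedia.org/v3/scores/enwiki/"  # assumed ORES endpoint

def score_first_revision(first_rev_id, model="draftquality"):
    """Request a quality prediction for the creating revision of an article.

    Both the model name and the exact JSON layout are assumptions about
    the public ORES service.
    """
    resp = requests.get(ORES_URL, params={"models": model, "revids": first_rev_id})
    resp.raise_for_status()
    score = resp.json()["enwiki"]["scores"][str(first_rev_id)][model]["score"]
    return score["prediction"], score["probability"]

# Example with a hypothetical revision ID:
# prediction, probabilities = score_first_revision(123456789)
```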

We are also interested in understanding article quality and its relation to review decisions. This translates to predicting an article's quality at the time a review decision was made. We will disregard any edits made by the reviewer prior to the review action, as those edits might affect the predicted quality, and we are interested in understanding the quality in relation to the reviewer's subsequent decision.
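One reading of that selection rule, sketched below under the same assumed revision format as above, plus an assumed review log entry giving the reviewer's name and the review timestamp:

```python
def revision_at_review(revisions, reviewer, review_time):
    """Return the revision to score for a review decision: the latest
    revision made before the review action, skipping any edits the
    reviewer themselves made immediately before reviewing.

    `revisions` is chronologically ordered, with 'user' and
    'timestamp' (datetime) keys.
    """
    candidates = [r for r in revisions if r["timestamp"] <= review_time]
    # walk backwards past the reviewer's own pre-review edits
    while candidates and candidates[-1]["user"] == reviewer:
        candidates.pop()
    return candidates[-1] if candidates else None
```

This handles the common case where the reviewer's edits immediately precede the review; non-contiguous earlier edits by the reviewer would need a separate decision.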

Binary flags

Why are article creation and review associated with binary flags? E.g. why is it that you either cannot create an article at all, or, once you are experienced enough to create one, it still has to be manually reviewed? Would it be better to treat reviewing articles as a classification problem with a probability distribution? E.g. can we train a classifier that judges whether a user is likely to create an article that needs immediate scrutiny, one that can be reviewed later, or one that can be automatically approved?
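As a toy illustration of that framing only: a standard multi-class classifier could output a probability distribution over the three routes instead of a binary flag. The features (edit count, account age in days, prior deleted-article count), labels, and data below are entirely hypothetical.

```python
# Toy sketch: a three-class classifier over hypothetical account features.
# The feature set and training data are synthetic and only illustrate
# the shape of the problem, not a proposed model.
from sklearn.linear_model import LogisticRegression

X = [
    [3,    1,   0],   # brand-new account
    [40,   30,  1],
    [500,  400, 0],   # experienced account
    [1200, 900, 0],
    [10,   5,   2],
    [80,   60,  0],
]
y = ["needs_scrutiny", "review_later", "auto_approve",
     "auto_approve", "needs_scrutiny", "review_later"]

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Probability distribution over the three routes for a new account:
print(dict(zip(clf.classes_, clf.predict_proba([[25, 14, 0]])[0])))
```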

References

  1. B. Thomas Adler and Luca de Alfaro. "A Content-Driven Reputation System for the Wikipedia." Technical Report ucsc-crl-06-18, School of Engineering, University of California, Santa Cruz, 2006.
  2. B. Thomas Adler, Krishnendu Chatterjee, Luca de Alfaro, Marco Faella, Ian Pye, and Vishwanath Raman. "Assigning Trust to Wikipedia Content." In WikiSym '08: Proceedings of the 2008 International Symposium on Wikis, May 2008.