Reporter's Notebook

Reporter's Notebook is just a concept at this point, not an official proposal. It would affect at least two projects: Wikinews and Wikipedia (news articles only). It is tentative and preliminary, for discussion only. Any thoughtful feedback is welcome, positive or negative. More on the talk page. Thanks!
This is a proposal for a new WMF sister project.
Status of the proposal
Reason: Inactive since 2017. * Pppery * it has begun 18:39, 20 June 2019 (UTC)

Proposed by

Rjlabs (talk) 00:38, 1 February 2016 (UTC)

Alternative names

None at this time.

Related projects/proposals

Review underway. Wikijournals has been suggested; however, that project is perceived as too broad in scope.

Domain names

Really anything with reportersNotebook could be used. Even the name Reporter's Notebook is not cast in concrete.

Mailing list links

Not quite ready for this but soon.


If there are other wikis out there already with similar themes (but not under the Wikimedia umbrella) provide links to them here:

Feel free to add to this section if you are aware of any other similar effort, Wiki based or other software based...

People interested

This section is in process

Real time news / news content

Public sourced news with bullet point style facts. As media becomes more biased, in one direction or another, it seems likely that the desire for factual news will increase. Wikipedia already acts somewhat like a news site - information is posted very quickly. I would like to see this more explicitly. (United States) 2015 Strategy Consultation Report

Reporter's Notebook - preliminary system design

I have been pondering this idea for a while and wanted to express it at an early stage, and perhaps get some thoughts on preliminary system design.

Imagine a really good investigative journalist's personal notebook of facts related to a breaking story. This project (early design phase ONLY at this point) would "generalize" that single-user notebook into a multi-user tool that permits collaborative fact reporting, in near-real-time, for news stories large enough to be potentially included in Wikinews and/or Wikipedia.

More preliminary design concepts in bullet form, in no particular order.

  1. The tool would be a support tool, used "behind the scenes" to aid Wikinews and Wikipedia authors and editors writing major news stories.
  2. It would be an accurate, permanent, historical accounting of who reported what, and when.
  3. It would relieve many authors and editors of the tedious job of complete citation creation in Wikinews and Wikipedia articles.
  4. It would be LAMP (Linux, Apache, MySQL, PHP) based, and would also rely heavily, if not entirely, on the existing MediaWiki software.
  5. The design would be such that it would appeal first and foremost to professional users, including a fairly wide audience of:
    1. investigative journalists
    2. beat reporters
    3. detectives
    4. private investigators
    5. special agents
    6. etc.
  6. The design would also allow participation (perhaps to a lesser extent) by everyone, perhaps subject to login and positive identification, to upload images, videos, recordings, and eyewitness accounts. It might do things like geocode the IP of the poster to corroborate being "at the scene / at the time" etc.
  7. A key feature of a reporter's notebook would be hypervigilance about accurate reporting of names, places, times, titles, organizations, etc. Particular attention would be paid to the accuracy of these to ensure that professional media could rely on them.
  8. Hearsay (according to our anonymous source) and leaked documents could possibly be scored for reliability based on the reporting entity.
  9. Dates/Times/Events of specific interest to the press would be included. For example date/time/place of upcoming news conferences, press releases / media kits / backgrounders / transcripts captures. These are of little or no relevance to subsequent articles directly, but are a part of the information web around news events.
  10. Hyperlinks would be provided to the source. Historical snapshots, archiving, and caching under intense demand might be issues.
  11. Because this would be a permanently accessible "archive" of items, perhaps the articles in Wikinews and/or Wikipedia might not need to devote extensive space to references. This would unclutter them without losing the "one click away" source citation.
  12. The tool would "capture" news facts as they are discovered/reported in (very) near real-time. This would provide a location for users interested in "any breaking news, regardless of the fully vetted reliability of the source", vs. Wikinews and Wikipedia, which need to report only from reliable sources, with a neutral point of view.
  13. Professional journalists (and Wikipedia/Wikinews authors) are charged with placing the news in proper context. This system would be totally free of that constraint.
  14. The system would be designed for speed in reporting. Each report has a time/date stamp. LATER, the item may be corroborated, or "enhanced" with the credibility of the source, or official corroborating info. There could be a high speed "preliminary and unverified" status, that could later be updated.
  15. Corrections would be reported events and would also get a time/date stamp.
  16. The news ticker (by event) could be launched into Twitter, encouraging Twitter "news hounds" to also report into the system. Weibo could be another prime user/contributor.
  17. As time passes, the early "sensational news" as reported in Wikipedia needs to be refactored into more thoughtful, more enduring "encyclopedic" article content. Going "through the loop" several times (using this tool from the outset, going from first versions of articles to later, more stable versions) would improve both early-stage and later-stage news reporting. The system would be specifically designed up front to improve the news reporting process (early "breaking" and later "more thoughtful/analytical").
  18. Crowd-sourced news is of critical importance in States where professional news is heavily censored. Some of the "good reasons" for early news censorship in restrictive States are that much of it is overly sensational or even downright false. Panic has very negative consequences. This "reporter's notebook" design might (eventually) help support a better balance. In uncensored-news States it might be designed with an eye to encouraging more responsible reporting all the way through. In censored States the entire "trail" of reports might be allowable after a reasonable time delay.
  19. Just look at Wikipedia page views in the first day, days, or week or so of breaking news. The demand is INTENSE! Those readers want ready access to any fragment of information related to an event, ASAP! All that demand clashes with encyclopedia writing; however, people DO turn to Wikipedia for news.
  20. I'm thinking of something that is row oriented for reporting, with a time/date stamp, relevant fields for speedy human scanning, drill downs for more, built for
    1. high speed, real-time use (perhaps with a scrolling window tool, similar to an AP wire headline reporter)
    2. easy subsequent review, and lookup research later on, for "depth" article writers
    3. very DBpedia oriented "under the hood" to preserve linked data
    4. very well thought out data structures to satisfy all current and future users
    5. flexible reporting of "fields" in rows to allow users a wide amount of flexibility, lots of ability to cut down the heap with flexible filters.
    6. metadata tagged everything with significant attention to data integrity all through the system
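
As a thinking aid only, the row-oriented idea in points 14, 15 and 20 might look something like the minimal sketch below. Every name in it (`Report`, `Notebook`, `Status`, and all field names) is hypothetical and invented for illustration. Nothing here is an agreed design, and a real system would presumably sit on MediaWiki and a database rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Status(Enum):
    # Hypothetical status values, loosely following point 14.
    PRELIMINARY = "preliminary and unverified"
    CORROBORATED = "corroborated"
    CORRECTED = "corrected"


@dataclass
class Report:
    """One row: who reported what, and when (all field names are assumptions)."""
    event: str
    reporter: str
    fact: str
    source_url: str
    reported_at: datetime
    status: Status = Status.PRELIMINARY
    corrects: Optional[int] = None  # row id this report corrects, if any


class Notebook:
    """Append-only list of rows; only the status of a row ever changes."""

    def __init__(self) -> None:
        self.rows: List[Report] = []

    def add(self, report: Report) -> int:
        """Append a row and return its row id (its position in the list)."""
        self.rows.append(report)
        return len(self.rows) - 1

    def corroborate(self, row_id: int) -> None:
        """Upgrade a fast, unverified row once it has been confirmed."""
        self.rows[row_id].status = Status.CORROBORATED

    def correct(self, row_id: int, reporter: str, fact: str,
                source_url: str, when: datetime) -> int:
        """Record a correction as its own time-stamped row (point 15)."""
        original = self.rows[row_id]
        original.status = Status.CORRECTED
        return self.add(Report(original.event, reporter, fact,
                               source_url, when, corrects=row_id))

    def filter(self, event: Optional[str] = None,
               status: Optional[Status] = None,
               since: Optional[datetime] = None) -> List[Report]:
        """Cut down the heap: keep rows matching every filter that is given."""
        return [
            r for r in self.rows
            if (event is None or r.event == event)
            and (status is None or r.status == status)
            and (since is None or r.reported_at >= since)
        ]
```

Keeping corrections as new rows, rather than overwriting the original, is one way to preserve the "who reported what, and when" trail from point 2. Whether that is the right trade-off is open for discussion.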

Again, this is far from any proposal, just a fuzzy concept floating around at this point. I am only looking for thoughts around preliminary system design concepts. No rush. If it ever moved forward we would want to engage deeply with news professionals (especially those associated with investigative journalism) on a worldwide basis to really elicit all system requirements, engage academics in journalism, and pay close attention to news values, media bias, reporting bias, publication bias, etc. We would want the design to be globally friendly and up to date at the outset to encourage potential broad-scale use.

Might be fun to engage even the professional news censors in heavily censored States to find out what their precise requirements are? Perhaps the need for censorship might "expire" after a certain embargo? Even censored news States need a flow of reports to pick and choose what to broadcast. A system that also accommodated those needs might bring the benefit of potentially increased future collaboration?

All feedback more than welcome :) Feel free to add comments to talk page here or edit this page as needed.

Technical Notes

Any way to get Wikipedia PAGE ACCESSES PER MINUTE in (near) real time?

Source: Wikipedia:Reference_desk/Archives/Computing/2017_May_18&action=edit&section=3

More generally, is there any way to look at (in near real-time):

  • Web hits per minute by article title?
  • Edits per minute by article title (article and / or talk page)?
  • New article creations during the last minute?

For the moment I'm just interested in the English site.

I'm not familiar with how clustering works at Wikipedia, but am familiar with basic Apache httpd logs (and piping those). MySQL likely has logs of when people "commit"?

Are any of these made available to the public? Any APIs?

I'm looking into identifying what is "hot" and "trending" at Wikipedia that might serve as a "breaking news ticker / IRC channel" (possibility) to feed potential WikiNews editors.

Have no real idea where to start. Have seen grok (daily is the most granular I've seen), and a few of the 3rd party analysis sites at Wikipedia:Statistics but have not yet found anything that operated with say 1,10,30,60 minute "time buckets".

All leads and pointers greatly appreciated (including finding out that's NEVER going to happen, etc....)

Thanks! Rick (talk) 22:33, 18 May 2017 (UTC)

For access: the cluster topology is described at meta:Wikimedia servers. For most visitors, "hot" articles will be served from Varnish cache, not by Apache servers. I'm not aware of anything like a real-time API showing what those Varnish servers are sending; you'd probably best ask on the Wikitech list linked from that article. -- Finlay McWalter··–·Talk 23:04, 18 May 2017 (UTC)
For edits and creation: you can listen to the ATOM stream from recent changes (there's a link to it in your own recent-changes link) and process it from there. It may be wise to ask for an account at meta:Help:Tool Labs where that could be run (rather than your own machine). -- Finlay McWalter··–·Talk 23:08, 18 May 2017 (UTC)
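
For what it's worth, once edit events have been pulled from a recent-changes feed, the 1, 10, 30, 60 minute "time buckets" asked about above could be computed roughly as follows. This is only a sketch under assumptions: it covers edits, not page views; the fetching step is deliberately left out; the events are assumed to arrive as (title, timestamp) pairs with UTC timestamps like `2017-05-18T22:33:07Z`; and the function names are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def parse_timestamp(ts):
    """Parse a UTC timestamp such as '2017-05-18T22:33:07Z' (assumed format)."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def bucket_start(when, minutes):
    """Round a time down to the start of its N-minute bucket (aligned to UTC)."""
    epoch_minutes = int(when.timestamp()) // 60
    floored = epoch_minutes - (epoch_minutes % minutes)
    return datetime.fromtimestamp(floored * 60, tz=timezone.utc)


def edits_per_bucket(events, minutes=1):
    """Count edits per (title, bucket start) for (title, timestamp) pairs."""
    counts = Counter()
    for title, ts in events:
        counts[(title, bucket_start(parse_timestamp(ts), minutes))] += 1
    return counts


def hottest(counts, top=3):
    """Rank titles by total edits across all buckets, busiest first."""
    totals = Counter()
    for (title, _bucket), n in counts.items():
        totals[title] += n
    return totals.most_common(top)


if __name__ == "__main__":
    # Made-up sample events, not real Wikipedia data.
    sample = [
        ("Article A", "2017-05-18T22:33:07Z"),
        ("Article A", "2017-05-18T22:33:59Z"),
        ("Article B", "2017-05-18T22:34:10Z"),
        ("Article A", "2017-05-18T22:41:00Z"),
    ]
    for (title, start), n in sorted(edits_per_bucket(sample, minutes=10).items()):
        print(start.isoformat(), title, n)
    print(hottest(edits_per_bucket(sample, minutes=10), top=1))
```

A ranking like `hottest` is about the simplest possible notion of "trending"; a real ticker would probably want to compare each bucket against an article's usual edit rate.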

Rick (talk) 17:29, 15 January 2016 (UTC)