Talk:Wikimedia Fellowships/Project Ideas/The Wikipedia Library


Good project, 2 points.

Have you considered the reaction of the existing publishers? How supportive would they be of a permanent arrangement like this? They derive their revenue from limiting access. Offering temporary or limited membership as a promotional offer does not mean that they would like users past their paywalls permanently.

Since becoming an editor is free and completely open to just about anyone, you would, in essence, be liberating all their paid content for free. Any user can just create an account to get access to the journals and bypass their paywall. I don't see a lot in it for the publishers.
I'm sure you have considered this implication; I just want to know how you plan to address this rather large problem, since this attempt will remove, or at the very least affect, the publishers' primary source of revenue. I don't believe 1000 is a number the publishers would realistically agree to for free permanent access, and you might have to negotiate it lower.

If the cited sources are behind a paywall, and only a limited number of users have access, won't that fundamentally affect our core policy of verifiable third-party sources? After all, only a limited set of users would be able to verify, correct or dispute any source. Either way, this is a great idea, and I wish you the best. Kind regards. Theo10011 (talk) 17:54, 11 August 2012 (UTC)

There is some reason for optimism about their participation, although every publisher is different.
  • For one, we already have 4 partnerships, with Credo, HighBeam, Questia, and JSTOR.
  • Two, participation in The Wikipedia Library (TWL) would benefit these research databases by acting as a visible act of good will and increasing their utilization on Wikipedia. That includes getting links back to their sites in references, as well as any media TWL receives (e.g. WMF blog, possible mainstream news coverage).
  • Three, not all editors will qualify. The current plan is to require an account at least 1 year old with a minimum of 1000 edits. My goal is to get 1000 Wikipedia editors access. We've already done that with HighBeam and Questia. Credo is up to 500. JSTOR is starting out at 100.
  • Four, I think once one research database takes the first step of signing on, others will want to follow, so that they are not left out. There is also incentive for databases to sign on early, because the current plan is for sites to be listed on the page in the order that they join up.
As for paywalled sources, I think there is fair reason to raise questions about their benefit. I don't totally share the concerns, but they are worth addressing:
  • First off, we are not handed any ideal choices here. Either our editors do not have access to paywalled information to add to our articles, or our readers will likely not have access to the paywalled sources from which content was added.
  • An approach to better weigh the balance here is to consider the relative percentage of our users who will *read* article content versus those who *source-check* it. I think I can comfortably say that readers far outnumber source-checkers. That means that whatever the cost to readers, it is likely several times less than the benefit to them, at least in aggregate.
  • There are secondary considerations, still. For example, will having an increasing number of paywalled sources make things difficult for fellow *editors* to do verification work? While this is already a problem to a degree, it's not necessarily one we want to worsen. My approach to mitigating that concern is to try to make sure that *enough* of our readers do have access to these paywalled sources. For example, there will soon be 1000 editors with access to HighBeam (some of our most active, for sure), and then there's always WikiProject Resource Exchange for what falls in the gap.
  • Will the public lose faith in Wikipedia if the content cannot be easily verified? I wish the answer weren't so easy for me, but I think they almost definitely will not lose faith, because the average reader does not care where the information came from as long as it is presented in a seemingly accurate, thorough, and unbiased fashion. And I can't really imagine a great revolt in the press or elsewhere because Wikipedia is suddenly taking advantage of the best available resources that serious scholars use in their own practice.
  • This is not just a problem with paywalled sources, but with *any* source which is not available free *and* online. Not all of the sources that have been donated are solely pay-for-access; for some of them, for example, you would just need a good university library reference section. Yet I don't know if the same concerns would be raised about editors using library reference desks, or any printed content for that matter. Much print content is just as difficult for readers to verify, whether or not it is available free somewhere in the brick-and-mortar world.
  • Another consideration is that editors are instructed as part of these partnerships to use a free version if available, and to always provide the original citation information so that a reader can seek it out on their own. Some information, for example newspaper archives, may be available nowhere else but paywalled sites. If we don't have access to them, then not only will our readers not be able to look up the source, they won't be able to read about the content in the first place.
  • There is a sea change happening with open access, and perhaps we are benefiting in part from databases trying to 'open-wash' their reputations. I think there are more primary reasons they have made these donations, however, such as receiving linkbacks, attention and good will among editors, and altruistic intentions to improve Wikipedia. In time, perhaps, we won't have to make these kinds of difficult choices...
  • In the end, the use of paywalled sources is simply not prohibited by policy. See, specifically, EN:WP:PAYWALL ("Verifiability in this context means that other people should be able to check that material in a Wikipedia article has been published by a reliable source. The principle of verifiability implies nothing about ease of access to sources: some online sources may require payment, while some print sources may be available only in university libraries. WikiProject Resource Exchange may be able to assist in obtaining source material.").
Thanks for your thoughtful questions and kind words. Cheers! Ocaasi (talk) 18:12, 13 August 2012 (UTC)
Thanks, Ocaasi. I commend you for your honesty and effort. It's great to read that you share some of the same ambivalent feelings while weighing the benefits. I endorsed this project at the same time as I raised my concerns above, because I think this is a worthy attempt. I hope you have already reached out to the journal publishers to gauge their response. Keep up the great work! Thanks. Theo10011 (talk) 20:53, 18 August 2012 (UTC)

Initial fellowship program feedback

Hi! In response to a comment from an endorser about this being a snowball keep, I thought I'd share an update on where I think this project idea stands from the program perspective:

First, it does appear that a project like this would have great value to the community; that seems to be pretty clearly demonstrated in discussions so far. It's great to see so many endorsements and we'll continue to monitor this closely. Community support is an important factor in selection, and endorsements are a good barometer of this. I will point out, though, that endorsements aren't a !vote or poll, and so WP:SNOW doesn't really apply to the fellowship selection process ;-). There can be some very good reasons why a highly endorsed project may still not be feasible for a fellowship at any given time.

Secondly, as a primary concern for this project moving forward, I'll highlight what Ocaasi has already pointed out. This project has 2 main components: relationship building and technical infrastructure. The technical infrastructure component of this project is not insignificant: it probably requires about 6 months of developer time, plus additional time/resources for code review and deployment. In order to take this project on in the fellowship program, we'd need to ensure we've got the resources to handle this component. Perhaps we would need to have a fellow with all the necessary technical skills dedicated to getting this part of the job done, in addition to someone who can handle the relationship end of things. Or some other creative solution would be needed to technically resource this idea. Fellowship projects that require investment from WMF Tech resources are not an option because WMF engineers are already committed to completing a whole suite of other great projects, like the visual editor. At this point, I'm not seeing how we would technically make this project happen, and it may be that this project is simply too tech-heavy to be feasible in the near term.

We're keeping this idea open and will look forward to seeing updates, comments, endorsements, creative solutions, and signups from volunteers or potential fellows interested in helping work on the project over time, because it really does seem to be something that would be useful to editors! But practically speaking, the fellowships program is not going to take this on without ensuring we can address the technical requirements. Luckily, Wikipedia has no deadline, right? :-) Siko (talk) 18:41, 15 August 2012 (UTC)

This is a bit disappointing, Siko. I was expecting more support for the idea, at least something to help him try this. This has to be the most widely endorsed Fellowship idea I've ever seen. I would point out that his proposal doesn't necessarily need dev. time or resources, or more specifically WMF dev. resources. This could be handled in a clunky manner with no development support if need be (authorizing 1000 accounts from publishers, handing out access manually, etc.). It might be worth suggesting that he add someone more technically inclined to the proposal who can help out, or he can seek out what Wikipedia was built on: volunteer time. The majority of the tools and sundries were, and still are, built by volunteers, so it is entirely possible Ocaasi can find the help of more qualified volunteer developers who have been doing it for years.
I really believe, along with several others, that he should at least be given a chance to try this. Even without any commitments of dev. resources, he had near-unanimous support here and on the mailing lists, from several users who have been editing for years and know that this can bring about a great positive change. I also have to point out that this is probably the best kind of proposal you are likely to get. I've seen a couple approved without this much support or thought, and this is one of the better ones I've seen in years. I really hope something becomes of this proposal, and that this stalling doesn't kill the impetus behind a good idea, which I fear it will. :(
General note about policies and Meta - I know endorsements aren't votes, but if we are not going to weigh the response then I suggest removing that section from the template altogether. Also, SNOW applies to open requests that need to be closed in a finite amount of time, essentially to save time. SNOW doesn't necessarily apply to Meta, and even if it were to be applied, proposals don't have a close-by date. ;) Thanks. Theo10011 (talk) 21:17, 18 August 2012 (UTC)
For what it's worth, Theo, I feel disappointed we can't just jump up and do this too. But the fact remains that we've been looking at the options, and I'm not saying we're closing the door, just that we haven't figured out a way to make this idea practically happen at this point. That is a real obstacle we face, regardless of the amount of support or thought involved. I don't think it's fair to say the community response isn't being weighed, though, or that we should throw out endorsement as an idea. This program takes the overwhelming response quite seriously. That's precisely why I'm commenting on the obstacles here, and inviting "updates, comments, endorsements, creative solutions, and signups from volunteers or potential fellows interested in helping work on the project" from everyone involved, to see how we as a movement might make this happen. Note that we haven't gone and rejected or closed the proposal, sorry if that wasn't clear - what I've posted is a preliminary update and feedback from a first review, because I know so many people want to see it happen. What I'm saying is we don't have everything in place to make this particular dream a reality yet. And I agree that sucks, because of course it is a good idea. (Many thanks to my favorite Community Fellow Steven Zhang for drawing us into the SNOW debate, which I think we both agree has zero bearing here.) Siko (talk) 04:23, 23 August 2012 (UTC)
Hiya Siko. Thanks for explaining this; it is really hard to discern tone or gesture from written responses. I took the earlier response to mean that it was a "hard no" for the project, but your recent reply makes clear that I mistook the tone. You also want to see this develop, and I am very happy to hear that. Thanks, and sorry if my tone was out of line at any point. I really hope Ocaasi forgoes CentralAuth incorporation and tries this without the dev. resources, to take the initiative and see if this is viable or not. Thanks again, and feel free to leave a message if you ever wanna talk policies or need help with anything; Meta-pedians are more versed in them than regular Wikipedians. That's why it sometimes gets the better of us. ;) Thanks again. Theo10011 (talk) 06:11, 23 August 2012 (UTC)
Thanks, Theo, I always do so appreciate your input and offers of help :-) Agree with the tone/gesture issue, I generally think it's one of the hardest things about coordinating on-wiki and via the mailing lists! Glad to see we're on the same page, and that everyone is thinking about this together. We'll see what comes! Siko (talk) 16:31, 23 August 2012 (UTC)
I'm not sure a technical hurdle exists. Every tertiary institution in the world buys its library software off the shelf. Ocaasi, can you ask one or more of the suppliers to recommend some options? I have a relative very involved in vending ebook library software to tertiary institutions, and will approach them for advice if it comes to it, but would rather you approach your contacts first. --Anthonyhcole (talk) 19:35, 22 August 2012 (UTC)
I've explored this as much as I can with my (lack of) technical background. What we're looking for is either a SAML or an OpenID implementation. That is relatively straightforward to set up for a single resource provider; however, it involves completely integrating with the Wikimedia CentralAuth system, which is apparently why Ryan Lane at Wikimedia Labs told me it would take 6 months of full-time developer support. It is not an impossible task; it's just one that we can't yet fund, can't get developer time at the Foundation for, and can't easily crowdsource due to the rarity of people with experience in both authorization protocols AND Wikimedia code. This idea is far from dead, and Siko is just facing the same constraints I am. As soon as we find a loophole, we'll keep going, no doubt. Feel free to talk to your connections; the more feedback the better. BTW, I've described the technical challenge in some detail on the English Wikipedia project page at W:EN:WP:TWL. Cheers, Ocaasi (talk) 22:07, 22 August 2012 (UTC)
Thanks. I followed that link and understood, oooh, about a tenth of it. I'm wondering why we need to tie this to an existing Wikimedia site. Can't we create a brand new site, tailored to dovetail seamlessly into OpenID? I have one username and password for Wikipedia, and another for my university. It's no big deal to have to sign into each. If tailoring the SSO system to wiki architecture causes difficulty, and setting up a bog standard website that bog standardly fits into OpenID eliminates that difficulty, can't we do the latter? --Anthonyhcole (talk) 02:53, 23 August 2012 (UTC)

Ocaasi, I would seriously urge you to reconsider the benefit of integrating this into CentralAuth. If that is the only hold-up, I would really suggest you forgo CentralAuth integration from a resources point of view. Make a separate approval process and project; you can manually hand out access after approving certain candidates. Anthony makes a good point. This is supposed to be the first step, and it's better that you prove and test your idea first than ask for a 6-month dev. commitment just to integrate access. I would also suggest you talk to more devs besides Ryan. Ryan def. knows his stuff, but it doesn't hurt to get a second opinion, or someone else to volunteer some help. There is Quentinv57 or Pathoschild on Meta, who might have some advice, or might be able to take on some of the dev. work involved. Regards. Theo10011 (talk) 06:03, 23 August 2012 (UTC)

I will absolutely explore if there is a more lightweight way to do this using OpenID. I will ask for feedback from the WMF devs who have been consulting on the technical issues and report back once I hear anything. Thanks for your encouragement and ideas. I really want to see this happen and am open to any way of getting it done. Ocaasi (talk) 15:39, 23 August 2012 (UTC)
A number of ops and devs have been involved in the discussion. We've all decided that SAML is the only really viable option here.--Ryan Lane (WMF) (talk) 18:42, 23 August 2012 (UTC)

Technical obstacles and proposed workarounds

Integration with CentralAuth isn't necessarily the hard part. OpenID isn't really a viable solution for this project. What we really need is SAML. This is going to require developer resources. Creating another site just adds complexity (especially with regard to integration efforts later), and makes the life of ops more difficult. Ops is a bottleneck for any project. Putting more of the obstacles on the dev side and fewer on the ops side will ensure the project moves along more quickly. Additionally, how do we handle the second set of user accounts? What happens when we want to expand this program later? We should absolutely aim at integration with CentralAuth. It's only slightly more work and would have much greater impact.

The program doesn't require a paid developer. The entirety of the development can be done by volunteers. Infrastructure that mimics CentralAuth can be created inside of Labs. We can even do dry runs with the libraries from Labs. When the development work is done, we can formalize infrastructure changes made in Labs and move them to production. The entire process, excluding deployment to production, can be done by volunteers.

That said, this is a pretty large project for a volunteer. It would be much better if this was a funded short-term contract. The foundation doesn't have budget for this, at least for this year. My recommendation is to reach out to some chapters for funding, if they are interested in the project. We can look at adding this to the budget for next year, if a chapter doesn't take interest.--Ryan Lane (WMF) (talk) 18:32, 23 August 2012 (UTC)

Can someone point me to a page showing a breakdown of which chapters have which funds? Ryan, have you any idea what the budget should be for this task? --Anthonyhcole (talk) 07:42, 28 August 2012 (UTC)
I don't think you'd find that information easily, or in one place. Off the top of my head, I can only think of 4 chapters that are allowed to fundraise themselves this year; all the rest rely on grants from WMF. Even for those 4, their budgets are less than a tenth of WMF's, and they rely on a skeleton staff; almost all have no dev. resources. Theo10011 (talk) 07:56, 28 August 2012 (UTC)
I have very little chapter knowledge, but the German, UK, and New York City chapters have been mentioned as potential pathways. Ocaasi (talk) 13:24, 28 August 2012 (UTC)
Thanks. The incipient thematic organization at Talk:Wikimedia Medicine is discussing this idea and will be in a position to apply for the next round of chapter funding, but that won't be approved before 15 June 2013, which is still rather a long time off. --Anthonyhcole (talk) 14:54, 28 August 2012 (UTC)

Another interpretation of "The Wikipedia Library"

I read your proposal and it sounds good, but expensive for Wikipedia (i.e., all of those membership subscriptions to pay for). I am also always on the lookout for free databases of information. However, when I read the title of the proposal my mind jumped to something else entirely: our own virtual world library at Wikipedia. We as a collective probably have millions of useful offline sources at our disposal (as I gaze over at my bookcase). Wouldn't it be great if we could all submit our booklists (or check them off somehow) to WikiData, and then if someone needs a quick peek into one of our books, we could receive an alert and post the information (or corroborate or validate or whatever)? Similarly, it would be wonderful if we could sign up to find out if any of our favorite offline sources end up online (I have some copies of books in the public domain that are on Google Books, and it is much more efficient to search text online than to page through my books, though I will grab the book on occasion when I am not doing a simple cut & paste). The concept is of course also applicable to Wikipedians with access to "behind-the-firewall" resources, which are out of reach of other editors. A model like this would have enormous benefits, and if tracking is implemented, we could see over time where the investments need to happen (popular resources, etc.). Jane023 (talk) 08:09, 7 September 2012 (UTC)

The current concept for the Wikipedia Library would be one in which the research databases donated the resources. That would make it substantially cheaper ;) I like your idea about WikiData, but I have no idea about the technical or legal issues involved. —The preceding unsigned comment was added by Ocaasi (talk) 19:14, 7 September 2012