NEH Narrative updated

Here is an updated version of the NEH narrative. It still needs a budget narrative. Please help in any way you can.

The Wikimedia Foundation

The Wikimedia Foundation (http://www.wikimedia.org) is an international organization based in the United States, devoted to the accumulation and free distribution of information about the sciences and the humanities in every available language. To date, projects developed by the Wikimedia Foundation include an online encyclopedia, Wikipedia, available in over 150 languages, dictionaries in each of those languages (Wiktionary), free textbooks on various subjects (Wikibooks), a collection of quotes (Wikiquote), and a repository of primary source texts (Wikisource).

These projects are in various stages of development, with the English-language Wikipedia being the largest at over 300 thousand articles. It has 90.1 million words, giving a mean article length of 301 words. It also has 72,000 photographs and illustrations, 179,000 redirects (indexed titles which point to existing articles), 203,000 links to other websites and 5 million cross reference links between articles. In comparison, Encyclopedia Britannica, the best known encyclopedia available today, claims to have over 85,000 articles, 55 million words, and a mean article length of 647 words. Within just a few months, the English-language Wikipedia is expected to have over half a million articles, making it the largest encyclopedia in history.

All of this material is available free of charge under the GNU Free Documentation License and can be copied and edited at will. Everyone is invited to add and edit content, which undergoes peer review and scrutiny by experts and amateurs alike to ensure quality and increase the amount of information available. This has resulted in the emergence of a virtual community of Wikipedian volunteers of every age and from every walk of life, devoted to the development of Wikipedia and other Wikimedia projects. On April 23, 2004, this community was awarded the Prix Ars Electronica's first ever Golden Nica Award for Digital Communities, and a display about Wikipedia and the other prize winners was set up in the foyer of the United Nations.

More and more people search Wikipedia every day for information and the simple joy of learning something new. Students turn to Wikipedia regularly to help them with projects and reports, and it has been cited as a source by newspapers and courts of law. Newspapers and news sources that have cited Wikipedia include The New York Times, The Sacramento Bee, The San Diego Union-Tribune, Canada's Globe and Mail, The Western Daily News, Philadelphia Daily News, Britain's The Guardian, and, most recently, the Associated Press. For a more comprehensive listing, including online sources, see http://en.wikipedia.org/wiki/Wikipedia:Wikipedia_as_a_press_source.

The number of people around the world who use Wikipedia regularly has soared over the past year, and the combined projects now receive about 21 million hits daily. Because Wikipedians everywhere respond quickly to current events, many people have come to rely on Wikipedia as a news source, with in-depth coverage of emerging issues that generally includes hyperlinks to comprehensive articles about the people and places being discussed. For instance, within an hour of the Columbia space shuttle disaster (which was first reported to Wikipedia by a contributor in Texas who heard the explosion), complete articles appeared about the space shuttle and each of the astronauts, as well as on other related subjects. These were either written or added to in real time, based on a wide variety of sources, online and offline.

History

The idea of collecting all of the world's knowledge under a single roof goes back to the ancient libraries of Alexandria and Pergamon. The modern notion of the general purpose, widely distributed, printed encyclopedia dates from shortly before Denis Diderot and the 18th century encyclopedists.

The idea of using automated machinery beyond the printing press to build a more useful encyclopedia can be traced to H. G. Wells' World Brain essays (1937) and Vannevar Bush's future vision of the microfilm-based Memex in As We May Think (1945). Another important milestone along this path is Ted Nelson's Project Xanadu.

At the same time, the open source and free software movement founded by Richard Stallman had led to the concept of collaborative creation of works under copyright licences that ensured the equitable sharing of the end product. The most notable example of this is the Linux operating system, built from Stallman's GNU project and Linus Torvalds' operating system kernel.

Stallman had also proposed a GNU Network Encyclopedia, which would apply the same principle to the creation of an encyclopedia. However, nothing came of this proposal.

On January 15, 2001, Wikipedia (http://www.wikipedia.org) was founded by Jimmy Wales and Larry Sanger as a mere scratchpad for Nupedia, an online encyclopedia founded by Wales in March 2000. Articles submitted to Nupedia had to be peer reviewed by a cadre of experts, and authors had to be "true experts in their fields [...] and possess Ph.D.s."

In the end, only a few dozen articles were created through the complex Nupedia process. Wikipedia, it was hoped, would allow anyone to start working on articles, which could then be peer reviewed and published on Nupedia. Wikipedia passed 1,000 articles after just one month in existence. It was thus clear from the start that this project would soon become the dominant one. Nupedia was silently abandoned, its closed working model a failure.

Within a short while, Wikipedia had reached critical mass, and article numbers were growing exponentially. At the same time, it was noticed that article quality was also growing, rather than diminishing. In order to facilitate and coordinate operations, the Wikimedia Foundation, Inc., a non-profit organization registered in the state of Florida, was founded on June 20, 2003.

The aim of the Wikimedia Foundation is to spread knowledge by creating a free, high quality encyclopedia, as well as other learning resources. We believe that anyone can be a teacher by sharing their knowledge with others, and as such, everyone is invited to edit Wikimedia pages. Regardless of the scope of their participation, each author enhances the quality of the material for the next generation of users.

The project's success is evident from the international community of users and the number of articles that has emerged using the Wikimedia model. In the English-language Wikipedia alone, there are over 300,000 articles covering virtually every subject imaginable, from art history to software engineering, from theology to astrophysics.

In response to strong user demand, Wikipedias were soon set up in languages other than English. Today, Wikipedia exists in over 150 editions, ranging from very large ones like German and Japanese to tiny, emerging ones such as Cherokee and Cornish. An article may often start as a "stub", only a few words long, and will be developed into an in-depth treatment over time. Every week, the English Wikipedia community picks an article which is particularly lacking in depth and focuses on improving it. Within a few days, many articles have grown to twenty times their initial size or more using this unique collaborative process.

The number of Wikipedia users has also grown at an exponential rate since the founding of the first project, the English-language Wikipedia. Some thirty thousand people worldwide have contributed to this effort in any number of ways and in so doing, created a tightly knit community that transcends national borders and ideological boundaries. These divisions, too often over-emphasized in the "real world", are superseded by the sense of commitment to a common goal: making knowledge free and accessible to everyone.

To date, the participants have been able to keep up with the breathtaking growth of the various Wikipedias and ancillary projects. However, if we are to continue to grow at this rate (and internal figures indicate that our growth rate is exponential) while incorporating new projects and cooperating with other like-minded ventures, we will require financial assistance. Wikimedia projects operate on a volunteer basis. While more and more people have shown themselves willing to donate their time and knowledge to support these projects, continued growth will require additional servers, offline organizational components, and other tools.

We therefore ask that the National Endowment for the Humanities consider our request for $500,000 so that we can take Wikipedia and the other Wikimedia Foundation projects to the next level of development. The projects to be covered by the grant include:

Wikipedia: The largest and oldest of the projects being developed by the Wikimedia Foundation. It is an online encyclopedia, built by users around the world. The English version already has 300,000 articles, while other languages combined have some 500,000, for a total of over 800,000 articles. Each article is created and edited by the users, so that new articles are constantly appearing and older articles are constantly being improved simultaneously. There is, however, no central control of how this process takes place. Growth is natural, reflecting the interests and ideas of any number of users on any day.

That is not to say that the structure is entirely random and chaotic. Experience has shown the opposite to be true. In just a brief time, the users have organized themselves into "Wikiprojects," establishing guidelines for contributors and ensuring that similar articles follow similar formats. Examples of this include the standardized naming formats adopted for monarchs, which were hammered out after considerable negotiations among the community, or the biology taxoboxes, which are being inserted for all articles about plants and animals. Nevertheless, there is a certain fluidity within the model, so that if a significant group of users decides that some other format would be more helpful to them, it will be discussed by the community and changes implemented accordingly. In this way, the reference material consistently reflects the needs of people who would actually refer to it.

While the final product, the Wikipedia article, is what end-users see, they can also investigate each article's "History" and often extensive "Talk page," to see how the article developed, and how compromise positions acceptable to all of the participants were hammered out. In a metaphoric sense, they are able to go through all the "out-takes" that led to the final product, because these are virtually never discarded.

Wiktionary and related projects: Similar to Wikipedia in objectives is the Wiktionary project to create comprehensive online dictionaries in each of the Wikipedia languages. These dictionaries would include all words in that language and other languages (defined in that language), definitions, etymologies, examples of usage, historical usage, and translations into other languages. As an online component with limitless virtual space, it would be far more comprehensive than any standard paper dictionary. Wiktionary projects currently appear in seven languages and the frameworks for other, additional languages are already in place.

The existing Wiktionary format can be expanded even further to include other useful components such as a rhyming dictionary and a thesaurus (Wikisaurus). A related Wikiquote project, containing notable quotations organized by author and theme, is already underway. Like any other Wikimedia project, these will be constructed by the community over time and as information becomes accessible.

Wikibooks and related projects: Wikibooks, another project of the Wikimedia Foundation, has as its goal the compilation of information collected by Wikipedia and its sister projects and the transformation of this information into textbook-like formats that can also be accessible online. These textbooks, created by the same collaborative process as Wikipedia and Wiktionary, will cover a wide range of topics from the Sciences, the Humanities, and Languages on a number of different levels. This will make it possible for people not only to look up specific information but to learn particular subjects in an orderly fashion.

Some of these books will also refer readers to Wikisource, a collection of primary sources referred to by the various Wikiprojects. In the future, these and other sources will be translated into various languages so that such essential documents as the U.S. Constitution or the U.N. Universal Declaration of Human Rights will be available in full, in the reader's own language, to anyone visiting the Wikimedia Foundation website.

It is hoped that in the future, Wikibooks will form the basis of the Wikiversity, a site devoted to online learning, where people can employ the latest technologies associated with distance learning to take free courses on many different subjects. Even though the Wikiversity will not be accredited (at least initially), it will provide people with an opportunity to learn many different subjects for free, thereby expanding the role of the Internet as an educational tool.

Akin to Wikibooks are Wikireaders, two of which (Internet and Sweden) have already been launched in German. Wikireaders are small collections of Wikipedia articles on a particular theme, that are marketed by the Wikimedia Foundation and serve as a potential source of revenue in the future. Printed and sold via the Wikishop, Wikireaders provide a handy reference text on any number of subjects. Plans are underway to create a number of similar Wikireaders in English on a variety of subjects that would appeal to a mass audience, thereby increasing their salability.

Scope

Wikipedia's goals are ambitious: it aims
· to be an encyclopedia, in the normal sense of a collection of all human knowledge;
· to be freely editable by anyone (except for banned users, and excluding protected pages);
· to be open content, using the copyleft GNU Free Documentation License;
· to do all of the above in all known human languages.

Since there is no space limitation on Wikipedia because of its digital nature, it also aims to subsume the functions of specialist encyclopedias in all and any specialist subjects. Unlike a paper encyclopedia, Wikipedia can encompass articles for both elementary topics and advanced treatments of the same subject.

In addition to covering traditional encyclopedic topics, Wikipedia is able to react very quickly to current events and provide information almost as soon as they happen. Because of its Neutral Point of View (NPOV) standards, it is arguably also more accurate and less biased than regular media sources tend to be. This currency is invaluable for educational purposes, where teachers and students need up-to-date information.

It carries topics not to be found so comprehensively elsewhere on the Internet, and enables expert writers to share their knowledge. Indeed Wikipedia allows specialist scholarly material a far wider dissemination than any print media can achieve, and access to a much larger potential audience.

All modern printed encyclopedias have to make space for new topics by jettisoning old ones. This is particularly true for science and technology topics. Wikipedia provides a vehicle for a more balanced record of such topics to be maintained, thus materially contributing to history resources.

Wikipedia is particularly rich in topics relating to IT, computing and computers, and the Internet. It is also strong in media topics, such as cinema, television and music. Though the content of Wikipedia suggests something about its contributors' interests and hobbies, as Internet access becomes more of a commodity around the globe, the content of Wikipedia will undoubtedly expand in as-yet unpredicted directions.

Methodology and Standards

Wikipedia is accessed through a World Wide Web interface. All articles can be created and edited using any Web browser, without any additional software. This is accomplished through the use of wikitext, an intuitive and easy-to-learn markup system, to edit each article as plain text.

Image files to illustrate articles may be uploaded using the standard file upload function within Web browsers.

Within specialist articles certain kinds of specialist markup may be used. The most prominent is embedded TeX markup, which allows the use of the TeX mathematical typesetting system to create graphics for mathematical notation that cannot be represented in all web browsers.

When an article change is saved, the Wikipedia servers then render it as XHTML, including producing any images that may be needed, and serve this to the web browser, allowing the full typeset version of the article to be viewed.

Wikipedia makes extensive use of XHTML and Cascading Style Sheets to try to separate content from presentation. This allows the maximum possible customizability and reuse of article material, as well as accessibility for devices such as readers for the blind. All of this is designed to be backwards-compatible as far as possible with early versions of Web browsers.

"Neutral Point of View"

Wikipedia has contributors with a wide range of cultural backgrounds, religious beliefs, and political points of view. It is inevitable that such controversial topics as abortion and the teaching of evolution would spark considerable controversy on Wikipedia. We do not avoid this controversy; rather, we channel the differing views of contributors into a discussion of the various sides of an issue. The policy that defines this focus is the "Neutral Point of View", or "NPOV".

Wikipedia's NPOV (the phrase is almost always abbreviated) policy, simply stated, is that articles should be written "without bias, representing all views fairly" (http://en.wikipedia.com/wiki/Wikipedia:Neutral_point_of_view). A common rule of thumb is that any reasonable person, no matter what their opinions are, should be able to agree that a Wikipedia article is accurate.

While this policy is intended as a matter of good academic practice, it is absolutely essential to the functioning of the Wikipedia community. Wikipedia unites people of vastly differing views who are willing to work with each other because they share the goal of providing a non-biased resource that is useful for everyone. Rather than reject ideological conflict between users, we actually embrace it, so long as it is conducted with civility and mutual respect. The Wikipedia attitude is that such debate in the Talk pages actually enriches the article. Open dialogue ensures that one position does not overshadow the other, and that the article reflects the views of everyone, while never abandoning its intellectual neutrality.

This makes Wikipedia an invaluable tool for people researching contemporary American and international issues. All major aspects of controversial issues are presented, all arguments can be found. This is the result of contributors representing the intersection of a common interest (the improvement and enhancement of the project) and their partisan interests, whatever they may be. Fifty years from now, even the history and discussion evoked by an occasional edit war will be a valuable resource to researchers attempting to understand the personal positions that were at stake. As Wikipedia grows, the ability to trace those issues and their subsequent development will remain invaluable.

Open peer review and the removal of errors

Apart from the Neutral Point of View policy, another distinguishing characteristic of Wikipedia is open peer review. Every page is open to editing by any person, and those edits are visible to everyone. Any user can correct an error, enlarge an article, amplify a point or copyedit at any time. It is also true that this open access permits the insertion of vandalism or bad information; but the open real-time nature of the editing process, and the complete audit trail kept of all pages, allows the rapid removal of any bad edits. This is accomplished by "recent changes" and "watchlist" features in the software that allow edits to be followed. In effect, the Wikipedia readership functions not only as a pool of editors, but as a review board.

Where there are differences of opinion that need to be resolved between editors, a system of talk pages allows changes to pages to be discussed.

Finally, persistently abusive users can be blocked from editing, although Wikipedia policy is to do this only as a last resort.

Organization of and access to material

The articles are organized as a series of single web pages, linked together by hyperlinks. Unlike with the rest of the Web, Wiki hyperlink syntax is very simple, and the address of the linked page is simply its title. (These are then resolved by the software into normal HTML hypertext links at render time.)

When an article grows too long to be comfortably read as a single web page, it will be split up into sub-articles. In this way, what would have been a single lengthy article in a paper encyclopedia will typically correspond to perhaps ten or so Wikipedia pages, with one master page acting as a summary page and table of contents for the topic.

However, the non-hierarchical nature of hyperlinks within Wikipedia actually allows much more sophisticated linking patterns.

One common pattern is the linking of words within articles without knowing whether they yet refer to an article. In many cases, these links will not correspond to any article, leaving a "red link" which is shown to warn users that this link does not as yet correspond to any article. As time goes by, these unresolved links provide a stimulus for editors to write new articles covering their topics.

Alternatively, this "accidental" linking will lead to a link to a pre-existing article, providing readers and editors with links to further material which may be related to the topic of the article.

A set of naming conventions exists to make it as likely as possible that a new article will match pre-existing hyperlinks, or vice versa.
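
The link handling described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual wiki software: the function name render_links, the /wiki/ address prefix, the "new" style class used to mark links without an article, and the sample titles are all invented for the example.

```python
import html
import re
from urllib.parse import quote

# Matches [[Title]] and [[Title|label]] style links.
LINK_PATTERN = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def render_links(wikitext, existing_titles):
    """Turn wiki links into HTML anchors, flagging links to missing articles."""

    def replace(match):
        title = match.group(1).strip()
        label = (match.group(2) or title).strip()
        # Assumed URL scheme: the page address is derived from its title.
        href = "/wiki/" + quote(title.replace(" ", "_"))
        if title in existing_titles:
            return '<a href="{}">{}</a>'.format(href, html.escape(label))
        # No article yet: mark the link so it can be shown in red.
        return '<a class="new" href="{}">{}</a>'.format(href, html.escape(label))

    return LINK_PATTERN.sub(replace, wikitext)


existing = {"Denis Diderot", "Encyclopedia"}
source = "[[Denis Diderot]] edited an [[Encyclopedia|encyclopedia]]; see also [[Memex]]."
print(render_links(source, existing))
```

In this sketch the link to "Memex" has no matching title, so it is rendered with the marker class, which is the point at which a reader would see a red link and an editor might be prompted to write the missing article.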

Storage, maintenance, and protection of data

All of the Wikipedia data is stored in a MySQL relational database. Multi-gigabyte backups of the database are taken at regular intervals to staging machines, and a number of people periodically download and save these backups over the Internet.

Every version of every article is saved to the database, and so each article has a complete audit trail of every edit. This is the principal measure against casual vandalism of Wikipedia: it takes longer to vandalize an article than it does to revert it to a known good previous version.

As well as an audit trail for every article, there is also a publicly visible record of "recent changes". This is monitored by a large number of users, who usually rapidly pick up on suspicious patterns of behavior, and intervene to prevent vandalism.
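
Why reverting is cheaper than vandalizing can be seen in a minimal Python sketch of an append-only revision history. The class and method names here (Article, Revision, save, revert_to) are hypothetical and chosen for illustration; the real data lives in the MySQL database described above, whose schema is not shown here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Revision:
    author: str
    text: str


@dataclass
class Article:
    title: str
    revisions: List[Revision] = field(default_factory=list)

    def save(self, author, text):
        """Append a new revision; earlier revisions are never overwritten."""
        self.revisions.append(Revision(author, text))

    def current_text(self):
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, index, author):
        """Restore an earlier revision by saving a copy of it as the newest one."""
        self.save(author, self.revisions[index].text)


article = Article("Alexandria")
article.save("alice", "Alexandria was home to a famous ancient library.")
article.save("vandal", "asdfghjkl")
article.revert_to(0, "bob")  # a single call undoes the vandalism
print(article.current_text())
print(len(article.revisions))  # the bad edit remains in the audit trail
```

Because a revert is itself just another saved revision, nothing is ever lost: the vandalized version stays in the history, where it can be inspected later.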

Articles are written and stored in a simple text format with additional formatting codes, which is readable by humans and can be translated automatically into different formats such as HTML or PDF. Metadata is provided in Dublin Core using RDF technologies. Efforts are under way to standardize the article syntax, making it even easier for other projects to reuse Wikimedia content.

Work Plan

The Wikimedia Foundation is already a fully functional organization that has proven its greatest strength to be the volunteers who work around the clock to build up the various projects. Our plan is to continue working in this same manner, while directing the efforts of contributors to those areas that the Foundation deems weak. This will be achieved by appointing a team of experts on various topics, who will not only assess the quality of the information, but point to the significant lacunae and help to identify people who can fill in the gaps.

Two projects, one proposed and the other already in existence, already provide the essential framework for this effort. The validation proposal recommends that we draw experts from our team of volunteer contributors, who will be asked to assess the quality of articles on various topics. They would also appoint people to survey the content for egregious (and not so egregious) errors and fix them. Once an article has been fixed, whether by the community or by individually appointed editors, it would be rechecked and, if found to be comprehensive, factual, and well written, marked as validated. Contributors would continue to be able to edit the article; however, their edits would have to be examined before they are incorporated into the validated version.

The structure for this would be based on the Wikiprojects model, by which groups of contributors with similar interests cooperate to establish a format and create or edit articles in that particular field. This ensures that related articles follow similar formats with identical (or similar) headings, taxonomy boxes, structure, and style. This is important to ensure that users will find similar groupings of information, while for the contributors it will ensure that quality is maintained throughout, even as the articles are being created and developed.

Throughout the course of this grant, we hope to implement this structure for approximately 500,000 articles on Wikipedia and to develop a similar structure for Wiktionary. The process as a whole is intended to aid and direct, rather than hinder, the phenomenal natural growth of these projects by ensuring that they remain on course. The structure for the various validation teams will be based on the "Categories" model already in place. In the first six months (April-October, 2005) we will identify about 50 key themes and validators who can oversee work on them, employing the democratic model of all Wikimedia projects to select these people. Over the next four months (October-February) these validators will finalize their teams and, with the help of these teams, create lists of target objectives to be achieved. The validation process will begin in February and last fourteen months, with additional teams and sub-teams created as deemed necessary throughout the entire timeframe.

At the same time, the Wikimedia Foundation plans to create a physical center for all of its international projects, to be located in Florida. This center may be augmented by regional centers across the United States and national centers around the world as budget permits. The process of locating and staffing such a center would take approximately six months from the receipt of the grant.

Staff

Wikipedia has been edited by thousands of people (referred to as Wikipedians). There is no editor-in-chief, as such. The two people who founded Wikipedia are Jimmy Wales (former CEO of the small Internet company Bomis, Inc.) and Larry Sanger. For the first thirteen months, Sanger was paid by Bomis to work on the project. Sanger was said to have taken a role of mediator at times, making decisions on issues that aroused contention. This was based not on formal authority, but on demands from users at large. Funding ran out for his position, leading to his resignation in February of 2002. Other current and past Bomis employees who have done some work on the encyclopedia include Tim Shell, one of the co-founders of Bomis, and its current CEO, and programmers Jason Richey and Toan Vo.

Wikimedia expects to continue experiencing exponential growth in the near future. As the number of articles, editors, and projects continues to grow, a paid staff to coordinate, evaluate, and develop content in areas currently underrepresented in the projects will become necessary.

Needed staff members include:

A Projects Coordinator will oversee the work of the volunteer editors at each of the Wikimedia projects. The Projects Coordinator will determine what subjects are ready to be published as printed topical encyclopedias or Wikibooks, and will then be responsible for ensuring the quality of the published work. This person will also recommend areas for future growth in the various projects, and will work with the Public Relations Coordinator to expand the volunteer community.

A Public Relations Coordinator will ensure that the public as a whole is aware of Wikipedia and the other Wikimedia projects. This person will also act as a press contact for the Foundation. The Public Relations Coordinator, in consultation with the Projects Coordinator, will recruit volunteer editors from academic disciplines underrepresented in the project. The areas most in need of further editors at this time are the arts and social sciences.

Full-time developers will be hired to work on the MediaWiki software that is used to receive and display contributions to the Wikimedia projects. Additional "programming bounties" may be offered to volunteer developers in order to complete particular tasks.

A full-time systems administrator is also needed to maintain the Wikimedia servers, perform backups and coordinate necessary hardware purchases.

A Chief Financial Officer will be needed to coordinate the financial affairs of the Foundation. Working with the CFO will be a Development Coordinator, who will coordinate all fundraising activities of the Foundation.

Dissemination

Wikipedia is distributed both online and in print. Until now, the print versions have included only the German-language WikiReaders. Wikimedia aims to expand these to cover a variety of topics in English and other languages. Plans for a print version containing a broader selection of articles are also underway. Online and offline distribution of selected topics, and of the complete encyclopedia, to schools both within America and in developing countries is expected to occur during the period of this grant.

Professional practices

Although Wikipedia is not maintained by professional librarians, a great deal of work has gone into trying to maintain consistency throughout the Wikipedia. In particular:
· multiple meanings of the same term are dealt with by so-called disambiguation pages;
· multiple terms related to the same article are dealt with by redirect pages.

A number of software tools are available to aid editors in maintaining this consistency, by checking and flagging possible errors for human review. These tools are also used to find possibly broken or bad content, such as very short articles.

A category system has recently been added to the software, and categories are rapidly being added to existing articles. The category system allows multiple systems of categories, and will allow both the use of informal categories and standardized category systems such as the Dewey Decimal System and the Library of Congress classification system. Categories can themselves be categorized, allowing the creation of category trees. It is intended that automatic tools will be written to mine information from article categories.
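
As a rough illustration of how an automatic tool might mine such category trees, the following Python sketch walks from a category down through everything filed beneath it. The category names and the parent_categories mapping are hypothetical sample data, and the function name descendants is an assumption made for the example; none of this is taken from the actual software.

```python
from collections import defaultdict

# Hypothetical data: each category lists the categories it belongs to.
parent_categories = {
    "Astrophysics": ["Physics", "Astronomy"],
    "Physics": ["Science"],
    "Astronomy": ["Science"],
    "Science": [],
}

# Invert the mapping so each category knows its direct subcategories.
subcategories = defaultdict(list)
for child, parents in parent_categories.items():
    for parent in parents:
        subcategories[parent].append(child)


def descendants(category, seen=None):
    """Collect every category beneath `category`, guarding against cycles."""
    seen = set() if seen is None else seen
    for child in subcategories.get(category, []):
        if child not in seen:
            seen.add(child)
            descendants(child, seen)
    return seen


print(sorted(descendants("Science")))
```

Because a category may belong to more than one parent (here "Astrophysics" sits under both "Physics" and "Astronomy"), the sketch tracks the categories it has already visited so that each one is counted only once.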

A great many manual indices within Wikipedia have already been compiled in the form of lists. Examples include lists of years, lists of people by profession, lists of inventions and so on. These are carefully maintained in simple formats. Many of these contain information that is available to be added to the category system.

National standards

Wherever possible, Wikipedia has used pre-existing standards. For example, it uses ISO language codes to describe the various languages supported, and keeps its timestamps using the UTC date/time scheme. Date and time formats within articles are automatically parsed, so that dates and times can be presented in any needed format.
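
The benefit of keeping timestamps in UTC and formatting them only at display time can be shown with a short Python sketch. The format names, the format strings, and the function name present are assumptions made for illustration; they do not describe the actual date-handling code.

```python
from datetime import datetime, timezone

# One stored value: the date Wikipedia was founded, kept in UTC.
# (The time of day is a placeholder chosen for the example.)
stored = datetime(2001, 1, 15, 12, 0, 0, tzinfo=timezone.utc)

# Hypothetical per-reader display preferences.
formats = {
    "iso": "%Y-%m-%d %H:%M",
    "us": "%m/%d/%Y %H:%M",
    "european": "%d.%m.%Y %H:%M",
}


def present(timestamp, preference):
    """Render a UTC timestamp in the format a reader prefers."""
    return timestamp.strftime(formats[preference])


for name in formats:
    print(name, present(stored, name))
```

A single stored value serves every reader; only the final formatting step differs.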

Character sets and encodings

All Wikipedias support the Unicode character set, either directly through the use of the UTF-8 character encoding or by the use of HTML entities. The few remaining non-UTF-8 Wikipedias use ISO 8859-1, and are in the process of being converted one by one to UTF-8 encoding.
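
The two approaches mentioned above, and the conversion between encodings, can be illustrated with Python's standard library. The sample strings are chosen for the example only, and this is a sketch of the general technique rather than the conversion procedure actually used on the servers.

```python
import html

# A title as it might be stored on a wiki that still uses ISO 8859-1.
latin1_bytes = "Encyclop\u00e9die".encode("iso-8859-1")

# Conversion: decode with the old encoding, then re-encode as UTF-8.
text = latin1_bytes.decode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

# Characters outside ISO 8859-1 can still be carried as HTML entities.
greek = "\u03a9"  # Greek capital letter omega
as_entities = greek.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

print(len(latin1_bytes), len(utf8_bytes))
print(as_entities)
print(html.unescape(as_entities) == greek)
```

The accented letter occupies one byte in ISO 8859-1 and two bytes in UTF-8, which is why the stored bytes must be re-encoded rather than simply relabelled when a wiki is converted.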

Budget