Web2Cit: collaborative automatic citations for web sources

November 29, 2022 - New and updated translations

Web2Cit server v1.2.0 has been released and deployed, including new and updated translations from Translatewiki contributors. The full list of new and updated translations is available in the corresponding changelog.

November 8, 2022 - Web2Cit monitor is up and running!

The Web2Cit community collaboratively defines translation tests, which indicate the expected output for specific webpages. Knowing the expected output for a webpage helps Web2Cit contributors define translation templates (which indicate procedures to extract citation metadata) or fix them when webpages change.

From today, the Web2Cit monitor regularly compares these expected outputs against the actual output returned by Web2Cit, and writes these results to domain result pages on Meta-Wiki. Domain checks are run any time a configuration file is changed, or after 30 days from the last check. You may add any of these result pages to your watchlist to find out when test results change and update configuration if necessary. An overview of test results can be found in the monitor's overview page.
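The scheduling rule described above (re-check when a configuration file changes, or 30 days after the last check) can be illustrated with a minimal sketch. This is not the monitor's actual code: the function name `needs_check` and its parameters are hypothetical, chosen only to make the rule concrete.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: the "30 days from the last check" rule is a simple time delta.
RECHECK_INTERVAL = timedelta(days=30)


def needs_check(
    last_check: Optional[datetime],
    last_config_change: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True if a domain should be checked again (hypothetical helper)."""
    # A domain that has never been checked is always due.
    if last_check is None:
        return True
    # A configuration file changed after the last check.
    if last_config_change is not None and last_config_change > last_check:
        return True
    # Otherwise, re-check once 30 days have passed since the last check.
    return now - last_check >= RECHECK_INTERVAL


if __name__ == "__main__":
    now = datetime(2022, 11, 8, tzinfo=timezone.utc)
    print(needs_check(now - timedelta(days=31), None, now))
```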

The Web2Cit monitor was developed by Web2Cit project team member Dennis Tobar. You may read more about it here.

October 27, 2022 - Research results

The Web2Cit research team has been developing an approach to automatically estimate the accuracy of automatic citations in Wikipedia. Their findings have just been published in their final report.

Also, they have come up with a way to automatically generate Web2Cit tests from their research results. You can read more about this here.

October 26, 2022 - Web2Cit server v1.1.0

Web2Cit server v1.1.0 is available at https://web2cit.toolforge.org/. Differences from the latest beta version include:

  • New and updated translations from Translatewiki.
  • Favicon.

See the alpha and beta release announcements below for changes since the last version deployed to production, v1.0.4.


[Image: Web2Cit logo with English legend]

Web2Cit now has a logo! To the left, the handwritten "Web" represents metadata on the web, irregular and disorganized. To the right, "Cit" written with a squared font represents structure and order of citation metadata. In between, three intertwining arrows in the colors of the Wikimedia community resemble the "2" in "Web2Cit" and represent the Web2Cit community bridging the web and the citation worlds.

The logo is available in different formats.

October 18, 2022 - Web2Cit server v1.1.0 beta available

The beta version of Web2Cit server v1.1.0 is already available at https://w2c-beta.toolforge.org/ and will soon be deployed to the production server.

Most important changes include:

  • Web2Cit now follows configuration file redirections, useful for domain aliases such as www.example.com and example.com. Read more about domain aliases in Web2Cit here.
  • JSON schema files (used to customize the JSON configuration file editor) are now served from the server. Hence, the editor launched from the test instance at https://w2c-beta.toolforge.org/ automatically includes the latest changes.

See the server's changelog for the full list of changes.

In addition, the Web2Cit user script now supports using an alternative Web2Cit server, so users who want to try the latest features can have the user script use the beta instance instead. More about this setting here.

September 30, 2022 - Collaborative translation on translatewiki.net

Web2Cit is now available for collaborative (language) translation on translatewiki.net!

For now, only the Web2Cit server interface is available for translation, not including the JSON editor.

Enabling translation for the Web2Cit JSON editor is planned for both its interface and contents, and will be available under the same Web2Cit translatewiki.net project. In the meantime, you may use automatic translation provided by some web browsers.

September 21, 2022 - Alpha release of Web2Cit Server v1.1.0 out for testing, including JSON-LD selection!

Alpha release 2 of the upcoming v1.1.0 of the Web2Cit Server is available for testing at our development instance: https://w2c-beta.toolforge.org/.

The most important change is that the web2cit library (Web2Cit Core) has been updated from v1.0.1 to v2.0.0-alpha.1, which among other changes includes JSON-LD selection!

Other alpha and beta versions will be deployed to w2c-beta.toolforge.org in the coming days for further testing.

After some testing, v1.1.0 will be deployed to the production instance: web2cit.toolforge.org.


Note that the JSON editor won't show JSON-LD selection because it is still using JSON schemas from Web2Cit Core v1 (see T318352). You may try this temporary workaround:

  1. locate this parameter on the JSON editor URL &schema=https%3A%2F%2Fraw.githubusercontent.com%2Fweb2cit%2Fw2c-core%2Fmain%2Fschema%2Ftemplates.schema.json, and
  2. replace it with &schema=https%3A%2F%2Fraw.githubusercontent.com%2Fweb2cit%2Fw2c-core%2Fv2%2Fschema%2Ftemplates.schema.json.
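If you prefer not to edit the URL by hand, the workaround above can be sketched in a few lines of Python. This is only an illustration: the helper name `use_v2_schema` is hypothetical, the example editor URL is a placeholder, and the sketch assumes the `schema` parameter lives in the URL's regular query string (not in the fragment).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Schema URL from step 2 above, in decoded form.
V2_SCHEMA = (
    "https://raw.githubusercontent.com/web2cit/w2c-core/"
    "v2/schema/templates.schema.json"
)


def use_v2_schema(editor_url: str) -> str:
    """Point the `schema` query parameter at the v2 templates schema.

    Hypothetical helper; assumes `schema` is a regular query parameter.
    """
    parts = urlsplit(editor_url)
    # Decode the query into (key, value) pairs, keeping their order.
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Swap only the `schema` value; leave every other parameter untouched.
    params = [(k, V2_SCHEMA if k == "schema" else v) for k, v in params]
    # urlencode percent-encodes ":" and "/" again, as in the original URL.
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Placeholder editor URL, for illustration only.
    example = (
        "https://example.org/editor?title=Foo"
        "&schema=https%3A%2F%2Fraw.githubusercontent.com%2Fweb2cit"
        "%2Fw2c-core%2Fmain%2Fschema%2Ftemplates.schema.json"
    )
    print(use_v2_schema(example))
```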

Also note that the Web2Cit user script will still be using the production Web2Cit server. A setting to have it use an alternative instance has been proposed in T318195.

September 7, 2022 - New homepage!

Our homepage has been completely updated. Until now, it was still a page about why the project made sense and what we wanted to do, which was of little use to most users wanting to know how to use Web2Cit. We hope that the new homepage will help potential Web2Cit users quickly understand what it is and how to use it. It also includes links to more detailed documentation resources, which are still under construction. We hope you find our new homepage useful!

August 23, 4 PM UTC: Last workshop

On August 23, at 4 PM UTC, we will hold a workshop (probably our last) in the context of the Web2Cit grant. It will take place as part of the LD4 community calls. Here's the Zoom link to join.

Here is the agenda for the meeting.

June 13, 2022 - Translation tests

Web2Cit now supports translation test configuration files! Just open any URL using our translation service, and you will get the expected outputs and matching scores next to the translation outputs (currently available for testing at w2c-beta.toolforge.org; soon to be available at web2cit.toolforge.org too!). Make sure there is a translation test configured for the URL that you are checking. If there isn't one, just click "edit" beneath "Expected output" to create it.

This will enable a test-driven approach to writing translation templates:

  1. Identify a URL you are having trouble with.
  2. Open it with the Web2Cit translation service.
  3. Define a translation test for it, indicating the expected translation output.
  4. Define a translation template (or adjust existing templates) to match the URL's translation output with its expected output.

In addition, to quickly check how Web2Cit translation is working for a domain, you can check translation outputs and scores for all paths defined in its configuration files. Just go to https://w2c-beta.toolforge.org/translate?domain=<YourDomain>&tests=true (soon to be available at web2cit.toolforge.org too). A shortcut from the translation service's homepage is planned (see T310518).
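As a small convenience, the domain-wide test URL described above can be built programmatically. The sketch below only assembles the URL from the pattern given in this announcement; the function name `domain_tests_url` is hypothetical, and nothing is assumed about the format of the page it points to.

```python
from urllib.parse import urlencode

# Beta instance mentioned in this announcement.
BETA_BASE = "https://w2c-beta.toolforge.org"


def domain_tests_url(domain: str, base: str = BETA_BASE) -> str:
    """Build the URL that shows translation outputs and scores for a domain.

    Hypothetical helper following the /translate?domain=...&tests=true pattern.
    """
    # urlencode escapes any characters that are not safe in a query string.
    query = urlencode({"domain": domain, "tests": "true"})
    return f"{base}/translate?{query}"


if __name__ == "__main__":
    print(domain_tests_url("example.com"))
```

Passing a different `base` would target another instance once the feature is deployed there.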

You can also check the draft Translation tests section in our early adopters guide to learn how scores are calculated.