Interwiki bot access protocol
This page is kept for historical interest. Any policies mentioned may be obsolete. If you want to revive the topic, you can use the talk page or start a discussion on the community forum.
Note: Most of this is obsolete, as it has been replaced by the API interface. The interwiki bots should be modified to use the existing Query API to significantly reduce server load. --Yurik 07:39, 4 September 2006 (UTC)
This page documents a MediaWiki interface and implementation that allows interwiki bots to get the information they need with considerably less server load and bandwidth usage.
In the present implementation, the bot requests all the data through the Special:Export interface, parses each page, recognizes interwiki links and disambiguation templates, follows the interwiki links to the other sites to fetch their data as well, assembles the results, and updates the pages with the new interwiki links.
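The parsing step of the present implementation can be sketched as follows. This is only an illustration, not the code of any actual bot: the `LANGUAGE_PREFIXES` set, the regular expression, and the `extract_interwiki_links` helper are assumptions made for this example, and a real bot would need the full per-site prefix list and more careful wikitext handling.

```python
import re

# Hypothetical subset of language prefixes; a real bot would load the full list.
LANGUAGE_PREFIXES = {"de", "en", "fr", "nl"}

# Matches [[prefix:Title]]. A leading colon, as in [[:nl:Huis]], marks an
# ordinary inline link rather than an interwiki link, so it is not matched.
INTERWIKI_RE = re.compile(r"\[\[\s*([a-z][a-z\-]*)\s*:\s*([^\]\|]+?)\s*\]\]")


def extract_interwiki_links(wikitext):
    """Return {language prefix: title} for interwiki links found in wikitext."""
    links = {}
    for prefix, title in INTERWIKI_RE.findall(wikitext):
        if prefix in LANGUAGE_PREFIXES:
            links[prefix] = title
    return links


sample = (
    "Ein Haus ist ein Gebäude.\n"
    "[[Kategorie:Bauwerk]]\n"
    "[[en:House]]\n"
    "[[fr:Maison]]\n"
    "[[:nl:Huis]]"
)
found = extract_interwiki_links(sample)
print(found)
```

Every page has to be downloaded in full before this step can run, which is the overhead the proposal below tries to avoid.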
To substantially optimize this process, the following has been proposed:
- The bot should be able to request just the interwiki links for needed pages
- A format similar to Special:Export request can be used here
- Example: http://de.wikipedia.org/w/api.php?action=parse&format=xml&text={{:Test}}{{:Bot}}{{:Haus}}&prop=langlinks (prop can be langlinks, categories, links, templates, images, externallinks)
- The bot needs to know if the page is a disambiguation page.
- The bot maintains a list of all disambiguation templates for all sites. The bot can send the disambiguation template names as part of the request, so that the server can flag whether a given page is a disambiguation page.
- The bot needs to know when the page is a redirect.
- Resolving example: http://de.wikipedia.org/w/api.php?action=query&prop=langlinks&titles=Main%20Page&redirects
- Checking example: http://de.wikipedia.org/w/api.php?action=query&list=allpages&apfrom=Main_Page&aplimit=1&apfilterredir=redirects (&apfilterlanglinks=withlanglinks)
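A bot could assemble a redirect-resolving request like the one above for several pages at once. The following is a minimal sketch, assuming that multiple titles are joined with `|` and that the presence of the `redirects` parameter is enough to switch on redirect resolution; the `build_langlinks_url` helper is a hypothetical name introduced here, and the sketch only builds the URL without sending it.

```python
from urllib.parse import urlencode


def build_langlinks_url(site, titles, resolve_redirects=True):
    """Build a query URL asking for the language links of the given titles."""
    params = [
        ("action", "query"),
        ("prop", "langlinks"),
        ("titles", "|".join(titles)),  # several pages in a single request
        ("format", "xml"),
    ]
    if resolve_redirects:
        # An empty value is sent; the parameter's presence acts as the flag.
        params.append(("redirects", ""))
    return "http://%s/w/api.php?%s" % (site, urlencode(params))


url = build_langlinks_url("de.wikipedia.org", ["Test", "Bot", "Haus"])
print(url)
```

Batching titles this way means one request replaces a separate Special:Export fetch per page.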
More general format proposed on IRC
MediaWiki users should be able to request data by giving a list of needed pages and needed properties. For example:
- Request
Get properties interwiki, template, followredirects for pages A,B,C
- Response
- A: iwlinks [x1,y1]; templates [aa1,bb1]
- B: DOES NOT EXIST
- C: REDIRECT to D
- D: iwlinks [x2,y2]; templates [aa2,bb2]
All this could be incorporated into the Special:Export result format.
See http://de.wikipedia.org/w/api.php?action=parse&format=xml&page=BKL&prop=categories as an example.
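The request and response for pages A, B, C above can be modelled with a small in-memory sketch. Everything here is hypothetical and serves only to make the proposed semantics concrete: the `store` dictionary stands in for the wiki database, and `resolve_pages` is not an existing MediaWiki function.

```python
def resolve_pages(store, titles, properties):
    """Answer a request for `properties` of `titles`, following redirects.

    `store` maps a title either to {"redirect": target} or to a dict of
    property lists. Missing titles are reported as non-existent.
    """
    response = {}
    queue = list(titles)
    while queue:
        title = queue.pop(0)
        if title in response:
            continue  # already answered; this also stops redirect loops
        page = store.get(title)
        if page is None:
            response[title] = "DOES NOT EXIST"
        elif "redirect" in page:
            response[title] = "REDIRECT to " + page["redirect"]
            queue.append(page["redirect"])  # report the target page too
        else:
            response[title] = {p: page.get(p, []) for p in properties}
    return response


# Hypothetical data mirroring the example: B is missing, C redirects to D.
store = {
    "A": {"iwlinks": ["x1", "y1"], "templates": ["aa1", "bb1"]},
    "C": {"redirect": "D"},
    "D": {"iwlinks": ["x2", "y2"], "templates": ["aa2", "bb2"]},
}
result = resolve_pages(store, ["A", "B", "C"], ["iwlinks", "templates"])
for title, answer in result.items():
    print(title, answer)
```

The redirect target D appears in the response even though it was not requested, so the bot does not need a second round trip to follow the redirect.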
See Also
- Similar initiative for individual page data and CSV-formatted data at the REST page.