Maps server setup tasks

Architecture and issues

Database server

All the geospatial data is stored in a PostgreSQL database running the PostGIS extension.

Problems to solve:

Replication from OSM
Do we need two PostgreSQL instances, or will a single one do (for WMF and Toolserver)?
How are we going to handle massive data imports for people who need them?
How are we going to handle additional metadata that the Wikimedia community might want to put into "Wikimedia" OSM? Separate database? Modified OSM schema? Something else?
How scalable will one server be?

Rendering machinery

Rendering is normally a batch, background task. In some architectures, however, it is done on demand: a tile that is not found in the cache is rendered when it is first requested.
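The on-demand variant can be sketched roughly as follows. This is only an illustration of the "serve from cache, otherwise render" idea: the `get_tile` and `tile_path` helpers, the `z/x/y.png` directory layout and the `render` callback are all assumptions made for the example, and this is not how mod_tile actually stores its tiles.

```python
import os


def tile_path(cache_dir, z, x, y):
    """Hypothetical on-disk layout: <cache_dir>/<z>/<x>/<y>.png."""
    return os.path.join(cache_dir, str(z), str(x), f"{y}.png")


def get_tile(cache_dir, z, x, y, render):
    """Return the tile bytes, rendering and caching the tile if it is missing.

    `render` is any callable taking (z, x, y) and returning PNG bytes;
    it stands in for the real renderer.
    """
    path = tile_path(cache_dir, z, x, y)
    if os.path.exists(path):
        # Cache hit: serve the previously rendered tile.
        with open(path, "rb") as f:
            return f.read()
    # Cache miss: render now, store the result, then serve it.
    data = render(z, x, y)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return data
```

With this shape, the renderer is only invoked once per tile until the cached file is removed or expired, which is where scheduling and expiry policy would come in.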

There are different renderers available:

Mapnik

Mapnik rendering currently works through a combination of Apache mod_tile and renderd. There may be some scalability issues, as mod_tile can talk only to a single renderd instance (one machine). There are also issues with web crawlers and massive database imports, since those generate load spikes in the rendering infrastructure.

As far as I can tell, Mapnik is by far the most scalable of the options. A single server can so far handle the full load of OpenStreetMap fairly well. Osmarender, in comparison, needs hundreds of clients to achieve the same. Mapnik also scales well because it renders tiles on the fly and thus does not need to render all those tiles that would never get served before they are outdated again. If that is still not enough, there are patches that allow renderd to be distributed across a LAN, further increasing scalability ( http://trac.openstreetmap.org/ticket/2005 ). They haven't yet been merged into renderd, as it turned out that there was no need for them on OSM, but if Wikipedia needs them, I can try to get them merged. --Apmon

Osmarender (XSLT-based), or/p (Perl-based), producing SVG output
Tiles@home distributed rendering based on Osmarender
Kosmos real-time rendering, Windows-only
http://cartagen.org/, JavaScript

Mapnik has its advantages, though once the "Mapnik" PostGIS database is set up, it's equally easy to render maps with other software such as GeoServer.

Tile software options:

Problems to solve:

Which renderer do we support? Or do we go for all?
How do we schedule rendering jobs?
How do we control and contain them?
How do we collect statistics and measure improvement?
1. What statistics do you need? mod_tile and renderd come with a bunch of ways of measuring performance. E.g. http://munin.openstreetmap.org/openstreetmap/tile.openstreetmap.html#Renderd shows the rendering throughput of the OSM tile server. There are more stats in mod_tile that haven't been deployed yet, and it might not be too hard to add more. --Apmon

Tile serving

Tiles should be served as fast as possible. We will probably need to measure a lot here.

Problems to solve:

On-demand generation? Pre-generate all?
How to spread the load? How many machines?
Go for a simple web server like thttpd and/or use some cache like Varnish or Squid? Some other solution? (The guys running the NL Tile Server are using Cherokee and appear to have measured a lot.)
1. mod_tile works quite nicely together with Squid: it supports HTCP cache expiry for newly rendered tiles and uses some heuristics to improve expiry times and cacheability.
How do we collect statistics and measure improvement?
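To put the "pre-generate all?" question into numbers, here is a small back-of-the-envelope calculation. It assumes the usual slippy-map tiling scheme, in which zoom level z covers the world with 2^z by 2^z tiles; the maximum zoom of 18 used in the note below is an assumption for illustration, not a decision recorded on this page.

```python
def tiles_at_zoom(z):
    """Number of tiles covering the whole world at zoom z (2^z columns x 2^z rows)."""
    return 4 ** z


def tiles_up_to(max_zoom):
    """Total number of tiles for all zoom levels from 0 to max_zoom inclusive."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))
```

Under those assumptions, `tiles_at_zoom(18)` is 68,719,476,736 and `tiles_up_to(18)` is 91,625,968,981, most of which are empty ocean or rarely viewed areas. That is the main argument for rendering high zoom levels on demand and pre-generating only the low ones.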

Stylesheet management

A stylesheet gives instructions to the renderer. Advanced users will probably want to play with new stylesheets for the maps.

Problems to solve:

Internationalization - how?
Should somebody need to regenerate a whole planet to test a stylesheet? Should test rendering be handled differently from production rendering?
What will be the process for putting a new stylesheet into production?
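One way to avoid regenerating a whole planet for a stylesheet test is to render only the tiles covering a small bounding box. The sketch below uses the standard slippy-map conversion from latitude/longitude to tile numbers; the function names and the idea of feeding the resulting list to a test renderer are assumptions for illustration.

```python
import math


def deg2tile(lat, lon, zoom):
    """Convert WGS84 latitude/longitude to slippy-map tile (x, y) at the given zoom."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the map edge still fall on a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_bbox(min_lon, min_lat, max_lon, max_lat, zoom):
    """List every (zoom, x, y) tile that intersects the bounding box."""
    x_min, y_min = deg2tile(max_lat, min_lon, zoom)  # top-left corner
    x_max, y_max = deg2tile(min_lat, max_lon, zoom)  # bottom-right corner
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]
```

For example, `tiles_in_bbox(13.0, 52.3, 13.8, 52.7, 10)` yields the handful of zoom-10 tiles around Berlin, which is enough to judge a style change without touching the production tile set.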

Presentation to the user

We need statistics on the current usage of Geohack and WMA tiles

Webstats for Geohack and WMA

Static embedding (priority?)
JavaScript "Slippy map" implementation - needs very scalable tile serving

OpenStreetMap architecture

(Source: Openstreetmap:Develop)

Ptolemy: production OSM database server

master postgres instance

Server setup

Partition the server
- set up a separate partition for postgres db logs
- separate partition for database

Main OSM mirror database

mirror production osm main database
procedure (scripts) to regularly update our OSM database with new OSM changesets

Questions:

what will be mirrored? (see [1])
- the current-tables
- the history-tables
- the raw-tables
how could this be mirrored?
- only current can be imported from a planet.osm
how often should this be updated?
is access needed from Ortelius (tile server) or just from Cassini (toolserver)?
will there be access from Cassini?

Mapnik database

mapnik rendering database (with PostGIS support), done using osm2pgsql
add and maintain multiple database views, for multilingual rendering
procedure to update rendering database at regular interval, with new OSM changesets (with osmosis --read-change-interval)
procedure for regular complete re-imports to solve inconsistencies introduced by the diff-import
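The per-language views mentioned above could be generated along these lines. This is a sketch under several assumptions: the tables are the default osm2pgsql ones (`planet_osm_point` and friends), the osm2pgsql style file has been extended so that columns such as `name:de` exist, and the view and `name_local` column names are made up for this example.

```python
import re

# Default table names created by osm2pgsql with the standard "planet_osm" prefix.
OSM2PGSQL_TABLES = (
    "planet_osm_point",
    "planet_osm_line",
    "planet_osm_polygon",
    "planet_osm_roads",
)


def language_view_sql(table, lang):
    """Build a CREATE VIEW statement that prefers name:<lang> and falls back to name."""
    if table not in OSM2PGSQL_TABLES:
        raise ValueError(f"unknown table: {table}")
    if not re.fullmatch(r"[a-z]{2,3}", lang):
        # Only plain language codes are accepted, since the value is spliced into SQL.
        raise ValueError(f"unexpected language code: {lang}")
    return (
        f"CREATE OR REPLACE VIEW {table}_{lang} AS "
        f'SELECT t.*, COALESCE(t."name:{lang}", t.name) AS name_local '
        f"FROM {table} AS t;"
    )
```

A style for a given language would then read from the `*_<lang>` views and label features with `name_local`, so the same base tables serve every language.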

XAPI-Instance

see: http://wiki.openstreetmap.org/wiki/Xapi

Ortelius: production OSM tile server

Partition the server
The default.style would be functional; however, it would be best to come up with a modified Wikipedia style.
- our style also needs to incorporate the multiple database views, which support rendering tiles for each language.
- other styles need a different *.style file to allow rendering other features
to do

Cassini (toolserver)

See also: https://wiki.toolserver.org/view/OpenStreetMap_server/Setup_notes

php, perl & python with apache2 and on cli
access to mysql & postgresql
see jira for a list of packages needed for this
a way for tools that use the OSM databases to tell their users the date/time of the last update and of the next planned update (similar to the globalsitenotice currently discussed on toolserver-l)
samples on how to use cassini / the dbs in various languages on the wiki
list of project-ideas on wiki
will Cassini have its own PostGIS / OSM database, or share the one on Ptolemy?

Background info

Server info: OpenStreetMap#Servers

Required bits for our purposes:
- database
- mapnik rendering
- slippymap
- api (maybe? for toolserver usage)