User:Graumeister/MergingTwoMediawikiWikis

Whats this about

edit

We had the task to combine the contents of two separated living wikis. As there is no explanaition or tool yet, here is our way. I take no responsibility on any damage this would cause to your data, here it worked.

To execute all steps you need shell-access. We toke one Wiki and transfered the data of the other wiki. we only merged the userpages, diskussions, contentpages and the belonging histories.

this is the quick and dirty way to merge your wikis

NOTE: in the code-block ther is much content which has to be changed to work on your system depending on your language and your content. (passwords, ids, values, the name of your mainpage)

How it started

edit

I put a request on the meta:babel-Page [1] about it:

Hi. Does anybody know a possibility to merge two separately mediawiki-installations (the content of them of course). As theres is no word in the documentation and no information from anywhere it seems to be a white place on the card ;) --Graumeister 11:35, 30 Sep 2004 (UTC)

The only fairly automatic way I can think of is to use Special:Export and Special:Import to transfer all articles from one into the other (preserving history, etc). But I'm not sure Special:Import is even fully operational yet, and that would require doing them one by one anyway. If you were willing to hack around with SQL, you could probably do it by manipulating the databases. There's some documentation lying around this wiki and in the source, I believe. You would need to at least:
  • make sure no two pages had the same name but different cur_id (they could have the same of both, and the histories would fold into each other, I imagine)
  • make sure that no two pages had the same cur_id but different names (cos that would just be evil)
  • make sure that any cur_ids you reassigned were also changed in the old table (I think that's right)
  • run the maintenance/rebuild_all script to update link tables etc
If you do make a script to do this, it would probably be worth uploading it somewhere so that others could use it too, but AFAIK nobody's done it yet. I may be wrong though. - IMSoP 14:51, 30 Sep 2004 (UTC)
hi imsop. i'll try to make it via script and give the explanation then on my userpage. a problem too is the fact that all ids in both wikis are counting from zero, so the most ids will exist allready. maybe the best way is to start with the larger database and import the data of the lesser db with inserts as new data. special:import does nothing. in the package it is commented out and after commenting in it says "import ok" with no data imported ;) --GrauMeister 10:09, 1 Oct 2004 (UTC)

Backup your data

edit

Backup your data via shell (it dont work via phpMyAdmin as there are many binary-fields in the database. I don't know why. It is so, I've tested it.)

mysqldump -uUSERNAME -pPASSWORD DATABASENAME > BACKUPFILENAME.sql

(Its recommended to backup both databases ;)

Use the same Version of Mediawiki

edit

both wikis should run under the same version of mediawiki. if not - upgrade!

Check Users

edit

If there are users they may have accounts on both wikis. to hold the data and the history of the pages, we have to change the user-ids of the wiki-to-be-imported to the value of the existing user in the new wiki. i've made it by hand because we had only 10 users or so.

UPDATE user SET user_id=18 WHERE user_id=12;
UPDATE cur SET cur_user=18 WHERE cur_user=12;
UPDATE old SET old_user=18 WHERE old_user=12;
UPDATE archive SET ar_user=18 WHERE ar_user=12;
UPDATE image SET img_user=18 WHERE img_user=12;
UPDATE oldimage SET oi_user=18 WHERE oi_user=12;
UPDATE watchlist SET wl_user=18 WHERE wl_user=12;

after this, the users who are not registered in the new wiki can be token to the new db. i took an phpMyAdmin-Export of the tabledata of user and deleted the users existing in the new database.

after this it can be, that your users cannot log in. if so, they have to let the system send them a new password.

Update Article-IDs

edit

The article-IDs have to be set up above the existing values in the new wiki for no collisions.

UPDATE cur SET cur_id=cur_id+2000;
UPDATE old SET old_id=old_id+10000;
UPDATE links SET l_from=l_from+2000;
UPDATE links SET l_to=l_to+2000;
UPDATE brokenlinks SET bl_from=bl_from+2000;
UPDATE recentchanges SET rc_cur_id=rc_cur_id+2000;
UPDATE recentchanges SET rc_this_oldid=rc_this_oldid+10000;
UPDATE recentchanges SET rc_last_oldid=rc_last_oldid+10000;

we had not much articles in the new wiki, so 2000 and 10000 are good values. on more content you have to put a higher number here.

Check for duplicates

edit

There shall not be an article whose title is the same as an existing article

$link = mysql_connect('localhost', 'user', 'pwd');
if (!$link) {
	die('Verbindung nicht möglich : ' . mysql_error());
}
$db_selected = mysql_select_db('wikidb', $link);
if (!$db_selected) {
	die ('Kann foo nicht benutzen : ' . mysql_error());
}
$sql = "SELECT cur_id,cur_title FROM cur WHERE cur_namespace<3";
$result = mysql_query($sql);
while ($row = mysql_fetch_row($result)) {
	$oldtitle[$row[0]] = $row[1];
}
mysql_close();
$link = mysql_connect('localhost', 'user_dbnew', 'pwd_dbnew');
if (!$link) {
	die('Verbindung nicht möglich : ' . mysql_error());
}
$db_selected = mysql_select_db('wikidb_new', $link);
if (!$db_selected) {
	die ('Kann foo nicht benutzen : ' . mysql_error());
}
$sql = "SELECT cur_id,cur_title FROM cur WHERE cur_namespace<3";
$result = mysql_query($sql);
while ($row = mysql_fetch_row($result)) $curtitle[$row[0]] = $row[1];
print_r(array_intersect($oldtitle,$curtitle)); 

the output of this script shows the pages which exist in both databases. now you have to rename the pages (move them) in the old pages. eg. old: "topic" new: "topic2".

merge the data

edit

at first: make a backup.

then run the following script (of course you have to change the values)

this script dont merge the existing mainpage because there is ever a mainpage in the new wiki, you have to rename it or merge it by hand

$link = mysql_connect('localhost', 'user', 'pwd');
if (!$link) {
	die('Verbindung nicht möglich : ' . mysql_error());
}
$db_selected = mysql_select_db('dbold', $link);
if (!$db_selected) {
	die ('Kann foo nicht benutzen : ' . mysql_error());
}
$sql = "SELECT * FROM cur WHERE cur_namespace<3 AND cur_title<>'Hauptseite'";
$result = mysql_query($sql);
while ($row = mysql_fetch_assoc($result)) {
	$from_cur[] = $row;
}
$sql = "SELECT * FROM old WHERE old_namespace<3 AND old_title<>'Hauptseite'";
$result = mysql_query($sql);
while ($row = mysql_fetch_assoc($result)) {
        $from_old[] = $row;
}
$sql = "SELECT * FROM recentchanges WHERE rc_namespace<3 AND rc_title<>'Hauptseite'";
$result = mysql_query($sql);
while ($row = mysql_fetch_assoc($result)) {
        $from_recentchanges[] = $row;
}
mysql_close();
$link = mysql_connect('localhost', 'user_new', 'pwd_new');
if (!$link) {
        die('Verbindung nicht möglich : ' . mysql_error());
}
$db_selected = mysql_select_db('dbnew', $link);
if (!$db_selected) {
	die ('Kann foo nicht benutzen : ' . mysql_error());
}
foreach ($from_cur as $value){
	$sql = create_insertstatement('cur',$value);
	mysql_query($sql);
}
foreach ($from_old as $value){
	$sql = create_insertstatement('old',$value);
	mysql_query($sql);
}
foreach ($from_recentchanges as $value){
	$sql = create_insertstatement('recentchanges',$value);
	mysql_query($sql);
}
function create_insertstatement($table,$array){
	foreach ($array as $key => $value) {
	 	if (!$sql) $sql = $key."='".addslashes($value)."'";
		else $sql .= ", ".$key."='".addslashes($value)."'";
	}
	$sql = "INSERT INTO ".$table." SET ".$sql;
	return $sql;
}

Finished?

edit

not yet. you have to run the script /maintenance/updateall.php in your mediawikiinstallation. this script will rebuild all links and indexes.

now its done.

it worked. for me.

HTH