
MediaWiki MySQL migration

When migrating a MediaWiki database from a mysql4 to a mysql5 server, differing character encodings can result in incorrectly displayed page titles in the migrated wiki and/or broken links to titles containing accented characters. The unwanted conversion can take place both during the dump and during the import, so one should take care in both phases. For the mysqldump, use the --default-character-set=latin1 option:

/usr/local/mysql-4/bin/mysqldump --default-character-set=latin1 --allow-keywords --quote-names  --add-drop-table  -p -h localhost database >database.sql
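Once the dump exists, it can be useful to check which character set its table definitions declare. The following is only a sketch: the miniature dump written to a temporary file is made up for illustration (the table names and `ENGINE` clause are assumptions, not taken from a real dump), and it simply counts the lines that declare latin1.

```shell
#!/bin/sh
# Made-up miniature dump, standing in for database.sql (illustration only).
dump=$(mktemp)
cat > "$dump" <<'EOF'
CREATE TABLE `page` (
  `page_title` varchar(255) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_bin;
CREATE TABLE `text` (
  `old_text` mediumblob NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_bin;
EOF

# Count the lines that still declare latin1 as the table charset.
latin1_count=$(grep -c 'CHARSET=latin1' "$dump")
echo "lines declaring latin1: $latin1_count"

rm -f "$dump"
```

Against the real dump, the same check would be `grep -c 'CHARSET=latin1' database.sql`.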

The result is a dump with utf8 data, but with CREATE TABLE statements that still use latin1 as the default charset and collation. If this dump were imported into a mysql5 server as is, the data would be converted, which leads to wrong titles and/or broken links. To avoid this, globally replace "=latin1" with "=utf8" in the SQL dump:

sed -e 's/CHARSET=latin1 COLLATE=latin1/CHARSET=utf8 COLLATE=utf8/g' database.sql | mysql  -h localhost -u root database -p
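Before piping the rewritten dump into the mysql5 server, the effect of the substitution can be previewed on a single table definition. This is a minimal sketch: the `convert_charset` helper name and the sample table footer are hypothetical, and it assumes the dump writes the table options as `CHARSET=latin1 COLLATE=latin1...`.

```shell
#!/bin/sh
# Hypothetical helper: rewrite latin1 table options to utf8 (reads stdin).
convert_charset() {
    sed -e 's/CHARSET=latin1 COLLATE=latin1/CHARSET=utf8 COLLATE=utf8/g'
}

# Made-up sample of a table footer as it might appear in the dump.
sample=') ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_bin;'

# Apply the substitution and show the rewritten line.
result=$(printf '%s\n' "$sample" | convert_charset)
printf '%s\n' "$result"
```

Note that this pattern only matches definitions that carry both options; a table declared with `CHARSET=latin1` and no `COLLATE` clause is left unchanged and would need a separate substitution.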