WikiXRay Parser Options
user@machine:path$ python dump_sax.py --help
usage: dump_sax.py [options]

options:
  -h, --help            show this help message and exit
  -t STUBTH, --stubth=STUBTH
                        Max. size in bytes to consider an article as stub
                        [default: 256]
  --pagefile=FILE       Name of the SQL file created for the page table
                        [default: page.sql]
  --revfile=FILE        Name of the SQL file created for the revision table
                        [default: revision.sql]
  --textfile=FILE       Name of the SQL file created for the text table
                        [default: text.sql]
  --skipnamespaces=NAMESPACES
                        List of namespaces whose content will be ignored
                        [comma separated values, without blanks; e.g.
                        --skipnamespaces=name1,name2,name3]
  -i STRING, --inject=STRING
                        Optional string to inject at the very start of
                        articles' text; string must be provided within quotes
                        (e.g. --inject='my string') or double quotes
  -f, --fileout         Create SQL files from parsed XML dump
  -s, --streamout       Generate an output SQL stream suitable for a direct
                        import into MySQL database
  -m, --monitor         Insert SQL code directly into MySQL database [default]
  -u MySQL_USER, --user=MySQL_USER
                        Username to connect to MySQL database
  -p MySQL_PASSWORD, --passwd=MySQL_PASSWORD
                        Password for MySQL user to access the database
  -d DBNAME, --database=DBNAME
                        Name of the MySQL database
  --port=MySQL_SERVER_PORT
                        Listening port of MySQL server
  --machine=SERVER_NAME
                        Name of MySQL server
  -v, --verbose         Display standard status reports about the parsing
                        process [default]
  -q, --quiet           Do not display any status reports
  -l LOGFILE, --log=LOGFILE
                        Store status reports in a log file; do not display
                        them
  --insertmaxsize=MAXSIZE
                        Max size in KB of the MySQL extended inserts
                        [default: 156] [max: 256]
  --insertmaxnum=MAXROWS
                        Max number of individual rows allowed in the MySQL
                        extended inserts [default: 50000][max: 250000]
Please note that some of these options (such as logging to a file, skipping namespaces, and text injection) are not yet implemented. They are expected to be added in the coming days.
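
For readers who want to see how options like these fit together, the following is a minimal sketch of declaring a few of them with Python's standard optparse module. It is not the actual dump_sax.py source. The function names build_parser and parse_options, the constants, and the decision to store the three output modes (-f, -s, -m) in a single "mode" value are all assumptions made for illustration; only the flag names, defaults, and maximums come from the help text above.

```python
from optparse import OptionParser

# Upper limits quoted in the help text above.
INSERT_MAX_SIZE_KB = 256
INSERT_MAX_ROWS = 250000


def build_parser():
    """Hypothetical parser covering a subset of the documented options."""
    parser = OptionParser(usage="usage: %prog [options]")
    parser.add_option("-t", "--stubth", dest="stubth", type="int",
                      default=256, metavar="STUBTH",
                      help="Max. size in bytes to consider an article as stub")
    parser.add_option("--pagefile", dest="pagefile", default="page.sql",
                      metavar="FILE",
                      help="Name of the SQL file created for the page table")
    # Assumption: the output modes are mutually exclusive, so all three
    # flags write to the same destination and the last one given wins.
    parser.add_option("-f", "--fileout", dest="mode", action="store_const",
                      const="fileout", help="Create SQL files")
    parser.add_option("-s", "--streamout", dest="mode", action="store_const",
                      const="streamout", help="Generate an SQL stream")
    parser.add_option("-m", "--monitor", dest="mode", action="store_const",
                      const="monitor", help="Insert directly into MySQL")
    parser.set_defaults(mode="monitor")  # documented default
    parser.add_option("--insertmaxsize", dest="insertmaxsize", type="int",
                      default=156, metavar="MAXSIZE",
                      help="Max size in KB of the MySQL extended inserts")
    parser.add_option("--insertmaxnum", dest="insertmaxnum", type="int",
                      default=50000, metavar="MAXROWS",
                      help="Max number of rows in the MySQL extended inserts")
    return parser


def parse_options(argv):
    """Parse argv (without the program name) and enforce the documented maximums."""
    parser = build_parser()
    options, _positional = parser.parse_args(argv)
    if options.insertmaxsize > INSERT_MAX_SIZE_KB:
        # parser.error prints a message to stderr and exits with status 2
        parser.error("--insertmaxsize may not exceed %d" % INSERT_MAX_SIZE_KB)
    if options.insertmaxnum > INSERT_MAX_ROWS:
        parser.error("--insertmaxnum may not exceed %d" % INSERT_MAX_ROWS)
    return options


if __name__ == "__main__":
    opts = parse_options(["-f", "--stubth=512"])
    print(opts.mode, opts.stubth, opts.pagefile)
```

Sharing one destination across the three mode flags keeps the "[default]" behaviour of --monitor in a single place. Whether the real script resolves conflicting mode flags this way is not stated in the help text.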