This question comes up again and again, so I would like to describe here how to do it. The procedure is valid for both Wikipedia and Wiktionary.
I have done this quite often on the Italian Wiktionary and on the Neapolitan Wikipedia (and on some other projects).
For the upload I use the pywikipediabot, in particular pagefromfile.py. This bot was mainly created to upload pages to Wiktionary, but it turned out to be a great tool for Wikipedia as well.
You need a .txt file saved in UTF-8 encoding. The bot takes the first text enclosed between ''' and ''' (the wiki markup for bold) as the page name and creates that page. If the page already exists, it is skipped.
Now the question I got is what a typical entry looks like. Here is an example:
{{-start-}}
'''Rome''' is the capital of Italy.
{{-stop-}}
This means the bot would create the page Rome and add the contents "Rome is the capital of Italy." to the page.
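To make the title rule concrete, here is a minimal Python sketch of how a file like this could be split into entries. It is not the code of pagefromfile.py itself; the function name parse_entries and the regular expressions are my own assumptions, chosen only to mirror the behaviour described above (first text between ''' and ''' becomes the page name).

```python
import re

START = "{{-start-}}"
STOP = "{{-stop-}}"


def parse_entries(text, start=START, stop=STOP):
    """Return a list of (title, body) pairs, one per start/stop block.

    Hypothetical helper for illustration; not part of pywikipediabot.
    """
    # Everything between a start marker and the next stop marker is one page.
    block_re = re.compile(re.escape(start) + r"(.*?)" + re.escape(stop), re.DOTALL)
    # The first text wrapped in triple apostrophes is taken as the page name.
    title_re = re.compile(r"'''(.*?)'''")
    entries = []
    for match in block_re.finditer(text):
        body = match.group(1).strip()
        title = title_re.search(body)
        if title is None:
            continue  # this sketch simply skips blocks without a bold title
        entries.append((title.group(1), body))
    return entries


sample = "{{-start-}}\n'''Rome''' is the capital of Italy.\n{{-stop-}}\n"
entries = parse_entries(sample)
```

Running this on your own .txt file before the upload is a cheap way to check that every block yields the page name you expect.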
If the first text between ''' and ''' is not the page name, you can use a workaround: put the page name, wrapped in ''' and ''', into an HTML comment at the top of the entry:
{{-start-}}
<!-- '''Template:Rome''' -->
'''Statistical data''' about Rome: ....
{{-stop-}}
In this case the template Rome is created, and it contains the statistical data.
Put everything you want to appear on the new wiki page between {{-start-}} and {{-stop-}}.
Now one thing you are probably wondering about is how to do this for a huge number of cities or other data. Use mail merge in OpenOffice.org Writer or Microsoft Word: create the layout for a typical page, insert the fields of your database, and let it merge. Copy and paste the whole contents of the resulting file into a .txt file in a plain text editor and save it with UTF-8 encoding. You can also try to create the UTF-8 text file directly from Word or OpenOffice.org, but we noticed that this causes problems on some systems. So just try it out.
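If you prefer a script to mail merge, the same merge step can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the CSV column names (name, country), the sentence layout, and the helper names build_entry, build_upload_text and write_upload_file are all hypothetical and need to be adapted to your data.

```python
import csv


def build_entry(row):
    """Format one data row as a start/stop block (hypothetical layout)."""
    # Assumed columns: "name" and "country"; extra columns are ignored.
    body = "'''{name}''' is a city in {country}.".format(**row)
    return "{{-start-}}\n" + body + "\n{{-stop-}}\n"


def build_upload_text(rows):
    """Join the blocks for all rows into the text of the upload file."""
    return "".join(build_entry(row) for row in rows)


def write_upload_file(csv_path, out_path):
    """Read a UTF-8 CSV file and write the UTF-8 upload file for the bot."""
    with open(csv_path, newline="", encoding="utf-8") as src:
        text = build_upload_text(csv.DictReader(src))
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.write(text)


example = build_entry({"name": "Rome", "country": "Italy"})
```

Calling write_upload_file("cities.csv", "nap.txt") (both file names are just examples) would then produce a file in the format shown above, already saved as UTF-8.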
Then copy the file into your pywikipediabot folder and run the bot on it.
To run the bot I use the following command for the file nap.txt:
pagefromfile.py -start:{{-start-}} -end:{{-stop-}} -file:nap.txt -utf
Of course, you must first log in using login.py.
I hope this helps those who want to know how to do this. If you have further questions, just ask :-) I'll answer as soon as possible.