Last modified: 2011-02-26 10:40:07 UTC
Gentlemen, let us observe what happens to the report that generateSitemap.php prints to STDOUT (not the sitemaps themselves) when not all of the namespaces listed in one's LocalSettings.php, e.g.,
>$wgSitemapNamespaces=array(NS_MAIN,NS_PROJECT,NS_TEMPLATE_TALK,NS_HELP_TALK,NS_CATEGORY_TALK,NS_USER);
are populated:
>0 () /home/jidanni/mediawiki/sitemap-mwabj-NS_0-0.xml.gz
>4 (ABJ)11 (Template talk)13 (Help talk)15 (Category talk)2 (User) /home/jidanni/mediawiki/sitemap-mwabj-NS_2-0.xml.gz
The problem is that if there are no items to print for a given namespace, the line
> $this->output( "\t$this->fspath$filename\n" );
will not fire, so no "\n" gets printed. The next namespace's entry then gets stuck onto the previous one, garbling the report!
Care to provide a patch, if this is as easy as your tag says (and I do believe it)?
reedy@ubuntu64-esxi:~/mediawiki/trunk/phase3/maintenance$ php generateSitemap.php
0 () /home/reedy/mediawiki/trunk/phase3/maintenance/sitemap-wikidb-mw_-NS_0-0.xml.gz
2 (User) /home/reedy/mediawiki/trunk/phase3/maintenance/sitemap-wikidb-mw_-NS_2-0.xml.gz
6 (File) /home/reedy/mediawiki/trunk/phase3/maintenance/sitemap-wikidb-mw_-NS_6-0.xml.gz
14 (Category) /home/reedy/mediawiki/trunk/phase3/maintenance/sitemap-wikidb-mw_-NS_14-0.xml.gz
I'm failing to see an issue here. Please elaborate.
I dare not look at any more code, but instead just prove it to you again:
22:46 ~$ cd transgender-taiwan.org/maintenance/
22:46 maintenance$ php generateSitemap.php
0 () /home/jidanni/mediawiki/maintenance/sitemap-transgender-NS_0-0.xml.gz
4 (蝶園)22:46 maintenance$
Note the neglected newline after "4 (蝶園)"? That's because my namespace 4 just happens to have 0 articles on my wiki. So try again with an unpopulated namespace or two added to the list of candidates.
OK, I can see that. I'm just trying to replicate it. If I just add a new namespace, it doesn't generate a sitemap for it and only does the other ones. Have you done anything else to get it to appear there? Set it as a content namespace or anything?
Seemingly, the fix is to change
$this->output( "\t$this->fspath$filename\n" );
to
$this->output( "\t$this->fspath$filename" );
and then add
$this->output( "\n" );
after the if. I'll attach a patch. Can you confirm whether this fixes your issue?
Created attachment 8121: add newline
Now it's
$ php generateSitemap.php
0 () /home/jidanni/mediawiki/maintenance/sitemap-transgender-NS_0-0.xml.gz
4 (蝶園)01:08 maintenance$
If you can tell me how to replicate the lack of a path being printed, I can try to help more.
Fill up your $wgSitemapNamespaces=array(NS_MAIN,NS_PROJECT... with lots of never-used namespaces, then run generateSitemap.php, and post what you get.
It's ok that no path is printed. It is not OK that no newline is printed.
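To make that distinction concrete, here is a minimal PHP sketch of the two behaviours. It is a simplified model, not the actual generateSitemap.php code: the function names reportBuggy and reportFixed, the namespace labels, the page counts and the file names are all made up for illustration.

```php
<?php
// Simplified stand-in for the reporting loop; NOT the real generateSitemap.php code.
// $pagesPerNs maps a (hypothetical) namespace label to its page count.

function reportBuggy( array $pagesPerNs ): string {
    $out = '';
    foreach ( $pagesPerNs as $label => $count ) {
        $out .= $label;
        if ( $count > 0 ) {
            // The newline rides along with the path, so an empty
            // namespace never gets its line terminated.
            $out .= "\tsitemap-$label.xml.gz\n";
        }
    }
    return $out;
}

function reportFixed( array $pagesPerNs ): string {
    $out = '';
    foreach ( $pagesPerNs as $label => $count ) {
        $out .= $label;
        if ( $count > 0 ) {
            $out .= "\tsitemap-$label.xml.gz";
        }
        // Always terminate the line, whether or not a path was printed.
        $out .= "\n";
    }
    return $out;
}

// Hypothetical wiki: the middle namespace has no pages.
$sample = [ 'NS_0' => 12, 'NS_4' => 0, 'NS_2' => 3 ];
echo reportBuggy( $sample );
echo "----\n";
echo reportFixed( $sample );
```

In the first report the empty namespace's label runs straight into the next entry. In the second, it sits on a line of its own with no path after it, which is all I am asking for.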
What if we formatted the output like this (have local patch, looks ok to me):
maintenance chad$ php generateSitemap.php
0 () /www/maintenance/sitemap-wikidb-mw_-NS_0-0.xml.gz
2 (User) /www/phase3/maintenance/sitemap-wikidb-mw_-NS_2-0.xml.gz
14 (Category)
3 (User talk) /www/phase3/maintenance/sitemap-wikidb-mw_-NS_3-0.xml.gz
maintenance chad$
Chad, that looks like a reasonably sane way of outputting it...
Why is it 0 2 14 3, and not 0 2 3 14?
Because $wgSitemapNamespaces isn't sorted, and I put them in my list in a semi-random order.
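If numeric order matters to anyone, sorting the list before the script sees it should presumably be enough, since the report appears to follow the order of the array. A minimal sketch, using plain integers as stand-ins for the namespace constants so that it runs outside a wiki (in LocalSettings.php one would use NS_MAIN, NS_USER and so on):

```php
<?php
// Stand-ins for NS_MAIN (0), NS_USER (2), NS_CATEGORY (14), NS_USER_TALK (3),
// listed in the same semi-random order as in the report above.
$wgSitemapNamespaces = [ 0, 2, 14, 3 ];

// sort() reorders the array in place; SORT_NUMERIC compares the entries as numbers.
sort( $wgSitemapNamespaces, SORT_NUMERIC );

echo implode( ' ', $wgSitemapNamespaces ), "\n";
```

This only changes the order of the list; which namespaces get sitemaps stays the same.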
Done in r82115.
Created attachment 8215: hard to understand
I still say this is very hard to understand, especially when done for more than one site. Perhaps a header could be added saying what the columns mean, or something.