Last modified: 2013-03-26 13:16:12 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the display of bug reports and their history, links might be broken. See T42195, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 40195 - ArticleFeedbackv5 generating bad recent changes entries
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: ArticleFeedbackv5
Version: master
Hardware: All All
Importance: High blocker
Target Milestone: ---
Assigned To: Nobody - You can work on this!
https://en.wikipedia.org/w/api.php?ac...
Reported: 2012-09-12 21:41 UTC by bianjiang
Modified: 2013-03-26 13:16 UTC (History)

Description bianjiang 2012-09-12 21:41:03 UTC
http://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rcprop=title%7Cids%7Csizes%7Cflags%7Cuser%7Ccomment%7Ctimestamp%7Cloginfo&rclimit=200&format=xml

The above link returns an error page now.

Removing "%7Cloginfo" makes the request return valid XML again.

We rely on loginfo to get full updates from a MediaWiki site. Currently nothing works.
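
For reference, the request above can be assembled programmatically, which makes it easy to toggle loginfo while debugging. This is a minimal sketch assuming a Python client; the helper name `build_rc_url` is hypothetical, while the endpoint and parameters are copied from the URL in this report.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL in the report.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_rc_url(rclimit=200, with_loginfo=True, fmt="xml"):
    """Build the recentchanges query URL (hypothetical helper)."""
    props = ["title", "ids", "sizes", "flags", "user", "comment", "timestamp"]
    if with_loginfo:
        props.append("loginfo")
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "|".join(props),  # urlencode escapes "|" as %7C
        "rclimit": rclimit,
        "format": fmt,
    }
    return API_ENDPOINT + "?" + urlencode(params)


# Reproduces the failing request; pass with_loginfo=False for the workaround.
print(build_rc_url())
```

Calling `build_rc_url(with_loginfo=False)` yields the variant without "%7Cloginfo".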
Comment 2 Brion Vibber 2012-09-12 21:43:37 UTC
Problem is XML parsing error. The offending bit appears to be:

... logid="44650407" logtype="articlefeedbackv5" logaction="helpful" 4::feedbackId="348221" 5::pageId="34684163" ...

the 4::feedbackId and 5::pageId are invalid attribute names (I'm not sure whether it's the double colon or the initial digit offhand that's the prob).

These really shouldn't be in the output, not sure how it happens.
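
The failure can be reproduced offline with any conforming XML parser. The sketch below assumes Python's standard library parser and uses the attribute values quoted above; the helper name `is_well_formed` is hypothetical. Under the XML 1.0 Name production a name may not start with a digit, so the leading digit alone is enough to make the element unparseable.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the standard library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Same element with and without the offending attributes.
good = '<rc logid="44650407" logtype="articlefeedbackv5" logaction="helpful" />'
bad = ('<rc logid="44650407" logtype="articlefeedbackv5" logaction="helpful" '
       '4::feedbackId="348221" 5::pageId="34684163" />')
# An attribute name that only has the leading-digit problem.
digit_only = '<rc 4feedbackId="348221" />'

for sample in (good, bad, digit_only):
    print(is_well_formed(sample))
```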
Comment 3 Brion Vibber 2012-09-12 21:44:07 UTC
Note that the original link no longer fails because the offending item has scrolled off. At this moment rclimit=2000 still shows it.
Comment 4 Sam Reed (reedy) 2012-09-12 21:50:06 UTC
(In reply to comment #2)
> Problem is XML parsing error. The offending bit appears to be:
> 
> ... logid="44650407" logtype="articlefeedbackv5" logaction="helpful"
> 4::feedbackId="348221" 5::pageId="34684163" ...
> 
> the 4::feedbackId and 5::pageId are invalid attribute names (I'm not sure
> whether it's the double colon or the initial digit offhand that's the prob).
> 
> These really shouldn't be in the output, not sure how it happens.


https://gerrit.wikimedia.org/r/#/c/23380/
https://gerrit.wikimedia.org/r/#/c/23575/
Comment 5 bianjiang 2012-09-12 22:20:46 UTC
(In reply to comment #1)
> https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rcprop=title%7Cids%7Csizes%7Cflags%7Cuser%7Ccomment%7Ctimestamp%7Cloginfo&rclimit=200&format=xmlfm
> 
> ^ that works fine. So certainly it's not "nothing works"

By "nothing works" I mean on our side.

We have a monitor process that polls the API to get every update on all Wikipedia sites, then crawls those updates and refreshes our internal storage. At first the bug only happened on English Wikipedia; now all Wikipedia sites are affected. So our system is completely down.
Comment 6 Roan Kattouw 2012-09-12 22:51:33 UTC
(In reply to comment #2)
> Problem is XML parsing error. The offending bit appears to be:
> 
> ... logid="44650407" logtype="articlefeedbackv5" logaction="helpful"
> 4::feedbackId="348221" 5::pageId="34684163" ...
> 
> the 4::feedbackId and 5::pageId are invalid attribute names (I'm not sure
> whether it's the double colon or the initial digit offhand that's the prob).
> 
> These really shouldn't be in the output, not sure how it happens.
Yeah I'm not quite sure where this is coming from, *something* is giving us bad data. I suppose it would be ArticleFeedbackv5, I'll look into that.
Comment 7 Roan Kattouw 2012-09-12 23:05:18 UTC
(In reply to comment #6)
> Yeah I'm not quite sure where this is coming from, *something* is giving us bad
> data. I suppose it would be ArticleFeedbackv5, I'll look into that.
Confirmed it's AFTv5's fault. I talked to Matthias and he said he'd fix it later today.
Comment 8 MZMcBride 2012-09-13 06:34:01 UTC
Confirmed.
Comment 9 anthonyzhang 2012-09-14 01:23:17 UTC
(In reply to comment #8)
> Confirmed.
What is the current status? We can't get updates from Wikipedia because of this bug.
Comment 10 anthonyzhang 2012-09-14 03:04:51 UTC
Seems https://gerrit.wikimedia.org/r/#/c/23623/ fixed this bug.
Comment 11 anthonyzhang 2012-09-14 04:33:57 UTC
(In reply to comment #10)
> Seems https://gerrit.wikimedia.org/r/#/c/23623/ fixed this bug.

Is this code in use now? I could still see 4::feedbackId an hour ago. For example:

<rc type="log" ns="-1" title="Special:ArticleFeedbackv5/The Philadelphia Inquirer/254606" rcid="527746847" pageid="0" revid="0" old_revid="0" user="Medvedenko" oldlen="0" newlen="0" timestamp="2012-09-14T03:13:48Z" comment="" logid="44672742" logtype="articlefeedbackv5" logaction="unhelpful" 4::feedbackId="254606" 5::pageId="102952" />
Comment 12 Bryan Tong Minh 2012-09-14 14:42:00 UTC
Added permanent url https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rcprop=title|ids|sizes|flags|user|comment|timestamp|loginfo&rclimit=20&format=xml&rcstart=2012-09-14T03:13:48Z

There will be stray log entries left that need to be dealt with, even when that fix is live. 

ApiFormatXml should check that all attribute keys are valid before simply outputting them.
Comment 13 Bryan Tong Minh 2012-09-14 14:52:12 UTC
A little bit more information from <http://msdn.microsoft.com/en-us/library/ms256152.aspx> "Like element names, attribute names are case-sensitive and must start with a letter or underscore. The rest of the name can contain letters, digits, hyphens, underscores, and periods."

I think the best solution is to prefix everything that does not start with a letter or underscore with an underscore, and replace every special character with an underscore.
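
The proposed rule can be sketched as follows. This is only an illustration in Python, not the ApiFormatXml code; the function name `sanitize_attr_name` is hypothetical, and it assumes ASCII names and the character set from the quote above (letters, digits, hyphens, underscores, periods).

```python
import re


def sanitize_attr_name(name):
    """Make name usable as an XML attribute name (ASCII-only sketch)."""
    # Replace every character outside letters, digits, "_", "." and "-".
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", name)
    # Prefix an underscore if the name does not start with a letter or "_".
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


for key in ("logid", "4::feedbackId", "5::pageId"):
    print(key, "->", sanitize_attr_name(key))
```

One consequence of this design is that distinct keys can collapse to the same sanitized name (for example "a:b" and "a_b"), so a caller would still need to decide how to handle collisions.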
Comment 14 Sam Reed (reedy) 2012-09-17 14:40:51 UTC
*** Bug 40299 has been marked as a duplicate of this bug. ***
Comment 15 Sam Reed (reedy) 2012-09-17 16:41:38 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > Seems https://gerrit.wikimedia.org/r/#/c/23623/ fixed this bug.
> 
> Is this code in use now? I could still see 4::feedbackId an hour ago. For example:
> 
> <rc type="log" ns="-1" title="Special:ArticleFeedbackv5/The Philadelphia
> Inquirer/254606" rcid="527746847" pageid="0" revid="0" old_revid="0"
> user="Medvedenko" oldlen="0" newlen="0" timestamp="2012-09-14T03:13:48Z"
> comment="" logid="44672742" logtype="articlefeedbackv5" logaction="unhelpful"
> 4::feedbackId="254606" 5::pageId="102952" />

The fix is now live, not sure why Roan didn't do it before.

We need a script to clean up these bad entries. I thought re-running loggingUpdate.php would've fixed them.
Comment 16 Matthias Mullie 2012-09-17 16:49:34 UTC
Sam: loggingUpdate should indeed have fixed them, and quickly skimming the entries in logging-table, they seem fine now. E.g.: a:2:{s:10:"feedbackId";i:67680;s:6:"pageId";i:29219160;}
Comment 17 Sam Reed (reedy) 2012-09-17 16:51:41 UTC
(In reply to comment #16)
> Sam: loggingUpdate should indeed have fixed them, and quickly skimming the
> entries in logging-table, they seem fine now. E.g.:
> a:2:{s:10:"feedbackId";i:67680;s:6:"pageId";i:29219160;}

Must be squid caching or similar...
Comment 18 Matthias Mullie 2012-09-17 16:53:42 UTC
Note: we might also want to update the documentation (http://www.mediawiki.org/wiki/Manual:Logging_to_Special:Log). It still encourages parameter numbering ("// Parameter numbering should start from 4.").
Comment 20 Matthias Mullie 2012-09-24 14:17:04 UTC
Just looked at the data in the db and it's just fine in there:

mysql> SELECT log_params FROM logging WHERE log_id = 44672742;
+---------------------------------------------------------+
| log_params                                              |
+---------------------------------------------------------+
| a:2:{s:10:"feedbackId";i:254606;s:6:"pageId";i:102952;} |
+---------------------------------------------------------+
1 row in set (0.00 sec)

Still some cache persisting, apparently.
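
For anyone checking rows by hand, the log_params value above is a PHP-serialized array. The sketch below is a deliberately narrow Python reader, assuming only the shape shown in this row (a flat array with string keys and integer values); the function names are hypothetical and this is not a general PHP unserializer.

```python
import re

# One key/value pair of the form s:<len>:"<key>";i:<int>;
_PAIR = re.compile(r's:(\d+):"([^"]*)";i:(-?\d+);')


def parse_flat_log_params(serialized):
    """Parse a flat PHP-serialized array of string keys to int values."""
    header = re.match(r"a:(\d+):\{(.*)\}$", serialized)
    if header is None:
        raise ValueError("not a serialized PHP array")
    declared_count, body = int(header.group(1)), header.group(2)
    params = {}
    for length, key, value in _PAIR.findall(body):
        # PHP records the byte length of each string; verify it.
        if int(length) != len(key.encode("utf-8")):
            raise ValueError("length mismatch for key %r" % key)
        params[key] = int(value)
    if len(params) != declared_count:
        raise ValueError("unsupported or malformed entry")
    return params


def has_numbered_keys(params):
    """True if any key would be an invalid XML attribute name start."""
    return any(not re.match(r"[A-Za-z_]", key) for key in params)


row = 'a:2:{s:10:"feedbackId";i:254606;s:6:"pageId";i:102952;}'
parsed = parse_flat_log_params(row)
print(parsed, has_numbered_keys(parsed))
```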
