
Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and exists for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links may be broken. See T67812, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 65812 - Infinite loop/stuck parsing
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: General (Other open bugs)
Version: unspecified
Hardware: All All
Importance: Highest major
Target Milestone: ---
Assigned To: Gabriel Wicke
QA Contact:
Depends on:
Blocks:
Reported: 2014-05-27 15:59 UTC by ssastry
Modified: 2014-05-28 18:46 UTC (History)
CC List: 5 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description ssastry 2014-05-27 15:59:23 UTC
Try this url:
http://localhost:8000/huwiki/Vegy%C3%BCletek_%C3%B6sszegk%C3%A9plet-t%C3%A1bl%C3%A1zata?oldid=14167011

Puts Parsoid in a coma.

Discovered via production logs after the Parsoid cluster load spiked yesterday, when most Parsoid processes were stuck.
Comment 1 ssastry 2014-05-27 16:03:28 UTC
It does look like a tokenizer issue:

[subbu@earth tests] ./fetch-wt.js --prefix huwiki 14167011 > inf.loop.wt
[subbu@earth tests] node parse --trace peg --prefix huwiki < inf.loop.wt
... some tokens emitted ...
... stuck ...
Comment 2 Gerrit Notification Bot 2014-05-27 20:32:29 UTC
Change 135611 had a related patch set uploaded by GWicke:
Bug 65812: Speed up processing of huge sync token chunks

https://gerrit.wikimedia.org/r/135611
Comment 3 Gabriel Wicke 2014-05-27 20:33:25 UTC
(In reply to Gerrit Notification Bot from comment #2)
> Change 135611 had a related patch set uploaded by GWicke:
> Bug 65812: Speed up processing of huge sync token chunks
> 
> https://gerrit.wikimedia.org/r/135611

Sorry for the spam, this was actually intended for bug 65812.
Comment 4 Gabriel Wicke 2014-05-27 20:35:12 UTC
(In reply to Gabriel Wicke from comment #3)
> Sorry for the spam, this was actually intendend for bug 65812.

Never mind..
Comment 5 ssastry 2014-05-27 20:39:19 UTC
{{:Sablon:összegtáblázat}} is the transclusion in question on huwiki. Prior to the fix it generated a 408K-token chunk in the tokenizer, and processing in the async token transform manager (async-ttm) slowed to a crawl after roughly 128K tokens had been processed. We traced this to a slowdown in concatenation once the accumulator size crossed a threshold.
Comment 6 Gerrit Notification Bot 2014-05-27 20:55:49 UTC
Change 135611 merged by jenkins-bot:
Bug 65812: Speed up processing of huge sync token chunks

https://gerrit.wikimedia.org/r/135611
Comment 7 Gabriel Wicke 2014-05-28 18:46:32 UTC
Fixed by https://gerrit.wikimedia.org/r/135611, and further improved by https://gerrit.wikimedia.org/r/135723 to the point where this huge test case now parses in about 66 seconds.
