Last modified: 2013-04-22 16:51:41 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T41493, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 39493 - High OOM rate in refreshLinks2
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Page editing
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:

Reported: 2012-08-20 01:01 UTC by Tim Starling
Modified: 2013-04-22 16:51 UTC (History)
CC: 4 users

Description Tim Starling 2012-08-20 01:01:20 UTC
We're logging around 6000-10000 job queue OOMs per day:

$ for day in `seq 15 17`; do echo -n "August $day: "; zgrep -A2 'Allowed memory size of' fatal.log-201208$day.gz  | grep unknown-host | wc -l ; done

August 15: 7239
August 16: 9737
August 17: 6492

They are OOMs from various points in the parser, with RefreshLinksJob2::run() as the ultimate caller.

These cause collateral damage beyond the article that actually triggered the OOM, since the whole RefreshLinks2 batch is lost.

Perhaps there is a memory leak.
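
The per-day count in the description can be written as a small reusable function. This is only a sketch: `count_ooms` is a hypothetical helper name, and it uses `gzip -dc ... | grep` in place of `zgrep`, which should behave the same way for this purpose. It counts lines containing the host pattern among each "Allowed memory size of" line and the two lines that follow it.

```shell
#!/bin/sh
# count_ooms (hypothetical helper): count host-pattern lines found in the
# "Allowed memory size of" fatal line or the two lines after it.
#   $1 = gzip-compressed fatal log
#   $2 = host pattern, e.g. unknown-host
count_ooms() {
    log_file=$1
    host_pattern=$2
    gzip -dc "$log_file" \
        | grep -A2 'Allowed memory size of' \
        | grep -- "$host_pattern" \
        | wc -l \
        | tr -d ' '   # some wc implementations pad the count with spaces
}
```

Assuming the same file naming as above, a loop such as `for day in 15 16 17; do count_ooms "fatal.log-201208$day.gz" unknown-host; done` would print one count per day.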
Comment 1 Aaron Schulz 2012-09-03 19:55:35 UTC
Started looking at this a bit.

Also, see https://gerrit.wikimedia.org/r/22497.
Comment 2 Rob Lanphier 2012-09-21 18:27:26 UTC
Speaking with Aaron now, it would seem this one maybe isn't a problem anymore.  Tim, does this still look like a problem to you?
Comment 3 Aaron Schulz 2012-09-25 21:58:59 UTC
Actually it looks just as frequent as before.
Comment 4 Aaron Schulz 2012-10-10 18:39:53 UTC
OK, more info:

aaron@fluorine:~/mw-log$ for day in `seq 25 30`; do echo -n "Sep $day: "; zgrep -A2 'Allowed memory size of' archive/fatal.log-201209$day.gz  | grep unknown-host | wc -l ; done
Sep 25: 1359
Sep 26: 979
Sep 27: 823
Sep 28: 812
Sep 29: 769
Sep 30: 970
Comment 5 Aaron Schulz 2013-01-03 01:09:02 UTC
Seems lower the last few weeks.

aaron@fluorine:~/mw-log$ for day in `seq 25 31`; do echo -n "Dec $day: "; zgrep -A2 'Allowed memory size of' archive/fatal.log-201212$day.gz  | grep unknown-host | wc -l ; done
Dec 25: 11
Dec 26: 11
Dec 27: 239
Dec 28: 46
Dec 29: 3
Dec 30: 4
Dec 31: 1
Comment 6 Aaron Schulz 2013-04-17 19:43:30 UTC
The job runner memory limits were doubled and the Wikidata job batch sizes were also halved (again) on Apr 16 (those were piling up OOMs of their own).
Comment 7 Aaron Schulz 2013-04-22 16:51:41 UTC
aaron@fluorine:~/mw-log$ for day in `seq 10 21`; do echo -n "Day $day: "; zgrep -A2 'Allowed memory size of' archive/fatal.log-201304$day.gz | grep -P "mw10(0[1-9]|1[0-6])" | wc -l ; done
Day 10: 6665
Day 11: 16169
Day 12: 29571
Day 13: 1879
Day 14: 142
Day 15: 6
Day 16: 141
Day 17: 27
Day 18: 0
Day 19: 0
Day 20: 0
Day 21: 0
