Last modified: 2014-07-22 15:49:55 UTC
When running recurrent reports, if more than a certain number of reports run at the same time, we hit a bug in pickle that causes Python to throw an error about exceeding the maximum recursion limit. This can be hacked around with the very ugly sys.setrecursionlimit(10000), or it can be fixed by moving away from pickle serialization. I recommend the latter, but we can fall back to the former and find a decent value to pass there.
Let's do the quick fix now (setrecursionlimit) and log a new bug to move away from pickle serialization so we can do that later.
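The failure and the quick fix are easy to reproduce with plain stdlib pickle, no Celery needed. This is an illustrative sketch, not our actual report code: the nested list stands in for the deeply nested chain/group structure Celery serializes, and the depth and limit values are arbitrary.

```python
import pickle
import sys

# Build a deeply nested structure, similar in spirit to the serialized
# chain of task groups: each level holds a reference to the next.
def nested(depth):
    obj = None
    for _ in range(depth):
        obj = [obj]
    return obj

deep = nested(2000)

# With the typical default recursion limit (1000), pickling this
# blows the limit and raises RecursionError (RuntimeError on Python 2).
sys.setrecursionlimit(1000)
try:
    pickle.dumps(deep)
    hit_limit = False
except RecursionError:
    hit_limit = True

# The quick fix: raise the interpreter's recursion limit before
# serializing, then pickling the same structure succeeds.
sys.setrecursionlimit(10000)
data = pickle.dumps(deep)
```

Note the limit has to comfortably exceed the nesting depth of whatever we serialize, which is why picking "a decent value" matters: too low and the error comes back as soon as more reports run concurrently.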
What are we trying to pickle anyway? I read the documentation on pickling ( https://docs.python.org/2/library/pickle.html#what-can-be-pickled-and-unpickled ) and it left me wondering whether we were trying to pickle objects with recursive references, or really big objects. It seems odd that we'd be doing either.
This is the link to the pickle issue. It happens because we're using a chain of groups of tasks (to allow us to throttle how many run in parallel): https://github.com/celery/celery/issues/1078
#hornetsnest
Change 146297 had a related patch set uploaded by Milimetric: Avoid pickle max recursion while serializing chain https://gerrit.wikimedia.org/r/146297
Change 146297 merged by jenkins-bot: Avoid pickle max recursion while serializing chain https://gerrit.wikimedia.org/r/146297