Last modified: 2014-07-07 11:57:12 UTC
Created attachment 15798
Stack trace from osmium

To reproduce, curl the enwikivoyage API on osmium repeatedly, requesting a parse of the contents of <http://en.wikivoyage.org/wiki/Geneva?action=raw>. The crashes were observed with a production build of HHVM running in FastCGI server mode with JIT enabled. Two stack traces are attached.
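The reproduction steps above could be scripted roughly as follows. This is a minimal sketch and not the command that was actually run: the endpoint URL, the optional `Host` header, and the helper names `fetch_wikitext`, `build_parse_request` and `hammer` are assumptions made for illustration. It uses the standard MediaWiki `action=parse` API with the `text` parameter.

```python
import http.client
import urllib.parse
import urllib.request

# The wikitext source named in the report.
RAW_URL = "http://en.wikivoyage.org/wiki/Geneva?action=raw"


def fetch_wikitext(raw_url=RAW_URL, timeout=60):
    """Download the raw wikitext that will be sent back for parsing."""
    with urllib.request.urlopen(raw_url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def build_parse_request(api_url, wikitext, host=None):
    """Build a POST request asking the MediaWiki API to parse `wikitext`."""
    params = {
        "action": "parse",
        "format": "json",
        "contentmodel": "wikitext",
        "title": "Geneva",
        "text": wikitext,
    }
    body = urllib.parse.urlencode(params).encode("utf-8")
    request = urllib.request.Request(api_url, data=body)
    if host is not None:
        # Assumption: the test host selects the wiki by Host header.
        request.add_header("Host", host)
    return request


def hammer(api_url, wikitext, attempts, host=None, timeout=60):
    """Send the same parse request `attempts` times; return the failure count."""
    failures = 0
    for _ in range(attempts):
        request = build_parse_request(api_url, wikitext, host)
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                response.read()
        except (OSError, http.client.HTTPException):
            # HTTP errors, refused or reset connections and timeouts land here.
            failures += 1
    return failures
```

A run might look like `hammer("http://localhost/w/api.php", fetch_wikitext(), 500, host="en.wikivoyage.org")`, where the endpoint URL is a placeholder for wherever the HHVM FastCGI server is reachable. Sending the requests sequentially keeps the sketch simple; the threading-related crashes discussed in this report may only show up under concurrent load.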
Created attachment 15799
Additional trace from osmium
Try with these two fixes:

<https://gerrit.wikimedia.org/r/144125>
<https://github.com/tstarling/hiphop-php/commit/fb40de3491789ad98e7f07e9b03ef1ff2860b89a>

There are still plenty more bug fixes to come for bugs that I have found with multi-threaded testing, but hopefully the first of those two commits will avoid crashes with std::set<HashTable*> methods in the backtrace, and the second will fix a few other crashes.
(In reply to Tim Starling from comment #2)
> Try with these two fixes:
>
> <https://gerrit.wikimedia.org/r/144125>
> <https://github.com/tstarling/hiphop-php/commit/fb40de3491789ad98e7f07e9b03ef1ff2860b89a>
>
> There are still plenty more bug fixes to come for bugs that I have found
> with multi-threaded testing, but hopefully the first of those two commits
> will avoid crashes with std::set<HashTable*> methods in the backtrace, and
> the second will fix a few other crashes.

Confirmed. There are still threading-related segfaults, but std::set<HashTable*> methods do not appear in the traces.
The crash I am able to easily reproduce is this:

# 0 bt_handler at /srv/hhvm-dev/hphp/util/process.h:81
# 1 lua_sethook at /usr/lib/x86_64-linux-gnu/liblua5.1.so.0:0
# 2 luasandbox_timer_handle_profiler at /srv/luasandbox/timer.c:218
# 3 timer_sigev_thread at /build/buildd/eglibc-2.19/rt/../nptl/sysdeps/unix/sysv/linux/timer_routines.c:66
# 4 start_thread at /build/buildd/eglibc-2.19/nptl/pthread_create.c:312
# 5 clone at /build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113

No PHP stack trace is present.

Some cursory Googling produced a few posts in which people grumble about lua_sethook not being signal-safe, contrary to what is claimed in the source: http://lua-users.org/lists/lua-l/2012-02/msg00060.html
(In reply to Ori Livneh from comment #4)
> The crash I am able to easily reproduce is this:
>
> # 0 bt_handler at /srv/hhvm-dev/hphp/util/process.h:81
> # 1 lua_sethook at /usr/lib/x86_64-linux-gnu/liblua5.1.so.0:0
> # 2 luasandbox_timer_handle_profiler at /srv/luasandbox/timer.c:218
> # 3 timer_sigev_thread at /build/buildd/eglibc-2.19/rt/../nptl/sysdeps/unix/sysv/linux/timer_routines.c:66
> # 4 start_thread at /build/buildd/eglibc-2.19/nptl/pthread_create.c:312
> # 5 clone at /build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>
> No PHP stack trace is present.

IIRC that was one of the issues fixed by <https://github.com/facebook/hhvm/pull/3120>. The LuaSandbox object was destroyed before the timer event was delivered, so dereferencing the user data failed.

> Some cursory Googling produced a few posts in which people grumble about
> lua_sethook not being signal-safe, contrary to what is claimed in the
> source: http://lua-users.org/lists/lua-l/2012-02/msg00060.html

Theoretically, pointer reads and writes may be non-atomic, like far pointers in [[real mode]], but they are atomic on x86-64.

The only remaining issue in LuaSandbox against my HHVM dev branch that I know about is the fact that emergency timeouts don't work -- that will be fixed by disabling them. I have tested a patch for this.

Anyway, the luasandbox_unprotect_recursion() issue was fixed, so any other crash can go in a new bug report.
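For readers unfamiliar with this class of bug, the lifetime problem mentioned above (a timer event arriving after the object it refers to has been destroyed) can be illustrated with a small model. This is only a sketch of one common guard, a handle registry checked under a lock, and is not the change made in the pull request or the LuaSandbox code. `TimerRegistry`, `Sandbox` and their methods are hypothetical names.

```python
import threading


class Sandbox:
    """Hypothetical stand-in for the object that owns a timer."""

    def __init__(self):
        self.ticks = 0

    def on_timer(self):
        self.ticks += 1


class TimerRegistry:
    """Maps integer handles to live owners so late events can be dropped."""

    def __init__(self):
        self._lock = threading.Lock()
        self._live = {}
        self._next_handle = 0

    def register(self, owner):
        """Record a live owner and return the handle given to the timer."""
        with self._lock:
            self._next_handle += 1
            self._live[self._next_handle] = owner
            return self._next_handle

    def unregister(self, handle):
        """Called when the owner is destroyed; later events become no-ops."""
        with self._lock:
            self._live.pop(handle, None)

    def deliver(self, handle):
        """Called from the timer thread; returns True if the event was handled."""
        with self._lock:
            owner = self._live.get(handle)
            if owner is None:
                return False  # owner already gone: drop the event
            # The lock is held during the callback so the owner cannot be
            # unregistered while the event is being processed.
            owner.on_timer()
            return True
```

The timer carries a handle rather than a direct reference, so an event delivered after `unregister()` finds nothing to dereference and is discarded instead of touching freed state.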