system/libaio: Test 18 triggers x86_64 host crash/restart
When the 64-bit Test 18 (x86_64/packages/system/libaio/src/libaio-0.3.112/harness/cases/18.p, attached) is run, whether inside a VM or directly on the host, it can cause the x86_64 host server to crash (restart) instantly, with no error logs to indicate the cause.
When the 32-bit Test 18 (pmmx/packages/system/libaio/src/libaio-0.3.112/harness/cases/18.p, attached) is run on the same hardware, the test succeeds.
The affected x86-based hardware is: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz.
It occurs on one of our development nodes, but not on a second node with identical hardware, so the issue is likely a defective chip rather than an implementation bug. We do not yet know which of the CPUs is affected, or whether both are.
We have replaced the motherboard, thoroughly cleaned the CPU contacts, disabled Hyperthreading and tested various other BIOS settings to no avail.
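One way to narrow down which CPU is affected might be to run the test pinned to one logical CPU at a time and record progress to disk before each run, since the crash leaves no logs. The following is only a sketch: the names `TEST_BIN`, `LOG`, `run_pinned` and `run_all` are hypothetical, it assumes `taskset` (util-linux) and `nproc` are installed, and it assumes online CPUs are numbered contiguously from 0. It should be run on the host rather than in a VM, because pinning to a virtual CPU does not pin to a physical one.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original report.
# TEST_BIN: binary to run; LOG: progress file that should survive a restart.
TEST_BIN="${TEST_BIN:-./18.p}"
LOG="${LOG:-cpu-pin.log}"

# Run the test once, pinned to the given logical CPU.
run_pinned() {
    cpu="$1"
    echo "cpu $cpu: starting" >> "$LOG"
    sync    # flush the log so this entry is on disk if the host restarts
    if taskset -c "$cpu" "$TEST_BIN"; then
        echo "cpu $cpu: passed" >> "$LOG"
    else
        echo "cpu $cpu: failed" >> "$LOG"
    fi
    sync
}

# Walk CPUs 0..N-1 (assumes contiguous numbering of online CPUs).
run_all() {
    cpu=0
    ncpu="$(nproc)"
    while [ "$cpu" -lt "$ncpu" ]; do
        run_pinned "$cpu"
        cpu=$((cpu + 1))
    done
}

# Only start the sweep when explicitly asked, since it may crash the host.
if [ "${1:-}" = "--run" ]; then
    run_all
fi
```

After a crash, a last log line of "starting" with no matching "passed" or "failed" would point to the logical CPU in use at the time; `lscpu -e` can then map that CPU number to a socket.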
Note that this dynamically linked binary triggers the behavior on both Adélie and Alpine, indicating either that multiple libc and/or kernel versions are affected, or that this is purely a hardware issue.
On Larry (same x86_64 environment but AMD-based):
adelie ~ # ./18.p
test cases/18.t completed PASSED.
adelie ~ # ldd 18.p
/lib/ld-musl-x86_64.so.1 (0x7f5f72ddb000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f5f72ddb000)
On a test machine (a different machine with hardware identical to the node that crashes):
localhost:~$ ./18.p
test cases/18.t completed PASSED.
localhost:~$ ldd 18.p
/lib/ld-musl-x86_64.so.1 (0x7f787d616000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f787d616000)