[x264-devel] Machine Check errors
henrik at gramner.com
Wed Mar 11 22:54:20 CET 2015
On Tue, Mar 10, 2015 at 7:45 PM, Mark Nelson <markn at ieee.org> wrote:
> Using recent videolan builds of the x264 windows command line executable,
> (x264-r2491-24e4fed.exe), I have some hardware that experiences BSOD errors
> due to Machine Check 9C. This is seen when using the default auto-detect
> CPU flags.
> The BSODs are very rare. On a machine that is using close to 100% of its
> cycles on encoding, the average rate of failure is perhaps 1/week.
> The error has been seen on Xeon E5645 @ 2.4 GHz CPUs running XP, and on Xeon
> X5680 @ 3.33 GHz CPUs running Server 2008 R2. The crash is not associated
> with specific machines; it seems to occur on any machine of a specific model
> and CPU type.
> On both types of system, running the encoders with --asm 0x1400EE
> eliminates the problem - thousands and thousands of hours with no crashes.
> Getting to the bottom of Machine Check errors on Intel CPUs seems very
> problematic. It doesn't seem like our MB manufacturer or Intel has a good
> way to actually catch this in the act and explain why it happens. All
> the advice for fixing this error is along the lines of eliminating possible
> problems, mostly by pointing fingers at things that can go bad on the MB,
> faulty memory, bad BIOS settings etc.
> All of that is fine, but these same machines never experience that BSOD
> error when running other types of software at the same high rates - close to
> 100% CPU utilization. There is something about the default CPU options being
> selected by x264 that is causing the unique event:
> x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
> I realize it is *way* outside the scope of this mailer to debug CPU, MB, and
> chipset defects, but it would be interesting to know if anyone has ever seen
> this, either in the context of x264 or elsewhere.
> I don't think there is any way a Machine Check 9C can be generated by user
> mode code, so I have all along been working on the theory that this is a
> result of either a hardware defect or configuration error. To no avail.
> Mark Nelson – markn at ieee.org - http://marknelson.us
That indeed sounds like a hardware issue since a user space
application shouldn't be able to cause a BSOD.
Both the E5645 and X5680 are Westmere-EP CPUs. Does it occur with other
microarchitectures as well? If not it could possibly be a CPU bug
(those exist in a much larger number than you'd expect), see
for an errata summary of the 5600 series.