[x264-devel] Patch for data races, windows threading, OpenMP threading

Larry Meadows lfm at easystreet.net
Tue Jan 6 17:18:57 CET 2009

Dear x264 developers:

Attached please find a patch that you may be interested in. The patch
was created with 'git diff'. Let me explain what I've done:

First, I added macros to support native threading on Windows and with
OpenMP 3.0.

For Windows XP, there is no equivalent of
pthread_cond_wait/pthread_cond_broadcast . So I cheated by using
pthread_mutex_lock and pthread_mutex_unlock in a loop with a
SwichToThread call. This is probably more intrusive than a cond_wait
but should still yield the processor.

Vista has condition variables so it was a straighforward translation.

For OpenMP I used OpenMP tasks instead of threads. I have to busy-wait
just like for WIndows XP. I've tested the OpenMP code with Intel's
11.1 (beta) compiler and it seems to work fine. I did not try it with
gcc; I believe that the most recent gcc release (is it 4.4? ) supports
OpenMP 3.0 tasking.

I didn't get much if any improvement with OpenMP which sort of
surprised me; I guess that there's enough work per frame that the
overhead of thread creation doesn't matter much. It may be more
visible on linux, I did all my work on Windows.

Second, I used an internal version of Intel Parallel Inspector to
check the code for data races. I found several potential data races
due to the copying code in x264_thread_sync . Data that was being
initialized in x264_slice_write was also being copied. I fixed this by
moving the offending struct members outside of the range that is
copied (in common/common.h). I also changed the way that statistics
are handled for similar reasons.

As an aside it would be cleaner to organize x264_t into two or three
different structures. I'm not sure how much of the shared data even
changes after initialization, you might be able to keep only one copy
of that. You'd probably get a small performance boost by avoiding
false sharing also, and the code would be easier to maintain.

There was a minor data race in encoder/analyse.c -- the frame mutex
was released before reading the value of i_lines_completed. There is
also a data race further down in that same file in
x264_analyse_update_cache, but it is in code that is only for
debugging, so I left it alone.

Races on i_lines_completed:

ID	Description	Source	Function	Module	Object Size
X9	Read	analyse.c:2889	x264_analyse_update_cache	x264.exe	
X10	Write	frame.c:881	x264_frame_cond_broadcast	x264.exe	

Finally there are two other data races that I wasn't sure how to fix.
The first is on the x264_t member nr_count. It has the same issue of
being copied in x264_thread_sync but I couldn't find a place where it
was initialized for the new frame. So I'm not sure if it is supposed
to be cumulative between frames; if so, I don't think the code is
doing what you expect because of the data race.

These are the races on nr_count reported by our tool (sorry for the formatting):

ID	Description	Source	Function	Module	Object Size
X12	Read	macroblock.c:821	x264_noise_reduction_update	x264.exe	
X13	Read	encoder.c:1375	x264_encoder_encode	x264.exe	
X11	Write	macroblock.c:579	x264_macroblock_encode	x264.exe	

The other data race is between close_file_thread and read_frame_yuv .
I think this happens only if --frames is specified. Encode() tries to
close the file while read_frame_yuv() is still reading ahead. This
didn't show up as a problem but could at some point. Probably some
kind of lock and handshake is needed.

As an aside I also did some work on incorporating a multithreaded
version of x264 into ffdshow and had to do some tricks with the input
(running it in a separate thread), and had to work around a similar
issue. If I ever get back to that project I will see about recreating
the patches in a portable fashion and send them to the list.

Anyway, I hope you find this useful. Please let me know if you have
any questions or comments. Thanks for the opportunity to play with
your encoder!

-- Larry

Larry Meadows
lfm at easystreet.net
lawrence.f.meadows at intel.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch.txt
Url: http://mailman.videolan.org/pipermail/x264-devel/attachments/20090106/86e7e84d/attachment.txt 

More information about the x264-devel mailing list