[x264-devel] Patch for data races, windows threading, OpenMP threading
Loren Merritt
lorenm at u.washington.edu
Sat Jan 10 18:05:37 CET 2009
> For Windows XP, there is no equivalent of
> pthread_cond_wait/pthread_cond_broadcast
There is no single pthread_cond_wait which can receive both
pthread_cond_broadcast and pthread_cond_signal. But we don't need both.
Are you sure there's no equivalent to just broadcast?
A quick google says: PulseEvent, SignalObjectAndWait.
> I didn't get much if any improvement with OpenMP which sort of
> surprised me; I guess that there's enough work per frame that the
> overhead of thread creation doesn't matter much. It may be more
> visible on linux, I did all my work on Windows.
If thread creation were important, I would fix it by not creating lots of
threads. I wrote such a patch patch long ago, and didn't commit it because
it didn't help.
> I found several potential data races
> due to the copying code in x264_thread_sync . Data that was being
> initialized in x264_slice_write was also being copied.
That's a few unnecessary bytes of memcpy, but why call it a race?
If a variable is initted in x264_slice_write, then it doesn't matter
what x264_thread_sync sets it to, because x264_thread_sync runs
before x264_slice_write.
> There was a minor data race in encoder/analyse.c -- the frame mutex
> was released before reading the value of i_lines_completed.
Yeah, --non-deterministic was a silly option.
btw, anyone who knows more about threading than I do, feel free to
replace the condition variables with memory barriers. Because there
really shouldn't be any mutual exclusion between two frames checking
on the same reference.
> Finally there are two other data races that I wasn't sure how to fix.
> The first is on the x264_t member nr_count. It has the same issue of
> being copied in x264_thread_sync but I couldn't find a place where it
> was initialized for the new frame. So I'm not sure if it is supposed
> to be cumulative between frames; if so, I don't think the code is
> doing what you expect because of the data race.
Ideally, nr should linearly accumulate data over all previous frames.
This is just plain impossible with threads.
Next best choice would be to accumulate data over all previous frames
that are guaranteed to have finished yet. Perhaps separate the current
frame's nr_residual_sum from the global sum, and move
x264_noise_reduction_update to the beginning of each frame, where it
would add the stats from the frame previously encoded by the the
current thread, and the global sum would then be immediately available to
the next frame.
> The other data race is between close_file_thread and read_frame_yuv .
> I think this happens only if --frames is specified. Encode() tries to
> close the file while read_frame_yuv() is still reading ahead. This
> didn't show up as a problem but could at some point. Probably some
> kind of lock and handshake is needed.
How about just
--- a/muxers.c
+++ b/muxers.c
@@ -506,10 +506,10 @@
int close_file_thread( hnd_t handle )
{
thread_input_t *h = handle;
- h->p_close_infile( h->p_handle );
- x264_picture_clean( &h->pic );
if( h->in_progress )
x264_pthread_join( h->tid, NULL );
+ h->p_close_infile( h->p_handle );
+ x264_picture_clean( &h->pic );
free( h->next_args );
free( h );
return 0;
--Loren Merritt
More information about the x264-devel
mailing list