[x264-devel] Patch for data races, windows threading, OpenMP threading

Loren Merritt lorenm at u.washington.edu
Sat Jan 10 18:05:37 CET 2009

> For Windows XP, there is no equivalent of 
> pthread_cond_wait/pthread_cond_broadcast

There is no single pthread_cond_wait which can receive both 
pthread_cond_broadcast and pthread_cond_signal. But we don't need both. 
Are you sure there's no equivalent to just broadcast? 
A quick google says: PulseEvent, SignalObjectAndWait.

> I didn't get much if any improvement with OpenMP which sort of 
> surprised me; I guess that there's enough work per frame that the 
> overhead of thread creation doesn't matter much. It may be more 
> visible on linux, I did all my work on Windows.

If thread creation were important, I would fix it by not creating lots of 
threads. I wrote such a patch patch long ago, and didn't commit it because 
it didn't help.

> I found several potential data races 
> due to the copying code in x264_thread_sync . Data that was being 
> initialized in x264_slice_write was also being copied.

That's a few unnecessary bytes of memcpy, but why call it a race? 
If a variable is initted in x264_slice_write, then it doesn't matter 
what x264_thread_sync sets it to, because x264_thread_sync runs 
before x264_slice_write.

> There was a minor data race in encoder/analyse.c -- the frame mutex 
> was released before reading the value of i_lines_completed.

Yeah, --non-deterministic was a silly option.

btw, anyone who knows more about threading than I do, feel free to 
replace the condition variables with memory barriers. Because there 
really shouldn't be any mutual exclusion between two frames checking 
on the same reference.

> Finally there are two other data races that I wasn't sure how to fix. 
> The first is on the x264_t member nr_count. It has the same issue of 
> being copied in x264_thread_sync but I couldn't find a place where it 
> was initialized for the new frame. So I'm not sure if it is supposed 
> to be cumulative between frames; if so, I don't think the code is 
> doing what you expect because of the data race.

Ideally, nr should linearly accumulate data over all previous frames. 
This is just plain impossible with threads.
Next best choice would be to accumulate data over all previous frames 
that are guaranteed to have finished yet. Perhaps separate the current 
frame's nr_residual_sum from the global sum, and move 
x264_noise_reduction_update to the beginning of each frame, where it 
would add the stats from the frame previously encoded by the the 
current thread, and the global sum would then be immediately available to 
the next frame.

> The other data race is between close_file_thread and read_frame_yuv . 
> I think this happens only if --frames is specified. Encode() tries to 
> close the file while read_frame_yuv() is still reading ahead. This 
> didn't show up as a problem but could at some point. Probably some 
> kind of lock and handshake is needed.

How about just

--- a/muxers.c
+++ b/muxers.c
@@ -506,10 +506,10 @@
  int close_file_thread( hnd_t handle )
      thread_input_t *h = handle;
-    h->p_close_infile( h->p_handle );
-    x264_picture_clean( &h->pic );
      if( h->in_progress )
          x264_pthread_join( h->tid, NULL );
+    h->p_close_infile( h->p_handle );
+    x264_picture_clean( &h->pic );
      free( h->next_args );
      free( h );
      return 0;

--Loren Merritt

More information about the x264-devel mailing list