[x264-devel] Re: bug in commit 607

Måns Rullgård mru at inprovide.com
Tue Dec 19 10:04:20 CET 2006


Loren Merritt <lorenm at u.washington.edu> writes:

> On Sun, 17 Dec 2006, Måns Rullgård wrote:
>
>> Loren Merritt <lorenm at u.washington.edu> writes:
>>
>>> On Sat, 16 Dec 2006, Måns Rullgård wrote:
>>>>
>>>> The problem is with the pthread stuff at line 48 and following in
>>>> common/common.h, specifically the #define pthread_t int and similar
>>>> lines.  This whole chunk looks very dodgy to me.  Redefining system
>>>> symbols is never a good idea.
>>>>
>>>> Loren, what's the deal with this?  What's the purpose of the
>>>> USE_CONDITION_VAR thing, and why is it only defined on windows?
>>>
>>> On linux, we don't need any thread synchronization primitives.
>>>    while(!*foo) usleep(100);
>>
>> That is not guaranteed to work.  The compiler is free to optimize out
>> all but the first test of the condition,
>
> Ok, but the fix for that would be volatile. I don't see what a
> function call to pthread_mutex_lock has to do with it, unless any
> function call would be sufficient since the compiler has to assume
> that any function might modify that memory address.

No, volatile has a different purpose.  It is mainly intended for
variables that might be modified by a signal handler or by hardware,
and makes no guarantees about cache coherency or about ordering
between threads.

As for something being modified by functions, the compiler is free to
assume that functions with standard names perform the standard
function.  This allows it to inline simple calls to functions like
memcpy(), and also perform optimizations based on knowledge that a
function does not have side-effects.

>> and even if the compiler does not, the CPU might cache the
>> value. Using a mutex enforces the necessary barriers to make it safe.
>
> Does this also apply to all the other memory that's written by one
> thread and read by another, such as the pixels? If so, does a single
> mutex synchronize all caches of all the cpus?

Strictly, yes.  The pixel data is large enough that any current CPU
will do the right thing, but for the wrong reason.  Using a standard
thread synchronization method is guaranteed to ensure coherency.
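
As a rough sketch of what that could look like, the polling loop can
stay as it is, provided the flag is only read and written with a mutex
held.  All names here (flag_lock, frame_done, frame_value,
wait_by_polling) are made up for illustration and are not taken from
the x264 source.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical names; nothing here is taken from the x264 source. */
static pthread_mutex_t flag_lock = PTHREAD_MUTEX_INITIALIZER;
static int frame_done;   /* only accessed with flag_lock held */
static int frame_value;  /* data written before the flag is set */

/* Read the flag under the mutex, so the compiler cannot hoist the load
 * out of the loop and the lock/unlock pair orders the memory accesses. */
static int read_flag(void)
{
    int v;
    pthread_mutex_lock(&flag_lock);
    v = frame_done;
    pthread_mutex_unlock(&flag_lock);
    return v;
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&flag_lock);
    frame_value = 7;   /* the data */
    frame_done = 1;    /* the flag that announces it */
    pthread_mutex_unlock(&flag_lock);
    return NULL;
}

/* Starts the worker, polls until it is done, returns the data
 * (or -1 if the thread could not be created). */
int wait_by_polling(void)
{
    const struct timespec delay = { 0, 100 * 1000 };  /* 100 microseconds */
    pthread_t tid;
    int value;

    pthread_mutex_lock(&flag_lock);
    frame_done = 0;
    pthread_mutex_unlock(&flag_lock);

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    while (!read_flag())
        nanosleep(&delay, NULL);

    pthread_mutex_lock(&flag_lock);
    value = frame_value;
    pthread_mutex_unlock(&flag_lock);

    pthread_join(tid, NULL);
    return value;
}
```

The loop still wakes up every 100 microseconds, so this only removes
the correctness problem, not the polling itself.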

> I was under the impression that it's the cpu's job to keep the caches
> coherent and keep writes in order. It should never be the case that 2
> cpus see different values for the same address, unless one just wrote
> to that address and hasn't had time to communicate to the others yet.

The CPUs do keep caches in sync, but can only do so when the data
actually hits the cache.  Pending writes held in a write combining
buffer might not trigger a cache sync.  Intel CPUs tend to do things
"safely", letting you get away with things like this.  Others, like Sun
SPARC and IBM POWER, have much more relaxed memory ordering.
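
For illustration only: the C11 <stdatomic.h> interface, which postdates
this thread, lets the required ordering be stated explicitly instead of
relying on what a particular CPU happens to do.  The sketch below
assumes a C11 compiler with pthreads; pixel_row, row_ready and
read_row_sum are hypothetical names.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names; an illustration, not x264 code. */
static int pixel_row[4];       /* ordinary data, written before the flag */
static atomic_int row_ready;   /* flag used to publish pixel_row */

static void *writer(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < 4; i++)
        pixel_row[i] = i + 1;
    /* Release store: the writes to pixel_row above become visible to
     * any thread that observes row_ready == 1 with an acquire load. */
    atomic_store_explicit(&row_ready, 1, memory_order_release);
    return NULL;
}

/* Returns the sum of the row once the writer has published it,
 * or -1 if the thread could not be created. */
int read_row_sum(void)
{
    pthread_t tid;
    int i, sum = 0;

    atomic_store(&row_ready, 0);
    if (pthread_create(&tid, NULL, writer, NULL) != 0)
        return -1;

    /* Acquire load: pairs with the release store in writer(). */
    while (!atomic_load_explicit(&row_ready, memory_order_acquire))
        sched_yield();

    for (i = 0; i < 4; i++)
        sum += pixel_row[i];

    pthread_join(tid, NULL);
    return sum;
}
```

With a plain int flag and no ordering, nothing in the language would
stop the reader from seeing the flag before the pixel data.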

>>> is faster (though not by much. it wouldn't kill performance if we had
>>> to enable condition variables on linux too).
>>
>> How much of a difference does it make?
>
> I take that back. My latest benchmark shows no measurable difference.

In that case I strongly suggest that we use standard thread
synchronization methods to be on the safe side.
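
A minimal sketch of the standard approach, using a mutex and a
condition variable in place of the sleep loop.  The handoff_t type and
the function names are assumptions made for this example, not the
actual x264 code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical type and names; not taken from x264. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cv;
    int             ready;    /* predicate, protected by lock */
    int             payload;  /* data handed over with the flag */
} handoff_t;

static void *producer(void *arg)
{
    handoff_t *h = arg;
    pthread_mutex_lock(&h->lock);
    h->payload = 42;
    h->ready = 1;
    pthread_cond_signal(&h->ready_cv);  /* wake the waiting thread */
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Returns the payload, or -1 if the thread could not be created. */
int run_handoff(void)
{
    handoff_t h;
    pthread_t tid;
    int value = -1;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.ready_cv, NULL);
    h.ready = 0;
    h.payload = 0;

    if (pthread_create(&tid, NULL, producer, &h) == 0) {
        pthread_mutex_lock(&h.lock);
        /* Loop on the predicate: condition waits may wake spuriously. */
        while (!h.ready)
            pthread_cond_wait(&h.ready_cv, &h.lock);
        value = h.payload;
        pthread_mutex_unlock(&h.lock);
        pthread_join(tid, NULL);
    }

    pthread_cond_destroy(&h.ready_cv);
    pthread_mutex_destroy(&h.lock);
    return value;
}
```

The waiter sleeps until it is signalled rather than waking up every
100 microseconds, and the mutex provides the barriers discussed above.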

-- 
Måns Rullgård
mru at inprovide.com
