[x264-devel] Re: Very small optimizations

Loren Merritt lorenm at u.washington.edu
Thu Dec 1 22:52:41 CET 2005


On Thu, 1 Dec 2005, David Pio wrote:

> encoder/me.c
> function 'x264_me_search_ref'
>
> various search methods use 'i_me_range/2' or 'i_me_range/4' in the
> conditional part of the looping structure.
> i_me_range seems not to change within the context of the function, so should
> those divides be taken out of the looping structure?  say create 2 new
> variables i_me_range_div2 and i_me_range_div4??  It would save some divide
> CPU cycles.
>
> Or does the compiler optimize this out?

Division symbol != division instruction. The compiler knows to use shifts.
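
To illustrate, here is a minimal sketch. The helper names are made up for
this example and are not taken from encoder/me.c. For non-negative values,
dividing by 2 or 4 gives the same result as a right shift, which is why the
compiler does not need a divide instruction. If the value may be negative,
the compiler typically adds a small fix-up so the result still rounds toward
zero, but it still avoids a real divide.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not x264 code. */

/* What the source says: division by a constant power of two. */
int half_div(int i_me_range)    { return i_me_range / 2; }
int quarter_div(int i_me_range) { return i_me_range / 4; }

/* What a compiler can emit instead for non-negative input: a shift. */
int half_shift(int i_me_range)    { return i_me_range >> 1; }
int quarter_shift(int i_me_range) { return i_me_range >> 2; }

/* Count how often the two forms disagree for 0 <= r <= limit. */
int count_mismatches(int limit)
{
    int mismatches = 0;
    for (int r = 0; r <= limit; r++) {
        if (half_div(r) != half_shift(r))
            mismatches++;
        if (quarter_div(r) != quarter_shift(r))
            mismatches++;
    }
    return mismatches;
}
```

Calling count_mismatches() over any range of non-negative search ranges
should report zero disagreements.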

> common/mc.c
>
> line 247 and 278:
>  int filter1 = (hpel1x & 1) + ( (hpel1y & 1) << 1 );
> could be
>  int filter1 = (hpel1x & 1) | ( (hpel1y & 1) << 1 );
>
> replacing an addition with a bitwise OR, should save some CPU cycles?

Is there any modern CPU where ADD and OR are not equally fast?
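
For what it's worth, the two forms are interchangeable here because the
operands occupy different bits (bit 0 and bit 1), so no carry can occur. A
minimal sketch, with function names invented for this example, that checks
ADD, OR and XOR all give the same filter index:

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not x264 code. */

/* The form quoted from mc.c: combine the two low bits with an addition. */
int filter_index_add(int hpelx, int hpely)
{
    return (hpelx & 1) + ((hpely & 1) << 1);
}

/* Same value using a bitwise OR. */
int filter_index_or(int hpelx, int hpely)
{
    return (hpelx & 1) | ((hpely & 1) << 1);
}

/* Same value using a bitwise XOR. */
int filter_index_xor(int hpelx, int hpely)
{
    return (hpelx & 1) ^ ((hpely & 1) << 1);
}

/* Return 1 if all three forms agree for 0 <= x, y < 64, else 0. */
int filter_forms_agree(void)
{
    for (int y = 0; y < 64; y++) {
        for (int x = 0; x < 64; x++) {
            int a = filter_index_add(x, y);
            if (a != filter_index_or(x, y) || a != filter_index_xor(x, y))
                return 0;
        }
    }
    return 1;
}
```

So the choice between them is purely cosmetic unless one of the instructions
is actually slower on the target CPU.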

> lines 314 to 317:
>    const int cA = (8-d8x)*(8-d8y);
>    const int cB = d8x    *(8-d8y);
>    const int cC = (8-d8x)*d8y;
>    const int cD = d8x    *d8y;
>
> could be rewritten as:
>    int d8x_times8 = d8x * 8;
>    int d8y_times8 = d8y * 8;
>    const int cD = d8x * d8y;
>    const int cC = d8y_times8 - cD;
>    const int cB = d8x_times8 - cD;
>    const int cA = 64 - d8x_times8 - d8y_times8 + cD;
>
> 4 subtractions and 4 multiplications are replaced by 1 multiplication, 4
> subtractions, 1 addition, and 2 bit shifts (the *8 should be optimized by
> the compiler to a 3-bit left shift, right?).
> In the SSE version it is accomplished with 2 subtractions and 4
> multiplications.
> I couldn't find how many cycles a multiplication takes, but additions,
> subtractions, and bit shifts take about 1 cycle each, right?

This does remove 3 imul instructions, but doesn't save any cycles on my
Athlon 64. Maybe with the P4's slower imul?
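
For anyone who wants to benchmark it, here is a minimal sketch confirming
that the rewritten weights match the original ones. The struct and function
names are invented for this example, and the check assumes d8x and d8y lie
in 0..7. The rewrite just expands the products, e.g.
(8-d8x)*(8-d8y) = 64 - 8*d8x - 8*d8y + d8x*d8y.

```c
#include <assert.h>

/* Hypothetical container for the four bilinear weights; not x264 code. */
typedef struct { int cA, cB, cC, cD; } weights_t;

/* Direct form quoted from mc.c: four multiplications. */
weights_t weights_direct(int d8x, int d8y)
{
    weights_t w;
    w.cA = (8 - d8x) * (8 - d8y);
    w.cB = d8x * (8 - d8y);
    w.cC = (8 - d8x) * d8y;
    w.cD = d8x * d8y;
    return w;
}

/* Proposed form: one general multiplication, the rest shifts and adds. */
weights_t weights_factored(int d8x, int d8y)
{
    int d8x_times8 = d8x * 8;   /* a compiler can turn this into d8x << 3 */
    int d8y_times8 = d8y * 8;   /* likewise d8y << 3 */
    weights_t w;
    w.cD = d8x * d8y;
    w.cC = d8y_times8 - w.cD;
    w.cB = d8x_times8 - w.cD;
    w.cA = 64 - d8x_times8 - d8y_times8 + w.cD;
    return w;
}

/* Return 1 if both forms agree, and the weights sum to 64, for 0..7. */
int weights_agree(void)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            weights_t a = weights_direct(x, y);
            weights_t b = weights_factored(x, y);
            if (a.cA != b.cA || a.cB != b.cB || a.cC != b.cC || a.cD != b.cD)
                return 0;
            if (b.cA + b.cB + b.cC + b.cD != 64)
                return 0;
        }
    }
    return 1;
}
```

Whether the factored form is actually faster depends on the CPU's multiply
latency and on what the compiler does with either version, so it needs
measuring on the target machine.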

--Loren Merritt
