[x265] [PATCH 1 of 2] asm: rewrite and fix bug in weight_pp_sse4 on HIGH_BIT_DEPTH mode
chen
chenm003 at 163.com
Mon Jan 19 18:34:42 CET 2015
At 2015-01-20 01:19:27, dave <dtyx265 at gmail.com> wrote:
>On 01/19/2015 02:22 AM, Min Chen wrote:
>> # HG changeset patch
>> # User Min Chen <chenm003 at 163.com>
>> # Date 1421662905 -28800
>> # Node ID a0bb3bb1b076d2ef559ab94bfe81052142d302c3
>> # Parent bbc333bd4a6207c72c682b3ea88794c67996aa83
>> asm: rewrite and fix bug in weight_pp_sse4 on HIGH_BIT_DEPTH mode
>> ---
>> source/common/x86/asm-primitives.cpp | 2 +-
>> source/common/x86/pixel-util8.asm | 55 +++++++++++++++++++++-------------
>> source/test/pixelharness.cpp | 45 +++++++++++++++++++++++++++
>> 3 files changed, 80 insertions(+), 22 deletions(-)
>>
>> diff -r bbc333bd4a62 -r a0bb3bb1b076 source/common/x86/asm-primitives.cpp
>> --- a/source/common/x86/asm-primitives.cpp Mon Jan 19 09:59:33 2015 +0530
>> +++ b/source/common/x86/asm-primitives.cpp Mon Jan 19 18:21:45 2015 +0800
>> @@ -924,7 +924,7 @@
>>
>> p.planecopy_cp = x265_upShift_8_sse4;
>> // these fail unit tests
>> - // p.weight_pp = x265_weight_pp_sse4;
>> + p.weight_pp = x265_weight_pp_sse4;
>> // p.weight_sp = x265_weight_sp_sse4;
>>
>> p.cu[BLOCK_4x4].psy_cost_pp = x265_psyCost_pp_4x4_sse4;
>> diff -r bbc333bd4a62 -r a0bb3bb1b076 source/common/x86/pixel-util8.asm
>> --- a/source/common/x86/pixel-util8.asm Mon Jan 19 09:59:33 2015 +0530
>> +++ b/source/common/x86/pixel-util8.asm Mon Jan 19 18:21:45 2015 +0800
>> @@ -55,6 +55,8 @@
>> cextern pw_1
>> cextern pb_1
>> cextern pw_00ff
>> +cextern pw_1023
>> +cextern pw_3fff
>> cextern pw_2000
>> cextern pw_pixel_max
>> cextern pd_1
>> @@ -856,26 +858,52 @@
>> ;void weight_pp(pixel *src, pixel *dst, intptr_t stride, int width, int height, int w0, int round, int shift, int offset)
>> ;-----------------------------------------------------------------------------------------------------------------------------------------------
>> INIT_XMM sse4
>> -cglobal weight_pp, 6, 7, 6
>> -
>> - shl r5d, 6 ; m0 = [w0<<6]
>> +cglobal weight_pp, 4,7,7
>> +%define correction (14 - BIT_DEPTH)
>> +%if BIT_DEPTH == 10
>> + mova m6, [pw_1023]
>> +%elif BIT_DEPTH == 12
>> + mova m6, [pw_3fff]
>> +%else
>> + %error Unsupported BIT_DEPTH!
>> +%endif
>The "Unsupported BIT_DEPTH!" error is triggered in 8-bit builds.
>From gcc:
>
>source/common/x86/pixel-util8.asm:868: warning: Unsupported 8!
Sorry, that was a merge mistake in my local tree. I have sent a new patch; please verify it. Thanks.
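
For readers following the thread: the check quoted above is evaluated regardless of build mode, so an 8-bit build reaches the %error branch. The sketch below is a C preprocessor analogue of one possible guard, not the fix from the follow-up patch. The default values and the way HIGH_BIT_DEPTH is derived here are assumptions made for illustration, and weight_correction() is a hypothetical helper; only the expression 14 - BIT_DEPTH is taken from the patch.

```c
/* Hypothetical build-time settings; a real build would pass these in. */
#ifndef BIT_DEPTH
#define BIT_DEPTH 8
#endif
#ifndef HIGH_BIT_DEPTH
#define HIGH_BIT_DEPTH (BIT_DEPTH > 8)   /* assumption, for illustration */
#endif

#if HIGH_BIT_DEPTH
/* Reject unknown depths only when the high-bit-depth path is compiled. */
#if BIT_DEPTH != 10 && BIT_DEPTH != 12
#error "Unsupported BIT_DEPTH!"
#endif
#endif

/* Hypothetical helper: same expression as the patch's `correction` define. */
static int weight_correction(void)
{
    return 14 - BIT_DEPTH;
}
```

With the defaults above (BIT_DEPTH of 8), the depth check is skipped entirely and the file compiles. Building with -DBIT_DEPTH=9 would still stop at the #error, which is the behaviour the check is meant to provide for high-bit-depth builds.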