[x265] [PATCH] testbench: added new optimized c primitive for psyCost_pp, suitable to write asm code
Steve Borho
steve at borho.org
Mon Dec 15 18:17:44 CET 2014
On 12/15, dnyaneshwar at multicorewareinc.com wrote:
> # HG changeset patch
> # User Dnyaneshwar G <dnyaneshwar at multicorewareinc.com>
> # Date 1418633185 -19800
> # Mon Dec 15 14:16:25 2014 +0530
> # Node ID ff352d647f4b3a8f0c249fc7a8f4eb3645aaa974
> # Parent 6ba7be7b169783db1d667d1140e51b68ff4b64fb
> testbench: added new optimized c primitive for psyCost_pp, suitable to write asm code
>
> in new primitive, combined sa8d_8x8 and sad_8x8 together to save redundant loads, removed unnecessary zeroBuffer
> testbench checks old c vs new c code correctness
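For readers unfamiliar with the change, the idea in the commit message can be sketched on a 4x4 block. This is a simplified illustration, not the x265 primitives: the names (hadamard4, hadamardSum4x4, energyTwoPass, energyFused, psyCostFused, fusedMatchesTwoPass), the 8-bit pixel type, and the exact shape of the energy term (Hadamard sum minus a quarter of the pixel sum) are assumptions made for the example. The two-pass version loads the block twice and compares it against an all-zero buffer; the fused version loads each pixel once and needs no zero buffer.

```cpp
#include <cassert>
#include <cstdlib>

typedef unsigned char pixel; // 8-bit samples, assumed for this sketch

// In-place 4-point Hadamard butterfly.
static void hadamard4(int v[4])
{
    int a = v[0] + v[1], b = v[0] - v[1];
    int c = v[2] + v[3], d = v[2] - v[3];
    v[0] = a + c; v[1] = b + d; v[2] = a - c; v[3] = b - d;
}

// Sum of absolute 2-D Hadamard coefficients of a 4x4 block, halved.
static int hadamardSum4x4(const int blk[4][4])
{
    int m[4][4];
    for (int y = 0; y < 4; y++)
    {
        for (int x = 0; x < 4; x++)
            m[y][x] = blk[y][x];
        hadamard4(m[y]); // transform each row
    }
    int sum = 0;
    for (int x = 0; x < 4; x++)
    {
        int col[4] = { m[0][x], m[1][x], m[2][x], m[3][x] };
        hadamard4(col); // transform each column
        for (int y = 0; y < 4; y++)
            sum += std::abs(col[y]);
    }
    return sum >> 1;
}

// Old shape: two separate passes, each reading the block and a zero buffer.
static int energyTwoPass(const pixel* src, int stride)
{
    static const pixel zeroBuf[16] = { 0 };
    int d[4][4];
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            d[y][x] = src[y * stride + x] - zeroBuf[y * 4 + x]; // first load
    int satd = hadamardSum4x4(d);

    int sad = 0;
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            sad += std::abs(src[y * stride + x] - zeroBuf[y * 4 + x]); // second load
    return satd - (sad >> 2);
}

// New shape: each pixel is loaded once and feeds both terms; no zero buffer.
static int energyFused(const pixel* src, int stride)
{
    int d[4][4];
    int sum = 0;
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
        {
            int p = src[y * stride + x];
            d[y][x] = p;
            sum += p; // SAD against zeros is just the pixel sum
        }
    return hadamardSum4x4(d) - (sum >> 2);
}

// Psycho-visual cost sketch: difference in energy between source and recon.
static int psyCostFused(const pixel* source, int sstride, const pixel* recon, int rstride)
{
    return std::abs(energyFused(source, sstride) - energyFused(recon, rstride));
}

// Mirrors the "old C vs new C" correctness check on two fixed test patterns.
static bool fusedMatchesTwoPass()
{
    pixel a[8 * 4], b[8 * 4]; // stride 8, 4 rows
    for (int i = 0; i < 32; i++)
    {
        a[i] = (pixel)((i * 37 + 11) & 255);
        b[i] = (pixel)((i * 91 + 5) & 255);
    }
    return energyFused(a, 8) == energyTwoPass(a, 8) &&
           energyFused(b, 8) == energyTwoPass(b, 8);
}
```

The fused loop is also the natural starting point for an assembly version, since the loaded pixels stay in registers for both computations.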
Queued.
As a test I wired up the two C refs to be run by the speed tests. The results
are interesting. The new C functions are faster, primarily because there are
fewer function calls. I look forward to the assembly code.
primitive            speedup        new C         old C
psycost_pp[4x4]        1.56x       957.33       1489.44
psycost_ss[4x4]        1.09x      1216.87       1323.21
psycost_pp[8x8]        1.47x      3876.76       5689.62
psycost_ss[8x8]        1.17x      4575.75       5364.46
psycost_pp[16x16]      1.23x     25199.46      30884.86
psycost_ss[16x16]      1.00x     21506.01      21432.01
psycost_pp[32x32]      1.14x     81989.49      93471.70
psycost_ss[32x32]      1.94x     81693.79     158855.92
psycost_pp[64x64]      1.46x    263514.53     385280.00
psycost_ss[64x64]      1.02x    339000.50     344397.19
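The numbers above are consistent with the first column being the ratio of the third column to the second. A minimal sketch of that arithmetic follows; reading the second column as the time of the new C code and the third as the time of the old C code is an assumption based on those ratios, and speedup and formatSpeedup are hypothetical helper names, not part of the testbench.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Assumption: newTime is the timing of the new C primitive and oldTime
// the timing of the old one, in whatever unit the harness reports.
static double speedup(double newTime, double oldTime)
{
    return oldTime / newTime;
}

// Formats the ratio the way it appears in the listing, e.g. "1.56x".
static std::string formatSpeedup(double newTime, double oldTime)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%.2fx", speedup(newTime, oldTime));
    return std::string(buf);
}
```

For example, formatSpeedup(957.33, 1489.44) reproduces the 1.56x shown for psycost_pp[4x4].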
My hacks:
diff -r be5ab1a2a3fa source/test/pixelharness.cpp
--- a/source/test/pixelharness.cpp Mon Dec 15 15:10:27 2014 +0530
+++ b/source/test/pixelharness.cpp Mon Dec 15 11:11:30 2014 -0600
@@ -1695,6 +1695,18 @@
             HEADER("copy_cnt[%dx%d]", 4 << i, 4 << i);
             REPORT_SPEEDUP(opt.copy_cnt[i], ref.copy_cnt[i], sbuf1, sbuf2, STRIDE);
         }
+
+        if (ref.psy_cost_pp[i])
+        {
+            HEADER("psycost_pp[%dx%d]", 4 << i, 4 << i);
+            REPORT_SPEEDUP(ref.psy_cost_pp[i + NUM_SQUARE_BLOCKS], ref.psy_cost_pp[i], pbuf1, STRIDE, pbuf2, STRIDE);
+        }
+
+        if (ref.psy_cost_ss[i])
+        {
+            HEADER("psycost_ss[%dx%d]", 4 << i, 4 << i);
+            REPORT_SPEEDUP(ref.psy_cost_ss[i + NUM_SQUARE_BLOCKS], ref.psy_cost_ss[i], sbuf1, STRIDE, sbuf2, STRIDE);
+        }
     }
     if (opt.weight_pp)
--
Steve Borho