[x265] [PATCH] integrate assembly code for psyCost_pp
Steve Borho
steve at borho.org
Thu Dec 11 17:40:01 CET 2014
On 12/11, Divya Manivannan wrote:
> # HG changeset patch
> # User Divya Manivannan <divya at multicorewareinc.com>
> # Date 1418296477 -19800
> # Thu Dec 11 16:44:37 2014 +0530
> # Node ID 440d264fcdf33889b665848f19e87ca3559d1b6c
> # Parent 667e4ea0899fcf026ee9df935381487d3148ed0c
> integrate assembly code for psyCost_pp
>
> diff -r 667e4ea0899f -r 440d264fcdf3 source/common/pixel.cpp
> --- a/source/common/pixel.cpp Thu Dec 11 09:36:16 2014 +0530
> +++ b/source/common/pixel.cpp Thu Dec 11 16:44:37 2014 +0530
> @@ -815,10 +815,11 @@
> for (int j = 0; j < dim; j+= 8)
> {
> /* AC energy, measured by sa8d (AC + DC) minus SAD (DC) */
> - int sourceEnergy = sa8d_8x8(source + i * sstride + j, sstride, zeroBuf, 0) -
> - (sad<8, 8>(source + i * sstride + j, sstride, zeroBuf, 0) >> 2);
> - int reconEnergy = sa8d_8x8(recon + i * rstride + j, rstride, zeroBuf, 0) -
> - (sad<8, 8>(recon + i * rstride + j, rstride, zeroBuf, 0) >> 2);
> + // partitionFromSizes(8, 8) = 1
> + int sourceEnergy = primitives.sa8d[1](source + i * sstride + j, sstride, zeroBuf, 0) -
> + (primitives.sad[1](source + i * sstride + j, sstride, zeroBuf, 0) >> 2);
> + int reconEnergy = primitives.sa8d[1](recon + i * rstride + j, rstride, zeroBuf, 0) -
> + (primitives.sad[1](recon + i * rstride + j, rstride, zeroBuf, 0) >> 2);
This is an improvement over the pure C code, but it is still vastly
slower than new assembly functions written specifically for these
calculations would be. The function call overhead of going through the
primitives is non-trivial.
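
To illustrate the shape of a dedicated function, here is a minimal sketch
in plain C++ (not assembly) of a fused 4x4 version. The names
psyCost4x4Fused, hadamardSum4x4 and pixelSum4x4 are hypothetical and not
part of x265; 8-bit pixels and a halved Hadamard sum for SATD are
assumptions. Since SAD against a zero buffer is just the sum of the
pixels, the zeroBuf argument disappears, and both energies are computed
in a single call instead of four indirect ones.

```cpp
#include <cstdint>
#include <cstdlib>

typedef std::uint8_t pixel; // assumption: 8-bit pixel build

// Sum of absolute 4x4 Hadamard coefficients of one block, halved.
// The halving is an assumed SATD scaling for this sketch.
static int hadamardSum4x4(const pixel* p, std::intptr_t stride)
{
    int m[4][4];

    // Horizontal butterflies on each row
    for (int y = 0; y < 4; y++)
    {
        const pixel* r = p + y * stride;
        int s0 = r[0] + r[1], d0 = r[0] - r[1];
        int s1 = r[2] + r[3], d1 = r[2] - r[3];
        m[y][0] = s0 + s1;
        m[y][1] = s0 - s1;
        m[y][2] = d0 + d1;
        m[y][3] = d0 - d1;
    }

    // Vertical butterflies on each column, accumulating magnitudes
    int sum = 0;
    for (int x = 0; x < 4; x++)
    {
        int s0 = m[0][x] + m[1][x], d0 = m[0][x] - m[1][x];
        int s1 = m[2][x] + m[3][x], d1 = m[2][x] - m[3][x];
        sum += std::abs(s0 + s1) + std::abs(s0 - s1) +
               std::abs(d0 + d1) + std::abs(d0 - d1);
    }
    return sum >> 1;
}

// Plain pixel sum; equals SAD against an all-zero block.
static int pixelSum4x4(const pixel* p, std::intptr_t stride)
{
    int sum = 0;
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            sum += p[y * stride + x];
    return sum;
}

// Hypothetical fused 4x4 psy cost: one call covers source and recon.
int psyCost4x4Fused(const pixel* source, std::intptr_t sstride,
                    const pixel* recon, std::intptr_t rstride)
{
    int sourceEnergy = hadamardSum4x4(source, sstride) - (pixelSum4x4(source, sstride) >> 2);
    int reconEnergy = hadamardSum4x4(recon, rstride) - (pixelSum4x4(recon, rstride) >> 2);
    return std::abs(sourceEnergy - reconEnergy);
}
```

An assembly version would follow the same structure, keeping both blocks
in registers across the transform and the sum. This sketch only shows the
interface and the arithmetic, not the expected speedup.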
>
> totEnergy += abs(sourceEnergy - reconEnergy);
> }
> @@ -828,8 +829,11 @@
> else
> {
> /* 4x4 is too small for sa8d */
> - int sourceEnergy = satd_4x4(source, sstride, zeroBuf, 0) - (sad<4, 4>(source, sstride, zeroBuf, 0) >> 2);
> - int reconEnergy = satd_4x4(recon, rstride, zeroBuf, 0) - (sad<4, 4>(recon, rstride, zeroBuf, 0) >> 2);
> + // partitionFromSizes(4, 4) = 0
> + int sourceEnergy = primitives.satd[0](source, sstride, zeroBuf, 0) -
> + (primitives.sad[0](source, sstride, zeroBuf, 0) >> 2);
> + int reconEnergy = primitives.satd[0](recon, rstride, zeroBuf, 0) -
> + (primitives.sad[0](recon, rstride, zeroBuf, 0) >> 2);
> return abs(sourceEnergy - reconEnergy);
> }
> }
> _______________________________________________
> x265-devel mailing list
> x265-devel at videolan.org
> https://mailman.videolan.org/listinfo/x265-devel
--
Steve Borho