[x265] [PATCH] Added 10bit support to ssse3 dct16 and dct32 intrinsics

Steve Borho steve at borho.org
Thu Jan 22 01:31:04 CET 2015


On 01/21, dtyx265 at gmail.com wrote:
> # HG changeset patch
> # User David T Yuen <dtyx265 at gmail.com>
> # Date 1421877896 28800
> # Node ID ebbcf28b6d78afe0781516523c6f961e4404581c
> # Parent  66f85a0519e2e881b3ecd0026b3fabfc46926293
> Added 10bit support to ssse3 dct16 and dct32 intrinsics
> 
> WARNING:My system is old and limited to sse3 so this is untested!
> I will be happy to fix any errors found by anyone else.

It seems to work; here are the results for a HIGH_BIT_DEPTH build.

% ./test/TestBench --cpu ssse3 --test transform | grep dct
dct4x4          3.42x    344.36      1178.27 
dct16x16        4.58x    8254.76     37785.05
dct32x32        2.80x    81600.71    228222.84
idct4x4         7.96x    226.55      1804.38 
idct8x8         5.86x    1267.84     7432.42 
idct16x16       5.90x    7163.71     42261.29
idct32x32       6.31x    51990.97    327992.16
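
As a quick sanity check on the numbers above, the first numeric column appears to be the ratio of the last column to the middle one. The sketch below assumes, without confirmation from the TestBench source, that the columns are: reported speedup, optimized (SSSE3) timing, C reference timing. The helper names `parse` and `max_ratio_error` are hypothetical and only serve this illustration.

```python
# Rows copied verbatim from the TestBench output above.
ROWS = """\
dct4x4          3.42x    344.36      1178.27
dct16x16        4.58x    8254.76     37785.05
dct32x32        2.80x    81600.71    228222.84
idct4x4         7.96x    226.55      1804.38
idct8x8         5.86x    1267.84     7432.42
idct16x16       5.90x    7163.71     42261.29
idct32x32       6.31x    51990.97    327992.16
"""


def parse(text):
    """Map primitive name -> (reported speedup, optimized timing, reference timing).

    The column meanings are an assumption inferred from the numbers.
    """
    results = {}
    for line in text.splitlines():
        name, speedup, optimized, reference = line.split()
        results[name] = (float(speedup.rstrip("x")), float(optimized), float(reference))
    return results


def max_ratio_error(results):
    """Largest gap between the reported speedup and reference / optimized."""
    return max(abs(ref / opt - speedup) for speedup, opt, ref in results.values())


RESULTS = parse(ROWS)
for name, (speedup, opt, ref) in RESULTS.items():
    print(f"{name:10s} reported {speedup:.2f}x, recomputed {ref / opt:.2f}x")
```

If the column assumption holds, every recomputed ratio should agree with the reported speedup to within rounding.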

Running some more validations, then will push it if all is well.

-- 
Steve Borho

