[x265] [PATCH 15 of 16] improve sao diff[] by primivites.sub_ps

chen chenm003 at 163.com
Wed Oct 7 16:19:11 CEST 2015


Yes, each 'E*' (edge offset) mode depends on two neighboring points and compares the gradient between them, so we can pass two pointers into the asm code.

Who will work on it?
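
To make the two-pointer idea concrete, here is a rough sketch of what a single EO statistics routine could look like. It is only an illustration, not the x265 primitive: the name saoEoStatsRow, the signOf helper, the 8-bit pixel typedef and the 0..4 category indexing (2 + sign + sign) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t pixel; // assumption: 8-bit build

// Hypothetical helper: returns -1, 0 or +1.
static inline int signOf(int v) { return (v > 0) - (v < 0); }

// Hypothetical generic EO statistics routine for one row.
// rec   : reconstructed samples of the current row
// nbrA/B: the two neighbours along the edge direction
//         (e.g. rec - 1 and rec + 1 for the horizontal mode)
// diff  : (fenc - rec) for the same positions
// stats/count: 5 entries each, indexed by edge category 0..4
static void saoEoStatsRow(const pixel* rec, const pixel* nbrA, const pixel* nbrB,
                          const int16_t* diff, int width,
                          int32_t* stats, int32_t* count)
{
    assert(width >= 0);
    for (int x = 0; x < width; x++)
    {
        // Compare the sample against both neighbours; the sum of the two
        // signs (shifted by 2) selects the edge category.
        int edgeType = signOf(rec[x] - nbrA[x]) + signOf(rec[x] - nbrB[x]) + 2;
        stats[edgeType] += diff[x];
        count[edgeType]++;
    }
}
```

With this shape the caller picks the EO direction only by choosing the neighbour pointers (left/right, up/down, or the two diagonals), so the inner loop is the same for every mode.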


At 2015-10-07 17:08:47,"Ashok Kumar Mishra" <ashok at multicorewareinc.com> wrote:

Min, now that we have done the first improvement in SAO, for the second one I believe we can write only two functions,
one for BO statistics and one for EO statistics, using some data manipulation, so that we have
only two primitives for BO and EO. There is no need to write separate functions for each EO. Though it may not give
a significant performance gain, the code will be in very good shape.
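
For the BO side, a single statistics function could look roughly like the sketch below. This is an illustration under assumptions, not existing x265 code: the name saoBoStats, the 8-bit pixel typedef and the kBitDepth constant are made up for the example. The only thing taken as given is that band offset splits the sample range into 32 bands, so the band index is the top 5 bits of the sample.

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t pixel;        // assumption: 8-bit build
const int kBitDepth = 8;      // assumption: matches the pixel type above
const int kNumBoClasses = 32; // band offset uses 32 bands

// Hypothetical BO statistics routine: accumulates (fenc - rec) per band.
// stats/count must each hold kNumBoClasses entries.
static void saoBoStats(const pixel* rec, const int16_t* diff,
                       intptr_t recStride, intptr_t diffStride,
                       int width, int height,
                       int32_t* stats, int32_t* count)
{
    assert(width >= 0 && height >= 0);
    const int shift = kBitDepth - 5; // keep the top 5 bits as the band index
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            int band = rec[x] >> shift;
            stats[band] += diff[x];
            count[band]++;
        }
        rec += recStride;
        diff += diffStride;
    }
}
```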


On Wed, Oct 7, 2015 at 4:25 AM, Min Chen <chenm003 at 163.com> wrote:
# HG changeset patch
# User Min Chen <chenm003 at 163.com>
# Date 1444167726 18000
# Node ID 36a54b2cf7c4c96067bafb67077651d30d83e8e9
# Parent  8fdd1b8fd4529b6966ab787f6c624f6056f77593
improve sao diff[] by primivites.sub_ps
---
 source/encoder/sao.cpp |   23 +++++++++++++++++------
 source/encoder/sao.h   |    1 +
 2 files changed, 18 insertions(+), 6 deletions(-)

diff -r 8fdd1b8fd452 -r 36a54b2cf7c4 source/encoder/sao.cpp
--- a/source/encoder/sao.cpp    Tue Oct 06 16:42:02 2015 -0500
+++ b/source/encoder/sao.cpp    Tue Oct 06 16:42:06 2015 -0500
@@ -106,6 +106,7 @@
 bool SAO::create(x265_param* param)
 {
     m_param = param;
+    m_chromaFormat = param->internalCsp;
     m_hChromaShift = CHROMA_H_SHIFT(param->internalCsp);
     m_vChromaShift = CHROMA_V_SHIFT(param->internalCsp);

@@ -715,14 +716,24 @@
     ALIGN_VAR_32(int16_t, diff[MAX_CU_SIZE * MAX_CU_SIZE]);

     // Calculate (fenc - frec) and put into diff[]
-    // WARNING: *) May read beyond bound on video than width or height is NOT multiple of cuSize
-    //          *) MUST BE handle ColorSpace other than 420 yourself!
-    //primitives.cu[g_maxLog2CUSize - 2 - (plane != 0)].sub_ps(diff, MAX_CU_SIZE, fenc0, rec0, stride, stride);
-    for(int y = 0; y < ctuHeight; y++)
+    if ((lpelx + ctuWidth <  picWidth) & (tpely + ctuHeight < picHeight))
     {
-        for(int x = 0; x < ctuWidth; x++)
+        // WARNING: *) May read beyond bound on video than ctuWidth or ctuHeight is NOT multiple of cuSize
+        X265_CHECK((ctuWidth == ctuHeight) || (m_chromaFormat != X265_CSP_I420), "video size check failure\n");
+        if (plane)
+            primitives.chroma[m_chromaFormat].cu[g_maxLog2CUSize - 2].sub_ps(diff, MAX_CU_SIZE, fenc0, rec0, stride, stride);
+        else
+           primitives.cu[g_maxLog2CUSize - 2].sub_ps(diff, MAX_CU_SIZE, fenc0, rec0, stride, stride);
+    }
+    else
+    {
+        // path for non-square area (most in edge)
+        for(int y = 0; y < ctuHeight; y++)
         {
-            diff[y * MAX_CU_SIZE + x] = (fenc0[y * stride + x] - rec0[y * stride + x]);
+            for(int x = 0; x < ctuWidth; x++)
+            {
+                diff[y * MAX_CU_SIZE + x] = (fenc0[y * stride + x] - rec0[y * stride + x]);
+            }
         }
     }

diff -r 8fdd1b8fd452 -r 36a54b2cf7c4 source/encoder/sao.h
--- a/source/encoder/sao.h      Tue Oct 06 16:42:02 2015 -0500
+++ b/source/encoder/sao.h      Tue Oct 06 16:42:06 2015 -0500
@@ -83,6 +83,7 @@
     int8_t      m_offsetBo[SAO_NUM_BO_CLASSES];
     int8_t      m_offsetEo[NUM_EDGETYPE];

+    int         m_chromaFormat;
     int         m_numCuInWidth;
     int         m_numCuInHeight;
     int         m_hChromaShift;
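
For readers following the patch, the fast-path/fallback split can be sketched in isolation as below. This is a simplified stand-in, not the encoder code: subPsRef is a scalar model of what a square sub_ps kernel computes, kBlock replaces the real CTU size, and computeDiff with its "is this a full block" test is a hypothetical wrapper (the patch itself decides based on the CTU position relative to the picture size).

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t pixel; // assumption: 8-bit build
const int kBlock = 4;  // assumption: small stand-in for the CTU size

// Scalar model of a fixed-size square subtract kernel: dst = src0 - src1.
template<int N>
static void subPsRef(int16_t* dst, intptr_t dstStride,
                     const pixel* src0, const pixel* src1,
                     intptr_t stride0, intptr_t stride1)
{
    for (int y = 0; y < N; y++)
    {
        for (int x = 0; x < N; x++)
            dst[x] = (int16_t)(src0[x] - src1[x]);
        dst += dstStride;
        src0 += stride0;
        src1 += stride1;
    }
}

// Hypothetical wrapper: use the fixed-size kernel for a full square block,
// and a plain loop for partial blocks at the picture edge.
static void computeDiff(int16_t* diff, const pixel* fenc, const pixel* rec,
                        intptr_t stride, int width, int height)
{
    assert(width <= kBlock && height <= kBlock);
    if (width == kBlock && height == kBlock)
    {
        subPsRef<kBlock>(diff, kBlock, fenc, rec, stride, stride);
    }
    else
    {
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                diff[y * kBlock + x] = (int16_t)(fenc[y * stride + x] - rec[y * stride + x]);
    }
}
```

The point of the split is that the fixed-size kernel always touches the whole N x N area, so it is only safe when the full block lies inside the picture; the loop path writes only the valid width x height region.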


