[x265-commits] [x265] cli: delay calling showHelp until a param is allocated an...

Steve Borho steve at borho.org
Wed May 20 18:53:31 CEST 2015


details:   http://hg.videolan.org/x265/rev/e0738af788da
branches:  stable
changeset: 10507:e0738af788da
user:      Steve Borho <steve at borho.org>
date:      Wed May 20 10:29:09 2015 -0500
description:
cli: delay calling showHelp until a param is allocated and defaulted

Removes an old 'help' variable that was initialized to zero but never set
Subject: [x265] Merge with stable

details:   http://hg.videolan.org/x265/rev/a0b0c7bb486c
branches:  
changeset: 10508:a0b0c7bb486c
user:      Steve Borho <steve at borho.org>
date:      Wed May 20 11:53:06 2015 -0500
description:
Merge with stable

diffstat:

 doc/reST/api.rst                     |    54 +
 doc/reST/cli.rst                     |    24 +
 source/CMakeLists.txt                |     2 +-
 source/common/dct.cpp                |     5 +-
 source/common/lowres.cpp             |    12 +-
 source/common/lowres.h               |     2 +
 source/common/param.cpp              |    11 +-
 source/common/quant.cpp              |    81 +-
 source/common/quant.h                |     2 +-
 source/common/x86/asm-primitives.cpp |   189 ++-
 source/common/x86/const-a.asm        |    16 +-
 source/common/x86/intrapred.h        |     1 +
 source/common/x86/intrapred16.asm    |   458 +++++-
 source/common/x86/intrapred8.asm     |   101 +-
 source/common/x86/ipfilter16.asm     |   712 ++++++++++
 source/common/x86/ipfilter8.asm      |  2317 ++++++++++++++++++++++++++++-----
 source/common/x86/ipfilter8.h        |    62 +
 source/common/x86/mc-a.asm           |   526 +++++++-
 source/common/x86/pixel-a.asm        |   353 ++++-
 source/common/x86/pixel-util8.asm    |    11 +-
 source/common/x86/sad16-a.asm        |   630 +++++++++-
 source/encoder/analysis.cpp          |   281 ++-
 source/encoder/analysis.h            |     4 +-
 source/encoder/api.cpp               |    81 +-
 source/encoder/encoder.cpp           |    12 +
 source/encoder/entropy.cpp           |    76 +-
 source/encoder/entropy.h             |     2 +
 source/encoder/sao.cpp               |   109 +-
 source/encoder/search.cpp            |    55 +-
 source/encoder/search.h              |    13 +-
 source/encoder/slicetype.cpp         |     4 +-
 source/test/pixelharness.cpp         |    32 +-
 source/x265.cpp                      |    15 +-
 source/x265.def.in                   |     1 +
 source/x265.h                        |    62 +-
 source/x265cli.h                     |     6 +-
 36 files changed, 5554 insertions(+), 768 deletions(-)

diffs (truncated from 8531 to 300 lines):

diff -r ddb5868a4bcd -r a0b0c7bb486c doc/reST/api.rst
--- a/doc/reST/api.rst	Mon May 18 18:18:40 2015 -0500
+++ b/doc/reST/api.rst	Wed May 20 11:53:06 2015 -0500
@@ -419,3 +419,57 @@ under the name libx265 (so all applicati
 and then also install libx265_main10.so (symlinked to its numbered solib).
 Thus applications which use x265_api_get() will be able to generate main
 or main10 bitstreams.
+
+There is a second bit-depth introspection method that is designed for
+applications which need more flexibility in API versioning.  If you use
+the public API described at the top of this page or x265_api_get() then
+your application must be recompiled each time x265 changes its public
+API and bumps its build number (X265_BUILD, which is also the SONAME on
+POSIX systems).  But if you use **x265_api_query** and dynamically link to
+libx265 (use dlopen() on POSIX or LoadLibrary() on Windows) your
+application is no longer directly tied to the API version of x265.h that
+it was compiled against.
+
+	/* x265_api_query:
+	 *   Retrieve the programming interface for a linked x265 library, like
+	 *   x265_api_get(), except this function accepts X265_BUILD as the second
+	 *   argument rather than using the build number as part of the function name.
+	 *   Applications which dynamically link to libx265 can use this interface to
+	 *   query the library API and achieve a relative amount of version skew
+	 *   flexibility. The function may return NULL if the library determines that
+	 *   the apiVersion that your application was compiled against is not compatible
+	 *   with the library you have linked with.
+	 *
+	 *   api_major_version will be incremented any time non-backward compatible
+	 *   changes are made to any public structures or functions. If
+	 *   api_major_version does not match X265_MAJOR_VERSION from the x265.h your
+	 *   application compiled against, your application must not use the returned
+	 *   x265_api pointer.
+	 *
+	 *   Users of this API *must* also validate the sizes of any structures which
+	 *   are not treated as opaque in application code. For instance, if your
+	 *   application dereferences a x265_param pointer, then it must check that
+	 *   api->sizeof_param matches the sizeof(x265_param) that your application
+	 *   compiled with. */
+	const x265_api* x265_api_query(int bitDepth, int apiVersion, int* err);
+
+A number of validations must be performed on the returned API structure
+in order to determine if it is safe for use by your application. If you
+do not perform these checks, your application is liable to crash::
+
+	if (api->api_major_version != X265_MAJOR_VERSION) /* do not use */
+	if (api->sizeof_param != sizeof(x265_param))      /* do not use */
+	if (api->sizeof_picture != sizeof(x265_picture))  /* do not use */
+	if (api->sizeof_stats != sizeof(x265_stats))      /* do not use */
+	if (api->sizeof_zone != sizeof(x265_zone))        /* do not use */
+	etc.
+
+Note that if your application does not directly allocate or dereference
+one of these structures, if it treats the structure as opaque or does
+not use it at all, then it can skip the size check for that structure.
+
+In particular, if your application uses api->param_alloc(),
+api->param_free(), api->param_parse(), etc and never directly accesses
+any x265_param fields, then it can skip the check on the
+sizeof(x265_parm) and thereby ignore changes to that structure (which
+account for a large percentage of X265_BUILD bumps).
diff -r ddb5868a4bcd -r a0b0c7bb486c doc/reST/cli.rst
--- a/doc/reST/cli.rst	Mon May 18 18:18:40 2015 -0500
+++ b/doc/reST/cli.rst	Wed May 20 11:53:06 2015 -0500
@@ -581,6 +581,30 @@ the prediction quad-tree.
 	be consistent for all of them since the encoder configures several
 	key global data structures based on this range.
 
+.. option:: --limit-refs <0|1|2|3>
+
+	When set to X265_REF_LIMIT_DEPTH (1) x265 will limit the references
+	analyzed at the current depth based on the references used to code
+	the 4 sub-blocks at the next depth.  For example, a 16x16 CU will
+	only use the references used to code its four 8x8 CUs.
+
+	When set to X265_REF_LIMIT_CU (2), the rectangular and asymmetrical
+	partitions will only use references selected by the 2Nx2N motion
+	search (including at the lowest depth which is otherwise unaffected
+	by the depth limit).
+
+	When set to 3 (X265_REF_LIMIT_DEPTH && X265_REF_LIMIT_CU), the 2Nx2N 
+	motion search at each depth will only use references from the split 
+	CUs and the rect/amp motion searches at that depth will only use the 
+	reference(s) selected by 2Nx2N. 
+
+	You can often increase the number of references you are using
+	(within your decoder level limits) if you enable one or
+	both of these flags.
+
+	This feature is EXPERIMENTAL and currently only functional at RD
+	levels 0 through 4
+
 .. option:: --rect, --no-rect
 
 	Enable analysis of rectangular motion partitions Nx2N and 2NxN
diff -r ddb5868a4bcd -r a0b0c7bb486c source/CMakeLists.txt
--- a/source/CMakeLists.txt	Mon May 18 18:18:40 2015 -0500
+++ b/source/CMakeLists.txt	Wed May 20 11:53:06 2015 -0500
@@ -30,7 +30,7 @@ option(STATIC_LINK_CRT "Statically link 
 mark_as_advanced(FPROFILE_USE FPROFILE_GENERATE NATIVE_BUILD)
 
 # X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 59)
+set(X265_BUILD 60)
 configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
                "${PROJECT_BINARY_DIR}/x265.def")
 configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"
diff -r ddb5868a4bcd -r a0b0c7bb486c source/common/dct.cpp
--- a/source/common/dct.cpp	Mon May 18 18:18:40 2015 -0500
+++ b/source/common/dct.cpp	Wed May 20 11:53:06 2015 -0500
@@ -798,11 +798,11 @@ uint32_t findPosFirstLast_c(const int16_
             break;
     }
 
-    X265_CHECK(n >= 0, "non-zero coeff scan failuare!\n");
+    X265_CHECK(n >= -1, "non-zero coeff scan failuare!\n");
 
     uint32_t lastNZPosInCG = (uint32_t)n;
 
-    for (n = 0;; n++)
+    for (n = 0; n < SCAN_SET_SIZE; n++)
     {
         const uint32_t idx = scanTbl[n];
         const uint32_t idxY = idx / MLS_CG_SIZE;
@@ -813,6 +813,7 @@ uint32_t findPosFirstLast_c(const int16_
 
     uint32_t firstNZPosInCG = (uint32_t)n;
 
+    // NOTE: when coeff block all ZERO, the lastNZPosInCG is undefined and firstNZPosInCG is 16
     return ((lastNZPosInCG << 16) | firstNZPosInCG);
 }
 
diff -r ddb5868a4bcd -r a0b0c7bb486c source/common/lowres.cpp
--- a/source/common/lowres.cpp	Mon May 18 18:18:40 2015 -0500
+++ b/source/common/lowres.cpp	Wed May 20 11:53:06 2015 -0500
@@ -36,13 +36,13 @@ bool Lowres::create(PicYuv *origPic, int
     lumaStride = width + 2 * origPic->m_lumaMarginX;
     if (lumaStride & 31)
         lumaStride += 32 - (lumaStride & 31);
-    int cuWidth = (width + X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
-    int cuHeight = (lines + X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
-    int cuCount = cuWidth * cuHeight;
+    maxBlocksInRow = (width + X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
+    maxBlocksInCol = (lines + X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
+    int cuCount = maxBlocksInRow * maxBlocksInCol;
 
     /* rounding the width to multiple of lowres CU size */
-    width = cuWidth * X265_LOWRES_CU_SIZE;
-    lines = cuHeight * X265_LOWRES_CU_SIZE;
+    width = maxBlocksInRow * X265_LOWRES_CU_SIZE;
+    lines = maxBlocksInCol * X265_LOWRES_CU_SIZE;
 
     size_t planesize = lumaStride * (lines + 2 * origPic->m_lumaMarginY);
     size_t padoffset = lumaStride * origPic->m_lumaMarginY + origPic->m_lumaMarginX;
@@ -74,7 +74,7 @@ bool Lowres::create(PicYuv *origPic, int
     {
         for (int j = 0; j < bframes + 2; j++)
         {
-            CHECKED_MALLOC(rowSatds[i][j], int32_t, cuHeight);
+            CHECKED_MALLOC(rowSatds[i][j], int32_t, maxBlocksInCol);
             CHECKED_MALLOC(lowresCosts[i][j], uint16_t, cuCount);
         }
     }
diff -r ddb5868a4bcd -r a0b0c7bb486c source/common/lowres.h
--- a/source/common/lowres.h	Mon May 18 18:18:40 2015 -0500
+++ b/source/common/lowres.h	Wed May 20 11:53:06 2015 -0500
@@ -130,6 +130,8 @@ struct Lowres : public ReferencePlanes
     uint16_t(*lowresCosts[X265_BFRAME_MAX + 2][X265_BFRAME_MAX + 2]);
     int32_t*  lowresMvCosts[2][X265_BFRAME_MAX + 1];
     MV*       lowresMvs[2][X265_BFRAME_MAX + 1];
+    uint32_t  maxBlocksInRow;
+    uint32_t  maxBlocksInCol;
 
     /* used for vbvLookahead */
     int       plannedType[X265_LOOKAHEAD_MAX + 1];
diff -r ddb5868a4bcd -r a0b0c7bb486c source/common/param.cpp
--- a/source/common/param.cpp	Mon May 18 18:18:40 2015 -0500
+++ b/source/common/param.cpp	Wed May 20 11:53:06 2015 -0500
@@ -151,6 +151,7 @@ void x265_param_default(x265_param* para
     param->subpelRefine = 2;
     param->searchRange = 57;
     param->maxNumMergeCand = 2;
+    param->limitReferences = 0;
     param->bEnableWeightedPred = 1;
     param->bEnableWeightedBiPred = 0;
     param->bEnableEarlySkip = 0;
@@ -430,8 +431,8 @@ int x265_param_default_preset(x265_param
             param->deblockingFilterBetaOffset = -2;
             param->deblockingFilterTCOffset = -2;
             param->bIntraInBFrames = 0;
-            param->rdoqLevel = 1;
-            param->psyRdoq = 30;
+            param->rdoqLevel = 0;
+            param->psyRdoq = 0;
             param->psyRd = 0.5;
             param->rc.ipFactor = 1.1;
             param->rc.pbFactor = 1.1;
@@ -641,6 +642,7 @@ int x265_param_parse(x265_param* p, cons
         }
     }
     OPT("ref") p->maxNumReferences = atoi(value);
+    OPT("limit-refs") p->limitReferences = atoi(value);
     OPT("weightp") p->bEnableWeightedPred = atobool(value);
     OPT("weightb") p->bEnableWeightedBiPred = atobool(value);
     OPT("cbqpoffs") p->cbQpOffset = atoi(value);
@@ -1026,6 +1028,8 @@ int x265_check_params(x265_param* param)
           "subme must be less than or equal to X265_MAX_SUBPEL_LEVEL (7)");
     CHECK(param->subpelRefine < 0,
           "subme must be greater than or equal to 0");
+    CHECK(param->limitReferences > 3,
+          "limitReferences must be 0, 1, 2 or 3");
     CHECK(param->frameNumThreads < 0 || param->frameNumThreads > X265_MAX_FRAME_THREADS,
           "frameNumThreads (--frame-threads) must be [0 .. X265_MAX_FRAME_THREADS)");
     CHECK(param->cbQpOffset < -12, "Min. Chroma Cb QP Offset is -12");
@@ -1277,6 +1281,8 @@ void x265_print_params(x265_param* param
     if (param->rc.aqMode)
         x265_log(param, X265_LOG_INFO, "AQ: mode / str / qg-size / cu-tree  : %d / %0.1f / %d / %d\n", param->rc.aqMode,
                  param->rc.aqStrength, param->rc.qgSize, param->rc.cuTree);
+    x265_log(param, X265_LOG_INFO, "References / ref-limit  cu / depth  : %d / %d / %d\n",
+             param->maxNumReferences, !!(param->limitReferences & X265_REF_LIMIT_CU), !!(param->limitReferences & X265_REF_LIMIT_DEPTH));
 
     if (param->bLossless)
         x265_log(param, X265_LOG_INFO, "Rate Control                        : Lossless\n");
@@ -1420,6 +1426,7 @@ char *x265_param2string(x265_param* p)
     s += sprintf(s, " bframe-bias=%d", p->bFrameBias);
     s += sprintf(s, " b-adapt=%d", p->bFrameAdaptive);
     s += sprintf(s, " ref=%d", p->maxNumReferences);
+    s += sprintf(s, " limit-refs=%d", p->limitReferences);
     BOOL(p->bEnableWeightedPred, "weightp");
     BOOL(p->bEnableWeightedBiPred, "weightb");
     s += sprintf(s, " aq-mode=%d", p->rc.aqMode);
diff -r ddb5868a4bcd -r a0b0c7bb486c source/common/quant.cpp
--- a/source/common/quant.cpp	Mon May 18 18:18:40 2015 -0500
+++ b/source/common/quant.cpp	Wed May 20 11:53:06 2015 -0500
@@ -251,30 +251,63 @@ void Quant::setChromaQP(int qpin, TextTy
 }
 
 /* To minimize the distortion only. No rate is considered */
-uint32_t Quant::signBitHidingHDQ(int16_t* coeff, int32_t* deltaU, uint32_t numSig, const TUEntropyCodingParameters &codeParams)
+uint32_t Quant::signBitHidingHDQ(int16_t* coeff, int32_t* deltaU, uint32_t numSig, const TUEntropyCodingParameters &codeParams, uint32_t log2TrSize)
 {
-    const uint32_t log2TrSizeCG = codeParams.log2TrSizeCG;
+    uint32_t trSize = 1 << log2TrSize;
     const uint16_t* scan = codeParams.scan;
-    bool lastCG = true;
 
-    for (int cg = (1 << (log2TrSizeCG * 2)) - 1; cg >= 0; cg--)
+    uint8_t coeffNum[MLS_GRP_NUM];      // value range[0, 16]
+    uint16_t coeffSign[MLS_GRP_NUM];    // bit mask map for non-zero coeff sign
+    uint16_t coeffFlag[MLS_GRP_NUM];    // bit mask map for non-zero coeff
+
+#if CHECKED_BUILD || _DEBUG
+    // clean output buffer, the asm version of scanPosLast Never output anything after latest non-zero coeff group
+    memset(coeffNum, 0, sizeof(coeffNum));
+    memset(coeffSign, 0, sizeof(coeffNum));
+    memset(coeffFlag, 0, sizeof(coeffNum));
+#endif
+    const int lastScanPos = primitives.scanPosLast(codeParams.scan, coeff, coeffSign, coeffFlag, coeffNum, numSig, g_scan4x4[codeParams.scanType], trSize);
+    const int cgLastScanPos = (lastScanPos >> LOG2_SCAN_SET_SIZE);
+    unsigned long tmp;
+
+    // first CG need specially processing
+    const uint32_t correctOffset = 0x0F & (lastScanPos ^ 0xF);
+    coeffFlag[cgLastScanPos] <<= correctOffset;
+
+    for (int cg = cgLastScanPos; cg >= 0; cg--)
     {
         int cgStartPos = cg << LOG2_SCAN_SET_SIZE;
         int n;
 
+#if CHECKED_BUILD || _DEBUG
         for (n = SCAN_SET_SIZE - 1; n >= 0; --n)
             if (coeff[scan[n + cgStartPos]])
                 break;
-        if (n < 0)
+        int lastNZPosInCG0 = n;
+#endif
+
+        if (coeffNum[cg] == 0)
+        {
+            X265_CHECK(lastNZPosInCG0 < 0, "all zero block check failure\n");
             continue;
+        }
 
-        int lastNZPosInCG = n;
-
+#if CHECKED_BUILD || _DEBUG
         for (n = 0;; n++)
             if (coeff[scan[n + cgStartPos]])
                 break;
 
-        int firstNZPosInCG = n;
+        int firstNZPosInCG0 = n;
+#endif
+
+        CLZ(tmp, coeffFlag[cg]);
+        const int firstNZPosInCG = (15 ^ tmp);
+
+        CTZ(tmp, coeffFlag[cg]);
+        const int lastNZPosInCG = (15 ^ tmp);


More information about the x265-commits mailing list