[x265-commits] [x265] param: make --tune zero-latency actually have zero-latenc...

Steve Borho steve at borho.org
Mon Jan 26 17:10:22 CET 2015


details:   http://hg.videolan.org/x265/rev/8445d4bef936
branches:  
changeset: 9201:8445d4bef936
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 21 09:12:16 2015 -0600
description:
param: make --tune zero-latency actually have zero-latency at the encoder

It now disables frame parallelism, which can be a large performance loss.
Users may want to increase the number of frame encoders if they only need
zero-latency at the decoder.
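
A minimal sketch of the condition this series relies on, for readers who want
to see when the encoder treats itself as zero-latency. ParamSketch and both
function names are hypothetical stand-ins, not x265 API; the field names follow
the x265_param members touched in the diff below, the default values are
invented, and setting bframes to 0 in the tune is an assumption that the quoted
diff does not show.

```cpp
#include <cassert>

// Hypothetical stand-in for the few x265_param fields involved; not the real struct.
// The default values are placeholders chosen only to be non-zero-latency.
struct ParamSketch
{
    int bframes = 4;
    int lookaheadDepth = 20;
    int frameNumThreads = 3;
};

// Models what the tune does after this change. frameNumThreads = 1 is the new line;
// bframes = 0 is assumed here and is not visible in the quoted diff context.
void applyZeroLatencyTune(ParamSketch& p)
{
    p.bframes = 0;
    p.lookaheadDepth = 0;
    p.frameNumThreads = 1;
}

// Same boolean expression as the m_bZeroLatency assignment in Encoder::create().
bool isEncoderZeroLatency(const ParamSketch& p)
{
    return !p.bframes && !p.lookaheadDepth && p.frameNumThreads == 1;
}
```

In this model, raising frameNumThreads after applying the tune makes
isEncoderZeroLatency() return false again, which corresponds to the trade-off
described above for users who only need zero latency at the decoder.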
Subject: [x265] level: make --tune zero-latency have zero-latency at the decoder (closes #99)

details:   http://hg.videolan.org/x265/rev/288fb71ac4a2
branches:  
changeset: 9202:288fb71ac4a2
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 21 09:13:01 2015 -0600
description:
level: make --tune zero-latency have zero-latency at the decoder (closes #99)
Subject: [x265] slice: signal sps_max_latency_increase_plus1 more accurately (refs #99)

details:   http://hg.videolan.org/x265/rev/998358779845
branches:  
changeset: 9203:998358779845
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 21 09:19:46 2015 -0600
description:
slice: signal sps_max_latency_increase_plus1 more accurately (refs #99)
Subject: [x265] cli: allow the CLI to be bit-depth independent on non-Windows platforms

details:   http://hg.videolan.org/x265/rev/b271df20f9e3
branches:  
changeset: 9204:b271df20f9e3
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 21 09:29:27 2015 -0600
description:
cli: allow the CLI to be bit-depth independent on non-Windows platforms

This allows one to do something like this:
LD_LIBRARY_PATH=/usr/local/x265_16bpp ./x265 in.y4m out-main10.hevc
LD_LIBRARY_PATH=/usr/local/x265_8bpp ./x265 in.y4m out-main8.hevc

Without this change, the CLI "remembers" the bit depth it was compiled with
for no particularly good reason.

On Windows, the CLI must link with the static library and this point is moot.

closes (#98)
Subject: [x265] analysis: allocate and initialize interData ref index

details:   http://hg.videolan.org/x265/rev/2b93cf2a5ac8
branches:  
changeset: 9205:2b93cf2a5ac8
user:      Gopu Govindaswamy <gopu at multicorewareinc.com>
date:      Wed Jan 21 16:50:07 2015 +0530
description:
analysis: allocate and initialize interData ref index
Subject: [x265] encoder: white-space, comment nits

details:   http://hg.videolan.org/x265/rev/b304928274d0
branches:  
changeset: 9206:b304928274d0
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 21 10:17:14 2015 -0600
description:
encoder: white-space, comment nits
Subject: [x265] encoder: if zero-latency, encode each picture in single call

details:   http://hg.videolan.org/x265/rev/e95e2bafa173
branches:  
changeset: 9207:e95e2bafa173
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 21 11:20:10 2015 -0600
description:
encoder: if zero-latency, encode each picture in single call

This patch deliberately doesn't change indentation so the logic changes are
clear. It's fairly ugly but I can't think of a cleaner method to handle the
problem.
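
The reordering can be illustrated with a small stand-alone sketch. It only
records the order of the two steps inside one encode() call; encodeCallOrder
and the step labels are hypothetical, and the loop's exit condition is an
assumption because the diff below is truncated before the closing while().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one encode() call: returns the order in which the
// "collect output" and "submit input" steps run.
std::vector<std::string> encodeCallOrder(bool zeroLatency)
{
    std::vector<std::string> order;
    int pass = 0;
    do
    {
        // Normal mode collects the finished frame first; zero-latency mode
        // defers collection to the second pass.
        if (!zeroLatency || pass)
            order.push_back("collect output");

        // The new input picture is handed to the frame encoder on the first pass only.
        if (!pass)
            order.push_back("submit input");
    }
    while (zeroLatency && ++pass < 2); // assumed exit condition, not shown in the truncated diff

    return order;
}
```

With zeroLatency set to false the loop body runs once and collection precedes
submission; with it set to true the body runs twice, so the picture submitted
in the first pass is collected in the second pass of the same call.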
Subject: [x265] encoder: proper indentation for the zero-latency loop, no logic changes

details:   http://hg.videolan.org/x265/rev/2ebc41ee6e87
branches:  
changeset: 9208:2ebc41ee6e87
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 21 11:21:23 2015 -0600
description:
encoder: proper indentation for the zero-latency loop, no logic changes
Subject: [x265] profile: give a compile error if PPA and VTUNE are enabled

details:   http://hg.videolan.org/x265/rev/6765d5e4e46a
branches:  
changeset: 9209:6765d5e4e46a
user:      Steve Borho <steve at borho.org>
date:      Sat Jan 24 11:25:15 2015 -0600
description:
profile: give a compile error if PPA and VTUNE are enabled

In this configuration, neither will work properly.
Subject: [x265] profile: name the file read thread

details:   http://hg.videolan.org/x265/rev/88e1b7d0ecdf
branches:  
changeset: 9210:88e1b7d0ecdf
user:      Steve Borho <steve at borho.org>
date:      Sat Jan 24 11:33:55 2015 -0600
description:
profile: name the file read thread
Subject: [x265] profile: re-enable frame encode tasks

details:   http://hg.videolan.org/x265/rev/40b59d7be128
branches:  
changeset: 9211:40b59d7be128
user:      Steve Borho <steve at borho.org>
date:      Sat Jan 24 11:47:46 2015 -0600
description:
profile: re-enable frame encode tasks
Subject: [x265] profile: illuminate pre-lookahead tasks of downscale and AQ init

details:   http://hg.videolan.org/x265/rev/5fc167a483c2
branches:  
changeset: 9212:5fc167a483c2
user:      Steve Borho <steve at borho.org>
date:      Sat Jan 24 11:53:51 2015 -0600
description:
profile: illuminate pre-lookahead tasks of downscale and AQ init

diffstat:

 source/common/common.h          |    3 +
 source/common/param.cpp         |    1 +
 source/common/slice.h           |    1 +
 source/encoder/encoder.cpp      |  322 +++++++++++++++++++++------------------
 source/encoder/encoder.h        |    1 +
 source/encoder/entropy.cpp      |    2 +-
 source/encoder/frameencoder.cpp |    2 +-
 source/encoder/level.cpp        |    4 +-
 source/encoder/search.cpp       |    2 +-
 source/encoder/search.h         |    1 -
 source/encoder/slicetype.cpp    |    8 +-
 source/input/y4m.cpp            |    1 +
 source/input/yuv.cpp            |    1 +
 source/profile/cpuEvents.h      |    1 +
 source/x265.cpp                 |   12 +-
 15 files changed, 192 insertions(+), 170 deletions(-)

diffs (truncated from 567 to 300 lines):

diff -r ebbcf28b6d78 -r 5fc167a483c2 source/common/common.h
--- a/source/common/common.h	Wed Jan 21 14:04:56 2015 -0800
+++ b/source/common/common.h	Sat Jan 24 11:53:51 2015 -0600
@@ -41,6 +41,9 @@
 
 #include "x265.h"
 
+#if ENABLE_PPA && ENABLE_VTUNE
+#error "PPA and VTUNE cannot both be enabled. Disable one of them."
+#endif
 #if ENABLE_PPA
 #include "profile/PPA/ppa.h"
 #define ProfileScopeEvent(x) PPAScopeEvent(x)
diff -r ebbcf28b6d78 -r 5fc167a483c2 source/common/param.cpp
--- a/source/common/param.cpp	Wed Jan 21 14:04:56 2015 -0800
+++ b/source/common/param.cpp	Sat Jan 24 11:53:51 2015 -0600
@@ -409,6 +409,7 @@ int x265_param_default_preset(x265_param
             param->lookaheadDepth = 0;
             param->scenecutThreshold = 0;
             param->rc.cuTree = 0;
+            param->frameNumThreads = 1;
         }
         else if (!strcmp(tune, "grain"))
         {
diff -r ebbcf28b6d78 -r 5fc167a483c2 source/common/slice.h
--- a/source/common/slice.h	Wed Jan 21 14:04:56 2015 -0800
+++ b/source/common/slice.h	Sat Jan 24 11:53:51 2015 -0600
@@ -230,6 +230,7 @@ struct SPS
 
     uint32_t maxDecPicBuffering; // these are dups of VPS values
     int      numReorderPics;
+    int      maxLatencyIncrease;
 
     bool     bUseStrongIntraSmoothing; // use param
     bool     bTemporalMVPEnabled;
diff -r ebbcf28b6d78 -r 5fc167a483c2 source/encoder/encoder.cpp
--- a/source/encoder/encoder.cpp	Wed Jan 21 14:04:56 2015 -0800
+++ b/source/encoder/encoder.cpp	Sat Jan 24 11:53:51 2015 -0600
@@ -257,6 +257,8 @@ void Encoder::create()
         }
     }
 
+    m_bZeroLatency = !m_param->bframes && !m_param->lookaheadDepth && m_param->frameNumThreads == 1;
+
     m_aborted |= parseLambdaFile(m_param);
 
     m_encodeStartTime = x265_mdate();
@@ -456,7 +458,10 @@ int Encoder::encode(const x265_picture* 
                 }
             }
             else
+            {
+                ProfileScopeEvent(prelookahead);
                 m_rateControl->calcAdaptiveQuantFrame(inFrame);
+            }
         }
 
         /* Use the frame types from the first pass, if available */
@@ -488,169 +493,185 @@ int Encoder::encode(const x265_picture* 
     m_curEncoder = (m_curEncoder + 1) % m_param->frameNumThreads;
     int ret = 0;
 
-    // getEncodedPicture() should block until the FrameEncoder has completed
-    // encoding the frame.  This is how back-pressure through the API is
-    // accomplished when the encoder is full.
-    Frame *outFrame = curEncoder->getEncodedPicture(m_nalList);
+    /* Normal operation is to wait for the current frame encoder to complete its current frame
+     * and then to give it a new frame to work on.  In zero-latency mode, we must encode this
+     * input picture before returning so the order must be reversed. This do/while() loop allows
+     * us to alternate the order of the calls without ugly code replication */
+    Frame* outFrame = NULL;
+    Frame* frameEnc = NULL;
+    int pass = 0;
+    do
+    {
+        /* getEncodedPicture() should block until the FrameEncoder has completed
+         * encoding the frame.  This is how back-pressure through the API is
+         * accomplished when the encoder is full */
+        if (!m_bZeroLatency || pass)
+            outFrame = curEncoder->getEncodedPicture(m_nalList);
+        if (outFrame)
+        {
+            Slice *slice = outFrame->m_encData->m_slice;
 
-    if (outFrame)
-    {
-        Slice *slice = outFrame->m_encData->m_slice;
+            /* Free up pic_in->analysisData since it has already been used */
+            if (m_param->analysisMode == X265_ANALYSIS_LOAD)
+                freeAnalysis(&outFrame->m_analysisData);
 
-        /* Free up pic_in->analysisData since it has already been used */
-        if (m_param->analysisMode == X265_ANALYSIS_LOAD)
-            freeAnalysis(&outFrame->m_analysisData);
+            if (pic_out)
+            {
+                PicYuv *recpic = outFrame->m_reconPic;
+                pic_out->poc = slice->m_poc;
+                pic_out->bitDepth = X265_DEPTH;
+                pic_out->userData = outFrame->m_userData;
+                pic_out->colorSpace = m_param->internalCsp;
 
-        if (pic_out)
-        {
-            PicYuv *recpic = outFrame->m_reconPic;
-            pic_out->poc = slice->m_poc;
-            pic_out->bitDepth = X265_DEPTH;
-            pic_out->userData = outFrame->m_userData;
-            pic_out->colorSpace = m_param->internalCsp;
+                pic_out->pts = outFrame->m_pts;
+                pic_out->dts = outFrame->m_dts;
 
-            pic_out->pts = outFrame->m_pts;
-            pic_out->dts = outFrame->m_dts;
+                switch (slice->m_sliceType)
+                {
+                case I_SLICE:
+                    pic_out->sliceType = outFrame->m_lowres.bKeyframe ? X265_TYPE_IDR : X265_TYPE_I;
+                    break;
+                case P_SLICE:
+                    pic_out->sliceType = X265_TYPE_P;
+                    break;
+                case B_SLICE:
+                    pic_out->sliceType = X265_TYPE_B;
+                    break;
+                }
 
-            switch (slice->m_sliceType)
+                pic_out->planes[0] = recpic->m_picOrg[0];
+                pic_out->stride[0] = (int)(recpic->m_stride * sizeof(pixel));
+                pic_out->planes[1] = recpic->m_picOrg[1];
+                pic_out->stride[1] = (int)(recpic->m_strideC * sizeof(pixel));
+                pic_out->planes[2] = recpic->m_picOrg[2];
+                pic_out->stride[2] = (int)(recpic->m_strideC * sizeof(pixel));
+
+                /* Dump analysis data from pic_out to file in save mode and free */
+                if (m_param->analysisMode == X265_ANALYSIS_SAVE)
+                {
+                    pic_out->analysisData.poc = pic_out->poc;
+                    pic_out->analysisData.sliceType = pic_out->sliceType;
+                    pic_out->analysisData.numCUsInFrame = outFrame->m_analysisData.numCUsInFrame;
+                    pic_out->analysisData.numPartitions = outFrame->m_analysisData.numPartitions;
+                    pic_out->analysisData.interData = outFrame->m_analysisData.interData;
+                    pic_out->analysisData.intraData = outFrame->m_analysisData.intraData;
+                    writeAnalysisFile(&pic_out->analysisData);
+                    freeAnalysis(&pic_out->analysisData);
+                }
+            }
+            if (slice->m_sliceType == P_SLICE)
             {
-            case I_SLICE:
-                pic_out->sliceType = outFrame->m_lowres.bKeyframe ? X265_TYPE_IDR : X265_TYPE_I;
-                break;
-            case P_SLICE:
-                pic_out->sliceType = X265_TYPE_P;
-                break;
-            case B_SLICE:
-                pic_out->sliceType = X265_TYPE_B;
-                break;
+                if (slice->m_weightPredTable[0][0][0].bPresentFlag)
+                    m_numLumaWPFrames++;
+                if (slice->m_weightPredTable[0][0][1].bPresentFlag ||
+                    slice->m_weightPredTable[0][0][2].bPresentFlag)
+                    m_numChromaWPFrames++;
+            }
+            else if (slice->m_sliceType == B_SLICE)
+            {
+                bool bLuma = false, bChroma = false;
+                for (int l = 0; l < 2; l++)
+                {
+                    if (slice->m_weightPredTable[l][0][0].bPresentFlag)
+                        bLuma = true;
+                    if (slice->m_weightPredTable[l][0][1].bPresentFlag ||
+                        slice->m_weightPredTable[l][0][2].bPresentFlag)
+                        bChroma = true;
+                }
+
+                if (bLuma)
+                    m_numLumaWPBiFrames++;
+                if (bChroma)
+                    m_numChromaWPBiFrames++;
             }
 
-            pic_out->planes[0] = recpic->m_picOrg[0];
-            pic_out->stride[0] = (int)(recpic->m_stride * sizeof(pixel));
-            pic_out->planes[1] = recpic->m_picOrg[1];
-            pic_out->stride[1] = (int)(recpic->m_strideC * sizeof(pixel));
-            pic_out->planes[2] = recpic->m_picOrg[2];
-            pic_out->stride[2] = (int)(recpic->m_strideC * sizeof(pixel));
+            if (m_aborted)
+                return -1;
 
-            /* Dump analysis data from pic_out to file in save mode and free */
+            finishFrameStats(outFrame, curEncoder, curEncoder->m_accessUnitBits);
+
+            /* Allow this frame to be recycled if no frame encoders are using it for reference */
+            if (!pic_out)
+            {
+                ATOMIC_DEC(&outFrame->m_countRefEncoders);
+                m_dpb->recycleUnreferenced();
+            }
+            else
+                m_exportedPic = outFrame;
+
+            m_numDelayedPic--;
+
+            ret = 1;
+        }
+
+        /* pop a single frame from decided list, then provide to frame encoder
+         * curEncoder is guaranteed to be idle at this point */
+        if (!pass)
+            frameEnc = m_lookahead->getDecidedPicture();
+        if (frameEnc && !pass)
+        {
+            /* give this frame a FrameData instance before encoding */
+            if (m_dpb->m_picSymFreeList)
+            {
+                frameEnc->m_encData = m_dpb->m_picSymFreeList;
+                m_dpb->m_picSymFreeList = m_dpb->m_picSymFreeList->m_freeListNext;
+                frameEnc->reinit(m_sps);
+            }
+            else
+            {
+                frameEnc->allocEncodeData(m_param, m_sps);
+                Slice* slice = frameEnc->m_encData->m_slice;
+                slice->m_sps = &m_sps;
+                slice->m_pps = &m_pps;
+                slice->m_maxNumMergeCand = m_param->maxNumMergeCand;
+                slice->m_endCUAddr = slice->realEndAddress(m_sps.numCUsInFrame * NUM_CU_PARTITIONS);
+                frameEnc->m_reconPic->m_cuOffsetC = m_cuOffsetC;
+                frameEnc->m_reconPic->m_cuOffsetY = m_cuOffsetY;
+                frameEnc->m_reconPic->m_buOffsetC = m_buOffsetC;
+                frameEnc->m_reconPic->m_buOffsetY = m_buOffsetY;
+            }
+
+            curEncoder->m_rce.encodeOrder = m_encodedFrameNum++;
+            if (m_bframeDelay)
+            {
+                int64_t *prevReorderedPts = m_prevReorderedPts;
+                frameEnc->m_dts = m_encodedFrameNum > m_bframeDelay
+                    ? prevReorderedPts[(m_encodedFrameNum - m_bframeDelay) % m_bframeDelay]
+                    : frameEnc->m_reorderedPts - m_bframeDelayTime;
+                prevReorderedPts[m_encodedFrameNum % m_bframeDelay] = frameEnc->m_reorderedPts;
+            }
+            else
+                frameEnc->m_dts = frameEnc->m_reorderedPts;
+
+            /* Allocate analysis data before encode in save mode. This is allocated in frameEnc */
             if (m_param->analysisMode == X265_ANALYSIS_SAVE)
             {
-                pic_out->analysisData.poc = pic_out->poc;
-                pic_out->analysisData.sliceType = pic_out->sliceType;
-                pic_out->analysisData.numCUsInFrame = outFrame->m_analysisData.numCUsInFrame;
-                pic_out->analysisData.numPartitions = outFrame->m_analysisData.numPartitions;
-                pic_out->analysisData.interData = outFrame->m_analysisData.interData;
-                pic_out->analysisData.intraData = outFrame->m_analysisData.intraData;
-                writeAnalysisFile(&pic_out->analysisData);
-                freeAnalysis(&pic_out->analysisData);
-            }
-        }
-        if (slice->m_sliceType == P_SLICE)
-        {
-            if (slice->m_weightPredTable[0][0][0].bPresentFlag)
-                m_numLumaWPFrames++;
-            if (slice->m_weightPredTable[0][0][1].bPresentFlag ||
-                slice->m_weightPredTable[0][0][2].bPresentFlag)
-                m_numChromaWPFrames++;
-        }
-        else if (slice->m_sliceType == B_SLICE)
-        {
-            bool bLuma = false, bChroma = false;
-            for (int l = 0; l < 2; l++)
-            {
-                if (slice->m_weightPredTable[l][0][0].bPresentFlag)
-                    bLuma = true;
-                if (slice->m_weightPredTable[l][0][1].bPresentFlag ||
-                    slice->m_weightPredTable[l][0][2].bPresentFlag)
-                    bChroma = true;
+                x265_analysis_data* analysis = &frameEnc->m_analysisData;
+                analysis->poc = frameEnc->m_poc;
+                analysis->sliceType = frameEnc->m_lowres.sliceType;
+                uint32_t widthInCU       = (m_param->sourceWidth  + g_maxCUSize - 1) >> g_maxLog2CUSize;
+                uint32_t heightInCU      = (m_param->sourceHeight + g_maxCUSize - 1) >> g_maxLog2CUSize;
+
+                uint32_t numCUsInFrame   = widthInCU * heightInCU;
+                analysis->numCUsInFrame  = numCUsInFrame;
+                analysis->numPartitions  = NUM_CU_PARTITIONS;
+                allocAnalysis(analysis);
             }
 
-            if (bLuma)
-                m_numLumaWPBiFrames++;
-            if (bChroma)
-                m_numChromaWPBiFrames++;
+            /* determine references, setup RPS, etc */
+            m_dpb->prepareEncode(frameEnc);
+
+            if (m_param->rc.rateControlMode != X265_RC_CQP)
+                m_lookahead->getEstimatedPictureCost(frameEnc);

