[x265] [PATCH] Changed FrameEncoder::m_tld to a pointer and set it to one of Encoder's ThreadLocalData instances

Steve Borho steve at borho.org
Thu Sep 25 03:10:27 CEST 2014


On 09/24, dave wrote:
> On 09/24/2014 02:07 PM, Steve Borho wrote:
> >On 09/24, dtyx265 at gmail.com wrote:
> >># HG changeset patch
> >># User David T Yuen <dtyx265 at gmail.com>
> >># Date 1411589843 25200
> >># Node ID fc82666c3f8fe258e99ffbb8398ae04fd90a4bee
> >># Parent  b2b7072ddbf73085d457bd6a71bca946e505dea8
> >>Changed FrameEncoder::m_tld to a pointer and set it to one of Encoder's ThreadLocalData instances.
> >>
> >>This uses less memory since m_tld isn't used in --wpp and Encoder's ThreadLocalData instances are not used in --no-wpp
> >>Also there was a small performance increase on my system
> >>
> >>diff -r b2b7072ddbf7 -r fc82666c3f8f source/encoder/encoder.cpp
> >>--- a/source/encoder/encoder.cpp	Wed Sep 24 11:48:15 2014 +0530
> >>+++ b/source/encoder/encoder.cpp	Wed Sep 24 13:17:23 2014 -0700
> >>@@ -102,15 +102,20 @@
> >>      m_scalingList.setupQuantMatrices();
> >>      /* Allocate thread local data shared by all frame encoders */
> >>-    ThreadPool *pool = ThreadPool::getThreadPool();
> >>-    const int poolThreadCount = pool ? pool->getThreadCount() : 1;
> >>-    m_threadLocalData = new ThreadLocalData[poolThreadCount];
> >>-    for (int i = 0; i < poolThreadCount; i++)
> >>+    const int poolThreadCount = ThreadPool::getThreadPool()->getThreadCount();
> >>+    int numLocalData = m_param->frameNumThreads;
> >>+    if (m_param->bEnableWavefront)
> >>+        numLocalData = poolThreadCount;
> >The general concept is sound, I think, but there could be edge cases
> >where the number of frame threads is greater than the number of
> >allocated pool threads.  Perhaps:
> >
> >        const int numLocalData = X265_MAX(poolThreadCount, m_param->frameNumThreads);
> I tested this for all presets, --wpp and --no-wpp with no problems.
> 
> At first I tried something like your suggestion.
> 
> if (numLocalData < poolThreadCount) then numLocalData = poolThreadCount;
> 
> But then you may have idle ThreadLocalData laying around.

Probably not.  In Encoder::configure() when WPP is disabled we are
reducing the number of worker threads to 1 regardless of --threads.

    // Trim the thread pool if WPP is disabled
    if (!p->bEnableWavefront)
        p->poolNumThreads = 1;

So the problem would be impossible to reproduce via the command line
without changing that function.

As-is, I guess you would want: 
 numLocalData = m_param->bEnableWavefront ? pool->getThreadCount() : m_param->frameNumThreads;

which is essentially what it does now, so I think I'll queue the
original patch.

> Either way, I thought I could force an edge case by playing around with the
> number of pool threads and framethreads but I couldn't force a segfault due
> to a FrameEncoder not having a ThreadLocalData object.  Playing with the
> number of threads and framethreads did alter hevc output.
> 
> By the way, setting an excessive number of framethreads hurts performance as
> stated in the cli documentation.. so why allow framethreads > than
> cpucount()

We generally don't prevent options which might be slow; the user may
have some exotic hardware that works well with it.  We generally only
prevent broken configurations that would cause crashes or invalid
streams.

> >>+    m_threadLocalData = new ThreadLocalData[numLocalData];
> >>+    for (int i = 0; i < numLocalData; i++)
> >>      {
> >>          m_threadLocalData[i].cuCoder.initSearch(m_param, m_scalingList);
> >>          m_threadLocalData[i].cuCoder.create(g_maxCUDepth + 1, g_maxCUSize);
> >>      }
> >>+    for (int i = 0; i < m_param->frameNumThreads; i++)
> >>+        m_frameEncoder[i].m_tld = &m_threadLocalData[i];
> >>+
> >>      m_lookahead = new Lookahead(m_param, m_threadPool, this);
> >>      m_dpb = new DPB(m_param);
> >>      m_rateControl = new RateControl(m_param);
> >>diff -r b2b7072ddbf7 -r fc82666c3f8f source/encoder/frameencoder.cpp
> >>--- a/source/encoder/frameencoder.cpp	Wed Sep 24 11:48:15 2014 +0530
> >>+++ b/source/encoder/frameencoder.cpp	Wed Sep 24 13:17:23 2014 -0700
> >>@@ -105,9 +105,6 @@
> >>          m_pool = NULL;
> >>      }
> >>-    m_tld.cuCoder.initSearch(m_param, top->m_scalingList);
> >>-    m_tld.cuCoder.create(g_maxCUDepth + 1, g_maxCUSize);
> >>-
> >>      m_frameFilter.init(top, this, numRows);
> >>      // initialize HRD parameters of SPS
> >>@@ -470,7 +467,7 @@
> >>              }
> >>          }
> >>-        m_tld.cuCoder.loadCTUData(cu);
> >>+        m_tld->cuCoder.loadCTUData(cu);
> >hmm; I wonder if this line is even necessary; but that is perhaps a
> >separate issue from this patch.
> This is what I just started working on
> 1.  Create the table to replace Analysis::getDepthScanIdx()
> 2.  Clean up loadCTUData as much as possible
> 3.  Move loadCTUData to TComDataCU since it has nothing to do with Analysis
> other than calling getDepthScanIdx() and even that is just an independent
> function which just happens to be in Analysis.

Interesting, ok.

> >>          // final coding (bitstream generation) for this CU
> >>          m_entropyCoder.encodeCTU(cu);
> >>@@ -581,8 +578,8 @@
> >>      const int realRow = row >> 1;
> >>      const int typeNum = row & 1;
> >>-    ThreadLocalData& tld = threadId >= 0 ? m_top->m_threadLocalData[threadId] : m_tld;
> >>-
> >>+    ThreadLocalData& tld = threadId >= 0 ? m_top->m_threadLocalData[threadId] : *m_tld;
> >>+
> >>      if (!typeNum)
> >>          processRowEncoder(realRow, tld);
> >>      else
> >>diff -r b2b7072ddbf7 -r fc82666c3f8f source/encoder/frameencoder.h
> >>--- a/source/encoder/frameencoder.h	Wed Sep 24 11:48:15 2014 +0530
> >>+++ b/source/encoder/frameencoder.h	Wed Sep 24 13:17:23 2014 -0700
> >>@@ -159,7 +159,7 @@
> >>      uint32_t*                m_substreamSizes;
> >>      NoiseReduction*          m_nr;
> >>      NALList                  m_nalList;
> >>-    ThreadLocalData          m_tld; /* for --no-wpp */
> >>+    ThreadLocalData*         m_tld; /* for --no-wpp */
> >>      int                      m_filterRowDelay;
> >>      int                      m_filterRowDelayCus;
> >>_______________________________________________
> >>x265-devel mailing list
> >>x265-devel at videolan.org
> >>https://mailman.videolan.org/listinfo/x265-devel
> 
> _______________________________________________
> x265-devel mailing list
> x265-devel at videolan.org
> https://mailman.videolan.org/listinfo/x265-devel

-- 
Steve Borho


More information about the x265-devel mailing list