[x265] [PATCH] Changed FrameEncoder::m_tld to a pointer and set it to one of Encoder's ThreadLocalData instances
Steve Borho
steve at borho.org
Wed Sep 24 23:07:02 CEST 2014
On 09/24, dtyx265 at gmail.com wrote:
> # HG changeset patch
> # User David T Yuen <dtyx265 at gmail.com>
> # Date 1411589843 25200
> # Node ID fc82666c3f8fe258e99ffbb8398ae04fd90a4bee
> # Parent b2b7072ddbf73085d457bd6a71bca946e505dea8
> Changed FrameEncoder::m_tld to a pointer and set it to one of Encoder's ThreadLocalData instances.
>
> This uses less memory since m_tld isn't used in --wpp and Encoder's ThreadLocalData instances are not used in --no-wpp
> Also there was a small performance increase on my system
>
> diff -r b2b7072ddbf7 -r fc82666c3f8f source/encoder/encoder.cpp
> --- a/source/encoder/encoder.cpp Wed Sep 24 11:48:15 2014 +0530
> +++ b/source/encoder/encoder.cpp Wed Sep 24 13:17:23 2014 -0700
> @@ -102,15 +102,20 @@
> m_scalingList.setupQuantMatrices();
>
> /* Allocate thread local data shared by all frame encoders */
> - ThreadPool *pool = ThreadPool::getThreadPool();
> - const int poolThreadCount = pool ? pool->getThreadCount() : 1;
> - m_threadLocalData = new ThreadLocalData[poolThreadCount];
> - for (int i = 0; i < poolThreadCount; i++)
> + const int poolThreadCount = ThreadPool::getThreadPool()->getThreadCount();
> + int numLocalData = m_param->frameNumThreads;
> + if (m_param->bEnableWavefront)
> + numLocalData = poolThreadCount;
The general concept is sound, I think, but there could be edge cases
where the number of frame threads is greater than the number of
allocated pool threads. Perhaps:
const int numLocalData = X265_MAX(poolThreadCount, m_param->frameNumThreads);
> + m_threadLocalData = new ThreadLocalData[numLocalData];
> + for (int i = 0; i < numLocalData; i++)
> {
> m_threadLocalData[i].cuCoder.initSearch(m_param, m_scalingList);
> m_threadLocalData[i].cuCoder.create(g_maxCUDepth + 1, g_maxCUSize);
> }
>
> + for (int i = 0; i < m_param->frameNumThreads; i++)
> + m_frameEncoder[i].m_tld = &m_threadLocalData[i];
> +
> m_lookahead = new Lookahead(m_param, m_threadPool, this);
> m_dpb = new DPB(m_param);
> m_rateControl = new RateControl(m_param);
> diff -r b2b7072ddbf7 -r fc82666c3f8f source/encoder/frameencoder.cpp
> --- a/source/encoder/frameencoder.cpp Wed Sep 24 11:48:15 2014 +0530
> +++ b/source/encoder/frameencoder.cpp Wed Sep 24 13:17:23 2014 -0700
> @@ -105,9 +105,6 @@
> m_pool = NULL;
> }
>
> - m_tld.cuCoder.initSearch(m_param, top->m_scalingList);
> - m_tld.cuCoder.create(g_maxCUDepth + 1, g_maxCUSize);
> -
> m_frameFilter.init(top, this, numRows);
>
> // initialize HRD parameters of SPS
> @@ -470,7 +467,7 @@
> }
> }
>
> - m_tld.cuCoder.loadCTUData(cu);
> + m_tld->cuCoder.loadCTUData(cu);
hmm; I wonder if this line is even necessary; but that is perhaps a
separate issue from this patch.
> // final coding (bitstream generation) for this CU
> m_entropyCoder.encodeCTU(cu);
>
> @@ -581,8 +578,8 @@
> const int realRow = row >> 1;
> const int typeNum = row & 1;
>
> - ThreadLocalData& tld = threadId >= 0 ? m_top->m_threadLocalData[threadId] : m_tld;
> -
> + ThreadLocalData& tld = threadId >= 0 ? m_top->m_threadLocalData[threadId] : *m_tld;
> +
> if (!typeNum)
> processRowEncoder(realRow, tld);
> else
> diff -r b2b7072ddbf7 -r fc82666c3f8f source/encoder/frameencoder.h
> --- a/source/encoder/frameencoder.h Wed Sep 24 11:48:15 2014 +0530
> +++ b/source/encoder/frameencoder.h Wed Sep 24 13:17:23 2014 -0700
> @@ -159,7 +159,7 @@
> uint32_t* m_substreamSizes;
> NoiseReduction* m_nr;
> NALList m_nalList;
> - ThreadLocalData m_tld; /* for --no-wpp */
> + ThreadLocalData* m_tld; /* for --no-wpp */
>
> int m_filterRowDelay;
> int m_filterRowDelayCus;
> _______________________________________________
> x265-devel mailing list
> x265-devel at videolan.org
> https://mailman.videolan.org/listinfo/x265-devel
--
Steve Borho
More information about the x265-devel
mailing list