[x265] [PATCH 3 of 4] limitTU: use max TU depth of first subTU to limit recursion of others in split

Kavitha Sampath kavitha at multicorewareinc.com
Wed Oct 5 09:39:46 CEST 2016


Test machine: Intel Core i5-4440 (Haswell)

1) Video: ducks_take_off_420_720p50.y4m
--input E:\TestSequences\ducks_take_off_420_720p50.y4m --preset veryslow --hash 1 --bitrate 3500 --output op.hevc --ssim --tune ssim
encoded 500 frames in 228.44s (2.19 fps), 3542.00 kb/s, Avg QP:40.23, SSIM Mean Y: 0.8583162 ( 8.487 dB)

--input E:\TestSequences\ducks_take_off_420_720p50.y4m --preset veryslow --hash 1 --bitrate 3500 --output op.hevc --ssim --tune ssim --limit-TU 1
encoded 500 frames in 212.47s (2.35 fps), 3542.84 kb/s, Avg QP:40.23, SSIM Mean Y: 0.8582706 ( 8.485 dB)

--input E:\TestSequences\ducks_take_off_420_720p50.y4m --preset veryslow --hash 1 --bitrate 3500 --output op.hevc --ssim --tune ssim --limit-TU 2
encoded 500 frames in 203.92s (2.45 fps), 3542.46 kb/s, Avg QP:40.23, SSIM Mean Y: 0.8582912 ( 8.486 dB)


2) Video: Johnny_1280x720_60.y4m
--input E:\TestSequences\Johnny_1280x720_60.y4m --preset veryslow --hash 1 --bitrate 3500 --output op.hevc --ssim --tune ssim
encoded 600 frames in 191.64s (3.13 fps), 3412.73 kb/s, Avg QP:24.64, SSIM Mean Y: 0.9732328 (15.724 dB)

--input E:\TestSequences\Johnny_1280x720_60.y4m --preset veryslow --hash 1 --bitrate 3500 --output op.hevc --ssim --tune ssim --limit-TU 1
encoded 600 frames in 172.39s (3.48 fps), 3413.65 kb/s, Avg QP:24.62, SSIM Mean Y: 0.9732404 (15.725 dB)

--input E:\TestSequences\Johnny_1280x720_60.y4m --preset veryslow --hash 1 --bitrate 3500 --output op.hevc --ssim --tune ssim --limit-TU 2
encoded 600 frames in 162.40s (3.69 fps), 3413.30 kb/s, Avg QP:24.62, SSIM Mean Y: 0.9732278 (15.723 dB)


3) Video: Kimono1_1920x1080_24.y4m
--input E:\TestSequences\Kimono1_1920x1080_24.y4m --preset veryslow --hash 1 --bitrate 9000 --output op.hevc --ssim --tune ssim
encoded 240 frames in 430.29s (0.56 fps), 8477.26 kb/s, Avg QP:22.92, SSIM Mean Y: 0.9640507 (14.443 dB)

--input E:\TestSequences\Kimono1_1920x1080_24.y4m --preset veryslow --hash 1 --bitrate 9000 --output op.hevc --ssim --tune ssim --limit-TU 1
encoded 240 frames in 403.67s (0.59 fps), 8474.58 kb/s, Avg QP:22.91, SSIM Mean Y: 0.9640510 (14.443 dB)

--input E:\TestSequences\Kimono1_1920x1080_24.y4m --preset veryslow --hash 1 --bitrate 9000 --output op.hevc --ssim --tune ssim --limit-TU 2
encoded 240 frames in 393.87s (0.61 fps), 8472.13 kb/s, Avg QP:22.92, SSIM Mean Y: 0.9640245 (14.440 dB)



On Tue, Oct 4, 2016 at 6:58 PM, Ashok Kumar Mishra <ashok at multicorewareinc.com> wrote:

> Can you please post the performance improvement figures for some of the
> videos?
>
> On Tue, Oct 4, 2016 at 2:50 PM, <kavitha at multicorewareinc.com> wrote:
>
>> # HG changeset patch
>> # User Kavitha Sampath <kavitha at multicorewareinc.com>
>> # Date 1475238341 -19800
>> #      Fri Sep 30 17:55:41 2016 +0530
>> # Node ID 3ae30a43ac939fe875eaec7f22d134711b00c449
>> # Parent  c018bc0ffc156902b1a9a13ecd6996d30d7403df
>> limitTU: use max TU depth of first subTU to limit recursion of others in split
>>
>> diff -r c018bc0ffc15 -r 3ae30a43ac93 source/encoder/search.cpp
>> --- a/source/encoder/search.cpp Fri Sep 23 14:22:41 2016 +0530
>> +++ b/source/encoder/search.cpp Fri Sep 30 17:55:41 2016 +0530
>> @@ -67,6 +67,7 @@
>>      m_param = NULL;
>>      m_slice = NULL;
>>      m_frame = NULL;
>> +    m_maxTUDepth = 0;
>>  }
>>
>>  bool Search::initSearch(const x265_param& param, ScalingList& scalingList)
>> @@ -2617,6 +2618,8 @@
>>
>>      m_entropyCoder.load(m_rqt[depth].cur);
>>
>> +    if (m_param->limitTU == X265_TU_LIMIT_DFS)
>> +        m_maxTUDepth = 0;
>>      Cost costs;
>>      estimateResidualQT(interMode, cuGeom, 0, 0, *resiYuv, costs, tuDepthRange);
>>
>> @@ -2876,6 +2879,11 @@
>>
>>      bool bCheckSplit = log2TrSize > depthRange[0];
>>      bool bCheckFull = log2TrSize <= depthRange[1];
>> +    if (m_param->limitTU == X265_TU_LIMIT_DFS && m_maxTUDepth)
>> +    {
>> +        uint32_t log2MaxTrSize = cuGeom.log2CUSize - m_maxTUDepth;
>> +        bCheckSplit = log2TrSize > log2MaxTrSize;
>> +    }
>>      bool bSplitPresentFlag = bCheckSplit && bCheckFull;
>>
>>      if (cu.m_partSize[0] != SIZE_2Nx2N && !tuDepth && bCheckSplit)
>> @@ -3372,6 +3380,11 @@
>>          uint32_t ycbf = 0, ucbf = 0, vcbf = 0;
>>          for (uint32_t qIdx = 0, qPartIdx = absPartIdx; qIdx < 4; ++qIdx, qPartIdx += qNumParts)
>>          {
>> +            if (m_param->limitTU == X265_TU_LIMIT_DFS && tuDepth == 0 && qIdx == 1)
>> +            {
>> +                for (uint32_t i = 0; i < cuGeom.numPartitions / 4; i++)
>> +                    m_maxTUDepth = X265_MAX(m_maxTUDepth, cu.m_tuDepth[i]);
>> +            }
>>              estimateResidualQT(mode, cuGeom, qPartIdx, tuDepth + 1, resiYuv, splitCost, depthRange);
>>              ycbf |= cu.getCbf(qPartIdx, TEXT_LUMA,     tuDepth + 1);
>>              if (m_csp != X265_CSP_I400 && m_frame->m_fencPic->m_picCsp != X265_CSP_I400)
>> diff -r c018bc0ffc15 -r 3ae30a43ac93 source/encoder/search.h
>> --- a/source/encoder/search.h   Fri Sep 23 14:22:41 2016 +0530
>> +++ b/source/encoder/search.h   Fri Sep 30 17:55:41 2016 +0530
>> @@ -274,6 +274,7 @@
>>      bool            m_bFrameParallel;
>>      uint32_t        m_numLayers;
>>      uint32_t        m_refLagPixels;
>> +    uint32_t        m_maxTUDepth;
>>
>>      int16_t         m_sliceMaxY;
>>      int16_t         m_sliceMinY;
>> _______________________________________________
>> x265-devel mailing list
>> x265-devel at videolan.org
>> https://mailman.videolan.org/listinfo/x265-devel
>>
>
>
>


-- 
Regards,
Kavitha