[x265] [PATCH 2 of 4] limit TU : use cbf and quantization coefficients to limit recursion

Wed Oct 12 16:35:16 CEST 2016

On Wed, Oct 12, 2016 at 9:51 AM, Deepthi Nandakumar <
deepthipnandakumar at gmail.com> wrote:

> Some more points.
>
> 1. Is there anything to prevent limit-tu 1 and 2 being used together, say
> as limit-tu 3?
>

limit-tu 1 and 2 limit recursion in different ways. limit-tu 1 performs a
breadth-first traversal to compute the total split cost of the TUs at a given
depth, whereas limit-tu 2 has to traverse the full TU depth of the first subTU
to determine the maximum depth that the other subTUs may use. So it is not
possible to have a level 3 that combines 1 and 2.
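
As a rough illustration of the level 2 idea only, here is a toy sketch in
which the depth reached by the first subTU caps the depth the other subTUs
may explore. ToyTU, ToyResult and search are hypothetical names, and the
costs are made-up numbers; this is not the code in search.cpp and it ignores
everything the real encoder does beyond the depth cap.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy transform unit: cost of coding it unsplit, plus optional sub-TUs.
// Hypothetical type for illustration; not an x265 structure.
struct ToyTU
{
    uint64_t cost;
    std::vector<ToyTU> sub; // empty, or the four sub-TUs
};

struct ToyResult
{
    uint64_t cost;      // best cost found for this TU
    uint32_t depthUsed; // deepest depth used by that choice
};

// Depth-first search. The first sub-TU is explored up to depthLimit; the
// depth it ends up using becomes the limit for its siblings.
ToyResult search(const ToyTU& tu, uint32_t depth, uint32_t depthLimit)
{
    ToyResult best{tu.cost, depth};
    if (tu.sub.empty() || depth >= depthLimit)
        return best;

    ToyResult first = search(tu.sub[0], depth + 1, depthLimit);
    uint32_t cap = first.depthUsed; // limit applied to the other sub-TUs
    uint64_t splitCost = first.cost;
    uint32_t deepest = first.depthUsed;

    for (std::size_t i = 1; i < tu.sub.size(); i++)
    {
        ToyResult r = search(tu.sub[i], depth + 1, cap);
        splitCost += r.cost;
        deepest = std::max(deepest, r.depthUsed);
    }

    if (splitCost < best.cost)
        best = ToyResult{splitCost, deepest};
    return best;
}
```

In this toy, the cap is only known after the first subTU has been searched
depth-first, which is why it does not mix naturally with a per-depth,
breadth-first cost comparison.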

> 2. It's recommended that limit-tu is conservative when performing early
> skip for TU recursion. There have been suspicions that large TUs are
> responsible for smoothening artifacts, and if limit-tu is too aggressive,
> it could definitely worsen this.
>

We could avoid choosing large TUs, but at the expense of losing the
performance benefit that limit-tu gives us now. However, we can try biasing
it against choosing a 32x32 TU and look at the performance numbers.

>
> On Mon, Oct 10, 2016 at 11:46 AM, Deepthi Nandakumar <
> deepthipnandakumar at gmail.com> wrote:
>
>> Is this condition ever satisfied? Minimum value of a coeff, to be counted
>> in numSig is 1 (since it's uint16).
>>
>
True. There is a bug in the energy calculation that allowed this condition to
be satisfied. We will send a patch fixing the issue soon.
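
To illustrate the point about the condition: if every significant
coefficient has a magnitude of at least 1, the sum of absolute values over
the whole block can never be smaller than the count of significant
coefficients. A minimal standalone sketch follows. countSignificant, sumAbs
and checkInvariant are hypothetical helpers, not x265 functions, and the
partial-sum case is only one assumed way the comparison could come out true,
not a statement of what the actual bug is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Count the non-zero (significant) coefficients in a block.
uint32_t countSignificant(const std::vector<int16_t>& coeff)
{
    uint32_t n = 0;
    for (int16_t c : coeff)
        n += (c != 0);
    return n;
}

// Sum of absolute values over the first `count` coefficients
// (clamped to the block size).
uint32_t sumAbs(const std::vector<int16_t>& coeff, std::size_t count)
{
    uint32_t e = 0;
    for (std::size_t i = 0; i < count && i < coeff.size(); i++)
        e += static_cast<uint32_t>(std::abs(coeff[i]));
    return e;
}

// Each significant coefficient contributes at least 1 to the full sum,
// so the full sum is always >= the significant count.
void checkInvariant(const std::vector<int16_t>& coeff)
{
    assert(sumAbs(coeff, coeff.size()) >= countSignificant(coeff));
}
```

With a sum taken over the full block, "energy < numSig" cannot hold; it can
only hold if the sum covers fewer coefficients than the count does, as when
sumAbs is called with a count smaller than the block size.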

>> On Sat, Oct 8, 2016 at 5:07 PM, Bhavna Hariharan <
>> bhavna at multicorewareinc.com> wrote:
>>
>>> Hi Deepthi,
>>>
>>> On Fri, Oct 7, 2016 at 1:17 PM, Deepthi Nandakumar <
>>> deepthipnandakumar at gmail.com> wrote:
>>>
>>>> Kavitha/Bhavna, excellent job! The test metrics look pretty good.
>>>>
>>>>
>>>>
>>>> On Tue, Oct 4, 2016 at 2:50 PM, <kavitha at multicorewareinc.com> wrote:
>>>>
>>>>> # HG changeset patch
>>>>> # User Bhavna Hariharan <bhavna at multicorewareinc.com>
>>>>> # Date 1474620761 -19800
>>>>> #      Fri Sep 23 14:22:41 2016 +0530
>>>>> # Node ID c018bc0ffc156902b1a9a13ecd6996d30d7403df
>>>>> # Parent  c10ef341f4e65883243f78040f52ed06ace99535
>>>>> limit TU : use cbf and quantization coefficients to limit recursion
>>>>>
>>>>> diff -r c10ef341f4e6 -r c018bc0ffc15 source/encoder/search.cpp
>>>>> --- a/source/encoder/search.cpp Tue Oct 04 13:27:48 2016 +0530
>>>>> +++ b/source/encoder/search.cpp Fri Sep 23 14:22:41 2016 +0530
>>>>> @@ -3194,6 +3194,8 @@
>>>>>                  singlePsyEnergy[TEXT_LUMA][0] = nonZeroPsyEnergyY;
>>>>>                  cbfFlag[TEXT_LUMA][0] = !!numSigTSkipY;
>>>>>                  bestTransformMode[TEXT_LUMA][0] = 1;
>>>>> +                if (m_param->limitTU)
>>>>> +                    numSig[TEXT_LUMA][0] = numSigTSkipY;
>>>>>                  uint32_t numCoeffY = 1 << (log2TrSize << 1);
>>>>>                  memcpy(coeffCurY, m_tsCoeff, sizeof(coeff_t) *
>>>>> numCoeffY);
>>>>>                  primitives.cu[partSize].copy_ss(curResiY,
>>>>> strideResiY, m_tsResidual, trSize);
>>>>> @@ -3331,6 +3333,21 @@
>>>>>              fullCost.rdcost = m_rdCost.calcPsyRdCost(fullCost.distortion,
>>>>> fullCost.bits, fullCost.energy);
>>>>>          else
>>>>>              fullCost.rdcost = m_rdCost.calcRdCost(fullCost.distortion,
>>>>> fullCost.bits);
>>>>> +
>>>>> +        if (m_param->limitTU && bCheckSplit)
>>>>> +        {
>>>>> +            // Stop recursion if the TU's energy level is minimal
>>>>> +            if (cbfFlag[TEXT_LUMA][0] == 0)
>>>>> +                bCheckSplit = false;
>>>>>
>>>>
>>>> Agreed.
>>>>
>>>>> +            else if (numSig[TEXT_LUMA][0] < (cuGeom.numPartitions /
>>>>> 16))
>>>>> +            {
>>>>> +                uint32_t energy = 0;
>>>>> +                for (uint32_t i = 0; i < cuGeom.numPartitions; i++)
>>>>> +                    energy += abs(coeffCurY[i]);
>>>>> +                if (energy < numSig[TEXT_LUMA][0])
>>>>>
>>>>
>>>> One question, why are we comparing actual coefficient values to number
>>>> of significant coefficients?
>>>>
>>>
>>> We want to stop recursion when the energy of the TU is low. If the value
>>> of each of the coefficients is minimal (close to 1), the energy will be
>>> less than the number of coefficients.
>>>
>>>
>>>>
>>>>> +                    bCheckSplit = false;
>>>>> +            }
>>>>> +        }
>>>>>      }
>>>>>
>>>>>      // code sub-blocks
>>>>> _______________________________________________
>>>>> x265-devel mailing list
>>>>> x265-devel at videolan.org
>>>>> https://mailman.videolan.org/listinfo/x265-devel
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Deepthi
>>>>
>>>
>>>
>>>
>>> Regards,
>>>
>>> Bhavna Hariharan
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Deepthi
>>
>
>
>
> --
> Deepthi
>
>
>

-- 
Regards,
Kavitha