[x265-commits] [x265] cudata: perform MV scaling directly within POC distance f...

Tue Oct 28 22:47:46 CET 2014

details:   http://hg.videolan.org/x265/rev/59df6b4fe1d7
branches:  
changeset: 8714:59df6b4fe1d7
user:      Steve Borho <steve at borho.org>
date:      Mon Oct 27 15:25:00 2014 -0500
description:
cudata: perform MV scaling directly within POC distance function

this avoids some code duplication and is also a bit more efficient
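
For readers unfamiliar with the scaling step, here is a minimal sketch of
HEVC-style temporal MV scaling by POC distance. It is not the x265 code:
the names scaleMvComponent and clip3 are hypothetical, and it scales one MV
component at a time. It assumes right shifts of negative ints are
arithmetic, which holds on mainstream compilers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical helper: clamp v into [lo, hi].
static int clip3(int lo, int hi, int v) { return std::min(hi, std::max(lo, v)); }

// Scale one MV component by the ratio of two POC distances.
//   diffPocB: POC distance from the current picture to its reference
//   diffPocD: POC distance from the collocated picture to its reference
int scaleMvComponent(int mv, int diffPocB, int diffPocD)
{
    assert(diffPocD != 0);      // a zero distance would divide by zero below
    if (diffPocB == diffPocD)
        return mv;              // equal distances: the MV is used unscaled

    int td = clip3(-128, 127, diffPocD);
    int tb = clip3(-128, 127, diffPocB);
    int tx = (16384 + (std::abs(td) >> 1)) / td;
    int scale = clip3(-4096, 4095, (tb * tx + 32) >> 6);

    // Round the magnitude, then restore the sign of the product.
    int prod = scale * mv;
    int mag = (std::abs(prod) + 127) >> 8;
    return clip3(-32768, 32767, prod < 0 ? -mag : mag);
}
```

Doing this inside the function that already computes both POC distances
means the caller does not need a separate distance lookup before scaling.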
Subject: [x265] cudata: coding style nits

details:   http://hg.videolan.org/x265/rev/4afcdb09550e
branches:  
changeset: 8715:4afcdb09550e
user:      Steve Borho <steve at borho.org>
date:      Mon Oct 27 15:33:48 2014 -0500
description:
cudata: coding style nits

* reorder arguments so outputs are listed first
* pass const by reference
* return single integer output rather than pass by reference
* A == 0 ? B : C => A ? C : B;
* standardized variable names (puIdx, absPartIdx, etc)
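
The nits above can be illustrated with a small sketch. ToyCU and its
members are hypothetical stand-ins, not the real CUData class. The "old"
functions follow the previous conventions and the "new" ones follow the
conventions listed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a CU; only the fields this sketch needs.
struct ToyCU
{
    uint32_t numPartitions;
    uint8_t  interDir;

    // Old style: output passed by reference, ternary written as "A == 0 ? B : C".
    void offsetOld(uint32_t partIdx, uint32_t& out) const
    {
        out = (partIdx == 0) ? 0 : numPartitions >> 1;
    }

    // New style: single integer returned, ternary written as "A ? C : B".
    uint32_t offsetNew(uint32_t puIdx) const
    {
        return puIdx ? numPartitions >> 1 : 0;
    }

    // Old style: candidate passed as a const pointer.
    bool sameDirOld(const ToyCU* cand) const { return interDir == cand->interDir; }

    // New style: candidate passed by const reference, so it cannot be null.
    bool sameDirNew(const ToyCU& cand) const { return interDir == cand.interDir; }
};

// Check that both spellings agree for a given PU index.
inline void checkEquivalent(const ToyCU& cu, uint32_t puIdx)
{
    uint32_t viaRef;
    cu.offsetOld(puIdx, viaRef);
    assert(viaRef == cu.offsetNew(puIdx));
}
```

Callers holding a pointer dereference it at the call site, for example
a.sameDirNew(*b), which matches the hasEqualMotion() call sites in the
diff below.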
Subject: [x265] search: nits

details:   http://hg.videolan.org/x265/rev/4ad4ba77a339
branches:  
changeset: 8716:4ad4ba77a339
user:      Steve Borho <steve at borho.org>
date:      Mon Oct 27 23:13:36 2014 -0500
description:
search: nits
Subject: [x265] search: use Cost instances to accumulate costs in xEstimateResidualQT

details:   http://hg.videolan.org/x265/rev/efe17882bca5
branches:  
changeset: 8717:efe17882bca5
user:      Steve Borho <steve at borho.org>
date:      Mon Oct 27 23:44:18 2014 -0500
description:
search: use Cost instances to accumulate costs in xEstimateResidualQT
Subject: [x265] search: remove x prefixes from inter residual coding functions

details:   http://hg.videolan.org/x265/rev/da3191896381
branches:  
changeset: 8718:da3191896381
user:      Steve Borho <steve at borho.org>
date:      Mon Oct 27 23:45:59 2014 -0500
description:
search: remove x prefixes from inter residual coding functions
Subject: [x265] search: sync up argument names between source and header

details:   http://hg.videolan.org/x265/rev/fa79ec52c34d
branches:  
changeset: 8719:fa79ec52c34d
user:      Steve Borho <steve at borho.org>
date:      Mon Oct 27 23:46:57 2014 -0500
description:
search: sync up argument names between source and header
Subject: [x265] [OUTPUT CHANGED for 422] made loops for chroma components in xEstimateResidualQT()

details:   http://hg.videolan.org/x265/rev/554dd4aad4a0
branches:  
changeset: 8720:554dd4aad4a0
user:      Ashok Kumar Mishra <ashok at multicorewareinc.com>
date:      Tue Oct 28 13:21:41 2014 +0530
description:
[OUTPUT CHANGED for 422] made loops for chroma components in xEstimateResidualQT()

The output change for 4:2:2 is valid. Previously, the number of bits (cbf and coefficients)
was calculated per block and per chroma component. Now it is calculated per chroma component.
Subject: [x265] doc: remove uncrustify helper scripts

details:   http://hg.videolan.org/x265/rev/f91c01f6ca83
branches:  
changeset: 8721:f91c01f6ca83
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 09:47:08 2014 -0500
description:
doc: remove uncrustify helper scripts

I don't expect a lot of whole-new file development or wholesale style
enforcement. Leave the config script in place for new developers.
Subject: [x265] search: remove redundant logic

details:   http://hg.videolan.org/x265/rev/689e105ae41f
branches:  
changeset: 8722:689e105ae41f
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 11:31:26 2014 -0500
description:
search: remove redundant logic
Subject: [x265] cudata: split getQuadtreeTULog2MinSizeInCU() into intra/inter functions

details:   http://hg.videolan.org/x265/rev/98573a12738d
branches:  
changeset: 8723:98573a12738d
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 11:44:21 2014 -0500
description:
cudata: split getQuadtreeTULog2MinSizeInCU() into intra/inter functions

The caller usually knows what the CU prediction mode is
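
A standalone sketch of the split, mirroring the arithmetic in the cudata.cpp
diff below. ToySPS, TURange, clampRange, intraTURange and interTURange are
hypothetical names; only the SPS field names are taken from the diff. The
assert on the subtraction is an addition of this sketch, since the original
relies on unsigned arithmetic without checking.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical subset of the SPS; field names follow the diff.
struct ToySPS
{
    uint32_t quadtreeTULog2MinSize;
    uint32_t quadtreeTULog2MaxSize;
    uint32_t quadtreeTUMaxDepthIntra;
    uint32_t quadtreeTUMaxDepthInter;
};

struct TURange { uint32_t log2Min, log2Max; };

// Shared clamp: max(minSize, min(log2CUSize - (maxDepth - 1 + splitFlag), maxSize)).
static TURange clampRange(const ToySPS& sps, uint32_t log2CUSize,
                          uint32_t maxDepth, uint32_t splitFlag)
{
    uint32_t drop = maxDepth - 1 + splitFlag;
    assert(maxDepth >= 1 && log2CUSize >= drop); // sketch-only guard against wraparound

    TURange r;
    r.log2Max = sps.quadtreeTULog2MaxSize;
    r.log2Min = std::max(sps.quadtreeTULog2MinSize,
                         std::min(log2CUSize - drop, r.log2Max));
    return r;
}

// Intra: an NxN partition forces one extra split.
TURange intraTURange(const ToySPS& sps, uint32_t log2CUSize, bool isNxN)
{
    return clampRange(sps, log2CUSize, sps.quadtreeTUMaxDepthIntra, isNxN ? 1 : 0);
}

// Inter: an extra split is implied only when max depth is 1 and the CU is not 2Nx2N.
TURange interTURange(const ToySPS& sps, uint32_t log2CUSize, bool is2Nx2N)
{
    uint32_t maxDepth = sps.quadtreeTUMaxDepthInter;
    uint32_t splitFlag = (maxDepth == 1 && !is2Nx2N) ? 1 : 0;
    return clampRange(sps, log2CUSize, maxDepth, splitFlag);
}
```

Because each function handles a single prediction mode, neither one has to
test the mode again, which is the point of letting the caller choose.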
Subject: [x265] search: remove redundant work from residualTransformQuantInter()

details:   http://hg.videolan.org/x265/rev/252f886f4871
branches:  
changeset: 8724:252f886f4871
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 12:21:14 2014 -0500
description:
search: remove redundant work from residualTransformQuantInter()
Subject: [x265] search: simplify checks for 2x2 chroma blocks

details:   http://hg.videolan.org/x265/rev/90e1b515a364
branches:  
changeset: 8725:90e1b515a364
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 12:36:33 2014 -0500
description:
search: simplify checks for 2x2 chroma blocks

"log2TrSize == 2 && m_csp != X265_CSP_I444" essentially means that the chroma
transform would be 2x2, aka log2TrSizeC == 1.

In offsetSubTUCBFs(), the chroma tu size is not calculated but implied. We
should be able to skip the X265_CSP_I444 check since the function should only
be called by 4:2:2 encodes that code two half-sized chroma blocks per luma block
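
The equivalence claimed above can be checked with a short sketch. ToyCsp and
the function names are hypothetical, and the sketch assumes the chroma
transform size is derived as log2TrSize minus a horizontal chroma shift
(1 for 4:2:0 and 4:2:2, 0 for 4:4:4) with log2TrSize >= 2.

```cpp
#include <cassert>

// Hypothetical enum standing in for the X265_CSP_* constants.
enum ToyCsp { CSP_I420, CSP_I422, CSP_I444 };

// Horizontal chroma shift: chroma is half the luma width except in 4:4:4.
inline int hChromaShift(ToyCsp csp) { return csp == CSP_I444 ? 0 : 1; }

// Old form of the check: luma TU is 4x4 and chroma is subsampled.
inline bool chroma2x2Old(int log2TrSize, ToyCsp csp)
{
    return log2TrSize == 2 && csp != CSP_I444;
}

// New form of the check: compare the derived chroma size directly.
inline bool chroma2x2New(int log2TrSize, ToyCsp csp)
{
    assert(log2TrSize >= 2); // smallest luma transform is 4x4
    int log2TrSizeC = log2TrSize - hChromaShift(csp);
    return log2TrSizeC == 1;
}
```

For 4:4:4 the chroma size equals the luma size, so it can never drop to 1,
and both forms return false.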
Subject: [x265] entropy: make a fast const method for getting MPM mode signal cost

details:   http://hg.videolan.org/x265/rev/7400828ccd0e
branches:  
changeset: 8726:7400828ccd0e
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 12:54:17 2014 -0500
description:
entropy: make a fast const method for getting MPM mode signal cost
Subject: [x265] search: use fast-path to get mpm mode signal cost

details:   http://hg.videolan.org/x265/rev/5b1d67874dd3
branches:  
changeset: 8727:5b1d67874dd3
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 13:37:00 2014 -0500
description:
search: use fast-path to get mpm mode signal cost

inline single caller of getIntraModeBits
Subject: [x265] search: move getIntraDirLumaPredictor() into getIntraRemModeBits()

details:   http://hg.videolan.org/x265/rev/9cc367aa2b40
branches:  
changeset: 8728:9cc367aa2b40
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 13:39:21 2014 -0500
description:
search: move getIntraDirLumaPredictor() into getIntraRemModeBits()
Subject: [x265] search: add a fast method for estimating non-MPM intra mode signal cost

details:   http://hg.videolan.org/x265/rev/9cdc7c61d3fb
branches:  
changeset: 8729:9cdc7c61d3fb
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 13:48:57 2014 -0500
description:
search: add a fast method for estimating non-MPM intra mode signal cost
Subject: [x265] search: make getIntraRemModeBits() const

details:   http://hg.videolan.org/x265/rev/c561b0e99684
branches:  
changeset: 8730:c561b0e99684
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 14:52:11 2014 -0500
description:
search: make getIntraRemModeBits() const
Subject: [x265] entropy: simplify loadIntraDirModeLuma

details:   http://hg.videolan.org/x265/rev/42566b53b96d
branches:  
changeset: 8731:42566b53b96d
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 14:53:24 2014 -0500
description:
entropy: simplify loadIntraDirModeLuma
Subject: [x265] search: trModeC -> tuDepthC

details:   http://hg.videolan.org/x265/rev/7cfc1edb083f
branches:  
changeset: 8732:7cfc1edb083f
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 16:31:31 2014 -0500
description:
search: trModeC -> tuDepthC
Subject: [x265] Merge with default (prep for 1.4 release)

details:   http://hg.videolan.org/x265/rev/24b4177ea1ec
branches:  stable
changeset: 8733:24b4177ea1ec
user:      Steve Borho <steve at borho.org>
date:      Tue Oct 28 16:46:56 2014 -0500
description:
Merge with default (prep for 1.4 release)

diffstat:

 doc/uncrustify/apply-to-all-source.py |   24 -
 doc/uncrustify/drag-uncrustify.bat    |   10 -
 source/common/cudata.cpp              |  164 +++---
 source/common/cudata.h                |   28 +-
 source/encoder/analysis.cpp           |   29 +-
 source/encoder/entropy.cpp            |   27 +-
 source/encoder/entropy.h              |    4 +-
 source/encoder/search.cpp             |  833 ++++++++++++++-------------------
 source/encoder/search.h               |    8 +-
 9 files changed, 482 insertions(+), 645 deletions(-)

diffs (truncated from 1952 to 300 lines):

diff -r 3ccb20b6c022 -r 24b4177ea1ec doc/uncrustify/apply-to-all-source.py

--- a/doc/uncrustify/apply-to-all-source.py	Mon Oct 27 21:59:30 2014 -0500
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,24 +0,0 @@
-# Python script which builds a package with the current repository state
-
-import os
-import subprocess
-import shutil
-
-EXTENSIONS = ['.h', '.cpp', '.inc', '.c']
-EXCLUDES = ['VectorClass', 'compat']
-
-candidates = []
-for (dirpath, dirnames, filenames) in os.walk('../../source'):
-    for exc in EXCLUDES:
-        if exc in dirpath:
-            break
-    else:
-        for file in filenames:
-            base, ext = os.path.splitext(file)
-            if ext in EXTENSIONS:
-                candidates.append(os.path.join(dirpath, file))
-
-for file in candidates:
-    cmdline = ['uncrustify', '-f', file, '-c', 'codingstyle.cfg', '-o', 'tempfile']
-    subprocess.Popen(cmdline, stdout=None, stderr=subprocess.PIPE).communicate()
-    shutil.move('tempfile', file)
diff -r 3ccb20b6c022 -r 24b4177ea1ec doc/uncrustify/drag-uncrustify.bat
--- a/doc/uncrustify/drag-uncrustify.bat	Mon Oct 27 21:59:30 2014 -0500
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,10 +0,0 @@
-@echo on
-::
-::  Drag a CPP or H file onto this batch file to apply the
-::  project's coding style to the file.  This will likely overwrite the
-::  original file, so use with care
-
-uncrustify.exe -f "%1" -c %~dp0\codingstyle.cfg -o indentoutput.tmp
-move /Y indentoutput.tmp "%1"
-
-pause
diff -r 3ccb20b6c022 -r 24b4177ea1ec source/common/cudata.cpp
--- a/source/common/cudata.cpp	Mon Oct 27 21:59:30 2014 -0500
+++ b/source/common/cudata.cpp	Tue Oct 28 16:46:56 2014 -0500
@@ -909,18 +909,27 @@ uint32_t CUData::getCtxSplitFlag(uint32_
     return ctx;
 }
 
-void CUData::getQuadtreeTULog2MinSizeInCU(uint32_t tuDepthRange[2], uint32_t absPartIdx) const
+void CUData::getIntraTUQtDepthRange(uint32_t tuDepthRange[2], uint32_t absPartIdx) const
 {
     uint32_t log2CUSize = m_log2CUSize[absPartIdx];
-    PartSize partSize   = (PartSize)m_partSize[absPartIdx];
-    uint32_t quadtreeTUMaxDepth = m_predMode[absPartIdx] == MODE_INTRA ? m_slice->m_sps->quadtreeTUMaxDepthIntra : m_slice->m_sps->quadtreeTUMaxDepthInter;
-    uint32_t intraSplitFlag = (m_predMode[absPartIdx] == MODE_INTRA && partSize == SIZE_NxN) ? 1 : 0;
-    uint32_t interSplitFlag = ((quadtreeTUMaxDepth == 1) && (m_predMode[absPartIdx] == MODE_INTER) && (partSize != SIZE_2Nx2N));
+    uint32_t splitFlag = m_partSize[absPartIdx] == SIZE_NxN;
 
     tuDepthRange[0] = m_slice->m_sps->quadtreeTULog2MinSize;
     tuDepthRange[1] = m_slice->m_sps->quadtreeTULog2MaxSize;
 
-    tuDepthRange[0] = X265_MAX(tuDepthRange[0], X265_MIN(log2CUSize - (quadtreeTUMaxDepth - 1 + interSplitFlag + intraSplitFlag), tuDepthRange[1]));
+    tuDepthRange[0] = X265_MAX(tuDepthRange[0], X265_MIN(log2CUSize - (m_slice->m_sps->quadtreeTUMaxDepthIntra - 1 + splitFlag), tuDepthRange[1]));
+}
+
+void CUData::getInterTUQtDepthRange(uint32_t tuDepthRange[2], uint32_t absPartIdx) const
+{
+    uint32_t log2CUSize = m_log2CUSize[absPartIdx];
+    uint32_t quadtreeTUMaxDepth = m_slice->m_sps->quadtreeTUMaxDepthInter;
+    uint32_t splitFlag = quadtreeTUMaxDepth == 1 && m_partSize[absPartIdx] != SIZE_2Nx2N;
+
+    tuDepthRange[0] = m_slice->m_sps->quadtreeTULog2MinSize;
+    tuDepthRange[1] = m_slice->m_sps->quadtreeTULog2MaxSize;
+
+    tuDepthRange[0] = X265_MAX(tuDepthRange[0], X265_MIN(log2CUSize - (quadtreeTUMaxDepth - 1 + splitFlag), tuDepthRange[1]));
 }
 
 uint32_t CUData::getCtxSkipFlag(uint32_t absPartIdx) const
@@ -1288,8 +1297,9 @@ void CUData::deriveLeftRightTopIdx(uint3
     }
 }
 
-void CUData::deriveLeftBottomIdx(uint32_t partIdx, uint32_t& outPartIdxLB) const
+uint32_t CUData::deriveLeftBottomIdx(uint32_t puIdx) const
 {
+    uint32_t outPartIdxLB;
     outPartIdxLB = g_rasterToZscan[g_zscanToRaster[m_absIdxInCTU] + ((1 << (m_log2CUSize[0] - LOG2_UNIT_SIZE - 1)) - 1) * s_numPartInCUSize];
 
     switch (m_partSize[0])
@@ -1298,35 +1308,37 @@ void CUData::deriveLeftBottomIdx(uint32_
         outPartIdxLB += m_numPartitions >> 1;
         break;
     case SIZE_2NxN:
-        outPartIdxLB += (partIdx == 0) ? 0 : m_numPartitions >> 1;
+        outPartIdxLB += puIdx ? m_numPartitions >> 1 : 0;
         break;
     case SIZE_Nx2N:
-        outPartIdxLB += (partIdx == 0) ? m_numPartitions >> 1 : (m_numPartitions >> 2) * 3;
+        outPartIdxLB += puIdx ? (m_numPartitions >> 2) * 3 : m_numPartitions >> 1;
         break;
     case SIZE_NxN:
-        outPartIdxLB += (m_numPartitions >> 2) * partIdx;
+        outPartIdxLB += (m_numPartitions >> 2) * puIdx;
         break;
     case SIZE_2NxnU:
-        outPartIdxLB += (partIdx == 0) ? -((int)m_numPartitions >> 3) : m_numPartitions >> 1;
+        outPartIdxLB += puIdx ? m_numPartitions >> 1 : -((int)m_numPartitions >> 3);
         break;
     case SIZE_2NxnD:
-        outPartIdxLB += (partIdx == 0) ? (m_numPartitions >> 2) + (m_numPartitions >> 3) : m_numPartitions >> 1;
+        outPartIdxLB += puIdx ? m_numPartitions >> 1 : (m_numPartitions >> 2) + (m_numPartitions >> 3);
         break;
     case SIZE_nLx2N:
-        outPartIdxLB += (partIdx == 0) ? m_numPartitions >> 1 : (m_numPartitions >> 1) + (m_numPartitions >> 4);
+        outPartIdxLB += puIdx ? (m_numPartitions >> 1) + (m_numPartitions >> 4) : m_numPartitions >> 1;
         break;
     case SIZE_nRx2N:
-        outPartIdxLB += (partIdx == 0) ? m_numPartitions >> 1 : (m_numPartitions >> 1) + (m_numPartitions >> 2) + (m_numPartitions >> 4);
+        outPartIdxLB += puIdx ? (m_numPartitions >> 1) + (m_numPartitions >> 2) + (m_numPartitions >> 4) : m_numPartitions >> 1;
         break;
     default:
         X265_CHECK(0, "unexpected part index\n");
         break;
     }
+    return outPartIdxLB;
 }
 
 /* Derives the partition index of neighboring bottom right block */
-void CUData::deriveRightBottomIdx(uint32_t partIdx, uint32_t& outPartIdxRB) const
+uint32_t CUData::deriveRightBottomIdx(uint32_t puIdx) const
 {
+    uint32_t outPartIdxRB;
     outPartIdxRB = g_rasterToZscan[g_zscanToRaster[m_absIdxInCTU] +
                                    ((1 << (m_log2CUSize[0] - LOG2_UNIT_SIZE - 1)) - 1) * s_numPartInCUSize +
                                    (1 << (m_log2CUSize[0] - LOG2_UNIT_SIZE)) - 1];
@@ -1337,30 +1349,31 @@ void CUData::deriveRightBottomIdx(uint32
         outPartIdxRB += m_numPartitions >> 1;
         break;
     case SIZE_2NxN:
-        outPartIdxRB += (partIdx == 0) ? 0 : m_numPartitions >> 1;
+        outPartIdxRB += puIdx ? m_numPartitions >> 1 : 0;
         break;
     case SIZE_Nx2N:
-        outPartIdxRB += (partIdx == 0) ? m_numPartitions >> 2 : (m_numPartitions >> 1);
+        outPartIdxRB += puIdx ? m_numPartitions >> 1 : m_numPartitions >> 2;
         break;
     case SIZE_NxN:
-        outPartIdxRB += (m_numPartitions >> 2) * (partIdx - 1);
+        outPartIdxRB += (m_numPartitions >> 2) * (puIdx - 1);
         break;
     case SIZE_2NxnU:
-        outPartIdxRB += (partIdx == 0) ? -((int)m_numPartitions >> 3) : m_numPartitions >> 1;
+        outPartIdxRB += puIdx ? m_numPartitions >> 1 : -((int)m_numPartitions >> 3);
         break;
     case SIZE_2NxnD:
-        outPartIdxRB += (partIdx == 0) ? (m_numPartitions >> 2) + (m_numPartitions >> 3) : m_numPartitions >> 1;
+        outPartIdxRB += puIdx ? m_numPartitions >> 1 : (m_numPartitions >> 2) + (m_numPartitions >> 3);
         break;
     case SIZE_nLx2N:
-        outPartIdxRB += (partIdx == 0) ? (m_numPartitions >> 3) + (m_numPartitions >> 4) : m_numPartitions >> 1;
+        outPartIdxRB += puIdx ? m_numPartitions >> 1 : (m_numPartitions >> 3) + (m_numPartitions >> 4);
         break;
     case SIZE_nRx2N:
-        outPartIdxRB += (partIdx == 0) ? (m_numPartitions >> 2) + (m_numPartitions >> 3) + (m_numPartitions >> 4) : m_numPartitions >> 1;
+        outPartIdxRB += puIdx ? m_numPartitions >> 1 : (m_numPartitions >> 2) + (m_numPartitions >> 3) + (m_numPartitions >> 4);
         break;
     default:
         X265_CHECK(0, "unexpected part index\n");
         break;
     }
+    return outPartIdxRB;
 }
 
 void CUData::deriveLeftRightTopIdxAdi(uint32_t& outPartIdxLT, uint32_t& outPartIdxRT, uint32_t partOffset, uint32_t partDepth) const
@@ -1371,17 +1384,17 @@ void CUData::deriveLeftRightTopIdxAdi(ui
     outPartIdxRT = g_rasterToZscan[g_zscanToRaster[outPartIdxLT] + numPartInWidth - 1];
 }
 
-bool CUData::hasEqualMotion(uint32_t absPartIdx, const CUData* candCU, uint32_t candAbsPartIdx) const
+bool CUData::hasEqualMotion(uint32_t absPartIdx, const CUData& candCU, uint32_t candAbsPartIdx) const
 {
-    if (m_interDir[absPartIdx] != candCU->m_interDir[candAbsPartIdx])
+    if (m_interDir[absPartIdx] != candCU.m_interDir[candAbsPartIdx])
         return false;
 
     for (uint32_t refListIdx = 0; refListIdx < 2; refListIdx++)
     {
         if (m_interDir[absPartIdx] & (1 << refListIdx))
         {
-            if (m_mv[refListIdx][absPartIdx] != candCU->m_mv[refListIdx][candAbsPartIdx] ||
-                m_refIdx[refListIdx][absPartIdx] != candCU->m_refIdx[refListIdx][candAbsPartIdx])
+            if (m_mv[refListIdx][absPartIdx] != candCU.m_mv[refListIdx][candAbsPartIdx] ||
+                m_refIdx[refListIdx][absPartIdx] != candCU.m_refIdx[refListIdx][candAbsPartIdx])
                 return false;
         }
     }
@@ -1419,10 +1432,9 @@ uint32_t CUData::getInterMergeCandidates
 
     uint32_t count = 0;
 
-    uint32_t partIdxLT, partIdxRT, partIdxLB;
+    uint32_t partIdxLT, partIdxRT, partIdxLB = deriveLeftBottomIdx(puIdx);
     PartSize curPS = (PartSize)m_partSize[absPartIdx];
-    deriveLeftBottomIdx(puIdx, partIdxLB);
-
+    
     // left
     uint32_t leftPartIdx = 0;
     const CUData* cuLeft = getPULeft(leftPartIdx, partIdxLB);
@@ -1454,7 +1466,7 @@ uint32_t CUData::getInterMergeCandidates
         cuAbove->isDiffMER(xP + nPSW - 1, yP - 1, xP, yP) &&
         !(puIdx == 1 && (curPS == SIZE_2NxN || curPS == SIZE_2NxnU || curPS == SIZE_2NxnD)) &&
         !cuAbove->isIntra(abovePartIdx);
-    if (isAvailableB1 && (!isAvailableA1 || !cuLeft->hasEqualMotion(leftPartIdx, cuAbove, abovePartIdx)))
+    if (isAvailableB1 && (!isAvailableA1 || !cuLeft->hasEqualMotion(leftPartIdx, *cuAbove, abovePartIdx)))
     {
         // get Inter Dir
         interDirNeighbours[count] = cuAbove->m_interDir[abovePartIdx];
@@ -1475,7 +1487,7 @@ uint32_t CUData::getInterMergeCandidates
     bool isAvailableB0 = cuAboveRight &&
         cuAboveRight->isDiffMER(xP + nPSW, yP - 1, xP, yP) &&
         !cuAboveRight->isIntra(aboveRightPartIdx);
-    if (isAvailableB0 && (!isAvailableB1 || !cuAbove->hasEqualMotion(abovePartIdx, cuAboveRight, aboveRightPartIdx)))
+    if (isAvailableB0 && (!isAvailableB1 || !cuAbove->hasEqualMotion(abovePartIdx, *cuAboveRight, aboveRightPartIdx)))
     {
         // get Inter Dir
         interDirNeighbours[count] = cuAboveRight->m_interDir[aboveRightPartIdx];
@@ -1496,7 +1508,7 @@ uint32_t CUData::getInterMergeCandidates
     bool isAvailableA0 = cuLeftBottom &&
         cuLeftBottom->isDiffMER(xP - 1, yP + nPSH, xP, yP) &&
         !cuLeftBottom->isIntra(leftBottomPartIdx);
-    if (isAvailableA0 && (!isAvailableA1 || !cuLeft->hasEqualMotion(leftPartIdx, cuLeftBottom, leftBottomPartIdx)))
+    if (isAvailableA0 && (!isAvailableA1 || !cuLeft->hasEqualMotion(leftPartIdx, *cuLeftBottom, leftBottomPartIdx)))
     {
         // get Inter Dir
         interDirNeighbours[count] = cuLeftBottom->m_interDir[leftBottomPartIdx];
@@ -1519,8 +1531,8 @@ uint32_t CUData::getInterMergeCandidates
         bool isAvailableB2 = cuAboveLeft &&
             cuAboveLeft->isDiffMER(xP - 1, yP - 1, xP, yP) &&
             !cuAboveLeft->isIntra(aboveLeftPartIdx);
-        if (isAvailableB2 && (!isAvailableA1 || !cuLeft->hasEqualMotion(leftPartIdx, cuAboveLeft, aboveLeftPartIdx))
-            && (!isAvailableB1 || !cuAbove->hasEqualMotion(abovePartIdx, cuAboveLeft, aboveLeftPartIdx)))
+        if (isAvailableB2 && (!isAvailableA1 || !cuLeft->hasEqualMotion(leftPartIdx, *cuAboveLeft, aboveLeftPartIdx))
+            && (!isAvailableB1 || !cuAbove->hasEqualMotion(abovePartIdx, *cuAboveLeft, aboveLeftPartIdx)))
         {
             // get Inter Dir
             interDirNeighbours[count] = cuAboveLeft->m_interDir[aboveLeftPartIdx];
@@ -1537,11 +1549,8 @@ uint32_t CUData::getInterMergeCandidates
     }
     if (m_slice->m_sps->bTemporalMVPEnabled)
     {
+        uint32_t partIdxRB = deriveRightBottomIdx(puIdx);
         MV colmv;
-        uint32_t partIdxRB;
-
-        deriveRightBottomIdx(puIdx, partIdxRB);
-
         int ctuIdx = -1;
 
         // image boundary check
@@ -1570,13 +1579,12 @@ uint32_t CUData::getInterMergeCandidates
         }
 
         int refIdx = 0;
-        uint32_t partIdxCenter;
+        uint32_t partIdxCenter = deriveCenterIdx(puIdx);
         uint32_t curCTUIdx = m_cuAddr;
         int dir = 0;
-        deriveCenterIdx(puIdx, partIdxCenter);
-        bool bExistMV = ctuIdx >= 0 && getColMVP(0, ctuIdx, absPartAddr, colmv, refIdx);
+        bool bExistMV = ctuIdx >= 0 && getColMVP(colmv, refIdx, 0, ctuIdx, absPartAddr);
         if (!bExistMV)
-            bExistMV = getColMVP(0, curCTUIdx, partIdxCenter, colmv, refIdx);
+            bExistMV = getColMVP(colmv, refIdx, 0, curCTUIdx, partIdxCenter);
         if (bExistMV)
         {
             dir |= 1;
@@ -1586,9 +1594,9 @@ uint32_t CUData::getInterMergeCandidates
 
         if (isInterB)
         {
-            bExistMV = ctuIdx >= 0 && getColMVP(1, ctuIdx, absPartAddr, colmv, refIdx);
+            bExistMV = ctuIdx >= 0 && getColMVP(colmv, refIdx, 1, ctuIdx, absPartAddr);
             if (!bExistMV)
-                bExistMV = getColMVP(1, curCTUIdx, partIdxCenter, colmv, refIdx);
+                bExistMV = getColMVP(colmv, refIdx, 1, curCTUIdx, partIdxCenter);
 
             if (bExistMV)
             {
@@ -1688,15 +1696,14 @@ bool CUData::isDiffMER(int xN, int yN, i
 }
 
 /* Constructs a list of candidates for AMVP, and a larger list of motion candidates */
-int CUData::fillMvpCand(uint32_t partIdx, uint32_t partAddr, int picList, int refIdx, MV* amvpCand, MV* mvc) const
+int CUData::fillMvpCand(uint32_t puIdx, uint32_t absPartIdx, int picList, int refIdx, MV* amvpCand, MV* mvc) const
 {
     int num = 0;
 
     // spatial MV