[x265-commits] [x265] YUV output: correct a rext merge issue

Wed Nov 6 23:42:27 CET 2013

details:   http://hg.videolan.org/x265/rev/dd8510d84b5a
branches:  
changeset: 4883:dd8510d84b5a
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Wed Nov 06 15:20:10 2013 +0530
description:
YUV output: correct a rext merge issue
Subject: [x265] YUV Output: more rext merge bugs

details:   http://hg.videolan.org/x265/rev/b2068453b55b
branches:  
changeset: 4884:b2068453b55b
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Wed Nov 06 15:42:48 2013 +0530
description:
YUV Output: more rext merge bugs
Subject: [x265] Merge

details:   http://hg.videolan.org/x265/rev/21e08cf159c5
branches:  
changeset: 4885:21e08cf159c5
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Wed Nov 06 15:43:19 2013 +0530
description:
Merge
Subject: [x265] asm: routines for luma hps filter functions for all block sizes.

details:   http://hg.videolan.org/x265/rev/01d97a51d37d
branches:  
changeset: 4886:01d97a51d37d
user:      Nabajit Deka
date:      Wed Nov 06 15:42:33 2013 +0530
description:
asm: routines for luma hps filter functions for all block sizes.
Subject: [x265] Adding asm function declaration and function pointer initializations for luma hps functions.

details:   http://hg.videolan.org/x265/rev/450947d76251
branches:  
changeset: 4887:450947d76251
user:      Nabajit Deka
date:      Wed Nov 06 15:46:19 2013 +0530
description:
Adding asm function declaration and function pointer initializations for luma hps functions.
Subject: [x265] asm code for blockcopy_sp, 16xN blocks

details:   http://hg.videolan.org/x265/rev/264b1458963a
branches:  
changeset: 4888:264b1458963a
user:      Praveen Tiwari
date:      Wed Nov 06 16:01:41 2013 +0530
description:
asm code for blockcopy_sp, 16xN blocks
Subject: [x265] asm code for blockcopy_sp, 24x32 block

details:   http://hg.videolan.org/x265/rev/8f71fba52d55
branches:  
changeset: 4889:8f71fba52d55
user:      Praveen Tiwari
date:      Wed Nov 06 17:02:19 2013 +0530
description:
asm code for blockcopy_sp, 24x32 block
Subject: [x265] YUV, Y4M Output: bitdepth confusion resolved

details:   http://hg.videolan.org/x265/rev/846e2c0d8478
branches:  
changeset: 4890:846e2c0d8478
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Wed Nov 06 15:37:17 2013 +0530
description:
YUV, Y4M Output: bitdepth confusion resolved
Subject: [x265] asm: ipfilter_ss[FILTER_V_S_S_8]

details:   http://hg.videolan.org/x265/rev/de7a50155cba
branches:  
changeset: 4891:de7a50155cba
user:      Min Chen <chenm003 at 163.com>
date:      Wed Nov 06 19:35:33 2013 +0800
description:
asm: ipfilter_ss[FILTER_V_S_S_8]
Subject: [x265] pixel.h: nit

details:   http://hg.videolan.org/x265/rev/267b3da1a734
branches:  
changeset: 4892:267b3da1a734
user:      Steve Borho <steve at borho.org>
date:      Wed Nov 06 14:46:08 2013 -0600
description:
pixel.h: nit
Subject: [x265] TComSlice: nits

details:   http://hg.videolan.org/x265/rev/eab2d925a0e0
branches:  
changeset: 4893:eab2d925a0e0
user:      Steve Borho <steve at borho.org>
date:      Wed Nov 06 14:50:09 2013 -0600
description:
TComSlice: nits
Subject: [x265] tcomdatacu: remove the for loop in InitCU(), which will never execute

details:   http://hg.videolan.org/x265/rev/8bdb65fef0f0
branches:  
changeset: 4894:8bdb65fef0f0
user:      Gopu Govindaswamy <gopu at multicorewareinc.com>
date:      Wed Nov 06 17:10:35 2013 +0530
description:
tcomdatacu: remove the for loop in InitCU(), which will never execute

partStartIdx is always zero or negative, so numElements is also always zero or negative.
The for loop never executes when numElements is zero or negative, so the loop block in initCU() has been removed.
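
The reasoning above can be checked with a small standalone sketch. It is not the x265 code. The function names (removedLoopBound, removedLoopIterations, retainedElementCount) are hypothetical, plain int parameters stand in for cuAddr, pic->getNumPartInCU() and m_numPartitions, and all three are assumed to be non-negative.

```cpp
#include <algorithm>

// Hypothetical stand-in for the bound of the removed loop in initCU():
//   partStartIdx = 0 - cuAddr * numPartInCU
//   numElements  = min(partStartIdx, numPartitions)
// Assumes all three inputs are non-negative.
int removedLoopBound(int cuAddr, int numPartInCU, int numPartitions)
{
    int partStartIdx = 0 - cuAddr * numPartInCU;       // never positive
    return std::min<int>(partStartIdx, numPartitions); // therefore never positive
}

// Counts how many times `for (int i = 0; i < numElements; i++)` would run.
int removedLoopIterations(int cuAddr, int numPartInCU, int numPartitions)
{
    int numElements = removedLoopBound(cuAddr, numPartInCU, numPartitions);
    int count = 0;
    for (int i = 0; i < numElements; i++)
        count++;
    return count;
}

// Mirrors the code kept by the patch:
//   firstElement = max(partStartIdx, 0); numElements = numPartitions - firstElement
int retainedElementCount(int cuAddr, int numPartInCU, int numPartitions)
{
    int partStartIdx = 0 - cuAddr * numPartInCU;
    int firstElement = std::max<int>(partStartIdx, 0); // always 0 under the assumption
    return numPartitions - firstElement;
}
```

Under these assumptions the removed loop runs zero times for any CU address, and the retained element count equals the partition count, so the `numElements > 0` branch is taken whenever there is at least one partition. That is the same argument the following changeset uses to drop the else block.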
Subject: [x265] tcomdatacu: partStartIdx is always negative, no need to have else block in InitCU()

details:   http://hg.videolan.org/x265/rev/11a4c5a15d79
branches:  
changeset: 4895:11a4c5a15d79
user:      Gopu Govindaswamy <gopu at multicorewareinc.com>
date:      Wed Nov 06 16:22:52 2013 +0530
description:
tcomdatacu: partStartIdx is always negative, no need to have else block in InitCU()
Subject: [x265] asm code for blockcopy_sp, 32xN

details:   http://hg.videolan.org/x265/rev/6b6d54cc234e
branches:  
changeset: 4896:6b6d54cc234e
user:      Praveen Tiwari
date:      Wed Nov 06 17:25:49 2013 +0530
description:
asm code for blockcopy_sp, 32xN
Subject: [x265] asm code for blockcopy_sp, 12x16 block

details:   http://hg.videolan.org/x265/rev/99c3b2e4f1cc
branches:  
changeset: 4897:99c3b2e4f1cc
user:      Praveen Tiwari
date:      Wed Nov 06 17:42:52 2013 +0530
description:
asm code for blockcopy_sp, 12x16 block
Subject: [x265] asm code for blockcopy_sp, 2x4

details:   http://hg.videolan.org/x265/rev/ea33d0f85b8e
branches:  
changeset: 4898:ea33d0f85b8e
user:      Praveen Tiwari
date:      Wed Nov 06 19:58:21 2013 +0530
description:
asm code for blockcopy_sp, 2x4
Subject: [x265] asm code for blockcopy_sp, 2x8

details:   http://hg.videolan.org/x265/rev/529bf6093782
branches:  
changeset: 4899:529bf6093782
user:      Praveen Tiwari
date:      Wed Nov 06 20:28:15 2013 +0530
description:
asm code for blockcopy_sp, 2x8
Subject: [x265] asm code for blockcopy_sp, 6x8

details:   http://hg.videolan.org/x265/rev/2ae2eb6c8e51
branches:  
changeset: 4900:2ae2eb6c8e51
user:      Praveen Tiwari
date:      Wed Nov 06 19:31:12 2013 +0530
description:
asm code for blockcopy_sp, 6x8
Subject: [x265] used sse4 for 2x4, 2x8 and 6x8

details:   http://hg.videolan.org/x265/rev/ddaa80b9b959
branches:  
changeset: 4901:ddaa80b9b959
user:      Praveen Tiwari
date:      Wed Nov 06 20:36:25 2013 +0530
description:
used sse4 for 2x4, 2x8 and 6x8
Subject: [x265] asm code for blockcopy_sp, 32x64

details:   http://hg.videolan.org/x265/rev/1a46771b9f87
branches:  
changeset: 4902:1a46771b9f87
user:      Praveen Tiwari
date:      Wed Nov 06 20:47:52 2013 +0530
description:
asm code for blockcopy_sp, 32x64
Subject: [x265] blockcopy_sp, added 16x64 block size

details:   http://hg.videolan.org/x265/rev/598a03afc62f
branches:  
changeset: 4903:598a03afc62f
user:      Praveen Tiwari
date:      Wed Nov 06 20:51:42 2013 +0530
description:
blockcopy_sp, added 16x64 block size
Subject: [x265] asm code for blockcopy_sp, 48x64

details:   http://hg.videolan.org/x265/rev/cde21084ca9d
branches:  
changeset: 4904:cde21084ca9d
user:      Praveen Tiwari
date:      Wed Nov 06 21:01:32 2013 +0530
description:
asm code for blockcopy_sp, 48x64
Subject: [x265] blockcopy_sp, 48x64 changed the macro name according to width

details:   http://hg.videolan.org/x265/rev/d87d627b2161
branches:  
changeset: 4905:d87d627b2161
user:      Praveen Tiwari
date:      Wed Nov 06 21:06:54 2013 +0530
description:
blockcopy_sp, 48x64 changed the macro name according to width
Subject: [x265] asm code for blockcopy_sp, 64xN

details:   http://hg.videolan.org/x265/rev/f0214135645a
branches:  
changeset: 4906:f0214135645a
user:      Praveen Tiwari
date:      Wed Nov 06 21:26:48 2013 +0530
description:
asm code for blockcopy_sp, 64xN
Subject: [x265] blockcopy_sp, corrected number of xmm registers

details:   http://hg.videolan.org/x265/rev/0c359d82ebc1
branches:  
changeset: 4907:0c359d82ebc1
user:      Praveen Tiwari
date:      Wed Nov 06 21:29:32 2013 +0530
description:
blockcopy_sp, corrected number of xmm registers
Subject: [x265] asm: move _sse4 block copy function pointer assignments into SSE4 section

details:   http://hg.videolan.org/x265/rev/34d494a8051f
branches:  
changeset: 4908:34d494a8051f
user:      Steve Borho <steve at borho.org>
date:      Wed Nov 06 15:26:24 2013 -0600
description:
asm: move _sse4 block copy function pointer assignments into SSE4 section
Subject: [x265] asm: move block copy funcdefs into blockcopy8.h

details:   http://hg.videolan.org/x265/rev/edf77f60b55c
branches:  
changeset: 4909:edf77f60b55c
user:      Steve Borho <steve at borho.org>
date:      Wed Nov 06 15:30:57 2013 -0600
description:
asm: move block copy funcdefs into blockcopy8.h
Subject: [x265] asm: use new block based chroma single-pass MC primitives

details:   http://hg.videolan.org/x265/rev/8d1bd79d3618
branches:  
changeset: 4910:8d1bd79d3618
user:      Steve Borho <steve at borho.org>
date:      Wed Nov 06 16:05:59 2013 -0600
description:
asm: use new block based chroma single-pass MC primitives
Subject: [x265] asm: use new block based single-pass H-filter motion compensation primitives

details:   http://hg.videolan.org/x265/rev/dbb86150c919
branches:  
changeset: 4911:dbb86150c919
user:      Steve Borho <steve at borho.org>
date:      Wed Nov 06 16:30:44 2013 -0600
description:
asm: use new block based single-pass H-filter motion compensation primitives

diffstat:

 source/Lib/TLibCommon/TComDataCU.cpp     |   81 +--
 source/Lib/TLibCommon/TComDataCU.h       |   28 -
 source/Lib/TLibCommon/TComPrediction.cpp |   24 +-
 source/Lib/TLibCommon/TComSlice.cpp      |    2 +-
 source/Lib/TLibCommon/TComSlice.h        |   22 +-
 source/common/CMakeLists.txt             |    2 +-
 source/common/ipfilter.cpp               |   41 +-
 source/common/primitives.h               |    3 +-
 source/common/x86/asm-primitives.cpp     |   40 +
 source/common/x86/blockcopy8.asm         |  850 +++++++++++++++++++++++++++++++
 source/common/x86/blockcopy8.h           |  102 +++
 source/common/x86/ipfilter8.asm          |  306 ++++++++--
 source/common/x86/ipfilter8.h            |    2 +
 source/common/x86/pixel.h                |   70 --
 source/encoder/motion.cpp                |  277 ++++-----
 source/encoder/motion.h                  |    2 -
 source/encoder/slicetype.cpp             |    2 +-
 source/output/y4m.cpp                    |   47 +-
 source/output/yuv.cpp                    |   36 +-
 source/test/ipfilterharness.cpp          |  102 +++
 source/test/ipfilterharness.h            |    1 +
 21 files changed, 1579 insertions(+), 461 deletions(-)

diffs (truncated from 2626 to 300 lines):

diff -r bc99537483f1 -r dbb86150c919 source/Lib/TLibCommon/TComDataCU.cpp
--- a/source/Lib/TLibCommon/TComDataCU.cpp	Tue Nov 05 22:21:55 2013 -0600
+++ b/source/Lib/TLibCommon/TComDataCU.cpp	Wed Nov 06 16:30:44 2013 -0600
@@ -255,40 +255,8 @@ void TComDataCU::initCU(TComPic* pic, ui
 
     // CHECK_ME: why partStartIdx always negative
     int partStartIdx = 0 - (cuAddr) * pic->getNumPartInCU();
-
-    int numElements = std::min<int>(partStartIdx, m_numPartitions);
-    for (int i = 0; i < numElements; i++)
-    {
-        TComDataCU* from = pic->getCU(getAddr());
-        m_skipFlag[i]   = from->getSkipFlag(i);
-        m_partSizes[i] = from->getPartitionSize(i);
-        m_predModes[i] = from->getPredictionMode(i);
-        m_cuTransquantBypass[i] = from->getCUTransquantBypass(i);
-        m_depth[i] = from->getDepth(i);
-        m_width[i] = from->getWidth(i);
-        m_height[i] = from->getHeight(i);
-        m_trIdx[i] = from->getTransformIdx(i);
-        m_transformSkip[0][i] = from->getTransformSkip(i, TEXT_LUMA);
-        m_transformSkip[1][i] = from->getTransformSkip(i, TEXT_CHROMA_U);
-        m_transformSkip[2][i] = from->getTransformSkip(i, TEXT_CHROMA_V);
-        m_mvpIdx[0][i] = from->m_mvpIdx[0][i];
-        m_mvpIdx[1][i] = from->m_mvpIdx[1][i];
-        m_mvpNum[0][i] = from->m_mvpNum[0][i];
-        m_mvpNum[1][i] = from->m_mvpNum[1][i];
-        m_qp[i] = from->m_qp[i];
-        m_bMergeFlags[i] = from->m_bMergeFlags[i];
-        m_mergeIndex[i] = from->m_mergeIndex[i];
-        m_lumaIntraDir[i] = from->m_lumaIntraDir[i];
-        m_chromaIntraDir[i] = from->m_chromaIntraDir[i];
-        m_interDir[i] = from->m_interDir[i];
-        m_cbf[0][i] = from->m_cbf[0][i];
-        m_cbf[1][i] = from->m_cbf[1][i];
-        m_cbf[2][i] = from->m_cbf[2][i];
-        m_iPCMFlags[i] = from->m_iPCMFlags[i];
-    }
-
     int firstElement = std::max<int>(partStartIdx, 0);
-    numElements = m_numPartitions - firstElement;
+    int numElements = m_numPartitions - firstElement;
 
     if (numElements > 0)
     {
@@ -330,25 +298,6 @@ void TComDataCU::initCU(TComPic* pic, ui
         memset(m_iPCMSampleCb, 0, sizeof(Pel) * c_tmp);
         memset(m_iPCMSampleCr, 0, sizeof(Pel) * c_tmp);
     }
-    else
-    {
-        TComDataCU * from = pic->getCU(getAddr());
-        m_cuMvField[0].copyFrom(&from->m_cuMvField[0], m_numPartitions, 0);
-        m_cuMvField[1].copyFrom(&from->m_cuMvField[1], m_numPartitions, 0);
-        for (int i = 0; i < y_tmp; i++)
-        {
-            m_trCoeffY[i] = from->m_trCoeffY[i];
-            m_iPCMSampleY[i] = from->m_iPCMSampleY[i];
-        }
-
-        for (int i = 0; i < c_tmp; i++)
-        {
-            m_trCoeffCb[i] = from->m_trCoeffCb[i];
-            m_trCoeffCr[i] = from->m_trCoeffCr[i];
-            m_iPCMSampleCb[i] = from->m_iPCMSampleCb[i];
-            m_iPCMSampleCr[i] = from->m_iPCMSampleCr[i];
-        }
-    }
 
     // Setting neighbor CU
     m_cuLeft        = NULL;
@@ -438,16 +387,6 @@ void TComDataCU::initEstData(uint32_t de
 
     m_cuMvField[0].clearMvField();
     m_cuMvField[1].clearMvField();
-
-    uint32_t tmp = width * height;
-    memset(m_trCoeffY,    0, tmp * sizeof(*m_trCoeffY));
-    memset(m_iPCMSampleY, 0, tmp * sizeof(*m_iPCMSampleY));
-
-    tmp = (width >> m_hChromaShift) * (height >> m_vChromaShift);
-    memset(m_trCoeffCb,    0, tmp * sizeof(*m_trCoeffCb));
-    memset(m_trCoeffCr,    0, tmp * sizeof(*m_trCoeffCr));
-    memset(m_iPCMSampleCb, 0, tmp * sizeof(*m_iPCMSampleCb));
-    memset(m_iPCMSampleCr, 0, tmp * sizeof(*m_iPCMSampleCr));
 }
 
 // initialize Sub partition
@@ -513,16 +452,6 @@ void TComDataCU::initSubCU(TComDataCU* c
         m_mvpNum[1][i] = -1;
     }
 
-    uint32_t tmp = width * heigth;
-    memset(m_trCoeffY, 0, sizeof(TCoeff) * tmp);
-    memset(m_iPCMSampleY, 0, sizeof(Pel) * tmp);
-
-    tmp = (width >> m_hChromaShift) * (heigth >> m_vChromaShift);
-    memset(m_trCoeffCb, 0, sizeof(TCoeff) * tmp);
-    memset(m_trCoeffCr, 0, sizeof(TCoeff) * tmp);
-    memset(m_iPCMSampleCb, 0, sizeof(Pel) * tmp);
-    memset(m_iPCMSampleCr, 0, sizeof(Pel) * tmp);
-
     m_cuMvField[0].clearMvField();
     m_cuMvField[1].clearMvField();
 
@@ -1597,14 +1526,6 @@ void TComDataCU::setTransformSkipSubPart
     memset(m_transformSkip[g_convertTxtTypeToIdx[ttype]] + absPartIdx, useTransformSkip, sizeof(UChar) * curPartNum);
 }
 
-void TComDataCU::setSizeSubParts(uint32_t width, uint32_t height, uint32_t absPartIdx, uint32_t depth)
-{
-    uint32_t curPartNum = m_pic->getNumPartInCU() >> (depth << 1);
-
-    memset(m_width  + absPartIdx, width,  sizeof(UChar) * curPartNum);
-    memset(m_height + absPartIdx, height, sizeof(UChar) * curPartNum);
-}
-
 UChar TComDataCU::getNumPartInter()
 {
     UChar numPart = 0;
diff -r bc99537483f1 -r dbb86150c919 source/Lib/TLibCommon/TComDataCU.h
--- a/source/Lib/TLibCommon/TComDataCU.h	Tue Nov 05 22:21:55 2013 -0600
+++ b/source/Lib/TLibCommon/TComDataCU.h	Wed Nov 06 16:30:44 2013 -0600
@@ -224,8 +224,6 @@ public:
 
     UChar         getDepth(uint32_t idx)           { return m_depth[idx]; }
 
-    void          setDepth(uint32_t idx, UChar h)  { m_depth[idx] = h; }
-
     void          setDepthSubParts(uint32_t depth, uint32_t absPartIdx);
 
     // -------------------------------------------------------------------------------------------------------------------
@@ -234,12 +232,8 @@ public:
 
     char*         getPartitionSize()                      { return m_partSizes; }
 
-    int           getUnitSize()                           { return m_unitSize; }
-
     PartSize      getPartitionSize(uint32_t idx)              { return static_cast<PartSize>(m_partSizes[idx]); }
 
-    void          setPartitionSize(uint32_t idx, PartSize uh) { m_partSizes[idx] = (char)uh; }
-
     void          setPartSizeSubParts(PartSize eMode, uint32_t absPartIdx, uint32_t depth);
     void          setCUTransquantBypassSubParts(bool flag, uint32_t absPartIdx, uint32_t depth);
 
@@ -247,8 +241,6 @@ public:
 
     bool         getSkipFlag(uint32_t idx)            { return m_skipFlag[idx]; }
 
-    void         setSkipFlag(uint32_t idx, bool skip) { m_skipFlag[idx] = skip; }
-
     void         setSkipFlagSubParts(bool skip, uint32_t absPartIdx, uint32_t depth);
 
     char*         getPredictionMode()                 { return m_predModes; }
@@ -259,24 +251,16 @@ public:
 
     bool          getCUTransquantBypass(uint32_t idx)     { return m_cuTransquantBypass[idx]; }
 
-    void          setPredictionMode(uint32_t idx, PredMode uh) { m_predModes[idx] = (char)uh; }
-
     void          setPredModeSubParts(PredMode eMode, uint32_t absPartIdx, uint32_t depth);
 
     UChar*        getWidth()                     { return m_width; }
 
     UChar         getWidth(uint32_t idx)             { return m_width[idx]; }
 
-    void          setWidth(uint32_t idx, UChar  uh)  { m_width[idx] = uh; }
-
     UChar*        getHeight()                    { return m_height; }
 
     UChar         getHeight(uint32_t idx)            { return m_height[idx]; }
 
-    void          setHeight(uint32_t idx, UChar  uh) { m_height[idx] = uh; }
-
-    void          setSizeSubParts(uint32_t width, uint32_t height, uint32_t absPartIdx, uint32_t depth);
-
     char*         getQP()                        { return m_qp; }
 
     char          getQP(uint32_t idx)                { return m_qp[idx]; }
@@ -342,16 +326,12 @@ public:
 
     bool          getMergeFlag(uint32_t idx)            { return m_bMergeFlags[idx]; }
 
-    void          setMergeFlag(uint32_t idx, bool b)    { m_bMergeFlags[idx] = b; }
-
     void          setMergeFlagSubParts(bool bMergeFlag, uint32_t absPartIdx, uint32_t partIdx, uint32_t depth);
 
     UChar*        getMergeIndex()                   { return m_mergeIndex; }
 
     UChar         getMergeIndex(uint32_t idx)           { return m_mergeIndex[idx]; }
 
-    void          setMergeIndex(uint32_t idx, uint32_t mergeIndex) { m_mergeIndex[idx] = (UChar)mergeIndex; }
-
     void          setMergeIndexSubParts(uint32_t mergeIndex, uint32_t absPartIdx, uint32_t partIdx, uint32_t depth);
     template<typename T>
     void          setSubPart(T bParameter, T* pbBaseLCU, uint32_t cuAddr, uint32_t cuDepth, uint32_t puIdx);
@@ -364,24 +344,18 @@ public:
 
     UChar         getLumaIntraDir(uint32_t idx) { return m_lumaIntraDir[idx]; }
 
-    void          setLumaIntraDir(uint32_t idx, UChar uh) { m_lumaIntraDir[idx] = uh; }
-
     void          setLumaIntraDirSubParts(uint32_t dir, uint32_t absPartIdx, uint32_t depth);
 
     UChar*        getChromaIntraDir()                 { return m_chromaIntraDir; }
 
     UChar         getChromaIntraDir(uint32_t idx)         { return m_chromaIntraDir[idx]; }
 
-    void          setChromaIntraDir(uint32_t idx, UChar  uh) { m_chromaIntraDir[idx] = uh; }
-
     void          setChromIntraDirSubParts(uint32_t dir, uint32_t absPartIdx, uint32_t depth);
 
     UChar*        getInterDir()                    { return m_interDir; }
 
     UChar         getInterDir(uint32_t idx)            { return m_interDir[idx]; }
 
-    void          setInterDir(uint32_t idx, UChar  uh) { m_interDir[idx] = uh; }
-
     void          setInterDirSubParts(uint32_t dir,  uint32_t absPartIdx, uint32_t partIdx, uint32_t depth);
     bool*         getIPCMFlag()                     { return m_iPCMFlags; }
 
@@ -414,8 +388,6 @@ public:
 
     char*         getMVPIdx(int picList)                       { return m_mvpIdx[picList]; }
 
-    void          setMVPNum(int picList, uint32_t idx, int mvpNum) { m_mvpNum[picList][idx] = (char)mvpNum; }
-
     int           getMVPNum(int picList, uint32_t idx)             { return m_mvpNum[picList][idx]; }
 
     char*         getMVPNum(int picList)                       { return m_mvpNum[picList]; }
diff -r bc99537483f1 -r dbb86150c919 source/Lib/TLibCommon/TComPrediction.cpp
--- a/source/Lib/TLibCommon/TComPrediction.cpp	Tue Nov 05 22:21:55 2013 -0600
+++ b/source/Lib/TLibCommon/TComPrediction.cpp	Wed Nov 06 16:30:44 2013 -0600
@@ -476,6 +476,7 @@ void TComPrediction::xPredInterLumaBlk(T
 
     int srcStride = refPic->getStride();
     int srcOffset = (mv->x >> 2) + (mv->y >> 2) * srcStride;
+    int partEnum = partitionFromSizes(width, height);
     Pel* src = refPic->getLumaAddr(cu->getAddr(), cu->getZorderIdxInCU() + partAddr) + srcOffset;
 
     int xFrac = mv->x & 0x3;
@@ -483,11 +484,11 @@ void TComPrediction::xPredInterLumaBlk(T
 
     if ((yFrac | xFrac) == 0)
     {
-        primitives.blockcpy_pp(width, height, dst, dstStride, src, srcStride);
+        primitives.luma_copy_pp[partEnum](dst, dstStride, src, srcStride);
     }
     else if (yFrac == 0)
     {
-        primitives.ipfilter_pp[FILTER_H_P_P_8](src, srcStride, dst, dstStride, width, height, g_lumaFilter[xFrac]);
+        primitives.luma_hpp[partEnum](src, srcStride, dst, dstStride, xFrac);
     }
     else if (xFrac == 0)
     {
@@ -537,7 +538,7 @@ void TComPrediction::xPredInterLumaBlk(T
         int filterSize = NTAPS_LUMA;
         int halfFilterSize = (filterSize >> 1);
         primitives.ipfilter_ps[FILTER_H_P_S_8](ref - (halfFilterSize - 1) * refStride, refStride, m_immedVals, tmpStride, width, height + filterSize - 1, g_lumaFilter[xFrac]);
-        primitives.ipfilter_ss[FILTER_V_S_S_8](m_immedVals + (halfFilterSize - 1) * tmpStride, tmpStride, dst, dstStride, width, height, g_lumaFilter[yFrac]);
+        primitives.ipfilter_ss[FILTER_V_S_S_8](m_immedVals + (halfFilterSize - 1) * tmpStride, tmpStride, dst, dstStride, width, height, yFrac);
     }
 }
 
@@ -568,23 +569,24 @@ void TComPrediction::xPredInterChromaBlk
 
     int xFrac = mv->x & 0x7;
     int yFrac = mv->y & 0x7;
+    int partEnum = partitionFromSizes(width, height);
     uint32_t cxWidth = width >> 1;
     uint32_t cxHeight = height >> 1;
 
     if ((yFrac | xFrac) == 0)
     {
-        primitives.blockcpy_pp(cxWidth, cxHeight, dstCb, dstStride, refCb, refStride);
-        primitives.blockcpy_pp(cxWidth, cxHeight, dstCr, dstStride, refCr, refStride);
+        primitives.chroma_copy_pp[partEnum](dstCb, dstStride, refCb, refStride);
+        primitives.chroma_copy_pp[partEnum](dstCr, dstStride, refCr, refStride);
     }
     else if (yFrac == 0)
     {
-        primitives.ipfilter_pp[FILTER_H_P_P_4](refCb, refStride, dstCb, dstStride, cxWidth, cxHeight, g_chromaFilter[xFrac]);
-        primitives.ipfilter_pp[FILTER_H_P_P_4](refCr, refStride, dstCr, dstStride, cxWidth, cxHeight, g_chromaFilter[xFrac]);
+        primitives.chroma_hpp[partEnum](refCb, refStride, dstCb, dstStride, xFrac);
+        primitives.chroma_hpp[partEnum](refCr, refStride, dstCr, dstStride, xFrac);
     }
     else if (xFrac == 0)
     {
-        primitives.ipfilter_pp[FILTER_V_P_P_4](refCb, refStride, dstCb, dstStride, cxWidth, cxHeight, g_chromaFilter[yFrac]);
-        primitives.ipfilter_pp[FILTER_V_P_P_4](refCr, refStride, dstCr, dstStride, cxWidth, cxHeight, g_chromaFilter[yFrac]);
+        primitives.chroma_vpp[partEnum](refCb, refStride, dstCb, dstStride, yFrac);
+        primitives.chroma_vpp[partEnum](refCr, refStride, dstCr, dstStride, yFrac);
     }
     else
     {
@@ -643,9 +645,9 @@ void TComPrediction::xPredInterChromaBlk
         int filterSize = NTAPS_CHROMA;
         int halfFilterSize = (filterSize >> 1);
         primitives.ipfilter_ps[FILTER_H_P_S_4](refCb - (halfFilterSize - 1) * refStride, refStride, m_immedVals, extStride, cxWidth, cxHeight + filterSize - 1, g_chromaFilter[xFrac]);
-        primitives.ipfilter_ss[FILTER_V_S_S_4](m_immedVals + (halfFilterSize - 1) * extStride, extStride, dstCb, dstStride, cxWidth, cxHeight, g_chromaFilter[yFrac]);