[x265-commits] [x265] asm: fix for potential mismatch between ASM and no-ASM out...

Fri Jan 31 23:59:52 CET 2014

details:   http://hg.videolan.org/x265/rev/539d1b0561b1
branches:  stable
changeset: 5945:539d1b0561b1
user:      Praveen Tiwari
date:      Fri Jan 31 18:37:06 2014 +0530
description:
asm: fix for potential mismatch between ASM and no-ASM outputs
Subject: [x265] slicetype: prevent divide-by-zero and sqrtf(0)

details:   http://hg.videolan.org/x265/rev/3bc0651c0f40
branches:  stable
changeset: 5946:3bc0651c0f40
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 14:55:35 2014 -0600
description:
slicetype: prevent divide-by-zero and sqrtf(0)
Subject: [x265] slicetype: use explicit float type constant

details:   http://hg.videolan.org/x265/rev/e04f2b3dea39
branches:  stable
changeset: 5947:e04f2b3dea39
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 14:56:33 2014 -0600
description:
slicetype: use explicit float type constant
Subject: [x265] slicetype: alloc wpScalingParam instance as a struct member

details:   http://hg.videolan.org/x265/rev/86081bfcacf9
branches:  stable
changeset: 5948:86081bfcacf9
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:02:58 2014 -0600
description:
slicetype: alloc wpScalingParam instance as a struct member

This is a workaround for VC11.  When x265 was compiled as a Win32 debug build,
the stack was reported as corrupted in weightCostLuma().  No other compiler or
build option reported any problem (not even valgrind).  In the Visual Studio
debugger the stack was visibly corrupted as soon as the function was entered.
Moving `w` off of the stack makes the VC11 debugger happy again.
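
The shape of this workaround can be sketched roughly as follows. This is a
minimal illustration, not the actual x265 change: WeightParamSketch,
EstimatorSketch and weightCost are hypothetical names standing in for
wpScalingParam and the code around weightCostLuma(), whose real definitions do
not appear in this message.

```cpp
#include <cassert>

// Hypothetical stand-in for wpScalingParam; the real fields are not shown
// in this message.
struct WeightParamSketch
{
    int scale;
    int offset;
};

struct EstimatorSketch
{
    // Held as a struct member, so it no longer lives in the stack frame of
    // the cost function.
    WeightParamSketch w;

    // Hypothetical cost function: fills in the member and applies the weight
    // to a single sample value.
    int weightCost(int sample, int scale, int offset)
    {
        w.scale = scale;
        w.offset = offset;
        return sample * w.scale + w.offset;
    }
};
```

The only point of the sketch is where `w` is stored: as a member of the owning
object instead of a local variable inside the function.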
Subject: [x265] slicetype: comment nits

details:   http://hg.videolan.org/x265/rev/d24e2a8c4326
branches:  stable
changeset: 5949:d24e2a8c4326
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:03:47 2014 -0600
description:
slicetype: comment nits

Remove a comment copied from x264 that has no bearing on x265, and fix the
alignment of another comment.
Subject: [x265] Added tag 0.7 for changeset d24e2a8c4326

details:   http://hg.videolan.org/x265/rev/edf64ac976ea
branches:  stable
changeset: 5950:edf64ac976ea
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:03:56 2014 -0600
description:
Added tag 0.7 for changeset d24e2a8c4326
Subject: [x265] testbench: fix for random test bench failure caused by pixeladd_ss

details:   http://hg.videolan.org/x265/rev/8f066e4e48e9
branches:  
changeset: 5951:8f066e4e48e9
user:      Nabajit Deka
date:      Thu Jan 30 15:42:49 2014 +0530
description:
testbench: fix for random test bench failure caused by pixeladd_ss
Subject: [x265] testbench: add stress test case for luma_pp filter function

details:   http://hg.videolan.org/x265/rev/897067ac23ac
branches:  
changeset: 5952:897067ac23ac
user:      Nabajit Deka
date:      Thu Jan 30 19:15:26 2014 +0530
description:
testbench: add stress test case for luma_pp filter function
Subject: [x265] testbench: fix signed/unsigned comparison warning

details:   http://hg.videolan.org/x265/rev/24e448ed4341
branches:  
changeset: 5953:24e448ed4341
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 13:52:58 2014 -0600
description:
testbench: fix signed/unsigned comparison warning
Subject: [x265] cmake: white-space nits

details:   http://hg.videolan.org/x265/rev/58cff481d6ed
branches:  
changeset: 5954:58cff481d6ed
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:23:27 2014 -0600
description:
cmake: white-space nits
Subject: [x265] cmake: attempt to support non-x86 compile targets

details:   http://hg.videolan.org/x265/rev/8769cd7b97ac
branches:  
changeset: 5955:8769cd7b97ac
user:      Steve Borho <steve at borho.org>
date:      Thu Jan 30 12:19:57 2014 -0600
description:
cmake: attempt to support non-x86 compile targets
Subject: [x265] Merge with stable

details:   http://hg.videolan.org/x265/rev/65003e385629
branches:  
changeset: 5956:65003e385629
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:26:29 2014 -0600
description:
Merge with stable
Subject: [x265] ratecontrol: white-space nits

details:   http://hg.videolan.org/x265/rev/4aed055bd1ed
branches:  
changeset: 5957:4aed055bd1ed
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:52:48 2014 -0600
description:
ratecontrol: white-space nits
Subject: [x265] ratecontrol: add missing braces

details:   http://hg.videolan.org/x265/rev/c4e99fde0b0b
branches:  
changeset: 5958:c4e99fde0b0b
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:53:00 2014 -0600
description:
ratecontrol: add missing braces
Subject: [x265] weightp: vc11-win32-debug workarounds

details:   http://hg.videolan.org/x265/rev/461316bc1dd5
branches:  
changeset: 5959:461316bc1dd5
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:55:55 2014 -0600
description:
weightp: vc11-win32-debug workarounds
Subject: [x265] weightp: cleanups

details:   http://hg.videolan.org/x265/rev/fb048ad78e78
branches:  
changeset: 5960:fb048ad78e78
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 15:56:15 2014 -0600
description:
weightp: cleanups
Subject: [x265] uncrustify source (mechanical coding style enforcement)

details:   http://hg.videolan.org/x265/rev/9d0abf80eeb1
branches:  
changeset: 5961:9d0abf80eeb1
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 16:50:33 2014 -0600
description:
uncrustify source (mechanical coding style enforcement)

A few changes that uncrustify wanted to make have been left out of the commit
for style reasons.
Subject: [x265] ratecontrol: use X265_DEPTH instead of g_bitDepth

details:   http://hg.videolan.org/x265/rev/413ad959a5c6
branches:  
changeset: 5962:413ad959a5c6
user:      Steve Borho <steve at borho.org>
date:      Fri Jan 31 16:53:36 2014 -0600
description:
ratecontrol: use X265_DEPTH instead of g_bitDepth

On 8 bit builds, bit depth is known at compile time, allowing the compiler to
optimize away a few of these operations.
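
A rough sketch of why a compile-time depth helps, under stated assumptions:
SKETCH_HIGH_BIT_DEPTH, SKETCH_DEPTH and g_sketchBitDepth are illustrative names
only, not the macros or globals x265 actually defines, and the fallback to a
runtime variable in the high-bit-depth branch is just a placeholder.

```cpp
#include <cassert>

// Assumption: illustrative build switch, not the real x265 macro.
#ifndef SKETCH_HIGH_BIT_DEPTH
#define SKETCH_HIGH_BIT_DEPTH 0
#endif

// Runtime bit depth; the optimizer cannot assume its value.
static int g_sketchBitDepth = 8;

#if SKETCH_HIGH_BIT_DEPTH
#define SKETCH_DEPTH g_sketchBitDepth   // placeholder for non-8-bit builds
#else
#define SKETCH_DEPTH 8                  // 8-bit build: a literal constant
#endif

// Uses the runtime variable: the shift amount is computed at run time.
inline int scaleRuntime(int v)
{
    return v << (g_sketchBitDepth - 8);
}

// Uses the macro: in an 8-bit build the shift amount is the constant 0, so
// the compiler is free to fold the shift away entirely.
inline int scaleCompileTime(int v)
{
    return v << (SKETCH_DEPTH - 8);
}
```

Both functions return the same value when the runtime depth is 8; the
difference is only in what the compiler is able to simplify.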

diffstat:

 .hgtags                                             |    1 +
 source/CMakeLists.txt                               |   16 +-
 source/Lib/TLibCommon/ContextTables.h               |   51 +-
 source/Lib/TLibCommon/TComDataCU.cpp                |   98 +--
 source/Lib/TLibCommon/TComDataCU.h                  |   16 +-
 source/Lib/TLibCommon/TComMotionInfo.h              |    3 -
 source/Lib/TLibCommon/TComPattern.cpp               |   98 ----
 source/Lib/TLibCommon/TComPattern.h                 |   52 --
 source/Lib/TLibCommon/TComPic.h                     |    3 +
 source/Lib/TLibCommon/TComPrediction.cpp            |   42 +-
 source/Lib/TLibCommon/TComPrediction.h              |    3 -
 source/Lib/TLibCommon/TComRom.cpp                   |    9 +-
 source/Lib/TLibCommon/TComRom.h                     |    7 -
 source/Lib/TLibCommon/TComTrQuant.cpp               |    6 +-
 source/Lib/TLibCommon/TComTrQuant.h                 |    2 +-
 source/Lib/TLibCommon/TComYuv.cpp                   |   49 +-
 source/Lib/TLibCommon/TypeDef.h                     |   11 +-
 source/Lib/TLibEncoder/TEncCu.cpp                   |   15 +-
 source/Lib/TLibEncoder/TEncCu.h                     |    2 +-
 source/Lib/TLibEncoder/TEncSampleAdaptiveOffset.cpp |    8 +-
 source/Lib/TLibEncoder/TEncSbac.cpp                 |   49 +-
 source/Lib/TLibEncoder/TEncSearch.cpp               |   98 +--
 source/Lib/TLibEncoder/TEncSearch.h                 |   15 +-
 source/common/CMakeLists.txt                        |   22 +-
 source/common/TShortYUV.cpp                         |    1 -
 source/common/TShortYUV.h                           |    1 -
 source/common/common.cpp                            |   48 +-
 source/common/common.h                              |    1 -
 source/common/cpu.cpp                               |    6 +
 source/common/intrapred.cpp                         |    4 +-
 source/common/ipfilter.cpp                          |  166 +-----
 source/common/lowres.cpp                            |    6 +-
 source/common/lowres.h                              |    2 +-
 source/common/pixel.cpp                             |   28 +-
 source/common/primitives.cpp                        |   16 +-
 source/common/primitives.h                          |   30 +-
 source/common/threadpool.cpp                        |   83 ++-
 source/common/vec/intra-sse41.cpp                   |    3 +-
 source/common/vec/intra-ssse3.cpp                   |    2 +-
 source/common/vec/ipfilter-sse41.cpp                |  257 ----------
 source/common/vec/vec-primitives.cpp                |   11 +-
 source/common/x86/asm-primitives.cpp                |   27 +-
 source/common/x86/intrapred.h                       |   57 ++-
 source/common/x86/intrapred8.asm                    |  486 ++++++++++++++++++++
 source/common/x86/ipfilter8.asm                     |  235 +++------
 source/common/x86/ipfilter8.h                       |    1 -
 source/common/x86/pixel-a.asm                       |   18 +-
 source/encoder/CMakeLists.txt                       |    5 +-
 source/encoder/compress.cpp                         |   20 +-
 source/encoder/cturow.h                             |    2 +-
 source/encoder/dpb.cpp                              |    4 +-
 source/encoder/encoder.cpp                          |  141 +++-
 source/encoder/encoder.h                            |    6 +
 source/encoder/frameencoder.cpp                     |   21 +-
 source/encoder/frameencoder.h                       |    1 +
 source/encoder/ratecontrol.cpp                      |   32 +-
 source/encoder/slicetype.cpp                        |  480 ++++++++-----------
 source/encoder/slicetype.h                          |    6 +-
 source/encoder/weightPrediction.cpp                 |  103 +--
 source/encoder/weightPrediction.h                   |    2 +
 source/input/y4m.cpp                                |    5 +
 source/input/yuv.cpp                                |    3 +
 source/test/intrapredharness.cpp                    |    3 +-
 source/test/ipfilterharness.cpp                     |  284 ++---------
 source/test/ipfilterharness.h                       |    5 +-
 source/test/pixelharness.cpp                        |   65 ++-
 source/test/pixelharness.h                          |    1 +
 source/test/testharness.h                           |    1 +
 source/x265.cpp                                     |    7 +-
 source/x265.h                                       |   33 +-
 70 files changed, 1566 insertions(+), 1829 deletions(-)

diffs (truncated from 6210 to 300 lines):

diff -r 564eefbb3812 -r 413ad959a5c6 .hgtags

--- a/.hgtags	Thu Jan 30 17:34:31 2014 -0600
+++ b/.hgtags	Fri Jan 31 16:53:36 2014 -0600
@@ -8,3 +8,4 @@ 2ba6ec553f218d2b06ad803b87d6ec751fd639f7
 93707bc4fccdaa89a1f2da11db8808ca912a691c 0.4.1
 69acb3cb777f977f5edde908069ac565915dd366 0.5
 b970ffbdd696e3ce45c93b315902eb6366ff085e 0.6
+d24e2a8c4326b0cd01bfa6c414c5378481af9018 0.7
diff -r 564eefbb3812 -r 413ad959a5c6 source/CMakeLists.txt
--- a/source/CMakeLists.txt	Thu Jan 30 17:34:31 2014 -0600
+++ b/source/CMakeLists.txt	Fri Jan 31 16:53:36 2014 -0600
@@ -13,7 +13,7 @@ include(CheckFunctionExists)
 include(CheckCXXCompilerFlag)
 
 # X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 4)
+set(X265_BUILD 5)
 configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
                "${PROJECT_BINARY_DIR}/x265.def")
 configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"
@@ -21,12 +21,16 @@ configure_file("${PROJECT_SOURCE_DIR}/x2
 
 SET(CMAKE_MODULE_PATH "${PROJECT_SOURCE_DIR}/cmake" "${CMAKE_MODULE_PATH}")
 
-if ("${CMAKE_SIZEOF_VOID_P}" MATCHES 8)
-    set(X64 1)
-    add_definitions(-DX86_64=1)
+if(${CMAKE_SYSTEM_PROCESSOR} STREQUAL "i386" OR ${CMAKE_SYSTEM_PROCESSOR} STREQUAL "x86")
+    set(X86 1)
+    add_definitions(-DX265_ARCH_X86=1)
+    if("${CMAKE_SIZEOF_VOID_P}" MATCHES 8)
+        set(X64 1)
+        add_definitions(-DX86_64=1)
+    endif()
 endif()
 
-if (CMAKE_GENERATOR STREQUAL "Xcode")
+if(CMAKE_GENERATOR STREQUAL "Xcode")
   set(XCODE 1)
 endif()
 if (APPLE)
@@ -154,8 +158,8 @@ endif()
 
 include(version) # determine X265_VERSION and X265_LATEST_TAG
 include_directories(. Lib common encoder "${PROJECT_BINARY_DIR}")
+add_subdirectory(encoder)
 add_subdirectory(common)
-add_subdirectory(encoder)
 
 if((MSVC_IDE OR XCODE) AND ENABLE_ASSEMBLY)
     # this is horrible. ugly, and hacky, and it reproduces logic found
diff -r 564eefbb3812 -r 413ad959a5c6 source/Lib/TLibCommon/ContextTables.h
--- a/source/Lib/TLibCommon/ContextTables.h	Thu Jan 30 17:34:31 2014 -0600
+++ b/source/Lib/TLibCommon/ContextTables.h	Fri Jan 31 16:53:36 2014 -0600
@@ -56,8 +56,7 @@
 #define NUM_MERGE_FLAG_EXT_CTX        1       ///< number of context models for merge flag of merge extended
 #define NUM_MERGE_IDX_EXT_CTX         1       ///< number of context models for merge index of merge extended
 
-#define NUM_PART_SIZE_CTX             3       ///< number of context models for partition size
-#define NUM_CU_AMP_CTX                1       ///< number of context models for partition size (AMP)
+#define NUM_PART_SIZE_CTX             4       ///< number of context models for partition size
 #define NUM_PRED_MODE_CTX             1       ///< number of context models for prediction mode
 
 #define NUM_ADI_CTX                   1       ///< number of context models for intra prediction
@@ -78,7 +77,9 @@
 #define NUM_SIG_FLAG_CTX_LUMA         27      ///< number of context models for luma sig flag
 #define NUM_SIG_FLAG_CTX_CHROMA       15      ///< number of context models for chroma sig flag
 
-#define NUM_CTX_LAST_FLAG_XY          15      ///< number of context models for last coefficient position
+#define NUM_CTX_LAST_FLAG_XY          18      ///< number of context models for last coefficient position
+#define NUM_CTX_LAST_FLAG_XY_LUMA     15      ///< number of context models for last coefficient position of luma
+#define NUM_CTX_LAST_FLAG_XY_CHROMA    3      ///< number of context models for last coefficient position of chroma
 
 #define NUM_ONE_FLAG_CTX              24      ///< number of context models for greater than 1 flag
 #define NUM_ONE_FLAG_CTX_LUMA         16      ///< number of context models for greater than 1 flag of luma
@@ -87,7 +88,7 @@
 #define NUM_ABS_FLAG_CTX_LUMA          4      ///< number of context models for greater than 2 flag of luma
 #define NUM_ABS_FLAG_CTX_CHROMA        2      ///< number of context models for greater than 2 flag of chroma
 
-#define NUM_MVP_IDX_CTX               2       ///< number of context models for MVP index
+#define NUM_MVP_IDX_CTX               1       ///< number of context models for MVP index
 
 #define NUM_SAO_MERGE_FLAG_CTX        1       ///< number of context models for SAO merge flags
 #define NUM_SAO_TYPE_IDX_CTX          1       ///< number of context models for SAO type index
@@ -115,12 +116,11 @@
 #define OFF_SIG_CG_FLAG_CTX                 (OFF_QT_ROOT_CBF_CTX        +     NUM_QT_ROOT_CBF_CTX)
 #define OFF_SIG_FLAG_CTX                    (OFF_SIG_CG_FLAG_CTX        + 2 * NUM_SIG_CG_FLAG_CTX)
 #define OFF_CTX_LAST_FLAG_X                 (OFF_SIG_FLAG_CTX           +     NUM_SIG_FLAG_CTX)
-#define OFF_CTX_LAST_FLAG_Y                 (OFF_CTX_LAST_FLAG_X        + 2 * NUM_CTX_LAST_FLAG_XY)
-#define OFF_ONE_FLAG_CTX                    (OFF_CTX_LAST_FLAG_Y        + 2 * NUM_CTX_LAST_FLAG_XY)
+#define OFF_CTX_LAST_FLAG_Y                 (OFF_CTX_LAST_FLAG_X        +     NUM_CTX_LAST_FLAG_XY)
+#define OFF_ONE_FLAG_CTX                    (OFF_CTX_LAST_FLAG_Y        +     NUM_CTX_LAST_FLAG_XY)
 #define OFF_ABS_FLAG_CTX                    (OFF_ONE_FLAG_CTX           +     NUM_ONE_FLAG_CTX)
 #define OFF_MVP_IDX_CTX                     (OFF_ABS_FLAG_CTX           +     NUM_ABS_FLAG_CTX)
-#define OFF_CU_AMP_CTX                      (OFF_MVP_IDX_CTX            +     NUM_MVP_IDX_CTX)
-#define OFF_SAO_MERGE_FLAG_CTX              (OFF_CU_AMP_CTX             +     NUM_CU_AMP_CTX)
+#define OFF_SAO_MERGE_FLAG_CTX              (OFF_MVP_IDX_CTX            +     NUM_MVP_IDX_CTX)
 #define OFF_SAO_TYPE_IDX_CTX                (OFF_SAO_MERGE_FLAG_CTX     +     NUM_SAO_MERGE_FLAG_CTX)
 #define OFF_TRANSFORMSKIP_FLAG_CTX          (OFF_SAO_TYPE_IDX_CTX       +     NUM_SAO_TYPE_IDX_CTX)
 #define OFF_CU_TRANSQUANT_BYPASS_FLAG_CTX   (OFF_TRANSFORMSKIP_FLAG_CTX + 2 * NUM_TRANSFORMSKIP_FLAG_CTX)
@@ -157,7 +157,6 @@ uint8_t sbacInit(int qp, int initValue);
 // Tables
 // ====================================================================================================================
 
-
 // initial probability for cu_transquant_bypass flag
 static const uint8_t
     INIT_CU_TRANSQUANT_BYPASS_FLAG[3][NUM_CU_TRANSQUANT_BYPASS_FLAG_CTX] =
@@ -203,17 +202,9 @@ static const uint8_t
 static const uint8_t
     INIT_PART_SIZE[3][NUM_PART_SIZE_CTX] =
 {
-    { 154,  139,  CNU, },
-    { 154,  139,  CNU, },
-    { 184,  CNU,  CNU, },
-};
-
-static const uint8_t
-    INIT_CU_AMP_POS[3][NUM_CU_AMP_CTX] =
-{
-    { 154, },
-    { 154, },
-    { CNU, },
+    { 154,  139,  154, 154 },
+    { 154,  139,  154, 154 },
+    { 184,  CNU,  CNU, CNU },
 };
 
 static const uint8_t
@@ -275,9 +266,9 @@ static const uint8_t
 static const uint8_t
     INIT_QT_CBF[3][2 * NUM_QT_CBF_CTX] =
 {
-    { 153,  111,  CNU,  CNU,  149,   92,  167,  CNU, },
-    { 153,  111,  CNU,  CNU,  149,  107,  167,  CNU, },
-    { 111,  141,  CNU,  CNU,   94,  138,  182,  CNU, },
+    { 153,  111,  CNU,  CNU,  149,   92,  167,  154, },
+    { 153,  111,  CNU,  CNU,  149,  107,  167,  154, },
+    { 111,  141,  CNU,  CNU,   94,  138,  182,  154, },
 };
 
 static const uint8_t
@@ -289,14 +280,14 @@ static const uint8_t
 };
 
 static const uint8_t
-    INIT_LAST[3][2 * NUM_CTX_LAST_FLAG_XY] =
+    INIT_LAST[3][NUM_CTX_LAST_FLAG_XY] =
 {
     { 125,  110,  124,  110,   95,   94,  125,  111,  111,   79,  125,  126,  111,  111,   79,
-      108,  123,   93,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU, },
+      108,  123,   93 },
     { 125,  110,   94,  110,   95,   79,  125,  111,  110,   78,  110,  111,  111,   95,   94,
-      108,  123,  108,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU, },
+      108,  123,  108 },
     { 110,  110,  124,  125,  140,  153,  125,  127,  140,  109,  111,  143,  127,  111,   79,
-      108,  123,   63,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU,  CNU, },
+      108,  123,   63 },
 };
 
 static const uint8_t
@@ -337,9 +328,9 @@ static const uint8_t
 static const uint8_t
     INIT_MVP_IDX[3][NUM_MVP_IDX_CTX] =
 {
-    { 168,  CNU, },
-    { 168,  CNU, },
-    { CNU,  CNU, },
+    { 168 },
+    { 168 },
+    { CNU },
 };
 
 static const uint8_t
diff -r 564eefbb3812 -r 413ad959a5c6 source/Lib/TLibCommon/TComDataCU.cpp
--- a/source/Lib/TLibCommon/TComDataCU.cpp	Thu Jan 30 17:34:31 2014 -0600
+++ b/source/Lib/TLibCommon/TComDataCU.cpp	Fri Jan 31 16:53:36 2014 -0600
@@ -98,8 +98,6 @@ TComDataCU::TComDataCU()
     m_cuColocated[1] = NULL;
     m_mvpIdx[0] = NULL;
     m_mvpIdx[1] = NULL;
-    m_mvpNum[0] = NULL;
-    m_mvpNum[1] = NULL;
     m_chromaFormat = 0;
 }
 
@@ -152,8 +150,6 @@ void TComDataCU::create(uint32_t numPart
 
     m_mvpIdx[0] = new char[numPartition];
     m_mvpIdx[1] = new char[numPartition];
-    m_mvpNum[0] = new char[numPartition];
-    m_mvpNum[1] = new char[numPartition];
 
     m_trCoeffY  = (TCoeff*)X265_MALLOC(TCoeff, width * height);
     m_trCoeffCb = (TCoeff*)X265_MALLOC(TCoeff, (width >> m_hChromaShift) * (height >> m_vChromaShift));
@@ -203,10 +199,6 @@ void TComDataCU::destroy()
     m_mvpIdx[0] = NULL;
     delete[] m_mvpIdx[1];
     m_mvpIdx[1] = NULL;
-    delete[] m_mvpNum[0];
-    m_mvpNum[0] = NULL;
-    delete[] m_mvpNum[1];
-    m_mvpNum[1] = NULL;
     delete[] m_skipFlag;
     m_skipFlag = NULL;
     delete[] m_partSizes;
@@ -254,49 +246,45 @@ void TComDataCU::initCU(TComPic* pic, ui
     }
 
     // CHECK_ME: why partStartIdx always negative
-    int partStartIdx = 0 - (cuAddr) * pic->getNumPartInCU();
-    int firstElement = std::max<int>(partStartIdx, 0);
-    int numElements = m_numPartitions - firstElement;
-
-    if (numElements > 0)
+    int numElements = m_numPartitions;
+    assert(numElements > 0);
+
     {
-        memset(m_skipFlag         + firstElement, false,                    numElements * sizeof(*m_skipFlag));
-        memset(m_predModes        + firstElement, MODE_NONE,                numElements * sizeof(*m_predModes));
-        memset(m_cuTransquantBypass + firstElement, false,                  numElements * sizeof(*m_cuTransquantBypass));
-        memset(m_depth            + firstElement, 0,                        numElements * sizeof(*m_depth));
-        memset(m_trIdx            + firstElement, 0,                        numElements * sizeof(*m_trIdx));
-        memset(m_transformSkip[0] + firstElement, 0,                        numElements * sizeof(*m_transformSkip[0]));
-        memset(m_transformSkip[1] + firstElement, 0,                        numElements * sizeof(*m_transformSkip[1]));
-        memset(m_transformSkip[2] + firstElement, 0,                        numElements * sizeof(*m_transformSkip[2]));
-        memset(m_width            + firstElement, g_maxCUWidth,             numElements * sizeof(*m_width));
-        memset(m_height           + firstElement, g_maxCUHeight,            numElements * sizeof(*m_height));
-        memset(m_mvpNum[0]        + firstElement, -1,                       numElements * sizeof(*m_mvpNum[0]));
-        memset(m_mvpNum[1]        + firstElement, -1,                       numElements * sizeof(*m_mvpNum[1]));
-        memset(m_qp               + firstElement, qp,                       numElements * sizeof(*m_qp));
-        memset(m_bMergeFlags      + firstElement, false,                    numElements * sizeof(*m_bMergeFlags));
-        memset(m_mergeIndex       + firstElement, 0,                        numElements * sizeof(*m_mergeIndex));
-        memset(m_lumaIntraDir     + firstElement, DC_IDX,                   numElements * sizeof(*m_lumaIntraDir));
-        memset(m_chromaIntraDir   + firstElement, 0,                        numElements * sizeof(*m_chromaIntraDir));
-        memset(m_interDir         + firstElement, 0,                        numElements * sizeof(*m_interDir));
-        memset(m_cbf[0]           + firstElement, 0,                        numElements * sizeof(*m_cbf[0]));
-        memset(m_cbf[1]           + firstElement, 0,                        numElements * sizeof(*m_cbf[1]));
-        memset(m_cbf[2]           + firstElement, 0,                        numElements * sizeof(*m_cbf[2]));
-        memset(m_iPCMFlags        + firstElement, false,                    numElements * sizeof(*m_iPCMFlags));
+        memset(m_skipFlag         , false,                    numElements * sizeof(*m_skipFlag));
+        memset(m_predModes        , MODE_NONE,                numElements * sizeof(*m_predModes));
+        memset(m_cuTransquantBypass, false,                   numElements * sizeof(*m_cuTransquantBypass));
+        memset(m_depth            , 0,                        numElements * sizeof(*m_depth));
+        memset(m_trIdx            , 0,                        numElements * sizeof(*m_trIdx));
+        memset(m_transformSkip[0] , 0,                        numElements * sizeof(*m_transformSkip[0]));
+        memset(m_transformSkip[1] , 0,                        numElements * sizeof(*m_transformSkip[1]));
+        memset(m_transformSkip[2] , 0,                        numElements * sizeof(*m_transformSkip[2]));
+        memset(m_width            , g_maxCUWidth,             numElements * sizeof(*m_width));
+        memset(m_height           , g_maxCUHeight,            numElements * sizeof(*m_height));
+        memset(m_qp               , qp,                       numElements * sizeof(*m_qp));
+        memset(m_bMergeFlags      , false,                    numElements * sizeof(*m_bMergeFlags));
+        memset(m_mergeIndex       , 0,                        numElements * sizeof(*m_mergeIndex));
+        memset(m_lumaIntraDir     , DC_IDX,                   numElements * sizeof(*m_lumaIntraDir));
+        memset(m_chromaIntraDir   , 0,                        numElements * sizeof(*m_chromaIntraDir));
+        memset(m_interDir         , 0,                        numElements * sizeof(*m_interDir));
+        memset(m_cbf[0]           , 0,                        numElements * sizeof(*m_cbf[0]));
+        memset(m_cbf[1]           , 0,                        numElements * sizeof(*m_cbf[1]));
+        memset(m_cbf[2]           , 0,                        numElements * sizeof(*m_cbf[2]));
+        memset(m_iPCMFlags        , false,                    numElements * sizeof(*m_iPCMFlags));
     }
 
     uint32_t y_tmp = g_maxCUWidth * g_maxCUHeight;
     uint32_t c_tmp = (g_maxCUWidth >> m_hChromaShift) * (g_maxCUHeight >> m_vChromaShift);
-    if (0 >= partStartIdx)
     {
         m_cuMvField[0].clearMvField();
         m_cuMvField[1].clearMvField();
-        memset(m_trCoeffY, 0, sizeof(TCoeff) * y_tmp);
-        memset(m_iPCMSampleY, 0, sizeof(Pel) * y_tmp);
-
-        memset(m_trCoeffCb, 0, sizeof(TCoeff) * c_tmp);
-        memset(m_trCoeffCr, 0, sizeof(TCoeff) * c_tmp);
-        memset(m_iPCMSampleCb, 0, sizeof(Pel) * c_tmp);
-        memset(m_iPCMSampleCr, 0, sizeof(Pel) * c_tmp);
+
+        // TODO: can be remove, but I haven't data to verify it, remove later
+        if (getSlice()->getSPS()->getUsePCM())
+        {
+            memset(m_iPCMSampleY, 0, sizeof(Pel) * y_tmp);
+            memset(m_iPCMSampleCb, 0, sizeof(Pel) * c_tmp);
+            memset(m_iPCMSampleCr, 0, sizeof(Pel) * c_tmp);
+        }
     }
 
     // Setting neighbor CU
@@ -360,8 +348,6 @@ void TComDataCU::initEstData(uint32_t de
     {
         m_mvpIdx[0][i] = -1;
         m_mvpIdx[1][i] = -1;
-        m_mvpNum[0][i] = -1;
-        m_mvpNum[1][i] = -1;
         m_depth[i] = depth;
         m_width[i] = width;
         m_height[i] = height;
@@ -448,8 +434,6 @@ void TComDataCU::initSubCU(TComDataCU* c
         m_cuTransquantBypass[i] = false;
         m_mvpIdx[0][i] = -1;
         m_mvpIdx[1][i] = -1;