[x265-commits] [x265] dpb: does not need to include frameencoder.h

Steve Borho steve at borho.org
Mon Sep 15 17:19:47 CEST 2014


details:   http://hg.videolan.org/x265/rev/6bb0d3a25b08
branches:  
changeset: 8046:6bb0d3a25b08
user:      Steve Borho <steve at borho.org>
date:      Thu Sep 11 12:42:28 2014 +0200
description:
dpb: does not need to include frameencoder.h
Subject: [x265] search: remove x prefixes from ME helper functions

details:   http://hg.videolan.org/x265/rev/e9e71ece1344
branches:  
changeset: 8047:e9e71ece1344
user:      Steve Borho <steve at borho.org>
date:      Thu Sep 11 12:43:22 2014 +0200
description:
search: remove x prefixes from ME helper functions
Subject: [x265] analysis: minor comment and code cleanups, no behavior change

details:   http://hg.videolan.org/x265/rev/3d6cc40ebbf7
branches:  
changeset: 8048:3d6cc40ebbf7
user:      Steve Borho <steve at borho.org>
date:      Thu Sep 11 13:17:32 2014 +0200
description:
analysis: minor comment and code cleanups, no behavior change
Subject: [x265] api: add analysis data structures and param options

details:   http://hg.videolan.org/x265/rev/e5a24e5ba46e
branches:  
changeset: 8049:e5a24e5ba46e
user:      Sagar Kotecha <sagar at multicorewareinc.com>
date:      Thu Sep 11 19:18:40 2014 +0530
description:
api: add analysis data structures and param options
Subject: [x265] api: introduce methods to allocate and free analysis buffers

details:   http://hg.videolan.org/x265/rev/baf07b965909
branches:  
changeset: 8050:baf07b965909
user:      Sagar Kotecha <sagar at multicorewareinc.com>
date:      Thu Sep 11 19:21:37 2014 +0530
description:
api: introduce methods to allocate and free analysis buffers
Subject: [x265] store analysis information in buffers

details:   http://hg.videolan.org/x265/rev/b0d006337801
branches:  
changeset: 8051:b0d006337801
user:      Sagar Kotecha <sagar at multicorewareinc.com>
date:      Thu Sep 11 19:23:25 2014 +0530
description:
store analysis information in buffers
Subject: [x265] cli: add cli options analysis-mode and analysis-file

details:   http://hg.videolan.org/x265/rev/7e29b10982d2
branches:  
changeset: 8052:7e29b10982d2
user:      Sagar Kotecha <sagar at multicorewareinc.com>
date:      Thu Sep 11 19:24:28 2014 +0530
description:
cli: add cli options analysis-mode and analysis-file

analysis-mode: save|1 - Dump analysis buffers into file, load|2 - read analysis buffers from the file
analysis-file: Specify file name used for either dumping or reading analysis data
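
As a rough illustration of the name-or-number convention above (off|0, save|1, load|2), here is a minimal C++ sketch. The enum names and the helper parseAnalysisMode() are hypothetical and exist only for this example; the patch itself routes the option through parseName() with x265_analysis_names, whose implementation is not shown here.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical constants for illustration; the names in x265.h may differ.
enum AnalysisMode { ANALYSIS_OFF = 0, ANALYSIS_SAVE = 1, ANALYSIS_LOAD = 2 };

// Map "off"/"save"/"load" or "0"/"1"/"2" to a mode value.
// Sets bError and returns ANALYSIS_OFF when the input is not recognized.
int parseAnalysisMode(const char* value, bool& bError)
{
    static const char* const names[] = { "off", "save", "load" };

    // First try the symbolic names.
    for (int i = 0; i < 3; i++)
        if (!std::strcmp(value, names[i]))
            return i;

    // Then try a plain decimal integer in the range 0..2.
    char* end = nullptr;
    long n = std::strtol(value, &end, 10);
    if (end == value || *end != '\0' || n < 0 || n > 2)
    {
        bError = true;
        return ANALYSIS_OFF;
    }
    return static_cast<int>(n);
}
```

With this sketch, "save" and "1" both select save mode, and anything outside the three names or 0..2 is flagged as an error rather than silently accepted.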
Subject: [x265] asm: disable buggy denoise primitives until the bugs are fixed

details:   http://hg.videolan.org/x265/rev/d522e7662111
branches:  stable
changeset: 8053:d522e7662111
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Fri Sep 12 16:41:31 2014 +0530
description:
asm: disable buggy denoise primitives until the bugs are fixed
Subject: [x265] Merge with stable

details:   http://hg.videolan.org/x265/rev/fda32ff40246
branches:  
changeset: 8054:fda32ff40246
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Fri Sep 12 16:44:47 2014 +0530
description:
Merge with stable
Subject: [x265] Resolved gcc compiler error of mismatched type

details:   http://hg.videolan.org/x265/rev/cd8fd0afd4e8
branches:  
changeset: 8055:cd8fd0afd4e8
user:      David T Yuen <dtyx265 at gmail.com>
date:      Thu Sep 11 17:25:40 2014 -0700
description:
Resolved gcc compiler error of mismatched type
Subject: [x265] x265: add missing typedefs

details:   http://hg.videolan.org/x265/rev/2fb61cc75152
branches:  
changeset: 8056:2fb61cc75152
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Sat Sep 13 22:43:52 2014 +0530
description:
x265: add missing typedefs
Subject: [x265] asm: avx2 assembly code for dct32x32

details:   http://hg.videolan.org/x265/rev/184e56afa951
branches:  
changeset: 8057:184e56afa951
user:      Murugan Vairavel <murugan at multicorewareinc.com>
date:      Fri Sep 12 12:02:46 2014 +0530
description:
asm: avx2 assembly code for dct32x32
Subject: [x265] asm: fix mismatch due to dct32 avx2 assembly code

details:   http://hg.videolan.org/x265/rev/b5fb734517c0
branches:  
changeset: 8058:b5fb734517c0
user:      Murugan Vairavel <murugan at multicorewareinc.com>
date:      Mon Sep 15 13:47:05 2014 +0530
description:
asm: fix mismatch due to dct32 avx2 assembly code
Subject: [x265] Search: remove redundant encode coefficients in intra for performance

details:   http://hg.videolan.org/x265/rev/8972169f252d
branches:  
changeset: 8059:8972169f252d
user:      Ashok Kumar Mishra<ashok at multicorewareinc.com>
date:      Wed Sep 10 15:03:40 2014 +0530
description:
Search: remove redundant encode coefficients in intra for performance
Subject: [x265] rc: bug fix for 2 pass when bframes = 0. fixes Issue #77

details:   http://hg.videolan.org/x265/rev/9107dc4a2632
branches:  
changeset: 8060:9107dc4a2632
user:      Aarthi Thirumalai
date:      Mon Sep 15 16:08:30 2014 +0530
description:
rc: bug fix for 2 pass when bframes = 0. fixes Issue #77
Subject: [x265] rc: check for changes in scenecut input between multiple passes.

details:   http://hg.videolan.org/x265/rev/67ee212bbf78
branches:  
changeset: 8061:67ee212bbf78
user:      Aarthi Thirumalai
date:      Mon Sep 15 16:09:52 2014 +0530
description:
rc: check for changes in scenecut input between multiple passes.

wpp/no-wpp doesn't affect slice type decisions, so these settings can differ
between the passes in a multipass encode.
Subject: [x265] rc: bug fix for 2 pass when bframes = 0. fixes Issue #77

details:   http://hg.videolan.org/x265/rev/e6a80fb007e8
branches:  stable
changeset: 8062:e6a80fb007e8
user:      Aarthi Thirumalai
date:      Mon Sep 15 16:08:30 2014 +0530
description:
rc: bug fix for 2 pass when bframes = 0. fixes Issue #77
Subject: [x265] Merge with stable

details:   http://hg.videolan.org/x265/rev/70c836fef6d9
branches:  
changeset: 8063:70c836fef6d9
user:      Steve Borho <steve at borho.org>
date:      Mon Sep 15 13:37:10 2014 +0200
description:
Merge with stable
Subject: [x265] search: comment nits

details:   http://hg.videolan.org/x265/rev/017ceb9d2b06
branches:  
changeset: 8064:017ceb9d2b06
user:      Steve Borho <steve at borho.org>
date:      Mon Sep 15 14:03:07 2014 +0200
description:
search: comment nits
Subject: [x265] search: measure RDO of intra modes within 12% of least cost [CHANGES OUTPUTS]

details:   http://hg.videolan.org/x265/rev/02353d20f051
branches:  
changeset: 8065:02353d20f051
user:      Steve Borho <steve at borho.org>
date:      Wed Sep 10 12:35:55 2014 +0200
description:
search: measure RDO of intra modes within 12% of least cost [CHANGES OUTPUTS]

This version adapts the number of RD measured modes by param.rdLevel (aka preset)
and by depth. This gives a non-trivial speedup to the very fast presets which
use frequent keyframes and helps improve compression in slower presets.

All presets use this function to encode I slices, so every encode is affected.

Previous behavior:
  RD measure top N least sa8d cost intra modes and all most probable modes where
  N was depth-based: intraModeNumFast[] = { 8, 8, 3, 3, 3 }; // 4x4, 8x8, etc

New behavior:
  RD measure up to N modes that are within 12% of best sa8d cost or are most
  probable, where N is a function of rd-level and depth

The new behavior may measure fewer modes than before and may skip some
most-probable modes if there are plenty of other modes which are near the best
cost. Since mode signal cost is included already, this seems ok.

The general idea is that if 1-2 modes have much better sa8d cost than all the
others, then we are likely wasting our time RD measuring 8-11 modes. We're
betting that sa8d cost is a somewhat decent predictor of RD cost.

Note that I initially tried without a limit (measure all within 12% or MPM) but
for some clips this was a horrible perf trade-off. In some situations all the
intra modes might measure close together (flat source block) and we would end
up measuring most or all of the intra modes for very little gain. So this
version re-introduces a "top N candidate list" but does not bother trying to
keep the list sorted since it is small.
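
The selection rule described above can be sketched roughly as follows. This is not the code from search.cpp: the struct IntraCandidate, the function selectRdCandidates(), and the exact threshold arithmetic (best + best * 12 / 100) are assumptions made for illustration. It keeps a small unsorted list of at most N candidates and, once the list is full, replaces the current worst entry only when a cheaper qualifying mode arrives, which is why a most-probable mode can be dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical candidate record; x265's real structures differ.
struct IntraCandidate
{
    int      mode;  // intra mode index
    uint64_t cost;  // sa8d-based cost, assumed to include mode signal cost
};

// Pick up to maxCandidates modes for full RD measurement.
// A mode qualifies if it is a most probable mode (MPM) or if its cost is
// within 12% of the lowest cost. The list is deliberately left unsorted.
std::vector<IntraCandidate> selectRdCandidates(const std::vector<uint64_t>& sa8dCost,
                                               const std::vector<int>& mpm,
                                               size_t maxCandidates)
{
    std::vector<IntraCandidate> picked;
    if (sa8dCost.empty() || maxCandidates == 0)
        return picked;

    uint64_t best = sa8dCost[0];
    for (uint64_t c : sa8dCost)
        if (c < best)
            best = c;
    const uint64_t threshold = best + best * 12 / 100;  // assumed 12% margin

    for (size_t m = 0; m < sa8dCost.size(); m++)
    {
        bool isMpm = false;
        for (int p : mpm)
            if (p == static_cast<int>(m))
                isMpm = true;
        if (!isMpm && sa8dCost[m] > threshold)
            continue;  // neither close to the best cost nor most probable

        if (picked.size() < maxCandidates)
        {
            picked.push_back({ static_cast<int>(m), sa8dCost[m] });
            continue;
        }

        // List is full: find the worst entry with a linear scan (list is small).
        size_t worst = 0;
        for (size_t i = 1; i < picked.size(); i++)
            if (picked[i].cost > picked[worst].cost)
                worst = i;
        if (sa8dCost[m] < picked[worst].cost)
            picked[worst] = { static_cast<int>(m), sa8dCost[m] };
    }

    assert(picked.size() <= maxCandidates);
    return picked;
}
```

For example, with costs {100, 105, 200, 111, 113, 300}, mode 2 as the only MPM, and N = 3, the sketch first admits modes 0, 1 and 2, then replaces the MPM (cost 200) with mode 3 (cost 111); modes 4 and 5 fall outside the 12% margin and are never considered.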
Subject: [x265] search: header cleanups, no functional change

details:   http://hg.videolan.org/x265/rev/db063839c8fe
branches:  
changeset: 8066:db063839c8fe
user:      Steve Borho <steve at borho.org>
date:      Mon Sep 15 13:58:42 2014 +0200
description:
search: header cleanups, no functional change
Subject: [x265] sao: some cleanups

details:   http://hg.videolan.org/x265/rev/098a00de4a72
branches:  
changeset: 8067:098a00de4a72
user:      Satoshi Nakagawa <nakagawa424 at oki.com>
date:      Fri Sep 12 11:01:54 2014 +0900
description:
sao: some cleanups
Subject: [x265] doc: fix typo and nit in threading page

details:   http://hg.videolan.org/x265/rev/76240da72c38
branches:  
changeset: 8068:76240da72c38
user:      Steve Borho <steve at borho.org>
date:      Mon Sep 15 14:41:50 2014 +0200
description:
doc: fix typo and nit in threading page
Subject: [x265] doc: describe performance impact of SAO

details:   http://hg.videolan.org/x265/rev/dff0cd55b520
branches:  
changeset: 8069:dff0cd55b520
user:      Steve Borho <steve at borho.org>
date:      Mon Sep 15 14:43:19 2014 +0200
description:
doc: describe performance impact of SAO
Subject: [x265] param: preset tuning changes

details:   http://hg.videolan.org/x265/rev/1de67321275e
branches:  
changeset: 8070:1de67321275e
user:      Steve Borho <steve at borho.org>
date:      Mon Sep 15 15:00:13 2014 +0200
description:
param: preset tuning changes

1. disable SAO in superfast

Recent changes have made --no-sao substantially faster than SAO, which has
made ultrafast preset much much faster than superfast.  By disabling SAO in
superfast, it is now roughly half-way between ultrafast and veryfast again.

2. Enable weighted prediction for B slices in slower, veryslow, and placebo

Weighted prediction for B can sometimes be beneficial, so turn it on for slower
encodes.

diffstat:

 doc/reST/api.rst                     |   30 ++
 doc/reST/cli.rst                     |   18 +
 doc/reST/presets.rst                 |    4 +-
 doc/reST/threading.rst               |   23 +-
 source/CMakeLists.txt                |    2 +-
 source/Lib/TLibCommon/CommonDef.h    |    2 -
 source/common/common.h               |    9 +-
 source/common/frame.cpp              |    2 +
 source/common/frame.h                |    3 +
 source/common/param.cpp              |    5 +
 source/common/x86/asm-primitives.cpp |   10 +-
 source/common/x86/dct8.asm           |  328 ++++++++++++++++++++++++-
 source/common/x86/dct8.h             |    1 +
 source/common/x86/loopfilter.asm     |    2 +-
 source/encoder/analysis.cpp          |   81 ++---
 source/encoder/api.cpp               |   44 +++
 source/encoder/dpb.cpp               |    1 -
 source/encoder/encoder.cpp           |   13 +
 source/encoder/entropy.cpp           |    8 +-
 source/encoder/ratecontrol.cpp       |    4 +-
 source/encoder/sao.cpp               |  469 +++++++++++++---------------------
 source/encoder/sao.h                 |   14 +-
 source/encoder/search.cpp            |  266 ++++++-------------
 source/encoder/search.h              |  120 ++++----
 source/x265.cpp                      |  118 ++++++++-
 source/x265.def.in                   |    2 +
 source/x265.h                        |   66 ++++
 27 files changed, 1042 insertions(+), 603 deletions(-)

diffs (truncated from 3348 to 300 lines):

diff -r 012f315d3eda -r 1de67321275e doc/reST/api.rst
--- a/doc/reST/api.rst	Wed Sep 10 17:27:20 2014 +0200
+++ b/doc/reST/api.rst	Mon Sep 15 15:00:13 2014 +0200
@@ -223,6 +223,36 @@ Structures allocated from the library sh
 	void x265_picture_free(x265_picture *);
 
 
+Analysis Buffers
+================
+
+Analysis information can be saved and reused to between encodes of the
+same video sequence (generally for multiple bitrate encodes).  The best
+results are attained by saving the analysis information of the highest
+bitrate encode and reuse it in lower bitrate encodes.
+
+When saving or loading analysis data, buffers must be allocated for
+every picture passed into the encoder using::
+
+	/* x265_alloc_analysis_data:
+	 *  Allocate memory to hold analysis meta data, returns 1 on success else 0 */
+	int x265_alloc_analysis_data(x265_picture*);
+
+Note that this is very different from the typical semantics of
+**x265_picture**, which can be reused many times. The analysis buffers must
+be re-allocated for every input picture.
+
+Analysis buffers passed to the encoder are owned by the encoder until
+they pass the buffers back via an output **x265_picture**. The user is
+responsible for releasing the buffers when they are finished with them
+via::
+
+	/* x265_free_analysis_data:
+	 *  Use x265_free_analysis_data to release storage of members allocated by
+	 *  x265_alloc_analysis_data */
+	void x265_free_analysis_data(x265_picture*);
+
+
 Encode Process
 ==============
 
diff -r 012f315d3eda -r 1de67321275e doc/reST/cli.rst
--- a/doc/reST/cli.rst	Wed Sep 10 17:27:20 2014 +0200
+++ b/doc/reST/cli.rst	Mon Sep 15 15:00:13 2014 +0200
@@ -918,6 +918,24 @@ Quality, rate control and rate distortio
 	* :option:`--subme` = MIN(2, :option:`--subme`)
 	* :option:`--rd` = MIN(2, :option:`--rd`)
 
+.. option:: --analysis-mode <string|int>
+
+	Specify whether analysis information of each frame is output by encoder
+	or input for reuse. By reading the analysis data writen by an
+	earlier encode of the same sequence, substantial redundant work may
+	be avoided.
+
+	The following data may be stored and reused:
+	I frames   - split decisions and luma intra directions of all CUs.
+	P/B frames - motion vectors are dumped at each depth for all CUs.
+
+	**Values:** off(0), save(1): dump analysis data, load(2): read analysis data
+
+.. option:: --analysis-file <filename>
+
+	Specify a filename for analysis data (see :option:`--analysis-mode`)
+	If no filename is specified, x265_analysis.dat is used.
+
 Loop filters
 ============
 
diff -r 012f315d3eda -r 1de67321275e doc/reST/presets.rst
--- a/doc/reST/presets.rst	Wed Sep 10 17:27:20 2014 +0200
+++ b/doc/reST/presets.rst	Mon Sep 15 15:00:13 2014 +0200
@@ -52,12 +52,14 @@ The presets adjust encoder parameters to
 +--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
 | fast-cbf     |    1      |     1     |    1     |   1    |  0   |    0   |  0   |   0    |    0     |    0    |
 +--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
-| sao          |    0      |     1     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
+| sao          |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
 +--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
 | signhide     |    0      |     1     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
 +--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
 | weightp      |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
 +--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
+| weightb      |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |
++--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
 | aq-mode      |    0      |     0     |    2     |   2    |  2   |    2   |  2   |   2    |    2     |    2    |
 +--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
 | cuTree       |    0      |     0     |    0     |   0    |  1   |    1   |  1   |   1    |    1     |    1    |
diff -r 012f315d3eda -r 1de67321275e doc/reST/threading.rst
--- a/doc/reST/threading.rst	Wed Sep 10 17:27:20 2014 +0200
+++ b/doc/reST/threading.rst	Mon Sep 15 15:00:13 2014 +0200
@@ -13,7 +13,7 @@ the first encoder created within each pr
 
 :option:`--threads` specifies the number of threads the encoder will
 try to allocate for its thread pool.  If the thread pool was already
-allocated this parameter is ignored.  By default x265 allocated one
+allocated this parameter is ignored.  By default x265 allocates one
 thread per (hyperthreaded) CPU core in your system.
 
 Work distribution is job based.  Idle worker threads ask their parent
@@ -29,7 +29,7 @@ providers are recommended to call this m
 available.
 
 Worker jobs are not allowed to block except when abosultely necessary
-for data locking.  If a job becomes blocked, the worker thread is
+for data locking. If a job becomes blocked, the worker thread is
 expected to drop that job and go back to the pool and find more work.
 
 .. note::
@@ -206,3 +206,22 @@ The function slicetypeDecide() itself ma
 thread if your system has enough CPU cores to make this a beneficial
 trade-off, else it runs within the context of the thread which calls the
 x265_encoder_encode().
+
+SAO
+===
+
+The Sample Adaptive Offset loopfilter has a large effect on encode
+performance because of the peculiar way it must be analyzed and coded.
+
+SAO flags and data are encoded at the CTU level before the CTU itself is
+coded, but SAO analysis (deciding whether to enable SAO and with what
+parameters) cannot be performed until that CTU is completely analyzed
+(reconstructed pixels are available) as well as the CTUs to the right
+and below.  So in effect the encoder must perform SAO analysis in a
+wavefront at least a full row behind the CTU compression wavefront.
+
+This extra latency forces the encoder to save the encode data of every
+CTU until the entire frame has been analyzed, at which point a function
+can code the final slice bitstream with the decided SAO flags and data
+coded between each CTU.  This second pass over the CTUs can be
+expensive, particularly at large resolutions and high bitrates.
diff -r 012f315d3eda -r 1de67321275e source/CMakeLists.txt
--- a/source/CMakeLists.txt	Wed Sep 10 17:27:20 2014 +0200
+++ b/source/CMakeLists.txt	Mon Sep 15 15:00:13 2014 +0200
@@ -21,7 +21,7 @@ include(CheckSymbolExists)
 include(CheckCXXCompilerFlag)
 
 # X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 31)
+set(X265_BUILD 32)
 configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
                "${PROJECT_BINARY_DIR}/x265.def")
 configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"
diff -r 012f315d3eda -r 1de67321275e source/Lib/TLibCommon/CommonDef.h
--- a/source/Lib/TLibCommon/CommonDef.h	Wed Sep 10 17:27:20 2014 +0200
+++ b/source/Lib/TLibCommon/CommonDef.h	Mon Sep 15 15:00:13 2014 +0200
@@ -73,8 +73,6 @@
 #define SCAN_SET_SIZE               16
 #define LOG2_SCAN_SET_SIZE          4
 
-#define FAST_UDI_MAX_RDMODE_NUM     35 // maximum number of RD comparison in fast-UDI estimation loop
-
 #define ALL_IDX                     -1
 #define PLANAR_IDX                  0
 #define VER_IDX                     26 // index for intra VERTICAL   mode
diff -r 012f315d3eda -r 1de67321275e source/common/common.h
--- a/source/common/common.h	Wed Sep 10 17:27:20 2014 +0200
+++ b/source/common/common.h	Mon Sep 15 15:00:13 2014 +0200
@@ -63,6 +63,7 @@ extern "C" intptr_t x265_stack_align(voi
 #define ALIGN_VAR_16(T, var) __declspec(align(16)) T var
 #define ALIGN_VAR_32(T, var) __declspec(align(32)) T var
 #define x265_stack_align(func, ...) func(__VA_ARGS__)
+#define fseeko _fseeki64
 
 #endif // if defined(__GNUC__)
 
@@ -199,6 +200,8 @@ typedef int16_t  coeff_t;      // transf
 
 namespace x265 {
 
+enum { SAO_NUM_OFFSET = 4 };
+
 // NOTE: MUST be alignment to 16 or 32 bytes for asm code
 struct NoiseReduction
 {
@@ -214,9 +217,8 @@ struct SAOQTPart
     enum { NUM_DOWN_PART = 4 };
 
     int     bestType;
-    int     length;
     int     subTypeIdx;  // indicates EO class or BO band position
-    int     offset[4];
+    int     offset[SAO_NUM_OFFSET];
     int     startCUX;
     int     startCUY;
     int     endCUX;
@@ -244,10 +246,9 @@ struct SaoLcuParam
     bool mergeLeftFlag;
     int  typeIdx;
     int  subTypeIdx;    // indicates EO class or BO band position
-    int  offset[4];
+    int  offset[SAO_NUM_OFFSET];
     int  partIdx;
     int  partIdxTmp;
-    int  length;
 
     void reset()
     {
diff -r 012f315d3eda -r 1de67321275e source/common/frame.cpp
--- a/source/common/frame.cpp	Wed Sep 10 17:27:20 2014 +0200
+++ b/source/common/frame.cpp	Mon Sep 15 15:00:13 2014 +0200
@@ -51,6 +51,8 @@ Frame::Frame()
     m_avgQpRc = 0;
     m_avgQpAq = 0;
     m_bChromaPlanesExtended = false;
+    m_intraData = NULL;
+    m_interData = NULL;
 }
 
 Frame::~Frame()
diff -r 012f315d3eda -r 1de67321275e source/common/frame.h
--- a/source/common/frame.h	Wed Sep 10 17:27:20 2014 +0200
+++ b/source/common/frame.h	Mon Sep 15 15:00:13 2014 +0200
@@ -83,6 +83,9 @@ public:
     double            m_rateFactor; // calculated based on the Frame QP
     int32_t           m_forceqp;    // Force to use the qp specified in qp file
 
+    x265_intra_data*  m_intraData;  // intra analysis information
+    x265_inter_data*  m_interData;  // inter analysis information
+
     Frame();
     virtual ~Frame();
 
diff -r 012f315d3eda -r 1de67321275e source/common/param.cpp
--- a/source/common/param.cpp	Wed Sep 10 17:27:20 2014 +0200
+++ b/source/common/param.cpp	Mon Sep 15 15:00:13 2014 +0200
@@ -277,6 +277,7 @@ int x265_param_default_preset(x265_param
             param->rc.aqStrength = 0.0;
             param->rc.aqMode = X265_AQ_NONE;
             param->rc.cuTree = 0;
+            param->bEnableSAO = 0;
             param->bEnableFastIntra = 1;
         }
         else if (!strcmp(preset, "veryfast"))
@@ -326,6 +327,7 @@ int x265_param_default_preset(x265_param
         }
         else if (!strcmp(preset, "slower"))
         {
+            param->bEnableWeightedBiPred = 1;
             param->bEnableAMP = 1;
             param->bEnableRectInter = 1;
             param->lookaheadDepth = 30;
@@ -339,6 +341,7 @@ int x265_param_default_preset(x265_param
         }
         else if (!strcmp(preset, "veryslow"))
         {
+            param->bEnableWeightedBiPred = 1;
             param->bEnableAMP = 1;
             param->bEnableRectInter = 1;
             param->lookaheadDepth = 40;
@@ -353,6 +356,7 @@ int x265_param_default_preset(x265_param
         }
         else if (!strcmp(preset, "placebo"))
         {
+            param->bEnableWeightedBiPred = 1;
             param->bEnableAMP = 1;
             param->bEnableRectInter = 1;
             param->lookaheadDepth = 60;
@@ -658,6 +662,7 @@ int x265_param_parse(x265_param *p, cons
     OPT("me")        p->searchMethod = parseName(value, x265_motion_est_names, bError);
     OPT("cutree")    p->rc.cuTree = atobool(value);
     OPT("slow-firstpass") p->rc.bEnableSlowFirstPass = atobool(value);
+    OPT("analysis-mode") p->analysisMode = parseName(value, x265_analysis_names, bError);
     OPT("sar")
     {
         p->vui.aspectRatioIdc = parseName(value, x265_sar_names, bError);
diff -r 012f315d3eda -r 1de67321275e source/common/x86/asm-primitives.cpp
--- a/source/common/x86/asm-primitives.cpp	Wed Sep 10 17:27:20 2014 +0200
+++ b/source/common/x86/asm-primitives.cpp	Mon Sep 15 15:00:13 2014 +0200
@@ -1446,6 +1446,7 @@ void Setup_Assembly_Primitives(EncoderPr
         p.dequant_normal = x265_dequant_normal_avx2;
 #if X86_64
         p.dct[DCT_16x16] = x265_dct16_avx2;
+        p.dct[DCT_32x32] = x265_dct32_avx2;
 #endif
     }
     /* at HIGH_BIT_DEPTH, pixel == short so we can reuse a number of primitives */
@@ -1564,7 +1565,7 @@ void Setup_Assembly_Primitives(EncoderPr
         p.idct[IDCT_4x4] = x265_idct4_sse2;
         p.idct[IDST_4x4] = x265_idst4_sse2;
         p.planecopy_sp = x265_downShift_16_sse2;
-        p.denoiseDct = x265_denoise_dct_sse2;
+        //p.denoiseDct = x265_denoise_dct_sse2;
         p.copy_shl[BLOCK_4x4] = x265_copy_shl_4_sse2;
         p.copy_shl[BLOCK_8x8] = x265_copy_shl_8_sse2;
         p.copy_shl[BLOCK_16x16] = x265_copy_shl_16_sse2;
@@ -1604,7 +1605,7 @@ void Setup_Assembly_Primitives(EncoderPr
         p.dct[DST_4x4] = x265_dst4_ssse3;
         p.idct[IDCT_8x8] = x265_idct8_ssse3;
         p.count_nonzero = x265_count_nonzero_ssse3;
-        p.denoiseDct = x265_denoise_dct_ssse3;
+        //p.denoiseDct = x265_denoise_dct_ssse3;
     }
     if (cpuMask & X265_CPU_SSE4)
     {
@@ -1708,7 +1709,7 @@ void Setup_Assembly_Primitives(EncoderPr
 
         p.ssim_4x4x2_core = x265_pixel_ssim_4x4x2_core_avx;
         p.ssim_end_4 = x265_pixel_ssim_end4_avx;
-        p.denoiseDct = x265_denoise_dct_avx;

