[x265] [PATCH] perf: Enabling lookahead-slices for all presets except veryslow & placebo
Deepthi Nandakumar
deepthi at multicorewareinc.com
Wed Nov 4 11:57:12 CET 2015
The logic for disabling lookahead slices at resolutions below 720p does not
seem to be included in this patch?
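
For reference, a minimal sketch of the kind of check the commit message and
the cli.rst text describe. It is not taken from the x265 source: the function
name clampLookaheadSlices and its parameters are hypothetical, and the cap is
derived only from the documented rule of at least 10 rows of 16x16 blocks per
slice (720 / 16 = 45 rows gives 4 slices, 1080 / 16 = 67 rows gives 6).

```cpp
#include <algorithm>

// Hypothetical helper, not part of x265: returns the number of lookahead
// slices to use for a requested slice count and a source height in pixels.
int clampLookaheadSlices(int requestedSlices, int sourceHeight)
{
    const int kMinHeight = 720;    // documented: slicing is auto-disabled below 720p
    const int kRowsPerSlice = 10;  // documented: at least 10 rows of 16x16 blocks per slice
    const int kMaxSlices = 16;     // documented maximum for --lookahead-slices

    // 0 and 1 both mean "disabled"; small pictures never use slices.
    if (requestedSlices <= 1 || sourceHeight < kMinHeight)
        return 0;

    int rows16 = sourceHeight / 16;  // whole 16x16 rows in the picture
    int cap = std::min(kMaxSlices, rows16 / kRowsPerSlice);
    int slices = std::min(requestedSlices, cap);

    // A single slice is the same as no slicing.
    return slices > 1 ? slices : 0;
}
```

With the proposed default of 8, this sketch would yield 4 slices at 720p,
6 at 1080p, and 0 (disabled) at 480p.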
On Wed, Nov 4, 2015 at 12:27 PM, Pradeep Ramachandran <pradeep at multicorewareinc.com> wrote:
> # HG changeset patch
> # User Pradeep Ramachandran <pradeep at multicorewareinc.com>
> # Date 1446620236 -19800
> # Wed Nov 04 12:27:16 2015 +0530
> # Node ID f04e0fb7b82e98b5bab4e8ae1cfb3caf92f8c277
> # Parent 02db15e14351c7f5190203a087db285655961205
> perf: Enabling lookahead-slices for all presets except veryslow & placebo
>
> Seeing a ~10% performance improvement on the faster presets on Skylake, and
> ~2X performance on Xeon systems in the ultrafast setting. The improvement
> comes from a considerable increase in utilization across the board. Slicing
> is disabled for videos of resolution < 720p to limit the impact on quality.
>
> Commit will change outputs, but reduction in quality (measured by PSNR or SSIM)
> is <0.01% across a wide variety of presets and runs.
>
> diff -r 02db15e14351 -r f04e0fb7b82e doc/reST/cli.rst
> --- a/doc/reST/cli.rst Thu Oct 29 15:22:36 2015 +0530
> +++ b/doc/reST/cli.rst Wed Nov 04 12:27:16 2015 +0530
> @@ -1124,21 +1124,31 @@
>
> .. option:: --lookahead-slices <0..16>
>
> - Use multiple worker threads to measure the estimated cost of each
> - frame within the lookahead. When :option:`--b-adapt` is 2, most
> - frame cost estimates will be performed in batch mode, many cost
> - estimates at the same time, and lookahead-slices is ignored for
> - batched estimates. The effect on performance can be quite small.
> - The higher this parameter, the less accurate the frame costs will be
> - (since context is lost across slice boundaries) which will result in
> - less accurate B-frame and scene-cut decisions.
> + Use multiple worker threads to measure the estimated cost of each frame
> + within the lookahead. The frame is divided into the specified number of
> + slices, and one thread is launched per slice. When :option:`--b-adapt` is
> + 2, most frame cost estimates will be performed in batch mode (many cost
> + estimates at the same time) and lookahead-slices is ignored for batched
> + estimates; it may still be used for single cost estimations. The higher this
> + parameter, the less accurate the frame costs will be (since context is lost
> + across slice boundaries) which will result in less accurate B-frame and
> + scene-cut decisions. The effect on performance can be significant especially
> + on systems with many threads.
>
> - The encoder may internally lower the number of slices to ensure
> - each slice codes at least 10 16x16 rows of lowres blocks. If slices
> - are used in lookahead, they are logged in the list of tools as
> - *lslices*.
> -
> - **Values:** 0 - disabled (default). 1 is the same as 0. Max 16
> + The encoder may internally lower the number of slices or disable
> + slicing to ensure each slice codes at least 10 16x16 rows of lowres
> + blocks to minimize the impact on quality. For example, for 720p and
> + 1080p videos, the number of slices is capped to 4 and 6, respectively.
> + For resolutions lower than 720p, slicing is auto-disabled.
> +
> + If slices are used in lookahead, they are logged in the list of tools
> + as *lslices*.
> +
> + **Values:** 0 - disabled. 1 is the same as 0. Max 16.
> + Default: 8 for ultrafast, superfast, veryfast, faster, fast, medium
> + 4 for slow, slower
> + disabled for veryslow, placebo
> +
>
> .. option:: --b-adapt <integer>
>
> diff -r 02db15e14351 -r f04e0fb7b82e doc/reST/presets.rst
> --- a/doc/reST/presets.rst Thu Oct 29 15:22:36 2015 +0530
> +++ b/doc/reST/presets.rst Wed Nov 04 12:27:16 2015 +0530
> @@ -19,61 +19,63 @@
>
> The presets adjust encoder parameters to affect these trade-offs.
>
>
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -|              | ultrafast | superfast | veryfast | faster | fast | medium | slow | slower | veryslow | placebo |
> -+==============+===========+===========+==========+========+======+========+======+========+==========+=========+
> -| ctu          | 32        | 32        | 32       | 64     | 64   | 64     | 64   | 64     | 64       | 64      |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| min-cu-size  | 16        | 8         | 8        | 8      | 8    | 8      | 8    | 8      | 8        | 8       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| bframes      | 3         | 3         | 4        | 4      | 4    | 4      | 4    | 8      | 8        | 8       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| b-adapt      | 0         | 0         | 0        | 0      | 0    | 2      | 2    | 2      | 2        | 2       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| rc-lookahead | 5         | 10        | 15       | 15     | 15   | 20     | 25   | 30     | 40       | 60      |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| scenecut     | 0         | 40        | 40       | 40     | 40   | 40     | 40   | 40     | 40       | 40      |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| ref          | 1         | 1         | 1        | 1      | 2    | 3      | 3    | 3      | 5        | 5       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| me           | dia       | hex       | hex      | hex    | hex  | hex    | star | star   | star     | star    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| merange      | 57        | 57        | 57       | 57     | 57   | 57     | 57   | 57     | 57       | 92      |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| subme        | 0         | 1         | 1        | 2      | 2    | 2      | 3    | 3      | 4        | 5       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| rect         | 0         | 0         | 0        | 0      | 0    | 0      | 1    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| amp          | 0         | 0         | 0        | 0      | 0    | 0      | 0    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| max-merge    | 2         | 2         | 2        | 2      | 2    | 2      | 3    | 3      | 4        | 5       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| early-skip   | 1         | 1         | 1        | 1      | 0    | 0      | 0    | 0      | 0        | 0       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| fast-intra   | 1         | 1         | 1        | 1      | 1    | 0      | 0    | 0      | 0        | 0       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| b-intra      | 0         | 0         | 0        | 0      | 0    | 0      | 0    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| sao          | 0         | 0         | 1        | 1      | 1    | 1      | 1    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| signhide     | 0         | 1         | 1        | 1      | 1    | 1      | 1    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| weightp      | 0         | 0         | 1        | 1      | 1    | 1      | 1    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| weightb      | 0         | 0         | 0        | 0      | 0    | 0      | 0    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| aq-mode      | 0         | 0         | 1        | 1      | 1    | 1      | 1    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| cuTree       | 0         | 0         | 0        | 0      | 1    | 1      | 1    | 1      | 1        | 1       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| rdLevel      | 2         | 2         | 2        | 2      | 2    | 3      | 4    | 6      | 6        | 6       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| rdoq-level   | 0         | 0         | 0        | 0      | 0    | 0      | 2    | 2      | 2        | 2       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| tu-intra     | 1         | 1         | 1        | 1      | 1    | 1      | 1    | 2      | 3        | 4       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| tu-inter     | 1         | 1         | 1        | 1      | 1    | 1      | 1    | 2      | 3        | 4       |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
>
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +|                     | ultrafast | superfast | veryfast | faster | fast | medium | slow | slower | veryslow | placebo |
> ++=====================+===========+===========+==========+========+======+========+======+========+==========+=========+
> +| ctu                 | 32        | 32        | 32       | 64     | 64   | 64     | 64   | 64     | 64       | 64      |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| min-cu-size         | 16        | 8         | 8        | 8      | 8    | 8      | 8    | 8      | 8        | 8       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| bframes             | 3         | 3         | 4        | 4      | 4    | 4      | 4    | 8      | 8        | 8       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| b-adapt             | 0         | 0         | 0        | 0      | 0    | 2      | 2    | 2      | 2        | 2       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| rc-lookahead        | 5         | 10        | 15       | 15     | 15   | 20     | 25   | 30     | 40       | 60      |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| scenecut            | 0         | 40        | 40       | 40     | 40   | 40     | 40   | 40     | 40       | 40      |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| ref                 | 1         | 1         | 1        | 1      | 2    | 3      | 3    | 3      | 5        | 5       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| me                  | dia       | hex       | hex      | hex    | hex  | hex    | star | star   | star     | star    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| merange             | 57        | 57        | 57       | 57     | 57   | 57     | 57   | 57     | 57       | 92      |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| subme               | 0         | 1         | 1        | 2      | 2    | 2      | 3    | 3      | 4        | 5       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| rect                | 0         | 0         | 0        | 0      | 0    | 0      | 1    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| amp                 | 0         | 0         | 0        | 0      | 0    | 0      | 0    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| max-merge           | 2         | 2         | 2        | 2      | 2    | 2      | 3    | 3      | 4        | 5       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| early-skip          | 1         | 1         | 1        | 1      | 0    | 0      | 0    | 0      | 0        | 0       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| fast-intra          | 1         | 1         | 1        | 1      | 1    | 0      | 0    | 0      | 0        | 0       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| b-intra             | 0         | 0         | 0        | 0      | 0    | 0      | 0    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| sao                 | 0         | 0         | 1        | 1      | 1    | 1      | 1    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| signhide            | 0         | 1         | 1        | 1      | 1    | 1      | 1    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| weightp             | 0         | 0         | 1        | 1      | 1    | 1      | 1    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| weightb             | 0         | 0         | 0        | 0      | 0    | 0      | 0    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| aq-mode             | 0         | 0         | 1        | 1      | 1    | 1      | 1    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| cuTree              | 0         | 0         | 0        | 0      | 1    | 1      | 1    | 1      | 1        | 1       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| rdLevel             | 2         | 2         | 2        | 2      | 2    | 3      | 4    | 6      | 6        | 6       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| rdoq-level          | 0         | 0         | 0        | 0      | 0    | 0      | 2    | 2      | 2        | 2       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| tu-intra            | 1         | 1         | 1        | 1      | 1    | 1      | 1    | 2      | 3        | 4       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| tu-inter            | 1         | 1         | 1        | 1      | 1    | 1      | 1    | 2      | 3        | 4       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| lookahead-slices    | 8         | 8         | 8        | 8      | 8    | 8      | 4    | 4      | 0        | 0       |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
>
> Placebo mode enables transform-skip prediction evaluation.
>
> diff -r 02db15e14351 -r f04e0fb7b82e source/common/param.cpp
> --- a/source/common/param.cpp Thu Oct 29 15:22:36 2015 +0530
> +++ b/source/common/param.cpp Wed Nov 04 12:27:16 2015 +0530
> @@ -147,7 +147,7 @@
> param->bFrameAdaptive = X265_B_ADAPT_TRELLIS;
> param->bBPyramid = 1;
> param->scenecutThreshold = 40; /* Magic number pulled in from x264 */
> - param->lookaheadSlices = 0;
> + param->lookaheadSlices = 8;
>
> /* Intra Coding Tools */
> param->bEnableConstrainedIntra = 0;
> @@ -348,6 +348,7 @@
> param->subpelRefine = 3;
> param->maxNumMergeCand = 3;
> param->searchMethod = X265_STAR_SEARCH;
> + param->lookaheadSlices = 4; // limit parallelism as already enough work exists
> }
> else if (!strcmp(preset, "slower"))
> {
> @@ -365,6 +366,7 @@
> param->maxNumMergeCand = 3;
> param->searchMethod = X265_STAR_SEARCH;
> param->bIntraInBFrames = 1;
> + param->lookaheadSlices = 4; // limit parallelism as already enough work exists
> }
> else if (!strcmp(preset, "veryslow"))
> {
> @@ -383,6 +385,7 @@
> param->searchMethod = X265_STAR_SEARCH;
> param->maxNumReferences = 5;
> param->bIntraInBFrames = 1;
> + param->lookaheadSlices = 0; // disabled for best quality
> }
> else if (!strcmp(preset, "placebo"))
> {
> @@ -404,6 +407,7 @@
> param->maxNumReferences = 5;
> param->rc.bEnableSlowFirstPass = 1;
> param->bIntraInBFrames = 1;
> + param->lookaheadSlices = 0; // disabled for best quality
> // TODO: optimized esa
> }
> else
--
Deepthi Nandakumar
Engineering Manager, x265
Multicoreware, Inc