[x265] [PATCH] perf: Enabling lookahead-slices for all presets except veryslow & placebo

Deepthi Nandakumar deepthi at multicorewareinc.com
Wed Nov 4 11:57:12 CET 2015


Is the logic for disabling lookahead slices for resolutions lower than 720p
not included in this patch?
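
For reference, here is a minimal sketch of the rule the new cli.rst text
describes (at least 10 rows of 16x16 blocks per slice, auto-disabled below
720p). The helper name effectiveLookaheadSlices and the row count taken as
sourceHeight / 16 are my assumptions for illustration; this is not code from
the patch, and the encoder may compute the row count differently.

```cpp
#include <algorithm>

// Hypothetical helper (not part of the patch): returns the number of
// lookahead slices that would actually be used for a given request.
int effectiveLookaheadSlices(int requestedSlices, int sourceHeight)
{
    const int minRowsPerSlice = 10; // "at least 10 16x16 rows" per cli.rst

    // 0 and 1 both mean "no slicing"; below 720p slicing is auto-disabled.
    if (requestedSlices <= 1 || sourceHeight < 720)
        return 0;

    // Assumption: count whole rows of 16x16 blocks in the source picture.
    const int rows = sourceHeight / 16;

    // Split the rows evenly, but never let a slice drop below the minimum
    // (or exceed the whole picture).
    int rowsPerSlice = rows / requestedSlices;
    rowsPerSlice = std::max(rowsPerSlice, minRowsPerSlice);
    rowsPerSlice = std::min(rowsPerSlice, rows);

    // The slice count that results from the clamped rows-per-slice value.
    return rows / rowsPerSlice;
}
```

Under these assumptions a request for 8 slices yields 4 at 720p and 6 at
1080p, which matches the caps quoted in the documentation text, and 0 for
anything below 720p.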

On Wed, Nov 4, 2015 at 12:27 PM, Pradeep Ramachandran <pradeep at multicorewareinc.com> wrote:

> # HG changeset patch
> # User Pradeep Ramachandran <pradeep at multicorewareinc.com>
> # Date 1446620236 -19800
> #      Wed Nov 04 12:27:16 2015 +0530
> # Node ID f04e0fb7b82e98b5bab4e8ae1cfb3caf92f8c277
> # Parent  02db15e14351c7f5190203a087db285655961205
> perf: Enabling lookahead-slices for all presets except veryslow & placebo
>
> Seeing ~10% performance improvement on the faster presets on Skylake, and
> ~2X performance on Xeon systems in the ultrafast setting. The improvement
> comes from a considerable increase in utilization across the board.
> Disabling slicing for videos of resolution < 720p to limit impact on quality.
>
> Commit will change outputs, but reduction in quality (measured by PSNR or SSIM)
> is <0.01% across a wide variety of presets and runs.
>
> diff -r 02db15e14351 -r f04e0fb7b82e doc/reST/cli.rst
> --- a/doc/reST/cli.rst  Thu Oct 29 15:22:36 2015 +0530
> +++ b/doc/reST/cli.rst  Wed Nov 04 12:27:16 2015 +0530
> @@ -1124,21 +1124,31 @@
>
>  .. option:: --lookahead-slices <0..16>
>
> -       Use multiple worker threads to measure the estimated cost of each
> -       frame within the lookahead. When :option:`--b-adapt` is 2, most
> -       frame cost estimates will be performed in batch mode, many cost
> -       estimates at the same time, and lookahead-slices is ignored for
> -       batched estimates. The effect on performance can be quite small.
> -       The higher this parameter, the less accurate the frame costs will be
> -       (since context is lost across slice boundaries) which will result in
> -       less accurate B-frame and scene-cut decisions.
> +       Use multiple worker threads to measure the estimated cost of each frame
> +       within the lookahead. The frame is divided into the specified number of
> +       slices, and one thread is launched per slice. When :option:`--b-adapt` is
> +       2, most frame cost estimates will be performed in batch mode (many cost
> +       estimates at the same time) and lookahead-slices is ignored for batched
> +       estimates; it may still be used for single cost estimations. The higher this
> +       parameter, the less accurate the frame costs will be (since context is lost
> +       across slice boundaries) which will result in less accurate B-frame and
> +       scene-cut decisions. The effect on performance can be significant especially
> +       on systems with many threads.
>
> -       The encoder may internally lower the number of slices to ensure
> -       each slice codes at least 10 16x16 rows of lowres blocks. If slices
> -       are used in lookahead, they are logged in the list of tools as
> -       *lslices*.
> -
> -       **Values:** 0 - disabled (default). 1 is the same as 0. Max 16
> +       The encoder may internally lower the number of slices or disable
> +    slicing to ensure each slice codes at least 10 16x16 rows of lowres
> +    blocks to minimize the impact on quality. For example, for 720p and
> +    1080p videos, the number of slices is capped to 4 and 6, respectively.
> +    For resolutions lower than 720p, slicing is auto-disabled.
> +
> +    If slices are used in lookahead, they are logged in the list of tools
> +    as *lslices*
> +
> +       **Values:** 0 - disabled. 1 is the same as 0. Max 16.
> +    Default: 8 for ultrafast, superfast, veryfast, faster, fast, medium
> +             4 for slow, slower
> +             disabled for veryslow, placebo
> +
>
>  .. option:: --b-adapt <integer>
>
> diff -r 02db15e14351 -r f04e0fb7b82e doc/reST/presets.rst
> --- a/doc/reST/presets.rst      Thu Oct 29 15:22:36 2015 +0530
> +++ b/doc/reST/presets.rst      Wed Nov 04 12:27:16 2015 +0530
> @@ -19,61 +19,63 @@
>
>  The presets adjust encoder parameters to affect these trade-offs.
>
>
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -|              | ultrafast | superfast | veryfast | faster | fast | medium | slow | slower | veryslow | placebo |
> -+==============+===========+===========+==========+========+======+========+======+========+==========+=========+
> -| ctu          |   32      |    32     |   32     |  64    |  64  |   64   |  64  |  64    |   64     |   64    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| min-cu-size  |   16      |     8     |    8     |   8    |   8  |    8   |   8  |   8    |    8     |    8    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| bframes      |    3      |     3     |    4     |   4    |  4   |    4   |  4   |   8    |    8     |    8    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| b-adapt      |    0      |     0     |    0     |   0    |  0   |    2   |  2   |   2    |    2     |    2    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| rc-lookahead |    5      |    10     |   15     |  15    |  15  |   20   |  25  |   30   |   40     |   60    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| scenecut     |    0      |    40     |   40     |  40    |  40  |   40   |  40  |   40   |   40     |   40    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| ref          |    1      |     1     |    1     |   1    |  2   |    3   |  3   |   3    |    5     |    5    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| me           |   dia     |   hex     |   hex    |  hex   | hex  |   hex  | star |  star  |   star   |   star  |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| merange      |   57      |    57     |   57     |  57    |  57  |   57   | 57   |  57    |   57     |   92    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| subme        |    0      |     1     |    1     |   2    |  2   |    2   |  3   |   3    |    4     |    5    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| rect         |    0      |     0     |    0     |   0    |  0   |    0   |  1   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| amp          |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| max-merge    |    2      |     2     |    2     |   2    |  2   |    2   |  3   |   3    |    4     |    5    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| early-skip   |    1      |     1     |    1     |   1    |  0   |    0   |  0   |   0    |    0     |    0    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| fast-intra   |    1      |     1     |    1     |   1    |  1   |    0   |  0   |   0    |    0     |    0    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| b-intra      |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| sao          |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| signhide     |    0      |     1     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| weightp      |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| weightb      |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| aq-mode      |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| cuTree       |    0      |     0     |    0     |   0    |  1   |    1   |  1   |   1    |    1     |    1    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| rdLevel      |    2      |     2     |    2     |   2    |  2   |    3   |  4   |   6    |    6     |    6    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| rdoq-level   |    0      |     0     |    0     |   0    |  0   |    0   |  2   |   2    |    2     |    2    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| tu-intra     |    1      |     1     |    1     |   1    |  1   |    1   |  1   |   2    |    3     |    4    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> -| tu-inter     |    1      |     1     |    1     |   1    |  1   |    1   |  1   |   2    |    3     |    4    |
> -+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
>
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +|                     | ultrafast | superfast | veryfast | faster | fast | medium | slow | slower | veryslow | placebo |
> ++=====================+===========+===========+==========+========+======+========+======+========+==========+=========+
> +| ctu                 |   32      |    32     |   32     |  64    |  64  |   64   |  64  |  64    |   64     |   64    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| min-cu-size         |   16      |     8     |    8     |   8    |   8  |    8   |   8  |   8    |    8     |    8    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| bframes             |    3      |     3     |    4     |   4    |  4   |    4   |  4   |   8    |    8     |    8    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| b-adapt             |    0      |     0     |    0     |   0    |  0   |    2   |  2   |   2    |    2     |    2    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| rc-lookahead        |    5      |    10     |   15     |  15    |  15  |   20   |  25  |   30   |   40     |   60    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| scenecut            |    0      |    40     |   40     |  40    |  40  |   40   |  40  |   40   |   40     |   40    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| ref                 |    1      |     1     |    1     |   1    |  2   |    3   |  3   |   3    |    5     |    5    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| me                  |   dia     |   hex     |   hex    |  hex   | hex  |   hex  | star |  star  |   star   |   star  |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| merange             |   57      |    57     |   57     |  57    |  57  |   57   | 57   |  57    |   57     |   92    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| subme               |    0      |     1     |    1     |   2    |  2   |    2   |  3   |   3    |    4     |    5    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| rect                |    0      |     0     |    0     |   0    |  0   |    0   |  1   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| amp                 |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| max-merge           |    2      |     2     |    2     |   2    |  2   |    2   |  3   |   3    |    4     |    5    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| early-skip          |    1      |     1     |    1     |   1    |  0   |    0   |  0   |   0    |    0     |    0    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| fast-intra          |    1      |     1     |    1     |   1    |  1   |    0   |  0   |   0    |    0     |    0    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| b-intra             |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| sao                 |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| signhide            |    0      |     1     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| weightp             |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| weightb             |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| aq-mode             |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| cuTree              |    0      |     0     |    0     |   0    |  1   |    1   |  1   |   1    |    1     |    1    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| rdLevel             |    2      |     2     |    2     |   2    |  2   |    3   |  4   |   6    |    6     |    6    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| rdoq-level          |    0      |     0     |    0     |   0    |  0   |    0   |  2   |   2    |    2     |    2    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| tu-intra            |    1      |     1     |    1     |   1    |  1   |    1   |  1   |   2    |    3     |    4    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| tu-inter            |    1      |     1     |    1     |   1    |  1   |    1   |  1   |   2    |    3     |    4    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
> +| lookahead-slices    |    8      |     8     |    8     |   8    |  8   |    8   |  4   |   4    |    0     |    0    |
> ++---------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+
>
>  Placebo mode enables transform-skip prediction evaluation.
>
> diff -r 02db15e14351 -r f04e0fb7b82e source/common/param.cpp
> --- a/source/common/param.cpp   Thu Oct 29 15:22:36 2015 +0530
> +++ b/source/common/param.cpp   Wed Nov 04 12:27:16 2015 +0530
> @@ -147,7 +147,7 @@
>      param->bFrameAdaptive = X265_B_ADAPT_TRELLIS;
>      param->bBPyramid = 1;
>      param->scenecutThreshold = 40; /* Magic number pulled in from x264 */
> -    param->lookaheadSlices = 0;
> +    param->lookaheadSlices = 8;
>
>      /* Intra Coding Tools */
>      param->bEnableConstrainedIntra = 0;
> @@ -348,6 +348,7 @@
>              param->subpelRefine = 3;
>              param->maxNumMergeCand = 3;
>              param->searchMethod = X265_STAR_SEARCH;
> +            param->lookaheadSlices = 4; // limit parallelism as already enough work exists
>          }
>          else if (!strcmp(preset, "slower"))
>          {
> @@ -365,6 +366,7 @@
>              param->maxNumMergeCand = 3;
>              param->searchMethod = X265_STAR_SEARCH;
>              param->bIntraInBFrames = 1;
> +            param->lookaheadSlices = 4; // limit parallelism as already enough work exists
>          }
>          else if (!strcmp(preset, "veryslow"))
>          {
> @@ -383,6 +385,7 @@
>              param->searchMethod = X265_STAR_SEARCH;
>              param->maxNumReferences = 5;
>              param->bIntraInBFrames = 1;
> +            param->lookaheadSlices = 0; // disabled for best quality
>          }
>          else if (!strcmp(preset, "placebo"))
>          {
> @@ -404,6 +407,7 @@
>              param->maxNumReferences = 5;
>              param->rc.bEnableSlowFirstPass = 1;
>              param->bIntraInBFrames = 1;
> +            param->lookaheadSlices = 0; // disabled for best quality
>              // TODO: optimized esa
>          }
>          else
> _______________________________________________
> x265-devel mailing list
> x265-devel at videolan.org
> https://mailman.videolan.org/listinfo/x265-devel
>



-- 
Deepthi Nandakumar
Engineering Manager, x265
Multicoreware, Inc