<div dir="ltr">This patch no longer applies at the tip as the doc was recently changed. Will send an update - please ignore until then.<br><div>Pradeep.</div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">On Tue, Nov 3, 2015 at 9:32 PM, Pradeep Ramachandran <span dir="ltr"><<a href="mailto:pradeep@multicorewareinc.com" target="_blank">pradeep@multicorewareinc.com</a>></span> wrote:<br></div></div></div></div></div></div></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"># HG changeset patch<br>
# User Pradeep Ramachandran <<a href="mailto:pradeep@multicorewareinc.com">pradeep@multicorewareinc.com</a>><br>
# Date 1446566445 -19800<br>
#      Tue Nov 03 21:30:45 2015 +0530<br>
# Node ID cdc2b132a66f97bab510313f614909df3089c2c7<br>
# Parent  61396ea8096a9f75667ac01ae8b4bf02169d3b64<br>
perf: Enabling lookahead-slices for all presets except veryslow & placebo<br>
<br>
Seeing ~10% better performance on the faster presets on Skylake, and ~2X better<br>
performance on Xeon systems with the ultrafast preset. The improvement comes from a<br>
considerable increase in utilization across the board. Slicing is disabled for<br>
videos of resolution < 720p to limit the impact on quality.<br>
<br>
Commit will change outputs, but reduction in quality (measured by PSNR or SSIM)<br>
is <0.01% across a wide variety of presets and runs.<br>
<br>
diff -r 61396ea8096a -r cdc2b132a66f doc/reST/cli.rst<br>
--- a/doc/reST/cli.rst  Mon Oct 12 10:23:37 2015 +0800<br>
+++ b/doc/reST/cli.rst  Tue Nov 03 21:30:45 2015 +0530<br>
@@ -1124,21 +1124,31 @@<br>
<br>
 .. option:: --lookahead-slices <0..16><br>
<br>
-       Use multiple worker threads to measure the estimated cost of each<br>
-       frame within the lookahead. When :option:`--b-adapt` is 2, most<br>
-       frame cost estimates will be performed in batch mode, many cost<br>
-       estimates at the same time, and lookahead-slices is ignored for<br>
-       batched estimates. The effect on performance can be quite small.<br>
-       The higher this parameter, the less accurate the frame costs will be<br>
-       (since context is lost across slice boundaries) which will result in<br>
-       less accurate B-frame and scene-cut decisions.<br>
+       Use multiple worker threads to measure the estimated cost of each frame<br>
+       within the lookahead. The frame is divided into the specified number of<br>
+       slices, and one thread is launched per slice. When :option:`--b-adapt` is<br>
+       2, most frame cost estimates will be performed in batch mode (many cost<br>
+       estimates at the same time) and lookahead-slices is ignored for batched<br>
+       estimates; it may still be used for single cost estimations. The higher this<br>
+       parameter, the less accurate the frame costs will be (since context is lost<br>
+       across slice boundaries) which will result in less accurate B-frame and<br>
+       scene-cut decisions. The effect on performance can be significant, especially<br>
+       on systems with many threads.<br>
<br>
-       The encoder may internally lower the number of slices to ensure<br>
-       each slice codes at least 10 16x16 rows of lowres blocks. If slices<br>
-       are used in lookahead, they are logged in the list of tools as<br>
-       *lslices*.<br>
-<br>
-       **Values:** 0 - disabled (default). 1 is the same as 0. Max 16<br>
+       The encoder may internally lower the number of slices or disable<br>
+       slicing to ensure each slice codes at least 10 16x16 rows of lowres<br>
+       blocks to minimize the impact on quality. For example, for 720p and<br>
+       1080p videos, the number of slices is capped to 4 and 6, respectively.<br>
+       For resolutions lower than 720p, slicing is auto-disabled.<br>
+<br>
+       If slices are used in lookahead, they are logged in the list of tools<br>
+       as *lslices*.<br>
+<br>
+       **Values:** 0 - disabled. 1 is the same as 0. Max 16.<br>
+       Default: 8 for ultrafast, superfast, veryfast, faster, fast, medium;<br>
+       4 for slow, slower;<br>
+       disabled for veryslow, placebo.<br>
+<br>
<br>
 .. option:: --b-adapt <integer><br>
<br>
diff -r 61396ea8096a -r cdc2b132a66f doc/reST/presets.rst<br>
--- a/doc/reST/presets.rst      Mon Oct 12 10:23:37 2015 +0800<br>
+++ b/doc/reST/presets.rst      Tue Nov 03 21:30:45 2015 +0530<br>
@@ -19,61 +19,63 @@<br>
<br>
 The presets adjust encoder parameters to affect these trade-offs.<br>
<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-|              | ultrafast | superfast | veryfast | faster | fast | medium | slow | slower | veryslow | placebo |<br>
-+==============+===========+===========+==========+========+======+========+======+========+==========+=========+<br>
-| ctu          |   32      |    32     |   32     |  64    |  64  |   64   |  64  |  64    |   64     |   64    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| min-cu-size  |   16      |     8     |    8     |   8    |   8  |    8   |   8  |   8    |    8     |    8    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| bframes      |    3      |     3     |    4     |   4    |  4   |    4   |  4   |   8    |    8     |    8    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| b-adapt      |    0      |     0     |    0     |   0    |  0   |    2   |  2   |   2    |    2     |    2    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| rc-lookahead |    5      |    10     |   15     |  15    |  15  |   20   |  25  |   30   |   40     |   60    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| scenecut     |    0      |    40     |   40     |  40    |  40  |   40   |  40  |   40   |   40     |   40    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| refs         |    1      |     1     |    1     |   1    |  2   |    3   |  3   |   3    |    5     |    5    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| me           |   dia     |   hex     |   hex    |  hex   | hex  |   hex  | star |  star  |   star   |   star  |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| merange      |   57      |    57     |   57     |  57    |  57  |   57   | 57   |  57    |   57     |   92    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| subme        |    0      |     1     |    1     |   2    |  2   |    2   |  3   |   3    |    4     |    5    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| rect         |    0      |     0     |    0     |   0    |  0   |    0   |  1   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| amp          |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| max-merge    |    2      |     2     |    2     |   2    |  2   |    2   |  3   |   3    |    4     |    5    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| early-skip   |    1      |     1     |    1     |   1    |  0   |    0   |  0   |   0    |    0     |    0    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| fast-intra   |    1      |     1     |    1     |   1    |  1   |    0   |  0   |   0    |    0     |    0    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| b-intra      |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| sao          |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| signhide     |    0      |     1     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| weightp      |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| weightb      |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| aq-mode      |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| cuTree       |    0      |     0     |    0     |   0    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| rdLevel      |    2      |     2     |    2     |   2    |  2   |    3   |  4   |   6    |    6     |    6    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| rdoq-level   |    0      |     0     |    0     |   0    |  0   |    0   |  2   |   2    |    2     |    2    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| tu-intra     |    1      |     1     |    1     |   1    |  1   |    1   |  1   |   2    |    3     |    4    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
-| tu-inter     |    1      |     1     |    1     |   1    |  1   |    1   |  1   |   2    |    3     |    4    |<br>
-+--------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+|                  | ultrafast | superfast | veryfast | faster | fast | medium | slow | slower | veryslow | placebo |<br>
++==================+===========+===========+==========+========+======+========+======+========+==========+=========+<br>
+| ctu              |   32      |    32     |   32     |  64    |  64  |   64   |  64  |  64    |   64     |   64    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| min-cu-size      |   16      |     8     |    8     |   8    |   8  |    8   |   8  |   8    |    8     |    8    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| bframes          |    3      |     3     |    4     |   4    |  4   |    4   |  4   |   8    |    8     |    8    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| b-adapt          |    0      |     0     |    0     |   0    |  0   |    2   |  2   |   2    |    2     |    2    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| rc-lookahead     |    5      |    10     |   15     |  15    |  15  |   20   |  25  |   30   |   40     |   60    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| scenecut         |    0      |    40     |   40     |  40    |  40  |   40   |  40  |   40   |   40     |   40    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| refs             |    1      |     1     |    1     |   1    |  2   |    3   |  3   |   3    |    5     |    5    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| me               |   dia     |   hex     |   hex    |  hex   | hex  |   hex  | star |  star  |   star   |   star  |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| merange          |   57      |    57     |   57     |  57    |  57  |   57   | 57   |  57    |   57     |   92    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| subme            |    0      |     1     |    1     |   2    |  2   |    2   |  3   |   3    |    4     |    5    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| rect             |    0      |     0     |    0     |   0    |  0   |    0   |  1   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| amp              |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| max-merge        |    2      |     2     |    2     |   2    |  2   |    2   |  3   |   3    |    4     |    5    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| early-skip       |    1      |     1     |    1     |   1    |  0   |    0   |  0   |   0    |    0     |    0    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| fast-intra       |    1      |     1     |    1     |   1    |  1   |    0   |  0   |   0    |    0     |    0    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| b-intra          |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| sao              |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| signhide         |    0      |     1     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| weightp          |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| weightb          |    0      |     0     |    0     |   0    |  0   |    0   |  0   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| aq-mode          |    0      |     0     |    1     |   1    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| cuTree           |    0      |     0     |    0     |   0    |  1   |    1   |  1   |   1    |    1     |    1    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| rdLevel          |    2      |     2     |    2     |   2    |  2   |    3   |  4   |   6    |    6     |    6    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| rdoq-level       |    0      |     0     |    0     |   0    |  0   |    0   |  2   |   2    |    2     |    2    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| tu-intra         |    1      |     1     |    1     |   1    |  1   |    1   |  1   |   2    |    3     |    4    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| tu-inter         |    1      |     1     |    1     |   1    |  1   |    1   |  1   |   2    |    3     |    4    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
+| lookahead-slices |    8      |     8     |    8     |   8    |  8   |    8   |  4   |   4    |    0     |    0    |<br>
++------------------+-----------+-----------+----------+--------+------+--------+------+--------+----------+---------+<br>
<br>
 Placebo mode enables transform-skip prediction evaluation.<br>
<br>
diff -r 61396ea8096a -r cdc2b132a66f source/common/param.cpp<br>
--- a/source/common/param.cpp   Mon Oct 12 10:23:37 2015 +0800<br>
+++ b/source/common/param.cpp   Tue Nov 03 21:30:45 2015 +0530<br>
@@ -147,7 +147,7 @@<br>
     param->bFrameAdaptive = X265_B_ADAPT_TRELLIS;<br>
     param->bBPyramid = 1;<br>
     param->scenecutThreshold = 40; /* Magic number pulled in from x264 */<br>
-    param->lookaheadSlices = 0;<br>
+    param->lookaheadSlices = 8;<br>
<br>
     /* Intra Coding Tools */<br>
     param->bEnableConstrainedIntra = 0;<br>
@@ -347,6 +347,7 @@<br>
             param->subpelRefine = 3;<br>
             param->maxNumMergeCand = 3;<br>
             param->searchMethod = X265_STAR_SEARCH;<br>
+            param->lookaheadSlices = 4; // limit parallelism as already enough work exists<br>
         }<br>
         else if (!strcmp(preset, "slower"))<br>
         {<br>
@@ -364,6 +365,7 @@<br>
             param->maxNumMergeCand = 3;<br>
             param->searchMethod = X265_STAR_SEARCH;<br>
             param->bIntraInBFrames = 1;<br>
+            param->lookaheadSlices = 4; // limit parallelism as already enough work exists<br>
         }<br>
         else if (!strcmp(preset, "veryslow"))<br>
         {<br>
@@ -382,6 +384,7 @@<br>
             param->searchMethod = X265_STAR_SEARCH;<br>
             param->maxNumReferences = 5;<br>
             param->bIntraInBFrames = 1;<br>
+            param->lookaheadSlices = 0; // disabled for best quality<br>
         }<br>
         else if (!strcmp(preset, "placebo"))<br>
         {<br>
@@ -403,6 +406,7 @@<br>
             param->maxNumReferences = 5;<br>
             param->rc.bEnableSlowFirstPass = 1;<br>
             param->bIntraInBFrames = 1;<br>
+            param->lookaheadSlices = 0; // disabled for best quality<br>
             // TODO: optimized esa<br>
         }<br>
         else<br>
</blockquote></div><br></div></div>
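For readers of the cli.rst hunk above, the documented slice cap can be sketched roughly as follows. This is only an illustration under stated assumptions: `effectiveLookaheadSlices` is a hypothetical helper, not a function in x265, and the arithmetic (one slice per 10 rows of 16 source lines, nothing below 720 lines) was chosen only because it reproduces the caps quoted in the doc text (4 for 720p, 6 for 1080p, disabled below 720p). The encoder's real logic may differ.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of x265): estimates how many lookahead
// slices would remain after the cap described in the documentation.
// Assumption: the cap is derived from the source height in rows of 16 lines.
int effectiveLookaheadSlices(int requested, int sourceHeight)
{
    assert(sourceHeight > 0);

    // 0 and 1 both mean "disabled"; the doc says slicing is also
    // auto-disabled for resolutions lower than 720p.
    if (requested <= 1 || sourceHeight < 720)
        return 0;

    // Number of 16-line rows in the frame, rounded up.
    const int rows16 = (sourceHeight + 15) / 16;

    // Each slice should cover at least 10 such rows.
    const int maxSlices = rows16 / 10;

    const int slices = std::min(requested, maxSlices);
    return slices > 1 ? slices : 0;
}
```

Under these assumptions, a preset default of 8 would be lowered to 4 for a 720-line source and to 6 for a 1080-line source, and a 480-line source would run the lookahead without slices.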