[x265-commits] [x265] threading: add ATOMIC_ADD

Steve Borho steve at borho.org
Thu Jan 29 17:14:53 CET 2015


details:   http://hg.videolan.org/x265/rev/7f0b87cbad6d
branches:  
changeset: 9216:7f0b87cbad6d
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 11:49:46 2015 -0600
description:
threading: add ATOMIC_ADD
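
As a rough illustration of what an atomic add offers over plain increment and decrement, here is a minimal sketch. It uses std::atomic as a portable stand-in for the macro; the helper name atomicAdd is hypothetical and is not part of x265. Like __sync_add_and_fetch and InterlockedAdd, it returns the value after the addition.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ATOMIC_ADD(ptr, value): atomically adds `value`
// to the counter and returns the resulting (post-add) total.
inline int32_t atomicAdd(std::atomic<int32_t>& counter, int32_t value)
{
    // fetch_add returns the previous value, so add `value` to get the new one
    return counter.fetch_add(value) + value;
}

// Example use: fold a per-thread tally into a shared running total.
inline int32_t accumulate(std::atomic<int32_t>& total, const int32_t* tallies, int count)
{
    assert(count >= 0);
    int32_t last = total.load();
    for (int i = 0; i < count; i++)
        last = atomicAdd(total, tallies[i]);
    return last;
}
```

A negative value subtracts, so a single primitive can cover both directions when a counter changes by more than one at a time.
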
Subject: [x265] stats: keep running count of number of active worker threads per frame encoder

details:   http://hg.videolan.org/x265/rev/e8367f5cd43f
branches:  
changeset: 9217:e8367f5cd43f
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 11:15:45 2015 -0600
description:
stats: keep running count of number of active worker threads per frame encoder
Subject: [x265] stats: add frame statistic for average WPP benefit

details:   http://hg.videolan.org/x265/rev/be33891b8457
branches:  
changeset: 9218:be33891b8457
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 11:51:48 2015 -0600
description:
stats: add frame statistic for average WPP benefit

Show how many worker threads, on average, were working on each frame. Also move
the performance statistics together at the end of the CSV line in preparation
for adding a few more of them.
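
The averaging described above can be sketched as follows. This is a single-threaded illustration only: the struct and member names are hypothetical, and a real encoder would need atomic updates for the shared counters. It assumes, as the documentation hunk in this patch states, that the active worker count is sampled once per completed CTU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-frame sampler for the "Avg WPP" statistic.
struct WppSampler
{
    uint32_t activeWorkers = 0; // workers currently compressing this frame
    uint64_t sampleSum = 0;     // sum of activeWorkers over all samples
    uint32_t sampleCount = 0;   // one sample per completed CTU

    void workerStart() { activeWorkers++; }
    void workerStop()  { assert(activeWorkers > 0); activeWorkers--; }

    // Called when a CTU finishes: record how many workers were active.
    void ctuCompleted() { sampleSum += activeWorkers; sampleCount++; }

    // Average number of active workers across all samples (0 if none taken).
    double average() const
    {
        return sampleCount ? (double)sampleSum / sampleCount : 0.0;
    }
};
```
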
Subject: [x265] frameencoder: use uint32_t more consistently for rows and columns

details:   http://hg.videolan.org/x265/rev/b1a2ed9bc3b4
branches:  
changeset: 9219:b1a2ed9bc3b4
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 12:07:53 2015 -0600
description:
frameencoder: use uint32_t more consistently for rows and columns
Subject: [x265] stats: keep timestamps instead of elapsed times, to allow more flexibility

details:   http://hg.videolan.org/x265/rev/0c5078dfd9c5
branches:  
changeset: 9220:0c5078dfd9c5
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 12:28:49 2015 -0600
description:
stats: keep timestamps instead of elapsed times, to allow more flexibility
Subject: [x265] stats: count the number of times top dependencies block worker threads

details:   http://hg.videolan.org/x265/rev/6cd6e04a0abd
branches:  
changeset: 9221:6cd6e04a0abd
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 12:32:58 2015 -0600
description:
stats: count the number of times top dependencies block worker threads
Subject: [x265] stats: report row0wait and frame end overhead separate from wall time

details:   http://hg.videolan.org/x265/rev/090d305c5708
branches:  
changeset: 9222:090d305c5708
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 12:44:21 2015 -0600
description:
stats: report row0wait and frame end overhead separate from wall time

These are times when the frame encoder is either blocked on reference
dependencies or is doing some non-compression related work.
Subject: [x265] stats: report times in milliseconds

details:   http://hg.videolan.org/x265/rev/451e21c4c66b
branches:  
changeset: 9223:451e21c4c66b
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 12:46:03 2015 -0600
description:
stats: report times in milliseconds
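
Keeping raw timestamps means several intervals can be derived from the same few values at report time. The sketch below mirrors the ELAPSED_MSEC macro from the encoder.cpp hunk in this patch. It assumes the timestamps are microsecond counts, which the division by 1000 suggests but this message does not state. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Same arithmetic as ELAPSED_MSEC(start, end) in the patch.
// Assumption: both arguments are timestamps in microseconds.
inline double elapsedMsec(int64_t startUsec, int64_t endUsec)
{
    return ((double)endUsec - (double)startUsec) / 1000;
}

// Hypothetical subset of the per-frame timestamps kept by a frame encoder.
struct FrameTimestamps
{
    int64_t startCompress; // frame handed to the frame encoder
    int64_t row0Wait;      // first row of CTUs allowed to start
    int64_t endCompress;   // frame fully compressed
};

// Two different statistics derived from the same stored timestamps.
inline double row0WaitMs(const FrameTimestamps& t)
{
    return elapsedMsec(t.startCompress, t.row0Wait);
}

inline double wallTimeMs(const FrameTimestamps& t)
{
    return elapsedMsec(t.row0Wait, t.endCompress);
}
```
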
Subject: [x265] stats: include loop filter processing and all overhead in worker wall time

details:   http://hg.videolan.org/x265/rev/ff6eec030551
branches:  
changeset: 9224:ff6eec030551
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 12:54:11 2015 -0600
description:
stats: include loop filter processing and all overhead in worker wall time
Subject: [x265] stats: report wall time of wait for reference rows

details:   http://hg.videolan.org/x265/rev/766e91256991
branches:  
changeset: 9225:766e91256991
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 13:44:58 2015 -0600
description:
stats: report wall time of wait for reference rows
Subject: [x265] stats: report wall time of frame encoder with no active worker threads

details:   http://hg.videolan.org/x265/rev/9c26ca2c9c0e
branches:  
changeset: 9226:9c26ca2c9c0e
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 13:55:42 2015 -0600
description:
stats: report wall time of frame encoder with no active worker threads

But do not start this counter until the first CTU is processed
Subject: [x265] stats: report frame wall time spent waiting for decided frames

details:   http://hg.videolan.org/x265/rev/6260c3ca9f0d
branches:  
changeset: 9227:6260c3ca9f0d
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 14:12:53 2015 -0600
description:
stats: report frame wall time spent waiting for decided frames

This is latency caused by the lookahead
Subject: [x265] stats: document the new columns in per-frame CSV files

details:   http://hg.videolan.org/x265/rev/f45b12a0ded8
branches:  
changeset: 9228:f45b12a0ded8
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 14:27:22 2015 -0600
description:
stats: document the new columns in per-frame CSV files
Subject: [x265] stats: nits

details:   http://hg.videolan.org/x265/rev/4feedc4f6cf5
branches:  
changeset: 9229:4feedc4f6cf5
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Thu Jan 29 16:57:10 2015 +0530
description:
stats: nits
Subject: [x265] vps: frameOnlyConstraintFlag is true if fieldSeqFlag is false.

details:   http://hg.videolan.org/x265/rev/194eeb121e45
branches:  
changeset: 9230:194eeb121e45
user:      Deepthi Nandakumar <deepthi at multicorewareinc.com>
date:      Thu Jan 29 17:10:06 2015 +0530
description:
vps: frameOnlyConstraintFlag is true if fieldSeqFlag is false.

frameOnlyConstraintFlag is true for progressive sources and false for
interlaced sources.
Subject: [x265] stats: introduce X265_LOG_FRAME for file level CSV logging without console logs

details:   http://hg.videolan.org/x265/rev/a5af4cf20660
branches:  
changeset: 9231:a5af4cf20660
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 15:25:25 2015 -0600
description:
stats: introduce X265_LOG_FRAME for file level CSV logging without console logs

Using --log-level debug to trigger frame level CSV logging is problematic since
the console logging is often a big enough overhead that it influences the
performance characteristics. --log-level frame will log frame level stats to
the CSV without enabling frame-level console logging. Note that this does not
change the behavior of --log-level debug, but it does change the behavior of
--log-level 3.
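
To make the renumbering concrete, here is a minimal sketch of resolving a --log-level argument given either as a name or as an integer ordinal. The name table copies the ordering of logLevelNames in this patch, where index 0 corresponds to level -1. The helper parseLogLevel and its -2 error value are hypothetical and are not the x265 parser.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Same ordering as logLevelNames in the patch; index 0 is level -1 ("none").
static const char* const kLogLevelNames[] =
    { "none", "error", "warning", "info", "frame", "debug", "full", 0 };

// Hypothetical helper: returns a level in [-1, 5], or -2 if unrecognized.
inline int parseLogLevel(const char* arg)
{
    // Try the string names first; table index i maps to level i - 1
    for (int i = 0; kLogLevelNames[i]; i++)
        if (!std::strcmp(arg, kLogLevelNames[i]))
            return i - 1;

    // Otherwise accept an integer ordinal within the valid range
    char* end = 0;
    long v = std::strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || v < -1 || v > 5)
        return -2;
    return (int)v;
}
```

With this ordering the ordinal 3 resolves to the same level as "frame", and "debug" moves to 4, which is the behavior change noted above.
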
Subject: [x265] encoder: abort on failure to open CSV log file for write

details:   http://hg.videolan.org/x265/rev/504807f61ddc
branches:  
changeset: 9232:504807f61ddc
user:      Steve Borho <steve at borho.org>
date:      Wed Jan 28 15:29:02 2015 -0600
description:
encoder: abort on failure to open CSV log file for write

If the user specified a log file, then they probably do not want the encode to
be started if the log file failed to open.
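
The fail-fast behavior can be sketched as below. This is a simplified illustration, not the encoder code: the CsvLog struct and openCsvLog function are hypothetical, and the sketch always opens the file for writing rather than handling an existing file separately.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical result of trying to set up CSV logging.
struct CsvLog
{
    std::FILE* fp = nullptr; // open handle, or null if none
    bool aborted = false;    // true if a requested log could not be opened
};

// If a CSV filename was given, open it; a failure marks the encode aborted
// instead of silently continuing without the log.
inline CsvLog openCsvLog(const char* csvfn)
{
    CsvLog log;
    if (!csvfn)
        return log; // no CSV requested, nothing to do

    log.fp = std::fopen(csvfn, "wb");
    if (!log.fp)
    {
        std::fprintf(stderr, "Unable to open CSV log file <%s>, aborting\n", csvfn);
        log.aborted = true;
    }
    return log;
}
```
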
Subject: [x265] cli: remove a comment that was out of date 18 months ago

details:   http://hg.videolan.org/x265/rev/3680e6c888fe
branches:  
changeset: 9233:3680e6c888fe
user:      Steve Borho <steve at borho.org>
date:      Thu Jan 29 09:29:45 2015 -0600
description:
cli: remove a comment that was out of date 18 months ago
Subject: [x265] cli: move a param validation into the encoder with other param validations

details:   http://hg.videolan.org/x265/rev/2a007f3cdd46
branches:  
changeset: 9234:2a007f3cdd46
user:      Steve Borho <steve at borho.org>
date:      Thu Jan 29 09:31:58 2015 -0600
description:
cli: move a param validation into the encoder with other param validations
Subject: [x265] cli: improve and document return codes

details:   http://hg.videolan.org/x265/rev/3cde2f08f9bf
branches:  
changeset: 9235:3cde2f08f9bf
user:      Steve Borho <steve at borho.org>
date:      Thu Jan 29 09:33:07 2015 -0600
description:
cli: improve and document return codes

Command parse errors were being reported, but not many other errors were.
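
For scripts that wrap the CLI, the documented return codes can be captured as in the sketch below. The numeric values and descriptions come from the cli.rst hunk in this patch. The enumerator names and the describeExitCode helper are hypothetical and do not appear in x265.

```cpp
#include <cassert>
#include <cstring>

// Values as documented in doc/reST/cli.rst; the names are hypothetical.
enum CliExitCode
{
    CLI_OK = 0,             // encode successful
    CLI_PARSE_ERROR = 1,    // unable to parse command line
    CLI_OPEN_ERROR = 2,     // unable to open encoder
    CLI_HEADER_ERROR = 3,   // unable to generate stream headers
    CLI_ENCODER_ABORT = 4   // encoder abort
};

// Map a process exit status to the documented description.
inline const char* describeExitCode(int code)
{
    switch (code)
    {
    case CLI_OK:            return "encode successful";
    case CLI_PARSE_ERROR:   return "unable to parse command line";
    case CLI_OPEN_ERROR:    return "unable to open encoder";
    case CLI_HEADER_ERROR:  return "unable to generate stream headers";
    case CLI_ENCODER_ABORT: return "encoder abort";
    default:                return "unknown";
    }
}
```
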
Subject: [x265] cmake: bump build number for X265_LOG_FRAME

details:   http://hg.videolan.org/x265/rev/bf257ba100c5
branches:  
changeset: 9236:bf257ba100c5
user:      Steve Borho <steve at borho.org>
date:      Thu Jan 29 10:10:30 2015 -0600
description:
cmake: bump build number for X265_LOG_FRAME

diffstat:

 doc/reST/cli.rst                |   60 +++++++++++++++++-
 source/CMakeLists.txt           |    2 +-
 source/common/param.h           |    2 +-
 source/common/threading.h       |    2 +
 source/encoder/encoder.cpp      |  125 ++++++++++++++++++++++++---------------
 source/encoder/frameencoder.cpp |   94 ++++++++++++++++++++---------
 source/encoder/frameencoder.h   |   39 ++++++++----
 source/x265.cpp                 |   33 ++++++----
 source/x265.h                   |   16 ++--
 source/x265cli.h                |    8 ++-
 10 files changed, 262 insertions(+), 119 deletions(-)

diffs (truncated from 865 to 300 lines):

diff -r c1371f175178 -r bf257ba100c5 doc/reST/cli.rst
--- a/doc/reST/cli.rst	Mon Jan 26 15:31:42 2015 -0600
+++ b/doc/reST/cli.rst	Thu Jan 29 10:10:30 2015 -0600
@@ -28,7 +28,7 @@ consider this an error and abort.
 
 Generally, when an option expects a string value from a list of strings
 the user may specify the integer ordinal of the value they desire. ie:
-:option:`--log-level` 3 is equivalent to :option:`--log-level` debug.
+:option:`--log-level` 4 is equivalent to :option:`--log-level` debug.
 
 Executable Options
 ==================
@@ -45,13 +45,21 @@ Executable Options
 
 	**CLI ONLY**
 
+Command line executable return codes::
+
+	0. encode successful
+	1. unable to parse command line
+	2. unable to open encoder
+	3. unable to generate stream headers
+	4. encoder abort
+
 Logging/Statistic Options
 =========================
 
 .. option:: --log-level <integer|string>
 
 	Logging level. Debug level enables per-frame QP, metric, and bitrate
-	logging. If a CSV file is being generated, debug level makes the log
+	logging. If a CSV file is being generated, frame level makes the log
 	be per-frame rather than per-encode. Full level enables hash and
 	weight logging. -1 disables all logging, except certain fatal
 	errors, and can be specified by the string "none".
@@ -59,8 +67,9 @@ Logging/Statistic Options
 	0. error
 	1. warning
 	2. info **(default)**
-	3. debug
-	4. full
+	3. frame
+	4. debug
+	5. full
 
 .. option:: --no-progress
 
@@ -72,9 +81,50 @@ Logging/Statistic Options
 
 	Writes encoding results to a comma separated value log file. Creates
 	the file if it doesnt already exist, else adds one line per run.  if
-	:option:`--log-level` is debug or above, it writes one line per
+	:option:`--log-level` is frame or above, it writes one line per
 	frame. Default none
 
+	When frame level logging is enabled, several frame performance
+	statistics are listed:
+
+	**DecideWait ms** number of milliseconds the frame encoder had to
+	wait, since the previous frame was retrieved by the API thread,
+	before a new frame has been given to it. This is the latency
+	introduced by slicetype decisions (lookahead).
+	
+	**Row0Wait ms** number of milliseconds since the frame encoder
+	received a frame to encode before its first row of CTUs is allowed
+	to begin compression. This is the latency introduced by reference
+	frames making reconstructed and filtered rows available.
+	
+	**Wall time ms** number of milliseconds between the first CTU
+	being ready to be compressed and the entire frame being compressed
+	and the output NALs being completed.
+	
+	**Ref Wait Wall ms** number of milliseconds between the first
+	reference row being available and the last reference row becoming
+	available.
+	
+	**Total CTU time ms** the total time (measured in milliseconds)
+	spent by worker threads compressing and filtering CTUs for this
+	frame.
+	
+	**Stall Time ms** the number of milliseconds of the reported wall
+	time that were spent with zero worker threads, aka all compression
+	was completely stalled.
+
+	**Avg WPP** the average number of worker threads working on this
+	frame, at any given time. This value is sampled at the completion of
+	each CTU. This shows the effectiveness of Wavefront Parallel
+	Processing.
+
+	**Row Blocks** the number of times a worker thread had to abandon
+	the row of CTUs it was encoding because the row above it was not far
+	enough ahead for the necessary reference data to be available. This
+	is more of a problem for P frames where some blocks are much more
+	expensive than others.
+
+
 .. option:: --cu-stats, --no-cu-stats
 
 	Records statistics on how each CU was coded (split depths and other
diff -r c1371f175178 -r bf257ba100c5 source/CMakeLists.txt
--- a/source/CMakeLists.txt	Mon Jan 26 15:31:42 2015 -0600
+++ b/source/CMakeLists.txt	Thu Jan 29 10:10:30 2015 -0600
@@ -21,7 +21,7 @@ include(CheckSymbolExists)
 include(CheckCXXCompilerFlag)
 
 # X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 42)
+set(X265_BUILD 43)
 configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
                "${PROJECT_BINARY_DIR}/x265.def")
 configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"
diff -r c1371f175178 -r bf257ba100c5 source/common/param.h
--- a/source/common/param.h	Mon Jan 26 15:31:42 2015 -0600
+++ b/source/common/param.h	Thu Jan 29 10:10:30 2015 -0600
@@ -37,7 +37,7 @@ void  getParamAspectRatio(x265_param *p,
 bool  parseLambdaFile(x265_param *param);
 
 /* this table is kept internal to avoid confusion, since log level indices start at -1 */
-static const char * const logLevelNames[] = { "none", "error", "warning", "info", "debug", "full", 0 };
+static const char * const logLevelNames[] = { "none", "error", "warning", "info", "frame", "debug", "full", 0 };
 
 #define MAXPARAMSIZE 2000
 }
diff -r c1371f175178 -r bf257ba100c5 source/common/threading.h
--- a/source/common/threading.h	Mon Jan 26 15:31:42 2015 -0600
+++ b/source/common/threading.h	Thu Jan 29 10:10:30 2015 -0600
@@ -53,6 +53,7 @@
 #define ATOMIC_AND(ptr, mask)               __sync_fetch_and_and(ptr, mask)
 #define ATOMIC_INC(ptr)                     __sync_add_and_fetch((volatile int32_t*)ptr, 1)
 #define ATOMIC_DEC(ptr)                     __sync_add_and_fetch((volatile int32_t*)ptr, -1)
+#define ATOMIC_ADD(ptr, value)              __sync_add_and_fetch((volatile int32_t*)ptr, value)
 #define GIVE_UP_TIME()                      usleep(0)
 
 #elif defined(_MSC_VER)                 /* Windows atomic intrinsics */
@@ -63,6 +64,7 @@
 #define CTZ(id, x)                          _BitScanForward(&id, x)
 #define ATOMIC_INC(ptr)                     InterlockedIncrement((volatile LONG*)ptr)
 #define ATOMIC_DEC(ptr)                     InterlockedDecrement((volatile LONG*)ptr)
+#define ATOMIC_ADD(ptr, value)              InterlockedAdd((volatile LONG*)ptr, value)
 #define ATOMIC_OR(ptr, mask)                _InterlockedOr((volatile LONG*)ptr, (LONG)mask)
 #define ATOMIC_AND(ptr, mask)               _InterlockedAnd((volatile LONG*)ptr, (LONG)mask)
 #define GIVE_UP_TIME()                      Sleep(0)
diff -r c1371f175178 -r bf257ba100c5 source/encoder/encoder.cpp
--- a/source/encoder/encoder.cpp	Mon Jan 26 15:31:42 2015 -0600
+++ b/source/encoder/encoder.cpp	Thu Jan 29 10:10:30 2015 -0600
@@ -208,18 +208,25 @@ void Encoder::create()
             m_csvfpt = fopen(m_param->csvfn, "wb");
             if (m_csvfpt)
             {
-                if (m_param->logLevel >= X265_LOG_DEBUG)
+                if (m_param->logLevel >= X265_LOG_FRAME)
                 {
                     fprintf(m_csvfpt, "Encode Order, Type, POC, QP, Bits, ");
                     if (m_param->rc.rateControlMode == X265_RC_CRF)
                         fprintf(m_csvfpt, "RateFactor, ");
-                    fprintf(m_csvfpt, "Y PSNR, U PSNR, V PSNR, YUV PSNR, SSIM, SSIM (dB), "
-                                      "Encoding time, Elapsed time, List 0, List 1\n");
+                    fprintf(m_csvfpt, "Y PSNR, U PSNR, V PSNR, YUV PSNR, SSIM, SSIM (dB),  List 0, List 1");
+                    /* detailed performance statistics */
+                    fprintf(m_csvfpt, ", DecideWait (ms), Row0Wait (ms), Wall time (ms), Ref Wait Wall (ms), Total CTU time (ms), Stall Time (ms), Avg WPP, Row Blocks\n");
                 }
                 else
                     fputs(summaryCSVHeader, m_csvfpt);
             }
         }
+
+        if (!m_csvfpt)
+        {
+            x265_log(m_param, X265_LOG_ERROR, "Unable to open CSV log file <%s>, aborting\n", m_param->csvfn);
+            m_aborted = true;
+        }
     }
 
     if (m_frameEncoder)
@@ -976,7 +983,7 @@ void Encoder::writeLog(int argc, char **
 {
     if (m_csvfpt)
     {
-        if (m_param->logLevel >= X265_LOG_DEBUG)
+        if (m_param->logLevel >= X265_LOG_FRAME)
         {
             // adding summary to a per-frame csv log file needs a summary header
             fprintf(m_csvfpt, "\nSummary\n");
@@ -1115,14 +1122,14 @@ void Encoder::finishFrameStats(Frame* cu
             m_analyzeB.addSsim(ssim);
     }
 
-    // if debug log level is enabled, per frame logging is performed
+    char c = (slice->isIntra() ? 'I' : slice->isInterP() ? 'P' : 'B');
+    int poc = slice->m_poc;
+    if (!IS_REFERENCED(curFrame))
+        c += 32; // lower case if unreferenced
+
+    // if debug log level is enabled, per frame console logging is performed
     if (m_param->logLevel >= X265_LOG_DEBUG)
     {
-        char c = (slice->isIntra() ? 'I' : slice->isInterP() ? 'P' : 'B');
-        int poc = slice->m_poc;
-        if (!IS_REFERENCED(curFrame))
-            c += 32; // lower case if unreferenced
-
         char buf[1024];
         int p;
         p = sprintf(buf, "POC:%d %c QP %2.2lf(%d) %10d bits", poc, c, curEncData.m_avgQpAq, slice->m_sliceQp, (int)bits);
@@ -1149,43 +1156,6 @@ void Encoder::finishFrameStats(Frame* cu
             }
         }
 
-        // per frame CSV logging if the file handle is valid
-        if (m_csvfpt)
-        {
-            fprintf(m_csvfpt, "%d, %c-SLICE, %4d, %2.2lf, %10d,", m_outputCount++, c, poc, curEncData.m_avgQpAq, (int)bits);
-            if (m_param->rc.rateControlMode == X265_RC_CRF)
-                fprintf(m_csvfpt, "%.3lf,", curEncData.m_rateFactor);
-            double psnr = (psnrY * 6 + psnrU + psnrV) / 8;
-            if (m_param->bEnablePsnr)
-                fprintf(m_csvfpt, "%.3lf, %.3lf, %.3lf, %.3lf,", psnrY, psnrU, psnrV, psnr);
-            else
-                fprintf(m_csvfpt, " -, -, -, -,");
-            if (m_param->bEnableSsim)
-                fprintf(m_csvfpt, " %.6f, %6.3f,", ssim, x265_ssim2dB(ssim));
-            else
-                fprintf(m_csvfpt, " -, -,");
-            fprintf(m_csvfpt, " %.3lf, %.3lf", curEncoder->m_frameTime, curEncoder->m_elapsedCompressTime);
-            if (!slice->isIntra())
-            {
-                int numLists = slice->isInterP() ? 1 : 2;
-                for (int list = 0; list < numLists; list++)
-                {
-                    fprintf(m_csvfpt, ", ");
-                    for (int ref = 0; ref < slice->m_numRefIdx[list]; ref++)
-                    {
-                        int k = slice->m_refPOCList[list][ref] - slice->m_lastIDR;
-                        fprintf(m_csvfpt, " %d", k);
-                    }
-                }
-
-                if (numLists == 1)
-                    fprintf(m_csvfpt, ", -");
-            }
-            else
-                fprintf(m_csvfpt, ", -, -");
-            fprintf(m_csvfpt, "\n");
-        }
-
         if (m_param->decodedPictureHashSEI && m_param->logLevel >= X265_LOG_FULL)
         {
             const char* digestStr = NULL;
@@ -1205,7 +1175,60 @@ void Encoder::finishFrameStats(Frame* cu
                 p += sprintf(buf + p, " [Checksum:%s]", digestStr);
             }
         }
+
         x265_log(m_param, X265_LOG_DEBUG, "%s\n", buf);
+    }
+
+    if (m_param->logLevel >= X265_LOG_FRAME && m_csvfpt)
+    {
+        // per frame CSV logging if the file handle is valid
+        fprintf(m_csvfpt, "%d, %c-SLICE, %4d, %2.2lf, %10d,", m_outputCount++, c, poc, curEncData.m_avgQpAq, (int)bits);
+        if (m_param->rc.rateControlMode == X265_RC_CRF)
+            fprintf(m_csvfpt, "%.3lf,", curEncData.m_rateFactor);
+        double psnr = (psnrY * 6 + psnrU + psnrV) / 8;
+        if (m_param->bEnablePsnr)
+            fprintf(m_csvfpt, "%.3lf, %.3lf, %.3lf, %.3lf,", psnrY, psnrU, psnrV, psnr);
+        else
+            fputs(" -, -, -, -,", m_csvfpt);
+        if (m_param->bEnableSsim)
+            fprintf(m_csvfpt, " %.6f, %6.3f", ssim, x265_ssim2dB(ssim));
+        else
+            fputs(" -, -", m_csvfpt);
+        if (slice->isIntra())
+            fputs(", -, -", m_csvfpt);
+        else
+        {
+            int numLists = slice->isInterP() ? 1 : 2;
+            for (int list = 0; list < numLists; list++)
+            {
+                fprintf(m_csvfpt, ", ");
+                for (int ref = 0; ref < slice->m_numRefIdx[list]; ref++)
+                {
+                    int k = slice->m_refPOCList[list][ref] - slice->m_lastIDR;
+                    fprintf(m_csvfpt, " %d", k);
+                }
+            }
+
+            if (numLists == 1)
+                fputs(", -", m_csvfpt);
+        }
+
+#define ELAPSED_MSEC(start, end) (((double)(end) - (start)) / 1000)
+
+        // detailed frame statistics
+        fprintf(m_csvfpt, ", %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf",
+            ELAPSED_MSEC(0, curEncoder->m_slicetypeWaitTime),
+            ELAPSED_MSEC(curEncoder->m_startCompressTime, curEncoder->m_row0WaitTime),
+            ELAPSED_MSEC(curEncoder->m_row0WaitTime, curEncoder->m_endCompressTime),
+            ELAPSED_MSEC(curEncoder->m_row0WaitTime, curEncoder->m_allRowsAvailableTime),
+            ELAPSED_MSEC(0, curEncoder->m_totalWorkerElapsedTime),
+            ELAPSED_MSEC(0, curEncoder->m_totalNoWorkerTime));

