[x265] [PATCH] Performance: Prevent small thread-pools if NUMA disabled and # CPUs > MAX_POOL_THREADS

pradeep at multicorewareinc.com pradeep at multicorewareinc.com
Wed Aug 26 08:36:48 CEST 2015


# HG changeset patch
# User pradeep
# Date 1440570947 -19800
#      Wed Aug 26 12:05:47 2015 +0530
# Node ID 241bee8c193f8134522f48f433c532a2cbe11a55
# Parent  a28a863393994d8fb1d58c721352d9b4ec8c46ee
Performance: Prevent small thread-pools if NUMA disabled and # CPUs > MAX_POOL_THREADS

When NUMA is disabled and the number of CPUs exceeds MAX_POOL_THREADS (64 on 64-bit builds,
32 on 32-bit builds), the last pool may end up with only a few threads. This patch creates
the last pool only if it would have at least MAX_POOL_THREADS/2 threads; the 50% threshold
is a heuristic.

This change improves performance by 5% on an Intel Xeon E5-2699 v3, measured with the slower
preset on 4K video.
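
To make the heuristic concrete, here is a minimal standalone sketch of the trimming
arithmetic. It is not the x265 code itself: kMaxPoolThreads and trimWorkerCount are
hypothetical names chosen for illustration, the constant assumes a 64-bit build, and the
explicit "more CPUs than one pool can hold" guard follows the condition stated in the
description above.

```cpp
// Hypothetical stand-in for MAX_POOL_THREADS on a 64-bit build (assumption).
static const int kMaxPoolThreads = 64;

// Illustrative helper, not part of x265: given the CPU count seen when NUMA
// is disabled, return how many worker threads remain after dropping a
// trailing pool that would be less than half full.
static int trimWorkerCount(int cpuCount)
{
    // Threads that would land in the last, partially filled pool.
    const int remainder = cpuCount % kMaxPoolThreads;

    // Trim only when there is more than one pool's worth of CPUs and the
    // leftover pool would hold fewer than half the maximum.
    if (cpuCount > kMaxPoolThreads && remainder < kMaxPoolThreads / 2)
        return cpuCount - remainder;

    return cpuCount;
}
```

Under these assumptions, trimWorkerCount(72) yields 64 (a trailing pool of 8 threads is
dropped), while trimWorkerCount(100) stays at 100 because the trailing pool of 36 threads
is more than half full. A count of 8 is left untouched, since it fits in a single pool.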

diff -r a28a86339399 -r 241bee8c193f source/common/threadpool.cpp
--- a/source/common/threadpool.cpp	Mon Aug 24 14:04:32 2015 +0530
+++ b/source/common/threadpool.cpp	Wed Aug 26 12:05:47 2015 +0530
@@ -289,6 +289,14 @@
         }
     }
 
+    // In the case that NUMA is disabled and we have more CPUs than MAX_POOL_THREADS,
+    // spawn the last pool only if the # threads in that pool is >= 1/2 max (heuristic)
+    if ((numNumaNodes == 1) && (cpusPerNode[0] > MAX_POOL_THREADS) && (cpusPerNode[0] % MAX_POOL_THREADS < (MAX_POOL_THREADS / 2)))
+    {
+        cpusPerNode[0] -= (cpusPerNode[0] % MAX_POOL_THREADS);
+        x265_log(p, X265_LOG_DEBUG, "Creating only %d worker threads to prevent asymmetry in pools; may not use all HW contexts\n", cpusPerNode[0]);
+    }
+
     numPools = 0;
     for (int i = 0; i < numNumaNodes; i++)
     {

