[x265] [PATCH] Performance: Balance # threads per pool for non-NUMA machines with > 64 vCPUs

pradeep at multicorewareinc.com
Mon Aug 24 11:19:02 CEST 2015


# HG changeset patch
# User pradeep
# Date 1440406873 -19800
#      Mon Aug 24 14:31:13 2015 +0530
# Node ID cf6210f6f5cbbeec441f7eee3d8abf82208942fd
# Parent  f63273fa3137fef2f6898c686b68ee12608acd31
Performance: Balance # threads per pool for non-NUMA machines with > 64 vCPUs

By default, each thread pool may have up to 64 threads. On a machine with more
than 64 logical CPUs and no NUMA support, we end up creating the first pool with
64 threads and a second pool with only the few remaining threads, which
significantly hurts performance. This fix splits the threads evenly across the
pools in that case.

diff -r f63273fa3137 -r cf6210f6f5cb source/common/threadpool.cpp
--- a/source/common/threadpool.cpp	Thu Aug 20 11:13:25 2015 +0530
+++ b/source/common/threadpool.cpp	Mon Aug 24 14:31:13 2015 +0530
@@ -307,6 +307,16 @@
         numPools = X265_MAX(p->frameNumThreads / 2, 1);
     }
 
+    // In the case that NUMA is disabled and we have more than 64 CPUs,
+    // balance the # threads created across thread pools
+    if ((numNumaNodes == 1) && (numPools > 1))
+    {
+        int threads = cpusPerNode[0];
+        for (int i = 0; i < numPools; i++)
+            cpusPerNode[i] = threads / numPools;
+        cpusPerNode[0] += threads % numPools;
+    }
+
     ThreadPool *pools = new ThreadPool[numPools];
     if (pools)
     {

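For reference, the balancing step can be sketched outside of x265 as a small standalone helper. This is only an illustration under stated assumptions, not the code in threadpool.cpp: the names balanceThreads and poolsNeeded are hypothetical, and the pool count is assumed to be the thread count divided by the 64-thread limit, rounded up.

```cpp
#include <vector>

// Hypothetical helper, not part of x265: number of pools needed when each
// pool holds at most maxPoolThreads threads (assumed: ceiling division).
int poolsNeeded(int threads, int maxPoolThreads = 64)
{
    return (threads + maxPoolThreads - 1) / maxPoolThreads;
}

// Hypothetical helper, not part of x265: split `threads` evenly across
// `numPools` pools and give the remainder to the first pool, mirroring
// what the patch does to cpusPerNode[].
std::vector<int> balanceThreads(int threads, int numPools)
{
    if (numPools <= 0)
        return std::vector<int>();

    std::vector<int> perPool(numPools, threads / numPools);
    perPool[0] += threads % numPools;
    return perPool;
}
```

Under these assumptions, 72 logical CPUs need two pools. Without balancing they would be sized 64 and 8; balanceThreads(72, 2) gives 36 and 36 instead. With 130 CPUs and three pools the split is 44, 43 and 43.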
