[x265] [PATCH] threadpool.cpp: use WIN system call for popcount

Pradeep Ramachandran pradeep at multicorewareinc.com
Thu May 3 18:37:33 CEST 2018


On Thu, May 3, 2018 at 2:23 PM, <praveen at multicorewareinc.com> wrote:

> # HG changeset patch
> # User Praveen Tiwari <praveen at multicorewareinc.com>
> # Date 1525328839 -19800
> #      Thu May 03 11:57:19 2018 +0530
> # Branch stable
> # Node ID 9cbb2aadcca3a2f7a308ea1dc792fb817bcc5b51
> # Parent  69aafa6d70ad4e151f4590766c6b125621c5d007
> threadpool.cpp: use WIN system call for popcount
>

Unless this fixes a known bug, I don't want to push this directly into
stable. Syscalls are notorious especially when working with older versions
of the OS.
I would rather push this into default and allow users to test that this
works with all kinds of systems and then merge with stable once the answer
is known.
Does this fix a specific issue on some platform, or improve performance?


>
> diff -r 69aafa6d70ad -r 9cbb2aadcca3 source/common/threadpool.cpp
> --- a/source/common/threadpool.cpp      Wed May 02 15:15:05 2018 +0530
> +++ b/source/common/threadpool.cpp      Thu May 03 11:57:19 2018 +0530
> @@ -71,21 +71,6 @@
>  # define strcasecmp _stricmp
>  #endif
>
> -#if defined(_WIN32_WINNT) && _WIN32_WINNT >= _WIN32_WINNT_WIN7
> -const uint64_t m1 = 0x5555555555555555; //binary: 0101...
> -const uint64_t m2 = 0x3333333333333333; //binary: 00110011..
> -const uint64_t m3 = 0x0f0f0f0f0f0f0f0f; //binary:  4 zeros,  4 ones ...
> -const uint64_t h01 = 0x0101010101010101; //the sum of 256 to the power of
> 0,1,2,3...
> -
> -static int popCount(uint64_t x)
> -{
> -    x -= (x >> 1) & m1;
> -    x = (x & m2) + ((x >> 2) & m2);
> -    x = (x + (x >> 4)) & m3;
> -    return (x * h01) >> 56;
> -}
> -#endif
> -
>  namespace X265_NS {
>  // x265 private namespace
>
> @@ -274,7 +259,7 @@
>      for (int i = 0; i < numNumaNodes; i++)
>      {
>          GetNumaNodeProcessorMaskEx((UCHAR)i, groupAffinityPointer);
> -        cpusPerNode[i] = popCount(groupAffinityPointer->Mask);
> +        cpusPerNode[i] = __popcnt(static_cast<unsigned
> int>(groupAffinityPointer->Mask));
>      }
>      delete groupAffinityPointer;
>  #elif HAVE_LIBNUMA
> @@ -623,7 +608,7 @@
>      for (int i = 0; i < numNumaNodes; i++)
>      {
>          GetNumaNodeProcessorMaskEx((UCHAR)i, &groupAffinity);
> -        cpus += popCount(groupAffinity.Mask);
> +        cpus += __popcnt(static_cast<unsigned int>(groupAffinity.Mask));
>      }
>      return cpus;
>  #elif _WIN32
> _______________________________________________
> x265-devel mailing list
> x265-devel at videolan.org
> https://mailman.videolan.org/listinfo/x265-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.videolan.org/pipermail/x265-devel/attachments/20180503/beb4376f/attachment.html>


More information about the x265-devel mailing list