[x265] x265 CPU utilization very low on a multi-numa sockets server
Ximing Cheng
chengximing1989 at gmail.com
Mon Aug 3 04:12:04 CEST 2015
I found that the lookahead JobProvider only processes its tasks on thread pool
zero (the first thread pool). This breaks the load balance of a
multi-thread-pool system, because the frame encoders are distributed across
the thread pools by round robin. The encoder cannot fully use the CPU
resources because of the strong dependencies within the HEVC algorithm.
Some worker threads sit idle waiting for a job to wake them, but the
lookahead cannot wake threads in the second thread pool, and the main
x265 encoder must wait for the output queue of the lookahead. If the
lookahead used the same round-robin strategy as the frame encoders to
distribute frames across the thread pools, would that work better on a
multi-NUMA system? Thanks!
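
To make the proposal concrete, here is a minimal sketch of round-robin assignment of lookahead work to thread pools. It is only an illustration of the idea: `Pool`, `pickPool` and `distributeLookahead` are hypothetical names, not x265's actual ThreadPool or Lookahead classes, and it ignores the dependencies between frames that a real lookahead would have to respect.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-NUMA-node thread pool (not x265's class).
struct Pool {
    int numaNode;                    // NUMA node this pool is bound to
    std::vector<int> assignedFrames; // frame numbers routed to this pool
};

// Pick a pool for frame `poc` by round robin, the same way the frame
// encoders are described as being spread across the pools.
inline std::size_t pickPool(int poc, std::size_t numPools) {
    return static_cast<std::size_t>(poc) % numPools;
}

// Route the lookahead work for frames [0, frameCount) to the pools in turn.
inline void distributeLookahead(std::vector<Pool>& pools, int frameCount) {
    if (pools.empty())
        return; // nothing to distribute to
    for (int poc = 0; poc < frameCount; ++poc)
        pools[pickPool(poc, pools.size())].assignedFrames.push_back(poc);
}
```

With two pools, even-numbered frames would go to pool 0 and odd-numbered frames to pool 1, so worker threads on both NUMA nodes could be woken by lookahead jobs.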
On Tue, Jul 28, 2015 at 1:57 PM, Mario *LigH* Rohkrämer <contact at ligh.de>
wrote:
> Hi Cheng.
>
> This issue has been discussed before in the VideoHelp forum.
> Parallelization is somewhat more limited because dependencies between tasks
> in HEVC algorithms are possibly more restrictive than in AVC algorithms
> (many parts of the HEVC algorithm must wait for other parts to finish their
> intermediate results, and splitting the video frame across too many slices
> would hurt the encoding efficiency).
>
> But it is easily possible to run several applications in parallel so that
> they each get a share of available cores.
>
>
> On 28.07.2015 at 03:58, Ximing Cheng <chengximing1989 at gmail.com> wrote:
>
>> Hi, I am testing x265 on a server with two NUMA nodes; each node has 36 cores.
>> The x265 version is the 1.7 release, run with the command line
>>
>> ./x265 --input-res 1920x1080 --input input.yuv --bitrate 1200 \
>>   --vbv-maxrate 1380 --fps 20 --early-skip --preset fast -o test1.hevc
>>
>> but when running on the server, CPU utilization ranges from 27% to 35%
>> (below 40%), which means most of the CPU cores are not busy.
>>
>> x265 [info]: HEVC encoder version 1.7
>> x265 [info]: build info [Linux][GCC 4.4.6][64 bit] 8bpp
>> x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2
>> x265 [warning]: --psnr used with AQ on: results will be invalid!
>> x265 [warning]: --tune psnr should be used if attempting to benchmark psnr!
>> x265 [info]: Main profile, Level-4 (Main tier)
>> x265 [info]: Thread pool 0 using 36 threads on NUMA node 0
>> x265 [info]: Thread pool 1 using 36 threads on NUMA node 1
>> x265 [info]: frame threads / pool features       : 16 / wpp(34 rows)+pmode
>> x265 [warning]: VBV maxrate specified, but no bufsize, ignored
>> x265 [info]: Coding QT: max CU size, min CU size : 32 / 8
>> x265 [info]: Residual QT: max TU size, max depth : 32 / 2 inter / 2 intra
>> x265 [info]: ME / range / subpel / merge         : star / 57 / 1 / 2
>> x265 [info]: Keyframe min / max / scenecut       : 20 / 250 / 40
>> x265 [info]: Lookahead / bframes / badapt        : 60 / 4 / 2
>> x265 [info]: b-pyramid / weightp / weightb / refs: 1 / 1 / 1 / 1
>> x265 [info]: AQ: mode / str / qg-size / cu-tree  : 1 / 0.3 / 32 / 1
>> x265 [info]: Rate Control / qCompress            : ABR-1200 kbps / 0.60
>> x265 [info]: tools: rect amp rd=4 rdoq=2 early-skip signhide tmvp b-intra
>>
>
>
> --
>
> Fun and success!
> Mario *LigH* Rohkrämer
> mailto:contact at ligh.de
>
>
> _______________________________________________
> x265-devel mailing list
> x265-devel at videolan.org
> https://mailman.videolan.org/listinfo/x265-devel
>