[x265] Parallelization on "manycore" systems

Pradeep Ramachandran pradeep at multicorewareinc.com
Thu Feb 2 05:39:15 CET 2017


Michael,
There have been a few other efforts to create a benchmarking tool around
x265, as it stresses the CPU pretty heavily; some of these tools are
available for free download as well.

As far as scaling goes, a single instance of x265 currently scales well to
around 20-25 CPU threads; beyond that, the serial nature of the decisions
the encoder makes limits parallelism. We have recently implemented the
slices feature (the number above is without slices), which could let us
scale further, and this is something that we're actively looking at.
Raising --lookahead-slices and --lookahead-threads should also help, but
the specifics depend on the content that you are encoding, so I'd encourage
you to experiment with them and share your results so that the community
can comment.
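
One way to run that experiment is to sweep the three options over a small
grid and time each run. The following is only a rough sketch in Python: the
value ranges, the function name sweep_commands, and the input/output file
names are placeholders chosen for illustration, not recommended settings.

```python
import itertools


def sweep_commands(base_args,
                   slices=(1, 2, 4),
                   la_slices=(0, 4, 8),
                   la_threads=(0, 2, 4)):
    """Yield one x265 argument list per combination of the three knobs.

    The default value tuples are arbitrary illustration values; pick ranges
    that make sense for the content and machine under test.
    """
    for s, ls, lt in itertools.product(slices, la_slices, la_threads):
        yield list(base_args) + [
            "--slices", str(s),
            "--lookahead-slices", str(ls),
            "--lookahead-threads", str(lt),
        ]


if __name__ == "__main__":
    # Hypothetical file names; replace with the real benchmark clip.
    base = ["x265", "--input", "input.y4m", "--output", "out.hevc"]
    for cmd in sweep_commands(base):
        print(" ".join(cmd))
```

Each printed line can then be run and timed (for example with
subprocess.run) to see where throughput stops improving for a given clip.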

Also, from your command line below, I would remove the --pass 1
--slow-firstpass options, as I don't think they are relevant for you; they
are used to generate results from a quick first pass that can be refined
further in subsequent passes.
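
If the benchmark script assembles the option list programmatically, the two
options can be filtered out in one place. This is a minimal sketch; the
helper name strip_two_pass is made up for illustration, and it assumes the
options are held as a list of separate tokens.

```python
def strip_two_pass(args):
    """Return a copy of args without '--pass <n>' and '--slow-firstpass'."""
    out = []
    skip_next = False
    for a in args:
        if skip_next:
            # This token is the value that followed '--pass'; drop it too.
            skip_next = False
            continue
        if a == "--pass":
            skip_next = True
            continue
        if a == "--slow-firstpass":
            continue
        out.append(a)
    return out


if __name__ == "__main__":
    opts = ["-p", "veryslow", "--ctu", "16", "--pass", "1", "--slow-firstpass"]
    print(strip_two_pass(opts))
```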

Pradeep.

On Wed, Feb 1, 2017 at 4:55 PM, Michael Lackner <
michael.lackner at unileoben.ac.at> wrote:

> Greetings,
>
> I have a question about parallelization in x265. I'm currently preparing a
> benchmarking project based on x265 (a successor of a similar project using
> x264).
>
> The x264 one, created in 2010, was locked to a specific version and set of
> options, and is now running out of steam because it fails to fully utilize
> today's larger processors (16 or more logical CPUs).
>
> I'm currently basing this new thing on 4K input content (either UHD or full
> 4096x2160, unsure), and I'd like it to scale up to around 1000-2000 logical
> CPUs or more if possible (fully loading them). This would also make it
> possible to load entire shared memory clusters today.
>
> I don't care about effective output quality that much, so parallelization
> is paramount.
>
> I've seen that x265 has a few knobs you can turn manually to better utilize
> many cores, but for my content I'm not sure when I should set which option
> to what value?! I don't have test systems for this yet of course...
>
> I've begun to write a script to determine logical CPU counts on Windows,
> Linux and FreeBSD, I just need to know what to do with the following:
>
> --slices <integer>
> --lookahead-slices <0..16>
> --lookahead-threads <integer>
>
> I'm already using:
>
> --ctu 16
> --wpp
> --pmode
> --pme
>
> In total, my current options are like this (I also want to be hard on the
> CPU per clock to make the benchmark run long enough even with a small enough
> input file, but only where it doesn't hurt parallelization):
>
> -D 10 --fps 24000/1001 -p veryslow --pmode --pme --wpp --open-gop --ref 6
> --bframes 16 --b-pyramid --weightb --max-merge 5 --b-intra --bitrate 10000
> --rect --amp --aq-mode 2 --no-sao --qcomp 0.75 --no-strong-intra-smoothing
> --psy-rd 1.6 --psy-rdoq 5.0 --rdoq-level 1 --tu-inter-depth 4
> --tu-intra-depth 4 --ctu 16 --max-tu-size 32 --pass 1 --slow-firstpass
> --stats v.stats --sar 1 --range full
>
> These might not be good settings for my purpose, and some are redundant
> given the profile I guess, which is why I'd like to ask here. I'm just
> unsure when I should start using more lookahead slices. And then? Should I
> switch from lookahead slices to lookahead threads at some point, or can
> both be used together!?
>
> When should I start slicing up input frames, and what do I need to consider
> when choosing proper values for --slices <integer> etc.?
>
> Is it even possible to scale to THAT many cores with 4K/UHD content?!
>
> I wanna make this a bit more future-proof this time around...
>
> Thanks a lot for your input!
>
> Best,
> Michael
>
> --
> Michael Lackner
> Lehrstuhl für Informationstechnologie (CiT)
> Montanuniversität Leoben
> Tel.: +43 (0)3842/402-1505 | Mail: michael.lackner at unileoben.ac.at
> Fax.: +43 (0)3842/402-1502 | Web: http://institute.unileoben.ac.at/infotech
> _______________________________________________
> x265-devel mailing list
> x265-devel at videolan.org
> https://mailman.videolan.org/listinfo/x265-devel
>