[x264-devel] OpenCL lookahead
Steve Borho
git at videolan.org
Tue Apr 23 23:29:24 CEST 2013
x264 | branch: master | Steve Borho <steve at borho.org> | Thu Feb 21 12:48:40 2013 -0600| [9bc3da1eb10ff5e9c592d7c737b0fd4ffbb3e157] | committer: Jason Garrett-Glaser
OpenCL lookahead
OpenCL support is compiled in by default, but must be enabled at runtime with
the --opencl command line flag. Compiling OpenCL support requires Perl. To avoid
the Perl requirement use: configure --disable-opencl.
When enabled, the lookahead thread is mostly off-loaded to an OpenCL-capable GPU
device. Lowres intra cost prediction, lowres motion search (including subpel)
and bidir cost predictions are all done on the GPU. MB-tree and final slice
decisions are still done by the CPU. Presets which do not use a threaded
lookahead (superfast, ultrafast) will not use OpenCL at all.
Because of data dependencies, the GPU must use an iterative motion search which
performs more total work than the CPU would do, so this is not work efficient
or power efficient. But if there are GPU cycles to spare, it can often speed up
the encode. Output quality with OpenCL lookahead enabled is often very slightly
worse than with the CPU lookahead (because of the same data dependencies).
x264 must compile its OpenCL kernels for your device before running them. To
avoid doing this on every run, it caches the compiled kernel binary in a file
named x264_lookahead.clbin (--opencl-clbin FNAME to override). The cache file
is ignored if the device, driver, or OpenCL source has changed.
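
A minimal sketch of the cache check described above, not the code from this
commit. It assumes the cache records the device name, driver version and a
hash of the OpenCL source, and reuses the binary only if all three still
match. The names cache_key_t, hash_source and cache_is_valid are hypothetical,
and FNV-1a is only a stand-in hash.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical identity record kept with a cached kernel binary.
 * Field names are illustrative, not x264's actual on-disk layout. */
typedef struct {
    char     device_name[64];
    char     driver_version[64];
    uint32_t source_hash;      /* hash of the OpenCL source text */
} cache_key_t;

/* FNV-1a over a NUL-terminated string; a stand-in for whatever
 * hash a real implementation would apply to the kernel source. */
static uint32_t hash_source(const char *src)
{
    uint32_t h = 2166136261u;
    for (; *src; src++) {
        h ^= (unsigned char)*src;
        h *= 16777619u;
    }
    return h;
}

/* Returns 1 if the cached binary may be reused, 0 if the kernels
 * must be recompiled because device, driver or source differ. */
static int cache_is_valid(const cache_key_t *cached, const cache_key_t *current)
{
    assert(cached != NULL && current != NULL);
    return strcmp(cached->device_name, current->device_name) == 0
        && strcmp(cached->driver_version, current->driver_version) == 0
        && cached->source_hash == current->source_hash;
}
```

In this sketch a driver update alone is enough to reject the cache, since a
binary built by one driver version may not load correctly under another.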
x264 will use the first GPU device which supports the cl_image features
required by its kernels. Most modern discrete GPUs and all AMD integrated GPUs
will work. Intel integrated GPUs (up to IvyBridge) do not support the necessary
features. Use --opencl-device N to specify the number of capable GPUs to skip
during device detection.
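
The selection rule above can be sketched as follows. This is an illustration
under stated assumptions, not the detection code from this commit:
device_info_t and pick_device are hypothetical names, and the capability flags
are plain fields here, whereas a real implementation would query them through
the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device description used only for this sketch. */
typedef struct {
    const char *name;
    int is_gpu;              /* nonzero for GPU devices */
    int has_image_support;   /* nonzero if the needed cl_image features exist */
} device_info_t;

/* Returns the index of the chosen device, or -1 if none qualifies.
 * skip is the number of capable GPUs to pass over, i.e. the N
 * given to --opencl-device. */
static int pick_device(const device_info_t *devs, size_t count, int skip)
{
    assert(devs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        /* Incapable devices are ignored and do not count toward skip. */
        if (!devs[i].is_gpu || !devs[i].has_image_support)
            continue;
        if (skip-- > 0)
            continue;
        return (int)i;
    }
    return -1;
}
```

Note that only capable GPUs count toward N in this sketch, so on a machine
with one unsupported integrated GPU and two capable discrete GPUs, N = 1
would select the second discrete GPU.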
Switchable graphics environments (e.g. AMD Enduro) are currently not supported,
as some have bugs in their OpenCL drivers that cause output to be silently
incorrect.
Developed by MulticoreWare with support from AMD and Telestream.
> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=9bc3da1eb10ff5e9c592d7c737b0fd4ffbb3e157
---
.gitignore | 2 +
Makefile | 43 +-
common/common.c | 12 +
common/common.h | 10 +
common/frame.c | 3 +
common/frame.h | 4 +
common/opencl.c | 606 +++++++++++++++++++++++
common/opencl.h | 120 +++++
common/opencl/bidir.cl | 265 ++++++++++
common/opencl/downscale.cl | 135 ++++++
common/opencl/intra.cl | 1072 +++++++++++++++++++++++++++++++++++++++++
common/opencl/motionsearch.cl | 249 ++++++++++
common/opencl/subpel.cl | 242 ++++++++++
common/opencl/weightp.cl | 48 ++
common/opencl/x264-cl.h | 132 +++++
configure | 61 +++
encoder/encoder.c | 41 ++
encoder/slicetype-cl.c | 766 +++++++++++++++++++++++++++++
encoder/slicetype.c | 196 +++++---
tools/cltostr.pl | 65 +++
x264.c | 6 +
x264.h | 5 +
22 files changed, 4007 insertions(+), 76 deletions(-)
Diff: http://git.videolan.org/gitweb.cgi/x264.git/?a=commitdiff;h=9bc3da1eb10ff5e9c592d7c737b0fd4ffbb3e157