[x264-devel] [OpenCL] Please whitelist mesa's OpenCL implementation
Tom Stellard
tom at stellard.net
Thu Feb 13 17:19:43 CET 2014
On Thu, Feb 13, 2014 at 05:13:52PM +0100, Christian König wrote:
> Hi Niccolò,
>
> I would like to help, but this is probably more a question for Tom
> Stellard. CCing him on this mail as well.
>
> Cheers,
> Christian.
>
> On 13.02.2014 16:09, Niccolò Belli wrote:
> > Hi,
> > I'm using Gentoo Linux amd64 with the latest graphics stack from git (mesa
> > git, llvm git, linux 3.14, etc.) and x264 git. My graphics card is an
> > AMD Radeon HD 7950.
> >
> > Test file (~90s/~200MB):
> > https://mega.co.nz/#!eQhSjJQR!EEe8-taN5IspIu-RW0WQzmvKzc5fkCn282kS5ugZ_as
> >
> > ffmpeg -i PlanetEarthBirds.mkv -c:v libx264 -preset slower -crf 18
> > -x264opts opencl -c:a copy slower-cfr18-ocl.mkv
> >
> > This is what I get when encoding with --opencl with Catalyst 14.1_beta:
> >
> > [libx264 @ 0x257e360] using cpu capabilities: MMX2 SSE2Fast SSSE3
> > SSE4.2 AVX
> > [libx264 @ 0x257e360] OpenCL acceleration enabled with Advanced Micro
> > Devices, Inc. Tahiti (SI)
> >
> > Unfortunately when I use mesa's OpenCL implementation I get:
> >
> > [libx264 @ 0x1409360] using cpu capabilities: MMX2 SSE2Fast SSSE3
> > SSE4.2 AVX
> > [libx264 @ 0x1409360] OpenCL: Unable to find a compatible device
> >
> >
> > This is my clinfo with radeonsi (mesa):
> >
> > ~ $ clinfo
> >
> > clinfo: /usr/lib64/libOpenCL.so.1: no version information available
> > (required by clinfo)
> >
> > Number of platforms: 1
> > Platform Profile: FULL_PROFILE
> > Platform Version: OpenCL 1.1 MESA 10.2.0-devel
> > Platform Name: Default
> > Platform Vendor: Mesa
> > Platform Extensions: cl_khr_icd
> >
> >
> > Platform Name: Default
> > Number of devices: 1
> > Device Type: CL_DEVICE_TYPE_GPU
> > Device ID: 4098
> > Max compute units: 1
> > Max work items dimensions: 3
> > Max work items[0]: 256
> > Max work items[1]: 256
> > Max work items[2]: 256
> > Max work group size: 256
> > Preferred vector width char: 16
> > Preferred vector width short: 8
> > Preferred vector width int: 4
> > Preferred vector width long: 2
> > Preferred vector width float: 4
> > Preferred vector width double: 2
> > Native vector width char: 16
> > Native vector width short: 8
> > Native vector width int: 4
> > Native vector width long: 2
> > Native vector width float: 4
> > Native vector width double: 2
> > Max clock frequency: 0Mhz
> > Address bits: 32
> > Max memory allocation: 500000000
> > Image support: Yes
> > Max number of images read arguments: 32
> > Max number of images write arguments: 32
> > Max image 2D width: 32768
> > Max image 2D height: 32768
> > Max image 3D width: 32768
> > Max image 3D height: 32768
> > Max image 3D depth: 32768
> > Max samplers within kernel: 0
> > Max size of kernel argument: 1024
> > Alignment (bits) of base address: 128
> > Minimum alignment (bytes) for any datatype: 128
> > Single precision floating point capability
> > Denorms: Yes
> > Quiet NaNs: Yes
> > Round to nearest even: Yes
> > Round to zero: No
> > Round to +ve and infinity: No
> > IEEE754-2008 fused multiply-add: No
> > Cache type: None
> > Cache line size: 0
> > Cache size: 0
> > Global memory size: 2000000000
> > Constant buffer size: 0
> > Max number of constant args: 0
> > Local memory type: Scratchpad
> > Local memory size: 32768
> > Kernel Preferred work group size multiple: 1
> > Error correction support: 0
> > Unified memory for Host and Device: 1
> > Profiling timer resolution: 0
> > Device endianess: Little
> > Available: Yes
> > Compiler available: Yes
> > Execution capabilities:
> > Execute OpenCL kernels: Yes
> > Execute native function: No
> > Queue properties:
> > Out-of-Order: No
> > Profiling : Yes
> > Platform ID: 0x00007f04160dcfc0
> > Name: AMD TAHITI
> > Vendor: X.Org
> > Device OpenCL C version: OpenCL C 1.1
> > Driver version: 10.2.0-devel
> > Profile: FULL_PROFILE
> > Version: OpenCL 1.1 MESA 10.2.0-devel
> > Extensions:
> >
> >
> >
> > And this is my clinfo with Catalyst 14.1_beta:
> >
> > ~ $ clinfo | head -83
> >
> > Number of platforms: 1
> > Platform Profile: FULL_PROFILE
> > Platform Version: OpenCL 1.2 AMD-APP (1411.4)
> > Platform Name: AMD Accelerated Parallel Processing
> > Platform Vendor: Advanced Micro Devices, Inc.
> > Platform Extensions: cl_khr_icd
> > cl_amd_event_callback cl_amd_offline_devices cl_amd_hsa
> >
> >
> > Platform Name: AMD Accelerated Parallel Processing
> > Number of devices: 2
> > Device Type: CL_DEVICE_TYPE_GPU
> > Device ID: 4098
> > Board name: AMD Radeon HD 7900 Series
> > Device Topology: PCI[ B#3, D#0, F#0 ]
> > Max compute units: 28
> > Max work items dimensions: 3
> > Max work items[0]: 256
> > Max work items[1]: 256
> > Max work items[2]: 256
> > Max work group size: 256
> > Preferred vector width char: 4
> > Preferred vector width short: 2
> > Preferred vector width int: 1
> > Preferred vector width long: 1
> > Preferred vector width float: 1
> > Preferred vector width double: 1
> > Native vector width char: 4
> > Native vector width short: 2
> > Native vector width int: 1
> > Native vector width long: 1
> > Native vector width float: 1
> > Native vector width double: 1
> > Max clock frequency: 810Mhz
> > Address bits: 32
> > Max memory allocation: 1073741824
> > Image support: Yes
> > Max number of images read arguments: 128
> > Max number of images write arguments: 8
> > Max image 2D width: 16384
> > Max image 2D height: 16384
> > Max image 3D width: 2048
> > Max image 3D height: 2048
> > Max image 3D depth: 2048
> > Max samplers within kernel: 16
> > Max size of kernel argument: 1024
> > Alignment (bits) of base address: 2048
> > Minimum alignment (bytes) for any datatype: 128
> > Single precision floating point capability
> > Denorms: No
> > Quiet NaNs: Yes
> > Round to nearest even: Yes
> > Round to zero: Yes
> > Round to +ve and infinity: Yes
> > IEEE754-2008 fused multiply-add: Yes
> > Cache type: Read/Write
> > Cache line size: 64
> > Cache size: 16384
> > Global memory size: 2923429888
> > Constant buffer size: 65536
> > Max number of constant args: 8
> > Local memory type: Scratchpad
> > Local memory size: 32768
> > Kernel Preferred work group size multiple: 64
> > Error correction support: 0
> > Unified memory for Host and Device: 0
> > Profiling timer resolution: 1
> > Device endianess: Little
> > Available: Yes
> > Compiler available: Yes
> > Execution capabilities:
> > Execute OpenCL kernels: Yes
> > Execute native function: No
> > Queue properties:
> > Out-of-Order: No
> > Profiling : Yes
> > Platform ID: 0x00007f59736ee500
> > Name: Tahiti
> > Vendor: Advanced Micro Devices, Inc.
> > Device OpenCL C version: OpenCL C 1.2
> > Driver version: 1411.4 (VM)
> > Profile: FULL_PROFILE
> > Version: OpenCL 1.2 AMD-APP (1411.4)
> > Extensions: cl_khr_fp64 cl_amd_fp64
> > cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
> > cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
> > cl_khr_int64_base_atomics cl_khr_int64_extended_atomics
> > cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing
> > cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3
> > cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt
> > cl_khr_image2d_from_buffer cl_khr_spir
> >
> >
> > Is it possible to whitelist mesa's OpenCL implementation? Can someone
> > please write a patch to force-enable OpenCL so that I can at least
> > test if it works with mesa?
> >
Does x264 currently blacklist Mesa? It's possible it's failing for
some other reason, such as a missing extension or an error
compiling one of the kernels.
-Tom
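
One way to follow up on the missing-extension possibility is to compare the
two clinfo dumps quoted above. The following is a minimal Python sketch, not
part of x264 or Mesa: the helper names parse_clinfo, differing_fields and
missing_extensions are hypothetical, and the sample strings are short excerpts
of the quoted output (the Catalyst extension list is cut to its first line).
It makes no assumption about which capabilities x264 actually requires; it
only lists what differs between the two drivers.

```python
def parse_clinfo(text):
    """Parse 'Key: value' lines into a dict; the first occurrence of a key wins."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and key not in fields:
            fields[key] = value.strip()
    return fields


def differing_fields(reference, candidate):
    """Return {key: (reference_value, candidate_value)} for keys whose values differ."""
    return {
        key: (reference[key], candidate[key])
        for key in reference
        if key in candidate and reference[key] != candidate[key]
    }


def missing_extensions(reference, candidate):
    """Return device extensions listed by the reference driver but not the candidate."""
    ref = set(reference.get("Extensions", "").split())
    cand = set(candidate.get("Extensions", "").split())
    return sorted(ref - cand)


# Excerpts copied from the clinfo output quoted in this thread.
MESA = """\
Max compute units: 1
Image support: Yes
Max samplers within kernel: 0
Extensions:
"""

CATALYST = """\
Max compute units: 28
Image support: Yes
Max samplers within kernel: 16
Extensions: cl_khr_fp64 cl_amd_fp64
"""

if __name__ == "__main__":
    catalyst = parse_clinfo(CATALYST)
    mesa = parse_clinfo(MESA)
    for key, (ref_value, cand_value) in differing_fields(catalyst, mesa).items():
        print(f"{key}: catalyst={ref_value!r} mesa={cand_value!r}")
    print("missing on mesa:", missing_extensions(catalyst, mesa))
```

Feeding the full clinfo output of each driver through parse_clinfo gives a
list of candidate differences to check against x264's device-selection code;
a kernel compile failure would not show up here and would need the OpenCL
build log instead.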