[x264-devel] Re: Scalability

Fri Mar 2 04:04:59 CET 2007

On Thu, 1 Mar 2007, Christian Bienia wrote:
>
> Overall, the current parallelization of x264 still seems a little
> limited. Loren mentioned that fine-granular parallelization is possible,
> but more difficult. But a combined approach seems to be promising: A
> central work queue would contain all work units which await processing.
> A work unit can be a chunk of a frame (no more than 2-4 chunks per
> frame). As soon as a thread has finished working on a work unit, it
> enqueues any work units of subsequent frames which are now possible to
> encode (and dequeues its next work unit). This approach would also
> eliminate the need to have more threads than CPUs to get a high
> utilization. :-)

I didn't say fine-granular parallelization is hard; I said it doesn't work
as well. But yes, it would also require much more rearranging of x264
internals than the current threading did.

You can't divide a frame into large independent chunks without slices. And
even if you did use slices, that would be completely incompatible with
frame-threading. The only temporal work divisions compatible with
slice-threading are non-referenced B-frames and GOP-threading.

The sub-frame work division XviD uses is to encode consecutive macroblock
rows in separate threads, making sure each row stays at least 2 macroblocks
behind the previous one, and then to run another thread behind them all to
do the bitstream writing. However, this reduces the possible temporal
splitting by almost as much as it increases the spatial splitting, because
there is that much more data in progress that the next frame has to wait
for. It also prevents bit-exact CABAC RDO, though I haven't simulated how
much that would cost in compression quality.
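
To make the row-staggered scheme concrete, here is a rough timing model, not
code from XviD or x264. It assumes one thread per macroblock row, one time
step per macroblock, and that row r may encode column c only after row r-1
has finished column c+1 (so it stays 2 macroblocks behind). The function
names and the `lag` parameter are made up for illustration, and the
bitstream-writing thread is not modeled.

```python
def wavefront_finish_times(rows, cols, lag=2):
    """Finish time (in macroblock-sized steps) of every macroblock.

    Assumption: row r may encode column c only once row r-1 has finished
    column c + lag - 1 (clamped to the last column), i.e. each row stays
    at least `lag` macroblocks behind the row above it.
    """
    finish = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # A row encodes its own macroblocks strictly left to right.
            left = finish[r][c - 1] if c > 0 else 0
            # Wait for the row above to get far enough ahead.
            above = finish[r - 1][min(c + lag - 1, cols - 1)] if r > 0 else 0
            finish[r][c] = 1 + max(left, above)
    return finish


def max_rows_in_flight(finish):
    """Largest number of rows busy during the same time step."""
    busy = {}
    for row in finish:
        for t in row:
            busy[t] = busy.get(t, 0) + 1
    return max(busy.values())


if __name__ == "__main__":
    rows, cols = 36, 45  # a 720x576 frame in 16x16 macroblocks
    finish = wavefront_finish_times(rows, cols)
    print("steps with row threads:", finish[-1][-1])
    print("steps single-threaded: ", rows * cols)
    print("max rows in flight:    ", max_rows_in_flight(finish))
```

Under these assumptions a 45x36-macroblock frame takes 115 steps instead of
1620, with at most 23 rows active at once. The last row only finishes near
the end of that window, which is the extra in-progress data a following
frame would have to wait on.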

--Loren Merritt
