[x264-devel] Re: Scalability
Loren Merritt
lorenm at u.washington.edu
Fri Mar 2 04:04:59 CET 2007
On Thu, 1 Mar 2007, Christian Bienia wrote:
>
> Overall, the current parallelization of x264 still seems a little
> limited. Loren mentioned that fine-granular parallelization is possible,
> but more difficult. But a combined approach seems to be promising: A
> central work queue would contain all work units which await processing.
> A work unit can be a chunk of a frame (no more than 2-4 chunks per
> frame). As soon as a thread has finished working on a work unit, it
> enqueues any work units of subsequent frames which are now possible to
> encode (and dequeues its next work unit). This approach would also
> eliminate the need to have more threads than CPUs to get a high
> utilization. :-)
I didn't say fine-granular parallelization is hard; I said it doesn't work
as well. But yes, it would also require much more rearranging of x264
internals than the current threading did.
You can't divide a frame into large independent chunks without slices. And
even if you did use slices, that's completely incompatible with
frame-threading. The only kinds of temporal work division compatible with
slice-threading are non-referenced B-frames and GOP-threading.
The sub-frame work division XviD uses is to encode consecutive macroblock
rows in separate threads, making sure each row stays at least 2 macroblocks
(MBs) behind the previous one. Another thread then runs behind them all to
do the bitstream writing. However, this reduces the temporal splitting
possible by almost as much as it increases the spatial splitting, because
there's that much more data in progress that the next frame has to wait
for. It also prevents bit-exact CABAC RDO, though I haven't simulated how
much that would cost in compression quality.
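To make the row staggering concrete, here is a rough lock-step simulation
in Python. It is only a sketch under assumptions: one thread per macroblock
row, every macroblock costing exactly one tick, and the bitstream-writing
thread ignored. The name wavefront_ticks and its parameters are made up for
illustration; none of this is XviD or x264 code.

```python
def wavefront_ticks(mb_width, mb_height, lag=2):
    """Count lock-step ticks needed to encode one frame row-parallel.

    Hypothetical model: one thread per macroblock row, one tick per
    macroblock. A row may advance only while the row above it is at
    least `lag` macroblocks ahead, or has already finished.
    """
    done = [0] * mb_height  # macroblocks finished so far in each row
    ticks = 0
    while done[-1] < mb_width:
        snapshot = list(done)  # progress as seen at the start of the tick
        for row in range(mb_height):
            if snapshot[row] >= mb_width:
                continue  # this row is already complete
            if row == 0:
                ready = True  # the top row has nothing to wait for
            else:
                above = snapshot[row - 1]
                ready = above >= mb_width or above - snapshot[row] >= lag
            if ready:
                done[row] += 1
        ticks += 1
    return ticks
```

Under those assumptions a frame 10 MBs wide and 3 rows tall takes 14 ticks
instead of 30 serial ones, and the last row finishes 4 ticks after the
first. That tail is the extra in-progress data a following frame would have
to wait for.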
--Loren Merritt