[x264-devel] Re: comments is not consistent with source code.

Sun Apr 29 21:09:03 CEST 2007

On Sun, 29 Apr 2007, jogging song wrote:
> On 29 Apr 2007, Loren Merritt wrote:
>
>> Given that B-frames' QP are locked to an offset from the adjacent
>> P-frames' QP (and that's optimal for compression)
>
> why is that optimal?

A large part of the compression gain from using B-frames is that they can 
have a higher QP than the P-frames, and only the high quality P-frames are 
used for motion compensation. (The other part of the compression gain 
being the bidirectional prediction.)

Consider the rate-distortion curve you get by varying the QP of a B-frame 
while holding the adjacent P-frames constant:
If the B-frame has a lower QP than the P-frame, it takes lots of bits: 
essentially as many bits as it would have taken to encode the P-frame at 
that lower QP. If you put that quality in a P-frame then future frames 
can also benefit by being predicted from it, but if you put it in a 
B-frame then it's lost.
If the B-frame has a much higher QP than the P-frame, it will be tiny. 
Motion compensation will be mostly sufficient and almost no residual will 
be coded, so further increasing QP will lose quality but not save many bits.
There is some optimal offset in the middle, which has to be determined 
experimentally. I did the experiment, and determined it to be +2. YMMV.
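
To make the "locked to an offset" idea concrete, here is a minimal sketch. 
The function names are made up for illustration and are not x264's; it 
assumes a single adjacent P-frame QP as input and the usual 0..51 QP range 
of 8-bit H.264. The second helper only restates the standard relation that 
the quantizer step doubles every +6 QP, so +2 is a step ratio of about 1.26.

```python
B_QP_OFFSET = 2  # the experimentally determined offset from this post; YMMV


def b_frame_qp(p_qp, offset=B_QP_OFFSET):
    """QP for a B-frame locked to the adjacent P-frame's QP.

    Hypothetical helper: clamps the result to the 0..51 range.
    """
    return max(0, min(51, p_qp + offset))


def qstep_ratio(offset):
    """Quantizer step ratio implied by a QP offset (step doubles every +6 QP)."""
    return 2.0 ** (offset / 6.0)


if __name__ == "__main__":
    print(b_frame_qp(26))             # B-frame QP next to a P-frame at QP 26
    print(round(qstep_ratio(2), 3))   # step-size ratio for a +2 offset
```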

>> Before asking what x264 does with VBV, do you know what VBV means in
>> general, as described in the MPEG-2 and H.264 standards and related docs?
>> I don't feel up to explaining that.
>
> I think I know how VBV works. What confuses me is how to make good
> use of it. In x264, qscale is adjusted according to the
> VBV buffer state; is that the best way to adjust qscale? As you said
> in ratecontrol.txt, this is empirical. Can you improve that?
> If we want to improve it, what can we change? The way qscale is adjusted?
> The ratecontrol equation?

Even assuming perfect knowledge of the complexity of all current and 
future frames, and lots of cpu-time, I don't know what the optimal 
algorithm is, except of course the exponential-time "encode all sequences 
of QPs and keep the highest PSNR of those that satisfy VBV".
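
For illustration only, the exponential-time search might look like the 
sketch below. Everything in it is a toy assumption rather than x264 code: 
the rate model (bits halve for every +6 QP), the buffer starting full, and 
"lowest total QP" standing in for "highest PSNR", since a real version 
would have to actually encode each candidate. It only enforces VBV, as in 
the sentence above, and it tries len(qp_choices)**N sequences for N frames.

```python
from itertools import product


def frame_bits(complexity, qp):
    # Toy rate model (assumption): bits halve for every +6 QP, anchored at QP 26.
    return complexity * 2.0 ** (-(qp - 26) / 6.0)


def satisfies_vbv(bits, buf_size, fill_per_frame):
    """True if the buffer never underflows. Assumes it starts full."""
    fill = buf_size
    for b in bits:
        fill -= b                      # decoder removes the frame
        if fill < 0:
            return False               # underflow: frame did not arrive in time
        fill = min(fill + fill_per_frame, buf_size)  # channel refills, capped
    return True


def brute_force_qps(complexity, qp_choices, buf_size, fill_per_frame):
    """Try every QP sequence; keep the best-scoring one that satisfies VBV."""
    best_qps, best_score = None, float("-inf")
    for qps in product(qp_choices, repeat=len(complexity)):
        bits = [frame_bits(c, q) for c, q in zip(complexity, qps)]
        if not satisfies_vbv(bits, buf_size, fill_per_frame):
            continue
        score = -sum(qps)  # toy stand-in for PSNR: lower QP means higher quality
        if score > best_score:
            best_qps, best_score = qps, score
    return best_qps


if __name__ == "__main__":
    # 3 frames, 8 QP choices each: 8**3 = 512 candidate sequences.
    print(brute_force_qps([1000, 8000, 1000], range(26, 34), 4000, 2000))
```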

My guess is:
First run an unrestricted ratecontrol algorithm (which is also heuristic, 
but at least we don't have to develop it twice). Then minimally perturb 
the bit distribution so as to meet VBV constraints.

The perturbation could be:
First, for each point at which vbv underflows, increase the QP of that 
frame and all frames back to the previous point at which vbv was full, by 
an infinitesimal amount. Repeat until maxrate is met. Each iteration may 
remove underflows and/or add overflows, thus changing the regions 
considered in the next iteration.
Then, for each point at which vbv overflows, reduce the QP of that frame 
and all frames back to the previous point where the vbv was empty, by an 
infinitesimal amount. Repeat until target filesize is met, i.e. until it 
has reallocated all the bits taken away in the previous step. In the 
case of cbr this will end with no overflows and no underflows. If 
maxrate>avgrate then it will end with some overflows.
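
A rough sketch of the underflow half of that perturbation, to show the 
shape of the loop. This is a guess at one way to write it, not code from 
x264. Assumptions: a toy rate model where bits halve for every +6 QP, a 
buffer that starts full, a small finite step (0.05) in place of 
"infinitesimal", and each pass raising the union of all affected regions 
once. The overflow half would be the mirror image and is not shown.

```python
def frame_bits(complexity, qp):
    # Toy rate model (assumption): bits halve for every +6 QP, anchored at QP 26.
    return complexity * 2.0 ** (-(qp - 26) / 6.0)


def simulate_vbv(bits, buf_size, fill_per_frame):
    """Walk the buffer and return (underflow_flags, full_flags) per frame."""
    fill = buf_size  # assumption: the buffer starts full
    underflow, full = [], []
    for b in bits:
        fill -= b                       # decoder removes the frame
        underflow.append(fill < 0)      # any deficit carries into later frames
        fill += fill_per_frame          # channel delivers more bits
        full.append(fill >= buf_size)   # buffer hit its cap after this frame
        fill = min(fill, buf_size)
    return underflow, full


def fix_underflows(complexity, qps, buf_size, fill_per_frame,
                   step=0.05, max_iters=100000):
    """Raise QPs in small steps until no frame underflows the buffer."""
    qps = list(qps)
    for _ in range(max_iters):
        bits = [frame_bits(c, q) for c, q in zip(complexity, qps)]
        underflow, full = simulate_vbv(bits, buf_size, fill_per_frame)
        if not any(underflow):
            break
        touched = set()
        for i, under in enumerate(underflow):
            if not under:
                continue
            j = i
            # Walk back to the first frame after the buffer was last full.
            while j > 0 and not full[j - 1]:
                j -= 1
            touched.update(range(j, i + 1))
        for k in touched:
            qps[k] = min(qps[k] + step, 51.0)
    return qps


if __name__ == "__main__":
    complexity = [1000, 1000, 8000, 1000, 1000]
    print(fix_underflows(complexity, [26.0] * 5, 4000, 2000))
```

In this toy run only the expensive third frame and the frames after it get 
their QP raised; the first two frames sit before a point where the buffer 
was full, so they are left alone.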

To apply this in 1 pass: Run the complexity analysis on X frames in 
advance of the one actually being encoded, and run the ratecontrol in that 
window.
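
The windowing itself could be sketched as below. This only shows the 
sliding lookahead, with a hypothetical generator name; the ratecontrol 
that would run inside each window is left out. Each yielded list starts 
with the frame about to be encoded, followed by up to `lookahead` frames 
whose complexity has already been analysed.

```python
from collections import deque


def lookahead_windows(frames, lookahead):
    """Yield, for each frame, that frame plus up to `lookahead` following ones."""
    window = deque()
    for frame in frames:
        window.append(frame)
        if len(window) == lookahead + 1:
            yield list(window)   # window[0] is the frame being encoded now
            window.popleft()
    # End of stream: the remaining windows shrink as the lookahead runs dry.
    while window:
        yield list(window)
        window.popleft()


if __name__ == "__main__":
    for w in lookahead_windows([1000, 1000, 8000, 1000], lookahead=2):
        print(w)
```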

--Loren Merritt
