[x264-devel] Encoder questions

Fri Nov 23 12:11:29 CET 2018

Finally, there you are. I want you to answer questions without a reminder. I understand that you do it when you have time, but simply ignoring them is just obnoxious - don't do that.
Now for the subject. What the Linux kernel list or anyone else uses is not an argument, because everyone, including Linus Torvalds, is prone to stupidity or bad habits: we're all just human. So you must make your own decisions.
As for the encoder, I need to clarify something. You say 8*8 is not always better. Better in what sense: compression or quality? Can you explain it comprehensively, or give a link to a paper or book chapter where this point is explained in detail?
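To make the 4x4 vs 8x8 question concrete, here is a toy sketch (not x264 code). It assumes a floating-point orthonormal DCT, a uniform quantizer with a made-up step, and "number of nonzero quantized coefficients" as a crude stand-in for bit cost; the real encoder uses integer transforms and a proper cost metric. All function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for u in range(n):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def dct2(block):
    """2-D DCT of a square block, computed as C * block * C^T."""
    n = len(block)
    c = dct_matrix(n)
    tmp = [[sum(c[u][x] * block[x][y] for x in range(n)) for y in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][y] * c[v][y] for y in range(n)) for v in range(n)]
            for u in range(n)]

def nonzero_after_quant(block, qstep):
    """Count coefficients that survive uniform quantization (toy cost)."""
    return sum(1 for row in dct2(block) for coef in row
               if int(abs(coef) / qstep + 0.5) != 0)

def cost_8x8(block, qstep):
    """Toy cost of coding the 8x8 block with one 8x8 transform."""
    return nonzero_after_quant(block, qstep)

def cost_4x4(block, qstep):
    """Toy cost of coding the same block with four 4x4 transforms."""
    total = 0
    for by in (0, 4):
        for bx in (0, 4):
            sub = [row[bx:bx + 4] for row in block[by:by + 4]]
            total += nonzero_after_quant(sub, qstep)
    return total

# Two hypothetical residual blocks: a flat one and one with a single spike.
flat = [[100] * 8 for _ in range(8)]
spike = [[0] * 8 for _ in range(8)]
spike[0][0] = 255

for name, blk in (("flat", flat), ("spike", spike)):
    print(name, "4x4:", cost_4x4(blk, 10), "8x8:", cost_8x8(blk, 10))
```

Under these assumptions the flat block is cheaper with the 8x8 transform (one DC coefficient instead of four), while the localized spike is cheaper with 4x4, because the 8x8 transform spreads it over most of its 64 coefficients. That is at least one sense in which neither size wins everywhere and a per-block comparison is needed.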
Then you mentioned the b-adapt algorithm. I looked up the x264 paper by Loren Merritt, but it describes this choice only briefly, and I need more detail. It says there's a low-res ME run for each of the alternative frame sets, but is it a full-featured ME or is it further truncated in some way? Also, the parameters include two versions of it, fast and optimal, and I want to understand the difference between the two. When I actually tried switching from fast to optimal, leaving the other parameters untouched, I got a higher bitrate in my test encode, which is the opposite of what I expected.
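For reference, the description quoted below ("choose the sequence of P- and B-frames that minimizes the sum of frame costs") can be sketched as a small dynamic program. This is only my reading of that one sentence, not x264's lookahead code: the cost functions are invented placeholders for whatever the low-res ME pass estimates, and `choose_frame_types` and `make_toy_costs` are hypothetical names.

```python
def choose_frame_types(n, max_b, p_cost, b_cost):
    """Pick P/B types for frames 1..n (frame 0 is the keyframe).

    p_cost(ref, cur)       -> estimated cost of a P-frame `cur` predicted from `ref`
    b_cost(prev, cur, nxt) -> estimated cost of a B-frame between two anchors
    At most `max_b` consecutive B-frames; the last frame is forced to be P.
    Returns (total_cost, type_string).
    """
    inf = float("inf")
    best = [inf] * (n + 1)   # best[j]: minimal cost with frame j as an anchor
    prev = [0] * (n + 1)     # previous anchor on the best path to j
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(max(0, j - 1 - max_b), j):
            cost = best[i] + p_cost(i, j)
            cost += sum(b_cost(i, k, j) for k in range(i + 1, j))
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # Walk back through the anchors; everything else is a B-frame.
    types = ["B"] * (n + 1)
    j = n
    while j > 0:
        types[j] = "P"
        j = prev[j]
    return best[n], "".join(types[1:])

def make_toy_costs(change):
    """Invented cost model: change[k] is the change between frames k-1 and k."""
    def dist(a, b):
        return sum(change[a + 1:b + 1])
    def p_cost(ref, cur):
        return 10 + dist(ref, cur)
    def b_cost(prev_anchor, cur, next_anchor):
        return 2 + min(dist(prev_anchor, cur), dist(cur, next_anchor))
    return p_cost, b_cost

static = make_toy_costs([0] * 7)    # nothing moves
moving = make_toy_costs([20] * 7)   # large change on every frame
print(choose_frame_types(6, 2, *static))
print(choose_frame_types(6, 2, *moving))
```

With this invented cost model the static clip ends up using the maximum run of B-frames and the high-motion clip uses none. A "fast" variant would presumably replace the exhaustive minimization with a cheaper heuristic, but what exactly x264 does there is the thing I am asking about.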
And the last question, which is the most crucial. When I used the CRF mode and compared it to CQ at the same value, I noticed that it often raises and lowers the QP on frames where that seems unnecessary. So if the QP choice for CRF is based on the amount of frame motion, as described in the paper, then maybe it would give better results with a different frame evaluation method, the one from the 2-pass VBR mode. So the question is: can I use the CRF mode in conjunction with an evaluation pass?
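To illustrate what I mean by QP following frame complexity, here is a simplified model. It assumes the frame quantizer scale has the form complexity^(1 - qcomp) / rate_factor and that QP relates to it as 12 + 6*log2(qscale / 0.85). Both the formula shape and the constants are my assumptions about the rate control and ignore macroblock-tree and any other adjustments; `crf_like_qp` and `rate_factor` are hypothetical names.

```python
import math

def crf_like_qp(complexity, qcomp, rate_factor):
    """Toy per-frame QP under an assumed CRF-style model.

    complexity  -- positive per-frame complexity estimate (arbitrary units)
    qcomp       -- 1.0 means complexity is ignored, 0.0 means it is followed fully
    rate_factor -- hypothetical constant that sets the overall quality level
    """
    qscale = complexity ** (1.0 - qcomp) / rate_factor
    return 12.0 + 6.0 * math.log2(qscale / 0.85)

# Same three frames, two qcomp settings.
for qcomp in (1.0, 0.6):
    qps = [crf_like_qp(c, qcomp, rate_factor=0.05) for c in (1000, 2000, 4000)]
    print(qcomp, [round(q, 2) for q in qps])
```

In this model, qcomp = 1.0 gives every frame the same QP, which is the CQ-like behaviour, while qcomp = 0.6 raises the QP by 2.4 for every doubling of complexity. If the real rate control behaves anything like this, the QP swings I see are driven entirely by how complexity is estimated, hence the question about feeding it a better estimate.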


Nov 21, 2018, 9:32 PM by BugMaster at narod.ru:

> Hi
>
> On Sat, 3 Nov 2018 20:03:58 +0100 (CET), vpstranger at tutanota.com wrote:
>
>> Several more questions have come up.
>> -Why is there no search over the mailing list contents? I might
>> have found answers to some of my questions there, but without search
>> the previous months are essentially unavailable to me.
>>
>
> We develop x264, not mailing-list software, so we use what VideoLAN
> (as the umbrella organization) gives us for the mailing list. Even
> the Linux kernel mailing list uses Google search, as Andreas
> suggested to you.
>
>> -With 8*8 DCT active, why does the algorithm choose when to apply
>> it? Whatever the blocks partitioning is, it could be applied
>> everywhere (except for intra macroblocks), wouldn't it be more efficient?
>>
>
> No, it wouldn't. A larger DCT is not always better, so we still
> need to compare the 4x4 and 8x8 DCT.
>
>> -In the list of block partitions for B frames, directly predicted
>> blocks are listed as a separate block size with their own percentage,
>> although they are not a separate size and may have different sizes. Why?
>>
>
> B_Skip, B_Direct_16x16, and B_Direct_8x8 differ significantly from
> the others, so they get separate percentages in the output. However,
> B_Direct_16x16 and B_Direct_8x8 were merged into one percentage so
> as not to overload the output. I doubt anyone can really deduce
> anything from these stats other than that they differ between
> samples and encoding options.
>
>> -How does the algorithm decide, whether to insert IDR or simple I
>> frame? Actually, I haven't seen a single IDR frame in my test
>> encodes except for the initial one. So maybe it doesn't use them at
>> all? And what exactly does the keyint parameter (and scenecut)
>> specify? IDR frames only, simple I only or both types?
>>
>
> Already answered by Andreas.
>
>> -What is the choice of frame type based on?
>>
>
> I/IDR-frames by scenecut detection. P- vs B-frames by the b-adapt
> algorithm, which searches over different sequences of P- and
> B-frames and chooses the one that minimizes the sum of frame costs.
>
>> -What is the difference between simple RD and RD refinement for the subme parameter?
>>
>
> RD refinement also does RDO for subpartitions and for the qpel MV decision.
>
>> -How does the algorithm choose reference frames for blocks and what
>> part of it makes that decision? I assume this choice is made
>> collectively by the motion estimator set by me parameter and the RD optimizer, is this correct?
>>
>
> It searches for the one with minimal cost (MV cost + residual cost),
> i.e. it is part of ME. IIRC there is no RDO for reference frame selection.
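As I understand that answer, reference selection during ME amounts to something like the sketch below. It assumes the cost is residual cost plus a Lagrange multiplier times the bits needed for the motion vector; the candidate numbers, `pick_reference`, and `lam` are made up for illustration and are not x264 identifiers.

```python
def pick_reference(candidates, lam):
    """Return (cost, ref_idx) of the cheapest reference frame.

    candidates -- iterable of (ref_idx, residual_cost, mv_bits) tuples, e.g.
                  the best match found by motion search in each reference
    lam        -- hypothetical Lagrange multiplier trading MV bits against residual
    """
    best = None
    for ref_idx, residual_cost, mv_bits in candidates:
        cost = residual_cost + lam * mv_bits
        if best is None or cost < best[0]:
            best = (cost, ref_idx)
    return best

# Made-up search results for three reference frames.
found = [(0, 900, 6), (1, 700, 20), (2, 690, 40)]
print(pick_reference(found, lam=8))   # MV bits matter
print(pick_reference(found, lam=0))   # residual only
```

With these numbers the middle reference wins once the motion vector cost is counted, while the farthest one wins on residual alone. No full rate-distortion evaluation is involved, which matches the "no RDO for reference frame selection" remark.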
>
>
