[vlc-devel] Deinterlacer documentation

Sun Apr 3 18:00:42 CEST 2011

Hi,

On 04/03/2011 05:17 PM, Jean-Baptiste Kempf wrote:
> Hello,
>
> On Sun, Apr 03, 2011 at 05:12:20PM +0300, Juha Jeronen wrote :
>> The second thing is, the user documentation for a general semi-technical
>> audience needs to be updated after Phosphor and IVTC. The wiki page
>> http://wiki.videolan.org/Deinterlacing is a good starting point.
> Yes for user
>
> http://wiki.videolan.org/Documentation:Hacker%27s_Guide/Video_Output
> http://wiki.videolan.org/Documentation:Hacker%27s_Guide
> for dev documentation.

Thanks :)

So I'll add a page there...

>> Here's why I'm posting this - is there anything specific any of you
>> would like to see explained in the user documentation for the deinterlacer?
> Yes, how are the deinterlaced frames stored...
> I had to write a decoder (crystalhd) and seriously I had a lot of
> difficulties for interlaced mode.

...and yes, of course suggestions for dev documentation are welcome, too :)

But I'm not sure if I understand the suggestion. Looking for something
like the below? Here's a first draft, needs updating...

The frames output by the deinterlacer are picture_t's, with i_nb_fields
== 2 and b_progressive == true. The output chroma format and number of
visible lines are algorithm-dependent.

Resolution and chroma conversions are allowed, and it should be assumed
there is no particular relation between input and output. Currently
possible output chroma formats are I420, YV12, J420, I422 and J422.
Vertical resolution may be original or half. Thus, chroma format and
vertical resolution should be read off the output frame metadata
(appropriate picture_t data fields).

From the top-level Deinterlace(), the output pictures go into a linked
list (using the "next" pointer in picture_t), which is then given to the
caller. What happens to the pictures after that is outside the scope of
the module.
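
A minimal sketch of the caller side of that linked list, for illustration
only. The mock_picture struct and consume_output_list() are hypothetical
stand-ins (not VLC's picture_t or any real VLC function); they only model
the "next" pointer and the walk-until-NULL behaviour described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for picture_t: only the fields this sketch needs. */
typedef struct mock_picture {
    long long date;             /* presentation timestamp (PTS) */
    struct mock_picture *next;  /* next output picture, NULL at the end */
} mock_picture;

/* Walk the list produced for one input picture and return how many
 * pictures it contained. The next pointer is read before the callback
 * runs, so the callback may release or relink the picture it receives.
 * The callback may be NULL, in which case the list is only counted. */
static size_t consume_output_list(mock_picture *head,
                                  void (*consume)(mock_picture *, void *),
                                  void *ctx)
{
    size_t count = 0;
    mock_picture *pic = head;

    while (pic != NULL) {
        mock_picture *next = pic->next;  /* save before handing pic over */
        if (consume != NULL)
            consume(pic, ctx);
        pic = next;
        count++;
    }
    return count;
}
```

An empty list (head == NULL) is handled the same way as any other, which
matters because the filter is allowed to output nothing for a given input.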

Temporally, non-doubling deinterlacers produce exactly one output
picture per one input, IVTC produces one or zero (at the dropped frame),
and framerate doublers produce two or three (depending on repeat_pict of
each input frame).
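
The counts above can be written down as a small table-like function. This
is only a sketch: the mock_mode enum and both boolean parameters are
hypothetical, and it assumes that a set repeat_pict is what turns two
output pictures into three for a doubler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of algorithms; the module has no such enum. */
typedef enum {
    MOCK_MODE_NON_DOUBLING,  /* one output picture per input */
    MOCK_MODE_IVTC,          /* one, or zero at the dropped frame */
    MOCK_MODE_DOUBLER        /* two, or three with a repeated field */
} mock_mode;

/* Number of output pictures expected for a single input picture.
 * repeats_field: assumed to reflect repeat_pict of the input frame.
 * ivtc_drops:    true if IVTC drops this particular frame. */
static int expected_output_count(mock_mode mode, bool repeats_field,
                                 bool ivtc_drops)
{
    switch (mode) {
    case MOCK_MODE_NON_DOUBLING:
        return 1;
    case MOCK_MODE_IVTC:
        return ivtc_drops ? 0 : 1;
    case MOCK_MODE_DOUBLER:
        return repeats_field ? 3 : 2;
    }
    assert(!"unknown mode");
    return 0;
}
```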

###TODO: But here I'm assuming that the calling end keeps the linked
list as-is, and doesn't put the pictures into an array or something
(erasing the "next" pointers)... I should check this.

The call to Deinterlace() comes from the control logic for the filter
chain, but I don't remember where that is, specifically. I only had a
brief look at it when I was debugging the repeat_pict crash. I remember
only that the calling end checks the "next" pointer, and walks the
linked list until next == NULL. What it does with the pictures, I don't
recall. It might be documented in some of my posts to this list. I'll
have a look... end TODO###

The timings (presentation timestamp; PTS; picture_t data field "date")
may change arbitrarily between input and output. It is not even
guaranteed that calling the deinterlacer for an input frame outputs a
frame corresponding to that input frame. Yadif uses the frame offset
feature, and IVTC effectively does, too. This means that when frame "2"
goes in, what comes out is deinterlaced frame "1" - in the case of
Yadif, still with its original PTS! (The offset does NOT introduce a
delay; thus it keeps A/V sync intact. This is why it is called
i_frame_offset and not i_frame_delay in the code.) In the case of IVTC,
the PTS stays close to the original, but is corrected for the
29.97 -> 23.976 fps conversion... usually.
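
To illustrate the frame offset idea, here is a rough sketch of a PTS
history in which the output picture takes the timestamp of an earlier
input. All names (pts_history, pts_history_push, MOCK_HISTORY_SIZE,
MOCK_PTS_INVALID) are made up for this example and are not taken from the
module; the startup behaviour is simplified and the IVTC timestamp
correction is not modelled at all.

```c
#include <assert.h>

#define MOCK_HISTORY_SIZE 3       /* assumed: a history of 3 input pictures */
#define MOCK_PTS_INVALID  (-1LL)  /* hypothetical "no timestamp" marker */

typedef struct {
    long long pts[MOCK_HISTORY_SIZE];  /* last element is the newest input */
    int filled;                        /* number of valid entries so far */
} pts_history;

static void pts_history_init(pts_history *h)
{
    for (int i = 0; i < MOCK_HISTORY_SIZE; i++)
        h->pts[i] = MOCK_PTS_INVALID;
    h->filled = 0;
}

/* Record the PTS of a new input picture and return the PTS that the
 * output picture would carry for the given frame offset: offset 0 means
 * "this input", offset 1 means "the previous input", and so on.
 * Returns MOCK_PTS_INVALID while the history is too short. */
static long long pts_history_push(pts_history *h, long long input_pts,
                                  int frame_offset)
{
    assert(frame_offset >= 0 && frame_offset < MOCK_HISTORY_SIZE);

    /* Shift older entries towards the front and append the new one. */
    for (int i = 0; i < MOCK_HISTORY_SIZE - 1; i++)
        h->pts[i] = h->pts[i + 1];
    h->pts[MOCK_HISTORY_SIZE - 1] = input_pts;
    if (h->filled < MOCK_HISTORY_SIZE)
        h->filled++;

    if (frame_offset >= h->filled)
        return MOCK_PTS_INVALID;
    return h->pts[MOCK_HISTORY_SIZE - 1 - frame_offset];
}
```

With an offset of 1, pushing input "2" returns the original PTS of input
"1", so the output keeps its own timestamp instead of being delayed.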

The deinterlace filter is not required to actually output anything the
first few times it is called. Some algorithms keep a history and use it
for temporal filtering. Currently, the history holds 3 input pictures.
The first call to Deinterlace() always outputs a picture, but the second
call may output nothing. From the third call on, the history buffer has
filled, and normal operation as defined by the chosen algorithm is
guaranteed.

For generality, it should be assumed that M input pictures map to N
output pictures, with arbitrary, different M and N. Note that even
though repeat_pict means "repeat first field", the first and third
output pictures from a framerate doubler, for any given input frame, are
allowed to be different due to temporal filtering. (Phosphor does this.)
Thus, it should also be assumed that each output picture is unique.

It may or may not be safe to display-hold still images, depending on the
deinterlacing algorithm. For non-doublers, this is safe. For doublers,
it is not. This is because framerate doubling algorithms are often
designed based on the illusion of increased perceived resolution that
can be achieved by rapidly alternating half-resolution images in an
appropriate way.

Thus, "progressive" pictures from framerate doublers are not actually
progressive (at least not at full resolution), but only appear so while
the filter is constantly producing new pictures at a steady framerate.
Unfortunately, currently there is no way to determine from the outside
which kind of algorithm has been chosen.

I think that about covers the topic...

>> 4) Yadif needs a better explanation - what does the algorithm do, on a
>> general level? I'll try, but I'm not sure if I can help with this.
> Only MN should know :D

Maybe we should ask him... :)

>> 7) Recommendations should be revised after Phosphor and IVTC. Whether to
>> choose Bob, Linear or Phosphor is mainly a matter of taste. For
>> telecined input, IVTC is clearly The Right Thing.
> And yadif ?

Hmm, yes :)

Let's see...

- For any true interlaced input: matter of taste; use anything other
than IVTC.
  - If a non-doubling mode is desired: try Yadif and X. Maybe Yadif as a
general recommendation?
  - If a framerate doubler is desired: use Linear if not particularly
picky. Gives low CPU usage and acceptable quality. Otherwise, try all
(Bob, Linear, Phosphor, Yadif2x) and pick one.
- For telecined input (e.g. almost all NTSC anime), use IVTC.
- For hybrid 60i/24p (Sol Bianca...), use a framerate doubler.

- Note on CPU usage: Yadif requires more CPU than the other algorithms.
Yadif2x is the heaviest algorithm in VLC.

Maybe that covers it?

 -J