[vlc-devel] IVTC accuracy improvement idea
Juha Jeronen
juha.jeronen at jyu.fi
Wed Jan 12 10:20:47 CET 2011
Hi all,
I did some further R&D on the IVTC filter. Current results below.
On 01/10/2011 02:12 PM, Juha Jeronen wrote:
> I suddenly got one idea for the improvement of IVTC accuracy. Keep the
> frame composer running all the time (even when no telecine is
> detected), and only handle the framerate conversion (and frame
> dropping) with the cadence tracking mechanism.
This turned out not to be such a bright idea after all. Its only effect
was that it reintroduced the subtitle flicker problem that the Transcode
composition strategy had completely gotten rid of. So the currently
posted version is better.
During those talking scenes that tend not to work correctly, it's not
that the filter exits film mode (as I thought at first) - it doesn't -
but rather that the Transcode comb metric ranks the field pairs
incorrectly in such cases.
The cadence detector isn't that good for general input. This explains
why the Transcode strategy works the best for frame composition - it
doesn't need to know the cadence position. (This is another aspect that
makes it ingenious.)
The detector, as of the posted version, already works rather well in
cases with horizontal motion, but making it better in general is a hard
problem. I tried to make it more reliable by judging how much of an
outlier the winning score is (by computing the variance of
pi_ivtc_scores with and without the winner included). Theoretically,
this should indicate how certain the detection result is, but in
practice, it didn't really help.
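To make the outlier test concrete, here is a minimal Python sketch (not the actual filter code). It assumes five scores, one per cadence position, with lower meaning better, and it uses the population variance; the function name and the choice of variance estimator are assumptions of mine for illustration only.

```python
from statistics import pvariance


def winner_outlier_ratio(scores):
    """Return (winner_index, var_with_winner / var_without_winner).

    Assumes lower score = better, so the winner is the minimum.
    A ratio well above 1 means the winner stands out from the rest.
    """
    winner = min(range(len(scores)), key=scores.__getitem__)
    rest = [s for i, s in enumerate(scores) if i != winner]
    var_all = pvariance(scores)
    var_rest = pvariance(rest)
    if var_rest == 0:
        # All losing scores are identical; avoid dividing by zero.
        ratio = float("inf") if var_all > 0 else 1.0
        return winner, ratio
    return winner, var_all / var_rest


# Hypothetical scores: position 0 is clearly better than the others.
winner, ratio = winner_outlier_ratio([10, 100, 105, 95, 100])
flat_winner, flat_ratio = winner_outlier_ratio([100, 100, 100, 100, 100])
```

A threshold on the ratio would then decide whether the detection counts as reliable; where to put that threshold is exactly the part that turned out to be hard.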
With horizontal camera pans, the correct result raises the variance by
50-70% (i.e. by a factor of 1.5..1.7) when the winner is included,
indicating that the other scores are indeed much higher (worse), as
expected. But if there is only a little horizontal motion (e.g. one
small-ish moving object), the difference drops to ~20%, which already
gives false positives rather often. Also, by this metric, the stencil
positions "abc" and "bcd" seem consistently harder to detect than the
other three.
These results don't change even if I define the score as
(pi_best_field_pairs - pi_worst_field_pairs), so my earlier
observation that both strategies do the same thing still stands.
The best this kind of strategy can do is to lock on when there is a pan,
and then keep the detected cadence until a new reliable detection is
possible. This stops working when there is a bad cut in the film. I
tried a cut detection approach ("i_blocks_with_motion > 5000" seems
pretty reliable while still avoiding false positives), and switching the
frame composition strategy while the detector acquires a new lock-on,
but this didn't work very well either.
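The lock-on-and-hold logic with a cut reset can be sketched roughly as follows in Python. This is an illustration only, not the filter code: the class name, the `reliable` flag and the assumption that the cadence advances by one position per frame are mine; only the 5000-block threshold comes from the experiment described above.

```python
CADENCE_LEN = 5       # five cadence positions in a 3:2 pulldown cycle
CUT_THRESHOLD = 5000  # blocks with motion above this suggest a cut


class CadenceLock:
    """Hold a detected cadence position until a new reliable detection."""

    def __init__(self):
        self.pos = None  # None means "no lock"

    def update(self, detected_pos, reliable, blocks_with_motion):
        if blocks_with_motion > CUT_THRESHOLD:
            # Suspected cut: the old cadence may no longer be valid.
            self.pos = None
        if reliable:
            # Fresh reliable detection (e.g. during a pan): lock onto it.
            self.pos = detected_pos
        elif self.pos is not None:
            # No reliable detection: extrapolate from the held lock.
            self.pos = (self.pos + 1) % CADENCE_LEN
        return self.pos


lock = CadenceLock()
trace = [
    lock.update(2, True, 100),    # lock on
    lock.update(0, False, 100),   # hold and advance
    lock.update(0, False, 100),
    lock.update(0, False, 100),   # wraps around
    lock.update(0, False, 6000),  # cut: lock dropped
    lock.update(1, True, 100),    # new lock
]
```

While `pos` is None, the composer would have to fall back to a strategy that doesn't need the cadence position.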
I also played around with progressive material detection (check if
position "dea" comes up 3 times in a row, while there is motion so the
frames are not identical), to see if the reliability-estimating detector
could do that better. Mostly it worked, but false positives caused some
previously easy cases to break, so I think it can't be included after all.
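As a rough Python sketch of that check (hypothetical names, not the filter code): keep a short history of detected positions together with a motion flag, and report progressive material only when the last three entries are all "dea" with motion.

```python
def looks_progressive(history, needed=3):
    """history: list of (position_name, has_motion), most recent last.

    Returns True when the last `needed` detections were all "dea"
    and each of them had motion (so the frames were not identical).
    """
    if len(history) < needed:
        return False
    return all(pos == "dea" and motion for pos, motion in history[-needed:])


moving = [("abc", True), ("dea", True), ("dea", True), ("dea", True)]
static = [("dea", True), ("dea", False), ("dea", True)]
```

The false positives mentioned above would show up here as runs of "dea" that the detector reports on material that is actually telecined.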
I now have another idea, which I haven't tested yet. Since there is a
block-based motion detector, it is possible to include only blocks
with motion in the interlace detection. The scores can be normalized
by the number of blocks with motion, at least theoretically making them
comparable.
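A minimal Python sketch of the normalization idea, assuming a per-block comb score and a per-block motion flag are available (both names are placeholders, and this is untested against real material):

```python
def normalized_comb_score(block_scores, block_has_motion):
    """Average the comb score over blocks with motion only.

    Returns None when no block has motion, since the score is then
    undefined rather than zero.
    """
    moving = [s for s, m in zip(block_scores, block_has_motion) if m]
    if not moving:
        return None
    return sum(moving) / len(moving)


# Hypothetical data: only blocks 0 and 2 have motion.
score = normalized_comb_score([4, 0, 8, 0], [True, False, True, False])
```

Dividing by the number of moving blocks is what should make scores computed over different block sets comparable, at least in theory.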
The difficult part with this approach is what to do with the "trivial"
combinations, where both fields come from the same frame, since by
definition they have no motion between them. I'm thinking that for P (N
respectively), it's possible to use the motion detection result between
P and C (C and N respectively), and include only those blocks, so that
the same blocks take part in the detection as for the combinations
TPBC/TCBP (TCBN/TNBC respectively). For C, one can probably include the
union (in the sense of logical OR) of those motion detection results,
and the normalization should take care of the rest.
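In Python-like terms, the block selection for the three trivial combinations could look something like this. It is a sketch of the idea only; the motion results are assumed to be per-block boolean lists, and the function name is made up.

```python
def masks_for_trivial_pairs(motion_pc, motion_cn):
    """Pick which blocks take part in the comb detection for P, C and N.

    motion_pc: per-block motion flags between P and C
    motion_cn: per-block motion flags between C and N
    """
    mask_p = list(motion_pc)  # same blocks as for TPBC/TCBP
    mask_n = list(motion_cn)  # same blocks as for TCBN/TNBC
    # For C, take the union (logical OR) of both motion results.
    mask_c = [a or b for a, b in zip(motion_pc, motion_cn)]
    return mask_p, mask_c, mask_n


mask_p, mask_c, mask_n = masks_for_trivial_pairs(
    [True, False, False, True],
    [False, False, True, True],
)
```

Since the mask for C generally covers more blocks than the other two, the scores only stay comparable if each one is normalized by its own block count.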
Even if this strategy is used only in the frame composer (and not in the
cadence detector, since it's unclear how the score for C would compare to
the rest), it should make the frame construction more accurate. Emphasis
on "should" - IVTC for hard telecine is more of an art than a science.
I'll test this when I have time.
Oh, and I built a soft telecine remover. It was simple (based on
i_nb_fields of the frames in the three-frame cache) and works fine.
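For illustration, a detection along these lines can be sketched in Python as below. This is my guess at the general shape, not the actual code: it assumes that soft-telecined frames alternate between 3 and 2 fields (via the repeat-first-field flag), and that seeing 2-3-2 or 3-2-3 in the cache is taken as the signal.

```python
def is_soft_telecine(nb_fields_cache):
    """nb_fields_cache: field counts of the three cached frames, oldest first.

    Assumption: soft telecine shows up as an alternating 3/2 field
    count, so the cache holds either 2-3-2 or 3-2-3.
    """
    return tuple(nb_fields_cache) in {(2, 3, 2), (3, 2, 3)}


checks = [
    is_soft_telecine([2, 3, 2]),
    is_soft_telecine([3, 2, 3]),
    is_soft_telecine([2, 2, 2]),  # plain progressive or interlaced frames
]
```

A real implementation would probably also need to look at the field order flags; that part is left out here.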
That's all for today :)
-J