[vlc-devel] [PATCH] Phosphor deinterlacer, reworked++

Tue Mar 1 13:30:24 CET 2011

Hi,

Thanks for the review!

Comments below.

On 02/28/11 23:10, Laurent Aimar wrote:
> Hi,
>
> On Thu, Feb 24, 2011 at 01:34:03AM +0200, Juha Jeronen wrote:
>> +static const char *const phosphor_dimmer_list[] = { "off", "low", "medium", "high" };
>> +static const char *const phosphor_dimmer_list_text[] = { N_("Off (no light decay)"),
>> +                                                         N_("Low"),
>> +                                                         N_("Medium"),
>> +                                                         N_("High") };
>  Using an integer type for dimmer would simplify the code. You can still have
> a name associated to each numerical value.

Ok. Thanks for the info. Having a name for each choice was indeed the
primary reason of choosing the string type.

Will change.

>  I think you could also use a integer type for chroma (dunno if it
> simplify in this case).

Maybe it'll simplify a bit if the enum choices are given explicit
values. I'll change it.

>> +#define PHOSPHOR_WORKING_AREA_SIZE (1)
>  Do you need to be able to change it? Or is 1 the only sensible value?

If Phosphate/IVTC2x is deemed useful, then it'll need 2. For the
Phosphor filter proper, 1 is the only sensible value.

I could simplify the code now, and introduce this more general approach
later?

>> @@ -154,6 +208,11 @@ struct filter_sys_t
>>  
>>      /* Input frame history buffer for algorithms that perform temporal filtering. */
>>      picture_t *pp_history[HISTORY_SIZE];
>> +
>> +    /* Phosphor */
>> +    phosphor_chroma_t i_phosphor_chroma_for_420;
>> +    int i_phosphor_dimmer_strength;
>> +    picture_t *pp_phosphor_working_area[PHOSPHOR_WORKING_AREA_SIZE];
>  struct {
>  	phosphor_chroma_t chroma_....
>  } phospor;
>  could be used.

Ok. Will change.

>> @@ -271,6 +330,13 @@ static void SetFilterMethod( filter_t *p_filter, const char *psz_method, vlc_fou
>> @@ -330,6 +397,12 @@ static void GetOutputFormat( filter_t *p_filter,
>>              break;
>>          }
>>      }
>> +    else if( p_sys->i_phosphor_chroma_for_420 == PC_UPCONVERT )
>> +    {
>> +        if( p_sys->i_mode == DEINTERLACE_PHOSPHOR )
>  The test order does not seem right (even if it works).

True. Thanks for the catch. I missed that when I made it a configuration
option :)

Will fix.

>> +static void ComposeFrame( filter_t *p_filter, picture_t *p_outpic,
>> +                          picture_t *p_inpic_top, picture_t *p_inpic_bottom, compose_chroma_t i_output_chroma )
>> +{
>> +    assert( p_filter != NULL );
>> +    assert( p_outpic != NULL );
>> +    assert( p_inpic_top != NULL );
>> +    assert( p_inpic_bottom != NULL );
---8<---[snip]---8<---
>> +            }
>> +            else /* i_output_chroma == CC_MERGE */
>> +            {
>> +                /* Average the chroma of the input fields. */
>> +#define Merge p_filter->p_sys->pf_merge
>> +                for( ; p_out < p_out_end ; )
>> +                {
>> +                    Merge( p_out, p_in_top, p_in_bottom, p_inpic_top->p[i_plane].i_pitch );
>> +
>> +                    p_out       += p_outpic->p[i_out_plane].i_pitch;
>> +                    p_in_top    += p_inpic_top->p[i_plane].i_pitch;
>> +                    p_in_bottom += p_inpic_bottom->p[i_plane].i_pitch;
>> +                }
>> +#undef Merge
>> +                EndMerge();
>> +            }
>> +        }
>> +    }
>> +}
>  I think that this function can simplified by using plane_Copy() and maybe a
> helper that returns a plane_t for a field.

CC_SOURCE_TOP and CC_SOURCE_BOTTOM: ok, will change. These cases are
indeed simple copies from one of the input frames and they should
simplify into one line each using plane_Copy().

As for CC_ALTLINE and CC_UPCONVERT, both require weaving the inputs,
which should be possible with the suggested approach (plane_t for a field).

Thinking about the details, define the pitch so that it skips the other
field (2*original pitch), while keeping visible_pitch as the original
visible pitch. Number of lines (and number of visible lines) will be
half of that of the full frame. That takes care of the top field, and
then the bottom field can be represented by adding an offset (original
pitch) to the base pointer. Representing also the target this way, it
should be possible to field-copy, and thus also weave, using
plane_Copy(). (This is assuming that it starts copying at the base
pointer, and assumes that all of the margin is at the end of the line.)

Thanks for the idea. I'll try this and report the results :)

So that's four cases that should simplify. The last case, CC_MERGE,
obviously needs the merge. It might be possible to refactor this and the
merge that RenderBlend() does to each plane, but I'm not sure if that
would make the code any simpler. I think it's slightly clearer to leave
CC_MERGE as-is.

>  Dunno how it is called but becareful that output pitch can be lower than
> the input one (plane_Copy() take care of that).

Ok.

It was designed to be called only for pictures from
picture_NewFromFormat(), which have matching pitches. But maybe this
should be fixed.

The above approach, plus fixing CC_MERGE to use visible_pitch, should
take care of the problem. Will fix.

>> +static void DecayLuma( filter_t *p_filter, picture_t *p_dst, picture_t *p_src, int i_field, int i_strength )
>> +{
>> +    assert( p_filter != NULL );
>> +    assert( p_dst != NULL );
>> +    assert( p_src != NULL );
>> +    assert( i_field == 0 || i_field == 1 );
>> +    assert( i_strength >= 1 && i_strength <= 3 );
>> +
>> +    int i_plane = Y_PLANE;
>> +
>> +    uint8_t *p_in, *p_out, *p_out_end;
>> +    p_in  = p_src->p[i_plane].p_pixels;
>> +    p_out = p_dst->p[i_plane].p_pixels;
>> +    p_out_end = p_out + p_dst->p[i_plane].i_pitch
>> +                      * p_dst->p[i_plane].i_visible_lines;
>  I am pretty sure that iterating over y would make the code easier to read (it
> applies for the whole file, and not only for this occurence).

This approach was chosen to keep it consistent with the existing
filters, so yeah...

Maybe these should all be changed in a separate patch? I propose doing
this at the same time as the splitting of the deinterlacer file - maybe
a separate patch first to fix stuff like this, and then split.

>> +    /* FIXME? (What to do more sensibly in case of input and output pitch mismatch?) */
>> +    int w = p_src->p[i_plane].i_pitch;
>> +    if( p_dst->p[i_plane].i_pitch < w )
>> +        w = p_dst->p[i_plane].i_pitch;
>  The min is the right choice, using visible_pitch would probably be better.

Ok. Will change to visible_pitch.

By the way, what is normally the relation between pitch and
visible_pitch (except that obviously, pitch >= visible_pitch)? And is
the margin always guaranteed to be at the end of each line, as it would
seem?

>> +        if( parity )
>> +        {
>> +            /* process this line */
>> +            uint64_t *pi = (uint64_t *)p_in; /* actually, 8-bit pixel values - we just vectorize */
>> +            uint64_t *po = (uint64_t *)p_out;
>> +            for( int x = 0 ; x < w8; x += 8 )
>> +            {
>> +                (*po) = ( ((*pi) >> i_strength) & remove_high );
>> +                ++po; /* note: steps forward by 8 bytes because of the type */
>> +                ++pi;
>  *po++ = (*pi++ >> i_strength) && remove_high
> is simpler. The comment is not usefull IMHO.

True. Will fix.

I added the comment only because it's the kind of thing that may slip my
own attention when I read quickly... but now looking at it, here the
definitions are so close to their point of use that I agree that there
is no point in having the comment. I'll remove it.

>> +            }
>> +            /* handle the remainder from division (usually none; video widths tend to be divisible by 8) */
>> +            if( wm8 )
>> +            {
>> +                uint8_t *pi_temp = (uint8_t *)pi;
>> +                uint8_t *po_temp = (uint8_t *)po;
>  Useless casts.

Ok. Will remove.

>> +                for( int x = 0 ; x < wm8; ++x )
>> +                {
>> +                    (*po_temp) = ( ((*pi_temp) >> i_strength) & remove_high );
>> +                    ++pi_temp;
>> +                    ++po_temp;
>  *po_temp++ = ((*pi_temp++) >> i_strength) & remove_high;

Yes. Will fix.

>> +            /* FIXME? (What to do more sensibly in case of input and output pitch mismatch?) */
>> +            int w = p_src->p[i_plane].i_pitch;
>> +            if( p_dst->p[i_plane].i_pitch < w )
>> +                w = p_dst->p[i_plane].i_pitch;
>> +
>> +            for( ; p_out < p_out_end ; )
>> +            {
>> +                vlc_memcpy( p_out, p_in, w );
>> +                p_out += p_dst->p[i_plane].i_pitch;
>> +                p_in  += p_src->p[i_plane].i_pitch;
>> +            }
>> +        }
>  If it is a simple plane copy, then plane_Copy().

Yes, here it is a simple copy. Thanks, I'll do that.

>  By the way, why not doing the same processing for the luma and chroma?

Because I didn't want the light output decay simulation to touch the
colours :)

(Basically this boils down to a question of into which direction the
simulated decay should move in the YUV colour cube. Physically, (at
least without a more accurate model) it can be expected that all colour
components decay equally. Thus, the decay takes place along the Y axis
rather than toward the origin.)

>> +static int RenderPhosphor( filter_t *p_filter, picture_t *p_dst, picture_t *p_src, int i_order, int i_field )
>> +{
>> +    assert( p_filter != NULL );
>> +    assert( p_dst != NULL );
>> +    assert( p_src != NULL );
>> +    assert( i_order >= 0 && i_order <= 2 ); /* 2 = soft field repeat */
>> +    assert( i_field == 0 || i_field == 1 );
>> +
>> +    filter_sys_t *p_sys = p_filter->p_sys;
>> +
>> +    /* Allocate working area at first call after filter Open() */
>> +    if( !p_sys->pp_phosphor_working_area[0] )
>  Why not doing it in the Open() call ?

Mm... good question? Will change :)

Thinking back on it, I think it was because I wanted to avoid allocating
the working area if Phosphor is not used. I suppose one more picture is
not a major memory expense.

Maybe - considering this and IVTC - there should be a general-use
working area of a few frames? (Two is currently enough.) I'll do it that
way.

>> +    {
>> +        video_format_t temp_format = p_src->format;
>> +        if( p_sys->i_phosphor_chroma_for_420 == PC_UPCONVERT )
>> +        {
>> +            const int i_chroma = p_filter->fmt_in.video.i_chroma;
>> +            if( i_chroma == VLC_CODEC_I420 || i_chroma == VLC_CODEC_YV12 )
>> +                temp_format.i_chroma = VLC_CODEC_I422;
>> +            else if( i_chroma == VLC_CODEC_J420 )
>> +                temp_format.i_chroma = VLC_CODEC_J422;
>> +        } /* no "else"; otherwise we simply match input format. */
>> +
>> +        bool b_success = true;
>> +        for( int i = 0; i < PHOSPHOR_WORKING_AREA_SIZE; i++ )
>> +        {
>> +            picture_t *p_new = picture_NewFromFormat( &temp_format );
>> +            if( !p_new )
>> +            {
>> +                b_success = false;
>> +                break;
>> +            }
>> +            picture_CopyProperties( p_new, p_src );
>  I don't think you need it.

The history mechanism does it, so I did it the same way here :)

Then again, history needs the timestamps, while the render buffer doesn't.

By the code in vlc_picture.h, picture_CopyProperties() copies the
timestamp, the force, progressive and TFF flags, and i_nb_fields.

Indeed, none of these are needed for the render buffer. I'll remove the
call.

>> +            p_sys->pp_phosphor_working_area[i] = p_new;
>> +        }
>> +        if( !b_success )
>> +        {
>> +            for( int i = 0; i < PHOSPHOR_WORKING_AREA_SIZE; i++ )
>> +            {
>> +                if( p_sys->pp_phosphor_working_area[i] )
>> +                    picture_Release( p_sys->pp_phosphor_working_area[i] );
>> +                p_sys->pp_phosphor_working_area[i] = NULL;
>> +            }
>> +
>> +            return VLC_ENOMEM;
>  If you can't move the code into Open(), then here you have a memleak.

Will move.

(This release logic was there only to prevent a memory leak in case of a
partial alloc of the working area. The normal release already took place
in Flush(). This potentially confusing design will be fixed by the move.)

>> +        default:
>> +            /* The above are the only possibilities, if there are no bugs.
>> +               Trigger an assert failure. */
>> +            assert( p_sys->i_phosphor_chroma_for_420 == PC_MERGE   ||
>> +                    p_sys->i_phosphor_chroma_for_420 == PC_LATEST  ||
>> +                    p_sys->i_phosphor_chroma_for_420 == PC_ALTLINE ||
>> +                    p_sys->i_phosphor_chroma_for_420 == PC_UPCONVERT );
>  The assert cannot happen here. if the default case cannot happen, then assert(0); break; is what you need.

Ok. Indeed, this default case is a "cannot happen". I was going for a
more descriptive assert failure message, but if it's ok then I'll use
assert(0). Will fix.

>> +    }
>> +
>> +    ComposeFrame( p_filter, p_out, p_in_top, p_in_bottom, cc );
>> +
>> +    /* Simulate phosphor light output decay for the field that is not the current one.
>> +
>> +       We eliminate one copy operation by writing the luma decay result picture to p_dst directly.
>> +       Note that usually the pitch of p_dst is slightly different from that of p_out.
>> +    */
>> +    if( p_sys->i_phosphor_dimmer_strength > 0 )
>> +        DecayLuma( p_filter, p_dst, p_out, !i_field, p_sys->i_phosphor_dimmer_strength );
>> +    else /* dimmer off */
>> +        picture_Copy( p_dst, p_out );
>  What does i_phosphor_dimmer_strength == 0 do?

WIth i_phosphor_dimmer_strength == 0, the filter pairs the latest two
fields, at every input field, and renders that. Note that this is done
also across temporal boundaries of input frames.

This is otherwise identical to the usual operation of the filter, but no
dimming is applied to the old field.

It makes telecined material look slightly better than without any
filtering (for a semi-obvious but slightly lengthy reason), but maybe
it's better described as a curiosity, or an educational toy for
technical users (about field rendering in general, about telecine, and
about the operation of this particular filter).

 -J