<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">Hi,</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"></span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">I'm analyzing the performance of H.264 decoders, using input movies from hdvideobench (<a href="http://personals.ac.upc.edu/alvarez/hdvideobench/">
http://personals.ac.upc.edu/alvarez/hdvideobench/</a>) that were generated by x264. While analyzing the motion vectors I found some extremely large values (up to 500 pixels). I'm wondering whether something is wrong with x264 or whether I am misunderstanding something.
</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">If you are knowledgeable about motion estimation in x264, please read the detailed description below; I would like to hear from you.</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"></span></p>
<div class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"></span> </div>
<div class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></div>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">THE MOVIES:</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">Hdvideobench has four movies (riverbed, rush_hour, pedestrian, and blue_sky) in three different resolutions (1920x1088, 1280x720, and 720x576), each 100 frames long.
</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">The yuv source files were encoded in x264 with the following options:</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><code><span lang="EN-GB" style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-ansi-language: EN-GB">x264 --bframes 2 --no-b-adapt --b-bias=0 --ref 16 --qp=26 --analyse all --weightb --me hex --merange 24 --subme 7 --8x8dct --fps 25 --frames 101 --progress -o [outname].h264 [inname].yuv [resolution]
</span></code></p>
<p class="MsoNormal" style="MARGIN: 0pt"><code><span lang="EN-GB" style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-ansi-language: EN-GB"> </span></code></p>
<p class="MsoNormal" style="MARGIN: 0pt"><code><span lang="EN-GB" style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-ansi-language: EN-GB">I also created the movies with "--me esa --merange 16" instead of <font size="2">--me hex --merange 24
</font></span></code></p>
<p class="MsoNormal" style="MARGIN: 0pt"><code><span lang="EN-GB" style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-ansi-language: EN-GB"> </span></code></p>
<p class="MsoNormal" style="MARGIN: 0pt"><code><span lang="EN-GB" style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-ansi-language: EN-GB"> </span></code></p>
<p class="MsoNormal" style="MARGIN: 0pt"><code><span lang="EN-GB" style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-ansi-language: EN-GB">THE DECODERS</span></code></p>
<p class="MsoNormal" style="MARGIN: 0pt"><code><span lang="EN-GB" style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-ansi-language: EN-GB">I used both ffmpeg and the reference software to analyze the motion vectors. For ffmpeg I analyzed the motion vectors as follows (to keep things readable I show only simple printf-based code here, but it does the job). In the file
h264.c, in the function mc_dir_part(), I added the following just after the variable declarations:</span></code></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">int my_x = h->mv_cache[list][ scan8[n] ][0] ;</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">int my_y =<span style="mso-spacerun: yes"> </span>h->mv_cache[list][ scan8[n] ][1] ;</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">if ((my_x>>2) > 100 || (my_y>>2) > 100 || (my_x>>2) < -100 || (my_y>>2) < -100)</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"><span style="mso-tab-count: 1"> </span>{</span></p>
<p class="MsoNormal" style="MARGIN: 0pt; TEXT-INDENT: 36pt"><span style="FONT-FAMILY: Arial">#undef printf<span style="mso-tab-count: 1"> </span></span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"><span style="mso-tab-count: 1"> </span>printf("# my_x: %d, my_y: %d, MB: [%d,%d]\n", (my_x>>2), (my_y>>2), s->mb_x, s->mb_y);
</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"><span style="mso-tab-count: 1"> </span>}</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">I understand that the motion vectors in mv_cache are in quarter-pixel units, and I compensate for that. I check for motion vectors larger than 100 pixels and I find around 100 such vectors, with lengths up to 500 pixels, in each movie. For movies encoded with hexagonal ME the values are a bit lower (up to 200) than for full-search ME. According to the encoding options, I expected these values not to exceed 24 for hex and 16 for esa.
</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">I also used ffplay's "-vismv 3" option, which visualizes the motion vectors on screen, and stepped through the frames. I could clearly see extremely large arrows.
</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">In the reference decoder I added the following code in the file mc_prediction.c, in the function perform_mc(), just after the code where vec1_x, vec1_y, vec2_x, and vec2_y are computed (this happens in two places; the code below is for the first):
</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">int my_x = (vec1_x>>2) - img->block_x*4;</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span lang="ES" style="FONT-FAMILY: Arial; mso-ansi-language: ES">int my_y = (vec1_y>>2) - img->block_y*4;</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span lang="ES" style="FONT-FAMILY: Arial; mso-ansi-language: ES"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">if (my_x > 100 || my_y > 100 || my_x < -100 || my_y < -100)</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"><span style="mso-tab-count: 1"> </span>{</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"><span style="mso-tab-count: 1"> </span>printf("vec1_x: %d; vec1_y: %d; MB is [%d,%d]\n", my_x, my_y, (img->block_x>>2) , (img->block_y>>2) );
</span></p>
<p class="MsoNormal" style="MARGIN: 0pt; TEXT-INDENT: 36pt"><span style="FONT-FAMILY: Arial">}</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">The results are the same as with ffmpeg, that is, motion vectors of up to 500 pixels.</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">REFERENCE ENCODER</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">For comparison I also encoded the movies with the reference encoder, setting the motion search range to 16. Using both ffmpeg and the reference decoder, I found motion vectors up to 57. This is better than the range of 500 resulting from x264 encoding. However, it is still larger than the search range of 16.
</span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">THE QUESTION:</span></p>
<div class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">Why does x264 create these large motion vectors? Is something wrong, or do I misunderstand something? Do I interpret the motion vectors correctly (even the reference encoder seems to produce vectors that are too large)?
</span></div>
<div class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"></span> </div>
<div class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">I would really appreciate it if someone could help me out with this.</span></div>
<div class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"></span> </div>
<div class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial">Regards, Cor</span></div>
<p class="MsoNormal" style="MARGIN: 0pt"><span style="FONT-FAMILY: Arial"> </span></p>