<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
I have some more information on this problem. Something appears to be
broken in the NEON code path for the p4x4 partition.<br>
<br>
I can enable every option that --preset slower sets, with the single
exception of the p4x4 partition. In other words, the command line:<br>
<br>
x264 -o test.264 --preset slow --rc-lookahead 60 --ref 8 --subme 9
--trellis 2 --partitions b8x8,i8x8,i4x4,p8x8 --input-res 720x576
test.yuv<br>
<br>
works, but the command line:<br>
<br>
x264 -o test.264 --preset slow --rc-lookahead 60 --ref 8 --subme 9
--trellis 2 --partitions p4x4,p8x8 --input-res 720x576 test.yuv<br>
<br>
does not (note that p4x4 requires p8x8 to be enabled, even though the
code calls them 16x16 and 8x8).<br>
<br>
Also the command line:<br>
<br>
x264 -o test.264 --preset slow --partitions p4x4,p8x8 --input-res
720x576 test.yuv<br>
<br>
appears to fail faster!<br>
<br>
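To narrow this down further, the partition sets could be tried one at a time with a small driver script. The following is only a sketch: the helper names (build_command, run_x264, crashed, try_partitions) are made up for illustration, the x264 flags and file names are copied from the command lines above, and it assumes a POSIX system where Python reports a process killed by a signal as a negative return code.<br>

```python
import signal
import subprocess

# Flags and file names copied from the failing command line in this message.
BASE_CMD = ["x264", "-o", "test.264", "--preset", "slow",
            "--input-res", "720x576"]
INPUT = "test.yuv"


def build_command(partitions):
    """Return the x264 argument list for one set of partition flags."""
    return BASE_CMD + ["--partitions", ",".join(partitions), INPUT]


def run_x264(cmd):
    """Run one encode and return its exit status.

    On POSIX, subprocess reports termination by signal N as -N.
    """
    return subprocess.run(cmd).returncode


def crashed(code):
    """True if the exit status corresponds to a segmentation fault."""
    return code == -signal.SIGSEGV


def try_partitions(candidates, run=run_x264):
    """Encode once per partition set and record each exit status."""
    results = {}
    for partitions in candidates:
        results[",".join(partitions)] = run(build_command(partitions))
    return results
```

Calling try_partitions with the lists ["p8x8"] and ["p4x4", "p8x8"], then checking crashed() on each result, would show whether adding p4x4 alone is enough to trigger the segmentation fault.<br>
<br>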
Any ideas, guys?<br>
<br>
Jim.<br>
<br>
<br>
On 11/09/12 18:15, Jim Darby wrote:
<blockquote cite="mid:504F71AB.9080304@gmail.com" type="cite">
<meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1">
<font face="Verdana"><small>I thought I'd give x264 a blast on the
PandaBoard ES (see <a moz-do-not-send="true"
class="moz-txt-link-freetext" href="http://pandaboard.org">http://pandaboard.org</a>
for details but essentially a dual-core Cortex-A9 with NEON).
I compiled it with no special options (even though the build sets
the target machine to Cortex-A8).<br>
<br>
It encodes for a short while (5-10 seconds worth of video) and
then gets a segmentation fault. Recompiling with --debug and
unwinding the stack I can see that it blows up in</small> </font><tt>mc_luma_neon</tt><small><font
face="Verdana"> because the weight </font></small><tt>struct</tt><small><font
face="Verdana"> passed in from </font></small><tt>x264_me_refine_qpel_rd</tt><font
face="Verdana"> <small>via the </small></font><tt>COST_MV_SATD(
bmx, bmy, bsatd, 0 )</tt><font face="Verdana"><small> macro
called at encoder/me.c:1210 is an invalid pointer
(interestingly 0x53366970, whose bytes spell S6ip in ASCII). This
weight parameter comes from the m structure in </small></font><tt>x264_me_refine_qpel_rd</tt><font
face="Verdana"><small> which has the same corrupted value.<br>
<br>
The command line was: </small></font><small><tt>x264 -o </tt></small><small><tt>test.264
--preset slower --input-res 720x576 test.yuv</tt></small><font
face="Verdana"><small><br>
<br>
If I perform the same run with --no-asm it is extremely slow
but does not appear to crash, which points to a problem in
the NEON code. Running single-threaded does not avoid the
crash either.<br>
<br>
What does help is replacing the </small></font><small><tt>--preset
slower</tt><font face="Verdana"> with </font></small><small><tt>--preset
slow</tt><font face="Verdana">. <i>Now that's interesting!</i>
That limits us quite a lot as to what is causing the problem.</font></small><br>
<font face="Verdana"><small><br>
Any ideas on this one? More specifically, what can I do to
help with debugging this? I've avoided sending 126MB core
files or detailed backtraces but I'm very happy to do whatever
is needed to help track this down.<br>
<br>
As per the information on the web page, here is the gdb
information:<br>
<br>
</small></font><small><tt>(gdb) bt<br>
#0 mc_luma_neon (dst=0x36ac78 "\345\331\325\327\331\343\344",
&lt;incomplete sequence \345&gt;, i_dst_stride=32,
src=0xb5317254, i_src_stride=784, <br>
mvx=0, mvy=0, i_width=8, i_height=8, weight=0x53366a70) at
common/arm/mc-c.c:146<br>
#1 0x0005e0c6 in x264_me_refine_qpel_rd (h=0x365c60,
m=0xb5317240, i_lambda2=2322, i4=12, i_list=0) at
encoder/me.c:1210<br>
#2 0x000563a4 in x264_macroblock_analyse (h=0x365c60) at
encoder/analyse.c:3362<br>
#3 0x0001f9b2 in x264_slice_write (h=0x365c60) at
encoder/encoder.c:2309<br>
#4 0x0002057a in x264_slices_write (h=0x365c60) at
encoder/encoder.c:2625<br>
#5 0x0002539e in x264_threadpool_thread (pool=0x3853a0) at
common/threadpool.c:69<br>
#6 0xb6e4fed2 in start_thread () from
/lib/arm-linux-gnueabihf/libpthread.so.0<br>
#7 0xb6de6058 in ?? () from
/lib/arm-linux-gnueabihf/libc.so.6<br>
#8 0xb6de6058 in ?? () from
/lib/arm-linux-gnueabihf/libc.so.6<br>
Backtrace stopped: previous frame identical to this frame
(corrupt stack?)<br>
(gdb) disass $pc-32,$pc+32<br>
Dump of assembler code from 0x71ff4 to 0x72034:<br>
0x00071ff4 &lt;mc_luma_neon+132&gt;: mov r0, r5<br>
0x00071ff6 &lt;mc_luma_neon+134&gt;: mov r1, r4<br>
0x00071ff8 &lt;mc_luma_neon+136&gt;: blx r7<br>
0x00071ffa &lt;mc_luma_neon+138&gt;: ldr r3, [r6,
#44] ; 0x2c<br>
0x00071ffc &lt;mc_luma_neon+140&gt;: cbz r3,
0x7204a &lt;mc_luma_neon+218&gt;<br>
0x00071ffe &lt;mc_luma_neon+142&gt;: str r6, [sp,
#0]<br>
0x00072000 &lt;mc_luma_neon+144&gt;: str.w r8, [sp,
#4]<br>
0x00072004 &lt;mc_luma_neon+148&gt;: ldr.w r6, [r3,
r9, lsl #2]<br>
0x00072008 &lt;mc_luma_neon+152&gt;: mov r0, r5<br>
0x0007200a &lt;mc_luma_neon+154&gt;: mov r1, r4<br>
0x0007200c &lt;mc_luma_neon+156&gt;: mov r2, r5<br>
0x0007200e &lt;mc_luma_neon+158&gt;: mov r3, r4<br>
0x00072010 &lt;mc_luma_neon+160&gt;: blx r6<br>
0x00072012 &lt;mc_luma_neon+162&gt;: b.n 0x7204a
&lt;mc_luma_neon+218&gt;<br>
=&gt; 0x00072014 &lt;mc_luma_neon+164&gt;: ldr r1,
[r6, #44] ; 0x2c<br>
0x00072016 &lt;mc_luma_neon+166&gt;: cbz r1,
0x7202e &lt;mc_luma_neon+190&gt;<br>
0x00072018 &lt;mc_luma_neon+168&gt;: mov.w r9, r9,
asr #2<br>
0x0007201c &lt;mc_luma_neon+172&gt;: str r6, [sp,
#0]<br>
0x0007201e &lt;mc_luma_neon+174&gt;: str.w r8, [sp,
#4]<br>
0x00072022 &lt;mc_luma_neon+178&gt;: ldr.w r6, [r1,
r9, lsl #2]<br>
0x00072026 &lt;mc_luma_neon+182&gt;: mov r0, r5<br>
0x00072028 &lt;mc_luma_neon+184&gt;: mov r1, r4<br>
0x0007202a &lt;mc_luma_neon+186&gt;: blx r6<br>
0x0007202c &lt;mc_luma_neon+188&gt;: b.n 0x7204a
&lt;mc_luma_neon+218&gt;<br>
0x0007202e &lt;mc_luma_neon+190&gt;: movw r1,
#42212 ; 0xa4e4<br>
0x00072032 &lt;mc_luma_neon+194&gt;: movt r1, #8<br>
End of assembler dump.<br>
(gdb) info all-registers<br>
r0 0x0 0<br>
r1 0xb5317254 3039916628<br>
r2 0xb4b53172 3031773554<br>
r3 0x310 784<br>
r4 0x20 32<br>
r5 0x36ac78 3583096<br>
r6 0x53366a70 1396075120<br>
r7 0x0 0<br>
r8 0x8 8<br>
r9 0x8 8<br>
r10 0x3 3<br>
r11 0x0 0<br>
r12 0x0 0<br>
sp 0xb5316a48 0xb5316a48<br>
lr 0x0 0<br>
pc 0x72014 0x72014 &lt;mc_luma_neon+164&gt;<br>
cpsr 0x40000130 1073742128<br>
</tt></small><font face="Verdana"><small><tt><br>
</tt><font face="Verdana">I've chopped off the rest of the
registers as the listing is very long and doesn't seem to contain any
relevant information.</font><tt><br>
</tt><br>
Cheers,<br>
<br>
Jim.<br>
</small></font> </blockquote>
<br>
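Regarding the invalid weight pointer in the quoted report: both the value mentioned in the text (0x53366970) and the one in the backtrace (0x53366a70) consist entirely of printable ASCII bytes, which would be consistent with the pointer having been overwritten by other data rather than being a stale address. A minimal sketch for checking this is below; the function name pointer_bytes_as_text is invented for illustration, and the little-endian default is an assumption about how the board is running.<br>

```python
def pointer_bytes_as_text(value, byteorder="little"):
    """Render the four bytes of a 32-bit value as text.

    Bytes outside the printable ASCII range are shown as '.'.
    byteorder="little" gives the order the bytes sit in memory on a
    little-endian machine; "big" gives the order they appear in the
    hex literal.
    """
    raw = value.to_bytes(4, byteorder)
    return "".join(chr(b) if b in range(32, 127) else "." for b in raw)


# The two values seen in the report.
for pointer in (0x53366970, 0x53366A70):
    print(hex(pointer),
          pointer_bytes_as_text(pointer, "big"),
          pointer_bytes_as_text(pointer, "little"))
```

If the same four characters turn up in a buffer near the m structure, that buffer would be a reasonable first suspect for the overwrite.<br>
<br>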
</body>
</html>