[x264-devel] NDK r8c linker warnings for asm files from libx264

Loren Merritt lorenm at u.washington.edu
Sun Dec 16 12:19:43 CET 2012


On Sun, 16 Dec 2012, Alex Cohn wrote:

> I don't have enough information to decide whether strip step on ARM is
> useful, useless, or dangerous. If you could detail the original
> showcase that led to the decision to add strip on x86, I could try to
> assess it with an Android ARM device. I have not got other ARM devices
> to test the consequences of performing or not performing the strip
> step.

A correct x86 profile, with strip, looks like:
~> configure ; make ; perf record x264 foo.y4m -o /dev/null ; perf report
6.29%  x264_pixel_avg2_w16_sse2
5.95%  x264_me_search_ref
5.48%  refine_subpel
5.03%  x264_pixel_satd_8x8_internal_avx
4.47%  get_ref_sse2
4.33%  x264_mc_chroma_avx
4.18%  x264_pixel_satd_16x4_internal_avx
4.11%  x264_pixel_avg2_w8_mmx2
2.63%  x264_pixel_sad_16x16_sse2
2.33%  x264_pixel_sad_x4_16x16_sse2
...

Whereas without strip it looks like:
6.19%  x264_pixel_avg2_w16_sse2.height_loop
5.95%  x264_me_search_ref
5.48%  refine_subpel
4.47%  get_ref_sse2
4.18%  x264_pixel_satd_16x4_internal_avx
3.90%  x264_pixel_avg2_w8_mmx2.height_loop
2.63%  x264_pixel_sad_16x16_sse2
2.59%  x264_pixel_satd_8x8_internal_avx
2.44%  ..@26082.pixel_satd_8x4_internal
2.33%  x264_pixel_sad_x4_16x16_sse2
...
1.93%  x264_mc_chroma_avx.loop8
1.34%  x264_mc_chroma_avx.loop4
0.79%  x264_mc_chroma_avx
0.29%  x264_mc_chroma_avx.width8
...

Note that cycles are attributed to x264_pixel_avg2_w16_sse2.height_loop
rather than to x264_pixel_avg2_w16_sse2, and that x264_mc_chroma_avx is
split into 4 parts.
Likewise, the gdb command "disass x264_pixel_avg2_w16_sse2" only shows
the prologue, not the loop body, because it thinks that the loop is a
separate function.
Strip fixes this by erasing the local labels, leaving only function
entrypoints.
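
As a rough cross-check of the two profiles, the sketch below folds each
"function.label" row back into its parent function, which is roughly the
effect strip has on the report. The helper name fold_local_labels is made
up for illustration; the percentages are copied from the unstripped
profile above.

```python
from collections import defaultdict

# (percent, symbol) rows copied from the unstripped profile above.
unstripped = [
    (1.93, "x264_mc_chroma_avx.loop8"),
    (1.34, "x264_mc_chroma_avx.loop4"),
    (0.79, "x264_mc_chroma_avx"),
    (0.29, "x264_mc_chroma_avx.width8"),
]

def fold_local_labels(rows):
    """Sum percentages per parent function (hypothetical helper)."""
    totals = defaultdict(float)
    for percent, symbol in rows:
        # "func.label" -> "func". Symbols that start with "." carry no
        # parent function name, so they are left under their own name.
        if symbol.startswith("."):
            parent = symbol
        else:
            parent = symbol.split(".", 1)[0]
        totals[parent] += percent
    return dict(totals)

folded = fold_local_labels(unstripped)
print(round(folded["x264_mc_chroma_avx"], 2))
```

The four parts sum to 4.35%, which agrees with the 4.33% reported for
x264_mc_chroma_avx in the stripped profile to within the rounding of the
individual rows.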

The ARM asm also has local labels, such as avg2_w16_loop in
x264_pixel_avg2_w16_neon. I want to know whether the assembler gets
the metadata right such that debuggers and profilers know that they're
just branch targets, not functions.
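
One way to start checking this, assuming the relevant metadata is the ELF
symbol type (FUNC versus NOTYPE), is to look at what "readelf -s" reports
for the object file. The sketch below only parses that text. The two
sample rows are invented to show the format and are not output from a
real build, and whether a given debugger or profiler actually honours the
type field is a separate question.

```python
def symbol_types(readelf_output):
    """Map symbol name -> ELF symbol type from `readelf -s` text."""
    types = {}
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows have 8 columns:
        # Num: Value Size Type Bind Vis Ndx Name
        if len(fields) == 8 and fields[0].rstrip(":").isdigit():
            types[fields[7]] = fields[3]
    return types

# Invented sample rows, for illustration only.
sample = """\
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     5: 00000000     0 FUNC    GLOBAL HIDDEN     1 x264_pixel_avg2_w16_neon
     6: 00000010     0 NOTYPE  LOCAL  DEFAULT    1 avg2_w16_loop
"""

for name, kind in symbol_types(sample).items():
    print(kind, name)
```

If the loop label showed up as FUNC in a real build, that would be a
strong hint that tools will treat it as a separate function.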

--Loren Merritt

