[x264-devel] NDK r8c linker warnings for asm files from libx264
Loren Merritt
lorenm at u.washington.edu
Sun Dec 16 12:19:43 CET 2012
On Sun, 16 Dec 2012, Alex Cohn wrote:
> I don't have enough information to decide whether strip step on ARM is
> useful, useless, or dangerous. If you could detail the original
> showcase that led to the decision to add strip on x86, I could try to
> assess it with an Android ARM device. I have not got other ARM devices
> to test the consequences of performing or not performing the strip
> step.
A correct x86 profile, with strip, looks like:
~> configure ; make ; perf record x264 foo.y4m -o /dev/null ; perf report
6.29% x264_pixel_avg2_w16_sse2
5.95% x264_me_search_ref
5.48% refine_subpel
5.03% x264_pixel_satd_8x8_internal_avx
4.47% get_ref_sse2
4.33% x264_mc_chroma_avx
4.18% x264_pixel_satd_16x4_internal_avx
4.11% x264_pixel_avg2_w8_mmx2
2.63% x264_pixel_sad_16x16_sse2
2.33% x264_pixel_sad_x4_16x16_sse2
...
Whereas without strip it looks like:
6.19% x264_pixel_avg2_w16_sse2.height_loop
5.95% x264_me_search_ref
5.48% refine_subpel
4.47% get_ref_sse2
4.18% x264_pixel_satd_16x4_internal_avx
3.90% x264_pixel_avg2_w8_mmx2.height_loop
2.63% x264_pixel_sad_16x16_sse2
2.59% x264_pixel_satd_8x8_internal_avx
2.44% ..@26082.pixel_satd_8x4_internal
2.33% x264_pixel_sad_x4_16x16_sse2
...
1.93% x264_mc_chroma_avx.loop8
1.34% x264_mc_chroma_avx.loop4
0.79% x264_mc_chroma_avx
0.29% x264_mc_chroma_avx.width8
...
Note that cycles are attributed to x264_pixel_avg2_w16_sse2.height_loop
rather than to x264_pixel_avg2_w16_sse2, and that x264_mc_chroma_avx is
split into 4 parts (the entrypoint plus .loop8, .loop4 and .width8).
Likewise, the gdb command "disass x264_pixel_avg2_w16_sse2" only shows
the prologue, not the loop body, because it thinks that the loop is a
separate function.
Strip fixes this by erasing the local labels, leaving only function
entrypoints.
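
To illustrate why the extra labels change the profile, here is a minimal
sketch of a simplified attribution model: each sampled address is charged
to the nearest symbol at or below it. The addresses and layout are made up
for illustration, and real tools such as perf and gdb use more information
than this (symbol sizes, for instance), so treat it as a model of the
effect, not of their implementation.

```python
from bisect import bisect_right

def attribute(samples, symbols):
    """Charge each sampled address to the nearest symbol at or below it."""
    symbols = sorted(symbols)                  # (address, name) pairs
    starts = [addr for addr, _ in symbols]
    counts = {}
    for pc in samples:
        i = bisect_right(starts, pc) - 1       # last symbol with start <= pc
        if i < 0:
            continue                           # sample lies below every symbol
        name = symbols[i][1]
        counts[name] = counts.get(name, 0) + 1
    return counts

# Hypothetical layout: a function entrypoint, a local label inside it,
# and the next function. The addresses are invented.
FUNC = (0x1000, "x264_pixel_avg2_w16_sse2")
LOOP = (0x1010, "x264_pixel_avg2_w16_sse2.height_loop")
NEXT = (0x1040, "x264_me_search_ref")

# Hypothetical sampled program counters; three of them fall in the loop body.
samples = [0x1004, 0x1014, 0x1018, 0x1020, 0x1044]

unstripped = attribute(samples, [FUNC, LOOP, NEXT])  # local label still present
stripped = attribute(samples, [FUNC, NEXT])          # local label erased
```

With the local label present, the three loop samples are charged to
.height_loop. With it erased, they fall back to the function entrypoint.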
The ARM asm also has local labels, such as avg2_w16_loop in
x264_pixel_avg2_w16_neon. I want to know whether the assembler gets
the metadata right such that debuggers and profilers know that they're
just branch targets, not functions.
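
One thing that can be checked on the ARM objects, assuming they are ELF:
every symbol-table entry carries a binding and a type packed into its
st_info byte, the same fields that "readelf -s" prints in its Bind and
Type columns. A symbol declared as a function has type FUNC, while a bare
label is typically NOTYPE. Whether a given debugger or profiler honors that
distinction is a separate matter. The sketch below only decodes the field,
and the helper names are my own.

```python
# Values from the ELF specification: st_info = (binding << 4) | (type & 0xf)
STB = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}
STT = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION", 4: "FILE"}

def decode_st_info(st_info):
    """Split an ELF symbol's st_info byte into (binding, type) names."""
    bind = st_info >> 4
    typ = st_info & 0xF
    # Values not in the tables (OS- or processor-specific) come back as numbers.
    return STB.get(bind, str(bind)), STT.get(typ, str(typ))

def looks_like_function(st_info):
    """True if the symbol is typed as a function in the symbol table."""
    return decode_st_info(st_info)[1] == "FUNC"
```

For example, decode_st_info(0x12) gives ("GLOBAL", "FUNC"), and
decode_st_info(0x00) gives ("LOCAL", "NOTYPE").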
--Loren Merritt