Hi Mauricio,<br><br>I think there is a bit of confusion here. A pointer, as the name suggests, points to some place where you can peek or poke information. A pointer can hold any value at run time, so the compiler cannot force alignment on its contents.<br>
<br>In the case you are showing, you are loading a "number" (which happens to be an address) that is not a multiple of 8 bytes (64 bits), as you pointed out. When your code dereferences that pointer the processor complains, usually by raising an exception, and the operating system then takes forever to work out what you were trying to do and work around the problem.<br>
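<br>To make the arithmetic concrete: an 8-byte (64-bit) access such as the ld8 and st8 in your disassembly needs an address that is a multiple of 8. Here is a minimal sketch of that check. The helper name is_aligned is made up for illustration and is not part of x264.<br>

```c
#include <stdint.h>

/* Hypothetical helper (not from x264): returns 1 when addr is a multiple
 * of alignment, 0 otherwise. alignment is assumed to be a power of two,
 * so the test reduces to checking that the low bits are all zero. */
int is_aligned(uint64_t addr, uint64_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}
```

For the most common address in your log, 0x607ffffffece29b4, the low three bits are 100 in binary, so is_aligned(0x607ffffffece29b4ULL, 8) returns 0, although the same address would be fine for a 4-byte access.<br>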
<br>So, you cannot instruct the compiler to fix the pointer value. The address comes from an unaligned structure, and there is nothing you can do unless you "unpack" that structure.<br><br>BUT you can use a neat trick that lets you read and write unaligned addresses without the penalty of exception handling by the operating system.<br>
<br>I'll show you how to do it for 32 bits and you can extend further to 64 bits.<br><br><span style="font-family: courier new,monospace;">file.h:</span><br style="font-family: courier new,monospace;"><pre style="font-family: courier new,monospace;">
//==========================================<br>typedef union { u32 dw;<br> u16 w[2];<br> u8 u[4];<br> } T_DWORD;<br>//==========================================<br>typedef union { u16 w;<br>
u8 u[2];<br> } T_WORD;<br>//==========================================
u16 READ_UNALIGNED_16(const u8 *p);<br>u32 READ_UNALIGNED_32(const u8 *p);<br>//==========================================<br>void WRITE_UNALIGNED_16(u8 *p, u16 value);<br>void WRITE_UNALIGNED_32(u8 *p, u32 value);<br>//==========================================<br>
</pre><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">file.c:</span><br style="font-family: courier new,monospace;"><pre style="font-family: courier new,monospace;">//==========================================<br>
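#include "file.h"<br><br>// Build the value one byte at a time, so no multi-byte load ever touches p.<br>u32 READ_UNALIGNED_32(const u8 *p)<br>{<br> T_DWORD n;<br><br> n.u[3] = p[3];<br> n.u[2] = p[2];<br> n.u[1] = p[1];<br> n.u[0] = p[0];<br><br> return (n.dw);<br>}<br>//==========================================<br>// Store the value one byte at a time, so no multi-byte store ever touches p.<br>void WRITE_UNALIGNED_32(u8 *p, u32 value)<br>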
{<br> T_DWORD d;<br><br> d.dw = value;<br><br> p[0] = d.u[0];<br> p[1] = d.u[1];<br> p[2] = d.u[2];<br> p[3] = d.u[3];<br>}<br>//==========================================<br></pre><div class="gmail_quote"><br>I hope this helps.<br>
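<br>P.S. file.h also declares the 16-bit functions, and the 64-bit case is the one your st8 stores need, so here is a sketch of both in the same style. The u8/u16/u64 typedefs, the T_QWORD union and the two *_64 function names are my assumptions for illustration; adapt them to whatever your project already defines.<br>

```c
#include <stdint.h>

/* Assumed definitions of the fixed-width types used in file.h. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint64_t u64;

typedef union { u16 w;  u8 u[2]; } T_WORD;   /* same layout as in file.h */
typedef union { u64 qw; u8 u[8]; } T_QWORD;  /* hypothetical 64-bit analogue */

/* Read 2 bytes from any address using byte loads only. */
u16 READ_UNALIGNED_16(const u8 *p)
{
    T_WORD n;

    n.u[0] = p[0];
    n.u[1] = p[1];

    return n.w;
}

/* Write 2 bytes to any address using byte stores only. */
void WRITE_UNALIGNED_16(u8 *p, u16 value)
{
    T_WORD d;

    d.w = value;

    p[0] = d.u[0];
    p[1] = d.u[1];
}

/* Read 8 bytes from any address using byte loads only. */
u64 READ_UNALIGNED_64(const u8 *p)
{
    T_QWORD n;
    int i;

    for (i = 0; i < 8; i++)
        n.u[i] = p[i];

    return n.qw;
}

/* Write 8 bytes to any address using byte stores only. */
void WRITE_UNALIGNED_64(u8 *p, u64 value)
{
    T_QWORD d;
    int i;

    d.qw = value;

    for (i = 0; i < 8; i++)
        p[i] = d.u[i];
}
```

The bytes are copied in memory order, so the value keeps the machine's native byte order, just like the 32-bit version. In principle the store at analyse.c:1127 could then be written as WRITE_UNALIGNED_64((u8 *)mvc[i_mvc], READ_UNALIGNED_64((const u8 *)m->mv)); but I have not tried that against x264 itself.<br>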
<br>(Nestor)<br><br><br><br>On Thu, Feb 28, 2008 at 1:49 PM, Mauricio Alvarez <<a href="mailto:lokifo@gmail.com">lokifo@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hi,<br>
<div class="Ih2E3d"><br>
On Thu, Feb 28, 2008 at 2:16 PM, Loren Merritt <<a href="mailto:lorenm@u.washington.edu">lorenm@u.washington.edu</a>> wrote:<br>
> On Tue, 26 Feb 2008, Mauricio Alvarez wrote:<br>
</div><div class="Ih2E3d">> > x264(8689): unaligned access to 0x607fffffff32ee5c, ip=0x40000000000afb71<br>
> > x264(8689): unaligned access to 0x607fffffff32ee64, ip=0x40000000000aff31<br>
><br>
> The obvious step is to disassemble it and see what pieces of code<br>
> correspond to addresses 0x40000000000afb71 and 0x40000000000aff31.<br>
<br>
<br>
</div>I repeated the test with the following configuration: gcc with -O1 -g<br>
(--enable-debug). Using different input videos and different number of<br>
frames to encode I found the following unaligned accesses:<br>
x264(11582): unaligned access to 0x607ffffffece29ac, ip=0x400000000005d901<br>
x264(11582): unaligned access to 0x607ffffffece29b4, ip=0x400000000005dd50<br>
x264(11582): unaligned access to 0x607ffffffece29ac, ip=0x400000000005d2b1<br>
x264(12585): unaligned access to 0x607ffffffeac69cc, ip=0x400000000005d2f0<br>
x264(12585): unaligned access to 0x607ffffffeac69ac, ip=0x400000000005cb00<br>
x264(12585): unaligned access to 0x607ffffffeac69b4, ip=0x400000000005cb20<br>
x264(12585): unaligned access to 0x607ffffffeac69b4, ip=0x400000000005d2d1<br>
<br>
the most common one is: x264(11582): unaligned access to<br>
0x607ffffffece29b4, ip=0x400000000005dd50<br>
<br>
All these accesses correspond to stores to the mvc array in<br>
x264_mb_analyse_inter_XXX functions in analyse.c . The problem is that<br>
the base address that these functions are receiving is not 64-bit<br>
aligned.<br>
Just an example from the function x264_mb_analyse_inter_p8x8():<br>
<br>
// initialization of mvc<br>
//encoder/analyse.c:1100<br>
int (*mvc)[2] = a->l0.mvc[i_ref];<br>
<br>
// access to mvc<br>
//encoder/analyse.c:1107<br>
*(uint64_t*)mvc[0] = *(uint64_t*)a->l0.me16x16.mv;<br>
400000000005d8f6: f0 00 84 08 42 00 adds r15=512,r33;;<br>
400000000005d8fc: 00 00 04 00 nop.i 0x0<br>
400000000005d900: 0b 78 00 1e 18 10 [MMI] ld8 r15=[r15];;<br>
<br>
// access to mvc<br>
//encoder/analyse.c:1127<br>
*(uint64_t*)mvc[i_mvc] = *(uint64_t*)m->mv;<br>
400000000005dd40: 0b 70 00 40 01 21 [MMI] adds r14=128,r32;;<br>
400000000005dd46: e0 00 38 30 20 00 ld8 r14=[r14]<br>
400000000005dd4c: 00 00 04 00 nop.i 0x0;;<br>
400000000005dd50: 02 40 38 56 98 15 [MII] st8 [r43]=r14,8<br>
<br>
I'm not familiar with the x264 code, so I ask this: is it possible to<br>
force (by using a compiler directive) a specific alignment on these<br>
pointers? Or are they generated dynamically by motion estimation?<br>
<br>
Regards,<br>
<br>
Mauricio A.<br>
<div><div></div><div class="Wj3C7c">_______________________________________________<br>
x264-devel mailing list<br>
<a href="mailto:x264-devel@videolan.org">x264-devel@videolan.org</a><br>
<a href="http://mailman.videolan.org/listinfo/x264-devel" target="_blank">http://mailman.videolan.org/listinfo/x264-devel</a><br>
</div></div></blockquote></div><br>