May 3, 2005

We have to consider Stack Alignment on Windows x64

When you enable optimization with the /O1 or /O2 option, you have to consider stack alignment, because the Microsoft C/C++ compiler uses the movdqa instruction to save some registers, such as the SSE (XMM) registers, to the stack.

So, when you write assembler code for x64 mode, you have to keep the stack aligned to 16 bytes. If you don't, the movdqa instruction will fault on the misaligned address and Windows will raise an Access Violation.
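To make the 16-byte rule concrete, here is a minimal sketch in C of the arithmetic involved. The helper names (is_aligned16, align_down16, frame_adjust) are hypothetical and exist only for illustration. The sketch assumes the usual x64 situation in which the stack pointer is 16-byte aligned just before a call, so it sits 8 bytes off alignment at function entry because of the pushed return address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero when addr is a multiple of 16, i.e. safe for movdqa. */
static int is_aligned16(uintptr_t addr) {
    return (addr & 15u) == 0;
}

/* Round an address down to a 16-byte boundary, which is what
   "and rsp, -16" does to the stack pointer. */
static uintptr_t align_down16(uintptr_t addr) {
    return addr & ~(uintptr_t)15;
}

/* Hypothetical helper: how many bytes a hand-written function should
   subtract from RSP so that the stack is 16-byte aligned again.
   Assumes RSP was 16-byte aligned before the call, so 8 bytes (the
   return address) plus 8 bytes per pushed register are already used. */
static size_t frame_adjust(size_t local_bytes, unsigned pushes) {
    size_t used = 8 + 8 * (size_t)pushes;
    size_t total = (used + local_bytes + 15) & ~(size_t)15;
    assert(total >= used);
    return total - used;
}
```

For example, a function that pushes no registers and needs 32 bytes of stack would subtract 40 bytes (sub rsp, 40) rather than 32, and a function that pushes one register would subtract exactly 32.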

Also, CL.EXE sometimes generates SSE code for structure copies, because for a 16-byte copy "movdqa" seems to be faster than other instructions. "movdqa" was introduced with SSE2. According to Intel, "movdqa" is the fastest way to copy 16 bytes of data.
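The kind of copy the compiler generates can be approximated by hand with the SSE2 intrinsics from emmintrin.h. This is only a sketch, not the code CL.EXE actually emits: the Quad struct and copy_quad function are made-up names, and the aligned load/store intrinsics typically (but not necessarily) compile to movdqa. Both pointers must be 16-byte aligned, which is exactly why a misaligned stack breaks this kind of code. On targets without SSE2 the sketch falls back to memcpy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || defined(_M_AMD64)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical 16-byte structure, forced to 16-byte alignment (C11). */
typedef struct {
    _Alignas(16) uint32_t a;
    uint32_t b, c, d;
} Quad;

/* Copy one 16-byte structure with a single aligned load and store. */
static void copy_quad(Quad *dst, const Quad *src) {
    /* The aligned forms fault on misaligned addresses, so check first. */
    assert(((uintptr_t)dst & 15u) == 0 && ((uintptr_t)src & 15u) == 0);
#if HAVE_SSE2
    __m128i v = _mm_load_si128((const __m128i *)src);  /* aligned 16-byte load  */
    _mm_store_si128((__m128i *)dst, v);                /* aligned 16-byte store */
#else
    memcpy(dst, src, sizeof *dst);                     /* portable fallback */
#endif
}
```

If the source or destination lived at an address that is not a multiple of 16, the aligned load or store would fault, which matches the Access Violation described above.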

Here is an article about the movdqa instruction: http://akiba.ascii24.com/akiba/column/latestparts/2004/03/01/print/648472.html (this page is in Japanese). According to it, on a 3.2 GHz Prescott the average throughput of this instruction is 45 GB per second.
