The __fastcall calling convention specifies that arguments to functions are to be passed in registers, when possible. The following list shows the implementation of this calling convention.
Microsoft x64 calling convention
The x64 calling convention (for long mode on x86-64) takes advantage of the additional register space in the AMD64/Intel 64 platform. The first four arguments are passed in registers: RCX, RDX, R8, and R9 are used for integer and pointer arguments, and XMM0, XMM1, XMM2, and XMM3 are used for floating-point arguments.
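The register assignment can be modelled in a few lines. This is a minimal sketch, not a compiler's actual logic: the function name `ms_x64_locations` and the "int"/"float" labels are made up for illustration, every argument is assumed to be a scalar of at most 8 bytes, and it relies on the Microsoft rule that the register is selected by argument position (the second argument uses RDX or XMM1, never RCX or XMM0).

```python
# Model of Microsoft x64 argument placement.
# Assumption: scalar arguments only; structs and varargs are ignored.
MS_INT_REGS = ["RCX", "RDX", "R8", "R9"]
MS_FLOAT_REGS = ["XMM0", "XMM1", "XMM2", "XMM3"]


def ms_x64_locations(arg_kinds):
    """Map each argument kind ("int" or "float") to a register or the stack."""
    locations = []
    for position, kind in enumerate(arg_kinds):
        if position < 4:
            # The slot is positional: argument 1 takes RDX or XMM1,
            # and the register of the other class in that slot stays unused.
            regs = MS_FLOAT_REGS if kind == "float" else MS_INT_REGS
            locations.append(regs[position])
        else:
            # The fifth and later arguments travel on the stack.
            locations.append("stack")
    return locations
```

For a hypothetical signature `f(int, double, int, double, int)`, the model places the arguments in RCX, XMM1, R8, XMM3, and the stack.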
Stack frame layout on x64
The stack frame layout for functions on x64 is very similar to x86, but with a few key differences. Just like x86, the stack frame on x64 is divided into three parts: parameters, return address, and locals. One of the important principles to understand when it comes to x64 stack frames is that the stack does not fluctuate throughout the course of a given function. In fact, the stack pointer is only permitted to change in the context of a function prologue (and is restored in the matching epilogue). Parameters are not pushed and popped from the stack. Instead, stack space is pre-allocated for all of the arguments that would be passed to child functions. This is done, in part, to make it easier to unwind call stacks in the event of an exception.
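How much space gets pre-allocated can be estimated with a small model. The sketch below is an illustration under stated assumptions, not the compiler's algorithm: the name `outgoing_arg_area` is hypothetical, each argument is assumed to occupy one 8-byte slot, and it uses the Microsoft x64 rule (not spelled out in the text above) that a caller always reserves four slots of home space, even for callees that take fewer than four arguments. Locals, saved registers, and alignment padding are left out.

```python
SLOT_BYTES = 8   # every argument occupies one 8-byte stack slot
HOME_SLOTS = 4   # assumption: home space for the four register arguments


def outgoing_arg_area(child_arg_counts):
    """Bytes reserved once, in the prologue, for arguments to child calls.

    child_arg_counts lists the argument count of each call the function makes.
    """
    if not child_arg_counts:
        # A leaf function calls nothing and needs no outgoing area.
        return 0
    # One area is shared by all calls, so it is sized for the widest call.
    widest_call = max(child_arg_counts)
    return max(HOME_SLOTS, widest_call) * SLOT_BYTES
```

A function that calls children taking 2, 6, and 1 arguments would reserve 48 bytes in this model, and the same area is reused for every call instead of pushing and popping per call.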
Callee clean-up
When the callee cleans the arguments from the stack, the number of bytes by which the stack must be adjusted has to be known at compile time. Therefore, these calling conventions are not compatible with variable argument lists, e.g. printf(). They may, however, be slightly more efficient, because the code needed to unwind the stack does not have to be generated at every call site.
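The compile-time constant in question is simply the total size of the argument slots. Below is a minimal sketch for a 32-bit x86 model; the function name `callee_cleanup_bytes` is made up for illustration, and it assumes every argument is rounded up to a whole 4-byte stack slot. On x86 this kind of value typically ends up as the immediate operand of the callee's `ret` instruction.

```python
def callee_cleanup_bytes(arg_sizes, slot=4):
    """Bytes a callee-clean-up function removes from the stack on return.

    arg_sizes holds the size in bytes of each declared parameter.
    Assumption: each argument is padded up to a multiple of `slot` bytes.
    """
    return sum((size + slot - 1) // slot * slot for size in arg_sizes)
```

Because the result depends only on the declared parameter list, it is fixed when the callee is compiled, which is exactly why a caller-chosen number of arguments cannot be supported.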
Caller clean-up
In these conventions the caller cleans the arguments from the stack, which allows for variable argument lists, e.g. printf().
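A toy stack machine shows why this works for variable argument lists: only the caller knows how many values it pushed, so only the caller can undo them. Everything here is a simplified sketch under assumptions: the `Stack` class and both function names are hypothetical, slots are 4 bytes as on 32-bit x86, arguments are pushed right to left, and the return address is left out of the model.

```python
class Stack:
    """Downward-growing stack with 4-byte slots and an ESP-like pointer."""

    def __init__(self, esp=0x1000):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value


def sum_varargs(stack):
    """Callee: reads a count, then that many values. It never moves ESP."""
    count = stack.mem[stack.esp]
    return sum(stack.mem[stack.esp + 4 * (i + 1)] for i in range(count))


def call_with_caller_cleanup(stack, values):
    """Caller: pushes the arguments, calls, then discards what it pushed."""
    for value in reversed(values):   # push right to left
        stack.push(value)
    stack.push(len(values))          # first argument: how many values follow
    result = sum_varargs(stack)
    stack.esp += 4 * (len(values) + 1)   # clean-up sized by the caller
    return result
```

The callee is identical for any argument count; the clean-up amount changes per call site, which is the extra code the caller has to carry.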
ESP in action
Let’s say we want to quickly discard 3 items we pushed earlier onto the stack, without saving the values (in other words “clean” the stack).
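Since the stack grows toward lower addresses, discarding items means adding to ESP. The sketch below models that arithmetic only; it assumes 32-bit code where each pushed item is 4 bytes (so three items correspond to an instruction such as `add esp, 12`), and the function name `discard` is made up for illustration.

```python
WORD_BYTES = 4  # assumption: 32-bit code, one pushed item = 4 bytes


def discard(esp, items):
    """Return the new ESP after dropping `items` pushed values unread.

    Adding to ESP releases slots because the stack grows downward.
    The old values stay in memory but are no longer part of the stack.
    """
    return esp + items * WORD_BYTES
```

Starting from a hypothetical ESP of 0x0012FF70, discarding three items moves it to 0x0012FF7C, the same place three pops would leave it, but without writing the values anywhere.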
AMD64 ABI convention
The calling convention of the AMD64 application binary interface is followed on Linux and other non-Microsoft operating systems. The registers RDI, RSI, RDX, RCX, R8, and R9 are used for integer and pointer arguments, while XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, and XMM7 are used for floating-point arguments. As in the Microsoft x64 calling convention, additional arguments are pushed onto the stack and the return value is stored in RAX.
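The same kind of model can be written for this convention. It is a minimal sketch with a hypothetical function name (`sysv_locations`) and made-up "int"/"float" labels, restricted to scalar arguments of at most 8 bytes. It relies on one detail of the AMD64 ABI that the paragraph does not state: integer and floating-point registers are handed out from two independent sequences, so the first floating-point argument gets XMM0 regardless of where it appears in the argument list.

```python
SYSV_INT_REGS = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]
SYSV_FLOAT_REGS = ["XMM0", "XMM1", "XMM2", "XMM3",
                   "XMM4", "XMM5", "XMM6", "XMM7"]


def sysv_locations(arg_kinds):
    """Map each argument kind ("int" or "float") to a register or the stack."""
    int_used = 0
    float_used = 0
    locations = []
    for kind in arg_kinds:
        if kind == "float" and float_used < len(SYSV_FLOAT_REGS):
            locations.append(SYSV_FLOAT_REGS[float_used])
            float_used += 1
        elif kind == "int" and int_used < len(SYSV_INT_REGS):
            locations.append(SYSV_INT_REGS[int_used])
            int_used += 1
        else:
            # The register class for this kind is exhausted.
            locations.append("stack")
    return locations
```

For a hypothetical `f(int, double, int, double)`, the model yields RDI, XMM0, RSI, XMM1, and a seventh integer argument would be the first one to land on the stack.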