Simple pointer analysis

Pointers are a part of C++ programming that can cause grief, though less so than in C, where almost anything non-trivial requires them. Pointers aren’t inherently bad, but if you’re new to them they will cause you trouble: it is all too easy to write bad code with pointers.

This pattern shows a very basic use of a pointer and what it looks like when analyzed with IDA Pro.

Original C++ code

int main(int argc, char* argv[])
{
  int firstvalue = 5;
  int *p1;
 
  p1 = &firstvalue;  // p1 = address of firstvalue
  *p1 = 10;          // value pointed by p1 = 10
 
  return 0;
}

Non-annotated assembly code

.text:00401000 var_C= dword ptr -0Ch
.text:00401000 var_4= dword ptr -4
.text:00401000 argc= dword ptr  8
.text:00401000 argv= dword ptr  0Ch
.text:00401000 envp= dword ptr  10h
.text:00401000
.text:00401000 push    ebp
.text:00401001 mov     ebp, esp
.text:00401003 sub     esp, 0Ch
.text:00401006 mov     [ebp+var_4], 5
.text:0040100D lea     eax, [ebp+var_4]
.text:00401010 mov     [ebp+var_C], eax
.text:00401013 mov     ecx, [ebp+var_C]
.text:00401016 mov     dword ptr [ecx], 0Ah
.text:0040101C xor     eax, eax
.text:0040101E mov     esp, ebp
.text:00401020 pop     ebp
.text:00401021 retn
.text:00401021 _main endp

Annotated assembly code

.text:00401006 mov     [ebp+firstvalue], 5    ; int firstvalue = 5;
.text:0040100D lea     eax, [ebp+firstvalue]  ; eax = address of firstvalue
.text:00401010 mov     [ebp+p1], eax          ; p1 = &firstvalue;
.text:00401013 mov     ecx, [ebp+p1]          ; ecx = value of p1 (the address of firstvalue)
.text:00401016 mov     dword ptr [ecx], 10    ; *p1 = 10;

Conclusion

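The main functionality lies in the following three lines of C++ code. The first one (int firstvalue = 5;) maps directly to the first instruction of the annotated assembly code (mov [ebp+firstvalue], 5), where var_4 has been renamed to firstvalue and var_C to p1. The remaining two statements account for instructions 2, 3, 4 and 5, so those deserve a closer look.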

  int firstvalue = 5;
  p1 = &firstvalue;  // p1 = address of firstvalue
  *p1 = 10;          // value pointed by p1 = 10

Pointer operations take more steps in assembly code than they do in the C++ source. For example, the following line

  p1 = &firstvalue;  // p1 = address of firstvalue

compiles to

lea     eax, [ebp+firstvalue]  ; eax = address of firstvalue
mov     [ebp+p1], eax          ; p1 = &firstvalue;

The first instruction, LEA (load effective address), computes the effective address of its second operand (the source operand) and stores it in its first operand (the destination operand) without reading the memory at that address. The source operand is a memory address (offset part) specified with one of the processor's addressing modes; the destination operand is a general-purpose register. The address-size and operand-size attributes affect the action performed by this instruction: the operand-size attribute is determined by the chosen register, and the address-size attribute by the attribute of the code segment. The second instruction then stores this effective address in our variable p1, which was declared in the C++ code as a pointer (int *p1).
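To make the difference between computing an address (LEA) and accessing memory (MOV) concrete, here is a minimal C++ sketch that replays the five annotated instructions on a toy stack frame. Everything in it is an assumption made for illustration: the Frame struct, the helpers lea, load32 and store32, the frame size and the starting value of ebp are hypothetical, and "addresses" are just byte offsets into an array. Only the displacements -4 (var_4, firstvalue) and -0Ch (var_C, p1) come from the listing.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of a stack frame: 'bytes' stands in for memory and
// 'ebp' is an offset into it, not a real register.
struct Frame {
    std::uint8_t bytes[32] = {};
    std::uint32_t ebp = 16;  // arbitrary frame base chosen for the sketch
};

// lea reg, [ebp+disp]: only computes the address, memory is not touched.
std::uint32_t lea(const Frame& f, std::int32_t disp) {
    return f.ebp + static_cast<std::uint32_t>(disp);
}

// mov reg, [addr]: reads the 32-bit value stored at addr.
std::uint32_t load32(const Frame& f, std::uint32_t addr) {
    assert(addr + 4 <= sizeof f.bytes);
    std::uint32_t value;
    std::memcpy(&value, f.bytes + addr, sizeof value);
    return value;
}

// mov [addr], value: writes a 32-bit value to addr.
void store32(Frame& f, std::uint32_t addr, std::uint32_t value) {
    assert(addr + 4 <= sizeof f.bytes);
    std::memcpy(f.bytes + addr, &value, sizeof value);
}

// Replays the annotated instructions and returns the final firstvalue.
std::uint32_t replay(Frame& f) {
    const std::int32_t firstvalue = -4;   // var_4
    const std::int32_t p1 = -12;          // var_C (-0Ch)

    store32(f, lea(f, firstvalue), 5);          // mov [ebp+firstvalue], 5
    std::uint32_t eax = lea(f, firstvalue);     // lea eax, [ebp+firstvalue]
    store32(f, lea(f, p1), eax);                // mov [ebp+p1], eax
    std::uint32_t ecx = load32(f, lea(f, p1));  // mov ecx, [ebp+p1]
    store32(f, ecx, 10);                        // mov dword ptr [ecx], 10

    return load32(f, lea(f, firstvalue));
}
```

Calling replay on a fresh Frame from a small main shows that the slot modelling p1 ends up holding the offset of firstvalue, while the slot modelling firstvalue holds the value written through it.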

The next line (*p1 = 10;) again produces two lines of assembly code.

  *p1 = 10;

compiles to

.text:00401013 mov     ecx, [ebp+p1]        
.text:00401016 mov     dword ptr [ecx], 10

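The first instruction loads the value stored in p1, which is the address of firstvalue, into ecx. The second writes the value 10 to the location ecx points to. Since ecx now holds the same address as p1, the write lands in firstvalue, which changes from 5 to 10; p1 itself is not modified.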
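The effect of these two instructions can also be checked from the C++ side. The following sketch is not part of the original program: the Result struct and the store_through_pointer function are hypothetical names, and the local variable ecx merely imitates the register. It mirrors the load-then-store sequence and reports what can be observed afterwards.

```cpp
#include <cassert>

// Hypothetical helper type: what can be observed after the store.
struct Result {
    int firstvalue;     // value of the variable after *p1 = 10
    bool p1_unchanged;  // does p1 still hold the address of firstvalue?
};

Result store_through_pointer() {
    int firstvalue = 5;
    int* p1 = &firstvalue;  // lea eax, [ebp+firstvalue] / mov [ebp+p1], eax

    int* ecx = p1;          // mov ecx, [ebp+p1]: copy the address held in p1
    *ecx = 10;              // mov dword ptr [ecx], 10: write to the pointee

    assert(ecx == &firstvalue);  // ecx and p1 refer to the same location
    return Result{firstvalue, p1 == &firstvalue};
}
```

The returned firstvalue is 10 and p1_unchanged is true: the store goes through the pointer into firstvalue and leaves the pointer itself alone. An optimizing compiler would likely fold all of this away, so the instruction comments describe the unoptimized listing above, not what every build produces.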