How to find the size of an array in binary?

Question

I'm currently learning RE and trying to understand some basic C programs.

I've figured out some of the concepts, but right now I have no idea how to find the size of an array when I use objdump or gdb.

For example:

int main(int argc, char **argv)
{
  char buffer[64];               // <= Where am I supposed to find the array size?
  gets(buffer);                   
  printf("Buffer : %s",buffer);
  return 0;
}

Can anyone explain how this can be done?

There is a difference between dynamically allocated arrays (trivial to find), stack-based arrays (which may or may not be harder) and static arrays. For the latter you may have to disassemble your entire program, and even then you cannot be sure. -- Jongware, Nov 21 '14 at 14:23

score 6 · Accepted Answer · answered Nov 21 '14 at 16:26

There's no easy way to do this. C has no concept of array sizes at run time, so the size isn't stored anywhere in the binary. You'll have to read the assembly code and (try to) understand it.
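To make that concrete, here is a minimal sketch (the two function names are made up for this example). The compiler resolves sizeof from the declaration and bakes the result into the code as a constant, and as soon as the array is handed to another function only a pointer is left:

```c
#include <stddef.h>

/* sizeof on an array is resolved at compile time from the declaration.
 * The resulting constant (64) ends up in the code; nothing is stored
 * alongside the array itself. */
size_t size_in_declaring_scope(void)
{
    char buffer[64];
    return sizeof buffer;
}

/* Once passed to a function, the array has decayed to a pointer, so
 * sizeof yields the pointer size (8 on x86-64), not 64. */
size_t size_seen_by_callee(char *buffer)
{
    return sizeof buffer;
}
```

Calling size_seen_by_callee with a 64-byte array still returns only the size of a pointer, which is exactly why gets() has no way of knowing how much room it has.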

Take the following program:

extern void *malloc(int);
extern char *strcpy(char *dst, char *src);

char firstname[80];
char lastname[80];

int main(void) {
    int some_variable=1;
    char buffer[64];
    int some_other_variable=2;
    char *otherbuffer=malloc(100);
    gets(buffer);
    strcpy(firstname, "John");
    strcpy(lastname, "Doe");
}

and compile it with cc -fno-builtin -O0 -o arraysize arraysize.c. (I had to disable built-in functions to prevent gcc from short-circuiting malloc and strcpy, and I declared them myself instead of using the headers for the same reason. Also, at higher optimization levels gcc omits variables that are never used, hence -O0.)

Then, use objdump -d arraysize and check the main function:

0000000000400554 <main>:
  400554:   55                      push   %rbp
  400555:   48 89 e5                mov    %rsp,%rbp

// This instruction tells you that the function needs 80 (0x50) bytes on the
// stack. This happens to be the same as the size of all local variables
// here, but might be higher as well if the function needs stack space for
// function arguments and the like.
  400558:   48 83 ec 50             sub    $0x50,%rsp


// This puts 1 and 2 into the integer variables. Note we now know they're
// located at -0x10(%rbp) and -0xc(%rbp) on the stack.
  40055c:   c7 45 f0 01 00 00 00    movl   $0x1,-0x10(%rbp)
  400563:   c7 45 f4 02 00 00 00    movl   $0x2,-0xc(%rbp)

// This calls malloc(100) and puts the result into -0x8(%rbp). We now know
// the array pointed to has 100 bytes, because that's what was malloc'ed.
// Note that you have no other way of finding out the size afterwards
// (except if you know how exactly malloc is implemented and where malloc
// keeps its internal housekeeping structures)
  40056a:   bf 64 00 00 00          mov    $0x64,%edi
  40056f:   e8 b4 fe ff ff          callq  400428 <malloc@plt>
  400574:   48 89 45 f8             mov    %rax,-0x8(%rbp)

// Now we call gets, feeding it -0x50(%rbp) as its parameter.
// As the next variable that's used on the stack is at -0x10(%rbp), we can
// assume that the array has 0x40=64 bytes. This does not have to be true;
// for example, if the function declared 2 arrays of 32 bytes each, they'd
// be at -0x50(%rbp) and -0x30(%rbp), and if the function never used the
// one at -0x30(%rbp), there'd be no way for us to tell the difference.
  400578:   48 8d 45 b0             lea    -0x50(%rbp),%rax
  40057c:   48 89 c7                mov    %rax,%rdi
  40057f:   b8 00 00 00 00          mov    $0x0,%eax
  400584:   e8 bf fe ff ff          callq  400448 <gets@plt>

// This is the strcpy to firstname. The address of firstname is at 0x6009e0.
// We don't know how large it is, as we haven't seen a variable behind it yet.
  400589:   be a8 06 40 00          mov    $0x4006a8,%esi
  40058e:   bf e0 09 60 00          mov    $0x6009e0,%edi
  400593:   e8 c0 fe ff ff          callq  400458 <strcpy@plt>

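// And this is the second strcpy, to lastname at 0x600980. Since we've
// seen the other strcpy to 0x6009e0, we assume that there are no more than
// 0x6009e0-0x600980=0x60=96 bytes in that buffer. The declared size is 80;
// the gap is larger, presumably because of alignment padding between the
// two arrays. But see below.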
  400598:   be ad 06 40 00          mov    $0x4006ad,%esi
  40059d:   bf 80 09 60 00          mov    $0x600980,%edi
  4005a2:   e8 b1 fe ff ff          callq  400458 <strcpy@plt>

// end of function
  4005a7:   c9                      leaveq 
  4005a8:   c3                      retq
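The reasoning used for the stack variables above can be written down as a small helper. This is only a sketch of the "distance to the next known offset" heuristic; the function names are invented for this example, and it assumes you have already collected the %rbp-relative offsets by reading the disassembly, with no duplicates among them:

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: ascending order of signed %rbp-relative offsets. */
static int compare_offsets(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts the observed offsets in place and writes, for each one, the
 * distance to the next higher offset into max_size[]. The highest
 * offset is bounded by 0, i.e. by %rbp itself. Every result is only an
 * upper bound: a local that is never referenced leaves no trace. */
void estimate_local_sizes(int *offsets, size_t count, int *max_size)
{
    qsort(offsets, count, sizeof offsets[0], compare_offsets);
    for (size_t i = 0; i < count; i++) {
        int next = (i + 1 < count) ? offsets[i + 1] : 0;
        max_size[i] = next - offsets[i];
    }
}
```

Fed the four offsets seen in main (-0x50, -0x10, -0xc and -0x8), this gives 64, 4, 4 and 8 bytes, which add up to the 0x50 bytes reserved by the sub instruction.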

The C source code said:

char firstname[80];
char lastname[80];
strcpy(firstname, "John");
strcpy(lastname, "Doe");

and from the address difference between the two strcpy calls, we derived our guess at the array size. But note that equivalent instructions, two strcpy calls to constant addresses a fixed distance apart, would have been generated for this code as well:

char name[160];
strcpy(name+80, "John");
strcpy(name, "Doe");

So if you don't have debugging symbols, all you get are assumptions.
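A quick way to convince yourself of this is to compare the two layouts directly. The sketch below uses hypothetical struct names, and it wraps the arrays in a struct only to force them to sit back to back, which the linker does not promise for separate globals. In both layouts "John" lands at the same byte offset, so the addresses alone cannot tell them apart:

```c
#include <stddef.h>

/* Hypothetical layout A: two separate 80-byte arrays, back to back. */
struct two_names {
    char lastname[80];
    char firstname[80];
};

/* Hypothetical layout B: one 160-byte array, written at offsets 0 and 80. */
struct one_name {
    char name[160];
};

/* Byte offset at which "John" would be written in layout A. */
size_t john_offset_two_arrays(void)
{
    return offsetof(struct two_names, firstname);
}

/* Byte offset at which "John" would be written in layout B (name+80). */
size_t john_offset_one_array(void)
{
    struct one_name n;
    return (size_t)((n.name + 80) - n.name);
}
```

Both functions return 80 and both structs occupy 160 bytes, so only debug information or the original source can say which declaration was actually used.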
