
I would like to know the different ways to perform a system call in x86 assembler under Linux. No cheating, though: only assembler may be used (i.e. the program must be compiled with gcc -nostdlib).

I know four ways to perform a system call, namely:

  • int $0x80
  • sysenter (i686, Pentium II and later)
  • call *%gs:0x10 (vdso trampoline)
  • syscall (amd64)

I am fairly comfortable with int $0x80. For example, here is a classic 'Hello World!' in assembler using int $0x80 (compile it with gcc -nostdlib -o hello-int80 hello-int80.s):

.data
msg:
  .ascii "Hello World!\n"
  len = . - msg

.text
.globl _start

_start:
# Write the string to stdout
  movl  $len, %edx
  movl  $msg, %ecx
  movl  $1, %ebx
  movl  $4, %eax
  int   $0x80

# and exit
  movl  $0, %ebx
  movl  $1, %eax
  int   $0x80

But sysenter often ends with a segmentation fault. Why? And how do I use it correctly?

Here is an example with call *%gs:0x10 (compiled with gcc -o hello-gs10 hello-gs10.s). Note that the libc initialization must run before this call works properly, which is why I use main instead of _start and why I removed the -nostdlib option from the compile line:

.data
msg:
  .ascii "Hello World!\n"
  len = . - msg

.text
.globl main

main:
# Write the string to stdout
  movl  $len, %edx
  movl  $msg, %ecx
  movl  $1, %ebx
  movl  $4, %eax
  call  *%gs:0x10

# and exit
  movl  $0, %ebx
  movl  $1, %eax
  call  *%gs:0x10

syscall also works well once you know the syscall numbers for this architecture (thanks to lfxgroove) (compiled with gcc -m64 -nostdlib -o hello-syscall hello-syscall.s):

.data
msg:
  .ascii "Hello World!\n"
  len = . - msg

.text
.globl _start

_start:
# Write the string to stdout
  movq  $len, %rdx
  movq  $msg, %rsi
  movq  $1, %rdi
  movq  $1, %rax
  syscall
# and exit
  movq  $0, %rdi
  movq  $60, %rax
  syscall

So the only method I cannot get to work is sysenter. Here is an example with sysenter that ends with a segmentation fault (compiled with gcc -m32 -nostdlib -o hello-sysenter hello-sysenter.s):

.data
msg:
  .ascii "Hello World!\n"
  len = . - msg

.text
.globl _start

_start:
# Write the string to stdout
  movl  $len, %edx
  movl  $msg, %ecx
  movl  $1, %ebx
  movl  $4, %eax

  push    final
  sub $12, %esp
  mov %esp, %ebp

  sysenter
# and exit
final:  
  movl  $0, %ebx
  movl  $1, %eax

  sub $12, %esp
  mov %esp, %ebp

  sysenter
Igor Skochinsky
perror
  • A first guess for the syscall attempt is that you've got the wrong syscall numbers: in 64-bit mode (which syscall seems to be for) the numbers are all different, e.g. exit is 60 instead of what you're using right now. See http://lxr.linux.no/#linux+v2.6.32/arch/x86/include/asm/unistd_64.h for the numbers – lfxgroove Oct 02 '13 at 21:26
  • Indeed, you are right. I was really puzzled but it seems that in my example it was the second syscall that was calling the write (and not the first one, as I was expecting). – perror Oct 02 '13 at 21:44
  • Rather than opening another question: @perror, would you mind explaining len = . - msg? Edit: I understand its purpose, for obvious reasons, but not its semantics. Thanks – k0ng0 Oct 03 '13 at 15:58
  • The '.' in gas syntax refers to the current address. So, len = . - msg is a way to store the size of the string msg in len (it computes the difference between the current address and the position of the msg label). – perror Oct 03 '13 at 20:22
  • Question: are syscall and sysenter architecture-specific? As far as I can tell I have an Intel CPU, which would mean I can use sysenter, but the compiled code has syscall in it. Am I missing something? – lfxgroove Oct 03 '13 at 20:31
  • @lfxgroove please share the answer to the question you asked above about syscall (in case you found it) – incompetent Jan 01 '16 at 15:20
  • @shami, sorry, I didn't :( – lfxgroove Jul 10 '16 at 16:43
  • To ensure correct syscall numbers, might I suggest #include <sys/syscall.h> and then $SYS_write, $SYS_exit etc. in your code instead of $4, $1 – sigjuice Jul 12 '17 at 02:40
  • Thank you so much, you have helped me figure out this systemd assertion failure on a bugged ChromiumOS kernel by laying out the different ways syscalls can be made and with the code example in the answer. – nh2 Mar 18 '19 at 03:47
  • Congrats nh2! That was a very nice bug hunt! I really enjoyed reading this bug report summary on GitHub (and your explanations are really good!). Sincerely, I would never have expected that my explanations of the sysenter instruction could be of any use to anybody, and you proved me wrong! ;-) – perror Mar 18 '19 at 08:52
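
The suggestion in the comments above to use symbolic syscall numbers can be sketched as follows for the int $0x80 example. This is only a sketch under a few assumptions: the file name hello-sys.S is hypothetical (the capital .S extension makes gcc run the C preprocessor before assembling), the header is glibc's <sys/syscall.h>, and the 32-bit development headers are installed. It would be built with gcc -m32 -nostdlib -o hello-sys hello-sys.S; some toolchains may additionally need -static or -no-pie.

```asm
/* hello-sys.S (hypothetical name): needs the C preprocessor, hence .S */
#include <sys/syscall.h>    /* assumed to provide SYS_write and SYS_exit */

.data
msg:
    .ascii "Hello World!\n"
    len = . - msg

.text
.globl _start

_start:
    /* write(1, msg, len) */
    movl  $len, %edx
    movl  $msg, %ecx
    movl  $1, %ebx
    movl  $SYS_write, %eax  /* expands to 4 when built with -m32 */
    int   $0x80

    /* exit(0) */
    movl  $0, %ebx
    movl  $SYS_exit, %eax   /* expands to 1 when built with -m32 */
    int   $0x80
```

C-style comments are used here because a line starting with # could be mistaken for a preprocessor directive in a .S file.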

1 Answer


System calls through sysenter

sysenter is an i686 instruction (introduced with the Pentium II), specifically tied to 32-bit applications. It has been superseded by syscall on 64-bit platforms.

One particularity of sysenter is that, in addition to the usual register setup, it requires some stack manipulation before it is executed. This is because, when the system call returns, the kernel resumes the process in the last part of the __kernel_vsyscall assembler snippet (starting at 0xf7ffd430):

Dump of assembler code for function __kernel_vsyscall:
   0xf7ffd420 <+0>:        push   %ecx
   0xf7ffd421 <+1>:        push   %edx
   0xf7ffd422 <+2>:        push   %ebp
   0xf7ffd423 <+3>:        mov    %esp,%ebp
   0xf7ffd425 <+5>:        sysenter 
   0xf7ffd427 <+7>:        nop
   0xf7ffd428 <+8>:        nop
   0xf7ffd429 <+9>:        nop
   0xf7ffd42a <+10>:       nop
   0xf7ffd42b <+11>:       nop
   0xf7ffd42c <+12>:       nop
   0xf7ffd42d <+13>:       nop
   0xf7ffd42e <+14>:       int    $0x80
=> 0xf7ffd430 <+16>:       pop    %ebp
   0xf7ffd431 <+17>:       pop    %edx
   0xf7ffd432 <+18>:       pop    %ecx
   0xf7ffd433 <+19>:       ret    
End of assembler dump.

So, when sysenter is executed, the stack is expected to be laid out as follows, since the return path pops these values:

0x______0c  saved_eip   (ret)
0x______08  saved_%ecx  (pop %ecx)
0x______04  saved_%edx  (pop %edx)
0x______00  saved_%ebp  (pop %ebp)

That is why, before each sysenter, we first have to push the return address (the saved %eip), followed by %ecx, %edx and %ebp, which leads to:

.data
msg:
    .ascii "Hello World!\n"
    len = . - msg

.text
.globl _start
_start:
    pushl  %ebp
    movl   %esp, %ebp
# Write the string to stdout
    movl   $len, %edx
    movl   $msg, %ecx
    movl   $1, %ebx
    movl   $4, %eax
# Setting up the stack for sysenter
    pushl  $sysenter_ret
    pushl  %ecx
    pushl  %edx
    pushl  %ebp
    movl   %esp,%ebp
    sysenter
# and exit
sysenter_ret:    
    movl   $0, %ebx
    movl   $1, %eax
# Setting up the stack for sysenter
    pushl  $sysenter_ret # Who cares, this is an exit !
    pushl  %ecx
    pushl  %edx
    pushl  %ebp
    movl   %esp,%ebp
    sysenter
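
As Ruslan points out in the comments below, the exit call never returns, so it does not need the full frame: it is enough for %ebp to point at readable memory, because the kernel reads the potential sixth syscall argument from there. A sketch of the same program with that simplification follows. The label after_write is a name chosen here for illustration, and the behaviour has not been checked on every kernel version.

```asm
.data
msg:
    .ascii "Hello World!\n"
    len = . - msg

.text
.globl _start
_start:
# write(1, msg, len): this call returns, so build the full frame
    movl   $len, %edx
    movl   $msg, %ecx
    movl   $1, %ebx
    movl   $4, %eax
    pushl  $after_write     # address taken by the final ret
    pushl  %ecx             # restored by pop %ecx
    pushl  %edx             # restored by pop %edx
    pushl  %ebp             # restored by pop %ebp
    movl   %esp, %ebp
    sysenter
after_write:
# exit(0): never returns, so no frame is pushed
    xorl   %ebx, %ebx
    movl   $1, %eax
    movl   %esp, %ebp       # %ebp only has to point at readable memory
    sysenter
```

It is built the same way as the example above, with gcc -m32 -nostdlib.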
perror
  • I really do not know how the Linux kernel can re-route the result of the instruction sysenter to __kernel_vsyscall+16. If somebody can clarify this for me, I would be pleased. – perror Oct 06 '13 at 16:56
  • It's done via the MSRs (model-specific registers), see http://x86.renejeschke.de/html/file_module_x86_id_313.html ; SYSEXIT uses those registers too. I don't know how they are set up or which source files they are declared in. – sandun dhammika Oct 07 '13 at 09:25
  • @perror in fact you don't have to push anything on the stack to exit, just make sure ebp holds an accessible address, so that the kernel can read the potential 6th syscall argument without #GP (the value doesn't have to mean anything, so no need to push ebp either). I.e. the exit call would reduce to mov eax,1; xor ebx,ebx; mov ebp, esp; sysenter. – Ruslan Aug 28 '15 at 14:55
  • @Ruslan I'm afraid code you suggested just doesn't work - program terminates with segfault. – Aleksander Alekseev Jul 17 '16 at 14:22
  • @AleksanderAlekseev what Linux version are you using? And what do you have in esp? – Ruslan Jul 17 '16 at 14:31
  • @Ruslan, sorry, I just realized that you suggested to modify only exit() call, not both write() and exit() calls. In this case everything works fine. – Aleksander Alekseev Jul 17 '16 at 14:48
  • It would be great to link to the part of the kernel code where you see the stack expectations. – Evan Carroll Sep 30 '18 at 00:44
  • It is located in arch/x86/entry/vdso/vdso32-setup.c. But reading it does not really give a better understanding of how it works. :) – perror Sep 30 '18 at 08:24
  • If you don't mind garbage being put in ecx, edx and ebp (such as when you're going to be overwriting those values anyway) you can instead use lea ebp, [esp-12] – 0x777C Feb 25 '19 at 19:02
  • Are you sure it's sysenter requiring stack set up? The way I understand it, those should be values for sysexit, so it knows where to return and restore to previous stack. So I would say it's kernel that works with the values from the stack, but maybe that's what you meant. In the last code snippet, what does the second sysenter do? – Pyjong Jan 13 '20 at 15:36