UNIX Syscalls

Overview

On UNIX-like operating systems, userland processes invoke kernel procedures using the “syscall” feature. Each syscall is identified by a “syscall number” and has a short list of parameters, both of which can vary between operating systems, hardware platforms, and configuration options.

Performing a syscall is usually done via a special assembly instruction, though some platforms use other mechanisms (e.g. a vDSO). This page is a catalog of how to invoke syscalls on different UNIX-like platforms.

int $0x80 (or int 80h)

int $0x80 (also styled as int 80h) is the traditional syscall instruction on i386 UNIX-like platforms. It triggers a software interrupt that transfers control to the kernel, which inspects its registers and stack to find the syscall number and parameters. It has been obsolete since the mid-2000s for performance reasons, but can still be found in tutorials because it’s easier to understand than more modern mechanisms.

Linux

Linux syscalls are defined in include/linux/syscalls.h. Syscalls use the same parameter order across platforms, but some (e.g. sys_stat64) are only defined on some platforms, and others (e.g. sys_clone) have different parameters depending on kernel compilation options. Syscall numbers are platform-dependent.

Manpage syscalls(2) lists syscalls and which kernel version they were added in. Manpage syscall(2) lists per-architecture calling conventions and register assignments.
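
For programs that don’t need to avoid libc, the syscall() wrapper documented in syscall(2) is the least fussy way to issue a syscall by number. The sketch below is a minimal illustration, assuming glibc or musl on Linux; raw_write is a hypothetical helper name, not part of any library. The SYS_write constant comes from <sys/syscall.h>, which resolves to the right number for whichever architecture the code is compiled for.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_write: per-architecture syscall numbers */
#include <unistd.h>       /* syscall() */

/* Hypothetical helper: write a buffer to a file descriptor through the
 * generic syscall() wrapper instead of the write() library function. */
static long raw_write(int fd, const void *buf, unsigned long len) {
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "Hello, world!\n", 14) should return 14 when stdout is writable. On failure the wrapper returns -1 and sets errno, unlike the raw kernel interface, which returns a negative error code.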

Documentation and tutorials for implementing a Linux syscall:

Linux: i386 (INT 0x80)

The syscall number is passed in register eax. Syscalls with six or fewer parameters pass them in registers [ebx, ecx, edx, esi, edi, ebp]. Syscalls with more than six parameters use ebx to pass a memory address, in a way that doesn’t seem to be well documented.

Linux syscall numbers for i386 are defined in arch/x86/entry/syscalls/syscall_32.tbl.

See above for background on int $0x80.

.data
    .set .L_STDOUT,        1
    .set .L_SYSCALL_EXIT,  1
    .set .L_SYSCALL_WRITE, 4
    .L_message:
        .ascii "Hello, world!\n"
        .set .L_message_len, . - .L_message

.text
    .global _start
    _start:
        # write(STDOUT, message, message_len)
        mov $.L_SYSCALL_WRITE, %eax
        mov $.L_STDOUT,        %ebx
        mov $.L_message,       %ecx
        mov $.L_message_len,   %edx
        int $0x80

        # exit(0)
        mov $.L_SYSCALL_EXIT, %eax
        mov $0,               %ebx
        int $0x80

static linking

$ as --32 -o hello.o hello.s
$ ld -m elf_i386 -o hello hello.o
$ file hello
hello: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, not stripped
$ ./hello
Hello, world!
$

dynamic linking

$ as --32 -o hello.o hello.s
$ ld -m elf_i386 -o hello hello.o \
   --dynamic-linker /lib/ld-linux.so.2 \
   -l:ld-linux.so.2
$ file hello
hello: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, not stripped
$ ldd hello
    /lib/ld-linux.so.2 (0x56614000)
    linux-gate.so.1 (0xf77ba000)
$ ./hello
Hello, world!
$

Linux: i386 (vDSO)

A vDSO is a shared library injected into processes by the kernel, rather than loaded by the dynamic linker. It’s used in i386 Linux to implement faster syscalls via the SYSENTER instruction available in modern 32-bit x86 processors[1][2]. Later kernel versions also added fast paths for certain read-only syscalls[3].

This code is slightly more complicated than the int $0x80 example because all functions loaded from shared objects (including __kernel_vsyscall) must use indirect calls.

.extern __kernel_vsyscall

.data
    .set .L_STDOUT,        1
    .set .L_SYSCALL_WRITE, 4
    .set .L_SYSCALL_EXIT,  1
    .L_message:
        .ascii "Hello, world!\n"
        .set .L_message_len, . - .L_message

.text
    .global _start
    _start:
        call .L_get_pc_thunk.esi
        add  $_GLOBAL_OFFSET_TABLE_, %esi

        # write(STDOUT, message, message_len)
        mov  $.L_SYSCALL_WRITE, %eax
        mov  $.L_STDOUT,        %ebx
        mov  $.L_message,       %ecx
        mov  $.L_message_len,   %edx
        call *__kernel_vsyscall@GOT(%esi)

        # exit(0)
        mov  $.L_SYSCALL_EXIT, %eax
        mov  $0,               %ebx
        call *__kernel_vsyscall@GOT(%esi)

    .L_get_pc_thunk.esi:
        mov (%esp), %esi
        ret

The linux-gate.so.1 library that will be available at runtime is not available to the linker at link time. To get the correct symbols and ELF headers into the executable, we need to inject some fake data:

The resulting binary is a totally normal dynamic ELF executable.

$ echo '.type __kernel_vsyscall STT_FUNC' | as --32 -o dummy_so.o
$ ld -m elf_i386 -shared \
   --defsym __kernel_vsyscall=0 \
   -soname=linux-gate.so.1 \
   -o dummy_so dummy_so.o

$ as --32 -o hello.o hello.s
$ ld -m elf_i386 -o hello hello.o \
   --dynamic-linker /lib/ld-linux.so.2 \
   -l:ld-linux.so.2 \
   dummy_so
$ file hello
hello: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, not stripped
$ ldd hello
    /lib/ld-linux.so.2 (0x56625000)
    linux-gate.so.1 (0xf77d5000)
$ ./hello
Hello, world!
$

Why not auxinfo?

Some articles about the Linux vDSO describe looking up its address using the ELF auxiliary vector. I avoided this because it seems complicated and fussy:

The main disadvantage of my solution is that it can’t be used in statically linked executables, which are useful for system recovery tools (e.g. busybox) or minimal Docker containers.
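
For comparison, the first step of the auxiliary vector approach can be sketched in C. This assumes a libc that provides getauxval() (glibc 2.16 or later, or musl); vdso_base and looks_like_elf are hypothetical helper names. It only locates the vDSO image. The fussy part, walking the image’s ELF dynamic symbol table to find a named function, is not shown.

```c
#include <elf.h>       /* ELFMAG, SELFMAG */
#include <string.h>    /* memcmp, NULL */
#include <sys/auxv.h>  /* getauxval, AT_SYSINFO_EHDR */

/* Hypothetical helper: return the base address of the vDSO image the kernel
 * mapped into this process, or NULL if the auxiliary vector has no entry. */
static const void *vdso_base(void) {
    return (const void *)getauxval(AT_SYSINFO_EHDR);
}

/* Returns 1 if `base` is non-NULL and starts with the ELF magic bytes. */
static int looks_like_elf(const void *base) {
    return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

Because getauxval() reads values the kernel placed on the initial stack, this lookup does not depend on the dynamic linker and so also works in a static executable.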

Why not gs:0x10?

I’ve seen one article recommend using call *%gs:0x10 to invoke __kernel_vsyscall, because GNU libc uses this register to locate its early-initialized magic globals.

Don’t do this. Everything I can find about glibc auxv handling indicates that the value of %gs is not part of the GNU libc public ABI, and it seems to be pointing to some internal data structure that happens to have the address of __kernel_vsyscall at offset 0x10 (used to be 0x18). There are no guarantees that these properties will hold in the future, especially if you want your code to link against non-GNU libc implementations such as musl.

Linux: x86-64

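The syscall number is passed in register rax. Parameters are passed in registers [rdi, rsi, rdx, r10, r8, r9]; note that the fourth parameter uses r10 rather than rcx (as the userland function-call ABI would), because the syscall instruction itself overwrites rcx. I haven’t found documentation on what x86-64 Linux does for syscalls with more than six parameters. The syscall instruction is used to pass control to the kernel.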

Linux syscall numbers for x86-64 are defined in arch/x86/entry/syscalls/syscall_64.tbl.

.data
    .set .L_STDOUT,        1
    .set .L_SYSCALL_EXIT,  60
    .set .L_SYSCALL_WRITE, 1
    .L_message:
        .ascii "Hello, world!\n"
        .set .L_message_len, . - .L_message

.text
    .global _start
    _start:
        # write(STDOUT, message, message_len)
        mov     $.L_SYSCALL_WRITE, %rax
        mov     $.L_STDOUT,        %rdi
        mov     $.L_message,       %rsi
        mov     $.L_message_len,   %rdx
        syscall

        # exit(0)
        mov     $.L_SYSCALL_EXIT, %rax
        mov     $0,               %rdi
        syscall

static linking

$ as --64 -o hello.o hello.s
$ ld -m elf_x86_64 -o hello hello.o
$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
$ ./hello
Hello, world!
$

dynamic linking

$ as --64 -o hello.o hello.s
$ ld -m elf_x86_64 -o hello hello.o \
   --dynamic-linker /lib64/ld-linux-x86-64.so.2 \
   -l:ld-linux-x86-64.so.2
$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, not stripped
$ ldd hello
    /lib64/ld-linux-x86-64.so.2 (0x00007f472a831000)
    linux-vdso.so.1 (0x00007ffe83d7a000)
$ ./hello
Hello, world!
$

Linux: ARM v6 (Little-Endian, EABI)

Linux syscall numbers for ARM are defined in arch/arm/tools/syscall.tbl.

.arch armv6
.data
    .set .L_STDOUT,        1
    .set .L_SYSCALL_EXIT,  1
    .set .L_SYSCALL_WRITE, 4
    .L_message:
        .ascii "Hello, world!\n"
    .set .L_message_len, . - .L_message

.text
    .global _start
    _start:
        @ write(STDOUT, message, message_len)
        mov %r7, #.L_SYSCALL_WRITE
        mov %r0, #.L_STDOUT
        ldr %r1, =.L_message
        mov %r2, #.L_message_len
        swi #0

        @ exit(0)
        mov %r7, #.L_SYSCALL_EXIT
        mov %r0, #0
        swi #0

static linking

$ as -EL -o hello.o hello.s
$ ld -m armelf_linux_eabi -o hello hello.o
$ file hello
hello: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, not stripped
$ ./hello
Hello, world!
$

dynamic linking

$ as -EL -o hello.o hello.s
$ ld -m armelf_linux_eabi -o hello hello.o \
   --dynamic-linker /lib/ld-linux-armhf.so.3 \
   -l:ld-linux-armhf.so.3
$ file hello
hello: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, not stripped
$ ./hello
Hello, world!
$

Darwin (MacOS X)

Note that I have left out the instructions to statically link binaries because they are documented as unsupported: Technical Q&A QA1118: Statically linked binaries on Mac OS X. Apple is also known to break the syscall ABI between MacOS versions, though it should be stable enough for the syscalls inherited from BSD.

The examples below use lea because PIE addressing is required for -macosx_version_min 10.7 or later. Make sure this linker flag matches the .macosx_version_min value in the assembly, or the linker may reject your object code.

Building for 10.8 and later requires linking with libSystem via ld -lSystem. Earlier versions don’t need that link.

The default entry point changed from start to _main in 10.8. Use ld -e _main to build for earlier -macosx_version_min values.

Darwin: i386

.macosx_version_min 10, 8

.data
    .set L_STDOUT,        1
    .set L_SYSCALL_EXIT,  1
    .set L_SYSCALL_WRITE, 4
    L_message:
        .ascii "Hello, world!\n"
        .set L_message_len, . - L_message

.text
    .global _main
    _main:
        mov %eax, %esi

        # write(STDOUT, message, message_len)
        push $L_message_len
        lea  L_message-_main(%esi), %eax
        push %eax
        push $L_STDOUT
        push $0 # stack padding
        mov  $L_SYSCALL_WRITE, %eax
        int  $0x80
        add  $16, %esp

        # exit(0)
        push $0 # exit code
        push $0 # stack padding
        mov  $L_SYSCALL_EXIT, %eax
        int  $0x80

dynamic linking

$ as -arch i386 -o hello.o hello.s
$ ld -arch i386 -macosx_version_min 10.8 -lSystem -o hello hello.o
$ file hello
hello: Mach-O executable i386
$ otool -L hello
hello:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.60.2)
$ ./hello
Hello, world!
$

Darwin: x86-64

In 64-bit MacOS X, syscall numbers are divided into “classes”. The syscalls inherited from BSD are in SYSCALL_CLASS_UNIX, starting at 0x2000000. See XNU header osfmk/mach/syscall_sw.h for details.
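
The class-plus-number encoding can be sketched as a small C helper. The constant values below are assumptions inferred from the 0x2000000 base described above and should be checked against the XNU header for your version; the names CLASS_SHIFT, CLASS_UNIX, NUMBER_MASK, and darwin_unix_syscall are made up for this example.

```c
#include <stdint.h>

/* Assumed layout: the class occupies the top byte of a 32-bit value and
 * the per-class syscall number occupies the low 24 bits. */
enum {
    CLASS_SHIFT = 24,
    CLASS_UNIX  = 2,          /* class for syscalls inherited from BSD */
    NUMBER_MASK = 0x00FFFFFF
};

/* Hypothetical helper: combine the UNIX class with a BSD syscall number. */
static uint32_t darwin_unix_syscall(uint32_t bsd_number) {
    return ((uint32_t)CLASS_UNIX << CLASS_SHIFT) | (bsd_number & NUMBER_MASK);
}
```

With these values, darwin_unix_syscall(4) yields 0x2000004 and darwin_unix_syscall(1) yields 0x2000001, matching the write and exit constants used in the assembly for this platform.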

.macosx_version_min 10, 8

.data
    .set L_STDOUT,        1
    .set L_SYSCALL_EXIT,  0x2000001
    .set L_SYSCALL_WRITE, 0x2000004
    L_message:
        .ascii "Hello, world!\n"
        .set L_message_len, . - L_message

.text
    .global _main
    _main:
        # write(STDOUT, message, message_len)
        mov     $L_SYSCALL_WRITE, %rax
        mov     $L_STDOUT,        %rdi
        lea     L_message(%rip),  %rsi
        mov     $L_message_len,   %rdx
        syscall

        # exit(0)
        mov     $L_SYSCALL_EXIT, %rax
        mov     $0,              %rdi
        syscall

dynamic linking

$ as -arch x86_64 hello.s -o hello.o
$ ld -arch x86_64 -o hello hello.o \
    -macosx_version_min 10.8 -lSystem
$ file hello
hello: Mach-O 64-bit executable x86_64
$ otool -L hello
hello:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.60.2)
$ ./hello
Hello, world!
$

FreeBSD

The list of system calls is defined in sys/kern/syscalls.master. Syscall numbers appear to be the same across hardware platforms.

FreeBSD: i386

int $0x80 appears to be the only supported syscall mechanism for FreeBSD on i386. There is a vDSO at sys/sys/vdso.h but it doesn’t contain a Linux-style generic syscall trampoline.

.data
    .set .L_STDOUT,        1
    .set .L_SYSCALL_EXIT,  1
    .set .L_SYSCALL_WRITE, 4
    .L_message:
        .ascii "Hello, world!\n"
        .set .L_message_len, . - .L_message

.text
    .global _start
    _start:
        # write(STDOUT, message, message_len)
        push $.L_message_len
        push $.L_message
        push $.L_STDOUT
        push $0 # stack padding
        mov  $.L_SYSCALL_WRITE, %eax
        int  $0x80
        add  $16, %esp

        # exit(0)
        push $0 # exit code
        push $0 # stack padding
        mov  $.L_SYSCALL_EXIT, %eax
        int  $0x80

static linking

$ as --32 -o hello.o hello.s
$ ld -m elf_i386_fbsd -o hello hello.o
$ file hello
hello: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), statically linked, not stripped
$ ./hello
Hello, world!
$

dynamic linking

$ as --32 -o hello.o hello.s
$ ld -m elf_i386_fbsd -o hello hello.o \
    --dynamic-linker=/libexec/ld-elf.so.1 \
    -L/libexec -l:ld-elf.so.1 \
    --hash-style=gnu
$ file hello
hello: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, not stripped
$ ldd hello
hello:
    /libexec/ld-elf.so.1 (0x2806e000)
$ ./hello
Hello, world!
$

FreeBSD: x86-64

Note that older FreeBSD kernels contain a bug in syscall handling that can cause crashes when using the SYSCALL instruction. Compilers targeting these old versions should use INT $0x80 instead.

.data
    .set L_STDOUT,        1
    .set L_SYSCALL_EXIT,  1
    .set L_SYSCALL_WRITE, 4
    L_message:
        .ascii "Hello, world!\n"
        .set L_message_len, . - L_message

.text
    .global _start
    _start:
        # write(STDOUT, message, message_len)
        mov     $L_SYSCALL_WRITE, %rax
        mov     $L_STDOUT,        %rdi
        mov     $L_message,       %rsi
        mov     $L_message_len,   %rdx
        syscall

        # exit(0)
        mov     $L_SYSCALL_EXIT, %rax
        mov     $0,              %rdi
        syscall

static linking

$ as --64 -o hello.o hello.s
$ ld -m elf_x86_64_fbsd -o hello hello.o
$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), statically linked, not stripped
$ ./hello
Hello, world!
$

dynamic linking

$ as --64 -o hello.o hello.s
$ ld -m elf_x86_64_fbsd -o hello hello.o \
    --dynamic-linker=/libexec/ld-elf.so.1 \
    -L/libexec -l:ld-elf.so.1 \
    --hash-style=gnu
$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, not stripped
$ ldd hello
hello:
    /libexec/ld-elf.so.1 (0x800822000)
$ ./hello
Hello, world!
$

SunOS 4.x (Solaris 1.x)

SunOS: SPARC v7

.seg "data"
    L_STDOUT        = 1
    L_SYSCALL_EXIT  = 1
    L_SYSCALL_WRITE = 4
    L_message:
        .ascii "Hello world!\n"
        L_message_len = . - L_message

.seg "text"
    .global _start
    _start:
        ! write(STDOUT, message, message_len)
        mov L_SYSCALL_WRITE, %g1
        mov L_STDOUT,        %o0
        set L_message,       %o1
        set L_message_len,   %o2
        ta  0

        ! exit(0)
        mov L_SYSCALL_EXIT, %g1
        mov 0,              %o0
        ta  0

static linking

% as -o hello.o hello.s
% ld -e _start -o hello hello.o
% file hello
hello:          sparc demand paged executable not stripped
% ldd hello
hello: statically linked
% ./hello
Hello world!
%

Inline Assembly

Higher-level languages sometimes let assembly be embedded directly into their object code. The exact syntax is language- and compiler-specific.

I used x86-64 Linux as the target platform for these examples, but they should work equally well if the appropriate instructions are substituted.

A note on “clobbering”: compilers require the inline assembly block to declare which CPU registers other than the inputs and outputs may be modified. The exact set of clobbered registers is compiler-, platform-, and OS-specific[5]. Linux on x86-64 clobbers rcx and r11 (and maybe r10, as claimed by osdev?).

Linux: x86-64 (GNU C)

See Using Assembly Language with C in the GCC manual for an overview, Machine Constraints for architecture-specific codes to pass parameters into an assembly block, and Local Register Variables for details on assigning values to specific registers.

I couldn’t find documentation on which registers GNU C’s inline assembly clobbers, if any.

static const long STDOUT = 1;
static const long SYSCALL_EXIT = 60;
static const long SYSCALL_WRITE = 1;
static const char message[] = "Hello, world!\n";
/* sizeof counts the trailing NUL, which should not be written */
static const long message_len = sizeof(message) - 1;

void _start(void) {
    {   /* write(STDOUT, message, message_len) */
        register long        rax __asm__ ("rax") = SYSCALL_WRITE;
        register long        rdi __asm__ ("rdi") = STDOUT;
        register const char *rsi __asm__ ("rsi") = message;
        register long        rdx __asm__ ("rdx") = message_len;
        /* rax carries the syscall number in and the return value out */
        __asm__ __volatile__ ("syscall"
            : "+r" (rax)
            : "r" (rdi), "r" (rsi), "r" (rdx)
            : "rcx", "r11", "memory");
    }

    {   /* exit(0) */
        register long rax __asm__ ("rax") = SYSCALL_EXIT;
        register long rdi __asm__ ("rdi") = 0;
        __asm__ __volatile__ ("syscall"
            :
            : "r" (rax), "r" (rdi)
            : "rcx", "r11", "memory");
    }
}

static linking

$ gcc -m64 -c -o hello.o hello.c
$ ld -m elf_x86_64 -o hello hello.o
$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
$ ./hello
Hello, world!
$

dynamic linking

$ gcc -m64 -c -o hello.o hello.c
$ ld -m elf_x86_64 -o hello hello.o \
   --dynamic-linker /lib64/ld-linux-x86-64.so.2 \
   -l:ld-linux-x86-64.so.2
$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, not stripped
$ ./hello
Hello, world!
$
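
The same kind of syscall can also be written with x86 machine constraints instead of local register variables. This is a sketch for x86-64 Linux only, assuming GCC or Clang; raw_syscall3 is a hypothetical name. The constraint letters a, D, S, and d pin operands to rax, rdi, rsi, and rdx, and the "memory" clobber tells the compiler that the kernel may read or write memory the compiler cannot see.

```c
/* Hypothetical three-argument syscall wrapper for x86-64 Linux. */
static long raw_syscall3(long number, long a, long b, long c) {
    long ret;
    __asm__ __volatile__ ("syscall"
        : "=a" (ret)                                 /* rax: return value */
        : "a" (number), "D" (a), "S" (b), "d" (c)    /* rax, rdi, rsi, rdx */
        : "rcx", "r11", "memory");                   /* overwritten by syscall */
    return ret;
}
```

For example, raw_syscall3(1, 1, (long)message, 14) would issue write(1, message, 14). The raw interface reports errors as a negative error code in the return value rather than through errno.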

Linux: x86-64 (LLVM IR)

See Inline Assembler Expressions in the LLVM IR reference for an overview. I’m using named registers in the input list instead of moving things around in the ASM block, so that LLVM will handle the register allocation.

LLVM documentation says its ASM calls clobber registers dirflag, fpsr, and flags in addition to any registers clobbered by the kernel.

@.message = internal constant [14 x i8] c"Hello, world!\0A"

define void @_start() {
    %message_ptr = getelementptr [14 x i8], [14 x i8]* @.message , i64 0, i64 0

    ; write(STDOUT, message, message_len)
    call i64 asm sideeffect "syscall",
        "={rax},{rax},{rdi},{rsi},{rdx},~{rcx},~{r11},~{dirflag},~{fpsr},~{flags}"
        ( i64 1            ; {rax} SYSCALL_WRITE
        , i64 1            ; {rdi} STDOUT
        , i8* %message_ptr ; {rsi} message
        , i64 14           ; {rdx} message_len
        )

    ; exit(0)
    call i64 asm sideeffect "syscall",
        "={rax},{rax},{rdi},~{rcx},~{r11},~{dirflag},~{fpsr},~{flags}"
        ( i64 60 ; {rax} SYSCALL_EXIT
        , i64 0  ; {rdi} exit_code
        )

    ret void
}

static linking

$ llc -o hello.o hello.ll -filetype=obj
$ ld -m elf_x86_64 -o hello hello.o
$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
$ ./hello
Hello, world!
$

dynamic linking

$ llc -o hello.o hello.ll -filetype=obj -relocation-model=pic
$ ld -m elf_x86_64 -o hello hello.o \
   --dynamic-linker /lib64/ld-linux-x86-64.so.2 \
   -l:ld-linux-x86-64.so.2
$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, not stripped
$ ./hello
Hello, world!
$