The Zerofox Syndicate

Asmtutor x86_64 lesson 1

2022-06-06

Disclaimer

These posts are heavily based on lessons that DGivney originally published on asmtutor.com.

I just wanted to port his lessons to x64 as an exercise for myself and put them somewhere online. Credit should still go to @DGivney.

Lesson 1: Hello World

This tutorial assumes that you already know what a CPU register is and what a CPU instruction is, yet are unfamiliar with how programs interact with the outside world.

What is a system call?

Programs can execute instructions on a CPU. Most of those instructions are very basic and allow you to do arithmetic, like adding two numbers.

For any program to be useful, it needs to be able to interact with the outside world. Calculation only gets you so far. At some point, you want to display the results of those calculations. A program needs ways to receive input and ways to output the results of its calculations.

Programs cannot do these things on their own, because there are no CPU instructions that perform them directly. Instead, a program has to ask the kernel to perform the operation. The kernel suspends the program, performs the requested task, and then resumes the program and hands it the result. These interactions with the kernel are called system calls.

System calls

System calls allow programs to request operations that have to be performed by the kernel. Since system calls often deal with the outside world, we need a way to pass data to the kernel and to receive data back. We do this by placing arguments in certain registers, and we expect the return value in a specific register. Which registers carry which arguments is not hard-wired into the CPU. That decision is made by the kernel, and the kernel does not communicate it to the program. The program is simply supposed to know which registers the kernel expects to hold which arguments for a specific syscall. These expectations are called a calling convention. The convention will be different if you use a BSD kernel instead of a Linux kernel, or if you are using a different CPU architecture.

The original tutorial that this blog post is based on only showed the 32-bit (x86) version. While that version is still relevant for educational purposes, 32-bit x86 isn't really used on modern desktops any more, so I wanted to update this tutorial with the 64-bit version that everyone is likely to run these days.

x86 syscall calling convention

If you want to learn the original x86 calling convention, I suggest reading the original tutorial on asmtutor.com.

What is x64?

The x86_64 architecture, sometimes called x64, is a 64-bit extension by AMD of the original 32-bit x86 architecture by Intel.

Hello world example

Before diving into more theory, here is a small program.

; Hello World Program - asmtutor.com
; Assemble with: nasm -f elf64 helloworld.asm
; Link with: ld -m elf_x86_64 helloworld.o -o helloworld
; Run with: ./helloworld

SECTION .data
msg     db      'Hello World!', 0Ah     ; assign msg variable with your message string

SECTION .text
global  _start

_start: 

    mov     rax, 1    ; invoke SYS_WRITE (syscall number 1)
    mov     rdi, 1    ; write to the STDOUT file
    mov     rsi, msg  ; move the memory address of our message string into rsi
    mov     rdx, 13   ; number of bytes to write - one for each character plus 0Ah (line feed character)
    syscall

For the kernel to know which syscall you want to execute, it looks into the rax register. In this case, 1 means we are trying to invoke the write syscall.

The write syscall takes 3 arguments: first the file descriptor, second the memory address of the data we want to write, and third the number of bytes we would like to write.

Our first argument is 1 because we would like to write to STDOUT. Our second argument is the msg variable; the NASM assembler will replace it with the address in memory that holds the “Hello World!\n” string. Our third argument is the length of the “Hello World!\n” string, which is 13 bytes.
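A hard-coded length like 13 is easy to get wrong when the message changes. As a minimal sketch, NASM can compute the length at assembly time instead. The name msglen is made up for this illustration and is not part of the original lesson. Like the example above, this variation does not exit cleanly yet.

```nasm
; Variation on the hello world example: let NASM compute the length.
; Build and run exactly like the original example.

SECTION .data
msg     db      'Hello World!', 0Ah     ; the message followed by a line feed
msglen  equ     $ - msg                 ; hypothetical name: current address minus address of msg

SECTION .text
global  _start

_start:

    mov     rax, 1          ; syscall number 1 = write
    mov     rdi, 1          ; arg1: file descriptor 1 (STDOUT)
    mov     rsi, msg        ; arg2: address of the message
    mov     rdx, msglen     ; arg3: number of bytes, computed by the assembler
    syscall
```

The equ line has to come directly after the string, because $ stands for the address at the point where it appears.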

syscall number	arg1	arg2	arg3	arg4	arg5	arg6
rax	rdi	rsi	rdx	r10	r8	r9
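The table only lists the inputs. On Linux x86_64 the kernel places the return value back in rax, and the syscall instruction itself overwrites rcx and r11. As a rough sketch of what that means in practice, the variation below prints the message twice and uses the return value of the first write (the number of bytes written, assuming the write succeeded in full) as the length for the second. It is my own illustration, not part of the original lesson, and it still lacks a clean exit.

```nasm
; Sketch: using the value that write returns in rax.
; Build and run exactly like the original example.

SECTION .data
msg     db      'Hello World!', 0Ah

SECTION .text
global  _start

_start:

    mov     rax, 1      ; syscall number 1 = write
    mov     rdi, 1      ; arg1: file descriptor 1 (STDOUT)
    mov     rsi, msg    ; arg2: address of the message
    mov     rdx, 13     ; arg3: number of bytes to write
    syscall             ; on return: rax = bytes written, or a negative error code
                        ; rcx and r11 have been overwritten

    mov     rdx, rax    ; arg3: reuse the returned byte count as the length
    mov     rax, 1      ; rax held the return value, so load the syscall number again
    mov     rdi, 1      ; arg1: STDOUT again (reloaded for clarity)
    mov     rsi, msg    ; arg2: same message (reloaded for clarity)
    syscall
```

Because rax is both the syscall number on the way in and the return value on the way out, it has to be set again before every syscall.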

To run this, we first need to assemble this file into an object file. The object file contains all the instructions, but it is not yet in the format of an executable file that the operating system can run. We create the executable by linking the object file.

nasm -f elf64 helloworld.asm # assemble to helloworld.o (object file)
ld -m elf_x86_64 helloworld.o -o helloworld # create the executable

If you execute the resulting helloworld file, you will notice that we get the desired output, but that the program also crashes immediately afterwards, typically with a segmentation fault.

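This is because we are not properly exiting the program. After the syscall instruction, the CPU simply keeps executing whatever bytes follow our code in memory. Properly exiting the program requires another syscall.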

The exit syscall number is 60 and takes only one argument, the exit code. As an exercise, try adding it to the hello world example so that the program exits cleanly without crashing.

Lesson 2 will be the solution to this exercise.

Some useful links

A list of x64 syscall numbers and their arguments: https://chromium.googlesource.com/chromiumos/docs/+/HEAD/constants/syscalls.md#x86_64-64_bit
A syscall table with search: filippo.io/linux-syscall-table
Remember that you can also look at the man page for syscall(2), if you just want to know which register is used for which argument.

Tags: x64 nasm asmtutor.com