2020 Jun 26

> Code-vid: UEFI

I spent some time over this quarantime brushing up on a couple things, but it all started with an interest in UEFI.

UEFI fills a lot of the niches that a system's BIOS once filled, and provides a programming environment that sits somewhere between firmware and an operating system. I've been told it does quite a bit more than BIOS ever did. I've always had an interest in programming in a BIOS environment, but I never really got off the ground with any of my putterings, so I can't tell you exactly what the differences between UEFI and BIOS are, only that UEFI is newer.

Most of my programming experience has been with applications intended to run inside an OS. Whether it's an executable on your desktop or an app on your phone, I could always assume that certain functions and structures were available to me, albeit in different flavours for Linux, Windows, macOS, Android, etc. One example is the filesystem: operations like opening or reading a file are all calls into the operating system.

When programming for this early boot environment on a PC, I am presented with a different, more limited toolset, and that's one of the main things that interests me.

I started this project with the intent of writing a bootloader, and using that to bootstrap into more OS development stuff, but I got really sidetracked and right now I am firmly stuck in UEFI land until I finish this.


There was an initial (maybe self-inflicted) headache of trying to assemble with yasm and link with llvm tools, instead of fasm+gcc or just gcc, which seems to be a common option.

After poring over the UEFI executable format documentation, I found it was the Portable Executable (PE) format, which is commonly used in the Windows world. After looking through some lld documentation and a bunch of trial and error, I found a magic incantation that worked for me:

# replace <entry>, <file.o>, and <file.efi> with something that makes sense for you
lld-link /debug /entry:<entry> /subsystem:efi_application <file.o> /out:<file.efi>

lld-link is the Windows-style frontend for the LLVM linker (it mimics the MSVC linker's options), which explains the Windows-style CLI flags even on Linux. For the rest of the post, I'm going to assume <entry> is efi_main, but you can choose whichever name you want.

Note: I tried a bit with a custom linker script before this, and I honestly can't remember if I got it working or not, but this ended up being much cleaner.

For yasm it was a little easier, just:

yasm -g dwarf2 -f win64 <file.s> -o <file.o>

And lastly, in order to run the executable, I needed a UEFI environment with the executables I built. The natural choice is qemu running OVMF boot images, and every time I built the efi executables, I'd copy them into a disk image that was attached to the qemu system when it booted up.

There's a good start on the OSDev wiki page for UEFI that I based the following commands and stuff off of.

# make the image
dd if=/dev/zero of=uefi.img bs=512 count=93750 status=none
# partition it
# 1. create a gpt table
# 2. new partition, start 2048, end 93716
# 3. write to disk
printf 'g\nn\n1\n2048\n93716\nw\n' |fdisk uefi.img |sed 's/^/fdisk: /'

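Mounting the image as a loop device and formatting it both need to be done as root:

mnt=$(mktemp -d)
# byte offset and size of the partition: sectors 2048 to 93716, 512 bytes each
off=1048576
siz=46934528

losetup --offset $off --sizelimit $siz /dev/loop0 uefi.img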
mkfs.fat -F 32 /dev/loop0
mount /dev/loop0 $mnt
cp -R ./*.efi $mnt/
umount $mnt
losetup -d /dev/loop0
rm -rf $mnt

Those numbers for the size of the image, the formatting, and the offset and size params to losetup were gleaned from other guides and docs on UEFI images.
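
As a rough check on where those numbers come from, the sketch below derives the losetup offset and size limit from the sector numbers given to fdisk. It assumes 512-byte sectors (the bs used with dd above), and the variable names are only for illustration.

```shell
#!/bin/sh
# Derive the losetup parameters from the fdisk sector numbers.
# Assumption: 512-byte sectors, matching bs=512 in the dd command.
sector=512
total=93750    # sectors in the image (count= in dd)
first=2048     # first sector of the partition
last=93716     # last sector of the partition (inclusive)

off=$(( first * sector ))
siz=$(( (last - first + 1) * sector ))

echo "image:  $(( total * sector )) bytes"
echo "offset: $off bytes"
echo "size:   $siz bytes"

# Sanity check: the partition must fit inside the image.
[ "$last" -lt "$total" ] || echo "partition runs past the end of the image" >&2
```

With the values above this works out to an offset of 1048576 bytes and a size limit of 46934528 bytes.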

I wrapped all these commands in a Makefile and a build script.
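
A minimal sketch of what such a build script could look like. This is not the actual script: the file name build.sh, the build_cmds function, and the dry-run approach are all assumptions for illustration. It only prints the assemble and link commands for each source file, so they can be inspected before being run.

```shell
#!/bin/sh
# build.sh: hypothetical wrapper around the assemble and link steps above.
# It only prints the commands; pipe the output to `sh -e` to run them.
# Assumes source file names end in .s and contain no spaces.
entry=efi_main

build_cmds() {
	src=$1
	base=${src%.s}    # strip the .s suffix
	echo "yasm -g dwarf2 -f win64 $src -o $base.o"
	echo "lld-link /debug /entry:$entry /subsystem:efi_application $base.o /out:$base.efi"
}

for f in "$@"; do
	build_cmds "$f"
done
```

Something like `sh build.sh hello.s | sh -e` would then assemble and link hello.efi, stopping at the first command that fails.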


The first step is the obligatory hello world, and with all that other stuff out of the way, that's where I went next.

I found a couple documents really useful for programming for UEFI in assembly:

With all that in mind, we start with the symbol I've declared as the entrypoint in my lld-link command: efi_main

; hello.s
section .text
global efi_main
efi_main:
	; something something
	; print "hello world"
	; exit?

The UEFI documentation describes calling conventions as well as the UEFI entry point.

The entry point is handed two arguments: an image handle and a pointer to the system table. We have no use for the image handle right now; what we really want is the system table, specifically its ConOut pointer and the associated OutputString function.

; these data structs are taken from UEFI docs
struc EFI_SYSTEM_TABLE
	resq 8 ; buncha fields we don't care about right now
	.ConOut: resq 1
	resq 6
endstruc

struc EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL
	resq 1
	.OutputString: resq 1
	resq 8
endstruc

section .text
global efi_main
efi_main:
	; rcx = ImageHandle, rdx = SystemTable
	; rcx, rdx, rax are all volatile, which means we don't have to save them
	sub rsp, 40 ; 32 bytes of shadow space, plus 8 to keep the stack 16-byte aligned
	; OutputString takes rcx = ConOut, rdx = <string>
	mov rcx, [rdx+EFI_SYSTEM_TABLE.ConOut]
	mov rax, [rcx+EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL.OutputString]
	lea rdx, [rel hello]
	call rax
	add rsp, 40
	ret ; rax holds the status returned by OutputString

; UEFI strings are UCS-2/UTF-16, stored little-endian on x86
; for ascii chars, dw zero-extends the ascii byte to a 16-bit value
; the 13,10 are cr nl
hello: dw 'h','e','l','l','o',13,10,0
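
To see which bytes the assembler needs to emit for that string, the encoding can be checked from the shell. This sketch assumes iconv and od are installed, and it uses UTF-16LE because x86 stores each 16-bit dw value low byte first.

```shell
# Print the bytes of "hello\r\n" encoded as little-endian UTF-16.
# Each ASCII character becomes its own byte followed by a zero byte.
bytes=$(printf 'hello\r\n' | iconv -f ASCII -t UTF-16LE | od -An -v -tx1 | xargs)
echo "$bytes"
```

This prints `68 00 65 00 6c 00 6c 00 6f 00 0d 00 0a 00`. The trailing 0 in the dw list adds one more `00 00` pair as the string terminator, which is not part of this output.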

Lastly, we need to actually run the image we made, so my qemu command is

qemu-system-x86_64 -cpu qemu64 -net none -bios OVMF.fd -drive format=raw,file=uefi.img

Google around for how to get the OVMF.fd bios file. For me it is provided as a package, but there should also be a direct download somewhere; I don't feel like finding it for you.

Debugging (or lack thereof)

I could not get gdb or lldb to attach to qemu and do anything intelligible, so for the time being, line-debugging is out. Also, before I had gotten OutputString working, I couldn't debug with log messages either. Luckily, and I'm not sure if it's just a function of qemu or the OVMF environment or if it's from UEFI, but when there's a fault, I get a nice dump of all the registers, so I used that to debug where I was in the program.

I'd have some mov r8, __LINE__ scattered throughout the program, and I would stick a call 0 to induce a fault. I could then check r8 to see whether a certain branch had been hit or not. That or just add a call 0 somewhere to check what's in the different registers at a certain point.

FS0:\> main.efi
!!!! X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000
RIP  - 00000000067E0004, CS  - 0000000000000038, RFLAGS - 0000000000000246
RAX  - 00000000067E2018, RCX - 06B3D32000000000, RDX - 0000000007BEE018
RBX  - 0000000007F1C578, RSP - 0000000007F1C498, RBP - 0000000007F1C568
RSI  - 0000000000000009, RDI - 00000000067E2018
R8   - 0000000000000004, R9  - 0000000000000108, R10 - 0000000000000000
R11  - 0000000000000008, R12 - 0000000000000000, R13 - 0000000006B1C018
R14  - 0000000000000000, R15 - 00000000067E2040
DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
GS   - 0000000000000030, SS  - 0000000000000030
CR0  - 0000000080010033, CR2 - 0000000000000000, CR3 - 0000000007C01000
CR4  - 0000000000000668, CR8 - 0000000000000000
DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 0000000007BEEA98 0000000000000047, LDTR - 0000000000000000
IDTR - 00000000072D1018 0000000000000FFF,   TR - 0000000000000000
FXSAVE_STATE - 0000000007F1C0F0
!!!! Find image based on IP(0x67E0004) (No PDB)  (ImageBase=00000000067DF000, EntryPoint=00000000067E0000) !!!!
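
The last line of the dump is enough to work out where in the image the fault happened. A small sketch of the arithmetic, using the values from the dump above (the variable names are just for illustration):

```shell
# Values copied from the exception dump above.
rip=0x67E0004
image_base=0x67DF000
entry_point=0x67E0000

# Offset of the faulting instruction from the start of the loaded image
# and from the entry point.
img_off=$(( $rip - $image_base ))
entry_off=$(( $rip - $entry_point ))

printf 'offset into image: 0x%x\n' "$img_off"
printf 'offset from entry: 0x%x\n' "$entry_off"
```

Here the fault is 0x1004 bytes into the image and 4 bytes past the entry point. The image offset can then be compared against a disassembly of the .efi file.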

Next Steps

Now that I could build and run something, I started messing around with input as well as output, and thought I'd try my hand at making a Forth interpreter. I'd heard that Forth is a common language to implement in embedded environments, so this seemed like a good place to try it. Before I did, though, it was really bugging me that I couldn't use string literal syntax and have yasm encode it as UTF-16. I found that nasm, which yasm is mimicking, has string functions like __utf16__("hello"), which is exactly what I was looking for, but yasm does not have them.

So I took a detour and added utf string ops to yasm.