> Code-vid: UEFI
I spent some time over this quarantime brushing up on a couple things, but it all started with an interest in UEFI.
UEFI fills a lot of niches that a system's BIOS once filled, and provides a programming environment that is somewhere between firmware and an operating system. I've been told it does quite a bit more than BIOS once did, and while I've always had an interest in programming in a BIOS environment, I've never really gotten off the ground with any of my putterings, so as a result of my inexpertise, I also cannot tell you what the difference between UEFI and BIOS is, only that UEFI is newer.
Most of my experience programming has been with an application that is intended to be run in an OS. Whether it's an executable on your computer desktop or an app on your phone, I was always able to assume that certain functions and structures were available to me, albeit in different flavours for eg linux, windows, mac, android, etc. One example of this is a filesystem, so operations like opening or reading a file are all calls into the operating system.
When programming for this early boot environment on a PC, I am presented with a different, more limited toolset, and that's one of the main things that interests me.
I started this project with the intent of writing a bootloader, and using that to bootstrap into more OS development stuff, but I got really sidetracked and right now I am firmly stuck in UEFI land until I finish this.
Pre-Init
There was an initial (maybe self-inflicted) headache of trying to assemble with yasm
and link with llvm tools, instead of fasm+gcc or just gcc, which seems to be a common option.
After pouring over the UEFI executable format documentation, I found it was a Portable Executable (PE) format, which is commonly used in the windows world, so looking through some llvm-ld documentation and a bunch of trial-and-error I found a magic incantation that worked for me:
# replace <entry>, <file.o>, and <file.efi> with something that makes sense for you
lld-link /debug /entry:<entry> /subsystem:efi_application <file.o> /out:<file.efi>
lld-link
is the windows-base linker frontend for the llvm linker,
which explains the windows-style cli flags even on linux.
For the rest of the post, I'm going to assume <entry>
is efi_main
, but you can
choose whichever one you want.
Note: I tried a bit with a custom linker script before this, and I honestly can't remember if I got it working or not, but this ended up being much cleaner
For yasm
it was a little easier, just
yasm -g dwarf2 -f win64 <file.s> <file.o>
And lastly, in order to run the executable, I needed a UEFI environment with the
executables I built.
The natural choice is qemu
running OVMF
boot images, and every time I built
the efi executables, I'd copy them into a disk image that was attached to the
qemu system when it booted up.
There's a good start on the OSDev wiki page for UEFI that I based the following commands and stuff off of.
# make the image
dd if=/dev/zero of=uefi.img bs=512 count=93750 status=none
# partition it
# 1. create a gpt table
# 2. new partition, start 2048, end 93716
# 3. write to disk
printf 'g\nn\n1\n2048\n93716\nw\n' |fdisk uefi.img |sed 's/^/fdisk: /'
mounting the image as a loop device, and formatting it need to be done as root
off=1048576
siz=46934528
mnt=$(mktemp -d)
losetup --offset $off --sizelimit $siz /dev/loop0 uefi.img
mkfs.fat -F 32 /dev/loop0
mount /dev/loop0 $mnt
cp -R ./*.efi $mnt/
umount $mnt
losetup -d /dev/loop0
rm -rf $mnt
Those numbers for the size of the image, the formatting, and the offset and size
params to losetup
were gleaned from other guides and docs on UEFI images.
I wrapped all these commands in a Makefile
and a build script.
Init
The first step is an obligatory hello world, and now with all that other stuff out of the way, that too is where I go.
I found a couple documents really useful for programming for UEFI in assembly:
- UEFI documentation - for info on the entry point, and datastructures available
- ISA documentation for x86_64 - for info on asm codes
- YASM documentation - for info on yasm format, as well as preprocessor macros
With all that in mind, we start with the symbol I've declared as the entrypoint
in my lld-link
command: efi_main
; hello.s
section .text
global efi_main
efi_main:
; something something
; print "hello world"
; exit?
The UEFI documentation describes calling conventions as well as the UEFI entry point.
- the first two arguments to a function are available as
rcx
andrdx
(if they fit) - the entrypoint should accept two arguments:
ImageHandle
,SystemTable
So, we have no use for the image handle right now, what we really want is the
system table, specifically the ConOut
pointer, and its associated OutputString
function.
; these data structs are taken from UEFI docs
struc EFI_SYSTEM_TABLE
resq 8 ; buncha fields we don't care about right now
.ConOut: resq 1
resq 6
endstruc
struc EFI_SIMLE_TEXT_OUTPUT_PROTOCOL
resq 1
.OutputString: resq 1
resq 8
endstruc
section .text
global efi_main
efi_main:
; rcx = ImageHandle, rdx = SystemTable
; rcx, rdx, rax are all volatile, which means we don't have to save them
; OutputString takes rcx = ConOut, rdx = <string>
mov rcx, [rdx+EFI_SYSTEM_TABLE.ConOut]
mov rax, [rcx+EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL.OutputString]
lea rdx, [rel _hello]
call rax
ret
section.data
; UEFI strings are UCS or utf16be encoded
; for ascii chars, this is equivalent to the ascii byte left-padded with 0's
; the 13,10 are cr nl
hello: dw 'h','e','l','l','o',13,10,0
Lastly, we need to actually run the image we made, so my qemu command is
qemu-system-x86_64 -cpu qemu64 -net none -bios OVMF.fd -drive format=raw,file=uefi.img
Google around for how to get the OVMF.fd
bios file.
For me it is provided as a package, but there should also be a direct download somewhere,
I don't feel like finding it for you.
Debugging (or lack therof)
I could not get gdb
or lldb
to attach to qemu and do anything intelligible,
so for the time being, line-debugging is out.
Also, before I had gotten OutputString
working, I couldn't debug with log
messages either.
Luckily, and I'm not sure if it's just a function of qemu or the OVMF
environment
or if it's from UEFI, but when there's a fault, I get a nice dump of all the
registers, so I used that to debug where I was in the program.
I'd have some mov r8, __LINE__
scattered throughout the program,
and I would stick a call 0
to induce a fault.
I could then check r8
to see whether a certain branch had been hit or not.
That or just add a call 0
somewhere to check what's in the different registers at a certain point.
FS0:\> main.efi
!!!! X64 Exception Type - 0D(#GP - General Protection) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000
RIP - 00000000067E0004, CS - 0000000000000038, RFLAGS - 0000000000000246
RAX - 00000000067E2018, RCX - 06B3D32000000000, RDX - 0000000007BEE018
RBX - 0000000007F1C578, RSP - 0000000007F1C498, RBP - 0000000007F1C568
RSI - 0000000000000009, RDI - 00000000067E2018
R8 - 0000000000000004, R9 - 0000000000000108, R10 - 0000000000000000
R11 - 0000000000000008, R12 - 0000000000000000, R13 - 0000000006B1C018
R14 - 0000000000000000, R15 - 00000000067E2040
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010033, CR2 - 0000000000000000, CR3 - 0000000007C01000
CR4 - 0000000000000668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 0000000007BEEA98 0000000000000047, LDTR - 0000000000000000
IDTR - 00000000072D1018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 0000000007F1C0F0
!!!! Find image based on IP(0x67E0004) (No PDB) (ImageBase=00000000067DF000, EntryPoint=000000
00067E0000) !!!!
Next Steps
Now that I could compile and run something I started messing around with
input as well as output, and thought I'd try my hand at making a forth interpreter.
I'd heard about forth being a common lang to implement in embedded environments
so this seemed like a good place to try it, but before I did, it was really
bugging me that I couldn't use string literal syntax and have yasm
encode it
as utf16be.
I found that nasm
, which yasm
is mimicing, has string operations like
__utf16be__("hello")
which is exactly what I'm looking for,
but yasm
does not have them.
So I took a detour and added utf string ops to yasm
.
-JD