I spent some time over this quarantime brushing up on a couple things, but it all started with an interest in UEFI.
UEFI fills a lot of niches that a system's BIOS once filled, and provides a programming environment that is somewhere between firmware and an operating system. I've been told it does quite a bit more than BIOS once did, and while I've always had an interest in programming in a BIOS environment, I've never really gotten off the ground with any of my putterings, so as a result of my inexpertise, I also cannot tell you what the difference between UEFI and BIOS is, only that UEFI is newer.
Most of my programming experience has been with applications that are intended to run in an OS. Whether it's an executable on your computer desktop or an app on your phone, I was always able to assume that certain functions and structures were available to me, albeit in different flavours for e.g. Linux, Windows, Mac, Android, etc. One example of this is a filesystem: operations like opening or reading a file are all calls into the operating system.
When programming for this early boot environment on a PC, I am presented with a different, more limited toolset, and that's one of the main things that interests me.
I started this project with the intent of writing a bootloader, and using that to bootstrap into more OS development stuff, but I got really sidetracked and right now I am firmly stuck in UEFI land until I finish this.
There was an initial (maybe self-inflicted) headache of trying to assemble with yasm
and link with llvm tools, instead of fasm+gcc or just gcc, which seems to be a common option.
After poring over the UEFI executable format documentation, I found it was a Portable Executable (PE) format, which is commonly used in the Windows world. After looking through some llvm-ld documentation and a bunch of trial and error, I found a magic incantation that worked for me:
# replace <entry>, <file.o>, and <file.efi> with something that makes sense for you
lld-link /debug /entry:<entry> /subsystem:efi_application <file.o> /out:<file.efi>
lld-link is the Windows-style (MSVC link.exe-compatible) frontend for the llvm linker,
which explains the Windows-style cli flags even on Linux.
For the rest of the post, I'm going to assume the entry point is
efi_main, but you can
choose whichever name you want.
Note: I tried a bit with a custom linker script before this, and I honestly can't remember if I got it working or not, but this ended up being much cleaner
For yasm it was a little easier, just:
yasm -g dwarf2 -f win64 <file.s> -o <file.o>
And lastly, in order to run the executable, I needed a UEFI environment with the
executables I built.
The natural choice is qemu with
OVMF boot images, and every time I built
the efi executables, I'd copy them into a disk image that was attached to the
qemu system when it booted up.
There's a good start on the OSDev wiki page for UEFI that I based the following commands and stuff off of.
# make the image
dd if=/dev/zero of=uefi.img bs=512 count=93750 status=none
# partition it
# 1. create a gpt table
# 2. new partition, start 2048, end 93716
# 3. write to disk
printf 'g\nn\n1\n2048\n93716\nw\n' |fdisk uefi.img |sed 's/^/fdisk: /'
Mounting the image as a loop device and formatting it both need to be done as root:
off=1048576
siz=46934528
mnt=$(mktemp -d)
losetup --offset $off --sizelimit $siz /dev/loop0 uefi.img
mkfs.fat -F 32 /dev/loop0
mount /dev/loop0 $mnt
cp -R ./*.efi $mnt/
umount $mnt
losetup -d /dev/loop0
rm -rf $mnt
Those numbers for the size of the image, the formatting, and the offset and size
for losetup were gleaned from other guides and docs on UEFI images.
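For what it's worth, the losetup numbers are not arbitrary: they line up with the partition boundaries handed to fdisk, assuming 512-byte sectors (the same block size used for dd). A small sketch of the arithmetic, with variable names (sector, first, last) that are only for illustration:

```shell
# Sector geometry, taken from the dd and fdisk steps.
sector=512     # bytes per sector (assumed, matches bs=512)
first=2048     # first sector of the partition
last=93716     # last sector of the partition (inclusive)

# Byte offset of the partition inside the image.
off=$((first * sector))
# Size of the partition in bytes; +1 because the end sector is inclusive.
siz=$(((last - first + 1) * sector))

echo "off=$off siz=$siz"
```

If the partition bounds change, recomputing off and siz this way keeps the loop device pointed at the right slice of the image.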
I wrapped all these commands in a
Makefile and a build script.
The first step is an obligatory hello world, and now with all that other stuff out of the way, that's where I go next.
I found a couple documents really useful for programming for UEFI in assembly:
With all that in mind, we start with the symbol I've declared as the entrypoint
; hello.s
section .text
global efi_main
efi_main:
    ; something something
    ; print "hello world"
    ; exit?
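The UEFI documentation describes calling conventions as well as the UEFI entry point.
The short version is that the first arguments to a function are passed in registers, starting with rcx and rdx (if they fit), so the entry point receives the image handle in rcx and a pointer to the system table in rdx.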
So, we have no use for the image handle right now; what we really want is the
system table, specifically the
ConOut pointer, and its associated OutputString function:
; these data structs are taken from UEFI docs
struc EFI_SYSTEM_TABLE
    resq 8                  ; buncha fields we don't care about right now
    .ConOut: resq 1
    resq 6
endstruc

struc EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL
    resq 1
    .OutputString: resq 1
    resq 8
endstruc

section .text
global efi_main
efi_main:
    ; rcx = ImageHandle, rdx = SystemTable
    ; rcx, rdx, rax are all volatile, which means we don't have to save them
    ; the calling convention wants 32 bytes of shadow space and a 16-byte
    ; aligned stack at the call, so reserve 40 bytes
    sub rsp, 40
    ; OutputString takes rcx = ConOut, rdx = <string>
    mov rcx, [rdx+EFI_SYSTEM_TABLE.ConOut]
    mov rax, [rcx+EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL.OutputString]
    lea rdx, [rel hello]
    call rax
    add rsp, 40
    ret                     ; rax still holds the status from OutputString

section .data
; UEFI strings are UCS-2, i.e. 16-bit little-endian code units on x86-64
; for ascii chars, each word is the ascii byte zero-extended to 16 bits
; the 13,10 are cr nl
hello: dw 'h','e','l','l','o',13,10,0
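To make the string encoding a bit more concrete, here is a rough sketch of what the dw line should end up as in memory. It doesn't run the assembler; it just writes the same bytes by hand with printf, on the assumption that each 16-bit word is stored low byte first (as it is on x86), and dumps them with od:

```shell
# Hand-written equivalent of: dw 'h','e','l','l','o',13,10,0
# Each character is one 16-bit word stored low byte first,
# so every ascii byte is followed by a zero byte.
bytes=$(printf 'h\0e\0l\0l\0o\0\r\0\n\0\0\0' | od -An -tx1 | tr -s ' \n' ' ')

# Show the bytes as hex pairs.
echo "$bytes"
```

Each ascii byte should show up followed by 00, with a final 00 00 pair as the terminator, which is what OutputString looks for to find the end of the string.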
Lastly, we need to actually run the image we made, so my qemu command is
qemu-system-x86_64 -cpu qemu64 -net none -bios OVMF.fd -drive format=raw,file=uefi.img
Google around for how to get the
OVMF.fd bios file.
For me it is provided as a package, but there should also be a direct download somewhere,
I don't feel like finding it for you.
I could not get
lldb to attach to qemu and do anything intelligible,
so for the time being, line-debugging is out.
Also, before I had gotten
OutputString working, I couldn't debug with log statements either.
Luckily (and I'm not sure if it's just a function of qemu, the OVMF firmware,
or UEFI itself), when there's a fault, I get a nice dump of all the
registers, so I used that to debug where I was in the program.
I'd have some
mov r8, __LINE__ scattered throughout the program,
and I would stick a
call 0 to induce a fault.
I could then check
r8 to see whether a certain branch had been hit or not.
That or just add a
call 0 somewhere to check what's in the different registers at a certain point.
FS0:\> main.efi
!!!! X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000
RIP  - 00000000067E0004, CS  - 0000000000000038, RFLAGS - 0000000000000246
RAX  - 00000000067E2018, RCX - 06B3D32000000000, RDX - 0000000007BEE018
RBX  - 0000000007F1C578, RSP - 0000000007F1C498, RBP - 0000000007F1C568
RSI  - 0000000000000009, RDI - 00000000067E2018
R8   - 0000000000000004, R9  - 0000000000000108, R10 - 0000000000000000
R11  - 0000000000000008, R12 - 0000000000000000, R13 - 0000000006B1C018
R14  - 0000000000000000, R15 - 00000000067E2040
DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
GS   - 0000000000000030, SS  - 0000000000000030
CR0  - 0000000080010033, CR2 - 0000000000000000, CR3 - 0000000007C01000
CR4  - 0000000000000668, CR8 - 0000000000000000
DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 0000000007BEEA98 0000000000000047, LDTR - 0000000000000000
IDTR - 00000000072D1018 0000000000000FFF,   TR - 0000000000000000
FXSAVE_STATE - 0000000007F1C0F0
!!!! Find image based on IP(0x67E0004) (No PDB) (ImageBase=00000000067DF000, EntryPoint=00000000067E0000) !!!!
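The last line of the dump also gives the image base and entry point, so a bit of subtraction shows how far into the image the fault happened. A minimal sketch using the values from this particular dump (the variable names are made up for illustration):

```shell
# Values copied from the exception dump.
rip=0x67E0004
image_base=0x67DF000
entry_point=0x67E0000

# Offset of the faulting instruction from the start of the loaded image.
from_base=$(printf '0x%X' $((rip - image_base)))
# Offset in bytes from the entry point symbol.
from_entry=$((rip - entry_point))

echo "fault at image+$from_base, entry+$from_entry bytes"
```

The offset from the entry point can then be compared against a listing or disassembly of the object file to narrow down which instruction faulted.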
Now that I could compile and run something I started messing around with
input as well as output, and thought I'd try my hand at making a forth interpreter.
I'd heard about forth being a common lang to implement in embedded environments
so this seemed like a good place to try it, but before I did, it was really
bugging me that I couldn't use string literal syntax and have
yasm encode it for me.
I found that nasm, whose syntax
yasm is mimicking, has string operations like
__utf16le__("hello") which is exactly what I'm looking for, but
yasm does not have them.
So I took a detour and added utf string ops to yasm.