Mini-kernel tutorial
This page contains several small exercises that should help you with your first steps in a MIPS or RISC-V kernel running in MSIM.
We will provide you with instructions on how to run MSIM as well as the source code of a small kernel that you can play with.
Toolchain setup
We expect that you have a cross-compiler toolchain already installed so that you can try the examples yourself.
Please, refer to our instructions if you need help with installing a toolchain.
Once you have your toolchain ready, we can dive into kernel code for real :-).
First compilation
If you have never compiled an operating system kernel (or if you are new to C, GCC, or make), you may wish to start with compiling a smaller kernel first.
Please, clone the MSIM repository.
This tutorial contains examples for both MIPS and RISC-V mini-kernels (both in 32-bit variants). Because the architectures are rather similar (after all, RISC-V designers admit they were inspired by MIPS), the text contains the following markers only where the behavior (or source code) differs significantly between the two architectures.
MIPS specific notes
MIPS examples are in the contrib/kernel-tutorial-mips32/ subdirectory.
RISC-V specific notes
RISC-V examples are in the contrib/kernel-tutorial-riscv32/ subdirectory.
We will start inside the first subdirectory. Please choose either the MIPS
or the RISC-V architecture for this exercise
(of course, intrepid developers can choose to inspect and experiment with
both at the same time).
Before we discuss the contents of the directory, we will build the kernel.
All the examples use make
as the build tool so simply type make to build it.
The make command launches the make tool, which reads dependency rules
from a file named Makefile and uses them to figure out how to
compile C sources into a binary executable.
In this case, make should run a sequence of commands to build the
loader.bin executable from the loader.S source, and the
kernel.bin executable from the head.S and main.c
sources.
make will produce the following output, shown first for MIPS and then for RISC-V (there might be some differences
in the paths but otherwise the output should look the same on your machine).
make -C kernel
make[1]: Entering directory './kernel'
/usr/bin/mipsel-unknown-linux-gnu-gcc -march=r4000 -mabi=32 -mgp32 -msoft-float -mlong32 -G 0 -mno-abicalls -fno-pic -fno-builtin -ffreestanding -nostdlib -nostdinc -pipe -Wall -Wextra -Werror -Wno-unused-parameter -Wmissing-prototypes -g3 -std=c11 -I. -D__ASM__ -c -o boot/loader.o boot/loader.S
/usr/bin/mipsel-unknown-linux-gnu-ld -G 0 -static -g -T kernel.lds -Map loader.map -o loader.raw boot/loader.o
/usr/bin/mipsel-unknown-linux-gnu-objcopy -O binary loader.raw loader.bin
/usr/bin/mipsel-unknown-linux-gnu-objdump -d loader.raw > loader.disasm
/usr/bin/mipsel-unknown-linux-gnu-gcc -O2 -march=r4000 -mabi=32 -mgp32 -msoft-float -mlong32 -G 0 -mno-abicalls -fno-pic -fno-builtin -ffreestanding -nostdlib -nostdinc -pipe -Wall -Wextra -Werror -Wno-unused-parameter -Wmissing-prototypes -g3 -std=c11 -c -o src/main.o src/main.c
/usr/bin/mipsel-unknown-linux-gnu-gcc -march=r4000 -mabi=32 -mgp32 -msoft-float -mlong32 -G 0 -mno-abicalls -fno-pic -fno-builtin -ffreestanding -nostdlib -nostdinc -pipe -Wall -Wextra -Werror -Wno-unused-parameter -Wmissing-prototypes -g3 -std=c11 -I. -D__ASM__ -c -o src/head.o src/head.S
/usr/bin/mipsel-unknown-linux-gnu-ld -G 0 -static -g -T kernel.lds -Map kernel.map -o kernel.raw src/main.o src/head.o
/usr/bin/mipsel-unknown-linux-gnu-objcopy -O binary kernel.raw kernel.bin
/usr/bin/mipsel-unknown-linux-gnu-objdump -d kernel.raw > kernel.disasm
make[1]: Leaving directory './kernel'
make -C kernel
make[1]: Entering directory './kernel'
/usr/bin/riscv32-unknown-elf-gcc -msmall-data-limit=0 -mstrict-align -fno-pic -fno-builtin -ffreestanding -nostdlib -nostdinc -mno-riscv-attribute -pipe -Wall -Wextra -Werror -Wno-unused-parameter -Wmissing-prototypes -g3 -std=c11 -I. -D__ASM__ -march=rv32g -c -o boot/loader.o boot/loader.S
/usr/bin/riscv32-unknown-elf-ld -G 0 -static -g -T loader.lds -Map loader.map -o loader.raw boot/loader.o
/usr/bin/riscv32-unknown-elf-ld: warning: loader.raw has a LOAD segment with RWX permissions
/usr/bin/riscv32-unknown-elf-objcopy -O binary loader.raw loader.bin
/usr/bin/riscv32-unknown-elf-objdump -d loader.raw > loader.disasm
/usr/bin/riscv32-unknown-elf-gcc -O2 -msmall-data-limit=0 -mstrict-align -fno-pic -fno-builtin -ffreestanding -nostdlib -nostdinc -mno-riscv-attribute -pipe -Wall -Wextra -Werror -Wno-unused-parameter -Wmissing-prototypes -g3 -std=c11 -march=rv32g -c -o src/main.o src/main.c
/usr/bin/riscv32-unknown-elf-gcc -msmall-data-limit=0 -mstrict-align -fno-pic -fno-builtin -ffreestanding -nostdlib -nostdinc -mno-riscv-attribute -pipe -Wall -Wextra -Werror -Wno-unused-parameter -Wmissing-prototypes -g3 -std=c11 -I. -D__ASM__ -march=rv32g -c -o src/head.o src/head.S
/usr/bin/riscv32-unknown-elf-ld -G 0 -static -g -T kernel.lds -Map kernel.map -o kernel.raw src/main.o src/head.o
/usr/bin/riscv32-unknown-elf-ld: warning: kernel.raw has a LOAD segment with RWX permissions
/usr/bin/riscv32-unknown-elf-objcopy -O binary kernel.raw kernel.bin
/usr/bin/riscv32-unknown-elf-objdump -d kernel.raw > kernel.disasm
make[1]: Leaving directory './kernel'
Extra information (using make)
The advantage of using make as opposed to a shell script is
that make only rebuilds files (along dependency chains) that
have changed since the last compilation, which saves build time,
especially on larger projects (you can try that by running
make again now).
In this example, the rules in the top-level Makefile just tell
make to run make again, but this time using the Makefile
in the kernel subdirectory; more details of the
compilation will come later on.
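The idea of rebuilding only what has changed can be pictured as a comparison of file modification times. The following C fragment is only a rough sketch of that idea, not how make is actually implemented; the function needs_rebuild and its parameters are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: returns true when a target must be rebuilt,
 * i.e. when it is missing or older than at least one dependency. */
bool needs_rebuild(bool target_exists, time_t target_mtime,
                   const time_t *dep_mtimes, size_t dep_count)
{
    if (!target_exists) {
        return true;            /* nothing has been built yet */
    }
    for (size_t i = 0; i < dep_count; i++) {
        if (dep_mtimes[i] > target_mtime) {
            return true;        /* a dependency changed after the last build */
        }
    }
    return false;               /* the target is up to date */
}
```

With main.o as the target and main.c as its only dependency, editing main.c makes its modification time newer than that of main.o, so the sketch reports that main.o needs rebuilding, while the untouched loader files do not.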
Note that there is an msim.conf file in our directory. It contains
directives for the MSIM simulator, configuring it so as to provide
a simple computer equipped with one processor, two
blocks of memory, and a console-like device for textual output (we
will dissect the configuration in the next exercise).
To run the compiled kernel code, run msim without any
arguments. MSIM will load the binary images (loader.bin and
kernel.bin) into the two memory blocks and reset the simulated
CPU so that it starts executing code at factory-defined addresses.
You should see the following output (the first three lines come from the MIPS kernel, the last three from the RISC-V one):
Hello, World.
<msim> Alert: XHLT: Machine halt
Cycles: 41
Hello, World.
<msim> Alert: EHALT: Machine halt
Cycles: 42
The “Hello, World.” message was printed from C code compiled into machine code running on the processor of your choosing. Getting the target processor to execute your (compiled) C code is usually one of the major technical obstacles when starting OS development from scratch, which is why we have taken care of this step for now.
The last line (as well as the line prefixed with <msim>) is
the output of the simulator, telling us how many virtual cycles
the CPU has executed. This is exactly the number of executed instructions.
We can safely ignore those lines for now.
Important
If the compilation failed for you, or if the execution printed something completely different, feel free to contact us: please open an issue here and describe what you have tried and what failed, and do not forget to describe your environment.
If you are an NSWI200 student, please prefer the standard means of communicating with your teachers instead of the GitHub issues. Thank you.
Configuring the virtual machine
We will now take a closer look at the msim.conf file, which
contains the configuration of the simulated computer that runs
your kernel.
Using a simulated computer instead of a real one makes it much easier to develop a small kernel. For one thing, installation does not require sacrificing your own computer. Also, the simulation is completely deterministic, and therefore bugs that appear once keep appearing until you fix them. Rest assured, however, that the simulated environment is close enough to the real thing.
Reading msim.conf from top to bottom and ignoring the comment
lines starting with the # character, the first configuration
line tells MSIM to add one processor and name it cpu0
add dr4kcpu cpu0
add drvcpu cpu0
MIPS specific notes
The MIPS R4000 processor device is named dr4kcpu.
RISC-V specific notes
The RISC-V RV32IMA processor device is named drvcpu.
The next two groups of directives add two blocks of physical memory, one for the bootloader and one for the main memory, both initialized from files on disk.
The main memory block (called mainmem) is a read-write memory
with a size of 1 MiB. The memory block is initialized with
the contents of the kernel/kernel.bin file before the simulated
computer starts running:
add rwm mainmem 0
mainmem generic 1M
mainmem load "kernel/kernel.bin"
add rwm mainmem 0x80000000
mainmem generic 1M
mainmem load "kernel/kernel.bin"
MIPS specific notes
The mainmem memory segment starts at physical address 0.
The processor then maps it to a virtual address 0x80000000
(so printing a pointer address in your code will print addresses with
the highest bit set).
RISC-V specific notes
The mainmem memory segment starts at physical address 0x80000000.
The processor uses identity mapping when booting, hence we do not need to
explicitly distinguish virtual and physical addresses (at least, for now).
The bootloader memory block (called loadermem) is a read-only
memory initialized with the contents of the kernel/loader.bin file:
add rom loadermem 0x1FC00000
loadermem generic 4K
loadermem load "kernel/loader.bin"
add rom loadermem 0xF0000000
loadermem generic 8K
loadermem load "kernel/loader.bin"
MIPS specific notes
The loadermem memory segment starts at physical address 0x1FC00000 and has a size of 4 KiB.
RISC-V specific notes
The loadermem memory segment starts at physical address 0xF0000000 and has a size of 8 KiB.
Finally, we add a simple output device (called printer),
which will allow the code running in the simulator to display
text on the host computer console.
This is similar to a serial console found on real
hardware, except the printer device is much simpler:
add dprinter printer 0x10000000
add dprinter printer 0x90000000
MIPS specific notes
This device resides at physical address 0x10000000.
RISC-V specific notes
This device resides at physical address 0x90000000.
This is actually enough for a simple machine and more than enough for our purposes :-).
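To make the resulting memory layout easier to picture, here is a small C sketch that models the MIPS variant of the machine as a table of physical address ranges. It is an illustration only, not part of the tutorial sources: the names region, mips_map, and find_region are hypothetical, and the size of the printer device is an assumption because msim.conf does not state it.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of the simulated machine's physical memory map. */
struct region {
    const char *name;
    uint32_t base;
    uint32_t size;   /* in bytes */
};

/* MIPS values taken from msim.conf; the printer size is an assumption. */
static const struct region mips_map[] = {
    { "mainmem",   0x00000000u, 1024u * 1024u },
    { "printer",   0x10000000u, 4u },            /* assumed size */
    { "loadermem", 0x1FC00000u, 4u * 1024u },
};

/* Returns the name of the region containing paddr, or NULL if unmapped. */
const char *find_region(uint32_t paddr)
{
    for (size_t i = 0; i < sizeof(mips_map) / sizeof(mips_map[0]); i++) {
        /* Subtract first so that base + size cannot overflow. */
        if (paddr >= mips_map[i].base &&
            paddr - mips_map[i].base < mips_map[i].size) {
            return mips_map[i].name;
        }
    }
    return NULL;
}
```

Looking up physical address 0x1FC00000 in this table yields loadermem, while an address such as 0x20000000 is not backed by any device and yields NULL.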
Disassembling the kernel
With the simulator configured to provide us with a simple
computer, it is now time to look at the files in the
kernel directory. Again, there is a Makefile which
controls the compilation, and a linker script which controls the
layout of the binary image produced by the linker.
Extra information (linker scripts)
We will not dissect the linker script further, because we will not need to modify it in this tutorial.
As a matter of fact, linker scripts are rarely modified and in normal circumstances come with your linker. For our purposes, where we have a non-standard kernel and a simplified emulator, we provide our own.
The boot subdirectory contains loader.S, an assembly
source file which contains the computer bootloader code. On a real
computer, the bootloader is (ultimately) responsible for loading
the operating system into memory. In our case, the MSIM simulator
does this for us (see the directives telling MSIM to load
kernel/kernel.bin into mainmem in msim.conf), so we
just need a few instructions to make the processor jump into the
kernel code after reset.
The loader code needs to be present at a specific address (it is
hard-wired into the CPU, see msim.conf) which the CPU starts
executing instructions from after a power up/reset. Other than
that, the loader code does not really do anything – it just jumps
to another fixed address, where our main code will reside.
MIPS specific notes
The loader jumps to address 0x80000400.
The reason why we keep the rest of the kernel code separate from the loader is quite simple – the entry point of the loader is quite far from the entry points of the exception handlers, which are also hardwired, and which the kernel must implement. We simply want to keep the rest of the kernel code in one piece, and that means next to the exception handlers.
RISC-V specific notes
The loader jumps to address 0x80001000.
The loader.S file is compiled and linked into loader.bin.
This file contains only machine instructions (no symbol
information, no debugging information, no relocation information):
it is code in its rawest form, a form that the CPU actually sees.
Look into loader.bin and loader.disasm. The latter is
a disassembly of the binary back into assembly language.
cat loader.disasm
hexdump -C loader.bin
Since loader.bin and loader.disasm are produced from
loader.S, they should contain the same instructions as in the
original loader.S. Do take a look.
Self-test quiz
A question for you: why are the instructions in loader.disasm
different from loader.S?
Hint
Think about the limited instruction repertoire of the CPU.
Solution MIPS
The difference in code concerns the loading of the 32-bit constant (jump target address). The CPU does not have an instruction that can load an entire 32-bit constant in one go (because the instruction itself must fit into 32 bits), hence two instructions are used. The assembly code uses a shorthand notation so that the programmer does not have to perform this trivial conversion.
Solution RISC-V
The difference in code concerns the loading of the 32-bit constant (jump
target address). The CPU does not have an instruction that can load an
entire 32-bit constant in one go (because the instruction itself must
fit into 32 bits), hence two instructions would need to be used
generally. (For example li t0, 0x80000001 would be transformed
into lui t0, 0x80000 and addi t0, t0, 1 - try it yourself!) Our
code manages with only one, because the lowest 12 bits (3 hex digits) of
our target address are all 0. The lui t0, 0x80001 instruction loads
the constant 0x80001 to the highest 20 bits of t0, meaning it
sets it to 0x80001000, which is exactly our desired address. The
assembly code uses a shorthand notation so that the programmer does not
have to perform this trivial conversion.
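The conversion that the assembler performs can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the assembler's actual code: the names split_const, split_mips, and split_riscv are hypothetical, and the MIPS variant assumes the common lui plus ori expansion.

```c
#include <stdint.h>

/* Immediate operands for a two-instruction constant load. */
struct split_const {
    uint32_t upper;  /* operand of lui */
    int32_t  lower;  /* operand of the follow-up instruction */
};

/* MIPS (assumed lui + ori): lui loads the upper 16 bits and ori fills
 * in the lower 16 bits; ori zero-extends, so no correction is needed. */
struct split_const split_mips(uint32_t value)
{
    struct split_const s = { value >> 16, (int32_t)(value & 0xFFFFu) };
    return s;
}

/* RISC-V: lui loads the upper 20 bits and addi adds a sign-extended
 * 12-bit immediate, so the upper part is rounded to compensate. */
struct split_const split_riscv(uint32_t value)
{
    struct split_const s;
    s.upper = ((value + 0x800u) >> 12) & 0xFFFFFu;
    s.lower = (int32_t)(value & 0xFFFu);
    if (s.lower >= 0x800) {
        s.lower -= 0x1000;   /* interpret as a negative 12-bit value */
    }
    return s;
}
```

For 0x80000001 the RISC-V split gives 0x80000 and 1, matching the example above. For 0x80001000 the lower part is 0, which is why a single lui is enough in our loader.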
From boot to C code
We will now look into the src directory, where the foundations
of our kernel reside.
The head.S file contains a lot of assembly code, but do not be
afraid ;-).
MIPS specific notes
Find the line containing start: (around line 120). Above this,
we can see a special directive .org 0x400 that says that the
following code will be placed at address 0x400 bytes away from the
start of the code segment. The linker specifies that the code
segment starts at 0x80000000, together this yields 0x80000400 -
exactly the address our boot loader jumps to! Hence, after the
boot loader is done, the execution will continue here.
We start by setting up a few registers (such as the stack pointer)
and execute jal kernel_main. This will pass control from the
assembly code to the kernel_main function, which is a standard
C function that you can see if you open src/main.c.
RISC-V specific notes
Find the line containing start: (around line 90). Above this, we can
see a special directive .org 0x1000 that says that the following
code will be placed at address 0x1000 bytes away from the start of the
code segment. The linker specifies that the code segment starts at
0x80000000, together this yields 0x80001000 - exactly the address our
boot loader jumps to! Hence, after the boot loader is done, the
execution will continue here.
We start by setting up a few registers (such as the stack pointer and the
mepc CSR) and execute mret. This will pass control from the
assembly code to the kernel_main function, which is a standard C
function that you can see if you open src/main.c.
These few lines of assembler (loader.S and head.S)
constitute the only assembly code needed to boot the processor and
get into C.
Extra information (assembler and booting)
One cannot boot a CPU without at least a bit of assembler that jumps into C code. But the assembly code is usually straightforward and only sets up basic registers and the stack.
Feel free to return to this code later; understanding it completely is not required to continue with the tutorial. As long as you understand that we need special instructions to jump to C code, you will be fine.
kernel_main is where the fun starts
The last file we have not commented much on is src/main.c.
It contains the kernel_main() function, which is called
shortly after boot. This is the function where the kernel
would initialize itself or launch the first userspace process
(e.g. init on Linux).
Right now it contains only a very short greeting.
Printing from the simulator is trivial, since we told MSIM that there should be a console printer device available at a particular address. MSIM monitors this address, and any write to it causes the written character to appear on the console.
MIPS specific notes
A question for you: if you look up the console printer device
address in the source code, you will see it is 0x90000000, but
msim.conf says 0x10000000. Why?
Hint
Think about virtual and physical addresses.
Solution
The code uses virtual addresses, but the simulator
configuration uses physical addresses (exactly what a
real hardware would see). In the kernel segment,
virtual addresses are mapped to physical addresses
simply by masking the highest bit - virtual address
0x80000000 therefore corresponds to physical address
0, and so on. The mapping is intentionally simple
because the kernel must run even before more complex
mapping structures, such as page tables, can be set
up.
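The translation described in the solution can be written down as a pair of tiny C helpers. This is a sketch for the addresses used in this tutorial only (those starting at 0x80000000 and 0x90000000); the function names are hypothetical and do not come from the kernel sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if vaddr has the highest bit set, i.e. it lies at or above
 * 0x80000000, where the tutorial kernel runs. */
bool is_kernel_address(uint32_t vaddr)
{
    return (vaddr & 0x80000000u) != 0;
}

/* Translates a kernel virtual address by clearing the highest bit. */
uint32_t kernel_virt_to_phys(uint32_t vaddr)
{
    return vaddr & 0x7FFFFFFFu;
}

/* The opposite direction: sets the highest bit of a physical address. */
uint32_t phys_to_kernel_virt(uint32_t paddr)
{
    return paddr | 0x80000000u;
}
```

Applied to the printer, kernel_virt_to_phys(0x90000000) gives 0x10000000, which is exactly the address found in msim.conf.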
An important note: you probably noticed that we print
the characters one by one instead of using printf
or puts. That is because we are in our own kernel
and we do not have any of these functions. As a
matter of fact, we will have only functions
that we implement ourselves.
Thus, there is no printf, no malloc and definitely no
fopen (unless you implement them yourself).
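As a taste of what implementing such functions yourself involves, here is a minimal sketch of a helper that turns a 32-bit value into hexadecimal text without any library support. The name format_hex32 is hypothetical. In a kernel, the resulting characters would then be sent to the console one by one.

```c
#include <stdint.h>

/* Writes value as 8 uppercase hex digits plus a terminating zero;
 * buf must have room for at least 9 characters. */
void format_hex32(uint32_t value, char *buf)
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 0; i < 8; i++) {
        /* Take the 4-bit groups from the most significant one down. */
        buf[i] = digits[(value >> (28 - 4 * i)) & 0xFu];
    }
    buf[8] = '\0';
}
```

A helper like this is handy for printing pointers and register values once the kernel grows beyond a greeting.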
The first modification of the kernel
Modify the kernel so that it prints the greeting with an exclamation mark instead of a plain period. After all, we can be proud of it ;-).
Before running msim again do not forget to recompile with
make.
Solution
Just replace '.' with '!' in main.c :-).
Note that make should recompile only main.c into main.o
and re-link the kernel.* files. Files related to
the bootloader should remain without change.
Tracing the execution
Let’s see which instructions were actually executed by MSIM. This may come in handy in later debugging tasks.
We will run msim -t. This turns on a trace mode where MSIM prints
every instruction as it is executed. (Unfortunately, there is just
one console, so the MSIM output is interleaved with your OS
output.)
Self-test quiz
Compare the trace with your *.disasm files. What is the
difference?
Solution
The answer is obvious: *.disasm contains the code
in its static form while the trace represents the true
execution - jumps are taken, loop bodies are executed
repeatedly etc.
Stepping through the execution
To run the kernel instruction by instruction interactively, launch
MSIM with msim -i. This time, MSIM will wait for further
commands, as indicated by the [msim] prompt.
Simply typing continue will resume standard execution, which
will run our OS and eventually terminate MSIM.
Let us run MSIM again, but instead of typing continue, we will just hit Enter.
An empty command in MSIM is equivalent to typing step and
executes a single instruction. We should see how the greeting
starts to appear next to the prompt as we continue pressing
Enter.
We can also do step 10 to execute ten instructions at once.
Entering the debugger
Stepping through our kernel from the very first instruction is not so useful for debugging when the code we are interested in is executed long after boot. In that case, we can also enter the interactive mode programmatically, by asking for it from inside our (kernel) code.
That is something that is super-easy when running in a simulator such as MSIM but somewhat more difficult on real hardware. That is why simulators are so useful :-).
To enter the interactive mode, we will use a special assembly language instruction, which the real CPU does not recognize but MSIM does.
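We will insert the following fragment at a location (in the C code) where we want to interrupt the execution. The first line is the MIPS variant, the second line is the RISC-V variant; use only the one that matches your architecture.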
__asm__ volatile(".word 0x29\n");
__asm__ volatile("ebreak\n");
Let us try it: insert the break after printing Hello. If we execute
msim, it will print Hello and enter interactive mode. We
can again step through the execution or continue.
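If you want to keep both variants in one source file, the fragment can be wrapped in a small helper that picks the right instruction at compile time. This is only a sketch: the name machine_enter_debugger is hypothetical, and the selection relies on the __mips__ and __riscv macros that GCC predefines for these targets.

```c
/* Hypothetical helper wrapping the two fragments shown above. */
static inline void machine_enter_debugger(void)
{
#if defined(__mips__)
    /* MIPS: special instruction code recognized by MSIM. */
    __asm__ volatile(".word 0x29\n");
#elif defined(__riscv)
    /* RISC-V variant. */
    __asm__ volatile("ebreak\n");
#else
    /* Not building for the simulated CPU: do nothing. */
#endif
}
```

Calling machine_enter_debugger() right after the code that prints Hello then has the same effect as pasting the fragment there directly.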
Inspecting the registers
Let us start MSIM in interactive mode again and type set trace as the
first command.
Then we will hit Enter several times. Each time, MSIM executes one instruction and prints it.
We can also inspect all registers at once. We will use the cpu0 rd
command for a register dump of the cpu0 processor
(that is the only processor that we added to our computer in
MSIM).
This is an extremely useful command as it allows us to inspect the current state of the processor and what code it executes.
Self-test quiz
Which register would tell you what code is executed?
Solution
The pc register is the program counter telling the
(virtual) address where the CPU decodes the next
instruction.
Matching instructions back to source code
Start MSIM again in the interactive mode and step until it starts printing the greeting. Look at the register dump.
You will see something like this (note that we have dropped the 64-bit extension to make the dump a bit shorter; the MIPS dump comes first, followed by the RISC-V one):
0 00000000 at 00000000 v0 90000000 v1 00000000 a0 00000000
a1 00000048 a2 00000000 a3 00000000 t0 00000000 t1 00000000
t2 00000000 t3 00000000 t4 00000000 t5 00000000 t6 00000000
t7 00000000 s0 00000000 s1 00000000 s2 00000000 s3 00000000
s4 00000000 s5 00000000 s6 00000000 s7 00000000 t8 00000000
t9 00000000 k0 0000FF01 k1 00000000 gp 80000000 sp 80000400
fp 00000000 ra 80000420 pc 8000043C lo 00000000 hi 00000000
zero: 0 ra: 80001060 sp: 80001000 gp: 0
tp: 0 t0: 800 t1: 0 t2: 0
s0/fp: 0 s1: 0 a0: 0 a1: 0
a2: 0 a3: 0 a4: 48 a5: 90000000
a6: 0 a7: 0 s2: 0 s3: 0
s4: 0 s5: 0 s6: 0 s7: 0
s8: 0 s9: 0 s10: 0 s11: 0
t3: 0 t4: 0 t5: 0 t6: 0
pc: 8000106c Privilege mode: S
MIPS specific notes
In our dump, pc contains 8000043C.
If we open kernel.disasm and find this address there, we will see
it is a few lines below 80000430 <kernel_main>, which indicates that
it is an instruction inside kernel_main().
RISC-V specific notes
In our dump, pc contains 8000106c.
If we open kernel.disasm and find this address there, we will see
it is a few lines below 80001060 <kernel_main>, which indicates that
it is an instruction inside kernel_main().
This is extremely important information because it allows us to determine in which function our OS was when it was interrupted.
We can interrupt code in MSIM by hitting Ctrl-C. That is
useful if our code enters an unexpected loop and we want to
investigate in which function it got stuck.
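Matching a program counter to a function is a simple search: the function is the last label whose address is not greater than pc. The C sketch below illustrates this with the two MIPS labels mentioned in this tutorial. The names symbol, symbols, and symbol_for_pc are hypothetical, and a real table would list every label found in kernel.disasm.

```c
#include <stddef.h>
#include <stdint.h>

struct symbol {
    uint32_t address;
    const char *name;
};

/* MIPS labels mentioned in this tutorial, sorted by address. A real
 * table would contain every <label> from kernel.disasm. */
static const struct symbol symbols[] = {
    { 0x80000400u, "start" },
    { 0x80000430u, "kernel_main" },
};

/* Returns the last symbol at or below pc, or NULL if pc precedes all. */
const char *symbol_for_pc(uint32_t pc)
{
    const char *found = NULL;
    for (size_t i = 0; i < sizeof(symbols) / sizeof(symbols[0]); i++) {
        if (symbols[i].address <= pc) {
            found = symbols[i].name;
        }
    }
    return found;
}
```

For the pc value 0x8000043C from the MIPS dump, the search stops at kernel_main, which is the same conclusion we reached by reading kernel.disasm.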
Instruction and memory dumps
MSIM allows us to inspect not only registers but also memory.
Let us see the string directory. It contains almost the same code
as the previous example, but uses iteration over a string
(const char *) to print the greeting.
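Iterating over a string to print it could look roughly like the following sketch. It is not the actual source from the string directory, and the function names and signatures are assumptions. The printer address is passed as a parameter so that the code can also be tried outside the simulator; in the kernel it would be the console address cast to a volatile char pointer.

```c
/* Writes one character to the printer device register. */
static void print_char(volatile char *printer, char c)
{
    *printer = c;
}

/* Sends a zero-terminated string to the printer, one character at a time. */
void print_string(volatile char *printer, const char *text)
{
    while (*text != '\0') {
        print_char(printer, *text);
        text++;
    }
}
```

Each loop iteration reads one character from the string and stores it to the device address, which is the pattern to look for in the disassembly of the loop.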
Self-test quiz
Compile the code, run MSIM interactively and step until it starts printing characters.
What is the value of the program counter?
Let’s inspect the code of the loop. We can look at
kernel.disasm or inspect it directly from MSIM.
MIPS specific notes
To inspect things in MSIM, we need to work with physical
addresses. Recall that pc contains a virtual address. As long
as our code runs in the kernel segment, the mapping between the
virtual and physical addresses is hardwired into the processor
as a simple shift by 2 GiB. For example, virtual address 0x8000042C
maps to physical address 0x42C.
It is quite important to remember that if we see an address at or above
0x80000000 in MSIM, it points into the kernel segment, but if
we see a numerically lower address, it is either an untranslated
physical address (such as those in msim.conf) or an address in
the user segment, which at this time most likely indicates a bug
in our code.
Now, we will take the virtual address 0x8000042C, translate
it to a physical address (simply by removing the leading 8),
and disassemble in MSIM.
RISC-V specific notes
We can use the address 0x8000106c directly, as we are using the
BARE virtual address translation mode, which keeps the addresses
unchanged.
To disassemble instructions in MSIM:
[msim] dumpins r4k 0x42c 10
[msim] dumpins rv 0x80001060 10
This will dump 10 instructions starting at the specified address.
MIPS specific notes
We should notice that we are (in overly simplified terms) reading
the string via registers v0 and v1 and writing it to the
console via a0.
Let’s look at the register content:
v0 80000460 v1 00000048 a0 90000000
v0 looks like a virtual address of our kernel, v1 looks
like an ASCII value (actually, it is the capital H) and a0
is the address of our console (recall code in src/main.c).
So we can guess that v0 would contain the address of the
string.
RISC-V specific notes
We should notice that we are (in overly simplified terms) reading
the string via registers a4 and a5 and writing it to the
console via a3.
Let’s look at the register content:
a3: 90000000 a4: 48 a5: 8000108a
a5 looks like a virtual address of our kernel, a4 looks like an
ASCII value (actually, it is the uppercase H) and a3 is the
address of our console (recall code in src/main.c).
So we can guess that a5 would contain the address of the string.
Let’s look at that address. Now we do not want to see it as an instruction dump but rather as plain memory dump, hence:
[msim] dumpmem 0x460 4
0x00000460 6c6c6548 57202c6f 646c726f 00000a21
[msim] dumpmem 0x8000108a 4
0x080001088 6c6c6548 57202c6f 646c726f 00000a21
6c6c is actually ll from our Hello greeting and if we
translate the rest of the numbers, it is really our greeting.
Self-test quiz
Why is the string ordered backwards?
If we run hexdump -C kernel.bin, we will see these characters
there as well.
Solution
While we read strings character by character, MSIM dumps memory in 4-byte words. Both our MIPS and RISC-V machines are little-endian, so the bytes at lower addresses occupy the less significant bits of each word, which makes them appear further to the right when the word is written down.
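The byte order can be checked with a few lines of C. The sketch below (the function name is hypothetical) takes one word from the dump and lists its bytes in the order in which they are stored in memory on a little-endian machine.

```c
#include <stdint.h>

/* Stores the four bytes of word in out[0..3] in the order a
 * little-endian machine keeps them in memory (lowest address first). */
void word_to_memory_order(uint32_t word, char out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((word >> (8 * i)) & 0xFFu);
    }
}
```

Feeding it the first dumped word, 0x6c6c6548, yields the bytes 0x48, 0x65, 0x6c, 0x6c, that is the characters H, e, l, l.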
Exception handling
Let’s now see how MSIM (and our kernel) behaves when things go wrong.
We will use the unaligned directory. Let us compile it and
open main.c.
It contains simple code: we build an array of individual bytes and later typecast it to a 32-bit integer. This is something our program might do, for example, to inspect memory. However, it is also an operation that may be illegal on some CPUs, including ours, as we will shortly see.
(The code uses volatile variables to prevent the compiler from
optimizing the code too much.)
If we run the code, MSIM will switch to the interactive mode and show a dump of registers. This is because the access to a 32-bit integer that is not aligned (the address we access is not a multiple of the size of an integer) is illegal. The CPU reacts by generating an exception. Our kernel is currently written so that it reacts to an exception by switching MSIM to the interactive mode (which is a sane default for debugging).
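The alignment rule, and one common way to avoid the problem, can be sketched in C as follows. This is an illustration under stated assumptions rather than code from the unaligned directory; both function names are hypothetical, and the second function assumes little-endian byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if address is a multiple of size (size must be a power of two). */
bool is_aligned(uintptr_t address, uintptr_t size)
{
    return (address & (size - 1)) == 0;
}

/* Reads a 32-bit little-endian value byte by byte, which works at any
 * address because single-byte accesses are always aligned. */
uint32_t read_u32_unaligned(const volatile uint8_t *bytes)
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Assembling the value from individual bytes costs a few more instructions than a single word load, but it never triggers the exception.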
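We can return to this example and run (once MSIM switches to the interactive mode) the following commands to find which address caused the problem and what the exception code (type) is. The cp0d commands are for MIPS, where CP0 registers 0x0d, 0x08, and 0x0e are Cause, BadVAddr, and EPC; the csrd commands are for RISC-V.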
cpu0 cp0d 0x0d
cpu0 cp0d 0x08
cpu0 cp0d 0x0e
cpu0 csrd mepc
cpu0 csrd mcause
cpu0 csrd mtval
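The raw register values printed by these commands still need decoding. The following C sketch shows how the exception code can be extracted, based on the register layouts given in the MIPS and RISC-V architecture manuals; the function and constant names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIPS: the exception code is stored in bits 6..2 of the Cause register. */
uint32_t mips_exception_code(uint32_t cause)
{
    return (cause >> 2) & 0x1Fu;
}

/* RISC-V (32-bit): the highest bit of mcause distinguishes interrupts
 * from exceptions, the remaining bits hold the code. */
bool riscv_is_interrupt(uint32_t mcause)
{
    return (mcause >> 31) != 0;
}

uint32_t riscv_exception_code(uint32_t mcause)
{
    return mcause & 0x7FFFFFFFu;
}

/* Codes relevant to unaligned accesses. */
enum {
    MIPS_EXC_ADEL = 4,               /* address error on load or fetch */
    MIPS_EXC_ADES = 5,               /* address error on store */
    RISCV_EXC_LOAD_MISALIGNED  = 4,  /* load address misaligned */
    RISCV_EXC_STORE_MISALIGNED = 6   /* store address misaligned */
};
```

Compare the decoded code with these constants to check whether the exception you see was raised by an unaligned load or by an unaligned store.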
The volatile modifier
Let us go back to our first kernel again.
You perhaps noticed that our console printer uses a special
modifier volatile. If you are new to C, you may want to read,
for example, this article about volatile first.
Self-test quiz
Let’s compile the code and open kernel.disasm again. We will see
that most code of kernel_main() is a mix of constant loads
(li) and stores to memory (sb). These instructions
represent the call to print_char that writes the character to
a special part of memory that represents the console (recall that
MSIM is printing any value written here on your console).
Now let us remove the volatile modifier and recompile the code.
Let us run MSIM again.
Nothing (except the newline) was printed!
We will look at the disassembly again: the code is much shorter! Why?
Hint
Imagine what the code looks like when print_char
is actually inlined into kernel_main.
Solution
Without volatile, the source is actually this:
char *printer = (char*)(0x90000000);
*printer = 'H';
*printer = 'e';
...
*printer = '.';
Any decent compiler will recognize that we are
overwriting the same variable without reading the
values. When optimizing code, the compiler is only
required to preserve an externally visible behavior,
and a write that nobody reads is not externally
visible - hence all writes but the last are removed by
the compiler. This means only *printer = '\n'
remains.
Using volatile informs the compiler that someone
else (here it is the console device of the simulator,
but it can also be another thread) can read or write
the variable and therefore accesses to it must not be
optimized away.
Surviving without sources
The directory endless contains only an image of a simple
kernel, without sources.
The kernel image contains an endless loop. Run MSIM and, after a while,
break the execution with Ctrl-C to get into the interactive
mode.
Inspect the state of the machine and decide in which function the
endless loop is (function names are in the kernel.disasm
file).
Hint
Dump the registers.
Solution MIPS
The PC register will contain values around
0x80000460, hence it is function endless_two.
Solution RISC-V
The PC register will contain values around
0x80001090, hence it is function endless_two.
The complex one
The printers directory again contains only a binary kernel
image. This time the kernel is a bit bigger and msim.conf
actually contains several printers (consoles).
The task is simple: determine which console device is actually
used. This changes with every boot, so do not try editing
msim.conf, that would be cheating ;-) …
Note that with newer versions of MSIM, you need to run it with
-n, as the hardware is configured with a time device that adds
non-determinism to the simulator.
To find the right answer, inspect the code loaded into MSIM and check the contents of the registers. To make the task easier, the kernel prints dots in an infinite loop.
Solution
The printer number is the last but one digit in the Run id.
Tracing the instructions would be enough: somewhere in the registers we would see the address of the printer.
Another option is to look into the disassembly, where we
would see that print_char was not inlined. Hence
we can watch until the program reaches this point
and then inspect the target address of the sb instruction.
MIPS specific notes
Watch until the program counter reaches address 0x80000430
and look into the content of the v0 register.
RISC-V specific notes
Watch until the program counter reaches address 0x80001068
and look into the content of the a5 register.