[Operating Systems] xv6 - Traps and Interrupts

다영 윤
Mar 25, 2022
Updated: Mar 25, 2022

cat.c

The cat function reads the file into buf and invokes the write system call to print it to the terminal. Use the cscope command ":cs find s write" to look up the write symbol. That takes us to the user.h file.

user.h

We can see that every system call is declared here (alongside the user library functions). These are only declarations; the user-side code behind each system call is generated in usys.S.

usys.S

At compilation, SYSCALL(write) in line 16 will be expanded to

write:

movl $SYS_write, %eax;

int $T_SYSCALL;

ret;

SYS_write is defined in syscall.h and has the value 16. By writing 16 to the eax register, we tell the kernel which system call has been invoked. Then we enter the interrupt handler for system calls through int $T_SYSCALL. T_SYSCALL is defined in traps.h, where all the interrupt numbers are defined.
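The same token-pasting trick can be tried out in plain C. This is only a sketch of the macro's shape, not xv6 code: fake_eax, fake_int, last_trapno and the _stub suffix are hypothetical stand-ins for the eax register and the int instruction, while the numeric values come from syscall.h and traps.h.

```c
/* Values taken from xv6's syscall.h and traps.h. */
#define SYS_read   5
#define SYS_write 16
#define T_SYSCALL 64

/* Hypothetical stand-ins for the %eax register and the `int` instruction. */
static int fake_eax;
static int last_trapno;

static void fake_int(int trapno)
{
    last_trapno = trapno;   /* a real `int` would enter the kernel here */
}

/* Same shape as the assembly macro: paste the name onto SYS_ to get the
   system call number, load it into "eax", then trap. */
#define SYSCALL(name)              \
    static int name##_stub(void)   \
    {                              \
        fake_eax = SYS_##name;     \
        fake_int(T_SYSCALL);       \
        return fake_eax;           \
    }

SYSCALL(read)    /* defines read_stub()  */
SYSCALL(write)   /* defines write_stub() */
```

Calling write_stub() leaves 16 in fake_eax and records vector 64, which is the only information the user side hands to the kernel before trapping.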

traps.h

You can see the many trap and interrupt numbers defined here! For example, interrupt number 64 means a system call has been invoked.

Now, let's see what happens when the int n instruction (line 8 in the usys.S file) is executed. The workflow of the int n instruction can be summed up in the following steps:

1. Fetch the n'th interrupt descriptor from the IDT (interrupt descriptor table).
2. Check that CPL (current privilege level) <= DPL (descriptor privilege level) in the fetched interrupt descriptor. For example, this stops a user program with CPL=3 (user mode) from invoking any interrupt whose descriptor has DPL=0 (kernel mode). Only the system call descriptor has DPL=3, so in user mode the int instruction can only raise a system call.
3. If the target code segment's PL < CPL -> mode switch from user to kernel mode.
a. Save esp and ss temporarily.
b. Load the esp and ss registers from the task state segment (TSS). -> Switch to the kernel stack (mode switch)
c. Push the values saved in step 3-a onto the kernel stack.
4. Push the eflags, cs, and eip values onto the kernel stack (saving the user program context).
5. Clear some bits in eflags (for an interrupt gate, the IF bit, which disables further interrupts).
6. Load the eip and cs registers from the interrupt descriptor.
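To make the order of operations concrete, here is a small C simulation of steps 2 onward (step 1, fetching the descriptor, is assumed to have happened already). It is a sketch under simplifying assumptions: sim_cpu, sim_gate, sim_int and user_cpu are made-up names, the selector and address values are illustrative, and the privilege level of the target segment is read from the low two bits of the selector.

```c
#include <stdint.h>

#define FL_IF       0x00000200u              /* interrupt-enable bit in eflags */
#define KWORDS      16                       /* words in the simulated kernel stack */
#define KSTACK_TOP  0x9000u                  /* hypothetical kernel stack top */
#define KSTACK_BASE (KSTACK_TOP - 4u * KWORDS)

struct sim_gate {
    uint16_t cs;       /* code segment selector of the handler */
    uint32_t off;      /* entry address of the handler */
    int      dpl;      /* privilege level required to use `int n` */
    int      is_intr;  /* 1: interrupt gate (clears IF), 0: trap gate */
};

struct sim_cpu {
    uint32_t eip, esp, eflags;
    uint16_t cs, ss;
    uint16_t ss0;              /* kernel stack segment recorded in the TSS */
    uint32_t esp0;             /* kernel stack pointer recorded in the TSS */
    uint32_t kstack[KWORDS];   /* memory behind KSTACK_BASE..KSTACK_TOP */
};

/* Push one word; esp must point into the simulated kernel stack. */
static void push(struct sim_cpu *c, uint32_t v)
{
    c->esp -= 4;
    c->kstack[(c->esp - KSTACK_BASE) / 4] = v;
}

/* A CPU running user code; every value here is illustrative. */
static struct sim_cpu user_cpu(void)
{
    struct sim_cpu c = {0};
    c.cs = 0x1B; c.ss = 0x23;        /* low two bits = 3: user mode */
    c.eip = 0x40; c.esp = 0x2FF0; c.eflags = FL_IF;
    c.ss0 = 0x10; c.esp0 = KSTACK_TOP;
    return c;
}

/* Returns 0 on success, -1 if the CPL <= DPL check fails. */
static int sim_int(struct sim_cpu *c, const struct sim_gate *g)
{
    int cpl = c->cs & 3;
    if (cpl > g->dpl)                 /* step 2: privilege check */
        return -1;
    if ((g->cs & 3) < cpl) {          /* step 3: entering a more privileged level */
        uint32_t old_esp = c->esp;    /* 3-a: remember the user stack */
        uint16_t old_ss = c->ss;
        c->ss = c->ss0;               /* 3-b: switch to the kernel stack */
        c->esp = c->esp0;
        push(c, old_ss);              /* 3-c: save the user stack location */
        push(c, old_esp);
    }
    push(c, c->eflags);               /* step 4: save the user context */
    push(c, c->cs);
    push(c, c->eip);
    if (g->is_intr)                   /* step 5: interrupt gates clear IF */
        c->eflags &= ~FL_IF;
    c->cs = g->cs;                    /* step 6: jump to the handler */
    c->eip = g->off;
    return 0;
}
```

With a gate whose dpl is 3 (like the system call gate), sim_int succeeds from user mode and leaves five words on the kernel stack: ss, esp, eflags, cs, eip. With a gate whose dpl is 0 it returns -1 and the CPU state is left untouched.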

Initializing the IDT & Fetching Descriptors

Interrupt descriptors contain information about a given interrupt, such as the DPL. To fetch the n'th descriptor, the CPU takes the base address of the IDT from the IDTR register and indexes n entries into the table.
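The lookup itself is simple arithmetic. On 32-bit x86 each gate descriptor is 8 bytes, and the IDTR holds the table's base address together with a limit (the table size in bytes minus one). A minimal sketch, where struct idtr, idt_entry_addr and the example base address are names and values made up for illustration:

```c
#include <stdint.h>

/* Contents of the IDTR register: where the IDT starts and how large it is. */
struct idtr {
    uint16_t limit;   /* size of the table in bytes, minus one */
    uint32_t base;    /* linear address of the first descriptor */
};

/* Address of the n'th gate descriptor, or 0 if n lies beyond the limit. */
static uint32_t idt_entry_addr(struct idtr r, unsigned n)
{
    uint32_t off = n * 8u;          /* each descriptor is 8 bytes */
    if (off + 7u > r.limit)         /* the whole entry must fit in the table */
        return 0;
    return r.base + off;
}
```

For a 256-entry table (limit 2047) at a hypothetical base of 0x80112000, the descriptor for interrupt 64 sits 512 bytes in, at 0x80112200.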

Fetching the n'th interrupt descriptor

The interrupt descriptor itself looks like this.

Interrupt Descriptor

The IDT is initialized at boot time, and the code that does this is in trap.c.

trap.c

SETGATE definition
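What SETGATE has to do is pack the handler address, the code segment selector, the gate type and the DPL into one 8-byte descriptor. The sketch below is modeled on the layout of a 32-bit x86 gate descriptor, with the handler offset split into two 16-bit halves. The function form, the names gate_desc, set_gate and gate_offset, and the example address are assumptions for illustration (xv6 uses a macro), and the bit-field layout relies on the usual GCC/Clang behavior.

```c
#include <stdint.h>

#define STS_IG32 0xE   /* 32-bit interrupt gate */
#define STS_TG32 0xF   /* 32-bit trap gate */

struct gate_desc {
    uint32_t off_15_0  : 16;  /* low 16 bits of the handler offset */
    uint32_t cs        : 16;  /* code segment selector */
    uint32_t args      : 5;   /* unused for interrupt/trap gates */
    uint32_t rsv1      : 3;   /* reserved */
    uint32_t type      : 4;   /* STS_IG32 or STS_TG32 */
    uint32_t s         : 1;   /* 0 for system descriptors */
    uint32_t dpl       : 2;   /* privilege level needed to use `int n` */
    uint32_t p         : 1;   /* present */
    uint32_t off_31_16 : 16;  /* high 16 bits of the handler offset */
};

/* Fill in one gate; istrap selects a trap gate instead of an interrupt gate. */
static void set_gate(struct gate_desc *g, int istrap, uint16_t sel,
                     uint32_t off, unsigned dpl)
{
    g->off_15_0 = off & 0xFFFF;
    g->cs = sel;
    g->args = 0;
    g->rsv1 = 0;
    g->type = istrap ? STS_TG32 : STS_IG32;
    g->s = 0;
    g->dpl = dpl;
    g->p = 1;
    g->off_31_16 = off >> 16;
}

/* Reassemble the handler address the CPU would load into eip. */
static uint32_t gate_offset(const struct gate_desc *g)
{
    return ((uint32_t)g->off_31_16 << 16) | (uint32_t)g->off_15_0;
}
```

Setting up a gate with dpl 3 and then reading it back with gate_offset returns the original handler address, which is exactly the round trip the CPU performs in the last step of int n.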

Steps 3~6: Mode Switch and Jump to Interrupt Handler

In step 3, we check the privilege level written in the code segment descriptor. The code segment descriptor contains information about the interrupt handler code and can be fetched using the 'code segment selector' field defined in the interrupt descriptors we saw earlier.

Getting the privilege level of the interrupt handler code

If the privilege level we got from the segment descriptor is 0, which means kernel mode, and the current privilege level of the CPU (CPL) is 3, meaning user mode, we must perform a mode switch so we can run the privileged (level 0) kernel code.

This means we first save the current esp and ss values (step 3-a). Then we switch from the user stack to the kernel stack by loading the esp and ss values recorded in the task state segment (step 3-b), and push the values we saved onto the kernel stack (step 3-c). Next we save the rest of the register context of the program that was running when the interrupt occurred by pushing it onto the kernel stack. Specifically, we save the eflags, cs, and eip registers (step 4). Lastly, we load the eip register with the address of the interrupt handler code we wish to run (step 6).

The kernel stack will look like this after this process.

The address of the interrupt handler code that is written to the eip register is written in the interrupt descriptors' offset field at boot time. Let's revisit the tvinit() function to see what is initialized in the offset field.

We can see that an interrupt vector (vectors[i], or vectors[T_SYSCALL] in the case of system calls) corresponding to each interrupt is passed to the SETGATE macro, which initializes each interrupt descriptor in the IDT. So each vectors[i], which holds the address of the interrupt handler's entry code, becomes the offset field of an interrupt descriptor!

Let's return to our example of invoking the write system call in the cat.c file. The call lands in the code generated by the SYSCALL(write) macro in usys.S, which looks like this.

write:

movl $SYS_write, %eax

int $T_SYSCALL

ret

What happens in the 'int 64' instruction?

We first fetch the interrupt descriptor for interrupt number 64. We then look at the DPL defined there and compare it with the CPL, verifying that this interrupt is allowed at the current privilege level. If we pass this verification, we get ready to make a mode switch if needed. If we were in user mode when the interrupt occurred, we will have to. We save registers like esp and eip, switch to the kernel stack, and push the saved register context onto it so we can restore it later once the interrupt has been handled. Finally we jump to the interrupt handler code by reading the offset value from the interrupt descriptor. (Just a recap of the steps from before.)

We also saw that vectors[64] is the offset value set for the system call interrupt descriptor, so this is the interrupt handler that will be called for our write system call.

vectors.S

pushl $0 pushes a placeholder error code and pushl $64 pushes the trap number. Both become fields of the trap frame that alltraps builds next. Now the stack looks like this.

Now we should take a look at the alltraps function, right? Long process for a simple command like cat😅.

What's happening is that a trap frame is being built.

After the pushes from pushl %ds through pushal (which pushes all the general-purpose registers), this will be the resulting kernel stack. It corresponds exactly to the trapframe structure defined in x86.h.

x86.h
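Because the struct has to mirror the stack exactly, the fields appear in the reverse of the order in which they were pushed: the last thing pushed (by pushal) sits at the lowest address and therefore comes first. The sketch below is modeled on xv6's trapframe, with fixed-width types substituted for xv6's own typedefs. Treat the field names as an approximation and check x86.h for the authoritative version.

```c
#include <stddef.h>
#include <stdint.h>

struct trapframe {
    /* pushed by pushal (last push, lowest address) */
    uint32_t edi, esi, ebp;
    uint32_t oesp;            /* esp value saved by pushal, ignored */
    uint32_t ebx, edx, ecx, eax;

    /* segment registers pushed by alltraps, each padded to 32 bits */
    uint16_t gs, padding1;
    uint16_t fs, padding2;
    uint16_t es, padding3;
    uint16_t ds, padding4;

    /* pushed by the vector entry code (pushl $64 for system calls) */
    uint32_t trapno;

    /* error code: pushed by the vector code or by the hardware */
    uint32_t err;

    /* pushed by the hardware during `int n` */
    uint32_t eip;
    uint16_t cs, padding5;
    uint32_t eflags;

    /* pushed by the hardware only when switching from user to kernel */
    uint32_t esp;
    uint16_t ss, padding6;
};

/* The layout must match the pushes word for word: 19 words = 76 bytes. */
_Static_assert(sizeof(struct trapframe) == 76, "unexpected trap frame size");
_Static_assert(offsetof(struct trapframe, eax) == 28, "eax misplaced");
_Static_assert(offsetof(struct trapframe, trapno) == 48, "trapno misplaced");
_Static_assert(offsetof(struct trapframe, eip) == 56, "eip misplaced");
```

The static assertions only check the sketch's own layout, but they show the idea: a pointer to the top of the kernel stack can be read directly as a pointer to this struct.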

The trap frame captures the full context of the process that was running when the interrupt occurred, and as you can see in the trapret routine, that context is popped back into the registers once the trap handler has returned.

Also, the address of the trap frame is passed to the trap function as a parameter and is used to read information such as the error code and trapno (this is done by the pushl %esp before the call to trap). This is the prototype of trap.

-> void trap(struct trapframe* tf);

trap.c

The trap function checks the trapno in the trap frame, and in our write example, this would be the system call trap number, 64. Let's see what happens in syscall().

syscall.c

The eax value, which holds the system call number, is read from the trap frame. This is why it was so important to save the register context in a trap frame earlier. Then syscalls[num]() invokes the sys_write function, which finally runs the code we actually wanted. Follow sys_write using 'Ctrl+]' to find the actual implementation of write. What we focus on here is how a trap, specifically a system call, is processed. We also see in line 143 that the return value of the system call is written back to eax.
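The dispatch described above can be sketched in a few lines of C. This is a simplified stand-in rather than xv6's code: mini_tf keeps only the two fields that matter here, sys_write_stub pretends that 5 bytes were written, and the _sketch names are made up. The numbers 64 and 16 are the values from traps.h and syscall.h.

```c
#include <stddef.h>
#include <stdint.h>

#define T_SYSCALL 64   /* from traps.h */
#define SYS_write 16   /* from syscall.h */

/* Only the two trap frame fields this sketch needs. */
struct mini_tf {
    uint32_t eax;      /* system call number on entry, return value on exit */
    uint32_t trapno;   /* vector number pushed by the entry code */
};

/* Hypothetical handler: pretends that 5 bytes were written. */
static int sys_write_stub(void)
{
    return 5;
}

/* Table of handlers indexed by system call number; unused slots are NULL. */
static int (*syscalls[])(void) = {
    [SYS_write] = sys_write_stub,
};

static void syscall_sketch(struct mini_tf *tf)
{
    uint32_t num = tf->eax;   /* the number the user stub placed in eax */
    size_t count = sizeof(syscalls) / sizeof(syscalls[0]);

    if (num > 0 && num < count && syscalls[num])
        tf->eax = (uint32_t)syscalls[num]();   /* return value goes back in eax */
    else
        tf->eax = (uint32_t)-1;                /* unknown system call */
}

static void trap_sketch(struct mini_tf *tf)
{
    if (tf->trapno == T_SYSCALL)
        syscall_sketch(tf);
    /* other trap numbers (timer, keyboard, faults, ...) are not modeled */
}
```

Since the return value is stored in the saved eax, it is restored into the real eax register when the trap frame is popped, and that is how the user program sees the result of write.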
