Page 17 - DCAP103_Principle of operating system
Principles of Operating Systems
Notes: This is where kernel programming gets dangerous. While writing the
example below, I killed the open() system call. This meant I couldn’t
open any files, I couldn’t run any programs, and I couldn’t shut down
the computer. I had to pull the power switch. Luckily, no files died. To
ensure you won’t lose any files either, please run sync right before you
do the insmod and the rmmod.
In general, a process is not supposed to be able to access the kernel. It can’t access kernel memory
and it can’t call kernel functions. The hardware of the CPU enforces this (that is the reason why
it is called ‘protected mode’).
System calls are an exception to this general rule. What happens is that the process fills
the registers with the appropriate values and then executes a special instruction which jumps
to a previously defined location in the kernel (of course, that location is readable by user
processes, but it is not writable by them). On Intel CPUs, this is done by means of interrupt
0x80. The hardware knows that once you jump to this location, you are no longer running
in restricted user mode, but as the operating system kernel and, therefore, you are allowed
to do whatever you want.
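
The mechanism above can be observed from user space. The following is a minimal sketch, assuming a Linux system with glibc: the library function syscall() places the service number and arguments in the registers and executes the trap instruction on our behalf (on current x86-64 systems that is typically the syscall instruction rather than interrupt 0x80). The helper name raw_getpid is hypothetical and is used only for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* ask glibc to declare syscall() */
#endif
#include <sys/syscall.h>     /* SYS_getpid: the service number */
#include <sys/types.h>       /* pid_t */
#include <unistd.h>          /* syscall(), getpid() */

/* Hypothetical helper: request the "get process id" service by number,
 * bypassing the usual getpid() library wrapper. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Calling raw_getpid() should give the same value as the ordinary getpid() wrapper, since both end up at the same kernel function.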
The location in the kernel a process can jump to is called system_call. The procedure at that location
checks the system call number, which tells the kernel what service the process requested. Then, it
looks at the table of system calls (sys_call_table) to see the address of the kernel function to call.
Then it calls the function, and after it returns, does a few system checks and then returns to
the process (or to a different process, if the process time ran out). If you want to read this code, it
is at the source file arch/<architecture>/kernel/entry.S, after the line ENTRY(system_call).
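
The lookup step described above can be sketched in ordinary user-space C. This is only a toy model, not the kernel's code: demo_call_table stands in for sys_call_table, demo_system_call stands in for the entry point, and the two services and their numbers are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical service numbers; a real kernel defines its own. */
enum { NR_DOUBLE = 0, NR_NEGATE = 1, NR_COUNT = 2 };

typedef long (*service_fn)(long arg);

static long svc_double(long arg) { return arg * 2; }
static long svc_negate(long arg) { return -arg; }

/* Stand-in for sys_call_table: one function pointer per service number. */
service_fn demo_call_table[NR_COUNT] = {
    [NR_DOUBLE] = svc_double,
    [NR_NEGATE] = svc_negate,
};

/* Stand-in for the entry point: check the number, look up the handler
 * in the table, call it, and hand the result back to the caller. */
long demo_system_call(long nr, long arg)
{
    if (nr < 0 || nr >= NR_COUNT || demo_call_table[nr] == NULL)
        return -1;   /* toy error value for an unknown service */
    return demo_call_table[nr](arg);
}
```

Because the caller only ever supplies a number, whatever pointer sits in the table slot decides which code runs. That indirection is what makes replacing a system call possible.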
So, if we want to change the way a certain system call works, what we need to do is to write
our own function to implement it (usually by adding a bit of our own code, and then calling
the original function) and then change the pointer at sys_call_table to point to our function.
Because we might be removed later and we don’t want to leave the system in an unstable state,
it is important for cleanup_module to restore the table to its original state.
The source code here is an example of such a kernel module. We want to ‘spy’ on a certain user,
and to printk() a message whenever that user opens a file. Towards this end, we replace the
system call to open a file with our own function, called our_sys_open. This function checks the
uid (user ID) of the current process, and if it is equal to the uid we spy on, it calls printk() to
display the name of the file to be opened. Then, either way, it calls the original open() function
with the same parameters, to actually open the file.
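
Before looking at real module code, the save, replace, and restore pattern can be tried safely in user space. The sketch below is a simulation under stated assumptions, not the module itself: demo_table_open plays the role of the open entry in sys_call_table, real_open plays the original kernel function, demo_current_uid plays the uid of the calling process, and a string buffer stands in for printk(). Only our_sys_open is a name taken from the text; every other name is hypothetical.

```c
#include <stdio.h>

typedef int (*open_fn)(const char *path, int flags);

/* Stand-in for the original kernel function; always "succeeds" with 3. */
static int real_open(const char *path, int flags)
{
    (void)path; (void)flags;
    return 3;
}

open_fn demo_table_open = real_open;  /* the table slot for "open" */
int demo_current_uid = 0;             /* uid of the calling process */
int demo_spied_uid = 500;             /* the uid we want to watch */
char demo_last_logged[128] = "";      /* what printk() would have shown */

static open_fn original_open;         /* saved pointer, used for restore */

/* Replacement: log if the caller is the spied user, then always call
 * the original with the same parameters. */
int our_sys_open(const char *path, int flags)
{
    if (demo_current_uid == demo_spied_uid)
        snprintf(demo_last_logged, sizeof demo_last_logged,
                 "Opened file by %d: %s", demo_current_uid, path);
    return original_open(path, flags);
}

/* Models init_module: remember the original, then install ours. */
void demo_init_module(void)
{
    original_open = demo_table_open;
    demo_table_open = our_sys_open;
}

/* Models cleanup_module: put the saved pointer back. */
void demo_cleanup_module(void)
{
    if (original_open != NULL)
        demo_table_open = original_open;
}
```

After demo_init_module(), every call made through demo_table_open passes through our_sys_open first; after demo_cleanup_module(), the slot points at real_open again. Note that the restore step blindly trusts the saved pointer, which only works if nobody else has changed the slot in the meantime.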
The init_module function replaces the appropriate location in sys_call_table and keeps the original
pointer in a variable. The cleanup_module function uses that variable to restore everything back
to normal. This approach is dangerous, because of the possibility of two kernel modules changing
the same system call. Imagine we have two kernel modules, A and B. A’s open system call will
be A_open and B’s will be B_open. Now, when A is inserted into the kernel, the system call is
replaced with A_open, which will call the original sys_open when it is done. Next, B is inserted
into the kernel, which replaces the system call with B_open, which will call what it thinks is the
original system call, A_open, when it is done.
Now, if B is removed first, everything will be well: it will simply restore the system call to
A_open, which calls the original. However, if A is removed and then B is removed, the system
will crash. A’s removal will restore the system call to the original, sys_open, cutting B out of the
loop. Then, when B is removed, it will restore the system call to what it thinks is the original,
A_open, which is no longer in memory. At first glance, it appears we could solve this particular
problem by checking if the system call is equal to our open function and if so not changing it
at all (so that B won’t change the system call when it is removed), but that will cause an even
worse problem. When A is removed, it sees that the system call was changed to B_open so that