Overview
- Minix signals
- file systems: user interface, API
- file system implementation
Minix signal handling implementation
- sig_proc (p. 904) delivers a signal; it may be called
from the kernel or from the process manager (PM) itself
- sig_proc
- makes sure the process is not dead,
- returns if the signal is ignored or blocked
(if blocked, after recording it in the pending signals), then
- checks for a handler, and if so, executes it, or
- if no handler, kills the process (calling pm_exit), optionally
dumping core
- to execute a signal handler, sig_proc builds two large
structures on the stack (Fig 4-49 on p. 392; both are sketched at the
end of this section):
- a sigcontext structure holds a copy of significant
parts of the kernel process table entry, particularly all the saved registers
- a sigframe structure holds a valid stack frame for
the execution of sigreturn, including a return address and some parameters
- because these structures are large, the stack may overflow, in which
case the entire process is killed
- the signal is removed from the set of pending signals
- if the process is paused on a system call (including but
not limited to pause), it is unblocked by sending it
a reply -- the stack is set up so that the signal handler will execute
first, and only then will the system call complete
- the signal itself is delivered by sys_sigsend (p. 759):
since the PM set up the stack correctly, all that is needed is the
appropriate context switch
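- a minimal C sketch of the two structures (field names and layout are
illustrative, not the actual Minix definitions):

    #include <stdio.h>

    struct sigcontext {            /* saved state, copied from the kernel */
        int  sc_flags;             /* process table entry                 */
        long sc_regs[16];          /* all the saved registers             */
        long sc_pc;                /* program counter to resume at        */
        long sc_sp;                /* stack pointer to restore            */
        long sc_mask;              /* signal mask to restore on sigreturn */
    };

    struct sigframe {              /* a valid call frame for sigreturn    */
        void (*sf_retadr)(void);   /* return address, leads to sigreturn  */
        int  sf_signo;             /* argument: the signal being delivered*/
        struct sigcontext *sf_scp; /* argument: where the context is saved*/
    };

    int main(void)
    {
        /* the point of Fig 4-49: together these are large, so pushing
         * them can overflow a nearly-full user stack, in which case the
         * process is killed instead */
        printf("sigcontext: %zu bytes, sigframe: %zu bytes\n",
               sizeof(struct sigcontext), sizeof(struct sigframe));
        return 0;
    }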
Minix signal handling functions
- check_pending is called whenever the set of signals for
a process may have changed, and calls sig_proc as appropriate
- do_sigaction, do_sigpending,
do_sigprocmask, do_sigsuspend,
and do_sigreturn
manipulate the signal bit sets and handler tables, contacting the
kernel or calling check_pending as appropriate (the user-level API they
implement is illustrated at the end of this section)
- check_sig is called to make sure the signal can be sent,
and to send it to a group of processes if appropriate (e.g. by the kernel
when rebooting)
- do_kill and ksig_pending are called to send a signal
(from user space and from the kernel, respectively), and eventually
call check_sig. The major difference between them is that the kernel
may send several signals at once
- do_alarm and set_alarm turn alarms on and off
by calling pm_set_timer and pm_expire_timers and,
if necessary, contacting the system task
- dump_core uses several system calls to create
and write a core file
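- the user-level view of the calls above, as standard POSIX code (not
Minix-specific): do_sigaction installs the handler, do_sigprocmask
blocks SIGALRM so an arriving signal is merely recorded as pending,
do_alarm arms the timer, and do_sigsuspend atomically unblocks and
waits:

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_alarm = 0;

    static void handler(int signo)
    {
        (void)signo;
        got_alarm = 1;                  /* async-signal-safe: set a flag */
    }

    int main(void)
    {
        struct sigaction sa;
        sigset_t block, old;

        sa.sa_handler = handler;        /* installed via do_sigaction    */
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGALRM, &sa, NULL);

        sigemptyset(&block);            /* blocked via do_sigprocmask;   */
        sigaddset(&block, SIGALRM);     /* an arriving SIGALRM is only   */
        sigprocmask(SIG_BLOCK, &block, &old); /* recorded as pending     */

        alarm(1);                       /* do_alarm arms a PM timer      */

        while (!got_alarm)              /* do_sigsuspend: restore the    */
            sigsuspend(&old);           /* old mask and wait; the pending*/
                                        /* signal is delivered here      */
        printf("alarm delivered\n");
        return 0;
    }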
File System Motivation
- more storage than is available in a process's memory
- persistence of information across process termination and system crashes
- information sharing among processes
- solution: store information on disk in named units called files
- files are only deleted by explicit user action
- file system design must decide on
- file/directory (folder) structure(s)
- file names (length and structure)
- what access is supported for files, including what protection is
allowed and what operations are supported
- how these operations are implemented given a persistent (block?) device
- first three are user interface issues, last is an implementation issue
File Names
- a name is the way of referring to a persistent object (pointers are
a kind of name, but not very user-friendly)
- file names are often limited in length, e.g. 8 or 255 characters
- some operating systems distinguish between upper- and lower-case
names (e.g. Unix), others don't (e.g. Windows/Dos)
- some operating systems support a single extension (following a period)
and ascribe significance to it, others support arbitrary extensions (e.g.
.tar.gz) and the operating system itself does not care about the extension
File Structures
- most common these days is the unstructured file consisting
of an arbitrary-length sequence of bytes: Unix, Dos/Windows
- files could also be composed of a sequence of fixed-size
records, in which case read, write, and positioning operations
access records, not bytes (see the sketch after this list)
- each record can also contain a distinct key (perhaps
unique, perhaps not), and the file system may structure the file to
make access by key efficient, e.g. as a tree or using an index
(Indexed Sequential Access Method, ISAM, on IBM, then DEC/VMS)
- operating systems are starting to offer versions of
content-based (associative) access to files, which creates
an index across an entire file system -- not (yet) commonly
supported by the OS, though I think BeOS had it integrated
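- a minimal sketch of record-at-a-time random access, as in the second
bullet above (the record layout here is made up for illustration):

    #include <stdio.h>

    struct record {
        char key[16];
        char data[48];
    };

    /* read record n by seeking to n * sizeof(struct record) */
    int read_record(FILE *f, long n, struct record *r)
    {
        if (fseek(f, n * (long)sizeof(struct record), SEEK_SET) != 0)
            return -1;
        return fread(r, sizeof *r, 1, f) == 1 ? 0 : -1;
    }

    int main(void)
    {
        struct record out = { "key7", "value seven" }, in;
        FILE *f = fopen("records.dat", "w+b");

        if (f == NULL)
            return 1;
        fseek(f, 7 * (long)sizeof out, SEEK_SET);   /* write record 7 */
        fwrite(&out, sizeof out, 1, f);
        if (read_record(f, 7, &in) == 0)            /* read it back   */
            printf("record 7: %s = %s\n", in.key, in.data);
        fclose(f);
        return 0;
    }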
File Types
- Unix: regular files, directories, device files including character
special files and block special files
- ASCII (text) files contain 0 or more arbitrary-length lines,
typically ended by a newline ('\n') in Unix, or a carriage-return
+ newline ("\r\n") in Windows/Dos -- "\r\n" is the standard line
ending in Internet text protocols
- binary files usually have internal structure, but that structure
is entirely defined by the program that creates them, e.g. MS-Word files
or executable files
- example: an archive file has a collection of headers followed by
object modules. Each header has the module name, date, owner,
protection, and size (compare the ar(1) header sketched at the end
of this section).
- Windows file types are defined by the extension, which implies a default
program to be used to access the file: easier to use, but less flexible
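- the traditional Unix ar(1) archive header matches the description in
the example above; this is the layout from <ar.h> (from memory --
check your system's header), with every field stored as printable text:

    #define ARMAG  "!<arch>\n"   /* magic string starting an archive */
    #define SARMAG 8
    #define ARFMAG "`\n"         /* terminates each member header    */

    struct ar_hdr {
        char ar_name[16];        /* member (module) name             */
        char ar_date[12];        /* modification date, decimal       */
        char ar_uid[6];          /* owner's user id, decimal         */
        char ar_gid[6];          /* owner's group id, decimal        */
        char ar_mode[8];         /* protection bits, octal           */
        char ar_size[10];        /* member size in bytes, decimal    */
        char ar_fmag[2];         /* header terminator: ARFMAG        */
    };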
File Access
- files are normally accessed sequentially:
- an editor may read the entire file into memory before editing
- a daemon reads its entire configuration file when it starts
- a compiler reads and processes a file linearly, from start to end
- a server reads a file one (arbitrary sized) block at a time and sends
the block
- some programs, especially database or similar systems, need
random access to files -- position can be specified as part of the
read or write call, or by a separate seek operation
- many programs that need to be efficient now map files into
their virtual memory, then let the operating system page them in as
needed (all three access styles are sketched below)
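- the three access styles above, as standard POSIX calls (the file
name is hypothetical and error handling is trimmed for brevity):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];
        struct stat st;
        int fd = open("data.bin", O_RDONLY);  /* hypothetical file */

        if (fd < 0)
            return 1;

        /* 1: sequential access, one block at a time */
        while (read(fd, buf, sizeof buf) > 0)
            ;

        /* 2: random access: seek to byte 1000, then read */
        lseek(fd, 1000, SEEK_SET);
        read(fd, buf, 100);

        /* 3: map the file; the OS pages it in on demand */
        fstat(fd, &st);
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            char first = p[0];    /* a page fault brings the data in */
            (void)first;
            munmap(p, st.st_size);
        }
        close(fd);
        return 0;
    }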
File Attributes
- size (and maybe maximum size and/or record size, key offset in record)
- creation time, time of last (read, modify) access
- owner (and maybe group)
- permissions (read/write/execute/visible/system)
- has this file been archived? is it temporary? is it locked?
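- on Unix, most of these attributes are read with stat(2) (the file
name here is hypothetical):

    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat st;

        if (stat("example.txt", &st) != 0)
            return 1;
        printf("size:  %lld bytes\n", (long long)st.st_size);
        printf("owner: uid %d, gid %d\n", (int)st.st_uid, (int)st.st_gid);
        printf("mode:  %o\n", (unsigned)(st.st_mode & 07777));
        printf("mtime: %lld\n", (long long)st.st_mtime);
        return 0;
    }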
File Operations
- create, delete (unlink)
- open, close
- read, write, append, seek
- get and set attributes
- rename
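- the list above as the corresponding Unix calls, one file's whole
life cycle (error handling omitted for brevity):

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[16];
        int fd = open("tmpfile", O_CREAT | O_RDWR, 0644); /* create+open */

        write(fd, "hello\n", 6);          /* write                  */
        lseek(fd, 0, SEEK_SET);           /* seek back to the start */
        read(fd, buf, sizeof buf);        /* read                   */
        fchmod(fd, 0600);                 /* set attributes         */
        close(fd);                        /* close                  */
        rename("tmpfile", "tmpfile2");    /* rename                 */
        unlink("tmpfile2");               /* delete (unlink)        */
        return 0;
    }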
Directories
- directory (folder) systems are used to structure files
- strictly speaking (e.g. as seen by a program), directories usually
form a hierarchy (a tree), but there are escapes that allow
loops (e.g. Unix parent directory ".." and symbolic links,
Windows "My computer" and other links)
- often each user has his/her own directory, e.g. /usr/loginname
or /home/loginname or /home/mount/loginname (e.g. uhunix)
- a directory is usually a special kind of file which may
or may not support specific regular file operations, but usually
supports operations such as create, delete, open, close, read (one
record at a time), rename (a subfile), link (a subfile), unlink (a subfile)
- Unix has hard links (multiple hard links to a file, possibly with
different names, are equivalent) and soft links
- when the last hard link to a file is unlinked, the file is removed
(file attributes include a reference count of links)
- a file may have any number of soft links (essentially, the name
of the hard link stored in the soft link file), which can work across
file systems, but become invalid if the file is removed
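- hard and soft links as the Unix calls (this assumes a file named
"target" already exists):

    #include <unistd.h>

    int main(void)
    {
        link("target", "hard");     /* 2nd hard link to the same i-node */
        symlink("target", "soft");  /* soft link: a file holding a name */

        unlink("target");           /* data still reachable via "hard"  */
        unlink("hard");             /* reference count hits 0: the file */
                                    /* itself is removed; "soft" still  */
        unlink("soft");             /* exists, but is dangling          */
        return 0;
    }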
Directory Structures
- very simple systems might only have a single-level (top-level) directory
- most systems allow almost arbitrary depth
- a file name is generally interpreted either relative to a
per-process current (working) directory, or as an absolute path
name (beginning with "/" in Unix-like systems)
- some operating systems allow a process to set the root of its
file system (chroot), which may provide more security if the
process is engaging in risky activities (e.g. listening on the network)
- directories are stored in (special) files, usually containing at
least the names of the subfiles, pointers to the data, and either the
attributes or pointers to the attributes
- which is more efficient: to store all the attributes in the directory,
or to have a pointer to the attributes in the directory? Which is
simpler? (both layouts are sketched below)
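- a sketch of the two layouts; the pointer style is essentially the
classic Unix (V7) directory entry, the inline style is roughly the
MS-DOS approach (the field choices here are illustrative):

    struct dir_entry_ptr {      /* attributes stored in the i-node (Unix) */
        unsigned short d_ino;   /* the pointer: an i-node number          */
        char d_name[14];        /* file name                              */
    };

    struct dir_entry_inline {   /* attributes stored in the directory     */
        char d_name[14];
        unsigned short d_mode;  /* permissions                            */
        unsigned short d_uid;   /* owner                                  */
        unsigned int d_size;    /* size in bytes                          */
        unsigned int d_time;    /* modification time                      */
        unsigned int d_first;   /* where the data starts                  */
    };

the pointer style keeps entries small and makes hard links cheap (two
entries can name the same i-node); inline attributes save one disk
access but must be duplicated or shared awkwardly if links are allowed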
File System Implementation
- keeping track of which blocks of data belong to which file:
- contiguous allocation is simple and makes sequential or random
access fast, but requires knowing the file size when allocating
and suffers external fragmentation: the space is available but not
usable (until defragmentation)
- linked list allocation makes sequential access fast, but
random access is very slow, and since each block holds a pointer to
the next, the data in a block is not a power of two bytes
- linked list allocation in memory (the scheme used by MS-DOS's FAT)
keeps a global table with one pointer per block (the table is in memory
while running, frequently backed up on disk); sequential and random
access are fast and block sizes are a power of two, but the memory
table may be large (a 4GB disk with 16K blocks has 256K entries, i.e.
1M of RAM at 4 bytes per entry)
- i-node (index node) allocation keeps a per-file table of blocks,
sequentially ordered, and stored on disk until the file is opened. The
inode may also store the file attributes, and some of the entries may point
to indirect blocks which contain more pointers. Fast for sequential
and random access, does not use a lot of memory or disk, does not cause
fragmentation
- some versions of Unix used i-nodes with 13 block addresses per inode
- this keeps the i-node a constant size, small enough to fit in a block
- if more than 10 data blocks are needed, the 11th pointer is a single
indirect block, i.e. points to a block containing addresses of data blocks
- if this is not sufficient, the 12th pointer is a double indirect
block, containing the address of a block which contains addresses of
blocks of addresses of data blocks
- the last address is a triple-indirect block
- if each indirect block holds the addresses of up to 64 other blocks
of 512 bytes each, what is the maximum file size (in bytes) in this
system? (one worked answer below)
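- one way to work it out (assuming the 10 direct pointers described
above):

    #include <stdio.h>

    int main(void)
    {
        long direct = 10;
        long single = 64;               /* one single indirect block */
        long dbl    = 64L * 64;         /* 4096 blocks               */
        long triple = 64L * 64 * 64;    /* 262144 blocks             */
        long blocks = direct + single + dbl + triple;  /* 266314     */

        printf("max file size: %ld bytes\n", blocks * 512);
        /* prints 136352768 */
        return 0;
    }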

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 License.