Outline
- shells and window systems
- what is and what isn't part of the operating system?
- starting programs
- processes
- pipes
- scripting languages
Interactive Shells
- in Unix, the shell and the window system are considered applications
- in DOS/Windows (and in many other systems), the command line
interpreter and the window system are considered part of the
operating system
- the OS system calls must be powerful enough that applications
can do anything the OS offers to the user
- a shell is a program that accepts input from the user and
uses it to determine which command(s) to execute
- in Unix, shells may also allow input and output redirection
to/from files or other commands or even devices
What is in the operating system?
- scheduler: always
- memory management: always
- command execution: always
- file system: usually (but in Minix it is in a "server")
- inter-process communication: frequently
- user interface: sometimes
- having as much as possible outside the OS gives the
user more choices -- usually easier to replace an application than the OS.
For example, Unix users have a choice of shells, and sometimes
of window systems
- having as much as possible outside the OS makes the OS
simpler, more maintainable, more secure against attacks, and less
likely to have bugs
- having as much as possible inside the OS may be more
efficient -- less time is spent doing system calls
- having as much as possible inside the OS may raise the bar
for what the OS should do, and discourage people from making some
choices
Command-Line Shell Implementation
1. read one line of text
2. parse the line of text, and perhaps return to step (1) if more
lines are needed to complete the command (e.g. bash has commands spanning many lines)
3. interpret the input text to decide what needs to be done
4. ask the operating system to execute commands as needed,
providing the parameters and any other information
- internal commands are executed directly by the shell
program, rather than resulting in the OS loading an executable file,
e.g. exit
- while this process is usually interactive, the shell will work
exactly the same if given input from a file -- a shell script
- the shell is a regular program, usually written in C, but it could
be written in Scheme (e.g. scsh) or in any other language
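
The steps above can be sketched as a small read-parse-execute loop. This is only an illustration, not a complete shell: the names parse, run_command, and run_shell are made up for this example, parsing is plain whitespace splitting (no quoting), and subprocess.call stands in for the fork/exec/wait sequence that a real shell performs itself.

```python
import subprocess
import sys


def parse(line):
    """Split a command line into words (no quoting or multi-line commands)."""
    return line.split()


def run_command(argv):
    """Ask the OS to run an external command; return its exit status."""
    try:
        return subprocess.call(argv)
    except OSError as err:
        # The executable could not be found or started.
        print(f"{argv[0]}: {err.strerror}", file=sys.stderr)
        return 127


def run_shell(lines):
    """Run every command in `lines`, which may be a terminal or a script file."""
    last_status = 0
    for line in lines:
        argv = parse(line)
        if not argv:
            continue              # blank line: nothing to do
        if argv[0] == "exit":
            break                 # internal command, handled by the shell itself
        last_status = run_command(argv)
    return last_status
```

Calling run_shell(sys.stdin) gives the interactive case, and run_shell(open("script.sh")) runs the same loop over a file, which is the point made above about shell scripts.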
Starting a program in Unix
- main is given a count of arguments and an array of
pointers to arguments, of which the first is the command name
- the system call execv takes as its arguments a
command file name and an array of pointers to null-terminated
strings containing the arguments
- execv destroys the memory copy of the currently
running program (the program that called execv) and
in its place copies the contents of the executable file -- this
is called an overlay
- once the overlay is done, execv begins execution
of the new program
- because of the overlay, a successful call to execv never returns
- therefore, a shell forks a child process
which is responsible for executing the command
- the shell then usually waits for the child process to
complete
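
The fork, exec, and wait sequence described above might look as follows. This is a minimal sketch using Python's os module, whose fork, execv, and waitpid functions are thin wrappers around the corresponding Unix calls. The function name spawn_and_wait and the exit code 127 for a failed exec are choices made for this example.

```python
import os


def spawn_and_wait(path, argv):
    """Fork a child, overlay it with the program at `path`, and wait for it."""
    pid = os.fork()
    if pid == 0:
        # Child process: a successful execv never returns.
        try:
            os.execv(path, argv)
        finally:
            # Reached only if execv failed (e.g. the file does not exist).
            os._exit(127)
    # Parent process: block until the child terminates.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)   # child was killed by a signal
```

For example, spawn_and_wait("/bin/sh", ["sh", "-c", "exit 7"]) returns 7. Note that argv[0] is the command name, as described for main above.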
Processes in Unix
- from the point of view of the application, a process is a virtual
machine:
  - each process has its own memory which no other process can affect
  - each process can do input and output almost independently of all
    other processes
- the operating system supports this abstraction with virtual
memory and by providing high-level I/O operations, e.g.
file access rather than raw disk access
Creating a new process
- the application calls the fork system call
- after the fork, the two processes (virtual machines) have
identical contents, except for the fork return value: 0 in the child,
and the child's process ID in the parent
- on primitive operating systems, this is done by
actually allocating and replicating each writable page of virtual memory,
so fork could be an expensive operation
- on more elaborate operating systems, this is done by sharing
the memory pages and marking them read-only, and only copying a page when
one of the processes actually tries to modify it: Copy on Write, CoW
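
A small experiment can make the "separate virtual machines" idea concrete. In this sketch (the function name fork_demo and the values 41 and 42 are arbitrary), the child changes a variable and reports the new value through its exit status. The parent's copy of the variable stays unchanged, whether the OS copied the memory eagerly or used copy on write.

```python
import os


def fork_demo():
    """Return (parent's value, child's value) of a variable modified after fork."""
    counter = 41
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0. This change affects only the child's memory.
        counter += 1
        os._exit(counter)         # report the child's value as its exit status
    # Parent: fork returned the child's process ID.
    _, status = os.waitpid(pid, 0)
    return counter, os.WEXITSTATUS(status)
```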
Sharing among processes
- a new process is identical to its parent even to the point
of having the same open file descriptors
- therefore, printf works equally well in the parent and in
the child
- shells sometimes want to redirect the input or the output
of a command that will be executed
- this can be done by changing the file descriptors (e.g. fd 0 for
stdin) after the fork but before the exec call (since a successful
exec call never returns, nothing can be done after the exec call)
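
Output redirection along these lines could be sketched as follows. The function name run_with_stdout_to_file is made up for this example, and os.execvp (which searches PATH for the command) is used for convenience. The key step is the dup2 call, made in the child after the fork and before the exec.

```python
import os


def run_with_stdout_to_file(argv, out_path):
    """Run argv with its standard output redirected to the file out_path."""
    pid = os.fork()
    if pid == 0:
        # Child: rearrange file descriptors, then exec the command.
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)        # fd 1 (stdout) now refers to the file
            os.close(fd)          # the original descriptor is no longer needed
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)         # reached only if open or exec failed
    # Parent: its own fd 1 is untouched; it simply waits for the child.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

The command itself (echo, ls, ...) needs no knowledge of the redirection: it writes to fd 1 as usual.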
Pipes
- a pipe is a simple mechanism for interprocess communication
- when requesting a pipe from the OS, the OS returns two file
descriptors, one for reading and one for writing
- if file descriptor 0 is replaced by the read end of a pipe,
all the input for the program comes from the pipe
- likewise, if file descriptor 1 is replaced by the write end of a pipe,
all the standard output for the program goes to the pipe
- to use the pipe for IPC, it must be created before the
fork call, so it can be shared between the two processes that exist
after the fork
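
As an illustration, the sketch below creates a pipe, forks, and connects the child's standard output to the write end, while the parent reads from the read end. The function name capture_output is made up for this example. A shell pipeline such as "a | b" would instead fork a second child and place the read end on its fd 0.

```python
import os


def capture_output(argv):
    """Run argv in a child process and return everything it writes to stdout."""
    read_fd, write_fd = os.pipe()     # created before fork, so both processes share it
    pid = os.fork()
    if pid == 0:
        # Child: send standard output into the pipe, then exec the command.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)      # fd 1 (stdout) is now the write end
            os.close(write_fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)             # reached only if exec failed
    # Parent: close its copy of the write end, or read would never see end-of-file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                 # empty read: all writers have closed the pipe
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```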
Project 1
- create a simple shell that can:
  - execute commands with arbitrary arguments (but no need for quoting, etc)
  - support input and output redirection to files
  - support input and output redirection to pipes
  - allow programs to run in the background (which is actually
    easier than waiting for programs to terminate)
- it is my hope that if you write your own shell,
  - you will no longer consider shells a mystical component of
    the operating system
  - you will in the future feel more comfortable writing programs
    that execute other programs
Shell scripts
- a shell given input from a file is executing a shell script
- some shells have elaborate command interpreters with variables,
loops, and conditionals
- as a result, it is possible to write programs in the shell
scripting languages
- such a program can be interpreted, easily modified (without
recompilation), and is often more compact (but slower) than a
comparable C or Java program
- when execv tries to execute a file, it checks to
see whether it begins with the characters
#!
If it does, the remainder of the line is assumed to be the path
of an interpreter (optionally followed by an argument) for the
remaining lines of the file
- this allows interpretation of arbitrary scripting languages, not just
shell scripts, e.g. Perl, Tcl/Tk, Python
- because this is so convenient, for applications that are not
performance critical or overly complicated it is very advantageous
to use scripts
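
The #! check can be modeled in a few lines. This is a simplified illustration of the idea, not the code any kernel actually uses. The function name parse_shebang is made up, and the treatment of everything after the interpreter path as a single optional argument is an assumption based on Linux behavior (other systems differ in the details).

```python
def parse_shebang(first_line):
    """Return (interpreter, argument) for a '#!' line, or None if there is none."""
    if not first_line.startswith("#!"):
        return None                   # ordinary executable, or not a script
    rest = first_line[2:].strip()
    if not rest:
        return None                   # '#!' with no interpreter path
    parts = rest.split(None, 1)       # interpreter path, then the rest of the line
    interpreter = parts[0]
    argument = parts[1] if len(parts) > 1 else None
    return interpreter, argument
```

For a script starting with "#!/usr/bin/env python3", this yields ("/usr/bin/env", "python3"). The system then runs the interpreter with that argument and the script's path, so the interpreter reads the script itself.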
A design principle
- giving users (and programmers) flexibility leads to the
creation of multiple efficient and effective ways of doing things
- this is good for expert users, though not necessarily for
users who don't want to learn to program

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 License.