Advanced Linux Programming - Process
Process
A running instance of a program is called a process.
If you have two terminal windows showing on your screen, then you are probably running the same terminal program twice: you have two terminal processes. Each terminal window is probably running a shell; each running shell is another process. When you invoke a command from a shell, the corresponding program is executed in a new process; the shell process resumes when that process completes. Advanced programmers often use multiple cooperating processes in a single application to enable the application to do more than one thing at once, to increase application robustness, and to make use of already-existing programs.
Most of the process manipulation functions described in this chapter are similar to those on other UNIX systems. Most are declared in the header file <unistd.h>; check the man page for each function to be sure.
Process Interactions
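In C++, getopt() is a library function that parses command-line arguments
Usage:
getopt(argc, argv, "st:")
- input: the argument count, the argument vector, and a string describing the accepted options (here -s takes no value and -t requires one, as marked by the colon)
- output: returns the next option character, or -1 when no options remain; the value of an option that takes one is left in the global optarg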
Process
A process is a running instance of a program
Examples:
- Each of the two running instances of Firefox
- The shell and the ls command it executes; each is a process
Advanced programmers use multiple processes to
- Do several tasks at once
- Increase robustness (if one process fails, the others keep running)
- Make use of already-existing programs
The main components of a process:
- An executable piece of code (a program)
- Data that is input or output by the program
- Execution context (information about the program needed by OS)
Process - Hierarchy
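Each process in a Linux system is identified by its unique process ID, sometimes referred to as pid.
- Process IDs are integers (type pid_t) that Linux normally assigns sequentially as new processes are created, wrapping around at a configurable maximum. The traditional default maximum is 32768, which is why older texts describe them as 16-bit numbers.
Each process (with some exceptions) has a parent process, identified by its parent process ID (ppid)
We can get this information from within a program
- include the header unistd.h
- getpid() returns the ID of the calling process
- getppid() returns the ID of its parent process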
Observe that if you invoke this program several times, a different process ID is reported because each invocation is in a new process. However, if you invoke it every time from the same shell, the parent process ID (that is, the process ID of the shell process) is the same.
The process ID is 21062
Process - Create
- Running a program automatically creates (at least) one process
- Creating a process from within your program:
  - Using system
    - Runs a shell (as a subprocess) to run the given command
  - Using fork and exec
Using system is not recommended because
- The call to system relies on the installed shell
- It brings the shell’s features, limitations, and security flaws
Therefore, it’s preferable to use the fork and exec method for creating processes.
Process - Fork
- Forks the execution of a process (creates a new copy of the process)
- After a call to fork(), a new process is created (called the child)
  - The original process (called the parent) continues to execute concurrently
  - In the parent, fork() returns the process ID of the child that was created
  - In the child, fork() returns 0 to indicate that this is the child process
  - The parent and child are independent
Output
the main program process ID is 17609
Process - Exec
- The exec() family of functions is used to start another program in the current process
- After a call to exec(), the current process is replaced with the image of the specified program
- Different versions allow for different ways to pass command-line arguments and environment settings
int execv(const char *path, char *const argv[])
- path is the path to an executable
- argv is a NULL-terminated array of arguments. By convention, argv[0] is the name of the program being executed
Functions that contain the letter v in their names (execv, execvp, and execve) accept the argument list for the new program as a NULL-terminated array of pointers to strings. Functions that contain the letter l (execl, execlp, and execle) accept the argument list using the C language’s varargs mechanism.
To spawn a new process, you first use fork to make a copy of the current process. Then you use exec to transform one of these processes into an instance of the program you want to spawn.
The following example illustrates the use of execv to execute the ls program:
Output
[exec] executing '/bin/ls -l' using execv
Another simple example:
Output
total 1960
Some practical example:
// Run C++ program (but the program takes argv as input)
Difference between exec family:
- execve():
  p : not present => the program to run is identified by its pathname
  v : present => arguments are passed as an array
  e : present => the environment is taken from the envp argument
- execle():
  p : not present => the program to run is identified by its pathname
  l : present => arguments are passed as a list
  e : present => the environment is taken from the envp argument
- execlp():
  p : present => the program to run is identified by the given filename, or the system searches for the program file in the PATH variable
  l : present => arguments are passed as a list
  e : not present => the environment is taken from the caller's environ
- execvp():
  p : present => the program to run is identified by the given filename, or the system searches for the program file in the PATH variable
  v : present => arguments are passed as an array
  e : not present => the environment is taken from the caller's environ
- execv():
  p : not present => the program to run is identified by its pathname
  v : present => arguments are passed as an array
  e : not present => the environment is taken from the caller's environ
- execl():
  p : not present => the program to run is identified by its pathname
  l : present => arguments are passed as a list
  e : not present => the environment is taken from the caller's environ
Process - Kill
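Run kill in the terminal to send a signal to a process
- Plain kill sends SIGTERM, which a process can catch or ignore, so it does not necessarily terminate the process
- kill -KILL sends SIGKILL, which cannot be caught or ignored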
Process - Signals
- A special message sent to a process
- Signals are asynchronous
- Different types of signals (available through signal.h)
  - SIGTERM: Termination
  - SIGINT: Terminal interrupt (Ctrl+C)
  - SIGKILL: Kill (can’t be caught or ignored)
  - SIGBUS: Bus error
  - SIGSEGV: Invalid memory segment access
  - SIGPIPE: Write on a pipe with no reader (broken pipe)
  - SIGSTOP: Stop executing (can’t be caught or ignored)
Process - Receiving / Sending Signals
Receiving a signal:
- Default disposition
- Signal handler procedure
Sending signal from one process to another process (SIGTERM, SIGKILL)
- Usually the parent process sends signals to its children
int kill(pid_t pid, int sig)
- Sends the signal sig to the process pid
It’s a good practice to kill and wait for children to terminate before exiting
Process - Termination
What does it mean “a process terminates”?
Exit codes:
- zero: successful termination
- non-zero: termination with an error
exit(0);
Process - Zombie Processes
A zombie process is a process that has exited but whose parent has not yet collected its exit status
WAITING FOR A CHILD!
A parent process can wait for a child process to terminate
pid_t waitpid(pid_t pid, int *stat_loc, int options)
- Blocks the parent until the process with the specified pid terminates
- The termination status of that process is stored in stat_loc; macros such as WIFEXITED and WEXITSTATUS extract the exit code from it
- options controls whether the function blocks or not
  - 0 is a good choice for options
Interprocess Communication
Pipeline
A pipeline is a unidirectional communication channel
- Serial device: data is read in the same order it was written
- A pipeline synchronizes two processes
In a shell, use | to build a pipeline
ls | less
Pipe
A pipe is a communication device that permits unidirectional communication. Data written to the “write end” of the pipe is read back from the “read end.”
Pipes are serial devices; the data is always read from the pipe in the same order it was written.
Typically, a pipe is used to communicate between two threads in a single process or between parent and child processes.
pipe() creates a one-directional pipe
- Two file descriptors: one to write to and one to read from the pipe
- A process can use the pipe by itself, but this is unusual
- Typically, a parent process creates a pipe and shares it with a child, or between multiple children
- Some processes read from it, and some write to it
- There can be multiple writers and multiple readers
  - Although multiple writers are more common
To create a pipe, invoke the pipe function. Supply an integer array of size 2. The call to pipe stores the reading file descriptor in array position 0 and the writing file descriptor in position 1. For example, consider this code:
int pipe_fds[2];
Dup
dup2() duplicates a file descriptor
- Used to redirect standard input, standard output, and standard error to a pipe (or another file)
STDOUT_FILENO is the file descriptor number of standard output
Output
Bye
Another example
Output:
[P]: total 1968
Using Pipe and Dup2
Redirecting the Standard Input, Output, and Error Streams
Frequently, you’ll want to create a child process and set up one end of a pipe as its standard input or standard output.
Using the dup2 call, you can equate one file descriptor with another. For example, to redirect a process’s standard input to the read end of a pipe, fd[0], use this line:
dup2 (fd[0], STDIN_FILENO);
Example
output
[C]: Hi
Minimal example
output
[C]: Hi
Practical Example:
- output of rgen is connected to the input of a1
- output of a1 is connected to the input of a2
- output of a2 is then printed out by the procPrinter
- When procPrinter receives ctrl+d, it will exit
2 pipes are needed (rgenToA1, a1ToA2)
Note that
- pipe[0] - the read end of the pipe - is a file descriptor used to read from the pipe
- pipe[1] - the write end of the pipe - is a file descriptor used to write to the pipe
- rgen writes output to the rgenToA1 pipe, therefore rgenToA1[1] is used through dup2
- a1 reads input from the rgenToA1 pipe, therefore rgenToA1[0] is used through dup2
- a1 writes output to the a1ToA2 pipe, therefore a1ToA2[1] is used through dup2
- a2 reads input from the a1ToA2 pipe, therefore a1ToA2[0] is used through dup2
procPrinter then subscribes to the a1ToA2[1] end of the pipe because it takes input as well
// Entry point of process Printer
Written Example:
You are given two programs A and B. You would like to execute the two programs in parallel such that the standard input of B is connected to the standard output of A and the standard input of A is connected to the standard output of B. Sketch a code for the main() function of the parent process that creates the above configuration. You might need to use the following system calls: fork(), dup2(), pipe(), close(), and exec(). You can write exec(A) to mean that program A is to be executed (i.e., don’t worry about the exact syntax of exec() family of commands).
- output of A is connected to the input of B
- output of B is connected to the input of A
2 pipes are needed because each direction requires its own pipe. We name them AtoB and BtoA.
Note that
- pipe[0] - the read end of the pipe - is a file descriptor used to read from the pipe
- pipe[1] - the write end of the pipe - is a file descriptor used to write to the pipe
- A writes output to the AtoB pipe, therefore AtoB[1] is used through dup2
- A reads input from the BtoA pipe, therefore BtoA[0] is used through dup2
- B writes output to the BtoA pipe, therefore BtoA[1] is used through dup2
- B reads input from the AtoB pipe, therefore AtoB[0] is used through dup2
int main() {
Popen / Pclose
All-together package
A common use of pipes is to send data to or receive data from a program being run in a subprocess.The popen and pclose functions ease this paradigm by eliminating the need to invoke pipe, fork, dup2, exec, and fdopen.
Combines pipe, fork, dup2, exec, and fdopen!
- The call to popen creates a child process executing the given command
- The second argument to popen, “w” or “r”, indicates whether the process wants to write to or read from the child process
- The return value from popen is one end of a pipe
- The other end is connected to the standard input/output of the child process
- WARNING: popen uses the shell to execute the given command!
- Note that popen() can only open the pipe in read or write mode, not both.
popen works in one direction. If you need to both read and write, you can create pipes with pipe(), spawn a new process with fork() and the exec functions, and then redirect its input and output with dup2().
FIFO
A first-in, first-out (FIFO) file is a pipe that has a name in the filesystem. Any process can open or close the FIFO; the processes on either end of the pipe need not be related to each other. FIFOs are also called named pipes.
For processes that are unrelated (no parent-child relationship)
- In the shell: mkfifo
- In the program:
  - Open and use a FIFO exactly like a file:
int fd = open (fifo_path, O_WRONLY);
write (fd, data, data_length);
close (fd);
/dev/urandom for random numbers
// an example of reading random numbers from /dev/urandom
Practical Usage of /dev/urandom:
int getRandomInt(int lower, int upper)