my blog

Thursday, 22 January 2015

Notes on half-duplex pipes:

• Two-way pipes can be created by opening up two pipes, and properly reassigning the
file descriptors in the child process.

• The pipe() call must be made BEFORE a call to fork(), or the descriptors will not be
inherited by the child! (same for popen()).

• With half-duplex pipes, any connected processes must share a related ancestry. Since
the pipe resides within the confines of the kernel, any process that is not in the ancestry
of the creator of the pipe has no way of addressing it. This is not the case with
named pipes (FIFOs).

Atomic Operations with Pipes

In order for an operation to be considered “atomic”, it must not be interrupted for any
reason at all. The entire operation occurs at once. The POSIX standard dictates in
/usr/include/posix1_lim.h that the minimum value an implementation must support for an
atomic operation on a pipe is:
#define _POSIX_PIPE_BUF 512
Up to 512 bytes can be written to a pipe atomically. Anything that
crosses this threshold may be split, and is not guaranteed to be atomic. Under Linux, however,
the atomic operational limit is defined in “linux/limits.h” as:
#define PIPE_BUF 4096
As you can see, Linux exceeds the minimum number of bytes required by
POSIX by a considerable margin. The atomicity of a pipe operation becomes important
when more than one process is involved (FIFOs). For example, if the number of
bytes written to a pipe exceeds the atomic limit for a single operation, and multiple processes
are writing to the pipe, the data may be “interleaved” or “chunked”. In other words,
one process may insert data into the pipeline between the writes of another.

Creating Pipes in C

Creating “pipelines” with the C programming language can be a bit more involved than our
simple shell example. To create a simple pipe with C, we make use of the pipe() system
call. It takes a single argument, which is an array of two integers, and if successful, the
array will contain two new file descriptors to be used for the pipeline. After creating a
pipe, the process typically spawns a new process (remember the child inherits open file
descriptors).
SYSTEM CALL: pipe();
PROTOTYPE: int pipe( int fd[2] );
RETURNS: 0 on success
         -1 on error: errno = EMFILE (no free descriptors)
                              ENFILE (system file table is full)
                              EFAULT (fd array is not valid)
NOTES: fd[0] is set up for reading, fd[1] is set up for writing
The first integer in the array (element 0) is set up and opened for reading, while the
second integer (element 1) is set up and opened for writing. Visually speaking, the output
of fd[1] becomes the input for fd[0]. Once again, all data traveling through the pipe moves
through the kernel.
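#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
int main(void)
{
    int fd[2];
    /* Create the pipe: fd[0] is the read end, fd[1] the write end */
    if(pipe(fd) == -1)
    {
        perror("pipe");
        exit(1);
    }
    /* ... */
    return(0);
}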

Remember that an array name in C decays into a pointer to its first member. Above,
fd is equivalent to &fd[0]. Once we have established the pipeline, we then fork our new
child process:
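#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
int main(void)
{
    int fd[2];
    pid_t childpid;
    if(pipe(fd) == -1)
    {
        perror("pipe");
        exit(1);
    }
    /* Fork after pipe() so that the child inherits both descriptors */
    if((childpid = fork()) == -1)
    {
        perror("fork");
        exit(1);
    }
    /* ... */
    return(0);
}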
If the parent wants to receive data from the child, it should close fd[1], and the child
should close fd[0]. If the parent wants to send data to the child, it should close fd[0], and
the child should close fd[1]. Since descriptors are shared between the parent and child, we
should always be sure to close the end of the pipe we aren’t concerned with. On a technical
note, read() will never return EOF while any copy of the write end is still open, so the
unnecessary ends of the pipe must be explicitly closed.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
int main(void)
{
    int fd[2];
    pid_t childpid;
    if(pipe(fd) == -1)
    {
        perror("pipe");
        exit(1);
    }
    if((childpid = fork()) == -1)
    {
        perror("fork");
        exit(1);
    }
    if(childpid == 0)
    {
        /* Child process closes up input side of pipe */
        close(fd[0]);
    }
    else
    {
        /* Parent process closes up output side of pipe */
        close(fd[1]);
    }
    /* ... */
    return(0);
}
As mentioned previously, once the pipeline has been established, the file descriptors
may be treated like descriptors to normal files.
/*****************************************************************************
Excerpt from "Linux Programmer’s Guide - Chapter 6"
(C)opyright 1994-1995, Scott Burkett
*****************************************************************************
MODULE: pipe.c
*****************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
int main(void)
{
    int fd[2], nbytes;
    pid_t childpid;
    char string[] = "Hello, world!\n";
    char readbuffer[80];
    if(pipe(fd) == -1)
    {
        perror("pipe");
        exit(1);
    }
    if((childpid = fork()) == -1)
    {
        perror("fork");
        exit(1);
    }
    if(childpid == 0)
    {
        /* Child process closes up input side of pipe */
        close(fd[0]);
        /* Send "string" through the output side of pipe */
        write(fd[1], string, strlen(string));
        exit(0);
    }
    else
    {
        /* Parent process closes up output side of pipe */
        close(fd[1]);
        /* Read in a string from the pipe, leaving room for a terminator */
        nbytes = read(fd[0], readbuffer, sizeof(readbuffer) - 1);
        if(nbytes < 0)
        {
            perror("read");
            exit(1);
        }
        /* read() does not add a terminating NUL, so add one before printing */
        readbuffer[nbytes] = '\0';
        printf("Received string: %s", readbuffer);
    }
    return(0);
}
Often, the descriptors in the child are duplicated onto standard input or output. The
child can then exec() another program, which inherits the standard streams. Let’s look at
the dup() system call:
SYSTEM CALL: dup();
PROTOTYPE: int dup( int oldfd );
RETURNS: new descriptor on success
         -1 on error: errno = EBADF (oldfd is not a valid descriptor)
                              EMFILE (too many descriptors for the process)
NOTES: the old descriptor is not closed! Both may be used interchangeably
Although the old descriptor and the newly created descriptor can be used interchangeably,
we will typically close one of the standard streams first. The dup() system call uses
the lowest-numbered, unused descriptor for the new one.
Consider:
.
.
childpid = fork();
if(childpid == 0)
{
    /* Close up standard input of the child */
    close(0);
    /* Duplicate the input side of pipe to stdin */
    dup(fd[0]);
    execlp("sort", "sort", NULL);
    .
}
Since file descriptor 0 (stdin) was closed, the call to dup() duplicated the input descriptor
of the pipe (fd[0]) onto its standard input. We then make a call to execlp(), to overlay
the child’s text segment (code) with that of the sort program. Since newly exec’d programs
inherit standard streams from their spawners, it actually inherits the input side of the pipe
as its standard input! Now, anything that the original parent process sends to the pipe goes
into the sort facility.
There is another system call, dup2(), which can be used as well. This particular call
originated with Version 7 of UNIX, and was carried on through the BSD releases and is
now required by the POSIX standard.
SYSTEM CALL: dup2();
PROTOTYPE: int dup2( int oldfd, int newfd );
RETURNS: new descriptor on success
         -1 on error: errno = EBADF (oldfd is not a valid descriptor)
                              EBADF (newfd is out of range)
                              EMFILE (too many descriptors for the process)
NOTES: if newfd is already open, it is closed first; oldfd itself stays open
With this particular call, we have the close operation, and the actual descriptor duplication,
wrapped up in one system call. In addition, it is guaranteed to be atomic, which
essentially means that it will never be interrupted by an arriving signal. The entire operation
will transpire before returning control to the kernel for signal dispatching. With the
original dup() system call, programmers had to perform a close() operation before calling
it. That resulted in two system calls, with a small degree of vulnerability in the brief
amount of time which elapsed between them. If a signal arrived during that brief instant
and its handler opened a file, that file could claim the freed descriptor, and dup() would
then return a different one than intended. Of course, dup2() solves this problem for us.
Consider:
.
.
childpid = fork();
if(childpid == 0)
{
    /* Close stdin, duplicate the input side of pipe to stdin */
    dup2(fd[0], 0);
    execlp("sort", "sort", NULL);
    .
    .
}

Pipes the Easy Way!

If all of the above ramblings seem like a very round-about way of creating and utilizing
pipes, there is an alternative.
LIBRARY FUNCTION: popen();
PROTOTYPE: FILE *popen ( const char *command, const char *type);
RETURNS: new file stream on success
         NULL on unsuccessful fork() or pipe() call
NOTES: creates a pipe, and performs fork/exec operations using "command"
This standard library function creates a half-duplex pipeline by calling pipe() internally.
It then forks a child process, execs the Bourne shell, and executes the “command” argument
within the shell. Direction of data flow is determined by the second argument, “type”. It
can be “r” or “w”, for “read” or “write”. It cannot be both! Under Linux, the pipe will be
opened up in the mode specified by the first character of the “type” argument. So, if you
try to pass “rw”, it will only open it up in “read” mode. Newer C libraries may instead
reject such a string outright, so pass exactly “r” or “w”.
While this library function performs quite a bit of the dirty work for you, there is a
substantial tradeoff. You lose the fine control you once had by using the pipe() system
call, and handling the fork/exec yourself. However, since the Bourne shell is used directly,
shell metacharacter expansion (including wildcards) is permissible within the “command”
argument.
Pipes which are created with popen() must be closed with pclose(). By now, you have
probably realized that popen/pclose share a striking resemblance to the standard file stream
I/O functions fopen() and fclose().
LIBRARY FUNCTION: pclose();
PROTOTYPE: int pclose( FILE *stream );
RETURNS: exit status of wait4() call
         -1 if "stream" is not valid, or if wait4() fails
NOTES: waits on the pipe process to terminate, then closes the stream.
The pclose() function performs a wait4() on the process forked by popen(). When it
returns, it destroys the pipe and the file stream. Once again, it is analogous to the
fclose() function for normal stream-based file I/O.
Consider this example, which opens up a pipe to the sort command, and proceeds to
sort an array of strings:
/*****************************************************************************
Excerpt from "Linux Programmer’s Guide - Chapter 6"
(C)opyright 1994-1995, Scott Burkett
*****************************************************************************
MODULE: popen1.c
*****************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#define MAXSTRS 5
int main(void)
{
    int cntr;
    FILE *pipe_fp;
    char *strings[MAXSTRS] = { "echo", "bravo", "alpha",
                               "charlie", "delta"};
    /* Create one way pipe line with call to popen() */
    if (( pipe_fp = popen("sort", "w")) == NULL)
    {
        perror("popen");
        exit(1);
    }
    /* Processing loop */
    for(cntr=0; cntr<MAXSTRS; cntr++) {
        fputs(strings[cntr], pipe_fp);
        fputc('\n', pipe_fp);
    }
    /* Close the pipe */
    pclose(pipe_fp);
    return(0);
}
Since popen() uses the shell to do its bidding, all shell expansion characters and
metacharacters are available for use! In addition, more advanced techniques such as
redirection, and even output piping, can be utilized with popen(). Consider the following
sample calls:
popen("ls ~scottb", "r");
popen("sort > /tmp/foo", "w");
popen("sort | uniq | more", "w");
As another example of popen(), consider this small program, which opens up two pipes
(one to the ls command, the other to sort):
/*****************************************************************************
Excerpt from "Linux Programmer’s Guide - Chapter 6"
(C)opyright 1994-1995, Scott Burkett
*****************************************************************************
MODULE: popen2.c
*****************************************************************************/
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    FILE *pipein_fp, *pipeout_fp;
    char readbuf[80];
    /* Create one way pipe line with call to popen() */
    if (( pipein_fp = popen("ls", "r")) == NULL)
    {
        perror("popen");
        exit(1);
    }
    /* Create one way pipe line with call to popen() */
    if (( pipeout_fp = popen("sort", "w")) == NULL)
    {
        perror("popen");
        exit(1);
    }
    /* Processing loop: copy the output of ls into the input of sort */
    while(fgets(readbuf, sizeof(readbuf), pipein_fp))
        fputs(readbuf, pipeout_fp);
    /* Close the pipes */
    pclose(pipein_fp);
    pclose(pipeout_fp);
    return(0);
}
For our final demonstration of popen(), let’s create a generic program that opens up a
pipeline between a passed command and filename:
/*****************************************************************************
Excerpt from "Linux Programmer’s Guide - Chapter 6"
(C)opyright 1994-1995, Scott Burkett
*****************************************************************************
MODULE: popen3.c
*****************************************************************************/
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
    FILE *pipe_fp, *infile;
    char readbuf[80];
    if( argc != 3) {
        fprintf(stderr, "USAGE: popen3 [command] [filename]\n");
        exit(1);
    }
    /* Open up input file */
    if (( infile = fopen(argv[2], "r")) == NULL)
    {
        perror("fopen");
        exit(1);
    }
    /* Create one way pipe line with call to popen() */
    if (( pipe_fp = popen(argv[1], "w")) == NULL)
    {
        perror("popen");
        exit(1);
    }
    /* Processing loop: stop as soon as fgets() reports EOF or an error */
    while(fgets(readbuf, sizeof(readbuf), infile))
        fputs(readbuf, pipe_fp);
    fclose(infile);
    pclose(pipe_fp);
    return(0);
}
Try this program out, with the following invocations:
popen3 sort popen3.c
popen3 cat popen3.c
popen3 more popen3.c
popen3 cat popen3.c | grep main

HALF-DUPLEX UNIX PIPES

Construction of the pipeline is now complete! The only thing left to do is make use of
the pipe. To access a pipe directly, the same system calls that are used for low-level file I/O
can be used (recall that pipes are actually represented internally as a valid inode).
To send data to the pipe, we use the write() system call, and to retrieve data from the
pipe, we use the read() system call. Remember, low-level file I/O system calls work with
file descriptors! However, keep in mind that certain system calls, such as lseek(), do not
work with descriptors to pipes.

Saturday, 17 January 2015

LINUX INTERPROCESS COMMUNICATIONS

The ls | sort | lp command line sets up a pipeline, taking the output of ls as the input of
sort, and the output of sort as the input of lp. The data is running through a half-duplex
pipe, traveling (visually) left to right through the pipeline.

Although most of us use pipes quite religiously in shell script programming, we often
do so without giving a second thought to what transpires at the kernel level.

When a process creates a pipe, the kernel sets up two file descriptors for use by the
pipe. One descriptor is used to allow a path of input into the pipe (write), while the other
is used to obtain data from the pipe (read). At this point, the pipe is of little practical use,
as the creating process can only use the pipe to communicate with itself. Consider this
representation of a process and the kernel after a pipe has been created:

From the above diagram, it is easy to see how the descriptors are connected together. If
the process sends data through the pipe (fd[1]), it has the ability to obtain (read) that
information from fd[0]. However, there is a much larger objective of the simplistic sketch
above. While a pipe initially connects a process to itself, data traveling through the pipe
moves through the kernel. Under Linux, in particular, pipes are actually represented
internally with a valid inode. Of course, this inode resides within the kernel itself, and not
within the bounds of any physical file system. This particular point will open up some pretty
handy I/O doors for us, as we will see a bit later on.

At this point, the pipe is fairly useless. After all, why go to the trouble of creating a
pipe if we are only going to talk to ourselves? At this point, the creating process typically
forks a child process. Since a child process will inherit any open file descriptors from the
parent, we now have the basis for multiprocess communication (between parent and child).
Consider this updated version of our simple sketch:

Above, we see that both processes now have access to the file descriptors which constitute
the pipeline. It is at this stage that a critical decision must be made. In which direction
do we desire data to travel? Does the child process send information to the parent, or vice
versa? The two processes mutually agree on this issue, and proceed to “close” the end
of the pipe that they are not concerned with. For discussion purposes, let’s say the child
performs some processing, and sends information back through the pipe to the parent. Our
newly revised sketch would appear as such:

Linux Interprocess communications

B. Scott Burkett, scottb@intnet.net v1.0, 29 March 1995

6.1 Introduction

The Linux IPC (Inter-process communication) facilities provide a method for multiple processes
to communicate with one another. There are several methods of IPC available to
Linux C programmers:

• Half-duplex UNIX pipes

• FIFOs (named pipes)

• SYSV style message queues

• SYSV style semaphore sets

• SYSV style shared memory segments

• Networking sockets (Berkeley style) (not covered in this paper)

• Full-duplex pipes (STREAMS pipes) (not covered in this paper)

These facilities, when used effectively, provide a solid framework for client/server development
on any UNIX system (including Linux).

6.2 Half-duplex UNIX Pipes

6.2.1 Basic Concepts

Simply put, a pipe is a method of connecting the standard output of one process to the
standard input of another. Pipes are the eldest of the IPC tools, having been around since
the earliest incarnations of the UNIX operating system. They provide a method of one-way
communications (hence the term half-duplex) between processes.

This feature is widely used, even on the UNIX command line (in the shell).

ls | sort | lp

The “swiss army knife” ioctl

ioctl stands for input/output control and is used to manipulate a character device via a
file descriptor. The format of ioctl is

ioctl(unsigned int fd, unsigned int request, unsigned long argument).

The return value is -1 if an error occurred and a value greater than or equal to 0 if the
request succeeded, just like other system calls. The kernel distinguishes special and regular
files. Special files are mainly found in /dev and /proc. They differ from regular files in that
they hide an interface to a driver rather than to a real (regular) file that contains text or
binary data. This is the UNIX philosophy, and it allows the use of normal read/write operations
on every file. But if you need to do more with a special file or a regular file, you can do it
with ... yes, ioctl. You need ioctl more often for special files than for regular files, but it’s
possible to use ioctl on regular files as well.