Thursday, February 6, 2014

PROCESSES, FORK AND WAIT

UNDERSTANDING PROCESSES

A process consists of an address space and a single thread of control that executes within that address space, together with its required system resources.

Wow, that is quite a mouthful of jargon. In simple language, a process is essentially a running program, consisting of some program code, some data and variables, open files and an environment.

The UNIX system usually shares the code between processes so that there is only one copy of a program's code in memory at a time. The same is true for shared system libraries.

The processes running at a time can be viewed in the process tree maintained by UNIX. Each process has an identification number that is used to manage it as well as to index it in the tree.

When UNIX starts, it runs a single program, "init", which starts other system processes; these may in turn start more, gradually bringing up the complete ensemble of OS features and facilities.

CREATING NEW PROCESSES USING system FUNCTION

We can create new processes from within another program. One way is to use the system function, which takes a shell command as a string and tries to run it.

#include <stdlib.h>
#include <stdio.h>
int main(){
    printf("Running ps command with system\n");
    system ("ps -ax");
    printf("Done. \n");
    exit(0);
}
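
The program above ignores the value returned by system. As a minimal sketch, assuming we only care whether the command ran and what it exited with, the return value can be decoded with the WIFEXITED and WEXITSTATUS macros from <sys/wait.h>. The helper name run_command is ours, chosen for illustration; it is not a library function.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run a shell command with system() and return its exit status,
   or -1 if the command could not be run or did not exit normally.
   run_command is a hypothetical helper name. */
int run_command(const char *cmd){
  int status = system(cmd);
  if(status == -1)
    return -1;                  /* the child process could not be created */
  if(WIFEXITED(status))
    return WEXITSTATUS(status); /* exit status reported by the shell */
  return -1;                    /* the command was terminated by a signal */
}
```

For instance, run_command("ps -ax") would run the same command as the program above and hand back its exit status.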
	  

CREATING NEW PROCESSES USING exec FUNCTION

Another method, which gives finer control over the new program, is provided by the exec family of functions. Members of this family differ from one another in the way they locate the program and in the way they accept the program arguments. Generally speaking, they all replace the current process image with a new one built according to the arguments given: the program named by the path argument is executed in place of the one that is currently running.
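#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main(){
  printf("Running ps with execlp\n");
  /*The argument list must end with a null pointer*/
  execlp("ps","ps","-ax",(char *)NULL);
  /*The lines below run only if execlp fails*/
  printf("Done\n");
  exit(0);
}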
	  

This program prints the first message and then calls execlp to execute the ps command. The new program replaces ours, and control returns to the terminal once ps finishes. This means that the subsequent lines of our code never get executed. The only time an exec function returns is when an error occurs while starting the new program. In that case it returns -1 and errno is set accordingly.
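
The error path can be sketched as follows. This is only an illustration under the assumption that we want to report the failure and carry on; try_exec is a hypothetical helper name, and the errno value is saved before calling perror.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Try to replace the current process with prog.
   If execlp succeeds, this function never returns.
   If it fails, the error is reported and the errno value is returned.
   try_exec is a hypothetical helper name. */
int try_exec(const char *prog){
  execlp(prog, prog, (char *)NULL);
  /* We only get here if execlp failed */
  int saved = errno;
  perror("execlp");
  return saved;
}
```

Calling try_exec with the path of a program that does not exist should come back with ENOENT, while a successful call never comes back at all.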

CREATING NEW PROCESSES USING fork FUNCTION

The third way to create new processes is 'fork', which creates a new process without replacing the parent. This is a system call: it duplicates the calling process and creates a new entry in the process table with several characteristics identical to those of the parent.

A simple example of fork is given here:

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main(){
  
  /*Declare variables*/
  pid_t pid;             /*process id*/
  char* slogan;	         /*message to be printed*/
  int n;    

  /*Print a welcome message*/
  printf("+++++++++++++++++++++++++++++++++++++++++++++++++++++\n");
  printf("(%d/%d) A program with fork application\n",getpid(),getppid());

  /*Create a new process using fork*/
  pid  = fork();
  
  /*fork returns a different value in each process*/
  switch(pid){
    /*Error during fork*/
  case -1:
    perror("fork");
    exit(1);
      
    /*fork returns value zero in children*/
  case 0:
    slogan = "(%d/%d) Hello World, I am child\n";
    n=5;
    break;

    /*but not inside the parent*/
  default:	
    slogan = "(%d/%d) Hello World, This is parent\n";
    n=3;
    break;
    
  }
  /**
     THIS PART IS TEMPORARILY COMMENTED OUT
  int status;
  pid_t p = wait(&status);
  if(WIFEXITED(status)){
    printf("(%d/%d) Child exited normally w/status=%d\n",
             getpid(),getppid(),WEXITSTATUS(status));
	}
   else
   printf("(%d/%d) Child exited abnormally\n",
          getpid(),getppid());

   *************************************************/

  /*Now make each process print a characteristic number of times*/

  for(; n>0;n--){
    
    printf(slogan,getpid(),getppid());    
    sleep(2);
  }
  
  /*If you make upto this point, you are discharged honourably !*/
  exit(0);
  
}
	  

Now it is time to have a brief look at the manual page for fork(2):
Library:
    #include <unistd.h>
Prototype:
    pid_t fork(void);
Description:
    1. creates  a  new  process by duplicating the calling process
    2. child has its own unique process ID
    3. child's parent process ID is the same as the parent's process ID
    4. child does not inherit its parent's memory locks
Return:
    On Success:		
       PID of the child process is returned in the parent
       0 is returned inside the child

    On Error: returns -1, no child created and errno is set.
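
The three possible return values can be seen in a minimal sketch like the one below. It assumes a child that does nothing but exit with a given code, and it uses wait (declared in <sys/wait.h>) so that the parent can collect that code. The name child_exit_code is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with the given code, and
   return that code as seen by the parent (or -1 on failure).
   child_exit_code is a hypothetical helper name. */
int child_exit_code(int code){
  pid_t pid = fork();
  if(pid == -1)
    return -1;            /* fork failed, no child was created */
  if(pid == 0)
    _exit(code);          /* child: fork returned 0 */

  /* parent: fork returned the PID of the child */
  int status;
  if(wait(&status) != pid)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child leaves through _exit rather than exit so that it does not flush any standard I/O buffers it inherited from the parent.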
	  

Another function that is used in conjunction with fork is wait:
Library:
    #include <sys/types.h>
    #include <sys/wait.h>
Prototype:
    pid_t wait(int* status);
Description:
    1. wait for state changes in a child of the calling process
    2. obtain information about the child whose  state  has  changed
    Notes: What is a state change?
	1. the child terminated
	2. the child was stopped by a signal
	3. the child was resumed by a signal. 

    Notes: Why use wait?
       In the case of a terminated child, performing a
       wait  allows  the system to release the resources 
       associated with the child; if a wait is not performed, 
       then the terminated child remains in a  "zombie"  state 

Inspecting Status: The integer status is examined with following macros
       	   
   WIFEXITED(status): true if the child terminated normally.

   WEXITSTATUS(status): 
   	returns the exit status of the child.  
   	should be employed only if WIFEXITED returned true.

   WIFSIGNALED(status)
        returns true if the child process was terminated by a signal.

   WTERMSIG(status)
        returns the number of the signal that caused the child to terminate.
        should be employed only if WIFSIGNALED returned true.

   WCOREDUMP(status)
        returns  true if the child produced a core dump.
        should be employed only if WIFSIGNALED returned true.

   WIFSTOPPED(status)
        returns true if the child process was stopped by delivery of a signal.

   WSTOPSIG(status)
        returns the number of the signal which caused the child to  stop.

   WIFCONTINUED(status)
        returns true if the child process was resumed by delivery of SIGCONT.
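
As a rough sketch of how these macros are used together, the code below forks a child that terminates itself with a signal and lets the parent classify the resulting status. The names status_kind and status_of_signaled_child are hypothetical, and only the most common cases are distinguished.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Classify a status value filled in by wait().
   status_kind is a hypothetical helper name. */
const char *status_kind(int status){
  if(WIFEXITED(status))   return "exited";
  if(WIFSIGNALED(status)) return "signaled";
  if(WIFSTOPPED(status))  return "stopped";
  return "other";
}

/* Fork a child that terminates itself with signal sig and return
   the status collected by the parent (or -1 on failure). */
int status_of_signaled_child(int sig){
  pid_t pid = fork();
  if(pid == -1)
    return -1;
  if(pid == 0){
    raise(sig);          /* child: send the signal to itself */
    _exit(0);            /* reached only if the signal was not fatal */
  }
  int status;
  if(wait(&status) != pid)
    return -1;
  return status;
}
```

With sig set to SIGKILL, the status seen by the parent should satisfy WIFSIGNALED, and WTERMSIG should give back SIGKILL.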
	  

The wait function can be used in the above program to ensure that the parent process waits for its child to terminate before terminating itself. An illustration of the wait function is given below. It can be run inside the program shown above (just insert it before the 'for' loop). Note that wait is declared in <sys/wait.h>, so the program must include that header.
  int status;
  pid_t p  = wait(&status);
  if(WIFEXITED(status)){
    printf("(%d/%d) Child exited normally w/status=%d\n",
	   getpid(),getppid(),WEXITSTATUS(status));
  }
  else{
    printf("(%d/%d) Child exited abnormally\n", getpid(),getppid());
  }
	  

It may be useful to compare the output of this example program without and with the wait call:
* Without wait:

(7025/4265) A program with fork application
(7025/4265) Hello World, This is parent
(7026/7025) Hello World, I am child
(7025/4265) Hello World, This is parent
(7026/7025) Hello World, I am child
(7025/4265) Hello World, This is parent
(7026/7025) Hello World, I am child
(7026/7025) Hello World, I am child
anil@indica$ (7026/1903) Hello World, I am child

* With wait:
(6967/4265) A program with fork application
(6968/6967) Child exited normally w/status=0
(6968/6967) Hello World, I am child
(6968/6967) Hello World, I am child
(6968/6967) Hello World, I am child
(6968/6967) Hello World, I am child
(6968/6967) Hello World, I am child
(6967/4265) Child exited normally w/status=0
(6967/4265) Hello World, This is parent
(6967/4265) Hello World, This is parent
(6967/4265) Hello World, This is parent
	  

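Without wait, the parent finished first and the child outlived it: the shell prompt reappears before the child's last message, and the child's parent ID changes from 7025 to 1903 because the orphaned child was adopted by another process. Such a child is an orphan rather than a zombie; a zombie is a child that has already terminated but has not yet been waited for. With wait, the parent blocks until the child has finished and only then prints its own messages. The first "Child exited normally" line in that run comes from the child itself: the snippet runs in both processes, and the child's own wait call fails at once because it has no children, leaving status unset.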

SUMMARY

We have come quite a distance from our first post, where we talked about the fundamentals of the Linux OS. The second post took baby steps into I/O operations, and the third dealt with reading information from directories. In the current post, the fourth in this series, we talked a little about what a process is and how to create new processes from existing ones. We spent quite a bit of time discussing fork and worked through an example to demonstrate it. Finally we got a brief idea of the usefulness of the wait function, and compared the output of our program before and after the inclusion of wait. In our next post we shall keep building on these foundations and talk about file I/O in Linux system programming.

DIRECTORY ACCESS WITH SYSTEM CALLS

ACCESSING THE CONTENTS

We have seen earlier that a directory is a special file containing directory entries, which map file names to the actual files. We shall write a small program to list the contents of the directories specified on the command line.
First we have a look at the functions that we would use:
opendir(3)
readdir(3)
closedir(3)
	  

So let us have a look at opendir:
man 3 opendir

Library:
    #include <sys/types.h>
    #include <dirent.h>
Prototype:
       DIR* opendir(const char* name);
Description:
     opens a directory stream corresponding to the directory name, and
     returns a pointer to the directory stream. On an error NULL is
     returned and errno is set.
	  

and a peek at readdir:
man 3 readdir

Library:
    #include <dirent.h>
Prototype:
       struct dirent *readdir(DIR *dirp);
Description:
    returns next item in the directory stream dirp.
Returns:
    returns a pointer to a struct dirent that represents the next entry in directory
    stream dirp. If there is an error or end of directory is reached
    it will return NULL.
	  

It will be useful to know the composition of dirent struct:
         struct dirent {
               ino_t          d_ino;       /* inode number */
               off_t          d_off;       /* not an offset; see NOTES */
               unsigned short d_reclen;    /* length of this record */
               unsigned char  d_type;      /* type of file; not supported
                                              by all filesystem types */
               char           d_name[256]; /* filename */
           };
	  

and a quick look at closedir:
man closedir

Library:
    #include <sys/types.h>
    #include <dirent.h>
Prototype:
       int closedir(DIR *dirp);
Description:
    close directory stream associated with dirp.
Returns:
    On Success: The value zero is returned.
    On Failure: The value -1 is returned, errno is set accordingly.
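
Before the full program, here is a minimal sketch of how the three calls fit together. It assumes we only want to count the entries of a directory, leaving out the "." and ".." entries that every directory contains. The name count_entries is hypothetical.

```c
#include <dirent.h>
#include <string.h>

/* Count the entries of a directory, skipping "." and "..".
   Returns -1 if the directory cannot be opened.
   count_entries is a hypothetical helper name. */
int count_entries(const char *path){
  DIR *dp = opendir(path);
  if(!dp)
    return -1;

  int count = 0;
  struct dirent *entry;
  while((entry = readdir(dp)) != NULL){
    /* d_name holds the file name of the current entry */
    if(strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
      continue;
    count++;
  }
  closedir(dp);
  return count;
}
```

This sketch does not distinguish a read error from the end of the directory; the full program handles that case through errno.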
	  

Now is the time to dive into the actual program.
/*
Created By: Anil Singh, NIU, IL-60115
Description: Receive name of directory, and list contents.
*/

#include<sys/types.h>
#include<dirent.h>
#include<stdio.h>
#include<errno.h>
#include<string.h>
#include<stdlib.h>
static void scanDir ( const char* dir ){

  printf("Directory : %s\n",dir);
  // pointers for structures for directory processing
  DIR* dp;
  struct dirent* dirp;
  // Open directory for reading.
  errno=0;
  dp = opendir ( dir );
  if ( !dp ) {
    fprintf(stderr,"Can't open %s: %s\n",dir,strerror(errno));
    return;
  }

  // Read and display contents of directory.
  // readdir returns NULL both at the end and on error, so errno is
  // cleared before each call and checked after the loop.
  errno=0;
  while ( ( dirp = readdir ( dp ) ) ){
    printf ( "%s\n", dirp->d_name );
    errno=0;
  }
  if(errno){
    fprintf(stderr,"can't read %s: %s\n",dir,strerror(errno));
    closedir(dp);
    return;
  }
  // Close directory.
  if ( closedir(dp)<0 ){
    fprintf(stderr,"can't close %s: %s\n",dir,strerror(errno));
    return;
  }
}

int main(int argc, char* argv[]){

  /*If no argument is supplied, set current directory*/
  int i;
  if(argc==1){
    scanDir(".");
  }
  else
    for( i=1; i<argc; i++){
      scanDir(argv[i]);
    }
  return 0;
}

	  

SUMMARY

We saw how to open, read and close directories under LINUX and also a bit on how to deal with the errors.

Wednesday, February 5, 2014

THE BASIC INPUT AND OUTPUT OPERATIONS

THE I/O PROCESSING

One of the ways to classify the Unix I/O is to think of it as buffered or unbuffered.

The Unbuffered I/O has following path of data flow:

 Data->Kernel Buffer->Disk
       

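System calls such as read, write, open and close are unbuffered: no buffer is kept in user space on their behalf. (gets and puts are not system calls; they belong to the standard I/O library and are buffered.)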

The buffered I/O has following path of data flow:

 Data->User Buffer -> Kernel Buffer -> Disk
       

The standard I/O functions are usually buffered: fread, fwrite, fopen, fclose, getchar, putchar.

Please note that STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO are file descriptors: integer constants of type 'int' (0, 1 and 2) that are passed to system calls such as read and write. They are not system calls themselves.

Please note that stdin, stdout and stderr are standard I/O streams, and they are pointers of type FILE*.
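
The link between the two kinds of names can be checked with the POSIX function fileno, which returns the file descriptor behind a stream. The sketch below is only an illustration; fileno is not used elsewhere in this post, and std_streams_match is a hypothetical name.

```c
#include <stdio.h>
#include <unistd.h>

/* Return 1 if each standard stream wraps the matching descriptor,
   0 otherwise. std_streams_match is a hypothetical helper name. */
int std_streams_match(void){
  return fileno(stdin)  == STDIN_FILENO  &&
         fileno(stdout) == STDOUT_FILENO &&
         fileno(stderr) == STDERR_FILENO;
}
```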

USING UNBUFFERED I/O

In this section we are going to encounter following functions:
read(2)
write(2)
perror(3)
exit(3)
     

Before going ahead and creating a program, let us have a brief look at the information from man pages. Here is what we have for read function:
man 2 read
Library:
#include <unistd.h>
Prototype:
  ssize_t read(int fd, void *buf, size_t count);
Description:
  attempts to read up to count bytes from file
  descriptor fd into the buffer starting at buf.
Notes:
  0. If count=0, then zero is returned and no other effects.
  1. Commences at current file offset.
  2. If offset is past end of file, nothing read, zero returned
  3. If count > SSIZE_MAX, result unspecified.
Return:
  On Success: number of bytes read.
  On Error: returns -1, set 'errno'
       

Now we shall look at the function perror(3):
Library:
    #include <stdio.h>
    #include <errno.h>
Prototype:
    void perror(const char *s);
Description:
    produces a message on the standard error output,
    describing the last error encountered during a call to
    a system or library function
Return:
    no return value
       

Finally we need to know about the exit call
Library:
  #include <stdlib.h>
Prototype:
  void exit(int status);
Description:
  causes normal process termination and the value of status & 0377 is returned to the parent
Return:
  no return
   

Now here comes the exciting part: the actual program.
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>

#define BUFFERSIZE  8192

int main(int argc, char* argv[]){
  /*Size of data read in single go*/
  ssize_t numRead;

  /*Buffer to store the data read in one go*/
  char buffer[BUFFERSIZE];

  /*attempt a first read from STDIN_FILENO*/
  numRead = read ( STDIN_FILENO, buffer, BUFFERSIZE );

  while(numRead>0){

    /*Write numRead bytes into STDOUT_FILENO*/
    ssize_t numWrite = write (STDOUT_FILENO, buffer, numRead);

    /*Check if all the bytes sent to write, got written*/
    if(numRead != numWrite){
      perror("Error while writing");
      exit(1);
    }

    /*Read the next record.*/
    numRead = read ( STDIN_FILENO, buffer, BUFFERSIZE );
  }

  /*If we see a negative value of numRead, its error*/
  if(numRead<0){
    perror("Error while reading");
    exit(2);
  }

  /*If you manage to reach here, exit normally*/
  exit(0);
}
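
The program above treats a short write as an error. A write call may, however, legitimately write fewer bytes than requested (for example to a pipe), so a more forgiving approach is to keep writing until everything is out. The sketch below shows one way to do that; write_all is a hypothetical helper name, and retrying on EINTR is our assumption about the desired behaviour.

```c
#include <errno.h>
#include <unistd.h>

/* Write exactly count bytes from buf to fd, retrying after short
   writes and interrupted calls. Returns 0 on success, -1 on error.
   write_all is a hypothetical helper name. */
int write_all(int fd, const void *buf, size_t count){
  const char *p = buf;
  while(count > 0){
    ssize_t n = write(fd, p, count);
    if(n < 0){
      if(errno == EINTR)
        continue;          /* interrupted by a signal: try again */
      return -1;
    }
    p += n;                /* skip the bytes already written */
    count -= (size_t)n;
  }
  return 0;
}
```

In the copy loop, a call such as write_all(STDOUT_FILENO, buffer, numRead) could then take the place of the direct call to write.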
 

USING BUFFERED I/O

In this program we shall use buffered I/O methods to do what we did in the last program. Here are the functions that we are going to use:
fgetc(3)
fputc(3)
ferror(3)
perror(3)
exit(3)

So we start with man information on fgetc:
Library:
    #include <stdio.h>
Prototype:
    int fgetc(FILE *stream);
Description:
    reads the next character from stream.
    
Returns:
    returns the character read as an unsigned char cast to an int,
    or EOF on end of file or error.

And here is a bit on fputc:
Library:
    #include <stdio.h>
Prototype:
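    int fputc(int c, FILE *stream);
Description:
    writes the character c, cast to an unsigned char, to stream.
Returns:
    returns the character written as an unsigned char cast to an int,
    or EOF on error.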

and finally we need to talk about ferror:
Library:
    #include <stdio.h>
Prototype:
    int ferror(FILE *stream);
Description:
    tests the error indicator on stream.
Returns:
    if error indicator on stream is set, ferror returns nonzero

And now we are ready to have a look at our program:
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>

#define BUFFERSIZE  8192

int main(int argc, char* argv[]){
  
  int datum; /*int rather than char, so that EOF can be told apart from data*/

  /*Attempt a first read from stdin*/
  datum = fgetc (stdin);

  while(datum != EOF){
    
    /*Write the datum into stdout*/
    if(fputc(datum, stdout)==EOF){
      perror("Error while trying to write: fputc");
      exit(1);
    }
    /*Read the next record.*/
    datum = fgetc (stdin);
  }


  /*If a stdin error brought you here: its bad*/
  if(ferror(stdin)){
    perror("Error while reading: fgetc");
    exit(2);
  }

  /*If you manage to reach here, exit normally*/
  exit(0);
}
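
The same loop can be packaged as a function that works on any pair of streams. The sketch below is one possible arrangement, not the only one; copy_stream is a hypothetical name. The character is held in an int because fgetc returns an int: with a char, a data byte of value 0xFF could be mistaken for EOF, or EOF might never be detected, depending on whether char is signed.

```c
#include <stdio.h>

/* Copy in to out one character at a time.
   Returns the number of characters copied, or -1 on error.
   copy_stream is a hypothetical helper name. */
long copy_stream(FILE *in, FILE *out){
  long copied = 0;
  int c;                   /* int, so EOF stays distinct from byte 0xFF */
  while((c = fgetc(in)) != EOF){
    if(fputc(c, out) == EOF)
      return -1;           /* write error */
    copied++;
  }
  /* fgetc returns EOF for both end of file and error */
  return ferror(in) ? -1 : copied;
}
```

Calling copy_stream(stdin, stdout) would reproduce the behaviour of the program above.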

SUMMARY

In the last post we took a brief tour of the basic ideas that are needed to begin Linux system programming. In this post we developed those ideas further and examined the buffered and unbuffered I/O modes. In the next post we shall continue to build upon these points and see how to open and read directories.

INTRODUCTION TO UNIX AND UNIX LIKE SYSTEMS

WHAT IS UNIX?

The essentials of a UNIX or UNIX-like operating system consist of a 'kernel' and 'system programs'. In addition there are 'application programs' that help with specific user tasks.

The kernel constitutes the heart of the OS and performs several low-level chores, such as keeping track of the files on disk, starting and running programs, managing resources for the processes, and communicating with the network by exchanging packets of information. Another way to look at the kernel is to think of it as an interface between the hardware and the users. It prevents direct access to the hardware and forces processes to go through the tools (system calls) that it provides.

The system calls provided by the kernel are used by system programs to implement the various facilities provided by the operating system. The "mount" command is an example of a system program. Application programs can also use the system calls, but they are not system programs, because their intent is to assist the user with a task and not to get the system working.

The line between what is a system program and what is an application program is not sharp, and several programs fall midway between the two.

The UNIX/UNIX-like OS provide several services some of which we shall list here:

  1. The "first" Process: init
  2. Logins from terminals: getty+login
  3. System Logs: syslog
  4. Periodic command execution: cron, at service
  5. Graphical User Interface: X Window System
  6. Networking Support
  7. Network Logins: telnet, rlogin, ssh
  8. Network File Systems: NFS (Supported by Kernel),CIFS
  9. Electronic mail, Printer Support.

Finally we have a USER MANUAL that comes installed on most UNIX/LINUX machines and can be used for quick reference. Here is a summary of the sections of the LINUX MANUAL:
  1. Executable programs or shell commands
  2. System calls (functions provided by the kernel)
  3. Library calls (functions within program libraries)
  4. Special files (usually found in /dev)
  5. File formats and conventions eg /etc/passwd
  6. Games
  7. Miscellaneous (including macro packages and conventions)
  8. System administration commands (usually only for root)
  9. Kernel routines [Non standard]

BASIC TERMINOLOGY OF UNIX SYSTEM PROGRAMMING

We shall follow a very concise approach with less talk and more programming. But before we begin it will be worthwhile to have a look at following terms:
  1. File: A reasonably good definition of a file in a UNIX system is that it is an object, residing in either external or internal memory, that can be written to, read from, or both.
  2. File System: The file system refers to tree like hierarchy into which the files are organised in UNIX.
  3. Directories: The directories are special files that represent the nodes of the file hierarchy and appear to the user as if they contain other files. In reality they contain directory entries, which are objects that associate a file with its filename, thus giving the impression that the actual files reside there.
  4. Processes: Although we shall discuss the idea of what a process is in detail, it shall suffice to note here that it is an instance of running a program. Each process has an 'id' number called "pid" as well as a 'parent-process-id' called "ppid", about both of which we shall talk more later.

Now that we have some idea of what is involved in our endeavor to start with UNIX/LINUX system programming, it's time to start!

A "HELLO SYSTEMS WORLD" PROGRAM

Here's our first program which will print a simple message on to the terminal.
#include <unistd.h>
int main(int argc, char* argv[]){
  const char *buffer = "Hello Systems World !\n";
  //Write to console, 22 bytes from the buffer.
  write(1,buffer,22);
  return 0;
}

Output:
 Hello Systems World !
 
The main function defines a C string and then invokes the "write(2)" function to print it on the screen. The function "write(2)" is declared in the header "unistd.h" and can be seen in the man pages as follows:
man 2 write
 
Some brief excerpts from the man information are here:
Library:
    #include <unistd.h>
Prototype:
    ssize_t write(int fd, const void *buf, size_t count);
Description:
    writes up to count bytes from the buffer pointed to by buf
    to the file associated with descriptor fd.
Return:
    On Success: number of bytes written.
    On Error: returns -1, set 'errno'
 
Now it is pretty easy to see what is happening in the program. We create a buffer to hold the message, then pass it to write along with the number 22 (the length of the message in bytes) and the file descriptor of the terminal (1, the standard output). And lo, we have our first on-screen print!
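
Counting the bytes by hand is error prone. A small sketch of an alternative, assuming the message is an ordinary null-terminated C string, is to let strlen compute the length. The helper name say is hypothetical.

```c
#include <string.h>
#include <unistd.h>

/* Write a C string to standard output and return whatever write()
   returned: the number of bytes written, or -1 on error.
   say is a hypothetical helper name. */
ssize_t say(const char *msg){
  return write(STDOUT_FILENO, msg, strlen(msg));
}
```

Here STDOUT_FILENO is the symbolic name for descriptor 1, so say("Hello Systems World !\n") performs the same write as the program above.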

SUMMARY

We have taken a brief tour of the fundamental ideas that we need to get started with system programming. We also saw a very simple application and learned about the use of the "write" call. In the next post we shall take a longer look at I/O operations and learn about the buffered and unbuffered modes of doing them.