Monday, April 28, 2014

CREATING A UDP SERVER - I

Introduction

In this post and the next, we show how to implement a server application with the User Datagram Protocol (UDP), a fast, lightweight, unreliable transport for data between TCP/IP hosts. UDP messages are sent encapsulated in IP datagrams and provide a connectionless service with no built-in guarantee of delivery or sequence preservation.

Creating UDP Sockets

In a TCP/IP network, the endpoints between which communication takes place are called sockets. Working with sockets is very similar to working with files, in the sense that both are accessed using handles called file descriptors. There are several differences too: sockets have addresses while files do not, and a socket cannot be accessed randomly the way a file can with fseek(). Below we describe the function that is used to create a Linux socket:


NAME
       socket - create an endpoint for communication

SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>

       int socket(int domain, int type, int protocol);

DESCRIPTION
       socket() creates an endpoint for communication and 
       returns a descriptor.
       
       The domain argument specifies a communication domain; 
       this selects the protocol family which will be used for
       communication.  

       The type argument specifies communication semantics.

       The  protocol  specifies  a  particular protocol to be 
       used with the socket.
   

Since we are going to study a UDP/IP server, the appropriate domain is 'AF_INET', which corresponds to the IPv4 Internet protocols, and the type is 'SOCK_DGRAM', which supports datagrams: connectionless, unreliable messages of a fixed maximum length. The last parameter, 'protocol', matters when more than one protocol supports a particular socket type within a given protocol family; it then selects the particular one. In our case, since only one protocol (UDP) supports 'SOCK_DGRAM' in the 'AF_INET' family, this parameter is set to 0.
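To make the call concrete, here is a minimal sketch of creating such a socket. The helper names make_udp_socket and socket_type are our own (hypothetical), and the SO_TYPE query is only there to confirm what the kernel created.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>   /* close(), for the caller */

/* Hypothetical helper: returns a UDP/IPv4 socket descriptor, or -1 on error. */
int make_udp_socket(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);   /* 0: the only protocol, UDP */
    if (fd < 0)
        perror("socket");
    return fd;
}

/* Hypothetical helper: asks the kernel which type the socket has.
   Returns the type (e.g. SOCK_DGRAM), or -1 on error. */
int socket_type(int fd)
{
    int type = -1;
    socklen_t len = sizeof type;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

A caller would keep the returned descriptor for the later bind, recvfrom and sendto calls, and close() it when done.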

Naming a Socket

Once a socket is created and its file descriptor returned, we need to associate an IP address and port number with it. The port number and IPv4 address are 2-byte and 4-byte values, respectively, placed in packets for the purposes of multiplexing and routing. This association is made using the bind function:


NAME
       bind - bind a name to a socket

SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>

       int bind(int sockfd, const struct sockaddr *addr,
                socklen_t addrlen);

DESCRIPTION
       bind() assigns the address specified by addr to the 
       socket referred to by the file descriptor sockfd.

       addrlen specifies the size, in bytes, of the address 
       structure pointed to by addr.

RETURN
       On success : 0
       On failure : -1, errno is set
   

Note the datatype struct sockaddr for the addr parameter:

struct sockaddr {
  sa_family_t sa_family;
  char        sa_data[14];
};
   

In practice, the actual structure used to hold the address depends on the address family (for AF_INET it is struct sockaddr_in). A pointer to it is passed to bind after a cast to struct sockaddr *. Note: Most of the time the IP address of the server host is not known in advance, or there may be more than one address associated with the host. In such cases we may set the IP address to 'INADDR_ANY', which ensures that datagrams sent to the specified port are directed to this socket regardless of which of the host's addresses they were sent to. If we do not want this sweeping behavior, we may use bind to specify which one of the host's IP addresses is bound to the port.
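As a sketch of the cast and of INADDR_ANY in use, the following fills in a struct sockaddr_in and binds it. The helper names bind_any and bound_port are hypothetical; passing port 0 (an assumption made here for convenience) lets the kernel choose a free port, which bound_port then reads back.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: bind fd to INADDR_ANY on the given port
   (host byte order). Port 0 asks the kernel to pick a free port.
   Returns 0 on success, -1 on failure with errno set. */
int bind_any(int fd, uint16_t port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local address */
    addr.sin_port        = htons(port);        /* network byte order */

    /* The family-specific struct is cast to the generic struct sockaddr. */
    return bind(fd, (const struct sockaddr *)&addr, sizeof addr);
}

/* Hypothetical helper: the port the socket is bound to, or -1 on error. */
int bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

To bind to one particular address instead, sin_addr would be filled with that address rather than INADDR_ANY.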

Receive Queries

Once bind is done, the next step is to wait for and receive a message. Here we shall use the recvfrom function:

NAME
      recv, recvfrom, recvmsg - receive a message from a socket

SYNOPSIS
       #include <sys/types.h>
       #include <sys/socket.h>
       ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
                        struct sockaddr *src_addr, socklen_t *addrlen);

DESCRIPTION
       recvfrom() receives messages from a socket, and may be used 
       to receive data on a socket whether or not it is connection-oriented.
       
       buf     : buffer to hold the incoming datagram.
       src_addr: an empty sockaddr struct, to receive the sender's address.
       addrlen : this variable must be initialized to the size of src_addr, and
                 is modified on return to indicate the actual size of the source address.

RETURN
       All three routines return the length of the message on successful 
       completion.  If a message is too long to fit in the supplied  
       buffer, excess bytes may be discarded depending on the type of 
       socket the message is received from.
   

Note that when src_addr is NULL, nothing is filled in about the sender of the message; addrlen is then not used and should be NULL too. This mode is useful when we are not interested in knowing the protocol address of whoever sent us the data.

Serve Information to Client

After receiving a message, a server may need to serve information back to the client. For this we will use the sendto function; here is the corresponding man page:


NAME
       send, sendto, sendmsg - send a message on a socket

SYNOPSIS
       #include <sys/types.h>
       #include <sys/socket.h>

       ssize_t send(int sockfd, const void *buf, size_t len, int flags);
       ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
                      const struct sockaddr *dest_addr, socklen_t addrlen);
       ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);

DESCRIPTION
       The system calls send(), sendto(), and sendmsg() are used to 
       transmit a message to another socket.

       sockfd: file descriptor of sending socket
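       buf   : buffer holding the information to send.
       len   : length of the message
       flags : bitwise OR of zero or more flags; 0 for none
       dest_addr : address of the destination socket.
       addrlen   : size, in bytes, of the structure pointed to by dest_addr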

RETURN
       On Success: number of characters sent
       On Error  : -1, errno is set
   

Note that it is perfectly legal to write a datagram of length 0. In the case of UDP this leads to an IP datagram containing an IP header and a UDP header but no data. By extension, a return value of 0 from recvfrom is fine, and does not indicate (unlike in connection-oriented services) that the peer has closed the connection.
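The zero-length case is easy to try on the loopback interface. The sketch below is only an illustration: the function name empty_datagram_roundtrip, the kernel-chosen port and the 2-second receive timeout are our own assumptions. It sends an empty datagram from one socket to another in the same process and returns what recvfrom reports.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical demo: send an empty datagram over loopback and return what
   recvfrom() reports for it. Returns -2 if the setup itself fails. */
ssize_t empty_datagram_roundtrip(void)
{
    ssize_t n = -2;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);   /* receiver */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);   /* sender */
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    struct timeval tv = { 2, 0 };              /* do not block forever */
    char buf[16];

    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port        = 0;                  /* kernel picks a free port */

    if (rx >= 0 && tx >= 0 &&
        setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == 0 &&
        bind(rx, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(rx, (struct sockaddr *)&addr, &len) == 0 &&
        sendto(tx, "", 0, 0, (struct sockaddr *)&addr, sizeof addr) == 0)
        n = recvfrom(rx, buf, sizeof buf, 0, NULL, NULL);

    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return n;
}
```

A return value of 0 here means an empty datagram arrived, not that anything was closed; -1 would indicate an error or the timeout.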

Summary

We now summarize the working of a simple UDP server as the following sequence of steps:

  • Create a socket object. Use socket()
  • Associate IP address and port to the socket. Use bind()
  • Receive datagram from client process. Use recvfrom()
  • Serve (if you want) datagram to client process. Use sendto()
In our next post we shall present the actual implementation of a simple echo server and explain each line. After that we shall move on to the implementation of a UDP client process.
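Ahead of that walk-through, the four steps can be sketched in a few lines. This is a rough outline under our own assumptions, not the implementation promised for the next post: the names udp_server_socket and echo_once and the 1500-byte buffer are hypothetical choices.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Steps 1 and 2: create a UDP socket and bind it to INADDR_ANY:port.
   Returns the descriptor, or -1 on error. */
int udp_server_socket(uint16_t port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Steps 3 and 4: receive one datagram and send it back to its sender.
   Returns the number of bytes echoed, or -1 on error. */
ssize_t echo_once(int fd)
{
    char buf[1500];                      /* assumed maximum datagram size */
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof peer;    /* must be initialized before the call */
    ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                         (struct sockaddr *)&peer, &peer_len);

    if (n < 0)
        return -1;
    return sendto(fd, buf, (size_t)n, 0,
                  (struct sockaddr *)&peer, peer_len);
}
```

A server would typically call echo_once in a loop; a single call is enough to see the request and reply travel.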

Friday, April 18, 2014

LINUX SIGNAL HANDLING


Introduction

Signals are software interrupts that report the occurrence of an exceptional event such as:
  • Division by zero, or issuing addresses outside the valid range.
  • A suspension or termination request by the user.
  • Termination of a child process.
  • Expiration of the time allocated to the process.
  • A kill call by the same or another process.
These and similar events that can generate signals fall into 3 major classes:
  1. Errors : The program has done something invalid, and cannot continue.
  2. External : External factors like I/O, timer expiration, child termination.
  3. Explicit : A library function has been used to generate the signal.
Note that signals in both the 'Errors' class and the 'Explicit' class (when the call is made within the process itself) are delivered at a specific point in the program on account of some specific action there. Such signals are called synchronous. On the other hand, events in the 'External' class generate signals whose timing is outside the control of the receiving process. Such signals are called asynchronous because of the unpredictability of when they will arrive in a given process. Note that 'Explicit' library calls also generate asynchronous signals when the generating process is not the same as the one that receives them.

Soon after generation, a signal goes into the pending state before going into the delivered state. Delivery is withheld while the signal is blocked, until it is unblocked again.

Ways to Handle Signals

Once a signal is delivered to a process, and unless it is a SIGKILL or a SIGSTOP, the program has three choices:

  1. Ignore the signal
  2. Accept the default action
  3. Specify your action with a handler function
The appropriate choice is specified by using signal or sigaction in the program. This act is also called "setting the vocation" (more commonly, the "disposition") of the signal. Another piece of jargon is "catching the signal", which refers to the case where we use a handler function to specify the action to be taken.

Whenever a process terminates, including on account of a signal, its parent can determine the cause by examining the termination status obtained from the wait or waitpid functions. Another noteworthy feature is that when a "program error" signal (even one generated by an explicit library call) terminates a process, the system writes a core dump file that records the state of the process at the time of termination. This dump file can be examined to help with debugging.
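As a small illustration of a parent finding out that a signal ended its child, the sketch below forks a child that raises a signal on itself and decodes the status in the parent. The function name child_death_signal is hypothetical.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child that sends itself `signo`. Returns the
   signal number the parent observes through waitpid(), 0 if the child
   exited normally instead, or -1 on error. */
int child_death_signal(int signo)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        raise(signo);
        _exit(0);        /* reached only if the signal did not kill us */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);   /* which signal terminated the child */
    return 0;
}
```

Calling it with SIGKILL should return SIGKILL, since that signal cannot be caught or ignored by the child.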

Some Standard Signals

The standard signals are grouped into several categories.
  • Program Error Signals
  • Termination Signals
  • Alarm Signals
  • Asynchronous I/O Signals
  • Job Control Signals
  • Miscellaneous Signals
We shall discuss just a few of them, because of their commonplace occurrence:


Macro: int SIGFPE  (Program-Error)
     Fatal Arithmetic Error has occurred.


Macro: int SIGBUS  (Program-Error)
     Invalid pointer dereferenced.


Macro: int SIGHUP  (Termination)
     Hangup signal to report termination of the user's terminal.


Macro: int SIGTERM (Termination)
     Generic signal to terminate processes, which can be blocked,
     handled or ignored.


Macro: int SIGKILL (Termination)
     Generic signal to terminate processes, which can NOT be blocked,
     handled or ignored.


Macro: int SIGIO   (Asynchronous)
     File descriptor ready for READ/WRITE.


Macro: int SIGURG  (Asynchronous)
     Urgent or Out-of-Band data has arrived on socket.


Macro: int SIGCHLD (Job Control)
     Child process terminated or stopped.


Macro: int SIGSTOP (Job Control)
     Stop this process


Macro: int SIGPIPE (Miscellaneous)
     FIFO/PIPE write error.     
   

Traditional Signal Handling

Historically, signal handling has been done using the signal library call, which provides a simple interface for establishing the vocation of a signal. Here are excerpts from the Linux man page for the signal call:

NAME        : signal
SYNOPSIS    : #include <signal.h>
       typedef void (*sighandler_t) (int);
       sighandler_t signal(int signum, sighandler_t handler);
 
DESCRIPTION : The signal() system call installs a new signal handler for 
              the signal with number signum. The signal handler is set to
              sighandler which may be a user specified function, or either
              SIG_IGN or SIG_DFL.

              Upon arrival of a signal with number signum the following 
              happens. If the corresponding handler is set to SIG_IGN, 
              then the signal is ignored. If the handler is set to 
              SIG_DFL, then the default action associated to the signal 
              occurs. Finally, if the handler is set to a function sighandler 
              then first either the handler is reset to SIG_DFL or an 
              implementation-dependent blocking of the signal is performed
              and next sighandler is called with argument signum.

              Using a signal handler function for a signal is called 
              "catching the signal". The signals SIGKILL and SIGSTOP cannot
              be caught or ignored. 

On success, signal returns the previous value of the signal handler. In case of an error,
it returns SIG_ERR, a macro of type sighandler_t.

   


Below we illustrate the traditional signal handling with an example.


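          #include <signal.h>
          #include <stddef.h>
          #include <unistd.h>

          /* List of temporary files to delete (illustrative definition). */
          struct temp_file {
            const char *name;
            struct temp_file *next;
          };
          static struct temp_file *temp_file_list = NULL;

          void sigHandler(int signum){
            struct temp_file *p;
            (void)signum;
            for (p = temp_file_list; p; p = p->next)
              unlink(p->name);  /* delete the file, not the next node */
          }

          int main(void){
            /*
             Set vocation and check if old vocation
             was to ignore.
            */
            if (signal (SIGINT, sigHandler) == SIG_IGN) /*If it was, then*/
              signal (SIGINT, SIG_IGN); /*restore old vocation*/

            return 0;
          }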
          


The above example shows several aspects of signal handling. In the function main, once we change the vocation, we examine the return value to find out whether the old vocation was to ignore the signal. If it was, we make another signal call to restore it. This idiom is used when we wish to ensure that we never change the vocation of a signal that was already set to SIG_IGN when the program started.

POSIX Signal Handling

The POSIX way to handle a signal is to use the sigaction function, which acts similarly to signal but offers greater control by allowing additional flags. Here is an excerpt from the Linux man page:

NAME        : sigaction()
SYNOPSIS    : 
             #include <signal.h>
             int sigaction(int signum, const struct sigaction *act, 
                           struct sigaction *oldact); 
DESCRIPTION : The sigaction system call is used to change the action 
             taken by a process on receipt of a specific signal. signum 
             specifies the signal and can be any valid signal except 
             SIGKILL and SIGSTOP. If act is non-null, the new action for 
             signal signum is installed from act. If oldact is non-null, 
             the previous action is saved in oldact. 
RETURN      : 
             On Success : 0
             On Failure : -1


The struct sigaction is defined as follows:

struct sigaction {
    void (*sa_handler)(int);
    void (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_restorer)(void);
};
      


The sa_restorer element is obsolete and should not be used. POSIX does 
not specify a sa_restorer element. 

sa_handler specifies the action to be associated with signum and may be 
SIG_DFL for the default action, SIG_IGN to ignore this signal, or a 
pointer to a signal handling function. This function receives the signal 
number as its only argument.


sa_sigaction also specifies the action to be associated with signum. 
This function receives the signal number as its first argument, a pointer 
to a siginfo_t as its second argument and a pointer to a ucontext_t 
(cast to void *) as its third argument.

sa_mask gives a mask of signals which should be blocked during execution 
of the signal handler. In addition, the signal which triggered the 
handler will be blocked, unless the SA_NODEFER or SA_NOMASK flags are used.


sa_flags specifies a set of flags which modify the behaviour of the signal 
handling process



       


Now we show an example, where we handle the SIGCHLD signal using POSIX type handling.

 
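#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void Sigfunc(int); /* used in Sigaction for readability */

/*A wrapper function for sigaction*/
Sigfunc *Sigaction(int signo, Sigfunc *func)
{
 struct sigaction NewAction;
 struct sigaction OldAction;

 NewAction.sa_handler = func;
 sigemptyset(&NewAction.sa_mask);
 NewAction.sa_flags = 0;
 if (sigaction(signo, &NewAction, &OldAction) < 0)
 {
  fprintf(stderr, "sigaction(%d,...) failed : %s\n", signo, strerror(errno));
  exit(1);
 }
 return(OldAction.sa_handler); 
}

/*The actual signal handler. Note: printf is not async-signal-safe;
  it is used here only to keep the demonstration short.*/
static void sigchildHandler(int signo)
{
    pid_t   pid;
    int     stat;
    (void)signo;
    /*Reap every child that has terminated, without blocking*/
    while ((pid=waitpid(-1, &stat, WNOHANG)) > 0)
        printf("Child %d terminated\n", (int)pid);
    return;
}

int main(){

 /*register the signal handler*/
  Sigaction(SIGCHLD, sigchildHandler);

 /*... fork children and do the real work here ...*/

  return 0;
}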
          


POSIX Signal Semantics

  • Once installed, a signal handler remains installed, unlike in traditional systems that removed the signal handler each time it was executed.
  • During the execution of a handler, the corresponding signal remains blocked. Any other signals specified in sa_mask are also blocked.
  • If a signal is generated one or more times while it is blocked, it is delivered only once after it is unblocked. So by default Linux signals are not queued.

Thursday, February 6, 2014

PROCESSES, FORK AND WAIT

UNDERSTANDING PROCESSES

A process consists of an address space, a single thread of control that executes within that address space, and its required system resources.

Wow, that is quite a mouthful of jargon. In simple language, a process is essentially a running program, consisting of some program code, some data and variables, open files and an environment.

The UNIX system usually shares the code between processes so that there is only one copy of a program's code in memory at a time. The same is true for shared system libraries.

The processes running at a time can be viewed in the process tree maintained by UNIX. Each process has an identification number that is used to manage it as well as to index it in the tree.

When UNIX starts, it runs a single program, "init", which starts other system processes, which themselves may start some more, gradually bringing up the complete ensemble of OS features and facilities.

CREATING NEW PROCESSES USING system FUNCTION

We can create new processes from within another program. One way is to use the system function, which takes a shell command as a string and tries to run it.

#include <stdlib.h>
#include <stdio.h>
int main(){
    printf("Running ps command with system\n");
    system ("ps -ax");
    printf("Done. \n");
    exit(0);
}
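The example above ignores the value returned by system. That value is a termination status in the same encoding that wait uses, so it can be decoded with the wait macros. A minimal sketch, with run_shell as a hypothetical helper name:

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run a shell command and return its exit status
   (0-255), or -1 if it could not be run or did not exit normally. */
int run_shell(const char *command)
{
    int status = system(command);

    if (status == -1)
        return -1;                   /* the child could not be created */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);  /* exit status of the command */
    return -1;                       /* e.g. terminated by a signal */
}
```

For instance, run_shell("exit 3") should yield 3, because the shell started by system exits with that status.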
	  

CREATING NEW PROCESSES USING exec FUNCTION

Another method, with better control over the child processes, is provided by the exec family of functions. Members of this family differ from one another in the way they locate the program and present the program arguments. Generally speaking, they replace the current process image with a new one built according to the arguments given. The program given by the path argument is used as the program code to execute in place of that which is currently running.
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("Running ps with execlp\n");
execlp("ps","ps","-ax",(char *)NULL);
printf("Done\n"); /*reached only if execlp fails*/
exit(0);
}
	  

This program prints the first message and then calls execlp to execute the ps command. The new program replaces ours, and control returns to the terminal once ps finishes. This means that the subsequent lines of our code never get executed. The only time an exec function returns is when an error occurs while starting the new program. In that case it returns -1 and errno is set accordingly.
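The failure path is the only one we can observe from inside the calling program. The sketch below deliberately asks for a program at a path that is assumed not to exist (the path and the function name are made up for the illustration), so execlp returns and errno can be inspected.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical demo: try to exec a program that is assumed not to exist.
   Returns the errno value left behind by the failed exec. */
int exec_missing_program(void)
{
    execlp("/nonexistent/no-such-program", "no-such-program", (char *)NULL);

    /* Reaching this line at all means the exec failed. */
    return errno;
}
```

On a system where that path is indeed absent, the returned value should be ENOENT.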

CREATING NEW PROCESSES USING fork FUNCTION

The third way to create new processes is 'fork', which creates a new process without replacing the parent. This is a system call that duplicates the calling process and creates a new entry in the process table with several characteristics identical to the parent.

A simple example of fork is given here:

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main(){
  
  /*Declare variables*/
  pid_t pid;             /*process id*/
  char* slogan;	         /*message to be printed*/
  int n;    

  /*Print a welcome message*/
  printf("+++++++++++++++++++++++++++++++++++++++++++++++++++++\n");
  printf("(%d/%d) A program with fork application\n",getpid(),getppid());

  /*Create a new process using fork*/
  pid  = fork();
  
  /*Branch on the value returned by fork*/
  switch(pid){
    /*Error during fork*/
  case -1:
    perror("fork");
    exit(1);
      
    /*fork returns value zero in children*/
  case 0:
    slogan = "(%d/%d) Hello World, I am child\n";
    n=5;
    break;

    /*but not inside the parent*/
  default:	
    slogan = "(%d/%d) Hello World, This is parent\n";
    n=3;
    break;
    
  }
  /**
     THIS PART IS TEMPORARILY COMMENTED OUT
  int status;
  pid_t p = wait(&status);
  if(WIFEXITED(status)){
    printf("(%d/%d) Child exited normally w/status=%d\n",
             getpid(),getppid(),WEXITSTATUS(status));
  }
  else
    printf("(%d/%d) Child exited abnormally ",
           getpid(),getppid());

   *************************************************/

  /*Now make each process print a characteristic number of times*/

  for(; n>0;n--){
    
    printf(slogan,getpid(),getppid());    
    sleep(2);
  }
  
  /*If you make upto this point, you are discharged honourably !*/
  exit(0);
  
}
	  

Now it is time to have a brief look at the manual page for fork(2):
Library:
    #include <unistd.h>
Prototype:
    pid_t fork(void);
Description:
    1. creates  a  new  process by duplicating the calling process
    2. child has its own unique process ID
    3. child's parent process ID is the same as the parent's process ID
    4. child does not inherit its parent's memory locks
Return:
    On Success:		
       PID of the child process is returned in the parent
       0 is returned inside the child

    On Error: returns -1, no child created and errno is set.
	  

Another function that is used in conjunction with fork is wait:
Library:
    #include <sys/types.h>
    #include <sys/wait.h>
Prototype:
    pid_t wait(int* status);
Description:
    1. wait for state changes in a child of the calling process
    2. obtain information about the child whose  state  has  changed
    Notes: What is a state change?
	1. the child terminated
	2. the child was stopped by a signal
	3. the child was resumed by a signal. 

    Notes: Why use wait?
       In the case of a terminated child, performing a
       wait  allows  the system to release the resources 
       associated with the child; if a wait is not performed, 
       then the terminated child remains in a  "zombie"  state 

Inspecting Status: The integer status is examined with following macros
       	   
   WIFEXITED(status): true if the child terminated normally.

   WEXITSTATUS(status): 
   	returns the exit status of the child.  
   	should be employed only if WIFEXITED returned true.

   WIFSIGNALED(status)
        returns true if the child process was terminated by a signal.

   WTERMSIG(status)
        returns the number of the signal that caused the child to terminate.
        should be employed only if WIFSIGNALED returned true.

   WCOREDUMP(status)
        returns  true if the child produced a core dump.
        employed if WIFSIGNALED returned true.  

   WIFSTOPPED(status)
        returns true if the child process was stopped by delivery of a signal; 

   WSTOPSIG(status)
        returns the number of the signal which caused the child to  stop.

   WIFCONTINUED(status)
        returns true if the child process was resumed by delivery of SIGCONT.
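As a quick illustration of the first two macros, the following sketch forks a child that exits with a chosen code and recovers that code in the parent. The function name child_exit_status is our own.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child that exits with `code` and return the
   exit status seen by the parent, or -1 on any failure. */
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: terminate immediately */

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    /* WEXITSTATUS is meaningful only when WIFEXITED is true. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```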
	  

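The wait function can be used in the above program to ensure that the parent process waits for its child process to terminate before terminating itself. An illustration of wait is given below. It can be run inside the program shown above (just insert it before the 'for' loop). Since both processes execute this snippet, the child calls wait too; it has no children of its own, so its wait fails at once with -1 and leaves status unset, which is why the child also prints a "Child exited" line in the output further below. A more careful version would call wait only in the parent (pid > 0).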
 int status;
  pid_t p  = wait(&status);
  if(WIFEXITED(status)){
    printf("(%d/%d) Child exited normally w/status=%d\n",
	   getpid(),getppid(),WEXITSTATUS(status));
  }
  else{
    printf("(%d/%d) Child exited abnormally ", getpid(),getppid());

  }
	  

It may be useful to compare the output from this example program before and after the usage of wait function:
* Without wait:

(7025/4265) A program with fork application
(7025/4265) Hello World, This is parent
(7026/7025) Hello World, I am child
(7025/4265) Hello World, This is parent
(7026/7025) Hello World, I am child
(7025/4265) Hello World, This is parent
(7026/7025) Hello World, I am child
(7026/7025) Hello World, I am child
anil@indica$ (7026/1903) Hello World, I am child

* With wait:
(6967/4265) A program with fork application
(6968/6967) Child exited normally w/status=0
(6968/6967) Hello World, I am child
(6968/6967) Hello World, I am child
(6968/6967) Hello World, I am child
(6968/6967) Hello World, I am child
(6968/6967) Hello World, I am child
(6967/4265) Child exited normally w/status=0
(6967/4265) Hello World, This is parent
(6967/4265) Hello World, This is parent
(6967/4265) Hello World, This is parent
	  

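Without wait, the parent finished first and the shell prompt returned while the child was still running: the child became an orphan, adopted by another process (note its parent ID changing from 7025 to 1903 in the last line). With wait, the parent pauses until the child has finished, and reaps it so that it does not linger as a zombie.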

SUMMARY

We have come quite a distance from our first post, where we talked about the fundamentals of the Linux OS. The second post took baby steps into I/O operations, and the third dealt with reading information from directories. In the current post, the fourth in this series, we talked a little about what a process is and how to create new processes from existing ones. We spent quite a bit of time on the fork method and worked through an example to demonstrate it. Finally we got a brief idea of the usefulness of the wait function, and compared the output of our program before and after its inclusion. In our next post we shall keep building on these foundations and talk about file I/O in Linux system programming.

DIRECTORY ACCESS WITH SYSTEM CALLS

ACCESSING THE CONTENTS

We have seen earlier that a directory is a special file that contains directory entries, which map file names to the actual files. We shall write a small program to list the contents of a directory specified on the command line.
First we have a look at the functions that we would use:
opendir(3)
readdir(3)
closedir(3)
	  

So let us have a look at opendir:
man 3 opendir

Library:
    #include <sys/types.h>
    #include <dirent.h>
Prototype:
       DIR* opendir(const char* name);
Description:
     opens a directory stream corresponding to the directory name, and
     returns a pointer to the directory stream. On an error NULL is
     returned and errno is set.
	  

and a peek at readdir:
man 3 readdir

Library:
    #include <dirent.h>
Prototype:
       struct dirent *readdir(DIR *dirp);
Description:
    returns next item in the directory stream dirp.
Returns:
    returns a struct dirent that represent the next entry in directory
    stream dirp. If there is an error or end of directory is reached
    it will return NULL.
	  

It will be useful to know the composition of dirent struct:
         struct dirent {
               ino_t          d_ino;       /* inode number */
               off_t          d_off;       /* not an offset; see NOTES */
               unsigned short d_reclen;    /* length of this record */
               unsigned char  d_type;      /* type of file; not supported
                                              by all filesystem types */
               char           d_name[256]; /* filename */
           };
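Of these fields, d_name is the one most programs need. As a small sketch of using it (count_entries is a hypothetical helper, not part of the program in this post), the function below counts the entries of a directory while skipping "." and "..", and uses errno to tell the end of the directory apart from a read error.

```c
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: count the entries in `path`, skipping "." and "..".
   Returns the count, or -1 if the directory cannot be opened or read. */
long count_entries(const char *path)
{
    long count = 0;
    int read_failed;
    struct dirent *entry;
    DIR *dp = opendir(path);

    if (!dp)
        return -1;

    for (;;) {
        errno = 0;                   /* NULL with errno != 0 means error */
        entry = readdir(dp);
        if (!entry)
            break;
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    read_failed = (errno != 0);
    closedir(dp);
    return read_failed ? -1 : count;
}
```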
	  

and a quick look at closedir:
man closedir

Library:
    #include <sys/types.h>
    #include <dirent.h>
Prototype:
       int closedir(DIR *dirp);
Description:
    close directory stream associated with dirp.
Returns:
    On Success: The value zero is returned.
    On Failure: The value -1 is returned, errno is set accordingly.
	  

Now is the time to dive into the actual program.
/*
Created By: Anil Singh, NIU, IL-60115
Description: Receive name of directory, and list contents.
*/

#include<sys/types.h>
#include<dirent.h>
#include<stdio.h>
#include<errno.h>
#include<string.h>
#include<stdlib.h>
static void scanDir ( char* dir ){

  printf("Directory : %s\n",dir);
  // pointers for structures for directory processing
  DIR* dp;
  struct dirent* dirp;
  // Open directory for reading.
  errno=0;
  dp = opendir ( dir );
  if ( !dp ) {
    fprintf(stderr,"Can't open %s: %s\n",dir,strerror(errno));
    return;
  }

  // Read and display contents of directory.
  errno=0;
  while ( ( dirp = readdir ( dp ) ) )
    printf ( "%s\n", dirp->d_name );
  if(!dirp && errno){
    fprintf(stderr,"can't read %s: %s\n",dir,strerror(errno));
    closedir(dp);
    return;
  }
  // Close directory.
  if ( (closedir(dp)<0)){
    fprintf(stderr,"can't close %s: %s\n",dir,strerror(errno));
    return;
  }
}

int main(int argc, char* argv[]){

  /*If no argument is supplied, set current directory*/
  int i;
  if(argc==1){
    scanDir(".");
  }
  else
    for( i=1; i<argc; i++){
      scanDir(argv[i]);
    }
  return 0;
}

	  

SUMMARY

We saw how to open, read and close directories under LINUX and also a bit on how to deal with the errors.

Wednesday, February 5, 2014

THE BASIC INPUT AND OUTPUT OPERATIONS

THE I/O PROCESSING

One of the ways to classify the Unix I/O is to think of it as buffered or unbuffered.

The Unbuffered I/O has following path of data flow:

 Data->Kernel Buffer->Disk
       

Most of the system calls, like read, write, open and close, are unbuffered.

The buffered I/O has following path of data flow:

 Data->User Buffer -> Kernel Buffer -> Disk
       

The standard I/O functions are usually buffered: fread, fwrite, fopen, fclose, getchar, putchar, gets, puts.

Please note that STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO are file descriptors of type 'int' (the values 0, 1 and 2); they are not system calls themselves but are passed to system calls such as read and write.

Please note that stdin, stdout and stderr are standard I/O streams, and they are pointers of type FILE*.
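The two sets of names are related: each standard stream wraps one of the standard descriptors, and fileno reports which. A tiny sketch (the function name is ours) that checks the correspondence:

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if each standard stream (a FILE*) wraps
   the matching standard descriptor (an int), 0 otherwise. */
int std_streams_match_descriptors(void)
{
    return fileno(stdin)  == STDIN_FILENO  &&
           fileno(stdout) == STDOUT_FILENO &&
           fileno(stderr) == STDERR_FILENO;
}
```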

USING UNBUFFERED I/O

In this section we are going to encounter following functions:
read(2)
write(2)
perror(3)
exit(3)
     

Before going ahead and creating a program, let us have a brief look at the information from man pages. Here is what we have for read function:
man 2 read
Library:
#include <unistd.h>
Prototype:
  ssize_t read(int fd, void *buf, size_t count);
Description:
  attempts to read up to count bytes from file
  descriptor fd into the buffer starting at buf.
Notes:
  0. If count=0, then zero is returned and no other effects.
  1. Commences at current file offset.
  2. If offset is past end of file, nothing read, zero returned
  3. If count > SSIZE_MAX, result unspecified.
Return:
  On Success: number of bytes read.
  On Error: returns -1, set 'errno'
       

Now we shall look at the function perror(3):
Library:
    #include <stdio.h>
    #include <errno.h>
Prototype:
    void perror(const char *s);
Description:
    produces  a message on the standard error output,
    describing the last error encountered during a call to
    a system or library function
Return:
    no return value
       

Finally we need to know about the exit call
Library:
  #include <stdlib.h>
Prototype:
  void exit(int status);
Description:
  causes normal process termination and the value of status & 0377 is returned to the parent
Return:
  no return
   

Now here comes the exciting part: The actual program.
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>

#define BUFFERSIZE  8192

int main(int argc, char* argv[]){
   /*Size of data read in single go*/
   ssize_t numRead;

   /*Buffer to store the data read in one go*/
   char buffer[BUFFERSIZE];

   (void)argc; (void)argv;

   /*attempt a first read from STDIN_FILENO*/
   numRead = read ( STDIN_FILENO, buffer, BUFFERSIZE );

   while(numRead>0){

      /*Write numRead bytes to STDOUT_FILENO*/
      ssize_t numWrite = write (STDOUT_FILENO, buffer, numRead);

      /*Check if all the bytes sent to write, got written*/
      if(numRead != numWrite){
         perror("Error while writing");
         exit(1);
      }

      /*Read the next chunk.*/
      numRead = read ( STDIN_FILENO, buffer, BUFFERSIZE );
   }

   /*If we see a negative value of numRead, its error*/
   if(numRead<0){
      perror("Error while reading");
      exit(2);
   }

   /*If you manage to reach here, exit normally*/
   exit(0);
}
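The program above treats a short write as an error. That is a simplification: write may legitimately write fewer bytes than requested, for example to a pipe or terminal. A common alternative, sketched here with the hypothetical helper name write_all, is to keep writing until everything has gone out.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `count` bytes have
   been written. Returns 0 on success, -1 on error with errno set. */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        p     += n;                  /* skip what was already written */
        count -= (size_t)n;
    }
    return 0;
}
```

In the copy loop, a call such as write_all(STDOUT_FILENO, buffer, numRead) would then replace the single write and the length comparison.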
 

USING BUFFERED I/O

In this program we shall use buffered I/O methods to do what we did in the last program. Here are the functions that we are going to use:
fgetc(3)
fputc(3)
ferror(3)
perror(3)
exit(3)

So we start with man information on fgetc:
Library:
    #include <stdio.h>
Prototype:
    int fgetc(FILE *stream);
Description:
    reads the next character from stream.
    
Returns:
    returns the character read as an unsigned char cast to an int,
    or EOF on end of file or error.

And here is a bit on fputc:
Library:
    #include <stdio.h>
Prototype:
    int fputc(int c, FILE *stream);
Description:
    writes the character c, cast to an unsigned char, to stream.
Returns:
    the character written as an unsigned char cast to an int,
    or EOF on error.

And finally we need to talk about ferror:
Library:
    #include <stdio.h>
Prototype:
    int ferror(FILE *stream);
Description:
    tests the error indicator on stream.
Returns:
    if the error indicator on stream is set, ferror returns nonzero

And now we are ready to have a look at our program:
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>

#define BUFFERSIZE  8192

int main(int argc, char* argv[]){
  
  int datum; /*variable to hold the data; int, not char, so that EOF can be detected*/

  /*Attempt a first read from stdin*/
  datum = fgetc (stdin);

  while(datum != EOF){
    
    /*Write the datum into stdout*/
    if(fputc(datum, stdout)==EOF){
      perror("Error while trying to write: fputc");
      exit(1);
    }
    /*Read the next record.*/
    datum = fgetc (stdin);
  }


  /*If a stdin error brought you here: its bad*/
  if(ferror(stdin)){
    perror("Error while reading: fgetc");
    exit(2);
  }

  /*If you manage to reach here, exit normally*/
  exit(0);
}

SUMMARY

In the last post we took a brief tour of the basic ideas needed to begin Linux system programming. In this post we developed those ideas further and examined the buffered and unbuffered I/O modes. In the next post we shall build upon these points and see how to open and read directories.

INTRODUCTION TO UNIX AND UNIX LIKE SYSTEMS

WHAT IS UNIX?

The essentials of a UNIX or UNIX-like operating system consist of a 'kernel' and 'system programs'. In addition there are the 'application programs', which help with specific user tasks.

The kernel constitutes the heart of the OS and performs several low-level chores: keeping track of the files on disk, starting and running programs, managing resources for processes, exchanging packets of information with the network, and so on. Another way to look at the kernel is as an interface between the hardware and the users. It prevents direct access to the hardware and forces processes to go through the tools (system calls) that it provides.

The system calls provided by the kernel are used by system programs to implement the various facilities of the operating system. The "mount" command is an example of a system program. Application programs can also use system calls, but they are not system programs because their intent is to assist the user in a task, not to keep the system working.

The line between a system program and an application program is not sharp, and several programs fall somewhere between the two.

UNIX and UNIX-like operating systems provide several services, some of which are listed here:

  1. The "first" Process: init
  2. Logins from terminals: getty+login
  3. System Logs: syslog
  4. Periodic command execution: cron, at service
  5. Graphical User Interface: X Window System
  6. Networking Support
  7. Network Logins: telnet, rlogin, ssh
  8. Network File Systems: NFS (Supported by Kernel),CIFS
  9. Electronic mail, Printer Support.

Finally, there is a user manual that comes installed on practically every UNIX/Linux machine and can be used for quick reference. Here is a summary of the sections of the Linux manual:
  1. Executable programs or shell commands
  2. System calls (functions provided by the kernel)
  3. Library calls (functions within program libraries)
  4. Special files (usually found in /dev)
  5. File formats and conventions eg /etc/passwd
  6. Games
  7. Miscellaneous (including macro packages and conventions)
  8. System administration commands (usually only for root)
  9. Kernel routines [Non standard]

BASIC TERMINOLOGY OF UNIX SYSTEM PROGRAMMING

We shall follow a very concise approach with less talk and more programming. But before we begin it is worthwhile to have a look at the following terms:
  1. File: A reasonable working definition of a file in a UNIX system is that it is an object, residing in either external or internal memory, that can be written to, read from, or both.
  2. File System: The file system refers to the tree-like hierarchy into which files are organised in UNIX.
  3. Directories: Directories are special files that represent the nodes of the file hierarchy and appear to the user as if they contain other files. In reality a directory contains directory entries, which are objects that associate a file with its filename, thus giving the impression that the actual files reside there.
  4. Processes: Although we shall discuss processes in detail later, it suffices to note here that a process is an instance of a running program. Each process has an id number called the "pid" as well as a parent process id called the "ppid"; we shall talk more about both later.

Now that we have some idea of what is involved in UNIX/Linux system programming, it's time to start!

A "HELLO SYSTEMS WORLD" PROGRAM

Here's our first program which will print a simple message on to the terminal.
#include <unistd.h>
int main(int argc, char* argv[]){
  const char *buffer = "Hello Systems World !\n";
  //Write to console, 22 bytes from the buffer.
  write(1,buffer,22);
}

Output:
 Hello Systems World !
 
The main function defines a C string and then invokes the "write(2)" function to print it on the screen. The function "write(2)" is declared in the header "unistd.h" and is documented in the man pages, which can be viewed as follows:
man 2 write
 
Some brief excerpts from the man page are here:
Library:
    #include <unistd.h>
Prototype:
    ssize_t write(int fd, const void *buf, size_t count);
Description:
    writes up to count bytes from the buffer pointed to by buf
    to the file associated with descriptor fd.
Return:
    On Success: returns the number of bytes written.
    On Error: returns -1, sets 'errno'
 
Now it's pretty easy to see what's happening in the program. We create a buffer to hold the message and pass it to write along with the number 22 (the length of the message, not counting the terminating null byte) and 1, the file descriptor of standard output, which here is the terminal. And lo, we have our first on-screen print!

SUMMARY

We have taken a brief tour of the fundamental ideas that we need to get started with system programming. We also saw a very simple application and learned about the use of the "write" call. In the next post we shall take a longer look at I/O operations and learn about the buffered and unbuffered modes of doing them.