Monday, April 28, 2014

CREATING A UDP SERVER - I

Introduction

In this post and the next we show how to implement a server application with the User Datagram Protocol (UDP), a fast, lightweight, unreliable mode of transporting data between TCP/IP hosts. UDP messages are sent encapsulated in IP datagrams and provide a connectionless service with no built-in guarantee of delivery or sequence preservation.

Creating UDP Sockets

In a TCP/IP network, the endpoints between which communication takes place are called sockets. Working with sockets is very similar to working with files in the sense that both are accessed using handles called file descriptors. There are several differences too: sockets have addresses while files don't, and a socket cannot be accessed randomly the way a file can with fseek(). Below we describe the function that is used to create a Linux socket:


NAME
       socket - create an endpoint for communication

SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>

       int socket(int domain, int type, int protocol);

DESCRIPTION
       socket() creates an endpoint for communication and 
       returns a descriptor.
       
       The domain argument specifies a communication domain; 
       this selects the protocol family which will be used for
       communication.  

       The type argument specifies communication semantics.

       The  protocol  specifies  a  particular protocol to be 
       used with the socket.
   

Since we are going to study a UDP/IP server, the appropriate domain is 'AF_INET', which corresponds to the IPv4 Internet protocols, and the type is 'SOCK_DGRAM', which supports datagrams: connectionless, unreliable messages of a fixed maximum length. The last parameter, 'protocol', matters when more than one protocol supports a particular socket type within a given protocol family; it then selects the particular one. In our case, since only one protocol (UDP) supports 'SOCK_DGRAM' in the 'AF_INET' family, this parameter is set to 0.
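As a minimal sketch of the call described above, the following creates an IPv4 datagram socket and then asks the kernel to confirm its type. The helper names create_udp_socket and socket_type are made up for this illustration and are not part of the sockets API.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>

/* Hypothetical helper: create an IPv4 UDP socket.
   Returns the file descriptor, or -1 on failure. */
int create_udp_socket(void)
{
    /* protocol 0: let the system pick the only protocol (UDP)
       that supports SOCK_DGRAM in the AF_INET family */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        perror("socket");
    return fd;
}

/* Hypothetical helper: ask the kernel which type a socket has.
   Returns the type (e.g. SOCK_DGRAM), or -1 on failure. */
int socket_type(int fd)
{
    int type = -1;
    socklen_t len = sizeof type;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

Calling socket_type(create_udp_socket()) should report SOCK_DGRAM when the socket was created successfully.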

Naming a Socket

Once a socket is created and its file descriptor returned, we need to associate an IP address and a port number with it. Port numbers and IPv4 addresses are represented by 2 and 4 bytes of data respectively, placed in packets for the purpose of multiplexing and routing. The association is made with the bind function:


NAME
       bind - bind a name to a socket

SYNOPSIS
       #include <sys/types.h>          /* See NOTES */
       #include <sys/socket.h>

       int bind(int sockfd, const struct sockaddr *addr,
                socklen_t addrlen);

DESCRIPTION
       bind() assigns the address specified by addr to the
       socket referred to by the file descriptor sockfd.

       addrlen specifies the size, in bytes, of the address
       structure pointed to by addr.

RETURN
       On success : 0
       On failure : -1, errno is set
   

Note the datatype struct sockaddr for the addr parameter:

struct sockaddr {
  sa_family_t sa_family;
  char        sa_data[14];
};
   

In practice, the actual structure used to hold the information depends on the address family (for AF_INET it is struct sockaddr_in). It is passed to bind after a cast to the struct sockaddr type. Note: Most of the time the IP address of the server host is not known in advance, or there may even be more than one address associated with the host. In such cases we may set the IP address to 'INADDR_ANY', which ensures that datagrams sent to the specified port are directed to this socket, regardless of which of the host's addresses they are sent to. If we don't want this kind of sweeping behavior, we may use bind to specify which IP address (among the many of the host) is bound to which port number.
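A minimal sketch of naming a socket, assuming IPv4: the address goes into a struct sockaddr_in, is cast to struct sockaddr for bind, and uses INADDR_ANY for the IP part. The helper names bind_udp_any and bound_port are invented for this example; passing port 0 asks the kernel to pick any free port.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to INADDR_ANY on `port`.
   Returns the file descriptor, or -1 on failure. */
int bind_udp_any(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local address */
    addr.sin_port        = htons(port);        /* network byte order */

    /* bind() takes the generic struct sockaddr, hence the cast */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: return the local port (host byte order) that a
   socket is bound to, or -1 on failure. */
int bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

To bind to one specific address instead, the sin_addr field would hold that address rather than INADDR_ANY.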

Receive Queries

Once bind is done, the next step is to wait for and receive a message. Here we shall use the recvfrom function:

NAME
      recv, recvfrom, recvmsg - receive a message from a socket

SYNOPSIS
       #include <sys/types.h>
       #include <sys/socket.h>
       ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
                        struct sockaddr *src_addr, socklen_t *addrlen);

DESCRIPTION
       recvfrom() receives messages from a socket, and may be used
       to receive data on a socket whether or not it is connection-oriented.

       buf     : buffer to hold the incoming datagram.
       src_addr: an empty sockaddr struct, to receive the sender's address.
       addrlen : this variable must be initialized to the size of src_addr, and
       is modified on return to indicate the actual size of the source address.

RETURN
       All three routines return the length of the message on successful
       completion.  If a message is  too  long  to fit  in  the  supplied
       buffer,  excess bytes may be discarded depending on the type of
       socket the message is received from.
   

Note that when src_addr is NULL, nothing is filled in about the sender of the message; addrlen is then not used and should be NULL too. This mode is used when we are not interested in knowing the protocol address of whoever sent us the data.
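The following is a small sketch of recvfrom capturing the sender's address, assuming IPv4. The names receive_query and demo_receive are hypothetical; demo_receive is only a loopback self-test in which a second socket in the same process plays the client.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: receive one datagram and store the sender in *peer.
   Returns the datagram length, or -1 on error. */
ssize_t receive_query(int fd, void *buf, size_t cap, struct sockaddr_in *peer)
{
    socklen_t peer_len = sizeof *peer;  /* in: buffer size, out: address size */
    return recvfrom(fd, buf, cap, 0, (struct sockaddr *)peer, &peer_len);
}

/* Hypothetical loopback self-test: a "client" socket sends "ping" to a
   "server" socket. Returns the bytes the server received, or -1 on error. */
ssize_t demo_receive(void)
{
    struct sockaddr_in srv, peer;
    socklen_t len = sizeof srv;
    char buf[64];
    ssize_t n = -1;
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    int c = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&srv, 0, sizeof srv);
    srv.sin_family      = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port        = htons(0);            /* kernel picks a free port */

    if (s >= 0 && c >= 0 &&
        bind(s, (struct sockaddr *)&srv, sizeof srv) == 0 &&
        getsockname(s, (struct sockaddr *)&srv, &len) == 0 &&  /* learn the port */
        sendto(c, "ping", 4, 0, (struct sockaddr *)&srv, sizeof srv) == 4)
        n = receive_query(s, buf, sizeof buf, &peer);

    if (s >= 0) close(s);
    if (c >= 0) close(c);
    return n;
}
```

In a real server the address stored in peer is what a later sendto would use as its destination.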

Serve Information to Client

After a receive, a server may need to serve information back to the client. For this we will use the sendto function; here is the corresponding man page:


NAME
       send, sendto, sendmsg - send a message on a socket

SYNOPSIS
       #include <sys/types.h>
       #include <sys/socket.h>

       ssize_t send(int sockfd, const void *buf, size_t len, int flags);
       ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
                      const struct sockaddr *dest_addr, socklen_t addrlen);
       ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);

DESCRIPTION
       The system calls send(), sendto(), and sendmsg() are used to 
       transmit a message to another socket.

       sockfd    : file descriptor of the sending socket.
       buf       : buffer holding the information to send.
       len       : length of the message, in bytes.
       flags     : bitwise OR of zero or more send flags; 0 for none.
       dest_addr : address of the destination socket.
       addrlen   : size, in bytes, of dest_addr.

RETURN
       On Success: number of characters sent
       On Error  : -1, errno is set
   

Note that it is perfectly legal to write a datagram of length 0. In the case of UDP this leads to an IP datagram containing an IP header and a UDP header but no data. By extension, a return value of 0 from recvfrom is fine and does not indicate (unlike with connection-oriented services) that the peer has closed the connection.
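The zero-length case can be checked with a short loopback sketch. The function name demo_empty_datagram is invented for this example, and it assumes the loopback interface is available; it sends an empty datagram and returns whatever recvfrom reports.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical loopback self-test: send a datagram with no payload and
   return what recvfrom() reports for it (0 expected), or -1 on error. */
ssize_t demo_empty_datagram(void)
{
    struct sockaddr_in srv;
    socklen_t len = sizeof srv;
    char buf[16];
    ssize_t n = -1;
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    int c = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&srv, 0, sizeof srv);
    srv.sin_family      = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port        = htons(0);            /* kernel picks a free port */

    if (s >= 0 && c >= 0 &&
        bind(s, (struct sockaddr *)&srv, sizeof srv) == 0 &&
        getsockname(s, (struct sockaddr *)&srv, &len) == 0 &&
        /* len == 0: headers only, no data */
        sendto(c, "", 0, 0, (struct sockaddr *)&srv, sizeof srv) == 0)
        /* src_addr and addrlen are NULL: the sender is of no interest here */
        n = recvfrom(s, buf, sizeof buf, 0, NULL, NULL);

    if (s >= 0) close(s);
    if (c >= 0) close(c);
    return n;
}
```

A result of 0 here means an empty datagram arrived, so a UDP server should not treat it as end of input.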

Summary

We now summarize the working of a simple UDP server with the following sequence of steps:

  • Create a socket object. Use socket()
  • Associate IP address and port to the socket. Use bind()
  • Receive datagram from client process. Use recvfrom()
  • Serve (if you want) datagram to client process. Use sendto()
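
One pass through the four steps above can be sketched as follows. This is only an illustration under stated assumptions, not a full server: echo_once and demo_echo are hypothetical names, the buffer size of 512 is arbitrary, and the "client" is a second socket in the same process talking over loopback.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Steps 3 and 4: receive one datagram and send it straight back.
   Returns the number of bytes echoed, or -1 on error. */
ssize_t echo_once(int fd)
{
    char buf[512];
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof peer;
    ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                         (struct sockaddr *)&peer, &peer_len);
    if (n < 0)
        return -1;
    return sendto(fd, buf, (size_t)n, 0, (struct sockaddr *)&peer, peer_len);
}

/* Hypothetical loopback self-test.
   Returns 0 if the reply matches the query, -1 otherwise. */
int demo_echo(void)
{
    const char query[] = "hello";
    char reply[sizeof query] = {0};
    const ssize_t qlen = (ssize_t)sizeof query;
    struct sockaddr_in srv;
    socklen_t len = sizeof srv;
    int ok = -1;
    int s = socket(AF_INET, SOCK_DGRAM, 0);    /* step 1: create the socket */
    int c = socket(AF_INET, SOCK_DGRAM, 0);    /* stand-in client */

    memset(&srv, 0, sizeof srv);
    srv.sin_family      = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port        = htons(0);            /* kernel picks a free port */

    if (s >= 0 && c >= 0 &&
        bind(s, (struct sockaddr *)&srv, sizeof srv) == 0 &&   /* step 2 */
        getsockname(s, (struct sockaddr *)&srv, &len) == 0 &&
        sendto(c, query, sizeof query, 0,
               (struct sockaddr *)&srv, sizeof srv) == qlen &&
        echo_once(s) == qlen &&                                 /* steps 3, 4 */
        recv(c, reply, sizeof reply, 0) == qlen &&
        memcmp(query, reply, sizeof query) == 0)
        ok = 0;

    if (s >= 0) close(s);
    if (c >= 0) close(c);
    return ok;
}
```

A long-running server would call something like echo_once in a loop on a socket bound to a well-known port.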

In the next post we shall present the actual implementation of a simple echo server and explain each line. After that we shall move on to the implementation of a UDP client process.

Friday, April 18, 2014

LINUX SIGNAL HANDLING


Introduction

Signals are software interrupts that report the occurrence of an exceptional event such as:
  • Division by zero, or issuing addresses outside the valid range.
  • A suspension or termination request by the user.
  • Termination of a child process.
  • Expiration of the time allocated to the process.
  • A kill call by the same or another process.

These and similar events that can generate signals fall into 3 major classes:
  1. Errors : The program has done something invalid and cannot continue.
  2. External : External factors like I/O, timer expiration, child termination.
  3. Explicit : A library function has been used to generate the signal.

Note that signals in the 'Errors' class, and 'Explicit' signals that a process sends to itself, are delivered at a specific point in the program on account of a specific action there. Such signals are called synchronous. On the other hand, events like those in the 'External' class generate signals whose timing is outside the control of the receiving process. Such signals are called asynchronous because of the unpredictability of when they will arrive in a given process. Note that 'Explicit' signals are also asynchronous when the generating process is not the same as the one that receives them.

Soon after generation, a signal goes into the pending state before going into the delivered state. Delivery may also be withheld if the signal is blocked, until it is unblocked again.
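
The pending state can be observed with a short sketch that uses sigprocmask and sigpending. The function name pending_while_blocked is made up for this illustration; it blocks SIGUSR1, generates it with raise, and checks that it is reported as pending.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: block SIGUSR1, generate it, and report whether it
   is pending. Returns 1 if pending, 0 if not, -1 on error. */
int pending_while_blocked(void)
{
    sigset_t block, old, pending;
    struct sigaction ignore, saved;
    int is_pending;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;

    raise(SIGUSR1);                  /* generated, but delivery is withheld */

    if (sigpending(&pending) < 0)
        return -1;
    is_pending = sigismember(&pending, SIGUSR1);

    /* Discard the pending signal before restoring the mask, because the
       default action of SIGUSR1 would terminate the process. Setting the
       action to SIG_IGN discards a pending instance. */
    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    ignore.sa_flags = 0;
    sigaction(SIGUSR1, &ignore, &saved);
    sigprocmask(SIG_SETMASK, &old, NULL);
    sigaction(SIGUSR1, &saved, NULL);    /* put the previous action back */
    return is_pending;
}
```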

Ways to Handle Signals

Once a signal is delivered to a process, and unless it is SIGKILL or SIGSTOP, the program has three choices:

  1. Ignore the signal
  2. Accept the default action
  3. Specify your action with a handler function

The appropriate choice is specified by using signal or sigaction in the program. This act is usually called "setting the disposition" of the signal; this post calls it "setting the vocation". Another piece of jargon is "catching the signal", which refers to the case where we use a handler function to specify the action to be taken.
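
As a rough sketch of the three choices, the function below catches SIGUSR1 with a handler, then ignores it, and finally puts the default action back. The names on_usr1 and demo_dispositions are hypothetical, and the default action is never exercised because for SIGUSR1 it would terminate the process.

```c
#include <signal.h>

static volatile sig_atomic_t caught = 0;

/* Handler used for choice 3: just record that the signal arrived. */
static void on_usr1(int signo)
{
    (void)signo;
    caught = 1;
}

/* Hypothetical demo. Returns 1 if the handler ran while installed and
   did not run while the signal was ignored, 0 otherwise. */
int demo_dispositions(void)
{
    int ran_when_caught, ran_when_ignored;

    signal(SIGUSR1, on_usr1);        /* choice 3: catch with a handler */
    caught = 0;
    raise(SIGUSR1);
    ran_when_caught = caught;

    signal(SIGUSR1, SIG_IGN);        /* choice 1: ignore */
    caught = 0;
    raise(SIGUSR1);
    ran_when_ignored = caught;

    signal(SIGUSR1, SIG_DFL);        /* choice 2: default action restored */
    return ran_when_caught == 1 && ran_when_ignored == 0;
}
```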

Whenever a process terminates, including on account of a signal, its parent can determine the cause by examining the termination status reported by the wait or waitpid functions. Another noteworthy feature is that when "program error" signals (even if generated by an explicit library call) terminate a process, the system writes a core dump file that records the state of the process at the time of termination. This dump file can be examined to help with debugging.
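
A minimal sketch of how a parent might inspect the termination status, using the standard WIFSIGNALED and WTERMSIG macros. The function name child_death_signal is invented for this example; the child simply sends itself the requested signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sends itself `signo`.
   Returns the signal that terminated the child, 0 if the child exited
   normally, or -1 on error. */
int child_death_signal(int signo)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(signo, SIG_DFL);  /* make sure the default action applies */
        raise(signo);
        _exit(0);                /* reached only if the signal did not kill us */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))     /* terminated by a signal? */
        return WTERMSIG(status); /* which one */
    return 0;
}
```

For example, child_death_signal(SIGKILL) should return SIGKILL, since that signal cannot be caught or ignored.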

Some Standard Signals

The standard signals are grouped into several categories:
  • Program Error Signals
  • Termination Signals
  • Alarm Signals
  • Asynchronous I/O Signals
  • Job Control Signals
  • Miscellaneous Signals

We shall discuss just a few of them, because of their commonplace occurrence:


Macro: int SIGFPE  (Program-Error)
     Fatal Arithmetic Error has occurred.


Macro: int SIGBUS  (Program-Error)
     Invalid pointer dereferenced.


Macro: int SIGHUP  (Termination)
     Hangup Signal to report termination of user's terminal.


Macro: int SIGTERM (Termination)
     Generic Signal to terminate processes which can be blocked,
     handled or ignored.


Macro: int SIGKILL (Termination)
     Generic Signal to terminate processes which can NOT be blocked,
     handled or ignored.


Macro: int SIGIO   (Asynchronous)
     File descriptor ready for READ/WRITE.


Macro: int SIGURG  (Asynchronous)
     Urgent or Out-of-Band data has arrived on socket.


Macro: int SIGCHLD (Job Control)
     Child process terminated or stopped.


Macro: int SIGSTOP (Job Control)
     Stop this process


Macro: int SIGPIPE (Miscellaneous)
     FIFO/PIPE write error.     
   

Traditional Signal Handling

Historically, signal handling has been done using the signal library call, which provides a simple interface for establishing the vocation of a signal. Here are excerpts from the Linux man page for the signal call:

NAME        : signal
SYNOPSIS    : #include <signal.h>
       typedef void (*sighandler_t) (int);
       sighandler_t signal(int signum, sighandler_t handler);
 
DESCRIPTION : The signal() system call installs a new signal handler for 
              the signal with number signum. The signal handler is set to
              sighandler which may be a user specified function, or either
              SIG_IGN or SIG_DFL.

              Upon arrival of a signal with number signum the following 
              happens. If the corresponding handler is set to SIG_IGN, 
              then the signal is ignored. If the handler is set to 
              SIG_DFL, then the default action associated to the signal 
              occurs. Finally, if the handler is set to a function sighandler 
              then first either the handler is reset to SIG_DFL or an 
              implementation-dependent blocking of the signal is performed
              and next sighandler is called with argument signum.

              Using a signal handler function for a signal is called 
              "catching the signal". The signals SIGKILL and SIGSTOP cannot
              be caught or ignored. 

RETURN      : On success, signal() returns the previous value of the signal
              handler. In case of an error it returns the value of the macro
              SIG_ERR (of type sighandler_t) to indicate so.

   


Below we illustrate the traditional signal handling with an example.


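          #include <signal.h>
          #include <stddef.h>
          #include <unistd.h>

          /* List of temporary files to remove on interrupt.
             The struct layout is assumed here for illustration. */
          struct temp_file {
              const char       *name;
              struct temp_file *next;
          };
          static struct temp_file *temp_file_list = NULL;

          /* Handler: delete every temporary file in the list. */
          void sigHandler(int signum){
              struct temp_file *p;
              (void)signum;
              for (p = temp_file_list; p; p = p->next)
                  unlink(p->name);
          }


          int main(void){
              /*
               Set vocation and check if old vocation
               was to ignore.
              */
              if (signal(SIGINT, sigHandler) == SIG_IGN) /* If it was, then */
                  signal(SIGINT, SIG_IGN);               /* restore old vocation */

              /* ... the rest of the program goes here ... */
              return 0;
          }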
          


The above example shows several aspects of signal handling. In the function main, once we change the vocation, we examine the return value (the previous handler) to find out whether the old vocation was to ignore the signal. If it was, we make another signal call to restore it. This idiom is used when we wish to ensure that we never change the vocation of signals that were already set to SIG_IGN when the program started, for example by the shell that launched it.

Posix Signal Handling

The POSIX way to handle a signal is to use the sigaction function, which acts similarly to signal but offers greater control by allowing additional flags. Here is an excerpt from the Linux man page:

NAME        : sigaction()
SYNOPSIS    :
             #include <signal.h>
             int sigaction(int signum, const struct sigaction *act,
                           struct sigaction *oldact);
DESCRIPTION : The sigaction system call is used to change the action
             taken by a process on receipt of a specific signal. signum
             specifies the signal and can be any valid signal except
             SIGKILL and SIGSTOP. If act is non-null, the new action for
             signal signum is installed from act. If oldact is non-null,
             the previous action is saved in oldact.
RETURN      :
             On Success : 0
             On Failure : -1, errno is set


The struct sigaction is defined as follows:

struct sigaction {
    void (*sa_handler)(int);
    void (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_restorer)(void);
};
      


The sa_restorer element is obsolete and should not be used. POSIX does 
not specify a sa_restorer element. 

sa_handler specifies the action to be associated with signum and may be 
SIG_DFL for the default action, SIG_IGN to ignore this signal, or a 
pointer to a signal handling function. This function receives the signal 
number as its only argument.


sa_sigaction also specifies the action to be associated with signum. 
This function receives the signal number as its first argument, a pointer 
to a siginfo_t as its second argument and a pointer to a ucontext_t 
(cast to void *) as its third argument.

sa_mask gives a mask of signals which should be blocked during execution 
of the signal handler. In addition, the signal which triggered the 
handler will be blocked, unless the SA_NODEFER or SA_NOMASK flags are used.


sa_flags specifies a set of flags which modify the behaviour of the signal 
handling process.
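
As a small sketch of the three-argument form, the code below installs a handler through sa_sigaction with the SA_SIGINFO flag set in sa_flags and records the signal number from the siginfo_t it receives. The names on_signal_info and demo_siginfo are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t seen_signo = 0;

/* Three-argument handler, used when SA_SIGINFO is set in sa_flags. */
static void on_signal_info(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;                 /* would point to a ucontext_t */
    seen_signo = info->si_signo;   /* signal number as reported by siginfo_t */
}

/* Hypothetical demo: returns the signal number seen by the handler,
   or -1 if the handler could not be installed. */
int demo_siginfo(void)
{
    struct sigaction act, old;

    act.sa_sigaction = on_signal_info;  /* instead of sa_handler */
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_SIGINFO;          /* selects the sa_sigaction form */
    if (sigaction(SIGUSR2, &act, &old) < 0)
        return -1;

    raise(SIGUSR2);
    sigaction(SIGUSR2, &old, NULL);     /* restore the previous action */
    return seen_signo;
}
```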



       


Now we show an example, where we handle the SIGCHLD signal using POSIX type handling.

 
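#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

typedef void Sigfunc(int); // used in Sigaction for readability
/*A wrapper function for sigaction: installs func, returns the old handler*/
Sigfunc *Sigaction(int signo, Sigfunc *func)
{
 struct sigaction NewAction;
 struct sigaction OldAction;

 NewAction.sa_handler = func;
 sigemptyset(&NewAction.sa_mask); // block no additional signals in the handler
 NewAction.sa_flags = 0;
 if (sigaction(signo, &NewAction, &OldAction) < 0)
 {
  fprintf(stderr, "sigaction(%d,...) failed : %s\n", signo, strerror(errno));
  exit(1);
 }
 return(OldAction.sa_handler);
}
/*The actual signal handler: reaps every child that has terminated*/
static void sigchildHandler(int signo)
{
    pid_t   pid;
    int     stat;
    int     savedErrno = errno; // waitpid may change errno
    (void)signo;
    while ((pid=waitpid(-1, &stat, WNOHANG)) > 0)
        printf("Child %d terminated\n", (int)pid); // note: printf is not async-signal-safe; used for illustration only
    errno = savedErrno;
}

int main(void){

 //register the signal handler
  Sigaction(SIGCHLD, sigchildHandler);

 /* ... fork children and do the real work here ... */
 return 0;
}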
          


Posix Signal Semantics

  • Once installed, a signal handler remains installed, unlike the traditional system, which removed the signal handler each time it was executed.
  • During the execution of a handler, the corresponding signal remains blocked. Any other signals specified in sa_mask are also blocked.
  • If a signal is generated one or more times while it is blocked, it is delivered only once after it is unblocked. So by default Linux signals are not queued.
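
The last point can be checked on Linux with a short sketch: a standard signal is raised several times while blocked, and the handler invocations are counted after it is unblocked. The names count_delivery and deliveries_after_unblock are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t deliveries = 0;

/* Handler: count how many times the signal is actually delivered. */
static void count_delivery(int signo)
{
    (void)signo;
    deliveries++;
}

/* Hypothetical demo: raise SIGUSR1 `times` times while it is blocked,
   then unblock it. Returns the number of handler invocations, or -1. */
int deliveries_after_unblock(int times)
{
    struct sigaction act, old_act;
    sigset_t block, old_mask;
    int i;

    act.sa_handler = count_delivery;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (sigaction(SIGUSR1, &act, &old_act) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) < 0)
        return -1;

    deliveries = 0;
    for (i = 0; i < times; i++)
        raise(SIGUSR1);                        /* all collapse into one pending signal */

    sigprocmask(SIG_UNBLOCK, &block, NULL);    /* pending signal is delivered here */
    sigprocmask(SIG_SETMASK, &old_mask, NULL); /* restore the original mask */
    sigaction(SIGUSR1, &old_act, NULL);        /* restore the previous action */
    return deliveries;
}
```

On Linux, deliveries_after_unblock(3) is expected to return 1, since the three blocked instances are merged into a single pending signal.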