Sockets, Internet and UNIX domains

 

Syntax:

t/f = listen(path[,mode])

str = get[b](path|fd[,max])

num = put[b](path|fd,str)

str = peerip(path|fd)

str = peerpt(path|fd)

num = peeruid(path|fd)

num = peergid(path|fd)

num = peerpid(path|fd)

t/f = close(path|fd)

 

Synopsis:

JSUS supports client-server and peer-to-peer applications, using both Internet and UNIX domain sockets, primarily through the get() and put() functions defined for regular file I/O. Internet sockets support inter-process communication between programs on the same machine, across a local area network (or private WAN), and across the public Internet. JSUS fully supports both traditional IPv4 and the newer IPv6 addressing, over Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). Some support for SCTP is also included, but it is disabled until the protocol becomes more widely available. UNIX sockets communicate only between processes on the same machine.

 

UNIX sockets:

UNIX domain sockets are "special" files that live within directories of the main file system. Thus, they have a normal path, but they occupy no space on any external medium. Messages are passed from one program to another through memory buffers managed by the operating system. They are, perhaps, the best method for creating client-server applications that run only on the local machine. They are superior to both named pipes and Internet sockets because they are the only method by which a server program can reliably determine the identity of the client (using the peeruid() and peergid() functions). It is also possible to kill() the client, using the process id returned by peerpid(). These three functions operate only on UNIX sockets and generate an error if used on any other type of file. UNIX sockets are also faster than Internet sockets (but slightly slower than named pipes, which have other disadvantages).
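The peer identity check described above relies on an operating system facility rather than on anything the client sends. As a rough illustration only (this is not JSUS code), the following Python sketch reads the same three values on Linux through the SO_PEERCRED socket option. The helper name peer_credentials is hypothetical, and a socket pair inside one process stands in for a real client and server.

```python
import os
import socket
import struct

def peer_credentials(sock):
    """Return (pid, uid, gid) of the process at the other end of a UNIX socket.

    Linux-specific assumption: SO_PEERCRED fills in a 'struct ucred',
    which is three native ints (pid, uid, gid).
    """
    size = struct.calcsize("3i")
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack("3i", raw)

# A connected pair of UNIX sockets in one process stands in for
# a client and a server transaction process.
server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pid, uid, gid = peer_credentials(server_end)
server_end.close()
client_end.close()
```

Because the kernel records these values when the socket is connected, the peer cannot forge them, which is what makes the check reliable.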

The socket file is created automatically when the listen() function is called to place the server program into the waiting state. If a file already exists at the given path, the listen() call fails. An optional mode argument may also be passed, to limit client connections to those from programs with appropriate permissions. If omitted, a value of 0660 is assumed, which limits client connections to programs with the same user or group id as the server. Of course, both client and server must also have access permissions to all directories in the path. Ideally, the containing directory should be owned by a particular application group and created in a file system mounted with BSD semantics or with the setgid permission bit on.
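The behaviour described above (the socket file appears on setup, setup fails if the path is taken, and the mode restricts who may connect) mirrors what the operating system does when a UNIX socket is bound. The Python sketch below shows that mechanism under stated assumptions: create_listening_socket is a hypothetical helper, the path is a temporary one, and how JSUS itself applies the mode is not documented here.

```python
import os
import socket
import stat
import tempfile

def create_listening_socket(path, mode=0o660):
    """Bind a UNIX stream socket at 'path' and restrict it to 'mode'."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)        # creates the socket file; fails if path exists
        os.chmod(path, mode)   # clients need write permission to connect
        sock.listen(5)
    except OSError:
        sock.close()
        raise
    return sock

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.sock")

first = create_listening_socket(path)
file_mode = stat.S_IMODE(os.stat(path).st_mode)

# A second attempt on the same path must fail because the file exists.
try:
    create_listening_socket(path)
    second_failed = False
except OSError:
    second_failed = True

# Clean up: close the socket and remove the socket file.
first.close()
os.unlink(path)
os.rmdir(workdir)
```

There is a brief window between bind() and chmod() in this sketch; a careful server would also set a restrictive umask before binding.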

The listen() function performs all of the tasks necessary to set up the socket and then waits (blocks) until a connection is received from any authorized client program. A connection is made by the client issuing a get() or put() to the same path. The server's listen() function then fork()s a new "transaction" process to handle just the new connection. The main server process then continues listening for more connections. It can be interrupted by sending a signal, such as SIGTERM (which may be sent from a child transaction process). It is also interrupted by any error condition arising during an attempted connection. In either case, the listen() function in the main listening process returns null and an error code (after closing and removing the socket file). The listening process can then determine the error code or signal type, with errno() and strerror(), and perform any necessary clean up before terminating.

After accepting a new connection, the listen() function in the new server transaction process returns true, with errno set to zero. The transaction and client programs then communicate with any pre-arranged sequence of get()s and put()s. Typically, the client put()s a query message, which also makes the connection to the listening server. The server's transaction process then get()s this message, after the listen() completes, and put()s one or more response messages to the client. It is also possible for the client to connect with a get(), which connects to the server and waits for a response. The server transaction process then follows its listen() with an initial put(), which completes the client's get(). Both client and transaction processes can also inspect each other's credentials, with the peeruid(), peergid() and peerpid() functions.
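The accept-then-fork pattern described in the last two paragraphs is a classic one. The Python sketch below shows its general shape, assuming a POSIX system with fork(); it is not the JSUS implementation. The names serve_once and upper_case_handler are hypothetical, and the "query" and "response" are a single short message each.

```python
import os
import socket
import tempfile

def serve_once(listener, handler):
    """Accept one connection and fork a child process to handle it.

    Returns the child's pid in the parent; the child never returns.
    """
    conn, _ = listener.accept()
    pid = os.fork()
    if pid == 0:               # child: the "transaction" process
        listener.close()       # the child does not listen
        try:
            handler(conn)
        finally:
            conn.close()
            os._exit(0)        # leave without running the parent's code
    conn.close()               # parent keeps only the listening socket
    return pid

def upper_case_handler(conn):
    """Read one query and send back one response."""
    query = conn.recv(1024)
    conn.sendall(query.upper())

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.sock")
listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
listener.bind(path)
listener.listen(5)

# The client connects and sends its query; both are queued by the
# operating system until the server accepts.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(path)
client.sendall(b"hello")

child = serve_once(listener, upper_case_handler)
reply = client.recv(1024)
os.waitpid(child, 0)           # reap the finished transaction process

client.close()
listener.close()
os.unlink(path)
os.rmdir(workdir)
```

A real server would call serve_once in a loop and reap children as they finish, so the parent is always free to accept the next client.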

In JSUS, UNIX sockets are set up to send and receive continuous "streams" of bytes, which are typically read line by line (with no max argument). They may also be read in explicitly sized blocks of bytes, and all available data in the stream can be read with get(path). The get() function returns null when the other end of the connection is closed.
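The three reading styles mentioned above (by line, by fixed-size block, and to the end of the stream) can be illustrated with a buffered reader over a stream socket. This Python sketch is only an analogy for the JSUS get() behaviour; the empty result at end of stream plays the role that null plays in JSUS.

```python
import socket

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
a.sendall(b"first line\nsecond line\nrest")
a.close()                      # closing signals end of stream to the reader

reader = b.makefile("rb")
line = reader.readline()       # one line, newline included
block = reader.read(6)         # exactly 6 bytes
remainder = reader.read()      # everything up to end of stream
at_eof = reader.read()         # empty once the peer has closed
reader.close()
b.close()
```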

It may be helpful to remember that listen() always returns false in the listening process and true in all successfully forked transaction processes.

 

TCP sockets:

TCP sockets work the same as UNIX sockets, with one major difference: they do not exist in the file system, so they do not really have a path. Instead, JSUS functions accept a special path string, in the following format:

TCP:address:port

TCP identifies the protocol to be used. It must be the same in both client and server (but may also be UDP, below). In the client program, the address portion is the IPv4 or IPv6 address of the server (which may be obtained using the dig() function). In the server, it must be the IP address of an interface on which the server is to listen(). For IPv4, the address portion may also be empty, to cause the server to listen() on all available interfaces. IPv6 addresses may need to end with %n (the scope of the interface). In both client and server, the port number must be the same (i.e. the port on which the server is listening). Note also that only the superuser, root, can listen() on ports numbered lower than 1024.
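Because IPv6 addresses themselves contain colons, a path in this format has to be split at the first colon (protocol) and the last colon (port), with everything in between taken as the address. The Python sketch below shows one way to do that; parse_socket_path is a hypothetical helper, not a JSUS function, and the addresses used are documentation examples.

```python
def parse_socket_path(path):
    """Split 'PROTO:address:port' into (protocol, address, port).

    The address may be empty (all IPv4 interfaces) or an IPv6 address
    containing colons and an optional %scope suffix.
    """
    protocol, sep, rest = path.partition(":")
    if not sep or protocol not in ("TCP", "UDP"):
        raise ValueError("path must begin with TCP: or UDP:")
    address, sep, port_text = rest.rpartition(":")
    if not sep or not port_text.isdigit():
        raise ValueError("path must end with :port")
    port = int(port_text)
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return protocol, address, port

def needs_root_to_listen(port):
    """Ports below 1024 are traditionally reserved for the superuser."""
    return port < 1024

ipv4 = parse_socket_path("TCP:192.0.2.10:8080")
any_interface = parse_socket_path("UDP::5353")
ipv6 = parse_socket_path("TCP:fe80::1%2:443")
```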

Otherwise, listen(), get() and put() operate the same as for UNIX sockets. The peer's IP address and port number can also be obtained (or verified) using the peerip() and peerpt() functions. But it should be noted that the lack of a client's user credentials makes TCP (and UDP) less secure than UNIX sockets.
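The peer address and port are available to the server as soon as a connection is accepted. As an illustration of the underlying idea (not of peerip() and peerpt() themselves), this Python sketch connects a client to a listener on the loopback interface and compares what the server sees with what the client actually used. The port is chosen by the operating system.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
listener.listen(1)

client = socket.create_connection(listener.getsockname())
conn, _ = listener.accept()

peer_ip, peer_port = conn.getpeername()        # the server's view of the client
client_ip, client_port = client.getsockname()  # what the client really used

conn.close()
client.close()
listener.close()
```

Note that this identifies only a host and port, not a user, which is why the text above calls TCP and UDP less secure than UNIX sockets.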

 

UDP sockets:

UDP sockets are "connectionless" and less reliable than TCP or UNIX. Complete "datagrams" are sent and received by a single put() or get(). However, there is no guarantee they will be received in the same sequence they are sent, if they are even received at all. Also, in the underlying run time libraries, there is no "listen" function for UDP sockets. Nevertheless, JSUS does provide its own listen() for UDP, emulating functionality similar to that for UNIX and TCP sockets (including the fork()ing of server transaction processes as described below).

In the client, at its simplest, a datagram is transmitted as a string of UCS characters passed to a put(), with a pseudo-path like that for TCP (except beginning with UDP:). The string is normally converted to UTF-8 bytes, but binary data can also be sent using putb(). A single UDP datagram can carry at most just under 64 KB, because its length field is 16 bits. Well before that limit, if the total number of bytes is greater than the Maximum Transmission Unit (MTU) of the underlying link, the datagram will be "fragmented" and reassembled at the other end. This adds to the rate of failure and makes error recovery more difficult. Since the MTU over the public Internet is typically about 1500 bytes (IP and UDP headers included), this is recommended as a practical maximum size for a datagram.
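Since the limit applies to bytes after UTF-8 conversion, not to characters, it can be worth checking the encoded length before sending. The Python sketch below does this under stated assumptions: a 1500-byte MTU, a 20-byte IPv4 header without options, and an 8-byte UDP header. The helper name datagram_payload is hypothetical.

```python
TYPICAL_MTU = 1500   # assumed link MTU in bytes
IPV4_HEADER = 20     # IPv4 header without options
UDP_HEADER = 8       # fixed UDP header size

def datagram_payload(text, mtu=TYPICAL_MTU):
    """Encode text as UTF-8 and report whether it fits in one packet.

    Returns (data, fits) where 'fits' is False if the datagram would
    have to be fragmented on a link with the given MTU.
    """
    data = text.encode("utf-8")
    limit = mtu - IPV4_HEADER - UDP_HEADER
    return data, len(data) <= limit

short_data, short_fits = datagram_payload("héllo")
_, exact_fits = datagram_payload("x" * 1472)
_, over_fits = datagram_payload("x" * 1473)
_, accented_fits = datagram_payload("é" * 800)   # 800 characters, 1600 bytes
```

The last case shows why counting characters is not enough: each "é" occupies two bytes in UTF-8.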

The client can receive replies with a get() from the same path as the initial put(). In fact, a get() can typically receive datagrams from any host on the Internet. This is not a massive security issue because both the put() and get() are bound to the same random port (which is normally not known to any host other than the one originally contacted with the first put()). Each get() always returns a complete datagram, ignoring any length given for the function call.

A simple server can be set up with nothing more than a get() from a path containing the address of a local interface (which may be omitted to receive on all IPv4 interfaces). This will accept datagrams from any host on the Internet with a route to any of the interfaces and, when repeated, allows the server to process requests from multiple clients. Reply datagrams can be put() out on the same path. JSUS automatically sends them to the IP address and port from which the last datagram was received. Thus, a basic query-response server application is a simple loop of get(), process, put(). However, if processing is extensive, or a reply put() blocks for any reason, it will hinder the server program from issuing a new get(), thereby degrading the overall transaction rate. A more elegant solution is to use the listen() function.
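The get(), process, put() loop described above corresponds to a receive-and-reply loop on an ordinary UDP socket. The Python sketch below shows that loop on the loopback interface. Here the sender's address is tracked explicitly, whereas JSUS is described as remembering it automatically; serve_datagrams is a hypothetical name, and the "processing" is just upper-casing the payload.

```python
import socket

def serve_datagrams(sock, count):
    """Handle 'count' datagrams: receive, process, reply to the sender."""
    for _ in range(count):
        data, sender = sock.recvfrom(65535)   # one whole datagram
        sock.sendto(data.upper(), sender)     # reply to where it came from

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))                 # port 0: OS picks a free port
server.settimeout(2.0)
server_addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)
client.sendto(b"ping", server_addr)           # binds the client to a random port

serve_datagrams(server, 1)
reply, reply_from = client.recvfrom(65535)

client.close()
server.close()
```

While serve_datagrams is busy with one request, no other datagram is being read, which is the throughput problem the text attributes to this simple design.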

The JSUS listen(), with a UDP local interface path, operates very much like those for UNIX and TCP sockets. It first accepts and saves an initial datagram from any host with a route to its interface(s). Then, it fork()s a new server transaction process in which the listening socket file is replaced by a new socket with a pseudo connection to the client. In the child process, the first get() after the listen() retrieves the saved initial datagram. Thereafter, both parties can continue an indefinite and fairly secure private conversation, with multiple get() and put() calls, independent of any other client transaction initiated over the same UDP server port. Because of the pseudo "connection", the transaction process never even sees datagrams sent by any host other than that which made the initial contact. Meanwhile, the server parent process keeps on listen()ing for new initial datagrams from other clients, fork()ing new transactions as needed.
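One plausible way to obtain the filtering effect of such a pseudo "connection" is to call connect() on a UDP socket, after which the operating system delivers only datagrams from that one peer. Whether JSUS does exactly this is an assumption. The Python sketch below demonstrates only the filtering, on the loopback interface, and leaves out the fork and the saved initial datagram; the helper name udp_socket is hypothetical, and the behaviour shown is that of Linux.

```python
import socket

def udp_socket():
    """Create a UDP socket on a free loopback port with a short timeout."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(1.0)
    return s

transaction = udp_socket()   # stands in for the server transaction process
client = udp_socket()        # the host that made the initial contact
stranger = udp_socket()      # any other host

# "Connect" the transaction socket to the client: no packets are sent,
# but only datagrams from this peer will be delivered from now on.
transaction.connect(client.getsockname())

stranger.sendto(b"intruder", transaction.getsockname())
client.sendto(b"hello", transaction.getsockname())

received = transaction.recv(65535)
try:
    extra = transaction.recv(65535)
except socket.timeout:
    extra = None             # nothing else arrived: the stranger was filtered

for s in (transaction, client, stranger):
    s.close()
```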

 

See also:

The Domain Information Grabber — dig()

Binary file I/O — getb(), putb()