Designing a POSIX/C++ Binding for the socket API
Revision 1
Pedro Lamarão
[email protected]
Introduction
In the scope of the P1003.27 draft “POSIX® C++ Language Interfaces” is the specification of a binding to the socket API. We feel that such a library has many opportunities to provide additional benefits to C++ programs through the use of C++ language mechanisms. This paper is a study of these opportunities. Following the Scope statement of the P1003.27 draft, we discuss the form of a Network Library for C++ as a wrapper of the C binding; all C++ functions are defined in terms of C functions. We consider C++0x features as available up to the latest draft at the time this document was being prepared [C++0x]. This is a revised version of our first paper, based on feedback received on the POSIX/C++ Working Group mailing list.
Design
Our investigation is guided by the application of individual design techniques provided by the C++0x programming language. We consider the application of these techniques in order, as a thought exercise.
● encapsulating declarations in namespaces
● replacing returning error values with throwing exceptions
● using default arguments when applicable
● using function overloading when appropriate
● wrapping resources in classes where resource acquisition is initialization
● using generic programming techniques
In this paper we won't consider the application of some of the newest features in the language, like Concepts. In a future revision these will be taken into consideration.
POSIX sockets Synopsis
This is a quick overview of the interfaces we intend to wrap, as declared in [POSIX].

    // basic life-cycle operations
    int socket (int domain, int type, int protocol);
    int close (int fildes);

    // special life-cycle operations
    struct sockaddr;
    int bind (int socket, const struct sockaddr *address, socklen_t address_len);
    int listen (int socket, int backlog);
    int accept (int socket, struct sockaddr *restrict address,
                socklen_t *restrict address_len);
    int connect (int socket, const struct sockaddr *address, socklen_t address_len);
    int shutdown (int socket, int how);

    // I/O operations
    struct iovec;
    struct msghdr;
    ssize_t read (int fildes, void *buf, size_t nbyte);
    ssize_t readv (int fildes, const struct iovec *iov, int iovcnt);
    ssize_t recv (int socket, void *buffer, size_t length, int flags);
    ssize_t recvfrom (int socket, void *restrict buffer, size_t length, int flags,
                      struct sockaddr *restrict address, socklen_t *restrict address_len);
    ssize_t recvmsg (int socket, struct msghdr *message, int flags);
    ssize_t write (int fildes, const void *buf, size_t nbyte);
    ssize_t writev (int fildes, const struct iovec *iov, int iovcnt);
    ssize_t send (int socket, const void *buffer, size_t length, int flags);
    ssize_t sendto (int socket, const void *message, size_t length, int flags,
                    const struct sockaddr *dest_addr, socklen_t dest_len);
    ssize_t sendmsg (int socket, const struct msghdr *message, int flags);

    // getting and setting attributes
    int getsockname (int socket, struct sockaddr *restrict address,
                     socklen_t *restrict address_len);
    int getpeername (int socket, struct sockaddr *restrict address,
                     socklen_t *restrict address_len);
    int getsockopt (int socket, int level, int option_name,
                    void *restrict option_value, socklen_t *restrict option_len);
    int setsockopt (int socket, int level, int option_name,
                    const void *option_value, socklen_t option_len);
    int fcntl (int fildes, int cmd, ...);
    int ioctl (int fildes, int request, ...);

    // readiness event notification
    struct timeval;
    unspecified fd_set;
    struct pollfd;
    int select (int nfds, fd_set *restrict readfds, fd_set *restrict writefds,
                fd_set *restrict errorfds, struct timeval *restrict timeout);
    int poll (struct pollfd fds[], nfds_t nfds, int timeout);

    // address information
    struct addrinfo;
    int getaddrinfo (const char *node, const char *service,
                     const struct addrinfo *hints, struct addrinfo **res);
    void freeaddrinfo (struct addrinfo *res);
    int getnameinfo (const struct sockaddr *sa, socklen_t salen, char *host,
                     size_t hostlen, char *serv, size_t servlen, int flags);
    const char *gai_strerror (int errcode);
C++ Namespace
First we encapsulate a new set of declarations in namespace posix. Every declaration in our new binding will live in this namespace in order to avoid conflicting with the original ones; this gives us freedom to modify them at will. We also simplify the declarations in small ways: in C++ there is no need to use the struct keyword to declare variables of user-defined types, and the restrict keyword is not part of the language.
Errors and Exceptions
Second, we replace error values by exceptions in every case; instead of returning -1 from a function to communicate an error condition we throw a std::system_error with the std::error_code corresponding to the error reported. Consider, for example, an implementation of posix::bind:

    void posix::bind (int socket, sockaddr const* address, socklen_t length)
    {
        int status = ::bind(socket, address, length);
        if (status == -1) {
            throw std::system_error(errno, std::posix_category);
        }
    }
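The std::posix_category name used in this paper comes from the C++0x draft; in C++11 as published the same category is obtained from std::generic_category(). The following is a minimal sketch of the bind wrapper written against the published names. The namespace posix_sketch, the helper throw_errno and the probe function are hypothetical names introduced only for this illustration and are not part of the proposed binding.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <sys/socket.h>

namespace posix_sketch {   // hypothetical namespace, for illustration only

// Hypothetical helper: turn the current errno into a std::system_error.
[[noreturn]] inline void throw_errno(const char* what) {
    throw std::system_error(errno, std::generic_category(), what);
}

// Wrapper in the style of posix::bind: returns void, throws on failure.
inline void bind(int socket, sockaddr const* address, socklen_t length) {
    if (::bind(socket, address, length) == -1) {
        throw_errno("bind");
    }
}

// Hypothetical probe: binds an invalid descriptor and returns the error
// value carried by the exception. The call is qualified because an
// unqualified bind(...) would also find ::bind through argument-dependent
// lookup on sockaddr and become ambiguous.
inline int bind_error_on_invalid_descriptor() {
    sockaddr address = {};
    try {
        posix_sketch::bind(-1, &address, sizeof address);
    } catch (std::system_error const& e) {
        assert(e.code().category() == std::generic_category());
        return e.code().value();
    }
    return 0;   // only reached if the call unexpectedly succeeds
}

} // namespace posix_sketch
```

Callers see a single failure channel: the probe receives EBADF through the exception rather than through a return value and errno.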
Many functions in this interface return some value exclusively to report errors; these functions will have their declarations modified to return void. Functions affected in this manner are posix::close, posix::bind, posix::listen, posix::connect, posix::shutdown, posix::getsockname, posix::getpeername, posix::getsockopt, posix::setsockopt, posix::getaddrinfo and posix::getnameinfo.
Not every function will be modified like this. Consider, for example, an implementation of posix::read:

    ssize_t posix::read (int socket, void* buffer, size_t length)
    {
        // qualified call: an unqualified read would name posix::read itself
        ssize_t status = ::read(socket, buffer, length);
        if (status == -1) {
            throw std::system_error(errno, std::posix_category);
        }
        return status;
    }
The return value is used at the same time to communicate the result of the operation and to report errors; functions like this will now assert as a post-condition that the return value is zero or positive. Functions affected in this manner are posix::socket, posix::accept, posix::read, posix::readv, posix::recv, posix::recvfrom, posix::recvmsg, posix::write, posix::writev, posix::send, posix::sendto, posix::sendmsg, posix::fcntl, posix::ioctl, posix::select and posix::poll.
The decision to completely replace the return of -1 by throwing an exception is not free of controversy: not all "error" conditions are hard errors. For example, the EWOULDBLOCK "error" code will be frequently met by certain applications dealing with non-blocking sockets as part of their normal operation.

    // reading on a non-blocking socket
    ssize_t bytes = -1;
    try {
        bytes = posix::read(socket, buffer, buffer_size);
    }
    catch (std::system_error const& e) {
        // std::error_code is not an integral type, so switch on its value
        switch (e.code().value()) {
        case std::posix_error::operation_would_block:
        case std::posix_error::interrupted:
            break;
        default:
            throw;
        }
    }
Forcing the programmer to write a try-catch block to deal with common conditions like this is very inconvenient, and has the potential to introduce bad performance (the cost of a catch block). In our current design we ignore this question and replace error reporting with exceptions in all cases. Applications that might encounter EWOULDBLOCK, EINTR etc. must catch std::system_error exceptions and inspect the std::error_code to handle the situation.
The address information functions are special in that they don't share the error code "namespace" with the rest of this API and return an error code directly, instead of setting errno. To represent them correctly as std::system_error objects we must extend the standard mechanism with a new std::error_category, which we will call posix::ai_category, together with its supporting declarations.

    namespace posix {
        const std::error_category& get_ai_category ();
        static const std::error_category& ai_category = get_ai_category();
        namespace ai_error {
            enum ai_errno { /* enumerators elided */ };
        }
    }
    namespace std {
        template <>
        struct is_error_condition_enum<posix::ai_error::ai_errno> : public true_type { };
    }
    namespace posix {
        namespace ai_error {
            std::error_code make_error_code (ai_errno e);
            std::error_condition make_error_condition (ai_errno e);
        }
    }
With these new elements, consider an implementation of posix::getnameinfo:

    void posix::getnameinfo (const sockaddr *sa, socklen_t salen,
                             char *host, size_t hostlen,
                             char *serv, size_t servlen, int flags)
    {
        int status = ::getnameinfo(sa, salen, host, hostlen, serv, servlen, flags);
        if (status != 0) {
            throw std::system_error(status, ai_category);
        }
    }
The posix::ai_category object absorbs the responsibilities of posix::gai_strerror, so we can omit this function from our new binding.
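One way the category could absorb gai_strerror is sketched below against the std::error_category interface as published in C++11. This is an illustration under assumptions, not the binding's definition: the namespace posix_sketch, the class name ai_category_impl, the category name string "addrinfo" and the function-style accessor ai_category() are all hypothetical choices.

```cpp
#include <cassert>
#include <netdb.h>
#include <string>
#include <system_error>
#include <sys/socket.h>

namespace posix_sketch {   // hypothetical namespace, for illustration only

// Category that maps getaddrinfo/getnameinfo status codes to messages.
class ai_category_impl : public std::error_category {
public:
    const char* name() const noexcept override { return "addrinfo"; }
    std::string message(int condition) const override {
        // This is where gai_strerror is absorbed into the category.
        return ::gai_strerror(condition);
    }
};

// Function-local static: a single category object for the whole program.
inline const std::error_category& ai_category() {
    static const ai_category_impl instance;
    return instance;
}

// Wrapper in the style of posix::getnameinfo. The buffer lengths are
// declared as socklen_t here, matching current system headers.
inline void getnameinfo(sockaddr const* sa, socklen_t salen,
                        char* host, socklen_t hostlen,
                        char* serv, socklen_t servlen, int flags) {
    int status = ::getnameinfo(sa, salen, host, hostlen, serv, servlen, flags);
    if (status != 0) {
        // The status is an EAI_* value, not an errno value.
        throw std::system_error(status, ai_category());
    }
}

} // namespace posix_sketch
```

With such a category, e.code().message() on a caught exception yields the same text gai_strerror would have produced for the status value.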
Default Arguments
We use the default argument notation to simplify the use of functions where we identify some value for a parameter meaning "null" or "no value". Consider the protocol parameter for posix::socket: the value zero passed as argument means "use the default protocol". Consider the flags parameter for posix::recv: the value zero passed as argument means "no flags". Consider the last two parameters for posix::accept: the values null and zero passed as arguments mean "I am not interested in this information". So we make these values default arguments explicitly in the corresponding declarations.
The functions affected in this manner are posix::socket (protocol), posix::listen (backlog), posix::accept (address and address_len), posix::shutdown (mode), and posix::recv, posix::recvfrom, posix::recvmsg, posix::send, posix::sendto and posix::sendmsg (flags).
In some cases we identify a possible default argument that we can't make explicit because the position of the corresponding parameter forbids it. One example is the hints parameter of posix::getaddrinfo, which might accept null as a default argument but is not the last parameter in the list.
Function Overloading
We use function overloading to unite under one name functions that realize the same logical operation. The candidates for overloads are obvious: posix::read, posix::readv, posix::recv, posix::recvfrom and posix::recvmsg realize the same logical "read" operation, while posix::write, posix::writev, posix::send, posix::sendto and posix::sendmsg realize the same logical "write" operation. We observe further that many of these functions have similar parameter lists: a socket parameter, a buffer parameter and a buffer size parameter. We make them all the "same" function overloaded on the original parameter lists, calling the "read" operations posix::receive and the "write" operations posix::send.
In some cases, two declarations will be merged into one, because they differ only by a parameter with a default value; this is the case of posix::read and posix::recv, and of posix::write and posix::send, so our new binding becomes more compact than the original. Observe that this merge preserves semantics: posix::recv with the default argument behaves exactly like posix::read, and the same goes for posix::send and posix::write.
We considered the functions that deal with socket attributes as candidates for overloads of some function, but they are not sufficiently uniform; some functions just get attributes, some functions just set them, some do both things. We will use a more powerful technique to encapsulate this functionality.
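The merge of read with recv and the overload on parameter lists described above can be sketched as follows. This is a minimal illustration, assuming the published C++11 name std::generic_category(); the namespace posix_sketch, the helper check and the function round_trip_demo are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

namespace posix_sketch {   // hypothetical namespace, for illustration only

// Shared post-processing: throw on -1, otherwise pass the count through.
inline ssize_t check(ssize_t status) {
    if (status == -1) {
        throw std::system_error(errno, std::generic_category());
    }
    assert(status >= 0);   // post-condition: zero or positive
    return status;
}

// read and recv merged into one declaration: flags defaults to "no flags".
inline ssize_t receive(int socket, void* buffer, size_t length, int flags = 0) {
    return check(::recv(socket, buffer, length, flags));
}

// readv as an overload on the scatter/gather parameter list.
inline ssize_t receive(int socket, iovec const* iov, int iovcnt) {
    return check(::readv(socket, iov, iovcnt));
}

// write and send merged in the same way.
inline ssize_t send(int socket, void const* buffer, size_t length, int flags = 0) {
    return check(::send(socket, buffer, length, flags));
}

// Usage sketch: send through one end of a socket pair, then scatter the
// bytes into two buffers at the other end. Returns the bytes received.
// Descriptor cleanup on exceptions is omitted for brevity.
inline ssize_t round_trip_demo() {
    int fds[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1) {
        throw std::system_error(errno, std::generic_category());
    }
    posix_sketch::send(fds[0], "hello", 5);                  // flags defaulted
    char head[2], tail[3];
    iovec parts[2] = { { head, sizeof head }, { tail, sizeof tail } };
    ssize_t received = posix_sketch::receive(fds[1], parts, 2);   // iovec overload
    ::close(fds[0]);
    ::close(fds[1]);
    return received;
}

} // namespace posix_sketch
```

Overload resolution picks the iovec overload for an iovec array because the pointer needs only a qualification adjustment there, while the buffer overload would require a conversion to void*.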
Resource Acquisition is Initialization
We wrap the socket handle inside a socket class, making all operations member functions. We design class socket as a RAII class, equating allocating a socket with initializing an object of class socket, and deallocating a socket with destroying the object. We design it so the destructor will never throw; for applications that desire to catch deallocation errors we provide an explicit close member function. To keep the semantics simple we forbid copy construction and copy assignment, offering move construction and move assignment only. This way we can keep sockets in Containers while guaranteeing deallocation on destruction without overhead.
We add a default constructor to create objects which are effectively "null", the same state the object will be in after a call to the close member function or after being moved from. This is not strictly necessary; we could forbid creating objects in this state. The only use for such a state we can think of is "lazy initialization":

    socket client;               // empty object
    // do some other stuff
    client = listener.accept();  // actual initialization

class socket will be a typical resource class:

    class socket {
    public:
        socket ();
        socket (int domain, int socktype, int protocol = 0);
        socket (socket const& x) = delete;
        socket (socket&& x);
        ~socket ();
        socket& operator= (socket const& x) = delete;
        socket& operator= (socket&& x);
        void close ();
        // etc.
    private:
        int handle;
    };
All functions made member functions naturally lose their first arguments, which become implicit as the this object. The accept function is modified to return an object instead of a handle; this object will be moved out of the function by the move constructor. Example use:

    extern socket listener;
    socket client = listener.accept(); // move initialization
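A possible shape for the move-only resource class is sketched below, assuming -1 as the representation of the "null" state and std::generic_category() as the published C++11 error category. The namespace posix_sketch, the is_open helper, the private reset function and the move_demo function are hypothetical additions for illustration; the paper does not prescribe them.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <utility>
#include <sys/socket.h>
#include <unistd.h>

namespace posix_sketch {   // hypothetical namespace, for illustration only

class socket {
public:
    socket() noexcept : handle_(-1) {}                  // "null" object
    socket(int domain, int socktype, int protocol = 0)
        : handle_(::socket(domain, socktype, protocol)) {
        if (handle_ == -1) {
            throw std::system_error(errno, std::generic_category());
        }
    }
    socket(socket const&) = delete;
    socket& operator=(socket const&) = delete;

    // Moving transfers ownership and leaves the source "null".
    socket(socket&& x) noexcept : handle_(x.handle_) { x.handle_ = -1; }
    socket& operator=(socket&& x) noexcept {
        if (this != &x) {
            reset();
            handle_ = x.handle_;
            x.handle_ = -1;
        }
        return *this;
    }

    ~socket() { reset(); }                              // never throws

    // Explicit close for callers that want to observe deallocation errors.
    void close() {
        if (handle_ == -1) return;
        int status = ::close(handle_);
        handle_ = -1;   // treat the descriptor as released; no retry
        if (status == -1) {
            throw std::system_error(errno, std::generic_category());
        }
    }

    bool is_open() const noexcept { return handle_ != -1; }   // hypothetical helper
    int native_handle() const noexcept { return handle_; }

private:
    void reset() noexcept {
        if (handle_ != -1) { ::close(handle_); handle_ = -1; }
    }
    int handle_;
};

// Usage sketch: lazy initialization followed by a move.
inline bool move_demo() {
    socket client;                        // empty object
    socket fresh(AF_UNIX, SOCK_STREAM);   // acquires a descriptor
    int fd = fresh.native_handle();
    client = std::move(fresh);            // actual initialization
    return client.native_handle() == fd && !fresh.is_open();
}

} // namespace posix_sketch
```

The moved-from object ends up in the same state as a default-constructed one, so its destructor has nothing to release and the descriptor is closed exactly once.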
Last, we add the native_handle_type member typedef and the native_handle member function with the same semantics as described by class std::thread and others; also, we add a swap member function. native_handle will allow users to adapt objects of class socket to the readiness notification functions. Example use:

    extern socket listener;
    pollfd pfd;
    pfd.fd = listener.native_handle();
    pfd.events = POLLIN;
    poll(&pfd, 1, -1); // wait for readiness
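The readiness call can be exercised without a listening socket by using a connected socket pair, as in the sketch below. A raw descriptor stands in for the value that native_handle would return; the namespace posix_sketch, the throwing poll wrapper and the readiness_demo function are hypothetical names for this illustration, and std::generic_category() is the published C++11 category.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

namespace posix_sketch {   // hypothetical namespace, for illustration only

// Wrapper in the style of posix::poll: throws on -1, otherwise returns
// the number of descriptors with pending events.
inline int poll(pollfd* fds, nfds_t nfds, int timeout) {
    int ready = ::poll(fds, nfds, timeout);
    if (ready == -1) {
        throw std::system_error(errno, std::generic_category());
    }
    return ready;
}

// Usage sketch: make one end of a socket pair readable, then wait on it.
// Descriptor cleanup on exceptions is omitted for brevity.
inline bool readiness_demo() {
    int fds[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1) {
        throw std::system_error(errno, std::generic_category());
    }
    const char byte = 'x';
    bool sent = ::send(fds[0], &byte, 1, 0) == 1;

    pollfd pfd = {};
    pfd.fd = fds[1];          // stand-in for some_socket.native_handle()
    pfd.events = POLLIN;
    int ready = posix_sketch::poll(&pfd, 1, 1000);   // wait up to one second

    ::close(fds[0]);
    ::close(fds[1]);
    return sent && ready == 1 && (pfd.revents & POLLIN) != 0;
}

} // namespace posix_sketch
```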
Generic Programming
Sixth, we apply generic programming techniques to complete the design. Following generic programming design principles, we search for opportunities to turn individual functions into one generic function.
We find an opportunity in the set of functions that deal with socket addresses. These functions must accept many kinds of socket address objects, to support different socket address families, so they take as parameters an opaque pointer and the size of the pointed-to data. We understand these as generic functions taking as template parameter the type of the socket address object. Consider the declaration for the socket::bind member function:

    void socket::bind (sockaddr const* address, socklen_t length);
Conceptually, this function takes one socket address object as parameter. Making it a template function, it becomes:

    template <typename SocketAddress>
    void socket::bind (SocketAddress const& address);
As we know statically the type of the socket address, the size parameter becomes unnecessary. The template function can easily be implemented in terms of the original function:

    {
        ::bind(handle, (sockaddr const*)&address, sizeof(SocketAddress));
    }
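The same idea is sketched below as free function templates over a raw descriptor, so that the example stays self-contained. The namespace posix_sketch and the bound_port_demo function are hypothetical; the out-parameter form of getsockname is one plausible reading of the design, not a signature taken from the paper, and std::generic_category() is the published C++11 category.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

namespace posix_sketch {   // hypothetical namespace, for illustration only

// The address type is known statically, so its size is not a parameter.
template <typename SocketAddress>
void bind(int socket, SocketAddress const& address) {
    sockaddr const* raw = reinterpret_cast<sockaddr const*>(&address);
    if (::bind(socket, raw, sizeof(SocketAddress)) == -1) {
        throw std::system_error(errno, std::generic_category());
    }
}

// Same treatment for a function that fills in an address.
template <typename SocketAddress>
void getsockname(int socket, SocketAddress& address) {
    socklen_t length = sizeof(SocketAddress);
    if (::getsockname(socket, reinterpret_cast<sockaddr*>(&address), &length) == -1) {
        throw std::system_error(errno, std::generic_category());
    }
}

// Usage sketch: bind an IPv4 socket to an ephemeral port and read the
// chosen port back. Returns the port in host byte order.
// Descriptor cleanup on exceptions is omitted for brevity.
inline unsigned bound_port_demo() {
    int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        throw std::system_error(errno, std::generic_category());
    }
    sockaddr_in requested = {};
    requested.sin_family = AF_INET;
    requested.sin_addr.s_addr = htonl(INADDR_ANY);
    requested.sin_port = htons(0);            // let the system pick a port
    posix_sketch::bind(fd, requested);        // no size argument needed

    sockaddr_in actual = {};
    posix_sketch::getsockname(fd, actual);
    ::close(fd);
    return ntohs(actual.sin_port);
}

} // namespace posix_sketch
```

A sockaddr_in6 or sockaddr_un object can be passed to the same templates without any change at the call site, which is the benefit the text argues for.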
As the non-template function offers no advantages over the template function we replace it entirely; functions affected in this manner are the socket::connect, socket::bind, socket::accept, socket::recvfrom, socket::sendto, socket::getsockname and socket::getpeername member functions and the posix::getnameinfo free function.
We find a second opportunity in the set of functions that deal with socket attributes. Before, we tried making them overloads of one function but concluded they were too heterogeneous. We observe that, each in its own way, these functions represent the attribute to be operated upon as some kind of tag value: socket::fcntl and socket::ioctl represent it as one integral value; socket::getsockopt and socket::setsockopt represent it as a pair of integral values. The socket::getsockname and socket::getpeername functions were already turned into template functions, so we exclude them from our current consideration.
We observe that these functions, in a mixed manner, realize two logical operations: "getting" an attribute and "setting" an attribute. We may consider them as the implementation mechanism of a pair of generic functions for getting and setting attributes. Every attribute is identified by a class that encapsulates statically that which is necessary to identify the attribute for the implementation function, as well as a static tag that identifies the implementation function which must be used. The public generic functions can dispatch the user call to an implementation function through tag dispatching based on the static information embedded in the attribute type.
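
    class socket {
    public:
        template <typename Attribute>
        void get_attribute (Attribute& attribute)
        {
            typename Attribute::socket_attribute_tag tag;
            this->get_attribute(attribute, tag);
        }
        template <typename Attribute>
        void set_attribute (Attribute&& attribute)
        {
            // Attribute may be deduced as a reference type; strip the
            // reference before naming the nested tag type
            typename std::remove_reference<Attribute>::type::socket_attribute_tag tag;
            this->set_attribute(attribute, tag);
        }
    };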
For each attribute tag, the implementation will provide a corresponding template function that calls the appropriate function to operate on the attribute. An example of an attribute class:

    struct receive_buffer {
        // public stuff
        typedef int value_type;
        value_type value;
        receive_buffer ();
        receive_buffer (value_type v);
        // tag dispatching implementation
        typedef sockopt_tag socket_attribute_tag;
        static const int attribute_level;
        static const int attribute_name;
    };
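The pieces of the attribute mechanism can be assembled into a compilable sketch as follows. The non-owning class socket_ref stands in for the socket class, and the mapping of receive_buffer to SOL_SOCKET and SO_RCVBUF is an assumption based on the attribute's name; posix_sketch, socket_ref and receive_buffer_demo are hypothetical names, and set_attribute takes its argument by const reference here to keep the sketch simple.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <sys/socket.h>
#include <unistd.h>

namespace posix_sketch {   // hypothetical namespace, for illustration only

struct sockopt_tag {};     // selects the getsockopt/setsockopt implementation

struct receive_buffer {
    typedef int value_type;
    value_type value;
    receive_buffer() : value(0) {}
    receive_buffer(value_type v) : value(v) {}
    typedef sockopt_tag socket_attribute_tag;
    static const int attribute_level = SOL_SOCKET;   // assumed mapping
    static const int attribute_name = SO_RCVBUF;     // assumed mapping
};

class socket_ref {         // hypothetical non-owning stand-in for socket
public:
    explicit socket_ref(int handle) : handle_(handle) {}

    // Public generic functions: dispatch on the tag embedded in the type.
    template <typename Attribute>
    void get_attribute(Attribute& attribute) {
        this->get_attribute(attribute, typename Attribute::socket_attribute_tag());
    }
    template <typename Attribute>
    void set_attribute(Attribute const& attribute) {
        this->set_attribute(attribute, typename Attribute::socket_attribute_tag());
    }

private:
    // Implementation functions selected by sockopt_tag.
    template <typename Attribute>
    void get_attribute(Attribute& attribute, sockopt_tag) {
        socklen_t length = sizeof attribute.value;
        if (::getsockopt(handle_, Attribute::attribute_level,
                         Attribute::attribute_name, &attribute.value, &length) == -1) {
            throw std::system_error(errno, std::generic_category());
        }
    }
    template <typename Attribute>
    void set_attribute(Attribute const& attribute, sockopt_tag) {
        if (::setsockopt(handle_, Attribute::attribute_level,
                         Attribute::attribute_name, &attribute.value,
                         sizeof attribute.value) == -1) {
            throw std::system_error(errno, std::generic_category());
        }
    }
    int handle_;
};

// Usage sketch: request a buffer size, then read back what the system
// reports. The reported value may differ from the requested one.
inline int receive_buffer_demo() {
    int fd = ::socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) {
        throw std::system_error(errno, std::generic_category());
    }
    socket_ref s(fd);
    s.set_attribute(receive_buffer(65536));
    receive_buffer attr;
    s.get_attribute(attr);
    ::close(fd);
    return attr.value;
}

} // namespace posix_sketch
```

Adding an attribute backed by fcntl would mean introducing another tag type and another pair of private overloads, leaving the public get_attribute and set_attribute untouched.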
The socket_attribute_tag member typedef identifies that we must use socket::getsockopt and socket::setsockopt as implementation functions, and the attribute_level and attribute_name static data members will be used as the level and option_name arguments when calling those functions. Likewise, other types of attributes may have a tag corresponding to some other implementation function, such as socket::fcntl, and other static data members identifying to which attribute it corresponds.
We retrieve the receive_buffer attribute from a socket like this:

    socket s;
    receive_buffer attr;
    s.get_attribute(attr);
    std::cout