Archive for the ‘Protocols’ Category

Creating a Websocket server with Websocket++

September 16, 2015

Recently I had to add a Websocket server to a C++ project. Some research showed that the options here aren’t too many. There are a few C-based options, and one can of course pick the Websocket module from the POCO libraries [1] if one desires a C++ approach. Since this particular project is written in C++ I much preferred a purely C++ solution, preferably stand-alone. Ultimately I picked the (creatively named) Websocket++ library [2], also referred to as Websocketpp. As mentioned, the main arguments were that it is an object-oriented C++ solution without significant dependencies, and that it is easy to integrate thanks to being a header-only library.

Websocket++ is a fairly modular library, making heavy use of templating to assemble various configurations, end points and similar into one coherent whole. As the basis for the transport module one can for example pick from iostreams (very slow) and ASIO. For the latter one can pick between Boost ASIO and stand-alone ASIO. There is also the option of using no C++11 features and using the Boost alternatives instead. Since this project involved a number of compile targets, not all of which featured a C++11-capable compiler, the final configuration involved a Boost dependency using its ASIO and system library, as well as various other header-only dependencies.

After starting the actual integration of the library into my project, I did however find out that the quality of the documentation is very… sub-optimal. The documentation is split between the GitHub site and the author’s own site, with most of this documentation being completely and utterly outdated. Only after significant amounts of trial and error did I manage to get a fully working implementation. To save others the trouble, I would like to hereby present a (simplified and altered) version of my implementation. I hope it will be useful.

Let’s move on to the header file of our implementation:

#include "websocketpp/server.hpp"
#include "websocketpp/config/asio_no_tls.hpp"

With these two includes we pull in the Websocket++ server role and the ASIO configuration without the TLS feature, meaning no encrypted connections.

class WebsocketServer {
public:
	static bool init();
	static void run();
	static void stop();

	static bool sendClose(string id);
	static bool sendData(string id, string data);

private:
	static bool getWebsocket(const string &id, websocketpp::connection_hdl &hdl);

	static websocketpp::server<websocketpp::config::asio> server;
	static pthread_rwlock_t websocketsLock;
	static map<string, websocketpp::connection_hdl> websockets;
	static LogStream ls;
	static ostream os;

	// callbacks
	static bool on_validate(websocketpp::connection_hdl hdl);
	static void on_fail(websocketpp::connection_hdl hdl);
	static void on_close(websocketpp::connection_hdl hdl);
};

Our class definition implements a static class. This will allow us to use the Websocket functionality from multiple classes. Websocket++ is thread-safe, so all we have to worry about is concurrent access from multiple threads to our own data structures and variables.

Moving on to the implementation, we can see first the usual static initialisations and namespace merging:

// namespace merging (has to come before the first use of connection_hdl)
using websocketpp::connection_hdl;

// static initialisations
websocketpp::server<websocketpp::config::asio> WebsocketServer::server;
map<string, connection_hdl> WebsocketServer::websockets;
pthread_rwlock_t WebsocketServer::websocketsLock = PTHREAD_RWLOCK_INITIALIZER;
LogStream WebsocketServer::ls;
ostream WebsocketServer::os(&ls);

Next is initialising the library and the server instance:

bool WebsocketServer::init() {
	// Initialising WebsocketServer.
	server.init_asio();

When using the ASIO transport option, we call its init method here.

	// Set custom logger (ostream-based).
	server.get_alog().set_ostream(&os);
	server.get_elog().set_ostream(&os);

We may want to redirect the logging output to our own logging method. Websocket++’s basic logger allows us to set an ostream alternative for the standard std::cout and std::cerr. We will look at this in more detail later on.

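	// Register the message handlers.
	server.set_validate_handler(&WebsocketServer::on_validate);
	server.set_fail_handler(&WebsocketServer::on_fail);
	server.set_close_handler(&WebsocketServer::on_close);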

Next we set the message handlers. These are all callback methods we will define in a moment.

	// Listen on port.
	int port = 8082;
	try {
		server.listen(port);
	} catch(websocketpp::exception const &e) {
		// Websocket exception on listen. Get char string via e.what().
		return false;
	}

With all the configuration done, we can start listening using the transport framework. This is done with the listen() call on the server object. This method is not exception-free, so we surround it with a try/catch block.

	// Starting Websocket accept.
	websocketpp::lib::error_code ec;
	server.start_accept(ec);
	if (ec) {
		// Can log an error message with the contents of ec.message() here.
		return false;
	}

	return true;
}

Finally we start accepting connections. We just need to start the server proper now, which is done in the following function:

void WebsocketServer::run() {
	try {
		server.run();
	} catch(websocketpp::exception const &e) {
		// Websocket exception. Get message via e.what().
	}
}

Again, this is another method which isn’t exception-free, so we have to surround it with a try/catch block. The other thing to note here is that when we shut down the server at some point, we have to wait for this (blocking) run() call to return before we for example terminate a thread.

void WebsocketServer::stop() {
	// Stopping the Websocket listener and closing outstanding connections.
	websocketpp::lib::error_code ec;
	server.stop_listening(ec);
	if (ec) {
		// Failed to stop listening. Log reason using ec.message().
	}

	// Close all existing websocket connections.
	string data = "Terminating connection...";
	pthread_rwlock_rdlock(&websocketsLock); // keep the map stable while iterating.
	map<string, connection_hdl>::iterator it;
	for (it = websockets.begin(); it != websockets.end(); ++it) {
		websocketpp::lib::error_code ec;
		server.close(it->second, websocketpp::close::status::normal, data, ec); // send close frame.
		if (ec) { // we got an error
			// Error closing websocket. Log reason using ec.message().
		}
	}

	pthread_rwlock_unlock(&websocketsLock);

	// Stop the endpoint.
	server.stop();
}

Shutting down the Websocket server is fairly obvious: first we stop listening. This means we will no longer accept new connections. Next we go through all of the websocket connections we still have and close every single one of them. Finally we call stop() on the server object. This isn’t strictly necessary, but it will ensure that the transport backend is completely shut down and any remaining connections forcefully terminated.

Let’s move on to actually accepting new connections. For this we can use a number of handlers [3], including open() and validate. I picked the validate handler, since it allows one to filter incoming connections and reject any which do not authenticate properly or such:

bool WebsocketServer::on_validate(connection_hdl hdl) {
	websocketpp::server<websocketpp::config::asio>::connection_ptr con = server.get_con_from_hdl(hdl);
	websocketpp::uri_ptr uri = con->get_uri();
	string query = uri->get_query(); // returns empty string if no query string set.
	string id; // placeholder, to be filled from the query string.
	if (!query.empty()) {
		// Split the query parameter string here, if desired.
		// We assume we extracted a string called 'id' here.
	}
	else {
		// Reject if no query parameter provided, for example.
		return false;
	}

	if (pthread_rwlock_wrlock(&websocketsLock) != 0) {
		// Failed to write-lock websocketsLock. Reject rather than touch the map unlocked.
		return false;
	}

	websockets.insert(std::pair<string, connection_hdl>(id, hdl));
	if (pthread_rwlock_unlock(&websocketsLock) != 0) {
		// Failed to unlock websocketsLock.
	}

	return true;
}

This code shows how to obtain the connection behind a connection handle from Websocket++ and to extract the URI including its query parameter string from it.

Here we assume that the connection client has to provide a string-based ID, though one can also use another identifier, based on the implementation. We use pthread-based locking around the websockets map to ensure no concurrent access takes place on this data structure and insert the new websocket handle with its id as key.
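The splitting of the query string is left out above. What follows is a minimal sketch of one way to do it, assuming the client connects with a URI along the lines of ws://host:8082/?id=abc123. The helper name getQueryParam() and the parameter name 'id' are made up for this example, and no percent-decoding is performed:

```cpp
#include <string>

// Hypothetical helper: returns the value of parameter 'name' from a query
// string such as "id=abc123&foo=bar", or an empty string if it is absent.
// Note: no percent-decoding is performed.
std::string getQueryParam(const std::string &query, const std::string &name) {
	std::string::size_type pos = 0;
	while (pos <= query.size()) {
		// Each key=value pair ends at the next '&' or at the end of the string.
		std::string::size_type end = query.find('&', pos);
		if (end == std::string::npos) { end = query.size(); }

		std::string pair = query.substr(pos, end - pos);
		std::string::size_type eq = pair.find('=');
		if (eq != std::string::npos && pair.compare(0, eq, name) == 0) {
			return pair.substr(eq + 1);
		}

		pos = end + 1;
	}

	return std::string();
}
```

Inside on_validate() this could be used as string id = getQueryParam(query, "id"); after which an empty result is a reason to reject the connection.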

We may also wish to implement the fail() and close() handlers:

void WebsocketServer::on_fail(connection_hdl hdl) {
	websocketpp::server<websocketpp::config::asio>::connection_ptr con = server.get_con_from_hdl(hdl);
	websocketpp::lib::error_code ec = con->get_ec();
	// Websocket connection attempt by client failed. Log reason using ec.message().
}

void WebsocketServer::on_close(connection_hdl hdl) {
	// Websocket connection closed.
}

For the fail handler, we can obtain the connection as before, and extract the error code object to learn the reason behind the failure.

The close handler should generally be fairly boring, but it can be informative to have the confirmation in a log or such of a successfully closed connection.

Moving on, we just have to look at how to send data to such a socket.

bool WebsocketServer::sendData(string id, string data) {
	connection_hdl hdl;
	if (!getWebsocket(id, hdl)) {
		// Sending to non-existing websocket failed.
		return false;
	}

	websocketpp::lib::error_code ec;
	server.send(hdl, data, websocketpp::frame::opcode::text, ec); // send text message.
	if (ec) { // we got an error
		// Error sending on websocket. Log reason using ec.message().
		return false;
	}

	return true;
}

This function obtains the appropriate connection handle based upon the ID, then proceeds to write the provided data to this connection. The getWebsocket() method is a trivial find() on the STL map and isn’t further documented here. Do not forget to lock the map while performing the lookup and copying the handle out of it.
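For those who would like to see the shape of such a lookup anyway, here is a rough sketch. To keep it compilable on its own it uses a stand-in Handle type instead of websocketpp::connection_hdl, and the names handles, handlesLock and getHandle() are assumptions made for this example rather than the code from my project:

```cpp
#include <map>
#include <string>
#include <pthread.h>

// Stand-in for websocketpp::connection_hdl so the sketch compiles on its own.
typedef int Handle;

static std::map<std::string, Handle> handles;
static pthread_rwlock_t handlesLock = PTHREAD_RWLOCK_INITIALIZER;

// Looks up 'id' under a read lock and copies the handle into 'hdl'.
// Returns false if the lock could not be taken or the id is unknown.
bool getHandle(const std::string &id, Handle &hdl) {
	if (pthread_rwlock_rdlock(&handlesLock) != 0) { return false; }

	bool found = false;
	std::map<std::string, Handle>::const_iterator it = handles.find(id);
	if (it != handles.end()) {
		hdl = it->second;	// copy while still holding the lock
		found = true;
	}

	pthread_rwlock_unlock(&handlesLock);
	return found;
}
```

A read lock suffices here since the map is not modified, which lets multiple threads perform lookups at the same time while inserts and erases take the write lock.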

Lastly, how to close a socket:

bool WebsocketServer::sendClose(string id) {
	connection_hdl hdl;
	if (!getWebsocket(id, hdl)) {
		// Closing non-existing websocket failed.
		return false;
	}

	string data = "Terminating connection...";
	websocketpp::lib::error_code ec;
	server.close(hdl, websocketpp::close::status::normal, data, ec); // send close message.
	if (ec) { // we got an error
		// Error closing websocket. Log reason using ec.message().
		return false;
	}

	// Remove websocket from the map.
	pthread_rwlock_wrlock(&websocketsLock);
	websockets.erase(id);
	pthread_rwlock_unlock(&websocketsLock);

	return true;
}

Here we again obtain the proper connection handle, only this time we use the ‘close’ method instead of ‘send’. We can send a close reason using a string, or just send an empty string.

Finally the ID is erased from the websockets map and the now invalid connection handle with it.

With this we have everything we need for the Websocket server, except for one thing: the redirecting of the logging output from Websocket++. We saw earlier that we use the set_ostream() method on the logging interfaces. In the class declaration we saw this mysterious ‘LogStream’ type and an ostream, and again in the static initialisations.

What happens here is that this LogStream class is a custom implementation of std::streambuf, assigned to an std::ostream object which then replaces the standard outputs for Websocket++’s logging. For the actual streambuf implementation, one would use something like this:

class LogStream : public streambuf {
private:
	string buffer;

protected:
	int overflow(int ch) override {
		buffer.push_back((char) ch);
		if (ch == '\n') {
			// End of line, write to logging output and clear buffer.
			buffer.clear();
		}

		// Return traits::eof() for failure.
		return ch;
	}
};

We just override the virtual overflow() method in the streambuf class. Since we do not set up a put area, the buffer ‘overflows’ for every character written to the streambuf and thus our overflow() method is called for each character.

Using a string as buffer, we capture each received character and check whether it is a newline character or not. If it is we have a complete line which we can then write to whatever logging functionality we use in our project. After this we empty the buffer string and continue with the new line.
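To see the mechanism at work outside of Websocket++, the following self-contained sketch captures completed lines in a vector. The vector stands in for a real logging call, and the names LineCollector and demoLogCapture() are made up for this example:

```cpp
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Line-collecting streambuf: every completed line is appended to 'lines'.
// The vector stands in for a real logging call; it is an assumption made
// for this sketch.
class LineCollector : public std::streambuf {
public:
	std::vector<std::string> lines;

protected:
	// No put area is set, so overflow() is called for every character.
	int_type overflow(int_type ch) {
		if (traits_type::eq_int_type(ch, traits_type::eof())) {
			return traits_type::not_eof(ch);	// nothing to write
		}

		if (traits_type::to_char_type(ch) == '\n') {
			lines.push_back(buffer);	// hand the finished line to the "logger"
			buffer.clear();
		}
		else {
			buffer.push_back(traits_type::to_char_type(ch));
		}

		return ch;
	}

private:
	std::string buffer;
};

// Writes two lines through an ostream and returns what the collector received.
std::vector<std::string> demoLogCapture() {
	LineCollector collector;
	std::ostream os(&collector);
	os << "first line\n" << "second " << 42 << '\n';
	return collector.lines;
}
```

Unlike the class above, this variant drops the newline character itself, which tends to be what a line-based logging function expects.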

In conclusion, I must say that despite the effort it cost me to get a working integration of Websocket++ in my project, I do think it was worth it. Technically it is a well-designed library with a lot of cool features and thanks to its template-based nature ease of expansion and configuration to fit different purposes. Its main weakness is simply the outdated, lacking and occasionally wrong documentation and examples. Hopefully this article will fix at least part of that problem 🙂



Binary Network Protocol Implementation In Java Using Byte Arrays

July 26, 2013

Java in many ways is a very eccentric programming language. Reading the designer’s responses to questions on its design leads to interesting ideas, such as that unsigned integer types would be confusing and error-prone to the average programmer. There’s also the thought that Java is purely object-oriented, even though it has many primitive types and concepts lurking in its depths. Its design poses very uncomfortable issues for developers who seek to read, write and generally handle binary data and files, as the entire language seems to be oriented towards text-based formats such as XML. This leads one to such problems as how to implement a basic binary networking protocol.

Many network and communication protocols are binary as this makes them easier and faster to parse, more light-weight to transfer and generally less prone to interpretation. The question hereby is how to implement such a protocol in a language which is wholly unfamiliar with the concepts of unsigned integers, operator overloading and similar. The most elegant answer I have found so far is to stay low-level, and I really do mean low-level. We will treat Java’s built-in signed integers as though they are unsigned using bitwise operators where necessary and use byte-arrays to translate between Java and the outside world.

The byte type in Java is an 8-bit signed integer with a range from -128 to 127. For our purposes we will ignore the sign bit and treat it as an unsigned 8-bit integer. Network communication occurs in streams of bytes, with the receiving side interpreting it according to a predefined protocol. This means that to write on the Java side to the network socket we will have to put the required bytes into a prepared byte array. As Java arrays are fixed size like in C, it makes the most sense to either use one byte-array per field or to pre-allocate the whole array and copy the bytes into it.

Writing is done into the Java Socket via its OutputStream which we wrap into a BufferedOutputStream.

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

public class BinaryExample {
	Socket mSocket;
	String mServer = "";
	int mServerPort = 123;
	byte[] header = {0x53, 0x41, 0x4D, 0x50, 0x4C, 0x45}; // SAMPLE
	int mProtocolVersion = 0;
	OutputStream mOutputStream;
	InputStream mInputStream;
	BufferedOutputStream mBufferedOutputStream;

	public void run() {
		try {
			// set up connection with server
			this.mSocket = new Socket(mServer, mServerPort);
		} catch (Exception ee) {
			// connection failed. Abort.
			return;
		}

		// get the I/O streams for the socket.
		try {
			mOutputStream = this.mSocket.getOutputStream();
			mBufferedOutputStream = new BufferedOutputStream(mOutputStream);
			mInputStream = this.mSocket.getInputStream();
		} catch (IOException e) {
			// could not obtain the streams. Abort.
			return;
		}

		byte version = (byte) mProtocolVersion;
		// 4 bytes for the length field, plus the header, plus 1 byte for the version.
		int messageLength = 4 + header.length + 1;
		byte[] msgSize = intToByteArray(messageLength);

		// write to the socket
		try {
			mBufferedOutputStream.write(msgSize);
			mBufferedOutputStream.write(header);
			mBufferedOutputStream.write(version);
			mBufferedOutputStream.flush();
		} catch (IOException e1) {
			// write failed. Abort.
			return;
		}

		// ... reading the response continues here.
	}

	// Writes provided 4-byte integer to a 4 element byte array in Little-Endian order.
	public static final byte[] intToByteArray(int value) {
		return new byte[] {
			(byte)(value & 0xff),
			(byte)(value >> 8 & 0xff),
			(byte)(value >> 16 & 0xff),
			(byte)(value >>> 24)
		};
	}
}
Any ASCII strings in the protocol we define as individual bytes. Fortunately the ASCII codes only go to 127 (0x7F) and thus fit within the positive part of Java’s byte type. For values stretching into the negative range of the byte we might have to use bit masking to deal with the sign bit, or do the conversion ourselves. We define the protocol version as an int (BE signed, 32-bit), which we convert to a byte using a simple cast, stripping off the higher three bytes. Again pay attention to the value of the int. If it’s higher than 127 you have to deal with the sign bit again or risk an overflow.

In this example we implement a little-endian (LE) protocol. This means that in converting to a byte array from a 16-bit or larger integer we have to place the LSB first, as is done in the function intToByteArray(). We also add a message length indicator at the beginning of the message we’re sending in the form of an int, extending the message by 4 bytes.

Reading the response and interpreting it is similar:

		// wait for response. This is a blocking example.
		byte[] responseBytes = new byte[5];
		int bytesRead = 0;
		try {
			bytesRead = mInputStream.read(responseBytes, 0, 5);
		} catch (IOException e1) {
			// read failed. Abort.
			return;
		}

		if (bytesRead != 5) {
			// communication error. Abort.
			return;
		}

		// the fifth byte now contains the value of the response code. 0 means OK, everything else is an error.
		int responseCode = responseBytes[4] & 0xFF; // mask to treat the byte as unsigned.
		if (responseCode != 0) { return; }

This is a brief and naive sample which just has to read a single response, skipping the message length indicator and reading just five bytes. In a more complex application you would convert the individual sections of the byte array to their respective formats (strings, ints, etc.) and verify them. For this you would use a function to invert from LE-order byte array to BE-order int such as the following:

	// Converts provided 4-byte array containing a little-endian integer to an int.
	// The first byte is the least significant one, mirroring intToByteArray().
	public static final int byteArrayToInt(byte[] value) {
		int ret = (value[0] & 0xFF) | ((value[1] & 0xFF) << 8) |
					((value[2] & 0xFF) << 16) | ((value[3] & 0xFF) << 24);

		return ret;
	}

In many ways it’s ironic that bit shifts and bitwise operators are the way to go with a language which profiles itself as a high-level language, but such is the result of the design choices made. While it is true that the above byte array-oriented code could be encapsulated by fancy classes which would take the tediousness out of implementing such a protocol, in essence they would do the exact same as detailed above. With the upcoming Java 8 release support for unsigned integers will be introduced for the first time in a limited manner (as helper methods operating on the existing signed types), but for most projects (including Android-based ones) it’s not an option to upgrade to it.

For reference, the above code is used in an actual project I’m working on and is as far as I am aware functional. I can however not accept any liability for anything going haywire, applications crashing, marriages torn up or pets set on fire. Any further checks and handling of errors is probably an awesome idea to make the code more robust.


Design Your Own Protocol In Five Minutes

October 3, 2011

Among the most scary and official sounding terms in computing we find the word ‘protocol’. Its meaning really isn’t that scary, however. Just like when used in other contexts, all it means is a collection of agreements about how to go about something. In this case we’re talking about communication protocols, protocols which allow two or more devices and/or applications to communicate with each other.

Much like how humans have developed their own communication protocols, basically. We also do a handshake part during which we initialize the connection, whether it’s by smiling at each other, remarking on the beautiful/terrible weather or asking after something specific, depending on whether there was previous contact or not. Possible failure modes include getting ignored (Server Time-out), getting slapped in the face after a failed pick-up line (Connection Closed By Host) or interrupted by the girl’s muscular boyfriend (Connection Reset By Peer), as well as addressing the wrong person (Connection Denied).

After successfully establishing the connection, information is exchanged. For humans both during handshake and communication the form used for information exchange is a so-called language, a rather organic and informal set of syllables which when put into the right order (‘spelling’ and ‘grammar’) can be used to evoke understanding in the receiving party. To even get to this level, humans needed tens of thousands of years to evolve a series of grunts and other random noises into something coherent. Suffice it to say that human communication protocols are elaborate, imprecise, filled with misunderstandings and are a clear example of how not to design a communication protocol 🙂

Finally, ending the connection. Again, for humans this can take many forms, generally fails to result in a clean termination and can add many more minutes to a connection. Aren’t we glad now that we are designing a communication protocol for computers?

All joking aside, designing a communication protocol is fairly easy. The first choice we have to make is whether we want the protocol to be binary or text-based. Text-based protocols include the HTTP protocol, which is what we use to browse webpages with. Main benefit of it is that it’s easy for humans to write it out and debug it. Main disadvantage is that it’s less precise and exact in that generally you can’t parse it in one go, can’t instantly verify that it is valid as a whole and using the wrong text encoding can mess things up quite badly. You’ll quickly find that it’s a cumbersome and error-prone way to go about a communication protocol. It’s no wonder that they’re fairly rarely used, mostly with network applications for some reason.

Text-based protocols have the benefit of not being affected by endianness [1], which is the byte order used by a particular system. Little endian is what Intel and AMD processors use, and means that the least significant (little) byte of a multi-byte value is placed first, while big endian is the opposite. This means that if we take the number 14 (hexadecimal 0x0E), in little endian a resulting four-byte integer looks like this: 0E 00 00 00, whereas with big endian it looks like: 00 00 00 0E. Confusing little endian with big or the other way around will lead to interpreting the number wrongly and making our small number of 14 into a much larger number of 234,881,024 (0x0E000000). Oops.
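The difference is easy to demonstrate by reading the same four bytes both ways. This is just a small sketch; the function names readLE32() and readBE32() are made up for this example:

```cpp
#include <stdint.h>

// Interprets four bytes as a little-endian unsigned 32-bit integer:
// the first byte is the least significant one.
uint32_t readLE32(const unsigned char *b) {
	return (uint32_t) b[0] | ((uint32_t) b[1] << 8) |
			((uint32_t) b[2] << 16) | ((uint32_t) b[3] << 24);
}

// Interprets the same four bytes as a big-endian unsigned 32-bit integer:
// the first byte is the most significant one.
uint32_t readBE32(const unsigned char *b) {
	return ((uint32_t) b[0] << 24) | ((uint32_t) b[1] << 16) |
			((uint32_t) b[2] << 8) | (uint32_t) b[3];
}
```

Fed the bytes 0E 00 00 00, readLE32() returns 14, while readBE32() returns 0x0E000000. Because both functions assemble the value with shifts instead of casting the buffer to an integer pointer, they behave the same regardless of the endianness of the machine they run on.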

To solve this problem with binary protocols which might be used in mixed endian environments, we add a magic number to the front of the header, usually two bytes with known values. By reading those we know which endianness the data is in. One example is using ‘MM’ like in TIFF file headers to indicate big endian (MSB) and ‘ll’ to indicate little endian (LSB) byte order (TIFF itself uses ‘II’ for the latter). We can then enter a different parsing routine, or swap the byte order while parsing.

Writing out the protocol itself is a fairly easy and in my experience fun task, but I may just be a tad crazy. It is made easiest when you know what the requirements for the protocol are, but in general we start with the endianness indicator if needed, then one or more indicators identifying the header as being what is expected. I generally use the name of my company followed by the protocol name. After that the data follows. Sections within the data have their own text headers to detect corruption. Where offsets aren’t fixed such as with text strings, an unsigned integer precedes the data to indicate the length of the segment.

The basic protocol thus looks as follows:

Field       Type
ll/MM       uint8(2)
size        uint32
NYANKO      uint8(6)
UDS         uint8(3)
command     uint8(4)

To send a UDS protocol ‘LIST’ command to the server, we would use the following code, this one using a QByteArray:

QByteArray data;
data = "ll";
quint32 size = 19;
// Append the size least significant byte first, matching the 'll' marker.
for (int i = 0; i < (int) sizeof(size); ++i) {
	data.append((char) ((size >> (i * 8)) & 0xFF));
}

data += "NYANKOUDS";
data += "LIST";

Did I mention yet that bitwise operators are important? 🙂 When dealing with low-level interactions such as communication protocols, they are invaluable and one’d do well to study them. Finally, I’d like to comment on the ‘size’ variable used. With network protocols it’s hard for the receiving socket to know when the end of the data has been reached. Putting the size of the whole data at the front of the header like this allows it to know exactly how much data still has to be received, when the data end has been reached and when a download is incomplete.

Parsing the protocol is essentially the opposite of putting it together. It’s done in a linear fashion, with checks for every value read. If done right it’s very robust and quite fool-proof.
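As an illustration of such a linear parse, here is a sketch in plain C++ without Qt. It assumes the layout listed earlier, with the size field counting the whole message including the marker and the size field itself (which matches the value of 19 used for the ‘LIST’ command). The names UdsMessage, buildMessage() and parseMessage() are made up for this example:

```cpp
#include <stdint.h>
#include <string>

// Result of parsing one message. All names are made up for this sketch.
struct UdsMessage {
	bool littleEndian;
	uint32_t size;
	std::string command;
};

// Builds a little-endian message for a 4-character command, e.g. "LIST".
std::string buildMessage(const std::string &command) {
	std::string msg = "ll";
	uint32_t size = 2 + 4 + 6 + 3 + (uint32_t) command.size();
	for (int i = 0; i < 4; ++i) {
		msg.push_back((char) ((size >> (i * 8)) & 0xFF));	// LSB first
	}

	msg += "NYANKOUDS";
	msg += command;
	return msg;
}

// Parses 'data' front to back, checking every field. Returns false on the
// first mismatch; 'out' is only valid when true is returned.
bool parseMessage(const std::string &data, UdsMessage &out) {
	const std::string::size_type messageLength = 2 + 4 + 6 + 3 + 4;
	if (data.size() < messageLength) { return false; }

	// Byte order marker.
	if (data.compare(0, 2, "ll") == 0) { out.littleEndian = true; }
	else if (data.compare(0, 2, "MM") == 0) { out.littleEndian = false; }
	else { return false; }

	// Message size, assembled according to the marker.
	out.size = 0;
	for (int i = 0; i < 4; ++i) {
		uint32_t b = (unsigned char) data[2 + i];
		int shift = out.littleEndian ? (i * 8) : ((3 - i) * 8);
		out.size |= b << shift;
	}

	if (out.size != data.size()) { return false; }	// incomplete or oversized

	// Identification strings.
	if (data.compare(6, 9, "NYANKOUDS") != 0) { return false; }

	out.command = data.substr(15, 4);
	return true;
}
```

Feeding the output of buildMessage("LIST") into parseMessage() yields the command ‘LIST’ with a size of 19, while a truncated buffer or a corrupted identifier is rejected at the first failing check.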

Anyway, these are the basics of putting a communication protocol together. It’s very easy, absolutely not scary and even pretty fun 🙂 Go give it a try some time.



Categories: programming, Protocols