
Binary Network Protocol Implementation In Java Using Byte Arrays

July 26, 2013

Java is in many ways a very eccentric programming language. Reading the designers’ responses to questions about its design turns up interesting ideas, such as the notion that unsigned integer types would be confusing and error-prone for the average programmer. There is also the claim that Java is purely object-oriented, even though it has many primitive types and concepts lurking in its depths. Its design poses uncomfortable problems for developers who need to read, write and generally handle binary data and files, as the entire language seems to be oriented towards text-based formats such as XML. One such problem is how to implement a basic binary networking protocol.

Many network and communication protocols are binary, as this makes them easier and faster to parse, more lightweight to transfer and generally less open to interpretation. The question then is how to implement such a protocol in a language which has no unsigned integers, no operator overloading and the like. The most elegant answer I have found so far is to stay low-level, and I really do mean low-level. We will treat Java’s built-in signed integers as though they are unsigned, using bitwise operators where necessary, and use byte arrays to translate between Java and the outside world.

The byte type in Java is an 8-bit signed integer with a range from -128 to 127. For our purposes we will ignore its signedness and treat the same eight bits as an unsigned 8-bit integer, masking with 0xFF whenever we widen a byte to a larger type. Network communication occurs in streams of bytes, with the receiving side interpreting them according to a predefined protocol. This means that to write to the network socket on the Java side we have to put the required bytes into a prepared byte array. As Java arrays are fixed-size like in C, it makes the most sense to either use one byte array per field or to pre-allocate the whole array and copy the bytes into it.
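
To make the idea of treating a byte as unsigned concrete, here is a minimal sketch. The class and method names (UnsignedBytes, toUnsigned, fromUnsigned) are made up for illustration and are not part of any library. Masking with 0xFF widens the byte to an int without sign extension.

```java
final class UnsignedBytes {
	// Interpret a signed Java byte as an unsigned value in the range 0..255.
	static int toUnsigned(byte b) {
		return b & 0xFF;
	}

	// Store an unsigned value (0..255) in a byte; only the low 8 bits are kept.
	static byte fromUnsigned(int value) {
		return (byte) value;
	}

	public static void main(String[] args) {
		byte raw = fromUnsigned(200);         // bit pattern 0xC8
		System.out.println(raw);              // prints -56
		System.out.println(toUnsigned(raw));  // prints 200
	}
}
```

The bits that end up on the wire are the same either way; only Java's interpretation of them changes.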

We write to the Java Socket via its OutputStream, which we wrap in a BufferedOutputStream.

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

public class BinaryExample {
	Socket mSocket;
	String mServer = "";
	int mServerPort = 123;
	byte[] header = {0x53, 0x41, 0x4D, 0x50, 0x4C, 0x45}; // SAMPLE
	int mProtocolVersion = 0;
	OutputStream mOutputStream;
	InputStream mInputStream;
	BufferedOutputStream mBufferedOutputStream;

	public void run() {
		try {
			// set up connection with server
			this.mSocket = new Socket(mServer, mServerPort);
		} catch (Exception ee) {
			ee.printStackTrace();
			return;
		}

		// get the I/O streams for the socket.
		try {
			mOutputStream = this.mSocket.getOutputStream();
			mBufferedOutputStream = new BufferedOutputStream(mOutputStream);
			mInputStream = this.mSocket.getInputStream();
		} catch (IOException e) {
			e.printStackTrace();
			return;
		}

		byte version = (byte) mProtocolVersion;
		// 4 bytes for the length field, plus the header, plus 1 byte for the version.
		int messageLength = 4 + header.length + 1;
		byte[] msgSize = intToByteArray(messageLength);

		// write to the socket: length indicator first, then header, then version.
		try {
			mBufferedOutputStream.write(msgSize);
			mBufferedOutputStream.write(header);
			mBufferedOutputStream.write(version);
			mBufferedOutputStream.flush();
		} catch (IOException e1) {
			e1.printStackTrace();
			return;
		}
	}

	// Writes provided 4-byte integer to a 4 element byte array in Little-Endian order.
	public static final byte[] intToByteArray(int value) {
		return new byte[] {
			(byte)(value & 0xff),
			(byte)(value >> 8 & 0xff),
			(byte)(value >> 16 & 0xff),
			(byte)(value >>> 24)
		};
	}
}

Any ASCII strings in the protocol we define as individual bytes. Fortunately ASCII codes only go up to 127 (0x7F) and thus fit within the positive range of Java’s byte type. For values stretching into the negative range of the byte we have to mask with 0xFF when reading them back, or do the conversion ourselves. We define the protocol version as an int (signed, 32-bit), which we convert to a byte using a simple cast, stripping off the upper three bytes. Again, pay attention to the value of the int. Values from 128 to 255 keep the correct bit pattern on the wire but show up as negative numbers in Java, and anything above 255 is silently truncated by the cast.
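
As a small illustration of both points, the sketch below encodes an ASCII string as bytes and narrows the version to a single byte with an explicit range check. The names AsciiFields, asciiBytes and toVersionByte are hypothetical helpers, and the range check is my own addition rather than something the protocol requires. StandardCharsets needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

final class AsciiFields {
	// Encode an ASCII string as one byte per character.
	static byte[] asciiBytes(String text) {
		return text.getBytes(StandardCharsets.US_ASCII);
	}

	// Narrow an int to a single byte, rejecting values that do not fit in 8 bits.
	static byte toVersionByte(int version) {
		if (version < 0 || version > 0xFF) {
			throw new IllegalArgumentException("version does not fit in one byte: " + version);
		}
		// Values 128..255 become negative in Java but keep their bit pattern.
		return (byte) version;
	}
}
```

Calling asciiBytes("SAMPLE") yields the same six bytes as the hand-written header array: 0x53, 0x41, 0x4D, 0x50, 0x4C, 0x45.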

In this example we implement a little-endian (LE) protocol. This means that when converting a 16-bit or larger integer to a byte array we have to place the least significant byte (LSB) first, as is done in the function intToByteArray(). We also add a message length indicator at the beginning of the message we’re sending in the form of an int, extending the message by 4 bytes.
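
For comparison, the standard library's java.nio.ByteBuffer can do the byte ordering for you. The following is only a sketch of an alternative, not the approach used in this post. It assumes the message layout described here (4-byte LE length that counts itself, 6-byte header, 1-byte version), and MessageBuilder and buildMessage are made-up names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MessageBuilder {
	static final byte[] HEADER = {0x53, 0x41, 0x4D, 0x50, 0x4C, 0x45}; // SAMPLE

	// Assumed layout: 4-byte LE length, 6-byte header, 1-byte version.
	static byte[] buildMessage(int protocolVersion) {
		int messageLength = 4 + HEADER.length + 1;
		ByteBuffer buffer = ByteBuffer.allocate(messageLength);
		buffer.order(ByteOrder.LITTLE_ENDIAN); // the default is big-endian
		buffer.putInt(messageLength);          // LSB is written first
		buffer.put(HEADER);
		buffer.put((byte) protocolVersion);
		return buffer.array();
	}
}
```

The resulting array can be handed to the BufferedOutputStream in a single write() call.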

Reading the response and interpreting it is similar:

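		// wait for response. This is a blocking example.
		// Note: read() may return fewer bytes than requested; this naive version treats that as an error.
		byte[] responseBytes = new byte[5];
		int bytesRead = 0;
		try {
			bytesRead = mInputStream.read(responseBytes, 0, 5);
		} catch (IOException e1) {
			e1.printStackTrace();
			return;
		}

		if (bytesRead != 5) {
			// communication error. Abort.
			return;
		}

		// the fifth byte now contains the value of the response code. 0 means OK, everything else is an error.
		// Mask with 0xFF so codes above 127 are not sign-extended into negative values.
		short responseCode = (short) (responseBytes[4] & 0xFF);
		if (responseCode != 0) { return; }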

This is a brief and naive sample which only has to read a single response: it reads just five bytes and skips over the message length indicator, looking only at the response code. In a more complex application you would convert the individual sections of the byte array to their respective formats (strings, ints, etc.) and verify them. For this you would use a function which assembles an int from a LE-order byte array, such as the following:

	// Converts the provided 4-byte array containing a little-endian integer to an int.
	// value[0] is the least significant byte, which makes this the inverse of intToByteArray().
	public static final int byteArrayToInt(byte[] value) {
		int ret = (value[0] & 0xFF) | ((value[1] & 0xFF) << 8) |
					((value[2] & 0xFF) << 16) | ((value[3] & 0xFF) << 24);

		return ret;
	}
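
A slightly more careful take on the reading side might look like the sketch below. It decodes the length field least significant byte first and uses DataInputStream.readFully() so that a short read does not get mistaken for a complete response. ResponseReader, readIntLE and readResponseCode are hypothetical names, and the check that the length field equals 5 (counting itself) is an assumption about the response format, not something the protocol above spells out.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ResponseReader {
	// Decode four little-endian bytes starting at offset into an int.
	static int readIntLE(byte[] data, int offset) {
		return (data[offset] & 0xFF)
			| ((data[offset + 1] & 0xFF) << 8)
			| ((data[offset + 2] & 0xFF) << 16)
			| ((data[offset + 3] & 0xFF) << 24);
	}

	// Read exactly 5 bytes (4-byte LE length + 1-byte response code) and return the code.
	static int readResponseCode(InputStream in) throws IOException {
		byte[] response = new byte[5];
		// readFully blocks until all 5 bytes have arrived, or throws EOFException.
		new DataInputStream(in).readFully(response);
		// Assumption: the length field counts the whole message, itself included.
		int length = readIntLE(response, 0);
		if (length != response.length) {
			throw new IOException("unexpected message length: " + length);
		}
		return response[4] & 0xFF; // unsigned response code, 0 means OK
	}
}
```

With the socket from the earlier example this would be called as readResponseCode(mInputStream).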

In many ways it’s ironic that bit shifts and bitwise operators are the way to go in a language which presents itself as high-level, but such is the result of the design choices made. While it is true that the above byte array-oriented code could be encapsulated in fancy classes which would take the tedium out of implementing such a protocol, in essence they would do exactly the same as detailed above. The upcoming Java 8 release will introduce limited unsigned support for the first time, in the form of static helper methods such as Integer.toUnsignedLong() and Integer.compareUnsigned() rather than new types, but for most projects (including Android-based ones) upgrading to it is not an option.

For reference, the above code is used in an actual project I’m working on and is, as far as I am aware, functional. I cannot, however, accept any liability for anything going haywire, applications crashing, marriages torn up or pets set on fire. Further checks and error handling are probably an awesome idea to make the code more robust.