New Project: NGS CPU Architecture

August 30, 2014

For those looking at the scarcity of posts on this blog and wondering what in the world happened to me, I can offer the following explanation: personal (health) issues, as well as writing a book on AndEngine game development for Packt Publishing, have taken up most of my time recently. Unfortunately this has left me little opportunity to write here. Fortunately, however, I have not been sitting completely idle and have begun a new project which at least some of you may find interesting.

The project is a custom CPU architecture I have been wanting to develop for a while now. ‘Great’, I can hear some of you think, ‘another CPU architecture, why would we need another one?!’ The short version is that this is a rather experimental architecture, exploring features and designs not commonly found in mainstream CPU architectures. Consider it a bit of a research project, one aimed at developing a CPU architecture which may be useful for HPC (high-performance computing) as well as general-purpose computing.

The project’s name is ‘Nyanko Grid-scaling System’, or NGS for short. Currently I’m working on the first prototype: a simplified 16-bit version of NGS featuring only a single ALU, referred to as ‘NGS-16’. Even this first prototype has many of the essential features which I think make this such an interesting project, including:

- unclocked design: all components work without a central clock or pipeline directing them.
- task scheduler: integrating the functionality of the software-based scheduler of an OS.
- virtual memory management: virtual memory is managed entirely in hardware rather than by an OS.
- driver management: drivers for hardware devices are either in hardware, or directly communicate with the CPU.

Essentially this means that there’s no software-based operating system (OS) as such. A shell will still be required to do the actual interfacing with human beings and to instruct the NGS task scheduler to launch new processes, but there is no OS in the traditional sense. While this also means that existing operating systems cannot be ported to the NGS architecture in any realistic fashion, it does not mean that applications cannot be compiled for it. After porting a C/C++ toolchain (GCC or LLVM) to NGS, the average C/C++-based application would be only a bit of library-wrangling and a recompile away from functioning.

Moving back to the present, I’m writing NGS-16 in VHDL, with the Lattice MachXO2-7000 [1] as the target FPGA. The basic structure has been laid out (components, top entities, signals), with just the architecture implementations and debugging/simulation left to finish. While this prototype takes the usual short-cuts (leaving out unneeded components and the like) to ease development, it should nevertheless be a useful demonstration of what the NGS architecture can do.

The FPGA board I’ll be using is actually produced by a friend, who called it the FleaFPGA after the name of his company, Fleasystems [2]. As you can see on the FleaFPGA page [3], it offers quite a reasonable amount of I/O, including VGA, USB (host), PS/2, audio and an I/O header. The idea is to use as much of this hardware as possible with the initial range of prototypes. I also have another FPGA board (a Digilent Nexys 2, Spartan 3E-based) which offers similar specifications in terms of logic elements and I/O. Depending on how things work out I may also run NGS-16 on that board. Ultimately I may want to build my own FPGA board aimed specifically at running NGS.

Over the coming months I’ll be blogging about my progress with this NGS-16 prototype and beyond, so stay tuned :)

Maya

[1] http://www.latticesemi.com/en/Products/FPGAandCPLD/MachXO2.aspx
[2] http://fleasystems.com/
[3] http://www.fleasystems.com/fleaFPGA.html

Categories: NGS, VHDL

2013 in review

January 26, 2014

The WordPress.com stats helper monkeys prepared a 2013 annual report for this blog.

Here’s an excerpt:

The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 71,000 times in 2013. If it were an exhibit at the Louvre Museum, it would take about 3 days for that many people to see it.


Pointers Into Arrays: What You’re Really Dereferencing

January 26, 2014

This one falls under the heading of things you should definitely know as a C/C++ programmer, but which are easy to get wrong by accident. For me it was while working on a Sliding Discrete Fourier Transform (SDFT) implementation in C++ that I stumbled over this gotcha. When I got nonsense output from the algorithm I took a long, detailed look at every aspect of it, until finally a friend pointed me at something I had overlooked until then, because my brain had been telling itself that it couldn’t possibly be something that simple.

First of all, a little bit of theory on arrays in C/C++: an array is just a series of bytes in memory which you tell the compiler to treat as something special. You also give it a type, which doesn’t do anything to the bytes themselves, but hints to the compiler how it should treat the array in certain operations. This type is often char, but can be anything else as well, including int and float. The point is that the compiler thus knows not only which type the array contains, but also how many bytes make up each element of the array.

Now, such arrays are commonly allocated on the heap, which in C++ looks as follows:

char* pChar = new char[16];

This gives us a single pointer into the array, which we have told the compiler contains char elements. The pointer we have is thus of type char*. This detail is the entire clue to the next step, where we attempt to read 32-bit floating point values (float) from the array: we want to get 4 bytes at a time out of an array which is said to contain chars, which are a single byte each. The naive but wrong approach is the following:

float f = *pChar;

Here we hope that the compiler will be so kind as to deposit four bytes from the array into our four-byte destination type. Unfortunately compilers aren’t that accommodating: dereferencing a char pointer yields just a single byte, namely the char value being pointed at, which is then converted to a float.

To actually obtain four bytes from the array in one go we need to have a word with the compiler. This is called ‘casting’, whereby we tell the compiler to stop pretending that this blob of bits is one type and to treat it as another instead. This is a common technique in many applications, in which the otherwise unusable void* type is instrumental. Fortunately we don’t have to go that far here. All we want in this case is to let the compiler know that we want this array treated as a series of floats instead of chars:

float f = *((float*) pChar);

What we do here, via some bracket magic to make things evaluate in the right order, is first cast the char pointer into a float pointer, which means that it now points at four bytes instead of just one. When we then dereference the result, we copy four bytes into the destination instead of one. Mission accomplished.
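Putting it all together, here is a minimal, self-contained sketch (the buffer size and values are just illustrative). It also shows an alternative read using std::memcpy, which produces the same result while staying well-defined even under the strict-aliasing and alignment rules which the pointer cast technically bends:

#include <cstring>
#include <iostream>

int main() {
	char* pChar = new char[16];

	// Put a known float into the first four bytes of the array.
	float source = 3.14f;
	std::memcpy(pChar, &source, sizeof(float));

	// Read it back via the pointer cast described above.
	float f = *((float*) pChar);

	// Equivalent read via memcpy, avoiding the aliasing question entirely.
	float g;
	std::memcpy(&g, pChar, sizeof(float));

	std::cout << f << " " << g << std::endl; // prints: 3.14 3.14

	delete[] pChar;
	return 0;
}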

It is possible to go fancier here than in the above example using C++’s myriad casting mechanisms (reinterpret_cast would be the closest equivalent), but for basic casting as we need here the C-style method suffices. It’s best to leave the others for special cases anyway, as they tend to be significantly more specialized and unnecessary for casting between basic types.

Hopefully the above will be useful to someone, whether a beginner or a more advanced C/C++ user, even just as a quick reminder. I know I could have used it a few days ago :)

Maya

Introducing Nt With the Qt-based NNetworkSocket Class

August 17, 2013

While recently working on a project involving C++, Qt and networking using TCP sockets, I came across a nasty surprise in the form of blocking socket functions not working on Windows with QTcpSocket [1]. Not working as in fundamentally broken, with a bug report [1] dating back to early 2012, when Qt 4.8 was new. After a quick look through the relevant Qt source code (the native socket engine and kin) to determine whether it was something I could reasonably fix myself, I concluded that this was not a realistic option due to the amount of work involved.

Instead I found myself drawn to the option of implementing a QObject-based class wrapping the native sockets on Windows (Winsock2) and elsewhere (POSIX/Berkeley). Having used these native sockets before, I knew how easy it would be to write such a wrapper class for my needs: TCP sockets, blocking functions and client-side functionality, all of which take only a little bit of effort to implement. Any further features can be implemented on an as-needed basis. The result of this is the NNetworkSocket class, which I put on GitHub today [2] in a slightly expanded version.
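To illustrate the basic idea (this is just a conceptual sketch under my own naming, not the actual NNetworkSocket API from the repository): a QObject-derived class can hold the native socket descriptor and expose blocking calls directly. The sketch below uses the POSIX/Berkeley API; the Winsock2 path additionally needs WSAStartup()/WSACleanup() and closesocket().

#include <QObject>
#include <QByteArray>
#include <cstring>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

// Blocking TCP client socket wrapped in a QObject.
// (The Q_OBJECT macro is omitted since this sketch uses no signals or slots.)
class BlockingTcpSocket : public QObject {
public:
	explicit BlockingTcpSocket(QObject* parent = 0) : QObject(parent), mSocket(-1) { }
	~BlockingTcpSocket() { if (mSocket >= 0) { ::close(mSocket); } }

	// Blocks until the connection is established or fails.
	bool connectToHost(const QByteArray &address, quint16 port) {
		mSocket = ::socket(AF_INET, SOCK_STREAM, 0);
		if (mSocket < 0) { return false; }
		sockaddr_in addr;
		std::memset(&addr, 0, sizeof(addr));
		addr.sin_family = AF_INET;
		addr.sin_port = htons(port);
		addr.sin_addr.s_addr = inet_addr(address.constData());
		return ::connect(mSocket, (sockaddr*) &addr, sizeof(addr)) == 0;
	}

	// Blocks until the data has been handed off to the network stack.
	qint64 write(const QByteArray &data) {
		return ::send(mSocket, data.constData(), data.size(), 0);
	}

	// Blocks until up to maxSize bytes have been received.
	QByteArray read(int maxSize) {
		QByteArray buffer(maxSize, 0);
		int received = ::recv(mSocket, buffer.data(), maxSize, 0);
		return (received > 0) ? buffer.left(received) : QByteArray();
	}

private:
	int mSocket;
};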

As I suspect that it won’t be the last add-on/drop-in/something else class I’ll be writing to complement the Qt framework, I decided to come up with the oh-so-creative name of ‘Nt’ for the project. Much like ‘Qt’, its pronunciation is obvious and silly: ‘Qt’ as ‘cute’ and ‘Nt’ as ‘neat’. Feel free to have a look at the code I put online, including the sample application (samples/NNetworkSocket_sample). Documentation will follow at some point once the class has matured some more.

As usual, feel free to provide feedback, patches and donations :)

Maya

[1] https://bugreports.qt-project.org/browse/QTBUG-24451
[2] https://github.com/MayaPosch/Nt

Java On Android TCP Socket Issue

August 8, 2013

Related to my previous post [1] involving a project using Java sockets, I’d like to write about an issue I encountered while debugging that project. Allow me to first describe the environment and setup.

The Java side, an extended version of the class described in the linked post, ran as the client on Android, specifically on a Galaxy Nexus running Android 4.2.2 and later 4.3. Its goal was to connect to the server and then send locally collected arrays of bytes through the socket. The server was written in C++ and ran on a Windows 7 Ultimate x64 system, with part of the networking handled by the Qt framework (QTcpServer) and the actual communication done via native sockets.

The problem occurred when the Android client connected to the server: the connection would be established fine, and the thread handling the native socket was initialized and started as it should be. After that, however, no data would ever be received on the server side of the client socket. Checks using select() showed that no data ever arrived in the buffer. Verification with a telnet client (PuTTY) showed that the server was able to receive data just fine, and thus that the issue had to lie with the client side, i.e. the Android client.
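For reference, the server-side readability check looked conceptually like the following. This is an illustrative sketch (the function name and parameters are mine, not the project’s actual code), written against Winsock2 since the server ran on Windows:

#include <winsock2.h> // on POSIX systems use <sys/select.h> instead

// Returns true when the socket has data waiting to be read, false on timeout.
// Note: on POSIX the first select() argument must be the highest descriptor
// plus one; under Winsock it is ignored.
bool hasData(SOCKET socket, long timeoutSeconds) {
	fd_set readSet;
	FD_ZERO(&readSet);
	FD_SET(socket, &readSet);

	timeval timeout;
	timeout.tv_sec = timeoutSeconds;
	timeout.tv_usec = 0;

	// select() returns the number of ready descriptors, 0 on timeout, -1 on error.
	return select(0, &readSet, 0, 0, &timeout) == 1;
}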

Inspecting the communication between the Android client and the server with the Wireshark network sniffer showed a normal TCP sequence: SYN, SYN-ACK and ACK packets, followed by a PSH-ACK from the client carrying the first data. This was followed by an ACK from the server, indicating that the server’s network stack had at least acknowledged the data packet. Everything seemed in order, although it was somewhat remarkable that the first client-side ACK had the exact same timestamp in Wireshark as the PSH-ACK packet.

Mystified, I stumbled over a few posts [2][3] on StackOverflow in which it was suggested that calling Thread.sleep() after the connect phase would resolve this. Trying this solution with a 500 ms sleep period, I found that the client-server communication suddenly went flawlessly. My only remaining question is why this is the case.

Looking through the TCP specifications I didn’t find anything definitive, even though the evidence so far suggests that the ACK on the SYN-ACK cannot be accompanied by a PSH or similar at the same time. Possibly something else plays a role here, but the lesson seems to be that in the case of non-functioning Java sockets, one simply has to wait a little while before starting to send data. I’d gladly welcome further clarification on why this happens.

Maya

[1] https://mayaposch.wordpress.com/2013/07/26/binary-network-protocol-implementation-in-java-using-byte-arrays/
[2] http://stackoverflow.com/questions/5326652/java-socket-read-not-receiving-write-sent-immediatly-after-connection-is-establi
[3] http://stackoverflow.com/questions/17073045/why-does-java-send-needs-thread-sleep

Binary Network Protocol Implementation In Java Using Byte Arrays

July 26, 2013

Java is in many ways a very eccentric programming language. Reading the designers’ responses to questions about its design leads to interesting insights, such as the notion that unsigned integer types would be confusing and error-prone for the average programmer. There’s also the idea that Java is purely object-oriented, even though it has many primitive types and concepts lurking in its depths. Its design poses very uncomfortable issues for developers who need to read, write and generally handle binary data and files, as the entire language seems oriented towards text-based formats such as XML. This leads to such problems as how to implement a basic binary network protocol.

Many network and communication protocols are binary, as this makes them easier and faster to parse, more light-weight to transfer and generally less ambiguous to interpret. The question is how to implement such a protocol in a language wholly unfamiliar with concepts like unsigned integers and operator overloading. The most elegant answer I have found so far is to stay low-level, and I really do mean low-level: we will treat Java’s built-in signed integers as though they are unsigned, using bitwise operators where necessary, and use byte arrays to translate between Java and the outside world.

The byte type in Java is an 8-bit signed integer with a range from -128 to 127. For our purposes we will ignore the sign bit and treat it as an unsigned 8-bit integer. Network communication occurs in streams of bytes, with the receiving side interpreting them according to a predefined protocol. This means that to write to the network socket on the Java side, we have to put the required bytes into a prepared byte array. As Java arrays are fixed-size, like in C, it makes the most sense to either use one byte array per field or to pre-allocate the whole array and copy the bytes into it.

Writing is done to the Java Socket via its OutputStream, which we wrap in a BufferedOutputStream:

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

public class BinaryExample {
	Socket mSocket;
	String mServer = "127.0.0.1";
	int mServerPort = 123;
	byte[] header = {0x53, 0x41, 0x4D, 0x50, 0x4C, 0x45}; // SAMPLE
	int mProtocolVersion = 0;
	OutputStream mOutputStream;
	InputStream mInputStream;
	BufferedOutputStream mBufferedOutputStream;

	public void run() {
		try {
			// set up connection with server
			this.mSocket = new Socket(mServer, mServerPort);
		} catch (Exception ee) {
			return;
		}

		// get the I/O streams for the socket.
		try {
			mOutputStream = this.mSocket.getOutputStream();
			mBufferedOutputStream = new BufferedOutputStream(mOutputStream);
			mInputStream = this.mSocket.getInputStream();
		} catch (IOException e) {
			e.printStackTrace();
			return;
		}

		byte version = (byte) mProtocolVersion;
		int messageLength = 4 + header.length + 1; // length field + header + one version byte
		byte[] msgSize = intToByteArray(messageLength);

		// write to the socket
		try {
			mBufferedOutputStream.write(msgSize);
			mBufferedOutputStream.write(header);
			mBufferedOutputStream.write(version);
			mBufferedOutputStream.flush();
		} catch (IOException e1) {
			e1.printStackTrace();
		}
	}

	// Packs the provided 4-byte integer into a 4-element byte array in little-endian order.
	public static final byte[] intToByteArray(int value) {
		return new byte[] {
			(byte)(value & 0xff),
			(byte)(value >> 8 & 0xff),
			(byte)(value >> 16 & 0xff),
			(byte)(value >>> 24)
		};
	}
}

Any ASCII strings in the protocol we define as individual bytes. Fortunately the ASCII codes only go up to 127 (0x7F) and thus fit within the positive range of Java’s byte type. For values stretching into the negative range of the byte we would have to use bit masking to deal with the sign bit, or do the conversion ourselves. We define the protocol version as a signed 32-bit int, which we convert to a byte using a simple cast, stripping off the three higher bytes. Again, pay attention to the value of the int: if it’s higher than 127 you have to deal with the sign bit again or risk an overflow.

In this example we implement a little-endian (LE) protocol. This means that when converting a 16-bit or larger integer to a byte array, we have to place the least significant byte (LSB) first, as is done in the function intToByteArray(). We also add a message length indicator at the beginning of the message we’re sending, in the form of an int, extending the message by 4 bytes.

Reading the response and interpreting it is similar:

		// wait for response. This is a blocking example.
		byte[] responseBytes = new byte[5];
		int bytesRead = 0;
		try {
			bytesRead = mInputStream.read(responseBytes, 0, 5);
		} catch (IOException e1) {
			e1.printStackTrace();
		}

		if (bytesRead != 5) {
			// communication error. Abort.
			return;
		}

		// the fifth byte now contains the value of the response code. 0 means OK, everything else is an error.
		short responseCode = (short) (responseBytes[4] & 0xFF); // mask to treat the byte as unsigned
		if (responseCode != 0) { return; }

This is a brief and naive sample which just has to read a single response, skipping the message length indicator and reading just five bytes. In a more complex application you would convert the individual sections of the byte array to their respective types (strings, ints, etc.) and verify them. For this you need a function to convert a little-endian byte array back into a Java int, such as the following:

	// Converts the provided 4-byte array, containing a little-endian integer, into a Java int.
	public static final int byteArrayToInt(byte[] value) {
		int ret = (value[0] & 0xFF) | ((value[1] & 0xFF) << 8) |
					((value[2] & 0xFF) << 16) | ((value[3] & 0xFF) << 24);

		return ret;
	}
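As an aside, the receiving end of such a protocol (in my case a C++ server) performs the mirror-image operation with the same handful of shifts. A quick illustrative sketch, not the project’s actual server code:

#include <cstdint>

// Decodes a 4-byte little-endian value, mirroring the Java intToByteArray().
// The buffer is assumed to hold at least four bytes.
uint32_t decodeUint32(const unsigned char* buffer) {
	return (uint32_t) buffer[0]
		| ((uint32_t) buffer[1] << 8)
		| ((uint32_t) buffer[2] << 16)
		| ((uint32_t) buffer[3] << 24);
}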

In many ways it’s ironic that bit shifts and bitwise operators are the way to go in a language which presents itself as high-level, but such is the result of the design choices made. While it is true that the byte array-oriented code above could be encapsulated in fancy classes which would take the tediousness out of implementing such a protocol, in essence they would do exactly the same as detailed above. The upcoming Java 8 release will introduce unsigned integer support for the first time, in a limited manner (as helper methods on the existing integer types), but for most projects (including Android-based ones) upgrading to it is not an option.

For reference, the above code is used in an actual project I’m working on and is, as far as I am aware, functional. I can however not accept any liability for anything going haywire: applications crashing, marriages torn up or pets set on fire. Further checks and error handling are probably an awesome idea to make the code more robust.

Maya

MSAA And IA2 Accessibility In Qt

February 15, 2013

A few months ago I was approached by someone who wanted to have a fairly basic app developed. Nothing special there. What was special was that this person is blind and required the application to work well with his favourite screen reader (Window-Eyes). After initially meddling around with a basic Win32 GUI application, I decided that I could do this much more easily in Qt, assuming accessibility there worked as intended. This resulted in a quick and messy crash course in accessibility with the Qt framework.

To immediately start off with the most important lesson: forget about accessibility with Qt 4.8 and lower. This version of the framework only has partial MSAA (Microsoft Active Accessibility) support, MSAA being the API Microsoft first introduced back in the Windows 95 days to interface with screen and braille readers, among other technologies. The MSAA support in Qt 4 is limited to some main GUI elements, but omits lists and other crucial views. As a result only the most basic, stripped-down Qt 4 applications will work via the MSAA API.

There’s hope, however. With Qt 5, accessibility has been improved majorly. Not only has MSAA support been improved to the point where applications are pretty much fully accessible, but the IAccessible2 (IA2) API has also been added, a more recently introduced third-party accessibility API. This latter API is the easiest to use with Qt 5 applications, though MSAA doesn’t require much more work either.

Basically all you need to do to enable accessibility in your application is to set the ‘Accessible name’ and ‘Accessible description’ properties for the relevant widgets in your application’s GUI (check the property list in Qt Designer/Qt Creator with the GUI file open). Then, when deploying the application, make sure there is a folder named ‘accessible’ next to the executable, containing the file ‘qtaccessiblewidgets.dll’ (or the comparable .so). This will cause the Qt 5 application to load the accessibility features for its widgets and enables MSAA and IA2.
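For widgets created in code rather than in Qt Designer, the same two properties can be set programmatically. A minimal sketch (the widget and the strings are just examples of mine):

#include <QApplication>
#include <QPushButton>

int main(int argc, char *argv[]) {
	QApplication app(argc, argv);

	QPushButton saveButton("Save");
	// These two properties are what a screen reader announces via MSAA/IA2;
	// they correspond to the 'Accessible name' and 'Accessible description'
	// fields in Qt Designer.
	saveButton.setAccessibleName("Save document");
	saveButton.setAccessibleDescription("Saves the current document to disk.");
	saveButton.show();

	return app.exec();
}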

During testing and experimenting I used both the Window-Eyes and NVDA screen readers on Windows 7 x64 and Windows XP. NVDA turned out to already work with Qt 5 applications, likely via its IA2 interface, but Window-Eyes needed its Qt accessibility support improved, as it only supports MSAA. I worked together with the creators of Window-Eyes, GW Micro, on this improvement, resulting in a new build which works great for both me and the client. This new build will soon become available as an update for Window-Eyes customers.

So after a few months of trial and error, this application is at long last done and working for the client. Would it have been better to just go ahead with the Win32 API version? I’m not sure. It would have presented its own share of issues, especially with regard to implementing the application’s other features, which Qt’s object-oriented, signal-slot-based architecture makes a snap. As a result of this struggle it seems that everyone came out ahead: myself with the added knowledge of accessibility in Qt, my client with an improved screen reader which now supports all Qt 5 applications that load the aforementioned DLL, and GW Micro with a better product.

In summary, Qt 5 accessibility is pretty darn easy and well-supported these days, whether using the MSAA or IA2 API. As far as I’m concerned it’s worth a good look for your next project.

Maya
