Lock-free ring buffer implementation for maximum throughput

November 12, 2021

During development of the NymphCast project [1], a core task was to implement a ring buffer that could provide the reliability, latency and throughput required by the media file decoder during playback of, and seeking within, high-bitrate content. While the general concept of a ring buffer is exceedingly common in software development, implementing one in a completely thread-safe manner is less straightforward and common.

The general implementation of a ring buffer involves a number of pointers into a preallocated buffer of size N, indicating the buffer front (index 0) and end (index N), as well as the current positions of the read and write pointers. In the implementation as used by NymphCast, a number of higher-level counters were added to keep track of characteristics such as the total unread and free bytes in the buffer.

This ring buffer implementation has now been generalised into its own project [2], with NymphCast-specific features stripped out, for easier analysis.


For a high-level overview of how this ring buffer (DataBuffer) works, we can look at the provided example project [3] that shows the basic structure, set-up and tear-down.

Initially, the DataBuffer class is set up with the number of bytes that should be allocated for its buffer. In this example a 1 MB buffer is used. Using DataBuffer::setSeekRequestCallback(), our custom handler for seek requests is provided to the ring buffer.

Next, we launch a new thread for the data write function – which writes new data into the buffer – and a thread for the data request function – which is called by the DataBuffer class when more data can be written. After this we call DataBuffer::start() to begin the initial buffering process.

In this start function, the data request function callback (set using DataBuffer::setDataRequestCondition()) is called by signalling its condition variable. This triggers a data write sequence, in this example simplified to the signalling of another condition variable in the data write function, which writes a pre-allocated chunk of data (200 kB in size) into the ring buffer using DataBuffer::write().

The reason for having a separate data request and data write thread is that this allows for efficient buffering, with each write call leading to the data request function being called for so long as there’s room in the buffer for another (200 kB) chunk of data.


The measures taken to make this ring buffer thread-safe can be summarised as atomic variable capture and atomic wait. The latter refers to waiting on an atomic variable (dataRequestPending) that indicates whether there’s an active data request action. This is used in DataBuffer::seek() as follows:

uint32_t timeout = 1000;
while (dataRequestPending) {
	if (--timeout < 1) {
		return -1;	// Give up to prevent a lock-up.
	}
	
	// One millisecond per iteration yields the 1-second time-out; the
	// sleep is shown here for illustration.
	std::this_thread::sleep_for(std::chrono::milliseconds(1));
}
This code ensures that when the buffer is expecting an incoming write request, we do not start the seeking action, as this involves resetting the buffer contents. Synchronising seeking actions like this is one of the more complicated aspects of a lock-free ring buffer. In addition, this code also implements a time-out feature of 1 second, to prevent application lock-up if the impending write somehow does not occur.


The atomic variable capture aspect comes into play with the DataBuffer::read() and DataBuffer::write() methods. The aforementioned free and unread counters are implemented as atomic variables. This provides some level of safety, but does not guarantee that the variable’s value will not be changed by another thread during the execution of either of these class methods.

A solution is to capture the variables needed into a local variable, as done e.g. in DataBuffer::read():

uint32_t locunread = unread;	// Capture the atomic 'unread' counter once.
uint32_t bytesSingleRead = locunread;
if ((end - index) < bytesSingleRead) { bytesSingleRead = end - index; }

In this code we capture the unread atomic variable into a local variable, which is then used to calculate the number of bytes we can safely read from the buffer. This same captured variable is then used throughout the class method instead of the class variable, as the latter is deemed unsafe to use after the capture.

The same atomic capture approach is used with the counter for free bytes when writing into the buffer. What this guarantees is that at the moment of capture the value of these counters is accurate, and most importantly that even if they are written to after capture, there are no negative repercussions. The worst outcome of an increase in the number of free or unread bytes is that there are more bytes to read or more room to write than at the moment of capture, neither of which poses a problem.


The result of this approach is that the single writer thread can safely write into the ring buffer until it runs out of sufficient space to write another chunk. Similarly, the reader thread can read constantly and should theoretically not run out of data to read, barring throughput issues on the side of the writer that are beyond the scope of the ring buffer.

A possible improvement here would be to have the data request callback receive the total number of free bytes (as captured), so that it could scale the requested data to the capabilities of the data source. This will have to be further tested and analysed.




[1] https://github.com/MayaPosch/NymphCast
[2] https://github.com/MayaPosch/LockFreeRingBuffer
[3] https://github.com/MayaPosch/LockFreeRingBuffer/blob/master/test/test_databuffer_multi_port.cpp


Refactoring NymphRPC for zero-copy optimisation

November 11, 2021

When I originally wrote the code for what became NymphRPC [1], efficiency was not my foremost concern, but rather reliable functionality was. Admittedly, so long as you just send a couple of bytes and short strings to and from client and server, the overhead of network transmission is very likely to mask many inefficiencies. That is, until you try to send large chunks of data.

The motivation for refactoring NymphRPC came during the performance analysis of NymphCast [2] using Valgrind’s Callgrind tool. NymphCast uses NymphRPC for all its network-based communications, including the streaming of media data between client and server. This involves sending the data in chunks of hundreds of kilobytes, which is where the constant copying of data strings in NymphRPC showed itself to be a major overhead.

Specifically, this showed itself (on Linux) in many calls to __memcpy_avx_unaligned_erms, largely originating from within std::string. There were multiple causes: the copying of std::string instances into a NymphString type, the copying of this data again during message serialisation, and the repeated copying of message data on the receiving side, first when reading the message from the network socket, then again during deserialisation.

Finally, the old NymphRPC API was designed such that all data would be copied into the NymphRPC types, which added convenience, but at a fairly large performance cost, as seen.

Using a benchmark program created with the Catch2 benchmarking framework [3][4] – consisting of a NymphRPC client and server – the following measurements were obtained after compilation with Visual Studio 2019 (MSVC 16) at the -O2 optimisation level:

benchmark name                            samples    iterations          mean
uint32                                          20             1    178.387 us
double                                          20             1    138.282 us
array     1:                                    20             1    197.452 us
array     5:                                    20             1    198.407 us
array    10:                                    20             1    204.417 us
array   100:                                    20             1    512.027 us
array  1000:                                    20             1    3.08481 ms
array 10000:                                    20             1    32.8876 ms
blob       1:                                   20             1    188.677 us
blob      10:                                   20             1    141.712 us
blob     100:                                   20             1    174.832 us
blob    1000:                                   20             1    133.617 us
blob   10000:                                   20             1    211.097 us
blob  100000:                                   20             1    362.747 us
blob  200000:                                   20             1    1.35672 ms
blob  500000:                                   20             1    3.37874 ms
blob 1000000:                                   20             1    8.19277 ms

In order to reduce the number of calls to memcpy, it was decided to move to a zero-copy approach, which effectively means that no data is copied by NymphRPC unless it’s absolutely necessary, or there is no significant difference between copying and taking the pointer address of a value.

This involved changing the NymphRPC type system to still copy simple types (integers, floating point, boolean), but to only accept pointers to an std::string, character array, std::vector (‘array’ type) or std::map (‘struct’ type), with optional transfer of ownership to NymphRPC. Done this way, the original non-simple value is ideally allocated once (on the stack or heap) and copied only once, into the transfer buffer for the network socket. The serialisation itself is done into a pre-allocated buffer, avoiding the use of std::string altogether.

On the receiving end, the character buffer is filled with the received data, and the parsing routine creates pointer references to the non-simple types within this data. The receiving application’s code can then read straight from this buffer, which in the case of NymphCast means that its internal ring buffer can copy blocks of data straight from the received data buffer into the ring buffer with a single call to memcpy(), without any intermediate copying.

Running the same benchmark (adapted for the new API) with the same compilation settings yields the following:

benchmark name                            samples    iterations          mean
uint32                                          20             1    122.193 us
double                                          20             1    140.368 us
array     1:                                    20             1    173.963 us
array     5:                                    20             1    189.888 us
array    10:                                    20             1    220.653 us
array   100:                                    20             1    573.168 us
array  1000:                                    20             1    3.33472 ms
array 10000:                                    20             1    31.8041 ms
blob       1:                                   20             1    181.433 us
blob      10:                                   20             1    194.048 us
blob     100:                                   20             1    153.998 us
blob    1000:                                   20             1    174.073 us
blob   10000:                                   20             1    166.228 us
blob  100000:                                   20             1    240.223 us
blob  200000:                                   20             1    343.233 us
blob  500000:                                   20             1    716.233 us
blob 1000000:                                   20             1     2.0748 ms

Taking into account natural variation when running benchmark tests (even with network data via localhost), it can be noted that there is no significant change for simple types, and arrays (std::vector) show no major change either. For the latter type a possible further optimisation can be achieved by streamlining the determination of total binary size for the types within the array, avoiding the use of a loop. This was a compromise solution during refactoring that may deserve revisiting in the future.

The most significant change can – as expected – be observed in the character strings (‘blob’). Here entire milliseconds are shaved off for the larger transfers, making for a roughly 4x improvement. In the case of NymphCast, which uses 200 kB chunks, this means a reduction from about 1.4 milliseconds to about 350 microseconds per transfer.

After integration of the new NymphRPC into NymphCast, this improvement was confirmed during a subsequent analysis with Callgrind: __memcpy_avx_unaligned_erms dropped from the top of the list of functions the application spent time in to somewhere below the noise floor, to the point of being inconsequential. In actual usage of NymphCast, the improvements were somewhat noticeable in the form of improved response times.

Further analysis would have to be performed to characterise the improvements in memory (heap and stack) usage, but it is presumed that both are lower – along with CPU usage – due to the reduction in copies of the data, and reduction in CPU time spent on creating these copies.




[1] https://github.com/MayaPosch/NymphRPC/
[2] https://github.com/MayaPosch/NymphCast
[3] https://github.com/catchorg/Catch2
[4] https://github.com/MayaPosch/NymphRPC/blob/master/test/test_performance_nymphrpc_catch2.cpp

Why I hate testing my own code, and NymphCast Beta things

September 2, 2021

There appears to be a trend in the world of software development where it is assumed that QA departments are an unnecessary expense, and that developers are perfectly capable of testing their own code, as well as that of their colleagues. After all, if you wrote the code, or are in the same development team, you know the best way to test the code. Unfortunately this is about as wrong an assumption as one can make.

I have been constantly reminded of this during the past months as NymphCast [1] has been slowly crawling its way through the first Alpha stage. While trying to settle on the right feature set for a first release, having external input from not only the intended users, but from dedicated testers is essential.

Although your target market will give you a list of things they’d like to see, it is your QA team which will make it clear to you which of the implemented features are actually ready for prime-time, and how much air there is left in the time & sanity budget for new features or stabilising already implemented features.

In the upcoming NymphCast Beta v0.1, the core feature set has now been settled upon and a feature freeze is in place. The features that made it in with ‘officially supported’ status include the streaming of media content to a NymphCast receiver (server) from an Android or desktop device, as well as from a NymphCast MediaServer instance on the same local network (LAN).

Features that are at least partially implemented, but are considered ‘experimental’, include multicast/multi-room media playback, the standalone GUI (‘SmartTV mode’) and NymphCast Apps as a whole. These features will be present in the v0.1 release, but are more of a preview of features that will likely become supported in v0.2. As preview features, they are not expected to work reliably, if at all. Whether the v0.2 release will have similar experimental features in place remains to be seen.

Many of these decisions were made while working with the friend who has dedicated himself to QA, and I’m grateful for every bit of feedback and every bug report I got tossed my way.

The reason why asking a developer to ‘test their code’ doesn’t work is that they’re biased. And even if they try really hard to test well, testing code requires a different mindset from writing it. The best QA person is one who doesn’t care in the slightest how you as a developer expected users to use your software, but who will gleefully use it whatever way ‘seems logical’.

As a developer, when I test my own code, I tend to focus on just a single detail of the software. This is excellent when you’re processing a bug report and trying to hunt down an issue in the code or environment, but the top-down, QA-style look at the software is not something that comes naturally. Based on my experiences, this is something which I can do with my own code, but it generally requires me to not touch the project for a few months or longer.

This makes it obvious that the main reason why developers are terrible at QA-level of testing their own code is because they know too much about the code. Whether consciously or subconsciously, this makes you dodge risky tasks, and follow UX patterns that may seem reasonable in light of how the code works, but where the average user may do something entirely different. Much like you’d do yourself when confronted with your own software a few years from now.

What this also shows is just how much of a team effort software development is. Of course one can expect a couple of software developers to kinda sorta band together and handle writing code, testing it, doing QA, creating and processing bug reports, and handling packaging and distribution, but things work just so much better when you have dedicated developers, QA folk, people who handle packaging & distribution and so on.

I’d argue that each of those are tasks which aren’t easy to switch between, especially not all on the same day. Yes, it can be made to work, but the results will always be sub-optimal.

With the NymphCast QA department currently working hard to shake out any remaining issues before the first v0.1-beta release of NymphCast, it will soon be time to engage the community for testing this and future Betas and Release Candidates. For as awesome as QA people and other dedicated testers on a project are, it’s hard to beat a large community of users for the most diverse collection of hardware, network configurations and usage patterns imaginable.

On that note, standby for the upcoming announcement of the first NymphCast Beta. Feel free to have a peek at the software before then as well, and add your questions or feedback.


[1] https://github.com/MayaPosch/NymphCast


Purgatory or Hell: Escape from eternal Alpha status

January 17, 2021

Many of us will have laughed and scoffed at Google’s liberal use of the tag ‘Beta software’ these past years. Is the label ‘Beta’ nothing more than an excuse for any bugs and issues that may still exist in code, even when it has been running in what is essentially a production environment for years? Similarly, the label ‘Alpha’ when given to software would also seem to seek a kind of indemnity for any issues or lacking features: to dismiss any issue or complaint raised with the excuse that the software is still ‘in alpha’.

Obviously, any software project needs time to develop. Ideally it would have a clear course through the design and requirements phase, smooth sailing through Alpha phase as all the features are bolted onto the well-designed architecture, and finally the polishing of the software during the Beta and Release Candidate (RC) phases. Yet it’s all too easy to mess things up here, which usually ends up with a prolonged stay in the Alpha phase.

A common issue that leads to this is too little time spent in the initial design and requirements phase. Without a clear idea of what the application’s architecture should look like, the result is that during the Alpha phase both the features and the architecture end up being designed on the spot. This is akin to building a house before the architectural plans are drawn up, but starting to build anyway, because one has a rough idea of what a house looks like.

When I began work on the NymphCast project [1] a few years back, all I had was a vague idea of ‘streaming audio’, which slowly grew over time. The demise of Google’s ChromeCast Audio product gave me a hint to look at what that product did, and what people saw in it. By that time NymphCast was little more than a concept and an idea in my head, and I’m somewhat ashamed to say that it took me far too long to work out solid requirements and a workable design and architecture.

Looking back, what NymphCast was at the beginning of 2020 – when it got a sudden surge of attention after an overly enthusiastic post from me on the topic – was essentially a prototype. A prototype is somewhat like an Alpha-level construction, but never meant to be turned into a product: it’s a way to gather information for the design and requirements phase, so that a better architecture and product can be developed. Realising this was essential for me to take the appropriate steps with the NymphCast project.

With only a vague idea of one’s direction and goals while in the Alpha phase, one can be doomed to stay there for a long time, or even forever. After all, when is the Alpha phase ‘done’, when one doesn’t even have a clear definition of what ‘done’ actually means in that context? Clearly one needs to have a clear feature set, clear requirements, a clear schedule and definition of ‘done’ for all of those. Even for a hobby project like NymphCast, there is no fun in being stuck in Alpha Limbo for months or even years.

After my recent post [2] on the continuation of the NymphCast project after a brief burn-out spell, I have not yet gotten the project into a Beta stage. What I have done is frozen the feature set, and together with a friend I’m gradually going through the remaining list of Things That Do Not Work Properly Yet. Most of this is small stuff, though the small stuff is usually the kind of thing that will have big consequences on user friendliness and overall system stability. This is also the point where there are big rewards for getting issues fixed.

The refactored ring buffer class has had some issues fixed, and an issue with a Stop condition was recently resolved. The user experience on the player side has seen some bug fixes as well. This is what Alpha-level testing should be like: the hunting down of issues that impede a smooth use of the software, until everything seems in order.

The moral of this story then is that before one even writes a line of code, it’s imperative that one has a clear map of where to go and what to do, lest one becomes lost. The second moral is that it’s equally imperative to set limits. Be realistic about the features one can implement this time around. Sort the essential from the ‘nice to have’. If one does it right, there is always a new development cycle after release into production where one gets to tear everything apart again and add new things.

Ultimately, the Alpha phase ends when it’s ‘good enough’. The Beta phase ends when the issue tracker begins to run dry. Release Candidates exist because life is full of unexpected surprises, especially when it concerns new software. Yet starting the Alpha phase before putting together a plan makes as much sense as walking into the living room at night without turning a light on because ‘you know where to walk’.

Fortunately, even after you have repeatedly bumped your shins against furniture and fallen over a chair, it’s still not too late to turn on a light and do the limping walk of shame 🙂


[1] https://github.com/MayaPosch/NymphCast
[2] https://mayaposch.wordpress.com/2020/12/27/nymphcast-on-getting-a-chromecast-killer-to-a-beta-release/

NymphCast: on getting a ‘ChromeCast killer’ to a Beta release

December 27, 2020

It’s been a solid nine months since I first wrote about the NymphCast project [1] on my personal blog [2]. That particular blog post ended up igniting a lot of media attention [3], as it also began to dawn on me how much work would still be required to truly get it to a ‘release’ state. Amidst the stress from this, the 2020 pandemic and other factors, the project ended up slumbering for a few months as I tried to stave off burn-out on the project as a whole.

Sometimes such a break from a project is essential, to be able to step back instead of bashing one’s head against the same seemingly insurmountable problems over and over as they threaten to drown you in an ocean of despair, frustration and helplessness. You know, the usual reason why ‘grinding’, let alone a full-blown death march, is such a terrible thing in software development.

One thing I did do during that time off was to solve one particular issue that had made me rather sad during initial NymphCast development: that of auto-discovery of NymphCast servers on the local network. I had attempted to use DNS Service Discovery (DNS-SD, mDNS) for this, but ran into the issue that there is no cross-platform solution for mDNS that Just Works ™. Before reading up on mDNS I had in my mind a setup where the application itself would announce its presence to the network, or to a central mDNS server on the system, as that made sense to me.

Instead I found myself dealing with a half-working solution that basically required Avahi on Linux, Bonjour on MacOS and something custom installed and configured on Windows, not to mention other desktop operating systems. On the client side things were even more miserable, with me finding only a single library for mDNS that was somewhat easy to integrate. Yet even then I had no luck making it work across different OSes, with the running server instances regularly not found, or requiring specific changes to the service name string to get a match.

The troubleshooting there was one factor that nearly made me burn out on the NymphCast project. Then, during that break I figured that I might as well write something myself to replace mDNS. After all, I just needed something that spit out a UDP Broadcast message, and something that listened for it and responded to it. This idea turned into NyanSD [4], which I wrote about before [5].

I have since integrated NyanSD into NymphCast on both the server and client side, with the result that I have had no problems with service discovery any more, regardless of the platform.

Other aspects of NymphCast were less troublesome, but mostly just annoying, such as getting a mobile client for NymphCast. Originally I had planned to use a single codebase for the graphical NymphCast Player application, using Qt’s Android & iOS cross-platform functionality to target desktop and mobile platforms. Unfortunately this ran into the harsh reality of Qt’s limited Android support and spotty documentation [6]. This led me to work on a standard, native Android application written in Java for the GUI and using the JNI to use the same C++ client codebase. This way I only have to port the Qt-specific code on the Android side to the Java-Android equivalent.

Status at this point is that all features for the targeted v0.1 release have been implemented, with testing ongoing. An additional feature that got integrated at the last moment was the synchronisation of music and video playback between different NymphCast devices, for multi-room playback and similar. The project also saw the addition of a MediaServer [7], which allows clients to browse the media files shared by the server, and start playback of these files on any of the NymphCast servers (receivers) on the network. I also refactored the in-memory buffer to use a simple ringbuffer instead of the previous, more complicated buffer.

In order to get the v0.1 development branch out of Alpha and into Beta, a few more usage scenarios have to be tested, specifically the playback of large media files (100+ MB), both with a single NymphCast receiver and a group, and directly from a client as well as using a MediaServer instance. The synchronisation feature has seen some fixes recently already while testing it, but needs more testing to make it half-way usable.

A major issue I found with this synchronisation feature was the difficulty of determining local time on all the distinct devices. With the lack of a real-time clock (RTC) on Raspberry Pi SBCs in particular, I had to refactor the latency algorithm to rely only on the clock of the receiver that was used as the master receiver. This issue will likely require more tweaking over the coming time to get playback synchronised to within 100 ms.

I think that in the run-up to a v0.1 release, the Beta phase will be highly useful in figuring out the optimal end-user scenarios, both in terms of easy setup and configuration, as well as the day to day usage. This is the point where I pretty much have to rely on the community to get a solid idea of what are good ideas, and what patterns should be avoided.

That said, it’s somewhat exciting to see the project now finally progressing to a first-ever Beta release. Shouldn’t be more than a year or two before the first Release Candidate now, perhaps 🙂


[1] https://github.com/MayaPosch/NymphCast
[2] https://mayaposch.blogspot.com/2020/03/nymphcast-casual-attempt-at-open.html
[3] https://mayaposch.blogspot.com/2020/03/the-fickle-world-of-software-development.html
[4] https://github.com/MayaPosch/NyanSD
[5] https://mayaposch.wordpress.com/2020/07/26/easy-network-service-discovery-with-nyansd/
[6] https://bugreports.qt.io/browse/QTBUG-83372
[7] https://github.com/MayaPosch/NymphCast-MediaServer


Easy network service discovery with NyanSD

July 26, 2020

In the process of developing an open alternative to ChromeCast called NymphCast [1], I found myself having to deal with DNS-SD (DNS service discovery) and mDNS [2]. This was rather frustrating, if only because one cannot simply add a standard mDNS client to a cross-platform C++ application, nor is setting up an mDNS record for a cross-platform service (daemon) an easy task: the Linux world mostly uses Avahi, MacOS uses Bonjour, and Windows also kinda-sorta-somewhat uses Bonjour, if it has been set up and configured by the user or a third-party application.

As all that I wanted for NymphCast was an easy way to discover NymphCast receivers (services) running on the local network from a NymphCast client, this all turned out to be a bit of a tragedy, with the resulting solution only really working when running the server and client on Linux. This was clearly sub-optimal, and left me with the options of fighting some more with existing mDNS solutions, implementing my own mDNS server and client, or writing something from scratch.

As mDNS (and thus DNS-SD) is a rather complex protocol, and it isn’t something which I feel a desperate need to work with when it comes to network service discovery of custom services, I decided to implement a light-weight protocol and reference implementation called ‘NyanSD’, for ‘Nyanko Service Discovery’ [3].

NyanSD is a simple binary protocol that uses a UDP broadcast socket on the client and UDP listening sockets on the server side. The client sends out a broadcast query which can optionally request responses matching a specific service name and/or network protocol (TCP/UDP). The server registers one or more services, which could be running on the local system, or somewhere else. This way the server acts more as a registry, allowing one to also specify services which do not necessarily run on the same LAN.

The way that I envisioned NyanSD originally was merely as an integrated solution within NymphCast, so that the NymphCast server could advertise itself on the UDP port while accepting service requests on its TCP port. As I put the finishing touches on this, it hit me that I could easily make a full-blown daemon/service solution out of it as well. With the NyanSD functionality implemented in a single header and source file, it was fairly easy to create a server that reads in service files from a standard location (/etc/nyansd/services on Linux/BSD/MacOS, %ProgramData%\NyanSD\services on Windows). This also allowed me to implement my first-ever Windows service, which was definitely educational.

Over the coming time I’ll be integrating NyanSD into NymphCast and likely discarding the dodgy mDNS/DNS-SD attempt. It will be interesting to see whether I or others will find a use for the NyanSD server. While I think it would be a more elegant solution than the current mess with mDNS/DNS-SD and UPnP network discovery, some may disagree with this notion. I’m definitely looking forward to discussing the merits and potential improvements of NyanSD.


[1] https://github.com/MayaPosch/NymphCast
[2] https://en.wikipedia.org/wiki/Zero-configuration_networking#DNS-based_service_discovery
[3] https://github.com/MayaPosch/NyanSD

Keeping history alive with a 1959 FACOM 128B relay-based computer

August 4, 2019 2 comments

Back in the 1950s, the competition was between vacuum tube (valve) based computers and their relay-based brethren. Whereas the former type was theoretically faster, vacuum tubes suffered from reliability issues, which meant that relay-based computers remained in use alongside tube-based ones. Not surprisingly, Fujitsu also designed a number of such electro-mechanical computers back then. More surprisingly, they are still keeping a FACOM 128B in tip-top shape.

At Fujitsu, known in the 1950s as Fuji Tsushinki Manufacturing Corporation, Ikeda Toshio was involved in the design of first the FACOM 100, completed in 1954, and then the FACOM 128A in 1956. The 128B was a 1958 upgrade of the 128A based on user experiences. Fujitsu installed a FACOM 128B at their own offices in 1959 to assist with projects ranging from the design of camera lenses to the NAMC YS-11 passenger plane, as well as calculation services.

As a successor in a long line of electro-mechanical computers (including the US's 1944 Harvard Mark I), its performance was about as good as relays would allow. The FACOM 128B was rated at 0.1-0.2 seconds for addition and subtraction, 0.1-0.35 seconds for multiplication, with operations involving complex numbers and logarithms taking on the order of seconds. Maybe not amazing by today's (or 1970s) standards, but back then the point was to massively and consistently outperform human computers, with (ideally) unfailing accuracy.

Today, this same FACOM 128B can be found at the Toshio Ikeda Memorial Hall at Fujitsu's Numazu Plant, where it's lovingly maintained by the 49-year-old engineer Tadao Hamada. As the leader of Fujitsu's 2006 project to pass down technology that is still historically relevant, his job is essentially to keep this relay-based computer working the way it has since it was installed in 1959.


Parsing command line arguments in C++

March 17, 2019 4 comments

One of the things which has frustrated me since I first started programming is the difficulty of using the command line arguments provided to one's application. Every one of us is aware of the standard formulation of the main function:

int main(int argc, char** argv);

Here argc is the number of arguments, including the name of the application binary itself, and argv is an array of C-style strings, each containing one argument as split by the shell. This leads to the most commonly used style of specifying arguments:

app.exe -h --long argument_text

What annoyed me all these years is not having a built-in way to parse command line arguments in C++. Sure, there's the getopt [2] way if one uses Linux or a similar OS, and there is a range of argument parser libraries and framework APIs, such as gflags [3], Boost Program Options [4], POCO [5], Qt [6] and many others.

What these do not provide is a simple, zero-dependency way to add argument parsing to C++, while also being as uncomplicated as possible. This led me to put together a simple command line argument parsing class, which does exactly what I desire of such an API, without any complications.

Meet Sarge [1] and its integration test application:

#include "../src/sarge.h"

#include <iostream>
#include <string>


int main(int argc, char** argv) {
	Sarge sarge;
	sarge.setArgument("h", "help", "Get help.", false);
	sarge.setArgument("k", "kittens", "K is for kittens. Everyone needs kittens in their life.", true);
	sarge.setDescription("Sarge command line argument parsing testing app. For demonstration purposes and testing.");
	sarge.setUsage("sarge_test ");
	
	if (!sarge.parseArguments(argc, argv)) {
		std::cerr << "Couldn't parse arguments..." << std::endl;
		return 1;
	}
	
	std::cout << "Number of flags found: " << sarge.flagCount() << std::endl;
	
	if (sarge.exists("help")) {
		sarge.printHelp();
	}
	else {
		std::cout << "No help requested..." << std::endl;
	}
	
	std::string kittens;
	if (sarge.getFlag("kittens", kittens)) {
		std::cout << "Got kittens: " << kittens << std::endl;
	}
	
	return 0;
}
Here one can see most of the Sarge API: setting the arguments we are looking for, followed by the application description and usage text, which will be printed if the user requests the help view, or when our code decides to print it in the case of missing options or similar.

The Sarge class implementation itself is very basic, using nothing but STL features (specifically the vector, map, memory, iostream and string headers) in, as of writing, 136 lines of code.

When asked to parse the command line arguments, it will scan the argument list (argv) for known flags, flags which require a value, and unknown flags. It'll detect unknown flags and missing values, while allowing for short options (single-character) to be chained together.

I'll be using Sarge for my own projects from now on, making additions and tweaks as I see fit. Feel free to have a look and poke at the project as well, and let me know your thoughts.


[1] https://github.com/MayaPosch/Sarge
[2] https://en.wikipedia.org/wiki/Getopt
[3] https://github.com/gflags/gflags
[4] http://www.boost.org/doc/libs/1_64_0/doc/html/program_options.html
[5] https://pocoproject.org/docs/Poco.Util.OptionProcessor.html
[6] http://qt.io


Reviewing dual-layer PCBWay PCBs

March 11, 2019 Leave a comment

This review is an addendum to the first part in the Greentropia Base Board article series [1]. Here we have a look at the PCB ordering options, process and product delivered by PCBWay and conclude with impressions of the Greentropia Base realized with these PCBs.

Much to the delight of professional hardware developers and hobbyists alike, prices for dual layer FR4 PCBs have come down to a point where shipping from Asia has become the major cost factor. An online price comparison [2] brings up the usual suspects, with new and lesser-known PCB manufacturers added to the mix.

In this competitive environment, reputation is just as important as consistently high quality and great service. Thus PCBWay [3] reached out to us to talk about their PCB manufacturing process and products by providing free PCBs, which we accepted as an opportunity to fast-lane the Greentropia Base board [4], a primary building block of the ongoing Greentropia indoor farming project [5].

Ordering the PCBs

PCB specifications guide the design process and show up again when ordering the actual PCBs. They are at the beginning and the end of the board design process – hopefully without escalation to smaller drill sizes, trace widths and layer count.

The manufacturing capabilities [6] are obviously just bounds for the values selected in a definitive set of design rules, leaving room for a trade-off between design challenges and manufacturing cost. Sometimes relaxing the minimum trace width and spacing from 5/5mil (0.125 mm) to 6/6mil (0.15 mm) can make a noticeable difference in PCB cost. And then again, switching from 0.3 mm to 0.25 mm minimum drill size can make fan-out and routing in tight spaces happen, albeit at a certain price.

Logically we will need to look at the price tag of standard and extended manufacturing capabilities. The following picture displays pricing as of the writing of this article:

For some options the pricing is very attractive. Most notably, an array of soldermask colours is available at no additional charge. With the RoHS and REACH directives in place, however, it remains to be seen whether lead-free hot air surface levelling (HASL) will become the new standard at no added cost.

Luckily for our project we do not need to stray far from the well-trodden path and just opt for the lead-free finish on a blue 1.6mm PCB.

The ordering process is hassle-free and provides frequent status updates:


A month after our order, an online gerber viewer [7] was introduced to help designers quickly verify their gerber output before uploading it for the order. It must be noted, however, that this online feature is at an early stage; layer de-duplication, automatic and consistent colour assignment, appropriate z-order and better rendering speed are expected in the future.


Gerbv [8] is a viable alternative which also provides last-minute editing capabilities (e.g. deleting a stray silkscreen element).

Visual inspection

PCBs were received within one week after ordering, packaged in a vacuum sealed bag and delivered in a cardboard box with foam sheets for shock protection. One extra PCB was also included in the shipment, which is nice to have.

The boards present with cleanly machined edges, well-aligned drill pattern and stop masks on both sides and without scratches or defects. The silkscreen has good coverage and high resolution. Adhesion of stop mask and silkscreen printing are excellent. The lead-free HASL finish is glossy and flat, and while we couldn’t put it to the test with this layout, the TSSOP footprint results suggest no issues with TSSOP, TQFP and BGA components down to 0.5mm pitch.

The board identifier is thankfully hidden underneath an SOIC component in the final product. Pads show the expected probe marks from e-test, without affecting the final reflow result; no probe damage to the pads is evident.

Realising the project

We conclude with some impressions of the assembled PCBs, which we will use in the following articles to build an automated watering system.

Here we see our signature paws with the 2 mm wide capacitor C15 next to them for scale. The pitch of the vertical header is 2.54 mm. Tenting of the vias is also consistent and smooth.


Good mask alignment and print quality.

The next picture shows successful reflow of components of different sizes and thermal masses after a lead-free reflow cycle in a convection oven. As the PCBs were properly sealed and fresh, no issues with delamination occurred.


DC-DC section reflow result.

The reflow result with the lead-free HASL PCB and the stencil ordered along with it is also quite promising. No solder bridges were observed despite lack of mask webbing, which is likely due to our mask relief settings and minimum webbing width. Very thin webbing can be destroyed during HASL, so if the additional safety in the 0.15 to 0.2 mm between the pads is needed it’s worth checking back with the manufacturer.


TSSOP reflow result.

Testing showed the 5V to 12V boost converter working without issues, and initial testing of the ADC was also promising. As we continue to test the boards we'll find out whether any issues remain, but so far everything appears to be working as it should.


[1] https://mayaposch.wordpress.com/2019/03/06/keeping-plants-happy-with-the-greentropia-base-board-part-1/
[2] https://pcbshopper.com/
[3] https://www.pcbway.com/
[4] https://github.com/MayaPosch/Greentropia_Base
[5] http://www.nyantronics.com/greentropia.php
[6] https://www.pcbway.com/capabilities.html
[7] https://www.pcbway.com/project/OnlineGerberViewer.html
[8] http://gerbv.sourceforge.net/

Keeping plants happy with the Greentropia Base board – Part 1

March 6, 2019 Leave a comment

Last year I got started on an automatic plant watering project, with the goal of a completely stand-alone, self-sufficient solution. It should be capable of not only monitoring the level of moisture in the soil, but also of controlling a pump that adds water to the soil when needed.

Later iterations of this basic design added a scale to measure the level in the water reservoir, as well as a multi-colour LED to be used as a system indicator and for more decorative purposes. This design was developed further for my third book, which was released [1][2][3] in February of this year. In chapter 5 of that book it is featured as an example project, using the BMaC [4] firmware for the ESP8266 microcontroller.

That’s where the project remained for a while: even though a PCB design (the Greentropia [5] base board) had been created that would accommodate the project’s complete functionality on a single board, the effort and costs associated with turning it into a physical product kept me from simply ordering the PCBs and components.

Thus the board remained just a digital render:


When I was suddenly contacted by a representative from PCBWay [6] with an offer to have free PCBs made in exchange for a review of the finished board, it made it all too easy to finally take the step to have the board produced for real.

After some last-minute, frantic validation of the design and board layout by yours truly and a good friend, the Gerber files were submitted to PCBWay. We used the Gerber viewer in KiCad to check the files prior to submitting them. Later I learned that PCBWay also offers an online Gerber viewer [7]. We did not use that one, but it’s important to use a Gerber viewer before submitting a design, to be sure that the resulting PCB will look and function the way it should.

After a couple of days of PCB production and shipping from China to Germany, the boards arrived:


Top side:


Bottom side:


All boards looked good, with sharp silkscreen features and the soldermask well aligned with the pads. We compared them with another Nyantronics PCB that we have been working on for a while now, that one being from JLCPCB, which made for a good comparison of the blue soldermask that both manufacturers use:


Which colour you prefer is a personal choice, of course. Personally I like the more deep-blue colour of the JLCPCB board, but the PCBWay blue isn’t half bad either. The real concern is of course whether or not the PCB does what it’s supposed to, which is what we’d find out once we assembled the boards.

For this we used a professional reflow oven, courtesy of the local university:


This resulted in the following boards, after a few through-hole components being added by hand:


Each of these boards has sockets for a NodeMCU board, which contains an ESP-12E or 12F module with the ESP8266 microcontroller. This provides the ability to control the pump output and SPI bus, as well as read out the HX711-based scale interface and soil sensor.

Microscope images of the finished boards were also made and can be found in this addendum article: https://mayaposch.wordpress.com/2019/03/11/reviewing-dual-layer-pcbway-pcbs/

In the next parts we will wrap up the remaining development of the hardware, and conclude with the development of the firmware for this board.


[1] https://www.amazon.com/Hands-Embedded-Programming-versatile-solutions-dp-1788629302/dp/1788629302/
[2] https://www.packtpub.com/application-development/hands-embedded-programming-c17
[3] https://www.amazon.de/Hands-Embedded-Programming-versatile-solutions/dp/1788629302/
[4] https://github.com/MayaPosch/BMaC
[5] http://nyantronics.com/greentropia.php
[6] http://www.pcbway.com/
[7] https://www.pcbway.com/project/OnlineGerberViewer