As happens to me from time to time, today I ended up sleepless1 at 2 am with nothing to do. So after contemplating life, the universe and everything for a while, I decided to put my time to good use by playing with MQTT.

In case you haven’t heard of it before, the name MQTT stands for MQ Telemetry Transport, and it is a publish/subscribe protocol aimed at sensor networks and embedded devices. MQTT is an application (layer 7) protocol that uses TCP or other reliable transports and is standardized at the wire format level (in this respect it is similar to AMQP: because the wire format is standardized, any client library can talk to any broker, instead of requiring vendor-specific drivers the way JMS or ODBC/JDBC do).

The operations implemented by MQTT are the bare minimum for this kind of service:

  • Connect / Disconnect
  • Subscribe to a topic
  • Unsubscribe from a topic
  • Publish a message to a topic

There is no concept of a “queue” in MQTT: every message sent to a topic is dispatched to all of its subscribers, which is somewhat confusing given that the first two letters of the protocol’s name are precisely “MQ”2. MQTT does not impose any particular format on the message data, so it handles JSON, XML and binary payloads equally well. This simplicity, plus the lack of restrictions on payloads, makes it ideally suited for embedded systems such as IoT devices.

Deploying Mosquitto

The simplest way to try MQTT these days seems to be Mosquitto, an Eclipse project that implements a full MQTT broker. Luckily for me, several people have done the hard work of packaging Mosquitto as a Docker image, so the only thing I had to do was pull the image and wire up some folders, and I was up and running. This is the command I used:

docker run -it \
    -p [server_ip]:[server_port]:1883 \
    -p [server_ip]:[server_port]:9001 \
    -v /storage/docker/work/mqtt/config:/mosquitto/config \
    -v /storage/docker/work/mqtt/data:/mosquitto/data \
    -v /storage/docker/work/mqtt/log:/mosquitto/log \
    --restart=always --name mosquitto eclipse-mosquitto

By default the image will listen on ports 1883 (MQTT) and 9001 (MQTT over Websocket). I had to remap those ports because I already have other things listening there. I also mapped host directories to the configuration, data and log directories in the container, to make them persistent3.
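
For reference, here is what a minimal mosquitto.conf matching those volume mappings could look like. This is a sketch, not the exact file I used; the allow_anonymous line is only needed on Mosquitto 2.x and later, where anonymous access is disabled by default:

# Persist messages and subscriptions to the mapped data directory
persistence true
persistence_location /mosquitto/data/

# Write logs to the mapped log directory
log_dest file /mosquitto/log/mosquitto.log

# Plain MQTT listener
listener 1883

# MQTT over Websockets listener
listener 9001
protocol websockets

# Mosquitto 2.x only; 1.x allowed anonymous clients by default
allow_anonymous true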

Playing with the Python client

Connecting to an MQTT server is surprisingly straightforward using the Paho client for Python, and the no-nonsense design of the API makes it very easy to work with topics and messages. With just a handful of lines of code I was able to publish and consume simple messages:

import paho.mqtt.client as paho
import threading
import time

# ------------ SUBSCRIBER CODE -----------------
# Callback function for every received message
def processMessage(client, userdata, msg):
  print("Message from " + msg.topic + ": " + str(msg.payload))

# Create receiver and subscribe to the test topic
client = paho.Client()
client.on_message = processMessage
client.connect("[SERVER_IP]", [SERVER_PORT], 60)
client.subscribe("/test/topic")

# Start the receiver loop on a separate thread
loopThread = threading.Thread(target=client.loop_forever)
loopThread.start()

# ------------ PUBLISHER CODE -----------------
# Create publisher
publisher = paho.Client()
publisher.connect("[SERVER_IP]", [SERVER_PORT], 60)

# Give the subscription a moment to settle, then publish 3 messages
time.sleep(1)
for i in range(3):
  publisher.publish("/test/topic", "Message number " + str(i))

# Clean up: wait a moment for delivery, then shut both clients down
time.sleep(1)
publisher.disconnect()
client.disconnect()   # makes loop_forever() return so the thread can exit
loopThread.join()
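
If you just want to sanity-check the broker from a shell, the mosquitto_sub tool that ships with Mosquitto does the same job as the subscriber above (same placeholder host and port):

mosquitto_sub -h [SERVER_IP] -p [SERVER_PORT] -t '/test/topic' -v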

Protocol analysis

Since by default MQTT doesn’t use SSL, I was able to capture and examine the contents of the MQTT conversation between the clients and the broker. I was surprised by how compact the protocol actually is: the message headers are almost nonexistent, and the data is packed so that as much as possible fits in a single TCP segment or link-layer frame.
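
To get a feel for just how compact, here is a rough Python reconstruction of what a QoS 0 PUBLISH packet looks like on the wire, based on the MQTT 3.1.1 spec. This is a sketch; it assumes the Remaining Length fits in a single byte, which holds for packet bodies under 128 bytes:

import struct

topic = b"/test/topic"
payload = b"hi"

# Fixed header: packet type 3 (PUBLISH) in the high nibble; DUP, QoS and RETAIN bits all 0.
# The Remaining Length covers the 2-byte topic length prefix, the topic and the payload.
remaining = 2 + len(topic) + len(payload)
packet = bytes([0x30, remaining]) + struct.pack("!H", len(topic)) + topic + payload

print(len(packet), packet.hex())  # 17 bytes in total for an 11-byte topic and 2-byte payload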

Here are some Wireshark captures of both the publisher and the subscriber. I annotated the TCP stream with the different segments exchanged between clients and server.

First, here is the exchange when the publisher connects to the broker and pushes three messages (client-to-server messages are in red, while server-to-client messages are in blue):

And here is the subscriber’s side of it, with the three messages being pushed to the client by the MQTT broker:


Next step: Home Assistant and the Google Home that I got for Christmas!
Stay tuned.


1 Go ahead, make the “Sleepless in Seattle” joke. You know you want to.
2 Apparently MQTT was once part of IBM’s MQ series of products. Hence the prefix.
3 I know I could have used Docker volumes, but this works better with my backup strategy.

Some people have asked me why I don’t allow comments on this blog. I had initially thought about enabling them but in the end I decided against it.

The truth is, most websites that run on WordPress require the people behind them to continually monitor and block misbehaving users, most of whom turn out to be spambots.
Things like CAPTCHAs, Bayesian classifiers and machine-learning filters solve part of the problem, but (bad) things still slip through the cracks.

There’s also the increased attack surface. WordPress is not even remotely bug-free, and the less unwanted data I let in, the lower the chance of finding out one day that my website has been used to host malware or is part of a botnet.

The last reason why I don’t want comments here is that comment boxes are restrictive anyway. Nowadays anyone can open free accounts on sites like Medium and write beautiful responses full of cat memes, so there’s no need for me to spend hours installing, configuring and maintaining all the infrastructure that would allow you to express yourself here.

So… if you want to send me a comment, note, criticism, encouragement or a cat meme (you know who you are), send an email to nlofeudo [at] gmail [dot] com, tweet me at @NahuelLofeudo, or open an account on a site like Medium and write there. I promise I’ll read everything you send my way.

Thank you.

Most programmers understand Garbage Collection but very few know that memory can get fragmented and filled with holes just like hard drives, with far more serious consequences.


All languages, interpreted or compiled, give programmers the ability to allocate and release spans of memory. Objects, structures or simple blobs of addressable space can be created, used and returned to the memory pool once they are no longer needed. But there’s a catch:

Even with the most efficient memory manager, even with the best-in-class garbage collection algorithm, there is no guarantee that after a piece of code has done its thing the memory will have the same capacity to hold data. Let this sink in for a second: you write your code to the best of your ability, your debugger and profilers tell you there is plenty of memory to go around, and yet your program crashes because it ran out of memory, and there is nothing you can do.

Consider, for example, a piece of code that takes a string already stored in memory, and simply adds an extra character at the end. Regardless of the language, and except in very special circumstances, the program will need to allocate a new chunk of memory to hold the new string, copy the data over (adding the character at the end) and then free the old memory block.
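
You can watch this happen from any language that lets you inspect object addresses. A quick Python sketch:

# Appending to a string allocates a new block and copies the data over;
# the old block becomes garbage, leaving a hole of its exact size behind.
s = "a" * 1000
t = s + "!"
print(id(s) != id(t))  # True: 't' lives at a different address than 's'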


Rinse and repeat. A million times. Ten million times. Across days, weeks or months. The memory space of any non-trivial program becomes a series of holes where new data may not fit. Granted, with today’s computers and heap sizes, a condition like this is unlikely to happen on server-class hardware, but on low-end devices it is a real possibility.

The solution to this problem is to run a process called memory compaction, which physically relocates all objects in the application’s heap and rewrites references and pointers so that all free memory becomes a single block again:


Now, not all (in fact, very few) runtimes do this. The grand list of languages and runtimes that compact memory is:

  1. JRE: Java, Scala, Groovy, etc.
  2. .NET CLR: C#, F#, Visual Basic and others
  3. LISP

The only viable alternative to memory compaction is not to use dynamic memory allocation at all, and to rely exclusively on statically-defined variables and stack-local variables. As you can imagine this reduces the flexibility of the algorithms that can be implemented, but it has the advantage of being the only method that guarantees the program will never run out of space to store objects, and it makes it possible to calculate in advance the amount of memory the program will need.

Not surprisingly, static memory management is the preferred method for systems implemented on microcontrollers and other platforms with a very limited amount of memory. It is really only practical in assembler, C and (to a certain degree) C++.

A plea for sanity in memory management

With this in mind, I’d like to end this article by asking you to please stop using scripting languages like Python, Ruby or PHP for projects that must run for months or years at a time, even when the hardware is not limited. Just stop.

Use languages with a real runtime that will guarantee your program can run for as long as it needs to, or take matters into your own hands and do your own memory management in C. All other options will be problematic in the long run.


This is my second article in a series of posts documenting the rebuild of the LAN at my home, and is a copy of an article I had originally posted on LinkedIn. The original article is here. This post is about the design of the different networks I’m putting in place to connect all my devices while maintaining the security of my personal data.

One of the things I want to do right this time around is to establish a clear separation between Internet-only traffic (strictly from a device to the Internet, without interacting with other devices in the LAN) and traffic to services running in the local network. An example of the first type of traffic would be video streaming or gaming, whereas the second type could be printing a document or accessing the local file server. This doesn’t mean that the Internet-only, or “external”, LAN should be completely isolated: some services will be accessible, but the devices in the external LAN should not have carte blanche to connect to every service or computer they want.

As for wireless access, guests will be able to connect to the external LAN using WPA2 Personal authentication, while wireless clients for the internal LAN will use username/password authentication on a separate WPA2 Enterprise network. All user accounts will be kept in an LDAP server that all services will use (I’ll discuss this in a future post).

And since I’m building this network trying to manage risk as much as possible within budget, I’d also like to put all management services (like SSH servers and web configurators) in a third VLAN that would only be accessible through a limited set of connection points.

My only problem is that the layout of my house forces me to have the Internet connection in my living room, away from the home office that will house the router and the applications server, connected through a second switch that I will install in the living room. So the Internet connection, which under normal circumstances would use a dedicated cable to the router, will in my case be a VLAN sharing the same gigabit link as the rest of the VLANs, which will force me to be very careful when configuring the switches. It’s not ideal but it will have to do.
The network diagram will look, more or less, like this:



All in all, there will be four VLANs in the network:

VLAN 2 – Management

  • Only enabled on specific ports of the backbone switches
  • WLAN access and Remote access only through VPN
  • SSH and management access to servers and switches
  • IPv4:
  • IPv6: fedc::1/48

VLAN 4 – Internal LAN

  • Residents’ WLAN
  • Back-ups
  • File storage
  • Git repository
  • Databases
  • IPv4:
  • IPv6: (Prefix assigned by ISP)/60 + subnet 2

VLAN 8 – External LAN (no internal LAN access)

  • Guests’ WLAN
  • Streaming audio and video
  • Gaming
  • IPv4:
  • IPv6: (Prefix assigned by ISP)/60 + subnet 4

VLAN 64 – Internet

  • Direct cable modem / router link
  • IPv4:(assigned by ISP)
  • IPv6: link-local

The router will provide connectivity from more secure networks to less secure ones: a laptop connected to the internal LAN will be able to access guests’ shared folders, but not the other way around.
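
On a Linux-based router, that one-way policy comes down to a handful of forwarding rules. A sketch, assuming hypothetical interface names vlan4 (internal LAN) and vlan8 (external LAN):

# Internal hosts may open connections towards the external LAN
iptables -A FORWARD -i vlan4 -o vlan8 -j ACCEPT
# External hosts may only answer connections that already exist
iptables -A FORWARD -i vlan8 -o vlan4 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A FORWARD -i vlan8 -o vlan4 -j DROP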

Managing the bandwidth

We have come a long way since the days of dial-up modems and bandwidths of just tens of kilobits per second. Nowadays cable modem speeds go up to tens (sometimes hundreds) of megabits per second and gigabit fiber-to-the-home is starting to become a global standard. That would be enough if traffic demands had not grown exponentially as well, sometimes even faster than the technologies that support them.

In a normal household there are several different types of traffic:

  • Streaming media: non-real-time, bounded speed (usually by resolution or quality of the media).
  • VoIP: real-time, highly sensitive to latency, low or medium bandwidth.
  • Video conference: same as VoIP but with higher bandwidth requirements, for both download and upload.
  • Internet data: web pages and IM; high bandwidth requirements for short amounts of time, medium sensitivity to latency.
  • Bulk data: mostly large file transfers. Low sensitivity to latency.
  • SSH and other terminal services: Sensitive to latency, minimal bandwidth requirements

With this in mind I’d like to take advantage of the HFSC (Hierarchical Fair Service Curve) packet scheduler available in today’s router OSs and create a series of packet classifiers that will prioritize the most time-sensitive traffic while leaving enough bandwidth available for the bulk, data-intensive applications. I’d like to establish a hierarchy that would, at a minimum, guarantee the following, from most important to least important (a configuration sketch follows after the list):

  • Bandwidth for at least one Skype video call, to call our families on the weekends. These calls should have minimal delay (ideally, they should have zero delay). Microsoft recommends 500Kbps up/down for high-quality Skype calls. The packet classifier must allow for at least this much traffic using a real-time scheduling policy.
  • Minimal delay for interactive SSH sessions, but SFTP transfers’ bandwidth should be limited. Traffic on port 22 should have high priority (but not higher than VoIP) if it’s less than 100Kbps. Anything higher than that (most likely an SFTP transfer) should be fair game.
  • Guaranteed bandwidth for streaming video, enough for at least 720p resolution, so that large file transfers don’t prevent other people from watching TV. A 720p HD video stream takes about 4 to 5 Mbps depending on who you ask. I’d like to guarantee 5 Mbps for Netflix and Amazon video, and let everything above that (1080p and above) be fair game.
  • Dynamic bandwidth for off-site backups: Up to 95% of uplink speed during the night, up to 50% during the day.

Everything else would share the same QoS level and compete for bandwidth as usual.
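
To make this concrete, here is roughly what the top of that hierarchy could look like with the Linux tc implementation of HFSC. This is only a sketch: the interface name and the 20 Mbit/s uplink are assumptions, and a real setup would need one filter per traffic class (only the SSH filter is shown):

# Root HFSC qdisc; anything unclassified falls into class 1:40
tc qdisc add dev eth0 root handle 1: hfsc default 40
tc class add dev eth0 parent 1: classid 1:1 hfsc sc rate 20mbit ul rate 20mbit

# Real-time guarantee for the Skype/VoIP class: 500kbit with a tight delay bound
tc class add dev eth0 parent 1:1 classid 1:10 hfsc rt m1 500kbit d 10ms m2 500kbit

# Interactive SSH: link-share capped around 100kbit; bulk SFTP falls through
tc class add dev eth0 parent 1:1 classid 1:20 hfsc ls m2 100kbit

# Streaming video: 5 Mbit guaranteed link-share
tc class add dev eth0 parent 1:1 classid 1:30 hfsc ls m2 5mbit

# Everything else competes here
tc class add dev eth0 parent 1:1 classid 1:40 hfsc ls m2 1mbit

# Example classifier: destination port 22 (SSH) into class 1:20
tc filter add dev eth0 parent 1: protocol ip u32 match ip dport 22 0xffff flowid 1:20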

Note: I posted this article on LinkedIn around February of 2016. As I’m going to use this space to add all my future notes I’m also copying it here. The original post is here.

When my wife and I moved to California in 2012, we arrived with just our laptops and our smartphones. Being the techie I am, I immediately started gathering a collection of new, used and re-purposed hardware to build my home network. I ended up putting together a basic LAN for Internet access and a NAS for family pictures, home videos and personal data, on a limited budget and with whatever I happened to have at hand at the time.