Load Balancer

Running more than one copy of a (stateless) service typically implies that some entity routes requests from clients to one of the copies. This entity is called a load balancer. It gives the service's clients the illusion of an infinitely scalable back-end and is a fundamental building block for many zero-downtime deployments.
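To make the routing idea concrete, here is a minimal sketch of round-robin selection, one of the simplest balancing strategies. The class name `RoundRobinBalancer` and the back-end addresses are made up for illustration; real load balancers also handle health checks, connection draining and much more.

```python
import itertools
from typing import Iterator, List


class RoundRobinBalancer:
    """Hands out back-end addresses in rotation (hypothetical helper)."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one back-end is required")
        # itertools.cycle repeats the list endlessly, in order.
        self._cycle: Iterator[str] = itertools.cycle(backends)

    def pick(self) -> str:
        # Each call returns the next back-end, wrapping around at the end.
        return next(self._cycle)


# Assumed example addresses of three copies of the same service.
balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
picks = [balancer.pick() for _ in range(4)]
print(picks)
```

The fourth request goes to the first copy again, so the client never needs to know how many copies exist.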

Here are some standalone (software) load balancers as well as managed services you might want to use:

Domain Name System (DNS)

Numerical IP addresses are not cool: both machines and humans are better at dealing with logical names such as selfie-service.example.com. How to manage the mapping between numerical IP addresses and logical names (properly called fully qualified domain names, or FQDNs) and how to look them up is defined by the Domain Name System (DNS) specs. If you are not familiar with DNS, check out howdns.works or invest some 20 minutes in this incredibly good video with tons of background.
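As a small illustration of a name lookup, the sketch below asks the operating system's resolver for the addresses behind a name, using Python's standard `socket.getaddrinfo`. Note that this goes through the system resolver (which may consult a hosts file before DNS); the helper name `resolve` is made up for this example, and `localhost` is used so the snippet also works offline.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the unique IP addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({info[4][0] for info in infos})


print(resolve("localhost"))
```

Swapping in a public FQDN would trigger a real DNS query, provided the machine has network access.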

Here are some managed services if you want to or have to run your own DNS infra:

Note also that in recent years a number of efforts have been made to secure DNS. And, as a general reminder:

When in doubt while troubleshooting, check DNS. And again. And again. And again ...

Content Delivery Networks (CDN)

If you are physically far away from a server, data takes longer to arrive than if you are closer. Leaving available bandwidth aside for now, one can gain a lot by serving content close to the end user. Content Delivery Networks (CDNs) do exactly this: their servers are physically close to you, which reduces latency and provides a better UX. They are often used for static content. Some popular CDNs are:


A very important aspect of distributed systems is time. There are multiple challenges:

  • On each machine in a cluster you have a local notion of what time exactly it is; you need to make sure all nodes are synchronized (that is, no clock skew occurs). You usually do that with the venerable ntp or the newer systemd-timesyncd, along with an (external, reliable) NTP server such as time.google.com.
  • Time might not increase monotonically. That means it may look like you're going back in time, say, because of a leap second or a user fat-fingering something. For example, see this discussion in the Go community or an incident at Cloudflare.
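The second challenge above is why durations are usually measured with a monotonic clock rather than the wall clock. Here is a minimal sketch in Python, where `time.time()` is the wall clock (it can jump when the system time is adjusted) and `time.monotonic()` is guaranteed never to go backwards; the helper name `timed` is made up for this example.

```python
import time


def timed(fn):
    """Run fn() and measure its duration with the monotonic clock."""
    start = time.monotonic()
    result = fn()
    # The difference of two monotonic readings is never negative,
    # even if the wall clock is adjusted while fn() runs.
    elapsed = time.monotonic() - start
    return result, elapsed


result, elapsed = timed(lambda: sum(range(100_000)))
print(f"wall clock now: {time.time():.0f}, elapsed: {elapsed:.6f}s")
```

Use the wall clock for timestamps that humans read, and the monotonic clock for timeouts and latency measurements.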

Most of us neither need it nor have the engineering muscle to build it, but there is an interesting concept by Google called TrueTime, a global time synchronization mechanism backed by atomic clocks; see also this article for an intro.


Q: What's the difference between a forward proxy and a reverse proxy?
A: First, forget about forward/reverse. Think: client-side (== forward) and server-side (== reverse) proxy, where the former requires explicit knowledge/config by the client and acts on its behalf and the latter is transparent, acting on the server's behalf. More …

Q: How do I see what's going on inside a container? How can I prod around in it?
A: It depends on the environment: use remote exec (Docker, DC/OS, K8S).

Q: How can I list active network connections and their details in a *nix environment?
A: Use the ss command.

Q: How do I securely connect to a remote server?
A: Follow, for example, the tutorial on how to use SSH to connect to a remote server in Ubuntu.

Q: In DC/OS, how can I develop locally with all the cluster services and nodes being available to me directly?
A: Use DC/OS Tunnel with the VPN option.

Q: How can I see what's going on in and between my containers?
A: Use Weave Scope to visualize and debug clusters (for DC/OS and Kubernetes environments).

Q: Should I block ICMP?
A: Read shouldiblockicmp.com.