# EnqueueZero Techshack 2019-01

Kia ora! I am the creator of Enqueue Zero (https://enqueuezero.com), a site that explains code principles to Developers/DevOps/Sysadmins. This site is updated under a single man team since 2018. The `EnqueueZero Techshack` series is comprised of a bunch of interesting posts I wrote or found on the Internet. More importantly, I hope these posts would please you.

Please follow Twitter account @enqueuezero for the updates. Happy exploring!

# Kubernetes In a Nutshell

enqueuezero.com (opens new window)

Kubernetes is a system for running and coordinating containerized applications. These applications are deployed across a cluster of machines.

As a Kubernetes user, you define how the application should run. You also define how the application should interact with other applications or the outside world.

Kubernetes brings together machine servers into a cluster using a shared network.

  • Master server(s) controls the entire cluster.
  • Nodes are the other servers in the cluster.

# How Discord Stores Billions of Messages

blog.discordapp.com (opens new window)

The underlying database is Cassandra. Reasons for choosing Cassandra was:

  • Read/Write ratio is about 50/50.
  • Many random reads.
  • Large dataset.
  • Uneven amount of data for Private text chat massive Discord servers and Large public Discord servers.

Requirements were:

  • Linear scalability.
  • Automatic failover.
  • Low maintenance.
  • Proven to work.
  • Predictable performance.
  • Not a blob store.
  • Open source.

The data Modeling is to bucket messages by time. The primary key for Message is ((channel_id, bucket), message_id).

CREATE TABLE messages (
   channel_id bigint,
   bucket int,
   message_id bigint,
   author_id bigint,
   content text,
   PRIMARY KEY ((channel_id, bucket), message_id)

Cassandra is an AP database, meaning it trades Consistency with Availability. To solve the dilemma, engineers decided to delete records in the database if the fault detected. The delete action is performed by a form of write called a “tombstone.” On read, it just skips over tombstones it comes across. The retention policy for tombstone is two days by default.

# Everything You Need To Know About Networking On AWS

dev.to (opens new window)

                                        | ig-1  |
                                        |       |
        vpc-123:  |         |       |        |
       |                      |                          |                     |
       |  +-----+             |  +-----+                 |  +-----+            |
       |  | NAT |             |  | NAT |                 |  | NAT |            |
public |  |     |             |  |     |                 |  |     |            |
subnets|  +-----+             |  +-----+                 |  +-----+            |
       |                      |                          |                     |
       |                      |                          |                     |
       |                      |                          |                     |
       |              +-------+                  +-------+             +-------+
       |              | rt-1a |                  | rt-1b |             | rt-1c |
       |  |       |      |       | |       |
       |  | rt-2a |      | rt-2b | | rt-2c |
       |              |       |                  |       |             |       |
       |              +-------+                  +-------+             +-------+
private|                      |                          |                     |
subnets|                      |                          |                     |
       |                      |                          |                     |
       |                      |                          |                     |
       |                      |                          |                     |
       |                      |                          |                     |
       |                      |                          |                     |
       |                      |                          |                     |
       |         AZ 1         |          AZ 2            |        AZ 3         |
  • A virtual private cloud or VPC is a private network space in which you can run your infrastructure.
    • It has an address space (CIDR range) which you choose e.g. It determines how many IP addresses you can have.
  • In VPC, there are public subnets and private subnets, depending on whether traffic can reach them from outside the VPC (the Internet).
  • As long as one Availability Zone or AZ is alive, your service should be able to operate.
  • A routing table contains rules about how IP packets in the subnets can travel to different IP addresses.
  • Internet Gateways allows traffic to the outside world.
  • A NAT gateway is a device which sits in the public subnets, accepts any IP packets bound for the Internet coming from the private subnets, sends those packets on to their destination and then sends the returning packets back to the source.
  • A security groups can specify ingress (inbound) and egress (outbound) traffic rules, limiting them to some sources (inbound) and destinations (outbound).

# An Idiot’s guide to Support vector machines (SVMs)

web.mit.edu (opens new window)

  • Support vectors are the data points that lie closest to the decision surface. They are the data points most difficult to classify.
  • Support Vector Machine

# How do you document a tech project with comics?

jvns.ca (opens new window)

  • Comics without good information are useless.
  • Focus on concepts that don’t change.
  • Make a single graphic with a pretty small amount of information.
  • Make a more in-depth comic / illustration

# High-Reliability Infrastructure Migrations

youtube.com (opens new window)

The topic is about how to improve the availability from 99% to 99.99%.

Below are solutions.

  • Understand the design.
  • Run gamedays.
  • Classify failures.
  • Have incidents only once.
  • Make incremental changes.
  • Always have a rollback plan.

It's okay to start not being an expert, but you need to become one!

# Introducing Pipelines to Airbnb’s Deployment Process

medium.com (opens new window)

Check out this post on how Airbnb design the deployment process and minimize potential misuses.

  • Develop a specification for a configuration file that service owners can use to define their deployment procedures (i.e., pipeline).
  • Save pipeline configurations in a database.
  • Give the frontend a method of accessing this data.
  • Separate and order the targets correctly in the UI display.
  • Disable targets that shouldn’t be deployed to (i.e., it has dependencies that have not yet been deployed to).
  • Finally, implement the UI.

# Learn eBPF Tracing: Tutorial and Examples

www.brendangregg.com (opens new window) | bcc (opens new window) | bpftrace (opens new window) | bcc Python guide (opens new window)

Check out Brendan Gregg's new blog post on the beginner's guide to eBPF.

  • What is eBPF? eBPF does to Linux what JavaScript V8 does to HTML. By leveraging eBPF, you can write mini-programs that run on events like disk I/O, which are run in a safe virtual machine in the kernel.
  • What are bcc and bpftrace? As no one writes V8 bytecode, not many people write eBPF programs directly. Instead, people write bcc and bpftrace.
  • Companies including Netflix and Facebook have bcc installed on all servers by default, and maybe you'll want to as well.
  • Check the bcc Beginner's Tutorial (opens new window). Many more tracing tools exist
  • For intermediate gamers, you can switch from bcc to bpftrace, which has a high-level language that is much easier to learn.
  • An example of bpftrace one-liner:
$ bpftrace -e 'tracepoint:syscalls:sys_enter_open { printf("%d %s\n", pid, str(args->filename)); }'

# Docker Internals

docker-saigon.github.io (opens new window)

A Deep Dive Into Docker For Engineers Interested In The Gritty Details.

  • What is a container?
    • Containers share the host kernel
    • Containers use the kernel ability to group processes for resource control
    • Containers ensure isolation through namespaces
    • Containers feel like lightweight VMs (lower footprint, faster), but are not Virtual Machines!
  • How do containers compare to Package Managers?
    • Package managers failed us due to shared libraries version differences causing dependency issues; packaging shared libraries in an image goes around that.
  • How do containers compare to Configuration Management?
    • It is still advisable to leverage such a provisioning tool to bootstrap the Docker infrastructure, letting the Container Runtime layer take care of the application layer once it is ready.
  • Why Docker?
    • Because Docker is currently the only ecosystem providing the full package of Image management, Resource Isolation, File System Isolation, Network Isolation, Change Management, Sharing, Process Management, Service Discovery (DNS since 1.10)
  • How? Kernel Namespaces. CGroups, iptables, Copy-On-Write UnionFS.
  • Container Runtimes: LXC, Systemd-nspawn, runC
  • How to hook into the various Docker components?

# Everything curl

ec.haxx.se (opens new window) | github.com (opens new window)

Without a doubt, curl is a great software. The author of curl published a book "Everything curl," a 10/10 worth reading.

# GraphQL Server Design @ Medium

medium.com (opens new window)

This article explains how the structure of Medium's GraphQL server helped make the migration much smoother.

Some decisions were made:

They came up with a layered architecture. As discusses in this post https://enqueuezero.com/layered-architecture.html (opens new window), "Each layer has it’s own responsibilities, and only interacts with the layer below it". Below is the proposed layers:

  • Schema. It is the form of the data will take when it gets sent to the clients.
  • Repositories. It “stores” the cleaned-up data that initially came from data sources. Traditionally, it's like a controller layer in MVC, massaging and shaping the data.
  • Fetchers. It's fetched by GraphQL server and for fetching data from multiple data sources

GraphQL server