# Airbnb Architecture

# Overview

Airbnb is a website that operates an online marketplace and hospitality service for people to lease or rent short-term lodging. The challenges for its engineering team include high availability and rapid scaling. In this post, I summarize the architecture of the Airbnb website in one article. Please tweet to @enqueuezero if you think anything is incorrect or outdated.

Disclaimer: I'm not from the Airbnb team and don't know anybody from Airbnb. All information can be found on the Internet, mainly on the Airbnb engineering blog.

# Solutions

# AWS Stack

Airbnb uses the following AWS services.

  • It uses EC2 instances for its application, memcache, and search servers.
  • It uses RDS as its main MySQL database.
  • It used ELB for traffic load balancing (Note: this seems to be no longer in use; see the Load Balancer section below).
  • It uses EMR for daily data processing and analysis (Note: this seems somewhat outdated; see the Data Warehouse section below).
  • It uses S3 for backups and static files, including user pictures.
  • It uses Amazon CloudWatch to monitor EC2 assets.

# Load Balancer

Charon is Airbnb's front-facing load balancer; it replaced Amazon's ELB. The decision was based on the fact that ELB was clunky and offered little help with troubleshooting.

With Charon, traffic from Akamai hits the Nginx servers directly and is then routed to the backend services by Synapse and HAProxy.

# Service Discovery

  • SmartStack is an open-source service discovery framework. It has two components: Nerve and Synapse. It relies on Zookeeper to store discovery data and on HAProxy for routing.
  • Nerve manages the registration life cycle of microservice instances based on health checks.
  • Synapse looks up microservice instances and automatically updates the HAProxy configuration.
  • Zookeeper stores a znode for each service name and reports changes to microservice instances through Zookeeper watches.
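
The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SmartStack's actual code: `Registry` is a hypothetical in-memory stand-in for the Zookeeper znodes, `report_health` plays the role of Nerve, and `render_backend` plays the role of Synapse by emitting a simplified HAProxy `backend` stanza. All names are invented for this example.

```python
from typing import Dict, List, Set, Tuple

Instance = Tuple[str, int]  # (host, port)


class Registry:
    """Hypothetical in-memory stand-in for the Zookeeper znodes."""

    def __init__(self) -> None:
        self._services: Dict[str, Set[Instance]] = {}

    def register(self, service: str, instance: Instance) -> None:
        self._services.setdefault(service, set()).add(instance)

    def deregister(self, service: str, instance: Instance) -> None:
        self._services.get(service, set()).discard(instance)

    def instances(self, service: str) -> List[Instance]:
        # Sorted so that the rendered configuration is deterministic.
        return sorted(self._services.get(service, set()))


def report_health(registry: Registry, service: str,
                  instance: Instance, healthy: bool) -> None:
    """Nerve-like step: keep an instance registered only while it is healthy."""
    if healthy:
        registry.register(service, instance)
    else:
        registry.deregister(service, instance)


def render_backend(service: str, instances: List[Instance]) -> str:
    """Synapse-like step: render an HAProxy backend stanza for the instances."""
    lines = [f"backend {service}"]
    for host, port in instances:
        lines.append(f"    server {service}_{host}_{port} {host}:{port} check")
    return "\n".join(lines)
```

In a real deployment the re-render would be triggered by a Zookeeper watch rather than by calling `render_backend` by hand, and the generated configuration would be reloaded into HAProxy.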

# Web Tier

Airbnb uses Rails for the front-end.

# Data Tier

Airbnb uses Amazon RDS as its main MySQL database. The databases are deployed across multiple availability zones (multi-AZ). The 3-tier architecture below reflects the basic pattern. Note that there are several types of databases for different scenarios, for example, airmaster, calendar, and message. As a result, over a dozen dbproxy instances and hundreds of database instances are deployed.

Figure: 3-tier database architecture

  • The database servers run the community edition of MySQL Server.
  • Each MySQL server uses a one-thread-per-connection model.
  • Airbnb forked and modified MariaDB MaxScale to serve as its database proxy.
  • The main functions of this proxy layer include connection pooling, request throttling, and query blocklisting.
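
To make the three proxy functions concrete, here is a minimal single-threaded sketch. It is not how Airbnb's MaxScale fork is implemented; `ProxySketch`, `QueryRejected`, and the `run` callback are hypothetical names, and the backend connections are plain placeholder objects rather than real MySQL connections.

```python
import re
from collections import deque
from typing import Any, Callable, Iterable


class QueryRejected(Exception):
    """Raised when the proxy refuses to forward a query."""


class ProxySketch:
    """Toy proxy: pooled backend connections, throttling, and a blocklist."""

    def __init__(self, backend_connections: Iterable[Any],
                 max_in_flight: int,
                 blocklist_patterns: Iterable[str]) -> None:
        self._idle = deque(backend_connections)  # connection pool
        self._max_in_flight = max_in_flight      # throttling limit
        self._in_flight = 0
        self._blocklist = [re.compile(p, re.IGNORECASE)
                           for p in blocklist_patterns]

    def execute(self, sql: str, run: Callable[[Any, str], Any]) -> Any:
        # Query blocklist: refuse statements that match a blocked pattern.
        if any(p.search(sql) for p in self._blocklist):
            raise QueryRejected("query matches blocklist")
        # Request throttling: refuse work beyond the in-flight limit.
        if self._in_flight >= self._max_in_flight or not self._idle:
            raise QueryRejected("throttled: too many in-flight requests")
        # Connection pooling: borrow a backend connection and return it after.
        conn = self._idle.popleft()
        self._in_flight += 1
        try:
            return run(conn, sql)
        finally:
            self._in_flight -= 1
            self._idle.append(conn)
```

Pooling matters here because of the one-thread-per-connection model: many short-lived client connections can share a small, fixed set of backend connections instead of each one costing the MySQL server a thread.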

# Infrastructure as code

Airbnb manages infrastructure with Chef.

# Data Warehouse

The Airbnb data infrastructure handles metrics, trains machine learning models, runs business analytics, and so on.

Figure: Data pipeline

  • Kafka acts as the broker for event logs.
  • Sqoop acts as the broker for production database dumps.
  • The Gold and Silver Hive clusters are the data sinks. The Gold Hive cluster replicates data to the Silver cluster and has a higher SLA guarantee.
  • A Spark cluster handles machine learning and stream processing.
  • A Presto cluster is used for ad hoc querying.
  • An Airflow application runs as the front end for job scheduling.
  • S3 serves as long-term storage for HDFS data.

# Microservices

Airbnb uses the Dropwizard service framework and has customized a Thrift service IDL.

  • Developers can choose between JSON-over-HTTP and Thrift-over-HTTP.
  • Downstream services need to install generated RPC clients from upstream.
  • Downstream services also need to apply standard timeout, retry, and circuit breaker logic.
  • The framework adds request and response metrics on both the service side and the client side.
  • The framework adds request context, including the request ID, to all underlying service requests.
  • The framework supports adding alerts based on metrics like p95_latency, p99_latency, etc.
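
The client-side behavior listed above (timeout, retry, circuit breaker, and request ID propagation) might look roughly like the sketch below. It is an assumption-laden illustration, not Airbnb's framework: `CircuitBreaker`, `call_with_policy`, the `send` transport callback, and the `X-Request-Id` header name are all hypothetical.

```python
import time
import uuid
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    """Opens after N consecutive failures; allows a trial call after a delay."""

    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        # Half-open: let a trial request through once the delay has passed.
        return self._clock() - self._opened_at >= self._reset_after_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()


def call_with_policy(send: Callable[..., Any], payload: Any,
                     breaker: CircuitBreaker, retries: int = 2,
                     timeout_s: float = 1.0,
                     request_id: Optional[str] = None) -> Any:
    """Call `send` with a timeout, bounded retries, and a shared request ID."""
    request_id = request_id or str(uuid.uuid4())
    last_error: Optional[Exception] = None
    for _ in range(retries + 1):
        if not breaker.allow():
            raise CircuitOpenError("circuit open; skipping call")
        try:
            # `send` is a caller-supplied transport that honors `timeout_s`.
            response = send(payload, timeout_s=timeout_s,
                            headers={"X-Request-Id": request_id})
        except Exception as exc:
            breaker.record_failure()
            last_error = exc
            continue
        breaker.record_success()
        return response
    assert last_error is not None
    raise last_error
```

Reusing the same request ID across retries is what lets logs and metrics from several services be correlated back to a single user request.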

# Search Service

Figure: Search service

  • Nebula is a schema-less, versioned data store service with both real-time random data access and offline batch data management.
  • The search flow only adds some search indexing logic to this system.
  • The snapshot is generated daily as a part of the offline data merge.
  • The search index is built from the snapshot and then deployed to search periodically as an ordinary binary deploy.
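
As a rough mental model of the flow above, the sketch below keeps real-time writes separate from an offline snapshot, merges them on demand, and builds an index from the merged snapshot. It does not reflect Nebula's real design or API: `VersionedStore`, `merge_snapshot`, and `build_index` are hypothetical names, and the inverted index is just one assumed form of "search indexing logic".

```python
from typing import Dict, Optional, Set, Tuple

Versioned = Tuple[int, str]  # (version, value)


class VersionedStore:
    """Toy versioned store with a real-time layer and an offline snapshot."""

    def __init__(self) -> None:
        self._snapshot: Dict[str, Versioned] = {}  # result of the last merge
        self._recent: Dict[str, Versioned] = {}    # real-time updates since then

    def get(self, key: str) -> Optional[Versioned]:
        # Real-time data takes precedence over the snapshot.
        if key in self._recent:
            return self._recent[key]
        return self._snapshot.get(key)

    def put(self, key: str, version: int, value: str) -> None:
        # Ignore writes that are not newer than what is already stored.
        current = self.get(key)
        if current is None or version > current[0]:
            self._recent[key] = (version, value)

    def merge_snapshot(self) -> Dict[str, str]:
        """Offline merge (daily in the article): fold recent writes into the snapshot."""
        merged = dict(self._snapshot)
        for key, (version, value) in self._recent.items():
            if key not in merged or version > merged[key][0]:
                merged[key] = (version, value)
        self._snapshot = merged
        self._recent = {}
        return {key: value for key, (_, value) in merged.items()}


def build_index(snapshot: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build a simple inverted index (token -> document IDs) from a snapshot."""
    index: Dict[str, Set[str]] = {}
    for doc_id, text in snapshot.items():
        for token in text.lower().split():
            index.setdefault(token, set()).add(doc_id)
    return index
```

Building the index from a snapshot rather than from live data means the index is an immutable artifact, which fits the article's point that it is shipped like an ordinary binary deploy.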
