
Today we are incredibly excited to open source Envoy, our high performance C++ distributed proxy and communication bus designed for large service oriented architectures. The project was born out of the belief that:
The network should be transparent to applications. When network and application problems do occur it should be easy to determine the source of the problem.
Envoy runs on every host and abstracts the network by providing common features (load balancing, circuit breaking, service discovery, etc.) in a platform-agnostic manner. When all service traffic in an infrastructure flows via an Envoy mesh, it becomes easy to visualize problem areas, tune overall performance, and add substrate features in a single place.
Use at Lyft
Envoy has been in development at Lyft for around 1.5 years. Before Envoy existed, Lyft’s networking setup was fairly standard for a company of our size. We used Amazon’s ELBs for service discovery and load balancing, and a mishmash of different libraries across both PHP and Python. In a few places we deployed HAProxy for increased performance.
At the time, we had about 30 services, and even at that level of scale we faced continuous issues with sporadic networking and service call failures, to the extent that most developers were afraid to have high volume service calls in critical paths. It was incredibly difficult to understand where the problems were occurring. In the service code? In EC2 networking? In the ELB? Who knew? We relied on whatever statistics each application and HAProxy provided, as well as the extremely primitive CloudWatch ELB statistics and logging.
Envoy is influenced by years of experience observing how different companies attempt to make sense of a confusing situation. Initially we used it as our front proxy, and gradually replaced our usage of ELBs across the infrastructure with direct mesh connections and local Envoys running on every service node.
About the project
In practice, achieving complete network transparency is difficult. Envoy attempts to do so by providing the following high level features:
Out of process architecture: Envoy is a self contained process that is designed to run alongside every application server. All of the Envoys form a transparent communication mesh in which each application sends and receives messages to and from localhost and is unaware of the network topology. The out of process architecture has two substantial benefits over the traditional library approach to service to service communication:
Envoy works with any application language. A single Envoy deployment can form a mesh between Java, C++, Go, PHP, Python, etc. It is becoming increasingly common for service oriented architectures to use multiple application frameworks and languages. Envoy transparently bridges the gap.
As anyone who has worked with a large service oriented architecture knows, deploying library upgrades can be painful. Envoy can be deployed and upgraded quickly across an entire infrastructure transparently.
Modern C++11 code base: Envoy is written in C++11. Native code was chosen because we believe that an architectural component such as Envoy should get out of the way as much as possible. Modern application developers already deal with tail latencies that are difficult to understand due to deployments in shared cloud environments and the use of very productive but not particularly well performing languages such as PHP, Python, Ruby, Scala, etc. Native code provides generally excellent latency properties that don’t add additional confusion to an already confusing situation. Unlike other native code proxy solutions written in C, C++11 provides both excellent developer productivity and performance.
L3/L4 filter architecture: At its core, Envoy is an L3/L4 network proxy. A pluggable filter chain mechanism allows filters to be written to perform different L3/L4 proxy tasks and inserted into the main server. Filters have already been written to support various tasks such as raw TCP proxy, HTTP proxy, TLS client certificate authentication, etc.
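To make the idea of a pluggable filter chain concrete, here is a minimal sketch in Python. It is not Envoy's actual C++ filter API; every name in it (`FilterStatus`, `ByteCounter`, `PrefixGate`, `TcpProxy`, `FilterChain`) is hypothetical and chosen only for illustration. It assumes each filter sees incoming bytes in order and can either pass them on or stop the chain.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, List, Protocol


class FilterStatus(Enum):
    CONTINUE = "continue"        # hand the data to the next filter
    STOP_ITERATION = "stop"      # later filters never see this data


class ReadFilter(Protocol):
    """Hypothetical interface: anything with on_data() can join the chain."""

    def on_data(self, data: bytearray) -> FilterStatus: ...


@dataclass
class ByteCounter:
    """Illustrative statistics filter: counts bytes and never blocks."""
    total: int = 0

    def on_data(self, data: bytearray) -> FilterStatus:
        self.total += len(data)
        return FilterStatus.CONTINUE


@dataclass
class PrefixGate:
    """Illustrative gate: passes data on only if it starts with `prefix`."""
    prefix: bytes

    def on_data(self, data: bytearray) -> FilterStatus:
        if bytes(data).startswith(self.prefix):
            return FilterStatus.CONTINUE
        return FilterStatus.STOP_ITERATION


@dataclass
class TcpProxy:
    """Terminal filter: records data where a real proxy would forward it."""
    forwarded: List[bytes] = field(default_factory=list)

    def on_data(self, data: bytearray) -> FilterStatus:
        self.forwarded.append(bytes(data))
        return FilterStatus.STOP_ITERATION


class FilterChain:
    """Runs each filter in order until one stops the iteration."""

    def __init__(self, filters: Iterable[ReadFilter]) -> None:
        self._filters = list(filters)

    def on_data(self, data: bytes) -> None:
        buf = bytearray(data)
        for f in self._filters:
            if f.on_data(buf) is FilterStatus.STOP_ITERATION:
                break
```

In this sketch, adding a new proxy task means writing one more class with an `on_data` method and inserting it into the list passed to `FilterChain`. The order of the list determines which filters see the data first.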
HTTP L7 filter architecture: HTTP is such a critical component of modern application architectures that Envoy supports an additional HTTP L7 filter layer. HTTP filters can be plugged into the HTTP connection management subsystem to perform different tasks such as buffering, rate limiting, routing/forwarding, sniffing Amazon’s DynamoDB, etc.
First class HTTP/2 support: When operating in HTTP mode, Envoy supports both HTTP/1.1 and HTTP/2. Envoy can operate as a transparent HTTP/1.1 to HTTP/2 proxy in both directions. This means that any combination of HTTP/1.1 and HTTP/2 clients and target servers can be bridged. Our recommended service to service configuration uses HTTP/2 between all Envoys to create a mesh of persistent connections that requests and responses can be multiplexed over.
HTTP L7 routing: When operating in HTTP mode, Envoy supports a routing subsystem that is capable of routing and redirecting requests based on path, authority, content type, runtime values, etc. This functionality is most useful when using Envoy as a front/edge proxy but is also leveraged when building a service to service mesh.
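As a rough illustration of routing on path and authority, the following Python sketch picks a target cluster with first-match-wins rules. It is not Envoy's routing configuration or API; the `Route` class, the `pick_cluster` function, and the cluster and host names are all made up for the example, and the first-match ordering is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical routing rule: match on path prefix and optional authority."""
    path_prefix: str
    cluster: str
    authority: Optional[str] = None  # None matches any authority (Host)

    def matches(self, headers: Dict[str, str]) -> bool:
        if self.authority is not None and headers.get(":authority") != self.authority:
            return False
        return headers.get(":path", "").startswith(self.path_prefix)


def pick_cluster(routes: List[Route], headers: Dict[str, str]) -> Optional[str]:
    """Return the cluster of the first matching route, or None if none match."""
    for route in routes:
        if route.matches(headers):
            return route.cluster
    return None


# Illustrative rules only: most specific first, catch-all last.
ROUTES = [
    Route("/api/rides", "rides", authority="api.example.com"),
    Route("/api", "api_gateway"),
    Route("/", "web"),
]
```

Because the first matching rule wins in this sketch, more specific rules have to be listed before broader ones. A request to `/api/rides/42` on a different authority falls through to the `/api` rule.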
gRPC support: gRPC is a new RPC framework from Google that uses HTTP/2 as the underlying multiplexed transport. Envoy supports all of the HTTP/2 features required to be used as the routing and load balancing substrate for gRPC requests and responses. The two systems are very complementary.
MongoDB L7 support: MongoDB is a popular database used in modern web applications. Envoy supports L7 sniffing, statistics production, and logging for MongoDB connections. MongoDB lacks decent hooks for observability, and at Lyft we have found the statistics Envoy produces are invaluable when running sharded MongoDB clusters in production. In summary, Envoy makes MongoDB far more web scale.
DynamoDB L7 support: DynamoDB is Amazon’s hosted key/value NoSQL datastore. Envoy supports L7 sniffing and statistics production for DynamoDB connections. Similar to Envoy’s MongoDB support, having a single source of statistics for all DynamoDB connections from any application platform has been invaluable at Lyft.