Application architectures are evolving from the era of large monoliths to a more distributed design model. One of the key drivers of this movement is the advent of cloud computing and the ability it brings to handle ever-increasing scale. When an enterprise's people and processes are steeped in building and managing monolithic applications, the journey to building new distributed systems requires re-learning some older design techniques and adopting some new patterns. In this post, I will detail certain architecture concerns that become prominent when moving to a distributed application model:

  • Scheduler/Orchestration management – Moving from managing 100s of instances to managing 1000s of instances requires the ability to schedule and orchestrate service instances/containers across hosts in a seamless manner. To handle increasing scale, workload scheduling/orchestration is a key ingredient of a distributed system. Docker Swarm, Kubernetes, Mesos, Marathon, etc. are some of the leading products in this space.
  • Service Discovery/Registration – As container-based services go up and down, there needs to be a mechanism to register/unregister the services, along with a mechanism to discover the service endpoints at run time. Consul, Zookeeper, etcd, Confd, and Eureka are some of the leading products in this space. Combined with a client-side or proxy load balancer, these registries also allow incoming traffic to be spread across the service instances.
  • System State Management / Cluster Management – As the cluster grows, there is a need to manage the system state of the cluster: what the SRV records are for each service, how many instances are running, on which hosts, what the load is, etc. This calls for cluster management that keeps track of the system state. Docker Swarm Agents, Kubernetes Nodes/Masters, Mesos Slaves, Containership, etc. are some of the leading products in this space.
  • Data storage – Container storage is ephemeral, which means that any data that needs to be retained beyond the container lifecycle needs to be persisted outside the container. Docker Volume Plugin, Flocker, Kubernetes Persistent Volumes, etc. are some of the key projects in this space.
  • Network – With each container running different processes, there is a need to manage, and at times isolate, which container services can access which other services. Multiple containers running on the same host and sharing its network resources might require security groups to be created for container isolation. Similarly, containers might need to discover services hosted on other hosts and need a simple model to access them. Flannel, Weaveworks, and Calico are some of the products in this space.
  • Monitoring/Auditing/Logging – With 1000s of containers running, monitoring, auditing, and logging each container becomes a tough problem. Data and logs need to be pulled from each container for analysis. Loggly, Fluentd, logentries, datadog, and the ELK stack are some of the key products in this space.
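To make the register/unregister and discovery flow from the list above a little more concrete, here is a minimal, in-memory sketch in Python. It is only an illustration of the idea and not how Consul, etcd, or Eureka are implemented. The `ServiceRegistry` class, its method names, the TTL-based heartbeat, and the round-robin selection are all assumptions made for this example.

```python
import itertools
import time
from collections import defaultdict


class ServiceRegistry:
    """Toy in-memory registry (hypothetical); real registries are replicated stores."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be simulated
        # service name -> {endpoint: time of last registration/heartbeat}
        self._instances = defaultdict(dict)
        # service name -> counter used for round-robin selection
        self._counters = defaultdict(itertools.count)

    def register(self, service, endpoint):
        """Record an instance, or refresh it if it is already known."""
        self._instances[service][endpoint] = self._clock()

    # A heartbeat simply refreshes the registration timestamp.
    heartbeat = register

    def deregister(self, service, endpoint):
        """Remove an instance, e.g. when its container shuts down cleanly."""
        self._instances[service].pop(endpoint, None)

    def healthy(self, service):
        """Return endpoints whose last heartbeat is within the TTL."""
        now = self._clock()
        return sorted(
            endpoint
            for endpoint, seen in self._instances[service].items()
            if now - seen <= self._ttl
        )

    def discover(self, service):
        """Pick one healthy endpoint, rotating across instances."""
        endpoints = self.healthy(service)
        if not endpoints:
            raise LookupError(f"no healthy instances of {service!r}")
        return endpoints[next(self._counters[service]) % len(endpoints)]
```

A service instance would call `register("orders", "10.0.0.1:8080")` on startup, call `heartbeat` periodically, and call `deregister` on shutdown, while clients call `discover("orders")` before each request. An instance that crashes without deregistering simply drops out once its TTL lapses, which is the property that matters when containers come and go frequently.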
Besides these, the container OS and container runtime also need to be considered when architecting a distributed application. Other factors such as application runtime, deployment management, DNS, security, SSO/OAuth, API gateways, circuit breakers, and performance/scalability patterns still need to be handled. If, in your experience, there is another key architecture concern for distributed applications, please do share.

This post originally appeared at www.techspot.co.in