What is Avi?
Avi Networks provides software-defined Load Balancing at its core. Complementing that core service is a plethora of mouthwatering services and features, which is surely the reason Avi was acquired by VMware on July 11th 2019 and rebranded as the NSX Advanced Load Balancer.
Some of the features include application analytics, predictive autoscaling, micro-segmentation and WAF (Web Application Firewall), all available both in the cloud and on-prem. The Load Balancer for any platform!
After reading this, if you like the look of Avi and want a demo or PoC, reach out to the team at https://avinetworks.com/.
The components
Avi builds its components in a similar way to other VMware products: a management, control and data plane. The logical separation of components and functions is a tried and tested method that provides robust and scalable solutions.
The management plane and control plane are merged into one device, à la NSX-T, aptly named the Controller Cluster. This provides a configuration portal and an analytics portal for your services/applications. There is also a CLI and a REST API endpoint should you want to integrate or automate any part of the configuration or analytics. Much like NSX (V or T), the recommended controller deployment is a 3-node cluster, although a 1-node cluster is also supported. The controllers communicate with one another over their management interfaces (encrypted), whether that is to pass workload information between each other, share information learnt from the SEs (Service Engines) or push configuration changes towards the SEs.
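To give a flavour of that REST API, here is a minimal sketch using plain Python requests rather than the official Avi SDK. The controller address, credentials and API version are placeholders, so treat it as illustrative and check the API guide for your release.

```python
# Minimal sketch of talking to the Avi Controller REST API with plain 'requests'.
# Controller address, credentials and X-Avi-Version are placeholders/assumptions.
import requests

CONTROLLER = "https://avi-controller.example.local"   # hypothetical controller address
session = requests.Session()
session.verify = False                                 # lab only; use proper certs in production

# Log in and let the session keep the resulting cookies.
resp = session.post(f"{CONTROLLER}/login",
                    json={"username": "admin", "password": "changeme"})
resp.raise_for_status()

# Pin the API version you run, and carry the CSRF token for any write calls.
headers = {
    "X-Avi-Version": "18.2.9",                        # assumption - match your deployed version
    "X-CSRFToken": session.cookies.get("csrftoken"),  # needed for POST/PUT with cookie auth
    "Referer": CONTROLLER,
}

# List the configured virtual services (the objects that front your VIPs).
vs_list = session.get(f"{CONTROLLER}/api/virtualservice", headers=headers)
for vs in vs_list.json().get("results", []):
    print(vs["name"], vs["uuid"])
```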
The data plane is Avi’s SEs (Service Engines). These are the customer-facing devices that pass traffic. While passing traffic, they collect valuable stats that you are able to see in the Controller portal.
Features such as SSL/TLS offloading are something we have grown accustomed to in Load Balancers, and Avi delivers this on the SEs through its profiles, along with other features.
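As a rough illustration, terminating TLS on the SEs boils down to referencing an SSL profile and a certificate from the virtual service definition. The payload below is a hedged sketch (the object names and exact schema vary by version), reusing the session and headers from the earlier snippet.

```python
# Hedged sketch of a virtual service payload that offloads TLS on the SE.
# The pool, certificate and SSL profile references are placeholders.
vs_payload = {
    "name": "web-vip",
    "services": [{"port": 443, "enable_ssl": True}],   # listen on 443 and terminate TLS
    "ssl_profile_ref": "/api/sslprofile?name=System-Standard",
    "ssl_key_and_certificate_refs": ["/api/sslkeyandcertificate?name=web-cert"],
    "pool_ref": "/api/pool?name=web-pool",
    # VIP addressing omitted: depending on version this is an inline 'vip' list
    # or a separate VsVip object referenced from the virtual service.
}
resp = session.post(f"{CONTROLLER}/api/virtualservice",
                    headers=headers, json=vs_payload)
resp.raise_for_status()
```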
Data Plane Scaling
Auto-scaling with Avi is the concept I found most interesting. This is something we have seen from public cloud providers for a while, and it’s great to see it being offered on-prem. There are 2 concepts to grasp here: the first is the scaling of SE resources and the 2nd is the scaling of VIP (Virtual IP) advertisements.
There are 2 types of SE resource scaling: vertical scaling and horizontal scaling. Vertical scaling means adding more vCPU and/or vRAM to an SE that is hitting, say, 95% utilisation (a reboot is required). Horizontal scaling is done by adding more nodes: rather than increasing the resources on 1 Load Balancer, build a second and/or third SE and split the traffic amongst them. Within the horizontal scaling method there are then 2 types of VIP scaling: Native and BGP-based.
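Before we get to those two VIP scaling methods, it is worth knowing where the sizing and scale-out knobs live: they are properties of the SE group. The snippet below is a hedged sketch of the fields involved as I understand the ServiceEngineGroup object (verify the names against your version), again reusing the earlier API session.

```python
# Hedged sketch of SE group settings that drive vertical sizing and horizontal
# scale-out. Field names are my reading of the ServiceEngineGroup object and
# should be checked against the API guide for your Avi version.
se_group_patch = {
    "vcpus_per_se": 2,            # vertical sizing: vCPU per SE
    "memory_per_se": 4096,        # vertical sizing: MB of RAM per SE
    "min_scaleout_per_vs": 1,     # horizontal: SEs a VIP starts on
    "max_scaleout_per_vs": 4,     # horizontal: ceiling for scale-out per VIP
    "max_se": 10,                 # total SEs the group may create
    "auto_rebalance": True,       # let the controller rebalance VIPs across SEs
}

seg = session.get(f"{CONTROLLER}/api/serviceenginegroup?name=Default-Group",
                  headers=headers).json()["results"][0]
seg.update(se_group_patch)
resp = session.put(f"{CONTROLLER}/api/serviceenginegroup/{seg['uuid']}",
                   headers=headers, json=seg)
resp.raise_for_status()
```

With those limits defined, the interesting part is how the VIP itself is scaled across the SEs.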
Native scaling is simpler to accomplish and has its use cases, but there are limitations. The SEs are able to scale out horizontally to cater for more load, either manually or automatically depending on your preference. This works well, except that while scaled out there is 1 primary SE and n secondary SEs (2 recommended). All initial connections are received by the primary SE, which will either forward them to the other SEs or opt to serve the content itself. It is important to note this only applies to establishing the connection; once a decision has been made as to which SE will serve the content, the traffic between that SE and the user is direct (i.e. not via the primary SE). This works fine, but keep in mind that the bottleneck is the primary SE’s ability to proxy these connection requests, and that you have a single failover object. SEs will promote themselves if the primary is lost, but this takes a few seconds.
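To make the primary/secondary behaviour concrete, here is a purely conceptual sketch, not Avi’s actual code, of what the primary SE is doing for each new flow: take the initial connection, pick an owner SE, and either serve it locally or hand it off so the chosen SE talks to the client directly.

```python
# Purely conceptual sketch of native (L2) scale-out - not Avi's implementation.
# The primary SE sees every new flow, picks an owner, and hands the flow off so
# that subsequent traffic bypasses the primary.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ServiceEngine:
    name: str
    load: float = 0.0            # normalised utilisation, 0.0 - 1.0

@dataclass
class NativeScaleOutVip:
    primary: ServiceEngine
    secondaries: List[ServiceEngine] = field(default_factory=list)

    def dispatch(self, flow_id: str) -> ServiceEngine:
        """The primary SE decides which SE owns a new flow (illustrative logic only)."""
        candidates = [self.primary] + self.secondaries
        owner = min(candidates, key=lambda se: se.load)   # pick the least-loaded SE
        owner.load += 0.01                                # pretend each flow adds a little load
        if owner is self.primary:
            print(f"{flow_id}: primary {owner.name} serves the flow itself")
        else:
            print(f"{flow_id}: primary hands off to {owner.name}; return traffic is direct")
        return owner

vip = NativeScaleOutVip(primary=ServiceEngine("se-1"),
                        secondaries=[ServiceEngine("se-2"), ServiceEngine("se-3")])
for i in range(6):
    vip.dispatch(f"flow-{i}")
```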
BGP-based scaling is more dynamic than native scaling, as I think we all assumed it would be. The mechanism at play here is BGP with RHI (Route Health Injection). Each of the SEs peers with upstream BGP devices. Once the SEs front a VIP, a /32 (IPv4) or /128 (IPv6) host route is advertised to the upstream BGP peers, so you have multiple routes to the same VIP, creating ECMP southbound from the DC fabric. The SEs can scale to 64 ECMP paths, whereas routers will typically have a maximum of 8 ECMP paths (soft or hard limit), so keep that in mind.
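On the Avi side, the BGP peering that makes RHI possible is configured against the VRF context. The sketch below shows the sort of payload involved; the ASNs, peer addresses and field names are my assumptions about the object model, so verify them against your version. The virtual service then needs RHI enabled (an enable_rhi style flag) before its VIP is actually advertised.

```python
# Hedged sketch: attach a BGP profile to the 'global' VRF context so the SEs can
# peer with the upstream fabric and inject /32 (or /128) host routes for VIPs.
# ASNs, peer IPs and field names are assumptions - check your Avi version.
bgp_profile = {
    "local_as": 65001,              # assumed SE-side ASN
    "ibgp": False,                  # eBGP towards the fabric in this sketch
    "peers": [
        {"remote_as": 65000,
         "peer_ip": {"addr": "10.10.10.1", "type": "V4"},
         "subnet": {"ip_addr": {"addr": "10.10.10.0", "type": "V4"}, "mask": 24}},
        {"remote_as": 65000,
         "peer_ip": {"addr": "10.10.20.1", "type": "V4"},
         "subnet": {"ip_addr": {"addr": "10.10.20.0", "type": "V4"}, "mask": 24}},
    ],
}

# Fetch the global VRF context, add the BGP profile, and write it back.
vrf = session.get(f"{CONTROLLER}/api/vrfcontext?name=global",
                  headers=headers).json()["results"][0]
vrf["bgp_profile"] = bgp_profile
resp = session.put(f"{CONTROLLER}/api/vrfcontext/{vrf['uuid']}",
                   headers=headers, json=vrf)
resp.raise_for_status()
```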
Closing
Over the course of the next few months, I will be diving into the Avi product, focusing on the base configuration, auto-scaling and redirections. That said, if there is another area you would like me to investigate and blog about, please reach out to me in the comments section or on Twitter.