Enhanced Networking
AWS offers instance types with varying network capabilities. Network throughput of instances is advertised as: Low, Moderate, High, Up to 10 Gbps, 10 Gbps, and 20 Gbps. To help Netflix teams know the true network throughput of AWS instances, internal micro-benchmarks were run using the netperf tool.
Newer instance types support the Enhanced Networking feature, which allows even smaller instances to achieve higher network throughput and lower latencies. Enhanced Networking allows a native driver to run on the instance with direct access (DMA) to a subset of the NIC hardware resources via the PCIe SR-IOV extension, which lowers virtualization overhead and thus network latency.
To check whether an instance is configured correctly for Enhanced Networking, run: $ sudo ethtool -i eth0. If the driver field shows ena or ixgbevf, then Enhanced Networking is properly configured.
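A minimal way to run this check and see the bound driver (illustrative output; the exact fields reported by ethtool vary with driver and kernel version):

```bash
# Query the driver bound to eth0; "ena" or "ixgbevf" means Enhanced Networking is active
$ sudo ethtool -i eth0
driver: ena

# The same information is available from sysfs
$ readlink /sys/class/net/eth0/device/driver
../../../../bus/pci/drivers/ena
```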
Network Burst
In addition to Enhanced Networking, instance families like I3 and R4 offer a network burst feature on the smaller instance sizes (l|xl|2xl|4xl). These instances use a network credit model, similar to the CPU credit model used by the T2 instance family and the I/O credit model used by EBS GP2 and ST1/SC1 volumes. An instance accumulates network credits during periods of low or no network traffic. Larger instances get more credits and can therefore sustain a network burst of 6 Gbps for a longer period. Once all network credits are consumed, network throughput drops back to the baseline level. Both the network and I/O credit systems work best for bursty workloads, such as Hadoop jobs and large file transfers, that need higher network throughput only for short periods of activity. If your application is still on an older instance type, it is recommended to upgrade to a newer one to achieve better network throughput and lower latency at the same price point.
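On instances using the ena driver, whether a workload is hitting these network credit limits can be spot-checked from the instance itself; newer ena driver versions expose allowance counters (a sketch, assuming a driver version that reports these statistics):

```bash
# Non-zero, growing counters indicate packets queued or dropped because the
# instance exceeded its current network allowance (i.e. burst credits are gone).
$ ethtool -S eth0 | grep -i allowance
     bw_in_allowance_exceeded: 0
     bw_out_allowance_exceeded: 0
     pps_allowance_exceeded: 0
```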
NIC RSS Feature
The instance NIC (sriov/ixgbevf, ena) supports the Receive Side Scaling (RSS) feature, which divides the card into multiple logical receive and transmit descriptor queues, making it a multi-queue device. When packets are received, the NIC hardware fans them out to the various receive queues (the AWS ena NIC can have up to 8 queues), which are bound to different CPUs. This allows packet processing to scale across multiple CPUs to achieve higher throughput and lower latencies. The NIC hardware applies a filter to each packet that assigns it to a logical flow and steers it to the assigned receive queue. The filter is typically a hash function over the network and transport layer headers (the 4-tuple: src IP, src port, dst IP, dst port) that directs traffic to the logical queues.
Thus, to achieve peak throughput, traffic should be spread across all NIC queues. Each queue is then served by a different CPU, which offers better performance and scalability.
See RSS in action below: incoming traffic is spread across multiple logical queues on the same physical NIC of an m5.24xl instance.
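One way to observe that fan-out from a running instance (a sketch; queue counts and per-queue statistic names vary by driver version):

```bash
# Logical RX/TX queues exposed by the NIC
$ ls /sys/class/net/eth0/queues/
rx-0  rx-1  rx-2  rx-3  tx-0  tx-1  tx-2  tx-3

# Per-queue receive counters; under load, all queues should be incrementing
$ ethtool -S eth0 | grep -iE 'queue_[0-9]+_rx'

# Each queue's interrupt should be serviced by a different CPU
$ grep eth0 /proc/interrupts
```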
AWS Instance Network Throughput
The network throughput listed below was measured internally with the netperf tool. To achieve 23 Gbit/s throughput on an instance, network load was generated in parallel from 18 8xl instances with jumbo frames enabled (the BaseAMI is hardwired to a 1500-byte MTU because older AWS instance types lack jumbo frame support, and mixing MTUs on the network results in unpredictable performance).
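A minimal sketch of the kind of netperf run behind these numbers (hostnames and durations here are illustrative, not the actual benchmark harness):

```bash
# On the instance under test: start the netperf server (default port 12865)
$ netserver

# On each load-generating instance: run a 60-second TCP stream test
$ netperf -H <target-hostname> -t TCP_STREAM -l 60

# Many such streams are run in parallel (from 18 8xl instances in the internal
# benchmark, with jumbo frames enabled) to saturate the target's NIC.
```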
Type | L | XL | 2XL | 4XL | 8XL | 9XL | 12XL | 16XL | 18XL | 24XL | 32XL |
---|---|---|---|---|---|---|---|---|---|---|---|
T2 | 500 Mbps | 700 Mbps | 960 Mbps | x | x | x | x | x | x | x | x |
I2 | x | 700 Mbps | 960 Mbps | 2 Gbps | 9.4 Gbps | x | x | x | x | x | x |
I3* | 6 Gbps / 700 Mbps | 6 Gbps / 1 Gbps | 7 Gbps / 2 Gbps | 9 Gbps / 4 Gbps | 9.4 Gbps | x | x | 15 Gbps** | x | x | x |
R3 | 500 Mbps | 700 Mbps | 960 Mbps | 2 Gbps | 5 Gbps | x | x | x | x | x | x |
R4* | 6 Gbps / 700 Mbps | 6 Gbps / 1 Gbps | 7 Gbps / 2 Gbps | 9 Gbps / 4 Gbps | 9.4 Gbps | x | x | 15 Gbps** | x | x | x |
M4 | x | 700 Mbps | 1 Gbps | 2 Gbps | x | x | x | 12 Gbps** | x | x | x |
M5* | 6 Gbps / 700 Mbps | 6 Gbps / 1 Gbps | 7 Gbps / 2 Gbps | 9 Gbps / 4 Gbps | x | x | 9.4 Gbps | x | x | 16 Gbps** | x |
C4 | 500 Mbps | 700 Mbps | 2 Gbps | 4 Gbps | 9.4 Gbps | x | x | x | x | x | x |
C5* | 6 Gbps / 700 Mbps | 6 Gbps / 1 Gbps | 7 Gbps / 2 Gbps | 9 Gbps / 4 Gbps | x | 9 Gbps | x | x | 16 Gbps** | x | x |
D2 | x | 700 Mbps | 2 Gbps | 4 Gbps | 9.4 Gbps | x | x | x | x | x | x |
X1 | x | x | x | x | x | x | x | 9.4 Gbps | x | x | 15 Gbps** |
X1e* | x | 6 Gbps / 600 Mbps | 7 Gbps / 1 Gbps | 9 Gbps / 2 Gbps | 9.4 Gbps | x | x | 9.4 Gbps | x | x | 15 Gbps** |
*The newer instance families (I3, R4, M5, C5, X1e) offer burstable network throughput on the smaller instance sizes (l, xl, 2xl, 4xl); their cells above show burst / baseline throughput. Selecting jumbo frames (9001 MTU) offers a much higher burst, up to 9.4 Gbps, on these sizes. Once burst credits are exhausted, network throughput drops to the baseline level, several-fold lower than the burst level.
**Selecting jumbo frames (9001 MTU) (sudo ip link set dev eth0 mtu 9001) on instance types i3.16xl, r4.16xl, m4.16xl, m5.24xl, c5.18xl, c5.36xl, x1.32xl, x1e.32xl, and x1.64xl can offer network throughput of 23 Gbps.
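Applying and verifying the jumbo-frame setting on a running instance (note the ip link change does not persist across reboots unless it is also made in the network configuration):

```bash
# Raise the MTU to 9001 (jumbo frames) on the primary interface
$ sudo ip link set dev eth0 mtu 9001

# Confirm the new MTU took effect
$ ip link show eth0 | grep -o 'mtu [0-9]*'
mtu 9001
```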