AWS Fargate is Amazon's solution for running containers without managing servers or clusters. In many respects AWS Fargate is similar to AWS Elastic Container Service, but without having to deal with EC2 clusters. It is basically "serverless containers".
Let's take a first look at AWS Fargate's network performance!
TL;DR: Sustained network throughput is very stable, but it is not symmetric and does not really grow with assigned container resources.
There are various articles on AWS EC2 network performance, sometimes between instances, sometimes to other AWS services like S3 (e.g. by Andreas Wittig). For EC2 we know those numbers very well and we regularly check them in order to recommend correct StormForger cluster sizes to our customers. Knowing the network performance is very important in order not to be misled when doing performance testing. For EC2, raw network performance in terms of bandwidth depends on the instance type and, more importantly, on the instance size within one instance family. The general rule of thumb: the bigger the instance (and the more you pay), the better networking gets.
Back to AWS Fargate. We were wondering: what kind of network performance can one expect from AWS Fargate? And how does the container's resource sizing (CPU and memory) relate to its network performance? These might be important parameters to know if you have a bandwidth-dependent workload.
Preparation and Test Setup
To assess network performance we are going to use iPerf3. To make sure the measurements are not limited by our end of the iPerf connection, we are using beefy 72-core c5.18xlarge instances, which are advertised with up to 25 Gbit/s network performance. During preparation we did sanity checks between two c5.18xlarge instances, where we saw 22 Gbit/s sustained throughput with peaks reaching close to 25 Gbit/s. That should provide us with enough headroom for our experiments.
Our test target will be AWS Fargate containers launched in eu-west-1. Fargate allows five tiers of "CPU Units", each with a range of memory (see the AWS Fargate documentation). Since we were interested in the relation between network performance and CPU/memory sizing, we decided to test 10 scenarios: the minimum and maximum memory values for each of the five CPU Unit tiers:
|CPU Units||vCPU||Memory (MiB)||Price per hour (USD)|
We also tested the same 10 configurations with iPerf's --reverse option to check whether network performance is symmetric, resulting in a total of 20 scenarios.
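The scenario matrix described above can be sketched in a few lines of Python. Note that the memory limits per tier are an assumption: they are the minimum and maximum values from the AWS Fargate documentation as we remember them, and the names `MEMORY_RANGE_MIB` and `build_scenarios` are made up for this illustration.

```python
from itertools import product

# Assumption: (min, max) memory in MiB per CPU Unit tier, as listed in the
# AWS Fargate documentation; double-check against the current documentation.
MEMORY_RANGE_MIB = {
    256: (512, 2048),
    512: (1024, 4096),
    1024: (2048, 8192),
    2048: (4096, 16384),
    4096: (8192, 30720),
}


def build_scenarios():
    """Return one entry per CPU tier, memory extreme and traffic direction."""
    scenarios = []
    for cpu_units, memory_range in MEMORY_RANGE_MIB.items():
        # reverse=True corresponds to running iPerf with --reverse
        for memory_mib, reverse in product(memory_range, (False, True)):
            scenarios.append(
                {"cpu_units": cpu_units, "memory_mib": memory_mib, "reverse": reverse}
            )
    return scenarios


print(len(build_scenarios()))  # 5 tiers x 2 memory values x 2 directions = 20
```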
All test scenarios are configured to omit the first 10 seconds of measurement in order to skip past the TCP slow-start phase and the initial performance peak. Actual measurements are performed for 60 seconds using two parallel TCP streams. We are primarily interested in sustained network performance, ignoring short throughput peaks.
The results were quite interesting and a bit surprising to us. Without further ado, let's take a look.
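As a rough sketch, these measurement parameters map onto iPerf3 client options as follows. The helper name `iperf3_client_args` and the example IP address are hypothetical; the flags themselves are regular iPerf3 options.

```python
def iperf3_client_args(target_ip, reverse=False, seconds=60, omit=10, streams=2):
    """Build the iperf3 client command line for one measurement."""
    args = [
        "iperf3",
        "--client", target_ip,
        "--time", str(seconds),      # measure for 60 seconds
        "--omit", str(omit),         # skip the first 10 seconds
        "--parallel", str(streams),  # two parallel TCP streams
        "--json",
    ]
    if reverse:
        # the server sends, the client receives
        args.append("--reverse")
    return args


# Hypothetical target address, for illustration only.
print(" ".join(iperf3_client_args("10.0.0.12", reverse=True)))
```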
Here is a plot of all tested configurations with the average bandwidth measured over 60 seconds. "Fargate out" refers to traffic being sent from AWS Fargate (iPerf server) to our EC2 instance (iPerf client), done via iPerf's --reverse flag. "Fargate in" is the other direction, without --reverse, sending data from EC2 to the Fargate container.
Here is the same result data:
|CPU Units||vCPU||Memory (MiB)||Outgoing (Mbit/s)||Incoming (Mbit/s)|
Although for most measurements the bandwidth is not super high, the measured bandwidth is very stable over time. For all tests there was an initial peak, but the sustained bandwidth was very solid with little to no variation. This is very nice as this makes it quite predictable.
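One way to put a number on "stable" is the coefficient of variation of the per-interval throughput in iPerf3's JSON output. The following is a minimal sketch, not the analysis we actually ran; the function names are hypothetical and the sample values are made up for illustration.

```python
from statistics import mean, pstdev


def interval_mbits(result):
    """Per-interval throughput in Mbit/s, skipping intervals marked as omitted."""
    return [
        interval["sum"]["bits_per_second"] / 1e6
        for interval in result["intervals"]
        if not interval["sum"].get("omitted", False)
    ]


def coefficient_of_variation(values):
    """Standard deviation relative to the mean; 0.0 means perfectly stable."""
    return pstdev(values) / mean(values)


# Made-up sample in the shape of iperf3 --json output (not measured data).
sample = {
    "intervals": [
        {"sum": {"bits_per_second": 9.0e8, "omitted": True}},
        {"sum": {"bits_per_second": 3.0e8, "omitted": False}},
        {"sum": {"bits_per_second": 3.0e8, "omitted": False}},
    ]
}
print(coefficient_of_variation(interval_mbits(sample)))
```

The closer the value is to zero, the less the throughput varies over the measurement window.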
In addition to the network throughput we also took a look at the container's CPU utilization (normalized to 100%). This draws a very interesting picture, which could at least explain why network performance is not symmetric (in is higher than out in all cases):
We are not surprised by the relatively low throughput of the smaller container configurations (this is basically very similar to EC2 network performance). Two aspects strike us as very odd:
- Network throughput on AWS Fargate does not seem to be symmetric. Often there is "just" a 2x difference, but it goes up to over 10x.
- Ingress performance of the 2048 CPU / 4 GB memory configuration is way off at 3 Gbit/s (same with 4096 CPU / 8 GB memory), and the throughput goes down again when we increase CPU/memory for the container.
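The asymmetry mentioned in the list above can be computed directly from a pair of iPerf3 result files, one per direction. This is only a sketch: the helper names are hypothetical and the numbers in the example are placeholders, not our measurements.

```python
def received_mbits(result):
    """Average throughput seen by the receiving side, in Mbit/s."""
    return result["end"]["sum_received"]["bits_per_second"] / 1e6


def asymmetry_ratio(in_result, out_result):
    """How many times faster "Fargate in" is compared to "Fargate out"."""
    return received_mbits(in_result) / received_mbits(out_result)


# Placeholder values in the shape of iperf3 --json output (not measured data).
fargate_in = {"end": {"sum_received": {"bits_per_second": 3.0e9}}}
fargate_out = {"end": {"sum_received": {"bits_per_second": 3.0e8}}}
print(asymmetry_ratio(fargate_in, fargate_out))
```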
This is a really strange observation. The biggest container you can get, with 4096 CPU Units and 30 GB memory, has roughly the same network performance as the 50% cheaper 2048 CPU / 16 GB option.
The initial assumption was that network performance correlates with allocated resources, like with EC2. At first this seems to be the case, until the 2048 CPU Unit tier, where things get really odd. Compared to pricing and network performance for EC2, the price per Mbit is actually not that bad (at least for ingress).
For a follow-up one might look into more details, like measuring all memory tiers (currently 1 GB increments), varying the number of TCP streams in iPerf, etc. There might be configuration tweaks that increase network performance overall, but that does not explain the immense (positive) performance outliers.
Do you have questions or remarks regarding AWS Fargate or other performance topics? Just drop us a line :)
In case you are interested in a bit more detail of the test setup, here you go!
We used an unofficial fargate CLI tool by John Pignata, which has a really nice and simple interface. We had to patch the tool to support the newly available regions, including eu-west-1, which we wanted to use for testing.
We used the following Dockerfile to set up the iPerf3 servers running in Fargate:
FROM alpine:latest
RUN apk --update add iperf3 \
    && rm -rf /var/cache/apk/*
ENTRYPOINT ["iperf3"]
CMD ["--server", "--json", "--verbose"]
Creating an AWS Fargate task based on this Dockerfile with our desired parameters was very straightforward. This command will build the container image, upload it to your private Docker registry and launch a Fargate task with the given resource configuration:
$ bin/fargate task run iperf-test --region=eu-west-1 --cpu $CPU_UNITS --memory $MEMORY_UNITS
When the container is ready, you can start iPerf from the beefy EC2 instance like this:
$ iperf3 \
    --time 60 \
    --omit 10 \
    --parallel 2 \
    --reverse \
    --verbose \
    --json \
    --get-server-output \
    --client $IPERF_TARGET_CONTAINER_IP > measurements/c$CPU_UNITS-m$MEMORY_UNITS.json
After each test, we wait 20 seconds before starting the next experiment. This is to give both systems a bit of time to cool down.
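If you want to automate this, the loop could look roughly like the following Python sketch. It is not the script we used: `run_all` and `result_path` are hypothetical names, the scenario tuples are placeholders, and the `run` and `sleep` parameters only exist so the loop can be tried out without a real iPerf3 installation.

```python
import subprocess
import time
from pathlib import Path

COOLDOWN_SECONDS = 20  # pause between two experiments


def result_path(cpu_units, memory_units, directory="measurements"):
    """File name scheme used above: measurements/c<CPU>-m<MEMORY>.json"""
    return Path(directory) / f"c{cpu_units}-m{memory_units}.json"


def run_all(scenarios, directory="measurements", run=subprocess.run, sleep=time.sleep):
    """Run iperf3 once per (target_ip, cpu_units, memory_units) tuple."""
    paths = []
    for index, (target_ip, cpu_units, memory_units) in enumerate(scenarios):
        if index > 0:
            sleep(COOLDOWN_SECONDS)  # let both systems cool down
        args = [
            "iperf3", "--time", "60", "--omit", "10", "--parallel", "2",
            "--json", "--client", target_ip,
        ]
        completed = run(args, capture_output=True, text=True, check=True)
        path = result_path(cpu_units, memory_units, directory)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(completed.stdout)  # iperf3 prints the JSON report to stdout
        paths.append(path)
    return paths
```

Calling `run_all([("10.0.0.12", 256, 512), ("10.0.0.13", 256, 2048)])` with the private IPs of your own containers would write one JSON file per scenario. Since the file name does not encode the direction, runs with --reverse would need a separate directory.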
The iPerf3 client (EC2 instance) and all Fargate containers were started in the same VPC, using the same Security Group, within the eu-west-1 region. No other user processes were running on the EC2 instance besides network monitoring tools for sanity checking.