AWS Lambda battle 2021: performance comparison for all languages (cold and warm start)

Aleksandr Filichkin on 2021-09-07

Let’s compare the performance of all supported runtimes plus 2 custom runtimes (Rust and GraalVM).

We will compare both cold start and warm performance.

Source code is here: https://github.com/Aleksandr-Filichkin/aws-lambda-runtimes-performance. It requires minimal local setup (almost everything is Dockerized).

NodeJS (14.x)
Python (3.9)
Go (1.x)
Ruby (2.7)
.Net (3.1)
Java (11)
Rust (1.54.0)
GraalVM (21.2)

Disclaimer:

All benchmarks were performed in September 2021

I’m not an expert in all of these languages, and I’m happy to see merge requests in the GitHub repo with performance improvements. I’m going to maintain this repo and rerun the performance tests every 3 months. I believe in open-source collaboration :)

Test scenario

We are going to test the API Gateway -> AWS Lambda -> DynamoDB flow.

We will test only the POST endpoint, which saves a book into a DynamoDB table in an explicitly specified AWS region (us-east-2).
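
To make the flow concrete, here is a minimal Python sketch of such a POST handler. It is not the code from the repository: the table name `books`, the attributes `name` and `author`, and the generated `id` key are assumptions made for illustration. Only `boto3.resource(...).Table(...)` and `put_item(Item=...)` are real boto3 calls.

```python
import json
import uuid


def make_table(table_name="books", region="us-east-2"):
    """Create a DynamoDB table handle; the table name is a placeholder."""
    import boto3  # imported lazily so the rest of the sketch runs without the AWS SDK

    return boto3.resource("dynamodb", region_name=region).Table(table_name)


def save_book(table, event):
    """Handle an API Gateway proxy POST: parse the JSON body and store it."""
    book = json.loads(event["body"])
    item = {
        "id": str(uuid.uuid4()),   # hypothetical partition key
        "name": book["name"],      # hypothetical attributes
        "author": book["author"],
    }
    table.put_item(Item=item)
    return {"statusCode": 200, "body": json.dumps(item)}
```

In a real Lambda, the table handle would typically be created once at module level with `make_table()`, and the entry point would just call `save_book(table, event)`.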

Cold start test

I did my best to reduce the cold start:

Removed useless dependencies.
Moved as much as possible to the initialization phase (for example, in Java, made everything static) to use the CPU burst on startup.
Specified the region explicitly.
Got rid of all DI frameworks.
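
The "move work to the initialization phase" item can be sketched in Python as follows. This is an illustration under assumptions, not the repository code: `build_client` is a hypothetical stand-in for expensive setup such as creating an SDK client, and `INIT_CALLS` exists only to show that the setup runs once.

```python
import time

INIT_CALLS = 0


def build_client():
    """Hypothetical stand-in for expensive setup such as creating an SDK client."""
    global INIT_CALLS
    INIT_CALLS += 1
    time.sleep(0.01)  # pretend this is slow
    return {"region": "us-east-2"}  # region fixed up front, not resolved per call


# Module-level code runs once per execution environment, during initialization.
CLIENT = build_client()


def handler(event, context):
    """Runs on every invocation and only reuses what the init phase built."""
    return {"statusCode": 200, "region": CLIENT["region"]}
```

Calling `handler` repeatedly leaves `INIT_CALLS` at 1, so the slow part is paid only on cold start and not on every request.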

You can read detailed information about cold starts here.

Result:

Cold start duration by memory size (values in ms; OOM = out of memory):

Memory	Java	GraalVM	.Net	Go	Rust	Python	NodeJS	Ruby
128 MB	OOM	1480	11810	1050	844	641	1190	773
256 MB	6570	774	5820	661	480	527	769	612
512 MB	5180	684	2940	404	304	502	771	677
1024 MB	4450	531	1500	299	234	482	656	652
10240 MB	2790	501	904	327	219	449	518	649

All languages (except Java and .Net) have a pretty small cold start.
Java cannot even start with 128 MB. It needs more memory. But GraalVM can help in this case. Feel free to read a detailed page about GraalVM and AWS Lambda.
Rust beats all runtimes in all setups; the only exception is 128 MB, where Python is the best.
The largest setup (10240 MB) noticeably helps only Java and .Net.
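
These observations can be checked directly against the table. The short Python sketch below copies the table values, finds the fastest runtime per memory size, and computes how much cold start each runtime gains when moving from 1024 MB to 10240 MB. The function names `fastest` and `gain` are made up for this example.

```python
# Cold start values copied from the table; None marks the Java OOM at 128 MB.
RUNTIMES = ["Java", "GraalVM", ".Net", "Go", "Rust", "Python", "NodeJS", "Ruby"]
COLD_START = {
    128: [None, 1480, 11810, 1050, 844, 641, 1190, 773],
    256: [6570, 774, 5820, 661, 480, 527, 769, 612],
    512: [5180, 684, 2940, 404, 304, 502, 771, 677],
    1024: [4450, 531, 1500, 299, 234, 482, 656, 652],
    10240: [2790, 501, 904, 327, 219, 449, 518, 649],
}


def fastest(memory_mb):
    """Return the runtime with the smallest cold start at the given memory size."""
    row = COLD_START[memory_mb]
    candidates = [(value, name) for value, name in zip(row, RUNTIMES) if value is not None]
    return min(candidates)[1]


def gain(runtime, low_mb=1024, high_mb=10240):
    """Cold start reduction when moving from low_mb to high_mb (table units)."""
    i = RUNTIMES.index(runtime)
    return COLD_START[low_mb][i] - COLD_START[high_mb][i]


for memory_mb in COLD_START:
    print(memory_mb, fastest(memory_mb))
for name in RUNTIMES:
    print(name, gain(name))
```

Java and .Net gain 1660 and 596 respectively from the jump to 10240 MB, while every other runtime gains 138 or less, and Go is even slightly slower.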

Warm test

The test sends 15,000 requests to each Lambda, one by one.
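
"One by one" means each request waits for the previous response, so concurrency stays at 1 and the requests are typically served by the same warm Lambda instance. A minimal Python sketch of such a sequential loop is shown here only as an illustration; `fake_send` is a hypothetical stand-in for the real HTTP POST to API Gateway.

```python
import time


def run_sequential(send, total):
    """Call send() total times, one after another, and record each latency in ms."""
    latencies_ms = []
    for i in range(total):
        started = time.perf_counter()
        send(i)  # blocks until the response arrives, so concurrency stays at 1
        latencies_ms.append((time.perf_counter() - started) * 1000.0)
    return latencies_ms


def fake_send(i):
    """Hypothetical stand-in for the HTTP POST to API Gateway."""
    time.sleep(0.001)


if __name__ == "__main__":
    results = run_sequential(fake_send, 20)
    print(len(results), "requests, slowest", round(max(results), 2), "ms")
```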

For the load test, I’m using JMeter. It looks like:

Which metrics will we check?

The average (per-minute) duration for each language (256 MB setup; a short 128 MB result is at the end)
The maximum (per-minute) duration for each language (256 MB setup)
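
Both metrics are simple per-minute aggregations of individual invocation durations. As a rough sketch, assuming each sample is a `(timestamp_seconds, duration_ms)` pair (a format invented for this example, with made-up numbers), the computation looks like this:

```python
from collections import defaultdict


def per_minute_stats(samples):
    """Group (timestamp_seconds, duration_ms) samples by minute.

    Returns {minute_index: (average_ms, maximum_ms)}.
    """
    buckets = defaultdict(list)
    for timestamp_s, duration_ms in samples:
        buckets[int(timestamp_s // 60)].append(duration_ms)
    return {
        minute: (sum(values) / len(values), max(values))
        for minute, values in sorted(buckets.items())
    }


# Made-up samples for illustration only, not measurements from the benchmark.
samples = [(0, 30.0), (20, 10.0), (59, 20.0), (60, 8.0), (119, 12.0)]
print(per_minute_stats(samples))
```

The average shows the typical warm latency, while the maximum exposes outliers such as slow first invocations or GC pauses that the average hides.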

NodeJS

NodeJS behaves as expected.

The first invocations are slow, but it gets faster after JIT optimization:

Python

Python has stable performance: the 100th and the 15,000th invocations take the same time.

Ruby

I observed very odd behavior for Ruby: the average duration keeps growing (it looks like a memory leak or a bug in the code).

.NET

The first ~1k invocations are slow, but then it has very good performance:

Golang

Stable, brilliant performance:

Java

The first ~1k iterations are slow, then it becomes faster (JIT C1 helps).

For Java, I expected the C2 JIT optimization to kick in after 10k iterations, but there was no further optimization even after 20k invocations, and the duration stayed the same. See the screen below:

GraalVM

As expected, GraalVM has stable good performance from the very beginning.

Rust

Rust has consistently awesome performance.

All together

It’s very tricky to measure average performance because every new Lambda gives slightly different results (I believe it’s because Lambdas are allocated on different hardware). I ran the test 3 times with a 30-minute delay between runs to get 3 different Lambda allocations.

I also tested the same flow with a 128 MB Lambda, and here we can see a big difference.

I assume that for a CPU-intensive flow the difference between compiled and interpreted languages will be much bigger. My guess is that GraalVM doesn’t perform well at 128 MB because the native image still embeds a runtime with a garbage collector, so it needs more memory and, with so little available, runs GC too often.

Conclusion:

Cold start:

All languages (except Java and .Net) have a pretty small cold start.
Java cannot even start with 128 MB. It needs more memory. But GraalVM can help in this case.
Rust beats all runtimes in all setups for cold start; the only exception is 128 MB, where Python is the best.

Warm start:

Golang and Rust are the winners. They have the same brilliant performance.
.Net has almost the same performance as Golang and Rust, but only after 1k iterations (after JIT).
GraalVM has stable, great performance, almost the same as .Net and slightly worse than Rust and Golang, but it doesn’t perform well on the smallest setup.
Java comes next after GraalVM. Like .Net, Java needs some time (1–3k iterations) for JIT (C1). Unfortunately, for this particular use case I was not able to reach the great performance I expected after JIT C2 compilation. Maybe AWS just disabled it.
Python has stable, good performance but is too slow at 128 MB.
Ruby has almost the same performance as Python, but the duration starts growing after 20 minutes of invocations (after 15k iterations).
NodeJS is the slowest runtime. After some time it gets better (JIT?), but it is still not good enough. In addition, NodeJS has the worst maximum duration.

The overall cold + warm start winners are Golang and Rust. They are faster than the other runtimes in most setups and show very stable results.

Check my next performance comparison for AWS Lambda: x86 vs ARM https://filia-aleks.medium.com/aws-lambda-battle-x86-vs-arm-graviton2-perfromance-3581aaef75d9