AWS Lambda battle 2021: performance comparison for all languages (cold and warm start)

Aleksandr Filichkin on 2021-09-07

Let’s compare the performance of all supported runtimes plus two custom runtimes (Rust and GraalVM).

We will compare both cold and warm starts.

Source code is here: https://github.com/Aleksandr-Filichkin/aws-lambda-runtimes-performance. It requires minimal local setup (almost everything is Dockerized).

Disclaimer:

All benchmarks were performed in September 2021.

I’m not an expert in all these languages, and I’m happy to see merge requests in the GitHub repo with performance improvements. I’m going to support this repo and run the performance test every 3 months. I believe in open-source collaboration :)

Test scenario

We are going to test the API Gateway -> AWS Lambda -> DynamoDB flow.

We will test only the POST endpoint, which saves a book into a DynamoDB table in a known AWS region (us-east-2).
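To make the flow concrete, here is a minimal Python sketch of such a handler. It is not the code from the repository: the table name, the `BOOK_TABLE` environment variable, and the book fields (`id`, `name`, `author`) are assumptions made for illustration. It assumes an API Gateway proxy event, where the request body arrives as a JSON string, and uses boto3's `put_item` to write the item.

```python
import json
import os
import uuid

# Hypothetical table name; the real one lives in the benchmark repository.
TABLE_NAME = os.environ.get("BOOK_TABLE", "book")
REGION = "us-east-2"  # region named in the test scenario

_table = None  # cached so warm invocations reuse the same client


def get_table():
    """Create the DynamoDB table handle once and reuse it on warm starts."""
    global _table
    if _table is None:
        import boto3  # imported lazily so the module loads without AWS access
        _table = boto3.resource("dynamodb", region_name=REGION).Table(TABLE_NAME)
    return _table


def save_book(table, event):
    """Parse the API Gateway proxy event body and store it as one item."""
    book = json.loads(event["body"])
    # Hypothetical schema: id, name, author.
    item = {
        "id": book.get("id") or str(uuid.uuid4()),
        "name": book["name"],
        "author": book["author"],
    }
    table.put_item(Item=item)
    return {"statusCode": 200, "body": json.dumps(item)}


def handler(event, context):
    """Lambda entry point for the POST endpoint."""
    return save_book(get_table(), event)
```

Keeping the table handle outside the handler function means the client is created only during the cold start, so warm invocations pay only for the `put_item` call.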

The main flow

Cold start test

I did my best to reduce the cold start:

For detailed information about cold starts, read here.

Result:

Memory	Java	GraalVM	.NET	Go	Rust	Python	NodeJS	Ruby
128 MB	OOM	1480	11810	1050	844	641	1190	773
256 MB	6570	774	5820	661	480	527	769	612
512 MB	5180	684	2940	404	304	502	771	677
1024 MB	4450	531	1500	299	234	482	656	652
10240 MB	2790	501	904	327	219	449	518	649
Cold start result (ms)
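The table is easier to read when it is ranked per memory size. The short Python sketch below does that with the numbers copied from the table; the `ranking` helper is only an illustration, and the OOM cell for Java at 128 MB is represented as `None` and skipped.

```python
RUNTIMES = ["Java", "GraalVM", ".NET", "Go", "Rust", "Python", "NodeJS", "Ruby"]

# Cold start durations from the table, keyed by Lambda memory size in MB.
ROWS = {
    128: [None, 1480, 11810, 1050, 844, 641, 1190, 773],  # Java: OOM
    256: [6570, 774, 5820, 661, 480, 527, 769, 612],
    512: [5180, 684, 2940, 404, 304, 502, 771, 677],
    1024: [4450, 531, 1500, 299, 234, 482, 656, 652],
    10240: [2790, 501, 904, 327, 219, 449, 518, 649],
}


def ranking(memory_mb):
    """Return runtime names sorted from fastest to slowest cold start."""
    pairs = [
        (value, name)
        for name, value in zip(RUNTIMES, ROWS[memory_mb])
        if value is not None  # skip runtimes that failed with OOM
    ]
    return [name for _, name in sorted(pairs)]


for memory_mb in ROWS:
    print(memory_mb, ranking(memory_mb)[:3])
```

On these numbers, Python has the fastest cold start at 128 MB and Rust is fastest at every larger memory size.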

WARM test

The test sends 15,000 requests to each Lambda, one by one.
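As a rough illustration of what “one by one” means, a sequential loop can be sketched in Python with the standard library alone. This is not the tool used for the benchmark, and the endpoint URL and book fields below are placeholders. Note also that it measures client-side latency, which includes network time and is therefore higher than the Lambda duration itself.

```python
import json
import time
import urllib.request


def post_book(url, book):
    """Send one POST request and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(book).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def run_sequential(send, total):
    """Call send(i) `total` times, one after another; return each duration in ms."""
    durations_ms = []
    for i in range(total):
        started = time.perf_counter()
        send(i)
        durations_ms.append((time.perf_counter() - started) * 1000.0)
    return durations_ms


def main():
    # Placeholder endpoint; replace it with the deployed API Gateway URL.
    url = "https://example.execute-api.us-east-2.amazonaws.com/book"
    timings = run_sequential(
        lambda i: post_book(url, {"id": str(i), "name": "n", "author": "a"}),
        total=15_000,
    )
    print(f"avg={sum(timings) / len(timings):.1f} ms, max={max(timings):.1f} ms")
```

Calling `main()` starts the run. Because each request waits for the previous one to finish, only a single Lambda instance stays warm, which is what this test needs.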

For the load test, I’m using JMeter.

Which metrics will we check?

NodeJS

NodeJS behaves as expected.

The first invocations are slow, but after JIT optimization it gets faster:

NodeJS 256MB average duration
NodeJS 256MB maximum duration

Python

Python has stable performance: the 100th and the 15,000th invocations take the same time.

Python 256MB average duration
Python 256MB maximum duration

Ruby

I observe very weird behavior for Ruby: the average duration keeps growing (it looks like a memory leak or a bug in the code).

Ruby 256MB average duration
Ruby 256MB maximum duration

.NET

The first ~1k invocations are slow, but then it has very good performance:

.NET 256MB average duration
.NET 256MB maximum duration

Golang

Stable, brilliant performance:

Golang 256MB average duration
Golang 256MB maximum duration

Java

The first ~1k iterations are slow, then it becomes faster (the C1 JIT compiler helps).

Java 256MB average duration
Java 256MB maximum duration
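One simple way to quantify this kind of warm-up is to compare the average duration of the first ~1k invocations with the average of the rest. The Python sketch below does that; the `warmup_summary` helper and the sample numbers are synthetic and only illustrate the calculation, they are not measurements from this benchmark.

```python
from statistics import mean


def warmup_summary(durations_ms, warmup=1000):
    """Compare the mean of the first `warmup` invocations with the mean of the rest."""
    head, tail = durations_ms[:warmup], durations_ms[warmup:]
    if not head or not tail:
        raise ValueError("need samples both before and after the warm-up cutoff")
    warmup_avg = mean(head)
    steady_avg = mean(tail)
    return {
        "warmup_avg_ms": warmup_avg,
        "steady_avg_ms": steady_avg,
        "ratio": warmup_avg / steady_avg,  # > 1 means the runtime sped up
    }


# Synthetic durations for illustration only: 1,000 slow calls, then 4,000 fast ones.
sample = [40.0] * 1000 + [10.0] * 4000
print(warmup_summary(sample))
```

A ratio close to 1 would describe a runtime with no visible warm-up, while a clearly larger ratio matches the “slow first, fast later” shape.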

For Java, I expected C2 JIT optimization after 10k iterations, but there is no further optimization even after 20k invocations, and the duration stays the same. See the screenshot below:

Java 256 MB, no C2 optimization.

GraalVM

As expected, GraalVM has stable good performance from the very beginning.

GraalVM 256MB average duration
GraalVM 256MB maximum duration

Rust

Rust has consistently excellent performance.

Rust 256MB average duration
Rust 256MB maximum duration

All together

It’s very tricky to measure average performance because every new Lambda gives a slightly different result (I believe it’s because Lambdas are allocated on different hardware). I ran the test 3 times with a 30-minute delay between tests to get 3 different Lambda allocations.
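To show how three runs can be combined, here is a small Python sketch that reports the mean across runs together with the spread between the best and worst run. The `summarize_runs` helper, the runtime names, and the values are placeholders for illustration, not results from this benchmark.

```python
from statistics import mean


def summarize_runs(run_averages_ms):
    """Summarize per-run average durations from independent Lambda allocations."""
    avg = mean(run_averages_ms)
    low, high = min(run_averages_ms), max(run_averages_ms)
    return {
        "mean_ms": avg,
        "min_ms": low,
        "max_ms": high,
        # How far apart the best and worst run are, relative to the mean.
        "spread_pct": (high - low) / avg * 100.0,
    }


# Placeholder values for illustration only; one average per test run.
runs = {
    "runtime-a": [10.0, 12.0, 11.0],
    "runtime-b": [20.0, 26.0, 23.0],
}

for name, averages in runs.items():
    print(name, summarize_runs(averages))
```

Looking at the spread next to the mean helps to tell a real difference between runtimes from the noise caused by different allocations.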

5K iterations for 3 time slots (256 MB Lambda)
256 MB Lambda

I also tested the same flow on a 128 MB Lambda, and here we can see a big difference.

128MB average warm state
128MB maximum (per minute) warm state

I assume that for a CPU-intensive flow, the difference between compiled and interpreted languages will be much bigger. I guess GraalVM doesn’t perform well at 128 MB because the native image still embeds a VM runtime with a garbage collector, needs too much memory, and therefore runs GC too often.

Conclusion:

Cold start:

Warm start:

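The overall cold + warm start winners are Golang and Rust. They are faster than the other runtimes in most configurations (the exception is the cold start at small memory sizes, where Python and Ruby are competitive) and demonstrate very stable results.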

Check my next performance comparison for AWS Lambda: x86 vs ARM https://filia-aleks.medium.com/aws-lambda-battle-x86-vs-arm-graviton2-perfromance-3581aaef75d9