Concurrency is an important topic in software engineering, and the technology you use matters a great deal when solving concurrency problems. At Trendyol we use Go in many applications, and Go comes with concurrency advantages. In this article, we will look at how Go's approach to concurrency differs and how to use channels.
Before explaining Go channels, we should look at what concurrency is and what problems it brings.
We could run our tasks sequentially in a single thread, but that does not fit many problems, so we use more than one thread to run tasks asynchronously. Concurrency is not the same as running tasks at the same time. Running tasks in parallel depends on both hardware and software. Concurrency is about how we design our code.
In computer science, concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the final outcome. (Wikipedia)
Several problems may arise with concurrency. Shared resources are one of them. A shared resource can be accessed by several threads concurrently, so the order of access matters. While one thread reads a shared variable, another thread may change its value, which may cause inconsistency.
Concurrency Solutions in Go
We may solve concurrency problems using mutexes, semaphores, locks, and so on. Basically, when one thread wants to access the shared resource, it locks the critical section so that other threads cannot access the shared resource until the section is unlocked.
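The locking approach can be sketched in Go itself with sync.Mutex. This is a minimal illustration, not code from our applications: the counter type and the countWithMutex function are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a shared resource guarded by a mutex (hypothetical example type).
type counter struct {
	mu    sync.Mutex
	value int
}

// increment locks the critical section so that only one goroutine
// updates value at a time.
func (c *counter) increment() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.value++
}

// countWithMutex starts n goroutines that each increment the shared counter once.
func countWithMutex(n int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.increment()
		}()
	}
	wg.Wait() // all increments have finished after this point
	return c.value
}

func main() {
	fmt.Println(countWithMutex(1000)) // prints 1000
}
```

Without the Lock and Unlock calls, two goroutines could read the same old value and one increment would be lost.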
Go solves those problems in another way. It uses goroutines instead of threads, and channels instead of access to shared state.
Threads in a traditional Java application are mapped directly to OS threads by the JVM. Go uses goroutines rather than threads. Goroutines are multiplexed onto a small number of OS threads and exist only in the virtual space of the Go runtime. A goroutine's stack starts small and grows when needed (early Go versions used segmented stacks; since Go 1.3 the runtime grows a stack by copying it into a larger contiguous one). That means the stack is managed by the Go runtime, not by the OS.
- Goroutines have other advantages, such as startup time: they start faster than threads.
- A goroutine is created with a stack of only about 2 KB, whereas a Java thread typically reserves about 1 MB by default.
Learning and using goroutines in your application is also simple.
ConsumeFromKafka is just a function that consumes messages from Kafka. When we call it with the “go” keyword in front, the function runs asynchronously in a new goroutine. We could do the same thing in Java by creating a new thread and invoking the method inside that thread.
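A call like the one described above could look roughly as follows. The body of ConsumeFromKafka here is a hypothetical stand-in that does not talk to a real broker, and its signature, the runConsumer helper, and the "orders" topic are assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// ConsumeFromKafka is a hypothetical stand-in for a real consumer:
// it only records which topic it was asked to read.
func ConsumeFromKafka(topic string, result *string, wg *sync.WaitGroup) {
	defer wg.Done()
	*result = "consumed from " + topic
}

// runConsumer starts the consumer in its own goroutine and waits for it.
func runConsumer(topic string) string {
	var result string
	var wg sync.WaitGroup
	wg.Add(1)
	go ConsumeFromKafka(topic, &result, &wg) // "go" runs the call in a new goroutine
	wg.Wait()                                // without waiting, the program could exit first
	return result
}

func main() {
	fmt.Println(runConsumer("orders")) // prints "consumed from orders"
}
```

The only difference from an ordinary function call is the "go" keyword. The WaitGroup is there because the caller does not wait for a goroutine on its own.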
How do goroutines communicate?
Goroutines are good and easy to use. But how do they communicate? We met the term “shared resource” in the previous section. Do goroutines access a shared resource? In idiomatic Go, the answer is no.
The philosophy behind Go’s concurrency is different.
Do not communicate by sharing memory; instead, share memory by communicating. (https://blog.golang.org/codelab-share)
Using a shared resource is not the preferred approach in Go. Instead, goroutines use channels to communicate. Go’s concurrency model is based on CSP (Communicating Sequential Processes, 1978). When data is handed from one goroutine to another over a channel, only one goroutine works with that data at a given time, provided the sender stops using it after sending. Let’s see how to use Go channels.
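As a small illustration of "share memory by communicating", the sketch below lets several goroutines hand their values to a single receiver instead of updating a shared variable. The function name sumOverChannel is hypothetical.

```go
package main

import "fmt"

// sumOverChannel starts n goroutines that each send one value over a channel.
// Only the receiving goroutine ever touches total, so no lock is needed.
func sumOverChannel(n int) int {
	results := make(chan int)
	for i := 1; i <= n; i++ {
		go func(v int) {
			results <- v // hand the value over instead of writing to shared state
		}(i)
	}
	total := 0
	for i := 0; i < n; i++ {
		total += <-results // receive exactly n values
	}
	return total
}

func main() {
	fmt.Println(sumOverChannel(10)) // prints 55
}
```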
Go Channels in Practice
Go channels are like pipes. One goroutine sends data in at one end, and other goroutines receive it at the other end. There are two types of channels, buffered and unbuffered, and there are some differences between them.
A buffered channel has a given capacity to hold data. An unbuffered channel has no capacity at all: it cannot store a value, so each value is handed directly from the sender to a receiver, one at a time.
Sending to and receiving from an unbuffered channel are blocking operations by default. When a goroutine sends data, it is blocked until another goroutine receives that data from the channel. The same holds for the receiving side: when a goroutine tries to receive from the channel, it is blocked until data is sent to the channel. A buffered channel behaves similarly at its limits: the sender is blocked when the buffer is full until another goroutine receives from the channel, and the receiver is blocked when the buffer is empty.
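The blocking rules can be observed without actually blocking by using select with a default case. The trySend helper below is a hypothetical name for this sketch; it reports whether a send could complete immediately.

```go
package main

import "fmt"

// trySend reports whether a send would complete right now without blocking.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false // a plain send would block here
	}
}

func main() {
	unbuffered := make(chan int)  // capacity 0
	buffered := make(chan int, 2) // capacity 2

	fmt.Println(cap(unbuffered), cap(buffered)) // prints 0 2

	fmt.Println(trySend(unbuffered, 1)) // false: no receiver is waiting
	fmt.Println(trySend(buffered, 1))   // true: the buffer has room
	fmt.Println(trySend(buffered, 2))   // true: the buffer is now full
	fmt.Println(trySend(buffered, 3))   // false: would block until a receive
	fmt.Println(<-buffered)             // prints 1 and frees one slot
	fmt.Println(trySend(buffered, 3))   // true
}
```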
Let’s look at how this is used in a part of our production code.
We created 2 buffered channels with a given capacity, then passed them to goroutines. The capacity of the channels depends on your needs. The goroutines communicate via these buffered channels. Let’s see how to send and receive data over channels.
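The setup described here might be sketched as follows. This is not the production code: the worker and squareAll functions, the int payload, and the capacity of 10 are all assumptions chosen to keep the example small.

```go
package main

import "fmt"

// worker receives from jobs and sends to results. The directional channel
// types document which side of each channel this goroutine uses.
func worker(jobs <-chan int, results chan<- int) {
	for j := range jobs {
		results <- j * j
	}
	close(results) // no more results once jobs is closed and drained
}

// squareAll creates two buffered channels and passes them to goroutines.
func squareAll(nums []int) []int {
	jobs := make(chan int, 10)    // the capacity is arbitrary in this sketch
	results := make(chan int, 10) // pick it according to your workload

	go worker(jobs, results)

	go func() {
		for _, n := range nums {
			jobs <- n
		}
		close(jobs)
	}()

	var out []int
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // prints [1 4 9]
}
```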
Sending data to and receiving data from a Go channel is simple.
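The basic operations are the arrow operator for sending and receiving, plus close to signal that no more values will come. The collect helper below is a hypothetical name used only for this sketch.

```go
package main

import "fmt"

// collect receives values until the channel is closed and drained.
func collect(ch <-chan int) []int {
	var got []int
	for {
		v, ok := <-ch // ok is false once the channel is closed and empty
		if !ok {
			return got
		}
		got = append(got, v)
	}
}

func main() {
	ch := make(chan int, 3)

	ch <- 1 // send
	ch <- 2

	v := <-ch      // receive
	fmt.Println(v) // prints 1

	ch <- 3
	close(ch) // values already in the buffer can still be received

	fmt.Println(collect(ch)) // prints [2 3]
}
```

In everyday code, the loop in collect is usually written as "for v := range ch", which stops in the same way when the channel is closed.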
Channels are the recommended way to deal with concurrency in Go applications. But if your problem cannot be solved with channels, you may still use other solutions from the sync package, which provides low-level primitives such as mutexes.
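One case where the sync package tends to be simpler than channels is a lookup table read by many goroutines. The sketch below guards a map with sync.RWMutex; the cache type and its methods are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a map shared by many goroutines; the mutex guards every access.
type cache struct {
	mu   sync.RWMutex
	data map[string]int
}

func newCache() *cache {
	return &cache{data: make(map[string]int)}
}

// set takes the write lock, so it excludes all readers and writers.
func (c *cache) set(key string, value int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

// get takes the read lock, so several readers may run at the same time.
func (c *cache) get(key string) (int, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.data[key]
	return v, ok
}

func main() {
	c := newCache()
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			c.set(fmt.Sprintf("key-%d", n), n)
		}(i)
	}
	wg.Wait()

	v, ok := c.get("key-7")
	fmt.Println(v, ok) // prints 7 true
}
```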
Channels work as expected in our production environment. Here are some metrics.
The application fetches data from Kafka every 5 minutes, then processes it and indexes it into Elasticsearch. All data in the application is transmitted between goroutines over channels. While one goroutine reads data from Kafka, a second goroutine processes that data and a third one inserts it into Elasticsearch. All jobs work asynchronously, not sequentially.
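The three-stage flow described above can be sketched as a pipeline of goroutines connected by channels. Nothing here touches Kafka or Elasticsearch: readMessages, processMessages, and indexDocuments are hypothetical stand-ins, the "processing" is just upper-casing, and the capacity of 8 is arbitrary.

```go
package main

import (
	"fmt"
	"strings"
)

// readMessages stands in for the Kafka consumer stage.
func readMessages(source []string, out chan<- string) {
	for _, m := range source {
		out <- m
	}
	close(out)
}

// processMessages stands in for the processing stage.
func processMessages(in <-chan string, out chan<- string) {
	for m := range in {
		out <- strings.ToUpper(m)
	}
	close(out)
}

// indexDocuments stands in for the Elasticsearch writer; it only collects
// the documents into a slice and reports them when the input is exhausted.
func indexDocuments(in <-chan string, done chan<- []string) {
	var indexed []string
	for d := range in {
		indexed = append(indexed, d)
	}
	done <- indexed
}

// runPipeline wires the three stages together and waits for the last one.
func runPipeline(source []string) []string {
	messages := make(chan string, 8)
	documents := make(chan string, 8)
	done := make(chan []string)

	go readMessages(source, messages)
	go processMessages(messages, documents)
	go indexDocuments(documents, done)

	return <-done
}

func main() {
	fmt.Println(runPipeline([]string{"a", "b", "c"})) // prints [A B C]
}
```

Each stage closes its output channel when it is finished, which is how the next stage knows to stop.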
I hope this article offers another perspective on concurrency and helps you.