Lessons in Golang - Goroutines and Channels

Recently I made publicly available an in-memory scheduler written in Golang, https://gitlab.com/kylehqcom/kevin.

Writing code for public consumption is always something of a scary exercise, knowing full well the critiques, scrutiny and chin rubbing that will come from your peers and non-peers alike. But being an optimist, I appreciate any feedback. Good or bad, it is always gratefully received, as it all helps with the learning. My goal here is to share some of the lessons, mistakes and “a ha” moments I have found in developing with Go.

Long may it continue!

Background

In my own time I have been writing a new SaaS. It’s currently private, but details will emerge as and when the beta rolls out. The project itself is made up of three distinct parts: a getter, a cruncher and a sender. Starting with the “getter”, it became evident pretty early on that I would need a way to schedule “gets”. Such a scheduler would also come in super handy later on with my “sender”. I first took a look at the existing open source packages.

From what I found, most were overkill for what I needed. Sure, Airbnb’s chronos and even the “trimmed down” version, kala, would do the job, but they have way too much overhead for me at this time.

At the other end of the scale, most took a direct Linux cron job approach. gocron will certainly repeat tasks, but it didn’t have a nice way to manipulate running tasks or retrieve details from those tasks for introspection. It’s at this point that I took inspiration from the above and rolled up my sleeves. For reference, take a look at Kevin’s source.

Goroutines and Channels

Because I didn’t need a persistent store to track running jobs, implementing a schedule pool was as simple as creating a map keyed on a string.

type schedules map[string]*schedule

This allows a schedule to be returned based on a generic name, easy enough. When adding a Job to a Schedule, after validation, the details of the job are submitted in a non-blocking *Goroutine. This ensures that any surrounding code can continue to execute while the job happily runs in the background. On submission, if an ID was not passed, a generated ID is returned so the job can be introspected in future. The job runner itself utilises Golang’s built in

time.NewTicker(time.Duration)

to repeat the actual function calls assigned to the job. So nothing terribly scary or out of the ordinary happening here. Well that was until I wanted to use an ID to stop a running job. Although at the time my “knowledge” of the problem was a wee bit lacking, I was pretty confident I could get the desired outcome.
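To make the pieces described so far concrete, here is a minimal sketch of a schedule pool keyed on a string, with a job submitted in a background Goroutine that repeats on a ticker. The names job, addJob and runDemo are made up for illustration and are not Kevin’s actual API. To keep the example finite, the job stops itself after a fixed number of runs, which the real scheduler does not do.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// job is a hypothetical stand-in for a scheduled job, not Kevin's real type.
type job struct {
	Every time.Duration
	Fn    func()
}

// schedule holds jobs keyed by ID; schedules is the pool keyed on a name.
type schedule struct {
	jobs map[string]*job
}

type schedules map[string]*schedule

// addJob stores the job and runs it in a background goroutine so the
// caller is not blocked. For this sketch the job exits after maxRuns ticks.
func (s *schedule) addJob(id string, j *job, maxRuns int, wg *sync.WaitGroup) {
	s.jobs[id] = j
	wg.Add(1)
	go func() {
		defer wg.Done()
		t := time.NewTicker(j.Every)
		defer t.Stop()
		runs := 0
		for range t.C {
			j.Fn()
			runs++
			if runs == maxRuns {
				return
			}
		}
	}()
}

// runDemo schedules one job and reports how many times it ran.
func runDemo() int {
	pool := schedules{"default": &schedule{jobs: map[string]*job{}}}
	count := 0
	var wg sync.WaitGroup
	pool["default"].addJob("job-1", &job{
		Every: 10 * time.Millisecond,
		Fn:    func() { count++ },
	}, 3, &wg)
	wg.Wait() // count is only read after the goroutine has finished
	return count
}

func main() {
	fmt.Println("job ran", runDemo(), "times")
}
```

Note that nothing outside the Goroutine can tell this job to stop early, which is exactly the gap discussed next.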

The question was, how can I access the values of a function running in a background Goroutine?

As far as I was concerned “at the time”, each Goroutine was living in some “forbidden zone” away from variable scope. This left me with one option: using some well-intentioned channels. Channels (among other things) are a way for Goroutines to communicate with each other. But how would I use a **stop channel for a pool of schedules with numerous jobs? The internet is your friend.

After screening through numerous Stack Overflow results, my face made a wry smile when I finally found my way to @matryer. I had been in contact with Mat earlier in the year regarding a Golang position, so I wasn’t surprised to find his repo, https://github.com/matryer/runner, handling my exact use case. Bam, happy days! I now had a way to stop a Goroutine from the inside out.

But like most things, I wasn’t entirely happy with this initial implementation. I took issue with the time between stop being called from the outside and the job’s ticker loop checking for the stopped status. In short, ranging over a ticker is an infinite “for loop” that you have to “break” out of yourself.

// do is called via Run in a goroutine.
func do(r *runner) {
      // This will tick on "Every" forever.
      e := time.NewTicker(r.job.Every)
      defer e.Stop()
      for range e.C {
          ...
          // Work happens
          ...

          // But you can stop via a channel and
          // break from the ticker loop. This check
          // only runs once a tick has fired.
          select {
          case <-stop:
               return
          default:
          }
      }
}

Take for example a job that gets an “Every” value of one week. If we are one day into the week and a stop is called on the job, the runner will then sit and wait for another 6 days before the ticker ticks and the loop stops itself. I didn’t want that wait, so I set out to find a solution.

Curiosity for the win.

After hours of research, code attempts, Goroutine and channel examples I “happened” upon this playground example. Please go take a look.

https://play.golang.org/p/FZKVATjcu0

From this example, it is clear that Goroutines still have access to variables in package scope. Notice that var x int is being updated by one Goroutine, and then being rendered from another. If Goroutines can access a variable in scope, then surely an out-of-scope variable can be reached through a pointer instead?
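The following is a small sketch in the spirit of that idea, not a copy of the playground code. One Goroutine updates a package-scope variable while another reads it. The function name countTo is made up for illustration, and the use of sync/atomic is my own addition: reading and writing a plain int from two Goroutines without synchronisation is a data race under the Go memory model, so the sketch uses atomic operations to stay race-free.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// x lives in package scope, so every goroutine in the package can reach it.
var x int64

// countTo resets x, then has one goroutine increment it n times while a
// second goroutine watches it. It returns the value the watcher last saw.
func countTo(n int64) int64 {
	atomic.StoreInt64(&x, 0)
	var wg sync.WaitGroup
	var seen int64
	wg.Add(2)
	go func() { // writer
		defer wg.Done()
		for i := int64(0); i < n; i++ {
			atomic.AddInt64(&x, 1)
		}
	}()
	go func() { // reader
		defer wg.Done()
		for atomic.LoadInt64(&x) < n {
			time.Sleep(time.Millisecond)
		}
		seen = atomic.LoadInt64(&x)
	}()
	wg.Wait()
	return seen
}

func main() {
	fmt.Println(countTo(100)) // prints 100
}
```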

A pointer holds the memory address of a variable. In past code I had used pointers for memory optimisations or to ensure a single point of truth; I just hadn’t applied the same thinking to Goroutines. So I’m sitting here thinking, if a Goroutine has access to a variable pointer from inside the ticker loop, then why do we even need these channels at all? The answer is, you don’t!

*** My current implementation uses a pointer to a runner instance (https://gitlab.com/kylehqcom/kevin/blob/master/runner.go#L172), and the runner instance is assigned against the schedule map via a JobID. This ensures that any update to the stopped flag on the runner instance can be read by the ticker loop, which will stop/break as required. Winning!
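A rough sketch of that idea, under stated assumptions: the runner type, its fields and the helpers stopByID and runAndStop are hypothetical and are not the code in Kevin’s runner.go. The stopped flag is read and written with sync/atomic here, which is my addition to keep the shared flag free of data races.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runner is a hypothetical stand-in, not the type in Kevin's runner.go.
type runner struct {
	every   time.Duration
	stopped int32 // 1 once stop has been requested; accessed atomically
	ticks   int32 // number of times the work ran; accessed atomically
}

// stop flips the flag through the shared pointer.
func (r *runner) stop() { atomic.StoreInt32(&r.stopped, 1) }

// do ticks forever and checks the flag on every tick.
func (r *runner) do(wg *sync.WaitGroup) {
	defer wg.Done()
	t := time.NewTicker(r.every)
	defer t.Stop()
	for range t.C {
		if atomic.LoadInt32(&r.stopped) == 1 {
			return
		}
		atomic.AddInt32(&r.ticks, 1) // the job's work would happen here
	}
}

// stopByID looks a runner up by JobID, as a schedule map would, and stops it.
func stopByID(runners map[string]*runner, id string) bool {
	r, ok := runners[id]
	if ok {
		r.stop()
	}
	return ok
}

// runAndStop starts one runner, stops it by ID and waits for it to exit.
func runAndStop() bool {
	runners := map[string]*runner{"job-1": {every: 5 * time.Millisecond}}
	var wg sync.WaitGroup
	wg.Add(1)
	go runners["job-1"].do(&wg)
	time.Sleep(20 * time.Millisecond)
	found := stopByID(runners, "job-1")
	wg.Wait() // returns once the ticker loop has seen the flag
	return found
}

func main() {
	fmt.Println("stopped job-1:", runAndStop())
}
```

The flag is still only checked when a tick fires, so a stop can take up to one Every interval to be noticed.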

As much as I enjoyed implementing @matryer’s package, it’s much more enjoyable losing an external dependency!

The Lesson

Channels and Goroutines are awesome and fun, but use them only when required. Often a simpler solution can be found. Even without Mat’s runner package, I am still bound to a ticker being stopped based on its Every duration. But I’m glad I looked for a better solution. I now have a deeper understanding and my Golang just levelled up.

*There are plenty of resources about Goroutines online so please “go” have a read if they are new to you.

** Interestingly, I could still have used a single channel sending JobIDs. Though as I currently see it, this would mean that every runner would have to see ALL job updates. Since a value sent on a channel is received by only one Goroutine, each JobID would need to be fanned out to every runner, and each runner would then have to check it for a match. Terribly inefficient.

*** Addendum: the current implementation uses the closing of an empty struct channel to exit the ticker.

It's where the magic happens © 2018 - KyleHQ