Phil Calcado on Lessons Learnt During SoundCloud's Microservice Migration

At QCon London 2015, Phil Calcado shared lessons learnt from SoundCloud’s move from a monolithic to a microservices architecture.

The article is written in a very practical way. It introduces best practices on the topic and names fundamental capabilities of a microservice platform, for example

  • rapid provisioning,
  • basic monitoring and
  • rapid application deployment.

He states as a core lesson

the need to standardise monitoring dashboards

because it can be difficult to determine what has broken in a distributed microservice architecture. I have run into this as well, even with a small startup Docker environment.

Currently I standardize my Docker runtime monitoring around the Logentries platform, which provides an easy-to-use and powerful log management and real-time analytics solution. For micro companies in bootstrap or development mode it offers a free tier, which lets you gain knowledge and experience. It is clear to me that when going live I will switch to a paid plan to contribute to this neat service.

Integrating the Logentries remote appender into my Java/Groovy-based curation bot via the log4j library was straightforward. Below you see a screenshot of the aggregated log data. With tags (in the example “Exception” and “Warning”), which can be configured for log messages, it is quite easy to get a quick dashboard overview of any kind of problem. The “Live Tail” feature is handy when you restart the Beanstalk Docker application: during the reconfiguration and restart process the Beanstalk log feature isn’t available, so I can now switch to Logentries and watch the startup process live. That allows me to detect startup problems early.
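To make the tagging idea concrete, here is a minimal sketch in plain Java. It is not the Logentries API and not my bot's code: the class LogTagCounter, its methods and the sample log lines are hypothetical, and it assumes a tag is nothing more than a substring pattern matched against each log line.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts log lines per tag, where a tag is just a
// substring pattern. It mimics the idea of tagging, not the service itself.
class LogTagCounter {
    // Tag name -> substring that triggers the tag (insertion order kept).
    private final Map<String, String> patternsByTag = new LinkedHashMap<>();

    // Register a tag name and the substring that triggers it.
    LogTagCounter tag(String name, String pattern) {
        patternsByTag.put(name, pattern);
        return this;
    }

    // Return, per tag, how many of the given lines contain the tag's pattern.
    Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String tag : patternsByTag.keySet()) {
            counts.put(tag, 0); // tags without a match still show up with 0
        }
        for (String line : lines) {
            for (Map.Entry<String, String> e : patternsByTag.entrySet()) {
                if (line.contains(e.getValue())) {
                    counts.merge(e.getKey(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

Registering the two tags from the example with tag("Exception", "Exception") and tag("Warning", "WARN") and passing a handful of log lines to count(...) gives the kind of per-tag totals a dashboard tile would show.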

Another finding by Phil that will be instrumental when you go to production is the capability

to expose the application’s operational functionality in a standard way, so that monitoring data can be easily accessed, as well as providing the capability to shut down an application or to trip downstream circuit-breakers.

This is something I still have to address. It gives me a reason to look deeper into the AWS Elastic Beanstalk worker configuration, which I want to use for this.
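As a rough idea of what such operational functionality could look like, here is a minimal sketch in plain Java. It is neither SoundCloud's implementation nor anything Elastic Beanstalk provides: ManualCircuitBreaker and its method names are hypothetical, and it only covers a breaker that an operator trips or resets by hand plus a uniform status snapshot that a monitoring endpoint could serialise.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical circuit breaker that operators can trip or reset by hand.
class ManualCircuitBreaker {
    private final AtomicBoolean open = new AtomicBoolean(false);
    private final AtomicLong rejectedCalls = new AtomicLong();

    // Operational controls, meant to be exposed in the same way by every service.
    void trip() { open.set(true); }
    void reset() { open.set(false); }
    boolean isOpen() { return open.get(); }

    // Run the downstream call unless the breaker is open; otherwise use the fallback.
    <T> T call(Supplier<T> downstream, Supplier<T> fallback) {
        if (open.get()) {
            rejectedCalls.incrementAndGet();
            return fallback.get();
        }
        return downstream.get();
    }

    // Status snapshot with fixed keys, so dashboards can read every service alike.
    Map<String, Object> status() {
        Map<String, Object> snapshot = new LinkedHashMap<>();
        snapshot.put("circuitOpen", open.get());
        snapshot.put("rejectedCalls", rejectedCalls.get());
        return snapshot;
    }
}
```

A call such as breaker.call(() -> fetchFromFeed(), () -> cachedResult()) (both suppliers are placeholders) keeps working after trip(), but is then answered from the fallback until reset() is called.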

The full article can be found here:

