Analytics for the Engineering Manager

Analytics are critical to success in business today. Businesses can’t expect to understand their customers or the best path forward without constantly leveraging and gaining insights from their data. Just like business leaders use analytics to be more effective, so too should engineering leaders!

Leveraging analytics in your daily work as an engineering manager will help you make better decisions and increase your confidence in those decisions. Simply put, if you build the skills up and use them – it’s like a superpower. It will open up a whole new world of opportunities to make you and your team more effective.

Why it matters

One of the things that comes up a lot in my job as an engineering manager are subjective arguments about things that are actually quite measurable and objective. What’s even more absurd is that very often the thing in question is actually measured and recorded somewhere. It’s just that nobody bothered to pull it into a consumable form so it could be used. Producing reports using data squashes subjective arguments with actual numbers.

Keep up your tech chops

If you’re like me, you miss writing code every day. You like the opportunity to move the needle at a higher level, but still crave technical work. Weaving analytics into your work can help quench that thirst. It provides you problem solving work that is key to your job, but isn’t on the critical path for projects. It’s definitely one way to help keep your tech skills from getting rusty.

And SQL is fun! Accessing data requires no complex deployments. It’s serverless! And today it’s easier than ever.

Modern Tools

You might be thinking you would have to write a bunch of code to ETL data from various places – but you can actually do quite a lot without ever writing a single line of code. Really! The days of writing your own ETL scripts for common data sources are a thing of the past.

Today there is a whole space dedicated to fully managed ETL processes with pre-built connectors to 100’s of popular SaaS tools and products like JIRA, Github & Zendesk – all into your data warehouse of choice.

Just sign up, setup some connectors, point the tool at your data warehouse, initiate a sync, and boom – get super useful data at your fingertips.

A modern ETL Pipeline

Of course, If you are looking to perform analytics on your own application’s data, you might still need to write some code. But there’s also a good chance your company already has some ETL processes sending this data to a data warehouse or data lake. If they don’t, this might be a good time to consider championing that effort, because in this day and age it’s hard to imagine an effective organization that doesn’t constantly collect and analyze their own data.

Once data is in a data warehouse, you need a tool to provide you an environment to query the data and produce dashboards. I use a number of dashboards on a weekly basis for many different tasks (more on use cases later). A current favourite of mine is Periscope Data, but there are many vendors in this space such as Mode and Tableau.

Once you have data synced into your data warehouse and access to an analytics tool, it’s time to write some SQL.

Leveling Up Your SQL Chops

Is your SQL a little rusty? Most engineers have a lot of experience working with SQL in a OLTP context – but writing analytics queries requires some other SQL constructs you might not be as familiar with. The article Top 10 SQL Tips and Tricks that You Didn’t Know were Possible is a great place to start and covers the powerful SQL features you need for writing analytics queries. There are a few in particular that are really important that I use all the time. The article above explains them extremely well, so I will only summarize them briefly here.

Common Table Expressions

Common Table Expressions (CTE’s for short) are a way to re-factor your sub-queries into their own separate named queries. Since everything is a table in SQL, you can capture those results and reference them in other subsequent queries or even other CTE’s. CTE’s are defined using the WITH clause. The other really cool thing about CTE’s is that they are recursive. This makes SQL a turing complete language!

Window Functions

Window functions are one of those things that end up being the secret sauce behind some of my most powerful reports. They let you look ahead or behind within your result set from the perspective of the current row in that result set.

In particular LEAD and LAG are super useful. Lead will look ahead a configurable number of rows in your result set, while lag will look behind.

Treat writing analytics reports like a software engineering problem

Once you start writing a lot of reports, you will find that you really need to make sure your SQL is readable and maintainable. It’s very easy to end up with long and incomprehensible SQL. Treat the SQL code in your reports like any other code you would write. Use source code control to version your reports. Some analytics tools support integration with source code control natively, such as PeriscopeData’s git integration.

Extract sub-queries into common table expressions and give them clear names that convey their purpose. Re-use common SQL code with CTE’s or views. Avoid overly fancy SQL tricks or optimizations if there is a simpler way that is easier to understand. Write comments to explain any unavoidable tricky bits to aide the reader (and your future self).

As an example, take a look at the following before and after pieces of code based on a contrived employee schema:

You can see the full gist here.

The first uses no special SQL features. The second makes use of common table expressions. The difference is pretty pronounced. In the first example, notice how quickly you get lost in the subqueries – and there’s only 2!

In the second example the indentation never gets deeper than one level. The indentation also remains linear with the addition of more sub-queries. It always baffles me when I see walls of SQL in reports that are almost completely unreadable. It’s almost as if it they were never intended to be maintained!

Some Use Cases

Let’s take a look at a few real world use cases.

Project Forecasting

I make pretty heavy use of project forecasts on projects I manage. The main idea is to use yesterday’s weather to help predict how a project’s progress is tracking against a target date. I use these reports directly with my engineers to track progress and make course corrections when things are looking bad (which is usually much sooner than they realize). Note the use of colours in the report as immediate indicators of project health.

JIRA reporting capabilities are very limited. Using an ETL tool with JIRA you can have a live updated project scoreboard easily.

A project milestone forecast live synced from JIRA

You can see the project forecast dashboard I’ve created on github.

Project Scoreboards

A project scoreboard
A scoreboard reflecting % completeness of a service migration along with a tlmeline

People play differently when they are keeping score. Having simple dashboards which capture this kind of data can be really powerful. They help motivate a team to see something through to completion and provide visibility and transparency to upper management and stakeholders.

Other Use cases

I use reports for many other things too. Here are just a few other use cases to give you a sense of the breadth of utility analytics can be to your daily work as a manager.

Pulse Reports

I call these pulse reports because they are excellent at business level monitoring. Any good analytics and dashboarding software has the ability to setup subscriptions to send you runs of your reports on a configurable cadence. Set these up and have reports land directly into your inbox for a quick look.

One example where I use this is with in progress ticket times. As I find JIRA’s kanban boards hard to read (they also don’t account for many things like holidays). I setup a simple report to show me with traffic lights if any tickets have been in-flight for more than a certain number of days. This provides an early warning sign of someone being blocked on a ticket. This is also really useful when you miss daily standups.

Ad-hoc Reporting

Sometimes bugs pop up that warrant some deeper investigation based on historical data or trends. Producing ad-hoc reports for forensic purposes can be really useful. Another advantage of this sort of report is you can send it to any stakeholders and clearly articulate the patterns you saw in the data. You can also re-use them if the issue re-surfaces.

Tips & Advice

Measure twice, measure again

Scrutinize the results you get. See if they match your intuition. Measure things in different ways to see if you arrive at different results.

Use these tools ethically

Don’t blindly trust numbers. The numbers tell a story, but be careful confusing precision with accuracy. Much can go untracked and thus is unrepresented in the data. A good example is using JIRA data to track engineer productivity. I would recommend you avoid doing this. It will lie to you. Engineer output and value cannot be solely measured in tickets shipped or ticket cycle times.

High margins of error can be OK

In his book How to Measure Anything, Douglas W. Hubbard talks about the value of reductions of uncertainty. He describes the common misconception that measurements must eliminate uncertainty rather than simply reduce it in order to be useful. This couldn’t be further from the truth.

Think about it: if you could reduce uncertainty on an investment representing weeks or months of work by 10% for just a few hours of work, wouldn’t that be worth it?

Expect high margin of error in some cases. The insights you get and the fact that you are looking at relative comparisons still make even relatively high error-margin data super valuable. Remember, if it helps you reduce uncertainty for a relatively low investment – it’s a no-brainer!

Project forecasting is a good example of this. My objective with a project forecast is not to be 90% right – it’s to avoid being 100% wrong. a 20-25% error is totally acceptable.

Final Thoughts

Using analytics in your job can be a major boost to your effectiveness as an engineering manager. You will likely find many more great use cases specific to your team or organization.

So go out and see what data your organization already has. See what you can do with it. If you haven’t hooked up your SaaS tools to an ETL tool for data extraction, start that initiative in your company. Finally, see what team-specific metrics you are missing and might help you drive towards the outcomes you want for your team.

So what are you waiting for? Start poking around your data warehouse and get some answers!

Leave a comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.