Statistics, Certainty and Crystal Balls
So this year has flown by at an alarming pace, and looking back I realise that I haven't written a blog since August. As it's New Year's Eve I thought it would be an opportune time to look back at 2018, and reflect on the year with a vaguely analytical eye.
But, as I started to write this, I realised that it is very difficult to be analytical with life. Even though we come up with more and more ways of analysing, presenting and visualising data - reflecting the year in data does not do it any justice. Our years have resulted in a mass of big data, some of which we have attempted to process and some of which has passed us by. Individually, we have experienced so much, our brains (more amazing than any computer) have taken and assimilated millions of pieces of information. We have assessed the quality of our data both quantitatively and qualitatively, some of which has led us to discard, clarify or recollect data (all without any documented SQL). We have performed data segmentation, without realising it, to decide who (and what) to spend time and effort on. We have built predictive models in our heads (without needing R, SAS, SPSS or others), and we have extrapolated data to make decisions about our future (without a regression model in sight). Our analytical capabilities are still more amazing than any AI or machine learning. So this is my first bit of learning for 2018 - our minds and capabilities are amazing whether we 'do analysis' or not. We have all built our own amazing predictive models based on the big data that we have collected throughout our lives.
Of course, when I look back at the year there are a number of things that have caught me by surprise, that I could never have predicted. Firstly, there are the people that have been lost from my life this year. Some of these have been unexpected deaths, but others have been relationships that have ended both in terms of friendships and work. The vast majority of these could not have been predicted in any way at the beginning of the year. No statistical model, no data science, no process control could have told me that people I loved, respected, needed would not be here at the end of 2018. And the predictive model that I have built suffers from them having been removed from it. So my second bit of learning for 2018 is that you can't always predict the future, no matter how good you are with data or forecasting. And this means that you have to treasure every moment you have with those you love.
And I have learned other things about people too. To be fair, the data at the beginning of the year suggested that 2018 would be the year that I gave up my national director role to try other things. Partly this was due to ongoing and worsening health issues that meant a long commute, long hours and a lot of stress were too difficult to continue. This needed no statistical model; it was simple common sense, although very difficult to do. Giving up a large salary is never easy, and my forecasting capabilities were unable to predict when (or if) I would get another role. What did surprise me though was how friends, family and work colleagues reacted to, and dealt with, my health issues and eventual resignation from my role. Those I should have received help and support from, on the whole, let me down, which made the whole situation more difficult than it needed to be. But then there were others who were so kind and considerate when they really didn't need to be (or I certainly didn't expect them to be). My internal predictive model could, and to an extent did, predict those who let me down but was completely floored by those who showed kindness. So my third bit of learning for 2018 is that people will always surprise you. There will always be those who let you down when you least expect it but there will also be wonderful moments of kindness where you see people in a totally different light.
And finally there has been the attempt to set up a freelance training and consultancy service that builds on the training I have delivered over the last 20 years. For those of you who have seen my posts on LinkedIn, this has been a slow process and the amount of work that I have managed to acquire so far is minimal. But other small pieces of work have seen me through this year and have enabled me to spend time on my website, to learn new skills, to write some blogs and to develop some new courses. Again, I have been surprised at the support that I have received from various quarters, including old work colleagues (I would call them good friends - they know who they are). And again, others that I thought I could rely on have let me down - see my third point.
All of this has made me think about how, as analysts (statisticians, data scientists etc.), we like to talk in definites. The rate of such a disease rose this year, the amount of this thing decreased this year, etc. But as with my experience of 2018, we are usually only looking at a small part of the story. We have:
Bias in our data
Variable data quality
Quantitative data (often without qualitative context)
Subsets or samples of data
Limited granularity of data
Confounding factors that we are currently unaware of
We wrap all of this up in nice interactive visualisations, but the truth is that we have limited certainty in our data. When I teach my students I like to re-brand the practice of statistics as 'the science of uncertainty'. We forget that just about everything we produce is reported at around 95% confidence, which means there is roughly a 1 in 20 chance that we are wrong. Our statistical models, our forecasting, will be wrong about five percent of the time even if we have addressed data quality, bias, representativeness, confounding etc., and much more of the time if we haven't. So my fourth point of learning (or at least reminding) for 2018 is that nothing is certain. Data and analysis can never give you certainty and we need to keep that in mind.
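For anyone who wants to see the '1 in 20' idea in action, here is a small illustrative sketch in Python. It is only a toy under stated assumptions: the data are simulated from a normal distribution whose spread we pretend to know, and the function name and all the numbers are made up for the example. It draws many samples, builds a 95% interval around each sample mean, and counts how often the interval actually contains the true value.

```python
import random
from statistics import NormalDist


def interval_coverage(true_mean=10.0, sigma=2.0, n=50,
                      trials=10_000, level=0.95, seed=2018):
    """Share of simulated confidence intervals that contain the true mean.

    Assumptions (illustrative only): normally distributed data, a known
    standard deviation `sigma`, and no bias or data quality problems.
    """
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials


coverage = interval_coverage()
print(f"Intervals containing the true mean: {coverage:.1%}")
print(f"Intervals that missed it: {1 - coverage:.1%}")
```

Even in this idealised world, with clean and unbiased data, about one interval in twenty misses the truth. Real data, with all the problems listed above, will usually do worse.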
So, unfortunately, there is no such thing as a crystal ball. Statistics and analysis can help with looking at how probable things are, but they are never perfect and can never predict the unexpected. But realising this helps us to build our model. The realisation that some things cannot be predicted makes our model more realistic. So my last piece of learning for 2018 - it's never too late to learn. Our learning is, in itself, extra data that we can put into our own predictive model.
As we head into 2019, we usually set New Year's resolutions. Most of these are broken very quickly. Life is complex, it can't be modelled, there are no crystal balls. But to help us deal with this we need to keep on learning. So let's make a useful New Year's resolution this year - to keep on learning...
To all my friends, family, colleagues and all who have supported me this year - have a fantastic 2019.