Data, Prediction and Culture

Peter Sueref

Data Science Director, Centrica

Globally, we collect more data than ever before. We have collected billions of data points on areas as disparate as the stock market, elections, and viruses. And yet we have been unable to predict with any reliability either the likelihood of major events in these areas or the severity of their impact on our lives. At the same time, we now live in an age of instant communications, driverless cars and chatbots barely distinguishable from humans.

William Gibson famously said, “The future is already here, it’s just not very evenly distributed.” Our ability to understand trends and outliers has not kept pace with the progress we make year-on-year in science and technology. And this has pervasive effects in the workplace and in our lives.

Predicting the future

So, what will the next ten years bring? It might be that we see full home automation, wide-scale driverless cars, microgrids, digital currency, applied quantum computing, general AI, and nanotechnology among others. Many of these are already here, some in their infancy. Others are less likely to happen. But could you say for certain which? And what are the impacts of these events on the economy, on our social systems, on our lives?

Working in data science, then, presents a strange dichotomy. Expectations from our customers and colleagues are high, having seen the accelerating technological changes of the last couple of decades. And yet, we are asked to build models to predict an uncertain future, to make sense of a random world. How do we manage the contrast between these competing factors?

This problem of expectation versus reality only scratches the surface. The deeper concern that permeates our lives is how far removed we are from understanding the technology and the data presented all around us.

Understanding the odds

COVID-19 is a case in point: each day we are treated to statistics, graphs and facts. Many people will have seen log-scale graphs for the first time when comparing cases across countries (a log-scale chart is one where an axis increases by a constant multiple at each step rather than by a constant amount, for example 1, 10, 100, 1000…). Others will have been asked to understand second-order derivatives when looking at the speed at which the death rate is falling or rising. On top of that, we are expected to understand the issues concerning masks, two-metre versus one-metre distancing, the efficacy of vitamin supplements, and whether obesity or race is a factor in the severity of symptoms. Each of these is debated by experts, analysed by pundits, and put into action entirely differently across countries. What do we do? Give up on statistics and prediction entirely? Throw our hands up in the air and shout, “What’s the point?”
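For readers who find the two ideas above easier to follow in code, here is a minimal Python sketch. The numbers are invented for illustration and are not real COVID-19 figures; the names `cases`, `cumulative_deaths` and `differences` are hypothetical. It shows why doubling looks like a straight line on a log scale, and how a "second-order derivative" tells you whether a rate is speeding up or slowing down.

```python
import math

# Hypothetical case counts that double every day (illustrative, not real data).
cases = [10 * 2 ** day for day in range(8)]

# On a log scale, equal ratios become equal steps: each doubling adds log10(2).
log_cases = [math.log10(c) for c in cases]
log_steps = [b - a for a, b in zip(log_cases, log_cases[1:])]


def differences(values):
    """Return the change between consecutive values (a discrete derivative)."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical cumulative deaths: the total keeps rising, but more slowly each day.
cumulative_deaths = [100, 140, 170, 190, 200, 205]
daily_deaths = differences(cumulative_deaths)  # first difference: the death rate
rate_change = differences(daily_deaths)        # second difference: how fast the rate moves

print(log_steps)     # every step is close to log10(2), about 0.301
print(daily_deaths)  # still positive, so the total is still rising
print(rate_change)   # negative, so the rate itself is falling
```

In this made-up series the total never goes down, yet every second difference is negative, which is the situation a headline describes as "the death rate is falling".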

Clearly not. Perhaps I was disingenuous above when suggesting we can’t reliably predict anything. After all, the odds at a bookmaker just reflect the market (at 1000-1, the 1 is still going to happen roughly once in every thousand times). Polling gives a percentage estimate that a candidate will be elected, and a 48% chance of Brexit happening or Trump being elected isn’t zero. Some hedge funds actually beat the market consistently (although these are rare).
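The arithmetic behind long odds can be sketched in a few lines of Python. This is a simplified illustration, assuming the quoted odds are a fair reflection of probability (real bookmaker odds include a margin) and that the events are independent; the function name `implied_probability` is hypothetical.

```python
from fractions import Fraction


def implied_probability(against: int, in_favour: int = 1) -> Fraction:
    """Convert fractional odds such as 1000-1 into an implied probability."""
    return Fraction(in_favour, against + in_favour)


# Odds of 1000-1 mean one winning outcome for every 1000 losing ones.
p = implied_probability(1000)

# Chance that at least one of 1000 independent 1000-1 long shots comes in.
at_least_one = 1 - float(1 - p) ** 1000

print(p)                       # 1/1001
print(round(at_least_one, 2))  # 0.63
```

So a single 1000-1 outcome is very unlikely, but across a thousand such bets it is more likely than not that at least one of them happens.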

So, what can help us in navigating the world of poor predictions and high expectations?

Education - This has been brought home to me now that I am the primary teacher to three young children while lockdown continues. And while I think education in basic statistics and critical thinking is valuable for school children, it is also essential for adults. One of the things I’m most proud of at Centrica is that the Data Science Team I run has produced courses to teach data science fundamentals and statistics.

Honesty - The Data Science Team also has a monthly show-and-tell of projects we’ve been working on. Importantly, we try to show all the projects we work on at every stage of development, including those that never make it - because they ran out of funding or just didn’t work. Demystifying the process invites more people in. Failure is normal and accepted, particularly when it’s learnt from. The biggest benefit, though, is cultural - it helps set expectations, removes the veil, and rather than encouraging our customers and colleagues to ask for flying cars, encourages reasoned inquiry, intelligent questioning and potential products that are realistic and impactful.

Collaboration - Working in a diverse group, whether that’s in terms of politics, race, gender or sexual orientation, gives a team superpowers. Diverse teams allow us to create things that work across the spectrum, let us test our assumptions safely and cheaply, and prevent the echo chambers that lead to stagnation. A mix of opinions and beliefs is also stronger than the individual in prediction, estimation and thinking. But as well as helping us cognitively, collaboration and diversity let us cope with whatever the future throws at us, predicted or not. In a business context, collaboration means working across departments and breaking down barriers. The Data Science Team was amongst the first in Centrica to adopt agile squads, working across functions with colleagues across the group.

Care - You don’t need a business case for kindness. Kindness to our colleagues and customers will ultimately help us far more than flying cars or predicting the future.