Will data science become automated?

by Dominik Matula, Data scientist at Profinit

Full disclosure: the author works as a data scientist and loves his job. So, once robots take over the field of data science, he might lose that job and spend his last days depressed. Or will he?

Let’s not beat about the bush: looking back in time, almost every profitable human activity has been automated. We have self-driving cars. We don’t bake our daily bread – we buy it from mass producers; milk and eggs are mined from animals in huge industrial halls (OK, this one is ethically problematic, but you get my point). Machines started replacing workers hundreds of years ago, and this process is still ongoing.

Data are the so-called ‘oil of the 21st century’, an essential ingredient to boost every business you can imagine. However, to run your data-driven decision machines, you need to distil information from the data in the first place. The demand for an automated data refinery (one with a much, much larger capacity than human-based approaches) is overwhelming. As a result, the question shouldn’t really be ‘WHETHER’ but ‘WHEN’ data science will be automated.

That sounds quite depressing (especially given that I’m a data scientist). We should take justice (and some torches) into our own hands and destroy those evil machines while we still can.

Well, I don’t think the future will be bad. Quite the opposite! The automation of data science will bring at least these three amazing changes:

1. Democratisation of data science

This process is already happening (see, e.g., this article on the MIT blog). It will no longer be necessary to be a full-time data scientist to gain insights from your data. Anyone will be able to do that! Domain experts in particular will have powerful tools in their hands.

The second aspect of this point is even more promising for data scientists: once a large part of society faces data-related tasks on a daily basis, the pressure to improve data availability and quality will grow. This will lead to even more extensive use of data-based techniques in various fields.

2. Ease of data preparation

Have you ever worked on a data science project (btw, our time-proven standard at Profinit is CRISP-DM)? Then you have probably spent a lot of time on data preparation. According to some studies, data scientists spend up to 80 % of their time on data preparation instead of working on the actual business problem. Since machines could easily take on this effort, let them have the job…

3. Better decisions based on real data

This point is maybe a little utopian, but I still envisage a better society based on real data. One that is not so dependent on a single leader’s opinion, or so easily misguided by mindless emotional responses. A great example of this kind of data science application is so-called data journalism, which is currently gaining momentum all over the world (Cluemaker, one of our products, actually helps data journalists to reveal, for example, malicious links between politicians and lobbyists).

You can probably think of other benefits that data science automation will bring (please write them in the comments – I’d love to hear them!). To me, as a data scientist, the most promising one seems to be the following: once data science becomes more mainstream and is widely applied by both machines and humans, its boundaries will expand further and further. And I’m really looking forward to being a part of that!