Take a look inside the 2020 analytics trends with Clarkston. Trends in advanced analytics continue to dominate strategy discussions as businesses seek new ways to cut costs and gain efficiencies by leveraging the power of data. Looking ahead to 2020, it’s critical to first recognize how the analytics field has developed in order to understand where it is heading next.
Over the course of 2019, developments in analytics converged into three themes: insights-focused experimentation with advanced analytics, a re-platforming of data architecture, and the development of data science teams or centers of excellence (COEs).
Building on the trends that defined 2019, new themes emerge for 2020. As businesses continue to evolve their analytics maturity, analytics teams and COEs will grow in influence and scale. Adoption of tools that automate workflows throughout the data science lifecycle will continue to grow. As more solutions move into production, there will also be a greater emphasis on the explainability and auditability of models. Finally, a few areas of data science hold mostly untapped potential, including machine learning (ML) time series forecasting, organizational analytics, and robotic process automation (RPA).
Auditability of Models:
Greater adoption of automated decisions needs to be accompanied by thorough documentation. Whether getting out ahead of regulatory bodies or dealing with a lawsuit, analytics teams should be prepared to explain how decisions were made. This comes down to two factors: explainability and documentation.
Explainability refers to the ability of humans to trace through an algorithm to determine exactly where a prediction came from. This can be difficult even for simple algorithms if enough data is involved, and certain algorithms act as “black boxes” for which it can be nearly impossible to interpret how a decision was made. Black-box algorithms often return more accurate predictions, but organizations need to weigh that gain against the loss of explainability for their use case. When two candidate models reach similar accuracy, it is often better in business applications to use the simpler, more explainable one. As for documentation, organizations in regulated industries will need a higher level of documented validation, but it is best practice across industries to have processes in place to conduct a technical review of models before they go live and to document workflows and predictions for reference.
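To make the tradeoff and the documentation step concrete, here is a minimal Python sketch. It assumes each candidate model has already been scored, and it uses a hand-assigned `complexity` rank as a rough stand-in for explainability. The function names, the tolerance value, the sample accuracies, and the record fields are all hypothetical and are not taken from the report.

```python
import json
from datetime import datetime, timezone


def pick_model(candidates, tolerance=0.01):
    """Return the least complex candidate whose accuracy is within
    `tolerance` of the best accuracy among all candidates."""
    best = max(c["accuracy"] for c in candidates)
    close_enough = [c for c in candidates if best - c["accuracy"] <= tolerance]
    # Lower complexity rank = easier to explain (hypothetical, hand-assigned).
    return min(close_enough, key=lambda c: c["complexity"])


def audit_record(model_name, model_version, features, prediction):
    """Build one JSON line that documents a single automated decision."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "model": model_name,
            "version": model_version,
            "features": features,
            "prediction": prediction,
        },
        sort_keys=True,
    )


# Illustrative numbers only.
candidates = [
    {"name": "logistic_regression", "accuracy": 0.912, "complexity": 1},
    {"name": "gradient_boosting", "accuracy": 0.918, "complexity": 3},
]

chosen = pick_model(candidates, tolerance=0.01)
record = audit_record(chosen["name"], "1.0.0", {"order_value": 120.5}, "approve")

print(chosen["name"])
print(record)
```

With a tolerance of one percentage point the simpler model is selected, and tightening the tolerance hands the decision back to the more accurate one. In practice the records would be appended to durable storage so that individual predictions can be traced later.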
Growth of Analytics Teams and COEs:
As more organizations get underway with their analytics strategies and realize that data science really is a team sport, they’ll understand the need for expertise and input from various roles in the organization. Once a COE is established and generating value, the next goal becomes enterprise education and scaling. This involves getting the right tools into the hands of the right people, which sometimes means setting up processes or front-end user interfaces so that certain roles need not know the underlying tool at all.
Educational sessions will also be critical, and they are most effective when built around contextual use cases. An analytics roll-out will thrive if everyone can see the benefit to their own day-to-day work. Not all use cases need to be predictive modeling; is there a way to automate a weekly task to save time and headaches, or to save Excel from crashing? Focus on delivering value and the rest will come.
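As one illustration of a non-predictive use case, the sketch below totals a weekly export by region using only the Python standard library. It reads one row at a time, so the file never has to be opened in a spreadsheet. The column names `region` and `amount`, the function name, and the sample data are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def summarize_sales(csv_file):
    """Sum the `amount` column per `region`, reading one row at a time."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row["region"]] += float(row["amount"])
    return dict(totals)


# In-memory stand-in for a large weekly export. A real run would pass a
# file handle instead, e.g. open("weekly_export.csv", newline="").
sample = io.StringIO("region,amount\nEast,100.0\nWest,250.5\nEast,49.5\n")

weekly_totals = summarize_sales(sample)
print(weekly_totals)
```

Scheduling a script like this to run each week is the kind of small, visible win that helps a roll-out gain traction.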
Automation of Workflows:
Automation can have an intimidating reputation at first, but once adopted, it quickly becomes something teams rely on heavily. Each part of the data science project lifecycle can be automated, and more tools and platforms are striving to make that an easy part of the process. The task of automating data pipelines, which get data from the source, integrate it across systems, and make it ready for processing by analysts, often falls to the data engineering team and has its own suite of tools for doing so efficiently.
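The shape of such a pipeline can be sketched in a few lines of Python. The two extract functions below are hypothetical stand-ins for queries against separate source systems, and the field names are invented for illustration. A real pipeline would typically rely on a dedicated orchestration or integration tool rather than hand-written functions.

```python
def extract_orders():
    # Hypothetical stand-in for a query against an order system.
    return [
        {"customer_id": 1, "total": 40.0},
        {"customer_id": 2, "total": 15.5},
        {"customer_id": 9, "total": 8.0},
    ]


def extract_customers():
    # Hypothetical stand-in for a query against a customer system.
    return [
        {"customer_id": 1, "segment": "retail"},
        {"customer_id": 2, "segment": "wholesale"},
    ]


def integrate(orders, customers):
    """Attach each order's customer segment, flagging unmatched customers."""
    segment_by_id = {c["customer_id"]: c["segment"] for c in customers}
    return [
        {**order, "segment": segment_by_id.get(order["customer_id"], "unknown")}
        for order in orders
    ]


def run_pipeline():
    """Extract from both sources, integrate, and return analysis-ready rows."""
    orders = extract_orders()
    customers = extract_customers()
    return integrate(orders, customers)


analysis_ready = run_pipeline()
for row in analysis_ready:
    print(row)
```

Marking unmatched records as `unknown` instead of dropping them keeps data quality gaps visible to the analysts downstream.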
Once the data is available, the processes to profile and understand your data, visualize it, and then model it can each be automated as well. There are tools and components of commonly used platforms that will automatically show you histograms of each of your data distributions, let you know the number of missing values, and provide other helpful flags that let you initially analyze your data. The same is true on the modeling side once the data is prepared: you can select a handful of algorithms to test on the data to quickly prototype. These automation tools save time spent on repeated work and are a favored item in the data scientist’s tool belt.
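The profiling step described above can be approximated with a short, standard-library-only Python sketch. The `profile_column` function, its equal-width binning, and the sample values are assumptions for illustration. Commercial and open-source profiling tools report far more than this.

```python
from collections import Counter


def profile_column(values, bins=4):
    """Count missing values and build an equal-width histogram of the rest."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    if not present:
        return {"missing": missing, "histogram": []}
    lo, hi = min(present), max(present)
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / bins or 1.0
    # The maximum value is clamped into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in present)
    histogram = [
        (lo + i * width, lo + (i + 1) * width, counts.get(i, 0))
        for i in range(bins)
    ]
    return {"missing": missing, "min": lo, "max": hi, "histogram": histogram}


# Illustrative data: None marks a missing value.
ages = [23, 35, None, 41, 29, None, 52, 38]
profile = profile_column(ages, bins=3)

print("missing values:", profile["missing"])
for low_edge, high_edge, count in profile["histogram"]:
    print(f"{low_edge:6.1f} to {high_edge:6.1f} | {'#' * count}")
```

Running the same function over every column gives a quick first look at distributions and gaps before any modeling begins.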
To read the full 2020 Analytics Trends Report, download below.