Four Trends in Analytics from the 2018 Strata Data Conference
It’s an exciting time to be a data and analytics geek. I got to spend the week with thousands of other data professionals at the 2018 Strata Data Conference in New York City – data engineers and architects, machine learning engineers, data team leads, and academic researchers. Several of the expert speakers reflected on presentations they’d delivered 5 years ago predicting an exciting future where analytics and AI would be applied to real-world problems at scale. That future has arrived, and the buzz is now around how to best store, process, and analyze data to efficiently deliver insights that drive decisions. Below, we’ve highlighted some of the key themes from the conference on the practical application of advanced analytics.
1. Accountability in Artificial Intelligence
One consistent theme from the 2018 Strata Data Conference was the impact of using AI to automate decision-making. At its best, machine learning and AI will enrich the human experience by enabling better and faster decision-making, but with faster decision-making comes the need to safeguard against bad decisions going rogue. The phrase “garbage in, garbage out” is well known, but a newer caution is that machine learning can make bad decisions just as efficiently as good ones.
Google’s Chief Decision Officer, Cassie Kozyrkov, talked about accountability during her keynote address. Kozyrkov is optimistic about AI and the future but asserts there needs to be strong leadership in AI, as the ultimate responsibility for decisions made by AI systems still falls to the people in charge of those tools and systems. Technology, especially machine learning, amplifies our decisions to a massive scale. A single person makes many decisions each day, but computers empowered to automate decisions can make millions of decisions in the blink of an eye, and that acceleration of decisions comes with new responsibilities and the need for accountability in AI.
2. Democratizing Data Science
Self-service capabilities are now table stakes for good business intelligence (BI) tools, and beyond that, data science self-service is becoming more available, too. Our team at Clarkston believes in tools like RapidMiner that are democratizing data science and enabling multiple levels of an organization to quickly build out data science algorithms without requiring specialized expertise in specific coding languages.
There are even tools on the market, such as DataRobot, that can build out 50 machine learning models for you based on the dataset you feed in and provide you with the best-performing models. These tools are extremely helpful for prototyping and as a starting point for production solutions, but we caution you to build in safeguards to ensure that the decisions and actions you’re automating are correct, unbiased, and reliable. Beyond these tools, organizations will need to start thinking about how they govern AI, especially as it becomes more widely available and used across functions.
Part of that accountability and governance means focusing on some of the ‘unsexy’ parts of data science, like monitoring and maintaining your AI solutions. A good monitoring system is crucial not only for tracking your model’s performance to assess drift, the change in prediction accuracy over time, but also for auditing how your models are being changed, how they are being used, and what data is driving them.
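As a minimal illustration of what that kind of monitoring can look like, the Python sketch below compares a model’s recent accuracy against the accuracy measured at deployment time and flags potential drift. The names (check_for_drift, baseline_accuracy, drift_threshold) and the simple accuracy metric are illustrative assumptions, not a reference to any specific tool.

```python
from statistics import mean

def check_for_drift(recent_outcomes, baseline_accuracy, drift_threshold=0.05):
    """Return (drifted, recent_accuracy).

    recent_outcomes: list of booleans, True when the model's prediction
    matched the observed outcome once ground truth became available.
    """
    # Accuracy over the most recent batch of labeled predictions
    recent_accuracy = mean(1.0 if correct else 0.0 for correct in recent_outcomes)
    # Flag drift when accuracy has fallen more than the allowed threshold
    drifted = (baseline_accuracy - recent_accuracy) > drift_threshold
    return drifted, recent_accuracy

# Example: accuracy was 92% at deployment; the latest labeled batch is only
# 84% correct, so the check flags potential drift for review.
drifted, acc = check_for_drift([True] * 84 + [False] * 16, baseline_accuracy=0.92)
if drifted:
    print(f"Potential model drift: recent accuracy {acc:.2%} is below baseline")
```

In practice, a check like this would run on a schedule alongside an audit log of model versions, inputs, and users, so drift and unexpected usage both surface quickly.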
3. Enriching the Human Decision-Making Experience
As we increase our ability to automate decisions through the adoption of advanced analytics, it’s important to preserve the human element. According to Google’s Kozyrkov, most organizations aren’t really ‘data-driven’ but more ‘data-inspired.’ This is partly because, as humans, we tend to use data to confirm choices we have already made. Since we can’t go back in time, we need to embed AI and statistics in core education curricula, and we need to engage and educate our leaders on basic decision-making frameworks that fit this new paradigm. When designing analytics projects, Clarkston’s data + analytics team stresses the importance of engaging both executives and those directly affected by AI systems to demystify and remove the ‘magic’ behind them. This will help them better understand and interact with the solution.
4. Output Design in AI
Output design for AI is another consideration to enrich the user decision-making process. Jacob Ward, a science and technology journalist and former editor-in-chief of Popular Science, supported this in his keynote with the assertion that humans can really only process five probabilities: 0% and 100%, 1% and 99%, and 50%. Avoid presenting precise probability metrics and instead default to the action that should be taken. Amber Case, author of “Calm Technology: Design for the Next Generation of Devices,” introduced the idea of alert fatigue. In augmenting human decision-making with technology, there is a fine line between presenting a metric as too low a priority, so the problem gets ignored, and presenting it as too high a risk, so non-emergency issues are treated as crises.
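One way to picture this output-design principle is the small sketch below, which maps a model’s raw risk score to a recommended action rather than surfacing the probability itself. The threshold values and action labels are hypothetical assumptions, chosen only to show how the “act now” message can be reserved for genuinely high-risk cases and alert fatigue kept in check.

```python
def recommend_action(risk_score: float) -> str:
    """Translate a model's risk score (0.0-1.0) into a plain-language action."""
    if risk_score >= 0.90:
        return "Act now: escalate to the on-call team"
    if risk_score >= 0.60:
        return "Review today: add to the analyst work queue"
    return "No action needed"

# The user sees a recommendation, not "0.73 probability of failure"
print(recommend_action(0.73))  # Review today: add to the analyst work queue
```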
It’s important not to think of AI and machine learning as magic. It’s critical to have the right teams, processes, and tools in place to build data science solutions through rigorous testing, and to monitor and audit them so you know what’s going on behind the curtain or inside that black box. Getting a model to production is not the endpoint of a data science effort; in many ways, it’s just the beginning. With a major rush of companies looking to reap the benefits of data science and AI, we need to ask ourselves what safeguards we have in place to steer decision automation toward better outcomes.
Each of these key trends was discussed by some of the leading minds in data science and data analytics at the 2018 Strata Data Conference in New York. If you are interested in continuing the discussion with the Clarkston Data + Analytics team, please subscribe to our newsletter.