Exploratory Analytics: Getting Started
Exploratory analytics leverages historical data to answer questions and uncover trends and patterns through visualization, data and feature engineering, test execution, and other techniques. We regularly hear from companies that want to start ‘doing analytics’ and are looking to create a competitive advantage or capture the benefit of new capabilities in the space. What analytics means still varies from company to company, depending heavily on their maturity.
Organizations seeking to do predictive and prescriptive analytics may want to jump straight into those use cases without considering the importance of a foundation in exploratory analytics. That may work for a few isolated use cases, but to create a sustainable analytics function throughout an organization with a pipeline of value-added projects, it’s crucial to have analysts and processes to support exploratory analytics.
Organizations at all levels of analytics maturity will benefit from exploratory analytics. For those lower in maturity, it will help shift time from building (often lengthy) reports to the more valuable activity of providing answers back to the business and enabling self-service to business analysts so they can dig deeper into their own questions.
Exploratory analytics can build out the pipeline of advanced analytics use cases and can be important in building experimental models and proof-of-value (POV) projects. Exploratory analytics can stand alone as its own project, or feed into others. Additionally, descriptive and exploratory analytics are just as important in a fully mature analytics organization as they are for those at the early stages. Analyzing what happened and why will always play a part in operations and strategy. As your analytics function matures, exploratory analytics will improve and mature as well with more advanced techniques and federated skills amongst the analyst group.
In this piece, we’ll cover some considerations for evolving your capabilities to capitalize on exploratory analytics.
Getting Started with Exploratory Analytics
Exploratory analytics can be aimless if you don’t plan well and take a strategic approach. Working closely with the business to drive toward valuable answers and creating an analysis plan are guardrails to avoid an endless cycle of questions.
Starting with an Analysis Plan
Due to the exploratory nature of this work, you may want to dive right in and start investigating the data. However, to be efficient and to enable collaboration with other team members, it’s best to start by developing an analysis plan. This should outline the questions you’d like to answer using the data and should be broken into small enough chunks that tasks can be distributed amongst the team. Holding a workshop with a cross-functional team of executives, analysts, and data scientists can help uncover questions to go after that will create value when answered. An agile approach is helpful to reprioritize and add new tasks as you uncover insights and your direction and questions evolve. An agile methodology can also be helpful to manage the project and communicate with the business to show which tasks you’ve completed and the remaining direction. To create the analysis plan, work with the business analysts to start with questions about the challenge and form hypotheses to investigate.
Scaling
It’s best to divide the problem into logical chunks. The business rules that apply to one part of the data may not hold true across the board; a typical example of this is splitting data geographically into regions, states, or zip codes. You’ll likely want to examine the macro effects across the data set first, then dig in to explore various subsets. For example, examine the nationwide sales trend for a particular product, then compare each state’s trend line against it to categorize the over- and under-performers over time. Data volume can be an issue when dealing with transaction-level data that produces millions of rows. Scale the data back by starting with a subset to investigate your questions, then take the more refined question list to the rest of the data.
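As a minimal sketch of the nationwide-versus-state comparison described above, the snippet below fits a simple least-squares trend to each series and labels each state relative to the national benchmark. The sales figures, state codes, and function names are hypothetical, and scaling the slope by the series mean is just one reasonable way to compare states of different sizes.

```python
def slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

def relative_growth(values):
    """Slope as a fraction of the series mean, so large and small states compare fairly."""
    return slope(values) / (sum(values) / len(values))

# Hypothetical monthly unit sales for one product, by state (made-up numbers).
sales_by_state = {
    "NC": [100, 110, 120, 130],
    "GA": [200, 200, 200, 200],
    "TX": [300, 295, 290, 285],
}

# Macro view: the national series is the month-by-month sum across states.
national = [sum(month) for month in zip(*sales_by_state.values())]
national_growth = relative_growth(national)

# Subset view: label each state against the national benchmark.
labels = {
    state: "over" if relative_growth(series) > national_growth else "under"
    for state, series in sales_by_state.items()
}

for state, label in labels.items():
    print(f"{state}: {label}-performing relative to the national trend")
```

The same pattern works on a sampled subset of transactions first, before rerunning it on the full data once the questions are refined.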
Collaboration
Analytics is a team sport. There should be a close relationship between the analyst and the functional owner, and regular checkpoints with the decision-making team. Collaborate with the functional owner to create questions that, when answered, will have an impact on a business challenge. Work together to outline the ROI of solving each challenge and the actions you will take as an organization once that ROI is proven. Set up a cadence that allows the analyst to ask questions about what they’re seeing in the data, identify constraints and scale, and confirm and document any assumptions. Analytics in a vacuum, though interesting for the analyst, is rarely aligned with what’s valuable to the business.
Engineering
Data engineering is a new function to many organizations, but it is growing in importance as companies rely on quick ingestion, combination, transformation, and availability of data.
Granularity
There are often varying levels of granularity inherent in combining multiple data sources. You may have transactional data with time stamps, manufacturing historical system data at 1-second intervals, and daily inventory reconciliation files that you want to combine and understand together. There are two major considerations in deciding which granularity to use: how quickly can your data engineering group ingest and process data, and at what rate will the business realistically make decisions on the outputs? Talk with the business to understand the needs. We can analyze the equipment performance each second, but how often will the business look at and act on that data? If the answer is once an hour, then you can analyze and present the data at the hour level.
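To illustrate rolling fine-grained data up to the level at which decisions are made, here is a minimal sketch that averages per-second readings into hourly values. The sensor feed, its values, and the `hourly_average` helper are hypothetical; averaging is an assumption, and a sum, maximum, or last value may suit other measures better.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

def hourly_average(readings):
    """Roll (timestamp, value) pairs up to one average per clock hour."""
    buckets = defaultdict(list)
    for ts, value in readings:
        # Truncate each timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}

# Hypothetical equipment feed: one reading per second for two hours.
start = datetime(2024, 1, 1, 8, 0, 0)
readings = [
    (start + timedelta(seconds=i), 50.0 if i < 3600 else 70.0)
    for i in range(7200)
]

hourly = hourly_average(readings)
for hour, avg in hourly.items():
    print(hour.isoformat(), avg)
```

Here 7,200 rows collapse to two, which is the level the business in the example above would actually act on.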
Forecasting is an interesting example because you need to determine both the rate at which decisions will be made with the forecasted values and how far out to build the forecast. We worked with a client to forecast call volume to a customer service center and decided to build two separate forecasts: a daily forecast that was used each week in planning staffing and operations that predicted out 90 days, and a monthly forecast that was used quarterly for hiring and strategy that predicted out 24 months.
Feature Engineering
I consider this to be the most rewarding part of data science: creating new features or data that will fill in gaps with additional detail. This will be an iterative part of the project as you identify a new question through a strange occurrence in a visualization (a spike in sales in a certain month, for example), work with the business to understand what caused it, then add or create data to address that scenario. An example of this is taking a date and exploding it out into many date features: what day of the week it is, where it falls in the month or quarter, whether it falls on or right before a holiday, etc. There are likely many interesting patterns associated with the date field that are hard to uncover by using the date alone but become clear through feature engineering.
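The date example above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the `HOLIDAYS` set and the feature names are hypothetical, and a real project would load the holiday calendar that matters to the business.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical holiday calendar, for illustration only.
HOLIDAYS = {date(2024, 7, 4), date(2024, 12, 25)}

def date_features(d, holidays=HOLIDAYS):
    """Explode a single date into several features an analysis can group by."""
    return {
        "day_of_week": DAY_NAMES[d.weekday()],
        "is_weekend": d.weekday() >= 5,
        "day_of_month": d.day,
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,
        "is_holiday": d in holidays,
        "is_day_before_holiday": (d + timedelta(days=1)) in holidays,
    }

features = date_features(date(2024, 7, 3))
for name, value in features.items():
    print(f"{name}: {value}")
```

Each new column can then be used to group or color a visualization, which is often how a pattern such as a pre-holiday spike first becomes visible.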
Effective Communication and Business Engagement with Exploratory Analytics
A critical part of all analytics, and especially exploratory analytics, is communicating the value of the effort to the decision makers in the business.
Types of Visualization for Types of Data
Successful visualization of the data when presenting to the business can make or break the impact that your work has and the ability to build momentum for further analytics. Enable your analytics team to use effective visuals for the type of data or problem they’re communicating by creating templates or hosting knowledge sharing sessions to collaborate and get an outside perspective. There are best practices and easy-to-understand visualizations for different types of data, such as a histogram for exploring patterns within a single variable, a trend or time series chart for a single variable over time, and a scatter plot for relationships between two or more variables.
Frame the Problem in Business Terms
The business isn’t often interested in the mathematical acrobatics you took to arrive at the solution. They want to know what impact it will have on the business, so go ahead and translate it for your audience. Instead of showing how a clustering algorithm identified four segments of customers in the data, frame the discussion to ask how the business would like to engage each segment differently based on their key characteristics.
Running Tests
Our goal for every analytics project is to turn insights into actions. Exploratory analytics projects that result in a test are a major step in the right direction toward this goal.
The steps for an exploratory analytics project to result in an actionable test are:
- Create a hypothesis through the analysis plan. Work with the business early on to commit to the actions that will be taken if the hypothesis is proved and acceptance criteria are met.
- Vet the hypothesis in the data. Using exploratory analytics on all available and relevant data, confirm the hypothesis of interest and estimate the business impact in concrete numbers.
- Design test. Use statistical techniques to choose the correct sample size and format to test the estimated business impact. Be sure to include a control group to accurately measure impact. Confirm with leadership what actions will be taken once acceptance criteria are met.
- Run test. Coordinate with the operations teams and closely monitor the progress of the test over the decided upon time frame. Ensure adherence to the testing method and analyze results.
- If it meets acceptance criteria, change behavior at scale. Collaborate with the different business units and operations teams to scale the necessary changes across the organization. Continue to track progress against benchmark values to understand the full value and to identify any areas that require additional attention.
- If it fails – that’s ok. Investigate what happened. Is there missing data? Is there a constraint we didn’t consider or an improper assumption? Isolate the problem to estimate the work effort to course correct.
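The design and analysis steps above can be sketched with a standard two-proportion comparison. This is one common approach, not necessarily the technique a given project should use; the conversion rates, the 5% significance level, the 80% power, and the function names are all assumptions chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_group(p_control, p_test, alpha=0.05, power=0.80):
    """Approximate size of each group needed to detect p_test vs p_control
    with a two-sided two-proportion z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_test - p_control) ** 2)

def two_proportion_p_value(successes_control, n_control, successes_test, n_test):
    """Two-sided p-value comparing the test group's rate to the control group's."""
    p_c = successes_control / n_control
    p_t = successes_test / n_test
    pooled = (successes_control + successes_test) / (n_control + n_test)
    se = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_test))
    z = (p_t - p_c) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Design: hypothetical estimate from the exploratory work is a lift from 10% to 12%.
n_needed = sample_size_per_group(p_control=0.10, p_test=0.12)
print("Customers needed per group:", n_needed)

# Analysis: hypothetical results after the test has run with a control group.
p_value = two_proportion_p_value(100, 1000, 150, 1000)
print("p-value:", round(p_value, 4))
```

Agreeing on the significance level and the minimum lift worth acting on before the test runs is what turns the p-value into an acceptance criterion rather than a debate.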
Change Management
Analytics is powerful, but to many organizations it is new and unproven at the analyst level, the executive level, or both. This will lead to hesitation and anxiety around making changes, especially in cultures where “we’ve always done it that way” is a norm or there are team members with decades of experience using their gut. Proving the work through POV projects or tests will go a long way to build confidence through tangible results, but forming a change management strategy is crucial in truly federating an analytics-driven culture.
Exploratory analytics isn’t just about satisfying curiosity. It should be used to answer spot questions to drive operations with the correct self-service tools, but in a mature analytics environment it should also be used to create POVs or tests and drive the pipeline of advanced analytics use cases. Arriving at a real-world test is a big win for exploratory analytics, but so is answering questions or uncovering evidence for predictive models. Choosing the right tools and organizational setup to make data available for investigation will drive momentum and value-added work for your analysts.
Co-author and Contributions from Brandon Regnerus and Mike Onore.