Mitigating Bias in Analytics
The analytics playing field is growing rapidly, and companies are leveraging advancements in artificial intelligence (AI) and machine learning (ML) for a broad range of purposes. From targeted marketing campaigns and improved consumer interactions to optimized supply chains and logistics, AI and ML are changing how we work and live. There is now an intersection in which the principles behind robust analytics programs meet the principles for developing not only an inclusive and equitable workplace, but also inclusive products.
Just as we humans use what we see, hear, touch, taste, and smell as inputs to our decision-making, analytics programs rely on the data they are given. It is therefore imperative that these programs are inclusive and that companies take proactive measures to mitigate potential biases in their technologies and the areas in which systems can go wrong. To mitigate the unintended effects of bias in analytics, implement these eight strategies to build an inclusive analytics program.
1. Champion the value of diversity and mitigating unconscious bias across the organization
Before initiating the conversation of "how do we reduce the impact of implicit bias in our analytics?", your organization should start the conversation of "how do we reduce the impact of implicit bias at our company?" Your company's products or services are a reflection of your core values and corporate culture. Whether through diversity trainings, unconscious bias seminars, or other platforms, your employees' understanding and valuing of inclusion, equity, and diversity will drive greater awareness of how those same concepts affect their functional work.
2. Create a robust analytics strategy and roadmap
Having a firm analytics strategy and roadmap is key to activating your data. Ensure your organization has a robust understanding of the problem you are trying to solve first, then understand how that problem intersects with the data you already have and the data you wish to collect. Organizations often look to the data first and try to model a way to use it, inevitably losing opportunities to create value efficiently.
Once you have a grasp of your initial data and desired solution, ask yourself where biases can affect your model: in the data, in the solution, or in the processes in between.
3. Build a diverse research & development team
One key to reducing bias in analytics, especially bias related to legacy diversity (e.g., gender, age, culture, ethnicity), is to build a diverse team. Diverse teams reduce the negative effects of groupthink and confirmation bias. Create an organizational culture that embraces inclusivity and encourages curiosity, so teams can work through barriers when the data proves an outcome they weren't necessarily expecting.
Studies show that diverse teams focus more on facts, process those facts more carefully, and are more innovative, all of which can reduce confirmation bias. Don't oversimplify personalization: confirmation bias in code will cause solutions to prove what you already believe through data. Companies can also build community partnerships with diverse organizations such as Girls Who Code, Black Girls Code, Dev Color, and others as a pipeline program to enable greater diversity.
4. Create a diverse team of reviewers and a code review process that mitigates bias
A 2016 study of GitHub showed that women's code contributions are accepted more often than men's, but only when they aren't identified as women, highlighting unconscious bias in code review. Your code review team should be made up of diverse individuals: data scientists as well as people who were not part of the code development process. Mozilla has taken it a step further and replicated one of the best practices in recruiting by blinding the code review process so reviewers don't know whose code they are evaluating, reducing the potential for unconscious bias to affect the review.
After finalizing your solution, you should also conduct a blind test with the data. Applying a solution to a dataset its builders have never seen helps ensure that it works on a diverse set of data.
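As a rough illustration, the sketch below shows the mechanics of such a hold-out "blind" test using scikit-learn and a synthetic dataset; in practice the hold-out split would be carved off up front and kept away from the modeling team until review time.

```python
# A minimal sketch of a "blind" hold-out test. The dataset here is synthetic;
# in practice the hold-out split is withheld from the modeling team and
# applied only by independent reviewers.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=10, random_state=0)

# Split off the hold-out set before any modeling begins.
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The modeling team works only with the training split.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The blind test: reviewers score the finished model on data its builders never saw.
print("Hold-out accuracy:", accuracy_score(y_holdout, model.predict(X_holdout)))
```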
5. Reduce the potential of bias in your sample and training data
Whether it's artificial intelligence or machine learning, your products will only be as good as your data; remember, "garbage in, garbage out." Be sure to profile your data and understand its patterns and outliers; that is, ensure that the training data your analytics team is using isn't already biased.
MIT researcher Joy Buolamwini highlighted this issue in her TED Talk, "How I'm fighting bias in algorithms," after discovering that a facial recognition program failed to detect her face because its training data lacked a broader range of skin types and facial features. When looking at your training data, make sure you are thinking about dimensions of diversity such as ethnicity, gender, socioeconomic class, ability, political views, and religion.
Also consider the quality of the data and where it originates. For instance, data constructed through human intervention may be reviewed to reduce bias, while data that comes in passively, say from surveys or forms, may be more at risk.
At the end of the day, your sample group should reflect your target problem space. If you are developing a solution that affects multiple, diverse problem spaces, your training data should reflect that. If your data isn't inclusive enough to cover all potential real-life scenarios, it carries a higher risk of causing harm and producing false insights.
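One lightweight way to check this is to compare the demographic mix of your training sample against the population the solution will serve. The column name and reference shares in the sketch below are assumptions; the point is the comparison, not the specific numbers.

```python
# A sketch of a representation check: compare the demographic mix of the
# training sample to the target population. Column names and reference
# shares are hypothetical.
import pandas as pd

train = pd.read_csv("training_data.csv")  # hypothetical training sample
target_share = {"group_a": 0.45, "group_b": 0.40, "group_c": 0.15}  # assumed population mix

sample_share = train["demographic_group"].value_counts(normalize=True)

report = pd.DataFrame({"sample": sample_share, "target": pd.Series(target_share)})
report["gap"] = report["sample"] - report["target"]
print(report.sort_values("gap"))  # large negative gaps flag under-represented groups
```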
6. Build accountability around your analytics program
As mentioned in our 5 Drivers of Analytics Success in 2019, companies must be aware that machine learning can make bad decisions just as efficiently as good ones. Build a reputable program with strong leaders. At the end of the day, the outcomes of your analytics program, whether positive or negative, fall on your organization, your resources and tools, and your leadership. Building accountability internally is step one; go a level deeper by empowering your customers to help call out biases and train your AI/ML. This not only enhances your analytics but also builds greater trust with your customers.
By assigning accountability for the near- and long-term impacts of your analytics program to a product owner or team, you empower that person or department to ensure that unconscious bias in analytics is mitigated.
7. Break bad decision paths through iterative review sprints
The decisions your AI/ML model makes over time should be tracked and consistently monitored. Iterative sprints or reviews of your program's outputs can help you break bad decision paths before they cause undue negative effects. The systems you use to document this decision-making can, in turn, be leveraged as data for your analytics program.
As you develop a model, compare its outputs and decisions to real-life occurrences. For example, if your model is meant to predict consumer buying behavior, compare its predictions against a control group drawn from your population. Also look for sometimes subtle association biases: there's no harm in your analytics program equating "mother" with "female," but problems can arise if it equates "nurse" with "female," for example.
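If your solution uses learned representations such as word embeddings (an assumption; your program may encode associations differently), these association biases can be probed directly. The sketch below uses gensim's pretrained GloVe vectors, downloaded on first use; the word list is illustrative.

```python
# A sketch of checking learned gender associations in word embeddings,
# using gensim's pretrained GloVe vectors. Adapt the word pairs to the
# associations relevant to your own model.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # downloads on first use

def gender_gap(word):
    """Positive values mean the word sits closer to 'female' than to 'male'."""
    return vectors.similarity(word, "female") - vectors.similarity(word, "male")

for word in ["mother", "father", "nurse", "engineer", "doctor"]:
    print(f"{word:10s} gender association gap: {gender_gap(word):+.3f}")
```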
One mitigation step is to continuously monitor and manage the solution. If your solution drops below set confidence levels or thresholds, there should be a process for retraining your model. Just as humans are constantly learning and relearning, so should your analytics solutions.
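A minimal version of that monitoring loop can be as simple as tracking a rolling metric over recent decisions and flagging the model for retraining when it falls below the agreed threshold. The metric, window size, and threshold in the sketch below are placeholders.

```python
# A minimal monitoring sketch: track a rolling performance metric and flag
# the model for retraining when it drops below an agreed threshold.
# The threshold and window size are placeholder values.
from collections import deque

ACCURACY_THRESHOLD = 0.85   # agreed minimum quality level
WINDOW = 500                # number of recent, labeled decisions to evaluate

recent_outcomes = deque(maxlen=WINDOW)  # 1 = decision matched reality, 0 = it didn't

def record_outcome(prediction, actual):
    """Log each model decision against the real-life outcome once it is known."""
    recent_outcomes.append(1 if prediction == actual else 0)

def needs_retraining():
    """Trigger the retraining process when rolling accuracy falls below threshold."""
    if len(recent_outcomes) < WINDOW:
        return False  # not enough evidence yet
    return sum(recent_outcomes) / len(recent_outcomes) < ACCURACY_THRESHOLD
```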
8. Analytics are not "one size fits all": carefully consider the success criteria for your program
Before development, define what success looks like for your program and outline ROI and business and customer impact. Develop controls, interoperability, success criteria, and acceptable variances based on how you are using your analytics. Healthcare applications, defense facial-recognition software, and AI-powered marketing all have different stakeholders and require different confidence levels.
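One way to make those differing requirements concrete is to codify acceptance criteria per application and check every release against them. The domains, metric names, and thresholds in the sketch below are purely illustrative assumptions.

```python
# A sketch of codifying per-application success criteria. All domains,
# metric names, and thresholds are illustrative assumptions.
ACCEPTANCE_CRITERIA = {
    "healthcare_triage":  {"min_recall": 0.99, "max_variance": 0.01},
    "facial_recognition": {"min_precision": 0.98, "max_variance": 0.01},
    "marketing_offers":   {"min_precision": 0.80, "max_variance": 0.10},
}

def meets_criteria(application: str, recall: float, precision: float, variance: float) -> bool:
    """Check a model's measured metrics against its application's thresholds."""
    c = ACCEPTANCE_CRITERIA[application]
    return (recall >= c.get("min_recall", 0.0)
            and precision >= c.get("min_precision", 0.0)
            and variance <= c.get("max_variance", 1.0))

# Example: a marketing model with modest precision may pass, while the same
# numbers would fail the healthcare criteria.
print(meets_criteria("marketing_offers", recall=0.70, precision=0.85, variance=0.05))
```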
There is no quick fix for addressing all biases or identifying every potential scenario your analytics program will encounter, and there shouldn't be. Building an inclusive analytics program involves your people, your processes, and the technology you're developing. Being able to learn, pivot, adapt after mishaps, and continuously improve will drive ROI and improve the customer experience.
Coauthored with contributions by Maggie Seeds