Clarkston Consulting recently partnered with a biopharmaceutical client on a pharma-centric Q&A bot. Read a synopsis of the project below or download the full case study.
In this case study, we highlight a recent project exploring generative AI capabilities in pharma, in which Clarkston’s Digital, Data and Analytics team created a pharma-centric Q&A bot. Clarkston partnered with a long-standing biopharmaceutical client to develop a new application that better equips the client’s sales representatives with key information about the company’s flagship drugs and studies. Beyond deepening knowledge of the business’s internal developments, the application enables sales representatives to quickly ideate and generate new assets. Each new asset includes source citations to existing assets, which lets users verify responses and helps mitigate hallucinations. With a structured internal knowledge base supplied to a large language model (LLM), sales representatives can now interpret and analyze vast amounts of previously underutilized internal data, gaining the information and resources needed to drive sales conversion and, ultimately, save lives.
To begin the project, Clarkston’s data scientists collated the unstructured internal assets into a structured dataset containing the raw text along with associated metadata, such as ID, page number, and disease state. The text was then converted into embeddings, and the resulting embeddings and metadata were stored in a vector database to facilitate similarity search and retrieval.
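As a rough illustration of this step, the sketch below stores text chunks with their metadata and ranks them by cosine similarity. It is a minimal sketch under stated assumptions, not the client implementation: the `Chunk` fields, the `InMemoryVectorStore` class, the toy bag-of-words `embed` function, and the placeholder records are all hypothetical. A production system would use a real embedding model and a dedicated vector database.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    """One passage of raw text plus the metadata named in the case study."""
    doc_id: str
    page: int
    disease_state: str
    text: str


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._rows: list[tuple[Counter, Chunk]] = []

    def add(self, chunk: Chunk) -> None:
        # Store the vector next to the chunk so metadata travels with it.
        self._rows.append((embed(chunk.text), chunk))

    def search(self, query: str, k: int = 2) -> list[Chunk]:
        # Rank every stored chunk against the query (brute-force search).
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


# Placeholder records, invented purely for illustration.
store = InMemoryVectorStore()
store.add(Chunk("DOC-001", 3, "condition x", "Summary of study results for condition X patients."))
store.add(Chunk("DOC-002", 7, "condition y", "Dosing guidance for condition Y adults."))

top_hit = store.search("condition X study results", k=1)[0]
print(top_hit.doc_id, top_hit.page)
```

Keeping the metadata on the same record as the vector is what makes the citation and filtering features described in this case study possible later on.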
Because most vector databases support WHERE-style filter conditions, it is important to store the metadata alongside the embeddings. Normally, this type of filtering must be applied in a separate, initial step before the question is asked. Clarkston instead used a novel method that infers metadata from the question itself: the team specified a schema outlining each attribute, which enables the user to query specific documents, page numbers, and disease states and, conversely, to retrieve such metadata from the database. Rather than relying solely on brute-force similarity search, this implementation returns faster, more pertinent, and more comprehensive responses.
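The idea of inferring metadata filters from the question can be sketched as follows. The case study does not describe how the inference works, so this example substitutes simple regular expressions and keyword matching for it; the `infer_filters` and `apply_filters` helpers, the `DOC-###` ID format, and the disease-state names are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    page: int
    disease_state: str
    text: str


# Assumed schema values; a real deployment would derive these from its own data.
KNOWN_DISEASE_STATES = ("condition x", "condition y")


def infer_filters(question: str) -> dict:
    """Extract metadata constraints mentioned in a free-text question."""
    q = question.lower()
    filters: dict = {}
    doc = re.search(r"\bdoc-\d+\b", q)
    if doc:
        filters["doc_id"] = doc.group(0).upper()
    page = re.search(r"\bpage\s+(\d+)\b", q)
    if page:
        filters["page"] = int(page.group(1))
    for state in KNOWN_DISEASE_STATES:
        if state in q:
            filters["disease_state"] = state
            break
    return filters


def apply_filters(chunks: list[Chunk], filters: dict) -> list[Chunk]:
    """Keep only chunks whose metadata matches every inferred filter."""
    return [c for c in chunks if all(getattr(c, k) == v for k, v in filters.items())]


# Placeholder records, invented purely for illustration.
chunks = [
    Chunk("DOC-001", 3, "condition x", "Summary of study results."),
    Chunk("DOC-002", 7, "condition y", "Dosing guidance for adults."),
    Chunk("DOC-002", 8, "condition y", "Storage and handling notes."),
]

filters = infer_filters("What does page 7 of DOC-002 say about condition Y?")
candidates = apply_filters(chunks, filters)
print(filters, [c.text for c in candidates])
```

Similarity search would then run only over `candidates`, which is how narrowing by metadata first can make retrieval both faster and more relevant than scanning the whole collection.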
To customize the Q&A bot to the client’s knowledge base, the LLM was supplied with the requisite context on internal processes and jargon. After thorough prompt engineering, Clarkston’s data scientists aligned the Q&A bot with the business’s practices, leading to more helpful responses.
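One common way to supply this kind of context is to assemble it into the prompt together with the retrieved passages. The sketch below shows that general pattern only; the actual prompts used in the project are not published, and the `build_prompt` helper, the glossary entry, and the passage are illustrative assumptions.

```python
def build_prompt(question: str, passages: list[dict], glossary: dict[str, str]) -> str:
    """Combine internal jargon, cited source passages, and the question."""
    glossary_lines = [f"- {term}: {meaning}" for term, meaning in glossary.items()]
    # Tag each passage with its source so the answer can cite it.
    source_lines = [f"[{p['doc_id']}, p. {p['page']}] {p['text']}" for p in passages]
    return "\n".join(
        [
            "Answer using only the sources below and cite them in brackets.",
            "If the sources do not contain the answer, say so.",
            "",
            "Internal terminology:",
            *glossary_lines,
            "",
            "Sources:",
            *source_lines,
            "",
            f"Question: {question}",
        ]
    )


# Illustrative inputs, not taken from the client's data.
glossary = {"HCP": "healthcare professional"}
passages = [{"doc_id": "DOC-001", "page": 3, "text": "Summary of study results."}]

prompt = build_prompt("What should I tell an HCP about the study?", passages, glossary)
print(prompt)
```

Placing the source tags directly in the prompt gives the model something concrete to cite, which supports the source citations described earlier in this case study.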
With the backend development finished, Clarkston’s data engineers deployed and served the model as a fully fledged application, complete with a seamless user interface, Single Sign-On (SSO) access, user guidance, and documentation. Additional enhancements included quick actions, such as retrieving a previous question for modification, and a separate display showing which underlying assets the bot used to generate each response.