In a much-anticipated announcement Monday, OpenAI rolled out the next wave of advancements to both ChatGPT and their API at DevDay. This made a big impact immediately on the community already actively using ChatGPT. However, it is called “DevDay” and was geared toward developers. Below, I provide an OpenAI DevDay Recap for non-developers
For business leaders, it can be unclear exactly what these advances mean for the incorporation of Generative AI into business processes. The advances go way beyond typical software upgrades of “faster” or “cheaper”, and the new model, GPT-4 Turbo is both, which is rare. What OpenAI introduced is already changing what business leaders and developers alike are prioritizing building, and how they are structuring projects.
1. Context Length:
Context length extends to the equivalent of a 300-page book – competing with other models like Claude 2, which had the advantage of longer context windows. Context is the key to customizing and focusing on enterprise data. This is going to allow for many projects to skip steps around “chunking,” a process to split large amounts of reference material for ChatGPT to deal with in previously smaller context windows, including the API.
2. Vision and Speech:
- Text-to-Speech: This will allow API users to create very natural-sounding speech if the demo is to be believed. Imagine a program that accesses a customer’s information and is given instructions on strategy from sales leadership, and then executes the phone call. All the pieces are here, except perhaps the most challenging part in demos I’ve seen before, which is not interrupting the customer, but that’s challenging for humans, too.
- Whisper V3: Already the standard in voice-to-text, it is apparently getting better.
- Image Inputs/Outputs: I’m amazed I can take a picture of my bookshelf and get recommendations, but this novelty will grow into a further removal of the barrier between the physical and digital world with DALL-E 3, the latest vision model’s full incorporation.
3. Reproducibility and Control:
For development, as part of JSON Mode, advancements will allow reproducible results, give control of the results, and enable more complex behavior, like executing multiple requests at once. This will be very important for “Assistants,” which I cover later.
4. Last knowledge cutoff updated to April 2023:
I anticipate the gap between the knowledge cutoff, which had been September 2021 for some time, will continue to get closer to the current day.
5. Real Files + The Whole Internet:
Imagine dragging CSVs of purchased competitor data and combining it with the whole of market research available online and executing complex analysis on both simultaneously using natural language prompts. This is easy now due to streamlining existing improvements with no development whatsoever. When ChatGPT was first introduced, you couldn’t search the web and had limited chat window space. They then introduced Beta features to search the web (with partner Bing) and Code Interpreter, a way to upload files. Both were somewhat hidden as beta features that many didn’t know existed, and they had to be turned on separately.
Code Interpreter was the most exciting upgrade to the ChatGPT user interface (UI) experience. This allowed no-code execution of robust, intermediary data analysis and many other tasks on multiple files. In a non-scientific survey, I found even regular ChatGPT users who used ChatGPT to write code weren’t using it to execute code. Perhaps because they were developers themselves, but for business users, this will be an enormous shift. Even complex machine learning tasks like clustering analysis can be done extremely quickly and well via the web UI. Business users can create custom charts by quickly uploading a CSV and asking questions in plain English. Now, Code Interpreter is turned on automatically and can be used simultaneously with the other Beta features.
It’s important to remember the ChatGPT revolution was as much about the power of the underlying model as the ease of the interface. Wrapping the full suite of GPT-4 Turbo capabilities together will skyrocket usage and make them more powerful.
“GPTs” (Plural) & Assistants
To understand the revolution that “GPTs” and Assistants will bring, it’s key to understand the current state of trying to stand up a customized chat bot in your organization. You need to stand up at minimum the following:
- A place to store your context, perhaps a Vector Database, and a place to give your chatbot custom behavioral instructions
- A way to “chunk” or split out that context/organizational data if it’s too large
- A place to store prompts coming in and make sure that history is maintained for a session, but not past that session and not for other users to access
- A link to some foundational model (like GPT4) to do the heavy lifting
- A front end to host the user experience for your employees or customers
With “GPTs” OpenAI drastically opened the door to creating these on the fly without coding. The context is uploaded by dragging in a file, and the behavioral instructions (usually referred to as a “System Prompt”) are entered in a form. It will suggest some AI-generated branding, and with GPT enterprise, you’ll be able to publish only to users who should be granted access.
OpenAI is also launching the “GPT Store,” basically the App Store for the best GPTs, which I anticipate will be very important for personal and professional use, and less so for enterprise use. Plus, the best ones will be absorbed into some future release.
Assistants are the API extension of “GPTs,” where more “agent” like experiences will be built as AI embedded in applications. Even with “GPTs,” OpenAI knows that more complicated chains of thought and integration into existing apps cannot exist as a standalone bot with a somewhat rigid interface.
The Assistants API crucially now has access to Code Interpreter, meaning it will also execute code on the fly in the context of the Assistant. This code could be needed to perform data analysis or interact with files the user uploads. The Assistant will be aware of actions it can take on behalf of the user and execute those actions in the application if the user prompt indicates those actions would be helpful.
On an enterprise level, there will be many implications. Customers will expect to have integrated experiences with not only the information an organization already knows about them, but additional information they will provide on the fly. For corporate leadership, reporting, dashboarding, and other information will continue to drift toward natural language interaction. The need for cross-domain data strategy and governance will only grow.
Importantly, Sam Altman (CEO of OpenAI) himself said in his keynote address that “GPTs” and “Assistants” are precursors to agents. Full agents executing complex tasks are coming, but their consistency, flexibility, and scope of actions are still constrained.
OpenAI DevDay Recap: Less Barriers to Enterprise Use
Protection: ChatGPT Enterprise data and data from the API are not used to train models (this is not new). They even backed it up with Copyright Sheild to pay the cost of copyright infringement. Now that “GPTs” have had the barriers come down to creating models trained on enterprise data, Enterprise licenses will become more commonplace.
Pricing: With the decreased cost for both input and output productionized use of GPT-4 Turbo, the API becomes more feasible and will prevent a lot of development teams from exploring competitors for cost considerations alone or using 3.5 (an earlier GPT model) for decreased cost.
Firms already needed to consider what not having Enterprise Licenses means for their organization, including employees sharing sensitive information in non-protected spaces, including personal accounts. With these new enhancements, firms might be facing extra costs putting the engineering or analytical muscle towards the wrong initiatives.
Clarkston has helped our clients navigate the changing AI landscape while executing high-value projects for years. If you would like to talk more, please reach out.
Subscribe to Clarkston's Insights