Building Generative AI – what do you need to know?

The growth of ChatGPT since November 2022 has been nothing short of astronomical. The service gets around 1.5 billion visits a month according to SimilarWeb, and it has led to a huge swell of interest and investment in Generative AI. According to CB Insights, around $14.6 billion of venture capital went to companies involved in this area during the first six months of 2023.

Business interest is rising in step with that investment. Accenture’s AI For Everyone report estimated that around 40 percent of all work hours could be affected by Generative AI services, and found that 98 percent of global executives expect AI to be essential to their companies’ strategies over the next three to five years.

For enterprises interested in how they can get the most out of Generative AI, the potential rewards are great. However, they have to work out how to design and build these services for themselves or hand over responsibility to a third party. With so much at stake, how can enterprises plan ahead on what they should build, buy or rent?

Starting with Generative AI

To implement Generative AI, you need data. This data is used to train models that can then respond appropriately to user requests. However, rather than creating machine learning models that predict results in transactions, this training data is used to create mathematical models of language and response patterns. The result is a large language model, or LLM. LLMs sit at the heart of Generative AI services and are responsible for creating the responses to user requests.

The most famous LLMs, from the likes of OpenAI, Google and Meta, took vast amounts of data to train and create. OpenAI estimates that it spent $100 million to train GPT-4, the LLM at the heart of its ChatGPT service. Very few organisations can afford to spend that much, or to train their own LLM at all. Open source options are launching into the market that provide similar frameworks and can meet enterprise requirements. These can work with much smaller volumes of data to create a model, but they may not have the richness of the larger LLMs.

Alongside your choice of LLM, you also have to consider the data that you feed into the model over time. This is not the training data, but the data set that you use alongside the LLM to create responses to user requests. Instead of sitting in a traditional database, this data is stored as vectors.

Vectors are numerical representations of the words or concepts in your data. Each word or term is assigned a set of numbers, so that terms with similar meanings end up with similar values, and these values are then used to map sentences or requests. Enterprises will have to create vectors for the data that they want to use with generative AI systems, so that natural language is translated into a format that the system can use and query. This process of vectorisation creates a set of data called embeddings, which can then be used by the LLM to build responses.
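To make the idea of vectorisation concrete, here is a minimal sketch in Python. It is not how a production embedding model works: the function toy_embed and the eight-dimension size are hypothetical, chosen only to show text going in and a fixed-length list of numbers coming out. A real system would call a trained embedding model, which places similar meanings close together, something this word-hashing stand-in does not attempt.

```python
import hashlib
import math

# Assumed size, kept tiny for readability. Real embedding models
# typically produce vectors with hundreds or thousands of dimensions.
DIMENSIONS = 8


def toy_embed(text: str, dims: int = DIMENSIONS) -> list[float]:
    """Hypothetical stand-in for an embedding model.

    Each word is hashed into one of `dims` buckets and counted, then the
    vector is scaled to unit length. This captures word overlap only,
    not meaning.
    """
    vector = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[digest[0] % dims] += 1.0

    # Normalise so that vectors of different-length texts are comparable.
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


if __name__ == "__main__":
    print(toy_embed("How do I return a product?"))
```

The point to take away is the shape of the output: whatever the input text, the result is a list of numbers of the same length, which is what a vector database stores and compares.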

When a user makes a request to a generative AI system, their question is converted to an embedding and used to search the vector database. The vector database returns the closest matches to that request, and the LLM then assembles those results into a response. For enterprises that have built up sets of data around their customers or products, these data sets can be vectorised and then fed into the Generative AI system to create more personalised or more accurate responses.
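The search step described above can be sketched as a similarity comparison between the query vector and each stored vector. The example below is an illustration under stated assumptions: the three documents, their three-number vectors and the query vector are all made up by hand, and the in-memory dictionary stands in for a vector database. Cosine similarity is one common way to score matches; a real vector database would use an index rather than comparing against every record.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Score how closely two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector: list[float],
           store: dict[str, list[float]],
           top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k (score, document) pairs, best match first."""
    scored = [(cosine_similarity(query_vector, vector), document)
              for document, vector in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical data: hand-written vectors standing in for real embeddings.
STORE = {
    "Returns are accepted within 30 days.": [0.9, 0.1, 0.0],
    "Standard delivery takes three working days.": [0.1, 0.9, 0.1],
    "Our head office is in London.": [0.0, 0.1, 0.9],
}

# Pretend this is the embedding of "Can I send my order back?"
QUERY = [0.8, 0.2, 0.0]

results = search(QUERY, STORE)

if __name__ == "__main__":
    for score, document in results:
        print(f"{score:.3f}  {document}")
```

With these made-up vectors the returns policy ranks first and the delivery line second, and those two passages are what would be handed to the LLM.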

Running generative AI systems

Creating your own vector database instance or instances has another benefit. Besides allowing you to improve responses for users, running your own vector database instances means that you can manage and own your systems rather than relying on a third party provider to take control of your company data.

While many companies are happy to use these kinds of services, enterprises may prefer to keep ownership of their data for competitive advantage, or because they can’t hand this data over for data security or privacy compliance reasons. Whatever the thinking behind this, running your own vector database instance can help you deliver more accurate responses for users while keeping that data under your control.

Over time, you may want to add more data to improve your responses to users as well. One technique that can be used here is retrieval augmented generation, or RAG. This enables you to retrieve relevant context from your own data at request time and pass it to your LLM along with the user's question. This can minimise the massive headache and expense involved in retraining your LLM with up to date data over time. It also lets you retain more control over how that data is used, and it improves responses for users.
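As a rough sketch of what RAG looks like in practice, the Python below assembles a prompt from a user question and a set of retrieved passages. The function name build_rag_prompt, the prompt wording and the sample passages are all assumptions made for illustration. The call to the LLM itself is deliberately left out, since that depends on which model or provider you choose.

```python
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user's question into one prompt.

    The passages would normally come from a vector database search;
    here they are passed in directly.
    """
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer the question using only the context below.\n"
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    # Hypothetical retrieved passages, for illustration only.
    retrieved = [
        "Returns are accepted within 30 days.",
        "Standard delivery takes three working days.",
    ]
    print(build_rag_prompt("Can I send my order back?", retrieved))
```

Because the up to date information travels in the prompt rather than in the model's weights, refreshing it is a matter of updating the stored data, not retraining the LLM.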

Alongside your LLM and vector database, there may be other elements that you want to integrate into your generative AI application. One common tool for this is LangChain, which can integrate multiple LLMs together to improve the results that are generated. LangChain also makes it easier to output results from your LLM into different formats – alongside a response within a website, you may want to output data into a report format in a PDF. LangChain makes that process easier.

To make running generative AI systems easier – and to make life easier for your developers – integrating different components together can help. A good example of this is the open source project CassIO, a Python library that makes it easier to integrate advanced functionality in the open source database Apache Cassandra with the likes of LangChain and other LLM projects. By abstracting away some of the specific integrations, this avoids some of the management headaches and overheads that can otherwise come in with running generative AI systems over time. Rather than having to code and support connectors and integrations between components, these open source projects should make things easier for developers to get up and running with generative AI in their applications.

Whether you have been following AI for years, or have only just taken an interest due to the hype around ChatGPT, the role for this technology in evolving applications and services is just getting started. Understanding the frameworks and requirements for ongoing management can help you make the right decisions around how you can implement generative AI in your systems, and create the best opportunities for your team and your organisation.

Author: Dom Couldwell, Head of Field Engineering EMEA, DataStax

Stock photo: Adobe
