Foundation Models, LLMs and Generative AI

Chat Generative Pre-trained Transformer (ChatGPT) is a chatbot and conversational AI tool launched by OpenAI in November 2022. ChatGPT is built on top of OpenAI’s GPT-3 family of large language models (LLMs). Over 2023, the ChatGPT service is expected to evolve rapidly, and complementary solutions are emerging at a fast pace. On 8 February 2023, Google formally shared parts of its AI vision and introduced Bard, an “experimental conversational AI service” powered by its LaMDA large language model. Microsoft’s Z-code foundation model provides high-quality multilingual language understanding capabilities in Azure Cognitive Service for Language. Venture capital funding of generative AI startups has also accelerated: PitchBook reported that venture capitalists have increased investment in generative AI by 425% since 2020, to $2.1 billion.

In particular, the technology that ChatGPT is built upon, referred to as “foundation models” and falling under the broader umbrella of generative AI, will have a transformative impact on data and analytics in the mid to long term. Foundation models, previously called “transformers,” are mostly LLMs and are designed to replace task-specific models. They embody a type of deep neural network architecture that computes a numerical representation of text in the context of the surrounding words, emphasizing sequences of words. Foundation models are trained on a broad set of unlabeled data and can be applied to different tasks with additional fine-tuning. They are called foundation models because their large-scale pretraining makes them critically important and applicable to a wide variety of downstream use cases.
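The contextual-representation idea described above can be sketched in a few lines. The following is a toy, single-head self-attention pass in pure Python, illustrative only: each token’s output is a similarity-weighted average of all token embeddings, which is the core mechanism by which transformers represent a word in the context of its neighbors. Real transformers add learned query/key/value projections, multiple heads and many stacked layers.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(embeddings):
    """Return one contextual vector per token: a weighted average of
    all token embeddings, weighted by scaled dot-product similarity."""
    d = len(embeddings[0])
    out = []
    for q in embeddings:
        scores = [dot(q, k) / math.sqrt(d) for k in embeddings]
        weights = softmax(scores)
        out.append([sum(w * k[i] for w, k in zip(weights, embeddings))
                    for i in range(d)])
    return out

# Toy 2-D "embeddings" for three tokens (values are illustrative):
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(tokens)
```

Each output vector is a convex combination of the inputs, so the representation of every token now reflects the whole sequence.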

Generative AI refers to AI techniques that learn a representation of artifacts from data, and use it to generate brand-new, completely original artifacts that preserve a likeness to the original data. These artifacts can serve benign or nefarious purposes. It uses a number of unsupervised techniques that continue to evolve, such as:

  • Recurrent neural networks (RNN)

  • Large language models (LLM)

  • Generative adversarial networks (GAN)

  • Variational autoencoders (VAE)

  • Autoregressive generative models (AGM)

  • Deep autoregressive networks (DARN)

  • Latent diffusion models (LDM)

  • Neural radiance field (NeRF)

  • Gradient origin networks (GON)

ChatGPT can be further characterized as a type of reinforcement learning approach. Although augmented with human feedback, it is essentially a machine learning construct, and it lacks the generalization and transparency qualities provided by symbolic techniques, such as explaining its methods or citing its sources. It is based on very advanced predictions, but it does not “reason” or “understand.” Nor does it create truly new content; it is “only” recombining existing content, albeit in a highly contextualized, convincing and seemingly confident manner.
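The “prediction, not understanding” point can be illustrated with a deliberately tiny autoregressive model: a bigram counter that only ever recombines word sequences it has already seen. This is a toy stand-in for the vastly larger next-token predictors behind LLMs, not a description of ChatGPT’s actual architecture; the corpus text is illustrative.

```python
from collections import Counter, defaultdict

corpus = ("the model predicts the next word "
          "the model does not understand the word").split()

# Count next-word frequencies for every word (a bigram "model").
bigrams = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams[w1][w2] += 1

def generate(word, n=5):
    """Greedily emit the most frequent continuation at each step:
    pure next-token prediction over previously seen text."""
    out = [word]
    for _ in range(n):
        if word not in bigrams:
            break
        word = bigrams[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

text = generate("the", 4)
```

Everything the sampler emits is recombined training text; scale and context length, not a different kind of reasoning, are what separate this toy from an LLM.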

ChatGPT is an evolution of ongoing trends, rather than a new paradigm. The underlying model is based on transformer neural networks (although not all transformer models are foundation models), which have been used as foundation models for over five years, including in vendor applications. However, ChatGPT does add some new elements to those foundation models, such as conversational and short-term memory layers and massive human-in-the-loop feedback (reinforcement learning) in the training process. The engineering delivered to make the model available for mass consumption is also novel, requiring extensive compute resources and model-serving architecture.

ChatGPT and the progress of LLMs and generative AI in general create mainstream awareness and will serve as a potential catalyst for accelerating better data and analytics user experiences, adoption and impact. We will see products leveraging GPT-3.x and generative AI alternatives roll out over the next three to 18 months. Early use cases, such as developer code generation assistants or co-pilots, will emerge across data and analytics (and other developer) platforms. These are likely to initially be based on Codex, a derivative of GPT-3 that is optimized for generating code in SQL, Python, DAX, R and other languages. They will also make it easier for business users to generate calculations, and the code behind them, using natural language. One such example is quick measure suggestions in Microsoft Power BI.
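A code-generation assistant of this kind typically grounds the user’s natural-language question in the data model before calling the model. The sketch below is hypothetical: the schema, prompt wording and `build_prompt` helper are illustrative assumptions, not any vendor’s actual API, and the LLM call itself is left out.

```python
# Hypothetical table schema the assistant would pull from the
# customer's data model (illustrative, not a real dataset).
SCHEMA = "sales(region TEXT, amount REAL, order_date DATE)"

def build_prompt(question):
    """Combine the user's intent with schema context -- a minimal
    form of the prompt engineering described above."""
    return (f"Given the table {SCHEMA}, write a SQL query that "
            f"answers: {question}\nReturn only the SQL.")

prompt = build_prompt("total sales per region in 2023")
```

In a real co-pilot, `prompt` would be sent to a model such as Codex or GPT-3.x, and the returned SQL validated before execution against the customer’s data.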

The opportunity comes from combining the strengths of LLMs with those of analytics and BI platforms and DSML platforms at different parts of the analytics workflow.

  • GPT-3.x (and alternatives) can be used to initiate the query and understand user intent (also known as “prompt engineering” in the broader discipline of composite AI).

  • Using the prompt as input, together with the analytics vendor’s platform and customer-specific data and data models, GPT (and others as they emerge) can assist in autogenerating analytics content such as code and dashboards. Compared with current search and natural language query and generation tools, it can also create much richer, humanlike conversational narratives to analyze, explain, predict and prescribe actions.

Moreover, LLM-enabled analytics might enable a seamless experience across BI and DSML platforms unified by natural language. Business users will be able to access the power of capabilities across platforms from a single conversational interface. While the enhanced experience of combining LLMs with analytics platforms could be significant for less skilled users, data privacy, IP protection, trustworthiness of results, and explainability will have to be addressed.

Future uses of large language models in the data and analytics space, when combined with other techniques, could include:

  • Conversational interfaces for multistructured analytics over wide data sources and formats, across a broader spectrum of structured and unstructured data.

  • Automated support for troubleshooting and assisting users in a co-pilot-like experience.

  • Personalized and interactive training for data and analytics practitioners.

  • (Augmented) generation of knowledge graphs, ontologies or other data structures deduced from structured and unstructured internal and external data sources. Combining the predictive qualities of LLMs with semantic heuristics and mastering of concepts will be needed to make predictable, resilient systems.
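As an illustration of the last bullet, suppose an LLM has proposed (subject, predicate, object) triples from free text; assembling them into a simple knowledge-graph structure is then straightforward. The triple-extraction step itself is omitted here, and the triples below are drawn from statements earlier in this section.

```python
# Triples an LLM might propose from unstructured text (here taken
# from facts stated earlier in this document).
triples = [
    ("ChatGPT", "built_on", "GPT-3"),
    ("GPT-3", "is_a", "large language model"),
    ("Bard", "powered_by", "LaMDA"),
]

# Adjacency-style knowledge graph: entity -> list of (relation, entity).
graph = {}
for subj, pred, obj in triples:
    graph.setdefault(subj, []).append((pred, obj))
```

The semantic heuristics mentioned above would sit between the LLM and this assembly step, validating proposed triples against an ontology before they enter the graph.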

Generative AI techniques will also have the ability to transform data management activities, making self-service data management activities accessible to much larger audiences. Through integration with active metadata management tools and enrichment from semantic tools (or by training large language models on enterprise or industry-specific corpora), it will become possible to have conversational interfaces to perform data management tasks.

This capability will not only simplify the work of developers, data integration specialists and DBAs, but also offer the possibility (through prompt engineering) of increasingly system-to-system conversations. It also offers data stewards the option to express data quality rules in natural language and have the tool generate the corresponding scripts or pipelines automatically. As a result, natural language will help perform the majority of data management activities, such as data integration, data governance, data preparation, data administration and optimization tasks.
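For example, a steward’s natural-language rule such as “amount must be positive and region must not be blank” might yield a generated validation script along these lines. The rule text, field names and `check_row` helper are illustrative assumptions, not output from any specific tool.

```python
def check_row(row):
    """Validation logic an LLM might generate from the steward's
    natural-language rule (fields are hypothetical)."""
    errors = []
    if row.get("amount", 0) <= 0:
        errors.append("amount must be positive")
    if not row.get("region", "").strip():
        errors.append("region must not be blank")
    return errors

# Sample records to validate (illustrative data).
rows = [{"region": "EMEA", "amount": 120.0},
        {"region": "", "amount": -5.0}]

# Map row index -> violations, keeping only failing rows.
report = {i: check_row(r) for i, r in enumerate(rows) if check_row(r)}
```

The steward reviews the generated check once; thereafter the rule runs in the pipeline without anyone hand-writing the code.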
