In this article, Dr Yeo Wee Kiang explores the real-world applications of generative AI such as text-to-text and text-to-image AI models.
Generative AI refers to Artificial Intelligence techniques that create new content, such as text, images, and music, by learning from vast data sets. It uses patterns learned from this data to generate original, often realistic, outputs. While not all generated content is entirely novel, Generative AI excels at reinterpreting and recombining existing data in unexpected and seemingly ingenious ways.
Generative AI has a wide range of real-world applications, each relying on a different type of model. Text-to-Text models, for example, take natural language input and produce text as output. Natural language, such as English, is what we humans use for communication. These models are the engines behind AI chatbots that engage users on messaging platforms and websites, producing responses that mimic the structure and style of human language. They can also assist in content creation, generating written text for articles, blog posts and more, which may improve the productivity and efficiency of content creators.

Another type is the Text-to-Image model, which converts textual descriptions into photorealistic images. E-commerce platforms may use such models to generate product images from product descriptions. For large inventories, generating images this way can be more time- and cost-efficient than traditional photography.

While Generative AI excels at mimicking creative work, the extent to which these neural-network-based models can drive genuine innovation remains an open question.
Large Language Models: The Power of Context
Over the last decade, the field of natural language processing (NLP) has been significantly transformed by the integration of neural networks for text representation. Neural networks, loosely inspired by the structure and function of the brain, play a crucial role in Generative AI: they make it possible to recognise intricate patterns and to build sophisticated language models. Starting with Word2vec in 2013, followed by the widespread adoption of Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) in 2014, these advances boosted NLP tasks such as translation and text classification, along with many others.
Transformers, a specific type of neural network architecture, revolutionised the field after the publication of the groundbreaking 2017 paper titled “Attention Is All You Need”. Large Language Models (LLMs), such as the GPT (Generative Pre-trained Transformer) models, are built upon the transformer architecture. Previous models represented words as vectors but lacked contextual sensitivity: the term “book” had the same representation whether it referred to reading material or a reservation. Transformers changed this by adopting an encoder-decoder structure and integrating attention mechanisms, which allow the model to decide which parts of the input text are important when creating each part of the output.
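To make the idea of attention concrete, here is a minimal sketch of the scaled dot-product attention operation at the heart of the transformer, using NumPy with made-up token embeddings. The sentence length, embedding size, and random vectors are all illustrative assumptions, not values from any real model:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core operation from "Attention Is All You Need": each output row
    is a weighted mix of the value vectors V, with weights derived from
    how strongly each query matches each key."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of every query to every key
    # Softmax each row so the weights are positive and sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# A toy "sentence" of 4 tokens with 8-dimensional embeddings (random stand-ins)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))

# Self-attention: the tokens attend to each other
out, w = scaled_dot_product_attention(X, X, X)
```

Because each output row blends every token's vector according to these weights, the representation of a word like “book” ends up depending on the words around it, which is exactly the contextual sensitivity earlier models lacked.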
Retrieval Augmented Generation: Enhancing Output Quality and Reliability
Retrieval Augmented Generation (RAG) enhances LLMs by pairing them with information retrieval, giving them access to current or specialised information. RAG transforms the response process into three steps: retrieving relevant content, merging it with the user’s question (the prompt), and generating an answer. Unlike a standalone LLM, whose knowledge is fixed at training time and can become outdated, a RAG system can fetch up-to-date information from sources like databases and the Internet.
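The three steps above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `embed` uses simple word overlap instead of a real embedding model, `DOCUMENTS` plays the role of a knowledge base, and `generate` marks where a real system would call an LLM:

```python
def embed(text):
    # Stand-in for a real embedding model: a bag of lowercase words,
    # so texts that share words count as "similar".
    return set(text.lower().split())

# Stand-in for a document store / vector database
DOCUMENTS = [
    "The X100 router supports firmware version 2.4 as of March.",
    "Password resets are handled through the self-service portal.",
]

def retrieve(question, k=1):
    """Step 1: rank stored documents by overlap with the question."""
    q = embed(question)
    ranked = sorted(DOCUMENTS, key=lambda d: len(q & embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Step 2: merge the retrieved content with the user's question."""
    context = "\n".join(passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt):
    # Step 3: a real system would send the prompt to an LLM here.
    return "(LLM answer grounded in the retrieved context)"

question = "What firmware does the X100 support?"
answer = generate(build_prompt(question, retrieve(question)))
```

The key design point is that the model never answers from memory alone: the prompt it sees already contains the freshest matching passage, which is what keeps RAG answers current.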
Errors in LLMs, known as “hallucinations”, occur when the model generates text that seems realistic but is ultimately inaccurate or nonsensical, reducing the reliability of its output. Augmenting LLMs with RAG not only increases response accuracy but also provides supporting evidence for each answer.
Consider a practical scenario: a large enterprise struggles to provide precise technical support because its product documentation is extensive and ever-evolving. To tackle this, it implements a RAG system for its helpdesk, which retrieves the most recent and reliable information from the company’s internal databases. For newly updated features or products, this is especially useful: helpdesk staff can give customers the most current product details.
Tailored AI: The Rise of Specialised Solutions in Business
Businesses are actively seeking AI tailored to their specific needs, moving away from generic, one-size-fits-all solutions. While OpenAI’s GPT-4 holds the crown for general capability right now, companies are building smaller, specialised models for their own tasks, training them on both proprietary and publicly available data. Such training equips these models with knowledge suited to specific tasks, even though they might not match the versatility of broader applications like ChatGPT. This rising trend towards custom, smaller AI may lead to many helpful AI assistants in various settings, each tackling specific tasks and enabling businesses to use AI precisely as needed without incurring high costs.
Conclusion
Recent progress in Generative AI is a giant leap forward in AI technology. It is transforming industries, redefining creativity, and putting powerful new tools within everyone’s reach, enabling faster, better, and more bespoke creations. Those who dive in now will not merely keep up with this transformative movement but help shape it. The AI frontier beckons – leave your mark; your journey starts now.