AI has a lot of terms. Here's a glossary of what you need to know

When people unfamiliar with AI envision artificial intelligence, they may imagine Will Smith’s blockbuster I, Robot, the sci-fi thriller Ex Machina, or the Disney movie Smart House — nightmarish scenarios where intelligent robots take over to the doom of their human counterparts.
Today’s generative AI technologies aren’t quite all-powerful yet. Sure, they may be capable of sowing disinformation to disrupt elections or sharing trade secrets. But the tech is still in its early stages, and chatbots are still making big mistakes.
Still, the newness of the technology is also bringing new terms into play. What is a semiconductor, anyway? How is generative AI different from all the other kinds of artificial intelligence? And should you really know the nuances between a GPU, a CPU, and a TPU?
If you’re looking to keep up with the new jargon the sector is slinging around, Quartz has your guide to its core terms.
What is generative AI?

Let’s start with the basics for a refresher. Generative artificial intelligence is a category of AI that uses data to create original content. Classic AI, by contrast, could only offer predictions based on its data inputs; it couldn’t produce brand-new, unique answers. Generative AI can, because it uses “deep learning,” a form of machine learning built on artificial neural networks (software programs loosely resembling the human brain), so computers can perform human-like analysis.
Generative AI isn’t grabbing answers out of thin air, though. It’s generating answers based on data it’s trained on, which can include text, video, audio, and lines of code. Imagine, say, waking up from a coma, blindfolded, and all you can remember is 10 Wikipedia articles. All of your conversations with another person about what you know are based on those 10 Wikipedia articles. It’s kind of like that — except generative AI uses millions of such articles and a whole lot more.
What is a chatbot?

AI chatbots are computer programs that generate human-like conversations with users, giving unique, original answers to their queries. Chatbots were popularized by OpenAI’s ChatGPT, and since then, a bunch more have debuted: Google Gemini, Microsoft Copilot, and Salesforce’s Einstein lead the pack, among others.
Chatbots don’t just generate text responses — they can also build websites, create data visualizations, help with coding, make images, and analyze documents. To be sure, AI chatbots aren’t foolproof yet — they’ve made a lot of mistakes already. But as AI technology rapidly advances, so will the quality of these chatbots.
What is a Large Language Model (LLM)?

Large language models (LLMs) are a type of generative artificial intelligence. They are trained on vast amounts of text, including news articles and e-books, to understand and generate content such as natural language. Basically, they ingest so much text that they learn to predict what word comes next. Take this explanation from Google:
“If you started to type the phrase ‘Mary kicked a…,’ a language model trained on enough data could predict ‘Mary kicked a ball.’ Without enough training, it may only come up with a ‘round object’ or only its color, ‘yellow.’” — Google’s explainer
Popular chatbots like OpenAI’s ChatGPT and Google’s Gemini, which have capabilities such as summarizing and translating text, are examples of LLMs.
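Google's "Mary kicked a ball" example boils down to counting which words tend to follow which. Here's a minimal sketch of that idea, using a tiny made-up corpus in place of the web-scale text a real LLM trains on (real models use neural networks, not raw counts, but the prediction goal is the same):

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the massive text datasets real LLMs train on.
corpus = "mary kicked a ball . mary kicked a ball . mary kicked a can ."

# Count, for each word, which words follow it and how often.
following = defaultdict(Counter)
words = corpus.split()
for current, nxt in zip(words, words[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word most frequently seen after `word` in the corpus."""
    return following[word].most_common(1)[0][0]

print(predict_next("kicked"))  # "a"
print(predict_next("a"))       # "ball" (seen twice, vs. "can" once)
```

With more training text, the predictions get sharper, which is the intuition behind why scale matters so much for LLMs.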
What is a semiconductor?

No, it’s not an 18-wheeler driver. Semiconductors, commonly called chips, sit in the electrical circuits of devices such as phones and computers; electronic devices wouldn’t exist without them. Semiconductors are made from pure elements like silicon or compounds like gallium arsenide. The “semi” in the name comes from the fact that the material conducts electricity better than an insulator, but worse than a pure conductor like copper.
The world’s largest semiconductor foundry, Taiwan Semiconductor Manufacturing Company (TSMC), makes an estimated 90% of advanced chips in the world, and counts top chip designers Nvidia and Advanced Micro Devices (AMD) as customers.
Even though the semiconductor was invented in the U.S., the country now produces only about 10% of the world’s chips, and none of the advanced ones needed for larger AI models. President Joe Biden signed the CHIPS and Science Act in 2022 to bring chipmaking back to the U.S., and the Biden administration has already invested billions in semiconductor companies, including Intel and TSMC, to build factories throughout the country. Part of that effort also has to do with countering China’s advancements in chipmaking and AI development.
What are GPUs & CPUs?

A GPU, or graphics processing unit, is an advanced chip (or semiconductor) that powers the large language models behind AI chatbots like ChatGPT. GPUs were originally designed to render higher-quality visuals in video games.
Then a Ukrainian-born Canadian computer scientist, Alex Krizhevsky, showed that a GPU could train deep learning models far faster than a CPU, or central processing unit, the main hardware that powers computers.
CPUs are the “brain” of a computer, carrying out instructions for that computer to work. A CPU is a processor, which reads and interprets software instructions to control the computer’s functions. A GPU, by contrast, is an accelerator: a piece of hardware designed to speed up a specific kind of task alongside the processor.
Nvidia is the leading GPU designer, with its H100 and H200 chips used in major tech companies’ data centers to power AI software. Other companies are aiming to compete with Nvidia’s accelerators, including Intel with its Gaudi 3 accelerator and Microsoft with its Azure Maia 100 chip.
What is a TPU?

TPU stands for “tensor processing unit.” Google’s chips, unlike those of Microsoft and Nvidia, are TPUs — custom-designed chips made specifically for training large AI models (whereas GPUs were initially made for gaming, not AI).
While CPUs are general-purpose processors and GPUs are additional processors that run high-end tasks, TPUs are accelerators custom-built to run AI services — making them all the more powerful for that work.
What is a hallucination?

As mentioned before, AI chatbots are capable of a lot of tasks, but they also slip up a lot. When LLMs like ChatGPT make up fake or nonsensical information, that’s called a hallucination.
Chatbots “hallucinate” when they don’t have the necessary training data to answer a question, but still generate a response that looks like a fact. Hallucinations can be caused by different factors, such as inaccurate or biased training data and overfitting, which is when a model fits its training data so closely that it can’t generalize to new data.
Hallucinations are currently one of the biggest problems with generative AI models — and they’re not easy to solve. Because AI models are trained on massive datasets, it’s difficult to pinpoint specific problems in the data. Sometimes the training data itself is inaccurate, because it comes from sources like Reddit. And although AI models are trained not to answer questions they don’t know the answer to, they sometimes answer anyway, generating responses that are inaccurate.
What is training?

Training is the process of teaching an AI model how to make predictions. In this phase, an AI model is fed data to learn how to do a specific task or tasks, and goes through trial and error until it starts producing the desired outputs.
What is inferencing?

Inferencing comes after training: it’s the process of a trained AI model making predictions from new data. For example, a self-driving car’s model may be trained to recognize stop signs on specific roads. Through inferencing, the car can then recognize a stop sign on any road.
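The two phases can be sketched with a toy example: "training" fits a rule to labeled examples, and "inferencing" applies the frozen rule to inputs it has never seen. The task and numbers here are invented purely for illustration:

```python
# Training data: (size of a red octagon detected in an image, is it a stop sign?)
# These examples and the threshold rule are illustrative assumptions, not a
# real perception model.
training_examples = [(0.9, True), (0.8, True), (0.1, False), (0.2, False)]

# Training: pick the threshold that separates the two classes.
positives = [x for x, label in training_examples if label]
negatives = [x for x, label in training_examples if not label]
threshold = (min(positives) + max(negatives)) / 2  # midpoint: 0.5

# Inference: the trained "model" (just a threshold here) labels new inputs
# it never saw during training.
def is_stop_sign(x):
    return x > threshold

print(is_stop_sign(0.7))   # True
print(is_stop_sign(0.05))  # False
```

Real models learn millions of parameters instead of one threshold, but the split is the same: learn once, then predict on new data.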
What is retrieval-augmented generation (RAG)?

Retrieval-augmented generation, or RAG, is a natural language processing (NLP) technique used to improve the accuracy of generative AI models. With RAG, generative large language models are combined with information retrieval systems (like databases and web pages), allowing the models to reference knowledge outside their original training data and therefore provide more up-to-date answers.
The term was coined in a 2020 paper by a group of researchers from Facebook, University College London, and New York University.
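The core RAG loop — retrieve a relevant document, then hand it to the model inside the prompt — can be sketched in a few lines. The documents, the word-overlap retriever, and the prompt template below are all simplified assumptions; production systems use vector databases and embedding search instead:

```python
# A tiny illustrative document store (real systems index far more text).
documents = [
    "The CHIPS and Science Act was signed in 2022.",
    "TSMC is the world's largest semiconductor foundry.",
    "Nvidia designs the H100 and H200 GPUs.",
]

def retrieve(question):
    """Return the document sharing the most words with the question.
    (A stand-in for real retrieval, e.g. embedding similarity search.)"""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question):
    # The retrieved text is injected as context, so the model can answer
    # from it instead of relying only on its training data.
    return f"Context: {retrieve(question)}\nQuestion: {question}\nAnswer:"

print(build_prompt("When was the CHIPS and Science Act signed?"))
```

The assembled prompt would then be sent to an LLM, which grounds its answer in the retrieved context rather than in whatever it memorized during training.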
What are tokens?

Text data is broken down into smaller units, called tokens, to be processed by AI models. Tokens can range from a single character to a whole word or short phrase.
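A toy tokenizer shows the basic idea. Real models use learned subword vocabularies (such as byte pair encoding), but simply splitting text into words and punctuation illustrates how a sentence becomes a sequence of small units:

```python
import re

def tokenize(text):
    """Split text into word and punctuation tokens.
    A simplification: real tokenizers learn subword pieces from data."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Chatbots aren't foolproof yet.")
print(tokens)  # ['Chatbots', 'aren', "'", 't', 'foolproof', 'yet', '.']
```

Note how even the apostrophe becomes its own token here; real subword tokenizers make similar fine-grained splits, which is why token counts usually exceed word counts.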
What are parameters?

Parameters are the variables a model learns from training data that guide its ability to make predictions. During training, a model adjusts its parameters to close the gap between its predictions and the desired outcomes. That’s how the model learns to make accurate predictions on new data.
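That adjustment process can be shown with the smallest possible model: a single parameter w, trained to fit the made-up pattern y = 2x. Each step nudges w to shrink the gap between prediction and target, which is exactly what "adjusting parameters" means:

```python
# Made-up training data following the pattern y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, desired output)

w = 0.0              # the model's one parameter, before training
learning_rate = 0.05

for _ in range(200):
    for x, y in data:
        prediction = w * x
        error = prediction - y          # gap between prediction and target
        w -= learning_rate * error * x  # nudge w to reduce the squared error

print(round(w, 3))  # close to 2.0: the model has "learned" the pattern
```

An LLM does the same thing with billions of parameters at once, which is why parameter count is so often quoted as a rough measure of a model's capacity.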
What is an AI PC?

An AI PC is a personal computer that can handle AI and machine learning tasks. They are built with a CPU, GPU, and NPU that have specific AI capabilities. An NPU, or neural processing unit, is a chip specialized in carrying out AI- and machine learning-based tasks on the computer without having to send that data to be processed in the cloud.
What is a neural processing unit (NPU)?

A neural processing unit (NPU) can run artificial intelligence and machine learning tasks straight on a device, such as an AI PC, meaning the data doesn’t have to be sent to the cloud for processing.