The essential AI glossary
The field of AI – which blends elements of computer science, cognitive science, psychology, game theory and a number of other disciplines – comes with a huge variety of technical terms that can be tricky for outsiders to grasp. This guide can help get you started as you learn to speak the language of AI.
Images created by Midjourney using the prompt: “An intelligent humanoid machine holding a dictionary, sci-fi digital art.”
A/B testing: A form of randomized experimentation wherein two variants of a particular model, A and B, are tested on comparable groups of users to determine which of the two performs better.
AlexNet: Widely regarded as a major breakthrough for deep learning, AlexNet is a convolutional neural network (CNN) designed by Alex Krizhevsky, Ilya Sutskever and Geoffrey Hinton. Constructed with eight layers – five convolutional and three fully connected – it won the 2012 ImageNet Large-Scale Visual Recognition Challenge (ILSVRC), a competition in which researchers strive to build AI models capable of labeling and classifying the more than 15mn high-resolution images that comprise ImageNet. (Hinton, at the time, was Krizhevsky’s and Sutskever’s PhD advisor at the University of Toronto.)
Algorithm: A set of instructions or rules used – often by a computer – to solve a set of problems, execute calculations or process data.
Alignment: In AI research, “alignment” refers to the process of building models that act according to – and that do not deviate from – human interests.
AlphaGo: An AI model developed by DeepMind and designed specifically to play the ancient Chinese board game Go. In 2015, AlphaGo became the first AI model to defeat a professional human Go player (Chinese-born Fan Hui). It beat Lee Sedol, a then-professional Go player from South Korea, the following year. Lee retired from professional Go in 2019, telling the South Korean media outlet Yonhap News Agency that AI specializing in the game of Go “is an entity that cannot be defeated.”
Amdahl’s Law: Named after computer scientist Gene Amdahl, this law states that the maximum speedup gained by parallelizing a computing task is limited by the portion of the task that can’t be parallelized – that is, the part that can’t be divided into smaller sub-tasks and executed simultaneously.
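Expressed as a formula (in the standard textbook form), if a fraction p of a task can be parallelized and spread across N processors, the overall speedup S is at most:

```latex
S(N) = \frac{1}{(1 - p) + \frac{p}{N}}
```

Even with unlimited processors, the speedup can never exceed 1/(1 − p): if 10 percent of a task is inherently serial, no amount of parallel hardware can make the whole task run more than 10 times faster.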
Artificial general intelligence (AGI, also sometimes referred to as Strong AI): An AI program with an intellectual ability that’s comparable to that of an average adult human. AGI, in other words, would hypothetically (we have yet to build one) be able to solve problems across a vast range of categories, just as a human brain can.
Artificial narrow intelligence (ANI, also sometimes referred to as Weak AI): An AI program built to perform a single, narrow function, such as playing chess or responding to customer service questions. All of the AI programs that have been developed to date fall into the category of ANI.
Artificial neural network (ANN): A synthetic system, roughly modeled on the architecture of organic brains, composed of layers of artificial neurons.
Artificial superintelligence (ASI): Popularized by Oxford philosopher Nick Bostrom, “superintelligence” is a theoretical artificial intellect that is more advanced than that of humans. An ASI could have only a slightly higher IQ score than the average human being, or it could be vastly, unfathomably more intelligent, comparable to the difference in cognitive ability between an ant and Nobel Laureate Roger Penrose.
Association rule learning: A method of unsupervised and rule-based machine learning aimed at identifying commonalities or associations between variables in a dataset.
Automatic speech recognition (ASR – also known as computer speech recognition, speech-to-text or simply speech recognition): A machine’s capability to recognize human speech and then convert it into text. The iPhone dictation feature, for example, uses ASR.
Backpropagation: The process by which a neural network measures its predictive error and then propagates that error backward through its layers, calculating how much each connection contributed to the mistake so that the network’s weights can be adjusted to reduce future errors. Sometimes colloquially referred to simply as “backprop” or “BP.”
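A minimal sketch of the idea in plain Python, using a single artificial neuron and made-up numbers (a real network repeats this across many layers and millions of weights):

```python
x, target = 2.0, 1.0          # one input feature and the output we want to learn
w, b = 0.2, 0.0               # the neuron's weight and bias (its parameters)
learning_rate = 0.05

for step in range(50):
    # Forward pass: make a prediction and measure the squared error.
    prediction = w * x + b
    error = (prediction - target) ** 2

    # Backward pass ("backprop"): work out how the error changes with respect
    # to each parameter, i.e. propagate the error signal back to its sources.
    grad_w = 2 * (prediction - target) * x
    grad_b = 2 * (prediction - target)

    # Nudge each parameter in the direction that shrinks the error.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w * x + b, 3))    # approaches the target value of 1.0
```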
Bayes’ theorem: Named after the 18th-century statistician Thomas Bayes, this theorem is a mathematical formula that can be used to determine what’s known as “conditional probability” – that is, the likelihood of a particular outcome given prior knowledge of related conditions or evidence.
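In its standard form, the theorem relates the probability of a hypothesis A given evidence B to the reverse probability:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

Here P(A) is the prior probability of A, P(B | A) is the likelihood of seeing the evidence if A is true, and P(A | B) is the revised (“posterior”) probability of A once the evidence is taken into account.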
Black box: A metaphor that’s invoked to describe a system whose inner workings are hidden and ultimately mysterious to the system’s creator (or creators). AI is sometimes described as a “black box” because models will often behave and evolve in ways that even the system’s programmers cannot fully understand or predict.
Central processing unit (CPU): The most important component of a digital computer. The CPU – sometimes referred to as the “brain” or the “control center” of a computer – executes program instructions, performs arithmetic (adding, subtracting, multiplying and dividing) and logic operations, and orchestrates the work of the system’s other components, including its memory and operating system. The CPU of modern computers is built upon a microprocessor.
Chatbot: An AI-based computer program that leverages natural language processing (NLP) to converse with users – for example, fielding customer service questions – through automated verbal or text-based responses that simulate human speech.
ChatGPT: An AI-powered chatbot launched by San Francisco-based startup OpenAI in November of 2022. ChatGPT uses NLP to simulate human conversation. According to OpenAI’s website, ChatGPT can “answer follow-up questions, admit its mistakes, challenge incorrect premises and reject inappropriate requests.”
Computer vision: A branch of AI that’s concerned with enabling machines to understand and respond to information derived from visual inputs – such as images and video – in a manner similar to that of the visual system in the human brain.
Convolutional neural network (CNN): A subset of artificial neural networks, commonly used in machine visual processing, which can enable an AI model to differentiate and analyze various components within an image.
Cooperative inverse reinforcement learning (CIRL): Coined by AI researchers Stuart Russell, Pieter Abbeel and others, cooperative inverse reinforcement learning (CIRL) is a hypothetical methodology for solving the so-called alignment problem, in which an AI model is designed to carry out an objective function that’s valuable to humans without knowing from the outset what that objective function is. Rather, the machine’s ability to perform the given task is enhanced through “behaviors such as active teaching, active learning and communicative actions that are more effective in achieving value alignment,” according to the paper from Russell and colleagues, who first defined CIRL.
Dall-E 2: A deep learning model developed by OpenAI and released in 2022, which generates images based on the input of text-based natural language prompts. Its predecessor is Dall-E. The name of both models is a play on both the name of the title character of the Pixar film Wall-E and the surname of the 20th-century surrealist painter Salvador Dalí.
The Dartmouth Summer Research Project on Artificial Intelligence: A conference – colloquially referred to as the Dartmouth Workshop – that began in mid-1956 at Dartmouth College and is widely considered to be the event that gave birth to AI as a field of research. The conference was organized by Marvin Minsky, John McCarthy, Nathaniel Rochester and Claude Shannon.
Deep Blue: An AI program developed by IBM, the sole purpose of which was to play chess. In 1997, it made history by becoming the first machine to beat a reigning world chess champion, Garry Kasparov, in a match played under standard tournament conditions.
Deep learning: A subset of machine learning based on the premise that models can become far more capable when they’re built from many-layered neural networks and provided with vast quantities of data. Deep learning requires neural networks of at least three layers; adding more layers can often – though not always – improve performance.
Deepfake: An AI-generated piece of media depicting a real person (or a real person’s voice, in the case of a deepfake audio clip). Deepfakes can be difficult to detect and are often made and spread around the internet in an effort to tarnish someone’s reputation or spread some kind of misinformation.
DeepMind: An artificial intelligence research laboratory based in London and founded in 2010 by Demis Hassabis, Shane Legg and Mustafa Suleyman. The company was acquired by Google in 2014 and is now a wholly-owned subsidiary under Alphabet Inc (Google’s parent company). DeepMind describes itself on its website as “a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.”
Decision tree: A flowchart-like illustration of the process of arriving at a decision, wherein each “branch” represents a particular course of action. Decision trees start at a “root node” (which consists of all the relevant data that’s being analyzed), branch off into “internal nodes” (also known as “decision nodes”) and then terminate in “leaf nodes” (also known as “terminal nodes,” which represent all the possible outcomes of a given decision-making process).
Here’s a simple example of a decision tree rooted in the question of whether or not you should go outside to play soccer:
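One way to express such a tree is as nested conditions in Python – the weather features and thresholds below are illustrative assumptions, not part of any real model:

```python
def should_play_soccer(outlook: str, humidity: float, windy: bool) -> bool:
    """Toy decision tree: each if/else is a decision node, each return a leaf node."""
    if outlook == "sunny":          # root node: what does the sky look like?
        return humidity <= 70       # play only if it isn't too humid
    elif outlook == "rainy":
        return not windy            # play only if it isn't windy
    else:                           # overcast
        return True                 # always play

print(should_play_soccer("sunny", humidity=65, windy=False))  # True
```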
Effective Accelerationism, or e/acc (pronounced “e-ack”): A subgroup of techno-optimists who ardently believe that the development of emerging technologies, including new AI models, should be allowed to proceed as quickly and as free from government restraint as possible.
Emergent abilities: Capabilities demonstrated by an AI model that it was never explicitly trained to have, which can arise suddenly as models grow in scale or detect new patterns in their training data. For example, a powerful large language model might develop the capacity to interpret complex literary metaphors despite never having been explicitly trained to do so.
Entropy: In the context of machine learning, “entropy” refers to the degree of randomness, disorder and unpredictability within a dataset that’s being processed by a machine learning system. More broadly, the concept of entropy is commonly associated with the second law of thermodynamics, which essentially holds that the degree of disorder or randomness within a closed system will never decrease over time – it can only remain constant or increase.
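In machine learning, the concept is usually quantified as Shannon entropy: for a variable that takes n possible values with probabilities p₁ through pₙ, the entropy H is

```latex
H = -\sum_{i=1}^{n} p_i \log_2 p_i
```

A dataset in which every example belongs to the same class has an entropy of zero, while one in which the classes are evenly mixed has the maximum possible entropy; decision-tree algorithms, for instance, often choose whichever split most reduces entropy.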
Embodied agent (or “interface agent”): An entity that interacts with, and is reciprocally responsive to, its environment through sensory-motor functions. A human being, a dog, a Boston Dynamics robot and a virtual avatar of a human are all examples of embodied agents.
Foom: An onomatopoeic word that’s supposed to represent the sound of an explosion, “foom” is used to describe a hypothetical scenario in which AI suddenly and irrevocably enters the realm of superintelligence and escapes human control.
Foundation model: A machine learning model trained on a vast corpus of unlabeled data and designed to carry out a wide variety of tasks (as opposed to a single, narrow task).
Game Theory: A branch of mathematics, formalized by mathematician John von Neumann and economist Oskar Morgenstern in 1944, concerned with the dynamic interaction between two or more rational agents seeking their own gains within a parameterized (rule-governed) framework. Game theory covers a broad set of games, including zero-sum and nonzero-sum games.
Generative adversarial network (GAN): A machine learning methodology wherein two neural networks compete with one another in a zero-sum game – that is, one network’s loss translates to the other’s gain, and vice versa. Both networks are provided with a dataset, and one network, called the “generator,” is essentially tasked with tricking the other – the “discriminator” – into believing that the new information it generates is part of the original dataset. For example, the generator might produce a new image of a human face based on many images of real human faces, at which point the discriminator will try to determine whether the new image is real or manufactured. This contest continues until the generator succeeds at tricking the discriminator with the majority (more than 50 percent) of its output. GANs were invented in 2014 by American computer scientist Ian Goodfellow, who has since been dubbed “The GANfather.”
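A heavily simplified sketch of one training step in Python with PyTorch – the tiny fully connected networks and the random “real” data here are placeholders, not a working image generator:

```python
import torch
import torch.nn as nn

# Placeholder networks: real GANs typically use much deeper, convolutional models.
generator = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64))
discriminator = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

real_batch = torch.randn(8, 64)    # stand-in for a batch of real examples
noise = torch.randn(8, 16)         # random input the generator transforms into fakes
fake_batch = generator(noise)

# Discriminator step: learn to label real data as 1 and generated data as 0.
d_loss = loss_fn(discriminator(real_batch), torch.ones(8, 1)) + \
         loss_fn(discriminator(fake_batch.detach()), torch.zeros(8, 1))
d_opt.zero_grad()
d_loss.backward()
d_opt.step()

# Generator step: try to make the discriminator label the fakes as real (1).
g_loss = loss_fn(discriminator(fake_batch), torch.ones(8, 1))
g_opt.zero_grad()
g_loss.backward()
g_opt.step()
```

A full implementation would repeat this step over many batches of real data, with the two losses pushing against each other until the discriminator can no longer reliably tell real from fake.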
GPT-3: Generative Pre-Trained Transformer 3 (GPT-3) is a large language model developed by OpenAI and released in 2020. A fine-tuned version of the model underpins the viral chatbot ChatGPT, and it is able to generate natural-language text responses based on text-based prompts.
Graphics processing unit (GPU): A GPU – a term popularized by the marketing efforts of US chip manufacturing company Nvidia beginning in 1999 – is an integrated circuit that was first used to enhance video game graphics but has since become foundational to the training of neural networks. In contrast to a CPU, a GPU can handle enormous numbers of simple computational tasks very quickly by breaking them down into smaller sub-tasks, which then get processed simultaneously.
Hallucination: In an AI context, the term “hallucination” refers to output from an AI model that is factually wrong or unsupported by its training data, yet delivered with apparent confidence. A hallucinating AI-powered chatbot, for example, might confidently and falsely insist that there are around 5.7tn stars in the Milky Way galaxy.
Human-in-the-loop (HITL): A methodology deployed in some machine learning models in which at least one human programmer provides feedback to the model (during testing or training) to improve the model’s performance. Ideally, HITL results in a positive feedback loop that enhances the intelligence of both machines and humans.
Hyperparameter: A configuration setting – such as the learning rate or the number of layers in a neural network – established by a human programmer before training begins, which governs how the model learns the parameters that it establishes and hones by itself during the training process.
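A small sketch of the distinction in Python with scikit-learn: the constructor arguments are hyperparameters chosen by the programmer, while the values learned during fit() are the model’s own parameters (the data here is a made-up toy example):

```python
from sklearn.neural_network import MLPClassifier

# Hyperparameters: chosen by the programmer before training begins.
model = MLPClassifier(hidden_layer_sizes=(16,), learning_rate_init=0.01, max_iter=2000)

# Parameters: the weights the model tunes by itself during training.
model.fit([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])

print(model.coefs_[0].shape)   # (1, 16) - a learned weight matrix, not set by hand
```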
Instrumental Convergence: A theoretical phenomenon in which all sufficiently intelligent agents – both biological and nonbiological – will ultimately identify and work towards the same instrumental goals (also called sub-goals) in pursuit of their final goals (also called absolute goals). In a hypothetical dystopian example, an AI model that’s been programmed with the final goal of removing excess carbon from the atmosphere and another that’s been given the final goal of eliminating the Covid-19 virus from the planet might determine that the best course of action is to wipe out humanity since humans are putting huge amounts of carbon into the atmosphere and since Covid-19 is mutating inside the bodies of humans. The instrumental goals of these two models would thereby converge, even though their final goals aren’t the same.
Inverse reinforcement learning (IRL): First described in a paper by AI researchers Stuart Russell and Andrew Ng, inverse reinforcement learning (IRL) is a methodology that, to put it simply, seeks to design machines that are capable of determining an agent’s (usually a human’s) goals and rewards by analyzing its behavior. An example that’s commonly invoked to describe the benefits of IRL is autonomous vehicles (AVs): Rather than trying to train an AV with every conceivable situation that it might encounter on the road – which is virtually impossible, given the fact that such possibilities are combinatorially explosive and, therefore, limitless – IRL could be leveraged to generate a comprehensive dataset of human driving behaviors and then instruct the algorithm to infer the correct course of action through a given scenario by following patterns that it detects from within that dataset.
The King Midas problem: Alluding to the Greek myth of King Midas – who had the power to turn everything he touched into solid gold, including his food and his daughter – this problem poses a crucial question related to AI research and Stuart Russell’s famous “alignment problem”: How can we be sure that an intelligent machine’s objective function is actually one which will serve the long-term best interests of human beings? In a famous thought experiment first posed by the philosopher Nick Bostrom, for example, we can imagine an AI system whose sole purpose is to create paperclips. At some point, the AI decides to eliminate human beings because they could potentially interfere with its mission and also because it figures that the atoms in their bodies would be put to better use as raw materials for the manufacture of more paperclips. Eventually, it expands and embarks on a mindless mission to turn the entire cosmos into paperclips. The point is that we as human beings don’t always have a firm grasp on what it is that we ultimately want – and very often what we think we want ends up being bad for us.
Machine learning: A subdiscipline of artificial intelligence that, using statistical formulas and data, enables computers to progressively improve their ability to carry out a particular task or set of tasks. Crucially, a computer leveraging machine learning does not need to be explicitly programmed to improve its performance in a particular manner – rather, it’s given access to data and is designed to “teach” itself. The results are often surprising to their human creators.
Machine translation (MT): An automated process that leverages AI to translate text or speech from one language into another.
Meta-learning: In the context of AI, “meta-learning” (also sometimes described as “learning to learn”) refers to a model’s capacity to improve its ability to learn over time. Humans are also meta-learners because we’re able to deploy a variety of strategies – such as watching and emulating others – which can gradually turn us into more effective learners and which can improve our overall ability to navigate through the world.
Microprocessor: A CPU for digital computing systems contained within a single integrated circuit (also known as a microchip, hence the prefix in the word “microprocessor”) or a small grouping of integrated circuits. Intel introduced the world’s first microprocessor, dubbed the 4004, in 1971.
Midjourney: A research lab that launched a text-to-image AI model by the same name in open beta in 2022.
Model: In the context of AI and ML, a model is an algorithm trained to detect patterns and make predictions based on a particular dataset.
Model Collapse: The gradual degradation of a generative AI model’s performance – typically when it is trained on data produced by other generative models – as its outputs start to follow certain familiar patterns and become less varied, thereby increasing the likelihood that the model will ignore outlying but potentially significant data points that were included in its original training dataset.
Model drift: The tendency for the performance of an AI model to degrade over time as its external environment changes, thereby also causing the relationship between input and output variables to change.
Moore’s Law: A principle based on an observation usually attributed to former Intel CEO Gordon Moore, which holds that the number of transistors that can be contained within an integrated circuit (ie, a microchip) doubles roughly every two years.
Moravec’s Paradox: Named after the computer scientist Hans Moravec, this paradox refers to the fact that machines are able to easily carry out functions that are difficult for most human beings – such as performing complex mathematical calculations – and yet they struggle to do things – like perform basic motor tasks or read social cues – that most human beings are able to do with little to no effort.
Natural language processing (NLP): A branch of artificial intelligence – that also blends elements of linguistics and computer science – aimed at enabling computers to understand verbal and written language in a manner that imitates the human brain’s language-processing capability.
Objective function: A mathematical formula that’s used to measure how well an AI model’s predictions match reality and, thereby, its overall ability to carry out a given task. An objective function can be likened to a numerical score that shows an AI model how effectively it’s performing: when the objective is framed as a reward, a high score conveys that the model is on the right track, while when it’s framed as a loss or error, the model’s goal is to drive the score as low as possible.
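One of the most common objective functions is mean squared error, a minimal version of which might look like this in Python (here framed as a loss to minimize rather than a score to maximize):

```python
def mean_squared_error(predictions, targets):
    """Objective (loss) function: the average of the squared prediction errors."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# The closer the predictions are to the true values, the lower the loss.
print(mean_squared_error([2.5, 0.0, 2.1], [3.0, -0.5, 2.0]))  # 0.17
```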
OpenAI: A non-profit AI research lab founded in 2015 by Sam Altman, Elon Musk and others. As its name suggests, the original foundational goal of OpenAI was to collaborate with other organizations in the field of AI and to open-source its research. In 2019, the organization launched a “capped profit” subsidiary called OpenAI Limited Partnership (OpenAI LP). (Musk has lamented this decision on Twitter.)
Parameter: A variable within the process of training an AI model that can be adjusted by the model in order to hone its ability to produce a particular output using a given dataset.
Pattern recognition: An automated process whereby a computer is able to identify patterns within a set of data.
P(doom): Shorthand for the probability (“P”) of an apocalypse being brought about by powerful AI systems. The higher a person’s p(doom) – typically expressed as a percentage from zero to 100 – the more likely they believe it is that AI will annihilate the human race.
Prior probability (also sometimes referred to simply as a prior): A term used in the field of Bayesian statistics to refer to the assigned likelihood of an event before additional information is taken into account; once that information necessitates a revision, the updated likelihood is known as the “posterior” probability.
Product of experts (PoE): A machine learning method first postulated by Geoffrey Hinton in 1999 that combines several relatively simple probability distributions – or “experts” – into a single distribution.
Red team: A group of professionals employed by an organization who attempt to break through that same organization’s physical or digital defenses so that vulnerabilities might be identified and strengthened. In AI, red-teaming typically involves probing a model for harmful, biased or otherwise unsafe outputs before it is released.
Reinforcement learning (RL): The process of teaching machine learning models to make optimal decisions within a dynamic environment. When using RL, a programmer will often present a machine learning model with a game-like situation in which one outcome is preferable to others. The machine then proceeds to experiment with different strategies and the programmer will “reinforce” the desired behavior with rewards and discourage other behaviors through penalties.
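A minimal sketch of one classic RL algorithm, tabular Q-learning, in Python – the “game” here is a made-up five-square corridor in which the agent earns a reward for reaching the right-hand end:

```python
import random

n_states, actions = 5, [-1, +1]        # positions 0-4; the agent can move left or right
q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount factor, exploration rate

for episode in range(200):
    state = 0
    while state != n_states - 1:       # each episode ends at the rewarded square
        if random.random() < epsilon:
            action = random.choice(actions)                      # explore
        else:
            action = max(actions, key=lambda a: q[(state, a)])   # exploit what's been learned
        next_state = min(max(state + action, 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0      # the "reinforcement"
        # Nudge the value estimate for this state-action pair toward reward + future value.
        best_next = max(q[(next_state, a)] for a in actions)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state

print(max(actions, key=lambda a: q[(0, a)]))  # the learned first move: 1 (head right)
```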
Reinforcement learning from human feedback (RLHF), or reinforcement learning from human preferences: Like traditional reinforcement learning (RL), reinforcement learning from human feedback (RLHF) aims to fine-tune the performance (or the “policy”) of a machine learning model by rewarding (ie, reinforcing) certain actions. Unlike RL, however, RLHF involves human beings in the process of choosing which of the model’s actions should be rewarded. OpenAI’s ChatGPT, for example, was trained using RLHF.
Retrieval-Augmented Generation (RAG): A technique that expands the reliability and capabilities of LLMs by enabling them to retrieve relevant information from sources outside of their training datasets.
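A toy sketch of the retrieve-then-generate flow in Python: the call_llm function below is a hypothetical stand-in for whatever language model the system actually uses, and the keyword-overlap scoring is deliberately naive (real systems typically rely on vector embeddings):

```python
documents = [
    "The Eiffel Tower is 330 metres tall.",
    "Mount Everest is the highest mountain above sea level.",
    "Python was created by Guido van Rossum.",
]

def retrieve(query, docs, k=1):
    """Naive retrieval: rank documents by how many words they share with the query."""
    def score(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(docs, key=score, reverse=True)[:k]

def call_llm(prompt):
    """Hypothetical stand-in for a call to a real large language model."""
    return f"(model response to: {prompt!r})"

query = "How tall is the Eiffel Tower?"
context = retrieve(query, documents)   # fetch relevant outside information first...
print(call_llm(f"Using this context: {context}\n\nAnswer the question: {query}"))  # ...then generate
```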
Self-supervised learning: A branch of machine learning wherein an AI model is provided with unlabeled data and is allowed to label the data according to its own pattern recognition capabilities. A self-supervised algorithm will then use those initial labels as it continues to interpret subsequent iterations of data input.
Semi-supervised learning: A branch of machine learning which, as the name suggests, blends elements of both supervised learning and unsupervised learning. Semi-supervised learning is based on the input of some labeled data and a higher quantity of unlabeled data, the goal being to teach an algorithm to categorize the latter into predetermined categories based on the former and also to allow the algorithm to identify new patterns across the dataset. It is widely considered to be a kind of bridge, connecting the benefits of supervised learning with those of unsupervised learning.
Supervised learning: A branch of machine learning based on the input of clearly labeled data and aimed at training algorithms to recognize patterns and accurately label new data.
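A minimal supervised-learning sketch in Python with scikit-learn, using a tiny made-up dataset of labeled fruit:

```python
from sklearn.tree import DecisionTreeClassifier

# Labeled training data: [weight in grams, skin smoothness on a 0-10 scale] -> fruit label.
features = [[150, 8], [170, 9], [140, 2], [130, 3]]
labels = ["apple", "apple", "orange", "orange"]

model = DecisionTreeClassifier()
model.fit(features, labels)             # learn patterns from the clearly labeled examples

print(model.predict([[160, 7]]))        # ['apple'] - a predicted label for new, unseen data
```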
Stochastic: A mathematical term referring to a system’s tendency to produce results that are unpredictable. (Roughly synonymous with “probabilistic,” “indeterminable” and “random.”) Many AI algorithms are programmed to incorporate some degree of randomness into their learning processes and are, therefore, described as stochastic. The results of a deterministic system, in contrast, can reliably be predicted beforehand.
Temperature: In a generative AI context, “temperature” refers to a setting that controls how much randomness a model introduces when choosing its outputs. The higher the temperature (zero being the lowest value), the more varied and unpredictable the outputs – and the more likely the model is to hallucinate, in other words, veer unpredictably from its training data. Lower temperatures make the model’s responses more predictable and repetitive.
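A rough sketch of the mechanism in Python with NumPy: the model’s raw scores (“logits”) for each candidate next word are divided by the temperature before being converted into probabilities, so higher temperatures flatten the distribution and give unlikely words a better chance of being sampled:

```python
import numpy as np

def next_word_probabilities(logits, temperature):
    """Convert raw model scores into a probability distribution, scaled by temperature."""
    scaled = np.array(logits) / temperature
    exp = np.exp(scaled - scaled.max())    # subtract the max for numerical stability
    return exp / exp.sum()

logits = [4.0, 2.0, 1.0]                   # scores for three candidate next words
print(next_word_probabilities(logits, temperature=0.5))  # ~[0.98, 0.02, 0.00] - near-deterministic
print(next_word_probabilities(logits, temperature=2.0))  # ~[0.63, 0.23, 0.14] - more varied
```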
TensorFlow: An open-source platform, developed by Google, designed for the management of machine learning and AI systems.
Transformer: First defined in a 2017 paper from Google Brain researchers, a transformer is a neural network that learns to identify and understand the contextual relationships within datasets, thus enabling it to produce new, cogent outputs. ChatGPT, for example, is based on a transformer model that specializes in understanding the contextual relationships between individual words. Through such a contextual understanding, it’s able to predict which word should logically follow those that preceded it. (“GPT” stands for “generative pre-trained transformer.”)
Turing test: A blinded experiment – invented by and named after 20th-century mathematician Alan Turing – where a human subject interacts with an artificially intelligent machine and asks it a series of questions. If the human interlocutor is unable to say definitively whether the responses are being generated by a human or an AI, the latter has passed the Turing Test.
Uncanny valley: A theoretical concept first postulated by roboticist Masahiro Mori in 1970 that refers to an eerie, uncanny quality that will be perceived by a human being interacting with an artificial entity that closely (though imperfectly) resembles another human.
Unsupervised learning: A branch of machine learning which is based upon the input of unlabeled data. In contrast to supervised learning, unsupervised learning allows an algorithm to create its own rules for identifying patterns and categorizing data.
Value alignment problem: Coined by computer scientist Stuart Russell, the phrase “value alignment problem” – or simply “alignment problem” – refers to the difficulties that come with ensuring that intelligent machines share the same values and goals as their human programmers. This problem has spawned a subfield of AI and machine learning called “alignment research.”
Video joint embedding predictive architecture (V-JEPA): An AI model released by Meta in early 2024 that is trained to predict missing, or “masked,” sections of video clips. The inspiration was drawn from the way in which learning is believed to take place in organic brains: “As humans, much of what we learn about the world around us – particularly in our early stages of life – is gleaned through observation,” Meta wrote in a blog post describing V-JEPA. “Take Newton’s third law of motion: Even an infant (or a cat) can intuit, after knocking several items off a table and observing the results, that what goes up must come down. You don’t need hours of instruction or to read thousands of books to arrive at that result. Your internal world model – a contextual understanding based on a mental model of the world – predicts these consequences for you, and it’s highly efficient.”
Weights: Parameters that neural networks autonomously learn to optimize in order to more accurately detect patterns in datasets and make predictions. Weights are the artificial counterparts of synapses, which are responsible for the transmission of electrochemical signals between neurons in organic brains.
For more on the latest happenings in AI, web3 and other cutting-edge technologies, sign up for The Emerging Tech Briefing newsletter.