What is a Foundation Model in AI? (2026)

TLDR

A foundation model is a large AI model trained on broad data that can be adapted for many different tasks. GPT-4, Claude, and Gemini are all foundation models: trained once at massive scale, then used (or fine-tuned) for thousands of specific applications.

The term "foundation model" was coined in 2021 by researchers at Stanford's Center for Research on Foundation Models to describe a new paradigm in AI. Instead of building a separate AI model for each specific task (one for translation, one for question answering, one for coding), researchers found that training one very large model on a broad range of data produced something more versatile: a model that could handle many tasks without being explicitly trained for each one.

Foundation models are trained on massive datasets, often hundreds of billions to trillions of tokens of text and code and, for some models, images, audio, and other data. During training, the model learns statistical patterns in that data, which capture facts, linguistic structure, and some reasoning strategies. Once trained, the same model can be used directly for many tasks or fine-tuned on a smaller, task-specific dataset to specialize it.
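The train-once, then fine-tune workflow can be sketched at toy scale. The example below is only an illustration under loud assumptions: a word-pair counter stands in for a neural network, and the class name BigramModel, both corpora, and the weight parameter are hypothetical. Real foundation models learn by gradient descent over billions of parameters, not by counting.

```python
from collections import Counter, defaultdict


class BigramModel:
    """Toy next-word model (hypothetical): counts which word follows which."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, corpus, weight=1):
        # Add (weighted) counts for each adjacent word pair in each sentence.
        for sentence in corpus:
            words = sentence.lower().split()
            for prev, nxt in zip(words, words[1:]):
                self.counts[prev][nxt] += weight

    def predict(self, word):
        # Return the most frequent follower of `word`, or None if unseen.
        followers = self.counts.get(word.lower())
        if not followers:
            return None
        return followers.most_common(1)[0][0]


# Hypothetical "broad" corpus standing in for pretraining data.
broad_corpus = [
    "the contract was signed today",
    "the weather was sunny today",
    "the weather was cold yesterday",
    "please review the report",
]

# Hypothetical small domain corpus standing in for fine-tuning data.
support_corpus = [
    "please reset your password",
    "please reset your router",
]

model = BigramModel()
model.train(broad_corpus)               # "pretraining" on broad data
before = model.predict("please")        # behavior learned from broad data only

model.train(support_corpus, weight=5)   # "fine-tuning": same model, further updates
after = model.predict("please")

print(before, "->", after)
print(model.predict("weather"))         # patterns from the broad corpus remain
```

The point of the sketch is the shape of the workflow: one model object is trained on broad data, then the same object is updated with a much smaller domain dataset, which shifts its behavior in that domain while what it learned earlier stays available.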

GPT-4, Claude 3, and Gemini Ultra are all foundation models. They were each trained on enormous datasets and can handle writing, coding, math, reasoning, summarization, translation, and much more without being retrained for each task. This is fundamentally different from older AI systems that were narrow and task-specific.

The "foundation" metaphor is intentional. Like a building foundation, these models provide a stable base on which many different applications can be built. A company might take a foundation model and fine-tune it on its customer support data to create a specialized support bot, or on legal documents to create a contract analysis tool. The foundation supplies general capabilities; fine-tuning adds specialization.

In practice

GPT-4 as a foundation

OpenAI trained GPT-4 as a foundation model. People can use it directly through ChatGPT, and developers can build on it through the OpenAI API, using prompting or, for the model versions where OpenAI offers it, fine-tuning on their own data, to create specialized applications for industries like healthcare, law, or education.

Image foundation models

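Models like CLIP and Stable Diffusion are image foundation models. They learn visual concepts from hundreds of millions to billions of image-text pairs. CLIP learns to match images with text descriptions, and Stable Diffusion generates images from text prompts; both can then be adapted for specific tasks such as generating product photos or analyzing medical images.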

Multimodal foundation models

GPT-4o and Gemini are multimodal foundation models, trained on text, images, and audio together (and, in Gemini's case, video). A single model can work across these modalities without needing separate specialist models for each.

Frequently asked questions

Is a foundation model the same as a large language model?

Not exactly. Large language models (LLMs) are foundation models that work primarily with text. "Foundation model" is the broader category and also includes models that handle images, audio, and video.

Who makes foundation models?

Currently, foundation models are primarily developed by large companies and research labs, including OpenAI (GPT series), Anthropic (Claude), Google (Gemini), Meta (Llama), and Mistral AI (Mistral models). Training foundation models requires enormous compute resources.

Can I train my own foundation model?

Training a frontier-scale foundation model from scratch requires massive datasets and, by public estimates, tens to hundreds of millions of dollars of compute. It is not feasible for individuals or most companies. What is feasible is fine-tuning an existing foundation model on your own data, which requires far less compute.
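To get a feel for the scale, here is a rough back-of-the-envelope sketch. It uses a widely cited rule of thumb for dense transformer models (training compute is about 6 FLOPs per parameter per training token). Every number in it is an illustrative assumption, not a figure for any real model, and the function names are hypothetical.

```python
import math


def training_flops(n_params, n_tokens):
    # Rule of thumb for dense transformers: about 6 FLOPs per parameter per token.
    return 6 * n_params * n_tokens


def accelerator_hours(total_flops, peak_flops_per_sec, utilization):
    # Effective throughput is peak throughput times the fraction actually achieved.
    effective = peak_flops_per_sec * utilization
    return total_flops / effective / 3600


# All values below are illustrative assumptions, not data about a real model.
N_PARAMS = 70e9      # assumed model size: 70 billion parameters
N_TOKENS = 1.4e12    # assumed dataset size: 1.4 trillion tokens
PEAK = 1e15          # assumed accelerator peak: 1 PFLOP/s
UTILIZATION = 0.4    # assumed fraction of peak achieved in practice

flops = training_flops(N_PARAMS, N_TOKENS)
hours = accelerator_hours(flops, PEAK, UTILIZATION)
print(f"{flops:.2e} FLOPs, about {hours:,.0f} accelerator-hours")
```

Under these assumptions a single training run needs roughly 5.9 × 10^23 FLOPs, or about 400,000 accelerator-hours. That covers one pass only, before counting experiments, failed runs, data preparation, or staff, and frontier models are considerably larger than this example.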

What is the difference between a foundation model and a chatbot?

A foundation model is the underlying AI system. A chatbot is one application built on top of a foundation model. ChatGPT is a chatbot; GPT-4 is one of the foundation models it has run on.
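The split between model and application can be sketched in a few lines. This is a minimal illustration, not how any real product is built: echo_model is a hypothetical stub standing in for a foundation model, and the Chatbot class and its prompt format are assumptions made for the example.

```python
from typing import Callable, List, Tuple


def echo_model(prompt: str) -> str:
    # Hypothetical stand-in for a foundation model. A real model would
    # generate a continuation; this stub just repeats the last user line.
    user_lines = [line for line in prompt.splitlines() if line.startswith("User: ")]
    return "You said: " + user_lines[-1][len("User: "):]


class Chatbot:
    """Application layer: keeps conversation history and formats it into a prompt."""

    def __init__(self, model: Callable[[str], str], system_prompt: str):
        self.model = model
        self.system_prompt = system_prompt
        self.history: List[Tuple[str, str]] = []

    def build_prompt(self) -> str:
        # The model only ever sees one text prompt; the chatbot assembles it.
        lines = [self.system_prompt]
        lines += [f"{role}: {text}" for role, text in self.history]
        lines.append("Assistant:")
        return "\n".join(lines)

    def send(self, user_message: str) -> str:
        self.history.append(("User", user_message))
        reply = self.model(self.build_prompt())
        self.history.append(("Assistant", reply))
        return reply


bot = Chatbot(echo_model, "You are a helpful support assistant.")
print(bot.send("hello"))
print(bot.send("reset my password"))
```

The model here is just a function from text to text. Everything that makes it feel like a chatbot (the system prompt, the turn-by-turn history, the formatting) lives in the application wrapped around it, and the same model function could sit behind a completely different application.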

