Recent Generative AI Models

This page was last updated on 2025-09-13.

New Table

Table 1 - Models

Model

Company

Date

Params

Paper

Source

Website

Weights

Remarks

Robotic Transformer 1

Google DeepMind

2022-12-13

link

link

Einstein GPT

Salesforce

2023-03-07

link

uses OpenAI API?

🧨 Stable UnCLIP 2.1

Stability AI

2023-03-24

link

link

link

model behind Reimagine

LLaVA

University of Wisconsin-Madison

2023-04-17

link

link

link

link

LLaVA = Large Language and Vision Assistant

WizardLM

Microsoft

2023-04-24

7B, 13B, 30B, 70B

link

link

link

based on LLaMA

Eleven Multilingual v1

ElevenLabs

2023-04-27

link

link

English, French, German, Hindi, Italian, Polish, Portuguese, Spanish

PaLM 2

Google

2023-05-10

link

link

LIMA

Meta AI

2023-05-18

65B

link

based on LLaMA

🔈 Massive Multilingual Speech

Meta AI

2023-05-22

300M, 1B

link

link

link

link

Falcon

TII.AE

2023-05-26

1B, 7B, 40B

coming soon

link

AlphaDev

Google DeepMind

2023-06-07

link

link

🔈 StyleTTS 2

Columbia University

2023-07-13

link

link

link

link

WizardCoder

Microsoft

2023-06-14

15B

link

link

link

Llama 2

Meta AI

2023-07-18

7B, 13B, 70B

link

link

link link2

link

Meta-Transformer

2023-07-20

85M, 302M

link

link

link

link

12 modalities

Stable Beluga 2

Stability AI

2023-07-21

70B

link

link

based on Llama 2

🧨 Stable Diffusion XL 1.0

Stability AI

2023-07-26

3.5B

link

link

link

base refiner

Robotic Transformer 2

Google DeepMind

2023-07-28

link

link

StableCode

Stability AI

2023-08-08

3B

link

base instruct

🔈 AudioSep - Separate Anything You Describe

Audio-AGI

2023-08-09

link

link

link

link

🔈 AudioLDM2

ByteDance

2023-08-10

link

link

link

link

🔈 Eleven Multilingual v2

ElevenLabs

2023-08-22

link

link

English, French, German, Hindi, Italian, Polish, Portuguese, Spanish, Chinese, Korean, Dutch, Turkish, Swedish, Indonesian, Filipino, Japanese, Ukrainian, Greek, Czech, Finnish, Romanian, Danish, Bulgarian, Malay, Slovak, Croatian, Classic Arabic, Tamil

SeamlessM4T

Meta AI

2023-08-22

1.2B, 2.3B

link

link

link

link

Code Llama

Meta AI

2023-08-24

7B, 13B, 34B

link

link

link

link

Nougat OCR

Meta AI

2023-08-25

link

link

link

link

Specialized in academic documents

Falcon 180B

TII

2023-09-06

180B

coming soon

link

link

see also: falcon-40b

Persimmon

Adept

2023-09-07

8B

link

link

link

🔈 StableAudio

Stability AI

2023-09-13

link

🧨 DALL-E 3

OpenAI

2023-09-21

link

📽️ LaVie

Shanghai Artificial Intelligence Laboratory

2023-09-26

link

link

link

link

Mistral-7B

Mistral AI

2023-09-27

7B

link

link

link

Qwen

Alibaba

2023-09-28

7B, 14B

link

link

link

LLaVA 1.5

University of Wisconsin-Madison

2023-10-05

link

link

link

link

jina-embeddings-v2

Jina AI

2023-10-25

link

link

link

Yi

01.ai

2023-11-02

6B, 34B

link

link

📽️ Emu Video

Meta AI

2023-11-16

link

link

📽️ Stable Video Diffusion

Stability AI

2023-11-21

link

link

link

link

Meditron

École Polytechnique Fédérale de Lausanne (EPFL)

2023-11-27

7B, 70B

link

link

link

🧨 SDXL Turbo

Stability AI

2023-11-28

link

link

link

link

📽️ Animate Anyone

Alibaba

2023-11-28

link

link

link

Seamless

Meta AI

2023-11-30

link

link

link

link

OpenVoice

MyShell.ai

2023-12-03

7B, 13B, 34B, 70B

link

link

link

Gemini

Google DeepMind

2023-12-06

link

link

Nano / Pro / Ultra; Pro will power Bard

AlphaCode 2

Google DeepMind

2023-12-06

link

link

Stable LM Zephyr 3B

Stability AI

2023-12-07

3B

link

link

Mixtral 8x7B

Mistral AI

2023-12-11

45B

link

link

link

🧨 Imagen 2

Google DeepMind

2023-12-13

link

Stable Code 3B

Stability AI

2024-01-16

3B

link

link

Stable LM 2

Stability AI

2024-01-19

1.6B

link

link

Eagle 7B

RWKV

2024-01-29

7B

link

link

RWKV-v5 architecture

Code Llama 70B

Meta AI

2024-01-29

7B, 13B, 34B, 70B

link

link

link

link

MGIE

Apple

2024-02-05

link

link

link

link

Sora

OpenAI

2024-02-15

link

link

Gemma

Google

2024-02-21

2B, 7B

link

link

link

link 1 link 2

🧨 Stable Diffusion 3

Stability AI

2024-02-22 (preview)

0.8B, ..., 8B

link

Old Table

Table 2 - Models (old)

Model

Company

Date

Base Model

Parameters

Training Data Size

Training Time

Context length

Paper

Source

Website

Training data

Code License

Weights License

Type

Model weights

Instruction Tuning

RLHF

Remarks

Deep Blue

IBM

1996-01-01

from scratch

N/A

https://www.sciencedirect.com/science/article/pii/S0004370201001291

N/A

https://www.ibm.com/ibm/history/ibm100/us/en/icons/deepblue/

games

Chess

Watson

IBM

2011-01-01

from scratch

N/A

https://doi.org/10.1609/aimag.v31i3.2303

N/A

games

Jeopardy

AlexNet

A. Krizhevsky, G. Hinton

2012-09-30

from scratch

60M

https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf

https://github.com/dansuh17/alexnet-pytorch (clone)

vision

won the ImageNet LSVRC 2012 challenge with a 15.3% top-5 error rate

word2vec

Google

2013-01-16

https://arxiv.org/abs/1301.3781

no

no

Inception v1

Google

2014-09-17

from scratch

https://arxiv.org/abs/1409.4842

https://github.com/google/deepdream

vision

won the ImageNet LSVRC 2014 challenge with a 6.7% top-5 error rate

DQN

Google DeepMind

2015-02-25

from scratch

https://www.nature.com/articles/nature14236

https://github.com/deepmind/dqn

deep RL

char-rnn

Andrej Karpathy

2015-05-21

from scratch

https://karpathy.github.io/2015/05/21/rnn-effectiveness/

https://github.com/karpathy/char-rnn

language

Featured on https://www.aiweirdness.com/

GloVe

Stanford

2015-09-01

https://nlp.stanford.edu/pubs/glove.pdf

https://github.com/stanfordnlp/GloVe

https://nlp.stanford.edu/projects/glove/

Apache 2.0

Apache 2.0

yes

no

no

fastText

Facebook

2015-11-09

https://arxiv.org/abs/1607.04606

https://github.com/facebookresearch/fastText

https://fasttext.cc/

MIT

MIT

yes

no

no

Inception v3

Google

2015-12-02

https://arxiv.org/abs/1512.00567

vision

https://huggingface.co/timm/inception_v3.tv_in1k

ResNet

Microsoft

2015-12-10

from scratch

https://arxiv.org/abs/1512.03385

vision

won the ImageNet LSVRC 2015 challenge with a 3.57% top-5 error rate; "better than humans"

AlphaGo

Google DeepMind

2016-01-27

from scratch

https://www.nature.com/articles/nature16961

games

Inception v4

Google

2016-02-23

vision

https://huggingface.co/timm/inception_v4.tf_in1k

Tay

Microsoft

2016-03-23

N/A

N/A

https://blogs.microsoft.com/blog/2016/03/25/learning-tays-introduction/

chatbot

CycleGAN

UC Berkeley

2017-03-30

https://arxiv.org/abs/1703.10593

https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix

GAN

yes

AlphaGo Zero

Google DeepMind

2017-10-19

https://www.nature.com/articles/nature24270

games

AlphaZero

Google DeepMind

2017-12-05

https://arxiv.org/abs/1712.01815

games

ELMo (Embeddings from Language Models)

Allen Institute for AI

2018-02-15

180M

https://arxiv.org/abs/1802.05365

language

yes

GPT (Generative Pre-trained Transformer)

OpenAI

2018-06-11

from scratch

117M

https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf

https://github.com/openai/finetune-transformer-lm

transformer

yes

no

no

BERT (Bidirectional Encoder Representations from Transformers)

Google

2018-10-11

108M, 334M

https://arxiv.org/abs/1810.04805

https://github.com/google-research/bert

transformer

yes

StyleGAN

Nvidia

2018-12-12

https://arxiv.org/abs/1812.04948

https://github.com/NVlabs/stylegan

GAN

yes

https://thispersondoesnotexist.com

GPT2

OpenAI

2019-02-14

1.5B

https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf

https://github.com/openai/gpt-2

transformer

yes

no

no

XLNet

CMU & Google

2019-06-19

117M, 360M

https://arxiv.org/abs/1906.08237

https://github.com/zihangdai/xlnet

Apache 2.0

yes

RoBERTa

Meta AI

2019-07-26

BERT

354M

https://arxiv.org/abs/1907.11692

transformer

yes

ALBERT (A Lite BERT)

Google

2019-09-26

BERT

12M, 18M, 60M, 235M

https://arxiv.org/abs/1909.11942

https://github.com/google-research/ALBERT

Apache 2.0

transformer

yes

DistilBERT

HuggingFace

2019-10-02

BERT

66M

https://arxiv.org/abs/1910.01108

https://github.com/huggingface/transformers

Apache 2.0

transformer

yes

Text-to-Text Transfer Transformer (T5)

Google

2019-10-23

from scratch

11B

1T tokens

https://arxiv.org/abs/1910.10683

https://github.com/google-research/text-to-text-transfer-transformer

https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html

Apache 2.0

Apache 2.0

transformer

yes

no

no

AlphaFold

Google DeepMind

2020-01-15

from scratch

https://www.nature.com/articles/s41586-019-1923-7

https://github.com/deepmind/deepmind-research/tree/master/alphafold_casp13

yes

Turing NLG

Microsoft

2020-02-13

17B

N/A

https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/

ELECTRA

Stanford & Google

2020-03-23

BERT?

14M, 110M, 335M

https://arxiv.org/abs/2003.10555

yes

DeBERTa

Microsoft

2020-06-05

BERT

https://arxiv.org/abs/2006.03654

https://github.com/microsoft/DeBERTa

MIT

transformer

yes

GPT3

OpenAI

2020-06-11

from scratch

175B

300B tokens

https://arxiv.org/abs/2005.14165

/

private

private

transformer

no

no

no

ImageGPT

OpenAI

2020-06-17

https://cdn.openai.com/papers/Generative_Pretraining_from_Pixels_V2.pdf

https://github.com/openai/image-gpt

private

private

transformer

no

mT5

Google

2020-10-22

from scratch

300M - 13B

1T tokens

https://arxiv.org/abs/2010.11934

https://github.com/google-research/multilingual-t5

mC4

Apache 2.0

Apache 2.0

transformer

https://huggingface.co/google/mt5-base

DALL-E

OpenAI

2021-01-05

GPT-3

12B

https://arxiv.org/abs/2102.12092

private

private

transformer

no

DeBERTa V2

Microsoft

2021-02-03

900M - 1.5B

N/A

transformer

yes

CLIP

OpenAI

2021-02-26

https://arxiv.org/abs/2103.00020

https://github.com/OpenAI/CLIP

https://openai.com/research/clip

MIT

yes

GLM

Tsinghua University

2021-03-18

110M - 10B

https://arxiv.org/abs/2103.10360

https://github.com/THUDM/GLM

transformer

yes

GPT-Neo

EleutherAI

2021-03-21

125M, 1.3B, 2.7B

N/A

https://github.com/EleutherAI/gpt-neo

https://www.eleuther.ai/artifacts/gpt-neo

MIT

transformer

https://huggingface.co/EleutherAI/gpt-neo-1.3B

LaMDA

Google

2021-05-18

from scratch

137B

2.8T tokens

58d

https://arxiv.org/abs/2201.08239

N/A

N/A

transformer

no

GPT-J

EleutherAI

2021-06-09

6B

https://github.com/kingoflolz/mesh-transformer-jax

yes

Apache 2.0

Apache 2.0

transformer

https://huggingface.co/EleutherAI/gpt-j-6b

no

no

CPM-2

Tsinghua University

2021-06-20

11B

https://arxiv.org/abs/2106.10715

https://github.com/TsinghuaAI/CPM

yes

Copilot

GitHub

2021-06-29

OpenAI Codex

N/A

N/A

code

no

ERNIE 3.0

Baidu

2021-07-05

10B

375B tokens

https://arxiv.org/abs/2107.02137

N/A

http://research.baidu.com/Blog/index-view?id=160

N/A

N/A

transformer

no

AlphaFold 2

Google DeepMind

2021-07-15

21B

https://www.nature.com/articles/s41586-021-03819-2

https://github.com/deepmind/alphafold

yes

Jurassic-1

AI21 Labs

2021-08-01

178B

300B tokens

N/A

N/A

https://www.ai21.com/blog/announcing-ai21-studio-and-jurassic-1

N/A

N/A

no

Codex

OpenAI

2021-08-10

GPT3

12B

100B tokens

https://arxiv.org/abs/2107.03374

N/A

https://openai.com/blog/openai-codex

private

private

code

no

T0

BigScience

2021-10-15

T5

11B

27h

https://arxiv.org/abs/2110.08207

https://github.com/bigscience-workshop/t-zero

Apache 2.0

Apache 2.0

transformer

https://huggingface.co/bigscience/T0

DeBERTa V3

Microsoft

2021-11-18

https://arxiv.org/abs/2111.09543

https://github.com/microsoft/DeBERTa

MIT

transformer

yes

Gopher

Google DeepMind

2021-12-08

from scratch

280B

300B tokens

38d

https://arxiv.org/abs/2112.11446

no

no

no

GLaM (Generalist Language Model)

Google

2021-12-13

from scratch

1.2T

280T tokens

24d

https://arxiv.org/abs/2112.06905

WebGPT

OpenAI

2021-12-17

GPT 3

175B

https://arxiv.org/abs/2112.09332

N/A

private

private

transformer

no

no

yes

ClipSeg

2021-12-18

https://arxiv.org/abs/2112.10003

https://github.com/timojl/clipseg

InstructGPT

OpenAI

2022-01-27

GPT3

175B

https://arxiv.org/abs/2203.02155

N/A

private

private

transformer

no

yes

yes

Megatron-Turing (MT) NLG

Microsoft

2022-01-28

530B

270B tokens

https://arxiv.org/abs/2201.11990

N/A

transformer

no

AlphaCode

Google DeepMind

2022-02-02

0.3B,1B,3B,9B,41B

967B tokens

https://arxiv.org/abs/2203.07814

https://www.deepmind.com/blog/competitive-programming-with-alphacode

N/A

code

no

GPT3.5

OpenAI

2022-03-15

355B

N/A

private

private

transformer

no

Imagen

Google

2022-03-23

https://arxiv.org/abs/2205.11487

https://imagen.research.google/

CodeGen-Multi

Salesforce

2022-03-25

350M - 16B

2048

https://arxiv.org/abs/2203.13474v1

code

https://huggingface.co/Salesforce/codegen-350M-multi

Chinchilla

Google DeepMind

2022-03-29

70B

1.4T tokens

https://arxiv.org/abs/2203.15556

N/A

https://www.deepmind.com/blog/an-empirical-analysis-of-compute-optimal-large-language-model-training

N/A

no

T5X

Google

2022-03-31

https://arxiv.org/abs/2203.17189

https://github.com/google-research/t5x

transformer

PaLM (Pathways Language Model)

Google

2022-04-04

8B, 62B, 540B

780B tokens

https://arxiv.org/abs/2204.02311

https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html

N/A

N/A

transformer

no

GPT-NeoX

EleutherAI

2022-04-14

20B

825GB

https://arxiv.org/abs/2204.06745

https://github.com/EleutherAI/gpt-neox

Apache 2.0

Apache 2.0

transformer

https://huggingface.co/EleutherAI/gpt-neox-20b

no

no

Tk-Instruct

Allen Institute for AI

2022-04-16

T5

3B, 11B

4h

https://arxiv.org/abs/2204.07705

https://github.com/yizhongw/Tk-Instruct

Apache 2.0

https://huggingface.co/allenai/tk-instruct-11b-def

yes

Flamingo

Google DeepMind

2022-04-29

https://arxiv.org/abs/2204.14198

https://www.deepmind.com/blog/tackling-multiple-tasks-with-a-single-visual-language-model

N/A

no

OPT

Meta AI

2022-05-03

from scratch

125M - 175B

180B tokens

https://arxiv.org/abs/2205.01068

MIT

NC research

transformer

https://huggingface.co/facebook/opt-30b

no

no

UL2

Google Brain

2022-05-10

20B

1T tokens

https://arxiv.org/abs/2205.05131

Apache 2.0

Apache 2.0

transformer

yes

no

no

LaMDA 2

Google

2022-05-11

N/A

transformer

no

YaLM

Yandex

2022-06-22

from scratch

100B

N/A

https://github.com/yandex/YaLM-100B

transformer

yes

BLOOM

BigScience

2022-07-06

from scratch

up to 176B

366B tokens

105d

https://arxiv.org/abs/2211.05100

bigscience-bloom-rail-1.0

bigscience-bloom-rail-1.0

transformer

https://huggingface.co/bigscience/bloom

no

no

NLLB-200 (No Language Left Behind)

Meta AI

2022-07-06

from scratch

55B

https://about.fb.com/news/2022/07/new-meta-ai-model-translates-200-languages-making-technology-more-accessible/

translator

translate between 200 languages

Midjourney

Midjourney Inc.

2022-07-12

from scratch

N/A

N/A

https://www.midjourney.com

N/A

diffuser

no

Exposed as Discord bot

DALL-E 2

OpenAI

2022-07-20

GPT-3

https://cdn.openai.com/papers/dall-e-2.pdf

private

private

diffuser

no

AlexaTM

Amazon

2022-08-02

from scratch

20B

1.3T tokens

120d

https://arxiv.org/abs/2208.01448

https://github.com/amazon-science/alexa-teacher-models

transformer

via SageMaker

no

no

Stable Diffusion

Stability AI

2022-08-10

from scratch

890M

https://arxiv.org/abs/2112.10752

https://github.com/CompVis/stable-diffusion

https://stability.ai/blog/stable-diffusion-announcement

diffuser

yes

See also https://stablediffusionweb.com/

DreamBooth

Google

2022-08-25

https://arxiv.org/abs/2208.12242

https://github.com/google/dreambooth

https://dreambooth.github.io/

N/A

no

CodeGeeX

Tsinghua University

2022-09-19

from scratch

13B

850B tokens

60d

https://arxiv.org/abs/2303.17568

https://github.com/THUDM/CodeGeeX

https://models.aminer.cn/codegeex/blog/

Apache 2.0

CodeGeeX License

code

on request

N/A

N/A

WeLM

WeChat

2022-09-21

from scratch

10B

300B tokens

24d

https://arxiv.org/abs/2209.10372

https://welm.weixin.qq.com/docs/api/

yes

no

no

Chinese language

Sparrow

Google DeepMind

2022-09-22

from scratch

70B

https://arxiv.org/abs/2209.14375

https://www.deepmind.com/blog/building-safer-dialogue-agents

N/A

no

no

yes

GLM-130B

Tsinghua University

2022-10-05

from scratch

130B

400B tokens

60d

https://arxiv.org/abs/2210.02414

https://github.com/THUDM/GLM-130B

transformer

yes

Flan-T5

Google

2022-10-20

T5

60M - 11B

https://arxiv.org/abs/2210.11416

https://github.com/google-research/t5x

Apache 2.0

Apache 2.0

transformer

yes

yes

no

Flan-PaLM

Google

2022-10-20

PaLM

540B

37h

https://arxiv.org/abs/2210.11416

N/A

N/A

N/A

N/A

transformer

no

yes

no

U-PaLM

Google

2022-10-20

PaLM

8B, 62B, 540B

5d

https://arxiv.org/abs/2210.11399

N/A

N/A

transformer

no

no

no

BLOOMZ

BigScience

2022-11-03

BLOOM

176B

https://arxiv.org/abs/2211.01786

https://github.com/bigscience-workshop/xmtf

bigscience-bloom-rail-1.0

bigscience-bloom-rail-1.0

transformer

yes

yes

no

BLOOM + Multitask prompted finetuning (MTF)

mT0

BigScience

2022-11-03

mT5

300M - 13B

https://arxiv.org/abs/2211.01786

https://github.com/bigscience-workshop/xmtf

Apache 2.0

Apache 2.0

https://huggingface.co/bigscience/mt0-large

Google mT5 + Multitask prompted finetuning (MTF)

OpenJourney

PromptHero

2022-11-08

Stable Diffusion

N/A

diffuser

https://huggingface.co/prompthero/openjourney

Stable Diffusion finetuned to resemble Midjourney

Galactica

Meta AI

2022-11-16

from scratch

125M - 120B

106B tokens

https://arxiv.org/abs/2211.09085

cc-by-nc-4.0

transformer

https://huggingface.co/facebook/galactica-120b

Focussed on Science

Stable Diffusion v2

Stability AI

2022-11-24

from scratch

N/A

https://github.com/Stability-AI/stablediffusion

https://stability.ai/blog/stable-diffusion-v2-release

diffuser

yes

GPT-JT

TogetherComputer

2022-11-29

GPT-J

6B

N/A

https://www.together.xyz/blog/releasing-v1-of-gpt-jt-powered-by-open-source-ai

Apache 2.0

Apache 2.0

transformer

https://huggingface.co/togethercomputer/GPT-JT-6B-v1

no

ChatGPT

OpenAI

2022-11-30

GPT 3.5

N/A

N/A

https://openai.com/blog/chatgpt

no

private

private

chatbot

no

yes

yes

OpenCLIP

various

2022-12-14

from scratch

https://arxiv.org/pdf/2212.07143.pdf

https://github.com/LAION-AI/scaling-laws-openclip

OPT-IML

Meta AI

2022-12-22

OPT

30B, 175B

https://arxiv.org/abs/2212.12017

MIT

NC research

transformer

yes

yes

no

Bard

Google

2023-02-06

LaMDA 2 or PaLM 2?

N/A

chatbot

no

LLaMA

Meta AI

2023-02-23

from scratch

7B, 13B, 30B, 65B

1.4T tokens

21d

https://arxiv.org/abs/2302.13971

https://github.com/facebookresearch/llama

https://ai.facebook.com/blog/large-language-model-llama-meta-ai/

GPL 3.0

NC research

transformer

https://huggingface.co/decapoda-research/llama-65b-hf

no

no

Flan-UL2

Google Brain

2023-02-28

UL2

20B

Flan collection

https://arxiv.org/abs/2205.05131v3

https://github.com/google-research/google-research/tree/master/ul2

Apache 2.0

Apache 2.0

https://huggingface.co/google/flan-ul2

yes

no

Open-Assistant SFT-1

OpenAssistant

2023-03-09

Pythia 12B

12B

N/A

https://github.com/LAION-AI/Open-Assistant/tree/main/model/model_training

https://open-assistant.io/

Apache 2.0

transformer

https://huggingface.co/OpenAssistant/oasst-sft-1-pythia-12b

Jurassic-2

AI21 Labs

2023-03-09

?

N/A

N/A

https://www.ai21.com/blog/introducing-j2

N/A

N/A

no

Alpaca-LoRA

Eric J. Wang

2023-03-13

LLaMA

N/A

https://github.com/tloen/alpaca-lora

transformer

yes

Alpaca

Stanford

2023-03-13

LLaMA

7B

N/A

https://github.com/tatsu-lab/stanford_alpaca

https://crfm.stanford.edu/2023/03/13/alpaca.html

transformer

yes

h2oGPT

H2O.ai

2023-03-13

Pythia 12B, GPT-NeoX 20B

12B, 20B

N/A

https://github.com/h2oai/h2ogpt

https://gpt.h2o.ai/

Apache 2.0

transformer

https://huggingface.co/h2oai

ChatGLM

Tsinghua University

2023-03-14

GLM / GLM-130B?

6B

https://github.com/THUDM/ChatGLM-6B

https://chatglm.cn/blog

chatbot

GPT4

OpenAI

2023-03-14

from scratch

8x220B

https://arxiv.org/abs/2303.08774

private

private

transformer

no

yes

yes

Zero-1-to-3

Columbia University

2023-03-20

https://arxiv.org/abs/2303.11328

https://github.com/cvlab-columbia/zero123

https://zero123.cs.columbia.edu/

diffuser

yes

Dolly v1

Databricks

2023-03-24

GPT-J

6B

N/A

https://github.com/databrickslabs/dolly

https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html

cc-by-nc-4.0

chatbot

https://huggingface.co/databricks/dolly-v1-6b

GPT4All

Nomic AI

2023-03-28

LLaMA

7B

https://static.nomic.ai/gpt4all/2023_GPT4All_Technical_Report.pdf

https://github.com/nomic-ai/gpt4all

yes

GPL 3.0

chatbot

https://huggingface.co/nomic-ai/gpt4all-lora

Finetuned LLaMA 7B based on GPT3.5 chats

Cerebras-GPT

Cerebras Systems

2023-03-28

from scratch

111M - 13B

https://arxiv.org/abs/2304.03208

https://github.com/Cerebras/modelzoo

https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/

Apache 2.0

Apache 2.0

transformer

https://huggingface.co/cerebras/Cerebras-GPT-13B

no

no

Reproduction of GPT 3 training process

LLaMA-Adapter

Shanghai AI Lab

2023-03-28

LLaMA

7B

https://arxiv.org/abs/2303.16199

https://github.com/ZrrSkywalker/LLaMA-Adapter

ColossalChat

Colossal AI

2023-03-29

LLaMA

https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat

https://chat.colossalai.org/

Apache 2.0

chatbot

Vicuna

LM-SYS

2023-03-30

LLaMA

7B, 13B

N/A

https://github.com/lm-sys/FastChat

https://vicuna.lmsys.org/

see LLaMA

transformer

yes

BloombergGPT

Bloomberg

2023-03-30

50B

https://arxiv.org/abs/2303.17564

https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/

transformer

RWKV-4 Raven

BlinkDL

2023-04-01

1.5B, 3B, 7B, 14B

https://arxiv.org/abs/2305.13048

https://github.com/BlinkDL/RWKV-LM

RNN

https://huggingface.co/BlinkDL/rwkv-4-raven

Pythia

EleutherAI

2023-04-03

70M - 12B

300B tokens

https://arxiv.org/abs/2304.01373

https://github.com/EleutherAI/pythia

Apache 2.0

Apache 2.0

transformer

https://huggingface.co/EleutherAI/pythia-12b

no

no

Koala

UC Berkeley

2023-04-03

LLaMA

7B, 13B

N/A

https://github.com/young-geng/EasyLM#koala

https://bair.berkeley.edu/blog/2023/04/03/koala

transformer

https://huggingface.co/young-geng/koala/tree/main

Baize

Baize Project

2023-04-03

LLaMA

7B, 13B, 30B

https://arxiv.org/abs/2304.01196

https://github.com/project-baize/baize-chatbot

transformer

https://huggingface.co/project-baize/baize-lora-7B

Finetuned LLaMA with LoRA

SAM

Meta AI

2023-04-05

https://arxiv.org/abs/2304.02643

https://github.com/facebookresearch/segment-anything

https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/

yes

vision

Bark

Suno

2023-04-09

80M

N/A

https://github.com/suno-ai/bark

cc-by-nc-4.0

voice

yes

Dolly v2

Databricks

2023-04-12

Pythia

3B, 7B, 12B

N/A

https://github.com/databrickslabs/dolly

Apache 2.0

MIT

chatbot

https://huggingface.co/databricks/dolly-v2-12b

yes

no

CodeWhisperer

Amazon

2023-04-13

N/A

N/A

N/A

https://aws.amazon.com/blogs/aws/amazon-codewhisperer-free-for-individual-use-is-now-generally-available/

N/A

code

no

Self-hosted Copilot clone

GPT4All-J

Nomic AI

2023-04-14

GPT-J

6.7B

https://static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf

https://github.com/nomic-ai/gpt4all

yes

Apache 2.0

Apache 2.0

transformer

https://huggingface.co/nomic-ai/gpt4all-j

yes

no

DINOv2

Meta AI

2023-04-14

from scratch

21M - 1.1B

https://arxiv.org/abs/2304.07193

https://github.com/facebookresearch/dinov2

https://ai.facebook.com/blog/dino-v2-computer-vision-self-supervised-learning/

vision

yes

VideoLDM

Nvidia

2023-04-18

Stable Diffusion

https://arxiv.org/abs/2304.08818

N/A

https://research.nvidia.com/labs/toronto-ai/VideoLDM/

StableLM

Stability AI

2023-04-19

from scratch

3B, 7B, (15B, 65B, 175B)

N/A

https://github.com/stability-AI/stableLM/

https://stability.ai/blog/stability-ai-launches-the-first-of-its-stablelm-suite-of-language-models

cc-by-nc-4.0

transformer

yes

Open-Assistant SFT-6

OpenAssistant

2023-04-22

LLaMA

30B

https://arxiv.org/abs/2304.07327

see LLaMA

transformer

https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor

WizardLM

Microsoft

2023-04-24

LLaMA

7B

https://arxiv.org/abs/2304.12244

https://github.com/nlpxucan/WizardLM

transformer

yes

DeepFloyd IF

Stability AI

2023-04-28

N/A

https://github.com/deep-floyd/IF

https://stability.ai/blog/deepfloyd-if-text-to-image-model

StableVicuna

Stability AI

2023-04-28

Vicuna 13B

13B

N/A

https://github.com/Stability-AI/StableLM

https://stability.ai/blog/stablevicuna-open-source-rlhf-chatbot

cc-by-nc-4.0

transformer

https://huggingface.co/CarperAI/stable-vicuna-13b-delta

Vicuna 13B + RLHF

FastChat-T5

LM-SYS

2023-04-28

Flan-T5-XL

3B

N/A

https://github.com/lm-sys/FastChat#fastchat-t5

Apache 2.0

transformer

https://huggingface.co/lmsys/fastchat-t5-3b-v1.0

LLaMA-Adapter V2

Shanghai AI Lab

2023-04-28

LLaMA

https://arxiv.org/abs/2304.15010

https://github.com/ZrrSkywalker/LLaMA-Adapter

transformer

Replit Code

Replit

2023-05-02

from scratch

2.7B

N/A

https://github.com/replit/ReplitLM

https://replit.com/site/ghostwriter

cc-by-sa-4.0

code

https://huggingface.co/replit/replit-code-v1-3b

OpenLLaMA

OpenLM Research

2023-05-02

from scratch

7B

https://github.com/openlm-research/open_llama

RedPajama

Apache 2.0

transformer

https://huggingface.co/openlm-research/open_llama_7b_preview_300bt

Apache 2.0 LLaMA clone based on RedPajama data

Shap-E

OpenAI

2023-05-03

from scratch

300M

https://arxiv.org/pdf/2305.02463.pdf

https://github.com/openai/shap-e

MIT

diffuser

https://github.com/openai/shap-e/blob/main/shap_e/models/download.py

3D image generation

StarCoder

BigCode

2023-05-04

15B

1T tokens + 35B Python tokens

8k

https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view

https://github.com/bigcode-project/starcoder

https://huggingface.co/blog/starcoder

BigCode OpenRAIL-M v1

code

https://huggingface.co/bigcode/starcoder

RedPajama

TogetherComputer

2023-05-05

from scratch

3B, 7B

N/A

https://github.com/togethercomputer/RedPajama-Data

https://www.together.xyz/blog/redpajama-models-v1

Apache 2.0

https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-7B-v0.1

Open reproduction of LLaMA

MPT-7B (MosaicML Pretrained Transformer)

MosaicML

2023-05-05

from scratch

7B

N/A

https://github.com/mosaicml/llm-foundry

https://www.mosaicml.com/blog/mpt-7b

Apache 2.0

transformer

https://huggingface.co/mosaicml/mpt-7b-instruct

Open reproduction of LLaMA

MPT-30B (MosaicML Pretrained Transformer)

MosaicML

2023-06-22

from scratch

30B

N/A

https://github.com/mosaicml/llm-foundry

https://www.mosaicml.com/blog/mpt-30b

Apache 2.0

transformer

https://huggingface.co/mosaicml/mpt-30b-instruct

Open reproduction of LLaMA

PanGu-sigma

Huawei

AnthropicLM

Anthropic AI

N/A

no

Lit-LLaMA

LLaMA

7B, 13B, 30B, 65B

Apache 2.0

NC research

optional with Alpaca

no

ImageBind

Meta AI

2023-05-09

from scratch

https://arxiv.org/abs/2305.05665

https://github.com/facebookresearch/ImageBind

https://ai.facebook.com/blog/imagebind-six-modalities-binding-ai/

cc-by-nc-4.0

cc-by-nc-4.0

transformer

https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth

six different modalities: images, text, audio, depth, thermal, and IMU

Open-LLaMA V2

s-JoL

2023-05-11

from scratch

N/A

https://github.com/s-JoL/Open-Llama

MIT

MIT

transformer

https://huggingface.co/s-JoL/Open-Llama-V2

yes

yes

PaLM 2

Google

2023-05-10

from scratch

https://ai.google/static/documents/palm2techreport.pdf

N/A

https://ai.google/discover/palm2

transformer

no