Recent Generative AI Models

vhsven

2023-05-01

This page was last updated on 2025-11-16.

New Table

Table 1 - Models
Model	Company	Date	Params	Paper	Source	Website	Weights	Remarks
Robotic Transformer 1	Google DeepMind	2022-12-13		link		link
Einstein GPT	Salesforce	2023-03-07				link		uses OpenAI API?
🧨 Stable UnCLIP 2.1	Stability AI	2023-03-24			link	link	link	model behind Reimagine
LLaVA	University of Wisconsin-Madison	2023-04-17		link	link	link	link	LLaVA = Large Language and Vision Assistant
WizardLM	Microsoft	2023-04-24	7B, 13B, 30B, 70B	link	link		link	based on LLaMA
Eleven Multilingual v1	ElevenLabs	2023-04-27			link	link		English, French, German, Hindi, Italian, Polish, Portuguese, Spanish
PaLM 2	Google	2023-05-10		link		link
LIMA	Meta AI	2023-05-18	65B	link				based on LLaMA
🔈Massive Multilingual Speech	Meta AI	2023-05-22	300M, 1B	link	link	link	link
Falcon	TII	2023-05-26	1B, 7B, 40B	coming soon			link
AlphaDev	Google DeepMind	2023-06-07		link		link
🔈 StyleTTS 2	Columbia University	2023-06-13		link	link	link	link
WizardCoder	Microsoft	2023-06-14	15B	link	link		link
Llama 2	Meta AI	2023-07-18	7B, 13B, 70B	link	link	link link2	link
Meta-Transformer		2023-07-20	85M, 302M	link	link	link	link	12 modalities
Stable Beluga 2	Stability AI	2023-07-21	70B			link	link	based on Llama 2
🧨 Stable Diffusion XL 1.0	Stability AI	2023-07-26	3.5B	link	link	link	base refiner
Robotic Transformer 2	Google DeepMind	2023-07-28		link		link
StableCode	Stability AI	2023-08-08	3B			link	base instruct
🔈 AudioSep - Separate Anything You Describe	Audio-AGI	2023-08-09		link	link	link	link
🔈 AudioLDM2	ByteDance	2023-08-10		link	link	link	link
🔈 Eleven Multilingual v2	ElevenLabs	2023-08-22			link	link		English, French, German, Hindi, Italian, Polish, Portuguese, Spanish, Chinese, Korean, Dutch, Turkish, Swedish, Indonesian, Filipino, Japanese, Ukrainian, Greek, Czech, Finnish, Romanian, Danish, Bulgarian, Malay, Slovak, Croatian, Classic Arabic, Tamil
SeamlessM4T	Meta AI	2023-08-22	1.2B, 2.3B	link	link	link	link
Code Llama	Meta AI	2023-08-24	7B, 13B, 34B	link	link	link	link
Nougat OCR	Meta AI	2023-08-25		link	link	link	link	Specialized in academic documents
Falcon 180B	TII	2023-09-06	180B	coming soon		link	link	see also: falcon-40b
Persimmon	Adept	2023-09-07	8B		link	link	link
🔈 StableAudio	Stability AI	2023-09-13				link
🧨 DALL-E 3	OpenAI	2023-09-21				link
📽️ LaVie	Shanghai Artificial Intelligence Laboratory	2023-09-26		link	link	link	link
Mistral-7B	Mistral AI	2023-09-27	7B		link	link	link
Qwen	Alibaba	2023-09-28	7B, 14B	link	link		link
LLaVA 1.5	University of Wisconsin-Madison	2023-10-05		link	link	link	link
jina-embeddings-v2	Jina AI	2023-10-25		link		link	link
Yi	01.ai	2023-11-02	6B, 34B			link	link
📽️ Emu Video	Meta AI	2023-11-16		link		link
📽️ Stable Video Diffusion	Stability AI	2023-11-21		link	link	link	link
Meditron	École Polytechnique Fédérale de Lausanne (EPFL)	2023-11-27	7B, 70B	link	link		link
🧨 SDXL Turbo	Stability AI	2023-11-28		link	link	link	link
📽️ Animate Anyone	Alibaba	2023-11-28		link	link	link
Seamless	Meta AI	2023-11-30		link	link	link	link
OpenVoice	MyShell.ai	2023-12-03		link	link	link
Gemini	Google DeepMind	2023-12-06		link		link		nano / pro / ultra, pro will power Bard
AlphaCode 2	Google DeepMind	2023-12-06		link		link
Stable LM Zephyr 3B	Stability AI	2023-12-07	3B			link	link
Mixtral 8x7B	Mistral AI	2023-12-11	45B	link		link	link
🧨 Imagen 2	Google DeepMind	2023-12-13				link
Stable Code 3B	Stability AI	2024-01-16	3B			link	link
Stable LM 2	Stability AI	2024-01-19	1.6B			link	link
Eagle 7B	RWKV	2024-01-29	7B			link	link	RWKV-v5 architecture
Code Llama 70B	Meta AI	2024-01-29	7B, 13B, 34B, 70B	link	link	link	link
MGIE	Apple	2024-02-05		link	link	link	link
Sora	OpenAI	2024-02-15		link		link
Gemma	Google	2024-02-21	2B, 7B	link	link	link	link 1 link 2
🧨 Stable Diffusion 3	Stability AI	2024-02-22 (preview)	0.8B, ..., 8B			link

Old Table

Table 2 - Models (old)
Model	Company	Date	Base Model	Parameters	Training Data Size	Training Time	Context length	Paper	Source	Website	Training data	Code License	Weights License	Type	Model weights	Instruction Tuning	RLHF	Remarks
Deep Blue	IBM	1996-01-01	from scratch	N/A				https://www.sciencedirect.com/science/article/pii/S0004370201001291	N/A	https://www.ibm.com/ibm/history/ibm100/us/en/icons/deepblue/				games				Chess
Watson	IBM	2011-01-01	from scratch	N/A				https://doi.org/10.1609/aimag.v31i3.2303	N/A					games				Jeopardy
AlexNet	A. Krizhevsky, I. Sutskever, G. Hinton	2012-09-30	from scratch	60M				https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf	https://github.com/dansuh17/alexnet-pytorch (clone)					vision				won ImageNet LSVRC 2012 challenge with a 15.3% top-5 error
word2vec	Google	2013-01-16						https://arxiv.org/abs/1301.3781								no	no
Inception v1	Google	2014-09-17	from scratch					https://arxiv.org/abs/1409.4842	https://github.com/google/deepdream					vision				won ImageNet LSVRC 2014 challenge with a 6.7% top-5 error
DQN	Google DeepMind	2015-02-25	from scratch					https://www.nature.com/articles/nature14236	https://github.com/deepmind/dqn					deep RL
char-rnn	Andrej Karpathy	2015-05-21	from scratch					https://karpathy.github.io/2015/05/21/rnn-effectiveness/	https://github.com/karpathy/char-rnn					language				Featured on https://www.aiweirdness.com/
GloVe	Stanford	2015-09-01						https://nlp.stanford.edu/pubs/glove.pdf	https://github.com/stanfordnlp/GloVe	https://nlp.stanford.edu/projects/glove/		Apache 2.0	Apache 2.0		yes	no	no
fastText	Facebook	2015-11-09						https://arxiv.org/abs/1607.04606	https://github.com/facebookresearch/fastText	https://fasttext.cc/		MIT	MIT		yes	no	no
Inception v3	Google	2015-12-02						https://arxiv.org/abs/1512.00567						vision	https://huggingface.co/timm/inception_v3.tv_in1k
ResNet	Microsoft	2015-12-10	from scratch					https://arxiv.org/abs/1512.03385						vision				won ImageNet LSVRC 2015 challenge with a 3.57% top-5 error; "better than humans"
AlphaGo	Google DeepMind	2016-01-27	from scratch					https://www.nature.com/articles/nature16961						games
Inception v4	Google	2016-02-23												vision	https://huggingface.co/timm/inception_v4.tf_in1k
Tay	Microsoft	2016-03-23						N/A	N/A	https://blogs.microsoft.com/blog/2016/03/25/learning-tays-introduction/				chatbot
CycleGAN	UC Berkeley	2017-03-30						https://arxiv.org/abs/1703.10593	https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix					GAN	yes
AlphaGo Zero	Google DeepMind	2017-10-19						https://www.nature.com/articles/nature24270						games
AlphaZero	Google DeepMind	2017-12-05						https://arxiv.org/abs/1712.01815						games
ELMo (Embeddings from Language Models)	Allen Institute for AI	2018-02-15		180M				https://arxiv.org/abs/1802.05365						language	yes
GPT (Generative Pre-trained Transformer)	OpenAI	2018-06-11	from scratch	117M				https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf	https://github.com/openai/finetune-transformer-lm					transformer	yes	no	no
BERT (Bidirectional Encoder Representations from Transformers)	Google	2018-10-11		108M, 334M				https://arxiv.org/abs/1810.04805	https://github.com/google-research/bert					transformer	yes
StyleGAN	Nvidia	2018-12-12						https://arxiv.org/abs/1812.04948	https://github.com/NVlabs/stylegan					GAN	yes			https://thispersondoesnotexist.com
GPT-2	OpenAI	2019-02-14		1.5B				https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf	https://github.com/openai/gpt-2					transformer	yes	no	no
XLNet	CMU & Google	2019-06-19		117M, 360M				https://arxiv.org/abs/1906.08237	https://github.com/zihangdai/xlnet				Apache 2.0		yes
RoBERTa	Meta AI	2019-07-26	BERT	354M				https://arxiv.org/abs/1907.11692						transformer	yes
ALBERT (A Lite BERT)	Google	2019-09-26	BERT	12M, 18M, 60M, 235M				https://arxiv.org/abs/1909.11942	https://github.com/google-research/ALBERT				Apache 2.0	transformer	yes
DistilBERT	HuggingFace	2019-10-02	BERT	66M				https://arxiv.org/abs/1910.01108	https://github.com/huggingface/transformers				Apache 2.0	transformer	yes
Text-to-Text Transfer Transformer (T5)	Google	2019-10-23	from scratch	11B	1T tokens			https://arxiv.org/abs/1910.10683	https://github.com/google-research/text-to-text-transfer-transformer	https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html		Apache 2.0	Apache 2.0	transformer	yes	no	no
AlphaFold	Google DeepMind	2020-01-15	from scratch					https://www.nature.com/articles/s41586-019-1923-7	https://github.com/deepmind/deepmind-research/tree/master/alphafold_casp13						yes
Turing NLG	Microsoft	2020-02-13		17B				N/A		https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/
ELECTRA	Stanford & Google	2020-03-23	BERT?	14M, 110M, 335M				https://arxiv.org/abs/2003.10555							yes
DeBERTa	Microsoft	2020-06-05	BERT					https://arxiv.org/abs/2006.03654	https://github.com/microsoft/DeBERTa				MIT	transformer	yes
GPT-3	OpenAI	2020-06-11	from scratch	175B	300B tokens			https://arxiv.org/abs/2005.14165	N/A			private	private	transformer	no	no	no
ImageGPT	OpenAI	2020-06-17						https://cdn.openai.com/papers/Generative_Pretraining_from_Pixels_V2.pdf	https://github.com/openai/image-gpt			private	private	transformer	no
mT5	Google	2020-10-22	from scratch	300M - 13B	1T tokens			https://arxiv.org/abs/2010.11934	https://github.com/google-research/multilingual-t5		mC4	Apache 2.0	Apache 2.0	transformer	https://huggingface.co/google/mt5-base
DALL-E	OpenAI	2021-01-05	GPT-3	12B				https://arxiv.org/abs/2102.12092				private	private	transformer	no
DeBERTa V2	Microsoft	2021-02-03		900M - 1.5B				N/A						transformer	yes
CLIP	OpenAI	2021-02-26						https://arxiv.org/abs/2103.00020	https://github.com/OpenAI/CLIP	https://openai.com/research/clip		MIT			yes
GLM	Tsinghua University	2021-03-18		110M - 10B				https://arxiv.org/abs/2103.10360	https://github.com/THUDM/GLM					transformer	yes
GPT-Neo	EleutherAI	2021-03-21		125M, 1.3B, 2.7B				N/A	https://github.com/EleutherAI/gpt-neo	https://www.eleuther.ai/artifacts/gpt-neo			MIT	transformer	https://huggingface.co/EleutherAI/gpt-neo-1.3B
LaMDA	Google	2021-05-18	from scratch	137B	2.8T tokens	58d		https://arxiv.org/abs/2201.08239	N/A				N/A	transformer	no
GPT-J	EleutherAI	2021-06-09		6B					https://github.com/kingoflolz/mesh-transformer-jax		yes	Apache 2.0	Apache 2.0	transformer	https://huggingface.co/EleutherAI/gpt-j-6b	no	no
CPM-2	Tsinghua University	2021-06-20		11B				https://arxiv.org/abs/2106.10715	https://github.com/TsinghuaAI/CPM						yes
Copilot	GitHub	2021-06-29	OpenAI Codex						N/A				N/A	code	no
ERNIE 3.0	Baidu	2021-07-05		10B	375B tokens			https://arxiv.org/abs/2107.02137	N/A	http://research.baidu.com/Blog/index-view?id=160		N/A	N/A	transformer	no
AlphaFold 2	Google DeepMind	2021-07-15		93M				https://www.nature.com/articles/s41586-021-03819-2	https://github.com/deepmind/alphafold						yes
Jurassic-1	AI21 Labs	2021-08-01		178B	300B tokens			N/A	N/A	https://www.ai21.com/blog/announcing-ai21-studio-and-jurassic-1		N/A	N/A		no
Codex	OpenAI	2021-08-10	GPT-3	12B	100B tokens			https://arxiv.org/abs/2107.03374	N/A	https://openai.com/blog/openai-codex		private	private	code	no
T0	BigScience	2021-10-15	T5	11B		27h		https://arxiv.org/abs/2110.08207	https://github.com/bigscience-workshop/t-zero			Apache 2.0	Apache 2.0	transformer	https://huggingface.co/bigscience/T0
DeBERTa V3	Microsoft	2021-11-18						https://arxiv.org/abs/2111.09543	https://github.com/microsoft/DeBERTa				MIT	transformer	yes
Gopher	Google DeepMind	2021-12-08	from scratch	280B	300B tokens	38d		https://arxiv.org/abs/2112.11446							no	no	no
GLaM (Generalist Language Model)	Google	2021-12-13	from scratch	1.2T	280T tokens	24d		https://arxiv.org/abs/2112.06905
WebGPT	OpenAI	2021-12-17	GPT-3	175B				https://arxiv.org/abs/2112.09332	N/A			private	private	transformer	no	no	yes
ClipSeg		2021-12-18						https://arxiv.org/abs/2112.10003	https://github.com/timojl/clipseg
InstructGPT	OpenAI	2022-01-27	GPT-3	175B				https://arxiv.org/abs/2203.02155	N/A			private	private	transformer	no	yes	yes
Megatron-Turing (MT) NLG	Microsoft	2022-01-28		530B	270B tokens			https://arxiv.org/abs/2201.11990					N/A	transformer	no
AlphaCode	Google DeepMind	2022-02-02		0.3B,1B,3B,9B,41B	967B tokens			https://arxiv.org/abs/2203.07814		https://www.deepmind.com/blog/competitive-programming-with-alphacode			N/A	code	no
GPT-3.5	OpenAI	2022-03-15		355B					N/A			private	private	transformer	no
Imagen	Google	2022-03-23						https://arxiv.org/abs/2205.11487		https://imagen.research.google/
CodeGen-Multi	Salesforce	2022-03-25		350M - 16B			2048	https://arxiv.org/abs/2203.13474v1						code	https://huggingface.co/Salesforce/codegen-350M-multi
Chinchilla	Google DeepMind	2022-03-29		70B	1.4T tokens			https://arxiv.org/abs/2203.15556	N/A	https://www.deepmind.com/blog/an-empirical-analysis-of-compute-optimal-large-language-model-training			N/A		no
T5X	Google	2022-03-31						https://arxiv.org/abs/2203.17189	https://github.com/google-research/t5x					transformer
PaLM (Pathways Language Model)	Google	2022-04-04		8B, 62B, 540B	780B tokens			https://arxiv.org/abs/2204.02311		https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html		N/A	N/A	transformer	no
GPT-NeoX	EleutherAI	2022-04-14		20B	825GB			https://arxiv.org/abs/2204.06745	https://github.com/EleutherAI/gpt-neox			Apache 2.0	Apache 2.0	transformer	https://huggingface.co/EleutherAI/gpt-neox-20b	no	no
Tk-Instruct	Allen Institute for AI	2022-04-16	T5	3B, 11B		4h		https://arxiv.org/abs/2204.07705	https://github.com/yizhongw/Tk-Instruct			Apache 2.0			https://huggingface.co/allenai/tk-instruct-11b-def	yes
Flamingo	Google DeepMind	2022-04-29						https://arxiv.org/abs/2204.14198		https://www.deepmind.com/blog/tackling-multiple-tasks-with-a-single-visual-language-model			N/A		no
OPT	Meta AI	2022-05-03	from scratch	125M - 175B	180B tokens			https://arxiv.org/abs/2205.01068				MIT	NC research	transformer	https://huggingface.co/facebook/opt-30b	no	no
UL2	Google Brain	2022-05-10		20B	1T tokens			https://arxiv.org/abs/2205.05131				Apache 2.0	Apache 2.0	transformer	yes	no	no
LaMDA 2	Google	2022-05-11											N/A	transformer	no
YaLM	Yandex	2022-06-22	from scratch	100B				N/A	https://github.com/yandex/YaLM-100B					transformer	yes
BLOOM	BigScience	2022-07-06	from scratch	up to 176B	366B tokens	105d		https://arxiv.org/abs/2211.05100				bigscience-bloom-rail-1.0	bigscience-bloom-rail-1.0	transformer	https://huggingface.co/bigscience/bloom	no	no
NLLB-200 (No Language Left Behind)	Meta AI	2022-07-06	from scratch	55B						https://about.fb.com/news/2022/07/new-meta-ai-model-translates-200-languages-making-technology-more-accessible/				translator				translate between 200 languages
Midjourney	Midjourney Inc.	2022-07-12	from scratch					N/A	N/A	https://www.midjourney.com			N/A	diffuser	no			Exposed as Discord bot
DALL-E 2	OpenAI	2022-07-20	GPT-3					https://cdn.openai.com/papers/dall-e-2.pdf				private	private	diffuser	no
AlexaTM	Amazon	2022-08-02	from scratch	20B	1.3T tokens	120d		https://arxiv.org/abs/2208.01448	https://github.com/amazon-science/alexa-teacher-models					transformer	via SageMaker	no	no
Stable Diffusion	Stability AI	2022-08-10	from scratch	890M				https://arxiv.org/abs/2112.10752	https://github.com/CompVis/stable-diffusion	https://stability.ai/blog/stable-diffusion-announcement				diffuser	yes			See also https://stablediffusionweb.com/
DreamBooth	Google	2022-08-25						https://arxiv.org/abs/2208.12242	https://github.com/google/dreambooth	https://dreambooth.github.io/			N/A		no
CodeGeeX	Tsinghua University	2022-09-19	from scratch	13B	850B tokens	60d		https://arxiv.org/abs/2303.17568	https://github.com/THUDM/CodeGeeX	https://models.aminer.cn/codegeex/blog/		Apache 2.0	CodeGeeX License	code	on request	N/A	N/A
WeLM	WeChat	2022-09-21	from scratch	10B	300B tokens	24d		https://arxiv.org/abs/2209.10372		https://welm.weixin.qq.com/docs/api/					yes	no	no	Chinese language
Sparrow	Google DeepMind	2022-09-22	from scratch	70B				https://arxiv.org/abs/2209.14375		https://www.deepmind.com/blog/building-safer-dialogue-agents			N/A		no	no	yes
GLM-130B	Tsinghua University	2022-10-05	from scratch	130B	400B tokens	60d		https://arxiv.org/abs/2210.02414	https://github.com/THUDM/GLM-130B					transformer	yes
Flan-T5	Google	2022-10-20	T5	60M - 11B				https://arxiv.org/abs/2210.11416	https://github.com/google-research/t5x			Apache 2.0	Apache 2.0	transformer	yes	yes	no
Flan-PaLM	Google	2022-10-20	PaLM	540B		37h		https://arxiv.org/abs/2210.11416	N/A	N/A		N/A	N/A	transformer	no	yes	no
U-PaLM	Google	2022-10-20	PaLM	8B, 62B, 540B		5d		https://arxiv.org/abs/2210.11399				N/A	N/A	transformer	no	no	no
BLOOMZ	BigScience	2022-11-03	BLOOM	176B				https://arxiv.org/abs/2211.01786	https://github.com/bigscience-workshop/xmtf			bigscience-bloom-rail-1.0	bigscience-bloom-rail-1.0	transformer	yes	yes	no	BLOOM + Multitask prompted finetuning (MTF)
mT0	BigScience	2022-11-03	mT5	300M - 13B				https://arxiv.org/abs/2211.01786	https://github.com/bigscience-workshop/xmtf			Apache 2.0	Apache 2.0		https://huggingface.co/bigscience/mt0-large			Google mT5 + Multitask prompted finetuning (MTF)
OpenJourney	PromptHero	2022-11-08	Stable Diffusion					N/A						diffuser	https://huggingface.co/prompthero/openjourney			Stable Diffusion finetuned to resemble Midjourney
Galactica	Meta AI	2022-11-16	from scratch	125M - 120B	106B tokens			https://arxiv.org/abs/2211.09085					cc-by-nc-4.0	transformer	https://huggingface.co/facebook/galactica-120b			Focussed on Science
Stable Diffusion v2	Stability AI	2022-11-24	from scratch					N/A	https://github.com/Stability-AI/stablediffusion	https://stability.ai/blog/stable-diffusion-v2-release				diffuser	yes
GPT-JT	TogetherComputer	2022-11-29	GPT-J	6B					N/A	https://www.together.xyz/blog/releasing-v1-of-gpt-jt-powered-by-open-source-ai		Apache 2.0	Apache 2.0	transformer	https://huggingface.co/togethercomputer/GPT-JT-6B-v1		no
ChatGPT	OpenAI	2022-11-30	GPT-3.5					N/A	N/A	https://openai.com/blog/chatgpt	no	private	private	chatbot	no	yes	yes
OpenCLIP	various	2022-12-14	from scratch					https://arxiv.org/pdf/2212.07143.pdf	https://github.com/LAION-AI/scaling-laws-openclip
OPT-IML	Meta AI	2022-12-22	OPT	30B, 175B				https://arxiv.org/abs/2212.12017				MIT	NC research	transformer	yes	yes	no
Bard	Google	2023-02-06	LaMDA 2 or PaLM 2?										N/A	chatbot	no
LLaMA	Meta AI	2023-02-23	from scratch	7B, 13B, 30B, 65B	1.4T tokens	21d		https://arxiv.org/abs/2302.13971	https://github.com/facebookresearch/llama	https://ai.facebook.com/blog/large-language-model-llama-meta-ai/		GPL 3.0	NC research	transformer	https://huggingface.co/decapoda-research/llama-65b-hf	no	no
Flan-UL2	Google Brain	2023-02-28	UL2	20B	Flan collection			https://arxiv.org/abs/2205.05131v3	https://github.com/google-research/google-research/tree/master/ul2			Apache 2.0	Apache 2.0		https://huggingface.co/google/flan-ul2	yes	no
Open-Assistant SFT-1	OpenAssistant	2023-03-09	Pythia 12B	12B				N/A	https://github.com/LAION-AI/Open-Assistant/tree/main/model/model_training	https://open-assistant.io/			Apache 2.0	transformer	https://huggingface.co/OpenAssistant/oasst-sft-1-pythia-12b
Jurassic-2	AI21 Labs	2023-03-09		?				N/A	N/A	https://www.ai21.com/blog/introducing-j2		N/A	N/A		no
Alpaca-LoRA	Eric J. Wang	2023-03-13	LLaMA					N/A	https://github.com/tloen/alpaca-lora					transformer	yes
Alpaca	Stanford	2023-03-13	LLaMA	7B				N/A	https://github.com/tatsu-lab/stanford_alpaca	https://crfm.stanford.edu/2023/03/13/alpaca.html				transformer	yes
h2oGPT	H2O.ai	2023-03-13	Pythia 12B, GPT-NeoX 20B	12B, 20B				N/A	https://github.com/h2oai/h2ogpt	https://gpt.h2o.ai/			Apache 2.0	transformer	https://huggingface.co/h2oai
ChatGLM	Tsinghua University	2023-03-14	GLM / GLM-130B?	6B					https://github.com/THUDM/ChatGLM-6B	https://chatglm.cn/blog				chatbot
GPT-4	OpenAI	2023-03-14	from scratch	8x220B				https://arxiv.org/abs/2303.08774				private	private	transformer	no	yes	yes
Zero-1-to-3	Columbia University	2023-03-20						https://arxiv.org/abs/2303.11328	https://github.com/cvlab-columbia/zero123	https://zero123.cs.columbia.edu/				diffuser	yes
Dolly v1	Databricks	2023-03-24	GPT-J	6B				N/A	https://github.com/databrickslabs/dolly	https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html			cc-by-nc-4.0	chatbot	https://huggingface.co/databricks/dolly-v1-6b
GPT4All	Nomic AI	2023-03-28	LLaMA	7B				https://static.nomic.ai/gpt4all/2023_GPT4All_Technical_Report.pdf	https://github.com/nomic-ai/gpt4all		yes		GPL 3.0	chatbot	https://huggingface.co/nomic-ai/gpt4all-lora			Finetuned LLaMA 7B based on GPT-3.5 chats
Cerebras-GPT	Cerebras Systems	2023-03-28	from scratch	111M - 13B				https://arxiv.org/abs/2304.03208	https://github.com/Cerebras/modelzoo	https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/		Apache 2.0	Apache 2.0	transformer	https://huggingface.co/cerebras/Cerebras-GPT-13B	no	no	Reproduction of GPT-3 training process
LLaMA-Adapter	Shanghai AI Lab	2023-03-28	LLaMA	7B				https://arxiv.org/abs/2303.16199	https://github.com/ZrrSkywalker/LLaMA-Adapter
ColossalChat	Colossal AI	2023-03-29	LLaMA						https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat	https://chat.colossalai.org/			Apache 2.0	chatbot
Vicuna	LM-SYS	2023-03-30	LLaMA	7B, 13B				N/A	https://github.com/lm-sys/FastChat	https://vicuna.lmsys.org/			see LLaMA	transformer	yes
BloombergGPT	Bloomberg	2023-03-30		50B				https://arxiv.org/abs/2303.17564		https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/				transformer
RWKV-4 Raven	BlinkDL	2023-04-01		1.5B, 3B, 7B, 14B				https://arxiv.org/abs/2305.13048	https://github.com/BlinkDL/RWKV-LM					RNN	https://huggingface.co/BlinkDL/rwkv-4-raven
Pythia	EleutherAI	2023-04-03		70M - 12B	300B tokens			https://arxiv.org/abs/2304.01373	https://github.com/EleutherAI/pythia			Apache 2.0	Apache 2.0	transformer	https://huggingface.co/EleutherAI/pythia-12b	no	no
Koala	UC Berkeley	2023-04-03	LLaMA	7B, 13B				N/A	https://github.com/young-geng/EasyLM#koala	https://bair.berkeley.edu/blog/2023/04/03/koala				transformer	https://huggingface.co/young-geng/koala/tree/main
Baize	Baize Project	2023-04-03	LLaMA	7B, 13B, 30B				https://arxiv.org/abs/2304.01196	https://github.com/project-baize/baize-chatbot					transformer	https://huggingface.co/project-baize/baize-lora-7B			Finetuned LLaMA with LoRA
SAM	Meta AI	2023-04-05						https://arxiv.org/abs/2304.02643	https://github.com/facebookresearch/segment-anything	https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/	yes			vision
Bark	Suno	2023-04-09		80M				N/A	https://github.com/suno-ai/bark				cc-by-nc-4.0	voice	yes
Dolly v2	Databricks	2023-04-12	Pythia	3B, 7B, 12B				N/A	https://github.com/databrickslabs/dolly			Apache 2.0	MIT	chatbot	https://huggingface.co/databricks/dolly-v2-12b	yes	no
CodeWhisperer	Amazon	2023-04-13		N/A				N/A	N/A	https://aws.amazon.com/blogs/aws/amazon-codewhisperer-free-for-individual-use-is-now-generally-available/			N/A	code	no			Self-hosted Copilot clone
GPT4All-J	Nomic AI	2023-04-14	GPT-J	6.7B				https://static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf	https://github.com/nomic-ai/gpt4all		yes	Apache 2.0	Apache 2.0	transformer	https://huggingface.co/nomic-ai/gpt4all-j	yes	no
DINOv2	Meta AI	2023-04-14	from scratch	21M - 1.1B				https://arxiv.org/abs/2304.07193	https://github.com/facebookresearch/dinov2	https://ai.facebook.com/blog/dino-v2-computer-vision-self-supervised-learning/				vision	yes
VideoLDM	Nvidia	2023-04-18	Stable Diffusion					https://arxiv.org/abs/2304.08818	N/A	https://research.nvidia.com/labs/toronto-ai/VideoLDM/
StableLM	Stability AI	2023-04-19	from scratch	3B, 7B, (15B, 65B, 175B)				N/A	https://github.com/stability-AI/stableLM/	https://stability.ai/blog/stability-ai-launches-the-first-of-its-stablelm-suite-of-language-models			cc-by-nc-4.0	transformer	yes
Open-Assistant SFT-6	OpenAssistant	2023-04-22	LLaMA	30B				https://arxiv.org/abs/2304.07327					see LLaMA	transformer	https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor
WizardLM	Microsoft	2023-04-24	LLaMA	7B				https://arxiv.org/abs/2304.12244	https://github.com/nlpxucan/WizardLM					transformer	yes
DeepFloyd IF	Stability AI	2023-04-28						N/A	https://github.com/deep-floyd/IF	https://stability.ai/blog/deepfloyd-if-text-to-image-model
StableVicuna	Stability AI	2023-04-28	Vicuna 13B	13B				N/A	https://github.com/Stability-AI/StableLM	https://stability.ai/blog/stablevicuna-open-source-rlhf-chatbot			cc-by-nc-4.0	transformer	https://huggingface.co/CarperAI/stable-vicuna-13b-delta			Vicuna 13B + RLHF
FastChat-T5	LM-SYS	2023-04-28	Flan-T5-XL	3B				N/A	https://github.com/lm-sys/FastChat#fastchat-t5				Apache 2.0	transformer	https://huggingface.co/lmsys/fastchat-t5-3b-v1.0
LLaMA-Adapter V2	Shanghai AI Lab	2023-04-28	LLaMA					https://arxiv.org/abs/2304.15010	https://github.com/ZrrSkywalker/LLaMA-Adapter					transformer
Replit Code	Replit	2023-05-02	from scratch	2.7B				N/A	https://github.com/replit/ReplitLM	https://replit.com/site/ghostwriter			cc-by-sa-4.0	code	https://huggingface.co/replit/replit-code-v1-3b
OpenLLaMA	OpenLM Research	2023-05-02	from scratch	7B					https://github.com/openlm-research/open_llama		RedPajama		Apache 2.0	transformer	https://huggingface.co/openlm-research/open_llama_7b_preview_300bt			Apache 2.0 LLaMA clone based on RedPajama data
Shap-E	OpenAI	2023-05-03	from scratch	300M				https://arxiv.org/pdf/2305.02463.pdf	https://github.com/openai/shap-e			MIT		diffuser	https://github.com/openai/shap-e/blob/main/shap_e/models/download.py			3D image generation
StarCoder	BigCode	2023-05-04		15B	1T tokens + 35B python tokens		8k	https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view	https://github.com/bigcode-project/starcoder	https://huggingface.co/blog/starcoder			BigCode OpenRAIL-M v1	code	https://huggingface.co/bigcode/starcoder
RedPajama	TogetherComputer	2023-05-05	from scratch	3B, 7B				N/A	https://github.com/togethercomputer/RedPajama-Data	https://www.together.xyz/blog/redpajama-models-v1			Apache 2.0		https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-7B-v0.1			Open reproduction of LLaMA
MPT-7B (MosaicML Pretrained Transformer)	MosaicML	2023-05-05	from scratch	7B				N/A	https://github.com/mosaicml/llm-foundry	https://www.mosaicml.com/blog/mpt-7b			Apache 2.0	transformer	https://huggingface.co/mosaicml/mpt-7b-instruct			Open reproduction of LLaMA
MPT-30B (MosaicML Pretrained Transformer)	MosaicML	2023-06-22	from scratch	30B				N/A	https://github.com/mosaicml/llm-foundry	https://www.mosaicml.com/blog/mpt-30b			Apache 2.0	transformer	https://huggingface.co/mosaicml/mpt-30b-instruct			Open reproduction of LLaMA
PanGu-sigma	Huawei
AnthropicLM	Anthropic AI												N/A		no
Lit-LLaMA			LLaMA	7B, 13B, 30B, 65B								Apache 2.0	NC research			optional with Alpaca	no
ImageBind	Meta AI	2023-05-09	from scratch					https://arxiv.org/abs/2305.05665	https://github.com/facebookresearch/ImageBind	https://ai.facebook.com/blog/imagebind-six-modalities-binding-ai/		cc-by-nc-4.0	cc-by-nc-4.0	transformer	https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth			six different modalities: images, text, audio, depth, thermal, and IMU
Open-LLaMA V2	s-JoL	2023-05-11	from scratch					N/A	https://github.com/s-JoL/Open-Llama			MIT	MIT	transformer	https://huggingface.co/s-JoL/Open-Llama-V2	yes	yes
PaLM 2	Google	2023-05-10	from scratch					https://ai.google/static/documents/palm2techreport.pdf	N/A	https://ai.google/discover/palm2				transformer	no