Recent Generative AI Models
This page was last updated on 2025-09-13.
New Table
Model |
Company |
Date |
Params |
Paper |
Source |
Website |
Weights |
Remarks |
---|---|---|---|---|---|---|---|---|
Robotic Transformer 1 |
Google DeepMind |
2022-12-13 |
||||||
Einstein GPT |
Salesforce |
2023-03-07 |
uses OpenAI API? |
|||||
🧨 Stable UnCLIP 2.1 |
Stability AI |
2023-03-24 |
model behind Reimagine |
|||||
LLaVA |
University of Wisconsin-Madison |
2023-04-17 |
LLaVA = Large Language and Vision Assistant |
|||||
WizardLM |
Microsoft |
2023-04-24 |
7B, 13B, 30B, 70B |
based on LLaMA |
||||
Eleven Multilingual v1 |
ElevenLabs |
2023-04-27 |
English, French, German, Hindi, Italian, Polish, Portuguese, Spanish |
|||||
PaLM 2 |
2023-05-10 |
|||||||
LIMA |
Meta AI |
2023-05-18 |
65B |
based on LLaMA |
||||
🔈 Massive Multilingual Speech |
Meta AI |
2023-05-22 |
300M, 1B |
|||||
Falcon |
TII |
2023-05-26 |
1B, 7B, 40B |
coming soon |
||||
AlphaDev |
Google DeepMind |
2023-06-07 |
||||||
🔈 StyleTTS 2 |
Columbia University |
2023-06-13 |
||||||
WizardCoder |
Microsoft |
2023-06-14 |
15B |
|||||
Llama 2 |
Meta AI |
2023-07-18 |
7B, 13B, 70B |
|||||
Meta-Transformer |
2023-07-20 |
85M, 302M |
12 modalities |
|||||
Stable Beluga 2 |
Stability AI |
2023-07-21 |
70B |
based on Llama 2 |
||||
🧨 Stable Diffusion XL 1.0 |
Stability AI |
2023-07-26 |
3.5B |
|||||
Robotic Transformer 2 |
Google DeepMind |
2023-07-28 |
||||||
StableCode |
Stability AI |
2023-08-08 |
3B |
|||||
🔈 AudioSep - Separate Anything You Describe |
Audio-AGI |
2023-08-09 |
||||||
🔈 AudioLDM2 |
ByteDance |
2023-08-10 |
||||||
🔈 Eleven Multilingual v2 |
ElevenLabs |
2023-08-22 |
English, French, German, Hindi, Italian, Polish, Portuguese, Spanish, Chinese, Korean, Dutch, Turkish, Swedish, Indonesian, Filipino, Japanese, Ukrainian, Greek, Czech, Finnish, Romanian, Danish, Bulgarian, Malay, Slovak, Croatian, Classic Arabic, Tamil |
|||||
SeamlessM4T |
Meta AI |
2023-08-22 |
1.2B, 2.3B |
|||||
Code Llama |
Meta AI |
2023-08-24 |
7B, 13B, 34B |
|||||
Nougat OCR |
Meta AI |
2023-08-25 |
Specialized in academic documents |
|||||
Falcon 180B |
TII |
2023-09-06 |
180B |
coming soon |
see also: falcon-40b |
|||
Persimmon |
Adept |
2023-09-07 |
8B |
|||||
🔈 StableAudio |
Stability AI |
2023-09-13 |
||||||
🧨 DALL-E 3 |
OpenAI |
2023-09-21 |
||||||
📽️ LaVie |
Shanghai Artificial Intelligence Laboratory |
2023-09-26 |
||||||
Mistral-7B |
Mistral AI |
2023-09-27 |
7B |
|||||
Qwen |
Alibaba |
2023-09-28 |
7B, 14B |
|||||
LLaVA 1.5 |
University of Wisconsin-Madison |
2023-10-05 |
||||||
jina-embeddings-v2 |
Jina AI |
2023-10-25 |
||||||
Yi |
01.ai |
2023-11-02 |
6B, 34B |
|||||
📽️ Emu Video |
Meta AI |
2023-11-16 |
||||||
📽️ Stable Video Diffusion |
Stability AI |
2023-11-21 |
||||||
Meditron |
École Polytechnique Fédérale de Lausanne (EPFL) |
2023-11-27 |
7B, 70B |
|||||
🧨 SDXL Turbo |
Stability AI |
2023-11-28 |
||||||
📽️ Animate Anyone |
Alibaba |
2023-11-28 |
||||||
Seamless |
Meta AI |
2023-11-30 |
||||||
OpenVoice |
MyShell.ai |
2023-12-03 |
7B, 13B, 34B, 70B |
|||||
Gemini |
Google DeepMind |
2023-12-06 |
Nano, Pro, and Ultra sizes; Pro powers Bard |
|||||
AlphaCode 2 |
Google DeepMind |
2023-12-06 |
||||||
Stable LM Zephyr 3B |
Stability AI |
2023-12-07 |
3B |
|||||
Mixtral 8x7B |
Mistral AI |
2023-12-11 |
45B |
|||||
🧨 Imagen 2 |
Google DeepMind |
2023-12-13 |
||||||
Stable Code 3B |
Stability AI |
2024-01-16 |
3B |
|||||
Stable LM 2 |
Stability AI |
2024-01-19 |
1.6B |
|||||
Eagle 7B |
RWKV |
2024-01-29 |
7B |
RWKV-v5 architecture |
||||
Code Llama 70B |
Meta AI |
2024-01-29 |
7B, 13B, 34B, 70B |
|||||
MGIE |
Apple |
2024-02-05 |
||||||
Sora |
OpenAI |
2024-02-15 |
||||||
Gemma |
2024-02-21 |
2B, 7B |
||||||
🧨 Stable Diffusion 3 |
Stability AI |
2024-02-22 (preview) |
0.8B, ..., 8B |
|||||
Old Table
Model |
Company |
Date |
Base Model |
Parameters |
Training Data Size |
Training Time |
Context length |
Paper |
Source |
Website |
Training data |
Code License |
Weights License |
Type |
Model weights |
Instruction Tuning |
RLHF |
Remarks |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Deep Blue |
IBM |
1996-01-01 |
from scratch |
N/A |
https://www.sciencedirect.com/science/article/pii/S0004370201001291 |
N/A |
https://www.ibm.com/ibm/history/ibm100/us/en/icons/deepblue/ |
games |
Chess |
|||||||||
Watson |
IBM |
2011-01-01 |
from scratch |
N/A |
N/A |
games |
Jeopardy |
|||||||||||
AlexNet |
|
2012-09-30 |
from scratch |
60M |
vision |
won the ImageNet LSVRC 2012 challenge with a 15.3% top-5 error rate |
||||||||||||
word2vec |
2013-01-16 |
no |
no |
|||||||||||||||
Inception v1 |
2014-09-17 |
from scratch |
vision |
won the ImageNet LSVRC 2014 challenge with a 6.7% top-5 error rate |
||||||||||||||
DQN |
Google DeepMind |
2015-02-25 |
from scratch |
deep RL |
||||||||||||||
char-rnn |
Andrej Karpathy |
2015-05-21 |
from scratch |
language |
Featured on https://www.aiweirdness.com/ |
|||||||||||||
GloVe |
Stanford |
2015-09-01 |
Apache 2.0 |
Apache 2.0 |
yes |
no |
no |
|||||||||||
fastText |
2015-11-09 |
MIT |
MIT |
yes |
no |
no |
||||||||||||
Inception v3 |
2015-12-02 |
vision |
||||||||||||||||
ResNet |
Microsoft |
2015-12-10 |
from scratch |
vision |
won the ImageNet LSVRC 2015 challenge with a 3.57% top-5 error rate; "better than humans" |
|||||||||||||
AlphaGo |
Google DeepMind |
2016-01-27 |
from scratch |
games |
||||||||||||||
Inception v4 |
2016-02-23 |
vision |
||||||||||||||||
Tay |
Microsoft |
2016-03-23 |
N/A |
N/A |
https://blogs.microsoft.com/blog/2016/03/25/learning-tays-introduction/ |
chatbot |
||||||||||||
CycleGAN |
UC Berkeley |
2017-03-30 |
GAN |
yes |
||||||||||||||
AlphaGo Zero |
Google DeepMind |
2017-10-19 |
games |
|||||||||||||||
AlphaZero |
Google DeepMind |
2017-12-05 |
games |
|||||||||||||||
ELMo (Embeddings from Language Models) |
Allen Institute for AI |
2018-02-15 |
180M |
language |
yes |
|||||||||||||
GPT (Generative Pre-trained Transformer) |
OpenAI |
2018-06-11 |
from scratch |
117M |
https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf |
transformer |
yes |
no |
no |
|||||||||
BERT (Bidirectional Encoder Representations from Transformers) |
2018-10-11 |
108M, 334M |
transformer |
yes |
||||||||||||||
StyleGAN |
Nvidia |
2018-12-12 |
GAN |
yes |
||||||||||||||
GPT-2 |
OpenAI |
2019-02-14 |
1.5B |
transformer |
yes |
no |
no |
|||||||||||
XLNet |
CMU & Google |
2019-06-19 |
117M, 360M |
Apache 2.0 |
yes |
|||||||||||||
RoBERTa |
Meta AI |
2019-07-26 |
BERT |
354M |
transformer |
yes |
||||||||||||
ALBERT (A Lite BERT) |
2019-09-26 |
BERT |
12M, 18M, 60M, 235M |
Apache 2.0 |
transformer |
yes |
||||||||||||
DistilBERT |
HuggingFace |
2019-10-02 |
BERT |
66M |
Apache 2.0 |
transformer |
yes |
|||||||||||
Text-to-Text Transfer Transformer (T5) |
2019-10-23 |
from scratch |
11B |
1T tokens |
https://github.com/google-research/text-to-text-transfer-transformer |
https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html |
Apache 2.0 |
Apache 2.0 |
transformer |
yes |
no |
no |
||||||
AlphaFold |
Google DeepMind |
2020-01-15 |
from scratch |
https://github.com/deepmind/deepmind-research/tree/master/alphafold_casp13 |
yes |
|||||||||||||
Turing NLG |
Microsoft |
2020-02-13 |
17B |
N/A |
||||||||||||||
ELECTRA |
Stanford & Google |
2020-03-23 |
BERT? |
14M, 110M, 335M |
yes |
|||||||||||||
DeBERTa |
Microsoft |
2020-06-05 |
BERT |
MIT |
transformer |
yes |
||||||||||||
GPT-3 |
OpenAI |
2020-06-11 |
from scratch |
175B |
300B tokens |
/ |
private |
private |
transformer |
no |
no |
no |
||||||
ImageGPT |
OpenAI |
2020-06-17 |
https://cdn.openai.com/papers/Generative_Pretraining_from_Pixels_V2.pdf |
private |
private |
transformer |
no |
|||||||||||
mT5 |
2020-10-22 |
from scratch |
300M - 13B |
1T tokens |
mC4 |
Apache 2.0 |
Apache 2.0 |
transformer |
||||||||||
DALL-E |
OpenAI |
2021-01-05 |
GPT-3 |
12B |
private |
private |
transformer |
no |
||||||||||
DeBERTa V2 |
Microsoft |
2021-02-03 |
900M - 1.5B |
N/A |
transformer |
yes |
||||||||||||
CLIP |
OpenAI |
2021-02-26 |
MIT |
yes |
||||||||||||||
GLM |
Tsinghua University |
2021-03-18 |
110M - 10B |
transformer |
yes |
|||||||||||||
GPT-Neo |
EleutherAI |
2021-03-21 |
125M, 1.3B, 2.7B |
N/A |
MIT |
transformer |
||||||||||||
LaMDA |
2021-05-18 |
from scratch |
137B |
2.8T tokens |
58d |
N/A |
N/A |
transformer |
no |
|||||||||
GPT-J |
EleutherAI |
2021-06-09 |
6B |
yes |
Apache 2.0 |
Apache 2.0 |
transformer |
no |
no |
|||||||||
CPM-2 |
Tsinghua University |
2021-06-20 |
11B |
yes |
||||||||||||||
Copilot |
GitHub |
2021-06-29 |
OpenAI Codex |
N/A |
N/A |
code |
no |
|||||||||||
ERNIE 3.0 |
Baidu |
2021-07-05 |
10B |
375B tokens |
N/A |
N/A |
N/A |
transformer |
no |
|||||||||
AlphaFold 2 |
Google DeepMind |
2021-07-15 |
21B |
yes |
||||||||||||||
Jurassic-1 |
AI21 Labs |
2021-08-01 |
178B |
300B tokens |
N/A |
N/A |
https://www.ai21.com/blog/announcing-ai21-studio-and-jurassic-1 |
N/A |
N/A |
no |
||||||||
Codex |
OpenAI |
2021-08-10 |
GPT-3 |
12B |
100B tokens |
N/A |
private |
private |
code |
no |
||||||||
T0 |
BigScience |
2021-10-15 |
T5 |
11B |
27h |
Apache 2.0 |
Apache 2.0 |
transformer |
||||||||||
DeBERTa V3 |
Microsoft |
2021-11-18 |
MIT |
transformer |
yes |
|||||||||||||
Gopher |
Google DeepMind |
2021-12-08 |
from scratch |
280B |
300B tokens |
38d |
no |
no |
no |
|||||||||
GLaM (Generalist Language Model) |
2021-12-13 |
from scratch |
1.2T |
280T tokens |
24d |
|||||||||||||
WebGPT |
OpenAI |
2021-12-17 |
GPT-3 |
175B |
N/A |
private |
private |
transformer |
no |
no |
yes |
|||||||
ClipSeg |
2021-12-18 |
|||||||||||||||||
InstructGPT |
OpenAI |
2022-01-27 |
GPT-3 |
175B |
N/A |
private |
private |
transformer |
no |
yes |
yes |
|||||||
Megatron-Turing (MT) NLG |
Microsoft |
2022-01-28 |
530B |
270B tokens |
N/A |
transformer |
no |
|||||||||||
AlphaCode |
Google DeepMind |
2022-02-02 |
0.3B,1B,3B,9B,41B |
967B tokens |
https://www.deepmind.com/blog/competitive-programming-with-alphacode |
N/A |
code |
no |
||||||||||
GPT-3.5 |
OpenAI |
2022-03-15 |
355B |
N/A |
private |
private |
transformer |
no |
||||||||||
Imagen |
2022-03-23 |
|||||||||||||||||
CodeGen-Multi |
Salesforce |
2022-03-25 |
350M - 16B |
2048 |
code |
|||||||||||||
Chinchilla |
Google DeepMind |
2022-03-29 |
70B |
1.4T tokens |
N/A |
https://www.deepmind.com/blog/an-empirical-analysis-of-compute-optimal-large-language-model-training |
N/A |
no |
||||||||||
T5X |
2022-03-31 |
transformer |
||||||||||||||||
PaLM (Pathways Language Model) |
2022-04-04 |
8B, 62B, 540B |
780B tokens |
https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html |
N/A |
N/A |
transformer |
no |
||||||||||
GPT-NeoX |
EleutherAI |
2022-04-14 |
20B |
825GB |
Apache 2.0 |
Apache 2.0 |
transformer |
no |
no |
|||||||||
Tk-Instruct |
Allen Institute for AI |
2022-04-16 |
T5 |
3B, 11B |
4h |
Apache 2.0 |
yes |
|||||||||||
Flamingo |
Google DeepMind |
2022-04-29 |
https://www.deepmind.com/blog/tackling-multiple-tasks-with-a-single-visual-language-model |
N/A |
no |
|||||||||||||
OPT |
Meta AI |
2022-05-03 |
from scratch |
125M - 175B |
180B tokens |
MIT |
NC research |
transformer |
no |
no |
||||||||
UL2 |
Google Brain |
2022-05-10 |
20B |
1T tokens |
Apache 2.0 |
Apache 2.0 |
transformer |
yes |
no |
no |
||||||||
LaMDA 2 |
2022-05-11 |
N/A |
transformer |
no |
||||||||||||||
YaLM |
Yandex |
2022-06-22 |
from scratch |
100B |
N/A |
transformer |
yes |
|||||||||||
BLOOM |
BigScience |
2022-07-06 |
from scratch |
up to 176B |
366B tokens |
105d |
bigscience-bloom-rail-1.0 |
bigscience-bloom-rail-1.0 |
transformer |
no |
no |
|||||||
NLLB-200 (No Language Left Behind) |
Meta AI |
2022-07-06 |
from scratch |
55B |
translator |
translates between 200 languages |
||||||||||||
Midjourney |
Midjourney Inc. |
2022-07-12 |
from scratch |
N/A |
N/A |
N/A |
diffuser |
no |
Exposed as a Discord bot |
|||||||||
DALL-E 2 |
OpenAI |
2022-07-20 |
GPT-3 |
private |
private |
diffuser |
no |
|||||||||||
AlexaTM |
Amazon |
2022-08-02 |
from scratch |
20B |
1.3T tokens |
120d |
transformer |
via SageMaker |
no |
no |
||||||||
Stable Diffusion |
Stability AI |
2022-08-10 |
from scratch |
890M |
diffuser |
yes |
See also https://stablediffusionweb.com/ |
|||||||||||
DreamBooth |
2022-08-25 |
N/A |
no |
|||||||||||||||
CodeGeeX |
Tsinghua University |
2022-09-19 |
from scratch |
13B |
850B tokens |
60d |
Apache 2.0 |
CodeGeeX License |
code |
on request |
N/A |
N/A |
||||||
WeLM |
2022-09-21 |
from scratch |
10B |
300B tokens |
24d |
yes |
no |
no |
Chinese language |
|||||||||
Sparrow |
Google DeepMind |
2022-09-22 |
from scratch |
70B |
https://www.deepmind.com/blog/building-safer-dialogue-agents |
N/A |
no |
no |
yes |
|||||||||
GLM-130B |
Tsinghua University |
2022-10-05 |
from scratch |
130B |
400B tokens |
60d |
transformer |
yes |
||||||||||
Flan-T5 |
2022-10-20 |
T5 |
60M - 11B |
Apache 2.0 |
Apache 2.0 |
transformer |
yes |
yes |
no |
|||||||||
Flan-PaLM |
2022-10-20 |
PaLM |
540B |
37h |
N/A |
N/A |
N/A |
N/A |
transformer |
no |
yes |
no |
||||||
U-PaLM |
2022-10-20 |
PaLM |
8B, 62B, 540B |
5d |
N/A |
N/A |
transformer |
no |
no |
no |
||||||||
BLOOMZ |
BigScience |
2022-11-03 |
BLOOM |
176B |
bigscience-bloom-rail-1.0 |
bigscience-bloom-rail-1.0 |
transformer |
yes |
yes |
no |
BLOOM + Multitask prompted finetuning (MTF) |
|||||||
mT0 |
BigScience |
2022-11-03 |
mT5 |
300M - 13B |
Apache 2.0 |
Apache 2.0 |
Google mT5 + Multitask prompted finetuning (MTF) |
|||||||||||
OpenJourney |
PromptHero |
2022-11-08 |
Stable Diffusion |
N/A |
diffuser |
Stable Diffusion fine-tuned to resemble Midjourney |
||||||||||||
Galactica |
Meta AI |
2022-11-16 |
from scratch |
125M - 120B |
106B tokens |
cc-by-nc-4.0 |
transformer |
Focussed on Science |
||||||||||
Stable Diffusion v2 |
Stability AI |
2022-11-24 |
from scratch |
N/A |
diffuser |
yes |
||||||||||||
GPT-JT |
TogetherComputer |
2022-11-29 |
GPT-J |
6B |
N/A |
https://www.together.xyz/blog/releasing-v1-of-gpt-jt-powered-by-open-source-ai |
Apache 2.0 |
Apache 2.0 |
transformer |
no |
||||||||
ChatGPT |
OpenAI |
2022-11-30 |
GPT-3.5 |
N/A |
N/A |
no |
private |
private |
chatbot |
no |
yes |
yes |
||||||
OpenCLIP |
various |
2022-12-14 |
from scratch |
|||||||||||||||
OPT-IML |
Meta AI |
2022-12-22 |
OPT |
30B, 175B |
MIT |
NC research |
transformer |
yes |
yes |
no |
||||||||
Bard |
2023-02-06 |
LaMDA 2 or PaLM 2? |
N/A |
chatbot |
no |
|||||||||||||
LLaMA |
Meta AI |
2023-02-23 |
from scratch |
7B, 13B, 30B, 65B |
1.4T tokens |
21d |
https://ai.facebook.com/blog/large-language-model-llama-meta-ai/ |
GPL 3.0 |
NC research |
transformer |
no |
no |
||||||
Flan-UL2 |
Google Brain |
2023-02-28 |
UL2 |
20B |
Flan collection |
https://github.com/google-research/google-research/tree/master/ul2 |
Apache 2.0 |
Apache 2.0 |
yes |
no |
||||||||
Open-Assistant SFT-1 |
OpenAssistant |
2023-03-09 |
Pythia 12B |
12B |
N/A |
https://github.com/LAION-AI/Open-Assistant/tree/main/model/model_training |
Apache 2.0 |
transformer |
||||||||||
Jurassic-2 |
AI21 Labs |
2023-03-09 |
? |
N/A |
N/A |
N/A |
N/A |
no |
||||||||||
Alpaca-LoRA |
Eric J. Wang |
2023-03-13 |
LLaMA |
N/A |
transformer |
yes |
||||||||||||
Alpaca |
Stanford |
2023-03-13 |
LLaMA |
7B |
N/A |
transformer |
yes |
|||||||||||
h2oGPT |
H2O.ai |
2023-03-13 |
Pythia 12B, GPT-NeoX 20B |
12B, 20B |
N/A |
Apache 2.0 |
transformer |
|||||||||||
ChatGLM |
Tsinghua University |
2023-03-14 |
GLM / GLM-130B? |
6B |
chatbot |
|||||||||||||
GPT-4 |
OpenAI |
2023-03-14 |
from scratch |
8x220B (rumored, unconfirmed) |
private |
private |
transformer |
no |
yes |
yes |
||||||||
Zero-1-to-3 |
Columbia University |
2023-03-20 |
diffuser |
yes |
||||||||||||||
Dolly v1 |
Databricks |
2023-03-24 |
GPT-J |
6B |
N/A |
https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html |
cc-by-nc-4.0 |
chatbot |
||||||||||
GPT4All |
Nomic AI |
2023-03-28 |
LLaMA |
7B |
https://static.nomic.ai/gpt4all/2023_GPT4All_Technical_Report.pdf |
yes |
GPL 3.0 |
chatbot |
LLaMA 7B fine-tuned on GPT-3.5 chats |
|||||||||
Cerebras-GPT |
Cerebras Systems |
2023-03-28 |
from scratch |
111M - 13B |
https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/ |
Apache 2.0 |
Apache 2.0 |
transformer |
no |
no |
Reproduction of the GPT-3 training process |
|||||||
LLaMA-Adapter |
Shanghai AI Lab |
2023-03-28 |
LLaMA |
7B |
||||||||||||||
ColossalChat |
Colossal AI |
2023-03-29 |
LLaMA |
https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat |
Apache 2.0 |
chatbot |
||||||||||||
Vicuna |
LM-SYS |
2023-03-30 |
LLaMA |
7B, 13B |
N/A |
see LLaMA |
transformer |
yes |
||||||||||
BloombergGPT |
Bloomberg |
2023-03-30 |
50B |
https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/ |
transformer |
|||||||||||||
RWKV-4 Raven |
BlinkDL |
2023-04-01 |
1.5B, 3B, 7B, 14B |
RNN |
||||||||||||||
Pythia |
EleutherAI |
2023-04-03 |
70M - 12B |
300B tokens |
Apache 2.0 |
Apache 2.0 |
transformer |
no |
no |
|||||||||
Koala |
UC Berkeley |
2023-04-03 |
LLaMA |
7B, 13B |
N/A |
transformer |
||||||||||||
Baize |
Baize Project |
2023-04-03 |
LLaMA |
7B, 13B, 30B |
transformer |
LLaMA fine-tuned with LoRA |
||||||||||||
SAM |
Meta AI |
2023-04-05 |
https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/ |
yes |
vision |
|||||||||||||
Bark |
Suno |
2023-04-09 |
80M |
N/A |
cc-by-nc-4.0 |
voice |
yes |
|||||||||||
Dolly v2 |
Databricks |
2023-04-12 |
Pythia |
3B, 7B, 12B |
N/A |
Apache 2.0 |
MIT |
chatbot |
yes |
no |
||||||||
CodeWhisperer |
Amazon |
2023-04-13 |
N/A |
N/A |
N/A |
N/A |
code |
no |
Copilot alternative hosted by Amazon |
|||||||||
GPT4All-J |
Nomic AI |
2023-04-14 |
GPT-J |
6.7B |
https://static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf |
yes |
Apache 2.0 |
Apache 2.0 |
transformer |
yes |
no |
|||||||
DINOv2 |
Meta AI |
2023-04-14 |
from scratch |
21M - 1.1B |
https://ai.facebook.com/blog/dino-v2-computer-vision-self-supervised-learning/ |
vision |
yes |
|||||||||||
VideoLDM |
Nvidia |
2023-04-18 |
Stable Diffusion |
N/A |
||||||||||||||
StableLM |
Stability AI |
2023-04-19 |
from scratch |
3B, 7B (15B, 65B, 175B planned) |
N/A |
https://stability.ai/blog/stability-ai-launches-the-first-of-its-stablelm-suite-of-language-models |
cc-by-nc-4.0 |
transformer |
yes |
|||||||||
Open-Assistant SFT-6 |
OpenAssistant |
2023-04-22 |
LLaMA |
30B |
see LLaMA |
transformer |
https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor |
|||||||||||
WizardLM |
Microsoft |
2023-04-24 |
LLaMA |
7B |
transformer |
yes |
||||||||||||
DeepFloyd IF |
Stability AI |
2023-04-28 |
N/A |
|||||||||||||||
StableVicuna |
Stability AI |
2023-04-28 |
Vicuna 13B |
13B |
N/A |
https://stability.ai/blog/stablevicuna-open-source-rlhf-chatbot |
cc-by-nc-4.0 |
transformer |
Vicuna 13B + RLHF |
|||||||||
FastChat-T5 |
LM-SYS |
2023-04-28 |
Flan-T5-XL |
3B |
N/A |
Apache 2.0 |
transformer |
|||||||||||
LLaMA-Adapter V2 |
Shanghai AI Lab |
2023-04-28 |
LLaMA |
transformer |
||||||||||||||
Replit Code |
Replit |
2023-05-02 |
from scratch |
2.7B |
N/A |
cc-by-sa-4.0 |
code |
|||||||||||
OpenLLaMA |
OpenLM Research |
2023-05-02 |
from scratch |
7B |
RedPajama |
Apache 2.0 |
transformer |
https://huggingface.co/openlm-research/open_llama_7b_preview_300bt |
Apache 2.0 LLaMA clone based on RedPajama data |
|||||||||
Shap-E |
OpenAI |
2023-05-03 |
from scratch |
300M |
MIT |
diffuser |
https://github.com/openai/shap-e/blob/main/shap_e/models/download.py |
3D image generation |
||||||||||
StarCoder |
BigCode |
2023-05-04 |
15B |
1T tokens + 35B Python tokens |
8k |
https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view |
BigCode OpenRAIL-M v1 |
code |
||||||||||
RedPajama |
TogetherComputer |
2023-05-05 |
from scratch |
3B, 7B |
N/A |
Apache 2.0 |
https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-7B-v0.1 |
Open reproduction of LLaMA |
||||||||||
MPT-7B (MosaicML Pretrained Transformer) |
MosaicML |
2023-05-05 |
from scratch |
7B |
N/A |
Apache 2.0 |
transformer |
Open reproduction of LLaMA |
||||||||||
MPT-30B (MosaicML Pretrained Transformer) |
MosaicML |
2023-06-22 |
from scratch |
30B |
N/A |
Apache 2.0 |
transformer |
Open reproduction of LLaMA |
||||||||||
PanGu-sigma |
Huawei |
|||||||||||||||||
AnthropicLM |
Anthropic AI |
N/A |
no |
|||||||||||||||
Lit-LLaMA |
LLaMA |
7B, 13B, 30B, 65B |
Apache 2.0 |
NC research |
optional with Alpaca |
no |
||||||||||||
ImageBind |
Meta AI |
2023-05-09 |
from scratch |
https://ai.facebook.com/blog/imagebind-six-modalities-binding-ai/ |
cc-by-nc-4.0 |
cc-by-nc-4.0 |
transformer |
six different modalities: images, text, audio, depth, thermal, and IMU |
||||||||||
Open-LLaMA V2 |
s-JoL |
2023-05-11 |
from scratch |
N/A |
MIT |
MIT |
transformer |
yes |
yes |
|||||||||
PaLM 2 |
2023-05-10 |
from scratch |
N/A |
transformer |
no |