Text AIGC Models
There are currently many text generation models. Some are fine-tuned versions of existing large models, while others are base models trained from scratch by various companies and research groups.
Well-known Base Models
Base Model | Release Date | Model Size | Publisher | Language | Features |
---|---|---|---|---|---|
LLaMA | 2023.03 | 7B~65B | Meta | Mainly English, weak support for other languages | |
MPT | 2023.05 | 7B | MosaicML | Mainly English, weak support for other languages | Licensed for commercial use, supports long contexts |
ChatGLM | 2023.03 | 6B | THUDM | Chinese and English | |
Cerebras-GPT | 2023.03 | 1.3B~13B | Cerebras | Mainly English, weak support for other languages | |
rwkv-4-raven | 2023.04 | 1.5B~14B | BlinkDL | Multiple versions with different language support | Non-transformer (RNN-based) architecture |
OpenFlamingo | 2023.03 | 9B | LAION | Mainly English, weak support for other languages | Multi-modal, supports images |
StableLM | 2023.04 | 3B~7B | Stability AI | Mainly English, weak support for other languages | |
BLOOM | 2022.10 | 1B~176B | BigScience | 59 languages | |
RedPajama-INCITE | 2023.05 | 3B~7B | Together | Mainly English, weak support for other languages | |
Pythia | 2023.03 | 1B~12B | EleutherAI | English | |
GPT-Neo | 2021.03 | 125M~2.7B | EleutherAI | English | |
GPT-J | 2021.03 | 6B | EleutherAI | English | |
GPT-NeoX | 2022.02 | 20B | EleutherAI | English | |
OPT | 2022.05 | 125M~175B | Meta | Mainly English, weak support for other languages | |
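Most of the open models in the table above publish checkpoints on the Hugging Face Hub and can be loaded with the transformers library. Below is a minimal sketch, assuming the GPT-Neo 1.3B checkpoint id `EleutherAI/gpt-neo-1.3B`; other models follow the same pattern with their own checkpoint ids.

```python
# Minimal sketch: loading a base model and generating text with transformers.
# The checkpoint id below is an assumption (GPT-Neo 1.3B on the Hugging Face Hub);
# substitute the id of whichever base model you want to try.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-neo-1.3B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```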
Remarks
The relationships between other well-known community models and these base models are shown in the table below.
Base Model | Model Name | Fine-tuning Method | Publisher | Introduction |
---|---|---|---|---|
LLaMA | Alpaca | Full-parameter fine-tuning | Stanford | Fine-tuned from LLaMA on instruction data generated with OpenAI's text-davinci-003 |
LLaMA | Alpaca-LoRA | Lightweight fine-tuning | tloen | Same as Alpaca, but fine-tuned with LoRA (see the sketch below this table) |
LLaMA | Vicuna | Full-parameter fine-tuning | LMSYS Org | Fine-tuned from LLaMA on user-shared conversations collected from ShareGPT |
LLaMA | Koala | Full-parameter fine-tuning | UC Berkeley | Fine-tuned from LLaMA |
LLaMA | WizardLM | Full-parameter fine-tuning | WizardLM | Fine-tuned from LLaMA, focuses on complex instruction fine-tuning |
GPT-J | dolly-v1 | Full-parameter fine-tuning | Databricks | Fine-tuned from GPT-J, focuses on instruction fine-tuning |
GPT-J | gpt4all-j | Full-parameter fine-tuning | Nomic AI | Fine-tuned from GPT-J |
Pythia | dolly-v2 | Full-parameter fine-tuning | Databricks | Fine-tuned from Pythia, focuses on instruction fine-tuning |
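The "Lightweight fine-tuning" entry above (Alpaca-LoRA) refers to LoRA, which freezes the base model and trains only small low-rank adapter matrices instead of all parameters. Below is a minimal sketch using the Hugging Face peft library; the base checkpoint, target modules, and hyperparameters are illustrative assumptions, not the settings used by Alpaca-LoRA or any other project listed here.

```python
# Minimal sketch of LoRA-style lightweight fine-tuning with the peft library.
# The base checkpoint, target modules, and hyperparameters are illustrative
# assumptions, not the settings used by the projects in the table above.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor applied to the updates
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in GPT-Neo
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
# `model` can now be trained as usual; the frozen base weights stay unchanged
# and only the LoRA adapters are updated.
```

Because only the adapters are trained, this approach needs far less GPU memory than the full-parameter fine-tuning used by Alpaca, Vicuna, and the Dolly models.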