Text Data
When training fine-tuned models, you need to prepare your own data, such as dialogue data, command data, pure text corpus, and so on.
The quality of the data will greatly affect the effectiveness of fine-tuning the model.
We provide some data for your reference.
📄️ Reddit Top 20K
We provide historical archive data of Reddit partitioned by subreddit for your download and exploration.