Stability AI releases StableVicuna, the AI World’s First Open Source RLHF LLM Chatbot

Stability AI has released StableVicuna, the first large-scale open-source chatbot trained via reinforcement learning from human feedback (RLHF). The success of chat models is attributed to two training paradigms: instruction fine-tuning and RLHF. However, few open-access, open-source models have had both paradigms applied. StableVicuna is a further instruction fine-tuned and RLHF-trained version of Vicuna v0 13B, which is itself an instruction fine-tuned LLaMA 13B model. To achieve StableVicuna’s strong performance, the base Vicuna model is further trained with supervised fine-tuning on a mixture of three datasets. The model is downloadable as a weight delta against the original LLaMA model, so the original LLaMA weights are needed to reconstruct it.

Alongside the chatbot, Stability AI previewed its upcoming chat interface, which is in the final stages of development. The company encourages users to try StableVicuna and provide feedback to help improve the user experience. The StableVicuna model is available in a Hugging Face Space. The company says it will keep iterating on the chatbot and plans to deploy a Discord bot to the Stable Foundation server. It also thanked the open-source contributors who played a crucial role in bringing the project to life.
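To illustrate what distributing a model as a weight delta means, here is a minimal sketch of the general idea: the published file holds the difference between the fine-tuned and base parameters, and adding it back to the base recovers the fine-tuned model. This is not Stability AI's actual conversion script. The function name `apply_delta`, the parameter names, and the toy values are all hypothetical, and plain Python lists stand in for real tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Return base + delta, computed parameter by parameter."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise addition recovers the fine-tuned parameter.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy values only; real checkpoints hold billions of parameters.
base_weights = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.5]}
delta_weights = {"layer0.weight": [0.25, -0.5], "layer0.bias": [0.25]}
merged_weights = apply_delta(base_weights, delta_weights)
```

The delta alone is not a usable model, which is why a release of this kind requires separate access to the base LLaMA weights.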

