OpenChat is a library of open-source language models fine-tuned with C-RLFT (Conditioned Reinforcement Learning Fine-Tuning), a strategy inspired by offline reinforcement learning that trains on mixed-quality data without preference labels.
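As a rough illustration of that idea, the sketch below weights a standard fine-tuning loss by a coarse per-source reward, which stands in for the missing preference labels. The source classes and weight values here are illustrative assumptions, not OpenChat's exact recipe.

```python
import torch
import torch.nn.functional as F

# Coarse per-source rewards stand in for missing preference labels;
# the exact classes and values below are illustrative assumptions.
SOURCE_WEIGHTS = {"gpt4": 1.0, "gpt35": 0.1}

def c_rlft_loss(logits, labels, source):
    """Reward-weighted next-token cross-entropy for one batch.

    `logits`: (batch, seq, vocab) from a policy conditioned on the data
    source (e.g., a source tag in the chat template). `source` selects
    the coarse reward weight for this batch's data class.
    """
    ce = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        labels[:, 1:].reshape(-1),
        ignore_index=-100,
    )
    # Scale the supervised loss by the coarse reward of the source class.
    return SOURCE_WEIGHTS[source] * ce

# Toy usage: random logits for a 2-sequence batch, length 8, vocab 100.
logits = torch.randn(2, 8, 100, requires_grad=True)
labels = torch.randint(0, 100, (2, 8))
loss = c_rlft_loss(logits, labels, "gpt4")
loss.backward()
```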
A #merge model based on Llama-2-13B, made possible thanks to compute provided by the KoboldAI community. It's a merge between the following models (a weight-averaging sketch follows the list):
- KoboldAI/LLaMA2-13B-Tiefighter
- chaoyi-wu/MedLLaMA_13B
- Doctor-Shotgun/llama-2-13b-chat-limarp-v2-merged
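A minimal sketch of the linear weight averaging commonly used for such merges; the checkpoint file names and blend ratios here are hypothetical, not the recipe actually used for this model.

```python
import torch

# Hypothetical checkpoint paths and blend ratios for the three models above.
paths = ["tiefighter.bin", "medllama.bin", "limarp.bin"]
weights = [0.5, 0.25, 0.25]  # assumed ratios; must sum to 1.0

state_dicts = [torch.load(p, map_location="cpu") for p in paths]

merged = {}
for key in state_dicts[0]:
    # Weighted average of each tensor across the checkpoints;
    # assumes identical architectures (all Llama-2-13B here).
    merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))

torch.save(merged, "merged.bin")
```

In practice, merge tools often vary the ratios per layer or per model, but the core operation is this per-tensor weighted average.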
Anthropic's model for low-latency, high-throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text.
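For context, a long document can be sent in a single request. This is a minimal sketch using the Anthropic Python SDK; the model ID (`claude-instant-1.2`) and the input file name are assumptions for illustration.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical input file; with a 100k-token window, hundreds of pages
# can fit in a single pass.
long_document = open("report.txt").read()

message = client.messages.create(
    model="claude-instant-1.2",  # assumed model ID for this entry
    max_tokens=1024,
    messages=[
        {"role": "user", "content": long_document + "\n\nSummarize the document above."}
    ],
)
print(message.content[0].text)
```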
A continuation of the OpenHermes 2 model, trained on additional code datasets. Perhaps the most interesting finding from training on a substantial ratio of code instructions (estimated at around 7-14% of the total dataset) was that it boosted several non-code benchmarks, including TruthfulQA, AGIEval, and the GPT4All suite. It did, however, reduce the BigBench score, but the overall net gain is significant.
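To make that ratio concrete, the arithmetic below solves for how many code-instruction examples would yield a given share of the total dataset; the counts are made up for illustration, not OpenHermes 2.5's actual figures.

```python
# Hypothetical count of non-code instruction examples.
general_n = 900_000
# Target code share, chosen from within the estimated 7-14% band.
target_ratio = 0.10

# Solve code_n / (code_n + general_n) = target_ratio for code_n.
code_n = int(target_ratio * general_n / (1 - target_ratio))
print(code_n, code_n / (code_n + general_n))  # 100000 0.1
```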