***

##->Archived News:<-

Date: (MM/DD/YYYY) | Description:
------ | ------
07/23/2024 | Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/
07/22/2024 | llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633
07/18/2024 | Improved DeepSeek-V2-Chat 236B: https://hf.co/deepseek-ai/DeepSeek-V2-Chat-0628
07/18/2024 | Mistral NeMo 12B base & instruct with 128k context: https://mistral.ai/news/mistral-nemo/
07/16/2024 | Codestral Mamba, tested up to 256k context: https://hf.co/mistralai/mamba-codestral-7B-v0.1
07/16/2024 | MathΣtral Instruct based on Mistral 7B: https://hf.co/mistralai/mathstral-7B-v0.1
07/13/2024 | Llama 3 405B coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271
07/09/2024 | Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
07/07/2024 | Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031
07/02/2024 | Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
06/28/2024 | Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
06/27/2024 | Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw
06/27/2024 | Gemma 2 released: https://hf.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315
06/25/2024 | Cambrian-1: Collection of vision-centric multimodal LLMs: https://cambrian-mllm.github.io
06/23/2024 | Support for BitnetForCausalLM merged: https://github.com/ggerganov/llama.cpp/pull/7931
06/18/2024 | Meta Research releases multimodal 34B, audio, and multi-token prediction models: https://ai.meta.com/blog/meta-fair-research-new-releases
06/17/2024 | DeepSeek-Coder-V2 released with 236B & 16B MoEs: https://github.com/deepseek-ai/DeepSeek-Coder-V2
06/14/2024 | Nemotron-4-340B: Dense model designed for synthetic data generation: https://hf.co/nvidia/Nemotron-4-340B-Instruct
06/14/2024 | Nvidia collection of Mamba-2-based research models: https://hf.co/collections/nvidia/ssms-666a362c5c3bb7e4a6bcfb9c
06/11/2024 | Google releases RecurrentGemma, based on a hybrid RNN architecture: https://hf.co/google/recurrentgemma-9b-it
06/06/2024 | Qwen2 released, with better benchmarks than Llama 3: https://qwenlm.github.io/blog/qwen2/
06/01/2024 | KV cache quantization support merged: https://github.com/ggerganov/llama.cpp/pull/7527
05/31/2024 | K2: Fully-reproducible model outperforming Llama 2 70B using 35% less compute: https://hf.co/LLM360/K2
05/29/2024 | Mistral releases Codestral-22B: https://mistral.ai/news/codestral/
05/28/2024 | DeepSeek-V2 support officially merged: https://github.com/ggerganov/llama.cpp/pull/7519
05/24/2024 | Draft PR adds support for Jamba: https://github.com/ggerganov/llama.cpp/pull/7531
05/23/2024 | Cohere releases 8B & 35B Aya 23 with multilingual capabilities: https://hf.co/collections/CohereForAI/c4ai-aya-23-664f4cda3fa1a30553b221dc
05/22/2024 | Mistral v0.3 models with function calling and extended vocab: https://github.com/mistralai/mistral-inference#model-download
05/21/2024 | Fork of llama.cpp adds DeepSeek-V2 support: https://hf.co/leafspark/DeepSeek-V2-Chat-GGUF
05/21/2024 | Microsoft launches Phi-3 small (7B) and medium (14B) under MIT: https://aka.ms/phi3-hf
05/16/2024 | DeepSeek AI releases 16B V2-Lite: https://hf.co/deepseek-ai/DeepSeek-V2-Lite-Chat
05/14/2024 | PaliGemma, Gemma 2, and LLM Comparator: https://developers.googleblog.com/gemma-family-and-toolkit-expansion-io-2024
05/12/2024 | Yi-1.5 released with improved coding, math, and reasoning capabilities: https://hf.co/collections/01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8
05/11/2024 | Japanese 13B model trained on CPU supercomputer: https://hf.co/Fugaku-LLM/Fugaku-LLM-13B
05/11/2024 | OneBit: Towards Extremely Low-bit LLMs: https://github.com/xuyuzhuang11/OneBit
05/10/2024 | Gemma 2B with 10M context: https://hf.co/mustafaaljadery/gemma-2B-10M
05/08/2024 | Refuel LLM-2 for data labeling, enrichment, and cleaning: https://hf.co/refuelai/Llama-3-Refueled
05/08/2024 | OpenAI releases its Model Spec: https://cdn.openai.com/spec/model-spec-2024-05-08.html
05/06/2024 | IBM releases Granite Code Models: https://github.com/ibm-granite/granite-code-models
05/02/2024 | Nvidia releases Llama3-ChatQA-1.5, which excels at QA & RAG: https://chatqa-project.github.io/
05/01/2024 | KAN: Kolmogorov-Arnold Networks: https://arxiv.org/abs/2404.19756
05/01/2024 | Orthogonalized Llama-3-8b: https://hf.co/hjhj3168/Llama-3-8b-Orthogonalized-exl2
04/27/2024 | Refusal in LLMs is mediated by a single direction: https://alignmentforum.org/posts/jGuXSZgv6qfdhMCuJ
04/24/2024 | Snowflake Arctic Instruct 128x3B MoE released: https://hf.co/Snowflake/snowflake-arctic-instruct
04/23/2024 | Phi-3 Mini model released: https://hf.co/microsoft/Phi-3-mini-128k-instruct-onnx
04/21/2024 | Llama 3 70B pruned to 42B parameters: https://hf.co/chargoddard/llama3-42b-v0
04/18/2024 | Llama 3 8B & 70B pretrained and instruction-tuned models released: https://llama.meta.com/llama3/
04/17/2024 | Mixtral-8x22B-Instruct-v0.1 released: https://mistral.ai/news/mixtral-8x22b/
04/15/2024 | Microsoft AI unreleases WizardLM 2: https://web.archive.org/web/20240415221214/https://wizardlm.github.io/WizardLM2/
04/15/2024 | Microsoft AI releases WizardLM 2, including a Mixtral 8x22B finetune: https://wizardlm.github.io/WizardLM2/
04/09/2024 | Mistral releases Mixtral-8x22B: https://twitter.com/MistralAI/status/1777869263778291896
04/09/2024 | Llama 3 coming in the next month: https://techcrunch.com/2024/04/09/meta-confirms-that-its-llama-3-open-source-llm-is-coming-in-the-next-month/
04/08/2024 | StableLM 2 12B released: https://huggingface.co/stabilityai/stablelm-2-12b
04/05/2024 | Qwen1.5-32B released with GQA: https://huggingface.co/Qwen/Qwen1.5-32B
04/04/2024 | Command R+ released with GQA, 104B params, 128K context: https://huggingface.co/CohereForAI/c4ai-command-r-plus
03/28/2024 | MiniGemini: Dense and MoE vision models: https://github.com/dvlab-research/MiniGemini
03/28/2024 | Jamba 52B MoE released with 256k context: https://huggingface.co/ai21labs/Jamba-v0.1
03/27/2024 | Databricks releases 132B MoE model: https://huggingface.co/collections/databricks/dbrx-6601c0852a0cdd3c59f71962
03/23/2024 | Mistral releases 7B v0.2 base model with 32k context: https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar
03/23/2024 | Grok support merged: https://github.com/ggerganov/llama.cpp/pull/6204
03/17/2024 | xAI open-sources Grok: https://github.com/xai-org/grok
03/15/2024 | Control vector support merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/5970
03/11/2024 | New 35B RAG model with 128K context: https://huggingface.co/CohereForAI/c4ai-command-r-v01
03/11/2024 | This week, xAI will open source Grok: https://twitter.com/elonmusk/status/1767108624038449405
02/28/2024 | The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: https://arxiv.org/abs/2402.17764
02/27/2024 | Mistral re-adds the notice to their website: https://www.reddit.com/r/LocalLLaMA/comments/1b18817/mistral_changing_and_then_reversing_website/
02/26/2024 | Mistral partners with Microsoft, removes mentions of open models from website: https://siliconangle.com/2024/02/26/now-microsoft-partner-mistral-ai-challenges-openai-three-new-llms/
02/21/2024 | Google releases two open models, Gemma: https://blog.google/technology/developers/gemma-open-models/
02/18/2024 | 1.5-bit quant for lcpp merged: https://github.com/ggerganov/llama.cpp/pull/5453
02/17/2024 | Kobold.cpp 1.58 prebuilt released: https://github.com/LostRuins/koboldcpp/releases/tag/v1.58
02/16/2024 | Exl2 adds Qwen support: https://github.com/turboderp/exllamav2/issues/334
02/16/2024 | ZLUDA for lcpp merged; however, the outlook is questionable at best: https://github.com/vosen/ZLUDA/pull/102
06/23/2023 | Ooba's preset arena results and SuperHOT 16k prototype released
06/22/2023 | Vicuna 33B (preview), OpenLLaMA 7B scaled, and MPT 30B released
06/20/2023 | SuperHOT Prototype 2 w/ 8K context released >>94191797
06/18/2023 | Minotaur 15B 8K, WizardLM 7B Uncensored v1.0, and Vicuna 1.3 released
06/17/2023 | exllama support merged into ooba; API server rewrite merged into llama.cpp
06/16/2023 | OpenLLaMA 13B released
06/16/2023 | Airoboros GPT-4 v1.2 released
06/16/2023 | Robin-33B-V2 released
06/16/2023 | Dan's 30B Personality Engine LoRA released
06/14/2023 | WizardCoder 15B released
06/14/2023 | Full CUDA GPU acceleration merged into llama.cpp
06/10/2023 | First Landmark Attention models released >>93993800
06/08/2023 | OpenLLaMA 3B and 7B released
06/07/2023 | StarCoderPlus / StarChat-β released
06/07/2023 | chronos-33b released
06/06/2023 | RedPajama 7B released, plus Instruct & Chat variants
06/06/2023 | WizardLM 30B v1.0 released
06/05/2023 | k-quantization released for llama.cpp
06/03/2023 | Nous-Hermes-13b released
06/03/2023 | WizardLM-Uncensored-Falcon-40b released
05/27/2023 | FalconLM releases Falcon-7B & 40B, new foundational models
05/26/2023 | BluemoonRP 30B 4K released
05/25/2023 | QLoRA and 4-bit bitsandbytes released
05/23/2023 | exllama transformer rewrite offers around 2x t/s increases for GPU models
05/22/2023 | SuperHOT 13B prototype & WizardLM Uncensored 30B released
05/19/2023 | RTX 30 series 15% performance gains; quantization breaking changes again >>93536523
05/19/2023 | PygmalionAI releases 13B Pyg & Meth
05/18/2023 | VicunaUnlocked-30B released
05/14/2023 | llama.cpp quantization change breaks current Q4 & Q5 models; they must be quantized again
05/13/2023 | llama.cpp GPU acceleration merged into master >>93403996 >>93404319
05/10/2023 | GPU-accelerated token generation >>93334002
05/06/2023 | MPT 7B, a 65k-context model trained on 1T tokens: https://huggingface.co/mosaicml/mpt-7b-storywriter
05/05/2023 | GPT4-x-AlpacaDente2-30b released: https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b
05/04/2023 | Allegedly leaked document from Google, fretting over open-source LLMs: https://www.semianalysis.com/p/google-we-have-no-moat-and-neither
05/04/2023 | StarCoder, a 15.5B parameter model trained on 80+ programming languages: https://huggingface.co/bigcode/starcoderbase
04/30/2023 | Uncucked Vicuna 13B released: https://huggingface.co/reeducator/vicuna-13b-free
04/30/2023 | PygmalionAI releases two 7B LLaMA-based models: https://huggingface.co/PygmalionAI
04/29/2023 | GPT4 X Alpasta 30B merge: https://huggingface.co/MetaIX/GPT4-X-Alpasta-30b-4bit
04/25/2023 | Proxy script for Tavern via Kobold/webui, increases LLaMA output quality: https://github.com/anon998/simple-proxy-for-tavern
04/23/2023 | OASST 30B released & quantized: https://huggingface.co/MetaIX/OpenAssistant-Llama-30b-4bit
04/22/2023 | SuperCOT LoRA (by kaiokendev), merged by helpful anons: https://huggingface.co/tsumeone/llama-30b-supercot-4bit-128g-cuda https://huggingface.co/ausboss/llama-13b-supercot-4bit-128g
04/22/2023 | OASST "releases" XORs again, deletes them soon after... again
04/21/2023 | StableLM models perform terribly and are apparently broken: https://github.com/Stability-AI/StableLM/issues/30