#Local Models Related Links

->/lmg/<- | ->[Accelerate](https://files.catbox.moe/q9paa9.epub)<-
------ | ------
**Guides**|
[Quick Start Guide](https://rentry.org/lmg-spoonfeed-guide)|Anon's tutorial for getting models running locally
[SillyTavern Guide](https://rentry.org/llama_v2_sillytavern)|Instructions for roleplaying via koboldcpp. Additional [GBNF grammar](https://rentry.org/custom_GBNF) usage
[LM Tuning Guide](https://rentry.org/llm-training)|Training, fine-tuning, and LoRA/QLoRA information
[LM Settings Guide](https://rentry.org/llm-settings)|Explanation of various settings and samplers with suggestions for specific models
[LM GPU Guide](https://archive.is/SY2h6)|Current as of the 40 series. Alternatively, some Anons made a few different [build guides](https://rentry.org/lmg-build-guides)
**Models**|
[HuggingFace](https://huggingface.co/models)|Best source for current quants (filter by GGUF or EXL2)
[LLM VRAM Calc](https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator)|Tool to estimate VRAM usage for GGUF/EXL2/GPTQ quants
[OpenModelDB](https://openmodeldb.info)|Specifically models for upscaling images and videos
[Voice Models](https://voice-models.com)|Easily searchable list, mainly for use with RVC 1/2
[Models Info Table](https://lifearchitect.ai/models-table)|Google Sheet of models, AI labs, datasets, and various other ML info by Alan Thompson
[Chat Leaderboard](https://arena.lmsys.org)|Closed and local models Elo-rated, with additional MMLU/MT-bench scores
|
**Papers**|
[Local Models Papers](https://rentry.org/localmodelspapers)|Papers and articles I've found to be interesting, with a way to search via abstracts
[Arxiv ML](https://arxiv.org/list/cs.LG/pastweek?skip=0&show=250)|Primary source of machine learning papers
[PapersWithCode](https://paperswithcode.com)|Indexer that allows sorting by GitHub stars
[Semantic Scholar](https://www.semanticscholar.org)|Scientific literature semantic search tool
[Scholar Inbox](https://www.scholar-inbox.com)|ML-focused paper recommendations based on personal preferences
|
**News**|
[AI Explained](https://piped.kavin.rocks/@aiexplained-official)|General AI news with well-sourced links (YouTube)
[AI News Blog](https://thezvi.wordpress.com)|LessWrong cultist, so "AI bad" takes, but does a good weekly AI news roundup (blog)
[ML Resources](https://github.com/underlines/awesome-ml)|Broader, sporadically updated list (not fully local)
[Previous Threads](https://desuarchive.org/g/search/subject/%2Flmg%2F)|Always good to search for previous questions before asking
|
**Learn**|
[LLM Course](https://github.com/mlabonne/llm-course)|Collection of articles, videos, courses, and Colabs for learning applied ML
[Andrej Karpathy YT](https://piped.kavin.rocks/@AndrejKarpathy)|In-depth videos on LLM construction from one of OpenAI's founding members
[TF From Scratch](https://archive.is/VEud6)|Blog post with a Jupyter notebook that goes step by step through coding and training a small GPT
[LLM-Sampling](https://artefact2.github.io/llm-sampling/index.xhtml)|Token probability visualizer with support for current popular samplers
[LLM Visualization](https://bbycroft.net/llm)|Drag-and-pull 3D model of various LLMs with explanations of their components
[Intro to DNN](https://arxiv.org/abs/2404.17625)|Book format of a neural networks course that serves as an introduction to ML
[Principles of DL](https://arxiv.org/abs/2106.10165)|Textbook that introduces the math behind deep learning
|
**LLM Inferencing**|
[Text Gen WebUI](https://github.com/oobabooga/text-generation-webui)|Frontend for most GPU/CPU model backends
[WebUI Extensions](https://github.com/oobabooga/text-generation-webui-extensions)|Most notable: XTTSv2 and Stable Diffusion
|
[llama.cpp](https://github.com/ggerganov/llama.cpp)|Main CPU inferencing development with GPU acceleration (GGUF models)
[kobold.cpp](https://github.com/LostRuins/koboldcpp)|llama.cpp fork with the Kobold UI and additional features (with support for older GGML models)
|
[exllama2](https://github.com/turboderp/exllamav2)|Inference library for local LLMs with a new quant style (70B llama2 on 24GB VRAM)
[TabbyAPI](https://github.com/theroyallab/tabbyAPI)|FastAPI application for the exllama2 backend, for use with SillyTavern
|
[SillyTavern](https://github.com/SillyTavern/SillyTavern)|Frontend that is a heavily modified TavernAI fork
[vllm](https://github.com/vllm-project/vllm)|Inference library with fast inferencing and PagedAttention for KV management
|
**LLM Tools**|
[Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)|Fine-tuning tool for various architectures with integrated support for flash attention and RoPE scaling
[QuaRot](https://github.com/spcl/QuaRot)|4/6/8-bit weight/activation/KV quantization scheme based on rotations to remove outliers
[Mergekit](https://github.com/arcee-ai/MergeKit)|Toolkit for merging LLMs, including piecewise assembly of layers
[promptfoo](https://github.com/promptfoo/promptfoo)|Tool for testing and evaluating LLM output quality, also with a side-by-side comparison feature
[Floneum](https://github.com/floneum/floneum)|Graph/node editor for AI workflows with a focus on community-made plugins
[OpenRLHF](https://github.com/OpenLLMAI/OpenRLHF)|Framework for RLHF generation optimized for performance and distributed models
|
**LLM Research**|
[OwLore](https://github.com/pixeli99/OwLore)|Fine-tune method that achieves better results than full fine-tuning while using less memory than LoRA/LISA
[Buffer of Thoughts](https://github.com/YangLing0818/buffer-of-thought-llm)|Reasoning framework for LLMs that uses thought-templates to answer questions and outperforms CoT/Multiquery
[LLM-Drop](https://github.com/Shwai-He/LLM-Drop)|Block/layer drop method that works quite well with attention layers and is orthogonal to quantization
[Temp LoRA](https://github.com/TemporaryLoRA/Temp-LoRA/tree/main)|Employs a temporary LoRA module during text generation to preserve contextual knowledge
[HOMER](https://github.com/alinlab/HOMER)|Hierarchical context merging, a training-free method that works with conventional RoPE-scaling techniques
[PyramidKV](https://github.com/Zefan-Cai/PyramidKV)|KV cache compression method using Pyramidal Information Funneling
|
**LLM Guiding**|
[Langchain](https://github.com/hwchase17/langchain)|Set of resources for getting the most out of LLMs: chains, tool integrations, agents, etc.
[llama_index](https://github.com/jerryjliu/llama_index)|Central interface to connect LLMs with external data
[TextGrad](https://github.com/zou-group/textgrad)|Framework with an API to backpropagate textual gradients with user-defined loss functions
[SGLang](https://github.com/sgl-project/sglang)|Structured generation language designed for LLMs/VLMs
[DSPy](https://github.com/stanfordnlp/dspy)|Composable and declarative modules for instructing LMs in a familiar Pythonic syntax
[EasyEdit](https://github.com/zjunlp/EasyEdit)|Knowledge editing framework for LLMs
|
**Datasets**|
[Huggingface](https://huggingface.co/datasets)|Best source for datasets
[Wiki Embeddings](https://txt.cohere.com/embedding-archives-wikipedia)|Pre-computed embeddings for various languages of Wikipedia
[ERP Scrapes (1)](https://rentry.org/qib8f)[(2)](https://rentry.org/ashh2)|Raw RP/ERP/ELIT content
[VN JP/EN Scrape](https://huggingface.co/datasets/alpindale/visual-novels)|60 million tokens of dialogue and actions/narration
[WN JP/EN Scrape](https://huggingface.co/datasets/NilanE/ParallelFiction-Ja_En-100k)|100k chapters of webnovels paired with fan translations
[janitorai-cards](https://huggingface.co/datasets/AUTOMATIC/jaicards)|190k character cards converted to v2 format and viewable as a local webpage
[chub.ai](https://chub-archive.evulid.cc)|Archive of various character cards from chub as well as from some other sources
|
**Dataset Tools**|
[augmentoolkit](https://github.com/e-p-armstrong/augmentoolkit)|Generates multi-turn instruct-tuning data from input documents
[dswav](https://github.com/devidw/dswav)|Audio dataset preparation tool using whisper and ffmpeg to transcribe and split inputs
[lilac](https://github.com/lilacai/lilac)|Dataset curation tool for RAG or tuning with annotating/clustering/labeling support
[Data-Juicer](https://github.com/alibaba/data-juicer)|Dataset preparation tool with support for multimodal data
[InfoGrowth](https://github.com/NUS-HPC-AI-Lab/InfoGrowth)|Online dataset curation framework for data cleaning and selection
|
**Non-LLM Models**|
**Vision/Image**|
[ComfyUI](https://github.com/comfyanonymous/ComfyUI)|Node-based Stable Diffusion GUI. User-submitted [workflows](https://comfyworkflows.com)
[LDSR ComfyUI](https://github.com/flowtyone/ComfyUI-Flowty-LDSR)|Image super-resolution upscaler with fewer artifacts than others, but slower
[ControlNeXt](https://github.com/dvlab-research/ControlNeXt)|90% fewer parameters than ControlNet and works with other LoRA techniques
[LLaVa-NeXT](https://github.com/haotian-liu/LLaVA)|Visual language models using qwen/llama3 with new video understanding capability
[ColPali](https://huggingface.co/vidore/colpali)|VLM that indexes documents from their visual features (PDF focused)
[Surya](https://github.com/VikParuchuri/surya)|OCR, layout analysis, reading order, and line detection in 90+ languages
[ShareCaptioner](https://huggingface.co/Lin-Chen/ShareCaptioner)|Image captioning model with fewer hallucinations than LLaVa
[Upscale Hub](https://github.com/Sirosky/Upscale-Hub)|Set of resources and models for image and video upscaling (anime focused)
[BSQ-ViT](https://github.com/zhaoyue-zephyrus/bsq-vit)|Image/video tokenizer with Binary Spherical Quantization that has the best image/video restoration performance
[Spandrel](https://github.com/chaiNNer-org/spandrel)|Library for loading various upscaling models for use with chaiNNer or SD WebUI
[YOLOv10](https://github.com/THU-MIG/yolov10)|Newest in the YOLO series for real-time end-to-end object detection with massive latency reduction
[DiffEditor](https://github.com/MC-E/DragonDiffusion)|Tuning-free method for fine-grained image editing using score-based diffusion
[TerDiT](https://github.com/Lucky-Lance/TerDiT)|QAT DiT models that perform slightly worse than full precision but at massively reduced memory usage
[VideoMamba](https://github.com/OpenGVLab/VideoMamba)|SSM to enable efficient memory usage for high-resolution vision/video tasks
[EfficientViT-SAM](https://github.com/mit-han-lab/efficientvit)|Faster and more accurate version of the Segment Anything Model via EfficientViT
[MASA](https://github.com/siyuanliii/masa)|Match anything via SAM for use in finding similar objects across different domains
[Depth-Anything-V2](https://github.com/DepthAnything/Depth-Anything-V2)|Robust monocular depth estimation that works well with semantic segmentation
[ProLab](https://github.com/lambert-x/ProLab)|Semantic segmentation via a property-level label space rather than just categories
[SUPIR](https://github.com/Fanghua-Yu/SUPIR)|Image restoration and upscale method with semantic adjustment editing ability
[DDColor](https://github.com/piddnad/ddcolor)|Vivid and natural colorization for black-and-white photos (and possibly video)
[lama-cleaner](https://github.com/Sanster/lama-cleaner)|Local inpainting tool (remove, or erase and replace)
[Era3D](https://github.com/pengHTYX/Era3D)|Image-to-multiview image diffusion model that can then be used with [NeuS](https://github.com/Totoro97/NeuS) for 3D model creation
[Ground-A-Video](https://github.com/Ground-A-Video/Ground-A-Video)|Video editing via Text-to-Image diffusion models with groundings/motion/depth data
[LivePortrait](https://github.com/KwaiVGI/LivePortrait)|Real-time face swap with extended controllability (eyes, lips, stitching)
[MegActor](https://github.com/megvii-research/megactor)|Animate images from audio/image with consistent motion via diffusion
[MetaCLIP](https://github.com/facebookresearch/MetaCLIP)|Improvement over the CLIP model with a superior dataset quality pipeline
[EasyAnimate](https://github.com/aigc-apps/EasyAnimate)|Text-to-Video model that maxes out at 6s, usable with various framerates and resolutions
|
**Audio/Speech**|
[Amphion](https://github.com/open-mmlab/Amphion)|Audio/music/speech toolset of various models with visualization capability
[Qwen2-Audio](https://github.com/QwenLM/Qwen2-Audio)|Audio-language model that can voice chat and do audio analysis without specific prompting
[GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS)|Few-shot voice cloning and Text-to-Speech WebUI (ENG/JPN/CHN)
[VoiceCraft](https://github.com/jasonppy/VoiceCraft)|Zero-shot Text-to-Speech and speech editing model with voice cloning capability
[StyleTTS2](https://github.com/yl4579/StyleTTS2)|English Text-to-Speech via style diffusion (can fine-tune with a custom dataset)
[ControlSpeech](https://github.com/jishengpeng/ControlSpeech)|Text-to-Speech with voice clone capability that takes in voice/style/content prompts
[whisper.cpp](https://github.com/ggerganov/whisper.cpp)|Speech-to-Text inference library with CPU/GPU support for various whisper-based models
[Whisper Diarization](https://huggingface.co/spaces/Xenova/whisper-speaker-diarization/tree/main/whisper-speaker-diarization)|STT via Transformers.js with word-level timestamps and speaker segmentation
[STAR-Adapt](https://github.com/YUCHEN005/STAR-Adapt)|ASR fine-tune method on unlabeled data that reduces WER for specific accents/noise
[Musicgen-MMD](https://jmlemercier.github.io/encodec-mmd.github.io)|32kHz Text-to-Music model (no vocals)
[RVC](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI)|Retrieval-based Voice Conversion model
[Urhythmic](https://github.com/bshall/urhythmic)|Unsupervised rhythm modeling for voice conversion
[Descript](https://github.com/descriptinc/descript-audio-codec)|High-fidelity audio compression with improved RVQGAN (can drop-in replace EnCodec)
[DeepFilterNet](https://github.com/rikorose/deepfilternet)|Real-time noise suppression using deep filtering
[UVR](https://github.com/Anjok07/ultimatevocalremovergui)|Audio source separation GUI for various models with full Demucs and MDX23C support
[AudioSR](https://github.com/haoheliu/versatile_audio_super_resolution)|Audio super resolution (any -> 48kHz)
[EAT](https://github.com/cwx-worst-one/EAT)|Audio and speech classification
|
**Other**|
[AnythingLLM](https://github.com/Mintplex-Labs/anything-llm)|RAG- and agent-focused frontend with support for local and cloud models
[T-Ragx](https://github.com/rayliuca/T-Ragx)|Translation fine-tune method that works with RAG (glossaries) and preceding text
[GenTranslate](https://github.com/YUCHEN005/GenTranslate)|Fine-tune of SeamlessM4T on an N-best hypotheses dataset for MT and Speech-to-Text
[Dragon+](https://github.com/facebookresearch/dpr-scale/tree/main/dragon)|Dual-encoder based dense retriever for use with the RA-DIT FT approach with a paired LLM
[Magika](https://github.com/google/magika)|File content type detection model
[AutoAct](https://github.com/zjunlp/AutoAct)|Automatic agent learning framework using a division-of-labor strategy
[LOCOST](https://github.com/flbbb/locost-summarization)|State-space model for long document abstractive summarization
[NV Embed v1](https://huggingface.co/nvidia/NV-Embed-v1)|Decoder-only LLM embedding model that outperforms T5/BERT/similar models
[ESPN](https://github.com/susavlsh10/ESPN-v1/)|GPUDirect Storage implementation for multi-vector embedding retrieval, with bindings
[FastFit](https://github.com/IBM/fastfit)|Text few-shot classification fine-tuning method with high accuracy and fast training time
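Most of the inferencing backends listed above (llama.cpp's server, kobold.cpp, TabbyAPI, vllm) expose a local HTTP API that frontends like SillyTavern connect to, and you can hit the same API yourself. Below is a minimal sketch of querying a locally running llama.cpp server through its OpenAI-compatible chat endpoint; the port, sampler values, and prompt are placeholder assumptions, and exact field support varies between backends and versions.

```python
# Minimal sketch, assuming a llama.cpp server started locally, e.g.:
#   ./llama-server -m model.gguf --port 8080
# Port, sampler settings, and prompt below are illustrative assumptions.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Name three GGUF quant types."},
        ],
        "temperature": 0.7,  # see the LM Settings Guide above for sampler advice
        "max_tokens": 256,
    },
    timeout=120,
)
resp.raise_for_status()
# OpenAI-style response shape: first choice's message content is the reply.
print(resp.json()["choices"][0]["message"]["content"])
```

kobold.cpp and TabbyAPI expose similar but not identical endpoints, so check each backend's README before reusing this snippet as-is.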