#Local Models Related Links

->/lmg/<- | ->[Accelerate](https://files.catbox.moe/q9paa9.epub)<-
------ | ------
**Guides**|
[Quick Start Guide](https://rentry.org/lmg-spoonfeed-guide)|Anon's tutorial for getting models running locally
[SillyTavern Guide](https://rentry.org/llama_v2_sillytavern)|Instructions for roleplaying via koboldcpp. Additional [GBNF grammar](https://rentry.org/custom_GBNF) usage
[LM Tuning Guide](https://rentry.org/llm-training)|Training, fine-tuning, and LoRA/QLoRA information
[LM Settings Guide](https://rentry.org/llm-settings)|Explanation of various settings and samplers with suggestions for specific models
[LM GPU Guide](https://archive.is/SY2h6)|Current as of the 40 series. Alternatively, some Anons made a few different [build guides](https://rentry.org/lmg-build-guides)
**Models**|
[HuggingFace](https://huggingface.co/models)|Best source for current quants (filter by GGUF or EXL2)
[LLM VRAM Calc](https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator)|Tool to estimate VRAM usage for GGUF/EXL2/GPTQ quants
[OpenModelDB](https://openmodeldb.info)|Specifically models for upscaling images and videos
[Voice Models](https://voice-models.com)|Easily searchable list, mainly for use with RVC 1/2
[Models Info Table](https://lifearchitect.ai/models-table)|Google Sheet of models, AI labs, datasets, and various other ML info by Alan Thompson
[Chat Leaderboard](https://arena.lmsys.org)|Closed and local models Elo-rated, with additional MMLU/MT-bench scores
|
**Papers**|
[Local Models Papers](https://rentry.org/localmodelspapers)|Papers and articles I've found to be interesting, with a way to search via abstracts
[Arxiv ML](https://arxiv.org/list/cs.LG/pastweek?skip=0&show=250)|Primary source of machine learning papers
[PapersWithCode](https://paperswithcode.com)|Indexer that allows sorting by GitHub stars
[Semantic Scholar](https://www.semanticscholar.org)|Scientific literature semantic search tool
[Scholar Inbox](https://www.scholar-inbox.com)|ML-focused paper recommendations based on personal preferences
|
**News**|
[AI Explained](https://piped.kavin.rocks/@aiexplained-official)|General AI news with well-sourced links (YouTube)
[AI News Blog](https://thezvi.wordpress.com)|LessWrong cultist, so "AI bad" takes, but does a good weekly AI news roundup (blog)
[ML Resources](https://github.com/underlines/awesome-ml)|Broader, sporadically updated list (not fully local)
[Previous Threads](https://desuarchive.org/g/search/subject/%2Flmg%2F)|Always good to search for previous questions before asking
|
**Learn**|
[LLM Course](https://github.com/mlabonne/llm-course)|Collection of articles, videos, courses, and Colabs for learning applied ML
[Andrej Karpathy YT](https://piped.kavin.rocks/@AndrejKarpathy)|In-depth videos on LLM construction from one of OpenAI's founding members
[TF From Scratch](https://archive.is/VEud6)|Blog post with a Jupyter notebook that goes step by step through coding and training a small GPT
[LLM-Sampling](https://artefact2.github.io/llm-sampling/index.xhtml)|Token probability visualizer with support for current popular samplers
[LLM Visualization](https://bbycroft.net/llm)|Drag-and-pull 3D model of various LLMs with explanations of their components
[Intro to DNN](https://arxiv.org/abs/2404.17625)|Book format of a neural networks course that serves as an introduction to ML
[Principles of DL](https://arxiv.org/abs/2106.10165)|Textbook that introduces the math behind deep learning
|
**LLM Inferencing**|
[Text Gen WebUI](https://github.com/oobabooga/text-generation-webui)|Frontend for most GPU/CPU model backends
[WebUI Extensions](https://github.com/oobabooga/text-generation-webui-extensions)|Most notable: XTTSv2 and Stable Diffusion
|
[llama.cpp](https://github.com/ggerganov/llama.cpp)|Main CPU inferencing development with GPU acceleration (GGUF models)
[kobold.cpp](https://github.com/LostRuins/koboldcpp)|llama.cpp fork with the Kobold UI and additional features (with support for older GGML models)
|
[exllama2](https://github.com/turboderp/exllamav2)|Inference library for local LLMs with a new quant style (70B llama2 on 24GB VRAM)
[TabbyAPI](https://github.com/theroyallab/tabbyAPI)|FastAPI application for the exllama2 backend, for use with SillyTavern
|
[SillyTavern](https://github.com/SillyTavern/SillyTavern)|Frontend that is a heavily modified TavernAI fork
[vllm](https://github.com/vllm-project/vllm)|Inference library with fast inferencing and PagedAttention for KV management
|
**LLM Tools**|
[Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)|Fine-tuning tool for various architectures with integrated support for flash attention and RoPE scaling
[QuaRot](https://github.com/spcl/QuaRot)|4/6/8-bit weight/activation/KV quantization scheme based on rotations to remove outliers
[Mergekit](https://github.com/arcee-ai/MergeKit)|Toolkit for merging LLMs, including piecewise assembly of layers
[promptfoo](https://github.com/promptfoo/promptfoo)|Tool for testing and evaluating LLM output quality, also with a side-by-side comparison feature
[Floneum](https://github.com/floneum/floneum)|Graph/node editor for AI workflows with a focus on community-made plugins
[OpenRLHF](https://github.com/OpenLLMAI/OpenRLHF)|Framework for RLHF generation optimized for performance and distributed models
|
**LLM Research**|
[OwLore](https://github.com/pixeli99/OwLore)|Fine-tune method that achieves better results than full fine-tuning while using less memory than LoRA/LISA
[Buffer of Thoughts](https://github.com/YangLing0818/buffer-of-thought-llm)|Reasoning framework for LLMs that uses thought-templates to answer questions and outperforms CoT/Multiquery
[LLM-Drop](https://github.com/Shwai-He/LLM-Drop)|Block/layer drop method that works quite well with attention layers and is orthogonal to quantization
[Temp LoRA](https://github.com/TemporaryLoRA/Temp-LoRA/tree/main)|Employs a temporary LoRA module during text generation to preserve contextual knowledge
[HOMER](https://github.com/alinlab/HOMER)|Hierarchical context merging, a training-free method that works with conventional RoPE-scaling techniques
[PyramidKV](https://github.com/Zefan-Cai/PyramidKV)|KV cache compression method using Pyramidal Information Funneling
|
**LLM Guiding**|
[Langchain](https://github.com/hwchase17/langchain)|Set of resources for getting the most out of LLMs: chains, tool integrations, agents, etc.
[llama_index](https://github.com/jerryjliu/llama_index)|Central interface to connect LLMs with external data
[TextGrad](https://github.com/zou-group/textgrad)|Framework with an API to backpropagate textual gradients with user-defined loss functions
[SGLang](https://github.com/sgl-project/sglang)|Structured generation language designed for LLMs/VLMs
[DSPy](https://github.com/stanfordnlp/dspy)|Composable and declarative modules for instructing LMs in a familiar Pythonic syntax
[EasyEdit](https://github.com/zjunlp/EasyEdit)|Knowledge editing framework for LLMs
|
**Datasets**|
[Huggingface](https://huggingface.co/datasets)|Best source for datasets
[Wiki Embeddings](https://txt.cohere.com/embedding-archives-wikipedia)|Pre-computed embeddings for various languages of Wikipedia
[ERP Scrapes (1)](https://rentry.org/qib8f)[(2)](https://rentry.org/ashh2)|Raw RP/ERP/ELIT content
[VN JP/EN Scrape](https://huggingface.co/datasets/alpindale/visual-novels)|60 million tokens of dialogue and actions/narration
[WN JP/EN Scrape](https://huggingface.co/datasets/NilanE/ParallelFiction-Ja_En-100k)|100k chapters of webnovels paired with fan translations
[janitorai-cards](https://huggingface.co/datasets/AUTOMATIC/jaicards)|190k character cards converted to v2 format and viewable as a local webpage
[chub.ai](https://chub-archive.evulid.cc)|Archive of various character cards from chub as well as from some other sources
|
**Dataset Tools**|
[augmentoolkit](https://github.com/e-p-armstrong/augmentoolkit)|Generates multi-turn instruct-tuning data from input documents
[dswav](https://github.com/devidw/dswav)|Audio dataset preparation tool using whisper and ffmpeg to transcribe and split inputs
[lilac](https://github.com/lilacai/lilac)|Dataset curation tool for RAG or tuning with annotating/clustering/labeling support
[Data-Juicer](https://github.com/alibaba/data-juicer)|Dataset preparation tool with support for multimodal data
[InfoGrowth](https://github.com/NUS-HPC-AI-Lab/InfoGrowth)|Online dataset curation framework for data cleaning and selection
|
**Non-LLM Models**|
**Vision/Image**|
[ComfyUI](https://github.com/comfyanonymous/ComfyUI)|Node-based Stable Diffusion GUI. User-submitted [workflows](https://comfyworkflows.com)
[LDSR ComfyUI](https://github.com/flowtyone/ComfyUI-Flowty-LDSR)|Image super-resolution upscaler with fewer artifacts than others, but slower
[ControlNeXt](https://github.com/dvlab-research/ControlNeXt)|90% fewer parameters than ControlNet and works with other LoRA techniques
[LLaVa-NeXT](https://github.com/haotian-liu/LLaVA)|Visual language models using qwen/llama3 with new video understanding capability
[ColPali](https://huggingface.co/vidore/colpali)|VLM that indexes documents from their visual features (PDF focused)
[Surya](https://github.com/VikParuchuri/surya)|OCR, layout analysis, reading order, and line detection in 90+ languages
[ShareCaptioner](https://huggingface.co/Lin-Chen/ShareCaptioner)|Image captioning model with fewer hallucinations than LLaVa
[Upscale Hub](https://github.com/Sirosky/Upscale-Hub)|Set of resources and models for image and video upscaling (anime focused)
[BSQ-ViT](https://github.com/zhaoyue-zephyrus/bsq-vit)|Image/video tokenizer with Binary Spherical Quantization that has the best image/video restoration performance
[Spandrel](https://github.com/chaiNNer-org/spandrel)|Library for loading various upscaling models for use with chaiNNer or SD WebUI
[YOLOv10](https://github.com/THU-MIG/yolov10)|Newest in the YOLO series for real-time end-to-end object detection with massive latency reduction
[DiffEditor](https://github.com/MC-E/DragonDiffusion)|Tuning-free method for fine-grained image editing using score-based diffusion
[TerDiT](https://github.com/Lucky-Lance/TerDiT)|QAT DiT models that perform slightly worse than full precision but at massively reduced memory usage
[VideoMamba](https://github.com/OpenGVLab/VideoMamba)|SSM to enable efficient memory usage for high-resolution vision/video tasks
[EfficientViT-SAM](https://github.com/mit-han-lab/efficientvit)|Faster and more accurate version of the Segment Anything Model via EfficientViT
[MASA](https://github.com/siyuanliii/masa)|Match anything via SAM for use in finding similar objects across different domains
[Depth-Anything-V2](https://github.com/DepthAnything/Depth-Anything-V2)|Robust monocular depth estimation that works well with semantic segmentation
[ProLab](https://github.com/lambert-x/ProLab)|Semantic segmentation via a property-level label space rather than just categories
[SUPIR](https://github.com/Fanghua-Yu/SUPIR)|Image restoration and upscale method with semantic adjustment editing ability
[DDColor](https://github.com/piddnad/ddcolor)|Vivid and natural colorization for black-and-white photos (and possibly video)
[lama-cleaner](https://github.com/Sanster/lama-cleaner)|Local inpainting tool (remove, or erase and replace)
[Era3D](https://github.com/pengHTYX/Era3D)|Image-to-multiview image diffusion model that can then be used with [NeuS](https://github.com/Totoro97/NeuS) for 3D model creation
[Ground-A-Video](https://github.com/Ground-A-Video/Ground-A-Video)|Video editing via Text-to-Image diffusion models with groundings/motion/depth data
[LivePortrait](https://github.com/KwaiVGI/LivePortrait)|Real-time face swap with extended controllability (eyes, lips, stitching)
[MegActor](https://github.com/megvii-research/megactor)|Animate images from audio/image with consistent motion via diffusion
[MetaCLIP](https://github.com/facebookresearch/MetaCLIP)|Improvement over the CLIP model with a superior dataset quality pipeline
[EasyAnimate](https://github.com/aigc-apps/EasyAnimate)|Text-to-Video model that maxes out at 6s, usable with various framerates and resolutions
|
**Audio/Speech**|
[Amphion](https://github.com/open-mmlab/Amphion)|Audio/music/speech toolset of various models with visualization capability
[Qwen2-Audio](https://github.com/QwenLM/Qwen2-Audio)|Audio-language model that can voice chat and do audio analysis without specific prompting
[GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS)|Few-shot voice cloning and Text-to-Speech WebUI (ENG/JPN/CHN)
[VoiceCraft](https://github.com/jasonppy/VoiceCraft)|Zero-shot Text-to-Speech and speech editing model with voice cloning capability
[StyleTTS2](https://github.com/yl4579/StyleTTS2)|English Text-to-Speech via style diffusion (can fine-tune with a custom dataset)
[ControlSpeech](https://github.com/jishengpeng/ControlSpeech)|Text-to-Speech with voice clone capability that takes in voice/style/content prompts
[whisper.cpp](https://github.com/ggerganov/whisper.cpp)|Speech-to-Text inference library with CPU/GPU support for various whisper-based models
[Whisper Diarization](https://huggingface.co/spaces/Xenova/whisper-speaker-diarization/tree/main/whisper-speaker-diarization)|STT via Transformers.js with word-level timestamps and speaker segmentation
[STAR-Adapt](https://github.com/YUCHEN005/STAR-Adapt)|ASR fine-tune method on unlabeled data that reduces WER for specific accents/noise
[Musicgen-MMD](https://jmlemercier.github.io/encodec-mmd.github.io)|32kHz Text-to-Music model (no vocals)
[RVC](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI)|Retrieval-based Voice Conversion model
[Urhythmic](https://github.com/bshall/urhythmic)|Unsupervised rhythm modeling for voice conversion
[Descript](https://github.com/descriptinc/descript-audio-codec)|High-fidelity audio compression with improved RVQGAN (can drop-in replace EnCodec)
[DeepFilterNet](https://github.com/rikorose/deepfilternet)|Real-time noise suppression using deep filtering
[UVR](https://github.com/Anjok07/ultimatevocalremovergui)|Audio source separation GUI for various models with full Demucs and MDX23C support
[AudioSR](https://github.com/haoheliu/versatile_audio_super_resolution)|Audio super resolution (any -> 48kHz)
[EAT](https://github.com/cwx-worst-one/EAT)|Audio and speech classification
|
**Other**|
[AnythingLLM](https://github.com/Mintplex-Labs/anything-llm)|RAG- and agent-focused frontend with support for local and cloud models
[T-Ragx](https://github.com/rayliuca/T-Ragx)|Translation fine-tune method that works with RAG (glossaries) and preceding text
[GenTranslate](https://github.com/YUCHEN005/GenTranslate)|Fine-tune of SeamlessM4T on an N-best hypotheses dataset for MT and Speech-to-Text
[Dragon+](https://github.com/facebookresearch/dpr-scale/tree/main/dragon)|Dual-encoder based dense retriever for use with the RA-DIT FT approach with a paired LLM
[Magika](https://github.com/google/magika)|File content type detection model
[AutoAct](https://github.com/zjunlp/AutoAct)|Automatic agent learning framework using a division-of-labor strategy
[LOCOST](https://github.com/flbbb/locost-summarization)|State-space model for long document abstractive summarization
[NV Embed v1](https://huggingface.co/nvidia/NV-Embed-v1)|Decoder-only LLM embedding model that outperforms T5/BERT/similar models
[ESPN](https://github.com/susavlsh10/ESPN-v1/)|GPUDirect Storage implementation for multi-vector embedding retrieval, with bindings
[FastFit](https://github.com/IBM/fastfit)|Text few-shot classification fine-tuning method with high accuracy and fast training time
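Most of the inferencing backends listed above (llama.cpp's server, kobold.cpp, TabbyAPI, vllm) expose a local HTTP API that frontends like SillyTavern connect to, and you can hit the same API yourself. Below is a minimal sketch of querying a locally running llama.cpp server through its OpenAI-compatible chat endpoint; the port, sampler values, and prompt are placeholder assumptions, and exact field support varies between backends and versions.

```python
# Minimal sketch, assuming a llama.cpp server started locally, e.g.:
#   ./llama-server -m model.gguf --port 8080
# Port, sampler settings, and prompt below are illustrative assumptions.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Name three GGUF quant types."},
        ],
        "temperature": 0.7,  # see the LM Settings Guide above for sampler advice
        "max_tokens": 256,
    },
    timeout=120,
)
resp.raise_for_status()
# OpenAI-style response shape: first choice's message content is the reply.
print(resp.json()["choices"][0]["message"]["content"])
```

kobold.cpp and TabbyAPI expose similar but not identical endpoints, so check each backend's README before reusing this snippet as-is.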