#/vtai/ FAQ
-> [![go back to main rentry](https://files.catbox.moe/w9rln6.png)](https://rentry.org/vtai) <-
-> [Main](https://rentry.org/vtai/) | [Cookbook](https://rentry.org/vtairecipes) | [Proompts](https://rentry.org/vtaiprompts) | [Archive](https://rentry.org/vtaiarchive) | [LoRAs](https://rentry.org/vtaiLoRAs) <-
[TOC]
#FAQ - General
##This is cool, how do I get started?
###Do you have an NVIDIA GPU?
####Yes
[NAI Leak Speedrun](https://rentry.org/nai-speedrun)
!!! note Note
You likely won't ever use the leaked NovelAI model, but if you plan on merging HLL with other stuff, you'll still want `animefull-final-pruned.ckpt` anyways.
>Local Install, Nvidia
https://rentry.org/voldy | https://github.com/AbdBarho/stable-diffusion-webui-docker
####No
#####Are you retarded?
######Yes
[Colab for Complete Retards](https://colab.research.google.com/drive/1STL60qfoY-iSlhRb9zFETRLTqhNbznRf)
######No
* [TheLastBen's Fast Stable Diffusion Colab](https://colab.research.google.com/github/TheLastBen/fast-stable-diffusion/blob/main/fast_stable_diffusion_AUTOMATIC1111.ipynb)
* [Paperspace rentry](https://rentry.org/865dy)
#####Are you OK with using Linux?
######Yes
AMD: Native: https://rentry.org/sd-nativeisekaitoo | Docker: https://rentry.org/sdamd
!!! warning Requires extra setup for Polaris-based GPUs
Polaris = RX480, RX580, and derivative cards, so RX460, RX550, Radeon Pro WX2100-5100, etc.
######No
Onnx: https://rentry.org/ayymd-stable-diffustion-v1_4-guide
!!! warning This won't actually let you use webui
It's an alternative, and it requires converting a model.
[Running webui directly on AMD using DirectML](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs#windows)
!!! warning BSODs await ye
You will get different output, and you'll need to use several launch flags to even get it to work.
#####Do you want to try running it directly on your CPU?
https://rentry.org/cputard | https://rentry.org/webui-cpu
!!! warning You'll get different output
##wHat mODel is bEsT?
Everyone will give you a different answer. Look at catbox links that anons post, or ask them what model they're using. Generally speaking, /vtai/ merges some version of hll with another merged model at the simplest level, or merges several at once. Nobody has trained an entire checkpoint of their own beyond hll anon training hll itself.
Short, non-conclusive answer - merges with:
* AOM2 - slightly more realistic gens
* 7th anime - stylized
* Counterfeit - "watercolor" lookin
Look through the [cookbook](https://rentry.org/vtairecipes) and look at what images generated by the base models used in merges look like. Except now they can draw chubas.
##wHiCh upScalEr iS bEst?
They're different; one isn't any better than the other. I wouldn't be surprised if half of the images you see on /vtai/ are using `4x-AnimeSharp` or another ESRGAN upscaler, which can be downloaded [here](https://upscale.wiki/wiki/Model_Database). They go in your `stable-diffusion-webui/models/ESRGAN` folder.
[Here](https://vt-idiot.github.io/crispy-octo-pancake/xyz/upscalers-0-point-7/choco/index.html) is an old comparison I made, with sliders, that lets you compare the results of different upscalers. It's flawed, since all of the images were made with denoising set to 0.7, which is far from ideal - it's more a comparison of what SD will do with the different upscalers' results. Most of the ESRGAN upscalers like `4x-AnimeSharp`, `lollypop`, etc. work best at around ~0.4-0.5 denoising strength.
Go below that and you start to see more of the upscaler and less of what hi-res fix is doing. Above 0.5 you see even more of what hi-res fix did and less of what the upscaler did. At 1.0 you'll have a different image entirely, and at 0 you'll have (close to) the upscaler's raw output. If you set denoising strength to 0 and Hi-Res Fix steps to 1, you can get an idea of what the "raw" output from your selected upscaler looks like!
##`VAE`
SET. YOUR. VAE.
-> [![No VAE](https://i.postimg.cc/zvRKhTx3/xyz-grid-0006-173451-best-quality-houshou-marine-red-hair-twintails-heterochromia-red-eyes-yel.png)](https://postimg.cc/zH8yZHZ1) <-
It does more than just "add color" - it's the component that actually "makes the image" you see at the end.
>Why do I still get an image even if I don't set one?
I think every model still has "one" in it, it's just not "complete". Some models have one of the VAEs below "baked in" and will make the same images whether you have your VAE set to `Automatic` or `None` - you will, however, get a different image if you set a different VAE manually. See below for results with NAI SFW, NAI SFW with the NAI VAE "baked in", NAI SFW with the NAI VAE baked in and all pruned to fp16, and a model merge with no VAE baked in.
-> [![xyz-grid-0001-1335167568-kapuvania-cropped.png](https://i.postimg.cc/wy9BwWtQ/xyz-grid-0001-1335167568-kapuvania-cropped.png)](https://postimg.cc/wy9BwWtQ) <-
!!! note
Stable Diffusion is a text-to-image model that uses a variational autoencoder (VAE) to encode and decode images to and from a smaller latent space. The VAE helps the model to capture the semantic meaning of the image and render fine details such as eyes and faces. In the final steps of image generation, the VAE decoder takes the latent vector from the diffusion process and reconstructs the image in pixel space. The VAE decoder can be fine-tuned with additional data to improve the quality of the image.
!!! note GrugTL _powered by GPT-4_
* Latent space like secret code. Latent space where small picture live.
* Secret code have numbers that mean something.
* Grug not see small picture. Small picture `[0.12, -0.34, 0.56, -0.78]` look like cloudy letter O.
* Grug see big picture. VAE make big picture. Decoder change small picture to big picture with pixels. Pixels small squares that make picture for grug.
* This big picture in pixel space:
```
[
 [0, 0, 0, 0, 0],
 [0, 1, 1, 1, 0],
 [0, 1, 0, 1, 0],
 [0, 1, 1, 1, 0],
 [0, 0, 0, 0, 0]
]
```
* This picture have five by five pixels. Picture look like letter O.
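If you want to watch the decoder do its thing outside of webui, here's a minimal sketch using the `diffusers` library - purely illustrative, not webui's actual code path. The repo IDs are the usual Hugging Face uploads, and the prompt is a placeholder:
```python
# Minimal sketch: generate latents once, then decode them with two different
# VAEs to see how much the decoder alone changes the final image.
import torch
from diffusers import StableDiffusionPipeline, AutoencoderKL

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# output_type="latent" stops the pipeline right before the VAE decode step
latents = pipe(prompt="1girl, solo", output_type="latent").images

for vae_id in ("stabilityai/sd-vae-ft-mse", "stabilityai/sd-vae-ft-ema"):
    vae = AutoencoderKL.from_pretrained(vae_id)
    with torch.no_grad():
        # The decoder turns the small latent tensor into the full pixel-space
        # image; scaling_factor is the usual 0.18215 for SD 1.x VAEs.
        image = vae.decode(latents / vae.config.scaling_factor).sample
```
Same latents, different decoder, different-looking picture - which is all an xyz grid over VAEs is showing you.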
###Set Your VAE
Download one. Put it here.
-> ![/stable-diffusion-webui/models/VAE](https://i.postimg.cc/L6KZ7YyB/VAE.png) <-
Set it.
-> ![SET YOUR VAE](https://i.postimg.cc/L5bhCjfM/Screenshot-2023-03-30-132111.png) <-
###wHaT VaE iS beSt?
There are no new VAEs, and to my knowledge, **nobody has trained any other than _NovelAI, Stable Diffusion, and Waifu Diffusion_**. The answer does not change, and hasn't changed. Any model that "comes with a VAE" comes with one of these.
-> [![bite my shiny metal ass](https://i.postimg.cc/yYdDgmrP/xyz-grid-0001-86137747-best-quality-roboco-san-hololive-orange-eyes-gradient-eyes-round-eyew.png)](https://postimg.cc/s1kjbhzQ) <-
>Yes, I'm aware I prompt like a moron
>Yes, I'm aware webui had an oopsie and didn't store the xyz grid settings for some reason
```
best quality, ((roboco-san), hololive, orange eyes, gradient eyes, round eyewear), (center opening:1.6), (plunging neckline, downblouse, short hair, gradient hair, bodysuit, humanoid robot), 1girl, solo, mechanical arms, mechanical legs, ((latex, ((fine fabric emphasis)))), (innerboob, breasts apart, unzipped:1.4), (narrow waist, from side, tight:1.2), (underbutt, (ass)), seductive smile, :D, ultra-detailed, illustration, official art, cel shading, emphasis lines, bodypaint, large breasts, full-length zipper, arms up, dutch angle, presenting armpit, leather, ((patterned clothing), textured bodysuit), see-through leotard, cyberpunk, close-up, neon lights, night sky, skyline, wide hips,
Negative prompt: (worst quality, low quality, zipper, zipper pull tab:1.4), (depth of field, cape, blurry, hands, breast strap, holding:1.2), (censored, cropped shirt, crop top, censorship:1.3), (greyscale, monochrome, speech bubble), error, bad anatomy, bad hands, missing digits, paws, cropped, lowres, jpeg artifacts, username, artist name, trademark, letterbox, bad feet, error, missing fingers, extra digit, fewer digits, extra limb, missing limb, mutation, fused digits, claws, cowboy hat, multiple views, bad multiple views, extra arms, extra legs, (fat, fat rolls, flesh, blob), huge breasts, large areolae, cleavage, bikini top only, bra, shirt,
Steps: 24, Sampler: DPM++ 2S a Karras, CFG scale: 32, Seed: 86137747, Size: 512x768, Model hash: a6c54f42f1, Model: Counter_v2_A_Juice_31, Denoising strength: 0.5, Clip skip: 2, ENSD: 31337, Hires upscale: 2, Hires steps: 12, Hires upscaler: 4x-AnimeSharp, Dynamic thresholding enabled: True, Mimic scale: 8, Threshold percentile: 97.9, Mimic mode: Power Up, Mimic scale minimum: 3.5, CFG mode: Half Cosine Up, CFG scale minimum: 3.5, Power scheduler value: 3.5
```
####NovelAI VAE
Used by any number of different models/merges. The full SHA256 checksum of the unpruned NAI VAE "as leaked" is `f921fb3f29891d2a77a6571e56b8b5052420d2884129517a333c60b1b4816cdf`
* `orangemix.vae.pt` - It's the NAI VAE
* `Anything-V3.0.vae.pt` - It's the NAI VAE
* `lol-top-kek-420-69.vae.pt` - Is it 784 MiB? **_It's the NAI VAE_**
#####Pruned Version
Yes, apparently VAEs can be pruned. For some reason, the NAI VAE was ~800 MB. Here's a 335 MB safetensors version.
Save yourself some space, I guess.
[Pruned Direct Download, Safetensors](https://huggingface.co/grugger/chubas/resolve/main/models/VAE/2982.vae.safetensors)
####ClearVAE
https://civitai.com/models/22354/clearvae
-> [![wiwa](https://i.postimg.cc/wBtMf9ds/xyz-grid-0001-364117191-best-quality-elira-pendora-nijisanji-en-large-breasts-light-blue-hair-m.png)](https://postimg.cc/GBCb9wtc) <-
```
best quality, elira pendora, nijisanji en, large breasts, light blue hair, multicolored hair, white hair, head wings, x hair ornament, hair over one eye, blue eyes, messy hair, (plugsuit, blue bodysuit, test plugsuit), (unzipped, center opening breasts apart:1.5), stomach, navel, groin, groin tendon, lying, on back, seductive smile, arms up, dutch angle, spoken heart, heart-shaped pupils, fine fabric emphasis, 1girl, solo,heterochromia, purple eyes, (wet skin, sagging breasts:0.75), art by akasaai,
Negative prompt: (worst quality, low quality, ribs, underbust, zipper:1.4), (depth of field, blurry, art by sakimichan), (censored, censorship, sideless outfit:1.2), (detached sleeves, text, speech bubble, lowres, jpeg artifacts, (extra arms), extra digits, missing limb, fewer digits, (extra digit, extra limb), missing limb, mutation, extra legs, missing finger, bad anatomy, bad hands, bad feet), error, paws, username, artist name, trademark, letterbox, error, fused digits, claws, multiple views, bad multiple views, fat, fat rolls, blob, nipples, breasts out, areolae, (bad_prompt_version2, EasyNegative:0.8),
Steps: 30, Sampler: Euler a, CFG scale: 11, Seed: 364117191, Size: 512x768, Model hash: 454c9e8daa, Model: 25d_hll4p1, Clip skip: 2, ENSD: 31337, Script: X/Y/Z plot, X Type: VAE, X Values: "clearvae_main.safetensors, vae-ft-mse-840000-ema-pruned.vae.pt, novelai.vae.safetensors"
```
From what I can understand, it's a block merge of the NAI VAE and one of the WD 1.4 VAEs, `kl-f8-anime2`. It's actually kinda nice! This is the only actual **"new"** VAE the author is aware of. There are two versions, and in my opinion, the main one is far better. The alternate is almost too saturated, and introduces some weird artifacts that I can't recall seeing with either the SD 1.5 or WD 1.4 VAEs.
####SD 1.5 VAE
They're more saturated. No, I don't know what the difference between the two is. I use the `MSE` one because `lol 840000 > 560000`
#####vae-ft-mse-840000-ema-pruned.ckpt
https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.ckpt
#####vae-ft-ema-560000-ema-pruned.ckpt
https://huggingface.co/stabilityai/sd-vae-ft-ema-original/resolve/main/vae-ft-ema-560000-ema-pruned.ckpt
####WD 1.4 VAE
They're even more saturated. No, I don't know what the difference between the two is.
#####kl-f8-anime.ckpt
https://huggingface.co/hakurei/waifu-diffusion-v1-4/resolve/main/vae/kl-f8-anime.ckpt
#####kl-f8-anime2.ckpt
https://huggingface.co/hakurei/waifu-diffusion-v1-4/resolve/main/vae/kl-f8-anime2.ckpt
##Why do people ask for catbox links?
* 4chan removes the prompt information, along with any other metadata, when you post an image.
+ catbox does not
>catbox is down
Grass is green, sky is blue, water is wet
+ litterbox (temporary, 1 hr - 3 days) does not
+ pixiv does not
+ byte.wtf (temporary, up to 1 month) does not
+ byte.wtf links don't (embed) properly though
+ postimages (temporary or permanent, up to you) does not, provided you choose not to resize your image
+ imgur **always removes it**
Having the metadata means you can load the image into the PNG Info tab in webui and view the prompt that was used to generate the image.
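If you'd rather not round-trip through webui's PNG Info tab, a minimal sketch of reading that metadata yourself with Pillow - assuming the file is an untouched PNG that still has its text chunk (the filename here is just an example):
```python
# Prints the generation parameters webui embeds in a PNG text chunk.
# "parameters" is the key AUTOMATIC1111's webui writes; other UIs may differ.
from PIL import Image

im = Image.open("catbox_1a2b3.png")
print(im.info.get("parameters", "no prompt metadata found"))
```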
###Can I view metadata for catbox uploads while I'm browsing 4chan, or upload directly to catbox while I'm posting?
Yes. Install this [userscript](https://gist.github.com/catboxanon/ca46eb79ce55e3216aecab49d5c7a3fb) for 4chanX.
!!! warning
If your image was >4MB and webui converted it to a JPG, the userscript won't display the metadata, because it can't read EXIF tags out of a JPG - only iTXt chunks from a PNG.
###Why do some filenames look like `catbox_1a2b3.png`?
They were uploaded using said userscript. The rest of us can download the image with the metadata directly when it looks like that, or view it in our browsers.
###Is there another way to look at the prompt info?
[Yes](http://entropymine.com/jason/tweakpng/). You can also use it to do things like "put the metadata back" into an image if you Photoshopped it or something.
##WHY CAN'T I MAKE THE SAME EXACT IMAGE AS SO AND SO?
* Were they using LoRA or TI embeds? Get the same ones.
* Do you have xformers enabled? Did they have xformers enabled?
>They didn't have it enabled
So disable it.
>They had it enabled (most do)
Neither of you can ever make the same image again; get over it.
>I don't know
The world is your oyster and you want to make the same image again?
#FAQ - Extensions
##How do I `2girls` in the same picture?
- _What is 2Shot?_
- _What is latent couple?_
https://github.com/opparco/stable-diffusion-webui-two-shot
!!! info
This extension is an extension of the built-in Composable Diffusion. This allows you to determine the region of the latent space that reflects your subprompts.
!!! info GrugTL _powered by GPT-4_
* "Extension make Composable Diffusion better. Extension let you pick part of hidden space for subprompts."
* _it cut picture into pieces, and let user ask different things for each piece of picture_ `{{{grugger}}}`
* "Extension make thing better. Extension let you pick part of hidden thing for small asks."
* _it break picture into small parts, and let user say different things for each part of picture_
###How do I use this?
You have to prompt for it the way it's "expecting" you to. The bare minimum is three "lines" - yes, you have to hit "Enter", and yes, each line after the first must start with `AND `.
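Stripped down to a skeleton, it looks like this - the tags here are placeholders for illustration, not a real prompt:
```
best quality, 2girls, tags that apply to the whole image,
AND best quality, 2girls, tags for the girl in the first region,
AND best quality, 2girls, tags for the girl in the second region,
```
The first line covers the entire canvas; each `AND` line maps onto one of the divisions you set in the extension's panel (more on divisions below).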
-> [![00178-3676936066-best-quality-2girls-chibi-snow-outdoors-snowing-snowflakes-black-hair-e.png](https://i.postimg.cc/WzgF89nB/00178-3676936066-best-quality-2girls-chibi-snow-outdoors-snowing-snowflakes-black-hair-e.png)](https://postimg.cc/2LjS87gG) <-
```
best quality, 2girls, ((chibi)), snow, outdoors, snowing, snowflakes, black hair, extremely detailed, extremely intricate, absurdres, incredibly absurdres, illustration, official art, (emphasis lines, cel shading:0.8),
AND best quality, 2girls, ((chibi)), ookami mio, hololive, orange eyes, (wolf ears), wolf tail, ^ ^, :D, mittens, snow, outdoors, snowing, snowflakes, black hair, extremely detailed, extremely intricate, absurdres, incredibly absurdres, illustration, official art, (emphasis lines, cel shading:0.8),
AND best quality, 2girls, (((chibi))), kurokami fubuki, hololive, fox ears, fox tail, (((black hair))), red eyes, ^ ^, :D, mittens, snow, outdoors, snowing, snowflakes, black hair, extremely detailed, extremely intricate, absurdres, incredibly absurdres, illustration, official art, (emphasis lines, cel shading:0.8),
Negative prompt: (worst quality, low quality:1.4), (depth of field, blurry), (censored, censorship), (text, speech bubble, lowres, jpeg artifacts, (extra arms), extra digits, missing limb, fewer digits, (extra digit, extra limb), missing limb, mutation, extra legs, missing finger, bad anatomy, bad hands, bad feet), error, paws, username, artist name, trademark, letterbox, error, fused digits, claws, multiple views, bad multiple views, fat, fat rolls, blob,
AND (worst quality, low quality:1.4), (depth of field, blurry), (censored, censorship), (text, speech bubble, lowres, jpeg artifacts, (extra arms), extra digits, missing limb, fewer digits, (extra digit, extra limb), missing limb, mutation, extra legs, missing finger, bad anatomy, bad hands, bad feet), error, paws, username, artist name, trademark, letterbox, error, fused digits, claws, multiple views, bad multiple views, fat, fat rolls, blob,
AND (worst quality, low quality:1.4), (depth of field, blurry), (censored, censorship), (text, speech bubble, lowres, jpeg artifacts, (extra arms), extra digits, missing limb, fewer digits, (extra digit, extra limb), missing limb, mutation, extra legs, missing finger, bad anatomy, bad hands, bad feet), error, paws, username, artist name, trademark, letterbox, error, fused digits, claws, multiple views, bad multiple views, fat, fat rolls, blob, (((white hair))),
Steps: 40, Sampler: Euler a, CFG scale: 11, Seed: 3676936066, Size: 768x512, Model hash: dcffbff16c, Model: Gekijōban Shinseiki Dai Rantō HLL Mix 1.0 + 3.1 You Can (Not) Proompt II HD Remix &IRYS, Denoising strength: 0.5, Clip skip: 2, ENSD: 31337, Latent Couple: "divisions=1:1,1:2,1:2 positions=0:0,0:0,0:1 weights=0.2,0.8,0.8 end at step=30", Hires upscale: 2, Hires steps: 20, Hires upscaler: 4x-AnimeSharp
```
!!! warning I am retarded
I have no idea if `AND`-ing the negatives works, but I tried it anyways to keep Kurokami from having white hair.
[![catbox-x5a1h6.png](https://i.postimg.cc/7Czjcwvg/catbox-x5a1h6.png)](https://i.postimg.cc/Xv13rYQD/catbox-x5a1h6.png)
```
(masterpiece, best quality, highres, detailed:1.1), , 2girls, indoors
AND (masterpiece, best quality, highres, detailed:1.1), 2girls, mori calliope, red eyes, pink hair, tiara, (fat:1.3), wide hips, thick thighs, ripped thighhighs, indoors, sweaty
AND (masterpiece, best quality, highres, detailed:1.1), 2girls, (takanashi kiara:1.1), (orange hair:1.1), (green hair:0.9), gradient hair, purple eyes, (fat:1.3), wide hips, thick thighs, ripped thighhighs, orange clothes, indoors, sweaty
Negative prompt: (worst quality, low quality:1.4), blur, blurry, depth of field, motion lines, fat legs, (bad background:1.1), flat color, sketch, 3d, cartoon, toon \(style\), leotard, cartoon
Steps: 28, Sampler: DPM++ 2S a Karras, CFG scale: 7, Seed: 1180464368, Size: 640x512, Model hash: 9c2e7b9e99, Model: anything-v4.5-pruned-fp16-fp16, Denoising strength: 0.49, Latent Couple: "divisions=1:1,1:2,1:2 positions=0:0,0:0,0:1 weights=0.2,0.8,0.8 end at step=24", Hires upscale: 2, Hires upscaler: Latent
```
-> [![00001-3178764140-0-best-quality-2girls-pantyshot-from-below-medium-breasts-upskirt-extend.jpg](https://i.postimg.cc/PpQFQWJB/00001-3178764140-0-best-quality-2girls-pantyshot-from-below-medium-breasts-upskirt-extend.jpg)](https://i.postimg.cc/Qt2GsS3x/00001-3178764140-0-best-quality-2girls-pantyshot-from-below-medium-breasts-upskirt-extend.jpg) [![tmp5r-m6chg.png](https://i.postimg.cc/5QCcqYcv/tmp5r-m6chg.png)](https://i.postimg.cc/cCDsWwXD/tmp5r-m6chg.png) ->
```
best quality, 2girls, pantyshot, (from below), medium breasts, (((upskirt, extended upskirt))), indoors, light rays, panties from heaven, ((((((fine fabric emphasis)))))), absurdres, incredibly absurdres, ultra-detailed, illustration, official art, (emphasis lines, cel shading:0.8), (ornate clothes, floating clothes), wa maid, maid apron, maid headdress, japanese clothes, (mansion), gradient eyes, panties,
AND best quality, 2girls, pantyshot, (from below), inui toko, pantyhose, medium breasts, panties, nijisanji, (from below, (upskirt, extended upskirt)), skirt lift, dog ears, heterochromia, yellow eyes, red eyes, black hair, gradient hair, beautiful shampoo commercial hair, hair ornament, wa maid, maid apron, maid headdress, (((grimace, disgust))) japanese clothes, (angry, full-face blush), (((fine fabric emphasis))), wide hips, narrow waist, (mansion), absurdres, incredibly absurdres, ultra-detailed, illustration, official art,
AND best quality, 2girls, pantyshot, (from below), hoshimachi suisei, smug, panties, small breasts, presenting panties, ((upskirt, extended upskirt)), side ponytail, light blue hair, (((((blue eyes))))), gradient eyes, wa maid, maid apron, maid headdress, japanese clothes, grin, pervert, (((fine fabric emphasis))), action, floating hair, motion lines, you gonna get raped, (mansion), absurdres, incredibly absurdres, ultra-detailed, illustration, official art,
Negative prompt: (worst quality, low quality:1.4), (depth of field, blurry), (monochrome, censored, censorship, underbust:1.2), (text, speech bubble, lowres, jpeg artifacts, (extra arms), extra digits, missing limb, fewer digits, (extra digit, extra limb), missing limb, mutation, extra legs, missing finger, bad anatomy, bad hands, bad feet), error, paws, username, artist name, trademark, letterbox, error, fused digits, claws, multiple views, bad multiple views, fat, fat rolls, blob, (ribs:1.3), teeth, covered navel, holding, covered nipples,
(topless), armpit hair, bottomless, white background, animal ear fluff, ((((loli)))),
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 3178764140, Size: 512x768, Model hash: dcffbff16c, Model: Gekijōban Shinseiki Dai Rantō HLL Mix 1.0 + 3.1 You Can (Not) Proompt II HD Remix &IRYS, Denoising strength: 0.5, Clip skip: 2, ENSD: 31337, ControlNet Enabled: True, ControlNet Module: none, ControlNet Model: control_openpose-fp16 [9ca67cc5], ControlNet Weight: 1, ControlNet Guidance Start: 0, ControlNet Guidance End: 1, Latent Couple: "divisions=1:1,1:2,1:2 positions=0:0,0:0,0:1 weights=0.2,0.8,0.8 end at step=20", Hires upscale: 2, Hires steps: 15, Hires upscaler: 4x-AnimeSharp, Dynamic thresholding enabled: True, Mimic scale: 11, Threshold percentile: 100, Mimic mode: Half Cosine Up, Mimic scale minimum: 3, CFG mode: Half Cosine Up, CFG scale minimum: 3
```
>Line 1
Put a `2girls` prompt in here. This is the prompt for the image as a whole. Look at the examples above and you can see that anons kept their "flavor"/quality tags in there, the desired scenery, etc. By default, the other divisions are weighted 4 times as hard, so you don't have to go too hard. The Mori/Kiara example is probably the better example here.
>Line 2, onwards
Start with `AND ` and describe each "division" of the image.
###iT's NoT woRkiNG!!11
`skill issue`
In all seriousness, it can definitely "not work" sometimes. ControlNet was doing a lot of the heavy lifting above in the Toko/Suisei upskirt, and as soon as I turned it off...
-> [![00047-3178764143-best-quality-2girls-pantyshot-from-below-medium-breasts-upskirt-extended.png](https://i.postimg.cc/K8DhjyNg/00047-3178764143-best-quality-2girls-pantyshot-from-below-medium-breasts-upskirt-extended.png)](https://postimg.cc/NKFzCSZg) <-
Plus, some of the best 2shot examples you see are still inpainted afterwards!
###`4girls`
-> [![4girls, inpainted https://files.catbox.moe/wscr6u.png](https://i.postimg.cc/43FWW8P3/wscr6u.png)](https://postimg.cc/QB1gtJRL) <-
```
masterpiece, best quality, (4girls), (indoors, scenery, hotel living room, couch, night:1.4),
AND (4girls), masterpiece, best quality, best illustration, (indoors, scenery, hotel living room, couch, night:1.4), (white shirt, black jeans:1.4), (:D, closed eyes, open mouth:1.2) (looking away), (oozora subaru), hololive, black hair, turquoise eyes, short hair,
AND (4girls), masterpiece, best quality, best illustration, (indoors, scenery, hotel living room, couch, night:1.4), (white coat, black leggings:1.4), (grin, closed eyes:1.2), (looking away), holding phone, (shirakami fubuki:1.2), hololive, green eyes, (white hair), (fox tail), long hair, (fox ears:1.2), ahoge, (small breasts:0.8),
AND (4girls), masterpiece, best quality, best illustration, (indoors, scenery, hotel living room, couch, night:1.4), (white turtleneck sweater, black leggings:1.4), smile, (looking away), (ookami mio:1.2), animal ears, tail, (wolf ears), wolf girl, wolf tail, yellow eyes, (black hair), red hair, streaked hair, (long hair), animal ear fluff, (small breasts:1.4),
AND (4girls), masterpiece, best quality, best illustration, (indoors, scenery, hotel living room, couch, night:1.4), (red winter coat, black skirt, pantyhose:1.4), (grin, happy:1.2), (looking away), (nakiri ayame:1.2), hololive, bangs, black ribbon, gradient hair, hair ribbon, long hair, (ponytail), (white hair), oni, (oni horns:1.2), red eyes, red hair, ribbon, silver hair, small breasts,
Negative prompt: (worst quality, low quality:1.4), (depth of field, blurry, bokeh, shallow depth of field:1.5), (blurry background:1.2), (tanlines, white skin, pale skin, multiple bellybuttons, swimming pool), signature, watermark, username, artist name, text, censored, cropped, text, speech bubble, letterboxed, JPEG artifacts, lowres, bad anatomy, (bad hands, error, missing fingers, extra digit, fewer digits:1.4), cropped, worst quality, low quality, normal quality, 3D, black background, (3girls), nude, nsfw, bare legs, (bench), (bare shoulders), (bare arms), (skin-tight clothing:1.3), (cleavage), (looking at viewer:1.4), collared shirt, black gloves, gloves, apple inc., bottomless, (lips, pink lips), (cityscape, skyline, city, city lights, sun, sun light:1.4),
Steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 1851769969, Size: 768x432, Model hash: cd12b7cc22, Denoising strength: 0.6, Clip skip: 2, ENSD: 31337, Mask blur: 4, Latent Couple: "divisions=1:1,1:4,1:4,1:4,1:4 positions=0:0,0:0,0:1,0:2,0:3 weights=0.2,0.8,0.8,0.8,0.8 end at step=30", AddNet Enabled: True, AddNet Module 1: LoRA, AddNet Model 1: thickerLinesAnimeStyle_loraVersion(58c5f51b2b68), AddNet Weight A 1: 0.2, AddNet Weight B 1: 0.2
```
!!! note Line Breaks
It's probably a good time to point out that line breaks aren't what separate the sections of your latent couple prompt - `AND` is. You can otherwise use line breaks in your prompts for other reasons.
####Before Inpainting
-> [![4girls https://files.catbox.moe/npcteh.png](https://i.postimg.cc/P5Ch67zL/npcteh.png)](https://postimg.cc/jDYG2Z1K) <-
ControlNet Source, `openpose` Pre-Processor/Module
-> [![https://files.catbox.moe/r4htcr.jpg](https://i.postimg.cc/MM0cksgT/r4htcr.jpg)](https://i.postimg.cc/k4zWbzcx/r4htcr.jpg) <-
```
masterpiece, best quality, (4girls), (indoors, scenery, hotel living room, couch, night:1.4),
AND (4girls), masterpiece, best quality, best illustration, (indoors, scenery, hotel living room, couch, night:1.4), (white shirt, black jeans:1.4), :D, (looking away), (oozora subaru), hololive, black hair, turquoise eyes, short hair,
AND (4girls), masterpiece, best quality, best illustration, (indoors, scenery, hotel living room, couch, night:1.4), (white coat, black leggings:1.4), (grin, closed eyes:1.2), (looking away), holding phone, (shirakami fubuki:1.2), hololive, green eyes, (white hair), (fox tail), long hair, (fox ears:1.2), ahoge, (small breasts:0.8),
AND (4girls), masterpiece, best quality, best illustration, (indoors, scenery, hotel living room, couch, night:1.4), (white turtleneck sweater, black leggings:1.4), smile, (looking away), (ookami mio:1.2), animal ears, tail, (wolf ears), wolf girl, wolf tail, yellow eyes, (black hair), red hair, streaked hair, (long hair), animal ear fluff, (small breasts:1.4),
AND (4girls), masterpiece, best quality, best illustration, (indoors, scenery, hotel living room, couch, night:1.4), (red winter coat, black skirt, pantyhose:1.4), (grin, happy:1.2), (looking away), (nakiri ayame:1.2), hololive, bangs, black ribbon, gradient hair, hair ribbon, long hair, (ponytail), (white hair), oni, (oni horns:1.2), red eyes, red hair, ribbon, silver hair, small breasts,
Negative prompt: (worst quality, low quality:1.4), (depth of field, blurry, bokeh, shallow depth of field:1.5), (blurry background:1.2), (tanlines, white skin, pale skin, multiple bellybuttons, swimming pool), signature, watermark, username, artist name, text, censored, cropped, text, speech bubble, letterboxed, JPEG artifacts, lowres, bad anatomy, (bad hands, error, missing fingers, extra digit, fewer digits:1.4), cropped, worst quality, low quality, normal quality, 3D, black background, (3girls), nude, nsfw, bare legs, (bench), (bare shoulders), (bare arms), (skin-tight clothing:1.3), (cleavage), (looking at viewer:1.4), collared shirt, black gloves, gloves, apple inc., bottomless, (lips, pink lips), (cityscape, skyline, city, city lights:1.4),
Steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 9, Seed: 447411579, Size: 768x432, Model hash: cd12b7cc22, Denoising strength: 0.45, Clip skip: 2, ENSD: 31337, ControlNet Enabled: True, ControlNet Module: openpose, ControlNet Model: control_sd15_openpose [fef5e48e], ControlNet Weight: 0.85, ControlNet Guidance Start: 0, ControlNet Guidance End: 1, Latent Couple: "divisions=1:1,1:4,1:4,1:4,1:4 positions=0:0,0:0,0:1,0:2,0:3 weights=0.2,0.8,0.8,0.8,0.8 end at step=30", Hires upscale: 2.5, Hires steps: 20, Hires upscaler: 4x-AnimeSharp, AddNet Enabled: True, AddNet Module 1: LoRA, AddNet Model 1: thickerLinesAnimeStyle_loraVersion(58c5f51b2b68), AddNet Weight A 1: 0.2, AddNet Weight B 1: 0.2
```
###Divisions
These are the default settings:
```
divisions=1:1,1:2,1:2
positions=0:0,0:0,0:1
weights=0.2,0.8,0.8
```
This splits the image in half, straight down the middle, as seen in both examples above.
!!! note fuck math
You can use [this](http://badnoise.net/latentcoupleregionmapper/) tool to draw your desired divisions onto a grid and have it spit the latent couple settings out for you!
I used [this](https://jsfiddle.net/usw9jfmh/) very helpful tool made by an anon to play with the division settings, see how the split changes in real time, and make all the screenshots you see below. The example in the next section just so happens to split the image differently, so this topic will continue there.
###Can you use it with ControlNet?
Yes. The Toko/Suisei upskirt above was made using the openpose ControlNet model and the [posex](https://github.com/hnmr293/posex) extension.
-> [![yes.png](https://i.postimg.cc/LXzpN3mG/yes.png)](https://postimg.cc/KkvwYLk5) <-
```
best quality, (2girls), kitchen, laundry room, washing machine, door, cabinet, vent \(object\), oven hood, suit, black necktie, (blood on clothes), maytag, large breasts, hololive, wallpaper \(object\), best shadow, official art, looking at viewer,
AND best quality, 2girls, kitchen, laundry room, washing machine, door, cabinet, vent \(object\), suit, black necktie, blood on clothes, yukihana lamy, hololive, light blue hair, pointy ears, yellow eyes, smug, drinking, mug, looking down, large breasts, black suit, white shirt, slouching, best shadow, official art, looking at viewer, straight-on,
AND best quality, 2girls, kitchen, laundry room, washing machine, door, cabinet, vent \(object\), suit, black necktie, (blood on clothes), ((shiranui flare, hololive, pointy ears, red eyes, blonde hair)), afro, curly hair, tan skin, coffee mug, pointing to the side, large breasts, confident, best shadow, official art, looking at viewer, glance,
Negative prompt: fused digits, missing digits, paws, worst quality, low quality, depth of field, blurry, 3D face, photorealistic, cropped, lowres, jpeg artifacts, username, blurry, artist name, trademark, Reference sheet, letterbox, censored, censorship, comic, bad hands, bad feet, bad anatomy, error, missing fingers, extra digit, fewer digits, extra limb, missing limb, poorly drawn hands, mutated limbs, mutated feet, mutated hand, fused digits, claws, cowboy hat, multiple views, bad multiple views, extra arms, extra legs, fat, shelf, 1boy, 2boys, male focus, long fingers, white suit, refrigerator, microwave,
Steps: 35, Sampler: DPM++ 2S a Karras, CFG scale: 12, Seed: 1182590391, Size: 768x328, Model hash: 6f06efaec6, Model: GrapefruitJuice.pruned, Denoising strength: 0.5, Clip skip: 2, ENSD: 31337, ControlNet Enabled: True, ControlNet Module: hed, ControlNet Model: control_hed-fp16 [13fee50b], ControlNet Weight: 1, Latent Couple: "divisions=1:1,1:1.5,1:3 positions=0:0,0:0,0:2 weights=0.3,0.9,0.9 end at step=25", Hires resize: 1280x544, Hires upscaler: lollypop
```
-> [![tmpbb1mv6vz.png](https://i.postimg.cc/1487Yyp4/tmpbb1mv6vz.png)](https://i.postimg.cc/CM7y515C/tmpbb1mv6vz.png) [![mpv-shot0102.png](https://i.postimg.cc/gXns5KL1/mpv-shot0102.png)](https://i.postimg.cc/C142RPhM/mpv-shot0102.png) <-
>Hey, this isn't split right down the middle!
I know.
!!! warning I ain't reading all that shit
You can use [this](http://badnoise.net/latentcoupleregionmapper/) tool to draw your desired divisions onto a grid and have it spit the latent couple settings out for you!
[Skip TL;DR](https://rentry.org/vtaiFAQ#ogey-how-2girls-with-lora)
-> [![Screenshot-2023-03-30-151831.png](https://i.postimg.cc/brn4zQCp/Screenshot-2023-03-30-151831.png)](https://postimg.cc/K3Z0fghH) <-
In my opinion, the way it handles the divisions and positions makes no sense whatsoever, but here is how I understood it:
>1:1 for the first division (almost) always.
This covers the entire picture with your first-line `2girls` prompt.
!!! note akshully...
There is nothing stopping you from dividing the image multiple times, and subdividing it multiple times. But I'm not about to try and explain it. You could, for example, use latent couple to split the image up into very specific regions while e.g. upscaling, and "make your own" super SD Upscale. In theory, at least...
>1:1.5 and 1:3 for divisions 2 and 3
Each ratio is `y`:`x` - the first number divides the height of the whole image, the second divides the width.
>1:1.5 = 1:1 y, 1:1.5 x = 100% of y, 2/3 of x
>1:3 = 1:1 y, 1:3 x = 100% of y, 1/3 of x
Yes, it's as dumb as it sounds. Had I done e.g. `2:1.5`, I'd have only gotten Lamy in the top half.
-> [![Screenshot-2023-03-30-152759.png](https://i.postimg.cc/NMCrgXyJ/Screenshot-2023-03-30-152759.png)](https://postimg.cc/PpZqyPQY) <-
>OK, what about the positions?
It gets dumber.
>positions=0:0,0:0,0:2
0:0 is easy enough. The origin is the top and the left. Again, they're y:x, but now they're measured in _the sizes of the divisions themselves_ and are **independent of the other positions**. The **only** thing a position value is compared against is the size of that particular division itself.
>0:2 = y origin is still 0, x origin is _twice the width of the division itself away from the left_
>division is 1/3
>2 x 1/3 is 2/3
>so it's on the right, flush
And now Lamy is in the bottom half:
>Positions: 0:0,1:0,0:2
Because `1:` for the second position is telling it _place this division `1 entire height of this division` away from the top of the image_.
-> [![Screenshot-2023-03-30-153756.png](https://i.postimg.cc/pXMx7TrK/Screenshot-2023-03-30-153756.png)](https://postimg.cc/YjRJLt1S) <-
Here's another stupid, exaggerated example. Yes, I know it overlaps.
-> [![Screenshot-2023-03-30-153231.png](https://i.postimg.cc/CLqJQhXT/Screenshot-2023-03-30-153231.png)](https://postimg.cc/vcGW1sG3) <-
>Divisions: 1:1,1:1.5,1:4
>Positions: 0:0,0:0,0:0.5
The size of the Flare division is now 1/4 of the width of the image because of the `:4` - the `1:` only affects the height! Because its position is `0:0.5`, the division sits 1/8th of the image width away from the left - 1/8th is half of 1/4th. Hope you paid attention in math class!
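If that still reads like moon runes, here's a tiny hypothetical helper - not the extension's actual parser, just the arithmetic described above - that turns the division/position strings into normalized `(x, y, width, height)` rectangles:
```python
# Hypothetical illustration of latent couple's division math, per the TL;DR above.
def regions(divisions: str, positions: str):
    rects = []
    for div, pos in zip(divisions.split(","), positions.split(",")):
        dy, dx = (float(n) for n in div.split(":"))  # divisors for height, width
        py, px = (float(n) for n in pos.split(":"))  # offsets, in division-sizes
        h, w = 1 / dy, 1 / dx                        # size as a fraction of the image
        rects.append((px * w, py * h, w, h))         # (x, y, width, height)
    return rects

# The Lamy/Flare example: whole frame, left two-thirds, right third (flush)
print(regions("1:1,1:1.5,1:3", "0:0,0:0,0:2"))
# -> [(0.0, 0.0, 1.0, 1.0), (0.0, 0.0, ~0.67, 1.0), (~0.67, 0.0, ~0.33, 1.0)]
```
Running it on the `4girls` settings (`"1:1,1:4,1:4,1:4,1:4"`, `"0:0,0:0,0:1,0:2,0:3"`) gives the full frame plus four quarter-width columns, which is exactly how that image came out.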
###ogey, how `2girls` with LoRA?
Keep reading.
##What is Composable LoRA?
!!! warning I'm still writing this
* Composable LoRA allows you to apply LoRA **to a specific part of the prompt**
* It also uses `AND `, just like 2shot
* It also works _with_ 2shot - you can use both at the same time
Here's a really silly example - the first time I tried using it.
>can I give Watame IRyS's horns and nothing else?
Kinda
-> [![kek.jpg](https://i.postimg.cc/hthGDDzC/kek.jpg)](https://postimg.cc/cvyWTNcY) <-
You don't need to see any more of my ridiculous prompts, but here's the important part:
```
big stupid prompt, silly words, center opening my beloved, booba, etc.
AND horns
```
The image on the right had that added to it after the long-ass prompt that was also in the image on the left. It applied the IRyS LoRA **only** to the "horns" in the prompt.
##What is ControlNet?
###whAt cONtroLneT moDel iS beSt?
!!! warning use the right tool for the job
There is no "best" ControlNet model. Each one does something different, is good at something else, and will work better or worse depending on the input image used.
-> [![mpv-shot0099-768.png](https://i.postimg.cc/s2YHC1qZ/mpv-shot0099-768.png)](https://postimg.cc/kV5vxMnn) <-
####canny
* Edge detection. Think of it like tracing.
[![Canny-tmpmuo73hif.png](https://i.postimg.cc/wMZQmdnp/Canny-tmpmuo73hif.png)](https://i.postimg.cc/wMZQmdnp/Canny-tmpmuo73hif.png)
-> [![Canny result](https://i.postimg.cc/68bvxrXX/00005-1420779218-best-quality-houshou-marine-hololive-I-m-iron-man-iron-man-cosplay-snappin.png)](https://i.postimg.cc/9fCtTbgr/00005-1420779218-best-quality-houshou-marine-hololive-I-m-iron-man-iron-man-cosplay-snappin.png) ->
* It didn't work well here, since the pre-processor couldn't see the outlines clearly. Not enough contrast or something. The output image bears little resemblance to the source.
* canny works better on 2D art with clear line art.
[![kajou-ayame-and-blue-snow-megami-magazine-and-1-more-drawn-by-fujii-masahiro-c840322ae113f944a202.jpg](https://i.postimg.cc/RNBf719S/kajou-ayame-and-blue-snow-megami-magazine-and-1-more-drawn-by-fujii-masahiro-c840322ae113f944a202.jpg)](https://i.postimg.cc/jdWXrvF7/kajou-ayame-and-blue-snow-megami-magazine-and-1-more-drawn-by-fujii-masahiro-c840322ae113f944a202.jpg) Source
-> canny pre-processor output [![tmp2w97pp1t.png](https://i.postimg.cc/VdCG6Wgz/tmp2w97pp1t.png)](https://i.postimg.cc/yYfwLfS3/tmp2w97pp1t.png) ->
-> Result [![OH HO HO HO](https://i.postimg.cc/C1VtRX9X/00252-2806942403-best-quality-houshou-marine-hololive-akasaai-panties-on-head-1girl-solo.png)](https://youtu.be/hfBY2E3snFs?t=6) Result <-
```
best quality, houshou marine, hololive, akasaai, (((panties on head))), 1girl, solo, leg lift, kajou ayame, blue snow, (naked cape), action, smug, ofuda, shimoneta to iu gainen ga sonzai shinai taikutsu na sekai, red eyes, ;D, ^ ^, red hair, sidelighting, best shadow, official art, promotional art, megami magazine, slim legs, wide hips, groin tendon, narrow waist, floating hair, dutch angle, underboob, fang, light rays, light particles, soles, barefoot, one eye closed, ((bottomless), ass), outstretched arms, choker, cape, bed sheet, fleeing, explosion, outstretched hand, convenient censoring, (bare hips, bare ass), butt crack, completely nude, see-through silhouette,
Negative prompt: fused digits, missing digits, paws, (worst quality, low quality), lowres, jpeg artifacts, username, blurry, bokeh, artist name, trademark, Reference sheet, letterbox, censored, censorship, comic, bad hands, bad feet, bad anatomy, error, missing fingers, extra digit, fewer digits, extra limb, (missing limb), poorly drawn hands, mutated limbs, mutated feet, mutated hand, fused digits, claws, cowboy hat, multiple views, bad multiple views, extra arms, extra legs, fat, leather, snow, snowing, winter, winter clothes, ((nipples, pussy)), mosaic censoring, eyewear on head, ribbon, serafuku, sailor collar, ascot, on bed, bedroom, jacket, buttons, epaulettes, jacket on shoulders, paws, claws, broken finger, leggings, bra, bikini, rope, shibari, strap slip, panty straps, thigh strap, garter straps, bridal garter, tanlines,
Steps: 30, Sampler: DPM++ SDE Karras, CFG scale: 11, Seed: 2806942403, Size: 768x1152, Model hash: d179dccea9, Model: mighty_mix.pruned, Clip skip: 2, ENSD: 31337, ControlNet Enabled: True, ControlNet Module: canny, ControlNet Model: control_canny-fp16 [e3fe7712], ControlNet Weight: 1.05
```
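For the curious, the canny pre-processor is plain old Canny edge detection. A rough stand-alone sketch with OpenCV - the thresholds here are made-up examples; webui's annotator picks its own:
```python
# Roughly what the canny pre-processor produces: a black image with white edges.
import cv2

img = cv2.imread("source.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, threshold1=100, threshold2=200)  # hysteresis thresholds
cv2.imwrite("canny_map.png", edges)
# A low-contrast source yields a sparse edge map - which is why the live-action
# frame above traced so poorly compared to the clean 2D line art.
```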
!!! note
Yes, ControlNet means you are far more likely to get away with a higher base resolution without getting weird artifacts. I actually got better results with this particular example just generating directly at 768x1152 than I did with e.g. trying to use Hi-Res Fix on a 512x768 base gen.
####HED
* Also edge detection. Less like tracing, more like an edge detection filter you might use in Photoshop.
[![HED-tmpkxkdkyfn.png](https://i.postimg.cc/9FwpjrqF/HED-tmpkxkdkyfn.png)](https://i.postimg.cc/9FwpjrqF/HED-tmpkxkdkyfn.png)
-> [![00017-1657544479-best-quality-kaela-kovalskia-hololive-hololive-indonesia-I-m-iron-man-iron-man.png](https://i.postimg.cc/qNg7ygtB/00017-1657544479-best-quality-kaela-kovalskia-hololive-hololive-indonesia-I-m-iron-man-iron-man.png)](https://i.postimg.cc/52SFcCbz/00017-1657544479-best-quality-kaela-kovalskia-hololive-hololive-indonesia-I-m-iron-man-iron-man.png) [![00003-1420779218-best-quality-houshou-marine-hololive-I-m-iron-man-iron-man-cosplay-snappin.png](https://i.postimg.cc/2VwW2p93/00003-1420779218-best-quality-houshou-marine-hololive-I-m-iron-man-iron-man-cosplay-snappin.png)](https://i.postimg.cc/HkGtWdQ0/00003-1420779218-best-quality-houshou-marine-hololive-I-m-iron-man-iron-man-cosplay-snappin.png) ->
* This worked really well. The pre-processor could see the outlines we wanted it to. It worked better on Kaela, go figure.
####openpose
* For... you guessed it, copying poses.
* If you use the openpose pre-processor/module, you **should be using it on an image of real people**, as that is what it was designed for.
* Otherwise, you can use an extension like [posex](https://github.com/hnmr293/posex) to manipulate your own poses.
[![Open-Pose-tmp9a3l0atp.png](https://i.postimg.cc/VsVRTjvV/Open-Pose-tmp9a3l0atp.png)](https://postimg.cc/gr38LLvV)
-> [![00007-1420779218-best-quality-houshou-marine-hololive-I-m-iron-man-iron-man-cosplay-snappin.png](https://i.postimg.cc/dhVyfWg6/00007-1420779218-best-quality-houshou-marine-hololive-I-m-iron-man-iron-man-cosplay-snappin.png)](https://i.postimg.cc/Nj1k5S43/00007-1420779218-best-quality-houshou-marine-hololive-I-m-iron-man-iron-man-cosplay-snappin.png) ->
* Again, the source material just wasn't right for the openpose pre-processor. It could see his eyes and his ears very clearly, but got thrown way off by his hand - the two lines at the bottom are where it thought his arms were. It's not a "normal" image of people; it's covered in Hollywood SFX. You'd have had better luck manually posing a stick figure using posex.
##What is SD Upscale?
##What is Dynamic Thresholding?
##What is `cutoff`?
#Samplers, Settings, Upscaling, Etc.
##TBD
##Samplers
* DPM++ samplers
    * Fewer steps needed
* Euler, Euler a
    * Anything beyond ~40 steps is unnecessary
    * "Good" results as low as ~28
* DDIM
    * Go up to 100 steps if you want, it will keep doing things
* Other
    * No idea
##Settings
###Resolution
* Most models you use will work best at "base" resolutions like 512x768, 512x512
    + Stick with those if you're in doubt
* Landscape gens = +prompt better
    + Prompt an "impossible pose" at 768x512?
    + `hello fleshblobs my old friend`
* Larger base gens, like 768x768 = ++prompt better
#Abbreviations, Terms, Etc.
##venv
* [_Python_] **Virtual Environment**
* If you're running webui _locally_ and you need or want to `pip install` something for webui or for an extension, **you really ought to be doing it with your _venv active_**
* screenshots
* the shift-right-click context menu can differ between Windows versions
* `activate.bat` / `activate.ps1`
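For example (assuming the default `venv` folder webui creates for itself):
>While at a command prompt, in your `.../stable-diffusion-webui/` folder
`venv\Scripts\activate.bat` (cmd), `venv\Scripts\Activate.ps1` (PowerShell), or `source venv/bin/activate` (Linux)
Activates the venv - your prompt gains a `(venv)` prefix, and `pip install whatever` now installs into webui's environment instead of your system Python. Run `deactivate` to leave it.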
##git
* Go on, git. Go. Leave!
* Use it to upgrade, downgrade, or install webui itself and extensions
>While at a command prompt, in your `.../stable-diffusion-webui/` folder
`git pull`
Updates WebUI
`git checkout 1234abc`
"Downgrades" to hypothetical commit `1234abc`
>While at a command prompt, in your `.../stable-diffusion-webui/extensions/` folder
`git clone https://github.com/someguy/someextension.git`
Installs `someguy`'s `someextension` for webui
Consider all following explanations overly simplified
##TI, embed
* **Textual Inversion**
* **embedding**
* Small .png or .pt files that can more easily generate something the model you're using was _already_ capable of
###I want some
[all of /vtai/'s embeds](https://mega.nz/folder/23oAxTLD#vNH9tPQkiP1KCp72d2qINQ/folder/L2AmBRZC)
You can find more on places like civitai or huggingface
##LoRA
* **Low-Rank Adaptation**
* Think of them like "mini models"
* Can "teach" whatever model you're using to generate something it couldn't, like:
    * a different character
    * a new concept
    * an artist's style
###I want some
Too many places to list... there are some links in the OP
!!! warning What's an `overbaked` LoRA?
>go to civitai
>get LoRA
>`set strength to 0.3`
That's an overbaked LoRA.
!!! warning Warning
Under Construction lol
!!! info anything that is already better answered or explained in the OP or a guide linked there won't be re-treaded here