
Advanced Prompt Syntax

Weighting


  • Prompt weighting, eg an (orange) cat or an (orange:1.5) cat. Anything in (parens) has its weighting modified - meaning, the model will pay more attention to that part of the prompt. Values above 1 are more important, values below 1 (eg 0.5) are less important.
    • You can also hold Control and press the up/down arrow keys to change the weight of selected text.
    • Note: this presumes a default Comfy backend.
    • This varies based on models - CLIP-based models (eg Stable Diffusion) work well with this, but newer models based on T5 or an LLM TextEnc do not.
      • Basically, SDXL and SD3 are the last models this was properly relevant to.
      • For other models, this syntax is not supported. Parentheses will not be parsed at all, and are instead forwarded directly to the model.
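As a loose illustration of the behavior described above, here is a minimal Python sketch of splitting a prompt into (text, weight) chunks. The function name is hypothetical, and the 1.1 default boost for bare parentheses is an assumption for illustration, not SwarmUI's actual value:

```python
import re

def parse_weighted(prompt):
    """Split a prompt into (text, weight) chunks, a simplified sketch of
    weighting syntax like 'an (orange:1.5) cat'. Nested parentheses are
    not handled. Bare parens like '(orange)' get a default boost
    (1.1 here is an assumption; real backends choose their own)."""
    chunks = []
    pos = 0
    for m in re.finditer(r"\(([^():]+)(?::([\d.]+))?\)", prompt):
        if m.start() > pos:
            chunks.append((prompt[pos:m.start()], 1.0))  # unweighted text
        weight = float(m.group(2)) if m.group(2) else 1.1
        chunks.append((m.group(1), weight))
        pos = m.end()
    if pos < len(prompt):
        chunks.append((prompt[pos:], 1.0))
    return chunks
```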

Alternating


  • You can use <alternate:cat, dog> to alternate every step between cat and dog, creating a merge/mixture of the two concepts.
    • Similar to random, you can instead use | or || to separate entries, eg <alternate:cat || dog>. You can have as many unique words as you want, eg <alternate:cat, dog, horse, wolf, taco> has 5 words, so it cycles through them one per step, repeating every 5 steps.
    • You can shorthand this as <alt:cat,dog>
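Conceptually the alternation is a simple modulo cycle over the entries. A minimal sketch (the helper is hypothetical, not SwarmUI code):

```python
def alternate(options, step):
    """Pick which option is active at a given sampling step.
    <alternate:cat, dog> swaps every step, cycling through the list."""
    return options[step % len(options)]
```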

From-To


  • You can use <fromto[#]:before, after> to swap between two phrases after a certain timestep.
    • The timestep can be like 10 for step 10, or like 0.5 for halfway-through.
    • Similar to random, you can instead use | or || to separate entries. There must be exactly two entries.
    • For example, <fromto[0.5]:cat, dog> swaps from cat to dog halfway through a generation.
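The timestep rule above can be sketched as follows. The helper is hypothetical; it assumes an integer threshold counts sampling steps and a fractional one is a fraction of total steps, per the description:

```python
def fromto_active(threshold, step, total_steps):
    """Return True once the 'after' phrase should be used.
    A threshold >= 1 is an absolute step number; a value below 1
    is a fraction of the total step count (0.5 = halfway)."""
    swap_at = threshold if threshold >= 1 else threshold * total_steps
    return step >= swap_at
```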

Random


  • You can use the syntax <random:red, blue, purple> to randomly select from a list for each gen
    • This random is seeded by the main seed - so if you have a static seed, this won't change.
      • You can override this with the Wildcard Seed parameter
      • If your randoms won't change but your seed is changing, check whether you've accidentally enabled the Wildcard Seed parameter.
    • You can use , to separate the entries, or |, or ||. Whichever is most unique gets used - so if you want random options with , in them, just use | as a separator, and , will be ignored (eg <random:red|blue|purple>).
    • An entry can contain the syntax of eg 1-5 to automatically select a number from 1 to 5. For example, <random:1-3, blue> will give back any of: 1, 2, 3, or blue.
      • Or eg <random:0.8-1.2> to get any of 0.8, 0.9, 1.0, 1.1, 1.2 (the number of decimal places in the output matches the number used in the inputs)
    • You can repeat random choices via <random[1-3]:red, blue, purple> which might return for example red blue or red blue purple or blue.
      • You can use a comma at the end of the brackets, eg <random[1-3,]:red, blue, purple>, to specify the outputs should be comma-separated, eg red, blue.
      • This avoids repeating the same option, unless the count is larger than the number of options.
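The separator and number-range rules above can be sketched roughly like this. The helper names are hypothetical, and the separator priority (|| over | over ,) is an assumption based on the "most unique" description:

```python
import random

def split_options(body):
    """Pick the separator by priority: '||' wins over '|', which wins
    over ',', so options may themselves contain commas."""
    for sep in ("||", "|", ","):
        if sep in body:
            return [part.strip() for part in body.split(sep)]
    return [body.strip()]

def expand_entry(entry, rng):
    """Expand a numeric 'low-high' entry into a random value, keeping
    the decimal precision of the inputs (e.g. 0.8-1.2 steps by 0.1).
    Non-numeric entries are returned unchanged."""
    try:
        low_s, high_s = entry.split("-")
        decimals = max(len(s.split(".")[1]) if "." in s else 0
                       for s in (low_s, high_s))
        step = 10 ** -decimals
        count = round((float(high_s) - float(low_s)) / step)
        value = float(low_s) + rng.randint(0, count) * step
        return f"{value:.{decimals}f}" if decimals else str(round(value))
    except ValueError:
        return entry
```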

Wildcards


  • You can use the syntax <wildcard:my/wildcard/name> to randomly select from a wildcard file, which is basically a pre-saved text file of random options, 1 per line.
    • Edit these in the UI at the bottom in the "Wildcards" tab.
    • You can also import wildcard files from other UIs (ie text file collections) by just adding them into the Data/Wildcards folder.
    • This supports the same syntax as random to get multiple, for example <wildcard[1-3]:animals> might return cat dog or elephant leopard dog.
    • You can shorthand this as <wc:my/wildcard/name>
    • This random is seeded by the main seed - so if you have a static seed, this won't change.
      • You can override this with the Wildcard Seed parameter
      • If your wildcards won't change but your seed is changing, check whether you've accidentally enabled the Wildcard Seed parameter.
    • You can exclude certain values like so: <wildcard:animals,not=cat,dog>
      • This can be combined with Variables, eg a photo of a <setvar[animal]:<wildcard:animals>> playing with a <wildcard:animals,not=<var:animal>>
    • You can type eg <wildcard:animals: with a colon at the end, to then get a search in your prompt box for autocompletions of the values inside.
      • For example, you can type <wildcard:animals:do and dog will pop up as an option.
      • Note that this is a purely frontend feature: it's a UI convenience trick for users who want to grab specific lines from wildcards easily. Do not submit prompts with a stray :.
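A rough sketch of seeded wildcard selection with not= exclusions, under the assumption that the wildcard seed deterministically drives the choice (the helper name is hypothetical):

```python
import random

def pick_wildcard(lines, seed, exclude=()):
    """Pick one line from a wildcard file's lines, seeded so a static
    seed gives a repeatable result, with optional 'not=' exclusions."""
    options = [ln for ln in lines if ln not in exclude]
    return random.Random(seed).choice(options)
```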

Variables


  • You can store and reuse variables within a prompt. This is primarily intended for repeating randoms & wildcards.
    • Store with the syntax: <setvar[var_name,emit]:data> where emit is true or false (defaults true)
      • For example: <setvar[color]:<random:red, blue, purple>>
    • Call back with the syntax: <var:var_name>
      • For example: <var:color>
    • Here's a practical full example: a photo of a woman with <setvar[color]:<random:blonde, black, red, blue, green, rainbow>> hair standing in the middle of a wide open street. She is smiling and waving at the camera, with beautiful sunlight glinting through her <var:color> hair. <segment:face and hair> extremely detailed close up shot of a woman with shiny <var:color> hair
      • Notice how the var is called back, even in the segment, to allow for selecting a random hair color but keeping it consistent within the generation
    • If you want to avoid the setvar emitting a copy of the value, you can use eg <setvar[color,false]:x y z>
      • For example, a <setvar[color]:red> dog with <var:color> eyes becomes a red dog with red eyes,
      • but <setvar[color,false]:red> a dog with <var:color> eyes becomes a dog with red eyes
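A simplified sketch of the setvar/var expansion described above, assuming plain-text values (real setvar bodies can contain nested syntax like <random:...>, which this toy version does not handle):

```python
import re

def resolve_vars(prompt):
    """Expand <setvar[name]:value> / <setvar[name,false]:value> and
    <var:name> in a left-to-right pass. A setvar emits its value
    unless the emit flag is 'false'."""
    variables = {}

    def set_var(m):
        name, emit, value = m.group(1), m.group(2), m.group(3)
        variables[name] = value
        return value if emit != "false" else ""

    prompt = re.sub(r"<setvar\[(\w+)(?:,(\w+))?\]:([^<>]*)>", set_var, prompt)
    return re.sub(r"<var:(\w+)>", lambda m: variables[m.group(1)], prompt)
```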

Macros


  • Similar to variables, you can store and reuse chunks of prompt syntax as a macro. This is useful for dynamically repeating complicated randoms.
    • Store with the syntax: <setmacro[macro_name,emit]:data> where emit is true or false (defaults true)
      • For example: <setmacro[color]:<random:red, blue, purple>>
    • Call back with the syntax: <macro:macro_name>
      • For example: in a room with <macro:color> walls, <macro:color> floors, and <macro:color> carpet
    • Unlike Variables, macros are not evaluated when being set, but instead are evaluated when used via <macro:...>
    • Here's a full example: Photo of a woman with <setmacro[color]:<random:red|white|green|blue|purple|orange|black|brown>> hair, <macro:color> shirt, <macro:color> pants
      • A separate random color will be chosen for hair, shirt, and pants.
    • If you want to avoid the setmacro emitting a copy of the value, you can use eg <setmacro[color,false]:x y z>
      • For example, a <setmacro[color]:red> dog with <macro:color> eyes becomes a red dog with red eyes,
      • but <setmacro[color,false]:red> a dog with <macro:color> eyes becomes a dog with red eyes
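The key difference from variables is lazy evaluation, which can be sketched like this (hypothetical helper: a macro is stored as a callable, so every use re-rolls its random):

```python
import random

def make_macro(options, rng):
    """A macro stores its body unevaluated; each <macro:...> call
    re-runs it, so every use can pick a different random option
    (contrast with variables, which freeze their value at setvar time)."""
    return lambda: rng.choice(options)

rng = random.Random(7)  # stand-in for the generation's wildcard seed
color = make_macro(["red", "white", "green", "blue"], rng)
hair, shirt, pants = color(), color(), color()  # each call rolls fresh
```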

Trigger Phrase


  • If your model or current LoRAs have a trigger phrase in their metadata, you can use <trigger> to automatically apply those within a prompt.
    • If you have multiple models with trigger phrases, they will be combined into a comma-separated list. For example cat and dog will be inserted as cat, dog.
    • Semicolons in trigger phrases are automatically replaced with commas. For example, cat; dog will be replaced with cat, dog.
    • Note this is just a simple autofill, especially for usage in grids or other bulk generations, and not meant to robustly handle all cases. If you require specific formatting, you'll want to just copy the trigger phrase in directly yourself.
    • If there's no trigger data to fill, it inserts nothing.
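The combining behavior above can be sketched as (hypothetical helper):

```python
def combine_triggers(phrases):
    """Join trigger phrases from multiple models into one comma-separated
    list, replacing semicolons with commas; empty when there is no data."""
    cleaned = [p.replace(";", ",") for p in phrases if p.strip()]
    return ", ".join(cleaned)
```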

Repeat


  • You can use the syntax <repeat[3]:cat> to get the word "cat" 3 times in a row (cat cat cat).
    • For example, <repeat[1-3]: <random:cat, dog>> gives between 1 and 3 copies of either cat or dog; it might return cat dog cat.

Textual Inversion Embeddings

  • You can use <embed:filename> to use a Textual Inversion embedding in the prompt or negative prompt.
    • Store embedding files in (SwarmUI)/Models/Embeddings.
    • Embedding files were popular in the SDv1 era, but are less common for newer models.

LoRAs


  • You may use <lora:filename> to enable a LoRA, or <lora:filename:weight> to enable it and set a weight
    • Note that it's generally preferred to use the GUI at the bottom of the page to select loras
    • Note that position within the prompt usually doesn't matter. LoRAs are not actually a prompt feature; this is just a convenience option for users used to Auto WebUI.
    • The one time it does matter is when you use <segment:...> or <object:...>: a LoRA inside one of these will apply only to that segment or object.
    • weight is a multiplier, where 1 is the default, 0.5 is weakened halfway, and 2 is twice as strong. Generally, numbers larger than 2 will destroy image quality.
    • You may also use <lora:filename:backbone_weight:textenc_weight> to enable a lora and set its backbone (unet/dit) weight separately from its text encoder weight.

Presets


  • You can use <preset:presetname> to inject a preset.
    • The GUI is generally preferred for presets; this syntax is available to allow dynamically messing with presets (eg <preset:<random:a, b>>)
    • You can shorthand this as <p:presetname>

Params

  • You can directly set generation parameters via <param[paramName]:paramValue>
    • For example, <param[CFG Scale]:1> or <param[cfgscale]:1> sets CFG Scale to 1.
      • Note the name is case-insensitive, and spaces are ignored. You can copy-paste names directly from the UI, API structure, Metadata, wherever, it'll just work, as long as you don't typo it.
    • You can combine this with sub-syntax, eg <param[cfgscale]:<random:1,2,3>> to set CFG Scale to a random value.
    • This supports any parameter in SwarmUI - that is, the inputs listed on the left side of the Generate tab.
    • Some parameters can be 'sectionalized' - that is, apply to specific sections, such as <refiner> or <base> or <video> or <segment:...> or <extend:...> etc.
      • This includes: CFG Scale, Steps, Sampler, Scheduler
      • So for example, <video> <param[cfgscale]:5> will set the CFG Scale of the video section only to 5.
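The name-matching rule can be sketched as a simple normalization (hypothetical helper):

```python
def normalize_param_name(name):
    """Parameter names are case-insensitive and ignore spaces, so
    'CFG Scale', 'cfgscale', and 'cfg scale' all refer to one parameter."""
    return name.lower().replace(" ", "")
```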

Automatic Segmentation and Refining


  • You can use <segment:texthere> to automatically refine part of the image using CLIP Segmentation.
    • This is like a "restore faces" feature, but much more versatile: you can refine anything and control what it does.
    • Or <segment:texthere,creativity,threshold> - where creativity is inpaint strength, and threshold is segmentation minimum threshold - for example, <segment:face,0.6,0.5> - defaults to 0.6 creativity, 0.5 threshold.
    • See the feature announcement for details.
    • Note the first time you run with CLIPSeg, Swarm will automatically download an fp16 safetensors version of the clipseg-rd64-refined model
    • You can insert a <lora:...> inside the prompt area of the segment to have a lora model apply onto that segment
    • You can also replace the texthere with yolo-modelnamehere to use YOLOv8 segmentation models (this is what "ADetailer" uses)
      • store your models in (Swarm)/Models/yolov8
      • Examples of valid YOLOv8 Segmentation models here: https://github.com/hben35096/assets/releases/
      • You can also do yolo-modelnamehere-1 to grab exactly match #1, and -2 for match #2, and etc.
        • You can do this all in one prompt to individually refine specific faces separately.
        • Without this, if there are multiple people, it will do a bulk segmented refine on all faces combined.
        • Note the index order is sorted from leftmost detection to right.
      • To control the creativity/threshold with a yolo model just append ,<creativity>,<threshold>, for example <segment:yolo-face_yolov8m-seg_60.pt-1,0.8,0.25> sets a 0.8 creativity and 0.25 threshold.
        • Note the default "confidence threshold" for Yolo models is 0.25, which is different from what is often used with ClipSeg, and Yolo does not have a "max threshold" like ClipSeg does.
      • If you have a yolo model with multiple supported classes, you can filter specific classes by appending :<classes>: to the model name where <classes> is a comma-separated list of class IDs or names, e.g., <segment:yolo-modelnamehere:0,apple,2:,0.8,0.25>
    • You can also combine multiple areas into a single segment to refine them as a single group.
      • Separate the areas with | in texthere.
      • For example, <segment:face|hair> will find all the faces and hair in the image and refine them as a single group.
      • This works with YOLOv8 models as well.
        • <segment:yolo-face_yolov8m-seg_60.pt | yolo-hair_yolov8m-seg_60.pt | fingers> will refine the group of faces and hair (found by YOLO) and fingers (found by CLIPSeg) as a single group.
    • There's an advanced parameter group named Segment Refining which can configure additional options for this
      • Segment Model to customize the base model used for segment processing.
      • Save Segment Mask to save a preview copy of the generated mask.
      • Segment Target Resolution to control what resolution the segment is generated at. Users have noted that for some models (such as SDXL), 1248x1824 is a very good target resolution for face fixes.
      • Other parameters too, see the ? button in-UI next to each option.

Clear (Transparency)


  • You can use <clear:texthere> to automatically clear parts of an image to transparent. This uses the same input format as segment (above) (for obvious reasons, this requires PNG not JPG).
    • For example, <clear:background> to clear the background.
    • The Remove Background dedicated parameter is generally better than autosegment clearing.

Break Keyword

  • You can use <break> to specify a manual CLIP section break (eg in Auto WebUI this is BREAK).
    • If this is confusing: it's a bit of an internal hacky thing, so don't worry about it. But if you want to know, here's the explanation:
      • CLIP (the model that processes text input to pass to SD) has a maximum length of 75 tokens (roughly, words).
      • By default, if you write a prompt longer than 75 tokens, it gets split 75/75: the first 75 tokens go in and become one CLIP result chunk, the next tokens get passed as a second CLIP chunk, and then the multiple CLIP results are parsed by SD in a batch and mixed as it goes.
      • The problem with this is that the split point is basically arbitrary - you might have eg a photo of a big fluffy dog, and it gets split into a photo of a big fluffy and then dog (in practice 75 tokens is a much longer prompt, but this is just an example of how the split might go wrong)
      • Using <break> lets you manually specify where it splits, so you might do eg a photo <break> big fluffy dog (to intentionally put the style in one chunk and the subject in the next)
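The chunking behavior can be sketched like this (hypothetical helper; real CLIP tokenization is more involved than splitting on whitespace words):

```python
def chunk_tokens(tokens, limit=75):
    """Split a token list into CLIP-sized chunks, starting a new chunk
    at each explicit '<break>' marker or when the 75-token limit fills."""
    chunks, current = [], []
    for tok in tokens:
        if tok == "<break>" or len(current) == limit:
            chunks.append(current)
            current = []
        if tok != "<break>":
            current.append(tok)
    chunks.append(current)
    return chunks
```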

Regional Prompting

img

(The above is a photo of a cat/dog mix <region:0,0,1,0.5,1> a photo of a cat <region:0,0.5,1,0.5,1> a photo of a dog on SDXL)

  • You can use <region:x,y,width,height,strength> prompt here to use an alternate prompt for a given region.
    • The X,Y,Width,Height values are all given as fractions of image area. For example 0.5 is half the width or height of the image.
    • For example, <region:0,0,0.5,1> a cat specifies to include a cat in the full-height left half of the image.
    • Strength is how strongly to apply the regional prompt vs the global prompt.
    • You can do <region:background> to build a special region for only background areas (those that don't have their own region).
    • Note that small regions are likely to be ignored. The regional logic is applied fairly weakly to the model.
    • Note that different models behave very differently around this functionality.
      • Notably, MM-DiT models (SD3/Flux) are likely to only process regions in early steps and then entirely ignore them in later steps (they process the input image and try to retain it, devaluing your actual prompt text, so unusual combinations will make the model unhappy).
      • SDXL and models like it respond very strongly to regional prompts.
    • Regional prompts can use <lora:...> syntax to add a lora uniquely embedded in that region.
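The fractional coordinates can be converted to a pixel box like this (hypothetical helper, assuming simple rounding):

```python
def region_to_pixels(x, y, width, height, image_w, image_h):
    """Convert fractional <region:x,y,width,height> values to a pixel
    box (left, top, right, bottom) for a given image size."""
    return (round(x * image_w), round(y * image_h),
            round((x + width) * image_w), round((y + height) * image_h))
```

For example, <region:0,0,0.5,1> on a 1024x1024 image covers the full-height left half, pixels (0, 0) through (512, 1024).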

Regional Object Prompting


(The above is a photo of a cat/dog mix <object:0,0,1,0.5,0.1,0.5> a photo of a cat <object:0,0.5,1,0.5,0.1,0.5> a photo of a dog on SDXL)

  • You can use <object:x,y,width,height,strength,strength2> prompt here to use an alternate prompt for a given region, and also inpaint back over it.
    • Strength (1) is regional prompt strength (see Regional Prompting)
    • Strength2 is Creativity of the automatic inpaint.
    • The automatic inpaint can be helpful for improving quality of objects, especially for small regions, but also might produce unexpected results.
    • Objects may use global feature changes, such as <lora:...> syntax, to apply a lora to the object in the inpaint phase.

Base

  • You can use <base> to add prompt text that only goes to the base model, excluding refiner/i2v/etc. models
    • This includes being able to use <lora:> to add loras specific to the base model.

Refiner

  • You can use <refiner> to add prompt text that only goes to the refine/upscale model
    • This includes being able to use <lora:> to add loras specific to the refiner model.

Video

  • When using image2video, you can use <video> to supply an alternate prompt for the image-to-video generation.
    • For example, a photo of a cat <video> the cat walks forward
    • This includes being able to use <lora:> to add loras specific to the video model.
  • When using image2video with a swap model (eg Wan 2.2), you can use <videoswap> to supply an alternate prompt for the swap stage.
    • The <video> input will only go to the main i2v model, and the videoswap only to the swap model.
    • This includes being able to use <lora:> to add loras specific to the video swap model.

Video Extend

  • You can use <extend:frames> to extend a video by a given number of frames using an Image-To-Video model.
    • Note: This is not a very advanced or capable system currently. This is an experimental feature that only some models will respond decently to, and it will almost always have quality issues.
    • For example, <extend:33> will extend the video by 33 frames.
    • Use the Video Extend parameter group to configure values for this. At least Video Extend Model must be set.
    • Overlap must be set to less than one third of the extend frame count.
      • For many I2V models, overlap of 1 is likely ideal, unless using a model that has been trained to use overlap well.
    • Use the Advanced Video parameters as well.
    • Under Other Fixes, Trim Video End Frames may be useful on some models. Do not use Trim Start.

For example:

a video of a cat standing in a forest
<extend:81> the cat starts running through the forest
<extend:81> the cat runs up to a river
<extend:81> the cat stops running at the river edge, and drinks from it

Comment

  • You can use <comment:stuff here> to add a personal comment in the prompt box. It will be discarded from the real prompt.

Refine/Upscale Prompt Addition

  • When using the Refine/Upscale param group, you can add to your prompt <refiner> some prompt here to have that section of prompt only be used for the refiner stage.
    • This includes <lora:...> syntax to attach a lora to the refiner.