ComfyAtlas

Published May 21, 2026

Image-to-Image in ComfyUI: Style Transfer, Photo Edits, and Sketch-to-Finished-Art

Use a real image as the starting latent instead of pure noise. The full img2img workflow in ComfyUI, with denoise tuning, common use cases, and how it interacts with LoRAs and ControlNet.

Text-to-image starts from random noise. Image-to-image (img2img) starts from your image. ComfyUI encodes your input into a latent, adds noise to it part of the way, and then runs the same denoising process from that point instead of from pure noise. The output keeps the structure and broad colors of your input but takes its style and details from the prompt.

This is one of the three workflow extensions most users want, alongside LoRA and ControlNet. It’s also the simplest to wire — just two new nodes.

What img2img is good for

What it’s not good for: pixel-perfect edits to specific regions. For that you need inpainting, which is a separate workflow.

How it differs from text-to-image

In a text-to-image workflow:

Empty Latent Image  →  KSampler  →  VAE Decode  →  Save Image
   (blank canvas)    (full denoise)

In img2img:

Load Image  →  VAE Encode  →  KSampler  →  VAE Decode  →  Save Image
                              (partial denoise)

Two changes:

  1. The empty latent is replaced by your encoded image (Load Image + VAE Encode)
  2. KSampler’s denoise is set below 1.0 — usually 0.4 to 0.7 — so it preserves part of your input

That’s the whole pattern. Everything else (model, LoRAs, prompts, sampler) works the same.

The minimum nodes

Search and add:

  • Load Image
  • VAE Encode

Then re-route your existing KSampler’s latent_image input to come from VAE Encode instead of Empty Latent Image.

Wire VAE Encode’s vae input to Load Checkpoint’s VAE output.

Wiring step by step

Starting from a working text-to-image graph:

  1. Add Load Image. Drag the node onto the canvas, click “choose file to upload”, and pick your input.
  2. Add VAE Encode.
    • pixels ← Load Image’s IMAGE output
    • vae ← Load Checkpoint’s VAE (or Load VAE if you load it separately)
  3. Disconnect Empty Latent Image from KSampler’s latent_image input.
  4. Connect VAE Encode’s LATENT to KSampler’s latent_image.
  5. (Optional) Delete the Empty Latent Image node — you don’t need it anymore.

You can keep Empty Latent Image around if you want to switch between text-to-image and img2img by re-routing. ComfyUI doesn’t mind unused nodes.
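The same wiring can be written down in ComfyUI's API (JSON) workflow format, which is handy for checking that every link goes where the steps say. This is a minimal sketch, not an exported workflow: the node ids, the checkpoint file name, and the sampler settings (20 steps, cfg 7.0, euler) are placeholder assumptions. Each link is a pair of source node id and output index.

```python
import json


def build_img2img_workflow(image_name: str, prompt: str, negative: str = "",
                           denoise: float = 0.5, seed: int = 0) -> dict:
    """Return an img2img graph as a dict in ComfyUI's API workflow format."""
    return {
        # Load Checkpoint: outputs are MODEL (0), CLIP (1), VAE (2).
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "your_model.safetensors"}},  # placeholder name
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        # The two new nodes: Load Image feeds VAE Encode.
        "4": {"class_type": "LoadImage", "inputs": {"image": image_name}},
        "5": {"class_type": "VAEEncode",
              "inputs": {"pixels": ["4", 0], "vae": ["1", 2]}},
        # latent_image now comes from VAE Encode, not Empty Latent Image.
        "6": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["5", 0],
                         "seed": seed, "steps": 20, "cfg": 7.0,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": denoise}},
        "7": {"class_type": "VAEDecode",
              "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
        "8": {"class_type": "SaveImage",
              "inputs": {"images": ["7", 0], "filename_prefix": "img2img"}},
    }


if __name__ == "__main__":
    workflow = build_img2img_workflow("input.png", "oil painting of a harbor")
    print(json.dumps(workflow, indent=2))
```

Compared with a text-to-image graph, only nodes 4 and 5 and the latent_image link on node 6 are different.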

The denoise widget

This is the only knob you need to learn for img2img.

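denoise on KSampler controls how much of your image gets replaced with new content. The math: at denoise = X, the sampler adds noise to your image up to the level a full run would have with a fraction X of its steps still to go, then denoises from there. In ComfyUI the steps widget still sets how many iterations run; they are all spent on that last part of the schedule. Some other UIs instead run roughly X × total_steps iterations.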
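A toy model of how denoise picks the starting noise level can make this concrete. The sketch below assumes the sampler stretches the schedule to about steps / denoise entries and keeps only its tail, which is how ComfyUI's KSampler appears to behave; treat that as an assumption and check it against your version. The linear schedule and the function name img2img_sigmas are illustrative only, since real schedulers are not linear.

```python
def img2img_sigmas(steps: int, denoise: float, sigma_max: float = 1.0) -> list[float]:
    """Noise levels visited during sampling, on a toy linear schedule.

    The first entry is the noise level added to the input image;
    the last entry is 0.0 (fully denoised).
    """
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in (0, 1]")
    # Stretch the schedule so `steps` iterations cover only its last part.
    total = steps if denoise >= 1.0 else int(steps / denoise)
    full = [sigma_max * (1 - i / total) for i in range(total + 1)]
    # Keep the tail: steps + 1 noise levels means `steps` iterations.
    return full[-(steps + 1):]


if __name__ == "__main__":
    for d in (0.3, 0.5, 0.7, 1.0):
        sigmas = img2img_sigmas(20, d)
        print(f"denoise={d}: start noise {sigmas[0]:.2f}, iterations {len(sigmas) - 1}")
```

In this model the iteration count stays at 20 for every denoise value; only the starting noise level moves.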

Effective range:

denoise   What happens
0.1       Almost no change. The output is your input plus tiny stylistic touches.
0.3       Light style transfer. Photo stays photo-like; subtle prompt influence.
0.5       Sweet spot for most use cases. Visible style change, structure preserved.
0.7       Heavy reinterpretation. Same subject and rough composition, very different feel.
0.9       Very loose. The output barely resembles the input.
1.0       Pure text-to-image. Your input is ignored.

For style transfer (photo → painting): start at 0.5, climb to 0.7 if the painting style isn’t strong enough.

For “tweak this generation”: 0.3–0.4. This preserves the image and just nudges it toward the prompt.

For sketch-to-art: 0.7–0.8 — your sketch provides composition, the model fills in detail and refines.

Resolution behavior

Your input image’s dimensions become the output’s dimensions. If you load a 1280×768 photo into VAE Encode, the output is 1280×768 — there’s no separate width/height widget.

This has two consequences:

To resize automatically, add an Upscale Image or Image Resize node between Load Image and VAE Encode. For SDXL, set the target so the longer edge is 1024.
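If your resize node asks for an explicit width and height, the numbers can be worked out ahead of time. This is a small sketch of that arithmetic; the helper name fit_longer_edge is hypothetical, and snapping to a multiple of 8 is an assumption based on Stable Diffusion VAEs downscaling by a factor of 8.

```python
def fit_longer_edge(width: int, height: int, target: int = 1024,
                    multiple: int = 8) -> tuple[int, int]:
    """Scale (width, height) so the longer edge equals `target`.

    Both edges are rounded to the nearest multiple of `multiple`, so the
    aspect ratio can shift by a few pixels.
    """
    scale = target / max(width, height)

    def snap(value: int) -> int:
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    # The 1280x768 photo from the text, resized for SDXL.
    print(fit_longer_edge(1280, 768))
```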

Sketch-to-art example

You drew a rough sketch in any drawing app — a stick-figure landscape, a doodled portrait. Save as PNG, drop into ComfyUI.

Settings:

The output keeps the sketch’s spatial layout but renders it as a finished painting / photo / illustration.

Style transfer example

You have a daytime photo. You want it as a moonlit nightscape.

Settings:

Higher denoise = more dramatic style change but more compositional drift. Iterate.

img2img + LoRA

LoRAs work identically — the modified MODEL feeds KSampler the same way. Use a “watercolor painting” LoRA at strength 0.8 with denoise 0.5 for clean watercolor stylization of any input photo.
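In API-format terms, adding a LoRA means routing the checkpoint's MODEL and CLIP outputs through a LoraLoader node before they reach KSampler and the prompt encoders. The helper below is a sketch under assumptions: add_lora is a hypothetical name, the node ids and file names are placeholders, and the base graph is trimmed to the nodes that matter here.

```python
import copy


def add_lora(workflow: dict, lora_name: str, strength: float = 0.8,
             checkpoint_id: str = "1", lora_id: str = "lora") -> dict:
    """Return a copy of an API-format workflow with a LoraLoader after the checkpoint."""
    wf = copy.deepcopy(workflow)
    for node in wf.values():
        for key, value in node["inputs"].items():
            # Redirect MODEL (output 0) and CLIP (output 1) links to the LoRA node.
            # The VAE link (output 2) stays on the checkpoint.
            if isinstance(value, list) and value[0] == checkpoint_id and value[1] in (0, 1):
                node["inputs"][key] = [lora_id, value[1]]
    wf[lora_id] = {"class_type": "LoraLoader",
                   "inputs": {"model": [checkpoint_id, 0],
                              "clip": [checkpoint_id, 1],
                              "lora_name": lora_name,
                              "strength_model": strength,
                              "strength_clip": strength}}
    return wf


if __name__ == "__main__":
    base = {  # trimmed graph; a real one also has the image, VAE, and save nodes
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "your_model.safetensors"}},
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "watercolor painting", "clip": ["1", 1]}},
        "6": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "denoise": 0.5}},
    }
    styled = add_lora(base, "watercolor.safetensors", strength=0.8)
    print(styled["6"]["inputs"]["model"])
```

The denoise value and the latent_image link are untouched, which matches the point above: the LoRA changes the model, not the img2img wiring.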

img2img + ControlNet

A common combination: img2img provides starting content, ControlNet locks structure.

Example: you have a 3D render. You want it as a stylized illustration but with exact edges preserved.

The img2img keeps colors and rough forms; ControlNet keeps edges precise. Together they give you stylization without losing the original geometry.
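As a rough sketch of how the two combine in an API-format graph: the ControlNet sits on the positive conditioning path, while the img2img latent path stays as it was. The helper name add_controlnet, the node ids, and the file names are assumptions, and the hint image is assumed to be an edge map you have already prepared. Newer ComfyUI builds also offer an advanced Apply ControlNet node with more inputs; this uses the basic ControlNetApply.

```python
import copy


def add_controlnet(workflow: dict, hint_image_id: str, control_net_name: str,
                   strength: float = 1.0, sampler_id: str = "6") -> dict:
    """Return a copy of the workflow with a ControlNet applied to the positive prompt."""
    wf = copy.deepcopy(workflow)
    positive_link = wf[sampler_id]["inputs"]["positive"]
    wf["cn_loader"] = {"class_type": "ControlNetLoader",
                       "inputs": {"control_net_name": control_net_name}}
    wf["cn_apply"] = {"class_type": "ControlNetApply",
                      "inputs": {"conditioning": positive_link,
                                 "control_net": ["cn_loader", 0],
                                 "image": [hint_image_id, 0],
                                 "strength": strength}}
    # KSampler now reads its positive conditioning through the ControlNet.
    wf[sampler_id]["inputs"]["positive"] = ["cn_apply", 0]
    return wf


if __name__ == "__main__":
    base = {  # trimmed img2img graph; ids and file names are placeholders
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "stylized illustration", "clip": ["1", 1]}},
        "6": {"class_type": "KSampler",
              "inputs": {"positive": ["2", 0], "latent_image": ["5", 0],
                         "denoise": 0.6}},
        "9": {"class_type": "LoadImage", "inputs": {"image": "edges.png"}},
    }
    combined = add_controlnet(base, hint_image_id="9",
                              control_net_name="your_canny_controlnet.safetensors")
    print(combined["6"]["inputs"])
```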

Common failures

Output looks identical to input

Output ignores input completely

Output is grainy / noisy

Output ignores your prompt

OOM

Color shifts unexpectedly

When to use img2img vs text-to-image with reference

If your goal is purely stylistic (“paint this scene in the style of X”), img2img is direct and quick. If you want the model to invent a new scene that just resembles yours in some way, text-to-image with a ControlNet reference (Canny or Depth) gives more freedom.

Rule of thumb:

Summary

What’s next

You’ve now seen the five main workflow building blocks: text-to-image, LoRA, ControlNet, Hires Fix, and img2img. Most workflows you’ll find online combine these. Next areas worth exploring:
