Research Journal

SDXL Diagram Generation Research

Building a diagram generation system using Stable Diffusion XL enhanced with ControlNets and LoRA fine-tuning. Documenting progress, experiments, and technical developments.

SDXL, ControlNet, LoRA, Technical Diagrams, PyTorch

Research Updates

Progress, experiments, and findings as the research evolves.

SDXL-Based Diagram Generation System

Unlike general-purpose image generation, which targets natural photos, this system specializes in structured scientific and technical diagrams, where text slots, nodes, and edges must be preserved.

Technical Architecture

Base Model

stabilityai/stable-diffusion-xl-base-1.0

Paired with a replacement VAE, madebyollin/sdxl-vae-fp16-fix, a version of the SDXL VAE fine-tuned to run in fp16 without producing NaNs.

ControlNets

Two ControlNet branches:

Structure ControlNet

Enforces alignment of nodes and edges

Slot ControlNet

Controls where text regions are placed, using the text mask

LoRA Fine-Tuning

Low-rank adapters (rank = 16) are applied to the UNet attention layers, so only the adapter weights are trained while the base weights stay frozen.
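To give a rough sense of why rank-16 adapters are cheap to train, the sketch below counts the parameters a single LoRA adapter adds to one linear projection. The 1280×1280 layer size is an illustrative assumption (one of the attention widths found in the SDXL UNet), not a figure taken from this project.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters in the frozen dense weight matrix (bias ignored)."""
    return d_out * d_in


# Assumption for illustration: a square 1280-wide attention projection.
d = 1280
rank = 16  # rank stated in this journal

added = lora_param_count(d, d, rank)
full = full_param_count(d, d)
ratio = added / full

print(f"adapter params: {added}, frozen params: {full}, ratio: {ratio:.1%}")
# prints "adapter params: 40960, frozen params: 1638400, ratio: 2.5%"
```

For a square layer the ratio simplifies to 2·rank / d, so wider layers benefit more from the low-rank factorization.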

Dataset Structure

Each sample includes:

Main image (.png)
Mask (.mask.png)
Nodes (.nodes.png)
Edges (.edges.png)
Text description (.txt)
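The file layout above could be gathered into samples roughly as follows. This is a minimal sketch that assumes all five files of a sample share one stem in a single directory (for example `fig01.png`, `fig01.mask.png`, and so on); the `DiagramSample` class and `find_samples` helper are hypothetical names, not part of the project.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DiagramSample:
    image: Path    # main diagram image
    mask: Path     # text-slot mask
    nodes: Path    # node map
    edges: Path    # edge map
    caption: Path  # text description


# Suffixes listed in the journal; the shared-stem convention is an assumption.
AUX_SUFFIXES = {
    "mask": ".mask.png",
    "nodes": ".nodes.png",
    "edges": ".edges.png",
    "caption": ".txt",
}


def find_samples(root: Path) -> list[DiagramSample]:
    """Return every sample in `root` that has all five files present."""
    samples = []
    for image in sorted(root.glob("*.png")):
        # Skip the auxiliary maps; they also end in .png.
        if image.name.endswith((".mask.png", ".nodes.png", ".edges.png")):
            continue
        stem = image.name[: -len(".png")]
        parts = {key: root / f"{stem}{suffix}" for key, suffix in AUX_SUFFIXES.items()}
        if all(path.exists() for path in parts.values()):
            samples.append(DiagramSample(image=image, **parts))
    return samples
```

Incomplete samples are skipped silently here; a training script might prefer to log or raise on them instead.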

Training Config

Resolution: 512×512
Batch size: 1 (gradient accumulation: 4)
Optimizer: AdamW8bit
Precision: fp16
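One way to keep these settings together is a small config object. The sketch below is illustrative only: the `TrainConfig` class and its field names are hypothetical, and only the values come from the list above. It also makes explicit that a batch of 1 with 4 accumulation steps gives an effective batch size of 4.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    resolution: int = 512           # images are trained at 512x512
    batch_size: int = 1             # per-step batch size
    grad_accum_steps: int = 4       # gradients accumulated before each update
    optimizer: str = "AdamW8bit"    # 8-bit AdamW, as named in the journal
    mixed_precision: str = "fp16"

    @property
    def effective_batch_size(self) -> int:
        """Samples contributing to each optimizer update."""
        return self.batch_size * self.grad_accum_steps

    def optimizer_steps_per_epoch(self, num_samples: int) -> int:
        """Optimizer updates needed to see `num_samples` once."""
        return math.ceil(num_samples / self.effective_batch_size)


cfg = TrainConfig()
print(cfg.effective_batch_size)             # prints 4
print(cfg.optimizer_steps_per_epoch(1000))  # prints 250
```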

Inference Pipeline

• LoRA weights merged into UNet

• Dual ControlNet conditioning

Pipeline class: StableDiffusionXLControlNetPipeline
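The inference setup might be assembled with the diffusers library roughly as sketched below. The two ControlNet paths and the LoRA path are hypothetical placeholders for locally trained weights, the function names are illustrative, and the default conditioning scales and step count are assumptions rather than values from this project.

```python
BASE_MODEL = "stabilityai/stable-diffusion-xl-base-1.0"
VAE_MODEL = "madebyollin/sdxl-vae-fp16-fix"


def build_pipeline(structure_cn_path, slot_cn_path, lora_path, device="cuda"):
    """Build an SDXL pipeline with two ControlNets and merged LoRA weights."""
    # Heavy dependencies are imported lazily so the module can be imported without them.
    import torch
    from diffusers import (
        AutoencoderKL,
        ControlNetModel,
        StableDiffusionXLControlNetPipeline,
    )

    dtype = torch.float16
    vae = AutoencoderKL.from_pretrained(VAE_MODEL, torch_dtype=dtype)
    # Passing a list of ControlNets enables multi-ControlNet conditioning.
    controlnets = [
        ControlNetModel.from_pretrained(structure_cn_path, torch_dtype=dtype),
        ControlNetModel.from_pretrained(slot_cn_path, torch_dtype=dtype),
    ]
    pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        BASE_MODEL, controlnet=controlnets, vae=vae, torch_dtype=dtype
    )
    pipe.load_lora_weights(lora_path)  # hypothetical path to the trained adapters
    pipe.fuse_lora()                   # merge the adapter weights into the base layers
    return pipe.to(device)


def generate(pipe, prompt, structure_image, slot_image, scales=(1.0, 1.0)):
    """Run one generation with one conditioning image per ControlNet."""
    result = pipe(
        prompt=prompt,
        image=[structure_image, slot_image],         # same order as the ControlNets
        controlnet_conditioning_scale=list(scales),  # assumed per-branch weights
        num_inference_steps=30,                      # assumed step count
    )
    return result.images[0]
```

The conditioning images must be passed in the same order as the ControlNets were given to the pipeline, so here the structure map comes first and the slot mask second.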