Pachu Torres Style LoRA Flux.2 NF4

All files are also archived at https://github.com/je-suis-tm/huggingface-archive in case this gets censored.

Due to restrictions imposed by Flux.2, no previews are given here; check the Flux.1 version at https://huggingface.co/je-suis-tm/pachu_torres_style_lora_flux_nf4 for details. Both LoRAs are trained on the same dataset.

The training is based on https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora_flux2.py. Everything in this training script needs to run at the same torch dtype: the script was designed for the unquantized model and exports the LoRA in float32. Training took 3 hours on an A100 80GB with peak VRAM consumption of 25GB; inference consumes 35GB of VRAM. To avoid running low on VRAM, both the transformer and the text_encoder were quantized.

To be honest, Flux.2 is much more censored and computationally heavier than Flux.1, yet the improvement on most images is marginal. I do not think it is worth renting an A100. The female bodies in the QLoRA results are heavily distorted compared to Flux.1.
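For reference, here is a minimal sketch of how both heavy components could be quantized to NF4 on the fly with bitsandbytes. The unquantized base checkpoint name and the bf16 compute dtype below are assumptions; the pre-quantized diffusers/FLUX.2-dev-bnb-4bit repo used in the Usage section already ships equivalent settings.

import torch
from diffusers import Flux2Transformer2DModel
from diffusers import BitsAndBytesConfig as DiffusersBnbConfig
from transformers import Mistral3ForConditionalGeneration
from transformers import BitsAndBytesConfig as TransformersBnbConfig

base = "black-forest-labs/FLUX.2-dev"  # assumed unquantized base repo

# 4-bit NF4 weights with bf16 compute for both the transformer and the text encoder
transformer = Flux2Transformer2DModel.from_pretrained(
  base, subfolder="transformer",
  quantization_config=DiffusersBnbConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
  ),
)
text_encoder = Mistral3ForConditionalGeneration.from_pretrained(
  base, subfolder="text_encoder",
  quantization_config=TransformersBnbConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
  ),
)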

Train

export MODEL_NAME="diffusers/FLUX.2-dev-bnb-4bit"
export INSTANCE_DIR="pachu_torres_style"
export OUTPUT_DIR="pachu_torres_style_lora_flux2_nf4"
export Q_DIR="config.json" #check https://huggingface.co/diffusers/FLUX.2-dev-bnb-4bit/blob/main/text_encoder/config.json

accelerate launch train_dreambooth_lora_flux2.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --dataset_name=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --bnb_quantization_config_path=$Q_DIR \
  --caption_column="text"\
  --gradient_checkpointing \
  --cache_latents \
  --instance_prompt="Pachu Torres Style" \
  --resolution=512 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --use_8bit_adam \
  --gradient_accumulation_steps=4 \
  --optimizer="adamW" \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --checkpointing_steps=100 \
  --lr_warmup_steps=100 \
  --max_train_steps=1500 \
  --mixed_precision="bf16" \
  --rank=4 \
  --seed="0" \

Usage

import torch
from transformers import Mistral3ForConditionalGeneration

from diffusers import Flux2Pipeline, Flux2Transformer2DModel

repo_id = "diffusers/FLUX.2-dev-bnb-4bit"
device = "cuda:0"
torch_dtype = torch.float32 #only supports float32 when using train_dreambooth_lora_flux2.py 

# load the pre-quantized NF4 transformer and text encoder onto the GPU
transformer = Flux2Transformer2DModel.from_pretrained(
  repo_id, subfolder="transformer", torch_dtype=torch_dtype, device_map=device
)
text_encoder = Mistral3ForConditionalGeneration.from_pretrained(
  repo_id, subfolder="text_encoder", dtype=torch_dtype, device_map=device
)

pipe = Flux2Pipeline.from_pretrained(
  repo_id, transformer=transformer, text_encoder=text_encoder, torch_dtype=torch_dtype
)
pipe.load_lora_weights("je-suis-tm/pachu_torres_style_lora_flux2_nf4",
                       weight_name='pytorch_lora_weights.safetensors')
pipe.enable_model_cpu_offload()

prompt = "Pachu Torres style"

image = pipe(
  prompt=prompt,
  generator=torch.Generator(device=device).manual_seed(42),
  num_inference_steps=50, # 28 is a good trade-off
  guidance_scale=4,
).images[0]

image.save("pachu_torres_style.png")

Trigger words

You should use "Pachu Torres Style" to trigger the image generation.

Download model

Download the weights from the Files & versions tab.
