Pachu Torres Style Lora Flux2 NF4
All files are also archived in https://github.com/je-suis-tm/huggingface-archive in case this gets censored.
Due to restrictions imposed by Flux.2, no previews are given; check https://huggingface.co/je-suis-tm/pachu_torres_style_lora_flux_nf4 for details. Both LoRAs are trained on the same dataset.
The training is based on https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/test_dreambooth_lora_flux2.py. Every component must be set to the same torch dtype, because the script was designed for the unquantized version and exports in float32.
Training took 3 hours on an A100 80GB, with peak VRAM consumption of 25GB. Inference consumes 35GB of VRAM. To avoid running low on VRAM, both the transformer and the text_encoder were quantized.
To be honest, Flux.2 is much more censored and computationally heavier than Flux.1, yet the improvement on most images is marginal. I do not think it is worth renting an A100. The female bodies in the QLoRA results are heavily distorted compared to Flux.1.
Train
export MODEL_NAME="diffusers/FLUX.2-dev-bnb-4bit"
export INSTANCE_DIR="pachu_torres_style"
export OUTPUT_DIR="pachu_torres_style_lora_flux2_nf4"
export Q_DIR="config.json" # check https://huggingface.co/diffusers/FLUX.2-dev-bnb-4bit/blob/main/text_encoder/config.json
accelerate launch train_dreambooth_lora_flux2.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --bnb_quantization_config_path=$Q_DIR \
  --caption_column="text" \
  --gradient_checkpointing \
  --cache_latents \
  --instance_prompt="Pachu Torres Style" \
  --resolution=512 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --use_8bit_adam \
  --gradient_accumulation_steps=4 \
  --optimizer="adamW" \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --checkpointing_steps=100 \
  --lr_warmup_steps=100 \
  --max_train_steps=1500 \
  --mixed_precision="bf16" \
  --rank=4 \
  --seed="0"
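The contents of the Q_DIR file are not shown here. As a rough sketch, assuming the training script forwards the JSON keys to a bitsandbytes quantization config, a file like it could be generated as follows. The helper name write_bnb_config and the two values are assumptions for illustration, not the author's actual file; compare them against the quantization settings in the linked text_encoder/config.json before training.

```python
import json
from pathlib import Path

# Assumed NF4 settings; the key names follow the bitsandbytes 4-bit options.
# These values are illustrative and are not taken from the author's config.json.
BNB_CONFIG = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
}


def write_bnb_config(path="config.json"):
    """Write the quantization settings to a JSON file and return its path."""
    target = Path(path)
    target.write_text(json.dumps(BNB_CONFIG, indent=4))
    return target


if __name__ == "__main__":
    print(write_bnb_config())
```

Pointing Q_DIR at the written file keeps the quantization settings next to the training command, which makes a run easier to reproduce.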
Usage
import torch
from transformers import Mistral3ForConditionalGeneration
from diffusers import Flux2Pipeline, Flux2Transformer2DModel

repo_id = "diffusers/FLUX.2-dev-bnb-4bit"
device = "cuda:0"
# Only float32 is supported when the LoRA was trained with train_dreambooth_lora_flux2.py
torch_dtype = torch.float32

# Load the 4-bit quantized transformer and text encoder separately
transformer = Flux2Transformer2DModel.from_pretrained(
    repo_id, subfolder="transformer", torch_dtype=torch_dtype, device_map=device
)
text_encoder = Mistral3ForConditionalGeneration.from_pretrained(
    repo_id, subfolder="text_encoder", dtype=torch_dtype, device_map=device
)
pipe = Flux2Pipeline.from_pretrained(
    repo_id, transformer=transformer, text_encoder=text_encoder, torch_dtype=torch_dtype
)
pipe.load_lora_weights(
    "je-suis-tm/pachu_torres_style_lora_flux2_nf4",
    weight_name="pytorch_lora_weights.safetensors",
)
pipe.enable_model_cpu_offload()

prompt = "Pachu Torres style"
image = pipe(
    prompt=prompt,
    generator=torch.Generator(device=device).manual_seed(42),
    num_inference_steps=50,  # 28 is a good trade-off
    guidance_scale=4,
).images[0]
image.save("pachu_torres_style.png")
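To check the 35GB inference figure on your own hardware, one option is to read PyTorch's peak allocation counter after the pipe call. The helper below, peak_vram_gib, is a hypothetical name introduced for illustration. It counts only memory allocated through PyTorch's caching allocator, so the number may be lower than what nvidia-smi reports.

```python
def peak_vram_gib(device_index=0):
    """Return the peak CUDA memory allocated by PyTorch in GiB.

    Returns None when torch is not installed or no CUDA device is available.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # max_memory_allocated reports bytes since program start (or the last reset)
    return torch.cuda.max_memory_allocated(device_index) / 1024**3


if __name__ == "__main__":
    peak = peak_vram_gib()
    print("CUDA not available" if peak is None else f"peak VRAM: {peak:.1f} GiB")
```

Call it right after generating an image; calling torch.cuda.reset_peak_memory_stats() beforehand restricts the measurement to a single run.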
Trigger words
You should use Pachu Torres Style to trigger the image generation.
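As a small illustration, a helper can make sure the trigger phrase is always present in a prompt. The function name with_trigger and the comma-separated format are assumptions made for this sketch; the source only states that the phrase should appear in the prompt.

```python
TRIGGER = "Pachu Torres Style"


def with_trigger(prompt, trigger=TRIGGER):
    """Prepend the trigger phrase unless the prompt already contains it.

    The containment check is case-insensitive.
    """
    cleaned = prompt.strip()
    if trigger.lower() in cleaned.lower():
        return cleaned
    return f"{trigger}, {cleaned}" if cleaned else trigger


if __name__ == "__main__":
    print(with_trigger("a woman reading on a beach"))
```

The returned string can be passed as the prompt argument in the usage code.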
Download model
Download the weights from the Files & versions tab.
Model tree for je-suis-tm/pachu_torres_style_lora_flux2_nf4
Base model
black-forest-labs/FLUX.2-dev