LayerCraft

This is the official repository for "LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration".

Figure: Application demonstrations for LayerCraft. Left: Demonstrates batch collage editing capabilities. A user uploads graduation photos and LayerCraft seamlessly integrates a graduation bear across all images. The system first generates a reference bear for consistency, then analyzes optimal placement while preserving facial identity and background integrity. Right: Illustrates the structured text-to-image generation process. From a simple "Alice in Wonderland" prompt, LayerCraft employs chain-of-thought reasoning to sequentially generate background elements, determine object layout, and compose the final image. The framework supports post-generation customization, as shown with the lion integration.

News

[2025-09-18] Our paper has been accepted to NeurIPS 2025!

Model Pipeline

Figure: LayerCraft framework overview. A coordinator agent orchestrates the process, employing ChainArchitect to derive a dependency-aware 3D layout and the Object Integration Network (OIN) to seamlessly integrate objects into specified regions of a background image.
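As a rough illustration of what "dependency-aware" could mean for a layout, the sketch below orders objects so that each one is placed only after the objects it depends on. The object names, the dependency dictionary, and the function `placement_order` are hypothetical and are not taken from the LayerCraft code; ChainArchitect's actual representation may differ.

```python
from graphlib import TopologicalSorter


def placement_order(depends_on):
    """Return object names so that every object follows its dependencies.

    `depends_on` maps an object to the set of objects that must already be
    placed. This structure is an assumption made for illustration only.
    """
    # static_order() raises graphlib.CycleError if dependencies are circular.
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical scene: the teapot and teacup rest on the table,
# and the steam rises from the teapot.
layout = {
    "table": set(),
    "teapot": {"table"},
    "teacup": {"table"},
    "steam": {"teapot"},
}

order = placement_order(layout)
print(order)
```

Any order that respects the dependencies is valid, so the relative position of independent objects (here, the teapot and the teacup) is not fixed.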

Figure: Architecture of the Object Integration Network (OIN). Given a text prompt, a background with a designated region, and a reference object, OIN blends the subject into the scene via LoRA-weighted pathways across feed-forward and multi-modal attention within the FLUX model.
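For readers unfamiliar with LoRA, the following minimal sketch shows the general idea of a low-rank update added to a frozen linear layer, y = Wx + (alpha / r) * B(Ax). It is a generic pure-Python illustration, not the OIN implementation: the function names, the tiny matrices, and the `alpha` parameter are assumptions, and how OIN routes these updates through the feed-forward and attention layers of FLUX is not shown here.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha=1.0):
    """Frozen linear layer plus a scaled low-rank update.

    W: (d_out, d_in) frozen weight; A: (r, d_in); B: (d_out, r).
    Only A and B would be trained in a LoRA setup.
    """
    r = len(A)  # rank of the update = number of rows of A
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 weight (identity, for clarity)
A = [[1.0, 1.0]]              # rank-1 down-projection
B_zero = [[0.0], [0.0]]       # a zero B leaves the base layer unchanged
B = [[0.5], [-0.5]]           # a non-zero B shifts the output
x = [2.0, 3.0]

y_frozen = lora_forward(W, A, B_zero, x)
y_adapted = lora_forward(W, A, B, x)
print(y_frozen, y_adapted)
```

With `B` set to zero the layer reproduces the frozen model's output, which is why LoRA can be attached to a pre-trained model without changing its behavior before training starts.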

Highlighted Features

Customized T2I Generation

LayerCraft supports customized T2I generation with minimal effort.

For example, check out our Indoor Decoration examples below that showcase how users can easily add decorative elements like furniture, plants, and artwork to existing interior images.

We show the comparison between LayerCraft and other T2I methods in the following figure.

Batch Collage Editing

LayerCraft supports batch collage editing with minimal effort.

For example, check out our Batch Collage Editing examples below that showcase how users can easily edit multiple images at once.

Subject Driven Inpainting

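The Object Integration Network (OIN) in LayerCraft performs subject-driven inpainting: given a reference object and a designated region of a background image, it blends the object into that region.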

Abstract


Text-to-image generation (T2I) has become a key area of research with broad applications. However, existing methods often struggle with complex spatial relationships and fine-grained control over multiple concepts. Many existing approaches require significant architectural modifications, extensive training, or expert-level prompt engineering. To address these challenges, we introduce **LayerCraft**, an automated framework that leverages large language models (LLMs) as autonomous agents for structured procedural generation. LayerCraft enables users to customize objects within an image and supports narrative-driven creation with minimal effort. At its core, the system includes a coordinator agent that directs the process, along with two specialized agents: **ChainArchitect**, which employs chain-of-thought (CoT) reasoning to generate a dependency-aware 3D layout for precise instance-level control, and the **Object Integration Network (OIN)**, which utilizes LoRA fine-tuning on pre-trained T2I models to seamlessly blend objects into specified regions of an image based on textual prompts, without requiring architectural changes. Extensive evaluations demonstrate LayerCraft's versatility in applications ranging from multi-concept customization to storytelling. By providing non-experts with intuitive, precise control over T2I generation, our framework democratizes creative image creation.

Environment Installation

Follow these steps to set up a clean environment:

# 1) Create a Conda environment (Python 3.12)
conda create -n layercraft python=3.12 -y
conda activate layercraft

# 2) Install dependencies
pip install -r requirements.txt

Train OIN Model

cd train
# Before running, add your wandb API key and the LoRA paths for the OIN model to train.sh
sh train.sh

TODO:

Release the code for Object Integration Network (OIN) for T2I models and show more examples. [Done]
Release the weights for the OIN model.
Open-source ChainArchitect and the LayerCraft Coordinator.
