[Paper] [Video] [Hugging Face] [Demo]
In this work, we investigate automatic design composition from multimodal graphic elements. We propose LaDeCo, which introduces the layered design principle to accomplish this challenging task through two steps: layer planning and layered design composition.
- Clone this repository
```bash
git clone https://github.com/microsoft/elem2design.git
cd elem2design
```

- Install

```bash
conda create -n e2d python=3.10 -y
conda activate e2d
pip install --upgrade pip
pip install -e .
pip install -e thirdparty/opencole
pip install -e dataset/src
```

- Install additional packages for training cases

```bash
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
```

Please check this folder for the layer planning code, or you can directly use the predicted labels here.
Index-to-layer mapping:
```
{
    0: "Background",
    1: "Underlay",
    2: "Logo/Image",
    3: "Text",
    4: "Embellishment"
}
```
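For reference, here is a minimal sketch of how the predicted labels can be grouped by layer with this mapping, assuming the labels are simply a list of integer indices, one per element (the exact file layout of the released predictions may differ):

```python
# Minimal sketch: group elements by predicted layer label.
# Assumes `labels` is a list of integer indices, one per element;
# the actual format of the released predictions may differ.
INDEX_TO_LAYER = {
    0: "Background",
    1: "Underlay",
    2: "Logo/Image",
    3: "Text",
    4: "Embellishment",
}

def group_by_layer(labels):
    """Return {layer_name: [element indices]} in layer order."""
    groups = {name: [] for name in INDEX_TO_LAYER.values()}
    for element_idx, label in enumerate(labels):
        groups[INDEX_TO_LAYER[label]].append(element_idx)
    return groups

print(group_by_layer([0, 2, 2, 1, 3, 3, 4]))
# {'Background': [0], 'Underlay': [3], 'Logo/Image': [1, 2], 'Text': [4, 5], 'Embellishment': [6]}
```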
We render an image for each element, which is useful during inference. Meanwhile, we also render the intermediate designs (denoted as `layer_{index}.png`) and use them for end-to-end training.
Please run the following script to get these assets:

```bash
python dataset/src/crello/render.py
```
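Conceptually, each intermediate design `layer_{index}.png` is just the canvas after all elements up to that layer have been placed. The repository relies on the OpenCole renderer for the actual rendering; the snippet below is only a simplified sketch of the idea, assuming per-element RGBA crops and top-left positions:

```python
from PIL import Image

def composite_layers(canvas_size, elements_by_layer):
    """Simplified sketch: paste RGBA element images layer by layer and save
    the intermediate canvas after each layer (cf. layer_{index}.png).
    `elements_by_layer` maps a layer index to a list of (image, (left, top))
    pairs. The real pipeline uses the OpenCole renderer instead of PIL."""
    canvas = Image.new("RGBA", canvas_size, (255, 255, 255, 255))
    for layer_index in sorted(elements_by_layer):
        for element_image, (left, top) in elements_by_layer[layer_index]:
            canvas.alpha_composite(element_image.convert("RGBA"), (left, top))
        canvas.save(f"layer_{layer_index}.png")
    return canvas
```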
After rendering, we move to the next step and construct the dataset according to the layered design principle. Each sample has 5 rounds of dialogue, where the model progressively predicts element attributes from the background layer to the embellishment layer.

```bash
python dataset/src/crello/create_dataset.py --tag ours
```
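To make the 5-round structure concrete, the sketch below shows roughly how one sample pairs a per-layer request with the attributes of that layer's elements. It is purely illustrative; the exact field names and prompt wording produced by `create_dataset.py` may differ:

```python
# Purely illustrative sketch of a 5-round layered sample; the actual JSON
# fields and prompts written by create_dataset.py may differ.
sample = {
    "images": ["element_0.png", "element_1.png"],  # rendered element images
    "conversations": [
        # Round 1: compose the Background layer from the given elements.
        {"role": "user", "content": "<elements> Compose the Background layer."},
        {"role": "assistant", "content": "<attributes of background elements>"},
        # Rounds 2-5 repeat this pattern for Underlay, Logo/Image, Text, and
        # Embellishment, each conditioned on the layers composed so far.
    ],
}
```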
We have a model available on Hugging Face; please download it for inference. The released model is trained on the public Crello dataset. We find that training on a dataset approximately five times larger leads to significantly improved performance; for a detailed evaluation, please refer to Table 2 in our paper. Unfortunately, we are unable to release that stronger model because it was trained on a private dataset.
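One convenient way to fetch the checkpoint is via `huggingface_hub`; the repository id below is a placeholder, so replace it with the id of the released model:

```python
from huggingface_hub import snapshot_download

# "<org>/<model-name>" is a placeholder; use the actual repo id of the released model.
local_dir = snapshot_download(repo_id="<org>/<model-name>")
print(local_dir)  # pass this path as --model_name_or_path during inference
```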
Now it is time to run inference using the prepared data and model:

```bash
CUDA_VISIBLE_DEVICES=0 python llava/infer/infer.py \
--model_name_or_path /path/to/model/checkpoint-xxxx \
--data_path /path/to/data/test.json \
--image_folder /path/to/crello_images \
--output_dir /path/to/output_dir \
--start_layer_index 0 \
    --end_layer_index 4
```
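At a high level, `--start_layer_index` and `--end_layer_index` bound which of the five layers are composed (0 = Background through 4 = Embellishment): inference walks through the layers in order and predicts each layer's element attributes conditioned on the design composed so far. The loop below is a hypothetical sketch of that process, not the actual `infer.py` code:

```python
# Hypothetical sketch of layered composition at inference time; the callables
# predict_layer_attributes(canvas, elements) and render(canvas, elements, attrs)
# are illustrative stand-ins, not functions from this repository.
LAYERS = ["Background", "Underlay", "Logo/Image", "Text", "Embellishment"]

def compose(elements_by_layer, predict_layer_attributes, render,
            start_layer_index=0, end_layer_index=4):
    """Walk the layers in order, predicting each layer's element attributes
    conditioned on the design composed so far, then updating the canvas."""
    canvas, attributes = None, {}
    for layer_index in range(start_layer_index, end_layer_index + 1):
        layer = LAYERS[layer_index]
        layer_elements = elements_by_layer.get(layer, [])
        # Predict position/size (and other attributes) for this layer's elements.
        attributes[layer] = predict_layer_attributes(canvas, layer_elements)
        # Render the newly placed elements on top of the previous layers.
        canvas = render(canvas, layer_elements, attributes[layer])
    return canvas, attributes
```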
Besides command-line inference, we also provide a demo interface that allows users to interact with the model via a web-based UI. This interface makes it more user-friendly and better suited for running inference on custom datasets.

To launch the web UI, run the following command:
```bash
CUDA_VISIBLE_DEVICES=0 python app/app.py --model_name_or_path /path/to/model/checkpoint-xxxx
```

We compute the LVM scores and geometry-related metrics for the generated designs:

```bash
python llava/metrics/llava_ov.py -i /path/to/output_dir
python llava/metrics/layout.py --pred /path/to/output_dir/pred.jsonl
```
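For a rough sense of what a geometry-related metric measures, the sketch below computes a simple average pairwise overlap between predicted element boxes; the metrics actually implemented in `layout.py` may be defined differently:

```python
# Simple illustration of a geometry-related layout metric: average pairwise
# overlap (IoU) between element boxes. The metrics implemented in
# llava/metrics/layout.py may be defined differently.
from itertools import combinations

def iou(box_a, box_b):
    """Intersection-over-union of two (left, top, right, bottom) boxes."""
    left, top = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    right, bottom = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def overlap_score(boxes):
    """Mean IoU over all box pairs; lower generally means a cleaner layout."""
    pairs = list(combinations(boxes, 2))
    return sum(iou(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

print(overlap_score([(0, 0, 100, 50), (50, 0, 150, 50), (0, 60, 100, 100)]))
```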
We fine-tune LLMs on the Crello training set for layered design composition. For your own dataset, please prepare the data in the given format and run:

```bash
bash scripts/finetune_lora.sh \
1 \
meta-llama/Llama-3.1-8B \
/path/to/dataset/train.json \
/path/to/image/base \
/path/to/output_dir \
50 \
2 \
16 \
250 \
2e-4 \
2e-4 \
cls_pooling \
Llama-3.1-8B_lora_ours \
32 \
64 \
    4
```

For example, the specific script in our setting is:

```bash
bash scripts/finetune_lora.sh \
1 \
meta-llama/Llama-3.1-8B \
dataset/dataset/json/ours/train.json \
dataset/dataset/crello_images \
output/Llama-3.1-8B_lora_ours \
50 \
2 \
16 \
250 \
2e-4 \
2e-4 \
cls_pooling \
Llama-3.1-8B_lora_ours \
32 \
64 \
    4
```

Remember to log in to Hugging Face using your Llama access token:
```bash
huggingface-cli login --token $TOKEN
```

The following is a list of supported LLMs:
- liuhaotian/llava-v1.5-7b
- liuhaotian/llava-v1.6-vicuna-7b
- meta-llama/Meta-Llama-3-8B
- meta-llama/Meta-Llama-3-8B-Instruct
- meta-llama/Llama-3.1-8B
- meta-llama/Llama-3.1-8B-Instruct
- mistralai/Mistral-7B-v0.3
```bibtex
@InProceedings{lin2024elements,
    title     = {From Elements to Design: A Layered Approach for Automatic Graphic Design Composition},
    author    = {Lin, Jiawei and Sun, Shizhao and Huang, Danqing and Liu, Ting and Li, Ji and Bian, Jiang},
    booktitle = {CVPR},
    year      = {2025}
}
```
We would like to express our gratitude to CanvasVAE for providing the dataset, OpenCole for the rendering code, and LLaVA for the codebase. We deeply appreciate all the incredible work that made this project possible.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
