
EasyControl Arrives for AI Image Generation: Hayao Miyazaki's "Ghibli" Style, Open Source and Free to Use

nanyue 2025-06-13 15:42:51 Technical Articles

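Hello to all AI art players and creators! Have you ever run into this? The latest AI image models (Transformer-based DiT models such as FLUX) produce great results, but precisely controlling what they generate, such as fixing a character's pose, preserving a specific face, or controlling the layout, feels like a real struggle. Either generation is slow, or as soon as you add control, the effect of your favorite LoRA (say, a specific character or art style) disappears.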

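That's right: UNet-based models have had powerful tools such as ControlNet and IP-Adapter, but for the stronger DiT models, efficient and flexible control has been a hard problem. Now there is good news. Meet EasyControl, a new framework built specifically to solve this problem. Its goal is to make conditional control of DiT models efficient, flexible, and very convenient.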

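You can think of EasyControl as a "universal control kit" tailored to advanced AI image engines like DiT. It gives you precise control without heavily modifying the core model and without sacrificing much speed.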

How does it do that? It relies on three main ingredients:

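Lightweight "control plugins" (condition injection LoRA modules): EasyControl provides something that works like a plugin. Each plugin handles one kind of control signal (such as pose, depth, or facial features). It works independently and is plug-and-play, and most importantly, it does not fight with your base model or with other custom models (such as your favorite character or style LoRA). Even though each module is trained on a single condition only, the modules can later be combined zero-shot (for example pose + face + style), and the results still look harmonious.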

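A clever "size trick" (position-aware training paradigm): this technique teaches EasyControl, during training, to understand control images at different resolutions. In practice, you can freely choose the output size and aspect ratio when generating instead of being rigidly constrained, and computational efficiency improves as well.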

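A lightning-fast "accelerator" (causal attention + KV cache): EasyControl uses a clever technique (think of it as caching key information) that significantly reduces the wait when generating an image and greatly improves efficiency.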
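
To make the second item more concrete, here is a minimal sketch of one way a low-resolution control image can be lined up with an output of any size: each condition token's grid index is rescaled into the coordinate system of the output token grid. This is an illustration under stated assumptions, not EasyControl's actual code. The function name `interpolated_positions` is hypothetical, and the grid sizes assume one token per 16×16 pixel patch, so a 512×512 control image gives 32×32 tokens and a 768×1024 output gives 48×64 tokens.

```python
def interpolated_positions(cond_h, cond_w, target_h, target_w):
    """Map every token of a (cond_h x cond_w) condition grid to
    (row, col) coordinates inside a (target_h x target_w) output grid.

    Hypothetical helper: the scaling rule is an assumption chosen for
    illustration; the real framework may use a different formula.
    """
    scale_h = target_h / cond_h  # vertical stretch factor
    scale_w = target_w / cond_w  # horizontal stretch factor
    return [
        (row * scale_h, col * scale_w)
        for row in range(cond_h)
        for col in range(cond_w)
    ]


# 32x32 condition tokens aligned with a 48x64 output token grid.
positions = interpolated_positions(32, 32, 48, 64)
print(len(positions), positions[1], positions[32])
```

Because only the position coordinates change, the control image itself can stay small regardless of the requested output size, which is where the efficiency gain described above would come from.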

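Condition signals are injected into the diffusion transformer (DiT) through a newly introduced condition branch, which encodes the condition tokens together with a lightweight, plug-and-play condition injection LoRA module.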
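
The idea of a condition-only LoRA can be sketched in a few lines of NumPy. The example assumes that the low-rank update is applied to condition tokens only, while text and noise tokens go through the frozen base weights unchanged; that is one plausible reading of why the modules do not disturb the base model or other LoRAs, not a description of the actual implementation. All names (`linear_with_condition_lora`, `is_cond`, and so on) are hypothetical.

```python
import numpy as np


def linear_with_condition_lora(x, w, lora_a, lora_b, is_cond, scale=1.0):
    """Frozen linear layer plus a low-rank update for condition tokens only.

    x: (tokens, d_in), w: (d_in, d_out),
    lora_a: (d_in, r), lora_b: (r, d_out),
    is_cond: (tokens,) boolean mask marking the condition tokens.
    """
    base = x @ w                            # frozen base projection for all tokens
    delta = (x @ lora_a) @ lora_b * scale   # low-rank LoRA update
    return base + delta * is_cond[:, None]  # keep the update on condition rows only


rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 8, 2
x = rng.normal(size=(5, d_in))              # 3 text/noise tokens + 2 condition tokens
w = rng.normal(size=(d_in, d_out))
lora_a = rng.normal(size=(d_in, rank))
lora_b = rng.normal(size=(rank, d_out))
is_cond = np.array([False, False, False, True, True])

out = linear_with_condition_lora(x, w, lora_a, lora_b, is_cond)
base_only = x @ w
print(np.allclose(out[:3], base_only[:3]))  # non-condition tokens are untouched
```

Under this assumption, the text and noise tokens see exactly the same weights as in the base model, so a separately trained style or character LoRA keeps working as before.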

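During training, each condition is trained separately: the condition image is resized to a lower resolution and trained with the position-aware training paradigm, which enables efficient and flexible resolution training. The framework incorporates a causal attention mechanism, which makes key-value (KV) caching possible and significantly improves inference efficiency. This design also makes it easy to integrate multiple condition injection LoRA modules, enabling robust and harmonious multi-condition generation.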
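
Why does causal attention make KV caching possible? If the condition tokens never attend to the noisy image tokens, their keys and values do not change from one denoising step to the next, so they can be projected once and reused. The NumPy sketch below illustrates that idea with a single attention layer. It is a simplified model under that assumption, and `CondKVCache` and `denoise_step_attention` are hypothetical names, not part of the EasyControl code base.

```python
import numpy as np


def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)  # subtract the row max for stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


class CondKVCache:
    """Stores condition keys/values per layer so they are projected only once."""

    def __init__(self):
        self.bank = {}
        self.projections = 0  # how many times K/V were actually computed

    def get(self, layer, cond_tokens, w_k, w_v):
        if layer not in self.bank:
            self.bank[layer] = (cond_tokens @ w_k, cond_tokens @ w_v)
            self.projections += 1
        return self.bank[layer]

    def clear(self):
        self.bank.clear()


def denoise_step_attention(noise_tokens, cond_tokens, w_q, w_k, w_v, cache, layer=0):
    """Noise tokens attend to themselves plus the cached condition K/V."""
    q = noise_tokens @ w_q
    k_cond, v_cond = cache.get(layer, cond_tokens, w_k, w_v)
    k = np.concatenate([noise_tokens @ w_k, k_cond], axis=0)
    v = np.concatenate([noise_tokens @ w_v, v_cond], axis=0)
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v


rng = np.random.default_rng(0)
d = 8
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
cond_tokens = rng.normal(size=(4, d))  # fixed for the whole generation
cache = CondKVCache()

outputs = []
for step in range(5):
    noise_tokens = rng.normal(size=(6, d))  # stands in for the latent at this step
    outputs.append(
        denoise_step_attention(noise_tokens, cond_tokens, w_q, w_k, w_v, cache)
    )

print(cache.projections)  # condition K/V were projected once for five steps
```

The cache has to be emptied between images, since a new control image produces different keys and values.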

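Many people love using LoRA models to recreate the dreamy animation style of Hayao Miyazaki (Studio Ghibli). The usual problem has been that once you try to control the pose with a tool like ControlNet, the Ghibli style tends to drift or lose much of its effect.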

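EasyControl's big advantage is compatibility. Because its control modules are lightweight and independent, it can apply control (such as guiding the pose or locking the subject) while preserving as much as possible of the effect of the custom LoRA you have loaded. In other words, you can use EasyControl to control the image content precisely while your favorite Miyazaki-style LoRA still does its job.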

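Most exciting of all, the EasyControl team has open-sourced it! The code and models are available on GitHub and Hugging Face. That means you can download it and try it right now, integrate it into your AI art workflow, and create images that are both precisely controlled and rich in artistic style.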

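In short, if you want to:

control the latest AI image models (DiT architecture) more precisely;

use all kinds of custom character or style LoRAs (such as the Ghibli style) at the same time;

and generate images faster and more flexibly;

then EasyControl is a tool you should not miss. It is efficient, flexible, and highly compatible, so go give it a try and take your AI creations to the next level!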

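Since EasyControl is open source, readers who like code can go straight to GitHub and use the reference code to run and deploy the model locally. Below is the code for Ghibli-style images: you can upload your own picture to generate and control a Ghibli-style version of it.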

import spaces  # Hugging Face Spaces GPU helper; imported before torch, as in the original
import os
import torch
import gradio as gr
from src.pipeline import FluxPipeline
from src.transformer_flux import FluxTransformer2DModel
from src.lora_helper import set_single_lora

# Base FLUX model and the local folder that holds the EasyControl LoRA weights.
base_path = "black-forest-labs/FLUX.1-dev"
lora_base_path = "./checkpoints/models"

# Load the pipeline, then swap in the repository's own transformer class.
pipe = FluxPipeline.from_pretrained(base_path, torch_dtype=torch.bfloat16)
transformer = FluxTransformer2DModel.from_pretrained(base_path, subfolder="transformer", torch_dtype=torch.bfloat16)
pipe.transformer = transformer
pipe.to("cuda")

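def clear_cache(transformer):
    # Empty the bank_kv cache held by every attention processor.
    for name, attn_processor in transformer.attn_processors.items():
        attn_processor.bank_kv.clear()

@spaces.GPU()
def single_condition_generate_image(prompt, spatial_img, height, width, seed, control_type):
    # Pick the LoRA checkpoint for the selected control type.
    if control_type == "Ghibli":
        lora_path = os.path.join(lora_base_path, "Ghibli.safetensors")
    else:
        # Without this branch lora_path is undefined, e.g. when nothing is selected.
        raise gr.Error(f"Unsupported control type: {control_type!r}")
    set_single_lora(pipe.transformer, lora_path, lora_weights=[1], cond_size=512)
    # The uploaded image is passed as the spatial condition; no subject images are used.
    spatial_imgs = [spatial_img] if spatial_img else []
    image = pipe(
        prompt,
        height=int(height),
        width=int(width),
        guidance_scale=3.5,
        num_inference_steps=25,
        max_sequence_length=512,
        # gr.Number can return a float, but manual_seed needs an int.
        generator=torch.Generator("cpu").manual_seed(int(seed)),
        subject_images=[],
        spatial_images=spatial_imgs,
        cond_size=512,
    ).images[0]
    # Clear the cache so the next request starts fresh.
    clear_cache(pipe.transformer)
    return image
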
control_types = ["Ghibli"]

# Gradio UI: prompt, control image, size, seed and control type on the left, result on the right.
with gr.Blocks() as demo:
    gr.Markdown("# Ghibli Studio Control Image Generation with EasyControl")
    gr.Markdown("The model is trained on **only 100 real Asian faces** paired with **GPT-4o-generated Ghibli-style counterparts**, and it preserves facial features while applying the iconic anime aesthetic.")
    gr.Markdown("Generate images using EasyControl with Ghibli control LoRAs. (Due to hardware constraints, only low-resolution images can be generated. For high-resolution (1024+), please set up your own environment.)")
    gr.Markdown("**[Attention!!]**: The recommended prompts for using Ghibli Control LoRA should include the trigger words: `Ghibli Studio style, Charming hand-drawn anime-style illustration`")
    gr.Markdown("If you like this demo, please give us a star (github: [EasyControl](https://github.com/Xiaojiu-z/EasyControl))")
    with gr.Tab("Ghibli Condition Generation"):
        with gr.Row():
            with gr.Column():
                prompt = gr.Textbox(label="Prompt", value="Ghibli Studio style, Charming hand-drawn anime-style illustration")
                spatial_img = gr.Image(label="Ghibli Image", type="pil")  # upload an image file
                height = gr.Slider(minimum=256, maximum=1024, step=64, label="Height", value=768)
                width = gr.Slider(minimum=256, maximum=1024, step=64, label="Width", value=768)
                seed = gr.Number(label="Seed", value=42)
                # Preselect the only option so control_type is never None.
                control_type = gr.Dropdown(choices=control_types, label="Control Type", value="Ghibli")
                single_generate_btn = gr.Button("Generate Image")
            with gr.Column():
                single_output_image = gr.Image(label="Generated Image")
    single_generate_btn.click(
        single_condition_generate_image,
        inputs=[prompt, spatial_img, height, width, seed, control_type],
        outputs=single_output_image)

demo.queue().launch()
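
The demo recommends that prompts include the trigger words `Ghibli Studio style, Charming hand-drawn anime-style illustration`. If you call the generation function from your own script, a small helper can make sure they are always present. The sketch below is an optional convenience and not part of the EasyControl repository; the name `with_trigger_words` is hypothetical.

```python
TRIGGER_WORDS = "Ghibli Studio style, Charming hand-drawn anime-style illustration"


def with_trigger_words(prompt: str) -> str:
    """Return the prompt with the Ghibli trigger words prepended if missing."""
    prompt = prompt.strip()
    if TRIGGER_WORDS.lower() in prompt.lower():
        return prompt  # already present, leave the prompt unchanged
    if not prompt:
        return TRIGGER_WORDS  # an empty prompt falls back to the trigger words alone
    return f"{TRIGGER_WORDS}, {prompt}"


example_prompt = with_trigger_words("a girl standing on a windy hill")
print(example_prompt)
```

The result can then be passed as the `prompt` argument of `single_condition_generate_image`.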

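EasyControl also supports other kinds of control, such as image size control, human pose control, and generating images from sketches. For more applications, see the official GitHub repository and the open-source models on Hugging Face.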

Hugging Face: EasyControl_Ghibli
GitHub: EasyControl (https://github.com/Xiaojiu-z/EasyControl)
