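PromptGen V2: Great News for Low-VRAM Users, Only 1 GB Needed! CLIP and T5 Dual-Channel Image-to-Prompt Better Suited to F1, a Model That Balances Near-Joy Quality with Speed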

Technology   2024-11-16 18:34   Sichuan

PromptGen-V2: CLIP & T5 Dual-Channel Image-to-Prompt Built for F1! A Balanced Model Close to Joy That Needs Only 1 GB

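🌹 Hello everyone! Welcome to the 破狼 official account. Thank you for your support and encouragement. I will be with you all the way on the road of AIGC exploration. If you like this, star and follow the 破狼 account, or scan the QR code at the end of the article to join the discussion group! I only operate the official-account platform; copying or reposting to CSDN or other platforms without authorization is strictly prohibited!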

Introduction to PromptGen-V2

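A previous article introduced Florence-2-large-PromptGen, an image-to-prompt model whose prompt accuracy is close to Joy's but whose performance is better balanced ([ComfyUI]Flux: Perfect Balance! An Image-to-Prompt Model Better Suited to F1! 30-Second Speed & 1 GB Low VRAM & CLIP and T5 Dual-Channel Prompts & Usable for Both Image-to-Prompt and Tagging). It has now been upgraded again to Florence-2-large-PromptGen v2.0, which builds on the earlier PromptGen 1.5 model and introduces several new features: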

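  • Improved caption quality: better output for the <GENERATE_TAGS>, <DETAILED_CAPTION>, and <MORE_DETAILED_CAPTION> instructions.
  • New <ANALYZE> instruction: helps the model better understand the composition of the input image.
  • VRAM efficiency: compared with other models, this is a lightweight captioning model that uses no more than 1 GB of VRAM and quickly generates high-quality image descriptions.
  • Designed for Flux models: it targets both T5XXL CLIP and CLIP_L. The new Miaoshou Tagger node "Flux CLIP Text Encode" removes the need to run two separate tagging tools to create captions. It fills both CLIP inputs in a single generation, which noticeably speeds up work with Flux models.
  • Mixed caption style: <MIXED_CAPTION> combines a more detailed description with tags, which is very useful for FLUX models that use T5XXL and CLIP_L together. A new node was added to MiaoshouTagger ComfyUI to support this instruction.
  • <MIXED_CAPTION_PLUS>: combines the power of mixed captions with analysis.
  • Model: https://huggingface.co/MiaoshouAI/Florence-2-large-PromptGen-v2.0
  • ComfyUI-Miaoshouai-Tagger: https://github.com/miaoshouai/ComfyUI-Miaoshouai-Tagger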
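For readers who want to try the instruction tags outside ComfyUI, the following is a minimal sketch rather than the plugin's own code. It assumes the checkpoint loads through the usual Florence-2 pattern in transformers (with `trust_remote_code=True`); the helper names `check_task` and `caption_image` and the generation settings are illustrative choices, not something the article specifies.

```python
# Instruction tags listed in the article for PromptGen v2.0.
SUPPORTED_TASKS = (
    "<GENERATE_TAGS>",
    "<DETAILED_CAPTION>",
    "<MORE_DETAILED_CAPTION>",
    "<ANALYZE>",
    "<MIXED_CAPTION>",
    "<MIXED_CAPTION_PLUS>",
)
MODEL_ID = "MiaoshouAI/Florence-2-large-PromptGen-v2.0"


def check_task(task: str) -> str:
    """Return the tag unchanged, or raise if the article does not list it."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unsupported task tag: {task!r}")
    return task


def caption_image(image_path: str, task: str = "<MORE_DETAILED_CAPTION>") -> str:
    """Run one instruction tag on one image (hypothetical helper)."""
    # Heavy imports stay local so check_task() works without them installed.
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, trust_remote_code=True
    ).to(device).eval()
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=check_task(task), images=image, return_tensors="pt"
    ).to(device)
    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,  # assumed limit, adjust as needed
            do_sample=False,
            num_beams=3,
        )
    # Decode the generated ids and drop special tokens to get plain text.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
```

A call such as `caption_image("photo.png", "<GENERATE_TAGS>")` (with a hypothetical file name) would return the tag list for that image, while the default tag returns the long caption.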

Trying PromptGen v2 in ComfyUI

Update ComfyUI, then search for and install the ComfyUI-Miaoshouai-Tagger plugin. The model is downloaded automatically on first run.

  • Plugin: https://github.com/miaoshouai/ComfyUI-Miaoshouai-Tagger
  • transformers must be version 4.38.0 or later.
  • MiaoshouAI/Florence-2-large-PromptGen-v2.0 model: the model is downloaded automatically. To download it manually, fetch every file in the repository and place them in ComfyUI/LLM. Download: https://huggingface.co/MiaoshouAI/Florence-2-large-PromptGen-v2.0/tree/main
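The two requirements above can be checked from a script. This is a rough sketch under a few assumptions: the version comparison only looks at the first three numeric components (so a pre-release such as 4.38.0rc1 counts as 4.38.0), the function names are made up for illustration, and the subfolder name under ComfyUI/LLM is a guess based on the repository name, since the article only gives the parent directory.

```python
import re
from importlib import metadata
from pathlib import Path

MIN_TRANSFORMERS = "4.38.0"  # minimum version stated in the article
REPO_ID = "MiaoshouAI/Florence-2-large-PromptGen-v2.0"


def version_tuple(version: str) -> tuple:
    """Turn '4.38.0' or '4.39.0.dev0' into a comparable (major, minor, patch)."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def transformers_ok() -> bool:
    """Check the locally installed transformers package, if any."""
    try:
        return meets_minimum(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False


def download_model(comfyui_root: str) -> Path:
    """Manually fetch all repository files into ComfyUI/LLM/<repo name>."""
    from huggingface_hub import snapshot_download  # imported only when needed

    # Assumption: one subfolder per model, named after the repository.
    target = Path(comfyui_root) / "LLM" / REPO_ID.split("/")[-1]
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target
```

Running `transformers_ok()` before launching ComfyUI gives a quick yes/no answer, and `download_model("/path/to/ComfyUI")` (path is a placeholder) mirrors the manual download step.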

Flux Text-to-Image Workflow

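If you are interested in Flux text-to-image, see the workflow that runs online on LIBLIB: FLUX [Sequel]: The Largest Open-Source Text-to-Image Model at 12B Parameters and 23 GB, a Gallery of Stunning Images Straight from the Dev Version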

The ComfyUI workflows and models covered in this article can all be downloaded from LIBLIBAI or run online there:

• FLUX.1, runnable online on LIBLIB (黑暗森林工作室, Dark Forest Studio)

https://www.liblib.art/modelinfo/488cd9d58cd4421b9e8000373d7da123

• F.1-绮梦流光-水湄凝香

https://www.liblib.art/modelinfo/134c6dd95aef48e98a22b24e003e026b

• Workflow: Flux text-to-image / image-to-image + LoRA + image-to-prompt, with one-click switching

https://www.liblib.art/modelinfo/782aacd70f604da39e83368c696a02a8

In addition, LIBLIBAI now offers a local client, which can be downloaded from its home page.

PromptGen v2 Image-to-Prompt Workflow

The PromptGen v2 image-to-prompt workflow has been uploaded to the LIBLIB platform: https://www.liblib.art/modelinfo/fcba1596f88e4a29aef9eb0e043030f6?versionUuid=ac49e877274e47009a4602e23459480d
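Each example in this section pairs a long caption for the T5 channel with a comma-separated tag list for the CLIP channel. As a small illustration of handling such a pair in a script, the sketch below cleans the tag list and bundles the two strings. The names `FluxPrompt`, `dedupe_tags`, and `build_flux_prompt` are hypothetical and are not part of the Miaoshou Tagger plugin.

```python
from dataclasses import dataclass


@dataclass
class FluxPrompt:
    """The two text channels a Flux workflow consumes (hypothetical container)."""
    t5xxl: str   # long natural-language caption
    clip_l: str  # short comma-separated tags


def dedupe_tags(tag_string: str) -> list:
    """Split on commas, trim, lowercase, and drop empty or repeated tags."""
    seen = set()
    result = []
    for raw in tag_string.split(","):
        tag = raw.strip().lower()
        if tag and tag not in seen:
            seen.add(tag)
            result.append(tag)
    return result


def build_flux_prompt(caption: str, tags: str) -> FluxPrompt:
    """Bundle a caption and a cleaned tag list into one object."""
    return FluxPrompt(t5xxl=caption.strip(), clip_l=", ".join(dedupe_tags(tags)))


prompt = build_flux_prompt(
    " A sandcastle in the foreground with a city skyline in the background. ",
    "sky, day, Sky, , cloud",
)
print(prompt.clip_l)  # sky, day, cloud
```

Only exact repeats are removed, so near-duplicates such as "sandcastle" and "sand castle" are left for the user to decide on.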

01. 牛逼 ("NB", awesome)

T5:

A 3d rendering shoot from a bird's eye view about a small wooden house perched on a rocky island surrounded by lush greenery and a waterfall, with the word "nb" spelled out in large, green grass letters in the center of the island. the island is situated in the middle of the image, surrounded by fluffy white clouds, and the sky is a bright, clear blue with a few wispy clouds scattered throughout. the house has a red roof and a small porch, and there are several trees surrounding it, adding to the natural beauty of the scene. the overall atmosphere is peaceful and serene, with a sense of serenity and tranquility.

clip:

solo, outdoors, sky, day, cloud, blue sky, tree, grass, water, no humans, single tree, waterfall

02. Sand Sculpture

T5:

A sandcastle in the foreground with a city skyline in the background. the sandcastle is positioned in the middle of the image, with a view of the city skyline through a hole in a stone wall. the cityscape includes tall buildings, skyscrapers, and a tall television tower, all set against a bright blue sky with fluffy white clouds. the water is calm and reflects the buildings, creating a serene atmosphere. the image has a shallow depth of field, allowing the viewer to focus on the intricate details of the sand castle. the overall effect is one of tranquility and serenity, making it a beautiful and picturesque scene.

clip:

shanghai, china, sky, day, cloud, water, architecture, blue sky, building, tower, architecture details, sandcastle, architecture focus, sand castle

03. 3D Figurine

T5:

3D PVC, A photo-realistic shoot from a frontal camera angle about a stylized figurine of a young woman with long blonde hair, wearing a revealing outfit, standing confidently on a wooden surface in a modern indoor setting. the image also shows a soft, blurred background of a blurry landscape with rocks and trees. on the middle of the image, a 1girl, who appears to be in her early twenties, is standing with her hands on her hips, looking directly at the viewer with a smile on her face. she has a slender physique, small breasts, and is wearing a green crop top and a light blue, flowing skirt. her hair is styled in long, wavy locks that cascade down her back. she is wearing gold earrings and has a pair of golden sandals on her feet. the lighting is soft and natural, highlighting her smooth skin and the intricate details of her outfit.

clip:

1girl, solo, long hair, breasts, looking at viewer, smile, skirt, navel, closed mouth, jewelry, bare shoulders, medium breasts, standing, twintails, full body, earrings, indoors, midriff, crop top, lips, feet out of frame, sandals, halterneck, hands on own hips, sarong

04. Silhouette

T5:

Double exposure silhouette, A digital illustration shoot from a portrait camera angle about a portrait of a young woman with a tree reflected in water at sunset. the image also shows a serene and peaceful atmosphere. on the middle of the image, a 20 years old woman with light skin, brown hair tied in a high ponytail, and blue eyes, who appears to be looking directly at the viewer with a neutral expression, is positioned in the center of the frame. she has a slender physique and bare shoulders, with no clothing or accessories. her hair is styled in a messy yet elegant manner, and her hair color is brown hair. her eye color is blue eyes. she is facing the viewer and her expression is neutral. her lips are closed and her nose is slightly upturned. in front of her, a bare tree with no leaves and no branches is reflected in the calm water. the background is a gradient of warm oranges and pinks, creating a peaceful and serene atmosphere.

clip:

1girl, solo, looking at viewer, brown eyes, hair ornament, closed mouth, upper body, outdoors, nude, water, tree, lips, reflection, orange sky, single hair bun

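In conclusion, PromptGen v2's image-to-prompt quality is slightly behind that of the Joy model, but it needs only 1 GB, a far lower hardware requirement than Joy's 7 GB, and it returns results within seconds. That makes it a good fallback for quick tagging and image-to-prompt work, and for low-VRAM users in particular it can be the first choice.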



破狼
Focused on AIGC, LLMs, artwork, software engineering, and technical learning. Contact via WeChat (+V): shunshizhiwu.