Generative AI

python==3.12.3
torch==2.8.0(+cu128)
triton==3.4.0
torchvision==0.23.0(+cu128)
torchao==0.13.0
torchaudio==2.8.0
flash-attn==2.8.3
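
The pins above can be compared with what is actually installed in the active environment. The following is a minimal sketch, not part of the repository: the helper names `base_version` and `check_pins` are hypothetical, and it assumes that local build tags such as `+cu128` can be ignored when comparing versions.

```python
from importlib import metadata

# Pins copied from the list above. Local build tags such as "+cu128"
# are dropped before comparison (an assumption of this sketch).
PINNED = {
    "torch": "2.8.0",
    "triton": "3.4.0",
    "torchvision": "0.23.0",
    "torchao": "0.13.0",
    "torchaudio": "2.8.0",
    "flash-attn": "2.8.3",
}


def base_version(version: str) -> str:
    """Drop a local build suffix, e.g. '2.8.0+cu128' -> '2.8.0'."""
    return version.split("+", 1)[0]


def check_pins(pins: dict[str, str], lookup=metadata.version) -> dict[str, str]:
    """Return one status per package: 'ok', 'missing', or 'found <version>'."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = base_version(lookup(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if found == wanted else f"found {found}"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINNED).items():
        print(f"{name:12s} {PINNED[name]:8s} {status}")
```

Running the script prints one line per package, so a mismatch shows up before ComfyUI or a custom node fails at import time.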

ComfyUI

ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI/custom_nodes

Custom Nodes:

ComfyUI-Manager: git clone https://github.com/ltdrdata/ComfyUI-Manager
ComfyUI-WanVideoWrapper: git clone https://github.com/kijai/ComfyUI-WanVideoWrapper
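
The two clone steps can also be scripted so that repositories already on disk are skipped. This is a rough sketch under stated assumptions: ComfyUI is checked out at `~/ComfyUI`, `git` is on the PATH, and the function names `clone_commands` and `install` are hypothetical helpers, not part of ComfyUI.

```python
import subprocess
from pathlib import Path

# Repository URLs copied from the list above.
CUSTOM_NODES = {
    "ComfyUI-Manager": "https://github.com/ltdrdata/ComfyUI-Manager",
    "ComfyUI-WanVideoWrapper": "https://github.com/kijai/ComfyUI-WanVideoWrapper",
}


def clone_commands(custom_nodes_dir: Path, repos: dict[str, str]) -> list[list[str]]:
    """Build a 'git clone' command for each repo that is not yet on disk."""
    commands = []
    for name, url in repos.items():
        target = custom_nodes_dir / name
        if target.exists():
            continue  # already cloned, leave it alone
        commands.append(["git", "clone", url, str(target)])
    return commands


def install(custom_nodes_dir: Path, repos: dict[str, str] = CUSTOM_NODES) -> None:
    """Run the clone commands; raises CalledProcessError if git fails."""
    for cmd in clone_commands(custom_nodes_dir, repos):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Assumed install location; adjust if ComfyUI lives elsewhere.
    install(Path.home() / "ComfyUI" / "custom_nodes")
```

Keeping command construction separate from execution makes it possible to print the commands and inspect them before anything is cloned.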

To run ComfyUI
cd ~/ComfyUI
python main.py

GUI: http://127.0.0.1:8188
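
Besides the browser GUI, the same server accepts workflows over HTTP. The sketch below assumes the server is running at the address above and that the workflow was exported from the GUI in API format; the file name `workflow_api.json` and the helper names are hypothetical.

```python
import json
import urllib.request
import uuid

SERVER = "http://127.0.0.1:8188"  # address from the GUI line above


def build_prompt_request(workflow: dict, server: str = SERVER) -> urllib.request.Request:
    """Build (but do not send) a POST request for the server's /prompt endpoint."""
    payload = {"prompt": workflow, "client_id": str(uuid.uuid4())}
    return urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, server: str = SERVER) -> dict:
    """Send the workflow to a running server and return its JSON reply."""
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Hypothetical file name: a workflow saved in API format from the GUI.
    with open("workflow_api.json", encoding="utf-8") as f:
        print(queue_workflow(json.load(f)))
```

Only the standard library is used here, so the script runs in the same environment as ComfyUI without extra packages.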

1. Generative Image

NanoBanana

Prompt:

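Please generate a young woman.
Facial features and makeup
Features: Delicate, soft facial features with an oval or slightly heart-shaped face. A full, prominent bust.
Eyes: Clear, focused eyes with monolids or hidden double eyelids, with a natural East Asian beauty.
Skin/makeup: Fair, luminous skin with very natural, light makeup that gives a "no-makeup" or "nude makeup" look, emphasizing the skin's glow and flawlessness.
💇 Hairstyle
Hair color/texture: Dark brown or natural black hair that looks smooth and glossy.
Styling: A half-up ponytail (half-up, half-down style), with the top section of hair swept back for a neat look, while a few strands fall naturally along both sides of the face, adding softness.
👚 Clothing and overall style
Clothing: A beige or light nude top, in what appears to be a light knit or finely textured fabric.
Style: The overall style is fresh, natural, and elegant, with the look of a model.
🌟 Brief summary
She is a woman with a fresh look and a gentle temperament. She has fair, luminous skin with natural nude makeup and smooth, dark brown half-up hair, and the overall impression is elegant and focused. She is in a café.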

Imagen4

Prompt:

A beautiful young woman with long, voluminous, wavy brown hair and hazel eyes, looking thoughtfully to the side.
She is illuminated by soft, natural light coming from a nearby window with sheer curtains. She is wearing a simple, beige off-the-shoulder top.
The mood is serene and pensive. The style should be a photorealistic portrait with a shallow depth of field, creating a soft, blurred background.

Grok.imagine

Prompt:

Photorealistic close-up portrait of a young East Asian female singer (K-pop idol aesthetic) on a dark stage.
She has big eyes and thick lips and is wearing a black off-the-shoulder top with spaghetti straps. Her light brown hair is styled wavy and curly.
She is holding a professional stage microphone and singing with a focused, emotional expression.
Dramatic, high-contrast volumetric lighting, strong spotlight isolating the subject, deep shadows, blue/black background.
Cinematic shot, 8k, hyperdetailed, shallow depth of field, aspect ratio 9:16

Z-Image Turbo

Model: Comfy-Org/z_image_turbo

RED-Zimage 1.5

Model: https://civitai.com/models/958009/redcraft-or-redzimage-or-updated-dec03-or-latest-red-z-v15
Blog: RED-Zimage 1.5 Review: Finally, Real AI Photos?

2. Generative Video

Sora2

Veo3.1

Wan2.2 T2V & I2V

ComfyUI - WAN2.2

5B TI2V workflow

14B T2V workflow

14B I2V workflow

Wan2.2 Animate

ComfyUI - Wan2.2 Animate

14B Animate workflow

Wan2.2 S2V

ComfyUI - Wan2.2 S2V

14B S2V workflow

InfiniteTalk

ComfyUI - Wan2.1 InfiniteTalk: make characters generated from images and videos lip-sync perfectly

Text-to-Image : Grok.com/imagine
Lyrics-to-Song : Suno.com
Image-to-Video : ComfyUI + Wan2.1 Video + InfiniteTalk

I2T -> T2I -> I2V

ComfyUI_Qwen3-VL-Instruct

3. Generative Song

6. Image-to-3D

Hunyuan3D

Paper: Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details
Code: https://github.com/Tencent-Hunyuan/Hunyuan3D-2.1

ComfyUI - Hunyuan3D 2.1


rkuo2000/GenAI
