上上周 OpenAI 开源了 Shap-E 3D 生成。论文地址是:https://arxiv.org/pdf/2305.02463.pdf
今天用 Colab 试用一下:
安装
!git clone https://github.com/openai/shap-e.git
%cd shap-e
!pip install -e .
Text-3D
我们先生成个小汽车看看效果
import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
xm = load_model('transmitter', device=device)
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))
batch_size = 4
guidance_scale = 15.0
prompt = "a car"
latents = sample_latents(
batch_size=batch_size,
model=model,
diffusion=diffusion,
guidance_scale=guidance_scale,
model_kwargs=dict(texts=[prompt] * batch_size),
progress=True,
clip_denoised=True,
use_fp16=True,
use_karras=True,
karras_steps=64,
sigma_min=1e-3,
sigma_max=160,
s_churn=0,
)
render_mode = 'nerf' # you can change this to 'stf'
size = 64 # this is the size of the renders; higher values take longer to render.
cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
display(gif_widget(images))
当然,还可以生成其他东东,可以参考官方给的各个 samples。
https://github.com/openai/shap-e/blob/main/samples.md
生产出来的 3D 是可以下载保存为 ply 格式,运行这个脚本:
from shap_e.util.notebooks import decode_latent_mesh
for i, latent in enumerate(latents):
with open(f'example_mesh_{i}.ply', 'wb') as f:
decode_latent_mesh(xm, latent).tri_mesh().write_ply(f)
之后,ply 文件就存到了shap-e 文件夹下,右键点击下载即可保存到本地。
Image-3D
如果你有个图片,可以让 Shap-E 帮你做成 3D 模型。
把图片上传到 shap-e 文件夹下
运行脚本:
import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget
from shap_e.util.image_util import load_image
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
xm = load_model('transmitter', device=device)
model = load_model('image300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))
batch_size = 4
guidance_scale = 3.0
image = load_image("car.png")
latents = sample_latents(
batch_size=batch_size,
model=model,
diffusion=diffusion,
guidance_scale=guidance_scale,
model_kwargs=dict(images=[image] * batch_size),
progress=True,
clip_denoised=True,
use_fp16=True,
use_karras=True,
karras_steps=64,
sigma_min=1e-3,
sigma_max=160,
s_churn=0,
)
render_mode = 'nerf' # you can change this to 'stf' for mesh rendering
size = 64 # this is the size of the renders; higher values take longer to render.
cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
display(gif_widget(images))
于是,基于这个图片的 3D 模型就出来了
虽然丑到哭,但是相信开源社区的智慧会慢慢迭代这个 AI 工具的。
感觉活都要被 AI 干了,那我们人类干什么呢?👇