Zero-Shot Text-Guided Object Generation with Dream Fields

A Jain, B Mildenhall, JT Barron… - Proceedings of the …, 2022 - openaccess.thecvf.com
Abstract
We combine neural rendering with multi-modal image and text representations to synthesize diverse 3D objects solely from natural language descriptions. Our method, Dream Fields, can generate the geometry and color of a wide range of objects without 3D supervision. Due to the scarcity of diverse, captioned 3D data, prior methods only generate objects from a handful of categories, such as ShapeNet. Instead, we guide generation with image-text models pre-trained on large datasets of captioned images from the web. Our method optimizes a Neural Radiance Field from many camera views so that rendered images score highly with a target caption according to a pre-trained CLIP model. To improve fidelity and visual quality, we introduce simple geometric priors, including sparsity-inducing transmittance regularization, scene bounds, and new MLP architectures. In experiments, Dream Fields produce realistic, multi-view consistent object geometry and color from a variety of natural language captions.
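
Concretely, each training step renders the radiance field from a randomly sampled camera pose, embeds the render with CLIP's image encoder, and descends on the negative cosine similarity to the caption's text embedding, plus a transmittance term that biases the scene toward empty space. The sketch below illustrates this loop under stated assumptions and is not the authors' implementation: `render_nerf`, `sample_pose`, and the `nerf` module are hypothetical placeholders for a differentiable volume renderer, the transmittance target `tau` and weight `lam` are illustrative values, and CLIP's pixel normalization and the paper's view augmentations are omitted.

```python
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model.float()  # keep everything in fp32 for simplicity
for p in model.parameters():
    p.requires_grad_(False)  # CLIP stays frozen; only the NeRF is optimized

with torch.no_grad():
    tokens = clip.tokenize(["a bouquet of flowers in a glass vase"]).to(device)
    text_emb = model.encode_text(tokens)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

def dream_field_step(nerf, optimizer, sample_pose, render_nerf,
                     lam=0.5, tau=0.88):
    """One optimization step: maximize CLIP image-text similarity while
    rewarding transmittance (a sparsity prior). `nerf`, `sample_pose`,
    and `render_nerf` are hypothetical stand-ins for a differentiable
    NeRF and its volume renderer."""
    pose = sample_pose()                          # random camera on a sphere
    rgb, transmittance = render_nerf(nerf, pose)  # (3, H, W) image, per-pixel T
    image = F.interpolate(rgb.unsqueeze(0), size=224, mode="bilinear",
                          align_corners=False)    # CLIP's input resolution
    img_emb = model.encode_image(image)           # CLIP pixel normalization omitted
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    clip_loss = -(img_emb * text_emb).sum()       # negative cosine similarity
    # Sparsity prior: reward mean transmittance, but only up to a target tau,
    # so the scene stays mostly empty without vanishing entirely.
    sparsity_loss = -torch.clamp(transmittance.mean(), max=tau)
    loss = clip_loss + lam * sparsity_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The clamp mirrors the "reward transparency only up to a target" idea in its simplest form; in the paper the transmittance target is annealed over training and renders are augmented before CLIP scoring, which this sketch leaves out.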