Search | arXiv e-print repository

Showing 1–1 of 1 results for author: Binyamin, L

Searching in archive cs.
  1. arXiv:2406.10210  [pdf, other]

    cs.CV cs.AI cs.GR

    Make It Count: Text-to-Image Generation with an Accurate Number of Objects

    Authors: Lital Binyamin, Yoad Tewel, Hilit Segev, Eran Hirsch, Royi Rassin, Gal Chechik

    Abstract: Despite the unprecedented success of text-to-image diffusion models, controlling the number of depicted objects using text is surprisingly hard. This is important for various applications from technical documents, to children's books to illustrating cooking recipes. Generating object-correct counts is fundamentally challenging because the generative model needs to keep a sense of separate identity…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project page is at https://make-it-count-paper.github.io/