Python Code Generation by Asking Clarification Questions

Li, Haau-Sing; Mesgar, Mohsen; Martins, André F. T.; Gurevych, Iryna

arXiv:2212.09885 (cs)

[Submitted on 19 Dec 2022 (v1), last revised 26 May 2023 (this version, v2)]

Abstract: Code generation from text requires understanding the user's intent from a natural language description and generating an executable code snippet that satisfies this intent. While recent pretrained language models demonstrate remarkable performance for this task, these models fail when the given natural language description is under-specified. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions. Therefore, we collect and introduce a new dataset named CodeClarQA containing pairs of natural language descriptions and code, together with synthetically created clarification questions and answers. The empirical results of our evaluation of pretrained language model performance on code generation show that clarifications result in more precisely generated code, as shown by the substantial improvement of model performance in all evaluation metrics. Alongside this, our task and dataset introduce new challenges to the community, including when and what clarification questions should be asked. Our code and dataset are available on GitHub.

Comments:	9 pages (excluding Limitations and Ethics Concerns)
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2212.09885 [cs.CL]
	(or arXiv:2212.09885v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2212.09885

Submission history

From: Haau-Sing Li
[v1] Mon, 19 Dec 2022 22:08:36 UTC (6,815 KB)
[v2] Fri, 26 May 2023 16:03:08 UTC (7,211 KB)
