Questions-of-Thoughts (QoT): a stepwise self‑questioning scaffold to improve LLM‑assisted software design
This paper introduces Questions-of-Thoughts, or QoT, a method that steers large language models (LLMs) toward higher-quality software designs by turning a user goal into an ordered plan of engineering steps plus targeted self-questions. The idea is to get the model to ask itself the right questions at each step so it uncovers missing constraints and avoids common omissions. QoT also keeps a compact record of intermediate decisions so later steps are more stable and consistent.
The authors describe QoT as an inference-time scaffold, meaning it operates while the model generates solutions rather than requiring extra training. QoT has three main parts. First, a Sequential Process Chain breaks a high-level problem into an ordered list of substeps so the work proceeds in a logical sequence. Second, a Question–Answer (self‑QA) Chain makes the model generate and answer clarifying questions at each step, similar to a guided checklist. Third, a Reasoning Knowledge Base stores intermediate reasoning (the “thinking process”) and a progressively updated response so earlier decisions inform later ones.
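The three components can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `ask_llm` is a hypothetical stand-in for any chat-completion call, and the prompt wording, step count, and dictionary layout are assumptions made for clarity.

```python
def ask_llm(prompt: str) -> str:
    # Placeholder for a real model call (hypothetical helper).
    return f"[model answer to: {prompt[:40]}...]"

def qot(goal: str, num_steps: int = 3) -> dict:
    # 3) Reasoning Knowledge Base: intermediate reasoning plus a
    #    progressively updated response.
    knowledge_base = {"thinking": [], "response": ""}

    # 1) Sequential Process Chain: decompose the goal into ordered substeps.
    steps = [ask_llm(f"Substep {i + 1} toward: {goal}") for i in range(num_steps)]

    for step in steps:
        # 2) Question-Answer (self-QA) Chain: generate and answer a
        #    clarifying question for this step.
        question = ask_llm(f"What clarifying question matters for: {step}?")
        answer = ask_llm(
            f"Given prior context {knowledge_base['thinking']}, answer: {question}"
        )

        # Record the step's reasoning so later steps stay consistent,
        # then refresh the running response.
        knowledge_base["thinking"].append({"step": step, "q": question, "a": answer})
        knowledge_base["response"] = ask_llm(
            f"Update the design for '{goal}' using: {knowledge_base['thinking']}"
        )
    return knowledge_base
```

The key design point is that each self-QA round writes back into the knowledge base before the next step runs, which is what lets earlier decisions constrain later ones.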
To test QoT, the team evaluated it on three representative backend engineering domains: API Design, Data Communication, and File Systems. They scored the generated systems with a quality rubric inspired by ISO/IEC software quality ideas. That rubric measured Scalability (how well the design can grow), Completeness (coverage of needed parts and edge cases), Modularity (clear separation of components), and Security (basic risk controls). The paper reports domain-wise gains as the difference between the QoT condition and a baseline without QoT.
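The gain computation described above is straightforward to make concrete. In this sketch the criterion names come from the paper, but the 0-10 scale and the example scores are invented for illustration:

```python
# Rubric criteria named in the paper; scale and numbers below are assumptions.
CRITERIA = ["Scalability", "Completeness", "Modularity", "Security"]

def domain_gain(qot_scores: dict, baseline_scores: dict) -> dict:
    """Per-criterion gain: QoT-condition score minus no-QoT baseline score."""
    return {c: qot_scores[c] - baseline_scores[c] for c in CRITERIA}

# Made-up scores on a hypothetical 0-10 scale for one domain:
gain = domain_gain(
    {"Scalability": 8, "Completeness": 7, "Modularity": 9, "Security": 6},
    {"Scalability": 6, "Completeness": 6, "Modularity": 7, "Security": 6},
)
# gain == {"Scalability": 2, "Completeness": 1, "Modularity": 2, "Security": 0}
```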
Results were capacity-dependent. QoT produced consistent quality improvements for larger models and for the more complex domains in the study. For smaller models, the paper reports possible trade-offs when context and planning resources are tight: in those cases QoT did not always help and could consume planning budget without improving quality. The authors also note a practical implementation detail: if an error occurs while generating a step or answering a self-question, the current system halts that reasoning branch and returns a structured error message, and they suggest that real deployments could add retry or fallback strategies.
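The suggested retry-or-fallback behavior could be layered on top of the stop-on-error design roughly as follows. All names here are hypothetical; the paper only describes halting with a structured error and mentions retries as a deployment option.

```python
class StepError(Exception):
    """Structured error returned when a reasoning step fails (assumed shape)."""
    def __init__(self, step: str, cause: str):
        super().__init__(f"step '{step}' failed: {cause}")
        self.step = step
        self.cause = cause

def run_step_with_retry(step_fn, step_name: str, max_retries: int = 2):
    """Run one reasoning step, retrying on failure.

    Mirrors the paper's behavior at the limit: once retries are exhausted,
    the branch stops and a structured error is raised instead of a result.
    """
    for attempt in range(max_retries + 1):
        try:
            return step_fn()
        except Exception as exc:
            if attempt == max_retries:
                raise StepError(step_name, str(exc))
    # Unreachable: the loop either returns or raises.
```

A fallback strategy would slot in at the same point, e.g. substituting a simpler prompt or a cached partial answer instead of raising.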