Output Quality Evaluator and Follow-Up Questions

The Prompt

You are an AI output evaluator. Judge whether the response is good enough to use, explain what is weak, and generate the smallest useful next step.

ORIGINAL REQUEST OR PROMPT:
{{original_request_or_prompt}}

AI OUTPUT:
{{ai_output}}

SUCCESS CRITERIA:
{{success_criteria}}

KNOWN CONTEXT:
{{known_context}}

WORKFLOW SURFACE:
{{workflow_surface}}

Return exactly:
1) Overall verdict (usable, revise, reframe the task, or missing context)
2) What is already working
3) Quality gaps and why they matter
4) Smallest useful next step
   - revise prompt
   - gather missing context
   - accept as good enough
5) Follow-up questions for the human (max 5, only if needed)
6) Best next prompt to run
7) When to stop iterating and gather better source material instead

Rules:
- Evaluate against the stated task and criteria, not personal taste.
- If the real problem is missing context or evidence, say so directly.
- Do not recommend another retry if the output is already good enough.
- Prefer the smallest next step that can improve quality in a measurable way.

When to Use

Use this when you have an AI answer in front of you and are not sure whether the problem is the output, the prompt, or the missing context around the task. It helps avoid the common trap of endlessly retrying without learning what actually needs to change.

Best fits:

- an answer feels weak, but you cannot tell whether it is salvageable
- you want targeted follow-up questions instead of generic “add more detail” advice
- a team needs a more disciplined review step before using AI output
- you want to decide whether to retry, rewrite the prompt, or pause and gather better source material

Variables

| Variable | Description | Good input examples |
| --- | --- | --- |
| `original_request_or_prompt` | The task as it was originally given to the model | full user request, prompt template instance, chat transcript excerpt |
| `ai_output` | The output you want evaluated | current answer, draft memo, generated checklist, code explanation |
| `success_criteria` | What a good output should achieve | accurate summary, decision-ready recommendation, plain-language rewrite, reusable prompt |
| `known_context` | Important background or constraints the evaluator should consider | audience, no-invention rule, approved notes, deadline, risk limits |
| `workflow_surface` | Where the next step will happen | chat app, coding agent, editor assistant, review handoff |
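
If the prompt is filled in by a script rather than by hand, the substitution can be sketched as below. This is a minimal sketch, not part of the original prompt: the `fill_prompt` helper, the abbreviated `TEMPLATE`, and the example values are all hypothetical, and it assumes the `{{name}}` placeholder syntax shown in the prompt. It refuses to run when any variable is missing or blank, since an empty field tends to produce a vague evaluation.

```python
import re

# Abbreviated copy of the prompt body; only the placeholders matter here.
TEMPLATE = """ORIGINAL REQUEST OR PROMPT:
{{original_request_or_prompt}}

AI OUTPUT:
{{ai_output}}

SUCCESS CRITERIA:
{{success_criteria}}

KNOWN CONTEXT:
{{known_context}}

WORKFLOW SURFACE:
{{workflow_surface}}
"""

# Matches {{name}} and captures the variable name.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace every {{name}} placeholder; fail if a value is missing or blank."""
    names = set(PLACEHOLDER.findall(template))
    missing = sorted(n for n in names if not values.get(n, "").strip())
    if missing:
        raise ValueError(f"missing values for: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda m: values[m.group(1)].strip(), template)


# Hypothetical inputs, loosely based on the examples in the table above.
example_values = {
    "original_request_or_prompt": "Summarize the attached notes for the steering group.",
    "ai_output": "The project is broadly on track and should continue.",
    "success_criteria": "Decision-ready recommendation backed by the approved notes.",
    "known_context": "Audience: steering group. No-invention rule applies.",
    "workflow_surface": "chat app",
}

filled = fill_prompt(TEMPLATE, example_values)
print(filled)
```

Raising an error on a blank variable is a design choice for this sketch; a looser workflow could instead substitute a marker such as "not provided" so the evaluator can return a "missing context" verdict.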

Tips & Variations

- Paste the exact output, not a paraphrase. Weaknesses are often in wording, structure, or unsupported claims that summaries hide.
- If quality matters more than speed, ask for a short scorecard by criterion before the next prompt is written.
- Use this after the result exists; use Prompt Critic and Rewrite Coach when you want to improve the prompt itself before the next run.
- When repeated attempts keep failing, pay attention to the “gather better source material” section. The bottleneck is often context quality, not model quality.
- If the answer will be handed to another tool or person, include that in `workflow_surface` so the next-step guidance stays practical.

Example Output

Overall verdict: revise. The answer is directionally right but too generic for the stated audience and does not support its recommendations with the provided notes.

Follow-up questions: confirm the primary audience, the approved decision deadline, and whether unresolved risks should be named directly.

Best next prompt: a shorter retry that includes those missing details and asks for the exact output structure needed.
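
When the evaluation feeds an automated step, for example a review handoff that only proceeds on "usable", the verdict can be pulled out of the reply. The following is a rough sketch under assumptions: the `extract_verdict` helper is hypothetical, and it assumes the evaluator writes the verdict on a line starting with "Overall verdict", as in the example above. Real replies may be formatted differently, so treat a `None` result as "needs a human look".

```python
import re

# The four verdicts the prompt allows.
ALLOWED_VERDICTS = ("usable", "revise", "reframe the task", "missing context")


def extract_verdict(evaluation: str):
    """Return the verdict named after 'Overall verdict', or None if not found."""
    match = re.search(
        r"overall verdict\s*[:\-]?\s*(.+)", evaluation, flags=re.IGNORECASE
    )
    if not match:
        return None
    rest = match.group(1).lower()
    for verdict in ALLOWED_VERDICTS:
        if rest.startswith(verdict):
            return verdict
    return None


reply = "Overall verdict: revise. The answer is directionally right but too generic."
verdict = extract_verdict(reply)
print(verdict)
```

Matching against a fixed list of verdicts keeps the check strict: a reply that uses some other wording yields `None` instead of being guessed at.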
