🧪 Test Kitchen
Pick your task, get a suggested prompt, test on the recommended models, then compare results.
1 What are you trying to do?
2 Your prompt: edit the suggestion or write your own
3 Open these models, paste your prompt, and run it
4 Paste or note each model's response (optional, but needed to generate the compare prompt)
5 Compare prompt: paste this into any AI to get a side-by-side analysis
Paste this into ChatGPT, Claude, or Gemini to get a structured breakdown of which response was best and why.
What to look for
Accuracy
Did it get the facts right? Did it make anything up?
Instruction-following
Did it do exactly what you asked, or go off-script?
Tone
Does the voice match what you need? Formal, casual, confident, hedged?
Length
Was it appropriately brief, or did it pad the response?
Format
Did it structure the output well without being told explicitly?
Surprise
Did it add something you didn't expect that made it better?
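If you would rather assemble the compare prompt yourself, the idea from step 5 can be sketched in a few lines of Python. This is only an illustration, not how Test Kitchen builds its prompt: the function name `build_compare_prompt`, the `CRITERIA` table, and the wording of the generated prompt are all assumptions made for this example. The six criteria are copied from the checklist above.

```python
# Hypothetical helper: none of these names come from Test Kitchen itself.
# The criteria and their questions are taken from the "What to look for" list.
CRITERIA = {
    "Accuracy": "Did it get the facts right? Did it make anything up?",
    "Instruction-following": "Did it do exactly what you asked, or go off-script?",
    "Tone": "Does the voice match what you need?",
    "Length": "Was it appropriately brief, or did it pad the response?",
    "Format": "Did it structure the output well without being told explicitly?",
    "Surprise": "Did it add something unexpected that made it better?",
}


def build_compare_prompt(task_prompt: str, responses: dict[str, str]) -> str:
    """Assemble one comparison prompt from the original prompt and pasted responses.

    `responses` maps a model label (any string you choose) to that model's output.
    """
    if len(responses) < 2:
        raise ValueError("need at least two responses to compare")

    lines = [
        "I ran the same prompt on several models. Compare the responses below.",
        "",
        "Original prompt:",
        task_prompt.strip(),
        "",
    ]
    # One clearly delimited block per model so the reviewer can tell them apart.
    for model, text in responses.items():
        lines += [f"--- Response from {model} ---", text.strip(), ""]

    lines.append(
        "Rate each response on these criteria, then say which was best and why:"
    )
    for name, question in CRITERIA.items():
        lines.append(f"- {name}: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder inputs; replace with your own prompt and pasted responses.
    print(build_compare_prompt(
        "Summarize this article in three bullet points.",
        {"Model A": "First pasted response.", "Model B": "Second pasted response."},
    ))
```

Paste the printed text into whichever AI you want to act as the reviewer. Labeling responses "Model A" and "Model B" rather than by product name is a design choice you may want to keep, since it makes it harder for the reviewer to favor a familiar name.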