Prompt Engineering Tips for Tech Leaders
Prompt Engineering That Scales: A Pragmatic Guide for Tech Leaders
Prompt engineering is just interface design for systems that don’t always do what you want. You’re trying to turn “make this work” into something that actually works the same way twice.
I’ve wasted hours on prompts that seemed great until they didn’t.
Here’s what actually stuck across different models and vendors: Unambiguous language, explicit schemas, testing your assumptions.
There’s a full before/after example later that shows all of this in one shot.
This github link is helpful if you’re using a CLI tool like Claude to turn requests into PRDs and break them into tasks.
15 Things That Keep Working
- Use workbench/playground models like platform.openai.com/playground… way easier to iterate
- Shorter prompts work better (250-500 tokens sweet spot) but don’t skip examples
- Understand the different prompt types (system - who am I, user - tell the model what to do/instructions, assistant - model feedback/template for future outputs)
- Use one- or few-shot prompting. This just refers to the number of examples provided to the LLM in your prompt
- Conversational vs knowledge engines - pick one
- Say exactly what you mean
- Define tone of voice: For example, “Use “spartan” in tone of voice”
- Test your prompts with real data, not made-up scenarios
- Define the output format explicitly
- Remove conflicting instructions (“detailed summary” makes no sense)
- Learn JSON, XML, CSV - you’ll need them
- Context, Instructions, output format, rules, example. In that order.
- Use AI to generate examples for AI
- Tokens are cheap. Use the smarter model unless you’re running millions of requests.
- Use ‘ask’ mode a few times before ‘agent’ mode in your CLI or Copilot
Bonus
- Give it a role (who), give it a goal (what), give it all context, be clear on output format. And let it ask questions first if it needs to.
Before and After Example
Scenario: You want a technical design plan for a Signup service with rate limiting, returned as JSON.
I’ve done this wrong so many times. Here’s what failure looks like:
BEFORE Prompt
Hey! Write a super detailed but also short doc about building a signup thing with rate limits. Explain all the best practices, include tables and code, and make it fun but professional. You can add anything you think is cool. Maybe talk about databases. We might be in AWS or GCP, not sure. Output however you want. Thanks!!
This is terrible:
- No role
- “Super detailed but also short” - pick one
- No format
- No context
- Ambiguous everything
- You’ll get a different answer every time
AFTER Prompt
Put this in your model playground. Start with a smart model while you’re designing. You can use cheaper ones later when you scale.
SYSTEM
You are a senior backend architect. You design with crisp trade-offs and minimal prose.
Tone: spartan. No marketing language.
USER
Context
- Product: Signup service for a consumer app, single region to start.
- Constraints: Postgres primary DB; Redis available; 100 rps peak; 99.9% target.
- Requirements: Rate limit 5 requests/min/IP; idempotent POST /signup; audit log of attempts; email verification webhook; PII handled via data minimization; no external calls during signup path.
- Non-goals: UI, analytics.
Instructions
- Mode: knowledge engine. Ground all outputs only in provided context. If info is missing, ask up to 2 clarifying questions. If no answers, proceed with conservative assumptions and list them.
- Task: Produce a design sketch that a senior engineer can implement.
- Keep instructions + examples concise; avoid repetition to reduce prompt length drift.
Output format (JSON only; no markdown)
{
"components": ["string"],
"data_model": [{"entity":"string","fields":[{"name":"string","type":"string","notes":"string"}]}],
"api": [{"method":"string","path":"string","request":"object","responses":[{"code":200,"body":"object"}]}],
"rate_limiting": {"key":"string","algo":"string","limits":{"unit":"string","value":number},"storage":"string","notes":"string"},
"risks": [{"risk":"string","mitigation":"string"}],
"test_cases": [{"id":"string","description":"string"}],
"assumptions": ["string"]
}
Rules
- Output valid, minified JSON that matches the schema.
- Do not invent external services. Do not include explanations outside JSON.
- If asking questions, ask them first as a JSON array: {"questions":["...","..."]}. After answers, return final JSON only.
Examples (few-shot; compact)
Example context -> output fragment:
- Context: "Passwordless magic-link login service; Redis; 50 rps; 3/min/IP; no PII."
- Output fragment:
{"components":["API","RateLimiter","TokenStore"],
"rate_limiting":{"key":"ip","algo":"fixed-window","limits":{"unit":"minute","value":3},"storage":"redis","notes":"expire per window"}}
Assistant template (style anchor)
{"components":["API"],"data_model":[{"entity":"Example","fields":[{"name":"id","type":"uuid","notes":"pk"}]}],"api":[{"method":"GET","path":"/health","request":{},"responses":[{"code":200,"body":{"status":"ok"}}]}],"rate_limiting":{"key":"ip","algo":"token-bucket","limits":{"unit":"minute","value":60},"storage":"redis","notes":"simplified"},"risks":[{"risk":"none","mitigation":"n/a"}],"test_cases":[{"id":"T0","description":"health"}],"assumptions":[]}
ASSISTANT
(If needed) {"questions":["List user attributes stored at signup?","Should email verification be synchronous or async?"]}
What changed:
- Built for the playground
- Compact examples
- Clear roles
- Specific output format (JSON schema)
- No contradictions
- Spartan tone
- It can ask questions before answering
- Everything is explicit
What Actually Works in Production
Treat prompts like code. Version control, code review, the works. I skip inline comments unless something is genuinely weird.
Test and log everything. Fixed test cases. Parse the outputs. Check schema validity. Track metrics, not your gut feeling about whether it’s “better.” Log your prompts, responses, token counts, latencies, and errors. When something breaks, you want to know why.
Don’t dump your whole codebase into the context window. Pick what you feed in. Precise snippets beat wall-of-text context.
Have a fallback. For high-volume stuff, use cheaper models with tighter prompts. Back it up with deterministic code when the model fails. Watch your token budget — shorten instructions if you need to, but keep the context examples. Those matter.
What to Do Next
Take your three most important prompts. Rewrite them using the pattern above. Test them with real cases. Measure what changes.
Then scale.
Copy the After prompt, swap in your own context, and run it in a playground. Version your changes. Track what works.
28 Business Prompts That Put These Principles to Work
Everything in this article — clear instructions, explicit schemas, specific context — applies double when you’re using AI for business operations instead of code generation. I built 28 prompts for market research, competitor analysis, SEO, and automation that follow these exact principles. Each one includes my notes on what works, what breaks, and how to chain them so each output feeds the next.
Get posts like this in your inbox
Bi-weekly emails on automation, AI, and building systems that run without you. No fluff.
No spam. Unsubscribe anytime.
Related Articles
The AI Coding Model Wars: How Open Source Is Closing the Gap on Proprietary Coding Models
February 15, 2026
Four major coding models launched in six days. The benchmark gap? 2.6 points. The price gap? 45x. A head-to-head comparison of Opus 4.6, Codex 5.3, GLM-5, and Kimi K2.5.
AI Model Selection: Choosing the Right Model and Application Pattern
February 3, 2026
Not all tasks need the most powerful AI model. Learn how to match model intelligence to task complexity and stop overpaying for sledgehammers when you need scalpels.
Architecture as Code: Why Tech Leaders and Engineers Should Adopt Diagrams‑as‑Code Now
October 23, 2025
AI now writes a large share of new code. Stand out with Architecture as Code—pros, cons, tools, and a simple adoption plan for leaders, engineers, and recruiters.
Wrestling with a technical challenge?
I help companies automate complex workflows, integrate AI into their stacks, and build scalable cloud architectures.