Prompt Engineering That Scales: A Pragmatic Guide for Tech Leaders
Prompt engineering is just interface design for systems that don’t always do what you want. You’re trying to turn “make this work” into something that actually works the same way twice.
I’ve wasted hours on prompts that seemed great until they didn’t.
Here’s what actually stuck across different models and vendors: unambiguous language, explicit schemas, and testing your assumptions.
There’s a full before/after example later that shows all of this in one shot.
One GitHub repo I’ve found helpful will get you thinking about this if you’re using a CLI or similar tool such as Claude to format your requests into a PRD and then create tasks and subtasks that break down the requirements.
15 Things That Keep Working
- Use a model workbench or playground like platform.openai.com/playground - it’s way easier to iterate
- Shorter prompts work better (the sweet spot is roughly 250-500 tokens), but don’t skip the examples
- Understand the different prompt types (system - who the model is, user - what you want it to do, assistant - the model’s replies, which you can also seed as a template for future outputs) - see the sketch after this list
- Use one- or few-shot prompting. This just refers to the number of examples provided to the LLM in your prompt
- Conversational engine vs knowledge engine - decide whether the model may draw on general knowledge or must answer only from the context you supply, and pick one
- Say exactly what you mean
- Define the tone of voice: for example, “Use a spartan tone of voice”
- Test your prompts with real data, not made-up scenarios
- Define the output format explicitly
- Remove conflicting instructions (“detailed summary” makes no sense)
- Learn JSON, XML, CSV - you’ll need them
- Context, instructions, output format, rules, examples. In that order.
- Use AI to generate examples for AI
- Tokens are cheap. Use the smarter model unless you’re running millions of requests.
- Use ‘ask’ mode a few times before ‘agent’ mode in your CLI or Copilot
Bonus
- Give it a role (who), give it a goal (what), give it all the context it needs, and be clear on the output format. And let it ask questions first if it needs to.
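To make the prompt types and the one-shot idea concrete, here is a minimal sketch assuming the OpenAI Python SDK (1.x) with OPENAI_API_KEY set; the model name and the example strings are placeholders, and any chat-style API exposes the same three roles.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messages = [
    # system: who the model is and how it should sound
    {"role": "system", "content": "You are a senior backend architect. Tone: spartan."},
    # one-shot example: a user request plus the assistant reply you want imitated
    {"role": "user", "content": "Context: magic-link login service; Redis; 50 rps. List components as JSON."},
    {"role": "assistant", "content": '{"components":["API","RateLimiter","TokenStore"]}'},
    # the real request, in the same shape as the example
    {"role": "user", "content": "Context: signup service; Postgres; Redis; 100 rps. List components as JSON."},
]

response = client.chat.completions.create(
    model="gpt-4o",   # placeholder: use a smart model while designing
    messages=messages,
    temperature=0,    # reduce run-to-run variation
)
print(response.choices[0].message.content)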
Before and After Example
Scenario: You want a technical design plan for a Signup service with rate limiting, returned as JSON.
I’ve done this wrong so many times. Here’s what failure looks like:
BEFORE Prompt
Hey! Write a super detailed but also short doc about building a signup thing with rate limits. Explain all the best practices, include tables and code, and make it fun but professional. You can add anything you think is cool. Maybe talk about databases. We might be in AWS or GCP, not sure. Output however you want. Thanks!!
This is terrible:
- No role
- “Super detailed but also short” - pick one
- No format
- No context
- Ambiguous everything
- You’ll get a different answer every time
AFTER Prompt
Put this in your model playground. Start with a smart model while you’re designing. You can use cheaper ones later when you scale.
SYSTEM
You are a senior backend architect. You design with crisp trade-offs and minimal prose.
Tone: spartan. No marketing language.
USER
Context
- Product: Signup service for a consumer app, single region to start.
- Constraints: Postgres primary DB; Redis available; 100 rps peak; 99.9% target.
- Requirements: Rate limit 5 requests/min/IP; idempotent POST /signup; audit log of attempts; email verification webhook; PII handled via data minimization; no external calls during signup path.
- Non-goals: UI, analytics.
Instructions
- Mode: knowledge engine. Ground all outputs only in provided context. If info is missing, ask up to 2 clarifying questions. If no answers, proceed with conservative assumptions and list them.
- Task: Produce a design sketch that a senior engineer can implement.
- Keep instructions + examples concise; avoid repetition to reduce prompt length drift.
Output format (JSON only; no markdown)
{
"components": ["string"],
"data_model": [{"entity":"string","fields":[{"name":"string","type":"string","notes":"string"}]}],
"api": [{"method":"string","path":"string","request":"object","responses":[{"code":200,"body":"object"}]}],
"rate_limiting": {"key":"string","algo":"string","limits":{"unit":"string","value":number},"storage":"string","notes":"string"},
"risks": [{"risk":"string","mitigation":"string"}],
"test_cases": [{"id":"string","description":"string"}],
"assumptions": ["string"]
}
Rules
- Output valid, minified JSON that matches the schema.
- Do not invent external services. Do not include explanations outside JSON.
- If asking questions, ask them first as a JSON array: {"questions":["...","..."]}. After answers, return final JSON only.
Examples (few-shot; compact)
Example context -> output fragment:
- Context: "Passwordless magic-link login service; Redis; 50 rps; 3/min/IP; no PII."
- Output fragment:
{"components":["API","RateLimiter","TokenStore"],
"rate_limiting":{"key":"ip","algo":"fixed-window","limits":{"unit":"minute","value":3},"storage":"redis","notes":"expire per window"}}
Assistant template (style anchor)
{"components":["API"],"data_model":[{"entity":"Example","fields":[{"name":"id","type":"uuid","notes":"pk"}]}],"api":[{"method":"GET","path":"/health","request":{},"responses":[{"code":200,"body":{"status":"ok"}}]}],"rate_limiting":{"key":"ip","algo":"token-bucket","limits":{"unit":"minute","value":60},"storage":"redis","notes":"simplified"},"risks":[{"risk":"none","mitigation":"n/a"}],"test_cases":[{"id":"T0","description":"health"}],"assumptions":[]}
ASSISTANT
(If needed) {"questions":["List user attributes stored at signup?","Should email verification be synchronous or async?"]}
What changed:
- Built for the playground
- Compact examples
- Clear roles
- Specific output format (JSON schema)
- No contradictions
- Spartan tone
- It can ask questions before answering
- Everything is explicit
What Actually Works in Production
Treat prompts like code. Version control, code review, the works. I skip inline comments unless something is genuinely weird.
Test everything. Fixed test cases. Parse the outputs. Check schema validity. Track metrics, not your gut feeling about whether it’s “better.”
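Here is a minimal sketch of that test loop, assuming the AFTER prompt’s JSON schema; the captured responses and the keys checked are placeholders for your own fixed cases.

import json

# Model responses captured from your fixed test prompts (placeholder data).
fixed_test_outputs = [
    '{"components":["API"],"data_model":[],"api":[],"rate_limiting":{"limits":{"value":5}},'
    '"risks":[],"test_cases":[],"assumptions":[]}',
]

REQUIRED_KEYS = {"components", "data_model", "api", "rate_limiting",
                 "risks", "test_cases", "assumptions"}

def check_output(raw):
    """Return problems found in one model response; an empty list means it passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    problems = []
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    value = data.get("rate_limiting", {}).get("limits", {}).get("value")
    if not isinstance(value, (int, float)):
        problems.append("rate_limiting.limits.value is not a number")
    return problems

failures = [p for raw in fixed_test_outputs for p in check_output(raw)]
print(f"{len(failures)} problems across {len(fixed_test_outputs)} test cases")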
Log everything. Prompts, responses, token counts, latencies, errors. When something breaks, you want to know why.
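A sketch of that logging in a thin wrapper, again assuming the OpenAI Python SDK; swap in whatever client and log sink you already use.

import json
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("prompts")

def call_and_log(client, model, messages):
    """Call the chat API and log prompt, response, token counts, latency, and errors."""
    request_id = str(uuid.uuid4())
    started = time.monotonic()
    try:
        response = client.chat.completions.create(model=model, messages=messages)
    except Exception:
        log.exception("request %s failed after %.2fs", request_id, time.monotonic() - started)
        raise
    log.info(json.dumps({
        "request_id": request_id,
        "model": model,
        "latency_s": round(time.monotonic() - started, 3),
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "messages": messages,
        "output": response.choices[0].message.content,
    }))
    return response.choices[0].message.content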
Control your context. Don’t dump everything. Curate what you feed in. Precise snippets beat wall-of-text context.
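A minimal sketch of that curation, with a crude keyword filter and a character budget standing in for real retrieval; the snippets and query are made up.

# Placeholder source material; in practice this comes from your docs, tickets, or code.
all_snippets = [
    "Rate limit signup to 5 requests/min/IP using Redis.",
    "Marketing copy for the launch email campaign.",
    "POST /signup must be idempotent and write an audit log entry.",
]

def build_context(snippets, query, max_chars=4000):
    """Keep only snippets that share a term with the query, within a character budget."""
    terms = {t.lower() for t in query.split()}
    relevant = [s for s in snippets if terms & {w.lower().strip("./,") for w in s.split()}]
    picked, used = [], 0
    for snippet in relevant:
        if used + len(snippet) > max_chars:
            break                       # stop before blowing the budget
        picked.append(snippet)
        used += len(snippet)
    return "\n---\n".join(picked)

print(build_context(all_snippets, "signup rate limit"))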
Have a fallback. For high-volume stuff, use cheaper models with tighter prompts. Back it up with deterministic code when the model fails.
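A sketch of the fallback chain, with placeholder model names and a stub in place of the real API call; the hard-coded plan at the end is illustrative.

import json

def ask_model(model, prompt):
    """Stub for your real API call (e.g. the logging wrapper above)."""
    raise RuntimeError("model unavailable")   # simulated failure so the fallback fires

def signup_plan(prompt):
    """Try the smart model, then a cheaper one, then fall back to deterministic code."""
    for model in ("smart-model", "cheap-model"):          # placeholder model names
        try:
            plan = json.loads(ask_model(model, prompt))
            if isinstance(plan, dict) and "components" in plan:
                return plan                               # parsed and has the key field
        except Exception:
            continue                                      # bad JSON, timeout, refusal: next tier
    # Deterministic last resort: a conservative hard-coded answer, not a model answer.
    return {"components": ["API", "RateLimiter"], "assumptions": ["deterministic fallback used"]}

print(signup_plan("Design the signup service."))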
Watch your token budget. Shorten instructions if you need to, but keep the context examples. Those matter.
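A quick way to check the budget, assuming the tiktoken package is installed; cl100k_base is an assumption, so confirm which encoding your model actually uses.

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # assumption: match this to your model's encoding

prompt = "You are a senior backend architect. Design a signup service with rate limiting..."
tokens = len(enc.encode(prompt))
BUDGET = 500   # the 250-500 token sweet spot from the list above

print(f"{tokens} tokens against a budget of {BUDGET}")
if tokens > BUDGET:
    print("Over budget: trim instructions first, keep the examples.")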
What to Do Next
Take your three most important prompts. Rewrite them using the pattern above. Test them with real cases. Measure what changes.
Then scale.
Copy the After prompt, swap in your own context, and run it in a playground. Version your changes. Track what works.