Prompt Engineering: From Basics to Advanced Strategies


Prompt engineering is often dismissed as “just writing good instructions.” While that’s part of it, effective prompt engineering is a skill that combines psychology, linguistics, and empirical experimentation.

After writing thousands of prompts for production systems, I’ve developed strategies that consistently improve output quality. Here’s what I’ve learned.

The Prompt Engineering Mental Model

Think of prompting as programming in natural language. You’re:

Defining the task (like a function signature)
Providing context (like parameters)
Setting constraints (like type checking)
Specifying output format (like return types)

The LLM is your interpreter, but it’s probabilistic and context-sensitive.
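
To make the analogy concrete, here is a minimal sketch that treats the four parts as fields of a small data structure. The `PromptSpec` class and its field names are my own illustrative assumptions, not part of any library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical container mirroring the four parts of the mental model."""
    task: str                                              # the "function signature"
    context: str = ""                                      # the "parameters"
    constraints: List[str] = field(default_factory=list)   # the "type checks"
    output_format: str = ""                                # the "return type"

    def render(self) -> str:
        """Assemble the non-empty parts into one prompt string."""
        sections = [self.task]
        if self.constraints:
            rules = "\n".join(f"- {rule}" for rule in self.constraints)
            sections.append(f"Constraints:\n{rules}")
        if self.output_format:
            sections.append(f"Output format: {self.output_format}")
        if self.context:
            sections.append(f"Context:\n{self.context}")
        return "\n\n".join(sections)


spec = PromptSpec(
    task="Summarize the incident report below.",
    context="At 02:14 UTC the primary database failed over.",
    constraints=["Under 50 words", "No speculation about root cause"],
    output_format="A single paragraph",
)
prompt = spec.render()
```

Keeping the parts separate makes it easier to change one (say, the output format) without touching the others.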

Foundational Techniques

1. Be Specific and Explicit

Bad:

Summarize this document.

Good:

Summarize the following technical document in 3-5 bullet points, focusing on:
1. Main technical contributions
2. Key findings or results
3. Practical applications

Keep each bullet point under 50 words. Use technical terminology where appropriate.

Document:
{document_text}

Why it works: Removes ambiguity, sets clear expectations, defines success criteria.
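
Explicit constraints have a second benefit: they can be checked automatically. The sketch below validates a response against the bullet-count and word-limit rules from the prompt above. `check_summary` is a hypothetical helper, and it assumes the model marks bullets with "-", "*", or "•".

```python
import re
from typing import List


def check_summary(summary: str, min_bullets: int = 3, max_bullets: int = 5,
                  max_words: int = 50) -> List[str]:
    """Return a list of constraint violations (empty list means compliant)."""
    # Treat lines starting with "-", "*", or "•" as bullet points.
    bullets = [
        re.sub(r"^\s*[-*•]\s+", "", line)
        for line in summary.splitlines()
        if re.match(r"^\s*[-*•]\s+", line)
    ]
    problems = []
    if not min_bullets <= len(bullets) <= max_bullets:
        problems.append(
            f"expected {min_bullets}-{max_bullets} bullets, got {len(bullets)}"
        )
    for i, bullet in enumerate(bullets, start=1):
        word_count = len(bullet.split())
        # "Under 50 words" means 50 or more is a violation.
        if word_count >= max_words:
            problems.append(f"bullet {i} has {word_count} words")
    return problems
```

A non-empty result is a reasonable trigger for a retry or for logging the response as a failure case.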

2. Provide Examples (Few-Shot Learning)

Zero-Shot:

Extract action items from this meeting transcript.

Few-Shot:

Extract action items from meeting transcripts. Format each as: [Person] needs to [action] by [deadline].

Examples:
Input: "John, can you send the report by Friday?"
Output: [John] needs to [send the report] by [Friday]

Input: "Sarah mentioned she'll follow up with the client next week"
Output: [Sarah] needs to [follow up with client] by [next week]

Now extract from this transcript:
{transcript}

Why it works: Shows the LLM exactly what “good” looks like. Establishes format and tone.
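
In practice the examples usually live in data rather than in a hand-written string. Here is a rough sketch of assembling the few-shot prompt above from a list of (input, output) pairs; `build_few_shot_prompt` is a name I made up for illustration.

```python
from typing import List, Tuple


def build_few_shot_prompt(instruction: str, examples: List[Tuple[str, str]],
                          new_input: str) -> str:
    """Render the instruction, the worked examples, then the new input."""
    lines = [instruction, "", "Examples:"]
    for example_input, example_output in examples:
        lines.append(f'Input: "{example_input}"')
        lines.append(f"Output: {example_output}")
        lines.append("")  # blank line between examples
    lines.append("Now extract from this transcript:")
    lines.append(new_input)
    return "\n".join(lines)


few_shot_prompt = build_few_shot_prompt(
    "Extract action items from meeting transcripts. "
    "Format each as: [Person] needs to [action] by [deadline].",
    [("John, can you send the report by Friday?",
      "[John] needs to [send the report] by [Friday]")],
    "Ana will book the room tomorrow.",
)
```

With the examples in a list, swapping or adding examples becomes a data change you can test, not a prompt rewrite.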

3. Chain of Thought (CoT)

Without CoT:

Is this contract clause enforceable under California law?

With CoT:

Analyze whether this contract clause is enforceable under California law.

Step 1: Identify the key elements of the clause
Step 2: Determine relevant California statutes and case law
Step 3: Apply the legal principles to the clause
Step 4: Provide your conclusion with reasoning

Contract clause:
{clause_text}

Why it works: Encourages reasoning rather than pattern matching. Improves accuracy on complex tasks.
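
A structured CoT prompt also makes the response easier to post-process. The sketch below splits a response into its numbered steps so downstream code can read only the conclusion. It assumes the model echoes the "Step N:" labels at the start of a line, which is common but not guaranteed; `split_steps` is a hypothetical helper.

```python
import re
from typing import Dict


def split_steps(response: str) -> Dict[int, str]:
    """Map each 'Step N:' label in a response to the text that follows it."""
    parts = re.split(r"^Step (\d+):", response, flags=re.MULTILINE)
    # With one capture group, re.split yields
    # [preamble, number, text, number, text, ...].
    return {
        int(parts[i]): parts[i + 1].strip()
        for i in range(1, len(parts) - 1, 2)
    }


example_response = (
    "Step 1: The clause restricts future employment\n"
    "Step 2: Relevant statute identified\n"
    "Step 3: The statute applies to this clause\n"
    "Step 4: Likely unenforceable"
)
steps = split_steps(example_response)
conclusion = steps.get(4, "")
```

If the labels are missing, the function returns an empty dict, which is a useful signal that the model ignored the requested structure.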

4. Role Assignment

Without Role:

Explain quantum computing.

With Role:

You are a senior technical educator who specializes in making complex topics accessible.

Explain quantum computing to a software engineer who is familiar with classical computing concepts but has no physics background. Use analogies to programming concepts where helpful.

Why it works: Sets the right tone, knowledge level, and communication style.

Advanced Techniques

5. Self-Consistency

Run the same prompt multiple times with temperature > 0 and aggregate results.

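# `llm` stands in for your model client: any object with a
# complete(prompt, temperature=...) -> str method will do.
def self_consistent_answer(question, n=5):
    # Sample n independent answers; temperature > 0 makes them differ
    answers = [
        llm.complete(f"Answer this question: {question}", temperature=0.7)
        for _ in range(n)
    ]

    # Number the answers so the synthesis prompt can refer to them
    numbered = "\n\n".join(
        f"Answer {i}: {answer}" for i, answer in enumerate(answers, start=1)
    )

    # Use the LLM to synthesize the most consistent answer
    synthesis_prompt = (
        f"Here are {n} different answers to the same question:\n\n"
        f"{numbered}\n\n"
        "Identify the most consistent answer or synthesize the best answer "
        "from these responses."
    )

    return llm.complete(synthesis_prompt, temperature=0)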

When to use: High-stakes decisions, complex reasoning tasks, when you need confidence estimation.
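
When answers are short and directly comparable (a number, a label, a yes/no), a plain majority vote is a simpler aggregator than a synthesis call, and the vote share doubles as a rough confidence signal. This is a sketch under assumptions: `sample` is any callable you supply that returns one model answer per call, and the normalization is deliberately minimal.

```python
from collections import Counter
from typing import Callable, Tuple


def majority_vote(question: str, sample: Callable[[str], str],
                  n: int = 5) -> Tuple[str, float]:
    """Sample n answers; return the most common one and its agreement rate."""
    # Normalize lightly so "42" and " 42 " count as the same answer.
    answers = [sample(question).strip().lower() for _ in range(n)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n


# Stand-in for a real model call, so the example runs offline.
canned_replies = iter(["42", "42 ", "41", "42", "40"])
answer, agreement = majority_vote(
    "What is 6 * 7?", lambda q: next(canned_replies), n=5
)
```

A low agreement rate is a reasonable cue to escalate to a human or to a stronger model.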

6. Tree of Thoughts

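Explore multiple reasoning paths before committing to one. The single-prompt version here approximates the idea; fuller Tree of Thoughts implementations search over intermediate steps across several model calls.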

prompt = """
Problem: {problem}

Generate 3 different approaches to solve this problem:

Approach 1:
[Description of first approach]
Pros:
Cons:

Approach 2:
[Description of second approach]
Pros:
Cons:

Approach 3:
[Description of third approach]
Pros:
Cons:

Based on the analysis, which approach is best and why?
"""

When to use: Open-ended problems, architectural decisions, strategy planning.
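
For the multi-call variant, the core loop is a small beam search: propose next steps, score the partial paths, keep the best few, and repeat. The sketch below assumes you supply `propose` and `score` (both would normally wrap model calls); those names and the beam-search framing are my simplification, not a canonical implementation.

```python
from typing import Callable, List


def tree_of_thoughts(
    problem: str,
    propose: Callable[[str, List[str]], List[str]],
    score: Callable[[str, List[str]], float],
    depth: int = 3,
    beam_width: int = 2,
) -> List[str]:
    """Search over reasoning steps, keeping the best paths at each level."""
    frontier: List[List[str]] = [[]]  # each path is a list of reasoning steps
    for _ in range(depth):
        # Extend every surviving path with every proposed next step.
        candidates = [
            path + [step]
            for path in frontier
            for step in propose(problem, path)
        ]
        if not candidates:
            break
        # Keep only the highest-scoring partial paths.
        candidates.sort(key=lambda path: score(problem, path), reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]


# Toy stand-ins so the example runs without a model.
def toy_propose(problem: str, path: List[str]) -> List[str]:
    return ["cache", "shard"]


def toy_score(problem: str, path: List[str]) -> float:
    return float(path.count("shard"))


best_path = tree_of_thoughts("scale reads", toy_propose, toy_score,
                             depth=2, beam_width=2)
```

Cost grows with depth times beam width times proposals per step, so it pays to keep all three small.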

7. Constitutional AI / Self-Critique

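Have the LLM critique and refine its own output. Strictly speaking, Constitutional AI is a training method; the prompt-level pattern here borrows only its critique-and-revise loop.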

# First draft (`topic` is a string supplied by the caller)
initial_prompt = f"""
Write a technical blog post about {topic}.
"""

draft = llm.complete(initial_prompt)

# Self-critique
critique_prompt = f"""
You wrote this blog post:

{draft}

Critique it according to these criteria:
1. Technical accuracy
2. Clarity for the target audience
3. Logical flow
4. Missing important points

Provide specific suggestions for improvement.
"""

critique = llm.complete(critique_prompt)

# Revision
revision_prompt = f"""
Original blog post:
{draft}

Critique:
{critique}

Revise the blog post addressing the critique.
"""

final = llm.complete(revision_prompt)

When to use: Content generation, code review, any task where quality matters more than speed.
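
The draft, critique, revise sequence generalizes to a loop with a stopping rule. Here is a minimal sketch, assuming you pass in `critique` and `revise` callables that wrap your model calls, and that the critique prompt asks the model to reply "NO ISSUES" when satisfied. Both the callables and the marker are assumptions for illustration.

```python
from typing import Callable


def refine(draft: str, critique: Callable[[str], str],
           revise: Callable[[str, str], str], max_rounds: int = 3,
           done_marker: str = "NO ISSUES") -> str:
    """Alternate critique and revision until approval or the round limit."""
    for _ in range(max_rounds):
        feedback = critique(draft)
        # Case-insensitive check; done_marker itself must be uppercase.
        if done_marker in feedback.upper():
            break
        draft = revise(draft, feedback)
    return draft


# Toy stand-ins so the example runs without a model.
def toy_critic(text: str) -> str:
    return "No issues" if text.endswith("!") else "Add emphasis"


def toy_reviser(text: str, feedback: str) -> str:
    return text + "!"


final_text = refine("Hello", toy_critic, toy_reviser)
```

The round limit matters: a critic that never approves would otherwise loop, and cost, indefinitely.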

8. Prompt Chaining

Break complex tasks into sequential steps.

# Step 1: Extract information (`ticket` holds the raw support ticket text)
extract_prompt = f"""
Extract all customer complaints from this support ticket:
{ticket}

List each complaint clearly.
"""
complaints = llm.complete(extract_prompt)

# Step 2: Categorize
categorize_prompt = f"""
Categorize these complaints into: Product, Service, Billing, Other

Complaints:
{complaints}
"""
categories = llm.complete(categorize_prompt)

# Step 3: Prioritize
prioritize_prompt = f"""
Prioritize these categorized complaints by severity and urgency:

{categories}

For each, assign priority: High, Medium, Low
"""
priorities = llm.complete(prioritize_prompt)

# Step 4: Generate response
response_prompt = f"""
Generate a professional response addressing these prioritized complaints:

{priorities}

Tone: Empathetic and solution-oriented
"""
response = llm.complete(response_prompt)

When to use: Complex workflows, when intermediate outputs are valuable, when different steps need different prompting strategies.
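
Once a chain has more than two or three steps, it helps to drive it from data. The sketch below runs a list of (name, template) steps in order and stores every intermediate output under its step name. `run_chain` and the `complete` callable are hypothetical names; `complete` stands in for whatever function sends a prompt to your model.

```python
from typing import Callable, Dict, List, Tuple


def run_chain(steps: List[Tuple[str, str]], inputs: Dict[str, str],
              complete: Callable[[str], str]) -> Dict[str, str]:
    """Run prompt templates in order, storing each output under its name."""
    state = dict(inputs)
    for name, template in steps:
        # Later templates can reference any earlier output by name.
        prompt = template.format(**state)
        state[name] = complete(prompt)
    return state


chain_steps = [
    ("complaints", "Extract complaints from: {ticket}"),
    ("categories", "Categorize: {complaints}"),
]

# Stand-in "model" that upper-cases the prompt, so the example runs offline.
chain_state = run_chain(chain_steps, {"ticket": "late delivery"},
                        lambda prompt: prompt.upper())
```

Because every intermediate output stays in the returned dict, you can log or inspect each step when the final response looks wrong.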

RAG-Specific Prompting

9. Context Utilization

rag_prompt = """
Answer the question based ONLY on the provided context. Follow these rules:

1. If the context contains the answer, provide it with citations
2. If the context is relevant but doesn't fully answer, say what you can answer
3. If the context is not relevant, say "I don't have enough information to answer this question"
4. Never use information not present in the context
5. Cite sources using [Source: X] format

Context:
{context}

Question: {question}

Answer:
"""

Key elements:

Explicit instruction to use only provided context
Handling of edge cases (partial info, no info)
Citation requirements
Clear prohibitions (no external knowledge)
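
Citation requirements are only useful if you verify them. The sketch below labels retrieved chunks with source ids and then checks that every citation in the answer points at a source that was actually provided. The function names and the dict-of-chunks shape are assumptions for illustration.

```python
import re
from typing import Dict, Set


def format_context(chunks: Dict[str, str]) -> str:
    """Label each retrieved chunk with its source id so the model can cite it."""
    return "\n\n".join(
        f"[Source: {source}]\n{text}" for source, text in chunks.items()
    )


def unknown_citations(answer: str, chunks: Dict[str, str]) -> Set[str]:
    """Return cited source ids that were never supplied in the context."""
    cited = set(re.findall(r"\[Source:\s*([^\]]+?)\s*\]", answer))
    return cited - set(chunks)


retrieved = {"A": "Refunds take 5 days.", "B": "Shipping is free."}
context_block = format_context(retrieved)
suspect = unknown_citations("See [Source: C] and [Source: A].", retrieved)
```

A non-empty result means the model cited something it was never shown, which is a strong hint of a hallucinated answer.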

10. Multi-Document Reasoning

prompt = """
You are given information from multiple documents. Some information may be contradictory.

Documents:
[Doc 1 - Sales Report Q1]:
{doc1}

[Doc 2 - Sales Report Q2]:
{doc2}

[Doc 3 - Marketing Analysis]:
{doc3}

Question: {question}

Instructions:
1. Identify which documents are relevant to the question
2. If documents contradict each other, note the contradiction
3. Synthesize a coherent answer, citing specific documents
4. If there's ambiguity, acknowledge it

Answer:
"""

Prompt Optimization Workflow

1. Start with a baseline

baseline_prompt = "Summarize this article."

2. Add specificity

v2_prompt = "Summarize this article in 100 words, focusing on key findings."

3. Add examples

v3_prompt = """
Summarize articles like this example:

Input: [long article]
Output: [concise 100-word summary highlighting key findings]

Now summarize:
{article}
"""

4. Test and measure

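test_set = load_test_examples()

# Label each version so the report shows a name, not the full prompt text
prompt_versions = {"baseline": baseline_prompt, "v2": v2_prompt, "v3": v3_prompt}

for name, prompt in prompt_versions.items():
    results = evaluate(prompt, test_set)
    print(f"{name}: Accuracy={results.accuracy}, Quality={results.quality}")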

5. Iterate based on failures

# Analyze where v3 fails (evaluate() takes a list, so score one example at a time)
failures = [ex for ex in test_set if evaluate(v3_prompt, [ex]).quality < 3]

# Identify patterns
for failure in failures:
    print(f"Failed on: {failure.type}")
    # Failed on: Technical jargon-heavy articles

# Refine prompt: extend v3 with a rule that targets the observed failure
v4_prompt = v3_prompt + """
Note: If the article contains technical terminology, include a brief explanation in parentheses.
"""

Common Pitfalls

Pitfall 1: Over-Prompting

Bad:

You are an expert AI assistant with deep knowledge of all subjects. You are helpful, harmless, and honest. You always provide accurate information. You never make things up. You think carefully before responding...

[200 more words of instructions]

Question: What is 2+2?

Good:

Answer this math question accurately: What is 2+2?

Lesson: Only include necessary instructions. More prompt ≠ better results.

Pitfall 2: Ambiguous Constraints

Bad:

Write a short summary.

Good:

Write a summary in exactly 100 words.

Lesson: Quantify when possible. “Short” is subjective.

Pitfall 3: Conflicting Instructions

Bad:

Be creative and innovative, but only use the information provided.

Good:

Synthesize the provided information in a clear, organized way. Use headings and bullet points for readability.

Lesson: Don’t ask for creativity and then constrain it entirely. Be consistent.

Pitfall 4: Assuming Context Persistence

Bad:

# First message
"You are a Python expert."

# Second message (new API call)
"How do I reverse a string?"
# LLM doesn't remember it's a "Python expert"

Good:

# Every message includes role
"You are a Python expert. How do I reverse a string in Python?"

Lesson: Each API call is independent, so the model sees only what that request contains. Include the necessary context every time.
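
One way to avoid this pitfall is to rebuild the full message list for every request. The sketch below assumes a chat-style API that accepts role-tagged messages as a list of dicts. That shape is a common convention, but check your provider's documentation. `build_messages` is a hypothetical helper.

```python
from typing import Dict, List


def build_messages(system_prompt: str, history: List[Dict[str, str]],
                   user_message: str) -> List[Dict[str, str]]:
    """Assemble the full message list that must accompany every request."""
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_message}]
    )


earlier_turns = [
    {"role": "user", "content": "How do I reverse a string?"},
    {"role": "assistant", "content": "Use slicing: s[::-1]"},
]
messages = build_messages("You are a Python expert.", earlier_turns,
                          "And a list?")
```

The role and the earlier turns travel with every call, so the follow-up question keeps its meaning.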

Model-Specific Considerations

GPT-4 vs GPT-3.5-turbo

GPT-4: Better at following complex instructions, can handle longer contexts
GPT-3.5-turbo: Needs simpler, more explicit prompts

Claude (Anthropic)

Responds well to XML-style tags: <instructions>, <context>, <examples>
Good at following constitutional principles
Excels at longer context (100K+ tokens)
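
As a rough sketch of the XML-style tagging mentioned above, the helper below wraps each prompt section in a tag pair. The tag names come from the list above; the helper names are my own, and nothing here depends on a particular SDK.

```python
def tag(name: str, body: str) -> str:
    """Wrap a prompt section in an XML-style tag pair."""
    return f"<{name}>\n{body.strip()}\n</{name}>"


def build_tagged_prompt(instructions: str, context: str,
                        examples: str = "") -> str:
    """Join the tagged sections, omitting the examples block when empty."""
    sections = [tag("instructions", instructions), tag("context", context)]
    if examples:
        sections.append(tag("examples", examples))
    return "\n\n".join(sections)


tagged_prompt = build_tagged_prompt("Answer briefly.", "Refunds take 5 days.")
```

Tags give the model unambiguous section boundaries, which helps most when the context itself contains instructions-like text.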

Open Source Models (Llama, Mistral)

Often fine-tuned with specific prompt formats (e.g., [INST] tags)
May need more explicit instructions
Vary widely in capabilities

Example (Llama 2 Chat):

<s>[INST] <<SYS>>
You are a helpful assistant.
<</SYS>>

{user_message} [/INST]
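
A small helper keeps that layout in one place, so it is not retyped (and mistyped) across a codebase. This sketch reproduces the single-turn format shown above; `format_llama2_chat` is a hypothetical name. Some tokenizers add the `<s>` token themselves, so check yours before including it in the string.

```python
def format_llama2_chat(system_prompt: str, user_message: str) -> str:
    """Build a single-turn Llama 2 Chat prompt in the layout shown above."""
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )


llama_prompt = format_llama2_chat("You are a helpful assistant.",
                                  "How do I reverse a string?")
```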

Evaluation Metrics

How do you know if your prompt is good?

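import numpy as np

# `llm`, `judge_relevance`, and `semantic_similarity` are placeholders
# for your own model client and scoring functions.
def evaluate_prompt(prompt, test_set):
    scores = {
        'relevance': [],
        'correctness': [],
        'completeness': [],
        'format_compliance': [],
        'latency': [],
        'cost': []
    }

    for example in test_set:
        response = llm.complete(prompt.format(**example.inputs))

        scores['relevance'].append(
            judge_relevance(example.query, response)
        )
        scores['correctness'].append(
            semantic_similarity(response, example.ground_truth)
        )
        # ... other metrics

    # Average each metric; skip any that collected no values,
    # since np.mean([]) returns nan and emits a warning
    return {
        metric: np.mean(values)
        for metric, values in scores.items()
        if values
    }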
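
Two of the metrics left out above, latency and format compliance, need no judge model at all. The sketch below measures wall-clock latency and checks that a response is a JSON object with the expected keys. The JSON output format is an assumption for illustration; substitute whatever format your prompt requests.

```python
import json
import time
from typing import Callable, Tuple


def timed_call(complete: Callable[[str], str],
               prompt: str) -> Tuple[str, float]:
    """Return the response together with wall-clock latency in seconds."""
    start = time.perf_counter()
    response = complete(prompt)
    return response, time.perf_counter() - start


def is_valid_json_object(response: str,
                         required_keys: Tuple[str, ...] = ()) -> bool:
    """Check that the response parses as a JSON object with the given keys."""
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and all(
        key in parsed for key in required_keys
    )


# Stand-in for a real model call, so the example runs offline.
reply, latency_seconds = timed_call(lambda p: '{"summary": "ok"}',
                                    "Summarize...")
compliant = is_valid_json_object(reply, ("summary",))
```

Cheap deterministic checks like these are worth running on every response, leaving judge-based scoring for a sampled subset.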

Real-World Example: Customer Support Bot

Initial Prompt (Poor):

Help the customer.

Evolved Prompt (Production):

You are a customer support agent for TechCorp. Your goal is to resolve customer issues efficiently and professionally.

Guidelines:
1. Be empathetic and acknowledge the customer's frustration
2. Ask clarifying questions if needed (max 2 questions before providing solution)
3. Provide step-by-step solutions when applicable
4. If you cannot help, escalate to a human agent
5. Always end with asking if there's anything else you can help with

Context:
- Customer tier: {customer_tier}
- Previous interactions: {interaction_history}
- Current issue category: {issue_category}

Customer message: {customer_message}

Your response:

Results:

Baseline (poor prompt): 62% resolution rate
Production prompt: 84% resolution rate
Customer satisfaction: 3.2 → 4.3 / 5

Prompt Library Template

Maintain a library of tested prompts:

# prompts/summarization_v3.yaml
name: summarization_v3
task: Document summarization
version: 3.2.1
created: 2026-01-15
tested_on: 500 documents
avg_quality: 4.2/5

template: |
  Summarize the following document in {word_count} words.

  Focus on:
  - Main themes and arguments
  - Key findings or conclusions
  - Actionable insights

  Format: {format}

  Document:
  {document}

  Summary:

parameters:
  word_count:
    type: int
    default: 100
    range: [50, 500]

  format:
    type: enum
    default: bullets
    options: [paragraph, bullets, numbered]

examples:
  - input:
      document: "[Example document]"
      word_count: 100
      format: bullets
    output: |
      - Key point 1
      - Key point 2
      - Key point 3
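
The `parameters` section earns its keep when code enforces it. Here is a minimal sketch that applies defaults and validates values before filling the template. The spec is written as a Python dict mirroring the YAML above so the example needs no third-party YAML parser; `render_prompt` is a hypothetical helper, and it assumes the template uses `str.format` placeholders.

```python
from typing import Any, Dict


def render_prompt(spec: Dict[str, Any], **values: Any) -> str:
    """Fill a library template after applying defaults and validating."""
    params: Dict[str, Any] = {}
    for name, rule in spec.get("parameters", {}).items():
        value = values.get(name, rule.get("default"))
        if rule["type"] == "int":
            low, high = rule["range"]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")
        elif rule["type"] == "enum" and value not in rule["options"]:
            raise ValueError(f"{name}={value!r} not in {rule['options']}")
        params[name] = value
    # Values without a parameter rule (such as the document) pass through.
    params.update({k: v for k, v in values.items() if k not in params})
    return spec["template"].format(**params)


summarization_spec = {
    "template": (
        "Summarize the following document in {word_count} words.\n"
        "Format: {format}\n\nDocument:\n{document}\n\nSummary:"
    ),
    "parameters": {
        "word_count": {"type": "int", "default": 100, "range": [50, 500]},
        "format": {"type": "enum", "default": "bullets",
                   "options": ["paragraph", "bullets", "numbered"]},
    },
}

rendered = render_prompt(summarization_spec, document="Quarterly results...")
```

Rejecting out-of-range values at render time keeps a bad parameter from silently producing a prompt the library entry was never tested with.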

Conclusion

Prompt engineering is both art and science:

Art: Understanding how to communicate effectively with LLMs
Science: Systematic testing and iteration

Key takeaways:

Start simple, add complexity only when needed
Test with real examples, not just happy paths
Version and track your prompts
Measure what matters (quality, not just completion)
Learn from failures

The field is still evolving. What works today may be suboptimal tomorrow as models improve. Stay empirical, keep experimenting.

Resources

What prompt engineering techniques have worked for you? Share your strategies and examples. Reach out via email or X.

Disclaimer: The views, opinions, and technical approaches shared in this post are my own, based on my personal experience building production AI/ML systems. They do not represent the views of my current or former employers. Technology choices and architectural decisions should always be evaluated in the context of your specific use case and requirements.

Questions or feedback? I’d love to hear your thoughts and experiences.

Contact: LinkedIn, GitHub

Vishal Sharma
