Run-your-own workshop

From Personality Prompt to Personality Harness

Move a personality from a single prompt to a tested harness you can steer, so it stays in character when it matters.

About a 90-minute session. Work through it with your team and answer as you go.

Start here

Your answers stay on your device. At the end you download a worked copy with everything your team filled in.

Most teams start personality design with a prompt. It usually reads something like this: "Be warm, concise, helpful, emotionally intelligent, and human. Do not ask too many questions. Do not sound robotic."

That is a reasonable start. It is not a production control system. A personality prompt describes how the agent should sound. A personality harness defines how it should behave, how that behavior shifts by context, how examples steer it, how failures are caught, and how engineering can build and maintain it.

This workshop moves a team from personality as a block of text to personality as an operational artifact. You work through it together and answer in the worksheet as you go, so you finish with something product and engineering can act on.

What you will produce

By the end of the session you will have a marked-up version of your current personality prompt, a one-page voice card, a starter set of conversational moves, a starter moment map, a small set of bad-to-good examples, a guard and check list, a list of open questions, and a first engineering backlog. The aim is not to solve personality in one sitting. It is to make personality observable, testable, and buildable.

Who should be in the room

The session works best with a product owner, an engineer or technical lead, a designer or conversation designer, a domain expert, and a trust or quality owner where it is relevant. Assign three roles before you start. One person facilitates, one person scribes, and one person acts as engineering translator. The translator keeps asking where each decision would live in the system, and whether it is a prompt change, a runtime variable, a retrieved example, a classifier, a critic, an eval, a logging requirement, or an unresolved product decision.

What to bring

Bring your current personality prompt and system prompt if you have them, five to ten real or realistic user turns, a few responses that felt wrong, a few that felt right, and any product or architecture constraints you already know. If you do not have transcripts yet, set the transcripts decision to synthetic in your upfront decisions and label your imagined turns clearly. Do not pass them off as evidence.

How the session runs

Set your upfront decisions first. They tell the modules what to assume and what you are deferring, and the guidance adjusts to match. Then work through each module in order. Every discussion should produce one of six things: a prompt rule, a moment rule, an example, a check, a backlog item, or an open decision. If it produces none of those, it is taste. Taste is fine for a few minutes, then it becomes an artifact or the team moves on.

What this prevents

The session should not end with "make it warmer" or "sound less robotic." Those are not instructions anyone can build. It should end with statements engineering can act on, such as "limit generation to one question per turn," "add emotional disclosure as a moment state," or "log the selected moment, the selected move, the generated response, and any guard failures." That is the shift from a personality prompt to a personality harness.

01. 0 to 5 min

Set the frame

Open by naming the goal. You are not here to write the perfect personality prompt. You are here to break personality into pieces product and engineering can use.

Agree on one rule for the session. No adjective is finished until it is translated into observable behavior. "Warm" is not finished, but "reflect before advising in emotional disclosure" is. "Concise" is not finished, but "one to three sentences by default" is. "Curious" is not finished, but "ask one narrow question only when information is missing" is.

Output. A shared goal and a working constraint, with the three roles assigned.

Your worksheet

Who is in the room

Name the three required roles first.

Saved on this device as you type.

02. 5 to 15 min

Decompose the current prompt

Paste your current personality prompt below. Tag each instruction as one of these: global voice, surface style, turn-level behavior, moment-specific behavior, safety or policy guard, product goal, user preference, example, architecture requirement, or unclear.

When an instruction could sit in more than one category, do not debate it. Record it in both, mark it unresolved, and move on. "Do not overwhelm the user" might mean keep responses short, ask fewer questions, slow down during emotional disclosure, or hold back multi-step plans unless asked. That ambiguity is the finding, and it needs a definition before anyone can build it.

For each instruction worth keeping, note its likely category, the observable behavior behind it, where it would live, and any open question.
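The notes for each instruction can be kept as structured rows rather than highlights on a page, which makes the unresolved ones easy to pull out later. The following is a minimal sketch under assumptions: the `TaggedInstruction` class, its field names, and the sample rows are hypothetical and not part of the workshop materials. Only the category names come from the tagging list above.

```python
from dataclasses import dataclass

# Category names taken from the tagging list in this module.
CATEGORIES = {
    "global voice", "surface style", "turn-level behavior",
    "moment-specific behavior", "safety or policy guard", "product goal",
    "user preference", "example", "architecture requirement", "unclear",
}


@dataclass
class TaggedInstruction:
    """One instruction from the prompt plus its workshop notes (hypothetical shape)."""
    text: str
    categories: list[str]
    observable_behavior: str = ""
    lives_in: str = ""
    open_question: str = ""

    def __post_init__(self) -> None:
        unknown = set(self.categories) - CATEGORIES
        if unknown:
            raise ValueError(f"unknown categories: {sorted(unknown)}")

    @property
    def unresolved(self) -> bool:
        # Tagged in more than one category, or tagged "unclear": needs a definition.
        return len(self.categories) > 1 or "unclear" in self.categories


# Illustrative rows only.
rows = [
    TaggedInstruction(
        text="Do not overwhelm the user",
        categories=["turn-level behavior", "moment-specific behavior"],
        open_question="Shorter replies, fewer questions, or slower pacing?",
    ),
    TaggedInstruction(
        text="Do not ask too many questions",
        categories=["turn-level behavior"],
        observable_behavior="At most one question per turn",
        lives_in="deterministic check",
    ),
]

needs_translation = [row.text for row in rows if row.unresolved]
print(needs_translation)
```

Recording an ambiguous instruction under both categories, as the module suggests, is what flags it as unresolved here.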

Output. A decomposed prompt, a list of vague instructions to translate, and a set of clean rules you can keep as is.

Your worksheet

Your current personality prompt

Instructions that are already clear rules

Vague instructions that need translation

03. 15 to 25 min

Identify known failures

Ask where the agent currently feels wrong, and use real examples wherever you have them. Common patterns include too many questions, stacked questions, over-explaining, advice before understanding, a generic tone, a tone that does not shift with the moment, and pushing when the user is done.

If you have no real failures yet, use predicted ones and label them as assumptions. Do not let an assumed failure get treated as proven.

For each failure, record an example user turn, the bad response pattern, why it fails, and the likely cause: prompt, moment, move, example, architecture, or eval. Rank the top five. They drive the rest of the session.

Based on your decisions. You are working from synthetic turns. Label every failure below as an assumption and add a backlog item to confirm it against real transcripts.

Output. A ranked list of five failures with an example turn for each.

Your worksheet

Your top five failures, ranked

An example user turn for each failure

04. 25 to 38 min

Build the global voice card

The voice card defines stable identity. Keep it short. It is not the place for every behavior.

Fill in what the agent is and is not, how users should and should not feel, how it usually sounds and what it should avoid sounding like, the roles it should never play, and the phrases it should never use.

If you stall, work from contrast. What would obviously be wrong? What would make a user lose trust? What would sound like every other chatbot? The anti-voice is often easier to name than the voice, and it is just as useful.

Stable identity belongs in the system or developer prompt. A list of banned generic phrases can become a detector or a rewrite rule. Higher-level principles can become eval criteria once you write them as observable behaviors.

Based on your decisions. You are adapting on explicit preferences only. Capture how a user states a preference and how they change it. Do not infer personality types yet.

Output. A one-page voice card.

Your worksheet

The agent is

The agent is not

Users should feel

Users should not feel

Phrases and roles the agent should avoid

05. 38 to 50 min

Translate adjectives into rules

Take the voice card and turn its adjectives into checks. For each one, write what it means in behavior, what it does not mean, a good and a bad response pattern, whether it can be checked automatically, and where the check would live.

"Warm" means acknowledge the user's state before moving on. It does not mean adding reassurance to every response. "Concise" means one to three sentences by default. It does not mean clipped. "Curious" means one narrow question only when information is missing. It does not mean ending every turn with a question.

If you cannot finish the sentence "we would know the agent was being X if we saw it do Y in a transcript," the trait is not ready to build yet.
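A trait rule written this way can be stored next to its check, if it has one. The sketch below is illustrative and assumes a deliberately rough sentence counter. `TraitRule`, `sentence_count`, and `review` are hypothetical names. "Warm" is left without a check to show a trait that still needs a critic or a person.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


def sentence_count(text: str) -> int:
    """Rough count: split after ., !, or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


@dataclass
class TraitRule:
    trait: str
    means: str
    does_not_mean: str
    # None means no script can check it yet; route it to a critic or a person.
    check: Optional[Callable[[str], bool]] = None


RULES = [
    TraitRule(
        trait="concise",
        means="one to three sentences by default",
        does_not_mean="clipped",
        check=lambda reply: 1 <= sentence_count(reply) <= 3,
    ),
    TraitRule(
        trait="warm",
        means="acknowledge the user's state before moving on",
        does_not_mean="adding reassurance to every response",
    ),
]


def review(reply: str) -> dict[str, Optional[bool]]:
    """Pass or fail per trait, or None where no automatic check exists."""
    return {r.trait: (r.check(reply) if r.check else None) for r in RULES}


print(review("That sounds like a rough week. What would help most right now?"))
```

A `None` in the result is a prompt to answer "how you could check it" in the worksheet, not a pass.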

Output. A list of observable personality rules.

Your worksheet

Adjective

What it means in behavior, and what it does not

How you could check it

06. 50 to 62 min

Define conversational moves

A model should not just respond. It should make a move. Pick five to eight. Common ones are reflect, clarify, challenge, normalize, summarize, explain, invite action, close, redirect, and escalate.

For each move, define its purpose, when to use it and when not to, a default length, a question policy, and a good and bad example. If you stall, return to the failure list. If the agent asked three questions, the move it should have made was probably reflect. If it gave advice too soon, the move was probably normalize. If it kept going, the move was probably close.

Move labels can become classifier labels, the selected move can be injected into the generation prompt, and move fit can become an eval dimension.
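As a rough illustration of a selected move being injected into the generation prompt, the sketch below treats the move labels as an enum and renders one move card as a block of context. The `Move`, `MoveCard`, and `move_context` names and the card wording are placeholders, not a prescribed format.

```python
from dataclasses import dataclass
from enum import Enum


class Move(Enum):
    """A starter subset of the moves named above; these double as classifier labels."""
    REFLECT = "reflect"
    CLARIFY = "clarify"
    NORMALIZE = "normalize"
    SUMMARIZE = "summarize"
    CLOSE = "close"


@dataclass(frozen=True)
class MoveCard:
    purpose: str
    default_length: str
    question_policy: str


# Placeholder card content; a team would fill these in during the session.
MOVE_CARDS = {
    Move.REFLECT: MoveCard(
        purpose="Show the user what you heard before doing anything else.",
        default_length="one to two sentences",
        question_policy="no question",
    ),
    Move.CLOSE: MoveCard(
        purpose="End the exchange cleanly when the user is done.",
        default_length="one sentence",
        question_policy="no question",
    ),
}


def move_context(move: Move) -> str:
    """Render the selected move as text to append to the generation prompt."""
    card = MOVE_CARDS[move]
    return "\n".join([
        f"Selected move: {move.value}",
        f"Purpose: {card.purpose}",
        f"Length: {card.default_length}",
        f"Questions: {card.question_policy}",
    ])


print(move_context(Move.REFLECT))
```

Because the enum values are plain strings, the same labels can be reused in logs and eval cases without translation.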

Output. A starter move taxonomy.

Your worksheet

The five to eight moves you will support

One move card in full

07. 62 to 72 min

Build a starter moment map

The same personality should not behave the same way in every moment. Pick five to seven. A workable default is opening, exploration, emotional disclosure, decision or action, and closing, with a safety-sensitive override if you need one.

For each moment, define the user state, the agent goal, the allowed and disallowed moves, the tone and length, the question policy, and what to log or evaluate. Do not overbuild. Five moments you can implement beat ten you will argue about.

Moments can become runtime state, classifier output, or retrieved prompt context. Moment cards can be injected dynamically, and moment fit can be evaluated.
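One way to picture a moment as runtime state is a small card that constrains which move is allowed. This is a sketch under assumptions: the `MomentCard` fields, the two sample cards, and the fallback of taking the first allowed move are illustrative choices, not rules from the workshop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MomentCard:
    name: str
    agent_goal: str
    allowed_moves: tuple[str, ...]
    question_allowed: bool
    max_sentences: int


# Illustrative cards for two of the five default moments.
MOMENTS = {
    "emotional_disclosure": MomentCard(
        name="emotional_disclosure",
        agent_goal="Make the user feel heard before anything else.",
        allowed_moves=("reflect", "normalize"),
        question_allowed=False,
        max_sentences=3,
    ),
    "closing": MomentCard(
        name="closing",
        agent_goal="End cleanly without pushing further.",
        allowed_moves=("summarize", "close"),
        question_allowed=False,
        max_sentences=2,
    ),
}


def resolve_move(moment: str, proposed_move: str) -> str:
    """Keep the proposed move if the moment allows it, otherwise fall back."""
    card = MOMENTS[moment]
    if proposed_move in card.allowed_moves:
        return proposed_move
    return card.allowed_moves[0]  # Assumed fallback: the first allowed move.


print(resolve_move("emotional_disclosure", "explain"))
```

Logging both the proposed and the resolved move would make it possible to see how often the moment card overrides the model's first instinct.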

Based on your decisions. Without strong transcript evidence, hold the moment map to the generic five. Do not split a moment until repeated real failures justify it.

Based on your decisions. Safety scope is unresolved. Add a placeholder safety override to the map and a decision to confirm it before launch.

Output. A starter moment map.

Your worksheet

The moments you will track

One moment card in full

08. 72 to 82 min

Write bad-to-good examples

Examples are the most important part of the harness. Use the failures you ranked earlier. For each, write the user turn, the moment, the desired move, a bad response, a better response, why the better one works, and a failure tag.

If you stall, write the bad response first. Teams are usually better at naming what feels wrong than at inventing the ideal from scratch. Then change one thing. Remove the second question, cut to two sentences, reflect before advising, make the question concrete, or drop the generic reassurance.

Use realistic, messy user turns rather than polished ones. Examples become prompt examples, retrieved examples, eval cases, or training data later, and the tags become retrieval metadata.
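To show how tags can act as retrieval metadata, here is a minimal sketch of a contrast example record and a tag-based lookup. The `ContrastExample` fields follow the list in this module. The sample turn and responses are invented for illustration and marked as synthetic, and `retrieve` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContrastExample:
    user_turn: str
    moment: str
    move: str
    bad_response: str
    better_response: str
    why_better: str
    failure_tag: str
    source: str  # "synthetic" or "real"


# Invented example content, labeled synthetic.
LIBRARY = [
    ContrastExample(
        user_turn="idk i just feel like im failing at everything lately",
        moment="emotional_disclosure",
        move="reflect",
        bad_response="What is going wrong? When did it start? Have you tried a plan?",
        better_response="That sounds heavy, like nothing is landing right now.",
        why_better="Reflects first and drops the stacked questions.",
        failure_tag="question_stacking",
        source="synthetic",
    ),
    ContrastExample(
        user_turn="ok thanks thats all i needed",
        moment="closing",
        move="close",
        bad_response="Happy to help! Is there anything else you want to explore?",
        better_response="Glad that helped. Take care.",
        why_better="Closes instead of pushing when the user is done.",
        failure_tag="pushing_when_done",
        source="synthetic",
    ),
]


def retrieve(moment: str, failure_tag: str, limit: int = 2) -> list[ContrastExample]:
    """Return examples whose moment and failure tag both match."""
    matches = [
        ex for ex in LIBRARY
        if ex.moment == moment and ex.failure_tag == failure_tag
    ]
    return matches[:limit]


for ex in retrieve("emotional_disclosure", "question_stacking"):
    print(ex.better_response)
```

Keeping `source` on every record makes it straightforward to filter synthetic examples out of anything presented as evidence.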

Based on your decisions. Mark these examples as synthetic. They steer the system, but they are not yet evidence.

Output. At least five contrast examples.

Your worksheet

Example one

Four more contrast examples

09. 82 to 88 min

Convert failures into checks

Take the failure list and decide how each one should be caught. Use a deterministic check for countable patterns, such as more than one question mark or a banned phrase. Use a classifier for categorization, such as the current moment or whether content is safety-sensitive. Use a model-based critic for judgment, such as advice given too soon or a tone that is too clinical. Use human review for anything subjective or early.

A simple test. Could a script catch it? Use a deterministic check. Could a small model label it? Use a classifier. Does it need judgment? Use a critic or a person. Do you not know yet? Put it in the open questions.

For each failure, record the detection method, the acceptance criterion, an example pass and fail, and where it is logged.
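The two countable patterns named above, more than one question mark and a banned phrase, are the kind of thing a short script can catch. The sketch below is one possible shape. The phrases in `BANNED_PHRASES` are placeholders, since the real list comes from your voice card, and the failure tag names are assumptions.

```python
# Placeholder phrases; replace with the list from your own voice card.
BANNED_PHRASES = ["i'm here for you", "i understand how you feel"]


def guard_failures(response: str, max_questions: int = 1) -> list[str]:
    """Return a failure tag for each deterministic rule the response breaks."""
    failures = []

    # Countable pattern 1: more question marks than the policy allows.
    if response.count("?") > max_questions:
        failures.append("question_stacking")

    # Countable pattern 2: banned phrases, matched case-insensitively.
    # Curly apostrophes are normalized so both spellings are caught.
    lowered = response.lower().replace("\u2019", "'")
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            failures.append(f"banned_phrase:{phrase}")

    return failures


print(guard_failures("What happened? When did it start? Why now?"))
print(guard_failures("That sounds hard."))
```

An empty list is a pass. Anything else can be logged as a guard failure and, later, used to trigger a rewrite.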

Based on your decisions. At alpha, lead with deterministic checks and human review. Add classifiers and model-based critics only where a deterministic rule cannot reach.

Based on your decisions. Model choice is still open. Flag any check that depends on it so the decision is not silently assumed.

Output. A starter guard and eval checklist.

Your worksheet

Each failure and how you will detect it

Acceptance criterion for each check

10. 88 to 90 min

Build the engineering handoff

Do not end on discussion. End on a handoff table. For each artifact, capture the product decision behind it, the system behavior it requires, where it lives in the architecture, the backlog item, the acceptance criterion, the owner, and the open question.

A worked example. The artifact is the emotional disclosure moment card. The product decision is to reflect before advising when a user shares something vulnerable. The system behavior is questions off by default in that moment, with advice gated behind a user request or a prior reflective turn. The architecture location is the moment classifier, dynamic prompt context, generation rules, and eval set. The acceptance criterion is that across twenty emotional disclosure eval cases the agent reflects before advising in at least eighteen and asks no more than one question in all of them.
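The acceptance criterion in the worked example can be expressed as a small pass or fail function over eval results. This is a sketch only: `EvalResult` and `meets_criterion` are hypothetical names, and how `reflected_before_advising` gets labeled (by a critic or a person) is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    case_id: str
    reflected_before_advising: bool  # Labeled by a critic or a human reviewer.
    question_count: int


def meets_criterion(
    results: list[EvalResult],
    min_reflecting: int = 18,
    max_questions: int = 1,
) -> bool:
    """At least min_reflecting cases reflect first, and no case exceeds max_questions."""
    reflecting = sum(r.reflected_before_advising for r in results)
    within_limit = all(r.question_count <= max_questions for r in results)
    return reflecting >= min_reflecting and within_limit


# Twenty made-up results: eighteen reflect first, none asks more than one question.
results = [
    EvalResult(case_id=f"case-{i:02d}", reflected_before_advising=i < 18, question_count=i % 2)
    for i in range(20)
]
print(meets_criterion(results))
```

Note that the question limit applies to every case, so a single response with two questions fails the whole criterion even if all twenty reflect first.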

Based on your decisions. Mark each control as alpha, beta, or later in the handoff so engineering can sequence the build against latency and cost.

Output. An engineering handoff table.

Your worksheet

Harness artifact

Product decision

System behavior

Architecture location

Backlog item and acceptance criterion

Owner

Open question

Download your worked copy

A markdown file with your decisions and everything you filled in. Keep it as your team's working copy and hand it to engineering and product.

Reference and deeper notes

Common failure paths, the full implementation brief, and backlog translation examples.

Common failure paths

These are the ways the session tends to go wrong, and how to keep it on track.

The team argues about adjectives

When "warm" or "human" means different things to different people, move straight to transcript behavior. Ask what you would see in a response that proves the trait, and what you would see if it failed. Record the trait, its behavioral definition, a good and bad pattern, any unresolved disagreement, and the backlog implication.

The team has no real transcripts

Use synthetic turns, but label them as assumptions and add a backlog item to replace them with real ones. For each, record the assumption being tested, why the case matters, what real evidence is needed, and how soon it should be replaced.

The team overbuilds the graph

Too many moments, too many branches, no clear path to implement. Collapse to five starter moments. Only split one when repeated transcript failures prove it is too broad. Record the proposed moment, the reason for not adding it yet, the evidence you would need, and a review date.

Engineering cannot implement the artifact

The result is a polished personality document where no one knows where the rules live. Force every rule into an architecture location: system prompt, developer prompt, runtime context, conversation state, retrieved examples, classifier, critic, rewrite pass, eval set, logging, or human review. Record the rule, its location, the data it needs, an owner, and an acceptance criterion.

The team confuses safety with personality

Tone preferences get mixed with safety rules, and personality starts to weaken escalation. Separate the two. Safety overrides personality. Record the safety rule, the personality rule it overrides, the trigger, the required behavior, the escalation path, and an eval case.
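The precedence rule can be made explicit in code so that no personality setting is able to soften it. The sketch below assumes a `safety_sensitive` flag supplied by an upstream classifier or rule. `TurnPlan` and `plan_turn` are hypothetical names, and "escalate" is used as the override move because it appears in the move list earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnPlan:
    move: str
    overridden: bool  # True when safety replaced the personality's choice.


def plan_turn(personality_move: str, safety_sensitive: bool) -> TurnPlan:
    """Safety is checked first; personality only applies when no trigger fires."""
    if safety_sensitive:
        return TurnPlan(move="escalate", overridden=personality_move != "escalate")
    return TurnPlan(move=personality_move, overridden=False)


print(plan_turn("normalize", safety_sensitive=True))
print(plan_turn("normalize", safety_sensitive=False))
```

Logging `overridden` gives you the eval case the paragraph asks for: every override is a turn where the personality rule and the safety rule disagreed.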

The team personalizes too early

Different personalities per user, inferred from a few turns, drift toward inconsistency or intrusiveness. Start with explicit preference adaptation only, such as "keep it short" or "be direct." Record the preference, how a user expresses it, the allowed and disallowed adaptation, where it is stored, and how the user can change it.

The team writes examples that are too polished

Examples that read well in isolation but do not match real turns, with the bad examples missing. Write the bad example first, use messy user turns, and build contrast rather than ideals. Record the messy turn, the likely bad response, the better response, the rule it demonstrates, and the failure tag.

The team cannot agree on good

Separate the decision from the evidence. Some calls are product taste, some are user evidence questions, and some are safety constraints. Record the disagreement, the decision needed, the evidence needed, a temporary default, the owner, and a date to revisit.

The team wants to solve everything with fine-tuning

When prompt failures turn into "we need training," stop. Training comes after you have examples, labels, evals, and a stable definition of quality. Record the behavior prompting could not control, the examples available, the eval coverage, the reason training might be needed later, and the current non-training mitigation.

The team ignores latency and cost

A harness that assumes classifiers, retrieval, critics, and rewrites everywhere will be too slow or expensive. Mark each control as alpha, beta, or later. Use deterministic checks first, and human review before automated layers when volume is low. Record each control layer, its latency and cost impact, whether it is needed for alpha, and the fallback if you do not build it.

The implementation brief

The session should hand engineering a brief with nine parts.

The global prompt changes to add, remove, or rewrite.

The dynamic context to pass into generation, such as the current moment, the selected move, a user preference, whether a question is allowed, a length target, relevant examples, and a recent summary.

The moment states to detect, each with a definition, trigger, allowed moves, question policy, and fallback.

The move labels, each with a purpose, default shape, question policy, and examples.

The example library schema, with fields for the user turn, moment, move, bad and better response, why better, failure tag, tone and length tags, source, and version.

The guard checks, each with a detection method, pass and fail conditions, and an action on fail.

The eval set, each case with an expected moment, move, question count, length, and the failure it tests.

The logging requirements, including the moment, move, prompt version, model, retrieved examples, question and sentence counts, guard failures, any rewrite, the final response, and a human rating.

The open decisions, each with the reason it is unresolved, a temporary assumption, an owner, the evidence needed, and a next review.
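The logging requirements in the brief map fairly directly onto a per-turn record. The following is a minimal sketch, assuming one JSON line per turn. The `TurnLog` class, its field names, and the sample values are illustrative, and the prompt version and model strings are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class TurnLog:
    """One logged turn, covering the fields listed in the logging requirements."""
    moment: str
    move: str
    prompt_version: str
    model: str
    final_response: str
    retrieved_example_ids: list[str] = field(default_factory=list)
    question_count: int = 0
    sentence_count: int = 0
    guard_failures: list[str] = field(default_factory=list)
    rewritten: bool = False
    human_rating: Optional[int] = None  # Filled in later, if a person reviews it.

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


entry = TurnLog(
    moment="emotional_disclosure",
    move="reflect",
    prompt_version="v0-placeholder",
    model="model-not-yet-chosen",
    final_response="That sounds heavy.",
    sentence_count=1,
)
print(entry.to_json())
```

Writing the prompt version and model into every record is what lets a later regression be traced to a specific change rather than argued about.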

Backlog translation examples

Question stacking becomes a question_count check, a compound_question check, a question_allowed flag in the generation context, a rewrite pass when the policy is violated, and eval cases for the failure. Acceptance criterion: no more than one question and no compound questions in at least ninety-five percent of the question discipline eval set.

Advice too soon becomes an emotional disclosure moment card with moment-specific allowed moves, a logged selected move, and eval cases for advice overreach. Acceptance criterion: the first response reflects or normalizes before advising in at least ninety percent of cases.

Generic support language becomes a banned phrase list, a phrase detector, a rewrite instruction for generic openings, and canonical grounded examples. Acceptance criterion: banned phrases appear in under two percent of test responses.

A request for shorter answers becomes a response_length_preference field, detected from explicit statements, stored as default, short, or detailed, injected into generation, with short-mode examples and adherence evals. Acceptance criterion: when the preference is short, responses stay within the length target in at least ninety percent of cases.
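Detection from explicit statements can start as a handful of patterns before any classifier is involved. The sketch below is one hedged way to do it: the pattern lists are illustrative and far from complete, and `detect_length_preference` and `update_preference` are hypothetical names. The three stored values (default, short, detailed) come from the paragraph above.

```python
import re
from typing import Optional

# Illustrative patterns only; grow these from real transcripts.
SHORT_PATTERNS = [r"\bkeep it (short|brief)\b", r"\bshorter\b", r"\btoo long\b"]
DETAILED_PATTERNS = [r"\bmore details?\b", r"\bgo deeper\b"]


def detect_length_preference(user_turn: str) -> Optional[str]:
    """Return "short" or "detailed" for an explicit request, else None."""
    text = user_turn.lower()
    if any(re.search(p, text) for p in SHORT_PATTERNS):
        return "short"
    if any(re.search(p, text) for p in DETAILED_PATTERNS):
        return "detailed"
    return None


def update_preference(current: str, user_turn: str) -> str:
    """Change the stored value only when the user states a preference."""
    return detect_length_preference(user_turn) or current


preference = "default"
for turn in ["Can you keep it short?", "Thanks, that helps."]:
    preference = update_preference(preference, turn)
print(preference)
```

The stored value persists across turns that say nothing about length, which keeps the adaptation tied to what the user actually asked for.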

A tone that never shifts becomes defined moment labels, a lightweight moment classifier or explicit flow state, an injected moment card, logged moment selection, and moment-fit evals. Acceptance criterion: the selected moment matches the expected moment in at least eighty-five percent of cases.

Closing note

A personality prompt is easy to write and hard to operate. A personality harness is harder to write and easier to improve. The point of the session is not to make the system more complex. It is to make the team more precise. When personality is a prompt, teams argue about tone. When personality is a harness, they can inspect behavior, assign ownership, write tickets, test regressions, and improve the system over time.

The thinking behind it

Steering the personality of a conversational tool

Want hands-on guidance?

Run the workshop with your team, then reach out if you want us to pressure-test the work or help steer the build.
