Why AI clients beat role-play with classmates

Classroom role-play is the default training infrastructure for CBT skill development, and it is doing some useful work. It is also doing less than the curriculum tends to claim. Here is what role-play actually delivers, where it hits a ceiling, and why AI clients sit closer to the kind of practice the deliberate-practice and fidelity literatures point to.

27 March 2025 · 8 min read

Every CBT trainee can recall the specific texture of classroom role-play. Two of you in a corner of the seminar room, one playing therapist, one playing client, the rest of the cohort watching with the slightly stretched politeness of people who have been told this is an important part of their training. The "client" — your friend, your study partner, the person you had coffee with an hour ago — is doing their best, but the presentation drifts in and out of focus. They had a few minutes to read the case vignette. They are not, in any meaningful sense, the person described in it.

Halfway through your Socratic sequence, the "client" has a moment of clarity that arrives suspiciously early. They recognise the cognitive distortion you were guiding them toward, partly because they know roughly where the exercise is supposed to go. You both move on, slightly relieved. The exercise is doing something useful. It is not doing what the curriculum tends to imply it is doing.

What classmate role-play actually delivers

It is worth being precise about what role-play, in its standard classroom form, does well. It is not nothing. The honest list of what it provides is roughly this:

Low-stakes rehearsal. A trainee who has never run a Socratic sequence aloud before benefits from running one in a room where no real client is depending on the result. The activation cost of trying — saying the words, hearing the rhythm, noticing what happens to your own attention when the conversation goes somewhere you did not plan — is exactly what role-play lowers.

Social calibration. Trainees get to observe how they appear to another person while attempting a clinical task. The cohort and the tutor see things the trainee cannot see from inside their own performance. Tone, pace, the small therapy-school mannerisms that have begun to colonise their speech — all of these become visible.

A chance to be observed at all. Outside of role-play, supervised observation of skill performance is rare in the early stages of training. The classroom is one of the few settings where someone other than the trainee will actually watch them attempt the skill and respond.

These are real contributions. They are also limited in a specific and structurally important way: they do not deliver the conditions under which CBT technique is actually built into practice. They deliver the conditions under which CBT technique is introduced. Those are different things.

Where the ceiling is

The ceiling on classmate role-play comes from three constraints that are inherent to the format, not failures of any particular cohort.

The classmate cannot stay in role. Sustained, plausible psychiatric presentation is hard work. A friend doing their best can hold the surface of an OCD presentation for ten minutes; they cannot replicate the genuine grip of intolerance of uncertainty across forty minutes of session, nor sustain the avoidance pattern that gives the work its actual shape. When the trainee fumbles a behavioural experiment, the classmate's natural response is to help — to soften, to fill in. This is kind. It is also the opposite of what a real client presentation does.

The presentation does not push against the trainee's edge. The deliberate-practice framework, explored at greater length in the deliberate practice article, is explicit that skill development requires practice at the edge of current ability. Classmate role-play is calibrated by what the classmate can perform, not by what the trainee needs to drill. If the trainee needs forty repetitions of guided discovery with a client who genuinely will not relinquish the safety behaviour, the classmate will not provide that. They will, after the second or third attempt, either give up the behaviour because the conversation has moved on or commit to it so theatrically that the rehearsal becomes a different exercise entirely.

The exercise is single-use. Each role-play is consumed in the running of it. The classmate cannot be reset to the beginning of the session for a second attempt at the same Socratic sequence with the same setup. They cannot present the same difficulty consistently next Tuesday. The trainee gets one pass at each exercise, which is the opposite of how technique is consolidated.

The combined effect is that classmate role-play sits in the same category as a music student playing a piece in front of family. There is value in it, but it is not the kind of structured, repeated, edge-of-ability drilling that turns a competent reading of the music into a fluent performance.

Behavioural rehearsal as a fidelity instrument

There is a thread in the measurement literature that is worth pulling here, because it changes the framing of what role-play is for.

Bearman and colleagues' 2022 randomised trial (PubMed: 36229116) compared methods of measuring therapist adherence to a manualised CBT protocol — direct observation of sessions, structured behavioural rehearsal of techniques outside session, and therapist self-report. Two findings matter for the role-play conversation. First, therapist self-report systematically overestimated adherence relative to direct observation — the same self-assessment bias documented across the therapist drift literature. Second, behavioural rehearsal aligned with direct observation as a fidelity measurement approach. Asking a therapist to demonstrate a technique in a structured, observed rehearsal task captured something close to what the same therapist did in genuine sessions.

The implication is more interesting than it first looks. Behavioural rehearsal is not a degraded version of real practice; it is a measurement context that approximates real practice closely enough to be diagnostically useful — closer than the therapist's own self-report. That is the family of activity classroom role-play belongs to, and the family AI client work belongs to. The question becomes which kind of rehearsal infrastructure approximates real practice most usefully for skill development. Classmate role-play sits at one end of the spectrum: low fidelity, low repeatability, high variability in what the trainee actually gets to drill. Structured, sustained, repeatable rehearsal sits closer to the other end.

What AI clients can actually do

It is worth being honest about what AI clients are and are not. They are not real clients. The therapeutic alliance with a sophisticated language model is a different thing from the therapeutic alliance with a person whose life is genuinely difficult, and the latter is the actual subject matter of clinical training. Anyone selling AI clients as a placement substitute is selling something the technology does not deliver.

What AI clients can deliver, however, sits squarely inside the gap the role-play format leaves open.

Coherence across the session. A well-constructed AI client can hold a presentation across forty minutes without drifting out of role, without softening the difficulty when the trainee struggles, and without offering the trainee the conveniently-timed insight that ends the exercise prematurely. The client stays in role because there is no social cost to doing so. Trainees who have only experienced classmate role-play often find their first AI-client drills disorienting precisely because of this — the client does not help.

Consistent difficulty across repetitions. The same trainee can run the same imaginal exposure setup ten times across a fortnight, with the client's presentation calibrated to the same level of avoidance each time. The trainee's performance becomes the variable; the client's difficulty does not. This is the condition under which technique consolidates.

Reset and retry. The trainee who fumbles the rationale for a behavioural experiment, recognises mid-sentence that they have lost the client, and wants to start again can do so. The exercise does not have to be completed in one pass. Classmate role-play structurally cannot offer this; AI clients can.

Structured feedback against a competency framework. A drill that ends with a rating referenced to the Cognitive Therapy Scale-Revised (CTS-R), covering agenda setting, guided discovery, application of change methods, and the specific items the trainee was working on, gives the trainee the corrective information the deliberate-practice framework requires. Without that feedback, the trainee is left with their own impression of how the drill went, which is the self-assessment problem that the field has documented at length.

Availability outside of structured cohort time. Classroom role-play happens when the timetable allows. AI client work happens when the trainee has thirty minutes between supervision and a placement shift. The infrastructure is present whenever the practice would otherwise be deferred.

What AI clients cannot do

Honesty about the limits matters here, because the value of AI client work depends on positioning it correctly.

The therapeutic alliance with a real person. A client whose distress is genuine, whose history is not a vignette, and whose wellbeing depends in part on what happens in the session is a different category of relational task. AI client work does not rehearse this directly. It can rehearse the technique that the alliance carries; it cannot rehearse the alliance itself.

The unpredictability of genuine presentations. Real clients do not arrive with their formulation pre-baked. They tell their story out of order. They contradict themselves. They bring presentations that do not quite fit the categories the trainee has prepared for. AI clients can present complex cases, but they are still presenting from within the framework the trainee is being trained in. The genuinely off-protocol moment — the safeguarding disclosure, the presentation that turns out not to be what the referral suggested, the relational rupture that has nothing to do with the technique being practised — sits outside the rehearsal context.

The judgement calls that matter most. Safeguarding decisions, risk assessment under genuine ambiguity, the decision about when not to deliver an intervention because the clinical context has changed — these are clinical judgements built through supervised real-world practice and reflective work on real cases. AI clients are not a substitute for that strand of training.

The honest position is that AI clients are training-wheel infrastructure. They sit alongside placement, supervision, and reflective practice — not in place of any of them. They handle the specific component of training that the existing infrastructure leaves underdone: the high-volume, repeatable, edge-calibrated, feedback-loaded drilling of specific techniques. That is the missing layer the deliberate-practice literature keeps pointing to, and it is the layer classmate role-play is structurally unsuited to provide.

The honest comparison

Put side by side, AI clients and classmate role-play are less competitors than two complementary forms of rehearsal that do different jobs. Classmate role-play introduces the trainee to the felt sense of running a technique with another human in the room. That relational element is what makes it genuinely valuable as an introduction.

AI client work is what the introduction needs to lead to if technique is to consolidate into practice. It is the repetition layer — where the trainee runs the same Socratic sequence forty times across a month, drills imaginal exposure on a client whose intrusions do not soften out of politeness, and gets the same competency-referenced feedback on each attempt. It is the closest available analogue to the structured, repeatable, externally-evaluated practice the deliberate-practice literature identifies as the engine of skill development.

The trainees who will be most fluent in five years are not the ones who completed the most role-plays. They are the ones who built deliberate-practice habits early — who treated the drilling of specific techniques as the substance of their development rather than as enrichment around the edges of placement work. The infrastructure to make that habit easy to maintain matters more than the resolve to maintain it, which is the drift literature's general lesson.

Supervisia Train is built around this kind of practice.

The Train pathway gives trainees AI clients that stay in role across the whole session, that present the same difficulty consistently across repetitions, and that can be drilled at the edge of current ability without burning through classmate goodwill. Every drill is scored against the CTS-R, with trainer commentary that surfaces the specific technique-level patterns the trainee needs to work on next. The deliberate-practice infrastructure that the literature keeps recommending — and that classroom role-play structurally cannot provide — is built into how the platform runs.

Start free on the Train pathway →

References

Bearman, S. K., Bailin, A., Rodriguez, E. M., Bellovin-Weiss, S., Dorsey, S., Schoenwald, S. K. & Weisz, J. R. (2022). A randomized trial to identify accurate measurement methods for adherence to cognitive-behavioral therapy. Behavior Therapy, 53(6), 1207–1220. PubMed: 36229116.
Ericsson, K. A., Krampe, R. T. & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363–406. DOI: 10.1037/0033-295X.100.3.363.
Chow, D. L., Miller, S. D., Seidel, J. A., Kane, R. T., Thornton, J. A. & Andrews, W. P. (2015). The role of deliberate practice in the development of highly effective psychotherapists. Psychotherapy, 52(3), 337–345. DOI: 10.1037/pst0000015.
Walfish, S., McAlister, B., O'Donnell, P. & Lambert, M. J. (2012). An investigation of self-assessment bias in mental health providers. Psychological Reports, 110(2), 639–644. DOI: 10.2466/02.07.17.PR0.110.2.639-644.
Waller, G. & Turner, H. (2016). Therapist drift redux: Why well-meaning clinicians fail to deliver evidence-based therapy, and how to get back on track. Behaviour Research and Therapy, 77, 129–137. DOI: 10.1016/j.brat.2016.01.007. PubMed: 26752326.

Last updated: May 2026

See how Supervisia Train uses AI clients

Start free — no card required.
