Blog · Studio · 6 min read

Casting AI characters without the uncanny: identity anchors, in plain English

The reason most AI films fall apart in shot two is the same reason every AI film before Studio fell apart. It is a solved problem now.

Watch any AI-generated film made before 2026 and you will spot the bug within ten seconds. The lead in shot one has a long nose. The lead in shot two has a short one. In shot three the lead is a different person entirely, wearing the same coat. The audience laughs. The film is over.

This is the canonical failure mode of generative video. The model does not know that the woman in the kitchen and the woman in the car are the same woman. Every shot is a fresh casting call. Studio solves this with a mechanism called the identity anchor. The mechanism is older than the marketing copy makes it sound. It is also the difference between a film and a montage.

What an anchor actually is

An identity anchor is a single locked image of the character at canonical age, wardrobe, and lighting. Studio generates it during the Core stage and you approve it. After that, it is the reference the provider model sees on every shot involving that character.

Under the hood the anchor gets reduced to an embedding: a high-dimensional vector that captures face geometry, skin tone, and silhouette. The video model treats that vector as a low-weight bias during sampling. Not a copy-paste. A pull toward a known point in the space of possible faces.
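To make "low-weight bias" concrete, here is a minimal sketch of the idea in a toy vector space. Everything here is illustrative: the function name, the weight value, and the four-dimensional "face space" are assumptions for the demo, not Studio's actual implementation.

```python
import numpy as np

def biased_step(sample_embedding, anchor_embedding, weight=0.1):
    """Pull the current sample gently toward the anchor: a convex blend
    at each step, not a copy-paste of the anchor over the sample."""
    return (1 - weight) * sample_embedding + weight * anchor_embedding

# Toy demo in a 4-d "face space"
anchor = np.array([1.0, 0.0, 0.0, 0.0])   # locked identity
sample = np.array([0.0, 1.0, 0.0, 0.0])   # a fresh draw, off-identity

# Repeated low-weight pulls during sampling
for _ in range(20):
    sample = biased_step(sample, anchor, weight=0.1)

# The sample ends up close to the anchor without ever being replaced by it
print(np.linalg.norm(sample - anchor) < 0.2)  # prints True
```

Because the pull is weak at every step, the rest of the shot (lighting, angle, focus) is free to vary; only the identity converges.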

Why this is hard

Faces are the part of the image humans are most sensitive to. We notice a millimetre of difference in the distance between eyes. We notice a slightly off jawline. We notice when the lighting reveals the model is guessing.

The naive approach, which most early tools tried, is to paste the anchor face onto a generated body. The result reads as a sticker. The light on the face does not match the light on the body. The audience sees the trick instantly. Studio's anchor sits inside the sampling process, not on top of it. The light, the angle, the focus all come out coherent because the face is generated as part of the same shot.

Where the bible takes over

The anchor handles geometry. It does not handle wardrobe, voice, mannerism, age range, or emotional palette. Those live in the Character Bible. The bible travels with every prompt the same way the anchor does; together they are enough to make the audience read one performance instead of seven.
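One way to picture "the bible travels with every prompt" is as structured data prepended to each shot. The fields and the assembly function below are hypothetical, chosen to match the traits listed above; Studio's real schema is not public.

```python
from dataclasses import dataclass

@dataclass
class CharacterBible:
    # Illustrative fields only, mirroring the traits the bible covers
    name: str
    age_range: str
    wardrobe: str
    mannerisms: str
    palette: str

def shot_prompt(action: str, bible: CharacterBible) -> str:
    """Prepend the bible to every shot prompt, so the model never
    has to remember the character between shots."""
    return (
        f"{bible.name}, {bible.age_range}, wearing {bible.wardrobe}, "
        f"{bible.mannerisms}. {action}"
    )

lead = CharacterBible(
    name="Mara",
    age_range="early 40s",
    wardrobe="navy wool coat, single-breasted, knee-length",
    mannerisms="keeps her hands in her pockets",
    palette="guarded, dry humour",
)

print(shot_prompt("She unlocks the car in the rain.", lead))
```

The point of the sketch: the character description is project-level state, and every shot prompt is derived from it rather than retyped, which is what keeps seven shots reading as one performance.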

Series users get this for free across episodes. The bible is shared at the series level, not per-film. New episodes inherit the cast wholesale, which means the lead in episode six still looks like the lead in episode one without any extra work.

The bible is also where the voice lives. Studio binds an ElevenLabs voice profile to each character. Once bound, every dialogue line generated for that character routes through the same voice. The audience does not have to consciously match the voice to the face. The system has already done it for them.

Where it softens

Anchors are not perfect. In extreme profile, in very low light, and across large age changes, the model gets more freedom and the consistency loosens. Studio surfaces a confidence indicator on those shots. If you do not like the result, re-roll. Almost always the next sample is fine.

If a character is canonically supposed to age twenty years between scenes, that is what a re-cast is for. The Cast view lets you regenerate the anchor for a specific scene range. The bible carries through; only the geometry shifts.

The other failure mode is wardrobe drift. The anchor pins the face. It does not pin the coat. If the coat matters (and in most films the coat matters), put it in the bible explicitly. Navy wool, single-breasted, knee-length. The model will hold the description across cuts the same way it holds the face.

Why this is the canonical failure mode

Every AI-video tool before Studio either ignored character continuity or solved it badly. Pure prompt tools relied on hope. Sticker-based tools relied on lighting nobody scrutinises. Reference-image tools solved one shot at a time and called it a feature.

Studio is the first system that treats identity as a persistent property of the project, not a per-shot prompt. That is the whole jump. Once the anchor is in place, every downstream stage of the pipeline just works as expected. Sequence keyframes look like the same person. Timeline blocks look like the same person. Render exports a film, not a casting reel.

The bug was always solvable. It just required treating the problem as architectural rather than as a prompt-engineering challenge. Once we stopped trying to remind the model on each shot and started attaching the constraint to the project itself, the uncanny went away. Not perfectly. Honestly enough that audiences stop noticing. That is what cinema needs.

Try it

Make a film.
