AI Product · Government · FCDO

Translation Intelligence

When the machine isn't sure, it should say so — and hand the hard call to a human expert.

Client

FCDO, via Faculty AI

My role

Senior Product Designer (UX)

Discipline

Human-in-the-loop UX

Outcome

Uncertainty made visible

Side-by-side translation view with confidence score and alternative interpretations

Structured

One shared workflow replaces scattered handoffs

Visible

Confidence and ambiguity shown, not hidden

Human-led

Experts validate; AI assists

The problem

A single confident answer can be the dangerous one.

Intelligence analysts work constantly with multilingual content — voice messages, documents, chat threads. AI translation is fast, but on ambiguous material it can be confidently wrong, and that risk is unacceptable here. The challenge was to design a workflow that combined AI translation with human expertise while cutting the manual coordination between analysts and linguists.

Discovery

The process was the bottleneck.

Stakeholder discussions and workflow analysis made one thing clear: the existing process was fragmented and depended on manual coordination. An analyst received content, contacted a linguist, waited, then chased. Review happened over email and side channels, with no shared view of status or history.

Analyst receives intelligence content
Analyst contacts a linguist, off-platform
Linguist reviews — context scattered across messages
Result returns manually, with no shared record

Upload view: analysts upload content directly; the system translates it, detects uncertainty, and scores confidence.

Design · triage

Make confidence the first thing you see.

Analysts and linguists both review large volumes of content, so the dashboard leads with confidence tags — a fast way to spot translation risk and prioritise where review effort should go. High-confidence items move on; uncertain ones get flagged for a closer look.

Analyst dashboard: confidence tags surface risk at a glance.
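
As a rough illustration of the triage rule described above, the sketch below maps a confidence score to a dashboard tag and builds a review queue with the least confident items first. The thresholds, the three tag names, and the `Item`, `confidence_tag` and `triage` names are all assumptions made for this example; the case study does not state how the real platform scores or buckets confidence.

```python
from dataclasses import dataclass

# Hypothetical cut-offs: the case study does not give the real values.
HIGH_CONFIDENCE = 0.85
LOW_CONFIDENCE = 0.60


@dataclass
class Item:
    item_id: str
    confidence: float  # assumed to be a score between 0.0 and 1.0


def confidence_tag(confidence: float) -> str:
    """Map a score to the (assumed) tag shown on the dashboard."""
    if confidence >= HIGH_CONFIDENCE:
        return "high"
    if confidence >= LOW_CONFIDENCE:
        return "medium"
    return "low"


def triage(items: list[Item]) -> list[Item]:
    """Return the items that need a closer look, least confident first."""
    flagged = [item for item in items if confidence_tag(item.confidence) != "high"]
    return sorted(flagged, key=lambda item: item.confidence)


if __name__ == "__main__":
    inbox = [Item("a", 0.95), Item("b", 0.40), Item("c", 0.70)]
    for item in triage(inbox):
        print(item.item_id, confidence_tag(item.confidence))
```

Sorting the flagged items by ascending score is one possible way to direct review effort to the riskiest translations first; high-confidence items simply never enter the queue.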

Design · the artifact

Original, translation, confidence, alternatives — together.

The core view puts everything a person needs to judge a translation in one place: the original, the AI translation, a confidence score, and the alternative interpretations the model considered. Uncertainty is shown, not smoothed over — so the reader can weigh it instead of trusting it blindly.

Linguist review queue: low-confidence translations escalate here, with full context attached.
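
One way to picture the core view is as a single record that keeps the original, the translation, the confidence score and the alternative readings together, so none of them can be shown without the others. The sketch below is illustrative only: the `TranslationRecord` and `Alternative` types, their field names, and the sample score are assumptions, not the platform's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Alternative:
    text: str
    note: str = ""  # short reason this reading differs from the main one


@dataclass
class TranslationRecord:
    original: str
    translation: str
    confidence: float  # assumed to be a score between 0.0 and 1.0
    alternatives: list[Alternative] = field(default_factory=list)

    def is_ambiguous(self) -> bool:
        """True when at least one alternative reading was recorded."""
        return bool(self.alternatives)

    def summary_lines(self) -> list[str]:
        """Plain-text lines for a side-by-side view, alternatives included."""
        lines = [
            f"Original:    {self.original}",
            f"Translation: {self.translation}",
            f"Confidence:  {self.confidence:.0%}",
        ]
        for alt in self.alternatives:
            lines.append(f"Alternative: {alt.text} ({alt.note})")
        return lines


if __name__ == "__main__":
    # Spanish "banco" can mean either "bank" or "bench".
    record = TranslationRecord(
        original="Nos vemos en el banco.",
        translation="See you at the bank.",
        confidence=0.55,  # illustrative value only
        alternatives=[Alternative("See you at the bench.", "'banco' is ambiguous")],
    )
    print("\n".join(record.summary_lines()))
```

Keeping the alternatives on the same record as the translation means any view built from it has the uncertainty to hand, rather than having to fetch it separately.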

Design · expert review

Give the expert the original signal.

When an item is escalated, the linguist receives the original content, the AI translation, supporting context, and the exact phrases flagged for review. Many sources start as voice messages, where meaning rides on tone, emphasis and dialect — so linguists can play the original audio to validate both the transcription and the translation, not just the text.

Linguist review: audio playback with synchronised subtitles, validating meaning at the source.

Design · shared workspace

One workspace instead of a chain of handoffs.

The platform replaces external back-and-forth with a shared workspace. Analysts and linguists see the same context, translation history and review status; once a linguist validates or edits a translation, the update flows back to the analyst automatically. No chasing, no lost threads.

The goal was never to replace human expertise. It was to make uncertainty visible — so the right cases reach the right experts.

Impact

Uncertainty, out in the open.

The platform turned a fragmented review process into a structured, AI-assisted workflow. By pairing automated translation with expert validation, analysts processed content more efficiently while keeping confidence in the final interpretation. Most importantly, uncertainty became something the system shows — through confidence scores, alternative readings, and expert checkpoints — rather than something hidden behind a single, plausible answer.
