AI Product · Government · FCDO

Translation Intelligence

When the machine isn't sure, it should say so — and hand the hard call to a human expert.

Client: FCDO, via Faculty AI
My role: Senior Product Designer (UX)
Discipline: Human-in-the-loop UX
Outcome: Uncertainty made visible
Side-by-side translation view with confidence score and alternative interpretations
Structured: One shared workflow replaces scattered handoffs
Visible: Confidence and ambiguity shown, not hidden
Human-led: Experts validate; AI assists
The problem

A single confident answer can be the dangerous one.

Intelligence analysts work constantly with multilingual content — voice messages, documents, chat threads. AI translation is fast, but on ambiguous material it can be confidently wrong, and that risk is unacceptable here. The challenge was to design a workflow that combined AI translation with human expertise while cutting the manual coordination between analysts and linguists.

Discovery

The process was the bottleneck.

Through stakeholder discussions and workflow analysis, one thing became clear: the existing process was fragmented and depended on manual coordination. An analyst received content, contacted a linguist, waited, and chased. Review happened over email and side channels, with no shared view of status or history.

  • Analyst receives intelligence content
  • Analyst contacts a linguist, off-platform
  • Linguist reviews — context scattered across messages
  • Result returns manually, with no shared record
Analysts upload content directly — the system translates, detects uncertainty, and scores confidence
Design · triage

Make confidence the first thing you see.

Analysts and linguists both review large volumes of content, so the dashboard leads with confidence tags — a fast way to spot translation risk and prioritise where review effort should go. High-confidence items move on; uncertain ones get flagged for a closer look.
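
The triage rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the platform's implementation: the case study does not state a threshold or a score range, so `REVIEW_THRESHOLD`, the 0 to 1 scale, and the names `TranslatedItem`, `needs_review` and `triage` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the case study does not state a threshold.
REVIEW_THRESHOLD = 0.80


@dataclass
class TranslatedItem:
    item_id: str
    confidence: float  # assumed to be a score between 0 and 1


def needs_review(item: TranslatedItem, threshold: float = REVIEW_THRESHOLD) -> bool:
    """An item is flagged when its confidence falls below the threshold."""
    return item.confidence < threshold


def triage(items, threshold: float = REVIEW_THRESHOLD):
    """Split items into (flagged, cleared).

    Flagged items are sorted least-confident first, so review effort
    goes to the riskiest translations before anything else.
    """
    flagged = sorted(
        (i for i in items if needs_review(i, threshold)),
        key=lambda i: i.confidence,
    )
    cleared = [i for i in items if not needs_review(i, threshold)]
    return flagged, cleared
```

Sorting the flagged items by ascending confidence mirrors the dashboard's intent: the least certain translations surface first, while high-confidence items move on without a review step.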

Analyst dashboard — confidence tags surface risk at a glance
Design · the artifact

Original, translation, confidence, alternatives — together.

The core view puts everything a person needs to judge a translation in one place: the original, the AI translation, a confidence score, and the alternative interpretations the model considered. Uncertainty is shown, not smoothed over — so the reader can weigh it instead of trusting it blindly.

Low-confidence translations escalate to the linguist review queue, with full context attached
Design · expert review

Give the expert the original signal.

When an item is escalated, the linguist receives the original content, the AI translation, supporting context, and the exact phrases flagged for review. Many sources start as voice messages, where meaning rides on tone, emphasis and dialect — so linguists can play the original audio to validate both the transcription and the translation, not just the text.

Audio playback with synchronised subtitles — validating meaning at the source
Design · shared workspace

One workspace instead of a chain of handoffs.

The platform replaces external back-and-forth with a shared workspace. Analysts and linguists see the same context, translation history and review status; once a linguist validates or edits a translation, the update flows back to the analyst automatically. No chasing, no lost threads.

The goal was never to replace human expertise. It was to make uncertainty visible — so the right cases reach the right experts.

Impact

Uncertainty, out in the open.

The platform turned a fragmented review process into a structured, AI-assisted workflow. By pairing automated translation with expert validation, analysts processed content more efficiently while keeping confidence in the final interpretation. Most importantly, uncertainty became something the system shows — through confidence scores, alternative readings, and expert checkpoints — rather than something hidden behind a single, plausible answer.