Podonos · OnePin · AI-native · 0 → 1 · Multilingual TTS · B2B / Enterprise · Agent + Human UX

the intelligent layer
for global voice.

Dozens of TTS models exist, but none can tell you which one sounds right in which language. OnePin is the smart layer that decides, so teams ship localized voice in hours, not days.

see it live → onepin.ai
10 days → hrs
End-to-end localized-audio pipeline, collapsed from ~10 days to a few hours (minutes for short scripts).
Pipeline speed
5 → 1
A five-step manual chain (engineer → bulk API → CSV → human eval → manual fix) replaced by one decision layer.
Workflow
9,000+
Throwaway audio files a team would generate to QA a single 3,000-line script: the over-production OnePin removes.
Waste eliminated
Case snapshot
Role

Founding Product Designer — sole designer, 0→1

Team

10 people — founder (voice researcher), engineering, me on design

Timeline

Q1–Q2 2026 · incl. 3 weeks on-site in Seoul

Tools

Figma · Figma Make · Claude Code & Python

What I did

As founding product designer (0→1), I defined OnePin, an AI voice-producer layer over any TTS, and owned research, platform definitions, the chat-agent + node interface, and workflow automation.

Impact

Collapsed a ~10-day localized-audio pipeline to hours, replaced a five-step manual chain with one decision layer, and removed 9,000+ throwaway QA audio files per script.

Constraints

A 0→1 product with no precedent, built to sit on top of any third-party TTS model across many languages.

01 · Context

language and voice
go hand in hand.

English TTS is mature; every other language degrades on translation, and buyers aren't multilingual enough to catch it. OnePin knows which model wins for each language, so businesses can scale to markets they couldn't serve before.

Platform

AI voice producer on top of any TTS

Users

AITubers · podcasters · enterprise broadcasters

Born from

a real gap Podonos' own customers felt

Now live

40+ languages · 100+ TTS providers · onepin.ai

02 · Problem

nobody knew if the
voice was even right.

01

Can't judge quality

No shared way to tell if an AI voice even sounds human.

02

Which API, which language?

Every API wins for a different language. Teams just guessed.

03

Defensive over-production

One 3,000-line script → ~9,000 files, all sent for slow human eval.

04

A 10-day pipeline

Five hand-offs before a single line ships.

the signal · interviews + reddit
which API works best for Spanish?
AITubers · Podcasters · Enterprise broadcasters
Asked over and over, and every interview confirmed the same 10-day reality.
03 · The shift

from a 10-day chain
to one layer.

A football match in English, headed for a global audience. Before vs. after OnePin.

↳ Before - the manual chain
1. Script handed to the sound/audio engineer
2. Run the API, bulk-produce output (3,000 lines → ~9,000 files)
3. Save the .csv, ship it for human evaluation
4. Wait for the evaluation report
5. Audio producer manually corrects the final cut
~10 days
↳ After - OnePin
Drop in the script
OnePin picks the best TTS model per language
Produces and evaluates, no defensive bulk runs
Right message, right audience, ready to ship
hours → minutes
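
The "picks the best TTS model per language" step can be pictured as a lookup over per-language quality scores. The sketch below is illustrative only: the model names, the scores, and the `pick_model` helper are all hypothetical and say nothing about how OnePin actually ranks providers.

```python
# Hypothetical per-language quality scores on a 0-5 scale.
# Model names and numbers are invented for illustration.
SCORES = {
    "es": {"model_a": 4.6, "model_b": 4.1, "model_c": 3.8},
    "ko": {"model_a": 3.9, "model_b": 4.4, "model_c": 4.0},
}


def pick_model(language: str, scores: dict = SCORES) -> str:
    """Return the name of the highest-scoring model for a language."""
    candidates = scores.get(language)
    if not candidates:
        raise KeyError(f"no scored models for language {language!r}")
    # max() over the dict keys, ranked by each model's score
    return max(candidates, key=candidates.get)


print(pick_model("es"))  # model_a
print(pick_model("ko"))  # model_b
```

The point of the sketch is that the winner changes with the language, which is why a single "best" API cannot be chosen once and reused everywhere.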
04 · Process

designed for humans,
and for agents.

Straight to the people living the pain, then prototype fast and lock direction in a month.

voices from research

I can run ten different TTS APIs, but I still can't tell which one actually sounds right for Spanish.
AITuber · multilingual content
We over-produce everything defensively — thousands of files — just so a human can catch the bad takes later.
Audio producer · broadcaster
By the time a single line is approved, it's been through five hand-offs and about ten days.
Podcaster · production lead
01 Customer interviews + Reddit signal
02 Concepts in Figma Make
03 3 weeks on-site in Seoul
04 Platform definitions, ground up
05 Alpha vs. beta feature priority
06 90-day go-to-market plan

A chat-agent + node interface

Designed so a human or an agent can drive production, not just a person clicking through screens.

just another TTS dashboard

A human-only tool caps the product: it can't plug into the pipelines our customers are already automating.

reframe the question

Stopped asking "is this voice good?" and started asking "does it meet these testable criteria?", turning taste into a rubric a team and an agent could both act on.

artifact scoring rubric v1 · with founder + eng
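
One way to picture "testable criteria" is a rubric of named thresholds that a person or an agent can check the same way. This is a minimal sketch, not the actual scoring rubric v1: the `Criterion` class, the `evaluate` helper, the 0-5 scale, and every threshold below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    minimum: float  # lowest acceptable score, on an assumed 0-5 scale


# Hypothetical rubric: criterion names echo the validators mentioned in
# this case study, but the thresholds are placeholders.
RUBRIC = [
    Criterion("naturalness", 4.0),
    Criterion("accent", 4.0),
    Criterion("pronunciation", 4.5),
]


def evaluate(scores: dict, rubric: list = RUBRIC) -> list:
    """Return the names of the criteria a clip fails (empty list = pass).

    A criterion with no score is treated as failed.
    """
    return [c.name for c in rubric if scores.get(c.name, 0.0) < c.minimum]


clip = {"naturalness": 4.6, "accent": 4.2, "pronunciation": 4.1}
print(evaluate(clip))  # ['pronunciation']
```

Returning the failed criteria, rather than a single pass/fail flag, keeps the judgment explainable: the caller can see which part of the bar a clip missed.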

prototype the decision layer

Designed one surface that picks the model and explains why, making the machine's judgment legible instead of burying it in charts.

explored agent + node UI · tool Figma → coded

validate against the manual chain

Tested the layer against the real 10-day workflow on live multilingual scripts; the decision surface collapsed it to hours.

baseline 10-day chain · result hours → minutes

ship the system, not screens

Wrote platform definitions from scratch and automated the manual steps as one-command skills, so a human or an agent can run a production end-to-end.

automated workflow steps · built with Claude Code + Python
05 · Designing the judgment

teaching a machine
to hear taste.

OnePin is, at its core, an opinion about AI output. My job was to turn "is this voice good?", a gut call, into a flow a team and an agent could run the same way every time. It's node-based with a chat agent on top, so you can wire the steps by hand or just describe what you want; the same judgment runs underneath either way.

in
Script or voice note
Drop in text, a script, or a voice note: the raw material to localize.
correct
Fix the audio
Phoneme injection for tricky words, normalization for numbers & symbols, and boosting to even out delivery.
validate
Score it against criteria
Run naturalness, accent & pronunciation validators, plus a human-in-the-loop check. No more "sounds fine to me."
out
Ship what you chose
Pick the corrections & validators that matter for this job; OnePin returns the output that meets them.

Instead of one hidden, one-size-fits-all pipeline, I exposed every correction and validator as a node the user, or an agent, can switch on or off. An AITuber and an enterprise broadcaster don't share the same bar for "good", so the judgment had to be tunable, not baked in.
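
The idea of nodes that can be switched on or off can be sketched as a list of steps, each with an `enabled` flag. Everything here is a toy under stated assumptions: the `Node` class, `run_pipeline`, the one-entry lexicon, and the text-only stand-ins for phoneme injection and normalization are hypothetical, and the real corrections work on audio rather than strings.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical respelling lexicon; the entry is invented for illustration.
LEXICON = {"Xavi": "CHAH-vee"}


def inject_phonemes(text: str) -> str:
    """Toy stand-in: swap tricky words for a respelling."""
    for word, respelling in LEXICON.items():
        text = text.replace(word, respelling)
    return text


def normalize_symbols(text: str) -> str:
    """Toy stand-in: spell out two symbols."""
    return text.replace("&", "and").replace("%", " percent")


@dataclass
class Node:
    name: str
    run: Callable[[str], str]
    enabled: bool = True


def run_pipeline(text: str, nodes: list) -> tuple:
    """Apply the enabled nodes in order; return the result and which nodes ran."""
    ran = []
    for node in nodes:
        if node.enabled:
            text = node.run(text)
            ran.append(node.name)
    return text, ran


nodes = [
    Node("phoneme injection", inject_phonemes, enabled=False),
    Node("normalization", normalize_symbols),
]
print(run_pipeline("Xavi scores & leads", nodes))
# ('Xavi scores and leads', ['normalization'])
```

Because each step is plain data with a flag, a person toggling nodes in a graph and an agent editing the same list end up driving one and the same pipeline.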

Selectable corrections & validators

The user chooses which fixes and which quality checks run, so the same engine serves a casual creator and a broadcast team without compromise.

one fixed, black-box pipeline

A hidden default forces everyone to trust the same opinion of "good", and gives an agent nothing to reason about.

06 · Solution

the smart layer,
made visible.

One surface where a human, or an agent, drives a production end to end: describe the job in chat on the left, watch it resolve as a live node graph on the right, script in, corrected, validated, and ready to ship.

OnePin · production · library · settings
agent
Localize this match commentary to Spanish, broadcast quality.
On it. Routing to the best Spanish model and injecting phonemes for the player names so they're said right.
decision · why this model
Model: ES · top-ranked
Naturalness: 4.6 / 5
Accent: neutral LatAm ✓
Pronunciation: names fixed ✓
2 clips fell below the bar, I've flagged them for a human check before they ship.
Make the accent neutral LatAm…
in
Script · ES
script · voice note
correct
Fix the audio
phoneme injection · normalization · boosting
validate
Score vs. criteria
naturalness 4.6 · accent ✓ · pronunciation ✓ · human review
out
Localized audio
ready to ship · 2 flagged

representative recreation of the OnePin interface · product details redacted ahead of the August 2026 launch

Whiteboard sketch of the OnePin flow: input, translate, model, validate, errors → notify — Seoul on-site
Whiteboard working through audio normalization and correction examples — Seoul on-site

behind the build · whiteboarding the OnePin pipeline, 3 weeks on-site in Seoul

07 · Impact

what it changes,
three ways.

For the user
Days → hours

Validated against the real 10-day manual chain on live multilingual scripts: localized audio in hours, no API guesswork, no defensive bulk runs.

For the business
Eval, as a product

Quality judgment that used to live in one engineer's head becomes a layer any team, or agent, can run, opening markets the manual effort never justified.

For the org
Agent-native 0→1

Platform definitions from scratch; the manual workflow automated so the steps disappear, built and shipped as a real system.

built & shipped · public launch August 2026

"OnePin will change the way voice works. No more robotic delivery, mangled pronunciation, or off accents, just voice that sounds genuinely, effortlessly human."
- The vision behind OnePin
08 · Beyond the Figma file

I shipped the system, not just the screens.

Platform definitions written from scratch, a clear alpha-vs-beta line, and the manual workflow steps automated with Claude Code & Python, turned into one-command skills. A human or an agent can run a production end to end.

The interface was built to be legible for both audiences — accessible contrast and focus states for people, predictable structure and clear labelling for agents — so judgment stays usable however the work runs.