Essay · 4 min read

Voice profiles are just markdown files

Tonecast's whole pitch is that it learns how you write -- per channel, per contact -- and drafts replies that sound like you. Which raises an obvious question: where does that learning live?

For most AI products the answer is embeddings in somebody else's database. Your writing style gets chopped into vectors, uploaded, and indexed. You can't read it, you can't correct it, and "delete my data" means filing a request and hoping.

Tonecast's answer is a folder of markdown files on your Mac.

The whole "database"

Open ~/Library/Application Support/Tonecast/voices/ and you're looking at everything Tonecast knows about your voice. Your WhatsApp voice is self/whatsapp.md. Your Slack voice is self/slack.md. Email profiles are scoped per Gmail account, and every contact you write to regularly gets their own file under contacts/.
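If you would rather see that layout from a script than from Finder, here is a minimal sketch. It assumes only the folder path quoted above and that profiles end in .md; the function name is mine, not part of Tonecast.

```python
from pathlib import Path

# Path quoted in the essay; expanduser() resolves the leading "~".
VOICES_DIR = Path("~/Library/Application Support/Tonecast/voices").expanduser()


def list_profiles(root: Path = VOICES_DIR) -> list[str]:
    """Return every markdown profile under root as a sorted relative path."""
    if not root.is_dir():
        # No folder yet (or not on this machine): nothing has been learned.
        return []
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.md"))
```

Calling list_profiles() with no arguments reads the real folder; passing another directory lets you try it on a copy.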

Each file is YAML frontmatter plus seven observations:

---
channel: email
updated: 2026-06-28
messages_analyzed: 47
version: 6
---

Greeting: "Hi {first name}," -- drops to "Hey" on reply threads
Sign-off: "Best," plus first name; "Cheers" with close colleagues
Length: 2-4 sentences, single paragraph
Tone: direct, warm, lightly informal
Structure: answer first, context second; one ask per email
Avoids: exclamation marks, corporate filler, emoji, "Hope you're doing well"
Quirks: em dashes mid-sentence, lowercase "ok", numbered lists past three items

That's not a simplified illustration. That's the format. Greeting, sign-off, length, tone, structure, avoids, quirks -- the same seven fields the app validates before it accepts a generated profile.
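Because the format is that regular, reading it back takes very little code. The sketch below is my own illustration, not Tonecast's validator: it assumes the frontmatter is flat "key: value" lines (so it skips a real YAML parser) and that each observation sits on one line starting with its label.

```python
from dataclasses import dataclass

# The seven field labels named in the essay, in the order shown above.
FIELDS = ("Greeting", "Sign-off", "Length", "Tone", "Structure", "Avoids", "Quirks")


@dataclass
class VoiceProfile:
    meta: dict          # frontmatter key/value pairs, kept as strings
    observations: dict  # one entry per field label


def parse_profile(text: str) -> VoiceProfile:
    """Split a profile into frontmatter and its seven observations."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("profile must start with a '---' frontmatter line")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is never closed") from None

    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()

    observations = {}
    for line in lines[end + 1:]:
        # Split on the first colon only, so colons inside quotes survive.
        label, sep, value = line.partition(":")
        if sep and label.strip() in FIELDS:
            observations[label.strip()] = value.strip()

    missing = [f for f in FIELDS if f not in observations]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return VoiceProfile(meta, observations)
```

A profile that drops a field fails loudly instead of being half-accepted, which is the behaviour the "same seven fields" rule implies.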

Under a hundred words, on purpose

The analysis prompt is deliberately strict. A freshly generated profile has to come in under a hundred words (incremental updates get two hundred, and the validator rejects anything outside that range). Every claim needs at least three supporting instances in your actual messages -- see something once and it's an occurrence, not a pattern. The model is told to quote your exact phrases rather than paraphrase them into mush, and when there isn't enough data for a field, it writes [INSUFFICIENT] instead of guessing.
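Those rules are simple enough to write down as checks. This is a rough sketch under my own assumptions, not the app's code: words are counted as whitespace-separated tokens, both limits are treated as exclusive upper bounds, and "supporting instance" is approximated as a message containing the quoted phrase.

```python
NEW_PROFILE_LIMIT = 100   # fresh profiles: "under a hundred words"
UPDATE_LIMIT = 200        # incremental updates "get two hundred"
MIN_SUPPORT = 3           # a claim needs at least three supporting instances
INSUFFICIENT = "[INSUFFICIENT]"


def within_word_limit(body: str, is_update: bool = False) -> bool:
    # Assumption: whitespace-separated tokens, exclusive upper bound.
    limit = UPDATE_LIMIT if is_update else NEW_PROFILE_LIMIT
    return len(body.split()) < limit


def is_pattern(phrase: str, messages: list) -> bool:
    # Counts messages that contain the quoted phrase, ignoring case.
    hits = sum(1 for m in messages if phrase.lower() in m.lower())
    return hits >= MIN_SUPPORT


def insufficient_fields(observations: dict) -> list:
    # Labels the model declined to fill in rather than guess.
    return [label for label, value in observations.items()
            if value.strip() == INSUFFICIENT]
```

Seen twice, "Cheers" stays an occurrence; seen three times, it becomes a claim the profile is allowed to make.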

I like this constraint for a reason that has nothing to do with storage: a profile is a set of claims. Claims can be wrong, and wrong claims should be visible. "Signs off with 'Best,'" is something you can check with your own eyes and veto. A 1,536-dimension embedding of your sent mail is not.

There's also a small JSON sidecar per profile with a few coarse dials -- formality, warmth, brevity, emoji usage -- but the markdown file is the thing that actually gets injected into the prompt when Tonecast drafts for you. What you read is what the model reads.

Why files beat embeddings

Inspectable. Curious why Tonecast keeps drafting two-sentence replies? Open the file. Length: 2-4 sentences -- that's why. No support ticket can give you that answer faster than Quick Look can.

Editable. If a claim is wrong -- say it caught you on a bad week and decided you avoid exclamation marks -- change the line. The app re-reads the file; your edit is the fix. Try editing an embedding.

Deletable. Delete the file and that knowledge is gone. There's a Reset button per channel and a "Clear All Voice Data" button in Settings, and both do exactly what rm -rf would do, because that's all there is to do. No deletion request, no thirty-day queue, no "we may retain copies for legitimate business purposes."

Portable. Your voice took you years to develop. It shouldn't be hostage to my product. Markdown files come with you -- into a backup, into another tool, into a text editor in 2040.

What about the raw messages?

Honest answer: to learn patterns, Tonecast does buffer your sent messages locally first -- messages of ten words or more land in a small buffer file, capped at fifty messages per channel. Once enough accumulate, they're distilled into the profile and the analyzed messages are deleted from the buffer. The durable artifact is the hundred-word profile, not an archive of your mail.
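The buffer behaviour fits in a few lines. Treat this as a sketch, not the real thing: the class name is hypothetical, the buffer here lives in memory where the real one is a file, and dropping the oldest message once the cap is hit is my assumption.

```python
from collections import deque

MIN_WORDS = 10    # messages of ten words or more are buffered
BUFFER_CAP = 50   # at most fifty buffered messages per channel


class ChannelBuffer:
    def __init__(self) -> None:
        # Assumption: when the cap is reached, the oldest message is dropped.
        self._messages = deque(maxlen=BUFFER_CAP)

    def add(self, message: str) -> bool:
        """Buffer the message if it is long enough; report whether it was kept."""
        if len(message.split()) < MIN_WORDS:
            return False
        self._messages.append(message)
        return True

    def drain(self) -> list:
        """Hand the buffered messages to analysis and delete them from the buffer."""
        batch = list(self._messages)
        self._messages.clear()
        return batch

    def __len__(self) -> int:
        return len(self._messages)
```

The point of drain() is the second line: once messages have been analyzed, they are gone from the buffer, and only the profile remains.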

And all of it -- buffer, profiles, sidecars -- stays in that one folder on your Mac. Tonecast has no server that any of this syncs to. In BYOK mode the only network hop is the AI call itself, straight from your machine to your provider.

[Diagram: the whole data path. On your Mac, Tonecast holds your voice profiles (markdown files) and your API key, none of which leave the machine. In BYOK mode, requests go directly from your Mac to your AI provider using your key. Optionally, Tonecast Cloud sits in between with managed keys. Keys and profiles stay home.]

Contacts are diffs

One more detail I'm fond of. Per-contact email profiles aren't full profiles -- they're deltas against your channel baseline. The analysis only records what changes when you write to that person: "drops to just first name," "shorter, jokier." If nothing meaningfully differs, the profile literally says "No meaningful differences from baseline," and that's a valid, complete profile.

That mirrors how voice actually works. You don't have forty separate personalities; you have one voice with per-person adjustments. The data model should say so.
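In code, "one voice with per-person adjustments" might look like the sketch below. How Tonecast actually combines the two files when it builds a prompt isn't described here, so the function name and the "Adjustments for this contact" label are purely illustrative.

```python
from typing import Optional

NO_DIFF = "No meaningful differences from baseline"


def voice_context(baseline: str, contact_delta: Optional[str] = None) -> str:
    """Combine a channel baseline with an optional per-contact delta."""
    if contact_delta is None or contact_delta.strip().rstrip(".") == NO_DIFF:
        # An empty delta is a valid profile: the baseline stands on its own.
        return baseline.strip()
    # Assumption: the delta is appended under its own label so the
    # adjustments read as overrides of the baseline.
    return (f"{baseline.strip()}\n\n"
            f"Adjustments for this contact:\n{contact_delta.strip()}")
```

The baseline is never copied into the contact file, so fixing a line in the channel profile fixes it for every contact at once.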

Plain text won. It just took the AI industry a while to remember.

If you'd rather your voice lived in files you own than in vectors you can't see, join the early-access list.


Tonecast is built by Codefox AI. Questions, feedback, or just want to say hi? Email us at support@tonecast.ai.