How does local transcription compare to cloud APIs?

Local transcription is typically somewhat slower than cloud APIs but more private, with no per-use costs. Accuracy is comparable to cloud services when using the Medium or Large models.

Can I use it without internet?

Yes, once models are downloaded. Perfect for flights, remote locations, or secure environments.

What file formats are supported?

MP3, M4A, WAV, WEBM, OGG, FLAC, and most common audio/video formats. Video files are auto-converted to audio.
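A quick pre-flight check against the formats listed above might look like the sketch below. The `SUPPORTED_EXTENSIONS` set and the `is_supported` helper are hypothetical names for illustration, not part of the product; video containers are left out because the text does not enumerate them.

```python
from pathlib import Path

# Audio extensions copied from the answer above (assumed, not exhaustive).
SUPPORTED_EXTENSIONS = {".mp3", ".m4a", ".wav", ".webm", ".ogg", ".flac"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    # Compare case-insensitively so "Meeting.MP3" is accepted too.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

Files that fail this check would correspond to the UNSUPPORTED_FORMAT case and can be converted to MP3, WAV, or M4A first.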

How much disk space do I need?

Models range from 75MB (Tiny) to 1.5GB (Large). You only download what you need and can delete unused models anytime.

Does it work on all devices?

Not all of them. It needs reasonably modern hardware: at least 4GB RAM (8GB+ recommended) and a recent CPU or GPU. Mobile devices may be limited to the Tiny or Small models.

Can it handle poor audio quality?

Large models are more robust to background noise, but clear audio always improves results. Consider audio cleanup first for very noisy files.


Local Transcription

Convert audio to text locally on your device for private, offline transcription with no API costs

AI Service · Coming Soon

What It Does

Local Transcription processes audio files directly on your device without sending data to external servers. Using on-device AI models, it converts speech to text while keeping your data completely private and eliminating per-use costs. Perfect for sensitive content, offline use, or high-volume transcription needs.

On-Device Processing · Complete Privacy · Offline Capable · No API Costs · Multi-language

In a Nutshell

🔒

Private & Secure — all processing happens locally, no data leaves your device

⚡

Offline Ready — works without internet connection once models are downloaded

💰

Zero API Costs — unlimited transcriptions with no per-minute charges

🎯

Fast Processing — real-time or faster transcription on modern hardware

🌍

Multi-language — supports 50+ languages with downloadable models

Use Cases

Sensitive Content

Transcribe confidential meetings, legal recordings, or medical notes without cloud exposure

Offline Fieldwork

Capture interviews or field notes in remote locations without internet access

High-Volume Work

Process hours of content without worrying about API usage limits or costs

Personal Journaling

Convert voice memos and personal recordings to text privately

How to Use

Step 1

Download the model

First time only: download the model for your target language (75MB to 1.5GB, depending on model size).

Models are cached locally and reused for all future transcriptions.

Step 2

Upload or record audio

Provide an audio file or start a live recording through your device microphone.

Step 3

Select language and model

Choose the source language and model size (Small for speed, Large for accuracy).

Auto-detect is available but may reduce accuracy.

Step 4

Process and review

Transcription runs locally and completes in seconds to minutes depending on file length and model size.
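The download-once behavior from Step 1 can be sketched as a small cache check. This is only an illustration under stated assumptions: `ensure_model`, the `.bin` file naming, and the `download` callback are hypothetical and do not describe how the product actually stores models.

```python
from pathlib import Path
from typing import Callable


def ensure_model(
    name: str,
    cache_dir: Path,
    download: Callable[[str, Path], None],
) -> Path:
    """Return the cached model path, downloading only if it is missing."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    model_path = cache_dir / f"{name}.bin"  # file naming is an assumption
    if not model_path.exists():
        # First use only: this is the one step that needs internet access.
        download(name, model_path)
    return model_path
```

Every later call with the same model name returns the cached file without touching the network, which is what makes offline use possible after the initial download.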

Command Examples

You say:

Transcribe this meeting recording locally [audio file]

Assistant responds:

Processing with Whisper Large locally... Complete! [00:00] Alice: Let's start with Q1 results. [00:12] Bob: Revenue is up 15%...

You say:

Convert my voice memo to text, keep it private

Assistant responds:

Using local model (no cloud upload). Transcribed: "Remember to follow up with the client about contract renewal. Priority: high..."

You say:

Transcribe this 2-hour interview offline

Assistant responds:

Started local processing with Medium model. Estimated time: 8 minutes. Progress: 25%... 50%... Complete!
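If you want to post-process the timestamped output from the first example, the lines can be split into segments. The sketch below assumes the `[MM:SS] Speaker: text` layout shown above is representative of real output; `parse_segments` is a hypothetical helper, not a product function.

```python
import re
from typing import List, Tuple

# One segment: "[MM:SS] Speaker: text", ending at the next timestamp or end of input.
SEGMENT = re.compile(
    r"\[(\d{2}):(\d{2})\]\s*([^:\[\]]+):\s*(.*?)(?=\s*\[\d{2}:\d{2}\]|$)",
    re.DOTALL,
)


def parse_segments(transcript: str) -> List[Tuple[int, str, str]]:
    """Return (start_seconds, speaker, text) for each timestamped segment."""
    return [
        (int(mm) * 60 + int(ss), speaker.strip(), text.strip())
        for mm, ss, speaker, text in SEGMENT.findall(transcript)
    ]
```

For the sample reply above, this yields one tuple for Alice starting at 0 seconds and one for Bob starting at 12 seconds.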

Limits & Behavior

Parameter	Limit	Notes
File size	Unlimited	limited only by device storage
Duration	Unlimited	longer files take more processing time
Concurrent jobs	1 at a time	additional files are queued automatically
Model storage	~1.5 GB max	per language model downloaded

Models & Modes

Model Size	Speed	Accuracy	RAM Usage	Best For
Tiny	Very Fast	Good	~1 GB	quick drafts, casual notes
Small	Fast	Better	~2 GB	general use, meetings
Medium	Medium	High	~5 GB	professional transcription
Large	Slow	Highest	~10 GB	critical accuracy, difficult audio
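One way to apply the table is to pick the largest model that fits in the memory you have free. The sketch below uses the approximate RAM figures from the table; `MODEL_RAM_GB` and `pick_model` are hypothetical names, and real memory use will vary by device.

```python
from typing import Optional

# Approximate RAM needs in GB, copied from the table above, smallest first.
MODEL_RAM_GB = {"tiny": 1, "small": 2, "medium": 5, "large": 10}


def pick_model(free_ram_gb: float) -> Optional[str]:
    """Return the largest model whose approximate RAM need fits, else None."""
    fitting = [name for name, need in MODEL_RAM_GB.items() if need <= free_ram_gb]
    # The dict is ordered smallest to largest, so the last fit is the largest.
    return fitting[-1] if fitting else None
```

With 8 GB free this selects Medium; with less than 1 GB free it returns `None`, which lines up with the OUT_OF_MEMORY case in Troubleshooting.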

Estimated Cost

Free per transcription

Runs entirely on your device. No API calls, no provider costs, no LLM costs.

No LLM cost: transcription runs fully on-device.

BYOK users pay LLM costs directly to their provider.

* Prices include platform service fee. Actual costs may vary.

FAQ

Setup Requirements

Sufficient disk space for models (75MB-1.5GB)

Minimum 4GB RAM (8GB+ recommended)

First-time model download (internet required)

Audio input capability for live recording

Troubleshooting

Error	Meaning	Action
OUT_OF_MEMORY	Insufficient RAM for model	Use a smaller model or close other apps
MODEL_NOT_FOUND	Language model not downloaded	Download model from Settings → Models
UNSUPPORTED_FORMAT	Audio format not recognized	Convert to MP3, WAV, or M4A
PROCESSING_SLOW	Transcription taking too long	Use smaller model or upgrade hardware
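The OUT_OF_MEMORY action ("use a smaller model") can be automated by stepping down through model sizes. This is a sketch under assumptions: the `TranscriptionError` class, the idea that error codes arrive as exceptions, and the `run` callback are all hypothetical, since the page does not describe a programmatic interface.

```python
from typing import Callable, Optional, Sequence


class TranscriptionError(Exception):
    """Hypothetical error carrying one of the codes from the table above."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def transcribe_with_fallback(
    run: Callable[[str], str],
    models: Sequence[str] = ("large", "medium", "small", "tiny"),
) -> str:
    """Try each model in order, stepping down only on OUT_OF_MEMORY."""
    last_error: Optional[TranscriptionError] = None
    for model in models:
        try:
            return run(model)
        except TranscriptionError as err:
            if err.code != "OUT_OF_MEMORY":
                raise  # other codes call for a different action
            last_error = err
    if last_error is None:
        raise ValueError("no models to try")
    raise last_error
```

Only OUT_OF_MEMORY triggers a retry; codes such as MODEL_NOT_FOUND or UNSUPPORTED_FORMAT are re-raised because a smaller model would not fix them.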