Single vs Stacked Presets for Dictation Refinement
Compare using a single refinement preset versus stacking multiple presets for dictation. Learn when simplicity wins and when composability matters.
| Criteria | Single Preset Refinement | Stacked Preset Refinement |
|---|---|---|
| Latency | 1-3 seconds — single LLM call | 2-9 seconds — multiple sequential LLM calls |
| Cost Per Dictation | One LLM inference — typically $0.001-$0.01 | N inferences — $0.002-$0.03 for 2-3 stacked presets |
| Flexibility | Low — changing behavior requires editing the monolithic prompt | High — swap, add, or remove individual presets without side effects |
| Maintainability | Decreases as the single prompt grows in complexity | Remains high — each preset stays small and focused |
| Output Consistency | Higher — one model call means no inter-pass conflicts | Lower — later passes may conflict with earlier refinements |
Single Preset Refinement
One refinement preset handles all post-processing in a single LLM pass. The preset contains all instructions for tone, formatting, cleanup, and domain-specific corrections in one prompt.
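A minimal sketch of this single-pass flow, assuming a hypothetical `llm` callable (prompt in, refined text out) standing in for your real API client:

```python
# Single-preset refinement: every instruction lives in one prompt, and the
# transcript is refined in exactly one LLM call.
# `llm` is a hypothetical callable (prompt -> text); swap in your API client.

SINGLE_PRESET = (
    "Remove filler words, add punctuation, fix capitalization, "
    "preserve technical terms, and keep code identifiers unchanged."
)

def refine_single(transcript: str, llm) -> str:
    # One inference per dictation: the whole instruction set rides along
    # with the transcript in a single prompt.
    prompt = f"{SINGLE_PRESET}\n\nTranscript:\n{transcript}"
    return llm(prompt)
```

Because there is only one call, latency and cost stay at their floor, at the price of a prompt that grows every time a new behavior is needed.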
Pros
- Simpler to create, understand, and maintain — one prompt does everything
- Lower latency — only one LLM inference call per dictation
- Lower cost — one API call instead of multiple sequential calls
- No risk of conflicting instructions between presets
Cons
- Complex prompts trying to handle everything can become unwieldy and brittle
- Cannot easily mix and match behaviors — changing one aspect requires editing the whole preset
- Difficult to reuse individual refinement behaviors across different workflows
- Single prompts have diminishing returns as instruction count grows
Stacked Preset Refinement
Multiple refinement presets are applied sequentially, each handling a specific aspect of post-processing. For example: cleanup pass, then formatting pass, then tone adjustment pass.
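The sequential pipeline can be sketched as a fold over the preset list, assuming the same hypothetical `llm` callable as above; the preset wording here is illustrative, not an official library:

```python
# Stacked refinement: each preset is a small, focused prompt applied in its
# own LLM pass, in order. N presets means N inference calls.
# `llm` is a hypothetical callable (prompt -> text).

def refine_stacked(transcript: str, presets: list[str], llm) -> str:
    text = transcript
    for preset in presets:  # output of one pass feeds the next
        text = llm(f"{preset}\n\nText:\n{text}")
    return text

# Example stack, applied in order: cleanup -> formatting -> tone.
PRESETS = [
    "Remove filler words and false starts.",
    "Format code identifiers in backticks and lists as markdown.",
    "Adjust tone to concise and professional.",
]
```

Swapping a behavior is now a list edit (`PRESETS[2] = casual_tone`) rather than surgery on a monolithic prompt.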
Pros
- Modular and composable — mix and match presets for different workflows
- Each preset can be simple, focused, and individually tested
- Easy to add or remove a specific behavior without affecting others
- Enables preset libraries where community presets can be combined freely
- Better separation of concerns — each preset has one job
Cons
- Higher latency — each preset adds another LLM inference round-trip
- Higher cost — N presets means N API calls per dictation
- Later presets may undo or conflict with changes from earlier presets
- More complex pipeline to configure, debug, and reason about
Verdict
Start with single presets. They are simpler, faster, and cheaper. Move to stacked presets only when your single preset becomes too complex to maintain or when you need to compose behaviors dynamically across different workflows. Ummless supports both approaches, letting you start simple and graduate to composition as your needs grow.
Frequently Asked Questions
When should I switch from single to stacked presets?
Switch when your single preset exceeds 300-400 words and tries to handle unrelated concerns. If you find yourself with a prompt that handles cleanup, formatting, tone, and domain terms all at once, breaking it into focused presets improves reliability and makes each piece testable.
How do I prevent stacked presets from conflicting?
Order matters. Apply structural changes first (cleanup, filler removal), then formatting (markdown, code blocks), then tone (formal, casual) last. Each preset should be idempotent — running it on already-correct text should not introduce changes. Test each preset independently before stacking.
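The idempotency requirement is easy to turn into a check: run the pass twice on a sample and confirm the second pass changes nothing. A sketch, again assuming a hypothetical `llm` callable:

```python
# Idempotency check for a single preset: applying the pass to its own output
# should return the text unchanged. Useful before adding a preset to a stack.
# `llm` is a hypothetical callable (prompt -> text); real LLM calls are
# non-deterministic, so in practice run this with temperature 0 or over
# several samples.

def is_idempotent(preset: str, sample: str, llm) -> bool:
    once = llm(f"{preset}\n\nText:\n{sample}")
    twice = llm(f"{preset}\n\nText:\n{once}")
    return once == twice
```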
What is a good starter preset for developer dictation?
A single preset that removes filler words, adds punctuation, fixes capitalization, and preserves technical terms is a great starting point. It handles 90% of dictation cleanup needs without overcomplicating the prompt or requiring multiple passes.
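One way that starter preset might be worded; this text is illustrative, not an official Ummless preset:

```python
# A possible starter preset for developer dictation, covering the four jobs
# above: filler removal, punctuation, capitalization, term preservation.

STARTER_PRESET = """\
You clean up developer dictation. Apply only these changes:
- Remove filler words (um, uh, you know, like).
- Add sentence punctuation and fix capitalization.
- Preserve technical terms, file paths, and identifiers exactly as spoken.
- Do not rephrase, summarize, or add new content.
Return only the cleaned text."""
```

Note the closing instruction: constraining the model to return only the cleaned text prevents it from wrapping the output in commentary.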
Related Content
AI-Refined vs Raw Dictation: Is Post-Processing Worth It?
Compare raw speech-to-text output with AI-refined dictation. See how LLM post-processing improves punctuation, formatting, and technical accuracy.
Rule-Based vs AI Text Correction for Dictation
Compare traditional rule-based text correction with AI-powered refinement for speech-to-text output. See why LLMs outperform regex for dictation cleanup.
General vs Developer-Focused Dictation Tools
Compare general-purpose dictation software with tools built for developers. See why technical vocabulary and code-aware features matter.