Aizawa Voice Impression: Mastering Eraserhead’s Deadpan Tone
An Aizawa voice impression is one of the more technically interesting character voices from My Hero Academia — not because it is dramatic, but because it is deliberately, pointedly undramatic. Shota Aizawa, the underground hero Eraserhead, speaks with the exhausted patience of someone who has graded too many hero students, slept in a sleeping bag through faculty meetings, and developed a dry wit sharp enough to cut through the shounen genre’s usual optimism. Getting that right requires more than a pitch slider.
This guide covers the acoustic anatomy of Aizawa’s voice across both the Japanese original (Junichi Suwabe) and English dub (Christopher Wehkamp), the specific DSP chain for the tired deadpan baseline and the rare commanding spike, vocal training drills for physical impression work, and the AI voice cloning workflow for real-time use in Discord, OBS, or gaming on Windows.
TL;DR
- Aizawa’s voice is built on dry low-baritone delivery, breathy fatigue overlay, restrained resonance, and intermittent vocal fry — not just pitch shift.
- Junichi Suwabe (JP) is warmer and more dignified; Christopher Wehkamp (EN) is drier and more detached. Both sit at -2 to -3 semitones from a neutral male fundamental.
- DSP chain: -2 to -3 semitone pitch shift → slight formant drop → breathy/air layer at low wet mix → gentle de-essing to keep sibilance under control.
- The command mode is a two-state toggle: reduce the fatigue overlay and raise gain by 2–3 dB on cue.
- AI voice cloning extends the result beyond what DSP alone can achieve, hitting the specific vocal texture of Suwabe or Wehkamp rather than a general approximation.
- VoxBooster handles the full stack — DSP, AI conversion, WASAPI routing — on Windows 10/11 with sub-300 ms latency, no kernel driver required.
Who Is Aizawa Shota and Why Does His Voice Matter?
Shota Aizawa is Class 1-A’s homeroom teacher at U.A. High School in My Hero Academia, the manga and anime franchise created by Kōhei Horikoshi and animated by Bones studio. His hero name is Eraserhead, and his Quirk erases other people’s Quirks — a power that fits his personality perfectly. He operates without spectacle.
The character occupies a unique sonic space in anime voice acting. Where most MHA characters exist somewhere on the spectrum between “loud and determined” and “louder and more determined,” Aizawa is almost aggressively quiet. His voice signals competence through underreaction. A student panics; he sighs. A villain threatens; he evaluates calmly. His rare moments of sharp intensity land harder precisely because they contrast so sharply with his default register.
For voice impression fans, streamers, and roleplayers, that underreaction register is both the appeal and the challenge. Monotone and low is easy to do badly. Monotone, low, and textured with genuine weariness is something else.
The Acoustic Anatomy of Aizawa’s Voice
Pitch and Fundamental Frequency
Aizawa sits in the dry low-baritone range. For impression work, target 2 to 3 semitones below your natural speaking fundamental (a pitch-shift setting of -2 to -3). This is not an extreme drop: it places the voice in the low-normal male range rather than a comically deep register. The goal is restrained weight, not theatrical depth.
Going lower than -4 semitones pushes the voice into a range that needs heavy formant compensation to sound human. Without that compensation it reads as a monster voice or cartoon effect, which is the opposite of what Aizawa’s character projects.
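As a rough worked example of what those semitone values mean in hertz, the sketch below uses the standard equal-temperament relation, where one semitone is a frequency factor of 2^(1/12). The 120 Hz starting fundamental is an assumed illustrative value, not a measurement of either actor or of your own voice.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift in semitones (negative = lower)."""
    return 2.0 ** (semitones / 12.0)


def shifted_fundamental(f0_hz: float, semitones: float) -> float:
    """Fundamental frequency after shifting f0_hz by the given semitones."""
    return f0_hz * semitone_ratio(semitones)


if __name__ == "__main__":
    f0 = 120.0  # assumed example speaking fundamental in Hz
    for st in (-2, -3, -4):
        print(f"{st:+d} semitones -> {shifted_fundamental(f0, st):.1f} Hz")
```

For a 120 Hz speaker this gives roughly 106.9 Hz at -2, 100.9 Hz at -3, and 95.2 Hz at -4, which shows how modest the target shift is in absolute terms.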
The Fatigue Layer: Breathy Overlay and Vocal Fry
The single most distinctive element of Aizawa’s vocal signature is not pitch — it is the quality of exhale that sits underneath his speech. He sounds perpetually half a step away from another involuntary nap. This is produced acoustically through two mechanisms:
Breathy overlay: A thin layer of aspirate air running under the voice. In DSP terms, this means adding a gentle noise or breathy layer to the voice signal at a very low wet/dry mix — around 10–15% wet. Too much produces a “whispering” effect; the correct level just adds the quality of not-quite-full vocal engagement.
Vocal fry: Aizawa uses intermittent vocal fry — the creaky, low-frequency oscillation at the very bottom of the vocal register — especially on sentence endings, after pauses, and during moments of particular exasperation. Physically, this requires relaxing the vocal cords at the end of phrases and letting the voice settle into creak rather than cutting cleanly to silence.
Resonance and Placement
Aizawa keeps resonance low-placed and chest-forward, but not projected outward. His voice does not fill a room — it sits in the room and waits for you to come to it. The forward placement matters: purely throat-back resonance produces a hollow or distant quality that reads as muffled rather than fatigued.
The Japanese performance by Junichi Suwabe has slightly more mid-frequency warmth — his voice has a richer, more resonant low-mid presence that gives Aizawa a sense of buried dignity. Wehkamp’s English interpretation strips back some of that warmth in favor of flatness, which pushes the sarcasm register higher. Neither is wrong; they are different stylistic interpretations of the same character.
Comparison: Japanese vs. English Performance
| Dimension | Junichi Suwabe (JP) | Christopher Wehkamp (EN) |
|---|---|---|
| Overall timbre | Warm low-baritone | Dry, flat baritone |
| Pitch target | -2 semitones, gentle | -2.5 to -3 semitones, clipped |
| Fatigue character | Dignified exhaustion | Detached indifference |
| Vocal fry usage | Occasional, end-of-phrase | Frequent, especially sarcastic lines |
| Command spike style | Sudden rise in intensity, compressed | Flat drop in volume, more menacing |
| Formant adjustment | Slight downward shift for warmth | Neutral to slight upward for dryness |
| Sarcasm delivery | Subtle, almost warm | More overtly deadpan |
For most Western audiences and streaming contexts, the Wehkamp English register is the reference. If you are performing for Japanese-speaking audiences or for purists who prefer the original Japanese audio, targeting Suwabe’s warmer baseline produces a more authentic result.
DSP Settings for the Eraserhead Voice Mod
The Baseline Chain
This chain targets the everyday tired-teacher register — the one Aizawa uses for 90% of his screentime.
- Pitch shift: -2 to -3 semitones. Keep formant correction on so the downward shift does not also drag the formants down and produce a slowed-tape effect (the inverse of the chipmunk effect heard on upward shifts). Most voice processing tools include a linked formant mode; enable it.
- Formant adjustment: -0.5 to -1 point, toward a slightly longer vocal tract. This adds the low-mid warmth that keeps the voice from sounding thin after the pitch drop. Do not over-apply: the result should feel like a slightly larger chest cavity, not a completely different speaker.
- Breathy/air overlay: Add a breathy layer at 10–15% wet. If your voice changer supports a “breathiness” parameter directly, use that. If working with an effects chain, a low-gain noise layer with the high frequencies cut (low-pass around 3 kHz) achieves a similar result without adding hiss.
- Dynamics: Keep compression light. Aizawa’s voice has natural dynamic variation — do not flatten it entirely. A gentle 3:1 ratio with a slow attack preserves the small volume variations that make tired speech feel natural.
- De-esser: Light de-essing at 5–8 kHz. The breathy overlay can exaggerate sibilants — a gentle de-esser keeps them controlled without making speech sound lispy.
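The breathy overlay step in the chain above can be sketched as low-passed noise blended in at a low wet fraction. This is a minimal illustration under stated assumptions, not how VoxBooster or any particular voice changer implements it: the function names, the one-pole filter, the 48 kHz sample rate, and the fixed random seed are all choices made for the example, and samples are plain floats in the range -1.0 to 1.0.

```python
import math
import random


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Simple one-pole low-pass filter (about 6 dB per octave roll-off)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)  # move a fraction of the way toward x
        out.append(state)
    return out


def add_breath_layer(voice, wet=0.12, cutoff_hz=3000.0,
                     sample_rate=48000, seed=0):
    """Blend low-passed white noise under the voice at a `wet` fraction.

    wet=0.12 sits inside the 10-15% range suggested above; the 3 kHz
    cutoff removes the hissy top end of the noise.
    """
    rng = random.Random(seed)  # fixed seed only to keep the example repeatable
    noise = [rng.uniform(-1.0, 1.0) for _ in voice]
    breath = one_pole_lowpass(noise, cutoff_hz, sample_rate)
    return [(1.0 - wet) * v + wet * b for v, b in zip(voice, breath)]
```

A constant noise bed like this would also be audible in silence, so a fuller version would likely scale the noise by the voice's own loudness envelope; that step is left out here to keep the sketch short.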
The Command Mode
Aizawa’s command register appears rarely and lands hard when it does. The shift is not mainly about volume; it is texture and compression. In DSP terms:
- Reduce the breathy overlay to 0–3% wet (nearly off).
- Tighten compression: 4:1 ratio with a faster attack to suppress dynamic peaks and give the voice a more controlled, pressurized quality.
- Raise output gain by 2–3 dB to compensate for the energy the fatigue layer was providing.
- Keep pitch identical — the command mode does not go lower, it goes more controlled.
The effect should feel like the same person making a considered decision to stop being patient, not like a different voice or a dramatic transformation. Practice the two-mode toggle as a conscious performance choice.
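One way to think about the two-mode toggle is as a pair of presets that share a pitch value and differ only in breath level, compression ratio, and output gain. The sketch below is illustrative: the `VoicePreset` class, its field names, the exact numbers chosen inside the ranges given above, and the -18 dB threshold are assumptions for the example, not settings taken from any specific tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoicePreset:
    pitch_semitones: float
    breath_wet: float       # wet fraction of the breathy overlay
    comp_ratio: float       # compressor ratio (3.0 means 3:1)
    output_gain_db: float


# Same pitch in both modes; only texture, compression, and gain change.
BASELINE = VoicePreset(pitch_semitones=-2.5, breath_wet=0.12,
                       comp_ratio=3.0, output_gain_db=0.0)
COMMAND = VoicePreset(pitch_semitones=-2.5, breath_wet=0.02,
                      comp_ratio=4.0, output_gain_db=2.5)


def select_preset(command_active: bool) -> VoicePreset:
    """Hard switch between the two states, with no gradual blend."""
    return COMMAND if command_active else BASELINE


def compressed_level_db(level_db: float, threshold_db: float,
                        ratio: float) -> float:
    """Static hard-knee compressor curve: output level for an input level."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio
```

With the assumed -18 dB threshold, a -6 dB peak comes out at -14 dB under the 3:1 baseline and at -15 dB under the 4:1 command setting; adding the 2.5 dB output gain lifts the command peak to -12.5 dB, so the command voice ends up both tighter and slightly louder.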
Vocal Training Drills for Physical Impression Work
If you want to produce the impression using just your own voice — for convention panels, in-person roleplay, acting work, or as a foundation for AI cloning — these drills build the physical technique.
Drill 1: Sustained Low Monotone
Hold a single vowel (try “ah”) at the lowest comfortable pitch in your chest register for 10–15 seconds without rising, vibrating, or adding expression. The goal is controlled flatness — not forced, not strained. Aizawa’s neutral speaking pitch should feel like this: a comfortable floor, not an effort.
Drill 2: Phrase-End Fry
Speak a sentence — any sentence — and at the very end, instead of stopping the voice cleanly, let it settle into creak. The vocal cords should still be vibrating but at a very slow, low rate. “The test is next week” should end with “week” creaking slightly downward into near-silence. Practice this on every sentence for 5 minutes daily until it becomes natural at the end of utterances.
Drill 3: Flat Affect Reading
Read any text (news, a book passage, a recipe) with zero emotional modulation. No emphasis words, no pitch rises for questions, no enthusiasm spikes. Every sentence ends at approximately the same pitch as it started. This is Aizawa’s emotional default: he does not perform feelings with his voice, he just states things. Recording yourself and checking for accidental emphasis reveals where natural speech habits sneak in.
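If you want a number to go with the listening check, one option is to measure how far your pitch wanders across a recording. The sketch below assumes you already have a per-frame pitch track in hertz from some pitch tracker (with 0 marking unvoiced frames); extracting that track is outside the scope of this example, and the function name is hypothetical.

```python
import math
import statistics


def semitone_spread(f0_hz):
    """Standard deviation of pitch, in semitones, around the median pitch.

    Frames with a value of 0 or below are treated as unvoiced and skipped.
    Lower results mean flatter delivery.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    reference = statistics.median(voiced)
    offsets = [12.0 * math.log2(f / reference) for f in voiced]
    return statistics.pstdev(offsets)
```

Comparing the value for a normal reading against a flat-affect reading of the same passage gives a simple before-and-after measure; no particular target number is implied here.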
Drill 4: Two-Mode Switching
Read a dialogue script where the character alternates between calm teacher mode and a single moment of sharp command. Practice snapping between the two without a gradual transition — the switch should happen in a single syllable. This is the hardest part of the Aizawa impression to get right because it requires simultaneous physical and emotional precision.
AI Voice Cloning Workflow for Eraserhead
AI voice conversion takes the DSP baseline and extends it into a genuine acoustic match with either Suwabe’s or Wehkamp’s specific vocal texture — the individual overtones, breath patterns, and resonance qualities that DSP chains can approximate but not exactly reproduce.
Step 1: Collect Clean Audio
Source clean dialogue from My Hero Academia episodes — scenes without music, battle sound effects, or background crowd noise. Aizawa has substantial dialogue across the series, making this straightforward. Aim for 15–30 minutes of clean, isolated speech covering both calm and command registers.
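To check whether a folder of clips reaches the 15 to 30 minute target, a small script can total the clip durations. This sketch assumes the clips are uncompressed WAV files in a single folder and uses only Python's standard `wave` module; the function names and the folder layout are assumptions for illustration.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path) -> float:
    """Duration of one WAV file in seconds (frames divided by frame rate)."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def total_minutes(folder) -> float:
    """Combined duration, in minutes, of all .wav files directly in folder."""
    clips = sorted(Path(folder).glob("*.wav"))
    return sum(wav_duration_seconds(p) for p in clips) / 60.0


def meets_target(folder, min_minutes: float = 15.0) -> bool:
    """True once the collected audio reaches the minimum target length."""
    return total_minutes(folder) >= min_minutes
```

Calling `total_minutes("clips")` on your own clip folder reports how much material you have so far; it says nothing about how clean the audio is, so the listening pass for music and sound effects is still needed.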
Step 2: Check for Existing Community Models
Before training from scratch, check community voice model repositories. Pre-trained models for major MHA characters exist and may already cover Suwabe’s or Wehkamp’s Aizawa performance. A good community model can save the processing time entirely.
Step 3: Import and Configure in VoxBooster
VoxBooster accepts standard voice model formats directly through its import interface, with no Python runtime or command-line setup required. Load the model in the AI Voice section, set the input source to your microphone, and select the WASAPI virtual cable as output so Discord, OBS, or your game client receives the converted audio. With a discrete GPU, conversion latency stays under 300 ms: comfortable for push-to-talk, and usable for real-time conversation with a little discipline.
Step 4: Layer DSP on AI Output
For Aizawa specifically, the AI model handles the tonal and textural match; the DSP chain adds the fatigue layer on top. Run the AI conversion first in the signal chain, then apply the breathy overlay and light compression to the converted output. This produces a more convincing result than either technique alone.
Setup for Discord, OBS, and Games
Discord
In Discord audio settings, set the input device to the VoxBooster virtual audio cable. Disable Discord’s own noise suppression: it conflicts with the breathy overlay and tries to remove it as “background noise.” If you need noise suppression, apply it before the voice processing chain, not after, using VoxBooster’s internal noise suppression or a standalone tool such as Krisp.
Under Discord’s input mode setting, switch from voice activity to push-to-talk if using AI conversion mode, so the 250–300 ms processing delay is less noticeable in pauses. DSP-only mode is fast enough for open-mic use.
OBS
In OBS, add the VoxBooster virtual cable as a microphone source. In the audio mixer, name it “Aizawa” or “Eraserhead” for clarity if you run multiple audio sources. You can assign scene-specific audio filters in OBS on top of VoxBooster’s output if you want scene-specific intensity presets.
Games with Anti-Cheat
VoxBooster operates entirely through WASAPI — the Windows audio session API — with no kernel driver component. Anti-cheat systems (EAC, BattlEye, Riot Vanguard) monitor kernel-level access; WASAPI audio routing is entirely transparent to them. The voice changer runs alongside competitive games without conflict.
Ethics and Fan Content Context
The Aizawa voice impression for fan content falls within established fan community practice. Non-commercial streaming, Discord roleplay, gaming, and cosplay audio use fictional character voice impressions without creating intellectual property friction in practice.
For commercial applications — selling voice packs, using the impression in monetized products, or licensing content — the relevant rights holders are Bones studio (anime production) and Shueisha (manga publisher). Review their fan content guidelines before commercializing.
The voice actors themselves — Junichi Suwabe and Christopher Wehkamp — perform under professional contracts. An AI clone trained on their performance for non-commercial personal use sits in the same category as a physical impression trained by listening to the performance. Commercial use of a voice actor’s likeness requires separate consideration and, in most professional contexts, their consent.
Practice Material: Iconic Aizawa Lines
These lines cover the range of Aizawa’s registers and are useful both as impression reference and as practice text for the vocal drills above.
- The iconic expulsion threat: flat, measured, no dramatic inflection — just the calm communication of an unpleasant fact.
- The sleeping bag entrance: tired, conversational, slightly annoyed at having to be awake for this.
- The villain confrontation command: same low pitch, fatigue overlay removed, compressed and direct.
- The rare moment of genuine care — delivered with the same flat tone as everything else, which is exactly what makes it land.
The consistent thread across all registers: Aizawa never performs for the audience. He communicates to the person in front of him and considers it your problem whether you hear it or not.
Frequently Asked Questions
What makes Aizawa’s voice different from a standard lowered-pitch effect?
A simple pitch drop just makes any voice deeper. Aizawa’s signature comes from layering breathy fatigue, restrained resonance, and intermittent vocal fry under a dry, conversational delivery, plus abrupt shifts to clipped, commanding intensity. Pitch alone misses the exhausted teacher texture entirely.
How many semitones should I lower pitch for an Aizawa impression?
Start at -2 to -3 semitones from your natural fundamental. Christopher Wehkamp’s English performance sits in a dry low-baritone range; Junichi Suwabe’s Japanese original is slightly warmer. Avoid going below -4 without compensating formant shift, or the voice sounds like a generic monster effect rather than a tired human.
Can I do an Aizawa voice impression live on Discord without noticeable lag?
Yes. DSP-only mode (pitch shift, formant adjustment, and the breathy overlay) adds under 20 ms of latency, which is imperceptible in conversation. AI voice conversion adds 250–300 ms, which works fine with push-to-talk but can feel sluggish in open-mic chat.
Is it okay to use an Aizawa voice impression for fan content and streaming?
Fan voice impressions of fictional characters for non-commercial content such as streaming, gaming, and Discord roleplay are established fan community practice and rarely create intellectual property friction. For monetized projects or commercial products, review the character usage guidelines from Bones studio and Shueisha before publishing.
Do I need to train a custom AI voice model or can I use a pre-trained one?
Pre-trained community models exist and work for casual use. Training your own from clean Aizawa dialogue requires 15–30 minutes of isolated audio and produces a more accurate, personal result. Either path runs in VoxBooster without any Python environment or command-line setup.
What is the difference between the Japanese and English Aizawa voice performances?
Junichi Suwabe’s Japanese performance is slightly warmer in timbre with richer low-mid resonance, so the fatigue reads as dignified restraint. Christopher Wehkamp’s English dub is drier and flatter in delivery, leaning harder into detached sarcasm. Both share the same -2 to -3 semitone range, but the formant target differs slightly.
Why does Aizawa occasionally shift to a sharp, commanding tone and how do I replicate it?
Aizawa’s command register appears in crisis moments: expulsion threats, battle calls, hero rescue. It is the same low pitch but with compressed dynamics, increased projection, and minimal breathy layer. In DSP terms: reduce the wet mix on the fatigue overlay and raise output gain by 2–3 dB. Practice the contrast as a two-mode toggle rather than a gradual transition.