
Operation Guides

Extra Samples & Pitches

Before you begin:

- This guide was written and coordinated by members of the UTAU community and has no affiliation with the UTAU software itself or its creator.

- This guide was written with users of WINDOWS 10 in mind. The advice may not translate directly to other operating systems or other versions of UTAU, such as UTAU-Synth for macOS.

- This guide relates to the original UTAU software by Ameya, released in 2008, and NOT OpenUTAU, the fan-made UTAU alternative. As such, the usefulness of this resource may vary.

- While the process of operating UTAU is ultimately safe when done correctly, JOEZCafe and other parties involved in JOEZUTAU projects take no responsibility for any incidents, loss or damage to users or property from following these instructions.

Understanding Extras:

UTAU voicebanks contain the samples necessary to synthesize a specific language. On top of that, some banks feature additional samples with a wide array of extra or alternate sounds that can add variety, realism or versatility to your UST. This guide goes over some of the most common extras found in UTAU voicebanks and how to take advantage of them.

Release/Ending Vowels:

When the note at the end of a sequence finishes playing in UTAU, the effect can be somewhat sudden, as if the voice is cutting out. Some voicebanks, regardless of their type, contain Release or Ending Vowels that can be placed at the end of a vocal sequence. These play the sound of the vocalist closing off their voice, allowing for a more natural release.

To use these, simply place a note at the end of a phrase and enter the alias of the appropriate vowel ending. Depending on the bank, the vowel might be aliased in Hiragana, Romaji or another phoneme system like Arpabet, followed by a space, then a Hyphen (-). Depending on how they're configured, some voicebanks use a different Suffix instead of a Hyphen (-), such as an Asterisk (*) or a capital R.

To finalise the release, crossfade it with the prior note in the sequence using P2P3.
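
As a rough illustration of the aliasing described above, here is a small Python sketch that builds a release alias from a vowel and a suffix. The function name is made up for this example, and keeping the space before non-Hyphen suffixes is an assumption; always check how your voicebank is actually aliased.

```python
def release_alias(vowel: str, suffix: str = "-") -> str:
    """Return a release alias: the vowel, a space, then the release suffix.

    Assumption: banks that use another suffix (such as * or R) still keep
    the space. Real voicebanks may be configured differently.
    """
    return f"{vowel} {suffix}"


# Hypothetical examples; the real aliases depend on the voicebank.
print(release_alias("a"))       # a -
print(release_alias("あ"))      # あ -
print(release_alias("a", "R"))  # a R
```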

1.PNG
2.PNG

Multi-Expression Samples (E.V.E.C):

In vocal synths, it's common for vocalists to have multiple expressions (nicknamed "Appends" by the community). These expressions contain samples of the vocalist singing in different tones of voice to cover more singing styles and genres.

While some vocalists have these expressions distributed as separate voicebanks to be used individually, others are Multi-Expression, where these vocal types can be used interchangeably in one UST using the same voicebank.

Multi-Expression banks (also nicknamed E.V.E.C, short for Enhanced Voice Expression Control) have their own aliasing on a case-by-case basis, but the most common system is adding Suffixes (additional characters placed at the end of an Alias) to specific notes to swap the sample with one from a different expression.

3.PNG

To demonstrate this, these are the Expressions available on JYOZE's Primary Series voicebanks, CORE (CV), ORNATE (VCV) and GLASS (Arpasing):

NATURAL.png

Leaving a note as is without a Suffix makes UTAU use samples from JYOZE's NATURAL voice, JYOZE's signature youthful tone.

POWER.png

Adding a P Suffix to a note uses JYOZE's POWER voice, where they sing in a firmer, shoutier tone of voice.

SOFT.png

Adding an S Suffix to a note uses JYOZE's SOFT voice, a breathy and airy tone of voice.

Depending on the voicebank used, there can also be Multi-Expression variants of extra samples like Vowel Release samples and VCs, so experimentation is highly recommended to customise your vocals.
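
The three expressions listed above can be summed up as a small lookup table. The Python sketch below is only an illustration: the function name is invented here, and appending the Suffix directly to the Alias with no separator is an assumption, so check the voicebank's own documentation for the exact aliasing.

```python
# Suffixes for the three expressions described above.
# Assumption: the suffix is appended directly to the alias, no separator.
EXPRESSION_SUFFIXES = {"NATURAL": "", "POWER": "P", "SOFT": "S"}


def with_expression(alias: str, expression: str = "NATURAL") -> str:
    """Return the alias with the suffix for the chosen expression."""
    return alias + EXPRESSION_SUFFIXES[expression]


print(with_expression("ka"))           # ka  (NATURAL, no suffix)
print(with_expression("ka", "POWER"))  # kaP
print(with_expression("ka", "SOFT"))   # kaS
```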

Understanding Pitches:

UTAU accomplishes singing synthesis by analysing the note and lyric the user wishes to use, analysing a recording of the UTAU's voice provider singing that lyric, then correcting the pitch of the recording so it sounds like the desired note.

For example, if an UTAU vocalist needs to sing an F4 but only has recordings at C4, the recording is raised by 5 semitones to reach the desired note.

The trade-off with this system is the presence of "Pitch Artifacting", where the voice sounds increasingly artificial the further the distance between the desired note and the original recording. Because the synthesis primarily involves pitch correction, this also means UTAU will be unable to capture the subtle changes in vocal dynamics as the pitch goes higher or lower, such as the vocalist's varying exertion of energy as they sing high and low.

To counteract this, some voicebanks are Multipitch, meaning they have multiple pitches recorded to cover a broader vocal range.
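
The distance between a recording and the desired note is simple arithmetic, and it gives a rough idea of how much Pitch Artifacting to expect. The Python sketch below counts semitones between two note names. The helper names are made up for this example, and it says nothing about how UTAU performs the correction internally.

```python
# Semitone offset of each note name within one octave.
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def note_number(name: str) -> int:
    """Convert a note name such as 'F#3' into a running semitone count."""
    pitch_class, octave = name[:-1], int(name[-1])
    return octave * 12 + NOTE_OFFSETS[pitch_class]


def semitone_shift(recorded: str, target: str) -> int:
    """Semitones the recording must be moved (positive means raised)."""
    return note_number(target) - note_number(recorded)


print(semitone_shift("C4", "F4"))  # 5
print(semitone_shift("C4", "D3"))  # -10
```

The larger the absolute value, the further the recording has to be stretched, which is where a second or third recorded pitch helps.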

What is a Prefix Map?:

Multipitch voicebanks incorporate a file known as a Prefix Map, a custom-made table that automatically chooses which pitch should be used for a note depending on its position on the piano roll. In essence, thanks to the Prefix Map, the user does not need to worry about which pitch is being used by default.

4.PNG

For example, JYOZE's Primary Series voicebanks have three pitches.

- Notes that are B3 or Higher use the C4 recordings, JYOZE's High Range

- Notes that are between F#3 and A#3 use the G3 recordings, JYOZE's Mid Range

- Notes that are F3 or Lower use the D3 recordings, JYOZE's Low Range
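
The three ranges above behave like a lookup from note to recording. The Python sketch below mimics that decision for illustration only. It does not read or write a real Prefix Map file, and the function names are invented for this example.

```python
# Semitone offset of each note name within one octave.
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def note_number(name: str) -> int:
    """Convert a note name such as 'A#3' into a running semitone count."""
    return int(name[-1]) * 12 + NOTE_OFFSETS[name[:-1]]


def recording_for(note: str) -> str:
    """Pick the recorded pitch for a note, following the ranges listed above."""
    n = note_number(note)
    if n >= note_number("B3"):
        return "C4"  # High Range: B3 or higher
    if n >= note_number("F#3"):
        return "G3"  # Mid Range: F#3 to A#3
    return "D3"      # Low Range: F3 or lower


for note in ("E4", "B3", "A#3", "F#3", "F3", "C3"):
    print(note, "->", recording_for(note))
```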

In some scenarios, you may want to override the Prefix Map to use a pitch recording of your choice. This can be done with either of two methods:

Method 1 - Changing the Prefix Map

Navigate to Tools > Edit prefix.map(E) to open the Prefix Map Editor.

In this editor, select the note (Key) you wish to allocate a pitch to, then enter the desired pitch into the Suffix field on the right and select Set.

5.PNG

Method 2 - Adding a Pitch Suffix to a note

While the Prefix Map allocates pitches automatically, pitch Suffixes can still be entered on a note-by-note basis to override the original configuration. This is useful for reducing harsh-sounding discrepancies when a vocalist sings between two notes that toggle back and forth between pitches.

If a pitch has a Suffix, a note can be manually overridden to that pitch by adding that specific Suffix.

If a pitch is the "Default Pitch" and does not have a Suffix, it can be selected by placing a Question Mark (?) before the note's Alias.

6.PNG
7.PNG

For example, JYOZE's D3 and C4 recordings are used by adding the "D3" and "C4" Suffixes to a note, while G3 is JYOZE's "Default Pitch" and does not have a Suffix, so it is accessed by adding a Question Mark (?) before a note's Alias.
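
Put together, the manual override rules for JYOZE can be sketched in Python as follows. This is only an illustration of the rules above: the function name is made up, and appending the Suffix directly to the Alias with no separator is an assumption.

```python
SUFFIXED_PITCHES = {"D3", "C4"}  # pitches selected with a Suffix
DEFAULT_PITCH = "G3"             # no Suffix, selected with a leading "?"


def force_pitch(alias: str, pitch: str) -> str:
    """Return the lyric to type on a note to force a specific recording."""
    if pitch == DEFAULT_PITCH:
        return "?" + alias
    if pitch in SUFFIXED_PITCHES:
        # Assumption: the suffix follows the alias with no separator.
        return alias + pitch
    raise ValueError(f"no {pitch} recording in this voicebank")


print(force_pitch("ka", "C4"))  # kaC4
print(force_pitch("ka", "D3"))  # kaD3
print(force_pitch("ka", "G3"))  # ?ka
```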

This extra layer of complexity will make your USTs all the richer!
Next up is adding some pitch dynamics and flags to bring life to your vocals.
