TTS-Voice-Wizard

Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)

MIT License

Stars
582
Committers
7

TTS-Voice-Wizard - v1.6.4

Published by VRCWizard 10 months ago

Lots of minor bug fixes

v1.6.4 (current)

  • fixed a bug where changing the value for Minimum valid VAD duration in Deepgram settings had no effect

v1.6.3.8

  • custom output based on current HR for media integration tab, example: {HREmoji (BPM: 0 E: 💀)(BPM: 40-59 E: 💔)(BPM: 60-100 E: ❤️)(BPM: 101-120 E: 💓)}
    • BPM can be a range or a single value, spaces before and after "-" are ok
    • E stands for emoji but it can be anything u want, type a paragraph if u want 🤷
    • You can have however many sections you want (at least 1)
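
As a rough illustration of how the {HREmoji ...} sections above could be matched against a heart rate, here is a minimal Python sketch. It is not the application's own parser: the names `parse_hr_sections` and `hr_output` are hypothetical, the text after "E:" is assumed not to contain a closing parenthesis, and returning an empty string when no section matches is an assumption.

```python
import re

# Matches one "(BPM: <n> E: <text>)" or "(BPM: <low>-<high> E: <text>)" section.
# Spaces around "-" are allowed, as the notes describe.
SECTION_RE = re.compile(r"\(BPM:\s*(\d+)(?:\s*-\s*(\d+))?\s+E:\s*(.*?)\)")


def parse_hr_sections(template):
    """Return a list of (low, high, text) tuples found in the template."""
    sections = []
    for low, high, text in SECTION_RE.findall(template):
        low = int(low)
        high = int(high) if high else low  # a single value is a one-point range
        sections.append((low, high, text))
    return sections


def hr_output(template, bpm):
    """Return the text of the first section whose BPM range contains bpm."""
    for low, high, text in parse_hr_sections(template):
        if low <= bpm <= high:
            return text
    return ""  # assumption: nothing is shown when no section matches


if __name__ == "__main__":
    template = "{HREmoji (BPM: 0 E: 💀)(BPM: 40-59 E: 💔)(BPM: 60-100 E: ❤️)(BPM: 101-120 E: 💓)}"
    print(hr_output(template, 72))
```

The text after "E:" is returned unchanged, so it can be an emoji or a longer phrase.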

v1.6.3.7

  • fix for "NoDriver calling acmFormatSuggest" error when converting some wav streams (for select azure voices) from VoiceWizardPro API into files

v1.6.3.6

  • standardized default hr output
  • fixed Azure Translation not working for Chinese (was using wrong language code)

v1.6.3.5

  • fix for "enable output for OBS text file" for the media integration tab trying to output to the wrong folder "TextOut" instead of the new "TextOutput"

TTS-Voice-Wizard - v1.6.3.8

Published by VRCWizard 10 months ago

  • custom output based on current HR for media integration tab, example: {HREmoji (BPM: 0 E: 💀)(BPM: 40-59 E: 💔)(BPM: 60-100 E: ❤️)(BPM: 101-120 E: 💓)}
    • BPM can be a range or a single value, spaces before and after "-" are ok
    • E stands for emoji but it can be anything u want, type a paragraph if u want 🤷
    • You can have however many sections you want (at least 1)

TTS-Voice-Wizard - v1.6.3.7

Published by VRCWizard 10 months ago

Lots of minor bug fixes

v1.6.3.7 (current)

  • fix for "NoDriver calling acmFormatSuggest" error when converting some wav streams (for select azure voices) from VoiceWizardPro API into files

v1.6.3.6

  • standardized default hr output
  • fixed Azure Translation not working for Chinese (was using wrong language code)

v1.6.3.5

  • fix for "enable output for OBS text file" for the media integration tab trying to output to the wrong folder "TextOut" instead of the new "TextOutput"

TTS-Voice-Wizard - v1.6.3.6

Published by VRCWizard 10 months ago

  • standardized default hr output
  • fixed Azure Translation not working for Chinese (was using wrong language code)

TTS-Voice-Wizard - v1.6.3.5

Published by VRCWizard 10 months ago

  • fix for "enable output for OBS text file" for the media integration tab trying to output to the wrong folder "TextOut" instead of the new "TextOutput"

TTS-Voice-Wizard - v1.6.3.4

Published by VRCWizard 10 months ago

v1.6.3.4 (current version)

  • minor fix

v1.6.3.2

  • minor changes to UI and text outputs
  • whisper VAD times now clear from memory

v1.6.3

  • Calibration added for Deepgram to automatically adjust your silence threshold based on your background noise
  • VoiceForge TTS voices

v1.6.2

Deepgram

  • Deepgram can now be used continuously
  • The new Minimum Valid VAD Duration (s) is used to prevent sending blank outputs to the API (saves your usage)
    • by default, audio with a detected voice activation shorter than 0.5 seconds is not sent
  • Silence Scale can be used to adjust how long the pause is before sending audio. The default, 30000, is around a 1 second pause

Voice Wizard Pro

  • VoiceWizardPro API blocks audio shorter than 0.1 seconds or longer than 30 seconds
  • silence trimming of API improved
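
The duration limits described above amount to simple threshold checks. The Python sketch below illustrates that idea only. The function names are hypothetical, the defaults are the values quoted in these notes, and treating the limits as inclusive is an assumption.

```python
def deepgram_should_send(voiced_seconds, min_valid_vad=0.5):
    """True when the detected voice activity is long enough to send.

    min_valid_vad stands in for the "Minimum Valid VAD Duration (s)" setting.
    """
    return voiced_seconds >= min_valid_vad


def pro_api_accepts(clip_seconds, shortest=0.1, longest=30.0):
    """True when a clip falls inside the VoiceWizardPro API duration window."""
    return shortest <= clip_seconds <= longest


if __name__ == "__main__":
    for seconds in (0.05, 0.3, 2.0, 45.0):
        print(seconds, deepgram_should_send(seconds), pro_api_accepts(seconds))
```

Skipping very short clips before they reach the API is what saves usage, since blank or near-blank audio is never sent.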

MISC

  • VAD Mode setting moved to Settings > Audio (this setting is used for both deepgram and whisper)
  • Other minor changes and bug fixes

It is highly recommended to use noise filtering software, such as Nvidia Broadcast (the continuation of RTX Voice) or Nahimic (software that comes with MSI boards), in conjunction with TTS Voice Wizard

v1.6.1.4

  • Voice Activation Detection (VAD) for Whisper Speech-To-Text

    • prevents Whisper from transcribing when there is no voice activity
    • scales from HighQuality (not a lot of background noise) to VeryAggressive (you are in an environment with a lot of noise or fan sounds)
  • GPU selection for Whisper (may not work)

  • Custom Translation Text Option
    • output {originalText}, {translatedText}, {inputLangCode}, {inputLangName}, {outputLangCode}, {outputLangName} as you choose
    • also allows for using {nline} to start a newline
    • example output: [{inputLangCode}] {originalText}{nline}[{outputLangCode}] {translatedText}{nline}{inputLangName} ---> {outputLangName}
    • output will be visible in VRChat Chatbox or KAT
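
To show what the custom translation template above expands to, here is a minimal Python sketch of placeholder substitution. It is not the application's own code: `render_translation` is a hypothetical helper, and leaving unknown placeholders untouched is an assumption.

```python
import re

# Matches placeholders such as {originalText} or {nline}.
PLACEHOLDER_RE = re.compile(r"\{(\w+)\}")


def render_translation(template, fields):
    """Fill a custom translation template in a single pass."""
    values = dict(fields, nline="\n")  # {nline} starts a new line
    # Unknown placeholders are left as written (assumption).
    return PLACEHOLDER_RE.sub(lambda m: values.get(m.group(1), m.group(0)), template)


if __name__ == "__main__":
    template = ("[{inputLangCode}] {originalText}{nline}"
                "[{outputLangCode}] {translatedText}{nline}"
                "{inputLangName} ---> {outputLangName}")
    print(render_translation(template, {
        "inputLangCode": "en", "inputLangName": "English",
        "outputLangCode": "es", "outputLangName": "Spanish",
        "originalText": "hello", "translatedText": "hola",
    }))
```

Substituting in one pass means that text spoken by the user which happens to contain braces is not expanded a second time.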

v1.6.1.2

  • cheerful, empathetic, neutral, uncertain speaking styles added for IBM Watson Expressive voices
  • fix for locally hosted option not setting accent to default

TTS-Voice-Wizard - v1.6.3.2

Published by VRCWizard 10 months ago

v1.6.3.2 (current version)

  • minor changes to UI and text outputs
  • whisper VAD times now clear from memory

TTS-Voice-Wizard - v1.6.3

Published by VRCWizard 11 months ago

  • Calibration added for Deepgram to automatically adjust your silence threshold based on your background noise
  • VoiceForge TTS voices

TTS-Voice-Wizard - v1.6.2

Published by VRCWizard 11 months ago

Deepgram

  • Deepgram can now be used continuously
  • The new Minimum Valid VAD Duration (s) is used to prevent sending blank outputs to the API (saves your usage)
    • by default, audio with a detected voice activation shorter than 0.5 seconds is not sent
  • Silence Scale can be used to adjust how long the pause is before sending audio. The default, 30000, is around a 1 second pause

Voice Wizard Pro

  • VoiceWizardPro API blocks audio shorter than 0.1 seconds or longer than 30 seconds
  • silence trimming of API improved

MISC

  • VAD Mode setting moved to Settings > Audio (this setting is used for both deepgram and whisper)
  • Other minor changes and bug fixes

It is highly recommended to use noise filtering software, such as Nvidia Broadcast (the continuation of RTX Voice) or Nahimic (software that comes with MSI boards), in conjunction with TTS Voice Wizard

TTS-Voice-Wizard - v1.6.1.4

Published by VRCWizard 11 months ago

  • Voice Activation Detection (VAD) for Whisper Speech-To-Text

    • prevents Whisper from transcribing when there is no voice activity
    • scales from HighQuality (not a lot of background noise) to VeryAggressive (you are in an environment with a lot of noise or fan sounds)
  • GPU selection for Whisper (may not work)

  • Custom Translation Text Option
    • output {originalText}, {translatedText}, {inputLangCode}, {inputLangName}, {outputLangCode}, {outputLangName} as you choose
    • also allows for using {nline} to start a newline
    • example output: [{inputLangCode}] {originalText}{nline}[{outputLangCode}] {translatedText}{nline}{inputLangName} ---> {outputLangName}
    • output will be visible in VRChat Chatbox or KAT

TTS-Voice-Wizard - v1.6.1.2 IBM Speaking Styles

Published by VRCWizard 11 months ago

  • speaking styles added for IBM Watson Expressive voices
  • fix for locally hosted option not setting accent to default

TTS-Voice-Wizard - v1.6.1

Published by VRCWizard 11 months ago

v1.6.1 (current)

  • loading images for home banner will no longer freeze app for those with a slower internet connection
  • should prevent "object is currently in use" issue for home banner
  • progress bar can now be used with media mode (if progress and duration are working)

v1.6.0.8

  • progressBar variable added for Spotify API mode: {progressBar E:◯ L:40}
    • adjustable length 'L' and emoji 'E'
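
One way to picture the {progressBar E:◯ L:40} variable above is as a track of L cells with the marker E placed according to playback position. The Python sketch below is only an illustration under that assumption. The helper name, the "-" filler character, and the exact layout are hypothetical and may differ from what the application draws.

```python
import re

# Matches "{progressBar E:<marker> L:<length>}".
PROGRESS_RE = re.compile(r"\{progressBar\s+E:(.*?)\s+L:(\d+)\}")


def render_progress_bar(template, progress, duration, filler="-"):
    """Replace each progressBar variable with a text bar of L cells."""
    def draw(match):
        marker, length = match.group(1), int(match.group(2))
        # Fraction of the track played, clamped to the range 0..1.
        fraction = 0.0 if duration <= 0 else min(max(progress / duration, 0.0), 1.0)
        # Keep the marker inside the bar even at 100 percent.
        position = min(int(fraction * length), length - 1)
        return filler * position + marker + filler * (length - 1 - position)
    return PROGRESS_RE.sub(draw, template)


if __name__ == "__main__":
    print(render_progress_bar("{progressBar E:◯ L:40}", progress=95, duration=210))
```

Progress and duration can be in any unit, as long as both use the same one.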

v1.6.0.7

  • partial result fixes for KAT
    • auto replay buffer is cleared upon starting a new message
    • KAT display is cleared upon starting a new message
    • auto replay doesn't run while partial results are being displayed (will unpause for the final recognized result)

TTS-Voice-Wizard - v1.6.0.8 pre-release

Published by VRCWizard 12 months ago

  • progressBar variable added for Spotify API mode: {progressBar E:◯ L:40}
    • adjustable length 'L' and emoji 'E'

TTS-Voice-Wizard - v1.6.0.7

Published by VRCWizard 12 months ago

  • partial result fixes for KAT
    • auto replay buffer is cleared upon starting a new message
    • KAT display is cleared upon starting a new message
    • auto replay doesn't run while partial results are being displayed (will unpause for the final recognized result)

TTS-Voice-Wizard - v1.6.0.6

Published by VRCWizard 12 months ago

v1.6.0.6 (current)

  • visual fix for the buttons at the top of the media integration tab

v1.6.0.5

  • fixes for azure partial results
    • now works with KAT
    • can now change output interval
    • fixed partial results outputting when it should be disabled

TTS-Voice-Wizard - v1.6.0.5

Published by VRCWizard 12 months ago

  • fixes for azure partial results
    • now works with KAT
    • can now change output interval
    • fixed partial results outputting when it should be disabled

TTS-Voice-Wizard - v1.6.0.1

Published by VRCWizard 12 months ago

v1.6.0.1 (current)

  • banner doesn't update when you aren't on the home page
  • mouse changes to hand when hovering over banner

v1.6.0

  • home screen banner no longer uses webviewer component, so dependency was removed
  • many help links now point to the official website
  • kofi links replaced with patreon links
  • fixes for partial results implementation
  • there should always be a value in the accent dropdown now which should prevent errors for some users
  • fixed typo for websocket server

v1.5.9

  • line break/new line variable added {nline} for media output
  • setting added for buffer between text output and audio output
  • partial results option added for Azure
  • HRPercent parameter actually sending floats now
  • kat backend changes and fixes

TTS-Voice-Wizard - v1.6.0

Published by VRCWizard 12 months ago

v1.6.0 (current)

  • home screen banner no longer uses webviewer component, so dependency was removed
  • many help links now point to the official website
  • kofi links replaced with patreon links
  • fixes for partial results implementation
  • there should always be a value in the accent dropdown now which should prevent errors for some users
  • fixed typo for websocket server

v1.5.9

  • line break/new line variable added {nline} for media output
  • setting added for buffer between text output and audio output
  • partial results option added for Azure
  • HRPercent parameter actually sending floats now
  • kat backend changes and fixes

TTS-Voice-Wizard - v1.5.9

Published by VRCWizard 12 months ago

  • line break/new line variable added {nline} for media output
  • setting added for buffer between text output and audio output
  • partial results option added for Azure
  • HRPercent parameter actually sending floats now
  • kat fix for sending text when continuous output is going
  • kat sync parameter now saved on close like other settings to avoid any issues on startup

TTS-Voice-Wizard - v1.5.8.5 KAT Improvements

Published by VRCWizard about 1 year ago

  • KAT code refactored and improved
    • specifically the continuous media output should not have as many mistakes when switching songs (blanking early, old song text stuck in display)
  • HRPercent parameter added (0-256 mapped to float 0.0-1.0)
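
The HRPercent mapping above can be read as a linear scale from the 0-256 range onto a 0.0-1.0 float. The Python sketch below assumes exactly that. The function name is hypothetical, and both the linear scaling and the clamping of out-of-range values are assumptions rather than confirmed behavior.

```python
def hr_percent(bpm, max_bpm=256):
    """Map a heart rate in the 0..max_bpm range to a float in 0.0..1.0."""
    clamped = min(max(bpm, 0), max_bpm)  # keep out-of-range readings in bounds
    return clamped / max_bpm


if __name__ == "__main__":
    for bpm in (0, 64, 128, 256):
        print(bpm, hr_percent(bpm))
```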