EmotiVoice-TPU

EmotiVoice-TPU is an application built on the EmotiVoice model, open-sourced by NetEase, and the OpenVoice model from MyShell. Both models are ported to SG2300X-series chip products with the Sophon SDK, so inference runs locally with TPU hardware acceleration. The application generates emotional speech from text input (TTS) and uses OpenVoice to convert the generated speech to a different tone color. A Gradio web interface provides the user interaction.

Clone the repository:

git clone https://github.com/zifeng-radxa/EmotiVoice-TPU -b radxa_v0.1.2

Download models:

cd EmotiVoice-TPU/model_file
wget https://github.com/radxa-edge/TPU-Edge-AI/releases/download/EmotiVoice/EmotiVoice_bmodels.tar.gz

Extract the bmodels into the EmotiVoice-TPU/model_file directory:

tar -xvf EmotiVoice_bmodels.tar.gz
mv ./EmotiVoice_bmodels/* .

The file structure will be as follows:

.
├── assets
│   └── audio
├── config
├── data
│   ├── inference
│   └── youdao
│       └── text
├── frontend
│   ├── cn2an
│   └── lexicon
├── model_file
│   ├── EmotiVoice_bmodels
│   ├── converter
│   ├── simbert-base-chinese
│   └── tts
├── models
│   ├── hifigan
│   └── prompt_tts_modified
│       └── modules
├── temp
└── tone_color_conversion
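
As a quick sanity check after extraction, the sketch below confirms that the model subdirectories shown in the tree above are present. The function name `check_model_dirs` is hypothetical, and the list of required subdirectories is taken only from the tree listing; it does not verify the individual bmodel files.

```shell
# Hypothetical helper: verify that model_file contains the subdirectories
# listed in the tree above (converter, simbert-base-chinese, tts).
check_model_dirs() {
    local root="${1:-model_file}"
    local missing=0
    local dir
    for dir in converter simbert-base-chinese tts; do
        if [ ! -d "$root/$dir" ]; then
            # Report each missing directory on stderr and remember the failure.
            echo "missing: $root/$dir" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_model_dirs EmotiVoice-TPU/model_file && echo "models in place"` prints the message only when all three directories exist.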

Create a virtual environment:

Create a virtual environment before installing anything; installing the dependencies system-wide may interfere with other applications. For virtual environment usage, please refer here.
```
cd EmotiVoice-TPU
python3 -m virtualenv .venv
source .venv/bin/activate
```
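
Before installing packages, it can help to confirm that the environment is actually active. The sketch below relies on the `VIRTUAL_ENV` variable that the `activate` script exports; the function name `in_venv` is hypothetical.

```shell
# Hypothetical helper: succeed only when a virtual environment is active.
# The activate script exports VIRTUAL_ENV, so an empty or unset value
# means the system Python would be used instead.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}
```

A possible guard before the install step is `in_venv || echo "activate .venv first" >&2`.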

Install dependencies:

pip3 install --upgrade pip
pip3 install https://github.com/radxa-edge/TPU-Edge-AI/releases/download/v0.1.0/tpu_perf-1.2.31-py3-none-manylinux2014_aarch64.whl
pip3 install -r requirements.txt
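
The tpu_perf wheel above carries an `aarch64` platform tag, so it is expected to install only on a 64-bit ARM system. The sketch below compares a wheel file name against a machine architecture; `wheel_matches_arch` is a hypothetical helper, and it only inspects the file name suffix (pure-Python wheels tagged `any` are not covered).

```shell
# Hypothetical helper: check that a wheel's platform tag ends with the given
# machine architecture. Defaults to the architecture reported by `uname -m`.
wheel_matches_arch() {
    local wheel="$1"
    local arch="${2:-$(uname -m)}"
    case "$wheel" in
        *"_${arch}.whl") return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `wheel_matches_arch tpu_perf-1.2.31-py3-none-manylinux2014_aarch64.whl` succeeds on a board where `uname -m` reports `aarch64`.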

Install silero-vad manually:

# Create the torch hub cache directory if it does not exist yet
mkdir -p ~/.cache/torch/hub
cd ~/.cache/torch/hub/
wget https://github.com/snakers4/silero-vad/archive/refs/tags/v4.0.zip
unzip v4.0.zip
mv silero-vad-4.0 snakers4_silero-vad_master
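
To confirm the manual install landed where the steps above put it, the sketch below checks for `hubconf.py` inside the renamed directory, since that is the entry point file torch.hub looks for in a repository. The function name `silero_vad_ready` is hypothetical, and the default path is the one used in the commands above.

```shell
# Hypothetical helper: confirm the silero-vad checkout exists under the
# torch hub cache and contains hubconf.py.
silero_vad_ready() {
    local hub_dir="${1:-$HOME/.cache/torch/hub}"
    [ -f "$hub_dir/snakers4_silero-vad_master/hubconf.py" ]
}
```

Running `silero_vad_ready && echo "silero-vad found"` prints the message only when the file is in place.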

Start the application:
```
cd EmotiVoice-TPU
# run_gr.py is a Python file by its extension, so run it with python3 rather than bash
python3 run_gr.py
```
Then open port 7860 at the Airbox's IP address in a web browser, for example http://<Airbox-IP>:7860.
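
The address can also be assembled and probed from a terminal. The sketch below only builds the URL from an IP address and the port named above; `gradio_url` is a hypothetical helper, plain HTTP is an assumption, and `192.168.1.100` in the usage note is a placeholder rather than a real Airbox address.

```shell
# Hypothetical helper: build the address of the Gradio interface from the
# Airbox IP address. The port defaults to 7860, the port named above.
gradio_url() {
    local host="$1"
    local port="${2:-7860}"
    printf 'http://%s:%s\n' "$host" "$port"
}
```

With the application running, `curl -sI "$(gradio_url 192.168.1.100)"` should return response headers if the interface is reachable from the current machine.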

Common Issues

Startup fails with OSError: cannot load library 'libsndfile.so': libsndfile.so: cannot open shared object file: No such file or directory

Solution: Install libsndfile1.
```
sudo apt install libsndfile1
```
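
To check whether the library is already known to the dynamic linker before or after installing it, the sketch below scans `ldconfig -p` output for a libsndfile entry. The function name `has_libsndfile` is hypothetical; it reads the listing from standard input so it can be tried on saved output as well.

```shell
# Hypothetical helper: read `ldconfig -p` output on stdin and succeed if a
# libsndfile shared object is listed in the linker cache.
has_libsndfile() {
    grep -q 'libsndfile\.so'
}
```

For example, `ldconfig -p | has_libsndfile || echo "libsndfile is not installed"` prints the message only when no entry is found.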
Startup is slow on the first run

Reason: The first run downloads nltk_data, so startup time depends on network speed. If the download stalls or fails, check the network connection.
