Tacotron2 + TensorFlow 1.10 + Flask speech synthesis

Background
The goal is to announce device names and their abnormal states by voice.
Environment

Tacotron2

TensorFlow 1.10

Python 3.6

Miniconda 4.8.3

Biaobei dataset (BZNSYP)

Installation and configuration
First, install Miniconda
1. Download: open the Miniconda download page on the Tsinghua mirror
https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/
https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh

2. Install

bash Miniconda3-py37_4.8.3-Linux-x86_64.sh

3. In order to continue the installation process, please review the license agreement. Please, press ENTER to continue
Press ENTER, then q to quit the license viewer
4.Do you accept the license terms? [yes|no]
yes
5.Miniconda3 will now be installed into this location:/home/aiuser/miniconda3

Press ENTER to confirm the location

Press CTRL-C to abort the installation

Or specify a different location below
(press ENTER to accept the default location)

6.Do you wish the installer to initialize Miniconda3 by running conda init? [yes|no]
no
Installation complete
7. Configure .condarc

vim ~/.condarc
# See https://mirror.tuna.tsinghua.edu.cn/help/anaconda/

channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud

Installing TensorFlow

# Activate conda
source ~/miniconda3/bin/activate
# Create a virtual environment
conda create -n tf python=3.6
# Enter the environment (leave it with conda deactivate)
conda activate tf
# Install tensorflow-gpu 1.10
conda install tensorflow-gpu==1.10.0

If a single package fails to download completely, for example:
CondaError: Downloaded bytes did not match Content-Length url: https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/cudatoolkit-9.2-0.conda target_path:/home/aiuser/miniconda3/pkgs/cudatoolkit-9.2-0.conda Content-Length: 245249198 downloaded bytes: 230342317
copy the URL from the error message, download the package yourself, and install it from the local file.

conda install --use-local cudatoolkit-9.2-0.conda
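# After the package is installed,
# rerun the tensorflow-gpu 1.10 installation
conda install tensorflow-gpu==1.10.0
# Create a test script
vim demo.py
# Contents of demo.py
import tensorflow as tf
version = tf.__version__
gpu_ok = tf.test.is_gpu_available()
print("tf version:", version, "\nuse GPU", gpu_ok)
# Run it
python demo.py
# "use GPU True" in the output means the GPU build works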
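
The CondaError above is a size mismatch: the server advertised 245249198 bytes but only 230342317 arrived. Before running the local install, it may help to confirm that the manually downloaded file has the full size. The following is a minimal sketch, assuming the file sits in the current directory; the helper name download_is_complete is made up for this example.

```python
import os


def download_is_complete(path, expected_bytes):
    """Return True when the file exists and has exactly the advertised size."""
    return os.path.isfile(path) and os.path.getsize(path) == expected_bytes


# Content-Length value taken from the CondaError message above.
EXPECTED_BYTES = 245249198

# Assumption: the package was saved under its original name in the current directory.
if download_is_complete("cudatoolkit-9.2-0.conda", EXPECTED_BYTES):
    print("download complete, safe to install")
else:
    print("file missing or truncated, download it again")
```

A size match does not prove the content is intact, but it catches the truncated download reported in this error.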

Downloading the data sources
Biaobei dataset: https://online-of-baklong.oss-cn-huhehaote.aliyuncs.com/story_resource/BZNSYP.rar?Expires=1611650858&OSSAccessKeyId=LTAI3GkKBSJFDJsp&Signature=c8ahH5BEyjEIw2wP0FmXebjNORo%3D
AISHELL: http://www.openslr.org/33/
Tacotron2
https://github.com/JasonWei512/Tacotron-2-Chinese
A pretrained model trained for 100k steps can be downloaded from that page; if you use it, skip ahead to step 6.
Download: http://github.com/JasonWei512/Tacotron-2-Chinese/archive/mandarin-biaobei.zip
1. Unzip Tacotron-2-mandarin-mel.zip
2. Extract the Biaobei dataset into the Tacotron-2-mandarin-mel root directory

Tacotron-2-mandarin-mel
	|- BZNSYP
		|- PhoneLabeling
		|- ProsodyLabeling
		|- Wave
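
The layout above can be checked before moving on. This is only a sketch: the helper name missing_subdirs is hypothetical, and the project root is assumed to be a folder named Tacotron-2-mandarin-mel under the current directory.

```python
from pathlib import Path

# Subdirectories shown in the layout above.
EXPECTED_SUBDIRS = ("PhoneLabeling", "ProsodyLabeling", "Wave")


def missing_subdirs(project_root):
    """Return the expected BZNSYP subdirectories that are not present."""
    corpus = Path(project_root) / "BZNSYP"
    return [name for name in EXPECTED_SUBDIRS if not (corpus / name).is_dir()]


# Assumption: the project root is relative to the current directory.
missing = missing_subdirs("Tacotron-2-mandarin-mel")
print("layout OK" if not missing else f"missing: {missing}")
```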

3. Use ffmpeg to lower the sampling rate of the wav files in /BZNSYP/Wave/ to 36 kHz:

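import os
import subprocess

input_path = r"D:\tensorflow\Tacotron-2-mandarin-mel\Tacotron-2-mandarin-mel\BZNSYP\Wave"
output_path = r"D:\tensorflow\Tacotron-2-mandarin-mel\Tacotron-2-mandarin-mel\BZNSYP\Wave2"
# The converted files land in Wave2; the layout above expects them under Wave,
# so swap the folders afterwards if preprocessing reads BZNSYP\Wave.
os.makedirs(output_path, exist_ok=True)  # ffmpeg does not create the output folder
for file in os.listdir(input_path):
    if not file.lower().endswith(".wav"):
        continue  # skip anything that is not a wav file
    src = os.path.join(input_path, file)
    dst = os.path.join(output_path, file)
    # Pass the arguments as a list so paths with spaces need no shell quoting
    subprocess.call(["ffmpeg", "-i", src, "-ar", "36000", dst])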
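
After the conversion it can be worth confirming that every file really is at 36 kHz. The sketch below uses only the standard-library wave module and assumes the converted files are plain PCM WAV; the function name wrong_rate_files is hypothetical.

```python
import wave
from pathlib import Path


def wrong_rate_files(wav_dir, expected_hz=36000):
    """Return the names of .wav files in wav_dir whose sample rate differs."""
    bad = []
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            if wav.getframerate() != expected_hz:
                bad.append(path.name)
    return bad
```

Calling wrong_rate_files on the output folder of the script above should return an empty list once every file has been resampled.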

4. Preprocess the data

python preprocess.py --dataset='Biaobei'

5. Train

python train.py --model='Tacotron-2'

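6. Synthesize
If there is no WaveNet model and only the spectrogram prediction model, audio is generated with Griffin-Lim alone and written to /tacotron_output/logs-eval/wavs/.
If a WaveNet model is available, the audio it generates is written to /wavenet_output/wavs/.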

python synthesize.py --model='Tacotron-2' --text_list='sentences.txt'
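
For the use case in the background section (device names combined with abnormal states), the text list passed via --text_list could be generated rather than typed by hand. The sketch below is an assumption-heavy illustration: the helper names are hypothetical, and the notation synthesize.py expects on each line (for example characters versus pinyin) is not stated here, so check the sentences.txt shipped with the repository before substituting real content.

```python
from itertools import product


def build_sentences(devices, states):
    """Combine every device name with every state, one sentence per pair."""
    return [f"{device} {state}" for device, state in product(devices, states)]


def write_text_list(sentences, path="sentences.txt"):
    """Write one sentence per line, UTF-8 encoded."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(sentences) + "\n")
```

A call such as write_text_list(build_sentences(devices, states)) would then produce the file named in the command above, with devices and states being your own lists.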

Repository Structure:

Tacotron-2
├── datasets
├── en_UK		(0)
│   └── by_book
│       └── female
├── en_US		(0)
│   └── by_book
│       ├── female
│       └── male
├── LJSpeech-1.1	(0)
│   └── wavs
├── logs-Tacotron	(2)
│   ├── eval_-dir
│   │   ├── plots
│   │   └── wavs
│   ├── mel-spectrograms
│   ├── plots
│   ├── taco_pretrained
│   ├── metas
│   └── wavs
├── logs-Wavenet	(4)
│   ├── eval-dir
│   │   ├── plots
│   │   └── wavs
│   ├── plots
│   ├── wave_pretrained
│   ├── metas
│   └── wavs
├── logs-Tacotron-2	( * )
│   ├── eval-dir
│   │   ├── plots
│   │   └── wavs
│   ├── plots
│   ├── taco_pretrained
│   ├── wave_pretrained
│   ├── metas
│   └── wavs
├── papers
├── tacotron
│   ├── models
│   └── utils
├── tacotron_output	(3)
│   ├── eval
│   ├── gta
│   ├── logs-eval
│   │   ├── plots
│   │   └── wavs
│   └── natural
├── wavenet_output	(5)
│   ├── plots
│   └── wavs
├── training_data	(1)
│   ├── audio
│   ├── linear
│   └── mels
└── wavenet_vocoder
    └── models

Loading the model in Flask and calling synthesis through an HTTP interface
Download: https://gitee.com/mtllll/tacotron2-flask-server
Download the pretrained model and place it in logs-Tacotron-2 under the server root directory
Run the server's app.py
Run the client's app.py
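
To illustrate what an interface-style call can look like, here is a sketch of how a client might build an HTTP request for one sentence using only the standard library. The route /synthesize and the JSON field text are assumptions made for this example and are not taken from the tacotron2-flask-server project; port 5000 is simply Flask's default development port. Read the server's app.py for the real route and payload.

```python
import json
import urllib.request


def build_synthesis_request(text, base_url="http://127.0.0.1:5000"):
    """Build (but do not send) a JSON POST request for one sentence."""
    payload = json.dumps({"text": text}).encode("utf-8")  # hypothetical field name
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/synthesize",  # hypothetical route
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it once the server is running and the route matches.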
