paddlespeech.t2s.exps.syn_utils module

paddlespeech.t2s.exps.syn_utils.am_to_static(am_inference, am: str = 'fastspeech2_csmsc', inference_dir: Optional[PathLike] = None, speaker_dict: Optional[PathLike] = None)[source]
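
A minimal usage sketch for exporting the dygraph acoustic model to a static graph; am_inference is assumed to come from get_am_inference below, and the output directory is a placeholder:

    import paddle
    from paddlespeech.t2s.exps.syn_utils import am_to_static

    # am_inference: the dygraph inference callable returned by get_am_inference.
    # The converted graph is also saved under inference_dir.
    am_inference = am_to_static(
        am_inference,
        am='fastspeech2_csmsc',
        inference_dir='exp/inference')  # hypothetical path
    mel = am_inference(paddle.to_tensor([1, 2, 3, 4]))  # toy phone ids
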
paddlespeech.t2s.exps.syn_utils.denorm(data, mean, std)[source]
paddlespeech.t2s.exps.syn_utils.get_am_inference(am: str = 'fastspeech2_csmsc', am_config: Optional[CfgNode] = None, am_ckpt: Optional[PathLike] = None, am_stat: Optional[PathLike] = None, phones_dict: Optional[PathLike] = None, tones_dict: Optional[PathLike] = None, speaker_dict: Optional[PathLike] = None, return_am: bool = False, speech_stretchs: Optional[PathLike] = None)[source]
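
A hedged sketch of building the acoustic-model inference object; all file paths are placeholders, and the config is assumed to be the model's yaml loaded into a yacs CfgNode:

    import yaml
    from yacs.config import CfgNode
    from paddlespeech.t2s.exps.syn_utils import get_am_inference

    with open('fastspeech2_csmsc/default.yaml') as f:   # hypothetical path
        am_config = CfgNode(yaml.safe_load(f))
    am_inference = get_am_inference(
        am='fastspeech2_csmsc',
        am_config=am_config,
        am_ckpt='fastspeech2_csmsc/snapshot.pdz',       # hypothetical path
        am_stat='fastspeech2_csmsc/speech_stats.npy',   # hypothetical path
        phones_dict='fastspeech2_csmsc/phone_id_map.txt')
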
paddlespeech.t2s.exps.syn_utils.get_am_output(input: str, am_predictor: Layer, am: str, frontend: object, lang: str = 'zh', merge_sentences: bool = True, speaker_dict: Optional[PathLike] = None, spk_id: int = 0, add_blank: bool = False)[source]
paddlespeech.t2s.exps.syn_utils.get_am_sublayer_output(am_sublayer_predictor, input)[source]
paddlespeech.t2s.exps.syn_utils.get_chunks(mel, chunk_size: int, pad_size: int)[source]

Split mel by chunk size with left and right context.

Args:

    mel (paddle.Tensor): mel spectrogram, shape (B, T, D).
    chunk_size (int): chunk size.
    pad_size (int): size of the left and right context.
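
A toy sketch of the expected chunking behavior under these definitions; interior chunks carry pad_size frames of context on each side, boundary chunks as much as is available:

    import paddle
    from paddlespeech.t2s.exps.syn_utils import get_chunks

    mel = paddle.randn([1, 100, 80])   # (B, T, D)
    chunks = get_chunks(mel, chunk_size=42, pad_size=12)
    # Interior chunks span up to 42 + 2 * 12 frames along T.
    print([c.shape[1] for c in chunks])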

paddlespeech.t2s.exps.syn_utils.get_dev_dataloader(dev_metadata: List[Dict[str, Any]], am: str, batch_size: int = 1, speaker_dict: Optional[PathLike] = None, voice_cloning: bool = False, n_shift: int = 300, batch_max_steps: int = 16200, shuffle: bool = True)[source]
paddlespeech.t2s.exps.syn_utils.get_frontend(lang: str = 'zh', phones_dict: Optional[PathLike] = None, tones_dict: Optional[PathLike] = None, pinyin_phone: Optional[PathLike] = None, use_rhy=False)[source]
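
A minimal sketch of building a Chinese frontend and running text through it (see run_frontend below); the phones_dict path is a placeholder:

    from paddlespeech.t2s.exps.syn_utils import get_frontend, run_frontend

    frontend = get_frontend(
        lang='zh',
        phones_dict='phone_id_map.txt')   # hypothetical path
    out = run_frontend(frontend, text='你好，世界。', merge_sentences=True)
    phone_ids = out['phone_ids']          # list of paddle.Tensor, one per (merged) sentence
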
paddlespeech.t2s.exps.syn_utils.get_predictor(model_dir: Optional[PathLike] = None, model_file: Optional[PathLike] = None, params_file: Optional[PathLike] = None, device: str = 'cpu', use_trt: bool = False, device_id: int = 0, use_dynamic_shape: bool = True, min_subgraph_size: int = 5, cpu_threads: int = 1, use_mkldnn: bool = False, precision: str = 'fp32')[source]
Args:

    model_dir (os.PathLike): root path of model.pdmodel and model.pdiparams.
    model_file (os.PathLike): name of model_file.
    params_file (os.PathLike): name of params_file.
    device (str): device to run on, one of cpu/gpu; default is cpu.
    use_trt (bool): whether to use TensorRT on GPU.
    device_id (int): device id, only valid when the device is gpu; default 0.
    use_dynamic_shape (bool): whether to use dynamic shape in TensorRT.
    use_mkldnn (bool): whether to use MKLDNN on CPU.
    cpu_threads (int): number of threads when running on CPU.
    precision (str): running precision (fp32/fp16/bf16/int8).
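
A hedged sketch of creating a Paddle Inference predictor from an exported static model; the directory and file names are placeholders following the am_to_static export convention:

    from paddlespeech.t2s.exps.syn_utils import get_predictor

    am_predictor = get_predictor(
        model_dir='exp/inference',                   # hypothetical path
        model_file='fastspeech2_csmsc.pdmodel',
        params_file='fastspeech2_csmsc.pdiparams',
        device='gpu',
        device_id=0)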

paddlespeech.t2s.exps.syn_utils.get_sentences(text_file: Optional[PathLike], lang: str = 'zh')[source]
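
get_sentences reads one utterance per line, an id followed by the text, and yields (utt_id, sentence) pairs; a toy sketch with a hypothetical file:

    from paddlespeech.t2s.exps.syn_utils import get_sentences

    # sentences.txt (hypothetical), one "utt_id text" pair per line, e.g.:
    #   001 你好，欢迎使用语音合成服务。
    #   002 今天天气很好。
    for utt_id, sentence in get_sentences('sentences.txt', lang='zh'):
        print(utt_id, sentence)
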
paddlespeech.t2s.exps.syn_utils.get_sentences_svs(text_file: Optional[PathLike])[source]
paddlespeech.t2s.exps.syn_utils.get_sess(model_path: Optional[PathLike], device: str = 'cpu', cpu_threads: int = 1, use_trt: bool = False)[source]
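
get_sess builds an onnxruntime InferenceSession; a hedged sketch, where the .onnx path and the 'text' input name are assumptions for a fastspeech2 export:

    import numpy as np
    from paddlespeech.t2s.exps.syn_utils import get_sess

    sess = get_sess('exp/onnx/fastspeech2_csmsc.onnx',   # hypothetical path
                    device='cpu', cpu_threads=2)
    phone_ids = np.array([1, 2, 3, 4], dtype=np.int64)
    mel = sess.run(None, {'text': phone_ids})[0]
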
paddlespeech.t2s.exps.syn_utils.get_streaming_am_output(input: str, am_encoder_infer_predictor, am_decoder_predictor, am_postnet_predictor, frontend, lang: str = 'zh', merge_sentences: bool = True)[source]
paddlespeech.t2s.exps.syn_utils.get_test_dataset(test_metadata: List[Dict[str, Any]], am: str, speaker_dict: Optional[PathLike] = None, voice_cloning: bool = False)[source]
paddlespeech.t2s.exps.syn_utils.get_voc_inference(voc: str = 'pwgan_csmsc', voc_config: Optional[PathLike] = None, voc_ckpt: Optional[PathLike] = None, voc_stat: Optional[PathLike] = None)[source]
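
A minimal sketch of loading the vocoder, symmetric with get_am_inference above; paths are placeholders and voc_config is assumed to be the vocoder yaml loaded into a CfgNode:

    from paddlespeech.t2s.exps.syn_utils import get_voc_inference

    voc_inference = get_voc_inference(
        voc='pwgan_csmsc',
        voc_config=voc_config,                   # yacs CfgNode, loaded as for am_config
        voc_ckpt='pwgan_csmsc/snapshot.pdz',     # hypothetical path
        voc_stat='pwgan_csmsc/feats_stats.npy')  # hypothetical path
    wav = voc_inference(mel)                     # mel from the acoustic model
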
paddlespeech.t2s.exps.syn_utils.get_voc_output(voc_predictor, input)[source]
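
Together, get_am_output and get_voc_output form the predictor-based pipeline; a hedged end-to-end sketch, assuming am_predictor and voc_predictor from get_predictor and a frontend from get_frontend:

    import soundfile as sf
    from paddlespeech.t2s.exps.syn_utils import get_am_output, get_voc_output

    mel = get_am_output(
        input='你好，世界。',
        am_predictor=am_predictor,
        am='fastspeech2_csmsc',
        frontend=frontend,
        lang='zh')
    wav = get_voc_output(voc_predictor=voc_predictor, input=mel)
    sf.write('output.wav', wav, samplerate=24000)  # 24 kHz for csmsc models
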
paddlespeech.t2s.exps.syn_utils.norm(data, mean, std)[source]
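
norm and denorm (above) apply and invert per-feature mean/std scaling; a toy round trip, assuming the standard z-score definitions:

    import numpy as np
    from paddlespeech.t2s.exps.syn_utils import denorm, norm

    data = np.random.randn(5, 80).astype('float32')
    mean, std = data.mean(axis=0), data.std(axis=0)
    normed = norm(data, mean, std)         # (data - mean) / std
    restored = denorm(normed, mean, std)   # normed * std + mean
    assert np.allclose(restored, data, atol=1e-4)
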
paddlespeech.t2s.exps.syn_utils.run_frontend(frontend: object, text: str, merge_sentences: bool = False, get_tone_ids: bool = False, lang: str = 'zh', to_tensor: bool = True, add_blank: bool = False, svs_input: Optional[Dict[str, str]] = None)[source]
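
A hedged sketch of run_frontend's dict output when tone ids are requested (as for speedyspeech); the keys follow this module's callers, and the frontend is assumed to have been built with both phones_dict and tones_dict:

    out = run_frontend(
        frontend,
        text='你好。',
        merge_sentences=False,
        get_tone_ids=True,
        lang='zh')
    for phone_ids, tone_ids in zip(out['phone_ids'], out['tone_ids']):
        ...  # one tensor pair per sentence fragment
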
paddlespeech.t2s.exps.syn_utils.voc_to_static(voc_inference, voc: str = 'pwgan_csmsc', inference_dir: Optional[PathLike] = None)[source]