paddlespeech.t2s.models.tacotron2.tacotron2 module
Tacotron 2 related modules for paddle
- class paddlespeech.t2s.models.tacotron2.tacotron2.Tacotron2(idim: int, odim: int, embed_dim: int = 512, elayers: int = 1, eunits: int = 512, econv_layers: int = 3, econv_chans: int = 512, econv_filts: int = 5, atype: str = 'location', adim: int = 512, aconv_chans: int = 32, aconv_filts: int = 15, cumulate_att_w: bool = True, dlayers: int = 2, dunits: int = 1024, prenet_layers: int = 2, prenet_units: int = 256, postnet_layers: int = 5, postnet_chans: int = 512, postnet_filts: int = 5, output_activation: Optional[str] = None, use_batch_norm: bool = True, use_concate: bool = True, use_residual: bool = False, reduction_factor: int = 1, spk_num: Optional[int] = None, lang_num: Optional[int] = None, spk_embed_dim: Optional[int] = None, spk_embed_integration_type: str = 'concat', dropout_rate: float = 0.5, zoneout_rate: float = 0.1, init_type: str = 'xavier_uniform')[source]
Bases: Layer
Tacotron2 module for end-to-end text-to-speech.
This is a module of Spectrogram prediction network in Tacotron2 described in Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, which converts the sequence of characters into the sequence of Mel-filterbanks.
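Among the constructor arguments above, reduction_factor controls how many output frames the decoder emits per step. A minimal pure-Python sketch of the resulting decoder step count (illustrative only; the actual decoding loop lives inside the model):

```python
import math

def decoder_steps(t_feats: int, reduction_factor: int) -> int:
    """Number of decoder iterations needed to cover t_feats output
    frames when each step emits `reduction_factor` frames."""
    return math.ceil(t_feats / reduction_factor)

# With reduction_factor=1 (the default) one step produces one frame;
# with reduction_factor=2 the decoder runs roughly half as many steps.
print(decoder_steps(100, 1))  # 100
print(decoder_steps(100, 2))  # 50
print(decoder_steps(101, 2))  # 51
```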
Methods
- __call__(*inputs, **kwargs): Call self as a function.
- add_parameter(name, parameter): Adds a Parameter instance.
- add_sublayer(name, sublayer): Adds a sub Layer instance.
- apply(fn): Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.
- buffers([include_sublayers]): Returns a list of all buffers from current layer and its sub-layers.
- children(): Returns an iterator over immediate children layers.
- clear_gradients(): Clear the gradients of all parameters for this layer.
- create_parameter(shape[, attr, dtype, ...]): Create parameters for this layer.
- create_tensor([name, persistable, dtype]): Create Tensor for this layer.
- create_variable([name, persistable, dtype]): Create Tensor for this layer.
- eval(): Sets this Layer and all its sublayers to evaluation mode.
- extra_repr(): Extra representation of this layer; you can provide a custom implementation for your own layer.
- forward(text, text_lengths, speech, ...[, ...]): Calculate forward propagation.
- full_name(): Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__.
- inference(text[, speech, spk_emb, spk_id, ...]): Generate the sequence of features given the sequences of characters.
- load_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- named_buffers([prefix, include_sublayers]): Returns an iterator over all buffers in the Layer, yielding tuples of name and Tensor.
- named_children(): Returns an iterator over immediate children layers, yielding both the name of the layer and the layer itself.
- named_parameters([prefix, include_sublayers]): Returns an iterator over all parameters in the Layer, yielding tuples of name and parameter.
- named_sublayers([prefix, include_self, ...]): Returns an iterator over all sublayers in the Layer, yielding tuples of name and sublayer.
- parameters([include_sublayers]): Returns a list of all Parameters from current layer and its sub-layers.
- register_buffer(name, tensor[, persistable]): Registers a tensor as a buffer of the layer.
- register_forward_post_hook(hook): Register a forward post-hook for Layer.
- register_forward_pre_hook(hook): Register a forward pre-hook for Layer.
- set_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- set_state_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- state_dict([destination, include_sublayers, ...]): Get all parameters and persistable buffers of current layer and its sub-layers.
- sublayers([include_self]): Returns a list of sub layers.
- to([device, dtype, blocking]): Cast the parameters and buffers of Layer by the given device, dtype and blocking.
- to_static_state_dict([destination, ...]): Get all parameters and buffers of current layer and its sub-layers.
- train(): Sets this Layer and all its sublayers to training mode.
- backward
- register_state_dict_hook
- forward(text: Tensor, text_lengths: Tensor, speech: Tensor, speech_lengths: Tensor, spk_emb: Optional[Tensor] = None, spk_id: Optional[Tensor] = None, lang_id: Optional[Tensor] = None) Tuple[Tensor, Dict[str, Tensor], Tensor] [source]
Calculate forward propagation.
- Args:
- text (Tensor(int64)):
Batch of padded character ids (B, T_text).
- text_lengths (Tensor(int64)):
Batch of lengths of each input batch (B,).
- speech (Tensor):
Batch of padded target features (B, T_feats, odim).
- speech_lengths (Tensor(int64)):
Batch of the lengths of each target (B,).
- spk_emb (Optional[Tensor]):
Batch of speaker embeddings (B, spk_embed_dim).
- spk_id (Optional[Tensor]):
Batch of speaker IDs (B, 1).
- lang_id (Optional[Tensor]):
Batch of language IDs (B, 1).
- Returns:
- Tensor:
Loss scalar value.
- Dict:
Statistics to be monitored.
- Tensor:
Weight value if not joint training else model outputs.
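The text and text_lengths arguments follow the usual padded-batch convention. A minimal pure-Python sketch of how variable-length character-id sequences map to the (B, T_text) layout (the pad id 0 is an assumption for illustration only):

```python
from typing import List, Tuple

def pad_batch(seqs: List[List[int]],
              pad_id: int = 0) -> Tuple[List[List[int]], List[int]]:
    """Pad variable-length id sequences to a (B, T_text) grid and
    return the original lengths, mirroring forward()'s
    text / text_lengths inputs."""
    lengths = [len(s) for s in seqs]
    t_text = max(lengths)
    padded = [s + [pad_id] * (t_text - len(s)) for s in seqs]
    return padded, lengths

text, text_lengths = pad_batch([[5, 3, 9], [7, 2], [4]])
print(text)          # [[5, 3, 9], [7, 2, 0], [4, 0, 0]]
print(text_lengths)  # [3, 2, 1]
```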
- inference(text: Tensor, speech: Optional[Tensor] = None, spk_emb: Optional[Tensor] = None, spk_id: Optional[Tensor] = None, lang_id: Optional[Tensor] = None, threshold: float = 0.5, minlenratio: float = 0.0, maxlenratio: float = 10.0, use_att_constraint: bool = False, backward_window: int = 1, forward_window: int = 3, use_teacher_forcing: bool = False) Dict[str, Tensor] [source]
Generate the sequence of features given the sequences of characters.
- Args:
- text (Tensor(int64)):
Input sequence of characters (T_text,).
- speech (Optional[Tensor]):
Feature sequence to extract style (N, idim).
- spk_emb (Optional[Tensor]):
Speaker embedding (spk_embed_dim,).
- spk_id (Optional[Tensor]):
Speaker ID (1,).
- lang_id (Optional[Tensor]):
Language ID (1,).
- threshold (float):
Threshold in inference.
- minlenratio (float):
Minimum length ratio in inference.
- maxlenratio (float):
Maximum length ratio in inference.
- use_att_constraint (bool):
Whether to apply attention constraint.
- backward_window (int):
Backward window in attention constraint.
- forward_window (int):
Forward window in attention constraint.
- use_teacher_forcing (bool):
Whether to use teacher forcing.
- Returns:
Dict[str, Tensor] Output dict including the following items:
feat_gen (Tensor): Output sequence of features (T_feats, odim).
prob (Tensor): Output sequence of stop probabilities (T_feats,).
att_w (Tensor): Attention weights (T_feats, T).
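How threshold, minlenratio, and maxlenratio interact during autoregressive decoding can be sketched in pure Python (the stop probabilities below are fabricated for illustration; the real loop is inside inference()):

```python
def should_stop(step: int, stop_prob: float, t_text: int,
                threshold: float = 0.5,
                minlenratio: float = 0.0,
                maxlenratio: float = 10.0) -> bool:
    """Stop decoding once the stop probability crosses `threshold`,
    but never before minlenratio * t_text steps, and always by
    maxlenratio * t_text steps."""
    minlen = int(t_text * minlenratio)
    maxlen = int(t_text * maxlenratio)
    if step + 1 >= maxlen:
        return True   # hard upper bound on output length
    if step + 1 < minlen:
        return False  # too early to stop; ignore the predictor
    return stop_prob > threshold

# Toy run over fabricated stop probabilities for a 10-character input.
probs = [0.01, 0.02, 0.1, 0.3, 0.7, 0.9]
stop_at = next(i for i, p in enumerate(probs) if should_stop(i, p, t_text=10))
print(stop_at)  # 4  (first step whose probability exceeds 0.5)
```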
- class paddlespeech.t2s.models.tacotron2.tacotron2.Tacotron2Inference(normalizer, model)[source]
Bases: Layer
Methods
- __call__(*inputs, **kwargs): Call self as a function.
- add_parameter(name, parameter): Adds a Parameter instance.
- add_sublayer(name, sublayer): Adds a sub Layer instance.
- apply(fn): Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.
- buffers([include_sublayers]): Returns a list of all buffers from current layer and its sub-layers.
- children(): Returns an iterator over immediate children layers.
- clear_gradients(): Clear the gradients of all parameters for this layer.
- create_parameter(shape[, attr, dtype, ...]): Create parameters for this layer.
- create_tensor([name, persistable, dtype]): Create Tensor for this layer.
- create_variable([name, persistable, dtype]): Create Tensor for this layer.
- eval(): Sets this Layer and all its sublayers to evaluation mode.
- extra_repr(): Extra representation of this layer; you can provide a custom implementation for your own layer.
- forward(text[, spk_id, spk_emb]): Defines the computation performed at every call.
- full_name(): Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__.
- load_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- named_buffers([prefix, include_sublayers]): Returns an iterator over all buffers in the Layer, yielding tuples of name and Tensor.
- named_children(): Returns an iterator over immediate children layers, yielding both the name of the layer and the layer itself.
- named_parameters([prefix, include_sublayers]): Returns an iterator over all parameters in the Layer, yielding tuples of name and parameter.
- named_sublayers([prefix, include_self, ...]): Returns an iterator over all sublayers in the Layer, yielding tuples of name and sublayer.
- parameters([include_sublayers]): Returns a list of all Parameters from current layer and its sub-layers.
- register_buffer(name, tensor[, persistable]): Registers a tensor as a buffer of the layer.
- register_forward_post_hook(hook): Register a forward post-hook for Layer.
- register_forward_pre_hook(hook): Register a forward pre-hook for Layer.
- set_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- set_state_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- state_dict([destination, include_sublayers, ...]): Get all parameters and persistable buffers of current layer and its sub-layers.
- sublayers([include_self]): Returns a list of sub layers.
- to([device, dtype, blocking]): Cast the parameters and buffers of Layer by the given device, dtype and blocking.
- to_static_state_dict([destination, ...]): Get all parameters and buffers of current layer and its sub-layers.
- train(): Sets this Layer and all its sublayers to training mode.
- backward
- register_state_dict_hook
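Tacotron2Inference pairs a feature normalizer with the trained model: the model emits normalized features, and the normalizer maps them back to the raw feature scale before vocoding. A hedged pure-Python sketch of this wrapper pattern (class and argument names here are stand-ins, not the actual PaddleSpeech API):

```python
class InferenceWrapper:
    """Minimal sketch of the normalizer + model wrapper pattern:
    run the acoustic model, then invert feature normalization."""

    def __init__(self, normalizer, model):
        self.normalizer = normalizer  # maps normalized features back to raw scale
        self.model = model            # produces normalized features from text

    def __call__(self, text):
        out = self.model(text)        # normalized features
        return self.normalizer(out)   # denormalized features

# Toy stand-ins: the "model" doubles each id, the "normalizer"
# denormalizes by adding a mean of 10 back to each value.
wrapped = InferenceWrapper(lambda xs: [x + 10 for x in xs],
                           lambda xs: [x * 2 for x in xs])
print(wrapped([1, 2, 3]))  # [12, 14, 16]
```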