paddlespeech.t2s.models.waveflow module

class paddlespeech.t2s.models.waveflow.ConditionalWaveFlow(upsample_factors: List[int], n_flows: int, n_layers: int, n_group: int, channels: int, n_mels: int, kernel_size: Union[int, List[int]])[source]

Bases: LayerList

ConditionalWaveFlow, an UpsampleNet with a WaveFlow model.

Args:

upsample_factors (List[int]): Upsample factors for the upsample net.
n_flows (int): Number of flows in the WaveFlow model.
n_layers (int): Number of ResidualBlocks in each Flow.
n_group (int): Number of timesteps to fold as a group.
channels (int): Feature size of each ResidualBlock.
n_mels (int): Feature size of mel spectrogram (mel bands).
kernel_size (Union[int, List[int]]): Kernel size of the convolution layer in each ResidualBlock.

Methods

`__call__`(*inputs, **kwargs)	Call self as a function.
`add_parameter`(name, parameter)	Adds a Parameter instance.
`add_sublayer`(name, sublayer)	Adds a sub Layer instance.
`append`(sublayer)	Appends a sublayer to the end of the list.
`apply`(fn)	Applies `fn` recursively to every sublayer (as returned by `.sublayers()`) as well as self.
`buffers`([include_sublayers])	Returns a list of all buffers from current layer and its sub-layers.
`children`()	Returns an iterator over immediate children layers.
`clear_gradients`()	Clear the gradients of all parameters for this layer.
`create_parameter`(shape[, attr, dtype, ...])	Create parameters for this layer.
`create_tensor`([name, persistable, dtype])	Create Tensor for this layer.
`create_variable`([name, persistable, dtype])	Create Tensor for this layer.
`eval`()	Sets this Layer and all its sublayers to evaluation mode.
`extend`(sublayers)	Appends sublayers to the end of the list.
`extra_repr`()	Extra representation of this layer, you can have custom implementation of your own layer.
`forward`(audio, mel)	Compute the transformed random variable z (x to z) and the log of the determinant of the jacobian of the transformation from x to z.
`from_pretrained`(config, checkpoint_path)	Build a ConditionalWaveFlow model from a pretrained model.
`full_name`()	Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__
`infer`(mel)	Generate raw audio given mel spectrogram.
`insert`(index, sublayer)	Insert a sublayer before a given index in the list.
`load_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`named_buffers`([prefix, include_sublayers])	Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.
`named_children`()	Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.
`named_parameters`([prefix, include_sublayers])	Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.
`named_sublayers`([prefix, include_self, ...])	Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.
`parameters`([include_sublayers])	Returns a list of all Parameters from current layer and its sub-layers.
`predict`(mel)	Generate raw audio given mel spectrogram.
`register_buffer`(name, tensor[, persistable])	Registers a tensor as buffer into the layer.
`register_forward_post_hook`(hook)	Register a forward post-hook for Layer.
`register_forward_pre_hook`(hook)	Register a forward pre-hook for Layer.
`set_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`set_state_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`state_dict`([destination, include_sublayers, ...])	Get all parameters and persistable buffers of current layer and its sub-layers.
`sublayers`([include_self])	Returns a list of sub layers.
`to`([device, dtype, blocking])	Cast the parameters and buffers of Layer by the given device, dtype and blocking.
`to_static_state_dict`([destination, ...])	Get all parameters and buffers of current layer and its sub-layers.
`train`()	Sets this Layer and all its sublayers to training mode.

backward
register_state_dict_hook

forward(audio, mel)[source]

Compute the transformed random variable z (x to z) and the log of the determinant of the jacobian of the transformation from x to z.

Args:

audio (Tensor): The audio. shape=(B, T)
mel (Tensor): The mel spectrogram. shape=(B, C_mel, T_mel)

Returns:

Tensor: The inversely transformed random variable z (x to z). shape=(B, T)
Tensor: The log of the determinant of the jacobian of the transformation from x to z. shape=(1,)
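
For reference, the two returned values are the terms of the standard change-of-variables identity for normalizing flows. The symbol f for the x to z map is introduced here only for illustration and does not appear in the library:

```latex
z = f(x;\,\mathrm{mel}), \qquad
\log p_X(x \mid \mathrm{mel})
  = \log p_Z(z) + \log\left|\det \frac{\partial f(x;\,\mathrm{mel})}{\partial x}\right|
```

The first returned tensor is z and the second is the log-determinant term on the right-hand side.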

classmethod from_pretrained(config, checkpoint_path)[source]

Build a ConditionalWaveFlow model from a pretrained model.

Args:

config (yacs.config.CfgNode): Model configs.
checkpoint_path (Path or str): The path of the pretrained model checkpoint, without extension name.

Returns:

ConditionalWaveFlow: The model built from the pretrained checkpoint.

infer(mel)[source]

Generate raw audio given mel spectrogram.

Args:

mel (np.ndarray): Mel spectrogram of an utterance (in log-magnitude). shape=(C_mel, T_mel)

Returns:

Tensor: The synthesized audio, where ``T <= T_mel * upsample_factors``. shape=(B, T)
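
The length bound can be illustrated with a small sketch. It assumes that `T_mel * upsample_factors` means `T_mel` times the product of all entries in `upsample_factors`; the helper name `max_audio_length` and the example factors are hypothetical and not part of the library.

```python
import math
from typing import Sequence


def max_audio_length(t_mel: int, upsample_factors: Sequence[int]) -> int:
    """Upper bound on audio samples for t_mel mel frames.

    Assumption: the overall upsampling rate is the product of the factors.
    """
    return t_mel * math.prod(upsample_factors)


# Hypothetical configuration: three upsampling stages.
bound = max_audio_length(t_mel=80, upsample_factors=[16, 4, 4])
print(bound)
```

Under this reading, the product of the factors plays the role of the hop size between mel frames.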

predict(mel)[source]

Generate raw audio given mel spectrogram.

Args:

mel (np.ndarray): Mel spectrogram of an utterance (in log-magnitude). shape=(C_mel, T_mel)

Returns:

np.ndarray: The synthesized audio. shape=(T,)

class paddlespeech.t2s.models.waveflow.WaveFlow(n_flows, n_layers, n_group, channels, mel_bands, kernel_size)[source]

Bases: LayerList

A deep reversible layer that is composed of several autoregressive flows.

Args:

n_flows (int): Number of flows in the WaveFlow model.
n_layers (int): Number of ResidualBlocks in each Flow.
n_group (int): Number of timesteps to fold as a group.
channels (int): Feature size of each ResidualBlock.
mel_bands (int): Feature size of mel spectrogram (mel bands).
kernel_size (Union[int, List[int]]): Kernel size of the convolution layer in each ResidualBlock.
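
What "folding timesteps as a group" can look like is sketched below in NumPy. This is a minimal illustration, assuming the waveform is trimmed to a multiple of `n_group` and reshaped so that each column holds `n_group` adjacent samples; the helpers `fold` and `unfold` are hypothetical and the exact layout used by the library may differ.

```python
import numpy as np


def fold(audio: np.ndarray, n_group: int) -> np.ndarray:
    """Fold (B, T) audio into (B, n_group, T // n_group), trimming the tail."""
    batch, t = audio.shape
    usable = t // n_group * n_group  # drop samples that do not fill a group
    x = audio[:, :usable].reshape(batch, usable // n_group, n_group)
    return x.transpose(0, 2, 1)  # each column now holds n_group adjacent samples


def unfold(x: np.ndarray) -> np.ndarray:
    """Inverse of fold: (B, n_group, W) back to (B, n_group * W)."""
    batch, n_group, width = x.shape
    return x.transpose(0, 2, 1).reshape(batch, n_group * width)


audio = np.arange(2 * 10, dtype=np.float32).reshape(2, 10)
folded = fold(audio, n_group=4)
restored = unfold(folded)
print(folded.shape, restored.shape)
```

Folding turns a long 1-D sequence into a short 2-D one, so the autoregressive dependency only has to span `n_group` rows instead of every timestep.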

Methods

`__call__`(*inputs, **kwargs)	Call self as a function.
`add_parameter`(name, parameter)	Adds a Parameter instance.
`add_sublayer`(name, sublayer)	Adds a sub Layer instance.
`append`(sublayer)	Appends a sublayer to the end of the list.
`apply`(fn)	Applies `fn` recursively to every sublayer (as returned by `.sublayers()`) as well as self.
`buffers`([include_sublayers])	Returns a list of all buffers from current layer and its sub-layers.
`children`()	Returns an iterator over immediate children layers.
`clear_gradients`()	Clear the gradients of all parameters for this layer.
`create_parameter`(shape[, attr, dtype, ...])	Create parameters for this layer.
`create_tensor`([name, persistable, dtype])	Create Tensor for this layer.
`create_variable`([name, persistable, dtype])	Create Tensor for this layer.
`eval`()	Sets this Layer and all its sublayers to evaluation mode.
`extend`(sublayers)	Appends sublayers to the end of the list.
`extra_repr`()	Extra representation of this layer, you can have custom implementation of your own layer.
`forward`(x, condition)	Probability density estimation of random variable x given the condition.
`full_name`()	Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__
`insert`(index, sublayer)	Insert a sublayer before a given index in the list.
`inverse`(z, condition)	Sampling from the distribution p(X).
`load_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`named_buffers`([prefix, include_sublayers])	Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.
`named_children`()	Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.
`named_parameters`([prefix, include_sublayers])	Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.
`named_sublayers`([prefix, include_self, ...])	Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.
`parameters`([include_sublayers])	Returns a list of all Parameters from current layer and its sub-layers.
`register_buffer`(name, tensor[, persistable])	Registers a tensor as buffer into the layer.
`register_forward_post_hook`(hook)	Register a forward post-hook for Layer.
`register_forward_pre_hook`(hook)	Register a forward pre-hook for Layer.
`set_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`set_state_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`state_dict`([destination, include_sublayers, ...])	Get all parameters and persistable buffers of current layer and its sub-layers.
`sublayers`([include_self])	Returns a list of sub layers.
`to`([device, dtype, blocking])	Cast the parameters and buffers of Layer by the given device, dtype and blocking.
`to_static_state_dict`([destination, ...])	Get all parameters and buffers of current layer and its sub-layers.
`train`()	Sets this Layer and all its sublayers to training mode.

backward
register_state_dict_hook

forward(x, condition)[source]

Probability density estimation of random variable x given the condition.

Args:

x (Tensor): The audio. shape=(batch_size, time_steps)
condition (Tensor): The local condition (mel spectrogram here). shape=(batch_size, condition_channel, time_steps)

Returns:

Tensor: The transformed random variable. shape=(batch_size, time_steps)
Tensor: The log determinant of the jacobian of the transformation from x to z. shape=(1,)

inverse(z, condition)[source]

Sampling from the distribution p(X).

This is done by sampling z from p(Z) and transforming it into x. Each Flow transforms z_{i-1} into z_{i} in an autoregressive manner.

Args:

z (Tensor): A sample of the distribution p(Z). shape=(batch, 1, time_steps)
condition (Tensor): The local condition. shape=(batch, condition_channel, time_steps)

Returns:

Tensor: The transformed sample (audio here). shape=(batch_size, time_steps)
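
The relationship between density estimation and sampling can be sketched with a toy one-dimensional affine autoregressive flow in NumPy. This is not the WaveFlow network: `conditioner` is a hypothetical stand-in for the learned layers, the mel condition is omitted, and the functions `flow_forward` and `flow_inverse` are illustrative names only.

```python
import numpy as np
from typing import Tuple


def conditioner(prev: float) -> Tuple[float, float]:
    """Hypothetical stand-in for the network: (log-scale, shift) from the previous sample."""
    return 0.1 * float(np.tanh(prev)), 0.5 * prev


def flow_forward(x: np.ndarray) -> Tuple[np.ndarray, float]:
    """x -> z. Each step reads only known x values, so it could run in parallel."""
    z = np.empty_like(x)
    log_det = 0.0
    for i in range(len(x)):
        prev = x[i - 1] if i > 0 else 0.0
        log_s, t = conditioner(prev)
        z[i] = x[i] * np.exp(log_s) + t
        log_det += log_s  # the Jacobian is triangular, so log-det is a sum of log-scales
    return z, log_det


def flow_inverse(z: np.ndarray) -> np.ndarray:
    """z -> x. Step i needs the already generated x[i-1], so sampling is sequential."""
    x = np.empty_like(z)
    for i in range(len(z)):
        prev = x[i - 1] if i > 0 else 0.0
        log_s, t = conditioner(prev)
        x[i] = (z[i] - t) * np.exp(-log_s)
    return x


rng = np.random.default_rng(0)
z = rng.standard_normal(8)          # sample z from p(Z)
x = flow_inverse(z)                 # transform it into x
z_back, log_det = flow_forward(x)   # the forward pass recovers z
```

The sequential loop in `flow_inverse` is the reason sampling from an autoregressive flow is slower than evaluating its density.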

class paddlespeech.t2s.models.waveflow.WaveFlowLoss(sigma=1.0)[source]

Bases: Layer

Criterion of a WaveFlow model.

Args:

sigma (float): The standard deviation of the gaussian noise used in WaveFlow, by default 1.0.

Methods

`__call__`(*inputs, **kwargs)	Call self as a function.
`add_parameter`(name, parameter)	Adds a Parameter instance.
`add_sublayer`(name, sublayer)	Adds a sub Layer instance.
`apply`(fn)	Applies `fn` recursively to every sublayer (as returned by `.sublayers()`) as well as self.
`buffers`([include_sublayers])	Returns a list of all buffers from current layer and its sub-layers.
`children`()	Returns an iterator over immediate children layers.
`clear_gradients`()	Clear the gradients of all parameters for this layer.
`create_parameter`(shape[, attr, dtype, ...])	Create parameters for this layer.
`create_tensor`([name, persistable, dtype])	Create Tensor for this layer.
`create_variable`([name, persistable, dtype])	Create Tensor for this layer.
`eval`()	Sets this Layer and all its sublayers to evaluation mode.
`extra_repr`()	Extra representation of this layer, you can have custom implementation of your own layer.
`forward`(z, log_det_jacobian)	Compute the loss given the transformed random variable z and the log_det_jacobian of transformation from x to z.
`full_name`()	Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__
`load_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`named_buffers`([prefix, include_sublayers])	Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.
`named_children`()	Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.
`named_parameters`([prefix, include_sublayers])	Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.
`named_sublayers`([prefix, include_self, ...])	Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.
`parameters`([include_sublayers])	Returns a list of all Parameters from current layer and its sub-layers.
`register_buffer`(name, tensor[, persistable])	Registers a tensor as buffer into the layer.
`register_forward_post_hook`(hook)	Register a forward post-hook for Layer.
`register_forward_pre_hook`(hook)	Register a forward pre-hook for Layer.
`set_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`set_state_dict`(state_dict[, use_structured_name])	Set parameters and persistable buffers from state_dict.
`state_dict`([destination, include_sublayers, ...])	Get all parameters and persistable buffers of current layer and its sub-layers.
`sublayers`([include_self])	Returns a list of sub layers.
`to`([device, dtype, blocking])	Cast the parameters and buffers of Layer by the given device, dtype and blocking.
`to_static_state_dict`([destination, ...])	Get all parameters and buffers of current layer and its sub-layers.
`train`()	Sets this Layer and all its sublayers to training mode.

backward
register_state_dict_hook

forward(z, log_det_jacobian)[source]

Compute the loss given the transformed random variable z and the log_det_jacobian of transformation from x to z.

Args:

z (Tensor): The transformed random variable (x to z). shape=(B, T)
log_det_jacobian (Tensor): The log of the determinant of the jacobian matrix of the transformation from x to z. shape=(1,)

Returns:

Tensor: The loss. shape=(1,)
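
A minimal NumPy sketch of such a criterion follows. It computes the negative log-likelihood of z under a zero-mean Gaussian with standard deviation `sigma`, corrected by the log-determinant and averaged over all elements of z. Averaging per element is an assumption made here, and `waveflow_loss` is a hypothetical helper, not the library function.

```python
import math

import numpy as np


def waveflow_loss(z: np.ndarray, log_det_jacobian: float, sigma: float = 1.0) -> float:
    """Average negative log-likelihood per element, assuming z ~ N(0, sigma^2)."""
    n = z.size
    # Gaussian energy term minus the log-determinant of the x -> z transformation.
    nll = float(np.sum(z * z)) / (2.0 * sigma * sigma) - log_det_jacobian
    # Per-element normalization constant of the Gaussian density.
    const = 0.5 * math.log(2.0 * math.pi) + math.log(sigma)
    return nll / n + const


z = np.zeros((2, 4))
base = waveflow_loss(z, log_det_jacobian=0.0)
lower = waveflow_loss(z, log_det_jacobian=8.0)
print(base, lower)
```

A larger log-determinant lowers the loss, which matches the change-of-variables view: the model is rewarded for assigning higher density to the observed audio.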