paddlespeech.t2s.models.waveflow module

class paddlespeech.t2s.models.waveflow.ConditionalWaveFlow(upsample_factors: List[int], n_flows: int, n_layers: int, n_group: int, channels: int, n_mels: int, kernel_size: Union[int, List[int]])[source]

Bases: LayerList

ConditionalWaveFlow, an UpsampleNet combined with a WaveFlow model.

Args:
upsample_factors (List[int]):

Upsample factors for the upsample net.

n_flows (int):

Number of flows in the WaveFlow model.

n_layers (int):

Number of ResidualBlocks in each Flow.

n_group (int):

Number of timesteps to fold as a group.

channels (int):

Feature size of each ResidualBlock.

n_mels (int):

Feature size of mel spectrogram (mel bands).

kernel_size (Union[int, List[int]]):

Kernel size of the convolution layer in each ResidualBlock.

Methods

__call__(*inputs, **kwargs)

Call self as a function.

add_parameter(name, parameter)

Adds a Parameter instance.

add_sublayer(name, sublayer)

Adds a sub Layer instance.

append(sublayer)

Appends a sublayer to the end of the list.

apply(fn)

Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.

buffers([include_sublayers])

Returns a list of all buffers from current layer and its sub-layers.

children()

Returns an iterator over immediate children layers.

clear_gradients()

Clear the gradients of all parameters for this layer.

create_parameter(shape[, attr, dtype, ...])

Create parameters for this layer.

create_tensor([name, persistable, dtype])

Create Tensor for this layer.

create_variable([name, persistable, dtype])

Create Tensor for this layer.

eval()

Sets this Layer and all its sublayers to evaluation mode.

extend(sublayers)

Appends sublayers to the end of the list.

extra_repr()

Extra representation of this layer. You can provide a custom implementation in your own layer.

forward(audio, mel)

Compute the transformed random variable z (x to z) and the log of the determinant of the jacobian of the transformation from x to z.

from_pretrained(config, checkpoint_path)

Build a ConditionalWaveFlow model from a pretrained model.

full_name()

Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__

infer(mel)

Generate raw audio given mel spectrogram.

insert(index, sublayer)

Insert a sublayer before a given index in the list.

load_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

named_buffers([prefix, include_sublayers])

Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.

named_children()

Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.

named_parameters([prefix, include_sublayers])

Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.

named_sublayers([prefix, include_self, ...])

Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.

parameters([include_sublayers])

Returns a list of all Parameters from current layer and its sub-layers.

predict(mel)

Generate raw audio given mel spectrogram.

register_buffer(name, tensor[, persistable])

Registers a tensor as buffer into the layer.

register_forward_post_hook(hook)

Register a forward post-hook for Layer.

register_forward_pre_hook(hook)

Register a forward pre-hook for Layer.

set_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

set_state_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

state_dict([destination, include_sublayers, ...])

Get all parameters and persistable buffers of current layer and its sub-layers.

sublayers([include_self])

Returns a list of sub layers.

to([device, dtype, blocking])

Cast the parameters and buffers of Layer by the given device, dtype and blocking.

to_static_state_dict([destination, ...])

Get all parameters and buffers of current layer and its sub-layers.

train()

Sets this Layer and all its sublayers to training mode.

backward

register_state_dict_hook

forward(audio, mel)[source]

Compute the transformed random variable z (x to z) and the log of the determinant of the jacobian of the transformation from x to z.

Args:
audio(Tensor):

The audio. shape=(B, T)

mel(Tensor):

The mel spectrogram. shape=(B, C_mel, T_mel)

Returns:
Tensor:

The inversely transformed random variable z (x to z). shape=(B, T)

Tensor:

The log of the determinant of the Jacobian of the transformation from x to z. shape=(1,)
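The two return values are the ingredients of the change-of-variables formula used by normalizing flows. As a sketch, assuming the prior p_Z is a zero-mean isotropic Gaussian with standard deviation sigma (the symbols f, p_X, p_Z and N are notation introduced here, not names from the module):

```latex
% z = f(x; mel) is the invertible map computed by forward(audio, mel)
\log p_X(x) = \log p_Z(z) + \log \left| \det \frac{\partial z}{\partial x} \right|,
\qquad
\log p_Z(z) = -\frac{\lVert z \rVert^2}{2\sigma^2} - N \left( \tfrac{1}{2}\log(2\pi) + \log \sigma \right)
```

Here N is the total number of elements in z. The second term of the first equation is the log-determinant returned alongside z.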

classmethod from_pretrained(config, checkpoint_path)[source]

Build a ConditionalWaveFlow model from a pretrained model.

Args:
config(yacs.config.CfgNode):

The model configuration.

checkpoint_path(Path or str):

The path of the pretrained model checkpoint, without the file extension.

Returns:

ConditionalWaveFlow: The model built from the pretrained checkpoint.

infer(mel)[source]

Generate raw audio given mel spectrogram.

Args:
mel(np.ndarray):

Mel spectrogram of an utterance (in log-magnitude). shape=(C_mel, T_mel)

Returns:
Tensor:

The synthesized audio, where T <= T_mel * upsample_factors. shape=(B, T)
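The length bound above can be sketched in plain Python. This assumes that the bound multiplies T_mel by the product of all entries of upsample_factors (the upsampling stages being applied one after another); the helper name max_audio_length and the example values are hypothetical and not part of the module.

```python
import math
from typing import List


def max_audio_length(n_mel_frames: int, upsample_factors: List[int]) -> int:
    """Upper bound on the number of synthesized samples.

    Assumption: the stages are applied in sequence, so the overall
    upsampling rate is the product of the individual factors.
    """
    if n_mel_frames < 0 or any(f <= 0 for f in upsample_factors):
        raise ValueError("frame count must be >= 0 and factors must be positive")
    return n_mel_frames * math.prod(upsample_factors)


# Hypothetical values: 100 mel frames and two 16x upsampling stages.
print(max_audio_length(100, [16, 16]))  # 25600
```

The actual output may be shorter than this bound, which is why the relation is stated as an inequality.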

predict(mel)[source]

Generate raw audio given mel spectrogram.

Args:
mel(np.ndarray):

Mel spectrogram of an utterance (in log-magnitude). shape=(C_mel, T_mel)

Returns:

np.ndarray: The synthesized audio. shape=(T,)

class paddlespeech.t2s.models.waveflow.WaveFlow(n_flows, n_layers, n_group, channels, mel_bands, kernel_size)[source]

Bases: LayerList

A deep reversible layer composed of several autoregressive flows.

Args:
n_flows (int):

Number of flows in the WaveFlow model.

n_layers (int):

Number of ResidualBlocks in each Flow.

n_group (int):

Number of timesteps to fold as a group.

channels (int):

Feature size of each ResidualBlock.

mel_bands (int):

Feature size of mel spectrogram (mel bands).

kernel_size (Union[int, List[int]]):

Kernel size of the convolution layer in each ResidualBlock.
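To illustrate what "folding timesteps as a group" means for n_group, here is a minimal pure-Python sketch. It assumes that each run of n_group adjacent samples becomes one column of a 2-D matrix with n_group rows, and that a ragged tail is trimmed; the functions fold and unfold are hypothetical helpers, not the module's own implementation.

```python
from typing import List


def fold(samples: List[float], n_group: int) -> List[List[float]]:
    """Fold a 1-D signal into a matrix with n_group rows.

    Column j holds samples j*n_group .. (j+1)*n_group - 1, so adjacent
    samples end up stacked along the row (height) axis.
    """
    if n_group <= 0:
        raise ValueError("n_group must be positive")
    usable = len(samples) - len(samples) % n_group  # drop the ragged tail
    n_cols = usable // n_group
    return [
        [samples[col * n_group + row] for col in range(n_cols)]
        for row in range(n_group)
    ]


def unfold(matrix: List[List[float]]) -> List[float]:
    """Inverse of fold: read the matrix column by column."""
    n_cols = len(matrix[0]) if matrix else 0
    return [matrix[row][col] for col in range(n_cols) for row in range(len(matrix))]


folded = fold(list(range(10)), n_group=4)
print(folded)          # [[0, 4], [1, 5], [2, 6], [3, 7]]
print(unfold(folded))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

In this sketch the last two samples are discarded because 10 is not a multiple of 4.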

Methods

__call__(*inputs, **kwargs)

Call self as a function.

add_parameter(name, parameter)

Adds a Parameter instance.

add_sublayer(name, sublayer)

Adds a sub Layer instance.

append(sublayer)

Appends a sublayer to the end of the list.

apply(fn)

Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.

buffers([include_sublayers])

Returns a list of all buffers from current layer and its sub-layers.

children()

Returns an iterator over immediate children layers.

clear_gradients()

Clear the gradients of all parameters for this layer.

create_parameter(shape[, attr, dtype, ...])

Create parameters for this layer.

create_tensor([name, persistable, dtype])

Create Tensor for this layer.

create_variable([name, persistable, dtype])

Create Tensor for this layer.

eval()

Sets this Layer and all its sublayers to evaluation mode.

extend(sublayers)

Appends sublayers to the end of the list.

extra_repr()

Extra representation of this layer. You can provide a custom implementation in your own layer.

forward(x, condition)

Probability density estimation of random variable x given the condition.

full_name()

Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__

insert(index, sublayer)

Insert a sublayer before a given index in the list.

inverse(z, condition)

Sampling from the distribution p(X).

load_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

named_buffers([prefix, include_sublayers])

Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.

named_children()

Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.

named_parameters([prefix, include_sublayers])

Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.

named_sublayers([prefix, include_self, ...])

Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.

parameters([include_sublayers])

Returns a list of all Parameters from current layer and its sub-layers.

register_buffer(name, tensor[, persistable])

Registers a tensor as buffer into the layer.

register_forward_post_hook(hook)

Register a forward post-hook for Layer.

register_forward_pre_hook(hook)

Register a forward pre-hook for Layer.

set_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

set_state_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

state_dict([destination, include_sublayers, ...])

Get all parameters and persistable buffers of current layer and its sub-layers.

sublayers([include_self])

Returns a list of sub layers.

to([device, dtype, blocking])

Cast the parameters and buffers of Layer by the given device, dtype and blocking.

to_static_state_dict([destination, ...])

Get all parameters and buffers of current layer and its sub-layers.

train()

Sets this Layer and all its sublayers to training mode.

backward

register_state_dict_hook

forward(x, condition)[source]

Probability density estimation of random variable x given the condition.

Args:
x (Tensor):

The audio. shape=(batch_size, time_steps)

condition (Tensor):

The local condition (mel spectrogram here). shape=(batch_size, condition_channel, time_steps)

Returns:
Tensor:

The transformed random variable. shape=(batch_size, time_steps)

Tensor:

The log determinant of the jacobian of the transformation from x to z. shape=(1,)

inverse(z, condition)[source]

Sampling from the distribution p(X).

This is done by sampling z from p(Z) and transforming it into x. Each Flow transforms z_{i-1} into z_{i} in an autoregressive manner.

Args:
z (Tensor):

A sample of the distribution p(Z). shape=(batch, 1, time_steps)

condition (Tensor):

The local condition. shape=(batch, condition_channel, time_steps)

Returns:

Tensor: The transformed sample (audio here). shape=(batch_size, time_steps)
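The relationship between forward and inverse can be illustrated with a toy one-dimensional autoregressive affine flow. This is only a sketch: the conditioning function _shift_and_log_scale is a hypothetical stand-in for the ResidualBlock stack, the mel condition is left out, and the real model is autoregressive over the folded 2-D representation rather than over a flat sequence.

```python
import math
from typing import List, Tuple


def _shift_and_log_scale(prev: float) -> Tuple[float, float]:
    """Hypothetical stand-in for the network that looks at earlier samples."""
    return 0.5 * prev, 0.1 * math.tanh(prev)


def forward(x: List[float]) -> Tuple[List[float], float]:
    """Map x to z; every step can be computed from the known x (parallelizable)."""
    z, log_det, prev = [], 0.0, 0.0
    for x_t in x:
        shift, log_scale = _shift_and_log_scale(prev)
        z.append(x_t * math.exp(log_scale) + shift)
        log_det += log_scale  # triangular Jacobian: log-det is the sum of log-scales
        prev = x_t
    return z, log_det


def inverse(z: List[float]) -> List[float]:
    """Map z back to x; step t needs the already reconstructed x_{t-1}."""
    x, prev = [], 0.0
    for z_t in z:
        shift, log_scale = _shift_and_log_scale(prev)
        x_t = (z_t - shift) * math.exp(-log_scale)
        x.append(x_t)
        prev = x_t
    return x


x = [0.3, -1.2, 0.8, 0.05]
z, log_det = forward(x)
x_rec = inverse(z)
print(max(abs(a - b) for a, b in zip(x, x_rec)) < 1e-9)  # True
```

The sequential dependency in inverse is what makes sampling slower than density estimation in autoregressive flows.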

class paddlespeech.t2s.models.waveflow.WaveFlowLoss(sigma=1.0)[source]

Bases: Layer

Criterion of a WaveFlow model.

Args:
sigma (float):

The standard deviation of the gaussian noise used in WaveFlow, by default 1.0.

Methods

__call__(*inputs, **kwargs)

Call self as a function.

add_parameter(name, parameter)

Adds a Parameter instance.

add_sublayer(name, sublayer)

Adds a sub Layer instance.

apply(fn)

Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.

buffers([include_sublayers])

Returns a list of all buffers from current layer and its sub-layers.

children()

Returns an iterator over immediate children layers.

clear_gradients()

Clear the gradients of all parameters for this layer.

create_parameter(shape[, attr, dtype, ...])

Create parameters for this layer.

create_tensor([name, persistable, dtype])

Create Tensor for this layer.

create_variable([name, persistable, dtype])

Create Tensor for this layer.

eval()

Sets this Layer and all its sublayers to evaluation mode.

extra_repr()

Extra representation of this layer. You can provide a custom implementation in your own layer.

forward(z, log_det_jacobian)

Compute the loss given the transformed random variable z and the log_det_jacobian of transformation from x to z.

full_name()

Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__

load_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

named_buffers([prefix, include_sublayers])

Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.

named_children()

Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.

named_parameters([prefix, include_sublayers])

Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.

named_sublayers([prefix, include_self, ...])

Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.

parameters([include_sublayers])

Returns a list of all Parameters from current layer and its sub-layers.

register_buffer(name, tensor[, persistable])

Registers a tensor as buffer into the layer.

register_forward_post_hook(hook)

Register a forward post-hook for Layer.

register_forward_pre_hook(hook)

Register a forward pre-hook for Layer.

set_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

set_state_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

state_dict([destination, include_sublayers, ...])

Get all parameters and persistable buffers of current layer and its sub-layers.

sublayers([include_self])

Returns a list of sub layers.

to([device, dtype, blocking])

Cast the parameters and buffers of Layer by the given device, dtype and blocking.

to_static_state_dict([destination, ...])

Get all parameters and buffers of current layer and its sub-layers.

train()

Sets this Layer and all its sublayers to training mode.

backward

register_state_dict_hook

forward(z, log_det_jacobian)[source]

Compute the loss given the transformed random variable z and the log_det_jacobian of transformation from x to z.

Args:
z(Tensor):

The transformed random variable (x to z). shape=(B, T)

log_det_jacobian(Tensor):

The log of the determinant of the jacobian matrix of the transformation from x to z. shape=(1,)

Returns:

Tensor: The loss. shape=(1,)
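As a rough illustration of how such a criterion can be computed, the following pure-Python sketch evaluates the per-element negative log-likelihood under a zero-mean Gaussian prior with standard deviation sigma. It is written under that assumption and is not taken from the module's source; the function name gaussian_flow_loss is hypothetical, and z is a flat list here rather than a (B, T) tensor.

```python
import math
from typing import List


def gaussian_flow_loss(z: List[float], log_det_jacobian: float, sigma: float = 1.0) -> float:
    """Average negative log-likelihood per element of z.

    Assumes z ~ N(0, sigma^2) element-wise and that log_det_jacobian is the
    log-determinant of the x -> z transformation, summed over all elements.
    """
    n = len(z)
    if n == 0:
        raise ValueError("z must not be empty")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    energy = sum(v * v for v in z) / (2.0 * sigma * sigma)
    const = 0.5 * math.log(2.0 * math.pi) + math.log(sigma)
    return (energy - log_det_jacobian) / n + const


# With z = 0 and an identity transform, only the Gaussian constant remains.
print(round(gaussian_flow_loss([0.0, 0.0], 0.0), 6))  # 0.918939
```

A larger log-determinant lowers the loss, so training pushes the flow to expand regions around the observed audio while keeping z close to the prior.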