paddlespeech.s2t.models.ds2 package
- class paddlespeech.s2t.models.ds2.DeepSpeech2InferModel(*args, **kwargs)[source]
Bases:
DeepSpeech2Model
Methods
__call__
(*inputs, **kwargs)Call self as a function.
add_parameter
(name, parameter)Adds a Parameter instance.
add_sublayer
(name, sublayer)Adds a sub Layer instance.
apply
(fn)Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.
buffers
([include_sublayers])Returns a list of all buffers from current layer and its sub-layers.
children
()Returns an iterator over immediate children layers.
clear_gradients
()Clear the gradients of all parameters for this layer.
create_parameter
(shape[, attr, dtype, ...])Create parameters for this layer.
create_tensor
([name, persistable, dtype])Create Tensor for this layer.
create_variable
([name, persistable, dtype])Create Tensor for this layer.
eval
()Sets this Layer and all its sublayers to evaluation mode.
extra_repr
()Extra representation of this layer; you can provide a custom implementation for your own layer.
forward
(audio_chunk, audio_chunk_lens[, ...])Compute Model loss
from_config
(config)Build a DeepSpeech2Model from config.
from_pretrained
(dataloader, config, ...)Build a DeepSpeech2Model from a pretrained model.
full_name
()Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__
load_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
named_buffers
([prefix, include_sublayers])Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.
named_children
()Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.
named_parameters
([prefix, include_sublayers])Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.
named_sublayers
([prefix, include_self, ...])Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.
parameters
([include_sublayers])Returns a list of all Parameters from current layer and its sub-layers.
register_buffer
(name, tensor[, persistable])Registers a tensor as buffer into the layer.
register_forward_post_hook
(hook)Register a forward post-hook for Layer.
register_forward_pre_hook
(hook)Register a forward pre-hook for Layer.
set_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
set_state_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
state_dict
([destination, include_sublayers, ...])Get all parameters and persistable buffers of current layer and its sub-layers.
sublayers
([include_self])Returns a list of sub layers.
to
([device, dtype, blocking])Cast the parameters and buffers of Layer by the given device, dtype and blocking.
to_static_state_dict
([destination, ...])Get all parameters and buffers of current layer and its sub-layers.
train
()Sets this Layer and all its sublayers to training mode.
backward
decode
export
register_state_dict_hook
- class paddlespeech.s2t.models.ds2.DeepSpeech2Model(feat_size, dict_size, num_conv_layers=2, num_rnn_layers=4, rnn_size=1024, rnn_direction='forward', num_fc_layers=2, fc_layers_size_list=[512, 256], use_gru=False, blank_id=0, ctc_grad_norm_type=None)[source]
Bases:
Layer
The DeepSpeech2 network structure.
- Parameters:
audio (Variable) -- Audio spectrogram data layer.
text (Variable) -- Transcription text data layer.
audio_len (Variable) -- Valid sequence length data layer.
feat_size (int) -- feature size for audio.
dict_size (int) -- Dictionary size for tokenized transcription.
num_conv_layers (int) -- Number of stacking convolution layers.
num_rnn_layers (int) -- Number of stacking RNN layers.
rnn_size (int) -- RNN layer size (dimension of RNN cells).
num_fc_layers (int) -- Number of stacking FC layers.
fc_layers_size_list ([int,]) -- The list of FC layer sizes.
use_gru (bool) -- Use GRU cells if True; use simple RNN cells if False.
- Returns:
A tuple of an output unnormalized log probability layer (before softmax) and a CTC cost layer.
- Return type:
tuple of LayerOutput
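As a rough illustration of the constructor signature above, the sketch below collects the documented defaults into a plain dictionary and checks one consistency condition before the values would be handed to DeepSpeech2Model. The helper name build_ds2_kwargs, the example values for feat_size and dict_size, and the rule that fc_layers_size_list should have num_fc_layers entries are assumptions made for illustration; they are not part of the library.

```python
# Defaults copied from the documented DeepSpeech2Model signature.
DS2_DEFAULTS = {
    "num_conv_layers": 2,
    "num_rnn_layers": 4,
    "rnn_size": 1024,
    "rnn_direction": "forward",
    "num_fc_layers": 2,
    "fc_layers_size_list": [512, 256],
    "use_gru": False,
    "blank_id": 0,
    "ctc_grad_norm_type": None,
}


def build_ds2_kwargs(feat_size, dict_size, **overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(DS2_DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")

    kwargs = {"feat_size": feat_size, "dict_size": dict_size}
    kwargs.update(DS2_DEFAULTS)
    kwargs.update(overrides)
    # Copy the list so callers never share (and mutate) the default list.
    kwargs["fc_layers_size_list"] = list(kwargs["fc_layers_size_list"])

    # Assumed consistency rule: one size per fully connected layer.
    if len(kwargs["fc_layers_size_list"]) != kwargs["num_fc_layers"]:
        raise ValueError("fc_layers_size_list must have num_fc_layers entries")
    return kwargs


# feat_size and dict_size here are placeholder values, not recommendations.
kwargs = build_ds2_kwargs(feat_size=161, dict_size=5000, use_gru=True)

# With paddlespeech installed, the dictionary could then be unpacked:
# model = DeepSpeech2Model(**kwargs)
```

Keeping the arguments in one dictionary makes it easy to log or compare configurations before any model is built.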
Methods
__call__
(*inputs, **kwargs)Call self as a function.
add_parameter
(name, parameter)Adds a Parameter instance.
add_sublayer
(name, sublayer)Adds a sub Layer instance.
apply
(fn)Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.
buffers
([include_sublayers])Returns a list of all buffers from current layer and its sub-layers.
children
()Returns an iterator over immediate children layers.
clear_gradients
()Clear the gradients of all parameters for this layer.
create_parameter
(shape[, attr, dtype, ...])Create parameters for this layer.
create_tensor
([name, persistable, dtype])Create Tensor for this layer.
create_variable
([name, persistable, dtype])Create Tensor for this layer.
eval
()Sets this Layer and all its sublayers to evaluation mode.
extra_repr
()Extra representation of this layer; you can provide a custom implementation for your own layer.
forward
(audio, audio_len, text, text_len)Compute Model loss
from_config
(config)Build a DeepSpeech2Model from config.
from_pretrained
(dataloader, config, ...)Build a DeepSpeech2Model from a pretrained model.
full_name
()Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__
load_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
named_buffers
([prefix, include_sublayers])Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.
named_children
()Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.
named_parameters
([prefix, include_sublayers])Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.
named_sublayers
([prefix, include_self, ...])Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.
parameters
([include_sublayers])Returns a list of all Parameters from current layer and its sub-layers.
register_buffer
(name, tensor[, persistable])Registers a tensor as buffer into the layer.
register_forward_post_hook
(hook)Register a forward post-hook for Layer.
register_forward_pre_hook
(hook)Register a forward pre-hook for Layer.
set_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
set_state_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
state_dict
([destination, include_sublayers, ...])Get all parameters and persistable buffers of current layer and its sub-layers.
sublayers
([include_self])Returns a list of sub layers.
to
([device, dtype, blocking])Cast the parameters and buffers of Layer by the given device, dtype and blocking.
to_static_state_dict
([destination, ...])Get all parameters and buffers of current layer and its sub-layers.
train
()Sets this Layer and all its sublayers to training mode.
backward
decode
register_state_dict_hook
- forward(audio, audio_len, text, text_len)[source]
Compute Model loss
- Args:
audio (Tensor): [B, T, D]
audio_len (Tensor): [B]
text (Tensor): [B, U]
text_len (Tensor): [B]
- Returns:
loss (Tensor): [1]
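The shapes listed for forward imply that variable-length utterances and transcripts are padded into rectangular batches, with the true lengths passed alongside. A minimal pure-Python sketch of that bookkeeping follows; the helper name pad_batch and the padding values are assumptions for illustration, and in practice the nested lists would still need to be converted to tensors (for example with paddle.to_tensor) before calling the model.

```python
def pad_batch(features, transcripts, pad_value=0.0, pad_id=0):
    """Hypothetical helper: pad a batch to the documented forward() shapes.

    features: list of B utterances, each a list of T_i frames of D floats.
    transcripts: list of B token-id lists, each of length U_i.
    Returns audio [B, T, D], audio_len [B], text [B, U], text_len [B],
    where T and U are the longest lengths in the batch.
    """
    audio_len = [len(utt) for utt in features]
    text_len = [len(tokens) for tokens in transcripts]
    max_t, max_u = max(audio_len), max(text_len)
    feat_dim = len(features[0][0])

    audio = []
    for utt in features:
        frames = [list(frame) for frame in utt]
        # Append all-padding frames until the utterance has max_t frames.
        frames += [[pad_value] * feat_dim for _ in range(max_t - len(utt))]
        audio.append(frames)

    # Pad token sequences on the right; text_len records the valid part.
    text = [list(t) + [pad_id] * (max_u - len(t)) for t in transcripts]
    return audio, audio_len, text, text_len


# Two utterances (3 frames and 1 frame, D = 2) with 2 and 3 tokens.
features = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.7, 0.8]],
]
transcripts = [[5, 6], [7, 8, 9]]
audio, audio_len, text, text_len = pad_batch(features, transcripts)
```

Here audio has shape [2, 3, 2] and text has shape [2, 3], while audio_len and text_len keep the unpadded lengths so that padded positions can be ignored.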
- classmethod from_config(config)[source]
Build a DeepSpeech2Model from config.
- Parameters:
- config: yacs.config.CfgNode
config
- Returns:
- DeepSpeech2Model
The model built from config.
- classmethod from_pretrained(dataloader, config, checkpoint_path)[source]
Build a DeepSpeech2Model from a pretrained model.
- Parameters:
- dataloader: paddle.io.DataLoader
- config: yacs.config.CfgNode
model configs
- checkpoint_path: Path or str
the path of pretrained model checkpoint, without extension name
- Returns:
- DeepSpeech2Model
The model built from pretrained result.
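Because checkpoint_path is documented as having no extension name, a caller holding a full file name may need to strip the suffix first. The sketch below shows one way to do that; the helper name checkpoint_prefix and the example path are hypothetical, and treating .pdparams and .pdopt (Paddle's conventional parameter and optimizer-state extensions) as the suffixes to remove is an assumption about how the checkpoint was saved.

```python
from pathlib import PurePosixPath


def checkpoint_prefix(path, known_suffixes=(".pdparams", ".pdopt")):
    """Hypothetical helper: drop a known checkpoint extension, if present."""
    p = PurePosixPath(path)
    if p.suffix in known_suffixes:
        return str(p.with_suffix(""))
    # Leave paths that already lack a known extension untouched.
    return str(p)


# The path below is a made-up example, not a file shipped with the library.
prefix = checkpoint_prefix("exp/deepspeech2/checkpoints/avg_1.pdparams")

# With paddlespeech installed, the prefix would be passed as:
# model = DeepSpeech2Model.from_pretrained(dataloader, config, prefix)
```

Only suffixes in known_suffixes are removed, so a name such as avg_1.5 is not truncated by accident.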