paddlespeech.s2t.modules.ctc module
- class paddlespeech.s2t.modules.ctc.CTCDecoder(*args, **kwargs)
Bases: CTCDecoderBase
Methods
__call__
(*inputs, **kwargs)Call self as a function.
add_parameter
(name, parameter)Adds a Parameter instance.
add_sublayer
(name, sublayer)Adds a sub Layer instance.
apply
(fn)Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.
argmax
(hs_pad)argmax of frame activations. Args: hs_pad (paddle.Tensor): 3d tensor (B, Tmax, eprojs). Returns: paddle.Tensor: argmax applied 2d tensor (B, Tmax).
buffers
([include_sublayers])Returns a list of all buffers from current layer and its sub-layers.
children
()Returns an iterator over immediate children layers.
clear_gradients
()Clear the gradients of all parameters for this layer.
create_parameter
(shape[, attr, dtype, ...])Create parameters for this layer.
create_tensor
([name, persistable, dtype])Create Tensor for this layer.
create_variable
([name, persistable, dtype])Create Tensor for this layer.
decode
()Get the decoding result. Raises: Exception: when the CTC decoder is not initialized; ValueError: when decoding_method is not supported. Returns: results_best (list(str)): the best result for a batch of data; results_beam (list(list(str))): the beam search result for a batch of data.
decode_probs_offline
(probs, logits_lens, ...)This function will be deprecated in the future. CTC decoding with probs. Args: probs (Tensor): activation after softmax; logits_lens (Tensor): audio output lengths; vocab_list (list): list of tokens in the vocabulary, for decoding; decoding_method (str): ctc_beam_search; lang_model_path (str): language model path; beam_alpha (float): beam_alpha; beam_beta (float): beam_beta; beam_size (int): beam_size; cutoff_prob (float): cutoff probability in beam search; cutoff_top_n (int): cutoff_top_n; num_processes (int): num_processes.
Delete the decoder
eval
()Sets this Layer and all its sublayers to evaluation mode.
extra_repr
()Extra representation of this layer; you can provide a custom implementation for your own layer.
forced_align
(ctc_probs, y[, blank_id])ctc forced alignment. Args: ctc_probs (paddle.Tensor): hidden state sequence, 2d tensor (T, D) y (paddle.Tensor): label id sequence tensor, 1d tensor (L) blank_id (int): blank symbol index Returns: paddle.Tensor: best alignment result, (T).
forward
(hs_pad, hlens, ys_pad, ys_lens)Calculate CTC loss.
full_name
()Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__
get_decoder
(vocab_list, batch_size, ...)Initialize and get the CTC decoder. Args: vocab_list (list): list of tokens in the vocabulary, for decoding; batch_size (int): batch size for input data; beam_alpha (float): beam_alpha; beam_beta (float): beam_beta; beam_size (int): beam_size; num_processes (int): num_processes; cutoff_prob (float): cutoff probability in beam search; cutoff_top_n (int): cutoff_top_n.
init_decoder
(batch_size, vocab_list, ...)Initialize the CTC decoders. Args: batch_size (int): batch size for input data; vocab_list (list): list of tokens in the vocabulary, for decoding; decoding_method (str): ctc_beam_search; lang_model_path (str): language model path; beam_alpha (float): beam_alpha; beam_beta (float): beam_beta; beam_size (int): beam_size; cutoff_prob (float): cutoff probability in beam search; cutoff_top_n (int): cutoff_top_n; num_processes (int): num_processes.
load_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
log_softmax
(hs_pad[, temperature])log_softmax of frame activations Args: Tensor hs_pad: 3d tensor (B, Tmax, eprojs) Returns: paddle.Tensor: log softmax applied 3d tensor (B, Tmax, odim)
named_buffers
([prefix, include_sublayers])Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.
named_children
()Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.
named_parameters
([prefix, include_sublayers])Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.
named_sublayers
([prefix, include_self, ...])Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.
next
(probs, logits_lens)Input probs into the CTC decoder. Args: probs (list(list(float))): probs for a batch of data; logits_lens (list(int)): logits lengths for a batch of data. Raises: Exception: when the CTC decoder is not initialized; ValueError: when decoding_method is not supported.
parameters
([include_sublayers])Returns a list of all Parameters from current layer and its sub-layers.
register_buffer
(name, tensor[, persistable])Registers a tensor as buffer into the layer.
register_forward_post_hook
(hook)Register a forward post-hook for Layer.
register_forward_pre_hook
(hook)Register a forward pre-hook for Layer.
set_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
set_state_dict
(state_dict[, use_structured_name])Set parameters and persistable buffers from state_dict.
softmax
(eouts[, temperature])Get CTC probabilities. Args: eouts (FloatTensor): [B, T, enc_units] Returns: probs (FloatTensor): [B, T, odim].
state_dict
([destination, include_sublayers, ...])Get all parameters and persistable buffers of current layer and its sub-layers.
sublayers
([include_self])Returns a list of sub layers.
to
([device, dtype, blocking])Cast the parameters and buffers of Layer by the given device, dtype and blocking.
to_static_state_dict
([destination, ...])Get all parameters and buffers of current layer and its sub-layers.
train
()Sets this Layer and all its sublayers to training mode.
backward
register_state_dict_hook
reset_decoder
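The `forced_align` entry in the table above takes a (T, D) score matrix, a label sequence of length L, and a blank index, and returns one label per frame. As a rough illustration of that idea, the following pure-Python sketch runs a Viterbi search over plain lists of log-probabilities. It is not PaddleSpeech's implementation: the name `forced_align_sketch` and the toy inputs are hypothetical, and it assumes there are enough frames to fit every label.

```python
import math


def forced_align_sketch(log_probs, labels, blank_id=0):
    """Viterbi-style CTC alignment: return one token id per frame.

    log_probs: list of T frames, each a list of D log-probabilities.
    labels:    list of L label ids (no blanks).
    """
    # Interleave blanks around the labels: [blank, y1, blank, y2, ..., blank].
    ext = [blank_id]
    for tok in labels:
        ext += [tok, blank_id]
    num_frames, num_states = len(log_probs), len(ext)

    neg_inf = -math.inf
    score = [[neg_inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]

    # A path may start in the leading blank or in the first label.
    score[0][0] = log_probs[0][ext[0]]
    if num_states > 1:
        score[0][1] = log_probs[0][ext[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            # Stay in the same state, or advance from the previous one.
            candidates = [(score[t - 1][s], s)]
            if s >= 1:
                candidates.append((score[t - 1][s - 1], s - 1))
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank_id and ext[s] != ext[s - 2]:
                candidates.append((score[t - 1][s - 2], s - 2))
            best_score, best_prev = max(candidates)
            score[t][s] = best_score + log_probs[t][ext[s]]
            back[t][s] = best_prev

    # A valid path ends in the last label or in the trailing blank.
    end_states = [num_states - 1]
    if num_states > 1:
        end_states.append(num_states - 2)
    s = max(end_states, key=lambda i: score[-1][i])

    # Walk the back-pointers to recover the per-frame token ids.
    path = [0] * num_frames
    for t in range(num_frames - 1, -1, -1):
        path[t] = ext[s]
        s = back[t][s]
    return path


# Toy example: ids 0 = blank, 1 = "a", 2 = "b"; the target sequence is "ab".
frame_probs = [
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
    [0.6, 0.2, 0.2],
    [0.1, 0.1, 0.8],
]
toy_log_probs = [[math.log(p) for p in frame] for frame in frame_probs]
alignment = forced_align_sketch(toy_log_probs, [1, 2])
print(alignment)  # [1, 1, 0, 2]
```

The sketch keeps the full score table for readability; a tensor-based version would typically vectorize the inner loop over states.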
- decode()
Get the decoding result.
- Raises:
Exception: when the CTC decoder is not initialized.
ValueError: when decoding_method is not supported.
- Returns:
results_best (list(str)): The best result for a batch of data.
results_beam (list(list(str))): The beam search result for a batch of data.
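To make the difference between a single best result and a list of beam results concrete, here is a minimal CTC prefix beam search in pure Python for one utterance. It is only a sketch of the general technique: it uses no language model, ignores `beam_alpha` and `beam_beta`, and the name `prefix_beam_search_sketch` and the toy data are hypothetical rather than part of PaddleSpeech.

```python
from collections import defaultdict


def prefix_beam_search_sketch(probs, vocab, beam_size=3, blank_id=0):
    """Return hypotheses as (text, probability) pairs, best first.

    probs: list of T frames, each a list of per-token probabilities.
    vocab: list of token strings; vocab[blank_id] is the CTC blank.
    """
    # For each prefix keep (prob. of paths ending in blank, ending in non-blank).
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_non_blank) in beams.items():
            total = p_blank + p_non_blank
            for idx, p in enumerate(frame):
                if idx == blank_id:
                    # A blank keeps the prefix unchanged.
                    next_beams[prefix][0] += total * p
                elif prefix and prefix[-1] == idx:
                    # Repeated token: merges into the prefix unless a blank
                    # separated the two, in which case it extends the prefix.
                    next_beams[prefix][1] += p_non_blank * p
                    next_beams[prefix + (idx,)][1] += p_blank * p
                else:
                    next_beams[prefix + (idx,)][1] += total * p
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(scores) for prefix, scores in ranked[:beam_size]}

    final = sorted(beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
    return [("".join(vocab[i] for i in prefix), sum(scores)) for prefix, scores in final]


# Toy example: per-frame argmax picks blank twice (empty string, p = 0.36),
# but the paths "a_", "_a" and "aa" together give "a" a total of 0.64.
toy_vocab = ["_", "a", "b"]
toy_probs = [[0.6, 0.4, 0.0], [0.6, 0.4, 0.0]]
hypotheses = prefix_beam_search_sketch(toy_probs, toy_vocab)
best_text = hypotheses[0][0]                 # loosely analogous to results_best
beam_texts = [text for text, _ in hypotheses]  # loosely analogous to results_beam
print(best_text, beam_texts[:2])
```

Working in plain probabilities keeps the sketch short; for long utterances, log-space accumulation would be needed to avoid underflow.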
- decode_probs_offline(probs, logits_lens, vocab_list, decoding_method, lang_model_path, beam_alpha, beam_beta, beam_size, cutoff_prob, cutoff_top_n, num_processes)
This function will be deprecated in the future. CTC decoding with probs.
- Args:
probs (Tensor): activation after softmax
logits_lens (Tensor): audio output lengths
vocab_list (list): List of tokens in the vocabulary, for decoding
decoding_method (str): ctc_beam_search
lang_model_path (str): language model path
beam_alpha (float): beam_alpha
beam_beta (float): beam_beta
beam_size (int): beam_size
cutoff_prob (float): cutoff probability in beam search
cutoff_top_n (int): cutoff_top_n
num_processes (int): num_processes
- Raises:
ValueError: when decoding_method is not supported.
- Returns:
List[str]: transcripts.
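For intuition about how a batch of per-frame probabilities, their valid lengths, and a vocabulary list turn into transcripts, the sketch below uses greedy (best-path) decoding: take the argmax per frame, merge repeats, and drop blanks. This is a deliberately simpler technique than the `ctc_beam_search` method named above, written over plain Python lists. The name `greedy_ctc_decode_sketch`, the toy data, and the choice of index 0 for the blank are assumptions for illustration.

```python
def greedy_ctc_decode_sketch(probs, lens, vocab_list, blank_id=0):
    """Best-path decoding: per-frame argmax, merge repeats, drop blanks.

    probs:      batch of utterances, each a list of frames of token probabilities.
    lens:       number of valid (non-padded) frames per utterance.
    vocab_list: token strings indexed by token id.
    """
    transcripts = []
    for utt_probs, n_frames in zip(probs, lens):
        tokens = []
        prev = None
        for frame in utt_probs[:n_frames]:  # ignore padded frames
            best = max(range(len(frame)), key=frame.__getitem__)
            # Emit a token only when it changes and is not the blank.
            if best != prev and best != blank_id:
                tokens.append(vocab_list[best])
            prev = best
        transcripts.append("".join(tokens))
    return transcripts


# Toy batch: ids 0 = blank "_", 1 = "h", 2 = "i"; trailing frames are padding.
toy_vocab_list = ["_", "h", "i"]
pad = [0.0, 1.0, 0.0]
toy_batch = [
    # argmax per frame: h h _ i i, then one padded frame
    [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.9, 0.05, 0.05],
     [0.1, 0.1, 0.8], [0.2, 0.1, 0.7], pad],
    # argmax per frame: i _ i, then three padded frames
    [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1], [0.1, 0.2, 0.7], pad, pad, pad],
]
toy_lens = [5, 3]
transcripts = greedy_ctc_decode_sketch(toy_batch, toy_lens, toy_vocab_list)
print(transcripts)  # ['hi', 'ii']
```

The second utterance shows why the blank matters: the blank frame between the two "i" frames keeps them from being merged.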
- get_decoder(vocab_list, batch_size, beam_alpha, beam_beta, beam_size, num_processes, cutoff_prob, cutoff_top_n)
Initialize and get the CTC decoder.
- Args:
vocab_list (list): List of tokens in the vocabulary, for decoding.
batch_size (int): Batch size for input data
beam_alpha (float): beam_alpha
beam_beta (float): beam_beta
beam_size (int): beam_size
num_processes (int): num_processes
cutoff_prob (float): cutoff probability in beam search
cutoff_top_n (int): cutoff_top_n
- Raises:
ValueError: when decoding_method is not supported.
- Returns:
CTCBeamSearchDecoder
- init_decoder(batch_size, vocab_list, decoding_method, lang_model_path, beam_alpha, beam_beta, beam_size, cutoff_prob, cutoff_top_n, num_processes)
Initialize the CTC decoders.
- Args:
batch_size (int): Batch size for input data
vocab_list (list): List of tokens in the vocabulary, for decoding
decoding_method (str): ctc_beam_search
lang_model_path (str): language model path
beam_alpha (float): beam_alpha
beam_beta (float): beam_beta
beam_size (int): beam_size
cutoff_prob (float): cutoff probability in beam search
cutoff_top_n (int): cutoff_top_n
num_processes (int): num_processes
- Raises:
ValueError: when decoding_method is not supported.
- Returns:
CTCBeamSearchDecoder
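The parameter descriptions above say little about what `cutoff_prob` and `cutoff_top_n` do. In many CTC beam search decoders, parameters with these names prune the candidate tokens of each frame before the beam is extended: keep at most the top n tokens, and stop early once their cumulative probability reaches the cutoff. The sketch below illustrates that interpretation only. Whether PaddleSpeech's decoder behaves exactly this way is an assumption, and the name `prune_frame_sketch` and the default values are hypothetical.

```python
def prune_frame_sketch(frame_probs, cutoff_prob=1.0, cutoff_top_n=40):
    """Return the (token_id, prob) pairs kept for one frame, highest first.

    Assumed semantics (not confirmed by the documentation above):
    - cutoff_top_n caps how many tokens are considered per frame;
    - cutoff_prob < 1.0 additionally stops once the kept tokens cover
      that much cumulative probability.
    """
    ranked = sorted(enumerate(frame_probs), key=lambda pair: pair[1], reverse=True)
    keep = min(cutoff_top_n, len(ranked))
    if cutoff_prob < 1.0:
        cumulative = 0.0
        for count, (_, prob) in enumerate(ranked, start=1):
            cumulative += prob
            if cumulative >= cutoff_prob or count >= cutoff_top_n:
                keep = count
                break
    return ranked[:keep]


toy_frame = [0.5, 0.3, 0.15, 0.05]
kept_all = prune_frame_sketch(toy_frame)                       # no pruning
kept_mass = prune_frame_sketch(toy_frame, cutoff_prob=0.7)     # ids 0 and 1
kept_top1 = prune_frame_sketch(toy_frame, cutoff_top_n=1)      # id 0 only
print([token_id for token_id, _ in kept_mass])
```

Under this reading, smaller values of either parameter trade accuracy for speed, since fewer tokens are tried for every prefix in the beam.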