paddlespeech.s2t.modules.ctc module

class paddlespeech.s2t.modules.ctc.CTCDecoder(*args, **kwargs)[source]

Bases: CTCDecoderBase

Methods

__call__(*inputs, **kwargs)

Call self as a function.

add_parameter(name, parameter)

Adds a Parameter instance.

add_sublayer(name, sublayer)

Adds a sub Layer instance.

apply(fn)

Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.

argmax(hs_pad)

Argmax of frame activations. Args: hs_pad (paddle.Tensor): 3d tensor (B, Tmax, eprojs). Returns: paddle.Tensor: argmax applied 2d tensor (B, Tmax).

buffers([include_sublayers])

Returns a list of all buffers from current layer and its sub-layers.

children()

Returns an iterator over immediate children layers.

clear_gradients()

Clear the gradients of all parameters for this layer.

create_parameter(shape[, attr, dtype, ...])

Create parameters for this layer.

create_tensor([name, persistable, dtype])

Create Tensor for this layer.

create_variable([name, persistable, dtype])

Create Tensor for this layer.

decode()

Get the decoding result.

decode_probs_offline(probs, logits_lens, ...)

This function will be deprecated in the future. CTC decoding with probs.

del_decoder()

Delete the decoder

eval()

Sets this Layer and all its sublayers to evaluation mode.

extra_repr()

Extra representation of this layer, you can have custom implementation of your own layer.

forced_align(ctc_probs, y[, blank_id])

CTC forced alignment. Args: ctc_probs (paddle.Tensor): hidden state sequence, 2d tensor (T, D). y (paddle.Tensor): label id sequence tensor, 1d tensor (L). blank_id (int): blank symbol index. Returns: paddle.Tensor: best alignment result, (T).

forward(hs_pad, hlens, ys_pad, ys_lens)

Calculate CTC loss.

full_name()

Full name for this layer, composed by name_scope + "/" + MyLayer.__class__.__name__

get_decoder(vocab_list, batch_size, ...)

Initialize and get the ctc decoder.

init_decoder(batch_size, vocab_list, ...)

Initialize the ctc decoders.

load_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

log_softmax(hs_pad[, temperature])

Log softmax of frame activations. Args: hs_pad (Tensor): 3d tensor (B, Tmax, eprojs). Returns: paddle.Tensor: log softmax applied 3d tensor (B, Tmax, odim).

named_buffers([prefix, include_sublayers])

Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.

named_children()

Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.

named_parameters([prefix, include_sublayers])

Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.

named_sublayers([prefix, include_self, ...])

Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer.

next(probs, logits_lens)

Input probs into the ctc decoder.

parameters([include_sublayers])

Returns a list of all Parameters from current layer and its sub-layers.

register_buffer(name, tensor[, persistable])

Registers a tensor as buffer into the layer.

register_forward_post_hook(hook)

Register a forward post-hook for Layer.

register_forward_pre_hook(hook)

Register a forward pre-hook for Layer.

set_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

set_state_dict(state_dict[, use_structured_name])

Set parameters and persistable buffers from state_dict.

softmax(eouts[, temperature])

Get CTC probabilities. Args: eouts (FloatTensor): [B, T, enc_units] Returns: probs (FloatTensor): [B, T, odim].

state_dict([destination, include_sublayers, ...])

Get all parameters and persistable buffers of current layer and its sub-layers.

sublayers([include_self])

Returns a list of sub layers.

to([device, dtype, blocking])

Cast the parameters and buffers of Layer to the given device, dtype and blocking.

to_static_state_dict([destination, ...])

Get all parameters and buffers of current layer and its sub-layers.

train()

Sets this Layer and all its sublayers to training mode.

backward

register_state_dict_hook

reset_decoder
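
A minimal usage sketch tying together the tensor-level methods summarized above (forward via __call__, softmax, log_softmax, argmax, forced_align). Shapes follow the method summaries; the helper name and the assumption that forced_align consumes per-utterance log_softmax output are illustrative, not part of the documented API.

    def ctc_outputs(ctc_decoder, hs_pad, hlens, ys_pad, ys_lens):
        """Illustrative helper; ctc_decoder is an already constructed CTCDecoder."""
        # Training: calling the layer runs forward(), the CTC loss over padded
        # encoder states hs_pad (B, Tmax, eprojs) and padded label ids ys_pad.
        loss = ctc_decoder(hs_pad, hlens, ys_pad, ys_lens)

        # Inference: per-frame posteriors (B, Tmax, odim) and greedy ids (B, Tmax).
        probs = ctc_decoder.softmax(hs_pad)
        greedy_ids = ctc_decoder.argmax(hs_pad)

        # Alignment for the first utterance: forced_align() expects a 2d (T, D)
        # score tensor and a 1d (L,) label tensor; feeding it log_softmax output
        # is an assumption made for this sketch.
        log_probs = ctc_decoder.log_softmax(hs_pad)
        frame_len, label_len = int(hlens[0]), int(ys_lens[0])
        alignment = ctc_decoder.forced_align(log_probs[0, :frame_len],
                                             ys_pad[0, :label_len])

        return loss, probs, greedy_ids, alignment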

decode()[source]

Get the decoding result.

Raises:

Exception: when the ctc decoder is not initialized.
ValueError: when decoding_method is not supported.

Returns:

results_best (list(str)): The best result for a batch of data.
results_beam (list(list(str))): The beam search result for a batch of data.

decode_probs_offline(probs, logits_lens, vocab_list, decoding_method, lang_model_path, beam_alpha, beam_beta, beam_size, cutoff_prob, cutoff_top_n, num_processes)[source]

This function will be deprecated in the future. CTC decoding with probs.

Args:

probs (Tensor): activation after softmax.
logits_lens (Tensor): audio output lens.
vocab_list (list): list of tokens in the vocabulary, for decoding.
decoding_method (str): ctc_beam_search.
lang_model_path (str): language model path.
beam_alpha (float): beam_alpha.
beam_beta (float): beam_beta.
beam_size (int): beam_size.
cutoff_prob (float): cutoff probability in beam search.
cutoff_top_n (int): cutoff_top_n.
num_processes (int): num_processes.

Raises:

ValueError: when decoding_method is not supported.

Returns:

List[str]: transcripts.
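
A hedged example of offline beam-search decoding with decode_probs_offline(); the helper name, the language model path, and all numeric settings are placeholders, not recommended values.

    def offline_transcripts(ctc_decoder, eouts, eouts_len, vocab_list):
        # eouts are encoder outputs [B, T, enc_units]; softmax() turns them
        # into per-frame probabilities as expected by decode_probs_offline().
        probs = ctc_decoder.softmax(eouts)
        return ctc_decoder.decode_probs_offline(
            probs=probs,
            logits_lens=eouts_len,
            vocab_list=vocab_list,
            decoding_method="ctc_beam_search",
            lang_model_path="path/to/lm.klm",  # placeholder path
            beam_alpha=1.9,                    # placeholder tuning values
            beam_beta=0.3,
            beam_size=300,
            cutoff_prob=0.99,
            cutoff_top_n=40,
            num_processes=4,
        )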

del_decoder()[source]

Delete the decoder

get_decoder(vocab_list, batch_size, beam_alpha, beam_beta, beam_size, num_processes, cutoff_prob, cutoff_top_n)[source]

Initialize and get the ctc decoder.

Args:

vocab_list (list): list of tokens in the vocabulary, for decoding.
batch_size (int): batch size for input data.
beam_alpha (float): beam_alpha.
beam_beta (float): beam_beta.
beam_size (int): beam_size.
num_processes (int): num_processes.
cutoff_prob (float): cutoff probability in beam search.
cutoff_top_n (int): cutoff_top_n.

Raises:

ValueError: when decoding_method is not supported.

Returns:

CTCBeamSearchDecoder

init_decoder(batch_size, vocab_list, decoding_method, lang_model_path, beam_alpha, beam_beta, beam_size, cutoff_prob, cutoff_top_n, num_processes)[source]

Initialize the ctc decoders.

Args:

batch_size (int): batch size for input data.
vocab_list (list): list of tokens in the vocabulary, for decoding.
decoding_method (str): ctc_beam_search.
lang_model_path (str): language model path.
beam_alpha (float): beam_alpha.
beam_beta (float): beam_beta.
beam_size (int): beam_size.
cutoff_prob (float): cutoff probability in beam search.
cutoff_top_n (int): cutoff_top_n.
num_processes (int): num_processes.

Raises:

ValueError: when decoding_method is not supported.

Returns:

CTCBeamSearchDecoder
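
A minimal sketch of initializing the beam search decoder for a batch; the helper name, vocabulary, language model path, and beam settings are placeholders to be tuned per task.

    def setup_decoder(ctc_decoder, vocab_list, batch_size):
        # Placeholder settings; tune alpha/beta/beam_size for your LM and data.
        ctc_decoder.init_decoder(
            batch_size=batch_size,
            vocab_list=vocab_list,
            decoding_method="ctc_beam_search",
            lang_model_path="path/to/lm.klm",  # placeholder path
            beam_alpha=1.9,
            beam_beta=0.3,
            beam_size=300,
            cutoff_prob=0.99,
            cutoff_top_n=40,
            num_processes=4,
        )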

next(probs, logits_lens)[source]

Input probs into the ctc decoder.

Args:

probs (list(list(float))): probs for a batch of data.
logits_lens (list(int)): logits lens for a batch of data.

Raises:

Exception: when the ctc decoder is not initialized.
ValueError: when decoding_method is not supported.
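
Putting next() and decode() together in a chunk-by-chunk decoding sketch, assuming init_decoder() has already been called as shown above. The helper name, the list conversions, and the assumption that decode() returns the (results_best, results_beam) pair described under Returns are illustrative.

    def streaming_decode(ctc_decoder, chunks, chunk_lens):
        for eouts, eouts_len in zip(chunks, chunk_lens):
            probs = ctc_decoder.softmax(eouts)  # [B, T, odim] posteriors
            # next() expects plain Python containers per its Args; the exact
            # nesting used here is an assumption for illustration.
            ctc_decoder.next(probs.numpy().tolist(), eouts_len.numpy().tolist())
        results_best, results_beam = ctc_decoder.decode()
        ctc_decoder.del_decoder()
        return results_best, results_beam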

reset_decoder(batch_size=-1, beam_size=-1, num_processes=-1, cutoff_prob=-1.0, cutoff_top_n=-1)[source]