paddlespeech.s2t.models.asr_interface module
ASR Interface module.
- class paddlespeech.s2t.models.asr_interface.ASRInterface[source]
Bases: object
ASR Interface model implementation.
- Attributes:
attention_plot_class: Get attention plot class.
ctc_plot_class: Get CTC plot class.
Methods
add_arguments(parser): Add arguments to parser.
build(idim, odim, **kwargs): Initialize this class with python-level args.
calculate_all_attentions(xs, ilens, ys): Calculate attention.
calculate_all_ctc_probs(xs, ilens, ys): Calculate CTC probability.
encode(feat): Encode feature in beam_search (optional).
forward(xs, ilens, ys, olens): Compute loss for training.
get_total_subsampling_factor(): Get total subsampling factor.
recognize(x, recog_args[, char_list, rnnlm]): Recognize x for evaluation.
recognize_batch(x, recog_args[, char_list, ...]): Beam search implementation for batch.
scorers(): Get scorers for beam_search (optional).
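Since several methods are marked optional, a concrete model would typically override only the ones it supports. The following is a minimal sketch of that pattern, assuming that methods left unimplemented raise NotImplementedError. SketchInterface and ConstantLossModel are hypothetical stand-ins written for illustration, not PaddleSpeech classes.

```python
class SketchInterface:
    """Hypothetical base class mirroring a few method signatures listed above."""

    def forward(self, xs, ilens, ys, olens):
        raise NotImplementedError("forward method is not implemented")

    def recognize(self, x, recog_args, char_list=None, rnnlm=None):
        raise NotImplementedError("recognize method is not implemented")

    def encode(self, feat):
        raise NotImplementedError("encode method is not implemented")


class ConstantLossModel(SketchInterface):
    """Toy subclass that overrides only forward."""

    def forward(self, xs, ilens, ys, olens):
        # A real model would return a paddle.Tensor loss; a float stands in here.
        return 0.0


model = ConstantLossModel()
loss = model.forward(xs=[], ilens=[], ys=[], olens=[])

# Probe an optional method that the toy subclass did not override.
try:
    model.encode(feat=None)
    encode_supported = True
except NotImplementedError:
    encode_supported = False

print(loss, encode_supported)
```

Callers can use the same try/except probe to decide whether an optional method such as encode is available on a given model.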
- property attention_plot_class
Get attention plot class.
- classmethod build(idim: int, odim: int, **kwargs)[source]
Initialize this class with python-level args.
- Args:
idim (int): The input feature dimension.
odim (int): The output vocabulary size.
- Returns:
ASRInterface: A new instance of ASRInterface.
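To illustrate what "python-level args" can look like in practice, here is a minimal sketch of a build-style classmethod that takes defaults from add_arguments and overrides them with keyword arguments. ToyASRInterface, the --hidden-size and --dropout-rate options, and the merging logic are assumptions made for this example; the actual PaddleSpeech implementation may differ.

```python
import argparse


class ToyASRInterface:
    """Hypothetical stand-in for an ASRInterface subclass (not PaddleSpeech code)."""

    def __init__(self, idim, odim, args):
        self.idim = idim
        self.odim = odim
        self.args = args

    @staticmethod
    def add_arguments(parser):
        # Register model options with defaults; both option names are illustrative.
        parser.add_argument("--hidden-size", type=int, default=256)
        parser.add_argument("--dropout-rate", type=float, default=0.1)
        return parser

    @classmethod
    def build(cls, idim: int, odim: int, **kwargs):
        # Start from the parser defaults, then apply the keyword overrides.
        parser = cls.add_arguments(argparse.ArgumentParser())
        args = parser.parse_args([])
        for key, value in kwargs.items():
            setattr(args, key, value)
        return cls(idim, odim, args)


model = ToyASRInterface.build(idim=80, odim=5000, hidden_size=512)
print(model.idim, model.odim, model.args.hidden_size, model.args.dropout_rate)
```

The point of the pattern is that a model can be constructed from Python code without building a command line, while options that are not passed still receive their parser defaults.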
- calculate_all_attentions(xs, ilens, ys)[source]
Calculate attention.
- Parameters:
xs (list) -- list of padded input sequences [(T1, idim), (T2, idim), ...]
ilens (ndarray) -- batch of lengths of input sequences (B)
ys (list) -- list of character id sequence tensor [(L1), (L2), (L3), ...]
- Returns:
attention weights (B, Lmax, Tmax)
- Return type:
float ndarray
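The shape symbols used here (B, Tmax, Lmax, idim) can be made concrete with a small sketch. It uses plain Python lists instead of ndarrays, and pad_batch is a hypothetical helper written for this example, not part of PaddleSpeech.

```python
def pad_batch(seqs, pad_value=0.0):
    """Pad variable-length sequences of feature frames to a common length."""
    ilens = [len(seq) for seq in seqs]
    tmax = max(ilens)
    idim = len(seqs[0][0])
    padded = [
        [list(frame) for frame in seq]
        + [[pad_value] * idim for _ in range(tmax - len(seq))]
        for seq in seqs
    ]
    return padded, ilens


xs = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],  # T1 = 3 frames, idim = 2
    [[0.7, 0.8]],                          # T2 = 1 frame, idim = 2
]
ys = [[4, 7], [9]]                         # L1 = 2 and L2 = 1 token ids

xs_pad, ilens = pad_batch(xs)
lmax = max(len(y) for y in ys)

# Expected shape of the attention weights: (B, Lmax, Tmax).
att_shape = (len(xs), lmax, max(ilens))
print(ilens, att_shape)
```

With two utterances of 3 and 1 frames and targets of 2 and 1 tokens, B is 2, Tmax is 3, and Lmax is 2.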
- calculate_all_ctc_probs(xs, ilens, ys)[source]
Calculate CTC probability.
- Parameters:
xs (list) -- list of padded input sequences [(T1, idim), (T2, idim), ...]
ilens (ndarray) -- batch of lengths of input sequences (B)
ys (list) -- list of character id sequence tensor [(L1), (L2), (L3), ...]
- Returns:
CTC probabilities (B, Tmax, vocab)
- Return type:
float ndarray
- property ctc_plot_class
Get CTC plot class.
- encode(feat)[source]
Encode feature in beam_search (optional).
- Args:
feat (numpy.ndarray): input feature (T, D)
- Returns:
paddle.Tensor: encoded feature (T, D)
- forward(xs, ilens, ys, olens)[source]
Compute loss for training.
- Parameters:
xs -- batch of padded source sequences paddle.Tensor (B, Tmax, idim)
ilens -- batch of lengths of source sequences (B), paddle.Tensor
ys -- batch of padded target sequences paddle.Tensor (B, Lmax)
olens -- batch of lengths of target sequences (B), paddle.Tensor
- Returns:
loss value
- Return type:
paddle.Tensor
- recognize(x, recog_args, char_list=None, rnnlm=None)[source]
Recognize x for evaluation.
- Parameters:
x (ndarray) -- input acoustic feature (B, T, D) or (T, D)
recog_args (namespace) -- argument namespace containing options
char_list (list) -- list of characters
rnnlm (paddle.nn.Layer) -- language model module
- Returns:
N-best decoding results
- Return type:
list
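The structure of each N-best entry is not specified here. The sketch below assumes each hypothesis is a dict with a "yseq" list of token ids and a "score" float, and shows how char_list could map ids back to text. Both keys and the nbest_to_text helper are assumptions made for illustration.

```python
def nbest_to_text(nbest, char_list):
    """Convert hypothetical N-best entries to (text, score) pairs, best first."""
    results = []
    for hyp in sorted(nbest, key=lambda h: h["score"], reverse=True):
        # Look up each token id in char_list and join the characters.
        text = "".join(char_list[idx] for idx in hyp["yseq"])
        results.append((text, hyp["score"]))
    return results


char_list = ["<blank>", "a", "b", "c"]
nbest = [
    {"yseq": [1, 2], "score": -1.5},
    {"yseq": [3, 1, 2], "score": -0.7},
]

for text, score in nbest_to_text(nbest, char_list):
    print(text, score)
```

Real decoders often include special tokens such as start or end markers in the id sequence; those would need to be stripped before joining.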
- recognize_batch(x, recog_args, char_list=None, rnnlm=None)[source]
Beam search implementation for batch.
- Parameters:
x (paddle.Tensor) -- encoder hidden state sequences (B, Tmax, Henc)
recog_args (namespace) -- argument namespace containing options
char_list (list) -- list of characters
rnnlm (paddle.nn.Layer) -- language model module
- Returns:
N-best decoding results
- Return type:
list