paddlespeech.s2t.decoders.beam_search.beam_search module
Beam search module.
- class paddlespeech.s2t.decoders.beam_search.beam_search.BeamSearch(scorers: Dict[str, ScorerInterface], weights: Dict[str, float], beam_size: int, vocab_size: int, sos: int, eos: int, token_list: Optional[List[str]] = None, pre_beam_ratio: float = 1.5, pre_beam_score_key: Optional[str] = None)[source]
Bases:
Layer
Beam search implementation.
Methods
- __call__(*inputs, **kwargs): Call self as a function.
- add_parameter(name, parameter): Adds a Parameter instance.
- add_sublayer(name, sublayer): Adds a sub Layer instance.
- append_token(xs, x): Append new token to prefix tokens.
- apply(fn): Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self.
- beam(weighted_scores, ids): Compute topk full token ids and partial token ids.
- buffers([include_sublayers]): Returns a list of all buffers from the current layer and its sub-layers.
- children(): Returns an iterator over immediate children layers.
- clear_gradients(): Clear the gradients of all parameters for this layer.
- create_parameter(shape[, attr, dtype, ...]): Create parameters for this layer.
- create_tensor([name, persistable, dtype]): Create a Tensor for this layer.
- create_variable([name, persistable, dtype]): Create a Tensor for this layer.
- eval(): Sets this Layer and all its sublayers to evaluation mode.
- extra_repr(): Extra representation of this layer; your own layer may provide a custom implementation.
- forward(x[, maxlenratio, minlenratio]): Perform beam search.
- full_name(): Full name for this layer, composed of name_scope + "/" + MyLayer.__class__.__name__.
- init_hyp(x): Get initial hypothesis data.
- load_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- merge_scores(prev_scores, next_full_scores, ...): Merge scores for a new hypothesis.
- merge_states(states, part_states, part_idx): Merge states for a new hypothesis.
- named_buffers([prefix, include_sublayers]): Returns an iterator over all buffers in the Layer, yielding (name, Tensor) tuples.
- named_children(): Returns an iterator over immediate children layers, yielding both the name of the layer and the layer itself.
- named_parameters([prefix, include_sublayers]): Returns an iterator over all parameters in the Layer, yielding (name, parameter) tuples.
- named_sublayers([prefix, include_self, ...]): Returns an iterator over all sublayers in the Layer, yielding (name, sublayer) tuples.
- parameters([include_sublayers]): Returns a list of all Parameters from the current layer and its sub-layers.
- post_process(i, maxlen, maxlenratio, ...): Perform post-processing of beam search iterations.
- register_buffer(name, tensor[, persistable]): Registers a tensor as a buffer of the layer.
- register_forward_post_hook(hook): Register a forward post-hook for the Layer.
- register_forward_pre_hook(hook): Register a forward pre-hook for the Layer.
- score_full(hyp, x): Score a new hypothesis by self.full_scorers.
- score_partial(hyp, ids, x): Score a new hypothesis by self.part_scorers.
- search(running_hyps, x): Search new tokens for running hypotheses and encoded speech x.
- set_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- set_state_dict(state_dict[, use_structured_name]): Set parameters and persistable buffers from state_dict.
- state_dict([destination, include_sublayers, ...]): Get all parameters and persistable buffers of the current layer and its sub-layers.
- sublayers([include_self]): Returns a list of sub-layers.
- to([device, dtype, blocking]): Cast the parameters and buffers of the Layer by the given device, dtype and blocking.
- to_static_state_dict([destination, ...]): Get all parameters and buffers of the current layer and its sub-layers.
- train(): Sets this Layer and all its sublayers to training mode.
- backward
- register_state_dict_hook
- static append_token(xs: Tensor, x: Union[int, Tensor]) Tensor [source]
Append new token to prefix tokens.
- Args:
xs (paddle.Tensor): The prefix tokens, (T,).
x (int): The new token to append.
- Returns:
paddle.Tensor: (T+1,), a new tensor containing xs + [x], with xs.dtype and xs.device.
- beam(weighted_scores: Tensor, ids: Tensor) Tuple[Tensor, Tensor] [source]
Compute topk full token ids and partial token ids.
- Args:
- weighted_scores (paddle.Tensor): The weighted sum scores for each token.
Its shape is (self.n_vocab,).
- ids (paddle.Tensor): The partial (global) token ids over which to compute topk.
- Returns:
- Tuple[paddle.Tensor, paddle.Tensor]:
The topk full token ids and partial token ids. Their shapes are (self.beam_size,), i.e. (global ids, local ids relative to the partial candidate set).
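The selection can be sketched in plain Python (a hypothetical stand-in; the real implementation applies paddle's topk to the weighted score tensor):

```python
# Sketch of BeamSearch.beam: pick the top-k token ids from the weighted
# scores, then translate each winner into a "local" index within the
# pre-beamed candidate ids. In the real code, entries outside `ids`
# are masked out, so every winner is guaranteed to appear in `ids`.

def beam(weighted_scores, ids, beam_size):
    # top-k global token ids ranked by weighted score
    top_ids = sorted(range(len(weighted_scores)),
                     key=lambda i: weighted_scores[i],
                     reverse=True)[:beam_size]
    # local position of each winner inside the partial id list
    local_ids = [ids.index(i) for i in top_ids]
    return top_ids, local_ids

scores = [0.1, 0.9, 0.4, 0.7]   # shape (n_vocab,)
ids = [0, 1, 2, 3]              # partial (global) token ids
print(beam(scores, ids, 2))     # ([1, 3], [1, 3])
```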
- forward(x: Tensor, maxlenratio: float = 0.0, minlenratio: float = 0.0) List[Hypothesis] [source]
Perform beam search.
- Args:
x (paddle.Tensor): Encoded speech feature, (T, D).
maxlenratio (float): Input length ratio used to obtain the max output length.
- If maxlenratio=0.0 (default), an end-detect function is used
to find maximum hypothesis lengths automatically.
- If maxlenratio<0.0, its absolute value is interpreted
as a constant max output length.
minlenratio (float): Input length ratio used to obtain the min output length.
- Returns:
list[Hypothesis]: N-best decoding results
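The maxlenratio semantics above can be sketched as a small helper (hypothetical function name; not part of the API):

```python
# Sketch of how maxlenratio maps to a maximum output length for an
# input of T encoder frames, per the semantics documented above.

def max_output_length(T, maxlenratio):
    if maxlenratio == 0.0:
        # no ratio cap; the end-detect function decides termination,
        # bounded by the number of input frames
        return T
    if maxlenratio < 0.0:
        # absolute value is a constant max output length
        return int(-maxlenratio)
    return max(1, int(maxlenratio * T))

print(max_output_length(100, 0.5))   # 50
print(max_output_length(100, -30))   # 30
```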
- init_hyp(x: Tensor) List[Hypothesis] [source]
Get an initial hypothesis data.
- Args:
x (paddle.Tensor): The encoder output feature, (T, D)
- Returns:
List[Hypothesis]: The initial hypotheses.
- static merge_scores(prev_scores: Dict[str, float], next_full_scores: Dict[str, Tensor], full_idx: int, next_part_scores: Dict[str, Tensor], part_idx: int) Dict[str, Tensor] [source]
Merge scores for new hypothesis.
- Args:
- prev_scores (Dict[str, float]): The previous hypothesis scores by self.scorers.
- next_full_scores (Dict[str, paddle.Tensor]): Scores by self.full_scorers.
- full_idx (int): The next token id for next_full_scores.
- next_part_scores (Dict[str, paddle.Tensor]): Scores of partial tokens by self.part_scorers.
- part_idx (int): The new token id for next_part_scores.
- Returns:
- Dict[str, paddle.Tensor]: The new score dict.
Its keys are names of self.full_scorers and self.part_scorers. Its values are scalar tensors by the scorers.
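The merge rule can be sketched in plain Python (lists stand in for the paddle tensors of the real implementation):

```python
# Sketch of BeamSearch.merge_scores: each scorer's new hypothesis score
# is its previous accumulated score plus the score of the token just
# chosen. Full scorers are indexed by the global token id; partial
# scorers by the position within the scored candidate ids.

def merge_scores(prev_scores, next_full_scores, full_idx,
                 next_part_scores, part_idx):
    new_scores = {}
    for k, v in next_full_scores.items():
        new_scores[k] = prev_scores[k] + v[full_idx]
    for k, v in next_part_scores.items():
        new_scores[k] = prev_scores[k] + v[part_idx]
    return new_scores

prev = {"lm": 1.0, "ctc": 0.5}
full = {"lm": [0.2, 0.3]}   # scores over the full vocabulary
part = {"ctc": [0.1]}       # scores over the partial candidates
print(merge_scores(prev, full, 1, part, 0))  # {'lm': 1.3, 'ctc': 0.6}
```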
- merge_states(states: Any, part_states: Any, part_idx: int) Any [source]
Merge states for new hypothesis.
- Args:
states: States of self.full_scorers.
part_states: States of self.part_scorers.
part_idx (int): The new token id for part_scores.
- Returns:
- Dict[str, Any]: The new state dict.
Its keys are names of self.full_scorers and self.part_scorers. Its values are states of the scorers.
- post_process(i: int, maxlen: int, maxlenratio: float, running_hyps: List[Hypothesis], ended_hyps: List[Hypothesis]) List[Hypothesis] [source]
Perform post-processing of beam search iterations.
- Args:
i (int): The length of hypothesis tokens.
maxlen (int): The maximum length of tokens in beam search.
maxlenratio (float): The maximum length ratio in beam search.
running_hyps (List[Hypothesis]): The running hypotheses in beam search.
ended_hyps (List[Hypothesis]): The ended hypotheses in beam search.
- Returns:
List[Hypothesis]: The new running hypotheses.
- score_full(hyp: Hypothesis, x: Tensor) Tuple[Dict[str, Tensor], Dict[str, Any]] [source]
Score new hypothesis by self.full_scorers.
- Args:
hyp (Hypothesis): Hypothesis with prefix tokens to score.
x (paddle.Tensor): Corresponding input feature, (T, D).
- Returns:
- Tuple[Dict[str, paddle.Tensor], Dict[str, Any]]: Tuple of
a score dict of hyp, with string keys of self.full_scorers and tensor score values of shape (self.n_vocab,), and a state dict with string keys and state values of self.full_scorers.
- score_partial(hyp: Hypothesis, ids: Tensor, x: Tensor) Tuple[Dict[str, Tensor], Dict[str, Any]] [source]
Score new hypothesis by self.part_scorers.
- Args:
hyp (Hypothesis): Hypothesis with prefix tokens to score.
ids (paddle.Tensor): 1D tensor of new partial tokens to score, len(ids) < n_vocab.
x (paddle.Tensor): Corresponding input feature, (T, D).
- Returns:
- Tuple[Dict[str, paddle.Tensor], Dict[str, Any]]: Tuple of
a score dict of hyp, with string keys of self.part_scorers and tensor score values of shape (len(ids),), and a state dict with string keys and state values of self.part_scorers.
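The contrast between the two scorer families can be sketched as follows (hypothetical helper names; real scorers implement ScorerInterface and return paddle tensors):

```python
# Sketch contrasting the two scorer families used by BeamSearch:
# a full scorer (e.g. a decoder or LM) emits one score per vocabulary
# token, while a partial scorer (e.g. CTC prefix scoring) is only
# evaluated on the candidate ids that survived the pre-beam.

n_vocab = 5

def full_score(prefix):
    # shape (n_vocab,): a score for every token in the vocabulary
    return [0.0] * n_vocab

def partial_score(prefix, ids):
    # shape (len(ids),): one score per pre-beamed candidate id
    return [0.0] * len(ids)

print(len(full_score([1, 2])))             # 5
print(len(partial_score([1, 2], [0, 3])))  # 2
```

Partial scorers exist because some scores (such as CTC prefix scores) are expensive to compute, so they are restricted to the pre-beamed subset of tokens.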
- search(running_hyps: List[Hypothesis], x: Tensor) List[Hypothesis] [source]
Search new tokens for running hypotheses and encoded speech x.
- Args:
running_hyps (List[Hypothesis]): Running hypotheses on beam.
x (paddle.Tensor): Encoded speech feature, (T, D).
- Returns:
List[Hypothesis]: Best sorted hypotheses.
- class paddlespeech.s2t.decoders.beam_search.beam_search.Hypothesis(yseq: Tensor, score: Union[float, Tensor] = 0, scores: Dict[str, Union[float, Tensor]] = {}, states: Dict[str, Any] = {})[source]
Bases:
tuple
Hypothesis data type.
Methods
- asdict(): Convert data to a JSON-friendly dict.
- count(value, /): Return number of occurrences of value.
- index(value[, start, stop]): Return first index of value.
- property score
Alias for field number 1
- property scores
Alias for field number 2
- property states
Alias for field number 3
- property yseq
Alias for field number 0
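Hypothesis is a named tuple; a minimal stand-in with the same field layout can be sketched as (the real class stores paddle.Tensor values, plain Python values are used here):

```python
from typing import Any, Dict, NamedTuple

# Minimal stand-in mirroring the Hypothesis fields documented above:
# yseq (field 0), score (field 1), scores (field 2), states (field 3).
class Hypothesis(NamedTuple):
    yseq: list                      # decoded token id sequence
    score: float = 0.0              # accumulated weighted total score
    scores: Dict[str, float] = {}   # per-scorer accumulated scores
    states: Dict[str, Any] = {}     # per-scorer decoding states

    def asdict(self):
        # JSON-friendly dict, as the real asdict() documents
        return {"yseq": list(self.yseq), "score": float(self.score),
                "scores": dict(self.scores)}

h = Hypothesis(yseq=[1, 5, 2], score=-3.2)
print(h.asdict()["yseq"])   # [1, 5, 2]
```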
- paddlespeech.s2t.decoders.beam_search.beam_search.beam_search(x: Tensor, sos: int, eos: int, beam_size: int, vocab_size: int, scorers: Dict[str, ScorerInterface], weights: Dict[str, float], token_list: Optional[List[str]] = None, maxlenratio: float = 0.0, minlenratio: float = 0.0, pre_beam_ratio: float = 1.5, pre_beam_score_key: str = 'full') list [source]
Perform beam search with scorers.
- Args:
x (paddle.Tensor): Encoded speech feature, (T, D).
sos (int): Start-of-sequence id.
eos (int): End-of-sequence id.
beam_size (int): The number of hypotheses kept during search.
vocab_size (int): The size of the vocabulary.
scorers (dict[str, ScorerInterface]): Dict of decoder modules, e.g., Decoder, CTCPrefixScorer, LM. A scorer is ignored if it is None.
weights (dict[str, float]): Dict of weights for each scorer. A scorer is ignored if its weight is 0.
token_list (list[str]): List of tokens for debug logging.
maxlenratio (float): Input length ratio used to obtain the max output length. If maxlenratio=0.0 (default), an end-detect function is used to find maximum hypothesis lengths automatically.
minlenratio (float): Input length ratio used to obtain the min output length.
pre_beam_score_key (str): Key of scores used to perform pre-beam search.
pre_beam_ratio (float): The beam size in the pre-beam search will be int(pre_beam_ratio * beam_size).
- Returns:
List[Dict]: N-best decoding results
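The pre-beam step described by pre_beam_ratio and pre_beam_score_key can be sketched in plain Python (hypothetical helper; the real implementation ranks paddle tensors):

```python
# Sketch of the pre-beam step: before the partial scorers run, the
# candidate set is narrowed to int(pre_beam_ratio * beam_size) token
# ids ranked by the scores selected via pre_beam_score_key
# ("full" = the weighted sum of all full scorers).

def pre_beam(scores, beam_size, pre_beam_ratio=1.5):
    pre_beam_size = int(pre_beam_ratio * beam_size)
    ids = sorted(range(len(scores)), key=lambda i: scores[i],
                 reverse=True)[:pre_beam_size]
    return sorted(ids)   # candidate token ids for the partial scorers

vocab_scores = [0.1, 0.8, 0.3, 0.9, 0.2, 0.7]
print(pre_beam(vocab_scores, beam_size=2))  # [1, 3, 5]
```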