paddlespeech.s2t.decoders.ctcdecoder.swig_wrapper module
Wrapper for various CTC decoders in SWIG.
- class paddlespeech.s2t.decoders.ctcdecoder.swig_wrapper.CTCBeamSearchDecoder(vocab_list, batch_size, beam_size, num_processes, cutoff_prob, cutoff_top_n, _ext_scorer, blank_id)[source]
Bases:
CtcBeamSearchDecoderBatch
Wrapper for CtcBeamSearchDecoderBatch.
- Parameters:
vocab_list (list) -- Vocabulary list.
beam_size (int) -- Width for beam search.
num_processes (int) -- Number of parallel processes.
cutoff_prob (float) -- Cutoff probability in vocabulary pruning, default 1.0, no pruning.
cutoff_top_n (int) -- Cutoff number in pruning, only top cutoff_top_n characters with highest probs in vocabulary will be used in beam search, default 40.
ext_scorer (Scorer) -- External scorer for partially decoded sentence, e.g. word count or language model.
- Attributes:
thisown
The membership flag
Methods
decode
next
reset_state
- class paddlespeech.s2t.decoders.ctcdecoder.swig_wrapper.Scorer(alpha, beta, model_path, vocabulary)[source]
Bases:
Scorer
Wrapper for Scorer.
- Parameters:
alpha (float) -- Weight associated with the language model. The language model is not used when alpha = 0.
beta (float) -- Weight associated with the word count. The word count is not used when beta = 0.
model_path -- Path to load language model.
vocabulary (list) -- Vocabulary list.
- Attributes:
- alpha
- beta
- dictionary
thisown
The membership flag
Methods
get_dict_size
get_log_cond_prob
get_max_order
get_sent_log_prob
is_character_based
make_ngram
reset_params
split_labels
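The role of alpha and beta can be illustrated with a small sketch. It assumes the common formulation in which the language model log probability is scaled by alpha and the word count by beta before both are added to the CTC log probability. The function name `combined_score` and its arguments are hypothetical and are not part of this module; the actual SWIG scorer may combine the terms differently.

```python
def combined_score(ctc_log_prob, lm_log_prob, word_count, alpha, beta):
    """Hypothetical helper: weight LM and word-count terms and add them to the CTC score.

    Assumption: score = log P_ctc + alpha * log P_lm + beta * word_count.
    """
    return ctc_log_prob + alpha * lm_log_prob + beta * word_count


# With alpha = 0 and beta = 0 only the CTC term remains.
ctc_only = combined_score(-2.0, -4.0, 3, alpha=0.0, beta=0.0)

# With non-zero weights both external terms contribute.
weighted = combined_score(-2.0, -4.0, 3, alpha=0.5, beta=1.0)
```

Under this assumption, setting alpha = 0 removes the language model term and setting beta = 0 removes the word count term, which matches the parameter descriptions above.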
- paddlespeech.s2t.decoders.ctcdecoder.swig_wrapper.ctc_beam_search_decoding(probs_seq, vocabulary, beam_size, cutoff_prob=1.0, cutoff_top_n=40, ext_scoring_func=None, blank_id=0)[source]
Wrapper for the CTC Beam Search Decoding function.
- Parameters:
probs_seq (2-D list) -- 2-D list of probability distributions over each time step, with each element being a list of normalized probabilities over vocabulary and blank.
vocabulary (list) -- Vocabulary list.
beam_size (int) -- Width for beam search.
cutoff_prob (float) -- Cutoff probability in pruning, default 1.0, no pruning.
cutoff_top_n (int) -- Cutoff number in pruning, only top cutoff_top_n characters with highest probs in vocabulary will be used in beam search, default 40.
ext_scoring_func -- External scoring function for partially decoded sentence, e.g. word count or language model.
blank_id (int) -- Index of the CTC blank label, default 0.
- Returns:
List of tuples of log probability and sentence as decoding results, in descending order of the probability.
- Return type:
list
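The effect of cutoff_prob and cutoff_top_n on a single time step can be sketched in pure Python. This is an illustration of the pruning described above, not the SWIG implementation: the helper name `prune_step` is hypothetical, and the order in which the two cutoffs are applied is an assumption.

```python
def prune_step(step_probs, cutoff_prob=1.0, cutoff_top_n=40):
    """Hypothetical helper: return the (index, prob) pairs kept for one time step."""
    # Rank vocabulary entries by descending probability.
    ranked = sorted(enumerate(step_probs), key=lambda pair: pair[1], reverse=True)
    keep = len(ranked)
    if cutoff_prob < 1.0:
        # Keep the smallest prefix whose cumulative probability reaches cutoff_prob.
        cumulative = 0.0
        keep = 0
        for _, prob in ranked:
            cumulative += prob
            keep += 1
            if cumulative >= cutoff_prob:
                break
    # Never keep more than cutoff_top_n entries (assumed to apply after cutoff_prob).
    keep = min(keep, cutoff_top_n)
    return ranked[:keep]


step = [0.05, 0.6, 0.25, 0.1]
no_pruning = prune_step(step)                    # defaults keep every entry
by_prob = prune_step(step, cutoff_prob=0.8)      # keeps indices 1 and 2
by_top_n = prune_step(step, cutoff_top_n=1)      # keeps only index 1
```

With the default cutoff_prob of 1.0 and a vocabulary smaller than cutoff_top_n, nothing is pruned, which is consistent with the "no pruning" note in the parameter description.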
- paddlespeech.s2t.decoders.ctcdecoder.swig_wrapper.ctc_beam_search_decoding_batch(probs_split, vocabulary, beam_size, num_processes, cutoff_prob=1.0, cutoff_top_n=40, ext_scoring_func=None, blank_id=0)[source]
Wrapper for the batched CTC beam search decoding function.
- Parameters:
probs_split (3-D list) -- 3-D list with each element as an instance of 2-D list of probabilities used by ctc_beam_search_decoding().
vocabulary (list) -- Vocabulary list.
beam_size (int) -- Width for beam search.
num_processes (int) -- Number of parallel processes.
cutoff_prob (float) -- Cutoff probability in vocabulary pruning, default 1.0, no pruning.
cutoff_top_n (int) -- Cutoff number in pruning, only top cutoff_top_n characters with highest probs in vocabulary will be used in beam search, default 40.
ext_scoring_func -- External scoring function for partially decoded sentence, e.g. word count or language model.
blank_id (int) -- Index of the CTC blank label, default 0.
- Returns:
List of tuples of log probability and sentence as decoding results, in descending order of the probability.
- Return type:
list
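The 3-D input layout can be made concrete with a small check: the outer list holds utterances, each utterance is a 2-D list of time steps, and each time step is a normalized distribution. The helper `describe_probs_split` below is hypothetical and is not part of this module; it is only a sketch for validating input before calling the batch decoder.

```python
import math


def describe_probs_split(probs_split, tol=1e-6):
    """Hypothetical helper: return (num_frames, num_classes) for each utterance.

    Raises ValueError if a time step has an inconsistent width or is not normalized.
    """
    shapes = []
    for utt_idx, probs_seq in enumerate(probs_split):
        if not probs_seq:
            shapes.append((0, 0))
            continue
        num_classes = len(probs_seq[0])
        for t, step in enumerate(probs_seq):
            if len(step) != num_classes:
                raise ValueError(f"utterance {utt_idx}, step {t}: inconsistent width")
            if not math.isclose(sum(step), 1.0, abs_tol=tol):
                raise ValueError(f"utterance {utt_idx}, step {t}: not normalized")
        shapes.append((len(probs_seq), num_classes))
    return shapes


# Two utterances of different lengths over the same two classes.
probs_split = [
    [[0.1, 0.9], [0.5, 0.5]],
    [[0.2, 0.8]],
]
shapes = describe_probs_split(probs_split)
```

Utterances in a batch may have different numbers of time steps, which is why the input is a nested list rather than a rectangular array.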
- paddlespeech.s2t.decoders.ctcdecoder.swig_wrapper.ctc_greedy_decoding(probs_seq, vocabulary, blank_id)[source]
Wrapper for the CTC best-path decoding function in SWIG.
- Parameters:
probs_seq (2-D list) -- 2-D list of probability distributions over each time step, with each element being a list of normalized probabilities over vocabulary and blank.
vocabulary (list) -- Vocabulary list.
blank_id (int) -- Index of the CTC blank label.
- Returns:
Decoding result string.
- Return type:
str
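Best-path decoding can be sketched in pure Python as follows. This is an illustrative reimplementation, not the SWIG code: the function name `greedy_ctc_decode` is hypothetical, and it assumes that vocabulary[i] is the token for column i of each time step, including a placeholder entry at blank_id.

```python
def greedy_ctc_decode(probs_seq, vocabulary, blank_id):
    """Hypothetical pure-Python best-path CTC decoder."""
    # Pick the most probable column at each time step.
    best_path = [max(range(len(step)), key=step.__getitem__) for step in probs_seq]
    chars = []
    prev = None
    for idx in best_path:
        # Collapse consecutive repeats, then drop blanks.
        if idx != prev and idx != blank_id:
            chars.append(vocabulary[idx])
        prev = idx
    return "".join(chars)


vocabulary = ["<blank>", "a", "b"]
probs_seq = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.2, 0.7, 0.1],    # a (new, because a blank separates it)
    [0.1, 0.2, 0.7],    # b
]
result = greedy_ctc_decode(probs_seq, vocabulary, blank_id=0)  # "aab"
```

The blank between the second and fourth steps is what allows the repeated "a" to survive the collapse step.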