paddlespeech.s2t.utils.bleu_score module

This module provides functions to calculate BLEU scores at different levels, e.g. bleu for word-level and char_bleu for char-level.

class paddlespeech.s2t.utils.bleu_score.ErrorCalculator(char_list, sym_space, sym_pad, report_bleu=False)[source]

Bases: object

Calculate BLEU for ST and MT models during training.

Parameters:
  • ys_hat -- numpy array with predicted text (passed to __call__)

  • ys_pad -- numpy array with true (target) text (passed to __call__)

  • char_list -- vocabulary list

  • sym_space -- space symbol

  • sym_pad -- pad symbol

  • report_bleu -- report BLEU score if True

Methods

__call__(ys_hat, ys_pad)

Calculate corpus-level BLEU score.

calculate_corpus_bleu(ys_hat, ys_pad)

Calculate corpus-level BLEU score in a mini-batch.

calculate_corpus_bleu(ys_hat, ys_pad)[source]

Calculate corpus-level BLEU score in a mini-batch.

Parameters:
  • ys_hat (torch.Tensor) -- prediction (batch, seqlen)

  • ys_pad (torch.Tensor) -- reference (batch, seqlen)

Returns:

corpus-level BLEU score

Return type:

float
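
The (batch, seqlen) inputs hold token ids, so they have to be turned back into text before any BLEU score can be computed. The following is a minimal sketch of that step, not the module's actual code. The helper name ids_to_text and the use of -1 as the padding id are assumptions made for illustration; char_list, sym_space and sym_pad play the roles described for the constructor.

```python
from typing import List, Sequence


def ids_to_text(ids: Sequence[int], char_list: List[str],
                sym_space: str, sym_pad: str, pad_id: int = -1) -> str:
    """Turn one row of token ids into a plain string (hypothetical helper)."""
    # Drop padded positions; pad_id = -1 is an assumed convention.
    kept = [i for i in ids if i != pad_id]
    # Look up each remaining id in the vocabulary list.
    symbols = [char_list[i] for i in kept]
    # Join the symbols, map the space symbol to a real space,
    # and remove any pad symbol that is still present.
    return "".join(symbols).replace(sym_space, " ").replace(sym_pad, "")


# Toy vocabulary and a padded mini-batch of two reference sequences.
char_list = ["<blank>", "<space>", "h", "i", "o"]
ys_pad = [[2, 3, -1, -1], [2, 4, 1, 2]]

refs = [ids_to_text(row, char_list, "<space>", "<blank>") for row in ys_pad]
print(refs)  # ['hi', 'ho h']
```

The same helper would be applied to the predictions, after which the two lists of strings can be scored at corpus level.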

paddlespeech.s2t.utils.bleu_score.bleu(hypothesis, reference)[source]

Calculate BLEU. BLEU compares the reference text and the hypothesis text at the word level using sacrebleu.

Parameters:
  • reference (list[list[str]]) -- The reference sentences.

  • hypothesis (list[str]) -- The hypothesis sentence.

Raises:

ValueError -- If the reference length is zero.
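
To make the word-level comparison concrete, here is a minimal sketch of the clipped n-gram precision that BLEU is built on. It is not sacrebleu's implementation and it omits the brevity penalty, smoothing and the combination across n-gram orders. The function names ngrams and clipped_precision are hypothetical, and whitespace splitting is an assumed tokenization.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hypothesis: str, references: List[str], n: int) -> float:
    """Fraction of hypothesis n-grams that also occur in a reference.

    Each n-gram is credited at most as often as it appears in the
    single reference where it is most frequent (clipping).
    """
    hyp_counts = ngrams(hypothesis.split(), n)
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    # Highest count of each n-gram over all references.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return matched / total


hyp = "the cat sat on the mat"
refs = ["the cat is on the mat"]
p1 = clipped_precision(hyp, refs, 1)  # 5 of 6 unigrams match
p2 = clipped_precision(hyp, refs, 2)  # 3 of 5 bigrams match
print(round(p1, 3), round(p2, 3))  # 0.833 0.6
```

Clipping is what keeps a degenerate hypothesis such as "the the the the the the the" from scoring a perfect unigram precision against the same reference.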

paddlespeech.s2t.utils.bleu_score.char_bleu(hypothesis, reference)[source]

Calculate BLEU. BLEU compares the reference text and the hypothesis text at the char level using sacrebleu.

Parameters:
  • reference (list[list[str]]) -- The reference sentences.

  • hypothesis (list[str]) -- The hypothesis sentence.

Raises:

ValueError -- If the reference number is zero.
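
One common way to obtain a char-level score from a word-level scorer is to rewrite each sentence so that every character becomes its own whitespace-separated token. The sketch below shows only that preprocessing step, using the input shapes documented above. Whether char_bleu handles spaces exactly this way is an assumption, and to_char_tokens is a hypothetical helper name.

```python
from typing import List


def to_char_tokens(sentence: str) -> str:
    """Drop spaces and separate the remaining characters with spaces."""
    return " ".join(sentence.replace(" ", ""))


# Shapes follow the documented parameters:
# hypothesis is list[str], reference is list[list[str]].
hypothesis: List[str] = ["hi yo"]
reference: List[List[str]] = [["hi you"]]

char_hyp = [to_char_tokens(h) for h in hypothesis]
char_ref = [[to_char_tokens(r) for r in refs] for refs in reference]

print(char_hyp)  # ['h i y o']
print(char_ref)  # [['h i y o u']]
```

After this step a word-level scorer treats each character as a token, which is useful for languages written without spaces between words.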