paddleaudio.utils.tensor_utils module

Utility functions for Transformer.

paddleaudio.utils.tensor_utils.add_sos_eos(ys_pad: Tensor, sos: int, eos: int, ignore_id: int) -> Tuple[Tensor, Tensor]

Add <sos> and <eos> labels.

Args:

ys_pad (paddle.Tensor): batch of padded target sequences (B, Lmax)
sos (int): index of <sos>
eos (int): index of <eos>
ignore_id (int): index of padding

Returns:

ys_in (paddle.Tensor): (B, Lmax + 1)
ys_out (paddle.Tensor): (B, Lmax + 1)

Examples:
>>> sos_id = 10
>>> eos_id = 11
>>> ignore_id = -1
>>> ys_pad
tensor([[ 1,  2,  3,  4,  5],
        [ 4,  5,  6, -1, -1],
        [ 7,  8,  9, -1, -1]], dtype=paddle.int32)
>>> ys_in, ys_out = add_sos_eos(ys_pad, sos_id, eos_id, ignore_id)
>>> ys_in
tensor([[10,  1,  2,  3,  4,  5],
        [10,  4,  5,  6, 11, 11],
        [10,  7,  8,  9, 11, 11]])
>>> ys_out
tensor([[ 1,  2,  3,  4,  5, 11],
        [ 4,  5,  6, 11, -1, -1],
        [ 7,  8,  9, 11, -1, -1]])
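
The padding behaviour in the example above (ys_in padded with <eos>, ys_out padded with ignore_id) can be sketched in plain Python. This is only an illustration on nested lists, not the library's implementation; the name add_sos_eos_sketch is hypothetical.

```python
from typing import List, Tuple


def add_sos_eos_sketch(
    ys_pad: List[List[int]], sos: int, eos: int, ignore_id: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Illustrative list-based version of the behaviour documented above."""
    # Strip the padding entries from every target sequence.
    ys = [[tok for tok in row if tok != ignore_id] for row in ys_pad]
    # Decoder input starts with <sos>; decoder target ends with <eos>.
    ys_in = [[sos] + y for y in ys]
    ys_out = [y + [eos] for y in ys]
    # Both outputs are padded to the longest unpadded length plus one.
    width = max((len(y) for y in ys_in), default=0)
    # Inputs are padded with <eos>, targets with ignore_id.
    ys_in = [y + [eos] * (width - len(y)) for y in ys_in]
    ys_out = [y + [ignore_id] * (width - len(y)) for y in ys_out]
    return ys_in, ys_out


ys_pad = [[1, 2, 3, 4, 5], [4, 5, 6, -1, -1], [7, 8, 9, -1, -1]]
ys_in, ys_out = add_sos_eos_sketch(ys_pad, sos=10, eos=11, ignore_id=-1)
print(ys_in)   # [[10, 1, 2, 3, 4, 5], [10, 4, 5, 6, 11, 11], [10, 7, 8, 9, 11, 11]]
print(ys_out)  # [[1, 2, 3, 4, 5, 11], [4, 5, 6, 11, -1, -1], [7, 8, 9, 11, -1, -1]]
```

Padding the targets with ignore_id lets a loss function skip those positions, while the inputs need a valid token id, which is presumably why <eos> is used there.
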
paddleaudio.utils.tensor_utils.has_tensor(val)
paddleaudio.utils.tensor_utils.pad_sequence(sequences: List[Tensor], batch_first: bool = False, padding_value: float = 0.0) -> Tensor

Pad a list of variable-length Tensors with padding_value.

pad_sequence stacks a list of Tensors along a new dimension and pads them to equal length. For example, if the input is a list of sequences of size L x *, the output is of size T x B x * if batch_first is False, and B x T x * otherwise.

B is batch size. It is equal to the number of elements in sequences. T is length of the longest sequence. L is length of the sequence. * is any number of trailing dimensions, including none.

Example:
>>> import paddle
>>> from paddleaudio.utils.tensor_utils import pad_sequence
>>> a = paddle.ones([25, 300])
>>> b = paddle.ones([22, 300])
>>> c = paddle.ones([15, 300])
>>> pad_sequence([a, b, c]).shape
[25, 3, 300]
Note:

This function returns a Tensor of size T x B x * or B x T x *, where T is the length of the longest sequence. It assumes that the trailing dimensions and the type of all the Tensors in sequences are the same.

Args:

sequences (list[Tensor]): list of variable length sequences.
batch_first (bool, optional): output will be in B x T x * if True, or in T x B x * otherwise.
padding_value (float, optional): value for padded elements. Default: 0.

Returns:

Tensor of size T x B x * if batch_first is False, or of size B x T x * otherwise.
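
To make the two output layouts concrete, here is a minimal plain-Python sketch for sequences of scalars (no trailing dimensions). It only mirrors the documented shapes on nested lists; pad_sequence_sketch is a hypothetical name, not part of paddleaudio.

```python
from typing import List


def pad_sequence_sketch(
    sequences: List[List[float]],
    batch_first: bool = False,
    padding_value: float = 0.0,
) -> List[List[float]]:
    """Pad scalar sequences to equal length, returned as T x B or B x T."""
    # T is the length of the longest sequence.
    max_len = max((len(seq) for seq in sequences), default=0)
    # Right-pad every sequence to length T; the result is B x T.
    padded = [list(seq) + [padding_value] * (max_len - len(seq)) for seq in sequences]
    if batch_first:
        return padded
    # Transpose B x T into T x B.
    return [[row[t] for row in padded] for t in range(max_len)]


seqs = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
print(pad_sequence_sketch(seqs, batch_first=True))
# [[1.0, 2.0, 3.0], [4.0, 5.0, 0.0], [6.0, 0.0, 0.0]]
print(pad_sequence_sketch(seqs))
# [[1.0, 4.0, 6.0], [2.0, 5.0, 0.0], [3.0, 0.0, 0.0]]
```
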

paddleaudio.utils.tensor_utils.th_accuracy(pad_outputs: Tensor, pad_targets: Tensor, ignore_label: int) -> float

Calculate accuracy.

Args:

pad_outputs (Tensor): Prediction tensors (B * Lmax, D).
pad_targets (LongTensor): Target label tensors (B, Lmax, D).
ignore_label (int): Ignore label id.

Returns:

float: Accuracy value (0.0 - 1.0).
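
As a rough illustration of a masked accuracy of this kind, the sketch below takes the argmax over the D scores at each position and compares it with the target, skipping positions whose target equals ignore_label. It rests on assumptions not stated above: targets are given as one integer label id per position, shaped (B, Lmax), and an all-ignored batch returns 0.0. The name accuracy_sketch is hypothetical, and this is not the library's implementation.

```python
from typing import List


def accuracy_sketch(
    pad_outputs: List[List[float]],
    pad_targets: List[List[int]],
    ignore_label: int,
) -> float:
    """Fraction of non-ignored positions whose argmax matches the target."""
    # Flatten (B, Lmax) targets so they line up with the (B * Lmax, D) scores.
    flat_targets = [label for row in pad_targets for label in row]
    correct = 0
    total = 0
    for scores, target in zip(pad_outputs, flat_targets):
        if target == ignore_label:
            continue  # padded position, excluded from the accuracy
        # Index of the highest score is the predicted label.
        pred = max(range(len(scores)), key=scores.__getitem__)
        correct += int(pred == target)
        total += 1
    # Assumption of this sketch: return 0.0 when every position is ignored.
    return correct / total if total else 0.0


outputs = [
    [0.1, 0.7, 0.2],    # predicts 1
    [0.8, 0.1, 0.1],    # predicts 0
    [0.2, 0.2, 0.6],    # predicts 2
    [0.9, 0.05, 0.05],  # padded position, ignored
]
targets = [[1, 2], [2, -1]]
print(accuracy_sketch(outputs, targets, ignore_label=-1))  # 2 of 3 correct
```
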