Features

Dataset

  • Aishell

  • LibriSpeech

  • THCHS30

  • TIMIT

Speech Recognition

Language Model

  • N-gram

Decoder

  • CTC greedy

  • CTC prefix beam search

  • greedy

  • beam search

  • attention rescore
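
As an illustration of the simplest decoder listed above, CTC greedy decoding can be sketched as follows. This is a minimal, library-free sketch and not the toolkit's own implementation; the function name `ctc_greedy_decode`, the `blank_id` parameter, and the list-of-lists input layout are assumptions made for this example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, blank_id=0):
    """Greedy (best-path) CTC decoding.

    frame_scores: one list of per-token scores for each time frame
                  (probabilities or log-probabilities both work,
                  since only the argmax is used).
    blank_id:     index of the CTC blank token (assumed to be 0 here).
    """
    # 1. Pick the highest-scoring token in every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # 2. Collapse consecutive repeats into a single token.
    collapsed = [token for token, _ in groupby(best_path)]
    # 3. Drop blanks; a blank between two equal tokens keeps both of them.
    return [token for token in collapsed if token != blank_id]


frames = [
    [0.1, 0.8, 0.1],    # token 1
    [0.1, 0.8, 0.1],    # token 1 (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # token 1 (kept, separated by a blank)
    [0.1, 0.1, 0.8],    # token 2
]
print(ctc_greedy_decode(frames))  # [1, 1, 2]
```

The beam search variants keep several candidate prefixes per frame instead of a single best path, which is why they are listed as separate decoders.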

Deployment

  • Paddle Inference

Alignment

  • MFA

  • CTC Alignment

Speech Frontend

  • Audio

    • Auto Gain

  • Feature

    • Kaldi fbank

    • Kaldi mfcc

    • linear

    • delta delta
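
For readers unfamiliar with the last feature type, delta and delta-delta (first- and second-order dynamic) features are commonly computed with a regression over neighbouring frames. The sketch below uses the standard regression formula with edge frames repeated as padding; the function names, the `window` parameter, and the padding choice are assumptions for this example, and the toolkit's actual implementation may differ.

```python
def delta(features, window=2):
    """First-order dynamic features.

    features: list of frames, each frame a list of floats.
    window:   number of neighbouring frames used on each side (assumed 2).

    d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Frames outside the utterance are replaced by the nearest edge frame.
    """
    num_frames = len(features)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(num_frames):
        frame = []
        for dim in range(len(features[t])):
            acc = 0.0
            for n in range(1, window + 1):
                nxt = features[min(t + n, num_frames - 1)][dim]
                prv = features[max(t - n, 0)][dim]
                acc += n * (nxt - prv)
            frame.append(acc / denom)
        out.append(frame)
    return out


def delta_delta(features, window=2):
    """Second-order dynamic features: the delta of the delta features."""
    return delta(delta(features, window), window)


# A feature that rises by 1 per frame has slope 1 and zero curvature
# in the middle of the utterance.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
d1 = delta(ramp)
d2 = delta_delta(ramp)
```

In practice the static, delta, and delta-delta vectors are usually concatenated per frame before being fed to the acoustic model.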

Speech Augmentation

  • Audio

    • Volume Perturbation

    • Speed Perturbation

    • Shifting Perturbation

    • Online Bayesian Normalization

    • Noise Perturbation

    • Impulse Response

  • Spectrum

    • SpecAugment

    • Adaptive SpecAugment

Tokenizer

  • Chinese/English Character

  • English Word

  • Sentence Piece

Word Segmentation

Grapheme To Phoneme

  • syllable

  • phoneme