paddlespeech.s2t.utils.text_grid module

paddlespeech.s2t.utils.text_grid.align_to_tierformat(align_segs: List[List[int]], subsample: int, token_dict: Dict[int, str], blank_id=0) → List[str]

Generate textgrid.Interval format from alignment segmentations.

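Args:

align_segs (List[List[int]]): segmented CTC alignment ids.

subsample (int): subsampling factor. Input frames use a 25 ms frame length and a 10 ms hop length; the alignment runs at 1/subsample of that frame rate.

token_dict (Dict[int, str]): int -> str map.

blank_id (int, optional): blank id. Defaults to 0.

Returns:

List[str]: list of textgrid.Interval text, one str(start, end, text) item per segment.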
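
The timing arithmetic implied by the arguments above can be sketched in plain Python. This is an illustration only, not the PaddleSpeech implementation: the function name `align_to_tierformat_sketch`, the `hop_ms` parameter, the rule for picking a segment's label, and the exact "start end text" string layout are all assumptions.

```python
from typing import Dict, List


def align_to_tierformat_sketch(align_segs: List[List[int]],
                               subsample: int,
                               token_dict: Dict[int, str],
                               blank_id: int = 0,
                               hop_ms: int = 10) -> List[str]:
    """Hypothetical sketch: turn alignment segments into "start end text" items."""
    intervals = []
    begin_frames = 0
    for seg in align_segs:
        end_frames = begin_frames + len(seg)
        # Assumption: a segment is labeled by its first non-blank id.
        token = next((i for i in seg if i != blank_id), blank_id)
        # Each alignment frame spans `subsample` input hops of `hop_ms` each.
        start_s = begin_frames * subsample * hop_ms / 1000
        end_s = end_frames * subsample * hop_ms / 1000
        intervals.append(f"{start_s:.2f} {end_s:.2f} {token_dict[token]}")
        begin_frames = end_frames
    return intervals


segs = [[0, 0, 0, 1, 1, 1], [2], [0, 0, 3]]
vocab = {0: "<blank>", 1: "a", 2: "b", 3: "c"}
print(align_to_tierformat_sketch(segs, subsample=4, token_dict=vocab))
# ['0.00 0.24 a', '0.24 0.28 b', '0.28 0.40 c']
```

With `subsample=4` and a 10 ms hop, each alignment frame covers 40 ms, so the six-frame first segment spans 0.00 s to 0.24 s.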

paddlespeech.s2t.utils.text_grid.generate_textgrid(maxtime: float, intervals: List[str], output: str, name: str = 'ali') → None

Create alignment textgrid file.

Args:

maxtime (float): audio duration.

intervals (List[str]): CTC output alignment, e.g. "start-time end-time word" per item.

output (str): textgrid filepath.

name (str, optional): tier or layer name. Defaults to 'ali'.
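
To make the expected shape of `intervals` and the resulting file concrete, here is a minimal stand-alone sketch. It does not call PaddleSpeech or any TextGrid library: `parse_intervals` and `write_textgrid_sketch` are hypothetical helpers, and the file layout assumes Praat's long text format with a single interval tier. Padding the tier up to `maxtime` with an empty interval is also an assumption of this sketch.

```python
import os
import tempfile
from typing import List, Tuple


def parse_intervals(intervals: List[str]) -> List[Tuple[float, float, str]]:
    """Split "start end text" items into (start, end, text) tuples."""
    parsed = []
    for item in intervals:
        start, end, text = item.strip().split(maxsplit=2)
        parsed.append((float(start), float(end), text))
    return parsed


def write_textgrid_sketch(maxtime: float, intervals: List[str],
                          output: str, name: str = "ali") -> None:
    """Hypothetical stand-in: write one interval tier as a TextGrid text file."""
    rows = parse_intervals(intervals)
    # Assumption: pad with an empty interval so the tier spans the whole audio.
    if rows and rows[-1][1] < maxtime:
        rows.append((rows[-1][1], maxtime, ""))
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {maxtime}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{name}"',
        "        xmin = 0",
        f"        xmax = {maxtime}",
        f"        intervals: size = {len(rows)}",
    ]
    for i, (start, end, text) in enumerate(rows, start=1):
        escaped = text.replace('"', '""')  # double any quotes inside the label
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {start}",
            f"            xmax = {end}",
            f'            text = "{escaped}"',
        ]
    with open(output, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")


path = os.path.join(tempfile.mkdtemp(), "utt.TextGrid")
write_textgrid_sketch(0.5, ["0.00 0.24 a", "0.24 0.28 b", "0.28 0.40 c"], path)
with open(path, encoding="utf-8") as f:
    print(f.read())
```

The three labeled intervals end at 0.40 s while `maxtime` is 0.5 s, so the sketch writes a fourth, empty interval covering the remainder.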

paddlespeech.s2t.utils.text_grid.segment_alignment(alignment: List[int], blank_id=0) → List[List[int]]

Segment CTC alignment ids into runs of consecutive blanks followed by a repeated label.

Args:

alignment (List[int]): CTC alignment id sequence, e.g. [0, 0, 0, 1, 1, 1, 2, 0, 0, 3]

blank_id (int, optional): blank id. Defaults to 0.

Returns:

List[List[int]]: token-level alignment, i.e. the alignment id sequence split into segments, e.g. [[0, 0, 0, 1, 1, 1], [2], [0, 0, 3]]
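
The segmentation rule shown in the example can be sketched as follows. This is a hypothetical re-implementation for illustration (`segment_alignment_sketch` is not a PaddleSpeech name): each segment collects any leading blanks plus one run of a repeated label. The handling of trailing blanks, which the documented example does not cover, is an assumption: they are appended to the last segment.

```python
from typing import List


def segment_alignment_sketch(alignment: List[int], blank_id: int = 0) -> List[List[int]]:
    """Hypothetical sketch: split CTC alignment ids into per-token segments."""
    segments: List[List[int]] = []
    start = 0
    end = 0
    n = len(alignment)
    while end < n:
        # Skip the run of blanks that precedes the next label.
        while end < n and alignment[end] == blank_id:
            end += 1
        if end == n:
            # Only blanks remain. Assumption: attach them to the last segment,
            # or keep them as their own segment if there is none.
            tail = alignment[start:]
            if segments:
                segments[-1].extend(tail)
            else:
                segments.append(tail)
            break
        # Consume the label and its consecutive repeats.
        label = alignment[end]
        while end < n and alignment[end] == label:
            end += 1
        segments.append(alignment[start:end])
        start = end
    return segments


print(segment_alignment_sketch([0, 0, 0, 1, 1, 1, 2, 0, 0, 3]))
# [[0, 0, 0, 1, 1, 1], [2], [0, 0, 3]]
```

Because each segment keeps its blank frames, the segment lengths still sum to the length of the input, which is what lets a later step convert segment lengths into start and end times.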