paddlespeech.t2s.frontend.phonectic module
- class paddlespeech.t2s.frontend.phonectic.Chinese[source]
Bases: Phonetics
Normalize Chinese text sequence and convert it into ids.
- Attributes:
vocab_size
Vocab size.
Methods
__call__(sentence)
Convert the input text sequence into a pronunciation id sequence. Args: sentence (str): The input text sequence. Returns: List[int]: The pronunciation id sequence.
numericalize(phonemes)
Convert a pronunciation sequence into a pronunciation id sequence. Args: phonemes (List[str]): The pronunciation sequence. Returns: List[int]: The pronunciation id sequence.
phoneticize(sentence)
Normalize the input text sequence and convert it into a pronunciation sequence. Args: sentence (str): The input text sequence. Returns: List[str]: The pronunciation sequence.
reverse(ids)
Convert a pronunciation id sequence back into a pronunciation sequence. Args: ids (List[int]): The pronunciation id sequence. Returns: List[str]: The pronunciation sequence.
- numericalize(phonemes)[source]
Convert a pronunciation sequence into a pronunciation id sequence.
- Args:
phonemes (List[str]): The pronunciation sequence.
- Returns:
List[int]: The pronunciation id sequence.
- phoneticize(sentence)[source]
Normalize the input text sequence and convert it into a pronunciation sequence.
- Args:
sentence (str): The input text sequence.
- Returns:
List[str]: The pronunciation sequence.
- reverse(ids)[source]
Convert a pronunciation id sequence back into a pronunciation sequence.
- Args:
ids (List[int]): The pronunciation id sequence.
- Returns:
List[str]: The pronunciation sequence.
- property vocab_size
Vocab size.
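The relationship between numericalize, reverse, and vocab_size can be illustrated with a minimal stand-in. This is only a sketch, not the library's implementation: the class name ToyPhonetics, the tiny phoneme list, and the choice to map unknown symbols to "<unk>" are all assumptions made for illustration.

```python
from typing import Dict, List


class ToyPhonetics:
    """Hypothetical stand-in that mimics the numericalize/reverse contract."""

    def __init__(self, phonemes: List[str]) -> None:
        # Assumption: an id is simply the position of a phoneme in the list.
        self._id_to_phoneme: List[str] = list(phonemes)
        self._phoneme_to_id: Dict[str, int] = {
            p: i for i, p in enumerate(self._id_to_phoneme)
        }

    @property
    def vocab_size(self) -> int:
        return len(self._id_to_phoneme)

    def numericalize(self, phonemes: List[str]) -> List[int]:
        # Assumption: symbols outside the vocabulary fall back to "<unk>".
        unk = self._phoneme_to_id["<unk>"]
        return [self._phoneme_to_id.get(p, unk) for p in phonemes]

    def reverse(self, ids: List[int]) -> List[str]:
        return [self._id_to_phoneme[i] for i in ids]


# Made-up pinyin-like symbols, not the real Chinese phoneme inventory.
toy = ToyPhonetics(["<pad>", "<unk>", "n", "i3", "h", "ao3"])
ids = toy.numericalize(["n", "i3", "h", "ao3"])
print(ids)               # [2, 3, 4, 5]
print(toy.reverse(ids))  # ['n', 'i3', 'h', 'ao3']
```

In this sketch reverse(numericalize(x)) returns x for any sequence made only of in-vocabulary symbols; whether the real class guarantees the same round trip is not stated in this reference.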
- class paddlespeech.t2s.frontend.phonectic.English(phone_vocab_path=None)[source]
Bases: Phonetics
Normalize the input text sequence and convert it into a pronunciation id sequence.
https://github.com/Kyubyong/g2p/blob/master/g2p_en/g2p.py
- phonemes = ["<pad>", "<unk>", "<s>", "</s>"] + [
'AA0', 'AA1', 'AA2', 'AE0', 'AE1', 'AE2', 'AH0', 'AH1', 'AH2', 'AO0', 'AO1', 'AO2', 'AW0', 'AW1', 'AW2', 'AY0', 'AY1', 'AY2', 'B', 'CH', 'D', 'DH', 'EH0', 'EH1', 'EH2', 'ER0', 'ER1', 'ER2', 'EY0', 'EY1', 'EY2', 'F', 'G', 'HH', 'IH0', 'IH1', 'IH2', 'IY0', 'IY1', 'IY2', 'JH', 'K', 'L', 'M', 'N', 'NG', 'OW0', 'OW1', 'OW2', 'OY0', 'OY1', 'OY2', 'P', 'R', 'S', 'SH', 'T', 'TH', 'UH0', 'UH1', 'UH2', 'UW', 'UW0', 'UW1', 'UW2', 'V', 'W', 'Y', 'Z', 'ZH']
- Attributes:
vocab_size
Vocab size.
Methods
__call__(sentence)
Convert the input text sequence into a pronunciation id sequence. Args: sentence (str): The input text sequence. Returns: List[int]: The pronunciation id sequence.
numericalize(phonemes)
Convert a pronunciation sequence into a pronunciation id sequence. Args: phonemes (List[str]): The pronunciation sequence. Returns: List[int]: The pronunciation id sequence.
phoneticize(sentence)
Normalize the input text sequence and convert it into a pronunciation sequence. Args: sentence (str): The input text sequence. Returns: List[str]: The pronunciation sequence.
reverse(ids)
Convert a pronunciation id sequence back into a pronunciation sequence. Args: ids (List[int]): The pronunciation id sequence. Returns: List[str]: The pronunciation sequence.
get_input_ids(sentence, merge_sentences=False, to_tensor=True)
- LEXICON = {'ai': [['EY0', 'AY1']]}
- get_input_ids(sentence: str, merge_sentences: bool = False, to_tensor: bool = True) -> Tensor[source]
- numericalize(phonemes)[source]
Convert a pronunciation sequence into a pronunciation id sequence.
- Args:
phonemes (List[str]): The pronunciation sequence.
- Returns:
List[int]: The pronunciation id sequence.
- phoneticize(sentence)[source]
Normalize the input text sequence and convert it into a pronunciation sequence.
- Args:
sentence (str): The input text sequence.
- Returns:
List[str]: The pronunciation sequence.
- reverse(ids)[source]
Convert a pronunciation id sequence back into a pronunciation sequence.
- Args:
ids (List[int]): The pronunciation id sequence.
- Returns:
List[str]: The pronunciation sequence.
- property vocab_size
Vocab size.
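The pipeline documented for this class (phoneticize, then numericalize, with reverse as the inverse lookup) can be sketched without the real grapheme-to-phoneme backend. Everything below is an assumption for illustration: the names PHONEMES, TOY_LEXICON, and text_to_ids are hypothetical, the phoneme list is only an excerpt of the one shown above, ids are assumed to be list positions, and a toy word lexicon stands in for the actual G2P model. The lexicon keeps the nested shape of LEXICON (a list of candidate pronunciations per word) and simply takes the first candidate.

```python
from typing import Dict, List

# Special tokens first, then a small excerpt of the ARPAbet symbols listed above.
PHONEMES: List[str] = ["<pad>", "<unk>", "<s>", "</s>", "AY1", "EY0", "HH", "IY1"]
PHONEME_TO_ID: Dict[str, int] = {p: i for i, p in enumerate(PHONEMES)}

# Hypothetical toy lexicon: word -> list of candidate pronunciations.
TOY_LEXICON: Dict[str, List[List[str]]] = {
    "ai": [["EY0", "AY1"]],
    "he": [["HH", "IY1"]],
}


def phoneticize(sentence: str) -> List[str]:
    """Lowercase, split on whitespace, and look each word up in the toy lexicon."""
    phones: List[str] = []
    for word in sentence.lower().split():
        candidates = TOY_LEXICON.get(word)
        # Assumption: out-of-lexicon words become a single "<unk>" symbol.
        phones.extend(candidates[0] if candidates else ["<unk>"])
    return phones


def numericalize(phonemes: List[str]) -> List[int]:
    return [PHONEME_TO_ID.get(p, PHONEME_TO_ID["<unk>"]) for p in phonemes]


def reverse(ids: List[int]) -> List[str]:
    return [PHONEMES[i] for i in ids]


def text_to_ids(sentence: str) -> List[int]:
    """Mirrors the documented role of __call__: text in, id sequence out."""
    return numericalize(phoneticize(sentence))


ids = text_to_ids("He AI")
print(ids)           # [6, 7, 5, 4]
print(reverse(ids))  # ['HH', 'IY1', 'EY0', 'AY1']
```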
- class paddlespeech.t2s.frontend.phonectic.EnglishCharacter[source]
Bases: Phonetics
Normalize the input text sequence and convert it into character id sequence.
- Attributes:
vocab_size
Vocab size.
Methods
__call__(sentence)
Normalize the input text sequence and convert it into a character id sequence. Args: sentence (str): The input text sequence. Returns: List[int]: The character id sequence.
numericalize(sentence)
Convert a text sequence into character ids. Args: sentence (str): The input text sequence. Returns: List[int]: The character id sequence.
phoneticize(sentence)
Normalize the input text sequence. Args: sentence (str): The input text sequence. Returns: str: The normalized text sequence.
reverse(ids)
Convert a character id sequence into text. Args: ids (List[int]): The character id sequence. Returns: str: The text sequence recovered from the ids.
- numericalize(sentence)[source]
Convert a text sequence into character ids.
- Args:
sentence (str): The input text sequence.
- Returns:
List[int]: The character id sequence.
- phoneticize(sentence)[source]
Normalize the input text sequence.
- Args:
sentence (str): The input text sequence.
- Returns:
str: The normalized text sequence.
- reverse(ids)[source]
Convert a character id sequence into text.
- Args:
ids (List[int]): The character id sequence.
- Returns:
str: The text sequence recovered from the ids.
- property vocab_size
Vocab size.
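For the character-level variant, the same three steps operate on strings rather than phoneme lists. The sketch below is a self-contained approximation under stated assumptions: the character inventory CHARS, the "<unk>" fallback, and the normalization rule (lowercasing plus whitespace collapsing) are illustrative choices and may differ from what the library actually does.

```python
import string
from typing import Dict, List

# Assumed character inventory: two special tokens, lowercase letters, and a
# few punctuation marks. The real vocabulary may differ.
CHARS: List[str] = ["<pad>", "<unk>"] + list(string.ascii_lowercase + " ,.!?'")
CHAR_TO_ID: Dict[str, int] = {c: i for i, c in enumerate(CHARS)}


def phoneticize(sentence: str) -> str:
    """Stand-in normalization: lowercase and collapse runs of whitespace."""
    return " ".join(sentence.lower().split())


def numericalize(sentence: str) -> List[int]:
    """Map each character to its id, falling back to "<unk>"."""
    return [CHAR_TO_ID.get(c, CHAR_TO_ID["<unk>"]) for c in sentence]


def reverse(ids: List[int]) -> str:
    """Join the characters looked up from the ids back into a string."""
    return "".join(CHARS[i] for i in ids)


normalized = phoneticize("  Hello,   World! ")
ids = numericalize(normalized)
print(normalized)                  # hello, world!
print(reverse(ids) == normalized)  # True
```

Note that reverse recovers the normalized text, not the raw input: casing and extra whitespace removed during normalization are not restored.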