pythainlp.soundex
The pythainlp.soundex module provides soundex (phonetic encoding) algorithms for Thai.
Modules
- pythainlp.soundex.soundex(text: str, engine: str = 'udom83', length: int = 4) → str
This function converts Thai text into phonetic code.
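- Parameters:
text (str) – Thai word
engine (str) – soundex engine (default: 'udom83')
length (int) – length of the soundex code (default: 4), for engines that accept a length
- Returns:
Soundex code
- Return type:
str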
- Options for engine:
udom83 (default) - Thai soundex algorithm proposed by Wannee Udompanich [2]
lk82 - Thai soundex algorithm proposed by Vichit Lorchirachoonkul [3]
metasound - Thai soundex algorithm based on a combination of Metaphone and Soundex proposed by Snae & Brückner [1]
prayut_and_somchaip - soundex technique for Thai-English cross-language transliterated word retrieval [4]
- Example:
from pythainlp.soundex import soundex

soundex("ลัก"), soundex("ลัก", engine='lk82'), \
    soundex("ลัก", engine='metasound')
# output: ('ร100000', 'ร1000', 'ล100')

soundex("รัก"), soundex("รัก", engine='lk82'), \
    soundex("รัก", engine='metasound')
# output: ('ร100000', 'ร1000', 'ร100')

soundex("รักษ์"), soundex("รักษ์", engine='lk82'), \
    soundex("รักษ์", engine='metasound')
# output: ('ร100000', 'ร1000', 'ร100')

soundex("บูรณการ"), soundex("บูรณการ", engine='lk82'), \
    soundex("บูรณการ", engine='metasound')
# output: ('บ931900', 'บE419', 'บ551')

soundex("ปัจจุบัน"), soundex("ปัจจุบัน", engine='lk82'), \
    soundex("ปัจจุบัน", engine='metasound')
# output: ('ป775300', 'ป3E54', 'ป223')

soundex("vp", engine="prayut_and_somchaip")
# output: '11'

soundex("วีพี", engine="prayut_and_somchaip")
# output: '11'
- pythainlp.soundex.lk82(text: str) → str
This function converts Thai text into phonetic code with the Thai soundex algorithm named LK82 [3].
- Parameters:
text (str) – Thai word
- Returns:
LK82 soundex of the given Thai word
- Return type:
str
- Example:
from pythainlp.soundex import lk82

lk82("ลัก")
# output: 'ร1000'

lk82("รัก")
# output: 'ร1000'

lk82("รักษ์")
# output: 'ร1000'

lk82("บูรณการ")
# output: 'บE419'

lk82("ปัจจุบัน")
# output: 'ป3E54'
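Because words that sound alike share a code (ลัก, รัก and รักษ์ all map to 'ร1000' in the example above), a soundex function can be used to bucket a word list by sound. The following is a minimal sketch, not part of PyThaiNLP: `group_by_code` is a hypothetical helper, and it takes the encoder as an argument so that any of the functions on this page could be passed in.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_code(
    words: Iterable[str], encode: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Bucket words by the phonetic code that `encode` assigns to them.

    `group_by_code` is a hypothetical helper for illustration only.
    Words keep their input order within each bucket.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        groups[encode(word)].append(word)
    return dict(groups)


# With PyThaiNLP installed, a real encoder can be passed directly:
#     from pythainlp.soundex import lk82
#     group_by_code(["ลัก", "รัก", "รักษ์", "ปัจจุบัน"], lk82)
```

Given the outputs documented above, the first three words would be expected to land in the same bucket and ปัจจุบัน in its own.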
- pythainlp.soundex.udom83(text: str) → str
This function converts Thai text into phonetic code with the Thai soundex algorithm named Udom83 [2].
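- Parameters:
text (str) – Thai word
- Returns:
Udom83 soundex of the given Thai word
- Return type:
str
- Example:
from pythainlp.soundex import udom83

udom83("ลัก")
# output: 'ล100'

udom83("รัก")
# output: 'ร100'

udom83("รักษ์")
# output: 'ร100'

udom83("บูรณการ")
# output: 'บ5515'

udom83("ปัจจุบัน")
# output: 'ป775300'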
- pythainlp.soundex.metasound(text: str, length: int = 4) → str
This function converts Thai text into phonetic code with the matching technique called MetaSound [1], a combination of the Soundex and Metaphone algorithms. The MetaSound algorithm was developed specifically for the Thai language.
- Parameters:
text (str) – Thai text
length (int) – preferred length of the MetaSound code (default: 4)
- Returns:
MetaSound for the given text
- Return type:
str
- Example:
from pythainlp.soundex.metasound import metasound

metasound("ลัก")
# output: 'ล100'

metasound("รัก")
# output: 'ร100'

metasound("รักษ์")
# output: 'ร100'

metasound("บูรณการ", 5)
# output: 'บ5515'

metasound("บูรณการ", 6)
# output: 'บ55150'

metasound("บูรณการ", 4)
# output: 'บ551'
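The outputs above for บูรณการ ('บ551', 'บ5515', 'บ55150') are consistent with the code being cut off or right-padded with zeros to reach `length` characters. The sketch below illustrates only that fixed-length step. `fit_length` is a hypothetical helper, and this is an assumption drawn from the example outputs, not PyThaiNLP's implementation.

```python
def fit_length(code: str, length: int = 4, pad: str = "0") -> str:
    """Truncate `code` to `length` characters, or right-pad it with `pad`.

    Hypothetical helper: it mirrors the pattern seen in the documented
    MetaSound outputs and is not taken from the library's source.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if len(pad) != 1:
        raise ValueError("pad must be a single character")
    # Slicing handles codes that are too long; ljust handles short ones.
    return code[:length].ljust(length, pad)
```

Note that Python counts the leading Thai consonant as one character, so a code such as 'บ5515' has length 5.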
- pythainlp.soundex.prayut_and_somchaip(text: str, length: int = 4) → str
This function converts an English or Thai transliterated word into phonetic code with the cross-language matching technique called Soundex [4].
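- Parameters:
text (str) – English or Thai text
length (int) – preferred length of the soundex code (default: 4)
- Returns:
Soundex for the given text
- Return type:
str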
- Example:
from pythainlp.soundex.prayut_and_somchaip import prayut_and_somchaip

prayut_and_somchaip("king", 2)
# output: '52'

prayut_and_somchaip("คิง", 2)
# output: '52'
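Since an English word and its Thai transliteration receive the same code in the example above, the function can serve as a matching key for cross-language lookup. The sketch below shows one way this might be wired up. `find_transliterations` is a hypothetical helper, not a PyThaiNLP function, and the encoder is passed in as a callable.

```python
from typing import Callable, Iterable, List


def find_transliterations(
    query: str, candidates: Iterable[str], encode: Callable[[str], str]
) -> List[str]:
    """Return the candidates whose phonetic code equals the query's code.

    Hypothetical helper for illustration only. `encode` is any function
    mapping a word to its soundex code.
    """
    target = encode(query)
    return [word for word in candidates if encode(word) == target]


# With PyThaiNLP installed, the encoder could be built like this:
#     from functools import partial
#     from pythainlp.soundex.prayut_and_somchaip import prayut_and_somchaip
#     encode = partial(prayut_and_somchaip, length=2)
#     find_transliterations("king", ["คิง", "วีพี"], encode)
```

A shorter `length` makes matching more permissive, so the result is best treated as a candidate list to be filtered further rather than as a set of confirmed transliterations.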