PyThaiNLP v2.3.1 is a bug fix release of PyThaiNLP 2.3.
Bug Fixed
Documentation: https://pythainlp.github.io/docs/2.3/index.html
Report bugs: https://github.com/PyThaiNLP/pythainlp/issues
You can install or upgrade using pip install -U pythainlp
See PyThaiNLP 2.3 change log #445
Deprecation and other API changes
- NER: the ThaiNER model changes from ThaiNER 1.4 to ThaiNER 1.5. If you need the ThaiNER 1.4 model, select it with the version parameter of the ThaiNameTagger class:
  pythainlp.tag.named_entity.ThaiNameTagger(version: str = '1.4') (Docs: https://pythainlp.github.io/dev-docs/api/tag.html#pythainlp.tag.named_entity.ThaiNameTagger)
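As a minimal sketch of pinning the older model, the helper below wraps the ThaiNameTagger constructor shown above. The helper name load_thainer is hypothetical and not part of PyThaiNLP. The import is deferred so the snippet can be loaded without PyThaiNLP installed, and the first call is assumed to need network access to download the model.

```python
def load_thainer(version: str):
    """Return a ThaiNameTagger bound to the requested ThaiNER model version.

    `load_thainer` is a hypothetical convenience wrapper, not a PyThaiNLP API.
    """
    # Deferred import: PyThaiNLP is only required when the helper is called.
    from pythainlp.tag.named_entity import ThaiNameTagger

    # Pass "1.4" to keep the previous model, or "1.5" for the newer one.
    return ThaiNameTagger(version=version)
```

With PyThaiNLP installed, `load_thainer("1.4").get_ner(text)` should then tag `text` with the ThaiNER 1.4 model.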
Tokenizer
- #484 Add: model option for attacut.tokenize()
- #502 Add: corpus.util.revise_wordset() to revise tokenization dictionary
- #503 Add: NERCut tokenization engine
Corpus
- License change:
    - All corpora, datasets, and documentation created by PyThaiNLP project are now released under Creative Commons Zero 1.0 Universal Public Domain Dedication License (CC0).
    - All language models created by PyThaiNLP project are released under Creative Commons Attribution 4.0 International Public License (CC-by).
- #449 Fix: remove instances with [ or ] from etcc.txt
- #467 Add: corpus.common.provinces() can now return romanized names
- #476 Add: thai_family_names() to get a set of Thai family names
- #487 Fix: thailand_provinces_th.csv not found issue
- #492 Fix: remove erroneous AITT tag from ORCHID to UD table – thanks @c4n for the fix
POS Tagger
- #464 Add: LST20 language model for part-of-speech tagging
- #468 Add: port PerceptronTagger from NLTK. POS tagging no longer needs NLTK as a dependency.
- #478 Update: ORCHID POS tags documentation
Named Entity Tagging
- #526 Update: ThaiNER 1.4 to ThaiNER 1.5
- #538 Add: version option for ThaiNameTagger, with ThaiNER 1.4 support
Transliterate
- #485 Fix: romanization failed on some examples
- #511 Add: Thai W2P (Thai Word-to-Phoneme converter)
Text Summarization
- #523 Add: mT5 text summarization to pythainlp.summarize
Chunk parser
- #524 Add: pythainlp.tag.chunk
Util
- #481 Fix: remove_repeat_vowels() bug that removed spaces between different vowels
- #483 Add: remove() method to remove a word from a trie – thanks @korakot
- #490 Fix: thai_strftime() - normalize output for unsupported directives (running under glibc and musl should produce the same output)
- #512 Add: emoji_to_thai() to convert emoji to Thai description – thanks @ppirch for the development
- #513 Add: thai_keyboard_dist() to calculate Euclidean distance between two characters according to their location on a Thai keyboard layout – thanks @ppirch for the development
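To illustrate the idea behind a keyboard distance such as the one in #513, here is a minimal sketch that is not PyThaiNLP's implementation. It assumes each key sits on a plain (column, row) grid and covers only a handful of unshifted keys. The coordinate table and the function name keyboard_distance are hypothetical, and the real function may treat row stagger and shifted keys differently.

```python
import math

# Hypothetical (column, row) grid positions for a few unshifted keys on a
# Thai keyboard layout. An illustrative subset only, not the table that
# PyThaiNLP uses.
KEY_POSITIONS = {
    "ฟ": (0, 0), "ห": (1, 0), "ก": (2, 0), "ด": (3, 0),
    "ส": (8, 0), "ว": (9, 0), "ง": (10, 0),
    "ผ": (0, 1), "ป": (1, 1), "อ": (3, 1), "ท": (6, 1), "ม": (7, 1),
}


def keyboard_distance(a: str, b: str) -> float:
    """Euclidean distance between the assumed grid positions of two keys."""
    (x1, y1), (x2, y2) = KEY_POSITIONS[a], KEY_POSITIONS[b]
    return math.hypot(x2 - x1, y2 - y1)
```

Under these assumed coordinates, two keys in the same row that are two columns apart (ฟ and ก) are at distance 2.0, and a key compared with itself is at distance 0.0. A measure like this can help rank likely typos, since adjacent keys are more often confused than distant ones.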
Thanks to all the contributors.
  
We build Thai NLP.
PyThaiNLP
