All 25 projects
Hugging Face: https://huggingface.co/pythainlp
AttaCut
A Fast and Accurate Neural Thai Word Segmenter
- Owner and Maintainer: Pattarawat Chormai
 - Website: pythainlp.github.io/attacut
 - GitHub: PyThaiNLP/attacut
 
Docker Thai Tokenizers
This repository is a collection of almost all publicly available Thai tokenisers. Having them in one place makes it easy to try each algorithm via Docker.
- Owner and Maintainer: Pattarawat Chormai
 - GitHub: PyThaiNLP/docker-thai-tokenizers
 
Lao language
Lao language resources for PyThaiNLP
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/lao-language
 
lexicon-thai
Lexicon for Thai
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/lexicon-thai
 
MudYom (มัดย้อม)
MudYom is a module for pre- and post-processing text. It combines (มัด, "ties") words that belong together into a single token, according to a user-defined dictionary.
- Owner and Maintainer: Pattarawat Chormai
 - GitHub: PyThaiNLP/mudyom
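
The dictionary-driven merging described above can be illustrated with a minimal sketch. This is not MudYom's actual API: the function name `merge_tokens`, the `max_span` parameter, and the greedy longest-match strategy are all assumptions made for illustration only.

```python
from typing import Iterable, List


def merge_tokens(tokens: List[str], dictionary: Iterable[str], max_span: int = 4) -> List[str]:
    """Hypothetical helper: greedily merge adjacent tokens whose
    concatenation appears in a user-defined dictionary."""
    entries = set(dictionary)
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, shrinking down to two tokens.
        for span in range(min(max_span, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + span])
            if candidate in entries:
                merged.append(candidate)
                i += span
                break
        else:
            # No multi-token dictionary entry starts here; keep the token as is.
            merged.append(tokens[i])
            i += 1
    return merged


# "โรง" + "เรียน" are tied together because "โรงเรียน" is in the dictionary.
print(merge_tokens(["ฉัน", "ไป", "โรง", "เรียน"], {"โรงเรียน"}))
```

Trying longer spans first means that a longer dictionary entry wins over a shorter one that starts at the same position; a real implementation may well make a different choice.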
 
NLP For Thai
NLP For Thai is a website that collects open datasets and open-source projects for Thai NLP.
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - Website: NLPForThai.com
 - GitHub: PyThaiNLP/nlpforthai.com
 
prachathai-67k
News Article Corpus from Prachathai.com
- GitHub: PyThaiNLP/prachathai-67k
 
PyLexTo
Python 2 and 3 wrapper for LexTo. No longer maintained.
- GitHub: PyThaiNLP/PyLexTo
 
PyThaiNLP
PyThaiNLP is a Python package for text processing and linguistic analysis, similar to NLTK, with a focus on the Thai language.
- GitHub: PyThaiNLP/pythainlp
 
PyThaiNLP API
Web API for PyThaiNLP
- GitHub: PyThaiNLP/pythainlp-api
 
PyThaiNLP Corpus
Website for viewing the PyThaiNLP corpus: pythainlp.github.io/pythainlp-corpus/
- Website: pythainlp.github.io/pythainlp-corpus
 - GitHub: PyThaiNLP/pythainlp-corpus
 
Thai-constitution-corpus
Thai Constitution Corpus

Thai-Data-Privacy
ThaiDP = Thai Data Privacy Tool For Python
- GitHub: PyThaiNLP/Thai-Data-Privacy
 
Thai covid-19 situation
Text files on the COVID-19 situation in Thailand, from the Ministry of Public Health, Thailand.

ThaiGov corpus
Data from the Thai government website, thaigov.go.th
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/thaigov-corpus
 
ThaiGov V2 corpus
Data from the Thai government website, thaigov.go.th
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/thaigov-v2-corpus
 
thaimaimeex
Predicts budgets from the project names of ThaiME
- Owner and Maintainer: Charin Polpanumas
 - GitHub: PyThaiNLP/thaimaimeex
 
ThaiNLP
Simple API for PyThaiNLP
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/thainlp
 
Thai Lao Parallel corpus
Thai Lao Parallel corpus
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/Thai-Lao-Parallel-Corpus
 
Thai Law
Thai Law Dataset (Act of Parliament)
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/thai-law
 
Thai Synonym
Thai synonyms (open source and open data)
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/thai-synonym
 
Thai Text Classification Benchmarks
Thai text classification benchmarks

TTG : Thai Text Generator
Thai Text Generator
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
 - GitHub: PyThaiNLP/Thai-Text-Generator
 
Wisesight Sentiment Corpus
Social media messages in the Thai language with sentiment labels (positive, neutral, negative, question). Released to the public domain under the Creative Commons Zero v1.0 Universal license.
- GitHub: PyThaiNLP/wisesight-sentiment
 
oxidized-thainlp
PyThaiNLP port from Python to Rust
- Maintainer: Thanathip Suntorntip
 - GitHub: PyThaiNLP/oxidized-thainlp
 
