All 25 projects
Hugging Face: https://huggingface.co/pythainlp
AttaCut
A Fast and Accurate Neural Thai Word Segmenter
- Owner and Maintainer: Pattarawat Chormai
- Website: pythainlp.github.io/attacut
- GitHub: PyThaiNLP/attacut
Docker Thai Tokenizers
This repository is a collection of almost all publicly available Thai tokenisers. Having this collection lets us try each algorithm with ease via Docker.
- Owner and Maintainer: Pattarawat Chormai
- GitHub: PyThaiNLP/docker-thai-tokenizers
Lao language
Lao language resources for PyThaiNLP
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/lao-language
lexicon-thai
Lexicon for Thai
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/lexicon-thai
MudYom (มัดย้อม)
MudYom is a module for pre- and post-processing text. It combines (มัด, "ties together") words that belong together into a single token, according to a user-defined dictionary.
- Owner and Maintainer: Pattarawat Chormai
- GitHub: PyThaiNLP/mudyom
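The merging idea described for MudYom can be illustrated with a minimal sketch. This is not MudYom's actual API: the function name `merge_tokens` and its greedy longest-match strategy are assumptions made purely for illustration, showing one way adjacent tokens could be recombined when their concatenation appears in a user-defined dictionary.

```python
def merge_tokens(tokens, dictionary):
    """Greedily merge runs of adjacent tokens whose concatenation is a dictionary entry.

    Hypothetical helper for illustration only; not part of MudYom.
    """
    # No dictionary entry is longer than this, so longer runs cannot match.
    max_len = max((len(word) for word in dictionary), default=0)
    merged = []
    i = 0
    while i < len(tokens):
        best_end = i + 1          # by default, keep the token unchanged
        combined = tokens[i]
        j = i + 1
        # Extend the run while the concatenation could still fit an entry.
        while j < len(tokens) and len(combined) + len(tokens[j]) <= max_len:
            combined += tokens[j]
            j += 1
            if combined in dictionary:
                best_end = j      # remember the longest match seen so far
        merged.append("".join(tokens[i:best_end]))
        i = best_end
    return merged


# Usage: a tokenizer split "มัดย้อม" into two pieces; the dictionary ties them back.
user_dictionary = {"มัดย้อม"}
result = merge_tokens(["ผ้า", "มัด", "ย้อม", "สวย"], user_dictionary)
```

Here `result` keeps "ผ้า" and "สวย" as they are and joins "มัด" and "ย้อม" into one token, because only their concatenation is in the dictionary.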
NLP For Thai
NLP For Thai is a website that collects open datasets and open-source software for Thai NLP.
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- Website: NLPForThai.com
- GitHub: PyThaiNLP/nlpforthai.com
prachathai-67k
News Article Corpus from Prachathai.com
- GitHub: PyThaiNLP/prachathai-67k
PyLexTo
LexTo with a Python 2 and 3 wrapper. No longer maintained.
- GitHub: PyThaiNLP/PyLexTo
PyThaiNLP
PyThaiNLP is a Python package for text processing and linguistic analysis, similar to nltk, with a focus on the Thai language.
- GitHub: PyThaiNLP/pythainlp
PyThaiNLP API
Web API for PyThaiNLP
- GitHub: PyThaiNLP/pythainlp-api
PyThaiNLP Corpus
Website for viewing the corpus: pythainlp.github.io/pythainlp-corpus/
- Website: pythainlp.github.io/pythainlp-corpus
- GitHub: PyThaiNLP/pythainlp-corpus
Thai-constitution-corpus
Thai Constitution Corpus
Thai-Data-Privacy
ThaiDP = Thai Data Privacy Tool For Python
- GitHub: PyThaiNLP/Thai-Data-Privacy
Thai covid-19 situation
Text files on the COVID-19 situation in Thailand, from the Ministry of Public Health, Thailand.
ThaiGov corpus
Data from the Thai government website, thaigov.go.th
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/thaigov-corpus
ThaiGov V2 corpus
Data from the Thai government website, thaigov.go.th
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/thaigov-v2-corpus
thaimaimeex
Predict budgets from ThaiME project names
- Owner and Maintainer: Charin Polpanumas
- GitHub: PyThaiNLP/thaimaimeex
ThaiNLP
Simple API for PyThaiNLP
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/thainlp
Thai Lao Parallel corpus
Thai Lao Parallel corpus
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/Thai-Lao-Parallel-Corpus
Thai Law
Thai Law Dataset (Act of Parliament)
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/thai-law
Thai Synonym
Thai synonyms (open source and open data)
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/thai-synonym
Thai Text Classification Benchmarks
Thai text classification benchmarks
TTG : Thai Text Generator
Thai Text Generator
- Owner and Maintainer: Wannaphong Phatthiyaphaibun
- GitHub: PyThaiNLP/Thai-Text-Generator
Wisesight Sentiment Corpus
Social media messages in the Thai language with sentiment labels (positive, neutral, negative, question). Released to the public domain under the Creative Commons Zero v1.0 Universal license.
- GitHub: PyThaiNLP/wisesight-sentiment
oxidized-thainlp
PyThaiNLP port from Python to Rust
- Maintainer: Thanathip Suntorntip
- GitHub: PyThaiNLP/oxidized-thainlp