🪿 Han-solo: Thai syllable segmenter
This work aims to build a Thai syllable segmenter that works well in the Thai social media domain. It uses data from the Wisesight Sentiment Corpus.
This work uses two datasets:
- Nutcha Dataset (Thai news domain). See data_nutcha/ for details.
- Han-solo: Thai syllable segmenter dataset (Thai social media domain). See Han-solo: Thai syllable segmenter for details.
We train a CRF (conditional random field) model using the same features as ssg.
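As a rough illustration of how a CRF can be set up for syllable segmentation, the sketch below turns a list of syllables into per-character boundary labels and builds character n-gram features from a window around each character. The function names, the window size, the n-gram length, and the feature string format are all assumptions made for this example; they are not taken from this project or from ssg, whose actual feature set may differ.

```python
from typing import List

PAD = "<pad>"  # hypothetical padding symbol for positions outside the text


def syllables_to_labels(syllables: List[str]) -> List[str]:
    """Label each character "1" if it starts a syllable, otherwise "0"."""
    labels = []
    for syllable in syllables:
        labels.extend(["1"] + ["0"] * (len(syllable) - 1))
    return labels


def char_window_features(text: str, window: int = 2, max_n: int = 2) -> List[List[str]]:
    """Build one list of string features per character.

    For each character, every n-gram (n = 1..max_n) that lies entirely
    inside the +/- `window` context is emitted, tagged with its offset
    relative to the current character.
    """
    padded = [PAD] * window + list(text) + [PAD] * window
    features = []
    for i in range(window, len(padded) - window):
        feats = ["bias"]
        for n in range(1, max_n + 1):
            # start runs over every position where an n-gram still fits in the window
            for start in range(i - window, i + window - n + 2):
                gram = "|".join(padded[start:start + n])
                feats.append(f"{n}gram[{start - i}]={gram}")
        features.append(feats)
    return features


if __name__ == "__main__":
    syllables = ["สวัส", "ดี"]
    text = "".join(syllables)
    for ch, label, feats in zip(text, syllables_to_labels(syllables),
                                char_window_features(text)):
        print(repr(ch), label, feats[:4])
```

Each character gets one label and one feature list, so the two sequences stay aligned and can be passed as a training pair to a CRF toolkit of your choice.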
This project is developed by 🪿 Wannaphong Phatthiyaphaibun.
GitHub: PyThaiNLP/Han-solo