PyThaiNLP Projects

All 25 projects

Hugging Face: https://huggingface.co/pythainlp

AttaCut

License: MIT

A Fast and Accurate Neural Thai Word Segmenter

Docker Thai Tokenizers

License: MIT

This repository is a collection of almost all publicly available Thai tokenisers. Having this collection allows us to try each algorithm with ease via Docker.

Lao language

Lao language resources for PyThaiNLP

lexicon-thai

License: CC BY-SA 3.0

A lexicon for Thai

MudYom (มัดย้อม)

License: MIT

MudYom is a module for pre/post-processing text. It combines (in Thai, มัด) words that belong together into a single token, according to a user-defined dictionary.
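The dictionary-driven merging described above can be illustrated with a minimal sketch. This is not MudYom's actual API: the function name `merge_tokens`, the `max_span` parameter, and the greedy longest-match strategy are all assumptions made for illustration.

```python
from typing import Iterable, List, Set


def merge_tokens(
    tokens: List[str], dictionary: Iterable[str], max_span: int = 5
) -> List[str]:
    """Merge adjacent tokens whose concatenation is a dictionary entry.

    Hypothetical helper for illustration only; MudYom's real interface
    may differ.
    """
    vocab: Set[str] = set(dictionary)
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first so longer entries win.
        end = min(len(tokens), i + max_span)
        for j in range(end, i + 1, -1):
            candidate = "".join(tokens[i:j])
            if candidate in vocab:
                merged.append(candidate)
                i = j
                break
        else:
            # No multi-token entry starts here; keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged
```

Calling `merge_tokens(["มัด", "ย้อม", "สวย"], {"มัดย้อม"})` joins the first two tokens because their concatenation appears in the dictionary, and leaves the third token untouched.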

NLP For Thai

NLP For Thai is a website that collects open datasets and open-source resources for Thai NLP.

prachathai-67k

News Article Corpus from Prachathai.com

PyLexTo

License: MIT

A Python 2 and 3 wrapper for LexTo. No longer maintained.

PyThaiNLP

PyThaiNLP is a Python package for text processing and linguistic analysis, similar to NLTK, with a focus on the Thai language.

PyThaiNLP API

Web API for PyThaiNLP

PyThaiNLP Corpus

Website for viewing the corpus: pythainlp.github.io/pythainlp-corpus/

Thai-constitution-corpus

License: CC0-1.0

Thai Constitution Corpus

Thai-Data-Privacy

ThaiDP: a Thai data privacy tool for Python

Thai covid-19 situation

License: CC0-1.0

Thai COVID-19 situation text file from the Ministry of Public Health, Thailand.

ThaiGov corpus

License: CC0-1.0

Data from the Thai government website, thaigov.go.th

ThaiGov V2 corpus

License: CC0-1.0

Data from the Thai government website, thaigov.go.th

thaimaimeex

Predicts budgets from ThaiME project names

ThaiNLP

Simple API for PyThaiNLP

Thai Lao Parallel corpus

License: CC0-1.0

Thai Lao Parallel corpus

Thai Law

License: CC0-1.0

Thai Law Dataset (Act of Parliament)

Thai Synonym

Thai synonyms (open source and open data)

Thai Text Classification Benchmarks

Thai text classification benchmarks

TTG : Thai Text Generator

Thai Text Generator

Wisesight Sentiment Corpus

License: CC0-1.0

Social media messages in the Thai language, each with a sentiment label (positive, neutral, negative, question). Released to the public domain under the Creative Commons Zero v1.0 Universal license.

oxidized-thainlp

PyThaiNLP port from Python to Rust