pythainlp.wangchanberta
The pythainlp.wangchanberta module is built on the WangchanBERTa base model, specifically wangchanberta-base-att-spm-uncased, described by Lowphansirikul et al. [^Lowphansirikul_2021].
The model supports several Thai natural language processing tasks, including named entity recognition, part-of-speech tagging, and subword tokenization.
If you intend to fine-tune the model or explore its capabilities further, please refer to the [thai2transformers repository](https://github.com/vistec-AI/thai2transformers).
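As a quick illustration of the underlying model, the sketch below loads wangchanberta-base-att-spm-uncased from the Hugging Face Hub and runs subword tokenization. Note this uses the transformers library directly rather than the pythainlp.wangchanberta API, and the sample sentence is arbitrary.

```python
# Minimal sketch: subword tokenization with the underlying
# wangchanberta-base-att-spm-uncased model, loaded from the
# Hugging Face Hub via transformers (not pythainlp.wangchanberta).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "airesearch/wangchanberta-base-att-spm-uncased"
)

# Print the SentencePiece subword pieces for a sample Thai sentence.
print(tokenizer.tokenize("เราไปเที่ยวจังหวัดเชียงใหม่"))
```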
Speed Benchmark
| Function | Named Entity Recognition | Part of Speech |
|---|---|---|
| PyThaiNLP basic function | 89.7 ms | 312 ms |
| pythainlp.wangchanberta (CPU) | 9.64 s | 9.65 s |
| pythainlp.wangchanberta (GPU) | 8.02 s | 8 s |
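The figures above are indicative only. As a rough way to reproduce this kind of measurement, the sketch below times the basic part-of-speech tagger with timeit, assuming PyThaiNLP's standard pos_tag and word_tokenize APIs; absolute numbers depend heavily on hardware.

```python
# Rough timing sketch for the "PyThaiNLP basic function" row,
# using the standard pos_tag API; results vary by machine.
import timeit

from pythainlp.tag import pos_tag
from pythainlp.tokenize import word_tokenize

words = word_tokenize("เราไปเที่ยวจังหวัดเชียงใหม่")

# Average wall-clock time per call over 100 runs, in milliseconds.
runs = 100
elapsed = timeit.timeit(lambda: pos_tag(words), number=runs)
print(f"pos_tag: {elapsed / runs * 1000:.1f} ms per call")
```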
Notebooks with a more comprehensive performance benchmark are also available.
Modules
References
[^Lowphansirikul_2021]: Lowphansirikul L, Polpanumas C, Jantrakulchai N, Nutanong S. WangchanBERTa: Pretraining transformer-based Thai Language Models. [arXiv:2101.09635](http://arxiv.org/abs/2101.09635) [Internet]. 2021 Jan 23 [cited 2021 Feb 27].