Installation¶
For stable version:
pip install pythainlp
For some advanced functionalities, like word vector, extra packages may be needed. Install them with these options during pip install:
pip install pythainlp[extra1,extra2,...]
where extras can be
artagger
(to support artagger part-of-speech tagger)*deepcut
(to support deepcut machine-learnt tokenizer)icu
(for ICU support in transliteration and tokenization)ipa
(for International Phonetic Alphabet support in transliteration)ml
(to support fastai 1.0.22 ULMFiT models)ner
(for named-entity recognizer)thai2fit
(for Thai word vector)thai2rom
(for machine-learnt romanization)full
(install everything)
Note: standard artagger package from PyPI will not work on Windows, please pip install https://github.com/wannaphongcom/artagger/tarball/master#egg=artagger instead.
For development version:
pip install https://github.com/PyThaiNLP/pythainlp/archive/dev.zip
** see extras and extras_require in setup.py for package details.
Note for installation on Windows:
marisa-trie
andPyICU
libraries may required. You have two options to get them installed on Windows.- Option 1 (recommended):
Find a pre-built package (“wheel”) from https://www.lfd.uci.edu/~gohlke/pythonlibs/
Download a suitable wheel for your Python version (3.5, 3.6, etc.) and CPU architecture (“win32” for 32-bit Windows and “amd64” for 64-bit Windows)
Install them with pip. For example: pip install marisa_trie‑0.7.5‑cp36‑cp36m‑win32.whl
- Option 2 (advanced):
You can also try to install them with a command: pip install marisa-trie pyicu
With this, pip will try to build the libraries directly from source files.
This will take some time and need a set of build tools to be installed in your system, for example Microsoft Visual C++ Compiler. It also requires some technical skills on how things are getting built on Windows system, as you may need to configure some environment variables to accommodate the build process.
For PyICU, before the installation, you have to set
ICU_VERSION
environment variable to ICU version in your system. For example,set ICU_VERSION=62.1
.This approach is obviously take more time and effort, but the good side is the library will be optimized for your system. This could mean a better performance.