pythainlp.tools
The pythainlp.tools module contains miscellaneous functions for PyThaiNLP internal use.
Modules
- pythainlp.tools.get_full_data_path(path: str) -> str
This function joins the path of the pythainlp data directory with the given path and returns the full path.
- Returns
full path given the name of the dataset
- Return type
str
- Example
from pythainlp.tools import get_full_data_path
get_full_data_path('ttc_freq.txt')
# output: '/root/pythainlp-data/ttc_freq.txt'
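The joining step described above can be sketched with the standard library alone. This is a minimal illustration, not PyThaiNLP's actual implementation: `full_data_path` and its `data_dir` argument are hypothetical names introduced here, and the directory is simply the one shown in the example output.

```python
import os


def full_data_path(data_dir: str, name: str) -> str:
    """Join a data directory and a dataset file name (hypothetical helper)."""
    return os.path.join(data_dir, name)


# Assumed data directory, copied from the example output above.
example = full_data_path("/root/pythainlp-data", "ttc_freq.txt")
print(example)
```

Passing the directory explicitly keeps the sketch free of any global state; the real function takes only the file name and looks up the data directory itself.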
- pythainlp.tools.get_pythainlp_data_path() -> str
Returns the full path where PyThaiNLP keeps its (downloaded) data. If the directory does not yet exist, it will be created. The path can be specified through the environment variable PYTHAINLP_DATA_DIR. By default, ~/pythainlp-data will be used.
- Returns
full path of the directory for pythainlp downloaded data
- Return type
str
- Example
from pythainlp.tools import get_pythainlp_data_path
get_pythainlp_data_path()
# output: '/root/pythainlp-data'
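The lookup order described above (environment variable first, then the default, creating the directory if it is missing) might look roughly like the following. This is a sketch under stated assumptions, not the library's code: `resolve_data_dir` is a hypothetical helper, and it accepts the environment as a mapping so the override can be shown without touching the real process environment.

```python
import os
import tempfile
from typing import Mapping


def resolve_data_dir(environ: Mapping[str, str] = os.environ) -> str:
    """Return the data directory, creating it if needed (hypothetical helper)."""
    # Prefer PYTHAINLP_DATA_DIR; otherwise fall back to ~/pythainlp-data.
    path = os.path.expanduser(
        environ.get("PYTHAINLP_DATA_DIR", "~/pythainlp-data")
    )
    os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    return path


# Demonstrate the override with a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    custom = os.path.join(tmp, "my-data")
    resolved = resolve_data_dir({"PYTHAINLP_DATA_DIR": custom})
    created = os.path.isdir(resolved)

print(resolved == custom, created)
```

With the real library, the same effect would come from setting PYTHAINLP_DATA_DIR in the shell before starting Python.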
- pythainlp.tools.get_pythainlp_path() -> str
This function returns the full path of the PyThaiNLP code.
- Returns
full path of the pythainlp code
- Return type
str
- Example
from pythainlp.tools import get_pythainlp_path
get_pythainlp_path()
# output: '/usr/local/lib/python3.6/dist-packages/pythainlp'
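One common way to find where an installed package lives is to take the directory of its `__file__` attribute. The sketch below shows that general technique on the standard-library `json` package so it runs without PyThaiNLP installed; `package_path` is a hypothetical helper, and whether PyThaiNLP uses exactly this approach is an assumption.

```python
import json
import os


def package_path(module) -> str:
    """Return the directory containing a package's __init__.py (hypothetical helper)."""
    return os.path.dirname(os.path.abspath(module.__file__))


# `json` stands in for `pythainlp` here.
path = package_path(json)
print(path)
```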
- pythainlp.tools.misspell(sentence: str, ratio: float = 0.05)
Simulate some misspellings for the input sentence. The number of misspelled locations is governed by ratio.
- Params str sentence
sentence to be misspelled
- Params float ratio
number of misspells per 100 chars. Defaults to 0.05.
- Returns
sentence containing some misspelled words
- Return type
str
- Example
from pythainlp.tools import misspell
sentence = "ภาษาไทยปรากฏครั้งแรกในพุทธศักราช 1826"
misspell(sentence, ratio=0.1)
# output: ภาษาไทยปรากฏครั้งแรกในกุทธศักราช 1727
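To make the role of ratio concrete, here is a toy stand-in that corrupts a bounded number of characters. It is not the library's algorithm: `toy_misspell`, its `seed` argument, and the replacement pool are all hypothetical, and it assumes ratio is treated as a fraction of the sentence length. The library may pick positions and replacement characters differently.

```python
import math
import random


def toy_misspell(sentence: str, ratio: float = 0.05, seed: int = 0) -> str:
    """Overwrite up to `ratio` of the characters with random Thai consonants."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # Assumption: `ratio` is a fraction of the sentence length.
    count = min(len(sentence), math.floor(len(sentence) * ratio))
    positions = rng.sample(range(len(sentence)), count)
    pool = "กขคงจชดตนบปพมยรลวสห"  # arbitrary replacement pool for this sketch
    chars = list(sentence)
    for pos in positions:
        chars[pos] = rng.choice(pool)
    return "".join(chars)


sentence = "ภาษาไทยปรากฏครั้งแรกในพุทธศักราช 1826"
noisy = toy_misspell(sentence, ratio=0.1)
# Count how many positions actually differ from the original.
changed = sum(a != b for a, b in zip(sentence, noisy))
print(noisy, changed)
```

A higher ratio allows more positions to be altered, and a ratio of 0 leaves the sentence unchanged.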