pythainlp.corpus
The pythainlp.corpus module provides access to the corpora used by PyThaiNLP.
Modules
- 
pythainlp.corpus.get_corpus(filename: str) → frozenset
Read a corpus from a file and return its contents as a frozenset.
- Parameters
 filename (string) – name of the corpus file
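As a rough illustration of the behaviour described above, the sketch below reads a text file into a frozenset. It is not the library's implementation: the helper name `read_wordlist`, the UTF-8 encoding, and the one-entry-per-line layout are all assumptions made for this example.

```python
import os
import tempfile


def read_wordlist(path: str) -> frozenset:
    """Hypothetical helper: load one entry per line into a frozenset."""
    with open(path, encoding="utf-8") as fh:
        # Strip whitespace and skip blank lines (assumed file layout).
        return frozenset(line.strip() for line in fh if line.strip())


# Write a tiny sample file so the sketch runs on its own.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("ไม่\nมิ\n\nไม่\n")
    sample_path = tmp.name

words = read_wordlist(sample_path)
os.remove(sample_path)
print(sorted(words))
```

Duplicates and blank lines collapse because the result is a set, and the frozenset cannot be modified by callers afterwards.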
- 
pythainlp.corpus.get_corpus_path(name: str) → Optional[str]
Get the path of a corpus.
- Parameters
 name (string) – corpus name
- 
pythainlp.corpus.download(name: str, force: bool = False) → NoReturn
Download a corpus.
- Parameters
 name (string) – corpus name
 force (bool) – force install
- 
pythainlp.corpus.remove(name: str) → bool
Remove a corpus.
- Parameters
 name (string) – corpus name
- Returns
 True or False
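The `get_corpus_path` and `download` functions can be combined into a "fetch only if missing" pattern. The sketch below is written against injected callables so that it runs without network access. `ensure_corpus` is a hypothetical helper, and it assumes that `get_corpus_path` returns None for a corpus that is not available locally, which the `Optional[str]` return type suggests but this page does not state.

```python
from typing import Callable, Optional


def ensure_corpus(
    name: str,
    get_path: Callable[[str], Optional[str]],
    download: Callable[[str], object],
) -> Optional[str]:
    """Hypothetical helper: return the corpus path, downloading if missing."""
    path = get_path(name)
    if path is None:  # assumption: None means "not installed locally"
        download(name)
        path = get_path(name)
    return path


# Stand-in callables that mimic the documented signatures.
_installed = {}


def fake_get_path(name: str) -> Optional[str]:
    return _installed.get(name)


def fake_download(name: str, force: bool = False) -> None:
    _installed[name] = f"/tmp/{name}"


first = ensure_corpus("example_corpus", fake_get_path, fake_download)
second = ensure_corpus("example_corpus", fake_get_path, fake_download)
```

With PyThaiNLP installed, the real functions could presumably be passed in instead of the stand-ins, for example `ensure_corpus(name, pythainlp.corpus.get_corpus_path, pythainlp.corpus.download)`.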
- 
pythainlp.corpus.common.thai_negations() → frozenset
Return a frozenset of Thai negation words.
- 
pythainlp.corpus.common.countries() → frozenset
Return a frozenset of country names in Thai.
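Because both functions return a frozenset, the natural way to use the result is a membership test. The sketch below checks whether a token list contains a negation word. `has_negation` is a hypothetical helper, and `sample_negations` is a stand-in for the value returned by `thai_negations()`, whose actual contents are not listed on this page.

```python
from typing import Iterable


def has_negation(tokens: Iterable[str], negations: frozenset) -> bool:
    """Hypothetical helper: True if any token is in the negation set."""
    return any(token in negations for token in tokens)


# Stand-in for thai_negations(); the real set's contents are not shown here.
sample_negations = frozenset({"ไม่"})

tokens_with = ["ฉัน", "ไม่", "ชอบ"]  # "I do not like"
tokens_without = ["ฉัน", "ชอบ"]  # "I like"

result_with = has_negation(tokens_with, sample_negations)
result_without = has_negation(tokens_without, sample_negations)
```

The same pattern would apply to `countries()`, for instance to flag tokens that are country names.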
TNC
- 
pythainlp.corpus.tnc.word_freq(word: str, domain: str = 'all') → int
Not officially supported. Get the frequency of a word by domain. This function queries the Thai National Corpus server, so an Internet connection is required.
IMPORTANT: As of 29 April 2019, this function is likely to return 0 regardless of the word, because the service URL has changed and the code has not yet been updated. The new URL is http://www.arts.chula.ac.th/~ling/tnc3/
- Parameters
 word (string) – word
 domain (string) – domain
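Since `word_freq` depends on a remote service and may return 0 for any word, callers may want to treat both failures and zero as "unknown". The sketch below shows one way to do that with an injected lookup function, so it runs offline. `safe_word_freq` is a hypothetical wrapper. Mapping 0 to None is a design assumption, and because this page does not say which exceptions the real function raises, the sketch catches `Exception` broadly.

```python
from typing import Callable, Optional


def safe_word_freq(
    word: str,
    lookup: Callable[..., int],
    domain: str = "all",
) -> Optional[int]:
    """Hypothetical wrapper: return a frequency, or None if unavailable."""
    try:
        freq = lookup(word, domain=domain)
    except Exception:  # assumption: any failure means "no data"
        return None
    # Assumption: 0 may be the stale-URL symptom, so report it as unknown.
    return freq if freq > 0 else None


# Stand-in lookups that mimic the documented signature.
def always_zero(word: str, domain: str = "all") -> int:
    return 0


def offline(word: str, domain: str = "all") -> int:
    raise ConnectionError("no network")


def fixed(word: str, domain: str = "all") -> int:
    return 42


zero_result = safe_word_freq("คำ", always_zero)
offline_result = safe_word_freq("คำ", offline)
ok_result = safe_word_freq("คำ", fixed)
```

The trade-off is that a word with a genuine frequency of 0 becomes indistinguishable from a failed query, which seems acceptable only while the service issue described above persists.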