pythainlp.corpus
The pythainlp.corpus module provides access to the corpora used by PyThaiNLP.
Modules
pythainlp.corpus.get_corpus(filename: str) → frozenset
Read a corpus from a file and return it as a frozenset.
- Parameters
filename (string) – filename of the corpus
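As an illustration of the read-into-a-frozenset behaviour described above, here is a minimal sketch that uses only the standard library. The helper name read_word_list and the sample file are hypothetical, and the one-entry-per-line layout is an assumption; this is not the library's implementation.

```python
import tempfile
from pathlib import Path


def read_word_list(path: str) -> frozenset:
    """Hypothetical helper: read one entry per line into a frozenset."""
    with open(path, encoding="utf-8") as fh:
        # Strip whitespace and skip blank lines so they do not become entries.
        return frozenset(line.strip() for line in fh if line.strip())


# Build a tiny throwaway corpus file to demonstrate the behaviour.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample_words.txt"
    sample.write_text("ไม่\nมิ\nไม่\n", encoding="utf-8")
    words = read_word_list(str(sample))

print(len(words))      # duplicate lines collapse into one entry
print("ไม่" in words)  # membership tests are the typical use
```

A frozenset is immutable and hashable, so a caller cannot modify the returned word list by accident.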
pythainlp.corpus.get_corpus_path(name: str) → Optional[str]
Get the path of a corpus.
- Parameters
name (string) – corpus name
pythainlp.corpus.download(name: str, force: bool = False) → NoReturn
Download a corpus.
- Parameters
name (string) – corpus name
force (bool) – force installation (default: False)
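Because get_corpus_path is typed as returning Optional[str], callers presumably need to handle a None result. The sketch below shows one possible check-then-download pattern. It assumes that None means the corpus is not available locally, which the signature suggests but the text does not state. The helper ensure_corpus and the stand-in callables are hypothetical, introduced so the example runs without PyThaiNLP or a network connection.

```python
from typing import Callable, Optional


def ensure_corpus(
    name: str,
    get_path: Callable[[str], Optional[str]],
    fetch: Callable[[str], object],
) -> Optional[str]:
    """Hypothetical helper: return a local path, downloading first if needed."""
    path = get_path(name)
    if path is None:
        # Assumption: None means the corpus is not available locally.
        fetch(name)
        path = get_path(name)
    return path


# Stand-in callables so the sketch runs without PyThaiNLP or a network.
_installed = {}


def fake_get_path(name: str) -> Optional[str]:
    return _installed.get(name)


def fake_download(name: str) -> None:
    _installed[name] = "/tmp/" + name


first = ensure_corpus("demo_corpus", fake_get_path, fake_download)
second = ensure_corpus("demo_corpus", fake_get_path, fake_download)
print(first, second)
```

With PyThaiNLP installed, the same helper could be called as ensure_corpus(name, get_corpus_path, download), passing the two functions documented above.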
pythainlp.corpus.remove(name: str) → bool
Remove a corpus.
- Parameters
name (string) – corpus name
- Returns
True if the corpus was removed; False otherwise
pythainlp.corpus.common.thai_negations() → frozenset
Return a frozenset of Thai negation words.
pythainlp.corpus.common.countries() → frozenset
Return a frozenset of country names in Thai.
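Since both functions return frozensets, a typical use is a membership test against a list of tokens. The following sketch uses a hypothetical helper, find_negations, and a one-word stand-in set so that it runs without PyThaiNLP. The stand-in is for demonstration only and is not the contents of the real corpus.

```python
from typing import FrozenSet, Iterable, List


def find_negations(tokens: Iterable[str], negations: FrozenSet[str]) -> List[str]:
    """Hypothetical helper: keep only the tokens found in the negation set."""
    return [token for token in tokens if token in negations]


# Stand-in set for demonstration only; with PyThaiNLP installed, pass
# pythainlp.corpus.common.thai_negations() instead.
demo_negations = frozenset({"ไม่"})

# Tokens are already segmented here; word segmentation is out of scope.
tokens = ["ฉัน", "ไม่", "ชอบ", "ฝน"]
hits = find_negations(tokens, demo_negations)
print(hits)
```

The same pattern would apply to countries(), for example to flag tokens that are country names.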
TNC
pythainlp.corpus.tnc.word_freq(word: str, domain: str = 'all') → int
Not officially supported. Get the frequency of a word by domain. This function queries the Thai National Corpus server, so an internet connection is required.
IMPORTANT: As of 29 April 2019, this function is likely to return 0 regardless of the word, because the service URL has changed and the code has not been updated yet. The new URL is http://www.arts.chula.ac.th/~ling/tnc3/
- Parameters
word (string) – word to look up
domain (string) – domain to query (default: 'all')
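Given the note that word_freq may return 0 regardless of the word, a caller might prefer to treat 0 as "unknown" rather than as a real count. The wrapper below is a minimal sketch of that idea. The name safe_word_freq and the stub lookups are hypothetical, and the broad exception handler is a deliberate simplification.

```python
from typing import Callable, Optional


def safe_word_freq(word: str, lookup: Callable[[str], int]) -> Optional[int]:
    """Hypothetical wrapper: return None when the lookup fails or yields 0."""
    try:
        count = lookup(word)
    except Exception:
        # Network errors or a changed service URL would end up here.
        return None
    # A result of 0 is treated as "unknown", since it may not be a real count.
    return count if count > 0 else None


# Stub lookups so the sketch runs offline.
def always_zero(word: str) -> int:
    return 0


def always_fails(word: str) -> int:
    raise ConnectionError("service unreachable")


print(safe_word_freq("คน", always_zero))
print(safe_word_freq("คน", always_fails))
print(safe_word_freq("คน", lambda w: 12))
```

With PyThaiNLP installed, the documented function could be passed in as safe_word_freq(word, word_freq). Note that this wrapper cannot tell a word that is genuinely absent from the corpus apart from a failed query.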