Notes
Package reference:
calculate_ngram_counts()
remove_repeated_ngrams()
calculate_ngram_counts()
Calculate n-gram counts for the given word list.
Parameters:
  list_words (list[str]) – list of words
  n_min (int) – minimum n-gram size (default: 2)
  n_max (int) – maximum n-gram size (default: 4)
Returns: dictionary mapping n-grams to their counts
Return type: dict[tuple[str, ...], int]
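The counting described above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the name `sketch_ngram_counts` is hypothetical, and the assumption is that every contiguous n-gram of each size from `n_min` to `n_max` (inclusive) is counted and keyed by a tuple of words.

```python
from collections import Counter


def sketch_ngram_counts(words, n_min=2, n_max=4):
    """Count contiguous n-grams of size n_min..n_max (illustrative only)."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the word list.
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return dict(counts)


words = ["a", "b", "a", "b", "c"]
counts = sketch_ngram_counts(words, n_min=2, n_max=3)
print(counts[("a", "b")])       # 2: the bigram occurs at positions 0 and 2
print(counts[("a", "b", "c")])  # 1
```

Keys are tuples rather than joined strings so that the word boundaries inside each n-gram are preserved, which matches the documented return type.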
remove_repeated_ngrams()
Remove repeated n-grams from a word list.
Parameters:
  string_list (list[str]) – list of words
  n (int) – n-gram size
Returns: list of words with repeated n-grams removed
Return type: list[str]
>>> from pythainlp.lm import remove_repeated_ngrams
>>> remove_repeated_ngrams(["เอา", "เอา", "แบบ", "ไหน"], n=1)
['เอา', 'แบบ', 'ไหน']
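To make the idea of removing a repeated n-gram concrete, here is a minimal sketch of one plausible interpretation: an n-gram is dropped when it immediately repeats the n words just emitted. The function name `sketch_remove_repeats` is hypothetical, and the library's actual algorithm may treat non-adjacent repeats or edge cases differently.

```python
def sketch_remove_repeats(words, n):
    """Drop an n-gram that immediately repeats the n words just emitted.

    Illustrative only; not the library's implementation.
    """
    if n <= 0:
        return list(words)
    out = []
    i = 0
    while i < len(words):
        chunk = words[i:i + n]
        # Skip the chunk if it is a full n-gram equal to the tail of the output.
        if len(chunk) == n and out[-n:] == chunk:
            i += n
        else:
            out.append(words[i])
            i += 1
    return out


print(sketch_remove_repeats(["เอา", "เอา", "แบบ", "ไหน"], n=1))
# ['เอา', 'แบบ', 'ไหน']
print(sketch_remove_repeats(["a", "b", "a", "b", "c"], n=2))
# ['a', 'b', 'c']
```

With `n=1` this reduces to collapsing consecutive duplicate words, which reproduces the documented example output for that input.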