How to Implement N-grams with Python and NLTK, Explained Clearly

2023.01.18 2022.09.26

This article explains what N-grams are and how to implement them in the Python programming language.


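Understanding N-grams

N-grams of text are widely used in natural language processing and text mining.

An N-gram is essentially a sequence of n words that appear together within the same window.

When computing N-grams, you typically advance the window one word at a time (although in more complex scenarios you can move it more than one word at a time).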
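The sliding window described above can be sketched in plain Python. This is only an illustration under stated assumptions, not how NLTK does it internally: the function name sliding_ngrams and its step parameter are hypothetical names introduced here to show the difference between advancing one word at a time and advancing several.

```python
def sliding_ngrams(tokens, n, step=1):
    """Return the n-grams of tokens, moving the window `step` tokens at a time.

    Hypothetical helper for illustration; it is not part of NLTK.
    """
    if n < 1 or step < 1:
        raise ValueError("n and step must be positive integers")
    # The last valid window starts at index len(tokens) - n
    return [
        tuple(tokens[i:i + n])
        for i in range(0, len(tokens) - n + 1, step)
    ]


words = "the quick brown fox jumps".split()

# Default case: the window advances one word at a time
print(sliding_ngrams(words, 2))

# Less common case: the window advances two words at a time
print(sliding_ngrams(words, 2, step=2))
```

With step=1 every adjacent pair of words is produced, so consecutive N-grams overlap. With a larger step some pairs are skipped, which is why the one-word step is the usual choice.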

N-grams are used for a variety of purposes.

For example, in language modeling, N-grams are used to build not only unigram models but also bigram and trigram models.

Google and Microsoft have built web-scale N-gram models that are used for a variety of tasks such as spell checking, hyphenation, and text summarization.

Implementing N-grams in Python

To implement N-grams, we use the ngrams function included in nltk, which handles all of the N-gram operations for us.

from nltk import ngrams

# Read the sentence and the N-gram size from the user
sentence = input("Enter the sentence: ")
n = int(input("Enter the value of n: "))

# Split the sentence on whitespace and build the N-grams
n_grams = ngrams(sentence.split(), n)

# Each N-gram is printed as a tuple of n words
for grams in n_grams:
    print(grams)

Sample output

Enter the sentence: Let's test the n-grams implementation with this sample sentence! Yay!
Enter the value of n: 3
("Let's", 'test', 'the')
('test', 'the', 'n-grams')
('the', 'n-grams', 'implementation')
('n-grams', 'implementation', 'with')
('implementation', 'with', 'this')
('with', 'this', 'sample')
('this', 'sample', 'sentence!')
('sample', 'sentence!', 'Yay!')

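Take a look at the results: each N-gram is printed as a tuple of n consecutive words. Because the sentence is split on whitespace only, punctuation stays attached to the words (for example, 'sentence!').

Try the same code on a number of other sentences.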
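To try several sentences without typing each one in, a non-interactive variant can help. The sketch below is one possible approach and deliberately uses only the standard library rather than NLTK, so the helper word_ngrams, the regular expression used for tokenizing, and the second example sentence are all assumptions made for illustration. It lowercases the text, strips punctuation, and counts how often each bigram occurs across the sentences.

```python
import re
from collections import Counter


def word_ngrams(sentence, n):
    """Lowercase the sentence, drop punctuation, and return its n-grams as tuples.

    Hypothetical helper for illustration; it is not part of NLTK.
    """
    # Assumption: a word is a run of letters, digits, apostrophes or hyphens
    tokens = re.findall(r"[a-z0-9'-]+", sentence.lower())
    # Zipping n staggered views of the token list yields each window once
    return list(zip(*(tokens[i:] for i in range(n))))


sentences = [
    "Let's test the n-grams implementation with this sample sentence! Yay!",
    "Let's test the code with another sample sentence.",
]

# Count every bigram across all sentences
counts = Counter()
for s in sentences:
    counts.update(word_ngrams(s, 2))

# Show the bigrams that occur most often
for gram, freq in counts.most_common(3):
    print(gram, freq)
```

Because punctuation is removed before the window is applied, 'sentence!' and 'sentence.' are counted as the same word here, unlike in the whitespace-split version.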
