Naive Bayes Algorithm for Binary Classification with Laplace Smoothing.
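Concretely, in the standard multinomial formulation, a tokenized tweet $x = (w_1, \dots, w_n)$ is assigned the class with the highest log posterior, and Laplace (add-one) smoothing keeps a single unseen word from zeroing out a class score:

$$\hat{y} = \underset{c \in \{0, 1\}}{\arg\max}\ \log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c), \qquad P(w \mid c) = \frac{\text{count}(w, c) + 1}{\sum_{w' \in V} \text{count}(w', c) + |V|}$$

where $V$ is the training vocabulary.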

The data comes from the Kaggle competition Real or Not? NLP with Disaster Tweets.

Data Sample:

print(f"Two  tweets == {x_train[:2]}")
print(f"Tweet Label == {y_train[:2]}")
Two  tweets == [['Our', 'Deeds', 'are', 'the', 'Reason', 'of', 'this', '#earthquake', 'May', 'ALLAH', 'Forgive', 'us', 'all'], ['Forest', 'fire', 'near', 'La', 'Ronge', 'Sask.', 'Canada']]
Tweet Label == [1, 1]
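The tweets appear to be whitespace-tokenized, with hashtags and punctuation left attached ('#earthquake', 'Sask.'). A minimal sketch of such a tokenizer, assuming plain whitespace splitting (the repo's actual preprocessing may differ):

def tokenize(tweet):
    # Plain whitespace split; keeps hashtags and trailing punctuation
    # attached, matching the samples shown above.
    return tweet.split()

tokenize("Forest fire near La Ronge Sask. Canada")
['Forest', 'fire', 'near', 'La', 'Ronge', 'Sask.', 'Canada']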

class NaiveBayes[source]

NaiveBayes()

Naive Bayes Algorithm for Binary Classification

NaiveBayes.fit[source]

NaiveBayes.fit(X, y)

Train Naive Bayes.

Args:

X (nested list): nested list of tokenized samples.
y (list): list of corresponding labels.
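A minimal sketch of what fit might compute, assuming class log-priors plus Laplace-smoothed word log-likelihoods (hypothetical helper names, not the repo's actual internals):

import math
from collections import Counter, defaultdict

def fit_naive_bayes(X, y):
    # Hypothetical stand-in for NaiveBayes.fit; illustrates the training step only.
    classes = list(dict.fromkeys(y))               # unique labels in first-seen order
    vocab = {w for sample in X for w in sample}
    log_priors = {c: math.log(y.count(c) / len(y)) for c in classes}
    tokens_per_class = defaultdict(list)
    for sample, label in zip(X, y):
        tokens_per_class[label].extend(sample)
    log_likelihoods = {}
    for c in classes:
        counts = Counter(tokens_per_class[c])
        total = sum(counts.values()) + len(vocab)  # add |V| for add-one smoothing
        # Laplace smoothing: every vocabulary word contributes count + 1
        log_likelihoods[c] = {w: math.log((counts[w] + 1) / total) for w in vocab}
    return classes, vocab, log_priors, log_likelihoods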

NaiveBayes.predict[source]

NaiveBayes.predict(X)

Predict the labels for samples.

Args:

X (nested list): nested list of tokenized samples.

Returns:

list of predicted labels.
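Prediction then scores each class in log space. A sketch pairing with the hypothetical fit_naive_bayes above; words outside the training vocabulary are simply skipped here, one common convention (the repo may handle them differently):

def predict_naive_bayes(X, classes, vocab, log_priors, log_likelihoods):
    # Hypothetical stand-in for NaiveBayes.predict.
    predictions = []
    for sample in X:
        scores = {
            c: log_priors[c] + sum(
                log_likelihoods[c][w] for w in sample if w in vocab
            )
            for c in classes
        }
        predictions.append(max(scores, key=scores.get))  # highest log posterior wins
    return predictions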
from scratch.models.naive_bayes import NaiveBayes
nb = NaiveBayes()
nb.fit(x_train, y_train)      # estimate class priors and smoothed word likelihoods
nb.classes
[1, 0]
nb.vocab_length               # number of unique tokens seen during training
24501
predictions = nb.predict(x_test)
accuracy(y_test, predictions)
0.7329246935201401
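The accuracy helper is not imported in the snippet above and is presumably defined elsewhere in the repo; a drop-in equivalent is just the fraction of matching labels:

def accuracy(y_true, y_pred):
    # Fraction of predictions that equal the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)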