Application of the Naïve Bayes Classifier Algorithm to Analyze Sentiment for the Covid-19 Vaccine on Twitter in Jakarta

  • Ire Puspa Wardhani STMIK Jakarta STI&K
  • Yudi Irawan Chandra STMIK Jakarta STI&K
  • Ferri Yusra STMIK Jakarta STI&K
Keywords: Sentiment Analysis, Text Pre-Processing, Naïve Bayes Classifier, TF-IDF, Twitter


The outbreak of a new disease caused by the coronavirus (2019-nCoV), commonly referred to as COVID-19, has been declared a global pandemic by the World Health Organization (WHO). President Joko Widodo has officially ratified Presidential Decree No. 99 of 2020 concerning the provision of vaccines and the implementation of vaccination activities. Twitter is a social media platform that allows users to share information and opinions directly with other users. Tweets can express either positive or negative opinions, so sentiment analysis is one method for studying them. Sentiment analysis helps determine whether an opinion or comment on an issue is positive or negative. The Naïve Bayes algorithm is used for sentiment analysis because it is well suited to short texts such as tweets. The initial stage of sentiment analysis is text pre-processing, which consists of cleaning, case folding, tokenizing, and stopword removal. The data is then labeled manually. The analysis results are visualized as bar charts, pie charts, and word clouds. Word weighting is then carried out using term frequency-inverse document frequency (TF-IDF), and classification is carried out using the Naïve Bayes classifier. In testing, the accuracy computed from the confusion matrix is 82% on 2600 tweets, split into 80% training data and 20% test data.
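The pipeline described in the abstract (pre-processing, TF-IDF weighting, Naïve Bayes classification) can be illustrated with a minimal sketch. This is not the authors' implementation. The stopword list, the smoothed IDF formula, the Laplace smoothing constant, the function names, and the four example tweets are all assumptions made for illustration, and the example uses English text rather than the study's data.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, tiny stopword list; a real study would use a full list.
STOPWORDS = {"the", "is", "a", "and", "to", "of", "i", "this"}

def preprocess(tweet):
    """Cleaning, case folding, tokenizing, and stopword removal."""
    text = re.sub(r"https?://\S+|[@#]\w+", " ", tweet)  # drop URLs, mentions, hashtags
    text = re.sub(r"[^A-Za-z\s]", " ", text)            # drop digits and punctuation
    tokens = text.lower().split()                        # case folding + tokenizing
    return [t for t in tokens if t not in STOPWORDS]

def fit_idf(docs):
    """Smoothed IDF (assumed variant): ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

def tfidf(doc, idf):
    """TF-IDF weights for one tokenized document; unseen terms are ignored."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}

def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    class_count = Counter(labels)
    weight = defaultdict(Counter)
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            weight[y][t] += w
    model = {}
    for y in class_count:
        total = sum(weight[y].values()) + alpha * len(vocab)
        log_prior = math.log(class_count[y] / len(labels))
        log_lik = {t: math.log((weight[y][t] + alpha) / total) for t in vocab}
        model[y] = (log_prior, log_lik)
    return model

def predict(model, vec):
    """Return the class with the highest log posterior score."""
    def score(y):
        log_prior, log_lik = model[y]
        return log_prior + sum(w * log_lik[t] for t, w in vec.items())
    return max(model, key=score)

# Invented, manually labeled toy tweets (not from the study's dataset).
train = [
    ("The vaccine is safe and effective", "positive"),
    ("Grateful to get the vaccine today", "positive"),
    ("The vaccine side effects are scary", "negative"),
    ("I refuse this dangerous vaccine", "negative"),
]
docs = [preprocess(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
model = train_nb([tfidf(d, idf) for d in docs], labels, vocab=set(idf))

def classify(tweet):
    return predict(model, tfidf(preprocess(tweet), idf))
```

With this toy model, `classify("Safe and effective vaccine!")` selects the positive class and `classify("dangerous side effects")` selects the negative class. Accuracy on a held-out test split would then be the number of correct predictions (the diagonal of the confusion matrix) divided by the number of test tweets.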


Information and Computational Engineering