Application of Text Mining for Sentiment Analysis on Twitter (X) with the Keyword “Program Makan Siang Gratis” Using the SVM Method
Author: Najwa Adinda (162021041)
Supervisor 1: Kurnia Ramadhan Putra, S.Kom., M.T. (120160502)
This study analyzes public opinion on the “Program Makan Siang Gratis” (Free Lunch Program) policy on Twitter (X) using a text mining approach and Support Vector Machine (SVM) classification within the Knowledge Discovery in Databases (KDD) framework. Data were collected by crawling with Tweet-Harvest using the keyword “makan siang gratis” from January to May 2025, yielding 1,405 raw tweets, which were reduced to 1,402 after duplicate removal. Labeling was carried out using two approaches: InSet Lexicon-Based (739 negative, 506 positive, 160 neutral) and AI-based Indonesian Language (Prosa.ai) (684 negative, 278 positive, 468 neutral). Preprocessing included feature selection with ReliefF, cleaning, case folding, tokenizing, stopword removal, stemming, and TF-IDF vectorization (with optimal max_features values of 500 for InSet Lexicon-Based and 250 for the AI-based method using Prosa.ai). The SVM model was evaluated under three configurations: (i) the default class_weight=None, (ii) internal class reweighting with class_weight='balanced', and (iii) hyperparameter tuning via GridSearchCV. Evaluation based on accuracy, precision, recall, and F1-score showed that the combination of InSet Lexicon-Based labeling with hyperparameter tuning ranked first, with the highest accuracy and F1-score of 0.69643. The combination of Prosa.ai labeling with hyperparameter tuning ranked second, with an accuracy of 0.60357 and the best precision, 0.63845. These findings highlight the importance of the labeling strategy and parameter optimization in determining the performance of three-class sentiment classification on Indonesian social media data.
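To make the second configuration concrete, the following is a minimal sketch of how the 'balanced' class weighting behaves. It reproduces the heuristic documented by scikit-learn, n_samples / (n_classes * class_count), in plain Python and applies it to the label counts reported above. The helper name `balanced_class_weights` is hypothetical, and the counts are for the full labeled sets. The abstract does not report the train/test split, so the weights used in the study itself would differ.

```python
def balanced_class_weights(class_counts):
    """Return n_samples / (n_classes * count) for each class.

    Hypothetical helper that mirrors the 'balanced' heuristic documented
    for scikit-learn's class_weight option; it is not the study's code.
    """
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }


# Label counts as reported in the abstract (assumption: used here as-is,
# without the train/test split that the abstract does not specify).
inset_counts = {"negative": 739, "positive": 506, "neutral": 160}
prosa_counts = {"negative": 684, "positive": 278, "neutral": 468}

inset_weights = balanced_class_weights(inset_counts)
prosa_weights = balanced_class_weights(prosa_counts)

# Rarer classes receive larger weights, so their errors cost more in training.
for name, weights in (("InSet", inset_weights), ("Prosa.ai", prosa_weights)):
    for label, weight in weights.items():
        print(f"{name:9s} {label:9s} {weight:.3f}")
```

Under this heuristic the minority class in each labeling (neutral for InSet, positive for Prosa.ai) gets the largest weight, which is the mechanism by which class_weight='balanced' counteracts label imbalance.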