Classification of Indonesian News Using the Naive Bayesian Classification and Support Vector Machine Methods with a Confix-Stripping Stemmer

Dio Ariadi, Kartika Fithriasari
Submission Date: 2015-07-28 11:56:57
Accepted Date: 2016-01-22 12:13:07

Abstract


News articles are uploaded to the internet in very large volumes and at a rapid pace, which makes it difficult for editors to categorize them manually. Classification is a method that allows news to be categorized automatically. Because news data is text, it is considerably more complicated to handle and requires preprocessing before it can be modeled. One of these preprocessing steps is the confix-stripping stemmer, which reduces words in Indonesian news to their root forms. The classification methods used are the Naive Bayes Classifier (NBC), which is commonly applied to text data, and the Support Vector Machine (SVM), which is known to work very well on high-dimensional data. The two methods are compared to determine which gives the better classification results. The results show that SVM with a linear kernel and SVM with an RBF kernel achieve the same classification accuracy, and that SVM outperforms NBC.
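As a rough illustration of the preprocessing and NBC steps described above, the following is a minimal sketch in plain Python. It is not the authors' implementation: the affix lists, the function names (`naive_stem`, `train_nb`, `predict_nb`) and the toy documents are all assumptions made for illustration, and the stemmer shown is a drastically simplified stand-in for the confix-stripping stemmer, which relies on a root-word dictionary and many more rules. The classifier is a multinomial Naive Bayes with add-one (Laplace) smoothing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, heavily simplified affix lists (an assumption for this sketch).
# The real confix-stripping stemmer uses a root-word dictionary and far more rules.
PREFIXES = ("meng", "mem", "men", "me", "ber", "di", "ter", "pe")
SUFFIXES = ("kan", "an", "i", "nya")


def naive_stem(word):
    """Strip at most one prefix and one suffix, keeping at least 3 letters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word


def tokenize(text):
    """Lowercase, split on whitespace, and stem every token."""
    return [naive_stem(w) for w in text.lower().split()]


def train_nb(docs):
    """Count documents per class and stemmed tokens per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # ignore unseen words
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this sketch (not from the paper).
docs = [
    ("tim bermain bola di stadion", "olahraga"),
    ("pemain mencetak gol kemenangan", "olahraga"),
    ("harga saham naik di bursa", "ekonomi"),
    ("bank menjual saham dan emas", "ekonomi"),
]
model = train_nb(docs)
print(predict_nb(model, "pemain bola mencetak gol"))
print(predict_nb(model, "harga saham naik"))
```

Stemming matters here because inflected forms such as "bermain" and "pemain" collapse to the same token, so their counts are pooled within a class. Without a root-word dictionary this simplified stripping will over-stem or mis-stem many real words, which is one reason a dictionary-backed stemmer is preferred in practice.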

Keywords


news articles; confix-stripping stemmer; classification; naive Bayes classifier; support vector machine



Lembaga Penjaminan Mutu, Pengelolaan dan Perlindungan Kekayaan Intelektual (LPMP2KI) ITS
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.