Perbandingan Metode Random Forest dan Naïve Bayes dalam Email Spam Filtering

Maria Anita; Bambang Susanto; Lenox Larwuy

doi:10.15575/kubik.v7i2.18933

Authors

Maria Anita Universitas Kristen Satya Wacana Salatiga, Indonesia
Bambang Susanto Universitas Kristen Satya Wacana Salatiga
Lenox Larwuy Universitas Kristen Satya Wacana Salatiga

DOI:

https://doi.org/10.15575/kubik.v7i2.18933

Keywords:

Classification, Email, Naïve Bayes, Random Forest, Spam

Abstract

Email is an important tool not only for communicating and transferring files but also as an advertising medium on the Internet. As the number of email users has grown, many users send emails containing viruses, fraud, and even pornography. Such unsolicited emails, sent in bulk, are called spam. Many email users are annoyed by the time spent deleting individual spam messages. This study compares the Random Forest and Naïve Bayes classification methods for predicting email spam, with the aim of identifying the more accurate method. The data used is an email dataset of 2607 records with two variables: the body variable (the content of the email) and the label variable, where 1 indicates spam and 0 indicates not spam. Based on test results evaluated with a confusion matrix, Random Forest achieved the higher accuracy, 98%, compared with 73% for Naïve Bayes.
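As a rough illustration of the evaluation step mentioned in the abstract, the sketch below shows how an accuracy value can be derived from a binary confusion matrix, treating label 1 (spam) as the positive class. The helper names and the toy labels are assumptions made up for this example; they are not the study's code or data.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (spam) as the positive class."""
    counts = Counter(zip(y_true, y_pred))
    tp = counts[(1, 1)]  # spam correctly flagged as spam
    fp = counts[(0, 1)]  # non-spam wrongly flagged as spam
    fn = counts[(1, 0)]  # spam that was missed
    tn = counts[(0, 0)]  # non-spam correctly passed through
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


# Hypothetical labels for illustration only (1 = spam, 0 = not spam).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
print(f"TP={tp} FP={fp} FN={fn} TN={tn} accuracy={accuracy(tp, fp, fn, tn):.2f}")
```

Under this definition, comparing two classifiers on the same test set amounts to computing this ratio for each set of predictions and reporting the larger one.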

Published

05-06-2023

How to Cite

Anita, M., Susanto, B., & Larwuy, L. (2023). Perbandingan Metode Random Forest dan Naïve Bayes dalam Email Spam Filtering. KUBIK: Jurnal Publikasi Ilmiah Matematika, 7(2), 88–96. https://doi.org/10.15575/kubik.v7i2.18933

Issue

Vol. 7 No. 2 (2022): KUBIK: Jurnal Publikasi Ilmiah Matematika

Section

Articles

License

Authors who publish in KUBIK: Jurnal Publikasi Ilmiah Matematika agree to the following terms:

Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under an Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
