Abstract:
E-mail spam, also known as unsolicited bulk email
(UBE), junk mail, or unsolicited commercial email (UCE), is the
practice of sending unwanted e-mail messages, frequently with
commercial content, in large quantities to an indiscriminate set
of recipients. Spam is prevalent on the Internet because the
transaction cost of electronic communication is far lower
than that of any alternative form of communication. There are many
spam filters that use different approaches to identify an incoming
message as spam, including whitelists and blacklists, Bayesian
analysis, keyword matching, mail header analysis, postage,
legislation, and content scanning. Even so, we are still
flooded with spam emails every day. This is not because the
filters are not powerful enough; it is due to the swift adoption of
new techniques by spammers and the inability of spam
filters to adapt to these changes. In our work, we employed
supervised machine learning techniques to filter spam email
messages. Three widely used supervised machine learning techniques,
namely the C4.5 decision tree classifier, the Multilayer Perceptron, and the
Naïve Bayes classifier, are used to learn the features of
spam emails, and the models are built by training on known
spam emails and legitimate emails. The results of the models are
discussed.
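
To illustrate the kind of supervised training described above, the following is a minimal sketch of a multinomial Naïve Bayes classifier with Laplace smoothing, written in plain Python. It is not the implementation used in this work: the whitespace tokenizer, the toy training messages, the label names `spam` and `ham`, and the function names `train_naive_bayes` and `classify` are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace (assumed, deliberately simple features).
    return text.lower().split()


def train_naive_bayes(messages, labels):
    """Count words per class from labelled messages (hypothetical helper)."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    class_counts = Counter(labels)       # label -> number of messages
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, class_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior estimated from class frequencies in the training set.
        score = math.log(class_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training set (invented for this sketch, not the data used in this work).
TRAIN_MESSAGES = [
    "win cash prize now",
    "cheap pills buy now",
    "claim your free prize",
    "meeting agenda for monday",
    "please review the attached report",
    "lunch on monday",
]
TRAIN_LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = train_naive_bayes(TRAIN_MESSAGES, TRAIN_LABELS)
print(classify("free cash prize", model))
print(classify("monday report meeting", model))
```

The same training and prediction split would apply to a decision tree or a Multilayer Perceptron; only the model fitted to the labelled messages changes.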