Deep URL
, Raymann J, Prabu A, Aravindan C.
Published in ACM
2019
Abstract
Owing to advances in technology, many people now rely on the internet for their information needs. The internet offers practically unlimited web resources, but some content is not appropriate for all age groups, especially children under 18. The number of adult websites increases every day, which challenges existing approaches: content-based methods need the entire web page content for classification, and blacklisting methods need frequent database updates. To overcome these issues, we propose a URL-based deep learning model that avoids unnecessary content downloads and also handles the dynamic nature of the web. Because a URL is a sequence of characters, a novel embedding method is proposed for effective URL representation. A Recurrent Convolutional Neural Network based approach is also proposed that classifies adult websites by learning significant features derived only from URLs. We analyzed the performance of the proposed approach through various experiments on the benchmark ODP dataset. The experimental results show an accuracy of 87.6%, a significant improvement over existing approaches. © 2019 Association for Computing Machinery.
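The abstract does not describe the embedding method itself, so the following is only a generic illustration of treating a URL as a character sequence, not the authors' technique. It is a minimal sketch that maps each character to an integer id and pads to a fixed length, which is a common input format for an embedding layer. The vocabulary `ALPHABET`, the `PAD` and `UNK` ids, the function name `encode_url`, and the default `max_len` are all assumptions made for this example.

```python
import string
from typing import List

# Hypothetical vocabulary (an assumption): lowercase letters, digits,
# and punctuation that commonly appears in URLs.
ALPHABET = string.ascii_lowercase + string.digits + "-._~:/?#[]@!$&'()*+,;=%"

PAD = 0  # id used to pad short URLs
UNK = 1  # id used for characters outside the vocabulary
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_url(url: str, max_len: int = 64) -> List[int]:
    """Map a URL to a fixed-length list of integer character ids."""
    # Lowercase, truncate to max_len, and look up each character.
    ids = [CHAR_TO_ID.get(ch, UNK) for ch in url.lower()[:max_len]]
    # Right-pad with PAD so every URL has the same length.
    return ids + [PAD] * (max_len - len(ids))


if __name__ == "__main__":
    print(encode_url("http://example.com", max_len=24))
```

A sequence of ids like this could then be fed to a learned embedding layer followed by recurrent and convolutional layers; those parts depend on details the abstract does not give and are left out here.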
About the journal
Journal: Proceedings of the International Conference on Advanced Information Science and System
Publisher: ACM
Open Access: No